The datscience (data analysis and science) R package contains functions that are frequently required when preparing data for publication. Among them are:
format_flextable(): applies the APA 7th theme to a flextable object
apa_factorLoadings(): creates an APA 7th style factor-loadings table
flex_table1(): tabular comparison of two or more groups (usually Table 1)
Rcitation_appendix(): creates a full APA-style table with all citations and information on the R packages used
The overall goal was to improve the data analysis workflow and to help with the formatting challenges I encountered when preparing submissions to scientific journals, for example getting statistics from R into MS Word in the right format.
You can install the latest released version of datscience easily, directly from GitHub with:
    # Normal installation
    install.packages("devtools")
    devtools::install_github("Buedenbender/datscience")
My recommendation is to use the pacman package manager instead, as it installs the latest version from GitHub and loads it directly:
    # Recommendation: pacman
    pacman::p_load_gh("Buedenbender/datscience")
Some users might encounter (System Error 267 @win/processx.c:1040), which is caused by special characters (e.g., é) in the username and therefore in the path to the directories. In this case, you can try to install datscience with the remotes package in standalone mode in a fresh R session (no packages loaded) (see below and, for reference, this Stack Overflow posting).
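The standalone installation could look roughly like the sketch below. It relies on the `R_REMOTES_STANDALONE` environment variable documented by the remotes package. The `interactive()` guard is an illustrative addition (an assumption, not something this README prescribes) so that sourcing the snippet in a script does not start an installation.

```r
# Run this in a fresh R session with no packages loaded.
# Tell remotes to use its bundled code instead of optional helper packages.
Sys.setenv(R_REMOTES_STANDALONE = "true")

# Illustrative guard (assumption): only install when run interactively
# and when the remotes package is available.
if (interactive() && requireNamespace("remotes", quietly = TRUE)) {
  remotes::install_github("Buedenbender/datscience")
}
```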
A Teaser
R offers the power to conduct nearly any analysis one can imagine, yet I often had difficulties transferring the analysis or its results from an R session into MS Word.
Example Problem: Get a nicely formatted (in accordance with the APA 7th publication manual) sociodemographic Table 1 directly into a Word file (*.docx).
As of March 2022, the new function datscience::flex_table1() makes creating the sociodemographic Table 1 (including the statistical comparisons of subsamples) basically a piece of cake. Take a look at the new article, vignette("flex_table1"). For the example, we took the popular iris dataset and added a simulated categorical variable called Color that contains either “Blue” or “Orange”. We supply the function with a formula that determines which variables are included in the table. Here we include the two metric variables Sepal.Length and Sepal.Width as well as the simulated Color.
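The construction of `iris_sim` is not shown here. A minimal sketch, assuming the Color values are assigned at random with equal probability (the seed and the sampling scheme are assumptions for illustration, not necessarily what the vignette does):

```r
# Assumption: Color is drawn at random; the seed only makes the sketch reproducible
set.seed(42)

# Copy the built-in iris data and add the simulated categorical variable
iris_sim <- iris
iris_sim$Color <- factor(
  sample(c("Blue", "Orange"), size = nrow(iris_sim), replace = TRUE)
)

# Inspect the structure: four metric variables, Species, and the new Color factor
str(iris_sim)
```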
    # To load datscience
    pacman::p_load_gh("Buedenbender/datscience")
    str_formula <- "~ Sepal.Length + Sepal.Width + Color | Species"
    flex_table1(str_formula, data = iris_sim, overall = "Overall") # |>
Uncomment the pipe operator |> above and the line after the call to flex_table1() to directly save this nicely formatted tabular comparison as a .docx (Word) document.
Further Examples
Below are just a few examples of the functionality of the package
datscience::apa_corrTable() displays correlations with marked significance and additionally adds descriptive statistics to the table.
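To illustrate what "correlations with marked significance" means, here is a minimal base R sketch. The helper `cor_with_stars()` is hypothetical and is not the package's implementation. The star cut-offs (* p < .05, ** p < .01, *** p < .001) are a common convention assumed here, not taken from the datscience documentation.

```r
# Hypothetical helper (not part of datscience): lower-triangle correlation
# table with significance stars, using base R only.
cor_with_stars <- function(df, digits = 2) {
  vars <- names(df)
  k <- length(vars)
  out <- matrix("", nrow = k, ncol = k, dimnames = list(vars, vars))
  for (i in seq_len(k)) {
    for (j in seq_len(i - 1)) {
      # Pearson correlation test for each pair below the diagonal
      test <- cor.test(df[[i]], df[[j]])
      p <- test$p.value
      # Assumed cut-offs: * p < .05, ** p < .01, *** p < .001
      stars <- if (p < 0.001) "***" else if (p < 0.01) "**" else if (p < 0.05) "*" else ""
      out[i, j] <- paste0(formatC(test$estimate, format = "f", digits = digits), stars)
    }
  }
  # Drop the last column, which is always empty in a lower-triangle layout
  as.data.frame(out[, -k, drop = FALSE])
}

cor_table <- cor_with_stars(iris[1:4])
print(cor_table)
```

Only the lower triangle is filled, so every pair of variables appears exactly once.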
apa_corrTable() revolves around three other useful functions from this package.
Creates the correlation table by calling

    datscience::corstars(iris[1:4])
    #>              Sepal.Length Sepal.Width Petal.Length
    #> Sepal.Length
    #> Sepal.Width         -0.12
    #> Petal.Length      0.87***    -0.43***
    #> Petal.Width       0.82***    -0.37***      0.96***

Appends desired summary stats to the flextable.
Formatting of the flextable::flextable() object to APA 7th style, by utilizing the format_flextable() function. To illustrate the function, we use it here to display the first 5 rows of the iris data set.

    datscience::format_flextable(
      flextable::flextable(head(iris, 5)),
      table_caption = c("Table 2", "Illustrating Functionality of format_flextable()")
    )

datscience::save_flextable() function. This safely writes the flextable object to a Word (.docx) file, i.e., it prevents existing files from being overwritten by serializing the file names.
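The idea of serialized naming can be sketched in base R as follows. The helper `next_free_path()` and its `_001` suffix pattern are hypothetical, chosen only for illustration, and do not reproduce the actual behavior or signature of save_flextable().

```r
# Hypothetical helper (not the datscience implementation): return `path` if it
# is unused, otherwise append a running number before the file extension.
next_free_path <- function(path) {
  if (!file.exists(path)) return(path)
  ext <- tools::file_ext(path)
  stem <- tools::file_path_sans_ext(path)
  i <- 1
  repeat {
    candidate <- if (nzchar(ext)) {
      sprintf("%s_%03d.%s", stem, i, ext)
    } else {
      sprintf("%s_%03d", stem, i)
    }
    if (!file.exists(candidate)) return(candidate)
    i <- i + 1
  }
}

# Demonstration in a temporary directory
demo_dir <- tempfile("tables")
dir.create(demo_dir)
target <- file.path(demo_dir, "Table1.docx")

first <- next_free_path(target)   # nothing exists yet, so the name is unchanged
file.create(first)
second <- next_free_path(target)  # the name is taken, so a numbered name is returned
print(basename(second))
```

Checking for an existing file before writing means an earlier export is never silently replaced, at the cost of accumulating numbered copies.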
One of the most utilized functions in the package is datscience::format_flextable(), which takes a flextable object and applies the APA 7th edition theme to it. It also provides a workaround for adding an APA-ready table caption and a note.
Note: The formatting (theme) code of the format_flextable() function was inspired by a blog post by Rémi Thériault.
The flextable package is very versatile, and it was exactly what I was looking for to get nicely formatted tables directly from R (RStudio) into Word. The same holds true for the datscience::format_flextable() function from the datscience package. It simply applies the repetitive formatting needed to convert a flextable into a “publication ready” APA-formatted table.
One example of this flexibility is printing the factor loadings from a principal component analysis (PCA). Let’s first create an exemplary PCA [1, 2] and extract the factor loadings. (This is also packaged more conveniently in one of the package's functions.)
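    # Creation of an example principal component analysis
    pacman::p_load(psych, dplyr)
    pc <- principal(Harman74.cor$cov, 4, rotate = "varimax")

    # Sort the loadings and record the new row order, so that the item
    # statistics added below line up with the sorted items
    sorted_loadings <- pc$loadings |>
      fa.sort() |>
      round(3) |>
      unclass() |>
      as.data.frame()
    ord <- match(rownames(sorted_loadings), rownames(pc$loadings))

    pc_loadings <- sorted_loadings |>
      # Blank out loadings below 0.3
      mutate(across(everything(), ~ if_else(.x < 0.3, "", as.character(.x)))) |>
      bind_cols(
        Communality = pc$communality[ord],
        Uniqueness = pc$uniquenesses[ord],
        Complexity = pc$complexity[ord]
      ) |>
      mutate(across(where(is.numeric), ~ round(.x, 2))) |>
      tibble::rownames_to_column("items")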
Formatting these loadings to APA with just one function:
    formatted_loadings <- datscience::format_flextable(
      flextable::flextable(pc_loadings),
      table_caption = c("Table 3", "Factor Loadings in Exemplary PCA")
    )
    formatted_loadings
rempsyc: I highly recommend this now-available package. It offers great utility for psychologists and social scientists getting into R and provides concise tutorials on how to use the provided functions (one of which was the origin of