Creates a Descriptive Bivariate Table1 Ready for Publication

A convenience function that provides an easy wrapper around its two main engines:

table1, which provides a nice formula-based API for creating demographics tables. I extended its p-value function so that it also handles more than two groups (ANOVA), added the option to correct p-values with either the Bonferroni or the Šidák method, and set some sensible defaults for a clean look.
flextable, which gives you full control over formatting the table (e.g., conditional formatting such as bold p-values below .05, italic headers, or notes explaining what was done).

All credit really belongs to these two packages and their developers. My function just provides an easy-to-use API, or wrapper, around their packages to produce a publication-ready bivariate comparison Table 1.

Usage

flex_table1(
  str_formula,
  data,
  correct = NA,
  num = NA,
  table_caption = NA,
  ref_correction = TRUE,
  include_teststat = TRUE,
  drop_unused_cats = TRUE,
  PCTexcludeNA = TRUE,
  overall = FALSE,
  ...
)

Arguments

str_formula: A string representing a formula, e.g., "~ Sepal.Length + Sepal.Width | Species" used to construct the table1.
data: The dataset containing the variables for the table1 call (all terms from str_formula must be present).
correct: Character, default = NA (no correction). Currently available are "bonf" for the Bonferroni correction and "sidark" for the Šidák correction. If you want another correction included, open an issue at <https://github.com/Buedenbender/datscience/issues> or contact me by mail. See also the references and the details on correcting for multiple comparisons.
num: Integer, number of comparisons. If NA, it is determined automatically from the number of terms in the formula.
table_caption: Caption for the table; each element of the vector represents a new line. The first line is set in bold face, all additional lines in italic.
ref_correction: Boolean, default = TRUE. If TRUE, corrected p-values are referenced in the footnote.
include_teststat: Boolean, default = TRUE. If TRUE, two additional columns are included in the table: 1) the test statistic (t, F, or χ²) and 2) the degrees of freedom.
drop_unused_cats: Boolean, default = TRUE. If TRUE, categories (i.e., factor levels) with 0 observations are dropped.
PCTexcludeNA: Boolean, default = TRUE. Should the calculation of percentages exclude missing values? If PCTexcludeNA = TRUE, missing values are excluded.
overall: Character or FALSE, default = FALSE. Should the final table also include a column for the totals of the sample? If a character string is provided, it is used as the name of the new column (recommended: "Overall").
...: (Optional) Additional arguments that are passed on to format_flextable (e.g., fontsize, font ...) or to serialNext.
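
To illustrate how the number of comparisons for num could be derived from a formula string, here is a minimal sketch in base R. The helper count_comparisons() is hypothetical and not part of the package; the actual implementation may count terms differently.

```r
# Hypothetical helper (not part of the package): counts the variables that
# appear before the grouping bar "|" in a table1-style formula string.
count_comparisons <- function(str_formula) {
  # Keep only the part before the grouping bar, e.g. "~ a + b "
  lhs <- strsplit(str_formula, "|", fixed = TRUE)[[1]][1]
  # terms() expands the one-sided formula into its individual term labels
  length(attr(stats::terms(stats::as.formula(lhs)), "term.labels"))
}

n_tests <- count_comparisons("~ Sepal.Length + Sepal.Width | Species")
print(n_tests)
```

If you plan a correction over a different family of tests than the variables in the formula, passing num explicitly is the safer choice.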

Value

A flextable object containing the APA-ready Table 1. If a file path is provided, the respective file (e.g., a Word .docx file) is also created.

Details

On Fisher's Exact Test (FET) vs Pearson's χ²-test
As of 07/22, and following an excellent post on Cross Validated (Harrell 2011), the function refrains from using Fisher's exact test (FET) for categorical variables and applies it only in the rare case of cells whose expected frequencies do not exceed 1. The reason is that the FET can be extremely resource intensive (and slow) and can have type I error rates below the nominal level (Crans and Shuster 2008). Contemporary evidence suggests that Pearson's χ²-test with the modification of \(\frac{N-1}{N}\) is nearly always more accurate than the FET and is generally recommended (Lydersen et al. 2009). Accordingly, we use the N-1 Pearson χ²-test proposed by (E.) Pearson and recommended as the optimum test policy by Campbell (2007).
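
The test policy described above can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's internal code: the helper n1_chisq_test() is hypothetical, and the fallback threshold (an expected count of at most 1) mirrors the wording of this section.

```r
# Hypothetical helper (not part of the package): N-1 Pearson chi-squared test
# with a fallback to Fisher's exact test for very small expected counts.
n1_chisq_test <- function(tab) {
  n <- sum(tab)
  # Uncorrected Pearson test (no Yates continuity correction); warnings about
  # small expected counts are silenced because they are handled below.
  pearson <- suppressWarnings(stats::chisq.test(tab, correct = FALSE))

  # Assumption: fall back to FET if any expected count does not exceed 1
  if (any(pearson$expected <= 1)) {
    fet <- stats::fisher.test(tab)
    return(list(method = "Fisher's exact test", statistic = NA_real_,
                df = NA_real_, p.value = fet$p.value))
  }

  # N-1 modification: scale the Pearson statistic by (N - 1) / N
  stat <- unname(pearson$statistic) * (n - 1) / n
  df <- unname(pearson$parameter)
  list(method = "N-1 Pearson chi-squared", statistic = stat, df = df,
       p.value = stats::pchisq(stat, df = df, lower.tail = FALSE))
}

# Example: transmission type by number of cylinders in the built-in mtcars data
tab <- table(mtcars$am, mtcars$cyl)
res <- n1_chisq_test(tab)
print(res)
```

Because the factor (N-1)/N is below 1, the N-1 statistic is always slightly smaller than the uncorrected Pearson statistic, and the difference shrinks as the sample grows.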

On Multiple Comparisons
Let me start with a direct quote: "(..) researchers should not automatically (mindlessly) assume that alpha adjustment is necessary during multiple testing." (Rubin 2021)
Whether, how, and when to correct for multiple comparisons in inferential statistics is still an area of ongoing debate. However, it was recently argued that, to decide for or against a correction, it is essential to differentiate between different forms of multiple comparisons (Rubin 2021). The types of multiple testing are:

disjunction testing
conjunction testing
individual testing

Correction is primarily adequate in the case of disjunction testing. Please refer to the very well written and laid out original publication for more details. For the use case of this function, one can assume a joint null hypothesis, namely that Group A <...> Group N do not differ. If, for example, it is sufficient that the groups differ significantly in a single characteristic, this would be considered disjunction testing.
However, if we are only interested in the constituent (null) hypotheses (e.g., the groups differ in their highest level of education vs. they differ in their current employment status), it could be categorized as individual testing. Please choose carefully for your individual case. For the typical exploratory bivariate comparison in a sociodemographic table1, I consider individual testing to be the most frequent case; thus, the flex_table1() function defaults to applying no correction.
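
For readers who do opt for a correction, the two available methods can be sketched in base R as follows. The helper adjust_p() is hypothetical, and whether the package adjusts the p-values (as here) or the alpha level internally is an assumption of this sketch. With m comparisons, the Bonferroni-adjusted p-value is min(1, m * p) and the Šidák-adjusted p-value is 1 - (1 - p)^m.

```r
# Hypothetical helper (not part of the package): adjusts raw p-values for
# `num` comparisons with the Bonferroni or the Šidák method.
adjust_p <- function(p, num = length(p), correct = c("bonf", "sidark")) {
  correct <- match.arg(correct)
  switch(correct,
    bonf   = pmin(1, p * num),     # Bonferroni: multiply by m, cap at 1
    sidark = 1 - (1 - p)^num       # Šidák: 1 - (1 - p)^m
  )
}

# Illustrative raw p-values from three bivariate comparisons (made-up numbers)
p_raw <- c(0.004, 0.020, 0.300)
p_bonf <- adjust_p(p_raw, correct = "bonf")
p_sidak <- adjust_p(p_raw, correct = "sidark")
print(data.frame(p_raw, p_bonf, p_sidak))
```

The Šidák adjustment is never larger than the Bonferroni adjustment, so it is marginally less conservative; for small p-values the two are nearly identical.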

References

Campbell I (2007). “Chi-squared and Fisher–Irwin tests of two-by-two tables with small sample recommendations.” Statistics in Medicine, 26(19), 3661–3675. ISSN 02776715, doi:10.1002/sim.2832, https://onlinelibrary.wiley.com/doi/10.1002/sim.2832.

Crans GG, Shuster JJ (2008). “How conservative is Fisher's exact test? A quantitative evaluation of the two‐sample comparative binomial trial.” Statistics in Medicine, 27(18), 3598–3611. ISSN 02776715, doi:10.1002/sim.3221, https://onlinelibrary.wiley.com/doi/10.1002/sim.3221.

Harrell F (2011). “Is there ever a reason to do a chi-squared test rather than Fisher's exact test?” Cross Validated. https://stats.stackexchange.com/q/14230 (version: 2021-01-26).

Lydersen S, Fagerland MW, Laake P (2009). “Recommended tests for association in 2×2 tables.” Statistics in Medicine, 28(7), 1159–1175. ISSN 02776715, doi:10.1002/sim.3531, https://onlinelibrary.wiley.com/doi/10.1002/sim.3531.

Rubin M (2021). “When to adjust alpha during multiple testing: a consideration of disjunction, conjunction, and individual testing.” Springer Netherlands. doi:10.1007/s11229-021-03276-4.

Author

Bjoern Buedenbender

Examples

if (FALSE) {
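# Comparison of just two groups (t-test / chi-squared test)
str_formula <- "~ Sepal.Length + Sepal.Width + test | Species"
# Keep two species and drop the now-empty third level from the data
data <- dplyr::filter(iris, Species %in% c("setosa", "versicolor"))
# Add an artificial categorical variable (100 rows remain after filtering)
data$test <- factor(rep(c("Female", "Male"), 50))
table_caption <- c("Table 1", "A test on the Iris Data")
flex_table1(str_formula, data = data, table_caption = table_caption)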

# Comparison of multiple groups (ANOVA), using all three species
str_formula <- "~ Sepal.Length + Sepal.Width + Gender_example | Species"
data <- iris
# Add an artificial categorical variable alternating over all rows
data$Gender_example <- factor(rep(c("Female", "Male"), nrow(data) / 2))
table_caption <- c("Table 1", "A test on the Iris Data")
flex_table1(str_formula, data = data, table_caption = table_caption)
}
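
A further sketch, combining the correct and overall arguments documented above. It assumes the function is exported from a package named datscience (inferred from the issue tracker URL); the call is skipped when that package is not installed, and the caption text is only illustrative.

```r
# Three-group comparison on the built-in iris data
str_formula <- "~ Sepal.Length + Sepal.Width | Species"
table_caption <- c("Table 1", "Iris measurements by species")

# Assumption: flex_table1() lives in the 'datscience' package.
# The block only runs when that package is available.
if (requireNamespace("datscience", quietly = TRUE)) {
  ft <- datscience::flex_table1(
    str_formula,
    data = iris,
    correct = "bonf",        # Bonferroni-corrected p-values
    overall = "Overall",     # add a totals column with this name
    table_caption = table_caption
  )
  ft
}
```

Since num is left at its default, the number of comparisons is taken from the formula, here the two variables before the grouping bar.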