1
votes

tbl_summary() from the gtsummary package does not treat all numeric variables the same way, and I can't figure out how to change that. For example:

mtcars has only numeric variables, so when I run this, I expect the mean of every variable to be calculated. Instead, it treats cyl, gear and carb as categorical.

library(gtsummary)

tbl_summary(mtcars, statistic = list(all_numeric() ~ "{mean} ({sd})",
                                      all_categorical() ~ "{n} / {N} ({p}%)"))

I actually have a much bigger dataset, and tbl_summary is treating some of its numeric variables as categorical. Could it be because N is so small (let's say many rows are missing) that tbl_summary does not want to calculate a mean?

I can't wrap my mind around this!

Just a further example from my data. Q12_5_TEXT is a numeric variable, but this is the output from tbl_summary.

[Screenshot of the tbl_summary output for Q12_5_TEXT; image not available]

2
@Daniel D. Sjoberg, please let me know if you have any suggestions! - NewBee

2 Answers

1
votes

Variables with few unique levels are summarized categorically. For example, mtcars$cyl only has three unique levels: 4, 6, 8. With only three levels, a categorical summary is more appropriate than a mean or median.
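
To see which columns are affected, it can help to count the distinct values in each one. This is a minimal base R sketch; the name n_unique is just an illustrative choice. As far as I understand the tbl_summary documentation, the default cutoff is fewer than 10 distinct values for a numeric variable, and two-level 0/1 variables such as vs and am are summarized as dichotomous, but please check the help page for your installed version.

```r
# Count the distinct values in every column of mtcars.
# n_unique is an illustrative name, not part of gtsummary.
n_unique <- vapply(mtcars, function(x) length(unique(x)), integer(1))

# Columns with only a handful of distinct values are the ones
# that tbl_summary() is likely to summarize categorically by default.
print(sort(n_unique))
```

For mtcars, cyl and gear each have 3 distinct values and carb has 6, which matches the three variables reported as categorical in the question.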

Use the type= argument to change the default summary type.
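
As a rough sketch of what that could look like for the mtcars example in the question (assuming a gtsummary version that accepts bare column names on the left-hand side of the formula), name the variables explicitly and assign them the "continuous" type:

```r
library(gtsummary)

# Force cyl, gear and carb to be summarized as continuous variables.
# The variables are named directly, because they are not yet typed as
# continuous when the type= argument is evaluated.
tbl <- tbl_summary(
  mtcars,
  type = list(c(cyl, gear, carb) ~ "continuous"),
  statistic = list(all_continuous() ~ "{mean} ({sd})")
)

tbl
```

The 0/1 variables vs and am are left at their default type here; they could be added to the same type= formula if a mean is wanted for them too.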

2
votes

I tried type = all_continuous() ~ "continuous2" with gtsummary version 1.3.5, and it didn't change the summary type:

library(tidyverse)
library(gtsummary)

nrows <- 30

df <- tibble(
  a = sample(c(0, 1, 3.5, 7.5), nrows, replace = TRUE),
  b = sample(c("Group I", "Group II"), nrows, replace = TRUE)
)

df %>% 
  tbl_summary(
    by = b,
    type = all_continuous() ~ "continuous2",
    statistic = all_continuous() ~ "{mean} ({sd})"
  )

The output summarizes variable 'a' as if it were categorical, in spite of the type argument. I ran into the same issue, which is why I came here for an answer. If there is a different argument I should be using, I would greatly appreciate a pointer!
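
One possible explanation, offered as a guess rather than a confirmed diagnosis: all_continuous() selects variables that are already classified as continuous, and 'a' has only four distinct values, so it is classified as categorical and never matched by the selector. A sketch that names the variable directly instead is shown below. The seed is an arbitrary addition for reproducibility, and "continuous2" may not be recognized by older gtsummary releases, in which case "continuous" is worth trying.

```r
library(gtsummary)

set.seed(1)  # arbitrary seed, added only so the example is reproducible
nrows <- 30

df <- data.frame(
  a = sample(c(0, 1, 3.5, 7.5), nrows, replace = TRUE),
  b = sample(c("Group I", "Group II"), nrows, replace = TRUE)
)

# Refer to 'a' by name in both type= and statistic=, rather than
# through all_continuous(), so the override does not depend on the
# default classification.
tbl <- tbl_summary(
  df,
  by = b,
  type = a ~ "continuous2",
  statistic = a ~ "{mean} ({sd})"
)

tbl
```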