I have spent hours trying to work out how to create a summary statistics table grouped by a categorical variable in R with the stargazer package.
Basically, I want to display the means for two groups (control & treatment) next to each other and additionally calculate the differences between the two groups.
Whenever I try to create the table with stargazer, it produces a separate table for each level of the categorical variable, stacked underneath each other.
Here is a sample using the mtcars data set, treating the variable 'am' as the categorical variable:
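library(dplyr)
library(stargazer)
data = mtcars
# In mtcars, am is coded 0 = automatic, 1 = manual
auto1 = data %>%
  filter(am == 0) %>%
  dplyr::select(mpg, disp, hp)
manu1 = data %>%
  filter(am == 1) %>%
  dplyr::select(mpg, disp, hp)
# One summary table per data frame; stargazer stacks them underneath each other
stargazer(auto1, manu1, type = "html", out = "summary.html", summary.stat = c("mean"), summary = TRUE)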
Since that did not work out as expected, I created the summary table manually and set summary to FALSE inside stargazer to just obtain an HTML table:
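library(reshape2)  # provides melt()
# Column means per group, reshaped to long format (am: 0 = automatic, 1 = manual)
auto = data %>%
  filter(am == 0) %>%
  summarise(across(everything(), mean)) %>%
  melt(id.vars = "am")
manu = data %>%
  filter(am == 1) %>%
  summarise(across(everything(), mean)) %>%
  melt(id.vars = "am")
# Put both groups side by side and drop the duplicated id columns
end = dplyr::select(data.frame(auto, manu), -c(am, am.1, variable.1))
end$diff = end$value.1 - end$value
names(end) = c("Variable", "Automatic", "Manual", "Difference")
# summary = FALSE prints the data frame as is, so summary.stat is not needed
stargazer(end, type = "html", out = "summary.html", summary = FALSE)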
This is probably not a very neat way of creating the desired summary statistics table, but I couldn't figure out a better one myself. Any suggestions on how this could be done with stargazer or a different package?
dplyr. Would that do? – thepule
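
A minimal sketch of how the dplyr route from this comment might look, assuming the three variables from the example (mpg, disp, hp) and the mtcars coding of am (0 = automatic, 1 = manual). The names `means`, `m`, and `tab` are purely illustrative:

```r
library(dplyr)

# Group means for the chosen variables, one row per level of am
means <- mtcars %>%
  group_by(am) %>%
  summarise(across(c(mpg, disp, hp), mean)) %>%
  arrange(am)  # row 1: am == 0 (automatic), row 2: am == 1 (manual)

# Transpose so that variables become rows and groups become columns
m <- t(as.matrix(means[, -1]))

# Assemble the side-by-side table and add the difference column
tab <- data.frame(
  Variable  = rownames(m),
  Automatic = unname(m[, 1]),
  Manual    = unname(m[, 2])
)
tab$Difference <- tab$Manual - tab$Automatic

print(tab)
```

The resulting data frame has the same layout as the manually built table in the question, so it could be handed to stargazer with summary = FALSE in the same way.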