The stargazer package in R is fantastic for displaying multiple regression models as side-by-side columns—the standard style for many social science disciplines. However, the package doesn't play well with knitr+pandoc, since it generates output either as HTML or TeX, but not Markdown.
As a solution, I'm creating a function that can generate tables similar to those created with stargazer, but that saves the output as a simple data frame, which I can then render with tools like kable and pander in knitted documents. Doing this is trivial with functions like broom::tidy.
I'm stuck, however, on how to order the coefficients displayed in the model. Take these three models, for example. When displayed with stargazer, the final coefficient order is c("wt", "qsec", "hp", "cyl", "gear", "carb", "drat"). The order of all the coefficients is based primarily on the coefficients in the first model (wt, qsec, cyl, gear, carb). When the second model is appended as a new column, the hp row is inserted after qsec and before cyl.
library(stargazer)
lm0 <- lm(hp ~ wt + qsec + cyl + gear + carb, mtcars)
lm1 <- lm(qsec ~ hp + cyl + gear + carb, mtcars)
lm2 <- lm(qsec ~ wt + hp + gear + drat, mtcars)
stargazer(lm0, lm1, lm2, type="text")
====================================================
              (1)          (2)          (3)
----------------------------------------------------
wt            16.879                    0.827**
              (12.113)                  (0.383)
qsec          -8.124
              (6.109)
hp                         -0.005       -0.026***
                           (0.007)      (0.004)
cyl           18.210**     -0.811***
              (8.785)      (0.280)
gear          13.342       -1.597***    -0.232
              (15.115)     (0.441)      (0.439)
carb          9.277        0.098
              (6.345)      (0.222)
drat                                    0.099
                                        (0.636)
Constant      49.424       29.181***    19.530***
              (171.876)    (2.398)      (2.766)
====================================================
Note: *p<0.1; **p<0.05; ***p<0.01
In the end, I hope to generate a character vector of the coefficient names that I can then use with dplyr::arrange() to correctly sort a data frame of multiple model coefficients.
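To make that goal concrete, here is a minimal sketch of the sorting step, assuming the ordered vector already exists. The vector coef_order is hard-coded purely for illustration, and the coefficient data frame is built with base R as a stand-in for what broom::tidy() would return, so the column names (model, term, estimate) are assumptions.

```r
# Target order, hard-coded here for illustration only
coef_order <- c("wt", "qsec", "hp", "cyl", "gear", "carb", "drat")

models <- list(
  lm0 = lm(hp ~ wt + qsec + cyl + gear + carb, mtcars),
  lm1 = lm(qsec ~ hp + cyl + gear + carb, mtcars),
  lm2 = lm(qsec ~ wt + hp + gear + drat, mtcars)
)

# Long data frame of coefficients, one row per model/term pair
# (a base-R stand-in for the output of broom::tidy())
coefs <- do.call(rbind, lapply(names(models), function(m) {
  est <- coef(models[[m]])
  data.frame(model = m, term = names(est), estimate = unname(est),
             stringsAsFactors = FALSE)
}))
coefs <- coefs[coefs$term != "(Intercept)", ]

# Sort rows by each term's position in coef_order
sorted <- coefs[order(match(coefs$term, coef_order)), ]
# With dplyr, the same idea would be: dplyr::arrange(coefs, match(term, coef_order))

unique(sorted$term)
# [1] "wt"   "qsec" "hp"   "cyl"  "gear" "carb" "drat"
```

Sorting on match(term, coef_order) avoids converting term to a factor, though a factor with coef_order as its levels should work just as well.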
The sorting seems to follow this pseudo algorithm:
- Save the first list (list_1) of coefficient names
- Go through the second list of names. If element_1 of list_2 doesn't match element_1 of list_1, check the next element of list_1 until there's a match, then insert before the match
- Put element_2 of list_2 after element_1 if it doesn't match anything else in list_1
- Repeat with list_3, and so on
Writing simple R code to generate this order, however, has proven more difficult than I thought. Simply concatenating all the coefficient names in a vector and then keeping only unique values doesn't produce the correct order, since new variables (like hp) are just added to the end of existing variable names instead of getting inserted in the middle:
library(tidyverse)
names1 <- names(lm0$coefficients) %>% discard(~ .x == "(Intercept)")
names2 <- names(lm1$coefficients) %>% discard(~ .x == "(Intercept)")
names3 <- names(lm2$coefficients) %>% discard(~ .x == "(Intercept)")
# New variables just appended
unique(c(names1, names2, names3))
# [1] "wt" "qsec" "cyl" "gear" "carb" "hp" "drat"
Additionally, it seems like the only way to implement something like this is to use a ton of loops, which feels wildly inefficient.
So, in the end, how can I sort or reorder a character vector of coefficient names by order of appearance in a list of models, prioritizing the order of the first model in the list? That is, ultimately this is the character vector I'd like to get: c("wt", "qsec", "hp", "cyl", "gear", "carb", "drat")
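For reference, the pseudo-algorithm above might be sketched in base R roughly as follows. The helper name merge_names is hypothetical, and the rule that new names with no later match (like drat) go to the very end is an assumption that happens to reproduce the target order for these three models. I have not confirmed that this is how stargazer itself decides the order.

```r
# Merge one vector of coefficient names into an existing order.
# New names are held back until a name already in the order is found,
# then inserted immediately before that name. Names with no later
# match are appended at the end (an assumption, see the text above).
merge_names <- function(current, new) {
  pending <- character(0)
  for (nm in new) {
    if (nm %in% current) {
      if (length(pending) > 0) {
        pos <- match(nm, current)
        current <- append(current, pending, after = pos - 1)
        pending <- character(0)
      }
    } else {
      pending <- c(pending, nm)
    }
  }
  c(current, pending)
}

models <- list(
  lm(hp ~ wt + qsec + cyl + gear + carb, mtcars),
  lm(qsec ~ hp + cyl + gear + carb, mtcars),
  lm(qsec ~ wt + hp + gear + drat, mtcars)
)

# Coefficient names per model, without the intercept
name_lists <- lapply(models, function(m) setdiff(names(coef(m)), "(Intercept)"))

# Fold the lists together, starting from the first model's order
coef_order <- Reduce(merge_names, name_lists)
coef_order
# [1] "wt"   "qsec" "hp"   "cyl"  "gear" "carb" "drat"
```

This still loops over each list, but the vectors are only as long as the number of coefficients, so the cost should be negligible in practice.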
Update: memisc::mtable(lm0, lm1, lm2) is a neat alternative to stargazer that actually returns a data frame (and not just text), but it doesn't insert new coefficients in the already existing order and instead appends them to the list (with hp and drat at the end). It seems to just concatenate all the coefficient names and use their unique values.
===================================================
              lm0          lm1          lm2
---------------------------------------------------
(Intercept)   49.424       29.181***    19.530***
              (171.876)    (2.398)      (2.766)
wt            16.879                    0.827*
              (12.113)                  (0.383)
qsec          -8.124
              (6.109)
cyl           18.210*      -0.811**
              (8.785)      (0.280)
gear          13.342       -1.597**     -0.232
              (15.115)     (0.441)      (0.439)
carb          9.277        0.098
              (6.345)      (0.222)
hp                         -0.005       -0.026***
                           (0.007)      (0.004)
drat                                    0.099
                                        (0.636)
---------------------------------------------------