Can anyone suggest a dplyr answer to the following question?
Split data.frame by country, and create linear regression model on each subset
For completeness, the question and answer from the link are included below.
Question
For reference, here's Josh's question:
I have a data.frame of data from the World Bank which looks something like this;
country date BirthRate US.
4 Aruba 2011 10.584 25354.8
5 Aruba 2010 10.804 24289.1
6 Aruba 2009 11.060 24639.9
7 Aruba 2008 11.346 27549.3
8 Aruba 2007 11.653 25921.3
9 Aruba 2006 11.977 24015.4
All in all there are 70-something subsets of countries in this data frame that I would like to run a linear regression on. If I use the following, I get a nice lm for a single country:
andora = subset(high.sub, country == "Andorra")
andora.lm = lm(BirthRate~US., data = andora)
anova(andora.lm)
summary(andora.lm)
But when I try to use the same type of code in a for loop, I get an error which I'll print below the code;
high.sub = subset(highInc, date > 1999 & date < 2012)
high.sub <- na.omit(high.sub)
highnames <- unique(high.sub$country)
for (i in highnames) {
linmod <- lm(BirthRate~US., data = high.sub, subset = (country == "[i]"))
}
#Error message:
Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) :
0 (non-NA) cases
If I can get this loop to run I would ideally like to append the coefficients and even better the r-squared values for each model to an empty data.frame. Any help would be greatly appreciated.
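A note on the error: "[i]" is a literal string, so country == "[i]" matches no rows and lm sees zero cases. A minimal sketch of a working loop, comparing against the loop variable i and collecting the coefficients and R-squared in a data frame, might look like the following. The toy high.sub is made-up data standing in for the World Bank data, and the column names in results are my own choices, not anything from the original post.

```r
# Toy stand-in for high.sub; the values are invented for illustration only
set.seed(1)
high.sub <- data.frame(
  country = rep(c("Aruba", "Andorra"), each = 6),
  date    = rep(2006:2011, times = 2),
  US.     = c(24015.4, 25921.3, 27549.3, 24639.9, 24289.1, 25354.8,
              30000, 31000, 32500, 33000, 34200, 35100),
  stringsAsFactors = FALSE
)
high.sub$BirthRate <- 15 - 0.0001 * high.sub$US. + rnorm(12, sd = 0.1)

highnames <- unique(high.sub$country)
results <- data.frame()  # starts empty, grows by one row per country

for (i in highnames) {
  # Compare against the loop variable i, not the literal string "[i]"
  linmod <- lm(BirthRate ~ US., data = high.sub[high.sub$country == i, ])
  results <- rbind(results, data.frame(
    country   = i,
    intercept = unname(coef(linmod)[1]),
    slope     = unname(coef(linmod)[2]),
    r.squared = summary(linmod)$r.squared,
    stringsAsFactors = FALSE
  ))
}

results
```

Growing a data frame with rbind inside a loop is fine for 70-odd countries, though it gets slow for very large numbers of groups.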
Answer
For reference, here's jlhoward's answer (incorporating BondedDust's comment) making use of the *apply functions found in this excellent question: R Grouping functions: sapply vs. lapply vs. apply. vs. tapply vs. by vs. aggregate
models <- sapply(unique(as.character(df$country)),
                 function(cntry) lm(BirthRate ~ US., df, subset = (country == cntry)),
                 simplify = FALSE, USE.NAMES = TRUE)
# to summarize all the models
lapply(models,summary)
# to run anova on all the models
lapply(models,anova)
#This produces a named list of models, so you could extract the model for Aruba as:
models[["Aruba"]]
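Since the original question also asked for the coefficients and R-squared in a data frame, here is a rough sketch of pulling them out of the named list. It assumes models was built as above; the toy df and the names coefs, r2 and result are hypothetical and only there to make the example run on its own.

```r
# Toy stand-in for df; the values are invented for illustration only
set.seed(42)
df <- data.frame(
  country = rep(c("Aruba", "Andorra", "Austria"), each = 6),
  US.     = runif(18, 20000, 40000),
  stringsAsFactors = FALSE
)
df$BirthRate <- 15 - 1e-4 * df$US. + rnorm(18, sd = 0.2)

# Same construction as in the answer: one lm per country, in a named list
models <- sapply(unique(as.character(df$country)),
                 function(cntry) lm(BirthRate ~ US., df, subset = (country == cntry)),
                 simplify = FALSE, USE.NAMES = TRUE)

# One row per country: sapply(models, coef) is 2 x n, so transpose it
coefs <- t(sapply(models, coef))
r2    <- sapply(models, function(m) summary(m)$r.squared)

result <- data.frame(
  country   = names(models),
  intercept = coefs[, 1],
  slope     = coefs[, 2],
  r.squared = r2,
  row.names = NULL,
  stringsAsFactors = FALSE
)

result
```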
?do and posts on SO – Henrik
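Following the pointer to ?do, one possible dplyr version is sketched below. It groups by country and uses do() to fit one model per group, returning a one-row data frame of coefficients and R-squared for each. The toy df and the output column names are assumptions for illustration. Note that do() is marked as superseded in recent dplyr releases, although it still works, so treat this as one option rather than the recommended way.

```r
library(dplyr)

# Toy stand-in for df; the values are invented for illustration only
set.seed(42)
df <- data.frame(
  country = rep(c("Aruba", "Andorra", "Austria"), each = 6),
  US.     = runif(18, 20000, 40000),
  stringsAsFactors = FALSE
)
df$BirthRate <- 15 - 1e-4 * df$US. + rnorm(18, sd = 0.2)

fits <- df %>%
  group_by(country) %>%
  do({
    # Inside do(), `.` is the data for the current country only
    fit <- lm(BirthRate ~ US., data = .)
    data.frame(
      intercept = coef(fit)[[1]],
      slope     = coef(fit)[[2]],
      r.squared = summary(fit)$r.squared
    )
  })

fits
```

If you need the fitted model objects themselves, a named argument such as do(model = lm(BirthRate ~ US., data = .)) stores them in a list column instead.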