3
votes

I have generated a list of models, and would like to create a summary table.

As and example, here are two models:

x <- seq(1:10)
y <- sin(x)^2
model1 <- lm(y ~ x)
model2 <- lm(y ~ x + I(x^2) + I(x^3))

and two formulas, the first generating the equation from components of formula

get.model.equation <- function(x) {
  x <- as.character((x$call)$formula)
  x <- paste(x[2],x[1],x[3])
}

and the second generating the name of model as a string

get.model.name <- function(x) {
  x <- deparse(substitute(x))
}

With these, I create a summary table

model.list <- list(model1, model2)
AIC.data <- lapply(X = model.list, FUN = AIC)
AIC.data <- as.numeric(AIC.data)
model.models <- lapply(X = model.list, FUN = get.model)
model.summary <- cbind(model.models, AIC.data)
model.summary <- as.data.frame(model.summary)
names(model.summary) <- c("Model", "AIC")
model.summary$AIC <- unlist(model.summary$AIC)
rm(AIC.data)
model.summary[order(model.summary$AIC),]

Which all works fine. I'd like to add the model name to the table using get.model.name

x <- get.model.name(model1)

Which gives me "model1" as I want.

So now I apply the function to the list of models

model.names <- lapply(X = model.list, FUN = get.model.name)

but now instead of model1 I get X[[1L]]

How do I get model1 rather than X[[1L]]?

I'm after a table that looks like this:

 Model                Formula       AIC
model1                  y ~ x  11.89136
model2 y ~ x + I(x^2) + I(x^3) 15.03888
3
you've created an unnamed list in model.list and you're passing each element of this list to get.model.name. So, X[[1]] is indeed passed the first time, and it fetches what you've asked for. @baptiste's overcomes this issue by creating simply a named list and avoiding the complications. If his solution is not what you're looking for, you'll have to explain, perhaps, why you're doing it this way.Arun
Thanks Arun and baptiste. I've added what I'm trying to produce.Isaiah
Thanks again. I've just replaced model.list <- list(model1, model2) with model.list <- list(model1=model1, model2=model2) after reading your comments and it works well now.Isaiah

3 Answers

9
votes

Do you want something like this?

model.list <- list(model1 = lm(y ~ x), 
                   model2 = lm(y ~ x + I(x^2) + I(x^3)))
sapply(X = model.list, FUN = AIC)
2
votes

I'd do something like this:

model.list <- list(model1 = lm(y ~ x),
                   model2 = lm(y ~ x + I(x^2) + I(x^3)))
# changed Reduce('rbind', ...) to do.call(rbind, ...) (Hadley's comment)
do.call(rbind, 
        lapply(names(model.list), function(x) 
          data.frame(model = x, 
          formula = get.model.equation(model.list[[x]]), 
          AIC = AIC(model.list[[x]])
          )
        )
      )

#    model                 formula      AIC
# 1 model1                   y ~ x 11.89136
# 2 model2 y ~ x + I(x^2) + I(x^3) 15.03888
2
votes

Another option, with ldply, but see hadley's comment below for a more efficient use of ldply:

 # prepare data
    x <- seq(1:10)
    y <- sin(x)^2
    dat <- data.frame(x,y)

# create list of named models obviously these are not suited to the data here, just to make the workflow work...
models <- list(model1=lm(y~x, data = dat), 
               model2=lm(y~I(1/x), data=dat),
               model3=lm(y ~ log(x), data = dat),
               model4=nls(y ~ I(1/x*a) + b*x, data = dat, start = list(a = 1, b = 1)), 
               model5=nls(y ~ (a + b*log(x)), data=dat, start = setNames(coef(lm(y ~ log(x), data=dat)), c("a", "b"))),
               model6=nls(y ~ I(exp(1)^(a + b * x)), data=dat, start = list(a=0,b=0)),
               model7=nls(y ~ I(1/x*a)+b, data=dat, start = list(a=1,b=1))
)

library(plyr)
library(AICcmodavg) # for small sample sizes
# build table with model names, function, AIC and AICc
data.frame(cbind(ldply(models, function(x) cbind(AICc = AICc(x), AIC = AIC(x))), 
                 model = sapply(1:length(models), function(x) deparse(formula(models[[x]])))
                      ))

     .id     AICc      AIC                     model
1 model1 15.89136 11.89136                     y ~ x
2 model2 15.78480 11.78480                y ~ I(1/x)
3 model3 15.80406 11.80406                y ~ log(x)
4 model4 16.62157 12.62157    y ~ I(1/x * a) + b * x
5 model5 15.80406 11.80406      y ~ (a + b * log(x))
6 model6 15.88937 11.88937 y ~ I(exp(1)^(a + b * x))
7 model7 15.78480 11.78480        y ~ I(1/x * a) + b

It's not immediately obvious to me how to replace the .id with a column name in the ldply function, any tips?