0
votes

I am fitting 100 linear and 100 quadratic models to the same synthetic data with a for loop.

I want to extract (randomly) 5 linear fits, and 5 quadratic fits, and plot these regression functions along with the mean function of the true model.

My thoughts so far:

  1. pick 10 random numbers
  2. create a container (data frame?) to hold the 10 models
  3. enter the for loop, get the fits, if the ith iteration is from the set of random numbers, then hold that model in the container.
  4. exit for loop, plot all of the fits in the container alongside true model

The main difficulty I'm having is seeing how to create an empty dataframe for class "lm".

Before the for loop, build containers,

fit.container.linear <- list()
fit.container.quadratic <- list()

And pick indices randomly for the models that will go into the container

randy.linear <- sample(1:100,5,replace=F) 
randy.quadratic <- sample(1:100,5,replace=F) 

And then in the for loop, I get a linear and quadratic model using lm. model.1 is linear. model.2 is quadratic. Now within the for loop I store the models in the containers

for (k in 1:length(randy.linear)){
    if (i == randy.linear[k]){
        fit.container.linear <- list(fit.container.linear,model.1)
    }
    }

for (p in 1:length(randy.quadratic)) {
    if (i == randy.linear[p]){
        fit.container.quadratic <- list(fit.container.quadratic,model.2)
    }
}

Exit the for loop.

Now I want to access the containers holding the coefficients, and plot everything on one graph.

1
the container you want is a list: results <- list().Ben Bolker
thanks. is there a way to plot directly from results? the command plot(results[1]) throws an error.Dan Schwartz
can you edit your question to post the code you've got so far?Ben Bolker

1 Answers

0
votes

I'd store all of the models in a list, then just index them.

lin_mods <- vector(mode = "list", length = 100)
quad_mods  <- vector(mode = "list", length = 100)

for(i in 1:100){
    # ....
    lin_mods[[i]]  <- model.1
    quad_mods[[i]] <- model.2
}

randy_linear    <- sample(1:100, 5, replace=F) 
randy_quadradic <- sample(1:100, 5, replace=F)

rand_lin_mods  <- lin_mods[randy_linear]
rand_quad_mods <- quad_mods[randy_quadradic]

As for plotting them, that's hard to give a good answer for without a sense of your original data and predictors. What you could try is:

pred_grid <- expand.grid(y = seq(min(data$y), max(data$y), length = 100)
get_fit <- function(model, newdata){
    newdata$fit <- predict(model, newdata = newdata)
    return(newdata)
}    

library(plyr)
names(randy_lin_mods) <- 1:5
names(randy_quad_mods) <- 1:5

lin_fits  <- ldply(randy_lin_mods,  get_fit, newdata = pred_grid)
quad_fits <- ldply(randy_quad_mods, get_fit, newdata = pred_grid)

Now lin_fits and quad_fits are data frames of the predicted fits from the models.