1
votes

I need to run a one-way ANOVA on every column in a data frame against another column. My data frame has 57 columns, so typing out every column name is very time-consuming. Here is part of my data frame.

Basically, I need this call run for every column:

aov(df$PGY_16 ~ df$Total_Time_cm_16, df)

So I need a loop to run that for every column in my data frame.

Any help would be greatly appreciated!

2
A simple lapply should do. - Enigma

2 Answers

4
votes

For reproducibility, the code below uses the built-in mtcars data frame. It returns a list in which each element is an aov model of mtcars$cyl against one of the other columns in the data frame. map from the purrr package (part of the tidyverse suite of packages) takes care of running aov on each column in turn.

library(tidyverse)

aov.models = mtcars[ , -grep("cyl", names(mtcars))] %>%
  map(~ aov(mtcars$cyl ~ .x))

For your data, the analogous code would be:

aov.models = df[ , -grep("PGY_16", names(df))] %>%
  map(~ aov(df$PGY_16 ~ .x))
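
One caveat with the grep call: grep matches by pattern, so it would also drop any other column whose name merely contains "PGY_16". Here is a minimal sketch of the difference, using made-up column names (PGY_16_alt is hypothetical and not taken from the question):

```r
# Hypothetical column names, for illustration only
cols <- c("PGY_16", "PGY_16_alt", "Total_Time_cm_16")

# Pattern match: removes every name containing "PGY_16"
by_pattern <- cols[-grep("PGY_16", cols)]

# Exact match: removes only the column named "PGY_16"
by_name <- setdiff(cols, "PGY_16")

print(by_pattern)
print(by_name)
```

If your data frame does have such columns, selecting with df[ , setdiff(names(df), "PGY_16")] should leave the rest of the pipeline unchanged.
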
0
votes

This can also be done in base R, without installing the tidyverse. Here is an example using the mtcars data frame.

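# Fit one model per column; note that here each column is the response and cyl is the predictor
aov.models <- lapply(setdiff(names(mtcars), "cyl"), function(s) {
  aov(as.formula(paste(s, "~ cyl")), data = mtcars)
})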
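
Once the models are in a list, it usually helps to name the elements and pull out the ANOVA tables. The following is only a sketch of one way to do that on mtcars; the variable names responses, aov.summaries and p.values are my own, and it assumes the p-value of interest is in the first row of each table (the cyl term):

```r
# Columns to model, excluding the grouping column
responses <- setdiff(names(mtcars), "cyl")

# Same models as above, rebuilt here so the example is self-contained
aov.models <- lapply(responses, function(s) {
  aov(as.formula(paste(s, "~ cyl")), data = mtcars)
})

# Name each model after its response column so results are easy to look up
names(aov.models) <- responses

# One ANOVA table per column
aov.summaries <- lapply(aov.models, summary)

# p-value of the cyl term (first row of each table), as a named numeric vector
p.values <- sapply(aov.summaries, function(s) s[[1]][["Pr(>F)"]][1])

print(p.values)
```

After this, aov.summaries$mpg prints the table for a single column, and p.values can be sorted or filtered like any other named vector.
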