0
votes

I am looking to run multiple ANOVAs in R, so I was hoping to write a function.

df = iris

run_anova <- function(var1,var2,df) {
  fit = aov(var1 ~ var2, df)
  return(fit)
}

In the iris dataset, the column names are "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width" "Species"

Assuming that I want to use these columns in the formula, how do I pass them into the run_anova function? I have tried passing them in as strings:

run_anova("Sepal.Width", "Petal.Length", df)

That doesn't work; the following error appears: "In storage.mode(v) <- "double" :"

run_anova(Sepal.Width, Petal.Length, df)

When I pass them in without the quotes, I get a "not found" error instead. How can I pass the names of the df columns into the function?

Many thanks in advance for your help.

2
I would try building a formula from the variable name strings: fit = aov(as.formula(paste(var1, "~", var2)), df) – Ben
Or reformulate(var2, var1) for short. – rawr
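
The two suggestions in the comments above can be sketched roughly as follows, assuming the column names arrive as character strings. The function name run_anova_paste is hypothetical and only used for illustration.

```r
# Hypothetical helper: builds the formula from two column-name strings.
run_anova_paste <- function(var1, var2, df) {
  # paste() yields e.g. "Sepal.Width ~ Petal.Length"; as.formula() parses it
  fo <- as.formula(paste(var1, "~", var2))
  aov(fo, data = df)
}

fit <- run_anova_paste("Sepal.Width", "Petal.Length", iris)

# reformulate(termlabels, response) builds the same formula more compactly
fo_short <- reformulate("Petal.Length", "Sepal.Width")
```

Note the argument order of reformulate: the right-hand-side terms come first and the response second, which is why the comment writes reformulate(var2, var1).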

2 Answers

1
votes

1) Use reformulate to create the formula. The do.call with substitute(df) is needed so that the Call: line in the output shows the actual formula and data frame name rather than fo and df. If you don't care about that, you can use the shorter version shown in (3).

run_anova <- function(var1, var2, df) {
  fo <- reformulate(var2, var1)
  do.call("aov", list(fo, substitute(df)))
}

run_anova("Sepal.Width", "Petal.Length", iris)

giving

Call:
   aov(formula = Sepal.Width ~ Petal.Length, data = iris)    

Terms:
                Petal.Length Residuals
Sum of Squares      5.196047 23.110887
Deg. of Freedom            1       148

Residual standard error: 0.3951641
Estimated effects may be unbalanced

2) Although the use of eval is discouraged, an alternative which also gives nice output is:

run_anova2 <- function(var1, var2, df) {
  fo <- reformulate(var2, var1)
  eval.parent(substitute(aov(fo, df)))
}

run_anova2("Sepal.Width", "Petal.Length", iris)

3) If you don't care about the Call line in the output being nice then this simpler code can be used:

run_anova3 <- function(var1, var2, df) {
  fo <- reformulate(var2, var1)
  aov(fo, df)
}

run_anova3("Sepal.Width", "Petal.Length", iris)

giving:

Call:
   aov(formula = fo, data = df)
...etc...
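
Since the question is about running multiple ANOVAs, here is a minimal sketch of how the string-based version (3) could be applied over several columns with lapply. The choice of responses and of Species as the grouping factor is an assumption made for illustration, not something stated in the question.

```r
# Same string-based approach as version (3) above
run_anova3 <- function(var1, var2, df) {
  fo <- reformulate(var2, var1)
  aov(fo, df)
}

# Assumed set of responses: the four numeric columns of iris
responses <- c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")

# Fit one one-way ANOVA per response, with Species as the grouping factor;
# each element of responses is passed as var1
fits <- lapply(responses, run_anova3, var2 = "Species", df = iris)
names(fits) <- responses

# Print the ANOVA table for each fit
for (nm in responses) {
  cat("Response:", nm, "\n")
  print(summary(fits[[nm]]))
}
```

Because the column names are plain strings, they can be kept in an ordinary character vector and iterated over, which is harder to do with unquoted names.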
0
votes

An alternative is to use rlang's quasi-quotation syntax:

df = iris

library(rlang)
run_anova <- function(var1, var2, df) {
    var1 <- parse_expr(quo_name(enquo(var1)))
    var2 <- parse_expr(quo_name(enquo(var2)))
    eval_tidy(expr(aov(!!var1 ~ !!var2, data = df)))
}

This allows you to use both strings and unquoted expressions for var1 and var2:

run_anova("Sepal.Width", "Petal.Length", df)
run_anova(Sepal.Width, Petal.Length, df)

Both expressions return the same result.