I have a dataset that looks like this:

> print(mydata)
                col1                 col2                col3
1               0.819               0.851               0.874
2               0.972               0.703               0.821
3               0.891               0.790               0.951
4               0.839               0.799               0.819

I would like to know whether there are significant differences between the three groups col1, col2 and col3. My guess is that an ANOVA is the best way to test this.

Below are the script I used to build the dataset and run the test, and the error R displays:


> mydata <- data.frame(col1, col2, col3)
> accuracymetrics <- as.vector(mydata)
> anova(accuracymetrics)

Error in UseMethod("anova") : no applicable method for 'anova' applied to an object of class "data.frame"

It's the first time I'm running such an analysis in R, so bear with me if this question is not interesting for the forum. Any input on solving this error is appreciated!

Look at the help page for the anova function: "object: an object containing the results returned by a model fitting function (e.g., lm or glm)." It's meant to be called on a model, not a data frame. That's reflected in your error message. – camille
What do you mean by significant differences? Usually you perform a t-test to see whether the means of the samples are the same (under the assumption that they come from a normal distribution) or a Kolmogorov-Smirnov test to see whether they come from the same distribution. ANOVA is usually based on a regression model. – LyzandeR
@LyzandeR Here, I need to compare more than two groups, so according to ncbi.nlm.nih.gov/pmc/articles/PMC3916511 I need to use an ANOVA. – juansalix
So, that's the t-test. – LyzandeR

1 Answer

If I understood you correctly, the three groups you are talking about are the three columns in your data. If so, you need to do two things:

First, reshape your data from wide to long format, so that each row holds one group label and one value, like this:

group | value
------------
col1  | 0.819
col1  | 0.972

This can easily be done with the tidyr package:

library(tidyr)
# Collapse all columns into two: `group` holds the old column name, `value` the measurement
longdata <- gather(mydata, group, value)
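If you would rather not depend on tidyr (where gather() has since been superseded by pivot_longer()), here is a minimal base R sketch of the same reshaping using stack(). The data frame is rebuilt from the values printed in the question, which is an assumption about how mydata was created, and the renaming to group/value is only there to match the layout above.

```r
# Assumption: mydata rebuilt from the values printed in the question
mydata <- data.frame(
  col1 = c(0.819, 0.972, 0.891, 0.839),
  col2 = c(0.851, 0.703, 0.790, 0.799),
  col3 = c(0.874, 0.821, 0.951, 0.819)
)

# stack() returns two columns: `values` (the numbers) and `ind` (a factor
# holding the original column name)
longdata <- stack(mydata)

# Rename and reorder so the result matches the group/value layout
names(longdata) <- c("value", "group")
longdata <- longdata[, c("group", "value")]

head(longdata)
```

The result has one row per measurement, and group comes back as a factor, which is what aov expects for a grouping variable.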

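Second, use aov instead of anova. aov fits the model from a formula and a data frame, whereas anova expects a model that has already been fitted: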

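# One-way ANOVA: does the mean of `value` differ between the levels of `group`?
res.aov <- aov(value ~ group, data = longdata)
# Print the ANOVA table (Df, Sum Sq, Mean Sq, F value, Pr(>F))
summary(res.aov)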
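To tie this back to the error in the question, here is a self-contained sketch that also shows anova() working once it is given a fitted model rather than a data frame. The data are again rebuilt from the printed table (an assumption), and the names fit_aov, fit_lm, tab and p_value are just illustrative.

```r
# Assumption: mydata rebuilt from the values printed in the question
mydata <- data.frame(
  col1 = c(0.819, 0.972, 0.891, 0.839),
  col2 = c(0.851, 0.703, 0.790, 0.799),
  col3 = c(0.874, 0.821, 0.951, 0.819)
)

# Wide to long with base R; tidyr::gather() as above would work as well
longdata <- stack(mydata)
names(longdata) <- c("value", "group")

# Route 1: aov() fits the one-way model directly
fit_aov <- aov(value ~ group, data = longdata)
print(summary(fit_aov))

# Route 2: fit a linear model first, then hand the fitted model to anova()
fit_lm <- lm(value ~ group, data = longdata)
tab <- anova(fit_lm)
print(tab)

# p-value of the F test for `group` (first row of the ANOVA table)
p_value <- tab[["Pr(>F)"]][1]
```

Both routes should report the same F test for group; a small p_value would suggest that at least one column mean differs from the others, but it does not say which one.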

Here you can find even more details. Hope this helps.