
I have data that looks like the example below, and I am trying to run an ANOVA to check the differences between all the columns, i.e. how significantly they differ from each other.

df<- structure(list(color = structure(c(3L, 4L, 3L, 4L, 4L, 4L, 4L, 
    4L, 4L, 4L), .Label = c("B", "G", "R", "W"), class = "factor"), 
        type = 1:10, X1 = c(0.006605138, 0.001165448, 0.006975109, 
        0.002207839, 0.00187902, 0.002208638, 0.001199808, 0.001162252, 
        0.001338847, 0.001106317), X2 = c(0.006041392, 0.001639298, 
        0.006140877, 0.002958169, 0.002744017, 0.003107995, 0.001729594, 
        0.001582564, 0.001971713, 0.001693236), X3 = c(0.024180351, 
        0.002189061, 0.027377442, 0.002886651, 0.002816333, 0.003527908, 
        0.00231891, 0.001695633, 0.00212034, 0.001962923)), .Names = c("color", 
    "type", "X1", "X2", "X3"), row.names = c(NA, 10L), class = "data.frame")

First I fit the ANOVA with the following command:

anovar <- aov(type ~ ., data = df)

and then summarize the output as follows:

summary(anovar)

So far so good: this runs fine. However, when I try to run TukeyHSD, it seems I have a structure problem. The error is shown below. I searched and could not find a similar situation. Any comment would be appreciated.

TukeyHSD(anovar)
# Error in rep.int(n, length(means)) : unimplemented type 'NULL' in 'rep3'
# In addition: Warning messages:
# 1: In replications(paste("~", xx), data = mf) : non-factors ignored: X1
# 2: In replications(paste("~", xx), data = mf) : non-factors ignored: X2
# 3: In replications(paste("~", xx), data = mf) : non-factors ignored: X3

1 Answer


As the description in the TukeyHSD documentation says, the function creates a set of confidence intervals on the differences between the means of the levels of a factor with the specified family-wise probability of coverage.

This means the model must contain at least one factor for TukeyHSD to compare. Here X1, X2, and X3 are numeric, and by default TukeyHSD tries every term in the model, which is what triggers the error. If you restrict the comparison to the factor with the which argument, it works:

> TukeyHSD(anovar, which = 'color') # color is the only categorical variable

  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = type ~ ., data = df)

$color
     diff       lwr      upr     p adj
W-R 4.375 -1.465325 10.21532 0.1121168

You also get warnings that the non-factors X1, X2, and X3 are ignored.
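To see in advance which columns TukeyHSD can compare, you can check which ones are factors. The sketch below rebuilds the question's data frame with data.frame() instead of structure() (same values) and lists the factor columns; the name is_fac is only illustrative.

```r
# Rebuild the question's data frame (same values as the dput output)
df <- data.frame(
  color = factor(c("R", "W", "R", "W", "W", "W", "W", "W", "W", "W"),
                 levels = c("B", "G", "R", "W")),
  type = 1:10,
  X1 = c(0.006605138, 0.001165448, 0.006975109, 0.002207839, 0.00187902,
         0.002208638, 0.001199808, 0.001162252, 0.001338847, 0.001106317),
  X2 = c(0.006041392, 0.001639298, 0.006140877, 0.002958169, 0.002744017,
         0.003107995, 0.001729594, 0.001582564, 0.001971713, 0.001693236),
  X3 = c(0.024180351, 0.002189061, 0.027377442, 0.002886651, 0.002816333,
         0.003527908, 0.00231891, 0.001695633, 0.00212034, 0.001962923)
)

# Which columns are factors? Only these are valid values for `which`.
is_fac <- sapply(df, is.factor)
names(df)[is_fac]

# Counts per level that actually occurs in the data.
# Only R and W are present, which is why the output has the single pair W-R.
table(droplevels(df$color))
```

Any column that is not listed by names(df)[is_fac] is treated as a numeric covariate by aov and is skipped by TukeyHSD with a warning.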

To plot the TukeyHSD result, save it to an object and call plot on it. There is a plot method (as well as a print method) for objects of class TukeyHSD.

forplot <- TukeyHSD(anovar, which = 'color')
plot(forplot) 

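If you want the numbers rather than the plot, the saved object can also be indexed directly. The sketch below uses R's built-in warpbreaks data as a stand-in, not the question's data, because it has a factor with three levels and so gives more than one pair; the names fit, tk, and res are illustrative.

```r
# Stand-in example on R's built-in warpbreaks data (not the question's data)
fit <- aov(breaks ~ wool + tension, data = warpbreaks)
tk <- TukeyHSD(fit, which = "tension")

# Each requested factor is a named list element holding a numeric matrix
res <- tk$tension
colnames(res)   # "diff" "lwr" "upr" "p adj"
rownames(res)   # one row per pair of levels

# Keep only the pairs whose adjusted p-value is below 0.05
res[res[, "p adj"] < 0.05, , drop = FALSE]
```

For the model in the question, the same pattern would be forplot$color, which has just the W-R row.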