How to apply the wilcox.test to a whole dataframe in R?

Question

I have a data frame with one grouping factor (the first column) that has more than two levels, and several columns of data. I want to apply wilcox.test to the whole data frame to compare each variable between the groups. How can I do this?

UPDATE: I know that wilcox.test only tests for a difference between two groups and that my data frame contains three. But I am more interested in how to do this than in which test to use. Most likely one group will be removed, but I have not decided on that yet, so I want to test all variants.

Here is a sample:

structure(list(group = c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), var1 = c(9.3, 
9.05, 7.78, 7.11, 7.14, 8.12, 7.5, 7.84, 7.8, 7.52, 8.84, 6.98, 
6.1, 6.89, 6.5, 7.5, 7.8, 5.5, 6.61, 7.65, 7.68), var2 = c(11L, 
11L, 10L, 1L, 3L, 7L, 11L, 11L, 11L, 11L, 4L, 1L, 1L, 1L, 2L, 
2L, 1L, 4L, 8L, 8L, 1L), var3 = c(7L, 11L, 3L, 7L, 11L, 2L, 11L, 
5L, 11L, 11L, 5L, 11L, 11L, 2L, 9L, 9L, 3L, 8L, 11L, 11L, 2L), 
    var4 = c(11L, 11L, 11L, 11L, 6L, 11L, 11L, 11L, 10L, 7L, 
    11L, 2L, 11L, 3L, 11L, 11L, 6L, 11L, 1L, 11L, 11L), var5 = c(11L, 
    1L, 2L, 2L, 11L, 11L, 1L, 10L, 2L, 11L, 1L, 3L, 11L, 11L, 
    8L, 8L, 11L, 11L, 11L, 2L, 9L)), .Names = c("group", "var1", 
"var2", "var3", "var4", "var5"), class = "data.frame", row.names = c(NA, 
-21L))

UPDATE

Thanks to everyone for all answers!

wilcox.test will only test for difference between two groups. Your data frame contains three. Are you sure this is the test you want and if so, is it all possible pairwise tests you want? — RoyalTS
@RoyalTS, I know about that. But I am interested more in how to do this, than what test to use. Suppose that one group will be removed, but I have not decided yet on that, therefore I want to test all variants. — Iurie Malai

Aaron left Stack Overflow · Accepted Answer · 2014-01-23T12:43:05

The pairwise.wilcox.test function seems like it would be useful here; with the sample data frame stored as d, perhaps like this?

out <- lapply(2:6, function(x) pairwise.wilcox.test(d[[x]], d$group))
names(out) <- names(d)[2:6]
out

If you just want the p-values, you can go through and extract those and make a matrix.

sapply(out, function(x) {
    p <- x$p.value
    n <- outer(rownames(p), colnames(p), paste, sep='v')
    p <- as.vector(p)
    names(p) <- n
    p
})
##         var1      var2      var3 var4      var5
## 2v1 0.5414627 0.8205958 0.4851572    1 1.0000000
## 3v1 0.1778222 0.3479835 1.0000000    1 1.0000000
## 2v2        NA        NA        NA   NA        NA
## 3v2 0.5414627 0.3479835 0.3784941    1 0.6919826
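The all-NA 2v2 row above is just the empty upper-triangle cell of the p-value matrix that pairwise.wilcox.test returns. A possible variant that drops such cells before naming the results is sketched below. The data frame here is made up with rnorm for illustration (it is not the sample data from the question), and the name pvals is only a placeholder.

```r
# Made-up data for illustration only: 3 groups, 7 observations each
set.seed(42)
d <- data.frame(group = rep(1:3, each = 7),
                var1  = rnorm(21),
                var2  = rnorm(21))

# One pairwise test per data column; lapply over d[-1] keeps the column names
out <- lapply(d[-1], function(x) pairwise.wilcox.test(x, d$group))

pvals <- sapply(out, function(x) {
    p <- x$p.value
    keep <- !is.na(p)  # the unused upper-triangle cells are NA
    n <- outer(rownames(p), colnames(p), paste, sep = "v")
    setNames(p[keep], n[keep])
})
pvals
```

With three groups this should leave one row per comparison (2v1, 3v1, 3v2). Note that a genuinely undefined p-value would also be dropped by the is.na filter, so this is only safe when every comparison yields a number.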

Also note that pairwise.wilcox.test adjusts for multiple comparisons using the Holm method by default; if you'd rather do something different, look at the p.adjust.method argument.
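As a rough sketch of switching the correction, the same lapply pattern can pass a different method name through to pairwise.wilcox.test. The data below are made up with rnorm for illustration (not the sample from the question), and out_bh and out_raw are placeholder names.

```r
# Made-up data for illustration only: 3 groups, 7 observations each
set.seed(1)
d <- data.frame(group = rep(1:3, each = 7),
                var1  = rnorm(21),
                var2  = rnorm(21))

# Benjamini-Hochberg adjustment instead of the default Holm
out_bh <- lapply(d[-1], function(x) {
    pairwise.wilcox.test(x, d$group, p.adjust.method = "BH")
})

# No adjustment at all, for comparison
out_raw <- lapply(d[-1], function(x) {
    pairwise.wilcox.test(x, d$group, p.adjust.method = "none")
})

out_bh$var1$p.value
out_raw$var1$p.value
```

The accepted method names are listed in p.adjust.methods. Adjusted p-values are never smaller than the unadjusted ones, so comparing the two matrices shows how much the correction costs for a given variable.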
