Using apply in R

Question

I have a numeric matrix and I want the mean of the 5 smallest elements in each column. I am trying to use one of the apply functions, but I cannot get it to work.

This is the function I need to apply. I have tested it with a for loop and it works fine.

   mean(head(sort(table[,x]),5))

This is one of several attempts I have tried:

   a<-mapply(function(x){mean(head(sort(table[,x]),5))},table)

I get the following error:

   Error in `[.data.frame`(x, order(x, na.last = na.last, decreasing = decreasing)) :   undefined columns selected

I have also tried sapply and lapply, but I haven't been able to make them work, and I can't find anything online to build on.

Thanks in advance

apply(df, 2, function(x) mean(sort(x, decreasing = F)[1:5])) — mts

mts · Accepted Answer · 2015-07-22T10:10:20

Your best bet here is apply, since you want to apply a function column-wise. With some sample data:

   set.seed(123)
   df = matrix(rnorm(100), 10, 10)

This will work:

   apply(df, 2, function(x) mean(sort(x, decreasing = FALSE)[1:5]))

What is this code doing?

The first argument to apply is the data, here df (you call it table in your question).
The second argument, 2, indicates that the function is applied to each column. There is also 1 for rows and c(1, 2) for both, i.e. each individual element.
The third argument is your function. Since this one is non-trivial, it is good practice to define it in place: you define a function of x (think of x as one column of your data frame/matrix) and then take the mean of the first 5 elements (the indexing [1:5]) of the sorted column. You can also see how to pass further arguments to the inner functions, e.g. decreasing = FALSE. That is admittedly the default here, but you would set decreasing = TRUE if you wanted the mean of the 5 highest values. If you have missing data, you might want to add na.rm = TRUE as an argument to mean. Note that sort drops NA values by default, so this only matters when a column has fewer than 5 non-missing values, in which case [1:5] pads the result with NA.
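As a minimal sketch of passing further arguments through apply, the variant below computes both the five lowest and the five highest values per column. The helper name mean_extreme_n and its n argument are hypothetical, introduced only for illustration. It uses head() instead of [1:5], which is a small deviation from the code above that avoids NA padding on short columns.

```r
set.seed(123)
df <- matrix(rnorm(100), 10, 10)

# Hypothetical helper: mean of the n smallest (or, with decreasing = TRUE,
# the n largest) values of a numeric vector.
mean_extreme_n <- function(x, n = 5, decreasing = FALSE) {
  # sort() drops NA values by default; head() returns at most n elements,
  # so no NA padding occurs when fewer than n values remain.
  mean(head(sort(x, decreasing = decreasing), n))
}

# apply() forwards extra named arguments to the function.
lowest  <- apply(df, 2, mean_extreme_n)                     # five smallest per column
highest <- apply(df, 2, mean_extreme_n, decreasing = TRUE)  # five largest per column
```

Defining the function once and naming it keeps the apply call short and makes the "lowest" and "highest" cases differ by a single argument.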

Here is your output:

   > apply(df, 2, function(x) mean(sort(x, decreasing = FALSE)[1:5]))
    [1] -0.6376458 -0.5049506 -1.1295099 -0.1233905 -0.7905504 -0.3444174 -0.5745786 -1.0836254 -0.1159064 -0.4503110
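The error message in the question mentions `[.data.frame`, so the original table is presumably a data frame rather than a matrix. That is an assumption. If it holds, a sketch with sapply, which iterates over the columns of a data frame directly, might look like this (tbl is a hypothetical stand-in for the asker's table):

```r
set.seed(123)
# Stand-in for the asker's data, assumed to be a data frame of numeric columns.
tbl <- as.data.frame(matrix(rnorm(100), 10, 10))

# sapply() hands each column to the function as a plain vector,
# so there is no need to index with tbl[, x] inside the function.
a <- sapply(tbl, function(col) mean(head(sort(col), 5)))

# vapply() does the same, but additionally checks that every result
# is a single numeric value.
b <- vapply(tbl, function(col) mean(head(sort(col), 5)), numeric(1))
```

This also suggests why the mapply attempt likely failed: mapply already passes each column's values as x, so table[, x] then treats those values as column indices instead of selecting one column.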
