0
votes

I have a numeric matrix and I want to get the mean of the 5 lowest (smallest value) elements from each column. I have been trying to use one of the apply functions, but I cannot get it to work.

This is the expression I need to apply. I have tested it in a for loop and it works fine.

   mean(head(sort(table[,x]),5))

This is one of the several attempts I have made:

   a<-mapply(function(x){mean(head(sort(table[,x]),5))},table)

I get the following error:

   Error in `[.data.frame`(x, order(x, na.last = na.last, decreasing = decreasing)) :   undefined columns selected 

I have also tried sapply and lapply, but I haven't managed to make them work, and I couldn't find anything to go on by searching the internet.

Thanks in advance

4
apply(df, 2, function(x) mean(sort(x, decreasing = F)[1:5])) - mts

4 Answers

4
votes

Your easiest option here is apply, since you want to apply a function column-wise. With some sample data:

set.seed(123)
df = matrix(rnorm(100), 10, 10)

this will work:

apply(df, 2, function(x) mean(sort(x, decreasing = FALSE)[1:5]))

What is this code doing?

  • the first argument to apply is the data, here df (you call it table in your question).
  • the second argument, 2, indicates that the function is applied to each column. There is also 1 for rows and c(1,2) for both, i.e. each individual element.
  • the third argument is your function. Since this one is a short one-off, it is convenient to define it in place as an anonymous function: you define a function of x (think of x as one column of your data frame/matrix) and then take the mean of the first 5 elements (indexing [1:5]) of the sorted column. You can also see how to pass further arguments to the functions involved (e.g. decreasing = FALSE, which admittedly is the default behaviour here, but say you wanted the mean of the 5 highest values instead). If you have missing data, note that sort drops NA values by default, so [1:5] only produces NA when a column has fewer than 5 non-missing values; in that case you might want to add na.rm = TRUE as an argument to mean.
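The same idea can be sketched with a named function instead of an anonymous one, which makes passing extra arguments through apply more visible. The helper name mean_of_extremes and its parameters n and decreasing are made up for illustration and do not come from the question:

```r
# Hypothetical helper: mean of the n smallest (or largest) values of a vector.
mean_of_extremes <- function(x, n = 5, decreasing = FALSE) {
  mean(head(sort(x, decreasing = decreasing), n))
}

set.seed(123)
df <- matrix(rnorm(100), 10, 10)

# Arguments given after the function are passed on to it by apply().
lowest  <- apply(df, 2, mean_of_extremes, n = 5)
highest <- apply(df, 2, mean_of_extremes, n = 5, decreasing = TRUE)
```

Here lowest should match the one-liner above, and highest gives the mean of the 5 largest values per column.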

Here is your output:

> apply(df, 2, function(x) mean(sort(x, decreasing = FALSE)[1:5]))
 [1] -0.6376458 -0.5049506 -1.1295099 -0.1233905 -0.7905504 -0.3444174 -0.5745786 -1.0836254 -0.1159064 -0.4503110
4
votes

You can use colMeans, which makes the code more compact (m is your matrix):

colMeans(head(apply(m, 2, sort),5))
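As a quick check, here is a minimal sketch comparing this with the plain apply approach. It assumes m is a numeric matrix without missing values; the sample data is made up. If some columns contain NA, sort drops them, the sorted columns can end up with different lengths, and apply then returns a list, so colMeans would no longer work:

```r
set.seed(123)
m <- matrix(rnorm(100), 10, 10)   # made-up sample data, no NAs

sorted <- apply(m, 2, sort)       # matrix with each column sorted ascending
via_colmeans <- colMeans(head(sorted, 5))
via_apply    <- apply(m, 2, function(x) mean(head(sort(x), 5)))

# Both approaches should agree up to numerical tolerance.
same <- isTRUE(all.equal(via_colmeans, via_apply))
```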
2
votes

Try this:

set.seed(1)
(mat <- matrix(sample(1:12, 12), ncol = 3))
#      [,1] [,2] [,3]
# [1,]    4    2    3
# [2,]    5    7    1
# [3,]    6   10   11
# [4,]    9   12    8
n <- 2
apply(mat, 2, function(x) mean(head(sort(x), n)))
# [1] 4.5 4.5 2.0
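One reason to prefer head(sort(x), n) over sort(x)[1:n] shows up when a column has fewer than n values. A small sketch with a made-up vector:

```r
x <- c(3, 1, 2)                    # only three values, but we ask for five

padded  <- sort(x)[1:5]            # indexing past the end pads with NA
clipped <- head(sort(x), 5)        # head() returns at most what exists

mean_padded  <- mean(padded)       # NA, because of the padding
mean_clipped <- mean(clipped)      # mean of 1, 2, 3
```

With the indexing version you would need mean(..., na.rm = TRUE) to get a number back; with head the short column is handled without it.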
1
votes

Try using apply:

a<-apply(table,2,function(x){mean(head(sort(x),5))})
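The error message in the question mentions `[.data.frame`, so table is presumably a data frame rather than a matrix. In that case sapply over the columns is an alternative, since apply first converts the data frame to a matrix (which turns everything into character if any column is non-numeric). A minimal sketch, with a made-up data frame tbl standing in for table:

```r
# Hypothetical data frame standing in for `table` from the question.
set.seed(42)
tbl <- as.data.frame(matrix(rnorm(30), nrow = 10))

# sapply() iterates over the columns of a data frame directly.
by_sapply <- sapply(tbl, function(x) mean(head(sort(x), 5)))

# apply() converts to a matrix first; same result for all-numeric data.
by_apply <- apply(tbl, 2, function(x) mean(head(sort(x), 5)))
```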