1
votes

I know there are similar questions but I couldn't find an answer to my question. I'm trying to rank elements in a matrix and then extract data of 5 highest elements.

Here is my attempt.

set.seed(20)
d<-matrix(rnorm(100),nrow=10,ncol=10)
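# Track the running maximum and its position, starting from the first cell
high<-d[1,1]
rowind<-1
colind<-1
for (i in 1:10) {
for (j in 1:10) {
  # Compare against the current maximum, not the fixed first cell
  if (high < d[i,j])
  {high<-d[i,j]
  rowind<-i
  colind<-j
  }
  }
}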

Although this gives me the data of the highest element, including row and column numbers, I can't think of a way to do the same for elements ranked from 2 to 5. I also tried

  rank(d, ties.method="max")

But it wasn't helpful because it just returns the ranks as a vector. What I ultimately want is a data frame (or any sort of table) that contains the rank, column name, row name, and value of the 5 highest elements in the matrix.

Edit

set.seed(20)
d<-matrix(rnorm(100),nrow=10,ncol=10)
d[1,2]<-5
d[2,1]<-5 
d[1,3]<-4
d[3,1]<-4

Thanks for the answers. Those worked perfectly for my purpose, but since I'm running this code on a correlation chart (where every pair of variables appears twice), I want to count only one of the two numbers for ranking purposes. Is there any way to do this? Thanks.

3
Is it by row or column? - akrun
Please use set.seed before making a random example. Makes it easier for folks to verify and compare answers. - Frank
@Frank Thanks for the suggestion. Just made the change. - sh2657
@akrun I'd say by column, but I don't think it will matter for my data because the original data I'm dealing with is a correlation table. Thanks. - sh2657
It is confusing because your question seems to be about the entire matrix: "I'm trying to rank elements in a matrix and then extract data of 5 highest elements." - akrun

3 Answers

4
votes

Here's a very crude way:

DF = data.frame(row = c(row(d)), col = c(col(d)), v = c(d))
DF[order(DF$v, decreasing=TRUE), ][1:5, ]

   row col        v
91   1  10 2.208443
82   2   9 1.921899
3    3   1 1.785465
32   2   4 1.590146
33   3   4 1.556143

It would be nice to only have to partially sort, but in ?order, it looks like this option is only available for sort, not for order.
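
To illustrate the partial-sort idea, here is a minimal sketch that uses sort(partial = ) only to find the cutoff value and then looks up the positions with which(). The names k, thr and top are illustrative, and it assumes ties at the cutoff are acceptable (with ties, more than k rows could come back).

```r
set.seed(20)
d <- matrix(rnorm(100), nrow = 10, ncol = 10)

k <- 5
n <- length(d)

# Partial sort: only position n - k + 1 is guaranteed to be in its sorted place,
# which is exactly the k-th largest value
thr <- sort(as.vector(d), partial = n - k + 1)[n - k + 1]

# Positions of every cell at or above the cutoff
idx <- which(d >= thr, arr.ind = TRUE)

# Only these few rows need a full ordering
top <- data.frame(idx, v = d[idx])
top <- top[order(top$v, decreasing = TRUE), ]
top$rank <- seq_len(nrow(top))
top
```

Whether this is actually faster than a full order() depends on the matrix size; for a 10 x 10 matrix the difference is negligible.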


If the matrix has row and col names, it might be convenient to see them instead of numbers. Here's what I might do:

dimnames(d) <- list(letters[1:10], letters[1:10])
DF = data.frame(as.table(d))

DF[order(DF$Freq, decreasing=TRUE), ][1:5, ]

   Var1 Var2     Freq
91    a    j 2.208443
82    b    i 1.921899
3     c    a 1.785465
32    b    d 1.590146
33    c    d 1.556143

The column names don't make much sense here, unfortunately, but you can change them with names(DF) <- as usual.
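
For the symmetric case described in the question's edit, one possible approach is to keep only the cells above the diagonal before ordering, so that each (i, j) / (j, i) pair is counted once. This is a sketch that assumes the matrix is symmetric (as a correlation matrix is) and that the diagonal should be excluded; the names upper and top5 are illustrative.

```r
set.seed(20)
d <- matrix(rnorm(100), nrow = 10, ncol = 10)
d[1, 2] <- 5; d[2, 1] <- 5
d[1, 3] <- 4; d[3, 1] <- 4

DF <- data.frame(row = c(row(d)), col = c(col(d)), v = c(d))

# Keep only cells strictly above the diagonal (row < col);
# the mirrored cell (col, row) and the diagonal are dropped
upper <- DF[DF$row < DF$col, ]

top5 <- head(upper[order(upper$v, decreasing = TRUE), ], 5)
top5$rank <- seq_len(nrow(top5))
top5
```

With the edited example data, the duplicated 5 and 4 each appear only once, at (1, 2) and (1, 3).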

2
votes

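Here is one option with the Matrix package (the output shown comes from a different random draw than the seeded example, so the values do not match the other answers):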

library(Matrix)
m1 <- summary(Matrix(d, sparse=TRUE))
head(m1[order(-m1[,3]),],5)
#   i  j        x
#93 3 10 2.359634
#31 1  4 2.234804
#23 3  3 1.980956
#55 5  6 1.801341
#16 6  2 1.678989

Or use melt

library(reshape2)
m2 <- melt(d)
head(m2[order(-m2[,3]), ], 5)

1
votes

Here is something quite simple in base R.

# set.seed(20)
# d <- matrix(rnorm(100), nrow = 10, ncol = 10)

d.rank <- matrix(rank(-d), nrow = 10, ncol = 10)

which(d.rank <= 5, arr.ind=TRUE)
     row col
[1,]   3   1
[2,]   2   4
[3,]   3   4
[4,]   2   9
[5,]   1  10

d[d.rank <= 5]
[1] 1.785465 1.590146 1.556143 1.921899 2.208443

Results can (easily) be made clearer (see comment from Frank):

cbind(which(d.rank <= 5, arr.ind=TRUE), v = d[d.rank <= 5], rank = rank(-d[d.rank <= 5]))

     row col        v rank
[1,]   3   1 1.785465    3
[2,]   2   4 1.590146    4
[3,]   3   4 1.556143    5
[4,]   2   9 1.921899    2
[5,]   1  10 2.208443    1
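
If the matrix has dimnames and a data frame with rank, row name, column name and value is wanted (as the question asks), a sketch along the same lines could use order() with arrayInd(). The dimnames and the names ord, pos and res are made up for illustration.

```r
set.seed(20)
d <- matrix(rnorm(100), nrow = 10, ncol = 10,
            dimnames = list(letters[1:10], LETTERS[1:10]))

# Linear indices of the 5 largest cells, largest first
ord <- order(d, decreasing = TRUE)[1:5]

# Convert linear indices to (row, col) positions
pos <- arrayInd(ord, dim(d))

res <- data.frame(rank  = seq_along(ord),
                  row   = rownames(d)[pos[, 1]],
                  col   = colnames(d)[pos[, 2]],
                  value = d[ord],
                  stringsAsFactors = FALSE)
res
```

Because order() already returns the cells from largest to smallest, the rank column is simply 1 to 5 and no second call to rank() is needed.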