1
votes

I want to compare a large vector with selected elements of a matrix in R.

A is a matrix and B is a vector. I want to compare each element of B with a selected element of A. C and D are the selection criteria: both are vectors of the same length as B, where C gives the row number in A and D gives the column number. A has dimension 10 x 100, and B, C, and D are all vectors of length 72000. Code with a for loop:

for ( j in 1:length(B) ){
  E[j] <- B[j] >= A[ C[j], D[j] ]
} 

This is too slow. I tried to vectorize it by first building a vector of the selected elements of A:

A1 <- array(0, length(B))
A2 <- A[,D]
for ( j in 1:length(B) ){
  A1[j] <- A2[ C[j], j ]
}   
E <- B >= A1

This is still too slow. Is there a better way to do this?

3
May I ask if you found any of the three answers below of help? - MatthewS
I think all of them are helpful. The third one, with cbind, is faster in this case. However, if A is an array with 3 or more dimensions, then the second is more suitable: I can use A[C,D,,] to subscript it, whereas A[cbind(C,D),,] is wrong. - lionup
Just to clarify, @lionup: if you subset an array by a single matrix (for example A[cbind(C,D)]), you get a vector with one value for each row of the index matrix. If you subset using multiple vectors (for example A[C,D]), you will receive a length(C) x length(D) array. Both are useful when appropriate, but it is important to understand that they are not the same! - MatthewS
Hi MatthewS, can you help me with a similar question: stackoverflow.com/questions/16721120/… - lionup

3 Answers

2
votes

You can easily select the element of A that corresponds to each entry of B, based on the selection criteria C (rows) and D (columns). Combine C and D into a two-column matrix, and then subset A with that matrix:

A.subset <- A[cbind(C, D)]

You now have a vector (A.subset) of the same length as B, and can perform whatever (vectorized) comparison you like in a performant manner.
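
As a minimal sketch of the full comparison, assuming mock data with the shapes described in the question (the random values and the names n and E_loop are made up for illustration), the matrix-indexed result can be checked against the original loop:

```r
# Mock data with the shapes from the question (values are invented)
set.seed(1)
A <- matrix(rnorm(10 * 100), nrow = 10, ncol = 100)
n <- 72000
B <- rnorm(n)
C <- sample(nrow(A), n, replace = TRUE)  # row indices into A
D <- sample(ncol(A), n, replace = TRUE)  # column indices into A

# Vectorized: each row of cbind(C, D) is one (row, column) pair
E <- B >= A[cbind(C, D)]

# Reference result from the original loop, used only as a check
E_loop <- logical(n)
for (j in seq_len(n)) {
  E_loop[j] <- B[j] >= A[C[j], D[j]]
}
identical(E, E_loop)

# By contrast, indexing with two separate vectors returns a full
# length(rows) x length(columns) block, not one element per pair
dim(A[C[1:3], D[1:3]])  # 3 3
```

The two-column index matrix is what makes this element-wise; passing C and D as separate arguments would build a 72000 x 72000 result instead.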

1
votes

The absolute fastest way I can think of is to treat A as a vector and extract the elements you want. A matrix is really just a vector with dimension attributes. Arithmetic operations are extremely fast and the [ subsetting operator is vectorised.

To get the desired elements, multiply the desired column number (D) by the total number of rows, then subtract the difference between the total number of rows and the desired row number (C), e.g. A[ D * nrow(A) - ( nrow(A) - C) ], which is the same as A[ (D - 1) * nrow(A) + C ], as in this example:

set.seed(1234)
A <- matrix( sample(5, 16, replace = TRUE) , 4 )
#    [,1] [,2] [,3] [,4]
#[1,]    2    1    1    5
#[2,]    1    3    5    5
#[3,]    2    1    1    2
#[4,]    1    4    2    1

## Rows
C <- sample( nrow(A) , 3 , replace = TRUE )
#[1] 1 2 3

## Columns
D <- sample( ncol(A) , 3 , replace = TRUE )
#[1] 1 3 2

## Treat A as a vector
## Elements are given by:
rs <- nrow(A)
A[ D * rs - ( rs - C) ]
#[1] 2 5 1
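
The index arithmetic relies on R storing matrices in column-major order. A small sketch with a made-up 3 x 4 matrix (the name idx is only for illustration) shows that the linear index picks the same elements as matrix indexing with cbind:

```r
# Small made-up matrix; the values 1..12 equal their own linear positions
A <- matrix(1:12, nrow = 3)
C <- c(1, 3, 2)   # row indices
D <- c(4, 1, 2)   # column indices

# Column-major position of A[i, j] is (j - 1) * nrow(A) + i
idx <- (D - 1) * nrow(A) + C

A[idx]            # 10 3 5, i.e. A[1, 4], A[3, 1], A[2, 2]
A[cbind(C, D)]    # same elements via matrix indexing

# The formula used above is algebraically the same index
all(idx == D * nrow(A) - (nrow(A) - C))
```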

0
votes

I'm not sure I totally get your question, but I think you want something like the following:

# set up some mock data
a <- matrix(rnorm(1000,0,1),nrow=10, ncol=100)
b <- rnorm(100,0,1)
c <- rep(1:10,10)
d <- 1:100

# define function
compare <- function(v,row,column)
    return(v >= a[row,column]) # you might want this to output to something else

# apply the comparison function to the b, c, and d vectors
mapply(FUN=compare, b, c, d)