I am looking to vectorize a loop in R that counts elements of a pairwise matrix relative to a proposed order.

The problem is easier to understand with an example:

Given an example matrix

m <- matrix(c(0,2,1,0,0,2,2,1,0), nrow = 3)
row.names(m) <- colnames(m) <- c("apple", "orange", "pear")

Each column of m counts how many people chose that column's fruit over the fruit in each row. For example, m["pear", "apple"] is 1, so one person chose an apple instead of a pear, while m["apple", "pear"] is 2, so two people chose pears instead of apples.
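
To make the row/column convention concrete, the matrix can be indexed by name. This is only an illustrative sketch: it rebuilds m exactly as defined above (using rownames() rather than row.names(), which is equivalent for a matrix) and reads off the two cells mentioned in the example.

```r
m <- matrix(c(0,2,1,0,0,2,2,1,0), nrow = 3)
rownames(m) <- colnames(m) <- c("apple", "orange", "pear")

# Rows are the fruit that was passed over, columns the fruit that was chosen.
apple.over.pear <- m["pear", "apple"]   # people who chose apple over pear
pear.over.apple <- m["apple", "pear"]   # people who chose pear over apple

print(apple.over.pear)  # 1
print(pear.over.apple)  # 2
```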

So, given a proposed order to represent the popularity of the three fruits:

p.order <- c("apple" = 2, "orange" = 1, "pear" = 3)

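I want to count the number of people whose choices would not be well represented by p.order, that is, people who chose one fruit over another even though p.order gives the fruit they passed over the smaller number.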

For this, I have a loop which works fine:

new.m <- array(dim = c(nrow(m), nrow(m)))

for(p in 1:nrow(m)){ 
  for(q in 1:nrow(m)){
    new.m[p,q] <- 0 + (p.order[p] < p.order[q])
  }
}
sum(m * new.m)

But this loop is slow, given a large enough problem.

Is there a way to vectorise (or speed up) this loop?

UPDATE: As requested, here is the performance of the accepted solution compared with the original loop:

Loop function:

loop.function <- function(p.order, mat){
  nt <- nrow(mat)
  new.m <- array(dim = c(nt, nt))
  for(p in 1:nt){
    for(q in 1:nt){
      new.m[p,q] <- 0 + (p.order[p] < p.order[q])
    }
  }
  return(sum(mat * new.m))
}

Vectorized function:

vec.function <- function(p.order, mat){
  return(sum(mat * outer(p.order, p.order, FUN = `<`)))
}

Performance:

Unit: microseconds
                      expr  min   lq   mean median    uq    max neval cld
 loop.function(p.order, m) 14.4 14.7 93.049   14.9 15.15 7805.5   100   a
  vec.function(p.order, m)  7.6  8.1 33.850    8.3  8.60 2474.9   100   a
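
The timings above are for the 3 x 3 example, where both versions finish in microseconds. A rough sketch of how the comparison might be repeated on a larger problem follows. The size n, the seed, the Poisson counts, and the names big.m and big.order are all assumptions made up for illustration, not part of the original question, and base R's system.time() stands in for the benchmarking package. No timings are quoted here because they depend on the machine.

```r
# Hypothetical larger problem: n items with random pairwise counts.
set.seed(1)                       # arbitrary seed, for reproducibility only
n <- 500                          # assumed size; adjust to taste
big.m <- matrix(rpois(n * n, lambda = 2), nrow = n)
diag(big.m) <- 0                  # nobody chooses an item over itself
big.order <- sample(n)            # a random proposed ranking, 1..n

# The two functions from the question, repeated so the sketch is self-contained.
loop.function <- function(p.order, mat){
  nt <- nrow(mat)
  new.m <- array(dim = c(nt, nt))
  for(p in 1:nt){
    for(q in 1:nt){
      new.m[p,q] <- 0 + (p.order[p] < p.order[q])
    }
  }
  sum(mat * new.m)
}

vec.function <- function(p.order, mat){
  sum(mat * outer(p.order, p.order, FUN = `<`))
}

# Both versions should agree before their speed is compared.
loop.result <- loop.function(big.order, big.m)
vec.result  <- vec.function(big.order, big.m)
stopifnot(loop.result == vec.result)

# Elapsed time for one call of each version.
print(system.time(loop.function(big.order, big.m)))
print(system.time(vec.function(big.order, big.m)))
```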

1 Answer


Here is a vectorized option with outer, which builds the whole comparison matrix in a single call instead of filling it cell by cell:

sum(m * outer(p.order, p.order, FUN = `<`))
#[1] 5
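
To see why this gives the same count as the loop, the one-liner can be split into its intermediate steps. The sketch below reuses m and p.order from the question; the names cmp and disagree are introduced here only for illustration, and the comments assume that a smaller number in p.order means a more popular fruit, which is what the loop's comparison implies.

```r
m <- matrix(c(0,2,1,0,0,2,2,1,0), nrow = 3)
rownames(m) <- colnames(m) <- c("apple", "orange", "pear")
p.order <- c("apple" = 2, "orange" = 1, "pear" = 3)

# cmp[p, q] is TRUE when p.order[p] < p.order[q],
# i.e. the row fruit is ranked ahead of the column fruit.
cmp <- outer(p.order, p.order, FUN = `<`)
print(cmp)

# Multiplying coerces TRUE/FALSE to 1/0, so only the cells where people
# chose the column fruit over a better-ranked row fruit keep their counts.
disagree <- m * cmp
print(disagree)

total <- sum(disagree)
print(total)  # 5, matching the result above
```

Here the surviving cells are m["apple", "pear"] (2), m["orange", "apple"] (2) and m["orange", "pear"] (1), which add up to 5.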