3
votes

Suppose I have two matrices x and y, both with dimensions 100x2. For each row i, I would like to compute the 2x2 outer product of row i of x with row i of y, and collect the results in a list. For example, via a for loop (shown here with 5 rows for brevity):

x = matrix(rnorm(10), nrow = 5)
y = matrix(rnorm(10), nrow = 5)
myList = list()
for(i in 1:5){
    myList[[i]] = t(x[i, , drop = FALSE]) %*% y[i, ]
}

Is there a more efficient way to do this calculation? I've tried to figure out how to express this as a matrix multiplication but have had no luck. I've also considered mapply, but it seems I'd need to convert x and y to lists of vectors instead of matrices to use it, and I doubt that is the right approach either.

3
I think it would be Map(function(x,y) matrix(x,ncol=1)%*%y , split(x, row(x)), split(y, row(y))) - akrun
You should preallocate the myList object, which will make your for loop approach noticeably faster. Use myList = vector("list", 5) - talat
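
A minimal sketch of the preallocation suggested in the comment above, applied to the loop from the question. The set.seed call is an assumption added only to make the sketch reproducible, and seq_len(nrow(x)) replaces the hard-coded 1:5 so the list size follows the data.

```r
set.seed(1)  # assumption: not in the original, added for reproducibility
x <- matrix(rnorm(10), nrow = 5)
y <- matrix(rnorm(10), nrow = 5)

# Create a list with one empty slot per row up front,
# instead of growing the list on every iteration
myList <- vector("list", nrow(x))
for (i in seq_len(nrow(x))) {
    myList[[i]] <- t(x[i, , drop = FALSE]) %*% y[i, ]
}
```

Each element of myList is then a 2x2 matrix, exactly as in the original loop.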

3 Answers

5
votes

One way with Map

Map(function(x,y) matrix(x,ncol=1)%*%y ,
               split(x, row(x)), split(y, row(y))) 
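
To make the split step less opaque, here is a small sketch of what happens, assuming the 5x2 matrices from the question. The names xRows, yRows, mapList and loopList are hypothetical, chosen for illustration only. row(x) returns a matrix of row indices, so split(x, row(x)) groups the elements of x by row and yields one vector per row.

```r
set.seed(1)  # assumption: not in the original, added for reproducibility
x <- matrix(rnorm(10), nrow = 5)
y <- matrix(rnorm(10), nrow = 5)

# row(x) holds each element's row index, so split() groups x by row
xRows <- split(x, row(x))   # list of 5 numeric vectors, each of length 2
yRows <- split(y, row(y))

mapList <- Map(function(xi, yi) matrix(xi, ncol = 1) %*% yi, xRows, yRows)

# Reference result from the loop in the question
loopList <- vector("list", nrow(x))
for (i in seq_len(nrow(x))) {
    loopList[[i]] <- t(x[i, , drop = FALSE]) %*% y[i, ]
}

# Map() keeps the names "1".."5" produced by split(),
# so drop them before comparing with the unnamed loop result
isSame <- isTRUE(all.equal(unname(mapList), loopList))
```

Note that the Map result is a named list, which matters if you compare it to the loop output with all.equal.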

4
votes

You can shorten (and possibly slightly speed up) your code with

NewList <- list()
for (i in 1:nrow(x)) NewList[[i]] <- outer(x[i,],y[i,])
#> all.equal(NewList,myList)
#[1] TRUE

or, equivalently,

for (i in 1:nrow(x)) NewList[[i]] <- x[i,] %o% y[i,]

3
votes

It seems like Map is the best approach:

library(rbenchmark)

x = matrix(rnorm(10000), nrow = 5000)
y = matrix(rnorm(10000), nrow = 5000)
myList = list()

loopTest = function(){
    for(i in 1:nrow(x)){
        myList[[i]] = t(x[i, , drop = FALSE]) %*% y[i, ]
    }
}

loopTest2 = function(){
    for(i in 1:nrow(x)){
        myList[[i]] = outer(x[i, ], y[i, ])
    }
}

mapTest = function(){
    Map(function(x,y) matrix(x,ncol=1)%*%y ,
                   split(x, row(x)), split(y, row(y))) 
}

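mapplyTest = function(){
    # Note: with the default SIMPLIFY = TRUE, mapply() returns a 4 x nrow(x)
    # matrix (one column per row of x), not a list.
    # Map() is mapply() with SIMPLIFY = FALSE.
    mapply(function(x,y) matrix(x,ncol=1)%*%y,
           x = split(x, row(x)), y = split(y, row(y))) 
}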

benchmark(loopTest(), loopTest2(), mapTest(), mapplyTest(), replications = 100)

This gives me (elapsed time in seconds, over 100 replications):

test        elapsed
loopTest()   10.471
loopTest2()  12.225
mapplyTest()  3.100
mapTest()     2.252

However, the loop approach does win on smaller datasets, say with only 5 rows.
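
If you want to check the small-data case on your own machine, a rough sketch using base R's system.time (rather than rbenchmark) could look like the following. The function names loopSmall and mapSmall and the repetition count reps are hypothetical, and no timings are shown because they depend on the machine.

```r
set.seed(1)  # assumption: not in the original, added for reproducibility
x <- matrix(rnorm(10), nrow = 5)
y <- matrix(rnorm(10), nrow = 5)

# Loop version, returning the list so both functions produce comparable output
loopSmall <- function() {
    out <- list()
    for (i in seq_len(nrow(x))) {
        out[[i]] <- t(x[i, , drop = FALSE]) %*% y[i, ]
    }
    out
}

# Map version, as in the benchmark above
mapSmall <- function() {
    Map(function(xi, yi) matrix(xi, ncol = 1) %*% yi,
        split(x, row(x)), split(y, row(y)))
}

# Repeat each call many times so the elapsed time is measurable
reps <- 1000
loopTime <- system.time(for (k in seq_len(reps)) loopSmall())[["elapsed"]]
mapTime  <- system.time(for (k in seq_len(reps)) mapSmall())[["elapsed"]]
print(c(loop = loopTime, map = mapTime))
```

For very short run times a dedicated benchmarking package will give more reliable numbers than system.time.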