66
votes

I have a list of length 130,000 where each element is a character vector of length 110. I would like to convert this list to a matrix with dimension 1,430,000*10. How can I do it more efficiently?\ My code is :

output=NULL
for(i in 1:length(z)) {
 output=rbind(output,
              matrix(z[[i]],ncol=10,byrow=TRUE))
}
5
If you want the dimensions to be 1430000*11 why do you set ncol to be 10? - Dason
Wait- when you say that each entry has 11 characters, you mean that it is a vector with 11 items? I originally thought that each was a string with 11 characters in it. Can you show z[1:2] as an example? - David Robinson
Thank Dason and David! That's a typo. I have corrected it. - user1787675
@user1787675: I still don't understand. What is an "entry"? Is it a vector? Can you show z[1:2]? - David Robinson
Hi David, I looked up an dictionary and found that I mean the components in the list. I am sorry for the confusion I caused. I am not good at English :) - user1787675

5 Answers

140
votes

This should be equivalent to your current code, only a lot faster:

output <- matrix(unlist(z), ncol = 10, byrow = TRUE)
16
votes

I think you want

output <- do.call(rbind,lapply(z,matrix,ncol=10,byrow=TRUE))

i.e. combining @BlueMagister's use of do.call(rbind,...) with an lapply statement to convert the individual list elements into 11*10 matrices ...

Benchmarks (showing @flodel's unlist solution is 5x faster than mine, and 230x faster than the original approach ...)

n <- 1000
z <- replicate(n,matrix(1:110,ncol=10,byrow=TRUE),simplify=FALSE)
library(rbenchmark)
origfn <- function(z) {
    output <- NULL 
    for(i in 1:length(z))
        output<- rbind(output,matrix(z[[i]],ncol=10,byrow=TRUE))
}
rbindfn <- function(z) do.call(rbind,lapply(z,matrix,ncol=10,byrow=TRUE))
unlistfn <- function(z) matrix(unlist(z), ncol = 10, byrow = TRUE)

##          test replications elapsed relative user.self sys.self 
## 1   origfn(z)          100  36.467  230.804    34.834    1.540  
## 2  rbindfn(z)          100   0.713    4.513     0.708    0.012 
## 3 unlistfn(z)          100   0.158    1.000     0.144    0.008 

If this scales appropriately (i.e. you don't run into memory problems), the full problem would take about 130*0.2 seconds = 26 seconds on a comparable machine (I did this on a 2-year-old MacBook Pro).

7
votes

It would help to have sample information about your output. Recursively using rbind on bigger and bigger things is not recommended. My first guess at something that would help you:

z <- list(1:3,4:6,7:9)
do.call(rbind,z)

See a related question for more efficiency, if needed.

3
votes

You can also use,

output <- as.matrix(as.data.frame(z))

The memory usage is very similar to

output <- matrix(unlist(z), ncol = 10, byrow = TRUE)

Which can be verified, with mem_changed() from library(pryr).

-5
votes

you can use as.matrix as below:

output <- as.matrix(z)