0
votes

I need to repeatedly add a vector to a matrix. Both have different lengths every time I do this. The complete matrix is then used for further analysis (plotting, t-test). Three months ago this code worked:

    mlen <- max(length(matrix), length(vector))
    length(matrix) <- length(vector) <- mlen
    matrix <- cbind(matrix, vector)

I don't use any specific packages for that. The data input is unchanged: a csv file. Now I run into one of the following issues:

a) Padding to equal length doesn't work properly anymore. I.e. if the new vector has 970 'rows' but the longest column in the existing matrix has only 270 rows, then the remaining 500 rows of the added vector just get cut off. The warning message is "In function (..., deparse.level = 1) : number of rows of result is not a multiple of vector length (arg 2)". This doesn't always happen.

b) The values of the added vector get placed in empty cells at the bottom of an existing column in the matrix.

Both seriously screw up my further analysis. I have tried to use do.call(cbind...) as suggested here, merge, or append. Nothing produces the output I need, which is a matrix with 1 column per vector without any data loss or mixing.

Thanks!

Update: The code lines above are part of code doing the following: data import (the files vary in size) - data cleaning (the data varies even more in size afterwards) - storing the data in a matrix or data frame - calculating the mean per column, plotting / t-testing the data.

Throwing everything in a list and then creating a matrix is not useful for me unless the original data structure can be preserved.

2
length(matrix) is not what you believe it is. You want nrow(matrix). – Roland
I think you want Tyler's answer, here, which was "stolen" from an R help page. Not sure how to mark a question as duplicate. – shayaa
What probably happened in the past is that the new data was of a length that allowed for clean recycling (it was a multiple of the old data). Depending on your use, I'd suggest keeping the data in a list, as this is the most natural storage structure for your data. – lmo
@shayaa @Imo Storing it in a list is not useful for me as I need the mean of each column, which I then need for a t-test. So I might need a completely different solution. – Simone
@Roland thanks for the suggestion. nrow doesn't work properly for the first line of the code and not at all for the second: "Error in nrow(tempResult) <- mlen : could not find function "nrow<-"". I did a simple trial, i.e. nrow(vector), and received NULL as output. It works for the matrix, though. – Simone

2 Answers

1
votes

I implemented Tyler's solution here. For completeness, here is the code again:

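    # Column-bind inputs of unequal length, padding the shorter ones with NA
    cbind.fill <- function(...) {
      nm <- list(...)
      nm <- lapply(nm, as.matrix)   # plain vectors become one-column matrices
      n <- max(sapply(nm, nrow))    # row count of the longest input
      do.call(cbind, lapply(nm, function(x)
        rbind(x, matrix(NA, n - nrow(x), ncol(x)))))  # append NA rows up to n
    }
    matrix <- cbind.fill(matrix, vector)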

Using nrow resulted in the new data being written in NA cells of previous columns instead of a new column. For all those interested in the difference between nrow and length
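A minimal sketch of that difference, using small made-up objects (`m`, `v` and `m2` are illustrative names, not the data from the question). It shows that length() counts all cells of a matrix, that nrow() returns NULL for a plain vector, and that assigning to length() turns a matrix back into a plain vector:

```r
m <- matrix(1:6, nrow = 3)   # hypothetical 3 x 2 matrix
v <- c(10, 20)               # hypothetical plain vector

length(m)   # total number of cells (rows * columns), not the number of rows
nrow(m)     # number of rows
nrow(v)     # NULL: a plain vector has no dim attribute
NROW(v)     # treats the vector as a one-column matrix

# Assigning a new length pads with NA but also drops the dim attribute,
# so the result is a plain vector and the column structure is lost.
m2 <- m
length(m2) <- 8
is.null(dim(m2))
```

This is probably why the approach in the question breaks down once the matrix has more than one column: the padded object no longer knows where one column ends and the next begins.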

0
votes

A potentially easier solution could be the following:

  1. Store all your vectors in a list instead of appending them one by one
  2. Make them the same length filling the missing items with NA
  3. cbind everything into a matrix

A mock up example:

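    library(dplyr)  # only needed for the %>% pipe

    ll <- list(c(1,2,3,4,5), c(2,3), c(5,6,7,8,12,13,14,15))
    ll

    # Indexing past the end of a vector returns NA, which pads each vector to length n
    n <- max(sapply(ll, length))
    lapply(ll, function(x) x[seq_len(n)]) %>% do.call(cbind, .)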

The output is:

         [,1] [,2] [,3]
    [1,]    1    2    5
    [2,]    2    3    6
    [3,]    3   NA    7
    [4,]    4   NA    8
    [5,]    5   NA   12
    [6,]   NA   NA   13
    [7,]   NA   NA   14
    [8,]   NA   NA   15
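Since the question asks for a mean per column and a t-test, here is a rough sketch of how the NA-padded matrix could be used for that. It rebuilds the same mock-up data in base R; the names `m`, `col_means` and `res` are assumptions for illustration, and comparing columns 1 and 3 is an arbitrary choice:

```r
# Same mock-up data as above, padded with NA and bound into a matrix (base R only)
ll <- list(c(1, 2, 3, 4, 5), c(2, 3), c(5, 6, 7, 8, 12, 13, 14, 15))
n <- max(lengths(ll))
m <- do.call(cbind, lapply(ll, function(x) x[seq_len(n)]))

# Column means that ignore the NA padding
col_means <- colMeans(m, na.rm = TRUE)

# t.test() drops NA values from each sample before computing the statistic,
# so padded columns can be passed in directly
res <- t.test(m[, 1], m[, 3])
res$p.value
```

The padding does not distort the means as long as na.rm = TRUE is set; without it, any column containing an NA would return NA.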