5
votes

Say I have the following matrix mat, which is a binary indicator matrix:

mat <- matrix(c(1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1), byrow = TRUE, nrow = 3)

> mat
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    1    1    0    0    0    0
[2,]    0    0    1    1    0    0
[3,]    0    0    0    0    1    1

This matrix has only 3 rows. I need to create one with 10000 rows, with the same pattern of pairs of 1s on the diagonals. E.g. for 5 rows, I expect a 5 x 10 matrix:

     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,]    1    1    0    0    0    0    0    0    0     0
[2,]    0    0    1    1    0    0    0    0    0     0
[3,]    0    0    0    0    1    1    0    0    0     0
[4,]    0    0    0    0    0    0    1    1    0     0
[5,]    0    0    0    0    0    0    0    0    1     1

Does anyone know a simple way to do that? Thanks a lot

5
i.e. take the identity matrix and duplicate each column. - smci
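The comment's idea can be sketched in base R roughly as follows. This is only one possible reading of "duplicate each column" (indexing the identity matrix with repeated column numbers), and the variable names are illustrative. Note that `diag(n)` is dense, so this is practical for small `n` rather than for 10000 rows.

```r
n <- 5

# Column indices 1, 1, 2, 2, ..., n, n: each identity column is taken twice
col_idx <- rep(seq_len(n), each = 2)

# Index the n x n identity matrix by the repeated columns, giving n x 2n
mat <- diag(n)[, col_idx]

dim(mat)
```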

5 Answers

4
votes

This is a sparse matrix, and, as such, you'll do much better referencing the non-zero entries: this will save you RAM and make it easier to automatically generate the matrix.

Each entry is indexed as (i,j,x), referring to the row, column, and value. Suppose you have N rows to fill (say N = 10). You then produce 2 entries per row (the row index is i in the code below), and each column is used only once, so there are 2*N unique column values. Every non-zero entry is 1.

The code for producing this is:

N = 10
i = rep(1:N, each = 2)
j = 1:(2*N)
v = 1

library(Matrix)
mat = sparseMatrix(i = i, j = j, x = v)

The resulting matrix is:

> mat
10 x 20 sparse Matrix of class "dgCMatrix"

 [1,] 1 1 . . . . . . . . . . . . . . . . . .
 [2,] . . 1 1 . . . . . . . . . . . . . . . .
 [3,] . . . . 1 1 . . . . . . . . . . . . . .
 [4,] . . . . . . 1 1 . . . . . . . . . . . .
 [5,] . . . . . . . . 1 1 . . . . . . . . . .
 [6,] . . . . . . . . . . 1 1 . . . . . . . .
 [7,] . . . . . . . . . . . . 1 1 . . . . . .
 [8,] . . . . . . . . . . . . . . 1 1 . . . .
 [9,] . . . . . . . . . . . . . . . . 1 1 . .
[10,] . . . . . . . . . . . . . . . . . . 1 1

Just use the code above and set N = 10000, and you'll have your matrix.

As an added bonus: your desired matrix (N = 1E4) consumes only 321424 bytes. In contrast, a standard dense matrix of size 10K x 20K will take 1.6GB, using numeric (i.e. 8 byte) entries. As they said in "Contact": that seems like an awful waste of space, right?
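If you want to check the memory comparison yourself, something like the following should work. It is a rough sketch that assumes the Matrix package is installed; the exact byte count reported for the sparse object may differ between R and Matrix versions, and the dense figure is only computed arithmetically (8 bytes per cell) rather than by allocating the matrix.

```r
library(Matrix)  # assumed to be installed

N <- 1e4
mat <- sparseMatrix(i = rep(1:N, each = 2), j = 1:(2 * N), x = 1)

# Actual footprint of the sparse object
print(object.size(mat), units = "Kb")

# Estimated footprint of the dense equivalent: 8 bytes per numeric cell
dense_bytes <- 8 * N * (2 * N)
dense_bytes / 1e9  # size in GB
```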

1
votes

When you do not provide enough elements to fill the matrix, they are recycled: if you provide two ones and 2n zeroes (the first row and the first two elements of the second row), you will get the desired matrix.

n <- 5
matrix(
  c(1, 1, rep(0, 2*n)),
  byrow = TRUE, nrow = n, ncol = 2*n
)
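To see why the recycling lands the ones in the right place, here is a small illustration (my own sketch, with made-up variable names). The pattern has length 2n + 2, two more than a row, so each repetition starts two columns further right. The code recycles the pattern by hand and converts the flat row-wise positions of the ones into row and column numbers.

```r
n <- 3
v <- c(1, 1, rep(0, 2 * n))               # pattern of length 2n + 2

# Recycle the pattern to the full number of cells (n rows x 2n columns)
filled <- rep(v, length.out = 2 * n * n)

# Flat positions of the ones, counted row by row
ones <- which(filled == 1)

# Convert flat row-wise positions into row and column numbers
rows <- (ones - 1) %/% (2 * n) + 1
cols <- (ones - 1) %% (2 * n) + 1
cbind(rows, cols)
```

Each row gets exactly two ones, and the column pair moves right by two from one row to the next.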
1
votes

Unless you intend on filling many other values in the matrix, you probably want Iterator's sparse matrix solution. That said, here's a cute way of generating a non-sparse version of the matrix:

double_diag <- function(n)
{
  matrix(rep(diag(n), each = 2), byrow = TRUE, nrow = n)
}
double_diag(5)
1
votes

@VincentZooneKynd has a nice solution, but it gives a warning. Here's a variant that avoids the warning:

n <- 5
matrix(rep(c(1, 1, rep(0, 2*n)), length.out = 2*n*n), nrow = n, byrow = TRUE)
0
votes

A tricky way:

> n <- 5
> t(model.matrix(~0+gl(n,2)))[,]
          1 2 3 4 5 6 7 8 9 10
gl(n, 2)1 1 1 0 0 0 0 0 0 0  0
gl(n, 2)2 0 0 1 1 0 0 0 0 0  0
gl(n, 2)3 0 0 0 0 1 1 0 0 0  0
gl(n, 2)4 0 0 0 0 0 0 1 1 0  0
gl(n, 2)5 0 0 0 0 0 0 0 0 1  1
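In case the one-liner is hard to read, here is the same idea unpacked step by step, as I understand it. The intermediate names (`f`, `ind`, `m`) are just for illustration, and `unname()` is an extra step added here to drop the row and column labels.

```r
n <- 3

# Factor with levels 1..n, each level repeated twice: 1 1 2 2 3 3
f <- gl(n, 2)

# ~ 0 + f drops the intercept, so there is one indicator column per level (2n x n)
ind <- model.matrix(~ 0 + f)

# Transpose to n x 2n; [, ] drops the extra model.matrix attributes
m <- unname(t(ind)[, ])
m
```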