3
votes

I am new to R. Assume the memory layout is the same for a data frame and a matrix.

Consider the following matrix:

a=matrix(1:10000000,1000000,10)

It has 1M rows and 10 columns. Is the memory physically sequential by row or by column? That is, does physical memory store [1,1], [2,1], [3,1], ..., [1M,1], [1,2], ... or [1,1], [1,2], ..., [1,10], [2,1], ...?

Suppose the matrix with 10M elements takes 100 MB and the L2 cache is 4 MB, so the L2 cache cannot hold all 10M elements. If we process the data sequentially, the L2 cache miss ratio will be lower. In our case, we need to process the data row by row, reading several columns at once (such as columns A, B, and C) and then computing some result. If memory were laid out with the 10 items of the 1st row first, then the 10 items of the 2nd row, and so on, performance might be better.

Is there any way to control the memory layout?

2
You could try comparing the performance of working with a vs. t(a) to see whether rows vs. columns make much of a difference. - Richie Cotton

2 Answers

6
votes

Matrices are stored column-wise:

> m=matrix(1:12,nrow=3)
> m
     [,1] [,2] [,3] [,4]
[1,]    1    4    7   10
[2,]    2    5    8   11
[3,]    3    6    9   12
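
As a quick check of the storage order, here is a minimal sketch that drops the dim attribute and looks at the underlying vector. The variable names `i`, `j`, and `linear` are only for illustration.

```r
m <- matrix(1:12, nrow = 3)

# Dropping the dim attribute exposes the underlying storage order.
as.vector(m)

# With column-major storage, element [i, j] sits at
# linear index i + (j - 1) * nrow(m).
i <- 2
j <- 3
linear <- i + (j - 1) * nrow(m)
m[linear] == m[i, j]
```

The first column comes first in the vector, then the second, and so on, so walking down a column touches adjacent elements.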

Data frames are just pretty lists, and lists are stored as vectors of elements. I'm not even sure that list elements are guaranteed to be contiguous in memory.
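
To see the list structure for yourself, something like the following should work. The data frame `df` and its columns are made up for illustration, and the conversion at the end is only a sketch of one way to get a single column-major block for the numeric columns.

```r
# Hypothetical example data; stringsAsFactors is set explicitly so the
# result does not depend on the R version.
df <- data.frame(A = 1:3, B = c(0.5, 1.5, 2.5), C = c("x", "y", "z"),
                 stringsAsFactors = FALSE)

is.list(df)         # a data frame is a list ...
length(df)          # ... with one element per column
sapply(df, typeof)  # each column is its own vector, possibly of a different type

# Converting the numeric columns to a matrix gives one column-major vector.
num <- as.matrix(df[, c("A", "B")])
as.vector(num)
```

Each column is a separate vector, so elements within a column are adjacent, but nothing here suggests the columns sit next to each other in memory.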

Read the "Writing R Extensions" manual for more info on how memory is handled. As far as I know there's no way to control the memory layout. Don't worry about it until it becomes a problem.

2
votes

A matrix is simply a vector with a dim attribute. The elements of the matrix are stored in the vector in column-major order. There is no way to change this.
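
A small sketch of what that means in practice: setting dim on a plain vector gives a matrix, and byrow = TRUE in matrix() only changes how the input is filled in, not how the result is stored. The names `x` and `b` are just for illustration.

```r
x <- 1:12
dim(x) <- c(3L, 4L)  # adding a dim attribute turns the vector into a matrix
is.matrix(x)
identical(x, matrix(1:12, nrow = 3))

# byrow = TRUE fills the matrix row by row, but the result is still
# stored column by column.
b <- matrix(1:12, nrow = 3, byrow = TRUE)
as.vector(b)
```

The last line returns the elements in column order of `b` (1, 5, 9, 2, 6, 10, ...), not in the original 1:12 order.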

Therefore, if you need to operate row-by-row, it's faster to transpose the matrix before looping over it.

> set.seed(21)
> a = matrix(rnorm(1e6),1e3,1e3)
> ta = t(a)
> system.time(for(i in 1:1000) colSums(ta))
   user  system elapsed 
   1.39    0.00    1.40 
> system.time(for(i in 1:1000) rowSums(a))
   user  system elapsed 
   2.40    0.00    2.39 
> identical(rowSums(a), colSums(ta))
[1] TRUE
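
For the use case in the question (combining a few columns for every row), it may be possible to avoid the row loop entirely by working on whole columns, which are contiguous. This is only a sketch: the column names A, B, C and the formula A + B * C are invented for illustration, and I have not timed it.

```r
set.seed(21)
a <- matrix(rnorm(3e5), nrow = 1e5, ncol = 3,
            dimnames = list(NULL, c("A", "B", "C")))

# Row loop: each a[i, ] gathers elements that are nrow(a) apart in memory.
by_row <- numeric(nrow(a))
for (i in seq_len(nrow(a))) {
  r <- a[i, ]
  by_row[i] <- r[["A"]] + r[["B"]] * r[["C"]]
}

# Vectorized: each a[, "X"] is one contiguous run of the underlying vector.
by_col <- a[, "A"] + a[, "B"] * a[, "C"]

all.equal(by_row, unname(by_col))
```

Both versions compute the same values; the vectorized one reads each column from start to end instead of jumping between columns for every row.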

If you want to dig deeper, the code for colSums, rowSums, colMeans, and rowMeans is in the do_colsum function in src/main/array.c.