In R, I'm trying to work with a large matrix (39,146,166 rows by 127 columns) and I'm having memory issues with a number of operations on it. I've determined that about 35% of the entries in the matrix are non-zero, and the remainder are all zeros. Is this sparse enough that I would save some memory representing this matrix using one of R's sparse matrix classes? What is a good rule of thumb for determining when a matrix is worth representing sparsely?
1 Answer
I don't think the sparse representation will be that much more compact. In a triplet representation, you need three numbers for each entry other than the implicit zeros: a row index, a column index, and the value itself. So even if the two indices are 4-byte integers, each stored entry costs about 16 bytes (4 + 4 + 8 for a double), versus 8 bytes per entry in a dense "serial" storage strategy.
By this reasoning, anything above about 50% density will take more space stored sparsely than densely, and at 35% you'd save only around 30%. I'm posting from an iPhone under SF Bay, so I cannot test with `object.size`.
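If you do have a machine handy, the comparison is easy to run yourself. Here is a rough sketch using the `Matrix` package and `object.size` to measure a dense matrix against its sparse equivalent at a few densities (the exact crossover point depends on which sparse class `Matrix` picks; the compressed-column `dgCMatrix` stores roughly one integer plus one double per nonzero, so it breaks even somewhat above 50%):

```r
library(Matrix)  # provides sparse classes such as dgCMatrix

set.seed(1)
n <- 1000; p <- 100  # small stand-in for the 39M x 127 matrix

for (density in c(0.10, 0.35, 0.50, 0.75)) {
  m <- matrix(0, n, p)
  # fill the chosen fraction of cells with random nonzero values
  idx <- sample(length(m), round(density * length(m)))
  m[idx] <- rnorm(length(idx))

  dense_size  <- object.size(m)
  sparse_size <- object.size(Matrix(m, sparse = TRUE))

  cat(sprintf("density %.2f: dense %s, sparse %s\n",
              density,
              format(dense_size,  units = "MB"),
              format(sparse_size, units = "MB")))
}
```

At 10% density the sparse copy is much smaller; by 75% it is clearly larger than the dense one, which is the rule of thumb in action.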