Large and Sparse Matrix Multiplication

Question

I have a very large, sparse matrix (180GB as text, 30k * 3M) that contains only the entries and no additional data. I have to perform matrix multiplication, inversion, and similar linear algebra operations on it. I tried Octave and simple single-threaded C code for the multiplication, but my system's 40GB of RAM gets used up very quickly and the program starts thrashing. Are there any other options available to me? I am not familiar with MATLAB or any other matrix library that could help me do this.

When I run a simple multiplication of a matrix with 10 rows and 3M columns by its transpose, I get the following error:

    memory exhausted or requested size too large for range of Octave's index type

I am not sure whether the same would work in MATLAB. Is there another library or code for sparse matrix representation and multiplication?

Are you saying that the full matrix data is 180GB, or do you mean that the sparse representation itself is 180GB? What are the matrix dimensions, and how many non-zero elements do you have? — paddy
If I get it right, you are able to load the entire 180GB matrix into an Octave variable, and you only run into memory trouble once you try to work with that huge variable? Can you cast/convert the huge variable into sparse, e.g., m=readFromFile( hugeFileName.txt );m=sparse(m);? — Shai
You have to import your matrix in blocks, cast each imported block to sparse, and store it in a cell array. Once you have imported all blocks, concatenate them all at once. You will notice that the 180GB vanish if your sparsity is 99%. — Oleg
See this discussion about the size limit of a matrix in Octave (sparse matrices included). It basically boils down to the fact that Octave uses a 32-bit integer internally to index the matrix. You can build Octave with 64-bit indexing, but all of Octave's dependencies will also need it. — carandraug
MATLAB allows indices in sparse matrices up to 2^48-1 = 281474976710655 on a 64-bit OS, and 3e4 * 3e6 is smaller than that. — Oleg
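
The index limits mentioned in the comments can be checked with plain arithmetic. The following is a minimal sketch in Python (chosen only for illustration; nothing here calls Octave or MATLAB). The dimensions come from the question, the two limits come from the comments above, and which operand order the asker used for the 10-row product is not stated, so both orders are shown as possibilities.

```python
# Dimensions quoted in the question: 30k rows by 3M columns.
rows, cols = 30_000, 3_000_000
total_elements = rows * cols            # addressable entries, zero or not

int32_max = 2**31 - 1                   # largest signed 32-bit index (limit described for default Octave builds)
matlab_sparse_max = 2**48 - 1           # limit quoted above for MATLAB on a 64-bit OS

fits_int32 = total_elements <= int32_max
fits_matlab = total_elements <= matlab_sparse_max

# The 10 x 3M test case: the size of the product depends on the operand order.
small_rows = 10
gram_elements = small_rows * small_rows  # A * A' is 10 x 10
outer_elements = cols * cols             # A' * A is 3M x 3M
outer_fits_int32 = outer_elements <= int32_max

print(f"30k x 3M elements: {total_elements:,}")
print(f"fits 32-bit index: {fits_int32}, fits 2^48-1: {fits_matlab}")
print(f"A*A' elements: {gram_elements:,}, A'*A elements: {outer_elements:,}")
print(f"A'*A fits 32-bit index: {outer_fits_int32}")
```

If the product was formed as transpose times matrix, the 3M x 3M result alone would exceed a 32-bit index, which would be consistent with the error message in the question; this is a guess, since the question does not show the command.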

useslessone · Accepted Answer · 2013-09-15T01:24:52

If there are few enough nonzero entries, I suggest creating a sparse matrix S with the appropriate dimensions and maximum number of nonzero entries; see "matlab create sparse matrix". Then, as @oleg komarov described, load the matrix in blocks and assign the nonzero entries from each block to the correct addresses in the sparse matrix S. If your matrix is sparse enough, loading it is really the only difficulty you face. I had similar issues with large transfer operators.
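
The block-loading idea can be sketched as follows. This is an illustration in Python using only the standard library, not the answerer's code. It assumes the text file stores one matrix row per line with whitespace-separated values, which the question does not confirm. The names `read_blocks` and `load_sparse` are hypothetical helpers made up for this sketch.

```python
import io
from itertools import islice


def read_blocks(lines, block_rows):
    """Yield lists of at most block_rows text lines, so the whole file never sits in memory."""
    it = iter(lines)
    while True:
        block = list(islice(it, block_rows))
        if not block:
            return
        yield block


def load_sparse(lines, block_rows=1000):
    """Collect only the nonzero entries as (row, column, value) triplets.

    Assumes one matrix row per line, values separated by whitespace.
    Indices are 0-based here; Octave and MATLAB use 1-based indices.
    """
    row_idx, col_idx, values = [], [], []
    n_rows = 0
    n_cols = 0
    for block in read_blocks(lines, block_rows):
        for line in block:
            fields = line.split()
            n_cols = max(n_cols, len(fields))
            for j, field in enumerate(fields):
                v = float(field)
                if v != 0.0:          # zeros are simply never stored
                    row_idx.append(n_rows)
                    col_idx.append(j)
                    values.append(v)
            n_rows += 1
    return row_idx, col_idx, values, (n_rows, n_cols)


# Tiny in-memory stand-in for the 180GB file; a real run would pass an open file object.
demo = io.StringIO("0 0 3 0\n0 0 0 0\n1 0 0 2\n")
i, j, v, shape = load_sparse(demo, block_rows=2)
print(i, j, v, shape)
```

Memory use then scales with the number of nonzeros rather than with the 30k * 3M dense size. In Octave or MATLAB, the same triplets (shifted to 1-based indices) are the kind of input that `sparse(i, j, v, m, n)` accepts.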
