4
votes

I have a large matrix, currently in NumPy, that I would like to port over to a SciPy sparse matrix, because saving the text representation of the (2000, 2000) NumPy matrix takes over 100 MB.

(1) There seems to be a surfeit of sparse matrix types available in SciPy (for instance, lil_matrix or dok_matrix). Which one would be optimal for simple incrementing, and efficient to save to a database?

(2) I'd like to be able to address ranges in the matrix like so:

>>> import numpy as np
>>> a = np.zeros((1000, 1000))
>>> a[3:5, 4:7] += 1

This does not seem to be possible with the sparse matrices. Is that correct?

1 Answer

6
votes

I can't say which is most efficient to store. It's going to depend on your data.

I can say, however, that the += operator works; you just can't rely on the usual array broadcasting rules, so the right-hand side has to have the same shape as the slice:

>>> import numpy as np
>>> from scipy import sparse
>>> m = sparse.lil_matrix((100, 100))
>>> m[50:56, 50:56] += np.ones((6, 6))  # block of ones shaped like the slice
>>> m[50, 50]  # 1.0
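
Applied to the range from the question, the same idea might look like the sketch below. The conversion to (row, col, value) triplets at the end is only an assumption about what "saving to a database" could mean here. It is one possible layout, not a recommendation, and the `triplets` name is purely illustrative.

```python
import numpy as np
from scipy import sparse

# Sparse counterpart of the question's a = np.zeros((1000, 1000)).
m = sparse.lil_matrix((1000, 1000))

# Equivalent of a[3:5, 4:7] += 1, with the scalar written out as a block
# of ones that has the same shape as the slice (2 rows, 3 columns).
m[3:5, 4:7] += np.ones((2, 3))
m[3:5, 4:7] += np.ones((2, 3))  # increment the same range a second time

# COO format exposes the nonzero entries as parallel row/col/data arrays,
# which map directly onto (row, col, value) records.
coo = m.tocoo()
triplets = [(int(r), int(c), float(v))
            for r, c, v in zip(coo.row, coo.col, coo.data)]
```

Only the six touched cells end up in `triplets`, so the stored size grows with the number of nonzero entries rather than with the full matrix shape.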