If I want to instantiate a large boolean sparse matrix to assign values at certain indices later, what's the best way to initialize it?
For example, to initialize a 20000000 x 7000 logical sparse matrix in MATLAB with space allocated for 10000 non-zero elements (without specifying their locations), I would use the following syntax:
Matrix=logical(sparse([],[],[],20000000,7000,10000))
I have no speed constraints on assigning the non-zero values later.
In Python (SciPy), if I initialize it as a CSR matrix, the creation is very fast:
from scipy.sparse import csr_matrix, lil_matrix
Matrix = csr_matrix((20000000, 7000), dtype=bool)
CPU times: user 860 µs, sys: 2.43 ms, total: 3.29 ms
Wall time: 9.72 ms
However, I cannot efficiently assign values to a csr_matrix: the operation is very slow and triggers SciPy's built-in SparseEfficiencyWarning.
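To make the warning concrete, here is a minimal sketch of the slow path. It assumes a much smaller stand-in matrix (the name `small` and its shape are only for illustration) and records the warning instead of printing it:

```python
import warnings

from scipy.sparse import csr_matrix, SparseEfficiencyWarning

# Small stand-in for the real 20000000 x 7000 matrix (illustrative shape only).
small = csr_matrix((1000, 70), dtype=bool)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # Inserting a new non-zero entry changes the CSR sparsity structure,
    # which is the operation SciPy warns about.
    small[3, 5] = True

warning_names = [w.category.__name__ for w in caught]
print(warning_names)
```

The assignment itself succeeds; the warning only signals that each such insertion forces the compressed arrays to be rebuilt.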
If I try initializing it as a LIL matrix:
Matrix=lil_matrix((20000000, 7000), dtype=bool)
CPU times: user 12.4 s, sys: 624 ms, total: 13 s
Wall time: 13 s
or converting the csr_matrix to a lil_matrix:
Matrix=csr_matrix((20000000, 7000), dtype=bool)
Matrix=Matrix.tolil()
CPU times: user 26.8 s, sys: 734 ms, total: 27.5 s
Wall time: 27.5 s
In both cases the initialization takes significantly longer (13 s and 27.5 s, versus about 10 ms for the CSR matrix).
Is there any way to speed up the initialization of the LIL matrix? If not, what sparse matrix format can I use to speed up the assignment of non-zero elements to such matrices?
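One possibility I can think of, sketched here under the assumption that the indices of the non-zero elements can be collected before the matrix is built (which may not fit every workflow), is to gather them in arrays and construct the matrix once in COO format. The index values below are hypothetical placeholders:

```python
import numpy as np
from scipy.sparse import coo_matrix

n_rows, n_cols = 20_000_000, 7_000

# Hypothetical positions of the True entries; in practice these would be
# collected from wherever the non-zero values are produced.
rows = np.array([0, 5, 19_999_999], dtype=np.int64)
cols = np.array([0, 6_999, 42], dtype=np.int64)
vals = np.ones(len(rows), dtype=bool)

# Build once from (value, (row, col)) triplets, then convert to CSR
# for fast row slicing and arithmetic afterwards.
matrix = coo_matrix((vals, (rows, cols)), shape=(n_rows, n_cols), dtype=bool).tocsr()

print(matrix.shape, matrix.nnz)
```

This avoids per-element assignment entirely, so it only helps if the assignments do not have to be interleaved with reads of the matrix.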
initialize? What's your source of non-zero values? – hpaulj