You can replicate the dense calculation with sparse.hstack:
In [1]: A = np.arange(9).reshape(3,3)
In [2]: from scipy import sparse
In [3]: M = sparse.csr_matrix(A)
In [4]: M
Out[4]:
<3x3 sparse matrix of type '<class 'numpy.int32'>'
        with 8 stored elements in Compressed Sparse Row format>
In [5]: M.A
Out[5]:
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]], dtype=int32)
In [6]: [A[i:1+3+i-2,:] for i in range(2)]
Out[6]:
[array([[0, 1, 2],
        [3, 4, 5]]),
 array([[3, 4, 5],
        [6, 7, 8]])]
In [7]: [M[i:1+3+i-2,:] for i in range(2)]
Out[7]:
[<2x3 sparse matrix of type '<class 'numpy.int32'>'
        with 5 stored elements in Compressed Sparse Row format>,
 <2x3 sparse matrix of type '<class 'numpy.int32'>'
        with 6 stored elements in Compressed Sparse Row format>]
In [8]: sparse.hstack([M[i:1+3+i-2,:] for i in range(2)])
Out[8]:
<2x6 sparse matrix of type '<class 'numpy.int32'>'
with 11 stored elements in COOrdinate format>
In [9]: _.A
Out[9]:
array([[0, 1, 2, 3, 4, 5],
       [3, 4, 5, 6, 7, 8]], dtype=int32)
I won't make any speed promises.

Sparse matrix multiplication is well developed (provided the matrix really is sparse, which this example isn't). In fact operations like row/column sums, and even slicing, are done with matrix multiplication. That is, it constructs a matrix or vector that, when dotted with the original, sums or selects values.
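To illustrate that point, here is a small sketch (using the same M as above) of how a row sum and a row slice can each be written as a dot product with a selector matrix; these are illustrative equivalents, not scipy's literal internals:

```python
import numpy as np
from scipy import sparse

A = np.arange(9).reshape(3, 3)
M = sparse.csr_matrix(A)

# Row sum as matrix multiplication: dot with a column vector of ones
ones = sparse.csr_matrix(np.ones((3, 1), dtype=M.dtype))
row_sums = M @ ones              # same values as M.sum(axis=1)

# Row slicing as matrix multiplication: S picks out rows 0 and 1
S = sparse.csr_matrix(np.array([[1, 0, 0],
                                [0, 1, 0]]))
sliced = S @ M                   # same as M[0:2, :]
print(sliced.toarray())
# [[0 1 2]
#  [3 4 5]]
```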
sparse.hstack uses a sparse block matrix operation. This converts the inputs to coo format, combines the respective row, col, and data arrays with the appropriate offsets, and makes a new coo matrix.
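A rough sketch of that coo combination for the two slices above (illustrative only, not scipy's actual implementation):

```python
import numpy as np
from scipy import sparse

A = np.arange(9).reshape(3, 3)
M = sparse.csr_matrix(A)
blocks = [M[i:i+2, :].tocoo() for i in range(2)]

# Concatenate the coo attributes, shifting each block's columns
# by the total width of the blocks to its left.
row, col, data, offset = [], [], [], 0
for b in blocks:
    row.append(b.row)
    col.append(b.col + offset)   # column offset for hstack
    data.append(b.data)
    offset += b.shape[1]

stacked = sparse.coo_matrix(
    (np.concatenate(data), (np.concatenate(row), np.concatenate(col))),
    shape=(2, offset))
print(stacked.toarray())
# [[0 1 2 3 4 5]
#  [3 4 5 6 7 8]]
```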
I can imagine combining these steps, but it would take a fair amount of work.
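For this small example, one combined version might work directly from M's coo attributes, selecting and shifting each window's entries in one pass and skipping the intermediate slice matrices entirely (a sketch only; no claims about its speed):

```python
import numpy as np
from scipy import sparse

A = np.arange(9).reshape(3, 3)
Mc = sparse.csr_matrix(A).tocoo()

rows, cols, data = [], [], []
for j in range(2):                            # j-th shifted window M[j:j+2, :]
    keep = (Mc.row >= j) & (Mc.row < j + 2)   # entries falling in this window
    rows.append(Mc.row[keep] - j)             # shift to window-local row
    cols.append(Mc.col[keep] + 3 * j)         # hstack column offset
    data.append(Mc.data[keep])

result = sparse.coo_matrix(
    (np.concatenate(data), (np.concatenate(rows), np.concatenate(cols))),
    shape=(2, 6))
print(result.toarray())
# [[0 1 2 3 4 5]
#  [3 4 5 6 7 8]]
```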