I have three sparse matrices A, B, and C, and I want to compute the element-wise result of: (A*B)/C, i.e. element-wise multiply A with B, then element-wise divide by C.

Naturally, since C is sparse, division by zero leaves most of the matrix elements set to inf/NaN. However, I am not interested in these elements: for my needs A is essentially a mask, and every index that is zero in A should stay zero in the result. In practice, scipy does calculate these elements even though they could be skipped if we decide that 0/0 = 0.

What is the best way to avoid the redundant calculations of elements that are zeros in A?

Example for concreteness:

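import numpy as np
from scipy import sparse

A = sparse.csr_matrix(np.identity(100))
B = sparse.csr_matrix(np.identity(100) * 2)
C = sparse.csr_matrix(np.identity(100) * 5)

# Note: on csr_matrix, * is the matrix product; .multiply() is element-wise
Z = A.multiply(B) / C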

Z[0,0]
>>> 0.4
Z[0,1]
>>> nan

Required result:

Z[0,0]
>>> 0.4
Z[0,1]
>>> 0.0

Note: I am mostly interested in the performance of this operation.

If the sparsity is the same for all matrices, we can just do the math using the data attribute, which will be a 1d array. - hpaulj
That's a good point, but data contains explicit zeros. - Nur L
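
The idea in the comments above can be sketched as follows. This is only a minimal illustration, assuming A, B and C really do share an identical sparsity pattern (same indptr and indices after canonicalization); the same_pattern check is a hypothetical guard added here, not something from the original thread. Explicit zeros stored in C.data would still produce inf/NaN with this approach.

```python
import numpy as np
from scipy import sparse

A = sparse.csr_matrix(np.identity(100))
B = sparse.csr_matrix(np.identity(100) * 2)
C = sparse.csr_matrix(np.identity(100) * 5)

# Put each matrix in canonical CSR form (sorted indices, no duplicates)
# so that the index arrays can be compared directly.
for M in (A, B, C):
    M.sum_duplicates()

# Assumption: all three matrices store exactly the same positions.
same_pattern = all(
    np.array_equal(A.indptr, M.indptr) and np.array_equal(A.indices, M.indices)
    for M in (B, C)
)
assert same_pattern, "sparsity patterns differ; the .data shortcut does not apply"

# Element-wise (A*B)/C computed on the stored values only (1d arrays).
Z = sparse.csr_matrix(
    (A.data * B.data / C.data, A.indices, A.indptr), shape=A.shape
)
```

Only the 100 stored values are touched here, so no work is spent on positions that are zero in every matrix.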

1 Answer

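The best way to do this is to invert C's stored values in place and then multiply, so that only stored entries are ever computed. If C.data has any explicit 0s in it, though, they become inf after the inversion and still come out as inf or NaN in the result. How you choose to handle this probably depends on what exactly you're doing.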

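import numpy as np
from scipy import sparse

A = sparse.csr_matrix(np.identity(100))
B = sparse.csr_matrix(np.identity(100) * 2)
C = sparse.csr_matrix(np.identity(100) * 5)

# Invert only the stored values of C (this modifies C in place)
C.data = 1 / C.data

# Element-wise product; on csr_matrix, * would be the matrix product
Z = A.multiply(B).multiply(C)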

>>> Z
<100x100 sparse matrix of type '<class 'numpy.float64'>'
    with 100 stored elements in Compressed Sparse Row format>

>>> Z[0,0]
0.4
>>> Z[0,1]
0.0
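
If C.data may contain explicit zeros, one possible way to handle them is sketched below. It assumes that a stored zero in C should yield 0 in the result (the 0/0 = 0 convention from the question); that choice is an assumption and may not suit every use case. The small 4x4 matrices and the name C_inv are made up for illustration.

```python
import numpy as np
from scipy import sparse

A = sparse.csr_matrix(np.identity(4))
B = sparse.csr_matrix(np.identity(4) * 2)

# C stores an explicit zero at [0, 0]; building it from (data, indices, indptr)
# keeps that zero as a stored entry.
C = sparse.csr_matrix(
    (np.array([0.0, 5.0, 5.0, 5.0]), np.arange(4), np.arange(5)), shape=(4, 4)
)

# Invert only the nonzero stored values; stored zeros stay 0 (assumed convention).
C_inv = C.copy()  # leave C itself untouched
inv = np.zeros_like(C_inv.data)
np.divide(1.0, C_inv.data, out=inv, where=C_inv.data != 0)
C_inv.data = inv

# Element-wise product over stored entries, then drop any stored zeros.
Z = A.multiply(B).multiply(C_inv).tocsr()
Z.eliminate_zeros()
```

Because the division is masked with where=, no divide-by-zero warning is raised and no inf/NaN values enter the result.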