7
votes

I have two sparse matrices E and D, which have non-zero entries at the same places. Now I want to have E/D as a sparse matrix, defined only where D is non-zero.

For example take the following code:

import numpy as np
import scipy.sparse  # a bare "import scipy" does not reliably expose scipy.sparse

E_full = np.matrix([[1.4536000e-02, 0.0000000e+00, 0.0000000e+00, 1.7914321e+00, 2.6854320e-01, 4.1742600e-01, 0.0000000e+00],
                    [9.8659000e-03, 0.0000000e+00, 0.0000000e+00, 1.9106752e+00, 5.7283640e-01, 1.4840370e-01, 0.0000000e+00],
                    [1.3920000e-04, 0.0000000e+00, 0.0000000e+00, 9.4346500e-02, 2.8285900e-02, 4.3967800e-02, 0.0000000e+00],
                    [0.0000000e+00, 4.5182676e+00, 0.0000000e+00, 0.0000000e+00, 7.3000000e-06, 1.5100000e-05, 4.0746900e-02],
                    [0.0000000e+00, 0.0000000e+00, 3.4002088e+00, 4.6826200e-02, 0.0000000e+00, 2.4246900e-02, 3.4529236e+00]])
D_full = np.matrix([[0.36666667, 0.        , 0.        , 0.33333333, 0.2       , 0.1       , 0.        ],
                    [0.23333333, 0.        , 0.        , 0.33333333, 0.4       , 0.03333333, 0.        ],
                    [0.06666667, 0.        , 0.        , 0.33333333, 0.4       , 0.2       , 0.        ],
                    [0.        , 0.63636364, 0.        , 0.        , 0.04545455, 0.03030303, 0.28787879],
                    [0.        , 0.        , 0.33333333, 0.33333333, 0.        , 0.22222222, 0.11111111]])
E = scipy.sparse.dok_matrix(E_full)
D = scipy.sparse.dok_matrix(D_full)

The division E/D then yields a full (dense) matrix, with nan wherever D is zero:

matrix([[3.96436360e-02,            nan,            nan, 5.37429635e+00, 1.34271600e+00, 4.17426000e+00,            nan],
        [4.22824292e-02,            nan,            nan, 5.73202566e+00, 1.43209100e+00, 4.45211145e+00,            nan],
        [2.08799990e-03,            nan,            nan, 2.83039503e-01, 7.07147500e-02, 2.19839000e-01,            nan],
        [           nan, 7.10013476e+00,            nan,            nan, 1.60599984e-04, 4.98300005e-04, 1.41541862e-01],
        [           nan,            nan, 1.02006265e+01, 1.40478601e-01,            nan, 1.09111051e-01, 3.10763127e+01]])

I also tried a different package.

import sparse
sparse.COO(E) / sparse.COO(D)

This raises an error:

ValueError: Performing this operation would produce a dense result: <ufunc 'true_divide'>

So it tries to create a dense matrix as well.

I understand this is due to the fact that 0/0 = nan. But I am not interested in these values anyway. So how can I avoid computing them?

3
csr matrices are the ones used for math, not coo or dok. Because of all the 0's, sparse element-wise division is not actually defined: the result will be non-sparse, filled with these nan or inf values. (hpaulj)

3 Answers

2
votes

Update (inspired by sacul): create an empty dok_matrix and fill in only the positions where D is nonzero, using the indices returned by nonzero. (This should work for sparse formats other than dok_matrix as well.)

F = scipy.sparse.dok_matrix(E.shape)
F[D.nonzero()] = E[D.nonzero()] / D[D.nonzero()]
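The same idea can also be written so that the result is assembled directly from D's coordinates, without assigning into an existing sparse matrix. This is only a sketch: the helper name divide_on_pattern and the small test matrices are made up for illustration, and it assumes both inputs can be converted to CSR.

```python
import numpy as np
import scipy.sparse as sps

def divide_on_pattern(E, D):
    """Return E / D as CSR, evaluated only where D is nonzero (hypothetical helper)."""
    E_csr = sps.csr_matrix(E)
    D_csr = sps.csr_matrix(D)
    rows, cols = D_csr.nonzero()  # coordinates of D's nonzero entries
    # Fancy indexing yields a 1 x n result; flatten it to a plain 1-D array.
    num = np.asarray(E_csr[rows, cols]).ravel()
    den = np.asarray(D_csr[rows, cols]).ravel()
    # Build the quotient from (value, (row, col)) triplets; nothing dense is created.
    return sps.csr_matrix((num / den, (rows, cols)), shape=D_csr.shape)

# Illustrative data, not taken from the question.
E_small = np.array([[1.0, 0.0, 3.0], [0.0, 4.0, 0.0]])
D_small = np.array([[2.0, 0.0, 4.0], [0.0, 8.0, 0.0]])
F_small = divide_on_pattern(E_small, D_small)
```

Only the entries at D's nonzero positions are ever divided, so no 0/0 is computed.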

Alternatively, you can combine the update and nonzero methods of dok_matrix.

nonzero_idx = [tuple(l) for l in np.transpose(D.nonzero())]
D.update({k: E[k]/D[k] for k in nonzero_idx})

First, we use nonzero to find the indices of the nonzero entries of D. Then we pass the update method a dictionary

{k: E[k]/D[k] for k in nonzero_idx}

such that the values in D will be updated according to this dictionary.

Explanation:

D.update({k: E[k]/D[k] for k in nonzero_idx}) has the same effect as

for k in nonzero_idx:
    D[k] = E[k]/D[k]

Note that this changes D in place. If you want to create a new sparse matrix rather than modifying D in place, copy D to another matrix, say ret.

nz = [tuple(l) for l in np.transpose(D.nonzero())]
ret = D.copy()
ret.update({k: E[k]/D[k] for k in nz})
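If update is not usable on dok_matrix in your SciPy version (I have not checked every release, so treat that as an assumption), the loop from the explanation can be written out with plain item assignment. A minimal sketch; the name divide_dok and the example values are hypothetical:

```python
import numpy as np
import scipy.sparse as sps

def divide_dok(E, D):
    """Element-wise E / D on D's nonzero entries, returned as a new dok_matrix."""
    E = sps.dok_matrix(E)
    D = sps.dok_matrix(D)
    out = sps.dok_matrix(D.shape, dtype=float)
    rows, cols = D.nonzero()
    # Visit only the stored nonzero positions of D; everything else stays empty.
    for i, j in zip(rows.tolist(), cols.tolist()):
        out[i, j] = E[i, j] / D[i, j]
    return out

# Illustrative data, not taken from the question.
E_demo = np.array([[1.0, 0.0, 3.0], [0.0, 4.0, 0.0]])
D_demo = np.array([[2.0, 0.0, 4.0], [0.0, 8.0, 0.0]])
ret_demo = divide_dok(E_demo, D_demo)
```

This leaves E and D untouched, at the cost of a Python-level loop over the nonzero entries.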
2
votes

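A simple implementation using multiply. It takes the reciprocal of the stored entries of b, so b must be in a format that has a .data array (such as csr, csc or coo, but not dok), and the result is nonzero only where both a and b are nonzero: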

def sparse_divide_nonzero(a, b):
    inv_b = b.copy()
    inv_b.data = 1 / inv_b.data
    return a.multiply(inv_b)

To be used as:

import scipy as sp
import scipy.sparse

N, M = 4, 4
M1 = sp.sparse.random(N, M, 0.5, 'csr')
M2 = sp.sparse.random(N, M, 0.5, 'csr')

M3 = sparse_divide_nonzero(M1, M2)

print(M1, '\n')
#   (0, 1)  0.9360024198546736
#   (1, 1)  0.625073080022902
#   (1, 2)  0.4086612951451881
#   (2, 0)  0.06864456080221182
#   (2, 1)  0.9871542989102963
#   (2, 3)  0.4371900022237898
#   (3, 0)  0.12121502419640318
#   (3, 3)  0.22950388104392383 

print(M2, '\n')
#   (1, 0)  0.9753308317090571
#   (1, 2)  0.29870724277296024
#   (1, 3)  0.21116220574550637
#   (2, 1)  0.5039729514070662
#   (2, 2)  0.4463809800134303
#   (3, 0)  0.36751994181969416
#   (3, 1)  0.6189763803260612
#   (3, 2)  0.3870101687623324 

print(M3, '\n')
#   (1, 2)  1.368099719817645
#   (2, 1)  1.9587446035629748
#   (3, 0)  0.3298189034212229 
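For the matrices in the question, which are dok_matrix objects, the function needs a conversion first. The variant below is a sketch under my own assumptions (the name sparse_divide_checked is made up): it converts both inputs to CSR and drops explicitly stored zeros in b, so their reciprocal never becomes inf.

```python
import numpy as np
import scipy.sparse as sps

def sparse_divide_checked(a, b):
    """a / b where both are nonzero; accepts any sparse format or a dense array."""
    a = sps.csr_matrix(a, dtype=float)
    inv_b = sps.csr_matrix(b, dtype=float, copy=True)
    inv_b.eliminate_zeros()        # remove stored zeros before inverting
    inv_b.data = 1.0 / inv_b.data  # reciprocal of the stored entries only
    return a.multiply(inv_b)       # element-wise product stays sparse

# Illustrative data, not taken from the question.
a_demo = sps.dok_matrix(np.array([[1.0, 0.0, 3.0], [0.0, 0.0, 2.0]]))
b_demo = sps.dok_matrix(np.array([[2.0, 5.0, 0.0], [0.0, 0.0, 8.0]]))
q_demo = sparse_divide_checked(a_demo, b_demo)
```

Positions where only one of the two matrices has an entry come out as zero, which matches the M3 output above.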
0
votes

I don't think the nan values are a problem; they are the expected result.

If you want to replace nan with 0, you can use np.nan_to_num (doc: https://docs.scipy.org/doc/numpy/reference/generated/numpy.nan_to_num.html)
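A rough sketch of that route, using small dense arrays that I made up for illustration. Note that it materializes the full dense quotient before converting back, so it only suits matrices that fit in memory, and that nan_to_num also turns inf into a very large finite number.

```python
import numpy as np
import scipy.sparse as sps

# Illustrative dense inputs, not taken from the question.
E_dense = np.array([[1.0, 0.0], [0.0, 4.0]])
D_dense = np.array([[2.0, 0.0], [0.0, 8.0]])

# Silence the 0/0 warnings; those positions become nan.
with np.errstate(divide="ignore", invalid="ignore"):
    Q_dense = E_dense / D_dense

# Replace nan by 0, then go back to a sparse matrix (zeros are not stored).
Q_sparse = sps.csr_matrix(np.nan_to_num(Q_dense))
```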