For a current project I have to compute the product of the same matrix (which is quite sparse) with a large number of vectors. The vectors are associated with a two-dimensional grid, so I store them in a three-dimensional array:

E.g.:

X is an array of shape (I, J, N), and the matrix A has shape (N, N). The task is to compute A.dot(X[i, j]) for each i, j.

For numpy arrays, this is quite easily accomplished with

Y = X.dot(A.T) 

Now I'd like to store A as a sparse matrix, since it only contains a very limited number of nonzero entries and the dense product therefore wastes a lot of multiplications on zeros. Unfortunately, the above solution won't work, since numpy's dot doesn't work with sparse matrices, and to the best of my knowledge there is no tensordot-like operation for scipy.sparse.
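For reference, the dense line above is exactly the contraction that np.tensordot (or np.einsum) expresses; the sizes here are purely illustrative:

import numpy as np

I, J, N = 4, 5, 6                      # toy sizes for illustration
A = np.random.rand(N, N)
X = np.random.rand(I, J, N)

# Y[i, j, m] = sum_n A[m, n] * X[i, j, n]
Y = np.tensordot(X, A, axes=(2, 1))
assert np.allclose(Y, X.dot(A.T))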

Does anybody know a nice and efficient way to compute the above array Y with a sparse matrix A?


1 Answer


The obvious approach is to run a loop over your vectors and use the sparse matrix's .dot method:

import numpy as np

def naive_sps_x_dense_vecs(sps_mat, dense_vecs):
    # apply the sparse (rows, cols) matrix to every length-cols vector
    # in the (I, J, cols) array, one grid point at a time
    rows, cols = sps_mat.shape
    I, J, _ = dense_vecs.shape
    out = np.empty((I, J, rows))
    for i in range(I):
        for j in range(J):
            out[i, j] = sps_mat.dot(dense_vecs[i, j])
    return out
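A quick sanity check with toy sizes (the values here are just for illustration):

import numpy as np
import scipy.sparse as sps

A = sps.rand(6, 6, density=0.3, format='csc')
X = np.random.rand(3, 4, 6)
Y = naive_sps_x_dense_vecs(A, X)
# every slice should match a direct sparse matrix-vector product
assert np.allclose(Y[1, 2], A.dot(X[1, 2]))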

But you may be able to speed things up a little by reshaping your 3d array to 2d, avoiding the Python loops:

def sps_x_dense_vecs(sps_mat, dense_vecs):
    rows, cols = sps_mat.shape
    vecs_shape = dense_vecs.shape
    # collapse the grid axes so a single sparse product handles all vectors
    dense_vecs = dense_vecs.reshape(-1, cols)
    out = sps_mat.dot(dense_vecs.T).T
    return out.reshape(vecs_shape[:-1] + (rows,))

The problem is that we need the sparse matrix to be the first argument, so that we can call its .dot method, which means the result comes out transposed. After transposing it back, the last reshape triggers a copy of the whole array. So for fairly large values of I and J, combined with not-so-large values of N, the latter method is several times faster than the former, but for other combinations of the parameters the ranking can even be reversed:

n, i, j = 100, 500, 500
a = sps.rand(n, n, density=1.0/n, format='csc')
vecs = np.random.rand(i, j, n)

>>> np.allclose(naive_sps_x_dense_vecs(a, vecs), sps_x_dense_vecs(a, vecs))
True

%timeit naive_sps_x_dense_vecs(a, vecs)
1 loops, best of 3: 3.85 s per loop
%timeit sps_x_dense_vecs(a, vecs)
1 loops, best of 3: 576 ms per loop

n, i, j = 1000, 200, 200
a = sps.rand(n, n, density=1.0/n, format='csc')
vecs = np.random.rand(i, j, n)
%timeit naive_sps_x_dense_vecs(a, vecs)
1 loops, best of 3: 791 ms per loop
%timeit sps_x_dense_vecs(a, vecs)
1 loops, best of 3: 1.3 s per loop
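The copy comes purely from reordering the axes at the end. If your downstream code can live with the result laid out as (N, I, J) instead of (I, J, N), i.e. indexing Y[:, i, j] rather than Y[i, j] (an assumption about your use case), the transpose and the copy disappear entirely. A minimal sketch under that assumption:

def sps_x_dense_vecs_firstaxis(sps_mat, dense_vecs):
    # same single sparse product, but leaving the matrix axis first:
    # out[:, i, j] == sps_mat @ dense_vecs[i, j]
    rows, cols = sps_mat.shape
    vecs_shape = dense_vecs.shape
    # the (rows, I*J) product is C-contiguous, so this reshape is free
    out = sps_mat.dot(dense_vecs.reshape(-1, cols).T)
    return out.reshape((rows,) + vecs_shape[:-1])

As with the timings above, which variant wins depends on your shapes, so it is worth benchmarking on your own data.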