2
votes

Say that I have a sparse matrix in scipy.sparse format. How can I extract a diagonal other than than the main diagonal? For a numpy array, you can use numpy.diag. Is there a scipy sparse equivalent?

For example:

from scipy import sparse
A = sparse.diags(ones(5),1)

How would I get back the vector of ones without converting to a numpy array?

1

1 Answers

2
votes

When the sparse array is in dia format, the data along the diagonals is recorded in the offsets and data attributes:

import scipy.sparse as sparse
import numpy as np

def make_sparse_array():
    A = np.arange(ncol*nrow).reshape(nrow, ncol)
    row, col = zip(*np.ndindex(nrow, ncol))
    val = A.ravel()

    A = sparse.coo_matrix(
        (val, (row, col)), shape=(nrow, ncol), dtype='float')
    A = A.todia()
    # A = sparse.diags(np.ones(5), 1)
    # A = sparse.diags([np.ones(4),np.ones(3)*2,], [2,3])
    print(A.toarray())
    return A

nrow, ncol = 10, 5
A = make_sparse_array()
diags = {offset:(diag[offset:nrow+offset] if 0<=offset<=ncol else
                 diag if offset+nrow-ncol>=0 else
                 diag[:offset+nrow-ncol])
         for offset, diag in zip(A.offsets, A.data)}


for offset, diag in sorted(diags.iteritems()):
    print('{o}: {d}'.format(o=offset, d=diag))

Thus for the array

[[  0.   1.   2.   3.   4.]
 [  5.   6.   7.   8.   9.]
 [ 10.  11.  12.  13.  14.]
 [ 15.  16.  17.  18.  19.]
 [ 20.  21.  22.  23.  24.]
 [ 25.  26.  27.  28.  29.]
 [ 30.  31.  32.  33.  34.]
 [ 35.  36.  37.  38.  39.]
 [ 40.  41.  42.  43.  44.]
 [ 45.  46.  47.  48.  49.]]

the code above yields

-9: [ 45.]
-8: [ 40.  46.]
-7: [ 35.  41.  47.]
-6: [ 30.  36.  42.  48.]
-5: [ 25.  31.  37.  43.  49.]
-4: [ 20.  26.  32.  38.  44.]
-3: [ 15.  21.  27.  33.  39.]
-2: [ 10.  16.  22.  28.  34.]
-1: [  5.  11.  17.  23.  29.]
0: [  0.   6.  12.  18.  24.]
1: [  1.   7.  13.  19.]
2: [  2.   8.  14.]
3: [ 3.  9.]
4: [ 4.]

The output above is printing the offset followed by the diagonal at that offset.

The code above should work for any sparse array. I used a fully populated sparse array only to make it easier to check that the output is correct.