2 votes

I'm currently trying to solve some equations using the implicit Euler method. Since I was bored of Fortran, I thought it might be a good idea to try it with Python and see how close I can come (from a performance point of view) to the existing Fortran program. For my problem, I want to take advantage of sparse matrices. I found that the current bottleneck of my program is initializing the sparse matrix and subtracting something from the diagonal.

The following minimal example demonstrates this:

import numpy                as np
from scipy.sparse           import csc_matrix
from scipy.sparse.linalg    import spsolve
from timeit                 import default_timer

# Example data for Sparse Matrix in CSC format
data = np.array([ -6.07315337e+07,  -1.08191534e+06,  -5.85677031e+07, \
               5.96496184e+07,   1.99723260e+07,  -3.99136095e+07, \
              -3.10384281e+04,   1.99412852e+07,   3.10384281e+04, \
               4.14012789e+04,  -4.13845644e+04,  -4.14179805e+04, \
               4.13845708e+04,   1.67016486e+01,   6.40664368e+03, \
              -1.21556953e+02,   6.28508672e+03,  -6.40664368e+03, \
               1.21556953e+02,   1.87698938e-03,   1.87698938e-03, \
              -1.87698938e-03,   6.17782975e-05,   6.17782975e-05, \
              -6.17782975e-05,   3.23024684e+00,   3.23024684e+00, \
              -3.23024684e+00,   1.59838512e+00,   1.59838512e+00, \
              -1.59838512e+00,   1.96353333e-02,   1.96353333e-02, \
              -1.96353333e-02,   4.25269958e+01,   4.25269958e+01, \
              -4.25269958e+01,   4.84489810e-06,   4.84489810e-06, \
              -4.84489810e-06,   2.54951658e-07,   2.54951658e-07, \
              -2.54951658e-07,   6.42250438e-08,   6.42250438e-08, \
              -6.42250438e-08])

indices = np.array([ 0,  1,  2,  3,  0,  1,  2,  3,  4,  0,\
                     1,  2,  4,  5,  0,  1,  2,  3,  5,  0,\
                     3,  4,  0,  4,  5,  0,  5,  6,  0,  6,\
                     7,  0,  7,  8,  0,  8,  9,  0,  9, 10,\
                     0, 10, 11,  0, 11, 12], dtype=np.int32)

indptr = np.array([ 0,  4,  9, 14, 19, 22, 25, 28, 31, 34,\
                   37, 40, 43, 46], dtype=np.int32)

# Stop the time to initialize the Sparse matrix in CSC-format
start = default_timer()
for i in range(10000):
    J = csc_matrix((data, indices, indptr), shape=(13, 13))
stop = default_timer()
print('Initialize:'.ljust(15), stop - start)

# Set the diagonal of the matrix. The diagonal is in principle known.
start = default_timer()
for i in range(10000):
    J.setdiag(1./1e-10 + J.diagonal())
stop = default_timer()
print('Set diagonal:'.ljust(15), stop - start)

# Set an array to solve something
b = np.array([ -4.16737068e+05, 8.32180182e+05, 1.29378997e+03,\
               -4.15443441e+05,-1.29326784e+03,-2.60963259e-01,\
                0.00000000e+00, 0.00000000e+00, 0.00000000e+00,\
                0.00000000e+00, 0.00000000e+00, 0.00000000e+00,\
                0.00000000e+00])

# Stop the time to solve the system
start = default_timer()
for i in range(10000):
    x = spsolve(J,b)
stop = default_timer()
print('Solve:'.ljust(15), stop - start)

I know that changing the sparsity of a matrix is usually expensive. In principle I know the indices of the diagonal, but I don't know how to change the data once it is stored in a scipy csc_matrix. But why is the initialization of the matrix almost as expensive as solving the system? For me, the output of the example program is:

Initialize:     0.516402959824
Set diagonal:   1.67107796669
Solve:          0.845117807388

Is there a way to get around the scipy sparse matrices or to speed this up? I thought about calling Pardiso directly, but that looked rather complicated to me.

I've found in other cases, such as matrix multiplication, that a sparse matrix has to have a sparsity better than 0.1 to have a speed advantage. You are creating the csc matrix in the fastest way. I don't know much about the solver, but it isn't much slower than the creation step. I suspect the setdiag step can be sped up. – hpaulj
If you also store zeros on the diagonal, the sparsity structure doesn't need to change. But I think a 13 x 13 array is too small to achieve good performance. Too much Python code has to run before you enter the fast C routines. – user7138814
The 13 x 13 matrix given here is only an example. Usually the matrices are ~4000 x 4000 with a sparsity of ~90%. But why does it take so long to create the scipy csc matrix? In principle I give the data in exactly the format scipy wants. – Mor

1 Answer

0 votes

The computation time can be drastically reduced by initializing an instance of scipy.sparse.csc_matrix only once. Instead of writing

J = csc_matrix((data, indices, indptr), shape=(13, 13))

in every iteration, it is better to write:

J.data = data
J.indices = indices
J.indptr = indptr
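
A minimal sketch of how the timing loop from the question could be restructured with this change. It assumes the data, indices and indptr arrays defined in the question are in scope; the 'Re-assign:' label is only illustrative:

from scipy.sparse import csc_matrix
from timeit import default_timer

# Build the matrix once, outside the timing loop ...
J = csc_matrix((data, indices, indptr), shape=(13, 13))

start = default_timer()
for i in range(10000):
    # ... and only re-assign the underlying arrays inside the loop.
    # This skips the argument checking and conversion that the
    # csc_matrix constructor performs on every call.
    J.data = data
    J.indices = indices
    J.indptr = indptr
stop = default_timer()
print('Re-assign:'.ljust(15), stop - start)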

If the sparsity does not change and the indices of the diagonal are known, one can also manipulate the diagonal entries in the data array directly, instead of going through scipy's setdiag and diagonal methods.
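
One possible way to do that, sketched under the assumption that every diagonal entry is already present in the sparsity pattern (true for the example data; otherwise explicit zeros would have to be stored on the diagonal first, as mentioned in the comments). The positions of the diagonal entries inside J.data are computed once and then reused in every iteration:

import numpy as np

# Position of the diagonal entry of column j inside J.data:
# the entries of column j are stored in data[indptr[j]:indptr[j+1]],
# so find the offset where the row index equals j. Computed once.
diag_pos = np.array([indptr[j] +
                     np.flatnonzero(indices[indptr[j]:indptr[j + 1]] == j)[0]
                     for j in range(J.shape[0])])

# In every iteration the diagonal can then be updated in place,
# equivalent to J.setdiag(1./1e-10 + J.diagonal()) in the question:
J.data[diag_pos] += 1. / 1e-10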