I am trying to compute few (5-500) eigenvectors corresponding to the smallest eigenvalues of large symmetric square sparse-matrices (up to 30000x30000) with less than 0.1% of the values being non-zero.
I am currently using scipy.sparse.linalg.eigsh in shift-invert mode (sigma=0.0), which I figured out through various posts on the topic is the prefered solution. However, it takes up to 1h to solve the problem in most cases. On the other hand the function is very fast, if I ask for the largest eigenvalues (sub seconds on my system), which was expected from the documentation.
Since I am more familiar with Matlab from work, I tried solving the problem in Octave, which gave me the same result using eigs (sigma=0) in just a few seconds (sub 10s). Since I want to do a parameter sweep of the algorithm including the eigenvector computation, that kind of time gain would be great to have in python as well.
I first changed the parameters (especially tolerance), but that didn't change much on the timescales.
I am using Anaconda on Windows, but tried to switch the LAPACK/BLAS used by scipy (which was a huge pain) from mkl (default Anaconda) to OpenBlas (used by Octave according to the documentation), but could not see a change in performance.
I was not able to figure out, whether there was something to change about the used ARPACK (and how)?
I uploaded a testcase for the code below to the following dropbox-folder: https://www.dropbox.com/sh/l6aa6izufzyzqr3/AABqij95hZOvRpnnjRaETQmka?dl=0
In Python
import numpy as np
from scipy.sparse import csr_matrix, csc_matrix, linalg, load_npz
M = load_npz('M.npz')
evals, evecs = linalg.eigsh(M,k=6,sigma=0.0)
In Octave:
M=dlmread('M.txt');
M=spconvert(M);
[evecs,evals] = eigs(M,6,0);
Any help is appriciated!
Some additional options I tried out based on the comments and suggestions:
Octave:
eigs(M,6,0)
and eigs(M,6,'sm')
give me the same result:
[1.8725e-05 1.0189e-05 7.5622e-06 7.5420e-07 -1.2239e-18 -2.5674e-16]
while eigs(M,6,'sa',struct('tol',2))
converges to
[1.0423 2.7604 6.1548 11.1310 18.0207 25.3933]
much faster, but only if the tolerance values is above 2, otherwise it does not converge at all and the values are strongly different.
Python:
eigsh(M,k=6,which='SA')
and eigsh(M,k=6,which='SM')
both do not converge (ARPACK error on no convergence reached). Only eigsh(M,k=6,sigma=0.0)
gives some eigenvalues (after almost an hour), which are different to octave for the smallest ones (even 1 additional small value is found):
[3.82923317e-17 3.32269886e-16 2.78039665e-10 7.54202273e-07 7.56251500e-06 1.01893934e-05]
If the tolerance is high enough I also get results from eigsh(M,k=6,which='SA',tol='1')
, which come close the other obtained values
[4.28732218e-14 7.54194948e-07 7.56220703e-06 1.01889544e-05, 1.87247350e-05 2.02652719e-05]
again with a different number of small eigenvalues. The computation time is still almost 30min. While the different very small values might be understandable, since they might represent multiples of 0, the different multiplicity baffles me.
Additionally, there seems to be some fundamental differences in SciPy and Octave, which I cannot figure out, yet.