Your problem is very likely caused by the overflow of an index stored in an int32, which happens when the result of your dot product has more than 2^31 - 1 non-zero entries. Try the following...
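>>> import numpy as np
>>> import scipy.sparse
>>> c = np.empty_like(A.indptr)
>>> scipy.sparse.sparsetools.csr_matmat_pass1(A.shape[0], B.shape[1], A.indptr,
...                                           A.indices, B.indptr, B.indices, c)
>>> # Compare neighbours directly: np.diff(c) would itself wrap in int32 and hide the overflow.
>>> np.all(c[1:] >= c[:-1])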
With your data, the array c is a vector of 12414693 + 1 items holding the accumulated number of non-zero entries per row in the product of your two matrices, i.e. it is what C.indptr will be if C = A.dot(B) finishes successfully. It is of type np.int32, even on 64-bit platforms, which is not good. If your sparse matrix is too large, there will be an overflow, that last line will return False, and the arrays that store the result of the matrix product will be allocated with the wrong size (the last item of c, which is very likely to be a negative number if an overflow did happen). If that's the case, then yes, file a bug report...
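If you would rather not call into sparsetools at all, the reasoning above can be sketched with plain NumPy. This is only an illustration under stated assumptions, not what SciPy does internally: the names nnz_upper_bound_per_row, wrapped_in_int32 and INT32_MAX are made up for this example, and the first function returns an upper bound on the non-zeros per row (duplicate column hits are counted more than once), not the exact counts that end up in c.

```python
import numpy as np

INT32_MAX = np.iinfo(np.int32).max  # 2**31 - 1


def nnz_upper_bound_per_row(a_indptr, a_indices, b_indptr):
    """Upper bound, computed in int64, on the non-zeros in each row of A.dot(B).

    Hypothetical helper: takes the CSR index arrays of A and B directly.
    """
    # Number of stored entries in each row of B.
    b_row_nnz = np.diff(np.asarray(b_indptr, dtype=np.int64))
    # A stored entry A[i, k] can add at most nnz(B[k, :]) entries to row i.
    contrib = b_row_nnz[np.asarray(a_indices, dtype=np.int64)]
    # Prefix sums give the total per row of A, and handle empty rows too.
    prefix = np.concatenate(([0], np.cumsum(contrib, dtype=np.int64)))
    a_indptr = np.asarray(a_indptr, dtype=np.int64)
    return prefix[a_indptr[1:]] - prefix[a_indptr[:-1]]


def wrapped_in_int32(counts):
    """Accumulate per-row counts in int32 and report whether the running
    total ever decreases, which is the signature of a wraparound."""
    c = np.cumsum(counts, dtype=np.int32)
    return bool(np.any(c[1:] < c[:-1]))
```

With your matrices you would call nnz_upper_bound_per_row(A.indptr, A.indices, B.indptr).sum() and compare it with INT32_MAX. If the total stays at or below that limit, the int32 index cannot overflow; if it is larger, an overflow is possible but not certain, because the bound overcounts.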
import scipy; scipy.__version__). – Warren Weckesser