Multithreaded sparse matrix multiplication in Matlab

Question

I am performing several matrix multiplications of an NxN sparse (~1-2%) matrix, let's call it B, with an NxM dense matrix, let's call it A (where M < N). N is large, as is M; on the order of several thousands. I am running Matlab 2013a.

Now, usually, matrix multiplications and most other matrix operations are implicitly parallelized in Matlab, i.e. they make use of multiple threads automatically. This appears NOT to be the case if either of the matrices are sparse (see e.g. this StackOverflow discussion - with no answer for the intended question - and this largely unanswered MathWorks thread). This is a rather unhappy surprise for me.

We can verify that multithreading has no effects for sparse matrix operations by the following code:

clc; clear all; 

N = 5000;         % set matrix sizes
M = 3000;       
A = randn(N,M);   % create dense random matrices
B = sprand(N,N,0.015); % create sparse random matrix
Bf = full(B);     %create a dense form of the otherwise sparse matrix B

for i=1:3 % test for 1, 2, and 4 threads
  m(i) = 2^(i-1);
  maxNumCompThreads(m(i)); % set the thread count available to Matlab
  tic                      % starts timer
    y = B*A; 
  walltime(i) = toc;       % wall clock time
  speedup(i) = walltime(1)/walltime(i);
end

% display number of threads vs. speed up relative to just a single thread
[m',speedup']

This produces the following output, which illustrates that there is no difference between using 1, 2, and 4 threads for sparse operations:

threads   speedup
1.0000    1.0000
2.0000    0.9950
4.0000    1.0155

If, on the other hand, I replace B by its dense form, refered to as Bf above, I get significant speedup:

threads   speedup
1.0000    1.0000
2.0000    1.8894
4.0000    3.4841

(illustrating that matrix operations for dense matrices in Matlab are indeed implicitly parallelized)

So, my question: is there any way at all to access a parallelized/threaded version of matrix operations for sparse matrices (in Matlab) without converting them to dense form? I found one old suggestion involving .mex files at MathWorks, but it seems the links are dead and not very well documented/no feedback? Any alternatives?

It seems to be a rather severe restriction of implicit parallelism functionality, since sparse matrices are abound in computationally heavy problems, and hyperthreaded functionality highly desirable in these cases.

mathworks.com/help/matlab/math/sparse-matrix-operations.html and mathworks.com/matlabcentral/answers/… — Yvon
@Yvon In the links I see a general description of how things work, yet I can't make out how it is relevant for the question. — Dennis Jaheruddin
Just a silly afterthought: Does it help to make the full matrix sparse? — Dennis Jaheruddin
@DennisJaheruddin it helps in terms of speedup, but it is not practical in terms of memory. that's the reason for the question. — Daniyar
@Daniyar The information that M was large was hidden by a formatting problem, have edited the question to fix this. -- Still, going from full to sparse should only make the matrix twice as large to store, so unless you are close to the memory limit it may be an interesting approach. — Dennis Jaheruddin

Amro Amro · Accepted Answer · 2014-11-24T18:59:25

MATLAB already uses SuiteSparse by Tim Davis for many of its operation on sparse matrices (for example see here), but neither of which I believe are multithreaded.

Usually computations on sparse matrices are memory-bound rather than CPU-bound. So even you use a multithreaded library, I doubt you will see huge benefits in terms of performance, at least not comparable to those specialized in dense matrices...

After all that the design of sparse matrices have different goals in mind than regular dense matrices, where efficient memory storage is often more important.

I did a quick search online, and found a few implementations out there:

sparse BLAS, spBLAS, PSBLAS. For instance, Intel MKL and AMD ACML do have some support for sparse matrices
cuSPARSE, CUSP, VexCL, ViennaCL, etc.. that run on the GPU.

Multithreaded sparse matrix multiplication in Matlab

3 Answers