Sparse Matrix-Vector Multiplication By Row Partitioning

Question

I am a newby in MPI and Parallel Computing Environments. I have different sparse matrices from https://sparse.tamu.edu/ and I am trying to multiply a sparse matrix with a dense vector by row partitioning with the size of N x N and N x 1 respectively. I will test my parallel MPI Program with the number of processes 1,2,4,8,16.

I have done some research about this algorithm and found a better solution and roadmap from this presentation. https://www.sandia.gov/~mmwolf/presentations/CS591MH/CS591MH_20070913.pdf

Algorithm is like this;

First,partition the whole sparse matrix row-wise for each process and also partition the dense vector. For memory efficiency, also store nonzero elements of sparse matrix.
Before doing any computation, send a vector element needed by sent x[j] to remote processes that have nonzeros in column j.
Do computation and save each row results in output vector.

I did not understand how to specify remote processes for sending x[j]. If I specify, How can I communicate between these processes with non-blocking send and receive operations ? Should I use for loop for each send operation ?

Thanks in advance.

Do you really need to implement this algorithm by yourself? If not, a lot of pre-existing, validated and optimized libraries doing parallel linear algebra are available, and could be used easily. In that case, check out PETSc (mcs.anl.gov/petsc) or Trilinos (trilinos.github.io) which provide parallel data structures as well as parallel linear algebra algorithms. — jcgiret
Thanks for the answer,but I need to implement this algorithm in C without using any other libraries. — mustafamuratcoskun

mustafamuratcoskun mustafamuratcoskun · Accepted Answer · 2019-05-12T08:47:26

Note: I solved how to multiply sparse matrix - dense vector multiplications by rowwise partitioning.

In 1D - Rowwise Partitioning of Sparse Matrix, first I have read different sparse matrices from SuiteSparse Matrix Collection (https://sparse.tamu.edu/ ) by using “mmio.c” from https://math.nist.gov/MatrixMarket/mmio-c.html and partition this sparse matrix via RowWise into N x (N / p ) matrices and assign these matrices to different processes(p = number of processes ).After partition the whole matrix into different pieces, I simply create (N / p) x 1 dense vector for each process.

After partition ended, I needed to determine which processes interact with each other for non-blocking point-to-point communications (MPI_Isend and MPI_IRecv) to be small. For that reason, I needed to create a task-interaction graph that show which processes communicate with each other.A process can send its dense vector part to other processes that uses this dense vector for doing local multiplication. Also, a process can receive different dense vector parts from other processes. For example, after determining these processes, I obtained these two dependent processes.

Example

Process 0

Send Dependents

Process 1 Process 5 Process 6

Receive Dependents

Process 10 Process 13

While looking these dependent processes, I used two for loop in every process for sending its dense vector part and receiving different dense vector part from other processes by using non-blocking point-to-point communication (MPI_Isend and MPI_Irecv). In order to wait all send and receive requests , I used MPI_Waitall function.

As a final part, I have realised sparse matrix - dense vector multiplication.

Sparse Matrix-Vector Multiplication By Row Partitioning

1 Answers