I am afraid there is no short answer. The best storage scheme depends on the problem you are trying to solve. The things to consider are not only the storage size, but also how efficient access to and operations on this storage format are, from a computational and hardware perspective.
For sparse matrix-vector multiplication (SpMV), CSR is a good format because it allows linear access to the elements of a matrix row, which is good for memory and cache performance. However, CSR induces a more irregular access pattern on the multiplicand vector: its elements are fetched at positions determined by the column indices retrieved from each row, which is bad for cache performance. A CSC matrix-vector multiplication removes the irregular access on the multiplicand, at the cost of more irregular access into the result vector. Depending on your matrix structure you may choose one or the other. For example, a matrix with a few long rows sharing a similar nonzero distribution may be more efficient to handle in CSC format.
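To make the two access patterns concrete, here is a minimal sketch of both kernels in plain Python; the toy matrix, array names and loop structure are mine, purely for illustration:

```python
import numpy as np

# Toy 3x3 matrix:
# [10  0 20]
# [ 0 30  0]
# [40  0 50]

# CSR: values and column indices stored row by row
csr_val = np.array([10.0, 20.0, 30.0, 40.0, 50.0])
csr_col = np.array([0, 2, 1, 0, 2])
csr_rowptr = np.array([0, 2, 3, 5])   # row i occupies [rowptr[i], rowptr[i+1])

# CSC: values and row indices stored column by column
csc_val = np.array([10.0, 40.0, 30.0, 20.0, 50.0])
csc_row = np.array([0, 2, 1, 0, 2])
csc_colptr = np.array([0, 2, 3, 5])   # column j occupies [colptr[j], colptr[j+1])

x = np.array([1.0, 2.0, 3.0])

def spmv_csr(val, col, rowptr, x):
    # Streams through the matrix row by row (good locality on the matrix data),
    # but gathers x[col[k]] from scattered positions (irregular reads of x).
    y = np.zeros(len(rowptr) - 1)
    for i in range(len(rowptr) - 1):
        for k in range(rowptr[i], rowptr[i + 1]):
            y[i] += val[k] * x[col[k]]
    return y

def spmv_csc(val, row, colptr, x, nrows):
    # Reads x[j] once per column (regular reads of x),
    # but scatters updates into y[row[k]] (irregular writes into the result).
    y = np.zeros(nrows)
    for j in range(len(colptr) - 1):
        for k in range(colptr[j], colptr[j + 1]):
            y[row[k]] += val[k] * x[j]
    return y

print(spmv_csr(csr_val, csr_col, csr_rowptr, x))     # [ 70.  60. 190.]
print(spmv_csc(csc_val, csc_row, csc_colptr, x, 3))  # [ 70.  60. 190.]
```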
Some examples from well-known software packages/tools:
- To the best of my knowledge, Matlab uses column-oriented storage by default.
- Scientific codes (and BLAS) based on Fortran also use column-oriented storage by default. This is mostly for historical reasons: Fortran arrays are column-major to begin with, and a large number of dense/sparse BLAS codes were originally written in Fortran.
- Eigen also uses column-oriented storage by default, but this can be customised.
- Intel MKL requires you to choose, IIRC.
- Boost uBLAS uses a row-based storage format by default.
- PETSc, which is a widely used tool in larger-scale scientific computing, uses a row-based format by default (its sequential AIJ type, SeqAIJ, is CSR), but also lets you choose from a wide variety of storage formats (see the MatCreate* functions in its documentation).
And the list could go on. As you can see there is some spread between the various tools, and I doubt the main criterion was the performance of the SpMV operation :) More likely, aspects such as the common storage formats in the target problem domains, the typical expectations of programmers in those domains, and integration with other parts of the library and with already existing codes were the prime reasons behind choosing CSR or CSC. These differ on a per-tool basis, obviously.
Anyhow, a short overview of sparse storage formats can be found here, but many more storage formats have been and are still being proposed in sparse matrix research:
- There are also block storage formats, which attempt to exploit locally dense substructures of the matrix. See for example "Fast Sparse Matrix-Vector Multiplication by Exploiting Variable Block Structure" by Richard W. Vuduc and Hyun-Jin Moon.
- A very brief but useful overview of some common storage formats can be found in the Python scipy documentation on sparse formats: http://docs.scipy.org/doc/scipy/reference/sparse.html (a small usage sketch follows this list).
- Further information on the advantages of the various formats can be found in the following texts (and many others):
- Yousef Saad, Iterative Methods for Sparse Linear Systems.
- Yousef Saad, SPARSKIT: A Basic Tool Kit for Sparse Matrix Computation, Tech. Rep. CSRD TR 1029, CSRD, University of Illinois, Urbana, IL, 1990.
- LAPACK Working Note 50: Distributed Sparse Data Structures for Linear Algebra Operations, Tech. Rep. CS 92-169, Computer Science Department, University of Tennessee, Knoxville, TN, 1992.
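As a concrete illustration of some of the formats mentioned above, here is a small scipy sketch (the matrix is a made-up example; `tobsr` gives scipy's block variant of CSR):

```python
import numpy as np
from scipy import sparse

# Build a small matrix in COO (triplet) form, which is convenient for assembly...
rows = np.array([0, 0, 1, 2, 2])
cols = np.array([0, 2, 1, 0, 2])
vals = np.array([10.0, 20.0, 30.0, 40.0, 50.0])
A_coo = sparse.coo_matrix((vals, (rows, cols)), shape=(3, 3))

# ...then convert to the format best suited to the operation at hand.
A_csr = A_coo.tocsr()                   # row-oriented: fast row slicing, y = A @ x
A_csc = A_coo.tocsc()                   # column-oriented: fast column slicing
A_bsr = A_coo.tobsr(blocksize=(1, 1))   # block CSR, for block-structured matrices

x = np.array([1.0, 2.0, 3.0])
print(A_csr @ x)  # the result is the same regardless of storage format
print(A_csc @ x)
```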
I have been doing research in the sparse matrix area, creating custom hardware architectures for sparse matrix algorithms (such as SpMV). From experience, some sparse matrix benchmarks tend to ignore the overhead of converting between the various formats. This is because, in principle, you could adapt the storage format used throughout your entire algorithm. SpMV itself is hardly used in isolation; it is generally part of some larger iterative algorithm (e.g. a linear or nonlinear solver), and in that case the cost of converting between formats can be amortised across the many iterations and the total runtime of the entire algorithm. Of course, you would have to justify that this assumption holds in your situation.
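As a rough sketch of that amortisation argument, assume the matrix arrives in COO (triplet) form and the kernel wants CSR; the power iteration below is just a stand-in for whatever iterative solver wraps the SpMV:

```python
import numpy as np
from scipy import sparse

def power_iteration(A_coo, num_iters=1000):
    # The format conversion is paid once, up front...
    A_csr = A_coo.tocsr()
    x = np.ones(A_csr.shape[1])
    # ...and its cost is amortised over the many SpMVs in the iterative loop.
    for _ in range(num_iters):
        y = A_csr @ x
        x = y / np.linalg.norm(y)
    return x

# Made-up symmetric example, just to exercise the loop.
A = sparse.coo_matrix(np.array([[4.0, 1.0, 0.0],
                                [1.0, 3.0, 1.0],
                                [0.0, 1.0, 2.0]]))
print(power_iteration(A))  # approximates the dominant eigenvector
```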
As a disclaimer, in my area we are particularly inclined to make as many assumptions as possible, since the cost and time to implement a hardware architecture for a linear solver to benchmark a new SpMV storage format is usually substantial (on the order of months). When working in software, it is a lot easier to test, qualify and quantify your assumptions by running as many benchmarks as possible, which would probably take less than a week to set up :D