7
votes

I am building a sparse linear system with multiple (soft) constraints. I am converting some code which used to build the matrix with boost::ublas, to Eigen. The boost:ublas has a convenient way to create a sparse matrix with known (or estimated) number of non-zeros, and has reasonably fast operator(int row, int col) to update its elements.

The problem is as follows:

  • Using SparseMatrix::setFromTriplets:
    My system has many constraints. As a naive, 'slightly' exaggerated example, let say that I have a 100x100 sparse matrix, with 500 nnz but 1 billion redundant constraints (i.e., the non-zero coeffs are modified a billion times). setFromTriplets requires me to store 1 billion coefficients, most of which will be summed up to form my set of 500 non zero coefficients. That is not really efficient nor memory friendly. Of course, I can replace my std::vector by an std::map, and perform the accumulation of the constraints manually, but this somehow misses the point of having a sparse matrix class, and that wouldn't be efficient either.

  • Using SparseMatrix::insert(i,j,val):
    Does not work if the element is already present. My problem is to be able to accumulate coefficients that are already present.

  • using SparseMatrix::coeffRef(i, j):
    That does work, and would be the function I'm looking for. However it is several orders of magnitude slower than boost::ublas. I am surprised that I did not see a better function for that. I thought this would be due to the number of non-zeros that is not known in advance, and forces multiple reallocations (which is what happens in practice). However, using SparseMatrix::reserve() has no effect, since it is a function that only works for compressed matrices (a comment in the source says ""This function does not make sense in non compressed mode" before an assert).... and, as the documentation says "the insertion of a new element into a SparseMatrix converts this later to the uncompressed mode".

What is the most efficient way to build a sparse matrix in Eigen while still being able to update its coefficients multiple times ?

Thanks

[EDIT: sample use case: 10x10 matrix with 10 non-zeros. For simplicity, the matrix is diagonal]

SparseMatrix<double> mat(10, 10);
for (int i=0; i<10; i++) {
  for (int j=0; j<1000000; j++) {
    mat.coeffRef(i, i) += rand()%10;
  }
}

=> works, but orders of magnitude slower than the ublas operator() (for bigger matrices and a more realistic setting of course).

std::vector<Eigen::Triplet<double> > triplets(10000000);
int k=0;
for (int i=0; i<10; i++) {
  for (int j=0; j<1000000; j++) {
    triplets[k++] = Eigen::Triplet<double>(i,i,rand()%10);
  }
}
SparseMatrix<double> mat(10, 10);
mat.setFromTriplets(triplets.begin(), triplets.end());

=> not memory friendly...

1
I don't quite follow your problem. Would you post a simplified version of your use case? Ideally, do it with a 10x10 sparse matrix with, say, 10 non-zeroes.Escualo
I just added a trivial example.nbonneel
your example does not convey any meaning. I meant my question as your actual use case. That is: you have an sparse matrix, and then you do something with it, and then, what? do you change the matrix? Do you shuffle the coefficients? Do you add a new set of coefficients?Escualo
Did you try with mat.coeffRef(i,j) += v_ij;? (cf. eigen.tuxfamily.org/dox-devel/…)Escualo
thanks for the suggestion. But doing that efficiently ultimately amounts to creating a new SparseMatrix class to be able to initialize the Eigen::SparseMatrix. This is actually what my current code does : it uses ublas as a convenient way to create the matrix, and then converts it to Eigen::SparseMatrix. I wanted to get rid of this bizarre step... Thanks for your help!nbonneel

1 Answers

7
votes

To make insert of coeffRef efficient, you need to reserve enough space using mat.reserve(nnz) where nnz is a Eigen::VectorXi containing the estimated number of non zero of each column. It is better to slightly overestimate these numbers to avoid numerous reallocations/copies. Another complementary trick is to make sure that the first time you access to the element (i,j), this element is the last one of the column j.

If you can easily compute the sparsity pattern, then an alternative is to fill a vector of unique triplet with 0 for the values, and then coeffRef will be fast.