4
votes

I'm trying to implement the functionality of MATLAB function sparse.

Insert a value in sparse matrix at a specific index such that:

If a value with same index is already present in the matrix, then the new and old values are added.

Else the new value is appended to the matrix.

The function addNode performs correctly but the problem is that it is extremely slow. I call this function in a loop about 100000 times and the program takes more than 3 minutes to run. While MATLAB accomplishes this task in a matter of seconds. Is there any way to optimize the code or use stl algorithms instead of my own function to achieve what I want?

Code:

struct SparseMatNode
{
   int x;
   int y;
   float value;
};

std::vector<SparseMatNode> SparseMatrix;

void addNode(int x, int y, float val)
{
   SparseMatNode n;
   n.x = x;
   n.y = y;
   n.value = val;

   bool alreadyPresent = false;

   int i = 0;
   for(i=0; i<SparseMatrix.size(); i++)
   {
    if((SparseMatrix[i].x == x) && (SparseMatrix[i].y == y))
    {
        alreadyPresent = true;
        break;
    }
   }

   if(alreadyPresent)
   {
    SparseMatrix[i].value += val;
    if(SparseMatrix[i].value == 0.0f)
        SparseMatrix.erase(SparseMatrix.begin + i);
   }
   else
    SparseMatrix.push_back(n);
}
5

5 Answers

5
votes

Sparse matrices aren't typically stored as a vector of triplets as you are attempting.

MATLAB (as well as many other libraries) uses a Compressed Sparse Column (CSC) data structure, which is very efficient for static matrices. The MATLAB function sparse also does not build the matrix one entry at a time (as you are attempting) - it takes an array of triplet entries and packs the whole sequence into a CSC matrix. If you are attempting to build a static sparse matrix this is the way to go.

If you want a dynamic sparse matrix object, that supports efficient insertion and deletion of entries, you could look at different structures - possibly a std::map of triplets, or an array of column lists - see here for more information on data formats.

Also, there are many good libraries. If you're wanting to do sparse matrix operations/factorisations etc - SuiteSparse is a good option, otherwise Eigen also has good sparse support.

3
votes

Sparse matrices are usually stored in compressed sparse row (CSR) or compressed sparse column (CSC, also called Harwell-Boeing) format. MATLAB by default uses CSC, IIRC, while most sparse matrix packages tend to use CSR.

Anyway, if this is for production usage rather than a learning exercise, I'd recommend using a matrix package with support for sparse matrices. In the C++ world, my favourite is Eigen.

2
votes

The first thinks that stands out is that you are implementing your own functionality for finding an element: that's what std::find is for. So, instead of:

bool alreadyPresent = false;

int i = 0;
for(i=0; i<SparseMatrix.size(); i++)
{
  if((SparseMatrix[i].x == x) && (SparseMatrix[i].y == y))
  {
    alreadyPresent = true;
    break;
  }
}

You should write:

auto it = std::find(SparseMatrix.begin(), SparseMatrix().end(), Comparer);

where Comparer is a function that compares two SparseMatNode objects.

But the main improvement will come from using the appropriate container. Instead of std::vector, you will be much better off using an associative container. This way, finding an element will have just a O(logN) complexity instead of O(N). You may slighly modify your SparseMatNode class as follows:

typedef std::pair<int, int> Coords;
typedef std::pair<const Coords, float> SparseMatNode;

You may cover this typedefs inside a class to provide a better interface, of course.

And then:

std::unordered_map<Coords, float> SparseMatrix;

This way you can use:

auto it = SparseMatrix.find(std::make_pair(x, y));

to find elements much more efficiently.

2
votes

Have you tried sorting your vector of sparse nodes? Performing a linear search becomes costly every time you add a node. You could Insert In Place and always perform Binary Search.

-1
votes

Because sparse matrix may be huge and need to be compressed, you may use std::unordered_map. I assume matrix indexes (x and y) are always positive.

#include <unordered_map>

const size_t MAX_X =  1000*1000*1000;
std::unordered_map <size_t, float> matrix;

void addNode (size_t x, size_t y, float val)
{
   size_t index = x + y*MAX_X;
   matrix[index] += val;      //this function can be still faster
   if (matrix[index] == 0)    //using find() / insert() methods
       matrix.erase(index);
}

If std::unordered_map is not available on your system, you may try std::tr1::unordered_map or stdext::hash_map...

If you can use more memory, then use double instead of float, this will improve a bit your processing speed.