Algorithm for multiplying two sparse matrices

Question

I am working in Matlab and I am storing sparse matrices as structure arrays with fields: row, column and data. So for two matrices, I would have a collection of arrays giving (row, column, data) for each non-zero entry.

I'm trying to write an efficient program to multiply two sparse matrices in this form but am having some difficulties.

However this has the problem of having duplicated entries in the array, when really I would like to add them.

Any help is very much appreciated.

You doing this as an exercise, right? Do you know Matlab has built-in functionality for sparse matrices? — Luis Mendo
Yes I am doing this as an exercise to try and understand those functions! Thanks — Wooster
I've thought of one way of doing it, whenever there is a column and row match, we then do another for loop which cycles through all previous matches which this would contribute to, however putting another loop in there seems slightly overkill? — Wooster
I am not sure that this will help you understand. The functions that is used to multiply matrices are not written in matlab, but in C. Thus the most efficient way of doing it in matlab, may not have been most efficient in C. — patrik
@Wooster: your choice of data structure for storing sparse matrices is not very efficient: en.wikipedia.org/wiki/Sparse_matrix#Storing_a_sparse_matrix — Amro

MrAzzaman MrAzzaman · Accepted Answer · 2014-07-23T02:21:40

First off, you can do away with the forloops by utilising ismember as follows:

[lia,locb] = ismember([a.column],[b.row]);
loca = find(lia);

This will give you loca and locb, which are row and column indices of the answer matrix respectively. You can determine the destination entries in the final array as follows:

[dest,~,i] = unique([loca',locb'],'rows');
dest_row = num2cell(dest(:,1));
dest_col = num2cell(dest(:,2));

Here we're using the row-based unique sort to get make sure we don't duplicate entries in the final matrix. The i index array is a map from the initial data (which may have duplicates) to the final data (sans duplicates). You can then sum the data based on these indices using accumarray:

dest_data = num2cell(accumarray(i,[a(loca).data].*[b(locb).data]));

We convert each of these to cell arrays to make forming the final matrix easier, as you'll see shortly. Assuming you haven't already, you should preallocate the final matrix:

len = size(dest,1); % number of unique entries in the final matrix
c = repmat(struct('row',[],'column',[],'data',[]),len,1);

We can now set the values in the final matrix:

[c.row] = dest_row{:};
[c.column] = dest_col{:};
[c.data] = dest_data{:};

Algorithm for multiplying two sparse matrices

1 Answers