5
votes

How can a sparse matrix - matrix product be calculated? I know the 'classic' / mathematical way of doing it, but it seems pretty inefficient. Can it be improved?

I thought about storing the first matrix in CSR form and the second one in CSC form, so since the row and column vectors are sorted I won't have to search for a specific row / column I need, but I guess that doesn't help much.

1
This seems like a duplicate stackoverflow.com/questions/19022226/…romants
Uhm, in my opinion it isn't. That question was 'if there is a better compression format more suitable for matrix - matrix operation'. I asked for a better algorithm to multiply two matrices.user3170713

1 Answers

8
votes

With the disclaimers that (i) you really don't want to implement your own sparse matrix package and (ii) if you need to anyway, you should read Tim Davis's book on sparse linear algebra, here's how to do a sparse matrix multiply.

The usual naive dense multiply looks like this.

C = 0
for i {
    for j {
        for k {
            C(i, j) = C(i, j) + (A(i, k) * B(k, j))
        }
    }
}

Since addition commutes, we can permute the loop indices any way we like. Let's put j outermost and i innermost.

C = 0
for j {
    for k {
        for i {
            C(i, j) = C(i, j) + (A(i, k) * B(k, j))
        }
    }
}

Store all matrices in CSC form. Since j is outermost, we're working column-at-a-time on B and C (but not A). The middle loop is over k, which is rows of B, and, conveniently enough, we don't need to visit the entries of B that are zero. That makes the outer two loops go over the nonzero entries of B in the natural order. The inner loop increments the jth column of C by the kth column of A times B(k, j). To make this easy, we store the current column of C densely, together with the set of indexes where this column is nonzero, as a list/dense Boolean array. We avoid writing all of C or the Boolean array via the usual implicit initialization tricks.