2
votes

We have a matlab program in which we want to calculate the following expression:

sum(  (M*x) .* x) 

Here, M is a small dense matrix (say 100 by 100) and x is a sparse fat matrix (say of size 100 by 1 000 000, with 5% non-zero entries). When I run the code, then first M*x is calculated, which is a dense matrix-- however, most of the computation that went into computing that matrix is a complete waste of time, as most of it will be zero-ed out in the point-wise product with x afterwards.

In other words: What I want to do is to only calculate those entries (i,j) of M*x which correspond to (i,j) for which x(i,j) is non-zero. In the end, I will then also only be interested in each column count.

It seems pretty simple to start with but I could not figure out how to tell matlab to do it or how to reshape the calculation so that matlab does it efficiently. I would really like to avoid having to code up a mex-file for this operation, and this operation is eating up most of the computation time.

Here is a code snippet for comparison:

m = 100;
n = 100000;
density = 0.05;

M = randn(m); M = M * M';
x = sprandn(m,n,density);

tic
for i = 1:100
     xsi  = sum((M * x).*x,1);
end
toc
Elapsed time is 13.570713 seconds.
1

1 Answers

1
votes

To compute (M*x) .* x: find which entries of the final result can be nonzero (using find), compute manually only for those (sum(M(...).'.*x(...)) .* nonzeros(x).'), and from that build the final matrix (using sparse):

[ii jj] = find(x);
R = sparse(ii, jj, sum(M(ii,:).'.*x(:,jj)) .* nonzeros(x).');

Of course, to compute sum((M*x) .* x) you then simply use

full(sum(R))