3
votes

I have two big sparse double matrices in Matlab:

  • P with dimension 1048576 x 524288

  • I with dimension 1048576 x 524288

I want to find the number of entrances i,j such that P(i,j)<=I(i,j)

Naively I tried to run

 n=sum(sum(P<=I));

but it is extremely slow (I had to shut down Matlab because it was running since forever and I wasn't able to stop it).

Is there any other more efficient way to proceed or what I want to do is unfeasible?

2
Depends highly in the sparsity of P<=I. If its true all the time, then the answer is a "full" sparse matrix, taking about 2.5~3x more memory that if it was a normal full matrix. If this is expected, perhaps looping, even if its row-wise, is a better idea, as likely what it is making this operation slow is the need of memory, not the arithmetics - Ander Biguri

2 Answers

3
votes

From some simple tests,

n = numel(P) - nnz(P>I);

seems to be faster than sum(sum(P<=I)) or even nnz(P<=I). The reason is probably that the sparse matrix P<=I has many more nonzero entries than P>I, and thus requires more memory.

Example:

>> P = sprand(10485, 52420, 1e-3);
>> I = sprand(10485, 52420, 1e-3);
>> tic, disp(sum(sum(P<=I))); toc
   (1,1)      549074582
Elapsed time is 3.529121 seconds.
>> tic, disp(nnz(P<=I)); toc
   549074582
Elapsed time is 3.538129 seconds.
>> tic, disp(nnz(P<=I)); toc
   549074582
Elapsed time is 3.499927 seconds.
>> tic, disp(numel(P) - nnz(P>I)); toc
   549074582
Elapsed time is 0.010624 seconds.

Of course this highly depends on the matrix sizes and density.

2
votes

Here is a solution using indices of nonzero elements:

xp = find(P);
xi = find(I);
vp = nonzeros(P);
vi = nonzeros(I);
[s,ia,ib] = intersect(xp,xi);

iia = true(numel(vp),1);
iia(ia)=false;
iib = true(numel(vi),1);
iib(ib) = false;
n = sum(vp(ia) <= vi(ib))+sum(vp(iia)<0)+sum(vi(iib)>0)-(numel(xp)+numel(xi)-numel(s))+numel(P);