0
votes

Apologies for the awkward title, here is a more specific description of the problem. I have a large (e.g. 10^6 x 10^6) sparse symmetric matrix which defines bonds between nodes.

e.g. The matrix A = [0 1 0 0 0; 1 0 0 2 3; 0 0 0 4 0; 0 2 4 0 5; 0 3 0 5 0] would describe a 5-node system, such that nodes 1 and 2 are connected by bond number A(1,2) = 1, nodes 3 and 4 are connected by bond number A(3,4) = 4, etc.

I want to form two new matrices. The first, B, would list the nodes connected to each node (i.e. each row i of B has elements given by find(A(i,:)), and padded with zeros at the end if necessary) and the second, C, would list the bonds connected to that node (i.e. each row i of C has elements given by nonzeros(A(i,:)), again padded if necessary).

e.g. for the matrix A above, I would want to form B = [2 0 0; 1 4 5; 4 0 0; 2 3 5; 2 4 0] and C = [1 0 0; 1 2 3; 4 0 0; 2 4 5; 3 5 0]

The current code is:

B=zeros(length(A), max(sum(spones(A))))
C=zeros(length(A), max(sum(spones(A))))
for i=1:length(A)
    B(i,1:length(find(A(i,:)))) = find(A(i,:));
    C(i,1:length(nonzeros(A(i,:)))) = nonzeros(A(i,:));
end

which works, but is slow for large length(A). I have tried other formulations, but they all include for loops and don't give much improvement.

How do I do this without looping through the rows?

2

2 Answers

0
votes

Hmm. Not sure how to vectorize (find returns linear indices when given a matrix, which is not what you want), but have you tried this:

B=zeros(length(A), 0);
C=zeros(length(A), 0);
for i=1:length(A)
    Bi = find(A(i,:));
    B(i,1:length(Bi)) = Bi;
    Ci = nonzeros(A(i,:));
    C(i,1:length(Ci)) = Ci;
end

I made two changes:

  • removed call to spones (seems unnecessary; the performance hit needed to expand the # of columns in B and C is probably minimal)
  • cached result of find() and nonzeros() so they're not called twice
0
votes

I know it's hard to read, but that code is a vectorized version of your code:

[ i j k ] = find(A);
A2=(A~=0);
j2=nonzeros(cumsum(A2,2).*A2);
C2=accumarray([i,j2],k)
k2=nonzeros(bsxfun(@times,1:size(A,2),A2));
B2=accumarray([i,j2],k2);

Try it and tell me if it works for you.