4
votes

Is it possible to speed up large sparse matrix calculations by e.g. placing parantheses optimally?

What I'm asking is: Can I speed up the following code by forcing Matlab to do the operations in a specified order (for instance "from right to left" or something similar)?

I have a sparse square symmetric matrix H, that previously has been factorized, and a sparse vector M with length equal to the dimension of H. What I want to do is the following:

EDIT: Some additional information: H is typically 4000x4000. The calculations of z and c are done around 4000 times, whereas the computation of dVa and dVaComp is done 10 times for every 4000 loops, thus 40000 in total. (dVa and dVaComp are solved iteratively, where P_mis is updated).

Here M*c*M', will become a sparse matrix with 4 non-zero element. In Matlab:

[L U P] = lu(H);                 % H is sparse (thus also L, U and P)
% for i = 1:4000             % Just to illustrate
M = sparse([bf bt],1,[1 -1],n,1);  % Sparse vector with two non-zero elements in bt and bf
z = -M'*(U \ (L \ (P * M)));     %  M^t*H^-1*M = a scalar
c = (1/dyp + z)^-1;              % dyp is a scalar
  % while (iterations < 10 && ~=converged)
    dVa = - (U \ (L \ (P * P_mis))); 
    dVaComp = (U \ (L \ (P * M * c * M' * dVa)));
    % Update P_mis etc. 
  % end while
% end for

And for the record: Even though I use the inverse of H many times, it is not faster to pre-compute it.

Thanks =)

2
There's a few issues: you say "P is a pre-defined full vector", but you generate it as a sparse matrix the same size as H. Also, between the "mathematical" z and implemented z, there is a minus sign different. This makes it a bit unclear what you want to accomplish exactly; could you please correct/clarify these issues?Rody Oldenhuis
Thanks for pointing the errors out Rody. The "pre-defined" P, is now changed to P_mis, and is a mismatch vector that is updated for every iteration, when dVa and dVaComp are found. It is not to be confused with the permutation matrix P. It is supposed to be a minus sign in front of z.Stewie Griffin
It's a bit confusing that you use t once for transposing and once as variable. btw. I remember darkly that Matlab has some optimizations if you calculate M*M' - since c is scalar you should be able to switch it around in dVaComp.bdecaf

2 Answers

1
votes

There's a few things not entirely clear to me:

  • The command M = sparse([t f],[1 -1],1,n,1); can't be right; you're saying that on rows t,f and columns 1,-1 there should be a 1; column -1 obviously can't be right.
  • The result dVaComp is a full matrix due to multiplication by P_mis, while you say it should be sparse.

Leaving these issues aside for now, there's a few small optimizations I see:

  • You use inv(H)*M twice, so you could pre-compute that.
  • negation of the dVa can be moved out of the loop.
  • if you don't need dVa explicitly, leave out the assignment to a variable as well.
  • inversion of a scalar means dividing 1 by that scalar (computation of c).

Implementing changes, and trying to compare fairly (I used only 40 iterations to keep total time small):

%% initialize
clc
N = 4000;

% H  is sparse, square, symmetric
H = tril(rand(N));
H(H<0.5) = 0; % roughly half is empty
H = sparse(H+H.');

% M is sparse vector with two non-zero elements.
M = sparse([1 N],[1 1],1, N,1);

% dyp is some scalar
dyp = rand;

% P_mis = full vector
P_mis = rand(N,1);


%% original method

[L, U, P] = lu(H);   

tic             

for ii = 1:40

    z = -M'*(U \ (L \ (P*M)));
    c = (1/dyp + z)^-1;

    for jj  = 1:10        
        dVa = -(U \ (L \ (P*P_mis)));
        dVaComp = (U \ (L \ (P*M * c * M' * dVa)));    
    end

end

toc


%% new method I

[L,U,P,Q] = lu(H);    

tic            

for ii = 1:40

    invH_M = U\(L\(P*M));

    z = -M.'*invH_M;
    c = -1/(1/dyp + z);

    for jj = 1:10          
        dVaComp = c * (invH_M*M.') * ( U\(L\(P*P_mis)) );
    end 

end

toc

This gives the following results:

Elapsed time is 60.384734 seconds. % your original method
Elapsed time is 33.074448 seconds. % new method 
1
votes

You might want to try using the extended syntax for lu when factoring the (sparse) matrix H:

[L,U,P,Q] = lu(H);

The extra permutation matrix Q re-orders columns to increase the sparsity of the factors L,U (while the permutation matrix P only re-orders rows for partial pivoting).

Specific results depend on the sparsity pattern of H, but in many cases using a good column permutation significantly reduces the number of non-zeros in the factorisation, reducing memory use and increasing speed.

You can read more about the lu syntax here.