
Assume my matrix A is the output of a comparison, i.e. a logical matrix containing only 0s and 1s. For a small 3-by-4 matrix, we might have something like:

A =

     1     1     0     0
     0     0     1     0
     0     0     1     1

Now I generate another matrix B of the same size as A. Each row of B holds the column indices of the nonzero entries in the corresponding row of A, and any leftover entries in that row are set to zero.

B =

     1     2     0     0
     3     0     0     0
     3     4     0     0

Currently, I use the find function on each row of A to build B. The complete code is:

A=[1,1,0,0;0,0,1,0;0,0,1,1];
[rows,columns]=size(A);
B=zeros(rows,columns);

for i=1:rows
    currRow=find(A(i,:));
    B(i,1:length(currRow))=currRow;
end

For large matrices, the MATLAB Profiler shows the time being spent in the find call. Is there any way to generate matrix B faster?

Note: Matrix A has more than 1000 columns, but no row ever has more than 50 nonzero elements. Here I make B the same size as A, but B could have far fewer columns.
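
As a rough sketch of that narrower B, the column count could be taken from the row with the most nonzeros. The name maxCount is only illustrative, and the small A from above is reused as a logical matrix:

```matlab
A = logical([1,1,0,0; 0,0,1,0; 0,0,1,1]);
rows = size(A, 1);

% The busiest row decides how many columns B needs (maxCount is a made-up name).
maxCount = max(sum(A, 2));
B = zeros(rows, maxCount);

for i = 1:rows
    currRow = find(A(i,:));              % column indices of the nonzeros in row i
    B(i, 1:numel(currRow)) = currRow;    % left-align them, rest stays zero
end
```

For the small example this gives a 3-by-2 matrix instead of a 3-by-4 one.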

You can replace find with an indexing operation, but I’d be surprised if that’s faster. I = 1:columns; currRow = I(A(i,:)); - Cris Luengo
I was thinking along the lines that the "for" loop could be removed altogether for faster operation. Is that possible? - lonstud
Have you considered storing your matrix as a sparse matrix? I would also store the transpose of the matrix so as to make the searches along the columns, the way they're stored in MATLAB. - beaker
for loops in MATLAB are not necessarily slow. Historically they were, but that's not so true nowadays. With the suggestions from Cris and beaker I wouldn't expect a vectorised version to be faster. - David

1 Answer


I considered suggesting parfor, but its overhead is too high here and it brings other issues of its own, so it is not a good solution.

rows = 5e5;
cols = 1000;
A = rand(rows, cols) < 0.050;
I = uint16(1:cols);
B = zeros(size(A), 'uint16');
% [r,c] = find(A);
tic
for i=1:rows
%     currRow = find(A(i,:));
    currRow = I(A(i,:));
    B(i,1:length(currRow)) = currRow;
end
toc

@Cris suggests replacing find with an indexing operation, as done in the loop above. In my test it improves performance by about 10%.
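
One caveat worth spelling out: the indexing form only works when A is of class logical. The A in the question is typed in as a double matrix of 0s and 1s, and I(A(i,:)) on that would error because 0 is not a valid index. A minimal check on the small example, assuming A is converted with logical() first (viaFind and viaIndex are just illustrative names):

```matlab
A = logical([1,1,0,0; 0,0,1,0; 0,0,1,1]);   % must be logical, not double
I = 1:size(A, 2);                            % lookup table of column numbers

for i = 1:size(A, 1)
    viaFind  = find(A(i,:));    % original approach
    viaIndex = I(A(i,:));       % logical indexing into the lookup table
    assert(isequal(viaFind, viaIndex));
end
```

If A really comes out of a comparison such as rand(rows, cols) < 0.05, it is already logical and no conversion is needed.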

Apparently, there is no better optimization as long as B has to be in the specific form you describe. If the indices are not required in matrix form, I suggest using [r,c] = find(A); instead.
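
If the matrix form is needed after all, it can also be assembled without a loop from that single find call. This is an untimed sketch, so whether it beats the loop on the real data would have to be checked with the profiler. It assumes A is logical, and the names pos and slot are illustrative:

```matlab
A = logical([1,1,0,0; 0,0,1,0; 0,0,1,1]);

% One find over the whole matrix: (r(k), c(k)) is the k-th nonzero,
% listed in column-major order.
[r, c] = find(A);

% Running count along each row: at a nonzero, this is its rank within the row.
pos  = cumsum(A, 2);
slot = pos(A);                 % logical indexing uses the same column-major order

% Write each column index c into row r at its slot; everything else stays zero.
B = zeros(size(A));
B(sub2ind(size(A), r, slot)) = c;
```

On the small example this reproduces B = [1 2 0 0; 3 0 0 0; 3 4 0 0]. Note that cumsum builds a full double matrix the size of A, so the memory cost is higher than in the loop version.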