3
votes

Assume the following matrix:

myMatrix = [
1   0   1
1   0   0
1   1   1
1   1   1
0   1   1
0   0   0
0   0   0
0   1   0
1   0   0
0   0   0
0   0   0
0   0   1
0   0   1
0   0   1    
];

Given the above (and treating each column independently), I'm trying to create a matrix that will contain the number of rows since the last value of 1 has "shown up". For example, in the first column, the first four values would become 0 since there are 0 rows between each of those rows and the previous value of 1.

Row 5 would become 1, row 6 = 2, row 7 = 3, row 8 = 4. Since row 9 contains a 1, it would become 0 and the count starts again with row 10. The final matrix should look like this:

FinalMatrix = [
0   1   0
0   2   1
0   0   0
0   0   0
1   0   0
2   1   1
3   2   2
4   0   3
0   1   4
1   2   5
2   3   6
3   4   0
4   5   0
5   6   0     
];

What is a good way of accomplishing something like this?

EDIT: I'm currently using the following code:

[numRow,numCol] = size(myMatrix);
oneColumn = 1:numRow;
FinalMatrix = repmat(oneColumn',1,numCol);
toSubtract = zeros(numRow,numCol);
for m=1:numCol
    rowsWithOnes = find(myMatrix(:,m));
    for mm=1:length(rowsWithOnes);
        toSubtract(rowsWithOnes(mm):end,m) = rowsWithOnes(mm);
    end
end
FinalMatrix = FinalMatrix - toSubtract;

which runs about 5 times faster than the bsxfun solution posted over many trials and data sets (which are about 1500 x 2500 in size). Can the code above be optimized?

2

2 Answers

3
votes

For a single column you could do this:

col = 1; %// desired column
vals = bsxfun(@minus, 1:size(myMatrix,1), find(myMatrix(:,col)));
vals(vals<0) = inf;
result = min(vals, [], 1).';

Result for first column:

result =
     0
     0
     0
     0
     1
     2
     3
     4
     0
     1
     2
     3
     4
     5
1
votes

find + diff + cumsum based approach -

offset_array = zeros(size(myMatrix));
for k1 = 1:size(myMatrix,2)
    a = myMatrix(:,k1);
    widths = diff(find(diff([1 ; a])~=0));
    idx = find(diff(a)==1)+1;
    offset_array(idx(idx<=numel(a)),k1) = widths(1:2:end);
end
FinalMatrix1 = cumsum(double(myMatrix==0) - offset_array);

Benchmarking

The benchmarking code for comparing the above mentioned approach against the one in the question is listed here -

clear all
myMatrix = round(rand(1500,2500)); %// create random input array
for k = 1:50000
    tic(); elapsed = toc(); %// Warm up tic/toc
end

disp('------------- With FIND+DIFF+CUMSUM based approach') %//'#
tic
offset_array = zeros(size(myMatrix));
for k1 = 1:size(myMatrix,2)
    a = myMatrix(:,k1);
    widths = diff(find(diff([1 ; a])~=0));
    idx = find(diff(a)==1)+1;
    offset_array(idx(idx<=numel(a)),k1) = widths(1:2:end);
end
FinalMatrix1 = cumsum(double(myMatrix==0) - offset_array);
toc
clear FinalMatrix1 offset_array idx widths a

disp('------------- With original approach') %//'#
tic
[numRow,numCol] = size(myMatrix);
oneColumn = 1:numRow;
FinalMatrix = repmat(oneColumn',1,numCol); %//'#
toSubtract = zeros(numRow,numCol);
for m=1:numCol
    rowsWithOnes = find(myMatrix(:,m));
    for mm=1:length(rowsWithOnes);
        toSubtract(rowsWithOnes(mm):end,m) = rowsWithOnes(mm);
    end
end
FinalMatrix = FinalMatrix - toSubtract;
toc

The results I got were -

------------- With FIND+DIFF+CUMSUM based approach
Elapsed time is 0.311115 seconds.
------------- With original approach
Elapsed time is 7.587798 seconds.