Making a match-and-append code more efficient without 'for' loop

Question

I am trying to match the 1st column of A against the 1st to 3rd columns of B, and append the corresponding columns of B, from the 4th onward, to A.

For example,

I compare A(:,1) and B(:, 1:3)

1 and 3 are in A(:,1)

1 is in the 1st, 2nd, and 3rd rows of B(:, 1:3), so append B([1 2 3], 4:end)' (flattened into a single row) to A's 1st row. 3 is in the 2nd and 4th rows of B(:,1:3), so append B([2 4], 4:end)' (flattened the same way) to A's 2nd row.

So that it becomes:

1 2 5 4 5 3 1 2
3 4 5 3 6 5 0 0
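
The row matching described in this example can be checked step by step. The following is only an illustrative sketch, not part of the original question; the variable names rows1, rows3, tail1, and tail3 are made up for the illustration.

```matlab
A = [1 2; 3 4];
B = [1 2 4 5 4; 1 2 3 5 3; 1 1 1 1 2; 3 4 5 6 5];

% Rows of B whose first three columns contain A(1,1) = 1
rows1 = find(any(B(:,1:3) == A(1,1), 2)).';   % [1 2 3]
% Rows of B whose first three columns contain A(2,1) = 3
rows3 = find(any(B(:,1:3) == A(2,1), 2)).';   % [2 4]

% Values to append, read row by row from B(:,4:end)
tail1 = reshape(B(rows1, 4:end).', 1, []);    % [5 4 5 3 1 2]
tail3 = reshape(B(rows3, 4:end).', 1, []);    % [5 3 6 5]
```

tail3 is two elements shorter than tail1, which is why the second output row is padded with two zeros.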

I could only code this using for and if.

clearvars AA A B mem mem2 mem3

A = [1 2 ; 3 4]
B = [1 2 4 5 4; 1 2 3 5 3; 1 1 1 1 2; 3 4 5 6 5]

for n=1:1:size(A,1)
    mem  = ismember(B(:,[1:3]), A(n,1));
    mem2 = mem(:,1) + mem(:,2) + mem(:,3);
    mem3 = find(mem2>0);

    AA{n,:} = horzcat( A(n,:), reshape(B(mem3,[4,5])',1,[]) );
end

maxLength = max(cellfun(@(x)numel(x),AA));
out = cell2mat(cellfun(@(x)cat(2,x,zeros(1,maxLength-length(x))),AA,'UniformOutput',false))

I am trying to make this code more efficient by avoiding for and if, but I couldn't find a way to do it.

In your definition of AA (the last line inside the loop) you should use 4:end instead of [4,5]. Apart from that, your code runs quite fast. I would recommend keeping it if no faster solution turns up. There is no reason to avoid loops as such; it is just that a loop-free solution is often faster. - The Minion
@TheMinion: The problem is that his loop body contains ismember, which means the JIT cannot accelerate this loop effectively. For larger problems, this will become a concern. - Rody Oldenhuis
@RodyOldenhuis True. Hence the problem isn't the for loop but the ismember() call inside it. Still, when I ran his code and the one from Nishant, his was marginally faster even for 10.000x100 entries, so I am not sure that this "problem" with ismember() really causes such runtime issues. BTW, nice solution, +1. - The Minion
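
Since the comments identify the ismember call, rather than the loop itself, as the possible bottleneck, one option is to keep the loop and replace ismember with a plain equality test. The following is a minimal sketch under that assumption, not a measured improvement. The names key, tail, and hit are illustrative, and any speed-up would depend on the MATLAB version and the data size.

```matlab
A = [1 2; 3 4];
B = [1 2 4 5 4; 1 2 3 5 3; 1 1 1 1 2; 3 4 5 6 5];

key  = B(:, 1:3);              % columns to search
tail = B(:, 4:end);            % columns to append
AA   = cell(size(A,1), 1);     % preallocate one cell per row of A

for n = 1:size(A,1)
    hit   = any(key == A(n,1), 2);                    % logical row mask, no ismember
    AA{n} = [A(n,:), reshape(tail(hit,:).', 1, [])];  % A's row plus flattened matches
end

% Zero-pad every row to the longest length and stack the rows into a matrix
maxLength = max(cellfun(@numel, AA));
out = cell2mat(cellfun(@(x) [x, zeros(1, maxLength - numel(x))], AA, 'UniformOutput', false));
```

For the example data this gives the same 2-by-8 result as the original loop.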

Nishant · Accepted Answer · 2014-08-01T08:58:31

Try this:

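a = A(:,1);                              % keys to look up
b = B(:,1:3);                            % columns of B to search
z = size(b);
b = repmat(b,[1,1,numel(a)]);            % one copy of b per key (3rd dimension)
ab = repmat(permute(a,[2,3,1]),z);       % each key expanded to the size of b
% row2{k} counts, for each row of B, how often a(k) occurs in its first 3 columns
row2 = mat2cell(permute(sum(ab==b,2),[3,1,2]),ones(1,numel(a)));
% flatten the matching rows of B(:,4:end) into one row vector per key
AA = cellfun(@(x)(reshape(B(x>0,4:end)',1,numel(B(x>0,4:end)))),row2,'UniformOutput',0);
% zero-pad to the longest row and prepend A
maxLength = max(cellfun(@(x)numel(x),AA));
out = cat(2,A,cell2mat(cellfun(@(x)cat(2,x,zeros(1,maxLength-length(x))),AA,'UniformOutput',false)))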
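
The two repmat calls above only expand a and b to a common size before they are compared. As a hedged alternative that is not part of the original answer, bsxfun can perform that expansion implicitly. The sketch below assumes A is non-empty, and the names a3, hit, tail, w, cnt, and vals are illustrative. It still uses a short loop for the final fill, so it is not loop-free.

```matlab
A = [1 2; 3 4];
B = [1 2 4 5 4; 1 2 3 5 3; 1 1 1 1 2; 3 4 5 6 5];

% hit(i,j) is true when A(j,1) occurs anywhere in B(i,1:3)
a3  = permute(A(:,1), [2 3 1]);                          % 1-by-1-by-size(A,1)
hit = permute(any(bsxfun(@eq, B(:,1:3), a3), 2), [1 3 2]);

tail = B(:, 4:end);                                      % columns to append
w    = size(tail, 2);                                    % values appended per matching row
cnt  = sum(hit, 1);                                      % matching rows per row of A

% Preallocate the zero-padded result, then fill each row in place
out = [A, zeros(size(A,1), w * max(cnt))];
for j = 1:size(A,1)
    vals = reshape(tail(hit(:,j), :).', 1, []);
    out(j, size(A,2) + (1:numel(vals))) = vals;
end
```

Preallocating out with zeros makes the padding implicit, so no cellfun or cell2mat call is needed.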

UPDATE: The code below runs in almost the same time as the iterative code.

a = A(:,1);
b = B(:,1:3);
z = size(b);
b = repmat(b,[1,1,numel(a)]);
ab = repmat(permute(a,[2,3,1]),z);
% df(i,k) counts how often a(k) occurs in B(i,1:3)
df = permute(sum(ab==b,2),[3,1,2])';
AA = arrayfun(@(x)(B(df(:,x)>0,4:end)),1:size(df,2),'UniformOutput',0);
AA = arrayfun(@(x)(reshape(AA{1,x}',1,numel(AA{1,x}))),1:size(AA,2),'UniformOutput',0);
maxLength = max(arrayfun(@(x)(numel(AA{1,x})),1:size(AA,2)));
% column index vectors make arrayfun return a column cell, so cell2mat stacks
% the padded rows vertically instead of joining them into one long row
out2 = cell2mat(arrayfun(@(x,i)((cat(2,A(i,:),AA{1,x},zeros(1,maxLength-length(AA{1,x}))))),(1:numel(AA))',(1:size(A,1))','UniformOutput',0));
