Merge two matrices in matlab

Question

I have two matrices. One is of size 1,000,000 x 9 and the other is 500,000 x 9.

The columns have the same meaning in both matrices: the first 7 columns act as a key and the last two columns hold data. Many key values occur in both matrices, and I would like to build one big matrix to compare the values. This big matrix should be of dimension 1,000,000 x 11.

For example:

A = [0 0 0 0 0 0 0 10 20; 0 0 0 0 0 0 1 30 40];
B = [0 0 0 0 0 0 0 50 60];

A merged matrix would look like this:

C = [0 0 0 0 0 0 0 10 20 50 60; 0 0 0 0 0 0 1 30 40 0 0];

As you can see, the first row of C has columns 8, 9 from matrix A and columns 10, 11 from matrix B. The second row uses columns 8, 9 from matrix A and 0, 0 for the last two columns because there is no corresponding entry in matrix B.

I have a working solution, but it is very, very slow because it relies heavily on loops. In any other programming language, I would sort both tables and then iterate over both of them in one big loop, keeping two pointers.

Is there a more efficient algorithm available in Matlab using vectorization or at least a sufficiently efficient one that is idiomatic/short?

(Additional note: My largest issue seems to be the search function: given my matrix, I would like to pass in one key vector with 7 elements (a 1x7 row vector, so that it lines up with the 7 key columns), let's name it key, and find the corresponding row. Right now, I use bsxfun for that:

targetRow = data( min(bsxfun(@eq, data(:, 1:7), key), [], 2) == 1, :);

I use min because bsxfun returns 7 match flags for each row and I obviously want all of them to be true. It seems to me that this could be the bottleneck of a Matlab algorithm.)

Semantically, I would always prefer all(X,2) over min(X,[],2)==1 for a logical array. Not sure if it is any faster though. — Florian
"This could be a bottleneck" - you can actually check whether it is! Try using the profiler to display which lines / sections of your code are using the most time. uk.mathworks.com/help/matlab/ref/profile.html Namely, profile on; <yourCodeHere>; profile viewer; — Wolfie
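A minimal sketch of the two comments above, assuming made-up test data: `data` and `key` here are hypothetical stand-ins, not the asker's real matrices, and `key` is taken to be a 1x7 row vector. It checks that `all(X, 2)` selects the same rows as `min(X, [], 2) == 1` and times both variants with `timeit`.

```matlab
% Hypothetical test data: 7 key columns with small integer values, 2 data columns.
rng(0);
data = [randi([0 3], 1e5, 7), rand(1e5, 2)];
key  = data(42, 1:7);   % 1x7 row vector, so bsxfun can expand it along the rows

% Row mask as written in the question (min) and as suggested in the comment (all).
maskMin = min(bsxfun(@eq, data(:, 1:7), key), [], 2) == 1;
maskAll = all(bsxfun(@eq, data(:, 1:7), key), 2);

% Both masks should select exactly the same rows.
sameRows = isequal(maskMin, maskAll);

% Rough timings; the absolute numbers depend on the machine and MATLAB version.
tMin = timeit(@() min(bsxfun(@eq, data(:, 1:7), key), [], 2) == 1);
tAll = timeit(@() all(bsxfun(@eq, data(:, 1:7), key), 2));
fprintf('min: %.4g s, all: %.4g s\n', tMin, tAll);
```

Either variant still scans every row of `data` for each lookup, so calling it once per key inside a loop is likely to dominate the run time regardless of which reduction is used.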

Iban Cereijo · Accepted Answer · 2017-02-08T10:58:39

Maybe with ismember and some indexing:

% Locates in B the first occurrence (lowest index) of each key in A. idxA is a
% logical vector marking the keys that were found, and idxB gives the row in B.
[idxA, idxB] = ismember(A(:,1:7), B(:,1:7),'rows'); 
C = [ A zeros(size(A, 1), 2) ];
C(idxA, 10:11) = B(idxB(idxA), 8:9); % idxB(idxA) are the nonzero entries of idxB
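As a quick check, here is the same approach applied to the small example from the question. This is only a sketch on the two sample matrices; the variable `expected` is introduced here for illustration and is simply the matrix C given in the question.

```matlab
% Example matrices from the question: columns 1-7 are the key, 8-9 the data.
A = [0 0 0 0 0 0 0 10 20; 0 0 0 0 0 0 1 30 40];
B = [0 0 0 0 0 0 0 50 60];

% idxA(i) is true if the key of row i of A occurs in B;
% idxB(i) is the matching row number in B (0 if there is none).
[idxA, idxB] = ismember(A(:, 1:7), B(:, 1:7), 'rows');

% Start with A padded by two zero columns, then fill in the matches from B.
C = [A, zeros(size(A, 1), 2)];
C(idxA, 10:11) = B(idxB(idxA), 8:9);

% Compare against the merged matrix given in the question.
expected = [0 0 0 0 0 0 0 10 20 50 60; 0 0 0 0 0 0 1 30 40 0 0];
disp(isequal(C, expected));
```

Note that C keeps exactly one row per row of A, so keys that appear only in B are dropped, which matches the 1,000,000 x 11 size asked for. If a key appears more than once in B, only one of those rows is used.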
