2
votes

I have the following data.

  • a row vector idx1 of length n1
  • an m x n1 matrix A1
  • a row vector idx2 of length n2
  • an m x n2 matrix A2

The row vectors idx1 and idx2 label the columns of A1 and A2, respectively. Essentially, I want to merge the columns of A1 and A2 according to the labels in idx1 and idx2. I think it's easiest to give some code I already have that does the job.

idx = union(idx1,idx2);
A = zeros(size(A1,1),length(idx));
for i = 1:length(idx)
    j1 = find(idx1 == idx(i),1);
    if ~isempty(j1)
        A(:,i) = A(:,i) + A1(:,j1);
    end
    j2 = find(idx2 == idx(i),1);
    if ~isempty(j2)
        A(:,i) = A(:,i) + A2(:,j2);
    end
end

Now, my problem is that I want to carry out this operation efficiently, sometimes on sparse matrices. Is there a faster way than what I have? Does the answer change if A1 and A2 are sparse?

2
Should we assume idx1 has no repeated values, and the same for idx2? Otherwise I think your code doesn't work. - Luis Mendo
We can assume that idx1 and idx2 are sorted vectors of positive integers, and contain no repeated values. - Stirling

2 Answers

2
votes

You can perform the addition on whole arrays at once, without a loop (at the cost of two calls to ismember):

idx = union(idx1,idx2);
A = zeros(size(A1,1),length(idx));

[~,loc1] = ismember(idx1,idx);
[~,loc2] = ismember(idx2,idx);

A(:,loc1) = A(:,loc1) + A1;
A(:,loc2) = A(:,loc2) + A2;
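
To make the column mapping concrete, here is a small worked sketch of the same steps. The sample values of idx1, A1, idx2 and A2 are made up for illustration; the only assumption is the one stated in the comments (sorted, unique, positive integer labels).

```matlab
% Hypothetical sample data: labels are sorted, unique positive integers
idx1 = [1 3 5];
A1   = [1 2 3; 4 5 6];
idx2 = [3 4];
A2   = [10 20; 30 40];

idx = union(idx1, idx2);              % merged labels: [1 3 4 5]
A   = zeros(size(A1,1), numel(idx));

% loc1(k) is the column of A that receives column k of A1 (same for loc2)
[~, loc1] = ismember(idx1, idx);      % loc1 = [1 2 4]
[~, loc2] = ismember(idx2, idx);      % loc2 = [2 3]

% Labels are unique, so loc1 and loc2 contain no repeated columns
% and the indexed additions are safe.
A(:,loc1) = A(:,loc1) + A1;
A(:,loc2) = A(:,loc2) + A2;
% A is now [1 12 20 3; 4 35 40 6]
```

Label 3 appears in both inputs, so its column is the sum of the second column of A1 and the first column of A2.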
1
votes

If idx1 and idx2 contain positive integer values, you can let the sparse function do the sum for you, since it adds up values that share the same row and column indices:

m = size(A1,1); %// number of rows, shared by A1 and A2
[ii1, jj1] = ndgrid(1:m, idx1);
[ii2, jj2] = ndgrid(1:m, idx2);
A = sparse([ii1 ii2],[jj1 jj2],[A1 A2]); %// do the sum for matching indices
A = A(:,union(idx1, idx2)); %// result in sparse form
A = full(A); %// optional
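
If A1 and A2 are already sparse, a possible variant (my own sketch, not part of the code above) is to skip ndgrid, which builds dense m-by-n index arrays, and pass only the nonzero entries to sparse. It assumes sorted, unique, positive integer labels; the sample data and the names r1, c1, v1, j1 and so on are made up for illustration.

```matlab
% Hypothetical sparse sample data; labels are sorted, unique positive integers
idx1 = [1 3 5];
A1   = sparse([1 0 3; 0 5 0]);
idx2 = [3 4];
A2   = sparse([0 20; 30 0]);

m   = size(A1, 1);
idx = union(idx1, idx2);              % merged labels: [1 3 4 5]

% Extract only the nonzero entries as (row, column, value) triplets
[r1, c1, v1] = find(A1);
[r2, c2, v2] = find(A2);

% Map each input column number to its position in the merged label list
[~, loc1] = ismember(idx1, idx);
[~, loc2] = ismember(idx2, idx);
j1 = reshape(loc1(c1), [], 1);        % force column vectors so that
j2 = reshape(loc2(c2), [], 1);        % concatenation works for any m

% sparse() sums the values that share the same (row, column) pair
A = sparse([r1(:); r2(:)], [j1; j2], [v1(:); v2(:)], m, numel(idx));
% full(A) is [1 0 20 3; 0 35 0 0]
```

The result stays sparse and has one column per merged label, so no column selection is needed afterwards. Whether this is actually faster than the versions above depends on the density of the inputs, so it is worth timing on representative data.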