1
votes

Hi, I have a correlation matrix:

A = [1 2 1 3 1 2 4 3 5 1; 
     2 3 4 5 6 6 6 7 7 8];

I need to find out how many times each individual element of row 1 is linked to elements in row 2.

For example, element 1 in row 1 is linked to the elements {2, 4, 6, 8} of row 2, for a total of 4 elements.

Similarly 2 is linked to {3, 6}, for a total of 2 elements.

The resulting matrix C should be:

[element value in 1st row; number of connections].

For the example above, C = [1 2 ...; 4 2 ...];

Since the actual matrix size is on the order of thousands, it is impractical to do this manually. Any help will be appreciated.

3
What if the column [1;2] appears twice? Should it be counted twice? – Luis Mendo

3 Answers

2
votes

There is probably a way to do this without a for loop, but this is one solution I can think of right now: identify the unique elements in the first row of the matrix, then loop over them to find the elements they are linked to in the second row.

I have assumed that you only want to count the distinct elements that each first-row element is linked to, hence the unique() call inside the for loop; if that is not the case, remove that call from the code.

a = [1 2 1 3 1 2 4 3 5 1; 2 3 4 5 6 6 6 7 7 8];

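row1el = unique(a(1, :));              % distinct values in row 1

c = zeros(2, length(row1el));          % row 1: value, row 2: number of links
for i = 1:length(row1el)
  idx = a(1, :) == row1el(i);          % columns whose row-1 entry equals this value
  linkedEl = a(2, idx);                % row-2 entries in those columns
  c(1, i) = row1el(i);
  c(2, i) = length(unique(linkedEl));  % count distinct linked elements only
end
disp(c)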
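
On the loop-free point mentioned above: one possible variant replaces the explicit loop with arrayfun. This is only a sketch under the same assumption (only distinct linked elements are counted); the name nLinked is introduced here for illustration. arrayfun still iterates internally, so this is more compact rather than necessarily faster.

```matlab
a = [1 2 1 3 1 2 4 3 5 1; 2 3 4 5 6 6 6 7 7 8];

row1el = unique(a(1, :));   % distinct values in row 1 (row vector)

% For each distinct value v, count the distinct row-2 entries that share a column with v
nLinked = arrayfun(@(v) numel(unique(a(2, a(1, :) == v))), row1el);

c = [row1el; nLinked];
disp(c)
```

To count repeated columns with their multiplicity instead, drop the inner unique call.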
2
votes

If I understood the question correctly, you are not really concerned with the values in the second row, only with the number of occurrences of each element in row 1. This can be obtained with the unique and histc functions:

C(1,:)=unique(A(1,:));
C(2,:)=histc(A(1,:),C(1,:));

C =

     1     2     3     4     5
     4     2     2     1     1
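
The MATLAB documentation now recommends histcounts over histc. A minimal sketch of the same count with histcounts, assuming R2014b or later; the names vals and counts are illustrative. Appending Inf as a final edge gives the largest value a bin of its own.

```matlab
A = [1 2 1 3 1 2 4 3 5 1;
     2 3 4 5 6 6 6 7 7 8];

vals   = unique(A(1,:));                   % sorted distinct values of row 1
% Bin k covers [vals(k), vals(k+1)); the last bin is [vals(end), Inf]
counts = histcounts(A(1,:), [vals, Inf]);
C = [vals; counts];
disp(C)
```

Because the bin edges are exactly the distinct values, each bin holds the occurrences of one value only, so this should also work when the values are not consecutive integers.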
1
votes

The answer depends on whether repeated columns should be counted repeatedly or not. Consider the following data:

A = [1 2 1 3 1 2 4 3 5 1; 
     2 3 4 5 6 6 6 5 7 8]; %// col [3;5] appears twice

  1. If repeated columns should be counted according to their multiplicity, you can use accumarray:

    [ii, ~, kk] = unique(A(1,:));
    jj = accumarray(kk(:), A(2,:).', [], @(x) numel(x)).'; % kk(:) forces a column, whichever orientation unique returns
    C = [ii; jj];
    

    Result with my example A:

    C =
         1     2     3     4     5
         4     2     2     1     1
    

    Or you can use sparse:

    [~, ii, jj] = find(sum(sparse(A(2,:), A(1,:), 1)));
    C = [ii; jj];
    

    The result is the same as above.

  2. If repeated columns should be counted just once: either of the two approaches is easily adapted to this case:

    [ii, ~, kk] = unique(A(1,:));
    jj = accumarray(kk(:), A(2,:).', [], @(x) numel(unique(x))).'; % note: "unique"; kk(:) forces a column
    C = [ii; jj];
    

    or

    [~, ii, jj] = find(sum(sparse(A(2,:), A(1,:), 1) > 0));  %// note: ">0"
    C = [ii; jj];
    

    Result (note that the third column differs from before):

    C =
         1     2     3     4     5
         4     2     1     1     1
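
As a cross-check for the second case, the repeated columns can be removed first with unique(..., 'rows') and the remaining pairs counted afterwards. This is a sketch under the assumption that a "connection" means a distinct (row 1, row 2) pair; the names P, vals, idx and counts are illustrative.

```matlab
A = [1 2 1 3 1 2 4 3 5 1;
     2 3 4 5 6 6 6 5 7 8];   % col [3;5] appears twice

P = unique(A.', 'rows');             % one row per distinct (row1, row2) pair
[vals, ~, idx] = unique(P(:,1));     % distinct row-1 values and their group index
counts = accumarray(idx(:), 1);      % number of distinct pairs per value
C = [vals.'; counts.'];
disp(C)
```

Skipping the first unique call (that is, using P = A.') counts repeated columns with their multiplicity, as in the first case.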