I have a problem with the speed of my MATLAB calculation. I wrote code that runs the calculation for small matrices, but it uses nested for-loops, and with the large datasets I'm using, MATLAB fails to finish the computation.
Note: I'm not terribly familiar with MATLAB, so while the program works, it is extremely inefficient.
In short, I'm trying to create a matrix whose entries describe the relationship between a set of unique locations. As a concrete example, we start with this matrix:
B =
5873 4 1
5873 7 1
5873 1 1
2819 8 2
2819 1 2
9771 4 3
9771 2 3
9771 5 3
9771 6 3
5548 7 4
Here the third column is the unique location identifier and the second column is the number of the "segment" that falls in that location. Notice that the same segment can fall into multiple locations (segment 1, for example, appears in both location 1 and location 2).
What I would like to do is create a matrix that describes the relationship between the different locations. Specifically, for locations i and j, I would like the (i,j) entry of the new matrix to be the number of segments that i and j have in common divided by the total number of distinct segments in i and j combined (that is, the size of the intersection divided by the size of the union).
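To make the definition concrete, here is a minimal sketch that computes a single entry, C(1,2), for the sample matrix above. It assumes B is a plain numeric matrix (in my real code the same columns sit in B.data), and the names seg1, seg2, shared, combined and C12 are only for illustration.

```matlab
% Sample data from above as a plain numeric matrix.
% Assumption: in the real code these columns live in B.data instead.
B = [5873 4 1; 5873 7 1; 5873 1 1; 2819 8 2; 2819 1 2; ...
     9771 4 3; 9771 2 3; 9771 5 3; 9771 6 3; 5548 7 4];

seg1 = B(B(:,3) == 1, 2);   % segments in location 1: 4, 7, 1
seg2 = B(B(:,3) == 2, 2);   % segments in location 2: 8, 1

shared   = intersect(seg1, seg2);   % segments in both locations: 1
combined = union(seg1, seg2);       % distinct segments overall: 1, 4, 7, 8

C12 = numel(shared) / numel(combined);   % 1/4 = 0.25
disp(C12)
```

Working through every pair the same way, the full matrix I expect for the sample data is:
C =
1.0000 0.2500 0.1667 0.3333
0.2500 1.0000 0 0
0.1667 0 1.0000 0
0.3333 0 0 1.0000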
Currently my code looks like this:
C = zeros(max(B.data(:,3)), max(B.data(:,3)));
for i = 1:max(B.data(:,3))
    for j = 1:max(B.data(:,3))
        vi = B.data(:,3) == i;
        vj = B.data(:,3) == j;
        C(i,j) = numel(intersect(B.data(vi,2), B.data(vj,2))) / numel(union(B.data(vi,2), B.data(vj,2)));
    end
end
But it is very, very slow. Does anyone have suggestions for speeding up the calculation?
Thanks so much!!