Efficient computation of Euclidean distance between cell arrays

Question

I have an a-by-b cell array, C. In each element, there is a float array.

I now want to create a new symmetric a-by-a matrix M. Each element (i, j) of M is to be set to the sum of the Euclidean distances between the float arrays in row i of C and the float arrays in row j of C.

For example, to find M(i,j), I would take the set of b float arrays in C along row i, and the set of b float arrays in C along row j, find the Euclidean distance between each array across the two sets, and then sum up the b x b values. C{i,j} is a column vector. All columns are the same length.

Below is my "brute force" implementation of this:

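[a, b] = size(C);
M = zeros(a);  % preallocate the a-by-a result
for i=1:a
    for j=1:a
        dist_sum = 0;
        for k=1:b
            for l=1:b
                % Euclidean distance between C{i,k} and C{j,l}
                dist = sqrt(sum((C{i, k} - C{j, l}) .^ 2));
                dist_sum = dist_sum + dist;
            end
        end
        M(j, i) = dist_sum;
        M(i, j) = dist_sum;
    end
end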

My question: Is there a more efficient way of doing this using matrix operations, without having to explicitly compute each Euclidean distance in turn?

What exactly are the contents of C{i,j}? A row vector? With the same length for all i and j? — Luis Mendo
C{i,j} is a column vector. All columns are the same length. — Karnivaurus
It would be way easier if you had a 3D array C(i,j,k), where k indexes the vector components. Is that possible for you? — Luis Mendo
Can you convert your cell array to a matrix using cell2mat? That should make it easier to use functions like pdist. — Cecilia
If all the elements in your cell array C are of the same type, you should be using a normal array/matrix. This will reduce your memory consumption and, as @2cents says, make it easier to use that array/matrix in other functions. — gire

Luis Mendo · Accepted Answer · 2014-08-01T14:37:30

It would be better to use a 3D array, instead of a 2D cell array of equal-size column vectors.

If you are starting from a cell array, first convert it into a 3D array (D in my code), then compute the distances with bsxfun, and finally apply sum:

D = permute(C, [3 1 2]);
D = reshape(cat(2, D{:}), [], size(C,1), size(C,2)); %// 3D array
dist = sqrt(sum(bsxfun(@minus, D, permute(D, [1 4 5 2 3])).^2)); %// distances
M = squeeze(sum(sum(dist, 3), 5)); %// sum of distances
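For readers on MATLAB R2016b or later, where arithmetic operators expand singleton dimensions implicitly, the same computation can be written without bsxfun. The following is a minimal sketch under that assumption, not part of the original answer. The sample data (a = 3, b = 2, vectors of length n = 4) is made up for illustration, and the comments track the array size after each step.

```matlab
% Hypothetical sample data: a = 3 rows, b = 2 columns, vectors of length n = 4.
a = 3; b = 2; n = 4;
C = cell(a, b);
for p = 1:numel(C)
    C{p} = p * (1:n)';   % any equal-length column vectors would do
end

% n x a x b: D(:, i, k) holds C{i, k}
D = reshape(cat(2, C{:}), n, a, b);

% n x 1 x 1 x a x b: same data, moved to dimensions 4 and 5
% (equivalent to permute(D, [1 4 5 2 3]))
E = reshape(D, n, 1, 1, a, b);

% 1 x a x b x a x b: dist(1, i, k, j, l) is the distance between C{i,k} and C{j,l}
% (D - E relies on implicit expansion, assumed available)
dist = sqrt(sum((D - E) .^ 2, 1));

% a x a: sum over k (dimension 3) and l (dimension 5)
M = reshape(sum(sum(dist, 3), 5), a, a);
```

Either way, the intermediate difference array has n * a^2 * b^2 elements, so memory use grows quickly with a and b.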

Example: with

>> C = {[1; 2], [30; 40], [0; 1]; [5; 7] [19; 17] [4; 5]}; %// a is 2, b is 3

the result of both your code and mine is

M =
  196.8391  182.8791
  182.8791   77.3002
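The agreement between the two versions can be checked directly. The sketch below, assuming the example C above, runs the loops from the question and the vectorized code from the answer side by side. The names M_loop, M_vec and max_diff are introduced here for illustration only.

```matlab
C = {[1; 2], [30; 40], [0; 1]; [5; 7], [19; 17], [4; 5]};  % example from the answer
[a, b] = size(C);

% Loop version, as in the question
M_loop = zeros(a);
for i = 1:a
    for j = 1:a
        s = 0;
        for k = 1:b
            for l = 1:b
                s = s + sqrt(sum((C{i, k} - C{j, l}) .^ 2));
            end
        end
        M_loop(i, j) = s;
    end
end

% Vectorized version, as in the answer
D = permute(C, [3 1 2]);
D = reshape(cat(2, D{:}), [], size(C,1), size(C,2));
dist = sqrt(sum(bsxfun(@minus, D, permute(D, [1 4 5 2 3])).^2));
M_vec = squeeze(sum(sum(dist, 3), 5));

% Largest absolute difference between the two results;
% anything beyond round-off level would indicate a mismatch
max_diff = max(abs(M_loop(:) - M_vec(:)));
```

Comparing with a small tolerance rather than isequal is the safer choice here, since the two versions add the distances in a different order.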
