OK, here is the formula in MATLAB:
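function D = dumDistance(X,Y)
    % Squared Euclidean distance between every column of X and every column of Y.
    n1 = size(X,2);   % number of columns (points) in X
    n2 = size(Y,2);   % number of columns (points) in Y
    D = zeros(n1,n2);
    for i = 1:n1
        for j = 1:n2
            D(i,j) = sum((X(:,i)-Y(:,j)).^2);   % no sqrt, so this is the squared distance
        end
    end
end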
Credits here (I know it's not a fast implementation, but it is enough to show the basic algorithm).
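On the speed point, here is a minimal sketch of a vectorized equivalent, assuming the same one-column-per-point layout. It relies on the expansion ||x - y||^2 = ||x||^2 + ||y||^2 - 2*x'*y. The small matrices X and Y are made up for illustration only.

```matlab
% Toy data (hypothetical): columns are points in 2-D.
X = [0 1;
     0 0];      % points (0,0) and (1,0)
Y = [0 3 0;
     0 4 1];    % points (0,0), (3,4) and (0,1)

% Squared norms of each column, combined so that entry (i,j) is
% ||X(:,i)||^2 + ||Y(:,j)||^2, then subtract twice the dot products.
D = bsxfun(@plus, sum(X.^2, 1)', sum(Y.^2, 1)) - 2 * (X' * Y);

disp(D);   % same values the double loop would give for this X and Y
```

With non-integer data, round-off can leave tiny negative values where the true distance is zero, so clamping with max(D, 0) may be worth considering.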
Now here is what I don't understand.
Say that we have a matrix dictionary of size 140x100 (100 words) and a matrix page of size 140x40 (40 words). Each column represents a word in the 140-dimensional space.
Now, if I use dumDistance(page,dictionary), it will return a 40x100 matrix with the distances.
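As a sanity check on the shapes, here is a minimal sketch with random stand-in data (the values are an assumption, only the sizes match my case). It repeats the double loop inline and indexes the result by row and by column.

```matlab
% Random stand-ins for the real data; each column is one word.
page = rand(140, 40);          % 40 page words, 140 dimensions each
dictionary = rand(140, 100);   % 100 dictionary words

% Same double loop as dumDistance, written inline.
D = zeros(size(page, 2), size(dictionary, 2));
for i = 1:size(page, 2)
    for j = 1:size(dictionary, 2)
        D(i, j) = sum((page(:, i) - dictionary(:, j)).^2);
    end
end

disp(size(D));   % 40 rows (page words), 100 columns (dictionary words)

% Row 3: distances from page word 3 to every dictionary word.
rowForWord3 = D(3, :);   % 1x100
% Column 7: distances from every page word to dictionary word 7.
colForWord7 = D(:, 7);   % 40x1
```

So, under this layout, D(i,j) is the squared distance between the i-th page word and the j-th dictionary word.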
What I want to achieve is to find how close each word of the page matrix is to the words of the dictionary matrix, in order to represent the page in terms of the dictionary, with a histogram, let's say.
I know that if I take the min of the 40x100 matrix, I'll get a 1x100 matrix of the minimum values (with their locations available as the second output of min) to represent my histogram.
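To make the two directions of min concrete, here is a small sketch on a made-up 4x3 distance matrix (4 page words as rows, 3 dictionary words as columns). Assigning each page word to its nearest dictionary word and counting the assignments is only my assumption of how the histogram could be built.

```matlab
% Toy distance matrix (hypothetical values): rows = page words, columns = dictionary words.
D = [5 1 9;
     2 7 8;
     6 0 4;
     3 9 1];

% Default min works down each column: one result per dictionary word.
[colMin, closestPageWord] = min(D);          % both 1x3

% min along dimension 2 works across each row: one result per page word.
[rowMin, nearestDictWord] = min(D, [], 2);   % both 4x1

% Count how many page words landed on each dictionary word.
h = accumarray(nearestDictWord, 1, [size(D, 2) 1])';   % 1x3 histogram
disp(h);
```

On the real data the first call would give 1x100 results and the second 40x1 results, so the histogram over the dictionary would have 100 bins that sum to 40.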
What I really can't understand here is this 40x100 matrix. What data does this matrix represent? I can't visualize it in my mind.