A and B are matrices consisting of binary elements. A is denoted as the base data matrix and B is the query matrix. A consists of 75 data points each of length 10 and B consists of 50 data points each of length 10. I want to calculate the distance between all data points in A and every query data point in B in order to apply nearest neighbor search. So instead of using the Euclidean or the Hamming distance, I have used another metric:
N = 2, k = length of data samples, s = A(1,:) and t = B(1,:).
The code works for one data sample in A and another data sample in B. How do I scale it so that it works for all base data points and all query data points?
Example for which the code works
Let A(1,:) = [1,0,1,1,0,0,0,1,1,0] be the first sample in the A matrix. Let B(1,:) = [1,1,0,0,1,1,1,1,0,0] be the first query point.
If the elements in samples taken from A and B are the same, 0 is recorded for each matching element, otherwise 1. The final distance is the sum of the 1's. The program then checks whether the two sequences are the same, setting b to 1 if so, or 0 otherwise. Can somebody please show how I can apply this to matrices?
Code
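l = size(A, 2);        % elements per sample; length(A) would return 75 for a 75-by-10 matrix
D = zeros(1, l);
for i = 1:l
    if A(1,i) == B(1,i)
        D(1,i) = 0;    % elements agree
    else
        D(1,i) = 1;    % elements differ
    end
end
total = 0;             % named total so the built-in sum function is not shadowed
for j = 1:l
    total = total + D(1,j);
end
if total == 0
    b = 1;             % the two sequences are identical
else
    b = 0;
end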
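For a single pair of samples, the same comparison can be written without loops. The following is a minimal sketch using the two example vectors from the question; `s` and `t` follow the notation above, while `d` is a name chosen here for illustration. It covers only the mismatch count and the flag `b` as described in the text.

```matlab
% Example vectors from the question.
s = [1,0,1,1,0,0,0,1,1,0];   % first sample of A
t = [1,1,0,0,1,1,1,1,0,0];   % first query point of B

D = double(s ~= t);          % 1 where the elements differ, 0 where they agree
d = nnz(D);                  % distance: number of differing positions
b = double(d == 0);          % 1 if the two sequences are identical, 0 otherwise
```

For these two vectors the samples differ in 7 positions, so `d` is 7 and `b` is 0.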
for loop looks like it is similar... can you clarify? The Hamming distance adds up the total number of disagreeing positions between corresponding elements. However, you are calculating the total number of agreeing positions, which is what your first for loop is doing. Also, are you saying that this code works between two query vectors and you want to extend to matrices? I would like to write a more vectorized approach, but if you are bent on using loops I can live with that. Please clarify. – rayryeng