2
votes

I have two numpy arrays a and b:

The points in a and b have the same number of dimensions, but a may contain a different number of points than b.

For instance:

a = [[1,2], ..., [5,7]]
b = [ [3,8], [4,7], ... [9,15] ] 

Is there an easy way to compute the Euclidean distance between a and b so that the resulting array can be used in a k-nearest-neighbors learning algorithm?

Note: This is in Python.

4
So you want the Euclidean distance from each point in a to each point in b? Can you give a small example input and output? - aganders3

4 Answers

1
votes

If what you want is k nearest neighbors, then there are more efficient ways than computing the full distance matrix (especially with many points). Check out scipy's KDTree if you want fast k-neighbors searches.
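To illustrate the tree-based approach, here is a minimal sketch using scipy.spatial.KDTree. The sample points and the choice k = 2 are assumptions made up for the example, not values from the question.

```python
import numpy as np
from scipy.spatial import KDTree

# Hypothetical sample data: each row is one 2-D point.
a = np.array([[1, 2], [5, 7], [4, 6]])
b = np.array([[3, 8], [4, 7], [9, 15]])

# Build the tree once over the points to be searched.
tree = KDTree(b)

# For every point in a, find its k nearest points in b.
k = 2
dists, idx = tree.query(a, k=k)

# dists[i, j] is the distance from a[i] to its j-th nearest point in b,
# and idx[i, j] is the row index of that point in b.
print(dists.shape, idx.shape)
```

The tree never builds the full len(a) x len(b) distance matrix, which is why it tends to scale better when only the k nearest neighbors are needed.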

0
votes

You can use scipy.spatial.distance.cdist like this:

from scipy.spatial import distance

a = [[1,2], ..., [5,7]]
b = [ [3,8], [4,7], ... [9,15] ] 
dist = distance.cdist(a, b, 'euclidean')
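Since the arrays above are elided, here is a small runnable version with made-up points (an assumption, just to show the shape of the result):

```python
import numpy as np
from scipy.spatial import distance

# Hypothetical sample data: a has 2 points, b has 3 points, all 2-D.
a = np.array([[1, 2], [5, 7]])
b = np.array([[3, 8], [4, 7], [9, 15]])

# dist[i, j] is the Euclidean distance between a[i] and b[j].
dist = distance.cdist(a, b, 'euclidean')
print(dist.shape)  # (2, 3)

# Indices of the k nearest points in b for each point in a.
k = 2
nearest = np.argsort(dist, axis=1)[:, :k]
```

The result has one row per point in a and one column per point in b, so the two arrays only need to agree on the number of columns.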

This method is practical only when a and b have a small number of elements. With millions of elements it is slow and needs a lot of memory. For example, if 'a' has 1 million elements and 'b' has 1000 elements, the result takes O(m*n) time and space, where m=1000000 and n=1000. That is a matrix of 10^9 distances, or about 8 GB as 64-bit floats.
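One possible way around the memory cost, if only the k nearest neighbors are needed, is to process 'a' in chunks and keep just the k smallest distances from each chunk. The sketch below assumes that goal; the helper name k_nearest_chunked and the chunk_size parameter are hypothetical, not part of scipy.

```python
import numpy as np
from scipy.spatial import distance

def k_nearest_chunked(a, b, k, chunk_size=10_000):
    """Hypothetical helper: for each row of a, return the indices of and
    distances to its k nearest rows of b, computed chunk by chunk."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    idx_out = np.empty((len(a), k), dtype=np.intp)
    dist_out = np.empty((len(a), k))
    for start in range(0, len(a), chunk_size):
        stop = start + chunk_size
        # Only a (chunk_size x len(b)) block is held in memory at a time.
        block = distance.cdist(a[start:stop], b, 'euclidean')
        # argpartition moves the k smallest entries of each row to the front
        # (in arbitrary order) without sorting the whole row.
        part = np.argpartition(block, k - 1, axis=1)[:, :k]
        part_d = np.take_along_axis(block, part, axis=1)
        # Sort just those k candidates by distance.
        order = np.argsort(part_d, axis=1)
        idx_out[start:stop] = np.take_along_axis(part, order, axis=1)
        dist_out[start:stop] = np.take_along_axis(part_d, order, axis=1)
    return idx_out, dist_out

# Usage with made-up random data.
rng = np.random.default_rng(0)
a = rng.random((25, 2))
b = rng.random((7, 2))
idx, d = k_nearest_chunked(a, b, k=3, chunk_size=10)
```

Peak memory is then governed by chunk_size * len(b) instead of len(a) * len(b), at the cost of a Python-level loop over the chunks.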

Several methods are compared in terms of efficiency here: Efficient and precise calculation of the euclidean distance

0
votes

You can use numpy. Here is an example:

import numpy as np


a = np.array([3, 0])
b = np.array([0, 4])

# Square the coordinate differences, sum them, and take the square root.
c = np.sqrt(np.sum((a - b) ** 2))
# c == 5.0
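The example above handles a single pair of points. For two arrays of points, as in the question, the same formula can be applied to every pair at once with broadcasting. This is a sketch with made-up sample points; it builds an intermediate array of shape (len(a), len(b), dims), so it only suits inputs small enough for that to fit in memory.

```python
import numpy as np

# Hypothetical sample data: a has 2 points, b has 3 points, all 2-D.
a = np.array([[3, 0], [0, 0]])
b = np.array([[0, 4], [3, 4], [0, 0]])

# Insert new axes so that a is (2, 1, 2) and b is (1, 3, 2);
# subtracting broadcasts to every pair of points, giving shape (2, 3, 2).
diff = a[:, np.newaxis, :] - b[np.newaxis, :, :]

# Sum the squared differences over the coordinate axis, then take the root.
dist = np.sqrt(np.sum(diff ** 2, axis=-1))

# dist[i, j] is the Euclidean distance between a[i] and b[j].
print(dist)
```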