2
votes

I have a 3D point cloud of n points stored as a NumPy array of shape (n, 3). For example, it could look something like this:

P = [[x1,y1,z1],[x2,y2,z2],[x3,y3,z3],[x4,y4,z4],[x5,y5,z5],.....[xn,yn,zn]]

I would like to be able to get the k nearest neighbors of each point.

So, for example, the k nearest neighbors of P1 might be P2, P3, P4, P5, P6, and the k nearest neighbors of P2 might be P100, P150, P2, etc.

How does one go about doing that in Python?

2
Possible duplicate of fastest nearest neighbor algorithm - Amadan
numpy.linalg.norm and numpy.argsort might help. See stackoverflow.com/questions/1401712/… - dkato

2 Answers

12
votes

This can be solved neatly with scipy.spatial.distance.pdist.

First, let's create an example array that stores points in 3D space:

import numpy as np
N = 10  # The number of points
points = np.random.rand(N, 3)
print(points)

Output:

array([[ 0.23087546,  0.56051787,  0.52412935],
       [ 0.42379506,  0.19105237,  0.51566572],
       [ 0.21961949,  0.14250733,  0.61098618],
       [ 0.18798019,  0.39126363,  0.44501143],
       [ 0.24576538,  0.08229354,  0.73466956],
       [ 0.26736447,  0.78367342,  0.91844028],
       [ 0.76650234,  0.40901879,  0.61249828],
       [ 0.68905082,  0.45289896,  0.69096152],
       [ 0.8358694 ,  0.61297944,  0.51879837],
       [ 0.80963247,  0.1680279 ,  0.87744732]])

For each point, we compute the distance to all other points:

from scipy.spatial import distance
D = distance.squareform(distance.pdist(points))
print(np.round(D, 1))  # Rounding to fit the array on screen

Output:

array([[ 0. ,  0.4,  0.4,  0.2,  0.5,  0.5,  0.6,  0.5,  0.6,  0.8],
       [ 0.4,  0. ,  0.2,  0.3,  0.3,  0.7,  0.4,  0.4,  0.6,  0.5],
       [ 0.4,  0.2,  0. ,  0.3,  0.1,  0.7,  0.6,  0.6,  0.8,  0.6],
       [ 0.2,  0.3,  0.3,  0. ,  0.4,  0.6,  0.6,  0.6,  0.7,  0.8],
       [ 0.5,  0.3,  0.1,  0.4,  0. ,  0.7,  0.6,  0.6,  0.8,  0.6],
       [ 0.5,  0.7,  0.7,  0.6,  0.7,  0. ,  0.7,  0.6,  0.7,  0.8],
       [ 0.6,  0.4,  0.6,  0.6,  0.6,  0.7,  0. ,  0.1,  0.2,  0.4],
       [ 0.5,  0.4,  0.6,  0.6,  0.6,  0.6,  0.1,  0. ,  0.3,  0.4],
       [ 0.6,  0.6,  0.8,  0.7,  0.8,  0.7,  0.2,  0.3,  0. ,  0.6],
       [ 0.8,  0.5,  0.6,  0.8,  0.6,  0.8,  0.4,  0.4,  0.6,  0. ]])

You read this distance matrix like this: the distance between points 1 and 5 is D[0, 4]. You can also see that the distance between each point and itself is 0, for example D[6, 6] == 0.

We argsort each row of the distance matrix to get for each point a list of which points are closest:

closest = np.argsort(D, axis=1)
print(closest)

Output:

[[0 3 1 2 5 7 4 6 8 9]
 [1 2 4 3 7 0 6 9 8 5]
 [2 4 1 3 0 7 6 9 5 8]
 [3 0 2 1 4 7 6 5 8 9]
 [4 2 1 3 0 7 9 6 5 8]
 [5 0 7 3 6 2 8 4 1 9]
 [6 7 8 9 1 0 3 2 4 5]
 [7 6 8 9 1 0 3 2 4 5]
 [8 6 7 9 1 0 3 5 2 4]
 [9 6 7 1 8 4 2 0 3 5]]

Again, we see that each point is closest to itself. So, disregarding that, we can now select the k closest points:

k = 3  # For each point, find the 3 closest points
print(closest[:, 1:k+1])

Output:

[[3 1 2]
 [2 4 3]
 [4 1 3]
 [0 2 1]
 [2 1 3]
 [0 7 3]
 [7 8 9]
 [6 8 9]
 [6 7 9]
 [6 7 1]]

For example, counting points from 1, we see that for point 4 (row index 3), the k=3 closest points are 1, 3 and 2 (indices 0, 2 and 1).
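
The steps above can be gathered into one helper. This is only a minimal sketch under a few assumptions: the function name knn_brute_force is hypothetical, k must be smaller than the number of points, and it swaps the full argsort for np.argpartition so that only the k smallest entries of each row get sorted. Instead of skipping the first column, it masks the diagonal with infinity so a point is never returned as its own neighbor.

```python
import numpy as np
from scipy.spatial import distance


def knn_brute_force(points, k):
    """Return an (n, k) array of neighbor indices, nearest first.

    Hypothetical helper: brute force over all pairwise distances.
    Assumes 1 <= k < n, where n is the number of points.
    """
    # Full (n, n) matrix of Euclidean distances
    D = distance.squareform(distance.pdist(points))
    # A point should not count as its own neighbor
    np.fill_diagonal(D, np.inf)
    # Indices of the k smallest distances per row, in no particular order
    part = np.argpartition(D, k - 1, axis=1)[:, :k]
    # Sort just those k candidates by their actual distance
    order = np.argsort(np.take_along_axis(D, part, axis=1), axis=1)
    return np.take_along_axis(part, order, axis=1)


points = np.random.rand(10, 3)
print(knn_brute_force(points, 3))
```

Memory still grows with n squared because the whole distance matrix is built, so this only helps with the sorting cost, not with very large point clouds.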

4
votes

@marijn-van-vliet's solution is sufficient in most scenarios. However, it is a brute-force approach that computes all pairwise distances, so if the point cloud is relatively large or you have computational/time constraints, you might want to build a KD-tree for fast retrieval of the k nearest neighbors of a point.

In Python, the sklearn library provides an easy-to-use implementation: sklearn.neighbors.KDTree

from sklearn.neighbors import KDTree
tree = KDTree(pcloud)  # pcloud: the (n, 3) array of points

# For finding K neighbors of P1 with shape (1, 3)
# Note: query returns the distances first, then the indices
distances, indices = tree.query(P1, k=K)

(Also see the following answer in another post for more detailed usage and output: https://stackoverflow.com/a/48127117/4406572)

Many other libraries also implement KD-tree based KNN retrieval, including Open3D (FLANN based) and scipy.
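
As an illustration of the scipy route, here is a minimal sketch using scipy.spatial.cKDTree to get the neighbors of every point in the cloud at once. The helper name knn_kdtree and the random test data are assumptions made for the example. It asks for k + 1 neighbors and drops the first column, which assumes the points are distinct so that each point's nearest hit is itself.

```python
import numpy as np
from scipy.spatial import cKDTree


def knn_kdtree(points, k):
    """Return (distances, indices), each of shape (n, k), nearest first.

    Hypothetical helper; assumes no duplicate points, so the closest
    match of every query point is the point itself.
    """
    tree = cKDTree(points)
    # Query k + 1 neighbors because the first hit is the point itself
    distances, indices = tree.query(points, k=k + 1)
    return distances[:, 1:], indices[:, 1:]


points = np.random.rand(1000, 3)
distances, indices = knn_kdtree(points, 3)
print(indices[:5])  # neighbor indices of the first five points
```

If the cloud can contain exact duplicates, the self match is not guaranteed to come first, so it would be safer to filter out each row's own index explicitly.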