Python KD Tree Nearest Neighbour where distance is greater than zero

Question

I am trying to implement a nearest-neighbour search for Lat and Lon data. Here is Data.txt:

61.3000183105 -21.2500038147 0
62.299987793 -23.750005722 1
66.3000488281 -28.7500038147 2
40.8000183105 -18.250005722 3
71.8000183105 -35.7500038147 3
39.3000183105 -19.7500019073 4
39.8000183105 -20.7500038147 5
41.3000183105 -20.7500038147 6

The problem is that when I run the nearest-neighbour search for each Lat/Lon pair in the data set, each point finds itself. E.g. the nearest neighbour of (-21.2500038147, 61.3000183105) is (-21.2500038147, 61.3000183105), and the resulting distance is 0.0. I am trying to avoid this, but with no luck. I tried adding an if not np.array_equal(...) check, but it did not help.

Below is my Python code:

import numpy as np
from scipy import spatial

# Data.txt holds three space-separated columns per row
Data = np.loadtxt('Data.txt', delimiter=" ")
Lon = Data[:, 0]
Lat = Data[:, 1]
Day = Data[:, 2]

# Build the tree from the coordinate pairs
tree = spatial.KDTree(list(zip(Lon, Lat)))

print("Lon  :", len(Lon))
print("Tree :", len(tree.data))

# Copy the points back out of the tree
nja = [np.array([p[0], p[1]]) for p in tree.data]

for pts in nja:
    # Attempted self-match filter. It compares the full point list with
    # tree.data rather than the query point with its match, so it cannot
    # filter out the self-match.
    if not np.array_equal(nja, tree.data):
        nearest = tree.query(pts, k=1, distance_upper_bound=9)
        print(nearest)

gboffi · Accepted Answer · 2016-11-15T07:07:14

For each point P[i] in your data set, you're asking "Which point in my data set is nearest to P[i]?" and you get the answer "It is P[i]".

If you ask a different question, "Which are the TWO points nearest to P[i]?", i.e. tree.query(pts, k=2) (the only change to your code being k=1 → k=2), you will get P[i] and also a P[j], the second-nearest point, which is the result you want.
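
As a minimal sketch of the k=2 idea, the snippet below builds the tree from the eight coordinate pairs listed in the question (hard-coded here so it runs without Data.txt) and keeps only the second column of the query result. The variable names points, nn_dist and nn_idx are illustrative, not taken from the original code.

```python
import numpy as np
from scipy.spatial import KDTree

# First two columns of the Data.txt sample shown in the question
points = np.array([
    [61.3000183105, -21.2500038147],
    [62.299987793, -23.750005722],
    [66.3000488281, -28.7500038147],
    [40.8000183105, -18.250005722],
    [71.8000183105, -35.7500038147],
    [39.3000183105, -19.7500019073],
    [39.8000183105, -20.7500038147],
    [41.3000183105, -20.7500038147],
])

tree = KDTree(points)

# With k=2 the result has two columns per query point, sorted by distance:
# column 0 is the point itself (distance 0.0), column 1 is the nearest other point.
dist, idx = tree.query(points, k=2)
nn_dist = dist[:, 1]
nn_idx = idx[:, 1]

for i, (d, j) in enumerate(zip(nn_dist, nn_idx)):
    print(i, "->", j, "distance", d)
```

This relies on every point being unique. If the data contained exact duplicates, column 1 could hold a duplicate at distance 0.0, and a larger k plus a filter on distance > 0 would be needed.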

Side note:

I'd recommend that you project your data before building the tree, because across your range of latitudes the ground distance covered by one degree of longitude varies considerably.
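
One possible way to act on this, sketched under assumptions: instead of a map projection, the points are converted to 3D Cartesian coordinates on a unit sphere, and the straight-line (chord) distances returned by the tree are converted back to great-circle distances. This is a swapped-in technique, not necessarily what the side note has in mind. The snippet assumes the first data column is latitude and the second is longitude, which the question does not settle, and it uses an assumed mean Earth radius of 6371 km. The helper to_unit_sphere is hypothetical.

```python
import numpy as np
from scipy.spatial import KDTree

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def to_unit_sphere(lat_deg, lon_deg):
    """Convert latitude/longitude in degrees to x, y, z on a unit sphere."""
    lat = np.radians(lat_deg)
    lon = np.radians(lon_deg)
    return np.column_stack((
        np.cos(lat) * np.cos(lon),
        np.cos(lat) * np.sin(lon),
        np.sin(lat),
    ))


# Assumption: column 0 of Data.txt is latitude, column 1 is longitude
lat = np.array([61.3000183105, 62.299987793, 66.3000488281, 40.8000183105,
                71.8000183105, 39.3000183105, 39.8000183105, 41.3000183105])
lon = np.array([-21.2500038147, -23.750005722, -28.7500038147, -18.250005722,
                -35.7500038147, -19.7500019073, -20.7500038147, -20.7500038147])

xyz = to_unit_sphere(lat, lon)
tree = KDTree(xyz)

# k=2 again: column 0 is the point itself, column 1 the nearest other point
chord, idx = tree.query(xyz, k=2)

# A chord of length c on the unit sphere spans a central angle of 2*arcsin(c/2)
angle = 2.0 * np.arcsin(np.clip(chord[:, 1] / 2.0, 0.0, 1.0))
nn_km = EARTH_RADIUS_KM * angle
nn_idx = idx[:, 1]

for i, (j, d) in enumerate(zip(nn_idx, nn_km)):
    print(i, "->", j, "approx. km:", d)
```

Because chord length grows monotonically with great-circle distance, the nearest neighbour found in 3D is also the nearest along the sphere's surface, so the tree needs no custom metric.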
