I'm currently implementing a KD Tree and nearest neighbour search, following the algorithm described here: http://ldots.org/kdtree/
I have come across a couple of different ways to implement a KD Tree, one in which points are stored in internal nodes, and one in which they are only stored in leaf nodes. As I have a very simple use case (all I need to do is construct the tree once, it does not need to be modified), I went for the leaf-only approach is it seemed to be simpler to implement. I have successfully implemented everything, the tree is always constructed successfully and in most cases the nearest neighbour search returns the correct value. However, I have some issues that with some data sets and search points, the algorithm returns an incorrect value. Consider the points:
[[6, 1], [5, 5], [9, 6], [3, 81], [4, 9], [4, 0], [7, 9], [2, 9], [6, 74]]
Which constructs a tree looking something like this (excuse my bad diagramming):
Where the square leaf nodes are those that contain the points, and the circular nodes contain the median value for splitting the list at that depth. When calling my nearest neighbour search on this data set, and looking for the nearest neighbour to [6, 74]
, the algorithm returns [7, 9]
. Although this follows the algorithm correctly, it is not in fact the closest point to [6, 74]
. The closest point would actually be [3, 81]
which is at a distance of 7.6, [7, 9]
is at a distance of 65.
Here are the points plotted, for visualization, the red point being the one I am attempting to find the nearest neighbour for:
If it helps, my search method is as follows:
private LeafNode search(int depth, Point point, KDNode node) {
if(node instanceof LeafNode)
return (LeafNode)node;
else {
MedianNode medianNode = (MedianNode) node;
double meanValue = medianNode.getValue();
double comparisonValue = 0;
if(valueEven(depth)) {
comparisonValue = point.getX();
}
else {
comparisonValue = point.getY();
}
KDNode nextNode;
if(comparisonValue < meanValue) {
if (node.getLeft() != null)
nextNode = node.getLeft();
else
nextNode = node.getRight();
}
else {
if (node.getRight() != null)
nextNode = node.getRight();
else
nextNode = node.getLeft();
}
return search(depth + 1, point, nextNode);
}
}
So my questions are:
Is this what to expect from nearest neighbour search in a KD Tree, or should I be getting the closest point to the point I am searching for (as this is my only reason for using the tree)?
Is this an issue only with this form of KD Tree, should I change it to store points in inner nodes to solve this?