Efficient algorithm for finding closest point in a finite set to another point

Question

I have a list L of ~30k locations (written as longitude/latitude pairs), and a list E of ~1 million events (with locations written as longitude/latitude pairs), each of which occurs at a point in L. I want to tag each event in E with its corresponding location in L. But the coordinates in L and E are rounded differently (E to five decimal places, L to thirteen), so ostensibly identical coordinates can actually differ by ~10^-5 degrees, or ~1 meter. (Points in L are separated by at least ~10 m.)

I thus need the nearest point in L to every point of E; the obvious O(|L||E|) brute-force algorithm is too slow. L is small enough compared to E that algorithms that preprocess L and amortize the preprocessing time over E are fine. Is this a well-studied problem? The links that I can find are for related but distinct problems, like finding the minimal distance between a pair of points in one set.

Possibly relevant: Voronoi diagrams, though I can't see how preprocessing L into a Voronoi diagram would save me computational time.

Ok.. I've skipped most of your post because there's just too much useless information. Basically, you have a list 'L' of 30k XY points (call that lat/long if you want), and you have a list 'E' of a million XY points (ditto), and you want to know for each point in E which one in 'L' it's closest to. Is that it ? Please confirm. — AlexG
Wait - if the rounding errors cause a deviation of up to 1m, but the minimal distance between points is 10m, then you can just round both to the same precision and compare equality, right? — le_m
@AlexG: can confirm (and I've redacted some of the useless information). — Connor Harris
@ConnorHarris thanks. Looks like a Quadtree should do the job. fr.wikipedia.org/wiki/Quadtree There are probably a bunch of open source libraries that implement it, but I think you could do it yourself quite easily. Sort all your reference points by x. Sort the first half and the second half by y. You've just split your points into 4 regions. Just do that recursively while building a tree structure until each region contains only 1 point. After that, you can search for the closest point by comparing the ranges and traversing the tree. — AlexG

gue gue · Accepted Answer · 2017-04-12T15:33:57

Yes, you are correct. First you can construct the Voronoi diagram of your set of locations L in O(|L| log |L|) time using Fortune's sweep line approach. There are various implementations out there that can be used; Triangle would be among the most common ones.

Now you have a partitioning of the plane of O(|L|) size. To allow O(log |L|) nearest neighbour queries you need a search structure on top of the Voronoi diagram. A common approach is to use the Dobkin-Kirkpatrick Hierarchy, details can be found in various lecture notes. This method supports O(log |L|) queries and also requires only O(|L|) size. (Also mentioned in this post.)

Then the |E| queries can be accomplished in O(|E| log |L|) time.
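To put rough numbers on that for the sizes in the question, here is a back-of-the-envelope sketch. It only counts elementary steps and ignores constant factors, so treat the ratio as an order-of-magnitude estimate; the variable names are mine, not from any library.

```python
import math

n_locations = 30_000      # |L| from the question
n_events = 1_000_000      # |E| from the question

# Brute force: one distance evaluation per (event, location) pair.
brute_force_steps = n_events * n_locations

# Logarithmic queries: about log2(|L|) steps per event (rounded up).
steps_per_query = math.ceil(math.log2(n_locations))
tree_steps = n_events * steps_per_query

print(brute_force_steps, tree_steps, brute_force_steps // tree_steps)
```

So the preprocessing on L pays for itself very quickly when |E| is this much larger than |L|.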

A different approach would be to use k-d trees. From an implementation point of view they might be less work, and as far as I know they provide the same complexities (for k-d trees the O(log |L|) query time is an average-case bound rather than a worst-case guarantee). A quick search revealed these two implementations that are maybe worth testing: C++, Java.
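If you would rather see the idea than pull in a library, here is a minimal 2-d tree sketch in pure Python. It is not taken from either of the implementations mentioned above. Assumptions on my part: points are (longitude, latitude) tuples, plain Euclidean distance in degrees is good enough to pick the right match (reasonable here, since the rounding error is about ten times smaller than the spacing in L), and the names `build_kdtree`, `nearest` and `dist2` are made up for this example. The build re-sorts at every level, which is simple but costs O(|L| log^2 |L|) instead of O(|L| log |L|).

```python
import math
from typing import List, Optional, Tuple

Point = Tuple[float, float]  # (longitude, latitude) in degrees


class Node:
    """One tree node: index of the splitting point, split axis, children."""
    __slots__ = ("index", "axis", "left", "right")

    def __init__(self, index: int, axis: int,
                 left: Optional["Node"], right: Optional["Node"]) -> None:
        self.index, self.axis, self.left, self.right = index, axis, left, right


def dist2(a: Point, b: Point) -> float:
    """Squared Euclidean distance, treating degrees as planar units."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def build_kdtree(points: List[Point], indices: Optional[List[int]] = None,
                 depth: int = 0) -> Optional[Node]:
    """Split on the median, alternating between x (0) and y (1)."""
    if indices is None:
        indices = list(range(len(points)))
    if not indices:
        return None
    axis = depth % 2
    indices = sorted(indices, key=lambda i: points[i][axis])
    mid = len(indices) // 2
    return Node(indices[mid], axis,
                build_kdtree(points, indices[:mid], depth + 1),
                build_kdtree(points, indices[mid + 1:], depth + 1))


def nearest(root: Optional[Node], points: List[Point], query: Point) -> int:
    """Return the index in `points` of the point closest to `query`."""
    best_index, best_d2 = -1, math.inf
    # Each stack entry carries a lower bound on the squared distance from
    # the query to anything in that subtree, so it can be pruned on pop.
    stack = [(root, 0.0)]
    while stack:
        node, bound = stack.pop()
        if node is None or bound >= best_d2:
            continue
        p = points[node.index]
        d2 = dist2(p, query)
        if d2 < best_d2:
            best_index, best_d2 = node.index, d2
        diff = query[node.axis] - p[node.axis]
        near, far = ((node.left, node.right) if diff < 0
                     else (node.right, node.left))
        stack.append((far, diff * diff))  # visited later, only if still useful
        stack.append((near, bound))       # visited first
    return best_index


# Made-up coordinates, purely for illustration.
locations = [(-71.05888, 42.36008), (-71.05700, 42.36100), (-71.06000, 42.35900)]
tree = build_kdtree(locations)
event = (-71.05889, 42.36009)
print(nearest(tree, locations, event))
```

You build the tree once from L and then call `nearest` for every event in E. Since you know a correct match is within about a meter, you could also check the returned distance against a small threshold to flag events that do not actually sit on any location.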
