One viable option for small k
would be to use your O(nlogn)
algorithm repeatedly on subsets of the original set of points, removing points as you go. More specifically, keep a set of points that formed a minimal pair. Every time you query for the next closest pair, we'll query for the next closest pair within these points and between each point and the rest of the original points, and take the closer of the two pairs.
To start, remove all but one of these points from the original set, and calculate the closest pair. Repeat this for each of the points in this "min set", and keep the overall closest pair. This will take O(j*nlogn)
time to compute, when the "min set" is of size j
.
Then, query for the next closest pair within this "min set" via a find-min (O(1)
time) on a min-max heap of size k
that we'll build as we add points to our "min set". Every time we add points to our "min set", we'll calculate the distance between each point in "min set" (size j
) and these (up to) 2 new points, insert these into our min-max heap, and then delete as many elements as necessary to make the heap size k
again (at most 2j
elements) in O(jlogk)
time.
Now, we take the closer pair of these two (deleting from the heap if relevant- O(logk)
time), add the points to our "min set" and update the min-max heap as described, and repeat for the remaining k-j
closest pairs. Overall, this will take O((k^2)nlogn + (k^2)logk + klogk) = O((k^2)(nlogn + logk)) = O((k^2)nlogn)
time.
closest points
how many dimensions? – greybeard