17
votes

I have a list L of points (x, y) and the usual euclidean distance measure

enter image description here

How do I find the maximum distance two points have in this list? Or, more formally: How do I find

enter image description here

The trivial approach

The simplest way to solve this problem seems to be to try everything:

def find_max_dist(L):
    max_dist = d(L[0], L[1])
    for i in range(0, len(L)-1):
        for j in range(i+1, len(L):
            max_dist = max(d(L[i], L[j]), max_dist)
    return max_dist

To make computation faster, I can use the squared distance in the loops and return the root at the end.

This approach has a run time complexity of O(n^2) where n is the lenght of the list L. (And a constant space complexity).

Convex Hull

Obviously, there cannot be any algorithm that is faster than O(n), as one has to look at least once at every element in the list.

The highest distance will be between elements of the convex hull. But it is easy to prove that the computation of the convex hull is at least in O(n log n) and Graham's scan seems to do it. But after finding the complex hull I still have to get the maximum distance. So I would end up with

def find_max_dist_graham_triv(L):
    H = Graham_scan(L)
    return find_max_dist(L)

Now, this is a point where I am not sure. I think one can improve this like so:

def find_max_dist_graham_c1(L):
    H = Graham_scan(L)  # Get ordered list of convex hull points
    max_dist = d(L[0], L[1])
    for i in range(0, len(L)-1):
        loop_max_dist = 0
        for j in range(i+1, len(L):
            curr_dist = d(L[i], L[j])
            if curr_dist < loop_max_dist:
                break
            loop_max_dist = curr_dist
            max_dist = max(loop_max_dist, max_dist)

    return max_dist

The idea is that when you take one point of a convex hull and start from its neighboring point, the diagonals keep increasing, get to a maximum and then keep decreasing. I'm not sure if this is true, though.

Intuitively, I would continue to improve the algorithm: As soon as the first inner loop finishes, we found a "longest diagonal" of that loop. This diagonal separates all other hull points in two disjunct sets. Every longer diagonal has to consist of points from both of those sets (correct?):

def find_max_dist_graham_c1(L):
    H = Graham_scan(L)  # Get ordered list of convex hull points
    max_dist = d(L[0], L[1])  # Get a fist idea what the longest distance might be

    i = 0
    p1 = L[i]  # Grab a random point
    loop_max_dist = 0
    for j in range(1, len(L):
        curr_dist = d(L[i], L[j])
        if curr_dist < loop_max_dist:
            break
        loop_max_dist = curr_dist
        max_dist = max(loop_max_dist, max_dist)
    # Every diagonal that is longer than the best one found with L[0]
    # has to have points in both of the following two sets (correct?):
    # [1...j] and [j+1...len(L)]

    # Try to find a neighboring diagonal that is longer.
    change = True
    while change:
        change = False
        if d(L[i-1], L[j+1]) > max_dist:
            max_dist = d(L[i-1], L[j+1])
            i -= 1
            j += 1
            change = True
        elif d(L[i+1], L[j-1]) > max_dist:
            max_dist = d(L[i+1], L[j-1])
            i += 1
            j -= 1
            change = True
    return max_dist

Every point P on the convex hull has a point Q on the convex hull such that PQ is the longest diagonal including P. But is then P also the "endpoint" for the longest diagonal of Q?

I am really not sure if this algorithm is correct. It would be in O(n log n).

I guess the problem is well analyzed, so could somebody please leave some notes for that?

Although I had a lot of sub-questions the main question is:

What is an efficient way to find the maximum distance of points in a list of points?

2
you want to find the 2 points that are the most far from each other ?CMPS
@Amir: No. I want to get the distance of the 2 points that are the most far from each other.Martin Thoma
how can a distance between 2 fixed point vary ?CMPS
I think you are on the right track. You can improve the second part by using binary search. For every point on convex hull, search for the point that has the max distance from it. That is dist of next point and previous point are smaller than current point.Abhishek Bansal
@Amir: The distance between two fixed points does not vary. Why do you ask?Martin Thoma

2 Answers

13
votes

You should look up on Rotating calipers(http://en.wikipedia.org/wiki/Rotating_calipers) - they're widely used for that kind of problems. Also, your assumption is wrong. For a fixed point p on the convex polygon: the diagonal can first increase, then decrease, then increase, and then decrease again. At least, I've got a case where this happens.

A heuristic approach also: select a point x. Find the furthest point y from it. Find the furthest point z from y. d(z,y) is a good estimation.

The image that illustrates the diagonals:

enter image description here

1->2: increasing; 2->3 decreasing; 3->4 increasing; 4->5 decreasing. The figure may not be precise, move the points that 3 and 4 point to a bit further away from p (on the same line).

2
votes

Assuming you have a uniform distribution of the points you can do the following thing:

Find max_x and min_x being the maximum and minimum X coordinates - (O(n)). Those value should help you pick a constant k as the "best" one for the current set of points. Different value of k will influence only the complexity of the algorithm.

Consider a new data structure which is matrix like and is a vector of vectors or vector of linked lists, lets name it structure where structure[i] is the corresponding vector/linked lists (as described above). Populate this data structure as follows: structure[i] should contain points that have their x coordinate being in the range of [max_x+ik,max_x+(i+1)k] this will take another O(n) time and O(n) extra space. Now you sort every entry of structure[i] by y coordinate. Having this done it is enough to compute the distances (brute force) between the following set of points: structure[0], structure[structure.length()-1], the extremes (entry at first and last index) of every other structure[i].

Basically this is almost the same as doing the convex hull and starting to compute the distances of the points that are on the hull, the difference is that picking the right k might either make it faster or slower. Having worst case complexity O(n^2) and best case complexity O(nLg(n)). Where k will influence the trade of either sorting bigger groups of points or having more points to compute the distances between.