3
votes

Given a set of 2D points, I want to define a polygon that covers a certain percentage of all points while keeping it's area as small as possible.

I started to think of an algorithm like this on:

  1. Start with 2 points that are closest to the mean
  2. Loop over all others points
  3. Add the point that results in the smallest covering area
  4. Repeat 2 & 3 until the desired percentage is reached

I think this would work but it sounds very inefficient. My best bet on calculating the area would be to use an existing implementation of the convex hull. My problem also sounds like a classic mathematical problem and I suspect there to be scipy function that probably does it out of the box.

What's an efficient and simple way of solving or approximating this problem?

EDIT: The resulting polygon should be convex

1
Does your polygon need to be convex?Reblochon Masque
No, it does notpietz
This does not sound like "a classic mathematical problem" to me but does sound very difficult. If, for example, you have 10 points and want to cover 7 of them, in effect you are looking at all combinations of 7 points out of those ten, then finding the smallest-area polygon that includes only those points, Then, out of all those minimal polygons, choose the one with smallest area. Looks hard to me. Also, your suggested algorithm is a "greedy" one, and such algorithms sometimes work but often do not find the optimal result.Rory Daulton
@RoryDaulton You are proposing an exhaustive search. Algorithm building directly the optimal polygon probably exist, at least for the second part computing the polygon. The procedure pietz describe is one of them, however, I cannot convince myself that it build the optimal polygon.Mathieu
The area covering the set of point can be as small as you want. Draw a tree that covers all the points and make the branches very thin, for example. This is obviously not what you want so you need to redefine your constraints.Matt Timmermans

1 Answers

0
votes
  1. find the center of all the points.
  2. Shift all the points to use that as the origin.
  3. swap to polar coordinates.
  4. sort points in order of increasing radius from polar coords.
  5. Select the X% of them closest to the origin.
  6. Generate the convex hull around those points.

Not sure if it is optimal, but it seems computable and should at least be a good approximation.