2
votes

Suppose that you have 100000 points on the curve y = x^2. You want to find the convex hull of these points. All the coordinates are floating numbers.

In my graham scan implementation the only place where I operate on floating numbers is when I initially sort all the points by their coordinates and then I have one function that determines whether three points make a left or a right turn.

Points:

struct point {
   double x; 
   double y;
};

Sorting comparator:

inline bool operator() (const point &p1, const point &p2) {
    return (p1.x < p2.x) || (p1.x == p2.x && p1.y > p2.y);
}

Left/Right turn:

inline int ccw(point *p1, point *p2, point *p3) {
  double left  = (p1->x - p3->x)*(p2->y - p3->y);
  double right = (p1->y - p3->y)*(p2->x - p3->x);
  double res = left - right;
  return res > 0;
}

My program says that out of 100 000 points only 68894 are part of the convex hull. But since they are on the curve, all of them should be part of the convex hull.

For your eye it doesn't make any difference. See figure below. The red points are part of the convex hull.

image

But if you look close enough, and zoom into the points, you'll see that some of them are blue, so they are not included in the convex hull.

image

Now my initial assumption is that floating point errors are causing this problem.

I guess I could use an external library that has an arbitrary precision for floating point numbers, but I'm more interested in the simple data types that we have for example in C++.

How could I increase the accuracy? I've read about epsilons, but how would using an epsilon help here? I would still assume some points that are close to each other to be the same, so I won't get an accuracy closer to 100%.

What's the best way to approach this problem?

2
did you try long double?mch
Is it possible that your convex hull algorithm is correct, but that roundoff occurred when you initially evaluated y = x^2?ajclinto
First, no finite precision will let you represent real numbers. Second, the points you plot are approximations. Third, true constructive reals (infinite precision) cannot be compared the way you want to compare them. Fifth, why do you care? (what purpose do you have for the hull that thus matters)Yakk - Adam Nevraumont
mch, yes but unfortunately I don't see a difference. ajclinto, I didn't think about it, that is probably the case as well. Yakk, my professor gave us an assignment to implement different convex hull algorithms, I've implemented all of them but have no clue how to approach the floating point errors.jsguy
This is not a generic answer, but you can use two integers to store a rational number exactly, and do all the math in rational space. It won't work for, let's say, sqrt or sin, but will be exact for any polynomial with rational coefficients as long as the independent variable is rational as well.triple_r

2 Answers

0
votes

You are correct that all of the points should be on the convex hull if you are indeed using points of the form (x, x^2). However, three points may be collinear. If you're shifting them or doing anything else weird, this goes out the window.

If you get to choose your 100000 points, I would suggest using the integers in [-50000,49999]. Your ccw function will compute left and right to be integers smaller in absolute value than 2.5e14 < 2^53, so no roundoff will occur.

Your coordinate-based sort will work correctly regardless of the input.

For general inputs, the following ccw predicate is buggy:

inline int ccw(point *p1, point *p2, point *p3) {
  double left  = (p1->x - p3->x)*(p2->y - p3->y);
  double right = (p1->y - p3->y)*(p2->x - p3->x);
  double res = left - right;
  return res > 0;
}

There can be roundoff both in the subtractions and in the multiplications. If all of your points lie in a H*W bounding box, the x-coordinate differences will be computed with an absolute error around H*eps/2 and the y-coordinate differences will be computed with an absolute error around W*eps/2. The products will therefore be computed with an absolute error around H*W*eps/2. If fabs(left - right) < 3*H*W*eps/2, you need to evaluate left and right more precisely. eps here is 2-52.

I'd probably recommend just using MPFR if the double comparison doesn't tell you anything. You can do it without, however. The trick from Kahan summation will get you the low bits from the differences, and the 227+1 trick can help you compute the products exactly.

0
votes

Very often with floating-point math you need to introduce the concept of "tolerance," sometimes denoted as epsilon. In your case, you could make your ccw() function three-valued: true/false/indeterminate. Then when you're trying to discover if a new point can be part of the convex hull, you ask "is it ccw=true or indeterminate", and either way you accept the point. The indeterminate result would occur when the slope is too close to a straight line to be decided.