I'll try to provide a more-or-less real example of legitimate, meaningful and useful testing for float equality.
#include <stdio.h>
#include <math.h>

/* let's try to numerically solve a simple equation F(x)=0 */
double F(double x) {
    return 2*cos(x) - pow(1.2, x);
}

/* I'll use a well-known, simple&slow but extremely smart method to do this */
double bisection(double range_start, double range_end) {
    double a = range_start;
    double d = range_end - range_start;
    int counter = 0;
    while(a != a+d) // <-- WHOA!!
    {
        d /= 2.0;
        if(F(a)*F(a+d) > 0) /* test for same sign */
            a = a+d;
        ++counter;
    }
    printf("%d iterations done\n", counter);
    return a;
}

int main() {
    /* we must be sure that the root can be found in [0.0, 2.0] */
    printf("F(0.0)=%.17f, F(2.0)=%.17f\n", F(0.0), F(2.0));
    double x = bisection(0.0, 2.0);
    printf("the root is near %.17f, F(%.17f)=%.17f\n", x, x, F(x));
}
I'd rather not explain the bisection method itself, but emphasize the stopping condition. It has exactly the discussed form: (a == a+d), where both sides are floats: a is our current approximation of the equation's root, and d is our current precision. Given the precondition of the algorithm, namely that there must be a root between range_start and range_end, we guarantee on every iteration that the root stays between a and a+d, while d is halved on every step, shrinking the bounds.
And then, after a number of iterations, d becomes so small that, during addition with a, it effectively gets rounded to zero! That is, a+d turns out to be closer to a than to any other float; and so the FPU rounds it to the closest value: to a itself. This is easy to illustrate on a hypothetical calculating machine; let it have a 4-digit decimal mantissa and some large exponent range. What result should the machine give for 2.131e+02 + 7.000e-3? The exact answer is 213.107, but our machine can't represent such a number; it has to round it. And 213.107 is much closer to 213.1 than to 213.2, so the rounded result becomes 2.131e+02: the little summand vanished, rounded away to zero. Exactly the same is guaranteed to happen at some iteration of our algorithm, and at that point we can't continue anymore. We have found the root to the maximum possible precision.
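To see the same vanishing act with actual IEEE 754 doubles, here is a minimal sketch (the starting values are arbitrary illustrative choices): it halves d until adding it to a no longer changes a, which is exactly the event that terminates the loop above.

#include <stdio.h>
#include <float.h>

int main(void) {
    double a = 2.131e+02;   /* some current approximation of a root */
    double d = 1.0;         /* some current precision */

    /* keep halving d until adding it to a no longer changes a */
    while (a != a + d)
        d /= 2.0;

    printf("final d = %g\n", d);                       /* the increment that got rounded away */
    printf("a + d == a ? %d\n", a + d == a);           /* prints 1 */
    printf("ULP scale of a ~ %g\n", DBL_EPSILON * a);  /* comparable magnitude */
    return 0;
}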
The edifying conclusion is, apparently, that floats are tricky. They look so much like real numbers that every programmer is tempted to think of them as real numbers. But they are not. They have their own behavior, slightly reminiscent of the reals', but not quite the same. You need to be very careful with them, especially when comparing for equality.
Update
Revisiting the answer after a while, I have also noticed an interesting fact: in the algorithm above one cannot actually use "some small number" in the stopping condition. For any choice of the number, there will be inputs for which that choice is too large, causing loss of precision, and there will be inputs for which it is too small, causing excess iterations or even an infinite loop. Detailed discussion follows.
You might already know that calculus has no notion of a "small number": for any real number, you can easily find infinitely many even smaller ones. The problem is that one of those "even smaller" ones might be what we actually seek; it might be a root of our equation. Even worse, for different equations there may be distinct roots (e.g. 2.51e-8 and 1.38e-8), both of which would get approximated by the same number if our stopping condition looked like d < 1e-6. Whichever "small number" you choose, many roots which would have been found correctly to the maximum precision with the a == a+d stopping condition will get spoiled by the "epsilon" being too large.
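A tiny sketch of that claim, using the two example roots and the 1e-6 cutoff from the paragraph above: both roots lie within a single bracket of width 1e-6, so a bisection that stops at that width cannot tell them apart.

#include <stdio.h>
#include <math.h>

int main(void) {
    double root1 = 2.51e-8;   /* a root of one hypothetical equation */
    double root2 = 1.38e-8;   /* a root of another one */
    double eps   = 1e-6;      /* a "small number" used as the stopping width */

    /* both roots fit inside one bracket of width eps, so a bisection that
       stops at width eps reports essentially the same answer for both */
    printf("|root1 - root2| = %g\n", fabs(root1 - root2));
    printf("indistinguishable at width eps? %d\n", fabs(root1 - root2) < eps);
    return 0;
}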
It's true, however, that in floating point numbers the exponent has a limited range, so you can actually find the smallest nonzero positive FP number (the denormal of about 1.4e-45 for IEEE 754 single precision). But it's useless! while (d >= 1e-45) {...} will loop forever for a single-precision d that stays positive and nonzero, because the condition can only become false once d has been rounded all the way down to zero.
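A small sketch of why such an epsilon buys nothing; nextafterf(0.0f, 1.0f) yields the smallest positive single-precision denormal (C11 also names it FLT_TRUE_MIN):

#include <stdio.h>
#include <math.h>

int main(void) {
    float tiny = nextafterf(0.0f, 1.0f);   /* smallest positive single-precision denormal */

    printf("smallest positive float: %g\n", tiny);   /* about 1.4e-45 */
    printf("tiny >= 1e-45 ? %d\n", tiny >= 1e-45);   /* 1: even the tiniest nonzero float passes */
    /* hence "while (d >= 1e-45)" keeps looping for every positive nonzero d;
       it can only stop after d has underflowed to exactly 0.0f */
    return 0;
}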
Setting aside those pathological edge cases, any choice of the "small number" in a d < eps stopping condition will be too small for many equations. In equations where the root's exponent is high enough, the result of subtracting two mantissas that differ only in the least significant digit will easily exceed our "epsilon". For example, with 6-digit mantissas, 7.00023e+8 - 7.00022e+8 = 0.00001e+8 = 1.00000e+3 = 1000, meaning that the smallest possible difference between numbers with exponent +8 and a 6-digit mantissa is... 1000! It will never get below, say, 1e-4. For these numbers with a (relatively) high exponent we simply do not have enough precision to ever see a difference of 1e-4.
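The same effect with binary single-precision floats, using the example magnitude from above: near 7e+8, consecutive floats are 64 apart, so a width test against 1e-4 can never succeed at that scale.

#include <stdio.h>
#include <math.h>

int main(void) {
    float a = 7.00023e+8f;
    float next = nextafterf(a, INFINITY);   /* the very next representable float */
    float gap = next - a;                   /* spacing (ULP) at this magnitude */

    printf("gap between neighbours near 7e+8: %g\n", gap);   /* 64 */
    printf("gap < 1e-4 ? %d\n", gap < 1e-4f);                 /* 0: never reachable */
    return 0;
}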
My implementation above took this last problem into account too: you can see that d is halved on each step, instead of being recalculated as the difference of the (possibly huge in exponent) a and b. So if we change the stopping condition to d < eps, the algorithm won't get stuck in an infinite loop for huge roots (it very well could with (b-a) < eps), but it will still perform needless iterations while shrinking d down below the precision of a.
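To make that last point concrete, here is a hedged sketch comparing the two stopping conditions side by side; bisection_eps and bisection_exact are my own illustrative names, and the 1e-300 epsilon is deliberately chosen far below the precision of a. Both variants return essentially the same root, but the eps-based one reports a much larger iteration count, because it keeps halving d long after a has stopped moving.

#include <stdio.h>
#include <math.h>

double F(double x) { return 2*cos(x) - pow(1.2, x); }

/* variant with a fixed "small number" eps as the stopping width */
double bisection_eps(double range_start, double range_end, double eps, int *counter) {
    double a = range_start;
    double d = range_end - range_start;
    *counter = 0;
    while (d >= eps) {              /* stop only once the width drops below eps */
        d /= 2.0;
        if (F(a)*F(a+d) > 0)
            a = a + d;
        ++*counter;
    }
    return a;
}

/* the original variant: stop as soon as adding d no longer changes a */
double bisection_exact(double range_start, double range_end, int *counter) {
    double a = range_start;
    double d = range_end - range_start;
    *counter = 0;
    while (a != a + d) {
        d /= 2.0;
        if (F(a)*F(a+d) > 0)
            a = a + d;
        ++*counter;
    }
    return a;
}

int main(void) {
    int n_exact, n_eps;
    double r_exact = bisection_exact(0.0, 2.0, &n_exact);
    double r_eps   = bisection_eps(0.0, 2.0, 1e-300, &n_eps);  /* deliberately tiny eps */

    printf("a != a+d : root %.17f after %d iterations\n", r_exact, n_exact);
    printf("d < eps  : root %.17f after %d iterations\n", r_eps,   n_eps);
    /* the eps variant keeps halving d long after a+d == a, i.e. pure busywork */
    return 0;
}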
This kind of reasoning may seem overly theoretical and needlessly deep, but its purpose is to illustrate again the trickiness of floats. One should be very careful about their finite precision when writing arithmetic and comparisons around them.