3 votes

I have come seeking knowledge.
I am trying to understand floating point numbers.
I am trying to figure out why, when I print the largest floating point number, it does not print correctly.

Mantissa Bits                      Exponent Bits
2-(2^-23)                          2^127

1.99999988079071044921875 * (1.7014118346046923173168730371588e+38) =
    3.4028234663852885981170418348451e+38

This should be the largest single-precision floating point number:

340282346638528859811704183484510000000.0

So,

float i = 340282346638528859811704183484510000000.0;
printf("Float %.38f", i);
Output: 340282346638528860000000000000000000000.0

Obviously the number is being rounded up, so I am trying to figure out just exactly what is going on.

My questions are: The Wikipedia article states that 3.4028234663852885981170418348451e+38 is the largest number that can be represented in IEEE-754 single-precision floating point.

Is the number stored in the floating point register as 0 11111111 11111111111111111111111, and it is just not being displayed correctly?

If I write printf("Float %.38f", FLT_MAX);, I get the same answer. Perhaps the computer I am using does not use IEEE-754?

I understand errors with calculations, but I don't understand why the number 340282346638528860000000000000000000000.0 is the largest floating point number that can be accurately represented.

Maybe the Mantissa * Exponent is causing calculation errors? If that is true, then 340282346638528860000000000000000000000.0 would be the largest number that can be faithfully represented without calculation errors. I guess that would make sense. Just need a blessing.

Thanks,

"0 11111111 11111111111111111111111" is NAN. Suspect you want "0 11111110 11111111111111111111111" - chux - Reinstate Monica
FLT_MAX is what you think it is. Your printf() is showing an approximation of it. To see what exactly is the decimal value of FLT_MAX, you need to use different code. - chux - Reinstate Monica
Wikipedia also says "This gives from 6 to 9 significant decimal digits precision" so in everyday use as an approximation of some nearby value, trying to print all the digits as if it was a specific integer at that magnitude's a bit silly. None of the previous 10^~30 integers were representable. It's interesting though that printf is showing about the number of digits that would generally be meaningful in a double-precision value - my guess is your implementation casts the float to double, generates a reasonable representation of that, and pads with 0s to the requested length. - Tony Delroy
@Tony D printf(TEXT, "Float %.38f", i); converts i from float to double before passing it to printf(). printf() does not receive float. - chux - Reinstate Monica
@chux: ah yes of course... so it would be unreasonable to display less precision than that... and evidently the implementation feels pointless to display more. - Tony Delroy

2 Answers

4 votes

It looks like the culprit is printf() (I guess because float is implicitly converted to double when passed to it):

#include <iostream>
#include <limits>

int main()
{
    std::cout.precision( 38 );
    std::cout << std::numeric_limits<float>::max() << std::endl;
}

Output is:

3.4028234663852885981170418348451692544e+38
4 votes

With float as binary32, the largest finite float is

340282346638528859811704183484516925440.0

printf("%.1f", FLT_MAX) is not obliged to print all 39 significant digits exactly, so seeing output like the below is not unexpected.

340282346638528860000000000000000000000.0

printf() will print floating point values accurately to DECIMAL_DIG significant digits. DECIMAL_DIG is at least 10. If more than DECIMAL_DIG digits of precision are requested, a compliant printf() may round the result at some point. C11dr §7.21.6.1 6 goes into detail.