0
votes

So I am trying to convert 46bfc000 (which is a floating-point number in IEEE single precision) into a decimal value.

I can get an approximate value, but not the exact value. Here is my work for the approximate value:

1) Convert into binary: 0100 0110 1011 1111 1100 0000 0000 0000

2) Find b-exp: 141 - 127 = 14

3) Convert what is after the decimal value: 2^-1 + 2^-5... = .552726746

4) Now follow this equation format: (-1)^(sign bit) * (1 + value in step 3) * 2^(b-exp)

5) Calculate: +1 X (1.552726746) X 2^14 = 25439.87501
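For comparison, here is a minimal Python sketch of the same five steps, using exact rational arithmetic so that no rounding creeps in. The variable names are mine, the code assumes a normal number (biased exponent neither 0 nor 255), and the `struct` round trip is only there as a cross-check against the machine's own reading of the bits.

```python
import struct
from fractions import Fraction

bits = 0x46BFC000

# Split the 32-bit pattern into the three IEEE 754 single-precision fields.
sign = bits >> 31                  # 1 bit
biased_exp = (bits >> 23) & 0xFF   # 8 bits
fraction = bits & 0x7FFFFF         # 23 bits after the binary point

# Steps 3 and 4: 1 + fraction / 2^23, kept as an exact rational number.
significand = 1 + Fraction(fraction, 2**23)
value = (-1) ** sign * significand * Fraction(2) ** (biased_exp - 127)

# Cross-check: let the machine interpret the same four bytes as a float.
reference = struct.unpack(">f", bits.to_bytes(4, "big"))[0]

print(f"biased exponent = {biased_exp}, fraction field = {fraction:#x}")
print(f"significand = {significand} = {float(significand)}")
print(f"value = {value}, struct says {reference}")
```

Printing the fraction field separately makes it easy to compare against the bits that were summed by hand in step 3.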

Now I know that the exact value is: 24544. But I am wondering if there is a way for me to get the exact number, or is it impossible to convert an IEEE single precision binary to a decimal value?

2
I am not entirely sure what you mean by "the exact number", but each power of two is representable by a finite number of decimal digits. So if you use a sufficient number of decimal places in the above computation you will arrive at the exact decimal representation of your binary encoding. – njuffa
I always found it easier to calculate it as sign * (2^len + mantissa) * 2^(exp - bias - len), where mantissa is the 23 bit integer value and len = 23. So then it becomes 1 * 0xBFC000 * 2^(14 - 23) = 0xBFC000 / 0x200 = 0x5FE0 = 24544. – Rudy Velthuis
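The point made in the comments, that every power of two has a finite decimal expansion, can be checked with a short Python sketch. It relies on the fact that constructing a `Decimal` from a Python float converts the float's value exactly; the chosen exponents are arbitrary examples.

```python
from decimal import Decimal

# Each negative power of two is exactly representable as a float, and
# Decimal(float) converts without rounding, so these are the exact digits.
for k in (1, 5, 9, 23):
    exact = Decimal(2.0 ** -k)
    print(f"2^-{k} = {exact:f}")

# A sum of such terms is therefore also a finite decimal.
total = Decimal(2.0 ** -2) + Decimal(2.0 ** -5)
print(f"2^-2 + 2^-5 = {total:f}")
```

The number of digits grows with the exponent, which is why a hand calculation that keeps only nine decimal places can drift away from the exact result.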

2 Answers

1
votes

I have figured out the equation to get the exact number from the binary representation. It is: sign * 2^(b-exp) * mantissa

Edit: To get the right mantissa, start with the implicit leading 1 and then add ONLY the bits of the fractional part of the binary, each weighted by its position. So for example, if your fractional part is 011 1111...

Then you would do (1*2^0) + (0*2^-1) + (1*2^-2) + (1*2^-3)...

Keep doing this for all the bits and you'll get your mantissa.
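A small Python sketch of that bit-by-bit sum, assuming the 23 fraction bits of 0x46BFC000 written out as a string and an unbiased exponent of 14. `Fraction` keeps every term exact, so nothing is lost to rounding along the way.

```python
from fractions import Fraction

# The 23 stored fraction bits of 0x46BFC000 (sign and exponent removed).
fraction_bits = "01111111100000000000000"

# Start with the implicit leading 1 (weight 2^0), then add 2^-position
# for every fraction bit that is set.
mantissa = Fraction(1)
for position, bit in enumerate(fraction_bits, start=1):
    if bit == "1":
        mantissa += Fraction(1, 2**position)

value = mantissa * 2**14  # sign is +1, unbiased exponent is 14
print(mantissa, float(mantissa), value)
```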

1
votes

Instead of calculating all those bits after the binary point, which is a heck of a job, IMO, just scale everything by 2^23 and subtract 23 more from the exponent to compensate.

This is explained in my article about floating point for Delphi.

First decode:

0 - 1000 1101 - 011 1111 1100 0000 0000 0000

Insert hidden bit:

0 - 1000 1101 - 1011 1111 1100 0000 0000 0000

In hex:

0 - 8D - BFC000

0x8D = 141, minus bias of 127, that becomes 14.

I like to scale things, so the calculation is:

sign * full_mantissa * 2^(exp - bias - len)

where full_mantissa is the mantissa, including hidden bit, as integer; bias = 127 and len = 23 (the number of mantissa bits).

So then it becomes:

1 * 0xBFC000 * 2^(14-23) = 0xBFC000 / 0x200 = 0x5FE0 = 24544

because 2^(14-23) = 2^-9 = 1 / 2^9 = 1 / 0x200.
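Here is a minimal Python sketch of this scaled calculation. The function name `decode_single` is made up for illustration, and the code assumes a normal number (it does not handle zero, denormals, infinities or NaN).

```python
from fractions import Fraction

BIAS = 127
LEN = 23  # number of stored mantissa bits


def decode_single(bits: int):
    """Return the exact value of a normal IEEE 754 single-precision pattern."""
    sign = -1 if bits >> 31 else 1
    exp = (bits >> LEN) & 0xFF
    # Treat the mantissa as an integer and insert the hidden bit.
    full_mantissa = (bits & 0x7FFFFF) | (1 << LEN)
    shift = exp - BIAS - LEN
    if shift >= 0:
        return sign * (full_mantissa << shift)
    # Negative shift: divide by a power of two, keeping the result exact.
    return Fraction(sign * full_mantissa, 1 << -shift)


result = decode_single(0x46BFC000)
print(result)
```

For 0x46BFC000 the shift is 14 - 23 = -9, so the function divides 0xBFC000 by 0x200, exactly as in the hand calculation. Values that are not whole numbers come back as exact fractions rather than rounded decimals.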