1
votes

I'm trying to express a fractional number in binary and then have it print out as a float. I've done the fixed point to floating point conversion.

The number in decimal: -342.265625

fixed point: -101010110.010001

32-bit float: 11000011101010110010001000000000

64-bit float (double): 1100000001110101011001000100000000000000000000000000000000000000

*I've double checked with an IEEE 754 Converter

*I'm also aware that printf changes floats into doubles to print them, but declaring it as a double should work? I thought...?

Code:

#include <stdio.h>

int main()
{

  float floaty = 0b11000011101010110010001000000000;
  double doubley = 0b1100000001110101011001000100000000000000000000000000000000000000;
  printf("Float: %f\n", floaty);
  printf("Double: %lf\n", doubley);

}

Output:

Float: 3282772480.000000
Double: 13868100853597995008.000000

The compiler is gcc and the standard is c99

3
0b must be a gcc extension. It would be nice if you could find the documentation on how it works with floating point initialization. - 2501
@CST-Link It shouldn't matter for printing floaty - Zac Taylor
@CST-Link In C, trailing arguments are subject to default argument promotions, which promotes float to double. Specifiers f and lf are valid for double or float. OP is using the best possible option. - 2501
@CST-Link the printf format %f is for double and any float argument is promoted to double in the same way that for %d a char argument is promoted to int. - Weather Vane
Haha... apparently I am wrong... thanks, everybody, for opening my eyes. - user2271770

3 Answers

2
votes

You can use the binary constants with some more work.

We will have to assume that float is represented as a 32-bit IEEE 754 value and that integers and floats use the same byte order on the system (which is the case on most common platforms):

uint32_t value = 0b11000011101010110010001000000000;
float f;
memcpy( &f , &value , sizeof( f ) );
printf( "%f\n" , f );
4
votes

From gcc's documentation:

The type of these constants follows the same rules as for octal or hexadecimal integer constants, so suffixes like ‘L’ or ‘UL’ can be applied.

So, the binary numbers you assign to float and double are actually of integer types and don't directly map to the bit pattern of the underlying types you assign to.

In other words, this:

 float floaty = 0b11000011101010110010001000000000;
 double doubley = 0b1100000001110101011001000100000000000000000000000000000000000000;

is equivalent to:

 float floaty = 3282772480;
 double doubley = 13868100853597995008;
3
votes

The problem is that the compiler is trying to help you out. Your literals (0b1...), which by the way use a non-standard extension and would portably be written in hexadecimal (0x...), are treated as integer literals. The compiler then converts those integer values to the floating-point types of the variables you assign them to, preserving the numeric value rather than the bit pattern. As such, it produces very big values that are equal to the integer value of your literals.

To set the bit pattern of a variable directly, you have to use a union (or memcpy; pointer casts also work on many compilers, but they violate C's strict aliasing rules and are not portable). This code works:

#include <stdint.h>
#include <stdio.h>

union floatint {
    float f;
    uint32_t i;
};

union doubleint {
    double d;
    uint64_t i;
};

int main(void)
{
  union floatint floaty;   // in C (unlike C++), the union keyword is required here
  union doubleint doubley;
  floaty.i = 0xC3AB2200;
  doubley.i = 0xC075644000000000;
  printf("Float: %f\n", floaty.f); // implementation-defined, in your case IEEE 754
  printf("Double: %lf\n", doubley.d); // ditto
  return 0;
}

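Note that this is exactly what a union is for: two (or more) members of different types that share the same storage but are interpreted differently. Reading a member other than the one last written reinterprets the stored bytes, so the result depends on how the implementation represents each type.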