9
votes

I am getting a benign warning about possible data loss

warning C4244: 'argument' : conversion from 'const int' to 'float', possible loss of data

Question

As I remember it, float has a larger precision than int. So how can data be lost if I convert from a smaller data type (int) to a larger data type (float)?

7
Nothing to do with your specific issue, but if you think that floats can somehow be faster than doubles, you are wrong - the purpose of floats is to minimise storage requirements, which are rarely a problem for modern applications. Your default choice of data type should be double, not float. - anon
@Neil: that depends very much on the CPU. There are many architectures where float is dramatically faster than double. - jalf
@jalf Possible, but in most situations in C or C++ code, the floats will be promoted to doubles anyway. - anon
@jalf: x86 isn't one of them. See the OP using Visual C++? - slacker

7 Answers

14
votes

Because float numbers are not precise. A float cannot represent every possible value an int can hold, even though the maximum value of a float is much higher.

For instance, run this simple program:

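#include <stdio.h>

int main(void)
{
    /* Round-trip every non-negative int through a float and report mismatches. */
    for (int i = 0; i < 2147483647; i++)
    {
        float value = (float)i;
        /* long long, not int: values near INT_MAX round up to 2^31,
           which does not fit in a 32-bit int. */
        long long ivalue = (long long)value;
        if (i != ivalue)
            printf("Integer %d is represented as %lld in a float\n", i, ivalue);
    }
    return 0;
}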

You'll quickly see that there are billions of integers that can't be represented exactly as floats. For instance, every integer from 16,777,219 to 16,777,221 is represented as 16,777,220.

EDIT: Running the program above indicates that there are 2,071,986,175 positive integers that cannot be represented exactly as floats. That leaves only about 75 million integers that fit correctly into a float, so roughly one integer in 28 comes out right when you put it into a float.

I expect the numbers to be the same for the negative integers.
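The count above can be cross-checked without looping over two billion values. This is only a sketch, and it assumes a 32-bit int and an IEEE 754 single-precision float with 24 significant bits; the function name is made up for illustration.

```c
/* Counts the non-negative integers below 2^31 that a float stores exactly.
   Assumption: IEEE 754 single precision (24 significant bits). */
long long count_exact_float_ints(void)
{
    /* Every integer in [0, 2^24) fits in the 24-bit significand. */
    long long count = 1LL << 24;

    /* Each range [2^k, 2^(k+1)) for k = 24..30 contains exactly 2^23
       representable values, spaced 2^(k-23) apart. */
    for (int k = 24; k <= 30; k++)
        count += 1LL << 23;

    return count;
}
```

Under those assumptions the function returns 75,497,472, and subtracting that from the 2,147,483,647 values the loop visits gives the 2,071,986,175 mismatches reported.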

6
votes

On most architectures int and float are the same size, in that they have the same number of bits. However, in a float those bits are split between exponent and mantissa, meaning that there are actually fewer bits of precision in the float than the int. This is only likely to be a problem for larger integers, though.

On systems where an int is 32 bits, a double is usually 64 bits and so can exactly represent any int.
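To compare the two types, something like the following should do. It is a sketch that assumes a 32-bit int, an IEEE 754 32-bit float and a 64-bit double; the helper names are invented for this example.

```c
/* Converts i to float and back. The result is long long because a float
   can round an int near INT_MAX up to 2^31, which would overflow int. */
long long round_trip_float(int i)
{
    float f = (float)i;
    return (long long)f;
}

/* Same round trip through double, which has 53 significant bits and so
   holds every 32-bit int exactly. */
long long round_trip_double(int i)
{
    double d = (double)i;
    return (long long)d;
}
```

With these, round_trip_float(16777217) gives 16777216, while round_trip_double(16777217) gives back 16777217 unchanged.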

3
votes

Both types are composed of 4 bytes (32 bits). Only one of them allows a fraction (the float).

Take this float as an example:

34.156

(integer).(fraction)

Now use your logic: if one of them must also save fraction information (after all, it should represent a number), then it has fewer bits for the integer part.

Thus, the range over which a float can represent every integer exactly is smaller than the range of an int.

To be more specific, a 32-bit "int" uses all 32 bits to represent an integer (maximal unsigned integer of 4,294,967,295). A "float" has only 24 significant bits for this (23 stored plus one implicit), so every integer is exact only up to 16,777,216.

That's why when you convert from int to float you might lose data.

Example: int = 1,158,354,125

You cannot store this number exactly in a "float".
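One way to find where exact integer storage ends is to search for the first non-negative int that does not survive a trip through float. This is a rough sketch assuming a 32-bit int and an IEEE 754 single-precision float; the function name is hypothetical.

```c
#include <limits.h>

/* Returns the first non-negative int that changes when converted to
   float and back, or -1 if every value survives. */
int first_inexact_int(void)
{
    for (int i = 0; i < INT_MAX; i++) {
        float f = (float)i;
        /* Compare as long long so a float equal to 2^31 cannot overflow int. */
        if ((long long)f != i)
            return i;
    }
    return -1;
}
```

On an IEEE 754 machine this returns 16,777,217, that is 2^24 + 1, which matches a 24-bit significand.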

More information at:

http://en.wikipedia.org/wiki/Single_precision_floating-point_format

http://en.wikipedia.org/wiki/Integer_%28computer_science%29

1
votes

Precision does not matter. The precision of int is 1, while the precision of a typical float (IEEE 754 single precision) is approximately 5.96e-8. What matters is the sets of numbers that the two formats can represent. If there are numbers that int can represent exactly that float cannot, then there is a possible loss of data.

Floats and ints are typically both 32 bits these days, but that's not guaranteed. Assuming it is the case on your machine, it follows that there must be int values that float cannot represent exactly, because there are obviously float values that int cannot represent exactly. The range of one format cannot be a proper super-set of the other if both formats use the same number of bits efficiently.

A 32 bit int effectively has 31 bits that code for the absolute value of the number. An IEEE 754 float effectively has only 24 bits that code for the mantissa (one implicit).

0
votes

A float is usually in the standard IEEE single-precision format. This means there are only 24 bits of precision in a float, while an int is likely to be 32-bit. So, if your int contains a number whose absolute value cannot fit in 24 bits, you are likely to have it rounded to the nearest representable number.
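If you want to know in advance whether a particular int will be rounded, a check along these lines might help. It is a sketch assuming a 32-bit int and a float with 24 significant bits; the function name is made up. Note that trailing zero bits do not count against the 24, because the exponent absorbs them.

```c
#include <stdbool.h>

/* Returns true if v converts to float without rounding. */
bool converts_exactly_to_float(int v)
{
    /* Magnitude as unsigned, so INT_MIN does not overflow on negation. */
    unsigned int m = (v < 0) ? 0u - (unsigned int)v : (unsigned int)v;

    if (m == 0)
        return true;

    /* Drop trailing zero bits; the exponent accounts for those. */
    while ((m & 1u) == 0)
        m >>= 1;

    /* What remains must fit in the 24-bit significand. */
    return m < (1u << 24);
}
```

For example, 16,777,216 and 16,777,218 pass the check, while 16,777,217 does not.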

0
votes

The fact is that both a float and an int are represented using 32 bits. The integer value uses all 32 bits, so it can accommodate numbers from -2^31 to 2^31 - 1. However, a float uses 1 bit for the sign (including -0.0f) and 8 bits for the exponent. This means 32 - 9 = 23 bits are left for the mantissa. However, the float assumes that if the mantissa and exponent are not zero, then the mantissa starts with a 1. So you more or less have 24 bits for your integer, instead of 32. However, because it can be shifted, it accommodates more than 2^24 integers.

A floating point uses a Sign, an eXponent, and a Mantissa
S X X X X X X X X M M M M M M M M M M M M M M M M M M M M M M M

An integer has a Sign, and a Mantissa
S M M M M M M M M M M M M M M M M M M M M M M M M M M M M M M M

So, an integer such as:

1 0 0 1 1 1 1 1 1 0 0 0 0 1 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0

fits in a float because it can be shifted:

1 0 0 1 1 1 1 1 1 0 0 0 0 1 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
|       |                             |
|       +---------+                   +---------+
|                 |                             |
v                 v                             v
S X X X X X X X X M M M M M M M M M M M M M M M M M M M M M M M
1                 1 1 1 1 1 0 0 0 0 1 1 0 0 0 0 1 0 0 0 0 0 0 0

I don't show you the eXponent because I most often make a mistake in computing it, but it records the position of the leading 1 (bit 28 here) and is stored with a bias of 127 added. This clearly shows you that if you have to shift by 5 bits, you're going to lose the 5 lower bits.

So this other integer can be converted to a float with a loss of 2 bits (i.e. when you convert back to an integer, the last two bits (11) are set to zero (00) because they were not saved in the float):

1 0 0 1 1 1 1 1 1 0 0 0 0 1 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 1
|       |                             |               | | | | |
|       +---------+                   +---------+     +-+-+-+-+--> all lost
|                 |                             |
v                 v                             v
S X X X X X X X X M M M M M M M M M M M M M M M M M M M M M M M
1                 1 1 1 1 1 0 0 0 0 1 1 0 0 0 0 1 0 0 0 0 0 0 0

Pretty simple stuff really.

IMPORTANT NOTE: Yes, the first 1 in the integer is the sign, then the next 1 is not copied in the mantissa, it is assumed to be 1 so it is not required.
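If you would rather have the machine work out the fields than do it by hand, here is a small sketch that pulls the sign, exponent and mantissa bits out of a float. It assumes the IEEE 754 single-precision layout (1 sign bit, 8 exponent bits, 23 mantissa bits); the struct and function names are made up for this example.

```c
#include <stdint.h>
#include <string.h>

/* Raw fields of a float, as stored. */
typedef struct {
    unsigned sign;      /* 1 bit */
    unsigned exponent;  /* 8 bits, stored with a bias of 127 */
    uint32_t mantissa;  /* 23 bits, leading 1 not stored */
} float_fields;

float_fields decode_float(float f)
{
    uint32_t bits;
    float_fields out;

    /* memcpy copies the bit pattern without breaking aliasing rules. */
    memcpy(&bits, &f, sizeof bits);

    out.sign = bits >> 31;
    out.exponent = (bits >> 23) & 0xFFu;
    out.mantissa = bits & 0x7FFFFFu;
    return out;
}
```

For 1.0f the stored exponent comes out as 127 with a zero mantissa. A value whose leading 1 sits at bit 28 would show a stored exponent of 28 + 127 = 155.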