2 votes

I have a strange floating-point problem.

Background:

I am implementing a double-precision (64-bit) IEEE 754 floating-point library for an 8-bit processor with a large integer arithmetic co-processor. To test this library, I am comparing the values returned by my code against the values returned by Intel's floating-point instructions. These don't always agree, because Intel's Floating-Point Unit stores values internally in an 80-bit format, with a 64-bit mantissa.
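
For this kind of testing, the comparison has to be bit-for-bit, since the discrepancies are a single unit in the last place. A minimal sketch of such a comparison in C, assuming double is IEEE 754 binary64; the helper name is illustrative, not from the question's library:

    #include <stdint.h>
    #include <string.h>

    /* Illustrative helper: compare two doubles bit-for-bit.  memcpy is the
       well-defined way to read a double's bit pattern; unlike ==, this also
       distinguishes +0 from -0 and compares NaN payloads. */
    static int same_bits(double a, double b)
    {
        uint64_t ua, ub;
        memcpy(&ua, &a, sizeof ua);
        memcpy(&ub, &b, sizeof ub);
        return ua == ub;
    }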

Example (all in hex):

X = 4C816EFD0D3EC47E:
biased exponent = 4C8 (true exponent = C9), mantissa = 116EFD0D3EC47E

Y = 449F20CDC8A5D665:
biased exponent = 449 (true exponent = 4A), mantissa = 1F20CDC8A5D665
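
These fields can be extracted mechanically from the raw bit pattern. A decoding sketch in C (not the library code from the question; it assumes a normal double whose biased exponent is at least 3FF, as both examples here are):

    #include <stdio.h>
    #include <stdint.h>

    /* Illustrative decoder for a normal IEEE 754 double, given its raw
       bit pattern.  Assumes biased exponent >= 3FF so the subtraction
       below stays non-negative. */
    static void dump_fields(uint64_t bits)
    {
        uint64_t biased   = (bits >> 52) & 0x7FF;         /* 11 exponent bits */
        uint64_t mantissa = (bits & 0xFFFFFFFFFFFFFULL)   /* 52 fraction bits */
                          | (1ULL << 52);                 /* implicit leading 1 */
        printf("biased exponent = %llX, true exponent = %llX, mantissa = %llX\n",
               (unsigned long long)biased,
               (unsigned long long)(biased - 0x3FF),      /* bias = 3FF (1023) */
               (unsigned long long)mantissa);
    }

    int main(void)
    {
        dump_fields(0x4C816EFD0D3EC47EULL);  /* X */
        dump_fields(0x449F20CDC8A5D665ULL);  /* Y */
        return 0;
    }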

Calculate X * Y

The product of the mantissas is 10F5643E3730A17FF62E39D6CDB0, which when rounded to 53 (decimal) bits is 10F5643E3730A1 (because the top bit of 7FF62E39D6CDB0 is zero). So the correct mantissa in the result is 10F5643E3730A1.

But if the computation is carried out with a 64-bit mantissa first, 10F5643E3730A17FF62E39D6CDB0 rounds up to 10F5643E3730A1800, because there the discarded tail is just over half a unit in the last place. Rounding that to 53 bits discards 800, which is exactly half a unit in the last place; round-half-to-even breaks the tie upward, because the low bit of ...A1 is odd, giving 10F5643E3730A2. The least significant digit has changed from 1 to 2. This is the classic double-rounding problem.
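
Both rounding paths can be reproduced exactly in integer arithmetic. Here is a minimal sketch in C (assuming gcc or clang on a 64-bit target for the non-standard unsigned __int128 type; the helper name is mine, not from any library), which prints 10F5643E3730A1 for the single rounding and 10F5643E3730A2 for the 64-then-53-bit path:

    #include <stdio.h>
    #include <stdint.h>

    /* Round v, which has 'width' significant bits, to 'bits' significant
       bits using round-to-nearest, ties-to-even (0 < width - bits < 128). */
    static unsigned __int128 round_sig(unsigned __int128 v, int width, int bits)
    {
        int shift = width - bits;
        unsigned __int128 kept = v >> shift;
        unsigned __int128 rem  = v & (((unsigned __int128)1 << shift) - 1);
        unsigned __int128 half = (unsigned __int128)1 << (shift - 1);
        if (rem > half || (rem == half && (kept & 1)))
            kept++;                      /* round up; exact ties go to even */
        return kept;
    }

    int main(void)
    {
        const uint64_t mx = 0x116EFD0D3EC47EULL;  /* 53-bit mantissa of X */
        const uint64_t my = 0x1F20CDC8A5D665ULL;  /* 53-bit mantissa of Y */
        unsigned __int128 p = (unsigned __int128)mx * my;  /* 106-bit product */

        uint64_t once  = (uint64_t)round_sig(p, 106, 53);
        uint64_t twice = (uint64_t)round_sig(round_sig(p, 106, 64), 64, 53);

        printf("rounded once to 53 bits:   %llX\n", (unsigned long long)once);
        printf("rounded via 64 to 53 bits: %llX\n", (unsigned long long)twice);
        return 0;
    }

The second call mimics what the x87 does: fmull rounds the product to the register's 64-bit mantissa, and fstpl then rounds the register to a 53-bit double.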

To sum up: my library returns the correct double-precision mantissa, 10F5643E3730A1, while the Intel hardware returns 10F5643E3730A2, which is also correct given its internal 64-bit mantissa.

The problem:

Now, here's what I don't understand: sometimes the Intel hardware returns 10F5643E3730A1 as the mantissa! I have two programs, a Windows console program and a Windows GUI program, both built with Qt using g++ 4.5.2. The console program returns 10F5643E3730A2, as expected, but the GUI program returns 10F5643E3730A1. Both call the same library function, which contains these three instructions:

    fldl   -0x18(%ebp)    # load the first double operand onto the FPU stack
    fmull  -0x10(%ebp)    # multiply by the second double operand
    fstpl  0x4(%esp)      # round the register to a 64-bit double, store, pop

And these three instructions compute a different result in the two programs. (I have stepped through them both in the debugger.) It seems to me that this might be something that Qt does to configure the FPU in its GUI startup code, but I can't find any documentation about this. Does anybody have any idea what's happening here?


2 Answers

5 votes

The instruction stream of, and the inputs to, a function do not uniquely determine its execution. You must also consider the state already established in the processor at the time it executes.

If you inspect the x87 control word, you will find that it is set to two different states in the two programs, corresponding to your two observed behaviors. In one, precision control [bits 9:8] has been set to 10b (53 bits); in the other, it is set to 11b (64 bits).
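
If you want to see it, the control word is one instruction away; a sketch in C with gcc-style inline assembly (x86/x86-64 only; the function name is mine):

    #include <stdio.h>

    /* Read the x87 control word with fnstcw. */
    static unsigned short x87_control_word(void)
    {
        unsigned short cw;
        __asm__ __volatile__ ("fnstcw %0" : "=m" (cw));
        return cw;
    }

    int main(void)
    {
        unsigned short cw = x87_control_word();
        unsigned pc = (cw >> 8) & 3;   /* precision control, bits 9:8 */
        printf("control word = %04X, PC = %u (%s)\n", (unsigned)cw, pc,
               pc == 3 ? "64-bit" : pc == 2 ? "53-bit" : "24-bit");
        return 0;
    }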

As to exactly what is establishing the non-default state: it could be anything that happens in that thread prior to the execution of your code. Any libraries that are pulled in are likely suspects. If you want to do some archaeology, the smoking gun is typically an fldcw instruction (though the control word can also be written by fldenv, frstor, and finit).
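
And if the goal is just to make the two programs agree, you can pin the precision control to 53 bits yourself before calling the library; a sketch under the same assumptions (gcc-style inline asm, x86 only):

    /* Force PC = 10b (53-bit mantissa) so every x87 operation rounds
       like a plain IEEE 754 double. */
    static void x87_set_double_precision(void)
    {
        unsigned short cw;
        __asm__ __volatile__ ("fnstcw %0" : "=m" (cw));
        cw = (unsigned short)((cw & ~0x0300u) | 0x0200u);  /* bits 9:8 = 10b */
        __asm__ __volatile__ ("fldcw %0" : : "m" (cw));
    }

Note that this changes only the mantissa precision; the registers keep their extended exponent range, so results near underflow or overflow can still differ from a pure 64-bit implementation.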

3 votes

Normally it's a compiler setting. Check, for example, the following page for Visual C++: http://msdn.microsoft.com/en-us/library/aa289157%28v=vs.71%29.aspx

or this document from Intel: http://cache-www.intel.com/cd/00/00/34/76/347605_347605.pdf

The Intel document in particular describes flags inside the processor (the FPU control word) that determine the behavior of the FPU instructions. This explains why the same code behaves differently in two programs: one sets the flags differently from the other.
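
For completeness, on Visual C++ (32-bit x86) the same control word is reachable from <float.h>; a sketch, noting that the _MCW_PC mask is not supported on x64:

    #include <float.h>
    #include <stdio.h>

    int main(void)
    {
        unsigned int cw = 0;

        _controlfp_s(&cw, 0, 0);             /* mask 0: just read the state */
        printf("control word = %08X\n", cw);

        _controlfp_s(&cw, _PC_53, _MCW_PC);  /* precision control -> 53 bits */
        return 0;
    }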