Is SSE floating-point arithmetic reproducible?

Question

The x87 FPU is notable for using an internal 80-bit precision mode, which often leads to unexpected and unreproducible results across compilers and machines. In my search for reproducible floating-point math on .NET, I discovered that both major implementations of .NET (Microsoft's and Mono) emit SSE instructions rather than x87 in 64-bit mode.
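To make the precision issue concrete, here is a minimal Python sketch of how the width of an intermediate result can change the final answer. Python has no access to the x87 precision mode, so this is only an analogy: float32 stands in for the declared type, and Python's 64-bit float stands in for the wider internal format. The helper names (`to_f32`, `strict_f32`, `wide_f32`) are made up for this illustration.

```python
import math
import struct


def to_f32(x):
    """Round a Python float (binary64) to the nearest float32 value."""
    try:
        return struct.unpack("f", struct.pack("f", x))[0]
    except OverflowError:
        # struct refuses finite values that overflow float32; IEEE rounding
        # would produce an infinity of the same sign.
        return math.copysign(math.inf, x)


def strict_f32(a, b, c):
    """(a * b) / c with every intermediate rounded to float32."""
    product = to_f32(to_f32(a) * to_f32(b))
    return to_f32(product / to_f32(c))


def wide_f32(a, b, c):
    """(a * b) / c with the intermediate kept wide; only the result is rounded."""
    return to_f32((to_f32(a) * to_f32(b)) / to_f32(c))


if __name__ == "__main__":
    print(strict_f32(3e38, 10.0, 10.0))
    print(wide_f32(3e38, 10.0, 10.0))
```

With the inputs 3e38, 10 and 10, the strictly rounded version overflows to infinity, while the version with a wide intermediate returns a finite value.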

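SSE(2) performs operations on 32-bit floats strictly in 32-bit precision, and operations on 64-bit floats strictly in 64-bit precision (the XMM registers themselves are 128 bits wide). Denormals can optionally be flushed to zero by setting the flush-to-zero (FTZ) and denormals-are-zero (DAZ) bits in the MXCSR control register.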
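As a rough illustration of what flushing denormals to zero does to a result, the following Python sketch imitates the effect in software. Python cannot set the SSE control register itself, so `flush_to_zero` and `halve` are hypothetical helpers that only mimic the outcome for 64-bit floats.

```python
import math
import sys

# Smallest positive normal binary64 value; anything smaller in magnitude
# (and nonzero) is a denormal.
SMALLEST_NORMAL = sys.float_info.min


def flush_to_zero(x):
    """Imitate flush-to-zero: replace a denormal with a zero of the same sign."""
    if x != 0.0 and abs(x) < SMALLEST_NORMAL:
        return math.copysign(0.0, x)
    return x


def halve(x, ftz):
    """Divide by two, optionally flushing a denormal result to zero."""
    result = x / 2.0
    return flush_to_zero(result) if ftz else result


if __name__ == "__main__":
    print(halve(SMALLEST_NORMAL, ftz=False))  # a tiny nonzero denormal
    print(halve(SMALLEST_NORMAL, ftz=True))   # zero
```

Two machines that disagree on this setting can therefore disagree on whether a tiny result is zero or not, which is why the setting has to be pinned down for reproducibility.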

It would therefore appear that SSE does not suffer from the precision-related issues of x87, and that the only variable is the denormal behavior, which can be controlled.

Leaving aside the matter of transcendental functions (which, unlike x87, SSE does not provide natively), does using SSE guarantee reproducible results across machines and compilers? Could compiler optimizations, for instance, lead to different results? I found some conflicting opinions:

If you have SSE2, use it and live happily ever after. SSE2 supports both 32b and 64b operations and the intermediate results are of the size of the operands. - Yossi Kreinin, http://www.yosefk.com/blog/consistency-how-to-defeat-the-purpose-of-ieee-floating-point.html

...

The SSE2 instructions (...) are fully IEEE754-1985 compliant, and they permit better reproducibility (thanks to the static rounding precision) and portability with other platforms. - Muller et al., Handbook of Floating-Point Arithmetic, p. 107

however:

Also, you can't use SSE or SSE2 for floating point, because it's too under-specified to be deterministic. - John Watte, http://www.gamedev.net/topic/499435-floating-point-determinism/#entry4259411
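On the compiler-optimization part of the question, one well-known source of differences is independent of the instruction set: floating-point addition is not associative, so a compiler that is permitted to regroup a sum can change the result even when every individual operation is correctly rounded. A minimal Python sketch, with `grouped_left` and `grouped_right` as names invented for this example:

```python
def grouped_left(a, b, c):
    """Evaluate (a + b) + c, as written left to right."""
    return (a + b) + c


def grouped_right(a, b, c):
    """Evaluate a + (b + c), as a reassociating optimizer might."""
    return a + (b + c)


if __name__ == "__main__":
    a, b, c = 1e16, -1e16, 1.0
    print(grouped_left(a, b, c))   # the large terms cancel first, then 1.0 is added
    print(grouped_right(a, b, c))  # 1.0 is absorbed by rounding before the cancellation
```

Here the first grouping yields 1.0 and the second yields 0.0. Whether a given compiler performs such regrouping depends on its settings, so this sketch shows only that the result can depend on evaluation order, not that any particular compiler does it.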

I'm pretty sure that if there are two conflicting opinions on the web you'll get an argument here (and probably at least a 3rd opinion too) — KevinDTimm
@KevinDTimm that doesn't make this question subjective though. SSE is either reproducible or it's not. — Asik
"SSE or SSE2 [is] too under-specified to be deterministic". I do not claim to be an expert on these matters, but this sounds like BS to me. In the link there's talk about library functions for transcendental and of course there could be bugs in those on one platform and not another as indeed there could be (in fact, probably is) in any compiler's optimizer, but that does not say anything about SSE/SSE2 per se. Does he have an example of what he means? — 500 - Internal Server Error
@Hans Passant: without predictability, rigorous engineering is impossible. The behavior of high-level language source expressions is unpredictable in the face of compiler optimization when extended-precision is used. When non-extended precision is combined with strict compiler settings, the behavior is predictable. For most programmers most of the time, extended precision is a useful crutch. For experts, it is frequently an extreme inconvenience. — Stephen Canon
@HansPassant for multiplayer simulations, it matters less what the results are than that they are the same across computers. Scientific computing faces similar challenges. Also, it's not just a matter of a few bits: extended precision means the same computation may give either a real value or Infinity, for instance. — Asik

Stephen Canon · Accepted Answer · 2013-03-01T06:08:31

SSE is fully specified*. Muller is an expert in floating point arithmetic; who are you going to trust, him or some guy on a gamedev forum?

(*) There are actually a few exceptions for non-IEEE-754 operations like rsqrtss, where Intel never fully specified the behavior, but that doesn't affect the IEEE-754 basic operations. More importantly, their behavior can't actually change at this point, because doing so would break binary compatibility for too many things, so they're as good as specified.
