A long time ago, I used this simple x86 assembler trick to obtain 0 or 1 as the result of a floating-point comparison:
fld [value1]
fcom [value2]
fnstsw ax
mov al, ah
and eax, 1
This trick avoids a branch when the comparison result is only used to select one of two values. It was fast in the Pentium days; on current CPUs it may not be much faster, but who knows.
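For reference, here is a minimal plain C++ sketch of what the trick computes, assuming the comparison of interest is "value1 < value2". The helper selectByCompare and its parameter names are hypothetical, added only to illustrate the table lookup. One difference to be aware of: the assembler version returns the C0 flag, which FCOM also sets for unordered (NaN) operands, so it yields 1 there while `<` yields 0.

```cpp
// Branch-free 0/1 result of a floating-point comparison, in plain C++.
// Assumption: "value1 < value2" is the comparison of interest.
inline int compareDoublesIndexed(const double value1, const double value2)
{
    // A bool converts to exactly 0 or 1. Optimizing compilers typically
    // turn this into a compare plus a flag-to-register move, not a jump.
    return static_cast<int>(value1 < value2);
}

// Hypothetical helper: picks one of two values by indexing a small table
// with the comparison result instead of using if/else.
inline double selectByCompare(const double value1, const double value2,
                              const double ifNotLess, const double ifLess)
{
    const double table[2] = { ifNotLess, ifLess };
    return table[compareDoublesIndexed(value1, value2)];
}
```

Whether the generated code is actually branch-free depends on the compiler and optimization level, so it is worth checking the assembly output.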
Now I mainly use C++ and compile with either the Intel C++ Compiler or GCC.
Can someone please help me rewrite this code in both inline assembler flavors (Intel and GCC)?
The required function prototype is: inline int compareDoublesIndexed( const double value1, const double value2 ) { ... }
Maybe using SSE2 operations could be even more efficient. Your perspective?
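To make the SSE2 idea concrete, here is a rough sketch using compiler intrinsics from <emmintrin.h> instead of inline assembler. The function name compareDoublesIndexedSSE2 is my own, and the sketch assumes an x86 target with SSE2 enabled (for 32-bit GCC builds that means -msse2). I have not measured whether it beats the x87 version.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics

// Sketch: returns 1 if value1 < value2, otherwise 0 (including for NaN).
inline int compareDoublesIndexedSSE2(const double value1, const double value2)
{
    // Put each scalar in the low lane; the high lane is zeroed.
    const __m128d a = _mm_set_sd(value1);
    const __m128d b = _mm_set_sd(value2);

    // CMPLTSD: low 64 bits become all ones if a < b, else all zeros.
    const __m128d mask = _mm_cmplt_sd(a, b);

    // MOVMSKPD collects the sign bit of each lane; bit 0 is the low lane.
    return _mm_movemask_pd(mask) & 1;
}
```

Both GCC and the Intel compiler accept these intrinsics, which sidesteps the differences between their inline assembler syntaxes.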
I've tried this:
__asm__(
"fcomq %2, %0\n"
"fnstsw %ax\n"
"fsubq %2, %0\n"
"andq $L80, %eax\n"
"shrq $5, %eax\n"
"fmulq (%3,%eax), %0\n"
: "=f" (penv)
: "0" (penv), "F" (env), "r" (c)
: "eax" );
But I get this error from the Intel C++ Compiler: Floating point output constraint must specify a single register.