Software-emulated IEEE floats/doubles are slow because of the many edge cases that have to be checked for and handled properly:
- +/-infinity in input
- Not-A-Number in input
- +/-0 in input
- normalized vs denormalized number in input and the implicit '1' in the mantissa
- unpacking and packing
- normalization/denormalization
- under- and overflow checks
- correct rounding, which can lead to extra (de)normalization and/or underflow/overflow
If you roughly count each item on the list as one primitive micro-operation, you get close to 10; in the worst case there will be many more.
So, if you're interested in IEEE-compliant floating-point arithmetic, expect every emulated operation to be something like 30x slower than its integer counterpart (CodesInChaos's comment mentioning 38 clocks per addition/multiplication is in line with this).
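To make those costs concrete, here is a rough sketch of a software single-precision multiply. It is not how any particular soft-float library does it, and it is not fully IEEE-compliant (round-to-nearest-even only, subnormals flushed to zero); the point is only to show how much bookkeeping surrounds the one "useful" integer multiply:

```cpp
#include <cstdint>

// Rough sketch of a software float32 multiply (NOT fully IEEE-compliant:
// round-to-nearest-even only, subnormal inputs/results flushed to zero).
static uint32_t soft_fmul(uint32_t a, uint32_t b)
{
    uint32_t sign = (a ^ b) & 0x80000000u;
    int32_t  ea = (a >> 23) & 0xFF,  eb = (b >> 23) & 0xFF;
    uint32_t ma = a & 0x007FFFFFu,   mb = b & 0x007FFFFFu;

    // NaN and infinity in input
    if (ea == 0xFF || eb == 0xFF) {
        if ((ea == 0xFF && ma) || (eb == 0xFF && mb)) return 0x7FC00000u; // NaN in -> NaN out
        if ((ea == 0xFF && eb == 0 && mb == 0) ||
            (eb == 0xFF && ea == 0 && ma == 0)) return 0x7FC00000u;       // inf * 0 -> NaN
        return sign | 0x7F800000u;                                        // inf otherwise
    }

    // Zero (and, in this sketch, subnormal) inputs
    if (ea == 0 || eb == 0) return sign;

    // Unpack: make the implicit leading '1' explicit
    ma |= 0x00800000u;
    mb |= 0x00800000u;

    // The one "useful" operation: 24x24 -> 48-bit product
    uint64_t prod = (uint64_t)ma * mb;   // in [2^46, 2^48)
    int32_t  e = ea + eb - 127;          // tentative biased exponent

    // Normalize so the leading '1' sits at bit 47
    if (prod & (1ull << 47)) e += 1; else prod <<= 1;

    // Round to nearest, ties to even
    uint32_t mant = (uint32_t)(prod >> 24);
    uint32_t rest = (uint32_t)(prod & 0x00FFFFFFu);
    if (rest > 0x800000u || (rest == 0x800000u && (mant & 1u))) {
        if (++mant & 0x01000000u) { mant >>= 1; e += 1; }  // rounding carried out of the mantissa
    }

    // Exponent overflow/underflow
    if (e >= 0xFF) return sign | 0x7F800000u;  // overflow -> infinity
    if (e <= 0)    return sign;                // underflow -> zero (no gradual underflow here)

    // Pack
    return sign | ((uint32_t)e << 23) | (mant & 0x007FFFFFu);
}
```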
You could cut some corners (see the sketch after this list) by choosing a floating-point format with:
- just one zero
- no Not-A-Number
- normalized numbers only
- no implicit '1' in the mantissa
- exponent and mantissa each occupying an integral number of bytes
- no or primitive rounding
- possibly, no infinities
- possibly, 2's complement mantissa
- possibly, no exponent bias
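For contrast, here is a sketch of a multiply in a made-up simplified format along those lines (explicit top bit, normalized values only, no NaN, no infinities, no bias, truncation instead of rounding; exponent over/underflow is ignored). Note how short the hot path becomes compared to the IEEE sketch above:

```cpp
#include <cstdint>

// Made-up simplified format: value = (-1)^sign * (mant / 2^32) * 2^exp,
// with mant normalized so its top bit is always set (or mant == 0 for zero).
struct SimpleFloat {
    uint32_t mant;   // fraction in [0.5, 1), scaled by 2^32; 0 means zero
    int32_t  exp;    // plain two's-complement exponent, no bias
    uint32_t sign;   // 0 or 1
};

static SimpleFloat simple_mul(SimpleFloat a, SimpleFloat b)
{
    SimpleFloat r;
    r.sign = a.sign ^ b.sign;
    if (a.mant == 0 || b.mant == 0) { r.mant = 0; r.exp = 0; r.sign = 0; return r; }

    uint64_t p = (uint64_t)a.mant * b.mant;   // product fraction in [0.25, 1), scaled by 2^64
    r.exp = a.exp + b.exp;                    // over/underflow checks omitted in this sketch
    if (p >> 63) {                            // at most one normalization step is ever needed
        r.mant = (uint32_t)(p >> 32);
    } else {
        r.mant = (uint32_t)(p >> 31);
        r.exp -= 1;
    }
    return r;
}
```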
Fixed-point arithmetic may turn out to be much more performant. But the usual problem with it is that you have to know the ranges of all inputs and intermediate results beforehand, so that you can choose the right format and avoid overflows. You'll also likely need to support a number of different fixed-point formats, e.g. 16.16, 32.32, 8.24, 0.32. C++ templates may help reduce code duplication here.
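For instance, a minimal sketch of such a templated fixed-point type might look like this (the names and the 32-bit storage are assumptions; there is deliberately no overflow checking, because that is exactly the part you have to design around your known ranges):

```cpp
#include <cstdint>

// FRAC is the number of fractional bits: Fixed<16> is a 16.16 format,
// Fixed<24> is an 8.24 format, and so on (FRAC < 31 assumed).
template <int FRAC>
struct Fixed {
    int32_t raw;

    static Fixed from_double(double d) { return Fixed{ (int32_t)(d * (1 << FRAC)) }; }
    double to_double() const           { return (double)raw / (1 << FRAC); }

    Fixed operator+(Fixed o) const { return Fixed{ raw + o.raw }; }
    Fixed operator-(Fixed o) const { return Fixed{ raw - o.raw }; }

    // Multiply through a 64-bit intermediate, then drop the extra fractional bits.
    Fixed operator*(Fixed o) const {
        return Fixed{ (int32_t)(((int64_t)raw * o.raw) >> FRAC) };
    }
    // Divide: widen the dividend first so the quotient keeps its fractional bits.
    Fixed operator/(Fixed o) const {
        return Fixed{ (int32_t)(((int64_t)raw * (1LL << FRAC)) / o.raw) };
    }
};

// Usage: a 16.16 multiply is one integer multiply plus one shift.
// Fixed<16> a = Fixed<16>::from_double(3.25);
// Fixed<16> b = Fixed<16>::from_double(-1.5);
// double c = (a * b).to_double();   // -4.875
```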
In any event, the best you can do is define your problem, solve it with both floating- and fixed-point arithmetic, see which of the two performs best on your target CPU, and pick the winner.
EDIT: For an example of a simpler floating-point format, take a look at the MIL-STD-1750A's 32-bit floating point format:
MSB                                          LSB MSB             LSB
------------------------------------------------------------------
| S|                  Mantissa                  |    Exponent    |
------------------------------------------------------------------
  0  1                                        23  24            31
Floating point numbers are represented as a fractional mantissa times 2 raised to the power of the exponent. All floating point numbers are assumed normalized or floating point zero at the beginning of a floating point operation, and the results of all floating point operations are normalized (a normalized floating point number has the sign of the mantissa and the next bit of opposite value) or floating point zero. A floating point zero is defined as 0000 0000 (hex), that is, a zero mantissa and a zero exponent (00 hex). An extended floating point zero is defined as 0000 0000 0000 (hex), that is, a zero mantissa and a zero exponent. Some examples of the machine representation for 32-bit floating point numbers:
   Decimal Number           Hexadecimal Notation
  (Mantissa x Exp)          (Mantissa    Exp)
  0.9999998 x 2^127          7FFFFF      7F
  0.5       x 2^127          400000      7F
  0.625     x 2^4            500000      04
  0.5       x 2^1            400000      01
  0.5       x 2^0            400000      00
  0.5       x 2^-1           400000      FF
  0.5       x 2^-128         400000      80
  0.0       x 2^0            000000      00
 -1.0       x 2^0            800000      00
 -0.5000001 x 2^-128         BFFFFF      80
 -0.7500001 x 2^4            9FFFFF      04
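To make the encoding concrete, here is a small decode sketch that reproduces the table above. It assumes the 24-bit two's-complement mantissa sits in the high bits of the word, the 8-bit two's-complement exponent in the low bits, and that the compiler performs an arithmetic right shift on signed values (true on common compilers, but not guaranteed by the standard):

```cpp
#include <cmath>
#include <cstdint>
#include <cstdio>

// Decode a MIL-STD-1750A 32-bit float into a native double.
static double mil1750a_to_double(uint32_t w)
{
    int32_t mant = (int32_t)w >> 8;          // top 24 bits: two's-complement mantissa
    int32_t exp  = (int32_t)(w & 0xFFu);
    if (exp > 127) exp -= 256;               // low 8 bits: two's-complement exponent
    return std::ldexp((double)mant, exp - 23);  // mantissa has 23 fractional bits
}

int main()
{
    std::printf("%g\n", mil1750a_to_double(0x7FFFFF7Fu)); // 0.9999998 x 2^127
    std::printf("%g\n", mil1750a_to_double(0x50000004u)); // 0.625 x 2^4 = 10
    std::printf("%g\n", mil1750a_to_double(0x80000000u)); // -1.0 x 2^0 = -1
    std::printf("%g\n", mil1750a_to_double(0x9FFFFF04u)); // -0.7500001 x 2^4 ~= -12.000002
}
```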