0
votes

With regard to the mantissa (re this guide on floating point arithmetic), how do you actually multiply two mantissas together?

Assume IEEE 754 single precision floating point representation.

Let's say one number has a mantissa of 1.5, which would be encoded as 0b10000000000000000000000 (which is 4194304 in decimal). The second number has a mantissa of 1.125, which would be encoded as 0b00100000000000000000000 (which is 1048576 in decimal).

1.5 x 1.125 = 1.6875.

1.6875 is encoded as 0b10110000000000000000000 (which is 5767168 in decimal). However 4194304 * 1048576 does not equal 5767168...

How does mantissa multiplication work such that 4194304 (1.5) multiplied by 1048576 (1.125), yields 5767168 (1.6875)?

That is, how do you multiply the encoded mantissas?

I.e. how does the hardware use the encoded mantissas stored in memory to get the new value?



From the guide, multiplication of floating point numbers can be achieved as follows. I am stuck on step 2.

  1. Separate out the mantissa from the exponent
  2. Multiply (or divide) the mantissa parts together
  3. Add (or subtract) the exponents together
  4. Combine the two results into the new value
  5. Normalize the result value (optional)
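The five steps above can be sketched in Python for the question's numbers. This is a hedged illustration, not how hardware actually works: it assumes positive normal values, truncates instead of rounding, and uses `struct` only to pull the stored fields out of a float.

```python
import struct

def decode(x):
    """Split a positive single-precision float into (biased_exponent, 23-bit fraction)."""
    bits = struct.unpack('>I', struct.pack('>f', x))[0]
    return (bits >> 23) & 0xFF, bits & 0x7FFFFF

def multiply(a, b):
    ea, fa = decode(a)               # step 1: separate mantissa and exponent
    eb, fb = decode(b)
    ma = (1 << 23) | fa              # restore the hidden leading 1 bit
    mb = (1 << 23) | fb
    prod = ma * mb                   # step 2: multiply the 24-bit mantissas
    exp = (ea - 127) + (eb - 127)    # step 3: add the unbiased exponents
    # step 5: normalize -- the 47- or 48-bit product starts 1x.xx or 1.xx
    if prod >> 47:
        prod >>= 1
        exp += 1
    frac = (prod >> 23) & 0x7FFFFF   # truncate to 23 fraction bits, drop hidden 1
    # step 4: combine into a new IEEE 754 single
    bits = ((exp + 127) << 23) | frac
    return struct.unpack('>f', struct.pack('>I', bits))[0]

print(multiply(1.5, 1.125))   # 1.6875
```

Run on the encoded mantissas 4194304 and 1048576, the key move is restoring the hidden bit before multiplying; without it the integer product has no meaningful relationship to 5767168.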
Note that 1/2 and 1/8 both have the same mantissa since they are represented as 1*2^-1 and 1*2^-3, i.e. the mantissa is 1 and the exponents are -1 and -3. You actually won't even see the 1 in the IEEE 754 representations of these numbers because of the notion of the hidden bit. - Kevin Jin
Wrong thinking: 0.125 would have the same mantissa as 0.5, just a different exponent. IEEE-754 floating point is normalized (except for very small numbers). - Rudy Velthuis
Updated the example in the question. That was an error on my part. However if you look at the update, you'll see my question still holds. - Jet Blue

3 Answers

2
votes

It works exactly like grade-school math, just a lot easier, as you only have to know your ones multiplication table and remember that anything times zero is zero.

Let me take 1.5 * 3.0, single precision.

0x3FC00000  1.5
0x40400000  3.0
0x40900000  1.5 * 3.0 = 4.5

expand

0x3FC0
0011111111000000
0 01111111 1000000

0x4040
0100000001000000
0 10000000 1000000

so in binary we are multiplying 1.1 times 1.1, just like grade school except easier

  11
* 11
======
  11 
+11
======
1001

then, since we had one binary place (not decimal) from one operand and one from the other, our binary point goes here, two in from the right.

10.01

but we need to normalize that

1.001 with an increase in the exponent.

the exponents

01111111 2^0

10000000 2^1

Just like grade school we add the exponents 2^(0+1) = 2^1. Then we have the normalization of adding another shift of one to the exponent 2^(1+1) = 2^2.

0 10000001 001000....
010000001001000....
0100 0000 1001 000....

giving a result that matches what the computer produced

0x40900000

There is no magic to it.
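The worked example above can be checked directly in Python on any platform with IEEE 754 singles; `struct` exposes the bit patterns:

```python
import struct

def bits(x):
    """Return the IEEE 754 single-precision bit pattern of x."""
    return struct.unpack('>I', struct.pack('>f', x))[0]

print(hex(bits(1.5)))        # 0x3fc00000
print(hex(bits(3.0)))        # 0x40400000
print(hex(bits(1.5 * 3.0)))  # 0x40900000
```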

Multiplication in hardware is no different than grade school, it is just simpler. Look at these individual bits: a is a bit (0 or 1), b is a bit (0 or 1), and so on.

   ab
*  cd
======
   ef
  gh
======
 jklm

If d is a 0 then e and f are both zeros; if d is a 1 then ef = ab. So the equations thus far are

e = a & d
f = b & d
g = a & c
h = b & c

anything ANDed with 0 is 0; anything ANDed with 1 is itself.

then we do addition

 nop0
   ef
+ gh
======
 jklm

I am doing some heavy cheating here: two-bit addition, carry out and result, truth table:

xy cr
00 00 
01 01
10 01
11 10

the result is x xor y, the carry out is x and y

m = f
p = 0 because f+0 can't have a carry bit.
l = e xor h 
o = e & h 
k = o xor g
n = o & g
j = n

substitutions, and hopefully I don't make any (more) mistakes:

m = f = b & d
l = e xor h = (a&d) xor (b&c)
k = o xor g = (e & h) xor (a&c) = ((a&d) & (b&c)) xor (a&c)
j = n = o & g = (e & h) & (a&c) = ((a&d) & (b&c)) & (a&c)

So there are likely some optimizations there, but if you needed/wanted to compute a two-bit by two-bit multiplier in one clock cycle, there are your input and output equations. Much cheating, because if you do a 3x3 the addition gets much worse. Work your way up to even 8 bits times 8, or up to 32 bits times 32: carry bits start to pile on top of other carry bits, and the math is not something you want to attempt by hand. It is possible to do, but it creates a massive amount of logic; single-clock multipliers can consume a significant percentage of your chip. There are tricks, like using more than one clock and a pipeline to give the illusion of one clock.
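Those substituted output equations can be checked exhaustively in Python. `mul2x2` is a hypothetical helper name, a sanity check rather than a hardware description:

```python
def mul2x2(a, b, c, d):
    """Gate-level 2-bit x 2-bit multiply: operands are ab and cd,
    result bits are jklm, built from only AND and XOR."""
    m = b & d
    l = (a & d) ^ (b & c)
    k = ((a & d) & (b & c)) ^ (a & c)
    j = ((a & d) & (b & c)) & (a & c)
    return (j << 3) | (k << 2) | (l << 1) | m

# check all 16 input combinations against ordinary integer multiply
for n in range(16):
    a, b, c, d = (n >> 3) & 1, (n >> 2) & 1, (n >> 1) & 1, n & 1
    assert mul2x2(a, b, c, d) == (2 * a + b) * (2 * c + d)
print("all 16 cases match")
```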

Go back to the older days, when we simply couldn't burn that much logic: zero an accumulator, then take the ab bits, AND both with d, shift left zero, and add to the accumulator. Take the ab bits, AND both with c, shift left one, and add to the accumulator... EXACTLY like we do paper-and-pencil multiplication. If you have an 8 bit * 8 bit then it takes at least 8 clocks just for the multiply. Then normalization, etc., etc.
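That accumulator approach, shift-and-add, can be sketched in a few lines of Python standing in for the hardware loop (`shift_and_add` is my name for it, not anything standard):

```python
def shift_and_add(x, y, nbits=8):
    """Multiply two nbits-wide unsigned integers the grade-school way:
    for each set bit of y, add a shifted copy of x into an accumulator."""
    acc = 0
    for i in range(nbits):       # one clock per bit in the old hardware
        if (y >> i) & 1:         # AND x with this bit of y...
            acc += x << i        # ...shift left i, add to the accumulator
    return acc

print(hex(shift_and_add(0xC0, 0x90)))   # 0x6c00, same as 0xC0 * 0x90
```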

Anyway, you can see that multiplication in logic is no different than we learned in grade school just simpler as we are only multiplying against 0 and 1. Either you write zeros or copy down the operand as is.

Short answer: you probably missed the point that there is a hidden/implied bit; there is no reason to waste a bit in the format that could be used for precision. IEEE 754 is 1.mantissa times some power of two when normalized.

double same story, implied 1.mantissa:

0x3FF8000000000000 1.5
0x4008000000000000 3.0
0x4012000000000000 4.5

0011111111111000
0100000000001000
0100000000010010

0 01111111111 1000   1.1000...
0 10000000000 1000   1.1000...
0 10000000001 0010   1.0010....

Another way to look at this without the decimal points (or with them implied and they are not decimal points but binary points I guess).

From above, 1.10000... * 1.10000... is 0xC00000 * 0xC00000 = 0x900000000000, 48 bits. We strip off half the bits, leaving 0x900000, because we have a 1 and 47 bits of mantissa but only room for 23, so we chop off the 24 lower bits, be they zeros or any other number. It is not an accident that it works out that we keep half the bits and discard half the bits. Now, had it been 1.000... * 1.000..., we would have had 0x400000000000, a 1 and 46 bits, chopping off 23 bits rather than 24. You can do some experiments with smaller numbers of bits, but remember that, unlike any two arbitrary N-bit numbers multiplied by each other, we have an implied/fixed 1 at the top. So (1 * 2^23) * (1 * 2^23) = (1*1) * 2^(23+23) = 1 * 2^46 is always there when the mantissas are multiplied (for normal, non-zero numbers).
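This product-and-chop can be reproduced in Python (truncation only, no rounding, which happens to be exact for this example):

```python
M = 0xC00000          # 1.5 with the hidden 1 restored: 24 bits
prod = M * M          # 48-bit product of the two 24-bit mantissas
print(hex(prod))      # 0x900000000000
print(hex(prod >> 24))  # 0x900000 -- keep the top 24 bits, chop the low 24
```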

There are other floating point formats that for the most part work the same; they may be easier for negative numbers, etc. (look up the TI DSP format, for example, if you can find it; it was built for speed, not corner cases).

1
votes

You want the product (1.0*2^-1) * (1.0*2^-3).

The mantissas are 1 and 1 and the exponents are -1 and -3.

Therefore the product of the mantissas is 1 * 1 = 1 and the sum of the exponents is (-1) + (-3) = -4.

Therefore the product of the two factors is 1*2^-4 = 0.0625, according to steps 1 through 4.

There are two important things to keep in mind: (1) exponents in binary IEEE 754 floating point always represent powers of two, and (2) the integer portion of a normalized mantissa is always a 1 (hence why the first bit can be hidden in the actual binary representation of the number).


EDIT: incorporating some of my comments into this answer since the question was rewritten.

The guide you linked appears to describe the behavior but not the implementation. If the implementation were so simple, I'd imagine floating point math would be as fast as integer math.

For the mantissa, the hardware may do something like add back in the hidden bits at some point, so that you have

0b110000000000000000000000 * 0b100100000000000000000000 = 0b11011000000000000000000000000000000000000000000

This becomes 0b10110000000000000000000 after dropping the hidden bit and rounding off the trailing bits.

For the exponent, the hardware may subtract 127 from the biased exponents of both factors, perform the arithmetic, and then add 127 back to the sum or difference to re-bias the exponent.
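That bias round-trip can be illustrated for the question's 1.5 * 1.125 example; `struct` is only used here to extract the stored fields, and no normalization shift happens to be needed in this case:

```python
import struct

def fields(x):
    """Return (biased_exponent, fraction) of a single-precision float."""
    bits = struct.unpack('>I', struct.pack('>f', x))[0]
    return (bits >> 23) & 0xFF, bits & 0x7FFFFF

ea, _ = fields(1.5)     # biased exponent 127, i.e. 2^0
eb, _ = fields(1.125)   # biased exponent 127, i.e. 2^0
# un-bias both, add, then re-bias the sum
e_prod = (ea - 127) + (eb - 127) + 127
print(e_prod)                      # 127
print(fields(1.5 * 1.125)[0])      # 127 -- matches the hardware result
```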

Keep in mind that the 64-bit format is used to compactly encode the numbers in memory, but that may not be the actual representation used when performing the math. Processors may even perform intermediate math in 80-bit precision before rounding the result to 64 bits when the value is written back to memory. See x87 80-bit double-extended precision.

-1
votes

I found this video that answered my question.

To multiply the encoded values, the hardware effectively ignores the trailing zeros. A leading one bit (the hidden bit) is also put back in place.

So for 1.5, 0b11 (3) is used instead of 0b10000000000000000000000 (4194304).
Likewise for 1.125, 0b1001 (9) is used instead of 0b00100000000000000000000 (1048576).

Thus 0b11 * 0b1001 = 0b11011 (27).

If you ignore the extra leading one bit* in 0b11011 and add the trailing zeros back, you end up with 0b10110000000000000000000 (5767168).


*Something important about significant bits is explained at around 8:05. For this example, it works out that ignoring the extra leading one bit is sufficient.
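This short-form trick (my sketch of what the video describes, which only works as-is here because the product is less than 2 and needs no normalization shift) can be reproduced in a few lines of Python:

```python
# 1.5   -> hidden 1 plus fraction bits, trailing zeros stripped: 0b11   (3)
# 1.125 -> likewise:                                             0b1001 (9)
prod = 0b11 * 0b1001                                  # 0b11011 (27)
# drop the extra leading 1 (the hidden bit of the result)...
frac = prod & ((1 << (prod.bit_length() - 1)) - 1)    # 0b1011
# ...then pad the fraction back out to the 23-bit field
encoded = frac << (23 - (prod.bit_length() - 1))
print(encoded)   # 5767168
```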