I am designing a floating point unit in SystemVerilog that takes two 32-bit inputs that are in IEEE754 format, adds them together, and outputs the result in the same 32-bit IEEE754 format.
My question is, how can I tell if my result needs to be normalized?
I realize this is when you need to move the "leftmost" 1 to the correct bit, which should be bit 23 (counting from bit 0).
What I'm having a hard time wrapping my head around is how to identify where the correct "leftmost" 1 is, so I can shift the mantissa and increment/decrement the exponent bits appropriately.
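For what it's worth, I know how to locate a leftmost 1 in general with a priority scan, something like the sketch below (the name, the 26-bit width, and the function form are placeholders, not my actual code); what I can't figure out is which bit index counts as "correct" after the add, and which bits I should even be scanning over.

```
// Sketch: return the index of the leftmost 1 in a bit vector.
// The width is a placeholder for whatever the adder produces.
function automatic int leading_one_index(input logic [25:0] bits);
  for (int i = 25; i >= 0; i--)
    if (bits[i]) return i;
  return -1; // all zeros: the sum itself is zero
endfunction
```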
If my understanding is correct, addition should have the following process (roughly sketched in code after the list).
- Separate the bits into sign, exponent, and mantissa
- Prepend a '1' to the mantissas
- Compare exponents and add the difference to the smaller exponent
- Shift the mantissa of the operand with the smaller exponent to the right by that difference to line up the binary points correctly
- Perform binary addition
- Normalize the result if necessary
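Here is a stripped-down sketch of how I understand steps 1-5 in SystemVerilog (signal and module names are placeholders; sign handling, rounding, and special values like zero/Inf/NaN are left out):

```
// Sketch of steps 1-5: unpack, restore hidden bit, align, add.
// Placeholder names; signs, rounding, and special values omitted.
module fp_add_steps_sketch (
  input  logic [31:0] a, b,
  output logic [24:0] man_sum,  // 24-bit aligned add plus 1 carry bit
  output logic [7:0]  exp_big   // exponent both operands are aligned to
);
  logic        sign_a, sign_b;
  logic [7:0]  exp_a, exp_b, exp_diff;
  logic [23:0] man_a, man_b;    // hidden 1 + 23 fraction bits

  always_comb begin
    // 1) separate the fields (signs unused in this fragment)
    {sign_a, exp_a} = a[31:23];
    {sign_b, exp_b} = b[31:23];
    // 2) prepend the implied 1
    man_a = {1'b1, a[22:0]};
    man_b = {1'b1, b[22:0]};
    // 3-4) shift the operand with the smaller exponent right
    if (exp_a < exp_b) begin
      exp_diff = exp_b - exp_a;
      man_a    = man_a >> exp_diff;
      exp_big  = exp_b;
    end else begin
      exp_diff = exp_a - exp_b;
      man_b    = man_b >> exp_diff;
      exp_big  = exp_a;
    end
    // 5) binary addition (same-sign case only)
    man_sum = {1'b0, man_a} + {1'b0, man_b};
  end
endmodule
```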
I believe I have every step except the normalizing part correct. My problem is, how can I identify that the result is not normalized if all I have are bits?
I know that it is not normalized if the result is not 1.(fraction),
i.e. 10.10101 * 2^1 should be normalized to 1.010101 * 2^2, and 0.1001 * 2^2 should be normalized to 1.001 * 2^1.
Specifically, I guess I'm trying to ask how I can keep track of where the binary point is after adding two numbers.
For example: Adding input a: 0x3fc00000 (1.5) and b: 0x40500000 (3.25)
a = 0 | 0111 1111 | (1)100 0000 0000 0000 0000 0000
b = 0 | 1000 0000 | (1)101 0000 0000 0000 0000 0000
exponent of a is less than b by a difference of 1, so:
a = 0 | 1000 0000 | 0(1)10 0000 0000 0000 0000 0000
b = 0 | 1000 0000 | (1)101 0000 0000 0000 0000 0000
adding the mantissas will give us a result of
1 0011 0000 0000 0000 0000 0000
Here we see the "leftmost" 1 at bit 24 as opposed to bit 23, so we shift the mantissa right by 1 and increment the exponent to normalize the result. Then we drop the "leftmost" 1 because it is implied in IEEE754 format, and we get:
0 | 1000 0001 | 001 1000 0000 0000 0000 0000 (4.75) as our final output which is correct.
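As a sanity check, this case in a quick testbench looks like this (assuming a top-level module named fp_adder with ports a, b, and sum, which are placeholder names for mine):

```
// Quick check of the 1.5 + 3.25 example above.
module tb;
  logic [31:0] a, b, sum;
  fp_adder dut (.a(a), .b(b), .sum(sum));  // placeholder module/port names
  initial begin
    a = 32'h3fc00000;  // 1.5
    b = 32'h40500000;  // 3.25
    #1;
    $display("sum = %h (expect 40980000 = 4.75)", sum);
  end
endmodule
```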
Given this example, I thought I simply had to check for the following cases:
- If bit 24 of the mantissa is 1, shift the mantissa right and increment the exponent
- Else, if bit 23 is 1, no normalization is needed
- Else, if bit 22 is 1, shift the mantissa left and decrement the exponent
However, I'm only finding this to be true for some cases. What am I missing?
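For reference, the check I coded up is essentially this (a paraphrased sketch using the 25-bit sum from the first example, not my exact source):

```
// Sketch of the three-case check above, on a 25-bit sum where
// bit 24 is the carry and bit 23 is the hidden-bit position.
module normalize_sketch (
  input  logic [24:0] man_sum,   // adder output: carry + hidden + fraction
  input  logic [7:0]  exp_big,   // exponent the operands were aligned to
  output logic [22:0] frac_out,  // 23-bit fraction field of the result
  output logic [7:0]  exp_out
);
  always_comb begin
    if (man_sum[24]) begin            // carry out: 1x.xxx...
      frac_out = man_sum[23:1];       // shift right by one,
      exp_out  = exp_big + 8'd1;      // increment the exponent
    end else if (man_sum[23]) begin   // already 1.xxx...: normalized
      frac_out = man_sum[22:0];
      exp_out  = exp_big;
    end else begin                    // bit 22 set: 0.1xx..., shift left,
      frac_out = {man_sum[21:0], 1'b0};
      exp_out  = exp_big - 8'd1;      // decrement the exponent
    end
  end
endmodule
```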
In my implementation I made a 26-bit value to hold the sum of the two mantissas, which I'm not sure is correct. Bit 25 is the sign of the mantissa sum, which I don't think I really need, and bits 24 and 23 are the hidden bits, i.e. bits that won't be included in the final output.
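Concretely, the declaration looks something like this (placeholder name; the comments are how I'm interpreting each bit):

```
// 26-bit register for the signed mantissa sum:
//   bit 25    - sign of the mantissa sum (probably unnecessary)
//   bits 24:23 - hidden/carry bits, not part of the final output
//   bits 22:0 - candidate fraction field
logic signed [25:0] man_sum;
```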
For example: 0x449ebbc8 (1269.868163) + 0xc60eb709 (-9133.758561) gives me the following mantissa:
11 0111 1010 1101 1111 1001 0000 (notice this is 26 bits, 25:0)
If I followed the previous cases, the "leftmost 1" bit excluding the sign bit would be bit 24, meaning I would shift the mantissa right and increment the exponent. However, the correct answer is the opposite! The "true" leftmost 1 is actually at bit 22, meaning I should shift left and decrement instead, giving me the final output of:
1 | 10001011 | 111 0101 1011 1111 0010 0000 (-7863.8906) which is correct.
Similarly, adding 0x45c59cbd and 0xc473d9dc gives a mantissa of
01 1010 0111 0010 0001 1000 0010, where the "true" leftmost 1 is not the one at bit 24 but the one at bit 23, so no normalization is needed.
Why did I need to worry about bit 24 in the first case but not in the other two? Is it because the operands have opposite signs in the other cases? An overflow problem? Or is there something else I'm fundamentally missing?
Thanks for the help and sorry if the formatting is poor!