I feel I don't really understand the concepts of overflow and underflow, so I'm asking this question to clarify them. I need to understand them at the most basic level, with bits. Let's work with a simplified 1-byte floating point representation: 1 sign bit, 3 exponent bits, and 4 mantissa bits:
0 000 0000
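To make the layout concrete, here's how I split a byte into the three fields in Python (the helper name fields is my own, just for illustration):

def fields(byte):
    # Split an 8-bit integer into (sign, exponent, mantissa) for the 1-3-4 layout.
    return byte >> 7, (byte >> 4) & 0b111, byte & 0b1111

print(fields(0b0_000_0000))   # (0, 0, 0)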
The max exponent we can store is 111_2 = 7; minus the bias K = 2^2 − 1 = 3, that gives 4, and this value is reserved for Infinity and NaN. The exponent for the largest finite number is therefore 3, which is 110 in offset binary.
So the bit patterns for the maximum positive and negative numbers are:
0 110 1111 // positive
1 110 1111 // negative
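Here's a minimal Python sketch of a decoder for this 1-3-4 format (my own helper, assuming the bias of 3 derived above); it confirms that the max value is (1 + 15/16) × 2^3 = 15.5:

def decode(bits):
    # Decode a toy float given as a string like '0 110 1111'.
    sign, exp, mant = bits.split()
    s = -1 if sign == '1' else 1
    e, m = int(exp, 2), int(mant, 2)
    if e == 0b111:                            # reserved: Infinity and NaN
        return s * float('inf') if m == 0 else float('nan')
    if e == 0:                                # subnormal: implicit 0, exponent 1 - 3
        return s * (m / 16) * 2 ** (1 - 3)
    return s * (1 + m / 16) * 2 ** (e - 3)    # normal: implicit 1, bias 3

print(decode('0 110 1111'))   # 15.5
print(decode('1 110 1111'))   # -15.5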
When the exponent field is zero, the number is subnormal and has an implicit leading 0 instead of 1. So the bit patterns for the minimum-magnitude nonzero numbers are:
0 000 0001 // positive
1 000 0001 // negative
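Reusing the decode sketch from above, the minimum positive value comes out to (1/16) × 2^(1−3) = 2^−6:

print(decode('0 000 0001'))   # 0.015625 = 2**-6
print(decode('1 000 0001'))   # -0.015625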
I've found these descriptions for single-precision floating point:
Negative numbers less than −(2 − 2^−23) × 2^127 (negative overflow)
Negative numbers greater than −2^−149 (negative underflow)
Positive numbers less than 2^−149 (positive underflow)
Positive numbers greater than (2 − 2^−23) × 2^127 (positive overflow)
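To check that I copied those boundary values correctly, here's a quick verification against real single-precision bit patterns using Python's struct module (the hex constants are the standard float32 max-finite and min-subnormal patterns):

import struct

max_f32 = struct.unpack('<f', b'\xff\xff\x7f\x7f')[0]   # bit pattern 0x7f7fffff
print(max_f32 == (2 - 2**-23) * 2**127)                 # True, about 3.4028e+38

min_f32 = struct.unpack('<f', b'\x01\x00\x00\x00')[0]   # bit pattern 0x00000001
print(min_f32 == 2**-149)                               # True, about 1.4013e-45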
Of these, I understand only positive overflow, which results in +Infinity; an example would be:
0 110 1111 + 0 110 1111 = 0 111 0000
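Decoding that example with the sketch above: both operands are 15.5, their exact sum 31.0 is far beyond the largest finite value 15.5, so the result rounds to +Infinity, whose pattern is 0 111 0000:

a = decode('0 110 1111')      # 15.5
print(a + a)                  # 31.0 -- not representable in this format,
                              # so it rounds to +Infinity
print(decode('0 111 0000'))   # inf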
Can anyone please demonstrate the three other cases for overflow and underflow using the bit patterns I outlined above?