Why do we typically adjust the smaller exponent to match the larger?
Adjusting the smaller value can be less work than adjusting the larger one, because it allows simplifications that are not available the other way around.
"then adjust the mantissa accordingly" has more to it than only a shift.
Consider the addition/subtraction of normalized a, b with n-bit significands and expo(a) >= expo(b).
All n bits of the significand of a are used.
The exponent of b is made the same as the larger exponent of a, and the significand of b is shifted right, but not all of it needs to be explicitly remembered. Besides the b bits that remain aligned with a, only 2 of the shifted-out bits are remembered, plus the “or” of all the other bits shifted out (these are often called the guard, round, and sticky bits).
Example: b shifted (right) n-6 places.
1.23456789….......n
a.aaaaaaaa…aaaaaaaa 000
0.00000000…00bbbbbb bbz (z is the “or” of all the less significant bits)
Now the addition/subtraction can be carried out using n+3+1 bit math (the extra +1 is for overflow). The 2 shifted-out bits and the z are sufficient under all rounding modes to form the expected sum/difference.
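A minimal sketch of that idea, assuming positive operands, addition only, and round-to-nearest-even (the other rounding modes and subtraction are not covered here). The function names are hypothetical and Python integers stand in for the hardware registers. It compares the sum formed from the 2 kept bits plus z against a sum formed with every bit of b retained.

```python
def align_smaller(sig_b: int, shift: int) -> int:
    """Shift sig_b right by `shift`, keeping 2 shifted-out bits and a sticky bit z."""
    ext = sig_b << 3                      # make room for 2 kept bits + z
    if shift == 0:
        return ext
    lost = ext & ((1 << shift) - 1)       # everything that falls off the end
    sticky = 1 if lost else 0             # z: "or" of all the lost bits
    return (ext >> shift) | sticky


def round_nearest_even(value: int, n: int):
    """Round value to n significant bits; return (significand, bits dropped)."""
    drop = max(value.bit_length() - n, 0)
    if drop == 0:
        return value, 0
    sig = value >> drop
    rem = value & ((1 << drop) - 1)
    half = 1 << (drop - 1)
    if rem > half or (rem == half and (sig & 1)):
        sig += 1
        if sig.bit_length() > n:          # rounding carried into a new bit
            sig >>= 1
            drop += 1
    return sig, drop


def add_with_sticky(sig_a: int, sig_b: int, shift: int, n: int):
    """Sum using only n+3+1 bits; exponent is relative to the LSB of a."""
    total = (sig_a << 3) + align_smaller(sig_b, shift)
    sig, drop = round_nearest_even(total, n)
    return sig, drop - 3


def add_exact(sig_a: int, sig_b: int, shift: int, n: int):
    """Reference sum that keeps every bit of b (width grows with shift)."""
    total = (sig_a << shift) + sig_b
    sig, drop = round_nearest_even(total, n)
    return sig, drop - shift
```

For any normalized n-bit `sig_a`, `sig_b` and any `shift >= 0`, `add_with_sticky` and `add_exact` should return the same pair, even though the first never holds more than n+4 bits.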
Without this simplification, integer math much wider than n+3 bits is needed, perhaps even 100s of bits.
Example: a shifted (left) n-6 places.
aa aaaaaaaa a.aaaaaa00…00000000
b.bbbbbbbb…bbbbbbbb
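To put rough numbers on the width, here is a small sketch that assumes IEEE 754 binary32 and binary64 parameters and considers only normal operands (subnormals would widen it further). The helper name `exact_width` is hypothetical. It counts the bits needed when a is shifted left until its least significant bit lines up with that of b, plus one bit for carry.

```python
def exact_width(n: int, emax: int, emin: int) -> int:
    """Worst-case integer width when the larger operand is shifted left."""
    max_shift = emax - emin        # largest exponent difference between normals
    return n + max_shift + 1       # n bits of a, shifted, +1 for carry


def sticky_width(n: int) -> int:
    """Width when the smaller operand is shifted right with 2 kept bits + z."""
    return n + 3 + 1


# binary32: n = 24, exponents -126..127; binary64: n = 53, exponents -1022..1023
for name, n, emax, emin in (("binary32", 24, 127, -126),
                            ("binary64", 53, 1023, -1022)):
    print(name, exact_width(n, emax, emin), sticky_width(n))
```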