On the 32 bit mips architecture, addresses are 32 bits wide, which gives a 4GB address space. So, a PC that is 32 bits wide and uses byte addressing can address anything in that space.
What you're thinking of is this: instead of the PC containing a byte address, whose rightmost two bits are always zero (because instructions must be 4 byte/word aligned) and so seem to be "wasted", why not have the PC contain a word address that would be left shifted by two bits to produce a 34 bit address? This would span 16GB.
But that would exceed what the mips memory system is capable of addressing, so nothing is gained: the wider resultant address can't be used by the architecture. So, with byte addresses, nothing is really wasted.
All address calculations for the entire 32 bit/4GB address space fit in 32 bit wide registers. On 64 bit architectures, the registers are 64 bit and can span a much larger range.
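To put numbers on the 34 bit idea, here is a small Python sketch of the arithmetic. The names (PC_BITS, max_word_address, max_byte_address) are just illustrative, not anything from the mips spec.

```python
# Hypothetical "PC holds a word address" scheme, for illustration only.
PC_BITS = 32

# Largest value a 32 bit PC could hold if it were a word address.
max_word_address = (1 << PC_BITS) - 1

# Shifting left by two bits turns the word address into a byte address.
max_byte_address = max_word_address << 2

print(max_byte_address.bit_length())   # 34
print((1 << 32) // 2**30, "GB vs", (1 << 34) // 2**30, "GB")   # 4 GB vs 16 GB
```

The resulting byte address needs 34 bits, which no longer fits in a 32 bit register.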
So, anyway, the PC itself holds byte addresses. But where your idea can be used, and is used, is when encoding target offsets in branch instructions. They are of the form:
00000000 beqz $t0,XXXX
00000004 nop
mips is somewhat different from other architectures here: XXXX is a signed 16 bit word offset relative to PC + 4, where PC is the address of the branch instruction itself. In this case, PC + 4 is 0x00000004. We take XXXX and sign extend it to 32 bits. Then, we left shift it by two bits. Then, we add it to PC + 4 to get the final target address of the branch. By "we", I mean the mips branch instruction hardware.
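The steps the hardware performs can be sketched in Python. This is only a model of the arithmetic, not of any real implementation; the function names sign_extend_16 and branch_target, and the sample offset 0x0003, are made up for the example.

```python
def sign_extend_16(value: int) -> int:
    """Interpret the low 16 bits of value as a signed two's complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def branch_target(branch_addr: int, imm16: int) -> int:
    """Byte address reached by a branch at branch_addr whose offset field is imm16."""
    pc_plus_4 = (branch_addr + 4) & 0xFFFFFFFF
    # Sign extend the 16 bit word offset, then shift left by two bits
    # to turn it into a byte offset.
    byte_offset = sign_extend_16(imm16) << 2
    # Add to PC + 4 and wrap to 32 bits, as a 32 bit adder would.
    return (pc_plus_4 + byte_offset) & 0xFFFFFFFF


# Branch at address 0 with a hypothetical offset field of 0x0003:
# PC + 4 is 0x4, the byte offset is 3 << 2 = 0xC, so the target is 0x10.
print(hex(branch_target(0x00000000, 0x0003)))   # 0x10
```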
Consider the reverse where we have the following program fragment:
00000000 nop
00000004 nop
00000008 loop: nop
0000000C nop
00000010 nop
00000014 beqz $t0,loop
00000018 nop
To arrive at the correct value for XXXX in the branch instruction, the assembler takes the address of the label loop and subtracts PC + 4 from it to produce the relative byte offset. Here, the address of loop is 0x00000008 and PC + 4 is 0x00000018, so we have 0x08 - 0x18, which is -0x10 or 0xFFFFFFF0. This is a byte offset, so we right shift it by two bits (an arithmetic shift, which keeps the sign) to produce a word offset: 0xFFFFFFFC. We use the lower 16 bits of this for XXXX, so we have FFFC.
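Here is the same calculation as a Python sketch. It is a rough model of what an assembler does, not actual assembler source; encode_branch_offset is a hypothetical name, and the alignment and range checks are my own assumption about sensible error handling.

```python
def encode_branch_offset(branch_addr: int, target_addr: int) -> int:
    """Return the 16 bit offset field for a branch at branch_addr to target_addr."""
    pc_plus_4 = branch_addr + 4
    byte_offset = target_addr - pc_plus_4
    if byte_offset % 4 != 0:
        raise ValueError("branch target must be word aligned")
    # Python's >> on a negative int is an arithmetic shift, so the sign is kept.
    word_offset = byte_offset >> 2
    if not -0x8000 <= word_offset <= 0x7FFF:
        raise ValueError("branch target out of range for a 16 bit word offset")
    # Keep only the lower 16 bits for the instruction's offset field.
    return word_offset & 0xFFFF


# The beqz at 0x14 branching back to loop at 0x08:
# 0x08 - 0x18 = -0x10 bytes, which is -4 words, which encodes as FFFC.
print(f"{encode_branch_offset(0x00000014, 0x00000008):04X}")   # FFFC
```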
Because branch instructions use word offsets instead of byte offsets, they don't "waste" the two "must be zero" bits. They take advantage of this to extend the reach of the branch instruction, as a byte offset from PC + 4, from -32768 to 32767 out to -131072 to 131068.
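Those range numbers fall straight out of the 16 bit field width. A quick Python check, with variable names chosen just for this example:

```python
IMM_BITS = 16   # width of the branch offset field

# Range of a signed 16 bit value, read as a word offset.
min_word = -(1 << (IMM_BITS - 1))
max_word = (1 << (IMM_BITS - 1)) - 1

# Shifting left by two bits converts the word range into a byte range.
min_byte = min_word << 2
max_byte = max_word << 2

print(min_word, max_word)   # -32768 32767
print(min_byte, max_byte)   # -131072 131068
```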