
I am studying MIPS instructions and this problem is confusing me a bit because the MIPS documentation seems to be saying something different from the answer provided. Here is the problem and answer:

What registers are referred to and/or changed in this instruction at location 0x5000?

0x5000 : 0x0140F809

answer:

Opcode=0x00, R type, Function=0x09 (jalr), Rs=10($t2)

jumps to address in $t2

puts 0x5004 in $ra
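The field breakdown in that answer can be checked by hand or with a few shifts and masks. Below is a minimal Python sketch, assuming the standard MIPS R-type layout (6-bit opcode, 5-bit rs, rt, rd, shamt, 6-bit funct); the helper name `decode_r_type` is made up for illustration.

```python
def decode_r_type(word):
    """Split a 32-bit MIPS R-type instruction word into its fields.

    Hypothetical helper; assumes the standard R-type layout:
    opcode[31:26] rs[25:21] rt[20:16] rd[15:11] shamt[10:6] funct[5:0]
    """
    return {
        "opcode": (word >> 26) & 0x3F,
        "rs": (word >> 21) & 0x1F,
        "rt": (word >> 16) & 0x1F,
        "rd": (word >> 11) & 0x1F,
        "shamt": (word >> 6) & 0x1F,
        "funct": word & 0x3F,
    }


fields = decode_r_type(0x0140F809)
print(fields)
```

For 0x0140F809 the rs field is 10 ($t2) and the funct field is 0x09 (jalr), matching the answer. The rd field comes out as 31, which is why the link address goes into $ra.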

However, the documentation says that it puts PC + 4 in register 31 ($ra). Since the instruction is executing at address 0x5000, the PC should be 0x5004, right? So shouldn't the JALR instruction put 0x5004 + 4, or 0x5008, into $ra and not 0x5004?

To me it makes sense that it should jump back to 0x5004, since that is technically the next instruction after the jump, but the documentation explicitly says R[31] = PC + 4, which would be 0x5008, so it is confusing me a little bit. Thanks!


1 Answer


The thing you have to consider is branch delay slots.

First let's handle the case where they're off. This is the default for simulators like spim and mars. Things are simple:

5000: jalr $10                      # (1) $31 will have 5004
5004: nop                           # (2) this is executed upon return

This is the way most architectures work.

But MIPS has [the aforementioned] branch delay slots.

If delay slots are enabled [in a simulator], or you are on real hardware, every transfer of control instruction (e.g. branch, jump, jal, jalr) is followed by a single instruction in the delay slot that is unconditionally executed before the branch is actually taken [or not]:

5000: jalr $10                      # (1) $31 will have 5008
5004: nop                           # (2) this is executed _before_ branch taken
5008: nop                           # (3) this is executed upon return

So, the effective execution order is actually (2), (1), (3).
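The two cases differ only in which address ends up in $31. A small Python sketch of that rule, assuming 4-byte instructions and a single delay slot; the function name `jalr_link_address` is made up for illustration and is not part of any simulator's API:

```python
def jalr_link_address(pc, delay_slots_enabled):
    """Return the address jalr stores in the link register.

    Hypothetical helper. `pc` is the address of the jalr itself.
    With delay slots, the return address skips the delay-slot instruction.
    """
    if delay_slots_enabled:
        return pc + 8  # jalr at pc, delay slot at pc + 4, return to pc + 8
    return pc + 4      # no delay slot: return to the very next instruction


print(hex(jalr_link_address(0x5000, False)))
print(hex(jalr_link_address(0x5000, True)))
```

With delay slots off this gives 0x5004, the value in the problem's answer; with them on it gives 0x5008.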

In the general case, you have a three step sequence:

5000: beqz $10,foobar               # (1) conditional branch to foobar
5004: nop                           # (2) executed _before_ branch taken
5008: nop                           # (3) executed _after_ if branch _not_ taken

Once again, the effective execution order will be (2), (1). Then, either the first instruction of foobar is executed [if the branch was taken] or the instruction at 5008 (3) will be executed if the branch is not taken.
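The same ordering can be written down as a tiny Python sketch. It only lists addresses in the effective order described above and does not model a real pipeline; the function name `effective_order` and the address 0x6000 used for foobar are assumptions for illustration.

```python
def effective_order(branch_pc, target, taken):
    """List addresses in the effective order described above.

    Hypothetical helper: delay slot first, then the branch, then either
    the branch target (taken) or the instruction after the delay slot.
    """
    delay_slot = branch_pc + 4
    fall_through = branch_pc + 8
    next_pc = target if taken else fall_through
    return [delay_slot, branch_pc, next_pc]


FOOBAR = 0x6000  # assumed address of the label foobar

print([hex(a) for a in effective_order(0x5000, FOOBAR, taken=True)])
print([hex(a) for a in effective_order(0x5000, FOOBAR, taken=False)])
```

In both cases the instruction at 5004 appears in the list; only the last entry depends on whether the branch is taken.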

Okay, you may be asking: why?

In early MIPS chips, instructions were prefetched. For example, the instruction for cycle N+1 was prefetched [and possibly predecoded] in cycle N (a one cycle delay).

So, on cycle N, the instruction execution unit is executing the instruction fetched in cycle N-1 (e.g. 5000), the instruction prefetch unit is fetching the next instruction (at 5004). They overlap with the one cycle delay. In cycle N+1, the execution unit is executing the prefetched instruction (at 5004) and the prefetch unit is prefetching the next instruction (at 5008).
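The overlap is easier to see as a table. Here is a rough Python sketch for straight-line code only (no branches), assuming 4-byte instructions and the one-cycle prefetch described above; `pipeline_timeline` is a made-up name and this is not a model of any specific MIPS chip.

```python
def pipeline_timeline(start_pc, cycles):
    """Return (cycle, executing, prefetching) rows for straight-line code.

    Hypothetical helper: in each cycle the execution unit runs one
    instruction while the prefetch unit fetches the next one.
    """
    rows = []
    for n in range(cycles):
        executing = start_pc + 4 * n
        prefetching = executing + 4
        rows.append((n, executing, prefetching))
    return rows


for cycle, executing, prefetching in pipeline_timeline(0x5000, 3):
    print(f"cycle N+{cycle}: execute {executing:#x}, prefetch {prefetching:#x}")
```

If the instruction at 0x5000 is a branch, the instruction at 0x5004 has already been fetched by the time the branch executes, which is the situation the delay slot exploits.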

This works great until a conditional transfer of control instruction is encountered.

Without the delay slot, the processor would have to stall, and the instruction after the branch, which was prefetched during the same cycle in which the branch executed, would be wasted. With delay slot execution, you can usually fill the slot with something useful, so the prefetch needn't be wasted.

But it does make things a bit more complicated.