0
votes

I am studying computer architecture (MIPS architecture) and read the following statements:

1. Branch instructions have a 16-bit signed word offset field that allows a branch to an address + or - 128 kBytes (+0x1FFFC to -0x20000) from the current location.

2. A jump instruction specifies an address within the current 256 MByte (0x0FFFFFFC) region specified by the program counter's most significant 4 bits.

I understand the concept of jump range described above but how are the three numbers 0x0FFFFFFC, 0x1FFFC and 0X20000 calculated using "the range of 256Mbyte" and "the range of +-128 kbytes"?

Thanks!

2
1 word = 4 bytes. So obviously a 16 bit signed word offset is +/- 128k and I hope you have no problem converting that into hex. As for the 256MByte, that is not an offset. So the biggest jump you can make is of course 256M-4. - Jester

2 Answers

3
votes

The other answers didn't really answer your question of how these hex values are calculated, so here's my answer.

Thinking about this is much easier in binary than in hex, because the 2-bit left shift is the key to the concept: shifting left by 2 bits multiplies by 4. That doesn't show up as nicely in hex, since each hex digit covers 4 bits (16 values), but I'll try to explain it anyway:

0x20000

1. Branch instructions use a 16-bit immediate field. Together with the two 5-bit register fields (RS, RT) and the 6-bit opcode, that makes 32 bits (https://en.wikibooks.org/wiki/MIPS_Assembly/Instruction_Formats#I_Format).

Those 16 bits are signed; they can be positive or negative.

That gives you an effective range of -(2^15) == -32768

to +(2^15 - 1) == 32767

MIPS multiplies these address fields by 4, forcing the result to be word aligned.

So your minimum value, -(2^15), multiplied by 4 (4 = 2^2) is -(2^15 * 2^2) = -(2^(15+2)) = -(2^17) == -131072

In binary (signed 2's complement): 1000 0000 0000 0000 << 2 == 10 0000 0000 0000 00[00]

Converting that to hex, 10 = 2 and each 0000 = 0, which gives 2 0 0 0 0 == 0x20000 as the magnitude; since the sign bit was set, the offset is -0x20000.

This is sign extended before it is added to (PC+4):

So for, say, instruction #32770: PC = 0x00420008, (PC+4) = 0x0042000C

0x0042000C - 0x20000 = 0x0040000C, instruction #3 (remember, the offset is relative to PC+4)

#32770 + 1 - 32768 == 3
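
The sign extension and the add to PC+4 described above can be sketched in a few lines of Python. This is only an illustration; the helper names `sign_extend16` and `branch_target` are made up for this example and are not part of any toolchain.

```python
def sign_extend16(value):
    """Interpret the low 16 bits of value as a two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def branch_target(pc, imm16):
    """Branch target: (PC + 4) plus the sign-extended immediate times 4."""
    return (pc + 4 + (sign_extend16(imm16) << 2)) & 0xFFFFFFFF


# Most negative immediate (0x8000), using the example PC from the text.
offset = sign_extend16(0x8000) << 2
print(hex(offset))                                   # -0x20000
print(f"0x{branch_target(0x00420008, 0x8000):08X}")  # 0x0040000C
```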

0x1FFFC

Same for the maximum value, (2^15 - 1), multiplied by 4 (4 = 2^2): (2^15 - 1) * 4 = 2^(15+2) - (1 * 4)

which becomes (2^17 - 4) == 131068. In binary: 0111 1111 1111 1111 << 2 == 01 1111 1111 1111 11[00]

Converting that to hex, 01 = 1, each 1111 = F and 1100 = C, which gives 1 F F F C == 0x1FFFC
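
Both branch limits fall out of the same two numbers, the field width and the shift. A quick sketch to check the arithmetic (the constant names are chosen for this example only):

```python
FIELD_BITS = 16   # width of the signed immediate field
SHIFT = 2         # word offset -> byte offset (multiply by 4)

min_words = -(1 << (FIELD_BITS - 1))      # -32768
max_words = (1 << (FIELD_BITS - 1)) - 1   #  32767

min_bytes = min_words << SHIFT            # -131072
max_bytes = max_words << SHIFT            #  131068

print(hex(min_bytes), hex(max_bytes))     # -0x20000 0x1fffc
```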

Note that the offset needs to be added to the current (Program Counter + 4).

So for, say, instruction #32770: PC = 0x00420008, (PC+4) = 0x0042000C

0x0042000C + 0x1FFFC = 0x00440008, instruction #65538 (remember, the offset is relative to PC+4)

#32770 + 1 + 32767 == 65538
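
The instruction numbers in these examples only work out if the code starts at 0x00400000 and instructions are counted from 0. That base address is an assumption inferred from the numbers, not something stated above. Under that assumption, the bookkeeping looks like this:

```python
TEXT_BASE = 0x00400000  # ASSUMED start of the code, inferred from the examples


def instruction_number(address, base=TEXT_BASE):
    """Zero-based index of the 4-byte instruction at address."""
    return (address - base) // 4


def address_of(number, base=TEXT_BASE):
    """Address of the instruction with the given zero-based index."""
    return base + 4 * number


pc = address_of(32770)
print(f"0x{pc:08X}")                         # 0x00420008
print(instruction_number(pc + 4 - 0x20000))  # 3
print(instruction_number(pc + 4 + 0x1FFFC))  # 65538
```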

0x0FFFFFFC

2. Now, jumps use a 26-bit address field, which becomes a 28-bit address after the shift.
Also note that jumps use an absolute address, not an offset.

The maximum 26-bit value is (2^26 - 1) == 67108863, 0x03FFFFFF

Shifted left by 2 (*4) it becomes 28 bits: (2^26 - 1) * 4 == 2^28 - 4 == 268435452, 0x0FFFFFFC
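
The same arithmetic for the jump field, as a small sketch (constant names are chosen for this example only). It also shows where the "256 MByte" figure comes from: 2^28 bytes.

```python
FIELD_BITS = 26   # width of the jump address field in the instruction
SHIFT = 2         # word address -> byte address (multiply by 4)

max_field = (1 << FIELD_BITS) - 1         # largest value the field can hold
max_offset = max_field << SHIFT           # largest address within the region
region_size = 1 << (FIELD_BITS + SHIFT)   # bytes covered by one region

print(max_field, f"0x{max_field:08X}")    # 67108863 0x03FFFFFF
print(max_offset, f"0x{max_offset:08X}")  # 268435452 0x0FFFFFFC
print(region_size // (1024 * 1024))       # 256
```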

But what about the missing four bits? They come from the PC, which by the Memory stage has already been incremented to (PC+4).

For instruction #32770: PC = 0x00420008, (PC+4) = 0x0042000C

0x0042000C in binary is [0000] 0000 0100 0010 0000 0000 0000 1100

0x0FFFFFFC in binary is [####] 1111 1111 1111 1111 1111 1111 1100; it is only 28 bits (27:0) and is missing bits 31:28.

Taking those bits from PC+4, we get:

 0000 ---- ---- ---- ---- ---- ---- ---- (PC+4)
 ---- 1111 1111 1111 1111 1111 1111 1100 (Target-Address)
-----------------------------------------
 0000 1111 1111 1111 1111 1111 1111 1100 (Jump-Address)

(In this case the result happens to be the same as zero-extending the 28-bit value, because the top four bits of PC+4 are 0000.)
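
The bit splicing in the diagram can be sketched as a mask-and-OR. The helper name `jump_target` is made up for this example, and the second PC value is hypothetical, picked only to show a case where the top four bits are not 0000.

```python
def jump_target(pc, field26):
    """Combine the top 4 bits of PC+4 with the 26-bit field shifted left by 2."""
    upper = (pc + 4) & 0xF0000000          # bits 31:28 come from PC+4
    lower = (field26 & 0x03FFFFFF) << 2    # bits 27:0 come from the instruction
    return upper | lower


# Example from the text: the top bits of PC+4 are 0000.
print(f"0x{jump_target(0x00420008, 0x03FFFFFF):08X}")  # 0x0FFFFFFC

# Hypothetical PC in another 256 MByte region: the top bits are kept.
print(f"0x{jump_target(0x90420008, 0x03FFFFFF):08X}")  # 0x9FFFFFFC
```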

For a fuller explanation of how the addresses are calculated, see "How to Calculate Jump Target Address and Branch Target Address?"

1
votes

Why don't you just ask a tested and debugged toolchain, then compare that to the documentation?

so.s

four:
nop
nop
nop
j one
nop
j two
nop
j three
nop
j four
nop
nop
nop
nop
nop
one:
nop
two:
nop
nop
three:
nop

build and disassemble

mips-elf-as so.s -o so.o
mips-elf-objdump -D so.o

so.o:     file format elf32-bigmips


Disassembly of section .text:

00000000 <four>:
    ...
   8:   0800000f    j   3c <one>
   c:   00000000    nop
  10:   08000010    j   40 <two>
  14:   00000000    nop
  18:   08000012    j   48 <three>
  1c:   00000000    nop
  20:   08000000    j   0 <four>
  24:   00000000    nop
    ...

0000003c <one>:
  3c:   00000000    nop

00000040 <two>:
    ...

00000048 <three>:
  48:   00000000    nop

link to some address and disassemble

00001000 <_ftext>:
    ...
    1008:   0800040f    j   103c <one>
    100c:   00000000    nop
    1010:   08000410    j   1040 <two>
    1014:   00000000    nop
    1018:   08000412    j   1048 <three>
    101c:   00000000    nop
    1020:   08000400    j   1000 <_ftext>
    1024:   00000000    nop
    ...

0000103c <one>:
    103c:   00000000    nop

00001040 <two>:
    ...

00001048 <three>:
    1048:   00000000    nop
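
To check the linked listing by hand, the j words can be decoded with a few lines of Python. This is a rough sketch; `decode_j` is a name invented for this example, and it relies on j having opcode 2 in the top six bits.

```python
def decode_j(word, pc):
    """Return the jump target encoded in a MIPS j instruction word."""
    if word >> 26 != 2:                      # opcode 2 is j
        raise ValueError("not a j instruction")
    field = word & 0x03FFFFFF                # low 26 bits
    return ((pc + 4) & 0xF0000000) | (field << 2)


# (address, instruction word) pairs copied from the linked disassembly.
listing = [
    (0x1008, 0x0800040F),
    (0x1010, 0x08000410),
    (0x1018, 0x08000412),
    (0x1020, 0x08000400),
]
for pc, word in listing:
    print(f"{pc:x}: {word:08x} j {decode_j(word, pc):x}")
# 1008: 0800040f j 103c
# 1010: 08000410 j 1040
# 1018: 08000412 j 1048
# 1020: 08000400 j 1000
```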

So jumps are super easy. What about branches?

four:
nop
nop
nop
beq $10,$11,one
nop
beq $10,$11,four
nop
nop
nop
one:
nop

assemble and disassemble

00000000 <four>:
    ...
   8:   114b0006    beq $10,$11,24 <one>
   c:   00000000    nop
  10:   114bfffb    beq $10,$11,0 <four>
  14:   00000000    nop
    ...

00000024 <one>:
  24:   00000000    nop

Some experience helps here. First, going forward: 0x24 - 0x8 = 0x1C. These are fixed 32-bit instructions, so it is unlikely they would waste the two low bits and cut the range; 0x1C >> 2 = 7. The encoding has a 6. Well, it is also likely they are thinking in terms of the PC already having been incremented; another way to look at this is 6 (+1) instructions ahead: 0xC, 0x10, 0x14, 0x18, 0x1C, 0x20, 0x24. That would imply going backward is (0x00 - (0x10 + 4)) >> 2 = (0x00 - 0x14) >> 2 = 0xFFFF...FFEC >> 2 = 0xFF...FFFB (an arithmetic shift), and sure enough the low 16 bits, 0xFFFB, are what we get.
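
The same reasoning can be run in the other direction, decoding the two beq words from the listing. A minimal sketch, with `decode_beq` as a made-up helper name; the field positions are the standard I-format layout (6-bit opcode, 5-bit rs, 5-bit rt, 16-bit immediate).

```python
def decode_beq(word, pc):
    """Split a beq word into (rs, rt, signed immediate, target address)."""
    rs = (word >> 21) & 0x1F
    rt = (word >> 16) & 0x1F
    imm = word & 0xFFFF
    if imm & 0x8000:                 # sign-extend the 16-bit immediate
        imm -= 0x10000
    target = pc + 4 + (imm << 2)     # offset is relative to PC + 4
    return rs, rt, imm, target


print(decode_beq(0x114B0006, 0x08))  # (10, 11, 6, 36)  -> 36 == 0x24
print(decode_beq(0x114BFFFB, 0x10))  # (10, 11, -5, 0)
```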

So for branches you take

((destination - (current address + 4))/4)&0xFFFF = 
(((destination - current address)/4) - 1)&0xFFFF

For jumps, immediate = destination[27:2], and the processor rebuilds the target as {pc[31:28], immediate, 2'b00}
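
As a sanity check, here is a sketch of the encoding side, tested against the values from the disassembly. The function names are invented for this example. The jump field is taken to be destination bits 27..2, which is what the assembler output shows (0x103C >> 2 == 0x40F).

```python
def encode_branch_imm(destination, current):
    """16-bit branch immediate: word distance from (current + 4)."""
    return ((destination - (current + 4)) >> 2) & 0xFFFF


def encode_jump_imm(destination):
    """26-bit jump field: destination bits 27..2."""
    return (destination >> 2) & 0x03FFFFFF


print(hex(encode_branch_imm(0x24, 0x08)))  # 0x6
print(hex(encode_branch_imm(0x00, 0x10)))  # 0xfffb
print(hex(encode_jump_imm(0x103C)))        # 0x40f
```

Note that the jump encoding only reaches the destination if its top four bits match those of PC+4; the sketch does not check that.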

You should be able to figure out the ranges from that information.

The key to the encoding is that the instructions are fixed at 32 bits and aligned on 32-bit boundaries, so the two lsbits are always zero, along with the math associated with them. So why cut your range down by a factor of 4 to store zeros? You don't; you efficiently pack the offsets into the immediate. Some (fixed-length) instruction sets don't do that, but they generally have a reason not to as part of the design.

In general a debugged assembler, if you have access to one, is going to provide more useful information than an instruction set reference; this is based on experience learning many instruction sets. If you are the first one to write an assembler for some processor, then that means you work there or have direct access to the designers of the processor, and you can simply ask them the math rather than rely on the not-yet-written manual, which they will write after the chip has taped out, which is too late, as you/they need the assembler to validate the design. So: emails, Skype calls, and most importantly whiteboard discussions of the instruction encoding. You might also have access to the chip source code and/or a simulator, so you can run your code, see it execute in the sim (examine the waveforms), see where it branches to (where it fetches), change the immediate, and look at where it fetches.

Basically you should in general always have access to a resource with the answer that can help explain a manual lacking some detail. Granted sometimes you get a good manual...(and you should still verify that with the resource).