How is the pseudo instruction load address translated to real instructions?

Question

I have looked at similar questions, but they don't address the questions below.

When programming in MIPS assembly, there is a pseudo instruction called load address (la).

I'm coming from higher level languages so bear with me if I'm not down the rabbit hole yet.

In looking up what this pseudo instruction translates to, I found in a MIPS wiki the following:

la $a0,address

translates to

lui $at, 4097 (0x1001 → upper 16 bits of $at).
ori $a0,$at,disp

where the immediate (“disp”) is the number of bytes between the first data location (always 0x 1001 0000) and the address of the first byte in the string.

Elsewhere I found another expansion:

# la $t, A 
lui $t, A_hi
ori $t, $t, A_lo

Fundamentally, I think I understand the latter code. We want to place a 32-bit address in register $t, but an I-type instruction only has room for a 16-bit immediate, so we need two instructions: one for the upper 16 bits and one for the lower 16 bits.

How does this second translation relate to the first one above?

Also, as a more practical matter, say I have the following assembly program:

.data 
  prompt1: .asciiz "How old are you?"

.text 
  main:
   la $a0, prompt1

How would I replace the la instruction with the real instructions?

Note: I can see what a simulator translates it to:

lui $1, 4097
ori $4, $1, 0

But in this case, it's using register $1 which is the register reserved for the assembler. I wouldn't be able to make use of this had I written the instructions myself right? Also, the lui is setting the upper bits to 1001. I know that static data starts at 0x10010000, so the string labeled prompt1 will start there. But (and this is the part where maybe I haven't grasped just how in control the programmer is expected to be in assembly), if I had another label prompt2 right after prompt1, would I be expected to know exactly how many bytes it comes after the first label, so as to correctly choose the immediate constant to put in the ori?

EDIT: In order to contextualize what I am doing, here is some code (by the way this is literally the first assembly program I am writing myself):

.data 
  prompt1: .asciiz "How old are you?"
  response1: .asciiz "You are "

.text 
  main:
   ori $v0, $zero, 4           # syscall number for printing a string

   lui $a0, 4097               # load the first address of static data segment

   syscall                     # prompt the user

   ori $v0, $zero, 5           # syscall number for reading an integer

   syscall                     # read integer

   or $t0, $v0, $zero          # save read integer to temp register

   lui $a0, 4097               # load the first address of static data segment
   ori $a0, $a0, 17            # response1 starts at byte 17: 16 characters of prompt1 plus its NUL terminator

   ori $v0, $zero, 4           # syscall number for printing a string; response1

   syscall                     # print response1

   ori $v0, $zero, 1           # syscall number for printing an integer

   or $a0, $zero, $t0          # place the age the user typed in into $a0 for syscall

   syscall                     # print the age the user typed in

   ori $v0, $zero, 10          # syscall number for exiting

   syscall

Yes, you got everything correct. Obviously the reason for la and using an assembler (and linker) is that you don't have to hand code addresses. If you insist then it's up to you to use the proper values. Instead of $1 you can use the destination register directly, as in your "elsewhere" sample. — Jester
$1 is reserved for the assembler is only in the case that you use pseudo instructions. Otherwise you can count on $1 being yours to use for any scratch/temp purpose the same as any of the other volatile/caller-preserved registers (e.g. $t0..). Compilers don't use pseudo instructions, thus consider $1 as another temp/scratch register the same as the others. RISC V has done away with the "reserved for assembler" register, so you have to write some codes as two instructions yourself in assembler as there are a few missing pseudo instructions compared to MIPS. — Erik Eidt
Assuming I don't want to use pseudo instructions, what is the idiom here? For a more concrete context, say I want to load the address of the prompt1 so I can make a syscall that prints that data out to a console. @Jester you say I should use the assembler so I don't have to hand code stuff; so a pro assembly programmer would always use the pseudo instruction with a reference to the label? He would never have to actually code in the addresses? — evianpring
You need to write lui and ori, and one syntax is that you specify %hi(label) as the operand for lui and %lo(label) as the operand for ori, and those will build the proper immediates to assemble the 32-bit offset across two instructions. Actual syntax varies.. — Erik Eidt
If you’re using an assembler that doesn’t support %hi and %lo (or the equivalent), you really should just use the pseudoinstructions. Trying to keep track of the offsets of labels by hand is pretty error prone; it’s one of the main functions of an assembler. — prl
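
To make the bookkeeping behind the hand-coded offsets concrete, here is a minimal Python sketch of what the assembler tracks for you. It assumes the data segment starts at 0x10010000 (as in the simulator output quoted in the question), that .asciiz strings are packed back to back with no alignment padding, and that each one gets a trailing NUL byte. The function name layout_asciiz is made up for this illustration.

```python
DATA_BASE = 0x10010000  # assumed start of static data, as in the question


def layout_asciiz(strings, base=DATA_BASE):
    """Return {label: address} for consecutive .asciiz strings (no padding assumed)."""
    addresses = {}
    offset = 0
    for label, text in strings:
        addresses[label] = base + offset
        # .asciiz stores the characters plus one NUL terminator byte
        offset += len(text.encode("ascii")) + 1
    return addresses


labels = layout_asciiz([("prompt1", "How old are you?"), ("response1", "You are ")])
for name, addr in labels.items():
    print(f"{name}: 0x{addr:08x} (offset {addr - DATA_BASE})")
```

Under those assumptions response1 lands at offset 17, not 16: counting only the 16 visible characters of prompt1 would point at its terminator, which is exactly the kind of slip the comments warn about.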

old_timer · Accepted Answer · 2019-12-23T06:09:47

Sounds like you have it, but there may be a missing link you are overlooking.

First off, assembly language is defined by the assembler, the program, not the target. So there could potentially be as many different MIPS assembly languages as there are folks willing to write assemblers. There aren't, fortunately, but there are some variations. Most of the places they vary are not the mnemonics/instructions, which in the case of MIPS include the pseudo instruction la. But as shown in the comments, things like %hi and %lo and .asciiz are the kind of thing that does not necessarily carry across all assemblers for MIPS, nor does it need to so long as la does. The $a0, $v0 register names are not required either.

A pseudo instruction in this case means that the assembler replaces it with real instructions. The assembler's job is to make real instructions (machine instructions/code), or do the best it can. A toolchain will ideally include a compiler, an assembler and a linker: the C compiler turns the C code into assembly, the assembler turns that into an object, and the linker takes one or more objects and links them into a binary that has ideally resolved all externals (labels).

Different instruction sets have different features/rules. Some specifically talk about addressing modes, some do not. But addresses drive some percentage of the work. When you write C code, the names of functions, and initially the names of variables, become labels, and labels are addresses. An optimizer may remove the memory location for some of these things, and as a result the label and its address go away, but if it does not, they are addresses. So when you have a call to a function, the address needs to be figured out by the toolchain at build time (there are relocation exceptions, but in those cases the toolchain still figured it out relative to a base address that the relocation code has to patch for the toolchain's output to work).

Sometimes the addresses are pc-relative. The program counter is an internal register (or these days a set of registers) that keeps track of where the program is. As a programmer reading some listing:

00000000 <.text>:
   0:   e3a01001    mov r1, #1
   4:   e3a02004    mov r2, #4
   8:   e0813002    add r3, r1, r2

(this is intentionally not MIPS)

So as programmers we think that at address 0 there is an instruction mov r1,#1, and we then think of the program counter in relation to that address 0. In some instruction sets the program counter is a register we can access directly as a named register; in others you cannot access it directly but perhaps indirectly with a special instruction; and in some you cannot get at it with an instruction at all, but you can still have pc-relative addressing in some form or fashion.

As you have seen with MIPS, it is not uncommon for there to be a limited number of bits available in particular instructions for immediates: constants within the instruction that provide a value to the instruction as a number. In the listing above, the last so many bits of the first two instructions, 1 and 4, are the values in the mov. But with MIPS being a fixed-length instruction set at 32 bits, you can't have a 32-bit constant and also have opcode bits. So you have to find some solution to deal with loading constants.

Some instruction sets are variable length, meaning they might have a one-byte-long instruction, think x86. Others are fixed length, think MIPS, ARM, risc-v, although all three of those have different sized instructions and different ways to use the different sized instructions, but their core instruction sets are/were fixed 32-bit instructions. What you would end up with in many of the variable-length instruction sets is this: say the address was 0x12345678, as the toolchain, likely the linker at this point, figured out where things were being placed. Let's say GG and JJ are the opcode bytes for some instruction to load a constant into a specific register. At this point this is simply a constant; it is no longer an address, we just need those bits in the register:

 0xGG 0xJJ 0x12 0x34 0x56 0x78

might be that instruction.

Other instruction sets will try to find what is sometimes called a pool and place the constants nearby. You will often see this with fixed-length instruction sets, and depending on the instruction set you can sometimes code it yourself.

   ldr r0,=labelname
   nop
   b somewhere

is technically assembler-specific (not target-specific) pseudo code for a particular instruction set. The assembler sees that there is an unconditional branch, which means that unless the programmer is doing something hacky, you can't execute the byte(s) after that branch. And let's state that this label labelname is external: it is not found in the code being assembled at this time into this object. So the toolchain is going to have to fill it in later. The assembler will take all of this information and at assemble time provide a place where the linker can fill in the address once known:

00000000 <.text>:
   0:   e59f0004    ldr r0, [pc, #4]    ; c <.text+0xc>
   4:   e1a00000    nop         ; (mov r0, r0)
   8:   eafffffe    b   0 <somewhere>
   c:   00000000    andeq   r0, r0, r0

This is the disassembly of the OBJECT, which is not linked and, at least for disassembling purposes, uses a base address of zero; once linked this code would most likely not live at address zero. But at address/offset 0xC there are zeros that, once linked, will be filled in with an address. A pc-relative addressing mode is used, which means that at the time this instruction is executed, math is done on the program counter to produce an address, that address is read, and the contents of that address are used, in this case put in the general purpose register r0. (Most instruction sets don't have an always-zero register like MIPS and risc-v, which was heavily influenced by MIPS, so r0 here is a general purpose register, not the always-zero register.) How that math works for this instruction set such that 4 is the right value is a longer discussion.

It is not the simulator that turns la into one or more instructions, it is the assembler. The simulator you are using first has to assemble the code into machine code, then it can simulate those instructions. This is the case be it a simulator or a real processor (okay, sure, someone could create one that doesn't make machine code out of it but just parses and simulates from the assembly language, fine, but in general).

As you have figured out, the MIPS solution for general constants is a pair of instructions: one instruction loads the upper half of the register and makes the lower half zeros, then you can use ori or add to change the lower half of the register.

la $2,0x12345678
la $2,0x12340000
la $2,0x00005678
la $2,0x10000008

If I use a/the gnu cross assembler (part of binutils)(relatively easy to come by for the major operating systems):

mips-elf-as so.s -o so.o
mips-elf-objdump -D so.o

gives

Disassembly of section .text:

00000000 <.text>:
   0:   3c021234    lui $2,0x1234
   4:   34425678    ori $2,$2,0x5678
   8:   3c021234    lui $2,0x1234
   c:   24025678    li  $2,22136
  10:   3c021000    lui $2,0x1000
  14:   34420008    ori $2,$2,0x8

The 0x12345678 value, in which every nibble is non-zero, took two instructions as expected. The 0x12340000 took one. The 0x00005678 (22136, why do disassemblers do this? who knows) is one instruction; note it is displayed as li, neither lui nor ori, and the encoding 0x24025678 is an addiu from the zero register. And the 0x10000008 took two, also as expected.

Also note this assembler did not use the scratch register. Also note that this assembler optimized those pseudo instructions into a mixture of solutions: it did try to use one instruction where possible. It didn't have to. There isn't a rule; the assembler could have always encoded an lui followed by an ori or add, and it could have used a second scratch register or not. Your research found the use of another register as a solution.
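
As a rough illustration of that choice, here is a small Python sketch that packs I-type instruction words and picks one or two instructions for la with a constant. The opcodes and field layout are the standard MIPS32 ones; the policy in expand_la_constant is my own guess at a reasonable rule that happens to reproduce the four encodings in the listing above, not the actual logic inside the GNU assembler. In particular, using ori from $0 for values 0x8000 through 0xFFFF is an assumption of this sketch.

```python
LUI, ORI, ADDIU = 0x0F, 0x0D, 0x09  # MIPS32 primary opcodes


def encode_i(opcode, rs, rt, imm):
    """Pack a MIPS I-type word: opcode(6) rs(5) rt(5) immediate(16)."""
    return (opcode << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


def expand_la_constant(rt, value):
    """Hypothetical policy for `la rt, constant`; returns a list of instruction words."""
    hi, lo = (value >> 16) & 0xFFFF, value & 0xFFFF
    if hi == 0:
        # Fits in 16 bits. addiu sign-extends its immediate, so only use it below 0x8000.
        if lo < 0x8000:
            return [encode_i(ADDIU, 0, rt, lo)]
        return [encode_i(ORI, 0, rt, lo)]
    if lo == 0:
        # lui alone already zeroes the lower half.
        return [encode_i(LUI, 0, rt, hi)]
    return [encode_i(LUI, 0, rt, hi), encode_i(ORI, rt, rt, lo)]


for value in (0x12345678, 0x12340000, 0x00005678, 0x10000008):
    words = expand_la_constant(2, value)
    print(f"0x{value:08x}:", " ".join(f"{w:08x}" for w in words))
```

The printed words can be compared directly against the hex column of the objdump output.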

Hopefully your brain is putting some of these things together: okay, so if the address is external and not known until link time, then is it possible to optimize? And even worse, if it is possible to optimize, then doesn't that change the number of instructions, and thus the size of the object, and thus the size of the program, making all the addresses that follow this instruction possibly 4 lower, which every so often will make an address that got lucky, 0x12340000, become 0x1233FFFC and now take two instructions instead of one? Yes, all of that can happen, but toolchains deal with it. Let's try. I feel it is very good to just know what you are looking at; without having to run any code, you can learn a bunch about the toolchain and the instruction set:

la $2,some_ext_label

Disassembly of section .text:

00000000 <.text>:
   0:   3c020000    lui $2,0x0
   4:   24420000    addiu   $2,$2,0

At the object level the assembler sees this as an external label and cannot determine if there is an optimization, so it pretty much needs to encode the basic two instructions. Note that the actual values are left as zeros: to complete the task it needs to put something there, so in this case it just puts zeros.

Now to link this I need an actual label, so:

.globl some_ext_label
add $3,$4,$5
some_ext_label:
add $3,$4,$5
add $4,$5,$6

build it, ignore the linker warning about _start:

mips-elf-as so.s -o so.o
mips-elf-as ex.s -o ex.o
mips-elf-ld -Ttext=0x1000 so.o ex.o -o so.elf
mips-elf-objdump -D so.elf

gives:

Disassembly of section .text:

00001000 <_ftext>:
    1000:   3c020000    lui $2,0x0
    1004:   2442100c    addiu   $2,$2,4108
    1008:   00851820    add $3,$4,$5

0000100c <some_ext_label>:
    100c:   00851820    add $3,$4,$5
    1010:   00a62020    add $4,$5,$6

As the linker put the objects together starting at the specified address, the label some_ext_label landed at address 0x0000100C. Then the linker went back and, through object file information/communication between the tools, patched up the instructions that needed their external address resolved. And note that if we had used la with a constant 0x0000100C, we know this assembler would have optimized it, but since the constant was not known until link time, after the assembler had finished and made an object, it would have been difficult to optimize that instruction out because of the effect that would have on all the other offsets and addresses across the binary.

It needed to be able to deal with full 32 bit values:

mips-elf-as so.s -o so.o
mips-elf-as ex.s -o ex.o
mips-elf-ld -Ttext=0x87654444 so.o ex.o -o so.elf
mips-elf-objdump -D so.elf

87654444 <_ftext>:
87654444:   3c028765    lui $2,0x8765
87654448:   24424450    addiu   $2,$2,17488
8765444c:   00851820    add $3,$4,$5

87654450 <some_ext_label>:
87654450:   00851820    add $3,$4,$5
87654454:   00a62020    add $4,$5,$6

See how easy it is to examine this stuff without actually having to run code.
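
One detail worth checking by hand: the linked code uses lui + addiu, and addiu sign-extends its 16-bit immediate. So when the lower half has bit 15 set, the upper half handed to lui has to be one larger to compensate. The Python sketch below shows that arithmetic; the function names are made up for illustration, and the last test address (0x12348000) is my own example, not one from the listings.

```python
def sign_extend16(x):
    """Interpret the low 16 bits of x as a signed value, as addiu does."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def split_for_addiu(address):
    """Split a 32-bit address into (hi, lo) halves for a lui + addiu pair."""
    lo = address & 0xFFFF
    # Adding 0x8000 bumps hi by one exactly when lo will be treated as negative.
    hi = ((address + 0x8000) >> 16) & 0xFFFF
    return hi, lo


def rebuild(hi, lo):
    """What the register holds after lui hi; addiu lo."""
    return ((hi << 16) + sign_extend16(lo)) & 0xFFFFFFFF


for address in (0x0000100C, 0x87654450, 0x12348000):
    hi, lo = split_for_addiu(address)
    assert rebuild(hi, lo) == address
    print(f"0x{address:08x} -> lui 0x{hi:04x}, addiu 0x{lo:04x}")
```

The first two addresses give the same halves as the linked output above. With lui + ori no adjustment is needed, because ori zero-extends its immediate.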

Note that even a local label might not be resolved by the assembler:

la $3,hello
add $5,$6,$7
add $5,$6,$7
add $5,$6,$7
hello:
add $5,$6,$7
add $5,$6,$7
add $5,$6,$7


00000000 <hello-0x14>:
   0:   3c030000    lui $3,0x0
   4:   24630014    addiu   $3,$3,20
   8:   00c72820    add $5,$6,$7
   c:   00c72820    add $5,$6,$7
  10:   00c72820    add $5,$6,$7

00000014 <hello>:
  14:   00c72820    add $5,$6,$7
  18:   00c72820    add $5,$6,$7
  1c:   00c72820    add $5,$6,$7

That is at the object level. The linker is going to replace those bits, so for whatever reason the assembler has put bits in (the offset 0x14) that make it more confusing for the first-time viewer:

mips-elf-ld -Ttext=0x12345678 so.o -o so.elf
mips-elf-objdump -D so.elf

Disassembly of section .text:

12345678 <_ftext>:
12345678:   3c031234    lui $3,0x1234
1234567c:   2463568c    addiu   $3,$3,22156
12345680:   00c72820    add $5,$6,$7
12345684:   00c72820    add $5,$6,$7
12345688:   00c72820    add $5,$6,$7

1234568c <hello>:
1234568c:   00c72820    add $5,$6,$7
12345690:   00c72820    add $5,$6,$7
12345694:   00c72820    add $5,$6,$7

The linker changed the 0x00000014 into the actual value once determined.
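
A simplified model of that patch-up, as a Python sketch: treat the immediates already sitting in the lui/addiu pair as an offset (the 0x14 above), add the address where the section was placed, and write the two halves back into the instruction words. This is only my reading of what the listings show, under the assumption that the in-place offset is relative to the start of the section; real linkers work from relocation records in the object file, and relocate_hi_lo is a made-up name.

```python
def sign_extend16(x):
    """Interpret the low 16 bits of x as a signed value, as addiu does."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def relocate_hi_lo(lui_word, addiu_word, base):
    """Patch a lui/addiu pair so it loads base + (offset stored in the pair)."""
    offset = ((lui_word & 0xFFFF) << 16) + sign_extend16(addiu_word)
    target = (base + offset) & 0xFFFFFFFF
    lo = target & 0xFFFF
    hi = ((target + 0x8000) >> 16) & 0xFFFF  # compensate for addiu sign extension
    # Keep opcode and register fields, replace only the 16-bit immediates.
    return (lui_word & 0xFFFF0000) | hi, (addiu_word & 0xFFFF0000) | lo


patched = relocate_hi_lo(0x3C030000, 0x24630014, 0x12345678)
print(" ".join(f"{w:08x}" for w in patched))
```

Feeding it the two object-file words and the -Ttext address gives the same two words as the linked disassembly.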

Yes, I am in no way trying to make a usable program that won't crash, it is up to the programmer ultimately to make sane programs. The tools are simply doing what I told them to do and I told them to take short instruction sequences that don't make much sense and don't terminate cleanly, etc, and just put them together. Even the four la instructions above, if COMPILED in a high level language:

unsigned int fun ( void )
{
    unsigned int a;

    a = 0x12345678;
    a = 0x12340000;
    a = 0x00005678;
    a = 0x10000008;
    return(a);
}

(optimized of course) gives

Disassembly of section .text:

00000000 <fun>:
   0:   3c021000    lui $2,0x1000
   4:   03e00008    jr  $31
   8:   24420008    addiu   $2,$2,8

easier to read with arm:

Disassembly of section .text:

00000000 <fun>:
   0:   e3a00281    mov r0, #268435464  ; 0x10000008
   4:   e12fff1e    bx  lr

The compiler optimized out the other three operations as dead code. But assemblers generally, as a rule, do exactly what you told them to do. In the case of pseudo instructions, as you are asking about, it is up to the assembler authors to choose to optimize, and there are some assembly languages that are more vague than others, less explicit, that allow the assembler more room to choose the instructions. As we saw above, the assembler did not optimize out those four instructions even though as programmers we see that each instruction overwrites bits we had just put in that register and the end result is 0x10000008.

MIPS is pretty explicit, but even in assembly language:

lui $2,0x1000
addiu $2,$2,8
jr $31

I asked for that, and without any command line arguments I get this:

00000000 <.text>:
   0:   3c021000    lui $2,0x1000
   4:   03e00008    jr  $31
   8:   24420008    addiu   $2,$2,8

If I don't have the processor set for a branch shadow (branch delay slot), then I need to tell the assembler not to do that, or write code such that the assembler doesn't screw me over.

Also note in this case that the assembler chose to use lui + ori, while the compiler chose to use lui + addiu. Or actually let's test the assembler:

la $2,0x10000008
jr $31

00000000 <.text>:
   0:   3c021000    lui $2,0x1000
   4:   03e00008    jr  $31
   8:   34420008    ori $2,$2,0x8

It was likely that two different individuals or teams did the port to MIPS.

I was going to go and show other instruction sets and how they can be vague in not necessarily giving you complete control over the exact instructions chosen, but that is perhaps just more of a tangent.

Assembly language is defined by the assembler. In this case, if you are using SPIM, that is an assembler, let's say a linker, and an instruction set simulator.

The assembler is the program that reads the text and turns it into machine code.

Having that job, the assembler turns real and pseudo instructions into machine code. So it is the assembler, at assembly time, that turns la into the instruction pair if needed, or into a single instruction if the assembler was programmed to look for an optimization and chose a single instruction that functionally works.

Labels are addresses. When a label is used with la, it is an absolute value, not a pc-relative value, so depending on the tool the assembler may or may not be able to resolve the address for this label, and may have to (or desire to) leave a two-instruction placeholder for the linker to fill in once the address is known.

This is perhaps the missing link in your understanding; correct me if I am wrong, I have no problem deleting this answer if it is off track. But a label is an address, and an address is ultimately just bits, so at the end of the day the difference between:

la $5,0x12345678

and

la $5,some_label

is when the tools know what the bit pattern is, whether they can optimize it into one instruction, and when they place the bits into the machine code so that it is complete and can be executed.

Addresses, floating point numbers, signed integers, unsigned integers, pointers, ascii characters: these are all simply bit patterns to the processor. These terms mean something to the programmer, but not to the processor and not to the machine code.

The label becomes a bit pattern, and the bit pattern is encoded in the instruction. If there is an opportunity to optimize and the tool has been programmed to do it, then it may. If it is not programmed to do it, or the opportunity is not there or requires a significant amount of work/risk, then it might not.
