6
votes

I have boot-up code for a bare-metal ARM target, written in assembly, and I'm trying to understand how it works. The binary is stored in external flash and copies parts of itself to RAM at boot-up. I still don't fully understand the concept of relocation in this context, even though I read the Wikipedia entry on it. RAM is mapped to a low address window and flash to a high address window. Can someone explain to me why we test the value of the link register here?

/* Test if we are running from an address, we are not linked at */
       bl check_position
 check_position:
        mov     r0, lr                  
        ldr     r1, =check_position
        cmp     r0, r1                  /* ; don't relocate during debug */
        beq     relocated_entry 
2
Thank you for the two excellent answers! I would accept both if I could, since one explains the goal of the code (the JTAG program loader assumption is right) and the second how exactly it works. - Étienne

2 Answers

5
votes

My guess is that the application runs from RAM, and that when debugging, the author uses some sort of bootloader and/or JTAG to load the test app directly into RAM. In that case there is no reason to copy and run, and doing so could cause a crash.

Another reason to do something like this is to avoid an infinite loop. If, for example, you want to boot from flash (you usually have to) but execute from RAM, the simplest approach is to copy the whole flash, or some chunk of it, to RAM and branch to the start of RAM. Doing that means you hit the "copy the app to RAM and branch" code again. To avoid running it a second time (which might crash), you need some sort of "am I running from flash or not?" test.
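The copy-then-branch pattern and its guard can be sketched roughly as follows. This is a minimal sketch in GNU assembler syntax, not the original author's code. It assumes the image is linked to run from RAM, its size is a multiple of 4 bytes, the flash and RAM windows do not overlap, and caches are off. `__image_end` is a hypothetical linker-script symbol marking the end of the image at its link address.

```asm
        .text
        .arm
        .global _start
_start:
        adr     r0, _start          @ r0 = address we are executing from (PC-relative)
        ldr     r1, =_start         @ r1 = address we were linked at (literal pool)
        cmp     r0, r1
        beq     relocated_entry     @ already at the link address: skip the copy

        ldr     r2, =__image_end    @ hypothetical symbol: end of image at link address
copy_loop:
        ldr     r3, [r0], #4        @ read one word from where we are running, advance
        str     r3, [r1], #4        @ write it to the link address, advance
        cmp     r1, r2
        blo     copy_loop           @ repeat until the whole image has been copied

        ldr     pc, =_start         @ absolute jump to the start of the RAM copy

relocated_entry:
        b       relocated_entry     @ placeholder for the real startup code

        .ltorg                      @ emit the literal pool here
```

After the jump, the same test runs again from RAM. This time both addresses match, so the copy is skipped instead of repeating forever.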

4
votes

Can anyone explain to me why we test the value of the link register here?

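The bl check_position places the address of the following instruction (the address of the bl plus 4) in the link register and transfers control to check_position. The branch target is also encoded PC-relative (see the ARM documentation for bl). So far everything is PC-relative.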

The ldr r1,=check_position gets a value from the literal pool (Ref1). The actual code looks like this:

  ldr r1,[pc, #offset]
...
  offset:
    .long check_position   @ absolute address, fixed at assembly/link time

So R0 contains the PC-relative (run-time) address and R1 contains the absolute (link-time) address. Here, they are compared. You could also use arithmetic to calculate the difference and branch to the relocation code if it is non-zero, or possibly copy the code to its absolute destination (Ref2). If the code is running at the linked address, then R0 and R1 are the same. This is some pseudo code for bl:

 mov lr,pc               ; pc is actually two instructions ahead.
 add pc,pc,#branch_offset-8

The key is that BL does everything based on the PC, including the update of lr. Instead of using this trick, we could use mov R0,PC, except that the PC reads 8 bytes ahead, so the value would have to be adjusted. Another alternative would be to use adr R0,check_position, which would get the assembler to do all the address math for us.

 /* Test if we are running from an address, we are not linked at */
 check_position:
    adr    r0, check_position
    ldr    r1, =check_position
    cmp    r0, r1                  /* ; don't relocate during debug */
    beq    relocated_entry 

On ARMv6T2 or later (ARMv7, for example), where movw and movt are available, a version without the literal pool might look like this,

 /* Test if we are running from an address, we are not linked at */
 check_position:
    adr    r0, check_position
    movw   r1, #:lower16:check_position
    movt   r1, #:upper16:check_position
    cmp    r0, r1                  /* ; don't relocate during debug */
    beq    relocated_entry 

In both cases, the code is more straightforward, smaller by one word, and doesn't overwrite the lr register, so lr could be used for other purposes.
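The two other options mentioned above, reading the PC directly and computing the difference rather than only comparing, can be sketched as follows. This is an illustrative sketch in GNU assembler syntax for ARM (not Thumb) state and is not taken from the original boot code. The `not_relocated` label is a hypothetical placeholder, and what to do with the offset in r2 is left open.

```asm
        .text
        .arm                            @ ARM state: reading pc yields this instruction + 8
check_position:
        mov     r0, pc                  @ r0 = address of this mov + 8
        sub     r0, r0, #8              @ r0 = run-time address of check_position
        ldr     r1, =check_position     @ r1 = link-time address (from the literal pool)
        subs    r2, r0, r1              @ r2 = run-time minus link-time address; Z set if equal
        beq     relocated_entry         @ zero offset: already at the linked address

        @ Non-zero offset: r2 is the displacement a relocation routine would use.
not_relocated:
        b       not_relocated           @ hypothetical placeholder: this sketch stops here

relocated_entry:
        b       relocated_entry         @ placeholder for the rest of the startup code

        .ltorg                          @ literal pool for the ldr = above
```

Compared with the adr form, this costs one extra instruction, which is one reason adr is the more direct choice.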

Ref1: See Arm op-codes and .ltorg in gnu-assembler manual
Ref2: This is exactly what the Linux head.S is doing for the ARM.

Edit: I checked the ARM ARM, and the PC is apparently the current instruction + 8, which goes to show why the code was written like this. I think the adr version is more direct and readable, but the adr pseudo-op is not used as often, so people may not be familiar with it.