2
votes

I'm trying to understand relocations in ELF, but I'm having some trouble with the documentation, which is rather cryptic. The relocation equations, for instance, use three parameters: S, A and P. I get that A is just the addend, a number used in the relocation calculation, and that S is the "Value of the symbol whose index resides in the relocation entry" (which is the same as the function name, right?). But what about P? The manual describes it as "the place of the storage unit being relocated", but what does that even mean?

I just found an example to illustrate this: Suppose we have 2 object files, obj1.o and obj2.o. The first one references a function called foo() which is located inside obj2.o.

objdump -d obj1.o yields:

Disassembly of section .text:
00000000 <func>:
0:   55                      push   %ebp
1:   89 e5                   mov    %esp,%ebp
3:   83 ec 08                sub    $0x8,%esp
6:   e8 fc ff ff ff          call   7 <func+0x7>
b:   c9                      leave  
c:   c3                      ret   

Now, readelf shows that this is an R_386_PC32 relocation, whose equation is S + A - P.

After linking the two files into a fully-fledged executable, relocated, the relocation has apparently been applied and the call patched:

objdump -d relocated
test:     file format elf32-i386
Disassembly of section .text:
080480d8 <func>:
80480d8:   55                      push   %ebp
80480d9:   89 e5                   mov    %esp,%ebp
80480db:   83 ec 08                sub    $0x8,%esp
80480de:   e8 05 00 00 00          call   80480e8 <foo>
80480e3:   c9                      leave  
80480e4:   c3                      ret    
80480e5:   90                      nop
80480e6:   90                      nop
80480e7:   90                      nop
080480e8 <foo>:
80480e8:   55                      push   %ebp
80480e9:   89 e5                   mov    %esp,%ebp
80480eb:   5d                      pop    %ebp
80480ec:   c3                      ret

So it seems that the linker performed the calculation S + A - P as 0x80480e8 + 0xfffffffc - 0x80480df = 0x5 (modulo 2^32), which matches the 05 00 00 00 in the patched call.
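The arithmetic can be checked with a few lines of Python. This is only a sketch: the helper name r_386_pc32 is made up for illustration, and the three values are the ones read off the two disassembly listings above.

```python
import struct

MASK32 = 0xFFFFFFFF  # relocation arithmetic wraps at 32 bits on i386


def r_386_pc32(s: int, a: int, p: int) -> int:
    """Return S + A - P truncated to 32 bits (hypothetical helper)."""
    return (s + a - p) & MASK32


S = 0x080480E8  # address of foo in the linked file
A = 0xFFFFFFFC  # addend: the fc ff ff ff bytes in obj1.o, i.e. -4
P = 0x080480DF  # address of the 4-byte field right after the e8 opcode

value = r_386_pc32(S, A, P)
patched = struct.pack("<I", value)  # little-endian, as stored in the file

print(hex(value))        # 0x5
print(patched.hex(" "))  # 05 00 00 00
```

The printed bytes match the displacement in the linked call instruction at 80480de.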

My questions are:

  • Where's the value of P coming from?
  • What's the point of having an addend?

2 Answers

0
votes

P stands in for the program counter, so this is a PC-relative relocation. I didn't check exactly what ELF32 uses as the reference point, but judging from the fc ff ff ff = -4 in the un-linked call rel32, it's probably the start of the 4-byte displacement. In machine code, relative jumps like call rel32 use the end of the instruction (i.e. the start of the next instruction) as the base, so that would explain the 4-byte offset.

That's one use-case for the addend.
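As a rough check of that reasoning, the sketch below decodes the displacement bytes from the question. It assumes the usual i386 convention that the addend is stored in the field being relocated, which is what the fc ff ff ff suggests; the variable names are purely illustrative.

```python
import struct

# Displacement bytes of the call before linking (obj1.o) and after linking.
unlinked = bytes.fromhex("fcffffff")
linked = bytes.fromhex("05000000")

# "<i" = little-endian signed 32-bit integer.
addend = struct.unpack("<i", unlinked)[0]
rel32 = struct.unpack("<i", linked)[0]

call_address = 0x080480DE            # address of the e8 opcode in the linked file
next_instruction = call_address + 5  # 1 opcode byte + 4 displacement bytes

# The CPU adds rel32 to the address of the next instruction.
target = next_instruction + rel32

print(addend)       # -4
print(hex(target))  # 0x80480e8
```

The computed target is the address of foo in the linked listing, so the -4 addend is exactly what bridges the gap between the start of the displacement and the end of the instruction.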

Another is PC-relative addressing of static data in position-independent code. The PC reference point might be nearby but not inside the instruction that uses it, or you might want to index into a global array.

So you might have something like

call get_eip_into_ebx                           # %ebx = address of the next instruction
this_instruction:
mov table - this_instruction + 40(%ebx), %ecx   # load the dword at table+40

Or for a real example, look at what gcc and clang do with -m32 -fPIE to load a global. (But the global offset table symbol names get special handling, so I'm not going to reproduce the compiler asm output.)

0
votes

S = 0x80480e8 is the symbol's starting address (the entry address of foo()).
P = 0x80480df is the address of the value that needs to be modified (the place being relocated).

So S - P is the distance (in bytes) between these two addresses.

However, the call rel32 instruction measures the distance (rel32) from the start of the next instruction, which is 4 bytes past P, hence the addend of -4.
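To tie this back to where P comes from: a linker derives it from the relocation entry's r_offset plus the final address of the section the entry applies to. A minimal sketch, assuming obj1.o's .text ends up at 0x80480d8 (where func starts in the linked listing) and that r_offset is 7 (the byte after the e8 opcode at offset 6). Neither value is quoted from readelf output in the question, so treat both as assumptions.

```python
text_base = 0x080480D8  # assumed final address of obj1.o's .text
r_offset = 0x7          # assumed offset of the call's displacement field

P = text_base + r_offset  # place being patched
S = 0x080480E8            # address of foo
next_instruction = P + 4  # the displacement field is 4 bytes long

# The CPU needs S - next_instruction, which equals S + A - P with A = -4.
A = -4
assert S - next_instruction == S + A - P

print(hex(P))     # 0x80480df
print(S + A - P)  # 5
```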