82
votes
0x00000000004004b6 <main+30>:   callq  0x400398 <printf@plt>

Does anyone know?

UPDATE

Why do two runs of disas printf give me different results?

(gdb) disas printf
Dump of assembler code for function printf@plt:
0x0000000000400398 <printf@plt+0>:  jmpq   *0x2004c2(%rip)        # 0x600860 <_GLOBAL_OFFSET_TABLE_+24>
0x000000000040039e <printf@plt+6>:  pushq  $0x0
0x00000000004003a3 <printf@plt+11>: jmpq   0x400388

(gdb) disas printf
Dump of assembler code for function printf:
0x00000037aa44d360 <printf+0>:  sub    $0xd8,%rsp
0x00000037aa44d367 <printf+7>:  mov    %rdx,0x30(%rsp)
0x00000037aa44d36c <printf+12>: movzbl %al,%edx
0x00000037aa44d36f <printf+15>: mov    %rsi,0x28(%rsp)
0x00000037aa44d374 <printf+20>: lea    0x0(,%rdx,4),%rax
0x00000037aa44d37c <printf+28>: lea    0x3f(%rip),%rdx        # 0x37aa44d3c2 <printf+98>
Comment: Where did that first output line come from? objdump, I imagine? (Ciro Santilli 新疆再教育营六四事件法轮功郝海东)

2 Answers

134
votes

It's a way to get code fixups (adjusting addresses based on where code sits in virtual memory, which may be different across different processes) without having to maintain a separate copy of the code for each process. The PLT is the procedure linkage table, one of the structures which makes dynamic loading and linking easier to use.

printf@plt is actually a small stub which (eventually) calls the real printf function, modifying things on the way to make subsequent calls faster.

The real printf function may be mapped into any location in a given process (virtual address space) as may the code that is trying to call it.

So, in order to allow proper code sharing of calling code (left side below) and called code (right side below), you don't want to apply any fixups to the calling code directly since that will restrict where it can be located in other processes.

So the PLT is a smaller process-specific area at a reliably-calculated-at-runtime address that isn't shared between processes, so any given process is free to change it however it wants to, without adverse effects.


Examine the following diagram which shows both your code and the library code mapped to different virtual addresses in two different processes, ProcA and ProcB:

Address: 0x1234          0x9000      0x8888
        +-------------+ +---------+ +---------+
        |             | | Private | |         |
ProcA   |             | | PLT/GOT | |         |
        | Shared      | +---------+ | Shared  |
========| application |=============| library |==
        | code        | +---------+ | code    |
        |             | | Private | |         |
ProcB   |             | | PLT/GOT | |         |
        +-------------+ +---------+ +---------+
Address: 0x2020          0x9000      0x6666

This particular example shows a simple case where the PLT maps to a fixed location. In your scenario, it's located relative to the current program counter as evidenced by your program-counter-relative lookup:

<printf@plt+0>: jmpq  *0x2004c2(%rip)  ; 0x600860 <_GOT_+24>

I've just used fixed addressing to keep the example simpler.

The original way in which code was shared meant it had to be loaded at the same memory location in the virtual address space of every process that used it. Either that or it couldn't be shared, since the act of fixing up the single shared copy for one process would totally stuff up other processes where it was mapped to a different location.

With position-independent code, along with the PLT and a global offset table (GOT), the first call to a function such as printf@plt (in the PLT) is a multi-stage operation, in which the following actions take place:

  • You call printf@plt in the PLT.
  • It calls the GOT version (via a pointer) which initially points back to some set-up code in the PLT.
  • This set-up code loads the relevant shared library if not yet done, then modifies the GOT pointer so that subsequent calls go directly to the real printf rather than the PLT set-up code.
  • It then calls the loaded printf code at the correct address for this process.

On subsequent calls, because the GOT pointer has been modified, the multi-stage approach is simplified:

  • You call printf@plt in the PLT.
  • It calls the GOT version (via pointer), which now points to the real printf.

A good article can be found here, detailing how glibc is loaded at run time.

6
votes

I'm not sure, but what you have seen probably makes sense. The first time you run the disas command, printf has not yet been called, so it is not resolved. Once your program calls printf for the first time, the GOT is updated, printf is resolved, and the GOT points to the real function. Thus, the next disas command shows the real printf assembly.