Hi, I am new to assembly language and am confused by the syntax of the lea instruction in a piece of code I am studying (generated with the gdb command disassemble main):

lea    0xa8e96(%rip),%rsi        # 0x4aa5df

The syntax I have seen for lea is

lea src, dest 

But there seems to be an additional immediate value (# 0x4aa5df) following the %rsi register. How should I interpret this correctly?

Edit: I have checked the value stored in the %rip register which is

(gdb) p /x $rip 
$1 = 0x401730

So adding this to 0xa8e96 gives me 0x4aa5c6, which does not match 0x4aa5df. Am I missing something here?

That is just a comment by your friendly disassembler. It tells you what the calculated address is (rip + 0xa8e96 = 0x4aa5df). It's not part of the instruction. - Jester
Hi, thank you for the reply. I checked the value stored in the %rip register, but unfortunately the results do not match and I am not sure why. - Zhezhong Jiang
You want to use the address of the lea instruction for what the processor will take as %rip during that instruction's execution. You don't want any old value in %rip, as %rip changes at every instruction; it would only make sense if you are currently stopped at the lea, in which case the value will be the address of the lea instruction. - Erik Eidt
Just in general for x86, the operands can get rather fancy with respect to calculating an address. The general syntax is base(rb, re, n) (where you can leave out some of these fields if they aren't needed), and the calculated address is base + rb + n * re, where base and n are immediate values and rb and re are the values held in those registers. - Unn
@ErikEidt Actually, rip already points past the lea, so the address of the next instruction should be used. - Jester
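
To make the addressing form from the comments concrete, here is a minimal C sketch of the calculation. The function name effective_address and its parameter names are hypothetical, chosen for illustration. They follow the more common naming disp(base, index, scale), where disp is what the comment above calls base, base is rb, index is re, and scale is n.

```c
#include <stdint.h>

/* Hypothetical helper: address computed by an AT&T-syntax operand
 * of the form disp(base, index, scale).
 * disp  - signed constant encoded in the instruction
 * base  - value held in the base register
 * index - value held in the index register
 * scale - constant multiplier (1, 2, 4 or 8 on x86) */
uint64_t effective_address(int64_t disp, uint64_t base,
                           uint64_t index, unsigned scale)
{
    /* Unsigned arithmetic wraps modulo 2^64, which matches how
     * 64-bit address arithmetic behaves. */
    return base + index * scale + (uint64_t)disp;
}
```

For example, effective_address(-8, 0x1000, 3, 8) evaluates to 0x1010. An operand such as -0x8(%rsi,%rdi,8), with argv in %rsi and argc in %rdi, computes the address of argv[argc-1] in the same way.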

1 Answer

Thanks for the help from Jester, Unn and Erik. The original C code I used is:

#include <stdio.h>

int main(int argc, char** argv)
{    
    int ret = printf("%s\n", argv[argc-1]);
    argv[0] = '\0'; // NOOP to force gcc to generate a callq instead of jmp
    return ret;
}

And the assembler code shown by gdb is:

(gdb) disassemble main
Dump of assembler code for function main:
=> 0x0000000000401730 <+0>:     endbr64
   0x0000000000401734 <+4>:     push   %rbx
   0x0000000000401735 <+5>:     movslq %edi,%rdi
   0x0000000000401738 <+8>:     mov    %rsi,%rbx
   0x000000000040173b <+11>:    xor    %eax,%eax
   0x000000000040173d <+13>:    mov    -0x8(%rsi,%rdi,8),%rdx
   0x0000000000401742 <+18>:    lea    0xa8e96(%rip),%rsi        # 0x4aa5df
   0x0000000000401749 <+25>:    mov    $0x1,%edi
   0x000000000040174e <+30>:    callq  0x44bbe0 <__printf_chk>
   0x0000000000401753 <+35>:    movq   $0x0,(%rbx)
   0x000000000040175a <+42>:    pop    %rbx
   0x000000000040175b <+43>:    retq
End of assembler dump.

So rip does point past the lea instruction, and the address that should be used in the computation is 0x0000000000401749, the address of the next instruction. Adding 0xa8e96 to it gives the address in the comment, 0x4aa5df.
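
The same arithmetic can be written as a small C sketch. The function name rip_relative_target is hypothetical and used only for illustration. It assumes the displacement is a signed 32-bit value added to the address of the instruction that follows the lea.

```c
#include <stdint.h>

/* Hypothetical helper: address that a RIP-relative operand resolves to.
 * next_insn - address of the instruction AFTER the one being decoded,
 *             which is the value %rip holds while that instruction runs
 * disp      - signed 32-bit displacement encoded in the instruction */
uint64_t rip_relative_target(uint64_t next_insn, int32_t disp)
{
    /* Sign-extend the displacement to 64 bits before adding. */
    return next_insn + (uint64_t)(int64_t)disp;
}
```

With the values from the dump, rip_relative_target(0x401749, 0xa8e96) evaluates to 0x4aa5df, which matches the disassembler's comment. Passing 0x401730 (the value of $rip while stopped at the start of main) gives 0x4aa5c6, the mismatched result from the question. The lea itself sits at 0x401742 and the next instruction at 0x401749, so in this dump the instruction is 7 bytes long.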