x86-64 instruction set, AT&T syntax, confusion regarding lea and brackets

Question

I’ve been told that lea %rax, %rdx is invalid syntax as the source needs to be in brackets, i.e. lea (%rax), %rdx.

I think I’ve clearly misunderstood both lea and the purpose of brackets.

I thought that lea %rax, %rdx would move the memory address stored in %rax, to %rdx, but apparently this is what lea (%rax), %rdx does?

What confuses me is that I thought brackets signify going to an address in memory, and taking the value at that address. So by using brackets lea would be moving a value from the memory address stored in %rax into the destination register.

That is why I thought lea %rax, %rdx would be used if you just wanted to move the address stored in %rax into %rdx.

Could someone explain to me the significance of brackets in the case of the lea instruction?

This tripped me up as a beginner as well. It's basically just saying "handle this as if it was an indirect memory access, with the same calculation and constraints, but at the end don't dereference it, just get me the final address". But because it is being handled in the same way as an indirect access except for the actual access, it has the same syntax. — CherryDT
lea is more useful with offsets and multipliers. lea (%rax), %rdx is no different from mov %rax, %rdx. But you can also do stuff like lea (%rdx, %rdx, 4), %rax which essentially does rax = rdx + rdx * 4 i.e. rax = rdx * 5, which you can't do with a single mov. This usage for arithmetic (and not any real memory addresses involved) is essentially a byproduct of the (limited) math operations that indirect addressing modes allow. — CherryDT
@CherryDT I see, thank you, so brackets just signify that we are treating it as an indirect memory address, not value in memory then? — S G
Yes, it basically does everything that mov (...) would do except for the last step of actually reading that memory address' content. — CherryDT
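
The arithmetic described in the comments above can be sketched in C. This is only a minimal model, assuming a hypothetical helper named lea_model (it is not a real API) that mimics the base + index*scale + displacement calculation of an addressing mode without ever reading memory:

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical helper (illustration only): models the address calculation
// disp(base, index, scale) that lea performs, without touching memory.
static uint64_t lea_model(uint64_t base, uint64_t index, unsigned scale, int32_t disp)
{
    // x86 addressing modes only allow these scale factors.
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
    // The displacement is sign-extended; unsigned arithmetic wraps
    // modulo 2^64, like the 64-bit register result.
    return base + index * scale + (uint64_t)(int64_t)disp;
}

// lea (%rdx, %rdx, 4), %rax  =>  rax = rdx + rdx*4 = rdx*5
static uint64_t times5(uint64_t rdx)
{
    return lea_model(rdx, rdx, 4, 0);
}
```

With no index and no displacement, lea_model(rax, 0, 1, 0) just returns rax, which is why lea (%rax), %rdx ends up copying the register like mov %rax, %rdx does.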

Peter Cordes · Accepted Answer · 2020-04-19T18:20:59

Never actually use lea (%rax), %rdx. Use mov %rax, %rdx instead because CPUs run it more efficiently, and both ways copy a register value (regardless of whether that value is a valid pointer or not).

LEA can only work on a memory addressing mode, not a bare register. LEA kind of "undoes" the brackets, taking the result of the address calculation instead of the value from memory at that address. This can't happen if there wasn't a memory operand in the first place.

This lets you use it to do shift/add operations on arbitrary values, whether they're valid pointers or not (see the related question "Using LEA on values that aren't addresses / pointers?"). LEA uses memory-operand syntax and machine code to encode the shift/add operation into a single instruction, using x86's normal addressing-mode encoding that the CPU hardware already knows how to decode.
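
As a rough illustration of using LEA on a value that is not a pointer, here is a sketch that assumes GCC or Clang on x86-64 with the default AT&T inline-asm dialect. The function names are made up for this example, and other targets fall back to plain C so the code still builds:

```c
#include <assert.h>
#include <stdint.h>

// Plain C reference: x + x*8 + 3, wrapping modulo 2^64.
static uint64_t times9_plus3_c(uint64_t x)
{
    return x + x * 8 + 3;
}

// Same computation done by a single lea on x86-64 (GCC/Clang inline asm,
// AT&T syntax). Other targets use the plain C version instead.
static uint64_t times9_plus3_lea(uint64_t x)
{
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
    uint64_t out;
    // x is an ordinary integer here, not a pointer; lea never accesses memory.
    __asm__("lea 3(%1,%1,8), %0" : "=r"(out) : "r"(x));
    return out;
#else
    return times9_plus3_c(x);
#endif
}

// Sanity check (hypothetical helper): both versions must agree for any input.
static void check_lea_matches_c(uint64_t x)
{
    assert(times9_plus3_lea(x) == times9_plus3_c(x));
}
```

Calling check_lea_matches_c with any value, including ones that would be garbage as addresses, should pass, since nothing is ever dereferenced.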

Compared to mov, it's like C's & (address-of) operator. And you can't take the address of a register. (Or in C, of a register variable.) You can only use it to undo a dereference.

  register char *rax = ...;

  register char dl = *rax;       // mov   (%rax), %dl
  register char *rcx = rax;      // mov   %rax, %rcx
  register char *rdi = &rax[0];  // lea   (%rax), %rdi  // never do this, mov is more efficient
  register char *rbx = &rax[rdx*4 + 1234];  // lea 1234(%rax, %rdx, 4), %rbx  // a real use-case
  
  register char **rsi = &rax;    // lea %rax, %rsi   // ERROR: can't take the address of a register

Of course if you asked an actual C compiler to compile that, you'd get mov %rax, %rdi, not lea (%rax), %rdi, even if it didn't optimize away the code. This is in terms of conceptual equivalents, using C syntax and operators to explain asm, not to show how anything would or should actually compile.
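
For something you can actually compile, here is a small sketch of the same dereference vs. address-of distinction. The buffer and function names are assumptions made up for this example, and the comments name the conceptually matching instruction, not what a compiler would necessarily emit:

```c
#include <assert.h>
#include <stddef.h>

enum { BUF_SIZE = 4096 };
static char buf[BUF_SIZE];  // hypothetical buffer; its address plays the role of %rax

// Conceptually like "mov (%rax), %dl": dereferences, so memory is read.
static char load_first_byte(void)
{
    return buf[0];
}

// Conceptually like "lea 1234(%rax, %rdx, 4), %rbx": only computes an address.
static char *element_address(size_t rdx)
{
    size_t offset = rdx * 4 + 1234;
    // Unlike lea, C pointer arithmetic is only defined within the array,
    // so keep the offset in bounds.
    assert(offset < BUF_SIZE);
    return &buf[offset];
}
```

element_address(2) yields buf + 1242 without reading buf at all, while load_first_byte returns whatever is stored at buf[0].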
