I would like to know what the difference between these instructions is:
MOV AX, [TABLE-ADDR]
and
LEA AX, [TABLE-ADDR]
LEA means Load Effective Address.
MOV means Load Value.
In short, LEA loads a pointer to the item you're addressing whereas MOV loads the actual value at that address.
The purpose of LEA is to allow one to perform a non-trivial address calculation and store the result [for later usage]
LEA ax, [BP+SI+5] ; Compute address of value
MOV ax, [BP+SI+5] ; Load value at that address
Where there are just constants involved, MOV (through the assembler's constant calculations) can sometimes appear to overlap with the simplest uses of LEA. LEA is useful if you have a multi-part calculation with multiple base addresses, etc.
The instruction MOV reg,addr means read a variable stored at address addr into register reg. The instruction LEA reg,addr means load the address itself (not the variable stored at the address) into register reg.
Another form of the MOV instruction is MOV reg,immdata, which means load the immediate data (i.e. constant) immdata into register reg. Note that if the addr in LEA reg,addr is just a constant (i.e. a fixed offset), then that LEA instruction has the same effect as the equivalent MOV reg,immdata instruction that loads the same constant as immediate data.
None of the previous answers quite got to the bottom of my own confusion, so I'd like to add my own.
What I was missing is that lea operations treat the use of parentheses differently from how mov does.
Think of C. Let's say I have an array of long that I call array. Now the expression array[i] performs a dereference, loading the value from memory at the address array + i * sizeof(long) [1].
On the other hand, consider the expression &array[i]. This still contains the sub-expression array[i], but no dereferencing is performed! The meaning of array[i] has changed. It no longer means to perform a dereference but instead acts as a kind of specification, telling & what memory address we're looking for. If you like, you could alternatively think of the & as "cancelling out" the dereference.
Because the two use-cases are similar in many ways, they share the syntax array[i], but the existence or absence of a & changes how that syntax is interpreted. Without &, it's a dereference and actually reads from the array. With &, it's not. The value array + i * sizeof(long) is still calculated, but it is not dereferenced.
The situation is very similar with mov and lea. With mov, a dereference occurs that does not happen with lea. This is despite the use of parentheses that occurs in both. For instance, compare movq (%r8), %r9 and leaq (%r8), %r9. With mov, these parentheses mean "dereference"; with lea, they don't. This is similar to how array[i] only means "dereference" when there is no &.
An example is in order.
Consider the code
movq (%rdi, %rsi, 8), %rbp
This loads the value at the memory location %rdi + %rsi * 8 into the register %rbp. That is: get the value in the register %rdi and the value in the register %rsi. Multiply the latter by 8, and then add it to the former. Find the value at this location and place it into the register %rbp.
This code corresponds to the C line x = array[i];, where array becomes %rdi, i becomes %rsi, and x becomes %rbp. The 8 is the length of the data type contained in the array.
Now consider similar code that uses lea:
leaq (%rdi, %rsi, 8), %rbp
Just as the use of movq corresponded to dereferencing, the use of leaq here corresponds to not dereferencing. This line of assembly corresponds to the C line x = &array[i];. Recall that & changes the meaning of array[i] from dereferencing to simply specifying a location. Likewise, the use of leaq changes the meaning of (%rdi, %rsi, 8) from dereferencing to specifying a location.
The semantics of this line of code are as follows: get the value in the register %rdi and the value in the register %rsi. Multiply the latter by 8, and then add it to the former. Place this value into the register %rbp. No load from memory is involved, just arithmetic operations [2].
Note that the only difference between my descriptions of leaq and movq is that movq does a dereference, and leaq doesn't. In fact, to write the leaq description, I basically copy+pasted the description of movq, and then removed "Find the value at this location".
To summarize: movq vs. leaq is tricky because they treat the use of parentheses, as in (%rsi) and (%rdi, %rsi, 8), differently. In movq (and all other instructions except lea), these parentheses denote a genuine dereference, whereas in leaq they do not and are purely convenient syntax.
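The same contrast can be sketched in C. The function names below (load_element, element_address, check_same_element) are made up for illustration, and the scale factor of 8 mentioned in the comments assumes a platform where sizeof(long) is 8, such as x86-64 Linux. A compiler is free to emit different instructions, so the comments show only the rough correspondence.

```c
#include <assert.h>
#include <stddef.h>

/* Rough analogue of: movq (%rdi, %rsi, 8), %rax
   The address array + i * sizeof(long) is computed, then memory is read. */
long load_element(const long *array, size_t i) {
    return array[i];
}

/* Rough analogue of: leaq (%rdi, %rsi, 8), %rax
   The same address is computed, but nothing is read from memory. */
const long *element_address(const long *array, size_t i) {
    return &array[i];
}

/* Hypothetical self-check: dereferencing the computed address
   yields exactly what the load returns. */
void check_same_element(const long *array, size_t i) {
    assert(*element_address(array, i) == load_element(array, i));
}
```

Calling load_element(a, 2) on an array a reads a[2], while element_address(a, 2) only produces the pointer a + 2 and would be just as valid if that pointer were never dereferenced.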
[1] I've said that when array is an array of long, the expression array[i] loads the value from the address array + i * sizeof(long). This is true, but there's a subtlety that should be addressed. If I write the C code
long x = array[5];
this is not the same as typing
long x = *(array + 5 * sizeof(long));
It seems that it should be, based on my previous statements, but it's not.
What's going on is that C pointer addition has a trick to it. Say I have a pointer p pointing to values of type T. The expression p + i does not mean "the position at p plus i bytes". Instead, the expression p + i actually means "the position at p plus i * sizeof(T) bytes".
The convenience of this is that to get "the next value" we just have to write p + 1 instead of p + 1 * sizeof(T).
This means that the C code long x = array[5]; is actually equivalent to
long x = *(array + 5);
because C will automatically multiply the 5 by sizeof(long).
So in the context of this StackOverflow question, how is this all relevant? It means that when I say "the address array + i * sizeof(long)", I do not mean for "array + i * sizeof(long)" to be interpreted as a C expression. I am doing the multiplication by sizeof(long) myself in order to make my answer more explicit, so this expression should not be read as C. Read it as ordinary math that happens to use C syntax.
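A small C sketch can make the scaling visible. The helper names byte_distance and load_by_bytes are hypothetical, and the code measures distances through char pointers only because sizeof(char) is 1 by definition. It assumes the pointers stay inside (or one past the end of) a real array of long.

```c
#include <assert.h>
#include <stddef.h>

/* Number of bytes between p and p + i, measured through char pointers.
   The result is i * sizeof(long), not i. */
size_t byte_distance(const long *p, size_t i) {
    const char *start = (const char *)p;
    const char *end = (const char *)(p + i);
    return (size_t)(end - start);
}

/* Element lookup with the multiplication by sizeof(long) written out by
   hand, which is the arithmetic that array[i] performs implicitly. */
long load_by_bytes(const long *array, size_t i) {
    assert(array != NULL);
    const char *bytes = (const char *)array;
    return *(const long *)(bytes + i * sizeof(long));
}
```

With a six-element array a, byte_distance(a, 5) equals 5 * sizeof(long), and load_by_bytes(a, 5) returns the same value as a[5] and *(a + 5).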
[2] Side note: because all lea does is arithmetic operations, its arguments don't actually have to refer to valid addresses. For this reason, it's often used to perform pure arithmetic on values that may not be intended to be dereferenced. For instance, cc with -O2 optimization translates
long f(long x) {
    return x * 5;
}
into the following (irrelevant lines removed):
f:
leaq (%rdi, %rdi, 4), %rax # set %rax to %rdi + %rdi * 4
ret
If you only specify a literal, there is no difference. LEA has more abilities, though, and you can read about them here:
http://www.oopweb.com/Assembly/Documents/ArtOfAssembly/Volume/Chapter_6/CH06-1.html#HEADING1-136
It depends on the assembler used, because
mov ax,table_addr
in MASM works as
mov ax,word ptr[table_addr]
So it loads the first word (two bytes) stored at table_addr and NOT the offset of table_addr. Instead, you should use
mov ax,offset table_addr
or
lea ax,table_addr
which works the same.
The lea version also works fine if table_addr is a local variable, e.g.
some_procedure proc
local table_addr[64]:word
lea ax,table_addr
As stated in the other answers:
MOV will grab the data at the address inside the brackets and place that data into the destination operand.
LEA will perform the calculation of the address inside the brackets and place that calculated address into the destination operand. This happens without actually going out to the memory and getting the data. The work done by LEA is in the calculating of the "effective address".
Because memory can be addressed in several different ways (see examples below), LEA is sometimes used to add or multiply registers together without using an explicit ADD or MUL instruction (or equivalent).
Since everyone is showing examples in Intel syntax, here are some in AT&T syntax:
MOVL 16(%ebp), %eax /* put long at ebp+16 into eax */
LEAL 16(%ebp), %eax /* add 16 to ebp and store in eax */
MOVQ (%rdx,%rcx,8), %rax /* put qword at rcx*8 + rdx into rax */
LEAQ (%rdx,%rcx,8), %rax /* put value of "rcx*8 + rdx" into rax */
MOVW 5(%bp,%si), %ax /* put word at si + bp + 5 into ax */
LEAW 5(%bp,%si), %ax /* put value of "si + bp + 5" into ax */
MOVQ 16(%rip), %rax /* put qword at rip + 16 into rax */
LEAQ 16(%rip), %rax /* add 16 to instruction pointer and store in rax */
MOVL label(,1), %eax /* put long at label into eax */
LEAL label(,1), %eax /* put the address of the label into eax */
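The addressing forms above all follow one formula: base + index * scale + displacement. As a rough model of what LEA computes, here is a minimal C sketch. The function name effective_address and its parameter layout are assumptions made for illustration, and the model covers only the 64-bit case, where the result wraps modulo 2^64.

```c
#include <assert.h>
#include <stdint.h>

/* Models the x86 effective-address calculation that LEA performs:
   base + index * scale + displacement, with no memory access.
   Unsigned 64-bit arithmetic wraps on overflow, like a 64-bit lea. */
uint64_t effective_address(uint64_t base, uint64_t index,
                           unsigned scale, int32_t disp) {
    /* The instruction encoding only allows scale factors 1, 2, 4 and 8. */
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
    /* The displacement is sign-extended before it is added. */
    return base + index * scale + (uint64_t)(int64_t)disp;
}
```

For example, effective_address(x, x, 4, 0) yields x * 5, which is why a compiler can use a single lea for that multiplication. A MOV with the same operand would instead read memory at the computed address.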
Basically, LEA means "move into REG after computing the address", and that turns out to be handy for other purposes as well :)
If you simply ignore that the value is supposed to be a pointer, you can use it to optimize or shrink code:
MOV EBX , 1
MOV ECX , 2
; with one instruction, the sum of two registers plus a constant lands in a third
LEA EAX , [EBX+ECX+5]
; EAX = 8
Without LEA, it would originally be:
MOV EAX, EBX
ADD EAX, ECX
ADD EAX, 5
Let's understand this with an example.
mov eax, [ebx]
and
lea eax, [ebx]
Suppose the value in ebx is 0x400000. Then mov will go to address 0x400000 and copy the 4 bytes of data stored there into the eax register, whereas lea will copy the address 0x400000 itself into eax. So, assuming the memory at 0x400000 contains 30, the value of eax after each instruction will be:
eax = 30 (in the case of mov)
eax = 0x400000 (in the case of lea)
By definition, mov copies the data from r/m32 to the destination (mov dest, r/m32), while lea (load effective address) copies the computed address to the destination (lea dest, m).
The difference is subtle but important. The MOV instruction is a 'MOVe': with a bracketed memory operand, it copies the data stored at the address that the TABLE-ADDR label stands for. The LEA instruction is a 'Load Effective Address': it computes that address and places the address itself in the destination, without any access to memory.
Effectively, using LEA is like taking the address of something with the & operator in languages such as C, which makes it a powerful instruction.