To answer those numbered questions:
1) subl $24,%esp
means esp = esp - 24
GNU AS uses AT&T syntax, which reverses Intel's operand order: AT&T puts the destination on the right, Intel puts it on the left. AT&T also spells out the operand size with a suffix (the l in subl means a 32-bit "long"), whereas Intel tries to deduce the size from the operands and makes you state it when it can't.
The stack grows down in memory. The memory at esp and above holds the stack contents; addresses lower than esp are unused stack space. esp points to the last thing pushed onto the stack.
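A minimal C sketch of that picture, assuming a toy model in which the stack is a small byte array and esp is just an offset into it. The names toy_stack, toy_init, toy_push, toy_reserve and toy_top are invented for illustration; nothing here is a real API.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define STACK_BYTES 64

/* Toy model (assumption): memory[] stands in for the stack region,
   and esp is an offset into it rather than a real address. */
struct toy_stack {
    uint8_t  memory[STACK_BYTES];
    uint32_t esp;   /* offset of the most recently pushed value */
};

/* An empty stack starts with esp just past the highest address. */
static void toy_init(struct toy_stack *s) {
    memset(s->memory, 0, sizeof s->memory);
    s->esp = STACK_BYTES;
}

/* pushl value: move esp down by 4, then store at the new esp. */
static void toy_push(struct toy_stack *s, uint32_t value) {
    assert(s->esp >= 4);   /* no room left would be a stack overflow */
    s->esp -= 4;
    memcpy(&s->memory[s->esp], &value, sizeof value);
}

/* subl $bytes,%esp: reserve space without storing anything. */
static void toy_reserve(struct toy_stack *s, uint32_t bytes) {
    assert(s->esp >= bytes);
    s->esp -= bytes;
}

/* Read the 32-bit value that esp currently points at. */
static uint32_t toy_top(const struct toy_stack *s) {
    uint32_t value;
    memcpy(&value, &s->memory[s->esp], sizeof value);
    return value;
}
```

Starting from an empty 64-byte toy stack, one toy_push leaves esp at 60, and a following toy_reserve of 24 bytes leaves it at 36: both operations only ever move esp toward lower addresses.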
2) x86 instruction encoding mostly allows the following:
movl rm,r ' move a value from a register or memory to a register
movl r,rm ' move a value from a register to a register or memory
movl imm,rm ' move an immediate value to a register or memory
There is no memory-to-memory instruction format. (Strictly speaking, you can do memory-to-memory operations with movs, or with push mem followed by pop mem, but neither takes two explicit memory operands in the same instruction.)
"Immediate" means the value is encoded right into the instruction. For example, to store 15 at the address in ebx:
movl $15,(%ebx)
15 is an "immediate" value.
The parentheses make it use the register as a pointer to memory.
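For readers who think in C, here is a rough sketch of what those forms correspond to. The function names are made up for illustration, and the register names in the comments (esi, edi) are only an assumption about what a compiler might pick.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* movl $15,(%ebx): store an immediate through a pointer held in a register. */
static void store_immediate(uint32_t *ebx) {
    assert(ebx != NULL);
    *ebx = 15;
}

/* movl (%ebx),%eax: load from memory into a register. */
static uint32_t load_from_memory(const uint32_t *ebx) {
    assert(ebx != NULL);
    return *ebx;
}

/* There is no single mov for memory-to-memory, so a copy
   goes through a register in two steps. */
static void copy_via_register(uint32_t *dst, const uint32_t *src) {
    uint32_t eax = *src;   /* movl (%esi),%eax */
    *dst = eax;            /* movl %eax,(%edi) */
}
```

The dereference operator plays the role of the parentheses: the pointer itself lives in a register, and the star says "go to the memory it points at".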
3) movl 8(%ebp),%eax
means,
- take the value of ebp
- add 8 to it (does not modify ebp though),
- use it as an address (the parentheses),
- read the 32-bit value from that address,
- and store the value in eax
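esp is the stack pointer; ebp is the frame pointer.
In 32-bit mode, each push and pop on the stack is 4 bytes wide, and most variables take up 4 bytes anyway. Note that 8(%ebp) is measured from ebp, not from the top of the stack. With the usual prologue (pushl %ebp, then movl %esp,%ebp), (%ebp) holds the caller's saved ebp, 4(%ebp) holds the return address, and 8(%ebp) is the function's first argument: the value two 4-byte slots (4 x 2 = 8) above where ebp points.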
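The address calculation can be sketched in C against a simulated block of memory. This is only a model: load32, store32 and load_first_argument are hypothetical helpers, addresses are offsets into a byte array, and the frame layout assumes the usual prologue (pushl %ebp, then movl %esp,%ebp).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit value from simulated memory at byte offset addr. */
static uint32_t load32(const uint8_t *memory, uint32_t addr) {
    uint32_t value;
    assert(memory != NULL);
    memcpy(&value, memory + addr, sizeof value);
    return value;
}

/* Write a 32-bit value to simulated memory at byte offset addr. */
static void store32(uint8_t *memory, uint32_t addr, uint32_t value) {
    assert(memory != NULL);
    memcpy(memory + addr, &value, sizeof value);
}

/* movl 8(%ebp),%eax: compute ebp + 8, read 32 bits from that address,
   and return the result (eax). ebp is passed by value, so it is not modified. */
static uint32_t load_first_argument(const uint8_t *memory, uint32_t ebp) {
    return load32(memory, ebp + 8);
}
```

If the slot at ebp holds the saved ebp, the slot at ebp + 4 holds the return address, and the slot at ebp + 8 holds the value 2, then load_first_argument returns 2.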
Typically, 32-bit code uses ebp as a frame pointer: local variables sit at negative offsets from it and arguments at positive offsets. In 16-bit x86 code, there was no way to use the stack pointer as a pointer (hard to believe, right?), because the 16-bit addressing modes do not accept sp as a base register. So what people did was copy sp to bp and use bp as the local frame pointer. This became unnecessary when 32-bit mode came out (80386), which can address memory through esp directly. Unfortunately, ebp makes debugging easier, so we ended up continuing to use ebp in 32-bit code (it's trivially easy to make a stack dump if ebp is being used).
Thankfully, amd64 gave us a new ABI which does not require a frame pointer. 64-bit code typically uses rsp to access local variables, and rbp is available to hold a variable.
4) Explained above
5) leave
is an old instruction that simply does movl %ebp,%esp followed by popl %ebp, and saves a few code bytes. What it actually does is undo the changes to the stack and restore the caller's ebp. The called function must preserve ebp in the x86 ABI.
On entry to the function, the compiler did subl $24,%esp to make room for local variables and sometimes for temporary storage that it didn't have enough registers to hold.
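To see that leave really does undo the entry sequence, here is a small C simulation. It assumes the function began with the usual pushl %ebp and movl %esp,%ebp before the subl $24,%esp (the first two are an assumption, since only the subl is quoted above), and toy_cpu, push32, pop32, prologue and leave are invented names.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy machine (assumption): a small memory plus the two registers we care about. */
struct toy_cpu {
    uint8_t  memory[128];
    uint32_t esp;
    uint32_t ebp;
};

static void push32(struct toy_cpu *c, uint32_t value) {
    assert(c->esp >= 4);
    c->esp -= 4;
    memcpy(&c->memory[c->esp], &value, sizeof value);
}

static uint32_t pop32(struct toy_cpu *c) {
    uint32_t value;
    memcpy(&value, &c->memory[c->esp], sizeof value);
    c->esp += 4;
    return value;
}

/* pushl %ebp ; movl %esp,%ebp ; subl $24,%esp */
static void prologue(struct toy_cpu *c) {
    push32(c, c->ebp);   /* save the caller's frame pointer */
    c->ebp = c->esp;     /* our frame starts here */
    c->esp -= 24;        /* room for locals and temporaries */
}

/* leave is equivalent to: movl %ebp,%esp ; popl %ebp */
static void leave(struct toy_cpu *c) {
    c->esp = c->ebp;     /* throw away the locals in one step */
    c->ebp = pop32(c);   /* restore the caller's frame pointer */
}
```

Whatever values esp and ebp had before prologue, they are the same again after leave, which is exactly the state a ret (or a tail-call jmp) needs.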
The best way to "imagine" the stack frame in your mind is to see it as a structure sitting on the stack. The first members of the imaginary structure are the most recently "pushed" values. So when you push to a stack, imagine inserting a new member at the beginning of the structure, while none of the other members moved. When you "pop" from the stack, you get the value of the first member of the imaginary struct, and that (first) line of the structure disappears from existence.
Stack frame manipulation is mostly just moving the stack pointer to make more or less room in that imaginary struct we call the stack frame. Subtracting from the stack pointer just puts multiple imaginary members at the start of the struct in one step. Adding to the stack pointer makes the first so many members disappear.
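That imaginary struct can be written down literally. The layout below is a sketch for a 32-bit function with 24 bytes of locals, assuming the usual prologue and one 4-byte argument; toy_frame and ebp_relative are hypothetical names, and a real compiler is free to lay out its locals differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Lowest address first, i.e. what esp points at right after subl $24,%esp. */
struct toy_frame {
    uint32_t locals[6];        /* the 24 bytes reserved by subl $24,%esp */
    uint32_t saved_ebp;        /* pushed by the prologue; ebp points here */
    uint32_t return_address;   /* pushed by the caller's call instruction */
    uint32_t first_argument;   /* pushed by the caller before the call */
};

/* Convert an offset within the struct into the displacement
   you would write in front of (%ebp). */
static ptrdiff_t ebp_relative(size_t member_offset) {
    assert(member_offset < sizeof(struct toy_frame));
    return (ptrdiff_t)member_offset
         - (ptrdiff_t)offsetof(struct toy_frame, saved_ebp);
}
```

Passing offsetof(struct toy_frame, first_argument) gives 8, which matches 8(%ebp), and the start of locals comes out at -24. Subtracting 24 from esp is what makes the six locals members appear at the front of the struct in one step.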
The end of the code you posted is not typical. That jmp is typically a ret. The compiler was clever about it and did a "tail call optimization", meaning it just cleans up what it did to the stack and jumps to f. When f(2) returns, it will actually return straight to the caller (not back to the code you posted).
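In C terms, the pattern that invites this optimization looks roughly like the sketch below. This is a guess at the shape of the source, not your actual code: the body of f is a made-up stand-in, and tail_caller and check_tail_call are hypothetical names. Whether the compiler really emits a jmp depends on the compiler and the optimization level.

```c
#include <assert.h>

/* Hypothetical stand-in for the f in the question; the real body is unknown. */
static int f(int x) {
    return x * 10;
}

/* The call to f is the very last thing this function does, and its result
   is returned unchanged. Nothing is left to do after f returns, so an
   optimizing compiler may tear down the frame and jmp to f instead of
   doing call f followed by ret. */
static int tail_caller(void) {
    return f(2);
}

/* Either way the observable result is the same as calling f(2) directly. */
static void check_tail_call(void) {
    assert(tail_caller() == f(2));
}
```

If tail_caller did anything with the result, such as return f(2) + 1, the call would no longer be in tail position and the compiler would have to use a normal call and ret.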