Understanding pre/post assembly code for a function call in x86 IA32 assembly

Question

So we have the following code, setting up for a function call with its arguments, its main body omitted (etc etc etc), and then the popping at the end of the function.

pushl %ebp
movl %esp, %ebp
pushl %ebx
movl 8(%ebp), %ebx
movl 12(%ebp), %ecx
etc
etc
etc
//end of function
popl %ebx
popl %ebp

Here's what I (think) I understand.

Suppose we have %esp pointing to memory address 100.

pushl %ebp

So this essentially makes %ebp point to where %esp points (memory address 100) + 4. So now %ebp points to memory address 104. This leaves our current memory state looking like so:

----------
|100|%esp
|104|%ebp
----------

Then we have the next line of code:

movl %esp, %ebp

So from what I understand, ebp now points to memory address 100. I have a little intuition as to why we do this step, but my confusion is the next line:

pushl %ebx

What is the purpose of pushing ebx, which I assume will then point to memory address 104? I have a vague idea of how the space right below ebp (104) is supposed to be a reference to an "old stack pointer," so I can see why the next 2 lines add 8 and 12 to ebp to be the "arguments" of our function, rather than 4 and 8.

But I'm confused as to why we push ebx onto the stack, first.

I also do not understand popping, and why we pop ebx and ebp?

Talking to someone about this before he had to sleep, he mentioned that we have no reference to the fact that our stack pointer was at 100 -- until we pop ebp back. Now, I thought ebp's value was 100, so I don't understand the point he was trying to make.

So to clarify:

Is my understanding thus far correct?
Why do we push ebx onto the stack?
What is this "reference to the old stack pointer" that lives right below ebp? Is that the ebx that we push?
Is there something I'm not understanding, like some sort of difference between the ebx that we push, and the ebx in the line right after (our argument)? Is there a difference between the ebp that gets pushed and the ebp in the line right after?
Why are we popping at the end?

I apologize if this is difficult to understand. I understand similar questions have been asked about this, but I'm trying to intuitively understand and picture what exactly is going on in a function call in a way that makes sense to me.

Note: I edited some important things regarding my understanding of what's going on, particularly with regards to ebp.

The push instructions put the contents of the given register on the stack, it doesn't make the registers "point" to anything. Also, the stack grows downwards. — Some programmer dude

icktoofay · Accepted Answer · 2014-08-07T04:29:46

As Joachim stated in a comment on your question, pushing a register pushes the contents of the register at that moment onto the stack; it doesn’t push a reference to the register or anything else. I’m not sure whether that’s what you meant, but either way this diagram was unclear:

----------
|100|%esp
|104|%ebp
----------

Nevertheless, I’ll try to explain what it does and why.

Say the instruction after the call is at 0x200. When the caller executes call, it pushes 0x200 (the return address) and jumps to the procedure. Say %esp is 0x100 once that push has happened. Our stack is then:

          Address  Value
%esp -->  0x100    0x200

And %ebp is some value or another; it might point into the stack or it might not. It doesn’t even need to represent an address. So %ebp is meaningless to us at this point.

But though it’s meaningless to us, the caller does expect it to stay the same before and after the call, so we have to preserve it. Let’s say it contained the value 0xDEADBEEF. We push it, so the stack now looks like this:

          Address  Value
          0x100    0x200
%esp -->  0x0fc    0xDEADBEEF
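
To make the effect of the push concrete, here is a minimal sketch in C of a toy machine (not real hardware) that reproduces the diagram above. The names Machine, word_at, pushl and after_push_ebp are made up for this illustration, and the values 0x100, 0x200 and 0xDEADBEEF are the ones assumed in this answer.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model (hypothetical, not real hardware): 16 words of stack memory
 * covering addresses 0xc4..0x100, plus the two registers we care about. */
#define STACK_TOP   0x104u   /* one past the highest modeled address */
#define STACK_WORDS 16u

typedef struct {
    uint32_t esp, ebp;
    uint32_t mem[STACK_WORDS];
} Machine;

/* Map a 4-byte-aligned address to its slot in mem[]. */
uint32_t *word_at(Machine *m, uint32_t addr) {
    assert(addr % 4 == 0);
    assert(addr < STACK_TOP && addr >= STACK_TOP - 4 * STACK_WORDS);
    return &m->mem[(STACK_TOP - 4 - addr) / 4];
}

/* pushl: move %esp DOWN by 4, then store a copy of the value there.
 * The register that supplied the value is not changed. */
void pushl(Machine *m, uint32_t value) {
    m->esp -= 4;
    *word_at(m, m->esp) = value;
}

/* State right after call has pushed the return address and the
 * callee has executed pushl %ebp. */
Machine after_push_ebp(void) {
    Machine m = {0};
    m.esp = 0x100;
    *word_at(&m, 0x100) = 0x200;   /* return address pushed by call */
    m.ebp = 0xDEADBEEF;            /* caller's %ebp, an arbitrary value */
    pushl(&m, m.ebp);              /* pushl %ebp */
    return m;
}
```

In this model, after_push_ebp() leaves %esp at 0x0fc with 0xDEADBEEF stored there, while %ebp itself still holds 0xDEADBEEF: the push copied the value and moved only %esp.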

In most situations, we can address everything as an offset from %esp, and that applies here, too. But if the compiler is compiling some C code that deals with variable-length arrays or other features, we often will want to index from the first thing we pushed rather than the last thing we pushed. To do that, we’ll set %ebp to where we are right now. Then things look like this:

                Address  Value
                0x100    0x200
%esp, %ebp -->  0x0fc    0xDEADBEEF

Note that the value at the address pointed to by %ebp is the old value of %ebp, so you can walk the stack, as you mentioned you were aware of before.
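
This layout is also why the arguments in your code are read from 8(%ebp) and 12(%ebp): 0(%ebp) holds the saved %ebp, 4(%ebp) holds the return address, and the arguments the caller pushed sit above that. Here is a rough sketch of this in C, again as a toy model. It assumes the usual 32-bit C convention where the caller pushes arguments right to left; the helper names and the argument values are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model (hypothetical): stack memory covering addresses 0xd0..0x10c. */
#define STACK_TOP   0x110u   /* one past the highest modeled address */
#define STACK_WORDS 16u

typedef struct {
    uint32_t esp, ebp;
    uint32_t mem[STACK_WORDS];
} Machine;

/* Map a 4-byte-aligned address to its slot in mem[]. */
uint32_t *word_at(Machine *m, uint32_t addr) {
    assert(addr % 4 == 0);
    assert(addr < STACK_TOP && addr >= STACK_TOP - 4 * STACK_WORDS);
    return &m->mem[(STACK_TOP - 4 - addr) / 4];
}

/* pushl: decrement %esp by 4, then store the value at the new %esp. */
void pushl(Machine *m, uint32_t value) {
    m->esp -= 4;
    *word_at(m, m->esp) = value;
}

/* Models the load in movl disp(%ebp), reg: read the word at %ebp + disp. */
uint32_t load_from_ebp(Machine *m, uint32_t disp) {
    return *word_at(m, m->ebp + disp);
}

/* Caller pushes two arguments and calls; callee runs its first two
 * prologue instructions. */
Machine enter_frame(uint32_t arg1, uint32_t arg2) {
    Machine m = {0};
    m.esp = 0x10c;
    m.ebp = 0xDEADBEEF;    /* caller's %ebp, an arbitrary value */
    pushl(&m, arg2);       /* caller: second argument -> 0x108 */
    pushl(&m, arg1);       /* caller: first argument  -> 0x104 */
    pushl(&m, 0x200);      /* call: return address    -> 0x100 */
    pushl(&m, m.ebp);      /* callee: pushl %ebp      -> 0x0fc */
    m.ebp = m.esp;         /* callee: movl %esp, %ebp */
    return m;
}
```

With enter_frame(7, 9), load_from_ebp(&m, 8) yields 7 and load_from_ebp(&m, 12) yields 9, matching the two movl instructions in the question.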

Next, we push %ebx, which we’ll say has the value 0xBEEFCAFE. This is the first thing not directly related to a function prologue. Then our stack looks like this:

          Address  Value
          0x100    0x200
%ebp -->  0x0fc    0xDEADBEEF
%esp -->  0x0f8    0xBEEFCAFE

But why do we push %ebx? Well, as it turns out, the x86 C calling convention dictates that, like %ebp, %ebx must stay the same as it was before the call. So because the code you omitted presumably changes %ebx, it has to preserve the initial value so it can restore it for the caller.

At the end of the function, popl %ebx restores %ebx from the stack. After we’ve restored %ebx, we pop %ebp, restoring its value as well, since that, too, must be preserved after the call. And finally we return.
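
The whole round trip can be sketched in C with the same kind of toy model. The function call_and_return and the value 0x12345678 are hypothetical: the latter just stands in for whatever the omitted body does to %ebx. The final ret is not in your snippet; I'm assuming it follows the pops and modeling it as popping the return address into %eip.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model (hypothetical): stack memory covering addresses 0xc4..0x100. */
#define STACK_TOP   0x104u   /* one past the highest modeled address */
#define STACK_WORDS 16u

typedef struct {
    uint32_t esp, ebp, ebx, eip;
    uint32_t mem[STACK_WORDS];
} Machine;

/* Map a 4-byte-aligned address to its slot in mem[]. */
uint32_t *word_at(Machine *m, uint32_t addr) {
    assert(addr % 4 == 0);
    assert(addr < STACK_TOP && addr >= STACK_TOP - 4 * STACK_WORDS);
    return &m->mem[(STACK_TOP - 4 - addr) / 4];
}

/* pushl: decrement %esp by 4, then store the value at the new %esp. */
void pushl(Machine *m, uint32_t value) {
    m->esp -= 4;
    *word_at(m, m->esp) = value;
}

/* popl: read the word at %esp, then increment %esp by 4. */
uint32_t popl(Machine *m) {
    uint32_t value = *word_at(m, m->esp);
    m->esp += 4;
    return value;
}

Machine call_and_return(uint32_t caller_ebp, uint32_t caller_ebx) {
    Machine m = {0};
    m.esp = 0x104;
    m.ebp = caller_ebp;
    m.ebx = caller_ebx;

    pushl(&m, 0x200);      /* call: push return address, %esp = 0x100 */
    pushl(&m, m.ebp);      /* pushl %ebp */
    m.ebp = m.esp;         /* movl %esp, %ebp */
    pushl(&m, m.ebx);      /* pushl %ebx */

    m.ebx = 0x12345678;    /* stand-in for the omitted body clobbering %ebx */

    m.ebx = popl(&m);      /* popl %ebx: caller's value is back */
    m.ebp = popl(&m);      /* popl %ebp: caller's value is back */
    m.eip = popl(&m);      /* ret: pop the return address into %eip */
    return m;
}
```

Calling call_and_return(0xDEADBEEF, 0xBEEFCAFE) ends with %ebx, %ebp and %esp exactly as the caller left them and %eip at 0x200. Note that the pops happen in the reverse order of the pushes, which is why %ebx is popped before %ebp.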

TL;DR: %ebp and %ebx are pushed and popped because they are manipulated during the execution of the body of the function, but the x86 C calling convention dictates that the values must remain the same before and after the call, so the initial values must be preserved so we can restore them.
