
I'm following section 5.1.3 of OS development by Nick Blundell. I'm investigating how the following C code is compiled into machine code:

/* Prototype so that callee_fun is declared before it is called. */
int callee_fun(int arg);

void caller_fun(){
        callee_fun(0xdede);
}

int callee_fun(int arg){
        return arg;
}

The final machine code, disassembled with ndisasm, is this:

00000000  55                push ebp
00000001  89E5              mov ebp,esp
00000003  83EC08            sub esp,byte +0x8
00000006  83EC0C            sub esp,byte +0xc
00000009  68DEDE0000        push dword 0xdede
0000000E  E806000000        call dword 0x19
00000013  83C410            add esp,byte +0x10
00000016  90                nop
00000017  C9                leave
00000018  C3                ret
00000019  55                push ebp
0000001A  89E5              mov ebp,esp
0000001C  8B4508            mov eax,[ebp+0x8]
0000001F  5D                pop ebp
00000020  C3                ret

While studying how the stack pointer and base pointer work, I made the following diagram, which shows the state of the stack when the processor is executing the instruction at offset 0x1C:

                 Stack situation when processor is
             running `mov eax,[ebp+0x8]` at offset 0x1C

    +---------------------------------+
    |           4 bytes for           |
    |    `push ebp` at offset 0x00    |
    +---------------------------------+
    |        20 (8+12) bytes for      |
    |        `sub esp,byte +0x8`      |
    |    and `sub esp,byte +0xc`      |
    |    at offsets 0x03 and 0x06     |
    +---------------------------------+
    | 4 bytes for `push dword 0xdede` |
    |         at offset 0x09          |
    +---------------------------------+
    | 4 bytes for instruction pointer |
    |      by `call dword 0x19`       |
    |        at offset 0x0E           |
    +---------------------------------+
    |     4 bytes for `push ebp`      |
    |        at offset 0x19           |
    +---------------------------------+ --> ebp & esp are both here by 
                                               `mov ebp,esp`
                                              at offset 0x1A

Now, I have questions which I couldn't figure out by researching and studying:

  1. Is my diagram of stack situation correct?

  2. Why are 20 bytes reserved on the stack by sub esp,byte +0x8 and sub esp,byte +0xc at offsets 0x03 and 0x06?

  3. Even if 20 bytes of stack memory are needed, why are they not reserved by a single instruction such as sub esp,byte +0x14 (0x14 = 0x8 + 0xc)?


I'm compiling the C code with this makefile:

all: call_fun.o call_fun.bin call_fun.dis

call_fun.o: call_fun.c
    gcc -ffreestanding -c call_fun.c -o call_fun.o

call_fun.bin: call_fun.o
    ld -o call_fun.bin -Ttext 0x0 --oformat binary call_fun.o

call_fun.dis: call_fun.bin
    ndisasm -b 32 call_fun.bin > call_fun.dis
You did not show how you compiled the code. Chances are you forgot to enable optimizations, as usual. The diagram looks correct. - Jester
Which compiler do you use? Your stack diagram looks correct to me. - Jabberwocky
The two sub esp,byte + ... are probably not contracted into a single sub because you compiled without optimisations. Try to compile with gcc -O4 ... - Jabberwocky
Hasn't this been asked and answered dozens of times now? - old_timer
One point of view is: why not? The C language doesn't define a stack or other platform implementation details; it defines an abstract machine, and the resulting binary must work "correctly" in terms of that. There's no reason in the C language why the compiler can't reserve 10 KiB of stack space on every function entry if it thinks that's a good idea (and in debug mode it has many weird ideas). - Ped7g

1 Answer


Without optimizations, the stack is used to save and restore the base pointer. Under the x86_64 calling conventions (https://en.wikipedia.org/wiki/X86_calling_conventions), the stack must be aligned on a 16-byte boundary when a function is called, so this is most likely what is happening in your case (by default GCC keeps the same 16-byte alignment in 32-bit code such as yours). At least, this is what happens when I compile your code on my system. Here is the assembly:

callee_fun(int): # @callee_fun(int)
  pushq %rbp
  movq %rsp, %rbp
  movl %edi, -4(%rbp)
  movl -4(%rbp), %eax
  popq %rbp
  retq
caller_fun(): # @caller_fun()
  pushq %rbp
  movq %rsp, %rbp
  subq $16, %rsp
  movl $57054, %edi # imm = 0xDEDE
  callq callee_fun(int)
  movl %eax, -4(%rbp) # 4-byte Spill
  addq $16, %rsp
  popq %rbp
  retq
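
The same alignment reasoning can be checked against the 32-bit listing in the question. The following is a minimal sketch in C, not compiler output: it adds up how far each instruction in caller_fun lowers esp before call dword 0x19 runs. It assumes that esp is a multiple of 16 at the moment caller_fun itself is called, which is an assumption about your setup. The names step, caller_trace, bytes_used, esp_mod16 and arg_offset_from_callee_ebp are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* One stack-changing event and the number of bytes by which it lowers
   esp (the stack grows toward lower addresses). */
struct step {
    const char *what;
    int bytes;
};

/* Everything that lowers esp from entering caller_fun up to, but not
   including, `call dword 0x19`. Offsets are from the ndisasm listing. */
const struct step caller_trace[] = {
    { "return address of caller_fun", 4 },  /* pushed by its caller */
    { "push ebp           (0x00)",    4 },
    { "sub esp,byte +0x8  (0x03)",    8 },
    { "sub esp,byte +0xc  (0x06)",   12 },
    { "push dword 0xdede  (0x09)",    4 },
};

/* Total number of bytes the first n steps take off esp. */
int bytes_used(const struct step *t, size_t n)
{
    int total = 0;
    for (size_t i = 0; i < n; i++) {
        assert(t[i].bytes >= 0);
        total += t[i].bytes;
    }
    return total;
}

/* esp modulo 16 after the first n steps, given esp modulo 16 before
   them. ASSUMPTION: callers pass start_mod = 0, i.e. esp was 16-byte
   aligned just before caller_fun was called. */
int esp_mod16(int start_mod, const struct step *t, size_t n)
{
    return ((start_mod - bytes_used(t, n)) % 16 + 16) % 16;
}

/* Distance from the callee's ebp up to its argument: ebp points at the
   saved ebp, and the return address sits between that and the argument. */
int arg_offset_from_callee_ebp(void)
{
    const int saved_ebp = 4;       /* push ebp at 0x19 */
    const int return_address = 4;  /* pushed by call dword 0x19 */
    return saved_ebp + return_address;
}
```

With a starting residue of 0, esp_mod16(0, caller_trace, 3) is 0: the return address, push ebp and sub esp,byte +0x8 together take 16 bytes. esp_mod16(0, caller_trace, 5) is 0 as well, because sub esp,byte +0xc and the pushed argument take another 16 bytes, so esp is back on a 16-byte boundary right before the call. Under the stated assumption, the 8 and 12 bytes look like alignment padding rather than storage for variables, and an unoptimized build most likely emits them separately because one adjustment belongs to the frame setup and the other to the call sequence. arg_offset_from_callee_ebp() is 8, which matches mov eax,[ebp+0x8] at offset 0x1C.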

It is worth noting that when optimizations are turned on, the stack is not used or modified at all:

callee_fun(int): # @callee_fun(int)
  movl %edi, %eax
  retq
caller_fun(): # @caller_fun()
  retq

Last but not least, when studying assembly listings, do not disassemble the object file or the executable. Instead, direct your compiler to generate an assembly listing, which gives you much more context.

If you are using gcc, a good command to do this would be

gcc -fverbose-asm -S -O