Memory allocation and addressing in Assembly

Question

I am trying to learn assembly and there a couple of instructions whose purpose I do not fully understand.

C code

#include <stdio.h>

int main(int argc, char* argv[])
{
    printf("Argument One - %s\n", argv[1]);
    return 0;
}

Assembly

    .section    __TEXT,__text,regular,pure_instructions
    .build_version macos, 10, 14
    .intel_syntax noprefix
    .globl  _main                   ## -- Begin function main
    .p2align    4, 0x90
_main:                                  ## @main
## %bb.0:
    push    rbp
    mov rbp, rsp
    sub rsp, 32
    lea rax, [rip + L_.str]
    mov dword ptr [rbp - 4], 0
    mov dword ptr [rbp - 8], edi
    mov qword ptr [rbp - 16], rsi
    mov rsi, qword ptr [rbp - 16]
    mov rsi, qword ptr [rsi + 8]
    mov rdi, rax
    mov al, 0
    call    _printf
    xor ecx, ecx
    mov dword ptr [rbp - 20], eax ## 4-byte Spill
    mov eax, ecx
    add rsp, 32
    pop rbp
    ret
                                        ## -- End function
    .section    __TEXT,__cstring,cstring_literals
L_.str:                                 ## @.str
    .asciz  "Argument One - %s\n"


.subsections_via_symbols

Q1. sub rsp, 32

Why is space allocated for 32 bytes when there are no local variables? I believe argc and argv are saved in the registers edi and rsi respectively. If its so that they can be moved onto the stack, wouldn't that require only 12 bytes?

Q2. lea rax, [rip + L_.str] and mov rdi, rax

Am I correct in understanding that L_.str has the address of the string ""Argument One - %s\n"? From what I've understood, printf gets access to this string through the register rdi. So, why doesn't the instruction mov rdi, L_.str work instead?

Q3. mov dword ptr [rbp - 4], 0

Why is zero being pushed onto the stack?

Q4. mov dword ptr [rbp - 8], edi and mov qword ptr [rbp - 16], rsi

I believe these instruction are to get argc and argv onto the stack. Is it pure convention to use edi and rsi?

Q5. mov dword ptr [rbp - 20], eax

I haven't a clue what this does.

Most of that is noise and overhead from unoptimized code, e.g. copying args from registers to the stack for no reason, and (Q5) spilling the unused printf return value to stack space. Compile with -O3 or -O2 to get just the interesting part. How to remove "noise" from GCC/clang assembly output? — Peter Cordes
And yes, there is a standard that specifies how args are passed to functions, so compilers can make code that can call code from other compilers. In your case it's the x86-64 System V ABI. See the function-calling part of What are the calling conventions for UNIX & Linux system calls on i386 and x86-64, and What registers are preserved through a linux x86-64 function call. See also stackoverflow.com/tags/x86/info for more links to docs. — Peter Cordes
You are compiling without optimisations. This causes the compiler to generate a lot of useless instructions. Pass at least -O1, better -O2 so the compiler generates reasonable code. — fuz

rcgldr rcgldr · Accepted Answer · 2019-01-17T05:30:49

Q1. sub rsp, 32

This is allocating space that is used to store some data. Although it allocates 32 bytes, the code is only using the first 16 bytes of that allocated space, a qword at [rbp-8] (0:edi) and a qword at [rbp-16] (rdi).

Q2. lea rax, [rip + L_.str] and mov rdi, rax

The lea is getting the address of a string stored in the "code" segement. It's moved to rdi which is used as one of the parameters for printf.

Q3. mov dword ptr [rbp - 4], 0 ... mov dword ptr [rbp - 8], edi

This stores a 64-bit little endian value composed of 0:edi at [rbp - 8]. I'm not sure why it's doing this, since it never loads from that qword later on.

It's normal for un-optimized code to store their register arguments to memory, where debug info can tell debuggers where to look for and modify them, but it's unclear why clang zero-extends argc in edi to 64 bits.

More likely that 0 dword is something separate, because it if the compiler really wanted to store a zero-extend argc, compilers will zero-extend in registers with a 32-bit mov, like mov ecx, edi ; mov [rbp-8], rcx. Possibly this extra zero is a return-value temporary which it later decides not to use because of an explicit return 0; instead of the implicit one from falling off the end of main? (main is special, and I think clang does create an internal temporary variable for the return value.)

Q4 mov qword ptr [rbp - 16], rsi ... mov rsi, qword ptr [rbp - 16]

Optimization off? It stores rsi then loads rsi from [rbp - 16]. rsi holds your argv function arg ( == &argv[0]). The x86-64 System V ABI passes integer/pointer args in RDI, RSI, RDX, RCX, R8, R9, then on the stack.

... mov rsi, qword ptr [rsi + 8]

This is loading rsi with the contents of argv[1], as the 2nd arg for printf. (For the same reason that main's 2nd arg was in rsi).

The x86-64 System V calling convention is also the reason for zeroing AL before calling a varargs function with no FP args.

Q5. mov dword ptr [rbp - 20], eax

Optimization off? It's storing the return value from printf, but never using it.

Memory allocation and addressing in Assembly

1 Answers