I'm writing a compiler purely as a learning experience. I'm currently learning about stack frames by compiling simple C++ code and then studying the asm output produced by gcc 4.9.2 for Windows x86.
My simple C++ code is:
#include <iostream>
using namespace std;
int globalVar;
void testStackStuff(void);
void testPassingOneInt32(int v);
void forceStackFrameCreation(int v);
int main()
{
    globalVar = 0;
    testStackStuff();
    std::cout << globalVar << std::endl;
}
void testStackStuff(void)
{
    testPassingOneInt32(666);
}
void testPassingOneInt32(int v)
{
    globalVar = globalVar + v;
    forceStackFrameCreation(v);
}
void forceStackFrameCreation(int v)
{
    globalVar = globalVar + v;
}
When this is compiled with -mpreferred-stack-boundary=4, I expected to see a stack aligned to 16 bytes. Technically it is aligned to 16 bytes, but with an extra 16 bytes of unused stack space. The prologue for main as produced by gcc is:
22 .loc 1 12 0
23 .cfi_startproc
24 0000 8D4C2404 lea ecx, [esp+4]
25 .cfi_def_cfa 1, 0
26 0004 83E4F0 and esp, -16
27 0007 FF71FC push DWORD PTR [ecx-4]
28 000a 55 push ebp
29 .cfi_escape 0x10,0x5,0x2,0x75,0
30 000b 89E5 mov ebp, esp
31 000d 51 push ecx
32 .cfi_escape 0xf,0x3,0x75,0x7c,0x6
33 000e 83EC14 sub esp, 20
34 .loc 1 12 0
35 0011 E8000000 call ___main
35 00
36 .loc 1 13 0
37 0016 C7050000 mov DWORD PTR _globalVar, 0
38 .loc 1 15 0
39 0020 E8330000 call __Z14testStackStuffv
Line 26 rounds esp down to the nearest 16-byte boundary.
Lines 27, 28 and 31 push a total of 12 bytes onto the stack, then
line 33 subtracts another 20 bytes from esp, giving a total of 32 bytes!
Why?
Line 39 then calls testStackStuff.
NOTE: this call pushes the return address (4 bytes).
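The arithmetic above can be checked at compile time with a small sketch. The entry value of esp (kEntryEsp) is a made-up number chosen only for illustration, and the constant names are hypothetical; the byte counts come straight from listing lines 26 to 33.

```cpp
#include <cstdint>

// Hypothetical value of esp on entry to main (an assumption, not taken
// from a real run); any 4-byte-aligned value would do.
constexpr std::uint32_t kEntryEsp = 0x0028FF2C;

// `and esp, -N`: clear the low bits, rounding esp down to a multiple of N
// (N must be a power of two).
constexpr std::uint32_t alignDown(std::uint32_t esp, std::uint32_t n) {
    return esp & ~(n - 1);
}

constexpr std::uint32_t kAligned   = alignDown(kEntryEsp, 16); // line 26
constexpr std::uint32_t kAfterPush = kAligned - 3 * 4;         // lines 27, 28, 31
constexpr std::uint32_t kAfterSub  = kAfterPush - 20;          // line 33

static_assert(kAligned % 16 == 0, "esp is 16-byte aligned after the and");
static_assert(kAfterPush % 16 != 0, "the three pushes alone leave esp misaligned");
static_assert(kAligned - kAfterSub == 32, "the frame occupies 32 bytes in total");
static_assert(kAfterSub % 16 == 0, "esp is 16-byte aligned at the call on line 39");
// Alignment alone would already be restored by subtracting 4 instead of 20.
static_assert((kAfterPush - 4) % 16 == 0, "12 + 4 = 16 is also aligned");
```

The last assertion is what the question comes down to: 16 bytes past the aligned base would satisfy the alignment requirement, yet gcc reserves 32.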
Now, let's look at the prologue for testStackStuff, keeping in mind that the stack is now 4 bytes closer to the next 16-byte boundary.
67 0058 55 push ebp
68 .cfi_def_cfa_offset 8
69 .cfi_offset 5, -8
70 0059 89E5 mov ebp, esp
71 .cfi_def_cfa_register 5
72 005b 83EC18 sub esp, 24
73 .loc 1 22 0
74 005e C704249A mov DWORD PTR [esp], 666
Line 67 pushes another 4 bytes (now 8 bytes towards the boundary).
Line 72 subtracts another 24 bytes (total 32 bytes).
At this point the stack is aligned correctly on a 16-byte boundary. But why does the frame take up 32 bytes, twice the 16-byte alignment unit?
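The same bookkeeping for the callee, as a minimal sketch. It assumes esp was 16-byte aligned in main just before the call; kEspAtCall is a hypothetical address and espAfterPrologue is an illustrative helper, not anything gcc emits.

```cpp
#include <cstdint>

// Hypothetical esp in main immediately before `call __Z14testStackStuffv`
// (assumed to be 16-byte aligned).
constexpr std::uint32_t kEspAtCall = 0x0028FF00;

// esp after the call pushes the return address (4 bytes), `push ebp`
// (4 bytes) and `sub esp, sub_bytes`.
constexpr std::uint32_t espAfterPrologue(std::uint32_t esp_at_call,
                                         std::uint32_t sub_bytes) {
    return esp_at_call - 4 - 4 - sub_bytes;
}

constexpr std::uint32_t kAfterPushEbp = kEspAtCall - 4 - 4;                  // line 67
constexpr std::uint32_t kInBody       = espAfterPrologue(kEspAtCall, 24);   // line 72

static_assert(kAfterPushEbp % 16 == 8, "8 bytes past the boundary after push ebp");
static_assert(kEspAtCall - kInBody == 32, "4 + 4 + 24 = 32 bytes");
static_assert(kInBody % 16 == 0, "esp is 16-byte aligned inside testStackStuff");
```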
If I change the compiler flags to -mpreferred-stack-boundary=5, I would expect a stack aligned to 32 bytes, but again gcc seems to produce stack frames aligned to 64 bytes, twice the amount I was expecting.
The prologue for main:
23 .cfi_startproc
24 0000 8D4C2404 lea ecx, [esp+4]
25 .cfi_def_cfa 1, 0
26 0004 83E4E0 and esp, -32
27 0007 FF71FC push DWORD PTR [ecx-4]
28 000a 55 push ebp
29 .cfi_escape 0x10,0x5,0x2,0x75,0
30 000b 89E5 mov ebp, esp
31 000d 51 push ecx
32 .cfi_escape 0xf,0x3,0x75,0x7c,0x6
33 000e 83EC34 sub esp, 52
34 .loc 1 12 0
35 0011 E8000000 call ___main
35 00
36 .loc 1 13 0
37 0016 C7050000 mov DWORD PTR _globalVar, 0
37 00000000
37 0000
38 .loc 1 15 0
39 0020 E8330000 call __Z14testStackStuffv
Line 26 rounds esp down to the nearest 32-byte boundary.
Lines 27, 28 and 31 push a total of 12 bytes onto the stack, then
line 33 subtracts another 52 bytes from esp, giving a total of 64 bytes!
And the prologue for testStackStuff is:
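A quick sketch of what these three steps guarantee. The entry value of esp is again a made-up number, picked so that the rounded-down address is a multiple of 32 but not of 64; with a different starting value the final esp could land on a 64-byte boundary by coincidence.

```cpp
#include <cstdint>

// Hypothetical value of esp on entry to main (an assumption for illustration).
constexpr std::uint32_t kEntryEsp = 0x0028FF2C;

// The operand -32 of `and esp, -32`, viewed as a 32-bit mask.
constexpr std::uint32_t kMask = static_cast<std::uint32_t>(-32);

constexpr std::uint32_t kAligned = kEntryEsp & kMask;      // line 26
constexpr std::uint32_t kAtCall  = kAligned - 3 * 4 - 52;  // lines 27, 28, 31, 33

static_assert(kMask == 0xFFFFFFE0u, "-32 clears the low five bits");
static_assert(kAligned - kAtCall == 64, "the frame occupies 64 bytes");
static_assert(kAtCall % 32 == 0, "esp is 32-byte aligned at the call");
static_assert(kAtCall % 64 != 0, "but not 64-byte aligned for this start value");
```

So a 64-byte frame size does not by itself mean the stack is aligned to 64 bytes; only the `and esp, -32` determines the alignment.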
66 .cfi_startproc
67 0058 55 push ebp
68 .cfi_def_cfa_offset 8
69 .cfi_offset 5, -8
70 0059 89E5 mov ebp, esp
71 .cfi_def_cfa_register 5
72 005b 83EC38 sub esp, 56
73 .loc 1 22 0
4 bytes on the stack from call __Z14testStackStuffv
4 bytes on the stack from push ebp
56 bytes on the stack from sub esp, 56
Total: 64 bytes.
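To put a number on the "extra" space in both builds, here is a hedged sketch that computes the smallest sub esp value that would keep esp aligned in testStackStuff. It assumes the function needs only 4 bytes of its own, for the single outgoing int argument stored at [esp]; roundUp and minSub are hypothetical helpers, and this is not a description of how gcc sizes frames.

```cpp
#include <cstdint>

// Round x up to the next multiple of n (n must be a power of two).
constexpr std::uint32_t roundUp(std::uint32_t x, std::uint32_t n) {
    return (x + n - 1) & ~(n - 1);
}

// Smallest K for `sub esp, K` that leaves esp aligned to `boundary`,
// given that the return address and saved ebp already occupy 8 bytes
// and the function body needs `needed` bytes.
constexpr std::uint32_t minSub(std::uint32_t needed, std::uint32_t boundary) {
    return roundUp(8 + needed, boundary) - 8;
}

// Assumption: 4 bytes needed, for the one int argument passed on the stack.
static_assert(minSub(4, 16) == 8, "boundary 4: sub esp, 8 would stay aligned");
static_assert(minSub(4, 32) == 24, "boundary 5: sub esp, 24 would stay aligned");

// gcc emitted 24 and 56, so the surplus is exactly one alignment unit each time.
static_assert(24 - minSub(4, 16) == 16, "16 surplus bytes with boundary 4");
static_assert(56 - minSub(4, 32) == 32, "32 surplus bytes with boundary 5");
```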
Does anybody know why gcc is creating this extra stack space, or have I overlooked something obvious?
Thanks for any help you can offer.
and esp, -32. The stack frame size looks like 64 bytes, but its alignment is only 32B. – Peter Cordes
push DWORD PTR [ecx-4] part. – Peter Cordes