10
votes

I had a mysterious bus error on an x86 (32-bit) platform when running code compiled with gcc-4.8.1 with -march=pentium4. I traced the problem to an SSE instruction:

movdqa %xmm5,0x50(%esp)

with esp = 0xbfffedac, so the effective address is 0xbfffedfc. movdqa requires its memory operand to be 16-byte aligned, which is not the case here (the address is only 4-byte aligned), hence the bus error.
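The address arithmetic can be checked with a few lines of C. This is only a sketch: the helper names `effective_address` and `is_aligned` are made up for illustration, and the values are the ones quoted from the debugger.

```c
#include <stdbool.h>
#include <stdint.h>

/* Effective address of a base+displacement operand such as 0x50(%esp).
   Hypothetical helper, for illustration only. */
static uint32_t effective_address(uint32_t base, uint32_t displacement)
{
    return base + displacement;
}

/* True if addr is a multiple of alignment.
   alignment must be a power of two (4, 16, ...). */
static bool is_aligned(uint32_t addr, uint32_t alignment)
{
    return (addr & (alignment - 1u)) == 0;
}
```

With esp = 0xbfffedac, `effective_address(0xbfffedacu, 0x50u)` gives 0xbfffedfc; `is_aligned` is true for 4 but false for 16, which matches the faulting movdqa.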

The problem does not occur if compiling with -march=native (this is a Core-i3 processor).

As far as I know, the only stack alignment guaranteed on Linux/x86 is 4-byte. Thus, it seems weird that the code generator should choose to use movdqa, without some kind of alignment check, even though there is an instruction movdqu for possibly unaligned accesses.

So, this looks like there is a bug in gcc.

I'm not an expert on SSE and x86 ABI, and I'd appreciate feedback before I send a bug report.

1
You are wrong, stack alignment on Linux/x86 can sometimes be 16 bytes. See x86-64 ABI, Linux foundation references and x86 calling conventions - Basile Starynkevitch
This is x86, not x86-64... that's the problem! The site claims: "Since GCC version 4.5, the stack must be aligned to a 16-byte boundary when calling a function (previous versions only required a 4-byte alignment.)[citation needed]" "Citation needed", I'd like to get a reference for that. - David Monniaux
See this comment. Stack alignment on 32-bit x86/ia32 is now 16 bytes because of SSE, IIUC. And if the GCC compiler had to realign the stack frame in every function that uses SSE, it would not be worth the pain and runtime cost. - Basile Starynkevitch
Ah ah, indeed I found relevant paragraphs in the gcc documentation on x86 code generation. It seems OCaml misaligns the stack (and since it shows only if calling code containing some SSE instructions...). - David Monniaux
@monniaux, why does march=pentium4 cause the problem and not march=native? - Z boson

1 Answer

7
votes

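The default in gcc is now -mpreferred-stack-boundary=4, i.e. the stack is kept aligned to 2^4 = 16 bytes. Unless -mincoming-stack-boundary is given explicitly, it takes the same value, so gcc also assumes that the stack is already 16-byte aligned on entry to every function.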

Problems can thus occur if gcc code using SSE is called from code generated by other compilers which have different stack alignment assumptions, such as OCaml (see discussion on the OCaml bug tracker).
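One possible workaround on the gcc side, sketched here under assumptions, is to make the functions that are entered from foreign code realign the stack themselves. gcc documents the function attribute force_align_arg_pointer for x86, and the option -mstackrealign does the same for a whole compilation unit. The macro name `REALIGN_STACK` and the function `sum_first_n` below are hypothetical, chosen only for illustration; whether this is appropriate for the OCaml case has not been verified here.

```c
/* Assumption: GCC (or a compatible compiler) targeting 32-bit x86.
   On other targets the macro expands to nothing. */
#if defined(__GNUC__) && defined(__i386__)
#define REALIGN_STACK __attribute__((force_align_arg_pointer))
#else
#define REALIGN_STACK
#endif

/* Hypothetical entry point that may be called from code which only
   keeps the stack 4-byte aligned. The attribute asks gcc to realign
   the stack in the prologue, so the 16-byte aligned local (and any
   SSE spill slots) can be accessed with aligned instructions. */
REALIGN_STACK
int sum_first_n(int n)
{
    _Alignas(16) int lanes[4] = {0, 0, 0, 0};

    /* Accumulate 1..n into four lanes, a loop the compiler is free
       to vectorize with SSE. */
    for (int i = 1; i <= n; i++)
        lanes[i % 4] += i;

    return lanes[0] + lanes[1] + lanes[2] + lanes[3];
}
```

Only the functions directly reachable from the foreign caller need the attribute; functions they call in turn see a stack that gcc keeps aligned. The realigning prologue has a small runtime cost, which is why it is not done everywhere by default.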