Hello, I am currently working on a program that needs to process a data blob containing a series of floats, which may be unaligned (and sometimes actually are). I am compiling with gcc 4.6.2 for an ARM Cortex-A8, and I have a question about the generated assembly code.
As a minimal example, for the following test code
float aligned[2];
float *unaligned = (float*)(((char*)aligned)+2);
int main(int argc, char **argv)
{
    float f = unaligned[0];
    return (int)f;
}
the compiler (gcc 4.6.2 with -O3 optimization) produces
00008634 <main>:
8634: e30038ec movw r3, #2284 ; 0x8ec
8638: e3403001 movt r3, #1
863c: e5933000 ldr r3, [r3]
8640: edd37a00 vldr s15, [r3]
8644: eefd7ae7 vcvt.s32.f32 s15, s15
8648: ee170a90 vmov r0, s15
864c: e12fff1e bx lr
The compiler cannot know here whether the data is aligned, but it nevertheless uses VLDR, which requires aligned data; otherwise the program crashes with a bus error.
Now my actual question: is this behavior of the compiler correct, so that I need to take care of alignment myself in my C++ code, or is this a compiler bug?
I should also mention my current workaround, which works and gets gcc to make a copy before accessing the value. The trick is to define a struct that contains only a float, mark it with the gcc packed attribute, and access the data through a pointer to that struct. Code snippet:
struct FloatWrapper { float f; } __attribute__((packed));
const FloatWrapper *x = reinterpret_cast<const FloatWrapper *>(rawX.data());
const FloatWrapper *y = reinterpret_cast<const FloatWrapper *>(rawY.data());
for (size_t i = 0; i < vertexCount; ++i) {
    vertices[i].x = x[i].f;
    vertices[i].y = y[i].f;
}
float* is always assumed to have at least alignof(float) alignment. If you're violating that, you need to use memcpy or something other than simple dereference to avoid Undefined Behaviour. (Even when compiling for x86, for example.) Why does unaligned access to mmap'ed memory sometimes segfault on AMD64? / trust-in-soft.com/blog/2020/04/06/… – Peter Cordes
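The memcpy approach mentioned in the comment above could look roughly like the following. This is a minimal sketch, not a definitive implementation: the helper names load_float_unaligned and load_floats are made up for illustration, and the blob is assumed to be a std::vector<unsigned char> holding floats in the host's byte order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper: read one float from an address that carries no
// alignment guarantee. memcpy takes void pointers, so the compiler may not
// assume that p is aligned to alignof(float).
inline float load_float_unaligned(const void *p)
{
    float f;
    std::memcpy(&f, p, sizeof f);
    return f;
}

// Hypothetical helper: copy `count` floats that start `offset` bytes into
// `blob`. The destination vector is properly aligned, so its elements can be
// used through ordinary float accesses afterwards.
inline std::vector<float> load_floats(const std::vector<unsigned char> &blob,
                                      std::size_t offset, std::size_t count)
{
    // Bounds checks (active unless NDEBUG is defined).
    assert(offset <= blob.size());
    assert(count <= (blob.size() - offset) / sizeof(float));

    std::vector<float> out(count);
    if (count != 0)
        std::memcpy(out.data(), blob.data() + offset, count * sizeof(float));
    return out;
}
```

With helpers like these, the loop from the question would read each coordinate via load_float_unaligned(rawX.data() + i * sizeof(float)) instead of dereferencing a float* directly. Compilers typically turn such a fixed-size memcpy into whatever load sequence is safe for the target rather than an actual library call, but that is worth confirming in the generated assembly for the specific gcc version and flags.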