10
votes

I have to port source code to an ARM platform that runs Linux. Unfortunately I have run into unaligned memory access problems. The source makes heavy use of pointer casts and dereferences the cast pointers directly.

Code like the one below has spread over the codebase like a virus. I can pinpoint the problematic locations thanks to the gcc -Wcast-align command line option but there are over a thousand instances to go through.

u = (IEC_BOOL);
(((*(IEC_LINT*)pSP).H < b.H) 
   || (((*(IEC_LINT*)pSP).H == b.H) && ((*(IEC_LINT*)pSP).L < b.L) )) ? 1 : 0);
*(IEC_DWORD OS_SPTR *)pSP = 
    (IEC_DWORD)(*(IEC_DWORD OS_SPTR *)pSP >> u);  
*(IEC_DWORD OS_SPTR *)pSP = 
    (IEC_DWORD)(*(IEC_DWORD OS_SPTR *)pSP << -u);  
u = (IEC_BYTE)((*(IEC_DINT*)pSP != b) ? 1 : 0);  
*(IEC_DWORD*)pSP = (IEC_DWORD)(*(IEC_DWORD*)pSP & w);  
(*(IEC_ULINT*)pSP).H += u.H;   
(((*(IEC_ULINT OS_SPTR *)pSP).H == b.H) 
    && ((*(IEC_ULINT OS_SPTR *)pSP).L > b.L))) ? 1 : 0);
u = (IEC_BYTE)((*(IEC_REAL*)pSP >= b) ? 1 : 0);

Using echo 2 > /proc/cpu/alignment makes the Linux kernel fix up the unaligned accesses, but the application's performance degrades to a point that is no longer acceptable.

I searched the net for something like a __unaligned or __packed keyword for the GCC (v4.4.1) compiler, but so far I have come up empty.

I thought a lot of the problematic code lines could be fixed with a more or less complex regexp/replace, but after doing that for a while I see that this approach will also take enormous amounts of tedious work.

Do you guys have any suggestions on how to get this job done? I think a gcc 4.5 compiler plugin would be overkill, but is there something better than regular expressions? What other suggestions can you come up with? Not all problem instances necessarily have to be fixed, since I can still rely on the kernel for a few rarer cases.

3
I'm tempted to joke that this should be moved to TheDailyWTF.com. - Crashworks
Continuing our study of modern linguistics, here we have a sample of the dialect of C used commonly by embedded programmers. Roughly translated to English, the above text means "F*** YOU!", although no natural language can truly convey the amount of spite, defiance against all that is sacred, and general disregard for the reader's humanity conveyed herein. - Dmitri

3 Answers

7
votes

There is __attribute__((__packed__)), which might help in some instances, but I really think this code should be cleaned up sooner rather than later, because you will likely spend more time working around problems than it would take to fix it once and for all.
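A minimal sketch of how the packed attribute could be used here, assuming GCC (or a compatible compiler) and a 32-bit value. The names unaligned_u32, load_u32, store_u32 and demo_roundtrip are hypothetical and do not come from the question's codebase. Packing a one-member struct lowers its assumed alignment to 1, so the compiler emits an access sequence that is safe for any address; __may_alias__ is added so the access does not conflict with strict-aliasing assumptions.

```c
#include <stdint.h>

/* Hypothetical wrapper type (not from the original code base).
 * __packed__ drops the assumed alignment of the member to 1;
 * __may_alias__ lets the struct alias storage of any other type. */
struct unaligned_u32 {
    uint32_t value;
} __attribute__((__packed__, __may_alias__));

/* Read a 32-bit value from an address with no alignment guarantee. */
static inline uint32_t load_u32(const void *p)
{
    return ((const struct unaligned_u32 *)p)->value;
}

/* Write a 32-bit value to an address with no alignment guarantee. */
static inline void store_u32(void *p, uint32_t v)
{
    ((struct unaligned_u32 *)p)->value = v;
}

/* Round trip through a deliberately misaligned address. */
static inline uint32_t demo_roundtrip(void)
{
    unsigned char stack[8] = {0};
    unsigned char *pSP = stack + 1;   /* odd offset: misaligned for uint32_t */

    store_u32(pSP, 0x11223344u);
    return load_u32(pSP);
}
```

This only helps where the access can be routed through such a wrapper, so every cast in the code base would still have to be touched once.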

2
votes

Wow, that's an unholy mess. Fiddling with the compiler isn't going to get you anywhere. The code is illegal (undefined behaviour in C) on ALL architectures, but just happens to work on some (e.g. x86). I would fix the code itself.

Sadly there's no pretty way to do that. However, you might get a long way with a long list of search-and-replaces, then manually fix up the rest. I'd start by removing the declarations of those data types, so if you compile any code you missed, it'll error. Then, search-and-replace snippets such as "*(IEC_DWORD OS_SPTR *)pSP =" with "set_dword(pSP, ". Make an inline function "set_dword" to do the right thing. Carry on for as many easily replaced snippets as you can imagine. There will still be a large amount to fix up by hand.
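As a rough sketch of what such helpers might look like: the typedef for IEC_DWORD below is an assumption (the question does not show the real declaration), and get_dword and and_dword are hypothetical companions to the set_dword named above. Copying through memcpy is well defined for any address, and compilers typically reduce a fixed-size memcpy to the best load/store sequence the target allows.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: IEC_DWORD is a 32-bit unsigned type; the real typedef
 * is not shown in the question. */
typedef uint32_t IEC_DWORD;

/* Read an IEC_DWORD from a possibly misaligned address. */
static inline IEC_DWORD get_dword(const void *p)
{
    IEC_DWORD v;
    memcpy(&v, p, sizeof v);   /* byte-wise copy, valid for any alignment */
    return v;
}

/* Write an IEC_DWORD to a possibly misaligned address. */
static inline void set_dword(void *p, IEC_DWORD v)
{
    memcpy(p, &v, sizeof v);
}

/* Rewritten form of:
 *   *(IEC_DWORD*)pSP = (IEC_DWORD)(*(IEC_DWORD*)pSP & w);
 */
static inline void and_dword(void *pSP, IEC_DWORD w)
{
    set_dword(pSP, get_dword(pSP) & w);
}
```

Note that replacing the left-hand side with "set_dword(pSP, " leaves an unclosed parenthesis at the end of each statement, which is part of the manual fix-up.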

The only other way I can think of doing this would be a compiler plugin, as you suggest, and make every pointer in the entire compilation unit have alignment 1. The compiler will then byte load/store everything. It'll probably end up doing that for more than just the code you intended. This probably isn't as easy to achieve as it sounds.

0
votes

We can assume that the problem originates from the fact that ARM is a 32-bit machine and the Linux box runs in 64-bit mode. Alternatively, the code could assume that it is running on a 16-bit machine.

One way would be to look at the underlying structure that is accessed. The members "H" and "L" might be 32 bit types that are accessed as though they are 64 bit.

Try to modify the types of L and H to make the code behave better.

(Admittedly, this is a stab into thin air, as the code snippet doesn't reveal the details of the application, nor of the underlying structures.)