6
votes

Here is a declaration of a C struct:

struct align
{
    char c; //1 byte
    short s;//2 bytes
};

In my environment, sizeof(struct align) is 4, and there is 1 byte of padding between 'char c' and 'short s'. Some say that's because 'short' has to be 2-byte aligned, so 1 byte of padding is placed after 'char c'. On a 32-bit machine, I know 'int' should be 4-byte aligned to avoid needing 2 memory read cycles, since the addresses sent on the address bus between the CPU and memory are multiples of 4. But 'short' is 2 bytes, which is less than 4 bytes, so its address could be any byte within a 4-byte unit (except the last byte).

multiple of 4 address -> |0|1|2|3|

I mean, 'short' can start at 0, 1, or 2. All of these can be retrieved in 1 read cycle; it doesn't have to start at 0 or 2. In my 'struct align' case, 'char c' could be at 0, 'short s' could be at 1-2, and the padding could be at 3.

Why does a 2-byte 'short' have to be 2-byte aligned?

Thanks

Update: my environment is gcc version 4.4.7, i686, Intel.

1
What compiler and platform are you using? What compiler options do you have set? - Dai
It depends on your compiler and system architecture. You should include those details in your question. On some systems, short must have 2-byte alignment. - M.M
On many machines, accessing an N-byte quantity (at least for N in {1, 2, 4, 8, 16}) works most efficiently when the quantity is N-byte aligned. It's the way life is; get used to it, because I doubt that chip manufacturers are going to change it just because you think it isn't the way it should be. - Jonathan Leffler
@Dai, I've updated the question with my environment. No specific compile options, just "gcc". - password636
@MattMcNabb, I've updated the question with my environment. My system is a normal 32-bit CentOS with an Intel chip. - password636

1 Answer

4
votes

That is because, from the machine's perspective, a member of a struct is no different from a standalone variable of that type. Whatever alignment you choose, it applies to both.

For example, if short is two bytes long,

struct align
{
    char c;
    short s; // two-byte word
};

short ss; // two-byte word

The member s has a 2-byte type (e.g. WORD in IA32), exactly the same type as the "standalone" variable ss. The underlying architecture treats them the same. So when there is an alignment requirement for that type, it applies to both.
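A minimal sketch for checking this on your own compiler. The names short_probe, short_alignment, member_offset, standalone_is_aligned and print_layout are made up for illustration; the probe struct relies on the common trick that the offset of a member placed right after a single char equals the alignment the compiler chose for that member's type. It uses only offsetof, so it should also build with an older gcc such as 4.4.7.

```c
#include <stddef.h>  /* offsetof, size_t */
#include <stdint.h>  /* uintptr_t */
#include <stdio.h>   /* printf */

struct align {
    char c;
    short s;
};

/* Hypothetical probe type: the offset of 'value' is the alignment
   the compiler applies to short. */
struct short_probe {
    char pad;
    short value;
};

short ss; /* standalone short, same type as the member s */

/* Alignment the compiler uses for short (assumption: probe trick holds). */
size_t short_alignment(void)
{
    return offsetof(struct short_probe, value);
}

/* Offset of the member s inside struct align. */
size_t member_offset(void)
{
    return offsetof(struct align, s);
}

/* Returns 1 if the standalone variable ss sits at an aligned address. */
int standalone_is_aligned(void)
{
    return (uintptr_t)&ss % short_alignment() == 0;
}

/* Prints the values; the numbers depend on compiler and platform. */
void print_layout(void)
{
    printf("alignment of short   = %lu\n", (unsigned long)short_alignment());
    printf("offset of s          = %lu\n", (unsigned long)member_offset());
    printf("sizeof(struct align) = %lu\n", (unsigned long)sizeof(struct align));
    printf("ss aligned           = %d\n", standalone_is_aligned());
}
```

Calling print_layout() from main should show the member offset and the address of ss both being multiples of the same alignment value, whatever that value is on your platform.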

And if you put the padding at the end of the data instead, the short may still be misaligned. Consider the case where ss starts at the last byte of a 4-byte unit: it then straddles the 4-byte boundary.
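The boundary case can be checked with a little arithmetic. This sketch uses the simplified model from the question (memory fetched in 4-byte units) as an assumption, not a description of how any particular Intel chip works; straddles and aligned_never_straddles are hypothetical helper names.

```c
/* Returns 1 if an object of `size` bytes starting at byte address `addr`
   crosses a `unit`-byte boundary, i.e. its first and last bytes fall in
   different units. */
int straddles(unsigned long addr, unsigned long size, unsigned long unit)
{
    return addr / unit != (addr + size - 1) / unit;
}

/* Returns 1 if no 2-byte object starting at an even address below `limit`
   crosses a 4-byte boundary. */
int aligned_never_straddles(unsigned long limit)
{
    unsigned long addr;

    for (addr = 0; addr < limit; addr += 2) {
        if (straddles(addr, 2, 4)) {
            return 0;
        }
    }
    return 1;
}
```

In this model, start offsets 0, 1 and 2 fit inside one unit, and only offset 3 crosses into the next. Placing s at offset 1 would therefore be safe only if every struct align started at a multiple of 4. With 2-byte alignment, the struct only needs to start at an even address, and s can never land on the last byte of a unit.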