
I was learning about structure padding, and read that the reason behind structure padding is that if the members of the struct are not aligned, the processor won't be able to read/write them in only one cycle. In general, the location of a data type that consists of N bytes should be at an address which is a multiple of N.

Suppose this struct for example:

struct X
{
    char c;
    // 3 bytes padding here so that i is aligned.
    int i;
};

Here the size of this struct should be 8 bytes. c is aligned by default because it only occupies 1 byte, but i is not, so we need to add 3 bytes of padding before it so that it's "aligned" and can be accessed in only one cycle. Tell me if I'm missing something.
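
Here is a small check I put together to see the actual layout on a given platform (the 0/4/8 values in the comment are only what I'd expect with a 4-byte int aligned to 4 bytes; the standard doesn't guarantee them):

#include <stdio.h>
#include <stddef.h>

struct X
{
    char c;
    int i;
};

int main(void)
{
    /* On a typical platform with a 4-byte int aligned to 4 bytes,
       this prints offsets 0 and 4 and a total size of 8, but the
       exact numbers are implementation-defined. */
    printf("offsetof(struct X, c) = %zu\n", offsetof(struct X, c));
    printf("offsetof(struct X, i) = %zu\n", offsetof(struct X, i));
    printf("sizeof(struct X)      = %zu\n", sizeof(struct X));
    return 0;
}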

1 - How does alignment work? What do the members get aligned to?

2 - Why is it better for the CPU to access an N-byte data type at an address that's a multiple of N? For example, in the struct above, if i is located at address XXX3 (ending in 3, in other words, not a multiple of 4), why not read a word starting from address XXX3? Why does it have to be a multiple of 4? Do most CPUs only access addresses that are multiples of the word size? I believe that CPUs can read a word from memory starting at any byte. Am I wrong?

3 - Why doesn't the compiler reorder the members so that the struct takes as little space as possible? Does the ordering matter? I'm not sure anybody uses actual offset numbers to access members. Meaning that if there is a struct X x, members are usually accessed like x.i, not *(int *)((char *)&x + 4). In the latter case the ordering would actually matter, but in the first case (which I believe everybody uses) it shouldn't. I should note that in this example it doesn't matter anyway, since if i came before c there would still be 3 bytes of padding, just at the end. I'm asking generally: why?

4 - I've read that this is not important anymore and that CPUs can now usually access non-aligned members in the same time as aligned ones. Is that true? If so, why?

Finally, if there is a good place to learn more about this, I would be thankful.

Alignment is a platform architecture constraint. Misaligned data access can be expensive (up to 16x as expensive as aligned access) on some architectures, can thwart atomic read/write (only relevant for multithreaded applications), or can be unsupported entirely (causing a process fault). Other architectures can handle it without issue, and others can handle it but at a performance penalty (so the compiler errs on the side of performance). - Eljay
@ssd As mentioned in question 2, if I have a double for example at address XXX3, why not read a whole 8 bytes starting from the address XXX3? Can't the CPU just access any location in the memory? Why should the address be a multiple of 8 in this example? - StackExchange123
@StackExchange123 : Yes, the CPU can access that piece of memory, and you can achieve this by writing your own assembly code. Compilers are just optimized to read in chunks. - ssd
@StackExchange123 : I've googled and found that some CPUs (ARM, for example) have compiler options (-munaligned-access) that let you control this aligned-access behaviour. - ssd

3 Answers

3 votes
  1. They get aligned to at least _Alignof(type). In principle an implementation is allowed to align further, but this is generally undesirable and no major implementation does. (A short sketch after this list shows _Alignof and the member addresses for the struct in the question.)

  2. As noted in a comment by Eljay:

    Alignment is a platform architecture constraint. Misaligned data access can be expensive (up to 16x as expensive as aligned access) on some architectures, can thwart atomic read/write (only relevant for multithreaded applications), or can be unsupported entirely (causing a process fault). Other architectures can handle it without issue, and others can handle it but at a performance penalty (so the compiler errs on the side of performance).

    The language standard is written to allow for such platform constraints.

  3. It's not allowed to, at least not when the structure's address is taken in a way that makes its representation visible to the application. The language specification requires members to be in declaration order. This is 6.7.2.1 Structure and union specifiers, ¶15:

    Within a structure object, the non-bit-field members and the units in which bit-fields reside have addresses that increase in the order in which they are declared. A pointer to a structure object, suitably converted, points to its initial member (or if that member is a bit-field, then to the unit in which it resides), and vice versa. There may be unnamed padding within a structure object, but not at its beginning.

  4. No, it's not true. High-end CPUs generally patch up misaligned accesses transparently, to allow certain types of sloppy code as well as operations that are necessarily misaligned (like memcpy or memmove through buffers with differing alignments), but that does not change the fact that these operations tend to be more expensive, and they're not available for some things like atomic operations.
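
To make points 1 and 3 concrete, here is a minimal C11 sketch of my own (illustrative only, not normative): it prints the alignment the implementation actually uses for each type and shows that a pointer to the structure points to its first member, with member addresses increasing in declaration order.

#include <stdio.h>

struct X
{
    char c;
    int i;
};

int main(void)
{
    struct X x = { 'a', 42 };

    /* Point 1: each member is aligned to the alignment of its type. */
    printf("_Alignof(char) = %zu, _Alignof(int) = %zu, _Alignof(struct X) = %zu\n",
           _Alignof(char), _Alignof(int), _Alignof(struct X));

    /* Point 3: a pointer to the structure, suitably converted, points to
       its first member, and member addresses increase in declaration order. */
    printf("&x   = %p\n", (void *)&x);
    printf("&x.c = %p\n", (void *)&x.c);
    printf("&x.i = %p\n", (void *)&x.i);
    return 0;
}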

3 votes

1 How does alignment work?

Memory for objects is allocated at locations where the alignment requirement of the type is satisfied. That is: for an alignment requirement of N, the address of the memory location will be divisible by N.
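
As a small illustrative check of this divisibility rule (my own sketch, not part of the original answer):

#include <stdio.h>
#include <stddef.h>
#include <stdint.h>

int main(void)
{
    double d = 0.0;

    /* For an alignment requirement of N, the object's address is
       divisible by N, so this remainder is always 0. */
    printf("_Alignof(double) = %zu\n", _Alignof(double));
    printf("address %% alignment = %zu\n",
           (size_t)((uintptr_t)&d % _Alignof(double)));
    return 0;
}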

1 What do the members get aligned to?

Objects are aligned to whatever is the alignment of the type of that object on the target system. This is the same for all objects, including member objects.

2 Do most CPUs only access addresses that are multiples of the word size?

Some CPUs do indeed only access addresses that are aligned.

2 I believe that CPUs can read a word from the memory starting at any byte. Am I wrong?

In the case of some CPUs, you are not wrong. You would be wrong to believe this applies to all CPUs.

2 - Why is it better for the CPU to access an N-byte data type at an address that's a multiple of N?

On such CPUs as mentioned above, reading an address that isn't a multiple of N (i.e. isn't aligned) will result in a segfault. A segfault will cause the process to terminate. It is better for the process to not terminate until after it has finished whatever it was supposed to do.

On some other CPU, accessing memory from aligned address can be faster. Faster is better.

Probably on all CPUs, accessing misaligned memory will not be an atomic operation. Whether this is better or irrelevant depends on what you're doing.
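
If you ever do need to read from a possibly misaligned location, the usual portable approach in C is to go through memcpy and let the compiler pick whatever instruction sequence the target supports; dereferencing a cast, misaligned pointer is undefined behaviour and is exactly the case that can fault on the strict CPUs mentioned above. A minimal sketch (my example, with made-up buffer contents):

#include <stdio.h>
#include <string.h>
#include <stdint.h>

/* Read a 32-bit value from an arbitrary byte position in a buffer. */
static uint32_t load_u32(const unsigned char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);   /* well-defined even if p is misaligned */
    return v;
}

int main(void)
{
    unsigned char buf[8] = { 0x00, 0x78, 0x56, 0x34, 0x12, 0x00, 0x00, 0x00 };

    /* buf + 1 is typically not 4-byte aligned; the printed value
       depends on the platform's byte order. */
    printf("value = 0x%08X\n", (unsigned)load_u32(buf + 1));

    /* By contrast, this would be undefined behaviour and may fault on
       strict-alignment CPUs:
       uint32_t bad = *(const uint32_t *)(buf + 1);
    */
    return 0;
}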

3 - Why doesn't the compiler reorder the members so that the struct takes as little space as possible? I'm not sure anybody uses actual offset numbers to access members.

Because the language guarantees the order of the members, a programmer can rely on that guarantee, whether you think anyone would do so or not. There are some rare use cases for relying on it.

However, guarantees to the programmer are not the only issue with an arbitrary order of members. Another aspect is compatibility of libraries across separate compilers: all involved compilers have to agree on what the order of the members is, and the specified order is the order of declaration.
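
One way the order guarantee shows up in practice: every conforming compiler has to give the members increasing offsets in declaration order, and those offsets (plus any padding) are exactly what both sides of a library boundary have to agree on. A small illustrative check, using a made-up struct name:

#include <stdio.h>
#include <stddef.h>

/* Hypothetical struct, as might appear in a library's public header. */
struct packet_header
{
    unsigned char  version;
    unsigned char  flags;
    unsigned short length;
    unsigned int   checksum;
};

int main(void)
{
    /* Offsets increase in declaration order on every conforming compiler;
       the exact values and padding are part of the library's ABI. */
    printf("version  at %zu\n", offsetof(struct packet_header, version));
    printf("flags    at %zu\n", offsetof(struct packet_header, flags));
    printf("length   at %zu\n", offsetof(struct packet_header, length));
    printf("checksum at %zu\n", offsetof(struct packet_header, checksum));
    printf("total size  %zu\n", sizeof(struct packet_header));
    return 0;
}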

4 - I've read that this is not important anymore and that CPUs can now usually access non-aligned members in the same time as aligned ones. Is that true? If so, why?

That is an overly generalised statement. It may be true for some CPUs and some use cases. I recommend not assuming such a general statement to be a universal truth.

If we were to assume this to be true for a specific CPU, one reason could be that such a new CPU can access unaligned memory, while an earlier CPU couldn't (such as the old ARMv4).

In another case, an earlier CPU could perhaps read and write unaligned memory, but such operations were slower. If those operations have equivalent speed on a newer CPU, then alignment becomes unimportant for performance on that CPU.

Older CPUs are still in use and have not disappeared.

3 votes

C and C++ are not compatible tags. Choose one.

It requires less logic for a processor to access a naturally aligned object than an unaligned object.

That might seem like a 1970s response, but to fast-forward a bit, imagine loading a 4-byte quantity from address 0x1ffffff.

What does the CPU do, exactly? Ask the memory system for the byte at 0x1ffffff, then the long at 0x2000000, then shift and mask them into a register?

That doesn't sound too bad, until you realize that it required two separate memory transactions to fulfill this. That is bad. Another CPU could have rewritten part of this in the intervening operation, so our load is invalid.
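
To see the split concretely, here is a small sketch of my own (nothing beyond arithmetic is assumed; the 64-byte figure is only the cache-line size that is common today) reporting whether an access starting at a given address stays within one aligned block or needs two:

#include <stdio.h>
#include <stdint.h>

/* Does an access of 'size' bytes starting at 'addr' cross a boundary
   between aligned blocks of 'block' bytes? 'block' must be a power of two.
   With block = 4 this models the word split described above; with
   block = 64 it models a typical cache line. */
static int crosses_boundary(uint64_t addr, uint64_t size, uint64_t block)
{
    return (addr & (block - 1)) + size > block;
}

int main(void)
{
    uint64_t addr = 0x1ffffff;   /* the address from the example above */

    printf("4-byte load at 0x%llx crosses a 4-byte word boundary:  %s\n",
           (unsigned long long)addr, crosses_boundary(addr, 4, 4) ? "yes" : "no");
    printf("4-byte load at 0x%llx crosses a 64-byte line boundary: %s\n",
           (unsigned long long)addr, crosses_boundary(addr, 4, 64) ? "yes" : "no");
    return 0;
}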

Extending the bus-lock protocol to handle multiple transactions is likely a non-starter: it was a lot of work to get the bus protocols to work as is.

In practice, modern systems use cache-line-sized accesses, so provided your unaligned access stays within a single cache line, it is probably OK; once it crosses one, you are at the mercy of unspecified bus controllers, etc.