The structures defined in /usr/include/linux/* and /usr/include/asm-generic/* do not appear to be packed, so they depend on the compiler used and the alignment semantics of said compiler, right?
That's generally not true. Here is an example from a 64-bit Ubuntu system (/usr/include/x86_64-linux-gnu/asm/stat.h):
struct stat {
    __kernel_ulong_t st_dev;
    __kernel_ulong_t st_ino;
    __kernel_ulong_t st_nlink;
    unsigned int st_mode;
    unsigned int st_uid;
    unsigned int st_gid;
    unsigned int __pad0;
    __kernel_ulong_t st_rdev;
    __kernel_long_t st_size;
    __kernel_long_t st_blksize;
    __kernel_long_t st_blocks; /* Number 512-byte blocks allocated. */
    __kernel_ulong_t st_atime;
    __kernel_ulong_t st_atime_nsec;
    __kernel_ulong_t st_mtime;
    __kernel_ulong_t st_mtime_nsec;
    __kernel_ulong_t st_ctime;
    __kernel_ulong_t st_ctime_nsec;
    __kernel_long_t __unused[3];
};
See __pad0? An int is generally 4 bytes, but st_rdev is a long, which is 8 bytes here, so it must be 8-byte aligned. However, it is preceded by three 8-byte fields and 3 ints (24 + 12 = 36 bytes), so a 4-byte __pad0 is added to bring st_rdev to offset 40.
Essentially, the authors of these headers take care to hard-code the ABI by making the padding explicit.
BUT that isn't true for all APIs. Here is struct flock (from the same machine, /usr/include/asm-generic/fcntl.h), used by the fcntl() call:
struct flock {
    short l_type;
    short l_whence;
    __kernel_off_t l_start;
    __kernel_off_t l_len;
    __kernel_pid_t l_pid;
    __ARCH_FLOCK_PAD
};
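As you can see, there is no explicit padding between l_whence and l_start, even though the two shorts take only 4 bytes and l_start is 8 bytes wide on this machine. The header relies on the compiler to insert the missing 4 bytes. And indeed, for the following C program, saved as abi.c: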
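#include <fcntl.h>
#include <string.h>

int main(int argc, char **argv)
{
    struct flock fl;
    int fd;

    /* The file "y" must already exist in the current directory. */
    fd = open("y", O_RDWR);
    /* Fill with 0xff so bytes read from the wrong offset stand out. */
    memset(&fl, 0xff, sizeof(fl));
    fl.l_type = F_RDLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = 200;
    fl.l_len = 1;
    fcntl(fd, F_SETLK, &fl);
}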
We get:
$ cc -g -o abi abi.c && strace -e fcntl ./abi
fcntl(3, F_SETLK, {l_type=F_RDLCK, l_whence=SEEK_SET, l_start=200, l_len=1}) = 0
+++ exited with 0 +++
$ cc -g -fpack-struct -o abi abi.c && strace -e fcntl ./abi
fcntl(3, F_SETLK, {l_type=F_RDLCK, l_whence=SEEK_SET, l_start=4294967296, l_len=-4294967296}) = 0
+++ exited with 0 +++
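As you can see, the fields following l_whence are indeed garbage. The kernel still reads l_start at offset 8 and l_len at offset 16, but the packed struct stores them at offsets 4 and 12. What the kernel sees as l_start is therefore the upper half of the real l_start followed by the lower half of l_len (0x100000000 = 4294967296), and what it sees as l_len picks up the 0xff bytes that memset left in l_pid.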
Moreover, C has no standard ABI, so this fragile compatibility relies on implementations playing nice. struct stat above assumes that the compiler won't insert extra padding of its own.
ANSI C says:
There may also be unnamed padding at the end of a structure or union, as necessary to achieve the appropriate alignment were the structure or union to be a member of an array.
There's no wording on how padding may be inserted in the middle of a struct for reasons other than alignment. However, there's also:
Implementation-defined behavior
Each implementation shall document its behavior in each of the areas listed in this section. The following are implementation-defined:
...
The padding and alignment of members of structures. This should present no problem unless binary data written by one implementation are read by another.
On my Ubuntu machine, both the compiler and the standard library come from the GNU project (GCC and glibc), so they interoperate smoothly. Clang wants to grow, so it's compatible with GNU libc. Everyone is just playing nice, most of the time.