3
votes

As far as I understand, the following piece of code exhibits undefined behaviour in C11:

#include <string.h>

struct aaaa { char bbbb; int cccc; };

int main(void) {
    unsigned char buffer[sizeof(struct aaaa)] = { 0 };
    struct aaaa *pointer = (struct aaaa *)&buffer[0];

    return (*pointer).cccc;
}

According to N1570 section 6.5.3.2 clause 4,

If an invalid value has been assigned to the pointer, the behavior of the unary * operator is undefined.

which is accompanied by a footnote that clarifies that

Among the invalid values for dereferencing a pointer by the unary * operator are a null pointer, an address inappropriately aligned for the type of object pointed to, and the address of an object after the end of its lifetime.

It's unlikely that struct aaaa and unsigned char have the same alignment requirement, so buffer may not be suitably aligned for a struct aaaa. In that case we assigned an invalid value to pointer, and using *pointer therefore causes UB.

However, can I copy the structure?

#include <string.h>

struct aaaa { char bbbb; int cccc; };

int main(void) {
    unsigned char buffer[sizeof(struct aaaa)] = { 0 };
    struct aaaa target;

    memcpy(&target, buffer, sizeof(struct aaaa));

    return target.cccc;
}

Here, we pass a struct aaaa * and unsigned char * to memcpy. While that seems just as bad as the first piece of code, I can't find any wording in C11 that rules that this code exhibits UB. Does this usage of memcpy cause undefined behaviour?


2 Answers

5
votes

No, memcpy doesn't make any assumptions about alignment. It is functionally equivalent to copying byte by byte.
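To illustrate the "byte by byte" claim, here is a minimal sketch of a copy routine that only ever touches memory through unsigned char lvalues, which have no alignment requirement beyond 1. The names copy_bytes and read_cccc_from_zero_buffer are made up for this example, and real memcpy implementations are usually far more optimized; the sketch only shows the observable behaviour you can rely on.

```c
#include <assert.h>
#include <stddef.h>

struct aaaa { char bbbb; int cccc; };

/* Hypothetical helper: copies n bytes one at a time, like a naive memcpy.
   Accesses through a character type are always permitted and need no
   particular alignment. */
static void *copy_bytes(void *dst, const void *src, size_t n) {
    unsigned char *d = dst;
    const unsigned char *s = src;

    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; i++)
        d[i] = s[i];
    return dst;
}

/* Mirrors the question's second example: fills a struct aaaa from an
   all-zero byte buffer and returns its cccc member. */
static int read_cccc_from_zero_buffer(void) {
    unsigned char buffer[sizeof(struct aaaa)] = { 0 };
    struct aaaa target;

    copy_bytes(&target, buffer, sizeof target);
    return target.cccc;
}
```

Because target is a properly declared struct aaaa, reading target.cccc afterwards is an ordinary member access; only the copy itself goes through character lvalues.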

BTW, accessing an object through an lvalue of a different, non-character type leads to undefined behavior regardless of alignment. Here buffer has the declared type unsigned char[], so reading it as a struct aaaa violates the effective type rule, C11 6.5 p6 and p7.

-1
votes

From what I understand, both cases are UB (but not because of the call to memcpy), because the compiler is not required to align the start of a variable such as buffer suitably for struct aaaa. You can enforce alignment with compiler-specific attributes to be sure, but that is of course a platform-specific solution.
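As a side note on enforcing alignment: C11 also offers a standard way to request it, via _Alignas and _Alignof. A minimal sketch follows, where the function name buffer_is_aligned is made up for illustration and the check assumes uintptr_t is available and that the converted address reflects alignment in its low bits (true on common flat-memory platforms, but implementation-defined). This only addresses alignment; it does not change the effective type rule mentioned in the other answer.

```c
#include <assert.h>
#include <stdint.h>

struct aaaa { char bbbb; int cccc; };

/* The size of a type is always a multiple of its alignment. */
static_assert(sizeof(struct aaaa) % _Alignof(struct aaaa) == 0,
              "size must be a multiple of alignment");

/* Hypothetical check: returns 1 if a buffer declared with _Alignas starts
   at an address suitably aligned for struct aaaa, 0 otherwise. */
static int buffer_is_aligned(void) {
    _Alignas(struct aaaa) unsigned char buffer[sizeof(struct aaaa)] = { 0 };

    /* Assumption: the integer value of the pointer exposes its alignment. */
    return (uintptr_t)(void *)buffer % _Alignof(struct aaaa) == 0;
}
```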

Assuming the start offsets are aligned (an assumption from practice), as compilers usually arrange in order to gain performance:

In your first example, pointer is set to the address of buffer[0]. buffer is usually aligned correctly, and cccc will be aligned too, because the struct is not packed. It should not cause a problem in this case.

In the second example, memcpy will copy everything properly, because internally it tries its best to do an aligned copy for performance and, when that is not possible, copies byte-wise. Here again, all structures and buffers are aligned, subject to the restrictions I mentioned above.

What is the actual problem here?

You would risk a problem (one visible in practice) if you assigned &buffer[1], which is usually not suitably aligned. An access to cccc would then load a word from an unaligned address. On some architectures this causes the dreaded SIGBUS. x86 tolerates unaligned accesses and may slow down a bit, but does not crash.
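If you do need to read an int that starts at a possibly misaligned address such as &buffer[1], copying its bytes into a properly declared int avoids the unaligned load. A sketch, with the hypothetical helper names load_int_unaligned and roundtrip_at_offset_one chosen for this example only:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: reads an int whose bytes start at any address.
   memcpy does the byte transfer, so no aligned int access to p occurs. */
static int load_int_unaligned(const unsigned char *p) {
    int value;

    assert(p != NULL);
    memcpy(&value, p, sizeof value);
    return value;
}

/* Stores an int at offset 1 of a byte buffer, then reads it back. */
static int roundtrip_at_offset_one(int in) {
    unsigned char buffer[1 + sizeof(int)] = { 0 };

    memcpy(&buffer[1], &in, sizeof in);
    return load_int_unaligned(&buffer[1]);
}
```

Whether the compiler turns the memcpy into a single load or into byte-wise moves depends on the target; either way the source code makes no alignment assumption.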