4 votes

As we know, there are two common types of endianness: big endian and little endian.

Let's say that an integer takes 4 bytes; then the in-memory layout of the integer 1 is 0x01 0x00 0x00 0x00 on little endian and 0x00 0x00 0x00 0x01 on big endian.

To check if a machine is little endian or big endian, we can code as below:

#include <stdio.h>

int main(void)
{
    int a = 1;
    char *p = (char *)&a;
    /* *p == 1 means little endian; otherwise, big endian */
    printf("%s endian\n", *p == 1 ? "little" : "big");
    return 0;
}

As I understand it, *p reads the byte at the lowest address: 0x01 on little endian and 0x00 on big endian (the first byte of each layout above); that's how the code works.
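For illustration, here is a minimal sketch along the same lines (the test value 0x01020304 is my own choice, not from the question) that dumps every byte of an int so the byte order can be inspected directly:

#include <stdio.h>

int main(void)
{
    unsigned int x = 0x01020304u; /* four distinct byte values reveal the order */
    const unsigned char *p = (const unsigned char *)&x;
    for (size_t i = 0; i < sizeof x; i++)
        printf("%02x ", p[i]); /* little endian: 04 03 02 01; big endian: 01 02 03 04 */
    putchar('\n');
    return 0;
}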

Now I don't quite understand how bit field works with different Endianness.

Let's say we have such a struct:

typedef struct {
    unsigned char a1 : 1;
    unsigned char a2 : 1;
    unsigned char a6 : 3;
} Bbit;

And we do the assignment as below:

Bbit bit;
bit.a1 = 1;
bit.a2 = 1;

Will this piece of code be implementation-specific? That is, will bit.a1 and bit.a2 read back as 1 on little endian but as 0 on big endian? Or are they definitely 1 regardless of endianness?

The layout of bitfields is implementation-defined. The bits can really go anywhere the compiler wants them. – paddy
"Let's say that an integer takes 4 bytes" --> Interesting that one is concerned about various endians, yet not various int sizes. – chux - Reinstate Monica
There are many implementation-defined issues with bit-fields. "A bit-field shall have a type that is a qualified or unqualified version of _Bool, signed int, unsigned int, or some other implementation-defined type." So even OP's unsigned char a1 : 1; lacks portability. – chux - Reinstate Monica
@Yves Bit-fields are a place in C where int might differ from signed int --> "it is implementation-defined whether the specifier int designates the same type as signed int or the same type as unsigned int." int a1 : 1; might encode -1,0 or 0,1. – chux - Reinstate Monica
"we have two types of Endianness:" --> Could not resist: PDP-endian. – chux - Reinstate Monica

3 Answers

1 vote

Let's say we have a struct:

typedef struct {
    unsigned char a1 : 1;
    unsigned char a2 : 1;
    unsigned char a6 : 3;
} Bbit;

and a definition:

Bbit byte;

Suppose byte is stored in a single byte and is currently zeroed out: 0000 0000.

byte.a1 = 1;

This sets the bit called a1 to 1. If a1 is the first bit, then byte has become 1000 0000, but if a1 is the fifth bit, then byte has become 0000 1000, and if a1 is the eighth bit, then byte has become 0000 0001.

byte.a2 = 1;

This sets the bit called a2 to 1. If a2 is the second bit, then byte has (likely) become 1100 0000, but if a2 is the sixth bit, then byte has (likely) become 0000 1100, and if a2 is the seventh bit, then byte has become 0000 0011. (These are only "likely" because there is no guarantee that the bits follow some reasonable order. It's just unlikely that a compiler will go out of its way to mess up this example.)

Endianness is not a factor when it comes to the values that are stored. Only the bits representing the specified bit-field are changed with each assignment, and the value being assigned is reduced to that number of bits (for an unsigned bit-field it wraps modulo 2^width; for a signed bit-field an out-of-range value gives an implementation-defined result).
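As a minimal sketch of this point (the value 9 assigned to a6 is my own addition, chosen to show the reduction), the fields read back the same on any implementation, whatever the bit order in memory:

#include <stdio.h>

typedef struct {
    unsigned char a1 : 1;
    unsigned char a2 : 1;
    unsigned char a6 : 3;
} Bbit;

int main(void)
{
    Bbit bit = {0};
    bit.a1 = 1;
    bit.a2 = 1;
    bit.a6 = 9; /* 9 needs 4 bits; the unsigned 3-bit field stores 9 % 8 == 1 */
    /* Prints a1=1 a2=1 a6=1 regardless of endianness or bit order. */
    printf("a1=%d a2=%d a6=%d\n", bit.a1, bit.a2, bit.a6);
    return 0;
}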

2 votes

With bitfields, not only is byte endianness implementation defined but so is bit endianness.

Section 6.7.2.1p11 of the C standard regarding structs states:

An implementation may allocate any addressable storage unit large enough to hold a bit-field. If enough space remains, a bit-field that immediately follows another bit-field in a structure shall be packed into adjacent bits of the same unit. If insufficient space remains, whether a bit-field that does not fit is put into the next unit or overlaps adjacent units is implementation-defined. The order of allocation of bit-fields within a unit (high-order to low-order or low-order to high-order) is implementation-defined. The alignment of the addressable storage unit is unspecified.

So the compiler is free to allocate bit-fields starting from either end of the storage unit, as it sees fit. As an example of this, here is the struct that represents an IP header in /usr/include/netinet/ip.h on Linux:

struct iphdr
  {
#if __BYTE_ORDER == __LITTLE_ENDIAN
    unsigned int ihl:4;
    unsigned int version:4;
#elif __BYTE_ORDER == __BIG_ENDIAN
    unsigned int version:4;
    unsigned int ihl:4;
#else
# error "Please fix <bits/endian.h>"
#endif
    u_int8_t tos;
    u_int16_t tot_len;
    u_int16_t id;
    u_int16_t frag_off;
    u_int8_t ttl;
    u_int8_t protocol;
    u_int16_t check;
    u_int32_t saddr;
    u_int32_t daddr;
    /*The options start here. */
  };

Here you can see that there are two fields which are bitfields, and the order in which they are declared depends on the endianness in use.

So what this means is that you can't depend on any particular byte (or bit) ordering if you send a raw struct over a network.
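If the layout matters, a common workaround is to avoid bit-fields for wire formats and pack the bits explicitly with shifts and masks, which behave the same on every machine. A minimal sketch (the function names are my own, not from any header):

#include <stdint.h>

/* Pack the IPv4 version (high nibble) and header length (low nibble)
   into the first header byte explicitly, independent of bit-field layout. */
static uint8_t pack_version_ihl(uint8_t version, uint8_t ihl)
{
    return (uint8_t)((version << 4) | (ihl & 0x0F));
}

static void unpack_version_ihl(uint8_t byte, uint8_t *version, uint8_t *ihl)
{
    *version = byte >> 4;
    *ihl = byte & 0x0F;
}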

Given your example with some additions to view the representation:

#include <stdio.h>

int main(void)
{
    Bbit bit; /* Bbit as defined in the question */
    bit.a1 = 1;
    bit.a2 = 1;
    unsigned char *p = (unsigned char *)&bit;
    printf("%02x\n", *p);
}

A big endian system will probably print c0 while a little endian system will probably print 03. And that's assuming the unused bits happen to be set to 0.

1 vote

The C standard does not even require that the bytes representing an integer be in big-endian or little-endian order (they may be mixed), let alone specify what order bit-fields are in. These things are implementation-defined, which means the C standard does not specify them but requires that they be documented in the compiler's manual or other documentation. The order of bit-fields within bytes or other units does not have to match the order of bytes within objects.
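As a minimal sketch of how one might probe the bit-field allocation order on a given implementation (my own test, not part of the answer):

#include <stdio.h>

/* Overlay a bit-field struct on a plain byte to see which end of the
   storage unit the first declared field lands in. */
union Probe {
    struct {
        unsigned char first : 1;
        unsigned char rest  : 7;
    } f;
    unsigned char byte;
};

int main(void)
{
    union Probe p;
    p.byte = 0;
    p.f.first = 1;
    /* 0x01: first field at the low-order end; 0x80: at the high-order end. */
    printf("0x%02x\n", p.byte);
    return 0;
}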