0
votes

Many implementations of htonl() or ntohl() first test the endianness of the platform and then dispatch to either a no-op or a byte swap.

I once read a web page describing a few tricks for converting to and from big/little endian without any prior knowledge of the hardware configuration, simply taking endianness for what it is: a representation of integers in memory. I could not find the page again, so I wrote this:

#include <stdint.h>

typedef union {
    uint8_t b[4];
    uint32_t i;
} swap32_T;

uint32_t to_big_endian(uint32_t x) {
    /* convert to big endian, whatever the endianness of the platform */
    swap32_T y;
    y.b[0] = (x & 0xFF000000) >> 24;
    y.b[1] = (x & 0x00FF0000) >> 16;
    y.b[2] = (x & 0x0000FF00) >> 8;
    y.b[3] = (x & 0x000000FF);
    return y.i;
}

My two questions are:

  • Do you know a cleaner way to write this to_big_endian() function?
  • Did you ever bookmark this mysterious page I cannot find, which contained very valuable (because unusual) advice on endianness?

edit

This is not really a duplicate (even if it is very close), mainly because I do not want to detect endianness. The same code compiles on both architectures and produces the same result.

little endian

  • for u = 0x12345678 (stored as 0x78 0x56 0x34 0x12)
  • to_big_endian(u) = 0x78563412 (stored as 0x12 0x34 0x56 0x78)

big endian

  • for u = 0x12345678 (stored as 0x12 0x34 0x56 0x78)
  • to_big_endian(u) = 0x12345678 (stored as 0x12 0x34 0x56 0x78)

same code, same result... in memory.
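One way to check the in-memory claim is to copy the returned value into a byte array and inspect it. The sketch below restates to_big_endian() from above; bytes_in_memory() and check_memory_layout() are hypothetical helper names added for illustration. The assertion is expected to hold on both little- and big-endian hosts, because the most significant byte (0x12) is always written to the lowest address.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef union {
    uint8_t  b[4];
    uint32_t i;
} swap32_T;

/* Same logic as to_big_endian() in the question. */
uint32_t to_big_endian(uint32_t x) {
    swap32_T y;
    y.b[0] = (uint8_t)(x >> 24);   /* most significant byte first */
    y.b[1] = (uint8_t)(x >> 16);
    y.b[2] = (uint8_t)(x >> 8);
    y.b[3] = (uint8_t)x;
    return y.i;
}

/* Hypothetical helper: copy the in-memory bytes of v into out[0..3]. */
void bytes_in_memory(uint32_t v, uint8_t out[4]) {
    memcpy(out, &v, sizeof v);
}

/* Hypothetical self-check: the layout is big endian whatever the host is. */
void check_memory_layout(void) {
    uint8_t mem[4];
    bytes_in_memory(to_big_endian(0x12345678u), mem);
    assert(mem[0] == 0x12 && mem[1] == 0x34 &&
           mem[2] == 0x56 && mem[3] == 0x78);
}
```

Note that the numeric value of the returned uint32_t differs between hosts; only its byte layout is the same.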

4
Endianness is a compile-time thing, not a run-time thing: just test the appropriate preprocessor macro (e.g. _BIG_ENDIAN_) and use that to determine whether your function should be a no-op or not. (Paul R)
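A minimal sketch of the compile-time approach described in this comment, assuming a compiler that predefines __BYTE_ORDER__, __ORDER_BIG_ENDIAN__ and __ORDER_LITTLE_ENDIAN__ (GCC and Clang do; other toolchains use different macros). The names swap32(), host_to_be32() and check_host_to_be32() are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption: GCC/Clang-style predefined byte-order macros are available. */
#if !defined(__BYTE_ORDER__)
#error "__BYTE_ORDER__ is not defined; use your platform's endianness macro"
#endif

/* Unconditional byte reversal of a 32-bit value. */
uint32_t swap32(uint32_t x) {
    return (x >> 24) | ((x >> 8) & 0x0000FF00u) |
           ((x << 8) & 0x00FF0000u) | (x << 24);
}

/* Hypothetical name; plays the same role as htonl(). */
uint32_t host_to_be32(uint32_t x) {
#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
    return x;             /* host is already big endian: no-op */
#elif __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
    return swap32(x);     /* host is little endian: reverse the bytes */
#else
#error "unsupported byte order"
#endif
}

/* Hypothetical self-check: the result is laid out MSB-first in memory. */
void check_host_to_be32(void) {
    uint32_t be = host_to_be32(0x12345678u);
    uint8_t mem[4];
    memcpy(mem, &be, sizeof be);
    assert(mem[0] == 0x12 && mem[1] == 0x34 &&
           mem[2] == 0x56 && mem[3] == 0x78);
}
```

The branch is selected by the preprocessor, so no endianness test remains in the compiled function.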

4 Answers

3
votes

Here is my own version of the same idea (although the memory convention in this example is little endian instead of big endian):

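#include <stdint.h>

/* assumed definitions; BYTE and U32 were left undefined in the snippet */
typedef uint8_t  BYTE;
typedef uint32_t U32;

/* unoptimized version; solves endianness & alignment issues */
static U32 readLE32 (const BYTE* srcPtr)
{
    /* cast before shifting: a BYTE promotes to (signed) int, and shifting
       a value >= 0x80 left by 24 bits would overflow it */
    U32 value32 = srcPtr[0];
    value32 += ((U32)srcPtr[1] << 8);
    value32 += ((U32)srcPtr[2] << 16);
    value32 += ((U32)srcPtr[3] << 24);
    return value32;
}
static void writeLE32 (BYTE* dstPtr, U32 value32)
{
    dstPtr[0] = (BYTE)value32;
    dstPtr[1] = (BYTE)(value32 >> 8);
    dstPtr[2] = (BYTE)(value32 >> 16);
    dstPtr[3] = (BYTE)(value32 >> 24);
}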

Basically, what's missing in your function prototype to make the code a bit easier to read is a pointer to the source or destination memory.
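To illustrate why the pointer-based prototype helps, here is a small round-trip sketch that writes a value at an odd offset inside a byte buffer and reads it back. The functions read_le32() and write_le32() are hypothetical restatements of the pair above using <stdint.h> types, and check_unaligned_round_trip() is an illustrative name.

```c
#include <assert.h>
#include <stdint.h>

/* Read 4 bytes in little-endian order, one byte at a time. */
uint32_t read_le32(const uint8_t *src) {
    return (uint32_t)src[0]
         | ((uint32_t)src[1] << 8)
         | ((uint32_t)src[2] << 16)
         | ((uint32_t)src[3] << 24);
}

/* Write v as 4 bytes in little-endian order, one byte at a time. */
void write_le32(uint8_t *dst, uint32_t v) {
    dst[0] = (uint8_t)v;
    dst[1] = (uint8_t)(v >> 8);
    dst[2] = (uint8_t)(v >> 16);
    dst[3] = (uint8_t)(v >> 24);
}

/* Hypothetical self-check: the field starts at offset 1 of the buffer. */
void check_unaligned_round_trip(void) {
    uint8_t packet[7] = {0};
    write_le32(packet + 1, 0xDEADBEEFu);
    assert(packet[1] == 0xEF && packet[2] == 0xBE &&
           packet[3] == 0xAD && packet[4] == 0xDE);
    assert(read_le32(packet + 1) == 0xDEADBEEFu);
}
```

Because only single bytes are accessed, the code never performs a 32-bit load or store through a possibly misaligned pointer.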

2
votes

Depending on your intentions, this may or may not answer your question. However, if all you want is to convert various types to various endiannesses (including 64-bit types and little-endian conversions, which htonl obviously won't do), you may want to consider htobe32 and the related functions:

   uint16_t htobe16(uint16_t host_16bits);
   uint16_t htole16(uint16_t host_16bits);
   uint16_t be16toh(uint16_t big_endian_16bits);
   uint16_t le16toh(uint16_t little_endian_16bits);

   uint32_t htobe32(uint32_t host_32bits);
   uint32_t htole32(uint32_t host_32bits);
   uint32_t be32toh(uint32_t big_endian_32bits);
   uint32_t le32toh(uint32_t little_endian_32bits);

   uint64_t htobe64(uint64_t host_64bits);
   uint64_t htole64(uint64_t host_64bits);
   uint64_t be64toh(uint64_t big_endian_64bits);
   uint64_t le64toh(uint64_t little_endian_64bits);

These functions are technically non-standard, but they appear to be present on most Unices (declared in <endian.h> with glibc and in <sys/endian.h> on several BSDs).
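A short usage sketch, assuming a glibc-based system where these functions are declared in <endian.h>; on other systems the header name and the feature-test macro may differ. The helpers put_be32(), get_be32() and check_endian_h() are hypothetical names.

```c
#define _DEFAULT_SOURCE   /* assumption: glibc; exposes the htobe/htole family */
#include <assert.h>
#include <endian.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: store v at dst in big-endian byte order. */
void put_be32(uint8_t *dst, uint32_t v) {
    uint32_t be = htobe32(v);      /* host order -> big endian */
    memcpy(dst, &be, sizeof be);   /* memcpy avoids alignment problems */
}

/* Hypothetical helper: load a big-endian 32-bit value from src. */
uint32_t get_be32(const uint8_t *src) {
    uint32_t be;
    memcpy(&be, src, sizeof be);
    return be32toh(be);            /* big endian -> host order */
}

/* Hypothetical self-check for the helpers and the 64-bit variants. */
void check_endian_h(void) {
    uint8_t buf[4];
    put_be32(buf, 0x12345678u);
    assert(buf[0] == 0x12 && buf[3] == 0x78);
    assert(get_be32(buf) == 0x12345678u);
    assert(be64toh(htobe64(0x0102030405060708ull)) == 0x0102030405060708ull);
}
```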

It should also be said, however, as Paul R rightly points out in the comments, that there is no need for a runtime test of endianness. The endianness is a fixed feature of a given ABI, so it is always a constant at compile-time.

1
votes

Well ... That's certainly a workable solution, but I don't understand why you'd use a union. If you want an array of bytes, why not just have an array of bytes as an output pointer argument?

void uint32_to_big_endian(uint8_t *out, uint32_t x)
{
  out[0] = (x >> 24) & 0xff;
  out[1] = (x >> 16) & 0xff;
  out[2] = (x >> 8)  & 0xff;
  out[3] = x & 0xff;
}

Also, it's often better code-wise to shift first, and mask later. It calls for smaller mask literals, which is often better for the code generator.
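The difference in mask size is easier to see with a 64-bit value. In this sketch (uint64_to_big_endian(), top_byte_mask_first() and check_shift_then_mask() are hypothetical names), the shift-then-mask form only ever needs the literal 0xff, while the mask-then-shift form needs a full 64-bit literal for each byte.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 64-bit counterpart of uint32_to_big_endian(). */
void uint64_to_big_endian(uint8_t *out, uint64_t x) {
    for (int i = 0; i < 8; i++) {
        /* shift the wanted byte down to bits 0..7, then mask with 0xff */
        out[i] = (uint8_t)((x >> (56 - 8 * i)) & 0xff);
    }
}

/* Mask-then-shift form of the top byte, for comparison only. */
uint8_t top_byte_mask_first(uint64_t x) {
    return (uint8_t)((x & 0xFF00000000000000ull) >> 56);
}

/* Hypothetical self-check: both forms agree on the top byte. */
void check_shift_then_mask(void) {
    uint8_t out[8];
    uint64_to_big_endian(out, 0x0102030405060708ull);
    for (int i = 0; i < 8; i++) {
        assert(out[i] == i + 1);
    }
    assert(top_byte_mask_first(0x0102030405060708ull) == out[0]);
}
```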

0
votes

Well, here's my solution for a general signed/unsigned integer, independent of machine endianness and of any size capable of storing the data (you need a version for each type, but the algorithm is the same):

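#include <stddef.h>  /* size_t */

/* BYTE and AnyLargeEnoughInt are placeholders: define BYTE as an unsigned
   8-bit type and provide one copy of these functions per integer type. */

/* read n bytes, most significant byte first */
AnyLargeEnoughInt fromBE(const BYTE *p, size_t n)
{
    AnyLargeEnoughInt res = 0;
    while (n--) {
        res <<= 8;
        res |= *p++;
    } /* while */
    return res;
} /* fromBE */

/* write n bytes, most significant byte first */
void toBE(BYTE *p, size_t n, AnyLargeEnoughInt val)
{
    p += n;
    while (n--) {
        *--p = val & 0xff;
        val >>= 8;
    } /* while */
} /* toBE */

/* read n bytes, least significant byte first */
AnyLargeEnoughInt fromLE(const BYTE *p, size_t n)
{
    AnyLargeEnoughInt res = 0;
    p += n;
    while (n--) {
        res <<= 8;
        res |= *--p;
    } /* while */
    return res;
} /* fromLE */

/* write n bytes, least significant byte first */
void toLE(BYTE *p, size_t n, AnyLargeEnoughInt val)
{
    while (n--) {
        *p++ = val & 0xff;
        val >>= 8;
    } /* while */
} /* toLE */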
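As a usage sketch, the same loops handle fields whose width is not a power of two, such as a 3-byte length field. Here the placeholder type is assumed to be uint64_t (so n must not exceed 8), and from_be(), to_be() and check_24bit_field() are hypothetical names for this instantiation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read n bytes (n <= 8), most significant byte first. */
uint64_t from_be(const uint8_t *p, size_t n) {
    uint64_t res = 0;
    while (n--) {
        res = (res << 8) | *p++;
    }
    return res;
}

/* Write the low n bytes of val (n <= 8), most significant byte first. */
void to_be(uint8_t *p, size_t n, uint64_t val) {
    p += n;
    while (n--) {
        *--p = (uint8_t)(val & 0xff);
        val >>= 8;
    }
}

/* Hypothetical self-check: round trip through a 24-bit field. */
void check_24bit_field(void) {
    uint8_t field[3];
    to_be(field, sizeof field, 0x0A0B0Cu);
    assert(field[0] == 0x0A && field[1] == 0x0B && field[2] == 0x0C);
    assert(from_be(field, sizeof field) == 0x0A0B0Cu);
}
```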