0
votes

I've searched through many sites and cannot seem to find anything relevant.

I would like to take the individual bytes of each of the default data types, such as short, unsigned short, int, unsigned int, float, and double, and store each individual byte (its binary content) into an index of an unsigned char array. How can this be achieved?

For example:

int main() {
    short sVal = 1;
    unsigned short usVal = 2;
    int iVal = 3;
    unsigned int uiVal = 4;
    float fVal = 5.0f;
    double dVal = 6.0;

    const unsigned int uiLengthOfShort  = sizeof(short);
    const unsigned int uiLengthOfUShort = sizeof(unsigned short);
    const unsigned int uiLengthOfInt    = sizeof(int);
    const unsigned int uiLengthOfUInt   = sizeof(unsigned int);
    const unsigned int uiLengthOfFloat  = sizeof(float);
    const unsigned int uiLengthOfDouble = sizeof(double);

    unsigned char ucShort[uiLengthOfShort];
    unsigned char ucUShort[uiLengthOfUShort];
    unsigned char ucInt[uiLengthOfInt];
    unsigned char ucUInt[uiLengthOfUInt];
    unsigned char ucFloat[uiLengthOfFloat];
    unsigned char ucDouble[uiLengthOfDouble];

    // Above I declared a variable val for each data type to work with
    // Next I created a const unsigned int of each type's size.
    // Then I created unsigned char[] using each data types size respectively
    // Now I would like to take each individual byte of the above val's
    // and store them into the indexed location of each unsigned char array.

    // For Example: - I'll not use int here since the int is 
    // machine and OS dependent. 
    // I will use a data type that is common across almost all machines. 
    // Here I will use the short as my example

    // We know that a short is 2 bytes, i.e. 16 bits.
    // I would like to take the 1st byte of this short
    // (the first 8-bit sequence) and store it in the first index of my unsigned char[],
    // then take the 2nd byte of this short
    // (the second 8-bit sequence) and store it in the second index of my unsigned char[].

    // How would this be achieved for any of the data types?

    // A Short in memory is 2 bytes here is a bit representation of an 
    // arbitrary short in memory { 0101 1101, 0011 1010 }
    // I would like ucShort[0] = sVal's { 0101 1101 } &
    //              ucShort[1] = sVal's { 0011 1010 } 

    // ucShort[0] = sVal's first byte info  (first 8-bit sequence)
    // ucShort[1] = sVal's second byte info (second 8-bit sequence)

    // ... and so on for each data type.

    return 0;
}
4
How do you know there aren't any systems where short is 32 bits?user253751
Read up on unions. Not sure what you mean by 'individual bytes of each default data types'.user1593881
I said that the short is 2 bytes across most machines. This does not imply all, but it is more likely that a short is 2 bytes than an int, which could be 16 bits (2 bytes, obsolete), 32 bits (4 bytes, common on x86), or 64 bits (8 bytes, common on 64-bit machines).Francis Cugler
@Raw N - yes, I have read up on unions, but will this also preserve endian byte order, or would that have to be converted across different platforms? Also, I would like to make a function that would take a variable of any data type, perform this action, and return an unsigned char*Francis Cugler
Alignment and order depend on the system.user1593881

4 Answers

1
votes

Ok, so first, don't do that if you can avoid it. It's dangerous and can be extremely architecture-dependent.

The commenters above are correct: a union is the safest way to do it. You still have the endianness problem, yes, but at least you don't have the stack-alignment problem (I assume this is for network code, so stack alignment is another potential architecture problem).

This is what I've found to be the most straight-forward way to do this:

uint32_t example_int;
char array[4];

//No endian switch
array[0] = ((char*) &example_int)[0];
array[1] = ((char*) &example_int)[1];
array[2] = ((char*) &example_int)[2];
array[3] = ((char*) &example_int)[3];

//Endian switch
array[0] = ((char*) &example_int)[3];
array[1] = ((char*) &example_int)[2];
array[2] = ((char*) &example_int)[1];
array[3] = ((char*) &example_int)[0];

If you're trying to write cross-architecture code, you will need to deal with endianness one way or another. My suggestion is to write a short endian test and build functions that "pack" and "unpack" byte arrays based on the method above. Note that to "unpack" a byte array, you simply reverse the assignment statements.

1
votes

The simplest correct way is:

// static_assert(sizeof ucShort == sizeof sVal);

memcpy( &ucShort, &sVal, sizeof ucShort);

The claims in your comments are not correct: all types other than the character types have machine-dependent sizes.

0
votes

With the help of Raw N providing me a website, I searched for byte manipulation and found this thread - http://www.cplusplus.com/forum/articles/12/ - which presents a solution similar to what I am looking for; however, I would have to repeat the process for every default data type.

0
votes

After doing some testing, this is what I have come up with so far. It is dependent on the machine architecture, but the concept is the same on other machines.

typedef struct packed_2bytes {
    unsigned char c0;
    unsigned char c1;
} packed_2bytes;

typedef struct packed_4bytes {
    unsigned char c0;
    unsigned char c1;
    unsigned char c2;
    unsigned char c3;
} packed_4bytes;

typedef struct packed_8bytes {
    unsigned char c0;
    unsigned char c1;
    unsigned char c2;
    unsigned char c3;
    unsigned char c4;
    unsigned char c5;
    unsigned char c6;
    unsigned char c7;
} packed_8bytes;

typedef union {
    short s;
    packed_2bytes bytes;
} packed_short;

typedef union {
    unsigned short us;
    packed_2bytes bytes;
} packed_ushort;

typedef union { // 32bit machine, os, compiler only
    int i;
    packed_4bytes bytes;
} packed_int;

typedef union { // 32 bit machine, os, compiler only
    unsigned int ui;
    packed_4bytes bytes;
} packed_uint;

typedef union { 
    float f;
    packed_4bytes bytes;
} packed_float;

typedef union {
    double d;
    packed_8bytes bytes;
} packed_double;

There is no implementation here, only the declarations and definitions of these types. I do think the user has to know ahead of time which endianness is in play, just as they have to know the machine's sizes for each of the default types. I am not sure whether signed int would be a problem due to one's complement, two's complement, or sign-and-magnitude representations, but it is something to consider.