0
votes

I've searched through many sites and cannot seem to find anything relevant.

I would like to take the individual bytes of each of the default data types, such as short, unsigned short, int, unsigned int, float, and double, and store each individual byte (its binary content) into an index of an unsigned char array. How can this be achieved?

For example:

int main() {
    short sVal = 1;
    unsigned short usVal = 2;
    int iVal = 3;
    unsigned int uiVal = 4;
    float fVal = 5.0f;
    double dVal = 6.0;

    const unsigned int uiLengthOfShort  = sizeof(short);
    const unsigned int uiLengthOfUShort = sizeof(unsigned short);
    const unsigned int uiLengthOfInt    = sizeof(int);
    const unsigned int uiLengthOfUInt   = sizeof(unsigned int);
    const unsigned int uiLengthOfFloat  = sizeof(float);
    const unsigned int uiLengthOfDouble = sizeof(double);

    unsigned char ucShort[uiLengthOfShort];
    unsigned char ucUShort[uiLengthOfUShort];
    unsigned char ucInt[uiLengthOfInt];
    unsigned char ucUInt[uiLengthOfUInt];
    unsigned char ucFloat[uiLengthOfFloat];
    unsigned char ucDouble[uiLengthOfDouble];

    // Above I declared a variable val for each data type to work with
    // Next I created a const unsigned int of each type's size.
    // Then I created unsigned char[] using each data types size respectively
    // Now I would like to take each individual byte of the above val's
    // and store them into the indexed location of each unsigned char array.

    // For Example: - I'll not use int here since the int is 
    // machine and OS dependent. 
    // I will use a data type that is common across almost all machines. 
    // Here I will use the short as my example

    // We know that a short is 2 bytes, i.e. 16 bits.
    // I would like to take the 1st byte of this short
    // (the first 8-bit sequence) and store it in the first index of my unsigned char[],
    // then take the 2nd byte of this short
    // (the second 8-bit sequence) and store it in the second index of my unsigned char[].

    // How would this be achieved for any of the data types?

    // A Short in memory is 2 bytes here is a bit representation of an 
    // arbitrary short in memory { 0101 1101, 0011 1010 }
    // I would like ucShort[0] = sVal's { 0101 1101 } &
    //              ucShort[1] = sVal's { 0011 1010 } 

    // ucShort[0] = sVal's first byte info  (first 8-bit sequence)
    // ucShort[1] = sVal's second byte info (second 8-bit sequence)

    // ... and so on for each data type.

    return 0;
}
4
How do you know there aren't any systems where short is 32 bits?user253751
Read up on unions. Not sure what you mean by 'individual bytes of each default data types'.user1593881
I said that the short is 2 bytes across most machines. This does not imply all, but it is more likely that a short is 2 bytes than an int, which could be 16 bits (2 bytes, obsolete), 32 bits (4 bytes, common on x86), or 64 bits (8 bytes, common on 64-bit machines).Francis Cugler
@Raw N - yes, I have read up on unions, but will this also preserve endian byte order, or would that have to be converted across different platforms? Also, I would like to make a function that would take a variable of any data type, perform this action, and return an unsigned char*Francis Cugler
Alignment and order depend on the system.user1593881

4 Answers

1
votes

Ok, so first, don't do that if you can avoid it. It's dangerous and can be extremely architecture-dependent.

The commenters above are correct: a union is the safest way to do it. You still have the endianness problem, yes, but at least you don't have the stack-alignment problem (I assume this is for network code, so stack alignment is another potential architecture problem).

This is what I've found to be the most straight-forward way to do this:

uint32_t example_int;
char array[4];

//No endian switch
array[0] = ((char*) &example_int)[0];
array[1] = ((char*) &example_int)[1];
array[2] = ((char*) &example_int)[2];
array[3] = ((char*) &example_int)[3];

//Endian switch
array[0] = ((char*) &example_int)[3];
array[1] = ((char*) &example_int)[2];
array[2] = ((char*) &example_int)[1];
array[3] = ((char*) &example_int)[0];

If you're trying to write cross-architecture code, you will need to deal with endianness one way or another. My suggestion is to write a short endian test and build functions that "pack" and "unpack" byte arrays based on the method above. Note that to "unpack" a byte array, you simply reverse the assignment statements.

1
votes

The simplest correct way is:

// static_assert(sizeof ucShort == sizeof sVal);

memcpy( &ucShort, &sVal, sizeof ucShort);

The claims in your comments are not correct: all types other than the character types have machine-dependent sizes.

0
votes

With the help of Raw N providing me a website, I searched for byte manipulation and found this thread - http://www.cplusplus.com/forum/articles/12/ - which presents a solution similar to what I am looking for; however, I would have to repeat the process for every default data type.

0
votes

After doing some testing, this is what I have come up with so far. It is dependent on the machine architecture, but the concept is the same on other machines.

typedef struct packed_2bytes {
    unsigned char c0;
    unsigned char c1;
} packed_2bytes;

typedef struct packed_4bytes {
    unsigned char c0;
    unsigned char c1;
    unsigned char c2;
    unsigned char c3;
} packed_4bytes;

typedef struct packed_8bytes {
    unsigned char c0;
    unsigned char c1;
    unsigned char c2;
    unsigned char c3;
    unsigned char c4;
    unsigned char c5;
    unsigned char c6;
    unsigned char c7;
} packed_8bytes;

typedef union {
    short s;
    packed_2bytes bytes;
} packed_short;

typedef union {
    unsigned short us;
    packed_2bytes bytes;
} packed_ushort;

typedef union { // 32bit machine, os, compiler only
    int i;
    packed_4bytes bytes;
} packed_int;

typedef union { // 32 bit machine, os, compiler only
    unsigned int ui;
    packed_4bytes bytes;
} packed_uint;

typedef union { 
    float f;
    packed_4bytes bytes;
} packed_float;

typedef union {
    double d;
    packed_8bytes bytes;
} packed_double;

There is no implementation here, only the declarations and definitions of these types. I do think the user has to know ahead of time which endianness is in play, just as they have to know the machine's sizes for each of the default types. I am not sure whether signed int would be a problem due to one's complement, two's complement, or sign-and-magnitude representations, but it is something to consider.