Deciphering unsigned char*

Question

I have a process that listens to a UDP multicast stream and reads in the data as an unsigned char*.

I have a specification that indicates fields within this unsigned char*.

Fields are defined in the specification with a type and size.

Types are: uInt32, uInt64, unsigned int, and single byte string.

For the single byte string I can merely access the offset of the field in the unsigned char* and cast to a char, such as:

char character = (char)(data[1]);

For a single-byte uint32 I've been doing the following, which also seems to work:

uint32_t integer =  (uint32_t)(data[20]);

However, for multiple byte conversions I seem to be stuck.

How would I convert several bytes in a row (substring of data) to its corresponding datatype?

Also, is it safe to wrap data in a string (for use of substring functionality)? I am worried about losing information, since I'd have to cast unsigned char* to char*, like:

std::string wrapper((char*)(data),length); //Is this safe?

I tried something like this:

std::string wrapper((char*)(data),length); //Is this safe?
uint32_t integer = (uint32_t)(wrapper.substr(20,4).c_str()); //4 byte int

But it doesn't work.

Thoughts?

Update

I've tried the suggested bit shift:

void function(const unsigned char* data, size_t data_len)
{
    //From specification: Field type: uInt32 Byte Length: 4
    //All integer fields are big endian.
    uint32_t integer = (data[0] << 24) | (data[1] << 16) | (data[2] << 8) | (data[3]);
}

Sadly, this gives me garbage (the same number on every call; the function is invoked from a callback).

instead of wrapping in std::string you may find your data is easier to manipulate as std::vector<unsigned char> ... alternately if you want to wrap in a string std::basic_string<unsigned char> would probably be preferable. — AJG85
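
As a rough sketch of the std::vector<unsigned char> suggestion in the comment above: copying the buffer into a vector keeps every byte as-is (including zero bytes and values above 127) and needs no cast to char*. The helper names wrap and slice are made up for illustration and are not part of any library.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: copies the raw bytes into a vector.
// No unsigned char* -> char* cast is involved.
std::vector<unsigned char> wrap(const unsigned char *data, std::size_t length)
{
    return std::vector<unsigned char>(data, data + length);
}

// Hypothetical helper: returns `count` bytes starting at `offset`,
// or an empty vector if the requested range does not fit.
std::vector<unsigned char> slice(const std::vector<unsigned char> &bytes,
                                 std::size_t offset, std::size_t count)
{
    if (offset > bytes.size() || count > bytes.size() - offset)
        return {};
    auto first = bytes.begin() + static_cast<std::ptrdiff_t>(offset);
    return std::vector<unsigned char>(first, first + static_cast<std::ptrdiff_t>(count));
}
```

A slice is still just a run of bytes; turning it into an integer requires combining the bytes explicitly rather than casting the container or a pointer to it.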

unwind · Accepted Answer · 2011-02-17T16:21:42

I think you should be very explicit, and not just do "clever" tricks with casts and pointers. Instead, write a function like this:

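#include <stdint.h>

/* Reads a big-endian uint32_t and advances *data past the 4 bytes consumed. */
uint32_t read_uint32_t(const unsigned char **data)
{
  const unsigned char *get = *data;
  *data += 4;
  /* Widen each byte before shifting so the shift is not done in signed int. */
  return ((uint32_t)get[0] << 24) | ((uint32_t)get[1] << 16) |
         ((uint32_t)get[2] << 8) | (uint32_t)get[3];
}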

This extracts a single uint32_t value from a buffer of unsigned char, and increases the buffer pointer to point at the next byte of data in the buffer.
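
To illustrate how the advancing pointer might be used, here is a minimal sketch that reads several fields in a row. The Header layout (two uInt32 fields followed by a one-byte character) and the names read_u32_be and parse_header are assumptions made up for this example; the real offsets have to come from your specification.

```cpp
#include <cstddef>
#include <cstdint>

// Same idea as read_uint32_t above, written with explicit widening casts.
static std::uint32_t read_u32_be(const unsigned char **cursor)
{
    const unsigned char *p = *cursor;
    *cursor += 4;  // step past the four bytes just consumed
    return (static_cast<std::uint32_t>(p[0]) << 24) |
           (static_cast<std::uint32_t>(p[1]) << 16) |
           (static_cast<std::uint32_t>(p[2]) << 8)  |
            static_cast<std::uint32_t>(p[3]);
}

// Hypothetical layout, for illustration only: uInt32, uInt32, 1-byte char.
struct Header {
    std::uint32_t sequence;
    std::uint32_t count;
    char          flag;
};

// Returns false if the buffer is too short to hold the 9-byte header.
bool parse_header(const unsigned char *data, std::size_t data_len, Header *out)
{
    if (data_len < 9)
        return false;
    const unsigned char *cursor = data;
    out->sequence = read_u32_be(&cursor);  // bytes 0..3
    out->count    = read_u32_be(&cursor);  // bytes 4..7
    out->flag     = static_cast<char>(*cursor);  // byte 8
    return true;
}
```

Checking the length once up front keeps the reader functions simple; a stricter design could pass the remaining length to each read instead.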

This assumes big-endian data. You need a well-defined idea of the buffer's byte order to interpret it correctly; the same four bytes decode to a different value if read as little-endian.
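
To make the byte-order point concrete, the sketch below decodes the same bytes both ways and extends the approach to an 8-byte field such as the uInt64 mentioned in the question. The function names load_u32_be, load_u32_le and load_u64_be are hypothetical, chosen only for this example.

```cpp
#include <cstdint>

// Big-endian: the first byte in the buffer is the most significant.
std::uint32_t load_u32_be(const unsigned char *p)
{
    return (static_cast<std::uint32_t>(p[0]) << 24) |
           (static_cast<std::uint32_t>(p[1]) << 16) |
           (static_cast<std::uint32_t>(p[2]) << 8)  |
            static_cast<std::uint32_t>(p[3]);
}

// Little-endian: the first byte in the buffer is the least significant.
std::uint32_t load_u32_le(const unsigned char *p)
{
    return (static_cast<std::uint32_t>(p[3]) << 24) |
           (static_cast<std::uint32_t>(p[2]) << 16) |
           (static_cast<std::uint32_t>(p[1]) << 8)  |
            static_cast<std::uint32_t>(p[0]);
}

// Big-endian 8-byte field: shift the accumulator left and append each byte.
std::uint64_t load_u64_be(const unsigned char *p)
{
    std::uint64_t value = 0;
    for (int i = 0; i < 8; ++i)
        value = (value << 8) | p[i];
    return value;
}
```

For the bytes 0x12 0x34 0x56 0x78, load_u32_be yields 0x12345678 while load_u32_le yields 0x78563412, which is why the specification's byte order has to be known before decoding.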
