I've recently been implementing a specialized parser for a slightly modified Abstract Syntax Notation. The specification says that an integer is encoded as an array of octets, which is to be interpreted as a binary two's-complement integer.
So, at first I thought the best way to deserialize this into an actual C++ integer would be to start with a value of 0 and OR each octet into it, like so:
// Assemble the bit pattern: walk the octets from last (least significant)
// to first, ORing each one into place at an increasing byte offset.
// Note: shifting by 64 or more is undefined, so this assumes at most 8 octets.
uint64_t value = 0;
int shift = 0;
std::vector<uint8_t> octets = { /* some values */ };
for (auto it = octets.rbegin(); it != octets.rend(); ++shift, ++it)
{
    value |= uint64_t(*it) << (shift * 8);
}
This would leave me with a bit pattern stored in value, which I could then interpret as a signed (two's-complement) integer by casting it:
int64_t signed_value = static_cast<int64_t>(value);
But it occurred to me that this relies on implementation-defined behavior: converting an unsigned value that doesn't fit into a signed type is implementation-defined, and C++ doesn't guarantee that signed integers use a two's-complement representation. So, to get the actual value of the encoded integer as a C++ int64_t portably, I'd need to calculate the sum of 2^N for each set bit N in the pattern, handling the sign bit separately. That seems kind of silly when I know the cast should just work almost everywhere.
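Concretely, the portable fallback I have in mind is something along these lines (just a sketch, a bit tidier than a literal bit-by-bit summation; the to_signed name, the num_octets parameter, and the 8-octet limit are my own assumptions for illustration):

#include <cstddef>
#include <cstdint>

// Interpret an assembled 64-bit pattern as two's complement without ever
// casting an out-of-range unsigned value to a signed type.
std::int64_t to_signed(std::uint64_t value, std::size_t num_octets)
{
    // Sign-extend encodings shorter than 8 octets whose sign bit is set.
    if (num_octets > 0 && num_octets < 8 && ((value >> (num_octets * 8 - 1)) & 1))
        value |= ~std::uint64_t(0) << (num_octets * 8);

    if (value <= std::uint64_t(INT64_MAX))
        return static_cast<std::int64_t>(value);

    // Negative case: the pattern stands for value - 2^64. Since value > INT64_MAX,
    // ~value fits in int64_t, so this arithmetic never overflows.
    return -static_cast<std::int64_t>(~value) - 1;
}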
Is there a better solution here that would be both portable and efficient?
Comments:
In c++11 you are guaranteed 2's complement for the signed size-specific integer typedefs. - BoBTFish
The question is tagged c++11, so that's why I went to look. I'll see what I can dig up in the Standard. The thing is, those typedefs aren't required to exist anyway. - BoBTFish
That's in the C Standard. Nothing similar I can find in the C++ Standard. - BoBTFish
The <cstdint> synopsis, paragraph 2: "The header defines all functions, types, and macros the same as 7.18 in the C standard." Not honestly sure if that includes the requirements on the types, or just that the actual names of the types in the typedefs have to be the same as a C implementation on the same platform. (Edit: That's in N3337, the first draft released after the actual 2011 Standard.) - BoBTFish
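For what it's worth, the assumption the comments are circling around can at least be pinned down at build time. This is only an illustrative sketch (not quoted from the Standard or the comments above), and it documents the conversion behavior of one particular implementation rather than anything the pre-C++20 Standard promises:

#include <cstdint>

// If int64_t exists at all, check that the unsigned-to-signed conversion
// behaves the way a two's-complement interpretation would. The result of
// this conversion is implementation-defined, so this merely documents an
// assumption instead of relying on it silently.
static_assert(static_cast<std::int64_t>(~std::uint64_t(0)) == -1,
              "this implementation does not convert as two's complement");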