I've recently been implementing a specialized parser for a slightly modified Abstract Syntax Notation. The specification says that integers are encoded as an array of octets which are to be interpreted as a binary two's-complement integer.

So, at first I thought the best way to deserialize this into an actual C++ integer would be to start with a value of 0 and then OR each octet into it, shifted to its position, like this:

uint64_t value = 0;
int shift = 0;
std::vector<uint8_t> octets = { /* some values */ };

for (auto it = octets.rbegin(); it != octets.rend(); ++shift, ++it)
{
  value |= uint64_t(*it) << (shift * 8);
}

This would leave me with a bit pattern stored in value, which I could then interpret as a signed (two's-complement) integer by casting it:

int64_t signed_value = static_cast<int64_t>(value);
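One detail worth making explicit: when the encoding has fewer than eight octets, the bit pattern needs to be sign-extended before it is reinterpreted, otherwise a single octet 0xFF decodes as 255 rather than -1. A minimal sketch of that step, assuming big-endian octets and between one and eight of them (the helper name `decode_pattern` is hypothetical, not from the specification):

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: builds the 64-bit two's-complement bit pattern from
// big-endian octets, sign-extending when fewer than eight octets are given.
// Assumes 1 <= octets.size() <= 8; anything else returns 0 here for brevity.
std::uint64_t decode_pattern(const std::vector<std::uint8_t>& octets)
{
    if (octets.empty() || octets.size() > 8) {
        return 0;
    }

    // If the top bit of the first (most significant) octet is set, the value
    // is negative, so start from all ones; otherwise start from all zeros.
    std::uint64_t value = (octets.front() & 0x80u) ? ~std::uint64_t{0} : 0;

    // Shift each octet in from the right, most significant first.
    for (std::uint8_t octet : octets) {
        value = (value << 8) | octet;
    }
    return value;
}
```

All arithmetic here is on unsigned types, so nothing in this step depends on how the platform represents signed integers.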

But it occurred to me that this is really relying on implementation-defined behavior. C++ doesn't guarantee that signed integers are represented as two's complement. So, to get the actual value of the encoded integer as a C++ int64_t, I'd need to actually calculate the summation of 2^N for each Nth bit in the bit pattern, taking into account the sign bit. This seems kind of silly when I know that casting should just work most of the time.
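For what it's worth, the bit-by-bit summation is not required to stay portable. A sketch of one common alternative, assuming the pattern is already held in a `uint64_t` (the name `to_signed` is hypothetical): convert only values that fit in `int64_t`, and handle the sign-bit-set case arithmetically so that no out-of-range unsigned-to-signed conversion ever happens.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: maps a 64-bit two's-complement bit pattern to the
// int64_t it denotes without an out-of-range unsigned-to-signed conversion.
std::int64_t to_signed(std::uint64_t pattern)
{
    const std::uint64_t max_positive =
        static_cast<std::uint64_t>(std::numeric_limits<std::int64_t>::max());

    if (pattern <= max_positive) {
        // The value fits in int64_t, so the conversion is exact.
        return static_cast<std::int64_t>(pattern);
    }

    // Sign bit set: the encoded value is pattern - 2^64.
    // ~pattern == 2^64 - 1 - pattern is below 2^63 here, so it converts
    // exactly, and -(~pattern) - 1 == pattern - 2^64.
    return -static_cast<std::int64_t>(~pattern) - 1;
}
```

Optimizing compilers commonly reduce this to the same code as a plain cast on two's-complement targets, though that is worth confirming for the toolchain in use. Since C++20 the standard requires two's complement and the plain `static_cast` is well defined.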

Is there a better solution here that would be both portable and efficient?

According to en.cppreference.com/w/cpp/types/integer, in C++11 you are guaranteed two's complement for the fixed-width signed integer typedefs. - BoBTFish
@BoBTFish, that's the greatest news I've heard all day... if it's true. But the C++11 draft standard says: "Types bool, char, char16_t, char32_t, wchar_t, and the signed and unsigned integer types are collectively called integral types. A synonym for integral type is integer type. The representations of integral types shall define values by use of a pure binary numeration system. [ Example: this International Standard permits 2's complement, 1's complement and signed magnitude representations for integral types. - end example ]" - Channel72
Yes, but I did have some vague memory of things changing for c++11, so that's why I went to look. I'll see what I can dig up in The Standard. The thing is, those typedefs aren't required to exist anyway. - BoBTFish
I found stackoverflow.com/a/5254075/1171191 but that seems to be related to c. Nothing similar I can find in The c++ Standard. - BoBTFish
AHA! Section 18.4.1 Header <cstdint> synopsis, paragraph 2 "The header defines all functions, types, and macros the same as 7.18 in the C standard." Not honestly sure if that includes the requirements on types, or just that the actual names of the types in the typedefs have to be the same as a c implementation on the same platform. (Edit: That's in N3337, first draft released after the actual 2011 Standard.) - BoBTFish

1 Answer

If your solution works, I think you can use a bit of metaprogramming to test whether your platform is one's complement or two's complement.

struct is_ones_complement {
    // In one's complement, -1 is all ones except the lowest bit, so (1 & -1) is 0.
    static const bool value = ((1 & -1) == 0);
};
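Note that the check above only separates one's complement from everything else, so a sign-magnitude platform would land in the same branch as two's complement. A small sketch that tells all three representations apart, using the low two bits of -1 (the names `signed_repr` and `int_repr` are hypothetical):

```cpp
// The low two bits of -1 differ in each representation:
//   two's complement  ...11  ->  (-1 & 3) == 3
//   one's complement  ...10  ->  (-1 & 3) == 2
//   sign-magnitude    ...01  ->  (-1 & 3) == 1
enum class signed_repr {
    sign_magnitude  = 1,
    ones_complement = 2,
    twos_complement = 3
};

// Evaluated at compile time for the platform's int.
constexpr signed_repr int_repr = static_cast<signed_repr>(-1 & 3);
```

This inspects `int`, as the original check does; it is an assumption that the platform's other signed types use the same representation.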

And then, you can write an inlinable conversion function:

// Returns int64_t, since the goal is the signed value rather than the raw bit pattern.
template<bool is_ones_complement>
int64_t convert_impl(const std::vector<uint8_t>& vec);

template<>
inline int64_t convert_impl<true>(const std::vector<uint8_t>& vec) {
    // Your specialization for 1's-complement platforms
}

template<>
inline int64_t convert_impl<false>(const std::vector<uint8_t>& vec) {
    // Your specialization for 2's-complement platforms
}

inline int64_t convert(const std::vector<uint8_t>& vec) {
    return convert_impl<is_ones_complement::value>(vec);
}

Untested, but it should work.