2
votes

I understand that I am assigning a signed int a value larger than it can hold, and that I should be using %d for signed and %u for unsigned. Similarly, I should not be assigning a negative value to an unsigned int. But if I make such assignments and use printf as below, I get the results shown below.

My understanding is that in each case the number is converted to its two's complement binary representation, which is the same for -1 and 4294967295. That is why %u for the signed int prints 4294967295, ignoring the negative leftmost bit. When %d is used for the signed int, it treats the leftmost bit as a negative flag and prints -1. Similarly, %u for the unsigned int prints the unsigned value, but %d causes it to be treated as signed and thus prints -1. Is that correct?

signed int si = 4294967295;
unsigned int ui = 4294967295;

printf("si = u=%u d=%d\n", si, si);
printf("ui = u=%u d=%d\n", ui, ui);

Output:

si = u=4294967295 d=-1
ui = u=4294967295 d=-1

signed int si = -1;
unsigned int ui = -1;

printf("si = u=%u d=%d\n", si, si);
printf("ui = u=%u d=%d\n", ui, ui);

Output:

si = u=4294967295 d=-1
ui = u=4294967295 d=-1
Note: as my answer to this question explains, -1 converted to unsigned will always be the maximum unsigned value for that type ... so assigning -1 to an unsigned value is always well-defined behavior. – Shafik Yaghmour
But just re-interpreting a negative integral value as an unsigned integral value is not well-defined, just as re-interpreting a too-big unsigned integral value as a signed integral value is not. – Deduplicator
You might find std::numeric_limits useful. Changing from uint32_t to uint64_t is much more readable: std::numeric_limits<int32_t>::max() becomes std::numeric_limits<int64_t>::max(), etc. – 2785528

2 Answers

4
votes

That is why %u for the signed int prints 4294967295, ignoring the negative leftmost bit. When %d is used for the signed int, it treats the leftmost bit as a negative flag and prints -1.

In the case of unsigned, the "leftmost" or most significant bit is not ignored, and is not negative; rather it has a place value of 2^31.

In the negative case, the sign bit is not a flag; instead it is a bit with a place value of -2^31.

In both cases the value of the integer is equal to the sum of the place values of all the binary digits (bits) set to 1.

The encoding of signed values in this way is known as two's complement. It is not the only possible encoding; what you described is known as sign and magnitude for example, and one's complement is another possibility. However, these alternative encodings are seldom encountered in practice, not least because two's complement is how arithmetic works on modern hardware in all but perhaps the most arcane architectures.
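To make the place-value arithmetic concrete, here is a small sketch (assuming the usual 32-bit two's complement int32_t) that sums the place values of a word with all 32 bits set:

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    int32_t si = -1;                    /* all 32 bits set in two's complement */

    /* Sum of place values: -2^31 for the sign bit, plus +2^i for each
       of the remaining 31 bits, which are all 1 here. */
    int64_t sum = -(INT64_C(1) << 31);
    for (int i = 0; i < 31; i++)
        sum += INT64_C(1) << i;

    printf("sum of place values = %lld, si = %d\n", (long long)sum, si);
    /* Both are -1, because -2^31 + (2^31 - 1) == -1. */
    return 0;
}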

1
votes

There are a few things going on here. Let's start by saying that using an incorrect format specifier with printf is undefined behavior, which means the results of your program are unpredictable; what actually happens will depend on many factors, including your compiler, architecture, optimization level, etc.

For signed/unsigned conversions, the behavior is defined by the respective standards. Both C and C++ make it implementation-defined behavior to convert a value that is larger than can be stored into a signed integer type. From the C++ draft standard:

If the destination type is signed, the value is unchanged if it can be represented in the destination type (and bit-field width); otherwise, the value is implementation-defined.

For example, gcc chooses to use the same convention as for unsigned types:

For conversion to a type of width N, the value is reduced modulo 2^N to be within range of the type; no signal is raised.
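The following sketch relies on that implementation-defined choice, so the result shown in the comment is only what gcc documents for a 32-bit int; other implementations may produce a different value:

#include <stdio.h>

int main(void)
{
    /* 4294967295 does not fit in a 32-bit signed int; gcc's documented,
       implementation-defined choice is to reduce it modulo 2^32, giving -1.
       Other implementations are allowed to do something else. */
    signed int si = 4294967295u;
    printf("si = %d\n", si);   /* -1 on gcc with a 32-bit int */
    return 0;
}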

When you assign -1 to an unsigned value, in both C and C++ the result will always be the maximum unsigned value of the type. From the draft C++ standard:

If the destination type is unsigned, the resulting value is the least unsigned integer congruent to the source integer (modulo 2^n where n is the number of bits used to represent the unsigned type). [ Note: In a two’s complement representation, this conversion is conceptual and there is no change in the bit pattern (if there is no truncation). —end note ]

The wording from C99 is easier to digest:

Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.

So we have the following:

-1 + (UNSIGNED_MAX + 1)

the result of which is UNSIGNED_MAX
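This is easy to check against UINT_MAX from <limits.h>; a minimal sketch, well-defined on any conforming implementation:

#include <stdio.h>
#include <limits.h>

int main(void)
{
    unsigned int ui = -1;   /* well-defined: wraps to the maximum unsigned value */
    printf("ui       = %u\n", ui);
    printf("UINT_MAX = %u\n", UINT_MAX);   /* prints the same value */
    return 0;
}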

As for printf and incorrect format specifiers, we can see from the draft C99 standard section 7.19.6.1 The fprintf function, which says:

If a conversion specification is invalid, the behavior is undefined.248) If any argument is not the correct type for the corresponding conversion specification, the behavior is undefined.

fprintf covers printf with respect to format specifiers, and C++ falls back on the C standard with respect to printf.
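So to stay within defined behavior, keep each argument's type matched to its conversion specifier, and cast explicitly when you want to see the "other" interpretation. A minimal sketch:

#include <stdio.h>

int main(void)
{
    signed int   si = -1;
    unsigned int ui = 4294967295u;

    printf("si = d=%d u=%u\n", si, (unsigned int)si);  /* conversion to unsigned is well-defined */
    printf("ui = u=%u d=%d\n", ui, (int)ui);           /* (int)ui is implementation-defined here */
    return 0;
}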