Suppose that we define:
short x = -1;
unsigned short y = (unsigned short) x;
According to the C99 standard:
Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type. (ISO/IEC 9899:1999 6.3.1.3/2)
So, assuming two bytes for short and a two's complement representation, the bit patterns of these two integers are:
x = 1111 1111 1111 1111 (value of -1),
y = 1111 1111 1111 1111 (value of 65535).
Since -1 is not in the value range of unsigned short, and the maximum value that can be represented in an unsigned short is 65535, 65536 is added to -1 to get 65535, which is in range. Thus, on a two's complement system, the bit pattern is unchanged by the cast from short to unsigned short, even though the represented value changes.
But, the standard also says that representations may be two's complement, one's complement, or sign and magnitude. "Which of these applies is implementation-defined,...." (ISO/IEC 9899:1999 6.2.6.2/2)
On a system using one's complement, x would be represented as 1111 1111 1111 1110 before the cast, and on a system using sign and magnitude representation, x would be represented as 1000 0000 0000 0001. Both of these bit patterns represent a value of -1, which is not in the value range of unsigned short, so 65536 would be added to -1 in each case to bring the value into range. After the cast, both of these bit patterns would be 1111 1111 1111 1111.
So, preservation of the bit pattern when casting from a signed type to its unsigned counterpart is implementation-dependent.
It seems like the ability to cast an int to unsigned int while preserving the bit pattern would be a handy tool for doing bit-shifting operations on negative numbers, and I have seen it advocated as a technique for just that. But this technique does not appear to be guaranteed by the standard to work.
Am I reading the standard correctly here, or am I misunderstanding something about the details of the conversion from signed to unsigned types? Are two's complement implementations prevalent enough that the assumption of bit-pattern preservation under casting from int to unsigned is reasonable? If not, is there a better way to preserve bit patterns under a conversion from int to unsigned int?
Edit
My original goal was to find a way to cast an int to unsigned int in such a way that the bit pattern is preserved. I was thinking that a cast from int to intN_t could help accomplish this:
unsigned short y = (unsigned short)(int16_t) x;
but of course this idea was wrong! At best this would only enforce two's complement representation before casting to unsigned, so that the final bit pattern would be two's complement. I am tempted to just delete the question, yet I am still interested in ways to cast from int to unsigned int that preserve bit patterns, and @Leushenko has provided a really neat solution to this problem using unions. But, I have changed the title of the question to reflect the original intention, and I have edited the closing questions.
int. Some methods to do this involve bit-shifting the int to the right. It was suggested to me that a cast to unsigned would make this possible, and that sounded too easy. So, here we are. My solution to the Hamming weight problem did not involve bit-shifting a negative int, but this seemed like an interesting corner to investigate. – ad absurdum