168
votes

Suppose I have the number 'numb'=1025 [00000000 00000000 00000100 00000001] represented:

On Little-Endian Machine:

00000001 00000100 00000000 00000000

On Big-Endian Machine:

00000000 00000000 00000100 00000001

Now, if I apply Left Shift on 10 bits (i.e.: numb <<= 10), I should have:

[A] On Little-Endian Machine:

As I noticed in GDB, Little Endian does the Left Shift in 3 steps: [I have shown '3' Steps to better understand the processing only]

  1. Treat the no. in Big-Endian Convention:

    00000000        00000000        00000100    00000001
    
  2. Apply Left-Shift:

    00000000        00010000        00000100        00000000
    
  3. Represent the Result again in Little-Endian:

    00000000        00000100        00010000        00000000 
    

[B]. On Big-Endian Machine:

00000000        00010000        00000100        00000000

My Question is:

If I directly apply a Left Shift on the Little Endian Convention, it should give:

numb:

00000001 00000100 00000000 00000000

numb << 10:

00010000 00000000 00000000 00000000

But actually, it gives:

00000000        00000100        00010000        00000000 

To achieve the second result only, I have shown three hypothetical steps above.

Please explain me why the above two results are different: The actual outcome of numb << 10 is different than the expected outcome.

5

5 Answers

212
votes

Endianness is the way values are stored in memory. When loaded into the processor, regardless of endianness, the bit shift instruction is operating on the value in the processor's register. Therefore, loading from memory to processor is the equivalent of converting to big endian, the shifting operation comes next and then the new value is stored back in memory, which is where the little endian byte order comes into effect again.

Update, thanks to @jww: On PowerPC the vector shifts and rotates are endian sensitive. You can have a value in a vector register and a shift will produce different results on little-endian and big-endian.

62
votes

No, bitshift, like any other part of C, is defined in terms of values, not representations. Left-shift by 1 is mutliplication by 2, right-shift is division. (As always when using bitwise operations, beware of signedness. Everything is most well-defined for unsigned integral types.)

6
votes

Whichever shift instruction shifts out the higher-order bits first is considered the left shift. Whichever shift instruction shifts out the lower-order bits first is considered the right shift. In that sense, the behavior of >> and << for unsigned numbers will not depend on endianness.

4
votes

Computers don't write numbers down the way we do. The value simply shifts. If you insist on looking at it byte-by-byte (even though that's not how the computer does it), you could say that on a little-endian machine, the first byte shifts left, the excess bits go into the second byte, and so on.

(By the way, little-endian makes more sense if you write the bytes vertically rather than horizontally, with higher addresses on top. Which happens to be how memory map diagrams are commonly drawn.)

3
votes

Though the accepted answer points out that endianess is a concept from the memory view. But I don't think that answer the question directly.

Some answers tell me that bitwise operations don't depend on endianess, and the processor may represent the bytes in any other way. Anyway, it's talking about that endianess gets abstracted.

But when we do some bitwise calculations on the paper for example, don't need to state the endianess in the first place? Most times we choose an endianess implicitly.

For example, assume we have a line of code like this

0x1F & 0xEF

How would you calculate the result by hand, on a paper?

  MSB   0001 1111  LSB
        1110 1111
result: 0000 1111

So here we use a Big Endian format to do the calculation. You can also use Little Endian to calculate and get the same result.

Btw, when we write numbers in code, I think it's like a Big Endian format. 123456 or 0x1F, most significant numbers starts from the left.

Again, as soon as we write some a binary format of a value on the paper, I think we've already chosen an Endianess and we are viewing the value as we see it from the memory.

So back to the question, an shift operation << should be thought as shifting from LSB(least significant byte) to MSB(most significant byte).

Then as for the example in the question:

numb=1025

Little Endian

LSB 00000001 00000100 00000000 00000000 MSB

So << 10 would be 10bit shifting from LSB to MSB.


Comparison and << 10 operations for Little Endian format step by step:

MSB                                        LSB
    00000000  00000000  00000100  00000001  numb(1025)
    00000000  00010000  00000100  00000000  << 10

LSB                                        MSB
    00000000  00000100  00010000  00000000 numb(1025) << 10, and put in a Little Endian Format

LSB                                        MSB
    00000001  00000100  00000000  00000000 numb(1205) in Little Endian format
    00000010  00001000  00000000  00000000 << 1 
    00000100  00010000  00000000  00000000 << 2 
    00001000  00100000  00000000  00000000 << 3 
    00010000  01000000  00000000  00000000 << 4
    00100000  10000000  00000000  00000000 << 5
    01000000  00000000  00000001  00000000 << 6
    10000000  00000000  00000010  00000000 << 7
    00000000  00000001  00000100  00000000 << 8
    00000000  00000010  00001000  00000000 << 9
    00000000  00000100  00010000  00000000 << 10 (check this final result!)

Wow! I get the expected result as the OP described!

The problems that the OP didn't get the expected result are that:

  1. It seems that he didn't shift from LSB to MSB.

  2. When shifting bits in Little Endian format, you should realize(thank god I realize it) that:

LSB 10000000 00000000 MSB << 1 is
LSB 00000000 00000001 MSB, not LSB 01000000 00000000 MSB

Because for each individual 8bits, we are actually writing it in a MSB 00000000 LSB Big Endian format.

So it's like

LSB[ (MSB 10000000 LSB) (MSB 00000000 LSB) ]MSB


To sum up:

  1. Though bitwise operations is said to be abstracted away blablablabla..., when we calculate bitwise operations by hand, we still need to know what endianess we are using as we write down the binary format on the paper. Also we need to make sure all the operators use the same endianess.

  2. The OP didn't get the expected result is because he did the shifting wrong.