27
votes

I have a question (or more likely a bug report) about bit shifting behavior in Delphi (tested in Borland Delphi 7).

Target: perform an "Arithmetic" bitwise shift right with any number.

This means that sign-bit must be extended – binary number will be filled from the left with 1 instead of 0 if the most significant bit of a number was set.

So, number "-1" after an arithmetic shift right must stay the same number (all bits = 1), but with "logical shift" (which always fill the number with zeroes) must give a maximal positive integer (maximal positive signed integer, to be correct)

I tested it only on 32-bit system (Windows); moreover, I need it to work explicitly with 32-bit integers.

Looks like there is an internal bug in Delphi with "shr" when the source number is stored in a variable.

My example code:

program bug;

{$APPTYPE CONSOLE}

var
I:Integer;
C:Cardinal;

begin
I := -1;  // we’ll need that later
C := $FFFFFFFF;

(It’s only the beginning). Next, let’s try some "shr"s:

Writeln('0) ', -1 shr 1 );
Writeln('1) ', $FFFFFFFF shr 1 );

"-1" is a signed equivalent to "$FFFFFFFF". Seems that "shr" behavior (arithmetic or logical) is based on the fact whether the source number is signed or is not (integer or cardinal).

Output is:

0) -1
1) 2147483647

Quite correct. Then I need to try to manually cast these numbers to integers or cardinals:

Writeln('2) ', Integer(-1) shr 1 );
Writeln('3) ', Integer($FFFFFFFF) shr 1 );
Writeln('4) ', Cardinal(-1) shr 1 );
Writeln('5) ', Cardinal($FFFFFFFF) shr 1 );

Result:

2) -1
3) -1
4) 2147483647
5) 2147483647

Still correct. So, I think that I can cast anything to "integer" if I need an arithmetic shift; or cast to "cardinal" when I want a logical shift. But wait! Example with variables (declared above) :

Writeln('6) ', I shr 1 );
Writeln('7) ', C shr 1 );

Suddenly:

6) 2147483647
7) 2147483647

INCORRECT. My "I" was a signed integer, and I expected the arithmetic shift! So, maybe casting could help?

Writeln('8) ', Integer(I) shr 1 );
Writeln('9) ', Cardinal(I) shr 1 );
Writeln('A) ', Integer(C) shr 1 );
Writeln('B) ', Cardinal(C) shr 1 );

No, still the same...

8) 2147483647
9) 2147483647
A) 2147483647
B) 2147483647

Things are even worse if I’ll try to make a function "a shr b" and use it instead:

// Simple shift right with signed integers
function shrI(a,b:Integer):Integer;
begin
Result := a shr b;
end;

// Simple shift right with unsigned integers
function shrC(a,b:Cardinal):Cardinal;
begin
Result := a shr b;
end;

Now:

Writeln('C) ', shrI(-1,1) );
Writeln('D) ', shrC($FFFFFFFF,1) );

– It stopped working even with constant expressions: (it makes sense, because numbers are again stored in variables inside a function)

C) 2147483647
D) 2147483647

Since I need to make my correct arithmetic shift anyway, I wrote these formulas to do this (shift "a" right by "b" bits). First is the logical shift:

(a shr b) and ((1 shl (32-b))-1)

I just need to bitwise-and the result with "32 - b" ones (from the right) to clear "b" left bits in case "shr" fail me and did arithmetic shift instead (no example shows this, but just to make sure). Then the arithmetic shift:

(a shr b) or (( 0-((a shr 31) and 1)) shl (32-b))

I need to bitwise-or the result with "b" ones from the left, but only when most significant bit was set; to do this firstly I take sign bit with "(a shr 31) and 1", then negate this number to get "-1" (or $FFFFFFFF – all bits =1) if source was negative, and 0 otherwise (I put "0-x" instead of just "-x" because in my C-port in some cases bcc32 C-compiler report a warning of tying to negate an unsigned integer); and finally I shifted it to "32 - b" bits left, so I got what I wish even when "shr" fails and give zeroes. I made two version of each function to deal with both integers and cardinals (also I could share names and "overload" them for me, but here I won’t do this to keep the example clear):

// Logical shift right with signed integers
function srlI(a,b:Integer):Integer;
begin
Result := (a shr b) and ((1 shl (32-b))-1);
end;

// Arithmetic shift right with signed integers
function sraI(a,b:Integer):Integer;
begin
Result := (a shr b) or (( 0-((a shr 31) and 1)) shl (32-b));
end;

// Logical shift right with unsigned integers
function srlC(a,b:Cardinal):Cardinal;
begin
Result := (a shr b) and ((1 shl (32-b))-1);
end;

// Arithmetic shift right with unsigned integers
function sraC(a,b:Cardinal):Cardinal;
begin
Result := (a shr b) or (( 0-((a shr 31) and 1)) shl (32-b));
end;

Test it:

Writeln('E) ', sraI(-1,1) );
Writeln('F) ', srlI(-1,1) );
Writeln('G) ', sraC($FFFFFFFF,1) );
Writeln('H) ', srlC($FFFFFFFF,1) );

And got perfect results:

E) -1
F) 2147483647
G) 4294967295
H) 2147483647

(G-case is still correct, because "4294967295" is the unsigned version of "-1")

Final checks with variables:

Writeln('K) ', sraI(I,1) );
Writeln('L) ', srlI(I,1) );
Writeln('M) ', sraC(C,1) );
Writeln('N) ', srlC(C,1) );

Perfect:

K) -1
L) 2147483647
M) 4294967295
N) 2147483647

For this bug I also tried to change the second number (amount of shift) to a variable and/or try different casting it – same bug present, looks like it isn’t related with second argument. And trying to cast the result (to integer or to cardinal) before output didn’t improve anything either.

To make sure that I’m not only one who have the bug, I tried to run my entire example at http://codeforces.com/ (there a registered user can compile and execute a piece of code in different languages and compilers on server-side) to see the output.

"Delphi 7" compiler gave me exactly what I have – bug was present. Alternative option, "Free Pascal 2" shows even more wrong output:

0) 9223372036854775807
1) 2147483647
2) 9223372036854775807
3) 9223372036854775807
4) 2147483647
5) 2147483647
6) 2147483647
7) 2147483647
8) 2147483647
9) 2147483647
A) 2147483647
B) 2147483647
C) 2147483647
D) 2147483647
E) -1
F) 2147483647
G) 4294967295
H) 2147483647
K) -1
L) 2147483647
M) 4294967295
N) 2147483647

Strange "9223372036854775807" in cases 0-2-3 (there was "-1", "Integer(-1)" and "Integer($FFFFFFFF)" who don’t remember).

Here is my entire example in Delphi:

program bug;

{$APPTYPE CONSOLE}

// Simple shift right with signed integers
function shrI(a,b:Integer):Integer;
begin
Result := a shr b;
end;

// Simple shift right with unsigned integers
function shrC(a,b:Cardinal):Cardinal;
begin
Result := a shr b;
end;

// Logical shift right with signed integers
function srlI(a,b:Integer):Integer;
begin
Result := (a shr b) and ((1 shl (32-b))-1);
end;

// Arithmetic shift right with signed integers
function sraI(a,b:Integer):Integer;
begin
Result := (a shr b) or (( 0-((a shr 31) and 1)) shl (32-b));
end;

// Logical shift right with unsigned integers
function srlC(a,b:Cardinal):Cardinal;
begin
Result := (a shr b) and ((1 shl (32-b))-1);
end;

// Arithmetic shift right with unsigned integers
function sraC(a,b:Cardinal):Cardinal;
begin
Result := (a shr b) or (( 0-((a shr 31) and 1)) shl (32-b));
end;

var
I:Integer;
C:Cardinal;

begin
I := -1;
C := $FFFFFFFF;

Writeln('0) ', -1 shr 1 );
Writeln('1) ', $FFFFFFFF shr 1 );
// 0) -1           - correct
// 1) 2147483647   - correct

Writeln('2) ', Integer(-1) shr 1 );
Writeln('3) ', Integer($FFFFFFFF) shr 1 );
// 2) -1           - correct
// 3) -1           - correct

Writeln('4) ', Cardinal(-1) shr 1 );
Writeln('5) ', Cardinal($FFFFFFFF) shr 1 );
// 4) 2147483647   - correct
// 5) 2147483647   - correct

Writeln('6) ', I shr 1 );
Writeln('7) ', C shr 1 );
// 6) 2147483647   - INCORRECT!
// 7) 2147483647   - correct

Writeln('8) ', Integer(I) shr 1 );
Writeln('9) ', Cardinal(I) shr 1 );
// 8) 2147483647   - INCORRECT!
// 9) 2147483647   - correct

Writeln('A) ', Integer(C) shr 1 );
Writeln('B) ', Cardinal(C) shr 1 );
// A) 2147483647   - INCORRECT!
// B) 2147483647   - correct

Writeln('C) ', shrI(-1,1) );
Writeln('D) ', shrC($FFFFFFFF,1) );
// C) 2147483647   - INCORRECT!
// D) 2147483647   - correct

Writeln('E) ', sraI(-1,1) );
Writeln('F) ', srlI(-1,1) );
// E) -1           - correct
// F) 2147483647   - correct

Writeln('G) ', sraC($FFFFFFFF,1) );
Writeln('H) ', srlC($FFFFFFFF,1) );
// G) 4294967295   - correct
// H) 2147483647   - correct

Writeln('K) ', sraI(I,1) );
Writeln('L) ', srlI(I,1) );
// K) -1           - correct
// L) 2147483647   - correct

Writeln('M) ', sraC(C,1) );
Writeln('N) ', srlC(C,1) );
// M) 4294967295   - correct
// N) 2147483647   - correct

end.

Then I was curios, is this bug also present in C++? I wrote a port to C++ and use (Borland!) bcc32.exe to compile it.

Results:

0) -1
1) 2147483647
2) -1
3) -1
4) 2147483647
5) 2147483647
6) -1
7) 2147483647
8) -1
9) 2147483647
A) -1
B) 2147483647
C) -1
D) 2147483647
E) -1
F) 2147483647
G) 4294967295
H) 2147483647
K) -1
L) 2147483647
M) 4294967295
N) 2147483647

Everything is OK. Here is C++ version, in case someone also wants to look:

#include <iostream>
using namespace std;

// Simple shift right with signed integers
int shrI(int a, int b){
return a >> b;
}

// Simple shift right with unsigned integers
unsigned int shrC(unsigned int a, unsigned int b){
return a >> b;
}

// Logical shift right with signed integers
int srlI(int a, int b){
return (a >> b) & ((1 << (32-b))-1);
}

// Arithmetic shift right with signed integers
int sraI(int a, int b){
return (a >> b) | (( 0-((a >> 31) & 1)) << (32-b));
}

// Logical shift right with unsigned integers
unsigned int srlC(unsigned int a, unsigned int b){
return (a >> b) & ((1 << (32-b))-1);
}

// Arithmetic shift right with unsigned integers
unsigned int sraC(unsigned int a, unsigned int b){
return (a >> b) | (( 0-((a >> 31) & 1)) << (32-b));
}

int I;
unsigned int C;

int main(){
I = -1;
C = 0xFFFFFFFF;

cout<<"0) "<<( -1 >> 1 )<<endl;
cout<<"1) "<<( 0xFFFFFFFF >> 1 )<<endl;
// 0) -1           - correct
// 1) 2147483647   - correct

cout<<"2) "<<( ((int)(-1)) >> 1 )<<endl;
cout<<"3) "<<( ((int)(0xFFFFFFFF)) >> 1 )<<endl;
// 2) -1           - correct
// 3) -1           - correct

cout<<"4) "<<( ((unsigned int)(-1)) >> 1 )<<endl;
cout<<"5) "<<( ((unsigned int)(0xFFFFFFFF)) >> 1 )<<endl;
// 4) 2147483647   - correct
// 5) 2147483647   - correct

cout<<"6) "<<( I >> 1 )<<endl;
cout<<"7) "<<( C >> 1 )<<endl;
// 6) -1           - correct
// 7) 2147483647   - correct

cout<<"8) "<<( ((int)(I)) >> 1 )<<endl;
cout<<"9) "<<( ((unsigned int)(I)) >> 1 )<<endl;
// 8) -1           - correct
// 9) 2147483647   - correct

cout<<"A) "<<( ((int)(C)) >> 1 )<<endl;
cout<<"B) "<<( ((unsigned int)(C)) >> 1 )<<endl;
// A) -1           - correct
// B) 2147483647   - correct

cout<<"C) "<<( shrI(-1,1) )<<endl;
cout<<"D) "<<( shrC(0xFFFFFFFF,1) )<<endl;
// C) -1           - correct
// D) 2147483647   - correct

cout<<"E) "<<( sraI(-1,1) )<<endl;
cout<<"F) "<<( srlI(-1,1) )<<endl;
// E) -1           - correct
// F) 2147483647   - correct

cout<<"G) "<<( sraC(0xFFFFFFFF,1) )<<endl;
cout<<"H) "<<( srlC(0xFFFFFFFF,1) )<<endl;
// G) 4294967295   - correct
// H) 2147483647   - correct

cout<<"K) "<<( sraI(I,1) )<<endl;
cout<<"L) "<<( srlI(I,1) )<<endl;
// K) -1           - correct
// L) 2147483647   - correct

cout<<"M) "<<( sraC(C,1) )<<endl;
cout<<"N) "<<( srlC(C,1) )<<endl;
// M) 4294967295   - correct
// N) 2147483647   - correct

}

Before posting here, I tried to search this problem, and didn’t found any mention of this bug. Also I looked here: What is the behaviour of shl and shr for non register sized operands? and here: Arithmetic Shift Right rather than Logical Shift Right – but there other problems were discussed (that compiler internally casts any type to 32-bit number before doing actual shift; or shifting more than 31 bits), but not mine bug.

But wait, here is my problem: http://galfar.vevb.net/wp/2009/shift-right-delphi-vs-c/ !

With one remark: they say –

In Delphi the SHR is always a SHR operation: it never takes into account the sign.

But my example shows that Delphi does take sign into account, but only when source number is a constant expression, not a variable. So "-10 shr 2" equals to "-3", but "x shr 2" equals to "1073741821" when "x:=-10".

So I think this is a bug, and not a "behavior" that "shr" is always logical. You see, not always.
Trying to enable/disable any compiler options such a range checking or optimizations didn’t change anything.

Also, here I posted examples of how to bypass this problem and have correct arithmetic shift right. And my main question is: am I right?

Seems that left shift is always good in Delphi (it never uses the original sign bit, and not "undefined": for signed integers it behave as casting to cardinal before shifting and casting the result back to integer – a number can suddenly became negative of course). But now I wonder, are there any other similar bugs in Delphi? This is the first really significant bug I ever discover in Delphi 7. I love Delphi more than C++ exactly because I was always sure that my code is everytime doing what I want, without debug-testing every new unusual piece of code that I’m about to write (IMHO).

P.S. Here are some useful links that StackOverflow system suggests me when I typed my title before posting this question. Again, interesting information, but not about this bug:

Arithmetic bit-shift on a signed integer
Signed right shift = strange result?
Bitwise shift operators on signed types
Should you always use 'int' for numbers in C, even if they are non-negative?
Are the results of bitwise operations on signed integers defined?
Verifying that C / C++ signed right shift is arithmetic for a particular compiler?
Emulating variable bit-shift using only constant shifts?

P.P.S. Thanks a lot to Stack Exchange Team for assistance in posting this article. Guys, you rock!

3
JFTR, these are the results of Delphi XE8 (Win32): 0) 2147483647 1) 2147483647 2) 2147483647 3) 2147483647 4) 2147483647 5) 2147483647 6) 2147483647 7) 2147483647 8) 2147483647 9) 2147483647 A) 2147483647 B) 2147483647 C) 2147483647 D) 2147483647 E) -1 F) 2147483647 G) 4294967295 H) 2147483647 K) -1 L) 2147483647 M) 4294967295 N) 2147483647Uwe Raabe
Awesome first question btw, mega kudos for you!David Heffernan
What ? It's not Ian's question :) !?TLama
Capital letter I in the middle of the sentence, no headers, no bonus reading, clearly not the work of @IanBoyd ;-)David Heffernan
@David: You missed and no images in your list. :-)Ken White

3 Answers

18
votes

There is a bug, but it's not what you think. Here's the documentation for shr:

If x is a negative integer, the shl and shr operations are made clear in the following example:

var
  x: integer;
  y: string;

...
begin
  x := -20;
  x := x shr 1;
  //As the number is shifted to the right by 1 bit, the sign bit's value replaced is
  //with 0 (all negative numbers have the sign bit set to 1). 

  y := IntToHex(x, 8);
  writeln(y);
  //Therefore, x is positive.
  //Decimal value: 2147483638
  //Hexadecimal value: 7FFFFFF6
  //Binary value: 0111 1111 1111 1111 1111 1111 1111 0110
end.

So, shr and shl are always logical shift and are not arithmetic shift.

The defect is actually in the handling of negative true constants:

Writeln('0) ', -1 shr 1 );

Here, -1 is a signed value. It actually has type Shortint, a signed 8 bit integer. But the shift operators operate on 32 bit values, so it is sign extended to a 32 bit value. So that means that this excerpt should produce two lines with identical output:

var
  i: Integer;
....
i := -1;
Writeln(-1 shr 1);
Writeln( i shr 1);

and that the output should be:

2147483647
2147483647

On modern versions of Delphi, certainly from version 2010 and later, but possibly even earlier versions, that is the case.

But according to your question, in Delphi 7, -1 shr 1 evaluates to -1 which is wrong, because shr is logical shift.

We can guess at the source of the defect. The compiler evaluates -1 shr 1 because it is a constant value, and the compiler simply does it incorrectly, using arithmetic shift instead of logical shift.

Incidentally, the documentation contains another error. It says:

The operations x shl y and x shr y shift the value of x to the left or right by y bits, which (if x is an unsigned integer) is equivalent to multiplying or dividing x by 2^y; the result is of the same type as x.

The final part is not true. The expression x shl y is a 32 bit type, if x is an 8, 16 or 32 bit type, a 64 bit type otherwise.


Since your actual goal is to implement arithmetic shift, then none of this matters to you. You cannot use shl or shr. You will have to implement arithmetic shift yourself. I suggest that you do so using inline assembler since I suspect that might ultimately be easier to read and verify.

3
votes

If you're stuck with the asm versions of the arithmetic shifts, here's some code that would work:

Note that according to: http://docwiki.embarcadero.com/RADStudio/XE8/en/Program_Control
The first 3 parameters are passed in registers as follows: EAX, EDX, ECX

In 64 bit the register order is: RCX, RDX, R8, R9

The result of functions is passed in EAX

unit SARL;

interface

  function sar(const base: integer; shift: byte): integer; 
  function sal(const base: integer; shift: byte): integer;

implementation

  function sar(const base: integer; shift: byte): integer;
  asm
    {$IFDEF CPU64BIT}
    mov eax,ecx
    mov ecx,edx
    sar eax,cl
    {$ELSE}
    mov ecx,edx
    sar eax,cl  //shr is very different from sar
    {$ENDIF}
  end;

  function sal(const base: integer; shift: byte): integer; 
  asm
    {$IFDEF CPU64BIT}
    mov eax,ecx
    mov ecx,edx
    shl eax,cl
    {$ELSE}
    mov ecx,edx
    shl eax,cl   //Note that sal and shl are the same thing.
    {$ENDIF}
  end;

end.
1
votes

I tested in Delphi 7 and it seems that simply using "div 2" on an integer variable directly compiles to a SAR assembler operation (as viewed in the CPU window).

Update: Div does not work correctly as a replacement for SAR as I explained in my comment in this answer. The compiler generates a SAR statement, but then tests the sign bit and adjusts the answer by adding in the bit that was shifted off on the right if the sign bit is set. This gives the correct behavior for the div operator on negative numbers but defeats our goal of getting the correct SAR behavior.