29
votes

It's known that CF indicates unsigned carry out and OF indicates signed overflow. So how does an assembly program differentiate between unsigned and signed data, since it's only a sequence of bits? (Through additional memory storage for type information, through positional information, or something else?) And could these two flags be used interchangeably?

6 Answers

43
votes

The distinction is in what instructions are used to manipulate the data, not the data itself. Modern computers (since circa 1970) use a representation of integer data called two's-complement in which addition and subtraction work exactly the same on both signed and unsigned numbers.

  • The difference lies in the interpretation given to the most significant bit (also called the sign bit). For unsigned numbers, the most significant bit is set when the number is in the upper half of the wholly non-negative range. For signed numbers, the most significant bit is set when the number is in the negative half of the range.

  • Different instructions may use different interpretations of the same bit. For example most big machines have both signed and unsigned multiply instructions. Machines with a 'set less than' instruction may have both signed and unsigned flavors.

  • The OF (overflow flag) tells whether a carry flipped the most significant bit of the result so that it differs from the most significant bits of the arguments. If numbers are interpreted as unsigned, the overflow flag is irrelevant, but if they are interpreted as signed, OF means, e.g., that two large positive numbers were added and the result was negative.

  • The CF (carry flag) tells whether a bit was carried out of the word entirely (e.g. into bit 33 or bit 65). If numbers are interpreted as unsigned, a set carry flag means that the addition overflowed and the result is too large to fit in a machine word; the overflow flag is irrelevant in that case.
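
To make the two flags concrete, here is a minimal sketch that models an 8-bit addition in Python and derives CF and OF from the bit patterns alone. The helper names add8 and as_signed8 are made up for this illustration, and the 8-bit width is an assumption chosen to keep the numbers small; a real CPU computes both flags in hardware on every add.

```python
def add8(a, b):
    """Add two 8-bit patterns; return (result, CF, OF) as an 8-bit add would."""
    a &= 0xFF
    b &= 0xFF
    total = a + b
    result = total & 0xFF
    # CF: a bit was carried out of the 8-bit word (unsigned overflow).
    cf = total > 0xFF
    # OF: both inputs have the same sign bit and the result's sign bit differs.
    of = ((a ^ result) & (b ^ result) & 0x80) != 0
    return result, cf, of


def as_signed8(x):
    """Reinterpret an 8-bit pattern as a two's-complement value."""
    return x - 0x100 if x & 0x80 else x


for a, b in [(0x7F, 0x01), (0xFF, 0x01), (0x80, 0x80)]:
    r, cf, of = add8(a, b)
    print(f"{a:#04x} + {b:#04x} = {r:#04x}  CF={int(cf)} OF={int(of)}  "
          f"unsigned: {a} + {b} -> {r}  "
          f"signed: {as_signed8(a)} + {as_signed8(b)} -> {as_signed8(r)}")
```

The same addition sets both flags; which one matters depends only on how the program chooses to read the operands. For 0x7F + 0x01 only OF is set (127 + 1 wraps to -128 as signed, but 127 + 1 = 128 is fine as unsigned), while for 0xFF + 0x01 only CF is set (255 + 1 does not fit as unsigned, but -1 + 1 = 0 is fine as signed).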

The answer to your question is that assembly code has several ways of distinguishing signed from unsigned data:

  • It may test either CF (for unsigned comparisons) or OF (for signed comparisons).
  • It may choose either signed or unsigned multiply and divide instructions.
  • It may choose a signed or unsigned right shift (signed copies the high bit; unsigned shifts in zeroes).
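
The difference between the two right shifts can be sketched in Python on an 8-bit pattern. The function names shr8 and sar8 are hypothetical helpers for this illustration (named after the x86 SHR and SAR mnemonics), and the 8-bit width is an assumption.

```python
def shr8(x, n):
    """Logical (unsigned) right shift: zeroes enter from the left."""
    return (x & 0xFF) >> n


def sar8(x, n):
    """Arithmetic (signed) right shift: the high bit is copied in."""
    x &= 0xFF
    signed = x - 0x100 if x & 0x80 else x  # reinterpret as two's complement
    # Python's >> on a negative int is arithmetic, so the sign is preserved.
    return (signed >> n) & 0xFF


pattern = 0xF0  # 240 if read as unsigned, -16 if read as signed
print(f"{shr8(pattern, 2):#04x}")  # 0x3c, i.e. 240 // 4 = 60
print(f"{sar8(pattern, 2):#04x}")  # 0xfc, i.e. -16 // 4 = -4
```

The input bits are identical in both calls; only the choice of instruction encodes whether the value is meant to be signed.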
19
votes

Do not try to opcode the sign. That is impossible. Instead, only try to realize the truth: there is no sign. Then you'll see it is not the sign-type that differentiates, it is only yourself.

6
votes

There are different opcodes for dealing with signed and unsigned data. If a program wants to compare two signed integers, it uses the opcodes jl, jle, jg, and jge, where the l and g stand for less and greater respectively. If a program wants to compare two unsigned integers, it uses the opcodes jb, jbe, ja, and jae, where the a and b stand for above and below respectively. The e stands for 'or equal to' in all cases. These opcodes are used for branching based on a comparison.

Similarly, there are also the setCC instructions, which set a byte to 0 or 1 depending on a comparison. These function identically -- there are setl, setle, setg, setge, setb, setbe, seta, setae, and others.

The signed opcodes test the flags ZF, OF, and SF. The unsigned opcodes test the flags ZF and CF. See the 80386 Programmer's Reference Manual sections on the Jcc instructions and the SETcc instructions for the exact conditions tested.
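
As a rough illustration of those conditions, the sketch below models the flags produced by an 8-bit compare (a subtraction whose result is discarded) and then evaluates four of the branch conditions on them. The function cmp8 and the Python functions named after the jump mnemonics are hypothetical helpers for this example, not a real API, and the 8-bit width is an assumption.

```python
def cmp8(a, b):
    """Return the flags an 8-bit compare of a with b (a - b) would produce."""
    a &= 0xFF
    b &= 0xFF
    result = (a - b) & 0xFF
    return {
        "ZF": result == 0,
        "SF": bool(result & 0x80),
        "CF": a < b,  # a borrow was needed: a is below b as unsigned
        # Signed overflow: operands differ in sign and the result's sign
        # differs from the first operand's.
        "OF": bool((a ^ b) & (a ^ result) & 0x80),
    }


def jb(f):
    return f["CF"]  # unsigned below


def ja(f):
    return not f["CF"] and not f["ZF"]  # unsigned above


def jl(f):
    return f["SF"] != f["OF"]  # signed less


def jg(f):
    return not f["ZF"] and f["SF"] == f["OF"]  # signed greater


flags = cmp8(0xFF, 0x01)
print(flags)
print("ja taken:", ja(flags), " jl taken:", jl(flags))
```

After comparing 0xFF with 0x01, both ja and jl would be taken: the first reads the operands as 255 and 1, the second as -1 and 1. The compare itself is the same; the branch chosen afterwards carries the signedness.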

4
votes

It doesn't. The flags just become set whenever the condition occurs. The programmer is supposed to know what types of ints he's working with and from that know which flag to examine if he cares.

3
votes

There is no way to ask the CPU to test and return the type of a byte/word/long.

0xFF may hold "255" or "-1"; it all depends on what type of byte your program says it is.
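
A quick way to see this, sketched here in Python rather than assembly, is to decode the same single byte twice with the standard int.from_bytes call, once as unsigned and once as signed. The variable names are just for illustration.

```python
raw = bytes([0xFF])  # one byte, all bits set

# Same bits, two interpretations chosen by the caller.
as_unsigned = int.from_bytes(raw, byteorder="little", signed=False)
as_signed = int.from_bytes(raw, byteorder="little", signed=True)

print(as_unsigned)  # 255
print(as_signed)    # -1
```

Nothing in the byte itself says which reading is correct; the signed argument plays the role that the choice of instruction plays in assembly.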

Constructs such as "type", "signedness" etc. only exist in higher-level languages such as Java, not at the CPU level. In the end everything is a byte to the CPU; it is up to our programs to organise these values and to know how to interpret and manipulate them.

The CPU flags found in the status register do not enforce any paradigm; it is up to your code to test them and react accordingly.

On Intel CPUs the MMX registers are aliased onto the FPU registers. It is thus impossible to mix FPU and MMX instructions at the same time, because values from one operation will trash the other. Programs that need both typically complete their work in one mode, e.g. issuing FPU instructions, and only then start MMX, but never use both at the same time.

1
votes

Usually assembly programs carry no special information around with variables to indicate whether they are signed or unsigned. It's the programmer's job to know when to check which flags and when to use which conditionals (e.g. using JA instead of JG).

So you need to know what type of variable you're about to work with so that you know which commands to use. This is why most programming languages give warnings when programmers use signed/unsigned types interchangeably (i.e. without an explicit cast), since this can be done in the hardware but can yield unexpected results.