How does an x86 assembler resolve the address of a label that is defined after the jmp instructions that reference it?

Question

For example:

jmp LABEL

...  # loads of instructions

jmp LABEL

.... # loads of instructions

LABEL:

.....

Without knowing the size of each jmp LABEL instruction, the address of LABEL can't be determined, because the two forms of the jmp instruction (short: 2 bytes; near: 3 or 5 bytes) differ in size. Conversely, without knowing LABEL's address, you cannot determine which form to use.

How does the assembler solve this?

Logically, assembling code is a multi-phase operation. The address of labels will not be resolved until all other code is generated. So the size of all the encoded instructions between jmp and LABEL will be known. — Andon M. Coleman

Hans Passant · Accepted Answer · 2014-12-18T18:06:54

It depends on the kind of assembler you use. A simple 2-pass assembler (like MASM) makes it your problem. It picks the long jump for a forward reference, and you have to write JMP SHORT LABEL explicitly to get the short one. And it will bitch at you when you guess wrong.
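The two-pass scheme can be sketched roughly as follows. This is an illustrative model, not MASM's actual implementation: the tuple-based program format, the function name `two_pass`, and the restriction to 32-bit unconditional jumps (2-byte short, 5-byte near) are all assumptions made for the example.

```python
SHORT_SIZE, NEAR_SIZE = 2, 5  # assumed: EB rel8 and E9 rel32 in 32-bit code


def two_pass(program):
    """Hypothetical model of a two-pass assembler.

    program is a list of ("label", name), ("jmp", target, is_short)
    or ("code", size_in_bytes) tuples.
    """
    # Pass 1: every size is fixed up front, so label addresses are easy.
    symbols, addr = {}, 0
    for item in program:
        if item[0] == "label":
            symbols[item[1]] = addr
        elif item[0] == "jmp":
            addr += SHORT_SIZE if item[2] else NEAR_SIZE
        else:
            addr += item[1]

    # Pass 2: all labels are known, so displacements can be computed.
    jumps, addr = [], 0
    for item in program:
        if item[0] == "label":
            continue
        if item[0] == "code":
            addr += item[1]
            continue
        size = SHORT_SIZE if item[2] else NEAR_SIZE
        addr += size
        disp = symbols[item[1]] - addr  # relative to the end of the jump
        if item[2] and not -128 <= disp <= 127:
            raise ValueError(f"short jump to {item[1]} out of range ({disp})")
        jumps.append((item[1], size, disp))
    return jumps
```

The circular dependency disappears because the size of every jump is decided by the programmer (the `is_short` flag) before any address is known; the price is an error in the second pass when that decision was wrong.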

An optimizing n-pass assembler (like TASM) sorts it out by itself. It assumes the short jump, and if it discovers that the jump can't reach its target, it restarts the assembly, now with a long jump.
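A minimal sketch of that "start short, grow on failure" loop, under the same simplifying assumptions (32-bit code, unconditional jumps only, 2-byte short and 5-byte near forms). The function name `relax` and the tuple-based program format are hypothetical and do not describe TASM's real internals.

```python
def relax(program):
    """Hypothetical model of an optimizing n-pass assembler.

    program is a list of ("label", name), ("jmp", target)
    or ("code", size_in_bytes) tuples.
    Returns (jump sizes in program order, label addresses).
    """
    # Optimistic start: assume every jump fits the 2-byte short form.
    sizes = {i: 2 for i, item in enumerate(program) if item[0] == "jmp"}
    while True:
        # Lay out the program with the current size guesses.
        symbols, ends, addr = {}, {}, 0
        for i, item in enumerate(program):
            if item[0] == "label":
                symbols[item[1]] = addr
            elif item[0] == "jmp":
                addr += sizes[i]
                ends[i] = addr  # displacement is relative to this address
            else:
                addr += item[1]

        # Grow any short jump whose 8-bit displacement cannot reach.
        grown = False
        for i, end in ends.items():
            disp = symbols[program[i][1]] - end
            if sizes[i] == 2 and not -128 <= disp <= 127:
                sizes[i] = 5
                grown = True

        # Sizes only ever grow, so the loop must terminate.
        if not grown:
            return [sizes[i] for i in sorted(sizes)], symbols
```

Growing one jump can push another target out of range, which is why a single extra pass is not always enough. In the program `[("jmp", "A"), ("code", 124), ("jmp", "B"), ("label", "A"), ("code", 300), ("label", "B")]`, the first jump fits in the short form until the second one is widened, so the loop needs three layouts before it settles.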

You can easily tell which flavor you have. Just look at the code listing it generates: if you get the 5-byte long jump for a forward jump whose target is within short range, then you have the 2-pass kind.
