How to reconcile short conditional jumps with branch target alignments in Delphi assembler?
I’m using Delphi 10.2 Tokyo, targeting both 32-bit and 64-bit, to write some functions entirely in assembly.
If I don’t use .align, the compiler correctly encodes conditional jumps in the short form (a 2-byte instruction consisting of a 1-byte opcode, 074h for je, and a 1-byte signed relative offset from -080h to +07Fh). But as soon as I add even a single .align, even one as small as .align 4, every conditional jump that is located before the .align and has its destination after the .align becomes a 6-byte instruction instead of the 2-byte one it should be. Jumps located after the .align, and jumps whose target lies before it, are still correctly encoded as 2-byte short jumps.
The Delphi assembler doesn’t accept the ‘short’ prefix.
How can I reconcile short conditional jumps with branch target alignment via .align in the Delphi assembler?
Here is a sample procedure. Note that there is an .align in the middle.
procedure Test; assembler;
label
  label1, label2, label3;
asm
  mov al, 1
  cmp al, 2
  je label1
  je label2
  je label3
label1:
  mov al, 3
  cmp al, 4
  je label1
  je label2
  je label3
  mov al, 5
  .align 4
label2:
  cmp al, 6
  je label1
  je label2
  je label3
  mov al, 7
  cmp al, 8
  je label1
  je label2
  je label3
label3:
end;
Here is how it is encoded. The conditional jumps located before the .align that point to label2 and label3 (after the .align) are encoded as 6-byte instructions (this is a 64-bit CPU target):
0041C354 B001 mov al,$01 // mov al, 1
0041C356 3C02 cmp al,$02 // cmp al, 2
0041C358 740C jz $0041c366 // je label1
0041C35A 0F841C000000 jz $0041c37c // je label2
0041C360 0F8426000000 jz $0041c38c // je label3
0041C366 B003 mov al,$03 //label1: mov al, 3
0041C368 3C04 cmp al,$04 // cmp al, 4
0041C36A 74FA jz $0041c366 // je label1
0041C36C 0F840A000000 jz $0041c37c // je label2
0041C372 0F8414000000 jz $0041c38c // je label3
0041C378 B005 mov al,$05 // mov al, 5
0041C37A 8BC0 mov eax,eax // <-- a 2-byte dummy instruction, inserted by ".align 4" (almost a 2-byte NOP)
0041C37C 3C06 cmp al,$06 //label2: cmp al, 6
0041C37E 74E6 jz $0041c366 // je label1
0041C380 74FA jz $0041c37c // je label2
0041C382 7408 jz $0041c38c // je label3
0041C384 B007 mov al,$07 // mov al, 7
0041C386 3C08 cmp al,$08 // cmp al, 8
0041C388 74DC jz $0041c366 // je label1
0041C38A 74F0 jz $0041c37c // je label2
0041C38C C3 ret // label3:
But if I remove the .align, all the instructions have the correct size, just 2 bytes as before:
0041C354 B001 mov al,$01 // mov al, 1
0041C356 3C02 cmp al,$02 // cmp al, 2
0041C358 7404 jz $0041c35e // je label1
0041C35A 740E jz $0041c36a // je label2
0041C35C 741C jz $0041c37a // je label3
0041C35E B003 mov al,$03 //label1: mov al, 3
0041C360 3C04 cmp al,$04 // cmp al, 4
0041C362 74FA jz $0041c35e // je label1
0041C364 7404 jz $0041c36a // je label2
0041C366 7412 jz $0041c37a // je label3
0041C368 B005 mov al,$05 // mov al, 5
0041C36A 3C06 cmp al,$06 //.align 4 label2:cmp al, 6
0041C36C 74F0 jz $0041c35e // je label1
0041C36E 74FA jz $0041c36a // je label2
0041C370 7408 jz $0041c37a // je label3
0041C372 B007 mov al,$07 // mov al, 7
0041C374 3C08 cmp al,$08 // cmp al, 8
0041C376 74E6 jz $0041c35e // je label1
0041C378 74F0 jz $0041c36a // je label2
0041C37A C3 ret // je label3
// label3:
Back to the conditional jump instructions: how can I reconcile short conditional jumps with branch target alignment via .align in the Delphi assembler?
I acknowledge that the benefit of aligning branch targets on processors like Skylake and later is slim, and I understand that I can simply refrain from using .align, which would also save code size. But I want to know how I can use the Delphi assembler to generate short jumps together with .align. This problem also occurs with the 32-bit target, not only with the 64-bit one.
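As a sanity check on the two listings, the displacement arithmetic can be reproduced in a few lines. This is only a sketch, written in Python rather than Delphi for convenience. The function names encode_je, encode_je_near and align_padding are made up for this illustration and are not part of any Delphi or assembler API. It assumes the standard x86 encodings, 74 cb for je rel8 and 0F 84 cd for je rel32, with the displacement measured from the end of the jump instruction.

```python
def encode_je_near(addr: int, target: int) -> bytes:
    """Encode JE/JZ in the 6-byte form (0F 84 rel32)."""
    rel32 = target - (addr + 6)  # relative to the end of the 6-byte instruction
    return bytes([0x0F, 0x84]) + rel32.to_bytes(4, "little", signed=True)


def encode_je(addr: int, target: int) -> bytes:
    """Encode JE/JZ, preferring the 2-byte form (74 rel8) when it fits."""
    rel8 = target - (addr + 2)  # relative to the end of the 2-byte instruction
    if -128 <= rel8 <= 127:
        return bytes([0x74]) + rel8.to_bytes(1, "little", signed=True)
    return encode_je_near(addr, target)


def align_padding(addr: int, alignment: int) -> int:
    """Number of filler bytes needed to round addr up to a multiple of alignment."""
    return (-addr) % alignment


# "je label1" at 0041C36A in the listing with .align: a backward short jump.
print(encode_je(0x0041C36A, 0x0041C366).hex())       # 74fa

# "je label2" at 0041C35A as the compiler emitted it: the 6-byte form.
print(encode_je_near(0x0041C35A, 0x0041C37C).hex())  # 0f841c000000

# The same jump, same addresses, if the short form had been chosen.
print(encode_je(0x0041C35A, 0x0041C37C).hex())       # 7420

# Filler needed in front of label2: matches the 2-byte "mov eax,eax".
print(align_padding(0x0041C37A, 4))                  # 2
```

The third call shows that label2 is only 32 bytes ahead even in the padded layout, well inside the rel8 range, so the 6-byte encodings are not forced by distance.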
[...] jcc rel8 a "short" jump, and jcc rel32 a "near" jump. Both of them are near jumps, as opposed to a far jump to a different code segment. So "short" means "near with compact encoding". The online HTML versions get messy after the first page of the table :( – Peter Cordes
[...] .align with that assembler. That you get a few long forward branches shouldn't matter a lot. Most branches that matter (e.g. in loops) are backward anyway, and there it works. – Rudy Velthuis
[...] sysenter ABI like Linux does.) I'd guess that far jumps aren't predicted, but it's also possible that the CPU optimistically assumes that there's no call-gate or whatever. – Peter Cordes