4 votes

I'm writing a JIT compiler in C for x86_64 Linux.

Currently the idea is to generate some bytecode in a buffer of executable memory (e.g. obtained with an mmap call) and jump to it with a function pointer.

I'd like to be able to link multiple blocks of executable memory together such that they can jump between each other using only native instructions.

Ideally, the C-level pointer to an executable block can be written into another block as an absolute jump address, something like this:

unsigned char code_1[] = { 0xAB, 0xCD, ... };
void *exec_block_1 = mmap(NULL, ... );
write_bytecode(code_1, exec_block_1);
...
unsigned char code_2[] = { 0xAB, 0xCD, ... , exec_block_1, ... };
void *exec_block_2 = mmap(NULL, ... );
write_bytecode(code_2, exec_block_2); // bytecode contains exec_block_1 as a jump
                                      // address so that the code in the second block
                                      // can jump to the code in the first block

However, I'm finding the limitations of x86_64 quite an obstacle here. There's no way to jump to an absolute 64-bit address in x86_64, as all available 64-bit jump operations are relative to the instruction pointer. This means that I can't use the C pointer as a jump target for generated code.

Is there a solution to this problem that will allow me to link blocks together in the manner I've described? Perhaps an x86_64 instruction that I'm not aware of?

Hmm, maybe you are over-estimating the need to generate more than 2 gigabytes of code. The advantage of a jitter is that you can always tell that you need to fall back to an indirect jump, like jmp rax. – Hans Passant
@HansPassant That's a good point. At the moment my goal is just to implement the simplest thing that works, and worry about performance later. – AlexJ136
Also related: "Handling calls to far away intrinsic functions in a JIT", re: allocating blocks near each other with mmap with a hint address, so you can use a direct call or jmp rel32 encoding. – Peter Cordes

2 Answers

3 votes

If you know the addresses of the blocks at the time you emit the jump instructions, you can check whether the distance in bytes from the end of the jump instruction (the displacement is relative to the address of the next instruction) to the target block fits in the signed 32-bit displacement used by jmp rel32 and the jcc rel32 family of instructions.
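
As a rough sketch of that check, assuming the jump is written as a 5-byte jmp rel32 (opcode E9): the helper name emit_jmp_rel32 is made up for illustration and is not part of any library.

```c
#include <stdint.h>

/* Hypothetical helper: writes `jmp rel32` at `at`, targeting `target`.
 * Returns the number of bytes written (5), or 0 if the target is out of
 * range and some other encoding has to be used instead. */
static int emit_jmp_rel32(unsigned char *at, const void *target)
{
    /* The displacement is measured from the end of the 5-byte instruction. */
    intptr_t disp = (intptr_t)target - ((intptr_t)at + 5);
    if (disp < INT32_MIN || disp > INT32_MAX)
        return 0;

    uint32_t rel = (uint32_t)(int32_t)disp;
    at[0] = 0xE9;                            /* jmp rel32 */
    for (int i = 0; i < 4; i++)              /* little-endian displacement */
        at[1 + i] = (unsigned char)(rel >> (8 * i));
    return 5;
}
```

When this returns 0, the emitter would have to fall back to an indirect jump through a register.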

Even if you mmap each block separately, chances are pretty good that you won't get two neighbouring (in the control-flow sense) blocks that are more than ±2GiB apart. That being said, there are several good reasons not to map each block separately like that. First of all, mmap's minimum unit of allocation is (almost by definition) a page, which is probably at least 4KiB. That means that the unused space after the code for each block is wasted. Secondly, packing the basic blocks more tightly increases the utilization of the instruction cache and the chances of a shorter jump encoding being valid.
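
One possible way to pack blocks together is to mmap a single large region up front and carve blocks out of it. The following is only a sketch: struct code_arena, arena_init and arena_emit are hypothetical names, the 16-byte alignment is an arbitrary choice, and it assumes the system allows a mapping that is writable and executable at the same time (some hardened configurations do not).

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE            /* exposes MAP_ANONYMOUS under strict -std= modes */
#endif
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical arena: one executable mapping shared by all generated blocks. */
struct code_arena {
    unsigned char *base;
    size_t size;
    size_t used;
};

/* Maps `size` bytes as read/write/execute. Returns 0 on success, -1 on failure. */
static int arena_init(struct code_arena *a, size_t size)
{
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE | PROT_EXEC,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    a->base = p;
    a->size = size;
    a->used = 0;
    return 0;
}

/* Copies `len` bytes of machine code into the arena.
 * Returns the start of the new block, or NULL if the arena is full. */
static void *arena_emit(struct code_arena *a, const unsigned char *code, size_t len)
{
    size_t start = (a->used + 15) & ~(size_t)15;   /* 16-byte align each block */
    if (start > a->size || len > a->size - start)
        return NULL;
    memcpy(a->base + start, code, len);
    a->used = start + len;
    return a->base + start;
}
```

As long as the arena is smaller than 2 GiB, any two blocks inside it are within reach of a rel32 jump.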

> Perhaps an x86_64 instruction that I'm not aware of?

Incidentally, there is an instruction for loading a 64-bit immediate into rax. The GNU toolchain refers to it as movabs:

0000000000000000 <.text>:
   0:   48 b8 ff ff ff ff ff    movabs rax,0x7fffffffffffffff
   7:   ff ff 7f

So if you really want to, you can simply load the pointer into rax and jump through the register (jmp rax).
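
A minimal sketch of emitting that pair from C, assuming rax is free to clobber at the jump site: emit_jmp_abs64 is a hypothetical name used only for this example.

```c
#include <stdint.h>

/* Hypothetical helper: writes `movabs rax, target` followed by `jmp rax`
 * at `at`. Always writes 12 bytes and returns that count. */
static int emit_jmp_abs64(unsigned char *at, const void *target)
{
    uint64_t addr = (uint64_t)(uintptr_t)target;

    at[0] = 0x48;                            /* REX.W prefix */
    at[1] = 0xB8;                            /* mov rax, imm64 */
    for (int i = 0; i < 8; i++)              /* little-endian immediate */
        at[2 + i] = (unsigned char)(addr >> (8 * i));
    at[10] = 0xFF;                           /* jmp r/m64 */
    at[11] = 0xE0;                           /* ModRM selecting rax */
    return 12;
}
```

This costs 12 bytes per jump instead of 5, so it probably makes sense only as the fallback when the rel32 form does not reach.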

0 votes

Hmm, I'm not sure I understood your question correctly, or whether this is a proper answer. Here is a rather convoluted way to achieve this:

    ;instr              ; opcodes [op size] (comment)
    call next           ; e8 00 00 00 00 [5] (call to get current location)
next:
    pop rax             ; 58 [1] (address of the next label in rax)
    add rax, 12h        ; 48 83 c0 12 [4] (adjust rax to point at the landing label)
    push rax            ; 50 [1] (push adjusted value as the return address)
    mov rax, code_block ; 48 b8 XX XX XX XX XX XX XX XX [10] (load target address)
    push rax            ; 50 [1] (push to ret to code_block)
    ret                 ; c3 [1] (go to code_block)
landing:
    nop
    nop

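e8 00 00 00 00 is just there to get the current instruction pointer onto the top of the stack. The code then adjusts rax so that it points at the landing label (12h is the number of bytes from next to landing) and pushes it as the return address. You'll need to replace the XX bytes (in mov rax, code_block) with the virtual address of the code block, in little-endian byte order. The ret instruction is used as a call: when code_block returns, execution resumes at landing.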
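
If it helps, here is a sketch of how that byte sequence could be written into a buffer from C. The function name emit_call_via_ret is invented for this example; the bytes are the ones listed in the comments of the listing.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: writes the call/pop/add/push/mov/push/ret sequence
 * at `at`, with `code_block` as the 64-bit target. Returns the number of
 * bytes written; the landing point is at `at` plus that count. */
static int emit_call_via_ret(unsigned char *at, const void *code_block)
{
    static const unsigned char prefix[] = {
        0xE8, 0x00, 0x00, 0x00, 0x00,   /* call next (pushes address of next) */
        0x58,                           /* next: pop rax                      */
        0x48, 0x83, 0xC0, 0x12,         /* add rax, 12h (next -> landing)     */
        0x50,                           /* push rax (return address)          */
        0x48, 0xB8                      /* mov rax, imm64 (immediate follows) */
    };
    uint64_t addr = (uint64_t)(uintptr_t)code_block;
    int n = (int)sizeof prefix;

    memcpy(at, prefix, sizeof prefix);
    for (int i = 0; i < 8; i++)         /* little-endian target address */
        at[n++] = (unsigned char)(addr >> (8 * i));
    at[n++] = 0x50;                     /* push rax (target for ret) */
    at[n++] = 0xC3;                     /* ret (jumps to code_block) */
    return n;
}
```

The whole sequence is 23 bytes, and next sits 5 bytes in, which is where the 12h (18) adjustment comes from.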

Is this the kind of thing you're trying to achieve?