7
votes

Ok, I have read several discussions regarding the differences between JIT and non-JIT enabled interpreters, and why JIT usually boosts performance.

However, my question is:

Ultimately, doesn't a non-JIT enabled interpreter have to turn bytecode (line by line) into machine/native code to be executed, just like a JIT compiler will do? I've seen posts and textbooks that say it does, and posts that say it does not. The latter argument is that the interpreter/JVM executes this bytecode directly with no interaction with machine/native code.

If non-JIT interpreters do turn each line into machine code, it seems that the primary benefits of JIT are...

  1. The intelligence of caching either all (normal JIT) or frequently encountered (hotspot/adaptive optimization) parts of the bytecode so that the machine code compilation step is not needed every time.

  2. Any optimization JIT compilers can perform in translating bytecode into machine code.

Is that accurate? There seems to be little difference (other than possible optimization, or JITting blocks vs line by line maybe) between the translation of bytecode to machine code via non-JIT and JIT enabled interpreters.

Thanks in advance.

2

2 Answers

10
votes

A non-JIT interpreter doesn't convert bytecode to machine code. You can imagine the workings of a non-JIT bytecode interpreter something like this (I'll use a Java-like pseudocode):

int[] bytecodes = { ... };
int   ip        = 0; // instruction pointer
while(true) {
  int code = bytecodes[ip];
  switch(code) {
    case 0;
      // do something
      ip += 1; break;
    case 1:
      // do something else
      ip += 1; break;
    // and so on...
  }
}

So for every bytecode executed, the interpreter has to retrieve the code, switch on its value to decide what to do, and increment its "instruction pointer" before going to the next iteration.

With a JIT, all that overhead would be reduced to nothing. It would just take the contents of the appropriate switch branches (the parts that say "// do something"), string them together in memory, and execute a jump to the beginning of the first one. No software "instruction pointer" would be required -- only the CPU's hardware instruction pointer. No retrieving of bytecodes from memory and switching on their values either.

Writing a virtual machine is not difficult (if it doesn't have to be extremely high performance), and can be an interesting exercise. I did one once for an embedded project where the program code had to be very compact.

0
votes

Decades ago, there seemed to be a widespread belief that compilers would turn an entire program into machine code, while interpreters would translate a statement into machine code, execute it, discard it, translate the next one, etc. That notion was 99% incorrect, but there were two a tiny kernels of truth to it. On some microprocessors, some instructions required the use of addresses that were specified in code. For example, on the 8080, there was an instruction to read or write a specified I/O address 0x00-0xFF, but there was no instruction to read-or write an I/O address specified in a register. It was common for language interpreters, if user code did something like "out 123,45", to store into three bytes of memory the instructions "out 7Bh/ret", load the accumulator with 2Dh, and make a call to the first of those instructions. In that situation, the interpreter would indeed be producing a machine code instruction to execute the interpreted instruction. Such code generation, however, was mostly limited to things like IN and OUT instructions.

Many common Microsoft BASIC interpreters for the 6502 (and perhaps the 8080 as well) made somewhat more extensive use of code stored in RAM, but the code that was stored in RAM not not significantly depend upon the program that was executing; the majority of the RAM routine would not change during program execution, but the address of the next instruction was kept in-line as part of the routine allowing the use of an absolute-mode "LDA" instruction, which saved at least one cycle off every byte fetch.