
Please note that my question is about the JVM interpreter, not the JIT compiler. The JIT compiler converts Java bytecode to native machine code, which must mean that the interpreter within the JVM does not convert bytecode to machine code. Hence the question: what, in essence, does the interpreter do? Could someone answer this with a simple example, the bytecode equivalent of 1+1 = 2: what does the interpreter do to execute this add operation? (My implicit question is: if the interpreter does not translate bytecode into machine code for the CPU to execute, how is the ADD performed? What machine code is actually executed to carry out this ADD operation?)

The actual machine code is part of the interpreter. Just think of a loop containing a switch statement, with a case for each bytecode instruction that either performs the operation right there or invokes a subprogram. - Holger

1 Answer

The expression 1+1 will compile to the following bytecode:

iconst_1
iconst_1
iadd

(Actually, it will just compile to iconst_2 because the Java compiler performs constant-folding, but let's ignore that for the purposes of this answer.)

So to find out exactly what the interpreter does for those instructions, we should look at its source code. The relevant sections for iconst_1 and iadd start at line 983 and line 1221 respectively, so let's take a look:

#define OPC_CONST_n(opcode, const_type, value)                          \
      CASE(opcode):                                                     \
          SET_STACK_ ## const_type(value, 0);                           \
          UPDATE_PC_AND_TOS_AND_CONTINUE(1, 1);

          OPC_CONST_n(_iconst_m1,   INT,       -1);
          OPC_CONST_n(_iconst_0,    INT,        0);
          OPC_CONST_n(_iconst_1,    INT,        1);
          // goes on for several other constants

//...
#define OPC_INT_BINARY(opcname, opname, test)                           \
      CASE(_i##opcname):                                                \
          if (test && (STACK_INT(-1) == 0)) {                           \
              VM_JAVA_ERROR(vmSymbols::java_lang_ArithmeticException(), \
                            "/ by zero", note_div0Check_trap);          \
          }                                                             \
          SET_STACK_INT(VMint##opname(STACK_INT(-2),                    \
                                      STACK_INT(-1)),                   \
                                      -2);                              \
          UPDATE_PC_AND_TOS_AND_CONTINUE(1, -1);                        \
          // and then the same thing for longs instead of ints

      OPC_INT_BINARY(add, Add, 0);
      // other operators

The whole thing is inside a switch-statement that examines the opcode of the current instruction.
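
To make the macro mechanics concrete, here is a minimal sketch using hypothetical, heavily simplified stand-ins for the HotSpot macros. The real definitions deal with typed stack slots and much more, so treat this only as an illustration of how token pasting (`##`) and `CASE` turn one macro call into one switch case. The names `op_iconst_0`, `op_iconst_1`, `op_iadd`, `run`, and the fixed 16-slot stack are assumptions made for this example, not part of the JVM source.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, simplified stand-ins for the interpreter's macros.
#define CASE(opcode) case opcode
#define STACK_INT(offset) (stack[top + (offset)])
#define SET_STACK_INT(value, offset) (stack[top + (offset)] = (value))
#define UPDATE_PC_AND_TOS_AND_CONTINUE(pc_delta, tos_delta) \
    { pc += (pc_delta); top += (tos_delta); continue; }

// SET_STACK_ ## const_type pastes tokens, so INT selects SET_STACK_INT.
#define OPC_CONST_n(opcode, const_type, value) \
    CASE(opcode):                              \
        SET_STACK_ ## const_type(value, 0);    \
        UPDATE_PC_AND_TOS_AND_CONTINUE(1, 1)

enum { op_iconst_0, op_iconst_1, op_iadd };

// Executes `length` opcodes and returns the int on top of the stack,
// or -1 for an unknown opcode. Assumes a short, well-formed program.
int run(const unsigned char* code, std::size_t length) {
    int stack[16] = {0};      // assumption: 16 slots is enough for the demo
    std::size_t pc = 0;       // index of the current instruction
    std::ptrdiff_t top = 0;   // index of the next free stack slot
    while (pc < length) {
        switch (code[pc]) {
            OPC_CONST_n(op_iconst_0, INT, 0);
            OPC_CONST_n(op_iconst_1, INT, 1);
            CASE(op_iadd):
                assert(top >= 2);  // iadd needs two operands
                SET_STACK_INT(STACK_INT(-2) + STACK_INT(-1), -2);
                UPDATE_PC_AND_TOS_AND_CONTINUE(1, -1);
            default:
                return -1;
        }
    }
    return stack[top - 1];
}
```

Calling `run` on the three opcodes `op_iconst_1, op_iconst_1, op_iadd` walks through the same cases the macros above generate. The `continue` inside the update macro jumps back to the top of the loop, which is why no `break` is needed between cases.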

If we expand the macro-magic, replace the surrounding code with an extremely simplified template and make some simplifying assumptions (such as the stack only consisting of ints), we end up with something like this:

enum OpCode {
  _iconst_1, _iadd
};

// ...
int* stack = new int[calculate_maximum_stack_size()];
size_t top_of_stack = 0;
size_t program_counter = 0;
while(program_counter < program_size) {
  switch(opcodes[program_counter]) {
    case _iconst_1:
      // SET_STACK_INT(1, 0);
      stack[top_of_stack] = 1;
      // UPDATE_PC_AND_TOS_AND_CONTINUE(1, 1);
      program_counter += 1;
      top_of_stack += 1;
      break;

    case _iadd:
      // SET_STACK_INT(VMintAdd(STACK_INT(-2), STACK_INT(-1)), -2);
      stack[top_of_stack - 2] = stack[top_of_stack - 1] + stack[top_of_stack - 2];
      // UPDATE_PC_AND_TOS_AND_CONTINUE(1, -1);
      program_counter += 1;
      top_of_stack += -1;
      break;
  }
}

So for 1+1 the sequence of operations would be:

stack[0] = 1;
stack[1] = 1;
stack[0] = stack[1] + stack[0];

And top_of_stack would be 1, so we'd end with a stack that contains the value 2 as its only element.
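
For anyone who wants to run this, below is a self-contained sketch of the same simplified loop. It is not the JVM's code: the opcode names `ICONST_1` and `IADD`, the function `interpret`, and the crude stack-size bound are assumptions made for illustration. The point it tries to show is the one from the question: the `+` in the `IADD` case is ordinary C++, so the machine-level add that runs is whatever the C++ compiler emitted for that line when the interpreter itself was built.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative opcodes only; real JVM opcodes have fixed numeric values.
enum OpCode { ICONST_1, IADD };

// Runs the program and returns the operand stack as it stands at the end.
std::vector<int> interpret(const std::vector<OpCode>& code) {
    // Assumption: each instruction pushes at most one value, so
    // code.size() slots are always enough for this tiny instruction set.
    std::vector<int> stack(code.size());
    std::size_t top_of_stack = 0;
    std::size_t program_counter = 0;
    while (program_counter < code.size()) {
        switch (code[program_counter]) {
            case ICONST_1:
                stack[top_of_stack] = 1;   // push the constant 1
                top_of_stack += 1;
                break;
            case IADD:
                assert(top_of_stack >= 2); // need two operands
                // This '+' is compiled ahead of time into the interpreter's
                // own machine code; no new machine code is generated here.
                stack[top_of_stack - 2] =
                    stack[top_of_stack - 2] + stack[top_of_stack - 1];
                top_of_stack -= 1;         // two operands replaced by one result
                break;
        }
        program_counter += 1;
    }
    stack.resize(top_of_stack);            // keep only the live entries
    return stack;
}
```

Calling `interpret({ICONST_1, ICONST_1, IADD})` returns a stack whose single element is 2, matching the trace above.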