0
votes

I'm playing around with inline assembly in C++ using gcc-4.7 on 64-bit little endian Ubuntu 12.04 LTS with Eclipse CDT and gdb. The general direction of what I'm trying to do is to make some sort of bytecode interpreter for some esoteric stack-based programming language.

In this example, I process the instructions 4 bits at a time (in practice this will depend on the instruction), and when there are no more non-zero instructions (as 0 will be nop) I read the next 64-bit word.

I would like to ask though, how do I use a function-scoped label in inline assembly?

It seems labels in assembly are global, which is unfavourable, and I can't find a way to jump to a C++ function-scoped label from an assembly statement.

The following code is an example of what I'm trying to do (Note the comment):

  ...
  register long ip  asm("r8");
  register long buf asm("r9");
  register long op  asm("r10");
  ...
fetch:
  asm("mov (%r8), %r9");
  asm("add $8, %r8");
control:
  asm("test %r9, %r9");
  asm("jz   fetch"); // undefined reference to `fetch'
  asm("shr  $4, %r9");
  asm("mov  %r9, %r10");
  asm("and  $0xf, %r10");
  switch (op) {
  ...
  }
  goto control;
2 Answers

1
votes

Note the following comment from the gcc inline asm documentation:

Speaking of labels, jumps from one `asm' to another are not supported. The compiler's optimizers do not know about these jumps, and therefore they cannot take account of them when deciding how to optimize.

You also can't rely on the flags set in one asm statement still being set in the next, as the compiler might insert code between them.

With gcc 4.5 and later, you can use asm goto to do what you want:

fetch:
  asm("mov (%r8), %r9");
  asm("add $8, %r8");
control:
  asm goto("test %r9, %r9\n\t"
           "jz  %l[fetch]" : : : : fetch);
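
As a self-contained illustration of the asm goto form, here is a minimal sketch. The function first_nonzero and its parameters are hypothetical and not from the question. It assumes GCC (or a compatible compiler) on x86-64 and falls back to plain C++ elsewhere. The test and the jz sit in the same statement, so nothing can clobber the flags between them, and the value is passed as a declared input operand instead of a hard-coded register.

```cpp
#include <cstdint>

// Hypothetical helper (not from the question): skips zero words and returns
// the first non-zero one, or 0 if the buffer is exhausted.
std::uint64_t first_nonzero(const std::uint64_t* ip, const std::uint64_t* end) {
    std::uint64_t buf;
fetch:
    if (ip == end) return 0;
    buf = *ip++;                       // fetch the next 64-bit word
#if defined(__x86_64__) && defined(__GNUC__)
    // The C++ label is listed in the fourth colon section and referenced
    // in the template as %l[fetch].
    asm goto("test %0, %0\n\t"
             "jz %l[fetch]"
             : /* asm goto takes no outputs here */
             : "r"(buf)
             : "cc"
             : fetch);
#else
    if (buf == 0) goto fetch;          // portable fallback, same control flow
#endif
    return buf;
}
```

Calling first_nonzero(words, words + n) on a buffer that starts with zero words should return the first non-zero entry.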

Note that the rest of your asm is unsafe: it uses registers directly without declaring them in its input/output/clobber lists, so the compiler may decide to put something else in them. The register asm declarations on the variables don't prevent this, since the compiler may decide those variables are dead because they are never used. So if you expect this to actually work with -O1 or higher, you need to write it as:

  ...
  long ip;
  long buf;
  long op;
  ...
fetch:
  asm("mov (%1), %0" : "=r"(buf) : "r"(ip));
  asm("add $8, %0" : "=r"(ip) : "0"(ip));
control:
  asm goto("test %0, %0\n\t"
           "jz   %l[fetch]" : : "r"(buf) : : fetch);
  asm("shr  $4, %0" : "=r"(buf) : "0"(buf));
  asm("mov  %1, %0" : "=r"(op) : "r"(buf));
  asm("and  $0xf, %0" : "=r"(op) : "0"(op));

At which point, it's much easier to just write it as C code:

long *ip, buf, op;

fetch:
  do {
    buf = *ip++;
control: ;
  } while (!buf);
  op = (buf >>= 4) & 0xf;
  switch(op) {
     :
  }  
  goto control;
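
For comparison, the same fetch/decode loop can be sketched without goto or asm. Everything named here (run, the Op values, the tiny stack machine) is invented for illustration and is not the asker's instruction set. Note one assumption: this sketch decodes the low nibble before shifting, whereas the snippets above shift first.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical opcode set, invented for this sketch only.
enum Op : std::uint64_t { NOP = 0, PUSH1 = 1, ADD = 2, DUP = 3 };

// Runs a program packed as 4-bit opcodes and returns the final stack.
std::vector<long> run(const std::uint64_t* ip, std::size_t nwords) {
    std::vector<long> stack;
    const std::uint64_t* end = ip + nwords;
    while (ip != end) {
        std::uint64_t buf = *ip++;         // fetch the next 64-bit word
        while (buf != 0) {                 // nothing non-zero left: refetch
            std::uint64_t op = buf & 0xf;  // decode the low nibble
            buf >>= 4;                     // advance to the next nibble
            switch (op) {
            case PUSH1:
                stack.push_back(1);
                break;
            case ADD:
                if (stack.size() >= 2) {
                    long b = stack.back();
                    stack.pop_back();
                    stack.back() += b;
                }
                break;
            case DUP:
                if (!stack.empty()) stack.push_back(stack.back());
                break;
            default:                       // NOP and unassigned opcodes
                break;
            }
        }
    }
    return stack;
}
```

With optimization enabled, a compiler is free to keep ip, buf and op in registers on its own, which is the main thing the register asm declarations were trying to force.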
0
votes

You should be able to do this:

fetch:
  asm("afetch: mov (%r8), %r9");
  ...
  asm("jz afetch");

Alternatively, putting the label in a separate asm("afetch:"); should work as well. Note the different name to avoid conflicts - I'm not entirely sure that's necessary, but I suspect it is.
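
On the naming conflict: one way to sidestep it, as long as the jump and its target live in the same asm statement, is the assembler's numeric local labels (1:, referenced as 1b for "backward" or 1f for "forward"), which may be defined more than once without clashing. The sketch below is an assumption-laden example rather than a fix for the question's code: nibble_count is a hypothetical helper, and it assumes GCC (or a compatible compiler) on x86-64, with a plain C++ fallback elsewhere.

```cpp
#include <cstdint>

// Hypothetical helper: counts how many 4-bit shifts it takes for v to reach 0.
std::uint64_t nibble_count(std::uint64_t v) {
    std::uint64_t n = 0;
#if defined(__x86_64__) && defined(__GNUC__)
    // The whole loop is one asm statement, so the flags and the jumps never
    // cross a statement boundary. Both operands are read and written ("+r").
    asm("1:\n\t"
        "test %[v], %[v]\n\t"
        "jz 2f\n\t"              // forward reference to the next "2:"
        "shr $4, %[v]\n\t"
        "inc %[n]\n\t"
        "jmp 1b\n"               // backward reference to the previous "1:"
        "2:"
        : [v] "+r"(v), [n] "+r"(n)
        :
        : "cc");
#else
    while (v != 0) { v >>= 4; ++n; }   // portable fallback
#endif
    return n;
}
```

Because the labels are numeric, the statement can be inlined or duplicated by the compiler without producing duplicate-symbol errors, which a named label such as afetch could.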