4
votes

While writing a disassembler for 32-bit Linux on x86, I ran into a problem. When I disassembled a simple ELF32 executable with objdump, I saw the following instruction:

dc 82 04 08 0d 00     faddl  0xd0804(%edx)

But when I look at the Intel manual, I don't see an encoding that corresponds to this. This form of the fadd instruction starts with 0xDC, but then it requires an m64fp operand, which is "A memory quadword operand in memory."

Now, does this mean that the operand is a 64-bit address (which then means that the fadd instruction is a 64-bit instruction, but isn't prefixed by a REX byte), or is it just a 32-bit address which points to a quadword (64-bit)?

Am I missing something trivial here, or is my understanding of x86 instruction encoding wrong?

2
Sorry, but I haven't looked at SO for quite some time. Will do it now. - Hrishikesh Murali
Note that the Intel manuals use Intel notation, whereas Linux tools use AT&T notation, so you will not be able to look up the mnemonics objdump prints in the Intel manual directly. - Raymond Chen

2 Answers

5
votes

Let's break this down.

> dc 82 04 08 0d 00     faddl  0xd0804(%edx)
  |  |  \____ ____/
  |  |       V
  |  |       |
  |  |       +---------> 32-bit displacement
  |  +-----------------> ModRM byte
  +--------------------> Opcode

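Looking at the docs in detail, dc with a memory operand does indeed take an m64real (64-bit double-precision) floating-point source. The instruction adds that 64-bit value to the ST(0) floating-point register. The "64" describes the size of the data, not the size of the address.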

However, it's the second byte, 82, that decides where that 64-bit value comes from. In binary, this ModRM byte is:

+---+---+---+---+---+---+---+---+
| 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
+---+---+---+---+---+---+---+---+
|  MOD  |  REG/OPCD |    R/M    |

If you look at table 2.2 in your linked document (the one for 32-bit addressing modes), you'll see that MOD 10 with R/M 010 translates into disp32[EDX]. The REG/OPCD field of 000 is the opcode extension that selects fadd.

In other words, it takes the next 32 bits (the four bytes 04 08 0d 00, which read little-endian as 0x000d0804), adds that displacement to the edx register, and uses the resulting 32-bit address to load the 64-bit value from memory.
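As a rough illustration of the steps above, here is a minimal Python sketch that splits the ModRM byte into its three fields and reads the displacement. The names split_modrm and REG32 are made up for this example, and it only looks at the one instruction from the question.

```python
import struct

# Register numbering used by the R/M field in 32-bit addressing
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

def split_modrm(byte):
    """Return the (mod, reg, rm) fields of a ModRM byte."""
    mod = (byte >> 6) & 0b11    # bits 6-7
    reg = (byte >> 3) & 0b111   # bits 3-5 (opcode extension here)
    rm = byte & 0b111           # bits 0-2
    return mod, reg, rm

code = bytes.fromhex("dc8204080d00")
mod, reg, rm = split_modrm(code[1])

# mod == 0b10 (with rm != 0b100) means "register + 32-bit displacement";
# the displacement is stored little-endian right after the ModRM byte.
disp32 = struct.unpack_from("<i", code, 2)[0]

print(mod, reg, rm, REG32[rm], hex(disp32))  # 2 0 2 edx 0xd0804
```

The displacement is only four bytes wide no matter how large the value at that address is, which is why no REX prefix or 64-bit addressing is involved.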

2
votes

"Quadword operand in memory" means the value takes 64 bits in RAM. The address size depends on whether the code runs in 32-bit or 64-bit mode, not on how big the operands are. Here is a full breakdown of the disassembly.

  • The first byte, DC, is the opcode. Because the next byte is below C0 (so it encodes a memory operand rather than a register) and contains 0 in the register field (bits 3-5), this is a fadd instruction with a 64-bit memory operand. The l at the end of faddl is consistent with that: in AT&T syntax, floating-point mnemonics use s for single precision (32-bit) and l for double precision (64-bit). The q suffix applies to 64-bit integer operands, not floating-point ones.

  • The second byte contains 3 fields.

    • Bits 6-7 indicate the mode of the last field.
    • Bits 3-5 are the register field. Since a register operand isn't necessary for this instruction, they are used as part of the opcode.
    • Bits 0-2 are the R/M field. It can hold a register or specify a memory operand. The combined mode 10 and R/M 010 indicate that the operand is a memory operand whose address is the edx register plus a 32-bit displacement.
  • The last 4 bytes are that displacement in little endian (least significant byte first): 04 08 0d 00 is 0x000d0804.
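Putting the breakdown together, a disassembler could handle this case roughly as in the sketch below. It is not a complete decoder: decode_dc_mem and the two tables are hypothetical names for this example, and it deliberately handles only opcode DC with mode 10 and no SIB byte. The mnemonics are the AT&T spellings of the eight DC memory forms.

```python
import struct

REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

# DC /0 .. /7 with a 64-bit memory operand, AT&T mnemonics
DC_MEM_OPS = ["faddl", "fmull", "fcoml", "fcompl",
              "fsubl", "fsubrl", "fdivl", "fdivrl"]

def decode_dc_mem(code):
    """Decode `DC /r` when ModRM selects disp32(%reg).

    Returns (text, length). Other addressing forms are not handled.
    """
    if code[0] != 0xDC:
        raise ValueError("not a DC opcode")
    modrm = code[1]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 0b10 or rm == 0b100:
        raise NotImplementedError("only mod=10 without a SIB byte is handled")
    # 32-bit signed displacement, least significant byte first
    (disp,) = struct.unpack_from("<i", code, 2)
    text = f"{DC_MEM_OPS[reg]}  {disp:#x}(%{REG32[rm]})"
    return text, 6  # opcode + ModRM + 4 displacement bytes

print(decode_dc_mem(bytes.fromhex("dc8204080d00")))
# ('faddl  0xd0804(%edx)', 6)
```

The register field picks the operation and the R/M field picks the base register, so the same function also covers, for example, dc 8a ... as fmull with an edx base.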