I am looking at some assembly code and the corresponding memory dump, and I am having trouble understanding what is going on. I'm using this as a reference for x86 opcodes and this as a reference for x86 registers. I ran into these instructions and realized I am still missing a big piece of the puzzle.
8B 45 F8 - mov eax,[ebp-08]
8B 80 78040000 - mov eax,[eax+00000478]
8B 00 - mov eax,[eax]
Basically, I don't understand what the two bytes after the opcode mean, and I can't find anything that gives a bit-by-bit format for the instructions (if anyone could point me to one, it would be much appreciated).
How does the CPU know how long each of these instructions is?
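To make the bytes easier to compare, here is a minimal Python sketch that prints each instruction from the listing bit by bit. The helper name `to_bits` is made up for illustration, and the sketch only reformats the dump; it does not interpret any of the fields.

```python
# Byte dumps copied from the listing above; the 32-bit offset 78040000
# is split into its four individual bytes as they appear in memory.
instructions = {
    "mov eax,[ebp-08]": "8B 45 F8",
    "mov eax,[eax+00000478]": "8B 80 78 04 00 00",
    "mov eax,[eax]": "8B 00",
}

def to_bits(hex_bytes):
    """Render a space-separated hex string as groups of 8 binary digits."""
    return " ".join(f"{int(b, 16):08b}" for b in hex_bytes.split())

for text, dump in instructions.items():
    # One line per instruction: hex dump, then the same bytes in binary.
    print(f"{dump:<18} {to_bits(dump)}  ; {text}")
```

Lining up the second column makes it easier to see which bits of the byte after 8B change from one instruction to the next.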
According to my reference, the 8B mov instruction allows the use of the 32b or 16b registers, meaning there are 16 possible registers (AX, CX, DX, BX, SP, BP, SI, DI, and their extended equivalents). That means you would need 4 bits per operand, so a whole byte to specify which register to use in each operand.
That is still fine so far: the two bytes after the opcode could specify which registers to use. Then I noticed that these instructions are packed back to back in memory, and all three of them use a different number of bytes to specify the offset used when dereferencing the second operand.
I suppose you could limit the registers to only be able to use 16b with 16b and 32b with 32b, but that would only free up a single bit, not enough to tell the CPU how many bytes the offset is.
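For reference, the Intel manuals describe the byte after 8B as a single ModRM byte whose top two bits (the mod field) select how many displacement bytes follow. A rough sketch of the length calculation under that layout is below. It assumes 32-bit addressing and no prefixes, handles only opcode 8B, and the function name `mov_8b_length` is hypothetical.

```python
def mov_8b_length(code):
    """Length in bytes of one 8B instruction (32-bit mode, no prefixes)."""
    if code[0] != 0x8B:
        raise ValueError("this sketch only handles opcode 8B")
    modrm = code[1]
    mod = modrm >> 6        # bits 7-6: addressing form
    rm = modrm & 0b111      # bits 2-0: base register or an escape value
    length = 2              # opcode byte + ModRM byte
    if mod != 0b11 and rm == 0b100:
        length += 1         # rm = 100 means a SIB byte follows
        if mod == 0b00 and (code[2] & 0b111) == 0b101:
            length += 4     # SIB without a base register: 32-bit offset
    if mod == 0b01:
        length += 1         # 8-bit offset
    elif mod == 0b10:
        length += 4         # 32-bit offset
    elif mod == 0b00 and rm == 0b101:
        length += 4         # absolute 32-bit address, no base register
    return length

# The three instructions from the listing above.
for dump in ("8B 45 F8", "8B 80 78 04 00 00", "8B 00"):
    print(dump, "->", mov_8b_length(bytes.fromhex(dump)), "bytes")
```

Under these assumptions the offset size is never stored as a separate count; it falls out of the mod bits, which is why the three instructions can sit back to back in memory.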
What values correspond to which registers?
The second thing that bothers me is that, although my reference explicitly numbers the registers, I do not see any correlation with the bytes after the opcode in these instructions. The instructions don't seem to be consistent even with each other: the second and third both go from eax to eax, but there is a bit in the first byte after the opcode that differs.
Following my reference, I would assume 0 is EAX, 1 is ECX, 2 is EDX, and so on. This doesn't, however, give me any insight into how you would distinguish between RAX, EAX, AX, AL, and AH. Some of the instructions seem to accept only 8b registers, while others take 16b or 32b, and on x86_64 some seem to take 16b, 32b, or 64b registers. So would you just do something like 0-7 are the R's, 8-15 the E's, 16-23 the non-extended ones, and 24-31 the H's and L's? Even if it is something like that, it seems like it should be a lot easier to find a manual specifying it.
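As a point of comparison with the 0-31 numbering guessed above, the sketch below follows the scheme in the Intel manuals: the register number is only 3 bits (0-7) and is the same for every width, while the width comes from elsewhere (the opcode for 8b versus full-size, a 66h prefix for 16b, a REX.W prefix for 64b). The function names are hypothetical, and REX-only registers such as r8-r15 or spl are left out.

```python
# Register names indexed by the 3-bit number, one table per operand width.
REG16 = ["ax", "cx", "dx", "bx", "sp", "bp", "si", "di"]
REGS = {
    8: ["al", "cl", "dl", "bl", "ah", "ch", "dh", "bh"],  # no REX prefix
    16: REG16,
    32: ["e" + r for r in REG16],
    64: ["r" + r for r in REG16],
}

def split_modrm(byte):
    """Split a ModRM byte into its (mod, reg, rm) bit fields."""
    return byte >> 6, (byte >> 3) & 0b111, byte & 0b111

def reg_name(number, width):
    """Name of register `number` (0-7) at the given width in bits."""
    return REGS[width][number]

# The byte after 8B in each instruction from the listing above.
for modrm in (0x45, 0x80, 0x00):
    mod, reg, rm = split_modrm(modrm)
    print(f"{modrm:02X}: mod={mod:02b} "
          f"reg={reg_name(reg, 32)} rm={reg_name(rm, 32)}")
```

Read this way, all three instructions have reg = 0 (eax) as the destination, the first has rm = 5 (ebp) as the base and the other two have rm = 0 (eax), and only the mod field separates the second instruction from the third.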