There are two languages involved. The first is assembly language, where you might have a string of characters like "mov rax,1". The second is machine language, where you have a sequence of bytes.
These languages are related, but different. For example, the mov instruction in assembly language is actually multiple different opcodes in machine language (one for moving bytes to/from general purpose registers, one for moving words/dwords/qwords to general purpose registers, one for moving dwords/qwords to control registers, one for moving dwords/qwords to debug registers, etc). The assembler uses the instruction and its operands to select an appropriate opcode (e.g. if you write mov dr6,eax then the assembler will choose the opcode for moving dwords/qwords to debug registers, because none of the other opcodes are suitable).
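
The selection step can be sketched as a table lookup keyed on the operand classes. This is a toy model, not how any real assembler is structured: the register sets, `classify` and `select_mov_opcode` are hypothetical names made up for illustration, only a few register-to-register forms with 8-bit and 32-bit registers are covered, and the ModRM byte that follows the opcode is ignored.

```python
# Toy model of how an assembler picks a machine-code opcode for "mov".
# Illustrative only: a handful of register-to-register forms, no ModRM byte.

GENERAL_8 = {"al", "cl", "dl", "bl"}
GENERAL_32 = {"eax", "ecx", "edx", "ebx"}
CONTROL = {"cr0", "cr2", "cr3", "cr4"}
DEBUG = {"dr0", "dr1", "dr2", "dr3", "dr6", "dr7"}


def classify(reg):
    """Return the register class of a register name (hypothetical helper)."""
    for name, group in (("gpr8", GENERAL_8), ("gpr32", GENERAL_32),
                        ("cr", CONTROL), ("dr", DEBUG)):
        if reg in group:
            return name
    raise ValueError(f"unknown register: {reg}")


# (destination class, source class) -> opcode bytes
MOV_OPCODES = {
    ("gpr8", "gpr8"): bytes([0x88]),        # mov r/m8, r8
    ("gpr32", "gpr32"): bytes([0x89]),      # mov r/m32, r32
    ("gpr32", "cr"): bytes([0x0F, 0x20]),   # mov r32, CRn
    ("cr", "gpr32"): bytes([0x0F, 0x22]),   # mov CRn, r32
    ("gpr32", "dr"): bytes([0x0F, 0x21]),   # mov r32, DRn
    ("dr", "gpr32"): bytes([0x0F, 0x23]),   # mov DRn, r32
}


def select_mov_opcode(dest, src):
    """Pick the only opcode whose operand classes match, or fail."""
    key = (classify(dest), classify(src))
    if key not in MOV_OPCODES:
        raise ValueError(f"no mov opcode for {dest}, {src} in this toy table")
    return MOV_OPCODES[key]


print(select_mov_opcode("dr6", "eax").hex())  # opcode for mov dr6,eax
```

The same mnemonic maps to different opcode bytes purely because the operand classes differ, which is the sense in which one assembly instruction is several machine instructions.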
In the same way, the operands may be different. For example, in assembly language the constant 1 has the type "integer" and doesn't have any size (its size is implied from how/where it's used); but in machine code an immediate operand must be encoded somehow, and the size of the encoding will depend on which opcode (and which prefixes) are used for the mov.
For example, if mov rax,1 is converted into the bytes 0x48, 0xC7, 0xC0, 0x01, 0x00, 0x00, 0x00, then you could say that the operand is "64 bits encoded in 4 bytes (using sign extension)"; or you could say that the operand is 32 bits encoded in 4 bytes (and that the instruction only moves 32 bits into RAX and then sign extends into the upper 32 bits of RAX instead of moving anything into them). Even though these things sound different (and even though most people would say the latter is "more correct") the behaviour is exactly the same, and the only differences are superficial differences in how machine code (a different language that isn't assembly language) is described. In assembly language, the 1 is still an ("implied from context") 64-bit operand, regardless of what happens in machine language.
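
To make the "4 bytes that end up as 64 bits" point concrete, here is a minimal sketch that decodes only this one 7-byte form and computes the value RAX would hold. The function name `decode_mov_rax_imm32` is made up for illustration, and nothing else about x86 decoding is modelled.

```python
import struct


def decode_mov_rax_imm32(code):
    """Return the 64-bit value loaded into RAX by the bytes 48 C7 C0 imm32."""
    if len(code) != 7 or code[:3] != bytes([0x48, 0xC7, 0xC0]):
        raise ValueError("not the 48 C7 C0 form of mov rax, imm32")
    # The immediate is stored as a little-endian signed 32-bit integer.
    (imm32,) = struct.unpack("<i", code[3:])
    # Sign extension: reinterpret the signed value as 64-bit register contents.
    return imm32 & 0xFFFFFFFFFFFFFFFF


one = decode_mov_rax_imm32(bytes([0x48, 0xC7, 0xC0, 0x01, 0x00, 0x00, 0x00]))
minus_one = decode_mov_rax_imm32(bytes([0x48, 0xC7, 0xC0, 0xFF, 0xFF, 0xFF, 0xFF]))
print(hex(one), hex(minus_one))
```

With an immediate of 1 the two descriptions cannot be told apart; an immediate of FF FF FF FF shows the sign extension filling the upper 32 bits with ones.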
mov eax, imm32 which has 32-bit operand size and follows the usual rule of writing a 32-bit register zero-extending to fill the 64-bit register: "Why do x86-64 instructions on 32-bit registers zero the upper part of the full 64-bit register?" (I assume that's what you meant, but if we consider narrower operand-size then 16 and 8 bits are – Peter Cordes
movabs rax,<64b immediate> which will contain the encoded value "1" as a 64b integer, but the common modern assembler NASM will, for example, assemble mov rax,1 into the instruction mov eax,1 with a 32b immediate (machine code b8 01 00 00 00), which will set up the final rax content in exactly the same way, but the encoding is much shorter. Anyway, if the instruction has rax as target register, then you can bet that whatever operation is going on will target the whole 64 bits of the target register. How/if the operand is extended depends on the particular instruction and operand. – Ped7g
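
The three encodings mentioned in the answer and comments (5-byte mov eax,imm32, 7-byte sign-extended mov rax,imm32, 10-byte movabs rax,imm64) can be compared with a small sketch that picks the shortest one that still produces the requested RAX value. This is an illustration of the idea under those assumptions, not NASM's actual algorithm, and `encode_mov_rax` is a hypothetical name.

```python
import struct


def encode_mov_rax(value):
    """Return the shortest of three encodings that leaves `value` in RAX."""
    value &= 0xFFFFFFFFFFFFFFFF  # treat the input as 64-bit register contents
    if value <= 0xFFFFFFFF:
        # mov eax, imm32: writing EAX zero-extends into RAX (5 bytes).
        return bytes([0xB8]) + struct.pack("<I", value)
    if value >= 0xFFFFFFFF80000000:
        # mov rax, imm32: the immediate is sign-extended to 64 bits (7 bytes).
        return bytes([0x48, 0xC7, 0xC0]) + struct.pack("<I", value & 0xFFFFFFFF)
    # movabs rax, imm64: full 64-bit immediate (10 bytes).
    return bytes([0x48, 0xB8]) + struct.pack("<Q", value)


for v in (1, -1, 0x123456789A):
    print(encode_mov_rax(v).hex(" "))
```

All three byte sequences correspond to the same assembly-language request, "put this 64-bit value in rax"; only the machine-language description of the operand changes.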