8
votes

At school we have been using a bootstrap program to run stand-alone programs without an operating system. While studying this program, I noticed that when protected mode is enabled, a far jump is executed by assembling the opcode and operands directly as data within the program. This is for the GNU assembler:


         /* this code immediately follows the setting of the PE flag in CR0 */

.byte   0x66, 0xEA
.long   TARGET_ADDRESS
.word   0x0010          /* descriptor #2, GDT, RPL=0 */

First of all, why would one want to do this (instead of the instruction mnemonic)?

I have been looking at the Intel manuals, but am still a little confused by the code. Specifically in Volume 2A, page 3-549, there is a table of opcodes. The relevant entry:

EA *cp*  JMP ptr16:32  Inv.  Valid  Jump far, absolute, address given in operand

The actual opcode is obvious, but the first byte, 0x66, has me confused. According to the table in the Intel manual, the cp apparently means that a 6-byte operand follows the opcode, and indeed 6 bytes follow in the next two lines. 0x66 encodes an 'Operand-size override prefix'. What does this have to do with the cp in the table? I was expecting there to be some hex value for the cp, but instead there is this override prefix. Can someone please clear this up for me?

Here is a dump from od:

c022    **ea66    0000    0001    0010**    ba52    03f2    c030

TARGET_ADDRESS was defined as 0x00010000.
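
To make the byte order explicit, here is a minimal Python sketch that re-reads the highlighted words. It assumes od printed 16-bit words in little-endian host order (which matches `ea66` standing for the bytes 0x66, 0xEA) and that the layout is prefix, opcode, 32-bit offset, 16-bit selector. The variable names are only illustrative.

```python
import struct

# The highlighted words exactly as od printed them (16-bit units).
od_words = ["ea66", "0000", "0001", "0010"]

# Assumption: od showed little-endian 16-bit words, so re-packing each
# word as "<H" recovers the original byte order in memory.
raw = b"".join(struct.pack("<H", int(word, 16)) for word in od_words)

# Assumed layout: 1-byte prefix, 1-byte opcode, 4-byte offset, 2-byte selector.
prefix, opcode, offset, selector = struct.unpack("<BBIH", raw)

print(" ".join(f"{b:02x}" for b in raw))
print(hex(prefix), hex(opcode), hex(offset), hex(selector))
```

Read this way, the six bytes after the opcode are the ptr16:32 operand that the cp entry describes, and 0x66 is a separate prefix byte placed in front of the opcode.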

I am also confused a bit by the significance of the last two bytes. However, that seems to be another question altogether. It is getting quite late, and I have been staring at code and the Intel manuals for hours, so I hope I got my point across.

Thanks for looking!

3
People use opcodes (instead of instructions) for two reasons. The first is when the assembler is "less than adequate" and doesn't support the instruction they need (this is/was common when new instructions are added and older assemblers don't support them yet). The second is when the assembler does support the instruction, but the programmer doesn't know how to convince the assembler to generate it. Basically, it's either bad tools (including old tools, confusing syntax and/or bad documentation) or bad programmers. - Brendan
Note: My comment above is "in general" and applies to all assemblers. I don't use GAS, and have no idea whether it supports the "32-bit far jump in 16-bit code" instruction or not (or how good/bad the documentation is). - Brendan

3 Answers

12
votes

The 0x66 prefix indicates that the JMP (0xEA) takes a six-byte operand. The default offset size is 16 bits (64K) in a 16-bit code segment, which is what is in effect in real mode and immediately after the PE flag is set, and 32 bits in a 32-bit protected-mode code segment (if I recall correctly). With the prefix, the offset grows to 32 bits, and the operand also includes the segment selector, which holds the index of the segment descriptor in either the GDT or the LDT. This means the code is making what is traditionally called a "long jump": a jump that crosses segment boundaries in the x86 architecture. The selector in this case, 0x0010, refers to descriptor index 2 in the GDT. If you look earlier in that program, you will likely see how the GDT is defined in terms of each segment's base address and limit (see the Intel manual for the GDT and LDT formats; each segment is described by an 8-byte descriptor).
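
As a rough illustration of how the last two bytes break down, the following Python sketch splits a selector into its index, table indicator, and requested privilege level. The function name `decode_selector` is made up for this example; the bit layout is the standard x86 selector format.

```python
def decode_selector(selector):
    """Split a 16-bit segment selector into its three fields."""
    return {
        "index": selector >> 3,                        # bits 15..3: descriptor index
        "table": "LDT" if selector & 0x4 else "GDT",   # bit 2: table indicator
        "rpl": selector & 0x3,                         # bits 1..0: requested privilege level
    }

fields = decode_selector(0x0010)
print(fields)

# Each descriptor occupies 8 bytes, so this is the byte offset into the table.
print(fields["index"] * 8)
```

For 0x0010 this gives index 2, GDT, RPL 0, which agrees with the comment in the original source line.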

2
votes

I run into this from time to time. Some assemblers will only jump to a label. In this case the author wants an absolute jump to a specific hard-coded offset. I am guessing that jmp TARGET_ADDRESS won't work, so they emitted the instruction as raw bytes to get around the issue.

0
votes

0x66 is the operand-size override prefix: it switches the instruction away from the default operand size of the current code segment. If the current code segment is 16-bit, the new instruction pointer will be 32-bit rather than 16-bit. If the current code segment is 32-bit, 0x66 makes the target instruction pointer 16-bit. The default size depends on the CS selector in use and on the attributes of its descriptor, loaded from the GDT or LDT. In real mode the code segment size is normally 16-bit, except in special cases such as "unreal" mode.
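
The toggle described above can be written out as a small Python sketch. The helper name `far_jmp_layout` is hypothetical; it only models how the prefix changes the operand of opcode 0xEA, under the assumption that the default operand size is either 16 or 32 bits.

```python
def far_jmp_layout(default_bits, has_0x66_prefix):
    """Return (offset_bits, operand_bytes) for a direct far JMP (opcode 0xEA).

    default_bits is the default operand size of the current code segment.
    """
    if default_bits not in (16, 32):
        raise ValueError("default operand size must be 16 or 32")
    offset_bits = default_bits
    if has_0x66_prefix:
        # The prefix selects the non-default size: 16 becomes 32, 32 becomes 16.
        offset_bits = 32 if default_bits == 16 else 16
    # The operand is the offset plus a 2-byte segment selector.
    return offset_bits, offset_bits // 8 + 2

print(far_jmp_layout(16, True))   # 16-bit code segment with the prefix
print(far_jmp_layout(16, False))  # 16-bit code segment without the prefix
```

In the questioner's case the code is still running in a 16-bit code segment, so the prefixed form is the one with a 32-bit offset and a six-byte operand.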