6
votes

Is it possible to use the 8-bit registers (al, ah, bl, bh, r8b) in indexed addressing modes in x86-64? For example:

mov ecx, [rsi + bl]
mov edx, [rdx + dh * 2]

In particular, this would let you use the bottom 8 bits of a register as a 0-255 offset, which could be useful for some kernels.

I pored over the Intel manuals and they aren't explicit on the matter, but all the examples they give only have 32-bit or 64-bit base and index registers. In 32-bit code I only saw 16-bit or 32-bit registers. Looking at the details of the ModR/M and SIB byte encoding also seems to point towards "no", but that's complex enough, with enough corner cases, that I'm not sure I got it right.

I'm mostly interested in the x86-64 behavior, but of course if it's possible in 32-bit mode only I'd like to know.

As an add-on question, too small and too closely related to deserve another post: can 16-bit registers be used for base or index? E.g., mov rax, [rbx + cx]. My investigation pointed towards basically the same answer as above: probably not.

It's not possible: both registers have to be the same size as the address size, so that also rules out [reg16 + disp32]. You need a movzx. The tables that show the encoding really do enumerate all possible encodings. - Peter Cordes
Yeah, the tables show stuff like rax/eax/ax/al though, so it kept my hopes up. You can count the bits and see that only three are available to select the register, and guess from that that only one size is available, but you have to check for size-changing bits across all the various bytes, and read the details of all the REX prefixes, etc. But yeah, I was already pretty sceptical. - BeeOnRope
The addressing mode tables won't show al, because there are no 8-bit addressing modes. I understand the motivation for the question, though: when I was new to x86, I kept wondering if there were addressing modes I didn't know about. But it turns out there's only [base + idx*scale + disp8/disp32], or any subset of that omitting one or two of those three components (plus RIP-relative in 64-bit mode). That's why I wrote this answer, which might make a good Docs topic at this point. - Peter Cordes
Right. I never did find a good table for the 64-bit addressing modes in the Intel docs though. There is Table 2-3 in section 2.1.5 of Vol 2A of the current Intel dev manual, which covers the 32-bit case, but I never found a corresponding table for the 64-bit case. I was further thrown off early in my search by this page, which clearly indicates a [disp + reg8 + reg32*scale] addressing mode, which is exactly what I want (in 32-bit mode). It seems like it was a typo though, and they meant disp8 + reg32 ... instead. - BeeOnRope
Well, there is the curious case of xlat (or xlatb). It does (implicitly) use the 8-bit register al as the index into a table implicitly pointed to by [r/e]bx. Unfortunately it's a horrible waste of encoding space and can only load into the al register. - EOF

1 Answer

7
votes

No, you cannot use 8-bit or 16-bit registers in addressing calculations in 64-bit mode, nor can you use 8-bit registers in 32-bit mode. You can use 16-bit registers in 32-bit mode, and 32-bit registers in 64-bit mode, via use of the 0x67 address size prefix byte.

(But using a narrower register makes the whole address narrow, not a 16-bit index relative to a 32-bit array address. Any registers need to be the same width as the address, which you normally want to match the mode you're in, unless you have stuff in the low 16 or low 32 bits of address space.)
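As the comments suggest, the usual workaround is to zero-extend the narrow register with movzx and then index with the full-width result. A minimal sketch in NASM syntax, assuming rsi and rdx hold table addresses and rax is free as a scratch register (the label name is hypothetical):

```nasm
bits 64
section .text
global load_by_narrow_index      ; hypothetical label, for illustration only

load_by_narrow_index:
    ; Intended: mov ecx, [rsi + bl]
    movzx eax, bl                ; zero-extend BL; writing EAX also clears the upper half of RAX
    mov   ecx, [rsi + rax]       ; 64-bit base + 64-bit index

    ; Intended: mov edx, [rdx + dh * 2]
    movzx eax, dh                ; high-byte registers work here because no REX prefix is needed
    mov   edx, [rdx + rax*2]     ; the scale factor applies to the widened index

    ; Intended: mov rax, [rbx + cx]
    movzx eax, cx                ; same idea for a 16-bit index
    mov   rax, [rbx + rax]
    ret
```

The cost is one extra instruction per load, which is what compilers generally emit for a byte-sized index as well.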

This table summarizes the various options for operand and address sizes well. The general pattern is that the default address size is the same as the current mode (i.e., 32 bits in 32-bit mode, 64 bits in 64-bit mode; see footnote 1), and then if the 0x67 prefix is included, the address size is changed to half the usual size (i.e., 16 bits in 32-bit mode, 32 bits in 64-bit mode).

Here's an excerpt of the full table linked above showing 64-bit long-mode behavior only, for various values of the REX.W, 0x66 operand and 0x67 address size prefixes:

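| REX.W | 0x66 prefix (operand) | 0x67 prefix (address) | Operand size (footnote 2) | Address size |
|-------|-----------------------|-----------------------|---------------------------|--------------|
| 0     | No                    | No                    | 32-bit                    | 64-bit       |
| 0     | No                    | Yes                   | 32-bit                    | 32-bit       |
| 0     | Yes                   | No                    | 16-bit                    | 64-bit       |
| 0     | Yes                   | Yes                   | 16-bit                    | 32-bit       |
| 1     | Ignored               | No                    | 64-bit                    | 64-bit       |
| 1     | Ignored               | Yes                   | 64-bit                    | 32-bit       |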
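
For illustration, each row of the table above corresponds to one of the following loads. This is a sketch in NASM syntax; the label is hypothetical, and the byte sequences in the comments are what an assembler would be expected to emit for opcode 8B with ModR/M byte 06 (the order of the 66 and 67 prefixes may vary between assemblers):

```nasm
bits 64
section .text
global size_prefix_demo      ; hypothetical label

size_prefix_demo:
    mov eax, [rsi]           ; 8B 06               32-bit operand, 64-bit address
    mov eax, [esi]           ; 67 8B 06            32-bit operand, 32-bit address
    mov ax,  [rsi]           ; 66 8B 06            16-bit operand, 64-bit address
    mov ax,  [esi]           ; 66 and 67, 8B 06    16-bit operand, 32-bit address
    mov rax, [rsi]           ; 48 8B 06            64-bit operand, 64-bit address
    mov rax, [esi]           ; 67 48 8B 06         64-bit operand, 32-bit address
    ret
```

Note that the [esi] forms truncate the whole effective address to 32 bits, so they are only useful when the data lives in the low 4 GiB of the address space.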

1 That might seem obvious, but it's the opposite of the way operand sizes work in 64-bit mode: most default to 32 bits, even in 64-bit mode, and a REX prefix with the W bit set is needed to promote them to 64 bits.

2 Some instructions default to 64-bit operand size without any REX prefix, notably push, pop, call and conditional jumps. As Peter points out below, this leads to the odd situation where at least some of these instructions (push and pop included) can't be encoded to use 32-bit operands, but can use 16-bit operands (with the 0x66 prefix).
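
The push/pop oddity in footnote 2 can be seen directly in the encodings. A small sketch in NASM syntax (the label is hypothetical; the pushes and pops are balanced so the routine returns cleanly):

```nasm
bits 64
section .text
global push_size_demo        ; hypothetical label

push_size_demo:
    push rax                 ; 50       64-bit push, no REX prefix required
    push ax                  ; 66 50    16-bit push, rsp decreases by 2
    ; push eax               ; no encoding exists for this in 64-bit mode
    pop  ax                  ; 66 58    16-bit pop
    pop  rax                 ; 58       64-bit pop
    ret
```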