38
votes

According to Intel, in x64 the following registers are called general-purpose registers: RAX, RBX, RCX, RDX, RBP, RSI, RDI, RSP and R8-R15 (https://software.intel.com/en-us/articles/introduction-to-x64-assembly).

The following article says that RBP and RSP are special-purpose registers (RBP points to the base of the current stack frame and RSP points to the top of the current stack frame): https://www.recurse.com/blog/7-understanding-c-by-learning-assembly

Now I have two contradictory statements. The Intel statement should be the trusted one, but which is correct, and why are RBP and RSP called general purpose at all?

Thanks for any help.

3
You can use both as general purpose registers, meaning the usual arithmetic and logical instructions work with them just fine. rbp is pretty much general purpose, the frame pointer thing is just convention. - Jester
Every register has some special-ness (except R8-R15), for some instructions. For RSP, it's special for push/pop/call/ret, so most code never uses it for anything else. But under controlled conditions (like no signal handlers) you don't have to use it for a stack pointer. e.g. you can use it to read an array in a loop with pop, like in this code-golf answer. (I actually used esp in 32-bit code, but same difference). - Peter Cordes
I guess if you extend the definition of "specialness" to encoding, even r13 is a bit special, although it isn't really functional in that you can still effectively use every addressing mode (even if the assembler is sometimes putting in a hidden zero displacement for you). - BeeOnRope
RBP can be used for general purposes with -fomit-frame-pointer. It's harder for RSP, though. - phuclv
@PeterCordes R11 has a special role for syscall. - phuclv

3 Answers

33
votes

General purpose means that any of these registers can be used as an operand by the ordinary computational instructions. By contrast, you cannot do whatever you want with, for example, the instruction pointer (RIP) or the flags register (RFLAGS).

Some of these registers were designed with a specific use in mind, and are commonly used that way. The most critical ones are RSP and RBP.

Should you need to use them for your own purposes, save their contents before storing something else in them, and restore the original values when done.

17
votes

If a register can be an operand for add, or used in an addressing mode, it's "general purpose", as opposed to registers like the FS segment register, or RIP. The GP registers are also called "integer registers", even though other kinds of registers can hold integers, too.

In computer architecture, it's common for CPUs to internally handle integer registers / instructions separately from FP/SIMD registers / instructions. e.g. Intel Sandybridge-family CPUs have separate physical register files for renaming GP integer vs. FP/vector registers. These are simply called the integer vs. FP register files. (Where FP is short-hand for everything that a kernel doesn't need to save/restore to use the GP registers while leaving user-space's FPU/SIMD state untouched.) Each entry in the FP register file is 256 bits wide (to hold an AVX ymm vector), but integer register file entries only have to be 64 bits wide.

On CPUs that rename segment registers (Skylake does not), I guess that would be part of the integer state, and so would RFLAGS + RIP. But when we say "integer register", we normally mean specifically a general-purpose register.


"General purpose" in this usage means "data or address", as opposed to an ISA like m68k where you had d0..7 data regs and a0..7 address regs, all 16 of which are integer regs. Regardless of how the register is normally used, general-purpose is about how it can be used.


Every register has some special-ness for some instructions, except some of the completely new registers added with x86-64: R8-R15. These special uses don't disqualify the registers as general purpose. The (low 16 bits of the) original 8 date back to 8086, and there were implicit uses of each of them even in the original 8086.

For RSP, it's special for push/pop/call/ret, so most code never uses it for anything else. (And in kernel mode, used asynchronously for interrupts, so you really can't stash it somewhere to get an extra GP register the way you can in user-space code: Is ESP as general-purpose as EAX?)

But under controlled conditions (like no signal handlers) you don't have to use RSP for a stack pointer. e.g. you can use it to read an array in a loop with pop, like in this code-golf answer. (I actually used esp in 32-bit code, but same difference: pop is faster than lodsd on Skylake, while both are 1 byte.)


Implicit uses and special-ness for each register:

See also x86 Assembly - Why is [e]bx preserved in calling conventions? for a partial list.

I'm mostly limiting this to user-space instructions, especially ones a modern compiler might actually emit from C or C++ code. I'm not trying to be exhaustive for regs that have a lot of implicit uses.

  • rax: one-operand [i]mul / [i]div / cdq / cdqe, string instructions (stos), cmpxchg, etc. etc. As well as special shorter encodings for many immediate instructions like 2-byte cmp al, 1 or 5-byte add eax, 12345 (no ModRM byte). See also codegolf.SE Tips for golfing in x86/x64 machine code.

    There's also xchg-with-eax, which is where 0x90 nop came from. (nop became a separately-documented instruction in x86-64, because xchg eax,eax zero-extends eax into RAX and thus can't use the 0x90 encoding. But xchg rax,rax can still assemble to REX.W=1 0x90.)

  • rcx: shift counts, rep-string counts, the slow loop instruction

  • rdx: rdx:rax is used by divide and multiply, and cwd / cdq / cqo to set up for them. rdtsc. BMI2 mulx.

  • rbx: 8086 xlatb. cpuid uses all four of EAX..EDX. 586 (Pentium) cmpxchg8b, x86-64 cmpxchg16b. Most 32-bit compilers will emit cmpxchg8b for std::atomic<long long>::compare_exchange_weak. (Pure load / pure store can use SSE MOVQ or x87 fild/fistp, though, if targeting Pentium or later.) 64-bit compilers will use 64-bit lock cmpxchg, not cmpxchg8b.

    Some 64-bit compilers will emit cmpxchg16b for atomic<struct_16_bytes>. RBX has the fewest implicit uses of the original 8, but lock cmpxchg16b is one of the few compilers will actually use.

  • rsi/rdi: string ops, including rep movsb which some compilers sometimes inline. (gcc also inlines rep cmpsb for string literals in some cases, but that's probably not optimal).

  • rbp: leave (only 1 uop slower than mov rsp, rbp / pop rbp. gcc actually uses it in functions with a frame pointer, when it can't just pop rbp). Also the horribly-slow enter which nobody ever uses.

  • rsp: stack operations: push/pop/call/ret, and leave. (And enter). And in kernel mode (not user space) asynchronous use by hardware to save interrupt context. This is why kernel code can't have a red-zone.

  • r11: syscall/sysret use it to save/restore user-space's RFLAGS. (Along with RCX to save/restore user-space's RIP).
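
As an illustration of one implicit use from the list above, here is a minimal Python model of how one-operand mul and div tie RAX and RDX together. It is only a sketch of the architectural behaviour for the unsigned 64-bit forms; the function names mul64 and div64 and the dict-based return values are made up for this example and are not part of any real API.

```python
MASK64 = (1 << 64) - 1  # registers are 64 bits wide

def mul64(rax, src):
    """Model of `mul src`: RDX:RAX = RAX * src (unsigned, hypothetical helper)."""
    product = (rax & MASK64) * (src & MASK64)
    # The high half always lands in RDX, the low half in RAX,
    # no matter which register the programmer would have preferred.
    return {"rdx": product >> 64, "rax": product & MASK64}

def div64(rdx, rax, src):
    """Model of `div src`: divides RDX:RAX by src (unsigned, hypothetical helper)."""
    divisor = src & MASK64
    if divisor == 0:
        raise ZeroDivisionError("#DE: divide by zero")
    dividend = ((rdx & MASK64) << 64) | (rax & MASK64)
    quotient, remainder = divmod(dividend, divisor)
    if quotient > MASK64:
        # The real instruction faults when the quotient does not fit in RAX.
        raise OverflowError("#DE: quotient does not fit in RAX")
    return {"rax": quotient, "rdx": remainder}
```

The point is that the destination registers are fixed by the instruction, so a register allocator has to arrange for RAX and RDX to be free around these operations. That constrains allocation, but it does not stop RAX or RDX from being general purpose everywhere else.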

Addressing-mode encoding special cases:

(See also rbp not allowed as SIB base? which is just about addressing modes, where I copied this part of this answer.)

rbp/r13 can't be a base register with no displacement: that encoding instead means: (in ModRM) rel32 (RIP-relative), or (in SIB) disp32 with no base register. (r13 uses the same 3 bits in ModRM/SIB, so this choice simplifies decoding by not making the instruction-length decoder look at the REX.B bit to get the 4th base-register bit). [r13] assembles to [r13 + disp8=0]. [r13+rdx] assembles to [rdx+r13] (avoiding the problem by swapping base/index when that's an option).

rsp/r12 as a base register always needs a SIB byte. (The ModR/M encoding of base=RSP is escape code to signal a SIB byte, and again, more of the decoder would have to care about the REX prefix if r12 was handled differently).

rsp can't be an index register. This makes it possible to encode [rsp], which is more useful than [rsp + rsp]. (Intel could have designed the ModRM/SIB encodings for 32-bit addressing modes (new in 386) so SIB-with-no-index was only possible with base=ESP. That would make [eax + esp*4] possible and only exclude [esp + esp*1/2/4/8]. But that's not useful, so they simplified the hardware by making index=ESP the code for no index regardless of the base. This allows two redundant ways to encode any base or base+disp addressing mode: with or without a SIB.)

r12 can be an index register. Unlike the other cases, this doesn't affect instruction-length decoding. Also, it can't be worked around with a longer encoding like the other cases. AMD wanted AMD64's register set to be as orthogonal as possible, so it makes sense they'd spend a few extra transistors to check REX.X as part of the index / no-index decoding. For example, [rsp + r12*4] requires index=r12, so having r12 not be fully general purpose would make AMD64 a worse compiler target.

   0:   41 8b 03                mov    eax,DWORD PTR [r11]
   3:   41 8b 04 24             mov    eax,DWORD PTR [r12]      # needs a SIB like RSP
   7:   41 8b 45 00             mov    eax,DWORD PTR [r13+0x0]  # needs a disp8 like RBP
   b:   41 8b 06                mov    eax,DWORD PTR [r14]
   e:   41 8b 07                mov    eax,DWORD PTR [r15]
  11:   43 8b 04 e3             mov    eax,DWORD PTR [r11+r12*8] # *can* be an index
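
The encoding rules above can be sketched as a tiny encoder. This is an illustrative Python model, not how any real assembler is implemented: the function name encode_mov_eax_mem and the REGS table are invented for this example, and it only handles mov eax, [base] and mov eax, [base + index*scale] with no explicit displacement.

```python
REGS = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
        "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"]
SCALE_BITS = {1: 0, 2: 1, 4: 2, 8: 3}

def encode_mov_eax_mem(base, index=None, scale=1):
    """Encode `mov eax, dword ptr [base + index*scale]` (opcode 8B /r, reg=eax)."""
    b = REGS.index(base)
    low = b & 7                      # the 3 bits stored in ModRM.rm / SIB.base
    rex = 0x40 | (b >> 3)            # REX.B carries the 4th base-register bit
    # rbp/r13 (low bits 101) with mod=00 would mean "no base register",
    # so use mod=01 and a zero disp8 instead.
    needs_disp8 = (low == 5)
    mod = 1 if needs_disp8 else 0
    tail = [0x00] if needs_disp8 else []
    if index is None and low != 4:
        body = [(mod << 6) | low]    # plain ModRM, no SIB byte
    else:
        # rsp/r12 as base (low bits 100) always go through a SIB byte.
        if index is None:
            x = 4                    # SIB.index=100 with REX.X=0 means "no index"
        else:
            x = REGS.index(index)
            if x == 4:
                raise ValueError("rsp cannot be an index register")
            rex |= (x >> 3) << 1     # REX.X carries the 4th index-register bit
        sib = (SCALE_BITS[scale] << 6) | ((x & 7) << 3) | low
        body = [(mod << 6) | 4, sib] # ModRM.rm=100 signals a SIB byte
    prefix = [rex] if rex != 0x40 else []
    return bytes(prefix + [0x8B] + body + tail)
```

Calling it with the operands from the listing, for example encode_mov_eax_mem("r13").hex() or encode_mov_eax_mem("r11", "r12", 8).hex(), gives the same bytes as the disassembly. Note that the check for "no index" looks at all 4 bits of the index number, which is why r12 (index 12) is accepted while rsp (index 4) is rejected.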

Compilers like it when all registers can be used for anything, only constraining register allocation for a few special-case operations. This is what's meant by register orthogonality.

2
votes

Dereferencing rbp might result in a #SS (stack segment) fault.

Recently, I hit a Linux kernel crash with a 'stack segment fault'.

crash> dmesg
[...]
stack segment: 0000 [#1] SMP
[...]
RIP: 0010:[<ffffffff8125fa8b>]  lock_get_status+0x9b/0x3b0
RSP: 0018:ffff89954a317d90  EFLAGS: 00010282
[...]
RBP: 800000fa8c251867 R08: 0000000000001000 R09: 000000000000ffff
[...]
crash> dis lock_get_status+0x9b
0xffffffff8125fa8b <lock_get_status+0x9b>:      mov    0x28(%rbp),%rax

The address in rbp is non-canonical. That's the reason for this crash. What I learned from this crash is that a memory access using rbp as the base implicitly uses the ss segment register, even though rbp is not being used as a stack frame base pointer.

According to Intel SDMv1 3.4.1 General-Purpose Registers:

EBP — Pointer to data on the stack (in the SS segment)
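
To check the register values from the crash dump above, a canonical-address test can be sketched in a few lines of Python. This assumes 48-bit virtual addresses (4-level paging), which the crash log does not state; the function name is_canonical and the va_bits parameter are made up for this example.

```python
MASK64 = (1 << 64) - 1

def is_canonical(addr, va_bits=48):
    """True if bits 63..(va_bits-1) of addr are all zeros or all ones.

    va_bits=48 is an assumption (4-level paging); it is not taken from the log.
    """
    top = (addr & MASK64) >> (va_bits - 1)     # bit va_bits-1 and everything above
    all_ones = (1 << (64 - va_bits + 1)) - 1
    return top == 0 or top == all_ones

# Values copied from the crash output above.
rbp = 0x800000FA8C251867
rsp = 0xFFFF89954A317D90
print(is_canonical(rbp + 0x28))  # effective address of mov 0x28(%rbp),%rax
print(is_canonical(rsp))
```

Under that assumption, the effective address 0x28(%rbp) fails the check while the RSP value passes it, which matches the explanation above: the faulting access used a non-canonical address with an rbp base.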