Are some general purpose registers faster than others?

Question

In x86-64, will certain instructions execute faster if some general purpose registers are preferred over others?

For instance, would mov eax, ecx execute faster than mov r8d, ecx? I imagine the latter needs a REX prefix, which might make instruction fetch slower.

What about using rax instead of rcx? What about add or xor? Other operations? Smaller registers like r15b vs al? al vs ah?

AMD vs Intel? Newer processors? Older processors? Combinations of instructions?

Clarification: Should certain general purpose registers be preferred over others, and which ones are they?

The only difference I know of is that some instructions have smaller encodings if certain registers are used. For instance, add al, 7; add dl, 7; add r12b, 7 are 2, 3, 4 bytes respectively, the last due to the REX prefix as you note. This may slow down the time to fetch the instructions, or waste cache space, but I'm not aware that it makes any difference in the time to actually execute the instructions. — Nate Eldredge
My understanding is that the machine has a large number of anonymous internal registers, onto which the architectural registers are mapped in a dynamic fashion (register renaming), so I can't think of any way that any one architectural register could be inherently faster than any other. — Nate Eldredge
And in code that will interact with higher-level languages, there are the register saving conventions to consider. This may mean that rbx is "slower" than rcx simply because you would have to save and restore it at the start and end of your function. Or it can go the other way; if you call many other functions, rbx may end up being faster because you don't have to save and restore it around every function call you make. But that's nothing to do with the machine itself, of course. — Nate Eldredge
On (nearly all) modern systems with register renaming, the CPU isn't really using your register names anyway - it will gladly use another General Purpose register to avoid stalls for you. — Michael Dorgan
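
The byte counts mentioned in the comments above can be checked by writing the encodings out by hand. This is only a sketch: the byte strings below are hand-assembled and should be confirmed with your own assembler, and has_rex is a hypothetical helper that only looks at the first byte, so it assumes no legacy prefixes come before the REX prefix.

```python
# Hand-assembled x86-64 encodings (assumed; verify with your own assembler).
ENCODINGS = {
    "add al, 7":    bytes.fromhex("04 07"),        # short AL-only form: opcode 04 ib
    "add dl, 7":    bytes.fromhex("80 c2 07"),     # generic form: 80 /0 ib with a ModRM byte
    "add r12b, 7":  bytes.fromhex("41 80 c4 07"),  # generic form plus REX.B prefix (0x41)
    "mov eax, ecx": bytes.fromhex("89 c8"),        # 89 /r with a ModRM byte
    "mov r8d, ecx": bytes.fromhex("41 89 c8"),     # same bytes plus REX.B prefix (0x41)
}


def has_rex(code: bytes) -> bool:
    """Hypothetical helper: True if the first byte is a REX prefix (0x40-0x4F).

    Only valid for 64-bit code with no legacy prefixes ahead of the REX byte.
    """
    return 0x40 <= code[0] <= 0x4F


for asm, code in ENCODINGS.items():
    print(f"{asm:<14} {len(code)} bytes  rex={has_rex(code)}  {code.hex(' ')}")
```

The only thing this shows is code size: using r8-r15 (or the new byte registers) costs one extra prefix byte per instruction. It says nothing about execution latency.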

nickelpro · Accepted Answer · 2020-07-15T20:33:20

LEA will be slower with EBP, RBP, or R13 as the base (PDF warning, page 3-22). But generally the answer is No.
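
As far as I understand it, the likely reason these three registers stand out is the addressing-mode encoding: a base field of 101 with mod=00 does not mean "rbp/r13 with no displacement", so a displacement byte has to be encoded even when it is zero. The sketch below illustrates that rule. The helper name base_forces_displacement is hypothetical, and the byte strings are hand-assembled assumptions that an assembler may encode differently.

```python
# Register numbers as used in ModRM/SIB fields (bit 3 is carried by the REX prefix).
REG_NUMBER = {
    "rax": 0, "rcx": 1, "rdx": 2, "rbx": 3, "rsp": 4, "rbp": 5, "rsi": 6, "rdi": 7,
    "r8": 8, "r9": 9, "r10": 10, "r11": 11, "r12": 12, "r13": 13, "r14": 14, "r15": 15,
}


def base_forces_displacement(reg: str) -> bool:
    """Hypothetical helper: True if `reg` cannot be a base without a displacement.

    With mod=00, a base field of 101 is reserved for "no base register, disp32
    follows", so rbp and r13 need mod=01 plus an explicit (possibly zero) disp8.
    """
    return REG_NUMBER[reg] & 0b111 == 0b101


# Hand-assembled examples (assumed; an assembler may pick another valid encoding).
LEA_RBX = bytes.fromhex("48 8d 04 0b")     # lea rax, [rbx + rcx]
LEA_RBP = bytes.fromhex("48 8d 44 0d 00")  # lea rax, [rbp + rcx], with disp8 = 0
LEA_R13 = bytes.fromhex("49 8d 44 0d 00")  # lea rax, [r13 + rcx], with disp8 = 0

for reg in REG_NUMBER:
    if base_forces_displacement(reg):
        print(f"{reg}: needs a displacement when used as a base")
```

So a two-component LEA with one of these bases is encoded like a base + index + displacement LEA, which is consistent with the slow case the manual describes.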

Taking a step back, it's important to realize that, since the advent of register renaming, architectural registers no longer correspond directly to physical registers on most microarchitectures. For example, each Cascade Lake core has a physical register file of 180 integer and 168 FP registers.
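
To make the renaming idea concrete, here is a toy model, not how any real core is implemented. The RenameTable class and its methods are hypothetical names for illustration. It assumes every write to an architectural register grabs a fresh physical register, and it recycles the old one immediately, whereas real hardware only frees it later.

```python
from collections import deque

ARCH_REGS = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
             "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"]


class RenameTable:
    """Toy register alias table: architectural name -> physical register index."""

    def __init__(self, arch_regs, num_physical):
        self.free = deque(range(num_physical))
        # Give every architectural register an initial physical register.
        self.mapping = {reg: self.free.popleft() for reg in arch_regs}

    def read(self, reg):
        """Return the physical register currently holding `reg`."""
        return self.mapping[reg]

    def write(self, reg):
        """Allocate a fresh physical register for a new value of `reg`."""
        old = self.mapping[reg]
        new = self.free.popleft()
        self.mapping[reg] = new
        self.free.append(old)  # toy simplification: recycled immediately
        return new


table = RenameTable(ARCH_REGS, num_physical=180)  # 180 taken from the figure above
before = table.read("rax")
table.write("rax")  # e.g. the destination of mov rax, rcx
after = table.read("rax")
print(f"rax: physical {before} -> physical {after}")
```

In this model the name rax is just a key in a lookup table, and nothing about the lookup depends on which key you use. That is the intuition behind "no register is inherently faster".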

3 Answers

Special cases where a specific encoding is extra slow, not just size