5
votes

All the CPU architectures I have encountered have symmetric registers, i.e. the value you read is the value you wrote.

Is there a case, for register-limited 16-bit instructions, to have asymmetric registers?

e.g.

  • Registers 0-6 are "local" to the function invocation. The value written in this function call is the value which will be read. Each level of function call has its own register hardware, so local registers are implicitly saved.
  • Registers 7-9 are "global", perhaps "thread local" on a SMP CPU.
  • Values written to "call" registers 10-13 do not affect what is read from them in this function call context.
  • Values read from "call" registers 10-13 are the values written in the calling function, i.e. a function's register arguments are immutable.
  • Values written to "return" registers 14-15 do not affect what is read from them in this function call context.
  • Values read from "return" registers 14-15 are the values written in the function which most recently returned to the current function.

Each function level's registers have their own hardware, spilling to the stack only when the call depth exceeds what the hardware provides.

               (local) (global) ( call ) (ret)

global regset          07 .. 09
.                           
.                            
.                              
.                                |     | ^  ^
.                                v     v |  | 
regsetN-1     00 .. 06          10 .. 13 14 15
              |^    |^          |     |   ^  ^
              v|    v|          v     v   |  |
        fnN-1 RW    RW          RW    RW RW RW
                                 |     | ^  ^
                                 v     v |  | 
regsetN       00 .. 06          10 .. 13 14 15
              |^    |^          |     |   ^  ^
              v|    v|          v     v   |  |
        fnN   RW    RW          RW    RW RW RW
                                 |     | ^  ^
                                 v     v |  |
regsetN+1     00 .. 06          10 .. 13 14 15
              |^    |^          |     |   ^  ^
              v|    v|          v     v   |  |
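
To make the read/write routing in the diagram concrete, here is a minimal behavioural sketch in Python. It is only a toy model under my own assumptions: the class name `AsymmetricRegisterFile`, the `hw_depth` parameter and the `call()`/`ret()` methods are hypothetical, and spilling to the stack is not modelled.

```python
LOCAL = range(0, 7)     # r0..r6: private to each call level
GLOBAL = range(7, 10)   # r7..r9: shared by every call level
CALL = range(10, 14)    # r10..r13: read from own set, written to callee's set
RET = range(14, 16)     # r14..r15: written to own set, read from callee's set


class AsymmetricRegisterFile:
    """Toy model of the proposed scheme (hypothetical names throughout)."""

    def __init__(self, hw_depth=8):
        self.hw_depth = hw_depth
        # One extra set so the deepest level can still stage call arguments.
        self.sets = [[0] * 16 for _ in range(hw_depth + 1)]
        self.shared = [0] * 16  # only slots 7..9 are used
        self.depth = 0

    def read(self, r):
        if r in LOCAL or r in CALL:
            return self.sets[self.depth][r]       # own register set
        if r in GLOBAL:
            return self.shared[r]
        if r in RET:
            return self.sets[self.depth + 1][r]   # written by the last callee
        raise IndexError(r)

    def write(self, r, value):
        if r in LOCAL or r in RET:
            self.sets[self.depth][r] = value      # own register set
        elif r in GLOBAL:
            self.shared[r] = value
        elif r in CALL:
            self.sets[self.depth + 1][r] = value  # staged for the next callee
        else:
            raise IndexError(r)

    def call(self):
        if self.depth + 1 >= self.hw_depth:
            raise OverflowError("call depth exceeds hardware; spill not modelled")
        self.depth += 1

    def ret(self):
        if self.depth == 0:
            raise IndexError("return from outermost level")
        self.depth -= 1
```

In this model, a write to r10 followed by a read of r10 at the same call level returns the old incoming argument, not the value just written, which is the asymmetry I am asking about.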

Would a scheme like this reduce the register pressure within each function call by two or more registers?

I'm not expecting that this is a new idea, but I am interested in whether it has been done, and if not, why not. If it isn't a mad idea and hasn't already been done, I may implement it on an FPGA CPU.

Is it just too complex to be worth the register savings?

Are difficulties with LLVM a major reason that this is not done?

P.S. I am aware that super-scalar processors are already much more complex than this, with register-renaming schemes, etc. I'm just musing about microcontroller-class architectures.


Update: It looks like the SPARC architecture did this. Why was it not thought useful by later ISA designers?

When a procedure is called, the register window shifts by sixteen registers, hiding the old input registers and old local registers and making the old output registers the new input registers.
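
As a rough illustration of the overlap described in that quote, the sketch below maps an architectural register number to a slot in a windowed register file. It assumes the usual SPARC numbering (r0-r7 global, r8-r15 out, r16-r23 local, r24-r31 in); the function name `physical_index`, the slot layout, the choice of 8 windows, and the convention that a call decrements the window pointer are modelling assumptions of mine, not a description of any particular implementation.

```python
def physical_index(cwp, r, nwindows=8):
    """Map register r (0..31) in window cwp to a (bank, slot) pair.

    Illustrative layout only: each window owns 16 slots (8 out + 8 local),
    and its 8 "in" registers alias the "out" slots of window cwp + 1.
    """
    if not 0 <= r < 32:
        raise IndexError(r)
    if r < 8:
        return ("global", r)              # globals are not windowed
    slot = cwp * 16 + (r - 8)             # out: +0..7, local: +8..15, in: +16..23
    return ("windowed", slot % (nwindows * 16))


# Assumed convention: a call moves from window `caller` to window `caller - 1`.
caller, callee = 3, 2
same_slot = physical_index(caller, 8) == physical_index(callee, 24)
```

Under these assumptions `same_slot` is true: the caller's first out register and the callee's first in register are the same physical slot, so arguments are passed without any copying.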

1
It isn't clear to me why this needs to be implemented at a micro architectural (hardware) level. This could just as easily be a convention established by the designers and adhered to by software. I mean, I guess you wouldn't get the hardware-assisted protection against writing to a register you shouldn't, but at these low levels, you aren't usually guaranteed these types of protections, anyway. Aside from that, once you have 16 registers, it hardly seems like you'd be classified as a "register-limited" ISA. - Cody Gray
Furthermore, there are architectures that implement register windows, which sounds pretty similar to what you're describing, though not exactly the same. - Cody Gray
You're right, there's no reason why this needs to be hardware, it could just be an ISA. The "protections" were not a goal, just a happy coincidence. The point is that a function can write values to "call registers" (before calling a function) without clobbering the arguments with which it was itself called, etc., thus easing register pressure in the ISA. - fadedbee
re: register windows - yes, this is a form of register windowing, but also allowing reads and writes to higher and lower windows, depending on register number and access-type. - fadedbee
The "classic" version of the Zilog Z8 can have 144 or 256 8-bit registers, which are usually paired (even/odd) to form 16-bit addresses. There's a short-form instruction that uses a 16-bit index from a base register to select a register. By using a base register (or more) per "thread", you get some of the functionality that you mention. There's a pin for code fetches versus data reads/writes, making it a Harvard architecture. - rcgldr

1 Answer

3
votes

This was how SPARC's register windows worked. While it looks like a good idea in isolation, interactions with the rest of the system lowered the overall system performance.

From http://ieng9.ucsd.edu/~cs30x/sparcstack.html

That was the idea, anyway. The drawback is that upon interactions with the system the registers need to be flushed to the stack, necessitating a long sequence of writes to memory of data that is often mostly garbage. Register windows was a bad idea that was caused by simulation studies that considered only programs in isolation, as opposed to multitasking workloads, and by considering compilers with poor optimization. It also caused considerable problems in implementing high-end Sparc processors such as the SuperSparc, although more recent implementations have dealt effectively with the obstacles. Register windows is now part of the compatibility legacy and not easily removed from the architecture.