0
votes

My book mentions: "Depending on what you consider as the baseline, the reduction can be viewed as decreasing the number of clock cycles per instruction (CPI), as decreasing the clock cycle time, or as a combination. If the starting point is a processor that takes multiple clock cycles per instruction, then pipelining is usually viewed as reducing the CPI."

What I fail to understand is whether pipelining affects the CPI or the clock period. With pipelining, the clock period is taken as max stage delay + latch delay, so pipelining does affect the clock time. It also affects the CPI, because the CPI becomes 1 with pipelining. Am I missing some concept?

1
Pipelining increases average throughput for the same clock speed, which is exactly the same thing as decreasing average CPI. Or it lets you increase the clock speed if your CPU's clock was so slow that it could do everything for a whole instruction in one clock cycle. - Peter Cordes
@PeterCordes Does that mean that either we can decrease CPI or the clock cycle time, not both? - Amisha Bansal
It generally increases the time taken by a single instruction, but on average, since many (= length of the pipeline) instructions are executing in parallel, the number of cycles per instruction is decreased. I suggest you look at the famous Laundry Example, for example at hpca23.cse.tamu.edu/taco/utsa-www/cs5513-fall07/lecture3.html . - abjoshi - Reinstate Monica

1 Answer

3
votes

Executing an instruction requires a set of operations. For the sake of simplicity, assume there are 5: fetch, instruction decode, execute, memory access, write back.

This can be implemented with several schemes.

A/ Single-cycle (mono-cycle) processor

The scheme is the following: the processor fetches an instruction and directs it to a decoder, which controls a bank of multiplexers that configure a large combinational datapath implementing the instruction.

In this model, every instruction requires one cycle and, assuming all 5 "stages" require an equal time t, the period will be 5t. Hence CPI=1, T=5 (with T expressed in units of t).

Actually, this was more or less the underlying model of the earliest computers in the late 1940s. Apart from that, no real processor has been built like that, but it is theoretically quite doable.

B/ Multi cycle processor

Compared to the previous model, you introduce registers on the datapath. The processor first fetches the instruction and sends it to the inputs of an automaton (a finite-state machine) that sequentially applies the computation "stages".

In that case, instructions require 5 cycles (maybe slightly fewer, as some instructions may be simpler and, for instance, skip the memory access). The period is 1t (or maybe slightly more, to take into account the register traversal time).

CPI=5, T=1

The first "true" computers were implemented like that, and this was the main architectural model up to the early 1980s. Nowadays several microcontrollers and, for instance, the simpler version of NIOS still rely on this scheme.

C/ Pipelined processor

You add extra registers between the stages in order to keep track of the instruction and of all the partial results. In that case, the execution of every stage can be independent, and you can execute several instructions simultaneously in different stages.

CPI becomes 1, as you can start a new instruction at every clock cycle (in practice a bit more than 1 because of hazards, but that is another story). And T=1.

So CPI=1, T=1

(The CPI reflects the throughput increase, but the execution time of a single instruction is not reduced.)
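To make the comparison concrete, here is a minimal Python sketch of the three schemes. It assumes 5 equal stages of duration t, ignores hazards and register (latch) delay, and uses hypothetical helper names (`steady_state_time`, `pipelined_time`) that are not from any library. The pipelined figure includes the cycles needed to fill the pipeline, which is why the latency of a single instruction stays at 5t.

```python
def steady_state_time(n_instructions, cpi, period):
    """Execution time = instruction count x CPI x clock period."""
    return n_instructions * cpi * period


def pipelined_time(n_instructions, stages, period):
    """Ideal pipeline (no hazards): the first instruction needs `stages`
    cycles, then one instruction completes per cycle."""
    return (n_instructions + stages - 1) * period


t = 1.0   # duration of one stage, arbitrary unit (assumption: all stages equal)
n = 1000  # number of instructions in the illustrative program

time_a = steady_state_time(n, cpi=1, period=5 * t)  # A: CPI=1, T=5
time_b = steady_state_time(n, cpi=5, period=1 * t)  # B: CPI=5, T=1
time_c = pipelined_time(n, stages=5, period=1 * t)  # C: CPI~1, T=1

latency_c = pipelined_time(1, stages=5, period=1 * t)  # one instruction alone
effective_cpi_c = time_c / (n * t)                      # tends to 1 as n grows

print(f"A: {time_a} t, B: {time_b} t, C: {time_c} t")
print(f"C single-instruction latency: {latency_c} t, effective CPI: {effective_cpi_c}")
```

Under these assumptions A and B take the same total time, while C is close to 5 times faster for a long instruction stream even though a lone instruction still takes 5t.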

So pipelining can be seen as either reducing the cycle time with respect to scheme A, or reducing the CPI with respect to scheme B. And you can also imagine an intermediate baseline (say a non-pipelined design with 3 stages and a period of 2t, so CPI=3, T=2), relative to which pipelining reduces both.
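The dependence on the baseline can be sketched as a small calculation. This is only an illustration: it reads the intermediate scheme as a non-pipelined design with 3 steps of 2t each (CPI=3, T=2), which is an assumption about the intent, and `compare` is a hypothetical helper. Each design is a `(CPI, T)` pair with T in units of t.

```python
from fractions import Fraction


def compare(baseline, pipelined):
    """Return (CPI ratio, period ratio, overall speedup) of the pipelined
    design relative to the baseline. A ratio above 1 means that quantity
    was reduced by pipelining."""
    (b_cpi, b_period), (p_cpi, p_period) = baseline, pipelined
    return (
        Fraction(b_cpi, p_cpi),
        Fraction(b_period, p_period),
        Fraction(b_cpi * b_period, p_cpi * p_period),
    )


pipelined = (1, 1)  # scheme C: CPI=1, T=1 (ideal, no hazards)
baselines = {
    "A": (1, 5),             # single-cycle
    "B": (5, 1),             # multi-cycle
    "intermediate": (3, 2),  # assumed: 3 steps of 2t each, not pipelined
}

results = {name: compare(b, pipelined) for name, b in baselines.items()}
for name, (cpi_ratio, period_ratio, speedup) in results.items():
    print(f"vs {name}: CPI /{cpi_ratio}, T /{period_ratio}, speedup x{speedup}")
```

Against A only the period shrinks, against B only the CPI shrinks, and against the assumed intermediate baseline both shrink. The speedup there is 6 rather than 5 only because 3 steps of 2t add up to 6t per instruction in this made-up example.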