Low power design for adders

Question

I have to implement a circuit that performs A+B+C+D serially.

A and B are added using the first adder, the result is added to C using the second adder and finally the result is added to D using the third adder, one after the other.

The problem is that, to make the design low power, I have to turn off the two adders that are not in use. All I can think of is enable and disable signals, but this causes latency issues.

How do I synthesize this in an effective manner in Verilog?

A, B, C and D may change every clock cycle. A start signal is used to indicate when a new calculation is required.

What do you mean by 'serially'? If it's real bit-serial you use a single one-bit adder and shift all the bits through it - minimal power, high latency, nothing to turn off, relatively easy to code. If you actually have to disable two real n-bit adders to save power then you have to turn off the clock to them. Enable and disable signals won't in general help (much), if they simply gate data. — EML
Will using Verilog events to trigger the adders prove more effective? — chitranna

Morgan Morgan · Accepted Answer · 2013-09-10T08:38:08

I assume your adder has been inferred via sum = A + B;. For area optimisation, why not share a single adder unit? A+B in CLK1, SUM+C in CLK2, SUM+D in CLK3. Then you have nothing to disable or clock gate.
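As a rough sketch of the shared-adder idea, the module below steps one adder through the three additions. The module name serial_sum4, the width parameter W, the done flag and the operand holding registers c_q and d_q are all assumptions made for illustration; none of them come from the question. It also only accepts a new start once every three cycles.

```verilog
// Hypothetical sketch: names, widths and the start/done handshake are assumptions.
module serial_sum4 #(
  parameter W = 8
) (
  input  wire         clk,
  input  wire         rst_n,
  input  wire         start,       // sampled only while idle (step == 0)
  input  wire [W-1:0] a, b, c, d,
  output reg  [W+1:0] sum,         // W+2 bits hold the sum of four W-bit values
  output reg          done         // high for one cycle when sum is final
);
  reg [1:0]   step;                // 0: idle / a+b, 1: +c, 2: +d
  reg [W-1:0] c_q, d_q;            // c and d held for the later cycles

  // Operand muxes feeding the single shared adder.
  wire [W+1:0] lhs = (step == 2'd0) ? {2'b00, a} : sum;
  wire [W-1:0] rhs = (step == 2'd0) ? b :
                     (step == 2'd1) ? c_q : d_q;
  wire [W+1:0] add_out = lhs + rhs;

  always @(posedge clk or negedge rst_n) begin
    if (!rst_n) begin
      step <= 2'd0;
      sum  <= {(W+2){1'b0}};
      c_q  <= {W{1'b0}};
      d_q  <= {W{1'b0}};
      done <= 1'b0;
    end
    else begin
      done <= 1'b0;
      case (step)
        2'd0: if (start) begin     // CLK1: a + b
          sum  <= add_out;
          c_q  <= c;
          d_q  <= d;
          step <= 2'd1;
        end
        2'd1: begin                // CLK2: sum + c
          sum  <= add_out;
          step <= 2'd2;
        end
        2'd2: begin                // CLK3: sum + d
          sum  <= add_out;
          step <= 2'd0;
          done <= 1'b1;
        end
        default: step <= 2'd0;
      endcase
    end
  end
endmodule
```

Note that while idle the adder still sees a and b directly, so it will toggle if they change; registering the inputs behind an enable would be needed to keep it quiet.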

The majority of dynamic power is consumed when values change, so zeroing inputs when they are not in use can actually increase power by creating unnecessary toggles. As adders are combinatorial logic, all we can do to save power for a given architecture is hold their input values stable. This could be done by using clock-gate cells to control and sequence the clocks of the input and output flip-flops.
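A minimal sketch of holding adder inputs stable, assuming the operands are registered in front of the adder. The module name held_adder, the enable en and the width W are hypothetical. Whether the enable actually becomes a clock gate depends on the synthesis tool and its settings.

```verilog
// Hypothetical sketch: module name, ports and width are assumptions.
module held_adder #(
  parameter W = 8
) (
  input  wire         clk,
  input  wire         en,      // capture new operands only when high
  input  wire [W-1:0] a, b,
  output wire [W:0]   sum      // W+1 bits keep the carry out
);
  reg [W-1:0] a_q, b_q;

  // No else branch: with en low the registers keep their values, so the
  // adder inputs do not toggle and the adder consumes no switching power.
  // An enable of this form is what automatic clock gating looks for.
  always @(posedge clk) begin
    if (en) begin
      a_q <= a;
      b_q <= b;
    end
  end

  assign sum = a_q + b_q;
endmodule
```

The cost is one cycle of latency through the input registers, which is the trade-off against leaving the adder inputs connected directly.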

Update

With the information that a new calculation may be required every clock cycle, and that there is an enable signal called start: the question made reference to adding them serially, i.e.:

sum1 = A + B;
sum2 = sum1 + C;
sum3 = sum2 + D;

Since the result is potentially calculated every clock cycle, the adders are either all on or all off. The given serialisation (which is all executed in parallel, within one cycle) has 3 adders chained together (a ripple path 3 adders deep). If we refactor to:

sum1 = A + B;
sum2 = C + D;
sum3 = sum1 + sum2;

the ripple path is only 2 adders deep, allowing a quicker settling time, which implies fewer ripples or transients to consume power.

I would be tempted to do this all on one line and allow the synthesis tool to optimise it:

sum3 = A + B + C + D;

For power saving I would turn on automatic clock gating when synthesising and use a structure that works well with this technique:

always @(posedge clk or negedge rst_n) begin
  if (~rst_n) begin
    sum3 <= 'b0;
  end
  else begin
    if (start) begin //no else clause, means this signal can clk gate the flop
      sum3 <= A + B + C + D;
    end
  end
end
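To try the block above in a simulator, it can be wrapped in a module and driven by a small testbench. This is only a sketch: the module name sum4_gated, the width W, the 10-unit clock period and the stimulus values are assumptions chosen for illustration. The result is sized at W+2 bits so that the sum of four W-bit values cannot overflow.

```verilog
// Hypothetical wrapper: module name, port widths and W are assumptions.
module sum4_gated #(
  parameter W = 8
) (
  input  wire         clk,
  input  wire         rst_n,
  input  wire         start,
  input  wire [W-1:0] A, B, C, D,
  output reg  [W+1:0] sum3
);
  always @(posedge clk or negedge rst_n) begin
    if (~rst_n)
      sum3 <= {(W+2){1'b0}};
    else if (start)            // no else: sum3 holds while start is low
      sum3 <= A + B + C + D;   // evaluated at W+2 bits, so no carry is lost
  end
endmodule

// Minimal testbench; stimulus values are arbitrary.
module sum4_gated_tb;
  reg        clk = 0, rst_n = 0, start = 0;
  reg  [7:0] A, B, C, D;
  wire [9:0] sum3;

  sum4_gated #(.W(8)) dut (
    .clk(clk), .rst_n(rst_n), .start(start),
    .A(A), .B(B), .C(C), .D(D), .sum3(sum3)
  );

  always #5 clk = ~clk;

  initial begin
    A = 8'd200; B = 8'd100; C = 8'd50; D = 8'd25;
    #12 rst_n = 1;                       // release reset
    @(negedge clk) start = 1;            // request one calculation
    @(negedge clk) start = 0;
    if (sum3 !== 10'd375)
      $display("FAIL: sum3 = %0d", sum3);
    A = 8'd1;                            // inputs change while start is low
    @(negedge clk);
    if (sum3 !== 10'd375)
      $display("FAIL: sum3 changed without start");
    $finish;
  end
endmodule
```

A FAIL line is printed if the registered sum is wrong after the start pulse or if it changes while start is low.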
