Download Solution - University of California, Berkeley

UNIVERSITY OF CALIFORNIA
College of Engineering
Department of Electrical Engineering and Computer Sciences

Last modified on April 21, 2006 by Seng Oon Toh (sengoon@eecs)

Borivoje Nikolić

EE 141 Homework #9: Activity Factor and Sequential Circuits

Problem #1

[Figure: Circuit A, gates G1 to G6 with inputs A, B, C, D driving a 50 fF load. Circuit B, gates G1 and G2 with inputs A, B, C, D driving a 50 fF load.]

Let's evaluate the effect of logic choice on power dissipation. Circuit A is an AND4 implemented as a NAND2 chain, while Circuit B implements the same function using a 4-input gate.

a) Size circuit A and circuit B for minimum delay with Cin of 3 fF and Cout of 50 fF. Report your answer in terms of the input capacitance seen at each gate.

Circuit A

H = 50/3 * (4/3)^3 = 39.51
h = H^(1/6) = 1.845

| Gate | Input capacitance |
|------|-------------------|
| G1   | 3 fF              |
| G2   | 4.15 fF           |
| G3   | 7.66 fF           |
| G4   | 10.60 fF          |
| G5   | 19.55 fF          |
| G6   | 27.06 fF          |

Circuit B

H = 50/3 * (6/3) = 33.33
h = H^(1/2) = 5.77

G1 = 3 fF
G2 = 8.66 fF

b) Given that P(A=0) = P(B=0) = P(C=0) = P(D=0) = 0.25, calculate the probability of energy-consuming transitions P(0→1) at the outputs of the gates in both circuits.

Circuit A

P(0→1) = P(0) * P(1) = (1 - P(0)) * P(0)

NAND2 truth table with probabilities:

| In1     | In2     | Out      |
|---------|---------|----------|
| 0 (1/4) | 0 (1/4) | 1 (1/16) |
| 0 (1/4) | 1 (3/4) | 1 (3/16) |
| 1 (3/4) | 0 (1/4) | 1 (3/16) |
| 1 (3/4) | 1 (3/4) | 0 (9/16) |

For G3, P(In1=1) = P(G2=1) = 9/16, while P(In2=1) = P(C=1) = 3/4. Therefore P(G3=0) = 9/16 * 3/4 = 27/64 and P(G3=1) = 1 - P(G3=0).

For G5, P(In1=1) = P(G4=1) = 27/64, while P(In2=1) = P(D=1) = 3/4. Therefore P(G5=0) = 27/64 * 3/4 = 81/256.

Each even-numbered gate is an inverter, so its P(0) and P(1) are those of the preceding NAND2 swapped.

| Gate | P(0)    | P(1)    | P(0→1)      |
|------|---------|---------|-------------|
| G1   | 9/16    | 7/16    | 63/256      |
| G2   | 7/16    | 9/16    | 63/256      |
| G3   | 27/64   | 37/64   | 999/4096    |
| G4   | 37/64   | 27/64   | 999/4096    |
| G5   | 81/256  | 175/256 | 14175/65536 |
| G6   | 175/256 | 81/256  | 14175/65536 |

Circuit B

P(NAND4=0) = P(In1=1, In2=1, In3=1, In4=1) = (3/4)^4 = 81/256

| Gate | P(0)    | P(1)    | P(0→1)      |
|------|---------|---------|-------------|
| G1   | 81/256  | 175/256 | 14175/65536 |
| G2   | 175/256 | 81/256  | 14175/65536 |

c) Assuming both circuits are operating at 500 MHz, calculate the dynamic power consumption of each circuit. Assume γ = 1, Vdd = 2.5 V, and only consider capacitances at the inputs and outputs of the gates.
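Before working part (c), the sizing from part (a) and the probabilities from part (b) can be cross-checked with a short script. This is only a sketch: the helper names `size_path` and `and_chain_p0` are hypothetical, the inputs are assumed to be statistically independent, and the logical efforts (4/3 for NAND2, 2 for NAND4, 1 for an inverter) are the ones implied by the H expressions above. The computed capacitances can differ from the tabulated ones in the last digit or two, which looks like the effect of chaining rounded intermediate values by hand.

```python
import math
from fractions import Fraction

def size_path(c_in, c_out, efforts):
    """Return (stage effort h, input capacitance of each stage) for equal stage effort."""
    path_effort = (c_out / c_in) * math.prod(efforts)
    h = path_effort ** (1 / len(efforts))
    caps = [c_in]
    for g in efforts[:-1]:
        # Stage effort h = g * C_next / C_this, so C_next = C_this * h / g.
        caps.append(caps[-1] * h / g)
    return h, caps

def and_chain_p0(p_one):
    """P(output = 0) for each gate of a NAND2, INV, NAND2, INV, ... chain."""
    p0 = []
    running = p_one[0]          # probability that the running AND is 1
    for p in p_one[1:]:
        running *= p            # a NAND2 outputs 0 only when both inputs are 1
        p0.append(running)      # NAND2 stage
        p0.append(1 - running)  # following inverter
    return p0

# Part (a): Circuit A is NAND2/INV repeated three times, Circuit B is NAND4 + INV.
h_a, caps_a = size_path(3.0, 50.0, [4 / 3, 1, 4 / 3, 1, 4 / 3, 1])
h_b, caps_b = size_path(3.0, 50.0, [2, 1])

# Part (b): every primary input is 1 with probability 3/4.
p_one = [Fraction(3, 4)] * 4
p0_a = and_chain_p0(p_one)
alpha_a = [p * (1 - p) for p in p0_a]

p_nand4_zero = Fraction(3, 4) ** 4
p0_b = [p_nand4_zero, 1 - p_nand4_zero]
alpha_b = [p * (1 - p) for p in p0_b]

print("Circuit A: h =", round(h_a, 3), "caps (fF) =", [round(c, 2) for c in caps_a])
print("Circuit B: h =", round(h_b, 3), "caps (fF) =", [round(c, 2) for c in caps_b])
print("Circuit A P(0->1):", [str(a) for a in alpha_a])
print("Circuit B P(0->1):", [str(a) for a in alpha_b])
```

Using `Fraction` keeps the transition probabilities exact, so they can be compared directly with the fractions in the tables.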
Circuit A

At the primary inputs, P(0→1) = 1/4 * 3/4 = 3/16.

For a NAND2 gate, the capacitance at the output node is 6γ/4 times the capacitance at one input. Two 2x PMOS devices and one 2x NMOS device connect to the output node (6 units), while the capacitance at each input is 4 units. That gives 6/4, and the factor γ scales it for the technology. A quick way to get this ratio is pγ/g, which works for simple logic gates such as NANDs and NORs. It might not be accurate for AOI (and-or-invert) gates, since the topology is not unique.

With αGi denoting P(0→1) at the output of gate Gi from part (b):

Pdyn = Vdd^2 * f * [ (2*3 + 7.66 + 19.55) fF * 3/16
       + (4.15 + 3*3γ/2) fF * αG1
       + (7.66 + 4.15γ) fF * αG2
       + (10.6 + 7.66*3γ/2) fF * αG3
       + (19.55 + 10.6γ) fF * αG4
       + (27.06 + 19.55*3γ/2) fF * αG5
       + (50 + 27.06γ) fF * αG6 ]
     = 165 μW

Circuit B

For a NAND4 gate, the capacitance at the output node is 12γ/6 times the capacitance at one input.

Pdyn = Vdd^2 * f * [ 4*3 fF * 3/16
       + (3*12γ/6 + 8.66) fF * αG1
       + (50 + 8.66γ) fF * αG2 ]
     = 56.6 μW

Circuit B dissipates about a third of the power of Circuit A: the shorter logic depth reduces power consumption.

Problem #2

Consider the simple state machine shown above. A, B, and C represent combinational logic blocks with the following properties:

tminA = 200 psec; tmaxA = 1 nsec;
tminB = 300 psec; tmaxB = 2 nsec;
tminC = 100 psec; tmaxC = 0.5 nsec;

The L-units represent positive latches driven by the clock. L has a setup time of 150 psec and a delay of 250 psec (td-q, when the latch is transparent). tc-q is 100 psec and thold is 100 psec. The clock has a period T and is high for a duration of Ton. The duty cycle of the clock hence equals 100 Ton/T %.

a) Determine the conditions on the clock necessary to avoid the occurrence of races.

A race occurs when the clock pulse is long enough that an input transition meant to be latched in the next clock cycle "races" through a fast path and appears at the output of the pipeline stage, so that data arrives ahead of schedule. In the case of latches, what we are concerned about is the period during which the latches are transparent. Race conditions put an upper bound on the transparent period (Ton).
The condition to prevent race conditions is hence

Ton + thold ≤ tc-q + tminA + tminC
Ton ≤ 100 ps + 200 ps + 100 ps - 100 ps = 300 ps

This requirement stems from the fact that we would like the fast-path signal to reach the receiving latch only after its hold time. This way, there will never be any spurious transitions within the setup-to-hold window around the falling edge, which ensures that proper values are latched.

Ton,min = tc-q

b) Determine the absolute minimum clock period for this circuit to work correctly as well as the maximum duty cycle.

There are two cases to consider. The first is when the signal is launched while the latches are transparent and keeps circulating around the loop while the latches are transparent. This is the ideal case, since latches are meant to be traffic lights and it would be best if we saw green all the time. For this case,

Tclk,min = td-q + tlogic,max = 250 ps + 2.5 ns = 2.75 ns

(assuming that the latches stay transparent for a long enough period).

The other case is when the signal arrives while the latch is opaque. In this case we need to meet the setup requirement:

Tclk,min = tc-q + tlogic,max + tsetup = 100 ps + 2.5 ns + 150 ps = 2.75 ns

We would take the maximum of the two, but they are the same in this case.

The maximum duty cycle for this case is 100 * 300/2750 % ≈ 11 %.

c) Suppose that due to some sloppy clock-network routing, the clock signal at L1 arrives 100 ps earlier than the clock signal at L2. Calculate the absolute minimum clock period for this circuit to work properly as well as the maximum duty cycle.

This is a case of positive clock skew between L1 and L2. However, since L1 is not in the critical path of the circuit, the minimum clock period remains the same as in part (b). Ton, on the other hand, is affected by this positive skew because the input is now launched 100 ps earlier into the stage.
The new constraint on Ton, where δ = 100 ps is the clock skew, is:

Ton ≤ tc-q + tminA + tminC - thold - δ = 100 ps + 200 ps + 100 ps - 100 ps - 100 ps = 200 ps

Hence, the maximum duty cycle is reduced to 100 * 200/2750 % = 7.3 %.

Problem #3

[Figure: schematic of the gate with clock Clk, inputs A, B, C, internal node Vx, and output Out. Relative transistor sizes are annotated, including an inverter marked PMOS: 2, NMOS: 1.]

The circuit above is a Limited Switch Dynamic Logic (LSDL) NOR3 gate, a circuit family used in high-performance datapaths. It is essentially a domino gate followed by a latch. The relative sizing of the gates has been annotated on the schematic.

a) What is the purpose of the shaded transistors?

They function just like keepers to maintain a static output. Without these transistors, the input of the inverter would be floating during the pre-charge phase and highly susceptible to noise injection.

b) Assuming no propagation delay, complete the following ideal timing diagram.

[Timing diagram: Clk, A, B, C, Vx, Out]

c) Calculate tsetup of the gate in terms of the intrinsic delay tp0. (tsetup is defined as the minimum period of time for which the inputs have to be stable before the clock transitions into the latched state in order for the correct value to be latched.) Assume γ = 1.

The circuit can be divided into three stages: the first stage looks like a domino gate, the second stage is the inverter with keepers, and the third stage is the output inverter. For tsetup we only need to consider the propagation delay through the first two stages. Once the output of the second stage is set, the clock can safely transition without the risk of losing the latched data.

Using the method of Logical Effort on the evaluate transition (pull-down) of the first stage (refer to the HW8 solutions for details):

g1 = 2/3
p1 = 3/3 = 1
f1 = 4/2 = 2

For the second stage, the evaluate transition causes the input to this stage to transition high-to-low, so the network that we care about is the pull-up. We do not need to wrestle with any keepers in this case because the NMOS pull-down network is deactivated during this phase. The PMOS keeper is also off in this case.
The logical effort is therefore straightforward.

g2 = 4/3
p2 = 5/3
f2 = 3/4

tsetup = tp0 [p1 + f1*g1 + p2 + f2*g2] = tp0 [1 + 4/3 + 5/3 + 1] = 5 tp0

d) What would be the minimum positive pulse width of Clk (in terms of tp0) to ensure that a correct value is latched?

We need to consider two cases.

Case 1: Vx is discharged to Gnd

This case is exactly the same as the case for evaluating tsetup, and the answer is 5 tp0.

Case 2: Vx remains at Vdd

Now the evaluate phase goes through the pull-down network of the second stage. Since the PMOS keeper is on in this case, we need to convert our circuit to an equivalent form with only one NMOS pull-down transistor. Following the method in HW8, we assume that the PMOS keeper effectively reduces the pull-down strength of the NMOS network by an equivalent NMOS transistor. The equivalent minimum-sized inverter then has an NMOS of size 0.5 and a PMOS of size 1.

g2 = 4/1.5 = 8/3
p2 = 5/1.5 = 10/3
f2 = 3/4

tpulse,min = tp0 (p2 + f2*g2) = tp0 (10/3 + 2) = (16/3) tp0

The worst case is therefore case 2, and tpulse,min = (16/3) tp0.

This value is particularly important in this logic style because the evaluate phases are usually implemented as pulses generated using NAND gates driven from the same clock but with one input delayed using inverters. We need to ensure that these pulses are sufficiently wide across process variation.

e) Assuming that A=B=C=1, compare the activity factor at node Out to the activity factor at node Vx, which would be the output of a standard domino logic gate. What can you infer about the dynamic power consumption of static gates being driven by this gate compared to domino logic?

Vx switches every clock period even if the inputs are constant, while Out remains constant. Gates driven by domino logic therefore see a higher activity factor than gates driven by LSDL and consume more power. The trade-off is a higher clock load in LSDL gates, because Clk needs to drive three gates instead of 2 or 1 in the case of footed or footless domino respectively.
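The logical-effort arithmetic in parts (c) and (d) of Problem #3 can be checked with exact fractions. This is a minimal sketch, not part of the original solution: the helper name `stage_delay` is hypothetical, delays are expressed in units of tp0, and the g, p, and f values are copied from the solution rather than derived from the schematic.

```python
from fractions import Fraction as F

def stage_delay(p, g, f):
    """Normalized logical-effort stage delay d = p + g*f, in units of tp0."""
    return p + g * f

# Stage 1: domino pull-down during evaluate.
stage1 = stage_delay(p=F(1), g=F(2, 3), f=F(2))

# Stage 2, case 1: Vx falls, so the pull-up switches and no keeper fights it.
stage2_up = stage_delay(p=F(5, 3), g=F(4, 3), f=F(3, 4))

# Stage 2, case 2: Vx stays high, so the pull-down fights the PMOS keeper.
# The solution models this as an equivalent inverter with NMOS 0.5 and PMOS 1.
equiv_width = F(1, 2) + F(1)
stage2_down = stage_delay(p=F(5) / equiv_width, g=F(4) / equiv_width, f=F(3, 4))

t_setup = stage1 + stage2_up              # part (c)
t_pulse_min = max(t_setup, stage2_down)   # part (d): worst of the two cases

print("tsetup     =", t_setup, "tp0")
print("tpulse,min =", t_pulse_min, "tp0")
```

Taking the maximum over the two cases mirrors the worst-case argument in part (d): the pulse has to be wide enough for whichever transition is slower.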
