UNIVERSITY OF CALIFORNIA
College of Engineering
Department of Electrical Engineering and Computer Sciences
Last modified on April 21, 2006 by Seng Oon Toh (sengoon@eecs)
Borivoje Nikolić
EE 141
Homework #9: Activity Factor and Sequential Circuits
Problem #1
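[Figure: Circuit A, a chain of gates G1 through G6 with inputs A, B, C, and D driving a 50 fF load; Circuit B, gates G1 and G2 with inputs A, B, C, and D driving a 50 fF load.]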
Let’s evaluate the effect of logic choice on power dissipation. Circuit A is an AND4
implemented as a NAND2 chain while Circuit B implements the same function using a
4-input gate.
a) Size circuit A and circuit B for minimum delay with Cin of 3 fF and Cout of 50 fF.
Report your answer in terms of the input capacitance seen at each gate.
Circuit A
H = (50/3) * (4/3)^3 = 39.51
h = H^(1/6) = 1.845
Gate    Input capacitance
G1      3 fF
G2      4.15 fF
G3      7.66 fF
G4      10.60 fF
G5      19.55 fF
G6      27.06 fF
Circuit B
H = (50/3) * (6/3) = 33.33
h = H^(1/2) = 5.77
Gate    Input capacitance
G1      3 fF
G2      8.66 fF
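The sizing in part (a) can be reproduced with a short script. This is a minimal sketch, assuming no branching along the path and the logical efforts used above (4/3 for NAND2, 6/3 for NAND4, 1 for an inverter); the function name `size_chain` is illustrative and not part of the original solution.

```python
def size_chain(logical_efforts, c_in, c_out):
    """Return (stage effort h, input capacitance of each gate) for a
    minimum-delay chain with no branching."""
    path_logical_effort = 1.0
    for g in logical_efforts:
        path_logical_effort *= g
    # Path effort H = G * (Cout / Cin)
    path_effort = path_logical_effort * c_out / c_in
    # Optimal stage effort h = H^(1/N)
    h = path_effort ** (1.0 / len(logical_efforts))
    caps = [c_in]
    for g in logical_efforts[:-1]:
        # Each stage has electrical fan-out h / g, so the next gate's
        # input capacitance is the current one times h / g.
        caps.append(caps[-1] * h / g)
    return h, caps


# Circuit A: NAND2 stages (g = 4/3) alternating with inverters (g = 1)
h_a, caps_a = size_chain([4 / 3, 1, 4 / 3, 1, 4 / 3, 1], 3.0, 50.0)

# Circuit B: one NAND4 (g = 6/3) followed by an inverter
h_b, caps_b = size_chain([6 / 3, 1], 3.0, 50.0)

print(round(h_a, 3), [round(c, 2) for c in caps_a])
print(round(h_b, 3), [round(c, 2) for c in caps_b])
```

Because the script carries full precision from stage to stage, the last two capacitances of Circuit A come out a few hundredths of a femtofarad away from the tabulated values, presumably because the hand calculation rounds at each stage.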
b) Given that P(A=0) = P(B=0) = P(C=0) = P(D=0) = 0.25, calculate the probability
of energy consuming transitions P(0 → 1) at the outputs of the gates in both
circuits.
Circuit A
P(0→1) = P(0) * P(1) = (1-P(0)) * P(0)
NAND2 Truth Table with Probabilities
In1        In2        Out
0 (1/4)    0 (1/4)    1 (1/16)
0 (1/4)    1 (3/4)    1 (3/16)
1 (3/4)    0 (1/4)    1 (3/16)
1 (3/4)    1 (3/4)    0 (9/16)
For G3, P(In1=1) = P(G2=1) = 9/16, while P(In2=1) = P(C=1) = 3/4.
Therefore P(G3=0) = 9/16 * 3/4 = 27/64 and P(G3=1) = 1 - P(G3=0) = 37/64.
For G5, P(In1=1) = P(G4=1) = 27/64, while P(In2=1) = P(D=1) = 3/4.
Therefore P(G5=0) = 27/64 * 3/4 = 81/256.
Gate    P(0)       P(1)       P(0→1)
G1      9/16       7/16       63/256
G2      7/16       9/16       63/256
G3      27/64      37/64      999/4096
G4      37/64      27/64      999/4096
G5      81/256     175/256    14175/65536
G6      175/256    81/256     14175/65536
Circuit B
P(NAND4=0) = P(In1=1, In2=1, In3=1, In4=1) = (3/4)^4 = 81/256
Gate    P(0)       P(1)       P(0→1)
G1      81/256     175/256    14175/65536
G2      175/256    81/256     14175/65536
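The probability tables for both circuits can be checked with exact fractions. This is a sketch that assumes the gate inputs are statistically independent, as the solution does; the helper names `nand_p0` and `alpha` are illustrative.

```python
from fractions import Fraction


def nand_p0(*p_inputs_high):
    """P(output = 0) of a NAND gate with independent inputs:
    the output is 0 only when every input is 1."""
    p = Fraction(1)
    for p_high in p_inputs_high:
        p *= p_high
    return p


def alpha(p0):
    """Probability of an energy-consuming 0 -> 1 transition: P(0) * P(1)."""
    return p0 * (1 - p0)


p_in = Fraction(3, 4)  # P(input = 1) = 1 - P(input = 0) = 1 - 1/4

# Circuit A: G1, G3, G5 are NAND2 gates; G2, G4, G6 are inverters.
p0_a = {}
p0_a["G1"] = nand_p0(p_in, p_in)
p0_a["G2"] = 1 - p0_a["G1"]                 # inverter swaps P(0) and P(1)
p0_a["G3"] = nand_p0(1 - p0_a["G2"], p_in)  # inputs: G2 and C
p0_a["G4"] = 1 - p0_a["G3"]
p0_a["G5"] = nand_p0(1 - p0_a["G4"], p_in)  # inputs: G4 and D
p0_a["G6"] = 1 - p0_a["G5"]

# Circuit B: G1 is a NAND4, G2 is an inverter.
p0_b = {"G1": nand_p0(p_in, p_in, p_in, p_in)}
p0_b["G2"] = 1 - p0_b["G1"]

for name, p0 in p0_a.items():
    print("A", name, p0, 1 - p0, alpha(p0))
for name, p0 in p0_b.items():
    print("B", name, p0, 1 - p0, alpha(p0))
```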
c) Assuming both circuits are operating at 500 MHz, calculate the dynamic power
consumption of the circuit. Assume γ = 1, Vdd = 2.5V, and only consider
capacitances at the inputs and output of the gates.
Circuit A
At primary inputs, P(0→1)=1/4*3/4=3/16
For a NAND2 gate, the capacitance at the output node is 6γ/4 times the capacitance
at each input. There are two 2x PMOS devices and one 2x NMOS device at the output
node (6 units), while the capacitance at each input is 4 units. That gives us 6/4. We
need to multiply by γ to adjust it for the technology. A quick way to figure it out is
pγ/g for simpler logic gates like NANDs and NORs. It might not be accurate for AOI
(and-or-invert) gates since the topology is not unique.
Pdyn = Vdd^2 * f * ((2*3 + 7.66 + 19.55) fF * 3/16 + (4.15 + 3*3γ/2) fF * αG1 +
(7.66 + 4.15γ) fF * αG2 + (10.6 + 7.66*3γ/2) fF * αG3 + (19.55 + 10.6γ) fF * αG4 +
(27.06 + 19.55*3γ/2) fF * αG5 + (50 + 27.06γ) fF * αG6)
= 165 μW
Circuit B
For a NAND4 gate, the capacitance at the output node is 12γ/6 times the capacitance
at each input.
Pdyn = Vdd^2 * f * (4*3 fF * 3/16 + (3*12γ/6 + 8.66) fF * αG1 + (8.66 + 50) fF * αG2)
= 56.6 μW
Shorter logic depth reduces power consumption.
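The two power figures can be re-derived numerically. The sketch below hard-codes the rounded capacitances and activity factors from parts (a) and (b) and sums capacitance times activity per node; the list layout and the function name `dynamic_power_uw` are assumptions made for illustration.

```python
VDD = 2.5        # supply voltage in volts
F_CLK = 500e6    # clock frequency in hertz
GAMMA = 1.0      # ratio of self-loading to input capacitance

A_IN = 3 / 16          # primary inputs: P(0) * P(1) = 1/4 * 3/4
A_1 = 63 / 256         # G1 and G2 of Circuit A
A_3 = 999 / 4096       # G3 and G4 of Circuit A
A_5 = 14175 / 65536    # G5 and G6 of Circuit A, both gates of Circuit B

# Each entry is (switched capacitance in fF, activity factor).
circuit_a = [
    (2 * 3 + 7.66 + 19.55, A_IN),          # inputs A, B into G1; C into G3; D into G5
    (4.15 + 3 * 1.5 * GAMMA, A_1),         # G1 output: G2 input + NAND2 self-load
    (7.66 + 4.15 * GAMMA, A_1),            # G2 output: G3 input + inverter self-load
    (10.60 + 7.66 * 1.5 * GAMMA, A_3),     # G3 output
    (19.55 + 10.60 * GAMMA, A_3),          # G4 output
    (27.06 + 19.55 * 1.5 * GAMMA, A_5),    # G5 output
    (50 + 27.06 * GAMMA, A_5),             # G6 output: 50 fF load + self-load
]
circuit_b = [
    (4 * 3, A_IN),                         # four primary inputs into the NAND4
    (8.66 + 3 * 2 * GAMMA, A_5),           # G1 output: G2 input + NAND4 self-load (12/6)
    (50 + 8.66 * GAMMA, A_5),              # G2 output: 50 fF load + inverter self-load
]


def dynamic_power_uw(nodes):
    """Dynamic power in microwatts: Vdd^2 * f * sum(C_i * alpha_i)."""
    switched_ff = sum(c * a for c, a in nodes)
    return VDD ** 2 * F_CLK * switched_ff * 1e-15 * 1e6


print(round(dynamic_power_uw(circuit_a), 1))
print(round(dynamic_power_uw(circuit_b), 1))
```

Keeping `GAMMA` as a named constant makes it easy to see how the comparison between the two circuits shifts for a technology with a different self-loading ratio.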
Problem #2
Consider the simple state machine shown above. A, B, and C represent combinational
logic blocks with the following properties:
tminA = 200 psec; tmaxA = 1 nsec;
tminB = 300 psec; tmaxB = 2 nsec;
tminC = 100 psec; tmaxC = 0.5 nsec;
The L-units represent positive latches driven by the clock. Each latch has a setup time
of 150 psec and a delay of 250 psec (td-q, when the latch is transparent). tc-q is
100 psec and thold is 100 psec. The clock has a period T and is high for a duration of
Ton. The duty cycle of the clock hence equals 100 Ton/T %.
a) Determine the conditions on the clock necessary to avoid the occurrence of races.
Race conditions occur when clock pulses are so short that input transitions meant to
be latched in the next clock cycle “race” through fast-paths and appear at the output
of the pipeline stage, causing data to arrive ahead of schedule. In the case of latches,
what we are concerned about is the period where the latches are transparent. Race
conditions put an upper bound on the transparent period (Ton). The condition to
prevent race conditions is hence
Ton + thold ≤ tc-q + tminA + tminC
Ton ≤ 100ps + 200ps + 100ps - 100ps = 300ps
This requirement stems from the fact that we would like our fast path logic to reach
the receiving latch after the hold time. This way, there will never be any spurious
transitions within the t-setup to t-hold regions of the falling edge, thus ensuring
proper values being latched.
The lower bound on the transparent period is Ton,min = tc-q.
b) Determine the absolute minimum clock period for this circuit to work correctly as
well as the maximum duty cycle.
There are two cases we should consider. First is if the signal happens to be launched
while the latches are transparent and keeps on looping around the loop while the
latches are transparent. This is the ideal case since latches are meant to be traffic
lights and it would be best if we saw green all the time. For this case,
Tclk,min = td-q + tlogic,max = 2.75 ns (assuming that the latches stay transparent for
a long enough period)
The other case is if the signal arrives when the latch is latched. In this case we need to
meet tsetup requirements. For this case,
Tclk,min = tc-q + tlogic,max + tsetup = 2.75 ns
We would take the max of the two but they are the same in this case.
The maximum duty cycle for this case is 100*300/2750 % = 11 %
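The numbers in parts (a) and (b) follow from a few additions, sketched below. The worst-case logic delay of 2500 ps is an assumption: it is the value implied by the 2.75 ns result (it equals tmaxB + tmaxC), since the state-machine figure is not reproduced here. All function names are illustrative.

```python
# All times in picoseconds, taken from the problem statement.
T_CQ = 100       # clock-to-Q delay
T_DQ = 250       # D-to-Q delay while the latch is transparent
T_SETUP = 150
T_HOLD = 100
T_MIN_A = 200
T_MIN_C = 100
T_LOGIC_MAX = 2500  # assumption: implied by the 2.75 ns result (tmaxB + tmaxC)


def max_t_on():
    """Longest transparent phase that avoids a race through the fast path:
    Ton + thold <= tc-q + tminA + tminC."""
    return T_CQ + T_MIN_A + T_MIN_C - T_HOLD


def min_period():
    """Minimum clock period: the larger of the transparent case and the
    latched case."""
    transparent_case = T_DQ + T_LOGIC_MAX
    latched_case = T_CQ + T_LOGIC_MAX + T_SETUP
    return max(transparent_case, latched_case)


def max_duty_cycle_percent():
    return 100 * max_t_on() / min_period()


print(max_t_on(), min_period(), round(max_duty_cycle_percent(), 1))
```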
c) Suppose that due to some sloppy clock-network routing, the clock signal at L1
arrives 100ps earlier than the clock signal at L2. Calculate the absolute minimum
clock period for this circuit to work properly as well as the maximum duty cycle.
This is a case of positive clock skew between L1 and L2. However, since L1 is not in
the critical path of the circuit, the minimum clock period remains the same as in part (b).
Ton is affected by this positive skew, however, because the input is now launched
100ps earlier into the stage. The new constraint on Ton is:
Ton ≤ tc-q + tminA + tminC - thold - δ = 100ps + 200ps + 100ps - 100ps - 100ps = 200ps
Hence, the maximum duty cycle is reduced to 100*200/2750 % = 7.3 %
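The effect of skew on the race constraint can be checked the same way. This sketch takes the 2750 ps period from part (b) as given and treats the skew δ as a parameter; the function name `max_t_on_ps` is illustrative.

```python
def max_t_on_ps(t_cq, t_min_logic, t_hold, skew):
    """Upper bound on the transparent period with clock skew:
    Ton <= tc-q + tmin,logic - thold - skew (all in picoseconds)."""
    return t_cq + t_min_logic - t_hold - skew


T_PERIOD = 2750  # ps, minimum clock period from part (b)

# Fast path is tminA + tminC = 200 ps + 100 ps; skew is 100 ps.
t_on_skewed = max_t_on_ps(t_cq=100, t_min_logic=200 + 100, t_hold=100, skew=100)
duty_skewed = 100 * t_on_skewed / T_PERIOD

print(t_on_skewed, round(duty_skewed, 1))
```

Each picosecond of positive skew removes one picosecond from the allowed transparent time, so the duty cycle bound shrinks linearly with δ.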
Problem #3
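[Figure: schematic of the gate with clock Clk, inputs A, B, and C, internal node Vx, and output Out. Relative transistor sizes are annotated; one inverter is labeled PMOS: 2, NMOS: 1.]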
The circuit above is a Limited Switch Dynamic Logic (LSDL) NOR3 gate. LSDL is a
circuit family used in high performance datapaths. It is essentially a domino gate
followed by a latch. The relative sizing of the gates has been annotated on the schematic.
a) What is the purpose of the shaded transistors?
They function just like keepers to maintain a static output. Without these transistors,
the input of the inverter would be floating during the pre-charge phase and highly
susceptible to noise injection.
b) Assuming no propagation delay, complete the following ideal timing diagram.
[Timing diagram to complete: waveforms for Clk, A, B, C, Vx, and Out.]
c) Calculate tsetup of the gate (tsetup is defined as the minimum period of time where
inputs have to be stable before the clock transitions into the latched state in order
for the correct value to be latched) in terms of intrinsic delay tp0. Assume γ=1.
The circuit can be divided into three stages: the first stage looks like a domino gate,
the second stage is the inverter with keepers, and the third stage is the output
inverter. For tsetup we only need to consider the propagation delay through the first
two stages. Once the output of the second stage is set, the clock can safely transition
without the risk of losing the latched data.
Using the method of Logical Effort on the evaluate transition (pull-down) (Refer to
HW8 solutions for details):
g1 = 2/3
p1 = 3/3 = 1
f1 = 4/2 = 2
For the second stage, since the evaluate transition will cause the input to this stage to
transition H-L, the network that we care about is the pull-up. We do not need to
wrestle with any keepers in this case because the NMOS pull-down network is deactivated during this phase. The PMOS keeper is also off in this case. The logical
effort is therefore straightforward.
g2 = 4/3
p2 = 5/3
f2 = 3/4
tsetup = tp0[p1 + f1g1 + p2 + f2g2] = 5 tp0
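The tsetup sum can be verified with exact fractions. A minimal sketch, using the g, p, and f values listed above and expressing delay in units of tp0; the helper name `stage_delay` is illustrative.

```python
from fractions import Fraction


def stage_delay(p, g, f):
    """Logical-effort stage delay in units of tp0: parasitic delay plus
    logical effort times electrical fan-out."""
    return p + g * f


# Stage 1: domino-style pull-down (evaluate transition)
stage1 = stage_delay(p=Fraction(1), g=Fraction(2, 3), f=Fraction(2))

# Stage 2: pull-up of the inverter with keepers (keepers inactive)
stage2 = stage_delay(p=Fraction(5, 3), g=Fraction(4, 3), f=Fraction(3, 4))

t_setup = stage1 + stage2  # in units of tp0
print(stage1, stage2, t_setup)
```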
d) What would be the minimum positive pulse-width of Clk (in terms of tp0) to
ensure that a correct value is latched?
We need to consider two cases.
Case 1 : Vx is discharged to Gnd
This case is exactly the same as the case for evaluating tsetup and the answer is 5 tp0.
Case 2 : Vx remains at Vdd
Now the evaluate phase goes through the pull-down network of the second stage.
Since the PMOS keeper is on in this case, we need to convert our circuit to an
equivalent form with only one NMOS pull-down transistor. Referring to the method
in HW8 which is to assume that the PMOS keeper effectively reduces the pull-down
strength of the NMOS network by an equivalent NMOS transistor, the equivalent
minimum sized inverter has a NMOS of size 0.5 and a PMOS of size 1.
g2 = 4/1.5 = 8/3
p2 = 5/1.5 = 10/3
f2 = 3/4
tpulse,min = tp0(p2 + f2*g2) = (16/3) tp0
The worst case is therefore case 2 and tpulse,min = (16/3) tp0
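The comparison between the two cases can be sketched as follows. The keeper is modeled, as in the text, by an equivalent inverter of total width 1.5 (NMOS 0.5 plus PMOS 1), which scales g2 and p2; the variable names are illustrative and delays are in units of tp0.

```python
from fractions import Fraction

# Case 1: Vx discharges, so the pulse must cover the same path as tsetup.
case1 = Fraction(5)

# Case 2: Vx stays high and the second stage pulls down against the PMOS keeper.
equiv_width = Fraction(1, 2) + Fraction(1)  # equivalent NMOS 0.5 + PMOS 1
g2 = Fraction(4) / equiv_width              # 8/3
p2 = Fraction(5) / equiv_width              # 10/3
f2 = Fraction(3, 4)
case2 = p2 + f2 * g2

t_pulse_min = max(case1, case2)
print(case1, case2, t_pulse_min)
```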
This value is particularly important in this logic style because the evaluate phases are
usually implemented as pulses generated using NAND gates driven from the same
clock but with one input delayed using inverters. We need to ensure that these pulses
are sufficiently wide across process variation.
e) Assuming that A=B=C=1, compare the activity factor at node Out to the activity
factor at node Vx which would be the output of a standard domino logic gate.
What can you infer about the dynamic power consumption of static gates being
driven by this gate compared to domino logic?
Vx switches every clock period even if the inputs are constant, while Out remains
constant. Gates driven by domino logic therefore have a higher activity factor than
gates driven by LSDL and consume more dynamic power. The trade-off is a higher
clock load in LSDL gates, because Clk needs to drive three gates instead of 2 or 1 in
the case of footed or footless domino respectively.