COMBINATIONAL LOGIC DYNAMICS
[Adapted from Rabaey's Digital Integrated Circuits, ©2002, J. Rabaey et al.]
EE415 VLSI Design

Fast Complex Gates: Design Technique 1
• Transistor sizing
  » as long as fan-out capacitance dominates
• Progressive sizing
  » [Figure: series stack M1 … MN driven by In1 … InN, with internal node capacitances C1, C2, C3 and load CL at the output; the stack behaves as a distributed RC line]
  » M1 > M2 > M3 > … > MN (the FET closest to the output is the smallest)
  » Can reduce delay by more than 20%

Fast Complex Gates: Design Technique 2
• Transistor ordering
  » [Figure, left: the critical-path input In1 (0→1) drives M1, the transistor farthest from the output; In2 = In3 = 1; CL, C2 and C1 are charged] Delay is determined by the time to discharge CL, C1 and C2.
  » [Figure, right: the critical-path input In1 (0→1) drives M3, the transistor closest to the output; In2 = In3 = 1; CL is charged, C2 and C1 are already discharged] Delay is determined by the time to discharge CL only.

Fast Complex Gates: Design Technique 3
• Alternative logic structures
  » F = ABCDEFGH
  » [Figure: alternative gate-level implementations of F]

Fast Complex Gates: Design Technique 4
• Isolating fan-in from fan-out using buffer insertion
  » [Figure: complex gate driving CL directly, and the same gate driving CL through a buffer]

Fast Complex Gates: Design Technique 5
• Reducing the voltage swing
  » $t_{pHL} = 0.69 \cdot \frac{3}{4} \cdot \frac{C_L V_{DD}}{I_{DSATn}} = 0.69 \cdot \frac{3}{4} \cdot \frac{C_L V_{swing}}{I_{DSATn}}$
  » linear reduction in delay
  » also reduces power consumption
• But the following gate is much slower!
• Or it requires "sense amplifiers" to restore the signal level (memory design)

Sizing Logic Paths for Speed
• Frequently, the input capacitance of a logic path is constrained
• The logic also has to drive some capacitance
• Example: the ALU load in an Intel microprocessor is 0.5 pF
• How do we size the ALU datapath to achieve maximum speed?
• We have already solved this for the inverter chain. Can we generalize it to any type of logic?

Buffer Example
[Figure: chain of N inverters, stages 1, 2, …, N, from In to Out, driving CL]
$\text{Delay} = \sum_{i=1}^{N} (p_i + g_i f_i)$ (in units of $t_{inv}$)
• For given N: $C_{i+1}/C_i = C_i/C_{i-1}$
• To find N: $C_{i+1}/C_i \approx 4$
• How to generalize this to any logic path?
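The buffer-chain rule above (equal capacitance ratio between successive stages, with a ratio of about 4) can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `size_inverter_chain`, the default ratio of 4, and the capacitance values are assumptions, and the delay uses g = p = 1 for every inverter, in units of t_inv.

```python
import math

def size_inverter_chain(c_in, c_load, target_fanout=4.0):
    """Size a tapered inverter chain between c_in and c_load.

    Assumes every stage is an inverter with g = 1 and p = 1, so the
    delay (in units of t_inv) is the sum of p_i + g_i * f_i.
    """
    path_fanout = c_load / c_in                      # overall Cout/Cin
    # Number of stages so that the per-stage ratio is close to target_fanout.
    n = max(1, round(math.log(path_fanout) / math.log(target_fanout)))
    f = path_fanout ** (1.0 / n)                     # equal ratio C_{i+1}/C_i
    caps = [c_in * f ** i for i in range(n)]         # input capacitance per stage
    delay = n * (1.0 + f)                            # sum of (p_i + g_i f_i)
    return n, f, caps, delay

# Hypothetical numbers: the load is 64 times the input capacitance.
n, f, caps, delay = size_inverter_chain(c_in=1.0, c_load=64.0)
print(n, round(f, 3), [round(c, 2) for c in caps], round(delay, 2))
```

With a load 64 times the input capacitance, the sketch picks three stages with a ratio of 4 between neighbours, which matches the rule of thumb on the slide.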

Logical Effort
$\text{Delay} = k\, R_{unit} C_{unit} \left(1 + \frac{C_L}{\gamma C_{in}}\right) = \tau\,(p + g f)$
• p: intrinsic delay ($3kR_{unit}C_{unit}$), a gate parameter that is not a function of W
• g: logical effort ($kR_{unit}C_{unit}$), a gate parameter that is not a function of W
• f: effective fanout
• Normalize everything to an inverter: $g_{inv} = 1$, $p_{inv} = 1$
• Divide everything by $t_{inv}$ (everything is measured in unit delays $t_{inv}$)
• Assume $\gamma = 1$

Delay in a Logic Gate
• Gate delay: d = h + p (effort delay plus intrinsic delay)
• Effort delay: h = g f (logical effort times effective fanout, where f = Cout/Cin)
• Logical effort is a function of topology, independent of sizing
• Effective fanout (electrical effort) is a function of load/gate size

Logical Effort
• The inverter has the smallest logical effort and intrinsic delay of all static CMOS gates
• The logical effort of a gate is the ratio of its input capacitance to the inverter input capacitance when both are sized to deliver the same current
• Logical effort increases with gate complexity

Intrinsic Delay
• The inverter has the smallest intrinsic delay of all static CMOS gates
• The intrinsic delay of a gate is the ratio of its output capacitance to the inverter output capacitance when both are sized to deliver the same current
• Intrinsic delay increases with gate complexity

Logical Effort
Logical effort is the ratio of the input capacitance of a gate to the input capacitance of an inverter with the same output current.
[Figure: transistor-level schematics with relative device widths]
• Inverter (PMOS 2, NMOS 1): g = p = 1
• 2-input NAND (PMOS 2 and 2, NMOS 2 and 2): g = 4/3, p = 2
• 2-input NOR (PMOS 4 and 4, NMOS 1 and 1): g = 5/3, p = 2

Logical Effort of Gates
[Plot: normalized delay d versus fan-out f, for f = 1 … 7; the intercept of each line is the intrinsic delay and the slope is the logical effort]
• d = h + p, h = g f
• g: logical effort, f: effective fanout, p: intrinsic delay
• 2-input NAND: g = 4/3, p = 2, d = (4/3) f + 2
• Inverter: g = 1, p = 1, d = f + 1

Add Branching Effort
Branching effort: $b = \frac{C_{on\text{-}path} + C_{off\text{-}path}}{C_{on\text{-}path}}$
where $C_{off\text{-}path}$ is the branch capacitance.

Multistage Networks
$\text{Delay} = \sum_{i=1}^{N} (p_i + g_i f_i)$
• Stage effort: $h_i = g_i f_i$
• Path electrical effort: $F = C_{out}/C_{in}$
• Path logical effort: $G = g_1 g_2 \cdots g_N$
• Branching effort: $B = b_1 b_2 \cdots b_N$
• Path effort: $H = GFB$
• Path delay: $D = \sum d_i = \sum p_i + \sum h_i$

Optimum Effort per Stage
• When each stage bears the same effort: $h^N = H$, so $h = \sqrt[N]{H}$
• Stage efforts: $g_1 f_1 = g_2 f_2 = \cdots = g_N f_N$
• Effective fanout of each stage: $f_i = h / g_i$
• Minimum path delay: $\hat{D} = \sum (g_i f_i + p_i) = N H^{1/N} + P$

Logical Effort
[Table: logical effort of common gates, from Sutherland, Sproull]

Example: 8-input AND
[Figure: three implementations of an 8-input AND, annotated with the logical effort and intrinsic delay of each stage]
• (g = 10/3, p = 8) followed by (g = 1, p = 1)
• (g = 2, p = 4) followed by (g = 5/3, p = 2)
• (g = 4/3, p = 2), (g = 5/3, p = 2), (g = 4/3, p = 2), (g = 1, p = 1)
• Fan-out is not known here

Example: Optimize Path
[Figure: four-stage path with gate sizes 1, a, b, c driving an output load of 5; the input load is 1]
• $g_1 = 1$, $g_2 = 5/3$, $g_3 = 5/3$, $g_4 = 1$
• $f_1 = a g_2 / g_1$, $f_2 = b g_3 / (a g_2)$, $f_3 = c g_4 / (b g_3)$, $f_4 = 5 / (c g_4)$
• Stage fan-out is $f_i$; a, b, c are scale factors comparing a gate size to the minimum-size gate with the same speed as the inverter
• Path electrical effort: $F = C_{out}/C_{in}$; path logical effort: $G = g_1 g_2 \cdots g_N$; path effort: $H = GFB$; stage effort: $h_i = g_i f_i$
• Find G, H, h, a, b, c

Solution:
• Effective fanout: F = 5
• G = 25/9
• H = 125/9 = 13.9 (there is no branching here, so B = 1 and H = GFB)
• h = 1.93 (the optimum effort for each gate, $h = H^{1/4}$)
• a = h/$g_2$ = 1.16 (from $h = f_1 g_1 = 1.93$ and $f_1 = a g_2 / g_1$)
• b = h a/$g_3$ = 1.34 (from $h = f_2 g_2 = 1.93$ and $f_2 = b g_3 / (a g_2)$)
• c = h b/$g_4$ = 2.59 (from $h = f_3 g_3 = 1.93$ and $f_3 = c g_4 / (b g_3)$)

Method of Logical Effort
• Compute the path effort: H = GBF
• Find the best number of stages: $N \approx \log_4 H$
• Compute the stage effort: $h = H^{1/N}$
• Sketch the path with this number of stages
• Work from either end to find the sizes: $C_{in} = C_{out} \cdot g / h$

Reference: Sutherland, Sproull, Harris, "Logical Effort", Morgan-Kaufmann 1999.
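The sizing procedure can be checked numerically against the four-stage example (g = 1, 5/3, 5/3, 1, input load 1, output load 5). The sketch below is an illustration only: the function name `optimize_path` and its arguments are made up for this note, and capacitances are expressed in units of the minimum inverter input capacitance. It works backward from the load using Cin = Cout·g/h and then divides each input capacitance by g to recover the scale factors a, b, c.

```python
from math import prod

def optimize_path(g, c_in, c_out, branching=None):
    """Equal-stage-effort sizing for a fixed number of stages.

    g         : logical effort of each stage, input side first
    c_in      : input capacitance of the first stage
    c_out     : load capacitance on the last stage
    branching : branching effort of each stage (defaults to 1, no branching)
    Returns (H, h, scale) where scale[i] = Cin_i / g_i.
    """
    n = len(g)
    branching = branching or [1.0] * n
    H = prod(g) * (c_out / c_in) * prod(branching)   # path effort H = G F B
    h = H ** (1.0 / n)                               # optimum stage effort
    caps = [0.0] * n
    c_next = c_out
    for i in reversed(range(n)):                     # work from the load backward
        caps[i] = c_next * branching[i] * g[i] / h   # Cin = Cout * g / h
        c_next = caps[i]
    scale = [c / gi for c, gi in zip(caps, g)]       # size relative to minimum gate
    return H, h, scale

H, h, scale = optimize_path([1, 5/3, 5/3, 1], c_in=1.0, c_out=5.0)
print(round(H, 1), round(h, 2), [round(s, 2) for s in scale])
```

The first scale factor comes out as 1, which confirms that the computed sizes are consistent with the given input capacitance; the remaining three correspond to a, b and c.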

Ratio Based Logic
[Figure: three ratioed gates, each with inputs In1, In2, In3 driving a pull-down network (PDN) between output F and VSS: (a) resistive load RL to VDD, (b) depletion-load NMOS (VT < 0), (c) pseudo-NMOS (PMOS load)]
Goal: to reduce the number of devices over complementary CMOS

Ratio Based Logic
[Figure: resistive load RL between VDD and F, PDN between F and VSS]
• N transistors + load
• $V_{OH} = V_{DD}$
• $V_{OL} = \frac{R_{PDN}}{R_{PDN} + R_L} V_{DD}$
• Asymmetrical response
• Static power consumption
• $t_{pLH} = 0.69\, R_L C_L$
• $t_{pHL} = 0.69\, (R_L \parallel R_{PDN})\, C_L$

Ratio Based Logic Problems
Problems with the resistive load:
• $I_L = (V_{DD} - V_{out}) / R_L$
• Charging current drops rapidly once Vout starts to rise
Solution: use a current source!
• Available current is independent of voltage
• Reduces $t_{pLH}$ by 25%

Active Loads
[Figure: depletion-load NMOS gate (VT < 0) and pseudo-NMOS gate (PMOS load), each with inputs In1, In2, In3 driving a PDN between F and VSS]

Active Loads
Depletion-mode NMOS load
• $V_{GS} = 0$
• $I_L \approx (k_{n,load}/2)\,|V_{Tn}|^2$
• Deviates from an ideal current source:
  » Channel length modulation
  » Body effect: $V_{SB}$ varies with Vout, which reduces $|V_{Tn}|$, so $I_L$ gets smaller for increasing Vout

Active Loads
Pseudo-NMOS load
• No body effect, $V_{SB} = 0$ V
• $V_{GS} = -V_{DD}$, higher load current
• $I_L = (k_p/2)\,(V_{DD} - |V_{Tp}|)^2$
• The larger $V_{GS}$ causes the pseudo-NMOS load to leave saturation sooner than the depletion NMOS load

Load Lines of Ratioed Gates
[Plot: normalized load current IL (0 to 1) versus Vout (0 to 5 V) for a current source, a pseudo-NMOS load, a depletion load and a resistive load]

Pseudo-NMOS
[Figure: grounded-gate PMOS load to VDD, NMOS pull-down network with inputs A, B, C, D, output F driving CL]
• $V_{OH} = V_{DD}$ (similar to complementary CMOS)
• $k_n \left( (V_{DD} - V_{Tn})\, V_{OL} - \frac{V_{OL}^2}{2} \right) = \frac{k_p}{2} \left( V_{DD} - |V_{Tp}| \right)^2$
• $V_{OL} = (V_{DD} - V_T) \left[ 1 - \sqrt{1 - \frac{k_p}{k_n}} \right]$ (assuming that $V_T = V_{Tn} = |V_{Tp}|$)
SMALLER AREA & LOAD BUT STATIC POWER DISSIPATION!!!
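As a quick numerical check of the pseudo-NMOS output-low level, the closed form V_OL = (V_DD − V_T)[1 − sqrt(1 − k_p/k_n)] can be substituted back into the current-balance equation it was derived from. The sketch below assumes V_T = V_Tn = |V_Tp|, as the slide does; the function names and the device values (2.5 V supply, 0.4 V threshold, k_p/k_n = 0.25) are hypothetical and chosen only for illustration.

```python
import math

def pseudo_nmos_vol(v_dd, v_t, k_p, k_n):
    """Closed-form V_OL, assuming V_T = V_Tn = |V_Tp| and k_p < k_n."""
    return (v_dd - v_t) * (1.0 - math.sqrt(1.0 - k_p / k_n))

def current_balance_residual(v_ol, v_dd, v_t, k_p, k_n):
    """Pull-down (linear NMOS) current minus load (saturated PMOS) current."""
    i_pdn = k_n * ((v_dd - v_t) * v_ol - v_ol ** 2 / 2.0)
    i_load = (k_p / 2.0) * (v_dd - v_t) ** 2
    return i_pdn - i_load

# Hypothetical device values, for illustration only.
v_dd, v_t, k_p, k_n = 2.5, 0.4, 0.25e-4, 1.0e-4
v_ol = pseudo_nmos_vol(v_dd, v_t, k_p, k_n)
print(round(v_ol, 3), current_balance_residual(v_ol, v_dd, v_t, k_p, k_n))
```

The residual is zero to within rounding, and a weaker load (smaller k_p/k_n) gives a lower V_OL at the cost of a slower low-to-high transition, which is the ratio trade-off the slide title refers to.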

Pseudo-NMOS VTC
[Plot: Vout (0 to 3.0 V) versus Vin (0 to 2.5 V) for W/Lp = 4, 2, 1, 0.5 and 0.25, with Vin_low and Vin_high marked]
• The low noise margin is significantly reduced compared with CMOS

Pseudo-NMOS NAND Gate
[Figure: layout with VDD, Out and GND]

Improved Loads (1)
[Figure: adaptive load; load transistors M1 (gated by Enable) and M2 to VDD, pull-down network with inputs A, B, C, D, output F driving CL]
• M1 >> M2
• For fast low-to-high transition in standby circuits

Improved Loads (2)
[Figure: cross-coupled PMOS loads M1 and M2 to VDD; complementary pull-down networks PDN1 and PDN2 with inputs A, A', B, B' between the outputs Out, Out' and VSS]
• Differential Cascode Voltage Switch Logic (DCVSL)
• Has no static current
• Requires that each gate generate both Out and its complement

DCVSL Example
[Figure: XOR-NXOR gate with inputs A, A', B, B' and outputs Out, Out']

DCVSL Transient Response
[Plot: voltage (−0.5 to 2.5 V) versus time (0 to 1.0 ns); DCVSL transient response of an AND/NAND gate, with the traces for the inputs A, B and the two outputs]

Pass-Transistor Logic
[Figure: inputs A and B drive a switch network that produces Out]
• N transistors
• No static consumption

Example: AND Gate
[Figure: pass-transistor AND gate with inputs A, B, B' and 0; F = AB]

NMOS-Only Logic
[Figure: NMOS pass transistor (0.5 µm/0.25 µm) from In to node x, followed by an inverter (1.5 µm/0.25 µm and 0.5 µm/0.25 µm) producing Out]
[Plot: voltage (0 to 3.0 V) versus time (0 to 2 ns) for In, x and Out]

NMOS-only Switch
[Figure: pass transistor Mn with gate C = 2.5 V and input A = 2.5 V driving node B with load CL; node B drives an inverter M1/M2]
• $V_B$ does not pull up to 2.5 V, but to 2.5 V − $V_{Tn}$
• The threshold voltage loss causes static power consumption in the following inverter, whose PMOS is not fully turned off
• NMOS has a higher threshold than PMOS (body effect)

Pass-Transistor Logic - Solution 1: Level Restoring Transistor
[Figure: pass transistor Mn from A to node X, gated by B; inverter M1/M2 from X to Out; weak level-restoring PMOS Mr from VDD to X, gated by Out]
• Advantages: full swing, no static power dissipation
• The restorer adds capacitance and takes away pull-down current at X
• Ratio problem

Restorer Transistor Sizing
[Plot: voltage (0 to 3.0 V) versus time (0 to 500 ps) for W/Lr = 1.75/0.25, 1.50/0.25, 1.25/0.25 and 1.0/0.25]
• The level-restoring transistor cannot be too strong, otherwise it will prevent the output from reaching VDD
• Upper limit on restorer size
• The pass-transistor pull-down can have several transistors in the stack

Pass-Transistor Logic - Solution 2: Single Transistor Pass Gate with VT = 0
[Figure: pass gates with VT = 0 passing 0 V, 2.5 V and VDD levels to Out]
• If the pass transistors have VT = 0, the output does not require a level restorer, but there is a leakage current
• WATCH OUT FOR LEAKAGE CURRENTS

Complementary Pass Transistor Logic
[Figure (a): inputs A, A', B, B' drive a pass-transistor network producing F and an inverse pass-transistor network producing F']
[Figure (b): example gates]
• AND/NAND: $F = AB$, $\overline{F} = \overline{AB}$
• OR/NOR: $F = A + B$, $\overline{F} = \overline{A + B}$
• EXOR/NEXOR: $F = A \oplus B$, $\overline{F} = \overline{A \oplus B}$

Pass-Transistor Logic - Solution 3: Transmission Gate
[Figure: transmission gate between A and B with complementary controls C and C', and its symbol; example with C = 2.5 V, C' = 0 V, A = 2.5 V driving node B with load CL]

Resistance of Transmission Gate
[Plot: resistance (0 to 30, in ohms as labelled) versus Vout (0 to 2 V) for Rn, Rp and Rn || Rp, with 2.5 V applied at the input and on the NMOS gate and 0 V on the PMOS gate]

Transmission Gate Based Multiplexer
[Figure: schematic and layout of a two-input multiplexer; inputs A and B (In1, In2), select signals S and S', output F through inverter M1/M2, rails VDD and GND]

Transmission Gate Based XOR
[Figure: inputs A and B, inverter M1/M2, transmission gate M3/M4, output F]

Transmission Gate Full Adder
[Figure: setup stage generating the propagate signal P from A and B; sum generation producing S from P and Ci; carry generation producing Co]
• Similar delays for sum and carry
• 24 transistors

Example: Full Adder
[Figure: transistor-level schematic with inputs A, B, Ci, internal node X, and outputs S and Co]
• $C_o = AB + C_i(A + B)$
• 28 transistors

Revised Adder Circuit
[Figure: transistor-level schematic with Kill, "0"-Propagate, "1"-Propagate and Generate paths, inputs A, B, Ci, outputs S and Co]
• 24 transistors

Delay in Transmission Gate Networks
[Figure (a): chain of n transmission gates with gate voltages 2.5 V and 0 V, nodes V1 … Vn, capacitance C at each node]
[Figure (b): equivalent RC network with resistance Req per gate and capacitance C per node]
[Figure (c): the chain with a buffer inserted every m transmission gates]

Delay Optimization
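
The equivalent RC network for a transmission-gate chain lends itself to a short calculation. The sketch below is not taken from the slides: it assumes the standard Elmore approximation for a chain of n identical sections, t_p ≈ 0.69 · Req · C · n(n+1)/2, and models buffer insertion every m gates by adding a fixed buffer delay t_buf per inserted buffer. The function names and the values Req = 8 kΩ, C = 3.6 fF and t_buf = 0.1 ns are illustrative assumptions.

```python
def chain_delay(n, r_eq, c):
    """Elmore-based delay of n identical transmission gates in series."""
    return 0.69 * r_eq * c * n * (n + 1) / 2

def buffered_delay(n, m, r_eq, c, t_buf):
    """Delay of an n-gate chain with a buffer after every m gates.

    Assumes m divides n, so there are n/m segments and n/m - 1 buffers.
    """
    segments = n // m
    return segments * chain_delay(m, r_eq, c) + (segments - 1) * t_buf

def best_segment_length(n, r_eq, c, t_buf):
    """Try every m that divides n and return the one with the lowest delay."""
    candidates = [m for m in range(1, n + 1) if n % m == 0]
    return min(candidates, key=lambda m: buffered_delay(n, m, r_eq, c, t_buf))

# Illustrative values only (assumptions, not from the slides).
R_EQ, C_NODE, T_BUF = 8e3, 3.6e-15, 0.1e-9
print(chain_delay(16, R_EQ, C_NODE))
print(best_segment_length(16, R_EQ, C_NODE, T_BUF))
```

The unbuffered delay grows quadratically with n, while the buffered delay grows linearly in n for a fixed m, which is why breaking a long chain into short segments pays off once the chain is more than a few gates long.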