Survey
* Your assessment is very important for improving the work of artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the work of artificial intelligence, which forms the content of this project
Lecture 4: Delay Optimization and Logical Effort Outline RC Delay Models Delay Estimation Logical Effort Delay in a Logic Gate Multistage Logic Networks Choosing the Best Number of Stages Examples Summary 5: DC and Transient Response CMOS VLSI Design 4th Ed. 2 Delay Definitions tpdr: rising propagation delay – Max time from input to rising output crossing VDD/2 tpdf: falling propagation delay – Max time from input to falling output crossing VDD/2 tpd: average propagation delay – tpd = (tpdr + tpdf)/2 tr: rise time – From output crossing 0.2 VDD to 0.8 VDD tf: fall time – From output crossing 0.8 VDD to 0.2 VDD 5: DC and Transient Response CMOS VLSI Design 4th Ed. 3 Delay Definitions tcdr: rising contamination delay – Min time from input to rising output crossing VDD/2 tcdf: falling contamination delay – Min time from input to falling output crossing VDD/2 tcd: average contamination delay – tcd = (tcdr + tcdf)/2 5: DC and Transient Response CMOS VLSI Design 4th Ed. 4 Simulated Inverter Delay Solving differential equations by hand is too hard SPICE simulator solves the equations numerically – Uses more accurate I-V models too! But simulations take time to write, may hide insight 2.0 1.5 1.0 (V) Vin tpdf = 66ps tpdr = 83ps Vout 0.5 0.0 0.0 200p 400p 600p 800p 1n t(s) 5: DC and Transient Response CMOS VLSI Design 4th Ed. 5 Delay Estimation We would like to be able to easily estimate delay – Not as accurate as simulation – But easier to ask “What if?” The step response usually looks like a 1st order RC response with a decaying exponential. Use RC delay models to estimate delay – C = total capacitance on output node – Use effective resistance R – So that tpd = RC Characterize transistors by finding their effective R – Depends on average current as gate switches 5: DC and Transient Response CMOS VLSI Design 4th Ed. 6 Effective Resistance Shockley models have limited value – Not accurate enough for modern transistors – Too complicated for much hand analysis Simplification: treat transistor as resistor – Replace Ids(Vds, Vgs) with effective resistance R • Ids = Vds/R – R averaged across switching of digital gate Too inaccurate to predict current at any given time – But good enough to predict RC delay 5: DC and Transient Response CMOS VLSI Design 4th Ed. 7 RC Delay Model Use equivalent circuits for MOS transistors – Ideal switch + capacitance and ON resistance – Unit nMOS has resistance R, capacitance C – Unit pMOS has resistance 2R, capacitance C Capacitance proportional to width Resistance inversely proportional to width d Input capacitance g d k s kC R/k Output capacitance s kC 2R/k g g kC kC d k s s 5: DC and Transient Response kC g kC d CMOS VLSI Design 4th Ed. 8 RC Values Capacitance – C = Cg = Cs = Cd = 2 fF/mm of gate width in 0.6 mm – Gradually decline to 1 fF/mm in nanometer techs. Resistance – R 6 KW*mm in 0.6 mm process – Improves with shorter channel lengths Unit transistors – May refer to minimum contacted device (4/2 l) – Or maybe 1 mm wide device – Doesn’t matter as long as you are consistent 5: DC and Transient Response CMOS VLSI Design 4th Ed. 9 Inverter Delay Estimate Estimate the delay of a fanout-of-1 inverter 2C R A 2 Y 2 1 1 2C 2C 2C 2C Y R C R C C C C d = 6RC 5: DC and Transient Response CMOS VLSI Design 4th Ed. 10 Delay Model Comparison 5: DC and Transient Response CMOS VLSI Design 4th Ed. 11 Example: 3-input NAND Sketch a 3-input NAND with transistor widths chosen to achieve effective rise and fall resistances equal to a unit inverter (R). 2 2 2 3 3 3 5: DC and Transient Response CMOS VLSI Design 4th Ed. 12 3-input NAND Caps Annotate the 3-input NAND gate with gate and diffusion capacitance. 2C 2C 2 2C 5C 2C 2 2C 2 2C 3C 5C 3C 5C 3C 5: DC and Transient Response 2C 2C 2C 3 3 3 CMOS VLSI Design 4th Ed. 9C 3C 3C 3C 3C 13 Example Sketch a 2-input NOR gate with selected transistor widths so that effective rise and fall resistances are equal to a unit inverter’s. Annotate the gate and diffusion capacitances. 5: DC and Transient Response CMOS VLSI Design 4th Ed. 14 Elmore Delay ON transistors look like resistors Pullup or pulldown network modeled as RC ladder Elmore delay of RC ladder t pd Ri to sourceCi nodes i R1C1 R1 R2 C2 ... R1 R2 ... RN CN R1 R2 R3 C1 C2 5: DC and Transient Response RN C3 CMOS VLSI Design 4th Ed. CN 15 Example: 3-input NAND Estimate worst-case rising and falling delay of 3-input NAND driving h identical gates. 2 2 2 Y 3 9C 5hC n2 3C 3 n1 h copies 3 t pdr 9 5h RC 5: DC and Transient Response 3C t pdf 3C R3 3C R3 R3 9 5h C R3 R3 R3 11 5h RC CMOS VLSI Design 4th Ed. 16 Delay Components Delay has two parts – Parasitic delay • 9 or 11 RC • Independent of load – Effort delay • 5h RC • Proportional to load capacitance 5: DC and Transient Response CMOS VLSI Design 4th Ed. 17 Contamination Delay Best-case (contamination) delay can be substantially less than propagation delay. Ex: If all three inputs fall simultaneously 2 2 2 Y 3 9C 5hC n2 3C 3 n1 3 3C tcdr 5: DC and Transient Response R 5 9 5h C 3 h RC 3 3 CMOS VLSI Design 4th Ed. 18 Delay in a Logic Gate Express delays in process-independent unit d d abs Delay has two components: d = f + p 3RC f: effort delay = gh (a.k.a. stage effort) 3 ps in 65 nm process – Again has two components 60 ps in 0.6 mm process g: logical effort – Measures relative ability of gate to deliver current – g 1 for inverter h: electrical effort = Cout / Cin – Ratio of output to input capacitance – Sometimes called fanout p: parasitic delay – Represents delay of gate driving no load – Set by internal parasitic capacitance 5: DC and Transient Response CMOS VLSI Design 4th Ed. 19 Delay Plots =f+p = gh + p What about NOR2? 2-input NAND 6 Normalized Delay: d d Inverter 5 g=1 p=1 d=h+1 4 3 g = 4/3 p=2 d = (4/3)h + 2 Effort Delay: f 2 1 Parasitic Delay: p 0 0 1 2 3 4 5 Electrical Effort: h = Cout / Cin 5: DC and Transient Response CMOS VLSI Design 4th Ed. 20 Computing Logical Effort DEF: Logical effort is the ratio of the input capacitance of a gate to the input capacitance of an inverter delivering the same output current. Measure from delay vs. fanout plots Or estimate by counting transistor widths 2 Y 2 A 2 Y 1 A 2 B 2 Cin = 3 g = 3/3 5: DC and Transient Response A 4 B 4 Cin = 4 g = 4/3 CMOS VLSI Design 4th Ed. Y 1 1 Cin = 5 g = 5/3 21 Catalog of Gates Logical effort of common gates Gate type Number of inputs 1 2 3 4 n NAND 4/3 5/3 6/3 (n+2)/3 NOR 5/3 7/3 9/3 (2n+1)/3 2 2 2 2 4, 4 6, 12, 6 8, 16, 16, 8 Inverter Tristate / mux 1 2 XOR, XNOR 5: DC and Transient Response CMOS VLSI Design 4th Ed. 22 Catalog of Gates Parasitic delay of common gates – In multiples of pinv (1) Gate type Number of inputs 1 2 3 4 n NAND 2 3 4 n NOR 2 3 4 n 4 6 8 2n 4 6 8 Inverter Tristate / mux 1 2 XOR, XNOR 5: DC and Transient Response CMOS VLSI Design 4th Ed. 23 Example: Ring Oscillator Estimate the frequency of an N-stage ring oscillator Logical Effort: Electrical Effort: Parasitic Delay: Stage Delay: Frequency: 5: DC and Transient Response 31 stage ring oscillator in g=1 0.6 mm process has h=1 frequency of ~ 200 MHz p=1 d=2 fosc = 1/(2*N*d) = 1/4N CMOS VLSI Design 4th Ed. 24 Example: FO4 Inverter Estimate the delay of a fanout-of-4 (FO4) inverter d Logical Effort: Electrical Effort: Parasitic Delay: Stage Delay: 5: DC and Transient Response g=1 h=4 p=1 d=5 The FO4 delay is about 300 ps in 0.6 mm process 15 ps in a 65 nm process CMOS VLSI Design 4th Ed. 25 Multistage Logic Networks Logical effort generalizes to multistage networks Path Logical Effort G gi Path Electrical Effort g1 = 1 h1 = x/10 Cin-path F f i gi hi Path Effort 10 H Cout-path x g2 = 5/3 h2 = y/x 5: DC and Transient Response y g3 = 4/3 h3 = z/y z g4 = 1 h4 = 20/z CMOS VLSI Design 4th Ed. 20 26 Multistage Logic Networks Logical effort generalizes to multistage networks Path Logical Effort G gi Path Electrical Effort H Cout path Cin path F f i gi hi Path Effort Can we write F = GH? 5: DC and Transient Response CMOS VLSI Design 4th Ed. 27 Paths that Branch No! Consider paths that branch: 15 G H GH h1 h2 F =1 5 = 90 / 5 = 18 = 18 = (15 +15) / 5 = 6 = 90 / 15 = 6 = g1g2h1h2 = 36 = 2GH 5: DC and Transient Response CMOS VLSI Design 4th Ed. 15 90 90 28 Branching Effort Introduce branching effort – Accounts for branching between stages in path b Con path Coff path Con path B bi Note: h i BH Now we compute the path effort – F = GBH 5: DC and Transient Response CMOS VLSI Design 4th Ed. 29 Multistage Delays Path Effort Delay DF f i Path Parasitic Delay P pi Path Delay D d i DF P 5: DC and Transient Response CMOS VLSI Design 4th Ed. 30 Designing Fast Circuits D d i DF P Delay is smallest when each stage bears same effort fˆ gi hi F 1 N Thus minimum delay of N stage path is 1 N D NF P This is a key result of logical effort – Find fastest possible delay – Doesn’t require calculating gate sizes 5: DC and Transient Response CMOS VLSI Design 4th Ed. 31 Gate Sizes How wide should the gates be for least delay? fˆ gh g CCoutin gi Couti Cini fˆ Working backward, apply capacitance transformation to find input capacitance of each gate given load it drives. Check work by verifying input cap spec is met. 5: DC and Transient Response CMOS VLSI Design 4th Ed. 32 Example: 3-stage path Select gate sizes x and y for least delay from A to B x x A 8 5: DC and Transient Response x CMOS VLSI Design 4th Ed. y 45 y B 45 33 Example: 3-stage path x x A 8 x y 45 y Logical Effort Electrical Effort Branching Effort Path Effort Best Stage Effort Parasitic Delay Delay 5: DC and Transient Response B 45 G = (4/3)*(5/3)*(5/3) = 100/27 H = 45/8 B=3*2=6 F = GBH = 125 fˆ 3 F 5 P=2+3+2=7 D = 3*5 + 7 = 22 = 4.4 FO4 CMOS VLSI Design 4th Ed. 34 Example: 3-stage path Work backward for sizes y = 45 * (5/3) / 5 = 15 x = (15*2) * (5/3) / 5 = 10 x y x A P: 84 N: 4 5: DC and Transient Response 45 45 P: x 4 N: 6 P: y 12 N: 3 CMOS VLSI Design 4th Ed. B B 45 45 35 Best Number of Stages How many stages should a path use? – Minimizing number of stages is not always fastest Example: drive 64-bit datapath with unit inverter Initial Driver 1 1 1 1 8 4 2.8 16 8 D = NF1/N + P = N(64)1/N + N 23 Datapath Load N: f: D: 5: DC and Transient Response 64 1 64 65 CMOS VLSI Design 4th Ed. 64 2 8 18 64 3 4 15 Fastest 64 4 2.8 15.3 36 Derivation Consider adding inverters to end of path – How many give least delay? Logic Block: n1Stages Path Effort F n1 D NF pi N n1 pinv 1 N N - n1 ExtraInverters i 1 1 1 1 D F N ln F N F N pinv 0 N Define best stage effort F 1 N pinv 1 ln 0 5: DC and Transient Response CMOS VLSI Design 4th Ed. 37 Best Stage Effort pinv 1 ln 0 has no closed-form solution Neglecting parasitics (pinv = 0), we find = 2.718 (e) For pinv = 1, solve numerically for = 3.59 5: DC and Transient Response CMOS VLSI Design 4th Ed. 38 Review of Definitions Term Stage Path number of stages 1 N logical effort g G gi electrical effort h CCoutin H branching effort b effort f gh F GBH effort delay f DF f i parasitic delay p P pi delay d f p 5: DC and Transient Response Con-path Coff-path Con-path CMOS VLSI Design 4th Ed. Cout-path Cin-path B bi D d i DF P 39 Method of Logical Effort 1) 2) 3) 4) 5) Compute path effort Estimate best number of stages Sketch path with N stages Estimate least delay Determine best stage effort N log 4 F 1 N D NF P ˆf F N1 gi Couti Cini fˆ 6) Find gate sizes 5: DC and Transient Response F GBH CMOS VLSI Design 4th Ed. 40 Limits of Logical Effort Chicken and egg problem – Need path to compute G – But don’t know number of stages without G Simplistic delay model – Neglects input rise time effects Interconnect – Iteration required in designs with wire Maximum speed only – Not minimum area/power for constrained delay 5: DC and Transient Response CMOS VLSI Design 4th Ed. 41 Example 1 Calculate the a) logical effort b) parasitic delay c) effort and d) delay in the following 6-input AND implementations as a function of the path electrical effort H. Which implementation is the fastest if a) H=1, b) H=5 and c) H=20? (a) (b) 5: DC and Transient Response (c) CMOS VLSI Design 4th Ed. (d) 42 Example 2 For the path between x and F in the following circuit calculate: a. Total delay b. Path effort and parasitic delay c. minimum theoritical delay 6 4 20C 8 5: DC and Transient Response CMOS VLSI Design 4th Ed. 43 Summary Logical effort is useful for thinking of delay in circuits – Numeric logical effort characterizes gates – NANDs are faster than NORs in CMOS – Paths are fastest when effort delays are ~4 – Path delay is weakly sensitive to stages, sizes – But using fewer stages doesn’t mean faster paths – Delay of path is about log4F FO4 inverter delays – Inverters and NAND2 best for driving large caps Provides language for discussing fast circuits – But requires practice to master 5: DC and Transient Response CMOS VLSI Design 4th Ed. 44