Introduction to CMOS VLSI Design
Lecture 6: Logical Effort
David Harris, Harvey Mudd College, Spring 2007

Outline
RC Delay Estimation
Logical Effort
Delay in a Logic Gate
Multistage Logic Networks
Choosing the Best Number of Stages
Example
Summary

RC Delay Model
Use equivalent circuits for MOS transistors
– Ideal switch + capacitance and ON resistance
– Unit nMOS has resistance R, capacitance C
– Unit pMOS has resistance 2R, capacitance C
Capacitance is proportional to width
Resistance is inversely proportional to width
[Figure: nMOS and pMOS transistors of width k with their equivalent circuits. The gate, source, and drain each carry capacitance kC; the ON resistance is R/k for the nMOS and 2R/k for the pMOS.]

RC Values
Capacitance
– C = Cg = Cs = Cd = 2 fF/µm of gate width
– Values similar across many processes
Resistance
– R ≈ 6 kΩ·µm in 0.6 µm process
– Improves with shorter channel lengths
Unit transistors
– May refer to minimum contacted device (4/2 λ)
– Or maybe 1 µm wide device
– Doesn't matter as long as you are consistent

Inverter Delay Estimate
Estimate the delay of a fanout-of-1 inverter
[Figure: unit inverter (pMOS width 2, nMOS width 1) driving an identical inverter, with its equivalent RC circuit. The output node sees 3C of diffusion capacitance from the driver and 3C of gate capacitance from the load, switched through resistance R.]
d = 6RC
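The fanout-of-1 inverter estimate can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes the unit transistor is the 1 µm wide device (one of the two conventions the slides mention), and the function and constant names are made up for this example.

```python
# Fanout-of-f inverter delay in the RC model, using the values quoted above.
# Assumption: the "unit" transistor is 1 um wide, so the per-micron figures
# (6 kOhm*um and 2 fF/um) are also the per-unit R and C.

R_UNIT_OHM = 6e3    # ON resistance of a unit nMOS
C_UNIT_F = 2e-15    # gate/source/drain capacitance of a unit transistor


def inverter_delay_in_rc(fanout: int = 1) -> int:
    """Delay of a unit inverter (pMOS 2, nMOS 1) in multiples of RC."""
    diffusion = 2 + 1              # driver's own drains: 2C (pMOS) + C (nMOS)
    gate_load = fanout * (2 + 1)   # each driven inverter presents 2C + C
    # The pullup (2R/2) and pulldown (R/1) both have effective resistance R.
    return diffusion + gate_load


def to_seconds(delay_in_rc: float) -> float:
    """Convert a delay in multiples of RC to seconds."""
    return delay_in_rc * R_UNIT_OHM * C_UNIT_F


if __name__ == "__main__":
    d = inverter_delay_in_rc(1)
    print(f"d = {d} RC = {to_seconds(d) * 1e12:.0f} ps")
```

With these assumed unit values, RC is 12 ps, so the 6RC estimate corresponds to roughly 72 ps.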
Example: 3-input NAND
Sketch a 3-input NAND with transistor widths chosen to achieve effective rise and fall resistances equal to a unit inverter (R).
[Figure: 3-input NAND with three parallel pMOS transistors of width 2 and three series nMOS transistors of width 3.]

3-input NAND Caps
Annotate the 3-input NAND gate with gate and diffusion capacitance.
[Figure: the same NAND3 annotated with capacitances. Each input presents 5C of gate capacitance (2C + 3C), the output node carries 9C, and each of the two internal nodes of the nMOS stack carries 3C.]

Elmore Delay
ON transistors look like resistors
Pullup or pulldown network modeled as RC ladder
Elmore delay of RC ladder:
$t_{pd} \approx \sum_{\text{nodes } i} R_{i\text{-to-source}} C_i = R_1 C_1 + (R_1 + R_2) C_2 + \dots + (R_1 + R_2 + \dots + R_N) C_N$
[Figure: RC ladder with series resistors R1 ... RN and node capacitances C1 ... CN.]

Example: 2-input NAND
Estimate worst-case rising and falling propagation delays of a 2-input NAND driving h identical gates.
[Figure: NAND2 with all four transistors of width 2, driving h copies. Output node Y carries 6C of diffusion capacitance plus 4hC of load; internal node x carries 2C.]
Rising delay: one pMOS of resistance R charges (6+4h)C at node Y:
$t_{pdr} = (6 + 4h)RC$
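The Elmore sum above is easy to mechanize. The following is a minimal sketch, not code from the lecture: `elmore_delay` and `nand2_rising` are hypothetical names, and resistances and capacitances are expressed in multiples of R and C so the result comes out in units of RC.

```python
from typing import Sequence


def elmore_delay(resistances: Sequence[float],
                 capacitances: Sequence[float]) -> float:
    """Elmore delay of an RC ladder, in units of R*C.

    resistances[i] is the series resistor feeding node i (source side first);
    capacitances[i] is the capacitance hanging on node i.
    """
    if len(resistances) != len(capacitances):
        raise ValueError("need one resistor per node")
    total = 0.0
    r_to_source = 0.0
    for r, c in zip(resistances, capacitances):
        r_to_source += r            # R_1 + ... + R_i
        total += r_to_source * c    # (R_1 + ... + R_i) * C_i
    return total


def nand2_rising(h: float) -> float:
    """Worst-case rising delay of the NAND2 above: a single pMOS
    (resistance R) charging the 6C + 4hC at the output node."""
    return elmore_delay([1.0], [6 + 4 * h])


if __name__ == "__main__":
    for h in (0, 1, 4):
        print(f"h = {h}: t_pdr = {nand2_rising(h)} RC")
```

Longer ladders are handled the same way: pass one resistor and one capacitor per node, ordered from the supply toward the output.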
Falling delay: the two series nMOS transistors, each of resistance R/2, discharge node x (2C) and node Y ((6+4h)C):
$t_{pdf} = (2C)\left(\tfrac{R}{2}\right) + \left[(6 + 4h)C\right]\left(\tfrac{R}{2} + \tfrac{R}{2}\right) = (7 + 4h)RC$

Delay Components
Delay has two parts
– Parasitic delay
  • 6 or 7 RC
  • Independent of load
– Effort delay
  • 4h RC
  • Proportional to load capacitance

Contamination Delay
Best-case (contamination) delay can be substantially less than propagation delay.
Ex: If both inputs fall simultaneously, both pMOS transistors pull up in parallel (effective resistance R/2):
$t_{cdr} = (3 + 2h)RC$

Diffusion Capacitance
We assumed contacted diffusion on every source/drain.
Good layout minimizes diffusion area
Ex: NAND3 layout shares one diffusion contact
– Reduces output capacitance by 2C
– Merged uncontacted diffusion might help too
[Figure: NAND3 layout labeling shared contacted diffusion, isolated contacted diffusion, and merged uncontacted diffusion; the output node capacitance becomes 7C, with 3C on each internal node.]

Layout Comparison
Which layout is better?
[Figure: two alternative layouts of a gate with inputs A and B and output Y, drawn between VDD and GND.]

Introduction
Chip designers face a bewildering array of choices
– What is the best circuit topology for a function?
– How many stages of logic give least delay?
– How wide should the transistors be?
Logical effort is a method to make these decisions
– Uses a simple model of delay
– Allows back-of-the-envelope calculations
– Helps make rapid comparisons between alternatives
– Emphasizes remarkable symmetries

Example
Ben Bitdiddle is the memory designer for the Motoroil 68W86, an embedded automotive processor. Help Ben design the decoder for a register file.
[Figure: 4:16 decoder with address inputs A[3:0] driving a register file of 16 words, 32 bits each.]
Decoder specifications:
– 16 word register file
– Each word is 32 bits wide
– Each bit presents load of 3 unit-sized transistors
– True and complementary address inputs A[3:0]
– Each input may drive 10 unit-sized transistors
Ben needs to decide:
– How many stages to use?
– How large should each gate be?
– How fast can decoder operate?

Delay in a Logic Gate
Express delays in process-independent unit: $d = d_{abs} / \tau$
– $\tau = 3RC$
– ≈ 12 ps in 180 nm process
– ≈ 40 ps in 0.6 µm process
Delay has two components: $d = f + p$
Effort delay $f = gh$ (a.k.a. stage effort)
– Again has two components
g: logical effort
– Measures relative ability of gate to deliver current
– $g \equiv 1$ for inverter
h: electrical effort = $C_{out} / C_{in}$
– Ratio of output to input capacitance
– Sometimes called fanout
Parasitic delay p
– Represents delay of gate driving no load
– Set by internal parasitic capacitance

Delay Plots
$d = f + p = gh + p$
– Inverter: g = 1, p = 1, d = h + 1
– 2-input NAND: g = 4/3, p = 2, d = (4/3)h + 2
What about NOR2?
[Figure: normalized delay d versus electrical effort h = Cout / Cin for the inverter and the 2-input NAND. The effort delay f sets the slope of each line and the parasitic delay p sets its intercept.]

Computing Logical Effort
DEF: Logical effort is the ratio of the input capacitance of a gate to the input capacitance of an inverter delivering the same output current.
Measure from delay vs. fanout plots
Or estimate by counting transistor widths
– Inverter (pMOS 2, nMOS 1): Cin = 3, g = 3/3
– NAND2 (pMOS 2, nMOS 2): Cin = 4, g = 4/3
– NOR2 (pMOS 4, nMOS 1): Cin = 5, g = 5/3

Catalog of Gates
Logical effort of common gates, by number of inputs

| Gate type | 1 | 2 | 3 | 4 | n |
|---|---|---|---|---|---|
| Inverter | 1 | | | | |
| NAND | | 4/3 | 5/3 | 6/3 | (n+2)/3 |
| NOR | | 5/3 | 7/3 | 9/3 | (2n+1)/3 |
| Tristate / mux | 2 | 2 | 2 | 2 | 2 |
| XOR, XNOR | | 4, 4 | 6, 12, 6 | 8, 16, 16, 8 | |

Parasitic delay of common gates, by number of inputs
– In multiples of pinv (≈ 1)

| Gate type | 1 | 2 | 3 | 4 | n |
|---|---|---|---|---|---|
| Inverter | 1 | | | | |
| NAND | | 2 | 3 | 4 | n |
| NOR | | 2 | 3 | 4 | n |
| Tristate / mux | 2 | 4 | 6 | 8 | 2n |
| XOR, XNOR | | 4 | 6 | 8 | |

Example: Ring Oscillator
Estimate the frequency of an N-stage ring oscillator
Logical Effort: g = 1
Electrical Effort: h = 1
Parasitic Delay: p = 1
Stage Delay: d = 2
Frequency: fosc = 1/(2*N*d) = 1/4N
A 31-stage ring oscillator in a 0.6 µm process has a frequency of ~200 MHz.

Example: FO4 Inverter
Estimate the delay of a fanout-of-4 (FO4) inverter
Logical Effort: g = 1
Electrical Effort: h = 4
Parasitic Delay: p = 1
Stage Delay: d = 5
The FO4 delay is about
– 200 ps in 0.6 µm process
– 60 ps in a 180 nm process
– f/3 ns in an f µm process

Multistage Logic Networks
Logical effort generalizes to multistage networks
Path Logical Effort: $G = \prod g_i$
Path Electrical Effort: $H = C_{out\text{-}path} / C_{in\text{-}path}$
Path Effort: $F = \prod f_i = \prod g_i h_i$
[Figure: four-stage path with input capacitance 10, internal gate capacitances x, y, z, and a load of 20: g1 = 1, h1 = x/10; g2 = 5/3, h2 = y/x; g3 = 4/3, h3 = z/y; g4 = 1, h4 = 20/z.]
Can we write F = GH?

Paths that Branch
No! Consider paths that branch:
[Figure: an inverter with input capacitance 5 drives two inverters of input capacitance 15, each of which drives a load of 90.]
G = 1
H = 90 / 5 = 18
GH = 18
h1 = (15 + 15) / 5 = 6
h2 = 90 / 15 = 6
F = g1 g2 h1 h2 = 36 = 2GH

Branching Effort
Introduce branching effort
– Accounts for branching between stages in path
$b = \dfrac{C_{on\,path} + C_{off\,path}}{C_{on\,path}}$, $B = \prod b_i$
Note: $\prod h_i = BH$
Now we compute the path effort
– F = GBH

Multistage Delays
Path Effort Delay: $D_F = \sum f_i$
Path Parasitic Delay: $P = \sum p_i$
Path Delay: $D = \sum d_i = D_F + P$

Designing Fast Circuits
$D = \sum d_i = D_F + P$
Delay is smallest when each stage bears same effort:
$\hat{f} = g_i h_i = F^{1/N}$
Thus minimum delay of N stage path is
$D = N F^{1/N} + P$
This is a key result of logical effort
– Find fastest possible delay
– Doesn't require calculating gate sizes

Gate Sizes
How wide should the gates be for least delay?
$\hat{f} = gh = g \dfrac{C_{out}}{C_{in}} \;\Rightarrow\; C_{in_i} = \dfrac{g_i C_{out_i}}{\hat{f}}$
Working backward, apply capacitance transformation to find input capacitance of each gate given load it drives.
Check work by verifying input cap spec is met.
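The capacitance transformation can be sketched as a short backward pass over the path. This is an illustrative sketch rather than the lecture's own code: `size_path` is a hypothetical helper, and `bs[i]` is assumed to be the branching effort at the output of stage i (1 where the path does not branch). The demo uses the four-stage path from the Multistage Logic Networks figure (input capacitance 10, load 20).

```python
from math import prod
from typing import List, Sequence, Tuple


def size_path(gs: Sequence[float], bs: Sequence[float],
              c_in: float, c_load: float) -> Tuple[float, List[float]]:
    """Return (best stage effort, input capacitance of each stage).

    gs: logical effort of each stage, first stage first.
    bs: branching effort at the output of each stage (1 = no branching).
    """
    if len(gs) != len(bs):
        raise ValueError("need one branching effort per stage")
    n = len(gs)
    path_effort = prod(gs) * prod(bs) * (c_load / c_in)   # F = G * B * H
    f_hat = path_effort ** (1.0 / n)                      # f_hat = F^(1/N)

    caps: List[float] = []
    c_next = c_load
    # Work backward from the load: C_in = g * C_out / f_hat, where C_out is
    # the total load on the stage (on-path load times branching effort).
    for g, b in zip(reversed(gs), reversed(bs)):
        c_next = g * (b * c_next) / f_hat
        caps.append(c_next)
    caps.reverse()
    return f_hat, caps


if __name__ == "__main__":
    # Logical efforts taken from the figure: 1, 5/3, 4/3, 1; no branching.
    f_hat, caps = size_path([1, 5 / 3, 4 / 3, 1], [1, 1, 1, 1],
                            c_in=10, c_load=20)
    print(f"stage effort = {f_hat:.3f}")
    print("input caps (stage 1, x, y, z):", [round(c, 2) for c in caps])
```

The first entry of the returned list is the input capacitance of the path, so comparing it with the specified value (10 here) is the "check work" step from the slide.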
Example: 3-stage path
Select gate sizes x and y for least delay from A to B
[Figure: three-stage path from input A (input capacitance 8) to output B (load 45). The first stage drives three gates of size x; each of those drives two gates of size y.]
Logical Effort: G = (4/3)*(5/3)*(5/3) = 100/27
Electrical Effort: H = 45/8
Branching Effort: B = 3 * 2 = 6
Path Effort: F = GBH = 125
Best Stage Effort: $\hat{f} = \sqrt[3]{F} = 5$
Parasitic Delay: P = 2 + 3 + 2 = 7
Delay: D = 3*5 + 7 = 22 = 4.4 FO4
Work backward for sizes
y = 45 * (5/3) / 5 = 15
x = (15*2) * (5/3) / 5 = 10
Resulting transistor widths:
– Stage 1 (Cin = 8): P: 4, N: 4
– Stage 2 (Cin = x = 10): P: 4, N: 6
– Stage 3 (Cin = y = 15): P: 12, N: 3

Best Number of Stages
How many stages should a path use?
– Minimizing number of stages is not always fastest
Example: drive 64-bit datapath with unit inverter
$D = N F^{1/N} + P = N(64)^{1/N} + N$

| N | Inverter sizes (initial driver first) | f | D |
|---|---|---|---|
| 1 | 1 | 64 | 65 |
| 2 | 1, 8 | 8 | 18 |
| 3 | 1, 4, 16 | 4 | 15 (fastest) |
| 4 | 1, 2.8, 8, 23 | 2.8 | 15.3 |

Derivation
Consider adding inverters to end of path
– How many give least delay?
[Figure: logic block of n1 stages with path effort F, followed by N − n1 extra inverters.]
$D = N F^{1/N} + \sum_{i=1}^{n_1} p_i + (N - n_1)\, p_{inv}$
$\dfrac{\partial D}{\partial N} = -F^{1/N} \ln F^{1/N} + F^{1/N} + p_{inv} = 0$
Define best stage effort $\rho = F^{1/N}$:
$p_{inv} + \rho\,(1 - \ln \rho) = 0$

Best Stage Effort
$p_{inv} + \rho\,(1 - \ln \rho) = 0$ has no closed-form solution
Neglecting parasitics (pinv = 0), we find ρ = 2.718 (e)
For pinv = 1, solve numerically for ρ = 3.59

Sensitivity Analysis
How sensitive is delay to using exactly the best number of stages?
[Figure: relative delay D(N)/D(N̂) versus relative number of stages N/N̂, with the points ρ = 2.4 and ρ = 6 marked on the curve.]
2.4 < ρ < 6 gives delay within 15% of optimal
– We can be sloppy!
– I like ρ = 4
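Both results above can be reproduced numerically. The sketch below solves the best-stage-effort equation by bisection and sweeps the stage count for the 64-unit datapath load; the function names and the bisection bracket are choices made for this example, not part of the lecture.

```python
from math import log


def best_stage_effort(p_inv: float, lo: float = 2.0, hi: float = 10.0,
                      tol: float = 1e-9) -> float:
    """Solve p_inv + rho * (1 - ln(rho)) = 0 for rho by bisection.

    The left-hand side is strictly decreasing for rho > 1, so the root is
    unique once [lo, hi] brackets a sign change (true for 0 <= p_inv < 13
    with the default bracket).
    """
    def residual(rho: float) -> float:
        return p_inv + rho * (1.0 - log(rho))

    if residual(lo) < 0 or residual(hi) > 0:
        raise ValueError("bracket does not contain the root")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def inverter_chain_delay(n_stages: int, path_effort: float,
                         p_inv: float = 1.0) -> float:
    """D = N * F^(1/N) + N * p_inv for a chain of N inverters."""
    return n_stages * path_effort ** (1.0 / n_stages) + n_stages * p_inv


if __name__ == "__main__":
    print(f"rho (p_inv = 0): {best_stage_effort(0.0):.3f}")
    print(f"rho (p_inv = 1): {best_stage_effort(1.0):.2f}")
    for n in range(1, 5):
        print(f"N = {n}: D = {inverter_chain_delay(n, 64):.1f}")
```

Bisection is used here only because it needs nothing beyond the standard library; any one-dimensional root finder would serve equally well.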
Example, Revisited
Recall Ben Bitdiddle's 4:16 decoder for the 16-word, 32-bit register file: each bit presents a load of 3 unit-sized transistors, and each true or complementary address input may drive 10 unit-sized transistors.

Number of Stages
Decoder effort is mainly electrical and branching
Electrical Effort: H = (32*3) / 10 = 9.6
Branching Effort: B = 8
If we neglect logical effort (assume G = 1)
Path Effort: F = GBH = 76.8
Number of Stages: N = log4 F = 3.1
Try a 3-stage design

Gate Sizes & Delay
Logical Effort: G = 1 * 6/3 * 1 = 2
Path Effort: F = GBH = 154
Stage Effort: $\hat{f} = F^{1/3} = 5.36$
Path Delay: $D = 3\hat{f} + 1 + 4 + 1 = 22.1$
Gate sizes: z = 96*1/5.36 = 18, y = 18*2/5.36 = 6.7
[Figure: three-stage decoder. Each true and complementary address input A[3:0] presents a capacitance of 10; gates of size y drive output gates of size z, which drive word[0] through word[15], each with 96 units of wordline capacitance.]

Comparison
Compare many alternatives with a spreadsheet

| Design | N | G | P | D |
|---|---|---|---|---|
| NAND4-INV | 2 | 2 | 5 | 29.8 |
| NAND2-NOR2 | 2 | 20/9 | 4 | 30.1 |
| INV-NAND4-INV | 3 | 2 | 6 | 22.1 |
| NAND4-INV-INV-INV | 4 | 2 | 7 | 21.1 |
| NAND2-NOR2-INV-INV | 4 | 20/9 | 6 | 20.5 |
| NAND2-INV-NAND2-INV | 4 | 16/9 | 6 | 19.7 |
| INV-NAND2-INV-NAND2-INV | 5 | 16/9 | 7 | 20.4 |
| NAND2-INV-NAND2-INV-INV-INV | 6 | 16/9 | 8 | 21.6 |

Review of Definitions

| Term | Stage | Path |
|---|---|---|
| number of stages | 1 | N |
| logical effort | g | $G = \prod g_i$ |
| electrical effort | $h = C_{out} / C_{in}$ | $H = C_{out\text{-}path} / C_{in\text{-}path}$ |
| branching effort | $b = (C_{on\text{-}path} + C_{off\text{-}path}) / C_{on\text{-}path}$ | $B = \prod b_i$ |
| effort | f = gh | F = GBH |
| effort delay | f | $D_F = \sum f_i$ |
| parasitic delay | p | $P = \sum p_i$ |
| delay | d = f + p | $D = \sum d_i = D_F + P$ |

Method of Logical Effort
1) Compute path effort: F = GBH
2) Estimate best number of stages: $N = \log_4 F$
3) Sketch path with N stages
4) Estimate least delay: $D = N F^{1/N} + P$
5) Determine best stage effort: $\hat{f} = F^{1/N}$
6) Find gate sizes: $C_{in_i} = g_i C_{out_i} / \hat{f}$

Limits of Logical Effort
Chicken and egg problem
– Need path to compute G
– But don't know number of stages without G
Simplistic delay model
– Neglects input rise time effects
Interconnect
– Iteration required in designs with wire
Maximum speed only
– Not minimum area/power for constrained delay

Summary
Logical effort is useful for thinking of delay in circuits
– Numeric logical effort characterizes gates
– NANDs are faster than NORs in CMOS
– Paths are fastest when effort delays are ~4
– Path delay is weakly sensitive to stages, sizes
– But using fewer stages doesn't mean faster paths
– Delay of path is about log4 F FO4 inverter delays
– Inverters and NAND2 best for driving large caps
Provides language for discussing fast circuits
– But requires practice to master
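As a worked check of the method, the decoder comparison table from earlier in this lecture can be regenerated in a few lines instead of a spreadsheet. This is a sketch under stated assumptions: the per-gate g and p values come from the catalog tables, the design strings mirror the table's naming, and the helper names are invented for this example.

```python
from math import prod

# Logical effort and parasitic delay per gate, from the catalog tables.
G_OF = {"INV": 1.0, "NAND2": 4 / 3, "NAND4": 6 / 3, "NOR2": 5 / 3}
P_OF = {"INV": 1, "NAND2": 2, "NAND4": 4, "NOR2": 2}

H = (32 * 3) / 10   # electrical effort: 96 units of wordline load / 10 at input
B = 8               # branching effort of the decoder path


def min_path_delay(design: str) -> float:
    """Minimum delay D = N * F^(1/N) + P for a design such as 'INV-NAND4-INV'."""
    gates = design.split("-")
    n = len(gates)
    path_effort = prod(G_OF[g] for g in gates) * B * H   # F = G * B * H
    parasitic = sum(P_OF[g] for g in gates)              # P = sum of p_i
    return n * path_effort ** (1.0 / n) + parasitic


DESIGNS = [
    "NAND4-INV",
    "NAND2-NOR2",
    "INV-NAND4-INV",
    "NAND4-INV-INV-INV",
    "NAND2-NOR2-INV-INV",
    "NAND2-INV-NAND2-INV",
    "INV-NAND2-INV-NAND2-INV",
    "NAND2-INV-NAND2-INV-INV-INV",
]

if __name__ == "__main__":
    for design in DESIGNS:
        print(f"{design:30s} D = {min_path_delay(design):.1f}")
    print("fastest:", min(DESIGNS, key=min_path_delay))
```

Only the minimum delay is computed here; gate sizes would still be found by working backward from the 96-unit wordline load with the chosen stage effort.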