Lecture 4: Delay Optimization and Logical Effort
Outline
 RC Delay Models
 Delay Estimation
 Logical Effort
 Delay in a Logic Gate
 Multistage Logic Networks
 Choosing the Best Number of Stages
 Examples
 Summary
5: DC and Transient Response
CMOS VLSI Design 4th Ed.
Delay Definitions
 tpdr: rising propagation delay
– Max time from input to rising output crossing VDD/2
 tpdf: falling propagation delay
– Max time from input to falling output crossing VDD/2
 tpd: average propagation delay
– tpd = (tpdr + tpdf)/2
 tr: rise time
– From output crossing 0.2 VDD to 0.8 VDD
 tf: fall time
– From output crossing 0.8 VDD to 0.2 VDD
Delay Definitions
 tcdr: rising contamination delay
– Min time from input to rising output crossing VDD/2
 tcdf: falling contamination delay
– Min time from input to falling output crossing VDD/2
 tcd: average contamination delay
– tcd = (tcdr + tcdf)/2
Simulated Inverter Delay
 Solving differential equations by hand is too hard
 SPICE simulator solves the equations numerically
– Uses more accurate I-V models too!
 But simulations take time to write, may hide insight
[Figure: SPICE-simulated inverter transient response. Vin and Vout (V) versus t (s) from 0 to 1 ns; tpdf = 66 ps, tpdr = 83 ps.]
Delay Estimation
 We would like to be able to easily estimate delay
– Not as accurate as simulation
– But easier to ask “What if?”
 The step response usually looks like a 1st order RC response with a decaying exponential.
 Use RC delay models to estimate delay
– C = total capacitance on output node
– Use effective resistance R
– So that tpd = RC
 Characterize transistors by finding their effective R
– Depends on average current as gate switches
Effective Resistance
 Shockley models have limited value
– Not accurate enough for modern transistors
– Too complicated for much hand analysis
 Simplification: treat transistor as resistor
– Replace Ids(Vds, Vgs) with effective resistance R
• Ids = Vds/R
– R averaged across switching of digital gate
 Too inaccurate to predict current at any given time
– But good enough to predict RC delay
RC Delay Model
 Use equivalent circuits for MOS transistors
– Ideal switch + capacitance and ON resistance
– Unit nMOS has resistance R, capacitance C
– Unit pMOS has resistance 2R, capacitance C
 Capacitance proportional to width
 Resistance inversely proportional to width
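[Figure: equivalent RC circuits. An nMOS transistor of width k (terminals g, s, d) is modeled as a switch with resistance R/k and capacitance kC on each of gate, source, and drain; a pMOS transistor of width k has resistance 2R/k and the same capacitances kC.]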
RC Values
 Capacitance
– C = Cg = Cs = Cd = 2 fF/μm of gate width in 0.6 μm process
– Gradually decline to 1 fF/μm in nanometer techs.
 Resistance
– R ≈ 6 kΩ·μm in 0.6 μm process
– Improves with shorter channel lengths
 Unit transistors
– May refer to minimum contacted device (4/2 λ)
– Or maybe 1 μm wide device
– Doesn’t matter as long as you are consistent
Inverter Delay Estimate
 Estimate the delay of a fanout-of-1 inverter
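[Figure: unit inverter (pMOS width 2, nMOS width 1) with input A and output Y driving an identical inverter. Node Y carries 3C of diffusion capacitance from the driver and 3C of gate capacitance from the load, 6C in total, switched through an effective resistance R.]
d = 6RC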
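The 6RC estimate can be reproduced by adding up the capacitance on the output node. The sketch below is only an illustration under the unit-transistor model from the RC Delay Model slide; the function name `inverter_delay_rc` and its parameters are hypothetical.

```python
def inverter_delay_rc(fanout=1, r=1.0, c=1.0):
    """Estimated delay of a unit inverter driving `fanout` identical inverters.

    Assumes the unit model: nMOS width 1 and pMOS width 2, each giving an
    effective resistance r, with capacitance c per unit of transistor width.
    """
    c_diffusion = (2 + 1) * c       # driver's pMOS and nMOS drains on the output
    c_load = fanout * (2 + 1) * c   # pMOS and nMOS gates of every load inverter
    return r * (c_diffusion + c_load)


print(inverter_delay_rc(fanout=1))  # 6.0, i.e. d = 6RC
```

Changing `fanout` shows how the load term grows while the 3RC diffusion term stays fixed.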
Delay Model Comparison
Example: 3-input NAND
 Sketch a 3-input NAND with transistor widths chosen to achieve effective rise and fall resistances equal to a unit inverter (R).
[Figure: 3-input NAND with three parallel pMOS transistors of width 2 and three series nMOS transistors of width 3]
3-input NAND Caps
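 Annotate the 3-input NAND gate with gate and diffusion capacitance.
[Figure: 3-input NAND (pMOS widths 2, nMOS widths 3) annotated with capacitances. Each input sees 5C of gate capacitance (2C pMOS + 3C nMOS); the output node carries 9C of diffusion capacitance; each internal node of the nMOS stack carries 3C.]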
Example
 Sketch a 2-input NOR gate with selected transistor widths so that effective rise and fall resistances are equal to a unit inverter’s. Annotate the gate and diffusion capacitances.
Elmore Delay
 ON transistors look like resistors
 Pullup or pulldown network modeled as RC ladder
 Elmore delay of RC ladder
$$t_{pd} \approx \sum_{\text{nodes } i} R_{i\text{-to-source}} C_i = R_1 C_1 + (R_1 + R_2) C_2 + \dots + (R_1 + R_2 + \dots + R_N) C_N$$
[Figure: RC ladder with series resistors R1, R2, R3, ..., RN and a capacitor to ground after each resistor: C1, C2, C3, ..., CN]
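The Elmore sum for a ladder can be written as a short loop that accumulates the resistance back to the source. This is a minimal sketch; the function name `elmore_delay` and the list-based interface are assumptions made for illustration.

```python
def elmore_delay(resistances, capacitances):
    """Elmore delay of an RC ladder.

    resistances[i] is the series resistor feeding node i and capacitances[i]
    is the capacitance at node i, both ordered from the source outward.
    """
    if len(resistances) != len(capacitances):
        raise ValueError("need exactly one capacitance per resistor")
    delay = 0.0
    r_to_source = 0.0
    for r, c in zip(resistances, capacitances):
        r_to_source += r           # R1 + R2 + ... + Ri
        delay += r_to_source * c   # add (resistance to source) * Ci
    return delay


# Three identical segments: R*C + 2R*C + 3R*C = 6 RC
print(elmore_delay([1.0, 1.0, 1.0], [1.0, 1.0, 1.0]))  # 6.0
```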
Example: 3-input NAND
 Estimate worst-case rising and falling delay of 3-input NAND driving h identical gates.
[Figure: 3-input NAND (pMOS widths 2, nMOS widths 3) driving h copies of itself. Output Y carries 9C of diffusion capacitance plus 5hC of load; internal nodes n1 and n2 each carry 3C.]
$$t_{pdr} = (9 + 5h) RC$$
$$t_{pdf} = (3C)\left(\tfrac{R}{3}\right) + (3C)\left(\tfrac{R}{3} + \tfrac{R}{3}\right) + \left[(9 + 5h) C\right]\left(\tfrac{R}{3} + \tfrac{R}{3} + \tfrac{R}{3}\right) = (12 + 5h) RC$$
Delay Components
 Delay has two parts
– Parasitic delay
• 9 or 12 RC
• Independent of load
– Effort delay
• 5h RC
• Proportional to load capacitance
Contamination Delay
 Best-case (contamination) delay can be substantially less than propagation delay.
 Ex: If all three inputs fall simultaneously
[Figure: 3-input NAND (pMOS widths 2, nMOS widths 3). Output Y carries 9C plus 5hC of load; internal nodes n1 and n2 each carry 3C.]
$$t_{cdr} = \left[(9 + 5h) C\right]\left(\frac{R}{3}\right) = \left(3 + \frac{5}{3} h\right) RC$$
Delay in a Logic Gate
 Express delays in process-independent unit: d = d_abs / τ
– τ = 3RC
– ≈ 3 ps in 65 nm process, 60 ps in 0.6 μm process
 Delay has two components: d = f + p
 f: effort delay = gh (a.k.a. stage effort)
– Again has two components
 g: logical effort
– Measures relative ability of gate to deliver current
– g ≡ 1 for inverter
 h: electrical effort = Cout / Cin
– Ratio of output to input capacitance
– Sometimes called fanout
 p: parasitic delay
– Represents delay of gate driving no load
– Set by internal parasitic capacitance
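As a quick illustration of d = gh + p and the conversion back to absolute time, the sketch below uses the two τ values quoted on this slide. The names `TAU_PS`, `stage_delay`, and `absolute_delay_ps` are hypothetical helpers, not part of any library.

```python
# tau = 3RC in picoseconds, as quoted on the slide for two processes.
TAU_PS = {"65nm": 3.0, "0.6um": 60.0}


def stage_delay(g, h, p):
    """Normalized stage delay d = g*h + p, in units of tau."""
    return g * h + p


def absolute_delay_ps(d, process):
    """Convert a normalized delay to picoseconds: d_abs = d * tau."""
    return d * TAU_PS[process]


# Inverter (g = 1, p = 1) driving four times its own input capacitance.
d = stage_delay(g=1, h=4, p=1)
print(d, absolute_delay_ps(d, "65nm"), absolute_delay_ps(d, "0.6um"))  # 5 15.0 300.0
```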
Delay Plots
d = f + p = gh + p
[Figure: normalized delay d versus electrical effort h = Cout / Cin, for h from 0 to 5. Inverter: g = 1, p = 1, d = h + 1. 2-input NAND: g = 4/3, p = 2, d = (4/3)h + 2. The intercept of each line is the parasitic delay p; the rest is the effort delay f.]
 What about NOR2?
Computing Logical Effort
 DEF: Logical effort is the ratio of the input capacitance of a gate to the input capacitance of an inverter delivering the same output current.
 Measure from delay vs. fanout plots
 Or estimate by counting transistor widths
[Figure: inverter (pMOS 2, nMOS 1): Cin = 3, g = 3/3. 2-input NAND (pMOS 2, nMOS 2): Cin = 4, g = 4/3. 2-input NOR (pMOS 4, nMOS 1): Cin = 5, g = 5/3.]
Catalog of Gates
 Logical effort of common gates
Number of inputs:
Gate type      | 1 | 2    | 3        | 4            | n
Inverter       | 1 |      |          |              |
NAND           |   | 4/3  | 5/3      | 6/3          | (n+2)/3
NOR            |   | 5/3  | 7/3      | 9/3          | (2n+1)/3
Tristate / mux | 2 | 2    | 2        | 2            | 2
XOR, XNOR      |   | 4, 4 | 6, 12, 6 | 8, 16, 16, 8 |
Catalog of Gates
 Parasitic delay of common gates
– In multiples of pinv (≈ 1)
Number of inputs:
Gate type      | 1 | 2 | 3 | 4 | n
Inverter       | 1 |   |   |   |
NAND           |   | 2 | 3 | 4 | n
NOR            |   | 2 | 3 | 4 | n
Tristate / mux | 2 | 4 | 6 | 8 | 2n
XOR, XNOR      |   | 4 | 6 | 8 |
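The closed-form entries in the two catalogs can be captured in a pair of lookup functions. This is a sketch covering only the rows that have a general formula in n (inverter, NAND, NOR, tristate/mux); the function names and the string keys are hypothetical.

```python
from fractions import Fraction


def logical_effort(gate, n=1):
    """Logical effort g per input for the catalog rows with a formula in n."""
    if gate == "inverter":
        return Fraction(1)
    if gate == "nand":
        return Fraction(n + 2, 3)
    if gate == "nor":
        return Fraction(2 * n + 1, 3)
    if gate == "mux":
        return Fraction(2)
    raise ValueError(f"no closed-form entry for {gate!r}")


def parasitic_delay(gate, n=1, p_inv=1):
    """Parasitic delay p in multiples of p_inv for the same catalog rows."""
    if gate == "inverter":
        return p_inv
    if gate in ("nand", "nor"):
        return n * p_inv
    if gate == "mux":
        return 2 * n * p_inv
    raise ValueError(f"no closed-form entry for {gate!r}")


print(logical_effort("nand", 2), logical_effort("nor", 2))  # 4/3 5/3
print(parasitic_delay("nand", 3), parasitic_delay("mux", 3))  # 3 6
```

Using `Fraction` keeps values such as 4/3 exact, which is convenient when multiplying efforts along a path.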
Example: Ring Oscillator
 Estimate the frequency of an N-stage ring oscillator
Logical Effort: g = 1
Electrical Effort: h = 1
Parasitic Delay: p = 1
Stage Delay: d = 2
Frequency: fosc = 1/(2*N*d) = 1/(4N), in units of 1/τ
31 stage ring oscillator in 0.6 μm process has frequency of ~ 200 MHz
Example: FO4 Inverter
 Estimate the delay of a fanout-of-4 (FO4) inverter
Logical Effort: g = 1
Electrical Effort: h = 4
Parasitic Delay: p = 1
Stage Delay: d = 5
The FO4 delay is about 300 ps in 0.6 μm process, 15 ps in a 65 nm process
Multistage Logic Networks
 Logical effort generalizes to multistage networks
 Path Logical Effort: $G = \prod g_i$
 Path Electrical Effort: $H = C_{out\text{-}path} / C_{in\text{-}path}$
 Path Effort: $F = \prod f_i = \prod g_i h_i$
[Figure: four-stage path with input capacitance 10 and load capacitance 20; the stage input capacitances are 10, x, y, z. g1 = 1, h1 = x/10; g2 = 5/3, h2 = y/x; g3 = 4/3, h3 = z/y; g4 = 1, h4 = 20/z.]
Multistage Logic Networks
 Logical effort generalizes to multistage networks
 Path Logical Effort: $G = \prod g_i$
 Path Electrical Effort: $H = C_{out\text{-}path} / C_{in\text{-}path}$
 Path Effort: $F = \prod f_i = \prod g_i h_i$
 Can we write F = GH?
Paths that Branch
 No! Consider paths that branch:
[Figure: an inverter with input capacitance 5 drives two inverters of input capacitance 15 each; each of those drives a load of 90.]
G = 1
H = 90 / 5 = 18
GH = 18
h1 = (15 + 15) / 5 = 6
h2 = 90 / 15 = 6
F = g1g2h1h2 = 36 = 2GH
Branching Effort
 Introduce branching effort
– Accounts for branching between stages in path
$$b = \frac{C_{on\ path} + C_{off\ path}}{C_{on\ path}}$$
$$B = \prod b_i$$
Note: $\prod h_i = BH$
 Now we compute the path effort
– F = GBH
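The relation F = GBH can be checked numerically against the product of the individual stage efforts. The sketch below uses the numbers from the branching example (input capacitance 5, two branches of 15, loads of 90); the helper names `branching_effort` and `path_effort` are hypothetical.

```python
from fractions import Fraction
from math import prod


def branching_effort(c_on_path, c_off_path):
    """b = (C_on_path + C_off_path) / C_on_path for one stage."""
    return Fraction(c_on_path + c_off_path, c_on_path)


def path_effort(g_list, b_list, c_in, c_out):
    """F = G * B * H with G = prod(g_i), B = prod(b_i), H = C_out / C_in."""
    return prod(g_list) * prod(b_list) * Fraction(c_out, c_in)


# Stage 1 drives its on-path successor (15) plus one off-path copy (15).
b1 = branching_effort(c_on_path=15, c_off_path=15)   # 2
F = path_effort([1, 1], [b1, 1], c_in=5, c_out=90)

# Cross-check against the product of stage efforts g_i * h_i.
h1 = Fraction(15 + 15, 5)
h2 = Fraction(90, 15)
assert F == 1 * h1 * 1 * h2 == 36
print(b1, F)  # 2 36
```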
Multistage Delays
 Path Effort Delay: $D_F = \sum f_i$
 Path Parasitic Delay: $P = \sum p_i$
 Path Delay: $D = \sum d_i = D_F + P$
Designing Fast Circuits
$$D = \sum d_i = D_F + P$$
 Delay is smallest when each stage bears same effort
$$\hat{f} = g_i h_i = F^{1/N}$$
 Thus minimum delay of N stage path is
$$D = N F^{1/N} + P$$
 This is a key result of logical effort
– Find fastest possible delay
– Doesn’t require calculating gate sizes
Gate Sizes
 How wide should the gates be for least delay?
$$\hat{f} = gh = g \frac{C_{out}}{C_{in}} \quad\Rightarrow\quad C_{in_i} = \frac{g_i C_{out_i}}{\hat{f}}$$
 Working backward, apply capacitance transformation to find input capacitance of each gate given load it drives.
 Check work by verifying input cap spec is met.
Example: 3-stage path
 Select gate sizes x and y for least delay from A to B
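[Figure: three-stage path from A to B. A 2-input NAND with input capacitance 8 drives three 3-input NANDs of size x; the on-path NAND drives two 2-input NORs of size y; each NOR drives a load of 45.]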
Example: 3-stage path
Logical Effort: G = (4/3)*(5/3)*(5/3) = 100/27
Electrical Effort: H = 45/8
Branching Effort: B = 3*2 = 6
Path Effort: F = GBH = 125
Best Stage Effort: $\hat{f} = \sqrt[3]{F} = 5$
Parasitic Delay: P = 2 + 3 + 2 = 7
Delay: D = 3*5 + 7 = 22 = 4.4 FO4
Example: 3-stage path
 Work backward for sizes
y = 45 * (5/3) / 5 = 15
x = (15*2) * (5/3) / 5 = 10
[Figure: the sized path from A to B. First stage (input capacitance 8): P: 4, N: 4. Second stage (x = 10): P: 4, N: 6. Third stage (y = 15): P: 12, N: 3. Each output drives a load of 45.]
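The whole 3-stage example, from path effort to gate sizes, can be reproduced in a few lines. This is a sketch under the assumption that each stage's load is its branching effort times the input capacitance of the next on-path gate; the function name `size_path` and its argument layout are hypothetical.

```python
from fractions import Fraction
from math import prod


def size_path(g, p, b, c_in, c_load):
    """Return (F, f_hat, D, input capacitances) for a minimum-delay path.

    g, p, b are per-stage logical efforts, parasitic delays, and branching
    efforts, listed from the first stage to the last.
    """
    n = len(g)
    F = prod(g) * prod(b) * Fraction(c_load, c_in)   # F = G * B * H
    f_hat = float(F) ** (1.0 / n)                    # best stage effort
    D = n * f_hat + sum(p)                           # D = N * F^(1/N) + P
    caps = []
    c_next = float(c_load)
    # Work backward: C_in_i = g_i * C_out_i / f_hat, where C_out_i counts
    # every branch driven by stage i.
    for g_i, b_i in zip(reversed(g), reversed(b)):
        c_next = float(g_i) * (b_i * c_next) / f_hat
        caps.append(c_next)
    caps.reverse()
    return F, f_hat, D, caps


g = [Fraction(4, 3), Fraction(5, 3), Fraction(5, 3)]
F, f_hat, D, caps = size_path(g, p=[2, 3, 2], b=[3, 2, 1], c_in=8, c_load=45)
print(f"F = {F}, f_hat = {f_hat:.2f}, D = {D:.2f}")
print([f"{c:.2f}" for c in caps])  # input cap, x, y
```

The first entry of `caps` should come back as the specified input capacitance of 8, which is the "check your work" step from the Gate Sizes slide; the other two entries are x and y.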
Best Number of Stages
 How many stages should a path use?
– Minimizing number of stages is not always fastest
 Example: drive 64-bit datapath with unit inverter
[Figure: four candidate inverter chains from a unit-sized initial driver (1) to the datapath load (64), with stage sizes 1; 1, 8; 1, 4, 16; and 1, 2.8, 8, 23]
$$D = N F^{1/N} + P = N (64)^{1/N} + N$$
N | f   | D
1 | 64  | 65
2 | 8   | 18
3 | 4   | 15 (fastest)
4 | 2.8 | 15.3
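The comparison of stage counts can be regenerated by evaluating D = N(64)^(1/N) + N for a range of N. A small sketch, assuming pinv = 1 and an all-inverter chain as in the example; `chain_delay` is a hypothetical helper name.

```python
def chain_delay(path_effort, n_stages, p_inv=1.0):
    """Return (stage effort f, delay D) for an n-stage inverter chain."""
    f = path_effort ** (1.0 / n_stages)
    return f, n_stages * f + n_stages * p_inv


for n in range(1, 7):
    f, d = chain_delay(64, n)
    print(f"N = {n}: f = {f:.2f}, D = {d:.2f}")

# Pick the stage count with the smallest delay among the candidates tried.
best_n = min(range(1, 7), key=lambda n: chain_delay(64, n)[1])
print("fastest:", best_n)  # fastest: 3
```

Extending the range past N = 4 shows the delay rising again, which is why the minimum is broad rather than sharp.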
Derivation
 Consider adding inverters to end of path
– How many give least delay?
[Figure: logic block of n1 stages with path effort F, followed by N - n1 extra inverters]
$$D = N F^{1/N} + \sum_{i=1}^{n_1} p_i + (N - n_1) p_{inv}$$
$$\frac{\partial D}{\partial N} = -F^{1/N} \ln F^{1/N} + F^{1/N} + p_{inv} = 0$$
 Define best stage effort $\rho = F^{1/N}$
$$p_{inv} + \rho (1 - \ln \rho) = 0$$
Best Stage Effort
 $p_{inv} + \rho (1 - \ln \rho) = 0$ has no closed-form solution
 Neglecting parasitics (pinv = 0), we find ρ = 2.718 (e)
 For pinv = 1, solve numerically for ρ = 3.59
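One way to obtain these numbers is bisection on the residual pinv + ρ(1 - ln ρ), which is positive at ρ = 1 and decreases for ρ > 1. The sketch below is one possible numerical approach, not necessarily how the slide's value was computed; the function name and bracket [1, 20] are assumptions.

```python
import math


def best_stage_effort(p_inv, lo=1.0, hi=20.0, tol=1e-10):
    """Solve p_inv + rho * (1 - ln(rho)) = 0 for rho by bisection.

    Assumes the root lies in [lo, hi], which holds for modest p_inv.
    """
    def residual(rho):
        return p_inv + rho * (1.0 - math.log(rho))

    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0:
            lo = mid   # root is to the right of mid
        else:
            hi = mid   # root is at or to the left of mid
    return 0.5 * (lo + hi)


print(f"{best_stage_effort(0.0):.3f}")  # 2.718
print(f"{best_stage_effort(1.0):.2f}")  # 3.59
```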
Review of Definitions
Term              | Stage                                                  | Path
number of stages  | 1                                                      | N
logical effort    | g                                                      | $G = \prod g_i$
electrical effort | $h = C_{out} / C_{in}$                                 | $H = C_{out\text{-}path} / C_{in\text{-}path}$
branching effort  | $b = (C_{on\text{-}path} + C_{off\text{-}path}) / C_{on\text{-}path}$ | $B = \prod b_i$
effort            | f = gh                                                 | F = GBH
effort delay      | f                                                      | $D_F = \sum f_i$
parasitic delay   | p                                                      | $P = \sum p_i$
delay             | d = f + p                                              | $D = \sum d_i = D_F + P$
Method of Logical Effort
1) Compute path effort: F = GBH
2) Estimate best number of stages: $N = \log_4 F$
3) Sketch path with N stages
4) Estimate least delay: $D = N F^{1/N} + P$
5) Determine best stage effort: $\hat{f} = F^{1/N}$
6) Find gate sizes: $C_{in_i} = g_i C_{out_i} / \hat{f}$
Limits of Logical Effort
 Chicken and egg problem
– Need path to compute G
– But don’t know number of stages without G
 Simplistic delay model
– Neglects input rise time effects
 Interconnect
– Iteration required in designs with wire
 Maximum speed only
– Not minimum area/power for constrained delay
Example 1
 Calculate the (a) logical effort, (b) parasitic delay, (c) effort, and (d) delay of the following 6-input AND implementations as a function of the path electrical effort H. Which implementation is fastest for (a) H = 1, (b) H = 5, and (c) H = 20?
[Figure: four 6-input AND implementations, labeled (a) through (d)]
Example 2
 For the path between x and F in the following circuit, calculate:
 a. Total delay
 b. Path effort and parasitic delay
 c. Minimum theoretical delay
[Figure: circuit from x to F; the recoverable labels are 6, 4, 8, and a 20C load]
Summary
 Logical effort is useful for thinking of delay in circuits
– Numeric logical effort characterizes gates
– NANDs are faster than NORs in CMOS
– Paths are fastest when effort delays are ~4
– Path delay is weakly sensitive to stages, sizes
– But using fewer stages doesn’t mean faster paths
– Delay of path is about $\log_4 F$ FO4 inverter delays
– Inverters and NAND2 best for driving large caps
 Provides language for discussing fast circuits
– But requires practice to master