Download Chapter 6

Document related concepts
no text concepts found
Transcript
Designing Combinational
Logic Circuits
EE141
1
Introduction
Combinational vs. Sequential Logic
In
Combinational
Logic
Circuit
In
Out
Combinational
Logic
Circuit
Out
State
Combinational
Sequential
Output = f(In, Previous In)
Output = f(In)
EE141
2
Static CMOS Circuit
At every point in time (except during the switching
transients) each gate output is connected to either
VDD or Vss via a low-resistive path.
The outputs of the gates assume at all times the value
of the Boolean function, implemented by the circuit
(ignoring, once again, the transient effects during
switching periods).
This is in contrast to the dynamic circuit class, which
relies on temporary storage of signal values on the
capacitance of high impedance circuit nodes.
EE141
3
Static Complementary CMOS
VDD
In1
In2
PUN
InN
In1
In2
InN
PMOS only
F(In1,In2,…InN)
PDN
NMOS only
PUN and PDN are dual logic networks
EE141
4
NMOS Transistors
in Series/Parallel Connection
Transistors can be thought as a switch controlled by its gate signal
NMOS switch closes when switch control input is high
A
B
X
Y
Y = X if A and B
A
X
B
Y = X if A OR B
Y
NMOS Transistors pass a “strong” 0 but a “weak” 1
EE141
5
PMOS Transistors
in Series/Parallel Connection
PMOS switch closes when switch control input is low
A
B
X
Y
Y = X if A AND B = A + B
A
X
B
Y
Y = X if A OR B = AB
PMOS Transistors pass a “strong” 1 but a “weak” 0
EE141
6
Threshold Drops
VDD
PUN
VDD
S
D
VDD
D
0  VDD
VGS
S
CL
CL
VDD  0
PDN
D
VDD
0  VDD - VTn
CL
S
VGS
VDD  |VTp|
S
CL
D
EE141
7
Complementary CMOS Logic Style
EE141
8
Example Gate: NAND
EE141
9
Example Gate: NOR
EE141
10
Complex CMOS Gate
B
A
C
D
OUT = D + A • (B + C)
A
D
B
C
EE141
11
Constructing a Complex Gate
VDD
VDD
C
F
SN4
F
SN1
A
SN3
D
B
C
B
SN2
A
D
A
B
D
C
F
(a) pull-down network
(b) Deriving the pull-up network
hierarchically by identifying
sub-nets
A
D
B
C
(c) complete gate
EE141
12
Cell Design

Standard Cells




General purpose logic
Can be synthesized
Same height, varying width
Datapath Cells



For regular, structured designs (arithmetic)
Includes some wiring in the cell
Fixed height and width
EE141
13
Standard Cell Layout
Methodology – 1980s
Routing
channel
VDD
signals
GND
EE141
14
Standard Cell Layout
Methodology – 1990s
Mirrored Cell
No Routing
channels
VDD
VDD
M2
M3
GND
Mirrored Cell
GND
EE141
15
Standard Cells
N Well
VDD
Cell height 12 metal tracks
Metal track is approx. 3 + 3
Pitch =
repetitive distance between objects
Cell height is “12 pitch”
2
Cell boundary
In
Out
GND
EE141
Rails ~10
16
Standard Cells
With minimal
diffusion
routing
VDD
With silicided
diffusion
VDD
VDD
M2
In
Out
In
Out
In
Out
M1
GND
EE141
GND
17
Standard Cells
2-input NAND gate
VDD
VDD
B
A
B
Out
A
GND
EE141
18
Stick Diagrams
Contains no dimensions
Represents relative positions of transistors
VDD
VDD
Inverter
NAND2
Out
Out
In
GND
GND
EE141
A B
19
Stick Diagrams
Logic Graph
A
j
X
C
C
B
A
i
X
X = C • (A + B)
C
PUN
i
B
VDD
j
B
GND
A
B
C
EE141
A
PDN
20
Two Versions of C • (A + B)
A
C
B
A
B
C
VDD
VDD
X
X
GND
GND
EE141
21
Consistent Euler Path
X
C
i
X
B
VDD
j
GND
EE141
A
A B C
22
OAI22 Logic Graph
A
C
B
D
X
C
X = (A+B)•(C+D)
C
D
A
B
D
VDD
X
B
A
B
C
D
A
GND
EE141
PUN
PDN
23
Example: x = ab+cd
x
x
c
b
VDD
x
a
c
b
VD D
x
a
d
GND
d
GND
(a) Logic graphs for (ab+cd)
(b) Euler Paths {a b c d}
VD D
x
GND
a
b
c
d
(c) stick diagram for ordering {a b c d}
EE141
24
Properties of Complementary CMOS
Gates Snapshot
•High noise margins:
V OH and VOL are at VDD and GND, respectively.
•No static power consumption:
There never exists a direct path between VDD and
VSS (GND) in steady-state mode.
•Comparable rise and fall times:
(under appropriate sizing conditions)
EE141
25
CMOS Properties






Full rail-to-rail swing; high noise margins
Logic levels not dependent upon the relative
device sizes; ratioless
Always a path to Vdd or Gnd in steady state; low
output impedance
Extremely high input resistance; nearly zero
steady-state input current
No direct path steady state between power and
ground; no static power dissipation
Propagation delay function of load capacitance
and resistance of transistors
EE141
26
Switch Delay Model
Req
A
A
Rp
A
Rp
Rp
B
Rn
Rp
Rp
A
CL
A
Cint
A
NAND2
Cint
A
Rn
B
Rn
B
INV
EE141
CL
Rn
Rn
A
B
CL
NOR2
27
Input Pattern Effects on Delay

Rp
A
Rp

B
Delay is dependent on the
pattern of inputs
Low to high transition

Rn
CL

B
Rn
A
Cint

both inputs go low
 delay is 0.69 Rp/2 CL
one input goes low
 delay is 0.69 Rp CL
High to low transition

both inputs go high
 delay is 0.69 2Rn CL
EE141
28
Delay Dependence on Input Patterns
3
Input Data
Pattern
Delay
(psec)
A=B=01
67
A=1, B=01
64
A= 01, B=1
61
0.5
A=B=10
45
0
A=1, B=10
80
A= 10, B=1
81
A=B=10
2.5
Voltage [V]
2
A=1 0, B=1
1.5
A=1, B=10
1
-0.5
0
100
200
300
time [ps]
EE141
400
NMOS = 0.5m/0.25 m
PMOS = 0.75m/0.25 m
CL = 100 fF
29
Transistor Sizing
Rp
2 A
Rp
B
Rn
2
B
2
Rn
Rp
4 B
2
CL
Rp
4
Cint
A
Cint
1
A
EE141
Rn
Rn
A
B
CL
1
30
Transistor Sizing a Complex
CMOS Gate
A
B
8 6
C
8 6
4 3
D
4 6
OUT = D + A • (B + C)
A
D
2
1
B
2C
EE141
2
31
Fan-In Considerations
A
B
C
D
A
CL
B
C3
C
C2
D
C1
Distributed RC model
(Elmore delay)
tpHL = 0.69 Reqn(C1+2C2+3C3+4CL)
Propagation delay deteriorates
rapidly as a function of fan-in –
quadratically in the worst case.
EE141
32
tp as a Function of Fan-In
1250
quadratic
tp (psec)
1000
Gates with a
fan-in
greater than
4 should be
avoided.
750
tpH
500
tp
L
250
tpL
linear
H
0
2
4
6
8
10
12
14
16
fan-in
EE141
33
tp as a Function of Fan-Out
tpNOR2
tpNAND2
tp (psec)
tpINV
2
4
6

C 
Delay inv  k  RunitCunit 1  L 
 Cin 
Delay gate    p  g  f 
8
10
12
14
16
All gates
have the
same drive
current.
Slope is a
function of
“driving
strength”
eff. fan-out
EE141
34
tp as a Function of Fan-In and Fan-Out


Fan-in: quadratic due to increasing resistance
and capacitance
Fan-out: each additional fan-out gate adds
two gate capacitances to CL
tp = a1FI + a2FI2 + a3FO
EE141
35
Sizing Logic Paths for Speed





Frequently, input capacitance of a logic path is
constrained
Logic also has to drive some capacitance
Example: ALU load in an Intel’s microprocessor is
0.5pF
How do we size the ALU datapath to achieve
maximum speed?
We have already solved this for the inverter chain –
can we generalize it for any type of logic?
EE141
36
Buffer example
In
Out
1
2
N
CL
 C gin, j 1 

t pj ~ RunitCunit 1 
 C

gin, j 

N
N 
C gin, j 1 
, C gin, N 1  C L
t p   t p , j  t p 0  1 

C gin, j 
j 1
i 1 
For given N: Ci+1/Ci = Ci/Ci-1
To find N: Ci+1/Ci ~ 4
How to generalize this to any logic path?
EE141
37
Logical Effort

CL 

Delay  k  RunitCunit 1 
 Cin 
 p  g  f 
p – intrinsic delay (3kRunitCunit) - gate parameter  f(W)
g – logical effort (kRunitCunit) – gate parameter  f(W)
f – effective fanout
Normalize everything to an inverter:
ginv =1, pinv = 1
Divide everything by inv
(everything is measured in unit delays inv)
Assume  = 1.
EE141
38
Delay in a Logic Gate
Gate delay:
d=h+p
effort delay
intrinsic delay
Effort delay:
h=gf
logical
effort
effective fanout =
Cout/Cin
Logical effort is a function of topology, independent of sizing
Effective fanout (electrical effort) is a function of load/gate size
EE141
39
Logical Effort



Inverter has the smallest logical effort and intrinsic
delay of all static CMOS gates
Logical effort of a gate presents the ratio of its input
capacitance to the inverter capacitance when sized
to deliver the same current
Logical effort increases with the gate complexity
EE141
40
Logical Effort
Logical effort is the ratio of input capacitance of a gate to the input
capacitance of an inverter with the same output current
VDD
A
VDD
A
2
2
B
F
2
F
A
A
VDD
B
4
A
4
2
F
1
A
B
Inverter
g=1
1
B
1
2
2-input NAND
g = 4/3 EE141
2-input NOR
g = 5/3
41
Normalized delay (d)
Logical Effort of Gates
t pNAND
g=
p=
d=
t pINV
g=
p=
d=
F(Fan-in)
1
2
3
4
5
Fan-out (h)
EE141
6
7
42
Normalized delay (d)
Logical Effort of Gates
t pNAND
g = 4/3
p=2
d = (4/3)f+2
t pINV
g=1
p=1
d = f+1
F(Fan-in)
1
2
3
4
5
Fan-out (f)
EE141
6
7
43
4/
3;
p
=
2
Logical Effort of Gates
=
D:
g
tN
AN
=
g
r:
e
t
er
1;
p=
1
v
in
3
pu
4
In
2-
Normalized Delay
5
Effort
Delay
2
1
Intrinsic
Delay
1
2
3
Fanout
f
EE141
4
5
44
Add Branching Effort
Branching effort:
b
Con path  Coff  path
Con path
EE141
45
Multistage Networks
N
Delay    pi  g i  f i 
i 1
Stage effort: hi = gifi
Path electrical effort: F = Cout/Cin
Path logical effort: G = g1g2…gN
Branching effort: B = b1b2…bN
Path effort: H = GFB
Path delay D = Sdi = Spi + Shi
EE141
46
Optimum Effort per Stage
When each stage bears the same effort:
hN  H
hN H
Stage efforts: g1f1 = g2f2 = … = gNfN
Effective fanout of each stage: f i  h g i
Minimum path delay
Dˆ   gi f i  pi   NH 1/ N  P
EE141
47
Logical Effort
From Sutherland, Sproull
EE141
48
Example: Optimize Path
1
g=1
f=a
b
a
g = 5/3
f = b/a
c
5
g = 5/3
f = c/b
g=1
f = 5/c
Effective fanout, F =
G=
H=
h=
a=
b=
EE141
49
Example: Optimize Path
1
g=1
f=a
b
a
g = 5/3
f = b/a
c
5
g = 5/3
f = c/b
Effective fanout, F = 5
G = 25/9
H = 125/9 = 13.9
h = 1.93
a = 1.93
b = ha/g2 = 2.23
c = hb/g3 = 5g4/f = 2.59
g=1
f = 5/c
fi  h gi
EE141
50
Method of Logical Effort





Compute the path effort: F = GBH
Find the best number of stages N ~ log4F
Compute the stage effort f = F1/N
Sketch the path with this number of stages
Work either from either end, find sizes:
Cin = Cout*g/f
Reference: Sutherland, Sproull, Harris, “Logical Effort, Morgan-Kaufmann 1999.
EE141
51
Fast Complex Gates:
Design Technique 1

Transistor sizing




It decreases serial resistance
But, it increases parasit capacitance as well
as long as fan-out capacitance dominates
Progressive sizing
InN
CL
MN
In3
M3
C3
In2
M2
C2
In1
M1
C1
Distributed RC line
M1 > M2 > M3 > … > MN
(the fet closest to the
output is the smallest)
Can reduce delay by more than
20%; decreasing gains as
technology shrinks
EE141
52
Fast Complex Gates:
Design Technique 2

Transistor ordering
critical path
In3 1 M3
critical path
01
In1
M3
charged
CL
In2 1 M2
C2 charged
In1
M1
01
C1 charged
delay determined by time to
discharge CL, C1 and C2
CLcharged
In2 1 M2
C2 discharged
In3 1 M1
C1 discharged
delay determined by time to
discharge CL
EE141
53
Fast Complex Gates:
Design Technique 3

Alternative logic structures
F = ABCDEFGH
EE141
54
Fast Complex Gates:
Design Technique 4

Isolating fan-in from fan-out using buffer
insertion
CL
CL
EE141
55
Fast Complex Gates:
Design Technique 5

Reducing the voltage swing
tpHL = 0.69 (3/4 (CL VDD)/ IDSATn )
= 0.69 (3/4 (CL Vswing)/ IDSATn )




linear reduction in delay
also reduces power consumption
But the following gate is much slower!
Or requires use of “sense amplifiers” on the
receiving end to restore the signal level (memory
design)
EE141
56