Download Diapositiva 1

Document related concepts
no text concepts found
Transcript
Progettazione di circuiti e sistemi VLSI
Anno Accademico 2010-2011
Lezione 6
18.3.2011
La logica combinatoria
La logica combinatoria
1
Combinational vs. Sequential
Logic
In
Combinational
Logic
Circuit
In
Combinational
Logic
Circuit
Out
Out
State
Combinational
Output = f(In)
Sequential
Output = f(In, Previous In)
La logica combinatoria
2
Static CMOS Circuit
At every point in time (except during the switching
transients) each gate output is connected to either
VDD or Vss via a low-resistive path.
The outputs of the gates assume at all times the value
of the Boolean function, implemented by the circuit
(ignoring, once again, the transient effects during
switching periods).
This is in contrast to the dynamic circuit class, which
relies on temporary storage of signal values on the
capacitance of high impedance circuit nodes.
La logica combinatoria
3
Static Complementary CMOS
VDD
In1
In2
PUN
PMOS only
InN
In1
In2
InN
F(In1,In2,…InN)
PDN
NMOS only
PUN and PDN are dual logic networks
La logica combinatoria
4
NMOS Transistors
in Series/Parallel Connection
Transistors can be thought as a switch controlled by its gate signal
NMOS switch closes when switch control input is high
A
B
X
Y
Y = X if A and B
A
X
B
Y
Y = X if A OR B
NMOS Transistors pass a “strong” 0 but a “weak” 1
La logica combinatoria
5
PMOS Transistors
in Series/Parallel Connection
PMOS switch closes when switch control input is low
A
B
X
Y
Y = X if A AND B = A + B
A
X
B
Y
Y = X if A OR B = AB
PMOS Transistors pass a “strong” 1 but a “weak” 0
La logica combinatoria
6
Threshold Drops
VDD
PUN
VDD
S
D
VDD
D
0  VDD
VGS
S
CL
VDD  0
PDN
D
VDD
0  VDD - VTn
CL
VGS
CL
S
VDD  |VTp|
S
CL
D
La logica combinatoria
7
Complementary CMOS Logic Style
La logica combinatoria
8
Example Gate: NAND
P
La logica combinatoria
9
Complex CMOS Gate
B
A
C
D
OUT = D + A • (B + C)
A
D
B
C
La logica combinatoria
10
Constructing a Complex Gate
VDD
VDD
C
F
SN4
F
SN1
A
SN3
D
B
C
B
SN2
A
D
A
B
D
C
F
(a) pull-down network
(b) Deriving the pull-up network
hierarchically by identifying
sub-nets
A
D
B
C
(c) complete gate
La logica combinatoria
11
Cell Design
• Standard Cells
– General purpose logic
– Can be synthesized
– Same height, varying width
• Datapath Cells
– For regular, structured designs (arithmetic)
– Includes some wiring in the cell
– Fixed height and width
La logica combinatoria
12
Standard Cells
N Well
VDD
Cell height 12 metal tracks
Metal track is approx. 3 + 3
Pitch =
repetitive distance between objects
Cell height is “12 pitch”
2
Cell boundary
In
Out
GND
La logica combinatoria
Rails ~10
13
Standard Cells
VDD
2-input NAND gate
VDD
B
A
B
Out
A
GND
La logica combinatoria
14
Stick Diagrams
Contains no dimensions
Represents relative positions of transistors
VDD
VDD
Inverter
NAND2
Out
Out
In
GND
GND
La logica combinatoria
A B
15
Stick Diagrams
Logic Graph
A
j
X
C
C
B
X = C • (A + B)
C
A
PUN
i
X
i
B
VDD
j
B
GND
A
B
C
La logica combinatoria
A
PDN
16
Two Versions of C • (A + B)
A
C
B
A
B
C
VDD
VDD
X
GND
X
GND
La logica combinatoria
17
Consistent Euler Path
X
C
i
X
B
VDD
j
A
GND
La logica combinatoria
A B C
18
OAI22 Logic Graph
A
C
B
D
X
D
X = (A+B)•(C+D)
C
D
A
B
PUN
C
VDD
X
B
A
B
C
D
A
GND
La logica combinatoria
PDN
19
Example: x = ab+cd
x
x
c
b
VDD
x
a
c
b
VD D
x
a
d
GND
d
GND
(a) Logic graphs for (ab+cd)
(b) Euler Paths {a b c d}
VD D
x
GND
a
b
c
d
(c) stick diagram for ordering {a b c d}
La logica combinatoria
20
Properties of Complementary
CMOS Gates Snapshot
High noise margins:
VOH and VOL are at VDD and GND, respectively.
No static power consumption:
There never exists a direct path between VDD and
VSS (GND) in steady-state mode.
Comparable rise and fall times:
(under appropriate sizing conditions)
La logica combinatoria
21
CMOS Properties
• Full rail-to-rail swing; high noise margins
• Logic levels not dependent upon the relative
device sizes; ratioless
• Always a path to Vdd or Gnd in steady state;
low output impedance
• Extremely high input resistance; nearly zero
steady-state input current
• No direct path steady state between power
and ground; no static power dissipation
• Propagation delay function of load
capacitance and resistance of transistors
La logica combinatoria
22
Switch Delay Model
Req
A
A
Rp
A
Rp
Rp
B
Rn
Rp
Rp
A
CL
CL
A
Cint
A
NAND2
Cint
A
Rn
B
Rn
B
INV
La logica combinatoria
Rn
Rn
A
B
CL
NOR2
23
Input Pattern Effects on
Delay
Rp
A
Rp
B
Rn
– both inputs go low
CL
A
• delay is 0.69 Rp/2 CL
– one input goes low
B
Rn
• Delay is dependent on
the pattern of inputs
• Low to high transition
• delay is 0.69 Rp CL
Cint
• High to low transition
– both inputs go high
• delay is 0.69 2Rn CL
La logica combinatoria
24
Delay Dependence on Input Patterns
3
Input Data
Pattern
Delay
(psec)
A=B=01
67
A=1, B=01
64
A= 01, B=1
61
0,5
A=B=10
45
0
A=1, B=10
80
A= 10, B=1
81
A=B=10
2,5
Voltage [V]
2
A=1 0, B=1
1,5
A=1, B=10
1
-0,5
0
100
200
300
time [ps]
La logica combinatoria
400
NMOS = 0.5m/0.25 m
PMOS = 0.75m/0.25 m
CL = 100 fF
25
Transistor Sizing
Rp
2 A
Rp
B
Rn
2
B
2
Rn
A
Rp
2
4 B
CL
4
Cint
Rp
Cint
A
1
Rn
Rn
A
B
La logica combinatoria
CL
1
26
Transistor Sizing a Complex
CMOS Gate
A
B
8 6
C
8 6
4 3
D
4 6
OUT = D + A • (B + C)
A
D
2
1
B
2C
2
La logica combinatoria
27
Fan-In Considerations
A
B
C
D
A
CL
B
C3
C
C2
D
C1
Distributed RC model
(Elmore delay)
tpHL = 0.69 Reqn(C1+2C2+3C3+4CL)
Propagation delay deteriorates
rapidly as a function of fan-in –
quadratically in the worst case.
La logica combinatoria
28
tp as a Function of Fan-In
1250
quadratic
tp (psec)
1000
Gates with a
fan-in
greater than
4 should be
avoided.
750
tpH
500
tp
L
250
tpL
linear
H
0
2
4
6
8
10
12
14
16
fan-in
La logica combinatoria
29
tp as a Function of Fan-Out
tpNOR2
tpNAND2
tpINV
tp (psec)
2
All gates
have the
same drive
current.
Slope is a
function of
“driving
strength”
4
6
8
10
12
14
16
eff. fan-out
La logica combinatoria
30
tp as a Function of Fan-In and
Fan-Out
• Fan-in: quadratic due to increasing
resistance and capacitance
• Fan-out: each additional fan-out gate
adds two gate capacitances to CL
tp = a1FI + a2FI2 + a3FO
La logica combinatoria
31
Fast Complex Gates:
Design Technique 1
• Transistor sizing
– as long as fan-out capacitance dominates
• Progressive sizing
InN
CL
MN
In3
M3
C3
In2
M2
C2
In1
M1
C1
Distributed RC line
M1 > M2 > M3 > … > MN
(the mos closest to the
output is the smallest)
Can reduce delay by more than
20%; decreasing gains as
technology shrinks
La logica combinatoria
32
Fast Complex Gates:
Design Technique 2
• Transistor ordering
critical path
In3 1 M3
critical path
01
In1
M3
charged
CL
In2 1 M2
C2 charged
In1
M1
01
C1 charged
delay determined by time to
discharge CL, C1 and C2
CLcharged
In2 1 M2
C2 discharged
In3 1 M1
C1 discharged
delay determined by time to
discharge CL
La logica combinatoria
33
Fast Complex Gates:
Design Technique 3
• Isolating fan-in from fan-out using buffer
insertion
CL
CL
La logica combinatoria
34
Fast Complex Gates:
Design Technique 4
• Reducing the voltage swing
tpHL = 0.69 (3/4 (CL VDD)/ IDSATn )
= 0.69 (3/4 (CL Vswing)/ IDSATn )
– linear reduction in delay
– also reduces power consumption
• But the following gate is much slower!
• Or requires use of “sense amplifiers” on the
receiving end to restore the signal level
(memory design)
La logica combinatoria
35
Sizing Logic Paths for Speed
• Frequently, input capacitance of a logic path
is constrained
• Logic also has to drive some capacitance
• Example: ALU load in an Intel’s
microprocessor is 0.5pF
• How do we size the ALU datapath to achieve
maximum speed?
• We have already solved this for the inverter
chain – can we generalize it for any type of
logic?
La logica combinatoria
36
Buffer Example
In
Out
1
2
N
CL
N
Delay    pi  g i  f i 
i 1
(in units of tinv)
For given N: Ci+1/Ci = Ci/Ci-1
To find N: Ci+1/Ci ~ 4
How to generalize this to any logic path?
La logica combinatoria
37
Logical Effort

CL 

Delay  k  RunitCunit 1 
 Cin 
t p  g  f 
p – intrinsic delay (3kRunitCunit) - gate parameter  f(W)
g – logical effort (kRunitCunit) – gate parameter  f(W)
f – effective fanout
Normalize everything to an inverter:
ginv =1, pinv = 1
Divide everything by tinv
(everything is measured in unit delays tinv)
Assume  = 1.
La logica combinatoria
38
Delay in a Logic Gate
Gate delay:
d=h+p
effort delay
intrinsic delay
Effort delay:
h=gf
logical
effort
effective fanout =
Cout/Cin
Logical effort is a function of topology, independent of sizing
Effective fanout (electrical effort) is a function of load/gate size
La logica combinatoria
39
Logical Effort
• Inverter has the smallest logical effort and
intrinsic delay of all static CMOS gates
• Logical effort of a gate presents the ratio of its
input capacitance to the inverter capacitance
when sized to deliver the same current
• Logical effort increases with the gate
complexity
La logica combinatoria
40
Logical Effort
Logical effort is the ratio of input capacitance of a gate to the input
capacitance of an inverter with the same output current
VDD
A
VDD
A
2
2
B
F
2
F
A
A
VDD
B
4
A
4
2
F
1
A
B
Inverter
g=1
1
B
1
2
2-input NAND
g = 4/3
La logica combinatoria
2-input NOR
g = 5/3
41
4/
3;
p
=
2
Logical Effort of Gates
=
g
D:
AN
tN
4
rte
e
nv
pu
in
3
=
g
:
1;
p=
1
r
I
2-
Normalized Delay
5
Effort
Delay
2
1
Intrinsic
Delay
1
2
3
Fanout f
4
La logica combinatoria
5
42
Add Branching Effort
Branching effort:
b
Con path  Coff  path
Con path
La logica combinatoria
43
Multistage Networks
N
Delay    pi  g i  f i 
i 1
Stage effort: hi = gifi
Path electrical effort: F = Cout/Cin
Path logical effort: G = g1g2…gN
Branching effort: B = b1b2…bN
Path effort: H = GFB
Path delay D = Sdi = Spi + Shi
La logica combinatoria
44
Optimum Effort per Stage
When each stage bears the same effort:
hN  H
hN H
Stage efforts: g1f1 = g2f2 = … = gNfN
Effective fanout of each stage: f i  h g i
Minimum path delay
Dˆ   gi f i  pi   NH 1/ N  P
La logica combinatoria
45
Optimal Number of Stages
For a given load,
and given input capacitance of the first gate
Find optimal number of stages and optimal sizing
D  NH 1/ N  Npinv
D
  H 1/ N ln H 1/ N  H 1/ N  pinv  0
N


Substitute ‘best stage effort’
hH
La logica combinatoria
1/ Nˆ
46
Logical Effort
From Sutherland, Sproull
La logica combinatoria
47
Method of Logical Effort
•
•
•
•
•
Compute the path effort: F = GBH
Find the best number of stages N ~ log4F
Compute the stage effort f = F1/N
Sketch the path with this number of stages
Work either from either end, find sizes:
Cin = Cout*g/f
Reference: Sutherland, Sproull, Harris, “Logical Effort, Morgan-Kaufmann 1999.
La logica combinatoria
48
Summary
Sutherland,
Sproull
Harris
La logica combinatoria
49
Power
• If well designed, the dynamic power
prevails
• P=α01 CL VDD2
α01 is activity factor = 0.5 for the inverter
La logica combinatoria
50
Ratioed Logic
VDD
Resistive
Load
VDD
Depletion
Load
RL
PDN
VSS
(a) resistive load
PMOS
Load
VSS
VT < 0
F
In1
In2
In3
VDD
F
In1
In2
In3
PDN
VSS
(b) depletion load NMOS
F
In1
In2
In3
PDN
VSS
(c) pseudo-NMOS
Goal: to reduce the number of devices over complementary CMOS
La logica combinatoria
51
Ratioed Logic
VDD
• N transistors + Load
Resistive
Load
• VOH = V DD
RL
• VOL =
RPN + RL
F
In1
In2
In3
RPN
• Assymetrical response
PDN
• Static power consumption
VSS
• tpL= 0.69 RLCL
La logica combinatoria
52
Active Loads
VDD
Depletion
Load
VDD
PMOS
Load
VT < 0
VSS
F
In1
In2
In3
PDN
F
In1
In2
In3
PDN
VSS
depletion load NMOS
VSS
pseudo-NMOS
La logica combinatoria
53
Pseudo-NMOS
VDD
A
B
C
D
F
CL
VOH = VDD (similar to complementary CMOS)
2
V OL
kp


2
k n  VDD – V Tn  V OL – -------------  = ------  V DD – VTp 
2 
2

kp
V OL =  VDD – V T  1 – 1 – -----(assuming that V T = V Tn = VTp )
kn
SMALLER AREA & LOAD BUT STATIC POWER DISSIPATION!!!
La logica combinatoria
54
Pseudo-NMOS VTC
3.0
2.5
W/Lp = 4
Vout [V]
2.0
1.5
W/Lp = 2
1.0
0.5
W/Lp = 0.5
W/Lp = 1
W/Lp = 0.25
0.0
0.0
0.5
1.0
1.5
2.0
2.5
Vin [V]
La logica combinatoria
55
Pass-Transistor Logic
La logica combinatoria
56
Pass-Transistor Logic
Inputs
B
Switch
Out
A
Out
Network
B
B
• N transistors
• No static consumption
La logica combinatoria
57
Example: AND Gate
B
A
B
F = AB
0
La logica combinatoria
58
NMOS-Only Logic
3.0
In
1.5m/0.25m
VDD
x
0.5m/0.25m
Out
0.5m/0.25m
Voltage [V]
In
Out
2.0
x
1.0
0.0
0
0.5
1
1.5
2
Time [ns]
La logica combinatoria
59
NMOS Only Logic:
Level Restoring Transistor
VDD
VDD
Level Restorer
Mr
B
A
M2
Mn
X
Out
M1
• Advantage: Full Swing
• Restorer adds capacitance, takes away pull down current at X
• Ratio problem
La logica combinatoria
60
Solution 2: Transmission Gate
C
A
C
A
B
B
C
C
C = 2.5 V
A = 2.5 V
B
CL
C=0V
La logica combinatoria
61
Resistance of Transmission Gate
30
2.5 V
Resistance, ohms
Rn
20
Rp
2.5 V
Rn
Vou t
Rp
10
0
0.0
0V
Rn || Rp
1.0
Vou t , V
2.0
La logica combinatoria
62
Transmission Gate XOR
B
B
M2
A
A
F
M1
M3/M4
B
B
La logica combinatoria
63
Delay in Transmission Gate Networks
2.5
2.5
V1
In
2.5
Vi
Vi-1
C
0
2.5
C
0
Vn-1
Vi+1
C
0
Vn
C
C
0
(a)
Req
Req
V1
In
Req
Vi
C
Vn-1
Vi+1
C
C
Req
Vn
C
C
(b)
m
Req
Req
Req
Req
Req
Req
In
C
CC
C
C
CC
C
(c)
La logica combinatoria
64
Delay Optimization
La logica combinatoria
65
Dynamic Logic
La logica combinatoria
66
Dynamic CMOS
• In static circuits at every point in time (except
when switching) the output is connected to
either GND or VDD via a low resistance path.
– fan-in of n requires 2n (n N-type + n P-type)
devices
• Dynamic circuits rely on the temporary
storage of signal values on the capacitance of
high impedance nodes.
– requires on n + 2 (n+1 N-type + 1 P-type)
transistors
La logica combinatoria
67
Dynamic Gate
Clk
Clk
Mp
Mp
Out
In1
In2
In3
Clk
CL
PDN
Out
A
C
B
Me
Clk
Me
Two phase operation
Precharge (CLK = 0)
Evaluate (CLK = 1)
La logica combinatoria
68
Dynamic Gate
Clk
Clk
Mp
off
Mp on
Out
In1
In2
In3
Clk
CL
PDN
1
Out
((AB)+C)
A
C
B
Me
Clk
off
Me on
Two phase operation
Precharge (Clk = 0)
Evaluate (Clk = 1)
La logica combinatoria
69
Conditions on Output
• Once the output of a dynamic gate is
discharged, it cannot be charged again until
the next precharge operation.
• Inputs to the gate can make at most one
transition during evaluation.
• Output can be in the high impedance state
during and after evaluation (PDN off), state is
stored on CL
La logica combinatoria
70
Properties of Dynamic
Gates
• Logic function is implemented by the PDN only
– number of transistors is N + 2 (versus 2N for static complementary
CMOS)
• Full swing outputs (VOL = GND and VOH = VDD)
• Non-ratioed - sizing of the devices does not affect
the logic levels
• Faster switching speeds
– reduced load capacitance due to lower input capacitance (Cin)
– reduced load capacitance due to smaller output loading (Cout)
– no Isc, so all the current provided by PDN goes into discharging CL
La logica combinatoria
71
Properties of Dynamic Gates
• Overall power dissipation usually higher than static
CMOS
– no static current path ever exists between VDD and GND
(including Psc)
– no glitching
– higher transition probabilities
– extra load on Clk
• PDN starts to work as soon as the input signals
exceed VTn, so VM, VIH and VIL equal to VTn
– low noise margin (NML)
• Needs a precharge/evaluate clock
La logica combinatoria
72
Solution to Charge Leakage
Keeper
Clk
Mp
A
Mkp
CL
Out
B
Clk
Me
Same approach as level restorer for pass-transistor logic
La logica combinatoria
73
Issues in Dynamic Design 2:
Charge Sharing
Clk
Mp
Out
A
CL
B=0
Clk
Charge stored originally on
CL is redistributed (shared)
over CL and CA leading to
reduced robustness
CA
Me
CB
La logica combinatoria
74
Charge Sharing
VDD
case 1) if V out < VTn
VDD
Clk

Mp
Mp
Out
Out
CL
A
A
=
BB
00
Clk 
CL
Ma
Ma
M
Mb
b
Mee
M
XX
a
CC
a
CC
bb
C L VDD = C L Vout  t  + Ca  VDD – V Tn  V X  
or
Ca
V out = Vout  t  – V DD = – --------  V DD – V Tn  V X  
CL
case 2) if V out > VTn
Ca 
 ---------------------
Vout = –V DD 
C
+
C
 a
L
La logica combinatoria
75
Solution to Charge
Redistribution
Clk
Mp
Mkp
Clk
Out
A
B
Clk
Me
Precharge internal nodes using a clock-driven transistor
(at the cost of increased area and power)
La logica combinatoria
76
Issues in Dynamic Design 3:
Backgate Coupling
Clk
Mp
A=0
Out1 =1
CL1
Out2 =0
CL2
In
B=0
Clk
Me
Dynamic NAND
La logica combinatoria
Static NAND
77
Other Effects
•
•
•
•
Capacitive coupling
Substrate coupling
Minority charge injection
Supply noise (ground bounce)
La logica combinatoria
78
Cascading Dynamic Gates
V
Clk
Mp
Clk
Mp
Out1
Me
Clk
Out2
In
In
Clk
Clk
Me
Out1
VTn
V
Out2
t
Only 0  1 transitions allowed at inputs!
La logica combinatoria
79
Domino Logic
Clk
In1
In2
In3
Clk
Mp
11
10
PDN
Me
Out1
Mp Mkp
Clk
Out2
00
01
In4
In5
PDN
Clk
La logica combinatoria
Me
80
Why Domino?
Clk
Ini
Inj
Clk
PDN
Ini
Inj
PDN
Ini
Inj
PDN
Ini
Inj
PDN
Like falling dominos!
La logica combinatoria
81
Properties of Domino Logic
• Only non-inverting logic can be implemented
• Very high speed
– static inverter can be skewed, only L-H transition
– Input capacitance reduced – smaller logical effort
La logica combinatoria
82
Designing with Domino Logic
VDD
VDD
VDD
Clk
Mp
Clk
Mp
Out1
Mr
Out2
In1
In2
In3
PDN
PDN
In4
Can be eliminated!
Clk
Me
Clk
Me
Inputs = 0
during precharge
La logica combinatoria
83
Footless Domino
VDD
Clk
VDD
Mp
Clk
Mp
Out1
0
0
Clk
Mp
Out2
1
0
In1
1
VDD
Outn
1
0
In2
1
0
In3
1
0
1
Inn
1
0
The first gate in the chain needs a foot switch
Precharge is rippling – short-circuit current
A solution is to delay the clock for each stage
La logica combinatoria
84
Differential (Dual Rail) Domino
off
Mp Mkp
Clk
Out = AB
1
on
Mkp
0
Clk
Mp
1
A
!A
0
Out = AB
!B
B
Clk
Me
Solves the problem of non-inverting logic
La logica combinatoria
85
np-CMOS
Clk
In1
In2
In3
Clk
Mp
11
10
PDN
Me
Out1
Clk
Me
In4
In5
PUN
00
01
Clk
Mp
Out2
(to PDN)
Only 0  1 transitions allowed at inputs of PDN
Only 1  0 transitions allowed at inputs of PUN
La logica combinatoria
86