Download Aggressor Alignment for Worst-Case Coupling Noise

Document related concepts
no text concepts found
Transcript
Full Chip Analysis
Chung-Kuan Cheng
Computer Science and Engineering Department
University of California, San Diego
La Jolla, CA 92093-0114
[email protected]
Outlines
Introduction
II. Circuit Level Analysis
III. Logic Level Analysis
I. Timing Analysis
II. Functional Analysis
IV. Mixed Signal Analysis
V. Research Directions
VI. Conclusion
I.
2
I. Introduction
1.
2.
3.
Trends of On-Chip Technologies
Statistics About Design Flaws
Spectrum of Analysis
3
I.1 Trends of On-Chip Technologies
System:
Huge Numbers of Devices and Wires
Power/Ground Distribution:
Low Voltage, High Current
Wires: Lateral Coupling, Fragmented Parasitics
Devices: Modeling, Noise
Mixed Signal Design: RF+Analog+Digital
4
5
6
Power/Ground Distribution (ITRS)
Lower V margin: Higher I & Inductance x Freq.
Supply
Voltage(V)
Max
Power
On-Chip
Freq(MHz)
Off-Chip
Freq(MHz)
2002
2003
2004
2005
1.5
1.5
1.2
1.2
130
140
150
160
1,600
1,724
1,857
2,000
885
932
982
1,035
7
Static Vs. Dynamic Voltage
Drop
Current envelope
Dynamic - peak current
Static- average current
(n+2)T
(n+1)T
nT


Wire sizing can be used to control static drop
Precise de-cap insertion filters peak current spikes
Courtesy of Apache
8
Flip Chip Dynamic Effects
F:MHz
Vdd
Static
IR
Dynamic
Ri(t) Ldi/dt
Dynamic
Total
Static vs.
Dynamic
250
1.8
9.9
mV
17.3
mV
1.2
mV
18.5
mV
0.5%
1.02%
500
1.5
16.2
mV
29.3
mV
12.3
mV
41.6
mV
1.08%
2.77%
750
1.2
19.3
mV
36
mV
28.5
mV
64.5
mV
1.6%
5.37%
1,000
1.0
22
mV
38.8
mV
41.6
mV
80.4
mV
2.2%
8.0%
Courtesy of Apache
9
Wire-bond Dynamic Effects
F:MHz
Volt
Static
IR
Dynamic
R i(t) Ldi/dt
Dynamic
Total
Static vs.
Dynamic
133
1.8
103
mV
147
mV
3 mV
150
mV
5.7%
8.3%
250
1.5
181
mV
275
mV
13 mV
288
mV
12%
19.2%
400
1.2
200
mV
276
mV
75
mV
351
mV
16.6%
29.2%
Courtesy of Apache
10
11
Increasing System Complexity
RF front end
Complex
converters
DSP
Memory
Courtesy of Mentor
12
I.2 Statistics about Design Flaws
Percent of Total Flaws Fixed in IC/ASIC Designs Having
Two or More Silicon Spins

Collett Intl. 2000 Survey
Logical or Functional
47%
Slow Path
7%
Clocking
5%
Race Condition
5%
Power
4%
Mixed-Signal Interface
4%
IR Drops


2%
0%


3%
Other flaws


12%
Yield


13%
Noise

10%
20%
Logical or Functional
Slow Path
Noise
30%
Logical or Functional
Analog
Noise
Slow Path
Mixed-signal interface
Clock, Power/Ground
Firmware
67%
Logical or Functional
Analog Circuit
Noise
40%
50%
Slow Path
Clocking
Yield
Mixed-Signal Interface
IR Drops
Race Condition
Power
Firmware
Other flaws
35%
29%
28%
25%
23%
21%
20%
17%
17%
13%
4%
0%
10% 20% 30% 40% 50% 60% 70% 80%
Collett Intl. 2001 Survey
13
I.3 Spectrum of Analysis
D Circuit
e
v
i Electrical
c
e
Physics
Logic
Behavior
EE
Timing
Switches
Extraction
Circuit theory
Precision
Complex, Real Math
Gate
RTL
S
y
s
t
e
m
CS Engineering
Algorithms
Database
Programming
Discrete
14
I.3 Spectrum of Analysis(flow)
System
Software
emulation
power
clock
Hardware
Floorplan
global wires
cross talk
Function
Timing
Circuit Anal.
Architect
Logic
Layout
function
freq
critical paths
Library
IP blocks
Analog
characterization
mixed signal
power noise
Chip
15
I.3 Spectrum of Analysis(coverage)
Coverage
Synopsys Cadence Mentor
Logic: Static Timing (sign off)
Mentor
Cadence Logic: Functional
Mixed Signal
Cadence
Axis
Circuit Analysis
Synopsys Aptix
Apache
Cadence
Mentor
IBM
ASX
Eldo
Celestry
Synopsys
Nassda
Spice
Iota
Mentor
Hspice
Circuit Size
16
I.3 Spectrum of Analysis(trend)
 Layout
Dominated Analysis
 Power/Ground, Clock
 Wires
 Pre-layout, Post-layout
 Layout Oriented Analysis
 EE + CS
EE=> CS High Complexity
 CS=>EE Deep Submicron Effect


Accuracy and Efficiency
17
II. Circuit Level Analysis
1.
2.
3.
4.
Circuit Analysis Advancement
Circuit Analysis Techniques
Examples
Tasks
18
II.1 Circuit Analysis Advancement
1st Generation (SPICE)
2nd Generation (Fast SPICE)
Memory Usage
Memory Usage
100M
Bytes
100K elements
Next Generation (HSIM)
Memory Usage
1G Bytes
2M elements
512M Bytes
300M elements
CPU Time
Circuit Size
100
hrs
Circuit Size
CPU Time
Circuit Size
CPU Time
20 hrs
100K elements
2M elements
2 hrs
300M elements
Circuit Size
Circuit Size
Circuit Size
Courtesy of Nassda
19
II.2 Circuit Analysis Techniques
Memory: Hierarchical Database
 Circuit Size: Parasitic Reduction
 Device Complexity: Table Model
 Simulation:





Backward Euler, Trapezoidal Integration
Hierarchical Flow
Event Driven (ignoring miller effect)
Mixed Rate, Multiple step sizes (partition)
20
II.3 Examples (HSIM)
Circuit Type
(#MOS, #R,#C,#L)
Total
Elements
Memory
Usage
CPU
Time (hrs)
Memory A
(159M, 159M, 155M,0)
473M
775MB
1.65
Memory B
(3.1M, 5.4M, 4.5M, 88)
13M
195MB
0.69
D/A
(9K,65K,47K,0)
121K
42MB
1.11
PLL
(2K, 8K, 23K, 0)
51K
15MB
0.21
525K
111MB
0.37
Analog
(119K, 175K, 232K,0)
Courtesy of Nassda
21
II.4. Tasks
Input Patterns
Device Mod. Convergence
Event Driven Hierarchy Database
Circuit Red.
Matrix Solver Hierarchical Flow
Integration Partition
Speed
EE
Accuracy
CS
Math
22
III. Logic Level Analysis
1.Separation of Timing and Function
2.Static Timing Analysis
Algorithms, Gate Models, Path,
Cross Talks
3.Functional Analysis
Event Driven, Cycle Based
4.Tasks
23
III.1 Separation of Timing and Function
Timing Analysis
Slew, RC tree
cross talk
Function +
Timing
High Complexity!
Simple timing
Input
model
independent
Functional
Analysis
Input vector
driven
24
III.2 Static Timing Analysis
Algor.: Shortest and Longest Paths Search
Gate Model:
i.
ii.
•
•
Path Model:
iii.
•
•
iv.
v.
Logic: Unate, Binate Signal Propagations
Timing: functions of Input Slope and Output Load
Logic: False Path, Multiple Cycle Path, Cycles of
Combinational Logic, Multiple Clock Frequencies
Timing: RC Tree
Cross Talks: Timing Window, ATPG
Tasks
25
III.2.i Algor.: Path Search
Arrival PI1
Time,
Slew Rate
PI2
PI3
A
B
G
H
F
E
Longest &
Shortest Paths
C
J
D
PO
Required
Arrival Time
0->1 slew rate window 0->1 arrival time window
1->0 slew rate window 1->0 arrival time window
Static Timing Analysis: Worst Case Analysis,
Independent of Input Patterns
26
III.2.I Algor: Path Search(cont)
0/0
PI1
1
1/2
2
0/0
0/0
PI2
PI3
3
2
3
A
G
1
B
2
C
1
1
3/3
F
2
2
3
H
E
2
J
2
PO
2
D
2/4
min/max
27
III.2.I Algor: Path Search(cont)
aminj, amaxj
dji
amini,amaxi
amini=minj aminj+dji
amaxi=maxj amaxj+dji
28
III.2i Algor.: Path Search
0/0
PI1
1
1/2
A
2
0/0
0/0
PI2
PI3
3
2
G
1
F
4/5
3
B
6/7
2
C
1
1
3/3
2
2
3
H
4/4
E
2
J
2
6/10
PO
8/12
2
D
6/8
2/4
4/6
Longest: PI2,G,F,E,D,J,PO
min/max
Shortest: PI2,G,H,J,PO
29
III.2.ii Gate Logic Model: Unate & Binate Signals
a
NAND
y
Unateness: a 0->1 => y 1->0
a
XNOR
a
BDD
y
Check unateness based on BDD
y
Binateness: a 0->1 => y 0->1 & 1->0
30
III.2.ii Gate Timing Model
a
y
Interconnect
Ceff
Slew rate of a
Slew rate of a
Delay of y
Slew rate of y
Ceff
Ceff
31
III.2.iii Path Logic Model: False Path
c8
c4
4 bit adder
Carry skip adder
+
y
4 bit adder
z
c0
p[0,3]
1 C0=1
1011
0100
1111 P[0:3]=1
10000 Z=1
32
III.2.iii Path Logic Model: False Path
c8
4 bit adder
Carry skip adder
c4
y
4 bit adder
z
c0
p[0,3]
False path: c0->y->c4 ->c8
Assumption: z->c4 ->c8 derives results faster
If we erase all false paths, we can identify the
true critical paths and the corresponding input
patterns
33
III.2.iii Path Logic Model: False Path
a
3
3,10
c
b
4
7,14
d
10
False path b->c->d->e
10
17,24
e
16
2
f
red+red=>red
red+blue=>blue
blue +blue=>blue
34
III.2.iv Cross Talk
 WCN:
worst-case noise:
Delay & Glitch
Noise with maximum
pulse height
 Fixed circuit structure
and parameters
 Fixed transition time of
input signals
 Variable arrival time of
input signals
WCN
35
III.2.iv Cross Talk: Timing Window
Aggressor / Victim Input
Victim Output
A1
P1
A2
P2
A3
P3
A4
P4
V0
P5
Aligned arrival time
Skewed peak noise
36
III.2.iv Cross Talk: Timing Window
Aggressor / Victim Input
Victim Output
A1
P1
A2
P2
A3
P3
A4
P4
V0
P5
Sweep line
Skewed arrival time
Aligned peak noise
Aggressor Alignment WITHOUT Timing Constraints
37
III.2.iv Cross Talk: Timing Window
Aggressor / Victim Input
Victim Output
A1
P1
A2
P2
A3
A4
P5
V0
Sweep Line
Aggressor Alignment WITH Timing Constraints
38
III.2.iv Cross Talk: Effective Timing Window
Timing window for aggressor input
Timing window for victim output
Pi
Ai
Ta
Earliest arrival
time
Tb
ta
Latest arrival
time
tb
Earliest peak noise Latest peak noise
occurring time
occurring time
Vmax
Pi
L
a
T Ta
R
b
Tb T
L
a
t ta
tb t
R
b
39
Aggressor Alignment with Timing
Constraints -- Reformulation
Old Sweep Line
New Sweep Line
New Sweep Line
A1
A2
A3
A4
V0
(a) Original timing window (b) Shifted timing window
(c) Expanded timing window
40
III.2.v Tasks
Gate model: power, noise
Path model: special cases
Path model: RCLK reduction Path search in hierarchy
Cross talk
ATPG
Timing window+pattern
Speed
EE
Accuracy
CS
Math
41
III.3 Logic Level: Functional Analysis
i.
ii.
iii.
iv.
Functional Analysis Techniques
Event Driven Analysis
Cycle Based Analysis
Tasks
42
III.3.i Functional Analysis Techniques
 Event
Driven Simulation
 VCS, Verilog-XL, VSS, ModelSim
 Cycle Based Simulation
 Frontline, Speedsim, Cyclone
 Domain Specific Simulation

SPW, COSSAP
43
III.3.ii Event Driven Analysis

Event Wheel



Advantages




Maintains schedules of events
Enables sub-cycle timing
Timing accuracy
Good Debug Capability
Handles asynchronous
Disadvantages

Performance
44
III.3.iii Cycle Based Analysis








RTL Description
All gates evaluated every cycle
Schedule is determined at compile time
No timing
No asynchronous feedback, latches
Regression Phase
High Performance
High Capacity
45
III.3.iv Tasks
Dynamic timing model
Hardware acceleration
Pattern generation
coverage
Speed
EE
Accuracy
CS
Math
46
IV. Mixed Signal Analysis
RF front-end
RF Simulator
Custom Logic & Mixed-Signal
Fast SPICE
Complex
Analog
Traditional
SPICE
Analog IP
VHDL-AMS
Verilog-AMS
DSP
Embedded
Memories
HDL
VHDL/Verilog
Hierarchical
Fast SPICE
Single-Kernel simulator for
full-chip SoC Verification
Courtesy of Mentor
47
IV. Mixed Signal Analysis: Interface
D/A
Analog
Digital
A/D
D/A
Analog Threshold
0,
1,
X
Signal Detector
A/D
Rise, Fall Time
Rise, Fall Resistance
48
Mixed Signal: Mixed Languages
Spice
VHDL
VHDL-AMS
Spice
VHDL-AMS
Verilog
Spice
VHDL-AMS
VHDL
Spice
Verilog-A
VHDL
Verilog-AMS
Spice
Verilog-A
VHDL-AMS
C
Verilog
Verilog
Verilog-A
C
•Single Kernel Architecture
•Single Netlist Hierarchy
•Automatic D/A and A/D converter insertion
Courtesy of Mentor
49
IV. Tasks
Language
RF, Analog,
Power, Noise,
Interface
Convergence
Partition
Compiler
Speed
EE
Accuracy
CS
Math
50
IV. Research Directions
Hierarchy Management
 Analysis + Optimization
 Layout Oriented Analysis
 Circuit Reduction
 Spice

51
Hierarchy Management
Circuit
Place
Logic
Comp,dynamic,
Rep, buffer
pass
Power, clock
packaging
Wire-length,
Congestion,
Block,alignment,
match,size
Algor
Floorplan
Physical obj
Partition,Bus
View, hierarchy
Route
Pattern,vias
Analysis
verification
Signal integrity
Ir drop, didt
manufacturability
52
Hierarchy management
Design process
algor logic layout
53
Hierarchy Management (cont.)
•Hierarchy Tree Construction
•Hierarchy Tree Transformation
•Incremental Changes
•Graph Process on Tree Structure
54
Analysis and Optimization
i. Circuit Reduction
ii. Transient Analysis
iii. Optimization of
• power/ground: pads, decoup caps,
network
• clock networks: topology, shield,
decoup caps
• Buses: shield, topology
55
Layout Oriented Analysis
•Huge Circuitry
•Millions of nodes
•Whole Chip Analysis
•Power/Ground, Substrate, Analog
•Guaranteed Accuracy
•Accuracy vs Execution Time
•Construction or Incremental
Changes
56
Layout Based Signal Analysis
•Generalized Y-Delta Transformation
•R,C,L,Coupling, Sources
•Natural Frequency
•Realizability
•Hierarchical Circuit Analysis
57
Conductance in parallel
1
g1
g2
2
1
y12
2
y12  g1  g2
58
Conductance in series
1
g1
0
g2
2
1
y12
2
g1  g 2
y12 
g1  g 2
59
Conductance in Y-structure
1
1
g1
g2
2
0
y13
y12
g3
y 23
3
2
yij 
3
gi  g j
g1  g2  g3
60
Admittance in Y-structure
1
1
g
1
ls
y13
y12
cs
0
y 23
3
2
e.g.
2
1
g
g
ls
y12 

2
1
1

gls

cls
g   cs
ls
3
61
Admittance in Y-structure, with current source
1
g
1
ls
I
0
I2
cs
4
y 23
3
y12 is the same , and
y13
y12
4
2
1
2
3
1
I
I
ls
I2 

2
1
1

gls

cls
g   cs
ls
62
K-element

3
1
I1
km s
V1

2
1
I2 
1
1
1
k1 1s
k 2 2s
V2
3
1
1
1
1
k1 1s
km s
km s
k 2 2s

4
 km1 s
2
 km1 s
4
 k111 s V1  km1 s V2  I1
1
 km s V1  k221 s V2  I 2
63
Reduction example
64
Waveform Estimation
Transient response
evaluated using Y-Δ
transformation with
Hurwitz polynomial
approximation.
8th order stabilized YΔ models are used
for near-end and farend node waveform
evaluation.
Only a 3rd order AWE
model is obtained.
65
Efficiency Comparison
CPU time (s)
Elements
Circuit
type
Tree-like
Mesh-like
#R
#L
#C
Stabilized
Y-Δ *
Spice3f4
Efficiency
1035
1034
1001
0.34
3.94
11.58
16397
16394
14299
11.42
134.05
11.74
1675
2439
733
5.22
73.19
14.02
8035
0
8038
2.07
25.95
12.54
66941
0
67119
41.25
1536.77
37.26
*15th order Y-Δ transformation is used.
66
Delay Accuracy Vs. Efficiency
Model order
RC*
RLC**
50% delay
Delay
90% delay
Accuracy
Delay
Accuracy
Efficiency
3rd
28.9
95%
33.4
95%
29.3
6th
27.9
99%
31.9
99.6%
27.9
3rd
49.5
96%
72.5
96%
14.7
6th
52.7
97%
71.5
97%
11.5
12th
51.2
99.8%
69.9
99.7%
10.1
* 50% delay is 27.6ps, and 90% delay is 31.7ps.
** 50% delay is 51.3ps, and 90% delay is 69.7ps.
R  10, C  10 fF , L  10 pH .
67
Observe Overshooting
Model Order *
Overshooting **
Accuracy
Efficiency
3rd
2.627
97.8%
11.1
6th
2.699
99.5%
10.7
12th
2.683
99.9%
10.3
* Mesh-like RLC circuit is tested, with R  10, C  10 fF , L  10 pH .
** Waveform will converge to DC 2.5v.
68
Pole Analysis
Both AWE and Y-Δ
transformation
have
artificial positive poles;
High order AWE tends
to collapse approximate
poles, hiding other less
dominant ones.
Y-Δ transformation with
model
stabilization
yields no positive poles,
and has broader band in
pole estimation.
69
V. Conclusion
 Layout
Oriented Analysis
 Unified tools combining EE and CS
with Math as foundation
 New Methodologies
 Larger Circuits, Shorter Product
Turnaround
70
References
M. Marek-Sadowska, UCSB
 L.T. Pileggi, CMU
 CK Cheng, et al, Interconnect Analysis and
Synthesis, John Wiley
 ACM/IEEE Design Automation Conf.
 IEEE/ACM Int. Conf. On CAD
 Apache, Nassda, Mentor, Synopsys,
Cadence, Celestry, IBM, and etc.

71
Related documents