Survey
* Your assessment is very important for improving the work of artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the work of artificial intelligence, which forms the content of this project
Full Chip Analysis Chung-Kuan Cheng Computer Science and Engineering Department University of California, San Diego La Jolla, CA 92093-0114 [email protected] Outlines Introduction II. Circuit Level Analysis III. Logic Level Analysis I. Timing Analysis II. Functional Analysis IV. Mixed Signal Analysis V. Research Directions VI. Conclusion I. 2 I. Introduction 1. 2. 3. Trends of On-Chip Technologies Statistics About Design Flaws Spectrum of Analysis 3 I.1 Trends of On-Chip Technologies System: Huge Numbers of Devices and Wires Power/Ground Distribution: Low Voltage, High Current Wires: Lateral Coupling, Fragmented Parasitics Devices: Modeling, Noise Mixed Signal Design: RF+Analog+Digital 4 5 6 Power/Ground Distribution (ITRS) Lower V margin: Higher I & Inductance x Freq. Supply Voltage(V) Max Power On-Chip Freq(MHz) Off-Chip Freq(MHz) 2002 2003 2004 2005 1.5 1.5 1.2 1.2 130 140 150 160 1,600 1,724 1,857 2,000 885 932 982 1,035 7 Static Vs. Dynamic Voltage Drop Current envelope Dynamic - peak current Static- average current (n+2)T (n+1)T nT Wire sizing can be used to control static drop Precise de-cap insertion filters peak current spikes Courtesy of Apache 8 Flip Chip Dynamic Effects F:MHz Vdd Static IR Dynamic Ri(t) Ldi/dt Dynamic Total Static vs. Dynamic 250 1.8 9.9 mV 17.3 mV 1.2 mV 18.5 mV 0.5% 1.02% 500 1.5 16.2 mV 29.3 mV 12.3 mV 41.6 mV 1.08% 2.77% 750 1.2 19.3 mV 36 mV 28.5 mV 64.5 mV 1.6% 5.37% 1,000 1.0 22 mV 38.8 mV 41.6 mV 80.4 mV 2.2% 8.0% Courtesy of Apache 9 Wire-bond Dynamic Effects F:MHz Volt Static IR Dynamic R i(t) Ldi/dt Dynamic Total Static vs. Dynamic 133 1.8 103 mV 147 mV 3 mV 150 mV 5.7% 8.3% 250 1.5 181 mV 275 mV 13 mV 288 mV 12% 19.2% 400 1.2 200 mV 276 mV 75 mV 351 mV 16.6% 29.2% Courtesy of Apache 10 11 Increasing System Complexity RF front end Complex converters DSP Memory Courtesy of Mentor 12 I.2 Statistics about Design Flaws Percent of Total Flaws Fixed in IC/ASIC Designs Having Two or More Silicon Spins Collett Intl. 2000 Survey Logical or Functional 47% Slow Path 7% Clocking 5% Race Condition 5% Power 4% Mixed-Signal Interface 4% IR Drops 2% 0% 3% Other flaws 12% Yield 13% Noise 10% 20% Logical or Functional Slow Path Noise 30% Logical or Functional Analog Noise Slow Path Mixed-signal interface Clock, Power/Ground Firmware 67% Logical or Functional Analog Circuit Noise 40% 50% Slow Path Clocking Yield Mixed-Signal Interface IR Drops Race Condition Power Firmware Other flaws 35% 29% 28% 25% 23% 21% 20% 17% 17% 13% 4% 0% 10% 20% 30% 40% 50% 60% 70% 80% Collett Intl. 2001 Survey 13 I.3 Spectrum of Analysis D Circuit e v i Electrical c e Physics Logic Behavior EE Timing Switches Extraction Circuit theory Precision Complex, Real Math Gate RTL S y s t e m CS Engineering Algorithms Database Programming Discrete 14 I.3 Spectrum of Analysis(flow) System Software emulation power clock Hardware Floorplan global wires cross talk Function Timing Circuit Anal. Architect Logic Layout function freq critical paths Library IP blocks Analog characterization mixed signal power noise Chip 15 I.3 Spectrum of Analysis(coverage) Coverage Synopsys Cadence Mentor Logic: Static Timing (sign off) Mentor Cadence Logic: Functional Mixed Signal Cadence Axis Circuit Analysis Synopsys Aptix Apache Cadence Mentor IBM ASX Eldo Celestry Synopsys Nassda Spice Iota Mentor Hspice Circuit Size 16 I.3 Spectrum of Analysis(trend) Layout Dominated Analysis Power/Ground, Clock Wires Pre-layout, Post-layout Layout Oriented Analysis EE + CS EE=> CS High Complexity CS=>EE Deep Submicron Effect Accuracy and Efficiency 17 II. Circuit Level Analysis 1. 2. 3. 4. Circuit Analysis Advancement Circuit Analysis Techniques Examples Tasks 18 II.1 Circuit Analysis Advancement 1st Generation (SPICE) 2nd Generation (Fast SPICE) Memory Usage Memory Usage 100M Bytes 100K elements Next Generation (HSIM) Memory Usage 1G Bytes 2M elements 512M Bytes 300M elements CPU Time Circuit Size 100 hrs Circuit Size CPU Time Circuit Size CPU Time 20 hrs 100K elements 2M elements 2 hrs 300M elements Circuit Size Circuit Size Circuit Size Courtesy of Nassda 19 II.2 Circuit Analysis Techniques Memory: Hierarchical Database Circuit Size: Parasitic Reduction Device Complexity: Table Model Simulation: Backward Euler, Trapezoidal Integration Hierarchical Flow Event Driven (ignoring miller effect) Mixed Rate, Multiple step sizes (partition) 20 II.3 Examples (HSIM) Circuit Type (#MOS, #R,#C,#L) Total Elements Memory Usage CPU Time (hrs) Memory A (159M, 159M, 155M,0) 473M 775MB 1.65 Memory B (3.1M, 5.4M, 4.5M, 88) 13M 195MB 0.69 D/A (9K,65K,47K,0) 121K 42MB 1.11 PLL (2K, 8K, 23K, 0) 51K 15MB 0.21 525K 111MB 0.37 Analog (119K, 175K, 232K,0) Courtesy of Nassda 21 II.4. Tasks Input Patterns Device Mod. Convergence Event Driven Hierarchy Database Circuit Red. Matrix Solver Hierarchical Flow Integration Partition Speed EE Accuracy CS Math 22 III. Logic Level Analysis 1.Separation of Timing and Function 2.Static Timing Analysis Algorithms, Gate Models, Path, Cross Talks 3.Functional Analysis Event Driven, Cycle Based 4.Tasks 23 III.1 Separation of Timing and Function Timing Analysis Slew, RC tree cross talk Function + Timing High Complexity! Simple timing Input model independent Functional Analysis Input vector driven 24 III.2 Static Timing Analysis Algor.: Shortest and Longest Paths Search Gate Model: i. ii. • • Path Model: iii. • • iv. v. Logic: Unate, Binate Signal Propagations Timing: functions of Input Slope and Output Load Logic: False Path, Multiple Cycle Path, Cycles of Combinational Logic, Multiple Clock Frequencies Timing: RC Tree Cross Talks: Timing Window, ATPG Tasks 25 III.2.i Algor.: Path Search Arrival PI1 Time, Slew Rate PI2 PI3 A B G H F E Longest & Shortest Paths C J D PO Required Arrival Time 0->1 slew rate window 0->1 arrival time window 1->0 slew rate window 1->0 arrival time window Static Timing Analysis: Worst Case Analysis, Independent of Input Patterns 26 III.2.I Algor: Path Search(cont) 0/0 PI1 1 1/2 2 0/0 0/0 PI2 PI3 3 2 3 A G 1 B 2 C 1 1 3/3 F 2 2 3 H E 2 J 2 PO 2 D 2/4 min/max 27 III.2.I Algor: Path Search(cont) aminj, amaxj dji amini,amaxi amini=minj aminj+dji amaxi=maxj amaxj+dji 28 III.2i Algor.: Path Search 0/0 PI1 1 1/2 A 2 0/0 0/0 PI2 PI3 3 2 G 1 F 4/5 3 B 6/7 2 C 1 1 3/3 2 2 3 H 4/4 E 2 J 2 6/10 PO 8/12 2 D 6/8 2/4 4/6 Longest: PI2,G,F,E,D,J,PO min/max Shortest: PI2,G,H,J,PO 29 III.2.ii Gate Logic Model: Unate & Binate Signals a NAND y Unateness: a 0->1 => y 1->0 a XNOR a BDD y Check unateness based on BDD y Binateness: a 0->1 => y 0->1 & 1->0 30 III.2.ii Gate Timing Model a y Interconnect Ceff Slew rate of a Slew rate of a Delay of y Slew rate of y Ceff Ceff 31 III.2.iii Path Logic Model: False Path c8 c4 4 bit adder Carry skip adder + y 4 bit adder z c0 p[0,3] 1 C0=1 1011 0100 1111 P[0:3]=1 10000 Z=1 32 III.2.iii Path Logic Model: False Path c8 4 bit adder Carry skip adder c4 y 4 bit adder z c0 p[0,3] False path: c0->y->c4 ->c8 Assumption: z->c4 ->c8 derives results faster If we erase all false paths, we can identify the true critical paths and the corresponding input patterns 33 III.2.iii Path Logic Model: False Path a 3 3,10 c b 4 7,14 d 10 False path b->c->d->e 10 17,24 e 16 2 f red+red=>red red+blue=>blue blue +blue=>blue 34 III.2.iv Cross Talk WCN: worst-case noise: Delay & Glitch Noise with maximum pulse height Fixed circuit structure and parameters Fixed transition time of input signals Variable arrival time of input signals WCN 35 III.2.iv Cross Talk: Timing Window Aggressor / Victim Input Victim Output A1 P1 A2 P2 A3 P3 A4 P4 V0 P5 Aligned arrival time Skewed peak noise 36 III.2.iv Cross Talk: Timing Window Aggressor / Victim Input Victim Output A1 P1 A2 P2 A3 P3 A4 P4 V0 P5 Sweep line Skewed arrival time Aligned peak noise Aggressor Alignment WITHOUT Timing Constraints 37 III.2.iv Cross Talk: Timing Window Aggressor / Victim Input Victim Output A1 P1 A2 P2 A3 A4 P5 V0 Sweep Line Aggressor Alignment WITH Timing Constraints 38 III.2.iv Cross Talk: Effective Timing Window Timing window for aggressor input Timing window for victim output Pi Ai Ta Earliest arrival time Tb ta Latest arrival time tb Earliest peak noise Latest peak noise occurring time occurring time Vmax Pi L a T Ta R b Tb T L a t ta tb t R b 39 Aggressor Alignment with Timing Constraints -- Reformulation Old Sweep Line New Sweep Line New Sweep Line A1 A2 A3 A4 V0 (a) Original timing window (b) Shifted timing window (c) Expanded timing window 40 III.2.v Tasks Gate model: power, noise Path model: special cases Path model: RCLK reduction Path search in hierarchy Cross talk ATPG Timing window+pattern Speed EE Accuracy CS Math 41 III.3 Logic Level: Functional Analysis i. ii. iii. iv. Functional Analysis Techniques Event Driven Analysis Cycle Based Analysis Tasks 42 III.3.i Functional Analysis Techniques Event Driven Simulation VCS, Verilog-XL, VSS, ModelSim Cycle Based Simulation Frontline, Speedsim, Cyclone Domain Specific Simulation SPW, COSSAP 43 III.3.ii Event Driven Analysis Event Wheel Advantages Maintains schedules of events Enables sub-cycle timing Timing accuracy Good Debug Capability Handles asynchronous Disadvantages Performance 44 III.3.iii Cycle Based Analysis RTL Description All gates evaluated every cycle Schedule is determined at compile time No timing No asynchronous feedback, latches Regression Phase High Performance High Capacity 45 III.3.iv Tasks Dynamic timing model Hardware acceleration Pattern generation coverage Speed EE Accuracy CS Math 46 IV. Mixed Signal Analysis RF front-end RF Simulator Custom Logic & Mixed-Signal Fast SPICE Complex Analog Traditional SPICE Analog IP VHDL-AMS Verilog-AMS DSP Embedded Memories HDL VHDL/Verilog Hierarchical Fast SPICE Single-Kernel simulator for full-chip SoC Verification Courtesy of Mentor 47 IV. Mixed Signal Analysis: Interface D/A Analog Digital A/D D/A Analog Threshold 0, 1, X Signal Detector A/D Rise, Fall Time Rise, Fall Resistance 48 Mixed Signal: Mixed Languages Spice VHDL VHDL-AMS Spice VHDL-AMS Verilog Spice VHDL-AMS VHDL Spice Verilog-A VHDL Verilog-AMS Spice Verilog-A VHDL-AMS C Verilog Verilog Verilog-A C •Single Kernel Architecture •Single Netlist Hierarchy •Automatic D/A and A/D converter insertion Courtesy of Mentor 49 IV. Tasks Language RF, Analog, Power, Noise, Interface Convergence Partition Compiler Speed EE Accuracy CS Math 50 IV. Research Directions Hierarchy Management Analysis + Optimization Layout Oriented Analysis Circuit Reduction Spice 51 Hierarchy Management Circuit Place Logic Comp,dynamic, Rep, buffer pass Power, clock packaging Wire-length, Congestion, Block,alignment, match,size Algor Floorplan Physical obj Partition,Bus View, hierarchy Route Pattern,vias Analysis verification Signal integrity Ir drop, didt manufacturability 52 Hierarchy management Design process algor logic layout 53 Hierarchy Management (cont.) •Hierarchy Tree Construction •Hierarchy Tree Transformation •Incremental Changes •Graph Process on Tree Structure 54 Analysis and Optimization i. Circuit Reduction ii. Transient Analysis iii. Optimization of • power/ground: pads, decoup caps, network • clock networks: topology, shield, decoup caps • Buses: shield, topology 55 Layout Oriented Analysis •Huge Circuitry •Millions of nodes •Whole Chip Analysis •Power/Ground, Substrate, Analog •Guaranteed Accuracy •Accuracy vs Execution Time •Construction or Incremental Changes 56 Layout Based Signal Analysis •Generalized Y-Delta Transformation •R,C,L,Coupling, Sources •Natural Frequency •Realizability •Hierarchical Circuit Analysis 57 Conductance in parallel 1 g1 g2 2 1 y12 2 y12 g1 g2 58 Conductance in series 1 g1 0 g2 2 1 y12 2 g1 g 2 y12 g1 g 2 59 Conductance in Y-structure 1 1 g1 g2 2 0 y13 y12 g3 y 23 3 2 yij 3 gi g j g1 g2 g3 60 Admittance in Y-structure 1 1 g 1 ls y13 y12 cs 0 y 23 3 2 e.g. 2 1 g g ls y12 2 1 1 gls cls g cs ls 3 61 Admittance in Y-structure, with current source 1 g 1 ls I 0 I2 cs 4 y 23 3 y12 is the same , and y13 y12 4 2 1 2 3 1 I I ls I2 2 1 1 gls cls g cs ls 62 K-element 3 1 I1 km s V1 2 1 I2 1 1 1 k1 1s k 2 2s V2 3 1 1 1 1 k1 1s km s km s k 2 2s 4 km1 s 2 km1 s 4 k111 s V1 km1 s V2 I1 1 km s V1 k221 s V2 I 2 63 Reduction example 64 Waveform Estimation Transient response evaluated using Y-Δ transformation with Hurwitz polynomial approximation. 8th order stabilized YΔ models are used for near-end and farend node waveform evaluation. Only a 3rd order AWE model is obtained. 65 Efficiency Comparison CPU time (s) Elements Circuit type Tree-like Mesh-like #R #L #C Stabilized Y-Δ * Spice3f4 Efficiency 1035 1034 1001 0.34 3.94 11.58 16397 16394 14299 11.42 134.05 11.74 1675 2439 733 5.22 73.19 14.02 8035 0 8038 2.07 25.95 12.54 66941 0 67119 41.25 1536.77 37.26 *15th order Y-Δ transformation is used. 66 Delay Accuracy Vs. Efficiency Model order RC* RLC** 50% delay Delay 90% delay Accuracy Delay Accuracy Efficiency 3rd 28.9 95% 33.4 95% 29.3 6th 27.9 99% 31.9 99.6% 27.9 3rd 49.5 96% 72.5 96% 14.7 6th 52.7 97% 71.5 97% 11.5 12th 51.2 99.8% 69.9 99.7% 10.1 * 50% delay is 27.6ps, and 90% delay is 31.7ps. ** 50% delay is 51.3ps, and 90% delay is 69.7ps. R 10, C 10 fF , L 10 pH . 67 Observe Overshooting Model Order * Overshooting ** Accuracy Efficiency 3rd 2.627 97.8% 11.1 6th 2.699 99.5% 10.7 12th 2.683 99.9% 10.3 * Mesh-like RLC circuit is tested, with R 10, C 10 fF , L 10 pH . ** Waveform will converge to DC 2.5v. 68 Pole Analysis Both AWE and Y-Δ transformation have artificial positive poles; High order AWE tends to collapse approximate poles, hiding other less dominant ones. Y-Δ transformation with model stabilization yields no positive poles, and has broader band in pole estimation. 69 V. Conclusion Layout Oriented Analysis Unified tools combining EE and CS with Math as foundation New Methodologies Larger Circuits, Shorter Product Turnaround 70 References M. Marek-Sadowska, UCSB L.T. Pileggi, CMU CK Cheng, et al, Interconnect Analysis and Synthesis, John Wiley ACM/IEEE Design Automation Conf. IEEE/ACM Int. Conf. On CAD Apache, Nassda, Mentor, Synopsys, Cadence, Celestry, IBM, and etc. 71