Interconnect Driven CAD
Michael Orshansky, UC Berkeley
A. Nardi, UC Berkeley
A. Kuehlmann, UC Berkeley
D. Sylvester, U. of Michigan
P. Chen, UC Berkeley

Conventional Design Flow
[Flow diagram. Front-end: Funct. Spec, RTL, Behav. Simul., Logic Synth., Stat. Wire Model, Gate-level Net., Gate-Lev. Sim. Back-end: Floorplanning, Parasitic Extrac., Place & Route, Layout]

Trends in DSM CAD
• System complexity increases (billions of transistors)
  - Introduce higher levels of abstraction
  - System-level design, simulation
  - Platform-based design, etc.
  - Abstraction hides details
  - Limits interactions between low-level properties
• Complexity of device and interconnect behavior at physical level increases
  - Bring in more detail of the physical world into simulation / modeling / optimization ("reduce abstraction")
  - Physical behavior becomes more coupled

Modifications of CAD Flows
• Used to have clear separation of front-end (logic) and back-end (physical) flows
• Physical concerns must be addressed early in the design flow
  - No more clear logical/physical dichotomy
• Emerging design flow
  - Higher levels of abstraction
  - Power as a high-level design objective
  - Better interconnect delay models
  - Timing-driven floorplanning, placement, routing, everything
  - Interaction between front-end and back-end

Deep Sub-Micron (DSM) Design Flow

Deep Sub-Micron (DSM) CAD: Preview
• A multitude of new DSM effects become prominent
  - New reliability concerns
  - New optimization objectives for CAD
• Interconnect-dominated systems
  - Interconnect extraction and modeling
  - Coupling capacitance between wires
  - Timing closure
• Challenges of manufacturability of VLSI
  - Back-end flow changes
  - Statistical design techniques
• Power consumption
  - Power as primary design objective
  - Thermal and reliability concerns
  - New techniques to control power

Interconnect Dominated Systems
• Used to be concerned only with functional units (transistors/gates)
  - Interconnects were secondary in determining circuit behavior
• Interconnect parasitics (resistance, capacitance and inductance) dominate circuit delay
• Hard to predict exact wire loads early
  - Difficult to achieve timing convergence between pre-layout and post-layout
  - Multiple iterations required
• Need accurate models for interconnect for final timing signoff
  - Parasitic extraction from layout
  - 3-D full chip electromagnetic extraction
  - Problem is computationally difficult
  - Use Reduced Order Modeling

Wires: Second Class to First Class
• Interconnect effects are no longer secondary
  - # of wires ∝ # of devices
  - More metal levels
  - RC delay
  - Rising frequencies -- inductive effects
  - Coupling capacitance -- noise
• As devices get smaller, wiring capacitance can be a large component of load
• Local wire length
  - Scales with devices
• Global wire length ~ chip size

Wire Capacitance Trends
• Capacitance, C, is the measure of ability to store energy in the form of separated charge
• To first order:
  - C ∝ Area of overlap
  - C ∝ 1 / distance
• Interconnect geometries change with scaling
• Distance between wires is shrinking
  - Line spacing < Dielectric thickness
  - Metal thickness > Metal width
[Figure: M1 wires over substrate (Sub), past vs. present / future]

Wire Capacitance Trends
• Coupling capacitance dominates interlayer capacitance
• Impossible to estimate actual capacitance until we know local wire density (distance to neighboring wires)
  - Routing and congestion analysis is critical!
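
The first-order relations on the two capacitance slides (C ∝ overlap area, C ∝ 1 / distance) can be turned into a rough numeric comparison. The sketch below is only an illustration under stated assumptions: it uses a parallel-plate estimate with no fringing fields, the two geometries (`PAST`, `SCALED`) and the dielectric constant 3.9 are hypothetical values chosen to satisfy "line spacing < dielectric thickness" and "metal thickness > metal width", and the function names are not from the slides.

```python
EPS0 = 8.854e-12  # vacuum permittivity in F/m


def plate_cap_per_length(k, overlap, distance):
    """Parallel-plate capacitance per unit wire length (F/m).

    First-order only: C is proportional to overlap and to 1/distance;
    fringing fields are ignored.
    """
    return k * EPS0 * overlap / distance


def wire_caps(k, width, thickness, spacing, dielectric_height):
    """Return (coupling, interlayer) capacitance per unit length.

    coupling:   both sidewalls facing same-layer neighbours
    interlayer: bottom face over the layer (or substrate) below
    """
    coupling = 2 * plate_cap_per_length(k, thickness, spacing)
    interlayer = plate_cap_per_length(k, width, dielectric_height)
    return coupling, interlayer


def coupling_to_interlayer_ratio(k, **geometry):
    coupling, interlayer = wire_caps(k, **geometry)
    return coupling / interlayer


# Hypothetical geometries in metres (assumptions, not data from the slides).
PAST = dict(width=1.0e-6, thickness=0.5e-6,
            spacing=1.0e-6, dielectric_height=0.5e-6)
SCALED = dict(width=0.2e-6, thickness=0.4e-6,
              spacing=0.2e-6, dielectric_height=0.4e-6)

for name, geometry in (("past", PAST), ("scaled", SCALED)):
    ratio = coupling_to_interlayer_ratio(3.9, **geometry)
    print(f"{name}: coupling / interlayer = {ratio:.2f}")
```

With these assumed dimensions the ratio moves from below 1 to well above 1, which is the trend the slide describes. The estimate still needs the neighbour spacing as an input, so it cannot be evaluated before local wire density is known.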

Wire Resistance Trends
• R is rising with scaling interconnect dimensions
  - Reduces the applicability of simple, capacitance-only models
• If R is large, the wire has an intrinsic delay associated with it
  - RC delay is the time the wire takes to charge its own capacitance through its own resistance
• To stop rise of R, use wider/thicker wires
  - Wider wires use more routing space, making the design larger
  - Thicker wires lead to larger capacitances between wires -- this leads to noise
• Use new materials with lower ρ
• Global RC delay will get much worse while local RC will only increase slightly

Statistical Wirelength Estimation
• Estimate average wiring length for point-to-point nets by applying Rent's rule recursively:
  - Partition the chip into hierarchical divisions
  - Estimate the connections between partitions by Rent's rule
  - Lavg = f(Ng, β)
• Wire length also depends on fan out of the net
  - Lavg(FO) = Lavg (1 + 0.4 (FO − 1)) (J. Davis)
[Figure axis: wirelength (gate pitches)]

Wire Load Models
• Synthesis must produce circuits that meet the designer's timing constraints
• Wire delay estimated by typical (average) wirelength given by Rent's rule
  - Function of FO
  - Function of block size
• Lumped RC model used
[Figure: wire cap vs. block size (gates), 100 to 3000, for fanout 1 to 5]

Logical and Physical Co-Design
• Interconnect delay is the dominant component of circuit delay (up to 60-80%)
  - Difficult to achieve timing convergence between pre-layout and post-layout
  - To evaluate wire delay, need wire congestion, which requires full placement and routing
  - Multiple iterations required
• Canonical logic synthesis flow (mid 80s to mid 90s)
  - Timing done at gate-level netlist
  - Worked as long as wire cap was less than 20%
  - Usually augmented with statistical wire-load model (WLM)
• What are the possible fixes?
  - From statistical wire-load models (WLM) to custom WLMs
  - Design smaller blocks then assemble them (<50K gates)
  - Gain-based synthesis
  - Refinement-based flow

Custom Wire Load Models
• Custom wire-load model flow (parasitic estimation):
  - Perform initial placement and routing
  - Generate estimated loads
  - Use these to generate custom models and back-annotate wire cap into synthesis
  - Not based on actual routing data (in contrast to parasitic extraction)
• Iterative back-annotation can result in a different netlist at synthesis -> different P&R -> different WLM -> different timing -> may not converge
• Little latitude to fix problems late in the flow
(Monterey Design Automation)

Designing With Smaller Blocks
• Perhaps statistical WLMs can still be used for small blocks of logic (Sylvester/Keutzer, "Getting to the Bottom of Deep Submicron")
  - Smaller than 50k cells
• Divide netlist into blocks
  - Intra-block interconnect delay can be neglected or roughly estimated
  - Assemble blocks together
• Requires time budgeting at the block level and at the chip level
• Assembly must respect physical boundaries of blocks, so that the intra-block delays are preserved
  - Placement is constrained
  - Timing and/or congestion problems
• Virtually impossible to estimate inter-block delays
  - Long interconnects depend on relative placement of blocks
• Statistical WLM may not predict timing accurately even for small blocks due to routing congestion
  - Congestion impacts WLM predictability
  - If routes in the block are forced to meander in congested areas, the net capacitances increase substantially

Gain-Based (Constant-Delay) Synthesis
• Tries to put synthesis and placement into same optimization space
  - Don't size until you know the load!
• Solve problem in two phases
  - Logic synthesis: assign delays s.t. timing reqs are met (no sizing)
  - Placement & routing -> get accurate wire loads
  - Size each gate to keep delay fixed
• Make delay model dependent on a single variable (gain)
  - Rather than sizing and loading
• Gate delay model is linear
  - Transistor is an effective resistance inversely proportional to width
  - Discharge network modeled as a linear capacitance composed of constant part CL and device-dependent part
  - Not dependent on input transition time

Gain-Based Delay Equation
• Idea: in the delay model, capture the topological (circuit) properties of a logic gate that are independent of gate sizing
• Logic effort equation: τ = g f + p
  - g = logical effort of a gate; represents effect of gate's internal topology on its ability to produce output current
  - g describes how much worse a given gate is at producing output current compared to the inverter
  - Size independent; depends on function, topology, relative transistor dimensioning in the gate type
  - 1/f = gain (electrical effort) of the gate. Describes how the gate's electrical environment affects its performance. Gain is simply defined as Cout/Cin
  - p = gate's intrinsic delay
• Input capacitances scale linearly (Cin = fj Cj) with the load
• Gate areas are proportional to capacitance

Gain-Based Synthesis
• Phase 1: Determine delays per gate
  - Find {fj} (i.e. gate sizes) to minimize the total area while meeting delay requirements
• Phase 2: Determine sizing
  - Given {fk} and Ci for each of the primary outputs, can compute Ci in reverse topological order
• Problems:
  - Delay model is linear, not dependent on input transition time
  - Optimization is gain-constrained, not area-constrained
  - Assumes that cells can be freely sized
  - Sub-optimal mapping for discrete libraries

Refinement-Based Flow
• All design variables must be considered together
• A variable of a design in progress is estimated with some uncertainty
  - A parameter cannot be optimized beyond the range of its estimation!
  - Don't waste time optimizing a variable until you can estimate it
• Start with initial netlist and physically build it, increasing resolution of every parameter simultaneously
(Monterey Design)

Physical Challenges of DSM Interconnects
• In addition to methodological challenges, DSM interconnect creates many difficult issues related to physics and modeling of scaled wires
• Signal integrity
• Inductance
• IR drop
• Electromigration

Noise / Signal Integrity
• Noise can be defined as anything that causes a node to deviate from Vdd or GND when it should otherwise have a stable HI / LO value [Shepard 96]
• Noise sources cause signal integrity problems
  - Coupling capacitance: causes crosstalk and delay degradation
  - Package-level inductance: L * di/dt voltage drop
  - Power grid IR drop: reduced noise margins, slower gates

Signal Integrity
• Wire-to-wire cap is dominant
• Capacitive coupling results in cross-talk
• When neighboring lines switch, a quiet line experiences a glitch due to coupling
• Can lead to
  - Logic faults
  - Voltage overshoot (stress, forward-bias PN junctions)

Crosstalk
• Delay degradation
  - When neighboring lines switch in opposite direction of victim line, delay increases
  - Capacitance is effectively increased through Miller effect
• A stable line may experience a glitch, so a wrong value is latched
[Figure labels: Noise Propagation; Increased Delay]

Crosstalk Noise Inducing Timing Variation
[Figure: aggressor, victim and next stage; victim waveform with noise crossing Vdd/2 at different times t for opposite vs. same switching direction]

Electrical Model of Cross-Talk
• Crosstalk coupling model
[Figure: simple model with aggressor (transition time tf, Ca), coupling cap Cx, victim (Rv, Cv), VDD, and victim waveform Vv]
• Maximum delta voltage:
  $\Delta V_v = V_{DD} \, \frac{C_x R_v}{t_f} \left( 1 - \exp\left( \frac{-t_f}{R_v (C_x + C_v)} \right) \right)$
• Cap. ratio:
  $\Delta V_v \approx V_{DD} \, \frac{C_x}{C_x + C_v}$ when $t_f \ll R_v (C_x + C_v)$

Cross-Talk Modeling: Miller Factor Computation
• Decoupling approximation for delay calculation
  - Goal: using decoupling capacitance to emulate the loading effect of the original coupling circuit
• Could use
  - Cg = 0 Cx for same-direction switching
  - Cg = 2 Cx for opposite-direction switching
• Approximation is not conservative
  - Doesn't capture dynamic behavior
[Figure: coupling cap Cx replaced by grounded caps Cg]

Circuit Model for Crosstalk

True Crosstalk Noise Analysis Flow
• Need noise-aware STA + RC extractor
  - Computational complexity (too many nets)
  - Should not be overly conservative
• Prune netlist to find min set of nets to analyze
  - Determine amplitude of noise on victim net
  - If noise is larger than threshold, generate a violation
[Flow diagram: Layout; Spatial Pruning from Layout; Temporal Pruning; Functional Pruning; Coupling Model; Delay Model; Functional Model; Susceptible Nets]

Locating Noise Problem
• Spatial Pruning
  - Identify nets with large coupling capacitance from layout
• Electrical Pruning
  - Compute delta voltage bound in terms of driver strength and slew rate
• Temporal Pruning
  - Compute timing windows in which noise coupling is significant
• Functional Pruning
  - Justify one input vector pair to exercise the maximum noise condition

Temporal and Functional Pruning
• Similar to static timing analysis
  - Identify the "noisy" window of concern
  - Locate possible range of transitions
[Figure: window of concern; victim with possible range of switching; constant signal; aggressor]
• Similar to false paths in STA
  - Use functional pruning to eliminate signals that can't be responsible for noise due to their functional relationship
• Zero-delay model
  - Temporal information is ignored
  - If aggressor transitions against a victim within a clock cycle, a conservative assumption is made that switching windows were correlated for worst noise
• Timed Boolean calculus approach

Design for Crosstalk
• Cross-talk analysis done at post-layout stage
• Router tries to fix problem
  - Space / shield afflicted nets
  - Switch a net to a different layer
  - Requires major rip-up / no guarantee of feasibility
• Need noise-aware routing
  - Constrain router not to exceed Lmax of parallel lines
  - Doesn't include switching window
  - Over-constraining of routing
  - Irresolvable congestion problems
• Move cross-talk capability to placement and global routing
  - Problem nets are given more space
  - Need gridless detailed router
[Figure labels: GND, VDD, GND, Substrate]

Inductance
• Inductive effects become important on chip
• Inductance, L, is the measure of ability to store energy in the form of a magnetic field
  - Inductance of a wire consists of a self-inductance and a mutual inductance term
• At high frequencies (>4GHz), inductance can become an appreciable portion of the total impedance
  - Transmission line effects
  - Reflections and ringing
  - Slowed propagation
  - Need impedance matching
• Inductance is important when
  $\frac{t_{rise}}{2\sqrt{LC}} < \text{Length} < \frac{2}{R}\sqrt{\frac{L}{C}}$
  - Line must be long for the time-of-flight to be comparable to rise time
  - Line must be short enough such that attenuation does not eliminate inductive effects

When is Inductance Important?
• Higher performance leads to higher inductive effects
• Inductance is most significant in long, fast-switching nets with low resistance
  - Clocks are most susceptible to self-inductance
  - Bus signals are subject to mutual inductance
• Inductance is a weak function of conductor dimensions (logarithmic)
• Inductance is a strong function of current return path distance
  - Want to have a nearby ground line to provide a small current loop
• Evaluating return path is extremely difficult
  - Inductance increases unpredictability of circuit behavior

Inductive Effects
• Increased delay
• Voltage overshoot
• Ringing / non-monotonic voltage response
• Inductive noise (coupling)
• Supply / ground bounce
(X. Qi, CICC)

Dealing with Inductance
• DEC approach in Alpha 21264 -- use entire planes of metal as references (Vdd and GND) to eliminate inductance
  - Loss of routing density, added metal layers reduce yield & raise costs
• Another industry approach uses shield wires every ~3 signal lines in a dense array
[Figure: bus lines between Vdd and GND shield wires]

Power Distribution: IR Drop
• Wire resistance increases
• On-chip supply voltage decreases
• IR drop: voltage drop of the power supply due to current flowing in the power/ground resistive network
• Result: variation of actual Vdd level across the chip
• Distribution depends on current flow within chip
• Need accurate transistor-level analysis capability to correctly predict Vdd distribution
[Figure labels: VDD, V'DD]

Power Distribution: IR Drop
• Unpredictable IR drop in power distribution networks causes design failures
• Design and layout of power distribution network must be done early in the design process (then gradually refined)
• Delay is increased
• Noise margins are reduced
(Simplex Solutions, Inc.)

Power Distribution Network
• IR drop analysis early in design flow
• Should drive power distribution network routing
• Reduce max distance between supply pins and circuit supply connections
  - Finger-shaped power distribution network
[Figure labels: VDD, GND]

Thermal Effects in Interconnects
• Thermal effects arise due to self-heating (Joule heating) of interconnect due to current flow
  - Wire Temperature ~ Wire Switching Power
• Wire temperatures increase
• Causes of more severe thermal effects
  - Higher # of interconnect levels
  - Higher current density
  - Use of low-k dielectrics
  - Increasing thermal coupling
(K. Banerjee)

Thermal Effects: Reliability Degradation
• Thermal heating causes reliability problems
• Electromigration: transport of mass in metals under an applied current density
[Figure: failure type: open; failure type: short]
(K. Banerjee)

Thermal Effects: Impact on Circuit Performance
• Current densities grow
• TTF = time to failure due to electromigration decreases
• Electromigration especially problematic in
  - Power buses
  - Also, clock and signal buses
• Over-design to deal with electromigration
  - Increase cross-sectional area
  - Leads to congestion
• Need tools to compute actual current density
  - Do early, while routing space is available
(K. Banerjee)

Thermal Effect: Impact on Circuit Performance and Estimation
• Long wires experience substrate thermal non-uniformity
• Delay is temperature dependent
  - Self-heating degrades circuit performance
• Will be significant below 100nm
• Assumption of uniformity leads to delay estimation errors
• 5-6% delay increase per 20 °C of temperature difference
(K. Banerjee)

Thermal Effects: Electro-Static Discharge (ESD) and Reliability
• Electrostatic Discharge (ESD) is a
  - Short duration (< 200 ns)
  - High current (> 1 A) event
• Reliability danger:
  - Open circuit failure of metals
  - Latent damage that impacts EM reliability
• Interconnect failure temperatures ~1000 °C
• Failure current densities are much higher than under normal circuit conditions
(K. Banerjee)

New Wire Material - Copper
• All 0.18 µm processes are replacing Aluminum wiring with Copper
  - 40% lower resistance AND ~100X longer electromigration lifetime
• Copper is the last metal -- new wiring schemes will need to be radically different (e.g. superconducting, optical)
[Figure: TSMC Copper Process]
• Gives 12% improvement over an aluminum process in a PowerPC design
(D. Sylvester)

New Materials: Low-k Dielectrics
• Lower wiring capacitance leads directly to lower delay and power consumption
• Helps reduce noise in short to intermediate length wires
• Industry outlook:
  - 0.18 µm processes will incorporate dielectrics ranging from k = 2.7 to 4.0
• Ultimately, aerogels may be used with k ~ 1

Decreasing Transistor Length (Lgate)
[Figure: "The 'driver' behind scaling"; Idsat (µA/µm), 200 to 800, vs. drawn channel length (µm), 0.08 to 0.26, for NMOS and PMOS]
• Much innovation in device engineering in order to preserve a transistor
  - Short channel effects force looking at new transistor structures
  - SOI, dual gate transistors, FinFETs
• Despite scaling, current capabilities are fairly constant
  - Reduction of capacitance is key!
(D. Sylvester)

Gate Oxide Scaling
• ~2nm physical Tox at 180nm
• Major concern: a huge increase in leakage current density
• Reliability of oxide is in danger
(Intel, VLSI 2000)

Increase of Leakage Current (Ioff)
• Dramatic rise in Ioff
• Gate is unable to turn off conducting channel
• 2-3X Ioff increase / generation
(Jan Rabaey, UCB)
(Intel, VLSI 2000)

Leakage Power Scaling
Process      Vdd   Vt     Wdevice  Istatic     Block area  Pstatic
generation   (V)   (V)    (µm)     (µA/block)  (mm²)       (mW/mm²)
0.25         2.5   0.625  5        2.63        2.5         0.003
0.18         1.8   0.450  3.6      132         1.43        0.166
0.13         1.5   0.375  2.6      589         0.83        1.06
0.1          1.2   0.3    2        2781        0.5         6.67
T = 50 °C, Block size = 50K
[Table annotations: 0.001% of Pdyn; ~2.3% of Pdyn]
• Due to Vt scaling, leakage power is becoming a larger portion of the overall power consumption
• Higher operating temperatures will lead to enhanced static power dissipation

Increase of Gate (Tunneling) Current
• Due to oxide leakage, current through gate (!) will become comparable to 'normal' channel current!!!
• Total ILEAKAGE = IOFF + IGATE
(Intel, VLSI 2000)

Reduction of Vdd and Vth
• Vdd and Vth are reduced
  - Will be ~0.8X per generation
• Driven by reliability and power concerns
• Can't reduce Vth as much because of high leakage current
• Over-drive strength (Vdd/Vth) gets smaller
(Intel, VLSI 2000)

Device Degradation Due to Hot Carrier
• Hot carriers are energetic electrons excited by E-field in transistor's drain region
• May be injected into gate oxide and cause damage to oxide or interface
• Hot carrier effect degrades circuit performance
  - "Trapped" electrons increase Vt
  - Degrades saturation current ION
  - Circuit speed is degraded
• Degradation is time-dependent
  - Circuit will fail at some point

Device Degradation Due to Hot Carrier
• EDA needs (B. McGaughy, 1998)
  - Need to estimate lifetime (i.e. time to failure)
  - Need to estimate trade-offs between more reliable and slower design

Deep Sub-Micron CAD: Summary
• A multitude of device and circuit design challenges that are rooted in the changing physical realization of CMOS circuits
  - Transistor miniaturization and higher density of wire packing
  - Increased circuit size and complexity
• Interconnect-dominated CAD flows
  - Timing closure, cross-talk, inductance, IR drop, electromigration
• Power consumption and thermal and reliability concerns
  - Electromigration, hot-carrier effects in transistors
• Impact on CAD
  - Physical concerns must be addressed earlier in design flow
  - Flow (methodological) changes required
  - New analysis, estimation, and optimization tools required (e.g. IR drop, electromigration)
• Challenges of manufacturability of VLSI
  - Hard to manufacture a chip that we want