Download Power Distribution - Technion

Power Distribution
Final Report for VLSI Interconnect Course (#046884)
EE Faculty, Technion, Spring 2003

Part 1: Introduction – Power Distribution as an Interconnect / Signal-Integrity Challenge
Part 2: Case Review – The MSC8102 DSP

David Kaushinsky
Avshalom Elyada
July 10, 2003

1 Introduction – An Interconnect/Signal-Integrity Challenge

1.1 Power Distribution – An Interconnect/Signal-Integrity Challenge

The purpose of an IC’s Power Distribution (PD) grid is to deliver the required, relatively large, amount of power into a small object: the silicon die. Rough calculations show that a current-technology high-performance IC may consume tens of watts or more, especially if power reduction is not considered. This figure is comparable to a household light bulb, although the bulb’s physical dimensions are of course much larger, and an IC’s reliability and signal-stability requirements are several orders of magnitude more stringent. Such a comparison helps to grasp the scale of the PD challenge.

In addition to sheer power quantities, the PD grid must supply stable voltage levels (within margins) to transistors at varying locations, whose power demands vary in magnitude and in time. The supply must travel a long way: from the source on the application board, through the chip’s package, and from there to the silicon pads. From the pads the voltage “passes” through the on-chip grid to reach each gate’s supply terminals: the PMOS source (for power lines) and the NMOS source (for ground lines).

1.1.1 Cutting Down on Power Consumption

Reducing power is an obvious goal for low-power ICs such as those in mobile devices. However, it is far from true that power reduction, and its application to the design of the PD grid, is of concern only in a low-power device.
Stable power (and reference) voltages, heat dissipation and electrical stress on the device are key factors directly affected by the design of the PD grid, which makes PD an important issue even in designs where ample power is readily available. Although this work does not deal with power-reduction techniques, they are mentioned where appropriate because of their association with PD.

1.2 Dynamic Power Dissipation (DPD)

Ideally, a CMOS inverter (and, similarly, any CMOS gate) consumes power only when it performs a transition and its channel opens (conducts). As can be seen in Figure 1, the time the inverter is open depends on the input’s slew rate and on Vt, the threshold voltage. Hence controlling power dissipation involves controlling these two factors.

Figure 1: Time a CMOS transistor is open as a function of slew rate and threshold voltage

To optimize the slew rate, transistors should be sized appropriately. On the one hand, a larger (wider) driving gate charges the input capacitance of the next stage faster, producing a sharper transition. On the other hand, that same gate’s own input capacitance becomes larger, so its preceding gate drives it more slowly, with a more moderate slew. The optimization called for here is similar to that shown (in several flavors) in class; in class, however, the optimizations were always for minimum delay. A slightly different variation of this method should be used to optimize for power, as discussed in [Cherkauer & Friedman SC-30 1995].

1.3 DPD Rough Estimation

$$ \mathrm{DPD} \sim a f C V^2 \tag{1.1} $$

Equation (1.1) recalls the basic, widely used DPD estimate, where a is the activity factor, f the frequency, C the switched capacitance and V the supply voltage. Some immediate conclusions follow:

• Reducing frequency obviously cuts power consumption, but it hurts performance and is therefore undesirable. We want to regulate consumption without hurting performance.
• Using a lower voltage has the most substantial effect, since power depends on the square of the voltage. However, lowering the voltage draws penalties in other performance aspects such as delay and noise margins, so there is a limit to how far the voltage can be reduced. The supply voltage is also largely dictated by the process technology.
• Reducing the capacitance that must be charged and discharged during a transition is another aspect of PD. Many articles discuss techniques to downsize gates where strong drive capability is not needed, either during layout or as an improving second pass over the database. Also mentioned are power-aware placement and routing algorithms that minimize the length of critical wires, thereby reducing their capacitance.
• The activity factor (a) estimates what average fraction of the chip’s junction capacitance is active in each cycle. Some low-power design techniques, such as clock gating, aim to reduce this factor directly.

1.4 Static Power Dissipation (SPD)

As mentioned in Section 1.2, ideally a CMOS transistor is open only during a transition. In reality, parasitic leakage current causes significant power consumption even when transistors are supposedly closed.

$$ \mathrm{Leakage} \sim I_0 \, e^{-\frac{q V_t}{k T}} \tag{1.2} $$

As can be seen from equation (1.2), leakage is governed by the threshold voltage Vt and the system temperature T. One method of controlling static PD is to design transistors with a higher Vt; however, a higher threshold voltage lengthens delay (see equation (1.3) below). It is also worth noting that controlling dynamic PD can improve static PD as well, because of the strong dependence of static PD on temperature.

Leakage intensifies as technology advances: the number of transistors on a chip grows, while their down-scaling results in a smaller gate-bulk distance. Both factors increase leakage (although leakage is linearly proportional to the supply voltage, which typically decreases with technology). Estimates show that unchecked leakage current in a high-performance device can reach up to 50% of total power consumption.
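Equations (1.1) and (1.2) can be evaluated numerically to see the sensitivities described above. The following is a minimal Python sketch, not a power-estimation method from the cited literature: the function names and every numeric value (activity factor, frequency, capacitance, voltages, temperatures) are illustrative assumptions, and (1.2) is used in the simplified form given here, with the prefactor I0 left out so that only relative changes are compared.

```python
import math

K_BOLTZMANN = 1.380649e-23    # Boltzmann constant, J/K
Q_ELECTRON = 1.602176634e-19  # elementary charge, C


def dynamic_power(activity, freq_hz, cap_farads, vdd):
    """Rough dynamic power estimate per equation (1.1): a * f * C * V^2."""
    return activity * freq_hz * cap_farads * vdd ** 2


def leakage_factor(vt, temp_kelvin):
    """Exponential term of equation (1.2): exp(-q*Vt / (k*T)), I0 omitted."""
    return math.exp(-Q_ELECTRON * vt / (K_BOLTZMANN * temp_kelvin))


if __name__ == "__main__":
    # Assumed example values, chosen only for illustration.
    a, f, c = 0.1, 300e6, 10e-9
    for vdd in (1.2, 1.0):
        print(f"Vdd={vdd} V -> dynamic power ~ {dynamic_power(a, f, c, vdd):.3f} W")
    # Compare a hotter die and a higher threshold against a baseline.
    for vt, temp in ((0.30, 300.0), (0.30, 350.0), (0.35, 300.0)):
        print(f"Vt={vt} V, T={temp} K -> relative leakage {leakage_factor(vt, temp):.3e}")
```

Under these assumptions, halving the supply voltage quarters the dynamic estimate, while the leakage term grows with temperature and shrinks as Vt is raised, which is the trade-off the text describes.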
1.5 Designer Balancing Act

To demonstrate the intricate and opposing considerations involved in PD design, we complement equations (1.1) and (1.2) with a simple delay model of a CMOS inverter [“Alpha-power law MOSFET Model” / Sakurai, Newton 1990]:

$$ \mathrm{Delay} \sim \frac{V_{dd}}{(V_{dd} - V_t)^{\alpha}} \tag{1.3} $$

From these three simple relations we can visualize the “designer balancing act” graphically, as shown in Figure 2. Raising Vdd increases dynamic PD but shortens delay; raising Vt lowers static PD but lengthens delay; a higher temperature T increases static PD. (α is the velocity-saturation index of the alpha-power model, typically between 1 and 2; its exact value is not paramount to this discussion.)

Figure 2: Designer Balancing Act (relations among Vdd, Vt, T, dynamic PD, static PD and delay)

1.6 Typical PD Grid Structure

As illustrated in Figure 3, a typical PD grid is constructed in multiple (3-6) metal layers. Starting from the layer closest to the supply (which can be top or bottom, see below regarding packages), ground and power lines are laid alternately, “one power, one ground”, with regular signal lines in between, since the area is also needed for signal routing. As the layers get nearer to the transistor layer, the power/ground lines become thinner and more tightly spaced; the grid resolution increases until the wires connect to the Vdd and Vss of each and every transistor in a tight and (as far as possible) uniform mesh. These final wires, located at the “bottom” (or “top”) of the grid, are sometimes referred to as power rails. Vias connect lines on neighboring layers. The grid is connected by pads to outside power supplies (the pads are located at the periphery of the die if the package uses the “wire-bond” method, but not necessarily for a flip-chip, see below).

Figure 3: PD grid general structure

Designers aim to increase the wire cross-section as much as possible in order to minimize impedance and current density (see below); however, area is limited because these layers are used for signal routing as well.

1.7 Voltage Drop Effect

Minimization of voltage drop in the PD grid is a primary design challenge.
Large current consumption across long power lines with parasitic impedance can cause transistors to receive varied, low and/or unstable voltage levels from the grid. This affects their performance and can cause failure due to the increased delay that results from the lowered supply. Voltage drop also results in a smaller effective Vt, increasing sensitivity to noise. The effect intensifies as technology advances, since downscaling of interconnect feature size increases wire resistance, and downscaling of Vdd increases the drop as a fraction of the supply. Voltage drop has two aspects:

1.7.1 DC Voltage Drop

Often referred to as IR drop, DC voltage drop is a result of the power lines’ resistance. As can be seen in Figure 4, high current can cause transistors to see lower voltage levels than intended. This problem cannot be entirely solved by raising the value of Vdd, because the drop varies between different areas and even between neighboring transistors.

Figure 4: Voltage Drop

1.7.2 AC Voltage Drop

Voltage drop also takes a transient form, due to fluctuations in local or global current demand and to the capacitive (and, if the frequency is high enough, inductive) character of the loads. In a synchronous chip, a system-wide clock may typically switch thousands of transistors simultaneously, causing current surges that further reduce the supply seen by gates. This phenomenon is called Simultaneous Switching Noise (SSN). On-chip and off-chip decoupling capacitors are typically used to smooth these transients.

1.8 Electro-migration

Electro-migration is a phenomenon in which metal wires are degraded by continuous high current density flowing through them. Along with voltage drop, electro-migration is a key consideration in designing a PD grid. When the current density in a conductor is too high, electrons traveling in the current (the “electron wind”) can cause metal ions to migrate with them.
Ion voids are created upwind where ions have disappeared (Figure 5), and ion accumulations in places where they can travel no further create “hillock” (Figure 6) and “whisker” structures downwind. Conductor resistance increases, and with it voltage drop and subsequently delay. Eventually failure may occur, as voids develop into open circuits and hillocks or whiskers short to neighboring wires. As with many other aspects of IC design, this problem intensifies as current demands rise and conductor cross-sections shrink, resulting in higher current densities.

Figure 5: A void in a metal conductor
Figure 6: A hillock is formed where ions can travel no further

1.9 Effect of Package on PD

Typically, the voltage supply is located on the application board, and in order to reach the silicon die it must pass through the chip’s package, connecting to the package pins. Within the package the power or ground signal travels (basically like any other signal) through one or more layers of the substrate PCB on which the chip is mounted. Only from there can it reach the pads on the die. Therefore any discussion of PD should include the key issues of package selection.

1.10 BGA Package with Wire-Bonding

We discuss here one commonly used silicon package, the BGA (Ball Grid Array) package with wire-bonding (WB), because of its known limitations with regard to PD. In a BGA package, whose cross-section is shown in Figure 7, the die sits atop a small substrate PCB. Power signals travel from the voltage supply located on the application board to the device package. Like any signal going to or from the chip, they connect to the package balls, from where a trace runs on one or two of the 2-6 substrate-PCB layers towards the die located in the center. From the end of the trace, the gap to one of the silicon pads must be crossed; a thin gold thread known as a wire-bond connects the trace from the closest point on the substrate to the silicon pad.

Figure 7: A BGA package with wire-bonding
1.11 Wire-Bonding Limitations

A typical WB may introduce an inductance of about 1nH, a capacitance of around 0.01pF, and a resistance of 50mOhm, making it a substantially large inductive element. WBs cannot be matched because they “hang” in the molding compound, and it is not possible to place capacitors alongside them. The length of the WB cannot be reduced because of the large number of traces that need to connect to the die (see Figure 8) and also because of the need for power rings for Vdd and ground. As seen in Figure 9, some WBs take the long “high loop” while others take the shorter “low loop”.

Figure 8: BGA blueprint overview. Trace endpoints can be seen reaching maximum density at a certain distance from the chip periphery

The substrate itself, along with the traces passing through it, introduces significant impedance as well. Typical figures may be around L=5-15nH, C=2-5pF, and R=100-300mOhm. Traces cannot be widened because of the routing density near the die seen in Figure 8. Power and ground wires are given routing priority: outer pads, shorter path through the substrate, straighter line, and low-loop WB.

A distinct disadvantage of this type of packaging is that power is delivered to peripheral IO pads, which increases the voltage drop near the chip’s center. The need for IO pads to be located on the edges of the die also considerably limits the number of IO connections, power and ground among them, which may become a constraint on the PD grid.

Figure 9: High-loop and low-loop wire-bonds, power and ground rings

1.12 Flip-Chip

One of the main reasons BGA with WB cannot be used in many high-performance ICs is voltage drop. As mentioned earlier, peripheral IO pads cause significant voltage drop in the die’s center, which is not acceptable in a low-voltage, high-frequency device with tens of millions of transistors. In an alternative packaging method known as “flip-chip”, the silicon die is placed upside-down (with the pad contacts on the bottom).
As shown in Figure 10, instead of routing the trace at length through the substrate and then looping the WB up and over to reach the pad, flipping the chip allows signals to connect in an almost direct course to any point on the die’s surface. Eliminating the WB and significantly reducing the power-line travel distance through the substrate removes much of the impedance characteristic of BGA with WB. Furthermore, with the pads no longer constrained to the periphery, the number of IO connections may grow, and relatively more power and ground pads can be allocated. It is also worth noting that with a flipped die and no WBs, better contact between the heat slug and the die is possible, improving heat dissipation.

Figure 10: Flip-chip

The drawbacks of flip-chip are mainly in production costs. Figure 10 shows the solder balls that need to be adhered to the pads to serve as contact points. The flip-chip substrate is also usually a more costly one.

2 Case Review – The MSC8102 DSP

2.1 Abstract

Verification of a highly integrated chip design prior to silicon realization is becoming a complex task. Advanced tools and adaptive techniques are required for fast turn-around of analysis and design-modification iterations. This paper presents a case study of how the ElixIR power grid analysis tool was deployed throughout the development of the MSC8102 product to verify its power distribution network. Fixes to the design layout were carried out on the fly, guided by detailed visual descriptions of unacceptable voltage drop situations. Verification using ElixIR increased the design quality and enabled product tape-out on time.

2.2 Introduction to the MSC8102

a. The 8102 is a modern DSP platform consisting of 4 DSP cores.
b. Four 300 MHz StarCore SC140 DSP extended cores.
c. 16 ALUs
d. 4800 MMACs, 12G RISC MIPS
e. 1436 Kbyte (11.488 Mbit)
f. Dual PowerPC system and local buses
g. Four serial TDM interfaces
h. Flip-chip package.
i. 1.6 Watts, 18x18 or 16x16mm package
j. Industry’s first DSP to use 0.13 micron HiP6 copper process technology
k. ~70,000,000 transistors.

2.3 The Simulation Tool

To ensure the integrity of the on-chip power distribution network, it is necessary to determine the voltage and current distributions in the power/ground rails through simulation and then apply design corrections, as necessary, at various stages of the product design. The voltage distribution helps to identify parts of the power grid experiencing unacceptable voltage drop. Likewise, the current distribution helps to identify sections of wire that carry currents in excess of what is specified for EM reliability. Since current pulled in one section of the power grid affects the voltages and currents throughout the grid, it is necessary to model and simulate the entire chip’s power grid as a whole. As a result, the electrical model to be simulated becomes extremely large, typically a few tens of millions of electrical nodes. Hence a very efficient circuit simulator is required to achieve an acceptable turn-around time for repeated analysis and design modification.

ElixIR is one such power grid analysis tool, developed by the Advanced Tools group in Motorola. ElixIR is capable of simulating multi-million-node power networks very efficiently using many proprietary techniques, such as automatic partitioning of the network and simulating in parallel on multiple computers. The power/ground distribution within blocks (e.g. cores, arrays) and of entire chips can be verified with this tool.

ElixIR offers three modes of operation: early mode, floor-plan mode, and verification mode. The early mode provides exploratory capabilities to help the power grid designer determine adequate wiring widths and pitches, pin-outs, pin placements, and load-balanced floorplanning. In the floor-plan mode, an analysis of, typically, the global power wiring (i.e.
excluding the wiring inside blocks/cells) is done based on the chip’s floorplan, power wiring, and the current demands of the circuit blocks. Multiple runs of the tool in this mode, one after every major design change, help ensure that a power distribution design consistent with the rest of the design is carried through the design stages. Several fixes are made to the power grid in the process to enhance its quality and reliability. It is important to note that the compact and powerful input formats of the tool enable construction of power grids of arbitrary complexity using the early and floor-plan modes. Finally, ElixIR is run in the verification mode using the power wiring and block placement data from the actual layout. This mode uses an electrical parasitic netlist of the power grid, extracted from the layout information using any standard commercial extraction tool. This analysis is helpful in revealing many common problems, such as those due to missing vias, insufficient metallization, missing connections, etc. Our experience shows that requiring a successful verification as one of the tape-out criteria helps to identify and eliminate many design problems that would otherwise be discovered on the silicon, thus saving a great deal of time and money.

ElixIR features two types of analysis: static analysis using average or peak currents, and dynamic analysis using time-dependent currents for the various blocks. The time-dependent currents are obtained through simulation of the circuit blocks, using fast circuit simulators such as PowerMill. ElixIR permits a very flexible input specification of the power grid and current models, so that the tool can be adopted in a wide variety of design flows and used in most design situations.
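To make the idea of static IR-drop analysis concrete, the sketch below solves a tiny resistive mesh. It is not ElixIR’s algorithm (the tool’s techniques are proprietary and are not described here); it assumes a uniform n × n grid with equal segment resistance, an equal DC current drawn at every non-pad node, and ideal pads held at Vdd, and it uses plain Gauss-Seidel relaxation on the node equations. The function name `solve_ir_drop` and all numeric values are hypothetical.

```python
def solve_ir_drop(n, r_seg, i_load, vdd, pads, sweeps=5000, tol=1e-9):
    """Static IR-drop on an n x n resistive mesh (illustrative assumptions only).

    Each node connects to its up/down/left/right neighbours through a
    resistance r_seg (ohms), draws i_load (amperes) unless it is a pad,
    and pad nodes are held at vdd (volts). Returns the node voltages.
    """
    pad_set = set(pads)
    v = [[vdd] * n for _ in range(n)]
    for _ in range(sweeps):
        max_change = 0.0
        for row in range(n):
            for col in range(n):
                if (row, col) in pad_set:
                    continue  # pads are ideal voltage sources
                neighbours = [
                    v[rr][cc]
                    for rr, cc in ((row - 1, col), (row + 1, col),
                                   (row, col - 1), (row, col + 1))
                    if 0 <= rr < n and 0 <= cc < n
                ]
                # Kirchhoff's current law at the node:
                #   sum((v_neighbour - v_node) / r_seg) = i_load
                new_v = (sum(neighbours) - i_load * r_seg) / len(neighbours)
                max_change = max(max_change, abs(new_v - v[row][col]))
                v[row][col] = new_v
        if max_change < tol:
            break  # voltages have stopped moving
    return v


if __name__ == "__main__":
    corners = [(0, 0), (0, 4), (4, 0), (4, 4)]
    peripheral = solve_ir_drop(5, 0.05, 0.01, 1.2, corners)
    with_centre = solve_ir_drop(5, 0.05, 0.01, 1.2, corners + [(2, 2)])
    print("worst node, corner pads only :", min(min(r) for r in peripheral))
    print("worst node, extra centre pad :", min(min(r) for r in with_centre))
```

In this toy model, adding a pad in the middle of the grid raises the worst-case node voltage, which matches the flip-chip argument of Section 1.12. Grids with millions of nodes need far more efficient solvers, together with the partitioning and parallel execution mentioned above.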
2.4 Early Mode Analysis

• Location of Vdd/Gnd pads
• Nominal pitch and widths of metal layers
• Via styles (point or bar vias)
• Parameters of the chip package
• IR-drop analysis is performed using very simplistic models of the grid topology and the block currents.
• In the areas where the metal lines of adjacent layers cross over, vias are placed according to user-specified via geometries and via styles.
• The clean Vdd/Gnd pads can be placed on the surface of the chip using C4 pads (for flip-chip).

Even though the real power grid will not be as regular as the mock grid, and the devices will not all draw the estimated current simultaneously, important design decisions are made from the results of this simple analysis, and an early picture of the robustness of the grid is obtained.

2.5 Post Floorplan Analysis

• The global power distribution network has been designed.
• Blocks have been placed.
• The locations and geometries of the power lines and the blocks are read from the design database.
• The power grids within blocks have not yet been wired.
• Analysis is performed to yield the IR-drop values at each of the block ports.

2.6 Post Layout Analysis

• The global and block-level grids have been completely designed.
• Current signatures are obtained for custom blocks and Random Logic Macros.
• RC extraction of the global grid is performed using a commercial extraction tool.

2.7 Methodology

This section describes the methodology used for IR-drop and EM simulations during the 8102 project. The first step in the verification flow is the parasitic extraction of the power and ground rails from the layout file. A resistance-only extraction was done using the Star-RC tool (vendor: Avanti Corporation). Owing to the large size of the chip-level power net, the extraction was run using the parallel extraction capability of Star-RC. In this mode, the extraction took about 14 hours.
In order to decrease the simulation run time of ElixIR, the 8102 unit was analyzed using the partitioned analysis feature of ElixIR. The network was automatically divided into 10 partitions using the partitioning utility accompanying ElixIR, and the ElixIR analysis was then carried out in parallel on 10 compute servers. The ElixIR analysis took about 2 hours to complete. At the end of the analysis, ElixIR generates graphical data to help visualize the voltage, current and current-density distributions in the chip.

2.8 ElixIR’s Graphical User Interface Tool

Using the navigational and query features of the GUI, the user investigates in detail to identify problems and their causes. During the initial analyses, fixes in layout are almost always needed. Fixing the layout and re-extracting the parasitics usually takes several days. This procedure of fixing the layout and rerunning the ElixIR simulation may take several iterations.

A new metal-density visualization feature was recently added to the GUI. The purpose of this display is to give the power grid designer a view of the density distribution of each metal layer. This helps to check whether there are any density violations at the local or unit level. Fig. 4 is an example of a metal density picture, showing the density distribution of metal 4 in the 8102 extended core.

2.9 Part 2 Summary

A case study of the power distribution design flow in the MSC8102 chip was presented. The flow is based on the power of an IR-drop simulation tool and a divide-and-conquer methodology. Power distribution was not left as one of the last chip-level assignments but was treated with high priority throughout the entire design process. The result was a robust and efficient power grid.
2.10 Bibliography

2.10.1 Part 1

• Various
• eedesign.com Online Magazine: “Low-power Design Techniques Span RTL-to-GDSII Flow”
• Cherkauer & Friedman SC-30 1995
• “Alpha-power law MOSFET Model” / Sakurai, Newton 1990

2.10.2 Part 2

• Y. Shoshany, H. Marom, S. Sundareswaran, M. Zhao, T. Edwards, R. Panda, ASP/Advanced Tools, Austin, TX, USA. “Using ElixIR for IR Drop and EM Simulation of an Advanced DSP Device”
• Rajendran Panda, David Blaauw, Rajat Chaudhry, Vladimir Zolotov, Brian Young, and Ravi Ramaraju, Motorola, Inc., Austin, TX, USA. “Model and analysis for combined package and on-chip power grid simulation”, International Symposium on Low Power Electronics and Design, 2000
• Abhijit Dharchoudhury, Rajendran Panda, David Blaauw, Ravi Vaidyanathan, Bogdan Tutuianu, David Bearden. “Design and analysis of power distribution networks in PowerPC microprocessor”, Annual ACM IEEE Design Automation Conference, 1998
