
Voltage Binning Under Process Variation

Vladimir Zolotov, IBM T.J. Watson Research Center, Yorktown Heights, NY, USA
Chandu Visweswariah, IBM System and Technology Group, Hopewell Junction, NY, USA
Jinjun Xiong, IBM T.J. Watson Research Center, Yorktown Heights, NY, USA

ABSTRACT

Process variation is recognized as a major source of parametric yield loss, which occurs because a fraction of manufactured chips do not satisfy timing or power constraints. On the other hand, both chip performance and chip leakage power depend on supply voltage. This dependence can be used for converting the fraction of too slow or too leaky chips into good ones by adjusting their supply voltage. This technique is called voltage binning [4]. All the manufactured chips are divided into groups (bins) and each group is assigned its individual supply voltage. This paper proposes a statistical technique of yield computation for different voltage binning schemes using results of statistical timing and variational power analysis. The paper formulates and solves the problem of computing optimal supply voltages for a given binning scheme.

Categories and Subject Descriptors: B.7.2 [Integrated Circuits]: Design Aids

General Terms: Algorithms, Design, Theory

Keywords: Voltage binning, parametric yield, leakage current

1. INTRODUCTION

Process variation is now recognized as a major source of parametric yield loss. In the future, due to the scaling down of CMOS transistor size, the situation is expected to get worse. Process variation causes high variability in gate delays and leakage current, which leads to high variability of chip operational frequency and power consumption. Due to die-to-die and within-die variability, only some of the manufactured chips satisfy both performance and power requirements. The other chips are either too slow or consume too much power. They represent the parametric yield loss.
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. ICCAD '09 San Jose, California USA. Copyright 2009 ACM 978-1-60558-800-1/09/11 ...$10.00.

A number of approaches were proposed to address this problem. Conservative design sets chip timing and power to a stricter target than required. The main disadvantage of this approach is larger chip area and higher design cost. Speed binning reduces parametric yield loss by accepting low performance chips and selling them at a discount price [3]. However, ASIC chips often have strict requirements for their frequency and power. The chips not satisfying them have no value at all. Additionally, selling chips at a discount price reduces the profit. Body biasing [5], [2] either reduces transistor leakage or improves gate delays. This technique requires connecting transistor bodies to the biasing voltage source. This wiring negatively affects chip routability. It increases design time and cost and is impractical for SOI technology.

In this paper we analyze voltage binning [4] as a technique to reduce parametric yield loss. This technique exploits the fact that both chip performance and power consumption depend on supply voltage. Higher supply voltage improves chip performance but increases both leakage and switching power [10]. On the other hand, slow chips often have low leakage and chips with high leakage have higher performance. By adjusting supply voltage it is possible to make initially failing chips satisfy application constraints. Because all the accepted chips satisfy these constraints, they have the same value and can be sold at the same price.
Supply voltage adjustment does not require either additional circuitry or additional wiring on the chip. There are different schemes of supply voltage adjustment. For example, it is possible to assign an individual supply voltage to each manufactured chip [10], [2]. This is an attractive methodology, but it requires significant effort for chip testing at different supply voltages. Voltage binning [4] is an alternative technique. It divides all manufactured chips into several bins and assigns a supply voltage to each bin. This technique is more practical, at the cost of a small yield reduction.

The patent [4] proposes the general idea of voltage binning. However, it does not analyze how to select optimal supply voltages or how much improvement can be obtained by different voltage binning schemes. We develop a technique for computing yield for a given voltage binning scheme, including the case of individually adjustable supply voltage. Our approach is based on results and ideas of statistical timing analysis [11] and statistical models of chip leakage power [8].

We represent chip yield by an integral over the process variation space. The Monte-Carlo technique is the most obvious method for computing such integrals. However, it is rather slow and requires a large number of samples to ensure accuracy. In addition, Monte-Carlo computation has significant random numerical noise, and even small random noise severely interferes with an optimization procedure. Therefore, we developed a combination of analytical and numerical techniques for accurate and efficient calculation of chip yield. By applying the yield computation technique, we solve the problem of finding an optimal voltage binning scheme, i.e., computing optimal voltage levels.

We use a functional representation of chip performance and power consumption. Statistical static timing analysis (SSTA) gives chip performance as a linear form of process variables [11]. Chip power analysis is not a primary goal of this paper.
Therefore, here we do not delve into details such as the computation of switching activity or the probability of different leakage states. However, in order to make our parametric yield computation sufficiently realistic, we have to use an adequate model of chip power consumption. This model takes into account both switching and leakage power. Switching power is expressed as a function of supply voltage and of gate load and wire capacitances.

Modeling of chip leakage power has attracted the attention of many authors. For our purpose, a convenient representation of chip leakage is given by [7]. However, our experiments showed that we need a more accurate model. We made two modifications of the leakage computation technique. First, we represented leakage current with the exponent of a quadratic form of transistor channel length (Leff) and supply voltage (Vdd). This function approximates chip leakage with less than 1% error. Second, instead of the probability density function (PDF) of chip leakage, we directly use its functional expression.

In order to verify the proposed technique we applied it to the analysis of large industrial chips. In our experiments we computed total chip leakage in a functional form using SPICE simulation of individual gates and the chip statistics of gate types and their states. Similarly, we computed chip switching power using statistics of gate switching frequency and load capacitances. Chip statistical timing slack was computed by an in-house statistical timing analyzer. These results were used for computing yield and optimal voltage binning schemes.

The rest of the paper is organized as follows. Section II introduces the timing and power models. Section III gives definitions of voltage binning schemes and formulates problems of computation and optimization of yield. Section IV presents our technique of computing and optimizing yield of various voltage binning schemes. Section V describes our computational experiments. Section VI draws conclusions.

2.
BACKGROUND

For computing and optimizing yield corresponding to different voltage binning schemes we need proper performance and power statistical models, parameterized by supply voltage. One such model is a joint distribution of chip operational frequency and power consumption. Fig. 1 shows a contour plot of the joint probability density function (JPDF) of chip frequency and power for some Vdd value. The area of this plot is divided into 4 regions by the lines $F = F_{req}$ and $P = P_{lim}$, indicating chip performance and power requirements, respectively. The region marked Good Chips represents the chips satisfying the application requirements. The integral of the JPDF over this region gives manufacturing yield. The region marked Leaky & Slow represents the chips that are too slow and have too high leakage. These chips cannot be fixed by adjusting supply voltage. The region marked Too Leaky represents the chips that consume too much power due to leakage but have sufficiently high operational frequency. Some of these chips can be fixed by lowering supply voltage. The region marked Too Slow represents chips that do not satisfy performance requirements but consume less power than allowed. Some of these chips can be fixed by increasing supply voltage.

[Figure 1: Joint PDF of chip operational frequency and power consumption. Axes: frequency F (requirement $F_{req}$) and power P (limit $P_{lim}$). Regions: Good Chips, Too Slow, Too Leaky, Leaky & Slow.]

While the power-performance space is helpful for visualizing chip distribution, it is not convenient for actual yield computation. First, due to the highly non-linear dependence of chip leakage on Leff, it is very difficult to represent the JPDF of power and performance numerically or analytically even for fixed Vdd. Second, there are no convenient mathematical tools to operate with a JPDF parameterized by Vdd. Because of that, we perform our analysis in the space of sources of variation and Vdd. For simplicity, we assume that all variational parameters have normal (Gaussian) distributions.

2.1 Timing Model

Traditionally chip performance is expressed by its clock frequency. However, the results of timing analysis are expressed in terms of timing slack. Minimization of clock period is equivalent to maximization of timing slack. Therefore, for convenience we use timing slack instead of clock frequency. Statistical timing computes timing slack S in linear form [11]

$$S = S_0 + \sum_{i=1}^{n} a_i \Delta X_i + a_R \Delta R \qquad (1)$$

where $S_0$ is the mean value of chip slack, $\Delta X_i$ is the chip-to-chip variation of parameter $X_i$, $\Delta R$ is the uncorrelated variation summing the effects of all intra-die variations, and $a_i$ and $a_R$ are the sensitivities to these variations. The spatial variation is computed as described in [1] and included in the uncorrelated variation because it does not affect chip-to-chip variation.

We give special consideration to chip-to-chip variations of supply voltage $\Delta V$ and transistor channel length $\Delta L$. We combine all other sources of variation into a single parameter $\Delta X$, using the fact that a linear combination of Gaussian random variables is also a Gaussian random variable. Threshold voltage variation due to random dopant fluctuation is included in the uncorrelated variation modeling all the intra-die variations. Then the statistical timing slack is expressed as follows

$$S = S_0 + a_V \Delta V + a_L \Delta L + a_X \Delta X \qquad (2)$$

where $a_V$, $a_L$ and $a_X$ are the sensitivities of timing slack to Vdd, Leff and the combined variational parameter $\Delta X$. The sensitivity $a_V$ with respect to Vdd can be computed either by pure statistical timing, considering Vdd a statistical variable, or from two statistical timing procedures performed at low and high Vdd values.

2.2 Power model

Total chip power is expressed as $P_{Tot} = P_s + P_l$, where $P_s$ is switching or dynamic power and $P_l$ is leakage power due to sub-threshold and gate leakage.
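Before the individual power terms are derived, the slack model (2) can be made concrete with a small sketch. It assumes, beyond what the text states, that the combined sources $\Delta X_i$ and $\Delta R$ are independent and normalized to unit variance, so that their weighted sum collapses into one unit Gaussian $\Delta X$ with sensitivity $a_X = \sqrt{\sum_i a_i^2 + a_R^2}$. The function names and the numeric sensitivities are hypothetical.

```python
import math


def combine_sensitivities(sensitivities):
    """Collapse sum(a_i * dX_i) over independent unit Gaussians (assumption)
    into a single term a_X * dX, where dX is again a unit Gaussian."""
    return math.sqrt(sum(a * a for a in sensitivities))


def slack(s0, a_v, a_l, a_x, d_v, d_l, d_x):
    """Timing slack in the linear form of equation (2)."""
    return s0 + a_v * d_v + a_l * d_l + a_x * d_x


# Hypothetical sensitivities of the sources folded into dX (including a_R).
a_x = combine_sensitivities([3.0, 4.0])

# Slack of one hypothetical chip at a given Vdd shift and process corner.
s = slack(s0=1.0, a_v=2.0, a_l=-1.0, a_x=a_x, d_v=0.5, d_l=1.0, d_x=0.0)
```

If the combined sources were correlated, the root-sum-square rule above would have to be replaced by the square root of the full quadratic form with their covariance matrix.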
Switching chip power is expressed as

$$P_s = \sum_{i=1}^{n} s_i F C_i \frac{V^2}{2} = \alpha \frac{V^2}{2} \qquad (3)$$

where $s_i$ and $C_i$ are the switching activity and the load capacitance of the $i$th gate, $F$ is the chip frequency, $V$ is the supply voltage, and $\alpha$ is the coefficient obtained by summing all terms together. From (3) we get a variational form of chip switching power

$$P_s = P_{s,0} + c_v \Delta V + c_{vv} \Delta V^2 \qquad (4)$$

where $P_{s,0}$ is the switching power at nominal Vdd, $\Delta V$ is the variation of Vdd, and $c_v$ and $c_{vv}$ are constant coefficients.

The chip leakage current is expressed as

$$I_{chip} = \sum_{i=1}^{n} \sum_{j=1}^{k_i} p_{i,j} I_{i,j} \qquad (5)$$

where $I_{i,j}$ is the leakage of the $i$th gate in the $j$th leakage state, $p_{i,j}$ is the probability of that state occurring and $k_i$ is the number of leakage states of the $i$th gate.

The paper [9] proposes approximating the leakage of each gate with a log-normal distribution. The total chip leakage is computed iteratively by summing many log-normal distributions. The resulting sum is again approximated by a log-normal distribution using a modified Wilkinson's method. This approach has two sources of error. First, the use of log-normal distributions corresponds to approximating leakage current with the exponent of a linear function. According to our experiments this is not accurate, especially when leakage is considered as a function of Vdd and Leff. This fact is confirmed by [8], which shows that it is necessary to use the exponent of a quadratic function of Leff. The second source of error is the approximation of the sum of several log-normal random variables by another log-normal variable. In our case the situation is even more complex because we consider Vdd as a parameter.

Therefore, we developed an alternative technique. Using SPICE simulations, we characterize the leakage current of each type of gate in each possible leakage state. Fig. 2 shows an example of the characterization circuit.

[Figure 2: Circuit For Leakage Current Characterization]

The leakage current is measured as the current flowing through voltage source E.
Thus we measure both the sub-threshold and transistor gate leakage currents together. The leakage current is measured for all required values of Vdd and Leff. The characterization results are represented by a two-dimensional table. Then we numerically sum the leakage tables of all gate types and all their leakage states, weighted by the corresponding coefficients $p_{i,j}$ from (5). This operation does not incur any additional approximation error. The resulting table is fitted to the exponent of a quadratic form of Vdd and Leff

$$I_{chip} = I_0 \, e^{a_{ll} L^2 + a_{vv} V^2 + a_{lv} L V + a_l L + a_V V} \qquad (6)$$

where $L$ and $V$ are the values of the transistor channel length and the chip supply voltage, respectively, and $I_0$, $a_{ll}$, $a_{vv}$, $a_{lv}$, $a_l$, $a_V$ are the fitting coefficients. Assuming variational representations of Vdd and Leff in the form $V = V_0 + \Delta V$, $L = L_0 + \Delta L$, we derive the variational form of leakage power

$$P_l = V \cdot I_l(V, L) = (V_0 + \Delta V) \cdot I_l(V_0 + \Delta V, L_0 + \Delta L) = I_{nom} \cdot (V_0 + \Delta V) \cdot e^{b_{ll} \Delta L^2 + b_{vv} \Delta V^2 + b_{lv} \Delta L \Delta V + b_l \Delta L + b_V \Delta V} \qquad (7)$$

where $I_{nom}$ is the leakage current at the nominal Vdd and Leff

$$I_{nom} = I_0 \, e^{a_{ll} L_0^2 + a_{vv} V_0^2 + a_{lv} L_0 V_0 + a_l L_0 + a_V V_0} \qquad (8)$$

and $b_{ll}$, $b_{vv}$, $b_{lv}$, $b_l$ and $b_V$ are coefficients computed from the coefficients $a_{ll}$, $a_{vv}$, $a_{lv}$, $a_l$, $a_V$. Expanding the quadratic form of (6) around $(V_0, L_0)$ gives $b_{ll} = a_{ll}$, $b_{vv} = a_{vv}$, $b_{lv} = a_{lv}$, $b_l = 2 a_{ll} L_0 + a_{lv} V_0 + a_l$ and $b_V = 2 a_{vv} V_0 + a_{lv} L_0 + a_V$. We preserve quadratic terms of variations because they are too large to be neglected.

The variational form of the total chip power can be obtained by summing (4) and (7). We use this formula for our computational experiments. However, all the formulations and derivations in this paper are general enough for chip power expressed by an arbitrary function of Vdd and Leff.

We do not consider intra-die variation of leakage current. The total chip leakage is the sum of the leakage currents of a large number $N_G$ of gates. The standard deviation of the chip-to-chip leakage variation is proportional to the number of gates $N_G$, while the standard deviation of the intra-die leakage variation is proportional to $\sqrt{N_G}$.
Therefore, the contribution of intra-die variation to the correlation between chip slack and leakage is small compared with the contribution of chip-to-chip variation. Spatial variation does not have a high impact on this correlation either, because the number of spatially independent regions on a chip is also large (at least hundreds). The main impact of intra-die variability is a small change of the mean value. This adjustment can be computed similarly to [8].

3. PROBLEM FORMULATION

For voltage binning there are two main problems, for which we present solutions here. The first one is yield computation for a given binning scheme. The second one is computation of the optimal binning scheme given a required number of voltage bins. In order to solve these problems, we give a definition of a voltage binning scheme and analyze its properties.

Definition 1. A voltage binning scheme is a set of supply voltage levels $\hat{V} = \{V_1, V_2, \ldots, V_n\}$, a set of bins $\hat{U} = \{U_1, U_2, \ldots, U_n\}$ corresponding to these voltages and a binning algorithm A, which distributes manufactured chips among the bins. The binning algorithm A assigns chips to bins so that any chip assigned to bin $U_i$ meets both the timing and power constraints at the supply voltage $V_i$ corresponding to this bin. The chips not assigned to any bin constitute yield loss.

From the definition of a voltage binning scheme, we see that yield computation depends on two separate factors: the bin voltage levels and the binning algorithm. For the same set of bin voltage levels, there are many different binning algorithms. These algorithms may produce different yield. It makes sense to consider only the binning algorithms producing the maximum possible yield. Fortunately, the criterion for optimality of a binning algorithm is simple. An optimal binning algorithm must not put good chips in yield loss.
In other words, any chip for which there exists at least one bin where the chip satisfies timing and power constraints should be assigned to some bin. Obviously, any chip assignment to voltage bins satisfying this criterion cannot be improved, because no chip from the yield loss bucket can be assigned to a voltage bin without violating either the power or timing constraints. Our yield computation and optimization is based on the following binning algorithm:

Algorithm 1.
1. Take a manufactured chip and find the highest bin voltage $V_i$ at which the chip satisfies power constraints.
2. Test chip performance for Vdd = $V_i$.
3. If the chip meets the timing constraints, assign it to the bin $U_i$ with Vdd = $V_i$. Otherwise, put this chip in the scrap bucket.

Usually chip power and performance are increasing functions of Vdd. Therefore, this algorithm produces an optimal chip assignment to voltage bins. The simplicity of this binning algorithm makes yield computation convenient. It does not mean that the actual binning should be done by exactly this algorithm. However, any optimal binning produces the same yield. Therefore, our computation is valid for any other optimal binning.

The problem of computing an optimal voltage binning scheme $\hat{V} = \{V_1, V_2, \ldots, V_n\}$ is formulated as follows:

$$\max_{V_1, V_2, \ldots, V_n} Y \quad \text{s.t.} \quad \begin{cases} V_1 > V_{min} \\ V_2 > V_1 \\ \quad\vdots \\ V_n > V_{n-1} \\ V_n < V_{max} \end{cases} \qquad (9)$$

where $Y$ is the total yield corresponding to the bin voltage levels $\{V_1, V_2, \ldots, V_n\}$. The optimization constraints impose an ascending order of voltage levels. For the formulation of the optimization problem this order is not obligatory. However, without those constraints, the yield is a multi-modal function of voltage levels, because any permutation transforms the vector of optimal voltages into another optimal solution. By imposing an ascending order of voltage levels, we remove the duplicates of optimal solutions, which simplifies the optimization algorithm.

There are two special types of voltage binning.
The first one is a scheme with a single bin. Obviously, it is just the usual case with a single operational voltage. The other one is a scheme with an infinite number of voltage bins covering all possible voltage levels. This binning scheme describes the situation when supply voltage is individually tailored for each chip to meet timing and power constraints.

Optimal binning schemes have the following properties.

Property 1. If optimal voltage binning scheme $S_1$ has $n$ voltage levels, optimal binning scheme $S_2$ has $m$ voltage levels and $n < m$, then binning scheme $S_1$ gives lower or equal yield than binning scheme $S_2$.

Property 2. Yield for chips with individually adjustable Vdd is an upper bound of yield for any other voltage binning scheme.

Property 3. Yield for a single optimal value of supply voltage (optimal single voltage bin) is a lower bound for any optimal voltage binning scheme.

We use these properties for verifying our computation of optimal binning schemes.

4. YIELD COMPUTATION

4.1 Yield for single voltage bin

In order to be good, a chip should satisfy the requirements that its timing slack is positive, $S > 0$, and its power does not exceed the required limit, $P < P_{lim}$. For a single Vdd, the parametric chip yield is the percentage of manufactured chips satisfying these constraints. We compute yield for a given voltage level $\Delta V$ by direct integration in the space of process parameters

$$Y = \iint_{\substack{S > 0 \\ P < P_{lim}}} p(\Delta L, \Delta X) \, d\Delta L \, d\Delta X \qquad (10)$$

where $p(\Delta L, \Delta X)$ is the JPDF of variations of Leff and our combined parameter $X$, and $S > 0$ and $P < P_{lim}$ are the timing and power constraints defining the integration area in the process variation space. In order to simplify the presentation we assume that $\Delta L$ and $\Delta X$ are normalized Gaussians.

The simplest way to compute the integral (10) is a Monte-Carlo technique. However, Monte-Carlo integration is prone to numerical noise. For yield estimation this noise is acceptable.
However, optimization is guided by the variation of the objective function between optimization steps. Even small random noise confuses the optimization by making the objective function locally non-convex. In fact, a systematic deterministic inaccuracy is less harmful than this random noise. The computation of the objective function multiple times during optimization exacerbates the situation by increasing the probability of large errors. Since there is no closed formula for this integral, we developed a combination of numerical and analytical techniques.

Fig. 3 shows the variational space for channel length $\Delta L$ and combined parameter $\Delta X$. The vertical line $\Delta L = \Delta L_{min}$ corresponds to the power constraint $P < P_{lim}$. The value $\Delta L_{min}$ corresponds to the minimum Leff at which the power constraint is not violated. $\Delta L_{min}$ is calculated by numerically solving the equation $P(\Delta L_{min}) = P_{lim}$, where $P(\Delta L_{min})$ is total chip power considered as a function of Leff variation. The oblique line $\Delta X = (-S_0 - a_V \Delta V - a_L \Delta L)/a_X$ expresses the timing constraint $S > 0$. It is derived from the equation for timing slack (2). We do not set an upper bound on Leff variation because the Gaussian distribution goes to 0 very fast and the impact of high values of Leff is negligibly small. These two lines limit the area where chips satisfy timing and power constraints. The integral (10) over this area can be rewritten as an iterated integral

$$Y = \frac{1}{2\pi} \int_{\Delta L_{min}}^{\infty} \int_{\frac{-S_0 - a_V \Delta V - a_L \Delta L}{a_X}}^{\infty} e^{-\frac{\Delta L^2 + \Delta X^2}{2}} \, d\Delta X \, d\Delta L \qquad (11)$$

[Figure 3: Variational space for single Vdd yield. Axes: $\Delta L$ and $\Delta X$. The yield region lies to the right of the vertical line $\Delta L = \Delta L_{min}$ and above the oblique line $\Delta X = (-S_0 - a_V \Delta V - a_L \Delta L)/a_X$.]

By expressing the internal integral through the normalized Gaussian PDF $\phi$ and cumulative distribution function (CDF) $\Phi$ we simplify the formula for yield

$$Y = \int_{\Delta L_{min}}^{\infty} \phi(\Delta L) \, \Phi\!\left(\frac{S_0 + a_V \Delta V + a_L \Delta L}{a_X}\right) d\Delta L \qquad (12)$$

This integral can be efficiently computed numerically. It takes less than one second. This computational technique can be easily extended to the case of more variables affecting both chip power and timing. Then we will have to numerically compute a 2- or 3-dimensional integral, which is not too difficult.

4.2 Yield for multiple voltage bins

Assume we have a voltage binning scheme with $n$ voltage levels $V_1, V_2, \ldots, V_n$. Then a chip is good if it satisfies timing and power requirements at least at one supply voltage $V_i$. Mathematically, yield can be expressed as follows:

$$Y = \iint_{\bigcup_{i=1}^{n} \{S(V_i) > 0,\; P(V_i) < P_{lim}\}} p(\Delta L, \Delta X) \, d\Delta L \, d\Delta X \qquad (13)$$

where integration is performed over the union of all the areas defined by the timing and power constraints corresponding to each supply voltage of the binning scheme. We extended the technique of subsection 4.1 for computing this integral.

Fig. 4 shows the variational space for channel length variations $\Delta L$ and combined parameter variations $\Delta X$. The gray area constitutes chip yield. It is bounded by 3 vertical lines $\Delta L = \Delta L_{min,i}$ and 3 oblique lines $\Delta X = (-S_0 - a_V \Delta V_i - a_L \Delta L)/a_X$. Each vertical line $\Delta L = \Delta L_{min,i}$ corresponds to the power constraint $P(\Delta L_{min,i}) < P_{lim,i}$. Each oblique line $\Delta X = (-S_0 - a_V \Delta V_i - a_L \Delta L)/a_X$ is the boundary $S(\Delta V_i, \Delta L, \Delta X) = 0$ of the timing constraint. The formula for the oblique lines is derived from the expression for timing slack (2). The whole yield area in Fig. 4 can be divided into three subregions marked $Y_1$, $Y_2$ and $Y_3$. Two of them are bounded by two vertical lines and by one oblique line. The third area is bounded by one vertical line and one oblique line. The chip yield can be represented as $Y = Y_1 + Y_2 + Y_3$.

[Figure 4: Variational space for multiple Vdd bins yield. Axes: $\Delta L$ and $\Delta X$. Vertical lines at $\Delta L_{min,1}$, $\Delta L_{min,2}$, $\Delta L_{min,3}$ and the corresponding oblique timing lines delimit the subregions $Y_1$, $Y_2$, $Y_3$.]

For the general case, we represent the total yield for a binning scheme with $n$ voltages $V_1, V_2, \ldots, V_n$ as the sum of the yields corresponding to $n$ subregions.

$$Y = \sum_{i=1}^{n} Y_i \qquad (14)$$

The term $Y_n$ corresponds to the region bounded by one vertical line and one oblique line. It can be efficiently computed by (12). The other terms $Y_i$ with $i < n$ correspond to the regions bounded by two vertical lines $\Delta L = \Delta L_{min,i}$, $\Delta L = \Delta L_{min,i+1}$ and one oblique line $\Delta X = (-S_0 - a_V \Delta V_i - a_L \Delta L)/a_X$. These terms are computed as

$$Y_i = \int_{\Delta L_{min,i}}^{\Delta L_{min,i+1}} \phi(\Delta L) \, \Phi\!\left(\frac{S_0 + a_V \Delta V_i + a_L \Delta L}{a_X}\right) d\Delta L \qquad (15)$$

4.3 Yield optimization

Having derived efficient and accurate formulas and numerical procedures for yield computation, we now proceed to solve the problem of computing optimal voltage levels, i.e., the problem of computing the optimal binning scheme (9). Our formulas can be used for computing yield derivatives with respect to Vdd levels. However, we found that even derivative-free optimization by the Nelder-Mead simplex method [6] gives good results.

4.4 Yield for adjustable supply voltage

Now we consider the case when each chip can be assigned its own individual supply voltage to meet timing and power constraints. Obviously it is the same as having an infinite number of voltage bins. Therefore, the yield for this case can be expressed as the limit of optimal binning as the number of bins approaches infinity. However, its computation as a limit is not convenient and we need a more efficient technique.

Just as timing and power constraints are formulated in terms of the deviation of Vdd from its nominal value $V_0$, we express the yield in terms of this deviation $\Delta V$ too. The yield $Y$ for chips with individually adjustable supply voltage in the interval of voltage variations $[\Delta V_{min}, \Delta V_{max}]$ is expressed as the probability that there exists a supply voltage in this interval such that both the timing and power constraints are satisfied:

$$Y = \mathrm{Prob}\left(\exists \Delta V \mid S(\Delta V, \Delta L, \Delta X) > 0,\; P(\Delta V, \Delta L) < P_{lim},\; \Delta V_{min} < \Delta V < \Delta V_{max}\right) \qquad (16)$$

This formula correctly defines the yield of chips with individually adjustable Vdd, but it is not helpful for actual calculation. Therefore, we transform it to an integral that is more suitable for computation.

$$Y = \iint_{\substack{S(\Delta V, \Delta L, \Delta X) > 0 \\ P(\Delta V, \Delta L) < P_{lim} \\ \Delta V_{min} < \Delta V < \Delta V_{max}}} p(\Delta L, \Delta X) \, d\Delta L \, d\Delta X \qquad (17)$$

The integration in this formula is performed across the area where the system of inequalities has at least one solution. This integral can be computed by the following Monte-Carlo algorithm:

Algorithm 2.
1. Assume a certain number of Monte-Carlo samples $N$.
2. Set $Y = 0$.
3. For $i \in [1, N]$ do the following:
   • Generate random samples of $\Delta L$ and $\Delta X$ according to their distributions.
   • If the system of inequalities

$$\begin{cases} S(\Delta V, \Delta L, \Delta X) > 0 \\ P(\Delta V, \Delta L) < P_{lim} \\ \Delta V_{min} < \Delta V < \Delta V_{max} \end{cases} \qquad (18)$$

     is compatible, set $Y = Y + 1$.
4. Compute yield $Y = Y/N$.

The only unusual thing here is checking feasibility of the system of inequalities. It is not difficult because both timing slack and power have analytical expressions. Substituting the sampled values of $\Delta L$ and $\Delta X$ into these expressions, we obtain a system of inequalities with only one variable $\Delta V$. Then checking its feasibility is straightforward, especially taking into account that here both timing slack and power are monotonic functions of supply voltage. The Monte-Carlo technique is suitable for computing this integral because we do not use it in an optimization procedure. Therefore, the random numerical noise does not create any additional problems.

[Figure 5: Leakage as a function of Vdd and Leff. Panels: "Leakage as function of Vdd and Leff" (3D plot), "Leakage as function of Leff", "Leakage as function of Vdd". Axes: Vdd (V), Leff (m), Leakage (A).]

5. EXPERIMENTAL RESULTS

We verified the proposed technique by analyzing 3 large industrial chips with strict requirements on power consumption.
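As a small-scale illustration of the kind of check reported in this section, the single-bin yield integral (12) can be compared with a direct Monte-Carlo estimate. The sketch below is not the authors' implementation: it assumes $a_X > 0$, takes $\Delta L_{min}$ as a given input instead of solving $P(\Delta L_{min}) = P_{lim}$, truncates the infinite upper limit at 8 standard deviations, and uses a composite Simpson rule as one possible quadrature. All function names and numeric values are hypothetical.

```python
import math
import random


def phi(x):
    """Standard normal PDF."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def Phi(x):
    """Standard normal CDF, expressed through the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def yield_single_bin(s0, a_v, a_l, a_x, d_v, dl_min, upper=8.0, n=2000):
    """Equation (12) by composite Simpson integration (n must be even)."""
    if dl_min >= upper:
        return 0.0
    h = (upper - dl_min) / n

    def f(dl):
        return phi(dl) * Phi((s0 + a_v * d_v + a_l * dl) / a_x)

    total = f(dl_min) + f(upper)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * f(dl_min + k * h)
    return total * h / 3.0


def yield_monte_carlo(s0, a_v, a_l, a_x, d_v, dl_min, samples=100000, seed=1):
    """Fraction of sampled chips with dL > dL_min and positive slack (2)."""
    rng = random.Random(seed)
    good = 0
    for _ in range(samples):
        dl = rng.gauss(0.0, 1.0)
        dx = rng.gauss(0.0, 1.0)
        if dl > dl_min and s0 + a_v * d_v + a_l * dl + a_x * dx > 0.0:
            good += 1
    return good / samples


# Hypothetical normalized parameters for one Vdd setting.
params = dict(s0=0.5, a_v=2.0, a_l=-0.6, a_x=0.8, d_v=0.1, dl_min=-1.0)
y_quad = yield_single_bin(**params)
y_mc = yield_monte_carlo(**params)
```

The quadrature result is deterministic, which is the property the text relies on for optimization, while the Monte-Carlo estimate changes with the seed and the number of samples.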
We characterized the leakage of library cells as a function of Vdd and Leff using SPICE simulations. Vdd was varied between 1 volt and 1.4 volts. Leff was varied from 80 to 100 nanometers. We used 41 points for Vdd and 21 points for Leff, which makes a total of 861 points for all Vdd and Leff combinations. The total leakage of each chip was calculated by summing the leakage of all cells. The upper 3D plot in Fig. 5 shows chip leakage current as a function of Vdd and Leff. The two plots at the bottom of this figure show the variation of the leakage current when either Leff or Vdd is fixed. From these plots we see a strong dependence of leakage on both Vdd and Leff.

The tabulated function of chip leakage was fit to the exponent of the quadratic form (6) using Matlab. For efficiency we fit the logarithm of leakage current to the quadratic form of Vdd and Leff. This approach made it possible to perform the fitting in less than 1 minute. Fig. 6 shows the error of the leakage approximation as a function of Vdd and Leff. The maximum error is only 0.91%. For the other two chips the error was within 0.84% and 0.78%, which confirms the accuracy and robustness of the proposed approximation technique.

[Figure 6: Error of leakage approximation (chip3). Axes: Vdd (V), Leff (m), Error (%).]

The total chip power was computed by summing its leakage and switching power. The switching power was estimated from chip switching activity and load capacitances of logic gates. The proposed technique also requires chip slack in the linear canonical form (2). The chip slack was computed by statistical timing analysis. We used 15 sources of variation modeling the variability of transistors and interconnects. For yield analysis all the components of the chip slack except the variations due to Vdd and Leff were combined statistically into a single term.

For validating the analytical formulas of yield computation we compared our technique (12) with a Monte-Carlo calculation of chip yield for different Vdd values. The Monte-Carlo calculation used $10^5$ samples to guarantee sufficient accuracy. The maximum difference between Monte-Carlo and the proposed technique was only 0.44%. We obtained similar results for the case of two voltage bins. Fig. 7 shows a plot of chip yield as a function of Vdd. We see that the yield strongly depends on Vdd and achieves its maximum at some Vdd value.

[Figure 7: Yield as a function of Vdd. Axes: Vdd (V), Yield.]

The next step in our experiments was computation of the optimal Vdd levels for binning schemes with a varying number of bins. Table 1 shows the results of this experiment for three industrial chips. The first column gives the name of the chip. The 2nd column shows chip yield when Vdd is adjusted individually for each manufactured chip. We see that in this case all three chips have high yield. Columns 3 through 7 give yield for optimal binning with 1, 2, 3, 5 and 10 bins. We see that yield always increases when we increase the number of bins but it never exceeds the yield corresponding to individually adjustable Vdd. We analyzed binning schemes with as many as 10 bins in order to verify that the yield for a large number of bins gets close to the yield corresponding to individually adjustable Vdd. This table confirms all our theoretical predictions. We also see that voltage binning can significantly improve yield. For example, for chip3, 3 bins give 81.7% yield compared with only 44% for a single optimal Vdd value. The computation time was insignificant (about 1 minute) even for the case of 10 bins.

Table 1: Yield (%) for different voltage binning schemes

Design   Adj Vdd   1 bin   2 bins   3 bins   5 bins   10 bins
chip1    97.8      63.2    82.9     89.8     94.2     96.5
chip2    97.9      66.9    85.0     90.0     94.8     96.8
chip3    99.4      44.0    68.9     81.7     92.2     97.9

Fig. 8 shows two histograms of chip yield for voltage binning with the number of bins varying from 1 to 10. The red bars (left bars in the groups) show yield for optimal binning. The blue bars (right bars in the groups) show yield for uniform binning, i.e., binning with Vdd values uniformly distributed between the minimum and maximum Vdd values. As expected, optimal binning always provides higher yield. The difference between optimal and uniform voltage binning reduces when the number of bins gets large. We also see that the uniform binning yield decreases when the number of voltage bins is increased from 1 to 2. This happens because the single optimal Vdd is close to the middle of the Vdd interval but the two uniformly distributed Vdd values are far from the optimal values.

[Figure 8: Yield for different binning schemes. Axes: Number of bins (1 to 10), Yield.]

Fig. 9 shows the bin distribution for optimal voltage binning schemes. One horizontal axis gives the number of bins in the binning schemes. So, we see here 5 binning schemes with 1, 2, 3, 4 and 5 bins. The other horizontal axis gives the Vdd values of the voltage bins. The stems growing from the horizontal plane correspond to the bins of the binning schemes.
The coordinates of a stem root indicate the number of bins in the corresponding binning scheme and the Vdd of the corresponding bin. The height of each stem is the amount of yield obtained from the corresponding bin. Clearly, the optimization improves yield by distributing Vdd values nonuniformly among the bins.

Figure 9: Size of Voltage Bins (yield versus number of bins and Vdd (V))

Voltage binning provides an opportunity to improve yield for very strict power constraints. We analyzed this opportunity by computing optimal binning for different power requirements. Table 2 shows parametric yield for the same chip for different optimal binning schemes and power requirements. Column 1 shows power constraints as a percentage of the initial target value. We varied power constraints from the initial target value down to 75% of it. Column 2 shows yield when each manufactured chip is assigned an individually adjusted optimal Vdd. Columns 3 through 7 show yield for binning schemes with different numbers of bins. From this table we see that voltage binning can significantly improve yield even for very strict power requirements. For instance, when the power constraint is 25% stricter than nominal, a single Vdd gives only 14.3% yield, but just 3 bins raise it to 27.1%, and 10 bins raise it to 37.1%.

Table 2: Yield (%) dependence on power requirements

Power constr   Adj Vdd   1 bin   2 bins   3 bins   5 bins   10 bins
100%           97.8      63.2    82.9     89.8     94.2     96.5
95%            94.9      54.2    74.1     82.1     89.5     91.8
90%            87.4      44.1    62.7     71.2     78.4     83.5
85%            76.5      33.4    49.3     57.3     64.8     70.7
80%            61.1      23.8    35.2     41.8     48.6     54.4
75%            44.1      14.3    22.3     27.1     32.2     37.1

6. CONCLUSIONS

We have presented a technique for yield computation and optimization for different voltage binning schemes. Our approach uses the results of statistical timing analysis and chip leakage expressed in functional form. For our experiments we approximated chip leakage as an exponential of quadratic form of Vdd and Leff.
We demonstrated that this approximation has less than 1% error. However, the proposed technique of analyzing and optimizing voltage binning can be adapted to other representations of chip leakage and switching power. Our experiments showed that optimal voltage binning is able to more than double parametric yield for strict power requirements. The proposed algorithms demonstrated high computational efficiency and require only a few minutes of CPU time.

7. REFERENCES

[1] R. Chen, L. Zhang, V. Zolotov, C. Visweswariah, and J. Xiong. Static timing: back to our roots. ASP-DAC, pages 310–315, 2008.
[2] T. Chen and S. Naffziger. Comparison of adaptive body bias and adaptive supply voltage for improving delay and leakage under the presence of process variation. IEEE Trans. on VLSI, Vol. 11, No. 5, pages 888–899, October 2003.
[3] A. Datta, S. Bhunia, J. H. Choi, S. Mukhopadhyay, and K. Roy. Profit aware circuit design under process variations considering speed binning. IEEE Trans. on VLSI, Vol. 16, No. 7, pages 806–815, July 2008.
[4] M. W. Kuemerle, S. K. Lichtensteiger, D. W. Douglas, and I. L. Wemple. Integrated circuit design closure method for selective voltage binning. U.S. Patent 7,475,366, January 2009.
[5] C. Neau and K. Roy. Optimal body bias selection for leakage improvement and process compensation over different technology generations. ISLPED, pages 116–121, August 2003.
[6] W. H. Press, S. A. Teukolsky, W. V. Vetterling, and B. P. Flannery. Numerical Recipes in C. Cambridge University Press, 1999.
[7] R. R. Rao, D. Blaauw, D. Sylvester, and A. Devgan. Modeling and analysis of parametric yield under power and performance constraints. IEEE Design & Test of Computers, pages 376–385, July 2005.
[8] R. R. Rao, A. Devgan, D. Blaauw, and D. Sylvester. Analytical yield prediction considering leakage/performance correlation. IEEE Transactions on CAD, pages 1685–1695, September 2006.
[9] A. Srivastava, K. Chopra, S. Shah, D. Sylvester, and D. Blaauw. A novel approach to perform gate-level yield analysis and optimization considering correlated variations in power and performance. IEEE Trans. on CAD, Vol. 27, No. 2, pages 272–285, February 2008.
[10] J. W. Tschanz, S. Narendra, R. Nair, and V. De. Effectiveness of adaptive supply voltage and body bias for reducing impact of parameter variations in low power and high performance microprocessors. IEEE Journal of Solid State Circuits, Vol. 38, No. 5, pages 826–829, May 2003.
[11] C. Visweswariah, K. Ravindran, K. Kalafala, S. G. Walker, and S. Narayan. First-order incremental block-based statistical timing analysis. DAC, pages 331–336, San Diego, CA, June 2004.
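
The logarithmic leakage fit described in the experimental results (fitting the logarithm of leakage to a quadratic form of Vdd and Leff, so that the fit becomes a linear least-squares problem) can be illustrated with a minimal sketch. It assumes Python with NumPy instead of the Matlab flow reported in the experiments. The grid follows the text (41 Vdd points, 21 Leff points, 861 samples), but the leakage samples, their coefficients, the centering values (1.2 V and 90 nm) and the helper quad_basis are hypothetical stand-ins, not the characterized SPICE data.

```python
import numpy as np

# Characterization grid from the text: 41 Vdd points in [1.0, 1.4] V and
# 21 Leff points in [80, 100] nm, i.e. 41 * 21 = 861 samples.
vdd = np.linspace(1.0, 1.4, 41)
leff_nm = np.linspace(80.0, 100.0, 21)
V, L = np.meshgrid(vdd, leff_nm, indexing="ij")
v, l = V.ravel(), L.ravel()

# HYPOTHETICAL stand-in for the SPICE-characterized chip leakage (amperes).
# All coefficients are invented for illustration; the small cubic term makes
# the data deviate slightly from an exact exponential of a quadratic form.
leakage = 1e-3 * np.exp(3.0 * (v - 1.2) - 0.12 * (l - 90.0)
                        + 0.5 * (v - 1.2) ** 3)


def quad_basis(v, l):
    """Return the six columns of a quadratic form in centered Vdd and Leff."""
    dv, dl = v - 1.2, l - 90.0
    return np.column_stack([np.ones_like(dv), dv, dl, dv**2, dv * dl, dl**2])


# Fitting log(leakage) makes the problem linear in the six coefficients.
A = quad_basis(v, l)
coef, *_ = np.linalg.lstsq(A, np.log(leakage), rcond=None)

# Map the fit back to leakage and measure the relative approximation error.
fitted = np.exp(A @ coef)
max_rel_err = np.max(np.abs(fitted - leakage) / leakage)
print(f"samples: {leakage.size}, max relative error: {100 * max_rel_err:.3f}%")
```

Fitting in the logarithmic domain avoids an iterative nonlinear solve, which is consistent with the short fitting time reported above. Note that this minimizes the error of the logarithm, so the relative error of the leakage itself should be checked afterwards, as the last lines of the sketch do.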
