Voltage Binning Under Process Variation
Vladimir Zolotov, IBM T.J. Watson Research Center, Yorktown Heights, NY, USA
Chandu Visweswariah, IBM System and Technology Group, Hopewell Junction, NY, USA
Jinjun Xiong, IBM T.J. Watson Research Center, Yorktown Heights, NY, USA
ABSTRACT
Process variation is recognized as a major source of parametric yield loss, which occurs because a fraction of manufactured chips do not satisfy timing or power constraints.
On the other hand, both chip performance and chip leakage power depend on supply voltage. This dependence can
be used for converting the fraction of too slow or too leaky
chips into good ones by adjusting their supply voltage. This
technique is called voltage binning [4]. All the manufactured chips are divided into groups (bins) and each group
is assigned its individual supply voltage. This paper proposes a statistical technique of yield computation for different voltage binning schemes using results of statistical
timing and variational power analysis. The paper formulates and solves the problem of computing optimal supply
voltages for a given binning scheme.
Categories and Subject Descriptors
B.7.2 [Integrated Circuits]: Design Aids
General Terms
Algorithms, Design, Theory
Keywords
Voltage binning, parametric yield, leakage current
1. INTRODUCTION
Now process variation is recognized as a major source of
parametric yield loss. In the future, due to scaling down
of CMOS transistor size, the situation is expected to get
worse. Process variation causes high variability in gate delays and leakage current, which leads to high variability of
chip operational frequency and power consumption. Due to
die-to-die and within die variability, only some of the manufactured chips satisfy both performance and power requirements. The other chips are either too slow or consume too
much power. They represent the parametric yield loss.
ICCAD ’09 San Jose, California USA
Copyright 2009 ACM 978-1-60558-800-1/09/11 ...$10.00.
A number of approaches were proposed to address this
problem. Conservative design sets chip timing and power
to a stricter target than required. The main disadvantage
of this approach is larger chip area and higher design cost.
Speed binning reduces parametric yield loss by accepting
low performance chips and selling them at a discount price
[3]. However, often ASIC chips have strict requirements for
their frequency and power. The chips not satisfying them
have no value at all. Additionally, selling chips at a discount
price reduces the profit. Body biasing [5], [2] either reduces
transistor leakage or improves gate delays. This technique
requires connecting transistor bodies to the biasing voltage
source. This wiring negatively affects chip routability. It
increases design time and cost and is impractical for SOI
technology.
In this paper we analyze voltage binning [4] as a technique to reduce parametric yield loss. This technique exploits the fact that both chip performance and power consumption depend on supply voltage. Higher supply voltage
improves chip performance but increases both leakage and
switching power [10]. On the other hand, slow chips often
have low leakage and chips with high leakage have higher
performance. By adjusting supply voltage it is possible to
make initially failing chips satisfy application constraints.
Because all the accepted chips satisfy these constraints they
have the same value and can be sold at the same price. Supply voltage adjustment does not require either additional
circuitry or additional wiring on the chip.
There are different schemes of supply voltage adjustment.
For example, it is possible to assign an individual supply
voltage to each manufactured chip [10], [2]. It is an attractive methodology but it requires significant effort for chip
testing at different supply voltages. Voltage binning [4] is
an alternative technique. It divides all manufactured chips
into several bins and assigns to each bin some value of supply voltage. This technique is more practical at the cost of
small yield reduction.
The patent [4] proposes a general idea of voltage binning.
However, it does not analyze how to select optimal supply
voltages and how much improvement can be obtained by
different voltage binning schemes. We develop a technique
for computing yield for a given voltage binning scheme, including the case of individually adjustable supply voltage.
Our approach is based on results and ideas of statistical
timing analysis [11] and statistical models of chip leakage
power [8]. We represent chip yield by an integral over the
process variation space. Monte-Carlo technique is the most
obvious method for computing such integrals. However, it is
rather slow and requires a large number of samples to ensure
accuracy. Besides, Monte-Carlo computation has significant
random numerical noise, and even small random noise
severely interferes with an optimization procedure. Therefore, we developed a combination of analytical and numerical
techniques for accurate and efficient calculation of chip yield.
By applying the yield computation technique, we solve the
problem of finding an optimal voltage binning scheme, i.e., computing optimal voltage levels. We use a functional representation of chip performance and power consumption. Statistical static timing analysis (SSTA) gives chip performance as
a linear form of process variables [11].
Chip power analysis is not a primary goal of this paper.
Therefore, here we do not delve into details, like computation of switching activity or probability of different leakage states. However, in order to make our parametric yield
computation sufficiently realistic we have to use an adequate
model of chip power consumption. This model takes into account both switching and leakage power. Switching power
is expressed as a function of supply voltage and gate load
and wire capacitances.
Modeling of chip leakage power has attracted the attention
of many authors. For our purpose, a convenient representation of chip leakage is given by [7]. However, our experiments showed that we need a more accurate model. We
made two modifications of the leakage computation technique. First, we represented leakage current with the exponent of a quadratic form of transistor channel length (Leff)
and supply voltage (Vdd). This function approximates chip
leakage with less than 1% error. Second, instead of probability density function (PDF) of chip leakage, we directly
use its functional expression.
In order to verify the proposed technique we applied it
to the analysis of large industrial chips. In our experiments
we computed total chip leakage in a functional form using
SPICE simulation of individual gates and the chip statistics
of gate types and their states. Similarly we computed chip
switching power using statistics of gate switching frequency
and their load capacitances. Chip statistical timing slack
was computed by an in-house statistical timing analyzer.
These results were used for computing yield and optimal
voltage binning schemes.
The rest of the paper is organized as follows. Section II
introduces the timing and power models. Section III gives
definitions of voltage binning schemes and formulates problems of computation and optimization of yield. Section IV
presents our technique of computing and optimizing yield
of various voltage binning schemes. Section V describes our
computational experiments. Section VI draws conclusions.
2. BACKGROUND
For computing and optimizing yield corresponding to different voltage binning schemes we need proper performance and power statistical models, parameterized by supply voltage. One such model is a joint distribution of chip operational frequency and power consumption. Fig. 1 shows a contour plot of the joint probability density function (JPDF) of chip frequency and power for some Vdd value. The area of this plot is divided into 4 regions with lines F = Freq and P = Plim indicating chip performance and power requirements, respectively.
Figure 1: Joint PDF of chip operational frequency and power consumption
The region marked Good chips represents the chips satisfying the application requirements. The integral of the JPDF over this region gives manufacturing yield. The region marked Leaky & Slow represents the chips that are too slow and have too high leakage. These chips cannot be fixed by adjusting supply voltage. The region marked Too Leaky represents the chips that consume too much power due to leakage but have sufficiently high operational frequency. Some of these chips can be fixed by lowering supply voltage. The region marked Too slow represents chips that do not satisfy performance requirements but consume power less than allowed. Some of these chips can be fixed by increasing supply voltage.
While the power-performance space is helpful for visualizing chip distribution, it is not convenient for actual yield computation. First, due to highly non-linear dependence of chip leakage on Leff it is very difficult to represent the JPDF of power and performance numerically or analytically even for fixed Vdd. Second, there are no convenient mathematical tools to operate with a JPDF parameterized by Vdd. Because of that we perform our analysis in the space of sources of variation and Vdd. For simplicity, we assume that all variational parameters have normal Gaussian distributions.
2.1 Timing Model
Traditionally chip performance is expressed by its clock frequency. However, the results of timing are expressed in terms of timing slack. Minimization of clock period is equivalent to maximization of timing slack. Therefore, for convenience we use timing slack instead of clock frequency. Statistical timing computes timing slack S in linear form [11]
$$S = S_0 + \sum_{i=1}^{n} a_i \Delta X_i + a_R \Delta R \tag{1}$$
where S0 is the mean value of chip slack, ∆Xi is chip to
chip variation of parameter Xi , ∆R is uncorrelated variation summing effects of all intra-die variations, ai and aR
are the sensitivities to these variations. The spatial variation is computed as described in [1] and included in the
uncorrelated variation because it does not affect chip-to-chip
variation.
We give special consideration to chip-to-chip variations of
supply voltage ∆V and transistor channel length ∆L. We
combine all other sources of variations into a single parameter ∆X, using the fact that linear combination of Gaussian random variables is a Gaussian random variable too.
Threshold voltage variation due to random dopant fluctuation is included into the uncorrelated variation modeling all
the intra-die variations. Then the statistical timing slack is
expressed as follows
$$S = S_0 + a_V \Delta V + a_L \Delta L + a_X \Delta X \tag{2}$$
where aV , aL and aX are sensitivities of timing slack to Vdd,
Leff and the combined variational parameter ∆X. The sensitivity aV with respect to Vdd can be computed either by
pure statistical timing considering Vdd a statistical variable
or from two statistical timing procedures performed at low
and high Vdd values.
2.2 Power model
Total chip power is expressed as PTot = Ps + Pl, where Ps
is switching or dynamic power and Pl is leakage power due
to sub-threshold and gate leakage.
Switching chip power is expressed as
$$P_s = \sum_{i=1}^{n} s_i F C_i \frac{V^2}{2} = \alpha \frac{V^2}{2} \tag{3}$$
where si and Ci are the switching activity and the load
capacitance of the ith gate, F is the chip frequency, V is the
supply voltage, α is the coefficient obtained by summing all
terms together.
From (3) we get a variational form of chip switching power
$$P_s = P_{s,0} + c_v \Delta V + c_{vv} \Delta V^2 \tag{4}$$
where Ps,0 is the switching power at nominal Vdd, ∆V is
the variation of Vdd, cv and cvv are constant coefficients.
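As a short check of (4), one can substitute V = V0 + ∆V into (3), where V0 denotes the nominal supply voltage. The identification below is only a sketch: it assumes that α (and hence the frequency F) does not depend on Vdd.

$$P_s = \alpha \frac{(V_0 + \Delta V)^2}{2} = \underbrace{\frac{\alpha V_0^2}{2}}_{P_{s,0}} + \underbrace{\alpha V_0}_{c_v} \Delta V + \underbrace{\frac{\alpha}{2}}_{c_{vv}} \Delta V^2$$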
The chip leakage current is expressed as
$$I_{chip} = V \sum_{i=1}^{n} \sum_{j=1}^{k_i} p_{i,j} I_{i,j} \tag{5}$$
where Ii,j is the leakage of the ith gate in the jth leakage
state, pi,j is the probability of that state occurrence and ki
is the number of leakage states of the ith gate.
The paper [9] proposes approximating the leakage of each
gate with a log-normal distribution. The total chip leakage
is computed iteratively by summing many log-normal distributions. The resulting sum is approximated by a log-normal
distribution again using modified Wilkinson’s method. This
approach has two sources of error. First, the use of lognormal distributions corresponds to approximation of leakage current with the exponent of a linear function. According to our experiments it is not accurate, especially when
leakage is considered as a function of Vdd and Leff. This
fact is confirmed by [8] showing that it is necessary to use
the exponent of a quadratic function of Leff. The second
source of error is the approximation of the sum of several
log-normal random variables by a log-normal variable again.
In our case the situation is even more complex because we
consider Vdd as a parameter. Therefore, we developed an
alternative technique.
Figure 2: Circuit For Leakage Current Characterization
Using SPICE simulations, we characterize the leakage current of each type of gate in each possible leakage state. Fig. 2 shows an example of the characterization circuit. The
leakage current is measured as the current flowing through
voltage source E. Thus we measure both the sub-threshold
and transistor gate leakage currents together. The leakage current is measured for all required values of Vdd and
Leff. The characterization results are represented by a two-dimensional table. Then we numerically sum the leakage
tables of all gate types and all their leakage states weighted
by the corresponding coefficients pi,j from (5). This operation does not incur any additional approximation error. The
resulting table is fitted to the exponent of a quadratic form
of Vdd and Leff
$$I_{chip} = I_0 e^{a_{ll} L^2 + a_{vv} V^2 + a_{lv} L V + a_l L + a_V V} \tag{6}$$
where L and V are the values of the transistor channel length
and the chip supply voltage, respectively, I0 , all , avv , alv ,
al , aV are the fitting coefficients.
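The fit to the exponent of a quadratic form can be sketched as an ordinary linear least-squares problem on the logarithm of the tabulated leakage. The sketch below is not the authors' implementation: it uses a synthetic table, hypothetical coefficient values, and centered, rescaled variables `dl` and `dv` (an assumption made here for numerical conditioning), so the fitted coefficients correspond to those of (6) only up to that change of variables.

```python
import math

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def features(dl, dv):
    """Quadratic-form basis: constant, dl^2, dv^2, dl*dv, dl, dv."""
    return [1.0, dl * dl, dv * dv, dl * dv, dl, dv]

def fit_log_leakage(samples):
    """Least-squares fit of ln(leakage) via the normal equations.

    samples: iterable of (dl, dv, leakage) with leakage > 0.
    Returns the six coefficients in the order used by features().
    """
    k = 6
    ata = [[0.0] * k for _ in range(k)]
    aty = [0.0] * k
    for dl, dv, leak in samples:
        f = features(dl, dv)
        y = math.log(leak)
        for i in range(k):
            aty[i] += f[i] * y
            for j in range(k):
                ata[i][j] += f[i] * f[j]
    return solve_linear(ata, aty)

def leakage(coef, dl, dv):
    """Evaluate the exponent-of-quadratic model."""
    return math.exp(sum(c * f for c, f in zip(coef, features(dl, dv))))

# Synthetic 21 x 41 table (hypothetical coefficients, not measured data).
true_coef = [math.log(0.3), 0.4, 0.5, -0.3, -1.2, 1.5]
grid = [(-1.0 + 0.1 * i, -0.2 + 0.01 * j) for i in range(21) for j in range(41)]
table = [(dl, dv, leakage(true_coef, dl, dv)) for dl, dv in grid]

coef = fit_log_leakage(table)
max_rel_err = max(abs(leakage(coef, dl, dv) / leak - 1.0) for dl, dv, leak in table)
```

Fitting the logarithm keeps the problem linear in the unknown coefficients, which is why a direct solve suffices here; on real characterization data the residual would of course not vanish as it does for this synthetic table.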
Assuming variational representations of Vdd and Leff in
the form V = V0 + ∆V , L = L0 + ∆L we derive the variational form of leakage power
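$$P_l = V \cdot I_l(V, L) = (V_0 + \Delta V) \cdot I_l(V_0 + \Delta V, L_0 + \Delta L) = I_{nom} \cdot (V_0 + \Delta V) \cdot e^{b_{ll} \Delta L^2 + b_{vv} \Delta V^2 + b_{lv} \Delta L \Delta V + b_l \Delta L + b_V \Delta V} \tag{7}$$
where Inom is the leakage current at the nominal Vdd and
Leff
$$I_{nom} = I_0 e^{a_{ll} L_0^2 + a_{vv} V_0^2 + a_{lv} L_0 V_0 + a_l L_0 + a_V V_0} \tag{8}$$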
and bll , bvv , blv , bl and bV are coefficients computed from the
coefficients all , avv , alv , al , aV . We preserve quadratic terms
of variations because they are too large to be neglected.
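For reference, expanding the exponent of (6) at V = V0 + ∆V, L = L0 + ∆L and collecting terms suggests the relations below. This is a sketch that assumes ∆L and ∆V are expressed in the same units as L and V; if the variations are normalized, the coefficients pick up the corresponding scale factors.

$$b_{ll} = a_{ll}, \quad b_{vv} = a_{vv}, \quad b_{lv} = a_{lv}, \quad b_l = 2 a_{ll} L_0 + a_{lv} V_0 + a_l, \quad b_V = 2 a_{vv} V_0 + a_{lv} L_0 + a_V$$

The terms that contain neither ∆L nor ∆V are absorbed into Inom, in agreement with (8).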
Variational form of the total chip power can be obtained
by summing (4) and (7). We use this formula for our computational experiments. However, all the formulations and
derivations in this paper are general enough for chip power
expressed by an arbitrary function of Vdd and Leff.
We do not consider intra-die variation of leakage current.
The total chip leakage is the sum of leakage currents of a
large number NG of gates. The variance of the chip-to-chip
leakage variation is proportional to the number of gates NG,
while the variance of the intra-die leakage variation is proportional to √NG. Therefore, the contribution of intra-die
variation to the correlation between the chip slack and
leakage is small compared with the contribution of chip-to-chip
variation. Spatial variation does not have a high impact
on this correlation either, because the number of spatially independent regions on a chip is also large (at least hundreds).
The main impact of intra-die variability is a small change of
mean value. This adjustment can be computed similarly to
[8].
3. PROBLEM FORMULATION
For voltage binning there are two main problems, for which
we present solutions here. The first one is yield computation
for a given binning scheme. The second one is computation
of the optimal binning scheme given a required number of
voltage bins. In order to solve these problems, we give a
definition of a voltage binning scheme and analyze its properties.
Definition 1. A voltage binning scheme is a set of supply voltage levels V̂ = {V1, V2, . . . , Vn}, a set of bins Û = {U1, U2, . . . , Un} corresponding to these voltages and a binning algorithm A, which distributes manufactured chips among
the bins.
The binning algorithm A assigns chips to bins so that any
chip assigned to bin Ui meets both the timing and power
constraints at the supply voltage Vi corresponding to this
bin. The chips not assigned to any bin constitute yield loss.
From the definition of a voltage binning scheme, we see
that yield computation depends on two separate factors: the
bin voltage levels and the binning algorithm. For the same
set of bin voltage levels, there are many different binning algorithms. These algorithms may produce different yield. It
makes sense to consider only the binning algorithms producing the maximum possible yield. Fortunately, the criterion
for optimality of a binning algorithm is simple. An optimal
binning algorithm must not put good chips in yield loss. In
other words, any chip for which there exists at least one bin,
where the chip satisfies timing and power constraints, should
be assigned to some bin. Obviously any chip assignment to
voltage bins satisfying this criterion cannot be improved because no chip from the yield loss bucket can be assigned to
a voltage bin without violating either the power or timing
constraints.
Our yield computation and optimization is based on the
following binning algorithm:
Algorithm 1.
1. Take a manufactured chip and find the highest bin
voltage Vi at which the chip satisfies power constraints.
2. Test chip performance for Vdd=Vi .
3. If the chip meets the timing constraints, assign it to
the bin Ui with Vdd=Vi . Otherwise, put this chip in
the scrap bucket.
Usually chip power and performance are increasing functions
of Vdd. Therefore, this algorithm produces optimal chip
assignment to voltage bins. The simplicity of this binning
algorithm makes yield computation convenient. It does not
mean that the actual binning should be done by exactly
this algorithm. However, any optimal binning produces the
same yield. Therefore, our computation is valid for any other
optimal binning.
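Algorithm 1 can be sketched in a few lines. The function below is illustrative only: `power_at` and `slack_at` are hypothetical callbacks standing in for per-chip power and timing measurements at a given supply voltage, and the toy chips at the end are invented to exercise the three outcomes.

```python
def assign_bin(bin_voltages, power_at, slack_at, p_lim):
    """Return the index of the bin a chip is assigned to, or None for scrap.

    bin_voltages: bin supply voltages in ascending order.
    power_at(v), slack_at(v): hypothetical per-chip measurement callbacks.
    """
    # Step 1: highest bin voltage at which the power constraint P < p_lim holds.
    feasible = [i for i, v in enumerate(bin_voltages) if power_at(v) < p_lim]
    if not feasible:
        return None  # too leaky even at the lowest bin voltage
    i = max(feasible)
    # Steps 2-3: test timing at that voltage; accept if slack is positive.
    return i if slack_at(bin_voltages[i]) > 0 else None

bins = [1.0, 1.1, 1.2, 1.3]
p_lim = 3.0

# Toy chips (made-up models that increase with supply voltage).
good = assign_bin(bins, lambda v: 2.0 * v * v, lambda v: 0.5 * (v - 1.15), p_lim)
slow = assign_bin(bins, lambda v: 2.0 * v * v, lambda v: 0.5 * (v - 1.25), p_lim)
leaky = assign_bin(bins, lambda v: 3.5 * v * v, lambda v: 0.5 * (v - 1.15), p_lim)
```

Because power and slack both increase with Vdd in these toy models, checking timing only at the highest power-feasible voltage is enough: if the chip fails timing there, it fails at every lower bin voltage as well.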
The problem of computing optimal voltage binning scheme
V̂ = {V1, V2, . . . , Vn} is formulated as follows:
$$\max_{V_1, V_2, \dots, V_n} Y \quad \text{s.t.} \quad \begin{cases} V_1 > V_{min} \\ V_2 > V_1 \\ \dots \\ V_n > V_{n-1} \\ V_n < V_{max} \end{cases} \tag{9}$$
where Y is total yield corresponding to the bin voltage levels
{V1 , V2 , . . . , Vn }.
Optimization constraints impose an ascending order of
voltage levels. For formulation of the optimization problem this order is not obligatory. However, without those
constraints, the yield is a multi-modal function of voltage
levels, because any permutation transforms the vector of optimal voltages into another optimal solution. By imposing
an ascending order of voltage levels, we remove the duplicates of optimal solutions, which simplifies the optimization
algorithm.
There are two special types of voltage binning. The first
one is a scheme with a single bin. Obviously, it is just the
usual case with a single operational voltage. The other one
is a scheme with infinite number of voltage bins with all possible voltage levels. This binning scheme describes the situation when supply voltage is individually tailored for each
chip to meet timing and power constraints.
Optimal binning schemes have the following properties.
Property 1. If optimal voltage binning scheme S1 has n
voltage levels, optimal binning scheme S2 has m voltage
levels and n < m, then binning scheme S1 gives lower or
equal yield than binning scheme S2 .
Property 2. Yield for chips with individually adjustable
Vdd is an upper bound of yield for any other voltage binning
scheme.
Property 3. Yield for single optimal value of supply voltage (optimal single voltage bin) is a lower bound for any
optimal voltage binning scheme.
We use these properties for verifying our computation of
optimal binning schemes.
4. YIELD COMPUTATION
4.1 Yield for single voltage bin
In order to be good, a chip should satisfy requirements that
its timing slack is positive S > 0 and its power does not
exceed the required limit P < Plim . For a single Vdd,
the parametric chip yield is the percentage of manufactured
chips satisfying these constraints. We compute yield for a
given voltage level ∆V by direct integration in the space of
process parameters
$$Y = \iint_{\substack{S > 0 \\ P < P_{lim}}} p(\Delta L, \Delta X) \, d\Delta L \, d\Delta X \tag{10}$$
where p(∆L, ∆X) is the JPDF of variations of Leff and our
combined parameter X, S > 0 and P < Plim are timing
and power constraints defining the integration area in the
process variation space. In order to simplify the presentation
we assume that ∆L and ∆X are normalized Gaussians.
The simplest way to compute the integral (10) is a Monte-Carlo technique. However, Monte-Carlo integration is prone
to numerical noise. For yield estimation this noise is acceptable. However, optimization is guided by the variation of the
objective function with respect to optimization steps. Even
small random noise confuses the optimization by making the
objective function locally non-convex. In fact, a systematic deterministic inaccuracy is less harmful than this
random noise. The computation of the objective function
multiple times during optimization exacerbates the situation by increasing the probability of large errors.
Since there is no closed formula for this integral, we developed a combination of numerical and analytical techniques.
Fig. 3 shows the variational space for channel length ∆L
and combined parameter ∆X. The vertical line ∆L =
∆Lmin corresponds to the power constraint P < Plim . The
value ∆Lmin corresponds to the minimum Leff at which the
power constraint is not violated. ∆Lmin is calculated by
numerically solving the equation P (∆Lmin ) = Plim where
P (∆Lmin ) is total chip power considered as a function of
Leff variation. The oblique line ∆X = (−S0 − aV ∆V −
aL ∆L)/aX expresses the timing constraint S > 0. It is derived from the equation for timing slack (2). We do not set
upper bound on Leff variation because Gaussian distribution
goes to 0 very fast and the impact of high values of Leff is
negligibly small. These two lines limit the area where chips
satisfy timing and power constraints. The integral (10) over
this area can be rewritten as an iterated integral
$$Y = \frac{1}{2\pi} \int_{\Delta L_{min}}^{\infty} \int_{\frac{-S_0 - a_V \Delta V - a_L \Delta L}{a_X}}^{\infty} e^{-\frac{\Delta L^2 + \Delta X^2}{2}} \, d\Delta X \, d\Delta L \tag{11}$$
Figure 3: Variational space for single vdd yield
By expressing the internal integral through the normalized Gaussian PDF φ and cumulative distribution function (CDF) Φ we simplify the formula for yield
$$Y = \int_{\Delta L_{min}}^{\infty} \varphi(\Delta L) \, \Phi\left(\frac{S_0 + a_V \Delta V + a_L \Delta L}{a_X}\right) d\Delta L \tag{12}$$
This integral can be efficiently computed numerically. It takes less than one second. This computational technique can be easily extended to the case of more variables affecting both chip power and timing. Then we will have to compute numerically a 2 or 3 dimensional integral, which is not too difficult.
4.2 Yield for multiple voltage bins
Assume we have a voltage binning scheme with n voltage levels V1, V2, . . . , Vn. Then a chip is good if it satisfies timing and power requirements at least at one supply voltage Vi. Mathematically yield can be expressed as follows:
$$Y = \iint_{\bigcup_{i=1}^{n} \{S(V_i) > 0, \; P(V_i) < P_{lim}\}} p(\Delta L, \Delta X) \, d\Delta L \, d\Delta X \tag{13}$$
where integration is performed over the union of all the areas defined by the timing and power constraints corresponding to each supply voltage of the binning scheme. We extended the technique of subsection 4.1 for computing this integral.
Fig. 4 shows the variational space for channel length variations ∆L and combined parameter variations ∆X. The gray area constitutes chip yield. It is bounded by 3 vertical lines ∆L = ∆Lmin,i and 3 oblique lines ∆X = (−S0 − aV ∆Vi − aL ∆L)/aX. Each vertical line ∆L = ∆Lmin,i corresponds to power constraint P(∆Lmin,i) < Plim,i. Each oblique line ∆X = (−S0 − aV ∆Vi − aL ∆L)/aX corresponds to timing constraint S(∆Vi, ∆L, ∆X) = 0. The formula for the oblique lines is derived from the expression for timing slack (2). The whole yield area in Fig. 4 can be divided into three subregions marked Y1, Y2 and Y3. Two of them are bounded by two vertical lines and by one oblique line. The third area is bounded by one vertical line and one oblique line. The chip yield can be represented as Y = Y1 + Y2 + Y3.
Figure 4: Variational space for multiple vdd bins yield
For the general case, we represent the total yield for a binning scheme with n voltages V1, V2, . . . , Vn as the sum of the yields corresponding to n subregions.
$$Y = \sum_{i=1}^{n} Y_i \tag{14}$$
The term Yn corresponds to the region bounded by one vertical line and one oblique line. It can be efficiently computed by (12). The other terms Yi with i < n correspond to the regions bounded by two vertical lines ∆L = ∆Lmin,i, ∆L = ∆Lmin,i+1 and one oblique line ∆X = (−S0 − aV ∆Vi − aL ∆L)/aX. These terms are computed as
$$Y_i = \int_{\Delta L_{min,i}}^{\Delta L_{min,i+1}} \varphi(\Delta L) \, \Phi\left(\frac{S_0 + a_V \Delta V_i + a_L \Delta L}{a_X}\right) d\Delta L \tag{15}$$
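The one-dimensional integrals (12) and (15) and the sum (14) can be evaluated with any standard quadrature rule. The sketch below uses a composite Simpson rule, which is a choice made here and is not stated in the paper. It assumes normalized Gaussian ∆L and ∆X, aX > 0, and takes the power-limit abscissas ∆Lmin,i as given inputs rather than solving P(∆Lmin) = Plim. All numeric values are hypothetical.

```python
import math

def phi(x):
    """Standard normal PDF."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def Phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (hi - lo) / n
    s = f(lo) + f(hi)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * f(lo + k * h)
    return s * h / 3.0

def yield_region(s0, a_v, a_l, a_x, dv, dl_lo, dl_hi):
    """Integrand of (12)/(15) integrated over [dl_lo, dl_hi].

    The range is clipped to +/- 8 sigma, beyond which the Gaussian
    weight is negligible.
    """
    lo, hi = max(dl_lo, -8.0), min(dl_hi, 8.0)
    if hi <= lo:
        return 0.0
    return simpson(lambda dl: phi(dl) * Phi((s0 + a_v * dv + a_l * dl) / a_x), lo, hi)

def yield_bins(s0, a_v, a_l, a_x, dvs, dl_mins):
    """Sum (14): dvs are bin voltage deviations in ascending order and
    dl_mins the matching power-limit abscissas, also ascending."""
    total = 0.0
    for i, dv in enumerate(dvs):
        upper = dl_mins[i + 1] if i + 1 < len(dvs) else math.inf
        total += yield_region(s0, a_v, a_l, a_x, dv, dl_mins[i], upper)
    return total

# Hypothetical sensitivities and bin data, for illustration only.
y_two = yield_bins(0.2, 5.0, -0.5, 0.3, [-0.05, 0.05], [-1.0, 0.5])
y_low_only = yield_bins(0.2, 5.0, -0.5, 0.3, [-0.05], [-1.0])
y_high_only = yield_bins(0.2, 5.0, -0.5, 0.3, [0.05], [0.5])
```

With a single bin the sum reduces to (12), and with aL = 0 the integrand factorizes, which gives a convenient closed-form check of the quadrature.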
4.3 Yield optimization
Having derived efficient and accurate formulas and numerical procedures for yield computation, we now proceed to solve the problem of computing optimal voltage levels, i.e., the problem of computing optimal binning scheme (9). Our formulas can be used for computing yield derivatives with respect to Vdd levels. However, we found that even optimization without derivatives by simplex Nelder and Mead method [6] gives good results.
4.4 Yield for adjustable supply voltage
Now we consider the case when each chip can be assigned its own individual supply voltage to meet timing and power constraints. Obviously it is the same as having infinite number of voltage bins. Therefore, the yield for this case can be expressed as the limit of optimal binning as the number of bins approaches infinity. However, its computation as a limit is not convenient and we need a more efficient technique.
Just as timing and power constraints are formulated in terms of deviation of Vdd from its nominal value V0, we express the yield in terms of this deviation ∆V too. The yield Y for chips with individually adjustable supply voltage in the interval of voltage variations [∆Vmin, ∆Vmax] is expressed as the probability that there exists supply voltage in this interval such that both the timing and power constraints are satisfied:
$$Y = \mathrm{Prob}\left(\exists \Delta V \mid S(\Delta V, \Delta L, \Delta X) > 0, \; P(\Delta V, \Delta L) < P_{lim}, \; \Delta V_{min} < \Delta V < \Delta V_{max}\right) \tag{16}$$
This formula correctly defines yield of the chips with individually adjustable Vdd, but it is not helpful for actual calculation. Therefore, we transform it to an integral that is more suitable for computation.
$$Y = \iint_{\substack{S(\Delta V, \Delta L, \Delta X) > 0 \\ P(\Delta V, \Delta L) < P_{lim} \\ \Delta V_{min} < \Delta V < \Delta V_{max}}} p(\Delta L, \Delta X) \, d\Delta L \, d\Delta X \tag{17}$$
The integration in this formula is performed across the area where the system of inequalities has at least one solution. This integral can be computed by the following Monte-Carlo algorithm
Algorithm 2.
1. Assume certain number of Monte-Carlo samples N.
2. Set Y = 0.
3. For i ∈ [1, N] do the following:
• Generate random samples of ∆L and ∆X according to their distributions.
• If system of inequalities
$$\begin{aligned} S(\Delta V, \Delta L, \Delta X) &> 0 \\ P(\Delta V, \Delta L) &< P_{lim} \\ \Delta V_{min} < \Delta V &< \Delta V_{max} \end{aligned} \tag{18}$$
is compatible set Y = Y + 1
4. Compute yield Y = Y/N.
Figure 5: Leakage as a function of Vdd and Leff (panels: leakage as function of Vdd and Leff; leakage as function of Leff; leakage as function of Vdd; axes: Leakage (A), Vdd (V), Leff (m))
The only unusual thing here is checking feasibility of the
system of inequalities. It is not difficult because both timing
slack and power have analytical expressions. Substituting
into these expressions the sampled values of ∆L and ∆X
we obtain a system of inequalities with only one variable
∆V. Then checking its feasibility is straightforward, especially taking into account that here both timing slack and
power are monotonic functions of supply voltage. The Monte-Carlo technique is suitable for computing this integral because we do not use it in an optimization procedure. Therefore, the random numerical noise does not create any additional problems.
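Algorithm 2 together with the feasibility check can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: slack and power are assumed to increase monotonically with ∆V, the largest power-feasible voltage is located by bisection (a choice made here), interval endpoints are treated as admissible (which matters only on a set of probability zero), and `toy_slack` and `toy_power` are hypothetical models shaped like (2), (3) and (7).

```python
import math
import random

def feasible(slack_at, power_at, p_lim, dv_min, dv_max, iters=60):
    """Check whether some dv in [dv_min, dv_max] gives slack > 0 and power < p_lim.

    Relies on slack_at and power_at both increasing with dv: it is enough
    to test timing at the highest voltage that still meets the power limit.
    """
    if power_at(dv_min) >= p_lim:
        return False  # too much power even at the lowest voltage
    if power_at(dv_max) < p_lim:
        return slack_at(dv_max) > 0
    lo, hi = dv_min, dv_max  # invariant: power_at(lo) < p_lim <= power_at(hi)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if power_at(mid) < p_lim:
            lo = mid
        else:
            hi = mid
    return slack_at(lo) > 0

def adjustable_yield(slack, power, p_lim, dv_min, dv_max, n_samples, seed=0):
    """Monte-Carlo estimate of (17) with normalized Gaussian dl and dx."""
    rng = random.Random(seed)
    good = 0
    for _ in range(n_samples):
        dl = rng.gauss(0.0, 1.0)
        dx = rng.gauss(0.0, 1.0)
        if feasible(lambda dv: slack(dv, dl, dx),
                    lambda dv: power(dv, dl), p_lim, dv_min, dv_max):
            good += 1
    return good / n_samples

def toy_slack(dv, dl, dx):
    """Hypothetical linear slack in the form of (2)."""
    return 0.3 + 4.0 * dv - 0.4 * dl + 0.3 * dx

def toy_power(dv, dl):
    """Hypothetical switching plus leakage power, shaped like (3) and (7)."""
    v = 1.2 + dv
    return 0.5 * v * v + v * math.exp(-0.8 * dl + 2.0 * dv)

y_adj = adjustable_yield(toy_slack, toy_power, 2.2, -0.1, 0.1, 5000)
y_fixed = adjustable_yield(toy_slack, toy_power, 2.2, 0.0, 0.0, 5000)
```

Using the same seed for both calls evaluates the same samples, so the comparison between the adjustable range and the single fixed voltage is made chip by chip, in the spirit of Property 2.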
5. EXPERIMENTAL RESULTS
We verified the proposed technique by analyzing 3 large industrial chips with strict requirements on power consumption. We characterized the leakage of library cells as a function of Vdd and Leff using SPICE simulations. Vdd was
varied between 1 volt and 1.4 volts. Leff was varied from
80 to 100 nanometers. We used 41 points for Vdd and 21
points for Leff variations, which makes a total of 861 points
for all Vdd and Leff combinations.
The total leakage of each chip was calculated by summing
the leakage of all cells. The upper 3D plot in Fig. 5 shows
chip leakage current as a function of Vdd and Leff. The
two plots at the bottom of this figure show variation of the
leakage current when either Leff or Vdd is fixed. From these
plots we see strong dependence of leakage both on Vdd and
Leff.
The tabulated function of chip leakage was fit to the exponent of quadratic form (6) using Matlab. For efficiency
we fit the logarithm of leakage current to the quadratic form
of Vdd and Leff. This approach helped to perform fitting
in less than 1 min. Fig. 6 shows error of leakage approximation as a function of Vdd and Leff. The maximum error is only 0.91%. For the other two chips the error was within 0.84% and 0.78%, which confirms the accuracy and robustness of the proposed approximation technique. The total chip power was computed by summing its leakage and switching power. The switching power was estimated from chip switching activity and load capacitances of logic gates.
Figure 6: Error of leakage approximation (chip3; error in % versus Vdd and Leff)
The proposed technique also requires chip slack in linear canonical form (2). The chip slack was computed by statistical timing analysis. We used 15 sources of variations modeling variability of transistors and interconnects. For yield analysis all the components of the chip slack except variations due to Vdd and Leff were combined statistically into a single term.
For validating the analytical formulas of yield computation we compared our technique (12) and Monte-Carlo calculation of chip yield for different Vdd values. Monte-Carlo calculation used 10^5 samples to guarantee sufficient accuracy. The maximum difference between Monte-Carlo and the proposed technique was only 0.44%. We obtained similar results for the case of two voltage bins. Fig. 7 shows a plot of chip yield as a function of Vdd. We see that the yield strongly depends on Vdd and achieves its maximum at some Vdd value.
Figure 7: Yield as a function of Vdd
The next step in our experiments was computation of the optimal Vdd levels for binning schemes with varying number of bins. Table 1 shows results of this experiment for three industrial chips. The first column gives the name of the chip. The 2nd column shows chip yield when Vdd is adjusted individually for each manufactured chip. We see that in this case all three chips have high yield. The columns 3 through 7 give yield for optimal binning with 1, 2, 3, 5 and 10 bins. We see that yield always increases when we increase the number of bins but it never exceeds the yield corresponding to individually adjustable Vdd. We analyzed binning schemes with as many as 10 bins in order to verify that the yield for a large number of bins gets close to the yield corresponding to individually adjustable Vdd. This table confirms all our theoretical predictions. We also see that voltage binning can significantly improve yield. For example, 3 bins give 81.7% yield compared with only 44% for single optimal Vdd value. The computation time was insignificant (about 1 minute) even for the case of 10 bins.
Figure 8: Yield for different binning schemes (yield versus number of bins)
Table 1: Yield (%) for different voltage binning schemes
Design   Adj Vdd   1 bin   2 bins   3 bins   5 bins   10 bins
chip1    97.8      63.2    82.9     89.8     94.2     96.5
chip2    97.9      66.9    85.0     90.0     94.8     96.8
chip3    99.4      44.0    68.9     81.7     92.2     97.9
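The qualitative trends in Table 1 (yield grows with the number of bins and is bounded by the individually adjusted case) can be illustrated with a toy Monte-Carlo experiment. The sketch below is not the analytical technique (12): it assumes each chip has a feasible supply window [v_lo, v_hi], where v_lo is a hypothetical minimum Vdd that meets timing and v_hi a hypothetical maximum Vdd that meets the power limit, and it finds the best bin voltages by brute force over a coarse candidate grid. All distributions, constants and names are assumptions of this sketch.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
n_chips = 5000

# Toy model: both window edges shift with a normalized Leff deviation z,
# so slow chips need (and tolerate) a higher Vdd. Constants are hypothetical.
z = rng.standard_normal(n_chips)
v_lo = 1.18 + 0.06 * z + 0.01 * rng.standard_normal(n_chips)  # timing limit
v_hi = 1.24 + 0.06 * z + 0.01 * rng.standard_normal(n_chips)  # power limit
v_min, v_max = 1.0, 1.4                                       # allowed Vdd range

# Individually adjusted Vdd: a chip is good if its window overlaps [v_min, v_max].
yield_adj = np.mean(np.maximum(v_lo, v_min) <= np.minimum(v_hi, v_max))

# ok[i, j] is True when candidate voltage j lies inside the window of chip i.
candidates = np.linspace(v_min, v_max, 21)
ok = (v_lo[:, None] <= candidates) & (candidates <= v_hi[:, None])

def best_yield(k):
    """Best yield over all choices of k bin voltages from the candidate grid."""
    best = 0.0
    for combo in itertools.combinations(range(len(candidates)), k):
        # A chip is good if at least one of the k bin voltages fits its window.
        best = max(best, ok[:, list(combo)].any(axis=1).mean())
    return best

yields = [best_yield(k) for k in (1, 2, 3)]
print("adjusted:", yield_adj, "binned:", yields)
```

In this toy model the ordering yields[0] <= yields[1] <= yields[2] <= yield_adj holds by construction: any k-bin choice can be extended with one more voltage, and a chip accepted by some bin voltage necessarily has a nonempty feasible window.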
Fig. 8 shows two histograms of chip yield for voltage binning with number of bins varying from 1 to 10. The red
bars (left bars in the groups) show yield for optimal binning. The blue bars (right bars in the groups) show yield
for uniform binning, i.e., binning with Vdd values uniformly
distributed between the minimum and maximum Vdd values. As expected, optimal binning always provides higher
yield. The difference between optimal and uniform voltage binning shrinks as the number of bins
grows. We also see that the uniform binning yield decreases
when the number of voltage bins is increased from 1 to 2.
This happens because the single optimal Vdd is close to the
middle of the Vdd interval but the two uniformly distributed
Vdd values are far from the optimal values.
Fig. 9 shows bin distribution for optimal voltage binning
schemes. One horizontal axis gives the number of bins in
each binning scheme; five schemes are shown,
with 1, 2, 3, 4 and 5 bins. The other horizontal axis gives the Vdd
values of voltage bins. The stems growing from the horizontal plane correspond to the bins of the binning schemes.
Table 2: Yield (%) dependence on power requirements
Power constr   Adj Vdd   1 bin   2 bins   3 bins   5 bins   10 bins
100%           97.8      63.2    82.9     89.8     94.2     96.5
95%            94.9      54.2    74.1     82.1     89.5     91.8
90%            87.4      44.1    62.7     71.2     78.4     83.5
85%            76.5      33.4    49.3     57.3     64.8     70.7
80%            61.1      23.8    35.2     41.8     48.6     54.4
75%            44.1      14.3    22.3     27.1     32.2     37.1
Figure 9: Size of Voltage Bins (axes: number of bins, 1 to 5; Vdd (V); yield)
The coordinates of a stem root indicate the number of bins
in the corresponding binning scheme and the Vdd of the
corresponding bin. The height of each stem is the amount
of yield obtained from the corresponding bin. Clearly, the
optimization improves yield by distributing Vdd values nonuniformly among the bins.
Voltage binning provides an opportunity to improve yield
under very strict power constraints. We analyzed this opportunity by
computing optimal binning for different power requirements. Table 2
shows parametric yield for a single chip (its 100% row matches chip1
in Table 1) for different optimal binning schemes and power
requirements. Column 1 shows the power constraint as a percentage of
the initial target value. We varied the power constraint from the
initial target value down to 75% of it. Column 2 shows yield when each
manufactured chip is assigned an individually adjusted optimal Vdd.
Columns 3 through 7 show yield for binning schemes with different
numbers of bins. From this table we see that voltage binning can
significantly improve yield even for very strict power requirements.
For instance, when the power constraint is 25% stricter than nominal,
a single Vdd gives only 14.3% yield, whereas 3 bins raise it to 27.1%
and 10 bins raise it to 37.1%.
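The improvement factors implied by Table 2 can be checked with simple arithmetic. The sketch below copies the yield values from the table and computes the ratio of binned yield to single-Vdd yield; the dictionary layout and the helper name gain are illustrative only.

```python
# Yield (%) values copied from Table 2.
# Outer key: power constraint as a percentage of the initial target.
# Inner key: "adj" for individually adjusted Vdd, otherwise the number of bins.
table2 = {
    100: {"adj": 97.8, 1: 63.2, 2: 82.9, 3: 89.8, 5: 94.2, 10: 96.5},
    95:  {"adj": 94.9, 1: 54.2, 2: 74.1, 3: 82.1, 5: 89.5, 10: 91.8},
    90:  {"adj": 87.4, 1: 44.1, 2: 62.7, 3: 71.2, 5: 78.4, 10: 83.5},
    85:  {"adj": 76.5, 1: 33.4, 2: 49.3, 3: 57.3, 5: 64.8, 10: 70.7},
    80:  {"adj": 61.1, 1: 23.8, 2: 35.2, 3: 41.8, 5: 48.6, 10: 54.4},
    75:  {"adj": 44.1, 1: 14.3, 2: 22.3, 3: 27.1, 5: 32.2, 10: 37.1},
}

def gain(power_pct, bins):
    """Ratio of the yield with `bins` bins to the single-Vdd (1 bin) yield."""
    row = table2[power_pct]
    return row[bins] / row[1]

for power_pct in sorted(table2, reverse=True):
    print(power_pct, round(gain(power_pct, 3), 2), round(gain(power_pct, 10), 2))
```

With these numbers, 10 bins more than double the single-Vdd yield for power constraints of 85% and tighter, while at the 75% constraint 3 bins fall just short of doubling it.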
6. CONCLUSIONS
We have presented a technique for yield computation and
optimization for different voltage binning schemes. Our approach uses the results of statistical timing analysis and chip
leakage expressed in functional form. For our experiments
we approximated chip leakage as an exponential of quadratic
form of Vdd and Leff. We demonstrated that this approximation
has less than 1% error. However, the proposed technique of analyzing and optimizing voltage binning can be
adapted to other representations of chip leakage and switching power. Our experiments showed that optimal voltage
binning is able to more than double parametric yield for
strict power requirements. The proposed algorithms demonstrated high computational efficiency and require only a few
minutes of CPU time.
7. REFERENCES
[1] R. Chen, L. Zhang, V. Zolotov, C. Visweswariah, and
J. Xiong. Static timing: back to our roots. ASP-DAC,
pages 310–315, 2008.
[2] T. Chen and S. Naffziger. Comparison of adaptive
body bias and adaptive supply voltage for improving
delay and leakage under the presence of process
variation. IEEE Trans. on VLSI. Vol. 11, No 5, pages
888–899, October 2003.
[3] A. Datta, S. Bhunia, J. H. Choi, S. Mukhopadhyay,
and K. Roy. Profit aware circuit design under process
variations considering speed binning. IEEE Trans. on
VLSI. Vol. 16, No 7, pages 806–815, July 2008.
[4] M. W. Kuemerle, S. K. Lichtensteiger, D. W. Douglas,
and I. L. Wemple. Integrated circuit design closure
method for selective voltage binning. U. S. Patent
7,475,366, January 2009.
[5] C. Neau and K. Roy. Optimal body bias selection for
leakage improvement and process compensation over
different technology generations. ISLPED, pages
116–121, August 2003.
[6] W. H. Press, S. A. Teukolsky, W. V. Vetterling, and
B. P. Flannery. Numerical Recipes in C. Cambridge
University Press, 1999.
[7] R. R. Rao, D. Blaauw, D. Sylvester, and A. Devgan.
Modeling and analysis of parametric yield under
power and performance constraints. IEEE Design &
Test of Computers, pages 376–385, July 2005.
[8] R. R. Rao, A. Devgan, D. Blaauw, and D. Sylvester.
Analytical yield prediction considering
leakage/performance correlation. IEEE Transactions
on CAD, pages 1685–1695, September 2006.
[9] A. Srivastava, K. Chopra, S. Shah, D. Sylvester, and
D. Blaauw. A novel approach to perform gate-level
yield analysis and optimization considering correlated
variations in power and performance. IEEE Trans. on
CAD. Vol. 27, No 2, pages 272–285, February 2008.
[10] J. W. Tschanz, S. Narendra, R. Nair, and V. De.
Effectiveness of adaptive supply voltage and body bias
for reducing impact of parameter variations in low
power and high performance microprocessors. IEEE
Journal of Solid State Circuits. Vol. 38, No. 5, pages
826–829, May 2003.
[11] C. Visweswariah, K. Ravindran, K. Kalafala, S. G.
Walker, and S. Narayan. First-order incremental
block-based statistical timing analysis. DAC, pages
331–336, June 2004. San Diego, CA.