New Methods for
Building-in and Improvement of
Integrated Circuit Reliability
Application to High Volume
Semiconductor Manufacturing
Jacob van der Pol
Cover design: Taib El Ghazi
Cover: the Bipolar-CMOS-DMOS WaferFab ‘AN’ of Philips Semiconductors in Nijmegen
ISBN : 9036514614
NEW METHODS FOR
BUILDING-IN AND IMPROVEMENT
OF INTEGRATED CIRCUIT
RELIABILITY
APPLICATION TO HIGH VOLUME
SEMICONDUCTOR MANUFACTURING
DISSERTATION
to obtain
the degree of doctor at the Universiteit Twente,
on the authority of the rector magnificus,
prof.dr. F.A. van Vught,
in accordance with the decision of the Doctorate Board (College voor Promoties),
to be publicly defended
on Thursday 8 June 2000 at 15.00 hrs
by
Jacob Antonius van der Pol
born on 5 May 1961
in Hoensbroek
This dissertation has been approved by the promotors
prof.dr. J.F. Verweij
prof.dr.ir. F.G. Kuper
‘.... just do what you say,
that is hard enough already ....’
after Bram Faas†
To Anne-Mieke, Bram and Karel for the
many hours that daddy was ‘typing’ again, and
to my parents for their unfailing support
The doctoral committee:
Chairman:
Prof.dr. H. Wallinga, Universiteit Twente
Secretary:
Prof.dr. H. Wallinga, Universiteit Twente
Promotors:
Prof.dr. J.F. Verweij, Universiteit Twente
Prof.dr.ir. F.G. Kuper, Universiteit Twente / Philips Semiconductors
Members:
Prof.dr.ir. A.C. Brombacher, Technische Universiteit Eindhoven
Prof.dr. H.E. Maes, Universiteit Leuven / IMEC, Belgium
Prof.dr.ir. A.J. Mouthaan, Universiteit Twente
Prof.dr.ir. B. Nauta, Universiteit Twente
Prof.dr. P.H. Woerlee, Universiteit Twente / Philips Research
Contents
1.
INTRODUCTION
1.1 Introduction
1.2 Integrated circuit technology and reliability trends
1.3 System for building-in and improvement of product reliability
1.4 References
2.
RELIABILITY ISSUES IN HIGH VOLTAGE BIPOLAR-CMOS-DMOS
INTEGRATED CIRCUITS
2.1 Introduction
2.2 Threshold voltage instabilities of HV DMOS transistors
2.3 Parasitic leakage currents induced by ‘charge-creep’
2.3.1 Failure mechanism
2.3.2 Surface potential modelling by a lumped element RC-network
2.3.3 'Charge-creep' characterisation using test structures
2.3.4 Comparison of experimental data and model predictions
2.3.5 Design rules
2.4 Conclusions
2.5 References
3.
RELATION BETWEEN THE HOT CARRIER LIFETIME OF TRANSISTORS AND
CMOS SRAM PRODUCTS
3.1 Introduction
3.2 Experimental
3.3 Transistor and SRAM parameter degradation
3.4 Analysis and discussion of the SRAM parameter degradation
3.5 Relation between the transistor and SRAM hot carrier lifetime
3.6 Summary and conclusions
3.7 References
4.
SYSTEMATIC DERIVATION OF LATCHUP DESIGN RULES FOR SUBMICRON
CMOS PROCESSES FROM TEST STRUCTURES
4.1 Introduction
4.2 Latchup susceptibility reduction options
4.3 Design rule derivation approach
4.4 Application: design rule derivation for a CMOS process on p-/p++
epitaxial substrates
4.4.1 Impact P+ substrate contact placement and P+ guardrings
4.4.2 Impact Nwell contact placement and N+/Nwell guardrings
4.4.3 Process specific design rules
4.5 Conclusions
4.6 References
5.
SHORT LOOP MONITORING OF METAL STEPCOVERAGE BY SIMPLE
ELECTRICAL MEASUREMENTS
5.1 Introduction
5.2 Electrical assessment of metal stepcoverage
5.3 Design rule verification for (non-)planarized bipolar processes
5.4 Effect of metal stepcoverage on electromigration lifetime
5.5 Design rule verification for a non-planarized BiCMOS process
5.6 Process split evaluation and shortloop equipment monitoring
5.7 Metal stepcoverage wafermaps
5.8 Summary and conclusions
5.9 References
6.
RELATION BETWEEN YIELD AND RELIABILITY OF INTEGRATED CIRCUITS
AND APPLICATION TO FAILURE RATE ASSESSMENT AND REDUCTION IN THE
ONE DIGIT FIT AND PPM RELIABILITY ERA
6.1 Introduction
6.2 Yield as a reliability indicator
6.3 Experimental results
6.3.1 Relation between yield and line fall-off
6.3.2 Relation between line fall-off and field returns
6.3.3 Relation between yield and burn-in reject rate
6.3.4 Relation between burn-in and High Temperature Operating Life
(HTOL) failure rate
6.4 Failure rate prediction and assessment
6.5 Options for failure rate reduction
6.5.1 Yield improvement
6.5.2 Elimination of special causes (‘maverick’ batches)
6.5.3 Screening of weak parts with latent defects during product test
6.6 Conclusions
6.7 References
7.
IMPACT OF SCREENING OF LATENT DEFECTS AT ELECTRICAL TEST ON
THE YIELD-RELIABILITY RELATION AND APPLICATION TO BURN-IN
ELIMINATION
7.1 Introduction
7.2 Impact of screening latent defects at e-sort on product reliability
7.2.1 Yield-reliability relation
7.2.2 Failure rate reduction options
7.2.3 Impact of latent defect screens at e-sort on yield-reliability
relation
7.3 Model predicting burn-in failure rate from batch yield
7.3.1 Experiment and failure rate evolution model
7.3.2 Validation of the model
7.3.3 Process dependence of the model constants
7.3.4 Burn-in failure rate prediction
7.4 Application of model to burn-in elimination
7.4.1 Impact of screens
7.4.2 Verification of the model
7.5 Conclusions
7.6 References
SUMMARY
SAMENVATTING
LIST OF PUBLICATIONS
DANKWOORD
LEVENSLOOP
BIOGRAPHY
1
Introduction
1.1 Introduction
1.2 Integrated circuit technology and reliability trends
1.3 System for building-in and improvement of product reliability
1.4 References
1.1 INTRODUCTION
Over the past 30 years the reliability of semiconductor products has increased
by an astonishing factor of over 10 million despite the unprecedented progress of
technology in this period. This thesis deals with the methodology and techniques
that have been developed in process development, product development and high
volume manufacturing firstly to realise this huge improvement and secondly to
continue this improvement rate in the next millennium. This first chapter briefly
describes the trends in semiconductor technology and product reliability and also
introduces the system for the building-in and improvement of integrated circuit
reliability. The other chapters focus on the new methods and techniques that have
been developed to enable a further advancement of the system.
In process development the adoption of highly accelerated stress techniques
(preferably on wafer level) has become crucial as this gives the opportunity to simulate 10 years of product lifetime within a few hours or days, in line with
today’s development cycle times. In chapter 2 these Wafer Level Reliability
(WLR) techniques are applied during the development of a high voltage Bipolar-CMOS-DMOS (BCD) technology to evaluate the product lifetime due to a sodium
ingression wear-out failure mechanism and to select the best lifetime
improvement candidate from various process modification options. Using similar
stress methods, a new model is also developed that quantitatively describes
transistor instabilities induced by surface charges. These charges originate from
high voltage circuitry on the chip and constitute a dominant wear-out failure
mode in high voltage products. It is shown how the model can be used to derive
design rules that eliminate the effects of the surface charges and thus ensure
reliable high voltage products.
The highly accelerated stresses are often carried out on dedicated test
structures designed in such a way that they are ’susceptible’ to primarily only the
failure mechanism of interest. This introduces the problem of how to convert test
structure lifetime data to actual product lifetimes. As reliability margins are
vanishing rapidly in modern semiconductor technologies this is of great interest to
the industry. In chapter 3 this has been explored for the case of hot carrier
degradation. Large lifetime differences can occur in this case between test
structures and products due to duty cycle effects, differences between AC and DC
degradation and the varying sensitivity of the electrical parameters of a product to
the degradation of one or more of its components. It is shown that lifetimes of
products in dynamic operation can easily exceed the lifetimes of corresponding
transistors in static operation by a factor of 100. This finding is now commonly
applied during product design, enabling increases in the maximum
operating frequency of state-of-the-art microprocessors and Systems-On-A-Chip
as well as more aggressive scaling of process technologies without jeopardising
the product reliability.
For building-in reliability during product development, the availability of
reliability related design rules is mandatory. One aspect of this is the set of design rules
that ensure that the product is robust against voltage spikes on its external pins so
that it does not ‘latch up’ and burn out. In chapter 4 a consistent approach is
demonstrated that allows the derivation of latchup design rules from simple test
structures and that is applicable to any CMOS technology. This allows first-time-right design of products and it changes the perception of latchup prevention from being
an ‘art’ to being an ‘engineering science’.
In high volume manufacturing the prevention and detection of process excursions that might deteriorate product reliability and yield is of the utmost importance. Therefore very sophisticated in-line and end-of-line control systems have
been implemented in manufacturing flows where all critical equipment and
process parameters that may influence product performance or reliability are
regularly measured and kept under Statistical Process Control. One such critical
process parameter is the metal stepcoverage of a metallisation system as it may
have a dramatic effect on electromigration related reliability of a product. In
chapter 5 a new method is described that enables monitoring of the metal stepcoverage of a metallisation system by simple electrical measurements. It can be used
for process optimization, design rule derivation and stepcoverage monitoring. In
the latter case electrical test structures, called Process Control Modules (PCMs),
are placed on a number of positions on each wafer to identify any material that
might contain a reliability hazard.
As a result of the ‘building-in’ reliability approach in process and product
development, wear-out failure modes do not occur anymore in today’s products.
Instead product failures are dominated by early failures caused by manufacturing
defects. Product failure rates however have become so low that conventional life
testing techniques are not capable anymore of providing enough statistically significant data (at reasonable cost) to guide the improvement actions in the high
volume manufacturing lines. Therefore a paradigm shift is needed. In chapter 6 it
is for the first time quantitatively shown, based on data of over 50 million
products, that there exists a clear correlation between the yield of a product and its
reliability in the field. Thus the yield is a primary reliability indicator. It can be
used to screen out material that does not fit into the normal yield distribution of a
product, thus preventing products with a larger failure probability
(‘Maverick’ lots) from being shipped to customers. A quantitative model is also developed
and validated that allows the reliability in the field to be predicted from yield data. In
this way yield scrap limits can be set on the basis of engineering arguments instead of qualitative reasoning as in the past, enabling a much better
trade-off between cost and benefit of scrapping deviating material.
Finally, chapter 7 deals with the failure rate evolution versus time, failure rate
prediction and reduction of the failure rate by various screening techniques. Experimental data show that the failure rate curve indeed has a bathtub shape (see section 1.2). Its evolution is described by a new model that shows quantitatively what the impact of various ‘burn-in’ options is on the product failure rate.
Furthermore it is shown for the first time what the quantitative effect on failure
rate is of alternative screening techniques that can be implemented in the electrical-sort (‘E-sort’) test program at wafer level, such as voltage screens and quiescent
current (‘IddQ’) tests. It appears that these techniques are a good alternative to
burn-in and can reduce failure rates by about a factor of 2. This finding opens the
way for significant efficiency improvements and cost reductions in high volume
semiconductor manufacturing and consequently the screening techniques are
rapidly becoming standard industry practice.
1.2 INTEGRATED CIRCUIT TECHNOLOGY AND RELIABILITY
TRENDS
The reliability of semiconductor products as a function of time is commonly
described by a bathtub curve [1,2,49,54-56]. This is because the plot of the product failure rate as a function of time has the shape of a cross sectioned bathtub as
shown in fig. 1. Three failure regimes can be distinguished in the bathtub curve.
In the ‘infant mortality’ or ‘early failure’ period, the products show a high but
decreasing failure rate as a function of time until the failure rate stabilises. The
subsequent period of roughly constant failure rate is referred to as the ‘random failure’ period. Finally, in the ‘wear-out’ period, the failure rate increases again when end-of-life of the products is reached.
[Figure 1: failure rate versus time, showing the early failure period (manufacturing defects), the random failure period (electrical overstress events and defect tail) and the wear-out period (intrinsic degradation mechanisms).]
Fig. 1: Failure rate as a function of time: the bathtub curve.
The nature of the failures in the three periods is generally very different, see
table 1. The majority of the failures in the ‘early failure’ period are caused by manufacturing defects such as particles, near opens and shorts in metal lines, weak
spots in isolating dielectrics or poorly bonded bondwires in the package. In the
‘random failure’ period many different root causes occur, but failures related to
specific events like lightning, load dump spikes occurring during disconnection of
car batteries or other overstress situations are most notable. Failures in the ‘wear-out’ period are related to intrinsic properties of the materials and devices used in
the product in combination with the product use conditions like temperature, voltage and currents including their time dependence. Examples of wear-out failure
mechanisms are electromigration, (gate) oxide breakdown, hot carrier degradation, mobile ion contamination and dry corrosion of bondballs [1,3-5,31-32], see
also section 1.3. Reliability engineering deals on the one hand with systematically reducing the infant mortality and random failures and on the other hand with keeping the
wear-out phase beyond the useful life of the product.
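The three regimes described above can be illustrated numerically. The following is a minimal sketch, assuming that the early failure and wear-out contributions can each be represented by a Weibull hazard rate and the random failures by a constant rate. The Weibull form and every parameter value are illustrative assumptions, not data or models from this thesis.

```python
def weibull_hazard(t, shape, scale):
    """Weibull hazard rate h(t) = (shape/scale) * (t/scale)**(shape - 1)."""
    return (shape / scale) * (t / scale) ** (shape - 1)


def bathtub_hazard(t_hours):
    """Sum of three hypothetical failure-rate contributions (per device hour)."""
    # shape < 1: decreasing rate, stands in for manufacturing defects
    early = weibull_hazard(t_hours, shape=0.5, scale=1.0e7)
    # constant rate, stands in for overstress events (assumed value)
    random_part = 1.0e-8
    # shape > 1: increasing rate, stands in for intrinsic wear-out
    wearout = weibull_hazard(t_hours, shape=4.0, scale=3.0e5)
    return early + random_part + wearout


times = [10.0, 1.0e2, 1.0e3, 1.0e4, 1.0e5, 5.0e5]
rates = [bathtub_hazard(t) for t in times]

for t, r in zip(times, rates):
    print(f"t = {t:9.0f} h   failure rate = {r:.3e} per hour")
```

With these assumed parameters the printed failure rate first falls with time and then rises again at the longest times, which is the qualitative bathtub behaviour described in the text.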
Early Failure Period: particles; gate oxide defects; near-opens / near-shorts; pinholes in isolating dielectrics; scratches; loose bondwires; popcorn damage
Random Failure Period: latch-up; latent ESD damage; Safe-Operating-Area (SOAR) violations; load-dump car battery; electrical overstress; extended early failures
Wear-Out Period: electro-migration; (gate) oxide breakdown; hot carrier degradation; mobile ion contamination; transistor instabilities; stress voiding; thermo-migration; surface charges; corrosion; pattern shift; bondwire fracture; ‘dry’ bondball corrosion
Table 1: Characteristic failure modes in the three regimes of the bathtub curve.
Today's state-of-the-art products like microprocessors or Systems-On-a-Chip
(SOC) contain tens of millions of transistors, a factor of 10^5 more than in the early seventies as shown in fig. 2. This has been realised by a simultaneous reduction in
minimum feature size and increase of die area, see fig. 3. At the same time
package technology has also evolved from simple Dual-In-Line (DIL) packages to
complex high pincount Chip Scale packages, see figs. 4 and 5.
The remarkable thing about semiconductors is that despite this dramatic increase in complexity of processes, products and packages, the product failure rate has simultaneously decreased by more than two orders of magnitude as witnessed
by fig. 6. Here it must be noted that the failure rate of ICs is usually expressed in
FIT (Failures In Time). A FIT is one failure per 1 billion (10^9) device hours under
normal operating conditions. As the failure rate of ICs decreases in time, failure
rates are usually determined after 48 or 168 hours as well as after 1000 hours of accelerated testing. From the 48 or 168 hour results an ‘Early Failure Rate’ (EFR)
can be determined and from the 1000 hour results an ‘Intrinsic Failure Rate’
(IFR). Today, Early Failure Rate requirements by customers are below 10 FIT,
corresponding to a maximum of one failure during 100 million operating hours.
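The FIT arithmetic above can be checked with a few lines of code. This is a minimal sketch: the function names are hypothetical, no acceleration factor between stress and use conditions is applied, and the zero-failure bound assumes a constant failure rate (exponential lifetime distribution), which is an assumption of this example rather than a statement from the text.

```python
import math

DEVICE_HOURS_PER_FIT = 1.0e9  # one FIT = one failure per 1e9 device hours


def fit_rate(failures, devices, hours_per_device):
    """Point estimate of the failure rate in FIT."""
    device_hours = devices * hours_per_device
    return failures / device_hours * DEVICE_HOURS_PER_FIT


def device_hours_per_failure(fit):
    """Average number of device hours per failure at a given FIT level."""
    return DEVICE_HOURS_PER_FIT / fit


def zero_failure_upper_bound_fit(devices, hours_per_device, confidence=0.60):
    """Upper bound on the failure rate (in FIT) when no failures are seen.

    Assumes a constant failure rate: exp(-rate * T) = 1 - confidence.
    """
    device_hours = devices * hours_per_device
    return -math.log(1.0 - confidence) / device_hours * DEVICE_HOURS_PER_FIT


# 10 FIT corresponds to one failure per 100 million device hours.
print(device_hours_per_failure(10.0))

# Hypothetical life test: 1000 devices for 1000 hours, one reject.
print(fit_rate(1, 1000, 1000.0))

# The same test without rejects still only bounds the rate at ~916 FIT
# (60% confidence), which shows why low FIT levels are hard to verify.
print(zero_failure_upper_bound_fit(1000, 1000.0))
```

The last line illustrates the statistical problem mentioned in section 1.1: a life test of practical size cannot by itself demonstrate a failure rate of a few FIT.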
[Figure 2: number of transistors per chip (1E+03 to 1E+10) versus year (1965 to 2005) for memories (1k to 1G) and microprocessors (4004, 8080, 8086 up to Pentium, Pentium Pro, K6 and K6-3).]
Fig. 2: Trend in chip complexity [53].
[Figure 3: minimum feature size [um] and die size [mm2] versus year (1965 to 2010).]
Fig. 3: Trend in minimum feature size and die sizes of DRAMs [5,11].
Fig. 4: Package integration trend [58]. QFP = Quad Flat Pack, TAB = Tape
Automated Bonding, COB = Chip On Board, CSP = Chip Scale Package.
Fig. 5: Microprocessor pin count trend [58].
In order to be able to achieve this 7-decade reliability improvement, the semiconductor manufacturers have implemented a very refined product reliability assurance system. Key elements are building-in reliability during process and product
development, thorough product and process qualification procedures, in-line control of product reliability items in the waferfab and assembly plant, ‘Maverick’ lot
control (a ‘Maverick’ lot shows an exceptionally high failure rate), reliability monitoring via analysis of lifetest rejects and customer returns, see fig. 7, and a well
functioning continuous improvement process focussed on corrective actions on
any deviation observed. It is also clear that along with the increase in product
complexity and decrease of allowed failure rate, this system and the methods used
need to be continuously updated. The next section deals in more detail with the
product reliability assurance ‘chain’ and indicates how the work in this thesis
contributes to the system.
Fig. 6: Trend in early- and intrinsic failure rate (FIT) targets for application in
consumer products [3, 47].
1.3 SYSTEM FOR BUILDING-IN AND IMPROVEMENT OF PRODUCT
RELIABILITY
1.3.1 Introduction
In fig. 7 the system that is widely implemented in semiconductor manufacturing to build in and improve product reliability has been schematically depicted. It
is described in more detail below, and it is indicated how
the work in this thesis contributes to the advancement of the system.
1.3.2 Materials and process module research
In the research phase work is oriented to the choice of the proper materials and
development of revolutionary process steps that will finally be able to meet the future product requirements. These choices typically have their primary impact on the
wear-out failure mechanisms listed in table 1. Present examples are the work
on copper metallisation (see fig. 8), dual-damascene processes, low-k dielectrics,
alternative gate dielectrics and sub-0.15 µm lithography [3-8] and, in the field of
packaging, novel types of moulding plastics offering improved moisture resistance
and robustness against soldering treatments [12-14].
[Figure 7: flow chart of the reliability assurance chain: materials and process module research; process development and package development; product development; process qualification and package qualification; product qualification; reliability control of processes in waferfab and assembly plant; screening of defects; ‘Maverick lot’ detection; reliability monitoring and customer feedback. Continuous improvement and feedback loops connect the stages, and labels Ch. 2 to Ch. 7 mark the stages addressed by the chapters of this thesis.]
Fig. 7: The product reliability assurance and improvement system.
[Figure 8: Arrhenius plot of MTTF [sec] (10 to 100000) versus 1/T [K-1] (1.6E-03 to 2.0E-03) for AlCu (Ea = 0.62 eV) and Cu (Ea = 0.97 eV).]
Fig. 8: Electromigration lifetime versus temperature of a full Cu-metallisation
versus that of an AlCu-based metallisation [8].
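The activation energies quoted in fig. 8 determine how strongly the electromigration lifetime depends on temperature. The sketch below assumes a plain Arrhenius temperature dependence, MTTF proportional to exp(Ea/kT), and computes the acceleration factor between a stress and a use temperature. The two temperatures (250°C stress, 70°C use) and the function name are assumptions chosen for illustration; only the Ea values come from fig. 8.

```python
import math

BOLTZMANN_EV_PER_K = 8.617e-5  # Boltzmann constant in eV/K


def arrhenius_acceleration(ea_ev, t_use_c, t_stress_c):
    """Ratio MTTF(use) / MTTF(stress) for an Arrhenius-type mechanism."""
    t_use_k = t_use_c + 273.15
    t_stress_k = t_stress_c + 273.15
    exponent = ea_ev / BOLTZMANN_EV_PER_K * (1.0 / t_use_k - 1.0 / t_stress_k)
    return math.exp(exponent)


# Activation energies as quoted in fig. 8; temperatures are assumed.
af_alcu = arrhenius_acceleration(0.62, t_use_c=70.0, t_stress_c=250.0)
af_cu = arrhenius_acceleration(0.97, t_use_c=70.0, t_stress_c=250.0)

print(f"AlCu: lifetime at 70 C is about {af_alcu:.0f} x the 250 C lifetime")
print(f"Cu:   lifetime at 70 C is about {af_cu:.0f} x the 250 C lifetime")
```

A higher activation energy gives a larger gain when extrapolating from stress to use temperature, which is one reason why the two lines in fig. 8 diverge towards lower temperatures.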
1.3.3 Process development
In the process development phase, the game is to combine (known) materials
and new (evolutionary) process steps in such a way that the process and product
requirements are met in the shortest possible time at the lowest cost. Emphasis in
this thesis is firstly on process reliability investigations and secondly on the derivation of reliability related design rules for products. By means of extensive highly
accelerated Wafer Level Reliability (WLR) techniques [15,16] and the use of
appropriate lifetime extrapolation models, the lifetimes related to each of the
wear-out failure mechanisms can be established. If these lifetimes are insufficient, process modifications are required, as well as verification of the expected improvement by renewed
WLR-investigations. In chapter 2 this approach is demonstrated for the case of a
high voltage Bipolar-CMOS-DMOS (BCD) technology.
Based on the WLR-data and extrapolation models, design rules are also derived
intended to eliminate wear-out effects and thus ensure reliable operation of the
products during their useful life. In this context it is interesting to note that the
useful life period can vary from 7000 hrs for automotive products via 30000 hrs
for consumer products to 250000 hrs for some telecommunication devices. Design
rules can be optimised accordingly. Examples of these design rules are the maximum allowable current through a metal line versus the line width to prevent electromigration failures, the maximum allowable voltage on a MOS transistor versus
the poly gate length to prevent hot carrier degradation effects and metal-stress design rules to prevent passivation cracking and pattern-shift for the package case.
In high voltage products transistor instabilities induced by surface charges are a
dominant wear-out failure mode. In chapter 2 for the first time the interaction between these instabilities and the surface charges is determined quantitatively and
it is furthermore also discussed how the effects of the surface charges in high voltage products can be eliminated by means of proper design rules.
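As an illustration of the first design rule mentioned above (maximum allowable current through a metal line versus its width), the following sketch checks a list of lines against a current density limit. The limit of 1 mA/µm², the metal thickness, the net names and the currents are all hypothetical values, and a real rule set would normally also depend on temperature and on the required useful life.

```python
def max_line_current_ma(width_um, thickness_um, j_max_ma_per_um2):
    """Largest allowed current for a line of the given cross-section."""
    return j_max_ma_per_um2 * width_um * thickness_um


def electromigration_violations(lines, thickness_um, j_max_ma_per_um2):
    """Return the names of lines whose current exceeds the allowed value.

    Each entry in `lines` is (name, width in um, current in mA).
    """
    violations = []
    for name, width_um, current_ma in lines:
        limit = max_line_current_ma(width_um, thickness_um, j_max_ma_per_um2)
        if current_ma > limit:
            violations.append(name)
    return violations


# Hypothetical nets: (name, line width [um], current [mA])
nets = [
    ("vdd_rail", 2.0, 1.5),
    ("clk", 0.5, 0.2),
    ("out_drv", 1.0, 0.9),
]

bad = electromigration_violations(nets, thickness_um=0.8, j_max_ma_per_um2=1.0)
print(bad)  # lines that would need widening under the assumed limit
```

Under these assumed numbers only the line carrying 0.9 mA through a 1.0 µm wide track exceeds its limit, so the design fix would be to widen that line.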
The lifetime estimates for the various wear-out failure mechanisms are generally extracted from static stress WLR-experiments on test structures. For advanced processes the safety margins have completely disappeared. Current DC hot
carrier lifetimes of state-of-the-art 0.35 µm and 0.25 µm processes are for example typically
less than 2 months [17]. This means that it becomes important to establish the relation between the lifetimes as measured (typically by means of static stresses) on
test structures and those of real products. Large differences can occur due to duty
cycle effects, differences between AC and DC degradation and the varying sensitivity of the electrical parameters of a product to the degradation of one or more of
its components. Chapter 3 deals with this issue for the case of hot carrier degradation [18]. It is shown that product lifetimes can easily be a factor of 100 larger than
the corresponding DC transistor lifetime. This finding is now commonly applied
during product design, enabling increases in the maximum operating frequency of state-of-the-art microprocessors and Systems-On-A-Chip as well as more aggressive scaling of process technologies without jeopardising the product reliability.
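The duty cycle effect alone can already account for a large part of such a difference. The sketch below uses a simple quasi-static assumption: damage accumulates only during the fraction of the clock period in which the transistor is under hot carrier stress, at the same rate as in a DC stress. This simplification, the function name and the 1% stress fraction are assumptions for illustration; they are not the analysis of chapter 3, which also covers AC/DC differences and parameter sensitivity.

```python
def quasi_static_lifetime(dc_lifetime, stress_duty_factor):
    """Lifetime in dynamic operation under a quasi-static assumption.

    dc_lifetime        -- lifetime measured under continuous (DC) stress
    stress_duty_factor -- fraction of time spent under stress, 0 < d <= 1
    """
    if not 0.0 < stress_duty_factor <= 1.0:
        raise ValueError("stress_duty_factor must be in (0, 1]")
    return dc_lifetime / stress_duty_factor


# DC lifetime of 2 months (the order of magnitude quoted in the text) and
# an assumed stress fraction of 1% of the clock period.
dc_months = 2.0
ac_months = quasi_static_lifetime(dc_months, stress_duty_factor=0.01)
print(f"{ac_months:.0f} months, i.e. about {ac_months / 12.0:.1f} years")
```

With a stress fraction of 1% the estimated lifetime is a factor of 100 above the DC value, the same order of magnitude as the difference reported in the text.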
During process development design rules are also derived and devices are designed to make the products robust against electrical overstress events like Electro-Static Discharge (ESD) [19] and latch-up [20], which can result in randomly occurring failures. Quite often the derivation of ESD and latch-up design rules is regarded as a kind of ‘black magic’. However, in chapter 4 a consistent approach is
demonstrated that allows latch-up design rules to be derived from simple test structures and that is applicable to any CMOS technology [21]. Remarkably, such an approach was not available up to now and it thus fills a gap in engineering science.
In literature some complementary studies on building-in ESD and latchup robustness
[22-25, 51] and improvement of Safe-Operating-Area (SOAR)-capability of
bipolar products [26] are available. Finally, based on package reliability studies,
‘metal stress’ rules are generated aimed at making the product robust against mechanical stress exerted by the package materials [12-13].
1.3.4 Process qualification
During process qualification, first a set of standard WLR-tests [11,12] is executed in order to prove that in the final process flow all wear-out failure mechanisms, see table 1, are sufficiently covered by the design rules or by the process architecture itself (as, for example, in the case of mobile ions). An example of such a program
is given in fig. 9. Second, a number of package reliability tests such as Temperature
Cycling (TMCL) and Highly Accelerated Steam Tests (HAST) are executed to
show that the intrinsic properties of the passivation on top of the die meet the requirements related to mechanical strength and moisture permeability. Third, ESD
and latch-up tests are executed to determine whether the ESD-protection and latch-up prevention design rules are appropriate.
Table 9
Test Methods for Wafer Level Reliability
Test                           Method        Acceptance Criteria
Electromigration               SNW-FQ-101A   Reference note 1
Gate Oxide Breakdown           SNW-FQ-101B   Defect density < 10 ‘killing’ defects/cm2, 60% Confidence Level (note 2)
Hot Carrier                    SNW-FQ-101C   10 year life (analog), 0.5 year life (digital) (note 2), 60% Confidence Level
Mobile Ion                     SNW-FQ-101D   BTS, TVS (optional)
Metallization Stress Voiding   SNW-FQ-101E   < 10 class B defects per cm line, < 10 class C defects per cm line, 60% Confidence Level
c/n: 0/5, 0/5
Notes:
1. t (0.1%) = 10 years at 70 °C, 60% Confidence Level
2. If the requirement is not met business lines shall be informed and
implementation of appropriate screening procedures should be
considered by the BL.
3. If a process does not meet the 0.5 year life time requirement, then
10 years life time at use conditions must be demonstrated at the
circuit level for products manufactured with that process.
Table 10
Construction Analysis Test Methods for Wafers
Test Description            Abrv.   Method           c/n
External Wafer Inspection   EXWI    Local Document   0/5
Wafer Bow                   WABO                     3 x 0/5
Wafer Strength              WAST                     3 x 0/1
Crystal Strength            XTST                     3 x 0/10
Roughness                   ROGH                     0/5
Adhesion                    ADHE                     0/5
Fig. 9: Overview of typical Wafer Level Reliability qualification program including construction analysis [27,40].
1.3.5 Package development and qualification
In packaging the trend is towards larger die sizes (fig. 3), a larger pincount [9-14] requiring a finer pitch of the package leads (fig. 5), smaller bondpad sizes and
bondpad pitch on the circuit die (fig. 10), thinner packages (figs. 4 and 11) and an
improved resistance against soldering treatments during mounting of the package
on a Printed Circuit Board (PCB). For consumer and automotive applications, the
majority of the packages are still (derivatives of) the conventional Dual-In-Line
(DIL), Single-In-Line (SIL), Quad-Flat-Pack (QFP) and Small-Outline (SO) packages. For state-of-the-art devices like microprocessors, however, novel
packages like Ball-Grid-Arrays (BGA) and Multi-Chip-Modules (MCM) and
techniques such as Flip-Chip packaging, Chip Scale Packaging (CSP), Tape-Automated-Bonding (TAB) and Controlled Collapse Chip Connection (C4) have
also been introduced [9-10, 58], see fig. 12.
[Figure 10: number of pads/pins versus pad pitch [um] (50 to 150) for wirebond (aluminum, gold ball), TAB and array/C4 technologies; the evolutionary vector points towards higher performance, I/O and compaction.]
Fig. 10: Technology trend in packaging [13].
Fig. 11: Package profile comparison [58].
Fig. 12: Package size versus number of I/O’s for various package families [58].
In conventional packaging, emphasis is on package materials that lower the
mechanical stress on the die surface and on improved plastic moulding compounds. These new compounds firstly absorb less moisture, secondly contain less
contaminants that might induce bondpad corrosion and thirdly have an intrinsically better adhesion to the scratch protection of the die and the leadframe. This is
because it has been shown that the loss of adhesion between the moulding plastic
and the package materials during e.g. a Temperature Cycling test (TMCL) or a
soldering treatment such as the ‘popcorn’ test is the key factor degrading the package related reliability of the product [28-30]. Delaminated packages are more prone to
bondpad corrosion, package cracking, lifted bondballs and lifted wedgebonds and
passivation cracking or even ‘pattern shift’.
The sensitivity of a product to passivation cracking can be significantly reduced by applying proper ‘metal stress’ design rules [12-13] during the product development. Alternatives are the use of a mechanical stress resistant passivation
scheme, the use of a polyamide wafer coating or a silicone die-coating that acts as
a kind of stress relief layer [13].
The novel ‘anti-popcorn’ moulding plastics developed to reduce moisture uptake and improve ‘popcorn’ behaviour generally have a low glass-transition (Tg)
temperature of around 120-130°C. Unfortunately, this is below the commonly
used High Temperature Operating Lifetest (HTOL) stress temperature of 150°C
and also below the normal use junction temperature in some special applications
such as lighting and automotive. At these temperatures the Sb- and Br-flame retardant additives in the plastic are less stable and more mobile in these anti-popcorn
compounds. As a result, especially the Au-Al ball-bond ‘dry corrosion’ degradation mechanism [31,32] is strongly accelerated compared to the case where a normal low stress moulding compound is used. For some compounds ‘open circuit
failures’ are observed within a few thousand operating hours at 150°C, see fig. 13.
The problem can be somewhat alleviated by choosing proper bonding conditions
[50]. Nevertheless, the trend is towards moulding compound recipes with a lower
amount of flame retardant additives and to wirebond materials that are less susceptible to the ‘dry corrosion’ mechanism.
Fig. 13: SEM photograph of (a) a bondpad and (b) the bottom of a lifted bondball
showing Au-Al intermetallics due to the ‘dry corrosion’ degradation mechanism [52].
Package qualification is generally done on test chips with emphasis on the intrinsic properties of the package materials. In most cases however package reliability is also investigated as part of the product qualification program due to the potential interaction with the actual product and waferfab process.
1.3.6 Product development
During product development the designer firstly must make sure that the product adheres to all design rules derived during the process development phase. Especially the ESD and latchup robustness of the product and its ability to withstand
mechanical stress tests is largely determined by the specific design solutions chosen by the designer. Secondly, because of vanishing reliability margins, also reliability simulation techniques [33-36] are employed more and more during the design phase of products in state-of-the-art processes. In this way the actual impact
of wear-out failure mechanisms such as hot carrier degradation and electromigration on the circuit performance and thus circuit lifetime can be determined. Furthermore the reliability simulations also reveal the weak spots in the circuit. By
making appropriate design changes to these weak spots (e.g. longer channel
lengths of MOS transistors or wider metal lines), the product lifetime can be improved until the required lifetime target is met. This approach is the best guarantee that the optimum between circuit performance, die size and required lifetime is
achieved.
1.3.7 Product qualification
During product qualification the actual product is subjected to a set of standard accelerated stress (life) tests [27,37] aimed at finding any deficiencies in the combination of design, process, package and application, see fig. 14. First, these tests examine the endurance performance of the product, its robustness against overstress phenomena, its mechanical stress resistance and its ability to survive in a humid environment. All these tests relate to the intrinsic reliability of the product. If the previous work has been done properly, no rejects are observed. Second, the capability of the product with respect to assembly on a PCB is checked. Third, especially for automotive or military applications, the sensitivity of the product to defects is determined by subjecting large samples of products to short-duration stress tests, e.g. 24 hours of dynamic operation at 150°C (‘burn-in’). Failures observed during these tests are generally related to insufficient control over the manufacturing process by the semiconductor supplier.
1.3.8 Reliability control in the waferfab and assembly plant
In order to prevent process excursions that might deteriorate product reliability
and yield, the semiconductor wafer and package manufacturers have introduced
very sophisticated in-line control systems in their manufacturing process.
Table 14
Product Environmental & Electrical Tests for Leaded IC Packages
Fields per row: stress test; abbreviation; test condition; specification; requirement (RFS / Extended); c/n
High Temperature Operational Life (static or dynamic); HTOL; Tj = 150°C, biased; SNW-FQ-500; EFR: < 168 h, c/n 0/231 (notes 5, 6); IFR: 1000 / 2000 h, c/n 0/77 (note 6)
High Temperature Storage Life (note 1); HTSL; Ta = 150°C, unbiased; SNW-FQ-114; 1000 / 2000 h; 0/77
Latch-up; LAUP; 1.5 Vcc(max); SNW-FQ-303; ±100 / 200 mA
ESD Susceptibility (Human Body Model); ESDH; 1500 Ω / 100 pF; SNW-FQ-302A; 2 kV
ESD Susceptibility (Machine Model); ESDM; 0.75 µH / 200 pF; SNW-FQ-302B; 200 V; 0/3
SMD Preconditioning; PCON; for SMD devices; SNW-FQ-225A, JEDEC A113
Pressure Pot (note 2); PPOT; preconditioning (SMD’s), 121°C / 100% RH, unbiased; SNW-FQ-225A, SNW-FQ-C102; 96 / 192 h; 0/77
Unsaturated Pressure Pot (note 2); UPOT; preconditioning (SMD’s), 130°C / 85% RH, unbiased; SNW-FQ-225A, SNW-FQ-D102; 96 / 192 h; 0/77
Temperature Humidity Bias (note 2); THBS; preconditioning (SMD’s), 85°C / 85% RH, biased; SNW-FQ-225A, SNW-FQ-A102; 1000 / 2000 h; 0/45
High Acceleration Stress Test (note 2); HAST; preconditioning (SMD’s), 130°C / 85% RH, biased; SNW-FQ-225A, SNW-FQ-D102; 96 / 192 h; 0/45
Temperature Cycling (note 7); TMCL; preconditioning (SMD’s), -65°C to 150°C (air to air); SNW-FQ-225A, SNW-FQ-112; 200 / 500 cycles; 0/77
Thermal fatigue (power devices only); TFAT; power on/off @ Tj max; SNW-FQ-532; 10,000 cycles; 0/45
Data Retention (note 3); DRET; Ta = 150°C; SNW-FQ-541; 1000 / 2000 h; 0/45
Erase/Write Cycling (note 3); ERWR; note 4; SNW-FQ-540; 1.0x spec cycles; 0/45
1. HTSL test unnecessary unless HTOL is conducted at T < 150 °C.
2. Either PPOT or UPOT and THBS or HAST required for Process or Package
qualification. TTPP (Temperature Treatment Pressure Pot) test may be performed in
place of PPOT for Through Hole Mounted Devices.
3. Additional stress tests for non-volatile products
4. Maximum endurance / operating temperature, according to Product Specification.
5. Minimum sample size to commence qualification. Wafer Fab Process Changes must
demonstrate a capability to equal or exceed a 500 FPM performance level within 1 year
of qualification completion at a 60% confidence level.
6. Complex, high-pin-count packages may necessitate sample size reduction.
7. An acceptable alternative TMCL condition approximating 500 cycles of -65 °C to
+150°C is 1000 cycles of -55 °C to +125 °C
8. Stress duration dependent on Fab Process
Fig. 14: Overview of a typical product reliability qualification program [27].
Generally, critical parameters that may influence product performance or
reliability are defined for all equipment in the fab and measurement frequencies
are determined based on statistical techniques. Examples of circuit performance
related parameters are sheet resistances, layer thickness and line widths while the
number of particles generated per wafer pass and mechanical stresses in layers are
related to circuit reliability. In case of assembly plants parameters like ball bond
shear-off force, plastic delamination and dimensions are measured. All parameters
are controlled using Statistical Process Control (SPC) techniques [38,39]. Using
SPC, any deviation of the process from its normal operation is identified and
results in ‘blocking’ of the equipment for production before the products are
negatively affected. By proper execution of Out-of-Control-Action-Plans
(OCAPs), the equipment can later again safely be released for production.
Apart from in-line control, end-of-line control techniques can also be used to identify any material that might contain a reliability hazard. For this purpose each wafer contains, at a number of positions, a large variety of electrical test structures called Process Control Modules (PCMs), suited to monitoring and controlling the performance and reliability of the complete final process. Typical devices on a PCM are transistors, ‘van der Pauw’-type resistors, capacitors, contact strings, zener diodes, metal lines etc. Often reliability modules are included as well. Most common are large capacitors to monitor gate oxide quality and metal meanders to monitor metal shorts. For older non-planarised processes, however, it is also important to monitor the metal stepcoverage, as this may have a dramatic effect on the electromigration-related reliability of the product. In general this is done by examining SEM cross sections of a worst-case step on a regular, in most cases weekly, basis. Chapter 5 deals with a new method that allows the metal stepcoverage to be monitored by simple electrical measurements [41]. This enables metal stepcoverage control on 100% of the produced wafers, see e.g. fig. 15.
In this way the probability that unreliable material slips through all screens and is shipped to a customer among thousands of good wafers is greatly reduced. The method can also easily be extended to measure metal stepcoverage in contact holes and vias electrically and is thus also relevant for sub-micron technologies with planarised backends.
A new development is the use of ‘Fast Wafer Level Reliability’ (Fast-WLR) techniques [39,40]. Here each wafer contains a series of test structures, each dedicated to a particular failure mechanism. The design of the structures and the associated stresses are such that very large acceleration factors are achieved and the failure thresholds are reached within 0.1 to 1 minute. The reliability data thus obtained are controlled by SPC techniques, so that any change in the reliability of the process from its standard level can be identified. In those cases it is, however, still necessary to verify the validity of the observed reliability change by executing standard, less accelerated, stress tests, as the very large acceleration factors may also induce degradation mechanisms that are not relevant at normal use conditions.
[Fig. 15 chart: resistance ratio (axis 1.00 to 1.30) versus date (Nov 1997 to Jan 1999), with LSL, target, USL, LCL and UCL lines]
Fig. 15: SPC control chart of a metal stepcoverage monitoring parameter showing
the resistance ratio between a metal2 line over metal1 and polysilicon
steps and a metal2 line over a flat surface (mean = 1.105 and sigma =
0.010). The chart contains data of about 700 wafer batches (35000 wafers) and reveals 4 out-of-control events and 1 out-of-specification event.
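As a minimal sketch of how such a chart could be evaluated, the Python fragment below derives control limits from the mean and sigma quoted in the caption of fig. 15 and labels individual readings. The 3-sigma multiplier, the specification limits and the readings are assumptions chosen for illustration; the source does not state how the LCL, UCL, LSL and USL in the chart were set.

```python
def control_limits(center, sigma, k=3.0):
    """Return (LCL, UCL) as center -/+ k * sigma (k = 3 is an assumption)."""
    return center - k * sigma, center + k * sigma


def classify(values, lcl, ucl, lsl, usl):
    """Label each reading 'ok', 'out-of-control' or 'out-of-spec'."""
    labels = []
    for v in values:
        if v < lsl or v > usl:
            labels.append("out-of-spec")
        elif v < lcl or v > ucl:
            labels.append("out-of-control")
        else:
            labels.append("ok")
    return labels


# Mean and sigma as quoted in the caption of fig. 15.
lcl, ucl = control_limits(1.105, 0.010)

# Hypothetical specification limits and batch readings, for illustration only.
LSL, USL = 1.00, 1.25
readings = [1.10, 1.11, 1.14, 1.09, 1.27]
labels = classify(readings, lcl, ucl, LSL, USL)

print(f"LCL = {lcl:.3f}, UCL = {ucl:.3f}")
for value, label in zip(readings, labels):
    print(f"{value:.2f}: {label}")
```

In a production setting an out-of-control label would trigger the blocking and OCAP steps described above, while an out-of-spec label concerns the material itself.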
1.3.9 Maverick lot detection
In case the design adheres to all (reliability related) design rules and is produced in a mature process in a waferfab with excellent process control, product reliability is dominated by defects occurring during the manufacturing process, such as particles, scratches, near-opens and near-shorts, see e.g. fig. 16. These same defects are generally also the origin of E-sort (Electrical Sort) yield loss: the larger defects result in zero hour product failure (and thus yield loss), while the smaller defects constitute latent defects that may fail during the operational life of the product.
In chapter 6 it is for the first time shown quantitatively, based on data of over
50 million devices, that there exists a clear correlation between the yield of a product, its burn-in fall-out [42,57] and its reliability in the field [42], provided the
yield loss is dominated by functional failures and not by parametric failures [43].
Thus the E-sort yield is a primary reliability indicator and can be used to screen
out material that does not fit into the normal yield distribution of a product. In this
way it is prevented that products with a larger failure probability (‘Maverick’ lots)
are shipped to customers. Note that based on the E-sort yield the reliability in the
field can be predicted quantitatively so that yield scrap limits can be set based on
engineering arguments instead of based on qualitative reasoning as in the past.
This allows a much better trade-off between cost and benefit of scrapping
deviating material.
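The idea of screening out lots that do not fit the normal yield distribution can be sketched as follows. The sketch assumes a simple robust outlier rule (median minus k scaled median absolute deviations) as the lower yield limit. The rule, the factor k and the lot yields are illustrative assumptions and are not the limit-setting procedure of chapter 6.

```python
from statistics import median


def maverick_lots(yields, k=3.0):
    """Return (indices of lots below the lower yield limit, the limit itself).

    The limit is median - k * 1.4826 * MAD, a robust stand-in for
    'mean - k * sigma' that is not distorted by the outliers themselves.
    """
    med = median(yields)
    mad = median([abs(y - med) for y in yields])
    limit = med - k * 1.4826 * mad
    flagged = [i for i, y in enumerate(yields) if y < limit]
    return flagged, limit


# Hypothetical E-sort yields of eight lots (fractions), for illustration only.
lot_yields = [0.91, 0.92, 0.90, 0.93, 0.91, 0.78, 0.92, 0.90]
flagged, limit = maverick_lots(lot_yields)

print(f"lower yield limit: {limit:.3f}")
print("lots put on hold:", flagged)
```

A robust statistic is used here so that a single very low yielding lot does not widen the limit that is meant to catch it.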
[Fig. 16 image labels: (a) particle; (b) FIB X-section showing aluminum particle, Metal 2, Si3N4, oxide, silicon]
Fig. 16: Photograph of a particle in a BiCMOS circuit (a) causing a failure due to an open metal1 line and a Focussed Ion Beam cross section of the particle (b) revealing that it is an aluminum particle.
A more sophisticated ‘Maverick lot’ detection method goes beyond the plain yield number and also uses the reject data from the individual tests in the E-sort test program (called the ‘BIN fingerprint’) to distinguish deviating material from material fitting within the normal distribution (‘Moving Limits’ technique). Deviating material is generally put on hold for more thorough analysis by product engineers, after which a decision about scrapping or shipping the material is made. The results of these analyses are used for continuous improvement of the test programs, product designs or waferfab processes. Fig. 17 shows the trend of the defect density reduction in a high-volume bipolar-BiCMOS waferfab resulting from this approach. A remarkably constant improvement rate of nearly 20% per year is observed over a period of 20 years. The impact of the continuous improvement feedback loop on the occurrence of ‘maverick’ lots is also demonstrated in chapter 6.
[Fig. 17 chart: defect density [cm-2] (logarithmic axis, 0.1 to 100) versus year (1973 to 1999), with the 2 inch, 3 inch, 4 inch and 5 inch wafer generations marked]
Fig. 17: Defect density reduction trend in a bipolar-BiCMOS waferfab
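The quoted improvement rate can be turned into a cumulative reduction with a short calculation. The sketch below assumes that ‘nearly 20% per year’ compounds annually at a constant rate of exactly 20%, which is a simplification of the trend in fig. 17.

```python
def remaining_fraction(rate_per_year, years):
    """Fraction of the initial defect density left after compounding
    a constant yearly reduction rate over the given number of years."""
    return (1.0 - rate_per_year) ** years


# Assumed: exactly 20% reduction per year, sustained for 20 years.
fraction_left = remaining_fraction(0.20, 20)
reduction_factor = 1.0 / fraction_left

print(f"fraction of initial defect density left: {fraction_left:.4f}")
print(f"cumulative reduction factor: {reduction_factor:.0f}x")
```

Under this assumption the defect density drops by almost two decades over the 20 year period.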
1.3.10 Screening of defects
In order to reduce the failure rates in the field, products have traditionally been subjected to burn-in [1,45,46]. During burn-in the product is operated for a longer time (6 to 168 hrs) at an elevated temperature (125°C to 150°C junction temperature) and often also at a higher than nominal supply voltage (e.g. 7 V instead of 5 V). In this way latent defects that would otherwise fail at the beginning of the early failure period of the bathtub curve can be screened out, resulting in a lower subsequent failure rate in the field for devices that survive the burn-in [45,46]. In chapter 7 it is shown, based on experimental data, that the failure rate evolution versus time indeed follows the bathtub curve, and the impact of various burn-in options on the product failure rate is shown quantitatively [47].
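Why burn-in lowers the subsequent field failure rate can be illustrated with a decreasing-hazard life distribution. The sketch below assumes a Weibull distribution with shape parameter below 1 for the early failure period; the distribution, its parameters and the equivalent burn-in time are hypothetical and are not the experimental results of chapter 7.

```python
import math


def weibull_reliability(t, eta, beta):
    """Survival probability at time t for a Weibull life distribution."""
    return math.exp(-((t / eta) ** beta))


def field_failure_fraction(t_field, t_burn_in, eta, beta):
    """Fraction of shipped devices failing within t_field hours, given that
    they survived a burn-in equivalent to t_burn_in hours of use."""
    r_start = weibull_reliability(t_burn_in, eta, beta)
    r_end = weibull_reliability(t_burn_in + t_field, eta, beta)
    return 1.0 - r_end / r_start


# Hypothetical parameters: beta < 1 gives a decreasing failure rate.
ETA_HOURS = 1e12
BETA = 0.3
ONE_YEAR = 8760.0
# Assumed: the burn-in is equivalent to 2400 hours at use conditions.
BURN_IN_EQUIVALENT = 2400.0

without_burn_in = field_failure_fraction(ONE_YEAR, 0.0, ETA_HOURS, BETA)
with_burn_in = field_failure_fraction(ONE_YEAR, BURN_IN_EQUIVALENT, ETA_HOURS, BETA)

print(f"first-year failures without burn-in: {without_burn_in * 1e6:.0f} ppm")
print(f"first-year failures with burn-in:    {with_burn_in * 1e6:.0f} ppm")
```

The benefit exists only because the hazard decreases with time; with a shape parameter of 1 or more the same calculation shows no gain from burn-in.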
The major drawback of burn-in is the cost of the whole procedure and the fact that, for high yielding products, the burn-in process itself might induce more (latent) failures, e.g. through handling damage, than are screened out by the procedure. Therefore many alternative screening techniques have been developed over the last decade that can be implemented in the electrical-sort (‘E-sort’) test program at wafer level, such as voltage screens, quiescent current (‘IddQ’) tests and parameter distribution oriented tests like ‘Moving Limits’ [46-48]. In general these tests significantly improve the test coverage compared to the case where only failures following the ‘Stuck-At’ fault model are detected. An example of the effectiveness of an IddQ test is shown in fig. 18.
[Fig. 18 chart: defect level [ppm] (0 to 125000) versus Stuck-At fault coverage [%] (0 to 100), for ‘Functional test only’ and ‘Functional test + IddQ’]
Fig. 18: Impact of IddQ testing on the PPM level of a product as a function of
‘Stuck-At' fault test coverage [48].
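The general shape of a defect level versus fault coverage curve can be illustrated with the Williams-Brown relation, DL = 1 - Y^(1-T), which links the defect level DL to the yield Y and the fault coverage T. This relation is not given in the source and does not model the additional IddQ screen; the yield value below is hypothetical.

```python
def defect_level_ppm(yield_fraction, fault_coverage):
    """Williams-Brown estimate of shipped defect level in ppm.

    yield_fraction and fault_coverage are both fractions between 0 and 1.
    """
    return (1.0 - yield_fraction ** (1.0 - fault_coverage)) * 1e6


# Hypothetical yield, for illustration only.
ASSUMED_YIELD = 0.875

for coverage_percent in (0, 50, 90, 99, 100):
    ppm = defect_level_ppm(ASSUMED_YIELD, coverage_percent / 100.0)
    print(f"coverage {coverage_percent:3d}%: {ppm:8.0f} ppm")
```

The relation shows why high Stuck-At coverage alone leaves a residual defect level that additional screens have to address.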
Two different kinds of screening tests exist. The first operates the products outside their normal operating window with the aim of forcing latent defects into ‘hard’ failures (e.g. forcing a weak spot in a gate oxide into a short by applying a high voltage). The second aims at screening out products that are functional and within specification limits but nevertheless show analog parameter values (e.g. a supply current or output voltage) that are outside the distribution of the remainder of the products. In chapter 7 it is shown that these techniques are a good alternative to burn-in and can reduce failure rates by about a factor of 2. This finding opens the way for significant efficiency improvements and cost reductions in high volume semiconductor manufacturing, and consequently the screening techniques are rapidly becoming standard industry practice.
1.3.11 Reliability monitoring and continuous improvement
The semiconductor supplier has the primary responsibility for failure rate monitoring. For this purpose the supplier executes extensive reliability monitoring programs in which, on a sample basis, a part of the production is subjected to reliability evaluations. However, today's failure rates are so low that excessive sample sizes (more than 100000 products) are needed to demonstrate the reliability targets required by the customers. For statistically relevant information about the largest reliability hazards in production, even millions of devices are needed. Apart from the prohibitive cost, the time needed to execute all these tests is so long (several months) that the lifetests have become hardly usable for continuous improvement purposes. The lifetests are, however, still suitable to detect and subsequently evaluate the delivery risk of potential ‘Maverick’ lots.
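The growth of the required sample size with falling failure levels can be made concrete with standard zero-failure binomial sampling. The sketch below assumes that a target is ‘demonstrated’ when zero rejects are found in n devices at a given confidence level; the 10 ppm target is a hypothetical example, while the 500 FPM at 60% confidence case is taken from note 5 of fig. 14.

```python
import math


def zero_failure_sample_size(p_target, confidence):
    """Smallest n for which 0 rejects in n devices demonstrates a fraction
    defective of at most p_target at the given confidence level."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_target))


def upper_bound_fraction(n, confidence):
    """Upper confidence bound on the fraction defective after 0 rejects in n."""
    return 1.0 - (1.0 - confidence) ** (1.0 / n)


# 500 failures per million at 60% confidence (note 5 of fig. 14).
n_500_fpm = zero_failure_sample_size(500e-6, 0.60)
# Hypothetical 10 ppm target at the same confidence level.
n_10_ppm = zero_failure_sample_size(10e-6, 0.60)
# What a 0/77 test result can demonstrate at an assumed 90% confidence.
bound_77 = upper_bound_fraction(77, 0.90)

print(f"500 FPM at 60% confidence: n = {n_500_fpm}")
print(f"10 ppm at 60% confidence:  n = {n_10_ppm}")
print(f"0/77 at 90% confidence:    below {bound_77 * 100:.1f}% defective")
```

The required sample size scales inversely with the target level, which is why ppm-level targets lead to sample sizes in the order of 100000 devices.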
Semiconductor suppliers generally agree on ‘PPM-cooperation’ programs with
their key-customers. In such a cooperation all devices failing during assembly and
testing of the Printed-Circuit-Board (PCB) at the customer (called ‘line fall-off’)
are sent back to the suppliers and the rootcause of the failure is determined. Fig.
19 shows as an example a pareto of the failures of a high volume BiCMOS TV
signal processing IC.
UNKNOWN 14%
GOOD 23%
ASSEMBLY 1%
DIE FAULT 20%
OVERSTRESS 30%
TEST COVERAGE 10%
ESD 2%
Fig. 19: Pareto of ‘line fall-off’ failure causes of a BiCMOS TV signal processing
product. Note that about a quarter of the returned devices appears to be
good due to mismatch between product specification and application or
due to poor repair procedures at the customer.
As millions of devices are shipped to the customer, the failure sample size is generally large enough to be of statistical significance. Consequently, these data are used extensively to define corrective actions and continuous improvement programs in the waferfabs. The major drawback, however, is that this feedback loop spans at least 3 months to half a year due to pipeline effects. The way out is the fact that these reliability failures correlate with the yield failures and have the same failure signature, as shown in chapter 6. Apart from in-line defect monitors, data from yield analysis form the fastest feedback loop possible in semiconductor manufacturing, with a feedback time of a few weeks. In conclusion, a strong focus on defect reduction and yield improvement in the waferfab is the best option for a continuous reliability improvement program.
1.4 REFERENCES
[1] E.A. Amerasekera, F.N. Najim, ‘Failure Mechanisms in Semiconductor Devices’, John Wiley & Sons, New York, (1997)
[2] D. Thompson, B. Wood, ‘Semiconductor defect reliability modeling’, Tutorial International Reliability Physics Symposium (IRPS), (1996)
[3] D.L. Crook, ‘Evolution of VLSI reliability engineering’, Proceedings IRPS,
pp. 2-11, (1990)
[4] B. El-Kareh, W.R. Tonti, ‘Chip reliability’, Tutorial IRPS, (1997)
[5] P. Chatterjee, W.R. Hunter, A. Amerasekera, S. Aur, C. Duvvury, P. Nicollian, L. Ning, P. Yang, ‘Trends for deep submicron VLSI and their implications for reliability’, Proceedings IRPS, pp. 1-11, (1995)
[6] J.W. McPherson, ‘Reliability/processing challenges for ULSI metallization’,
Tutorial IRPS, (1994)
[7] R.L. Hance, J.W. Miller, K. Erington, M.A. Chonko, ‘Mobile ion contamination in CMOS circuits’, Tutorial IRPS, (1995)
[8] H.S. Rathore, D. Nguyen, ‘Copper metallization for sub-micron technology’,
Tutorial IRPS, (1997)
[9] R. Master, ‘Flip chip and ball grid array packaging’, Tutorial IRPS, (1998)
[10] K. Puttlitz, P. Totta, ‘Flip-chip interconnections’, Tutorial IRPS, (1994)
[11] ‘National Technology Roadmap for Semiconductors, technology needs’, ed.
Semiconductor Industry Association (SIA), (1997)
[12] T.M. Moore, S.J. Kelsall, D.R. Edwards, ‘Improving plastic package reliability’, Tutorial IRPS, (1992)
[13] J.T. Cullen, T.M. Moore, S.V. Golwalker, ‘Package technology’, Tutorial
IRPS, (1996)
[14] R. Shook, T. Conrad, ‘Moisture/reflow sensitivity of plastic packaged surface
mount IC’s: theory, evaluation and avoidance’, Tutorial IRPS, (1995)
[15] D.A. Baglee, D.S. Gibson, ‘Wafer-Level Reliability implementation issues’,
Tutorial IRPS, (1990)
[16] D.G. Pierce, E.S. Snyder, ‘Wafer Level Reliability : pushing the envelope’,
Tutorial IRPS, (1997)
[17] R. Bellens, ‘Building-in reliability during library development: hot carrier
degradation is no longer a problem of technologists only!’, Microelectronics
& Reliability, pp. 1425-1428, (1997)
[18] J.A. van der Pol, J.J.M. Koomen, ‘Relation between the hot carrier lifetime
of transistors and CMOS SRAM products’, Proceedings IRPS, pp. 178-185,
(1990)
[19] E.A. Amerasekera, C. Duvvury, ‘ESD in silicon integrated circuits’, John
Wiley & Sons, New York, (1995)
[20] R. Troutman, ‘Latchup in CMOS technology’, Kluwer, Boston, (1986)
[21] J.A. van der Pol, P.B.M. Wolbert, ‘Systematic derivation of latch-up design
rules for submicron CMOS processes from test structures’, Microelectronics
& Reliability, pp. 1051-1056, (1998)
[22] E.A. Amerasekera, R. Chapman, ‘Technology design for high current and
ESD robustness in a deep submicron process’, IEEE Electronic Device Letters, pp. 383-385, (1994)
[23] E.A. Amerasekera, S.T. Selvam, R.A. Chapman, ‘Designing latchup robustness in a 0.35µm technology’, Proceedings IRPS, pp. 280-288, (1994)
[24] E.R. Ooms, J.A. van der Pol, ‘Occurrence and elimination of anomalous temperature dependence of latchup trigger currents in BICMOS processes’, Proceedings IRPS, pp. 138-143, (1999)
[25] C. Duvvury, C. Hu, G. Hills, ‘Integrated circuit damage due to electrical
stress’, Tutorial IRPS, (1994)
[26] B. Krabbenborg, J.A. van der Pol, ‘The influence of process variations on the robustness of an audio power IC’, Microelectronics & Reliability, pp. 1819-1822, (1996)
[27] ‘General Quality Specification for Integrated Circuits’, SNW-FQ-611,
Philips Semiconductors, (1998)
[28] K. van Doorselaer, K. de Zeeuw, ‘Relation between delamination and temperature cycling induced failures in plastic packaged devices’, IEEE Transactions on Components & Hybrids Manufacturing Technology, pp. 879-882,
(1990)
[29] T.M. Moore, S.J. Kelsall, ‘The impact of delamination on stress-induced and contamination-related failure in Surface Mount IC’s’, Proceedings IRPS, pp. 169-176, (1992)
[30] K. van Doorselaer, T.M. Moore, J.A. van der Pol, ‘Failure criteria for inspection using acoustic microscopy after moisture sensitivity testing of plastic
surface mount devices’, Proceedings International Symposium on Testing
and Failure Analysis (ISTFA), pp. 229-239, (1994)
[31] J.R. Devaney, P.H. Eisenberg, ‘Gold-Aluminum intermetallics, key parameters - reactions -effects & reliability impact - a review’, Tutorial IRPS, (1990)
[32] F.W. Ragay, J.A. van der Pol, J. Naderman, ‘In-situ monitoring of dry corrosion degradation of Au ballbonds to Al bondpads in plastic packages during
HTSL’, Microelectronics & Reliability, pp. 1931-1934, (1996)
[33] C. Hu, ‘AC effects in IC reliability’, Microelectronics & Reliability, pp.
1611-1617, (1996)
[34] M. Lunenborg, ‘MOSFET hot carrier degradation’, Thesis, University of
Twente, (1995)
[35] R. Bellens, ‘Hot carrier degradation in sub-micron CMOS technologies: problems and solutions’, Tutorial IRPS, (1998)
[36] S. Rochel, G. Steele, J.R. Lloyd, S.Z. Hussain, D. Overhauser, ‘Full chip reliability analysis’, Proceedings IRPS, pp. 356-362, (1998)
[37] ‘Stress Test Qualification for automotive-grade integrated circuits’, CDF-AEC-Q100, Automotive Electronics Council, (1994)
[38] D.J. Wheeler, D.S. Chambers, ‘Understanding Statistical Process Control’,
Statistical Process Control Inc., Knoxville, (1986)
[39] D.G. Pierce, E.S. Snyder, ‘Wafer level reliability: pushing the envelope’, Tutorial IRPS, (1997)
[40] J.S. May, H.H. Hoang, ‘Wafer level reliability control program at SGS-Thomson Microelectronics’, AEC Reliability Workshop, Indianapolis, October 21-24, (1995)
[41] J.A. van der Pol, E.R. Ooms, ‘Short loop monitoring of metal stepcoverage
by simple electrical measurements’, Proceedings IRPS, pp. 148-155, (1996)
[42] F. Kuper, J.A. van der Pol, E.R. Ooms, T. Johnson, R. Wijburg, W. Koster, D. Johnston, ‘Relation between yield and reliability of integrated circuits: experimental results and application to continuous early failure rate reduction programs’, Proceedings IRPS, pp. 17-21, (1996)
[43] J.A. van der Pol, F.G. Kuper, E.R. Ooms, ‘Relation between yield and reliability of integrated circuits and application to failure rate assessment and
reduction in the one digit FIT and PPM reliability era’, Microelectronics &
Reliability, pp. 1603-1610, (1996)
[44] C.G. Shirley, ‘A defect model of reliability’, Tutorial IRPS, (1995)
[45] R. Moazzami, C. Hu, ‘SiO2 TDDB testing and burn-in’, Tutorial IRPS,
(1992)
[46] A.J. Wagner, ‘Semiconductor defect reliability screening and modeling’, Tutorial IRPS, (1996)
[47] J.A. van der Pol, E.R. Ooms, A. van ‘t Hof, F. Kuper, ‘Impact of screening of
latent defects at electrical test on the yield-reliability relation and application
to burn-in elimination’, pp. 370-377, Proceedings IRPS, (1998)
[48] S. McEuen, T. Paquette, ‘IddQ testing and its application’, Tutorial IRPS,
(1995)
[49] J. Møltoft, 'Behind the 'bathtub'-curve, a new model and its consequences',
Microelectronics & Reliability, pp. 489-500, (1983)
[50] Z.N. Liang, F.G. Kuper, M.S. Chen, 'A concept to relate wire bonding parameters to bondability and ball bond reliability', Microelectronics & Reliability, pp. 1287-1292, (1998)
[51] J.A. van der Pol, J-P.F. Huijser, R.B.H. Basten, ‘New latchup mechanism in complementary bipolar power ICs triggered by backside die attach glue’, Microelectronics & Reliability, pp. , (1999)
[52] A.A. Gallo, ‘Effect of mold compound components on moisture-induced degradation of gold-aluminum bonds in epoxy encapsulated devices’, Proceedings IRPS, pp. 244-251, (1990)
[53] T. Claasen, ‘The logarithmic law of usefulness’, Semiconductor International, pp. 175-184, (1998)
[54] D.S. Peck, ‘Semiconductor reliability predictions from life distribution data’,
in ‘Semiconductor Reliability’, ed. Schwop and Sullivan, pp. 51-67, Reinhold, New York, (1961)
[55] D.S. Peck, ‘The reliability of semiconductor devices in the Bell system’, Proceedings of the IEEE, pp. 185-213, (1974)
[56] Ö. Hallberg, ‘Failure rate as a function of time due to log-normal life distribution(s) of weak parts’, Microelectronics & Reliability, pp. 155-158, (1977)
[57] W.C. Riordan, R. Miller, J.M. Sherman, J. Hicks, ‘Microprocessor reliability performance as a function of die location for a 0.25 µm five layer metal CMOS logic process’, Proceedings IRPS, pp. 1-11, (1999)
[58] M. Salagoïty, ‘Reliability of high density packages’, Tutorial European Symposium on Reliability of Electron devices and Failure analysis (ESREF),
(1999)
2
Reliability Issues in High Voltage Bipolar-CMOS-DMOS Integrated Circuits [19,20]
2.1 Introduction
2.2 Threshold voltage instabilities of HV DMOS transistors
2.3 Parasitic leakage currents induced by ‘charge-creep’
2.3.1 Failure mechanism
2.3.2 Surface potential modelling by a lumped element RC-network
2.3.3 ‘Charge-creep’ characterisation using test structures
2.3.3.1 Test structures
2.3.3.2 Experimental results
2.3.4 Comparison of experimental data and model predictions
2.3.4.1 Steady-state surface potential
2.3.4.2 Delay time
2.3.5 Design rules
2.4 Conclusions
2.5 References
2.1 INTRODUCTION
The combination of high operating temperature (∼140°C) and high voltages (>600V) in current state-of-the-art high power / high voltage (HV) Bipolar-CMOS-DMOS (BCD) technologies, in applications such as lighting and power supplies, induces new degradation mechanisms that are not relevant in standard 5 V and 3.3 V CMOS technologies. Hardly any published data are available on these mechanisms. The three most significant failure modes are breakdown voltage instabilities of the high voltage lateral double-diffused MOS (DMOS) transistor [1], threshold voltage (Vt) instabilities of this transistor [2,3] and parasitic leakage currents in low voltage parts of the circuit induced by high surface potentials at the moulding plastic - passivation interface originating from the high voltage part of the circuit (‘gate induced leakage’ [4]). This chapter will discuss the latter two issues.
2.2 THRESHOLD VOLTAGE INSTABILITIES OF HIGH VOLTAGE
DMOS TRANSISTORS
A cross section of the DMOS transistor is shown in fig. 1. The devices are fabricated in a 3 µm double poly, single metal technology. The poly-metal dielectric
is a TEOS oxide stack containing a thin P2O5 layer for mobile ion gettering purposes. Extensive life testing has shown that this getter layer results in an adequate
reliability for a 12 V BiCMOS technology. In the 650 V BCD-technology it however appears to be insufficient.
Curve B in fig. 2 shows the threshold voltage (Vt) instability occurring during a High Temperature Reverse Bias (HTRB) lifetest at 150 °C where the gate bias equals 0 V. The same failure mode also occurs during a Static High Temperature Lifetest (SHTL) with a 12 V gate bias. The failure mode arises because commercially available plastic moulding compounds contain traces (≈ 4 ppm) of sodium ions (Na+) originating from the resin manufacturing process. Under high voltage operating conditions, a large lateral electric field (about 10 V/µm) exists along the surface of the transistor between its source and drain, forcing the Na+ ions in the plastic towards the source at high temperature. Here the vertical electric field component points towards the grounded source, enabling the Na+ to penetrate the device through pinholes, microcracks, fissures and pores in the Si3N4 plasma nitride passivation [16] and to reach the DMOS gate oxide via the path depicted in fig. 1.
[Fig. 1 drawing labels: Na+, Si3N4, Al, TEOS, P2O5, LOCOS; gate (G), source (S) and drain (D); doped regions labelled p, p+, p++, p-, n and n+]
Fig. 1: Overview of a high voltage lateral DMOS transistor and the sodium (Na+) penetration path.
Process and device layout improvements were evaluated using a wafer level highly accelerated lifetest [5,6]. Devices were deliberately contaminated with 0.5 weight % NaOH and subjected to high voltage and high temperature while the Vt was continuously monitored to determine the failure times, see fig. 3. The apparent variation of the sigma of the distribution with temperature is most likely just a statistical effect caused by the small sample sizes (about 9) used in this experiment. The resulting activation energy Ea equals 0.87±0.09 eV. This result is in reasonable agreement with the ≈ 0.7 eV value reported in the literature for diffusion of Na+ in silicon oxide (SiO2) [2,7], although values ranging from 0.45 eV [8] to 1.1 eV [3] have also been reported.
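What an activation energy of this size means for the stress temperatures used can be sketched with the Arrhenius relation. The fragment below assumes that the failure time scales as exp(Ea / kT) with a single activation energy over the whole temperature range, and evaluates the acceleration between the lowest and highest stress temperature of fig. 3 for the quoted Ea and its error bounds.

```python
import math

K_BOLTZMANN_EV = 8.617333e-5  # Boltzmann constant in eV/K


def arrhenius_acceleration(ea_ev, t_low_c, t_high_c):
    """Ratio of failure times at t_low_c and t_high_c (degrees Celsius),
    assuming a single thermally activated mechanism with energy ea_ev."""
    t_low_k = t_low_c + 273.15
    t_high_k = t_high_c + 273.15
    return math.exp(ea_ev / K_BOLTZMANN_EV * (1.0 / t_low_k - 1.0 / t_high_k))


# Ea = 0.87 +/- 0.09 eV as quoted in the text; 150 and 250 degrees C from fig. 3.
af_nominal = arrhenius_acceleration(0.87, 150.0, 250.0)
af_low = arrhenius_acceleration(0.87 - 0.09, 150.0, 250.0)
af_high = arrhenius_acceleration(0.87 + 0.09, 150.0, 250.0)

print(f"acceleration 150 -> 250 degrees C: about {af_nominal:.0f}x")
print(f"range for the quoted error bounds: {af_low:.0f}x to {af_high:.0f}x")
```

The width of the resulting range shows how strongly an uncertainty of 0.09 eV propagates into extrapolated lifetimes.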
[Fig. 2 plot: normalised threshold voltage (0.00 to 1.25) versus time (10 to 10000 hours, logarithmic), curves A, B and C]
Fig. 2: Vt-degradation of the standard DMOS transistor during HTRB lifetest
(Vds=500V/Vgs=0V/T=150°C) with A) 0.45µm, B) 0.9µm and C) 1.8µm
Si3N4 passivation.
[Fig. 3 plot: cumulative failure probability (0.1% to 99.9%, probability scale) versus time (0.1 to 10000 sec, logarithmic) at 150°C, 200°C and 250°C]
Fig. 3: Vt-shift failure time distribution during a 250V wafer level HTRB stress
(Vgs=0V) at 150, 200 and 250°C.
It was found that significant improvements could be achieved by increasing the silicon nitride (Si3N4) passivation thickness (fig. 2) and optimising the layout, see curves A and B in fig. 4. The densification of the nitride along the vertical sidewall of the anisotropically etched metal lines by the ion bombardment during plasma enhanced chemical vapour deposition (PE-CVD) also appeared to be important, as devices with wet etched metal were found to be superior to those with dry etched metal. Apparently, the PE-CVD deposited silicon nitride is relatively porous along the sidewalls of the metal lines. This is also demonstrated in fig. 5, which shows a cross-section of a metal line with passivation after HF-etch. The Si3N4 etch rate is clearly larger at the sidewall than at the top or bottom. Finally, a significant lifetime improvement could be achieved by implementing an enhanced phosphorus gettering layer in the TEOS oxide poly-metal dielectric or by including a PSG layer in the TEOS stack [18], see curves C and D in fig. 4. In some of the lifetest experiments a thin Si3N4 passivation layer was used in order to accelerate the Vt-shift failure mode and reduce the required stress time.
[Fig. 4 plot: normalised threshold voltage (0.00 to 1.25) versus time (10 to 10000 hours, logarithmic), curves A, B, C and D]
Fig. 4: Vt of the DMOS transistor with optimised layout versus time during
HTRB lifetest (500V/150°C) for A) 0.45µm, B) 0.9µm Si3N4 passivation
and C) 0.45µm Si3N4 and improved P2O5-getter layer and D) 1.8µm Si3N4
and a PSG-getter layer.
[Fig. 5 image labels: densified, non-densified]
Fig. 5: Cross-section of a passivated metal line after HF-etch, the fissure occurs
at the border between densified and non-densified (more porous) Si3N4.
2.3 PARASITIC LEAKAGE CURRENTS INDUCED BY ‘CHARGE-CREEP’
2.3.1 Failure mechanism
Plastic moulding compounds have a measurable conductivity due to the presence of water and ionic impurities such as Na+, K+, Cl-, NH4+, HxPO43-, and NO32- in the compound. Furthermore, Br and Sb ions are added to the plastics, acting as flame retardants. The conductivity is strongly temperature dependent and increases over 4 decades between 20 °C and 150°C (Ea = 0.65 eV), as shown in fig. 6 [9]. Consequently, at high temperatures, the high voltage (HV) surface potential (>600V) present at the bondpads of the HV circuitry can spread over the low voltage (<20V) part of the circuit, see fig. 7, and may induce parasitic channels and leakage currents in low voltage devices like bipolar transistors and active as well as parasitic MOS transistors, or may affect diffused resistance values [17].
[Fig. 6 plot: epoxy resistivity (ohm-cm, logarithmic axis from 10^10 to 10^17) versus 1000/T (1/K, 2.0 to 3.2, corresponding to 200°C down to 50°C) for the compounds EME1100HS, EME6210S, EME6210SR, EME1100HJ, Nitto HC-10 and Nitto HC10-2, with slopes marked Ea ~ 0.65 eV and Ea ~ 2.5 eV]
Fig. 6: Resistivity of various commercially available plastic moulding compounds after full moisture saturation as a function of temperature.
Fig. 7: Schematic view of parasitic leakage currents induced by high surface potentials originating from a high voltage bondpad (‘charge-creep’).
This phenomenon is called 'charge-creep' or 'gate induced leakage' [4] and
may result in malfunctioning of circuits within a few hours during Dynamic High
Temperature Operating Lifetests (DHTL), as shown in fig. 8. The surface potential
is "frozen", and thus a permanent failure is created, when the circuit is subsequently cooled down to room temperature: the mobility of the ionic impurities is strongly reduced at lower temperatures, so a net positive charge remains
'trapped' in the moulding compound.
[Plot for fig. 8: cumulative failure probability (probability scale from 0.1% to 99.9%; 'prob inv (F)' from -4 to 4) versus time (hours, logarithmic axis from 0.1 to 10000) for the series A/150°C, A/125°C, B/150°C and C/150°C.]
Fig. 8: Cumulative failure distribution of a BCD-product during a 390V DHTL
stress at 125°C and 150°C for various packages A) cresol-novolac low
stress compound, B) as A) but with epoxy-plastic interface modification
and C) bi-phenylic anti-popcorn compound. At each readpoint the HV-bias was kept on the devices while lowering the stress temperature to
room temperature in order to prevent potential annealing effects.
The lifetest data of the BCD-product in fig. 8 show that the failure mechanism
is strongly dependent on temperature as well as on the specific type of moulding
compound and package construction. Other experiments reported in [10] show
that the moisture content of the package is also very important. The water concentration in the package determines the mobility of the ionic impurities and thus also affects the 'charge-creep' failure mechanism. It appears that the 'charge-creep'
effects can be virtually eliminated by first subjecting products to a 500 hrs bake at
150 °C (which reduces the moisture content of the package to 0 weight %) before
the high voltage stress [10]. Similar results are obtained after a 24 hrs bake at
175 °C. If the same samples are subsequently fully moisturised (to ≈ 0.3 weight %) by
storage for 168 hrs at 85 °C / 85 % RH, they again become sensitive to the
'charge-creep' mechanism. In that case, however, the leakage
currents induced by 'charge-creep' are significantly smaller than those of virgin fully
moisturised samples, and it also takes longer before the leakage current increase
starts. This is probably caused by the ongoing curing of the moulding compound,
as the 150 °C and 175 °C bake temperatures are close to or even above the 165 °C
glass transition temperature Tg of the plastic. The curing affects plastic material
properties like maximum moisture uptake, ionic mobility and conductivity.
It must be noted that in the experiment shown in fig. 8 the moisture content of
the samples was not controlled, so the data must be treated cautiously. For proper
experimental results and e.g. activation energy determination, the moisture content of the package must firstly be in equilibrium and secondly be
controlled for all samples and at all readpoints.
2.3.2 Surface potential modelling by a lumped element RC-network
As discussed in the previous section, the parasitic leakage currents are induced
by high surface potentials originating from the high voltage (HV) bondpads. So
the 'charge-creep' effect can be modelled by describing the evolution of the surface
potentials at the die-plastic interface as a function of place and time. As will be
shown in section 2.3.3.1, the surface potential has a one-to-one relation to the leakage currents.
The place and time dependence of the surface potential can be modelled by a
lumped element RC-network, see fig. 9, with R being the resistance of the moulding compound from the HV-bondpad to the low voltage circuitry and C the capacitance between the active silicon and the interface between the nitride passivation
and the moulding compound. Note that after long stress times obviously an equilibrium will be reached governed by the boundary conditions defined by the potentials of the bondpads and the diepad of the circuit.
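[Schematic for fig. 9: chain of resistors R running from the HV bondpad to the 0 V earthlane, with a capacitor C at each node, drawn above a cross-section labelled silicon nitride, metal, TEOS, LOCOS, Nwell, n+ and p- Si; the dimensions Dbondpad, dnode (measured from d=0) and Dearthlane are indicated.]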
Fig. 9: Lumped-element RC-network used for modelling of the evolution of the
surface potential as a function of place and time.
A real circuit normally has a rectangular geometry and asymmetrically placed
HV-bondpads. In order to model the surface potential evolution as a function of place and time realistically by analytical formulas, we simplify this geometry to the cylindrical and symmetrical one shown in fig. 10. This allows us in
the following sections to compare model predictions and experimental data without having to rely on complex 3D-device simulations. It thus provides more insight into the 'charge-creep' mechanism.
[Schematic for fig. 10: HV bondpad with radius Dbondpad, a node at distance dnode, and a grounded (0 V) metal earthlane at radius Dearthlane.]
Fig. 10: Geometry used for modelling of the surface potential as a function of the
distance to the HV-bondpad.
In case of the geometry shown in fig. 10, the resistance R between a node at
distance d from the edge of the HV-bondpad and the HV-bondpad is given by
equation (1) and the capacitance C of the corresponding area by equations (2) and
(3).

\[
R(d) = \int_{D_{bondpad}}^{d + D_{bondpad}} \frac{\rho_{plastic} \cdot \partial r}{2\pi \cdot t_{plastic} \cdot r}
     = \frac{\rho_{plastic}}{2\pi \cdot t_{plastic}} \cdot \ln\left(\frac{d + D_{bondpad}}{D_{bondpad}}\right)
     \quad [\Omega] \tag{1}
\]

\[
C(d) = \int_{D_{bondpad}}^{d + D_{bondpad}} C_0 \cdot 2\pi \cdot r \cdot \partial r
     = \pi \cdot C_0 \cdot \left[ (d + D_{bondpad})^2 - D_{bondpad}^2 \right]
     \quad [\mathrm{F}] \tag{2}
\]

where:

\[
C_0 = \frac{\varepsilon_0 \cdot \varepsilon_{oxide} \cdot \varepsilon_{nitride}}
           {\varepsilon_{nitride} \cdot (t_{LOCOS} + t_{TEOS}) + \varepsilon_{oxide} \cdot t_{nitride}}
     \quad [\mathrm{F\,m^{-2}}] \tag{3}
\]
Here ρplastic is the resistivity of the moulding plastic as shown in fig. 6, tplastic
the thickness of the moulding plastic on top of the silicon-nitride passivation,
Dbondpad the radius of the HV-bondpad and tLOCOS, tTEOS and tnitride the thicknesses of
the LOCOS oxide, TEOS oxide and the Si3N4 passivation. εoxide and εnitride equal
3.9 and 7.5 respectively. Fig. 6 shows that for the epoxy-novolac plastic ρplastic
equals about 1.4⋅10^14 Ωcm and 6⋅10^13 Ωcm at 130 °C and 150 °C respectively. In
the experiments that will be described in section 2.3.3, tLOCOS, tTEOS and tnitride
equal 0.95 µm, 1.2 µm and 1.8 µm respectively and tplastic, Dbondpad and C0 equal
about 1.2 mm, 40 µm and 1.12⋅10^-9 F cm^-2 respectively.
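As a rough numerical check of equations (1) to (3), the sketch below evaluates them with the layer thicknesses and resistivity quoted above. The function names, the choice of centimetres as working unit and the 550 µm example distance are illustrative assumptions, not part of the original work.

```python
import math

EPS0_F_PER_CM = 8.854e-14  # vacuum permittivity in F/cm


def c0_per_area(eps_oxide, eps_nitride, t_locos_cm, t_teos_cm, t_nitride_cm):
    """Equation (3): capacitance per unit area of the oxide/nitride stack, in F/cm^2."""
    numerator = EPS0_F_PER_CM * eps_oxide * eps_nitride
    denominator = eps_nitride * (t_locos_cm + t_teos_cm) + eps_oxide * t_nitride_cm
    return numerator / denominator


def resistance_ohm(d_cm, d_bondpad_cm, rho_plastic_ohm_cm, t_plastic_cm):
    """Equation (1): plastic resistance between the bondpad edge and a node at distance d."""
    prefactor = rho_plastic_ohm_cm / (2.0 * math.pi * t_plastic_cm)
    return prefactor * math.log((d_cm + d_bondpad_cm) / d_bondpad_cm)


def capacitance_f(d_cm, d_bondpad_cm, c0_f_per_cm2):
    """Equation (2): capacitance of the annulus between the bondpad edge and distance d."""
    return math.pi * c0_f_per_cm2 * ((d_cm + d_bondpad_cm) ** 2 - d_bondpad_cm ** 2)


UM = 1e-4  # one micrometre expressed in cm

# Values quoted in the text: 0.95 um LOCOS, 1.2 um TEOS, 1.8 um nitride,
# 1.2 mm plastic, 40 um bondpad radius, 1.4e14 ohm-cm at 130 degrees C.
c0 = c0_per_area(3.9, 7.5, 0.95 * UM, 1.2 * UM, 1.8 * UM)
r_550 = resistance_ohm(550 * UM, 40 * UM, 1.4e14, 0.12)  # hypothetical example distance
c_550 = capacitance_f(550 * UM, 40 * UM, c0)
print(f"C0 = {c0:.3e} F/cm^2, R(550 um) = {r_550:.3e} ohm, C(550 um) = {c_550:.3e} F")
```

With these inputs C0 comes out close to the 1.12⋅10^-9 F cm^-2 quoted in the text.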
The surface potential Vsurface as a function of place and time after switching the
high voltage on at t= 0 seconds is for the lumped-element RC-network given by
equation (4):
\[
V_{surface}(d, t) = V_{surface}(d) \cdot \left( 1 - e^{-\frac{t}{\tau_{delay}(d)}} \right)
\quad [\mathrm{V}] \tag{4}
\]
Equation (5) gives the steady state surface potential Vsurface (d) at a node at a
distance d from the HV-bondpad. Note that Vsurface = 0 V at the earthlane, i.e. for d + Dbondpad = Dearthlane.

\[
V_{surface}(d) = V_{bondpad} \cdot \frac{R(d \rightarrow D_{earthlane})}{R(D_{bondpad} \rightarrow d) + R(d \rightarrow D_{earthlane})}
= V_{bondpad} \cdot \frac{\ln\left(\dfrac{D_{earthlane}}{d + D_{bondpad}}\right)}{\ln\left(\dfrac{D_{earthlane}}{D_{bondpad}}\right)}
\quad [\mathrm{V}] \tag{5}
\]
where Vbondpad is the high voltage applied to the bondpad, Dbondpad the radius of the
bondpad, Dearthlane the radius from the centre of the bondpad to the grounded
earthlane (also called sawlane) and d the distance between the edge of the bondpad and the location of interest. In the experiments described in section 2.3.3,
Dbondpad equals ≈ 40 µm, Dearthlane ≈ 1100 µm and Vbondpad = 500V.
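The steady-state profile of equation (5) can be tabulated with a few lines of code. The sketch below is illustrative only: the function name and the list of sample distances are assumptions, while the bondpad radius, earthlane radius and bondpad voltage are the values quoted above.

```python
import math


def v_surface_steady(d_um, v_bondpad, d_bondpad_um, d_earthlane_um):
    """Equation (5): steady-state surface potential at distance d from the bondpad edge.

    Valid for 0 <= d <= d_earthlane_um - d_bondpad_um.
    """
    radius_um = d_um + d_bondpad_um
    return v_bondpad * math.log(d_earthlane_um / radius_um) / math.log(
        d_earthlane_um / d_bondpad_um
    )


# Geometry of the test structure as quoted in the text.
V_BONDPAD = 500.0
D_BONDPAD_UM = 40.0
D_EARTHLANE_UM = 1100.0

for d_um in (0.0, 100.0, 350.0, 550.0, 850.0, 1060.0):  # hypothetical sample points
    v = v_surface_steady(d_um, V_BONDPAD, D_BONDPAD_UM, D_EARTHLANE_UM)
    print(f"d = {d_um:6.0f} um -> Vsurface = {v:6.1f} V")
```

The potential equals the bondpad voltage at d = 0 and falls logarithmically to zero at the earthlane.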
The delay time τdelay in equation (4) between the application of the high voltage at the HV-bondpad and the response of the voltage at one of its nodes is given
by equation (6). τdelay is equal to the RC-time constant between the edge of the
HV-bondpad and the node located at a distance d. Note that at t= τdelay the surface
potential has reached 63% of its steady-state value.
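\[
\tau_{delay}(d) = \iint_{D_{bondpad}}^{d + D_{bondpad}} R(r) \cdot C(r) \cdot \partial r \, \partial r
= \iint_{D_{bondpad}}^{d + D_{bondpad}} \frac{\rho_{plastic}}{2\pi \cdot r \cdot t_{plastic}} \cdot C_0 \cdot 2\pi \cdot r \cdot \partial r \, \partial r
= \frac{\rho_{plastic} \cdot C_0}{2 \cdot t_{plastic}} \cdot \left[ (d + D_{bondpad})^2 - D_{bondpad}^2 \right]
\quad [\mathrm{s}] \tag{6}
\]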
Equation (6) implies that for large distances d to the HV-bondpad, τdelay increases approximately quadratically with d. For smaller distances the exponent
will be more like 1.8 than 2.0.
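To make the distance dependence concrete, the following sketch evaluates the closed form of equation (6), the local log-log slope that follows from it, and the transient of equation (4). Function names and the 550 µm example are assumptions for illustration; the material values are those quoted in section 2.3.2 for 130 °C.

```python
import math


def tau_delay_s(d_cm, d_bondpad_cm, rho_ohm_cm, c0_f_per_cm2, t_plastic_cm):
    """Closed form of equation (6): RC delay time at distance d from the bondpad edge."""
    prefactor = rho_ohm_cm * c0_f_per_cm2 / (2.0 * t_plastic_cm)
    return prefactor * ((d_cm + d_bondpad_cm) ** 2 - d_bondpad_cm ** 2)


def local_exponent(d, d_bondpad):
    """Slope d(ln tau)/d(ln d) of equation (6).

    tau is proportional to d**2 + 2*d*Dbondpad, so the slope is
    2*(d + Dbondpad) / (d + 2*Dbondpad): below 2 for small d, approaching 2 for large d.
    """
    return 2.0 * (d + d_bondpad) / (d + 2.0 * d_bondpad)


def v_surface_transient(v_steady, t_s, tau_s):
    """Equation (4): exponential approach to the steady-state surface potential."""
    return v_steady * (1.0 - math.exp(-t_s / tau_s))


UM = 1e-4  # one micrometre expressed in cm

# Quoted values: 1.4e14 ohm-cm at 130 degrees C, C0 = 1.12e-9 F/cm^2, 1.2 mm plastic.
tau_550 = tau_delay_s(550 * UM, 40 * UM, 1.4e14, 1.12e-9, 0.12)
print(f"tau_delay(550 um) = {tau_550:.0f} s")
print(f"local exponent at 550 um = {local_exponent(550.0, 40.0):.2f}")
print(f"fraction of steady state at t = tau: {v_surface_transient(1.0, tau_550, tau_550):.3f}")
```

The local exponent stays somewhat below 2 over the distances used in the experiments, which is the behaviour described in the paragraph above.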
2.3.3 'Charge-creep' characterisation using test structures
2.3.3.1 Test structures
A dedicated test chip has been designed and processed to quantify the 'charge-creep' effect as a function of time, temperature and distance of sensitive circuitry
to the HV components. A parasitic field-oxide NMOS transistor is used as the low
voltage 'sense' device, see fig. 11. The transistor width and length are 10 µm and
5 µm respectively and its threshold voltage is typically about 40 V. The effects of
moisture content have been reported in [10], see also section 2.3.1. The test chip
has a size of 2.2 x 2.9 mm² and is packaged in a cresol-novolac low stress compound. In all the experiments discussed below the packages were fully saturated
with moisture (≈ 0.3 weight %) and in equilibrium before starting the stresses unless otherwise mentioned.
[Schematic for fig. 11, panels (a) and (b): bondpad of width 2*Dbondpad and, at distance d, the parasitic NMOS sense transistor (W/L = 10/5 µm) with n+ source and n+ drain; labels for field oxide and metal.]
Fig. 11: Schematic overview of the 'charge-creep' test structures; a) top view and
b) cross-section.
The testchip also contained a field oxide NMOS transistor of the same geometry as the parasitic NMOS transistor in fig. 11, but in this case with a metal gate
on top. The leakage current of this transistor as a function of the metal gate voltage is shown in fig. 12. If the metal gate is not present, a similar leakage current
can be induced by raising the surface potential at the silicon nitride to plastic interface. Obviously, in this case a higher voltage is needed because, apart from the
LOCOS and TEOS oxide, the silicon nitride passivation is now also part of the
gate dielectric, see fig. 10. The relation between the two voltages is given by equation (7).
\[
V_{surface} = \left( 1 + \frac{\varepsilon_{oxide}}{\varepsilon_{nitride}} \cdot \frac{t_{nitride}}{t_{LOCOS} + t_{TEOS}} \right) \cdot V_{gate}
\quad [\mathrm{V}] \tag{7}
\]
where Vsurface is the surface potential at the plastic-nitride interface and Vgate
the gate voltage corresponding to the measured leakage current as shown in fig.
12. εoxide and εnitride equal 3.9 and 7.5 respectively. tLOCOS, tTEOS and tnitride are the
thicknesses of the LOCOS oxide, TEOS oxide and silicon nitride passivation and
are 0.95 µm, 1.2 µm and 1.8 µm respectively in our experiments. Consequently,
Vsurface equals 1.44 × Vgate. Equation (7) thus allows us to convert the measured
leakage current of the parasitic NMOS transistor in our experiments to a surface
potential in a simple way.
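The conversion factor in equation (7) follows directly from the layer data. A minimal sketch, with an assumed function name and a hypothetical 100 V gate voltage as example input:

```python
def surface_to_gate_factor(eps_oxide, eps_nitride, t_locos, t_teos, t_nitride):
    """Equation (7): ratio Vsurface / Vgate for the oxide + nitride gate stack.

    Thicknesses may be in any unit as long as all three use the same one.
    """
    return 1.0 + (eps_oxide / eps_nitride) * t_nitride / (t_locos + t_teos)


# Layer data quoted in the text (thicknesses in micrometres).
factor = surface_to_gate_factor(3.9, 7.5, 0.95, 1.2, 1.8)

v_gate_example = 100.0  # hypothetical gate voltage read from the fig. 12 calibration
print(f"factor = {factor:.3f}, Vsurface = {factor * v_gate_example:.1f} V")
```

The factor evaluates to slightly above 1.43, in line with the rounded value of 1.44 used in the text.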
The test chip used in the experiments had a size of 2.2 x 2.9 mm² and was
packaged in a cresol-novolac low stress compound. In all the experiments discussed below the packages were saturated with moisture (≈ 0.3 weight-%) and in
equilibrium before starting the stresses unless otherwise mentioned (see section
2.3.3.2).
[Plot for fig. 12: leakage current (A, logarithmic axis from 1E-11 to 1E-02) versus gate voltage of the parasitic NMOS (V, 0 to 500) at 90 °C, 110 °C, 130 °C and 150 °C.]
Fig. 12: Leakage current of a W/L=10/5 µm parasitic field oxide NMOS
transistor as a function of the voltage on the metal gate for various
temperatures. The gate dielectric consists of a 0.95 µm LOCOS oxide
and a 1.2 µm TEOS oxide.
2.3.3.2 Experimental results
Conventional high temperature lifetests
Fig. 13 shows results from a conventional HTRB lifetest at 150°C on the test
structures. Here the samples were removed from the stress set-up at each readpoint and the leakage current was measured at room temperature. Note that the
storage time at room temperature at the various readpoints was less than 24 hrs,
thus limiting any moisture uptake by the packages.
The data show that the induced leakage currents decrease with the distance
from the HV bondpad, as expected, and that, due to the small size of the test die
(2.2 x 2.9 mm²), equilibrium is reached within ¼ hour. The HV surface potential
then extends to more than 1 mm from the HV bondpad. Remarkably, for longer
times the leakage current, and thus also the surface potential, decreases. This is
probably due to the ongoing curing of the moulding compound, see section 2.3.1.
Annealing behaviour is shown in fig. 14. The annealing takes longer than the
leakage increase, firstly because the ion mobility and thus the conductivity is much lower in dry samples and secondly because of the curing effect. Note that
during the leakage increase the samples still contain moisture. Similarly, repeated
high voltage stress and annealing experiments show that the time constant of the
leakage increase becomes larger with every cycle.
[Plot for fig. 13: leakage current (A, logarithmic axis from 1E-06 to 1E-02) versus distance to HV bondpad (µm, 0 to 800) after stress times of 1/4 h, 1 h and 24 h.]
Fig. 13: Leakage current of a parasitic NMOS transistor versus distance to a high
voltage bondpad for various stress times during a conventional 500 V
HTRB lifetest at 150 °C.
[Plot for fig. 14: leakage current (A, logarithmic axis from 1E-12 to 1E-02) versus distance to HV bondpad (µm, 0 to 800) for stress/anneal times of 0h/0h, 24h/0h, 24h/24h and 24h/65h.]
Fig. 14: Recovery of the leakage current of a parasitic NMOS transistor versus
distance to a HV bondpad during a 500 V stress at 150 °C and annealing
at 150 °C. The legend shows both the stress and anneal times.
In-situ high temperature lifetests
During the conventional lifetest experiments equilibrium was already reached
at the first readpoint, about 15 minutes after the start of the stress. Therefore,
in-situ stress experiments were also carried out in which the leakage current was
continuously monitored at the stress temperature during a 500 V high voltage
stress. Figs. 15, 16 and 17 show the results of these in-situ 'charge-creep' stress experiments.
Fig. 15 shows the leakage current as a function of time for various distances of
the parasitic 'sense' transistor to the HV-bondpad and for temperatures ranging
from 110 °C to 150 °C. The data confirm that the 'charge-creep' mechanism is a
fast effect, as already indicated by the data in fig. 13 from the conventional lifetest
experiments. At a distance of 100 µm from the HV-bondpad equilibrium or steady
state is reached after about 1000 seconds at 150 °C. At the same time, the induced
leakage current effects extend to beyond 0.9 mm from the HV-bondpad. Furthermore, we observe a delay time τdelay between the start of the HV-stress and the onset of the leakage current increase. This is typical for a higher order system and
thus consistent with the lumped element RC-model that we have used to describe
the surface potential evolution versus time and place.
[Plots for fig. 15, panels (a) T = 150 °C, (b) T = 130 °C and (c) T = 110 °C: leakage current (A, logarithmic axis) versus time (s, logarithmic axis from 10 to 100000) for distances to the HV-bondpad of 100 µm, 350 µm, 550 µm and 850 µm.]
Fig. 15: Parasitic NMOS leakage current as a function of time during a 500 V
SHTL stress for various distances between the parasitic NMOS ‘sense’
transistor and the HV bondpad and for three temperatures (a) 150 °C, (b)
130 °C and (c) 110 °C [J. Bruggers].
Fig. 16 shows for various temperatures how the delay time τdelay increases with
the distance d to the HV-bondpad. It appears that τdelay is approximately proportional to d².
Data from test structures packaged both in epoxy-novolac and in bi-phenylic
moulding compounds and stressed at temperatures between 90°C and 150°C show
that the exponent varies between 1.72 and 2.37 with an average and spread of
2.06 ± 0.19. This is consistent with the exponent value of 2 predicted by the lumped-element RC-model in section 2.3.2.
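An exponent of this kind can be extracted by a straight-line fit on log-log axes. The sketch below shows one way to do that with an ordinary least-squares fit. The delay values are hypothetical, generated from the model of equation (6) in arbitrary units rather than taken from the measurements, and the function name is an assumption.

```python
import math


def fit_power_law(xs, ys):
    """Least-squares fit of y = a * x**n on log-log axes; returns (a, n)."""
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    count = len(log_x)
    mean_x = sum(log_x) / count
    mean_y = sum(log_y) / count
    sxx = sum((x - mean_x) ** 2 for x in log_x)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_x, log_y))
    exponent = sxy / sxx
    prefactor = math.exp(mean_y - exponent * mean_x)
    return prefactor, exponent


# Hypothetical data: delay times in arbitrary units, generated from the
# distance dependence of equation (6) with a 40 um bondpad radius.
D_BONDPAD_UM = 40.0
distances_um = [100.0, 350.0, 550.0, 850.0]
delays = [(d + D_BONDPAD_UM) ** 2 - D_BONDPAD_UM ** 2 for d in distances_um]

prefactor, exponent = fit_power_law(distances_um, delays)
print(f"fitted exponent = {exponent:.2f}")
```

For model-generated data of this kind the fitted exponent lies somewhat below 2, because of the linear term in d contributed by the bondpad radius.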
[Plot for fig. 16: delay time (s, at I-leak = 10 nA, logarithmic axis from 10 to 100000) versus distance to HV-bondpad (µm, logarithmic axis from 10 to 1000) at 110 °C, 130 °C and 150 °C, with power-law fits y = 0.026·x^2.03, y = 0.0045·x^2.02 and y = 0.0051·x^1.72.]
Fig. 16: Delay time between the start of the 500 V SHTL stress and the onset of
the leakage current increase of the parasitic NMOS, defined as a leakage
larger than 10 nA, as a function of the distance to the HV-bondpad for
three different temperatures.
Finally, it is shown in fig. 17 that the leakage current increase is strongly temperature dependent. It also shows an increase of delay time τdelay with decreasing
temperature that is also consistent with the lumped element RC-model.
[Plot for fig. 17: leakage current (A, logarithmic axis from 1E-10 to 1E-03) versus time (s, logarithmic axis from 10 to 100000) at 110 °C, 120 °C, 130 °C, 140 °C and 150 °C.]
Fig. 17: Leakage current of a parasitic NMOS transistor located at 550 µm distance from the HV-bondpad as a function of time during a 500 V SHTL
stress at various temperatures.
As described in section 2.3.3.1, all leakage current graphs in fig. 15 and 17
can be converted to surface potential graphs. Using equation (4), this allows us to
determine the delay time τdelay for a given distance to the HV-bondpad as a function of temperature. Thus the activation energy Ea of the 'charge-creep' failure
mechanism can be measured. For the cresol-novolac low stress compound and the
bi-phenylic anti-popcorn compound activation energies of Ea= 0.9 ± 0.2 eV and
Ea= 1.1 ± 0.1 eV respectively are found [10]. The fact that we find different values
supports our model that the 'charge-creep' is governed by conduction in the moulding compound.
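The activation energy extraction described above amounts to an Arrhenius fit of the delay time against temperature. The sketch below assumes τdelay is proportional to exp(Ea/kT) and fits ln(τdelay) against 1/(kT). The delay times are synthetic, generated with Ea = 0.9 eV purely to show that the fit recovers the input. They are not measured data, and the function name is hypothetical.

```python
import math

K_B_EV_PER_K = 8.617e-5  # Boltzmann constant in eV/K


def fit_activation_energy(temps_c, taus_s):
    """Fit ln(tau) = ln(tau0) + Ea / (k*T) and return Ea in eV."""
    xs = [1.0 / (K_B_EV_PER_K * (t_c + 273.15)) for t_c in temps_c]
    ys = [math.log(tau) for tau in taus_s]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx  # the slope of ln(tau) versus 1/(kT) is Ea


# Synthetic delay times (not measurements): tau = tau0 * exp(Ea / kT) with Ea = 0.9 eV.
EA_INPUT_EV = 0.9
TAU0_S = 1e-8  # arbitrary prefactor chosen for illustration
temps_c = [110.0, 120.0, 130.0, 140.0, 150.0]
taus_s = [
    TAU0_S * math.exp(EA_INPUT_EV / (K_B_EV_PER_K * (t_c + 273.15))) for t_c in temps_c
]

ea_fit = fit_activation_energy(temps_c, taus_s)
print(f"fitted Ea = {ea_fit:.3f} eV")
```

With real data the scatter in the delay times, and any drift in moisture content during the stress, would widen the uncertainty on the fitted value.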
2.3.4 Comparison of experimental data and model predictions
2.3.4.1 Steady-state surface potential
In order to compare the experimental data with the model predictions, the leakage current graphs must be converted to surface potential graphs using fig. 12 and
equation (7). An example is shown in fig. 18. It depicts the surface potential as a
function of time for various distances to the HV-bondpad during a 500 V stress at
130 °C, as derived from the corresponding leakage currents in fig. 15b. It
clearly shows that the steady state surface potential decreases with increasing distance to the HV-bondpad. Similarly, the delay time between the start of the stress
and the rise of the surface potential increases with distance, as is qualitatively predicted by the
lumped element RC-model.
[Plot for fig. 18: surface potential (V, 0 to 500) versus time (s, logarithmic axis from 10 to 100000) at T = 130 °C for distances to the HV-bondpad of 100 µm, 350 µm, 550 µm and 850 µm.]
Fig. 18 : Surface potential induced by the ‘charge-creep’ effect as a function of time for various distances to the HV bondpad during a 500 V SHTL stress
at 130 °C as derived from the leakage currents of parasitic NMOS transistors, see text. The corresponding leakage data can be found in fig. 15b.
Fig. 19 shows the steady state surface potential as a function of the distance d
to the HV-bondpad as obtained from various experiments on test structures packaged in both epoxy-novolac and bi-phenylic moulding compounds and stressed at
500V at temperatures ranging from 90°C to 150°C. Note that the surface voltages
have been derived from the measured leakage currents using the calibration curves in fig. 12 and equation (7).
In order to obtain a good fit between the data and the model we
need to modify equation (5). This is because in fig. 19 the 500 V value of the HV-bondpad is reached at a distance doffset of about 80 µm from the HV-bondpad instead of at d = 0 µm as predicted by the model. We can incorporate this in our model by replacing d by d-doffset in equation (5).
Two effects cause this discrepancy. The first is that our model assumes a cylindrical symmetry while in reality we are dealing with a square 80 µm x 80 µm
bondpad and a rectangular die. The second, more important, effect is that the 1.8
µm thick silicon-nitride passivation layer becomes conductive at high lateral field
strengths and high temperatures due to Frenkel-Poole conduction [11,12]. In fact, if a voltage of 500 V is applied to the bondpad at 150 °C, the silicon-nitride can
be significantly more conductive than the moulding plastic up to a considerable
distance from the edge of the HV-bondpad due to the high lateral electric field in
the silicon-nitride layer, see fig. 19. Consequently, the 500 V value in the measurements is reached at a distance doffset > 0 µm. Moreover, the effective
gate dielectric is thinner than assumed because the conductive silicon-nitride
is virtually no longer part of the gate dielectric. Thus, the use of equation
(7) for the calculation of surface potentials from the measured leakage currents,
see section 2.3.3.1, will result in too high values for the surface potential. For
small distances to the HV-bondpad, Vsurface is actually more or less equal to Vgate.
This explains the fact that surface potential values above the 500 V
stress voltage also occur in fig. 19. The above effect cannot be modelled analytically and requires the use of 3D-device simulations to take it properly into account. This is beyond the scope of this work. It should furthermore be noted that for large leakage
currents, and thus for high surface potentials, the inaccuracy in the derived surface potential values increases strongly, see fig. 12.
[Plot for fig. 19: surface potential (V, 0 to 600) versus distance to edge of HV-bondpad (µm, 0 to 1000); experimental points and model fit.]
Fig. 19: Steady-state surface potential induced by the 'charge-creep' effect as a
function of the distance to the HV bondpad during 500V SHTL stresses at
various temperatures ranging from 90°C to 150°C and for various plastic
moulding compounds. The dashed line is the model fit after replacing d
by d-doffset in equation (5), see text.
The significance of the silicon-nitride conductivity is illustrated in the following. The silicon-nitride conductivity is exponentially dependent on lateral field
strength as well as on temperature [13-15] and furthermore increases with increasing Si-content of the layer [13,14], see also fig. 20. The Si:N stoichiometric ratio
also determines the refractive index of the layer [13,14]. The silicon-nitride layer
in our experiments has a refractive index of 2.0 and at 20 °C its resistivity ρSiN
equals about 7⋅10^14 Ωcm at 20 V/µm lateral field strength, see fig. 20. This results
in a resistivity of about 4.2⋅10^10 Ωcm at 20 V/µm at 150 °C, using an activation
energy for the conductivity of about 0.8 eV [14].
At this operating condition, ρplastic/tplastic equals ≈ 5⋅10^14 Ω whereas ρSiN/tSiN
equals ≈ 2.3⋅10^14 Ω. The actual electric field strength in the silicon-nitride is determined by the surface potential at the SiN-plastic interface and is larger than 10
V/µm up to 200 µm from the edge of the HV-bondpad as shown in fig. 19. Thus,
the silicon-nitride will indeed be significantly more conductive than the moulding
plastic near the HV-bondpad and consequently the 500 V value in the measurements is reached at a distance doffset > 0 µm.
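The numbers in the preceding two paragraphs can be reproduced with a simple Arrhenius scaling. The sketch below assumes the resistivity scales as exp(Ea/kT) with a single activation energy and ignores the field dependence. The function and variable names are illustrative assumptions.

```python
import math

K_B_EV_PER_K = 8.617e-5  # Boltzmann constant in eV/K


def arrhenius_resistivity(rho_ref, t_ref_c, t_c, ea_ev):
    """Scale a resistivity from t_ref_c to t_c, assuming rho is proportional to exp(Ea/kT)."""
    t_ref_k = t_ref_c + 273.15
    t_k = t_c + 273.15
    return rho_ref * math.exp(ea_ev / K_B_EV_PER_K * (1.0 / t_k - 1.0 / t_ref_k))


# Silicon nitride: 7e14 ohm-cm at 20 degrees C and 20 V/um, Ea about 0.8 eV (from the text).
rho_sin_150 = arrhenius_resistivity(7e14, 20.0, 150.0, 0.8)

# Resistivity divided by layer thickness, both layers at 150 degrees C.
rho_over_t_sin = rho_sin_150 / 1.8e-4   # 1.8 um nitride, thickness in cm
rho_over_t_plastic = 6e13 / 0.12        # 6e13 ohm-cm plastic, 1.2 mm thick

print(f"rho_SiN(150 C)    = {rho_sin_150:.2e} ohm-cm")
print(f"rho/t for SiN     = {rho_over_t_sin:.2e} ohm")
print(f"rho/t for plastic = {rho_over_t_plastic:.2e} ohm")
```

Under these assumptions the nitride value of ρ/t comes out below that of the plastic, consistent with the comparison made in the text. At lateral fields above 20 V/µm the nitride resistivity drops further.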
[Plot for fig. 20: resistivity (Ohm.cm, logarithmic axis from 1E+08 to 1E+15) versus electric field (MV/cm, 0 to 7) for standard SiN (n = 2.00) and Si-rich SiN (n = 2.45).]
Fig. 20: Resistivity of a 200 nm thick PE-CVD deposited silicon nitride layer as a
function of the electric field at 20°C for various Si:N stoichiometric ratios
a) standard (refractive index n=2.0) and b) silicon rich (n=2.45) [S.
Evseev and G. Timan].
If we now replace 'd' in equation (5) by 'd - doffset' we obtain an excellent fit
between the data and the model, depicted by the dashed line in fig. 19. The fit
constants are Dbondpad = 24 ± 7 µm, dasym = 107 ± 6 µm and Dearthlane = 1024 ± 91
µm. The value of Dearthlane obtained from the model fit is in very good
agreement with the actual distance of ≈ 1100 µm on the test structure. The value of
Dbondpad corresponds reasonably well with the actual value of 40 µm.
2.3.4.2 Delay time
In order to compare the calculated and measured delay times quantitatively,
we again replace 'd' in equation (6) by 'd - doffset' and use the previously derived
fit constants for Dbondpad, dasym and Dearthlane. The result is shown in fig. 21 for a
stress at 130 °C. We find a good qualitative agreement between the calculated delay times and the ones determined from the leakage current graphs, taking into
account the previously discussed model limitations. The delay times determined
from the 63 % level of the surface potential graph are however significantly larger
than the calculated ones. This is most likely caused by the fact that during the
stress at 130 °C moisture already evaporates from the plastic. As this reduces
the ion mobility, the actual resistivity of the plastic mould compound will be larger than the values shown in fig. 6 that were used in the calculations. Note that
the data in fig. 6 are valid for moisture saturated compounds.
[Plot for fig. 21: delay time (s, 0 to 5000) versus distance from HV-bondpad (µm, 0 to 1000); series I = 10 nA, V = 63% and Calculated.]
Fig. 21: Comparison between measured and calculated delay times as a function
of the distance to the HV-bondpad during a 130 °C stress using two different delay time criteria: a) the time at which the leakage current exceeds
10 nA (see fig. 15b) and b) the time at which the surface potential equals
63 % of its steady-state value (see fig. 18).
2.3.5 Design rules
The model derived in section 2.3.4 can be used to derive design rules for the
safe distance of an active circuit element from the HV-bondpad in order to prevent
the occurrence of parasitic leakage currents for a given operating/use condition.
Assuming that the circuit design is robust against leakage currents smaller than 1 µA,
we find from fig. 12 that the maximum allowable surface potential equals about
50 V. Using equation (5) one can then calculate the corresponding safe distance
dsafe to the HV-bondpad. All devices located closer to the HV-bondpad than dsafe
should be protected by applying proper shielding measures like using field plates.
Field plates are metal or polysilicon plates that are placed on top of sensitive devices, thus shielding these from the high surface potentials at the silicon-nitride to
plastic interface. Note that sensitive devices located further away from the HV-bondpad can be left un-shielded. For a practical case where Dearthlane = 3 mm,
Dbondpad = 40 µm and Vbondpad = 400 V we find that dsafe ≈ 1.7 mm. It can be concluded that a significant fraction of the die area is sensitive to the 'charge-creep' effect. The problem can be alleviated somewhat by placing metal lines connected to
ground potential around the HV-bondpads while simultaneously establishing contact between these metal lines and the moulding plastic by locally removing the
silicon-nitride passivation on top of these lines.
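The safe distance follows from inverting equation (5) for d at the maximum allowable surface potential. A minimal sketch, assuming the unmodified form of equation (5) without the doffset correction; the function name is hypothetical:

```python
def safe_distance_um(v_max, v_bondpad, d_bondpad_um, d_earthlane_um):
    """Invert equation (5): distance from the bondpad edge at which Vsurface equals v_max.

    From ln(De / (d + Db)) = (v_max / v_bondpad) * ln(De / Db) it follows that
    d = De * (De / Db) ** (-v_max / v_bondpad) - Db.
    """
    ratio = d_earthlane_um / d_bondpad_um
    return d_earthlane_um * ratio ** (-v_max / v_bondpad) - d_bondpad_um


# Practical case from the text: Dearthlane = 3 mm, Dbondpad = 40 um,
# Vbondpad = 400 V and a maximum allowable surface potential of 50 V.
d_safe = safe_distance_um(50.0, 400.0, 40.0, 3000.0)
print(f"d_safe = {d_safe / 1000.0:.2f} mm")
```

For these inputs the result is about 1.7 mm, matching the value quoted above. Devices closer to the HV-bondpad than this would need shielding.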
2.4 CONCLUSIONS
Dominant failure modes in high power/high voltage (650 V) BCD technologies are threshold voltage instabilities of the lateral DMOS transistor due
to sodium ingression and parasitic leakage currents in low voltage devices induced
by high surface potentials originating from the high voltage devices ('charge-creep'). The threshold voltage instabilities can be prevented by improving the
sodium getter capabilities of the dielectric layers in the backend process and by
increasing the silicon nitride passivation thickness. The occurrence of parasitic
leakage currents appears to be strongly dependent on temperature, moisture
content of the plastic package, circuit layout and applied operating voltage. The
'charge-creep' effect can be modelled by describing the evolution of the surface
potential as a function of place and time by means of a lumped element RC-model. A good qualitative and a reasonable quantitative agreement between experimental data and model predictions is found. Using the model, design rules
that can be used to eliminate the 'charge-creep' effects in actual circuits have also been
derived.
2.5 REFERENCES
[1] T. Fujihara, Y. Yano, S. Obinata, N. Kumagai, K. Sakurai, “Proposal for
new interconnection technique for very high-voltage IC’s”, Journal of Applied Physics, Vol. 35, pp. 5655-5663, (1996)
[2] R.L. Hance, J.W. Miller, K. Erington, M.A. Chonko, “Mobile ion contamination in CMOS circuits”, International Reliability Physics Symposium
(IRPS) Tutorial, Topic 4, (1995)
[3] E.H. Nicollian, J.R. Brews, ‘MOS (Metal Oxide Semiconductor) physics and
technology’, Wiley, New York, (1982)
[4] R.D. Mosbarger, D.J. Hickey, “The effects of materials and post-mold profiles on plastic encapsulated integrated circuits”, Proceedings IRPS 1994, pp.
93-100, (1994)
[5] P.L. Hefley, J.W. McPherson, “The impact of an external sodium diffusion
source on the reliability of MOS circuitry”, Proceedings IRPS, pp. 167-172,
(1988)
[6] C. Hong, B. Henson, T. Scelsi, R. Hance, “An accelerated sodium resistance
test for IC passivation films”, Proceedings IRPS, pp. 318-325, (1995)
[7] J.P Stagg, Applied Physics Letters, no. 10, pp. 532, (1977)
[8] G. Greeuw, Thesis, University Groningen, (1984)
[9] R. McClelland, “Generic leakage plastic - recoverable, code B7a”, Philips
Semiconductors Failure Analysis Handbook, (1995)
[10] H.J. Bruggers, R.T.H. Rongen, C.P. Meeuwsen, A.W. Ludikhuize, ‘Reliability problems due to ionic conductivity of IC encapsulation materials under
high voltage conditions’, Proceedings International Symposium on Power
Semiconductor Devices (ISPSD), pp. 197-200, (1999)
[11] J. Frenkel, ‘On pre-breakdown phenomena in insulators and electronic semiconductors’, Physical Review, pp. 647, (1938)
[12] S.M. Sze, ‘Physics of semiconductor devices’, 2nd edition, John Wiley &
Sons, New York, (1981)
[13] J.W. Osenbach, W.R. Knolle, ‘Semi-insulating Silicon Nitride (SinSiN) as a
resistive field shield’, IEEE Transactions on Electron Devices, pp. 1522-1528, (1990)
[14] J.W. Osenbach, J.L. Zell, W.R. Knolle, L.J. Howard, ‘Electrical, physical and
chemical characteristics of plasma-assisted chemical-vapor deposited semi-insulating a-SiN:H and their use as a resistive shield for high voltage integrated circuits’, Journal Applied Physics, pp. 6830-6843, (1990)
[15] K. Matsuzaki, T. Horasawa, G. Tada, M. Saga, ‘Application of a semi-insulating amorphous hydrogenated silicon nitride film as a resistive field shield
and its reliability’, Journal Electrochemical Society, pp. 4296-4304, (1998)
[16] J.V. Dalton, J. Drobek, ‘Structure and sodium migration in silicon nitride
films’, Journal of the Electrochemical Society, pp. 865-868, (1968)
[17] R.C. Olberg, ‘The effects of epoxy encapsulant composition on semiconductor device stability’, Journal of the Electrochemical Society, pp. 129-133,
(1971)
[18] L.H. Kaplan, M.E. Lowe, ‘Phosphosilicate glass stabilization of MOS structures’, Journal of the Electrochemical Society, pp. 1649-1653, (1971)
[19] J.A. van der Pol, H.J. Gerritsen, R.T.H. Rongen, P.P.M.C. Groeneveld, P.W.
Ragay, H.A. van den Hurk, ‘Reliability issues in 650V high voltage Bipolar-CMOS-DMOS integrated circuits’, Microelectronics & Reliability, pp. 1723-1726, (1997)
[20] J.A. van der Pol, R.T.H. Rongen, H.J. Bruggers, ‘Modelling of surface potential induced leakage failures in high voltage integrated circuits and application to design rule derivation’, Submitted to ESREF2000 Conference.
3
Relation Between the Hot Carrier
Lifetime of Transistors and
CMOS SRAM Products [24]
3.1 Introduction
3.2 Experimental
3.3 Transistor and SRAM parameter degradation
3.4 Analysis and discussion of the SRAM parameter degradation
3.5 Relation between the transistor and SRAM hot carrier lifetime
3.6 Summary and conclusions
3.7 References
3.1 INTRODUCTION
Along with decreasing MOS transistor device geometries, hot carrier degradation of integrated circuits is becoming a more and more serious reliability problem. Under static operating conditions, the hot carrier lifetimes of transistors in
0.35 µm and 0.25 µm technologies, for example, hardly exceed a few months. It is
therefore increasingly important to determine the relation between the lifetimes of
stand-alone transistors and actual circuits. Many publications have appeared about
hot carrier effects in MOS transistors [1,2,16]. However, the relation to real products is not very clear. Moreover, apart from some circuit simulation work [3,4]
and a few experiments on circuits [5,6], hardly any data existed on this topic
before this work. This is mainly because the derivation of the circuit lifetime from transistor static stress results is complicated. Factors contributing to this
are the variety of transistor lifetime criteria in use (e.g. 100 mV threshold voltage
shift, 10% transconductance or 10% Id degradation), duty cycle effects, possible AC enhanced degradation effects [7,8], annealing effects [9,10] and the usually unknown sensitivity of the circuit performance to the transistor degradation.
Furthermore, it has been shown that the sensitivity may be supply voltage dependent [3].
In this chapter, a detailed experimental study into the relation between the
transistor and circuit hot carrier lifetime, carried out on 45 ns / 100 pF low power
64K (8K×8) full-CMOS static random access memories (SRAM), is described.
3.2 EXPERIMENTAL
3.2.1 SRAM circuit description
The full-CMOS 8K×8 SRAMs used in this study [11] feature a 6-transistor memory cell and are fabricated in a single poly, double metal 1.2 µm twin tub CMOS
process with p-epi on a p+ substrate. The 1.2 µm LDD n-channel and 1.4 µm conventional p-channel transistors have a minimum effective channel length Leff,min of
0.75 µm and have n+ and p+ doped polysilicon gates in the matrix respectively. All
periphery transistors have n+ doped gates. Gate oxide thickness and n+ and p+
source-drain junction depths equal 25 nm and 500 nm respectively. The devices
have a plasma-nitride passivation layer.
Fig. 1: Functional block diagram of the 8K×8 SRAM [11].
Fig. 1 shows a functional block diagram of the SRAM. The matrix is organised in 16 sections and each section consists of a cell array of 128 rows and 32 columns. The logic combination of the addresses activates one polysilicon wordline
WL within a section and, via pass gates, 8 cells from the total of 32 cells are selected for read or write operations (bytewide organisation). The cells on the remaining 24 columns are in a 'pseudo-read mode'; they also pass their data contents on
to the bitline pairs (columns) but stay isolated from the local section read and
write data buses. The minimum SRAM read and write cycle times, Trc,min and
Twc,min, are specified at 45 ns.
3.2.2 SRAM stress method and stress conditions
The full circuit stress consisted of dynamically operating the products at high
speed, at low ambient temperature and at high supply voltage, using a worst case
pattern. The read and write cycle times, Trc and Twc, equalled 60 ns. The ambient
temperature was set at -30 °C to diminish the effects of other degradation mechanisms (e.g. electromigration). Lowering the temperature also slightly accelerates
hot carrier degradation effects [12].
Five different supply voltages Vdd ranging from 7.5 V to 10 V were used during the stress of these nominal 5 V devices. All input voltages were raised to the
Vdd level to limit power dissipation in the input and output buffers. Under these
conditions, the junction temperature Tj of the devices equalled approximately 0°C.
Care was taken to avoid inductive ringing of the supply lines. Each I/O pin was
loaded by a 30 pF capacitor. In total 83 devices from 2 different batches were
stressed.
3.2.3 SRAM stress pattern
The stress pattern applied simply toggles between two complementary
addresses. It involves reading and writing of alternating data in two complementary
addresses (i.e. in two wordlines in different sections). The waveforms of the stress
pattern are depicted in fig. 2.
The aim of this 'Two Address Method (TAM)'-pattern is to focus the hot carrier stress on the memory datapath by maximising the stress duty cycle of all datapath elements (i.e. in- and output buffers, decoders, write and read bus drivers,
pass gates, memory cells, sense amplifiers etc., see fig. 1). Its advantage is that it
yields much more information in the same amount of stress time than a pattern
which accesses all cells; the stress duty cycle of e.g. the memory cell would be
1000 times smaller in case a linear scan pattern is used. A drawback of the TAM
pattern is that only a small fraction of the circuit transistors is stressed so the weakest circuit part from the design point of view may remain unstressed. However,
the addresses are chosen such that transistors in all distinct building blocks of the
memory are stressed. It is therefore a reasonable assumption that the results obtained are representative for the intrinsic circuit lifetime. Nevertheless, to determine
the influence of process related defects (e.g. particles in the polysilicon layer), or
to obtain statistical information on the hot carrier stress resistance of all cells, a
scan-type pattern should be used.
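The factor of about 1000 between the TAM pattern and a linear scan pattern can be checked with a short calculation. The Python sketch below is an illustration only. It assumes that a cell is stressed whenever its wordline is activated (in selected or in pseudo-read mode) and that a linear scan visits every byte address exactly once per pass; the variable names are not taken from the original work.

```python
from fractions import Fraction

# Matrix organisation as described in section 3.2.1
SECTIONS = 16
ROWS_PER_SECTION = 128       # one wordline per row
COLUMNS_PER_SECTION = 32     # cells connected to one wordline
BITS_PER_WORD = 8            # bytewide organisation

total_addresses = SECTIONS * ROWS_PER_SECTION * COLUMNS_PER_SECTION // BITS_PER_WORD
addresses_per_wordline = COLUMNS_PER_SECTION // BITS_PER_WORD

# TAM pattern: two wordlines alternate, so each one is active in half of the cycles.
duty_tam = Fraction(1, 2)

# Linear scan (assumption): a wordline is active only while one of its own
# byte addresses is being accessed.
duty_scan = Fraction(addresses_per_wordline, total_addresses)

ratio = duty_tam / duty_scan
print(f"byte addresses          : {total_addresses}")
print(f"duty cycle, TAM         : {float(duty_tam):.3f}")
print(f"duty cycle, linear scan : {float(duty_scan):.6f}")
print(f"TAM / scan ratio        : {float(ratio):.0f}")
```

Under these assumptions the ratio is of the order of 1000, in line with the figure quoted in the text.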
Fig. 2 : Schematic of the stress pattern waveforms, Twc= Trc= 60 ns.
3.2.4 Transistor stress conditions
The static transistor degradation experiments were carried out at 20°C ambient
temperature. The channel length L and width W of the devices equalled 1.2 µm
and 20 µm respectively. To check for narrow width effects, some devices with
W= 1.2 µm were also stressed. No difference in degradation was observed. The devices
were stressed at several Vds values, while Vgs equalled Vds/2 - 0.5 V (approximately
the maximum substrate current condition). During the stress the following parameters
were monitored: the threshold voltage Vt, defined as the Vgs value at which Ids
equals 0.01·W/L µA, the maximum transconductance βmax and the drain-source
current Ids at Vgs= 2.5 V, all in the linear region (Vds= 0.1 V) and in the forward mode (source and drain terminals the same as during the stress).
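The bias rule and the threshold voltage criterion above reduce to simple arithmetic, which the following Python sketch illustrates. The function names are introduced here for illustration, and the Vds values in the loop are hypothetical examples rather than the stress voltages actually used.

```python
def gate_stress_voltage(vds: float) -> float:
    """Gate-source stress voltage in V: Vgs = Vds/2 - 0.5 V
    (approximately the maximum substrate current condition)."""
    return vds / 2.0 - 0.5


def threshold_current_ua(width_um: float, length_um: float) -> float:
    """Drain-source current in uA that defines Vt: Ids = 0.01 * W/L uA."""
    return 0.01 * width_um / length_um


# Hypothetical drain stress voltages, for illustration only
for vds in (6.0, 7.0, 8.0):
    print(f"Vds = {vds:.1f} V -> Vgs = {gate_stress_voltage(vds):.2f} V")

# Threshold criterion for the wide and the narrow device (L = 1.2 um)
for width in (20.0, 1.2):
    ids = threshold_current_ua(width, 1.2)
    print(f"W/L = {width}/1.2 um -> Ids at Vt = {ids:.4f} uA")
```

Scaling the threshold current with W/L keeps the Vt definition comparable between the wide and the narrow devices.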
3.3 TRANSISTOR AND SRAM PARAMETER DEGRADATION
3.3.1 Transistor parameter degradation
A typical example of the transistor degradation is shown in fig. 3. The stress
conditions are shown in the figure. Note the super-linear time dependence of the
degradation. This is indicative of severe electron trapping [10].
Fig. 3: Degradation of the threshold voltage Vt, the maximum transconductance
βmax and the drain-source current Ids, all measured in the linear region, as
a function of time of a W/L= 20/1.2 µm n-channel transistor (Leff= 0.78
µm). The stress conditions are shown in the figure and the measurement
conditions are described in section 3.2.4.
3.3.2 SRAM parameter degradation
The circuit parameters were monitored during the stress by characterising the
devices at several readpoints at 20°C ambient temperature on a Teradyne J386A
memory tester. Under the stress conditions described above, changes in the electrical parameters occurred. Fig. 4 shows the increase of the address access time Taa as
a function of stress time (the chip enable access time Tac degraded in a similar
way). The increase can be clearly seen although only at high voltage levels and after long times. The most sensitive parameter, however, appears to be the minimum operating voltage Vdd,min as is shown in fig. 5. Large changes occur already
after short stress times. Note that the time at which the onset of the degradation
occurs is exponentially dependent on the supply voltage, which is indicative of hot
carrier degradation effects.
Along with Vdd,min, the write timing parameters, such as the minimum data-to-write-time overlap Tdw, measured at Vdd = 4.4 V, also degraded. Fig. 6 shows that
both changes are strongly correlated. This is because they result from the same degradation effect, as will be shown in section 3.4. No shift was observed in the write time parameters measured at Vdd = 5.6 V, which will also be explained in section 3.4. Furthermore, no shift of the DC SRAM parameters occurred.
Fig. 4: The address access time Taa at Vdd=4.4 V of batch 1 as function of stress
time for five different stress supply voltages Vdd. The n- and p-channel
transistors Leff equal 0.94 µm and 1.07 µm respectively. Stress conditions
as described in section 3.2.2.
Fig. 5: The minimum operating voltage Vdd,min of batch 1 as a function of stress
time for five different stress supply voltages Vdd. Leff and stress condition
as in fig. 4.
Fig. 6: Correlation between the degradation of the write pulse width Tdw, measured at Vdd= 4.4 V, and the minimum operating voltage Vdd,min of batch 1
and 2. Stress conditions as in fig. 4. The n- and p-channel Leff of batch 2
equal 0.85 µm and 0.97 µm respectively. The Tdw data-sheet specification
limit of 20 ns corresponds to a Vdd,min value of 3.5 to 4.0 V.
A strong saturation effect (and even recovery after long stress times) can be seen
in fig. 5. This could be caused by annealing effects, induced by detrapping of electrons from shallow trap levels in the gate oxide [13]. Note that strong electron
trapping occurred during the transistor stress. Electron detrapping will lead to a
recovery of the threshold voltage [9,10] and of the circuit parameter shifts [6].
Therefore, annealing effects have also been investigated.
Fig. 7 shows the recovery of Vdd,min during storage at various temperatures after the stress. Note that Vdd,min completely recovers to its 0 hour value (about 2.4
V) and that the recovery time constant τrecov (T) is strongly temperature dependent.
Fig. 13 shows that the Vdd,min shift is proportional to the Vt -shift of the lifetime limiting transistor. Therefore, the recovery activation energy Ea, which equals the
average electron trap level energy Etrap, can be calculated from the following simple model [10,13]:
$$\partial V_{dd,min}(t) = A \cdot \partial V_t(t) = B \cdot N_{trap}(t) = B \cdot N_{trap}(t=0) \cdot e^{-t/\tau_{recov}(T)} \qquad (1)$$
where A and B are constants, Ntrap the number of trapped electrons/cm² and τrecov
(T) the recovery time constant in hrs. Using equation (1), τrecov (T) can be calculated from fig. 7. By plotting the results in an Arrhenius plot, see fig. 8, we find that
Etrap and Ea equal 1.03 ± 0.16 eV. At 20 °C, τrecov (T) equals 1300 hrs, so at 0 °C
(the junction temperature of the SRAM during stress) τrecov (T) equals about 3 years. Therefore we conclude that annealing effects are negligible at -30 °C ambient temperature. In section 3.4 it will
be shown that the saturation and recovery after long stress times is caused by a
compensating degradation effect.
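The extrapolation of the recovery time constant can be illustrated with the Python sketch below. It is not the original analysis: it inverts equation (1) for a single pair of readings and then scales the time constant with a plain Arrhenius law. It assumes that the 1300 hrs reference value belongs to 20 °C, which is the reading for which Ea = 1.03 eV reproduces the quoted value of about 3 years at 0 °C. The function names are introduced here for illustration.

```python
import math

K_BOLTZMANN_EV = 8.617333e-5  # Boltzmann constant in eV/K


def tau_from_recovery(shift_start: float, shift_now: float, elapsed_hours: float) -> float:
    """Invert equation (1), shift(t) = shift(0) * exp(-t / tau), for tau in hours."""
    return -elapsed_hours / math.log(shift_now / shift_start)


def arrhenius_scale(tau_ref_hours: float, t_ref_c: float, t_target_c: float, ea_ev: float) -> float:
    """Scale a time constant between two temperatures, tau ~ exp(Ea / (k*T))."""
    t_ref = t_ref_c + 273.15
    t_target = t_target_c + 273.15
    exponent = ea_ev / K_BOLTZMANN_EV * (1.0 / t_target - 1.0 / t_ref)
    return tau_ref_hours * math.exp(exponent)


# Assumed reference point: 1300 hrs at 20 degC, Ea = 1.03 eV, target Tj = 0 degC
tau_0c_hours = arrhenius_scale(1300.0, 20.0, 0.0, 1.03)
tau_0c_years = tau_0c_hours / (24 * 365.25)
print(f"tau_recov at 0 degC: {tau_0c_hours:.0f} hrs = {tau_0c_years:.1f} years")
```

With the ± 0.16 eV uncertainty on Ea the extrapolated value shifts noticeably, so the result should be read as an order-of-magnitude estimate.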
Fig. 7: Recovery of the minimum operating voltage Vdd,min after the stress during
storage at various temperatures.
Fig. 8: Arrhenius plot of the Vdd,min recovery time constants, Ea = 1.03 ± 0.16 eV.
Similar recovery effects were observed in the write time parameters, consistent
with the fact that they are strongly correlated to Vdd,min. No significant recovery of
the access times was observed.
3.4 ANALYSIS AND DISCUSSION OF THE SRAM PARAMETER
DEGRADATION
In this section the origin of the circuit degradation is investigated. The location of the damage is determined by electrical analysis, which is confirmed by circuit simulation, voltage micro probing and photoemission microscopy. Finally it is
verified whether the lifetime limiting transistor can explain the observed parameter shifts quantitatively.
3.4.1 Localisation of the circuit damage
Detailed electrical analysis of the circuit degradation by means of bitmapping
techniques on a memory tester showed that the access time degradation occurred
in all the cells in the two sections containing the two stressed wordlines, see fig. 1.
This reveals that the hot carrier damage is localised in the sense amplifiers and/or
read bus drivers. Bitmapping further showed that the Vdd,min and write time degradation were associated with a write problem; while reading was possible down to
Vdd = 2.4 V (equal to the 0 hour Vdd,min value, see fig. 5), for writing a higher supply voltage was required. The write problem appeared to be solely confined to the
64 cells connected to the two stressed wordlines, see section 3.2. Apparently, the
damage was localised in the memory cells themselves. A schematic of the SRAM
cell and bitline circuit is depicted in fig. 9.
The further analysis was concentrated on the Vdd,min and write time degradation, because these determine the circuit lifetime (see section 3.5).
3.4.2 Stress on the memory cell transistors
Fig. 10a and 10b show the Vds and Vgs voltages of the 6 memory cell transistors
during a typical read and write cycle at Vdd= 5.5 V. It can be clearly seen that the
access transistor T2 (see fig. 9) connected to the ‘0’-node of the memory cell suffers the largest stress, as was also shown by Sakurai et al. [14].
Whenever the wordline WL is activated at the start of a read or write cycle
(denoted by 'A' in fig. 10b), the Vgs of T2 is swept from 0 V to Vdd while its Vds
still (almost) equals Vdd due to the large bitline capacitance. The stress on the
other cell transistors is much less. Firstly, their maximum Vds is smaller due to the
voltage drop across the diode, see fig. 9. Secondly, during reading no large Vgs
transients occur. Thirdly, when Vgs transients do occur during writing, their Vds
has already dropped significantly due to the small cell node capacitances. Note
however that at elevated supply voltages also the driver transistor D1 will be stressed, because during reading (denoted by 'B' in fig. 10a) its Vgs will have risen to
above its Vt (about 0.8 V). At Vdd = 5.5 V, Vgs will remain smaller than Vt.
Fig. 9: Schematic of the SRAM cell and bitline circuitry. D1 and D2 are the n-channel driver transistors (W/L= 2.0/1.2 µm), L1 and L2 the p-channel
load transistors (W/L= 1.0/1.4 µm), T1 and T2 the n-channel access transistors (W/L= 1.2/1.2 µm) and C1 and C2 the p-channel bitline load transistors. The diode results from the use of n+ and p+ doped polysilicon gates. The black dots indicate the degradation sites.
Fig. 10: Circuit simulation showing the drain-source Vds and gate-source Vgs voltages of the 6 memory cell transistors during a typical read and a write
cycle at Vdd= 5.5 V. During writing a '1' is written in node 2 (see fig. 9)
which initially, and during reading, contained a ‘0’. Due to the delay of
the write signals, a write cycle starts with a read operation. T2 and D1
are stressed during the periods indicated by 'A' and 'B' respectively.
3.4.3 Photo-emission recording of the memory array
The above reasoning was verified by means of photoemission microscopy [15].
Fig. 11a shows the layout of the cell [11] and fig. 11b shows a photo-emission recording of the memory array under dynamic operation at Vdd= 8.5 V. 'Data 1' was
written into the cells so node 2, see fig. 9, contained a ‘0’. The access transistor
T2 and driver transistor D1 emit light, as expected from the previous section. This
is indicative of severe hot carrier degradation effects. No light was seen when the
part was not toggled, so punch-through effects are negligible and the emitted light
is undoubtedly associated with hot carrier effects.
Note that the access transistors are completely covered by the bitlines (in second metal). Due to multiple interference, this results in two light spots on both
sides of the bitline at the location of the T2. The PMOS transistors emitted no
light.
Fig. 11: Layout of the memory cell showing the location of the transistors depicted in fig. 9 (a) and photo-emission recording of the memory array (b).
The SRAM was operated in the read mode at Vdd= 8.5 V, Trc= 250 ns
and with 'data 1' written into the cells. The integration time equalled
2000 s, using a Hamamatsu C3230-01 photo-emission camera. The light
emitted by the access transistors T2 and driver transistors D1 can be
clearly seen.
3.4.4 Voltage micro-probing of the memory cell transistors
Further confirmation of the above degradation mechanism was obtained by
voltage microprobing of degraded and non-degraded cell transistors. Their bitlines
were isolated by means of laser cutting. Fig. 12 shows the results after 250 hrs
stress at Vdd= 9 V, corresponding to a 4.0 V Vdd,min value. The measurement conditions are shown in the figure. Curve A (series combination of the access and
load transistor) clearly shows a positive 1.25 V Vt -shift of the access transistor
when operated in the reverse mode (source and drain terminals interchanged with
respect to the stress). Back-bias effects have been taken into account. The Vt -shift
leads to a reduction of the saturated drain-source current by more than a factor of
2. Curve B (series combination of the driver and access transistor) shows that,
even in the forward mode, the current drive capability has decreased by almost a
factor 2 in the linear region. However, as is well known, the effects are less severe
in the saturated region.
The above is a clear example of the fact that especially ‘pass’-type transistors
(that are operated with both Vds polarities) are sensitive to hot carrier degradation
in circuits. This is because the saturation current decrease is much larger when the transistor is operated in the reverse mode than in the forward mode. In contrast, ‘inverter’-type
transistors are only operated in the forward mode and consequently these can
withstand much more hot carrier damage at the drain side before a significant decrease of the saturation current occurs.
Fig. 12: Micro-probing results of the cell transistors with isolated bitlines after a
250 hrs stress at Vdd= 9 V. The measurement conditions are shown in the
figure. The thick and thin lines respectively represent measurements on
five non-degraded and on one degraded cell. Curve A reveals a 1.25 V
positive Vt -shift of the access transistor.
3.4.5 Verification of the SRAM parameter degradation
In contrast to reading, during writing the access transistor operates in the
strongly degraded reverse mode. Hence, especially the memory write parameters
will be affected. We will explain this by a qualitative reasoning followed by a
quantitative verification.
Writing of a '0' at a node that initially contains a ‘1’ occurs if this node, which
is connected to the gate of the opposite driver (see fig. 9), is pulled low enough.
The opposite driver will then be shut off and the cell will switch. However, degradation
of the access transistor leads to an increase of the minimum node voltage that can
be attained, as this value is determined by the voltage division between the access
and load transistor. In the end, the opposite driver will remain conducting and the cell
cannot be switched any more without increasing the supply voltage Vdd. This explains not only the Vdd,min degradation but also its recovery after long stress times,
see fig. 5. As was shown in fig. 10 and 11, in the end the driver transistors
will also degrade and their Vt will increase. As a result, they can be shut off more easily
and the cell can be switched, and thus written, at a lower Vdd.
Circuit simulations showed that the write time degradation is caused by the
presence of a meta-stable state in the degraded cell during writing. The occurrence of this state depends strongly on the supply voltage; it only occurs when the Vdd,min of the cell is close to the operating voltage, see fig. 6 and 15.
Therefore no write time degradation was measured at Vdd= 5.6 V, see section 3.3.
The asymptote at Vdd,min= 4.4 V in fig. 6 can be easily explained. When Vdd,min
equals the operating voltage, the cell cannot be written any more and thus the Tdw
value will approach infinity.
The above has been verified quantitatively as well. Fig. 13 and 14 show the
calculated increase of Vdd,min and Tdw respectively as a function of the Vt -shift of
the access transistor. Their correlation is shown in fig. 15. The transconductance
β, normalised to its 0-hour value, is used as a parameter.
The simulations are in good agreement with the measured values. Fig. 3 shows
that in case of a 1.25 V Vt -shift, the β degradation equals about 30%. From fig.
13 it can be seen that the corresponding Vdd,min equals 3.7 V, compared to the measured value of 4.0 V (fig. 12). Furthermore, the simulated Tdw - Vdd,min correlation
in fig. 15 fits well to the measured values in fig. 6. This proves that the large
Vdd,min and Tdw increase can indeed quantitatively be explained by the measured Vt
-shift.
We conclude that the damage in the memory cell is primarily located in the
access transistor and that it is the lifetime limiting transistor. This explains why
not only the two stressed bytes in the product, but also the other cells connected to
the same wordlines, see fig. 1, degraded. These cells are in the 'pseudo read mode'
when their wordline is activated, see section 3.2.1, so their access transistors suffer
the same stress as those of the two stressed bytes.
Fig. 13: Circuit simulations showing the minimum operating voltage Vdd,min as a
function of the Vt -shift of the access transistor. The transconductance β,
normalised to its 0-hour value, is used as a parameter.
Fig. 14: Circuit simulations showing the minimum write pulse width Tdw as a
function of the Vt -shift of the access transistor. The transconductance β,
normalised to its 0-hour value, is used as a parameter.
Fig. 15: Circuit simulations showing the correlation between the minimum operating voltage Vdd,min and the minimum write pulse width Tdw. The transconductance β, normalised to its 0-hour value, is used as a parameter.
3.5 RELATION BETWEEN THE TRANSISTOR AND SRAM HOT
CARRIER LIFETIME
Two different criteria were used to define the transistor lifetime; a 100 mV Vt shift and a 10% βmax degradation. The SRAM lifetime was determined by the first
parameter that ran out of its datasheet specification. One might argue that the 'distance' to the specification limit is different for each product batch, and therefore a
10% degradation of any parameter would be a more appropriate lifetime criterion.
However, the fact that the parameters of a batch with a short channel length will
be further away from the specification limit, will probably be compensated by the
fact that they will degrade faster. We therefore will use the datasheet related
lifetime criterion. The Tdw write time parameter (specification limit 20 ns), see
section 3.3, appears to be the lifetime limiting parameter. Therefore, a 20 ns Tdw
-value was used as the SRAM lifetime criterion, which, as can be seen from fig. 6,
corresponds to a Vdd,min value of 3.5 to 4.0 V.
In order to allow a comparison, both the transistor and product lifetimes were
extrapolated to minimum Leff (0.75 µm), minimum junction temperature Tj (0 °C)
and, in case of the product, to maximum operating frequency (Twc,min=Trc,min= 45
ns), using equation (2):
$$T_{lf} = A \cdot e^{B/V_{dd}} \cdot e^{-C/L_{eff}} \cdot e^{q \cdot E_a/(k \cdot T)} \cdot \frac{T_{rc}}{T_{rc,min}} \quad [\mathrm{s}] \qquad (2)$$
where A, B, C and Ea are constants. The activation energy Ea= -0.04 eV [12]. The
channel length acceleration factor C is only technology dependent and equals 4.4
µm for our technology. Constant A and the voltage acceleration factor B can also
be product type dependent.
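How equation (2) is used to bring lifetimes to a common reference condition can be illustrated with the Python sketch below. It computes only the ratio between two conditions, so the constant A cancels. The values C = 4.4 µm and Ea = -0.04 eV are those given above, and B = 165 V is the slope reported for fig. 16. The function name, the argument names and the chosen example conditions are assumptions made for this illustration.

```python
import math

K_BOLTZMANN_EV = 8.617333e-5  # Boltzmann constant in eV/K


def lifetime_scale_factor(vdd_from=5.5, vdd_to=5.5,
                          leff_from_um=0.75, leff_to_um=0.75,
                          tj_from_c=0.0, tj_to_c=0.0,
                          trc_from_ns=45.0, trc_to_ns=45.0,
                          b_volt=165.0, c_um=4.4, ea_ev=-0.04):
    """Ratio T_lf(to) / T_lf(from) according to equation (2)."""
    voltage = math.exp(b_volt * (1.0 / vdd_to - 1.0 / vdd_from))
    length = math.exp(-c_um * (1.0 / leff_to_um - 1.0 / leff_from_um))
    t_from = tj_from_c + 273.15
    t_to = tj_to_c + 273.15
    temperature = math.exp(ea_ev / K_BOLTZMANN_EV * (1.0 / t_to - 1.0 / t_from))
    cycle = trc_to_ns / trc_from_ns
    return voltage * length * temperature * cycle


# Example 1: voltage acceleration between a 10 V stress and 5.5 V operation
print(f"{lifetime_scale_factor(vdd_from=10.0, vdd_to=5.5):.3g}")

# Example 2: Leff = 0.85 um at Trc = 60 ns scaled to Leff,min = 0.75 um
# and Trc,min = 45 ns (same Vdd and Tj)
print(f"{lifetime_scale_factor(leff_from_um=0.85, leff_to_um=0.75, trc_from_ns=60.0, trc_to_ns=45.0):.3f}")
```

Because of the exponential dependence on B/Vdd, the ± 10 V uncertainty on B dominates the uncertainty of any extrapolation from stress to operating voltage.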
In fig. 16 the resulting hot carrier lifetimes of two SRAM batches and the 1.2
µm n-channel transistors are plotted as a function of 1/Vdd. It clearly shows that the
slopes of the product and transistor lifetime data are the same (B= 165±10 V), indicating a similar degradation mechanism. However, the SRAM lifetime is significantly larger than the transistor lifetime and equals 1250 years at Vdd = 5.5 V.
This corresponds to a bit lifetime of 4.4⋅10^17 read or write cycles. The product-to-transistor lifetime ratio is determined by comparing curve 2 and curve 4, because
only batch 2 was processed using the same flow as the transistors. A lifetime ratio
of about a factor 50 results.
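The conversion from the SRAM lifetime in years to a number of bit access cycles can be reproduced with the Python sketch below. It assumes operation at the minimum cycle time of 45 ns and that, as in the two-address pattern, each of the two addresses is accessed every second cycle; this factor of two is an interpretation that reproduces the quoted value of about 4.4·10^17 and is not stated explicitly in the text.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def bit_access_cycles(lifetime_years: float, cycle_time_s: float,
                      addresses_in_pattern: int = 2) -> float:
    """Number of read or write cycles seen by one bit during the lifetime."""
    total_cycles = lifetime_years * SECONDS_PER_YEAR / cycle_time_s
    return total_cycles / addresses_in_pattern


cycles = bit_access_cycles(1250.0, 45e-9)
print(f"bit lifetime: {cycles:.2e} cycles")
```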
Fig. 16: Hot carrier lifetime of 1.2 µm n-channel transistors, stressed at maximum substrate current Isub (curves 1 and 2) and of two 8K×8 SRAM batches (curves 3 and 4) as a function of 1/Vdd, using respectively a 10% β,
a 100 mV Vt -shift and a 20 ns Tdw lifetime criterion. All lifetimes have
been extrapolated to Leff= 0.75 µm, Tj= 0 °C and maximum operating
frequency. The SRAM lifetime at Vdd= 5.5 V equals 300 to 1250 years,
compared to a transistor lifetime of 17 years (Vt -criterion).
The lifetime discrepancy is in the first place caused by the small sensitivity of
the performance of this particular product to the transistor degradation. In section
3.4 it was shown that the Vt -shift of the access transistor needed to reach the
SRAM end-of-life is 10 times larger, i.e. 1.0 V, than the 100 mV value used to define the transistor end-of-life. Fig. 3 shows that this accounts for about a factor 5
in the product - transistor lifetime ratio. The second major contribution to the
lifetime discrepancy is duty cycle effects. Although compensated by AC enhanced
degradation effects, they apparently account for the remaining factor 10. The influence of AC enhanced degradation effects is thus limited.
3.6 SUMMARY AND CONCLUSIONS
Hot carrier degradation of full-CMOS SRAMs results in an increase of the minimum operating voltage, the write time parameters and the access times. The degradation of the first two parameters can be traced back to the degradation of only
one transistor, namely the access transistor of the memory cell. This is a ‘pass’-type
transistor that is operated with both source-drain voltage polarities. This type of
operation makes the transistor much more sensitive to the effects of hot carrier degradation than in the case of operation in the ‘inverter’-type mode because for a
given level of hot carrier damage, the saturation current decrease is much larger
in the reverse mode (source-drain terminals interchanged with respect to the stress
situation) than in the forward mode.
Circuit simulations show that the observed circuit parameter shifts can quantitatively be explained by this failure mode. The relation between the hot carrier lifetime of static stressed transistors and the SRAM products has been established.
Although their supply voltage dependence is the same, the product lifetime appears to be about a factor 50 larger than the transistor lifetime. This discrepancy
can be accounted for by the small sensitivity of the SRAM to the transistor degradation and by duty cycle effects.
In conclusion, product lifetimes are severely underestimated if they are
straightforwardly derived from static transistor lifetime data. This finding can be
applied during product design, for example by eliminating the need for cascoding
of transistors at critical locations. In this way, shorter memory access times
and higher maximum operating frequencies of microprocessors become possible, as well as more aggressive scaling of process technologies, without jeopardising the product reliability.
True product lifetimes can best be obtained from stressing the product or by simulation of the degradation effects in the circuit. In addition to this work [24],
numerous papers have appeared dealing with both simulation [16-20] and the product stress issues [20-23]. Of course transistor stresses remain important for process evaluation, optimisation and monitoring.
3.7 REFERENCES
[1] C. Hu, S.C. Tam, F.-C. Hsu, P.K. Ko, T.Y. Chan, K.W. Terrill, 'Hot-electron
induced MOSFET degradation - model, monitor and improvement', IEEE
Transactions on Electron Devices, vol. 32, pp. 375-385, (1985)
[2] P. Heremans, R. Bellens, G. Groeseneken, H.E. Maes, 'Consistent model for
the hot-carrier degradation in n-channel and p-channel MOSFET's', IEEE
Transactions on Electron Devices, vol. 35, pp. 2194-2209, (1988)
[3] S. Aur, D.E. Hocevar, P. Yang, 'Circuit hot electron effect simulation', IEDM
Technical Digest, pp. 498-501, (1987)
[4] P.M. Lee, M.M. Kuo, K. Seki, P.K. Ko, C. Hu, 'Circuit aging simulator
(CAS)', IEDM Technical Digest, pp. 134-137, (1988)
[5] C. Duvvury, D. Redwine, H. Kitagawa, R. Ham, Y. Chuang, C. Beydler, A.
Hyslop, 'Impact of hot carriers on DRAM circuits', Proceedings International
Reliability Physics Symposium, pp. 201-206, (1987)
[6] M. Matsumoto, Y. Kimura, K. Hirayama, H. Koyama, N. Maki, H. Matsumoto, 'Degradation mechanism due to hot electron trapping in high density
CMOS SRAM', Proceedings International Symposium on Testing and Failure
Analysis, pp. 89-94, (1988)
[7] W. Weber, ‘Dynamic stress experiments for understanding hot carrier degradation phenomena', IEEE Transactions on Electron Devices, vol. 35, pp.
1476-1486, (1988)
[8] W. Weber, I. Borchert, 'Hot-hole and electron effects in dynamically stressed
n-MOSFETs', Proceedings ESSDERC, pp. 719-722, (1989)
[9] A.G. Sabnis, J.T. Nelson, 'A physical model for degradation of DRAMS during accelerated stress aging', Proceedings International Reliability Physics
Symposium, pp. 90-95, (1983)
[10] R. Annunziata, G. Dalla Libera, E. Ghio, A. Maggis, 'Annealing of hot carrier damaged double metal MOSFET's', Proceedings ESSDERC, pp. 715-718,
(1989)
[11] W.C.H. Gubbels, C.D. Hartgring, R.H.W. Salters, J.A.M. Lammerts, M.J.
Tooher, P.F.P.C. Hens, J.J.J. Bastiaens, J.M.F. van Dijk, M.A. Sprokel, 'A
40-ns/100-pF low-power full-CMOS 256K (32Kx8) SRAM', IEEE Journal of
Solid State Circuits, vol. 22, pp. 741-747, (1987)
[12] C. Yao, J. Tzou, R. Cheung, H. Chan, 'Temperature dependence of CMOS
device reliability', Proceedings International Reliability Physics Symposium,
pp. 175-182, (1986)
[13] R. Mahnkopf, G. Przyrembel, H.G. Wagemann, 'Annealing of hot carrier-induced MOSFET degradation', Journal de Physique, Coll. C4, pp. 771-774,
(1988)
[14] T. Sakurai, M. Kakumu, T. Iizuka, 'Hot carrier suppressed VLSI with submicrometer geometry', International Solid State Circuits Conference, pp.
272-273, (1985)
[15] N. Khurana, C-L. Chiang, 'Analysis of product hot electron problems by gated emission microscopy', Proceedings International Reliability Physics Symposium, pp. 189-194, (1986)
[16] H. Wang, H. De, R. Lahri, D. Haueisen, ‘Improving hot-electron reliability
through circuit analysis and design’, Proceedings International Reliability
Physics Symposium, pp. 107-111, (1991)
[17] K.N. Quader, P. Fang, J. Yue, P.K. Ko, C. Hu, ‘Simulation of CMOS circuit
degradation due to hot carrier effects’, Proceedings International Reliability
Physics Symposium, pp. 16-23, (1992)
[18] M. Pagey, R.J. Milanowski, E.S. Snyder, N. Bui, B. Deem, B. Bhuva, S.
Kerns, ‘Unified model for n-channel hot carrier degradation under different
degradation mechanisms’, Proceedings International Reliability Physics
Symposium, pp. 289-293, (1996)
[19] P.-C. Li, G.I. Stamoulis, I.N. Hajj, ‘I-Probe-d: a hot carrier and oxide reliability simulator’, Proceedings International Reliability Physics Symposium, pp.
274-279, (1994)
[20] R. Bellens, I. Clemminck, K. Van Doorselaer, ‘Building-in reliability during
library development: hot carrier degradation is no longer a problem of the
technologists only!’, Microelectronics & Reliability, pp. 1425-1428, (1997)
[21] C. Jiang, E. Johnson, J.J. Shaw, C. Hu, ‘AC hot carrier degradation in a voltage controlled oscillator’, Proceedings International Reliability Physics Symposium, pp. 53-56, (1993)
[22] Y. Huh, D. Yang, H. Shin, Y. Sung, ‘Hot carrier induced circuit degradation
in actual DRAM’, Proceedings International Reliability Physics Symposium,
pp. 72-75, (1995)
[23] R. Bellens, ‘Hot carrier degradation in sub-micron CMOS technologies: problems and possible solutions’, Tutorial IRPS, (1998)
[24] J.A. van der Pol, J. Koomen, ‘Relation between the hot carrier lifetime of
transistors and CMOS SRAM products’, Proceedings International Reliability Physics Symposium, pp. 178-185, (1990)
4
Systematic Derivation of Latchup Design
Rules for Submicron CMOS Processes
from Test Structures [7]
4.1 Introduction
4.2 Latchup susceptibility reduction options
4.3 Design rule derivation approach
4.4 Application: design rule derivation for a CMOS process on p-/p++ epitaxial substrates
4.4.1 Impact P+ substrate contact placement and P+ guardrings
4.4.2 Impact Nwell contact placement and N+/Nwell guardrings
4.4.3 Process specific design rules
4.5 Conclusions
4.6 References
4.1 INTRODUCTION
Latchup is an intrinsic reliability risk for CMOS processes [1,2] due to the presence of built-in parasitic thyristors in this technology. These thyristors can switch
into an unwanted high current mode, called latchup, due to external disturbances
such as voltage spikes. This high current mode can only be switched off by disconnecting the supply voltage. So, when built into a system, the occurrence of
latchup can easily damage the circuit and cause a system malfunction. Thus each
CMOS circuit requires some sort of latchup protection to prevent randomly occurring failures.
No clear approach has been published in literature for the derivation of latchup design rules. Up to now, it has been regarded as a kind of ‘art’ and a significant amount of trial-and-error has been associated with circuit design and product
development. In this work, however, a consistent approach is demonstrated that
allows latchup design rules to be derived from simple test structures and that is applicable to any CMOS technology [7]. It thus helps to eliminate the ‘black
magic’ atmosphere around this phenomenon.
In a CMOS process using p-type substrates, the thyristors consist of lateral
NPN and vertical PNP transistors, see fig. 1.
[Figure 1: schematic cross section with labels P-emitter, N-emitter, substrate contact, Nwell contact, LOCOS, p- epi, p++ bulk, Nwell, L-NPN, V-PNP, Rbase p-, Rbase p++, RbaseNw and dimensions A, B and C.]
Fig. 1: Schematic view of a latchup test structure showing the parasitic bipolar
transistors and base resistances, the relevant design rules (A: n+p+-spacing, B: distance N-well contact to P+ emitter and C: distance P+ substrate
contact to N+-emitter) and the N- and P-emitters.
As for new generation submicron processes the n+p+-spacing (that determines
the base-width and thus the gain of the parasitic bipolar transistors) continues to
shrink, the latchup susceptibility increases strongly. In principle, latchup free products can be obtained by ensuring that the thyristor holding voltage Vh is larger
than the supply voltage Vdd. This appears however to be impractical for submicron
processes as Vh > Vdd is only achieved for n+p+ spacings significantly larger than
the minimum design rules, even if latchup robust p-/p++ epitaxial substrates are
used [3,4]. This is illustrated by the data in fig.2 from two 5 V 0.7 µm twin well
CMOS processes from two waferfabs using 10 Ωcm p- epi on a 0.01 Ωcm p++ substrate.
Process A has a significantly higher temperature budget than process B1 (the
alignment markers are made by a double LOCOS oxidation instead of by a silicon
dry etch) which results in an about 1.5 µm thinner remaining epi thickness at the
end of the process for process A due to diffusion of the p++ bulk into the p- epi layer. This explains why the 9 µm epi Vh data of process A correspond to those of
7.5 µm epi of process B1. Fig. 2 shows that by going to very thin epi layers, Vh
approaches Vdd but this option is limited by P+/Nwell/p++ substrate punch-through
and Rsheet Nwell requirements, see e.g. fig. 3. Thus there is a clear need for proper
design rules to obtain latchup robust products. In this paper a systematic method
is proposed for the derivation of those design rules from test structures. This
method is illustrated using data from one of the above CMOS processes on epitaxial substrates. However, it is also applicable for processes on p- bulk substrates
with or without buried layers.
[Figure 2 plot: holding voltage Vhold [V] versus n+p+ spacing [µm]; curves for 5, 7, 8, 9 and 12 µm epi (process A) and 7.5 µm epi (process B1); the 5.5 V supply voltage is marked.]
Fig. 2: Holding voltage at 125 °C as a function of n+p+ spacing A for various epitaxial layer thicknesses and two different 0.7 µm processes (A and B1)
from two waferfabs.
[Figure 3 plot: Rsheet N-well [kOhm/sq] versus epi thickness [µm]; the maximum value (analog design constraint) is marked.]
Fig. 3: Nwell sheet resistance as a function of epitaxial layer thickness for process
A. For small thicknesses the p++ doped substrate overdopes the Nwell resulting in an increase of the sheet resistance. Analog design constraints determine the maximum allowable Rsheet Nwell.
4.2 LATCHUP SUSCEPTIBILITY REDUCTION OPTIONS
Facing the fact that Vh<Vdd, there are basically only three options to improve
latchup robustness. First, one can increase the thyristor P- and N-emitter trigger
currents Itrig,p/n (hole and electron injection respectively) by reducing the gain of
the parasitic bipolar transistors and/or their base resistances by means of process
as well as design options, see table 1. Second, one can reduce the number of carriers reaching the thyristor after injection at the product I/O-bondpads by applying
P+ or N+/Nwell guardrings, see e.g. fig. 4.
Process
option
p- epi layer on
p++ substrate
N / Pwell dose
n++/p++ buried
layers
Silicidation
Design option
Effect on gain bipolar transistors
-
n+p+-spacing
n+p+-spacing
partitioning
Placement of Nwell /
substrate contacts
Yes
Yes (VPNP only)
Effect on base
resistance
Yes (L-NPN
only)
Yes
Yes
Yes
Yes
Yes
-
-
Yes
Table 1: Impact of process and design options on gain and base resistance of L-NPN and V-PNP parasitic bipolar transistors, assuming a p--type substrate.
[Figure 4: schematic cross section showing the n+-injector, the n+/Nwell guardring and the n+/Nwell collector in the p- epi layer on the p++ bulk, with dimensions A and B.]
Fig. 4: Schematic view of a N+/Nwell guardring test structure showing the relevant design rules (A: distance guardring to N+-injector and B: guardring
width)
Finally the injection points can be placed at a larger distance d from the thyristors which also reduces the current density at the location of the thyristors due to
geometrical spreading of the injected carriers. In case of an injecting diffusion
with length L, the decrease of the current density with distance d has been calculated for the two extreme cases shown in fig. 5a and 5b. Assuming a 2D-spreading
of the injected current, we derive equations (1) and (2) respectively for the current density reduction factors Fspread. Note that this is a worst case approach, as in
reality near-3D spreading will occur, resulting in even lower values of Fspread.
[Figure 5: (a) circuit at distance d perpendicular to an injector (ESD protection) of length L, with injecting element dx and angle α; (b) circuit at distance d parallel to the injector, with injecting element dx.]
Fig. 5: Latchup sensitive circuit located a) perpendicular and b) parallel at a distance d from a uniformly injecting diffusion with length L (e.g. an ESD
protection).
Circuit located perpendicular to injector with length L:
F_{spread} = \frac{1}{\pi} \left[ \ln\left(\tan\frac{\pi}{4}\right) - \ln\left(\tan\left(\frac{1}{2}\arctan\frac{2d}{L}\right)\right) \right]    (1)
Circuit located parallel to injector with length L:
F_{spread} = \frac{1}{2\pi} \ln\left(\frac{d+L}{d}\right)    (2)
Fig. 6 shows some numerical values for an injector with length L=100 µm (typical length of an ESD protection that acts as injecting diffusion). We see that there are
only minor differences between the extreme cases for longer distances d and that
the current density decrease in general can be reasonably approximated as proportional to ≈ ln(1+L/d).
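As an illustration, the two spreading factors can be evaluated numerically. The sketch below assumes the reading of equations (1) and (2) in which the perpendicular case is (1/π)[ln tan(π/4) − ln tan(½ arctan(2d/L))] and the parallel case is (1/2π) ln((d+L)/d); the function names and the chosen distances are illustrative and not taken from the original work.

```python
import math


def f_spread_perpendicular(d_um: float, length_um: float) -> float:
    """Equation (1): circuit perpendicular to an injector of length L."""
    half_angle = 0.5 * math.atan(2.0 * d_um / length_um)
    return (math.log(math.tan(math.pi / 4.0))
            - math.log(math.tan(half_angle))) / math.pi


def f_spread_parallel(d_um: float, length_um: float) -> float:
    """Equation (2): circuit parallel to an injector of length L."""
    return math.log((d_um + length_um) / d_um) / (2.0 * math.pi)


if __name__ == "__main__":
    injector_length_um = 100.0  # typical ESD protection length quoted in the text
    for d_um in (50.0, 100.0, 200.0, 500.0, 1000.0):  # illustrative distances
        perp = f_spread_perpendicular(d_um, injector_length_um)
        para = f_spread_parallel(d_um, injector_length_um)
        print(f"d = {d_um:6.0f} um  perpendicular = {perp:.4f}  parallel = {para:.4f}")
```

Under this reading the perpendicular expression is mathematically equal to (1/π)·asinh(L/2d), and both expressions tend to L/(2πd) for d much larger than L, which is consistent with the small differences at long distances noted above.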
[Figure 6 plot: F-spread (logarithmic axis, 0.01 to 1) versus distance to injector [µm] (0 to 1000) for a 100 µm long ESD diode injector; curves for a circuit perpendicular to and parallel to the injector.]
Fig. 6: Current density reduction factors Fspread as a function of distance d and
orientation to a uniformly injecting diffusion with length L equal to 100
µm.
4.3 DESIGN RULE DERIVATION APPROACH
In order to ensure that a product can withstand a specified injection current
Iinjection from the product bondpads, the following criterion must be satisfied:
J_{trig,p/n} \geq \frac{I_{injection}}{L_{injection}} \cdot F_{escape} \cdot F_{spread} \quad [µA/µm]    (3)
with Jtrig,p/n the trigger current density of the thyristor in case of hole and electron
injection respectively, Linjection the perimeter of the injecting junction(s) connected
to the bondpad (typically the ESD-protection), Fescape,p/n the fraction of injected
carriers that ‘escaped’ from the guardrings and Fspread the current density reduction factor due to the geometrical spreading of the injected current. Equation (3)
holds for both positive and negative pulses (hole injection from P-emitter and
electron injection from N-emitter respectively). Typical values for Iinjection and
Linjection are +/- 100 mA (JEDEC latchup qualification requirement [6]) and 200
µm (perimeter typical ESD protection) respectively resulting in a maximum
injected current density at the bondpad of ≈ 500µA/µm.
Using guardring efficiency and thyristor trigger current data obtained from
latchup test structures, see fig. 7 and 8, one can now determine what the maximum allowed distances of Nwell and P+ substrate contacts to the emitter diffusions
are at minimum n+p+ spacing and what the guardring width and distance requirements are. Note that here data obtained at maximum junction temperature must be
used as latchup trigger currents are lowest at maximum temperature [1,2,5]. Thus
the appropriate design rules can be derived as will be illustrated using data from
the CMOS process B1, see section 4.4.3.
Fig. 7: Schematic view of a latchup test structure to determine effect of Nwell and
P+ substrate contact placement on latchup sensitivity.
Fig. 8: Schematic view of a test structure to determine effect of N+/Nwell guardring width and distance to N-emitter on the electron collection efficiency.
4.4 APPLICATION: DESIGN RULE DERIVATION FOR A CMOS PROCESS ON P-/P++ EPITAXIAL SUBSTRATES
4.4.1. Impact P+ substrate contact placement and P+ guardrings
In case of p-/p++ epi substrates, the p++ bulk acts as a low ohmic shunt for the
base resistance of the lateral parasitic NPN-transistor otherwise formed by the
high ohmic p- epi layer, see fig.1. This strongly improves P-emitter trigger currents as shown in fig. 9, particularly for n+p+ spacings larger than the epi layer
thickness. In that case the injected hole current flow changes from lateral through
the p- epi layer to vertical through the p++ bulk.
[Figure 9 plot: Jtrig P-emitter [µA/µm] versus n+p+ spacing [µm]; curves for 7.5, 8, 9 and 12 µm epi (process B2).]
Fig. 9: P-emitter trigger current of process B2 at 125°C as a function of n+p+-spacing A (see fig.1) for various epitaxial layer thicknesses.
Fig. 10 shows that the P+ substrate contact placement is not critical. This is because the p++
bulk acts as a kind of equipotential surface that sinks all the injected holes
and redistributes them over all available substrate contacts in the layout, see fig.
11. The resistance R1 from a P+ contact through the Pwell and epi layer is in the
order of a few kΩ while the substrate spreading resistances Rsub1,2 are less than
100 Ω for distances d2 less than 1 mm. As a result, Fspread is very small (<< 0.1).
For the same reason the P+ guardring efficiency is fully determined by the ratio of
the P+ guardring area versus the total P+ area in the layout and thus the P+ guardring width and distance to the P+ emitter are not very critical. The thinner the epi
layer, the more effective the p++ base shunt and its function as a hole current sink.
[Figure 10 plot: Jtrig P-emitter [µA/µm] versus distance substrate contact to N-emitter [µm]; curves for 7, 8, 9 and 12 µm epi (process A) and 7.5 µm epi (process B1).]
Fig. 10: P-emitter trigger current at 125 °C of process A and B1 as a function of
P+ substrate contact to N-emitter distance C (see fig.1) at 4.8 µm n+p+-spacing for various epi layer thicknesses.
[Figure 11: schematic cross section showing the p+-emitter, p+ guardring and p+ collector, the Nwell, p- epi and p++ bulk, the distances d1 and d2 and the resistances R1, R1', Rsub1 and Rsub2.]
Fig. 11: Schematic view of the distribution of injected holes over the various P+
substrate contacts.
4.4.2. Impact Nwell contact placement and N+/Nwell guardrings
Another feature of p-/p++ epi substrates is that the injected electrons are confined to the epi layer due to the built-in potential between the p- epi and p++ substrate and the small minority carrier diffusion length (≈1µm) in the p++ substrate.
First, this results in a slight increase of the (Nwell) base resistance of the parasitic
PNP transistor and thus actually in a somewhat reduced N-emitter trigger current
for thinner epi layers and small n+p+ spacings, see fig. 12.
[Figure 12 plot: Jtrig N-emitter [µA/µm] versus n+p+ spacing [µm]; curves for 7.5, 8, 9 and 12 µm epi (process B2).]
Fig. 12: N-emitter trigger current at 125 °C of process B2 as a function of n+p+-spacing A (see fig. 1) for various epitaxial layer thicknesses.
Second, N+/Nwell guardrings become very efficient: because the injected electrons are confined to the epi layer, the distance of the N+/Nwell guardring to the injector is irrelevant, see fig. 13.
[Figure 13 plot: escape fraction F (logarithmic axis, 1E-5 to 1E-3) versus N-emitter/Nwell guard spacing [µm]; curves for 7.5 µm epi (process B1) and 7.5, 8, 9 and 12 µm epi (process B2).]
Fig. 13: N+/Nwell guardring efficiency at 125 °C of process B1 and B2 as a function of distance A (see fig. 4) between N+ injector and a 10 µm wide N+ /
Nwell guardring for various epi layer thicknesses.
Furthermore fig. 14 shows that the collection efficiency improves logarithmically with the guardring width. This can be understood using fig. 15, where we divided the p- epi layer between Nwell bottom and p++ bulk in squares. The probability that an electron on its ‘random walk’ diffusion path passes such a square is
about 1/e (e ≈ 2.7) because the electron recombines or is collected as soon as it hits
the p++ bulk or Nwell respectively. The number of escaped electrons thus decreases exponentially with the number of squares and thus with the guardring width. Also
here thinner epi layers are beneficial as this improves the electron confinement
(i.e. increases the number of squares for a constant guardring width).
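The square-counting argument can be written as a small numerical sketch. It assumes that each square has a side equal to the p- epi height between the Nwell bottom and the p++ bulk and that the pass probability per square is exactly 1/e; the parameter `square_size_um` and the example values are hypothetical and are not measured data from this work.

```python
import math


def escape_fraction(guardring_width_um: float,
                    square_size_um: float,
                    pass_probability: float = 1.0 / math.e) -> float:
    """Escape fraction after crossing guardring_width / square_size squares.

    square_size_um is a hypothetical parameter: the side of one p- epi
    square, i.e. the epi height left between Nwell bottom and p++ bulk.
    """
    n_squares = guardring_width_um / square_size_um
    return pass_probability ** n_squares


if __name__ == "__main__":
    # Illustrative values only: a thinner epi gives smaller squares and
    # therefore more squares for the same guardring width.
    for square_size_um in (1.0, 2.0):
        for width_um in (4.0, 10.0):
            f_escape = escape_fraction(width_um, square_size_um)
            print(f"square = {square_size_um} um, width = {width_um} um, F = {f_escape:.2e}")
```

In this simple model the logarithm of the escape fraction falls linearly with guardring width, with a slope set by the square size.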
[Figure 14 plot: escape fraction F (logarithmic axis, 1E-4 to 1E-1) versus n+/Nwell guardring width [µm]; curves for 7.5 µm epi (process B1) and 7.5, 8, 9 and 12 µm epi (process B2).]
Fig. 14: N+/Nwell guardring efficiency at 125 °C of process B1 and B2 as a function of guardring width B (see fig. 4) for various epi layer thicknesses.
[Figure 15: schematic of the n+/Nwell guardring above the p- epi layer and p++ bulk, with the p- epi divided into squares; labels: e- in, e- collected, recombined, pass, e- out, p- epi square, electron pass probability ≈ 1/e.]
Fig. 15: Escape fraction of electrons decreases exponentially with the number of
p- epi squares and thus N+/Nwell guardring width.
Finally fig. 16 shows the effect of the placement of Nwell contacts on the N-emitter trigger current for various epi layer thicknesses. As the Nwell contact to P-emitter spacing directly determines the base resistance of the parasitic PNP transistor,
it has a strong effect on the trigger current.
[Figure 16 plot: Jtrig N-emitter [µA/µm] versus distance Nwell contact to P-emitter [µm]; curves for 7, 8, 9 and 12 µm epi (process A) and 7.5 µm epi (process B1).]
Fig. 16: N-emitter trigger current at 125 °C of process A and B1 as a function of
Nwell contact to P-emitter distance B (see fig.1) at 4.8 µm n+p+-spacing
for various epi layer thicknesses.
4.4.3 Design rules for CMOS process B1
Design rules can now be easily derived from the above data where we take the
data of process B1 as example. In case of hole injection (positive trigger currents),
fig. 10 shows that for process B1 with a 7.5 µm p- epi layer Jtrig,p = 190 µA/µm at
the 4.8 µm minimum n+p+ spacing. As Fspread << 0.1, see section 4.4.1, the criterion in equation (3) is thus easily met and the substrate contact and P+ guardring
design rules can be very relaxed (e.g. one substrate contact for every 200 µm and
use of minimum p+-width for the P+ guardring). In case of electron injection (negative trigger currents), fig. 14 shows that for process B1 with a 7.5 µm p- epi layer, a 4 µm wide N+/Nwell guardring results in Fescape = 5.5⋅10^-4. Assuming Fspread
= 1 (very conservative, see fig. 6) we find that Jtrig,n should be ≥ 500 × 5.5⋅10^-4 × 1 ≈
0.28 µA/µm. One now can determine the maximum allowable Nwell contact to P-emitter spacing from fig. 16. Using a design rule of 100 µm, we find Jtrig,n = 3.5
µA/µm, which still provides a factor 12 safety margin. The above demonstrates
that CMOS processes on epi substrates are very latchup robust provided proper
design rules are used.
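The arithmetic of this example can be retraced with a short sketch of criterion (3). The numbers are those quoted in the text (100 mA injection, 200 µm injector perimeter, Fescape = 5.5⋅10^-4, Fspread = 1, Jtrig,n = 3.5 µA/µm); the function names are illustrative only.

```python
def injected_current_density(i_injection_ma: float, l_injection_um: float) -> float:
    """Maximum injected current density at the bondpad in uA/um."""
    return i_injection_ma * 1000.0 / l_injection_um


def required_trigger_density(i_injection_ma: float, l_injection_um: float,
                             f_escape: float, f_spread: float) -> float:
    """Right-hand side of criterion (3) in uA/um."""
    return injected_current_density(i_injection_ma, l_injection_um) * f_escape * f_spread


def safety_margin(j_trig: float, j_required: float) -> float:
    """Ratio of measured trigger current density to the required minimum."""
    return j_trig / j_required


if __name__ == "__main__":
    # Electron injection, process B1, 4 um wide N+/Nwell guardring.
    j_required_n = required_trigger_density(100.0, 200.0, 5.5e-4, 1.0)
    margin_n = safety_margin(3.5, j_required_n)
    print(f"required Jtrig,n = {j_required_n:.3f} uA/um, margin = {margin_n:.1f}")
```

With these inputs the required trigger current density comes out just below 0.28 µA/µm and the margin lies between 12 and 13, in line with the factor 12 quoted above.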
4.5 CONCLUSIONS
Using a dedicated set of test structures, the latchup susceptibility of a number
of submicron CMOS processes on p-/p++ epitaxial substrates has been characterized as a function of n+p+-spacing, placement of Nwell and substrate contacts and
guardring width and distance to the injecting junction. Subsequently it has been
shown for the first time how these data can be translated into latchup design rules
taking into account the geometrical spreading of the injected carriers. It is
demonstrated that this approach results in very latchup robust products in case of
p-/p++ epitaxial substrates, thus eliminating the need for time-consuming and expensive ‘trial-and-error’ design optimisation cycles. The method is also applicable
to processes on non-epitaxial substrates.
4.6 REFERENCES
[1] R.R. Troutman, ‘Latchup in CMOS technology’, Kluwer Academic Publishers, Boston, (1986)
[2] E.A. Amerasekera, ‘Failure mechanisms in semiconductor devices’, Chapter
3, Wiley, Chichester, (1997)
[3] E.A. Amerasekera, ‘Designing latchup robustness in a 0.35um technology’,
Proceedings International Reliability Physics Symposium, pp.280-285, (1994)
[4] M.J. Chen, S.S. Ho, P.N. Tseng, R.Y. Shiue, H.S. Lee, J.H. Chen, J.K. Jeng,
Y.N. Jou, ‘A compact model of holding voltage for latchup in epitaxial
CMOS’, Proceedings International Reliability Physics Symposium, pp. 339-345, (1997)
[5] T. Aoki, ‘A discussion on the temperature dependence of latchup trigger current in CMOS/BiCMOS structures’, IEEE Transactions on Electron Devices,
ED-40, pp. 2023-2028, (1993)
[6] JEDEC Specification
[7] J.A. van der Pol, P.B.M. Wolbert, ‘Systematic derivation of latchup design rules for submicron CMOS processes from test structures’, Microelectronics &
Reliability, pp. 1051-1056, (1998)
5
Short Loop Monitoring of
Metal Stepcoverage by
Simple Electrical Measurements [12]
5.1 Introduction
5.2 Electrical assessment of metal stepcoverage
5.3 Design rule verification for (non-)planarized bipolar processes
5.4 Effect of metal stepcoverage on electromigration lifetime
5.5 Design rule verification for a non-planarized BiCMOS process
5.6 Process split evaluation and shortloop equipment monitoring
5.7 Metal stepcoverage wafermaps
5.8 Summary and conclusions
5.9 References
5.1 INTRODUCTION
Metal stepcoverage is a key factor determining metallization reliability. Ample
data [1-7] show that reduced stepcoverage seriously affects the electromigration
resistance of metal lines. Several papers [4-7] have shown that the reduction in
electromigration lifetime is larger than can be predicted from cross section reduction alone. The quantitative effect of metal stepcoverage on electromigration lifetime however remains complicated and appears to depend on the total number of
steps in the line [7], the spacing between steps [3], the passivation system [6], the
line width and step frequency [4] and obviously the specific topography under study. Many of these effects have been primarily attributed to the reduction of the
grain size of the aluminum alloy on steps compared to flat topography [4,7].
Smaller grains increase the number of triple points which are the points of flux divergence. However, temperature gradients, grain size orientation and mechanical stress effects also play an important role [4-7]. Via hole electromigration results reported in [8] for example show that the electromigration resistance of anisotropically
etched vias is better than that of tapered vias despite a much better
metal stepcoverage in case of the latter. This was attributed to a larger tensile mechanical stress exerted by the passivation in case of the tapered via increasing the
diffusivity of the aluminum atoms.
Traditionally, metal stepcoverage is determined by making ‘Schliffs’ or Focussed Ion Beam (FIB) cross sections on a limited number of locations, e.g. 5, on a
wafer. This method has however a number of drawbacks and limitations. Firstly,
variations in metal stepcoverage across the wafer can hardly be assessed.
Secondly, in case of non-planarized processes, topography strongly increases if
two separate steps coincide due to alignment variations between two masks.
This can result in a significant reduction of the metal stepcoverage. The
probability of detecting such worst case steps with the traditional method is quite
low because these coinciding steps typically only occur at worst case 4σ mask misalignment. Finally the cross section method is very time consuming and thus
costly.
In this paper an alternative method to overcome the above limitations will be
described which is capable of measuring the metal stepcoverage electrically and is
suited for design rule verification, evaluation of process splits, shortloop monitoring of metal deposition equipment and dielectric planarization processes, the generation of metal stepcoverage wafermaps, wafer release of production wafers
(screening-out of weak parts) and evaluation of the correlation between metal
stepcoverage and electromigration resistance. Results will be shown of the application of this method on BiCMOS and Bipolar processes with different metallization systems. Based on this work the method has also been applied to study stepcoverage issues in a TiW/AlSiTi metallization system [11].
5.2 ELECTRICAL ASSESSMENT OF METAL STEPCOVERAGE
5.2.1 Test structures
The test structures consist of metal line meanders over oxide/polysilicon/metal
topography and flat surfaces, both in X- and Y-direction, see fig. 1. In one module
the meanders cross more than 150 steps. These steps make up about 30% of the
total line length of 2250 µm. As shown in fig. 2, the metal lines have no metal
overlap over the sides of the contacts to prevent the creation of a resistance path
along the sides of the contacts shunting the resistance within the contacts which
has a more severe topography. By varying the distance between two masks the topography can be varied and also worst case coinciding steps can be created. The
sensitivity of the structures can be increased by enlarging the number of steps
within a given line length.
The test structures enable resistance measurements and detection of shorts between lines and underlying metal or polysilicon. To ensure accurate resistance
measurements when using probe needles, the lines are contacted via Kelvin contacts to the bondpads. On one die about 160 different stepcoverage modules with
all kinds of topography variations are present (totaling about 70 mm²) covering
critical steps in BiCMOS, Bipolar and high voltage BCD processes. These processes involve both planarized and non-planarized metallisation systems. Line width
of the meanders varies between 4 µm and 8 µm depending on the metallisation
process. The 4σ alignment accuracy of the stepper used for processing of the
wafers in this study equals 0.5 µm.
[Figure 1: layout view with labels metal 1, polysilicon, contact to silicon, contact to poly, n- diffusion and spacing.]
Fig. 1: Schematic view of a stepcoverage test structure for a BiCMOS process
showing one metal line meander crossing 204 polysilicon (PS) and contact-to-silicon (CO) defined steps (details see fig. 2) and one metal line
meander without topography.
Fig. 2: Schematic view of part of a stepcoverage test structure for the BiCMOS
process in fig. 1 showing a metal1 line crossing polysilicon (PS) gate and
contact-to-silicon (CO) defined topography. The CO-PS spacing is varied
in different stepcoverage modules.
About 100 die are placed on a 5 inch wafer allowing the assessment of metal
stepcoverage over the complete wafer and thus the generation of stepcoverage wafermaps. All processes can be covered with one dedicated metal stepcoverage
maskset and process specific shortloop flowcharts.
5.2.2 Measurement method
The measurement method is based on the comparison of the resistance of a
metal line meander over topography and of a meander on a flat surface, see fig. 1.
The use of Kelvin contacts ensures accurate resistance measurements. By taking
the ratio of the two resistances, the effects of line width and sheet resistance
variations can be eliminated. The effects of lithography induced local line width
variations over steps can be neglected because the line width of the test structures
is about 6 µm which is well above the resolution of the stepper used. Correlations
between the resistance ratio and actual metal stepcoverage percentage can be
made easily due to the fact that the metal lines do not overlap the sides of the
contacts. This enables the assessment of the metal stepcoverage percentage by a
fast and simple topview SEM inspection after removal of just the passivation and
intermetal dielectric layers, see fig. 3. Resistance measurements are done on a
Keithley 450 parametric tester.
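The ratio calculation itself is simple enough to sketch. The snippet below assumes each module yields one Kelvin resistance for the meander over topography and one for the flat meander; the function names and the resistance values are hypothetical and serve only to show how a per-wafer mean and spread of the ratio could be obtained.

```python
from statistics import mean, stdev


def resistance_ratio(r_topography_ohm: float, r_flat_ohm: float) -> float:
    """Ratio of the meander over steps to the flat reference meander."""
    if r_flat_ohm <= 0.0:
        raise ValueError("flat reference resistance must be positive")
    return r_topography_ohm / r_flat_ohm


def wafer_summary(measurements):
    """Mean and sample standard deviation of the ratio over all modules.

    measurements: iterable of (r_topography_ohm, r_flat_ohm) pairs,
    one pair per die position (at least two pairs are required).
    """
    ratios = [resistance_ratio(r_top, r_flat) for r_top, r_flat in measurements]
    return mean(ratios), stdev(ratios)


if __name__ == "__main__":
    # Hypothetical resistances in ohm, for illustration only.
    example = [(63.6, 60.0), (64.2, 60.3), (62.9, 59.8)]
    ratio_mean, ratio_spread = wafer_summary(example)
    print(f"mean ratio = {ratio_mean:.3f}, spread = {ratio_spread:.3f}")
```

Because both meanders on a module share the same line width and sheet resistance, those variations cancel in the ratio, which is the reason for normalising per module before averaging over the wafer.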
Fig. 3: Topside SEM inspection of the stepcoverage of a metal2 line in a bipolar
process crossing topography defined by metal1 (IN) and shallow-n diffusion (SN) oxide steps. SN-IN spacing equals 4 µm.
Using the above described test structures and measurement method the drawbacks of the traditional method can be overcome. The new method is fast and allows evaluation of metal stepcoverage on all possible topography since all distances between two masks are present in different modules (including those with
worst case coinciding steps). Wafermaps of metal stepcoverage can be made easily
and electromigration measurements can be performed as a function of the
stepcoverage percentage. Furthermore the method is suitable for short loop
equipment monitoring and process split evaluation and can be applied to any
process technology ranging from high power bipolar to submicron CMOS.
The electrical stepcoverage characterization method has been applied to planarised and non-planarized BiCMOS and high power bipolar processes with
different metallization systems. The results are shown and discussed in the next
sections.
5.3 DESIGN RULE VERIFICATION FOR BIPOLAR PROCESSES
The above method has been used for design rule verification on two 3 µm high
power bipolar processes that only differ in their metallization system. In the first
process pattern definition is done by wet etching of aluminum while in the second
process the aluminum is anodized (Al converted to Al2O3). The latter process thus
has no metal1 steps and is planarized in contrast to the non-planarized wet-etch
technology. Metal1 (IN) and metal2 (IN2) thickness equal 1.0 µm and 2.0 µm respectively and metal composition is AlSi(1%)Cu(0.04%). Metal2 is sputtered at
200 °C and 350 °C for the anodized and wet etched technology respectively. The
intermetal dielectric consists of a 0.9 µm thick plasma deposited Si3N4 layer.
In both technologies the shallow-n (SN) emitter is diffused from a
phosphorous oxide slurry (no implantations used). This creates an about 0.7 µm
high oxide step at the edge of the SN diffusion. Critical aspect in the metallization
system is the metal2 stepcoverage over metal1 as a function of the spacing
between the metal1 step and the SN oxide step (IN-SN spacing), see fig. 4. If the
metal1 and SN step coincide, topography increases to 1.7 µm. The stepcoverage
maskset contains modules in which the IN-SN spacing is varied including worst
case coinciding steps. In this way the minimum allowable spacing to guarantee
good stepcoverage can be determined. The IN2 line width equals 8 µm.
[Figure 4: cross section with labels phosphorous oxide, metal 2, metal 1, intermetal nitride, oxide, SN diffusion and IN-SN distance.]
Fig. 4: Schematic cross section of the non-planarized bipolar process with etched
aluminum showing the topography below metal2 caused by metal1 and
SN diffusion oxide steps.
Fig. 5 and 6 show the metal2 stepcoverage (represented by the resistance ratio
between metal lines over steps and flat lines) as a function of the IN-SN spacing
for the non-planarised (A: etched aluminum) and the planarised (B: anodised aluminum) bipolar process.
[Figure 5 plot: mean resistance ratio (0.95 to 1.25) versus IN1 overlap over SN [µm] (-4.0 to 4.0); wafers A1, A2, A3 (A: non-planarized) and B1, B2, B3 (B: planarized).]
Fig. 5: Resistance ratio between metal lines over steps and flat lines as a function
of the metal1 to SN oxide step spacing for a non-planarised (A) and a planarised (B) bipolar process. The steps coincide in case the SN-IN spacing
equals 0 µm.
Fig. 6: Normal distribution plot of the 96 resistance ratios over the wafer for SN-IN spacings of 0 µm and 4 µm and for both bipolar processes. Spread
across the wafer is clearly less than differences between variants.
Fig. 5 shows the mean value of the ratios of the 96 modules on the wafer while
the variation of the ratio across the wafer can be seen from the normal distribution
plots of the ratios in fig. 6. Not surprisingly, the non planarised process shows a
6% higher resistance ratio in fig. 5. However for coinciding IN-SN steps the ratio
increases by another 15% which points at a degraded metal2 stepcoverage. SEM
pictures of this step in fig. 7 show that this is indeed the case; the metal2 stepcoverage is only 10 to 15 %. In fig. 7 also SEM pictures for other SN-IN spacings
are shown. These can be used to calibrate the resistance ratio. Using fig. 5, design
rules for guaranteeing sufficient metal stepcoverage can be easily generated. In
this case, under worst case misalignment conditions, the IN-SN spacing should be
larger than 0.5 µm.
Fig. 7: Topside SEM inspections of the non-planarized bipolar process after removal of the top layers showing the metal2 stepcoverage over metal1 and
SN oxide steps for four different IN-SN spacings, namely -4.0 µm (a), -1.0
µm (b), -0.5 µm (c) and 0.0 µm (d). Deterioration of the stepcoverage with
decreasing spacing can be clearly seen; the worst case occurs at an SN-IN spacing of 0 µm.
5.4 EFFECT OF METAL STEPCOVERAGE ON ELECTROMIGRATION
LIFETIME
Electromigration experiments have been carried out on the above described
non-planarised bipolar technology to determine the effect of reduced metal stepcoverage on electromigration lifetime. Average aluminium grain size as measured
on bondpads equals 1.8 µm. Fig. 8 shows the lifetime results of 8 µm wide metal2
meanders over flat surfaces (having 100% stepcoverage) and over coinciding SN-IN steps (worst case topography, 10% to 15% stepcoverage) for two stress conditions in a lognormal distribution plot. A 40% resistance increase was used as the
failure criterion. Surprisingly, the MTTF reduction of lines over steps is only
about 35% and thus much less than could be anticipated based on the aluminum
cross section reduction, although the spread is somewhat larger. Extrapolation to
normal use conditions shows that the electromigration lifetime in case of a SN-IN
spacing of 0.0 µm is still acceptable.
Fig. 8: Lognormal distribution plot of the electromigration lifetime of metal2
lines in case of no (100% stepcoverage) and worst case (15%
stepcoverage) underlying topography.
Failure analysis shows that the opens on the test structures with worst case
steps do not preferentially occur on the steps, as might be expected from the cross
section reduction and the presence of smaller aluminum grains at the steps
[4,7,10]. Apart from at steps, opens are also found on the flat parts of the metal2
line in-between the steps, see fig. 9. Fig. 9 also shows that in this case the opens
occur both on top of the IN line and between the IN lines. A similar
failure mode was found in [3] and attributed to temperature gradient effects. This
hypothesis was, however, not unambiguously proven, and we also do not yet
understand why the failures do not preferentially occur on the steps. Another contributing factor may be the fact that the distance between the two steps in our
structures is only 14 µm. This is in the same order as the Blech length [9] of the
metallization system below which void generation due to electromigration is
counteracted by backflow of aluminum due to the electromigration induced
mechanical stress and/or atomic concentration gradients.
Fig. 9: SEM inspection (panels a to c) of the failure location after electromigration stress of a
metal2 line over worst case topography (15% stepcoverage) in the non-planarized bipolar process.
The larger spread in failure times in case of lines over steps is most likely caused by variations in metal stepcoverage over the wafer and thus between the
various Devices Under Test in our study. Fig. 6 shows that for coinciding
SN-IN steps the spread in the resistance ratio over the wafer is about 0.04, which
is about the same as the difference between the mean ratios of the modules with a
SN-IN spacing of 0.0 µm and 0.5 µm. According to fig. 7 these modules have a
worst case stepcoverage of 10% and 25% respectively. Based on this we estimate
that due to small SN-IN alignment variations across the wafer, the stepcoverage
for modules with coinciding SN-IN steps ranges from 10% to 25% over the
complete wafer. The explanation given in [10], larger spread due to differences in
line length, does not hold for our case as all line lengths are equal.
5.5 DESIGN RULE VERIFICATION FOR A NON-PLANARIZED
BICMOS PROCESS
In non-planarized (Bi)CMOS processes a critical issue is the spacing between
polysilicon (PS) gates and the contacts-to-silicon (CO) as it determines both the
occurrence of metal1 to PS shorts and the metal1 stepcoverage in CO-contacts, see
fig. 10. In this specific 1.5 µm BiCMOS technology the polysilicon and PS-metal1
dielectric (TEOS) thickness are both 0.5 µm and the metal1 thickness equals 0.6
µm. Metal1 composition is AlSi(1%)Cu(0.04%) and it is sputtered at 200°C. The
CO-contacts are wet etched. The stepcoverage maskset contains modules in which
the CO-PS distance is varied (see fig. 1) including worst case coinciding steps and
modules that should result in shorts. In this way the minimum allowable spacing
to guarantee good stepcoverage and good IN-PS isolation can be determined.
[Fig. 10 drawing labels: phosphorous oxide, metal1, TEOS, polysilicon, CO contacts, CO-PS spacing.]
Fig. 10: Schematic cross section of the non-planarized BiCMOS process showing
the topography below metal1 caused by PS gates and CO-contacts.
Fig. 11 shows the metal1 stepcoverage as a function of the CO-PS spacing. For
small CO-PS spacings many of the 96 modules on the wafer show very high resistance ratio values. Assuming that resistance ratios larger than 10 are opens, fig.
11a shows the percentage opens as a function of the CO-PS spacing. In fig. 11b
boxplots of the resistance ratio for the remaining non-open modules are shown as
a function of the CO-PS spacing.
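The data reduction described above can be sketched in a few lines of Python. This is a minimal illustration, not the evaluation software used for fig. 11: the function name `summarize_modules` and the example ratios are hypothetical, and the inclusive percentile interpolation of the standard library is an assumption. Only the open criterion (resistance ratio larger than 10) and the 5, 25, 50, 75 and 95 percentile points come from the text.

```python
from statistics import quantiles

OPEN_THRESHOLD = 10.0  # a resistance ratio above this is counted as an open (from the text)


def summarize_modules(ratios):
    """Return (percentage of opens, percentile points of the non-open ratios).

    `ratios` holds the resistance ratio of every module with the same
    CO-PS spacing on one wafer (96 modules per wafer in the text).
    """
    good = sorted(r for r in ratios if r <= OPEN_THRESHOLD)
    pct_open = 100.0 * (len(ratios) - len(good)) / len(ratios)

    percentiles = {}
    if len(good) >= 2:
        # n=20 yields 19 cut points at 5%, 10%, ..., 95%.
        cuts = quantiles(good, n=20, method="inclusive")
        for p in (5, 25, 50, 75, 95):
            percentiles[p] = cuts[p // 5 - 1]
    return pct_open, percentiles


# Hypothetical ratios for one spacing; two modules are effectively open.
example = [1.02, 1.04, 1.05, 1.08, 50.0, 1.0e6, 1.03, 1.06]
pct_open, percentiles = summarize_modules(example)
print(pct_open, percentiles)
```

Separating the opens before computing the percentiles keeps a few extremely high ratios from dominating the statistics of the remaining modules, which is why the text reports the two quantities in separate panels.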
[Fig. 11 plots for wafers 1 and 2, each versus CO-PS distance (0.0 to 2.5 µm): (a) percentage opens; (b) resistance ratio of the non-open and non-shorted modules, shown as 5%, 25%, 50%, 75% and 95% percentile points.]
Fig. 11: Metal1 stepcoverage over PS gates and neighbouring CO-contacts as a
function of the CO-PS spacing for the non-planarized BiCMOS process.
The stepcoverage is depicted by the percentage of modules showing
opens (resistance ratio >10) (a) and boxplots of the resistance ratio between lines over steps and flat lines of the non-open modules (b). The 5,
25, 50, 75 and 95 percentile points are shown.
Fig. 12 shows the percentage of metal1-PS shorts as a function of the CO-PS
spacing. SEM pictures of FIB cross sections can be seen in fig. 13 for various CO-PS spacings. These cross sections clearly support the electrical measurements.
Fig. 13b shows for example that for a CO-PS spacing of 1.0 µm the metal1 stepcoverage is reduced to zero (thus effectively an open) while fig. 13c clearly shows
an IN-PS short at a 0.5 µm CO-PS spacing.
[Fig. 12 plot: percentage shorts versus CO-PS distance (0.0 to 2.5 µm) for wafers 1 and 2.]
Fig. 12: Percentage of metal1 to polysilicon shorts as a function of the CO to PS
spacing for the non-planarized BiCMOS process.
[Fig. 13 panels: (a) CO-PS = 1.5 µm, (b) CO-PS = 1.0 µm, (c) CO-PS = 0.5 µm, (d) CO-PS = 0.0 µm. Layer labels: metal1, P2O5 oxide, TEOS oxide, polysilicon, gate oxide, silicon.]
Fig. 13: SEM inspections of the non-planarized BiCMOS process showing metal1 stepcoverage over PS and CO topography and metal1 to PS shorts
for four different CO-PS spacings namely 1.5 µm (a), 1.0 µm (b), 0.5 µm
(c) and 0.0 µm (d). Opens and shorts occur for a CO-PS spacing smaller
than 1.75 µm.
It can be concluded from fig. 11 and fig. 12 that for an acceptable metal1 stepcoverage and prevention of metal1 to polysilicon shorts, the CO-PS spacing under
worst case misalignment conditions must be larger than 1.75 µm. Just using the
cross section data would probably have resulted in too optimistic design rules.
5.6 PROCESS SPLIT EVALUATION AND SHORT LOOP EQUIPMENT
MONITORING
Another application of the method has been the evaluation of process splits in
order to optimize metal1 stepcoverage in contacts-to-silicon (CO) for the
planarised bipolar technology with anodized aluminum. Details of the
metallization system have been listed earlier in this section. A critical issue in this
technology is the metal1 stepcoverage in CO-contacts in case the CO-contact edge
coincides with the aforementioned SN oxide step, see fig. 14. The stepcoverage
problem is in this case aggravated by the fact that the top of the thermal oxide
consists of an about 100 nm thick phosphorous oxide (P2O5) layer that originates
from the phosphorous slurry from which the SN emitter was diffused. During wet
CO contact etch this P2O5 layer (which is covered by photo resist) is slightly
underetched (see fig. 16) resulting in a slight negative slope at the very top of the
CO-contact. This protrusion is smoothened by a subsequent HF-dip before metal
deposition but is not completely removed as can be seen in fig. 16. Especially if
the CO-contact edge coincides with the SN oxide step the protrusion can be very
pronounced. In combination with the low aluminum sputtering temperature
(100°C) this may give rise to stepcoverage problems. Coinciding SN and CO steps
only occur at 4σ mask misalignment. It is therefore almost impossible to evaluate
process splits using normal production wafers or test chips.
[Fig. 14 drawing labels: phosphorous oxide protrusion, metal1, silicon oxide, contact window, SN diffusion, SN-CO distance.]
Fig. 14: Schematic cross section of the planarized bipolar process with anodised
aluminum showing the topography below metal1 caused by CO contacts
and SN-diffusion oxide steps.
The stepcoverage maskset has been used in metal1 stepcoverage optimization
experiments. In these experiments, the bias voltage during aluminum sputtering
(0 V and 200 V), the lifetime of the sputter targets (right after or just before target
replacement) and the use of the HF-dip of the CO-contact before aluminum sputtering were varied. Fig. 15 shows the metal1 stepcoverage as a function of the SN-CO spacing. Again, both the percentage of opens (96 datapoints per wafer) and
the mean and sigma of the resistance ratio of the non-open modules are shown.
Fig. 15 clearly reveals that wafers sputtered directly after metal target change
in the metal deposition equipment suffer from very poor stepcoverage of metal1 in
the CO-contact. Wafers sputtered near the end of the target life show good stepcoverage. This is confirmed by fig. 16 showing SEM pictures of FIB cross sections
of metal1 over coinciding CO and SN steps for both cases.
[Fig. 15 plots, each versus SN-CO distance (-1.5 to 3.5 µm) for the eight combinations of target life (begin or end), sputter bias (yes or no) and HF dip (yes or no): (a) percentage opens, (b) mean resistance ratio, (c) standard deviation of the resistance ratio.]
Fig. 15: Metal1 stepcoverage over SN oxide steps and CO-contacts as a function
of the SN to CO spacing for the planarized bipolar process. The stepcoverage is depicted by the percentage of modules showing opens (resistance ratio > 10) (a) and the mean (b) and sigma (c) of the resistance ratio
between metal lines over steps and flat lines of the non-open modules.
The fact that stepcoverage is poor in case of a ‘virgin’ target is probably due to
the fact that also oxygen atoms and contaminants that are present on the surface
of the new target are sputtered on the wafer. These contaminants limit the
mobility of the aluminum atoms on the wafer surface, thereby hampering lateral
diffusion of the aluminum atoms and thus degrading metal stepcoverage. The
above justifies the existing practice in the waferfab where first several runs with
dummy wafers (corresponding to a few hundred µm of aluminum) have to be
sputtered before starting metal deposition on production wafers. Note that the
other process variants (presence of sputter bias and HF-dip) have only a minor
influence on the metal1 stepcoverage.
The stepcoverage maskset is reasonably well suited for short loop monitoring
of critical processes on metal deposition equipment. A short loop typically consists
of 4 masks and has a throughput time of about one week. To ensure good
stepcoverage in volume production, modules with worst case topography (like the
one with coinciding SN- and CO-edges in case of the bipolar process with
anodized aluminum) can also be included in a scribelane or drop-in Process
Control Module (PCM) for wafer release purposes and stepcoverage monitoring
on production wafers. In this way wafers containing weak parts can be screened out. Fig. 17 shows an example of an SPC Control Chart containing data from over
700 batches of the resistance ratio of a metal2 line over coinciding IN-SN steps
and over a flat surface for the non-planarized bipolar process described in section
5.3.1. Only one Out-Of-Control (OOC) caused by a scratch on the module is
observed demonstrating that the metal stepcoverage is well controlled in the
waferfab.
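As an illustration of how such a control chart flags an Out-Of-Control point, the sketch below derives control limits from a baseline set of resistance ratios and checks new batches against them. It is a hedged example only: the text does not state how the LCL and UCL were set, so the mean ± 3 sigma convention, the function names and all numerical values are assumptions.

```python
from statistics import mean, stdev


def control_limits(baseline, n_sigma=3.0):
    """Return (LCL, centre line, UCL) from baseline resistance ratios.

    Assumption: limits at centre +/- n_sigma sample standard deviations.
    """
    centre = mean(baseline)
    sigma = stdev(baseline)
    return centre - n_sigma * sigma, centre, centre + n_sigma * sigma


def out_of_control(values, lcl, ucl):
    """Return the indices of the batches whose ratio falls outside [LCL, UCL]."""
    return [i for i, v in enumerate(values) if v < lcl or v > ucl]


# Hypothetical baseline batches and new batches (resistance ratio per batch).
baseline = [1.10, 1.12, 1.13, 1.11, 1.14, 1.12]
new_batches = [1.12, 1.29, 1.11]

lcl, centre, ucl = control_limits(baseline)
print(out_of_control(new_batches, lcl, ucl))
```

In this made-up example only the second new batch lies above the upper limit, the kind of excursion that a scratch on the module could produce.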
Fig. 16: SEM inspections of metal1 stepcoverage in CO-contacts coinciding with
SN oxide steps in case of wafers sputtered a) directly after metal target
change and b) after sputtering several runs of dummy wafers.
[Fig. 17 chart: ratio metal2 over steps (0.9 to 1.3) versus date (Nov 1997 to Feb 1999), with average, target, LCL and UCL lines.]
Fig. 17: SPC Control Chart of the resistance ratio of a metal2 line over coinciding IN-SN steps and over a flat surface for the non-planarised bipolar
process showing data from 770 batches (ratio equals 1.12±0.025).
The method is also suited for (planarised) submicron MOS technologies, especially for monitoring of metal stepcoverage in contacts-to-silicon and intermetal
vias in processes without W-plugs.
5.7 METAL STEPCOVERAGE WAFERMAPS
In addition to fig. 15, for the bipolar process with anodized aluminum also the
variation of the metal1 stepcoverage across the wafer has been evaluated in case of
coinciding CO- and SN-steps (see previous section). Fig. 18 shows wafermaps for
wafers sputtered both directly after metal target change and sputtered after processing of several runs with dummy wafers. These data once again demonstrate the
need to sputter dummy wafers after metal target change but furthermore show that
the stepcoverage can vary significantly across the wafer. Areas with good stepcoverage are found adjacent to areas with complete opens. This clearly points at the
limitations of the cross sectioning method and the importance of stepcoverage wafermaps.
5.8 SUMMARY AND CONCLUSIONS
A novel method has been developed capable of assessing metal stepcoverage
by simple electrical measurements. The metal stepcoverage is represented by the
resistance ratio of metal lines over (worst case) topography and metal lines over
flat surfaces. The resistance ratio correlates well to the metal stepcoverage
percentage as measured by topview SEM inspections after stripping passivation
and intermetal dielectric layers.
The metal stepcoverage test chip and measurement method have been successfully applied to the verification of metal stepcoverage related design rules in
various technologies, the evaluation of process splits and shortloop monitoring of
metal deposition equipment. The method is also suited for short loop monitoring
of dielectric planarization processes. The effect of stepcoverage on
electromigration lifetime has been found to be very limited. Open failures occur
primarily between the steps and not at the steps which is not yet understood.
Metal stepcoverage appears to be strongly dependent on the lifetime of the metal
sputter target; several runs with dummy wafers are necessary after metal target
change to guarantee good stepcoverage. Metal stepcoverage wafermaps show that
stepcoverage can vary strongly across a wafer demonstrating the limitations of the
traditional cross sectioning method. The new electrical method has been shown to
overcome the drawbacks of this method.
A worst case test structure can be included in drop-in or scribelane Process
Control Modules to enable stepcoverage monitoring on production wafers and for
wafer release purposes. In this way wafers containing weak parts can be screened
out. The test structure can also be used for lifetesting but it should be noted that
the correlation between test structure lifetime and product lifetime in general is
not straightforward [4,7]. The method is also applicable to submicron MOS process technologies.
[Fig. 18 wafermaps (a) and (b): resistance ratio (0 to 4) versus X-position and Y-position on the wafer.]
Fig. 18: Wafermaps of the resistance ratio between metal lines over worst case
steps and flat lines for wafers sputtered directly after metal target change
(a) and sputtered after processing of several runs with dummy wafers (b),
clearly demonstrating the need to sputter dummy wafers after metal target change. From (a) also the large variation of stepcoverage across the
wafer can be seen.
5.9 REFERENCES
[1] K.A. Danso, L. Tullos, ‘Thin film metallization studies and lifetime prediction using Al-Si and Al-Cu-Si conductor test bars’, Microelectronics & Reliability, vol. 21, no. 4, pp. 513-527, (1981)
[2] A. Wild, M. Triantafyllou, ‘Electromigration on oxide steps’, Microelectronics & Reliability, vol. 28, no. 2, pp. 243-255, (1988)
[3] A.S. Oates, ‘Step spacing effects on electromigration’, Proceedings International Reliability Physics Symposium, pp. 20-24, (1990)
[4] Y.E. Strausser, B.L. Euzent, R.C. Smith, B.M. Tracy, K. Wu, ‘The effect of
metal film topography and lithography on grain size distributions and on
electromigration performance’, Proceedings International Reliability Physics
Symposium, pp. 140-144, (1987)
[5] L. Kisselgof, L.J. Elliott, J.J. Maziarz, J.R. Lloyd, ‘Electromigration lifetime
and step coverage in Al/Cu/Si thin film conductors’, Materials Reliability Issues in Microelectronics Symposium, April 30-May 3, Anaheim, pp. 107-112, (1991)
[6] L. Ferlazzo, G. Lormand, G. Reimbold, ‘Passivation effects on step AlCu/TiN line electromigration performance’, Microelectronic Engineering, Vol.
15, No. 1-4, pp. 487-490, (1991)
[7] T. Nogami, S. Oka, K. Naganuma, T. Nakata, C. Maeda, O. Haida, ‘Electromigration lifetime as a function of line length or step number’, Proceedings
International Reliability Physics Symposium, pp. 366-372, (1992)
[8] H. Nishimura, Y. Okuda, K. Yano, ‘Dependence of electromigration lifetime
for via chains on slope angles of via holes’, Journal Electrochemical Society,
vol. 142, no. 10, pp. 3565-3569, (1995)
[9] I.A. Blech, C. Herring, ‘Stress generation by electromigration’, Applied Physics Letters, vol. 29, pp. 131, (1976)
[10] J.S. May, ‘Electromigration characteristics of vias in Ti:W/Al-Cu(2wt%)
multilayered metallization’, Proceedings IRPS, pp. 91-96, (1991)
[11] E.A. Schönbächler, 'Electromigration behavior of a multi-layer metallisation',
Thesis ETH Zurich, Series in Micro-Electronics, Volume 14, Hartung-Gorre
Verlag, Konstanz, (1998)
[12] J.A. van der Pol, E.R Ooms, H.T. Brugman, ‘Short loop monitoring of metal
stepcoverage by simple electrical measurements’, Proceedings International
Reliability Physics Symposium, pp. 148-155, (1996)
6
Relation Between Yield And Reliability
of Integrated Circuits and
Application to Failure Rate Assessment
and Reduction in the One Digit Fit
and PPM Reliability Era [10]
6.1 Introduction
6.2 Yield as a reliability indicator
6.3 Experimental results
6.3.1 Relation between yield and line fall-off.
6.3.2 Relation between line fall-off and field returns
6.3.3 Relation between yield and burn-in reject rate
6.3.4 Relation between burn-in and High Temperature Operating Life
(HTOL) failure rate
6.4 Failure rate prediction and assessment
6.5 Options for failure rate reduction
6.5.1 Yield improvement
6.5.2 Elimination of special causes (‘maverick’ batches)
6.5.3 Screening of weak parts with latent defects during product test
6.6 Conclusions
6.7 References
6.1 INTRODUCTION
In the present 1 digit PPM and FIT reliability era, the assessment of the actual
Early Failure Rate (EFR) level of a product, let alone the demonstration of its improvement, has become problematic due to the lack of sufficient reliability fails
during life tests. Although the problem can be alleviated by increasing the number
of products in lifetest [1], this is often prohibited by economic constraints and
lengthening feedback loops. Thus it is hard for a waferfab to obtain statistically
significant data that can be used for definition and implementation of improvement actions aimed at continuous reduction of the EFR and line fall-off PPM level
of its products. Line fall-off is defined as the number of devices that fail after printed circuit board assembly and test at the customer’s site (e.g. television manufacturer). Field returns are devices that fail at the end-customer.
The number of line fall-off failures and field returns is much larger than the
number of lifetest failures and could in principle also be used for definition of improvement actions. However detection of problems at the customer conflicts with
customer satisfaction principles and thus is highly unwanted. Furthermore this approach also results in unacceptably long feedback loops of 3 to 6 months and practice has proven that it is hard to obtain reliable data concerning field returns from
the user community. Consequently there is a need for another source of data that
can be used to drive failure rates down.
In this paper it is demonstrated that electrical sort (E-sort) yield data can be
used for this purpose. Correlations between yield failures and reliability failures
(line fall-off, field returns, burn-in and EFR rejects) will be shown. Based on an
assumption of a relation between yield and reliability defect densities a model will
be introduced and compared with the data. We shall discuss and demonstrate the
use of this model to assess the EFR, to implement EFR reduction programs in a
waferfab and to reduce the EFR by screening techniques at E-sort testing.
6.2 YIELD AS A RELIABILITY INDICATOR
In today’s products and processes, wear-out failures during operational life are
virtually eliminated due to the adoption of ‘wafer level reliability‘ and ‘build-in
reliability’ techniques during process development [1] and the use of reliability
related design rules and reliability simulation techniques during product design.
Consequently, most product reliability failures are random failures due to processing incidents (‘special causes’) and process defect density (’normal causes’).
Thus the rootcauses of most reliability failures should be the same as the rootcauses of zero hour failures, i.e. yield failures. Larger defects will cause yield failures
while smaller defects show up as reliability failures (or not at all). The data in fig.
1 show that this is indeed the case. Note that package, EOS (electrical overstress) and
test program related failures were not taken into account.
Fig. 1 shows that failures due to particles and patterning defects are dominant.
It is inevitable that such defects are introduced during wafer processing. Similar
results are obtained for CMOS products [3,4] as well as from theoretical considerations [2,5]. Consequently, there should be a strong, and thus measurable, relation between the number of failures in the field and in lifetest and the yield. There
exist only a few papers on this relation [2,3,5,6,7, 11] and, apart from [4], the
amount of experimental data is often very limited.
In [4] a model was developed for the yield-reliability relation based on the assumption that the density of smaller reliability defects, which actually cause a product field failure, is a fraction of the larger defects which cause yield failures, see
equation (1). Here Dy is the yield defect density, Dr the reliability defect density
and α the yield to reliability ratio. From (1) equation (2) can be derived for the relation between the fraction of returned devices R and the batch yield Y [4]. This is
a similar result as in [2]. Here M is the maximum possible yield fraction and allows for clustering effects and edge exclusions. M will typically exceed 90% for
commercial processes.
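[Fig. 1 pie charts (a), (b) and (c): failure rootcause categories are particles, patterning, pinholes, scratches, other and unknown. In all three panels particles form the largest identified category (29% to 44%), followed by patterning (12% to 14%); 29% to 41% of the failures have an unknown rootcause.]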
Fig. 1: Rootcause pareto of (a) yield, (b) line fall-off and (c) high temperature lifetest failures of a variety of products from a bipolar / BICMOS waferfab.
$$D_r = \alpha \times D_y \quad [\mathrm{cm}^{-2}] \qquad (1)$$
$$R = 1 - \left(\frac{Y}{M}\right)^{\alpha} \qquad (2)$$
Equation (2) is a two parameter model (α and M) linking the failure fraction R to
the yield Y and is applicable for a complete integrated circuit and independent of
the die size of a product; the lower yield of large die accounts for the higher failure rate. As M typically is larger than 90%, the yield to reliability ratio α is the dominant factor in equation (2).
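A minimal numerical sketch of equation (2) is given below. The function names and the parameter values (α = 0.01, M = 0.95) are hypothetical and chosen only for illustration; they are not fitted values from this work. One common route to equation (2), assumed here and not spelled out in the text, is a Poisson yield model Y = M·exp(−A·Dy) for a die of area A, so that the fraction of shipped devices surviving the reliability defects equals exp(−A·Dr) = (Y/M)^α.

```python
def failure_fraction(batch_yield, alpha, max_yield=1.0):
    """Equation (2): R = 1 - (Y / M) ** alpha, with Y and M as fractions."""
    if not 0.0 < batch_yield <= max_yield <= 1.0:
        raise ValueError("expected 0 < Y <= M <= 1")
    return 1.0 - (batch_yield / max_yield) ** alpha


def failure_ppm(batch_yield, alpha, max_yield=1.0):
    """The same quantity expressed in parts per million."""
    return 1.0e6 * failure_fraction(batch_yield, alpha, max_yield)


# Hypothetical parameters, for illustration only.
ALPHA = 0.01
M = 0.95

for y in (0.50, 0.65, 0.80, 0.95):
    print(f"yield {y:.2f} -> {failure_ppm(y, ALPHA, M):.0f} ppm")
```

The failure fraction is zero when Y equals M and rises as the yield drops, which is the behaviour the model is meant to capture.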
6.3 EXPERIMENTAL RESULTS
In our study, four mixed-signal IC’s and one SRAM running in high volumes
were selected. Devices were fabricated in a 1.0 µm and 1.2 µm CMOS process, a
1.5 µm BICMOS process and two 3.0 µm Bipolar processes with different metallisation systems. The devices came from two waferfabs. From the mixed-signal
devices the batch yield, the line fall-off fraction (failing after assembly on a PCB)
and the number of field returns (coming from the end users) were studied while
for the SRAM the batch yield, burn-in reject rate (24 hrs / 150°C) and EFR was
studied.
6.3.1 Relation between yield and line fall-off.
Data from four mixed-signal high volume IC’s that were sold to customers under a PPM agreement (i.e. all failing dies are returned) were used for this study,
totaling over 42 million shipped devices. When plotting for one product the number of line fall-off returns per batch (PPM level) versus batch yield for individual
batches, see fig. 2, no obvious correlation was found. Fig. 2 shows that highest
PPM levels were even found for high yielding batches. Apparently, screening on
E-sort yield alone does not prevent ‘maverick’ batches, i.e. batches that will give
rise to high numbers of field returns, from being delivered.
[Fig. 2 plot: PPM level [a.u.] versus yield (50% to 90%).]
Fig. 2: Line fall-off returns versus batch yield for bipolar product #2.
However, when one plots PPM levels based on the number of returned devices
combined from all batches within a yield range that contains enough returns to
move beyond the statistical noise level, versus the yield, a correlation becomes evident, as shown in fig. 3a to 3d for the CMOS, Bipolar and BICMOS products respectively. An arbitrary interval constant of 5% was chosen. The PPM level was
calculated with a 60% confidence level, the error bars indicate the 10% to 90%
confidence level. It is striking to see how well behaved the PPM versus yield curve
is for our products, even though they originate from various technologies and fabs.
The data could be fitted very well with equation (2). For different processes in a
given fab nearly the same values for α were found. In principle α will be related
to a waferfab, a technology, a design methodology, a test methodology and the
assembly and test procedure at the customer's site.
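The binning and confidence-level calculation can be sketched as follows. The text does not state which statistical method was used for the 60% confidence level; the example assumes a one-sided Poisson upper bound on the number of returns, a common convention for PPM estimates. The function names and the batch data are hypothetical.

```python
import math
from collections import defaultdict


def poisson_cdf(k, lam):
    """P(X <= k) for a Poisson variable with mean lam > 0 (terms computed in log space)."""
    return sum(math.exp(i * math.log(lam) - lam - math.lgamma(i + 1))
               for i in range(k + 1))


def upper_bound_failures(observed, confidence):
    """Mean number of failures lam for which P(X <= observed | lam) = 1 - confidence."""
    lo, hi = 0.0, 10.0 * (observed + 1)
    for _ in range(200):  # bisection; the CDF decreases monotonically with lam
        mid = 0.5 * (lo + hi)
        if poisson_cdf(observed, mid) > 1.0 - confidence:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def ppm_per_yield_bin(batches, bin_width=5.0, confidence=0.60):
    """Pool batches per yield interval and return {lower bin edge: PPM level}.

    `batches` is an iterable of (yield in percent, shipped devices, returned devices).
    """
    shipped = defaultdict(int)
    returned = defaultdict(int)
    for yield_pct, n_shipped, n_returned in batches:
        edge = math.floor(yield_pct / bin_width) * bin_width
        shipped[edge] += n_shipped
        returned[edge] += n_returned
    return {edge: 1.0e6 * upper_bound_failures(returned[edge], confidence) / shipped[edge]
            for edge in sorted(shipped)}


# Hypothetical batches: (yield %, shipped, returned).
batches = [(72.0, 1_000_000, 0), (74.9, 1_000_000, 0), (81.0, 500_000, 1)]
print(ppm_per_yield_bin(batches))
```

Pooling the returns per yield interval before computing the PPM level is what moves the data beyond the statistical noise seen for individual batches in fig. 2.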
Fig. 3: PPM level of line fall-off returns versus batch yield category, together
with the number of shipped dies (+) and the number of line fall-off returns, for a
CMOS product (a), Bipolar product #1 (b), Bipolar product #2 (c) and a
BICMOS product (d). The thick line is the model fit.
6.3.2 Relation between line fall-off and field returns
In order to correlate yield and reliability in the field, we used data from bipolar
product #2 from which we got reliable field return data. The relation between line
fall-off and the field return quantity is depicted in fig. 4, where a clear correlation
can be seen. It is a strong indication of the fact that line fall-off is in fact both a
first reliability screen and a reliability indicator.
[Fig. 4 chart: number of returns [a.u.] for line fall-off and field returns per yield category, from 60-65% to 85-90%.]
Fig. 4: Line fall-off versus field returns for bipolar product #2.
6.3.3 Relation between yield and burn-in reject rate
Data from 6 million CMOS SRAM products were used for this study [3]. This
product initially was subjected to a 100% burn-in (24 hrs / 150 °C) while moving
along the learning curve. For both yield and burn-in rejects a distinction was made between functional (e.g. shorts and opens) and parametric (e.g. standby supply
current or access time) failures. The functional burn-in reject rate is plotted versus
the functional batch yield in fig. 5. The thick line is the model fit. Again data
from all batches within a yield range were combined to move beyond the statistical noise level, the error bars indicate the one sigma spread. Also these data can
be fitted very well with equation (2), demonstrating the universal application of
the model linking yield to reliability via the ratio α.
Fig. 5: Functional burn-in reject rate versus functional batch yield for a
CMOS SRAM together with the number of shipped dies (+).
No correlation was obtained when plotting parametric burn-in reject rate versus parametric yield loss per batch, see fig. 6. Failure analysis showed that this
was due to the fact that the parametric rejects were caused by many different rootcauses that occur during the immature phase of a process, like parasitic transistor
leakage, junction leakage or out-of-control process parameters like poly gate width
determining transistor transconductance. The functional rejects were primarily
caused by particles and patterning defects. Thus when dealing with new state-of-the-art technologies the model will only be valid if reliability failures are correlated to the functional batch yield, thus disregarding the parametric yield loss.
Fig. 6: Parametric burn-in reject rate versus the parametric yield loss for a
CMOS SRAM.
6.3.4 Relation between burn-in and High Temperature Operating
Life (HTOL) failure rate
In total 76 CMOS SRAM batches were subjected to a 144 hrs High
Temperature Operating Lifetest at 150°C and 5.5V after the 24 hrs burn-in to
assess the EFR of products shipped to the customer. Fig. 7 shows a clear
correlation between the cumulative Burn-in and EFR reject rate (measured in
PPM rejects). The EFR can thus be predicted from the burn-in failure rate. Failure
analysis shows that this is due to the fact that the rootcauses of burn-in and EFR
rejects are the same. Furthermore fig. 7 shows that the EFR is about 1.6 times
larger than the burn-in reject rate. This is in very good agreement with the failure
rate model of Philips Consumer Electronics [8]. This model is based on a study of
lifetest data of tens of thousands MOS and bipolar devices and shows that the failure rate as a function of time is best described by a Weibull distribution with a β
of 0.45.
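A rough consistency check of this ratio can be made with the Weibull shape parameter alone. The sketch below is an illustration under stated assumptions, not the calculation of [8]: it assumes that burn-in and lifetest have the same acceleration (both are at 150°C in the text), that the cumulative failure fraction is small so that F(t) is proportional to t^β, and that the EFR rejects are the failures occurring in the 144 hrs following the 24 hrs burn-in. The function name is hypothetical.

```python
def efr_to_burn_in_ratio(burn_in_hours, life_test_hours, beta):
    """Ratio of failures during the lifetest to failures during the preceding burn-in.

    Uses the small-fraction Weibull approximation F(t) ~ (t / eta) ** beta,
    in which the scale parameter eta cancels out of the ratio.
    """
    total_hours = burn_in_hours + life_test_hours
    return (total_hours / burn_in_hours) ** beta - 1.0


ratio = efr_to_burn_in_ratio(burn_in_hours=24.0, life_test_hours=144.0, beta=0.45)
print(f"expected EFR / burn-in reject ratio: {ratio:.2f}")
```

Under these assumptions the expected ratio is about 1.4, which is of the same order as the factor of about 1.6 observed in fig. 7.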
[Fig. 7 plot: burn-in failure rate [a.u.] (0 to 10) versus early life failure rate [a.u.] (0 to 15).]
Fig. 7: Early Life Failure Rate (EFR) versus burn-in reject rate for a CMOS
SRAM.
6.4 FAILURE RATE PREDICTION AND ASSESSMENT
The factor α is expected to be the same for similar products in a given technology.
It may even be constant for an entire waferfab which is indicated by our data. If
so, then the EFR and field failure rate can be predicted independent of the die area
by measuring die yield only; area effects are accounted for in the model via the
yield. Thus yield will be the only relevant indicator of reliability. If in a new technology in a given waferfab the EFR or PPM level can be determined in only one
yield range (e.g. by taking a large low yielding die for lifetest or PPM-cooperations with customers), then by assuming a certain maximum yield M the value of
α can be assessed. The value of M is not very critical, so reasonable predictions of
EFR and PPM levels can be made quickly after process or product introduction.
Note that the relation between α, waferfab and technology has not yet been fully
quantified.
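The assessment procedure described above can be sketched as follows, using the model R = 1 − (Y/M)^α of equation (2). The measured PPM level, the yield of the low yielding die and the assumed maximum yield M are hypothetical numbers chosen for illustration only; the function names are ours.

```python
import math


def assess_alpha(reject_fraction, batch_yield, max_yield):
    """Invert R = 1 - (Y/M)**alpha for alpha from one measured yield range."""
    return math.log(1.0 - reject_fraction) / math.log(batch_yield / max_yield)


def predict_reject_fraction(batch_yield, max_yield, alpha):
    """Equation (2): reject fraction R at a given batch yield Y."""
    return 1.0 - (batch_yield / max_yield) ** alpha


# Hypothetical measurement: 500 PPM found on a large die yielding 60%,
# with an assumed maximum yield M of 95%.
MEASURED_PPM = 500.0
LOW_YIELD = 0.60
MAX_YIELD = 0.95

alpha = assess_alpha(MEASURED_PPM * 1e-6, LOW_YIELD, MAX_YIELD)
print(f"assessed alpha: {alpha:.2e}")

# Predict the PPM level of other products in the same technology from yield only.
for batch_yield in (0.70, 0.80, 0.90):
    ppm = 1e6 * predict_reject_fraction(batch_yield, MAX_YIELD, alpha)
    print(f"yield {batch_yield:.0%}: predicted level {ppm:.0f} PPM")
```

Repeating the calculation with a somewhat different M gives an impression of how sensitive the prediction is to the assumed maximum yield.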
6.5 OPTIONS FOR FAILURE RATE REDUCTION
6.5.1 Yield improvement
Because both product reliability and yield are determined by defect related
rootcauses, see fig. 1, a waferfab may control out-going product reliability by determining equipment defect densities. The continuous defect reduction programs
ongoing in our waferfabs are aimed at those defects that have the largest impact
on yield and reliability (e.g. particles, patterning defects). The impact on reliability
is evaluated via lifetest and analysis of line fall-off rejects. This approach will
enforce an improvement in yield and in the associated reliability level as shown in
fig. 8 and 9. These depict the defect density trend for the bipolar/BICMOS waferfab over a three year period and the resulting line fall-off trend for the BICMOS
product (in combination with screens at product test) respectively. The rejects per
batch distribution in fig. 10 for the process in which bipolar product #2 is made
shows that the number of batches with a high number of returns has decreased significantly from 1992 to 1995. The worst batch in 1992 gave 25 returns, whereas
the worst batches in 1994 and 1995 gave 7 and 5 returns respectively. Here we
disregarded the two batches with 12 and 16 returns in 1995, which were caused by
particle contamination incidents. Furthermore the percentage of batches causing
returns decreased from 38% in 1992 to 11% in 1995. Thus the yield improvement
program results not only in a continuously reducing Early Failure Rate but also in
a reduction of the probability of occurrence of ‘maverick’ batches.
Fig. 8: Defect density trend for the Bipolar/ BICMOS production line. [Plot axes: Defect Density [a.u.] versus month, 1993 to 1996.]
Fig. 9: Line Fall-off Trend for the BICMOS product. [Plot axes: ppm level [a.u.] versus month, Jan '95 to Mar '96.]
Fig. 10: Line fall-off per batch distribution for 1992 to 1995 for the bipolar process #2. [Plot axes: % of batches versus rejects per batch (0 to 24), per year 1992 to 1995.]
6.5.2 Elimination of Special Causes (‘Maverick’ batches)
As stated in the introduction, part of the product reliability failures are caused
by random processing incidents (‘special causes’) that may result in ‘maverick’
batches. These incidents generally result in low yielding wafers and thus may be
detected by having a consistent analysis system of low yielding wafers in place.
Based on the analysis it is decided whether the wafers should be scrapped to eliminate potential maverick batches. The effectiveness of this system is clear from
fig. 10. However, practice shows that not all processing incidents result in low
yield and thus as such may remain unnoticed until feedback from the customer is
obtained. Quick detection of maverick batches is in this case often hampered by
the fact that products from one batch are delivered to many customers so that the
number of returns per customer can be low while still dealing with a maverick
batch.
This problem can be circumvented by detailed analysis of the line fall-off trend
of a product. Assume the background defect PPM level of a product caused by random process defects like particles is r. The line fall-off per batch distribution in
fig. 10 then should follow a binomial distribution and the probability P(X) of finding X rejects in a sample size of N products (equal to the batch size) equals:
$$P(X) = \frac{N!}{X!\,(N-X)!}\; r^{X}\,(1-r)^{(N-X)} \qquad (3)$$
Typically the probability of finding more than 2 to 3 rejects per batch is very
low. Thus rejects from batches with 3 to 4 or more returns can be attributed to
‘special causes’ (processing incidents) and the other rejects to ‘normal causes’
(random process defects). Fig. 11 shows the line fall-off trend for a complete product family of bipolar product #2 where this distinction between rejects has been
made. On a relatively stable background of random process defect related returns
several excursions can be seen related to a limited number of processing incidents
affecting numerous batches that were not detected at E-sort. In our line fall-off
customer return analysis system occurrence of these batches is automatically signalled (‘Batch Oriented Analysis’) and high priority is given to rootcause analysis
of these rejects as they might be the first of many maverick batches. In this way
corrective actions can be implemented as early as possible.
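The distinction between ‘normal’ and ‘special’ causes can be illustrated with a short calculation based on the binomial distribution of equation (3). The batch size N and the background PPM level r used below are hypothetical values chosen for illustration; they are not taken from the data in fig. 10 or 11.

```python
import math


def binomial_pmf(x, n, r):
    """Equation (3): probability of exactly x rejects in a batch of n products."""
    return math.comb(n, x) * r ** x * (1.0 - r) ** (n - x)


def prob_more_than(x, n, r):
    """Probability of finding more than x rejects in a batch of n products."""
    return 1.0 - sum(binomial_pmf(k, n, r) for k in range(x + 1))


# Hypothetical example: batches of 5000 shipped products and a background
# level of 100 PPM caused by random process defects.
BATCH_SIZE = 5000
BACKGROUND_LEVEL = 100e-6

for limit in (1, 2, 3, 4):
    p = prob_more_than(limit, BATCH_SIZE, BACKGROUND_LEVEL)
    print(f"P(more than {limit} rejects per batch) = {p:.2e}")
```

For background levels of this order the probability of more than 3 rejects in one batch is far below 1%, so a batch exceeding that number is unlikely to be explained by random process defects alone.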
Fig. 11: Line fall-off trend of the complete product family of bipolar product #2. [Plot: # returns / 10 succ. lots [a.u.] versus time, 1990 to 1995, split into special causes and normal causes.]
6.5.3 Screening of weak parts with Latent Defects during product
test
An interesting issue arises when additional EFR improvement is required beyond a given target level. Fig. 3 and equation (2) show that the brute force technique
of yield improvement by line defect density reduction only provides a limited gain.
If, however, one would achieve a reduction in the reliability to yield defect ratio α,
the EFR improvement would be much larger. To this end, a larger fraction of the
defects must be forced to induce a failure at E-sort or at least before parts are
shipped to the customer. Usually burn-in is applied for this purpose. However,
smart testing which measures product yield beyond the traditional zero hour point
could also reduce the reliability to yield defect ratio α. Examples are the implementation of IddQ (quiescent current) testing, voltage screens (V-screen) and distribution testing methods at E-sort or final test. Fig. 12 and 13 illustrate the possible effects of extended testing during E-sort.
Fig. 12: Voltage screen induced failures versus batch yield at E-sort for 30 batches of the bipolar product #2. [Plot axes: V-screen rejects [a.u.] versus E-sort yield [%], 55 to 95.]
Fig. 13: IddQ and voltage screen induced failures versus batch yield for 22 and
132 batches of two BICMOS products respectively.
In fig. 12 and 13 a clear correlation between yield and V-screen and IddQ rejects is seen. Furthermore, for the bipolar product #2 it was found that for both V-screen and reliability failures the predominant failure mode was particles causing
metal 1 to metal 2 shorts. This is a clear indication that a V-screen forces potential reliability failures (‘latent defects’) to fail at test. Fig. 14 shows the ppm versus yield curve for
products both with (2 million devices) and without V-screen (11 million devices).
Again the error bars indicate the 10% and 90% confidence level. It can be seen
that the resulting reduction in α is about a factor of 2.5. To obtain a similar EFR
reduction by yield improvement alone, the yield should have increased another
10%. Note that this implies a reduction of the waferfab defect density by more
than a factor of 2. The effect of the implementation of the V-screen and IddQ test
at E-sort, in combination with the controlled defect density reduction, on the EFR
of the BICMOS product is shown in fig. 9.
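The equivalence between a reduction in α and a yield increase can be made explicit with the model R = 1 − (Y/M)^α. The sketch below computes the yield that would give the same reject fraction at unchanged α. The starting yield of 78% and the maximum yield M of 93% are assumed here for illustration, and the defect density factor additionally assumes a Poisson yield model Y = M·exp(−A·Dy), which is our assumption.

```python
import math


def equivalent_yield(batch_yield, max_yield, alpha_reduction):
    """Yield that gives, at unchanged alpha, the same R as dividing alpha
    by alpha_reduction at the current yield (from R = 1 - (Y/M)**alpha)."""
    return max_yield * (batch_yield / max_yield) ** (1.0 / alpha_reduction)


def defect_density_factor(batch_yield, new_yield, max_yield):
    """Defect density reduction factor, assuming Y = M * exp(-A * Dy)."""
    return math.log(batch_yield / max_yield) / math.log(new_yield / max_yield)


# Assumed illustrative values, not taken from fig. 14.
CURRENT_YIELD = 0.78
MAX_YIELD = 0.93
ALPHA_REDUCTION = 2.5   # reduction in alpha reported for the V-screen

new_yield = equivalent_yield(CURRENT_YIELD, MAX_YIELD, ALPHA_REDUCTION)
factor = defect_density_factor(CURRENT_YIELD, new_yield, MAX_YIELD)
print(f"equivalent yield: {new_yield:.1%} (gain {new_yield - CURRENT_YIELD:.1%})")
print(f"required defect density reduction factor: {factor:.2f}")
```

With these assumed numbers the required yield gain is just under 10 percentage points and the required defect density reduction equals the α reduction factor, consistent with the statements above.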
Fig. 14: PPM level of line fall-off returns versus batch yield category of the bipolar product #2 for batches with and without V-screen at E-sort,
together with the number of shipped dies with V-screen (+). Dashed lines are the model fits.
6.6 CONCLUSIONS
There is a strong relation between IC product yield and failures that occur during burn-in or in the early life of product use. This relation was found using five
different IC’s, running in high volumes and manufactured in several processes,
from two waferfabs. Similar results were later obtained on a 0.25µm state-of-the-art microprocessor process [3]. The correlation between yield and reliability was
found to obey a simple model, in which the reliability defect density is defined as
a fraction α of the yield defect density. An implication of the model is that reliability
prediction of a certain type of IC may be done based on its yield alone. In case of
non-mature processes one should only take the functional yield into account and
disregard parametric yield loss. ‘Maverick’ batches show up as batches with more
than 2 or 3 rejects but their occurrence cannot be prevented by screening on E-sort yield only. Using the ppm-yield relation, it is shown how a waferfab may
improve the reliability level of its products in a fast and controlled way. The model also indicates that yield improvement may not be the most effective way to
achieve a reliable product if the reliability to yield defect ratio α is large. It is
shown for two products that a substantial improvement can be achieved at the product test stage by implementation of screening methods like voltage screen or IddQ
tests.
6.7 REFERENCES
[1] D.L. Crook, ‘Evolution of VLSI reliability engineering’, Proceedings International Reliability Physics Symposium, pp. 2-11, (1990).
[2] H.H. Huston, C.P. Clarke, ‘Reliability defect detection and screening during
processing - theory and implementation’, Proceedings International Reliability
Physics Symposium, pp. 268-275, (1992).
[3] W.C. Riordan, R. Miller, J.M. Sherman, J. Hicks, ‘Microprocessor reliability
performance as a function of die location for a 0.25 µm five layer metal
CMOS logic process’, Proceedings International Reliability Physics Symposium, pp. 1-11, (1999).
[4] F. Kuper, J. van der Pol, E. Ooms, T. Johnson, R. Wijburg, W. Koster, D.
Johnston, ‘Relation between yield and reliability of integrated circuits: experimental results and application to continuous early failure rate reduction programs’, Proceedings International Reliability Physics Symposium, pp. 17-21,
(1996).
[5] C. Glenn Shirley, ‘A defect model of reliability’, Tutorial IRPS, (1995).
[6] V. Riviere, A. Touboul, S.B. Amor, G. Gregoris, J.L. Stevenson, P.S. Yeung,
‘Evidence of a correlation between yields and reliability data for a rad-hard
SOI technology’, Proc. of the 1995 International Conference on Microelectronic Test Structures, pp. 221-224, (1995).
[7] J.G. Prendergast, ‘Reliability and quality correlation for a particular failure
mechanism’, Proceedings International Reliability Physics Symposium, pp.
87-93, (1993).
[8] H.R. Claessen, ‘Reliability of IC’s’, Summer Course on reliability and yield
in MOS VLSI technologies, Volume IV, IMEC, Leuven, Belgium, June 5-9,
(1989).
[10] J.A. van der Pol, F.G. Kuper, E.R. Ooms, ‘Relation between yield and reliability of integrated circuits and application to failure rate assessment and reduction in the one digit FIT and PPM reliability Era’, Microelectronics &
Reliability, pp. 1603-1610, (1996).
[11] T. Kim, W. Kuo, ‘Modelling manufacturing yield and reliability’, IEEE
Transactions on Semiconductor Manufacturing, pp. 485-492, (1999).
7
Impact of Screening of Latent Defects
at Electrical Test on the
Yield-Reliability Relation and
Application to Burn-in Elimination [16]
7.1 Introduction
7.2 Impact of screening latent defects at e-sort on product reliability
7.2.1 Yield-reliability relation
7.2.2 Failure rate reduction options
7.2.3 Impact of latent defect screens at e-sort on yield-reliability
relation
7.3 Model predicting burn-in failure rate from batch yield
7.3.1 Experiment and failure rate evolution model
7.3.2 Validation of the model
7.3.3 Process dependence of the model constants
7.3.4 Burn-in failure rate prediction
7.4 Application of model to burn-in elimination
7.4.1 Impact of screens
7.4.2 Verification of the model
7.5 Conclusions
7.6 References
7.1 INTRODUCTION
Since the mid-seventies the issue of integrated circuit ‘infant mortality’ has received growing interest as it was recognised [1] that next to wear-out failure mechanisms, also ‘early failures’ [2] could have a significant impact on the overall
circuit reliability as encountered in the field. In today’s products and processes,
wear-out failures during operational life are virtually eliminated due to the adoption
of ‘wafer level reliability’ (WLR) or ‘building-in reliability’ (BIR) techniques
during process development [3] and the use of reliability related design rules and
reliability simulation techniques during product design. Consequently, product reliability failures are currently dominated by randomly distributed ‘latent defects’
due to processing incidents or to process defect density related causes such as particles, litho defects and gate oxide defects.
In order to improve the reliability of their products in the field, many manufacturers of e.g. television sets or car radio sets have adopted screening techniques
[4] and require (especially in case of military or automotive applications) either a
full or a sample burn-in to weed out the latent defects, thus hoping to achieve one-digit-FIT reliability levels. At the same time, integrated circuit manufacturers have adopted techniques like in-line defect monitoring, statistical process control
(SPC), rigorous yield learning and defect reduction programs [5] and various latent defect and maverick screens in order to improve yield and reduce the number
of reliability failures due to manufacturing flaws [6,7]. Fig. 1 shows for example
the yield learning curve of a Bipolar-CMOS-DMOS (BCD) process in our Bipolar-BiCMOS waferfab. In fig. 2 the defect density trend in the same fab is depicted, demonstrating a 30% improvement rate per year. As can be seen in fig. 3, the
associated yield improvement together with the introduction of latent defect
screens resulted in a factor 10 reduction of the line fall-off rate of a BiCMOS TV
signal processing IC in 2.5 years. Line fall-off failures are in this case products
that fail after printed circuit board assembly and test at the customer’s site (e.g. a
television set manufacturer).
Fig. 1: Yield learning curve of a Bipolar-CMOS-DMOS (BCD) process in the
Bipolar-BiCMOS waferfab.
As a result of all the IC manufacturers’ efforts, product reliability has improved
dramatically over the years, see fig. 4, and single-digit FIT early failure rates and
line fall-off PPM numbers are not uncommon today. However, despite the reliability improvement, in many cases burn-in is still mandated ‘blindly’ with little regard for the cost involved (burn-in operation and associated yield loss) and the actual benefits realised (reduction of both warranty cost and customer dissatisfaction). The latter is especially an issue for high yielding mature manufacturing lines with average yields above 85-90%. Note that in the context of this paper the
term burn-in strictly applies to an extended operation at elevated temperature and
not to other screens.
Fig. 2: Defect density trend of the Bipolar-BiCMOS waferfab. [Plot axes: Defect Density [a.u.] versus month, 1993 to 1997.]
Fig. 3: Line fall-off PPM trend of BiCMOS product #1.
Fig. 4: Trend in early- and intrinsic failure rate (FIT) targets for application in
consumer products.
Only a few papers in the literature deal with the above trade-off between burn-in
cost and benefit. In [8,9,10] models were developed to provide a rational basis for
setting burn-in yield criteria based on a detailed analysis of actual product burn-in
and lifetest data. However, no relation to batch yield at electrical test (E-sort) was
established. In this paper we will firstly show quantitatively how screening of latent defects at E-sort testing using tests like voltage screen, IddQ (quiescent current)
testing and distribution testing improves product reliability and secondly
how these tests can eliminate the need for burn-in based on a model, supported by
experimental data, that relates the necessary burn-in condition to the required reliability level and to the batch yield.
7.2 IMPACT OF SCREENING LATENT DEFECTS AT E-SORT ON PRODUCT RELIABILITY
7.2.1 Yield-Reliability Relation
In [6,7] it was shown using burn-in and customer line fall-off data that there is
a clear relation between yield and reliability of integrated circuits in case the yield
loss is dominated by defects like particles. This yield-reliability relation was successfully modelled by equation (2) with R the fraction of rejects at a certain use (or
stress) condition, Y the batch yield, M the maximum possible yielding fraction, allowing for clustering effects, placement of Process Control Modules and edge exclusions and α the ratio between reliability and yield defect density Dr and Dy, see
equation (1) [6,7]. M will typically exceed 90% for commercial processes.
$$D_r = \alpha \cdot D_y \quad [\mathrm{cm}^{-2}] \qquad (1)$$

$$R = 1 - \left(\frac{Y}{M}\right)^{\alpha} \qquad (2)$$
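One way to see how equation (2) follows from equation (1) is sketched below. The sketch assumes a Poisson yield model with critical die area A, which is an assumption made here for illustration; the derivation in [6,7] may differ in detail.

```latex
% Assumption: Poisson statistics for both yield defects and reliability defects,
% with the same critical die area A.
Y = M \, e^{-A D_y}
\qquad\Longrightarrow\qquad
1 - R = e^{-A D_r} = e^{-\alpha A D_y} = \left(\frac{Y}{M}\right)^{\alpha}
```

In this picture the die area cancels out, so that the reject fraction R depends on the die only through its yield Y.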
The yield-reliability ratio α is dependent on waferfab, technology and product
operating conditions. In table 1 normalised data are shown for 8 different products
in 4 technologies from 3 waferfabs totalling 84 million devices. The effect of operating condition on α can be noticed from the fact that the α-value (and thus according to equation (2) also the line fall-off) at 95°C is significantly higher than
that at 65°C or 85°C and from the high α-value for the product that was subjected
to burn-in [7]. Note that in the latter case we correlated the batch yield with the
burn-in rejects and not with the rejects of the products that were shipped into the
field after the burn-in; the α-value of these burned-in devices in the field will be
much lower.
7.2.2 Failure Rate Reduction Options
Failure rate reduction can be achieved by the brute force technique of yield improvement and by a reduction of the reliability to yield defect ratio α. For the latter, a larger fraction of latent defects must be forced to induce a failure at E-sort
testing or at least before the parts are shipped to the customer in the field. Applying burn-in is one way to do the job (products after burn-in will show a low α-value
in the field) but introducing product screens at E-sort testing is potentially
much more cost-effective. The most commonly used screening techniques include
voltage screens, distribution testing and, for CMOS circuitry, IddQ testing.
In case of a voltage screen the product is operated at an elevated supply voltage below the intrinsic breakdown voltage of the process for about 10-100ms while
applying some functional test patterns. The goal is to screen out gate oxide defects
and near-shorts due to particles or defects during lithographic processing. In case
of distribution testing many different implementations exist. In our case we determine the distribution of a few critical analog parameters (e.g. a supply current or a
reference voltage) for all good products on every individual wafer and reject any
outliers even if the product is still within its datasheet specification. This is based
on the belief that as long as the abnormal product behaviour is not understood,
its reliability cannot be guaranteed either. Finally it has been shown [11] that in
CMOS circuitry IddQ testing is an effective tool to detect latent defects due to resistive short-circuit paths in a product that have not yet led to stuck-at failures.
IddQ testing is often combined with voltage screens. In fig. 5 the correlation
between batch E-sort yield and IddQ rejects is shown for 575 batches of two
BiCMOS products totalling more than 8 million devices. It can be clearly seen
that the lower the batch yield, the higher the IddQ fall-out and the more latent defects are screened out. Also the maverick behaviour of some lots is evident. The
effect on the improvement of the line fall-off rate of one of these products can be
seen in fig. 3.
| Fab | Product | Sample size (x10^6) | E-sort screen | Data source | Norm. α-value |
|-----|---------|---------------------|---------------|-------------|---------------|
| A | 1.0µ CMOS1 | 2.7 | no | Line fall-off @ 65°C | 1.0 |
| A | 1.2µ CMOS2 | 6.0 | no | Burn-in 24hr @ 150°C | 59 |
| B | 3µ Bipolar1 | 14.9 | no | Line fall-off @ 85°C | 2.7 |
| B | 3µ Bipolar2 | 8.7 | no | Line fall-off @ 95°C | 3.5 |
| B | 3µ Bipolar2 | 2.4 | 1 | Line fall-off @ 95°C | 2.2 |
| B | 1.5µ BiCMOS1 | 13.0 | no | Line fall-off @ 85°C | 1.8 |
| B | 1.5µ BiCMOS2 | 1.6 | no | Line fall-off @ 85°C | 2.3 |
| B | 1.5µ BiCMOS2 | 3.4 | 1,2 | Line fall-off @ 85°C | 1.2 |
| B | 1.5µ BiCMOS3 | 5.8 | 1,2,3 | Line fall-off @ 85°C | 1.3 |
| C | 1.5µ BiCMOS1 | 7.5 | no | Line fall-off @ 85°C | 1.5 |
| C | 1.5µ BiCMOS1 | 7.7 | 1,2 | Line fall-off @ 85°C | 0.5 |
| C | 1.5µ BiCMOS4 | 10.1 | 1,2,3 | Line fall-off @ 85°C | 1.8 |

Table 1: Normalised yield-reliability ratios α for various products from different
processes and different waferfabs totalling 84 million devices (A=
CMOS fab, B= Bipolar-BiCMOS fab, C= CMOS-BiCMOS fab) using
different screens (1= Voltage Screen, 2= Distribution Testing, 3= IddQ).
7.2.3 Impact of Latent Defect Screens at E-sort on Yield-Reliability
Relation
The impact of latent defect screens on product reliability is demonstrated by
the data in fig. 6, 7 and 8. All data in a 5% batch yield interval have been combined to obtain statistically meaningful results. In the figures the number of shipped
devices in each yield category is shown as well as the correlation between yield
and line fall-off rejects. The PPM-number is determined for each yield category by
dividing all customer returns by the number of shipped devices in that yield category. The error bars denote the 10% and 90% confidence limits. The dashed lines
are the weighted fits using the model in equation (2) and the number of shipped
devices as the weight factor. The M-values used in the fit are derived from the
yield data and given in the figure captions.
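The data treatment described above can be sketched as follows: the PPM level per yield category is computed from returns and shipped volume, after which α is fitted with the number of shipped devices as weight factor. The sketch uses a linearised form of equation (2), ln(1 − R) = α·ln(Y/M), fitted through the origin, which is our simplification and not necessarily the fitting procedure used for the figures. The return counts and volumes are hypothetical.

```python
import math


def ppm_per_category(returns, shipped):
    """PPM level per yield category: customer returns divided by shipped devices."""
    return [1e6 * n_ret / n_ship for n_ret, n_ship in zip(returns, shipped)]


def fit_alpha_weighted(yields, ppm_levels, shipped, max_yield):
    """Weighted least-squares fit through the origin of
    ln(1 - R) = alpha * ln(Y / M), weighted by shipped volume."""
    numerator = 0.0
    denominator = 0.0
    for batch_yield, ppm, weight in zip(yields, ppm_levels, shipped):
        x = math.log(batch_yield / max_yield)
        z = math.log(1.0 - ppm * 1e-6)
        numerator += weight * x * z
        denominator += weight * x * x
    return numerator / denominator


# Hypothetical data: midpoints of 5% yield categories, shipped devices and returns.
YIELDS = [0.725, 0.775, 0.825, 0.875]
SHIPPED = [0.4e6, 1.1e6, 2.3e6, 1.6e6]
RETURNS = [40, 80, 110, 40]
MAX_YIELD = 0.92

ppm_levels = ppm_per_category(RETURNS, SHIPPED)
alpha = fit_alpha_weighted(YIELDS, ppm_levels, SHIPPED, MAX_YIELD)
print("PPM per category:", [round(p, 1) for p in ppm_levels])
print(f"fitted alpha: {alpha:.2e}")
```

Because of the weighting, the categories with the largest shipped volume dominate the fitted α.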
Fig. 5: IddQ test induced failures at E-sort versus batch yield for a) 186 batches of
BiCMOS product #2 from Fab B and b) 389 batches of BiCMOS product
#1 from Fab C, see table 1. [Plot axes for both panels: IddQ rejects [a.u.] versus E-sort yield [%]; panel (a) spans 70 to 95%, panel (b) 50 to 80%.]
Fig. 6 shows the data for a bipolar power amplifier shipped to only one automotive customer before (8.7 million devices) and after (2.4 million devices) introduction of a voltage screen intended to screen out particles causing intermetal
shorts. Under a PPM-cooperation agreement all line fall-off failures were sent
back to us, resulting in a good fit between the model and the data. We find that the
applied voltage screen reduced α by a factor of 1.6. Note furthermore that
the average yield of the batches with V-screen is higher than without V-screen
due to the continuous yield improvement program in the waferfab.
Fig. 6: Bipolar product #2: PPM level of line fall-off returns and number of
shipped devices versus batch yield category for batches with () and
without (o) voltage screen. Dashed lines are the (weighted) model fits
using M= 93%. All data in a 5% yield interval have been combined.
In fig. 7 and 8 similar data are shown of the effect of the introduction of a voltage screen and supply current related distribution tests for two different BiCMOS
TV signal processing ICs from two waferfabs (in total 7.5 and 1.6 million devices
before and 7.7 and 3.6 million devices after introduction of the screens respectively). It can be clearly seen that for all yield categories the PPM-number of the devices subjected to the E-sort screens is lower than that of the devices without screen.
The corresponding α reduction factors are 3.0 and 2.0 respectively, demonstrating
quantitatively how reliability can effectively be improved by introduction of E-sort
screens.
The relatively poor fit between the model in equation (2) and the data in fig. 7
and 8 is probably caused by the fact that, due to the large volumes shipped and the
large customer base, not all customers sent their line fall-off rejects back. This
results in too optimistic values for the calculated α-number in table 1 but, as this
effect holds for both the samples with and without screens, the determined α-reduction factor will still be reasonably accurate. Furthermore it must be noted that
due to the weight factor used, the fit tends to favour the data points with the highest number of shipped devices.
Fig. 7: BiCMOS product #1: PPM level of line fall-off returns and number of
shipped devices versus batch yield for batches with () and without (o)
screen consisting of voltage screen and distribution tests. Dashed lines are
the (weighted) model fits using M= 92%.
Fig. 8: BiCMOS product #2: PPM level of line fall-off returns and number of
shipped devices versus batch yield for batches with () and without (o)
screen consisting of voltage screen and distribution tests. Dashed lines are
the (weighted) model fits using M= 92%. [Plot axes: PPM level [a.u.] and # shipped [millions] versus yield range [%], 50-55 to 90-95; legend: No screen, With screen, model fits.]
7.3 MODEL PREDICTING BURN-IN FAILURE RATE FROM BATCH YIELD
7.3.1 Experiment and Failure Rate Evolution Model
In order to model the effects of burn-in, it is necessary to know the
failure rate evolution versus time. For this purpose a large-scale
experiment was set up in the mid-eighties by Philips Consumer Electronics and Philips Semiconductors to investigate the correlation between failure rates of products during
conventional lifetests and failure rates of products in their real operating environment [12]. Data from this time period are included because the failure rates
observed at that time were still high, allowing statistically relevant experiments with limited
sample sizes. The experiment included over 50 different bipolar product types of
various complexity fabricated in four different waferfabs. Extensive lifetesting was
carried out up to 8000hrs at junction temperatures Tj of 110°C, 125°C, 140°C and
170°C and simultaneously the cumulative failure curve of the same products in
the set under normal operating conditions was registered, see fig. 9. The error
bars denote the 60% confidence interval. Sample sizes during the lifetest were
4000, 7000, 2600 and 330 products respectively up to 1000hrs and 330, 330, 300
and 90 products respectively up to 8000hrs. The junction temperature of the products in the set Tj,set was 105°C with an estimated spread of ± 5°C. Failure analysis showed that the failures were dominated by defects; no wear-out was observed.
Fig. 9: Cumulative failure plot for various bipolar IC’s during lifetest at four different junction temperatures and of the same products operated in the set
at Tj = 105°C.
The set cumulative failure curve shows two slopes and can be described by a
combination of two Weibull distributions F1 and F2. Based on the bathtub model
[2] we assume for F2 a constant failure rate distribution. This results in equation
(3) with Ft,T the cumulative failed fraction at operating time t and junction temperature T, β the Weibull shape factor and η1,2,T the experimentally determined characteristic lives at junction temperature T. ηT at temperatures other than T can be
calculated using equation (4). The η2 /η1 ratio determines the cross-over point
tXover,T at temperature T between the two slopes in fig. 9. For practical cases where
F1 << 0.01, F1 and F2 can be approximated by (t/η1,T)^β and t/η2,T respectively, and tXover,T then follows from equation (5)
because F1 = F2 at t = tXover,T.
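$$F_{t,T} = F_1 + (1 - F_1)\cdot F_2 = 1 - \exp\left\{-\left[\left(\frac{t}{\eta_{1,T}}\right)^{\beta} + \left(\frac{t}{\eta_{2,T}}\right)\right]\right\} \qquad (3)$$

$$\eta_{1,2,T} = C_{1,2}\cdot \exp\left(\frac{q\cdot E_a}{k\cdot T}\right) \quad [\mathrm{hrs}] \qquad (4)$$

$$t_{Xover,T} = \left(\frac{\eta_{1,T}^{\,\beta}}{\eta_{2,T}}\right)^{\frac{1}{(\beta-1)}} \quad [\mathrm{hrs}] \qquad (5)$$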
Using the above model, we can calculate the lifetest results of the IC’s back to
Tj,set , see fig. 10. Again the error bars denote the 60% confidence interval. The
(weighted) fit between the model and the data is also shown. It appears that the lifetest failures follow the same cumulative failure curve as the set test failures. We
find for the model constants Ea= 0.7 ± 0.1 eV, β= 0.40 ± 0.05 and tXover equals
about 3000hrs at 105°C. Note that η1 and η2 together determine the absolute failure level. The Ea of 0.7eV corresponds well with the value of 0.65eV reported in
[13].
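The failure rate evolution model of equations (3) to (5) can be written down in a few lines. The sketch below uses β = 0.40, Ea = 0.7eV and a cross-over point of 3000hrs at 105°C as found above, whereas the absolute value of η1 is a hypothetical number chosen for illustration; the function names are ours.

```python
import math

K_BOLTZMANN = 8.617e-5  # Boltzmann constant in eV/K, so q*Ea/k reduces to Ea[eV]/k


def acceleration_factor(temp_stress_c, temp_use_c, ea_ev=0.7):
    """Ratio eta(T_use) / eta(T_stress) following from equation (4)."""
    t_stress = temp_stress_c + 273.15
    t_use = temp_use_c + 273.15
    return math.exp(ea_ev / K_BOLTZMANN * (1.0 / t_use - 1.0 / t_stress))


def failed_fraction(t, eta1, eta2, beta):
    """Equation (3): Weibull term F1 combined with constant failure rate term F2."""
    return 1.0 - math.exp(-((t / eta1) ** beta + t / eta2))


def crossover_time(eta1, eta2, beta):
    """Equation (5): time at which the two exponent terms of eq. (3) are equal."""
    return (eta1 ** beta / eta2) ** (1.0 / (beta - 1.0))


BETA = 0.40
T_XOVER_SET = 3000.0   # hrs at Tj,set = 105 degC
ETA1_SET = 3.0e10      # hrs, hypothetical absolute level
ETA2_SET = ETA1_SET ** BETA * T_XOVER_SET ** (1.0 - BETA)  # eq. (5) solved for eta2

# Calculate 1000 hrs of lifetest at Tj = 140 degC back to Tj,set = 105 degC.
af = acceleration_factor(140.0, 105.0)
t_equivalent = 1000.0 * af
print(f"acceleration factor 140 -> 105 degC: {af:.1f}")
print(f"failed fraction after {t_equivalent:.0f} hrs in the set: "
      f"{failed_fraction(t_equivalent, ETA1_SET, ETA2_SET, BETA):.2e}")
print(f"cross-over point: {crossover_time(ETA1_SET, ETA2_SET, BETA):.0f} hrs")
```

Only the ratio of characteristic lives at two temperatures is needed to calculate lifetest results back to the set temperature, so the constants C1,2 of equation (4) drop out.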
Similarly, failure rate plots were determined from accelerated lifetest data of
various MOS and Bipolar IC’s and calculated back to 85°C using the Arrhenius
model with Ea= 0.7eV, see fig. 11 and 12. Once again the individual failure rate
plots of the investigated IC-families essentially have the same shape and can be
modelled by the combination of the Weibull distribution (F1) and the constant
failure rate (F2) in equation (3). The fit between the model and the data is also
shown in fig. 11 and 12. In this case we find for the model constants β= 0.40 ±
0.05 and tXover,85°C = 33000 ± 8000 hrs at 85°C. As the cross-over point between
the two slopes in equation (3) occurs only after more than 30000 hrs at 85°C, the second
term in equation (3) can be neglected for most practical cases.
Fig.10: Cumulative failure plot of IC’s in the set under normal operating conditions (Tj,set = 105°C) compared with stress test results at various Tj calculated back to the Tj,set using an Arrhenius model with Ea = 0.7eV.
Fig. 11: Normalised cumulative failure plot for various bipolar and MOS IC’s.
Data are calculated back to 85°C using Ea= 0.7eV and normalised to 1 at
300 hrs.
7.3.2 Validation of the model
The above model has been validated in a later period by comparing the failure
rate predictions based on 48hrs lifetest data with the actual results of 300hrs operational testing of sets, see table 2 [12]. As can be seen, the FIT rates predicted by
the model are within 40% of the actual numbers which is a remarkably good
agreement compared to other prediction models.
Fig. 12: Normalised failure rate plot for various bipolar and MOS IC’s. Data are
calculated back to 85°C using Ea= 0.7eV and normalised to 1 at 300hrs.
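| Technology | Average Tj,set (°C) | Measured Failure Rate @ 300hrs (FIT) | Calculated Failure Rate @ 300hrs (FIT) |
|------------|---------------------|--------------------------------------|----------------------------------------|
| Bipolar | 85 | 500 | 700 |
| CMOS | 50 | 320 | 350 |
| NMOS Fab1 | 70 | 900 | 850 |
| NMOS Fab2 | 85 | 5500 | 7000 |

Table 2 : Comparison of measured and calculated FIT-rates of various products in
actual consumer sets.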
7.3.3 Process Dependence of the Model Constants
Note that β and η are process dependent constants. For submicron CMOS logic processes β = 0.25 has been reported [14] and for (embedded) DRAM processes
β’s between 0.20 and 0.36 have been found [15]. In fig. 13 recent data from 3
µm bipolar, 1.5 µm BiCMOS, 1.2 and 0.8 µm CMOS and 0.5 µm embedded
DRAM processes calculated back to 85°C using Ea = 0.7eV are shown, indicating
that for modern technologies β ranges from 0.2 to 0.5. Furthermore the cross-over
point between the slopes in equation (3) does not yet show up at 30000 hrs at
85°C, again indicating that the contribution of the constant failure rate term in
equation (3) is negligible for most practical cases.
Fig. 13: Normalised cumulative failure plot for various bipolar, BiCMOS, CMOS
and embedded DRAM technologies. Data are calculated back to 85°C
using Ea = 0.7eV and normalised to 1 at 300hrs.
7.3.4 Burn-in Failure Rate Prediction
Based on the failure rate time evolution model in equation (3) we can now
derive a model for predicting the burn-in failure rate from batch yield. Assume that we have determined the yield-reliability ratio α from data obtained from a
stress of duration ts at junction temperature Ts. We furthermore take the β- and
tXover,Ts-values applicable to the technology of the products being studied. In that
case only the characteristic life η1,Ts is unknown, as η2,Ts can be calculated using
equation (5). Here it is important to note that α depends on the stress condition, thus
α ≡ αts,Ts. By combining equations (2), (3) and (5) we can then derive an expression for η1,Ts. In case we are dealing with only one Weibull distribution
(single slope, so tXover,Ts → ∞) we obtain equation (6):
$$\eta_{1,T_s} = \frac{t_s}{\left[-\alpha_{t_s,T_s}\cdot\ln\left(\frac{Y}{M}\right)\right]^{1/\beta}} \quad \text{[hrs]} \qquad (6)$$
Equation (6) expresses the characteristic life η1,Ts at stress temperature Ts as a
function of batch yield Y, maximum yield fraction M, stress time ts, Weibull shape
factor β and yield-reliability ratio αts,Ts determined from the stress data. η1,T at
temperatures other than Ts can be calculated from equation (4) for a given activation energy Ea. In case we are dealing with two Weibull distributions as in equation (3) (two slopes, so tXover at use conditions ≤ 25 years), the cross-over point
tXover,Ts also enters the equation and the expression for η1,Ts becomes somewhat more
complex:
$$\eta_{1,T_s} = \left[-\,\frac{t_s^{\beta} + \dfrac{t_s}{t_{Xover,T_s}^{(1-\beta)}}}{\alpha_{t_s,T_s}\cdot\ln\left(\frac{Y}{M}\right)}\right]^{1/\beta} \quad \text{[hrs]} \qquad (7)$$
One can now calculate the cumulative failed fraction as a function of batch
yield at any use or stress condition other than the stress conditions time ts and
temperature Ts by using equation (3) and (4). Note that all the required constants
are known.
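As an illustration, equations (6) and (7) can be evaluated numerically. The sketch below is a minimal example under stated assumptions: the function names are hypothetical, the constants are the demonstration values quoted in this section (ts = 0.5 hrs, α = 5⋅10⁻⁵, M = 95%, β = 0.3, tXover = 33000 hrs), and the 60% batch yield is chosen only as an example input.

```python
import math


def eta1_single_slope(ts, alpha, batch_yield, max_yield, beta):
    """Equation (6): characteristic life for a single Weibull slope (tXover -> infinity)."""
    failed_fraction = -alpha * math.log(batch_yield / max_yield)
    return ts / failed_fraction ** (1.0 / beta)


def eta1_two_slopes(ts, alpha, batch_yield, max_yield, beta, t_xover):
    """Equation (7): characteristic life when the constant failure rate term is included."""
    time_term = ts ** beta + ts / t_xover ** (1.0 - beta)
    return (-time_term / (alpha * math.log(batch_yield / max_yield))) ** (1.0 / beta)


# Demonstration constants quoted in the text; the batch yield is an example input.
ts, alpha, max_yield, beta, t_xover = 0.5, 5e-5, 0.95, 0.3, 33000.0
batch_yield = 0.60

eta_single = eta1_single_slope(ts, alpha, batch_yield, max_yield, beta)
eta_double = eta1_two_slopes(ts, alpha, batch_yield, max_yield, beta, t_xover)
print(f"eta1 from equation (6): {eta_single:.3e} hrs")
print(f"eta1 from equation (7): {eta_double:.3e} hrs")
```

For a short stress time the two results are nearly identical, because the constant failure rate term contributes very little at 0.5 hrs; letting the cross-over time grow very large makes equation (7) reduce to equation (6).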
We first apply the model in equations (3), (4) and (7) to the line fall-off data
in fig. 6 to 8. In consumer applications the set test after printed circuit board assembly, which generates the line fall-off rejects, typically consists of a ts = 0.5 hrs stress
at Tambient = 40°C, equivalent to a junction stress temperature Ts ≈ 85°C. So using ts
= 0.5 hrs, Ts = 85°C, Ea = 0.7eV and assuming typical values for the other constants for demonstration purposes (β = 0.3, tXover,85°C = 33000 hrs, M = 95% and
α0.5hrs,85°C = 5⋅10⁻⁵; exact values are to be determined for each individual product family and application), equation (7) allows us to calculate η1,85°C and to predict the
cumulative failure rate at any use or stress condition as a function of batch yield.
In fig. 14 the calculated cumulative failed fractions are shown for a number of
standard stress and operating conditions. Note that a different α-value corresponds to each curve. For the 6 hrs and 240 hrs burn-in curves we find for example
α6hrs,150°C = 3.1⋅10⁻⁴ and α240hrs,150°C = 1.2⋅10⁻³. Using these numbers and equation
(2), one can simply calculate the reject level as a function of batch yield. To verify
how realistic the calculated reject numbers are, we take the example of a 60%
yielding batch. The calculated reject levels during a 150°C burn-in are in the order of several hundreds of PPM. This is in good agreement with the reject numbers observed in reality.
Fig. 14: Cumulative failures versus batch yield for various use and stress conditions using α0.5hrs,85°C = 5⋅10⁻⁵, M = 95%, β = 0.3 and Ea = 0.7eV as an example.
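The α-values quoted for the burn-in curves can be reproduced with a few lines of code. The sketch below is not the author's implementation. It assumes the small-fraction form implied by equation (7), in which the failed fraction scales with time as t^β + t / tXover^(1−β), and it assumes that equation (2) gives the failed fraction as −α⋅ln(Y/M), as used in equations (6) and (7). Stress time at 150°C is converted to equivalent time at 85°C with the Arrhenius factor. All function names are hypothetical.

```python
import math

K_BOLTZMANN_EV = 8.617e-5  # Boltzmann constant in eV/K


def acceleration_factor(temp_ref_c, temp_stress_c, ea=0.7):
    """Hours at temp_ref_c equivalent to one hour at temp_stress_c (Arrhenius)."""
    t_ref = temp_ref_c + 273.15
    t_stress = temp_stress_c + 273.15
    return math.exp(ea / K_BOLTZMANN_EV * (1.0 / t_ref - 1.0 / t_stress))


def time_shape(t, beta, t_xover):
    """Assumed time dependence of the failed fraction: t^beta + t / t_xover^(1 - beta)."""
    return t ** beta + t / t_xover ** (1.0 - beta)


def alpha_at(t_hrs, temp_c, alpha_ref=5e-5, t_ref=0.5, temp_ref_c=85.0,
             beta=0.3, t_xover_ref=33000.0, ea=0.7):
    """Scale the reference alpha (0.5 hrs at 85 C) to another time and temperature."""
    t_equiv = t_hrs * acceleration_factor(temp_ref_c, temp_c, ea)
    return alpha_ref * time_shape(t_equiv, beta, t_xover_ref) / time_shape(t_ref, beta, t_xover_ref)


def reject_ppm(alpha, batch_yield, max_yield=0.95):
    """Assumed form of equation (2): failed fraction = -alpha * ln(Y / M), in PPM."""
    return -alpha * math.log(batch_yield / max_yield) * 1e6


alpha_6h = alpha_at(6.0, 150.0)
alpha_240h = alpha_at(240.0, 150.0)
ppm_6h = reject_ppm(alpha_6h, 0.60)
ppm_240h = reject_ppm(alpha_240h, 0.60)

print(f"alpha(6 hrs, 150 C)   = {alpha_6h:.2e}")
print(f"alpha(240 hrs, 150 C) = {alpha_240h:.2e}")
print(f"60% batch: {ppm_6h:.0f} PPM after 6 hrs, {ppm_240h:.0f} PPM after 240 hrs")
```

Under these assumptions the computed α-values round to the quoted 3.1⋅10⁻⁴ and 1.2⋅10⁻³, and the reject levels for a 60% yielding batch come out at a few hundred PPM.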
Next we apply the model to determine the impact of burn-in and E-sort screens
on the PPM reject levels experienced in the field for batches with varying yield.
The cumulative failed fraction after tu operating hours at temperature Tu after a
burn-in of ts hrs at temperature Ts can be calculated using equation (8) (acceleration factor) and equation (9):
$$AF_{T_u - T_s} = \exp\left[\frac{q\cdot E_a}{k}\cdot\left(\frac{1}{T_u} - \frac{1}{T_s}\right)\right] \qquad (8)$$
$$F_{t_u,T_u} = F_{(t_u + AF_{T_u - T_s}\cdot t_s),T_u} - F_{(AF_{T_u - T_s}\cdot t_s),T_u} \qquad (9)$$
Fig. 15 shows the cumulative failure plot for a high (90%) and a moderately
(50%) yielding batch during operation at 85°C, including the impact of a 6 hrs (typical for µ-processors and DRAMs) and a 240 hrs (Class S military) burn-in at
150°C and of the introduction of E-sort screens (assuming a reduction of α by a
factor 2) on the failure curve of the 50% yielding batch. The burn-in results in a
significant reliability improvement for short operating times, but after that reliability is worse than for both the high yielding batch and the 50% batch after screening. In case of the 6 hrs burn-in the cross-over points occur after about 40 hrs and
1000 hrs respectively, well within the useful life of most applications. In case of
the 240 hrs burn-in these numbers are 700 hrs and 10000 hrs. An important implication of these findings is that for Hi-Rel military components the use of commercial parts from mature high volume manufacturing lines is probably more advisable than the extensive burn-in of parts from dedicated low-volume unstable lines.
Fig. 15: Cumulative failure plot for a good (90%) and a moderately (50%) yielding batch and the impact of burn-in and E-sort screens on the curve of
the 50% yield batch (using M = 95% and assuming a factor 2 reduction
of α due to the screen).
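The comparison between a burned-in low yielding batch, a high yielding batch and a screened batch can be sketched with equations (8) and (9). The example below is a minimal sketch under the same assumptions as the demonstration values in the text (α0.5hrs,85°C = 5⋅10⁻⁵, M = 95%, β = 0.3, Ea = 0.7eV, tXover = 33000 hrs). It assumes the small-fraction form implied by equation (7) for the failed fraction, writes the exponent of equation (8) with Ea in eV and k in eV/K, and uses hypothetical function names.

```python
import math

K_BOLTZMANN_EV = 8.617e-5  # Boltzmann constant in eV/K


def acceleration_factor(temp_use_c, temp_stress_c, ea=0.7):
    """Equation (8), with Ea in eV and k in eV/K."""
    t_use = temp_use_c + 273.15
    t_stress = temp_stress_c + 273.15
    return math.exp(ea / K_BOLTZMANN_EV * (1.0 / t_use - 1.0 / t_stress))


def failed_fraction(t_hrs, batch_yield, alpha_ref=5e-5, t_ref=0.5,
                    max_yield=0.95, beta=0.3, t_xover=33000.0):
    """Cumulative failed fraction after t_hrs at 85 C (assumed small-fraction form)."""
    def shape(t):
        return t ** beta + t / t_xover ** (1.0 - beta)
    return -alpha_ref * math.log(batch_yield / max_yield) * shape(t_hrs) / shape(t_ref)


def failed_after_burn_in(t_use_hrs, batch_yield, burn_in_hrs,
                         temp_use_c=85.0, temp_burn_in_c=150.0, alpha_ref=5e-5):
    """Equation (9): failures in the field after subtracting those removed by burn-in."""
    t_bi = acceleration_factor(temp_use_c, temp_burn_in_c) * burn_in_hrs
    return (failed_fraction(t_use_hrs + t_bi, batch_yield, alpha_ref)
            - failed_fraction(t_bi, batch_yield, alpha_ref))


for t in (10.0, 100.0, 5000.0):
    burned_in_50 = failed_after_burn_in(t, 0.50, burn_in_hrs=6.0)
    plain_90 = failed_fraction(t, 0.90)
    screened_50 = failed_fraction(t, 0.50, alpha_ref=2.5e-5)  # alpha halved by the screen
    print(f"{t:7.0f} hrs: 50% + 6 hrs burn-in {burned_in_50 * 1e6:7.1f} PPM, "
          f"90% {plain_90 * 1e6:7.1f} PPM, 50% + screen {screened_50 * 1e6:7.1f} PPM")
```

In this sketch the burned-in 50% batch is better than the 90% batch at 10 hrs but worse at 100 hrs, and better than the screened 50% batch at 100 hrs but worse at 5000 hrs, which is the qualitative behaviour described for the 6 hrs burn-in.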
7.4 APPLICATION OF MODEL TO BURN-IN ELIMINATION
7.4.1 Impact of screens
Using the model, the minimum burn-in time needed to obtain a customer specified reliability level after given use conditions t and Tj can be calculated, and consequently also the average product yield level below which burn-in is needed (note that the
higher reject level of low yielding batches is compensated by the lower reject level
of high yielding batches). Fig. 16 shows for a typical situation (α0.5hrs,85°C = 5⋅10⁻⁵,
M = 95%, β = 0.3, Ea = 0.7eV and cross-over point at 33000 hrs; exact values
are again product family and application dependent) the required burn-in time at
150°C versus batch yield to obtain a single-digit PPM line fall-off reject rate for
automotive products and <100 PPM and <300 PPM (≅10 FIT) rejects during consumer warranty and product life respectively. The corresponding use conditions
are 0.5 hrs / 95°C, 1500 hrs / 85°C and 30000 hrs / 85°C respectively. The burn-in time strongly depends on the applicable operating time, as for longer times
burn-in is less effective, see fig. 15. Fig. 16 also shows the burn-in time reduction
resulting from the introduction of screens at E-sort (assuming a factor 2 reduction
in α).
Fig. 16: Burn-in time required at 150°C for a) 10 PPM line fall-off automotive
(0.5 hrs / 95°C), b) 100 PPM rejects warranty consumer (1500 hrs /
85°C) and c) 10 FIT failure rate consumer life (30000 hrs / 85°C) versus
batch yield (M = 95%). The dashed curves show the impact of E-sort
screens.
Apparently, products with an average yield below 80, 81 and 85% need burn-in for
the automotive, consumer warranty and consumer life conditions given above respectively, while E-sort screening lowers the yields below which burn-in is required to 68, 69 and 76%
respectively.
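The yield levels below which burn-in becomes necessary can be estimated from the same demonstration constants. The sketch below is an illustration, not the original calculation. It assumes the small-fraction time dependence implied by equation (7), takes the failed fraction as −α⋅ln(Y/M), treats the single-digit PPM automotive target as 10 PPM (as in the caption of fig. 16), and models the E-sort screen as a factor 2 reduction of α. Function names are hypothetical.

```python
import math

K_BOLTZMANN_EV = 8.617e-5  # Boltzmann constant in eV/K


def acceleration_factor(temp_ref_c, temp_c, ea=0.7):
    """Hours at temp_ref_c equivalent to one hour at temp_c (Arrhenius)."""
    return math.exp(ea / K_BOLTZMANN_EV
                    * (1.0 / (temp_ref_c + 273.15) - 1.0 / (temp_c + 273.15)))


def alpha_at(t_hrs, temp_c, alpha_ref=5e-5, t_ref=0.5, temp_ref_c=85.0,
             beta=0.3, t_xover_ref=33000.0):
    """Scale the reference alpha (0.5 hrs at 85 C) to the given use condition."""
    def shape(t):
        return t ** beta + t / t_xover_ref ** (1.0 - beta)
    t_equiv = t_hrs * acceleration_factor(temp_ref_c, temp_c)
    return alpha_ref * shape(t_equiv) / shape(t_ref)


def min_yield_without_burn_in(target_fraction, t_use_hrs, temp_use_c,
                              screen_factor=1.0, max_yield=0.95):
    """Solve target = -alpha * ln(Y / M) for the batch yield Y."""
    alpha = alpha_at(t_use_hrs, temp_use_c) / screen_factor
    return max_yield * math.exp(-target_fraction / alpha)


use_cases = {
    "automotive line fall-off": (10e-6, 0.5, 95.0),
    "consumer warranty": (100e-6, 1500.0, 85.0),
    "consumer life": (300e-6, 30000.0, 85.0),
}

for name, (target, t_use, temp_use) in use_cases.items():
    no_screen = min_yield_without_burn_in(target, t_use, temp_use)
    with_screen = min_yield_without_burn_in(target, t_use, temp_use, screen_factor=2.0)
    print(f"{name:26s} no screen: {no_screen:.1%}   with E-sort screen: {with_screen:.1%}")
```

Under these assumptions the computed thresholds agree with the yield levels quoted above to within about one percentage point.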
7.4.2 Verification of the Model
The model can be reliably validated using data from the bipolar product in fig.
6. This audio amplifier part did not receive a burn-in and was only shipped to one
automotive customer, who sent 100% of the line fall-off rejects back to us as part
of a PPM-cooperation agreement. This ensures that the PPM-data are not flattered
by customers not sending all rejects back. Fig. 17a shows the E-sort yield of every batch of this product during 1995. The voltage screen at E-sort
was introduced in week 9522. The delay between the V-screen introduction at E-sort and use of these products by the customer is caused by pipeline and stock effects. Using the above model and the actual M and α-values as determined from
fig. 6, we calculated the yield levels required to achieve less than 10 PPM line
fall-off reject rate with and without a voltage screen. These yield levels (being
84.0% and 86.8% respectively) are depicted by the horizontal dashed lines in fig.
17a. We see that before the introduction of the voltage screen most batches yield
lower than the level required to obtain 10 PPM line fall-off and that after the introduction most batches yield higher than the 10 PPM yield level.
Fig. 17b shows the line fall-off reject levels as reported on a weekly basis
by the customer itself. We observe that without the voltage screen the average PPM
level is indeed above 10 PPM, and that it subsequently drops to an average level below
10 PPM after introduction of the voltage screen. This is in perfect agreement with
the predictions derived from our model. For a second product, running only at this
automotive customer and currently yielding more than 90%, a 7 PPM line fall-off rate is reported.
This is also in full agreement with the predictions of our model.
Fig. 17: a) E-sort batch yield and minimum yield required for a 10 PPM line falloff reject rate before and after introduction of a V-screen and b) the corresponding weekly line fall-off rate reported by the automotive customer.
7.5 CONCLUSIONS
Based on data of over 31 million devices it has been shown that screening of
latent defects at electrical test (e.g. by voltage screens or IddQ-tests) can significantly reduce the number of reliability failures. A reduction of the yield-reliability ratio α by a factor 1.5 to 3 is found, with a similar impact on the PPM and
FIT reliability levels.
Based on an extensive comparison between set and product lifetest data and
the above yield-reliability correlation, a model has been developed predicting the
product failure rate as a function of batch yield at any operating time and temperature. Application of this model to burn-in and lifetest failure rate prediction shows
firstly that in general high yielding batches generate fewer failures than low yielding batches, even if the latter have been subject to burn-in. Secondly, for many
practical applications the use of screens at E-sort is more effective with respect to
screening of latent defects than a standard (6 hrs) burn-in.
Using the model, the minimum burn-in time needed to obtain a customer specified reliability target can be calculated as a function of batch yield as well as the
average product yield level above which burn-in can be eliminated. The calculated
numbers appear to be in perfect agreement with experimental data.
7.6 REFERENCES
[1] D.S. Peck, ‘New concerns about integrated circuit reliability’, Proceedings
IRPS, pp. 1-6, (1978)
[2] B.A. Unger, ‘Early life failures’, Quality & Reliability Engineering International, vol. 4, pp. 27-34, (1988)
[3] D.L. Crook, ‘Evolution of VLSI reliability engineering’, Proceedings IRPS,
pp. 2-11, (1990)
[4] E.A. Amerasekera, F.N. Najim, ‘Failure mechanisms in semiconductor devices’, Chichester, John Wiley & Sons, ch. 7, (1997)
[5] P.K. Nag, W. Maly, H.J. Jacobs, ‘Simulation of yield/cost learning curves
with Y4’, IEEE Transactions on Semiconductor Manufacturing, vol. 10, pp.
256-265, (1997)
[6] F. Kuper, J. van der Pol, E. Ooms, T. Johnson, R. Wijburg, W. Koster, D.
Johnston, ‘Relation between yield and reliability of integrated circuits: experimental results and application to continuous early failure rate reduction
programs’, Proceedings IRPS, pp. 17-21, (1996)
[7] J.A. van der Pol, F.G. Kuper, E.R. Ooms, ‘Relation between yield and reliability of integrated circuits and application to failure rate assessment and reduction in the one-digit PPM reliability era’, MicroElectronics & Reliability,
pp. 1603-1610, (1996)
[8] A.P. van den Heuvel, N.F. Khory, ‘A rational basis for setting burn-in yield
criteria’, Proceedings International Test Conference, pp. 524-530, (1984)
[9] W. Smith, N.F. Khory, ‘Does the burn-in of integrated circuits continue to be
a meaningful course to pursue’, 38th Electronic Components Conference, pp.
1-4, (1988)
[10] D.L. Jacobowitz, ‘A software tool for designing burn-in programs’, Proceedings Annual Reliability and Maintainability Symposium, pp. 302-305,
(1987)
[11] T.R. Henry, T. Soo, ‘Burn-in elimination of a high volume microprocessor
using IddQ’, Proceedings International Test Conference, (1996)
[12] H.R. Claessen, ‘Reliability of IC’s’, IMEC Summer Course, Belgium, (1989)
[13] IBM, Memory Products, Qualification Handbook, 16MB DRAM, Die revision E, Document nr. MMDD06QHU-00
[14] R. Zelenka, Presentation at Ford Reliability Workshop, Colorado Springs,
October 22-24, (1993)
[15] M. Matthaei, Embedded DRAM Burn-in data, Private Communication,
(1997)
[16] J.A. van der Pol, E.R. Ooms, A. van ‘t Hof, F. Kuper, ‘Impact of screening of
latent defects at electrical test on the yield-reliability relation and application
to burn-in elimination’, Proceedings IRPS, pp. 370-377, (1998)
8
Summary and Conclusions
8.1 Summary
8.2 Conclusions
8.1 SUMMARY
Chapter 1 discusses trends in semiconductor technology and product reliability
and describes the reliability assurance system that has been implemented in process development, product development and high volume manufacturing in order
to achieve the factor 10 million reliability improvement over the last 30 years.
Furthermore, the motivation for the work in this thesis is given.
Chapter 2 shows the application of highly accelerated (wafer level) stress
techniques to two wear-out failure modes in a high voltage Bipolar-CMOS-DMOS
process technology, namely transistor instabilities due to sodium ingression and
transistor instabilities due to surface charges originating from high voltage
circuitry. Furthermore, a quantitative model is developed allowing the derivation
of design rules for elimination of the surface charge failure mode.
Chapter 3 discusses the relation between the hot carrier lifetimes of transistors
and those of SRAM circuits as well as the implications for technology development.
Chapter 4 demonstrates a straightforward method for the derivation of latchup
design rules for submicron CMOS processes changing this field from an ’art’ into
an ‘engineering science’.
Chapter 5 deals with a new method to assess the metal stepcoverage of a
metallisation system by electrical measurements. Application to process
optimisation, design rule derivation and process monitoring is discussed.
Chapter 6 investigates the relation between yield and reliability of products
and demonstrates that the yield can be used as a reliability indicator instead of
conventional life tests. A quantitative model between yield and reliability is
developed and validated and application to failure rate reduction is discussed.
Chapter 7 deals with the impact of electrical screens at product test on the
product failure rate in the field and explores the failure rate evolution with time.
This enables extension of the yield-reliability model into a new model capable of
predicting the product failure rate as a function of batch yield at any operating
time and temperature. Application to ‘burn-in’ elimination is discussed.
8.2 CONCLUSIONS
Dominant failure modes in high power/high voltage (650V) BCD-technologies
are threshold voltage instabilities of the lateral DMOS transistor due to sodium ingression and parasitic leakage currents in low voltage devices induced by high
surface potentials originating from the high voltage devices. In chapter 2 it is
shown that the threshold voltage instabilities can be prevented by improving the
sodium getter capabilities of the dielectric layers in the backend process and by
increasing the silicon nitride passivation thickness. The occurrence of parasitic
leakage currents appears to be strongly dependent on temperature, moisture
content of the plastic package, circuit layout and applied operating voltage. The
'charge-creep' effect can be modelled by describing the evolution of the surface
potential as a function of place and time by means of a lumped element RC-model. A good qualitative and a reasonable quantitative agreement between experimental data and model predictions is found. Using the model, design rules
that can be used to eliminate the 'charge-creep' effects in actual circuits have also been
derived.
Hot carrier degradation of a full-CMOS SRAM results primarily in an increase
of the minimum operating voltage and write time parameters, as shown in chapter
3, caused by degradation of the access transistor of the memory cell. This is a
‘pass’-type transistor that is operated with both source-drain voltage polarities.
This type of operation makes the transistor much more sensitive to the effects of
hot carrier degradation than in the case of operation in the ‘inverter’-type mode.
Circuit simulations confirm the observed degradation effects. The hot carrier lifetime of SRAM products appears to be about a factor 50 larger than that of static
stressed transistors. This discrepancy is caused by duty cycle effects and by the
limited sensitivity of the SRAM to the individual transistor degradation. Other
results show that this finding is generally applicable thus facilitating product design, for example by eliminating the need for cascoding of transistors at critical
locations. In this way increases in memory and microprocessor speed can be
realised as well as more aggressive scaling of process technologies without jeopardising product reliability.
The latchup susceptibility of submicron CMOS processes on p-/p++ epitaxial
substrates can be characterised, using a dedicated set of test structures, as a
function of n+p+-spacing, placement of Nwell and substrate contacts, guardring
width and distance of the guardring to the injecting junction. Chapter 4
demonstrates how these data can be translated into latchup design rules taking
into account the geometrical spreading of the injected carriers. This approach
results in very latchup robust products in case of p-/p++ epitaxial substrates, thus
eliminating the need for time-consuming and expensive ‘trial-and-error’ design
optimisation cycles. The method is also applicable to processes on non-epitaxial
substrates.
The metal stepcoverage of a metallisation system is a critical parameter to
control in a waferfab process as it may have a dramatic effect on electromigration
related reliability. Chapter 5 shows how it can be assessed by simple electrical
measurements where the stepcoverage is represented by the resistance ratio of
metal lines over (worst case) topography and metal lines over flat surfaces. The
resistance ratio appears to correlate well to the stepcoverage percentage as
determined from SEM inspections. As a result, the novel method is well suited for
derivation of stepcoverage related design rules. Surprisingly, in our work the
effect of metal stepcoverage on electromigration resistance was found to be very
limited; the open failures also did not occur on the steps. Another application is
the optimisation of sputter processes. It appears that metal stepcoverage depends
strongly on metal sputter target lifetime. Several runs with dummy wafers appear
to be necessary after metal target change to guarantee good stepcoverage.
Furthermore it is found that stepcoverage can vary significantly over the wafer
surface demonstrating the importance of metal stepcoverage wafermap data and
the limitations of the conventional cross sectioning method. Finally it appears that
the method is well suited for monitoring of metal stepcoverage on production
wafers and screening-out of weak parts by placing it in the Process Control
Modules on each wafer.
Clear relations have been established in chapter 6 between E-sort yield and
‘burn-in’, EFR and field failure rates for nearly 50 million high volume products
in bipolar, CMOS and BICMOS technologies from different waferfabs (later
confirmed by data from a 0.25µm state-of-the-art microprocessor process). The
relations obey a simple model that assumes that the reliability defect density is a
fraction of the waferfab defect density and that rootcauses of failures are the same.
The model allows a die size independent prediction and assessment of FIT and
PPM reliability levels of an IC just based on its yield, eliminating the need for excessive life testing. ‘Maverick’ batches are identified by more than 2 to 3 rejects
per batch and cannot be eliminated by scrap of low yielding wafers alone. For
non-mature technologies only correlations with functional yield are found; the
parametric yield loss should be disregarded. Using the results, it is shown how
reliability can be improved in a fast and controlled way, even in the 1 digit FIT
and PPM reliability era, by reducing waferfab defect density, elimination of special causes and implementation of screens at product test. As the effect of yield on
PPM reject level is not that strong, the latter approach can be very effective in
improving reliability.
Finally, it is shown in chapter 7, based on data of over 31 million devices, that
screening of latent defects at electrical test (e.g. by voltage screens or IddQ-tests)
can improve PPM and FIT reliability levels by a factor 1.5 to 3, demonstrating
that these techniques are a good alternative to ‘burn-in’. As this provides
significant efficiency improvement and cost reduction opportunities to high
volume semiconductor manufacturers, these screening techniques are rapidly
becoming standard industry practice. Extensive comparison between TV-set and
product lifetest data shows that the failure rate curve indeed follows a ‘bathtub’
shape. The curve can be accurately modelled by a modified Weibull distribution.
Merging it with the above yield-reliability relation, results in a new model that for
the first time allows prediction of product failure rate as a function of batch yield
at any operating time and temperature. Application of the model to prediction of
burn-in and lifetest failure rates, shows first that in general high yielding batches
generate in the long run fewer failures than low yielding batches, even if the latter
have been subject to ‘burn-in’. Second, for a lot of practical applications the use of
screens at E-sort is more effective with respect to screening of latent defects than a
standard (e.g. 6 hrs) ’burn-in’. Using the model, the minimum burn-in time
needed to obtain a customer specified reliability target can be calculated as a
function of batch yield. It also answers the question under what conditions burn-in
can be eliminated. The calculated numbers appear to be in good agreement with
experimental data.
Summary
Over the past 30 years the reliability of semiconductor products has improved
by a factor of 100 while at the same time the complexity of the circuits has increased by a factor 10⁵. This 7-decade reliability improvement has been realised by implementing a sophisticated reliability assurance system in process development,
product development and high volume manufacturing, aimed at building-in
product reliability and establishing effective improvement feedback loops in both
development and production as described in chapter 1.
This thesis deals with new methods that have been developed to continue the
current improvement rate also in the new millennium. In process development the
adoption of highly accelerated stress techniques (preferably on wafer level) has become crucial as this gives the opportunity to simulate 10 years of product lifetime
within a few hours or days, in-line with today’s development cycle times. In
chapter 2 these are applied to a high voltage Bipolar-CMOS-DMOS technology for
the selection of the best dielectric passivation stack capable of preventing wear-out
failures due to sodium ingression from the plastic package. Another dominant
wear-out failure mode in high voltage products is the occurrence of transistor
instabilities induced by high voltage surface charges originating from the high
voltage connections to the product (bondwires and bondpads). Using similar stress
methods, a new quantitative model has been developed that describes this failure
mechanism and that allows us to derive design rules that eliminate the surface
charge effects and thus ensure reliable high voltage products.
The highly accelerated stresses are often carried out on dedicated test structures
designed in such a way that they are ‘susceptible’ primarily to only the failure
mechanism of interest. This introduces the problem of how to convert test structure
lifetime data to actual product data. As reliability margins are vanishing rapidly in
modern semiconductor technologies this is of great interest to the industry. In
chapter 3 this has been explored for the case of hot carrier degradation. It appears
that large lifetime differences can occur between test structures and products due
to duty cycle effects and the varying sensitivity of the electrical parameters of a
product to the degradation of one or more of its components. Lifetimes of products
in dynamic operation can easily exceed the lifetimes of corresponding transistors
in static operation by a factor 100. As a result, more aggressive scaling of process
technologies is possible without jeopardising the product reliability, enabling e.g.
increases in the maximum operation frequency of state-of-the-art microprocessors.
For building-in reliability during product development, the availability of
reliability related design rules is mandatory. One aspect of this is design rules
that ensure that the product is robust against voltage spikes on its external pins so
that it does not ‘latchup’ and burn-out. In chapter 4 a method is demonstrated,
applicable to any CMOS technology, for the derivation of latchup design rules
from simple test structures. It allows first-time-right design of products and
changes the perception of latch-up being an ’art’ to being an ’engineering science’.
In high volume manufacturing the prevention and detection of process excursions that might deteriorate product reliability and yield is of the utmost importance. Therefore very sophisticated in-line and end-of-line control systems have
been implemented in manufacturing flows where all critical equipment and
process parameters that may influence product performance or reliability are
regularly measured and kept under Statistical Process Control. One such critical
process parameter is the metal stepcoverage of a metallisation system as it may
have a dramatic effect on electromigration related reliability of a product. Chapter
5 deals with a novel method that allows metal stepcoverage monitoring by simple
electrical measurements. The method is applied to optimisation of sputter
processes and generation of design rules and it is also shown that it has clear
advantages over the commonly used cross-sectioning method as also stepcoverage
wafermaps can be easily made. Combined with end-of-line control techniques it is
well suited for metal stepcoverage control on production wafers and screening of
weak parts.
As a result of the ‘building-in’ reliability approach in process and product
development, wear-out failure modes do not occur anymore in today’s products.
Instead product failures are dominated by early failures caused by manufacturing
defects. Product failure rates however have become so low that conventional life
testing techniques are not capable anymore of providing enough statistically
significant data (at reasonable cost) to guide the improvement actions in the high
volume manufacturing lines. Therefore a paradigm shift is needed. In chapter 6 it
is shown that the product yield can be used as a primary reliability indicator. Data
of over 50 million products in various processes show for the first time that there
exists a clear correlation between the yield of a product and its reliability in the
field because the nature of yield and reliability defects is the same. Thus a waferfab
may improve the reliability level of its products in a fast and controlled way by
monitoring and reducing its in-line defect densities, eliminating the need for
excessive life testing. The yield-reliability relation is described by a quantitative
model that allows prediction of the reliability in the field based on yield data. It can be
used to set objective scrap limits for non-conforming (low yield) material, thus
preventing products with a larger failure probability from being shipped to customers.
The model also indicates that a substantial product reliability improvement can be
obtained by the implementation of screens at product test, such as voltage screens and
IddQ testing. Data of over 30 million products show in chapter 7 that these
techniques can halve failure rates and are a good alternative to ‘burn-in’ (a
conventional failure rate reduction method where products are operated at a high
temperature for some time before shipment to customers). Screening techniques
are rapidly becoming standard industry practice due to the potential efficiency
improvements and cost reductions they offer in high volume manufacturing.
Given the high cost of ‘burn-in’, the question under what conditions it can be
eliminated is a relevant one. Therefore, in chapter 7 the evolution of the failure rate
with time is determined, based on lifetest data of actual TV-sets and products,
and also successfully modelled quantitatively. The failure rate curve indeed has
the well known ‘bathtub’ shape. Combining the failure rate model with the above
yield-reliability relation results in a novel model capable of predicting for the first
time the product failure rate as a function of batch yield at any operating time and
temperature.
It allows calculation of the minimum burn-in time needed to obtain a customer
specified reliability target as a function of batch yield as well as the average product yield level above which burn-in can be eliminated. Good agreement between
model predictions and experimental data is shown. The model predicts that in
general high yielding batches generate fewer failures than low yielding batches, even
if the latter have been subject to burn-in. This once again demonstrates the importance of high yield for achieving excellent product reliability. Furthermore, for a
lot of practical applications the use of screens at E-sort is more effective with
respect to screening of latent defects than a standard (short) burn-in.
New Methods for Building-in and
Improvement of Integrated Circuit Reliability
Application to High Volume Semiconductor Manufacturing
Samenvatting (Summary)
Over the last 30 years the reliability of semiconductor products
(ICs) has improved by a factor 100 while at the same time the complexity of the
circuits has increased by a factor 10⁵. This reliability improvement
of 7 orders of magnitude has been realised by implementing an advanced reliability assurance system in process development, product development
and high volume manufacturing. This system is aimed at building-in reliability
and establishing effective improvement processes in development and production,
as described in chapter 1 of this thesis.
This thesis deals with new methods that have been developed so that the current
trend in reliability improvement can be continued in the future.
During process development the application of highly accelerated
reliability evaluation techniques (preferably on wafer level) has become
crucial, because in this way 10 years of product lifetime can be simulated
within a few hours or days, which is necessary in view of today's strongly
shortened development cycles. In chapter 2 these methods are applied to a
high voltage Bipolar-CMOS-DMOS technology in order to select the (dielectric)
passivation layer that is best suited to prevent massive failure
('end of life') caused by sodium penetrating into the IC
from the plastic of the package. Another important 'end of
life' failure mechanism in high voltage ICs is the occurrence of leakage
paths in transistors caused by positive surface charges
originating from the high voltage parts of the circuit (among others bondwires and
bondpads). Using comparable acceleration techniques, a new model has been
developed that describes this failure mechanism quantitatively and that can be
used to derive design rules that eliminate the effects of the surface
charges, so that the reliability of the high voltage ICs
can be guaranteed.
The highly accelerated reliability evaluations are often carried out on special test structures that are designed to be sensitive mainly to a single failure mechanism. This raises the problem of how the lifetime data obtained from the test structures should be translated into the lifetime of actual products. Because the reliability margins in modern technologies are rapidly disappearing, this is of great importance to the semiconductor industry. In chapter 3 this is investigated for the case of degradation due to hot carriers. It turns out that large differences can exist between the lifetime of test structures and that of real products, due to duty cycle effects and the varying sensitivity of the electrical parameters of a product to the degradation of one of its transistors. The lifetime of dynamically operating products can easily be a factor of 100 larger than the lifetime of the relevant transistors in static operation. As a consequence, more aggressive scaling of process technologies is possible without endangering product reliability, which enables, for example, increases in the speed of microprocessors.
To be able to build in reliability during product development, the availability of reliability-related design rules is absolutely necessary. One aspect of this is a set of design rules which guarantees that the product is robust against external voltage spikes on its pins, so that it does not go into latchup and burn out. Chapter 4 demonstrates a method that is applicable to any CMOS technology and with which latchup design rules can be derived from simple test structures. This makes it possible to design products that meet the specifications right away. At the same time, the image of latchup changes from something with a high ‘black magic’ content into a manageable technical problem.
In high volume production, preventing and detecting process outliers is of the utmost importance because they can have a strong adverse effect on product yield and reliability. For this reason very advanced control systems have been implemented in semiconductor manufacturing processes, both in-line and at end-of-line. All critical equipment and process parameters that can influence product specification and reliability are measured on a regular basis and controlled with Statistical Process Control methods. One of these critical process parameters is the step coverage of the metal of a given metallisation process, because the step coverage can have a large influence on the electromigration-related reliability of a product. Chapter 5 describes a new method with which the metal step coverage can be measured electrically in a simple way. This method has been applied to the optimisation of metal sputter processes and to the derivation of design rules. Furthermore, it is shown that the new method has clear advantages over the commonly used cross-sectioning method, because with the new method the step coverage over the entire wafer surface can also be determined easily. Combined with standard end-of-line control methods, the new method is very well suited for step coverage control on production wafers and for intercepting weak devices.
As a result of the approach of building in reliability during process and product development, ‘end of life’ failure mechanisms no longer occur in today's products. Instead, product reliability is now determined by early failures due to defects introduced during the manufacturing process. The failure levels of the products are now so low that standard product life test techniques are no longer able (at acceptable cost) to provide enough statistically relevant information on which improvement actions in production can be based. A paradigm shift is therefore needed. In chapter 6 it is shown that the E-sort yield of a product (that is, the percentage of good dies per wafer) can be used as the main reliability indicator. Data obtained from more than 50 million products in various process families show, for the first time in the literature, that there is a clear correlation between the E-sort yield of a product and its reliability at the end user. This is because the defects that cause yield loss and the defects that lead to a failure at the end user have the same origin. A wafer fab can therefore improve the reliability of its products quickly and in a controlled manner by controlling and reducing the defect levels in the line. This removes the need to perform life tests on extremely large numbers of products. The yield-reliability relation can be described by a quantitative model with which the failure levels at the end user can be predicted on the basis of the E-sort yield alone. The model can also be used to set objective limits for when a deviating wafer with lower yield must be scrapped. In this way products with too high a failure probability can be prevented from reaching the end user. The model also indicates that a considerable reliability improvement can be achieved by implementing special tests (screening tests) in the test program of the product. With these tests, weak devices, characterised for example by too high a quiescent current or by failing already at slightly elevated voltage in the test program, can be detected and screened out. In chapter 7 it is shown, on the basis of data from more than 30 million products, that these techniques can reduce the failure level by a factor of 2 and that they can be good alternatives to burn-in (in which a product is operated at high temperature for a certain time before it is shipped to the end user). Such methods are now rapidly being adopted everywhere because they deliver the necessary efficiency improvements and cost reductions in high volume production.
Given the high cost of burn-in, the question under which conditions it can be omitted is highly relevant. Therefore, in chapter 7 the failure rate curve as a function of time has been determined on the basis of life data of real televisions and related products, and has also been successfully modelled quantitatively. The failure rate curve indeed has the well-known ‘bathtub curve’ shape. Combining the failure rate model with the above yield-reliability relation results in a new model with which, for the first time in the literature, the failure rate of a product can be predicted on the basis of the E-sort yield and the use conditions of the product in terms of time and temperature. The model also allows calculation of the minimum burn-in time needed to meet a given customer-specified maximum failure level as a function of the E-sort yield, as well as the average E-sort yield above which burn-in can be omitted. The model predictions turn out to agree well with the experimental data. The model further predicts that batches with high yield generate fewer long-term failures than batches with low yield, even if the latter are burned in. This again confirms the importance of high yields for achieving excellent product reliability. Furthermore, the model indicates that for many practical situations the use of screening tests is more effective in lowering the failure levels than a standard short burn-in cycle.
List of Publications and
Conference Presentations
[1] J.A. van der Pol, J.J.M. Koomen, ‘Relation between the hot carrier lifetime of
transistors and CMOS SRAM products’, Proceedings International Reliability
Physics Symposium (IRPS), pp. 178-185, (1990)
[2] J.A. van der Pol, ‘Hot carrier degradation of MOS transistors and circuits’,
Presentation at Ford Automotive Reliability Workshop, October 21-23, Colorado Springs, (1993)
[3] K. van Doorselaer, T.M. Moore, J.A. van der Pol, ‘Failure criteria for inspection using acoustic microscopy after moisture sensitivity testing of plastic surface mount devices’, Proceedings International Symposium on Testing and
Failure Analysis (ISTFA), pp. 229-239, (1994) and presentation at
Ford/Delco/Chrysler Automotive Reliability Workshop, October 20-21,
Detroit, (1994)
[4] J.A. van der Pol, P.B.M. Wolbert, ‘A structured approach for the derivation of
latch-up design rules for submicron CMOS processes’, Presentation at
Ford/Delco/Chrysler Automotive Reliability Workshop, October 20-21, Detroit,
(1994)
[5] J.A. van der Pol, E.R. Ooms, ‘Short loop monitoring of metal stepcoverage by
simple electrical measurements’, Proceedings IRPS, pp. 148-155, (1996) and
presentation at Ford/Delco/Chrysler Automotive Reliability Workshop, October 25-27, Indianapolis, (1995)
[6] F. Kuper, J.A. van der Pol, E.R. Ooms, T. Johnson, R. Wijburg, W. Koster,
D. Johnston, ‘Relation between yield and reliability of integrated circuits:
experimental results and application to continuous early failure rate reduction
programs’, Proceedings IRPS, pp. 17-21, (1996)
[7] B. Krabbenborg, J.A. van der Pol, ‘The influence of process variations on the
robustness of an audio power IC’, Microelectronics & Reliability, pp. 1819-1822, (1996)
[8] J.A. van der Pol, F.G. Kuper, E.R. Ooms, ‘Relation between yield and reliability of integrated circuits and application to failure rate assessment and reduction in the one digit FIT and PPM reliability era’, Microelectronics & Reliability, pp. 1603-1610, (1996) and presentation at European Symposium on Reliability of Electron devices and Failure analysis (ESREF), Enschede, (1996)
[9] F.W. Ragay, J.A. van der Pol, J. Naderman, ‘In-situ monitoring of dry corrosion degradation of Au ballbonds to Al bondpads in plastic packages during
HTSL’, Microelectronics & Reliability, pp. 1931-1934, (1996)
[10] J.A. van der Pol, H.J. Gerritsen, R.T.H. Rongen, P.P.M.C. Groeneveld, P.W.
Ragay, H.A. van den Hurk, ‘Reliability issues in 650V high voltage Bipolar-CMOS-DMOS integrated circuits’, Microelectronics & Reliability, pp. 1723-1726, (1997) and presentation at ESREF Conference, Bordeaux, (1997)
[11] J.A. van der Pol, E.R. Ooms, A. van ‘t Hof, F. Kuper, ‘Impact of screening of
latent defects at electrical test on the yield-reliability relation and application
to burn-in elimination’, Proceedings IRPS, pp. 370-377, (1998)
[12] J.A. van der Pol, P.B.M. Wolbert, ‘Systematic derivation of latch-up design
rules for submicron CMOS processes from test structures’, Microelectronics
& Reliability, pp. 1051-1056, (1998) and presentation at ESREF Conference,
Copenhagen, (1998)
[13] E.R. Ooms, J.A. van der Pol, ‘Occurrence and elimination of anomalous temperature dependence of latchup trigger currents in BICMOS processes’, Proceedings IRPS, pp. 138-143, (1999)
[14] J.A. van der Pol, J-P.F. Huijser, R.B.H. Basten, ‘New latchup mechanism in
complementary bipolar power IC’s triggered by backside die attach glue’, Microelectronics & Reliability, pp. 863-868, (1999), presentation at ESREF Conference, Bordeaux, (1999) and presentation at the Automotive Electronics
Council Automotive Reliability Workshop, November 2-5, Nashville, (1999)
[15] J.A. van der Pol, A.W. Ludikhuize, H.G.A. Huizing, B. van Velzen, R.J.E.
Hueting, J.F. Mom, G. van Lijnschoten, G.J.J. Hessels, E.F. Hooghoudt, R.
van Huizen, M.J. Swanenberg, J.H.H.A. Egbers, F. van den Elshout, J.J.
Koning, H. Schligtenhorst, J. Soeteman, ‘A-BCD: An Economic 100V RESURF Silicon-On-Insulator BCD Technology for Consumer and Automotive
Applications’, Proceedings International Symposium on Power
Semiconductor Devices (ISPSD), Toulouse, (2000)
[16] J.A. van der Pol, R.T.H. Rongen, H.J. Bruggers, ‘Modelling of Surface Potential Induced Leakage Failures in High Voltage Integrated Circuits and Application to Design Rule Derivation’, Submitted to ESREF Conference, Dresden,
(2000)
Acknowledgements
Many people have contributed over the past years to the realisation of the papers that are brought together in this thesis. First of all I think of my former colleagues of the Corporate Reliability Centre of the Philips Natuurkundig Laboratorium (Philips Research Laboratories), who encouraged me to take my first faltering steps on the scientific path, in particular Jan Verweij and Jan Koomen, but also Karel Van Doorselaer, Fred Kuper, Ajith Amerasekera, Mario Pinto, Albert van der Wijk and Kees de Zeeuw. There I also learned that work can very well be combined with a lot of fun, to which the group members not mentioned here also contributed a great deal.
In Nijmegen, the working climate in the Reliability Physics group of the Quality and Reliability (Q&R) department of Consumer IC Nijmegen (CIC-N), and the close cooperation with the process engineers in Waferfab AN and the designers, device physicists and product engineers of CIC-N, formed an excellent breeding ground: a wide choice of interesting and relevant subjects, sufficient means to investigate them thoroughly, good discussion partners and always a pleasant and positive cooperation, even when business interests put great pressure on the group. The most important support came from Eric Ooms, Paul Schras, Fred Kuper, Jocky Naderman and Philip Wolbert, for (in random order) the stimulating discussions, help with PC troubles, encouraging but sometimes also challenging words, help with experiments and analyses, and many other things.
Indispensable contributions, both material and immaterial, were however also made by John Vroemen, Han Gerritsen, Peter Ragay, Benno Krabbenborg, Rene Rongen, Loed Heldens, Frans van Lottum, Peter Groeneveld, Henk van den Hurk, Guus Rehbach, Jan Bruggers, Anton Aelbers, Peter Taylor and Toon van ‘t Hof† of the Q&R department; Piet van Kessel, Johan Bosmans, Piet Wessels, Jacques Mom, George Timan, Dick Vogelzang, Jan Soeteman†, Tom de Boer, Henk Verstappen, Wil Josquin and Hans Seele of Waferfab AN; Jos Plagge, John Somberg, Arnold Sengers, Arno Emmerik, Ben Verhoeven, Richard Langezaal, Menno van Langen, Arjan van Wijk, Hans van den Berg, Frans Urselmann and Kees Joosse of CIC-N; Willem Koster of MOS3; Ruud van Winkelhof, Alfons Goossens, Karl Anderten, Will Gubbels, Jan Slotboom, Peter Meijer, Dick Kleinloog and Rob Wolters of the Nat.Lab.; and Bob Thomas (USA).
Thanks also to my paranymphs Jan Koomen and Mario Pinto for their support and friendship during more than 10 years, and to Taib El Ghazi for the ‘art work’ of the cover.
The great inspirer of this thesis also deserves to be honoured separately: Jan Verweij, my promotor. Without his encouragement and his confidence in me it would probably never have come this far. Jan, many thanks for this, as well as for the great patience you have shown during the completion of this thesis.
Furthermore, of course, thanks to Anne-Mieke, Bram, Karel and finally also Lotte for their great patience during all those hours that ‘Papa was typing his booklet again’ instead of playing with the boys or walking with Lotte.
Finally I would like to mention my parents, who have always supported my studies through thick and thin: ‘Pô en Moe’, thank you!
Curriculum Vitae
Jacob van der Pol was born on 5 May 1961 in Hoensbroek, in the province of Limburg. In 1979 he obtained his Gymnasium-β diploma at the ‘Thomas à Kempis’ Scholengemeenschap in Zwolle. He then studied Applied Physics at the Technische Hogeschool Twente (later Universiteit Twente), where he graduated in the Quantum Electronics group on the optical amplification of nanosecond CO2 laser pulses. The engineering degree was obtained ‘with honor’ in 1986. He also obtained the qualification of ‘Stralingsdeskundige-C’ (radiation expert, level C). During his military service he worked at the Physics and Electronics Laboratory (FEL) of TNO in The Hague, after which he joined the Philips Natuurkundig Laboratorium in Eindhoven in 1987. Here he worked in the Corporate Reliability Centre on various product and process reliability topics of SRAM memories and submicron CMOS processes, with emphasis on hot carrier degradation and latchup.
Since 1991 he has worked at Philips Semiconductors in Nijmegen. After a year as reliability physicist in submicron CMOS logic process development in the waferfab MOS3, he was appointed Reliability Physics Manager for the product group Consumer ICs in 1992. Here he worked on various bipolar, BiCMOS and high voltage Bipolar-CMOS-DMOS (BCD) processes and products and various (SMD) package families. Emphasis was on the reliability of high voltage BCD processes, latchup in BiCMOS and BCD processes and ‘Early Failure Rate’ reduction techniques. Since 1996 he has been Process Engineering and Development Manager of the Consumer Systems waferfab AN in Nijmegen, which runs Bipolar, BiCMOS, BCD and Silicon-On-Insulator processes. He has been a member of the Technical Program Committee of the IRPS conference, is co-author of the paper that received the 1996 IRPS ‘Outstanding Paper Award’, and since 1997 has been one of the chairmen of the Technical Program Committee of the ESREF conference.
Biography
Jacob van der Pol was born in Hoensbroek, the Netherlands in 1961. He received the M.S. degree in Applied Physics ‘with honor’ from Twente University of
Technology in 1986 with a specialisation in laser physics. After one year with the
Physics and Electronics Laboratory of TNO in The Hague, he joined Philips Research Laboratories in 1987 where he worked on process and product reliability issues of SRAMs and submicron CMOS processes with emphasis on hot carrier degradation and latchup. In 1991 he joined Philips Semiconductors (PS) in Nijmegen
as reliability physicist in CMOS logic process development. In 1992 he became
Reliability Physics Manager for Consumer ICs. Here he has worked on various
process, product and package reliability issues covering various (SMD) package
families and bipolar, BiCMOS, CMOS logic and high voltage BCD processes.
Emphasis was on the latter technology, latchup in BiCMOS processes and ‘Early
Failure Rate’ reduction methods. Currently he is Process Engineering and
Development Manager of the Consumer Systems Waferfab AN in Nijmegen,
running Bipolar, BiCMOS, BCD and SOI processes. He has participated in the
IRPS conference Technical Program Committee, is co-author of the paper that received the 1996 IRPS
‘Outstanding Paper Award’ and is one of the chairmen of the Technical Program
Committee of the ESREF conference.