New Methods for Building-in and Improvement of Integrated Circuit Reliability
Application to High Volume Semiconductor Manufacturing

Jacob van der Pol

Cover design: Taib El Ghazi
Cover: the Bipolar-CMOS-DMOS WaferFab ‘AN’ of Philips Semiconductors in Nijmegen
ISBN: 9036514614

DISSERTATION

to obtain the degree of doctor at the Universiteit Twente, on the authority of the rector magnificus, prof.dr. F.A. van Vught, in accordance with the decision of the Doctorate Board, to be publicly defended on Thursday 8 June 2000 at 15.00 hrs

by

Jacob Antonius van der Pol
born on 5 May 1961 in Hoensbroek

This dissertation has been approved by the promotors
prof.dr. J.F. Verweij
prof.dr.ir. F.G. Kuper

‘…. just do what you say, that is hard enough already ….’
after Bram Faas†

To Anne-Mieke, Bram and Karel, for the many hours that daddy was ‘typing’ again,
and to my parents for their never-failing support

The doctoral committee:
Chairman: Prof.dr. H. Wallinga, Universiteit Twente
Secretary: Prof.dr. H. Wallinga, Universiteit Twente
Promotors: Prof.dr. J.F. Verweij, Universiteit Twente
  Prof.dr.ir. F.G. Kuper, Universiteit Twente / Philips Semiconductors
Members: Prof.dr.ir. A.C. Brombacher, Technische Universiteit Eindhoven
  Prof.dr. H.E. Maes, Universiteit Leuven / IMEC, Belgium
  Prof.dr.ir. A.J. Mouthaan, Universiteit Twente
  Prof.dr.ir. B. Nauta, Universiteit Twente
  Prof.dr. P.H. Woerlee, Universiteit Twente / Philips Research

Contents

1. INTRODUCTION
  1.1 Introduction
  1.2 Integrated circuit technology and reliability trends
  1.3 System for building-in and improvement of product reliability
  1.4 References

2. RELIABILITY ISSUES IN HIGH VOLTAGE BIPOLAR-CMOS-DMOS INTEGRATED CIRCUITS
  2.1 Introduction
  2.2 Threshold voltage instabilities of HV DMOS transistors
  2.3 Parasitic leakage currents induced by ‘charge-creep’
    2.3.1 Failure mechanism
    2.3.2 Surface potential modelling by a lumped element RC-network
    2.3.3 ‘Charge-creep’ characterisation using test structures
    2.3.4 Comparison of experimental data and model predictions
    2.3.5 Design rules
  2.4 Conclusions
  2.5 References

3. RELATION BETWEEN THE HOT CARRIER LIFETIME OF TRANSISTORS AND CMOS SRAM PRODUCTS
  3.1 Introduction
  3.2 Experimental
  3.3 Transistor and SRAM parameter degradation
  3.4 Analysis and discussion of the SRAM parameter degradation
  3.5 Relation between the transistor and SRAM hot carrier lifetime
  3.6 Summary and conclusions
  3.7 References

4. SYSTEMATIC DERIVATION OF LATCHUP DESIGN RULES FOR SUBMICRON CMOS PROCESSES FROM TEST STRUCTURES
  4.1 Introduction
  4.2 Latchup susceptibility reduction options
  4.3 Design rule derivation approach
  4.4 Application: design rule derivation for a CMOS process on p-/p++ epitaxial substrates
    4.4.1 Impact P+ substrate contact placement and P+ guardrings
    4.4.2 Impact Nwell contact placement and N+/Nwell guardrings
    4.4.3 Process specific design rules
  4.5 Conclusions
  4.6 References

5. SHORT LOOP MONITORING OF METAL STEPCOVERAGE BY SIMPLE ELECTRICAL MEASUREMENTS
  5.1 Introduction
  5.2 Electrical assessment of metal stepcoverage
  5.3 Design rule verification for (non-)planarized bipolar processes
  5.4 Effect of metal stepcoverage on electromigration lifetime
  5.5 Design rule verification for a non-planarized BiCMOS process
  5.6 Process split evaluation and shortloop equipment monitoring
  5.7 Metal stepcoverage wafermaps
  5.8 Summary and conclusions
  5.9 References

6. RELATION BETWEEN YIELD AND RELIABILITY OF INTEGRATED CIRCUITS AND APPLICATION TO FAILURE RATE ASSESSMENT AND REDUCTION IN THE ONE DIGIT FIT AND PPM RELIABILITY ERA
  6.1 Introduction
  6.2 Yield as a reliability indicator
  6.3 Experimental results
    6.3.1 Relation between yield and line fall-off
    6.3.2 Relation between line fall-off and field returns
    6.3.3 Relation between yield and burn-in reject rate
    6.3.4 Relation between burn-in and High Temperature Operating Life (HTOL) failure rate
  6.4 Failure rate prediction and assessment
  6.5 Options for failure rate reduction
    6.5.1 Yield improvement
    6.5.2 Elimination of special causes (‘maverick’ batches)
    6.5.3 Screening of weak parts with latent defects during product test
  6.6 Conclusions
  6.7 References

7. IMPACT OF SCREENING OF LATENT DEFECTS AT ELECTRICAL TEST ON THE YIELD-RELIABILITY RELATION AND APPLICATION TO BURN-IN ELIMINATION
  7.1 Introduction
  7.2 Impact of screening latent defects at e-sort on product reliability
    7.2.1 Yield-reliability relation
    7.2.2 Failure rate reduction options
    7.2.3 Impact of latent defect screens at e-sort on yield-reliability relation
  7.3 Model predicting burn-in failure rate from batch yield
    7.3.1 Experiment and failure rate evolution model
    7.3.2 Validation of the model
    7.3.3 Process dependence of the model constants
    7.3.4 Burn-in failure rate prediction
  7.4 Application of model to burn-in elimination
    7.4.1 Impact of screens
    7.4.2 Verification of the model
  7.5 Conclusions
  7.6 References

SUMMARY
SAMENVATTING (summary in Dutch)
LIST OF PUBLICATIONS
DANKWOORD (acknowledgements)
LEVENSLOOP (curriculum vitae)
BIOGRAPHY

1 Introduction

1.1 Introduction
1.2 Integrated circuit technology and reliability trends
1.3 System for building-in and improvement of product reliability
1.4 References

1.1 INTRODUCTION

Over the past 30 years the reliability of semiconductor products has increased by an astonishing factor of over 10 million despite the unprecedented progress of technology in this period.
This thesis deals with the methodology and techniques that have been developed in process development, product development and high volume manufacturing, firstly to realise this huge improvement and secondly to continue this improvement rate in the next millennium. This first chapter briefly describes the trends in semiconductor technology and product reliability and introduces the system for the building-in and improvement of integrated circuit reliability. The other chapters focus on the new methods and techniques that have been developed to enable a further advancement of the system.

In process development the adoption of highly accelerated stress techniques (preferably on wafer level) has become crucial, as this makes it possible to simulate 10 years of product lifetime within a few hours or days, in line with today’s development cycle times. In chapter 2 these Wafer Level Reliability (WLR) techniques are applied during the development of a high voltage Bipolar-CMOS-DMOS (BCD) technology to evaluate the product lifetime due to a sodium ingression wear-out failure mechanism and to select the best lifetime improvement candidate from various process modification options. Using similar stress methods, a new model is also developed that quantitatively describes transistor instabilities induced by surface charges. These charges originate from high voltage circuitry on the chip and constitute a dominant wear-out failure mode in high voltage products. It is shown how the model can be used to derive design rules that eliminate the effects of the surface charges and thus ensure reliable high voltage products.

The highly accelerated stresses are often carried out on dedicated test structures designed in such a way that they are ‘susceptible’ primarily to the failure mechanism of interest only. This introduces the problem of how to convert test structure lifetime data to actual product lifetimes.
As reliability margins are vanishing rapidly in modern semiconductor technologies, this is of great interest to the industry. In chapter 3 this has been explored for the case of hot carrier degradation. Large lifetime differences can occur in this case between test structures and products due to duty cycle effects, differences between AC and DC degradation and the varying sensitivity of the electrical parameters of a product to the degradation of one or more of its components. It is shown that lifetimes of products in dynamic operation can easily exceed the lifetimes of the corresponding transistors in static operation by a factor of 100. This finding is now commonly applied during product design, enabling increases in the maximum operating frequency of state-of-the-art microprocessors and Systems-On-A-Chip as well as more aggressive scaling of process technologies without jeopardising product reliability.

For building-in reliability during product development, the availability of reliability related design rules is mandatory. One example is the set of design rules that ensure that the product is robust against voltage spikes on its external pins, so that it does not ‘latch up’ and burn out. In chapter 4 a consistent approach is demonstrated that allows the derivation of latchup design rules from simple test structures and that is applicable to any CMOS technology. This allows first-time-right design of products and changes the perception of latchup prevention from an ‘art’ to an ‘engineering science’.

In high volume manufacturing the prevention and detection of process excursions that might deteriorate product reliability and yield is of the utmost importance. Therefore very sophisticated in-line and end-of-line control systems have been implemented in manufacturing flows, in which all critical equipment and process parameters that may influence product performance or reliability are regularly measured and kept under Statistical Process Control.
One such critical process parameter is the metal stepcoverage of a metallisation system, as it may have a dramatic effect on the electromigration related reliability of a product. In chapter 5 a new method is described that enables monitoring of the metal stepcoverage of a metallisation system by simple electrical measurements. It can be used for process optimisation, design rule derivation and stepcoverage monitoring. In the latter case electrical test structures, called Process Control Modules (PCMs), are placed at a number of positions on each wafer to identify any material that might contain a reliability hazard.

As a result of the ‘building-in’ reliability approach in process and product development, wear-out failure modes no longer occur in today’s products. Instead, product failures are dominated by early failures caused by manufacturing defects. Product failure rates, however, have become so low that conventional life testing techniques can no longer provide enough statistically significant data (at reasonable cost) to guide the improvement actions in high volume manufacturing lines. Therefore a paradigm shift is needed. In chapter 6 it is shown quantitatively for the first time, based on data from over 50 million products, that there is a clear correlation between the yield of a product and its reliability in the field. Thus the yield is a primary reliability indicator. It can be used to screen out material that does not fit into the normal yield distribution of a product, thus preventing products with a larger failure probability (‘Maverick’ lots) from being shipped to customers. A quantitative model is also developed and validated that predicts the reliability in the field from yield data. In this way yield scrap limits can be set on the basis of quantitative engineering arguments instead of qualitative reasoning as in the past, enabling a much better trade-off between the cost and benefit of scrapping deviating material.
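The idea of screening out lots that do not fit the normal yield distribution can be illustrated with a minimal sketch. It is not the quantitative model of chapter 6: it assumes a purely statistical outlier rule (median and scaled median absolute deviation), and the function name, lot identifiers, yield values and threshold `k` are all hypothetical.

```python
from statistics import median


def flag_maverick_lots(lot_yields, k=3.0):
    """Return ids of lots whose yield lies far below the bulk of the lots.

    lot_yields: dict mapping a (hypothetical) lot id to its yield fraction.
    k: number of robust standard deviations below the median that counts
       as deviating (an assumed threshold, not a value from the thesis).
    """
    values = list(lot_yields.values())
    centre = median(values)
    # Median absolute deviation; the factor 1.4826 makes it an estimate
    # of the standard deviation for normally distributed data.
    mad = median(abs(v - centre) for v in values)
    robust_sigma = 1.4826 * mad
    lower_limit = centre - k * robust_sigma
    return sorted(lot for lot, y in lot_yields.items() if y < lower_limit)


# Illustrative data only: five normal lots and one low-yield lot.
yields = {"A": 0.91, "B": 0.92, "C": 0.90, "D": 0.93, "E": 0.89, "F": 0.62}
print(flag_maverick_lots(yields))
```

A robust centre and spread are used here so that the deviating lot itself does not inflate the limit; a scrap limit derived from a yield-reliability model would replace `lower_limit` in practice.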
Finally, chapter 7 deals with the evolution of the failure rate over time, failure rate prediction and reduction of the failure rate by various screening techniques. Experimental data show that the failure rate curve indeed has a bathtub shape (see section 1.2). Its evolution is described by a new model that shows quantitatively what the impact of various ‘burn-in’ options is on the product failure rate. Furthermore it is shown for the first time what the quantitative effect on failure rate is of alternative screening techniques that can be implemented in the electrical-sort (‘E-sort’) test program at wafer level, such as voltage screens and quiescent current (‘IddQ’) tests. It appears that these techniques are a good alternative to burn-in and can reduce failure rates by about a factor of 2. This finding opens the way for significant efficiency improvements and cost reductions in high volume semiconductor manufacturing, and consequently these screening techniques are rapidly becoming standard industry practice.

1.2 INTEGRATED CIRCUIT TECHNOLOGY AND RELIABILITY TRENDS

The reliability of semiconductor products as a function of time is commonly described by a bathtub curve [1,2,49,54-56], because the plot of the product failure rate as a function of time has the shape of a cross-sectioned bathtub, as shown in fig. 1. Three failure regimes can be distinguished. In the ‘infant mortality’ or ‘early failure’ period, the products show a high but decreasing failure rate until the failure rate stabilises. The subsequent period of stable failure rate is referred to as the ‘random failure’ period. Finally, in the ‘wear-out’ period, the failure rate increases again when the end-of-life of the products is reached.

[Figure: failure rate versus time. Early failure period: manufacturing defects. Random failure period: electrical overstress events and defect tail. Wear-out period: intrinsic degradation mechanisms.]
Fig. 1: Failure rate as a function of time: the bathtub curve.
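The three regimes of fig. 1 can be reproduced qualitatively with a small sketch. It assumes, purely for illustration, that the total failure rate is the sum of three Weibull hazard functions: one with shape parameter below 1 (decreasing, early failures), one equal to 1 (constant, random failures) and one above 1 (increasing, wear-out). All parameter values are invented and are not taken from the thesis.

```python
def weibull_hazard(t, beta, eta):
    """Weibull hazard rate h(t) = (beta / eta) * (t / eta) ** (beta - 1)."""
    return (beta / eta) * (t / eta) ** (beta - 1)


def bathtub_hazard(t):
    """Sum of three hazards; all shape and scale values are hypothetical."""
    early = weibull_hazard(t, beta=0.5, eta=1e7)      # decreasing with time
    random_part = weibull_hazard(t, beta=1.0, eta=1e8)  # constant in time
    wear_out = weibull_hazard(t, beta=4.0, eta=3e5)   # increasing with time
    return early + random_part + wear_out


# Failure rate (per hour) early in life, in mid-life and near end-of-life.
for hours in (10, 1e4, 5e5):
    print(hours, bathtub_hazard(hours))
```

With these assumed parameters the rate is high at 10 hours, lowest around 10^4 hours and high again at 5 x 10^5 hours, which is the bathtub shape described above.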
The nature of the failures in the three periods is generally very different, see table 1. The majority of the failures in the ‘early failure’ period are caused by manufacturing defects such as particles, near-opens and near-shorts in metal lines, weak spots in isolating dielectrics or poorly bonded bondwires in the package. In the ‘random failure’ period many different root causes occur, but failures related to specific events like lightning, load dump spikes occurring during disconnection of car batteries or other overstress situations are most notable. Failures in the ‘wear-out’ period are related to intrinsic properties of the materials and devices used in the product in combination with the product use conditions like temperature, voltage and currents, including their time dependence. Examples of wear-out failure mechanisms are electromigration, (gate) oxide breakdown, hot carrier degradation, mobile ion contamination and dry corrosion of bondballs [1,3-5,31-32], see also section 1.3. Reliability engineering deals on the one hand with systematically reducing the infant mortality and random failures and on the other hand with keeping the wear-out phase beyond the practical lifetime of the product.

Early Failure Period:
  particles
  gate oxide defects
  near-opens / near-shorts
  pinholes in isolating dielectrics
  scratches
  loose bondwires
  popcorn damage

Random Failure Period:
  latch-up
  latent ESD damage
  Safe-Operating-Area (SOAR) violations
  load-dump car battery
  electrical overstress
  extended early failures

Wear-Out Period:
  electromigration
  (gate) oxide breakdown
  hot carrier degradation
  mobile ion contamination
  transistor instabilities
  stress voiding
  thermo-migration
  surface charges
  corrosion
  pattern shift
  bondwire fracture
  ‘dry’ bondball corrosion

Table 1: Characteristic failure modes in the three regimes of the bathtub curve.
Today’s state-of-the-art products like microprocessors or Systems-On-a-Chip (SOC) contain tens of millions of transistors, a factor of 10^5 more than in the early seventies, as shown in fig. 2. This has been realised by a simultaneous reduction in minimum feature size and increase of die area, see fig. 3. At the same time package technology has evolved from simple Dual-In-Line (DIL) packages to complex high pincount Chip Scale packages, see figs. 4 and 5. The remarkable thing about semiconductors is that despite this dramatic increase in complexity of processes, products and packages, the product failure rate has simultaneously decreased by more than two orders of magnitude, as witnessed by fig. 6. Here it must be noted that the failure rate of ICs is usually expressed in FIT (Failures In Time). One FIT is one failure per 1 billion (10^9) device hours under normal operating conditions. As the failure rate of ICs decreases in time, failure rates are usually determined after 48 or 168 hours as well as after 1000 hours of accelerated testing. From the 48 or 168 hours results an ‘Early Failure Rate’ (EFR) can be determined and from the 1000 hours results an ‘Intrinsic Failure Rate’ (IFR). Today, Early Failure Rate requirements by customers are below 10 FIT, corresponding to a maximum of one failure during 100 million operating hours.

[Figure: number of transistors per chip versus year (1965-2005) for memories and microprocessors.]
Fig. 2: Trend in chip complexity [53].

[Figure: minimum feature size [µm] and die size [mm2] versus year (1965-2010).]
Fig. 3: Trend in minimum feature size and die sizes of DRAMs [5,11].

Fig. 4: Package integration trend [58], QFP = Quad Flat Pack, TAB = Tape Automated Bonding, COB = Chip On Board, CSP = Chip Scale Package.
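The FIT arithmetic used above can be written out as a short sketch. The function names are hypothetical, and the optional acceleration factor (to convert stress hours into equivalent operating hours) is an assumption added for illustration; no confidence bound is applied.

```python
def fit_from_failures(failures, devices, hours, acceleration_factor=1.0):
    """Point estimate of the failure rate in FIT (failures per 1e9 device hours).

    acceleration_factor converts stress hours into equivalent operating
    hours; the default of 1.0 means the test ran at use conditions.
    """
    device_hours = devices * hours * acceleration_factor
    return failures * 1e9 / device_hours


def device_hours_per_failure(fit):
    """Average number of device hours per failure at a given FIT level."""
    return 1e9 / fit


# A 10 FIT requirement corresponds to one failure per 100 million device hours.
print(device_hours_per_failure(10))

# Illustrative numbers only: 1 failure among 1000 devices operated for 1000 h.
print(fit_from_failures(failures=1, devices=1000, hours=1000))
```

The second call shows why life tests on modest sample sizes cannot resolve single-digit FIT levels directly: one failure in a million device hours already amounts to 1000 FIT.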
Fig. 5: Microprocessor pin count trend [58].

In order to be able to achieve this 7-decade reliability improvement, the semiconductor manufacturers have implemented a very refined product reliability assurance system. Key elements are building-in reliability during process and product development, thorough product and process qualification procedures, in-line control of product reliability items in the waferfab and assembly plant, ‘Maverick’ lot control (a ‘Maverick’ lot shows an exceptionally high failure rate), reliability monitoring via analysis of lifetest rejects and customer returns, see fig. 7, and a well functioning continuous improvement process focussed on corrective actions on any deviation observed. It is also clear that along with the increase in product complexity and decrease of allowed failure rate, this system and the methods used need to be continuously updated. The next section deals in more detail with the product reliability assurance ‘chain’ and indicates what the contribution of the work in this thesis to the system is.

Fig. 6: Trend in early- and intrinsic failure rate (FIT) targets for application in consumer products [3, 47].

1.3 SYSTEM FOR BUILDING-IN AND IMPROVEMENT OF PRODUCT RELIABILITY

1.3.1 Introduction

Fig. 7 schematically depicts the system that is widely implemented in semiconductor manufacturing to build-in and improve product reliability. It is described in more detail below, together with the contribution of the work in this thesis to the advancement of the system.

1.3.2 Materials and process module research

In the research phase work is oriented to the choice of the proper materials and the development of revolutionary process steps that will finally be able to meet the future product requirements. These choices typically affect primarily the wear-out failure mechanisms listed in table 1.
Present examples are the work on copper metallisation, see fig. 8, dual-damascene processes, low-k dielectrics, alternative gate dielectrics and sub-0.15 µm lithography [3-8] and, in the field of packaging, novel types of moulding plastics offering improved moisture resistance and robustness against soldering treatments [12-14].

[Figure: flow chart of the stages materials & process module research; process development and package development; product development; process qualification and package qualification; product qualification; reliability control of processes in waferfab and assembly plant; screening of defects; ‘Maverick Lot’ detection; reliability monitoring and customer feedback; all linked by continuous improvement and feedback loops. The chart marks the stages addressed by chapters 2 to 7.]
Fig. 7: The product reliability assurance and improvement system.

[Figure: MTTF [sec] versus 1/T [K^-1] for AlCu (Ea = 0.62 eV) and Cu (Ea = 0.97 eV).]
Fig. 8: Electromigration lifetime versus temperature of a full Cu-metallisation versus that of an AlCu-based metallisation [8].

1.3.3 Process development

In the process development phase, the aim is to combine (known) materials and new (evolutionary) process steps in such a way that the process and product requirements are met in the shortest possible time at the lowest cost. Emphasis in this thesis is firstly on process reliability investigations and secondly on the derivation of reliability related design rules for products. By means of extensive highly accelerated Wafer Level Reliability (WLR) techniques [15,16] and the use of appropriate lifetime extrapolation models, the lifetimes related to each of the wear-out failure mechanisms can be established. If these are insufficient, process modifications are required, as well as verification of the expected improvement by renewed WLR-investigations. In chapter 2 this approach is demonstrated for the case of a high voltage Bipolar-CMOS-DMOS (BCD) technology.
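As an illustration of what a lifetime extrapolation model does, the sketch below applies the standard Arrhenius temperature acceleration factor to the two activation energies quoted in fig. 8. It covers only the temperature term (the current density dependence of electromigration is left out), and the stress and use temperatures are assumed values chosen for the example, not conditions from the thesis.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def arrhenius_acceleration(ea_ev, t_use_c, t_stress_c):
    """Ratio of lifetime at use temperature to lifetime at stress temperature.

    ea_ev: activation energy in eV; temperatures are given in degrees Celsius.
    """
    t_use_k = t_use_c + 273.15
    t_stress_k = t_stress_c + 273.15
    exponent = ea_ev / BOLTZMANN_EV_PER_K * (1.0 / t_use_k - 1.0 / t_stress_k)
    return math.exp(exponent)


# Activation energies from fig. 8; the temperatures are hypothetical.
for name, ea in (("AlCu", 0.62), ("Cu", 0.97)):
    factor = arrhenius_acceleration(ea, t_use_c=100.0, t_stress_c=250.0)
    print(name, factor)
```

Because the activation energy appears in the exponent, the metallisation with the higher activation energy gains far more lifetime when going from stress to use temperature, which is why hours of stress can stand in for years of use.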
Based on the WLR-data and extrapolation models, design rules are also derived that are intended to eliminate wear-out effects and thus ensure reliable operation of the products during their useful life. In this context it is interesting to note that the useful life period can vary from 7000 hrs for automotive products via 30000 hrs for consumer products to 250000 hrs for some telecommunication devices. Design rules can be optimised accordingly. Examples of these design rules are the maximum allowable current through a metal line versus the line width to prevent electromigration failures, the maximum allowable voltage on a MOS transistor versus the poly gate length to prevent hot carrier degradation effects and, for the package, metal-stress design rules to prevent passivation cracking and pattern shift.

In high voltage products transistor instabilities induced by surface charges are a dominant wear-out failure mode. In chapter 2 the interaction between these instabilities and the surface charges is determined quantitatively for the first time, and it is also discussed how the effects of the surface charges in high voltage products can be eliminated by means of proper design rules.

The lifetime estimates for the various wear-out failure mechanisms are generally extracted from static stress WLR-experiments on test structures. For advanced processes the safety margins have completely disappeared. Current DC hot carrier lifetimes of state-of-the-art 0.35 µm and 0.25 µm processes are, for example, typically less than 2 months [17]. This means that it becomes important to establish the relation between the lifetimes as measured (typically by means of static stresses) on test structures and those of real products. Large differences can occur due to duty cycle effects, differences between AC and DC degradation and the varying sensitivity of the electrical parameters of a product to the degradation of one or more of its components.
Chapter 3 deals with this issue for the case of hot carrier degradation [18]. It is shown that product lifetimes can easily be a factor of 100 larger than the corresponding DC transistor lifetime. This finding is now commonly applied during product design, enabling increases in the maximum operating frequency of state-of-the-art microprocessors and Systems-On-A-Chip as well as more aggressive scaling of process technologies without jeopardising product reliability.

During process development design rules are also derived and devices are designed to make the products robust against electrical overstress events like Electro-Static Discharge (ESD) [19] and latch-up [20], which can result in randomly occurring failures. Quite often the derivation of ESD and latch-up design rules is regarded as a kind of ‘black magic’. However, in chapter 4 a consistent approach is demonstrated that allows latch-up design rules to be derived from simple test structures and that is applicable to any CMOS technology [21]. Remarkably, such an approach was not available up to now and it thus fills a gap in engineering science. In literature some complementary studies on building-in ESD and latchup robustness [22-25, 51] and improvement of the Safe-Operating-Area (SOAR) capability of bipolar products [26] are available. Finally, based on package reliability studies, ‘metal stress’ rules are generated aimed at making the product robust against mechanical stress exerted by the package materials [12-13].

1.3.4 Process qualification

During process qualification first a set of standard WLR-tests [11,12] is executed in order to prove that in the final process flow all wear-out failure mechanisms, see table 1, are sufficiently covered by the design rules or by the process architecture itself (as e.g. in the case of mobile ions). An example of such a program is given in fig. 9.
Second, a number of package reliability tests like Temperature Cycling (TMCL) and Highly Accelerated Steam Tests (HAST) are executed to show that the intrinsic properties of the passivation on top of the die meet the requirements related to mechanical strength and moisture permeability. Third, ESD and latch-up tests are executed to determine whether the ESD-protection and latchup prevention design rules are appropriate.

Table 9: Test Methods for Wafer Level Reliability

Test | Method | Acceptance Criteria | c/n
Electromigration | SNW-FQ-101A | Reference note 1 |
Gate Oxide Breakdown | SNW-FQ-101B | Defect density < 10 ‘killing defects’/cm2, 60% Confidence Level (note 2) |
Hot Carrier | SNW-FQ-101C | 10 year life (analog), 0.5 year life (digital), 60% Confidence Level |
Mobile Ion | SNW-FQ-101D | BTS; TVS (optional) | 0/5; 0/5
Metallization Stress Voiding | SNW-FQ-101E | < 10 class B defects per cm line, < 10 class C defects per cm line, 60% Confidence Level |

Notes:
1. t(0.1%) = 10 years at 70 °C, 60% Confidence Level.
2. If the requirement is not met business lines shall be informed and implementation of appropriate screening procedures should be considered by the BL.
3. If a process does not meet the 0.5 year life time requirement, then 10 years life time at use conditions must be demonstrated at the circuit level for products manufactured with that process.

Table 10: Construction Analysis Test Methods for Wafers

Test Description | Abrv. | Method | c/n
External Wafer Inspection | EXWI | Local Document | 0/5
Wafer Bow | WABO | Local Document | 3 x 0/5
Wafer Strength | WAST | Local Document | 3 x 0/1
Crystal Strength | XTST | Local Document | 3 x 0/10
Roughness | ROGH | Local Document | 0/5
Adhesion | ADHE | Local Document | 0/5

Fig. 9: Overview of typical Wafer Level Reliability qualification program including construction analysis [27,40].

1.3.5 Package development and qualification

In packaging the trend is towards larger die sizes (fig. 3), a larger pincount [9-14] requiring a finer pitch of the package leads (fig. 5), smaller bondpad sizes and bondpad pitch on the circuit die (fig. 10), thinner packages (figs. 4 and 11) and an improved resistance against soldering treatments during mounting of the package on a Printed Circuit Board (PCB). For consumer and automotive applications, the majority of the packages are still (derivatives of) the conventional Dual-In-Line (DIL), Single-In-Line (SIL), Quad-Flat-Pack (QFP) and Small-Outline (SO) packages. For state-of-the-art devices like microprocessors, however, novel packages like Ball-Grid-Arrays (BGA) and Multi-Chip-Modules (MCM) and techniques like Flip-Chip packaging, Chip Scale Packaging (CSP), Tape-Automated-Bonding (TAB) and Controlled Collapse Chip Connection (C4) have also been introduced [9-10, 58], see fig. 12.

[Figure: number of pads/pins versus pad pitch [µm] (50 to 150) for wirebond (aluminum), wirebond (gold ball), TAB and array/C4 technologies; the evolutionary vector points towards higher performance and I/O compaction.]
Fig. 10: Technology trend in packaging [13].

Fig. 11: Package profile comparison [58].

Fig. 12: Package size versus number of I/O’s for various package families [58].

In conventional packaging, emphasis is on package materials that lower the mechanical stress on the die surface and on improved plastic moulding compounds. These new compounds firstly absorb less moisture, secondly contain fewer contaminants that might induce bondpad corrosion and thirdly have an intrinsically better adhesion to the scratch protection of the die and the leadframe. This is because it has been shown that the loss of adhesion between the moulding plastic and the package materials during e.g. a Temperature Cycling test (TMCL) or a soldering treatment such as the ‘popcorn’ test is the key factor degrading the package related reliability of the product [28-30]. Delaminated packages are more prone to bondpad corrosion, package cracking, lifted bondballs, lifted wedgebonds and passivation cracking or even ‘pattern shift’.
The sensitivity of a product to passivation cracking can be significantly reduced by applying proper ‘metal stress’ design rules [12-13] during the product development. Alternatives are the use of a mechanical stress resistant passivation scheme, the use of a polyamide wafer coating or a silicone die-coating that acts as a kind of stress relief layer [13].

The novel ‘anti-popcorn’ moulding plastics developed to reduce moisture uptake and improve ‘popcorn’ behaviour generally have a low glass-transition temperature (Tg) of around 120-130°C. Unfortunately, this is below the commonly used High Temperature Operating Lifetest (HTOL) stress temperature of 150°C and also below the normal use junction temperature in some special applications such as lighting and automotive. At these temperatures the Sb- and Br-flame retardant additives in the plastic are less stable and more mobile in these anti-popcorn compounds. As a result, especially the Au-Al ball-bond ‘dry corrosion’ degradation mechanism [31,32] is strongly accelerated compared to the case where a normal low stress moulding compound is used. For some compounds ‘open circuit failures’ are observed within a few thousand operating hours at 150°C, see fig. 13. The problem can be somewhat alleviated by choosing proper bonding conditions [50]. Nevertheless, the trend is towards moulding compound recipes with a lower amount of flame retardant additives and towards wirebond materials that are less susceptible to the ‘dry corrosion’ mechanism.

Fig. 13: SEM photograph of (a) a bondpad and (b) the bottom of a lifted bondball showing Au-Al intermetallics due to the ‘dry corrosion’ degradation mechanism [52].

Package qualification is generally done on test chips with emphasis on the intrinsic properties of the package materials.
In most cases, however, package reliability is also investigated as part of the product qualification program due to the potential interaction with the actual product and waferfab process.

1.3.6 Product development

During product development the designer firstly must make sure that the product adheres to all design rules derived during the process development phase. Especially the ESD and latchup robustness of the product and its ability to withstand mechanical stress tests are largely determined by the specific design solutions chosen by the designer. Secondly, because of vanishing reliability margins, reliability simulation techniques [33-36] are employed more and more during the design phase of products in state-of-the-art processes. In this way the actual impact of wear-out failure mechanisms such as hot carrier degradation and electromigration on the circuit performance and thus the circuit lifetime can be determined. Furthermore the reliability simulations also reveal the weak spots in the circuit. By making appropriate design changes to these weak spots (e.g. longer channel lengths of MOS transistors or wider metal lines), the product lifetime can be improved until the required lifetime target is met. This approach is the best guarantee that the optimum between circuit performance, die size and required lifetime is achieved.

1.3.7 Product qualification

During product qualification the actual product is subjected to a set of standard accelerated stress (life)tests [27,37] aimed at finding any deficiencies in the combination of the design, process, package and application, see fig. 14. In these tests first the endurance performance, robustness against overstress phenomena, mechanical stress resistance and the ability of the product to survive in a humid environment are examined. All these tests are related to the intrinsic reliability of the product. If the previous work has been properly done, no rejects are observed.
Second, the capability of the product with respect to assembly on a PCB is checked. Third, especially in case of automotive or military applications, the sensitivity of the product to defects is determined by subjecting large samples of products to short duration stress tests such as a 24 hour dynamic operation at 150°C (‘burn-in’). Failures observed during these tests are generally related to insufficient control over the manufacturing process by the semiconductor supplier.

Product Environmental & Electrical Tests for Leaded IC Packages
(stress test, abbreviation: test condition; requirement RFS / Extended; c/n)

High Temperature Operational Life (static or dynamic), HTOL: Tj = 150°C, biased; EFR < 168 h, c/n 0/231 (notes 5, 6); IFR 1000 / 2000 h, c/n 0/77 (note 6)
High Temperature Storage Life (note 1), HTSL: Ta = 150°C, unbiased; 1000 / 2000 h; c/n 0/77
Latch-up, LAUP: 1.5 Vcc(max); ±100 / 200 mA
ESD Susceptibility (Human Body Model), ESDH: 1500 Ω / 100 pF; 2 kV
ESD Susceptibility (Machine Model), ESDM: 0.75 µH / 200 pF; 200 V; c/n 0/3
SMD Preconditioning, PCON: for SMD devices
Pressure Pot (note 2), PPOT: preconditioning (SMDs), 121°C / 100% RH, unbiased; 96 / 192 h; c/n 0/77
Unsaturated Pressure Pot (note 2), UPOT: preconditioning (SMDs), 130°C / 85% RH, unbiased; 96 / 192 h; c/n 0/77
Temperature Humidity Bias (note 2), THBS: preconditioning (SMDs), 85°C / 85% RH, biased; 1000 / 2000 h; c/n 0/45
High Acceleration Stress Test (note 2), HAST: preconditioning (SMDs), 130°C / 85% RH, biased; 96 / 192 h; c/n 0/45
Temperature Cycling (note 7), TMCL: preconditioning (SMDs), -65°C to 150°C (air to air); 200 / 500 cycles
Thermal Fatigue (power devices only), TFAT: power on/off at Tj max; 10,000 cycles; c/n 0/45
Data Retention (note 3), DRET: T = 150°C; 1000 / 2000 h; c/n 0/45
Erase/Write Cycling (note 3), ERWR: note 4; 1.0x spec cycles; c/n 0/45

Specifications referenced: SNW-FQ-500, SNW-FQ-114, SNW-FQ-302A, SNW-FQ-302B, SNW-FQ-303, SNW-FQ-225A, JEDEC A113, SNW-FQ-A102, SNW-FQ-C102, SNW-FQ-D102, SNW-FQ-112, SNW-FQ-532, SNW-FQ-540, SNW-FQ-541.

Notes:
1. HTSL test unnecessary unless HTOL is conducted at T < 150 °C.
2. Either PPOT or UPOT, and THBS or HAST, required for Process or Package qualification. TTPP (Temperature Treatment Pressure Pot) test may be performed in place of PPOT for Through Hole Mounted Devices.
3. Additional stress tests for non-volatile products.
4. Maximum endurance / operating temperature, according to Product Specification.
5. Minimum sample size to commence qualification. Wafer Fab Process Changes must demonstrate a capability to equal or exceed a 500 FPM performance level within 1 year of qualification completion at a 60% confidence level.
6. Complex, high-pin-count packages may necessitate sample size reduction.
7. An acceptable alternative TMCL condition approximating 500 cycles of -65 °C to +150 °C is 1000 cycles of -55 °C to +125 °C.
8. Stress duration dependent on Fab Process.

Fig. 14: Overview of typical product reliability qualifications program [27].

1.3.8 Reliability control in the waferfab and assembly plant

In order to prevent process excursions that might deteriorate product reliability and yield, the semiconductor wafer and package manufacturers have introduced very sophisticated in-line control systems in their manufacturing process. Generally, critical parameters that may influence product performance or reliability are defined for all equipment in the fab and measurement frequencies are determined based on statistical techniques. Examples of circuit performance related parameters are sheet resistances, layer thicknesses and line widths, while the number of particles generated per wafer pass and the mechanical stresses in layers are related to circuit reliability. In assembly plants, parameters like ball bond shear-off force, plastic delamination and dimensions are measured. All parameters are controlled using Statistical Process Control (SPC) techniques [38,39].
Using SPC, any deviation of the process from its normal operation is identified and results in ‘blocking’ of the equipment for production before the products are negatively affected. By proper execution of Out-of-Control-Action-Plans (OCAPs), the equipment can later be safely released for production again.

Apart from in-line control, end-of-line control techniques can also be used to identify any material that might contain a reliability hazard. For this purpose each wafer contains, at a number of positions, a large variety of electrical test structures called Process Control Modules (PCMs), suited to monitor and control the performance and reliability of the complete final process. Typical devices on a PCM are transistors, ‘van der Pauw’-type resistors, capacitors, contact strings, zener diodes, metal lines etc. Often, however, reliability modules are included as well. Most common are large capacitors to monitor gate oxide quality and metal meanders to monitor metal shorts. For older non-planarised processes it is also important to monitor the metal stepcoverage, as this may have a dramatic effect on the electromigration related reliability of the product. In general this is done by examining SEM cross sections of a worst case step on a regular, in most cases weekly, basis. Chapter 5 deals with a new method that allows the metal stepcoverage to be monitored by simple electrical measurements [41]. This enables metal stepcoverage control on 100% of the produced wafers, see e.g. fig. 15. In this way the probability that unreliable material slips through all screens and is shipped to a customer between thousands of good wafers is greatly reduced. The method can also easily be extended to measure metal stepcoverage electrically in contact holes and vias and is thus also relevant for sub-micron technologies with planarised backends.

A new development is the use of ‘Fast Wafer Level Reliability’ (Fast-WLR) techniques [39,40].
Here each wafer contains a series of test structures, each dedicated to a particular failure mechanism. The design of the structures and the associated stresses are such that very large acceleration factors are achieved and the failure thresholds are reached within 0.1 to 1 minute. The reliability data thus obtained are controlled by SPC techniques, which allows any change in the reliability of the process from its standard level to be identified. It is however still necessary in those cases to verify the validity of the observed reliability change by executing standard, less accelerated, stress tests, as the very large acceleration factors may also induce degradation mechanisms that are not relevant at normal use conditions.

Fig. 15: SPC control chart of a metal stepcoverage monitoring parameter showing the resistance ratio between a metal2 line over metal1 and polysilicon steps and a metal2 line over a flat surface (mean = 1.105 and sigma = 0.010). The chart contains data of about 700 wafer batches (35000 wafers) and reveals 4 out-of-control events and 1 out-of-specification event. [Plot: resistance ratio (1.00 to 1.30) versus date (November 1997 to January 1999), with LSL, LCL, target, UCL and USL lines.]

1.3.9 Maverick lot detection

In case the design adheres to all (reliability related) design rules and is produced in a mature process in a waferfab with excellent process control, product reliability is dominated by defects occurring during the manufacturing process, such as particles, scratches, near-opens and near-shorts, see e.g. fig. 16. These same defects are generally also the origin of E-sort (Electrical Sort) yield loss: the larger defects result in zero hour product failure (and thus yield loss), while the smaller defects constitute latent defects that may fail during the operational life of the product.
In chapter 6 it is shown quantitatively for the first time, based on data of over 50 million devices, that there exists a clear correlation between the yield of a product, its burn-in fall-out [42,57] and its reliability in the field [42], provided the yield loss is dominated by functional failures and not by parametric failures [43]. Thus the E-sort yield is a primary reliability indicator and can be used to screen out material that does not fit into the normal yield distribution of a product. In this way it is prevented that products with a larger failure probability (‘Maverick’ lots) are shipped to customers. Note that the reliability in the field can be predicted quantitatively from the E-sort yield, so that yield scrap limits can be set based on engineering arguments instead of on qualitative reasoning as in the past. This allows a much better trade-off between cost and benefit of scrapping deviating material.

Fig. 16: Photograph of a particle in a BiCMOS circuit (a) causing a failure due to an open metal1 line and a Focussed Ion Beam cross section of the particle (b) revealing that it is an aluminum particle. [Labels in (b): Metal 2, aluminum particle, Si3N4, oxide, silicon.]

A more sophisticated ‘Maverick lot’ detection method uses, apart from the plain yield number, also the reject data from the individual tests in the E-sort test program (called ‘BIN fingerprint’) to distinguish deviating material from the material fitting within the normal distribution (‘Moving Limits’ technique). Deviating material is generally put on hold for more thorough analysis by product engineers, after which a decision about scrapping or shipping of the material is made. The results of these analyses are used for continuous improvement of the test programs, product designs or the waferfab processes. Fig. 17 shows the trend of the defect density reduction in a high-volume bipolar-BiCMOS waferfab resulting from this approach. A remarkably constant improvement rate of nearly 20% per year is observed over a period of 20 years. The impact of the continuous improvement feedback loop on the occurrence of ‘maverick’ lots is also demonstrated in chapter 6.

Fig. 17: Defect density reduction trend in a bipolar-BiCMOS waferfab. [Plot: defect density [cm-2] on a logarithmic axis (0.1 to 100) versus year (1973 to 1999), with the 2, 3, 4 and 5 inch wafer generations marked.]

1.3.10 Screening of defects

In order to reduce the failure rates in the field, products have traditionally been subjected to burn-in [1,45,46]. During burn-in the product is operated for a longer time (6 to 168 hrs) at an elevated temperature (125°C to 150°C junction temperature) and often also at a higher than nominal supply voltage (e.g. 7V instead of 5V). In this way latent defects that would otherwise fail during the early failure period of the bathtub curve can be screened out, resulting in a lower subsequent failure rate in the field for devices that survive the burn-in [45,46]. In chapter 7 it is shown, based on experimental data, that the failure rate evolution versus time indeed follows the bathtub curve, and the impact of various burn-in options on the product failure rate is shown quantitatively [47]. The major drawback of burn-in is the cost of the whole procedure and the fact that for high yielding products the burn-in process itself might induce more (latent) failures, e.g. due to handling damage, than are screened out by the procedure. Therefore many alternative screening techniques have been developed over the last decade that can be implemented in the electrical-sort (‘E-sort’) test program at wafer level, such as voltage screens, quiescent current (‘IddQ’) tests and parameter distribution oriented tests like ‘Moving Limits’ [46-48].
In general these tests significantly improve the test coverage compared to the case when only failures following the ‘Stuck-At’ fault model are detected. An example of the effectiveness of an IddQ test is shown in fig. 18.

Fig. 18: Impact of IddQ testing on the PPM level of a product as a function of ‘Stuck-At’ fault test coverage [48]. [Plot: defect level [ppm] (0 to 125000) versus Stuck-At fault coverage [%] (0 to 100) for ‘functional test only’ and ‘functional test + IddQ’.]

Two different kinds of screening tests exist. The first operates the products outside their normal operating window with the aim of forcing latent defects into ‘hard’ failures (e.g. forcing a weak spot in a gate oxide into a short by applying a high voltage). The second aims at screening out products that are functional and within specification limits but nevertheless show analog parameter values (e.g. a supply current or output voltage) that are outside the distribution of the remainder of the products. In chapter 7 it is shown that these techniques are a good alternative to burn-in and can reduce failure rates by about a factor of 2. This finding opens the way for significant efficiency improvements and cost reductions in high volume semiconductor manufacturing and consequently the screening techniques are rapidly becoming standard industry practice.

1.3.11 Reliability monitoring and continuous improvement

The semiconductor supplier has the primary responsibility for failure rate monitoring. For this purpose the supplier executes extensive reliability monitoring programs in which a part of the production is subjected to reliability evaluations on a sample basis. However, today's failure rates are so low that excessive sample sizes (more than 100000 products) are needed to demonstrate the reliability targets required by the customers. For statistically relevant information about the largest reliability hazards in production even millions of devices are needed.
Apart from the prohibitive cost, the time needed to execute all these tests is also so long (several months) that the lifetests have become hardly usable for continuous improvement purposes. The lifetests are however still suitable to detect and subsequently evaluate the delivery risk of potential ‘Maverick’ lots.

Semiconductor suppliers generally agree on ‘PPM-cooperation’ programs with their key customers. In such a cooperation all devices failing during assembly and testing of the Printed-Circuit-Board (PCB) at the customer (called ‘line fall-off’) are sent back to the supplier and the root cause of the failure is determined. Fig. 19 shows as an example a pareto of the failures of a high volume BiCMOS TV signal processing IC.

Fig. 19: Pareto of ‘line fall-off’ failure causes of a BiCMOS TV signal processing product: overstress 30%, good 23%, die fault 20%, unknown 14%, test coverage 10%, ESD 2%, assembly 1%.

Note that about a quarter of the returned devices appear to be good, due to a mismatch between product specification and application or due to poor repair procedures at the customer. As millions of devices are shipped to the customer, the failure sample size is generally large enough to be of statistical significance. Consequently, these data are used extensively to define corrective actions and continuous improvement programs in the waferfabs. A major drawback, however, is that this feedback loop spans at least 3 months to half a year due to pipeline effects. The way out is the fact that these reliability failures correlate with the yield failures and have the same failure signature, as shown in chapter 6. Apart from in-line defect monitors, data from yield analysis provide the fastest feedback loop possible in semiconductor manufacturing, with a feedback time of a few weeks. In conclusion, a strong focus on defect reduction and yield improvement in the waferfab is the best option for a continuous reliability improvement program.
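The sample size argument made earlier in this section can be illustrated with a small calculation. The sketch below assumes a simple binomial model with independent devices and a zero-failure acceptance criterion. Neither the model nor the function names are taken from the text; they are introduced here for illustration only. The 60% confidence level is borrowed from note 5 of fig. 14, and the ppm targets in the loop are arbitrary example values.

```python
import math


def zero_failure_sample_size(max_fail_fraction, confidence=0.60):
    """Smallest sample size n for which observing zero failures in n devices
    demonstrates a failure fraction <= max_fail_fraction at the given
    one-sided confidence level (binomial model, independent devices)."""
    if not 0.0 < max_fail_fraction < 1.0:
        raise ValueError("max_fail_fraction must lie between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie between 0 and 1")
    # P(0 failures | p) = (1 - p)**n must not exceed 1 - confidence.
    n = math.log(1.0 - confidence) / math.log(1.0 - max_fail_fraction)
    return math.ceil(n)


def zero_failure_upper_bound(sample_size, confidence=0.60):
    """Upper confidence bound on the failure fraction after observing
    zero failures in sample_size devices (same binomial assumption)."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return 1.0 - (1.0 - confidence) ** (1.0 / sample_size)


if __name__ == "__main__":
    # Illustrative targets only; they are not values from the text.
    for ppm in (1000, 100, 10, 1):
        n = zero_failure_sample_size(ppm * 1e-6)
        print(f"{ppm:5d} ppm target -> {n} devices without a single failure")
```

Under these assumptions a 10 ppm target at 60% confidence already requires of the order of 10^5 failure-free devices, which is consistent with the sample sizes quoted above. If the 0/231 entry of fig. 14 is read as zero rejects in 231 samples (an interpretation, not a statement from the source), `zero_failure_upper_bound(231)` bounds the failure fraction at roughly 0.4%.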
1.4 REFERENCES

[1] E.A. Amerasekera, F.N. Najm, ‘Failure Mechanisms in Semiconductor Devices’, John Wiley & Sons, New York, (1997)
[2] D. Thompson, B. Wood, ‘Semiconductor defect reliability modeling’, Tutorial International Reliability Physics Symposium (IRPS), (1996)
[3] D.L. Crook, ‘Evolution of VLSI reliability engineering’, Proceedings IRPS, pp. 2-11, (1990)
[4] B. El-Kareh, W.R. Tonti, ‘Chip reliability’, Tutorial IRPS, (1997)
[5] P. Chatterjee, W.R. Hunter, A. Amerasekera, S. Aur, C. Duvvury, P. Nicollian, L. Ning, P. Yang, ‘Trends for deep submicron VLSI and their implications for reliability’, Proceedings IRPS, pp. 1-11, (1995)
[6] J.W. McPherson, ‘Reliability/processing challenges for ULSI metallization’, Tutorial IRPS, (1994)
[7] R.L. Hance, J.W. Miller, K. Erington, M.A. Chonko, ‘Mobile ion contamination in CMOS circuits’, Tutorial IRPS, (1995)
[8] H.S. Rathore, D. Nguyen, ‘Copper metallization for sub-micron technology’, Tutorial IRPS, (1997)
[9] R. Master, ‘Flip chip and ball grid array packaging’, Tutorial IRPS, (1998)
[10] K. Puttlitz, P. Totta, ‘Flip-chip interconnections’, Tutorial IRPS, (1994)
[11] ‘National Technology Roadmap for Semiconductors, technology needs’, ed. Semiconductor Industry Association (SIA), (1997)
[12] T.M. Moore, S.J. Kelsall, D.R. Edwards, ‘Improving plastic package reliability’, Tutorial IRPS, (1992)
[13] J.T. Cullen, T.M. Moore, S.V. Golwalker, ‘Package technology’, Tutorial IRPS, (1996)
[14] R. Shook, T. Conrad, ‘Moisture/reflow sensitivity of plastic packaged surface mount IC’s: theory, evaluation and avoidance’, Tutorial IRPS, (1995)
[15] D.A. Baglee, D.S. Gibson, ‘Wafer-Level Reliability implementation issues’, Tutorial IRPS, (1990)
[16] D.G. Pierce, E.S. Snyder, ‘Wafer Level Reliability: pushing the envelope’, Tutorial IRPS, (1997)
[17] R. Bellens, ‘Building-in reliability during library development: hot carrier degradation is no longer a problem of technologists only!’, Microelectronics & Reliability, pp. 1425-1428, (1997)
[18] J.A. van der Pol, J.J.M. Koomen, ‘Relation between the hot carrier lifetime of transistors and CMOS SRAM products’, Proceedings IRPS, pp. 178-185, (1990)
[19] E.A. Amerasekera, C. Duvvury, ‘ESD in silicon integrated circuits’, John Wiley & Sons, New York, (1995)
[20] R. Troutman, ‘Latchup in CMOS technology’, Kluwer, Boston, (1986)
[21] J.A. van der Pol, P.B.M. Wolbert, ‘Systematic derivation of latch-up design rules for submicron CMOS processes from test structures’, Microelectronics & Reliability, pp. 1051-1056, (1998)
[22] E.A. Amerasekera, R. Chapman, ‘Technology design for high current and ESD robustness in a deep submicron process’, IEEE Electronic Device Letters, pp. 383-385, (1994)
[23] E.A. Amerasekera, S.T. Selvam, R.A. Chapman, ‘Designing latchup robustness in a 0.35µm technology’, Proceedings IRPS, pp. 280-288, (1994)
[24] E.R. Ooms, J.A. van der Pol, ‘Occurrence and elimination of anomalous temperature dependence of latchup trigger currents in BICMOS processes’, Proceedings IRPS, pp. 138-143, (1999)
[25] C. Duvvury, C. Hu, G. Hills, ‘Integrated circuit damage due to electrical stress’, Tutorial IRPS, (1994)
[26] B. Krabbenborg, J.A. van der Pol, ‘The influence of process variations on the robustness of an audio power IC’, Microelectronics & Reliability, pp. 1819-1822, (1996)
[27] ‘General Quality Specification for Integrated Circuits’, SNW-FQ-611, Philips Semiconductors, (1998)
[28] K. van Doorselaer, K. de Zeeuw, ‘Relation between delamination and temperature cycling induced failures in plastic packaged devices’, IEEE Transactions on Components & Hybrids Manufacturing Technology, pp. 879-882, (1990)
[29] T.M. Moore, S.J. Kelsall, ‘The impact of delamination on stress-induced and contamination-related failure in Surface Mount IC’s’, Proceedings IRPS, pp. 169-176, (1992)
[30] K. van Doorselaer, T.M. Moore, J.A. van der Pol, ‘Failure criteria for inspection using acoustic microscopy after moisture sensitivity testing of plastic surface mount devices’, Proceedings International Symposium on Testing and Failure Analysis (ISTFA), pp. 229-239, (1994)
[31] J.R. Devaney, P.H. Eisenberg, ‘Gold-Aluminum intermetallics, key parameters - reactions - effects & reliability impact - a review’, Tutorial IRPS, (1990)
[32] F.W. Ragay, J.A. van der Pol, J. Naderman, ‘In-situ monitoring of dry corrosion degradation of Au ballbonds to Al bondpads in plastic packages during HTSL’, Microelectronics & Reliability, pp. 1931-1934, (1996)
[33] C. Hu, ‘AC effects in IC reliability’, Microelectronics & Reliability, pp. 1611-1617, (1996)
[34] M. Lunenborg, ‘MOSFET hot carrier degradation’, Thesis, University of Twente, (1995)
[35] R. Bellens, ‘Hot carrier degradation in sub-micron CMOS technologies: problems and solutions’, Tutorial IRPS, (1998)
[36] S. Rochel, G. Steele, J.R. Lloyd, S.Z. Hussain, D. Overhauser, ‘Full chip reliability analysis’, Proceedings IRPS, pp. 356-362, (1998)
[37] ‘Stress Test Qualification for automotive-grade integrated circuits’, CDF-AEC-Q100, Automotive Electronics Council, (1994)
[38] D.J. Wheeler, D.S. Chambers, ‘Understanding Statistical Process Control’, Statistical Process Control Inc., Knoxville, (1986)
[39] D.G. Pierce, E.S. Snyder, ‘Wafer level reliability: pushing the envelope’, Tutorial IRPS, (1997)
[40] J.S. May, H.H. Hoang, ‘Wafer level reliability control program at SGS-Thomson Microelectronics’, AEC Reliability Workshop, Indianapolis, October 21-24, (1995)
[41] J.A. van der Pol, E.R. Ooms, ‘Short loop monitoring of metal stepcoverage by simple electrical measurements’, Proceedings IRPS, pp. 148-155, (1996)
[42] F. Kuper, J.A. van der Pol, E.R. Ooms, T. Johnson, R. Wijburg, W. Koster, D. Johnston, ‘Relation between yield and reliability of integrated circuits: experimental results and application to continuous early failure rate reduction programs’, Proceedings IRPS, pp. 17-21, (1996)
[43] J.A. van der Pol, F.G. Kuper, E.R. Ooms, ‘Relation between yield and reliability of integrated circuits and application to failure rate assessment and reduction in the one digit FIT and PPM reliability era’, Microelectronics & Reliability, pp. 1603-1610, (1996)
[44] C.G. Shirley, ‘A defect model of reliability’, Tutorial IRPS, (1995)
[45] R. Moazzami, C. Hu, ‘SiO2 TDDB testing and burn-in’, Tutorial IRPS, (1992)
[46] A.J. Wagner, ‘Semiconductor defect reliability screening and modeling’, Tutorial IRPS, (1996)
[47] J.A. van der Pol, E.R. Ooms, A. van ‘t Hof, F. Kuper, ‘Impact of screening of latent defects at electrical test on the yield-reliability relation and application to burn-in elimination’, Proceedings IRPS, pp. 370-377, (1998)
[48] S. McEuen, T. Paquette, ‘IddQ testing and its application’, Tutorial IRPS, (1995)
[49] J. Møltoft, ‘Behind the ‘bathtub’-curve, a new model and its consequences’, Microelectronics & Reliability, pp. 489-500, (1983)
[50] Z.N. Liang, F.G. Kuper, M.S. Chen, ‘A concept to relate wire bonding parameters to bondability and ball bond reliability’, Microelectronics & Reliability, pp. 1287-1292, (1998)
[51] J.A. van der Pol, J-P.F. Huijser, R.B.H. Basten, ‘New latchup mechanism in complementary bipolar power ICs triggered by backside die attach glue’, Microelectronics & Reliability, pp. , (1999)
[52] A.A. Gallo, ‘Effect of mold compound components on moisture-induced degradation of gold-aluminum bonds in epoxy encapsulated devices’, Proceedings IRPS, pp. 244-251, (1990)
[53] T. Claasen, ‘The logarithmic law of usefulness’, Semiconductor International, pp. 175-184, (1998)
[54] D.S. Peck, ‘Semiconductor reliability predictions from life distribution data’, in ‘Semiconductor Reliability’, ed. Schwop and Sullivan, pp. 51-67, Reinhold, New York, (1961)
[55] D.S. Peck, ‘The reliability of semiconductor devices in the Bell system’, Proceedings of the IEEE, pp. 185-213, (1974)
[56] Ö. Hallberg, ‘Failure rate as a function of time due to log-normal life distribution(s) of weak parts’, Microelectronics & Reliability, pp. 155-158, (1977)
[57] W.C. Riordan, R. Miller, J.M. Sherman, J. Hicks, ‘Microprocessor reliability performance as a function of die location for a 0.25 µm five layer metal CMOS logic process’, Proceedings IRPS, pp. 1-11, (1999)
[58] M. Salagoïty, ‘Reliability of high density packages’, Tutorial European Symposium on Reliability of Electron devices and Failure analysis (ESREF), (1999)

2 Reliability Issues in High Voltage Bipolar-CMOS-DMOS Integrated Circuits [19,20]

2.1 Introduction
2.2 Threshold voltage instabilities of HV DMOS transistors
2.3 Parasitic leakage currents induced by ‘charge-creep’
2.3.1 Failure mechanism
2.3.2 Surface potential modelling by a lumped element RC-network
2.3.3 ‘Charge-creep’ characterisation using test structures
2.3.3.1 Test structures
2.3.3.2 Experimental results
2.3.4 Comparison of experimental data and model predictions
2.3.4.1 Steady-state surface potential
2.3.4.2 Delay time
2.3.5 Design rules
2.4 Conclusions
2.5 References

2.1 INTRODUCTION

The combination of high operating temperature (∼140°C) and high voltages (>600V) in current state-of-the-art high power / high voltage (HV) Bipolar-CMOS-DMOS (BCD) technologies, in applications such as lighting and power supplies, induces new degradation mechanisms that are not relevant in standard 5 V and 3.3 V CMOS technologies. Hardly any published data are available on these mechanisms.
The three most significant failure modes are breakdown voltage instabilities of the high voltage lateral double-diffused MOS (DMOS) transistor [1], threshold voltage (Vt) instabilities of this transistor [2,3] and parasitic leakage currents in low voltage parts of the circuit, induced by high surface potentials at the moulding plastic - passivation interface that originate from the high voltage part of the circuit (‘gate induced leakage’ [4]). This chapter discusses the latter two issues.

2.2 THRESHOLD VOLTAGE INSTABILITIES OF HIGH VOLTAGE DMOS TRANSISTORS

A cross section of the DMOS transistor is shown in fig. 1. The devices are fabricated in a 3 µm double poly, single metal technology. The poly-metal dielectric is a TEOS oxide stack containing a thin P2O5 layer for mobile ion gettering purposes. Extensive life testing has shown that this getter layer results in adequate reliability for a 12 V BiCMOS technology. In the 650 V BCD technology, however, it appears to be insufficient. Curve B in fig. 2 shows the threshold voltage (Vt) instability occurring during a High Temperature Reverse Bias (HTRB) lifetest at 150 °C where the gate bias equals 0 V. The same failure mode also occurs during a Static High Temperature Lifetest (SHTL) with a 12 V gate bias.

The failure mode is caused by the fact that commercially available plastic moulding compounds contain traces (≈ 4 ppm) of sodium ions (Na+) originating from the resin manufacturing process. Under high voltage operating conditions, a large lateral electric field (about 10 V/µm) exists along the surface of the transistor between its source and drain, which at high temperature forces the Na+ ions in the plastic towards the source. Here the vertical electric field component points towards the grounded source, enabling the Na+ to penetrate the device through pinholes, microcracks, fissures and pores in the Si3N4 plasma nitride passivation [16] and to reach the DMOS gate oxide via the path depicted in fig. 1.
Fig. 1: Overview of a high voltage lateral DMOS transistor and the sodium (Na+) penetration path. [Cross section with the labels Na+, Si3N4, Al, TEOS, P2O5, LOCOS, G, S and D and the p and n doped regions.]

Process and device layout improvements were evaluated using a wafer level highly accelerated lifetest [5,6]. Devices were deliberately contaminated with 0.5 weight % NaOH and subjected to high voltage and high temperature while the Vt was continuously monitored to determine the failure times, see fig. 3. The apparent variation of the sigma of the distribution with temperature is most likely just a statistical effect caused by the small sample sizes (about 9) used in this experiment. The resulting activation energy Ea equals 0.87±0.09 eV. This result is in reasonable agreement with the ≈ 0.7 eV value reported in literature for diffusion of Na+ in silicon oxide (SiO2) [2,7], although values ranging from 0.45 eV [8] to 1.1 eV [3] have also been reported.

Fig. 2: Vt-degradation of the standard DMOS transistor during HTRB lifetest (Vds=500V/Vgs=0V/T=150°C) with A) 0.45µm, B) 0.9µm and C) 1.8µm Si3N4 passivation. [Plot: normalised threshold voltage (0 to 1.25) versus time (10 to 10000 hours, logarithmic axis).]

Fig. 3: Vt-shift failure time distribution during a 250V wafer level HTRB stress (Vgs=0V) at 150, 200 and 250°C. [Plot: cumulative failure probability (0.1% to 99.9%, normal probability scale) versus time (0.1 to 10000 sec, logarithmic axis).]

It was found that significant improvements could be achieved by increasing the silicon nitride (Si3N4) passivation thickness (fig. 2) and by optimising the layout, see curves A and B in fig. 4.
Also the densification of the nitride along the vertical sidewall of the anisotropically etched metal lines by the ion bombardment during plasma enhanced chemical vapour deposition (PE-CVD) appeared to be important, as devices with wet etched metal were found to be superior to those with dry etched metal. Apparently, the PE-CVD deposited silicon nitride is relatively porous along the sidewalls of the metal lines. This is also demonstrated in fig. 5, which shows a cross-section of a metal line with passivation after HF-etch. The Si3N4 etch rate is clearly larger at the sidewall than at the top or bottom. Finally, a significant lifetime improvement could be achieved by implementing an enhanced phosphorus gettering layer in the TEOS oxide poly-metal dielectric or by including a PSG layer in the TEOS stack [18], see curves C and D in fig. 4. In some of the lifetest experiments a thin Si3N4 passivation layer was used in order to accelerate the Vt-shift failure mode and reduce the required stress time.

Fig. 4: Vt of the DMOS transistor with optimised layout versus time during HTRB lifetest (500V/150°C) for A) 0.45µm Si3N4 passivation, B) 0.9µm Si3N4 passivation, C) 0.45µm Si3N4 and improved P2O5-getter layer and D) 1.8µm Si3N4 and a PSG-getter layer. [Plot: normalised threshold voltage (0 to 1.25) versus time (10 to 10000 hours, logarithmic axis).]

Fig. 5: Cross-section of a passivated metal line after HF-etch; the fissure occurs at the border between densified and non-densified (more porous) Si3N4.

2.3 PARASITIC LEAKAGE CURRENTS INDUCED BY ‘CHARGE-CREEP’

2.3.1 Failure mechanism

Plastic moulding compounds have a measurable conductivity due to the presence of water and ionic impurities such as Na+, K+, Cl-, NH4+, HxPO43-, and NO32- in the compound. Furthermore, Br and Sb ions are added to the plastics, acting as flame retardants.
The conductivity is strongly temperature dependent and increases over 4 decades between 20 °C and 150 °C (Ea = 0.65 eV), as shown in fig. 6 [9]. Consequently, at high temperatures, the high voltage (HV) surface potential (>600 V) present at the bondpads of the HV circuitry can spread over the low voltage (<20 V) part of the circuit, see fig. 7. It may then induce parasitic channels and leakage currents in low voltage devices such as bipolar transistors and active as well as parasitic MOS transistors, or may affect diffused resistance values [17].

Fig. 6: Resistivity (Ωcm) of various commercially available plastic moulding compounds after full moisture saturation as a function of temperature, plotted versus 1000/T (K^-1) for temperatures between 50 °C and 200 °C. Compounds shown: EME1100HS, EME6210S, EME6210SR, EME1100HJ and Nitto HC10-2 / HC-10. Indicated slopes: Ea ≈ 0.65 eV and Ea ≈ 2.5 eV.

Fig. 7: Schematic view of parasitic leakage currents induced by high surface potentials originating from a high voltage bondpad ('charge-creep').

The above phenomenon is called 'charge-creep' or 'gate induced leakage' [4] and may result in malfunctioning of circuits within a few hours during Dynamic High Temperature operating Lifetests (DHTL), as shown in fig. 8. The surface potential is 'frozen', and thus a permanent failure is created, when the circuit is subsequently cooled down to room temperature: the mobility of the ionic impurities is strongly reduced at lower temperatures, so a net positive charge remains 'trapped' in the moulding compound.

Fig. 8: Cumulative failure distribution (probability scale) versus time (hours) of a BCD-product during a 390V DHTL stress at 125°C and 150°C for various packages A) cresol-novolac low stress compound, B) as A) but with epoxy-plastic interface modification and C) bi-phenylic anti-popcorn compound. Curves shown: A/150°C, A/125°C, B/150°C and C/150°C. At each readpoint the HV bias was kept on the devices while lowering the stress temperature to room temperature in order to prevent potential annealing effects.

The lifetest data of the BCD-product in fig. 8 show that the failure mechanism is strongly dependent on temperature as well as on the specific type of moulding compound and package construction. Other experiments reported in [10] show that the moisture content of the package is also very important. The water concentration in the package determines the mobility of the ionic impurities and thus also affects the 'charge-creep' failure mechanism. It appears that the 'charge-creep' effects can be virtually eliminated by first subjecting products to a 500 hrs bake at 150 °C (which reduces the moisture content of the package to 0 weight %) before the high voltage stress [10]. Similar results are obtained after a 24 hrs bake at 175 °C. If the same samples are subsequently fully moisturised (to ≈ 0.3 weight %) by storage for 168 hrs at 85 °C / 85 % RH, they again become sensitive to the 'charge-creep' mechanism. It must be noted, though, that in that case the leakage currents induced by 'charge-creep' are significantly smaller than those of virgin, fully moisturised samples, and it also takes longer before the leakage current increase starts. This is probably caused by the ongoing curing of the moulding compound, as the 150 °C and 175 °C bake temperatures are close to or even above the 165 °C glass transition temperature Tg of the plastic. The curing affects plastic material properties such as maximum moisture uptake, ionic mobility and conductivity. It must be noted that in the experiment shown in fig. 8 the moisture content of the samples was not controlled, so the data must be treated cautiously. For proper experimental results, and e.g. for activation energy determination, the moisture content of the package must firstly be in equilibrium and secondly be controlled for all samples and at all readpoints.

2.3.2 Surface potential modelling by a lumped element RC-network

As discussed in the previous section, the parasitic leakage currents are induced by high surface potentials originating from the high voltage (HV) bondpads. The 'charge-creep' effect can therefore be modelled by describing the evolution of the surface potentials at the die-plastic interface as a function of place and time. As will be shown in section 2.3.3.1, the surface potential has a one-to-one relation to the leakage currents. The place and time dependence of the surface potential can be modelled by a lumped element RC-network, see fig. 9, with R being the resistance of the moulding compound from the HV-bondpad to the low voltage circuitry and C the capacitance between the active silicon and the interface between the nitride passivation and the moulding compound. Note that after long stress times an equilibrium will be reached that is governed by the boundary conditions defined by the potentials of the bondpads and the diepad of the circuit.

Fig. 9: Lumped-element RC-network used for modelling of the evolution of the surface potential as a function of place and time.

A real circuit normally has a rectangular geometry and asymmetrically placed HV-bondpads. In order to be able to realistically model the surface potential evolution as a function of place and time by analytical formulas, we simplify this geometry to the cylindrical and symmetrical one shown in fig. 10.
This allows us in the following sections to compare model predictions and experimental data without having to rely on complex 3D-device simulations. It thus provides more insight into the 'charge-creep' mechanism.

Fig. 10: Geometry used for modelling of the surface potential as a function of the distance to the HV-bondpad.

For the geometry shown in fig. 10, the resistance R between the HV-bondpad and a node at distance d from the edge of the HV-bondpad is given by equation (1), and the capacitance C of the corresponding area by equations (2) and (3).

\[ R(d) = \int_{D_{bondpad}}^{d + D_{bondpad}} \frac{\rho_{plastic}}{2\pi \cdot t_{plastic} \cdot r}\,\mathrm{d}r = \frac{\rho_{plastic}}{2\pi \cdot t_{plastic}} \cdot \ln\left(\frac{d + D_{bondpad}}{D_{bondpad}}\right) \quad [\Omega] \qquad (1) \]

\[ C(d) = \int_{D_{bondpad}}^{d + D_{bondpad}} C_0 \cdot 2\pi \cdot r\,\mathrm{d}r = \pi \cdot C_0 \cdot \left[(d + D_{bondpad})^2 - D_{bondpad}^2\right] \quad [\mathrm{F}] \qquad (2) \]

where:

\[ C_0 = \frac{\varepsilon_0 \cdot \varepsilon_{oxide} \cdot \varepsilon_{nitride}}{\varepsilon_{nitride} \cdot (t_{LOCOS} + t_{TEOS}) + \varepsilon_{oxide} \cdot t_{nitride}} \quad [\mathrm{F\,m^{-2}}] \qquad (3) \]

Here ρplastic is the resistivity of the moulding plastic as shown in fig. 6, tplastic the thickness of the moulding plastic on top of the silicon-nitride passivation, Dbondpad the radius of the HV-bondpad, and tLOCOS, tTEOS and tnitride the thicknesses of the LOCOS oxide, TEOS oxide and the Si3N4 passivation. εoxide and εnitride equal 3.9 and 7.5 respectively. Fig. 6 shows that for the epoxy-novolac plastic ρplastic equals about 1.4⋅10^14 Ωcm and 6⋅10^13 Ωcm at 130°C and 150°C respectively. In the experiments that will be described in section 2.3.3, tLOCOS, tTEOS and tnitride equal 0.95µm, 1.2µm and 1.8µm respectively, and tplastic, Dbondpad and C0 equal about 1.2 mm, 40 µm and 1.12⋅10^-9 F cm^-2 respectively.
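As a numerical illustration of equations (1) to (3), the following minimal Python sketch evaluates C0, R(d) and C(d) in SI units for the layer thicknesses and resistivity quoted above. The function and variable names are chosen here for illustration only and do not appear in the original work; the 550 µm node distance is simply one of the test structure distances used later.

```python
import math

EPS0 = 8.854e-12  # vacuum permittivity [F/m]


def c0_per_area(t_locos, t_teos, t_nitride, eps_oxide=3.9, eps_nitride=7.5):
    """Equation (3): capacitance per unit area [F/m^2] of the oxide/nitride stack."""
    return (EPS0 * eps_oxide * eps_nitride) / (
        eps_nitride * (t_locos + t_teos) + eps_oxide * t_nitride
    )


def resistance(d, d_bondpad, rho_plastic, t_plastic):
    """Equation (1): resistance [ohm] from the bondpad edge to a node at distance d."""
    return rho_plastic / (2 * math.pi * t_plastic) * math.log((d + d_bondpad) / d_bondpad)


def capacitance(d, d_bondpad, c0):
    """Equation (2): capacitance [F] of the annular area between bondpad edge and node."""
    return math.pi * c0 * ((d + d_bondpad) ** 2 - d_bondpad ** 2)


# Values from the text, converted to SI units (m, ohm*m).
c0 = c0_per_area(t_locos=0.95e-6, t_teos=1.2e-6, t_nitride=1.8e-6)
rho_130 = 1.4e14 * 1e-2          # 1.4e14 ohm*cm at 130 C -> ohm*m
r_550 = resistance(d=550e-6, d_bondpad=40e-6, rho_plastic=rho_130, t_plastic=1.2e-3)
c_550 = capacitance(d=550e-6, d_bondpad=40e-6, c0=c0)

print(f"C0       = {c0 * 1e-4:.3e} F/cm^2")  # close to the 1.12e-9 F/cm^2 quoted in the text
print(f"R(550um) = {r_550:.3e} ohm")
print(f"C(550um) = {c_550:.3e} F")
```

The computed C0 reproduces the value of about 1.12⋅10^-9 F cm^-2 given in the text, which is a useful check on the reconstruction of equation (3).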
For the lumped-element RC-network, the surface potential Vsurface as a function of place and time after switching on the high voltage at t = 0 seconds is given by equation (4):

\[ V_{surface}(d, t) = V_{surface}(d) \cdot \left(1 - e^{-\frac{t}{\tau_{delay}(d)}}\right) \quad [\mathrm{V}] \qquad (4) \]

Equation (5) gives the steady state surface potential Vsurface(d) at a node at a distance d from the edge of the HV-bondpad. Note that Vsurface = 0 V at the earthlane, i.e. where d + Dbondpad = Dearthlane.

\[ V_{surface}(d) = V_{bondpad} \cdot \frac{R(d \rightarrow D_{earthlane})}{R(D_{bondpad} \rightarrow d) + R(d \rightarrow D_{earthlane})} = V_{bondpad} \cdot \frac{\ln\left(\dfrac{D_{earthlane}}{d + D_{bondpad}}\right)}{\ln\left(\dfrac{D_{earthlane}}{D_{bondpad}}\right)} \quad [\mathrm{V}] \qquad (5) \]

where Vbondpad is the high voltage applied to the bondpad, Dbondpad the radius of the bondpad, Dearthlane the radius from the centre of the bondpad to the grounded earthlane (also called sawlane) and d the distance between the edge of the bondpad and the location of interest. In the experiments described in section 2.3.3, Dbondpad ≈ 40 µm, Dearthlane ≈ 1100 µm and Vbondpad = 500 V.

The delay time τdelay in equation (4) between the application of the high voltage at the HV-bondpad and the response of the voltage at one of the nodes is given by equation (6). τdelay is equal to the RC-time constant between the edge of the HV-bondpad and the node located at a distance d. Note that at t = τdelay the surface potential has reached 63% of its steady-state value.

\[ \tau_{delay}(d) = \iint_{D_{bondpad}}^{d + D_{bondpad}} R(r) \cdot C(r)\,\mathrm{d}r\,\mathrm{d}r = \iint_{D_{bondpad}}^{d + D_{bondpad}} \frac{\rho_{plastic} \cdot C_0 \cdot 2\pi \cdot r}{2\pi \cdot r \cdot t_{plastic}}\,\mathrm{d}r\,\mathrm{d}r = \frac{\rho_{plastic} \cdot C_0}{2 \cdot t_{plastic}} \cdot \left[(d + D_{bondpad})^2 - D_{bondpad}^2\right] \quad [\mathrm{s}] \qquad (6) \]

Equation (6) implies that for large distances d to the HV-bondpad, τdelay increases approximately quadratically with d. For smaller distances the exponent will be more like 1.8 than 2.0.
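Equations (4) to (6) can be evaluated directly. The sketch below is a minimal illustration, not the author's own tooling: the function names are hypothetical, all quantities are in SI units, and the `local_exponent` helper is an addition derived here from equation (6) (since τdelay ∝ d² + 2·d·Dbondpad, the log-log slope equals 2(d + Dbondpad)/(d + 2·Dbondpad)), which makes the remark about an exponent below 2 at small distances concrete.

```python
import math


def v_surface_steady(d, v_bondpad, d_bondpad, d_earthlane):
    """Equation (5): steady-state surface potential [V] at distance d from the bondpad edge."""
    return v_bondpad * math.log(d_earthlane / (d + d_bondpad)) / math.log(d_earthlane / d_bondpad)


def tau_delay(d, d_bondpad, rho_plastic, c0, t_plastic):
    """Equation (6): RC delay time [s] between the bondpad edge and the node at distance d."""
    return rho_plastic * c0 / (2 * t_plastic) * ((d + d_bondpad) ** 2 - d_bondpad ** 2)


def v_surface(d, t, v_bondpad, d_bondpad, d_earthlane, rho_plastic, c0, t_plastic):
    """Equation (4): surface potential [V] at distance d, a time t after switching on the HV."""
    tau = tau_delay(d, d_bondpad, rho_plastic, c0, t_plastic)
    return v_surface_steady(d, v_bondpad, d_bondpad, d_earthlane) * (1 - math.exp(-t / tau))


def local_exponent(d, d_bondpad):
    """Log-log slope d(ln tau)/d(ln d) implied by equation (6)."""
    return 2 * (d + d_bondpad) / (d + 2 * d_bondpad)


# Parameters quoted in the text for the 130 C case, in SI units.
D_BONDPAD, D_EARTHLANE, V_BONDPAD = 40e-6, 1100e-6, 500.0
RHO_130 = 1.4e12      # 1.4e14 ohm*cm -> ohm*m
C0 = 1.12e-5          # 1.12e-9 F/cm^2 -> F/m^2
T_PLASTIC = 1.2e-3    # 1.2 mm

for d_um in (100, 350, 550, 850):
    d = d_um * 1e-6
    print(
        f"d = {d_um:4d} um: "
        f"V_steady = {v_surface_steady(d, V_BONDPAD, D_BONDPAD, D_EARTHLANE):6.1f} V, "
        f"tau = {tau_delay(d, D_BONDPAD, RHO_130, C0, T_PLASTIC):7.0f} s, "
        f"exponent = {local_exponent(d, D_BONDPAD):.2f}"
    )
```

With these inputs the delay time at 550 µm comes out at roughly 2300 s, and the local exponent at 350 µm is about 1.8, in line with the statement following equation (6).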
2.3.3 'Charge-creep' characterisation using test structures

2.3.3.1 Test structures

A dedicated test chip has been designed and processed to quantify the 'charge-creep' effect as a function of time, temperature and distance of sensitive circuitry to the HV components. A parasitic field-oxide NMOS transistor is used as the low voltage 'sense' device, see fig. 11. The transistor width and length are 10µm and 5µm respectively and its threshold voltage is typically about 40V. The effects of moisture content have been reported in [10], see also section 2.3.1. The test chip has a size of 2.2 x 2.9 mm2 and is packaged in a cresol-novolac low stress compound. In all the experiments discussed below the packages were fully saturated with moisture (≈ 0.3 weight %) and in equilibrium before starting the stresses unless otherwise mentioned (see section 2.3.3.2).

Fig. 11: Schematic overview of the 'charge-creep' test structures (bondpad of diameter 2⋅Dbondpad and parasitic NMOS sense transistor, W/L = 10/5 µm, at distance d); a) top view and b) cross-section.

The test chip also contained a field oxide NMOS transistor of the same geometry as the parasitic NMOS transistor in fig. 11, but in this case with a metal gate on top. The leakage current of this transistor as a function of the metal gate voltage is shown in fig. 12. If the metal gate is not present, a similar leakage current can be induced by raising the surface potential at the silicon nitride to plastic interface. Obviously, in this case a higher voltage is needed because, apart from the LOCOS and TEOS oxide, the silicon nitride passivation is now also part of the gate dielectric, see fig. 10. The relation between the surface potential and the equivalent metal gate voltage is given by equation (7).
\[ V_{surface} = \left(1 + \frac{\varepsilon_{oxide}}{\varepsilon_{nitride}} \cdot \frac{t_{nitride}}{t_{LOCOS} + t_{TEOS}}\right) \cdot V_{gate} \quad [\mathrm{V}] \qquad (7) \]

where Vsurface is the surface potential at the plastic-nitride interface and Vgate the gate voltage corresponding to the measured leakage current as shown in fig. 12. εoxide and εnitride equal 3.9 and 7.5 respectively. tLOCOS, tTEOS and tnitride are the thicknesses of the LOCOS oxide, TEOS oxide and silicon nitride passivation and are 0.95 µm, 1.2 µm and 1.8 µm respectively in our experiments. Consequently, Vsurface equals 1.44 x Vgate. Equation (7) thus allows us to convert the measured leakage current of the parasitic NMOS transistor in our experiments in a simple way to a surface potential.

Fig. 12: Leakage current of a W/L=10/5 µm parasitic field oxide NMOS transistor as a function of the voltage on the metal gate (0 to 500 V) for various temperatures (90 °C, 110 °C, 130 °C and 150 °C). The gate dielectric consists of a 0.95 µm LOCOS oxide and a 1.2 µm TEOS oxide.

2.3.3.2 Experimental results

Conventional high temperature lifetests

Fig. 13 shows results from a conventional HTRB lifetest at 150°C on the test structures. Here the samples were removed from the stress set-up at each readpoint and the leakage current was measured at room temperature. Note that the storage time at room temperature at the various readpoints was less than 24 hrs, thus limiting any moisture uptake by the packages.
The data show that the induced leakage currents decrease with the distance from the HV bondpad, as expected, and that, due to the small size of the test die (2.2x2.9 mm2), equilibrium is reached within ¼ hour. The HV surface potential then extends to more than 1 mm from the HV bondpad. Remarkably, for longer times the leakage current, and thus also the surface potential, decreases. This is probably due to the ongoing curing of the moulding compound, see section 2.3.1. Annealing behaviour is shown in fig. 14. The annealing takes longer than the leakage increase, firstly because the ion mobility and thus the conductivity is much lower in dry samples and secondly because of the curing effect. Note that during the leakage increase the samples still contain moisture. Similarly, repeated high voltage stress and annealing experiments show that the leakage increase time constant increases with every cycle.

Fig. 13: Leakage current of a parasitic NMOS transistor versus distance to a high voltage bondpad for various stress times (¼ h, 1 h and 24 h) during a conventional 500 V HTRB lifetest at 150 °C.

Fig. 14: Recovery of the leakage current of a parasitic NMOS transistor versus distance to a HV bondpad during a 500 V stress at 150 °C and annealing at 150 °C. The legend shows both the stress and anneal times (0h/0h, 24h/0h, 24h/24h and 24h/65h).

In-situ high temperature lifetests

During the conventional lifetest experiments equilibrium was already reached at the first readpoint, about 15 minutes after the start of the stress. Therefore in-situ stress experiments were also carried out, in which the leakage current was continuously monitored at the stress temperature during a 500V high voltage stress. Fig. 15, 16 and 17 show the results of these in-situ 'charge-creep' stress experiments. Fig. 15 shows the leakage current as a function of time for various distances of the parasitic 'sense' transistor to the HV-bondpad and for temperatures ranging from 110 °C to 150 °C. The data confirm that the 'charge-creep' mechanism is a fast effect, as already indicated by the data in fig. 13 from the conventional lifetest experiments. At a distance of 100 µm from the HV-bondpad, equilibrium or steady state is reached after about 1000 seconds at 150 °C. At the same time, the induced leakage current effects extend to beyond 0.9 mm from the HV-bondpad. Furthermore, we observe a delay time τdelay between the start of the HV-stress and the onset of the leakage current increase. This is typical for a higher order system and thus consistent with the lumped element RC-model that we have used to describe the surface potential evolution versus time and place.

Fig. 15: Parasitic NMOS leakage current as a function of time during a 500 V SHTL stress for various distances (100 µm, 350 µm, 550 µm and 850 µm) between the parasitic NMOS 'sense' transistor and the HV bondpad and for three temperatures (a) 150 °C, (b) 130 °C and (c) 110 °C [J. Bruggers].

Fig. 16 shows for various temperatures how the delay time τdelay increases with the distance d to the HV-bondpad. It appears that τdelay is approximately proportional to d^2.
Data from test structures packaged both in epoxy-novolac and in bi-phenylic moulding compounds and stressed at temperatures between 90°C and 150°C show that the exponent varies between 1.72 and 2.37 with an average and spread of 2.06 ± 0.19. This is consistent with the exponent value of 2 predicted by the lumped-element RC-model in section 2.3.2.

Fig. 16: Delay time between the start of the 500 V SHTL stress and the onset of the leakage current increase of the parasitic NMOS, defined as a leakage larger than 10 nA, as a function of the distance to the HV-bondpad for three different temperatures (150 °C, 130 °C and 110 °C). Power-law fits shown in the plot: y = 0.026⋅x^2.03, y = 0.0045⋅x^2.02 and y = 0.0051⋅x^1.72.

Finally, fig. 17 shows that the leakage current increase is strongly temperature dependent. It also shows an increase of the delay time τdelay with decreasing temperature, which is likewise consistent with the lumped element RC-model.

Fig. 17: Leakage current of a parasitic NMOS transistor located at 550 µm distance from the HV-bondpad as a function of time during a 500 V SHTL stress at various temperatures (110 °C to 150 °C in steps of 10 °C).

As described in section 2.3.3.1, all leakage current graphs in fig. 15 and 17 can be converted to surface potential graphs. Using equation (4), this allows us to determine the delay time τdelay for a given distance to the HV-bondpad as a function of temperature. Thus the activation energy Ea of the 'charge-creep' failure mechanism can be measured. For the cresol-novolac low stress compound and the bi-phenylic anti-popcorn compound, activation energies of Ea = 0.9 ± 0.2 eV and Ea = 1.1 ± 0.1 eV respectively are found [10].
The fact that we find different values for the two compounds supports our model that 'charge-creep' is governed by conduction in the moulding compound.

2.3.4 Comparison of experimental data and model predictions

2.3.4.1 Steady-state surface potential

In order to compare the experimental data with the model predictions, the leakage current graphs must be converted to surface potential graphs using fig. 12 and equation (7). An example is shown in fig. 18. It depicts the surface potential as a function of time for various distances to the HV-bondpad during a 500V stress at 130°C, as derived from the corresponding leakage currents in fig. 15b. It clearly shows that the steady state surface potential decreases with increasing distance to the HV-bondpad. Similarly, the delay time between the start of the stress and the rise of the surface potential increases with distance, as is qualitatively predicted by the lumped element RC-model.

Fig. 18: Surface potential induced by the 'charge-creep' effect as a function of time for various distances to the HV bondpad (100 µm, 350 µm, 550 µm and 850 µm) during a 500 V SHTL stress at 130 °C, as derived from the leakage currents of parasitic NMOS transistors, see text. The corresponding leakage data can be found in fig. 15b.

Fig. 19 shows the steady state surface potential as a function of the distance d to the HV-bondpad, as obtained from various experiments on test structures packaged in both epoxy-novolac and bi-phenylic moulding compounds and stressed at 500V at temperatures ranging from 90°C to 150°C. Note that the surface voltages have been derived from the measured leakage currents using the calibration curves in fig. 12 and equation (7). In order to obtain a good fit between the data and the model in equation (5), we need to modify equation (5). This is because in fig. 19 the 500V value of the HV-bondpad is reached at a distance doffset of about 80 µm from the HV-bondpad instead of at d = 0 µm as predicted by the model. We can incorporate this in our model by replacing d by d-doffset in equation (5). Two effects cause this discrepancy. The first is that our model assumes a cylindrical symmetry, while in reality we are dealing with a square 80 µm x 80 µm bondpad and a rectangular die. The second, more important, effect is that the 1.8 µm thick silicon-nitride passivation layer becomes conductive at high lateral field strengths and high temperatures due to Frenkel-Poole conduction [11,12]. In fact, if a voltage of 500 V is applied to the bondpad at 150 °C, the silicon-nitride can be significantly more conductive than the moulding plastic up to a significant distance from the edge of the HV-bondpad, due to the high lateral electric field in the silicon-nitride layer, see fig. 19. Consequently, the 500 V value in the measurements is reached at a distance doffset > 0 µm. Moreover, the effective gate dielectric is thinner than assumed because the conductive silicon-nitride is virtually no longer part of the gate dielectric. Thus, the use of equation (7) for the calculation of surface potentials from the measured leakage currents, see section 2.3.3.1, will result in too high values for the surface potential. So for small distances to the HV-bondpad, Vsurface is actually more or less equal to Vgate. This explains the fact that surface potential values above the 500 V stress voltage also occur in fig. 19. The above effect cannot be modelled analytically and requires the use of 3D-device simulations to take it properly into account. This is beyond the scope of this work.
It should furthermore be noted that for large leakage currents, and thus for high surface potentials, the inaccuracy in the derived surface potential values increases strongly, see fig. 12.

Fig. 19: Steady-state surface potential induced by the 'charge-creep' effect as a function of the distance to the edge of the HV bondpad during 500V SHTL stresses at various temperatures ranging from 90°C to 150°C and for various plastic moulding compounds. The dashed line is the model fit after replacing d by d-doffset in equation (5), see text.

The significance of the silicon-nitride conductivity is illustrated in the following. The silicon-nitride conductivity is exponentially dependent on lateral field strength as well as on temperature [13-15] and furthermore increases with increasing Si-content of the layer [13,14], see also fig. 20. The Si:N stoichiometric ratio also determines the refractive index of the layer [13,14]. The silicon-nitride layer in our experiments has a refractive index of 2.0, and at 20 °C its resistivity ρSiN equals about 7⋅10^14 Ωcm at 20 V/µm lateral field strength, see fig. 20. This results in a resistivity of about 4.2⋅10^10 Ωcm at 20 V/µm at 150 °C, using an activation energy for the conductivity of about 0.8 eV [14]. At this operating condition, ρplastic/tplastic equals ≈ 5⋅10^14 Ω whereas ρSiN/tSiN equals ≈ 2.3⋅10^14 Ω. The actual electric field strength in the silicon-nitride is determined by the surface potential at the SiN-plastic interface and is larger than 10 V/µm up to 200 µm from the edge of the HV-bondpad, as shown in fig. 19. Thus, the silicon-nitride will indeed be significantly more conductive than the moulding plastic near the HV-bondpad and consequently the 500 V value in the measurements is reached at a distance doffset > 0 µm.
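The resistivity figures quoted above can be cross-checked with a simple Arrhenius scaling. The sketch below is an illustrative calculation only: it assumes that the resistivity scales as exp(Ea/kT) between 20 °C and 150 °C, takes the plastic resistivity at 150 °C from fig. 6 as quoted in section 2.3.2 (6⋅10^13 Ωcm), and uses hypothetical helper names.

```python
import math

K_B = 8.617e-5  # Boltzmann constant [eV/K]


def arrhenius_resistivity(rho_ref, t_ref_c, t_c, ea_ev):
    """Scale a resistivity from t_ref_c to t_c [deg C], assuming rho ~ exp(Ea / kT)."""
    t_ref = t_ref_c + 273.15
    t = t_c + 273.15
    return rho_ref * math.exp(ea_ev / K_B * (1.0 / t - 1.0 / t_ref))


# Silicon nitride: 7e14 ohm*cm at 20 C and 20 V/um, Ea of about 0.8 eV (values from the text).
rho_sin_150 = arrhenius_resistivity(rho_ref=7e14, t_ref_c=20.0, t_c=150.0, ea_ev=0.8)

# Resistivity divided by layer thickness, both in cm, gives the quantity compared in the text.
T_SIN_CM = 1.8e-4        # 1.8 um nitride
T_PLASTIC_CM = 0.12      # 1.2 mm moulding compound
RHO_PLASTIC_150 = 6e13   # ohm*cm at 150 C, as quoted from fig. 6

sin_ratio = rho_sin_150 / T_SIN_CM
plastic_ratio = RHO_PLASTIC_150 / T_PLASTIC_CM

print(f"rho_SiN(150 C)   = {rho_sin_150:.2e} ohm*cm")
print(f"rho_SiN / t_SiN  = {sin_ratio:.2e} ohm")
print(f"rho_pl  / t_pl   = {plastic_ratio:.2e} ohm")
```

Under these assumptions the nitride resistivity at 150 °C comes out close to the 4.2⋅10^10 Ωcm stated in the text, and the two ratios reproduce the quoted ≈ 2.3⋅10^14 Ω and ≈ 5⋅10^14 Ω.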
Fig. 20: Resistivity of a 200 nm thick PE-CVD deposited silicon nitride layer as a function of the electric field (0 to 7 MV/cm) at 20°C for various Si:N stoichiometric ratios a) standard (refractive index n=2.0) and b) silicon rich (n=2.45) [S. Evseev and G. Timan].

If we now replace 'd' in equation (5) by 'd - doffset', we obtain excellent agreement between the data and the model fit, depicted by the dashed line in fig. 19. The fit constants are Dbondpad = 24 ± 7 µm, dasym = 107 ± 6 µm and Dearthlane = 1024 ± 91 µm. The value of Dearthlane obtained from the model fit is in very good agreement with the actual distance of ≈ 1100 µm on the test structure. The value of Dbondpad corresponds reasonably well with the actual value of 40 µm.

2.3.4.2 Delay time

In order to compare the calculated and measured delay times quantitatively, we again replace 'd' in equation (6) by 'd - doffset' and use the previously derived fit constants for Dbondpad, dasym and Dearthlane. The result is shown in fig. 21 for a stress at 130 °C. We find a good qualitative agreement between the calculated delay times and the ones determined from the leakage current graphs, taking into account the previously discussed model limitations. The delay times determined from the 63 % level of the surface potential graph are, however, significantly larger than the calculated ones. This is most likely caused by the fact that during the stress at 130 °C moisture already evaporates from the plastic. As this reduces the ion mobility, the actual resistivity of the plastic moulding compound will be larger than the values shown in fig. 6 that were used in the calculations. Note that the data in fig. 6 are valid for moisture saturated compounds.
Fig. 21: Comparison between measured and calculated delay times as a function of the distance to the HV-bondpad during a 130 °C stress using two different delay time criteria: a) the time at which the leakage current exceeds 10 nA (see fig. 15b) and b) the time at which the surface potential equals 63 % of its steady-state value (see fig. 18).

2.3.5 Design rules

The model derived in section 2.3.4 can be used to derive design rules for the safe distance of an active circuit element from the HV-bondpad in order to prevent the occurrence of parasitic leakage currents for a given operating/use condition. Assuming that the circuit design is robust against leakage currents smaller than 1 µA, we find from fig. 12 that the maximum allowable surface potential equals about 50 V. Using equation (5) one can then calculate the corresponding safe distance dsafe to the HV-bondpad. All devices located closer to the HV-bondpad than dsafe should be protected by applying proper shielding measures such as field plates. Field plates are metal or polysilicon plates that are placed on top of sensitive devices, thus shielding these from the high surface potentials at the silicon-nitride to plastic interface. Note that sensitive devices located further away from the HV-bondpad can be left unshielded. For a practical case where Dearthlane = 3 mm, Dbondpad = 40 µm and Vbondpad = 400 V we find that dsafe ≈ 1.7 mm. It can be concluded that a significant fraction of the die area is sensitive to the 'charge-creep' effect. The problem can be alleviated somewhat by placing metal lines connected to ground potential around the HV-bondpads, while simultaneously establishing contact between these metal lines and the moulding plastic by locally removing the silicon-nitride passivation on top of these lines.
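The safe distance follows from inverting equation (5) for d at a given maximum allowable surface potential Vmax: d + Dbondpad = Dearthlane ⋅ (Dbondpad/Dearthlane)^(Vmax/Vbondpad). The following minimal sketch applies this to the practical case quoted above; the function name is illustrative, and the unmodified equation (5) (without the doffset correction) is assumed.

```python
def safe_distance(v_max, v_bondpad, d_bondpad, d_earthlane):
    """Distance from the bondpad edge at which equation (5) drops to v_max.

    Obtained by solving V_bondpad * ln(D_e / (d + D_b)) / ln(D_e / D_b) = v_max for d.
    All lengths must be given in the same unit; the result is in that unit.
    """
    ratio = v_max / v_bondpad
    return d_earthlane * (d_bondpad / d_earthlane) ** ratio - d_bondpad


# Practical case from the text: D_earthlane = 3 mm, D_bondpad = 40 um, V_bondpad = 400 V,
# with a maximum allowable surface potential of about 50 V.
d_safe_um = safe_distance(v_max=50.0, v_bondpad=400.0, d_bondpad=40.0, d_earthlane=3000.0)
print(f"d_safe = {d_safe_um / 1000:.2f} mm")  # about 1.7 mm, as stated in the text
```

Devices closer to the HV-bondpad than this distance would need shielding under the stated assumptions; a lower leakage tolerance (lower Vmax) pushes dsafe further towards the earthlane.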
2.4 CONCLUSIONS Dominant failure modes in high power/high voltage (650 V) BCDtechnologies are threshold voltage instabilities of the lateral DMOS transistor due to sodium ingression and parasitic leakage currents in low voltage devices induced by high surface potentials originating from the high voltage devices ('chargecreep'). The threshold voltage instabilities can be prevented by improving the sodium getter capabilities of the dielectric layers in the backend process and by increasing the silicon nitride passivation thickness. The occurrence of parasitic leakage currents appears to be strongly dependent on temperature, moisture content of the plastic package, circuit layout and applied operating voltage. The 'charge-creep' effect can be modelled by describing the evolution of the surface potential as a function of place and time by means of a lumped element RCmodel. A good qualitative and a reasonable quantitative agreement between experimental data and model predictions is found. Using the model also design rules that can be used to eliminate the 'charge-creep' effects in actual circuits have been derived. 2.5 REFERENCES [1] T. Fujihara, Y. Yano, S. Obinata, N. Kumagai, K. Sakurai, “Proposal for new interconnection technique for very high-voltage IC’s”, Journal of Appl.ied Physics, Vol. 35, pp. 5655-5663, (1996) [2] R.L. Hance, J.W. Miller, K. Erington, M.A. Chonko, “Mobile ion contamination in CMOS circuits”, International Reliability Physics Symposium (IRPS) Tutorial, Topic 4, (1995) [3] E.H. Nicollian, J.R. Brews, ‘MOS (Metal Oxide Semiconductor) physics and technology’, Wiley, New York, (1982) [4] R.D. Mosbarger, D.J. Hickey, “The effects of materials and post-mold profiles on plastic encapsulated integrated circuits”, Proceedings IRPS 1994, pp. 93-100, (1994) [5] P.L. Hefley, J.W. McPherson, “The impact of an external sodium diffusion source on the reliability of MOS circuitry”, Proceedings IRPS, pp. 167-172, (1988) [6] C. Hong, B. Henson, T. Scelsi, R. 
Hance, “An accelerated sodium resistance test for IC passivation films”, Proceedings IRPS, pp. 318-325, (1995) [7] J.P Stagg, Applied Physics Letters, no. 10, pp. 532, (1977) 46 Reliability issues in High Voltage Bipolar-CMOS-DMOS integrated circuits [8] G. Greeuw, Thesis, University Groningen, (1984) [9] R. McClelland, “Generic leakage plastic - recoverable, code B7a”, Philips Semiconductors Failure Analysis Handbook, (1995) [10] H.J. Bruggers, R.T.H. Rongen, C.P. Meeuwsen, A.W. Ludikhuize, ‘Reliability problems due to ionic conductivity of IC encapsulation materials under high voltage conditions’, Proceedings International Symposium on Power Semiconductor Devices (ISPSD), pp. 197-200, (1999) [11] J. Frenkel, ‘On pre-breakdown phenomena in insulators and electronic semiconductors’, Physical Review, pp. 647, (1938) [12] S.M. Sze, ‘Physics of semiconductor devices’, 2nd edition, John Wiley & Sons, New York, (1981) [13] J.W. Osenbach, W.R. Knolle, ‘Semi-insulating Silicon Nitride (SinSiN) as a resistive field shield’, IEEE Transactions on Electron Devices, pp. 15221528, (1990) [14] J.W. Osenbach, J.L. Zell, W.R. Knolle, L.J. howard, ‘Electrical, physical and chemical characteristics of plasma-assisted chemical-vapor deposited semiinsulating a-SiN:H and their use as a reistive shield for high voltage integrated circuits’, Journal Applied Physics, pp. 6830-6843, (1990) [15] K. Matsuzaki, T. Horasawa, G. Tada, M. Saga, ‘Application of a semi-insulating amorphous hydrogenated silicon nitride film as a resistive field shield and its reliability’, Journal Electrochemical Society, pp. 4296-4304, (1998) [16] J.V. Dalton, J. Drobek, ‘Structure and sodium migration in silicon nitride films’, Journal of the Electrochemical Society, pp. 865-868, (1968) [17] R.C. Olberg, ‘The effects of epoxy encapsulant composition on semiconductor device stability’, Journal of the Electrochemical Society, pp. 129-133, (1971) [18] L.H. Kaplan, M.E. 
Lowe, ‘Phosphosilicate glass stabilization of MOS structures’, Journal of the Electrochemical Society, pp. 1649-1653, (1971) [19] J.A. van der Pol, H.J. Gerritsen, R.T.H. Rongen, P.P.M.C. Groeneveld, P.W. Ragay, H.A. van den Hurk, ‘Reliability issues in 650V high voltage BipolarCMOS-DMOS integrated circuits’, Microelectronics & Reliability, pp. 17231726, (1997) [20] J.A. van der Pol, R.T.H. Rongen, H.J. Bruggers, ‘Modelling of surface potential induced leakage failures in high voltage integrated circuits and application to design rule derivation’, Submitted to ESREF2000 Conference. 47 3 Relation Between the Hot Carrier Lifetime of Transistors and CMOS SRAM Products [24] 3.1 3.2 3.3 3.4 3.5 3.6 3.7 Introduction Experimental Transistor and SRAM parameter degradation Analysis and discussion of the SRAM parameter degradation Relation between the transistor and SRAM hot carrier lifetime Summary and conclusions References 3.1 INTRODUCTION Along with decreasing MOS transistor device geometries, hot carrier degradation of integrated circuits is becoming a more and more serious reliability problem. Under static operating conditions, the hot carriers lifetimes of transistors in 0.35 µm and 0.25 µm technologies for example hardly exceed a few months. It is therefore increasingly important to determine the relation between the lifetimes of stand-alone transistors and actual circuits. Many publications have appeared about hot carrier effects in MOS transistors [1,2,16]. However, the relation to real products is not very clear. Moreover, apart from some circuit simulation work [3,4] and a few experiments on circuits [5,6], hardly any data did exist about this topic before this work. This is mainly due to the fact that the derivation of circuit lifetime from transistor static stress results is complicated. Factors contributing to this are the variety of transistor lifetime criteria in use (e.g. 
100 mV threshold voltage shift, 10% transconductance or 10% Id degradation, etc.), duty cycle effects, possible AC enhanced degradation effects [7,8], annealing effects [9,10] and the usually unknown sensitivity of the circuit performance to the transistor degradation. Furthermore, it has been shown that the sensitivity may be supply voltage dependent [3].

In this chapter, a detailed experimental study into the relation between the transistor and circuit hot carrier lifetime, carried out on 45 ns / 100 pF low power 64K (8K×8) full-CMOS static random access memories (SRAM), is described.

3.2 EXPERIMENTAL

3.2.1 SRAM circuit description

The full-CMOS 8K×8 SRAMs used in this study [11] feature a 6 transistor memory cell and are fabricated in a single poly, double metal 1.2 µm twin tub CMOS process with p-epi on a p+ substrate. The 1.2 µm LDD n-channel and 1.4 µm conventional p-channel transistors have a minimum effective channel length Leff,min of 0.75 µm and have n+ and p+ doped polysilicon gates in the matrix respectively. All periphery transistors have n+ doped gates. Gate oxide thickness and n+ and p+ source-drain junction depths equal 25 nm and 500 nm respectively. The devices have a plasma-nitride passivation layer.

Fig. 1: Functional block diagram of the 8K×8 SRAM [11].

Fig. 1 shows a functional block diagram of the SRAM. The matrix is organised in 16 sections and each section consists of a cell array of 128 rows and 32 columns. The logic combination of the addresses activates one polysilicon wordline WL within a section and, via pass gates, 8 cells from the total of 32 cells are selected for read or write operations (bytewide organisation). The cells on the remaining 24 columns are in a 'pseudo-read mode'; they also pass their data contents on to the bitline pairs (columns) but stay isolated from the local section read and write data buses.
The minimum SRAM read and write cycle times, Trc,min and Twc,min, are specified at 45 ns.

3.2.2 SRAM stress method and stress conditions

The full circuit stress consisted of dynamically operating the products at high speed, at low ambient temperature and at high supply voltage, using a worst case pattern. The read and write cycle times, Trc and Twc, equalled 60 ns. The ambient temperature was set at -30 °C to diminish the effects of other degradation mechanisms (e.g. electromigration). Lowering the temperature also slightly accelerates hot carrier degradation effects [12]. Five different supply voltages Vdd ranging from 7.5 V to 10 V were used during the stress of these nominal 5 V devices. All input voltages were raised to the Vdd level to limit power dissipation in the input and output buffers. Under these conditions, the junction temperature Tj of the devices equalled approximately 0 °C. Care was taken to avoid inductive ringing of the supply lines. Each I/O pin was loaded by a 30 pF capacitor. In total 83 devices from 2 different batches were stressed.

3.2.3 SRAM stress pattern

The stress pattern applied was a simple toggling mode between two complementary addresses. It involves reading and writing of alternating data in two complementary addresses (i.e. in two wordlines in different sections). The waveforms of the stress pattern are depicted in fig. 2. The aim of this 'Two Address Method (TAM)' pattern is to focus the hot carrier stress on the memory datapath by maximising the stress duty cycle of all datapath elements (i.e. input and output buffers, decoders, write and read bus drivers, pass gates, memory cells, sense amplifiers etc., see fig. 1). Its advantage is that it yields much more information in the same amount of stress time than a pattern which accesses all cells; the stress duty cycle of e.g. the memory cell would be 1000 times smaller if a linear scan pattern were used.
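The factor of roughly 1000 can be cross-checked against the array organisation of section 3.2.1. The following is a minimal sketch, assuming (this is an inference, not stated explicitly in the text) that a cell is stressed whenever its wordline is activated, whether the cell is selected or in pseudo-read mode, and that a linear scan visits every byte address once per pass. The constant and function names are illustrative only.

```python
# Illustrative estimate only; organisation figures taken from section 3.2.1.
SECTIONS = 16             # sections in the matrix
ROWS_PER_SECTION = 128    # wordlines per section
COLS_PER_SECTION = 32     # cells connected to one wordline
BITS_PER_ADDRESS = 8      # bytewide organisation


def wordline_duty_cycles():
    """Return (tam_duty, scan_duty): fraction of cycles in which one wordline is active."""
    total_bits = SECTIONS * ROWS_PER_SECTION * COLS_PER_SECTION
    total_addresses = total_bits // BITS_PER_ADDRESS
    addresses_per_wordline = COLS_PER_SECTION // BITS_PER_ADDRESS
    # Assumption: the TAM pattern alternates between two addresses on two wordlines,
    # so each stressed wordline is active in half of all cycles.
    tam_duty = 1 / 2
    # Assumption: a linear scan visits each byte address once per pass.
    scan_duty = addresses_per_wordline / total_addresses
    return tam_duty, scan_duty


tam, scan = wordline_duty_cycles()
print(f"TAM / linear-scan duty cycle ratio: {tam / scan:.0f}")
```

Under these assumptions the ratio comes out at 1024, which is consistent with the factor of about 1000 quoted above.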
A drawback of the TAM pattern is that only a small fraction of the circuit transistors is stressed so the weakest circuit part from the design point of view may remain unstressed. However, the addresses are chosen such that transistors in all distinct building blocks of the memory are stressed. It is therefore a reasonable assumption that the results obtained are representative of the intrinsic circuit lifetime. Nevertheless, to determine the influence of process related defects (e.g. particles in the polysilicon layer), or to obtain statistical information on the hot carrier stress resistance of all cells, a scan-type pattern should be used.

Fig. 2: Schematic of the stress pattern waveforms, Twc = Trc = 60 ns.

3.2.4 Transistor stress conditions

The static transistor degradation experiments were carried out at 20 °C ambient temperature. The channel length L and width W of the devices equalled 1.2 µm and 20 µm respectively. To check for narrow width effects, some devices with W = 1.2 µm were also stressed. No difference in degradation was observed. The devices were stressed at several Vds values, while Vgs equalled Vds/2 - 0.5 V (approximately the maximum substrate current condition). During the stress the following parameters were monitored: the threshold voltage Vt, defined as the Vgs value at which Ids equals 0.01·W/L µA, the maximum transconductance βmax and the drain-source current Ids at Vgs = 2.5 V, all in the linear region (Vds = 0.1 V) and the forward mode (source and drain terminals the same as during the stress).

3.3 TRANSISTOR AND SRAM PARAMETER DEGRADATION

3.3.1 Transistor parameter degradation

A typical example of the transistor degradation is shown in fig. 3. The stress conditions are shown in the figure. Note the super-linear time dependence of the degradation. This is indicative of severe electron trapping [10].

Fig.
3: Degradation of the threshold voltage Vt, the maximum transconductance βmax and the drain-source current Ids, all measured in the linear region, as a function of time of a W/L = 20/1.2 µm n-channel transistor (Leff = 0.78 µm). The stress conditions are shown in the figure and the measurement conditions are described in section 3.2.4.

3.3.2 SRAM parameter degradation

The circuit parameters were monitored during the stress by characterising the devices at several readpoints at 20 °C ambient temperature on a Teradyne J386A memory tester. Under the stress conditions described above, changes in the electrical parameters occurred. Fig. 4 shows the increase of the address access time Taa as a function of stress time (the chip enable access time Tac degraded in a similar way). The increase can be clearly seen, although only at high voltage levels and after long times. The most sensitive parameter, however, appears to be the minimum operating voltage Vdd,min, as is shown in fig. 5. Large changes occur already after short stress times. Note that the time at which the onset of the degradation occurs is exponentially dependent on the supply voltage, which is indicative of hot carrier degradation effects.

Along with Vdd,min, the write timing parameters, such as the minimum data-to-write-time overlap Tdw, measured at Vdd = 4.4 V, also degraded. Fig. 6 shows that both changes are strongly correlated. This is because they result from the same degradation effect, as will be shown in section 3.4. No shift was observed in the write time parameters measured at Vdd = 5.6 V, which will also be explained in section 3.4. Furthermore, no shift of the DC SRAM parameters occurred.

Fig. 4: The address access time Taa at Vdd = 4.4 V of batch 1 as a function of stress time for five different stress supply voltages Vdd. The Leff of the n- and p-channel transistors equal 0.94 µm and 1.07 µm respectively. Stress conditions as described in section 3.2.2.
Fig. 5: The minimum operating voltage Vdd,min of batch 1 as a function of stress time for five different stress supply voltages Vdd. Leff and stress conditions as in fig. 4.

Fig. 6: Correlation between the degradation of the write pulse width Tdw, measured at Vdd = 4.4 V, and the minimum operating voltage Vdd,min of batch 1 and 2. Stress conditions as in fig. 4. The n- and p-channel Leff of batch 2 equal 0.85 µm and 0.97 µm respectively. The Tdw data-sheet specification limit of 20 ns corresponds to a Vdd,min value of 3.5 to 4.0 V.

A strong saturation effect (and even recovery after long times) can be seen in fig. 5. This could be caused by annealing effects, induced by detrapping of electrons from shallow trap levels in the gate oxide [13]. Note that strong electron trapping occurred during the transistor stress. Electron detrapping will lead to a recovery of the threshold voltage [9,10] and of the circuit parameter shifts [6]. Therefore, annealing effects have also been investigated. Fig. 7 shows the recovery of Vdd,min during storage at various temperatures after the stress. Note that Vdd,min completely recovers to its 0 hour value (about 2.4 V) and that the recovery time constant τrecov(T) is strongly temperature dependent. Fig. 13 shows that the Vdd,min shift is proportional to the Vt-shift of the lifetime limiting transistor. Therefore, the recovery activation energy Ea, which equals the average electron trap level energy Etrap, can be calculated from the following simple model [10,13]:

$$\partial V_{dd,min}(t) = A \cdot \partial V_t(t) = B \cdot N_{trap}(t) = B \cdot N_{trap}(t=0) \cdot e^{-t/\tau_{recov}(T)} \qquad (1)$$

where A and B are constants, Ntrap the number of trapped electrons/cm² and τrecov(T) the recovery time constant in hrs. Using equation (1), τrecov(T) can be calculated from fig. 7. By plotting the results in an Arrhenius plot, see fig. 8, we find that Etrap and Ea equal 1.03 ± 0.16 eV. At 20 °C, τrecov(T) equals 1300 hrs; scaling with the Arrhenius factor exp[(Ea/k)·(1/273 K - 1/293 K)] ≈ 20 gives a τrecov(T) of about 3 years at 0 °C (Tjunction of the SRAM). Therefore we conclude that annealing effects are negligible at -30 °C ambient temperature. In section 3.4 it will be shown that the saturation and recovery after long stress times are caused by a compensating degradation effect.

Fig. 7: Recovery of the minimum operating voltage Vdd,min after the stress during storage at various temperatures.

Fig. 8: Arrhenius plot of the Vdd,min recovery time constants, Ea = 1.03 ± 0.16 eV.

Similar recovery effects were observed in the write time parameters, consistent with the fact that they are strongly correlated to Vdd,min. No significant recovery of the access times was observed.

3.4 ANALYSIS AND DISCUSSION OF THE SRAM PARAMETER DEGRADATION

In this section the origin of the circuit degradation is investigated. The location of the damage is determined by electrical analysis, which is confirmed by circuit simulation, voltage micro-probing and photo-emission microscopy. Finally it is verified whether the lifetime limiting transistor can explain the observed parameter shifts quantitatively.

3.4.1 Localisation of the circuit damage

Detailed electrical analysis of the circuit degradation by means of bitmapping techniques on a memory tester showed that the access time degradation occurred in all the cells in the two sections containing the two stressed wordlines, see fig. 1. This reveals that the hot carrier damage is localised in the sense amplifiers and/or read bus drivers. Bitmapping further showed that the Vdd,min and write time degradation were associated with a write problem; while reading was possible down to Vdd = 2.4 V (equal to the 0 hour Vdd,min value, see fig. 5), for writing a higher supply voltage was required.
The write problem appeared to be solely confined to the 64 cells connected to the two stressed wordlines, see section 3.2. Apparently, the damage was localised in the memory cells themselves. A schematic of the SRAM cell and bitline circuit is depicted in fig. 9. The further analysis was concentrated on the Vdd,min and write time degradation, because these determine the circuit lifetime (see section 3.5).

3.4.2 Stress on the memory cell transistors

Fig. 10a and 10b show the Vds and Vgs voltages of the 6 memory cell transistors during a typical read and write cycle at Vdd = 5.5 V. It can be clearly seen that the access transistor T2 (see fig. 9) connected to the ‘0’-node of the memory cell suffers the largest stress, as was also shown by Sakurai et al. [14]. Whenever the wordline WL is activated at the start of a read or write cycle (denoted by 'A' in fig. 10b), the Vgs of T2 is swept from 0 V to Vdd while its Vds still (almost) equals Vdd due to the large bitline capacitance. The stress on the other cell transistors is much less. Firstly, their maximum Vds is smaller due to the voltage drop across the diode, see fig. 9. Secondly, during reading no large Vgs transients occur. Thirdly, when Vgs transients do occur during writing, their Vds has already dropped significantly due to the small cell node capacitances. Note however that at elevated supply voltages the driver transistor D1 will also be stressed, because during reading (denoted by 'B' in fig. 10a) its Vgs will have risen to above its Vt (about 0.8 V). At Vdd = 5.5 V, Vgs will remain smaller than Vt.

Fig. 9: Schematic of the SRAM cell and bitline circuitry. D1 and D2 are the n-channel driver transistors (W/L = 2.0/1.2 µm), L1 and L2 the p-channel load transistors (W/L = 1.0/1.4 µm), T1 and T2 the n-channel access transistors (W/L = 1.2/1.2 µm) and C1 and C2 the p-channel bitline load transistors. The diode results from the use of n+ and p+ doped polysilicon gates.
The black dots indicate the degradation sites.

Fig. 10 (a), (b): Circuit simulation showing the drain-source Vds and gate-source Vgs voltages of the 6 memory cell transistors during a typical read and a write cycle at Vdd = 5.5 V. During writing a '1' is written in node 2 (see fig. 9) which initially, and during reading, contained a ‘0’. Due to the delay of the write signals, a write cycle starts with a read operation. T2 and D1 are stressed during the periods indicated by 'A' and 'B' respectively.

3.4.3 Photo-emission recording of the memory array

The above reasoning was verified by means of photo-emission microscopy [15]. Fig. 11a shows the layout of the cell [11] and fig. 11b shows a photo-emission recording of the memory array under dynamic operation at Vdd = 8.5 V. 'Data 1' was written into the cells so node 2, see fig. 9, contained a ‘0’. As explained in the previous section, the access transistor T2 and driver transistor D1 emit light. This is indicative of severe hot carrier degradation effects. No light was seen when the part was not toggled, so punch-through effects are negligible and the emitted light is undoubtedly associated with hot carrier effects. Note that the access transistors are completely covered by the bitlines (in second metal). Due to multiple interference, this results in two light spots on both sides of the bitline at the location of T2. The PMOS transistors emitted no light.

Fig. 11: Layout of the memory cell showing the location of the transistors depicted in fig. 9 (a) and photo-emission recording of the memory array (b). The SRAM was operated in the read mode at Vdd = 8.5 V, Trc = 250 ns and with 'data 1' written into the cells. The integration time equalled 2000 s, using a Hamamatsu C3230-01 photo-emission camera.
The light emitted by the access transistors T2 and driver transistors D1 can be clearly seen.

3.4.4 Voltage micro-probing of the memory cell transistors

Further confirmation of the above degradation mechanism was obtained by voltage micro-probing of degraded and non-degraded cell transistors. Their bitlines were isolated by means of laser cutting. Fig. 12 shows the results after 250 hrs stress at Vdd = 9 V, corresponding to a 4.0 V Vdd,min value. The measurement conditions are shown in the figure. Curve A (series combination of the access and load transistor) clearly shows a positive 1.25 V Vt-shift of the access transistor when operated in the reverse mode (source and drain terminals interchanged with respect to the stress). Back-bias effects have been taken into account. The Vt-shift leads to a reduction of the saturated drain-source current by more than a factor of 2. Curve B (series combination of the driver and access transistor) shows that, even in the forward mode, the current drive capability has decreased by almost a factor of 2 in the linear region. However, as is well known, the effects are less severe in the saturated region.

The above is a clear example of the fact that especially ‘pass’-type transistors (that are operated with both Vds polarities) are sensitive to hot carrier degradation in circuits. This is because the saturation current decrease is much larger when operated in the reverse mode than in the forward mode. In contrast, ‘inverter’-type transistors are only operated in the forward mode and consequently these can withstand much more hot carrier damage at the drain side before a significant decrease of the saturation current occurs.

Fig. 12: Micro-probing results of the cell transistors with isolated bitlines after a 250 hrs stress at Vdd = 9 V. The measurement conditions are shown in the figure.
The thick and thin lines respectively represent measurements on five non-degraded and on one degraded cell. Curve A reveals a 1.25 V positive Vt-shift of the access transistor.

3.4.5 Verification of the SRAM parameter degradation

In contrast to reading, during writing the access transistor operates in the strongly degraded reverse mode. Hence, the memory write parameters in particular will be affected. We will explain this by a qualitative reasoning followed by a quantitative verification.

Writing of a '0' at a node that initially contains a ‘1’ occurs if this node, which is connected to the gate of the opposite driver (see fig. 9), is pulled low enough. The opposite driver will be shut off and the cell will switch. However, degradation of the access transistor leads to an increase of the minimum node voltage that can be attained, as this value is determined by the voltage division between the access and load transistor. In the end, the opposite driver will remain on and the cell cannot be switched any more without increasing the supply voltage Vdd. This explains not only the Vdd,min degradation but also its recovery after long stress times, see fig. 5. As was shown in fig. 10 and 11, in the end the driver transistors will also degrade and their Vt will increase. As a result, they can be shut off more easily and the cell can be switched, and thus written, at lower Vdd.

Circuit simulations showed that the write time degradation is caused by the presence of a meta-stable state in the degraded cell during writing. The occurrence of this state is strongly dependent on the supply voltage and, actually, only occurs when the Vdd,min of the cell is close to the operation voltage, see fig. 6 and 15. Therefore no write time degradation was measured at Vdd = 5.6 V, see section 3.3.2. The asymptote at Vdd,min = 4.4 V in fig. 6 can be easily explained.
When Vdd,min equals the operation voltage, the cell cannot be written any more and thus the Tdw value will approach infinity.

The above has been verified quantitatively as well. Fig. 13 and 14 show the calculated increase of Vdd,min and Tdw respectively as a function of the Vt-shift of the access transistor. Their correlation is shown in fig. 15. The transconductance β, normalised to its 0-hour value, is used as a parameter. The simulations are in good agreement with the measured values. Fig. 3 shows that in case of a 1.25 V Vt-shift, the β degradation equals about 30%. From fig. 13 it can be seen that the corresponding Vdd,min equals 3.7 V, compared to the measured value of 4.0 V (fig. 12). Furthermore, the simulated Tdw - Vdd,min correlation in fig. 15 fits well to the measured values in fig. 6. This proves that the large Vdd,min and Tdw increase can indeed quantitatively be explained by the measured Vt-shift.

We conclude that the damage in the memory cell is primarily located in the access transistor and that it is the lifetime limiting transistor. This explains why not only the two stressed bytes in the product, but also the other cells connected to the same wordlines, see fig. 1, degraded. These cells are in the 'pseudo-read mode' when their wordline is activated, see section 3.2.1, so their access transistors suffer the same stress as those of the two stressed bytes.

Fig. 13: Circuit simulations showing the minimum operating voltage Vdd,min as a function of the Vt-shift of the access transistor. The transconductance β, normalised to its 0-hour value, is used as a parameter.

Fig. 14: Circuit simulations showing the minimum write pulse width Tdw as a function of the Vt-shift of the access transistor. The transconductance β, normalised to its 0-hour value, is used as a parameter.

Fig.
15: Circuit simulations showing the correlation between the minimum operating voltage Vdd,min and the minimum write pulse width Tdw. The transconductance β, normalised to its 0-hour value, is used as a parameter.

3.5 RELATION BETWEEN THE TRANSISTOR AND SRAM HOT CARRIER LIFETIME

Two different criteria were used to define the transistor lifetime: a 100 mV Vt-shift and a 10% βmax degradation. The SRAM lifetime was determined by the first parameter that ran out of its datasheet specification. One might argue that the 'distance' to the specification limit is different for each product batch, and therefore a 10% degradation of any parameter would be a more appropriate lifetime criterion. However, the fact that the parameters of a batch with a short channel length will be further away from the specification limit will probably be compensated by the fact that they will degrade faster. We will therefore use the datasheet related lifetime criterion. The Tdw write time parameter (specification limit 20 ns), see section 3.3, appears to be the lifetime limiting parameter. Therefore, a 20 ns Tdw value was used as the SRAM lifetime criterion, which, as can be seen from fig. 6, corresponds to a Vdd,min value of 3.5 to 4.0 V.

In order to allow a comparison, both the transistor and product lifetimes were extrapolated to minimum Leff (0.75 µm), minimum junction temperature Tj (0 °C) and, in case of the product, to maximum operating frequency (Twc,min = Trc,min = 45 ns), using equation (2):

$$T_{lf} = A \cdot e^{B/V_{dd}} \cdot e^{-C/L_{eff}} \cdot e^{q \cdot E_a/(k \cdot T)} \cdot \frac{T_{rc}}{T_{rc,min}} \quad [s] \qquad (2)$$

where A, B, C and Ea are constants. The activation energy Ea = -0.04 eV [12]. The channel length acceleration factor C is only technology dependent and equals 4.4 µm for our technology. Constant A and the voltage acceleration factor B can also be product type dependent. In fig. 16 the resulting hot carrier lifetimes of two SRAM batches and the 1.2 µm n-channel transistors are plotted as a function of 1/Vdd.
It clearly shows that the slopes of the product and transistor lifetime data are the same (B = 165 ± 10 V), indicating a similar degradation mechanism. However, the SRAM lifetime is significantly larger than the transistor lifetime and equals 1250 years at Vdd = 5.5 V. This corresponds to a bit lifetime of 4.4⋅10^17 read or write cycles. The product-to-transistor lifetime ratio is determined by comparing curve 2 and curve 4, because only batch 2 was processed using the same flow as the transistors. A lifetime ratio of about a factor 50 results.

Fig. 16: Hot carrier lifetime of 1.2 µm n-channel transistors, stressed at maximum substrate current Isub (curves 1 and 2) and of two 8K×8 SRAM batches (curves 3 and 4) as a function of 1/Vdd, using respectively a 10% β, a 100 mV Vt-shift and a 20 ns Tdw lifetime criterion. All lifetimes have been extrapolated to Leff = 0.75 µm, Tj = 0 °C and maximum operating frequency. The SRAM lifetime at Vdd = 5.5 V equals 300 to 1250 years, compared to a transistor lifetime of 17 years (Vt-criterion).

The lifetime discrepancy is in the first place caused by the small sensitivity of the performance of this particular product to the transistor degradation. In section 3.4 it was shown that the Vt-shift of the access transistor needed to reach the SRAM end-of-life is 10 times larger, i.e. 1.0 V, than the 100 mV value used to define the transistor end-of-life. Fig. 3 shows that this accounts for about a factor 5 in the product-to-transistor lifetime ratio. The second major contribution to the lifetime discrepancy is duty cycle effects. Although compensated by AC enhanced degradation effects, they apparently account for the remaining factor 10. The influence of AC enhanced degradation effects is thus limited.
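Two of the numbers quoted in this section can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: it evaluates just the supply voltage term exp(B/Vdd) of equation (2) with B = 165 V, and it converts a product lifetime into cycles per bit under the assumption (an inference, not stated in the text) that each of the two toggled addresses sees half of all cycles at Trc,min = 45 ns. The function names and the 365.25-day year are choices made here, not taken from the source.

```python
import math


def voltage_acceleration(vdd_stress, vdd_use, b_volts=165.0):
    """Lifetime ratio T_lf(vdd_use) / T_lf(vdd_stress) from the exp(B/Vdd) term of eq. (2)."""
    return math.exp(b_volts / vdd_use - b_volts / vdd_stress)


def cycles_per_bit(lifetime_years, cycle_time_s=45e-9, toggled_addresses=2):
    """Convert a product lifetime into the number of cycles seen by one stressed bit.

    Assumption: the cycles are shared equally between the toggled addresses.
    """
    lifetime_s = lifetime_years * 365.25 * 24 * 3600
    return lifetime_s / cycle_time_s / toggled_addresses


print(f"cycles per bit for 1250 years: {cycles_per_bit(1250):.2e}")
print(f"lifetime gain from 7.5 V to 5.5 V: {voltage_acceleration(7.5, 5.5):.0f}x")
```

With these assumptions, 1250 years corresponds to about 4.4⋅10^17 cycles per bit, in line with the value quoted above, and lowering Vdd from 7.5 V to 5.5 V lengthens the lifetime by a factor exp(8), roughly 3000.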
3.6 SUMMARY AND CONCLUSIONS

Hot carrier degradation of full-CMOS SRAM results in an increase of the minimum operating voltage, the write time parameters and the access times. The degradation of the first two parameters can be traced back to the degradation of only one transistor, namely the access transistor of the memory cell. This is a ‘pass’-type transistor that is operated with both source-drain voltage polarities. This type of operation makes the transistor much more sensitive to the effects of hot carrier degradation than operation in the ‘inverter’-type mode because, for a given level of hot carrier damage, the saturation current decrease is much larger in the reverse mode (source-drain terminals interchanged with respect to the stress situation) than in the forward mode. Circuit simulations show that the observed circuit parameter shifts can quantitatively be explained by this failure mode.

The relation between the hot carrier lifetime of static stressed transistors and the SRAM products has been established. Although their supply voltage dependence is the same, the product lifetime appears to be about a factor 50 larger than the transistor lifetime. This discrepancy can be accounted for by the small sensitivity of the SRAM to the transistor degradation and by duty cycle effects.

In conclusion, product lifetimes are severely underestimated if they are straightforwardly derived from static transistor lifetime data. This finding can be applied during product design, for example by eliminating the need for cascoding of transistors at critical locations. In this way, faster memory access and higher maximum operating frequencies of microprocessors become possible, as well as more aggressive scaling of process technologies, without jeopardising the product reliability. True product lifetimes can best be obtained from stressing the product or by simulation of the degradation effects in the circuit.
In addition to this work [24], numerous papers have appeared dealing with both simulation [16-20] and the product stress issues [20-23]. Of course transistor stresses remain important for process evaluation, optimisation and monitoring.

3.7 REFERENCES

[1] C. Hu, S.C. Tam, F.-C. Hsu, P.K. Ko, T.Y. Chan, K.W. Terrill, 'Hot-electron induced MOSFET degradation - model, monitor and improvement', IEEE Transactions on Electron Devices, vol. 32, pp. 375-385, (1985)
[2] P. Heremans, R. Bellens, G. Groeseneken, H.E. Maes, 'Consistent model for the hot-carrier degradation in n-channel and p-channel MOSFET's', IEEE Transactions on Electron Devices, vol. 35, pp. 2194-2209, (1988)
[3] S. Aur, D.E. Hocevar, P. Yang, 'Circuit hot electron effect simulation', IEDM Technical Digest, pp. 498-501, (1987)
[4] P.M. Lee, M.M. Kuo, K. Seki, P.K. Ko, C. Hu, 'Circuit aging simulator (CAS)', IEDM Technical Digest, pp. 134-137, (1988)
[5] C. Duvvury, D. Redwine, H. Kitagawa, R. Ham, Y. Chuang, C. Beydler, A. Hyslop, 'Impact of hot carriers on DRAM circuits', Proceedings International Reliability Physics Symposium, pp. 201-206, (1987)
[6] M. Matsumoto, Y. Kimura, K. Hirayama, H. Koyama, N. Maki, H. Matsumoto, 'Degradation mechanism due to hot electron trapping in high density CMOS SRAM', Proceedings International Symposium on Testing and Failure Analysis, pp. 89-94, (1988)
[7] W. Weber, 'Dynamic stress experiments for understanding hot carrier degradation phenomena', IEEE Transactions on Electron Devices, vol. 35, pp. 1476-1486, (1988)
[8] W. Weber, I. Borchert, 'Hot-hole and electron effects in dynamically stressed n-MOSFETs', Proceedings ESSDERC, pp. 719-722, (1989)
[9] A.G. Sabnis, J.T. Nelson, 'A physical model for degradation of DRAMS during accelerated stress aging', Proceedings International Reliability Physics Symposium, pp. 90-95, (1983)
[10] R. Annunziata, G. Dalla Libera, E. Ghio, A.
Maggis, 'Annealing of hot carrier damaged double metal MOSFETs', Proceedings ESSDERC, pp. 715-718, (1989)
[11] W.C.H. Gubbels, C.D. Hartgring, R.H.W. Salters, J.A.M. Lammerts, M.J. Tooher, P.F.P.C. Hens, J.J.J. Bastiaens, J.M.F. van Dijk, M.A. Sprokel, 'A 40-ns/100-pF low-power full-CMOS 256K (32Kx8) SRAM', IEEE Journal of Solid State Circuits, vol. 22, pp. 741-747, (1987)
[12] C. Yao, J. Tzou, R. Cheung, H. Chan, 'Temperature dependence of CMOS device reliability', Proceedings International Reliability Physics Symposium, pp. 175-182, (1986)
[13] R. Mahnkopf, G. Przyrembel, H.G. Wagemann, 'Annealing of hot carrier-induced MOSFET degradation', Journal de Physique, Coll. C4, pp. 771-774, (1988)
[14] T. Sakurai, M. Kakumu, T. Iizuka, 'Hot carrier suppressed VLSI with submicrometer geometry', International Solid State Circuits Conference, pp. 272-273, (1985)
[15] N. Khurana, C-L. Chiang, 'Analysis of product hot electron problems by gated emission microscopy', Proceedings International Reliability Physics Symposium, pp. 189-194, (1986)
[16] H. Wang, H. De, R. Lahri, D. Haueisen, ‘Improving hot-electron reliability through circuit analysis and design’, Proceedings International Reliability Physics Symposium, pp. 107-111, (1991)
[17] K.N. Quader, P. Fang, J. Yue, P.K. Ko, C. Hu, ‘Simulation of CMOS circuit degradation due to hot carrier effects’, Proceedings International Reliability Physics Symposium, pp. 16-23, (1992)
[18] M. Pagey, R.J. Milanowski, E.S. Snyder, N. Bui, B. Deem, B. Bhuva, S. Kerns, ‘Unified model for n-channel hot carrier degradation under different degradation mechanisms’, Proceedings International Reliability Physics Symposium, pp. 289-293, (1996)
[19] P.-C. Li, G.I. Stamoulis, I.N. Hajj, ‘I-Probe-d: a hot carrier and oxide reliability simulator’, Proceedings International Reliability Physics Symposium, pp. 274-279, (1994)
[20] R. Bellens, I. Clemminck, K.
Van Doorselaer, ‘Building-in reliability during library development: hot carrier degradation is no longer a problem of the technologists only!’, Microelectronics & Reliability, pp. 1425-1428, (1997)
[21] C. Jiang, E. Johnson, J.J. Shaw, C. Hu, ‘AC hot carrier degradation in a voltage controlled oscillator’, Proceedings International Reliability Physics Symposium, pp. 53-56, (1993)
[22] Y. Huh, D. Yang, H. Shin, Y. Sung, ‘Hot carrier induced circuit degradation in actual DRAM’, Proceedings International Reliability Physics Symposium, pp. 72-75, (1995)
[23] R. Bellens, ‘Hot carrier degradation in sub-micron CMOS technologies: problems and possible solutions’, Tutorial IRPS, (1998)
[24] J.A. van der Pol, J. Koomen, ‘Relation between the hot carrier lifetime of transistors and CMOS SRAM products’, Proceedings International Reliability Physics Symposium, pp. 178-185, (1990)

4 Systematic Derivation of Latchup Design Rules for Submicron CMOS Processes from Test Structures [7]

4.1 Introduction
4.2 Latchup susceptibility reduction options
4.3 Design rule derivation approach
4.4 Application: design rule derivation for a CMOS process on p-/p++ epitaxial substrates
4.4.1 Impact P+ substrate contact placement and P+ guardrings
4.4.2 Impact Nwell contact placement and N+/Nwell guardrings
4.4.3 Process specific design rules
4.5 Conclusions
4.6 References

4.1 INTRODUCTION

Latchup is an intrinsic reliability risk for CMOS processes [1,2] due to the presence of built-in parasitic thyristors in this technology. These thyristors can switch into an unwanted high current mode, called latchup, due to external disturbances such as voltage spikes. This high current mode can only be switched off by disconnecting the supply voltage. So, when built into a system, the occurrence of latchup can easily damage the circuit and cause a system malfunction. Thus each CMOS circuit requires some sort of latchup protection to prevent randomly occurring failures.
No clear approach has been published in the literature for the derivation of latchup design rules. Up to now, it has been regarded as a kind of ‘art’ and a significant amount of trial-and-error has been associated with circuit design and product development. In this work, however, a consistent approach is demonstrated that allows latchup design rules to be derived from simple test structures and that is applicable to any CMOS technology [7]. It thus helps to eliminate the ‘black magic’ atmosphere around this phenomenon. In a CMOS process using p-type substrates, the thyristors consist of lateral NPN and vertical PNP transistors, see fig. 1.

Fig. 1: Schematic view of a latchup test structure showing the parasitic bipolar transistors and base resistances, the relevant design rules (A: n+p+-spacing, B: distance N-well contact to P+ emitter and C: distance P+ substrate contact to N+-emitter) and the N- and P-emitters.

As the n+p+-spacing (which determines the base width and thus the gain of the parasitic bipolar transistors) continues to shrink for new-generation submicron processes, the latchup susceptibility increases strongly. In principle, latchup-free products can be obtained by ensuring that the thyristor holding voltage Vh is larger than the supply voltage Vdd. This appears, however, to be impractical for submicron processes, as Vh > Vdd is only achieved for n+p+ spacings significantly larger than the minimum design rules, even if latchup robust p-/p++ epitaxial substrates are used [3,4]. This is illustrated by the data in fig. 2 from two 5 V 0.7 µm twin well CMOS processes from two waferfabs using 10 Ωcm p- epi on a 0.01 Ωcm p++ substrate.
Process A has a significantly higher temperature budget than process B1 (the alignment markers are made by a double LOCOS oxidation instead of by a silicon dry etch). As a result, the epi layer remaining at the end of the process is about 1.5 µm thinner for process A, due to diffusion of the p++ bulk into the p- epi layer. This explains why the 9 µm epi Vh data of process A correspond to those of 7.5 µm epi of process B1. Fig. 2 shows that by going to very thin epi layers, Vh approaches Vdd, but this option is limited by P+/Nwell/p++ substrate punch-through and Rsheet Nwell requirements, see e.g. fig. 3. Thus there is a clear need for proper design rules to obtain latchup robust products. In this paper a systematic method is proposed for the derivation of those design rules from test structures. This method is illustrated using data from one of the above CMOS processes on epitaxial substrates. However, it is also applicable to processes on p- bulk substrates with or without buried layers.

Fig. 2: Holding voltage at 125 °C as a function of n+p+ spacing A for various epitaxial layer thicknesses (5, 7, 8, 9 and 12 µm for process A, 7.5 µm for process B1) and two different 0.7 µm processes (A and B1) from two waferfabs. The 5.5 V supply voltage is indicated.

Fig. 3: Nwell sheet resistance as a function of epitaxial layer thickness for process A. For small thicknesses the p++ doped substrate overdopes the Nwell resulting in an increase of the sheet resistance. Analog design constraints determine the maximum allowable Rsheet Nwell.
4.2 LATCHUP SUSCEPTIBILITY REDUCTION OPTIONS
Given that Vh < Vdd, there are basically only three options to improve latchup robustness. First, one can increase the thyristor P- and N-emitter trigger currents Itrig,p/n (hole and electron injection respectively) by reducing the gain of the parasitic bipolar transistors and/or their base resistances by means of process as well as design options, see table 1. Second, one can reduce the number of carriers reaching the thyristor after injection at the product I/O-bondpads by applying P+ or N+/Nwell guardrings, see e.g. fig. 4.

Process option p- epi layer on p++ substrate N / Pwell dose n++/p++ buried layers Silicidation Design option Effect on gain bipolar transistors - + + n p -spacing n+p+-spacing partitioning Placement of Nwell / substrate contacts Yes Yes (VPNP only) Effect on base resistance Yes (L-NPN only) Yes Yes Yes Yes Yes - - Yes

Table 1: Impact of process and design options on gain and base resistance of L-NPN and V-PNP parasitic bipolar transistors, assuming a p--type substrate.

Fig. 4: Schematic view of a N+/Nwell guardring test structure showing the relevant design rules (A: distance guardring to N+-injector and B: guardring width).

Finally, the injection points can be placed at a larger distance d from the thyristors, which also reduces the current density at the location of the thyristors due to geometrical spreading of the injected carriers. In case of an injecting diffusion with length L, the decrease of the current density with distance d has been calculated for the two extreme cases shown in fig. 5a and 5b. Assuming a 2D-spreading of the injected current, we derive the equations (1) and (2) respectively for the current density reduction factors Fspread.
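The two spreading factors of equations (1) and (2) are easy to evaluate numerically. The Python sketch below is illustrative only: the function names are not from the original work, and the perpendicular case assumes that the circuit sits on the perpendicular bisector of the injector, which is one reading of fig. 5a. Under that assumption, equation (1) is equivalent to asinh(L/2d)/π, which the sketch uses as a cross-check.

```python
import math


def f_spread_perpendicular(d_um: float, length_um: float) -> float:
    """Equation (1): circuit perpendicular to an injector of length L.

    Assumes (hypothetically) that the circuit lies on the perpendicular
    bisector of the injector, at distance d from it.
    """
    theta0 = math.atan(2.0 * d_um / length_um)
    return (math.log(math.tan(math.pi / 4.0))
            - math.log(math.tan(theta0 / 2.0))) / math.pi


def f_spread_parallel(d_um: float, length_um: float) -> float:
    """Equation (2): circuit in line with the injector, at distance d from its end."""
    return math.log((d_um + length_um) / d_um) / (2.0 * math.pi)


if __name__ == "__main__":
    L = 100.0  # injector length in um, the typical ESD protection length used in fig. 6
    for d in (50.0, 200.0, 1000.0):
        perp = f_spread_perpendicular(d, L)
        par = f_spread_parallel(d, L)
        # Cross-check of the closed form for the perpendicular case.
        assert abs(perp - math.asinh(L / (2.0 * d)) / math.pi) < 1e-12
        print(f"d = {d:6.0f} um  perpendicular = {perp:.4f}  parallel = {par:.4f}")
```

Both factors fall off slowly with distance, so spacing alone gives limited protection compared with guardrings.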
Note that the 2D-spreading assumption is a worst-case approach, as in reality near-3D spreading will occur, resulting in even lower values of Fspread.

Fig. 5: Latchup sensitive circuit located a) perpendicular and b) parallel at a distance d from a uniformly injecting diffusion with length L (e.g. an ESD protection).

Circuit located perpendicular to injector with length L:

$$F_{spread} = \frac{1}{\pi}\left[\ln\left(\tan\frac{\pi}{4}\right) - \ln\left(\tan\left(\frac{1}{2}\arctan\frac{2d}{L}\right)\right)\right] \qquad (1)$$

Circuit located parallel to injector with length L:

$$F_{spread} = \frac{1}{2\pi}\,\ln\left(\frac{d+L}{d}\right) \qquad (2)$$

Fig. 6 shows some numerical values for an injector with length L = 100 µm (typical length of an ESD protection that acts as injecting diffusion). We see that there are only minor differences between the two extreme cases for longer distances d and that the decrease of the current density can in general be reasonably approximated as proportional to ln(1+L/d).

Fig. 6: Current density reduction factors Fspread as a function of distance d and orientation (perpendicular or parallel) to a uniformly injecting diffusion with length L equal to 100 µm.

4.3
DESIGN RULE DERIVATION APPROACH
In order to ensure that a product can withstand a specified injection current Iinjection from the product bondpads, the following criterion must be satisfied:

$$J_{trig,p/n} \geq \frac{I_{injection}}{L_{injection}} \cdot F_{escape,p/n} \cdot F_{spread} \quad [\mu A/\mu m] \qquad (3)$$

with Jtrig,p/n the trigger current density of the thyristor in case of hole and electron injection respectively, Linjection the perimeter of the injecting junction(s) connected to the bondpad (typically the ESD-protection), Fescape,p/n the fraction of injected carriers that ‘escaped’ from the guardrings and Fspread the current density reduction factor due to the geometrical spreading of the injected current. Equation (3) holds for both positive and negative pulses (hole injection from P-emitter and electron injection from N-emitter respectively). Typical values for Iinjection and Linjection are ±100 mA (JEDEC latchup qualification requirement [6]) and 200 µm (perimeter of a typical ESD protection) respectively, resulting in a maximum injected current density at the bondpad of ≈ 500 µA/µm. Using guardring efficiency and thyristor trigger current data obtained from latchup test structures, see fig. 7 and 8, one can now determine the maximum allowed distances of Nwell and P+ substrate contacts to the emitter diffusions at minimum n+p+ spacing, as well as the guardring width and distance requirements. Note that data obtained at maximum junction temperature must be used here, as latchup trigger currents are lowest at maximum temperature [1,2,5]. Thus the appropriate design rules can be derived, as will be illustrated using data from the CMOS process B1, see section 4.4.3.

Fig. 7: Schematic view of a latchup test structure to determine effect of Nwell and P+ substrate contact placement on latchup sensitivity.

Fig.
8: Schematic view of a test structure to determine effect of N+/Nwell guardring width and distance to N-emitter on the electron collection efficiency.

4.4 APPLICATION: DESIGN RULE DERIVATION FOR A CMOS PROCESS ON P-/P++ EPITAXIAL SUBSTRATES
4.4.1. Impact P+ substrate contact placement and P+ guardrings
In case of p-/p++ epi substrates, the p++ bulk acts as a low ohmic shunt for the base resistance of the lateral parasitic NPN-transistor otherwise formed by the high ohmic p- epi layer, see fig. 1. This strongly improves P-emitter trigger currents as shown in fig. 9, particularly for n+p+ spacings larger than the epi layer thickness. In that case the injected hole current flow changes from lateral through the p- epi layer to vertical through the p++ bulk.

Fig. 9: P-emitter trigger current of process B2 at 125 °C as a function of n+p+-spacing A (see fig. 1) for various epitaxial layer thicknesses.

Fig. 10 shows that the P+ substrate contact placement is not critical. The reason is that the p++ bulk acts as a kind of equipotential surface that sinks all the injected holes and redistributes them over all available substrate contacts in the layout, see fig. 11. The resistance R1 from a P+ contact through the Pwell and epi layer is in the order of a few kΩ while the substrate spreading resistances Rsub1,2 are less than 100 Ω for distances d2 less than 1 mm. As a result, Fspread is very small (<< 0.1). For the same reason the P+ guardring efficiency is fully determined by the ratio of the P+ guardring area versus the total P+ area in the layout and thus the P+ guardring width and distance to the P+ emitter are not very critical. The thinner the epi layer, the more effective the p++ base shunt and its function as a hole current sink.
Fig. 10: P-emitter trigger current at 125 °C of process A and B1 as a function of P+ substrate contact to N-emitter distance C (see fig. 1) at 4.8 µm n+p+-spacing for various epi layer thicknesses.

Fig. 11: Schematic view of the distribution of injected holes over the various P+ substrate contacts.

4.4.2. Impact Nwell contact placement and N+/Nwell guardrings
Another feature of p-/p++ epi substrates is that the injected electrons are confined to the epi layer due to the built-in potential between the p- epi and p++ substrate and the small minority carrier diffusion length (≈ 1 µm) in the p++ substrate. First, this results in a slight increase of the (Nwell) base resistance of the parasitic PNP transistor and thus actually in a somewhat reduced N-emitter trigger current for thinner epi layers and small n+p+ spacings, see fig. 12.

Fig. 12: N-emitter trigger current at 125 °C of process B2 as a function of n+p+-spacing A (see fig. 1) for various epitaxial layer thicknesses.

Second, N+/Nwell guardrings become very efficient: because the injected electrons are confined to the epi layer, the distance of the N+/Nwell guardring to the injector is irrelevant, see fig. 13.

Fig. 13: N+/Nwell guardring efficiency (escape fraction F) at 125 °C of process B1 and B2 as a function of distance A (see fig. 4) between N+ injector and a 10 µm wide N+/Nwell guardring for various epi layer thicknesses.

Furthermore fig.
14 shows that the collection efficiency improves exponentially with the guardring width, i.e. the escape fraction falls along a straight line on a logarithmic scale. This can be understood using fig. 15, where we divided the p- epi layer between Nwell bottom and p++ bulk in squares. The probability that an electron on its ‘random walk’ diffusion path passes such a square is about 1/e (e ≈ 2.7), because the electron recombines or is collected as soon as it hits the p++ bulk or Nwell respectively. The number of escaped electrons thus decreases exponentially with the number of squares and hence with the guardring width. Here, too, thinner epi layers are beneficial, as they improve the electron confinement (i.e. increase the number of squares for a constant guardring width).

Fig. 14: N+/Nwell guardring efficiency (escape fraction F) at 125 °C of process B1 and B2 as a function of guardring width B (see fig. 4) for various epi layer thicknesses.

Fig. 15: Escape fraction of electrons decreases exponentially with the number of p- epi squares and thus N+/Nwell guardring width. The electron pass probability per p- epi square is ≈ 1/e.

Finally fig. 16 shows the effect of the placement of Nwell contacts on the N-emitter trigger current for various epi layer thicknesses. As the Nwell contact to P-emitter spacing directly determines the base resistance of the parasitic PNP transistor, it has a strong effect on the trigger current.

Fig.
16: N-emitter trigger current at 125 °C of process A and B1 as a function of Nwell contact to P-emitter distance B (see fig. 1) at 4.8 µm n+p+-spacing for various epi layer thicknesses.

4.4.3 Design rules for CMOS process B1
Design rules can now be derived from the above data, taking the data of process B1 as an example. In case of hole injection (positive trigger currents), fig. 10 shows that for process B1 with a 7.5 µm p- epi layer Jtrig,p = 190 µA/µm at the 4.8 µm minimum n+p+ spacing. As Fspread << 0.1, see section 4.4.1, the criterion in equation (3) is thus easily met and the substrate contact and P+ guardring design rules can be very relaxed (e.g. one substrate contact for every 200 µm and use of minimum p+-width for the P+ guardring). In case of electron injection (negative trigger currents), fig. 14 shows that for process B1 with a 7.5 µm p- epi layer, a 4 µm wide N+/Nwell guardring results in Fescape = 5.5⋅10^-4. Assuming Fspread = 1 (very conservative, see fig. 6) we find that Jtrig,n should be ≥ 500 × 5.5⋅10^-4 × 1 = 0.28 µA/µm. One can now determine the maximum allowable Nwell contact to P-emitter spacing from fig. 16. Using a design rule of 100 µm, we find Jtrig,n = 3.5 µA/µm, which still provides a safety margin of about a factor of 12. The above demonstrates that CMOS processes on epi substrates are very latchup robust provided proper design rules are used.

4.5 CONCLUSIONS
Using a dedicated set of test structures, the latchup susceptibility of a number of submicron CMOS processes on p-/p++ epitaxial substrates has been characterized as a function of n+p+-spacing, placement of Nwell and substrate contacts, and guardring width and distance to the injecting junction. Subsequently it has been shown for the first time how these data can be translated into latchup design rules taking into account the geometrical spreading of the injected carriers.
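The electron-injection example of section 4.4.3 can be retraced with a few lines of arithmetic. The Python sketch below restates criterion (3) using only the values quoted in the text (100 mA, 200 µm, Fescape = 5.5⋅10^-4, Fspread = 1, Jtrig,n = 3.5 µA/µm). The function and variable names are illustrative and not part of the original work.

```python
def required_trigger_density(i_injection_ma: float,
                             l_injection_um: float,
                             f_escape: float,
                             f_spread: float) -> float:
    """Right-hand side of criterion (3), returned in uA/um."""
    # Injected current density at the bondpad, converted from mA/um to uA/um.
    injected_density = i_injection_ma * 1000.0 / l_injection_um
    return injected_density * f_escape * f_spread


# Values quoted in the text for process B1, electron injection.
required = required_trigger_density(
    i_injection_ma=100.0,   # JEDEC latchup qualification current
    l_injection_um=200.0,   # perimeter of a typical ESD protection
    f_escape=5.5e-4,        # 4 um wide N+/Nwell guardring
    f_spread=1.0,           # very conservative: no geometrical spreading
)
measured = 3.5              # Jtrig,n in uA/um at 100 um Nwell contact spacing
margin = measured / required

print(f"required Jtrig,n >= {required:.3f} uA/um, safety margin = {margin:.1f}x")
```

Written this way, the trade-off is explicit: any Fspread below 1, or a wider guardring with a smaller Fescape, increases the margin in direct proportion.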
It is demonstrated that this approach results in very latchup robust products in case of p-/p++ epitaxial substrates, thus eliminating the need for time-consuming and expensive ‘trial-and-error’ design optimisation cycles. The method is also applicable to processes on non-epitaxial substrates.

4.6 REFERENCES
[1] R.R. Troutman, ‘Latchup in CMOS technology’, Kluwer Academic Publishers, Boston, (1986)
[2] E.A. Amerasekera, ‘Failure mechanisms in semiconductor devices’, Chapter 3, Wiley, Chichester, (1997)
[3] E.A. Amerasekera, ‘Designing latchup robustness in a 0.35 µm technology’, Proceedings International Reliability Physics Symposium, pp. 280-285, (1994)
[4] M.J. Chen, S.S. Ho, P.N. Tseng, R.Y. Shiue, H.S. Lee, J.H. Chen, J.K. Jeng, Y.N. Jou, ‘A compact model of holding voltage for latchup in epitaxial CMOS’, Proceedings International Reliability Physics Symposium, pp. 339-345, (1997)
[5] T. Aoki, ‘A discussion on the temperature dependence of latchup trigger current in CMOS/BiCMOS structures’, IEEE Transactions on Electron Devices, ED-40, pp. 2023-2028, (1993)
[6] JEDEC Specification
[7] J.A. van der Pol, P.B.M. Wolbert, ‘Systematic derivation of latchup design rules for submicron CMOS processes from test structures’, Microelectronics & Reliability, pp. 1051-1056, (1998)

5 Short Loop Monitoring of Metal Stepcoverage by Simple Electrical Measurements [12]

5.1 Introduction
5.2 Electrical assessment of metal stepcoverage
5.3 Design rule verification for (non-)planarized bipolar processes
5.4 Effect of metal stepcoverage on electromigration lifetime
5.5 Design rule verification for a non-planarized BiCMOS process
5.6 Process split evaluation and shortloop equipment monitoring
5.7 Metal stepcoverage wafermaps
5.8 Summary and conclusions
5.9 References

5.1 INTRODUCTION
Metal stepcoverage is a key factor determining metallization reliability. Ample data [1-7] show that reduced stepcoverage seriously affects the electromigration resistance of metal lines.
Several papers [4-7] have shown that the reduction in electromigration lifetime is larger than can be predicted from cross section reduction alone. The quantitative effect of metal stepcoverage on electromigration lifetime however remains complicated and appears to depend on the total number of steps in the line [7], the spacing between steps [3], the passivation system [6], the line width and step frequency [4] and obviously the specific topography under study. Many of these effects have been primarily attributed to the reduction of the grain size of the aluminum alloy on steps compared to flat topography [4,7]. Smaller grains increase the number of triple points, which are the points of flux divergence. However, temperature gradients, grain size orientation and mechanical stress effects also play an important role [4-7]. Via hole electromigration results reported in [8], for example, show that the electromigration resistance of anisotropically etched vias is better than that of tapered vias despite a much better metal stepcoverage in case of the latter. This was attributed to a larger tensile mechanical stress exerted by the passivation in case of the tapered via, increasing the diffusivity of the aluminum atoms.

Traditionally, metal stepcoverage is determined by making ‘Schliffs’ or Focused Ion Beam (FIB) cross sections on a limited number of locations, e.g. 5, on a wafer. This method has, however, a number of drawbacks and limitations. Firstly, variations in metal stepcoverage across the wafer can hardly be assessed. Secondly, in case of non-planarized processes, topography strongly increases if two separate steps coincide due to alignment variations between two masks. This can result in a significant reduction of the metal stepcoverage. The probability of detecting such worst case steps with the traditional method is quite low because these coinciding steps typically only occur at worst case 4σ mask misalignment.
Finally, the cross section method is very time consuming and thus costly. In this paper an alternative method to overcome the above limitations is described. It measures the metal stepcoverage electrically and is suited for design rule verification, evaluation of process splits, shortloop monitoring of metal deposition equipment and dielectric planarization processes, the generation of metal stepcoverage wafermaps, wafer release of production wafers (screening-out of weak parts) and evaluation of the correlation between metal stepcoverage and electromigration resistance. Results will be shown of the application of this method on BiCMOS and Bipolar processes with different metallization systems. Based on this work the method has also been applied to study stepcoverage issues in a TiW/AlSiTi metallization system [11].

5.2 ELECTRICAL ASSESSMENT OF METAL STEPCOVERAGE
5.2.1 Test structures
The test structures consist of metal line meanders over oxide/polysilicon/metal topography and flat surfaces, both in X- and Y-direction, see fig. 1. In one module the meanders cross more than 150 steps. These steps make up about 30% of the total line length of 2250 µm. As shown in fig. 2, the metal lines have no metal overlap over the sides of the contacts. This prevents the creation of a resistance path along the sides of the contacts that would shunt the resistance within the contacts, where the topography is more severe. By varying the distance between two masks the topography can be varied and also worst case coinciding steps can be created. The sensitivity of the structures can be increased by enlarging the number of steps within a given line length. The test structures enable resistance measurements and detection of shorts between lines and underlying metal or polysilicon. To ensure accurate resistance measurements when using probe needles, the lines are contacted via Kelvin contacts to the bondpads.
On one die about 160 different stepcoverage modules with all kinds of topography variations are present (totaling about 70 mm²), covering critical steps in BiCMOS, Bipolar and high voltage BCD processes. These processes involve both planarized and non-planarized metallization systems. Line width of the meanders varies between 4 µm and 8 µm depending on the metallization process. The 4σ alignment accuracy of the stepper used for processing of the wafers in this study equals 0.5 µm.

Fig. 1: Schematic view of a stepcoverage test structure for a BiCMOS process showing one metal line meander crossing 204 polysilicon (PS) and contact-to-silicon (CO) defined steps (details see fig. 2) and one metal line meander without topography.

Fig. 2: Schematic view of part of a stepcoverage test structure for the BiCMOS process in fig. 1 showing a metal1 line crossing polysilicon (PS) gate and contact-to-silicon (CO) defined topography. The CO-PS spacing is varied in different stepcoverage modules.

About 100 die are placed on a 5 inch wafer, allowing the assessment of metal stepcoverage over the complete wafer and thus the generation of stepcoverage wafermaps. All processes can be covered with one dedicated metal stepcoverage maskset and process specific shortloop flowcharts.

5.2.2 Measurement method
The measurement method is based on the comparison of the resistance of a metal line meander over topography and of a meander on a flat surface, see fig. 1. The use of Kelvin contacts ensures accurate resistance measurements. By taking the ratio of the two resistances, the effects of line width and sheet resistance variations can be eliminated.
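The cancellation that makes the ratio useful can be illustrated with a simplified model. The Python sketch below assumes that a meander resistance is the sheet resistance times the number of squares (length over width), multiplied by a single topography factor for the meander over steps. This model, the function names and the sheet resistance, width and topography factor values are illustrative assumptions, not measured data from this work. Only the 2250 µm line length is taken from the text.

```python
def meander_resistance(sheet_ohm_sq: float,
                       length_um: float,
                       width_um: float,
                       topography_factor: float = 1.0) -> float:
    """Resistance of a meander: sheet resistance x squares x topography factor.

    topography_factor = 1.0 models a flat meander; values above 1.0 model the
    extra resistance caused by thinned metal on steps (hypothetical model).
    """
    return sheet_ohm_sq * (length_um / width_um) * topography_factor


def resistance_ratio(r_steps_ohm: float, r_flat_ohm: float) -> float:
    """Ratio of the meander over topography to the flat reference meander."""
    if r_flat_ohm <= 0.0:
        raise ValueError("flat meander resistance must be positive")
    return r_steps_ohm / r_flat_ohm


LENGTH_UM = 2250.0   # total line length quoted in the text
STEP_FACTOR = 1.15   # hypothetical topography factor

# Two hypothetical wafers with different sheet resistance and line width.
ratios = []
for sheet, width in ((0.030, 6.0), (0.036, 5.5)):
    r_flat = meander_resistance(sheet, LENGTH_UM, width)
    r_steps = meander_resistance(sheet, LENGTH_UM, width, STEP_FACTOR)
    ratios.append(resistance_ratio(r_steps, r_flat))

print(ratios)  # both ratios reduce to the topography factor
```

Because both meanders on a module share the same metal layer and drawn width, the common factors drop out and only the step contribution remains in the ratio.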
The effects of lithography induced local line width variations over steps can be neglected because the line width of the test structures is about 6 µm, which is well above the resolution of the stepper used. Correlations between the resistance ratio and actual metal stepcoverage percentage can be made easily because the metal lines do not overlap the sides of the contacts. This enables the assessment of the metal stepcoverage percentage by a fast and simple topview SEM inspection after removal of just the passivation and intermetal dielectric layers, see fig. 3. Resistance measurements are done on a Keithley 450 parametric tester.

Fig. 3: Topside SEM inspection of the stepcoverage of a metal2 line in a bipolar process crossing topography defined by metal1 (IN) and shallow-n diffusion (SN) oxide steps. SN-IN spacing equals 4 µm.

Using the above described test structures and measurement method, the drawbacks of the traditional method can be overcome. The new method is fast and allows evaluation of metal stepcoverage on all possible topography since all distances between two masks are present in different modules (including those with worst case coinciding steps). Wafermaps of metal stepcoverage can be made easily and electromigration measurements can be performed as a function of the stepcoverage percentage. Furthermore, the method is suitable for short loop equipment monitoring and process split evaluation and can be applied to any process technology ranging from high power bipolar to submicron CMOS. The electrical stepcoverage characterization method has been applied to planarized and non-planarized BiCMOS and high power bipolar processes with different metallization systems. The results are shown and discussed in the next sections.

5.3 DESIGN RULE VERIFICATION FOR BIPOLAR PROCESSES
The above method has been used for design rule verification on two 3 µm high power bipolar processes that only differ in their metallization system.
In the first process pattern definition is done by wet etching of aluminum, while in the second process the aluminum is anodized (Al converted to Al2O3). The latter process thus has no metal1 steps and is planarized, in contrast to the non-planarized wet-etch technology. Metal1 (IN) and metal2 (IN2) thickness equal 1.0 µm and 2.0 µm respectively and metal composition is AlSi(1%)Cu(0.04%). Metal2 is sputtered at 200 °C and 350 °C for the anodized and wet etched technology respectively. The intermetal dielectric consists of a 0.9 µm thick plasma deposited Si3N4 layer. In both technologies the shallow-n (SN) emitter is diffused from a phosphorous oxide slurry (no implantations used). This creates an oxide step of about 0.7 µm height at the edge of the SN diffusion. A critical aspect of the metallization system is the metal2 stepcoverage over metal1 as a function of the spacing between the metal1 step and the SN oxide step (IN-SN spacing), see fig. 4. If the metal1 and SN step coincide, topography increases to 1.7 µm. The stepcoverage maskset contains modules in which the IN-SN spacing is varied, including worst case coinciding steps. In this way the minimum allowable spacing to guarantee good stepcoverage can be determined. The IN2 line width equals 8 µm.

Fig. 4: Schematic cross section of the non-planarized bipolar process with etched aluminum showing the topography below metal2 caused by metal1 and SN diffusion oxide steps.

Fig. 5 and 6 show the metal2 stepcoverage (represented by the resistance ratio between metal lines over steps and flat lines) as a function of the IN-SN spacing for the non-planarized (A: etched aluminum) and the planarized (B: anodized aluminum) bipolar process.
Fig. 5: Resistance ratio between metal lines over steps and flat lines as a function of the metal1 to SN oxide step spacing for a non-planarized (A) and a planarized (B) bipolar process, three wafers each. The steps coincide in case the SN-IN spacing equals 0 µm.

Fig. 6: Normal distribution plot of the 96 resistance ratios over the wafer for SN-IN spacings of 0 µm and 4 µm and for both bipolar processes. Spread across the wafer is clearly less than differences between variants.

Fig. 5 shows the mean value of the ratios of the 96 modules on the wafer, while the variation of the ratio across the wafer can be seen from the normal distribution plots of the ratios in fig. 6. Not surprisingly, the non-planarized process shows a 6% higher resistance ratio in fig. 5. However, for coinciding IN-SN steps the ratio increases by another 15%, which points to a degraded metal2 stepcoverage. SEM pictures of this step in fig. 7 show that this is indeed the case; the metal2 stepcoverage is only 10 to 15%. Fig. 7 also shows SEM pictures for other SN-IN spacings. These can be used to calibrate the resistance ratio. Using fig. 5, design rules for guaranteeing sufficient metal stepcoverage can be easily generated. In this case, under worst case misalignment conditions, the IN-SN spacing should be larger than 0.5 µm.

Fig. 7: Topside SEM inspections of the non-planarized bipolar process after removal of the top layers showing the metal2 stepcoverage over metal1 and SN oxide steps for four different IN-SN spacings, namely -4.0 µm (a), -1.0 µm (b), -0.5 µm (c) and 0.0 µm (d). Deterioration of the stepcoverage with decreasing spacing can be clearly seen; the worst case occurs at a SN-IN spacing of 0 µm.
5.4 EFFECT OF METAL STEPCOVERAGE ON ELECTROMIGRATION LIFETIME
Electromigration experiments have been carried out on the above described non-planarized bipolar technology to determine the effect of reduced metal stepcoverage on electromigration lifetime. Average aluminum grain size as measured on bondpads equals 1.8 µm. Fig. 8 shows the lifetime results of 8 µm wide metal2 meanders over flat surfaces (having 100% stepcoverage) and over coinciding SN-IN steps (worst case topography, 10% to 15% stepcoverage) for two stress conditions in a lognormal distribution plot. A 40% resistance increase was used as the failure criterion. Surprisingly, the MTTF reduction of lines over steps is only about 35% and thus much less than could be anticipated based on the aluminum cross section reduction, although the spread is somewhat larger. Extrapolation to normal use conditions shows that the electromigration lifetime in case of a SN-IN spacing of 0.0 µm is still acceptable.

Fig. 8: Lognormal distribution plot of the electromigration lifetime of metal2 lines in case of no (100% stepcoverage) and worst case (15% stepcoverage) underlying topography.

Failure analysis shows that the opens on the test structures with worst case steps do not preferentially occur on the steps, as might be expected from the cross section reduction and the presence of smaller aluminum grains at the steps [4,7,10]. Apart from at steps, opens are also found on the flat parts of the metal2 line in-between the steps, see fig. 9. Fig. 9 also shows that in this case the opens occur both on top of the IN line and between the IN lines. A similar failure mode was found in [3] and attributed to temperature gradient effects. This hypothesis was however not unambiguously proven, and we also do not yet understand why the failures do not preferentially occur on the steps.
Another contributing factor may be the fact that the distance between the two steps in our structures is only 14 µm. This is of the same order as the Blech length [9] of the metallization system, below which void generation due to electromigration is counteracted by backflow of aluminum due to the electromigration induced mechanical stress and/or atomic concentration gradients.

Fig. 9: SEM inspection (panels a, b and c) of the failure location after electromigration stress of a metal2 line over worst case topography (15% stepcoverage) in the non-planarized bipolar process.

The larger spread in failure times in case of lines over steps is most likely caused by variations in metal stepcoverage over the wafer and thus between the various Devices Under Test in our study. Fig. 6 shows that for coinciding SN-IN steps the spread in the resistance ratio over the wafer is about 0.04, which is about the same as the difference between the mean ratios of the modules with a SN-IN spacing of 0.0 µm and 0.5 µm. According to fig. 7 these modules have a worst case stepcoverage of 10% and 25% respectively. Based on this we estimate that due to small SN-IN alignment variations across the wafer, the stepcoverage for modules with coinciding SN-IN steps ranges from 10% to 25% over the complete wafer. The explanation given in [10], larger spread due to differences in line length, does not hold for our case as all line lengths are equal.

5.5 DESIGN RULE VERIFICATION FOR A NON-PLANARIZED BICMOS PROCESS
In non-planarized (Bi)CMOS processes a critical issue is the spacing between polysilicon (PS) gates and the contacts-to-silicon (CO), as it determines both the occurrence of metal1 to PS shorts and the metal1 stepcoverage in CO-contacts, see fig. 10. In this specific 1.5 µm BiCMOS technology the polysilicon and PS-metal1 dielectric (TEOS) thickness are both 0.5 µm and the metal1 thickness equals 0.6 µm.
Metal1 composition is AlSi(1%)Cu(0.04%) and it is sputtered at 200°C. The CO-contacts are wet etched. The stepcoverage maskset contains modules in which the CO-PS distance is varied (see fig. 1) including worst case coinciding steps and modules that should result in shorts. In this way the minimum allowable spacing to guarantee good stepcoverage and good IN-PS isolation can be determined.

Fig. 10: Schematic cross section of the non-planarized BiCMOS process showing the topography below metal1 caused by PS gates and CO-contacts.

Fig. 11 shows the metal1 stepcoverage as a function of the CO-PS spacing. For small CO-PS spacings many of the 96 modules on the wafer show very high resistance ratio values. Assuming that resistance ratios larger than 10 are opens, fig. 11a shows the percentage opens as a function of the CO-PS spacing. In fig. 11b boxplots of the resistance ratio for the remaining non-open modules are shown as a function of the CO-PS spacing.

Fig. 11: Metal1 stepcoverage over PS gates and neighbouring CO-contacts as a function of the CO-PS spacing for the non-planarized BiCMOS process. The stepcoverage is depicted by the percentage of modules showing opens (resistance ratio >10) (a) and boxplots of the resistance ratio between lines over steps and flat lines of the non-open modules (b). The 5, 25, 50, 75 and 95 percentile points are shown. Data are given for two wafers.

Fig. 12 shows the percentage of metal1-PS shorts as a function of the CO-PS spacing. SEM pictures of FIB cross sections can be seen in fig. 13 for various CO-PS spacings.
These cross sections clearly support the electrical measurements. Fig. 13b shows for example that for a CO-PS spacing of 1.0 µm the metal1 stepcoverage is reduced to zero (thus effectively an open) while fig. 13c clearly shows an IN-PS short at a 0.5 µm CO-PS spacing.

Fig. 12: Percentage of metal1 to polysilicon shorts as a function of the CO to PS spacing for the non-planarized BiCMOS process. Data are given for two wafers.

Fig. 13: SEM inspections of the non-planarized BiCMOS process showing metal1 stepcoverage over PS and CO topography and metal1 to PS shorts for four different CO-PS spacings namely 1.5 µm (a), 1.0 µm (b), 0.5 µm (c) and 0.0 µm (d). Opens and shorts occur for a CO-PS spacing smaller than 1.75 µm.

It can be concluded from fig. 11 and fig. 12 that for an acceptable metal1 stepcoverage and prevention of metal1 to polysilicon shorts, the CO-PS spacing under worst case misalignment conditions must be larger than 1.75 µm. Just using the cross section data would probably have resulted in too optimistic design rules.

5.6 PROCESS SPLIT EVALUATION AND SHORT LOOP EQUIPMENT MONITORING

Another application of the method has been the evaluation of process splits in order to optimize metal1 stepcoverage in contacts-to-silicon (CO) for the planarised bipolar technology with anodized aluminum. Details of the metallization system have been listed earlier in this section. A critical issue in this technology is the metal1 stepcoverage in CO-contacts in case the CO-contact edge coincides with the aforementioned SN oxide step, see fig. 14.
The stepcoverage problem is in this case aggravated by the fact that the top of the thermal oxide consists of an about 100 nm thick phosphorous oxide (P2O5) layer that originates from the phosphorous slurry from which the SN emitter was diffused. During wet CO contact etch this P2O5 layer (which is covered by photo resist) is slightly underetched (see fig. 16), resulting in a slight negative slope at the very top of the CO-contact. This protrusion is smoothed by a subsequent HF-dip before metal deposition but is not completely removed, as can be seen in fig. 16. Especially if the CO-contact edge coincides with the SN oxide step the protrusion can be very pronounced. In combination with the low aluminum sputtering temperature (100°C) this may give rise to stepcoverage problems. Coinciding SN and CO steps only occur at 4σ mask misalignment. It is therefore almost impossible to evaluate process splits using normal production wafers or test chips.

Fig. 14: Schematic cross section of the planarized bipolar process with anodised aluminum showing the topography below metal1 caused by CO contacts and SN-diffusion oxide steps.

The stepcoverage maskset has been used in metal1 stepcoverage optimization experiments. In these experiments, the bias voltage during aluminum sputtering (0 V and 200 V), the lifetime of the sputter targets (right after or just before target replacement) and the use of the HF-dip of the CO-contact before aluminum sputtering were varied. Fig. 15 shows the metal1 stepcoverage as a function of the SN-CO spacing. Again both the percentage of opens (96 datapoints per wafer) and the mean and sigma of the resistance ratio of the non-open modules are shown. Fig. 15 clearly reveals that wafers sputtered directly after metal target change in the metal deposition equipment suffer from very poor stepcoverage of metal1 in the CO-contact. Wafers sputtered near the end of the target life show good stepcoverage. This is confirmed by fig. 16 showing SEM pictures of FIB cross sections of metal1 over coinciding CO and SN steps for both cases.

Fig. 15: Metal1 stepcoverage over SN oxide steps and CO-contacts as a function of the SN to CO spacing for the planarized bipolar process. The stepcoverage is depicted by the percentage of modules showing opens (resistance ratio > 10) (a) and the mean (b) and sigma (c) of the resistance ratio between metal lines over steps and flat lines of the non-open modules. The eight curves cover all combinations of target life (begin or end), sputter bias (yes or no) and HF dip (yes or no).

The fact that stepcoverage is poor in case of a ‘virgin’ target is probably due to the fact that oxygen atoms and contaminants that are present on the surface of the new target are also sputtered onto the wafer. These contaminants limit the mobility of the aluminum atoms on the wafer surface, thereby hampering lateral diffusion of the aluminum atoms and thus degrading metal stepcoverage.
The above justifies the existing practice in the waferfab where first several runs with dummy wafers (corresponding to a few hundred µm of aluminum) have to be sputtered before starting metal deposition on production wafers. Note that the other process variants (presence of sputter bias and HF-dip) have only a minor influence on the metal1 stepcoverage.

The stepcoverage maskset is reasonably well suited for short loop monitoring of critical processes on metal deposition equipment. A short loop typically consists of 4 masks and has a throughput time of about one week. To ensure good stepcoverage in volume production, modules with worst case topography (like the one with coinciding SN- and CO-edges in case of the bipolar process with anodized aluminum) can also be included in a scribelane or drop-in Process Control Module (PCM) for wafer release purposes and stepcoverage monitoring on production wafers. In this way wafers containing weak parts can be screened out. Fig. 17 shows an example of an SPC Control Chart containing data from over 700 batches of the resistance ratio of a metal2 line over coinciding IN-SN steps and over a flat surface for the non-planarized bipolar process described in section 5.3.1. Only one Out-Of-Control (OOC) point, caused by a scratch on the module, is observed, demonstrating that the metal stepcoverage is well controlled in the waferfab.

Fig. 16: SEM inspections of metal1 stepcoverage in CO-contacts coinciding with SN oxide steps in case of wafers sputtered a) directly after metal target change and b) after sputtering several runs of dummy wafers.

[Chart: ratio metal2 over steps (axis 0.9 to 1.3) versus date, November 1997 to February 1999; series: average, LCL, target, UCL.]

Fig. 17: SPC Control Chart of the resistance ratio of a metal2 line over coinciding IN-SN steps and over a flat surface for the non-planarised bipolar process showing data from 770 batches (ratio equals 1.12±0.025).

The method is also suited for (planarised) submicron MOS technologies, especially for monitoring of metal stepcoverage in contacts-to-silicon and intermetal vias in processes without W-plugs.

5.7 METAL STEPCOVERAGE WAFERMAPS

In addition to fig. 15, for the bipolar process with anodized aluminum the variation of the metal1 stepcoverage across the wafer has also been evaluated in case of coinciding CO- and SN-steps (see previous section). Fig. 18 shows wafermaps for wafers sputtered both directly after metal target change and sputtered after processing of several runs with dummy wafers. These data once again demonstrate the need to sputter dummy wafers after metal target change but furthermore show that the stepcoverage can vary significantly across the wafer. Areas with good stepcoverage are found adjacent to areas with complete opens. This clearly points at the limitations of the cross sectioning method and the importance of stepcoverage wafermaps.

5.8 SUMMARY AND CONCLUSIONS

A novel method has been developed capable of assessing metal stepcoverage by simple electrical measurements. The metal stepcoverage is represented by the resistance ratio of metal lines over (worst case) topography and metal lines over flat surfaces. The resistance ratio correlates well to the metal stepcoverage percentage as measured by topview SEM inspections after stripping passivation and intermetal dielectric layers. The metal stepcoverage test chip and measurement method have been successfully applied to the verification of metal stepcoverage related design rules in various technologies, the evaluation of process splits and short loop monitoring of metal deposition equipment. The method is also suited for short loop monitoring of dielectric planarization processes.
The effect of stepcoverage on electromigration lifetime has been found to be very limited. Open failures occur primarily between the steps and not at the steps, which is not yet understood. Metal stepcoverage appears to be strongly dependent on the lifetime of the metal sputter target; several runs with dummy wafers are necessary after metal target change to guarantee good stepcoverage. Metal stepcoverage wafermaps show that stepcoverage can vary strongly across a wafer, demonstrating the limitations of the traditional cross sectioning method. The new electrical method has been shown to overcome the drawbacks of this method. A worst case test structure can be included in drop-in or scribelane Process Control Modules to enable stepcoverage monitoring on production wafers and for wafer release purposes. In this way wafers containing weak parts can be screened out. The test structure can also be used for lifetesting but it should be noted that the correlation between test structure lifetime and product lifetime in general is not straightforward [4,7]. The method is also applicable to submicron MOS process technologies.

[Wafermaps (a) and (b): resistance ratio (axis 0 to 4) versus X-position and Y-position on the wafer.]

Fig. 18: Wafermaps of the resistance ratio between metal lines over worst case steps and flat lines for wafers sputtered directly after metal target change (a) and sputtered after processing of several runs with dummy wafers (b), clearly demonstrating the need to sputter dummy wafers after metal target change. From (a) also the large variation of stepcoverage across the wafer can be seen.

5.9 REFERENCES

[1] K.A. Danso, L. Tullos, ‘Thin film metallization studies and lifetime prediction using Al-Si and Al-Cu-Si conductor test bars’, Microelectronics & Reliability, vol. 21, no. 4, pp. 513-527, (1981)
[2] A. Wild, M. Triantafyllou, ‘Electromigration on oxide steps’, Microelectronics & Reliability, vol. 28, no. 2, pp. 243-255, (1988)
[3] A.S. Oates, ‘Step spacing effects on electromigration’, Proceedings International Reliability Physics Symposium, pp. 20-24, (1990)
[4] Y.E. Strausser, B.L. Euzent, R.C. Smith, B.M. Tracy, K. Wu, ‘The effect of metal film topography and lithography on grain size distributions and on electromigration performance’, Proceedings International Reliability Physics Symposium, pp. 140-144, (1987)
[5] L. Kisselgof, L.J. Elliott, J.J. Maziarz, J.R. Lloyd, ‘Electromigration lifetime and step coverage in Al/Cu/Si thin film conductors’, Materials Reliability Issues in Microelectronics Symposium, April 30-May 3, Anaheim, pp. 107-112, (1991)
[6] L. Ferlazzo, G. Lormand, G. Reimbold, ‘Passivation effects on step AlCu/TiN line electromigration performance’, Microelectronic Engineering, vol. 15, no. 1-4, pp. 487-490, (1991)
[7] T. Nogami, S. Oka, K. Naganuma, T. Nakata, C. Maeda, O. Haida, ‘Electromigration lifetime as a function of line length or step number’, Proceedings International Reliability Physics Symposium, pp. 366-372, (1992)
[8] H. Nishimura, Y. Okuda, K. Yano, ‘Dependence of electromigration lifetime for via chains on slope angles of via holes’, Journal Electrochemical Society, vol. 142, no. 10, pp. 3565-3569, (1995)
[9] I.A. Blech, C. Herring, ‘Stress generation by electromigration’, Applied Physics Letters, vol. 29, p. 131, (1976)
[10] J.S. May, ‘Electromigration characteristics of vias in Ti:W/Al-Cu(2wt%) multilayered metallization’, Proceedings IRPS, pp. 91-96, (1991)
[11] E.A. Schönbächler, ‘Electromigration behavior of a multi-layer metallisation’, Thesis ETH Zurich, Series in Micro-Electronics, Volume 14, Hartung-Gorre Verlag, Konstanz, (1998)
[12] J.A. van der Pol, E.R. Ooms, H.T. Brugman, ‘Short loop monitoring of metal stepcoverage by simple electrical measurements’, Proceedings International Reliability Physics Symposium, pp. 148-155, (1996)

6 Relation Between Yield And Reliability of Integrated Circuits and Application to Failure Rate Assessment and Reduction in the One Digit FIT and PPM Reliability Era [10]

6.1 Introduction
6.2 Yield as a reliability indicator
6.3 Experimental results
6.3.1 Relation between yield and line fall-off
6.3.2 Relation between line fall-off and field returns
6.3.3 Relation between yield and burn-in reject rate
6.3.4 Relation between burn-in and High Temperature Operating Life (HTOL) failure rate
6.4 Failure rate prediction and assessment
6.5 Options for failure rate reduction
6.5.1 Yield improvement
6.5.2 Elimination of special causes (‘maverick’ batches)
6.5.3 Screening of weak parts with latent defects during product test
6.6 Conclusions
6.7 References

6.1 INTRODUCTION

In the present 1 digit PPM and FIT reliability era, the assessment of the actual Early Failure Rate (EFR) level of a product, let alone the demonstration of its improvement, has become problematic due to the lack of sufficient reliability fails during life tests. Although the problem can be alleviated by increasing the number of products in lifetest [1], this is often prohibited by economic constraints and lengthening feedback loops. Thus it is hard for a waferfab to obtain statistically significant data that can be used for definition and implementation of improvement actions aimed at continuous reduction of the EFR and line fall-off PPM level of its products.
Line fall-off is defined as the number of devices that fail after printed circuit board assembly and test at the customer’s site (e.g. television manufacturer). Field returns are devices that fail at the end-customer. The number of line fall-off failures and field returns is much larger than the number of lifetest failures and could in principle also be used for definition of improvement actions. However detection of problems at the customer conflicts with customer satisfaction principles and thus is highly unwanted. Furthermore this approach also results in unacceptably long feedback loops of 3 to 6 months and practice has proven that it is hard to obtain reliable data concerning field returns from the user community. Consequently there is a need for another source of data that can be used to drive failure rates down. In this paper it is demonstrated that electrical sort (E-sort) yield data can be used for this purpose. Correlations between yield failures and reliability failures (line fall-off, field returns, burn-in and EFR rejects) will be shown. Based on an assumption of a relation between yield and reliability defect densities a model will be introduced and compared with the data. We shall discuss and demonstrate the use of this model to assess the EFR, to implement EFR reduction programs in a waferfab and to reduce the EFR by screening techniques at E-sort testing. 6.2 YIELD AS A RELIABILITY INDICATOR In today’s products and processes, wear-out failures during operational life are virtually eliminated due to the adoption of ‘wafer level reliability‘ and ‘build-in reliability’ techniques during process development [1] and the use of reliability related design rules and reliability simulation techniques during product design. Consequently, most product reliability failures are random failures due to processing incidents (‘special causes’) and process defect density (’normal causes’). 
Thus the rootcauses of most reliability failures should be the same as the rootcauses of zero hour failures, i.e. yield failures. Larger defects will cause yield failures while smaller defects show up as reliability failures (or not at all). The data in fig. 1 show that this is indeed the case. Note that package, EOS (electrical overstress) and test program related failures were not taken into account. Fig. 1 shows that failures due to particles and patterning defects are dominant. It is inevitable that such defects are introduced during wafer processing. Similar results are obtained for CMOS products [3,4] as well as from theoretical considerations [2,5]. Consequently, there should be a strong, and thus measurable, relation between the number of failures in the field and in lifetest and the yield. There exist only a few papers on this relation [2,3,5,6,7,11] and, apart from [4], the amount of experimental data is often very limited.

In [4] a model was developed for the yield-reliability relation based on the assumption that the density of smaller reliability defects, which actually cause a product field failure, is a fraction of the density of the larger defects which cause yield failures, see equation (1). Here Dy is the yield defect density, Dr the reliability defect density and α the yield to reliability ratio. From (1), equation (2) can be derived for the relation between the fraction of returned devices R and the batch yield Y [4]. This result is similar to that in [2]. Here M is the maximum possible yield fraction and allows for clustering effects and edge exclusions. M will typically exceed 90% for commercial processes.

[Pie charts (a), (b) and (c); rootcause categories: particles, patterning, scratches, pinholes, other, unknown.]

Fig. 1: Rootcause pareto of (a) yield, (b) line fall-off and (c) high temperature lifetest failures of a variety of products from a bipolar / BICMOS waferfab.

$$D_r = \alpha \times D_y \quad [\mathrm{cm}^{-2}] \qquad (1)$$

$$R = 1 - \left(\frac{Y}{M}\right)^{\alpha} \qquad (2)$$

Equation (2) is a two parameter model (α and M) linking the failure fraction R to the yield Y and is applicable for a complete integrated circuit and independent of the die size of a product; the lower yield of large die accounts for the higher failure rate. As M typically is larger than 90%, the yield to reliability ratio α is the dominant factor in equation (2).

6.3 EXPERIMENTAL RESULTS

In our study, four mixed-signal IC’s and one SRAM running in high volumes were selected. Devices were fabricated in a 1.0 µm and 1.2 µm CMOS process, a 1.5 µm BICMOS process and two 3.0 µm Bipolar processes with different metallisation systems. The devices came from two waferfabs. For the mixed-signal devices the batch yield, the line fall-off fraction (failing after assembly on a PCB) and the number of field returns (coming from the end users) were studied, while for the SRAM the batch yield, burn-in reject rate (24 hrs / 150°C) and EFR were studied.

6.3.1 Relation between yield and line fall-off

Data from four mixed-signal high volume IC’s that were sold to customers under a PPM agreement (i.e. all failing dies are returned) were used for this study, totaling over 42 million shipped devices. When plotting for one product the number of line fall-off returns per batch (PPM level) versus batch yield for individual batches, see fig. 2, no obvious correlation was found. Fig. 2 shows that the highest PPM levels were even found for high yielding batches. Apparently, screening on E-sort yield alone does not prevent ‘maverick’ batches, i.e. batches that will give rise to high numbers of field returns, from being delivered.

Fig. 2: Line fall-off returns (PPM level [a.u.]) versus batch yield [%] for bipolar product #2.
However, when one plots PPM levels based on the number of returned devices combined from all batches within a yield range that contains enough returns to move beyond the statistical noise level, versus the yield, a correlation becomes evident, as shown in fig. 3a to 3d for the CMOS, Bipolar and BICMOS products respectively. An arbitrary yield interval of 5% was chosen. The PPM level was calculated with a 60% confidence level; the error bars indicate the 10% to 90% confidence level. It is striking to see how well behaved the PPM versus yield curve is for our products, even though they originate from various technologies and fabs. The data could be fitted very well with equation (2). For different processes in a given fab nearly the same values for α were found. In principle α will be related to a waferfab, a technology, a design methodology and test methodology and the assembly and test procedure at the customer’s site.

Fig. 3: PPM level of line fall-off returns versus batch yield category, together with the number of shipped dies (+) and line fall-off returns for a CMOS product (a), Bipolar product #1 (b), Bipolar product #2 (c) and a BICMOS product (d). Thick line is the model fit.

6.3.2 Relation between line fall-off and field returns

In order to correlate yield and reliability in the field, we used data from bipolar product #2, for which we obtained reliable field return data. The relation between line fall-off and the field return quantity is depicted in fig. 4, where a clear correlation can be seen. It is a strong indication of the fact that line fall-off is in fact both a first reliability screen and a reliability indicator.

[Chart: number of returns [a.u.] for line fall-off and field returns per yield category, 60-65% to 85-90%.]

Fig. 4: Line fall-off versus field returns for bipolar product #2.
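The pooling of batches into 5% yield intervals described in section 6.3.1, and the comparison of the pooled PPM levels with equation (2), can be sketched as follows. This is a minimal illustration and not the analysis software used for fig. 3: the batch records, the values of α and M, and all function names are hypothetical, and the pooled PPM is a plain point estimate without the 60% confidence level applied in the text.

```python
from collections import defaultdict


def model_return_fraction(yield_fraction, alpha, max_yield):
    """Equation (2): R = 1 - (Y / M) ** alpha, with Y and M as fractions."""
    return 1.0 - (yield_fraction / max_yield) ** alpha


def pooled_ppm_by_yield_bin(batches, bin_width=5.0):
    """Pool (yield_percent, shipped, returned) batch records into yield bins.

    Returns {lower_bin_edge_percent: pooled PPM}. This is a point estimate;
    no confidence level is applied here.
    """
    shipped_total = defaultdict(int)
    returned_total = defaultdict(int)
    for yield_percent, shipped, returned in batches:
        lower_edge = (yield_percent // bin_width) * bin_width
        shipped_total[lower_edge] += shipped
        returned_total[lower_edge] += returned
    return {
        edge: 1e6 * returned_total[edge] / shipped_total[edge]
        for edge in sorted(shipped_total)
        if shipped_total[edge] > 0
    }


# Hypothetical batch records, for illustration only.
example_batches = [(72.0, 1000, 1), (74.9, 1000, 0), (81.0, 2000, 1)]
pooled = pooled_ppm_by_yield_bin(example_batches)

# Compare each pooled bin with the model evaluated at the bin centre,
# using assumed (not fitted) values alpha = 0.002 and M = 0.95.
for edge, ppm in pooled.items():
    centre = (edge + 2.5) / 100.0
    print(edge, ppm, 1e6 * model_return_fraction(centre, 0.002, 0.95))
```

Pooling trades yield resolution for statistics: a single batch rarely contains enough returns to estimate a PPM level, whereas a yield bin that combines many batches does.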
6.3.3 Relation between yield and burn-in reject rate

Data from 6 million CMOS SRAM products were used for this study [3]. This product initially was subjected to a 100% burn-in (24 hrs / 150 °C) while moving along the learning curve. For both yield and burn-in rejects a distinction was made between functional (e.g. shorts and opens) and parametric (e.g. standby supply current or access time) failures. The functional burn-in reject rate is plotted versus the functional batch yield in fig. 5. The thick line is the model fit. Again data from all batches within a yield range were combined to move beyond the statistical noise level; the error bars indicate the one sigma spread. These data can also be fitted very well with equation (2), demonstrating the universal application of the model linking yield to reliability via the ratio α.

Fig. 5: Functional burn-in reject rate versus functional batch yield for a CMOS SRAM together with the number of shipped dies (+).

No correlation was obtained when plotting parametric burn-in reject rate versus parametric yield loss per batch, see fig. 6. Failure analysis showed that this was due to the fact that the parametric rejects were caused by many different rootcauses that occur during the immature phase of a process, like parasitic transistor leakage, junction leakage or out-of-control process parameters like poly gate width determining transistor transconductance. The functional rejects were primarily caused by particles and patterning defects. Thus when dealing with new state-of-the-art technologies the model will only be valid if reliability failures are correlated to the functional batch yield, thus disregarding the parametric yield loss.

Fig. 6: Parametric burn-in reject rate versus the parametric yield loss for a CMOS SRAM.
6.3.4 Relation between burn-in and High Temperature Operating Life (HTOL) failure rate

In total 76 CMOS SRAM batches were subjected to a 144 hrs High Temperature Operating Lifetest at 150°C and 5.5V after the 24 hrs burn-in to assess the EFR of products shipped to the customer. Fig. 7 shows a clear correlation between the cumulative Burn-in and EFR reject rate (measured in PPM rejects). The EFR can thus be predicted from the burn-in failure rate. Failure analysis shows that this is due to the fact that the rootcauses of burn-in and EFR rejects are the same. Furthermore fig. 7 shows that the EFR is about 1.6 times larger than the burn-in reject rate. This is in very good agreement with the failure rate model of Philips Consumer Electronics [8]. This model is based on a study of lifetest data of tens of thousands of MOS and bipolar devices and shows that the failure rate as a function of time is best described by a Weibull distribution with a β of 0.45.

Fig. 7: Early Life Failure Rate (EFR) [a.u.] versus burn-in reject rate [a.u.] for a CMOS SRAM.

6.4 FAILURE RATE PREDICTION AND ASSESSMENT

The factor α is expected to be the same for similar products in a given technology. It may even be constant for an entire waferfab, which is indicated by our data. If so, then the EFR and field failure rate can be predicted independent of the die area by measuring die yield only; area effects are accounted for in the model via the yield. Thus yield will be the only relevant indicator of reliability. If in a new technology in a given waferfab the EFR or PPM level can be determined in only one yield range (e.g. by taking a large low yielding die for lifetest or PPM-cooperations with customers), then by assuming a certain maximum yield M the value of α can be assessed.
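Assessing α from a single yield range amounts to inverting equation (2), which gives α = ln(1 − R) / ln(Y / M). The sketch below illustrates this under stated assumptions: the measured return level (100 PPM at 70% yield), the target yield of 85% and the three trial values of M are hypothetical numbers chosen for illustration, and `estimate_alpha` is not a function from the original work.

```python
import math


def estimate_alpha(return_fraction, yield_fraction, max_yield):
    """Invert equation (2): alpha = ln(1 - R) / ln(Y / M)."""
    if not 0.0 < yield_fraction < max_yield <= 1.0:
        raise ValueError("need 0 < Y < M <= 1")
    if not 0.0 <= return_fraction < 1.0:
        raise ValueError("need 0 <= R < 1")
    return math.log(1.0 - return_fraction) / math.log(yield_fraction / max_yield)


# Hypothetical measurement: 100 PPM returns observed at 70% yield.
measured_r, measured_y = 100e-6, 0.70

for assumed_m in (0.90, 0.95, 1.00):
    alpha = estimate_alpha(measured_r, measured_y, assumed_m)
    # Predict the return level at 85% yield with the same assumed M.
    predicted_ppm = 1e6 * (1.0 - (0.85 / assumed_m) ** alpha)
    print(f"M={assumed_m:.2f}  alpha={alpha:.2e}  "
          f"predicted PPM at 85% yield={predicted_ppm:.1f}")
```

Running the loop for several assumed values of M shows how far the predicted PPM level moves with that assumption for a given data point.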
The value of M is not very critical, so reasonable predictions of EFR and PPM levels can be made quickly after process or product introduction. Note that the relation between α, waferfab and technology is not yet fully quantified to date.

6.5 OPTIONS FOR FAILURE RATE REDUCTION

6.5.1 Yield improvement

Because both product reliability and yield are determined by defect related rootcauses, see fig. 1, a waferfab may control out-going product reliability by determining equipment defect densities. The continuous defect reduction programs ongoing in our waferfabs are aimed at those defects that have the largest impact on yield and reliability (e.g. particles, patterning defects). The impact on reliability is evaluated via lifetest and analysis of line fall-off rejects. This approach will enforce an improvement in yield and in the associated reliability level as shown in fig. 8 and 9. These depict the defect density trend for the bipolar/BICMOS waferfab over a three year period and the resulting line fall-off trend for the BICMOS product (in combination with screens at product test) respectively. The rejects per batch distribution in fig. 10, for the process in which bipolar product #2 is made, shows that the number of batches with a high number of returns has decreased significantly from 1992 to 1995. The worst batch in 1992 gave 25 returns, whereas the worst batches in 1994 and 1995 gave 7 and 5 returns respectively. Here we disregarded the two batches with 12 and 16 returns in 1995 which were caused by particle contamination incidents. Furthermore the percentage of batches causing returns decreased from 38% in 1992 to 11% in 1995. Thus the yield improvement program results not only in a continuously reducing Early Failure Rate but also in a reduction of the probability of occurrence of ‘maverick’ batches.
Fig. 8: Defect density trend [a.u.] for the Bipolar/BICMOS production line, average of 1993 and January 1994 to April 1996.

Fig. 9: Line Fall-off Trend (ppm level [a.u.] per month, January 1995 to March 1996) for the BICMOS product.

Fig. 10: Line fall-off per batch distribution (% of batches versus rejects per batch) for 1992 to 1995 for the bipolar process #2.

6.5.2 Elimination of Special Causes (‘Maverick’ batches)

As stated in the introduction, part of the product reliability failures are caused by random processing incidents (‘special causes’) that may result in ‘maverick’ batches. These incidents generally result in low yielding wafers and thus may be detected by having a consistent analysis system of low yielding wafers in place. Based on the analysis it is decided whether the wafers should be scrapped to eliminate potential maverick batches. The effectiveness of this system is clear from fig. 10. However, practice shows that not all processing incidents result in low yield and thus as such may remain unnoticed until feedback from the customer is obtained. Quick detection of maverick batches is in this case often hampered by the fact that products from one batch are delivered to many customers so that the number of returns per customer can be low while still dealing with a maverick batch. This problem can be circumvented by detailed analysis of the line fall-off trend of a product. Assume the background defect PPM level of a product caused by random process defects like particles is r. The line fall-off per batch distribution in fig. 10 then should follow a binomial distribution and the probability P(X) of finding X rejects in a sample size of N products (equal to the batch size) equals:

$$P(X) = \frac{N!}{X!\,(N-X)!}\; r^{X} (1-r)^{N-X} \qquad (3)$$

Typically the probability of finding more than 2 to 3 rejects per batch is very low.
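Equation (3) can be evaluated directly to see how unlikely a batch with several rejects is when only the random background level r is at work. The sketch below is illustrative only: the batch size of 10,000 devices and the background level of 50 PPM are assumed values that do not come from the text, and the function names are hypothetical.

```python
import math


def binomial_pmf(x, n, r):
    """Equation (3): probability of exactly x rejects in a batch of n devices."""
    return math.comb(n, x) * r ** x * (1.0 - r) ** (n - x)


def prob_more_than(k, n, r):
    """Probability of finding more than k rejects in a batch of n devices."""
    return 1.0 - sum(binomial_pmf(x, n, r) for x in range(k + 1))


# Assumed values for illustration: 10,000 devices per batch, 50 PPM background.
batch_size = 10_000
background = 50e-6

for k in range(5):
    print(f"P(X > {k}) = {prob_more_than(k, batch_size, background):.2e}")
```

A batch whose number of returns falls where this tail probability is very small is a candidate for a ‘special cause’ rather than the random background.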
Thus rejects from batches with 3 to 4 or more returns can be attributed to ‘special causes’ (processing incidents) and the other rejects to ‘normal causes’ (random process defects). Fig. 11 shows the line fall-off trend for a complete product family of bipolar product #2 in which this distinction between rejects has been made. On a relatively stable background of returns related to random process defects, several excursions can be seen. These are related to a limited number of processing incidents that affected numerous batches and were not detected at E-sort. In our line fall-off customer return analysis system the occurrence of such batches is signalled automatically (‘Batch Oriented Analysis’), and high priority is given to root cause analysis of these rejects, as they might be the first of many maverick batches. In this way corrective actions can be implemented as early as possible.

Fig. 11: Line fall-off trend of the complete product family of bipolar product #2 from 1990 to 1995, split into special causes and normal causes (number of returns per 10 successive lots [a.u.]).

6.5.3 Screening of weak parts with Latent Defects during product test

An interesting issue arises when additional EFR improvement is required beyond a given target level. Fig. 3 and equation (2) show that the brute force technique of yield improvement by line defect density reduction provides only a limited gain. If, however, one could achieve a reduction in the reliability to yield defect ratio α, the EFR improvement would be much larger. To this end, a larger fraction of the defects must be made to induce a failure at E-sort, or at least before the parts are shipped to the customer. Usually burn-in is applied for this purpose. However, smart testing that measures product yield beyond the traditional zero hour point could also reduce the reliability to yield defect ratio α. Examples are the implementation of IddQ (quiescent current) testing, voltage screens (V-screen) and distribution testing methods at E-sort or final test.
Fig. 12 and 13 illustrate the possible effects of extended testing during E-sort.

Fig. 12: Voltage screen induced failures versus batch yield at E-sort for 30 batches of the bipolar product #2 (V-screen rejects [a.u.] versus E-sort yield [%]).

Fig. 13: IddQ and voltage screen induced failures versus batch yield for 22 and 132 batches of two BICMOS products respectively.

In fig. 12 and 13 a clear correlation between yield and V-screen and IddQ rejects is seen. Furthermore, for the bipolar product #2 it was found that the predominant failure mode for both V-screen and reliability failures was particles causing metal 1 to metal 2 shorts. This is a clear indication that a V-screen forces latent defects, which would otherwise become reliability failures, to fail at test. Fig. 14 shows the ppm versus yield curve for products with (2 million devices) and without (11 million devices) V-screen. Again the error bars indicate the 10% and 90% confidence levels. It can be seen that the resulting reduction in α is about a factor of 2.5. To obtain a similar EFR reduction by yield improvement alone, the yield would have had to increase by another 10%. Note that this implies a reduction of the waferfab defect density by more than a factor of 2. The effect of the implementation of the V-screen and IddQ test at E-sort, in combination with the controlled defect density reduction, on the EFR of the BICMOS product is shown in fig. 8.

Fig. 14: PPM level of line fall-off returns versus batch yield category of the bipolar product #2 for batches with and without V-screen at E-sort, together with the number of shipped dies with V-screen (+). Dashed lines are the model fits.

6.6 CONCLUSIONS

There is a strong relation between IC product yield and failures that occur during burn-in or in the early life of product use. This relation was found using five different ICs, running in high volumes and manufactured in several processes, from two waferfabs.
Similar results were later obtained on a 0.25µm state-of-the-art microprocessor process [3]. The correlation between yield and reliability was found to obey a simple model, in which the reliability defect density is defined as a fraction α of the yield defect density. An implication of the model is that the reliability of a certain type of IC may be predicted from its yield alone. In case of non-mature processes one should take only the functional yield into account and disregard parametric yield loss. ‘Maverick’ batches show up as batches with more than 2 or 3 rejects, but their occurrence cannot be prevented by screening on E-sort yield only. Using the ppm-yield relation, it is shown how a waferfab may improve the reliability level of its products in a fast and controlled way. The model also indicates that yield improvement may not be the most effective way to achieve a reliable product if the reliability to yield defect ratio α is large. It is shown for two products that a substantial improvement can be achieved at the product test stage by implementing screening methods like voltage screens or IddQ tests.

6.7 REFERENCES

[1] D.L. Crook, ‘Evolution of VLSI reliability engineering’, Proceedings International Reliability Physics Symposium, pp. 2-11, (1990).
[2] H.H. Huston, C.P. Clarke, ‘Reliability defect detection and screening during processing - theory and implementation’, Proceedings International Reliability Physics Symposium, pp. 268-275, (1992).
[3] W.C. Riordan, R. Miller, J.M. Sherman, J. Hicks, ‘Microprocessor reliability performance as a function of die location for a 0.25 µm five layer metal CMOS logic process’, Proceedings International Reliability Physics Symposium, pp. 1-11, (1999).
[4] F. Kuper, J. van der Pol, E. Ooms, T. Johnson, R. Wijburg, W. Koster, D. Johnston, ‘Relation between yield and reliability of integrated circuits: experimental results and application to continuous early failure rate reduction programs’, Proceedings International Reliability Physics Symposium, pp. 17-21, (1996).
[5] C. Glenn Shirley, ‘A defect model of reliability’, Tutorial IRPS, (1995).
[6] V. Riviere, A. Touboul, S.B. Amor, G. Gregoris, J.L. Stevenson, P.S. Yeung, ‘Evidence of a correlation between yields and reliability data for a rad-hard SOI technology’, Proc. of the 1995 International Conference on Microelectronic Test Structures, pp. 221-224, (1995).
[7] J.G. Prendergast, ‘Reliability and quality correlation for a particular failure mechanism’, Proceedings International Reliability Physics Symposium, pp. 87-93, (1993).
[8] H.R. Claessen, ‘Reliability of IC’s’, Summer Course on reliability and yield in MOS VLSI technologies, Volume IV, IMEC, Leuven, Belgium, June 5-9, (1989).
[10] J.A. van der Pol, F.G. Kuper, E.R. Ooms, ‘Relation between yield and reliability of integrated circuits and application to failure rate assessment and reduction in the one digit FIT and PPM reliability era’, Microelectronics & Reliability, pp. 1603-1610, (1996).
[11] T. Kim, W. Kuo, ‘Modelling manufacturing yield and reliability’, IEEE Transactions on Semiconductor Manufacturing, pp.
485-492, (1999).

7 Impact of Screening of Latent Defects at Electrical Test on the Yield-Reliability Relation and Application to Burn-in Elimination [16]

7.1 Introduction
7.2 Impact of screening latent defects at e-sort on product reliability
7.2.1 Yield-reliability relation
7.2.2 Failure rate reduction options
7.2.3 Impact of latent defect screens at e-sort on yield-reliability relation
7.3 Model predicting burn-in failure rate from batch yield
7.3.1 Experiment and failure rate evolution model
7.3.2 Validation of the model
7.3.3 Process dependence of the model constants
7.3.4 Burn-in failure rate prediction
7.4 Application of model to burn-in elimination
7.4.1 Impact of screens
7.4.2 Verification of the model
7.5 Conclusions
7.6 References

7.1 INTRODUCTION

Since the mid-seventies the issue of integrated circuit ‘infant mortality’ has received growing interest, as it was recognised [1] that, next to wear-out failure mechanisms, ‘early failures’ [2] could also have a significant impact on the overall circuit reliability as encountered in the field. In today’s products and processes, wear-out failures during operational life are virtually eliminated due to the adoption of ‘wafer level reliability’ (WLR) or ‘building-in reliability’ (BIR) techniques during process development [3] and the use of reliability related design rules and reliability simulation techniques during product design. Consequently, product reliability failures are currently dominated by randomly distributed ‘latent defects’ caused by processing incidents or by the process defect density (particles, litho defects and gate oxide defects). In order to improve the reliability of their products in the field, many manufacturers of e.g.
television sets or car radio sets have adopted screening techniques [4] and require (especially in case of military or automotive applications) either a full or a sample burn-in to weed out the latent defects, thus hoping to achieve one-digit-FIT reliability levels. At the same time, integrated circuit manufacturers have adopted techniques like in-line defect monitoring, statistical process control (SPC), rigorous yield learning and defect reduction programs [5] and various latent defect and maverick screens in order to improve yield and reduce the number of reliability failures due to manufacturing flaws [6,7]. Fig. 1 shows for example the yield learning curve of a Bipolar-CMOS-DMOS (BCD) process in our Bipolar-BiCMOS waferfab. Fig. 2 depicts the defect density trend in the same fab, demonstrating a 30% improvement rate per year. As can be seen in fig. 3, the associated yield improvement, together with the introduction of latent defect screens, resulted in a factor 10 reduction of the line fall-off rate of a BiCMOS TV signal processing IC in 2.5 years. Line fall-off failures are in this case products that fail after printed circuit board assembly and test at the customer’s site (e.g. a television set manufacturer).

Fig. 1: Yield learning curve of a Bipolar-CMOS-DMOS (BCD) process in the Bipolar-BiCMOS waferfab.

As a result of all the IC manufacturers’ efforts, product reliability has improved dramatically over the years (see fig. 4), and single-digit FIT early failure rates and line fall-off PPM numbers are not uncommon today. However, despite the reliability improvement, in many cases burn-in is still mandated ‘blindly’, with little regard for the cost involved (burn-in operation and associated yield loss) and the actual benefits realised (reduction of both warranty cost and customer dissatisfaction).
The latter is especially an issue for high yielding mature manufacturing lines with average yields above 85-90%. Note that in the context of this paper the term burn-in strictly applies to an extended operation at elevated temperature and not to other screens.

Fig. 2: Defect density trend of the Bipolar-BiCMOS waferfab (defect density [a.u.] per month, 1993 to 1997).

Fig. 3: Line fall-off PPM trend of BiCMOS product #1.

Fig. 4: Trend in early- and intrinsic failure rate (FIT) targets for application in consumer products.

Only a few papers in the literature deal with the above trade-off between burn-in cost and benefit. In [8,9,10] models were developed to provide a rational basis for setting burn-in yield criteria based on a detailed analysis of actual product burn-in and lifetest data. However, no relation to batch yield at electrical test (E-sort) was established. In this paper we first show quantitatively how screening of latent defects at E-sort, using tests like voltage screens, IddQ (quiescent current) testing and distribution testing, improves product reliability. Secondly, we show how these tests can eliminate the need for burn-in, based on a model, supported by experimental data, that relates the necessary burn-in condition to the required reliability level and to the batch yield.

7.2 IMPACT OF SCREENING LATENT DEFECTS AT E-SORT ON PRODUCT RELIABILITY

7.2.1 Yield-Reliability Relation

In [6,7] it was shown using burn-in and customer line fall-off data that there is a clear relation between yield and reliability of integrated circuits in case the yield loss is dominated by defects like particles.
This yield-reliability relation was successfully modelled by equation (2), with R the fraction of rejects at a certain use (or stress) condition, Y the batch yield, M the maximum possible yielding fraction (allowing for clustering effects, placement of Process Control Modules and edge exclusions) and α the ratio between the reliability and yield defect densities Dr and Dy, see equation (1) [6,7]. M will typically exceed 90% for commercial processes.

$$D_r = \alpha \cdot D_y \quad [\mathrm{cm}^{-2}] \qquad (1)$$

$$R = 1 - \left(\frac{Y}{M}\right)^{\alpha} \qquad (2)$$

The yield-reliability ratio α depends on waferfab, technology and product operating conditions. Table 1 shows normalised data for 8 different products in 4 technologies from 3 waferfabs, totalling 84 million devices. The effect of the operating condition on α can be seen from the fact that the α-value (and thus, according to equation (2), also the line fall-off) at 95°C is significantly higher than that at 65°C or 85°C, and from the high α-value for the product that was subjected to burn-in [7]. Note that in the latter case we correlated the batch yield with the burn-in rejects and not with the rejects of the products that were shipped into the field after the burn-in; the α-value of these burned-in devices in the field will be much lower.

7.2.2 Failure Rate Reduction Options

Failure rate reduction can be achieved by the brute force technique of yield improvement and by a reduction of the reliability to yield defect ratio α. For the latter, a larger fraction of latent defects must be forced to induce a failure at E-sort testing, or at least before the parts are shipped to the customer. Applying burn-in is one way to do the job (products after burn-in will show a low α-value in the field), but introducing product screens at E-sort testing is potentially much more cost-effective. The most commonly used screening techniques include voltage screens, distribution testing and, for CMOS circuitry, IddQ testing.
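As an illustration of the two routes to a lower failure rate, equation (2) can be evaluated directly. The sketch below is a minimal example; the values of α and M are assumptions picked for demonstration (the α-values in table 1 are normalised, so absolute values are not given at this point in the text).

```python
import math


def reject_fraction(y: float, m: float, alpha: float) -> float:
    """Equation (2): R = 1 - (Y/M)**alpha, evaluated without cancellation error."""
    if not 0.0 < y <= m <= 1.0:
        raise ValueError("expected 0 < Y <= M <= 1")
    # 1 - exp(alpha * ln(Y/M)) written with expm1 to stay accurate at PPM levels
    return -math.expm1(alpha * math.log(y / m))


# Assumed demonstration values, not taken from table 1.
ALPHA = 5e-5
M_MAX = 0.95

if __name__ == "__main__":
    for y in (0.50, 0.60, 0.70, 0.80, 0.90):
        base = 1e6 * reject_fraction(y, M_MAX, ALPHA)
        screened = 1e6 * reject_fraction(y, M_MAX, ALPHA / 2.0)
        print(f"Y = {y:.0%}: {base:5.1f} PPM, with alpha halved: {screened:5.1f} PPM")
```

Because R is approximately -α·ln(Y/M) at PPM levels, halving α halves the reject level at every yield, whereas a yield increase only helps through the logarithm.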
In case of a voltage screen the product is operated for about 10-100 ms at an elevated supply voltage, below the intrinsic breakdown voltage of the process, while some functional test patterns are applied. The goal is to screen out gate oxide defects and near-shorts due to particles or defects during lithographic processing. In case of distribution testing many different implementations exist. In our case we determine the distribution of a few critical analog parameters (e.g. a supply current or a reference voltage) for all good products on every individual wafer and reject any outliers, even if the product is still within its datasheet specification. This is based on the belief that as long as the abnormal product behaviour is not understood, its reliability cannot be guaranteed either. Finally, it has been shown [11] that in CMOS circuitry IddQ testing is an effective tool to detect latent defects due to resistive short-circuit paths in a product that have not yet led to stuck-at failures. IddQ testing is often combined with voltage screens. Fig. 5 shows the correlation between batch E-sort yield and IddQ rejects for 575 batches of two BiCMOS products totalling more than 8 million devices. It can be clearly seen that the lower the batch yield, the higher the IddQ fall-out and the more latent defects are screened out. The maverick behaviour of some lots is also evident. The effect on the line fall-off rate of one of these products can be seen in fig. 3.

Fab  Product        Sample size (x10^6)  E-sort screen  Data source              Norm. α-value
A    1.0µ CMOS1      2.7                 no             Line fall-off @ 65°C      1.0
A    1.2µ CMOS2      6.0                 no             Burn-in 24hr @ 150°C     59
B    3µ Bipolar1    14.9                 no             Line fall-off @ 85°C      2.7
B    3µ Bipolar2     8.7                 no             Line fall-off @ 95°C      3.5
B    3µ Bipolar2     2.4                 1              Line fall-off @ 95°C      2.2
B    1.5µ BiCMOS1   13.0                 no             Line fall-off @ 85°C      1.8
B    1.5µ BiCMOS2    1.6                 no             Line fall-off @ 85°C      2.3
B    1.5µ BiCMOS2    3.4                 1,2            Line fall-off @ 85°C      1.2
B    1.5µ BiCMOS3    5.8                 1,2,3          Line fall-off @ 85°C      1.3
C    1.5µ BiCMOS1    7.5                 no             Line fall-off @ 85°C      1.5
C    1.5µ BiCMOS1    7.7                 1,2            Line fall-off @ 85°C      0.5
C    1.5µ BiCMOS4   10.1                 1,2,3          Line fall-off @ 85°C      1.8

Table 1: Normalised yield-reliability ratios α for various products from different processes and different waferfabs totalling 84 million devices (A= CMOS fab, B= Bipolar-BiCMOS fab, C= CMOS-BiCMOS fab) using different screens (1= Voltage Screen, 2= Distribution Testing, 3= IddQ).

7.2.3 Impact of Latent Defect Screens at E-sort on Yield-Reliability Relation

The impact of latent defect screens on product reliability is demonstrated by the data in fig. 6, 7 and 8. All data in a 5% batch yield interval have been combined to obtain statistically meaningful results. The figures show the number of shipped devices in each yield category as well as the correlation between yield and line fall-off rejects. The PPM-number is determined for each yield category by dividing all customer returns by the number of shipped devices in that yield category. The error bars denote the 10% and 90% confidence limits. The dashed lines are the weighted fits using the model in equation (2), with the number of shipped devices as the weight factor. The M-values used in the fit are derived from the yield data and given in the figure captions.

Fig. 5: IddQ test induced failures at E-sort versus batch yield (IddQ rejects [a.u.] versus E-sort yield [%]) for a) 186 batches of BiCMOS product #2 from Fab B and b) 389 batches of BiCMOS product #1 from Fab C, see table 1.

Fig. 6 shows the data for a bipolar power amplifier shipped to only one automotive customer before (8.7 million devices) and after (2.4 million devices) introduction of a voltage screen intended to screen out particles causing intermetal shorts. Under a PPM-cooperation agreement all line fall-off failures were sent back to us, resulting in a good fit between the model and the data. We find that the applied voltage screen reduced α by a factor 1.6. Note furthermore that the average yield of the batches with V-screen is higher than that of the batches without V-screen, due to the continuous yield improvement program in the waferfab.

Fig. 6: Bipolar product #2: PPM level of line fall-off returns and number of shipped devices versus batch yield category for batches with voltage screen and without voltage screen (o). Dashed lines are the (weighted) model fits using M= 93%. All data in a 5% yield interval have been combined.

Fig. 7 and 8 show similar data on the effect of the introduction of a voltage screen and supply current related distribution tests for two different BiCMOS TV signal processing ICs from two waferfabs (in total 7.5 and 1.6 million devices before and 7.7 and 3.6 million devices after introduction of the screens respectively). It can be clearly seen that for all yield categories the PPM-number of the devices subjected to the E-sort screens is lower than that of the devices without screen. The corresponding α reduction factors are 3.0 and 2.0 respectively, demonstrating quantitatively how reliability can effectively be improved by the introduction of E-sort screens. The relatively poor fit between the model in equation (2) and the data in fig.
7 and 8 is probably caused by the fact that, due to the large volumes shipped and the large customer base, not all customers sent their line fall-off rejects back. This results in too optimistic values for the calculated α-numbers in table 1 but, as this effect holds for both the samples with and without screens, the determined α-reduction factor will still be reasonably accurate. Furthermore it must be noted that, due to the weight factor used, the fit tends to favour the data points with the highest number of shipped devices.

Fig. 7: BiCMOS product #1: PPM level of line fall-off returns and number of shipped devices versus batch yield for batches with screen and without screen (o), the screen consisting of voltage screen and distribution tests. Dashed lines are the (weighted) model fits using M= 92%.

Fig. 8: BiCMOS product #2: PPM level of line fall-off returns [a.u.] and number of shipped devices [millions] versus batch yield range [%] for batches with screen and without screen (o), the screen consisting of voltage screen and distribution tests. Dashed lines are the (weighted) model fits using M= 92%.

7.3 MODEL PREDICTING BURN-IN FAILURE RATE FROM BATCH YIELD

7.3.1 Experiment and Failure Rate Evolution Model

In order to model the effects of burn-in, it is necessary to know the evolution of the failure rate versus time. For this purpose a large-scale experiment was set up in the mid-eighties by Philips Consumer Electronics and Philips Semiconductors to investigate the correlation between failure rates of products during conventional lifetests and failure rates of products in their real operating environment [12].
Data from this period are included because failure rates were then still high, which allowed statistically relevant experiments with limited sample sizes. The experiment included over 50 different bipolar product types of various complexity fabricated in four different waferfabs. Extensive lifetesting was carried out up to 8000hrs at junction temperatures Tj of 110°C, 125°C, 140°C and 170°C, and simultaneously the cumulative failure curve of the same products in the set under normal operating conditions was registered, see fig. 9. The error bars denote the 60% confidence interval. Sample sizes during the lifetest were 4000, 7000, 2600 and 330 products respectively up to 1000hrs and 330, 330, 300 and 90 products respectively up to 8000hrs. The junction temperature of the products in the set, Tj,set, was 105°C with an estimated spread of ± 5°C. Failure analysis showed that the failures were dominated by defects; no wear-out was observed.

Fig. 9: Cumulative failure plot for various bipolar IC’s during lifetest at four different junction temperatures and of the same products operated in the set at Tj = 105°C.

The set cumulative failure curve shows two slopes and can be described by a combination of two Weibull distributions F1 and F2. Based on the bathtub model [2] we assume for F2 a constant failure rate distribution. This results in equation (3), with Ft,T the cumulative failed fraction at operating time t and junction temperature T, β the Weibull shape factor and η1,2,T the experimentally determined characteristic lives at junction temperature T. ηT at temperatures other than T can be calculated using equation (4). The η2/η1 ratio determines the cross-over point tXover,T at temperature T between the two slopes in fig. 9. For practical cases where F1 << 0.01, tXover,T can be approximated by equation (5), as then at t= tXover,T F1=F2.
$$F_{t,T} = F_1 + (1 - F_1)\cdot F_2 = 1 - \exp\left\{-\left[\left(\frac{t}{\eta_{1,T}}\right)^{\beta} + \frac{t}{\eta_{2,T}}\right]\right\} \qquad (3)$$

$$\eta_{1,2,T} = C_{1,2}\cdot \exp\left(\frac{q\cdot E_a}{k\cdot T}\right) \quad [\mathrm{hrs}] \qquad (4)$$

$$t_{Xover,T} = \left(\frac{\eta_{1,T}^{\,\beta}}{\eta_{2,T}}\right)^{\frac{1}{\beta-1}} \quad [\mathrm{hrs}] \qquad (5)$$

Using the above model, we can calculate the lifetest results of the IC’s back to Tj,set, see fig. 10. Again the error bars denote the 60% confidence interval. The (weighted) fit between the model and the data is also shown. It appears that the lifetest failures follow the same cumulative failure curve as the set test failures. We find for the model constants Ea= 0.7 ± 0.1 eV and β= 0.40 ± 0.05, and tXover equals about 3000hrs at 105°C. Note that η1 and η2 together determine the absolute failure level. The Ea of 0.7eV corresponds well with the value of 0.65eV reported in [13]. Similarly, failure rate plots were determined from accelerated lifetest data of various MOS and Bipolar IC’s and calculated back to 85°C using the Arrhenius model with Ea= 0.7eV, see fig. 11 and 12. Once again the individual failure rate plots of the investigated IC-families essentially have the same shape and can be modelled by the combination of the Weibull distribution (F1) and the constant failure rate (F2) in equation (3). The fit between the model and the data is also shown in fig. 11 and 12. In this case we find for the model constants β= 0.40 ± 0.05 and tXover,85°C = 33000 ± 8000 hrs. As the cross-over point between the two slopes in equation (3) lies beyond 30000 hrs at 85°C, the second term in equation (3) can be neglected for most practical cases.

Fig. 10: Cumulative failure plot of IC’s in the set under normal operating conditions (Tj,set = 105°C) compared with stress test results at various Tj calculated back to Tj,set using an Arrhenius model with Ea = 0.7eV.

Fig. 11: Normalised cumulative failure plot for various bipolar and MOS IC’s. Data are calculated back to 85°C using Ea= 0.7eV and normalised to 1 at 300 hrs.

7.3.2 Validation of the model

The above model was validated in a later period by comparing the failure rate predictions based on 48hrs lifetest data with the actual results of 300hrs operational testing of sets, see table 2 [12]. As can be seen, the FIT rates predicted by the model are within 40% of the actual numbers, which is a remarkably good agreement compared to other prediction models.

Fig. 12: Normalised failure rate plot for various bipolar and MOS IC’s. Data are calculated back to 85°C using Ea= 0.7eV and normalised to 1 at 300hrs.

Technology   Average Tj,set (°C)   Measured Failure Rate @ 300hrs (FIT)   Calculated Failure Rate @ 300hrs (FIT)
Bipolar      85                     500                                    700
CMOS         50                     320                                    350
NMOS Fab1    70                     900                                    850
NMOS Fab2    85                    5500                                   7000

Table 2: Comparison of measured and calculated FIT-rates of various products in actual consumer sets.

7.3.3 Process Dependence of the Model Constants

Note that β and η are process dependent constants. For submicron CMOS logic processes β = 0.25 has been reported [14], and for (embedded) DRAM processes β‘s between 0.20 and 0.36 have been found [15]. Fig. 13 shows recent data from 3 µm bipolar, 1.5 µm BiCMOS, 1.2 and 0.8 µm CMOS and 0.5 µm embedded DRAM processes, calculated back to 85°C using Ea = 0.7eV, indicating that for modern technologies β ranges from 0.2 to 0.5. Furthermore, the cross-over point between the slopes in equation (3) does not yet show up at 30000 hrs at 85°C, again indicating that the contribution of the constant failure rate term in equation (3) is negligible for most practical cases.

Fig. 13: Normalised cumulative failure plot for various bipolar, BiCMOS, CMOS and embedded DRAM technologies.
Data are calculated back to 85°C using Ea = 0.7eV and normalised to 1 at 300hrs.

7.3.4 Burn-in Failure Rate Prediction

Based on the failure rate time evolution model in equation (3) we can now derive a model for predicting the burn-in failure rate from batch yield. Assume that we have determined the yield-reliability ratio α from data obtained from a stress of duration ts at junction temperature Ts. We furthermore take the β- and tXover,Ts-values applicable to the technology of the products being studied. In that case only the characteristic life η1,Ts is unknown, as η2,Ts can be calculated using equation (5). Here it is important to note that α depends on the stress condition, thus α ≡ αts,Ts. By combining equations (2), (3) and (5) we can then derive an expression for η1,Ts. In case we are dealing with only one Weibull distribution (single slope, so tXover,Ts → ∞) we obtain equation (6):

$$\eta_{1,T_s} = \frac{t_s}{\left[-\alpha_{t_s,T_s}\cdot \ln\left(\frac{Y}{M}\right)\right]^{\frac{1}{\beta}}} \quad [\mathrm{hrs}] \qquad (6)$$

Equation (6) expresses the characteristic life η1,Ts at stress temperature Ts as a function of batch yield Y, maximum yield fraction M, stress time ts, Weibull shape factor β and the yield-reliability ratio αts,Ts determined from the stress data. η1,T at temperatures other than Ts can be calculated from equation (4) for a given activation energy Ea. In case we are dealing with two Weibull distributions as in equation (3) (two slopes, so tXover at use conditions ≤ 25 years) the cross-over point tXover,Ts also enters the equation and the expression for η1,Ts becomes somewhat more complex:

$$\eta_{1,T_s} = \left[\frac{-\left(t_s^{\,\beta} + \dfrac{t_s}{t_{Xover,T_s}^{\,(1-\beta)}}\right)}{\alpha_{t_s,T_s}\cdot \ln\left(\frac{Y}{M}\right)}\right]^{\frac{1}{\beta}} \quad [\mathrm{hrs}] \qquad (7)$$

One can now calculate the cumulative failed fraction as a function of batch yield at any use or stress condition other than the stress time ts and temperature Ts by using equations (3) and (4).
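The chain from equation (7) back through equations (3) and (2) can be sketched numerically. The example below is an illustration under stated assumptions, not the author's implementation: it uses the demonstration constants the text adopts for fig. 14 (ts = 0.5 hrs, Ts = 85°C, Ea = 0.7 eV, β = 0.3, tXover,85°C = 33000 hrs, M = 95%, α = 5⋅10⁻⁵), and it applies equation (4) by converting time at another temperature into equivalent hours at Ts, which is the same as rescaling η1 and η2 by a common Arrhenius factor. The function names are hypothetical.

```python
import math

K_BOLTZMANN_EV = 8.617e-5  # Boltzmann constant in eV/K, so q drops out of equation (4)


def acceleration_factor(temp_ref_c: float, temp_c: float, ea_ev: float) -> float:
    """Hours at temp_ref_c that are equivalent to one hour at temp_c (Arrhenius)."""
    t_ref = temp_ref_c + 273.15
    t = temp_c + 273.15
    return math.exp(ea_ev / K_BOLTZMANN_EV * (1.0 / t_ref - 1.0 / t))


def eta1(y, m, alpha_s, t_s, beta, t_xover=math.inf):
    """Equation (7); with t_xover = inf it reduces to equation (6). Requires Y < M."""
    hazard = -alpha_s * math.log(y / m)
    second = 0.0 if math.isinf(t_xover) else t_s / t_xover ** (1.0 - beta)
    return ((t_s**beta + second) / hazard) ** (1.0 / beta)


def cumulative_failed(t, eta_1, beta, t_xover=math.inf):
    """Equation (3), with eta2 eliminated through equation (5)."""
    exponent = (t / eta_1) ** beta
    if not math.isinf(t_xover):
        eta_2 = t_xover ** (1.0 - beta) * eta_1**beta
        exponent += t / eta_2
    return -math.expm1(-exponent)


def alpha_at(t, temp_c, y, m, alpha_s, t_s, temp_s_c, beta, ea_ev, t_xover):
    """Yield-reliability ratio for t hours at temp_c, obtained by inverting equation (2)."""
    eta_1 = eta1(y, m, alpha_s, t_s, beta, t_xover)
    t_equiv = t * acceleration_factor(temp_s_c, temp_c, ea_ev)
    failed = cumulative_failed(t_equiv, eta_1, beta, t_xover)
    return math.log1p(-failed) / math.log(y / m)


# Demonstration constants quoted in the text for fig. 14.
PARAMS = dict(m=0.95, alpha_s=5e-5, t_s=0.5, temp_s_c=85.0,
              beta=0.3, ea_ev=0.7, t_xover=33000.0)

if __name__ == "__main__":
    for hours in (6.0, 240.0):
        a = alpha_at(hours, 150.0, y=0.60, **PARAMS)
        print(f"alpha for {hours:.0f} hrs at 150 C: {a:.2e}")
```

Under these assumptions the sketch gives α-values of roughly 3.1⋅10⁻⁴ for a 6 hrs and 1.2⋅10⁻³ for a 240 hrs burn-in at 150°C, in line with the values quoted in the text, and the result does not depend on the batch yield chosen for the intermediate step.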
Note that all the required constants are known. We first apply the model in equations (3), (4) and (7) to the line fall-off data in fig. 6 to 8. In consumer applications the set test after printed circuit board assembly, which generates the line fall-off rejects, typically consists of a ts = 0.5 hrs stress at Tambient = 40°C, equivalent to a junction stress temperature Ts ≈ 85°C. So using ts = 0.5hrs, Ts = 85°C, Ea = 0.7eV and assuming typical values for the other constants for demonstration purposes (β = 0.3, tXover,85°C = 33000 hrs, M= 95% and α0.5hrs,85°C = 5⋅10⁻⁵; exact values are to be determined for each individual product family and application), equation (7) allows us to calculate η1,85°C and to predict the cumulative failure rate at any use or stress condition as a function of batch yield. Fig. 14 shows the calculated cumulative failed fractions for a number of standard stress and operating conditions. Note that a different α-value corresponds to each curve. For the 6 hrs and 240 hrs burn-in curves we find for example α6hrs,150°C = 3.1⋅10⁻⁴ and α240hrs,150°C = 1.2⋅10⁻³. Using these numbers and equation (2), one can simply calculate the reject level as a function of batch yield. To verify how realistic the calculated reject numbers are, we take the example of a 60% yielding batch. The calculated reject levels during a 150°C burn-in are of the order of several hundreds of PPM. This is in good agreement with the reject numbers observed in reality.

Fig. 14: Cumulative failures versus batch yield for various use and stress conditions using α0.5hrs,85°C = 5⋅10⁻⁵, M = 95%, β = 0.3 and Ea = 0.7eV as an example.

Next we apply the model to determine the impact of burn-in and E-sort screens on the PPM reject levels experienced in the field for batches with varying yield.
The cumulative failed fraction after tu operating hours at temperature Tu, following a burn-in of ts hrs at temperature Ts, can be calculated using equation (8) (acceleration factor) and equation (9):

$$AF_{T_u - T_s} = \exp\left[\frac{q\cdot E_a}{k}\cdot\left(\frac{1}{T_u} - \frac{1}{T_s}\right)\right] \qquad (8)$$

$$F_{t_u,T_u} = F_{(t_u + AF_{T_u - T_s}\cdot t_s),\,T_u} - F_{(AF_{T_u - T_s}\cdot t_s),\,T_u} \qquad (9)$$

Fig. 15 shows the cumulative failure plot for a high (90%) and a moderately (50%) yielding batch during operation at 85°C. It includes the impact on the failure curve of the 50% yielding batch of a 6 hrs (typical for µ-processors and DRAMs) and a 240 hrs (Class S military) burn-in at 150°C, and of the introduction of E-sort screens (assuming an α reduction by a factor 2). The burn-in results in a significant reliability improvement for short operating times, but after that the reliability is worse than for both the high yielding batch and the 50% batch after screening. In case of the 6hrs burn-in the cross-over points occur after about 40hrs and 1000 hrs respectively, well within the useful life of most applications. In case of the 240 hrs burn-in these numbers are 700 hrs and 10000 hrs. An important implication of these findings is that for Hi-Rel military components the use of commercial parts from mature high volume manufacturing lines is probably more advisable than the extensive burn-in of parts from dedicated low-volume unstable lines.

Fig. 15: Cumulative failure plot for a good (90%) and a moderately (50%) yielding batch and the impact of burn-in and E-sort screens on the curve of the 50% yield batch (using M = 95% and assuming a factor 2 reduction of α due to the screen).
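The comparison in fig. 15 can be approximated with a short sketch of equations (8) and (9). It is an illustration under assumptions: the constants are the demonstration values quoted earlier in the text (α0.5hrs,85°C = 5⋅10⁻⁵, M = 95%, β = 0.3, Ea = 0.7 eV, tXover,85°C = 33000 hrs), and equation (3) is written in a compact form that is algebraically equivalent to inserting η1 from equation (7) and η2 from equation (5). Function and parameter names are hypothetical.

```python
import math

K_BOLTZMANN_EV = 8.617e-5  # Boltzmann constant in eV/K


def acceleration_factor(temp_use_c: float, temp_stress_c: float, ea_ev: float) -> float:
    """Equation (8): use-temperature hours equivalent to one hour of stress."""
    t_use = temp_use_c + 273.15
    t_stress = temp_stress_c + 273.15
    return math.exp(ea_ev / K_BOLTZMANN_EV * (1.0 / t_use - 1.0 / t_stress))


def cumulative_failed(t, y, m=0.95, alpha_s=5e-5, t_s=0.5, beta=0.3, t_xover=33000.0):
    """Equation (3) at the reference temperature, scaled so equation (2) holds at t_s."""
    def shape(x):
        # Weibull term plus constant-failure-rate term, up to the common factor eta1**-beta
        return x**beta + x * t_xover ** (beta - 1.0)

    exponent = -alpha_s * math.log(y / m) * shape(t) / shape(t_s)
    return -math.expm1(-exponent)


def failed_after_burn_in(t_use, y, t_burn, af, **model):
    """Equation (9): failures in the field after a burn-in of t_burn stress hours."""
    t_equiv = af * t_burn
    return cumulative_failed(t_use + t_equiv, y, **model) - cumulative_failed(t_equiv, y, **model)


if __name__ == "__main__":
    af = acceleration_factor(85.0, 150.0, 0.7)
    print(f"AF(85 C -> 150 C) = {af:.1f}")
    for t in (10.0, 100.0, 1000.0, 10000.0):
        good = 1e6 * cumulative_failed(t, 0.90)
        burned = 1e6 * failed_after_burn_in(t, 0.50, 6.0, af)
        screened = 1e6 * cumulative_failed(t, 0.50, alpha_s=2.5e-5)
        print(f"{t:7.0f} hrs: 90% batch {good:7.1f}, 50% + 6 hrs burn-in {burned:7.1f}, "
              f"50% + screen {screened:7.1f} PPM")
```

Under these assumptions the burned-in 50% batch starts out better than the unscreened 90% batch and the screened 50% batch, but is overtaken by both as operating time increases, which is the qualitative behaviour described for fig. 15.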
7.4 APPLICATION OF MODEL TO BURN-IN ELIMINATION

7.4.1 Impact of screens

Using the model, the minimum burn-in time needed to obtain a customer specified reliability level after given use conditions t and Tj can be calculated, and consequently also the average product yield level below which burn-in is needed (note that the higher reject level of low yielding batches is compensated by the lower reject level of high yielding batches). Fig. 16 shows for a typical situation (α_{0.5hrs,85°C} = 5⋅10⁻⁵, M = 95%, β = 0.3, Ea = 0.7 eV and cross-over point at 33000 hrs; exact values are again product family and application dependent) the required burn-in time at 150°C versus batch yield to obtain a single-digit PPM line fall-off reject rate for automotive products, and <100 PPM and <300 PPM (≅10 FIT) rejects during consumer warranty and product life respectively. The corresponding use conditions are 0.5 hrs / 95°C, 1500 hrs / 85°C and 30000 hrs / 85°C respectively; 300 PPM accumulated over 30000 hrs corresponds to 10⁻⁸ failures per device-hour, i.e. 10 FIT. The burn-in time strongly depends on the applicable operating time, as for longer times burn-in is less effective (see fig. 15). Fig. 16 also shows the burn-in time reduction resulting from the introduction of screens at E-sort (assuming a factor 2 reduction in α).

Fig. 16: Burn-in time required at 150°C for a) 10 PPM line fall-off automotive (0.5 hrs / 95°C), b) 100 PPM rejects warranty consumer (1500 hrs / 85°C) and c) 10 FIT failure rate consumer life (30000 hrs / 85°C) versus batch yield (M = 95%). The dashed curves show the impact of E-sort screens.

Apparently, products with an average yield below 80, 81 and 85% need burn-in for the automotive and the two consumer application conditions given above respectively, while E-sort screening lowers the yields below which burn-in is required to 68, 69 and 76% respectively.

7.4.2 Verification of the Model

The model can be reliably validated using data from the bipolar product in fig. 6.
This audio amplifier part did not receive a burn-in and was shipped to only one automotive customer, who sent 100% of the line fall-off rejects back to us as part of a PPM-cooperation agreement. This ensures that the PPM data are not flattered by customers who did not send all rejects back. Fig. 17a shows the E-sort yield of every batch of this product during 1995. The voltage screen at E-sort was introduced in week 9522. The delay between the V-screen introduction at E-sort and use of these products by the customer is caused by pipeline and stock effects. Using the above model and the actual M and α-values as determined from fig. 6, we calculated the yield levels required to achieve less than 10 PPM line fall-off reject rate with and without a voltage screen. These yield levels (84.0% and 86.8% respectively) are depicted by the horizontal dashed lines in fig. 17a. We see that before the introduction of the voltage screen most batches yield lower than the level required to obtain 10 PPM line fall-off, and that after the introduction most batches yield higher than the 10 PPM yield level. Fig. 17b shows the line fall-off reject levels as reported on a weekly basis by the customer itself. We observe that without the voltage screen the average PPM level is indeed above 10 PPM, and that it drops to an average level below 10 PPM after introduction of the voltage screen. This is in perfect agreement with the predictions derived from our model. For a second product, running only at this automotive customer, a 7 PPM line fall-off rate is reported while it currently yields more than 90%. This is also in full agreement with the predictions of our model.

Fig.
17: a) E-sort batch yield and minimum yield required for a 10 PPM line fall-off reject rate before and after introduction of a V-screen and b) the corresponding weekly line fall-off rate reported by the automotive customer.

7.5 CONCLUSIONS

Based on data of over 31 million devices it has been shown that screening of latent defects at electrical test (e.g. by voltage screens or IddQ-tests) can significantly reduce the number of reliability failures. A reduction of the yield-to-reliability ratio α by a factor 1.5 to 3 is found, with a similar impact on the PPM and FIT reliability levels. Based on an extensive comparison between set and product lifetest data and the above yield-reliability correlation, a model has been developed that predicts the product failure rate as a function of batch yield at any operating time and temperature. Application of this model to burn-in and lifetest failure rate prediction shows first that in general high yielding batches generate fewer failures than low yielding batches, even if the latter have been subject to burn-in. Second, for many practical applications the use of screens at E-sort is more effective with respect to screening of latent defects than a standard (6 hrs) burn-in. Using the model, the minimum burn-in time needed to obtain a customer specified reliability target can be calculated as a function of batch yield, as well as the average product yield level above which burn-in can be eliminated. The calculated numbers appear to be in perfect agreement with experimental data.

7.6 REFERENCES

[1] D.S. Peck, ‘New concerns about integrated circuit reliability’, Proceedings IRPS, pp. 1-6, (1978)
[2] B.A. Unger, ‘Early life failures’, Quality & Reliability Engineering International, vol. 4, pp. 27-34, (1988)
[3] D.L. Crook, ‘Evolution of VLSI reliability engineering’, Proceedings IRPS, pp. 2-11, (1990)
[4] E.A. Amerasekera, F.N.
Najim, ‘Failure mechanisms in semiconductor devices’, Chichester, John Wiley & Sons, ch. 7, (1997)
[5] P.K. Nag, W. Maly, H.J. Jacobs, ‘Simulation of yield/cost learning curves with Y4’, IEEE Transactions on Semiconductor Manufacturing, vol. 10, pp. 256-265, (1997)
[6] F. Kuper, J. van der Pol, E. Ooms, T. Johnson, R. Wijburg, W. Koster, D. Johnston, ‘Relation between yield and reliability of integrated circuits: experimental results and application to continuous early failure rate reduction programs’, Proceedings IRPS, pp. 17-21, (1996)
[7] J.A. van der Pol, F.G. Kuper, E.R. Ooms, ‘Relation between yield and reliability of integrated circuits and application to failure rate assessment and reduction in the one-digit PPM reliability era’, MicroElectronics & Reliability, pp. 1603-1610, (1996)
[8] A.P. van den Heuvel, N.F. Khory, ‘A rational basis for setting burn-in yield criteria’, Proceedings International Test Conference, pp. 524-530, (1984)
[9] W. Smith, N.F. Khory, ‘Does the burn-in of integrated circuits continue to be a meaningful course to pursue’, 38th Electronic Components Conference, pp. 1-4, (1988)
[10] D.L. Jacobowitz, ‘A software tool for designing burn-in programs’, Proceedings Annual Reliability and Maintainability Symposium, pp. 302-305, (1987)
[11] T.R. Henry, T. Soo, ‘Burn-in elimination of a high volume microprocessor using IddQ’, Proceedings International Test Conference, (1996)
[12] H.R. Claessen, ‘Reliability of IC’s’, IMEC Summer Course, Belgium, (1989)
[13] IBM, Memory Products, Qualification Handbook, 16MB DRAM, Die revision E, Document nr. MMDD06QHU-00
[14] R. Zelenka, Presentation at Ford Reliability Workshop, Colorado Springs, October 22-24, (1993)
[15] M. Matthaei, Embedded DRAM Burn-in data, Private Communication, (1997)
[16] J.A. van der Pol, E.R. Ooms, A. van ‘t Hof, F. Kuper, ‘Impact of screening of latent defects at electrical test on the yield-reliability relation and application to burn-in elimination’, pp.
370-377, Proceedings IRPS, (1998)

8 Summary and Conclusions

8.1 Summary
8.2 Conclusions

8.1 SUMMARY

Chapter 1 discusses trends in semiconductor technology and product reliability and describes the reliability assurance system that has been implemented in process development, product development and high volume manufacturing in order to achieve the factor 10 million reliability improvement over the last 30 years. Furthermore, the motivation for the work in this thesis is given.

Chapter 2 shows the application of highly accelerated (wafer level) stress techniques to two wear-out failure modes in a high voltage Bipolar-CMOS-DMOS process technology, namely transistor instabilities due to sodium ingression and transistor instabilities due to surface charges originating from high voltage circuitry. Furthermore a quantitative model is developed allowing the derivation of design rules for elimination of the surface charge failure mode.

Chapter 3 discusses the relation between the hot carrier lifetimes of transistors and that of SRAM circuits as well as the implications for technology development.

Chapter 4 demonstrates a straightforward method for the derivation of latchup design rules for submicron CMOS processes, changing this field from an ‘art’ into an ‘engineering science’.

Chapter 5 deals with a new method to assess the metal stepcoverage of a metallisation system by electrical measurements. Application to process optimisation, design rule derivation and process monitoring is discussed.

Chapter 6 investigates the relation between yield and reliability of products and demonstrates that the yield can be used as a reliability indicator instead of conventional life tests. A quantitative model between yield and reliability is developed and validated and application to failure rate reduction is discussed.
Chapter 7 deals with the impact of electrical screens at product test on the product failure rate in the field and explores the failure rate evolution with time. This enables extension of the yield-reliability model into a new model capable of predicting the product failure rate as a function of batch yield at any operating time and temperature. Application to ‘burn-in’ elimination is discussed.

8.2 CONCLUSIONS

Dominant failure modes in high power/high voltage (650V) BCD-technologies are threshold voltage instabilities of the lateral DMOS transistor due to sodium ingression and parasitic leakage currents in low voltage devices induced by high surface potentials originating from the high voltage devices. In chapter 2 it is shown that the threshold voltage instabilities can be prevented by improving the sodium getter capabilities of the dielectric layers in the backend process and by increasing the silicon nitride passivation thickness. The occurrence of parasitic leakage currents appears to be strongly dependent on temperature, moisture content of the plastic package, circuit layout and applied operating voltage. The ‘charge-creep’ effect can be modelled by describing the evolution of the surface potential as a function of place and time by means of a lumped element RC model. A good qualitative and a reasonable quantitative agreement between experimental data and model predictions is found. Using the model, design rules that can be used to eliminate the ‘charge-creep’ effects in actual circuits have also been derived.

Hot carrier degradation of a full-CMOS SRAM results primarily in an increase of the minimum operating voltage and write time parameters, as shown in chapter 3, caused by degradation of the access transistor of the memory cell. This is a ‘pass’-type transistor that is operated with both source-drain voltage polarities.
This type of operation makes the transistor much more sensitive to the effects of hot carrier degradation than in the case of operation in the ‘inverter’-type mode. Circuit simulations confirm the observed degradation effects. The hot carrier lifetime of SRAM products appears to be about a factor 50 larger than that of statically stressed transistors. This discrepancy is caused by duty cycle effects and by the limited sensitivity of the SRAM to the individual transistor degradation. Other results show that this finding is generally applicable, thus facilitating product design, for example by eliminating the need for cascoding of transistors at critical locations. In this way increases in memory and microprocessor speed can be realised as well as more aggressive scaling of process technologies without jeopardising product reliability.

The latchup susceptibility of submicron CMOS processes on p-/p++ epitaxial substrates can be characterised, using a dedicated set of test structures, as a function of n+p+-spacing, placement of Nwell and substrate contacts, guardring width and distance of the guardring to the injecting junction. Chapter 4 demonstrates how these data can be translated into latchup design rules taking into account the geometrical spreading of the injected carriers. This approach results in very latchup robust products in case of p-/p++ epitaxial substrates, thus eliminating the need for time-consuming and expensive ‘trial-and-error’ design optimisation cycles. The method is also applicable to processes on non-epitaxial substrates.

The metal stepcoverage of a metallisation system is a critical parameter to control in a waferfab process as it may have a dramatic effect on electromigration related reliability. Chapter 5 shows how it can be assessed by simple electrical measurements where the stepcoverage is represented by the resistance ratio of metal lines over (worst case) topography and metal lines over flat surfaces.
The resistance ratio appears to correlate well to the stepcoverage percentage as determined from SEM inspections. As a result, the novel method is well suited for derivation of stepcoverage related design rules. Surprisingly, in our work the effect of metal stepcoverage on electromigration resistance was found to be very limited; the open failures also did not occur on the steps. Another application is the optimisation of sputter processes. It appears that metal stepcoverage depends strongly on metal sputter target lifetime. Several runs with dummy wafers appear to be necessary after metal target change to guarantee good stepcoverage. Furthermore it is found that stepcoverage can vary significantly over the wafer surface demonstrating the importance of metal stepcoverage wafermap data and the limitations of the conventional cross sectioning method. Finally it appears that the method is well suited for monitoring of metal stepcoverage on production wafers and screening-out of weak parts by placing it in the Process Control Modules on each wafer. Clear relations have been established in chapter 6 between E-sort yield and ‘burn-in’, EFR and field failure rates for nearly 50 million high volume products in bipolar, CMOS and BICMOS technologies from different waferfabs (later confirmed by data from a 0.25µm state-of-the-art microprocessor process). The relations obey a simple model that assumes that the reliability defect density is a fraction of the waferfab defect density and that rootcauses of failures are the same. The model allows a die size independent prediction and assessment of FIT and PPM reliability levels of an IC just based on its yield, eliminating the need for excessive life testing. ‘Maverick’ batches are identified by more than 2 to 3 rejects per batch and can not be eliminated by scrap of low yielding wafers alone. For non-mature technologies only correlations with functional yield are found, the parametric yield loss should be disregarded. 
Using the results, it is shown how reliability can be improved in a fast and controlled way, even in the 1 digit FIT and PPM reliability era, by reducing waferfab defect density, eliminating special causes and implementing screens at product test. As the effect of yield on PPM reject level is not that strong, the latter approach can be very effective in improving reliability.

Finally, it is shown in chapter 7, based on data of over 31 million devices, that screening of latent defects at electrical test (e.g. by voltage screens or IddQ-tests) can improve PPM and FIT reliability levels by a factor 1.5 to 3, demonstrating that these techniques are a good alternative to ‘burn-in’. As this provides significant efficiency improvement and cost reduction opportunities to high volume semiconductor manufacturers, these screening techniques are rapidly becoming standard industry practice. Extensive comparison between TV-set and product lifetest data shows that the failure rate curve indeed follows a ‘bathtub’ shape. The curve can be accurately modelled by a modified Weibull distribution. Merging it with the above yield-reliability relation results in a new model that for the first time allows prediction of product failure rate as a function of batch yield at any operating time and temperature. Application of the model to prediction of burn-in and lifetest failure rates shows first that in general high yielding batches generate in the long run fewer failures than low yielding batches, even if the latter have been subject to ‘burn-in’. Second, for many practical applications the use of screens at E-sort is more effective with respect to screening of latent defects than a standard (e.g. 6 hrs) ‘burn-in’. Using the model, the minimum burn-in time needed to obtain a customer specified reliability target can be calculated as a function of batch yield. It also answers the question under what conditions burn-in can be eliminated.
The calculated numbers appear to be in good agreement with experimental data.

Summary

Over the past 30 years the reliability of semiconductor products has improved by a factor of 100 while at the same time the complexity of the circuits has increased by a factor of 10⁵. This 7-decade reliability improvement has been realised by implementing a sophisticated reliability assurance system in process development, product development and high volume manufacturing, aimed at building-in product reliability and establishing effective improvement feedback loops in both development and production, as described in chapter 1. This thesis deals with new methods that have been developed to continue the current improvement rate in the new millennium.

In process development the adoption of highly accelerated stress techniques (preferably on wafer level) has become crucial, as this gives the opportunity to simulate 10 years of product lifetime within a few hours or days, in line with today’s development cycle times. In chapter 2 these are applied to a high voltage Bipolar-CMOS-DMOS technology for the selection of the best dielectric passivation stack capable of preventing wear-out failures due to sodium ingression from the plastic package. Another dominant wear-out failure mode in high voltage products is the occurrence of transistor instabilities induced by high voltage surface charges originating from the high voltage connections to the product (bondwires and bondpads). Using similar stress methods, a new quantitative model has been developed that describes this failure mechanism and that allows us to derive design rules that eliminate the surface charge effects and thus ensure reliable high voltage products.

The highly accelerated stresses are often carried out on dedicated test structures designed in such a way that they are ‘susceptible’ primarily to the failure mechanism of interest only.
This introduces the problem of how to convert test structure lifetime data to actual product data. As reliability margins are vanishing rapidly in modern semiconductor technologies, this is of great interest to the industry. In chapter 3 this has been explored for the case of hot carrier degradation. It appears that large lifetime differences can occur between test structures and products due to duty cycle effects and the varying sensitivity of the electrical parameters of a product to the degradation of one or more of its components. Lifetimes of products in dynamic operation can easily exceed the lifetimes of corresponding transistors in static operation by a factor 100. As a result, more aggressive scaling of process technologies is possible without jeopardising the product reliability, enabling e.g. increases in the maximum operation frequency of state-of-the-art microprocessors.

For building-in reliability during product development, the availability of reliability related design rules is mandatory. One aspect of this is a set of design rules that ensure that the product is robust against voltage spikes on its external pins so that it does not ‘latchup’ and burn out. In chapter 4 a method is demonstrated, applicable to any CMOS technology, for the derivation of latchup design rules from simple test structures. It allows first-time-right design of products and changes the perception of latchup from an ‘art’ to an ‘engineering science’.

In high volume manufacturing the prevention and detection of process excursions that might deteriorate product reliability and yield is of the utmost importance. Therefore very sophisticated in-line and end-of-line control systems have been implemented in manufacturing flows, where all critical equipment and process parameters that may influence product performance or reliability are regularly measured and kept under Statistical Process Control.
One such critical process parameter is the metal stepcoverage of a metallisation system, as it may have a dramatic effect on the electromigration related reliability of a product. Chapter 5 deals with a novel method that allows metal stepcoverage monitoring by simple electrical measurements. The method is applied to optimisation of sputter processes and generation of design rules. It is also shown that it has clear advantages over the commonly used cross-sectioning method, as stepcoverage wafermaps can also be made easily. Combined with end-of-line control techniques it is well suited for metal stepcoverage control on production wafers and screening of weak parts.

As a result of the ‘building-in’ reliability approach in process and product development, wear-out failure modes no longer occur in today’s products. Instead product failures are dominated by early failures caused by manufacturing defects. Product failure rates however have become so low that conventional life testing techniques are no longer capable of providing enough statistically significant data (at reasonable cost) to guide the improvement actions in the high volume manufacturing lines. Therefore a paradigm shift is needed. In chapter 6 it is shown that the product yield can be used as a primary reliability indicator. Data of over 50 million products in various processes show for the first time that there exists a clear correlation between the yield of a product and its reliability in the field, because the nature of yield and reliability defects is the same. Thus a waferfab may improve the reliability level of its products in a fast and controlled way by monitoring and reducing its in-line defect densities, eliminating the need for excessive life testing. The yield-reliability relation is described by a quantitative model that allows prediction of the reliability in the field based on yield data.
It can be used to set objective scrap limits for non-conforming (low yield) material, thus preventing products with a larger failure probability from being shipped to customers.

The model also indicates that a substantial product reliability improvement can be obtained by the implementation of screens at product test, such as voltage screens and IddQ testing. Data of over 30 million products show in chapter 7 that these techniques can halve failure rates and are a good alternative to ‘burn-in’ (a conventional failure rate reduction method where products are operated at a high temperature for some time before shipment to customers). Screening techniques are rapidly becoming standard industry practice due to the potential efficiency improvements and cost reductions they offer in high volume manufacturing.

Given the high cost of ‘burn-in’, the question under what conditions it can be eliminated is a relevant one. Therefore, in chapter 7 the evolution of the failure rate with time is determined, based on lifetest data of actual TV-sets and products, and successfully modelled quantitatively. The failure rate curve indeed has the well known ‘bathtub’ shape. Combining the failure rate model with the above yield-reliability relation results in a novel model capable of predicting, for the first time, the product failure rate as a function of batch yield at any operating time and temperature. It allows calculation of the minimum burn-in time needed to obtain a customer specified reliability target as a function of batch yield, as well as the average product yield level above which burn-in can be eliminated. Good agreement between model predictions and experimental data is shown. The model predicts that in general high yielding batches generate fewer failures than low yielding batches, even if the latter have been subject to burn-in. This once again demonstrates the importance of high yield for achieving excellent product reliability.
Furthermore, for a lot of practical applications the use of screens at E-sort is more effective with respect to screening of latent defects than a standard (short) burn-in. 145 146 Nieuwe Methoden voor het Inbouwen en Verbeteren van de Betrouwbaarheid van Geintegreerde Schakelingen Toepassing op Massa Fabricage van Halfgeleiders Samenvatting Gedurende de laatste 30 jaar is de betrouwbaarheid van halfgeleider producten (IC’s) met een factor 100 verbeterd terwijl tegelijkertijd de complexiteit van de schakelingen met een factor 105 is toegenomen. Deze betrouwbaarheidsverbetering met 7 grootte ordes is gerealiseerd door de implementatie van een geavanceerd betrouwbaarheids borgings systeem in de proces ontwikkeling, product ontwikkeling en massa fabricage. Dit systeem is gericht op het inbouwen van betrouwbaarheid en het realiseren van effectieve verbeter processen in ontwikkeling en productie zoals in hoofdstuk 1 van dit proefschrift beschreven wordt. Dit proefschrift behandelt nieuwe methoden die ontwikkeld zijn om de huidige trend in verbetering van de betrouwbaarheid ook in de toekomst te kunnen doorzetten. Tijdens de proces ontwikkeling is toepassing van sterk versnelde betrouwbaarheidsevaluatie technieken (bij voorkeur op plak niveau) cruciaal geworden omdat op deze wijze 10 jaar product levensduur gesimuleerd kan worden binnen enkele uren of dagen, hetgeen noodzakelijk is gezien de huidige sterk verkorte ontwikkel cycli. In hoofdstuk 2 worden deze methoden toegepast op een hoogspannings Bipolair-CMOS-DMOS technologie om die (diëlectrische) passivatie laag te kunnen selecteren die het meest geschikt is om massale uitval (’einde levensduur’) ten gevolge van het binnen dringen in het IC van natrium vanuit het plastic van de omhulling te voorkomen. 
Een andere belangrijk ’einde levensduur’ uitvalsmechanisme in hoogspannings IC’s is het optreden van lek paden in transistoren veroorzaakt door positieve oppervlakte ladingen die afkomstig zijn van de hoogspanningsdelen van de schakeling (o.a. bonddraden en bondflappen). Met vergelijkbare versnellingstechnieken is een nieuw model ontwikkeld dat dit uitval mechanisme quantitatief beschrijft en dat gebruikt kan worden om ontwerp regels af te leiden die de gevolgen van de oppervlakte ladingen elimineert zodat de betrouwbaarheid van de hoogspannings IC’s gegarandeerd kan worden. De sterk versnelde betrouwbaarheidsevaluaties worden vaak op speciale test structuren uitgevoerd die zo ontworpen zijn dat ze voornamelijk voor slechts een uitvalsmechanisme gevoelig zijn. Hierbij rijst het probleem hoe de levensduur data die van de test structuren verkregen worden vertaald moeten worden naar de levensduur van daadwerkelijke producten. Om dat de betrouwbaarheidsmarges in 147 moderne technologieën snel aan het verdwijnen zijn is dit van groot belang voor de halfgeleider industrie. In hoofdstuk 3 is dit onderzocht voor het geval van degradatie ten gevolge van ‘hete ladingsdragers’. Het blijkt dat er grote verschillen tussen de levensduur van test structuren en echte producten kunnen bestaan ten gevolge van ‘duty cycle’ effecten en de variërende gevoeligheid van de electrische parameters van een product voor de degradatie van een van zijn transistoren. De levensduur van dynamisch werkende producten kan met gemak een factor 100 groter zijn dan de levensduur van de relevante transistoren in statisch bedrijf. Als gevolg hiervan is er een agressievere schaling mogelijk van proces technologieën zonder de product betrouwbaarheid in gevaar te brengen, waardoor bijv. toenames in de snelheid van microprocessors mogelijk zijn. Om de betrouwbaarheid gedurende de product ontwikkeling te kunnen inbouwen, is de beschikbaarheid van hieraan gerelateerde ontwerp regels absoluut noodzakelijk. 
Een aspect hiervan is een set ontwerp regels die gegarandeerd dat het product robuust is ten opzichte van externe spanningspieken op zijn pinnen zodat het niet in ‘latchup’ gaat en uitbrandt. In hoofdstuk 4 wordt een methode gedemonstreerd die toepasbaar is op elke CMOS technologie en waarmee ‘latchup’ ontwerp regels uit eenvoudige test structuren afgeleid kunnen worden. Hiermee is het mogelijk om producten te ontwerpen die direct aan de specificaties voldoen. Tegelijkertijd verandert het imago van ‘latchup’ als zijnde iets wat een hoog ’zwarte magie’ gehalte heeft in iets wat een behapbaar technisch probleem is. In massa productie is het voorkomen van en detecteren van proces uitschieters van het grootste belang omdat deze de product opbrengst en betrouwbaarheid sterk nadelig kunnen beïnvloeden. Daarom zijn er in halfgeleider productie processen zeer geavanceerde beheers systemen geimplementeerd, zowel in de lijn als aan het eind van de lijn. Hierbij worden alle kritische apparatuur en proces parameters die de product specificatie en betrouwbaarheid kunnen beïnvloeden op regelmatige basis gemeten en beheerst met ‘Statistische Proces Controle’ methoden. Een van die kritische proces parameters is de stapbedekking het metaal van een bepaald metallisatie proces omdat de stapbedekking een grote invloed kan hebben op de electromigratie gerelateerde betrouwbaarheid van een product. Hoofdstuk 5 beschrijft een nieuwe methode waarmee de metaal stapbedekking op simpele wijze electrisch gemeten kan worden. Deze methode is toegepast op de optimalisatie van metaal sputter processen en het afleiden van ontwerpregels. Verder wordt aangetoond dat de nieuwe methode duidelijke voordelen heeft ten opzichte van de normaal gebruikte methode waarbij doorsneden worden gemaakt omdat de met de eerste ook eenvoudig de stapbedekking over het gehele plak oppervlak bepaald kan worden. 
Gecombineerd met standaard beheersingsmethoden aan het einde van de lijn, is de nieuwe methode zeer geschikt voor stapbedekking controle op productie plakken en het onderscheppen van zwakke broeders. Als gevolg van de aanpak om betrouwbaarheid tijdens de proces en product ontwikkeling in te bouwen, komen ‘einde levensduur’ uitval mechanismen in de huidige producten niet meer voor. In plaats daarvan wordt de product betrouwbaarheid nu bepaald door ’vroege uitvallers’ ten gevolge van defecten die tijdens het productie proces geïntroduceerd zijn. De uitval niveaus van de producten zijn nu zo laag, dat standaard product levensduur technieken niet meer in staat zijn (tegen 148 acceptabele kosten) om genoeg statistisch relevante informatie op te leveren waarop de verbeteracties in de productie gebaseerd kunnen worden. Er is daarom een paradigma verschuiving nodig. In hoofdstuk 6 wordt aangetoond dat de voormeet opbrengst van een product (dat is het percentage goede kristallen per plak) gebruikt kan worden als de belangrijkste betrouwbaarheidsindicator. Data verkregen van meer dan 50 miljoen producten in diverse proces families tonen namelijk voor het eerst in de literatuur aan dat er een duidelijke correlatie is tussen de voormeet opbrengst van een product en zijn betrouwbaarheid bij de eind gebruiker. Dit komt doordat de defecten die opbrengst verlies veroorzaken en die aanleiding geven tot een uitvaller bij de eind gebruiker van dezelfde origine zijn. Dus kan een diffusie fabriek de betrouwbaarheid van zijn producten snel en gecontroleerd verbeteren door de defect niveaus in de lijn te beheersen en te verminderen. Hiermee vervalt de noodzaak om op extreem grote aantallen producten levensduur testen uit te voeren. De opbrengst - betrouwbaarheid relatie kan beschreven worden met een quantitatief model waarmee de uitvalsniveaus bij de eindgebruiker voorspeld kunnen worden slechts op basis van de voormeet opbrengst. 
The model can also be used to set objective limits for when a deviating wafer with lower yield must be scrapped. In this way, products with too high a failure probability can be prevented from reaching the end user. The model also indicates that a considerable reliability improvement can be achieved by implementing special tests (‘screening’ tests) in the test program of the product. With these tests, weak devices, characterised for example by an excessive quiescent supply current or by failing already at slightly elevated voltage in the test program, can be detected and screened out. Chapter 7 shows, on the basis of data from more than 30 million products, that these techniques can reduce the failure level by a factor of 2 and that they can be good alternatives to product ‘burn-in’ (in which a product is operated at high temperature for a certain time before it is shipped to the end user). Methods of this kind are now rapidly being adopted everywhere because they deliver the necessary efficiency improvements and cost reductions in high volume production.

Given the high cost of burn-in, the question under which conditions it can be omitted is highly relevant. Therefore, in chapter 7 the failure rate curve as a function of time has been determined from lifetime data of real televisions and related products, and has also been modelled quantitatively with success. The failure rate curve indeed has the well-known ‘bathtub curve’ shape. Combining the failure rate model with the yield-reliability relation described above results in a new model with which, for the first time in the literature, the failure rate of a product can be predicted from the wafer probe yield and the use conditions of the product in terms of time and temperature.
The model can also be used to calculate the minimum burn-in time needed to reach a given customer-specified maximum failure level as a function of the wafer probe yield, as well as the average wafer probe yield above which burn-in can be omitted. The model predictions turn out to agree well with the experimental data. The model further predicts that lots with high yield produce fewer long-term failures than lots with low yield, even if the latter are burned in. This again confirms the importance of high yields for achieving excellent product reliability. In addition, the model indicates that for many practical situations the use of screening tests is more effective in lowering failure levels than a standard short burn-in cycle.

List of Publications and Conference Presentations

[1] J.A. van der Pol, J.J.M. Koomen, ‘Relation between the hot carrier lifetime of transistors and CMOS SRAM products’, Proceedings International Reliability Physics Symposium (IRPS), pp. 178-185, (1990)
[2] J.A. van der Pol, ‘Hot carrier degradation of MOS transistors and circuits’, Presentation at Ford Automotive Reliability Workshop, October 21-23, Colorado Springs, (1993)
[3] K. van Doorselaer, T.M. Moore, J.A. van der Pol, ‘Failure criteria for inspection using acoustic microscopy after moisture sensitivity testing of plastic surface mount devices’, Proceedings International Symposium on Testing and Failure Analysis (ISTFA), pp. 229-239, (1994) and presentation at Ford/Delco/Chrysler Automotive Reliability Workshop, October 20-21, Detroit, (1994)
[4] J.A. van der Pol, P.B.M. Wolbert, ‘A structured approach for the derivation of latch-up design rules for submicron CMOS processes’, Presentation at Ford/Delco/Chrysler Automotive Reliability Workshop, October 20-21, Detroit, (1994)
[5] J.A. van der Pol, E.R. Ooms, ‘Short loop monitoring of metal stepcoverage by simple electrical measurements’, Proceedings IRPS, pp. 148-155, (1996) and presentation at Ford/Delco/Chrysler Automotive Reliability Workshop, October 25-27, Indianapolis, (1995)
[6] F. Kuper, J.A. van der Pol, E.R. Ooms, T. Johnson, R. Wijburg, W. Koster, D. Johnston, ‘Relation between yield and reliability of integrated circuits: experimental results and application to continuous early failure rate reduction programs’, Proceedings IRPS, pp. 17-21, (1996)
[7] B. Krabbenborg, J.A. van der Pol, ‘The influence of process variations on the robustness of an audio power IC’, Microelectronics & Reliability, pp. 1819-1822, (1996)
[8] J.A. van der Pol, F.G. Kuper, E.R. Ooms, ‘Relation between yield and reliability of integrated circuits and application to failure rate assessment and reduction in the one digit FIT and PPM reliability era’, Microelectronics & Reliability, pp. 1603-1610, (1996) and presentation at European Symposium on Reliability of Electron devices and Failure analysis (ESREF), Enschede, (1996)
[9] F.W. Ragay, J.A. van der Pol, J. Naderman, ‘In-situ monitoring of dry corrosion degradation of Au ballbonds to Al bondpads in plastic packages during HTSL’, Microelectronics & Reliability, pp. 1931-1934, (1996)
[10] J.A. van der Pol, H.J. Gerritsen, R.T.H. Rongen, P.P.M.C. Groeneveld, P.W. Ragay, H.A. van den Hurk, ‘Reliability issues in 650V high voltage Bipolar-CMOS-DMOS integrated circuits’, Microelectronics & Reliability, pp. 1723-1726, (1997) and presentation at ESREF Conference, Bordeaux, (1997)
[11] J.A. van der Pol, E.R. Ooms, A. van ‘t Hof, F. Kuper, ‘Impact of screening of latent defects at electrical test on the yield-reliability relation and application to burn-in elimination’, Proceedings IRPS, pp. 370-377, (1998)
[12] J.A. van der Pol, P.B.M. Wolbert, ‘Systematic derivation of latch-up design rules for submicron CMOS processes from test structures’, Microelectronics & Reliability, pp. 1051-1056, (1998) and presentation at ESREF Conference, Copenhagen, (1998)
[13] E.R. Ooms, J.A. van der Pol, ‘Occurrence and elimination of anomalous temperature dependence of latchup trigger currents in BICMOS processes’, Proceedings IRPS, pp. 138-143, (1999)
[14] J.A. van der Pol, J-P.F. Huijser, R.B.H. Basten, ‘New latchup mechanism in complementary bipolar power IC’s triggered by backside die attach glue’, Microelectronics & Reliability, pp. 863-868, (1999), presentation at ESREF Conference, Bordeaux, (1999) and presentation at the Automotive Electronics Council Automotive Reliability Workshop, November 2-5, Nashville, (1999)
[15] J.A. van der Pol, A.W. Ludikhuize, H.G.A. Huizing, B. van Velzen, R.J.E. Hueting, J.F. Mom, G. van Lijnschoten, G.J.J. Hessels, E.F. Hooghoudt, R. van Huizen, M.J. Swanenberg, J.H.H.A. Egbers, F. van den Elshout, J.J. Koning, H. Schligtenhorst, J. Soeteman, ‘A-BCD: An Economic 100V RESURF Silicon-On-Insulator BCD Technology for Consumer and Automotive Applications’, Proceedings International Symposium on Power Semiconductor Devices (ISPSD), Toulouse, (2000)
[16] J.A. van der Pol, R.T.H. Rongen, H.J. Bruggers, ‘Modelling of Surface Potential Induced Leakage Failures in High Voltage Integrated Circuits and Application to Design Rule Derivation’, Submitted to ESREF Conference, Dresden, (2000)

Acknowledgements

Many people have contributed over the past years to the realisation of the articles that are brought together in this thesis. First of all I think of my former colleagues at the Corporate Reliability Centre of the Philips Natuurkundig Laboratorium, who encouraged me to take my first unsteady steps on the scientific path, in particular Jan Verweij and Jan Koomen, but also Karel Van Doorselaer, Fred Kuper, Ajith Amerasekera, Mario Pinto, Albert van der Wijk and Kees de Zeeuw. There I also learned that work can very well be combined with a lot of fun, to which the group members not mentioned here contributed a great deal as well.
In Nijmegen, the working climate in the Reliability Physics group of the Quality and Reliability (Q&R) department of Consumer IC Nijmegen (CIC-N), together with the close cooperation with the process engineers in Waferfab AN and the designers, device physicists and product engineers of CIC-N, provided an excellent breeding ground: a wide choice of interesting and relevant subjects, sufficient means to investigate them thoroughly, good discussion partners and always a pleasant and positive cooperation, even when business interests put great pressure on the group. The most important support came from Eric Ooms, Paul Schras, Fred Kuper, Jocky Naderman and Philip Wolbert for (in random order) the stimulating discussions, help with PC troubles, encouraging but sometimes also challenging words, help with experiments and analyses, and many other things. Indispensable contributions, both material and immaterial, were however also made by John Vroemen, Han Gerritsen, Peter Ragay, Benno Krabbenborg, Rene Rongen, Loed Heldens, Frans van Lottum, Peter Groeneveld, Henk van den Hurk, Guus Rehbach, Jan Bruggers, Anton Aelbers, Peter Taylor and Toon van ‘t Hof† of the Q&R department, Piet van Kessel, Johan Bosmans, Piet Wessels, Jacques Mom, George Timan, Dick Vogelzang, Jan Soeteman†, Tom de Boer, Henk Verstappen, Wil Josquin and Hans Seele of Waferfab AN, Jos Plagge, John Somberg, Arnold Sengers, Arno Emmerik, Ben Verhoeven, Richard Langezaal, Menno van Langen, Arjan van Wijk, Hans van den Berg, Frans Urselmann and Kees Joosse of CIC-N, Willem Koster of MOS3, Ruud van Winkelhof, Alfons Goossens, Karl Anderten, Will Gubbels, Jan Slotboom, Peter Meijer, Dick Kleinloog and Rob Wolters of the Nat.Lab. and Bob Thomas (USA). Thanks also to my paranymphs Jan Koomen and Mario Pinto for their support and friendship over more than 10 years, and to Taib El Ghazi for the artwork of the cover. The great inspirer of this thesis also deserves a separate tribute: Jan Verweij, my promotor.
Without his encouragement and his confidence in me it would probably never have come this far. Jan, many thanks for this, as well as for the great patience you showed during the completion of this thesis. Furthermore, of course, thanks to Anne-Mieke, Bram, Karel and finally also Lotte for their great patience during all those hours that ‘Papa was typing his little book again’ instead of playing with the boys or walking with Lotte. Finally I would like to mention my parents, who have always supported my studies through thick and thin: ‘Pô en Moe’, thank you!

Curriculum Vitae

Jacob van der Pol was born on 5 May 1961 in Hoensbroek, province of Limburg. In 1979 he obtained his Gymnasium-β diploma at the ‘Thomas à Kempis’ Scholengemeenschap in Zwolle. He then studied Applied Physics at the Technische Hogeschool Twente (later Universiteit Twente), where he graduated in the Quantum Electronics group on the optical amplification of nanosecond CO2 laser pulses. The engineering degree was obtained ‘with honour’ in 1986. He also obtained the qualification of ‘Stralingsdeskundige-C’ (radiation expert, level C). During his military service he worked at the Physics and Electronics Laboratory (FEL) of TNO in The Hague, after which he joined the Philips Natuurkundig Laboratorium in Eindhoven in 1987. Here he worked in the Corporate Reliability Centre on various product and process reliability topics of SRAM memories and submicron CMOS processes, with emphasis on hot carrier degradation and latchup. Since 1991 he has worked at Philips Semiconductors in Nijmegen. After one year as reliability physicist in submicron CMOS logic process development in the waferfab MOS3, he was appointed Reliability Physics Manager for the product group Consumer ICs in 1992. Here he worked on various bipolar, BiCMOS and high voltage Bipolar-CMOS-DMOS (BCD) processes and products and various (SMD) package families.
Emphasis was on the reliability of high voltage BCD processes, latchup in BiCMOS and BCD processes and ‘Early Failure Rate’ reduction techniques. Since 1996 he has been Process Engineering and Development Manager of the Consumer Systems waferfab AN in Nijmegen, which runs Bipolar, BiCMOS, BCD and Silicon-On-Insulator processes. He has been a member of the Technical Program Committee of the IRPS conference, is co-author of the paper that received the 1996 IRPS ‘Outstanding Paper Award’ and has been one of the chairmen of the Technical Program Committee of the ESREF conference since 1997.

Biography

Jacob van der Pol was born in Hoensbroek, the Netherlands in 1961. He received the M.S. degree in Applied Physics ‘with honor’ from Twente University of Technology in 1986 with a specialisation in laser physics. After one year with the Physics and Electronics Laboratory of TNO in The Hague, he joined Philips Research Laboratories in 1987, where he worked on process and product reliability issues of SRAMs and submicron CMOS processes with emphasis on hot carrier degradation and latchup. In 1991 he joined Philips Semiconductors (PS) in Nijmegen as reliability physicist in CMOS logic process development. In 1992 he became Reliability Physics Manager for Consumer ICs. Here he has worked on various process, product and package reliability issues covering various (SMD) package families and bipolar, BiCMOS, CMOS logic and high voltage BCD processes. Emphasis was on the latter technology, latchup in BiCMOS processes and ‘Early Failure Rate’ reduction methods. Currently he is Process Engineering and Development Manager of the Consumer Systems Waferfab AN in Nijmegen, running Bipolar, BiCMOS, BCD and SOI processes. He has participated in the IRPS conference Technical Program Committee, is co-author of the paper that received the 1996 IRPS ‘Outstanding Paper Award’ and is one of the chairmen of the Technical Program Committee of the ESREF conference.