Reliability and Failure Analysis of Electronic Components
By Marco Mugnaini

Design for Safety of Electronic Components
• For VLSI circuits to be a useful and growing technology, two conditions must be satisfied:
– They can be produced in large quantities at low cost
– They can perform their functions throughout their intended lifetime
• To lower the cost of manufacturing, the optimal size of the IC should be selected.
• The optimal size is a compromise between several competing considerations:
– Partitioning of the system
– Yield of good circuits
– Packaging and system assembly cost
– Reliability/availability and design for safety of the complete system
• Partitioning the system into a large number of ICs results in high yield but also high assembly cost
• To arrive at an optimal division of the system, we must be able to predict the total system reliability as a function of the number of ICs of varying size

Mechanism of Yield Loss in VLSI
• The causes of low yield fall into three basic categories:
– Parametric processing problems
– Circuit design problems
– Random point defects in circuits due to several causes

Processing Effects
• Often a wafer is divided into regions of good chips and regions of bad chips
• This is most likely due to processing effects such as
– Variations in thickness of oxide or polysilicon layers
– Variations in resistance of implanted layers
– Variations in width of lithographically defined features
– Alignment of photomasks
• For example, polysilicon gate lengths are shorter in thinner polysilicon regions than in thicker ones. This may make channel lengths so short that the transistors cannot be turned off, which leads to excessive leakage current
• Variations in thickness of deposited dielectric lead to variations in contact window size. This may lead to non-operative circuits if the circuits depend on a low value of contact resistance.
• Variations in the doping of implanted layers also lead to variations in contact resistance
• Also, a wafer may vary in size during processing by more than 20 ppm.
Therefore a 125 mm wafer changes in size by 2.5 µm (20 ppm of 125 mm). This may cause significant misalignment.

Circuit Sensitivities
• Certain areas of a wafer have low device yield because the circuit design has failed to consider expected variations in device parameters and the correlation between variations in different parameters.

Point Defects
• A 3 µm dust particle can cause a break in a metal conductor
• An isolated oxidation-induced stacking fault may cause excessive leakage current

Modeling of Yield-loss Mechanisms
• We need to model IC yield in terms of fundamental parameters that are independent of the particular IC and characteristic of the process and processing line because:
– by accurately modeling the yield we can predict the cost and availability of ICs
– once the yield-modeling parameters are known, one can compare the processing quality of different process lines and indicate where improvements are required
• IC yield is expressed as $Y = Y_0\,Y_1(D_0, A, \alpha_i)$
• $1 - Y_0$ = fraction of bad chips due to processing-related effects
• $1 - Y_1$ = remaining fraction of bad chips, which is a function of the density of point defects $D_0$
• $A$ is the chip area
• $\alpha_i$ is the parameter unique to different models of the yield
• $Y$ = ratio of good chips to total number of chips per wafer
• All models predict that $Y$ decreases monotonically as $A$ increases
• Yield modeling can identify those processes and mechanisms that limit the yield of present ICs
• The process can then be improved or eliminated as needed

Uniform Density of Point Defects
• In those areas where yield is not degraded by either processing or circuit sensitivities, the remaining cause of chip failure is randomly distributed point defects
• Consider a grid of 24 chip sites with 10 defects randomly distributed.
In this example 16 of the 24 sites have 0 defects
• Of the remaining sites, 6 have 1 defect and 2 have 2 defects; no site has more than 2 defects
• The problem of determining the yield is identical to the problem of placing $n$ balls in $N$ cells and then calculating the probability of a given cell containing $k$ balls:
$P_k = \frac{n!}{k!\,(n-k)!}\,\frac{1}{N^n}\,(N-1)^{n-k}$
• If $N$ and $n$ are both large while $n/N = m$ remains finite, this can be approximated as $P_k \approx e^{-m} m^k / k!$
• The probability that a chip contains no defects is $Y_1 = P_0 = e^{-m}$
• The probability that a chip contains 1 defect is $P_1 = m\,e^{-m}$
• If the area of the chip is $A$, the total chip area in the usable part of a wafer is $NA$
• The density of defects is $D_0 = n/(NA)$
• The average number of defects per chip is $m = n/N = D_0 N A / N = D_0 A$, so $Y_1 = P_0 = \exp(-D_0 A)$
• This Poisson estimate was used to predict yield in the early days of IC manufacturing
• The actual yield was found to be much larger than predicted

Yield Enhancement using Redundant Circuitry
• Many large MOS memory chips are designed with redundant circuitry, which can be switched in to replace defective circuit elements
• This is usually accomplished using fusible links, which can be fused as needed using a laser or other techniques
• The yield is then modified to $Y_1 = P_0 + \eta P_1$
• $P_0$ = probability of a chip containing no defects
• $P_1$ = probability of a chip containing 1 defect
• $\eta$ = probability that a chip containing 1 defect can be repaired by using a single redundant column

Simple Non-uniform Distribution of D
• The discrepancy between measured and predicted yield led to investigation of non-uniform distributions of the defect density $D$ across a wafer
• The yield is expressed as $Y = \int_0^\infty \exp(-DA)\, f(D)\, dD$
• $f(D)$ is the normalized distribution of defect density, $\int_0^\infty f(D)\, dD = 1$
• Three different distributions with mean density $D_0$ are investigated:
– Delta function: $Y_1 = \exp(-D_0 A)$
– Triangular: $Y_2 = \left[\frac{1 - \exp(-D_0 A)}{D_0 A}\right]^2$
– Rectangular: $Y_3 = \frac{1 - \exp(-2 D_0 A)}{2 D_0 A}$
• For $D_0 A \gg 1$ we find that
– $Y_1 = \exp(-D_0 A)$
– $Y_2 \approx 1/(D_0 A)^2$
– $Y_3 \approx 1/(2 D_0 A)$
• $Y_3$ is found to fit most closely the observed yield of large ICs
• The above distributions do not have any physical basis, so more physically based distributions need to be investigated

Gamma Distribution
• The gamma distribution is more physical: $f(D) = \frac{1}{\Gamma(\alpha)\,\beta^{\alpha}}\, D^{\alpha - 1} \exp(-D/\beta)$
• $\alpha$ and $\beta$ are the two distribution parameters and $\Gamma(\alpha)$ is the gamma function
• Average density of defects: $D_0 = \alpha\beta$
• Variance of $D$: $\alpha\beta^2$
• Consequently $Y_4 = \frac{1}{(1 + S D_0 A)^{1/S}}$, where $S = 1/\alpha$
• For $S \to 0$, the gamma distribution reduces to a delta function and $Y_4 \to \exp(-D_0 A)$
• Using different values of $S$, the gamma yield function is a good approximation of $Y_2$ and $Y_3$ over a wide range of $D_0 A$
• Gamma yield functions can represent a large variation in the shape of experimental yield versus area curves (see Fig. 4 and 5, p. 621 and 622 of Sze)
• Each type of defect is characterized by
– its mean defect density $D_{n0}$
– the shape factor of its distribution $S_n$
– the portion of total chip area $A_n$ susceptible to that defect
• Using the gamma yield function, $Y_n = \frac{1}{(1 + S_n A_n D_{n0})^{1/S_n}}$
• The overall yield is the product of the yields for each known type of defect: $Y = \prod_{n=1}^{N} Y_n$
• For a mature process in a well-controlled, high-yield line, all of the major yield-limiting defects have probably been controlled or eliminated. The yield is then a product of many terms, each approximately 1.
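The yield expressions above can be compared numerically. The following is a minimal Python sketch rather than anything taken from the slides: the function names and the example defect density and chip areas are illustrative assumptions, and only the formulas for $Y_1$ to $Y_4$ and the product rule come from the text.

```python
import math


def poisson_yield(d0, area):
    """Y1 = exp(-D0*A): delta-function (uniform) defect density."""
    return math.exp(-d0 * area)


def triangular_yield(d0, area):
    """Y2 = [(1 - exp(-D0*A)) / (D0*A)]^2: triangular distribution of D."""
    m = d0 * area
    if m == 0.0:
        return 1.0  # limit of the expression as D0*A -> 0
    return ((1.0 - math.exp(-m)) / m) ** 2


def rectangular_yield(d0, area):
    """Y3 = (1 - exp(-2*D0*A)) / (2*D0*A): rectangular distribution of D."""
    m = d0 * area
    if m == 0.0:
        return 1.0  # limit of the expression as D0*A -> 0
    return (1.0 - math.exp(-2.0 * m)) / (2.0 * m)


def gamma_yield(d0, area, s):
    """Y4 = (1 + S*D0*A)^(-1/S) for a shape factor S > 0."""
    return (1.0 + s * d0 * area) ** (-1.0 / s)


def overall_yield(defect_types):
    """Product of gamma yields over (D_n0, A_n, S_n) tuples, one per defect type."""
    y = 1.0
    for d_n0, a_n, s_n in defect_types:
        y *= gamma_yield(d_n0, a_n, s_n)
    return y


if __name__ == "__main__":
    d0 = 1.0  # assumed mean defect density (defects per cm^2), illustrative only
    for area in (0.1, 1.0, 5.0):  # assumed chip areas in cm^2
        print(
            f"A={area:4.1f}  Y1={poisson_yield(d0, area):.4f}  "
            f"Y2={triangular_yield(d0, area):.4f}  "
            f"Y3={rectangular_yield(d0, area):.4f}  "
            f"Y4(S=1)={gamma_yield(d0, area, 1.0):.4f}"
        )
```

For $D_0 A$ well above 1 the Poisson value falls far below the other three, which matches the observation that the Poisson estimate underpredicts the yield of large chips.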
• For a mature process each term is close to 1, which means $S_n A_n D_{n0} \ll 1$, so
$\ln Y = \sum_n \left[-\frac{1}{S_n} \ln(1 + S_n A_n D_{n0})\right]$ with $\ln(1 + S_n A_n D_{n0}) \approx S_n A_n D_{n0}$
Thus $\ln Y \approx -\sum_n A_n D_{n0}$ and $Y \approx \exp\left(-\sum_n A_n D_{n0}\right)$
Defining $D^* = \frac{1}{A} \sum_n A_n D_{n0}$ gives $Y = \exp(-A D^*)$
• Here $A_n$ is the total chip area susceptible to the particular defect

Reliability Requirements for VLSI
• It is instructive to consider an example of the effects of device failure
– Early discrete solid-state computer systems typically consisted of $10^5$ transistors per system
– If 1 device failure per month is set as the minimum acceptable condition, then the failure rate must be $< 1/(10^5 \times 720\ \text{h}) \approx 14 \times 10^{-9}$ failures/device-hour
• 1 FIT $\equiv$ 1 failure per $10^9$ device-hours
• The objective for the hypothetical system is therefore a failure rate below 14 FIT

MAIN FAILURE MODES for VLSI
Printed circuit board failures
• Printed circuit boards (PCBs) are vulnerable to environmental influences; for example, the traces are corrosion-prone and may be improperly etched, leaving partial shorts, while the vias may be insufficiently plated through or filled with solder. The traces may crack or make poor contact under mechanical loads, often resulting in unreliable PCB operation. Residues of solder flux may facilitate corrosion; residues of other materials on PCBs can cause electrical leaks. Polar covalent compounds, like antistatic agents, can attract moisture, forming a thin conductive layer of moisture between the traces.
• Vias are a common source of unwanted series resistance. Mousebites are regions where the metallization has a decreased width; such defects usually do not show up during electrical testing but present a major reliability risk. Increased current density in the mousebite can aggravate electromigration problems.

Semiconductor (LED and MOSFET) failures
Many failures result in generation of hot electrons. These are observable under an optical microscope. Examples of semiconductor failures include accumulation of charge carriers trapped in the gate oxide of MOSFETs.
This introduces permanent gate biasing, influencing the transistor's threshold voltage; it may be caused by hot carrier injection, ionizing radiation, or nominal use.

Electrostatic discharge (ESD)
Electrostatic discharge is a subclass of electrical overstress and may cause immediate device failure, permanent parameter shifts, and latent damage that increases the degradation rate. It has at least one of three components: localized heat generation, high current density, and high electric field gradient; prolonged presence of currents of several amperes transfers energy to the device structure and causes damage. Catastrophic ESD failure modes include:
• Junction burnout, where a conductive path forms through the junction and shorts it
• Metallization burnout, where melting or vaporizing of a part of the metal interconnect interrupts it
• Oxide punch-through, the formation of a conductive path through the insulating layer between two conductors or semiconductors; the gate oxides are thinnest and therefore most sensitive. The damaged transistor (MOSFET) shows a low-ohmic junction between gate and drain terminals.
A parametric failure only shifts the device parameters and may manifest in stress testing; sometimes the degree of damage can lessen over time. Latent ESD failure modes occur in a delayed fashion and include:
• Insulator damage by weakening of the insulator structures.
• Junction damage by lowering minority carrier lifetimes, increasing forward-bias resistance, and increasing reverse-bias leakage.
• Metallization damage by conductor weakening.
Catastrophic failures require the highest discharge voltages, are the easiest to test for, and are the rarest to occur. Parametric failures occur at intermediate discharge voltages and occur more often, with latent failures the most common. For each parametric failure, there are 4–10 latent ones.
(Crownlite Mfg. Corp., Failure modes of electronics)
• The gate oxide of some MOSFETs can be damaged by 50 volts of potential: the gate is isolated from the junction, and potential accumulating on it causes extreme stress on the thin dielectric layer; stressed oxide can shatter and fail immediately. In other cases the gate oxide does not fail immediately, but its breakdown is accelerated by stress-induced leakage current, and the oxide damage leads to a delayed failure after prolonged operation.
• On-chip capacitors using oxide or nitride dielectrics are also vulnerable.
• Smaller structures are more vulnerable because of their lower capacitance, meaning the same amount of charge charges the capacitor to a higher voltage.
• All thin layers of dielectrics are vulnerable; hence, chips made by processes employing thicker oxide layers are less vulnerable.

Capacitors
• Capacitors are characterized by their capacitance, parasitic resistance in series and parallel, breakdown voltage, and dissipation factor; structurally, capacitors consist of electrodes separated by a dielectric, connecting leads, and housing; deterioration of any of these may cause parameter shifts or failure. Shorted failures and leakage due to a change in parallel parasitic resistance are the most common failure modes of capacitors, followed by open failures. One common example of capacitor failure is dielectric breakdown due to overvoltage or aging of the dielectric, which occurs when the breakdown voltage falls below the operating voltage.
• In addition to the problems listed above, electrolytic capacitors fail when power dissipation from high ripple currents and internal resistance raises the capacitor's internal temperature beyond specifications, accelerating the deterioration rate; such capacitors usually fail short.

Reliability of semiconductor devices can be summarized as follows:
1. Semiconductor devices are very sensitive to impurities and particles.
Therefore, to manufacture these devices it is necessary to manage many processes while accurately controlling the level of impurities and particles.
2. The problems of micro-processes and thin films must be fully understood as they apply to metallization and wire bonding. It is also necessary to analyze surface phenomena from the aspect of thin films.
3. Reliability of semiconductor devices may depend on assembly, use, and environmental conditions. Stress factors affecting device reliability include gas, dust, contamination, voltage, current density, temperature, humidity, mechanical stress, vibration, shock, radiation, pressure, and intensity of magnetic and electrical fields.

Reliability Theory
• A useful mathematical description requires precise definition of the terms
• Definitions:
• Reliability -- probability that an item will perform a required function under stated conditions for a stated period of time
• For an IC the required function is generally defined by a test program for an automatic test set
• Often initial test programs are not complete and the circuits are not tested under "all" required conditions
• As new device failure modes are identified, the appropriate tests are included in later test programs
• Stated conditions -- comprise the total physical environment, including mechanical, thermal, electrical, ...
• Stated period of time -- the time during which satisfactory operation is required

Cumulative Distribution Function
• Assume the device is operational at $t = 0$.
The probability that the device will fail at or before time $t$ is given by the function $F(t)$:
$F(t) = 0$ for $t < 0$
$0 \le F(t) \le F(t')$ for $0 \le t \le t'$
$F(t) \to 1$ as $t \to \infty$

Reliability Function and Probability Density Function
• The probability density function is $f(t) = dF(t)/dt$
• The cumulative distribution function is $F(t) = \int_0^t f(x)\,dx$
• The reliability function is $R(t) = 1 - F(t) = \int_t^\infty f(x)\,dx$
• Thus $f(t) = -dR(t)/dt$

Failure Rate
• In many applications the quantity of most concern is the instantaneous failure rate
• This is often referred to as the hazard rate
• The fraction of the original devices that were good at time $t$ and that fail by $t + \delta t$ is given by $F(t + \delta t) - F(t) = R(t) - R(t + \delta t)$
• The average failure rate during the time interval $\delta t$, relative to the devices still good at $t$, is $\lambda(t) = \frac{1}{\delta t}\,\frac{R(t) - R(t + \delta t)}{R(t)}$
• For $\delta t \to 0$:
$\lambda(t) = -\frac{1}{R(t)}\frac{dR(t)}{dt} = \frac{f(t)}{R(t)} = \frac{f(t)}{1 - F(t)} = -\frac{d[\ln R(t)]}{dt}$
• $R(t) = \exp\left[-\int_0^t \lambda(x)\,dx\right]$

Mean Time to Failure (MTTF) and Common Distribution Functions (p. 630 Sze)
• MTTF is a common measure of reliability: $\text{MTTF} = \int_0^\infty t\,f(t)\,dt$
• It is desirable to have a single mathematical model that represents the failure rate of devices over their entire lifetime
• $\lambda(t)$ generally varies as a function of time, with three regions:
A. High "early failure" or "infant mortality" rate due to manufacturing defects
B. "Midlife" or "steady state" period of low and generally constant failure rate
C. "Final" or "wear out" period
[Figure: failure rate $\lambda(t)$ versus time $t$, showing regions A, B, and C]

Exponential Distribution Function
• The simplest distribution function, the exponential, is characterized by a constant failure rate over the lifetime of the device.
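This is useful for representing a device in which all early failure mechanisms have been eliminated
– $\lambda(t) = \lambda_0$
– $R(t) = \exp(-\lambda_0 t)$
– $F(t) = 1 - \exp(-\lambda_0 t)$
– $f(t) = \lambda_0 \exp(-\lambda_0 t)$
– $\text{MTTF} = \int_0^\infty t\,\lambda_0 \exp(-\lambda_0 t)\,dt = 1/\lambda_0$

Weibull Distribution
• $\lambda(t)$ varies as a power of the age of the device: $\lambda(t) = (\beta/\alpha)\,t^{\beta - 1}$, where $\alpha$ and $\beta$ are constants
• For $\beta < 1$ the failure rate decreases with time and can be used to represent early failure
• For $\beta = 1$ the failure rate is constant and can be used to represent steady state; this is the exponential distribution, a special case of the Weibull distribution
• For $\beta > 1$ the failure rate increases with time and can be used to represent the wear-out condition
• $R(t) = \exp\{-(1/\alpha)\,t^{\beta}\}$
• $f(t) = (\beta/\alpha)\,t^{\beta - 1} \exp\{-(1/\alpha)\,t^{\beta}\}$
• $\text{MTTF} = \alpha^{1/\beta}\,\Gamma(1 + 1/\beta)$, where $\Gamma$ is the gamma function
• Since $1 - F(t) = \exp(-t^{\beta}/\alpha)$, we have $\ln \ln \frac{1}{1 - F(t)} = \beta \ln t - \ln \alpha$, which is linear in $\ln t$. The slope of the line is $\beta$.
• The time at which $F(t) = 0.5$ is the median time to failure, which in general differs from the MTTF.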
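The exponential and Weibull expressions can be checked against each other in a few lines. This is a minimal Python sketch under the parameterization used in these slides, $\lambda(t) = (\beta/\alpha)\,t^{\beta-1}$, which differs from the scale/shape convention of some statistics libraries. All function names and the example parameter values are hypothetical choices for illustration.

```python
import math


def exponential_reliability(t, lam0):
    """R(t) = exp(-lambda0 * t) for a constant failure rate lambda0."""
    return math.exp(-lam0 * t)


def exponential_mttf(lam0):
    """MTTF = 1 / lambda0 for the exponential distribution."""
    return 1.0 / lam0


def weibull_hazard(t, alpha, beta):
    """lambda(t) = (beta/alpha) * t^(beta-1); requires t > 0."""
    return (beta / alpha) * t ** (beta - 1.0)


def weibull_reliability(t, alpha, beta):
    """R(t) = exp(-t^beta / alpha)."""
    return math.exp(-(t ** beta) / alpha)


def weibull_pdf(t, alpha, beta):
    """f(t) = lambda(t) * R(t)."""
    return weibull_hazard(t, alpha, beta) * weibull_reliability(t, alpha, beta)


def weibull_mttf(alpha, beta):
    """MTTF = alpha^(1/beta) * Gamma(1 + 1/beta)."""
    return alpha ** (1.0 / beta) * math.gamma(1.0 + 1.0 / beta)


def weibull_median(alpha, beta):
    """Time at which F(t) = 0.5, i.e. t^beta / alpha = ln 2."""
    return (alpha * math.log(2.0)) ** (1.0 / beta)


def weibull_plot_point(t, alpha, beta):
    """Return (ln t, ln ln[1/(1-F)]); these points lie on a line of slope beta."""
    r = weibull_reliability(t, alpha, beta)  # r = 1 - F(t)
    return math.log(t), math.log(-math.log(r))


if __name__ == "__main__":
    alpha, beta = 1.0e4, 2.0  # assumed wear-out example (beta > 1), t in hours
    print("MTTF  :", weibull_mttf(alpha, beta))
    print("median:", weibull_median(alpha, beta))
    x1, y1 = weibull_plot_point(10.0, alpha, beta)
    x2, y2 = weibull_plot_point(100.0, alpha, beta)
    print("Weibull-plot slope:", (y2 - y1) / (x2 - x1))
```

Setting `beta = 1` and `alpha = 1 / lam0` makes the Weibull functions coincide with the exponential ones, which is a quick consistency check on the formulas.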