High Resolution X-ray Scattering Methods For ULSI Materials Characterization

Richard J. Matyi
Physics Laboratory, National Institute of Standards and Technology, Gaithersburg, MD 20899

Abstract. X-ray analytical methods with high angular resolution are becoming increasingly important for the characterization of materials used in ULSI fabrication. Vendors now market state-of-the-art X-ray tools for the routine analysis of parameters such as layer thickness, chemical composition, strain relaxation, and interfacial roughness. The recent integration of X-ray diffraction and reflectivity systems into fab-compatible process metrology tools suggests that the importance of these techniques will only increase with time. Here we discuss some basic principles of high resolution X-ray methods (notably double- and triple-axis X-ray diffractometry and high resolution X-ray reflectometry) and describe the capabilities and limitations of these tools for ULSI materials. Reference will be made to “real-life” problems involving bulk and thin-film structures (ranging from amorphous dielectrics and polycrystalline metals to highly perfect epitaxial single crystal materials) to show both the utility and the shortcomings of high resolution X-ray methods.

INTRODUCTION

The continual reduction in the dimensions of semiconductor device structures is placing increasingly stringent demands on key metrology tools. Because their wavelengths (less than 10 Å) are similar to the sizes of the smallest features in advanced devices, X-rays are well suited to the structural characterization of very small and/or thin structures. The interactions of X-rays with atoms tend to be relatively weak (unlike, for instance, electron-solid interactions), so they can typically be described in terms of perturbations. This greatly simplifies the mathematical description of the X-ray scattering process and permits quantitative modeling of that process with relative ease.

High resolution X-ray scattering is an attractive tool for semiconductor metrology applications due to its sensitivity to the structure of semiconductor materials and its capability for obtaining quantitative information [1-3]. Until recently X-ray tools were typically confined to the laboratory benchtop; now, however, a number of equipment vendors are introducing fab-compatible X-ray tools. In this paper we will examine some of the theoretical and practical aspects of modern high resolution X-ray measurement techniques to illustrate the utility and the limitations of these methods. Three high resolution techniques will be discussed: (a) double axis X-ray diffractometry, (b) triple axis X-ray diffractometry, and (c) X-ray reflectometry.

THEORETICAL BASIS OF HIGH RESOLUTION X-RAY METHODS

The scattering of X-rays by a solid† is conveniently described in terms of a hierarchy of processes. At the lowest level is the scattering of an incident X-ray beam (which is an electromagnetic wave) by a single electron. From electromagnetic wave theory it is known that the electric field will exert a force on the electron; since the field is varying sinusoidally with time, the electron will be accelerated. However, classical theory also says that an accelerated charge must radiate, so the oscillating electron will become a source of scattered radiation that has the same frequency as the incident wave. In this manner a single electron becomes a source of scattered radiation, albeit a weak one: the intensity is proportional to the square of the classical electron radius, $r_e^2$, or about $7.94\times10^{-30}\ \mathrm{m^2}$. Thus we can observe scattered X-rays at all only because of the large number of electrons present in most solids.

Now we extend the hierarchy of scattering to the electrons that surround a particular atom.
Since an atom is “large” compared with the X-ray wavelength, different X-ray waves will exhibit phase relationships that depend on the direction in which the scattering is observed. In the forward (main beam) direction, the waves scattered by all of the electrons are in phase; the total amplitude scattered in this direction will be the number of electrons (Z) times the single electron scattering. In other directions, the waves generated by different electrons will exhibit differences in phase, causing partial interference and a decrease in the scattered amplitude. Thus the atomic scattering factor decreases from a value Z in the forward direction towards zero as the scattering angle increases.

† For brevity in this overview we consider only elastic scattering. More comprehensive treatments are given in a number of excellent texts, such as Warren [4] and Als-Nielsen and McMorrow [5].

CP683, Characterization and Metrology for ULSI Technology: 2003 International Conference, edited by D. G. Seiler, A. C. Diebold, T. J. Shaffner, R. McDonald, S. Zollner, R. P. Khosla, and E. M. Secula. 2003 American Institute of Physics 0-7354-0152-7/03/$20.00

Next, consider the scattering of a collection of atoms, specifically the unit cell of a crystal structure. Again, waves scattered in different directions will have different relative phases, and in most cases a summation over all scattered X-rays results in destructive interference that reduces the net scattered amplitude to zero. However, in specific directions, the waves will reinforce to produce a non-zero amplitude, with the amplitude scattered per unit structure being known as the structure factor.

A simple physical model based on the dependence of the phase of the scattered radiation on the direction of the scattering is known as Bragg’s Law. X-rays of wavelength λ scattered from the atoms in adjacent crystal planes separated by a distance d will interfere constructively if the path difference between the two planes is an integral number of wavelengths. This physical process gives rise to the Bragg equation:

$$n\lambda = 2d\sin\theta \qquad (1)$$

While this equation is widely used in the scientific community, a more useful description of the X-ray scattering process is shown in Figure 1a. A more complete analysis of the phase relations of the waves scattered by a three-dimensional array of atoms shows that the maximum of constructive interference can be described by a vector equation:

$$\frac{\mathbf{S} - \mathbf{S}_o}{\lambda} = \mathbf{H}^*_{hkl} \qquad (2)$$

where $\mathbf{S}_o$ and $\mathbf{S}$ are unit vectors in the incident and diffracted beam directions, respectively, and $\mathbf{H}^*_{hkl}$ is a vector perpendicular to a set of diffracting planes (hkl) with a length inversely proportional to the interplanar spacing d. Because the magnitude of $\mathbf{H}^*_{hkl}$ has a reciprocal relationship to the interplanar spacing, it is usually called a reciprocal lattice vector.

FIGURE 1. Vector representation of the diffraction process with respect to the diffracting planes (left) and the Ewald sphere (right).

While Equation 2 and Figure 1a indeed describe the condition needed for a diffraction maximum, Figure 1b shows a useful variation on the theme. Like other vectors, the incident and diffracted wavevectors $\mathbf{S}_o/\lambda$ and $\mathbf{S}/\lambda$ may be translated as long as their directions and magnitudes are not changed. The condition for diffraction is then satisfied when the resultant of the vectors is coincident with the reciprocal lattice vector $\mathbf{H}^*_{hkl}$. Note that if $\mathbf{S}/\lambda$ is randomized over all possible orientations a sphere is generated. This sphere describes the locus of permitted conditions for positive reinforcement of the scattered X-rays (i.e. a diffraction maximum) and is often called the Ewald sphere.

The concepts discussed above comprise the kinematic theory of X-ray diffraction. Implicit in the kinematic theory are critical assumptions:
• The interaction between the incident X-rays and the solid is sufficiently weak that the energy loss by the incident beam is negligible and there is no change in the X-ray wavelength within the solid.
• The intensities of the scattered X-rays are low.
• An X-ray photon is scattered only once; there are no multiple scattering events.
• The diffracting crystal is small and far away from the detector.

In the case of powder samples and highly imperfect single crystals, these assumptions are usually warranted. However, in large, highly perfect semiconductor crystals, one or even all may become invalid. In these cases a more complex theory is needed to describe the diffraction process. In this approach (usually referred to as the dynamic theory of diffraction) a solution is found to Maxwell’s equations for the propagation of electromagnetic waves in a medium with a periodically varying electrical susceptibility. By enforcing proper boundary conditions, we arrive at a solution for waves that satisfy both Bragg’s Law and Maxwell’s equations.

FIGURE 2. The dynamic theory of diffraction: (a) the Ewald sphere from the kinematic theory; (b) definition of wavevectors that satisfy the diffraction condition inside and outside the crystal; (c) definition of the deviation parameters αo and αh.

Figure 2 illustrates how this treatment arises. In kinematic diffraction, the permitted wavevectors are described by the Ewald sphere. In the dynamic case, the incident and diffracted wavevectors are coupled; their lengths change (due to a change in X-ray wavelength within the crystal), and their common origin describes a point which is displaced from the spheres that can be drawn about the tail (point 000) and head (point hkl) of the reciprocal lattice vector $\mathbf{H}^*_{hkl}$. Very close to the kinematic origin defined by $\mathbf{S}_o/\lambda$ and $\mathbf{S}/\lambda$, these two spheres can be approximated by planes; as shown in Figure 2, the deviations from the kinematic diffraction conditions caused by the dynamic interactions can be denoted αo and αh. These deviations generate two hyperbolic sheets in three dimensions that are asymptotic to the kinematic spheres; this is known as the dispersion surface and is given by [1,6,7]

$$\alpha_o \alpha_h = k^2 C^2 \chi_h \chi_{\bar{h}} \qquad (3)$$

where k is the vacuum wavevector, C describes the polarization of the X-ray beam, and the electrical susceptibility $\chi_h = -(r_e \lambda^2 / \pi V)\,F_h$ is given by the classical electron radius $r_e$, the unit cell volume V, and the structure factor $F_h$ corresponding to reflection (hkl). Equation (3) is central to the dynamic theory, since it gives the wavevectors that are permitted under dynamic diffraction conditions as well as other insights. For example, it can be shown that the separation of the hyperbolic sheets gives the range corresponding to the width of an X-ray reflection. This separation is the X-ray analogue to the “energy gap” that arises when electrons propagate through a crystal. It yields a peak breadth in a symmetric reflection geometry of:

$$\Delta\theta = \frac{2 r_e \lambda^2 F_h}{\pi V_C \sin 2\theta_B} \qquad (4)$$

where $V_C$ is the volume of the unit cell. Peak breadths of a few arcseconds are usually seen in typical semiconductor materials under laboratory conditions.

X-rays have different wavelengths inside and outside a solid, implying that the index of refraction for X-rays differs from unity. It is in fact given by [1,6,7]

$$n = 1 - \frac{\lambda^2 r_e \rho_e}{2\pi} - i\,\frac{\lambda \mu_x}{4\pi} = 1 - \delta - i\beta \qquad (5)$$

where $\rho_e$ is the electron density of the solid and $\mu_x$ is the linear absorption coefficient. Both the real (δ) and imaginary (β) components are very small for most semiconductor materials under laboratory conditions. For instance, if silicon is examined with CuKα X-rays, $\delta \approx 7.4\times10^{-6}$ and $\beta \approx 1.9\times10^{-7}$. The index of refraction for X-rays is thus slightly less than unity, so they can experience total external reflection at an interface. The critical angle for total external reflection is:

$$\theta_{crit}^2 = 2\delta \;\Rightarrow\; \theta_{crit} = \lambda \sqrt{\frac{r_e \rho_e}{\pi}} \qquad (6)$$

In the case of silicon and CuKα radiation, the critical angle is about 0.22°. At incidence angles greater than θcrit, partially reflected and transmitted X-ray wavefields will both be present. This situation is identical to that in conventional optics, where the Fresnel equations give the reflection and transmission coefficients for a light ray that is incident on an interface and forms reflected and refracted rays. Neglecting an absorption correction, the X-ray reflection coefficient becomes:

$$R(\theta) = \left| \frac{\theta - \sqrt{\theta^2 - \theta_{crit}^2}}{\theta + \sqrt{\theta^2 - \theta_{crit}^2}} \right|^2 \qquad (7)$$

With this in hand it is possible to describe three generic aspects of an X-ray reflectivity profile that would be obtained from a bulk sample:
• For θ < θcrit there would be a constant specular reflected intensity;
• At θ = θcrit the profile would show a sharp drop in the reflectivity;
• For θ > θcrit the intensity profile would decrease with a $\theta^{-4}$ dependence.

Of course, the ability to characterize thin film materials is far more important in the semiconductor environment. A procedure for doing so was developed by Parratt [8], which uses a recursion relationship for an arbitrary N-layer structure on a semi-infinite substrate:

$$R_j = \frac{r_j + R_{j+1}\,a_{j+1}^2}{1 + r_j R_{j+1}\,a_{j+1}^2} \qquad (8)$$

In this equation, $R_j = E_j^r / E_j^t$ is the ratio of the reflected electric field amplitude to the transmitted field.
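Before continuing with the multilayer recursion, the bulk-sample relations of Equations (5) to (7) can be checked numerically. The following is a minimal sketch, not a fitting tool: it assumes a CuKα wavelength of 1.5406 Å and derives the electron density of silicon from its mass density (2.33 g cm⁻³), atomic mass, and Z = 14. Those inputs, the function names, and the neglect of absorption are assumptions made for illustration.

```python
import cmath
import math

R_E = 2.8179403e-15  # classical electron radius, metres


def delta_from_electron_density(wavelength_m, rho_e_per_m3):
    """Real decrement delta of the refractive index, Equation (5)."""
    return wavelength_m ** 2 * R_E * rho_e_per_m3 / (2.0 * math.pi)


def critical_angle(delta):
    """Critical angle in radians, Equation (6): theta_crit**2 = 2 * delta."""
    return math.sqrt(2.0 * delta)


def fresnel_reflectivity(theta, theta_crit):
    """Bulk reflectivity of Equation (7); angles in radians, absorption neglected."""
    # Below the critical angle the square root is purely imaginary,
    # so the modulus of the ratio is exactly one (total external reflection).
    root = cmath.sqrt(theta ** 2 - theta_crit ** 2)
    return abs((theta - root) / (theta + root)) ** 2


# Assumed inputs (illustrative, not taken from the paper):
WAVELENGTH = 1.5406e-10                              # Cu K-alpha, metres
RHO_E_SI = 2.33e6 * 6.02214076e23 / 28.0855 * 14     # electrons per m^3

delta_si = delta_from_electron_density(WAVELENGTH, RHO_E_SI)
theta_c = critical_angle(delta_si)

print(f"delta      = {delta_si:.3e}")
print(f"theta_crit = {math.degrees(theta_c):.3f} deg")
for multiple in (0.5, 1.0, 2.0, 5.0, 10.0):
    r = fresnel_reflectivity(multiple * theta_c, theta_c)
    print(f"R({multiple:4.1f} theta_crit) = {r:.3e}")
```

Under these assumptions the computed δ and critical angle agree with the values quoted in the text (δ ≈ 7.4×10⁻⁶, θcrit ≈ 0.22°), and well above the critical angle the reflectivity approaches (θcrit/2θ)⁴, which is the θ⁻⁴ fall-off listed in the third bullet.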
The complex amplitude $a_j = \exp(i k_{j,z} d_j)$ corresponds to layer j (1 ≤ j ≤ N) with thickness $d_j$ and depends on the z-component of the wavevector, which from Snell’s law is $k_{j,z} = k\,(n_j^2 - \cos^2\theta)^{1/2}$, with $n_j$ being the refractive index of layer j. The Fresnel coefficient $r_j$ for interface j is then $r_j = (k_{j,z} - k_{j+1,z}) / (k_{j,z} + k_{j+1,z})$. Equation (8) is solved by starting at the bottom (layer N closest to the substrate) and noting that the reflected intensity coming up from the semi-infinite substrate will be ideally zero.

DOUBLE AXIS DIFFRACTOMETRY

With the theoretical background established, it is now possible to critically examine the various high resolution X-ray diffraction methods used in semiconductor metrology. The first of these is double axis X-ray diffractometry, where the incident beam from an X-ray source is monochromated by a crystal (or set of crystals‡) with a high degree of structural perfection. From our discussion of dynamical diffraction, we recall that the intrinsic reflection range of a perfect crystal is typically a few arcseconds. If a beam of X-rays from a laboratory X-ray source (for instance, CuKα radiation from a copper X-ray tube) is directed at this crystal, then only those CuKα X-rays within the perfect crystal reflecting range of a few arcseconds will be diffracted. The X-rays diffracted by the monochromator will likewise have an angular spread of only a few arcseconds. This highly conditioned beam can then be directed at a sample crystal as it is rotated about an axis perpendicular to the plane defined by $\mathbf{S}_o/\lambda$, $\mathbf{S}/\lambda$, and $\mathbf{H}^*_{hkl}$. The resulting plot of the multiply diffracted intensity as a function of sample crystal rotation is known as a “rocking curve.” A schematic of the double axis geometry is seen in Figure 3.

‡ The fact that many instruments use multiple crystals as incident beam conditioners has led to the term “double axis” to identify a system with a monochromator and sample, rather than the older “double crystal” name.

FIGURE 3. Double axis rocking curve geometry in the laboratory perspective (top) and in terms of the Ewald sphere (bottom).

If the monochromator and sample crystals are identical and sufficiently perfect to diffract dynamically, then the net rocking curve will be the correlation of the two perfect crystal single reflection profiles. In the case of the (400) reflection of CuKα X-rays from silicon, for instance, the peak breadth from dynamic theory is about 3.5 arcseconds. If there are structural defects in the sample crystal, their presence will be indicated as an extra broadening of the rocking curve. For this reason, the breadth of the double-axis rocking curve has long been used as an indicator of the relative “quality” of the sample crystal.

Double axis rocking curves are widely used for the analysis of thin epitaxial (single crystal) layers on crystalline substrates. The presence of an epitaxial layer on the substrate results in two reciprocal lattice vectors directed normal to the diffracting planes (or, in the case shown here where the diffracting planes are parallel to the surface, normal to the sample). If we assume that the d-spacing of the layer is less than the corresponding d-spacing of the substrate, then $|\mathbf{H}^*_{hkl}(\text{layer})| > |\mathbf{H}^*_{hkl}(\text{sub})|$. As the sample is rotated through the Bragg reflection condition, Figure 3 shows that the reciprocal lattice points corresponding to the substrate (larger d-spacing, smaller $\mathbf{H}^*_{hkl}$) and layer (smaller d-spacing, larger $\mathbf{H}^*_{hkl}$) pass through the Ewald sphere and satisfy the condition for diffraction. However, the use of a wide-open detector means that a “fan” of doubly-diffracted beams can be intercepted by the detector. Thus any feature touching the Ewald sphere will diffract and can be detected.

Figure 4 shows a (004) double axis rocking curve from the compound semiconductor AlGaAs (approximately 37% Al) on a GaAs substrate. In this case the lattice parameter of the layer is larger than that of the substrate, so the layer peak is observed at a smaller angle. The interference of the diffracted wavefields from the epitaxial layer and the substrate gives rise to interference fringes§ with an angular spacing given by:

$$\delta = \frac{\lambda\,\gamma_{hkl}}{t \sin 2\theta_B} \qquad (9)$$

where t is the layer thickness and $\gamma_{hkl}$ is the cosine of the angle between the diffracted beam direction and the inward-pointing surface normal.

FIGURE 4. A typical double axis X-ray rocking curve from AlGaAs/GaAs. (Axes: intensity in counts s⁻¹ versus rocking angle in arcseconds.)

FIGURE 5. Double axis rocking curves from boron-implanted silicon: (top) 5 keV B-implant, 113 reflection, all samples as-implanted with dose indicated; (bottom) 3.5 keV B-implant, 224 reflection, post-implant anneal at temperature and time indicated. (Axes: intensity versus rocking angle in arcseconds. Legend entries, top, a to d: undoped, 1.1 × 10^15, 2.6 × 10^15, 4.4 × 10^15; bottom, a to e: as implanted; 5 s, 950°C; 20 s, 950°C; 30 s, 1050°C; 30 s, 1100°C.)

While double axis rocking curves can be used to provide accurate thickness measurements, a more common use is in determining the structural characteristics of crystalline materials. Any modification to the crystalline structure of either bulk or thin film materials can generate changes in lattice parameter that can be measured in a rocking curve experiment. Figure 5 illustrates rocking curves from B-implanted Si following implantation with different doses ([9], top) and at a constant dose but varying post-implant anneals ([10], bottom).
In the as-implanted state, B- and Si-interstitials expand the host silicon crystal structure and generate scattering at lower angles than that of bulk silicon. In contrast, the post-implant activation anneal allows the implanted boron to migrate to Si-lattice sites. In the case of B/Si this causes the lattice parameter to decrease, so that excess scattering will be seen at angles greater than the bulk Si Bragg angle.

Figure 6 gives an example of the analysis of chemical composition in epitaxial layers from double axis rocking curves. The Figure shows 004 rocking curves that were obtained from a series of samples of SiGe grown by molecular beam epitaxy on GaAs [11]. The nominal lattice parameter of pure Ge (5.6568 Å) is larger than that of GaAs (5.6534 Å), so the layer peak appears at a smaller angle with respect to the substrate. As silicon (nominal lattice parameter 5.4310 Å) is added to the epitaxial layer, the lattice parameter decreases; typically it is assumed that the crystalline lattice parameter of a single-phase compound is a linear function of the chemical composition (usually known as Vegard’s Law). The decrease in lattice parameter with increasing Si causes the epitaxial layer peak to “pass by” the substrate peak. For small lattice mismatches, the rocking curve shows well-defined interference fringes, indicating a strong dynamical interaction between the wavefields diffracted by the layer and the substrate. This implies that the structural perfection of the layer is at least high enough to permit dynamical diffraction to occur. At the highest silicon content, however, the epitaxial layer peak is significantly broadened and there is no evidence of interference fringes. Both of these are indicative of a decrease in structural quality of the GeSi layer.

§ Some authors incorrectly refer to these as “Pendellösung fringes” in the belief that they are analogous to dynamical diffraction features that arise from the selection of active wavevectors at the dispersion surface. However, the thickness fringes seen in rocking curves do not come from an oscillation between the sheets of the dispersion surface and hence they are not Pendellösung fringes.

FIGURE 6. Double axis rocking curves from SiGe grown on GaAs with compositions ranging from 0% Si (bottom) to 9.4% Si (top).

The decrease in structural perfection with increasing difference in the lattice parameter between an epitaxial layer and its substrate is a consequence of the strain relaxation in the layer/substrate system. The key features involved with strain relaxation are shown in Figure 7. Consider an epitaxial layer material whose lattice parameter is larger than that of the substrate. If the layer is very thin, then the energy of the layer/substrate system will be minimized if the layer is pseudomorphic with the substrate; that is, the layer elastically deforms so that the in-plane lattice parameter of the layer matches that of the substrate. As the layer thickness increases, however, the strain energy in the layer becomes so large that the layer/substrate interface will decompose into a lattice-mismatched structure, with the misfit strain being accommodated by interfacial dislocations.

FIGURE 7. Strained layer growth showing an independent substrate and layer (left), a layer/substrate system transitioning from fully strained to fully relaxed (center), and the effect of tilt and shear distortions (right).

Strain relaxation by the formation of interfacial dislocations poses two important challenges to the use of double axis rocking curves for the determination of layer composition via lattice parameter measurements. First, even if there is no relaxation of the mismatch strain, the crystallographic unit cell of the layer will be distorted with respect to its unstrained state. The amount of distortion will depend on the elastic constants of the layer, which may not be well known. Second, if the degree of relaxation is not known, then the relation between the partially-strained and the unstrained unit cell will make composition measurements impossible. The common approach to dealing with this is to obtain rocking curves from both symmetric reflections (diffracting planes parallel to the interface shown in Figure 7) and asymmetric reflections from inclined planes. If the distortion in the epitaxial layer is relatively simple (such as a cubic unit cell distorting into a tetragonal structure), only one or two asymmetric reflections may be needed. However, if the layer is tilted, sheared, or otherwise deformed into a more complex, low symmetry crystal system, a complete definition of the structure of the layer may require a large number of asymmetric reflections.

While the preceding discussion has been confined to single layers, multilayer or superlattice structures can also be characterized by double axis diffraction. Figure 8 shows a (004) rocking curve from a Si/GaAs superlattice grown on GaAs [12]. Due to the lattice mismatch between the two materials, the silicon layers had to be kept very thin (~3 Å) while the GaAs layers could be thicker (336 Å in this case).

FIGURE 8. Double axis rocking curve from a 10-period Si/GaAs superlattice [12]. (Axes: intensity in counts s⁻¹ versus rocking angle in arcseconds.)
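Two of the single-layer relations discussed above, the fringe spacing of Equation (9) and Vegard’s Law, reduce to short calculations. The sketch below is illustrative only. It assumes a symmetric reflection, for which the magnitude of γ is taken as sin θB, a CuKα wavelength of 1.5406 Å, and the nominal lattice parameters quoted in the text; the function names are hypothetical. The Vegard estimate applies to a relaxed lattice parameter, so for strained layers the elastic distortion described above would have to be accounted for first.

```python
import math

ARCSEC = math.pi / (180.0 * 3600.0)  # radians per arcsecond


def fringe_spacing(wavelength, theta_b, thickness, gamma=None):
    """Angular fringe spacing (radians) from Equation (9)."""
    if gamma is None:
        gamma = math.sin(theta_b)  # assumed: symmetric reflection
    return wavelength * gamma / (thickness * math.sin(2.0 * theta_b))


def thickness_from_fringes(wavelength, theta_b, spacing, gamma=None):
    """Layer thickness from a measured fringe spacing (Equation (9) inverted)."""
    if gamma is None:
        gamma = math.sin(theta_b)  # assumed: symmetric reflection
    return wavelength * gamma / (spacing * math.sin(2.0 * theta_b))


def vegard_si_fraction(a_layer, a_ge=5.6568, a_si=5.4310):
    """Si fraction x of Si(x)Ge(1-x) from a relaxed lattice parameter (angstroms),
    assuming the linear dependence of Vegard's Law."""
    return (a_ge - a_layer) / (a_ge - a_si)


# Illustrative inputs: GaAs 004 with Cu K-alpha, lengths in angstroms.
WAVELENGTH = 1.5406
A_GAAS = 5.6534
theta_b = math.asin(WAVELENGTH / (2.0 * A_GAAS / 4.0))  # Bragg's Law, n = 1

# Fringe spacing expected for a hypothetical 1 micron (1e4 angstrom) layer.
spacing = fringe_spacing(WAVELENGTH, theta_b, 1.0e4)
print(f"fringe spacing: {spacing / ARCSEC:.2f} arcsec")

# Si fraction at which a relaxed SiGe layer would match the GaAs lattice.
x_match = vegard_si_fraction(A_GAAS)
print(f"lattice-matched Si fraction: {100.0 * x_match:.2f} %")
```

With the lattice parameters quoted in the text, the layer and substrate lattice parameters coincide at roughly 1.5% Si, which is consistent with the layer peak passing by the substrate peak between the 0% and 2.9% Si samples of Figure 6.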
The signature of a superlattice is the appearance of satellite peaks whose separation is related to the superlattice period D by:

$$D = \frac{\lambda}{\sin\theta_+ - \sin\theta_-} = \frac{\lambda}{2\cos\theta_B\,\delta\theta} \qquad (10)$$

where θ+ and θ- are the angular positions of the first high-angle and low-angle satellites, and δθ is the separation between satellites. While it may be surprising that Si layers of 3 Å can be detected, in this case they generate a superlattice pattern by altering the relative phases of the X-rays scattered by the successive GaAs layers. Note also the presence of the “average” superlattice peak at a slightly larger angle than the GaAs substrate reflection; this arises from the average composition of the Si-GaAs alloy. By modeling the positions of this zero-order peak and the satellites, very high accuracy structural parameters can be obtained.

TRIPLE AXIS DIFFRACTOMETRY

In triple axis measurements, the diffracted beam is conditioned by an analyzer crystal before it encounters the detector, as shown in Figure 9. The inclusion of the analyzer with a very low defect density (usually high quality silicon or germanium) and with an acceptance range of a few arcseconds permits the angular position of the diffracted beam to be precisely determined. When combined with a highly perfect monochromator crystal, both the directions and magnitudes of $\mathbf{S}_o/\lambda$ and $\mathbf{S}/\lambda$ are well defined. The angular resolution is on the order of a few arcseconds, which is much finer than that which could be obtained with a simple collimator or narrow slit. Because of these provisions, the volume of reciprocal space sampled at any given angular position of the incident and diffracted wavevectors can be made very small. Operationally, it is much easier to move the sample with respect to the incident beam than vice versa, so in a typical triple axis experiment the angular settings of the sample and the analyzer crystals are manipulated during a scan. The resultant data represent a “map” of the intensity distribution; hence this method is often called reciprocal space mapping.

FIGURE 9. Schematic illustration of a typical high resolution triple axis X-ray diffraction configuration.

Figure 10 illustrates the nominal resolution characteristics of a triple axis X-ray scan. First, parallel to the Ewald sphere we see two “streaks” of intensity. These arise because both the incident beam conditioner and the analyzer crystals have a finite angular dispersion range that can be seen if a highly perfect sample crystal is present. Multiple crystal and multiple reflection geometries are often employed in high resolution triple axis instruments in order to suppress the off-peak tails of the reflection profiles of the monochromator and analyzer assemblies and thus reduce or eliminate these streaks. There is also a streak perpendicular to the surface; this feature (often called the “surface streak”) is a common phenomenon in surface diffraction methods such as reflection high energy electron diffraction (RHEED). It arises due to the truncation of the otherwise infinite 3D crystal structure at a surface. Finally, defects in a crystal will alter the intensity distribution about a reciprocal space point in two ways. Compositional variations and/or strains in the lattice will locally change the lattice parameters of the sample; this will be manifested in the redistribution of intensity away from the exact Bragg condition in the θ/2θ direction (parallel to the direction of the reciprocal lattice vector in a symmetric geometry). In contrast, mosaic spread in the sample will create extra intensity away from the Bragg angle when rotating the sample axis, ω (perpendicular to the reciprocal lattice vector in a symmetric geometry).

FIGURE 10. Nominal resolution characteristics of a high resolution triple axis experiment.

Figure 11 illustrates a reciprocal space map from a typical bulk crystal, in this case a sample of the II-VI semiconductor HgCdTe. Some of the principal features of this reciprocal space map are the following:

1. Intensities are usually plotted as equal-intensity contours on a log scale; here we use four contours per decade (i.e. a step of $10^{0.25}$ counts s⁻¹ per contour level) with a minimum contour of $10^0$ = 1 count s⁻¹.

2. The intensity is plotted in terms of reciprocal lattice coordinates qx, qz that are related to the diffractometer angular coordinates ω and 2θ by [13]:

$$q_x = (2\alpha - \beta)\,\frac{\sin\theta}{\lambda}; \qquad q_z = \beta\,\frac{\cos\theta}{\lambda} \qquad (11)$$

where α and β are the deviations of the sample crystal and the analyzer crystal, respectively, from the exact Bragg condition. Note that the units are in reciprocal microns; for comparison, the length of the (111) reciprocal lattice vector is approximately 2670 µm⁻¹. A comparison with the scale range of ±30 µm⁻¹ shows that the data are indeed “high resolution.”

3. The surface streak is seen running vertically in the Figure; its small inclination indicates that the sample surface was slightly off an exact (111) orientation.

4. Also seen in the Figure is a distribution of diffuse intensity around the 111 reciprocal lattice point. This intensity represents scattering by the grown-in structural defects in the HgCdTe sample; the combination of local misorientations and strains is responsible for the details of the scattering from this particular sample.

FIGURE 11. Experimental (111) triple axis reciprocal space map from bulk HgCdTe.

FIGURE 12. High resolution triple axis reciprocal space maps from SIMOX (left) and bonded Si (right).

Figure 12 shows the use of triple axis diffractometry for Si-materials analysis. On the left-hand side is a 004 reciprocal space map from a silicon-on-insulator (SOI) sample prepared using the SIMOX process. In this approach, a silicon wafer is implanted with a high dose of oxygen; subsequent annealing permits the implanted oxygen to combine with the silicon to form a buried SiO2 layer underneath the top layer of silicon. The reciprocal space map shows significant isotropic diffuse scattering from residual ion implant defects in the top crystalline silicon layer, the substrate, or both. The right-hand side of Figure 12 shows a 004 reciprocal space map from SOI fabricated by wafer bonding. The Figure shows a separate surface streak due to misoriented Si on top of oxide; however, the intensity of the streak and the absence of off-peak diffuse scatter indicate that the material is relatively defect-free. It is interesting to note that the intensity of the surface streak from the top crystalline silicon layer is modulated due to its finite thickness; the repeat period of approximately 10 µm⁻¹ in reciprocal space corresponds to a “real space” thickness of about 0.1 µm.

Our final example is shown in Figure 13, which illustrates triple axis data recorded from the same SiGe/GaAs materials used for the double axis scans in Figure 6. The reciprocal space map from pure Ge on GaAs shows the GaAs reciprocal lattice point surrounded by an observable level of diffuse scatter; the excess scattering is commonly seen in bulk GaAs crystals from the normal level of grown-in defects.
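The angular-to-reciprocal-space conversion of Equation (11), and the thickness estimate drawn from the streak modulation in Figure 12, can be sketched as follows. This is a minimal illustration rather than an instrument routine: the Bragg angle, wavelength, and angular deviations below are assumed values chosen only to exercise the formulas, and the function names are hypothetical. Angular deviations are taken in radians, with the wavelength in microns so that q is returned in µm⁻¹.

```python
import math

ARCSEC = math.pi / (180.0 * 3600.0)  # radians per arcsecond


def q_coordinates(alpha, beta, theta_b, wavelength):
    """Equation (11): map sample (alpha) and analyzer (beta) deviations from the
    exact Bragg condition, in radians, onto (q_x, q_z) in units of 1/wavelength."""
    q_x = (2.0 * alpha - beta) * math.sin(theta_b) / wavelength
    q_z = beta * math.cos(theta_b) / wavelength
    return q_x, q_z


def thickness_from_period(period):
    """Real-space thickness from the repeat period of a thickness modulation
    in reciprocal space (thickness = 1 / period)."""
    return 1.0 / period


# Assumed, illustrative inputs (not taken from the paper):
WAVELENGTH_UM = 1.5406e-4        # Cu K-alpha wavelength in microns
THETA_B = math.radians(11.9)     # hypothetical Bragg angle

# Analyzer moved at twice the sample deviation (beta = 2 * alpha):
# by Equation (11) this gives q_x = 0, so the scan runs purely along q_z.
qx_radial, qz_radial = q_coordinates(10 * ARCSEC, 20 * ARCSEC, THETA_B, WAVELENGTH_UM)

# Sample rotated with the analyzer fixed (beta = 0): a scan purely along q_x.
qx_omega, qz_omega = q_coordinates(10 * ARCSEC, 0.0, THETA_B, WAVELENGTH_UM)

print(f"beta = 2*alpha: q_x = {qx_radial:.3f}, q_z = {qz_radial:.3f} (1/um)")
print(f"beta = 0:       q_x = {qx_omega:.3f}, q_z = {qz_omega:.3f} (1/um)")
print(f"10 per-micron modulation -> {thickness_from_period(10.0):.2f} um")
```

The last line reproduces the estimate given in the text: a repeat period of about 10 µm⁻¹ along the surface streak corresponds to a layer roughly 0.1 µm thick.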
The reciprocal lattice point corresponding to the Ge epitaxial layer is seen at a smaller value of qz; the lack of diffuse scatter around this point suggests a high degree of structural perfection. As the Si content of the epitaxial layer increases, its 004 reciprocal lattice point progressively moves to larger values of qz. While the rocking curves in Figure 6 show a monotonic increase in the separation of the substrate and layer peaks, the reciprocal space maps in Figure 13 demonstrate that the defect structure of both the layer and the substrate changes as the mismatch is increased. As expected, the defect scattering around the layer 004 peak increases with increasing lattice mismatch. However, the reciprocal space maps also show that the defect structure of the substrate experiences significant changes as well. One possible explanation is that the lattice-mismatched growth “injects” defects into the substrate.

FIGURE 13. High resolution triple axis reciprocal space maps from SiGe grown on GaAs with compositions ranging from 0% Si (left) to 9.4% Si (right). The same scale is used in all plots. [Panels: 0% Si, 4.4% Si, 6.4% Si, 9.4% Si; axes: q004 and q220, in Å^-1.]

X-RAY REFLECTOMETRY

X-ray reflectometry (XRR) can be considered as a variant on triple crystal diffractometry where the region of reciprocal space of interest is in the vicinity of the 000 point. Because the distances in reciprocal space are thus very small, the corresponding distances probed in “real” space are relatively large. X-ray reflectometry therefore gives information on the large-scale features of a sample and is insensitive to small-scale crystallography. Hence it can be applied to materials irrespective of their physical state, whether that be amorphous, polycrystalline, or single crystal.

Figure 14 shows the geometry of a simple reflectometer. From the earlier discussion of the index of refraction, we know that reflectometry measurements require the incident X-ray beam to make a small angle with respect to the sample surface, down to a zero angle of incidence where the beam is parallel to the surface. High quality measurements thus require an incident X-ray beam having both a narrow spatial extent and a small angular divergence. In the most common approach (specular reflectivity measurements), the angles of the incident and reflected beams with respect to the surface are kept equal. The reflectivity, R, is the ratio of reflected to incident intensities.

FIGURE 14. Schematic illustration of a simple X-ray reflectometer (top) and one suitable for high resolution analyses of semiconductors (bottom). [Labels: X-ray source, slits, sample, detector, specularly reflected beam, diffuse scatter; parabolic graded multilayer mirror, Ge channel collimator and compressor.]

A typical specular X-ray reflectivity profile from a nominally bare silicon wafer is shown in Figure 15. The Figure also shows the results of a calculated fit to the experimental data. To perform the fit, it was assumed that the sample had a thin layer of native oxide on top of the silicon substrate. The initial densities of the Si and SiO2 were assumed to be the bulk values (taken as 2.33 and 2.27 g cm^-3, respectively); these parameters were allowed to vary, as were the roughness values for the Si/SiO2 interface and the SiO2 top surface. The curve is well fit assuming a thin SiO2 layer (14.5 Å) on top of the bulk substrate. Both the Si/SiO2 and the SiO2/air interfaces show a few Ångstroms of roughness. The densities of both the substrate and the SiO2 are similar to those anticipated for bulk Si and SiO2. The results from this control sample are not at all remarkable and are consistent with what one would expect to see from a silicon wafer that had been sitting in a laboratory ambient for over one year.

FIGURE 15. Experimental and calculated X-ray reflectivity curves from nominally bare silicon. [Legend: experimental data, model fit; axes: Intensity (counts) versus θ.]

Figure 16 shows a typical XRR analysis of an important semiconductor materials system: in this case, a comparison of two different thin diffusion barriers deposited by physical vapor deposition (nominally 70 Å Ta and TaN in the two samples, respectively) beneath a much thicker (nominally 750 Å) copper metallization layer. A cursory examination of the reflectivity data in the Figure shows the superposition of the scattering from the thick Cu layer (distinguished by the rapidly oscillating intensity fringes at low angles) and the more slowly varying fringes (with a period of about 0.4°) that were generated from the thin but high-Z Ta and TaN layers.

FIGURE 16. Experimental and calculated X-ray reflectivity curves from Ta/Cu/Si and TaN/Cu/Si. [Legend: Ta/Cu data, Ta/Cu fit, TaN/Cu data, TaN/Cu fit; axes: Reflectivity versus θ (degrees).]

Greater complexity is found in the ternary TiNSi fabricated by chemical vapor deposition. Figure 17 shows an XRR profile recorded from one sample that illustrates several common features of this system. In this sample, the TiNSi diffusion barrier (nominally 30 Å thick) was deposited at 340°C prior to the overlay of a relatively thick Cu layer. The reflectivity was calculated from a model structure that was assumed to consist of (1) the Si substrate, (2) a silicon nitride interfacial layer, (3) a layer of pure TiN, (4) a titanium nitride silicide layer with an assumed composition (TiN)x(Si3N4)1-x with x = 0.5, (5) the thick Cu layer, and (6) a thin, very rough, top surface layer of copper oxide. The simulations closely match the experimental data, implying that the assumed structures give a reasonably good description of the actual samples. This is emphasized in Figure 17, which shows the fit of the data with the low-angle data plotted on an expanded scale. The inclusion of a thin, rough, low-density top copper oxide layer was found to be essential for generating a high-quality fit. The roughness was calculated by including a factor that damps the otherwise perfect interface reflectivity by a Gaussian height distribution.

SOURCES OF ERRORS IN HIGH RESOLUTION X-RAY ANALYSES

At this point it is tempting to present additional examples that show how well high resolution X-ray methods can articulate the structure of various materials. Instead we will describe some of the more important factors that degrade these measurements. While there are many potential factors that might limit both the accuracy and the precision of high resolution X-ray analyses, here we will discuss only six potential degrading agents: (1) angle metrology; (2) X-ray beam conditioning; (3) mechanical and optical alignment; (4) the sample; (5) noise; and (6) software and analysis.

Angle metrology

The well-known Bragg equation (nλ = 2d sinθ) shows that X-ray analyses are, at their essence, exercises in angle metrology. Accurate measurement of angles is thus at the core of high-quality X-ray analyses. Unfortunately, few users of high resolution X-ray systems characterize the fidelity of the angle scales of their instruments. Reliance on angle encoders to independently read the driving motor shaft or (better) the driven axis is not necessarily better, because all encoders have periodic angle errors and an uncertainty associated with those errors. Techniques based on circle closure provide avenues for the precision calibration of angle measurement tools [14].
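To see why the angle scale matters so much, the Bragg equation can be differentiated at fixed wavelength. This is a standard textbook step rather than a result taken from this paper, and the numerical values in the last line (Cu Kα1 radiation and the Si 004 reflection) are illustrative assumptions:

```latex
n\lambda = 2d\sin\theta
\;\;\Longrightarrow\;\;
0 = 2\sin\theta\,\delta d + 2d\cos\theta\,\delta\theta
\;\;\Longrightarrow\;\;
\frac{\delta d}{d} = -\cot\theta\,\delta\theta

% Illustrative numbers (assumed): \lambda = 1.5406 \text{ \AA},\; d_{004} = 1.3578 \text{ \AA}
% \theta \approx 34.56^{\circ},\; \cot\theta \approx 1.45,\;
% \delta\theta = 1'' \approx 4.85\times10^{-6}\ \text{rad}
\left|\frac{\delta d}{d}\right| \approx 1.45 \times 4.85\times10^{-6} \approx 7\times10^{-6}
```

Under these assumptions, an uncorrected angle error of a single arcsecond already corresponds to a relative lattice spacing error of several parts per million.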
While closure calibration methods are widely used for precision angle division tools, apparently they are rarely applied to high resolution X-ray scattering systems. High quality goniometers may be delivered with angle calibration curves supplied by the vendor, but it is unlikely that the user will place the device in the same thermal, environmental, and mechanical environment as it was in the vendor’s calibration facility. Moreover, since all mechanical components experience mechanical wear, one would anticipate that any calibration would vary with time.

FIGURE 17. XRR profile of a TiNSi diffusion barrier (~30 Å thick) deposited at 340°C on a relatively thick (~750 Å) Cu layer (inset shows the low-angle region). [Axes: Intensity (counts s^-1) versus θ (degrees).]

X-ray beam conditioning

Observation of the features in a high resolution X-ray scan requires extensive conditioning of the X-ray beam. However, the output from an X-ray beam conditioner can alter the appearance of the scattering from the sample in a high resolution X-ray experiment. High angle, high resolution diffraction experiments are usually performed so as to minimize the effects of the beam conditioner, either through the use of a parallel, non-dispersive geometry or through multiple reflection monochromator crystals that reduce the spread in angle and wavelength of the X-ray probe beam. Multiple reflection X-ray optics utilize the natural dynamic reflection range of perfect crystals (nominally a few arcseconds); they are desirable because they preferentially reduce the intensity of the wings of a reflection and are conveniently realized by using channel-cut crystals. Unfortunately, in a laboratory or fabrication facility¶, there is a tradeoff between angular resolution and X-ray flux; high resolution typically reduces the available flux of X-ray photons. Hence there are many approaches to X-ray beam conditioning.
For instance, XRR measurements at NIST are performed using a graded multilayer mirror to collect the X-rays diverging from the line source and produce a quasi-parallel output beam [15]. The output from the mirror is then further conditioned with a monolithic four-reflection Ge monochromator crystal in which the X-ray beam executes three symmetric and one asymmetric (220) reflections. The final (asymmetric) reflection is designed to reduce the spatial extent of the beam in the plane of the reflectometer to approximately 75 µm. This approach is but one of many that have been used in high resolution X-ray analyses. While it is intuitive that the details of the beam conditioning system (specifically, the spread in angle and wavelength of the beam that interrogates the sample) will have a quantitative impact on the results of an X-ray experiment, the magnitude of these effects is not at all obvious.

¶ In this paper we are considering only laboratory-based (as opposed to synchrotron-based) instrumentation because of its compatibility with pre-existing semiconductor fabrication facilities and operations.

Mechanical and optical alignment

The need for mechanical and optical alignment is a well-known prerequisite for accuracy in most X-ray analytical methods. For high resolution work, the condition for an instrument to be “well aligned” requires that the effective X-ray source and all beam defining elements be parallel to each other and perpendicular to the scattering plane, and that the line of intersection of the incident and scattered beams be coincident with the sample surface and the sample rotation axis. The effects of mechanical and optical alignment errors are well documented in powder X-ray diffraction, but in high resolution work they are often not appreciated. The effects of misalignments will be incorporated into the overall error budget of a measurement. Because of finite tolerances in machining (where nothing is “exactly” parallel, perpendicular, or coincident as the case may be), this error budget will always be non-zero and will generate systematic errors. As discussed below, these problems are made worse by the fact that X-ray measurements are further polluted by noise from the X-ray production process.

The sample

The physical state of the sample is an important component of high-resolution X-ray analyses. Mechanical stresses and warpage, spatial non-uniformity, and both surface and interfacial roughness can alter the details (and, in some cases, the gross appearance) of a reflectometry scan. In reflectivity analyses, as long as the sample is sufficiently large or the spatial extent of the incident beam is sufficiently narrow, the beam will be fully intercepted before the critical angle is reached. However, for small samples, the scattering curve may be on the steep θ^-4 descent before the incident beam is completely captured by the sample. In this case direct measurements of either the totally reflected incident intensity or the critical angle are not possible. Our experience suggests that, under these conditions, the structural parameters determined from a fit to the data are subject to much larger errors than they are when the transition at the critical angle can be observed directly. Loosely put, “size matters.”

An even more compelling problem occurs in double- and triple-axis X-ray measurements, where the sample is often assumed to be “highly perfect” and to diffract according to the dynamical theory. If that assumption fails, the results obtained from an X-ray analysis could be seriously in error. Consider a thin, structurally defective epitaxial layer; the rocking curve from such a sample would be broadened by structural defects, but it would also be broadened by the finite thickness of the film. Because a “perfect crystal” model attributes all of the broadening to finite thickness, its application to this sample would tend to underestimate the thickness by an unknown amount. Clearly, if the quantitative results of the dynamical theory are sought, then the sample has to be “good enough” to deserve the application of perfect-crystal theory.

Noise

The primary sources of noise in a high resolution X-ray experiment are the statistical variations in the photon flux produced by the X-ray source as well as the quantum mechanical (i.e. probabilistic) nature of the scattering process itself. As anyone who has conducted an X-ray experiment knows, the presence of noise increases the “error band” associated with a given measurement. The greater the noise in a given data set (for instance, from using a weak X-ray source or a short counting time), the larger the error band and the greater the uncertainty in the assignment of a structural model to the experimental data. Unfortunately, the obvious solutions of using a more intense X-ray source such as a synchrotron or counting for longer times in order to improve the counting statistics are often at odds with the industrial need for performing a real-time in-line analysis with rapid turnaround.

A more fundamental problem is that the effects of noise vary with the intensity, and hence with angular position. It can be shown, for instance, that the noise follows an I^w dependence, with w = 0.5 under high intensity conditions and w decreasing to zero at low intensities [11]. This illustrates the important fact that the noise varies with the scattered intensity, and as a result the statistical reliability of each point will be different depending on its angular position with respect to the peak maximum. This means that when one is fitting a calculated curve to an experimental data set, it must be realized that all points are not equivalent, and that this variation in statistical reliability (in other words, the functional dependence of the noise on the intensity) should be taken into account in the fitting procedure. This is rarely if ever done, however.

Software and analysis

The final issue that we wish to consider is the effect of computational methods for the analysis of high resolution X-ray data. One of the advantages of X-ray scattering as an analytical tool is that the mathematical basis for X-ray methods is extremely well developed. As a result, there exist a number of robust computer simulation and analysis packages for obtaining structural information from high resolution X-ray scans. Typically, a quantitative analysis involves calculating a simulated profile and comparing it to an experimental profile. After noting the difference between simulated and experimental curves, the input parameters to the simulation are then adjusted to improve the agreement between theory and experiment. The interpretation of high resolution X-ray data is essentially a non-linear curve-fitting problem, where a set of discrete data is compared to a theoretical continuous intensity distribution derived from a mathematical model of the postulated structure. This process is notoriously inefficient, since the experimenter often resorts to increasingly complex model structures (interfacial layers with composition or strain gradients, broadening due to substrate curvature, etc.). There are numerous examples in the literature where different structural models all fit an experimental profile.

Because of these problems, the use of “evolutionary algorithms” has shown great promise for the fitting of high resolution X-ray data [16]. This approach achieves the evolution of parameter vectors by a repeated process of mutation, reproduction and selection. With a high degree of computational efficiency, small random changes (mutation) in the population of parameter vectors generate diversity in the population; selection guarantees that the “fittest” parameter vectors will propagate into future generations. Evolutionary algorithms thus appear to represent the most efficient methods for the automated fitting of X-ray profiles.

CONCLUSIONS

It has been the purpose of this review to demonstrate the ubiquitous nature of high resolution X-ray scattering methods in semiconductor materials analysis. Given the continuing reduction in the dimensions of structures, it appears likely that the need for these characterization tools will only increase in time.

REFERENCES

1. Bowen, D. K. and Tanner, B. K., High Resolution X-Ray Diffractometry and Topography. Bristol: Taylor & Francis (1998).
2. Holy, V., Pietsch, U., and Baumbach, T., High-Resolution X-Ray Scattering from Thin Films and Multilayers. Berlin: Springer-Verlag (1999).
3. Fewster, P. F., X-ray Scattering from Semiconductors. London: Imperial College Press (2000).
4. Warren, B. E., X-ray Diffraction (2nd ed). New York: Dover (1991).
5. Als-Nielsen, J. and McMorrow, D., Elements of Modern X-ray Physics. Chichester: Wiley (2001).
6. Authier, A., Lagomarsino, S., and Tanner, B. K., eds. X-ray and Neutron Dynamical Diffraction. NATO ASI Series B: Physics, Vol. 357. New York: Plenum (1996).
7. Authier, A., Dynamical Theory of X-ray Diffraction. Oxford: Oxford University Press (2001).
8. Parratt, L. G., Phys. Rev. 95, 359 (1954).
9. Chapek, D. L., Conrad, J. R., Matyi, R. J., and Felch, S. B., J. Vac. Sci. Tech. B12, 951 (1994).
10. Matyi, R. J., Chapek, D. L., Brunco, D. P., Felch, S. B., and Lee, B. S., Surf. Coat. Technol. 93, 247 (1997).
11. Staley, T. W., Ph.D. dissertation, University of Wisconsin (1998).
12. Gillespie, H. J., Wade, J. K., Crook, G. E., and Matyi, R. J., J. Appl. Phys. 73, 95 (1993).
13. Iida, A. and Kohra, K., Phys. Stat. Sol. A 51, 533 (1979).
14. Estler, W. T., J. Res. Natl. Inst. Stand. Tech. 103, 141 (1998).
15. Deslattes, R. D. and Matyi, R. J., Analysis of thin layer structures by X-ray reflectometry, in Handbook of Silicon Semiconductor Metrology (A. C. Diebold, Editor). New York: Marcel Dekker (2001).
16. Warmington, M., Panaccione, C., Matney, K. M., and Bowen, D. K., Phil. Trans. R. Soc. Lond. A 357, 2827 (1999).
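The intensity-dependent weighting discussed under Noise and the mutation, reproduction and selection loop described under Software and analysis can be combined in a toy sketch. It is not the algorithm of reference [16] or of any commercial fitting package. Everything named here is a hypothetical stand-in: a single Lorentzian peak replaces a real simulated profile, the noise model sigma = I^w uses w = 0.5 throughout (the high-intensity limit quoted in the text), and the population sizes, parameter bounds and mutation strengths are arbitrary choices.

```python
import math
import random


def lorentzian(x, amplitude, center, width):
    """Illustrative peak model; a stand-in for a simulated X-ray profile."""
    return amplitude / (1.0 + ((x - center) / width) ** 2)


def weighted_chi2(params, xs, counts, w=0.5):
    """Squared residuals, each divided by an assumed noise variance I**(2w)."""
    total = 0.0
    for x, y in zip(xs, counts):
        sigma = max(y, 1.0) ** w  # assumed noise model: sigma = I**w
        total += ((y - lorentzian(x, *params)) / sigma) ** 2
    return total


def evolve(xs, counts, bounds, rng, pop_size=40, n_parents=10, generations=200):
    """Toy mutation/selection loop; returns the best parameter vector found."""

    def random_vector():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    def mutate(parent):
        # The mutation strength is drawn over several decades so that both
        # coarse and fine moves are tried in every generation.
        scale = 10.0 ** rng.uniform(-4.0, -0.5)
        child = []
        for value, (lo, hi) in zip(parent, bounds):
            value += rng.gauss(0.0, scale * (hi - lo))
            child.append(min(max(value, lo), hi))  # keep within bounds
        return child

    def cost(vector):
        return weighted_chi2(vector, xs, counts)

    population = sorted((random_vector() for _ in range(pop_size)), key=cost)
    for _ in range(generations):
        parents = population[:n_parents]  # selection of the fittest vectors
        children = [mutate(rng.choice(parents))  # reproduction with mutation
                    for _ in range(pop_size - n_parents)]
        population = sorted(parents + children, key=cost)
    return population[0]


# Synthetic demonstration with Gaussian-approximated counting noise.
rng = random.Random(0)
xs = [float(x) for x in range(-60, 61, 2)]
truth = (5000.0, 5.0, 8.0)  # amplitude, center, width (arbitrary values)
counts = [max(0.0, rng.gauss(lorentzian(x, *truth),
                             math.sqrt(lorentzian(x, *truth)))) for x in xs]
bounds = [(100.0, 10000.0), (-50.0, 50.0), (1.0, 30.0)]
best = evolve(xs, counts, bounds, rng)
print(best)
```

Dividing each residual by the assumed standard deviation is what lets the low-intensity tails and the peak contribute on an equal statistical footing; replacing the fixed exponent with an intensity-dependent w would follow the behavior reported in [11] more closely.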