High Resolution X-ray Scattering Methods For
ULSI Materials Characterization
Richard J. Matyi
Physics Laboratory, National Institute of Standards and Technology, Gaithersburg, MD 20899
Abstract. X-ray analytical methods with high angular resolution are becoming increasingly important for the characterization of materials used in ULSI fabrication. Vendors now market state-of-the-art X-ray tools for the routine analysis of parameters such as layer thickness, chemical composition, strain relaxation, and interfacial roughness. The recent integration of X-ray diffraction and reflectivity systems into fab-compatible process metrology tools suggests that the importance of these techniques will only increase with time. Here we discuss some basic principles of high resolution X-ray methods (notably double- and triple-axis X-ray diffractometry and high resolution X-ray reflectometry) and describe the capabilities and limitations of these tools for ULSI materials. Reference will be made to “real-life” problems involving bulk and thin-film structures (ranging from amorphous dielectrics and polycrystalline metals to highly perfect epitaxial single crystal materials) to show both the utility and the shortcomings of high resolution X-ray methods.
INTRODUCTION
The continual reduction in the dimensions of semiconductor device structures is placing increasingly stringent demands on key metrology tools. Because their wavelengths (less than 10 Å) are similar to the sizes of the smallest features in advanced devices, X-rays are well suited to the structural characterization of very small and/or thin structures. The interactions of X-rays with atoms tend to be relatively weak (unlike, for instance, electron-solid interactions), so they can typically be described in terms of perturbations. This greatly simplifies the mathematical description of the X-ray scattering process and permits quantitative modeling of that process with relative ease.
High resolution X-ray scattering is an attractive tool for semiconductor metrology applications due to its sensitivity to the structure of semiconductor materials and its capability for obtaining quantitative information [1-3]. Until recently X-ray tools were typically confined to the laboratory benchtop; now, however, a number of equipment vendors are introducing fab-compatible X-ray tools. In this paper we will examine some of the theoretical and practical aspects of modern high resolution X-ray measurement techniques to illustrate the utility and the limitations of these methods. Three high resolution techniques will be discussed: (a) double axis X-ray diffractometry, (b) triple axis X-ray diffractometry, and (c) X-ray reflectometry.
THEORETICAL BASIS OF HIGH
RESOLUTION X-RAY METHODS
The scattering of X-rays by a solid† is conveniently
described in terms of a hierarchy of processes. At the
lowest level is the scattering of an incident X-ray beam
(which is an electromagnetic wave) by a single electron. From electromagnetic wave theory it is known
that the electric field will exert a force on the electron;
since the field is varying sinusoidally with time, the
electron will be accelerated. However, classical theory
also says that an accelerated charge must radiate, so
the oscillating electron will become a source of scattered radiation that has the same frequency as the incident wave. In this manner a single electron becomes a
source of scattered radiation, albeit a weak one: the
intensity is proportional to the square of the classical
electron radius, $r_e^2$, or about $7.94\times10^{-30}$ m$^2$. Thus the
reason we can observe scattered X-rays at all is due to
the large number of electrons present in most solids.
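The size of the single-electron scattering term quoted above can be checked with a short calculation. The sketch below assumes the standard definition of the classical electron radius, $r_e = e^2/(4\pi\varepsilon_0 m c^2)$, which is not written out in the text, and uses CODATA constants; the function name is chosen only for illustration.

```python
import math

# CODATA 2018 constants in SI units.
ELEMENTARY_CHARGE = 1.602176634e-19     # C
VACUUM_PERMITTIVITY = 8.8541878128e-12  # F/m
ELECTRON_MASS = 9.1093837015e-31        # kg
SPEED_OF_LIGHT = 2.99792458e8           # m/s


def classical_electron_radius():
    """Return r_e = e^2 / (4 pi eps0 m c^2) in metres (assumed definition)."""
    return ELEMENTARY_CHARGE**2 / (
        4.0 * math.pi * VACUUM_PERMITTIVITY * ELECTRON_MASS * SPEED_OF_LIGHT**2
    )


r_e = classical_electron_radius()
r_e_squared = r_e**2  # sets the scale of the single-electron scattered intensity

print(f"r_e   = {r_e:.4e} m")
print(f"r_e^2 = {r_e_squared:.3e} m^2")
```

The squared radius is what makes the scattering from one electron so weak, and it is the same constant that reappears in the susceptibility and refractive index expressions later in the paper.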
Now we extend the hierarchy of scattering to the electrons that surround a particular atom. Since an atom is “large” compared with the X-ray wavelength, the waves scattered by different electrons will exhibit phase relationships that depend on the direction in which the scattering is observed. In the forward (main beam) direction, the waves scattered by all of the electrons are in phase; the total amplitude scattered in this direction will be the number of electrons (Z) times the single electron scattering. In other directions, the waves generated by different electrons will exhibit differences in phase, causing partial interference and a decrease in the scattered amplitude. Thus the atomic scattering factor decreases from a value Z in the forward direction towards zero as the scattering angle increases.
† For brevity in this overview we consider only elastic scattering. More comprehensive treatments are given in a number of excellent texts, such as Warren [4] and Als-Nielsen and McMorrow [5].
CP683, Characterization and Metrology for ULSI Technology: 2003 International Conference,
edited by D. G. Seiler, A. C. Diebold, T. J. Shaffner, R. McDonald, S. Zollner, R. P. Khosla, and E. M. Secula
2003 American Institute of Physics 0-7354-0152-7/03/$20.00
Next, consider the scattering of a collection of atoms, specifically the unit cell of a crystal structure. Again, waves scattered in different directions will have different relative phases, and in most cases a summation over all scattered X-rays results in destructive interference that reduces the net scattered amplitude to zero. However, in specific directions, the waves will reinforce to produce a non-zero amplitude, with the amplitude scattered per unit structure being known as the structure factor.
A simple physical model based on the dependence of the phase of the scattered radiation on the direction of the scattering is known as Bragg’s Law. X-rays of wavelength λ scattered from the atoms in adjacent crystal planes separated by a distance d will interfere constructively if the path difference between the waves scattered from the two planes is an integral number of wavelengths. This physical process gives rise to the Bragg equation:
$$n\lambda = 2d\sin\theta \qquad (1)$$
While this equation is widely used in the scientific community, a more useful description of the X-ray scattering process is shown in Figure 1a. A more complete analysis of the phase relations of the waves scattered by a three-dimensional array of atoms shows that the maximum of constructive interference can be described by a vector equation:
$$\frac{\vec{S} - \vec{S}_o}{\lambda} = \vec{H}^{*}_{hkl} \qquad (2)$$
where $\vec{S}_o$ and $\vec{S}$ are unit vectors in the incident and diffracted beam directions, respectively, and $\vec{H}^{*}_{hkl}$ is a vector perpendicular to a set of diffracting planes (hkl) with a length inversely proportional to the interplanar spacing d. Because the magnitude of $\vec{H}^{*}_{hkl}$ has a reciprocal relationship to the interplanar spacing, it is usually called a reciprocal lattice vector.
FIGURE 1. Vector representation of the diffraction process with respect to the diffracting planes (left) and the Ewald sphere (right).
While Equation 2 and Figure 1a indeed describe the condition needed for a diffraction maximum, Figure 1b shows a useful variation on the theme. Like other vectors, the incident and diffracted wavevectors $\vec{S}_o/\lambda$ and $\vec{S}/\lambda$ may be translated as long as their directions and magnitudes are not changed. The condition for diffraction is then satisfied when the resultant of the vectors is coincident with the reciprocal lattice vector $\vec{H}^{*}_{hkl}$. Note that if $\vec{S}/\lambda$ is randomized over all possible orientations a sphere is generated. This sphere describes the locus of permitted conditions for positive reinforcement of the scattered X-rays (i.e. a diffraction maximum) and is often called the Ewald sphere.
The concepts discussed above comprise the kinematic theory of X-ray diffraction. Implicit in the kinematic theory are several critical assumptions:
• The interaction between the incident X-rays and the solid is sufficiently weak that the energy loss by the incident beam is negligible and there is no change in the X-ray wavelength within the solid.
• The intensities of the scattered X-rays are low.
• An X-ray photon is scattered only once; there are no multiple scattering events.
• The diffracting crystal is small and far away from the detector.
In the case of powder samples and highly imperfect single crystals, these assumptions are usually warranted. However, in large, highly perfect semiconductor crystals, one or even all may become invalid. In these cases a more complex theory is needed to describe the diffraction process. In this approach (usually referred to as the dynamic theory of diffraction) a solution is found to Maxwell’s equations for the propagation of electromagnetic waves in a medium with a periodically varying electrical susceptibility. By enforcing proper boundary conditions, we arrive at a solution for waves that satisfy both Bragg’s Law and Maxwell’s equations.
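As a numerical aside on Equations (1) and (2), the sketch below computes the interplanar spacing, the Bragg angle, and the length of the reciprocal lattice vector for a cubic crystal. The inputs are assumptions chosen for illustration: the nominal silicon lattice parameter of 5.4310 Å and a CuKα1 wavelength of 1.5406 Å. The function names are hypothetical.

```python
import math

CU_KALPHA1 = 1.5406   # assumed CuKα1 wavelength, angstroms
A_SILICON = 5.4310    # nominal Si lattice parameter, angstroms


def d_spacing_cubic(a, h, k, l):
    """Interplanar spacing of the (hkl) planes in a cubic crystal."""
    return a / math.sqrt(h * h + k * k + l * l)


def bragg_angle_deg(d, wavelength, order=1):
    """Solve n*lambda = 2*d*sin(theta) for theta, in degrees."""
    s = order * wavelength / (2.0 * d)
    if s > 1.0:
        raise ValueError("reflection is not accessible at this wavelength")
    return math.degrees(math.asin(s))


d004 = d_spacing_cubic(A_SILICON, 0, 0, 4)
theta_b = bragg_angle_deg(d004, CU_KALPHA1)

# Length of the reciprocal lattice vector, |H*| = 1/d.
h_length = 1.0 / d004

# Equation (2): |S - S_o| / lambda = 2*sin(theta) / lambda for unit vectors
# S and S_o separated by the scattering angle 2*theta.
scattering_vector = 2.0 * math.sin(math.radians(theta_b)) / CU_KALPHA1

print(f"d(004) = {d004:.5f} A, theta_B = {theta_b:.3f} deg")
print(f"|H*|   = {h_length:.5f} 1/A, |S - S_o|/lambda = {scattering_vector:.5f} 1/A")
```

The last two printed numbers agree, which is the scalar content of Equation (2): at the Bragg angle the scattering vector has exactly the length of the reciprocal lattice vector.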
FIGURE 2. The dynamic theory of diffraction: (a) the Ewald sphere from the kinematic theory; (b) definition of wavevectors that satisfy the diffraction condition inside and outside the crystal; (c) definition of the deviation parameters $\alpha_o$ and $\alpha_h$.
Figure 2 illustrates how this treatment arises. In kinematic diffraction, the permitted wavevectors are described by the Ewald sphere. In the dynamic case, the incident and diffracted wavevectors are coupled; their lengths change (due to a change in X-ray wavelength within the crystal), and their common origin describes a point which is displaced from the spheres that can be drawn about the tail (point 000) and head (point hkl) of the reciprocal lattice vector $\vec{H}^{*}_{hkl}$. Very close to the kinematic origin defined by $\vec{S}_o/\lambda$ and $\vec{S}/\lambda$, these two spheres can be approximated by planes; as shown in Figure 2, the deviations from the kinematic diffraction conditions caused by the dynamic interactions can be denoted $\alpha_o$ and $\alpha_h$. These deviations generate two hyperbolic sheets in three dimensions that are asymptotic to the kinematic spheres; this is known as the dispersion surface and is given by [1,6,7]
$$\alpha_o \alpha_h = k^2 C^2 \chi_h \chi_{\bar{h}} \qquad (3)$$
where k is the vacuum wavevector, C describes the polarization of the X-ray beam, and the electrical susceptibility $\chi_h = -(r_e \lambda^2 / \pi V) F_h$ is given by the classical electron radius $r_e$, the unit cell volume V, and the structure factor $F_h$ corresponding to reflection (hkl).
Equation (3) is central to the dynamic theory, since it gives the wavevectors that are permitted under dynamic diffraction conditions as well as other insights. For example, it can be shown that the separation of the hyperbolic sheets gives the range corresponding to the width of an X-ray reflection. This separation is the X-ray analogue to the “energy gap” that arises when electrons propagate through a crystal. It yields a peak breadth in a symmetric reflection geometry of:
$$\Delta\theta = \frac{2 r_e \lambda^2 F_h}{\pi V_C \sin 2\theta_B} \qquad (4)$$
where $V_C$ is the volume of the unit cell. Peak breadths of a few arcseconds are usually seen in typical semiconductor materials under laboratory conditions.
X-rays have different wavelengths inside and outside a solid, implying that the index of refraction for X-rays differs from unity. It is in fact given by [1,6,7]
$$n = 1 - \frac{\lambda^2 r_e \rho_e}{2\pi} - i\,\frac{\lambda \mu_x}{4\pi} = 1 - \delta - i\beta \qquad (5)$$
where $\rho_e$ is the electron density of the solid and $\mu_x$ is the linear absorption coefficient. Both the real (δ) and imaginary (β) components are very small for most semiconductor materials under laboratory conditions. For instance, if silicon is examined with CuKα X-rays, $\delta \approx 7.4\times10^{-6}$ and $\beta \approx 1.9\times10^{-7}$.
The index of refraction for X-rays is thus slightly less than unity, so they can experience total external reflection at an interface. The critical angle for total external reflection is:
$$\theta_{crit}^2 = 2\delta \;\Rightarrow\; \theta_{crit} = \lambda \sqrt{\frac{r_e \rho_e}{\pi}} \qquad (6)$$
In the case of silicon and CuKα radiation, the critical angle is about 0.22°. At incidence angles greater than $\theta_{crit}$, partially reflected and transmitted X-ray wavefields will both be present. This situation is identical to that in conventional optics, where the Fresnel equations give the reflection and transmission coefficients for a light ray that is incident on an interface and forms reflected and refracted rays. Neglecting an absorption correction, the X-ray reflection coefficient becomes:
$$R(\theta) = \left| \frac{\theta - \sqrt{\theta^2 - \theta_{crit}^2}}{\theta + \sqrt{\theta^2 - \theta_{crit}^2}} \right|^2 \qquad (7)$$
With this in hand it is possible to describe three generic aspects of an X-ray reflectivity profile that would be obtained from a bulk sample:
• For $\theta < \theta_{crit}$ there would be a constant specular reflected intensity;
• At $\theta = \theta_{crit}$ the profile would show a sharp drop in the reflectivity;
• For $\theta > \theta_{crit}$ the intensity profile would decrease with a $\theta^{-4}$ dependence.
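The three regimes listed above can be reproduced numerically. The following is a minimal sketch, assuming the nominal silicon lattice parameter (5.4310 Å), eight atoms of 14 electrons each per unit cell, and a CuKα1 wavelength of 1.5406 Å. It ignores absorption and dispersion corrections, and the function names are illustrative only.

```python
import cmath
import math

R_E = 2.8179403e-15       # classical electron radius, m
WAVELENGTH = 1.5406e-10   # assumed CuKα1 wavelength, m
A_SI = 5.4310e-10         # nominal Si lattice parameter, m

# Diamond-cubic Si: 8 atoms per unit cell, 14 electrons per atom.
RHO_E_SI = 8 * 14 / A_SI**3   # electrons per m^3


def refractive_decrement(wavelength, rho_e):
    """delta from the real part of the refractive index expression."""
    return wavelength**2 * R_E * rho_e / (2.0 * math.pi)


def critical_angle(wavelength, rho_e):
    """Critical angle in radians, theta_crit = sqrt(2*delta)."""
    return math.sqrt(2.0 * refractive_decrement(wavelength, rho_e))


def fresnel_reflectivity(theta, theta_crit):
    """Small-angle Fresnel reflectivity with absorption neglected."""
    # cmath.sqrt returns an imaginary root below the critical angle,
    # which makes the modulus of the ratio exactly one there.
    root = cmath.sqrt(theta**2 - theta_crit**2)
    return abs((theta - root) / (theta + root)) ** 2


delta_si = refractive_decrement(WAVELENGTH, RHO_E_SI)
theta_c = critical_angle(WAVELENGTH, RHO_E_SI)
print(f"delta = {delta_si:.3e}, theta_crit = {math.degrees(theta_c):.3f} deg")

for multiple in (0.5, 1.0, 2.0, 10.0, 20.0):
    r = fresnel_reflectivity(multiple * theta_c, theta_c)
    print(f"theta = {multiple:4.1f} theta_crit  ->  R = {r:.3e}")
```

Doubling the angle well above the critical angle lowers the reflectivity by roughly a factor of 16, which is the $\theta^{-4}$ dependence in the third bullet.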
Of course, the ability to characterize thin film materials is far more important in the semiconductor environment. A procedure for doing so was developed by Parratt [8], which uses a recursion relationship for an arbitrary N-layer structure on a semi-infinite substrate:
$$R_j = \frac{r_j + R_{j+1}\, a_{j+1}^2}{1 + r_j R_{j+1}\, a_{j+1}^2} \qquad (8)$$
In this equation, $R_j = E_j^r / E_j^t$ is the ratio of the reflected electric field amplitude to the transmitted field. The complex amplitude $a_j = \exp(i k_{j,z} d_j)$ corresponds to layer j ($1 \le j \le N$) with thickness $d_j$ and depends on the z-component of the wavevector, which from Snell’s law is $k_{j,z} = k (n_j^2 - \cos^2\theta)^{1/2}$ with $n_j$ being the refractive index of layer j. The Fresnel coefficient $r_j$ for the interface between layers j and j+1 is then $r_j = (k_{j,z} - k_{j+1,z}) / (k_{j,z} + k_{j+1,z})$. Equation (8) is solved by starting at the bottom (layer N closest to the substrate) and noting that the reflected intensity coming up from the semi-infinite substrate will be ideally zero.
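The recursion in Equation (8) translates almost line for line into code. The sketch below is one possible reading rather than Parratt's original formulation: it assumes perfectly sharp interfaces, treats the vacuum above the sample as medium 0, and writes the refractive index as $n = 1 - \delta + i\beta$ (the complex conjugate of Equation 5) so that $\exp(i k_{j,z} d_j)$ decays with depth. The conjugate convention gives the same reflected intensity. The film parameters in the example are hypothetical.

```python
import cmath
import math


def parratt_reflectivity(theta, wavelength, layers, n_substrate):
    """Reflected intensity |R_0|^2 for a stack of layers on a substrate.

    theta       grazing incidence angle in radians
    layers      list of (n_j, d_j) from the top layer (j = 1) down to j = N
    n_substrate refractive index of the semi-infinite substrate
    """
    k = 2.0 * math.pi / wavelength
    indices = [1.0 + 0.0j] + [n for n, _ in layers] + [n_substrate]
    thicknesses = [0.0] + [d for _, d in layers] + [0.0]

    cos_sq = math.cos(theta) ** 2
    kz = [k * cmath.sqrt(n * n - cos_sq) for n in indices]

    # Nothing is reflected upward from inside the semi-infinite substrate.
    R = 0.0 + 0.0j
    # Work upward from the layer N / substrate interface to the surface.
    for j in range(len(indices) - 2, -1, -1):
        r_j = (kz[j] - kz[j + 1]) / (kz[j] + kz[j + 1])
        a_next = cmath.exp(1j * kz[j + 1] * thicknesses[j + 1])
        R = (r_j + R * a_next**2) / (1.0 + r_j * R * a_next**2)
    return abs(R) ** 2


# Example: silicon substrate (delta and beta as quoted for CuKα) under a
# hypothetical 20 nm film with made-up optical constants.
WAVELENGTH = 1.5406e-10                      # assumed CuKα1 wavelength, m
N_SILICON = 1.0 - 7.4e-6 + 1.9e-7j
HYPOTHETICAL_FILM = (1.0 - 2.0e-5 + 1.0e-6j, 20e-9)

for degrees in (0.1, 0.3, 0.5, 1.0):
    r = parratt_reflectivity(math.radians(degrees), WAVELENGTH,
                             [HYPOTHETICAL_FILM], N_SILICON)
    print(f"theta = {degrees:.1f} deg  ->  R = {r:.3e}")
```

With an empty layer list the function reduces to the single-interface Fresnel result, which is a convenient sanity check before adding layers.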
DOUBLE AXIS DIFFRACTOMETRY
With the theoretical background established, it is now possible to critically examine the various high resolution X-ray diffraction methods used in semiconductor metrology. The first of these is double axis X-ray diffractometry, where the incident beam from an X-ray source is monochromated by a crystal (or set of crystals‡) with a high degree of structural perfection.
‡ The fact that many instruments use multiple crystals as incident beam conditioners has led to the term “double axis” to identify a system with a monochromator and sample, rather than the older “double crystal” name.
From our discussion of dynamical diffraction, we recall that the intrinsic reflection range of a perfect crystal is typically a few arcseconds. If a beam of X-rays from a laboratory X-ray source (for instance, CuKα radiation from a copper X-ray tube) is directed at this crystal, then only those CuKα X-rays within the perfect crystal reflecting range of a few arcseconds will be diffracted. The X-rays diffracted by the monochromator will likewise have an angular spread of only a few arcseconds. This highly conditioned beam can then be directed at a sample crystal as it is rotated about an axis perpendicular to the plane defined by $\vec{S}_o/\lambda$, $\vec{S}/\lambda$, and $\vec{H}^{*}_{hkl}$. The resulting plot of the multiply diffracted intensity as a function of sample crystal rotation is known as a “rocking curve.” A schematic of the double axis geometry is seen in Figure 3.
FIGURE 3. Double axis rocking curve geometry in the laboratory perspective (top) and in terms of the Ewald sphere (bottom).
If the monochromator and sample crystals are identical and sufficiently perfect to diffract dynamically, then the net rocking curve will be the correlation of the two perfect crystal single reflection profiles. In the case of the (400) reflection of CuKα X-rays from silicon, for instance, the peak breadth from dynamic theory is about 3.5 arcseconds. If there are structural defects in the sample crystal, their presence will be indicated as an extra broadening of the rocking curve. For this reason, the breadth of the double-axis rocking curve has long been used as an indicator of the relative “quality” of the sample crystal.
Double axis rocking curves are widely used for the analysis of thin epitaxial (single crystal) layers on crystalline substrates. The presence of an epitaxial layer on the substrate results in two reciprocal lattice vectors directed normal to the diffracting planes (or, in the case shown here where the diffracting planes are parallel to the surface, normal to the sample). If we assume that the d-spacing of the layer is less than the corresponding d-spacing of the substrate, then $|\vec{H}^{*}_{hkl}(\mathrm{layer})| > |\vec{H}^{*}_{hkl}(\mathrm{sub})|$. As the sample is rotated through the Bragg reflection condition, Figure 3 shows that the reciprocal lattice points corresponding to the substrate (larger d-spacing, smaller $|\vec{H}^{*}_{hkl}|$) and layer (smaller d-spacing, larger $|\vec{H}^{*}_{hkl}|$) pass through the Ewald sphere and satisfy the condition for diffraction. However, the use of a wide-open detector means that a “fan” of doubly-diffracted beams can be intercepted by the detector. Thus any feature touching the Ewald sphere will diffract and can be detected.
FIGURE 4. A typical double axis X-ray rocking curve from AlGaAs/GaAs.
FIGURE 5. Double axis rocking curves from boron-implanted silicon: (top) 5 keV B-implant, 113 reflection, all samples as-implanted with dose indicated; (bottom) 3.5 keV B-implant, 224 reflection, post-implant anneal at temperature and time indicated. Legend entries, top panel: undoped, 1.1 × 10^15, 2.6 × 10^15, 4.4 × 10^15; bottom panel: as implanted; 5 s, 950°C; 20 s, 950°C; 30 s, 1050°C; 30 s, 1100°C. Both panels plot intensity against rocking angle (arcseconds).
Figure 4 shows a (004) double axis rocking curve from the compound semiconductor AlGaAs (approximately 37% Al) on a GaAs substrate. In this case the lattice parameter of the layer is larger than that of the substrate, so the layer peak is observed at a smaller angle. The interference of the diffracted wavefields from the epitaxial layer and the substrate gives rise to interference fringes§ with an angular spacing given by:
$$\delta = \frac{\lambda\, \gamma_{hkl}}{t \sin 2\theta_B} \qquad (9)$$
where t is the layer thickness and $\gamma_{hkl}$ is the cosine of the angle between the diffracted beam direction and the inward-pointing surface normal.
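The fringe spacing relation can be inverted to estimate a layer thickness from a measured rocking curve. The sketch below is illustrative only. It assumes a symmetric reflection, for which the magnitude of the direction cosine is taken as $\sin\theta_B$, so that the spacing reduces to $\lambda / (2 t \cos\theta_B)$. It also assumes a CuKα1 wavelength of 1.5406 Å and the nominal GaAs lattice parameter of 5.6534 Å. The 1 µm layer in the example is hypothetical.

```python
import math

ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi

CU_KALPHA1 = 1.5406   # assumed CuKα1 wavelength, angstroms
A_GAAS = 5.6534       # nominal GaAs lattice parameter, angstroms


def fringe_spacing_arcsec(thickness, wavelength, theta_b_deg, gamma=None):
    """Angular fringe spacing for a layer of the given thickness.

    thickness and wavelength must share the same length unit. If gamma is
    omitted, a symmetric reflection is assumed and |gamma| = sin(theta_B).
    """
    theta_b = math.radians(theta_b_deg)
    if gamma is None:
        gamma = math.sin(theta_b)
    spacing_rad = wavelength * gamma / (thickness * math.sin(2.0 * theta_b))
    return spacing_rad * ARCSEC_PER_RAD


def thickness_from_fringes(spacing_arcsec, wavelength, theta_b_deg, gamma=None):
    """Invert the fringe spacing relation to recover the layer thickness."""
    theta_b = math.radians(theta_b_deg)
    if gamma is None:
        gamma = math.sin(theta_b)
    spacing_rad = spacing_arcsec / ARCSEC_PER_RAD
    return wavelength * gamma / (spacing_rad * math.sin(2.0 * theta_b))


# GaAs 004 Bragg angle from Bragg's law, with d(004) = a / 4.
theta_b_gaas = math.degrees(math.asin(CU_KALPHA1 / (2.0 * A_GAAS / 4.0)))

spacing = fringe_spacing_arcsec(10000.0, CU_KALPHA1, theta_b_gaas)  # 1 um layer
recovered = thickness_from_fringes(spacing, CU_KALPHA1, theta_b_gaas)
print(f"theta_B = {theta_b_gaas:.3f} deg, fringe spacing = {spacing:.2f} arcsec")
print(f"recovered thickness = {recovered:.1f} A")
```

Because the spacing scales as 1/t, thinner layers give widely spaced fringes that are easy to resolve, while thick layers need the arcsecond resolution discussed above.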
While double axis rocking curves can be used to provide accurate thickness measurements, a more common use is in determining the structural characteristics of crystalline materials. Any modification to the crystalline structure of either bulk or thin film materials can generate changes in lattice parameter that can be measured in a rocking curve experiment.
§ Some authors incorrectly refer to these as “Pendellösung fringes” in the belief that they are analogous to dynamical diffraction features that arise from the selection of active wavevectors at the dispersion surface. However, the thickness fringes seen in rocking curves do not come from an oscillation between the sheets of the dispersion surface and hence they are not Pendellösung fringes.
Figure 5 illustrates rocking curves from B-implanted Si following implantation with different doses ([9], top) and at a constant dose but varying post-implant anneals ([10], bottom). In the as-implanted state, B- and Si-interstitials expand the host silicon crystal structure and generate scattering at lower angles from that of bulk silicon. In contrast, the post-implant activation anneal allows the implanted boron to migrate to Si-lattice sites. In the case of B/Si this causes the lattice parameter to decrease, so that excess scattering will be seen at angles greater than the bulk Si Bragg angle.
Figure 6 gives an example of the analysis of chemical composition in epitaxial layers from double axis rocking curves. The Figure shows 004 rocking curves that were obtained from a series of samples of SiGe grown by molecular beam epitaxy on GaAs [11]. The nominal lattice parameter of pure Ge (5.6568 Å) is larger than that of GaAs (5.6534 Å), so the layer peak appears at a smaller angle with respect to the substrate. As silicon (nominal lattice parameter 5.4310 Å) is added to the epitaxial layer, the lattice parameter decreases; typically it is assumed that the crystalline lattice parameter of a single-phase compound is a linear function of the chemical composition (usually known as Vegard’s Law). The decrease in lattice parameter with increasing Si causes the epitaxial layer peak to “pass by” the substrate peak.
FIGURE 6. Double axis rocking curves from SiGe grown on GaAs with compositions ranging from 0% Si (bottom) to 9.4% Si (top). Curve labels: 0%, 2.9%, 4.4%, 6.4%, and 9.4% Si; intensity is plotted against rocking angle (arcseconds).
For small lattice mismatches, the rocking curve shows well-defined interference fringes, indicating a strong dynamical interaction between the wavefields diffracted by the layer and the substrate. This implies that the structural perfection of the layer is at least high enough to permit dynamical diffraction to occur. At the highest silicon content, however, the epitaxial layer peak is significantly broadened and there is no evidence of interference fringes. Both of these are indicative of a decrease in structural quality of the SiGe layer. The decrease in structural perfection with increasing difference in the lattice parameter between an epitaxial layer and its substrate is a consequence of the strain relaxation in the layer/substrate system.
The key features involved with strain relaxation are shown in Figure 7. Consider an epitaxial layer material whose lattice parameter is larger than that of the substrate. If the layer is very thin, then the energy of the layer/substrate system will be minimized if the layer is pseudomorphic with the substrate; that is, the layer elastically deforms so that the in-plane lattice parameter of the layer matches that of the substrate. As the layer thickness increases, however, the strain energy in the layer becomes so large that the layer/substrate interface will decompose into a lattice-mismatched structure, with the misfit strain being accommodated by interfacial dislocations.
FIGURE 7. Strained layer growth showing an independent substrate and layer (left), a layer/substrate system transitioning from fully strained to fully relaxed (center), and the effect of tilt and shear distortions (right).
Strain relaxation by the formation of interfacial dislocations poses two important challenges to the use of double axis rocking curves for the determination of layer composition via lattice parameter measurements. First, even if there is no relaxation of the mismatch strain, the crystallographic unit cell of the layer will be distorted with respect to its unstrained state. The amount of distortion will depend on the elastic constants of the layer, which may not be well known.
Second, if the degree of relaxation is not known, then the unknown relation between the partially-strained and the unstrained unit cell will make composition measurements impossible. The common approach to dealing with this is to obtain rocking curves from both symmetric reflections (diffracting planes parallel to the interface shown in Figure 7) as well as asymmetric reflections from inclined planes. If the distortion in the epitaxial layer is relatively simple (such as a cubic unit cell distorting into a tetragonal structure) only one or two asymmetric reflections may be needed. However, if the layer is tilted, sheared, or otherwise deformed into a more complex, low symmetry crystal system, a complete definition of the structure of the layer may require a large number of asymmetric reflections.
FIGURE 8. Double axis rocking curve from a 10-period Si/GaAs superlattice [12].
While the preceding discussion has been confined to single layers, multilayer or superlattice structures can also be characterized by double axis diffraction. Figure 8 shows a (004) rocking curve from a Si/GaAs superlattice grown on GaAs [12]. Due to the lattice mismatch between the two materials, the silicon layers had to be kept very thin (~3 Å) while the GaAs layers could be thicker (336 Å in this case). The signature of a superlattice is the appearance of satellite peaks whose separation is related to the superlattice period D by:
$$D = \frac{\lambda}{\sin\theta_{+} - \sin\theta_{-}} = \frac{\lambda}{2 \cos\theta_B\, \delta\theta} \qquad (10)$$
where $\theta_{+}$ and $\theta_{-}$ are the angular positions of the first high-angle and low-angle satellites, and δθ is the separation between satellites. While it may be surprising that Si layers of 3 Å can be detected, in this case they generate a superlattice pattern by altering the relative phases of the X-rays scattered by the successive GaAs layers. Note also the presence of the “average” superlattice peak at a slightly larger angle to the GaAs substrate reflection; this arises from the average composition of the Si-GaAs alloy. By modeling the positions of this zero-order peak and the satellites, very high accuracy structural parameters can be obtained.
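Both forms of Equation (10) can be exercised with the layer thicknesses quoted for this superlattice (about 3 Å of Si plus 336 Å of GaAs, so D ≈ 339 Å). The sketch below is a consistency check under stated assumptions, not an analysis of the measured data: it assumes a CuKα1 wavelength of 1.5406 Å and approximates the zero-order peak position by the GaAs 004 Bragg angle. The satellite angles are generated from the assumed period rather than read from Figure 8.

```python
import math

ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi

CU_KALPHA1 = 1.5406          # assumed CuKα1 wavelength, angstroms
A_GAAS = 5.6534              # nominal GaAs lattice parameter, angstroms
ASSUMED_PERIOD = 3.0 + 336.0 # Si + GaAs layer thicknesses, angstroms


def period_from_satellites(wavelength, theta_plus, theta_minus):
    """Exact form: D = lambda / (sin(theta+) - sin(theta-)), angles in radians."""
    return wavelength / (math.sin(theta_plus) - math.sin(theta_minus))


def period_from_spacing(wavelength, theta_b, delta_theta):
    """Small-angle form: D = lambda / (2 cos(theta_B) delta_theta)."""
    return wavelength / (2.0 * math.cos(theta_b) * delta_theta)


# Zero-order position, approximated here by the GaAs 004 Bragg angle.
sin_zero = CU_KALPHA1 / (2.0 * A_GAAS / 4.0)
theta_zero = math.asin(sin_zero)

# First-order satellites sit at sin(theta) = sin(theta_0) +/- lambda / (2 D).
offset = CU_KALPHA1 / (2.0 * ASSUMED_PERIOD)
theta_plus = math.asin(sin_zero + offset)
theta_minus = math.asin(sin_zero - offset)

# Spacing between adjacent satellites: half the +1 to -1 separation.
delta_theta = 0.5 * (theta_plus - theta_minus)

d_exact = period_from_satellites(CU_KALPHA1, theta_plus, theta_minus)
d_approx = period_from_spacing(CU_KALPHA1, theta_zero, delta_theta)

print(f"satellite spacing = {delta_theta * ARCSEC_PER_RAD:.0f} arcsec")
print(f"D (exact form) = {d_exact:.2f} A, D (small-angle form) = {d_approx:.2f} A")
```

The two forms agree closely here because the satellites lie within a fraction of a degree of the zero-order peak; for short-period structures with widely spaced satellites the exact form is the safer choice.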
TRIPLE AXIS DIFFRACTOMETRY
In triple axis measurements, the diffracted beam is conditioned by an analyzer crystal before it encounters the detector, as shown in Figure 9. The inclusion of the analyzer with a very low defect density (usually high quality silicon or germanium) and with an acceptance range of a few arcseconds permits the angular position of the diffracted beam to be precisely determined. When combined with a highly perfect monochromator crystal, both the directions and magnitudes of $\vec{S}_o/\lambda$ and $\vec{S}/\lambda$ are well defined. The angular resolution is on the order of a few arcseconds, which is much finer than that which could be obtained with a simple collimator or narrow slit.
FIGURE 9. Schematic illustration of a typical high resolution triple axis X-ray diffraction configuration.
Because of these provisions, the volume of reciprocal space sampled at any given angular position of the incident and diffracted wavevectors can be made very small. Operationally, it is much easier to move the sample with respect to the incident beam than vice versa, so in a typical triple axis experiment the angular settings of the sample and the analyzer crystals are manipulated during a scan. The resultant data represent a “map” of the intensity distribution; hence this method is often called reciprocal space mapping.
FIGURE 10. Nominal resolution characteristics of a high resolution triple axis experiment. Labeled features: beam conditioner streak, analyzer streak, dynamical surface streak, and diffuse scatter, drawn on $q_x$, $q_z$ axes.
Figure 10 illustrates the normal resolution characteristics of a triple axis X-ray scan. First, parallel to the Ewald sphere we see two “streaks” of intensity. These arise because both the incident beam conditioner and the analyzer crystals have a finite angular dispersion range that can be seen if a highly perfect sample crystal is present. Multiple crystal and multiple reflection geometries are often employed in high resolution triple axis instruments in order to suppress the off-peak tails of the reflection profiles of the monochromator and analyzer assemblies and thus reduce or eliminate these streaks. There is also a streak perpendicular to the surface; this feature (often called the “surface streak”) is a common phenomenon in surface diffraction methods such as reflection high energy electron diffraction (RHEED). It arises due to the truncation of the otherwise infinite 3D crystal structure at a surface.
Finally, defects in a crystal will alter the intensity distribution about a reciprocal space point in two ways. Compositional variations and/or strains in the lattice will locally change the lattice parameters of the sample; this will be manifested in the redistribution of intensity away from the exact Bragg condition in the θ/2θ direction (parallel to the direction of the reciprocal lattice vector in a symmetric geometry). In contrast, mosaic spread in the sample will create extra intensity away from the Bragg angle when rotating the sample axis, ω (perpendicular to the reciprocal lattice vector in a symmetric geometry).
FIGURE 11. Experimental (111) triple axis reciprocal space map from bulk HgCdTe. Axes: $q_\perp$ and $q_{111}$ in µm$^{-1}$, each spanning ±30 µm$^{-1}$; contour scale: log (Intensity) from 0.00 to 4.00.
FIGURE 12. High resolution triple axis reciprocal space maps from SIMOX (left) and bonded Si (right). Axes: $q_{220}$ and $q_{004}$ in µm$^{-1}$.
Figure 11 illustrates a reciprocal space map from a typical bulk crystal, in this case a sample of the II-VI semiconductor HgCdTe. Some of the principal features of this reciprocal space map are the following:
1. Intensities are usually plotted as equal-intensity contours on a log scale; here we use four contours per decade (i.e. a factor of $10^{0.25}$ in counts s$^{-1}$ per contour level) with a minimum contour of $10^{0}$ = 1 count s$^{-1}$.
2. The intensity is plotted in terms of reciprocal lattice coordinates $q_x$, $q_z$ that are related to the diffractometer angular coordinates ω and 2θ by [13]:
$$q_x = (2\alpha - \beta)\,\frac{\sin\theta}{\lambda}; \qquad q_z = \beta\,\frac{\cos\theta}{\lambda} \qquad (11)$$
where α and β are the deviations of the sample crystal and the analyzer crystal, respectively, from the exact Bragg condition. Note that the units are in reciprocal microns; for comparison, the length of the (111) reciprocal lattice vector is approximately 2670 µm$^{-1}$. A comparison with the scale range of ±30 µm$^{-1}$ shows that the data are indeed “high resolution.”
3. The surface streak is seen running vertically in the Figure; its small inclination indicates that the sample surface was slightly off an exact (111) orientation.
4. Also seen in the Figure is a distribution of diffuse intensity around the 111 reciprocal lattice point. This intensity represents scattering by the grown-in structural defects in the HgCdTe sample; the combination of local misorientations and strains is responsible for the details of the scattering from this particular sample.
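The angular-to-reciprocal-space conversion in Equation (11) can be sketched as a small helper. The example is illustrative: it assumes a CuKα1 wavelength of 1.5406 Å (the source for this particular map is not stated) and derives the Bragg angle from the quoted reciprocal lattice vector length of about 2670 µm$^{-1}$ using $|H| = 2\sin\theta/\lambda$. The function name and the sample offsets are hypothetical.

```python
import math

RAD_PER_ARCSEC = math.pi / (180.0 * 3600.0)

WAVELENGTH_UM = 1.5406e-4   # assumed CuKα1 wavelength, micrometres
H_111_PER_UM = 2670.0       # approximate |H_111| quoted in the text, 1/um

# Bragg angle implied by |H| = 2 sin(theta) / lambda.
THETA_B = math.asin(H_111_PER_UM * WAVELENGTH_UM / 2.0)


def angles_to_q(alpha_arcsec, beta_arcsec, theta_b=THETA_B,
                wavelength_um=WAVELENGTH_UM):
    """Convert sample (alpha) and analyzer (beta) offsets to (q_x, q_z).

    Offsets are deviations from the exact Bragg condition in arcseconds;
    the result is in reciprocal micrometres.
    """
    alpha = alpha_arcsec * RAD_PER_ARCSEC
    beta = beta_arcsec * RAD_PER_ARCSEC
    q_x = (2.0 * alpha - beta) * math.sin(theta_b) / wavelength_um
    q_z = beta * math.cos(theta_b) / wavelength_um
    return q_x, q_z


# Pure sample rotation (analyzer fixed): motion along q_x only.
print("omega scan, alpha = 100 arcsec:", angles_to_q(100.0, 0.0))

# Coupled scan with the analyzer moving twice as fast as the sample:
# motion along q_z only.
print("theta/2theta scan, alpha = 100 arcsec:", angles_to_q(100.0, 200.0))
```

The two limiting scans correspond to the two defect signatures described earlier: a pure sample rotation probes mosaic spread across the reciprocal lattice vector, while the coupled scan probes strain and composition along it.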
Figure 12 shows the use of triple axis diffractometry for Si-materials analysis. On the left-hand side is a
004 reciprocal space map from a silicon-on-insulator
(SOI) sample prepared using the SIMOX process. In
this approach, a silicon wafer is implanted with a high
dose of oxygen; subsequent annealing permits the implanted oxygen to combine with the silicon to form a
buried SiO2 layer underneath the top layer of silicon.
The reciprocal space map shows significant isotropic
diffuse scattering from residual ion implant defects in
the top crystalline silicon layer, the substrate, or both.
The right-hand side of Figure 12 shows a 004 reciprocal space map from SOI fabricated by wafer bonding. The Figure shows a separate surface streak due to misoriented Si on top of oxide; however, the intensity of the streak and the absence of off-peak diffuse scatter indicate that the material is relatively defect-free. It is interesting to note that the intensity of the surface streak from the top crystalline silicon layer is modulated due to its finite thickness; the repeat period of approximately 10 µm⁻¹ in reciprocal space corresponds to a “real space” thickness of about 0.1 µm.
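The thickness estimate quoted above is simply the reciprocal of the fringe period, given the convention used in this paper in which reciprocal lengths carry no factor of 2π. A short sketch of that arithmetic follows; the function names and the list of fringe positions are hypothetical and serve only as an illustration.

```python
def mean_fringe_period(maxima_positions):
    """Average spacing between successive fringe maxima (reciprocal microns)."""
    ordered = sorted(maxima_positions)
    if len(ordered) < 2:
        raise ValueError("need at least two fringe maxima")
    return (ordered[-1] - ordered[0]) / (len(ordered) - 1)


def thickness_from_fringe_period(period_inv_um):
    """Layer thickness in microns from the fringe period along the streak."""
    if period_inv_um <= 0:
        raise ValueError("fringe period must be positive")
    return 1.0 / period_inv_um


if __name__ == "__main__":
    # Hypothetical fringe maxima read off a surface streak, in reciprocal microns.
    maxima = [-20.0, -10.0, 0.0, 10.0, 20.0]
    period = mean_fringe_period(maxima)
    print(period, thickness_from_fringe_period(period))
```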
Our final example is shown in Figure 13, which illustrates triple axis data recorded from the same
SiGe/GaAs materials used for the double axis scans in
Figure 6. The reciprocal space map from pure Ge on
GaAs shows the GaAs reciprocal lattice point surrounded by an observable level of diffuse scatter; the
excess scattering is commonly seen in bulk GaAs crystals from the normal level of grown-in defects. The
reciprocal lattice point corresponding to the Ge epitaxial layer is seen at a smaller value of $q_z$; the lack of
diffuse scatter around this point suggests a high degree
of structural perfection.
As the Si content of the epitaxial layer increases, its 004 reciprocal lattice point progressively moves to larger values of $q_z$. While the rocking curves in Figure 6 show a monotonic increase in the separation of the substrate and layer peaks, the reciprocal space maps in Figure 13 demonstrate that the defect structure of both the layer and the substrate changes as the mismatch is increased. As expected, the defect scattering around the layer 004 peak increases with increasing lattice mismatch. However, the reciprocal space maps also show that the defect structure of the substrate experiences significant changes as well. Among other explanations, this suggests that the lattice-mismatched growth “injects” defects into the substrate.
X-RAY REFLECTOMETRY
X-ray reflectometry (XRR) can be considered as a
variant on triple crystal diffractometry where the region of reciprocal space of interest is in the vicinity of
the 000 point. Because the distances in reciprocal
FIGURE 13. High resolution triple axis reciprocal space maps from SiGe grown on GaAs with compositions ranging from 0% Si (left) to 9.4% Si (right). The same scale is used in all plots. Panels: 0% Si, 4.4% Si, 6.4% Si, 9.4% Si. Axes: q004 (Å⁻¹) versus q220 (Å⁻¹).
A typical specular X-ray reflectivity profile from a nominally bare silicon wafer is shown in Figure 15. The Figure also shows the results of a calculated fit to the experimental data. To perform the fit, it was assumed that the sample had a thin layer of native oxide on top of the silicon substrate. The initial densities of the Si and SiO2 were assumed to be the bulk values (taken as 2.33 and 2.27 g cm⁻³, respectively); these parameters were allowed to vary, as were the roughness values for the Si/SiO2 interface and the SiO2 top surface. The curve is well fit assuming a thin SiO2 layer (14.5 Å) on top of the bulk substrate. Both the Si/SiO2 and the SiO2/air interfaces show a few Ångstroms of roughness. The densities of both the substrate and the SiO2 are similar to those anticipated for bulk Si and SiO2. The results from this control sample are not at all remarkable and are consistent with what one would expect to see from a silicon wafer that had been sitting in a laboratory ambient for over one year.
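A calculated curve of the kind fitted in Figure 15 is commonly obtained from the recursive formalism of Parratt [8]. The following sketch shows one way such a calculation might look; it is not the fitting code used for the Figure. The function name, the Névot-Croce form of the Gaussian roughness factor, the Cu Kα1 wavelength, and the optical constants (δ, β) for Si and SiO2 are all assumptions chosen for illustration; only the 14.5 Å oxide thickness comes from the text.

```python
import cmath
import math

WAVELENGTH_A = 1.5406  # Angstrom, Cu K-alpha1 (assumed)


def parratt_reflectivity(theta_deg, layers, substrate, wavelength=WAVELENGTH_A):
    """Specular reflectivity of a layer stack at grazing angle theta.

    layers:    list of (thickness_A, delta, beta, top_roughness_A), top to bottom
    substrate: (delta, beta, roughness_A)
    The refractive index of each medium is taken as n = 1 - delta + i*beta.
    """
    k0 = 2.0 * math.pi / wavelength
    s2 = math.sin(math.radians(theta_deg)) ** 2

    # Media ordered from the top: vacuum, the layers, then the substrate.
    deltas = [0.0] + [layer[1] for layer in layers] + [substrate[0]]
    betas = [0.0] + [layer[2] for layer in layers] + [substrate[1]]
    thick = [0.0] + [layer[0] for layer in layers] + [0.0]
    # sigmas[j] is the roughness of the interface between media j and j+1.
    sigmas = [layer[3] for layer in layers] + [substrate[2]]

    # Normal component of the wavevector in each medium.
    kz = [k0 * cmath.sqrt(s2 - 2.0 * d + 2.0j * b) for d, b in zip(deltas, betas)]

    r_total = 0.0 + 0.0j  # nothing is reflected from below the substrate
    for j in range(len(kz) - 2, -1, -1):
        # Fresnel coefficient of interface j/j+1, damped by Gaussian roughness.
        r = (kz[j] - kz[j + 1]) / (kz[j] + kz[j + 1])
        r *= cmath.exp(-2.0 * kz[j] * kz[j + 1] * sigmas[j] ** 2)
        # Phase accumulated on a round trip through medium j+1.
        phase = cmath.exp(2.0j * kz[j + 1] * thick[j + 1])
        r_total = (r + r_total * phase) / (1.0 + r * r_total * phase)
    return abs(r_total) ** 2


if __name__ == "__main__":
    # Illustrative, approximate optical constants (assumptions, not fitted values).
    si_substrate = (7.6e-6, 1.7e-7, 3.0)
    native_oxide = [(14.5, 7.1e-6, 0.9e-7, 3.0)]
    for theta in (0.1, 0.5, 1.0, 2.0):
        print(theta, parratt_reflectivity(theta, native_oxide, si_substrate))
```

In a fit, the thickness, δ (which scales with density), and roughness of each layer would be the adjustable parameters, and the calculated curve would be compared point by point with the measured profile.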
Figure 16 shows a typical XRR analysis of an important semiconductor materials system – in this case,
a comparison of two different thin diffusion barriers
deposited by physical vapor deposition (nominally
70Å Ta and TaN in the two samples respectively) be-
space are thus very small, the corresponding distances probed in “real” space are relatively large. X-ray reflectometry therefore gives information on the large-scale features of a sample and is insensitive to small-scale crystallography. Hence it can be applied to materials irrespective of their physical state, whether that be amorphous, polycrystalline, or single crystal.
Figure 14 shows the geometry of a simple reflectometer. From the earlier discussion of the index of refraction, we know that reflectometry measurements require the incident X-ray beam to make a small angle with respect to the sample surface – down to a zero angle of incidence where the beam is parallel to the surface. High quality measurements thus require an incident X-ray beam having both a narrow spatial extent and a small angular divergence. In the most common approach (specular reflectivity measurements), the angles of the incident and reflected beams with respect to the surface are kept equal. The reflectivity, R, is the ratio of reflected to incident intensities
FIGURE 14. Schematic illustration of a simple X-ray reflectometer (top) and one suitable for high resolution analyses of semiconductors (bottom). Labeled components: X-ray source, slits, sample, detector, diffuse scatter, and specularly reflected beam (top); X-ray source, parabolic graded multilayer mirror, Ge channel collimator, Ge channel compressor, sample, and detector (bottom).
FIGURE 15. Experimental and calculated X-ray reflectivity curves from nominally bare silicon. Axes: intensity (counts) versus θ (arcseconds). Curves: experimental data and model fit.
emphasized in Figure 17, which shows the fit of the
data with the low-angle data plotted on an expanded
scale. The inclusion of a thin, rough, low-density top
copper oxide layer was found to be essential for generating a high-quality fit. The roughness was calculated
by including a factor that damps the otherwise perfect
interface reflectivity by a Gaussian height distribution.
FIGURE 16. Experimental and calculated X-ray reflectivity curves from Ta/Cu/Si and TaN/Cu/Si. Axes: reflectivity versus θ (degrees). Curves: Ta/Cu data, Ta/Cu fit, TaN/Cu data, TaN/Cu fit.
SOURCES OF ERRORS IN HIGH RESOLUTION X-RAY ANALYSES
At this point it is tempting to illustrate additional examples that show how well high resolution X-ray methods can elucidate the structure of various materials. Instead we will describe some of the more important factors that degrade these measurements. While there are many potential factors that might limit both the accuracy and the precision of high resolution X-ray analyses, here we will discuss only six potential degrading agents: (1) angle metrology; (2) X-ray beam conditioning; (3) mechanical and optical alignment; (4) the sample; (5) noise; and (6) software and analysis.
neath a much thicker (nominally 750Å) copper metallization layer. A cursory examination of the reflectivity data in the Figure shows the superposition of the
scattering from the thick Cu layer (distinguished by
the rapidly oscillating intensity fringes at low angles)
and the more slowly varying fringes (with a period of
about 0.4°) that were generated from the thin but high-Z Ta and TaN layers.
Greater complexity is found in the ternary TiNSi system fabricated by chemical vapor deposition. Figure 17 shows an XRR profile recorded from one sample that illustrates several common features of this system. In this sample, the TiNSi diffusion barrier (nominally 30 Å thick) was deposited at 340°C prior to the overlay of a relatively thick Cu layer. The reflectivity was calculated from a model structure that was assumed to consist of (1) the Si substrate, (2) a silicon nitride interfacial layer, (3) a layer of pure TiN, (4) a titanium nitride silicide layer with an assumed composition (TiN)$_x$(Si$_3$N$_4$)$_{1-x}$ with x = 0.5, (5) the thick Cu layer, and (6) a thin, very rough, top surface layer of copper oxide.
The simulations closely match the experimental
data, implying that the assumed structures give a reasonably good description of the actual samples. This is
Angle metrology
The well-known Bragg equation ($n\lambda = 2d\sin\theta$) shows that X-ray analyses are, at their essence, exercises in angle metrology. Accurate measurement of angles is thus at the core of high-quality X-ray analyses. Unfortunately, few users of high resolution X-ray systems characterize the fidelity of the angle scales of their instruments. Relying on angle encoders to independently read the driving motor shaft or (better) the driven axis does not necessarily solve the problem, because all encoders have periodic angle errors and an uncertainty associated with those errors.
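To see why the angle scale matters, the Bragg equation can be differentiated at fixed wavelength to give Δd/d = −cot θ · Δθ. The sketch below evaluates this for an illustrative case. The choice of the Si 004 reflection, the lattice parameter of 5.431 Å, the Cu Kα1 wavelength, and the function names are assumptions made here, not values taken from the text.

```python
import math


def bragg_angle_deg(d_spacing, wavelength, order=1):
    """Bragg angle in degrees from n * lambda = 2 * d * sin(theta)."""
    return math.degrees(math.asin(order * wavelength / (2.0 * d_spacing)))


def relative_d_error(theta_deg, angle_error_arcsec):
    """Fractional d-spacing error caused by an angle error at fixed wavelength.

    Differentiating the Bragg equation gives delta_d / d = -cot(theta) * delta_theta.
    """
    delta_theta = math.radians(angle_error_arcsec / 3600.0)
    return -delta_theta / math.tan(math.radians(theta_deg))


if __name__ == "__main__":
    wavelength = 1.5406      # Angstrom, Cu K-alpha1 (assumed)
    d_004 = 5.431 / 4.0      # Angstrom, Si 004 with an assumed lattice parameter
    theta = bragg_angle_deg(d_004, wavelength)
    print(theta, relative_d_error(theta, 1.0))
```

Under these assumptions, an angle error of one arcsecond corresponds to a fractional error in d of several parts per million, which is comparable to the strain sensitivity often sought in high resolution work.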
Techniques based on circle closure provide avenues for the precision calibration of angle measurement tools [14]. While closure calibration methods are widely used for precision angle division tools, they are apparently rarely applied to high resolution X-ray scattering systems. High quality goniometers may be delivered with angle calibration curves supplied by the vendor, but it is unlikely that the user will operate the device under the same thermal, environmental, and mechanical conditions as in the vendor’s calibration facility. Moreover, since all mechanical components experience wear, one would anticipate that any calibration would vary with time.
FIGURE 17. XRR profile of a TiNSi diffusion barrier (~30 Å thick) deposited at 340°C on a relatively thick (~750 Å) Cu layer (inset shows the low-angle region). Axes: intensity (counts s⁻¹) versus θ (degrees).
X-ray beam conditioning
Observation of the features in a high resolution X-ray scan requires extensive conditioning of the X-ray beam. However, the output from an X-ray beam conditioner can alter the appearance of the scattering from the sample in a high resolution X-ray experiment.
High angle, high resolution diffraction experiments are usually performed so as to minimize the effects of the beam conditioner, either through the use of a parallel, non-dispersive geometry or multiple reflection monochromator crystals to reduce the spread in angle and wavelength of the X-ray probe beam. Multiple reflection X-ray optics utilize the natural dynamic reflection range of perfect crystals (nominally a few arcseconds); they are desirable because they preferentially reduce the intensity of the wings of a reflection and are conveniently realized by using channel-cut crystals.
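The suppression of the wings by multiple reflections can be illustrated with an idealized model. The sketch below uses the Darwin reflectivity curve of a thick, non-absorbing perfect crystal and assumes that, to a first approximation, n successive reflections within a channel-cut crystal multiply the single-reflection curve n times. Real crystals absorb and give asymmetric curves, so the numbers are only indicative; the function names and the reduced angular variable y are notation introduced here.

```python
import math


def darwin_reflectivity(y):
    """Darwin curve of a thick, non-absorbing perfect crystal.

    y is the reduced angular deviation; |y| <= 1 is the range of total reflection.
    """
    a = abs(y)
    if a <= 1.0:
        return 1.0
    return (a - math.sqrt(a * a - 1.0)) ** 2


def multi_bounce_reflectivity(y, n_reflections):
    """Idealized throughput after n identical successive reflections."""
    return darwin_reflectivity(y) ** n_reflections


if __name__ == "__main__":
    for y in (0.5, 2.0, 5.0):
        print(y, darwin_reflectivity(y), multi_bounce_reflectivity(y, 4))
```

Inside the total reflection range the throughput is unchanged, while far in the wings a four-reflection arrangement reduces the intensity by many orders of magnitude relative to a single reflection.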
Unfortunately, in a laboratory or fabrication facility¶, there is a tradeoff between angular resolution and
X-ray flux; high resolution typically reduces the available flux of X-ray photons. Hence there are many approaches to X-ray beam conditioning. For instance,
XRR measurements at NIST are performed using a
graded multilayer mirror to collect the X-rays diverging from the line source and produce a quasi-parallel
output beam [15]. The output from the mirror is then
further conditioned with a monolithic four-reflection
Ge monochromator crystal in which the X-ray beam
executes three symmetric and one asymmetric (220)
reflections. The final (asymmetric) reflection is designed to reduce the spatial extent of the beam in the
plane of the reflectometer to approximately 75 µm.
This approach is but one of many that have been
used in high resolution X-ray analyses. While it is intuitive that the details of the beam conditioning system
(specifically, the spread in angle and wavelength of the
beam that interrogates the sample) will have a quantitative impact on the results of an X-ray experiment,
the magnitude of these effects is not at all obvious.
cause of finite tolerances in machining – where nothing is “exactly” parallel, perpendicular, or coincident
as the case may be – this error budget will always be
non-zero and will generate systematic errors. As discussed below, these problems are made worse by the
fact that X-ray measurements are further polluted by
noise from the X-ray production process.
The sample
The physical state of the sample is an important
component of high-resolution X-ray analyses. Mechanical stresses and warpage, spatial non-uniformity,
and both surface and interfacial roughness can alter the
details (and, in some cases, the gross appearance) of a
reflectometry scan. In reflectivity analyses, as long as
the sample is sufficiently large or the spatial extent of
the incident beam is sufficiently narrow, then the beam
will be fully intercepted before the critical angle is
reached. However, for small samples, the scattering
curve may be on the steep $\theta^{-4}$ descent before the incident beam is completely captured by the sample. In
this case direct measurements of either the totally reflected incident intensity or the critical angle are not
possible. Our experience suggests that, under these
conditions, the structural parameters determined from
a fit to the data are subject to much larger errors than
they are when the transition at the critical angle can be
observed directly. Loosely put, “size matters.”
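The geometric condition behind this statement can be made concrete: a beam of width w at grazing angle θ illuminates a length w / sin θ along the sample. The sketch below compares that footprint with the critical angle, estimated as the square root of 2δ. The 75 µm beam width is the value quoted for the conditioned beam elsewhere in this paper; the value of δ for silicon, the 10 mm sample length, and the function names are assumptions used for illustration.

```python
import math


def footprint_mm(beam_width_um, theta_deg):
    """Length of sample illuminated by a beam of given width at grazing angle theta."""
    return beam_width_um * 1.0e-3 / math.sin(math.radians(theta_deg))


def critical_angle_deg(delta):
    """Critical angle for total external reflection, theta_c = sqrt(2 * delta)."""
    return math.degrees(math.sqrt(2.0 * delta))


def full_capture_angle_deg(beam_width_um, sample_length_mm):
    """Smallest grazing angle at which the sample intercepts the whole beam."""
    return math.degrees(math.asin(beam_width_um * 1.0e-3 / sample_length_mm))


if __name__ == "__main__":
    delta_si = 7.6e-6  # approximate value for Si at Cu K-alpha (assumed)
    theta_c = critical_angle_deg(delta_si)
    print(theta_c, footprint_mm(75.0, theta_c))
    # A hypothetical 10 mm sample does not capture the full beam until
    # the grazing angle is well beyond the critical angle.
    print(full_capture_angle_deg(75.0, 10.0))
```

Under these assumptions, the footprint at the critical angle is close to 20 mm, so a sample much shorter than that cannot show the transition at the critical angle directly.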
An even more compelling problem occurs in double- and triple-axis X-ray measurements, where the sample is often assumed to be “highly perfect” and to diffract according to the dynamic theory. If this assumption does not hold, the results obtained from an X-ray analysis could be seriously in error. Consider a thin, structurally defective epitaxial layer; the rocking curve from such a sample would be broadened by structural defects, but it would also be broadened by the finite thickness of the film. Applying a “perfect crystal” model to this sample would attribute all of the broadening to finite thickness and would thus tend to underestimate the thickness by an unknown amount. Clearly, if the quantitative results of the dynamic theory are sought, then the sample has to be “good enough” to deserve the application of perfect-crystal theory.
Mechanical and optical alignment
The need for mechanical and optical alignment is a
well-known prerequisite for accuracy in most X-ray
analytical methods. For high resolution work, the condition for an instrument to be “well aligned” requires
that the effective X-ray source and all beam defining
elements be parallel to each other and perpendicular to
the scattering plane, and that the line of intersection of
the incident and scattered beams be coincident with
the sample surface and the sample rotation axis. The effects of mechanical and optical alignment errors are well-documented in powder X-ray diffraction, but in high resolution work they are often not appreciated.
The effects of misalignments will be incorporated
into the overall error budget of a measurement. Be-
Noise
The primary sources of noise in a high resolution
X-ray experiment are the statistical variations in the
photon flux produced by the X-ray source as well as
the quantum mechanical (i.e. probabilistic) nature of
the scattering process itself. As anyone who has conducted an X-ray experiment knows, the presence of
noise increases the “error band” associated with a
given measurement. With increased noise in a given
data set (for instance, from using a weak X-ray source
¶ In this paper we are considering only laboratory-based (as opposed to synchrotron-based) instrumentation because of its compatibility with pre-existing semiconductor fabrication facilities and operations.
approach achieves the evolution of parameter vectors
by a repeated process of mutation, reproduction and
selection. With a high degree of computational efficiency, small random changes (mutation) in the population of parameter vectors generate diversity in the
population; selection guarantees that the “fittest” parameter vectors will propagate into future generations.
Evolutionary algorithms thus appear to represent the most efficient methods for the automated fitting of X-ray profiles.
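The cycle of mutation, reproduction and selection can be sketched with a very small elitist evolution strategy. This is an illustration only and is not necessarily the algorithm of [16]: the two-parameter fringe model is a toy function rather than a physical X-ray simulation, and all names, bounds, and settings are hypothetical.

```python
import math
import random


def model(theta, params):
    """Toy fringe profile (not a physical X-ray model): damped cosine fringes."""
    period_param, damping = params
    return math.exp(-(damping * theta) ** 2) * (1.0 + 0.5 * math.cos(period_param * theta))


def cost(params, thetas, data):
    """Sum of squared differences of the logarithmic intensities."""
    return sum((math.log10(model(t, params)) - math.log10(d)) ** 2
               for t, d in zip(thetas, data))


def evolve(thetas, data, bounds, pop_size=30, generations=60, sigma=0.1, seed=0):
    """Evolve a population of parameter vectors; return (best_vector, cost_history)."""
    rng = random.Random(seed)

    def clip(value, low, high):
        return max(low, min(high, value))

    population = [[rng.uniform(low, high) for low, high in bounds]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        # Reproduction and mutation: each parent yields one randomly perturbed child.
        children = [[clip(p + rng.gauss(0.0, sigma * (high - low)), low, high)
                     for p, (low, high) in zip(parent, bounds)]
                    for parent in population]
        # Selection: the fittest vectors among parents and children survive.
        pool = sorted(population + children, key=lambda v: cost(v, thetas, data))
        population = pool[:pop_size]
        history.append(cost(population[0], thetas, data))
    return population[0], history


if __name__ == "__main__":
    thetas = [0.05 * i for i in range(1, 81)]
    synthetic = [model(t, (12.0, 0.8)) for t in thetas]  # noise-free synthetic data
    best, history = evolve(thetas, synthetic, [(5.0, 20.0), (0.1, 2.0)])
    print(best, history[0], history[-1])
```

Because parents compete with their children, the best cost in the population can never increase from one generation to the next. A production fitting code would replace the toy model with a dynamical diffraction or reflectivity calculation.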
or a short counting time), the larger the error band and
the greater the uncertainty in the assignment of a structural model to the experimental data.
Unfortunately, the obvious solutions of using a
more intense X-ray source such as a synchrotron or
counting for longer times in order to improve the
counting statistics are often at odds with the industrial
need for performing a real-time in-line analysis with
rapid turnaround. A more fundamental problem is that the effects of noise vary with the intensity, and hence with angular position. It can be shown, for instance, that the noise follows an $I^w$ dependence, with w = 0.5 under high intensity conditions and w decreasing to zero at low intensities [11]. As a result, the statistical reliability of each point differs depending on its angular position with respect to the peak maximum. This means that when one is fitting a calculated curve to an experimental data set, it must be realized that not all points are equivalent, and that this variation in statistical reliability (in other words, the functional dependence of the noise on the intensity) should be taken into account in the fitting procedure. This is rarely if ever done, however.
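One way to take the intensity dependence of the noise into account is to weight each residual by the expected noise amplitude. The sketch below assumes a noise amplitude of the form $I^w$, as described above, and compares a weighted with an unweighted sum of squares. The function names, the floor of one count applied to the intensity, and the example numbers are assumptions introduced for illustration.

```python
def noise_sigma(intensity, w=0.5):
    """Noise amplitude modelled as I**w; w = 0.5 is the counting-statistics limit.

    The intensity is floored at one count (an assumption) to avoid zero weights.
    """
    return max(intensity, 1.0) ** w


def weighted_chi_square(measured, calculated, w=0.5):
    """Sum of squared residuals, each scaled by the expected noise amplitude."""
    return sum(((m - c) / noise_sigma(m, w)) ** 2
               for m, c in zip(measured, calculated))


def unweighted_chi_square(measured, calculated):
    """Plain sum of squared residuals, which treats all points as equivalent."""
    return sum((m - c) ** 2 for m, c in zip(measured, calculated))


if __name__ == "__main__":
    # Hypothetical intensities at a peak, on its flank, and in the tail,
    # with a calculated curve that is 10% too high everywhere.
    measured = [1.0e6, 1.0e4, 1.0e2]
    calculated = [1.1e6, 1.1e4, 1.1e2]
    print(unweighted_chi_square(measured, calculated))
    print(weighted_chi_square(measured, calculated, w=0.5))
```

In the unweighted sum the peak dominates almost completely, so a fit is nearly blind to the weak features in the tails; the weighted form reduces that imbalance, and the exponent w could itself be varied with intensity.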
CONCLUSIONS
It has been the purpose of this review to demonstrate the ubiquitous nature of high resolution X-ray
scattering methods in semiconductor materials analysis. Given the continuing reduction in the dimensions
of structures, it appears likely that the need for these
characterization tools will only increase with time.
REFERENCES
1. Bowen, D. K. and Tanner, B. K., High Resolution X-Ray Diffractometry and Topography. Bristol: Taylor & Francis (1998).
2. Holy, V., Pietsch, U., and Baumbach, T., High-Resolution X-Ray Scattering from Thin Films and Multilayers. Berlin: Springer-Verlag (1999).
3. Fewster, P. F., X-ray Scattering from Semiconductors. London: Imperial College Press (2000).
4. Warren, B. E., X-ray Diffraction (2nd ed). New York: Dover (1991).
5. Als-Nielsen, J. and McMorrow, D., Elements of Modern X-ray Physics. Chichester: Wiley (2001).
6. Authier, A., Lagomarsino, S., and Tanner, B. K., eds. X-ray and Neutron Dynamical Diffraction. NATO ASI Series B: Physics, Vol. 357. New York: Plenum (1996).
7. Authier, A., Dynamical Theory of X-ray Diffraction. Oxford: Oxford University Press (2001).
8. Parratt, L. G., Phys. Rev. 95, 359 (1954).
9. Chapek, D. L., Conrad, J. R., Matyi, R. J., and Felch, S. B., J. Vac. Sci. Tech. B12, 951 (1994).
10. Matyi, R. J., Chapek, D. L., Brunco, D. P., Felch, S. B., and Lee, B. S., Surf. Coat. Technol. 93, 247 (1997).
11. Staley, T. W., Ph.D. dissertation, University of Wisconsin (1998).
12. Gillespie, H. J., Wade, J. K., Crook, G. E., and Matyi, R. J., J. Appl. Phys. 73, 95 (1993).
13. Iida, A. and Kohra, K., Phys. Stat. Sol. A 51, 533 (1979).
14. Estler, W. T., J. Res. Natl. Inst. Stand. Tech. 103, 141 (1998).
15. Deslattes, R. D. and Matyi, R. J., Analysis of thin layer structures by X-ray reflectometry, in Handbook of Silicon Semiconductor Metrology (A.C. Diebold, Editor). New York: Marcel Dekker (2001).
16. Wormington, M., Panaccione, C., Matney, K. M., and Bowen, D. K., Phil. Trans. R. Soc. Lond. A 357, 2827 (1999).
Software and analysis
The final issue that we wish to consider is the effect of computational methods for the analysis of high
resolution X-ray data. One of the advantages of X-ray
scattering as an analytical tool is that the mathematical
basis for X-ray methods is extremely well developed.
As a result, there exist a number of robust computer
simulation and analysis packages for obtaining structural information from high resolution X-ray scans.
Typically, a quantitative analysis involves calculating a simulated profile and comparing it to an experimental profile. After noting the difference between
simulated and experimental curves, the input parameters to the simulation are then adjusted to improve the
agreement between theory and experiment. The interpretation of high resolution X-ray data is essentially a
non-linear curve-fitting problem, where a set of discrete data is compared to a theoretical continuous intensity distribution derived from a mathematical model
of the postulated structure.
This process is notoriously inefficient, since the experimenter often resorts to increasingly complex model structures (interfacial layers with composition or strain gradients, broadening due to substrate curvature, etc.). There are numerous examples in the literature where different structural models all fit an experimental profile. Because of these problems, the use
of “evolutionary algorithms” has shown great promise
for the fitting of high resolution X-ray data [16]. This