Introduction to the Physics of
Matter
Lecture notes for the course Struttura della Materia I
Laurea triennale in fisica – Università degli Studi di Milano
Nicola Manini
Version 3.14 – March 12, 2008
The present printout is the one official release
which supersedes all previous historical lecture notes,
and accounts for 4 years’ worth of bug fixing.
...many eyes make all bugs shallow...
http://www.mi.infm.it/manini/dida/
Contents
Introduction
0.1. Basic ingredients
0.1.1. Typical scales
0.1.2. Qualitative considerations
0.2. Spectra and broadening
Chapter 1. Atoms
1.1. One-electron atom/ions
1.1.1. The energy spectrum
1.1.2. The angular wavefunction
1.1.3. The radial wavefunction
1.1.4. Orbital angular momentum and magnetic dipole moment
1.1.5. The Stern-Gerlach experiment
1.1.6. Electron Spin
1.1.7. Total angular momentum and magnetic moment
1.1.7.1. Total magnetic moment in the coupled basis
1.1.8. Fine structure
1.1.8.1. Spin-orbit coupling
1.1.8.2. The relativistic kinetic correction
1.1.8.3. The Lamb shift
1.1.9. Nuclear Spin and hyperfine structure
1.1.10. Electronic transitions, selection rules
1.1.11. Spectra in a magnetic field
1.2. Many-electron atoms
1.2.1. Identical particles
1.2.2. The independent-particles approximation
1.2.3. The 2-electron atom
1.2.4. The self-consistent field theory
1.2.5. The periodic table
1.2.6. Core levels and spectra
1.2.7. Optical spectra
1.2.7.1. Alkali atoms
1.2.7.2. Atoms with incomplete degenerate shells
1.2.7.3. Many-electron atoms in magnetic fields
1.2.8. Dipole selection rules
Chapter 2. Molecules
2.1. The adiabatic separation
2.2. Chemical and nonchemical bonding
2.2.1. H$_2^+$
2.2.2. Covalent and ionic bonding
2.2.3. Weak nonchemical bonds
2.2.4. Classification of bonding
2.3. Molecular spectra
2.3.1. Rotational and ro-vibrational spectra
2.3.2. Electronic spectra
2.3.3. Zero-point effects
Chapter 3. Statistical physics
3.0.4. Probability and statistics
3.0.5. Quantum statistics and the density operator
3.1. Equilibrium ensembles
3.1.1. Connection to thermodynamics
3.1.2. Entropy and the second principle
3.2. Ideal systems
3.2.1. The high-temperature limit
3.2.1.1. Internal degrees of freedom of molecules
3.2.1.2. Isolated (spin) degrees of freedom
3.2.2. Degenerate Fermi and Bose gases
3.2.2.1. Fermi particles
3.2.2.2. Bose particles
3.3. Interaction radiation-matter
3.3.1. The laser
Chapter 4. Solids
4.1. The microscopic structure of solids
4.1.1. Lattices and crystal structures
4.1.2. The reciprocal lattice
4.1.2.1. An algebraic note
4.1.3. Diffraction experiments
4.2. Electrons in crystals
4.2.1. Models of bands in solids
4.2.1.1. The tight-binding model
4.2.1.2. The plane-waves method
4.2.2. Filling of the bands: metals and insulators
4.2.2.1. Metals
4.2.2.2. Semiconductors
4.2.3. Spectra of electrons in solids
4.3. Lattice dynamics
4.3.1. The normal modes of vibration
4.3.2. Thermal properties of phonons
4.3.3. Other phonon effects
Appendix A. Conclusions and outlook
A.1. Essential equations
A.2. “Advanced” Topics
A.3. Acknowledgments
Introduction
The purpose of this course is to develop some initial microscopic understanding of
many basic phenomena regarding “matter” in its atomic, molecular, and condensed
states.
0.1. Basic ingredients
Fundamental experiments carried out in the late XIXth and early XXth century prove
that any piece of matter (e.g. a sample of pure He gas in a vessel, a block of solid ice,
a screw made of metal alloy, a mobile phone, a block of wood, a cup of soup, a bee...)
is ultimately composed of a huge but finite number of negatively charged electrons
and positively charged nuclei. The nuclear inner structure is usually irrelevant to
most “ordinary” properties of matter: for the purposes of the physics of matter, nuclei
can be treated as structureless point-like particles. If we neglect relativistic effects
and the interactions of our sample of matter with its surroundings, then all internal
microscopic interactions among the components are of simple Coulombic nature.
The nonrelativistic motion of the electrons and nuclei in the sample is governed by
the following Hamiltonian energy operator:
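$$ H_{\rm tot} = T_n + T_e + V_{ne} + V_{nn} + V_{ee} \tag{1} $$
where:
$$ T_n = \frac{1}{2} \sum_\alpha \frac{P_{R_\alpha}^2}{M_\alpha} \tag{2} $$
is the kinetic energy of the nuclei ($P_{R_\alpha}$ is the conjugate momentum to the position $R_\alpha$),
$$ T_e = \frac{1}{2 m_e} \sum_i P_{r_i}^2 \tag{3} $$
is the kinetic energy of the electrons ($P_{r_i}$ is the conjugate momentum to $r_i$),
$$ V_{ne} = -\frac{q_e^2}{4\pi\epsilon_0} \sum_\alpha \sum_i \frac{Z_\alpha}{|R_\alpha - r_i|} \tag{4} $$
is the electron-ion attraction potential energy,
$$ V_{nn} = \frac{q_e^2}{4\pi\epsilon_0}\, \frac{1}{2} \sum_\alpha \sum_{\beta\neq\alpha} \frac{Z_\alpha Z_\beta}{|R_\alpha - R_\beta|} \tag{5} $$
is the nucleus-nucleus repulsion, and finally
$$ V_{ee} = \frac{q_e^2}{4\pi\epsilon_0}\, \frac{1}{2} \sum_i \sum_{j\neq i} \frac{1}{|r_i - r_j|} \tag{6} $$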
represents the electron-electron repulsion. Basically, the distinction between a steel
key and a bottle of wine is only made by the “ingredients”, i.e. the number of
electrons and the number and types of nuclei (charge numbers Zα and masses Mα )
involved.
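As a minimal illustration of how the “ingredients” enter Eqs. (4)-(6), the following Python sketch evaluates the three Coulomb terms for a given classical configuration of point charges. It is not part of the original notes: the function name `coulomb_energies` and the choice of Hartree atomic units (lengths in Bohr radii, so that $q_e^2/(4\pi\epsilon_0) = 1$) are assumptions made here for convenience, and the code only evaluates the potential-energy functions at fixed positions, not any quantum-mechanical expectation value.

```python
import math

def coulomb_energies(nuclei, electrons):
    """Return (V_ne, V_nn, V_ee) in Hartree for point charges at fixed positions.

    nuclei:    list of (Z, (x, y, z)), positions in Bohr radii
    electrons: list of (x, y, z), positions in Bohr radii
    In these (assumed) units q_e^2 / (4 pi eps0) = 1.
    """
    # Eq. (4): attraction between every nucleus and every electron
    v_ne = -sum(Z / math.dist(R, r) for Z, R in nuclei for r in electrons)
    # Eqs. (5)-(6): the factor 1/2 is implemented by counting each pair once
    v_nn = sum(nuclei[a][0] * nuclei[b][0] / math.dist(nuclei[a][1], nuclei[b][1])
               for a in range(len(nuclei)) for b in range(a + 1, len(nuclei)))
    v_ee = sum(1.0 / math.dist(electrons[i], electrons[j])
               for i in range(len(electrons)) for j in range(i + 1, len(electrons)))
    return v_ne, v_nn, v_ee

# Hypothetical example: one proton at the origin, one electron one Bohr radius away
print(coulomb_energies([(1, (0.0, 0.0, 0.0))], [(1.0, 0.0, 0.0)]))
```

Changing only the list of charge numbers and the number of electrons switches from one material to another, which is the point made in the text.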
A state ket $|\psi\rangle$ containing all quantum-mechanical information describing the
motion of all nuclei and electrons evolves according to Schrödinger’s equation
$$ i\hbar\, \frac{d}{dt} |\psi(t)\rangle = H_{\rm tot} |\psi(t)\rangle . \tag{7} $$
This equation, based on Hamiltonian (1), is apparently simple and universal. This
simplicity and universality indicate that in principle it is possible to understand the
observable behavior of any isolated macroscopic object in terms of its microscopic
interactions. In practice, however, exact solutions of Eq. (7) are available for a few
simple and idealized cases only. If one attempts an approximate numerical solution
of Eq. (7), one soon faces the problem that the information content of an $N$-particle ket increases exponentially with $N$, and soon exceeds the capacity of any
computer. To describe even a relatively basic system such as a pure rarefied molecular
gas, or an elemental solid, nontrivial approximations to the solution of Eq. (7) are
called for.
Figure 0.1. Scanning tunneling microscope (STM) images of (a) a
single-wall carbon nanotube (SWNT) and (b) a multi-wall carbon nanotube (MWNT). These are cylindrical macromolecules entirely composed of carbon atoms.
Applying smart approximations to Eq. (7) to understand observed materials properties and to make correct predictions of new properties is a refined art. These approximations often become important conceptual tools to link the macroscopic properties
of matter with the underlying microscopic interactions. The present course gives a
panoramic view of several observed phenomena in the physics of matter, introducing
a few standard conceptual tools for their understanding. The proposed schemes of
approximations are often rather simple and primitive idealizations: the bibliography suggests directions to expand the student’s conceptual toolbox and get closer to
today’s state of the art in research. We should be aware that even the smartest and
most experienced physicists of matter often cannot give accurate predictions of basic
properties such as the conducting behavior of a pure material of known composition
and structure, without actually carrying out the experiment. Quantitative and often even qualitative understanding based on Eq. (7) of more complex systems (e.g.
biological matter) is still by far beyond the capability of today’s theoretical physics
of matter.
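The exponential growth of the information content of an $N$-particle ket noted above can be made concrete with a crude estimate. The sketch below is an illustration added here, not the author's: it assumes that the wavefunction is sampled on a regular grid with a hypothetical 10 points per Cartesian coordinate, that each amplitude is stored as a 16-byte complex number, and that spin is ignored.

```python
def ket_storage_bytes(n_particles, grid_points_per_axis=10, bytes_per_amplitude=16):
    """Bytes needed to store one complex amplitude per point of a 3N-dimensional grid."""
    n_amplitudes = grid_points_per_axis ** (3 * n_particles)
    return n_amplitudes * bytes_per_amplitude

# The storage grows by a factor 1000 for every added particle
for n in (1, 2, 5, 10):
    print(n, ket_storage_bytes(n))
```

Even with this very coarse grid, ten particles already require far more memory than any existing computer provides, which is why approximations are unavoidable.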
Needless to say, this basic course focuses on simple, well-understood experiments,
phenomena, and conceptual schemes, with marginal hints at very few selected systems
and techniques investigated in current research. Physicists, chemists, and biologists
collect a wealth of experimental data, for which understanding is often only qualitative and partial. Creativity and insight help us to develop new conceptual schemes
and approximate models to interpret these data and proceed toward a better understanding of the intimate structure and dynamics of matter.
0.1.1. Typical scales. The motions described by the Hamiltonian $H_{\rm tot}$ involve
several characteristic dimensional scales, dictated by the physical constants [?] in
$H_{\rm tot}$, where the absence of the speed of light $c$ is noteworthy. First, observe that in
$H_{\rm tot}$ the elementary charge $q_e$ and electromagnetic constant $\epsilon_0$ always appear in the
fixed combination
$$ e^2 \equiv \frac{q_e^2}{4\pi\epsilon_0} , $$
of dimensions energy × length. A unique combination of $e^2$, Planck’s constant $\hbar$,
and electron mass $m_e$ yields the characteristic length
$$ a_0 = \frac{\hbar^2}{m_e e^2} = 0.529177 \times 10^{-10}\ {\rm m} \tag{8} $$
named Bohr radius, which sets the typical length scale of electronic motions. All
distances in the physics of matter could conveniently be measured in $a_0$ units, and
indeed for most materials typical interatomic distances turn out to be a few times $a_0$.
Individual atoms are routinely seen, e.g. by scanning tunneling microscopy (STM):
Figures 0.1-0.3 confirm that atoms in many solids are typically spaced by a fraction
of a nm.
Figure 0.2. STM images of: (a) gold clusters (about 1 nm diameter)
on graphite; (b) a reconstructed Au(111) surface (atomic corrugation
15 pm, tunneling current 0.3 nA; scanning voltage 0.3 V).
The interaction energy of two elementary point charges at the typical distance $a_0$
$$ E_{\rm Ha} = \frac{e^2}{a_0} = \frac{m_e e^4}{\hbar^2} = 4.35975 \times 10^{-18}\ {\rm J} = 27.2114\ {\rm eV}, \tag{9} $$
named Hartree energy, sets the typical energy scale for phenomena involving a single
electron in ordinary matter; in practice, eV units are more often used. The nuclear
charge factors $Z_\alpha \le 100$ can scale the $e^2$ coupling up by 2 decades, thus the electronic energies up by at most 4 orders of magnitude ($10^4\, E_{\rm Ha} \simeq 300$ keV). However,
delicate balances often involve characteristic electronic energies as small as 1 meV.
Nuclear motions are usually associated with smaller energies ($\sim 10^{-4} \div 10^{-3}\, E_{\rm Ha}$) than
electronic motions, because of the at least 1836 times larger mass in the denominator
of the kinetic term (2).
Figure 0.3. STM images of (a) a clean Si(100) surface (4 degree
miscut); (b) the basal surface of covellite CuS (the image covers approximately 6 nm on a side; bright spots are high tunneling current
sites corresponding to surface S atoms; Cu atoms do not show in this
image).
The typical timescale of electronic motions is inversely proportional to its energy
scale:
$$ t_B = \frac{\hbar}{E_{\rm Ha}} = \frac{\hbar^3}{m_e e^4} = 2.41888 \times 10^{-17}\ {\rm s}. \tag{10} $$
Oscillations of period $2\pi t_B$ have a frequency $\nu_B = \omega_B/(2\pi) = E_{\rm Ha}/(2\pi\hbar) = 6.5797 \times 10^{15}$ Hz.
The typical electron velocity is then set by the ratio
$$ v_B = \frac{a_0}{t_B} = \sqrt{\frac{E_{\rm Ha}}{m_e}} = \frac{e^2}{\hbar} = 2.18769 \times 10^{6}\ {\rm m/s}, \tag{11} $$
about 1% of the speed of light, which justifies a posteriori the initial neglect of
relativity. In fact, the ratio
$$ \alpha = \frac{v_B}{c} = \sqrt{\frac{E_{\rm Ha}}{m_e c^2}} = \frac{e^2}{\hbar c} = 7.29735 \times 10^{-3} = \frac{1}{137.036} , \tag{12} $$
the so-called fine-structure constant, measures the relative importance of the relativistic corrections to the nonrelativistic electron dynamics given by Eq. (1). As
discussed above, motions of the nuclei are much slower, thus relativistic corrections
to their kinetic energy $T_n$ are usually negligible.
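The chain of definitions in Eqs. (8)-(12) can be checked numerically. The short Python sketch below is an addition for illustration: the SI values of the constants are CODATA-like numbers assumed here (they are not listed in the notes), and the variable names are arbitrary.

```python
import math

# SI values of the fundamental constants (assumed here, CODATA-like)
Q_E = 1.602176634e-19      # elementary charge, C
EPS0 = 8.8541878128e-12    # vacuum permittivity, F/m
HBAR = 1.054571817e-34     # reduced Planck constant, J s
M_E = 9.1093837015e-31     # electron mass, kg
C = 299792458.0            # speed of light, m/s

e2 = Q_E**2 / (4 * math.pi * EPS0)   # fixed combination e^2, in J m
a0 = HBAR**2 / (M_E * e2)            # Bohr radius, Eq. (8)
E_Ha = e2 / a0                       # Hartree energy, Eq. (9)
t_B = HBAR / E_Ha                    # electronic timescale, Eq. (10)
v_B = a0 / t_B                       # electronic velocity, Eq. (11)
alpha = v_B / C                      # fine-structure constant, Eq. (12)

print(f"a0    = {a0:.6e} m")
print(f"E_Ha  = {E_Ha:.6e} J = {E_Ha / Q_E:.4f} eV")
print(f"t_B   = {t_B:.6e} s")
print(f"v_B   = {v_B:.6e} m/s")
print(f"alpha = {alpha:.6e} = 1/{1 / alpha:.3f}")
```

Only $e^2$, $\hbar$, and $m_e$ enter the first four quantities; the speed of light appears only in the last line, consistent with its absence from $H_{\rm tot}$.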
Figure 0.4 compares the typical length and energy scales for electrons in matter to
the wavelengths and photon energies of the electromagnetic waves used to investigate matter itself, as sketched mainly in Secs. 0.2, 4.1.3, and 4.2.3 below. The photon
wavelength matches the typical interatomic distances ($\sim 10^{-10}$ m) for photons in
the X-ray region ($\sim 10^{4}$ eV $\gg E_{\rm Ha}$). On the other hand, photons whose energies
fit the typical atomic energy scale $E_{\rm Ha}$ lie in the ultraviolet, near the visible range:
in this range, the characteristic electromagnetic wavelength is a fraction of a µm, i.e.
at least three orders of magnitude larger than the typical atomic sizes. Typical
energies associated with the motion of the nuclei in matter (of order $10^{-2}$ eV) match
the infrared region of the electromagnetic spectrum.
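The correspondence between photon energy and wavelength used in this comparison follows from $\lambda = hc/E$. The sketch below is an illustrative addition: the value of $hc$ in eV nm is an approximate number assumed here, and the function name is arbitrary.

```python
HC_EV_NM = 1239.84  # h*c in eV nm (approximate value, assumed here)

def photon_wavelength_nm(energy_ev):
    """Vacuum wavelength (nm) of a photon of the given energy (eV)."""
    return HC_EV_NM / energy_ev

# X-ray, Hartree-scale, visible, and nuclear-motion energy scales mentioned in the text
for label, energy in (("X-ray, 1e4 eV", 1.0e4), ("E_Ha, 27.2 eV", 27.2114),
                      ("visible, 2 eV", 2.0), ("nuclear motion, 1e-2 eV", 1.0e-2)):
    print(f"{label:26s} lambda = {photon_wavelength_nm(energy):12.4f} nm")
```

The first line gives a wavelength of about 0.12 nm, comparable to interatomic distances, while the last gives about 0.12 mm, in the infrared.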
0.1.2. Qualitative considerations. In ordinary matter, electrons tend to lump
around the strongest positive charges available, the nuclei, driven by the term (4).
Due to quantum kinetic energy (3), electrons do not collapse onto the nuclei but form
atoms of finite size $\approx a_0$, as sketched in Chapter 1. Atoms then act as the building
blocks of matter, in its gaseous and condensed phases. Many observed macroscopic
properties of extended matter, such as elasticity, heat transport, and heat capacity, can
be described in terms of the motion of the center of mass of atoms or small collections of atoms. Ultimately, the interaction among atoms, governing these motions,
is driven by the quantum dynamics of charged electrons and nuclei governed by
Eq. (7), usually treated within the adiabatic separation scheme (Sec. 2.1). Understanding the dynamics of a finite number of electrons in the field of two or several
nuclei, and the motion of these nuclei themselves (Chapter 2), provides the basics of
interatomic bonding, the mechanism granting the very existence of condensed matter. Finally, the methods and approximations developed for few-atom systems and
for statistically large ensembles (Chapter 3) lead to new concepts and phenomena
associated with the macroscopic size of extended systems (Chapter 4).
0.2. Spectra and broadening
Starting from the late XIXth century, physicists have developed and employed all
sorts of techniques to investigate the intimate excitations of matter. Many data have
been obtained through two broad classes of spectroscopies: absorption and emission.
• Absorption: the sample is crossed by a collimated beam of monochromatic
light (not necessarily visible): the intensity loss of the beam going through
the sample is measured as a function of light frequency (or equivalently
wavelength), as sketched in Fig. 0.5.
• Emission: the sample is excited, e.g. by means of a flame or electrical
discharge. The light emitted in the de-excitation transitions is collected,
and its intensity is measured as a function of frequency (Fig. 0.6).
Figure 0.4. The spectrum of electromagnetic radiation. Wavelength, frequency and photon energy are compared on logarithmic
scales.
Figure 0.5. A radical schematization of the setup for absorption spectroscopy: a monochromatic photon source of frequency ω, the sample, and a photon detector recording the intensity I(ω).
Figure 0.6. A radical schematization of the setup for emission spectroscopy: the sample, a monochromator selecting the frequency ω, and a photon detector recording the intensity I(ω).
Spectra of both kinds are routinely collected to probe the properties of gaseous (both
atomic and molecular), liquid, and solid samples.
Atomic and molecular spectra are usually characterized by sharp monochromatic
peaks (also called “lines”). However, no absorption/emission peak is ever infinitely
sharp (Fig. 0.7): at least 3 simultaneous effects combine to broaden each spectral
line [?]: (i) experimental resolution of the spectrometer, (ii) “natural” broadening
due to finite lifetime, (iii) Doppler broadening.
Figure 0.7. Broadening limits the details observable in a spectrum:
when the line width exceeds the separation between two lines (dotted
curve), these look like a single peak, and the “fine” detail of the two
near peaks is then lost. (Plot of intensity [A.U.] versus ω for line widths 0.02, 0.05, and 0.08.)
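The loss of detail illustrated in Fig. 0.7 can be reproduced with a toy model. The sketch below is an assumption-laden illustration, not a reconstruction of the figure: the two line positions (1.64 and 1.76) are hypothetical, the lines are taken as equal Gaussians, and “width” is interpreted as the Gaussian standard deviation. For two equal Gaussians the sum shows two maxima only when the separation exceeds twice the standard deviation.

```python
import math

def two_line_spectrum(omega, width, centers=(1.64, 1.76)):
    """Sum of two equal Gaussian lines of standard deviation `width` (toy model)."""
    return sum(math.exp(-0.5 * ((omega - c) / width) ** 2) for c in centers)

def count_peaks(width, n_grid=2001, lo=1.2, hi=2.2):
    """Count the local maxima of the toy spectrum sampled on a regular grid."""
    xs = [lo + (hi - lo) * i / (n_grid - 1) for i in range(n_grid)]
    ys = [two_line_spectrum(x, width) for x in xs]
    return sum(1 for i in range(1, n_grid - 1) if ys[i - 1] < ys[i] >= ys[i + 1])

for width in (0.02, 0.05, 0.08):
    print(f"width = {width:.2f}: {count_peaks(width)} resolved peak(s)")
```

With the assumed separation of 0.12, the two narrower widths leave two distinguishable peaks, while the broadest one merges them into a single peak.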
Experimental resolution is typically limited by the resolution of the monochromator, noise in the photon detector, and inhomogeneities in the sample or in some external
field applied to it. Experimental broadening is usually generated by several random
concurrent effects, and therefore produces a Gaussian line shape. It has no fundamental nature, thus it can be (and often is) reduced by means of technological
advances.
According to the basic Schrödinger theory, all eigenstates of an atom are stationary
states. However, this theory neglects interaction with the fluctuating radiation field,
which is always present. When this interaction is included, only the ground state
is really stationary. Instead, any excited state decays spontaneously to all states at
lower energy. The total decay rate is the sum of the individual decay rates.$^1$ The
total decay rate determines a (random) exponential decay of the average population
of atoms initially in an excited state:
$$ [N](t) = N_0\, e^{-t\,\gamma^{\rm tot}} = N_0\, e^{-t/\tau} , \tag{13} $$
which defines the lifetime of that atomic level $\tau = 1/\gamma^{\rm tot}$. In practice, an atom hardly
lasts in an excited state longer than a few times its characteristic $\tau$. Accordingly,
$\tau$ sets the typical duration that any spectroscopy experiment involving that excited
state may last. Due to the time-energy uncertainty, the energy of an atomic level
cannot be measured with better precision than
$$ \Delta E \simeq \frac{\hbar}{\tau} = \hbar\, \gamma^{\rm tot} . \tag{14} $$
$^1$ For example, as described in Sec. 1.1.10 below, the 3p state of H decays at a rate $\gamma_{3p\to 1s} = 1.67 \times 10^{8}\ {\rm s}^{-1}$ to the ground state, and at a rate $\gamma_{3p\to 2s} = 2.25 \times 10^{7}\ {\rm s}^{-1}$ to the 2s state; decay to state
2p is dipole-forbidden, thus occurs at a negligible rate. Therefore the 3p state empties at a total
rate $\gamma_{3p}^{\rm tot} = \gamma_{3p\to 1s} + \gamma_{3p\to 2s} = 1.90 \times 10^{8}\ {\rm s}^{-1}$.
The effect of the finite lifetime of atomic levels on the spectral lines is therefore a
broadening of the otherwise infinitely sharp line. This “natural” lifetime broadening
appears in the spectrum as a Lorentzian peak profile
$$ I(\omega) = I_0\, \frac{\gamma}{\pi[(\omega - \omega_0)^2 + \gamma^2]} , \tag{15} $$
where $\omega_0 = \frac{1}{\hbar}(E_f - E_i)$ is the original line position set by the energy difference of the
initial and final state. Atomic excited states are characterized by typical lifetimes
of several ns, thus by natural spectral broadening $\gamma$ of a fraction of a µeV.
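The numbers quoted for the 3p state of H can be combined into a lifetime and a natural width by means of Eqs. (13)-(14). The sketch below is an illustrative addition: the decay rates are those quoted in the notes, while the value of $\hbar$ in eV s and all names are assumptions introduced here. The `lorentzian` helper simply evaluates the profile of Eq. (15).

```python
import math

HBAR_EV_S = 6.582119569e-16   # hbar in eV s (value assumed here)

# Decay rates of the H 3p state quoted in the notes, in 1/s
rates = {"3p->1s": 1.67e8, "3p->2s": 2.25e7}

gamma_tot = sum(rates.values())    # total decay rate
tau = 1.0 / gamma_tot              # lifetime of the level, Eq. (13)
delta_E = HBAR_EV_S * gamma_tot    # natural energy width, Eq. (14)

def lorentzian(omega, omega0, gamma, i0=1.0):
    """Lorentzian line profile of Eq. (15)."""
    return i0 * gamma / (math.pi * ((omega - omega0) ** 2 + gamma ** 2))

print(f"gamma_tot = {gamma_tot:.3e} 1/s")
print(f"tau       = {tau * 1e9:.2f} ns")
print(f"Delta E   = {delta_E * 1e6:.3f} micro-eV")
```

With the profile written as in Eq. (15), the intensity drops to half its peak value at $\omega = \omega_0 \pm \gamma$.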
In addition to the natural broadening due to finite lifetime, the random thermal motion in the gas-phase sample introduces an extra source of broadening: the
Doppler broadening. When seen from the lab frame, atoms/molecules moving toward or away from the detector at a velocity $v_x$ have transition frequencies blue- or
red-shifted with respect to those at rest. Since the thermal (molecular center-of-mass)
velocities are non-relativistic, the Doppler frequency shift is given by the simple
form
$$ \omega = \omega_0 \left(1 \pm \frac{v_x}{c}\right) , \qquad \omega_0 = \text{angular frequency at rest}. \tag{16} $$
The molecular velocities are randomly distributed (see Eq. (192) in Sec. 3.2.1 below)
depending on the gas temperature $T$: the average number of molecules with velocity
component $v_x$ in the direction of detection is
$$ \frac{dn(v_x)}{dv_x} = N \sqrt{\frac{M}{2\pi k_B T}}\, \exp\left(-\frac{M v_x^2}{2 k_B T}\right) , \tag{17} $$
where $M$ is the molecular mass. Radiation intensity then spreads around the rest
frequency $\omega_0$ as
$$ I(\omega) = I_0 \exp\left[-\frac{M c^2}{2 k_B T} \left(\frac{\omega - \omega_0}{\omega_0}\right)^2\right] . \tag{18} $$
This represents a Gaussian broadening of full width at half-maximum
$$ \Delta\omega_{\rm Doppler} = \omega_0\, \sqrt{8 \ln 2}\, \sqrt{\frac{k_B T}{M c^2}} . \tag{19} $$
Since the photon wavelength is $\lambda = 2\pi c/\omega$, the relative broadening is the same in
terms of wavelengths:
$$ \frac{\Delta\lambda_{\rm Doppler}}{\lambda_0} = \sqrt{8 \ln 2}\, \sqrt{\frac{k_B T}{M c^2}} . \tag{20} $$
Heavier atoms/molecules move more slowly and are less affected by Doppler broadening: this source of broadening is then relevant mainly for the spectra of the lightest
atom, hydrogen. The Hα line, introduced shortly, of gas-phase H suffers from a Doppler broadening of order 10 µeV
at 300 K.
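The order of magnitude of the Doppler width of Hα follows directly from Eq. (19). In the sketch below, added for illustration, the Boltzmann constant in eV/K, the rest energy of the H atom, and the Hα photon energy (about 1.889 eV, the lowest Balmer transition) are approximate values assumed here, and the function names are arbitrary.

```python
import math

K_B_EV = 8.617333262e-5    # Boltzmann constant in eV/K (assumed value)
M_H_C2_EV = 938.783e6      # rest energy of a 1H atom in eV (approximate, assumed)
E_HALPHA_EV = 1.889        # H-alpha photon energy in eV (approximate, assumed)

def doppler_sigma_rel(temperature_k, mc2_ev):
    """Relative standard deviation sqrt(k_B T / (M c^2)) of the Gaussian in Eq. (18)."""
    return math.sqrt(K_B_EV * temperature_k / mc2_ev)

def doppler_fwhm_rel(temperature_k, mc2_ev):
    """Relative full width at half-maximum, Eq. (19) divided by omega_0."""
    return math.sqrt(8.0 * math.log(2.0)) * doppler_sigma_rel(temperature_k, mc2_ev)

sigma_ev = E_HALPHA_EV * doppler_sigma_rel(300.0, M_H_C2_EV)
fwhm_ev = E_HALPHA_EV * doppler_fwhm_rel(300.0, M_H_C2_EV)
print(f"H-alpha at 300 K: sigma = {sigma_ev * 1e6:.1f} micro-eV, FWHM = {fwhm_ev * 1e6:.1f} micro-eV")
```

Under these assumptions the Gaussian standard deviation comes out close to 10 µeV and the full width at half-maximum about 2.4 times larger, so both are of the order quoted in the text.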
CHAPTER 1
Atoms
The importance of the spectroscopy of atoms and ions for the understanding of
the whole physics of matter cannot be overestimated. The study of atoms starts off
naturally from the exact dynamics of a single electron in the central field of a charged
nucleus (Sec. 1.1) because, besides the intrinsic interest of this system, the notation
and concepts developed here are at the basis of the language of all atomic physics.
This language is then used to introduce (Sec. 1.2) the spectroscopy of many-electron
atoms, collections of 2 to $10^2$ electrons moving in the attractive central field of a
single nucleus and subject to mutual repulsion.
1.1. One-electron atom/ions
The one-electron atom is one of the few quantum systems where essentially exact
solutions of the Schrödinger equation (7) are available. Here, comparison of theory
and experiment allows physicists to establish the limits of validity and predictive
power of the quantum mechanical model Eqs. (1-6). When relativistic effects are
included (Sec. 1.1.8), the model is found in almost perfect agreement with extremely
accurate experimental data, all the tiny discrepancies being satisfactorily accounted
for by a perturbative treatment of residual interactions (Sec. 1.1.9).
The solution of the Schrödinger equation (7) for the one-electron atom is a basic
exercise in quantum mechanics.$^1$ Both the $V_{ee}$ and $V_{nn}$ terms in Eq. (1) vanish, and
only the nuclear and electronic kinetic energies plus the Coulomb attraction $V_{ne}$ are
relevant.
$^1$ We often use standard Dirac notation [?], with a ket $|k\rangle$ representing a physical state in the
Hilbert space, characterized by the full set of quantum numbers $k$. The bra $\langle j|$ is a linear operator
from the Hilbert space to the complex numbers, which takes the “component” or “overlap” $\langle j|k\rangle$
of any ket $|k\rangle$ along the direction $|j\rangle$. If $\{|j\rangle\}$ is a basis of eigenstates of some linear operator
$J$ (representing a physical observable), the real number $|\langle j|k\rangle|^2$ represents the probability that,
starting from a state $|k\rangle$, measurement of $J$ yields the eigenvalue $j$. As an example, taking for $J$
the position operator $R$, with eigenvalues $r$ and eigenkets $|r\rangle$, the overlap $\psi_k(r) = \langle r|k\rangle$ (called
wavefunction in real-space representation of state $|k\rangle$) is a complex number such that $|\psi_k(r)|^2$
equals the probability (density) of finding the particle at $r$ when position is measured in an initial
state $|k\rangle$.
Rather than attacking directly the full time-dependent Eq. (7), it is often smarter to first solve
the time-independent Schrödinger equation, i.e. the eigenvalue problem $H_{\rm tot}|\psi\rangle = E|\psi\rangle$. On the
resulting basis of energy eigenstates $|j\rangle$ (with energy $E_j$), one then expands the general solution of
Eq. (7)
$$ |\psi(t)\rangle = \sum_j |j\rangle\, \langle j|\psi(0)\rangle\, e^{-i E_j t/\hbar} , $$
where the weights are precisely the complex components $\langle j|\psi(0)\rangle$ of the initial state $|\psi(0)\rangle$ on this
basis. In particular, the time evolution of a pure energy eigenstate $|j\rangle$ involves a single rotating
phase factor $\exp(-i E_j t/\hbar)$: this justifies the qualification of $|j\rangle$ as stationary.
Detailed solutions of the one-electron atom problem are available in many
textbooks, including Refs. [?, ?, ?]. Here we only summarize the strategy and main
results:
• Separation of the center of mass motion: the position operators of
the nucleus $\vec R$ (of mass $M$) and of the electron $\vec r_e$ are replaced by the
combinations
$$ \vec R_{\rm cm} = \frac{M \vec R + m_e \vec r_e}{M + m_e} \qquad \text{and} \qquad \vec r = \vec r_e - \vec R . \tag{21} $$
In terms of these new coordinates, the Hamiltonian separates into a decoupled kinetic term for $\vec R_{\rm cm}$
$$ H_{\rm cm} = -\frac{\hbar^2}{2(M + m_e)}\, \nabla^2_{\vec R_{\rm cm}} , \tag{22} $$
plus a Coulombic Hamiltonian for the relative coordinate $\vec r$
$$ H_{\rm Coul} = -\frac{\hbar^2}{2\mu}\, \nabla^2_{\vec r} - \frac{Z e^2}{|\vec r|} , \tag{23} $$
where
$$ \mu = \frac{M m_e}{M + m_e} \tag{24} $$
is the reduced mass of the 2-particle system. This separation is equivalent
to that done for solving the classical Kepler-Newton two-body “planetary”
problem. The free global $\vec R_{\rm cm}$ translational motion is trivially described
in terms of plane waves. The internal atomic dynamics is that of a single particle of mass $\mu$ in the same Coulombic central field as the original
nucleus-electron interaction.
• Separation in spherical coordinates: given the spherical symmetry of
the potential, the Schrödinger equation is conveniently rewritten in polar coordinates $r, \theta, \varphi$. By factorizing the total wavefunction $\psi(r, \theta, \varphi) = R(r)\Theta(\theta)\Phi(\varphi)$, the variables separate, and the original three-dimensional
(3D) equation splits into three independent second-order equations for the
$r$, $\theta$, and $\varphi$ motions:
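$$ \frac{d^2\Phi}{d\varphi^2} + \eta\, \Phi = 0 , \tag{25} $$
$$ \frac{1}{\sin\theta} \frac{d}{d\theta}\left(\sin\theta\, \frac{d\Theta}{d\theta}\right) + \left(\lambda - \frac{\eta}{\sin^2\theta}\right) \Theta = 0 , \tag{26} $$
$$ -\frac{\hbar^2}{2\mu} \frac{1}{r^2} \frac{d}{dr}\left(r^2 \frac{dR}{dr}\right) + \left[U(r) + \frac{\hbar^2 \lambda}{2\mu\, r^2} - E\right] R = 0 . \tag{27} $$
Here we indicate a general function $U(r)$ for the potential energy $-Ze^2/r$:
this same formalism can be applied to any central potential (e.g. in Sec. 2.3
below).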
• Solution of the separate eigenvalue problems: the differential equations are
solved imposing the relevant boundary conditions for $R(r)$, $\Theta(\theta)$ and $\Phi(\varphi)$.
The eigenvalues $\eta$, $\lambda$, and $E$ can assume only certain values, compatible
with the boundary conditions:
$$ \eta = \eta_m = m^2 , \qquad m = 0, \pm 1, \pm 2, ... \tag{28} $$
$$ \lambda = \lambda_l = l(l + 1) , \qquad l = |m|, |m| + 1, |m| + 2, ... \tag{29} $$
$$ E = E_n = -\frac{\mu Z^2 e^4}{2 \hbar^2 n^2} = -\frac{\mu}{m_e}\, \frac{E_{\rm Ha} Z^2}{2 n^2} , \qquad n = l + 1, l + 2, l + 3, ... \tag{30} $$
The integer numbers $m$ (magnetic q.n.), $l$ (azimuthal q.n.), and $n$ (principal quantum number) parameterize the eigenvalues and the corresponding
eigenfunctions:
$$ \Phi_m(\varphi) = \frac{1}{\sqrt{2\pi}}\, e^{i m \varphi} \tag{31} $$
$$ \Theta_{lm}(\theta) = (-1)^{\frac{|m| - m}{2}} \sqrt{\frac{2l + 1}{2}\, \frac{(l - |m|)!}{(l + |m|)!}}\; P_l^{|m|}(\cos\theta) \tag{32} $$
$$ R_{nl}(r) = -k^{3/2} \sqrt{\frac{(n - l - 1)!}{2n\, [(n + l)!]^3}}\; (kr)^l\, L_{n+l}^{2l+1}(kr)\, e^{-kr/2} , \tag{33} $$
where $k$ is a shorthand for $2Z/(an)$, and $a$ is a corrected atomic length unit
$a = \frac{m_e}{\mu}\, a_0 = \hbar^2/(\mu e^2)$. The associated Legendre functions $P_l^m(x)$ ($m \ge 0$)
are defined by
$$ P_l^m(x) = (1 - x^2)^{m/2}\, \frac{d^m}{dx^m} P_l(x) , \qquad P_l(x) = \frac{1}{2^l\, l!}\, \frac{d^l}{dx^l} (x^2 - 1)^l . \tag{34} $$
Figure 1.1. The 6 lowest-energy levels $E_n$ of the spectrum of hydrogen according to the nonrelativistic Schrödinger theory (vertical axis: $E$ in units of $E_{\rm Ha}$). Energies
in eV near each line: −13.598 ($n = 1$), −3.400 ($n = 2$), −1.511 ($n = 3$), −0.850 ($n = 4$), −0.544 ($n = 5$), −0.378 ($n = 6$). The zero of the scale coincides with the onset of
the continuum of unbound states.
The associated Laguerre polynomials $L_p^q(\rho)$ are polynomials of degree $p - q$,
defined by
$$ L_p^q(\rho) = \frac{d^q}{d\rho^q} L_p(\rho) , \qquad L_p(\rho) = e^{\rho}\, \frac{d^p}{d\rho^p} \left(\rho^p e^{-\rho}\right) . \tag{35} $$
As the individual terms are properly normalized by the square root etc.
prefactors of Eqs. (31)-(33), so is the total atomic wavefunction
$$ \psi_{nlm}(r, \theta, \varphi) = R_{nl}(r)\, \Theta_{lm}(\theta)\, \Phi_m(\varphi) \tag{36} $$
representing the atomic state $|n, l, m\rangle$. Explicitly, the orthonormality relations read:
$$ \langle n, l, m | n', l', m' \rangle = \int r^2 dr\, \sin\theta\, d\theta\, d\varphi\; \psi^*_{nlm}(r, \theta, \varphi)\, \psi_{n'l'm'}(r, \theta, \varphi) = \delta_{nn'}\, \delta_{ll'}\, \delta_{mm'} . \tag{37} $$
In addition to all these bound states, a continuum of unbound states of
arbitrary positive energy represents the ionic states, where the electron
moves far away from the nucleus.
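As a short worked check of the conventions of Eqs. (31)-(35), one can evaluate the ground-state functions explicitly. The following derivation is an illustration added here; it uses only the definitions above with $n = 1$, $l = 0$, $m = 0$, so that $k = 2Z/a$.

```latex
\begin{align*}
L_1(\rho) &= e^{\rho}\,\frac{d}{d\rho}\left(\rho\, e^{-\rho}\right) = 1 - \rho , \\
L_1^1(\rho) &= \frac{d}{d\rho} L_1(\rho) = -1 , \\
R_{10}(r) &= -k^{3/2} \sqrt{\frac{0!}{2\,(1!)^3}}\; (kr)^0\, (-1)\, e^{-kr/2}
           = \frac{k^{3/2}}{\sqrt{2}}\, e^{-kr/2}
           = 2 \left(\frac{Z}{a}\right)^{3/2} e^{-Zr/a} , \\
\int_0^\infty R_{10}^2(r)\, r^2\, dr &= 4 \left(\frac{Z}{a}\right)^{3} \frac{2!}{(2Z/a)^3} = 1 , \\
\Theta_{00}(\theta)\, \Phi_0(\varphi) &= \frac{1}{\sqrt{2}} \cdot \frac{1}{\sqrt{2\pi}} = \frac{1}{\sqrt{4\pi}} .
\end{align*}
```

The overall minus sign in Eq. (33) compensates the negative sign of $L_1^1$, so that $R_{10}$ is positive, and both the radial and the angular factors are separately normalized, as required by Eq. (37).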
1.1.1. The energy spectrum. The energy eigenvalues (30) of the nonrelativistic 1-electron atom depend on the principal quantum number $n$ only, and show
the characteristic structure drawn in Fig. 1.1. In particular, the lowest-energy state
(the ground state) is $|n, l, m\rangle = |1, 0, 0\rangle$. For H, its binding energy amounts to
$-\frac{1}{2} E_{\rm Ha} \frac{\mu}{m_e} = -13.5983$ eV (slightly less negative than $-\frac{1}{2} E_{\rm Ha} = -13.6057$ eV, due
to the reduced-mass correction $\mu/m_e = 0.999456$). Above this $n = 1$ state there
lies a sequence of bound energy “levels”. In particular, the lowest excited level of H
($n = 2$, including states $|2, 0, 0\rangle$, $|2, 1, -1\rangle$, $|2, 1, 0\rangle$, and $|2, 1, 1\rangle$) is 4-fold degenerate
and sits at $-\frac{1}{2}\left(\frac{1}{2^2} - 1\right) E_{\rm Ha} \frac{\mu}{m_e} = \frac{3}{8} E_{\rm Ha} \frac{\mu}{m_e} = 10.1987$ eV above the ground state.
Further $n$-levels have an increasing degeneracy given by the values of $l = 0, ... n - 1$
compatible with that $n$, and by the values $m = 0, \pm 1, ... \pm l$ compatible with each $l$.
This $m$-degeneracy ($2l + 1$ states) occurs for any central potential and represents the
possibility for the orbital angular momentum to align in any direction in 3D space
without affecting the energy of the atom. In contrast, the additional degeneracy for
different $l$ ($n^2$ states in total) is characteristic of the Coulomb $-r^{-1}$ potential, and
is lifted for different radial dependencies of the potential energy $U(r)$.
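The values quoted in this paragraph follow directly from Eq. (30) and from counting the allowed $(l, m)$ pairs. The Python sketch below is an illustrative addition; it takes $E_{\rm Ha}$ and the reduced-mass ratio of $^1$H from the text, and the function names are arbitrary.

```python
E_HA_EV = 27.2114          # Hartree energy in eV, Eq. (9)
MU_OVER_ME_H = 0.999456    # reduced-mass ratio for 1H, as quoted in the text

def level_energy_ev(n, Z=1, mu_over_me=MU_OVER_ME_H):
    """Bound-state energy E_n of Eq. (30), in eV."""
    return -mu_over_me * E_HA_EV * Z**2 / (2 * n**2)

def degeneracy(n):
    """Number of (l, m) states with principal quantum number n (spin not included)."""
    return sum(2 * l + 1 for l in range(n))

print(f"ground state:           {level_energy_ev(1):.4f} eV")
print(f"first excitation (n=2): {level_energy_ev(2) - level_energy_ev(1):.4f} eV")
for n in range(1, 5):
    print(f"n = {n}: degeneracy {degeneracy(n)}")
```

The degeneracy count reproduces $n^2$, and the two energies reproduce −13.5983 eV and 10.1987 eV.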
Transitions between levels of any n are observed (Fig. 1.2). Historically, the
close agreement of the results of Schrödinger’s equation with accurate experimental
data marked one of the early triumphs of quantum mechanics. The transitions
group naturally in series of transitions with the same lowest state (initial state
in absorption or final state in emission, see Fig. 1.2), and belonging to the same
spectral region. In particular, the spacings of the Coulombic eigenvalues (30) are
peculiar in making the two highest-energy series (lowest state n = 1 and 2) not
overlap with any other series, since the energy distance between the lowest level
(n) and the next (n + 1) is larger than the whole range of bound-state energies
from level n + 1 to the ionization threshold. For hydrogen, the transitions whose
lower level is n = 1 constitute the so-called Lyman series (10.2 ÷ 13.6 eV, in the
ultraviolet); the transitions whose lower level is n = 2 constitute the Balmer series
(1.89÷3.40 eV, visible); the transitions whose lower level is n = 3 are called Paschen
series (0.66 ÷ 1.51 eV, infrared); the transitions whose lower level is n = 4 are called
Brackett series (0.31 ÷ 0.85 eV, infrared).
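The energy ranges of the series, and the statement that only the two highest-energy series are free from overlap, can be verified from Eq. (30). The sketch below is an illustrative addition using the hydrogen constants quoted in the text; the function names are arbitrary.

```python
E_HA_EV = 27.2114          # Hartree energy in eV, Eq. (9)
MU_OVER_ME_H = 0.999456    # reduced-mass ratio for 1H, as quoted in the text

def level_energy_ev(n):
    """Hydrogen bound-state energy E_n of Eq. (30), in eV."""
    return -MU_OVER_ME_H * E_HA_EV / (2 * n**2)

def series_range_ev(n_low):
    """(lowest transition energy, series limit) of the series with lower level n_low."""
    first_line = level_energy_ev(n_low + 1) - level_energy_ev(n_low)
    limit = -level_energy_ev(n_low)
    return first_line, limit

def overlaps_next_series(n_low):
    """True if the series n_low overlaps in energy with the series n_low + 1."""
    return series_range_ev(n_low)[0] < series_range_ev(n_low + 1)[1]

for name, n_low in (("Lyman", 1), ("Balmer", 2), ("Paschen", 3), ("Brackett", 4)):
    lo, hi = series_range_ev(n_low)
    print(f"{name:9s} {lo:6.2f} - {hi:6.2f} eV, overlaps next series: {overlaps_next_series(n_low)}")
```

The Lyman and Balmer series come out isolated, while the Paschen series already overlaps the Brackett series.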
The weak dependence of the spectrum on the mass of the nucleus (through the
reduced mass $\mu$) produces a fine duplication (relative energy separations $\lesssim 0.1\%$
– see Fig. 1.3) of the lines of the spectrum of a mixture of different isotopes such as
$^1$H and $^2$H (also called deuterium D). Finally, note that the $Z^2$ dependence of the
eigenvalues (30) makes one half of the lines of one half of the series of the He$^+$ ion
(one third of those of Li$^{2+}$, ...) almost coincident (except for the mass shift, and
relativistic effects) with the series of H, as illustrated in Fig. 1.4.
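Both remarks can be checked with a few lines of Python. The sketch below is an illustrative addition: the nucleus-to-electron mass ratios are approximate CODATA-like values assumed here (they are not given in the notes), and the function names are arbitrary.

```python
import math

# Nucleus-to-electron mass ratios (approximate values, assumed here)
MASS_RATIO = {"1H": 1836.15267, "2H": 3670.48297}

def reduced_mass_over_me(nucleus_mass_over_me):
    """mu / m_e from Eq. (24)."""
    return nucleus_mass_over_me / (nucleus_mass_over_me + 1.0)

def line_energy(Z, n_low, n_high, mu_over_me=1.0):
    """Transition energy from Eq. (30), in units of E_Ha."""
    return 0.5 * mu_over_me * Z**2 * (1.0 / n_low**2 - 1.0 / n_high**2)

mu_H = reduced_mass_over_me(MASS_RATIO["1H"])
mu_D = reduced_mass_over_me(MASS_RATIO["2H"])
isotope_shift = mu_D / mu_H - 1.0   # relative separation of corresponding H and D lines

print(f"mu/m_e for 1H: {mu_H:.6f}")
print(f"relative H-D line separation: {isotope_shift:.3e}")
# Neglecting the mass shift, He+ lines between even levels coincide with H lines
print(line_energy(2, 4, 6), line_energy(1, 2, 3))
```

The relative separation of corresponding H and D lines comes out at about 0.03%, below the 0.1% bound quoted above, and the He$^+$ transition between $n = 6$ and $n = 4$ has the same energy as the H transition between $n = 3$ and $n = 2$ when the mass shift is neglected.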
Figure 1.2. The observed spectrum of hydrogen (a), with a detail
of the Balmer series (b) in the visible range. (c) A portion of the
spectrum of the star ζ Tauri, showing more than 20 lines of the Balmer
series.
1.1.2. The angular wavefunction. The angular solutions Ylm (θ, ϕ) = Θlm (θ)Φm (ϕ)
are the normalized eigenfunctions (named spherical harmonics) of the free angular
motion of one quantum-mechanical particle [2]. Ylm contains complete information
about an important observable: the orbital angular momentum. Indeed, ħ² × (the
angular part of the Laplace operator) occurring in the separation of variables represents the squared orbital angular momentum $|\vec L|^2$ of the rotating two-body system.
ħ²λ = ħ² l(l + 1) are the eigenvalues of $|\vec L|^2$. Likewise, ħm are the eigenvalues of the
angular momentum component $L_z = -i\hbar\,\partial/\partial\varphi$. Information about other components
(Lx or Ly) is not compatible with the measurement of Lz, and thus has only statistical meaning in quantum mechanics, since these other components do not commute
with Lz. On the other hand, the choice of the ẑ direction in space (connected to
the choice of the polar coordinate system used) is arbitrary, and due to spherical
symmetry, any alternative choice would lead to the same observable results.

[2] Definitions (31) and (32) adhere to the standard conventions for the phase of the ubiquitous
Ylm functions.

Figure 1.3. A close view of the Balmer Hα line emitted by a mixture
of ¹H and ²H.
The spherical harmonics carry complete information about the angular distribution of $\vec r$. In a state |l, m⟩ with fixed squared angular momentum (quantum number l) and ẑ-projection (quantum number m), the probability that the vector $\vec r$ joining the nucleus to the electron
points within the solid angle sin θ dθ dϕ around the direction (sin θ cos ϕ, sin θ sin ϕ, cos θ) equals |⟨θ, ϕ|l, m⟩|² sin θ dθ dϕ ≡
|Ylm(θ, ϕ)|² sin θ dθ dϕ. Equation (31) indicates that the ϕ dependence of |Ylm|²
is always trivial: |Ylm|² are constant functions on all circles at fixed θ. Figure 1.5
collects polar plots of |Ylm|² as a function of θ for several values of l and m (other visualizations of the same objects can be found in many textbooks). It is apparent that
l − |m| counts the number of zeros (nodes) of |Ylm |2 as the polar angle θ runs from
0 to π. A large number of nodes indicates a large angular momentum component
perpendicular to ẑ.
Notes: The quantization of angular momentum originates from the boundary
conditions Φ(ϕ + 2π) = Φ(ϕ), and Θ(θ) finite at θ = 0 and π for the angular
wavefunction: these are realized only for discrete values of the quantum numbers
m and l [?]. The visible (Fig. 1.5) increase of |Yl0 (0, ϕ)|2 with l does not contradict
normalization (37), because of the sin θ integration factor. If the $r^l$ factor taken from
the radial wavefunction (33) is grouped together with Ylm(θ, ϕ), one can express
$r^l\,Y_{lm}(\theta,\varphi)$ in terms of the Cartesian components $(r_x, r_y, r_z)$ of $\vec r$, obtaining a homogeneous
polynomial of degree l. For example,

$$ r\,Y_{10}(\theta,\varphi) = \sqrt{\frac{3}{4\pi}}\; r_z\,, \qquad r\,Y_{1\,\pm1}(\theta,\varphi) = \sqrt{\frac{3}{8\pi}}\,(\mp r_x - i\,r_y)\,. \tag{38} $$

This observation makes it clear that the parity of Ylm(θ, ϕ) (its character under $\vec r \to -\vec r$) is
the same as that of l, i.e. $(-1)^l$. Finally, it is easy and useful to retain the expression for
the simplest spherical harmonic function (a polynomial of degree 0, i.e. a constant):
$Y_{00}(\theta,\varphi) = (4\pi)^{-1/2}$.

Figure 1.4. Observed changes in the spectra of one-electron
atom/ions due to changes in the nuclear mass. The currently
accepted value of the Rydberg constant $R_\infty = E_{\rm Ha}/(2hc)$ is
109737.31568527(73) cm⁻¹ (2007 CODATA).

Figure 1.5. [Four polar-plot panels, l = 0, 1, 2, 3; horizontal axis x, vertical axis z.] Polar plots of the lowest-l spherical harmonics: radial
distance from the origin equals |Ylm(θ, ϕ)|². Here the x−z plane (ϕ =
0) is shown, but ϕ may take any value. θ measures the angle
away from the ẑ axis and varies from 0 (upward) to π (downward).
Colors encode increasing value of |m|, from its minimum m = 0 (red)
to its maximum m = l (violet).
Important notation: the standard spectroscopic language to indicate the value
of orbital angular momentum is s, p, d, f, g, h, ... for l = 0, 1, 2, 3, 4, 5, ...
respectively.
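The remark on normalization can be checked numerically. The sketch below is only an illustration: it hard-codes the l ≤ 1 functions that follow from Eq. (38) and from the constant Y00, and the function names (`ylm_squared`, `solid_angle_norm`) and the midpoint integration rule are choices made here, not part of the notes.

```python
import math

def ylm_squared(l, m, theta):
    """|Y_lm(theta, phi)|^2 for l <= 1; it does not depend on phi."""
    if (l, m) == (0, 0):
        return 1.0 / (4.0 * math.pi)
    if (l, m) == (1, 0):
        return 3.0 / (4.0 * math.pi) * math.cos(theta) ** 2
    if l == 1 and abs(m) == 1:
        return 3.0 / (8.0 * math.pi) * math.sin(theta) ** 2
    raise ValueError("only l <= 1 is tabulated in this sketch")

def solid_angle_norm(l, m, steps=20000):
    """Integrate |Y_lm|^2 sin(theta) over the sphere (midpoint rule in theta)."""
    h = math.pi / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * h
        total += ylm_squared(l, m, theta) * math.sin(theta) * h
    return 2.0 * math.pi * total  # the phi integral contributes a factor 2*pi

for l, m in [(0, 0), (1, 0), (1, 1), (1, -1)]:
    print(l, m, solid_angle_norm(l, m))
```

Each integral comes out equal to 1 within the integration error, even though |Y10|² at θ = 0 is three times larger than |Y00|²: the sin θ weight suppresses the polar regions.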
1.1.3. The radial wavefunction. The radial wavefunction Rnl(r), Eq. (33),
has the structure of a product of (i) a normalization factor, (ii) a power $r^l$ (mentioned
above in relation to Ylm), (iii) an associated Laguerre polynomial of degree n−l−1 in
r, and (iv) the exponential of $-Zr/(na)$. The power term defines the behavior of Rnl(r) for
r → 0. The exponential decay dominates at large r, where $R_{nl}(r) \sim r^{n-1}\exp\!\left(-\frac{Z}{n}\,\frac{r}{a}\right)$.
The Laguerre polynomial $L^{2l+1}_{n+l}(\rho)$ vanishes at as many different points as its degree
(n − l − 1): these zeroes are all located at positive ρ, thus each produces a radial
node, i.e. a radial distance r where Rnl(r) vanishes and changes sign. Figure 1.6
summarizes all these facts for the lowest-n radial eigenfunctions.

Figure 1.6. [Panels for s, p and d states; horizontal axis r/a; vertical axes a³R² (left) and log10[a³R²] (right).] Z = 1 hydrogenic radial s, p and d squared eigenfunctions |Rnl(r)|². Left: linear scale; right: log10 scale. Solid: n = 1,
dashed: n = 2, dot-dash: n = 3, dot-dot-dash: n = 4, dot-dot-dot-dash: n = 5. The (n−l−1)
radial nodes, where Rnl changes sign, appear as kinks in the log plots.

Figure 1.7. [Panels for s, p and d states; horizontal axis r/a; vertical axis a r²R².] Hydrogen s, p and d radial probability distribution
P(r) = |r Rnl(r)|². Solid: n = 1, dashed: n = 2, dot-dash: n = 3, dot-dot-dash: n = 4,
dot-dot-dot-dash: n = 5. The (n−l−1) radial nodes show as tangencies to the horizontal axis, because of squaring Rnl. These curves
refer to Z = 1.
It is important to distinguish the “radial probability distribution” P(r) = r²|R(r)|²
of Fig. 1.7 from the |R(r)|² of Fig. 1.6. P(r) dr provides the probability that the
nucleus-electron distance is within dr of r, regardless of the direction where $\vec r$ points.
The r² weight factor is precisely the spherical-coordinates Jacobian, proportional to
the surface of the sphere of radius r, or rather the volume of the spherical shell
“between r and r + dr”. In contrast, the probability that the electron is found within a volume element d³r at a
specific position $\vec r$ relative to the nucleus is given by

$$ P_{3D}(\vec r)\,d^3\vec r = |\psi(\vec r)|^2\,d^3\vec r = |\psi(r,\theta,\varphi)|^2\,d^3\vec r\,, \tag{39} $$

where the polar coordinates are those representing that point $\vec r$. Equation (39)
indicates that |ψ(r, θ, ϕ)|² gives the actual 3D probability distribution in space, not
P(r). This means in particular that the probability density profile along a line
through the nucleus (specified by fixing (θ, ϕ)) is simply proportional to |R(r)|². Note (Fig. 1.6)
that all s eigensolutions have nonzero $R_{n0}(0) = 2[Z/(an)]^{3/2}$. Indeed, r = 0 is a
(cusp-like) absolute maximum of |Rn0(r)|². It is no surprise that the most likely
point in space for the electron is $\vec r = \vec 0$, the place where the nucleus sits, thus where
the potential U(r) is the most attractive. For s states the misleading vanishing of
P(r → 0) is entirely due to the r² weight.
For l > 0, the probability density |Rnl(r)|² also vanishes at the origin, where
the centrifugal repulsion described by the λ = l(l + 1) “effective potential” term
in Eq. (27) diverges. We have here the quantum-mechanical analog of the
impossibility of a classical point particle carrying nonzero angular momentum to
reach the origin of a central potential. More wave-mechanically: for l > 0 the origin
is a common point of one or several nodes of the angular wavefunction: the only
way for the total wavefunction to be single-valued there is to vanish.
For increasing nuclear charge Z, |Rnl(r)|² and P(r) localize closer and closer to
the origin (scaling as $|R_Z(r)|^2 = Z^3\,|R_1(Zr)|^2$, and therefore $P_Z(r) = Z\,P_1(Zr)$): this explains the ∝ Z² (rather than ∝ Z¹
as in Vne) dependence of the eigenenergies (30). The simplest radial wavefunction,
that of the ground state, exemplifies this Z dependence well:

$$ R_{10}(r) = \sqrt{\frac{k^3}{2}}\;e^{-kr/2} = 2\left(\frac{Z}{a}\right)^{3/2}\exp\!\left(-\frac{Zr}{a}\right). \tag{40} $$
In contrast, at fixed Z and for increasing n, the radial probability distribution P (r)
peaks at larger and larger distance from the origin, see Fig. 1.7.
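The distinction between |R(r)|² and P(r), and the Z scaling, can be made concrete for the ground state. The following is a minimal sketch based on Eq. (40), with lengths measured in units of a; the function names, the grid step and the cutoff radius are arbitrary choices made for this illustration.

```python
import math

def radial_probability_1s(r, Z=1.0):
    """P(r) = r^2 |R_10(r)|^2 with R_10 = 2 Z^(3/2) exp(-Z r), r in units of a."""
    R10 = 2.0 * Z ** 1.5 * math.exp(-Z * r)
    return (r * R10) ** 2

def moments_1s(Z=1.0, h=1e-3, r_max=40.0):
    """Return (norm, mean radius, most probable radius) by midpoint integration."""
    norm = 0.0
    mean = 0.0
    r_peak, p_peak = 0.0, 0.0
    steps = int(r_max / h)
    for i in range(steps):
        r = (i + 0.5) * h
        p = radial_probability_1s(r, Z)
        norm += p * h
        mean += r * p * h
        if p > p_peak:
            r_peak, p_peak = r, p
    return norm, mean, r_peak

for Z in (1.0, 2.0):
    print(Z, moments_1s(Z))
```

For Z = 1 the distribution is normalized, peaks near r = a and has mean radius 3a/2; for Z = 2 both lengths are halved, while |R10(0)|² = 4Z³/a³ remains the absolute maximum of the 3D density.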
Notation: the hydrogenic kets/eigenfunctions |n, l, m⟩ of (36) are often abbreviated as n[l], where n is the principal quantum number, [l] is the relevant letter
s, p, d, ... for that value of l, and information about m is dropped. For example,
4p refers to any of $\psi_{4\,1\,-1}$, $\psi_{4\,1\,0}$, $\psi_{4\,1\,1}$. This notation is far from unambiguous, as the
same 4p symbol implies different radial dependences Rnl (r) for nuclei of different
mass and charge.
1.1.4. Orbital angular momentum and magnetic dipole moment. The
angular momentum of an orbiting charged particle such as an electron is associated
with a magnetic dipole moment. This is best illustrated in the classical case (Fig. 1.8):
a particle of mass m and charge q rotating along a circular orbit of radius r at a
speed v, thus in a period T = 2πr/v, has angular momentum $\vec L = \vec r\times\vec p = m\,r\,v\,\hat n$,
where n̂ is the unit vector perpendicular to the plane of its trajectory. The current along the
loop is simply I = q/T = q v/(2πr). The magnetic moment of a ring current equals
the product of the current times the loop area:

$$ \vec\mu = I\,\pi r^2\,\hat n = \frac{q\,v}{2\pi r}\,\pi r^2\,\hat n = \frac{q\,v\,r}{2}\,\hat n = \frac{q}{2m}\,\vec L\,. \tag{41} $$

One can show that this equality holds for any shape of the orbit.
Relation (41) holds also in quantum mechanics, as an operator relation. For an
electron of charge q = −qe, where the angular momentum is quantized in units of
ħ, it is convenient to write (41) as

$$ \vec\mu = -\frac{q_e}{2m_e}\,\vec L = -\frac{\hbar q_e}{2m_e}\,\frac{\vec L}{\hbar} = -g_l\,\mu_B\,\frac{\vec L}{\hbar}\,, \tag{42} $$

where the Bohr magneton $\mu_B = \frac{\hbar q_e}{2m_e} = 9.27401\times10^{-24}$ J T⁻¹ (or equivalently A m²)
is the natural scale of atomic magnetic moments. gl = 1 is the orbital g-factor,
introduced for uniformity of notation with those situations with nontrivial g-factors
g ≠ 1 that we shall encounter below.

Figure 1.8. The relation between the orbital angular momentum $\vec L$
and the magnetic moment $\vec\mu$ produced by an electron of charge −qe
rotating on a circular orbit. The magnetic field $\vec B$ produced by the
circulating current is indicated by the curved lines.
The atomic angular momenta can be detected by letting the associated magnetic
moments interact with a magnetic field. If the field $\vec B$ is uniform, it induces a
precession of $\vec\mu$ around the direction of $\vec B$ with a frequency (the Larmor frequency)
$\omega = \frac{q_e B}{2m_e}$, routinely detected in microwave resonance experiments. If the field is
nonuniform instead, a net force acts on the atom, as we discuss in the next section.
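To get a feeling for the orders of magnitude, the Bohr magneton of Eq. (42) and the Larmor frequency can be evaluated directly. This is a rough sketch: the constants are CODATA 2018 values typed in by hand (an assumption of this example, not taken from the notes), and the field B = 1 T is an arbitrary illustrative choice.

```python
import math

# Physical constants in SI units (CODATA 2018 values, assumed here).
HBAR = 1.054571817e-34   # reduced Planck constant, J s
Q_E = 1.602176634e-19    # elementary charge, C
M_E = 9.1093837015e-31   # electron mass, kg

def bohr_magneton():
    """mu_B = hbar q_e / (2 m_e), in J/T."""
    return HBAR * Q_E / (2.0 * M_E)

def larmor_angular_frequency(B):
    """omega = q_e B / (2 m_e), in rad/s, for a field B in tesla."""
    return Q_E * B / (2.0 * M_E)

mu_B = bohr_magneton()
omega = larmor_angular_frequency(1.0)   # B = 1 T, illustrative value
frequency = omega / (2.0 * math.pi)
print(mu_B, omega, frequency)
```

With B = 1 T the precession frequency ω/2π is about 14 GHz, which indeed falls in the microwave range mentioned above.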
1.1.5. The Stern-Gerlach experiment. The interaction energy of a magnetic moment with a magnetic field is

$$ H_{\rm magn} = -\vec\mu\cdot\vec B\,. \tag{43} $$

This energy is expected to remain constant in time due to the classical precession. A
nonuniform field $\vec B$ generates a force equal to minus the gradient of this interaction energy:

$$ \vec F = -\vec\nabla(-\vec\mu\cdot\vec B) = \vec\nabla(\vec\mu\cdot\vec B)\,. \tag{44} $$
Figure 1.9. The origin of the force that a nonuniform magnetic field
produces on a magnetic moment. (a) If the magnetic dipole is seen as
a circulating current, the net force originates from a force component
consistently pointing in the direction of increasing $\vec B$. (b) If the dipole
is seen as a pair of magnetic monopoles, a net force arises from the
unbalance between the forces on the individual monopoles.
In particular, in a magnetic field with a dominant Bz component, the ẑ force component Fz is proportional to the derivative of Bz along the same direction:

$$ F_z \simeq \mu_z\,\nabla_z B_z = \mu_z\,\frac{\partial B_z}{\partial z}\,. \tag{45} $$
The “microscopic” origin of this force is pictured in Fig. 1.9. The observation that
a nonuniform magnetic field produces a force proportional to a magnetic-moment
component is at the basis of the Stern-Gerlach experiment, one of the key experiments of quantum mechanics.
As illustrated in Fig. 1.10, a collimated beam of neutral atoms at thermal speeds
is emitted from an oven into a vacuum chamber where it traverses a region of inhomogeneous magnetic field and is finally collected by a suitable detector. Basically,
the Stern-Gerlach apparatus is a device for measuring the component of the atomic $\vec\mu$ in
the field-gradient direction. The first sections of Ref. [?] report a full analysis of the
Stern-Gerlach experiment and its far-reaching implications. The original experiment
(1922) was carried out using Ag atoms, but similar deflections are observed using
atomic H.
The main outcome of the Stern-Gerlach experiment is that the ẑ component of
$\vec\mu$ is not distributed continuously, as one would expect for a classical vector pointing at random in space, but is rather peaked at discrete values. The lower panel of
Fig. 1.10 shows clustering of the atoms in two lumps. Now, quantum mechanics
does indeed predict that the ẑ-component Lz of angular momentum (and
thus µz of the magnetic moment) should show discrete eigenvalues. However, (i) the
number of eigenvalues of Lz must be odd (2l + 1, with integer l = 0, 1, 2, ..., see
Eq. (29)) and (ii) the ground state of hydrogen has l = 0, thus it should have no
magnetic moment at all, and one undeflected lump should be observed, rather than
two. This is a first hint that some extra degree of freedom must play a role in the
one-electron atom.

Figure 1.10. (a) In the Stern-Gerlach experiment, a collimated
beam of atoms from an oven traverses a region of inhomogeneous
magnetic field created by a magnet with asymmetric core expansions:
the atoms are finally detected at a collector plate. (b) In an inhomogeneous magnetic field, a magnet experiences a net force which depends
on its orientation. (c) The deflection pattern recorded on the detecting
plate in a Stern-Gerlach measurement of the ẑ component of the magnetic dipole moment of Ag atoms (the outcome would be the same for
H atoms). Contrary to the classical prediction of an even distribution
of randomly oriented magnetic moments, two discrete components are
observed, due to angular-momentum quantization.
1.1.6. Electron Spin. The Stern-Gerlach experiment, the multiplet fine structure of the spectral lines (the fine doublets of Fig. 1.3), and the Zeeman splitting of
the spectral lines (see Sec. 1.1.11 below) are three pieces of evidence pointing to the
existence of an extra degree of freedom of the electron, besides its position in space.
W. Pauli introduced a nonclassical internal degree of freedom, later named electron
spin, with properties similar to orbital angular momentum. Spin may be pictured as
the intrinsic angular momentum of rotation of the electron around itself, although
this picture is imprecise. As for orbital angular momentum, if spin is measured along
one direction, say ẑ, one finds Sz = ħ ms, where again the quantum number ms
takes (2s + 1) values ms = −s, ..., s. As a Stern-Gerlach deflector produces two
lumps, 2 = 2s + 1 components are postulated, requiring that the intrinsic angular
momentum of the electron be s = 1/2. This in turn is associated with a squared
spin angular momentum $|\vec S|^2 = \frac12\left(\frac12+1\right)\hbar^2 = \frac34\,\hbar^2$.
To avoid confusion with the spin ẑ-projection quantum number ms, henceforth we shall
use ml for the projection quantum number associated with Lz. With this notation, a
complete wavefunction, necessary to specify all degrees of freedom of the electron,
is slightly more complicated than $R(r)\,Y_{l\,m_l}(\theta,\varphi)$: an extra spin dependence must be
inserted. Assuming, as apparent from the nonrelativistic Hamiltonian (1), that spin
and orbital motions do not interact, the eigenstate of a one-electron atom with spin
pointing up (↑) or down (↓) in a definite orientation is written

$$ \psi_{n\,l\,m_l\,m_s}(r,\theta,\varphi,\sigma) = R_{n\,l}(r)\,Y_{l\,m_l}(\theta,\varphi)\,\chi_{m_s}(\sigma)\,. \tag{46} $$

Here σ is the variable for the extra degree of freedom, which can take the values ±1/2,
according to whether the electron spin is found pointing up or down along some fixed ẑ direction,
while the quantum number ms = ±1/2 indicates which way the spin of this specific
state is actually pointing with respect to that fixed direction. These basic spin
functions are therefore simply $\chi_{m_s}(\sigma) = \langle\sigma|m_s\rangle = \delta_{m_s\,\sigma}$.
Less trivial spin wavefunctions occur when the spin points in some direction other
than ẑ (non-Sz eigenstates). Suppose that a Stern-Gerlach apparatus is employed to purify
a beam of atoms with spin polarized along some oblique direction, and that the beam is then analyzed by a
second apparatus measuring the spin component σ along the fixed ẑ direction.
For the oblique-spin pure state, the (now nontrivial) spin wavefunction χ(σ) bears
the standard significance of a wavefunction in quantum mechanics: |χ(↑)|² is the
probability that, when Sz is measured, +ħ/2 is found, while |χ(↓)|² is the probability
to obtain −ħ/2. Upon Stern-Gerlach measurement, the spin ẑ component is only
found pointing either up or down, therefore the total probability is $\sum_\sigma|\chi(\sigma)|^2 = |\chi(\uparrow)|^2 + |\chi(\downarrow)|^2 = 1$.
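As an illustration of such a nontrivial spin wavefunction, the sketch below uses the standard textbook parametrization of a spin-1/2 state polarized along the direction with polar angles (θ, ϕ), namely χ(↑) = cos(θ/2) and χ(↓) = e^{iϕ} sin(θ/2). This explicit form is not derived in the present notes and is taken here as an assumption; the function names are likewise illustrative.

```python
import cmath
import math

def oblique_spinor(theta, phi):
    """Return (chi_up, chi_down) for a spin-1/2 polarized along (theta, phi)."""
    chi_up = math.cos(theta / 2.0)
    chi_down = cmath.exp(1j * phi) * math.sin(theta / 2.0)
    return chi_up, chi_down

def sz_probabilities(theta, phi):
    """Probabilities of finding +hbar/2 and -hbar/2 when S_z is measured."""
    chi_up, chi_down = oblique_spinor(theta, phi)
    return abs(chi_up) ** 2, abs(chi_down) ** 2

p_up, p_down = sz_probabilities(math.pi / 3.0, 0.4)
print(p_up, p_down, p_up + p_down)
```

For a polarization direction tilted by 60° away from ẑ, the second apparatus sends three quarters of the atoms into the ↑ sub-beam and one quarter into the ↓ sub-beam, whatever the azimuth ϕ.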
At the present initial level of understanding, electron spin is just an extra quantum
number which, in the absence of magnetic fields, only provides an extra degeneracy
to all atomic states: the total degeneracy of the nth level is 2n², rather than n².
Spin will affect the energy levels only when relativistic effects are considered, in
Sec. 1.1.8.
An important novelty characteristic of spin is that the separation of the ↑ and
↓ sub-beams in a Stern-Gerlach apparatus is compatible with a g-factor for spin
gs ≃ 2, rather than 1 as the orbital gl. The precise value of the electron intrinsic
magnetic moment is determined extremely accurately by electron spin resonance
(ESR) measurements: gs = 2.00232.
1.1.7. Total angular momentum and magnetic moment. The spin operator components Sx, Sy, Sz are assumed to follow the same algebra of commutation
relations as the orbital angular momentum components, such as [Lx, Ly] = i ħ Lz,
etc. These commutation relations, characteristic of angular momentum in quantum
mechanics, are all that it takes to determine all the properties (eigenvalues, eigenvectors, matrix elements) of the angular momentum operators, irrespective of their
spin or orbital nature [?]. The basic physical consideration is that the deep nature
of both spin and orbital angular momentum is precisely that of being angular momenta.
Just as linear momentum $\vec p$ is the generator of translations, angular momentum generates the rotations. Orbital angular momentum $\vec L$ is all that is needed to generate
rotations of a structureless particle. However, electron spin makes electrons slightly
more complicated objects: rotations must include spin, besides position degrees of
freedom. The operator that generates the rotations of an electron is therefore the
total angular momentum $\vec J = \vec L + \vec S$.
As [Li, Sk] = 0, the $\vec J$ components Jx, Jy, Jz satisfy the same commutation relations as the orbital and spin parts. As a consequence, again $|\vec J|^2$ has eigenvalues
ħ² j(j + 1), with either integer or half-odd-integer j, and Jz has eigenvalues ħ mj,
with mj taking the 2j + 1 values mj = −j, −j + 1, ..., j − 1, j.
Both operators $|\vec J|^2$ and Jz commute with $|\vec L|^2$ and $|\vec S|^2$ individually. One can then
ask legitimately: which values of j are compatible with two given angular momenta l
and s? The commutation relations are sufficient to answer this question completely
and also to obtain the “coupled states” in terms of the eigenstates of the original
basis where the individual Lz and Sz components are diagonal. For our purposes,
it suffices to retain the result of this interesting exercise in quantum mechanics [?],
namely that the allowed values for j are

$$ j = |l-s|,\ |l-s|+1,\ \dots,\ l+s-1,\ l+s\,. \tag{47} $$

The extremal values recall the classical picture (Fig. 1.11) of vector composition;
the discrete values reproduce the quantum mechanical quantization of angular momentum.

Figure 1.11. [Vector diagrams of l and s: parallel alignment gives jmax = l + s, antiparallel alignment gives jmin = |l − s|.] The intuitive mnemonic rule of angular-momentum
composition, Eq. (47).
The coupling of the angular momenta realizes a change of basis: diagonalization
of the operators Lz and Sz gives a basis of d = (2l + 1) · (2s + 1) states |ml, ms⟩
labeled by all possible combinations of the allowed values for the projections of the
angular momenta in the ẑ direction. In this d-dimensional space of states it is
often convenient to employ a different basis, one where $|\vec J|^2$ and Jz are diagonal
instead. The number of states must remain the same, so it must be verified that

$$ (2l+1)\,(2s+1) = [2\,|l-s|+1] + [2\,(|l-s|+1)+1] + \dots + [2\,(l+s-1)+1] + [2\,(l+s)+1]\,. $$

This is easily checked. The coupling of angular momentum states is realized through
a unitary transformation in this d-dimensional space, i.e. through a d × d unitary
matrix:

$$ |j, m_j\rangle = \sum_{m_l,\,m_s} C^{\,j\,m_j}_{\,l\,m_l\,s\,m_s}\,|m_l, m_s\rangle\,, \tag{48} $$

where the numbers $C^{\,j\,m_j}_{\,l\,m_l\,s\,m_s}$ (named Clebsch-Gordan coefficients, conventionally
chosen real, tabulated in many books, e.g. Refs. [?, ?]) are the matrix elements of
the basis transformation. With the basic nonrelativistic Hamiltonian (1) all the d
states within the (l, s) multiplet have the same energy, and this degeneracy holds
whether we describe them in terms of the |ml, ms⟩ basis or of the |j, mj⟩ basis. Either
of these two bases is equally well suited to describe the system. The difference is that
the |ml, ms⟩ basis emphasizes the invariance of the system with respect to separate
space and spin rotations, while the |j, mj⟩ basis emphasizes the system invariance
with respect to global rotations (of both position and spin by the same amount).
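The state-counting identity that accompanies rule (47) is easy to verify by brute force. The sketch below uses exact rational arithmetic so that half-integer quantum numbers are handled without rounding; the function names are chosen for this illustration only.

```python
from fractions import Fraction

def allowed_j(l, s):
    """Values of j allowed by rule (47): |l - s|, |l - s| + 1, ..., l + s."""
    l, s = Fraction(l), Fraction(s)
    j = abs(l - s)
    values = []
    while j <= l + s:
        values.append(j)
        j += 1
    return values

def count_coupled(l, s):
    """Number of |j, mj> states: sum of the multiplicities 2j + 1."""
    return sum(2 * j + 1 for j in allowed_j(l, s))

def count_uncoupled(l, s):
    """Number of |ml, ms> states: (2l + 1)(2s + 1)."""
    return (2 * Fraction(l) + 1) * (2 * Fraction(s) + 1)

for l in range(4):
    print(l, allowed_j(l, Fraction(1, 2)),
          count_coupled(l, Fraction(1, 2)), count_uncoupled(l, Fraction(1, 2)))
```

For s = 1/2 the loop prints a single value j = 1/2 for l = 0 and the pair j = l ± 1/2 otherwise, with 2(2l + 1) states in both bases.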
Concretely, in the one-electron atom at hand, the orbital angular momentum l
combines with the electron spin s = 1/2. Rule (47) assigns to j either one value
j = s = 1/2 (for l = 0, s states), or two values j = l ± 1/2 (for l ≥ 1, p, d, f, ... states).
For s states, the transformation matrix from the |ml = 0, ms⟩ basis to the |j = 1/2, mj⟩
basis is trivially the 2 × 2 identity $C^{\,\frac12\,m_j}_{\,0\,0\,\frac12\,m_s} = \delta_{m_j\,m_s}$, since here $\vec J$ coincides with $\vec S$.
For l ≥ 1, the “maximally aligned” components coincide in the two bases:

$$ \left|\,j = l+\tfrac12,\ m_j = \pm\left(l+\tfrac12\right)\right\rangle = \left|\,m_l = \pm l,\ m_s = \pm\tfrac12\right\rangle\,; $$

the remaining coupled states are each expressed in terms of two uncoupled states
only, those whose Jz components match:

$$ \left|\,j = l+\tfrac12,\ m_j\right\rangle = a\,\left|\,m_l = m_j+\tfrac12,\ m_s = -\tfrac12\right\rangle + b\,\left|\,m_l = m_j-\tfrac12,\ m_s = +\tfrac12\right\rangle $$

and the orthogonal ket

$$ \left|\,j = l-\tfrac12,\ m_j\right\rangle = b\,\left|\,m_l = m_j+\tfrac12,\ m_s = -\tfrac12\right\rangle - a\,\left|\,m_l = m_j-\tfrac12,\ m_s = +\tfrac12\right\rangle\,, $$

where

$$ a = C^{\,l+\frac12\ m_j}_{\,l\ m_j+\frac12\ \frac12\ -\frac12} = \sqrt{\frac{l+\frac12-m_j}{2l+1}} \qquad\text{and}\qquad b = C^{\,l+\frac12\ m_j}_{\,l\ m_j-\frac12\ \frac12\ +\frac12} = \sqrt{\frac{l+\frac12+m_j}{2l+1}} $$

are the quantum weights in the transformation. Clearly, a² + b² = 1 as required by
a unitary transformation.
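Even more concretely, for a p (l = 1) orbital triplet, the explicit transformation
between the |ml, ms⟩ basis and the coupled |j, mj⟩ basis involves a 6 × 6
matrix as follows:

$$
\begin{pmatrix}
|j=\tfrac12,\ m_j=\tfrac12\rangle \\[2pt]
|j=\tfrac12,\ m_j=-\tfrac12\rangle \\[2pt]
|j=\tfrac32,\ m_j=\tfrac32\rangle \\[2pt]
|j=\tfrac32,\ m_j=\tfrac12\rangle \\[2pt]
|j=\tfrac32,\ m_j=-\tfrac12\rangle \\[2pt]
|j=\tfrac32,\ m_j=-\tfrac32\rangle
\end{pmatrix}
=
\begin{pmatrix}
0 & -\sqrt{\tfrac13} & 0 & \sqrt{\tfrac23} & 0 & 0 \\[2pt]
0 & 0 & -\sqrt{\tfrac23} & 0 & \sqrt{\tfrac13} & 0 \\[2pt]
1 & 0 & 0 & 0 & 0 & 0 \\[2pt]
0 & \sqrt{\tfrac23} & 0 & \sqrt{\tfrac13} & 0 & 0 \\[2pt]
0 & 0 & \sqrt{\tfrac13} & 0 & \sqrt{\tfrac23} & 0 \\[2pt]
0 & 0 & 0 & 0 & 0 & 1
\end{pmatrix}
\cdot
\begin{pmatrix}
|m_l=1,\ m_s=+\tfrac12\rangle \\[2pt]
|m_l=0,\ m_s=+\tfrac12\rangle \\[2pt]
|m_l=-1,\ m_s=+\tfrac12\rangle \\[2pt]
|m_l=1,\ m_s=-\tfrac12\rangle \\[2pt]
|m_l=0,\ m_s=-\tfrac12\rangle \\[2pt]
|m_l=-1,\ m_s=-\tfrac12\rangle
\end{pmatrix},
$$

where we list the j = 1 − 1/2 = 1/2 doublet followed by the j = 1 + 1/2 = 3/2 quartet.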
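The entries of this transformation follow directly from the weights a and b quoted above. As a hedged illustration, the sketch below rebuilds the coupled states for s = 1/2 and arbitrary l as dictionaries of coefficients, and checks that they form an orthonormal set, as the rows of a unitary matrix must. The data layout and the function names are choices made for this example.

```python
import math
from fractions import Fraction

HALF = Fraction(1, 2)

def coupled_state(l, j, mj):
    """Expansion of |j, mj> on |ml, ms> for s = 1/2, as {(ml, ms): coefficient}."""
    a = math.sqrt((l + HALF - mj) / (2 * l + 1))  # weight of |mj + 1/2, -1/2> in j = l + 1/2
    b = math.sqrt((l + HALF + mj) / (2 * l + 1))  # weight of |mj - 1/2, +1/2> in j = l + 1/2
    if j == l + HALF:
        pairs = {(mj + HALF, -HALF): a, (mj - HALF, HALF): b}
    elif j == l - HALF:
        pairs = {(mj + HALF, -HALF): b, (mj - HALF, HALF): -a}
    else:
        raise ValueError("j must equal l + 1/2 or l - 1/2")
    # Components with |ml| > l do not exist (their weight vanishes anyway).
    return {key: c for key, c in pairs.items() if abs(key[0]) <= l}

def all_coupled_states(l):
    """List of ((j, mj), expansion) for the allowed values of j and mj."""
    states = []
    for j in (l - HALF, l + HALF):
        if j < 0:
            continue  # for l = 0 only j = 1/2 exists
        mj = -j
        while mj <= j:
            states.append(((j, mj), coupled_state(l, j, mj)))
            mj += 1
    return states

def overlap(u, v):
    """Scalar product of two expansions with real coefficients."""
    return sum(c * v.get(key, 0.0) for key, c in u.items())

states = all_coupled_states(1)
for label, expansion in states:
    print(label, expansion)
```

For l = 1 the six printed expansions reproduce the rows of the 6 × 6 matrix, for instance √(2/3) and −√(1/3) for the j = 1/2, mj = 1/2 state.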
In practice, if we measure the total angular momentum $\vec J$ of a one-electron atom,
then each multiplet of states of given l yields two groups of states characterized by
two values of j: l − 1/2 and l + 1/2. If spherical symmetry is not broken, all states at
given j (but different mj ) must have the same energy. This suggests that, when we
find a mechanism that splits the degeneracy of these states (i.e. a physical reason for
associating different energy to different values of j), we will obtain an explanation for
the two-fold fine-split structure of the observed spectral lines (Fig. 1.3). In Sec. 1.1.8
we shall discuss a weak interaction that splits states of different j, thus making the
coupled basis preferable to the uncoupled one.
The present discussion is much more general than the composition of orbital and
spin angular momenta of a single electron. The rules for composing angular momenta
are purely algebraic: they apply equally well to any kind of angular momentum. We
shall rely on this formalism when dealing with many-electron atoms, especially in
Secs. 1.2.4 and 1.2.7.2.
Notation: the states of the coupled basis are commonly indicated as $^{2s+1}[l]_j$,
where [l] is the relevant capital letter S, P, D, ... for that value of l = 0, 1, 2, ... Information about n is given otherwise, e.g. with the n[l] notation, and information about
mj is lost. For example, 3d $^2D_{3/2}$ stands for any of the four n = 3, l = 2, j = 3/2, mj
kets.
1.1.7.1. Total magnetic moment in the coupled basis. In the presence of both
orbital and spin angular momenta, the effective magnetic moment of the atom is
the vector sum of the two contributions:

$$ \vec\mu = \vec\mu_l + \vec\mu_s = -\mu_B\,\frac{g_l\,\vec L + g_s\,\vec S}{\hbar} \simeq -\mu_B\,\frac{\vec L + 2\,\vec S}{\hbar}\,. \tag{49} $$

By studying the matrix elements of the $\vec\mu$ operator, we can evaluate the magnetic
properties of an atom, where both spin and orbital magnetic moments come into play.
According to the operator relation (49), these properties are only determined by
angular-momentum properties, without any reference to the radial dependence of
the electron wavefunction.
The matrix elements of the magnetic-moment operator on the two bases described
above are substantially different. They are related via the unitary transformation
(48). In the uncoupled basis |ml, ms⟩, the µz operator is diagonal, with eigenvalues

$$ \langle m_l, m_s|\,\mu_z\,|m_l, m_s\rangle = -\mu_B\,\langle m_l, m_s|\,\frac{L_z + 2S_z}{\hbar}\,|m_l, m_s\rangle = -\mu_B\,(m_l + 2m_s)\,. \tag{50} $$

The other components have simple (nondiagonal) expressions for the matrix elements, which can be obtained from the well known matrix elements of Lx/y and Sx/y
[?].
In principle one could obtain the matrix elements of $\vec\mu$ on the coupled basis |j, mj⟩
by using explicitly the Clebsch-Gordan transformation (48). However, a simpler and
more instructive method yields these matrix elements within each subspace at fixed
total angular momentum j. The key point is a symmetry argument: on average,
all vector quantities characterizing a spherically symmetric object freely rotating
in space are proportional to its total angular momentum (Wigner-Eckart theorem).
This means in particular that

$$ \langle\vec\mu\rangle \propto \langle\vec J\rangle\,, \qquad \langle\vec L\rangle \propto \langle\vec J\rangle\,, \qquad\text{and}\qquad \langle\vec S\rangle \propto \langle\vec J\rangle\,. $$
As the matrix elements of $\vec J$ are well known, the only unknown quantities are the
corresponding proportionality constants. To obtain them, note that by definition

$$
\begin{aligned}
\langle j, m_j|\,\vec\mu\,|j, m_j\rangle &= -\frac{\mu_B}{\hbar}\,\langle j, m_j|\,\vec L + 2\vec S\,|j, m_j\rangle = -\frac{\mu_B}{\hbar}\,\langle j, m_j|\,\vec J + \vec S\,|j, m_j\rangle \\
&= -(1+\gamma)\,\frac{\mu_B}{\hbar}\,\langle j, m_j|\,\vec J\,|j, m_j\rangle = -g_j\,\frac{\mu_B}{\hbar}\,\langle j, m_j|\,\vec J\,|j, m_j\rangle\,,
\end{aligned}
\tag{51}
$$

where we introduce the ratio γ between $\langle j, m_j|\vec S|j, m_j\rangle$ and $\langle j, m_j|\vec J|j, m_j\rangle$. We
determine γ by observing that the same ratio is involved when we take the scalar
product with $\vec J$:

$$
\begin{aligned}
\langle j, m_j|\,\vec S\,|j, m_j\rangle &= \gamma\,\langle j, m_j|\,\vec J\,|j, m_j\rangle\,, \\
\langle j, m_j|\,\vec J\cdot\vec S\,|j, m_j\rangle &= \gamma\,\langle j, m_j|\,\vec J\cdot\vec J\,|j, m_j\rangle\,.
\end{aligned}
\tag{52}
$$

γ can be extracted from Eq. (52) by replacing the scalar product $\vec J\cdot\vec S$ with the
expression

$$ \vec J\cdot\vec S = \frac{|\vec J|^2 + |\vec S|^2 - |\vec L|^2}{2}\,, \tag{53} $$

obtained by squaring $(\vec J - \vec S) = \vec L$. We obtain the proportionality constant

$$ \gamma = \frac{\langle j, m_j|\,\vec J\cdot\vec S\,|j, m_j\rangle}{\langle j, m_j|\,|\vec J|^2\,|j, m_j\rangle} = \frac{\langle j, m_j|\,|\vec J|^2 + |\vec S|^2 - |\vec L|^2\,|j, m_j\rangle}{2\,j(j+1)\,\hbar^2} = \frac{j(j+1) + s(s+1) - l(l+1)}{2\,j(j+1)}\,. $$

Finally, we obtain the proportionality constant introduced in Eq. (51) between the
magnetic moment and the total angular momentum

$$ g_j = 1 + \gamma = 1 + \frac{j(j+1) + s(s+1) - l(l+1)}{2\,j(j+1)}\,, \tag{54} $$

called the Landé g-factor. gj gives a measure (in units of µB) of the new effective atomic
magnetic moment due to the combined orbital and spin contributions, within a given
fixed-j multiplet. For example, the ẑ component of $\langle\vec\mu\rangle$ is

$$ \langle j, m_j|\,\mu_z\,|j, m_j\rangle = -g_j\,\mu_B\,m_j\,. \tag{55} $$

Note that not all off-diagonal matrix elements of Sz (and thus of µz) vanish on the
coupled basis |j, mj⟩. Amusingly, the values of gj are not restricted to the range
1 ≤ gj ≤ 2, contrary to what one might expect for a combination of two moments
with gl = 1 and gs = 2.
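A few explicit values make the last remark concrete. The sketch below evaluates Eq. (54) in exact rational arithmetic (thus with gs = 2 exactly, as in that equation); the function name `lande_g` is chosen for this illustration.

```python
from fractions import Fraction

def lande_g(l, s, j):
    """Lande g-factor of Eq. (54), with g_l = 1 and g_s = 2."""
    l, s, j = Fraction(l), Fraction(s), Fraction(j)
    if j == 0:
        raise ValueError("g_j is undefined for j = 0")
    gamma = (j * (j + 1) + s * (s + 1) - l * (l + 1)) / (2 * j * (j + 1))
    return 1 + gamma

half = Fraction(1, 2)
for name, l, j in [("2S1/2", 0, half), ("2P1/2", 1, half),
                   ("2P3/2", 1, 3 * half), ("2D3/2", 2, 3 * half)]:
    print(name, lande_g(l, half, j))
```

The one-electron multiplets give gj = 2 for ²S1/2, 2/3 for ²P1/2, 4/3 for ²P3/2 and 4/5 for ²D3/2: the ²P1/2 and ²D3/2 values indeed fall below 1.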
1.1.8. Fine structure. The smallness of the observed fine splittings (a fraction
of meV, Fig. 1.3) suggests that they derive from some tiny interaction, absent in
the original Hamiltonian (1). Indeed, all relativistic effects were neglected there: we
consider them in detail here.
1.1.8.1. Spin-orbit coupling. A first neglected effect is due to the magnetic field
experienced by the electron spin, due to its own orbital motion. This is a subtle relativistic effect, due to the Lorentz transformation of the nuclear electric field into the
frame of reference of the electron. Call $\vec v$ the electron velocity in the nuclear rest
frame. In the electron frame of reference, the nucleus is seen to move with velocity
$-\vec v$, and is therefore associated with a current $-Zq_e\vec v$. According to the Biot-Savart
law of electromagnetism, this current produces a magnetic field at the point (joined
by a vector $\vec r$) where the electron sits:

$$ \vec B(\vec r) = -\frac{1}{4\pi\epsilon_0 c^2}\,\frac{\vec r\times(-Zq_e\vec v)}{|\vec r|^3} = \frac{\vec E(\vec r)\times\vec v}{c^2}\,. \tag{56} $$

Equation (56) uses the fact that the electric field produced by the nucleus at the
same point is

$$ \vec E(\vec r) = \frac{Zq_e}{4\pi\epsilon_0}\,\frac{\vec r}{|\vec r|^3}\,, \tag{57} $$

and identifies this magnetic field as a relativistic effect. In Eq. (56) we recognize the
orbital angular-momentum operator:

$$ \vec B(\vec r) = \frac{Zq_e}{4\pi\epsilon_0 c^2}\,\frac{\vec r\times\vec v}{|\vec r|^3} = \frac{Zq_e}{4\pi\epsilon_0 c^2 m_e}\,\frac{\vec L}{|\vec r|^3}\,. \tag{58} $$

This magnetic field $\vec B(\vec r)$ acts point by point on the electron. In this field, the
interaction energy of the electron spin magnetic moment $\vec\mu_s$ should in principle
equal $-\vec\mu_s\cdot\vec B(\vec r)$. However, this energy must actually be reduced by a factor 1/2 (first
recognized by L.H. Thomas) due to the fact that the electron frame of reference is
accelerated [?]. The correct magnetic interaction energy operator is therefore:

$$ H_{s-o} = -\frac12\,\vec\mu_s\cdot\vec B(\vec r) = \frac12\,\frac{g_s\,\mu_B}{\hbar}\,\vec S\cdot\left(\frac{Zq_e}{4\pi\epsilon_0 c^2 m_e}\,\frac{\vec L}{|\vec r|^3}\right) = \frac{Ze^2}{2\,m_e^2 c^2\,r^3}\,\vec S\cdot\vec L\,. \tag{59} $$
This operator, named spin-orbit interaction, has nonzero off-diagonal elements
connecting states with different n, ml, ms. However, in practice states with different
n have vastly different energies, so that the tiny n-off-diagonal spin-orbit couplings
perturb the energy negligibly. These n-off-diagonal matrix elements are usually
ignored. For given l and considering only the n-diagonal matrix elements of Hs−o,
we rewrite Eq. (59) as

$$ H_{s-o} \simeq \frac{Ze^2}{2\,m_e^2c^2}\,\langle n, l|\,r^{-3}\,|n, l\rangle\ \vec S\cdot\vec L = \frac{Ze^2}{2\,m_e^2c^2}\int_0^\infty r^{-3}\,[R_{n\,l}(r)]^2\,r^2\,dr\ \ \vec S\cdot\vec L\,. \tag{60} $$

The radial integral may be computed explicitly for hydrogenic wavefunctions Rnl,
obtaining

$$ \langle n, l|\,r^{-3}\,|n, l\rangle = \left(\frac{Z}{a}\right)^3\frac{2}{n^3\,l(l+1)(2l+1)} \tag{61} $$

(of course it diverges for l = 0). The spin-orbit Hamiltonian is thus conveniently
rewritten as

$$ H_{s-o} = \xi\,\frac{\vec S\cdot\vec L}{\hbar^2}\,, \tag{62} $$
where the spin-orbit energy parameter

$$ \xi = \frac{Ze^2\hbar^2}{2\,m_e^2c^2}\left(\frac{Z}{a}\right)^3\frac{2}{n^3\,l(l+1)(2l+1)} = Z^4\,\alpha^2\,E_{\rm Ha}\left(\frac{\mu}{m_e}\right)^3\frac{1}{n^3\,l(l+1)(2l+1)}\,. \tag{63} $$

The last equality uses the expression for the mass-rescaled atomic length scale $a = a_0\,m_e/\mu$,
the definition (9) of the Hartree energy, and the energy relation of the fine-structure constant $\alpha^2 = E_{\rm Ha}/(m_e c^2)$. In this form, it is apparent that the typical energy
scale of spin-orbit ξ
• is positive, i.e. it favors antiparallel alignment of $\vec L$ and $\vec S$;
• is a leading α² = (v/c)² relativistic correction;
• is α² ≃ 5.3 × 10⁻⁵ times smaller than the typical orbital energies;
• grows as Z⁴, reflecting the increase in nuclear field intensity ∝ Z and the reduction as Z⁻¹ in average electron-nucleus distance, so that ⟨n, l|r⁻³|n, l⟩ ∝ Z³;
• decreases as n⁻³, reflecting the increase with n in the average electron-nucleus
distance, and the r⁻³ dependence of the interaction energy (59);
• decreases roughly as l⁻³ (but $\vec S\cdot\vec L \propto l$), due to the $r^l$ suppression of the radial
wavefunction close to the origin, the region where the spin-orbit interaction (59)
dominates.
The energy scale of ξ for hydrogen amounts to $\alpha^2 E_{\rm Ha}\,(\mu/m_e)^3 = 2.3178\times10^{-22}$ J
= 1.4467 meV. Note however that the lowest state for which spin-orbit applies (2p)
has n³ l(l + 1)(2l + 1) = 48, thus ξ2p = 0.030139 meV only, and then for all higher
levels ξ is even smaller.
Consider now the remaining operatorial part in Eq. (62): $\vec S \cdot \vec L$. On the decoupled basis $|l, s, m_l, m_s\rangle$, the operator $\vec S \cdot \vec L$ has plenty of nonzero off-diagonal matrix elements. To apply first-order perturbation theory, we must diagonalize $\vec S \cdot \vec L$ within each initially degenerate space at fixed l (and s). Very generally, a tiny perturbation splits degenerate states easily, after rearranging them to the basis where the perturbation itself is diagonal. We come now to employ the coupled basis introduced in Eq. (48). To convince oneself that $\vec S \cdot \vec L$ is diagonal on the re-coupled basis $|l, s, j, m_j\rangle$, take the square of $(\vec L + \vec S) = \vec J$ and invert it as follows:
(64)  $$ \vec S \cdot \vec L = \frac{|\vec J|^2 - |\vec S|^2 - |\vec L|^2}{2} . $$
(Note the similarity to Eq. (53).) All operators at the right hand side are of course diagonal on the $|l, s, j, m_j\rangle$ basis: the expression for the eigenvalue is then
(65)  $$ \langle l, s, j, m_j| \frac{\vec S \cdot \vec L}{\hbar^2} |l, s, j', m_j'\rangle = \frac{j(j+1) - s(s+1) - l(l+1)}{2}\, \delta_{j j'}\, \delta_{m_j m_j'} . $$
In summary: on the coupled basis $|l, s, j, m_j\rangle$, the spin-orbit interaction is diagonal and its eigenvalues are given by Eq. (65), multiplied by the energy prefactor ξ.
As an example, for a p level of a one-electron atom, the two different eigenvalues of $\vec S \cdot \vec L/\hbar^2$ are −1 (for J = 1/2) and +1/2 (for J = 3/2). Therefore, spin-orbit splits any p level (3 × 2 = 6 orbital×spin states) of hydrogen into two multiplets, $^2P_{1/2}$ (2 states) and $^2P_{3/2}$ (4 states), separated by an energy $\frac{3}{2}\xi$ (= 45.21 µeV for the 2p level of H).
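The eigenvalue formula (65) is easy to tabulate with exact rational arithmetic. The following sketch (function names are illustrative) lists the eigenvalues of $\vec S \cdot \vec L/\hbar^2$ for a p level with s = 1/2, together with their degeneracies.

```python
from fractions import Fraction


def coupled_j_values(l, s):
    """Allowed j values, from |l - s| to l + s in unit steps."""
    j = abs(l - s)
    values = []
    while j <= l + s:
        values.append(j)
        j += 1
    return values


def s_dot_l_eigenvalue(l, s, j):
    """Eigenvalue of S.L / hbar^2 on |l, s, j, m_j>, as in Eq. (65)."""
    return (j * (j + 1) - s * (s + 1) - l * (l + 1)) / 2


l, s = Fraction(1), Fraction(1, 2)
levels = {j: s_dot_l_eigenvalue(l, s, j) for j in coupled_j_values(l, s)}
for j, value in levels.items():
    print(f"j = {j}: S.L/hbar^2 = {value}, degeneracy 2j+1 = {2 * j + 1}")

# Degeneracy-weighted sum of the eigenvalues, i.e. the trace of S.L/hbar^2.
trace = sum((2 * j + 1) * value for j, value in levels.items())
splitting = levels[Fraction(3, 2)] - levels[Fraction(1, 2)]
print("trace =", trace, " splitting in units of xi =", splitting)
```

The vanishing trace shows that the spin-orbit interaction shifts the two multiplets without moving their degeneracy-weighted center of gravity.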
1.1.8.2. The relativistic kinetic correction. A second relativistic correction of the
same order (v/c)2 must be included, beside spin-orbit. This energy contribution
accounts for the leading correction to the kinetic energy expression p2 /(2µ):
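(66)  $$ T_r = \sqrt{\mu^2 c^4 + p^2 c^2} - \mu c^2 = \mu c^2 \left[ 1 + \frac{1}{2} \frac{p^2}{\mu^2 c^2} - \frac{1}{8} \frac{p^4}{\mu^4 c^4} + \dots - 1 \right] = \frac{p^2}{2\mu} - \frac{p^4}{8 \mu^3 c^2} + \dots . $$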
As we did for $H_{s-o}$, to treat the weak perturbation $-p^4/(8\mu^3 c^2)$ at first order, we just need the diagonal matrix elements of this operator. Although $p^4$ looks like a formidable differential operator, the trick $p^4 = (p^2)^2 = [2\mu(H_{tot} - V_{ne})]^2$ allows us to rewrite the diagonal matrix elements of $p^4$ in terms of simple radial integrals of $r^{-1}$ and $r^{-2}$. The final result is
(67)  $$ \langle n, l| -\frac{p^4}{8 \mu^3 c^2} |n, l\rangle = -\frac{Z^4 \alpha^2}{n^3}\, E_{Ha} \left(\frac{\mu}{m_e}\right)^3 \left[ \frac{1}{2l+1} - \frac{3}{8n} \right] , $$
where we omit the labels $m_l/m_s$ or $j/m_j$, which are irrelevant for such radial integrals.
By combining the spin-orbit and kinetic correction
(68)  $$ H_{rel} = H_{s-o} - \frac{p^4}{8 \mu^3 c^2} , $$
we obtain the diagonal matrix elements of the total relativistic correction of order $\alpha^2$:
(69)  $$ \langle n, l, j|H_{rel}|n, l, j\rangle = \frac{Z^4 \alpha^2}{n^3}\, E_{Ha} \left(\frac{\mu}{m_e}\right)^3 \left[ \frac{j(j+1) - s(s+1) - l(l+1)}{2 l(l+1)(2l+1)} - \frac{1}{2l+1} + \frac{3}{8n} \right] = -\frac{Z^4 \alpha^2}{n^3}\, E_{Ha} \left(\frac{\mu}{m_e}\right)^3 \left[ \frac{1}{2j+1} - \frac{3}{8n} \right] , $$
where the last simplification is based on having s = 1/2, thus l = j ± 1/2. This last expression applies even for s states (while all previous steps did not), for which separate analysis is necessary.
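The simplification leading to the last form of Eq. (69) can be checked with exact fractions. The sketch below (illustrative function names, s = 1/2 and l > 0 assumed) compares the l-dependent bracket with the j-only bracket for both j = l ± 1/2 and a range of n.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def bracket_l_form(n, l, j):
    """Square bracket in the first form of Eq. (69), for spin s = 1/2 and l > 0."""
    s = HALF
    spin_orbit = (j * (j + 1) - s * (s + 1) - l * (l + 1)) / (2 * l * (l + 1) * (2 * l + 1))
    kinetic = -Fraction(1, 2 * l + 1) + Fraction(3, 8 * n)
    return spin_orbit + kinetic


def bracket_j_form(n, j):
    """Minus the square bracket in the second form of Eq. (69)."""
    return -(1 / (2 * j + 1) - Fraction(3, 8 * n))


mismatches = []
for n in range(2, 7):
    for l in range(1, n):
        for j in (l - HALF, l + HALF):
            if bracket_l_form(n, l, j) != bracket_j_form(n, j):
                mismatches.append((n, l, j))
print("mismatches:", mismatches)
```

An empty list confirms that, for every tested (n, l, j), the dependence on l drops out once j is fixed.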
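Expression (69) can be combined with the nonrelativistic eigenvalues (30) to obtain the following equation for the eigenvalues corrected to order $\alpha^2$:
(70)  $$ \langle n, l, j|H_{tot} + H_{rel}|n, l, j\rangle = -\frac{Z^2 E_{Ha}}{2}\, \frac{\mu}{m_e}\, \frac{1}{n^2} \left[ 1 + (Z\alpha)^2 \left(\frac{\mu}{m_e}\right)^2 \frac{1}{n} \left( \frac{2}{2j+1} - \frac{3}{4n} \right) \right] . $$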
This remarkable relation yields a quantitative prediction for the spectrum that can be directly compared to experiment: all n-levels should be split according to the different values of j, but not for different values of l giving the same j. This same l degeneracy is obtained by solving the more refined theory based on Dirac’s equation, which is exact to all orders in α, not just $\alpha^2$.
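As a numerical illustration, the relativistic shifts of the n = 2 levels of hydrogen can be evaluated from the last form of Eq. (69). This is a sketch with hard-coded CODATA-style constants (an assumption, not values from these notes) and an illustrative function name.

```python
ALPHA = 7.2973525693e-3        # fine-structure constant (assumed CODATA value)
E_HARTREE_EV = 27.211386246    # Hartree energy in eV
ME_OVER_MP = 5.44617021e-4     # electron-to-proton mass ratio


def relativistic_shift_eV(n, j, Z=1, me_over_mn=ME_OVER_MP):
    """Diagonal element of H_rel, last form of Eq. (69), in eV (j as a float)."""
    mu_ratio = 1.0 / (1.0 + me_over_mn)
    scale = Z**4 * ALPHA**2 * E_HARTREE_EV * mu_ratio**3 / n**3
    return -scale * (1.0 / (2.0 * j + 1.0) - 3.0 / (8.0 * n))


shift_j12 = relativistic_shift_eV(2, 0.5)   # shared by 2S1/2 and 2P1/2
shift_j32 = relativistic_shift_eV(2, 1.5)   # 2P3/2
split_ueV = (shift_j32 - shift_j12) * 1e6
print(f"n=2, j=1/2 shift: {shift_j12 * 1e6:.2f} micro-eV")
print(f"n=2, j=3/2 shift: {shift_j32 * 1e6:.2f} micro-eV")
print(f"j=3/2 - j=1/2 splitting: {split_ueV:.2f} micro-eV")
```

Both shifts are negative, and their difference reproduces the $\frac{3}{2}\xi_{2p}$ spin-orbit splitting of the 2p level, since the kinetic correction is the same for both j values at fixed l.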
1.1.8.3. The Lamb shift. As the extra l-degeneracy is rather surprising, the occurrence of a splitting between levels with same j and different l was closely investigated, both theoretically and experimentally. Indeed, quantum fluctuations of the electromagnetic field and finite nuclear size should remove this degeneracy (Lamb shift). Figure 1.12a reports the expected spectral fine structure of the Balmer Hα line, including the relativistic corrections and the Lamb shift.
[Figure 1.12, panel (d): sketch of the double-resonance setup, with a monochromatic photon source, a beam splitter, a switch on the pump beam, the probe beam, the sample, and a photon detector.]
Figure 1.12. (a) Level scheme for the fine structure of the Balmer Hα line, including the relativistic corrections and the Lamb shift. (b) First direct spectroscopic observation of the Lamb shift in the Hα line, obtained [?] by the double-resonance experiment sketched in panel (d). (c) Theoretical lines with relative intensities. The three peaks at the left involve the n = 2 $^2P_{3/2}$ state, and correspond to the three transitions marked at the left side of panel (a). The four lines at the right involve the n = 2, j = 1/2 states, illustrated by the four arrows at the right side of panel (a): in the absence of Lamb shift these lines would be two instead of four. Splittings within these four lines are associated to both spin orbit in the n = 3 multiplet and the Lamb shift between n = 2 $^2S_{1/2}$ and $^2P_{1/2}$, which is then measured of the order of 0.03 cm$^{-1}$, about 4 µeV. The average separation between the two groups of lines measures the spin-orbit splitting in the 2p, and it agrees with the theoretical evaluation ≃ 45 µeV. (Note: 0.1 cm$^{-1}$ ≃ 12 µeV.)
Due to Doppler broadening (random thermal atomic motion, see Sect. 0.2), it is
exceedingly difficult to observe these tiny splittings. To circumvent this broadening
and acquire the high-resolution spectrum of Fig. 1.12b, the authors of Ref. [?] used
a trick based on double resonance. A powerful tunable monochromatic light beam
is split into a strong interruptible “pump” beam plus a second weak “probe” beam.
As the light frequency matches a resonant transition, absorption takes place and the
probe beam is attenuated. However, if the pump beam saturates the transition in
the sample, then absorption is strongly reduced. The spectrum records the probe
absorption difference between time intervals when the pump beam is on and when it
is off. Broadening is then removed since the beams are almost antiparallel, so that
the same atoms are selected by the pump/probe frequency match, namely those with
practically null instantaneous translational velocity in the beam direction, thus null
Doppler shift.
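To see why Doppler broadening hides these splittings, one can compare them with a rough thermal linewidth. The sketch below uses the standard Gaussian Doppler width for a gas in thermal equilibrium, FWHM = $E \sqrt{8 \ln 2\, k_B T/(M c^2)}$; this textbook expression, the assumed temperature of 300 K, and the hard-coded constants are assumptions of this example, not values taken from Sect. 0.2 or from Ref. [?].

```python
import math

K_B_EV_PER_K = 8.617333262e-5   # Boltzmann constant in eV/K
M_H_C2_EV = 938.783e6           # approximate rest energy of a hydrogen atom in eV
RYDBERG_EV = 13.605693          # E_Ha / 2, reduced-mass correction ignored


def doppler_fwhm_eV(photon_energy_eV, temperature_K, mass_c2_eV=M_H_C2_EV):
    """Gaussian Doppler full width at half maximum, in eV."""
    return photon_energy_eV * math.sqrt(
        8.0 * math.log(2.0) * K_B_EV_PER_K * temperature_K / mass_c2_eV
    )


E_H_ALPHA_EV = RYDBERG_EV * (1.0 / 4.0 - 1.0 / 9.0)   # Balmer H-alpha, n = 3 -> 2
T_ASSUMED_K = 300.0                                   # assumed gas temperature
fwhm_ueV = doppler_fwhm_eV(E_H_ALPHA_EV, T_ASSUMED_K) * 1e6
print(f"H-alpha photon energy: {E_H_ALPHA_EV:.4f} eV")
print(f"Doppler FWHM at {T_ASSUMED_K:.0f} K: {fwhm_ueV:.1f} micro-eV")
```

Under these assumptions the width is a few tens of µeV: of the same order as the 45 µeV spin-orbit splitting of the 2p level and several times larger than the 4 µeV Lamb shift, which is why a Doppler-free technique is needed.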
1.1.9. Nuclear Spin and hyperfine structure. Like the electron, many nuclei can carry a spin $\vec I$. For example, the proton in an H atom is a particle of spin I = 1/2. As usual for a quantum system, the nuclear magnetic moment $\vec\mu_N$ is proportional to its angular momentum:
$$ \vec\mu_N = \mu_n\, \frac{\vec I}{\hbar} . $$
Here, the nuclear magneton $\mu_n$ is defined by
(71)  $$ \mu_n = g_n\, \frac{q_e \hbar}{2 M_n} = g_n\, \mu_B\, \frac{m_e}{M_n} , $$
where the nuclear g-factor gn is a number of order unity whose value depends on
the inner nuclear structure. For example, gn = 5.58569 for the proton.
The nuclear spin produces a magnetic field which adds to the spin-orbit one. This
field is extremely weak, since it is suppressed by the ratio me /Mn with respect to
typical electronic fields. Because of this field, the nuclear spin interacts with the
electron. The interaction is extremely weak for l > 0 orbitals, as the electron does
not get very close to the nucleus, due to the rl term in Eq. (33). For s orbitals, the
only electronic magnetic moment affected by the nuclear field is that associated to spin $\vec S$. Like for spin-orbit, the interaction Hamiltonian is proportional to the only scalar term one could build with the two vector quantities involved:
$$ H_{\vec S \vec I} = C\, \vec\mu_N \cdot \vec\mu_e = C\, g_n g_s\, \mu_B^2\, \frac{m_e}{M_n}\, \frac{\vec I \cdot \vec S}{\hbar^2} . $$
Figure 1.13. All-sky Doppler map of the 21 cm emission by $^1$H in the Milky Way.
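The prefactor C is the relevant radial matrix element, which equals
$$ C = \frac{1}{4\pi\epsilon_0 c^2}\, \frac{2}{3}\, R_{n0}(0)^2 . $$
Using $R_{n0}(0) = 2[Z/(an)]^{3/2}$, the characteristic coupling energy is
(72)  $$ \xi_N = C g_n g_s\, \mu_B^2\, \frac{m_e}{M_n} = g_n g_s\, \frac{2}{3}\, \frac{Z^3}{n^3}\, \frac{E_{Ha}^2}{m_e c^2}\, \frac{m_e}{M_n} = g_n\, \frac{4}{3}\, \frac{Z^3}{n^3}\, \alpha^2 E_{Ha}\, \frac{m_e}{M_n} = \frac{Z^3 g_n}{n^3}\, \frac{1.06\ \mu{\rm eV\ a.m.u.}}{M_n} , $$
where we used the usual relation $\alpha^2 = E_{Ha}/(m_e c^2)$, and the nuclear mass is measured in atomic mass units a.m.u. For $^1$H, Eq. (72) yields $\xi_N$ = 5.88 µeV.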
The electron and nuclear spins couple to a grand total angular momentum $\vec F = \vec I + \vec S$, with the same rules derived for $\vec L$ and $\vec S$ in Sec. 1.1.7. With the customary trick we obtain
$$ \vec I \cdot \vec S = \frac{|\vec F|^2 - |\vec I|^2 - |\vec S|^2}{2} , $$
so that the expectation value of $\vec I \cdot \vec S/\hbar^2$ equals $\frac{1}{2}[f(f+1) - i(i+1) - s(s+1)]$ on the coupled basis, where $|\vec F|^2$ and $F_z$ are diagonal, rather than $I_z$ and $S_z$. As s = 1/2, two coupled states f = i ± 1/2 occur, with an energy separation of $\xi_N (i + \frac{1}{2})$.
For $^1$H, the proton has spin i = 1/2, so that $\vec I \cdot \vec S/\hbar^2 = -\frac{3}{4}$ and $\frac{1}{4}$ for f = 0 and 1 respectively. The separation between these two hyperfine-split states equals therefore $\xi_N$ = 5.88 µeV: it corresponds to a wavelength $hc\, \xi_N^{-1}$ ≃ 21 cm, and a frequency of $\xi_N h^{-1}$ ≃ 1.42 GHz. This transition, in the radio-frequency range, at precisely 1420405751.80 Hz, was discovered in 1951 in the spectrum of galactic atomic hydrogen, and has now become a standard tool to investigate the galactic distribution of atomic $^1$H (see Fig. 1.13).
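The hyperfine numbers for hydrogen follow from Eq. (72) in a few lines. This sketch uses hard-coded CODATA-style constants (an assumption), the proton g-factor quoted above, $g_s = 2$, and, like Eq. (72), the approximation a ≃ a0; the function name is illustrative.

```python
ALPHA = 7.2973525693e-3        # fine-structure constant
E_HARTREE_EV = 27.211386246    # Hartree energy in eV
ME_OVER_MP = 5.44617021e-4     # electron-to-proton mass ratio
H_EV_S = 4.135667696e-15       # Planck constant in eV s
HC_EV_M = 1.239841984e-6       # h c in eV m
G_PROTON = 5.58569             # proton g-factor quoted in the text


def xi_hyperfine_eV(g_n, me_over_mn, Z=1, n=1, g_s=2.0):
    """xi_N from Eq. (72): g_n g_s (2/3) (Z/n)^3 alpha^2 E_Ha m_e/M_n, in eV."""
    return g_n * g_s * (2.0 / 3.0) * (Z / n) ** 3 * ALPHA**2 * E_HARTREE_EV * me_over_mn


xi_n = xi_hyperfine_eV(G_PROTON, ME_OVER_MP)
i_proton = 0.5
separation = xi_n * (i_proton + 0.5)      # energy gap between f = i + 1/2 and f = i - 1/2
wavelength_m = HC_EV_M / separation
frequency_hz = separation / H_EV_S
print(f"xi_N       = {xi_n * 1e6:.2f} micro-eV")
print(f"wavelength = {wavelength_m * 100:.1f} cm")
print(f"frequency  = {frequency_hz / 1e9:.3f} GHz")
```

The small difference from the measured 1420.4 MHz is expected at this level of approximation, which neglects the reduced-mass correction and the deviation of $g_s$ from 2.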
1.1.10. Electronic transitions, selection rules. Not all conceivable transitions are equally easy to observe. Basically, it is found that some transitions proceed
at a large rate, while others occur immensely more slowly. The reason for this can
be found in quantum mechanics. The probability per unit time that an undisturbed
atom decays radiatively from an initial state $|i\rangle$ to a final state $|f\rangle$ is given by
(73)  $$ \gamma_{if} = \frac{1}{3\pi\epsilon_0 \hbar^4 c^3}\, E_{if}^3\, |\langle f|\vec d|i\rangle|^2 , $$
where $E_{if} = E_i - E_f$, and $\vec d$ is the operator describing coupling to the radiation field. In the approximation that the radiating object is much smaller than the radiation wavelength (see Fig. 0.4), the operator $\vec d = -q_e \vec r$ is the electric dipole operator (dipole approximation). All transitions for which the matrix element $\langle f|\vec d|i\rangle$ vanishes are “forbidden”: this means that they occur at very low rates, associated to higher multipoles in the field expansion.
The matrix elements of the dipole operator of the one-electron wavefunction are:
(74)  $$ \langle n_f, l_f, m_{l\,f}|\vec d|n_i, l_i, m_{l\,i}\rangle = \int \psi^*_{n_f l_f m_{l\,f}}(\vec r)\; \vec d\; \psi_{n_i l_i m_{l\,i}}(\vec r)\; d^3r . $$
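This integration is conveniently carried out in polar coordinates: express the dipole operator $\vec d = -q_e \vec r = -q_e r\, (\sin\theta \cos\phi, \sin\theta \sin\phi, \cos\theta)$, and observe that
$$ r_x = r \sqrt{\frac{2\pi}{3}}\, [Y_{1\,-1}(\theta, \phi) - Y_{1\,1}(\theta, \phi)] $$
$$ r_y = r\, i \sqrt{\frac{2\pi}{3}}\, [Y_{1\,-1}(\theta, \phi) + Y_{1\,1}(\theta, \phi)] $$
$$ r_z = r \sqrt{\frac{4\pi}{3}}\, Y_{1\,0}(\theta, \phi) $$
(this is the inverse of Eq. (38)), so that the dipole matrix element is proportional to
$$ |\langle \vec r \rangle|^2 = |\langle f|\vec r|i\rangle|^2 = \langle \vec r \rangle^* \cdot \langle \vec r \rangle = \langle r_x \rangle^* \langle r_x \rangle + \langle r_y \rangle^* \langle r_y \rangle + \langle r_z \rangle^* \langle r_z \rangle $$
$$ = |\langle r \rangle|^2\, \frac{2\pi}{3} \left[ \langle Y_{1\,-1} - Y_{1\,1} \rangle^* \langle Y_{1\,-1} - Y_{1\,1} \rangle - i^2 \langle Y_{1\,-1} + Y_{1\,1} \rangle^* \langle Y_{1\,-1} + Y_{1\,1} \rangle + 2 \langle Y_{1\,0} \rangle^* \langle Y_{1\,0} \rangle \right] $$
(75)  $$ = |\langle r \rangle|^2\, \frac{4\pi}{3} \left[ |\langle Y_{1\,-1} \rangle|^2 + |\langle Y_{1\,1} \rangle|^2 + |\langle Y_{1\,0} \rangle|^2 \right] , $$
where we have shortened all bra-ket indications, omitting initial and final states indications and angular dependency of the spherical harmonics.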
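The expressions of $r_x$, $r_y$, $r_z$ in terms of the l = 1 spherical harmonics can be verified numerically at random angles. The sketch below assumes the usual Condon-Shortley phase convention for $Y_{1m}$ (if Eq. (38) of these notes uses a different convention, the signs would need adapting); function names are illustrative.

```python
import cmath
import math
import random


def y1(m, theta, phi):
    """Spherical harmonic Y_{1 m}, Condon-Shortley phase convention (assumed)."""
    if m == 0:
        return math.sqrt(3.0 / (4.0 * math.pi)) * math.cos(theta) + 0j
    if m in (1, -1):
        return -m * math.sqrt(3.0 / (8.0 * math.pi)) * math.sin(theta) * cmath.exp(1j * m * phi)
    raise ValueError("m must be -1, 0 or 1")


def unit_vector_from_y1(theta, phi):
    """Components of r/|r| rebuilt from the Y_{1 m} combinations quoted in the text."""
    c = math.sqrt(2.0 * math.pi / 3.0)
    x = c * (y1(-1, theta, phi) - y1(1, theta, phi))
    y = 1j * c * (y1(-1, theta, phi) + y1(1, theta, phi))
    z = math.sqrt(4.0 * math.pi / 3.0) * y1(0, theta, phi)
    return x, y, z


random.seed(0)
max_error = 0.0
for _ in range(1000):
    theta = random.uniform(0.0, math.pi)
    phi = random.uniform(0.0, 2.0 * math.pi)
    expected = (
        math.sin(theta) * math.cos(phi),
        math.sin(theta) * math.sin(phi),
        math.cos(theta),
    )
    for got, want in zip(unit_vector_from_y1(theta, phi), expected):
        max_error = max(max_error, abs(got - want))
print("largest deviation:", max_error)
```

The deviation stays at the level of floating-point rounding, confirming both the prefactors and the relative signs.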
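The squared dipole matrix element in the transition rate is then written in polar coordinates as
(76)  $$ \left| \langle n_f, l_f, m_{l\,f}|\vec d|n_i, l_i, m_{l\,i}\rangle \right|^2 = q_e^2 \left| \int_0^\infty R_{n_f l_f}(r)\; r\; R_{n_i l_i}(r)\; r^2 dr \right|^2 \times \frac{4\pi}{3} \sum_m \left| \int_0^\pi \sin\theta\, d\theta \int_0^{2\pi} d\phi\; Y^*_{l_f m_f}(\theta, \phi)\, Y_{1\,m}(\theta, \phi)\, Y_{l_i m_i}(\theta, \phi) \right|^2 . $$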
No radial integral vanishes: multiplication by r turns no $R_{n_i l_i}(r)$ into exactly another $R_{nl}(r)$, thus $r R_{n_i l_i}(r)$ has nonzero expansion coefficients on all other radial basis wavefunctions. The radial integral can be computed for any initial $(n_i, l_i)$ and final $(n_f, l_f)$: it diminishes rapidly when $n_i$ and $n_f$ differ by large amounts, because $R_{n_i l_i}(r)$ and $R_{n_f l_f}(r)$ take nonnegligible values in remote places, so that their product is everywhere small. For the angular part, much sharper statements can be made: the integration in the lower row of Eq. (76) represents the angular overlap of a state with $l = l_f$ to the product $Y_{1\,m} Y_{l_i m_i}$. This product can be decomposed as two coupled angular momenta: according to the rule (47), the composition of an angular momentum l = 1 with an angular momentum $l = l_i$ yields three allowed values of the coupled angular momentum in $Y_{1\,m} Y_{l_i m_i}$.³ Final states characterized by $l = l_f$ not satisfying
(77)  $$ l_f = |l_i - 1|,\ l_i,\ l_i + 1 $$
are guaranteed to make the angular integral vanish. Moreover, note that also when $l_f = l_i$ the angular integral vanishes! The reason is that the parity of $\vec r$ (thus of $Y_{1\,m}(\theta, \phi)$) is negative, while the parities of $Y_{l_i m_{l\,i}}(\theta, \phi)$ and of $Y^*_{l_i m_{l\,f}}(\theta, \phi)$ are both $(-1)^{l_i}$. The triple product $Y^*_{l_f m_f} Y_{1\,m} Y_{l_i m_i}$ is therefore of negative parity, thus integration over all space averages to zero.
In summary, in the dipole approximation, nonzero matrix elements can occur only for transitions involving states with l changing by exactly unity. The allowed transitions have therefore
(78)  $$ \Delta l = l_f - l_i = \pm 1 . $$
This equality is called a dipole selection rule, the one regarding l.
The fact that the dipole operator is associated to an orbital l = 1 implies also that its component m = −1, 0, 1. Accordingly, the only value of $m_{l\,f}$ for which the angular integral is nonzero is obtained by adding m to $m_{l\,i}$. From this observation we derive the $m_l$-selection rule:
(79)  $$ \Delta m_l = m_{l\,f} - m_{l\,i} = 0, \pm 1 . $$
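The two orbital selection rules (78) and (79) can be applied mechanically to list the candidate transitions between two l-multiplets. The sketch below only encodes these rules as a filter (function names are illustrative); it does not evaluate the angular integrals themselves.

```python
def dipole_allowed(l_i, m_i, l_f, m_f):
    """True if (l_i, m_i) -> (l_f, m_f) satisfies the rules (78) and (79)."""
    for l, m in ((l_i, m_i), (l_f, m_f)):
        if l < 0 or abs(m) > l:
            raise ValueError("need l >= 0 and |m| <= l")
    return abs(l_f - l_i) == 1 and abs(m_f - m_i) <= 1


def allowed_transitions(l_i, l_f):
    """All (m_i, m_f) pairs passing the orbital dipole selection rules."""
    return [
        (m_i, m_f)
        for m_i in range(-l_i, l_i + 1)
        for m_f in range(-l_f, l_f + 1)
        if dipole_allowed(l_i, m_i, l_f, m_f)
    ]


p_to_s = allowed_transitions(1, 0)
d_to_p = allowed_transitions(2, 1)
d_to_s = allowed_transitions(2, 0)
p_to_p = allowed_transitions(1, 1)
print("p -> s:", p_to_s)
print("d -> p:", len(d_to_p), "pairs")
print("d -> s:", d_to_s, " p -> p:", p_to_p)
```

For example, every $m_l$ component of a p level can decay to an s level, while d → s and p → p are excluded by the Δl rule.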
Until this point, spin was ignored, as the dipole operator is purely spatial: it does
nothing to spin. Indeed on the uncoupled basis |n, l, ml , ms i, d~ can (and does) only
change the spatial degrees of freedom, but it acts as the identity for spin. As a
result, we have
(80)  $$ \Delta s = s_f - s_i = 0 $$
(81)  $$ \Delta m_s = m_{s\,f} - m_{s\,i} = 0 . $$
The first spin selection rule (80) is trivial for a one-electron atom, as s ≡ 1/2 anyway, but it will become relevant for many-electron atoms.
³ Indeed, it can be shown that the angular integral within the absolute value is proportional to the Clebsch-Gordan coefficient $C^{l_f\, m_f}_{l_i\, m_i\; 1\, m}$.
By analyzing the composition of the coupled states $|n, l, j, m_j\rangle$ in terms of the $|n, l, m_l, m_s\rangle$ states, we find that the following selection rules hold for the coupled states:
(82)  $$ \Delta j = j_f - j_i = 0, \pm 1 \quad ({\rm not}\ 0 \to 0) $$
(83)  $$ \Delta m_j = m_{j\,f} - m_{j\,i} = 0, \pm 1 . $$
After establishing what transitions are allowed in the dipole approximation, we had better estimate the rate of these transitions according to (73). Remembering that $\omega_{if} = \hbar^{-1} E_{if} \approx Z^2 \hbar^{-1} E_{Ha}$, and observing that the order of magnitude of $|\langle f|\vec d|i\rangle| \approx q_e a_0/Z$, we can estimate
(84)  $$ \gamma_{if} = \frac{E_{if}^3\, |\langle f|\vec d|i\rangle|^2}{3\pi\epsilon_0 \hbar^4 c^3} \simeq \frac{Z^4}{\epsilon_0 \hbar^3 c^3}\, E_{Ha}^2\, \omega_{if}\, q_e^2\, \frac{a_0^2}{Z^2} \simeq \frac{e^2 Z^2}{(\hbar c)^3}\, e^4\, \omega_{if} = Z^2 \left( \frac{e^2}{\hbar c} \right)^3 \omega_{if} = Z^2 \alpha^3 \omega_{if} , $$
where we have used $E_{Ha}\, a_0 = e^2$ and $\alpha = e^2/(\hbar c)$. As $\omega_{if} \simeq Z^2\, 10^{16}$ Hz, and $\alpha^3 \simeq 10^{-7}$, we expect transition rates of the order $\gamma_{if} \simeq Z^4\, 10^9$ s$^{-1}$, i.e. typical decay times $\gamma_{if}^{-1} \simeq Z^{-4}$ ns. This sets the order of magnitude of the speed of atomic transitions. The strong ($E_{if}^3$) dependence of $\gamma_{if}$ makes the transition time shorter for more energetic transitions, and longer for low-energy transitions. Much slower transition rates occur for dipole-forbidden transitions, associated to weaker higher-order couplings to the electromagnetic field (magnetic dipole, electric quadrupole...). Other, nonradiative transitions may also occur due to collisions, e.g. with fast electrons, with other atoms, with the vessel wall. These mechanisms become dominant for the decay of long-lived metastable states, which lack fast dipole-allowed decay transitions.
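The order-of-magnitude estimate (84) can be compared with a direct evaluation of Eq. (73) for the 2p → 1s transition of hydrogen. The sketch below makes several assumptions beyond this section: hard-coded CODATA-style constants, an infinite nuclear mass, and the standard hydrogenic radial functions $R_{10} = 2 e^{-r}$ and $R_{21} = r e^{-r/2}/(2\sqrt{6})$ (in units of $a_0$), whose dipole radial integral is $4!/[(3/2)^5 \sqrt{6}]$. In atomic units Eq. (73) reads $\gamma = \frac{4}{3} \alpha^3 (E_{if}/E_{Ha})^3\, |\langle \vec r \rangle/a_0|^2\, E_{Ha}/\hbar$.

```python
import math

ALPHA = 7.2973525693e-3        # fine-structure constant
E_HARTREE_EV = 27.211386246    # Hartree energy in eV
HBAR_EV_S = 6.582119569e-16    # reduced Planck constant in eV s


def rate_estimate(Z=1):
    """Order-of-magnitude estimate of Eq. (84): Z^2 alpha^3 omega, omega = Z^2 E_Ha / hbar."""
    omega = Z**2 * E_HARTREE_EV / HBAR_EV_S
    return Z**2 * ALPHA**3 * omega


def rate_2p_to_1s(Z=1):
    """Eq. (73) for 2p -> 1s of a one-electron ion (infinite nuclear mass assumed)."""
    energy = Z**2 * 0.5 * (1.0 - 0.25)                          # E_if / E_Ha = 3 Z^2 / 8
    radial = math.factorial(4) / (1.5**5 * math.sqrt(6.0)) / Z  # radial integral in units of a0
    r_squared = radial**2 / 3.0                                 # angular factor 1/3 for l = 1 -> l = 0
    return (4.0 / 3.0) * ALPHA**3 * energy**3 * r_squared * E_HARTREE_EV / HBAR_EV_S


estimate = rate_estimate()
rate = rate_2p_to_1s()
print(f"estimate Z^2 alpha^3 omega: {estimate:.2e} 1/s")
print(f"Eq. (73) for 2p -> 1s:      {rate:.3e} 1/s  (lifetime {1e9 / rate:.2f} ns)")
```

With unrounded inputs the crude estimate overshoots the actual 2p → 1s rate by more than an order of magnitude, mainly because the true transition energy is 3/8 of $E_{Ha}$ and enters cubed; the resulting lifetime of about 1.6 ns agrees with the nanosecond scale quoted in the text.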
1.1.11. Spectra in a magnetic field. We conclude this Section with a brief
analysis of atomic spectra in the condition where a maximum of information can be
extracted from them, i.e. when the atomic sample is immersed in a uniform magnetic
field. In these conditions, the atomic magnetic moment couples to the external field,
so that different values of the component of the magnetic moment along the field
direction acquire different energies $-\vec B \cdot \vec\mu$, Eq. (43), detectable by spectroscopic investigation. The magnetic coupling with the external field can be rewritten as
(85)  $$ H_{magn} = -\vec B \cdot \vec\mu = \mu_B\, \vec B \cdot \frac{\vec L + 2\vec S}{\hbar} = \mu_B B_z\, \frac{L_z + 2 S_z}{\hbar} $$
(see Eq. (49)). This operator is diagonal on the uncoupled $|l, s, m_l, m_s\rangle$ basis. Remember however (Sec. 1.1.8) that $H_{s-o}$ is not diagonal on that basis, but rather on the coupled basis $|l, s, j, m_j\rangle$. In fact, $H_{s-o}$ and $H_{magn}$ cannot be diagonalized simultaneously, as they do not commute. To obtain the spectrum and eigenvectors one must then diagonalize the total operator $H_{magn} + H_{s-o}$, within each $(2l+1) \cdot (2s+1)$-dimensional subspace at fixed l and s. This diagonalization is not especially complicated, but it is perhaps more instructive to understand in detail the two limiting cases where either characteristic energy scale $\mu_B |B|$ or ξ dominates.
The simplest limit (µB |B| ≫ ξ), of magnetic field energy µB |B| much larger than
the spin-orbit energy ξ, occurs for different values of the field, depending on the
atom considered. For hydrogen 2p, the strong-field limit is reached for |B| ≫ 0.5 T,
while for He+ 2p it takes a magnetic field as large as |B| ≫ 8 T, due to the Z 4 dependence of the spin-orbit energy, Eq. (63). In this limit of very strong field,
the coupled basis is not especially good, as full rotational invariance of the atom
is badly broken. The uncoupled basis |l, s, ml , ms i works fine instead: spin and
orbital moments align relative to the field with an energy gain or cost depending on
their different g-factors. On this basis, the large interaction Hmagn is diagonal: if we
neglect the smaller Hs−o , the magnetic energy levels are simply
(86)  $$ E_{magn}(m_l, m_s) \simeq \langle m_l, m_s|H_{magn}|m_l, m_s\rangle = \mu_B B_z\, (m_l + 2 m_s) $$
(Paschen-Back limit). $H_{s-o}$ corrections may be added perturbatively.
In the (more common) opposite weak-field limit (|B| ≪ ξ/µB ), spherical symmetry
is only weakly perturbed. The states |l, s, j, mj i in the coupled basis are approximate
eigenstates of Hmagn + Hs−o , and Hmagn acts as a weak perturbation. To first order
in µB |B|/ξ, the energy correction due to Hmagn is provided by Eq. (51):
(87)  $$ E_{magn}(m_j) \simeq \langle j, m_j|H_{magn}|j, m_j\rangle = \langle j, m_j| -\mu_z B_z |j, m_j\rangle = g_j\, \mu_B B_z\, m_j $$
(Zeeman limit), where $g_j$ is the Landé g-factor, obtained in Eq. (54).
Both bases fail and neither of these approximate expressions is correct in the
intermediate-field regime µB |B| ≃ ξ. Figure 1.14 shows the exact pattern of splittings of the six 2 P states under the action of a magnetic field, changing from the
Zeeman (weak field) to the Paschen-Back (strong-field) limit. The initial slopes of
the curves at B → 0, divided by the relevant mj , measure the values of the Landé
gj .
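For a p level (l = 1, s = 1/2) the exact diagonalization reduces to two single states ($m_j = \pm 3/2$) and two 2×2 blocks ($m_j = \pm 1/2$), so it can be written in closed form. The sketch below works in units of ξ, with $b = \mu_B B/\xi$; the block matrices follow from $\vec S \cdot \vec L = L_z S_z + \frac{1}{2}(L_+ S_- + L_- S_+)$ and Eq. (85), and the function name and dictionary layout are illustrative.

```python
import math


def p_multiplet_energies(b):
    """Energies (units of xi) of the six l = 1, s = 1/2 states, for b = mu_B B / xi.

    Keys are (m_j, branch); "upper"/"lower" label the two eigenvalues of each 2x2 block.
    """
    energies = {}
    # m_j = +-3/2: single states |m_l = +-1, m_s = +-1/2>, with S.L/hbar^2 = 1/2.
    for sign in (+1, -1):
        energies[(1.5 * sign, "upper")] = 0.5 + 2.0 * b * sign
    # m_j = +-1/2: 2x2 blocks on {|m_l = +-1, m_s = -+1/2>, |m_l = 0, m_s = +-1/2>},
    # with diagonal entries (-1/2, +-b) and off-diagonal entry sqrt(2)/2.
    for sign in (+1, -1):
        mean = 0.5 * (-0.5 + sign * b)
        half_gap = 0.5 * math.sqrt((sign * b + 0.5) ** 2 + 2.0)
        energies[(0.5 * sign, "upper")] = mean + half_gap
        energies[(0.5 * sign, "lower")] = mean - half_gap
    return energies


def lande_g_from_slope(m_j, branch, step=1e-6):
    """Initial slope of a level at B -> 0, divided by m_j."""
    e0 = p_multiplet_energies(0.0)[(m_j, branch)]
    e1 = p_multiplet_energies(step)[(m_j, branch)]
    return (e1 - e0) / step / m_j


zero_field = p_multiplet_energies(0.0)
g_32 = lande_g_from_slope(0.5, "upper")    # belongs to the j = 3/2 multiplet
g_12 = lande_g_from_slope(0.5, "lower")    # belongs to the j = 1/2 multiplet
strong = p_multiplet_energies(50.0)
print("zero-field levels:", sorted(set(round(e, 12) for e in zero_field.values())))
print(f"g_j(3/2) = {g_32:.4f}, g_j(1/2) = {g_12:.4f}")
print(f"b = 50: E(m_j=1/2, upper)/b = {strong[(0.5, 'upper')] / 50.0:.3f}")
```

At zero field the levels collapse onto +1/2 and −1, the initial slopes give Landé factors close to 4/3 and 2/3, and at large b the energies approach $b\,(m_l + 2 m_s)$, as in the Paschen-Back limit.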
The experimentally observed spectra confirm the theory outlined here. For H, provided that a sufficiently strong magnetic field is applied, the Paschen-Back effect is relatively straightforward to observe as a triplication of all lines. If very high spectral resolution can be achieved, also the weak-field Zeeman splitting of the H lines shown in the conceptual scheme of Fig. 1.15 could be detected. Such Zeeman effect is called “anomalous” since the lines are not regularly spaced, and the spacing is different for the ($^2S_{1/2} \leftrightarrow {}^2P_{1/2}$)-originated and the ($^2S_{1/2} \leftrightarrow {}^2P_{3/2}$)-originated lines. However, in atomic physics, this is rather the rule than an anomaly: only a few S = 0 lines of many-electron atoms happen to show “regular” Zeeman splittings (see Fig. 1.31 in Sec. 1.2.7.3).
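[Figure 1.14: plot of Energy / ξ versus $\mu_B B/\xi$ for l = 1, s = 1/2. At the Zeeman (weak-field) end the levels are labeled j = 3/2 and j = 1/2; at the Paschen-Back (strong-field) end they are labeled by $(m_l, m_s)$ = (+1,+1/2), (0,+1/2), (−1,+1/2), (+1,−1/2), (0,−1/2), (−1,−1/2).]
Figure 1.14. Spin-orbit and magnetic splitting of a $^2P$ multiplet. With the shorthand $b = \mu_B B/\xi$, the expression for the $m_j = \pm\frac{3}{2}$ energies (solid lines) is simply $(\frac{1}{2} \pm 2b)\, \xi$. The energies of the four other levels are $\frac{1}{4}(-1 \pm 2b + d_\pm)\, \xi$ (dotted lines) and $\frac{1}{4}(-1 \pm 2b - d_\pm)\, \xi$ (dashed lines), where $d_\pm = \sqrt{9 \pm 4b + 4b^2}$ and the upper (lower) signs refer to $m_j = +\frac{1}{2}$ ($-\frac{1}{2}$).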
1.2. Many-electron atoms
1.2.1. Identical particles. The concept of indistinguishable particles is central to a correct understanding of all the physics of matter beyond one-electron atoms. In classical mechanics, each particle is labeled by its own position and momentum: one could in principle follow individual trajectories through the motion, and thus tell identical particles i and j apart at any time. In quantum mechanics, indistinguishable particles are such at the deepest level. There is no way, even as a matter of principle, to ever distinguish e.g. 2 electrons. Quantum mechanics implements this perfect indistinguishability through symmetry: any many-particle ket has a definite symmetry “character” for the permutation $P_{ij}$ swapping any two identical particles. As this permutation symmetry is a discrete symmetry which, once applied twice, leads back to the initial state, the eigenvalues of any permutation can only be +1 or −1.
Those particles for whose swap the total ket $|a\rangle$ of the system is symmetric (+1 character) are called bosons. This is represented by $P_{ij}|a\rangle = |a\rangle$.
Those particles for whose swap the total ket $|a\rangle$ of the system is antisymmetric (−1 character) are called fermions. This is indicated by $P_{ij}|a\rangle = -|a\rangle$.
[Figure 1.15: level diagram, field off / field on. The 2p level, split by 45 µeV into $^2P_{3/2}$ ($m_j$ = 3/2, 1/2, −1/2, −3/2) and $^2P_{1/2}$ ($m_j$ = 1/2, −1/2), lies about 10 eV above $^2S_{1/2}$ ($m_j$ = 1/2, −1/2).]
Figure 1.15. Conceptual scheme of the Zeeman-split 1s←→2p lowest Lyman line of H. Splittings between lines of different $m_j$ are proportional to the magnetic energy $\mu_B B$ and the relevant g-factor: $g_j(^2P_{3/2}) = \frac{4}{3}$, $g_j(^2P_{1/2}) = \frac{2}{3}$, $g_j(^2S_{1/2}) = 2$.
[Figure 1.16: plot of the ionization energy [eV] (0 to 25) versus Z (0 to 100), with labeled elements He, Ne, Ar, Kr, Xe, Hg, Rn, Li, Na, K, Ga, Rb, In, Cs, Tl, Fr.]
Figure 1.16. First ionization energies of neutral atoms (N = Z), as a function of the atomic number Z.
A simple rule connects the spin of a particle kind to its permutational symmetry: integer-spin particles are bosons, half-odd-integer-spin particles are fermions.
Example of bosons: the photon (spin 1). Examples of fermions: the electron, the proton, the neutron (all spin 1/2). A collection of bosons and fermions lumped together is often treated as a single point particle. This makes sense when the internal dynamics is associated to very high excitation energy, so that the lump remains in its ground state, which can be degenerate according to the projection of the lump’s total angular momentum. One could ask what character for the permutation of two such identical lumps the collective ket shows. This is simply answered by counting the number of (−1)’s generated by the permutations of pairs of identical fermions. For example, hydrogen atoms in the same hyperfine state are bosons, since a −1 is generated by the permutation of the two electrons and a second −1 is generated by the permutation of the two protons. According to Eq. (47), these composite objects fulfil the spin rule: e.g. the grand total spin of H in its ground state is f = 0. Similarly, identical nuclei are bosons or fermions according to whether they contain an even or odd number of nucleons (protons or neutrons). For example the deuteron ion D$^+$, a bound state of a proton and a neutron in an i = 1 spin state, is a boson. Likewise, the $^{13}$C isotope of carbon is a fermion (13 nucleons + 6 electrons), while the $^{35}$Cl isotope of chlorine is a boson (35 nucleons + 17 electrons).
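The counting argument for composite particles amounts to checking the parity of the total number of constituent spin-1/2 fermions. A minimal sketch (the function name and the example table are illustrative):

```python
def composite_statistics(n_protons, n_neutrons, n_electrons):
    """Classify a tightly bound composite by counting its spin-1/2 constituents.

    Swapping two identical composites swaps every pair of identical constituent
    fermions, giving an overall factor (-1) ** (number of fermions).
    """
    n_fermions = n_protons + n_neutrons + n_electrons
    return "boson" if n_fermions % 2 == 0 else "fermion"


examples = {
    "H atom (1H)": (1, 0, 1),
    "deuteron D+": (1, 1, 0),
    "13C atom": (6, 7, 6),
    "35Cl atom": (17, 18, 17),
}
statistics = {name: composite_statistics(*counts) for name, counts in examples.items()}
for name, kind in statistics.items():
    print(f"{name}: {kind}")
```

The classification is meaningful only as long as the composite can be treated as a point particle frozen in its internal ground state, as stated in the text.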
The permutational symmetry is crucial to understand the dynamics of many electrons and, in particular, the structure of many-electron atoms. As discussed in
greater detail below, antisymmetry obliges N electrons to span N quantum states,
thus effectively avoiding one another. In practice, the geometrical constraint of antisymmetry is often more effective than the dynamical electron-electron repulsion
(6) in keeping electrons apart.
Without permutational antisymmetry all electrons would occupy the same 1s shell in the atomic ground state. If that happened (as it could if electrons were distinguishable particles – or bosons – rather than fermions), then the atomic ionization energies should increase roughly as $Z^2$ and the size of atoms should decrease roughly as $Z^{-1}$. In stark contrast, a relatively weak non-monotonic Z dependence of both these properties (Fig. 1.16 and 1.17) is observed: in particular, the ionization energy and the atomic size show a general tendency to respectively decrease and increase with Z.
Figure 1.17. Empirical atomic radii of elements, as a function of Z.
1.2.2. The independent-particles approximation. Electrons interact strongly
with each other. In a neutral atom, the electron-electron repulsion (6) is of the same
order of magnitude as the attraction (4) to the nucleus.
Exact solution of the Schrödinger equation associated to Hamiltonian (1) involves
the determination of a N -electron wavefunction. For increasing N , this becomes
rapidly a formidable task, as the N -electron wavefunction describes the correlated
motion of all the N electrons, and thus depends in a nontrivial way on all position
and spin coordinates of the N electrons. The amount of information associated to a
generic N -electron wavefunction is exponentially large with N , and basically there
is no way to store (let alone to compute!) the exact wavefunction for the ground
state of several interacting electrons.
Most approximate methods on the market are based on the observation that a basis of the Hilbert space of N-particle states can be built as the tensor product of single-particle basis states. More explicitly, if $\{|\alpha_j\rangle\}$ is a complete set of orthonormal states for a single particle, the product
(88)  $$ |\alpha_1, \alpha_2, ...\alpha_N\rangle = |\alpha_1\rangle \otimes |\alpha_2\rangle \otimes ...|\alpha_N\rangle , $$
realizes a basis for N particles when all possible choices of $\alpha_1, \alpha_2, ... \alpha_N$ are explored. For indistinguishable particles, the correct permutational symmetry is imposed on the product state (88) by taking the properly symmetrized linear combination
(89)  $$ |\alpha_1, \alpha_2, ...\alpha_N\rangle_{S/A} = \frac{1}{\sqrt{N_P}} \sum_P (\pm 1)^{\{P\}}\, |\alpha_{P_1}, \alpha_{P_2}, ...\alpha_{P_N}\rangle . $$
Here P indicates a generic permutation of the N particles $\alpha_j$, and the sum extends over the N! permutations; the normalization $N_P$ equals N! if all the $\alpha_j$ happen to be different; {P} in the exponent indicates the parity of the permutation P, i.e. the number of pair transpositions P is made of. The fully symmetrized basis state $|\alpha_1, \alpha_2, ...\alpha_N\rangle_S$ realizes the correct permutational symmetry of N bosons; the antisymmetric combination $|\alpha_1, \alpha_2, ...\alpha_N\rangle_A$ involving nontrivial (−1) signs can play the role of basis state for N fermions. For bosons, no restriction applies to the quantum numbers $\alpha_j$: any number of them may be equal. Instead, for fermions, all quantum numbers must necessarily be different. If two were equal, say $\alpha_i = \alpha_j$, in the sum of (89), the kets $|\alpha_{P_1}, ...\alpha_{P_i}, ..., \alpha_{P_j}, ...\alpha_{P_N}\rangle$ and $|\alpha_{P_1}, ...\alpha_{P_j}, ..., \alpha_{P_i}, ...\alpha_{P_N}\rangle$ would be equal, but with opposite parity phase factor $(-1)^{\{P\}}$, so that they all cancel in pairs in the sum, and the total ket $|\alpha_1, ...\alpha_i, ...\alpha_j, ...\alpha_N\rangle_A$ vanishes. In summary, the product basis kets for N fermions are characterized by N different quantum numbers: this property expresses the Pauli exclusion principle, according to which no two identical fermions can occupy the same quantum state.
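The cancellation behind the Pauli principle can be made explicit by building the sum of Eq. (89) term by term. In the sketch below a ket is represented as a dictionary mapping product-state labels to integer coefficients (before normalization); the labels and function names are purely illustrative.

```python
from itertools import permutations
from math import sqrt


def permutation_parity(perm):
    """Return +1 for an even permutation, -1 for an odd one (inversion count)."""
    inversions = sum(
        1
        for a in range(len(perm))
        for b in range(a + 1, len(perm))
        if perm[a] > perm[b]
    )
    return -1 if inversions % 2 else +1


def symmetrized_ket(labels, fermions=True):
    """Unnormalized sum over P of (+-1)^{P} |alpha_P1, ..., alpha_PN>, as in Eq. (89)."""
    ket = {}
    for perm in permutations(range(len(labels))):
        sign = permutation_parity(perm) if fermions else +1
        product_state = tuple(labels[p] for p in perm)
        ket[product_state] = ket.get(product_state, 0) + sign
    return {state: c for state, c in ket.items() if c != 0}


def norm(ket):
    """Norm of the unnormalized ket, i.e. the square root of N_P."""
    return sqrt(sum(c * c for c in ket.values()))


distinct = symmetrized_ket(("1s_up", "1s_down", "2s_up"))
repeated = symmetrized_ket(("1s_up", "1s_up", "2s_up"))
bosonic = symmetrized_ket(("a", "a", "b"), fermions=False)
print("distinct fermion labels:", len(distinct), "terms, norm^2 =", round(norm(distinct) ** 2))
print("repeated fermion label: ", repeated)
print("bosons with a repeat:   ", bosonic)
```

With three different labels the antisymmetric ket has 3! = 6 terms and squared norm N! = 6; with a repeated label every term cancels and the ket is empty, while the symmetric (boson) combination survives.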
The symmetrized product kets (89) are substantially simpler objects than a generic boson/fermion ket: their information content is only directly proportional to N, rather than exponential in N. Despite this simplicity, they constitute a basis of the Hilbert space of the proper symmetry: any actual N-boson/N-fermion ket $|a_{B/F}\rangle$ (e.g. an exact eigenstate of the interacting Hamiltonian) can be expressed as a linear combination of the factorized basis kets:
(90)  $$ |a_{B/F}\rangle = \sum_{\alpha_1, \alpha_2, ...\alpha_N} c^a_{\alpha_1, \alpha_2, ...\alpha_N}\, |\alpha_1, \alpha_2, ...\alpha_N\rangle_{S/A} , $$
where $c^a_{\alpha_1, \alpha_2, ...\alpha_N}$ are the complex coefficients defining the linear combination. The number of these coefficients is exponentially large with N: the intrinsic difficulty of the correlated problem has been transferred to the expansion coefficients.
Many approximate methods for solving the Schrödinger problem for many-electron systems replace the (lowest) exact eigenstate with one (as smart as possible) basis state $|\alpha_1, \alpha_2, ...\alpha_N\rangle_{S/A}$ constructed with single-particle states $|\alpha_1\rangle$, $|\alpha_2\rangle$, ..., solutions of some appropriate single-electron Hamiltonian. This is the so-called independent-particles approximation.
For atoms, the simplest approach along this line consists in neglecting the electron-electron Coulomb interaction $V_{ee}$ (6) altogether. In this approximation, the Schrödinger problem for the N electrons is exactly factorized: each electron moves independently of the others in the field $V_{ne}$ of the nucleus of charge $Z q_e$. The single-electron eigenstates $|\alpha_j\rangle$ are hydrogenic wavefunctions of the kind (46) (we neglect spin-orbit), with the individual factors given in Eqs. (31) (32) (33), with the appropriate Z.⁴ In this atomic context $\alpha_j$ stands for the j-th set of quantum numbers $n_j, l_j, m_{l\,j}, m_{s\,j}$. Choose at will any set of N different $\alpha_j$: the generic N-electron eigenstate is given by $|\alpha_1, \alpha_2, ...\alpha_N\rangle_A$ as in (89). For example, for N = 4 electrons, an acceptable state is
(91)  $$ |1, 0, 0, \uparrow,\ 3, 1, 1, \uparrow,\ 3, 1, 0, \uparrow,\ 3, 1, -1, \downarrow\rangle_A . $$
The standard spectroscopic notation for this state, 1s3p$^3$, lists the occupied single-particle orbitals, with the corresponding “occupancies” as exponents: all information about the $m_l$’s and $m_s$’s is dropped.
[Figure 1.18: sketch of a nucleus of charge Z surrounded by N−1 electrons, with one electron moving away.]
Figure 1.18. When one electron moves away from an atom/ion, the nucleus of charge $Z q_e$ and the remaining (N−1) electrons attract it as if they were a point charge $(Z - N + 1)\, q_e$.
The total energy $E^{tot}$ of an atomic state is minus the work needed to decompose the atom from that given bound state to an isolated nucleus plus the N individual electrons at rest at infinite mutual distance. For the simple factorized eigenstates, the total energy is simply the sum of the single-particle energies. The eigenstate (91) is not the one with the lowest possible energy. The ground state can be obtained by minimizing the energy of each single-particle orbital, without violating the Pauli principle. For N = 4, the state with the lowest energy is any of the 1s$^2$2s$^2$, 1s$^2$2s2p, 1s$^2$2p$^2$ (a total of 1 + 12 + 15 = 28 individual states), all with energy
(92)  $$ E^{tot}_{1s^2 2s^2} = 2E_1 + 2E_2 = -2\, \frac{E_{Ha}}{2} \left( \frac{Z^2}{1^2} + \frac{Z^2}{2^2} \right) = -\frac{5}{4}\, Z^2 E_{Ha} . $$
The excited state (91) has substantially higher total energy $E^{tot}_{1s3p^3} = -\frac{2}{3}\, Z^2 E_{Ha}$.
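These totals, and the count of 28 degenerate ground states, are quick to verify. The sketch below works in units of $E_{Ha}$, neglects $V_{ee}$ exactly as in the text, and uses illustrative names; occupancies are given per principal shell n, since the hydrogenic energies do not depend on l.

```python
from fractions import Fraction
from math import comb


def total_energy(occupancies, Z=1):
    """Total energy in units of E_Ha for {n: number of electrons in shell n}, V_ee neglected."""
    for n, count in occupancies.items():
        if count > 2 * n * n:
            raise ValueError(f"shell n={n} holds at most {2 * n * n} electrons")
    return sum(count * Fraction(-Z * Z, 2 * n * n) for n, count in occupancies.items())


def number_of_states(occupancies):
    """Number of Pauli-allowed ways to place the electrons in the 2n^2 spin-orbitals of each shell."""
    ways = 1
    for n, count in occupancies.items():
        ways *= comb(2 * n * n, count)
    return ways


ground = {1: 2, 2: 2}      # 1s^2 plus two electrons anywhere in the n = 2 shell
excited = {1: 1, 3: 3}     # shell occupancies of the configuration 1s 3p^3
e_ground = total_energy(ground)
e_excited = total_energy(excited)
n_ground_states = number_of_states(ground)
print("ground:  E =", e_ground, "Z^2 E_Ha,", n_ground_states, "states")
print("excited: E =", e_excited, "Z^2 E_Ha")
```

The 28 states are the $\binom{8}{2}$ ways of placing two electrons in the eight n = 2 spin-orbitals, which regroup as 1 + 12 + 15 according to the 2s/2p occupancies.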
As the maximum occupancy $2n^2$ of the n-th hydrogenic level grows rapidly, in the brutally simplified model at hand (complete neglect of $V_{ee}$), the minimum energy to remove an electron from the atom (first ionization energy) increases with Z only marginally more slowly than $Z^2$, at variance with experiment (Fig. 1.16). Also, in this model any atom (regardless of Z) would be able to accept any number N of electrons, always forming bound states. Experimentally, however, only certain atoms can form negatively charged ionic bound states, but never with more than 1 extra charge (N ≤ Z + 1). These difficulties indicate that, as one could imagine, complete neglect of $V_{ee}$ is a very poor approximation. The main reason for the failure of this model is illustrated in Fig. 1.18: in reality, while an electron is removed from the atom, it does not feel the bare nuclear attraction $-Z e^2/r$ but rather, due to electron-electron repulsion and in accordance with the divergence theorem, the substantially weaker combined effect of the nuclear charge and that of the other N−1 electrons, $V(r) \simeq (-Z + N - 1)\, e^2/r$. This phenomenon of “screening” reduces the ionization energy substantially, and must be included for a decent factorized description of the atomic wavefunction.
⁴ We ignore the reduced-mass correction, assuming an infinite nuclear mass. The finiteness of the nuclear mass introduces extra tiny electron-electron correlations in addition to those of Coulomb origin.
The following section sketches some theory for the simplest many-electron atom,
He (Z = N = 2), for which significant insight is obtained by treating Vee as a
perturbation to the uncorrelated electron states. Perturbative methods fail for N ≥ 3: Sec. 1.2.4 presents a more systematic method for improving the independent-electron approximation to include screening.
1.2.3. The 2-electron atom. Put the position and spin coordinates of the j-th electron together and shorthand them as $w_j = (\vec r_j, \sigma_j)$. The permutation operator
$P$ in Eq. (89) acts on the position eigenkets exchanging the N coordinates with one
another:
\[
P\,|w_1, w_2, \dots w_N\rangle = |w_{P_1}, w_{P_2}, \dots w_{P_N}\rangle . \tag{93}
\]
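Accordingly, the wavefunction associated to Eq. (89) can be written equivalently
\[
\begin{aligned}
\Psi_{\alpha_1,\dots\alpha_N}(w_1,\dots w_N) = \langle w_1, w_2, \dots w_N|\alpha_1,\alpha_2,\dots\alpha_N\rangle^A
&= \frac{1}{\sqrt{N!}} \sum_P (-1)^{\{P\}}\, \langle w_{P_1},\dots w_{P_N}|\alpha_1,\dots\alpha_N\rangle \\
&= \frac{1}{\sqrt{N!}} \sum_P (-1)^{\{P\}}\, \langle w_{P_1}|\alpha_1\rangle \cdot \dots \langle w_{P_N}|\alpha_N\rangle \\
&= \frac{1}{\sqrt{N!}} \sum_P (-1)^{\{P\}}\, \psi_{\alpha_1}(w_{P_1}) \cdot \dots \psi_{\alpha_N}(w_{P_N}) .
\end{aligned}
\tag{94}
\]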
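The sum in the last expression is the determinant of the matrix whose elements are
$\psi_{\alpha_i}(w_j)$:
\[
\Psi_{\alpha_1,\dots\alpha_N}(w_1,\dots w_N) = \frac{1}{\sqrt{N!}}
\begin{vmatrix}
\psi_{\alpha_1}(w_1) & \psi_{\alpha_1}(w_2) & \cdots & \psi_{\alpha_1}(w_N) \\
\psi_{\alpha_2}(w_1) & \psi_{\alpha_2}(w_2) & \cdots & \psi_{\alpha_2}(w_N) \\
\vdots & \vdots & \ddots & \vdots \\
\psi_{\alpha_N}(w_1) & \psi_{\alpha_N}(w_2) & \cdots & \psi_{\alpha_N}(w_N)
\end{vmatrix} . \tag{95}
\]
This object is called a Slater determinant.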
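As a numerical illustration of Eq. (95), the sketch below builds the determinant for three spinless particles in one dimension. The three orbitals (unnormalized Gaussian-times-polynomial functions) and the function names are assumptions made only for this example; they are not the atomic orbitals discussed in the text. The sketch checks the two defining properties: the wavefunction changes sign when two coordinates are exchanged, and it vanishes when two coordinates coincide.

```python
import math
import numpy as np

def orbitals(x):
    """Three illustrative (hypothetical, unnormalized) 1D orbitals evaluated at x."""
    x = np.asarray(x, dtype=float)
    g = np.exp(-x**2 / 2.0)
    return np.array([g, x * g, (2.0 * x**2 - 1.0) * g])

def slater(coords):
    """Slater determinant of Eq. (95): det[psi_i(w_j)] / sqrt(N!), for N <= 3."""
    n = len(coords)
    matrix = orbitals(coords)[:n, :]      # matrix[i, j] = psi_i(w_j)
    return np.linalg.det(matrix) / math.sqrt(math.factorial(n))

w = [0.3, -0.7, 1.1]
swapped = [-0.7, 0.3, 1.1]                # coordinates 1 and 2 exchanged
print(slater(w), slater(swapped))         # same magnitude, opposite sign
print(slater([0.3, 0.3, 1.1]))            # two equal coordinates: two equal columns
```

Using the determinant routine rather than an explicit sum over the N! permutations is a design choice of this sketch: both give the same number, but the determinant costs far less for large N.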
In the simplest nontrivial case of N = 2 electrons (relevant, e.g., for the He atom,
the Li+ , Be2+ ,... ions), the independent-electron wavefunction reads
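\[
\Psi_{\alpha_1,\alpha_2}(w_1, w_2) = \frac{1}{\sqrt 2}
\begin{vmatrix}
\psi_{\alpha_1}(w_1) & \psi_{\alpha_1}(w_2) \\
\psi_{\alpha_2}(w_1) & \psi_{\alpha_2}(w_2)
\end{vmatrix}
= \frac{\psi_{\alpha_1}(w_1)\,\psi_{\alpha_2}(w_2) - \psi_{\alpha_1}(w_2)\,\psi_{\alpha_2}(w_1)}{\sqrt 2} . \tag{96}
\]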
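In the cases where the orbital quantum numbers coincide ($n_1 = n_2$, $l_1 = l_2$, $m_{l1} = m_{l2}$),
the orbital part can be factorized from the determinant. The latter only
involves spin:
\[
\Psi_{n,l,m_l,\uparrow,\; n,l,m_l,\downarrow}(w_1, w_2) = \psi_{n,l,m_l}(\vec r_1)\, \psi_{n,l,m_l}(\vec r_2)\, \frac{1}{\sqrt 2}
\begin{vmatrix}
\chi_\uparrow(\sigma_1) & \chi_\uparrow(\sigma_2) \\
\chi_\downarrow(\sigma_1) & \chi_\downarrow(\sigma_2)
\end{vmatrix} . \tag{97}
\]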
This antisymmetric up–down combination of the two electron spins is an eigenstate
of the square modulus of the total spin $\vec S = \vec s_1 + \vec s_2$, with null eigenvalue
$|\vec S|^2 = S(S+1)\hbar^2 = 0$. Eigenstates of $|\vec S|^2$, like (97), are useful because the matrix elements
of the (hitherto neglected) Coulomb repulsion between states of different S vanish,
since Vee is an orbital operator, which cannot change spin. The other S = 0 states
(spin singlets), those involving two different sets of orbital quantum numbers, are:
\[
\Psi^{S=0}_{n_1,l_1,m_{l1},\; n_2,l_2,m_{l2}}(w_1, w_2) =
\frac{\psi_{n_1,l_1,m_{l1}}(\vec r_1)\, \psi_{n_2,l_2,m_{l2}}(\vec r_2) + \psi_{n_1,l_1,m_{l1}}(\vec r_2)\, \psi_{n_2,l_2,m_{l2}}(\vec r_1)}{\sqrt 2}\;
\frac{\chi_\uparrow(\sigma_1)\, \chi_\downarrow(\sigma_2) - \chi_\uparrow(\sigma_2)\, \chi_\downarrow(\sigma_1)}{\sqrt 2} . \tag{98}
\]
Note that these S = 0 states are not single Slater determinants of the type (96).
The singlet states are characterized by an orbital part of the wavefunction which is
symmetric under permutation P12 , with the spin part taking care of the required
antisymmetry.
The singlet states (97) and (98) are S = 0 eigenstates of $|\vec S|^2$. The other value
of S allowed by Eq. (47) is S = 1. The spin part of the wavefunctions of these
spin-triplet states is any of:
\[
\begin{aligned}
X^{S=1,M_S=1}(\sigma_1, \sigma_2) &= \chi_\uparrow(\sigma_1)\, \chi_\uparrow(\sigma_2) && (99) \\
X^{S=1,M_S=0}(\sigma_1, \sigma_2) &= \frac{\chi_\uparrow(\sigma_1)\, \chi_\downarrow(\sigma_2) + \chi_\uparrow(\sigma_2)\, \chi_\downarrow(\sigma_1)}{\sqrt 2} && (100) \\
X^{S=1,M_S=-1}(\sigma_1, \sigma_2) &= \chi_\downarrow(\sigma_1)\, \chi_\downarrow(\sigma_2) , && (101)
\end{aligned}
\]
which are all symmetric for $P_{12}$. Therefore the orbital part takes care of antisymmetry:
\[
\Psi^{S=1,M_S}_{n_1,l_1,m_{l1},\; n_2,l_2,m_{l2}}(w_1, w_2) =
\frac{\psi_{n_1,l_1,m_{l1}}(\vec r_1)\, \psi_{n_2,l_2,m_{l2}}(\vec r_2) - \psi_{n_1,l_1,m_{l1}}(\vec r_2)\, \psi_{n_2,l_2,m_{l2}}(\vec r_1)}{\sqrt 2}\;
X^{S=1,M_S}(\sigma_1, \sigma_2) . \tag{102}
\]
In these states, at least one of the orbital quantum numbers for the two electrons
needs to be different: $(n_1, l_1, m_{l1}) \neq (n_2, l_2, m_{l2})$, or else the wavefunction vanishes.
The following table summarizes the basic properties of the singlet (S = 0) and
triplet (S = 1) basis states $|n_1, l_1, m_{l1}, n_2, l_2, m_{l2}, S, M_S\rangle$:
                          S = 0 – spin-singlet states   S = 1 – spin-triplet states
orbital quantum numbers   any                           (n1, l1, ml1) ≠ (n2, l2, ml2)
orbital wavefunction      symmetric                     antisymmetric
spin quantum numbers      ↑ and ↓ (different)           any
spin wavefunction         antisymmetric                 symmetric
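The spin assignments in the table can be verified with explicit 4×4 matrices. The sketch below is only an illustration: it works in units of $\hbar = 1$, represents $\chi_\uparrow$ and $\chi_\downarrow$ as the two basis vectors of a spin-1/2 space, and builds $|\vec S|^2$ for $\vec S = \vec s_1 + \vec s_2$ from Pauli matrices. All variable names are choices made for this example.

```python
import numpy as np

# Spin-1/2 operators (hbar = 1) from the Pauli matrices
sx = 0.5 * np.array([[0, 1], [1, 0]], dtype=complex)
sy = 0.5 * np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = 0.5 * np.array([[1, 0], [0, -1]], dtype=complex)
identity = np.eye(2)

def total(op):
    """Component of S = s1 + s2 acting on the two-spin product space."""
    return np.kron(op, identity) + np.kron(identity, op)

S2 = sum(total(s) @ total(s) for s in (sx, sy, sz))   # |S|^2

up, down = np.array([1.0, 0.0]), np.array([0.0, 1.0])
singlet = (np.kron(up, down) - np.kron(down, up)) / np.sqrt(2)    # spin part of (98)
triplet_p = np.kron(up, up)                                        # Eq. (99)
triplet_0 = (np.kron(up, down) + np.kron(down, up)) / np.sqrt(2)  # Eq. (100)
triplet_m = np.kron(down, down)                                    # Eq. (101)

for name, state in [("singlet", singlet), ("triplet +1", triplet_p),
                    ("triplet 0", triplet_0), ("triplet -1", triplet_m)]:
    value = np.real(state.conj() @ S2 @ state)
    print(f"{name}: <S^2> = {value:.1f}")   # S(S+1): 0 for S = 0, 2 for S = 1
```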
These spin-symmetrized states are convenient 0th-order states for a perturbation
theory in Vee. They are eigenstates of Te + Vne: like in the example of Eq. (92), the
“unperturbed” energies are
\[
E^{\mathrm{tot}\,(0)}_{n_1,n_2} = -\frac{E_{Ha}}{2} \left( \frac{Z^2}{n_1^2} + \frac{Z^2}{n_2^2} \right) = -2\,E_{Ha} \left( \frac{1}{n_1^2} + \frac{1}{n_2^2} \right) , \tag{103}
\]
where the last expression refers to He (Z = 2). The ground state n1 = n2 = 1
(in spectroscopic terms $1s^2$) is necessarily a spin singlet and has energy
$E^{\mathrm{tot}\,(0)}_{1,1} = -4\,E_{Ha} \simeq -109$ eV.
As a next step, the effect of Vee is accounted for, considering only its diagonal
matrix elements, and neglecting its off-diagonal matrix elements. This approximation leaves the states unchanged and provides the first-order (additive) correction
$E^{\mathrm{tot}\,(1)} = \langle V_{ee} \rangle$ to the eigenenergies, in accord with general perturbation theory [?]:
\[
E^{\mathrm{tot}\,(1)}_{n_1,l_1,m_{l1},\; n_2,l_2,m_{l2},\; S,M_S} =
\langle n_1, l_1, m_{l1}, n_2, l_2, m_{l2}, S, M_S |\, V_{ee} \,| n_1, l_1, m_{l1}, n_2, l_2, m_{l2}, S, M_S \rangle . \tag{104}
\]
Since the electron-electron repulsion is positive, this correction is always positive.
The detailed calculation of these Coulomb integrals is a rather intricate mathematical exercise, but its qualitative outcome is very instructive. The largest of these
$E^{\mathrm{tot}\,(1)}$ corrections occurs for the most localized wavefunction, the one where the
two electrons stay very close together, both in the 1s level: the ground state $1s^2$.
The average inter-electron distance is of the order of $a_0$, thus the Coulomb integral
$E^{\mathrm{tot}\,(1)}_{1,0,0,\;1,0,0,\;0,0}$ is of order $\sim e^2/a_0 = E_{Ha}$. The precise value obtained from integration [?] is $\frac{5}{4} E_{Ha} \simeq 34$ eV. This brings the estimated ground-state energy of He to
−74.8 eV, in fair agreement with the experimental value −79.00 eV (minus the sum
of the first and second ionization energies of He).
Figure 1.19. Energy levels of atomic He, showing some transitions.
n, l refer to n2, l2, while n1 = 1 for all states reported here. Note that
triplet states sit systematically lower than the singlets.
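The arithmetic behind this first-order estimate can be retraced in a few lines. The sketch below evaluates Eq. (103) for the $1s^2$ configuration of He and adds the Coulomb integral $\frac{5}{4} E_{Ha}$ (≃ 34 eV) quoted above; the conversion factor $E_{Ha} \simeq 27.2114$ eV and the function name are the only ingredients not taken from the text.

```python
E_HA_EV = 27.2114                      # Hartree energy in eV

def unperturbed_energy(n1, n2, Z=2):
    """Eq. (103): zeroth-order total energy in units of E_Ha (Vee neglected)."""
    return -0.5 * Z**2 * (1.0 / n1**2 + 1.0 / n2**2)

e0 = unperturbed_energy(1, 1)          # 1s^2 configuration of He
e1 = 5.0 / 4.0                         # first-order Coulomb integral quoted in the text
estimate_ev = (e0 + e1) * E_HA_EV
experiment_ev = -79.00                 # experimental value quoted in the text

print(f"0th order      : {e0 * E_HA_EV:8.2f} eV")
print(f"0th + 1st order: {estimate_ev:8.2f} eV")
print(f"discrepancy    : {estimate_ev - experiment_ev:8.2f} eV")
```

The residual discrepancy of about 4 eV is what the higher orders of perturbation theory, or the self-consistent treatment, have to recover.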
For the optically most relevant states, those with one electron sitting in 1s and the
other in an excited state n2 [l2 ], perturbation theory accounts for several experimental
observations:
• All Coulomb integrals are smaller than the one for the ground state, and
tend to decrease for increasing n2: the Coulomb correction becomes less and
less important as the electrons move apart from each other.
• The Coulomb integrals at given n2 depend weakly on l2: they usually increase for increasing l2. By looking at the hydrogenic radial distributions
of Fig. 1.7, one can note that indeed the electrons, on average, sit slightly
closer when l2 is larger. This is an important finding, since it breaks the
H-atom l-degeneracy of the shells, putting $E_{ns} < E_{np} < E_{nd} < \dots$, in accord
with experimental findings (Fig. 1.19).
• The Coulomb integrals depend on S, clearly not through the spin wavefunction, which has nothing to do with the purely spatial Vee, but through the different electron-electron correlation in the spatially $P_{12}$-symmetrical (S = 0)
or spatially $P_{12}$-antisymmetrical (S = 1) wavefunction. In particular,
Eq. (102) shows that the triplet wavefunction $\Psi^{S=1,M_S}_{n_1,l_1,m_{l1},\; n_2,l_2,m_{l2}}(w_1, w_2)$ vanishes for $\vec r_1 \to \vec r_2$. On the contrary, the singlet wavefunction $\Psi^{S=0}_{n_1,l_1,m_{l1},\; n_2,l_2,m_{l2}}(w_1, w_2)$
is finite at $\vec r_1 = \vec r_2$. Therefore, on average, the electrons in a spin-triplet
state avoid each other more effectively than in the spin-singlet state with
the same orbital quantum numbers.⁵ Indeed, Coulomb integrals are systematically smaller for S = 1 than for S = 0 states, as explicit evaluation of
the integral (104) shows. This result accounts for the experimental observation that the triplet states lie systematically lower than the corresponding
singlet (Fig. 1.19). Splittings between states of different spin, here singlets
and triplets, are named exchange splittings.
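The way the symmetry of the orbital part controls how closely two electrons approach can be seen in a toy model. The sketch below is not the Coulomb integral (104): it replaces the atomic orbitals by two hypothetical one-dimensional orbitals (the two lowest harmonic-oscillator functions, in dimensionless units) and compares the mean square distance between the particles for the symmetric combination, as in Eq. (98), and the antisymmetric one, as in Eq. (102).

```python
import numpy as np

x = np.linspace(-8.0, 8.0, 801)
dx = x[1] - x[0]

def normalized(f):
    return f / np.sqrt(np.sum(f**2) * dx)

psi_a = normalized(np.exp(-x**2 / 2.0))        # hypothetical "inner" orbital
psi_b = normalized(x * np.exp(-x**2 / 2.0))    # hypothetical "outer" orbital

def two_particle(sign):
    """Orbital part: symmetric (sign=+1, singlet) or antisymmetric (sign=-1, triplet)."""
    return (np.outer(psi_a, psi_b) + sign * np.outer(psi_b, psi_a)) / np.sqrt(2.0)

x1, x2 = np.meshgrid(x, x, indexing="ij")

def mean_square_distance(sign):
    density = two_particle(sign) ** 2
    return np.sum(density * (x1 - x2) ** 2) * dx * dx

singlet_d2 = mean_square_distance(+1)   # analytic value for these orbitals: 1
triplet_d2 = mean_square_distance(-1)   # analytic value for these orbitals: 3
on_diagonal = np.abs(np.diag(two_particle(-1))).max()

print(f"<(x1-x2)^2> singlet-like: {singlet_d2:.3f}")
print(f"<(x1-x2)^2> triplet-like: {triplet_d2:.3f}")
print(f"max |Psi_triplet(x, x)| : {on_diagonal:.1e}")
```

In this toy model the antisymmetric orbital combination keeps the two particles farther apart, which is the mechanism that makes the triplet Coulomb integrals smaller.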
The perturbative approach presented here is useful mostly as a conceptual tool,
to understand qualitative trends, and general concepts such as those listed above.
Perturbation theory is relatively successful for the 2-electron atom, but for N > 2
electrons the repulsion that a given electron experiences from all the other N − 1
electrons is comparable to the attraction generated by the nucleus, and any attempt
to treat it as a small perturbation fails. A better approximate approach, based on
a mean-field self-consistent evaluation of the electron-electron repulsion, yields fair
quantitative accuracy for any N and is commonly used to date. The reliability of
this and similar self-consistent field methods has made them standard tools for
understanding experiments and making predictions of atomic properties of matter
from first principles.
5
This means that the fixed-spin states include some degree of geometric correlation of the
electronic motion, which would be absent in a pure Slater determinant (96). Employing a basis
where the perturbation Vee is diagonal within the 0th-order degenerate space follows the same
strategy as the choice (Secs. 1.1.7 and 1.1.8) of the $|l, s, j, m_j\rangle$ basis to have $H_{s-o}$ diagonal within
the degenerate multiplets.
1.2.4. The self-consistent field theory. The problem of describing at best
the ground state of a N-electron problem in terms of a single Slater determinant
belongs to the general framework of variational problems. The simple idea is that
the average energy $E^{\mathrm{var}}[a] = \langle a|H_{tot}|a\rangle$ of any state $|a\rangle$ is larger than or equal to
that of the ground state. The lower $E^{\mathrm{var}}[a]$ is, the closer $|a\rangle$ gets to the ground state.
When for $|a\rangle$ we take a generic Slater determinant, the “best” state in its class is
the result of the minimization of the energy
\[
\begin{aligned}
E^{\mathrm{var}}[\psi_{\alpha_1}, \dots \psi_{\alpha_N}] &= \langle \alpha_1, \dots \alpha_N|^A\, H_{tot}\, |\alpha_1, \dots \alpha_N\rangle^A \\
&= \int dw_1 \dots dw_N\; \Psi^*_{\alpha_1,\dots\alpha_N}(w_1, \dots w_N)\, H_{tot}\, \Psi_{\alpha_1,\dots\alpha_N}(w_1, \dots w_N) \\
&= \sum_i \langle \alpha_i|H_1|\alpha_i\rangle + \frac{1}{2} \sum_{i,j} \Big[ \int dw\, dw'\; v_{ee}(w, w')\, |\psi_{\alpha_i}(w)|^2\, |\psi_{\alpha_j}(w')|^2 \\
&\qquad\qquad - \int dw\, dw'\; \psi^*_{\alpha_i}(w)\, \psi^*_{\alpha_j}(w')\, v_{ee}(w, w')\, \psi_{\alpha_j}(w)\, \psi_{\alpha_i}(w') \Big]
\end{aligned}
\tag{105}
\]
with respect to arbitrary variations of the N single-particle wavefunctions $\psi_{\alpha_i}$ composing the Slater determinant. We only require them to remain orthonormal: $\int dw\, \psi^*_{\alpha_i}(w)\, \psi_{\alpha_j}(w) = \delta_{ij}$.
In Eq. (105), $H_1(w) = \left[ -\frac{\hbar^2}{2 m_e} \nabla^2_{\vec r} - \frac{Z e^2}{|\vec r|} \right]$ stands for the one-particle term, and
$v_{ee}(w, w') = \frac{e^2}{|\vec r - \vec r'|}\, \delta_{m_s m_s'}$ for the electron-electron Coulomb repulsion.
Finding a minimum of $E^{\mathrm{var}}$ is the problem of minimizing a functional, i.e. a function whose independent variable is a set of functions. This constrained minimization
problem is formally solved if the $\psi_{\alpha_i}$ satisfy the set of coupled nonlinear integro-differential equations called Hartree-Fock (HF) equations:
\[
\overbrace{H_1(w)\, \psi_\alpha(w)}^{1}
+ \overbrace{\int dw' \sum_\beta |\psi_\beta(w')|^2\, v_{ee}(w, w')\, \psi_\alpha(w)}^{2}
- \overbrace{\int dw' \sum_\beta \psi^*_\beta(w')\, v_{ee}(w, w')\, \psi_\alpha(w')\, \psi_\beta(w)}^{3}
= \epsilon_\alpha\, \psi_\alpha(w) . \tag{106}
\]
Each of the three numbered terms derives from a corresponding term in the total
energy (105). If one pretends that all ψβ functions were given fixed functions, rather than
the unknown functions they really are, then equations (106) would become
linear in ψα. The general form of these equations is that of a Schrödinger equation:
term 1 contains the kinetic energy and the Coulomb attraction of the nucleus; term
2 represents the Coulomb repulsion of the average charge distribution of all electrons
($\sum_\beta |\psi_\beta(w')|^2$ represents the density distribution of the N electrons); term 3 is a
nonclassical nonlocal exchange term which, in particular, removes the unphysical
repulsion of the electron with itself introduced by term 2 (observe that the α = β
terms in the sums of terms 2 and 3 cancel).⁶ The HF equations realize indeed a
very natural way to treat the electron-electron repulsion as well as possible at the
mean-field level.
6
In the β sum of term 3 only $m_{s\beta} = m_{s\alpha}$ terms survive, as vee is purely orbital and does not
modify spin.
Figure 1.20. An idealized resolution scheme of the self-consistent
HF equations (106). [Flowchart boxes: generate N orthogonal initial wavefunctions;
generate the integro-differential operator; solve the Schroedinger-like equation;
choose the N lowest eigensolutions; compute total energy and expectation values;
replace old wavefunctions with new ones. Decision: are the new wavefunctions
equal to the old ones? (NO / YES).]
Terms 2 and 3 of (106) depend explicitly on the (unknown) wavefunctions ψβ .
The standard strategy (Fig. 1.20) for the solution of the HF equation is based on
initially assuming that all ψβ in (106) are known: start from some arbitrary initial
set of N orthonormal one-electron wavefunctions, put them in place of all ψβ ’s in
(106), solve (usually numerically) the linear equations for ψα ; from the list of new
solutions, take the N eigenfunctions with lowest single-particle eigenenergy $\epsilon_\beta$ and
re-insert them into the equations (106) in place of the ψβ’s; iterate this procedure
as long as needed. Usually, after several iterations (of the order of 10, depending
on the starting ψβ), self-consistency is reached, i.e. the wavefunctions do not change
appreciably from one iteration to the next. The converged wavefunctions allow
one to compute several observable quantities, e.g. the total energy given by Eq. (105).
The sum of the nuclear potential plus the repulsion of the charge distribution of
the other electrons represents the self-consistent potential acting on the motion of
charged particles within the atom.
Figure 1.21. Qualitative form of the atomic Hartree-Fock effective
potential multiplied by r (in units of $e^2$). The asymptotic values are
consistent with N = Z = 8. The effective potential multiplied by r
and divided by $-e^2$ defines the effective charge $Z_{eff}(r)$ seen at that
distance from the nucleus. [Plot: r V(r) versus r.]
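The self-consistency cycle of Fig. 1.20 can be sketched for the simplest closed-shell case, the $1s^2$ ground state of He. The sketch below is an illustration under stated assumptions, not the numerical method used in these notes: the 1s orbital is expanded on four s-type Gaussian functions centered on the nucleus (the exponents are an assumed illustrative basis), the integrals use the standard closed-form expressions for such Gaussians, and atomic units (energies in $E_{Ha}$) are used throughout. In a finite basis the integro-differential equations (106) become a matrix eigenvalue problem, which is solved repeatedly until the total energy stops changing.

```python
import numpy as np

Z = 2.0                                   # He nucleus
# Assumed illustrative basis: exponents of four s-type Gaussians exp(-a r^2)
alphas = np.array([0.298073, 1.242567, 5.782948, 38.474970])

a_sum = alphas[:, None] + alphas[None, :]                 # a_p + a_q
S = (np.pi / a_sum) ** 1.5                                # overlap matrix
T = 3.0 * np.outer(alphas, alphas) / a_sum * S            # kinetic energy
V = -2.0 * np.pi * Z / a_sum                              # nuclear attraction
H1 = T + V                                                # one-particle term (term 1)
# Two-electron integrals (pq|rs) for same-center s-type Gaussians
P = a_sum[:, :, None, None]
Q = a_sum[None, None, :, :]
ERI = 2.0 * np.pi**2.5 / (P * Q * np.sqrt(P + Q))

# Symmetric orthogonalization X = S^(-1/2): turns F c = eps S c into a standard problem
s_val, s_vec = np.linalg.eigh(S)
X = s_vec @ np.diag(s_val**-0.5) @ s_vec.T

def mean_field(D):
    """Direct (term 2) minus exchange (term 3) matrix for the density matrix D."""
    direct = np.einsum("pqrs,rs->pq", ERI, D)
    exchange = np.einsum("prqs,rs->pq", ERI, D)
    return direct - 0.5 * exchange

def scf(max_iter=100, tol=1e-10):
    D = np.zeros_like(S)                  # initial guess: no electron-electron repulsion
    energy_old = 0.0
    for iteration in range(1, max_iter + 1):
        F = H1 + mean_field(D)            # operator built from the old orbitals
        eps, c_prime = np.linalg.eigh(X.T @ F @ X)
        c = X @ c_prime[:, 0]             # lowest eigensolution, doubly occupied
        D = 2.0 * np.outer(c, c)          # new density matrix
        energy = np.sum(D * H1) + 0.5 * np.sum(D * mean_field(D))
        if abs(energy - energy_old) < tol:
            return energy, iteration
        energy_old = energy
    raise RuntimeError("SCF did not converge")

energy_ha, n_iter = scf()
print(f"E_HF = {energy_ha:.4f} E_Ha = {energy_ha * 27.2114:.1f} eV after {n_iter} iterations")
```

Here convergence is tested on the total energy rather than on the wavefunctions themselves, a common practical shortcut. Because the basis is finite, the converged energy can only lie above the exact HF energy of He, never below it.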
Until now, no assumption has been made about the symmetry of the self-consistent
potential. It need not have any special symmetry, and indeed it often has none.
However, a simplifying approximation is commonly made: that the charge distribution and the self-consistent potential are spherically symmetric, like the nuclear
potential. This approximation has the advantage of allowing one to separate variables in Eq. (106), like in the solution of the one-electron atom, and write each
HF single-particle solution ψα as the product of a nontrivial radial wavefunction
(determined numerically) times a spherical harmonic $Y_{l m_l}$. In the spherical approximation, one-electron wavefunctions are labeled by hydrogen-like quantum numbers
α = (n, l, ml, ms): here angular quantum numbers l, ml label exactly the same angular dependence (31), (32) as the one-electron atom’s, while the radial wavefunction
$R_{nl}(r)$ has now a nontrivial radial dependency (determined self-consistently). Like
for the 1-electron atom, the number of radial nodes (n−l−1) defines n.
The N-electron ground state is built by filling the single-electron levels starting
from 1s, 2s, ... upward. In the spherical approximation, one finds that the self-consistent potential felt by each electron has the correct behavior for large r (namely,
$(N-1-Z)\,e^2/r$) and for small r (namely, $-Ze^2/r$), as illustrated in Fig. 1.21. As the
potential does not have a simple Coulomb shape, the single-electron levels are not found
at the positions (30) of the 1-electron atom, and in particular their energy depends
explicitly on l, not only on n. Indeed, an ns orbital, with larger probability than
np close to the origin where the effective HF potential is more strongly attractive,
is placed lower in energy. Thus the Hartree-Fock method accounts quite naturally
for the observed l-ordering ns, np, nd ... of the single-electron levels observed in the
atomic spectra (e.g. for He in Fig. 1.19).
Figure 1.22. Radial properties of a spherically symmetrical atom
(Ar, N = Z = 18) computed by means of the HF self-consistent
method. (a) The radial probability distribution for the filled single-electron states. Note that the characteristic radius of the innermost
shell n = 1 is $\approx a_0/Z$, while the outer filled shell (n = 3) is slightly
larger than $a_0$. (b) The total radial probability distribution P(r) and
effective charge Z(r) specifying the effective potential.
Figure 1.22 reports the filled single-electron HF radial wavefunctions, for the Ar
atom. The typical radii of the individual shells vary very substantially with n (from
≈ a0 /Z of 1s, to slightly more than a0 for the 3p valence shell), so that peaks related
to the 3 filled shells emerge clearly in the total probability distribution. Accordingly,
the n-dependence of the shell energy is faster for many-electron atoms than in the
one-electron atom.
The independent-electron self-consistent spherical-field model has become more
than a simple approximation: it provides the basic language of atomic physics.
Physicists are well aware that a single-determinant configuration is only an approximation of the actual eigenstate of an atom, but, in practice, atomic states are
routinely labeled by the electron occupancies of single-electron orbitals best representing the actual many-electron state. For example, the standard notation for the
electronic ground-state configuration of Mg is $1s^2 2s^2 2p^6 3s^2$.
1.2.5. The periodic table. The HF theory permits us to understand the
ground state (and even many excitations) of many-electron atoms and ions. With the
HF method, the $1s^2$ ground configuration of He has total energy $E^{\mathrm{tot}}_{1s^2} = -77.8$ eV,
about 1.2 eV above the measured ground-state energy; the radial dependency of
the one-electron wavefunction is of course nonhydrogenic. In Lithium, a third
electron adds into 2s (configuration $1s^2 2s$). The HF ground-state total energy is
$-7.4328\,E_{Ha}$ [?], to be compared with the experimental (1st + 2nd + 3rd) ionization
energy $7.4755\,E_{Ha}$, with an error of about 1 eV. The first ionization potential can be
computed as the difference between the total energy given by a self-consistent calculation for
the positive ion, N = Z−1, and that of the neutral atom: for Li one obtains an ionization potential of 5.34 eV, in
good accord with the experimental value 5.39 eV. This value is much smaller than
that of He (24.59 eV). The reason is that the 2s shell is much more weakly bound
than 1s. Beryllium has a $1s^2 2s^2$ ground state. For all these atoms (N ≤ 4) involving
only s orbitals, the spherical approximation is appropriate.
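The size of the HF errors follows from simple differences of the numbers quoted in this paragraph. The sketch below only retraces that arithmetic; the conversion factor $E_{Ha} \simeq 27.2114$ eV is the one ingredient added here.

```python
E_HA_EV = 27.2114                           # Hartree energy in eV

# Total energies quoted in the text
he_hf_ev, he_exp_ev = -77.8, -79.00         # He, in eV
li_hf_ha, li_exp_ha = -7.4328, -7.4755      # Li, in E_Ha

he_error_ev = he_hf_ev - he_exp_ev
li_error_ev = (li_hf_ha - li_exp_ha) * E_HA_EV

# First ionization potential of Li: HF versus experiment (eV)
li_ip_hf, li_ip_exp = 5.34, 5.39
ip_relative_error = (li_ip_exp - li_ip_hf) / li_ip_exp

print(f"He: HF total energy is {he_error_ev:.2f} eV above experiment")
print(f"Li: HF total energy is {li_error_ev:.2f} eV above experiment")
print(f"Li: ionization potential off by {100 * ip_relative_error:.1f} %")
```

The ionization potential comes out more accurate than either total energy because it is a difference of two HF energies whose errors largely cancel.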
Starting from Boron, electrons progressively occupy a degenerate p shell: as the p
shell is not spherically symmetric, the spherical approximation for the self-consistent
field is questionable. The 2p shell is completely filled as Neon (N = Z = 10),
the next noble gas, is reached. Again Ne is a spherically symmetrical atom, since
$\sum_m |Y_{lm}(\theta, \varphi)|^2$ is independent of θ and ϕ. The ionization potential of Ne is again
very large, but not as much as that of He (see Fig. 1.23). The next atom, Na,
involves one electron in the 3s shell, which is located much higher in energy than
2p. Again the ionization potential has a dip, as shown in Fig. 1.16, which can be
interpreted as the starting of a new shell which is only weakly bound. As Z = N
further increases, the filling of the n = 3 shell proceeds fairly smoothly, with 3s and
3p becoming more and more strongly bound until the next noble gas Ar is reached.⁷
For Z this large, the l-dependence of the single-particle HF energy is so strong that
the HF self-consistent field puts the 4s shell lower than 3d. Indeed experiment shows
that potassium has ground state $1s^2 2s^2 2p^6 3s^2 3p^6 4s$ rather than $1s^2 2s^2 2p^6 3s^2 3p^6 3d$.
The physical properties of this atom are similar to those of other alkali metals (Li,
Na).
7
For Ar, Z = 18, the HF approximation finds a total energy of $-526.817\,E_{Ha}$, which is
$0.791\,E_{Ha}$ = 21.5 eV in excess of the experimental energy [?]. The absolute error is rather large,
which indicates that the neglect of explicit correlations in the electronic motion is a serious drawback of HF. However, the relative error is ∼ 0.15% only, and the excess energy per electron amounts
to approximately 1 eV, indicating that this mean-field approximation captures the bulk of Coulomb
energy.
Figure 1.23. The observed position of the ground state and lowest
excitations, and the spread of the atomic multiplets due to Coulomb
exchange (and, to a tiny extent, spin-orbit) for Z ≤ 11. The energy
of the singly ionized atom sets the zero of this scale (top of the figure).
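The spherical symmetry of a completely filled p shell, invoked above for Ne, can be checked numerically. The sketch below is a minimal illustration for l = 1 only: it writes the three spherical harmonics $Y_{1m}$ explicitly (standard expressions, with the usual phase convention) and verifies that the sum of their square moduli is the same in every direction, whereas a single $|Y_{10}|^2$ is not. The function names are choices made for this example.

```python
import numpy as np

def y10_density(theta):
    """|Y_10|^2 alone: depends on the polar angle theta."""
    return 3.0 / (4.0 * np.pi) * np.cos(theta) ** 2

def p_shell_density(theta, phi):
    """Sum over m = -1, 0, 1 of |Y_1m(theta, phi)|^2 from the explicit l = 1 harmonics."""
    y10 = np.sqrt(3.0 / (4.0 * np.pi)) * np.cos(theta)
    y1p = -np.sqrt(3.0 / (8.0 * np.pi)) * np.sin(theta) * np.exp(1j * phi)
    y1m = np.sqrt(3.0 / (8.0 * np.pi)) * np.sin(theta) * np.exp(-1j * phi)
    return np.abs(y10) ** 2 + np.abs(y1p) ** 2 + np.abs(y1m) ** 2

theta, phi = np.meshgrid(np.linspace(0.0, np.pi, 19),
                         np.linspace(0.0, 2.0 * np.pi, 37), indexing="ij")
filled = p_shell_density(theta, phi)
single = y10_density(theta)

print("filled p shell: min, max =", filled.min(), filled.max())   # both equal 3/(4 pi)
print("single Y_10   : min, max =", single.min(), single.max())
```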
The 3d shell is then filled after 4s but before 4p, although some inversions (as in
Cr and Cu) indicate that 4s and 3d are almost degenerate at the HF level, and more
subtle effects of electron correlation play an interesting role. Further intersections
associated to a strong l-dependence of energy occur as 4d, 4f and 5d are being filled,
as listed in the periodic table. Similar properties of all elements with a given number
of s or p electrons in the outermost shell suggest the arrangement of all atoms in
the periodic table. The “low-energy” properties of atomic materials with incomplete
d shells (transition metals) and f shells (lanthanides) are relatively similar in each
group.
The size of the atoms (a not especially well defined property) computed with HF is
in relatively good accord with the empirical trends of Fig. 1.17. In particular, noble
gases are especially small and alkali atoms especially large relative to other atoms of
similar Z; on the whole, the size of neutral atoms tends to increase slowly with Z.
Cations (=positive ions) can be produced with any charge Z − N : the size of these
species decreases as shells get emptied and screening is less and less effective. Usually
(but not always) an ion with N electrons has the same electronic configuration as
the atom with Z = N (this is especially true for small charging N ≃ Z). However
all single-electron wavefunctions shrink closer to the nucleus in the cation than in
the neutral atom. Anions (=negative ions) are stable in gas phase only for certain
atoms and only up to a maximum charging of 1 electron (i.e. N ≤ Z + 1). The HF
model signals that some ionic state is not stable by never reaching self-consistency.
The results of HF are often in accord with experiment. For example all halogens (F,
Cl, Br, I, At), thanks to their almost complete $np^5$ shell lying relatively “deep” in energy
(Fig. 1.23), have positive electron affinity (defined as the ionization potential
of the negative ion), which means that their negative ion is stable against loss of the
extra electron.
Besides ionic states, HF also permits computing (to some extent) excited states
and excitation energies. For example, after computing the ground-state properties of
Na, by filling the N = 11 lowest single-electron levels as $1s^2 2s^2 2p^6 3s$, one could run a
new self-consistent calculation putting the 11th valence electron in 3p rather than in
3s (configuration $1s^2 2s^2 2p^6 3p$). The self-consistent field turns out different, and the
total energy is larger. The difference in total energy between the two calculations is
a fair estimate of the excitation energy, here of the 3s→3p transition of Na, about
2 eV, see Fig. 1.26 below.
1.2.6. Core levels and spectra. According to HF theory, the energy of the
deepest single-particle levels drops very rapidly with increasing Z, essentially as
$\propto -Z^2$. Indeed, in the independent-electrons language, one could conceive exciting
not just the shallow levels with the smallest binding energy, but also the deep core
levels 1s, 2s ... For example for Na, a configuration such as $1s^1 2s^2 2p^6 3s^2$ could be
investigated, where an inner 1s electron has been promoted to the outer 3s shell.
This state involves a huge excitation energy (of the order of $40\,E_{Ha}$), and one might
suspect that such an enormously unbound state has no right to exist. Indeed, the
atom in this state has plenty of transitions that allow it to get rid of big chunks of this
excitation energy, e.g. to a substantially lower-energy state $1s^2 2s^2 2p^5 3s^2$. According
to Eq. (73), the decay transition rate is very large as it grows with the third power
of the energy associated to the transition, which dominates over the reduction in
dipole matrix element due to the small size of the initial shell, to a total $\sim Z^{4 \div 5}$
dependence; see Eq. (84). Accordingly, the lifetime broadening of core states may
be huge, exceeding $\hbar\gamma \approx 1$ eV. Despite such huge broadening, core-hole states are
not just a theoretical prediction of the independent-electron model, but they are
routinely observed in UV and X-ray spectroscopies.
Figure 1.24. The observed absorption coefficient of all atoms in the
third row of the periodic table, showing, for increasing Z, the regular
displacement of the K edge, and buildup of the L edge. [Plot: X-ray absorption
coefficient versus energy (0 to 4000 eV), with curves labeled Na, Mg, Al, Si, P, S,
Cl, Ar (Z = 11 to 18).]
Like optical spectroscopies, many experiments probing core states can be classified
as absorption or emission, with the same conceptual scheme of Figs. 0.5 and 0.6.
Absorption data (Fig. 1.24) show a remarkable regularity of the spectra above ≈
100 eV, and a systematic change of peak positions and intensities as Z is increased.
A characteristic feature of X-ray absorption spectra is the asymmetry of the peaks,
which show a sharp edge on the low-energy side and a broad slow decrease on the
high-energy side. The reason for the edge is that below the minimum excitation
energy for the core state no absorption takes place, while above threshold, the
core electron may be promoted to several empty bound and unbound states of the
atom in gas phase, or of the condensed phase. The slow decrease above edge is
due to an increase of the kinetic energy of the ejected electron (which equals the
difference between the absorbed photon energy and the energy of the atomic core
state): the final state becomes increasingly orthogonal to the initial core level, and
correspondingly the dipole matrix element (74) decreases. For this same reason, an
X-ray photon hitting an atom is much more likely to extract a core electron than a
weakly bound outer-shell electron.
Figure 1.25. The observed highest-energy core levels for uranium,
their labeling in terms of the hole quantum numbers n, l and j, and
the dipole-allowed transitions among them. Within each shell, note
the huge l-related and spin-orbit splittings.
Emission spectra show the same simplicity and regularity as absorption spectra.
Atoms are excited initially, typically by collisions with high-energy electrons. The
subsequent emission involves transitions only from levels for which enough energy
is made available by excitation. For example, if 2 keV electrons are used to excite
the sample, emission involving the 1s shell is observed for all Z ≤ 14 (Si), but not
for P and higher-Z atoms (see Fig. 1.24).
Yet another needless traditional notation haunts core states and X-ray spectra:
a hole (=missing electron) in shell n = 1, 2, 3, 4, ... is labeled K, L, M, N, ...
The substructures related to states of different l and j acquire a Roman subscript
(e.g. $L_{III}$ for $2p\ {}^2P_{3/2}$), as in Fig. 1.25. Dipole-allowed transitions in emission are
organized in series according to the initial shell, with a Greek-letter subscript for the
final shell. For example, the transition K → L (in other words the decay $1s\,2s^2 2p^6 \dots \to 1s^2 2s^2 2p^5 \dots$)
is called the $K_\alpha$ emission line, K → M is $K_\beta$, and L → M is $L_\alpha$.
In the days of the great discoveries of chemistry and physics, when the structure and classification of atoms were being understood, H. G.-J. Moseley acquired
and compared characteristic emission spectra of many elements: he showed that
the $K_\alpha$ inverse wavelength (equivalently, energy) is roughly proportional to $Z^2$. It
approximately fits a law:
\[
\frac{1}{\lambda_{K\alpha}} \approx C\,(Z - a)^2 . \tag{107}
\]
In this phenomenological dependence, the proportionality constant C is close to the
Rydberg constant (see Fig. 1.4), and the quantity a, accounting for screening, is
approximately 2. The discovery of this regularity made it possible to identify the correct
value of Z of each atomic species, thus correcting several mistakes in the early
periodic table.
The decent accuracy of Moseley’s fit suggests that one could estimate the energy positions of the core levels (within, say, 20%) without going through a full
self-consistent HF calculation. Indeed, the core states may be thought of as approximately hydrogenic states in an effective $-Z_{eff}\,e^2/r$ Coulomb potential, where the
value of $Z_{eff}$ is determined by the average of the radial distribution of the single-electron wavefunction at hand related to the effective potential curve, like that of
Fig. 1.21. Accordingly, core energies of a shell can be estimated by means of Eq. (30),
taking for Z a $Z_{eff} \simeq Z - 2$ for the K shell, $Z_{eff} \simeq Z - 10$ for the L shell, and in
general $Z_{eff} \simeq Z -$ (number of electrons in inner shells up to and including the target
shell).
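This screened-hydrogenic recipe is easy to try out on the third-row atoms of Fig. 1.24. The sketch below is a rough illustration under two assumptions: the hydrogenic level formula is taken in the form $-(E_{Ha}/2)\,Z_{eff}^2/n^2$ (the same form appearing in Eq. (103)), and all shells up to the target one are assumed completely filled with $2n^2$ electrons each, which gives the screening values 2 and 10 quoted above. The function names are choices made for this example.

```python
E_HA_EV = 27.2114                      # Hartree energy in eV

def screened_charge(Z, n):
    """Z_eff = Z - (electrons in shells 1..n), assuming those shells are completely filled."""
    return Z - sum(2 * k**2 for k in range(1, n + 1))

def core_level_energy_ev(Z, n):
    """Hydrogen-like estimate -(E_Ha / 2) Z_eff^2 / n^2 of a core level, in eV."""
    z_eff = screened_charge(Z, n)
    return -0.5 * E_HA_EV * z_eff**2 / n**2

third_row = {"Na": 11, "Mg": 12, "Al": 13, "Si": 14, "P": 15, "S": 16, "Cl": 17, "Ar": 18}

for name, Z in third_row.items():
    print(f"{name:2s} (Z = {Z}): K level at about {core_level_energy_ev(Z, 1):8.0f} eV")

# Atoms whose estimated K-shell binding energy is below 2 keV
reachable = [name for name, Z in third_row.items() if -core_level_energy_ev(Z, 1) < 2000.0]
print("K hole reachable with 2 keV:", reachable)
```

With these estimates a 2 keV excitation can create a 1s hole up to Si but not in P, consistent with the emission example given earlier in this section.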
Nowadays, X-ray spectroscopies are routinely used in a number of ways, including use as position-sensitive analytic tools, local probe of the near chemical environment of different atomic species, and many others. Many more application
examples of X-ray spectroscopies are reported at http://www.esrf.fr/ and at
http://www.elettra.trieste.it/.
1.2.7. Optical spectra. The simplicity of the core spectra is rapidly lost when
the optical and soft-UV range is explored. Different atoms show completely different
spectra, but some regularity, a few qualitative statements, and many similarities can
be recognized.
The main observation necessary to understand optical spectra (and core transitions as well) is that the dipole operator driving the transition in a many-electron
context is the sum of the individual dipoles of the single electrons:
\[
\vec d = \sum_i \vec d_i = -q_e \sum_i \vec r_i . \tag{108}
\]
The operator $\vec r_i$ acts on a N-electron wavefunction by multiplying it by the position
coordinate of the i-th electron. In the independent-electron approximation, which,
we have seen, works fairly well for atoms in the HF self-consistent model, the matrix
element of one such operator between two properly antisymmetrized states (89) is
simply
\[
\langle \alpha_1, \dots \alpha_N|^A \sum_i \vec d_i\, |\beta_1, \dots \beta_N\rangle^A =
\sum_i \sum_P \langle \alpha_{P_i}|\vec d_i|\beta_i\rangle\, \langle \alpha_{P_1}|\beta_1\rangle \dots \langle \alpha_{P_{i-1}}|\beta_{i-1}\rangle \langle \alpha_{P_{i+1}}|\beta_{i+1}\rangle \dots \langle \alpha_{P_N}|\beta_N\rangle .
\]
Here, all the overlap integrals vanish unless all $\alpha_{P_k} = \beta_k$, under the assumption that the
single-electron states composing the initial states are essentially equal to those composing the final states. Furthermore, also the single-particle dipole matrix element
$\langle \alpha_{P_i}|\vec d_i|\beta_i\rangle$ vanishes, unless it satisfies the single-electron dipole selection rules. In
other terms, for N electrons, dipole allowed transitions occur only between states
where one single electron makes a dipole transition, and the other N−1 electrons
remain in the initial single-particle state. Of course, any of the N electrons can
make the transition to any empty state. The angular part of the single-electron
wavefunction is again of type $Y_{lm}$ (spherical harmonics): the dipole selection rules
(∆l = ±1, ∆s = 0) derived for the one-electron atom in Sec. 1.1.10 continue to hold
for the single electron making the transition. Accordingly, a few examples of allowed
transitions of Be follow: $1s^2 2s^2 \to 1s^2 2s\,2p$, $1s^2 2s\,2p \to 1s^2 2s\,4d$, and $1s^2 2s^2 \to 1s\,2s^2 3p$;
and a few examples of forbidden transitions: $1s^2 2s^2 \not\to 1s^2 2s\,3d$, $1s^2 2s^2 \not\to 1s^2 2p^2$, and
$1s^2 2s^2 \not\to 1s\,2s\,2p\,3p$. Further selection rules (Sec. 1.2.7.2) affect the total angular-momentum quantum numbers obtained by coupling the spins and orbital angular
momenta of individual electrons.
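The configuration-level rule stated above (exactly one electron changes orbital, with ∆l = ±1) lends itself to a small checker. The sketch below is a hypothetical helper written for this illustration: configurations are typed as strings such as "1s2 2s 2p", only single-digit principal quantum numbers are handled, and the further rules on total angular momenta are deliberately ignored.

```python
from collections import Counter

L_VALUE = {"s": 0, "p": 1, "d": 2, "f": 3}

def parse(configuration):
    """Turn a string such as '1s2 2s 2p' into orbital occupancies."""
    occupancy = Counter()
    for term in configuration.split():
        orbital, count = term[:2], term[2:]
        occupancy[orbital] += int(count) if count else 1
    return occupancy

def dipole_allowed(initial, final):
    """True if exactly one electron changes orbital and its l changes by +-1."""
    start, end = parse(initial), parse(final)
    emptied = start - end          # Counter subtraction keeps only positive counts
    filled = end - start
    if sum(emptied.values()) != 1 or sum(filled.values()) != 1:
        return False               # zero, or more than one, electron changes orbital
    (source,), (target,) = emptied, filled
    return abs(L_VALUE[source[1]] - L_VALUE[target[1]]) == 1

# The Be examples listed in the text
examples = [("1s2 2s2", "1s2 2s 2p"), ("1s2 2s 2p", "1s2 2s 4d"), ("1s2 2s2", "1s 2s2 3p"),
            ("1s2 2s2", "1s2 2s 3d"), ("1s2 2s2", "1s2 2p2"), ("1s2 2s2", "1s 2s 2p 3p")]
for initial, final in examples:
    verdict = "allowed" if dipole_allowed(initial, final) else "forbidden"
    print(f"{initial:10s} -> {final:12s}: {verdict}")
```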
1.2.7.1. Alkali atoms. In the alkalis, the last complete np shell lies rather deep
in energy (typically a few tens eV), while the outermost (partly) occupied (n+1)s
level is very shallow (from 5.4 eV in Li to 3.9 eV in Cs). The ground state symmetry
is 2 S1/2 , like that of H, where now the label 2S+1 [L]J refers to total angular momenta
which, for the alkalis, equal the angular momenta of the outer (“optical”) electron,
since the close shells (=completely filled shells) contribute null spin and null orbital
angular momentum. All excitations of the inner-shells electrons are core excitations,
involving energies much larger than the first ionization energy.
Electrons in the inner shells remain essentially “frozen” in a spherically symmetric
core providing an effective potential (Fig. 1.21) for the motion of the outer electron,
which is the responsible for all low-energy excitations of the alkali atoms. The
Figure 1.26. (a) The observed level scheme of the first alkali-metal
atoms, compared to that of H. (b) Energy-level diagram of Na atom,
with some transitions indicated by arrows, with the associated wavelengths measured in Å=0.1 nm. (c) Fine structure of the lowest Na
levels due to spin-orbit interaction. The zero of the energy scale is set
by (a) the ground state of the ion A+ , and (b,c) the ground state of
the neutral atom.
associated spectrum of excitations resembles that of a one-electron atom, the main
difference being the significant energy gaps between states characterized by the same
n but different l. In particular, the separation of (n+1)s and (n+1)p is especially
large, of the order of 2 to 3 eV, and gives rise to a characteristic transition in the visible
spectrum. This (n+1)s→ (n+1)p transition is especially strong due to the large dipole
matrix element involving strongly overlapping and fairly extended wavefunctions.
Figure 1.26 reports a typical level scheme and several optical transitions.
Spin-orbit affects all non-s states. The natural generalization of Eqs. (60) and
(63) to a generic radial potential yields a microscopic estimate of the relevant ξ for
a given shell:
(109) $\xi = \frac{\hbar^2}{2 m_e^2 c^2}\,\langle n,l|\,\frac{1}{r}\frac{dV_{\rm eff}(r)}{dr}\,|n,l\rangle = Z_{\rm eff\,s-o}^4\,\alpha^2 E_{\rm Ha}\,\frac{1}{n^3\, l(l+1)(2l+1)}$,
where $Z_{\rm eff\,s-o}$ is implicitly defined by this equation and provides rough estimates
of ξ. Due to the strong localization of the mean field $\frac{1}{r}\frac{dV_{\rm eff}(r)}{dr}$ near the origin, the
effective charge $Z_{\rm eff\,s-o}$ is usually larger than that introduced above for the estimate
of the level position: $Z_{\rm eff} < Z_{\rm eff\,s-o} < Z$. The spin-orbit splitting of the levels shows
up as splittings of all optical transitions. The observed spin-orbit splittings of the lowest
excited p state of the alkali atoms are:
Element                               Li      Na     K      Rb     Cs
Z                                     3       11     19     37     55
Single-electron excited level         2p      3p     4p     5p     6p
Spin-orbit splitting (3/2) ξ [meV]    0.042   2.1    7.2    29.5   68.7
Zeff s−o                              0.98    3.5    6.0    10.0   14.2
Remarkably, the spin-orbit splitting of Li 2p is smaller than that of H 2p: the reason
is that most of the Li 2p wavefunction lies well outside the compact 1s screening
shell. The fine-structure splittings of higher excited non-s states are smaller than
those reported here (Fig. 1.26c).
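The last row of the table can be recovered from the observed splittings by inverting Eq. (109). The sketch below is a minimal illustration, assuming the standard values α ≈ 1/137.036 and E_Ha ≈ 27.2114 eV (these constants and the function names are introduced here, not taken from the notes), and using the fact that the splitting between j = l + 1/2 and j = l − 1/2 equals (l + 1/2) ξ, i.e. (3/2) ξ for a p level.

```python
ALPHA = 1 / 137.036        # fine-structure constant (assumed standard value)
E_HA_MEV = 27.2114e3       # Hartree energy in meV (assumed standard value)

def xi_mev(z_eff, n, l):
    """Spin-orbit parameter of Eq. (109), in meV, for a given effective charge."""
    return z_eff**4 * ALPHA**2 * E_HA_MEV / (n**3 * l * (l + 1) * (2 * l + 1))

def z_eff_from_splitting(splitting_mev, n, l=1):
    """Invert Eq. (109): effective charge from the observed doublet splitting."""
    xi = splitting_mev / (l + 0.5)     # splitting = (l + 1/2) * xi
    return (xi * n**3 * l * (l + 1) * (2 * l + 1) / (ALPHA**2 * E_HA_MEV)) ** 0.25

# (principal quantum number of the lowest excited p level, splitting in meV)
observed = {"Li": (2, 0.042), "Na": (3, 2.1), "K": (4, 7.2),
            "Rb": (5, 29.5), "Cs": (6, 68.7)}
z_eff = {element: z_eff_from_splitting(s, n) for element, (n, s) in observed.items()}
```

Within rounding, the resulting values agree with the tabulated effective charges, which grow far more slowly than the nuclear charge Z.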
Within the HF model, Eq. (109) allows one to evaluate the spin-orbit energy ξ for
any shell of any atom, not just for the excited shells of alkali atoms. For example, the
large effective Zeff s−o accounts for the colossal spin-orbit splittings (tens or hundreds
of eV) of the core shells of heavy atoms, observed in X-ray spectra (e.g. the LII–LIII
splitting, Figs. 1.24 and 1.25).
1.2.7.2. Atoms with incomplete degenerate shells. The occupancies of the single
orbitals are often insufficient to determine many properties of the atomic ground
state, e.g. its total angular momentum J, thus its degeneracy. The ground-state
symmetry of noble gases, alkaline earths and, in general, all atoms in closed-shell configurations, including Zn, Cd, Hg, Yb, and No, is trivial: these atoms all qualify
as nondegenerate spherically symmetric $^1S_0$. Likewise, the ground state of alkali
metals is simply $^2S_{1/2}$, with a twofold degeneracy associated with spin. Another similar and relatively simple case is that of B and Sc (and atoms of the same groups
IIIB and IIIA), where a single electron occupies a degenerate p or d shell, while all
inner shells are complete: here the spin and orbital angular momentum equal those
of the lone electron. Spin-orbit splits $J = L - \frac{1}{2}$ and $J = L + \frac{1}{2}$, putting $L - \frac{1}{2}$
lower. Accordingly, the ground state of B is $^2P_{1/2}$ and that of Sc is $^2D_{3/2}$. The last
relatively simple class is that of the halogen atoms (group VIIB, $p^5$ configuration),
characterized by a single hole in an otherwise full shell. Here, not surprisingly, the
single hole carries the same spin and orbital angular momentum ($S = \frac{1}{2}$ and $L = 1$)
as a single electron in that shell. Interestingly, the effective spin-orbit interaction of
the hole is reversed in sign.$^8$ Once this is agreed upon, the ground-state symmetry
of all halogens is $^2P_{3/2}$. Similarly, the symmetry of the ground state of Tm ($4f^{13}$) is
$^2F_{7/2}$.
A more intricate situation occurs when several electrons occupy a degenerate shell,
but are too few to fill it completely. The resulting independent-particle quantum
state is highly degenerate, in terms of one-electron energetics. For example, in carbon two (identical) electrons occupy a p shell in 6·5/2 = 15 possible ways. In general,
N electrons occupy d = 2(2l + 1) degenerate spin-orbitals in $\binom{d}{N} = \frac{d!}{N!\,(d-N)!}$ different
ways, corresponding to physically different orthogonal quantum states. This
large degeneracy is partly lifted by (i) the residual electron-electron interaction (i.e.
the energy differences, ignored at the HF level, related to the correlated electronic
motion), and (ii) the spin-orbit interaction.
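The counting of independent-particle states can be verified by brute force. In the following minimal sketch (the function names are illustrative choices made here), each spin-orbital is labeled by a pair (m_l, m_s), and Pauli's principle is enforced by picking N distinct spin-orbitals irrespective of order.

```python
from itertools import combinations
from math import comb

def spin_orbitals(l):
    """The d = 2(2l+1) spin-orbitals (m_l, m_s) of a shell of given l."""
    return [(m_l, m_s) for m_l in range(-l, l + 1) for m_s in (+0.5, -0.5)]

def microstates(l, n_electrons):
    """All ways to place n identical electrons in distinct spin-orbitals."""
    return list(combinations(spin_orbitals(l), n_electrons))

p2_states = microstates(1, 2)   # two electrons in a p shell, as in carbon
d3_states = microstates(2, 3)   # three electrons in a d shell
```

The explicit enumeration gives 15 states for p² and 120 for d³, in agreement with the binomial coefficient.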
In the outer atomic shells, Coulomb energies are of the order of one to several
eV (like for He, Sec. 1.2.3), while for Z smaller than about 30 spin-orbit is a comparatively small interaction (ξ ≪ 1 eV), which we initially neglect. The residual
Coulomb interaction (not accounted for by the HF self-consistent field) is spherically symmetric: it thus commutes with the total $|\vec{S}|^2$ and $|\vec{L}|^2$. Figure 1.27 provides a
complete list of the “multiplet” states labeled by S and L.
The residual Coulomb interaction acts on HF states like the full Coulomb repulsion in He (Sec. 1.2.3): first of all, it splits states of different total spin S: low-spin
states sit higher in energy, as the electrons move, on average, closer to one another.
The ground state will therefore have the highest possible spin: this result is in accord
with the empirical first Hund rule.
In degenerate configurations there often occur several states of the same spin,
but different total angular momentum L. Coulomb repulsion is generally larger for
8 Observe that the total spin-orbit operator for N electrons in a degenerate shell is $H_{\rm s-o} = \xi \sum_{i=1}^{N} \vec{l}_i \cdot \vec{s}_i$. As the shell completes with d = 2(2l + 1) electrons, it is convenient to rewrite
$H_{\rm s-o} = \xi \sum_{i=1}^{d} \vec{l}_i \cdot \vec{s}_i - \xi \sum_{i=N+1}^{d} \vec{l}_i \cdot \vec{s}_i = -\xi \sum_{i=N+1}^{d} \vec{l}_i \cdot \vec{s}_i$. The first term vanishes, due to closed-shell
cancellation. The residual term, of opposite sign, represents the spin-orbit interaction of “missing”
electrons, named holes. Halogen atoms have N = d − 1 electrons, thus 1 hole in a p valence shell,
which behaves as an electron with reversed spin-orbit $\xi_{\rm eff} = -\xi < 0$.
Figure 1.27. All terms in which Coulomb correlation splits a degenerate $[l]^N$ configuration, for l = 0, 1, 2, 3 and all possible fillings N.
When several states with the same $^{2S+1}[L]$ occur in the configuration,
the number of occurrences is written under the letter [L] denoting
the total orbital angular momentum. In LS (Russell-Saunders) coupling, weak spin-orbit interaction further splits states characterized
by different J, e.g. $^2P \to {}^2P_{1/2},\ {}^2P_{3/2}$, not detailed here.
states with low L. Accordingly, the ground state is the state with the highest possible
L among those with maximum S. This is known as the second Hund rule.
Finally, once the total L and S are determined, the hitherto neglected spin-orbit
interaction couples them together to a total angular momentum J. The allowed
values of J are given by the usual rule (47), and the question of which of them is
lowest in energy is decided by the sign of the effective spin-orbit parameter for that
partly filled shell. While the true spin-orbit parameter is necessarily positive (see
Eq. (109)), the effective parameter for the coupling of the total $\vec{L}$ with the total $\vec{S}$ may as
well be negative, as discussed above for the halogens. Indeed the sign of the effective
spin-orbit reverses when more than 2l + 1 electrons occupy a shell where 2(2l + 1)
electrons can fit. Accordingly, the third Hund rule states that the ground state has
[Figure 1.28 diagram, levels from top to bottom, with degeneracies in parentheses: 6 singlet states, (1) $^1S_0$ and (5) $^1D_2$; 9 triplet states, (5) $^3P_2$, (3) $^3P_1$, and (1) $^3P_0$.]
Figure 1.28. The correct labeling and ordering of the $p^2$ multiplets
according to LS coupling and Hund’s rules. Degeneracies 2J + 1 are
indicated at the left.
J = |L − S| for less than half-filled shell and J = L + S when the shell is more than
half filled.
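The three Hund rules can be condensed into a few lines of code for a single partly filled shell. The sketch below is an illustration under the assumptions of this section (one incomplete shell, LS coupling); the function names and the letter sequence for L are choices made here. It fills spin-up orbitals first, starting from the largest m_l, and then applies the third rule to select J.

```python
from fractions import Fraction

L_LETTERS = "SPDFGHIKLMNO"   # spectroscopic letters for L = 0, 1, 2, ...

def hund_ground_term(l, n):
    """Return (S, L, J) of the ground term of an l^n configuration."""
    m_values = list(range(l, -l - 1, -1))        # m_l = l, l-1, ..., -l
    n_up = min(n, 2 * l + 1)                     # rule 1: maximize S
    n_down = n - n_up
    S = Fraction(n_up - n_down, 2)
    # rule 2: occupy the largest available m_l values, maximizing L
    L = abs(sum(m_values[:n_up]) + sum(m_values[:n_down]))
    # rule 3: J = |L - S| up to half filling, J = L + S beyond it
    J = abs(L - S) if n <= 2 * l + 1 else L + S
    return S, L, J

def term_symbol(S, L, J):
    """Plain-text label such as '3P0' or '2P3/2'."""
    return f"{int(2 * S + 1)}{L_LETTERS[L]}{J}"

ground_terms = {
    "B (2p1)": term_symbol(*hund_ground_term(1, 1)),
    "C (2p2)": term_symbol(*hund_ground_term(1, 2)),
    "Sc (3d1)": term_symbol(*hund_ground_term(2, 1)),
    "halogens (p5)": term_symbol(*hund_ground_term(1, 5)),
    "Tm (4f13)": term_symbol(*hund_ground_term(3, 13)),
}
```

The results reproduce the ground-state symmetries quoted earlier in this section for B, Sc, the halogens, and Tm, as well as the $^3P_0$ ground state of the p² configuration.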
In the $p^2$ example mentioned above, the fifteen states arrange themselves in triplets
$^3P_0$, $^3P_1$, $^3P_2$, and singlets $^1D_2$, $^1S_0$. No other states are compatible with Pauli’s
principle [?, ?]. According to Hund’s rules, the ordering of these states is illustrated
in Fig. 1.28. Similar qualitative ordering of the levels, as given by Hund’s rules, is
observed also in configurations involving several incomplete shells, as in Fig. 1.29.
For low-Z atoms, Fig. 1.23 shows the ranges where the multiplets of states corresponding to each configuration are observed spectroscopically. Observe that the
lowest multiplets spread out by several eV, while more excited multiplets are much
less dispersed, as the residual Coulomb repulsion becomes weaker and weaker for
more extended wavefunctions.
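The statement that the fifteen p² states arrange themselves into $^3P$, $^1D$, and $^1S$ can be checked by the standard bookkeeping on (M_L, M_S) values. The sketch below is illustrative (the function name `ls_terms` is chosen here): it tabulates M_L and M_S for all Pauli-allowed microstates, then repeatedly picks the largest remaining M_L (and the largest M_S for it), identifies a term with L = M_L and S = M_S, and removes its (2L+1)(2S+1) entries from the table.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

def ls_terms(l, n):
    """Return {(S, L): number of occurrences} for the l^n configuration."""
    half = Fraction(1, 2)
    orbitals = [(m_l, m_s) for m_l in range(-l, l + 1) for m_s in (half, -half)]
    # Table of how many microstates have each (M_L, M_S).
    table = Counter()
    for state in combinations(orbitals, n):
        M_L = sum(m_l for m_l, _ in state)
        M_S = sum(m_s for _, m_s in state)
        table[(M_L, M_S)] += 1
    terms = Counter()
    while any(count > 0 for count in table.values()):
        L = max(M_L for (M_L, _), c in table.items() if c > 0)
        S = max(M_S for (M_L, M_S), c in table.items() if c > 0 and M_L == L)
        terms[(S, L)] += 1
        # Remove one state for every (M_L, M_S) belonging to this term.
        for M_L in range(-L, L + 1):
            for k in range(int(2 * S) + 1):
                table[(M_L, -S + k)] -= 1
    return dict(terms)

p2_terms = ls_terms(1, 2)
n_states = sum((2 * S + 1) * (2 * L + 1) for (S, L), mult in p2_terms.items()
               for _ in range(mult))
```

For p² the procedure returns one term each with (S, L) = (0, 2), (1, 1), and (0, 0), i.e. $^1D$, $^3P$, and $^1S$, whose degeneracies add up to 15.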
The described scheme of coupling all $\vec{s}_i$ together to a total $\vec{S}$ and all $\vec{l}_i$ together
to a total $\vec{L}$ (followed by spin-orbit coupling of $\vec{S}$ and $\vec{L}$ together) is called Russell-Saunders or LS coupling. It provides a satisfactory basis of coupled states for low-Z
atoms, where Coulomb exchange dominates over spin-orbit. For increasing Z the
spin-orbit interaction grows fast, while electron-electron repulsion weakens due to
spreading of the outer occupied orbitals. For very large Z ≥ 50 spin-orbit dominates:
$H_{\rm s-o}$ must be accounted for before Coulomb terms. Spin-orbit couples the spin and
orbital moment of each electron to an individual $j_i = l_i \pm \frac{1}{2}$. These individual total
angular momenta are then coupled to a total J by smaller Coulomb terms. This
ordering of the couplings of the angular momenta is called jj coupling, and provides
another basis for the many-electron states. While the LS basis is almost diagonal at
small Z, the jj basis is almost diagonal for large Z (see Fig. 1.30). For intermediate
Z, either basis could be used, but the matrix of Coulomb exchange plus spin-orbit
Figure 1.29. The conceptual sequence of splittings caused by interactions of decreasing intensity for LS multiplets. (a) The 6·(6−1)/2! =
15 states of two “equivalent” electrons in a np2 configuration, with the
splittings induced by an external magnetic field assumed in the Zeeman limit. (b) The 6·10 = 60 states of two “inequivalent” (different n
and/or l) electrons in a 3d4p configuration, as occurs in the spectrum
of excited Ti.
must be diagonalized to get proper eigenstates, as linear combinations of the states
of the chosen basis.
1.2.7.3. Many-electron atoms in magnetic fields. When an external magnetic
field is applied to a many-electron atom, it may behave in two very different ways,
according to whether the atom carries a magnetic moment or not. Atoms with total
Figure 1.30. Transition from LS coupling in carbon (Z = 6) to
intermediate coupling (Coulomb exchange and spin-orbit of the same
order) in germanium (Z = 32) to jj coupling in lead (Z = 82).
angular momentum J = 0 have no permanent magnetic dipole to align with the
field: the field induces a tiny magnetic moment, of order $\mu_B \frac{\mu_B B}{\Delta}$, where ∆ is the
energy gap between the ground state and the lowest excitation, but we will ignore
such tiny effects. Instead, open-shell atoms with J ≠ 0 carry a magnetic moment
$\vec{\mu} = -g_J\, \mu_B\, \vec{J}/\hbar$,
with a tendency to align to the field.
For LS coupling, the appropriate g-factor gJ is determined by Eq. (54), with the
total angular momenta J, L, and S in place of the single-electron j, l, and s. As
discussed in Sec. 1.1.11, this total magnetic moment, derived by the coupling of
orbital and spin contributions is relevant in the limit of weak external magnetic field
(Zeeman limit). In most practical experimental situations this is the relevant limit
for many-electron atoms, due to the $Z^4$ increase of the spin-orbit energy ξ, and the
maximum field accessible in the lab (of the order of 10 T). The opposite strong-field
(Paschen-Back) limit can be realized only for highly excited states, with an electron
close to dissociation, thus weakly affected by the nuclear field.
The simplest splitting pattern – three equally spaced lines, corresponding to
∆MJ = 1, 0 and −1 – is called regular Zeeman splitting. An example is shown
in Fig. 1.31. It occurs when the initial and final g-factors are equal (typically
g = gL = 1). This in principle occurs when either S = 0 or L = 0. However,
L = 0 is most unlikely, as the electron making the transition changes its l by unity,
Figure 1.31. Regular Zeeman spectrum, with its interpretation.
It only occurs for S = 0 states. For example, it is observed in the
$2s3d\ ^1D_2 \to 2s2p\ ^1P_1$ emission of Be.
Figure 1.32. Anomalous Zeeman spectrum in Na (left) and in Zn (right).
thus transitions Li = 0 → Lf = 0 occur very rarely. In practice, regular Zeeman
splitting is observed in optical transitions between spin-singlet states (S = 0). In
all other cases, the Zeeman spectrum shows more complicated patterns due to the
different initial and final g-factors (anomalous Zeeman spectrum, see Fig. 1.32).
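The difference between regular and anomalous Zeeman patterns can be made concrete by listing the line shifts, in units of μ_B B, for a few transitions. The sketch below is hedged on one point: Eq. (54) is not reproduced in this section, so the code assumes it has the standard Landé form with g_s = 2. The function names are illustrative. In the weak-field limit each sublevel shifts by g_J M_J μ_B B, and lines obey ∆M_J = 0, ±1.

```python
from fractions import Fraction

def lande_g(J, L, S):
    """Standard Lande g-factor with g_s = 2 (assumed form of Eq. (54))."""
    J, L, S = Fraction(J), Fraction(L), Fraction(S)
    if J == 0:
        return Fraction(1)   # irrelevant: the only sublevel has M_J = 0
    return 1 + (J * (J + 1) + S * (S + 1) - L * (L + 1)) / (2 * J * (J + 1))

def sublevels(J):
    """Allowed M_J values: -J, -J+1, ..., J."""
    J = Fraction(J)
    return [-J + k for k in range(int(2 * J) + 1)]

def zeeman_shifts(upper, lower):
    """Distinct line shifts (units of mu_B * B) for levels given as (J, L, S)."""
    g_up, g_low = lande_g(*upper), lande_g(*lower)
    same_J = Fraction(upper[0]) == Fraction(lower[0])
    shifts = set()
    for m_up in sublevels(upper[0]):
        for m_low in sublevels(lower[0]):
            if abs(m_up - m_low) > 1:
                continue                      # Delta M_J = 0, +1, -1 only
            if same_J and m_up == 0 and m_low == 0:
                continue                      # no 0 -> 0 when Delta J = 0
            shifts.add(g_up * m_up - g_low * m_low)
    return sorted(shifts)

singlet = zeeman_shifts((2, 2, 0), (1, 1, 0))               # 1D2 -> 1P1
na_d1 = zeeman_shifts(("1/2", 1, "1/2"), ("1/2", 0, "1/2"))  # 2P1/2 -> 2S1/2
na_d2 = zeeman_shifts(("3/2", 1, "1/2"), ("1/2", 0, "1/2"))  # 2P3/2 -> 2S1/2
```

With equal g-factors the singlet transition collapses to three equally spaced lines, while under these assumptions the two components of the Na doublet split into four and six lines respectively.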
Many-electron atoms can also be introduced in Stern-Gerlach apparatuses (Sec. 1.1.5),
in order to investigate the magnetic moment of their ground state. The amount of
deflection measures the ẑ component of the magnetic moment, thus the Landé g-factor $g_J$, according to Eq. (45):
(110) $F_z = \mu_z \frac{\partial B_z}{\partial z} = -\mu_B\, g_J\, M_J\, \frac{\partial B_z}{\partial z}$.
The number of sub-beams into which the inhomogeneous field splits the original
beam measures directly the number of allowed $M_J$ values, i.e. the ground-state
degeneracy 2J + 1.
1.2.8. Dipole selection rules. As discussed at the beginning of Sec. 1.2.7, the
main selection rule requires that a single electron changes state to another state
satisfying ∆l = ±1. Several dipole selection rules for the total quantum numbers J,
L, and S of many-electron atoms in LS coupling also apply. They are summarized
below:
(111) Parity changes
(112) ∆S = 0
(113) ∆L = 0, ±1
(114) ∆J = 0, ±1 (no 0 → 0 transition)
(115) $\Delta M_J$ = 0, ±1 (no 0 → 0 transition if ∆J = 0).
As both S and L are good quantum numbers only in the limit of very small spin-orbit, in practice selection rules (112) and (113) are only approximate.
Figure 1.30 draws the allowed transitions in characteristic examples of LS coupling
and jj coupling. In this latter case specific dipole selection rules apply, which we
leave to specific atomic-physics courses.
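For LS coupling, rules (111)-(114) amount to a simple boolean test on the initial and final terms. The following sketch is illustrative only: the function name and the (S, L, J) tuple format are choices made here, and the parity change, which depends on the electronic configurations rather than on L, is passed in as a flag by the caller.

```python
from fractions import Fraction

def ls_dipole_allowed(initial, final, parity_changes=True):
    """Apply rules (111)-(114) to two levels given as (S, L, J) tuples."""
    S1, L1, J1 = (Fraction(x) for x in initial)
    S2, L2, J2 = (Fraction(x) for x in final)
    return (parity_changes                     # (111) parity must change
            and S1 == S2                       # (112) Delta S = 0
            and abs(L1 - L2) <= 1              # (113) Delta L = 0, +1, -1
            and abs(J1 - J2) <= 1              # (114) Delta J = 0, +1, -1 ...
            and not (J1 == 0 and J2 == 0))     # ... but no J = 0 -> J = 0

# Be 2s3d 1D2 -> 2s2p 1P1, the regular Zeeman example of Fig. 1.31
be_line = ls_dipole_allowed((0, 2, 2), (0, 1, 1))
# Na 3p 2P3/2 -> 3s 2S1/2
na_line = ls_dipole_allowed(("1/2", 1, "3/2"), ("1/2", 0, "1/2"))
# A triplet -> singlet transition violates Delta S = 0
intercombination = ls_dipole_allowed((1, 1, 1), (0, 0, 0))
# Two levels of the same configuration have the same parity
same_configuration = ls_dipole_allowed((1, 1, 1), (1, 1, 0), parity_changes=False)
```

Since (112) and (113) are only approximate, a result of `False` from this check indicates a weak line rather than a strictly absent one whenever spin-orbit is not negligible.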
The present Chapter summarizes a few basic concepts and experimental evidence
in the field of atomic physics, which, in the context of a general course in physics
of matter, provide a minimal background and language for understanding and describing the microscopic atomic structure, which is at the root of the physics of
matter. Important conceptual points (e.g. the seniority scheme for the labeling of LS
states when L, S and J are not sufficient), several modern spectroscopic techniques
(e.g. Auger), and analytical, chemical, astrophysical applications of atomic physics
are omitted. These advanced subjects are left to more specific volumes [?, ?, ?].
CHAPTER 2
Molecules
Diatomic molecules are the simplest system where several nuclei are bound together by their interaction with one or several electrons. Their simplicity makes
them the ideal system to learn two central concepts of condensed-matter physics:
the adiabatic separation of the electronic and nuclear motions, and chemical bonding.
2.1. The adiabatic separation
The disentanglement of the fast electron dynamics from the slow motion of the
nuclei is a crucial conceptual step to make progress in understanding and interpreting
a huge amount of phenomena and observations in the physics of matter. The total
Hamiltonian Htot , Eq. (1), for a piece of matter depends on the coordinates r of all
electrons and R of all nuclei, and so do all of its eigenfunctions Ψ. Consider the
following factorization for the total wavefunction:
(116) $\Psi(r, R) = \Phi(R)\,\psi_e(r, R)$.
Assume that the electronic wavefunction $\psi_e(r, R)$ is a solution $\psi_e^{(a)}(r, R)$ of the following electronic equation:
(117) $[T_e + V_{ne}(r, R) + V_{ee}(r)]\,\psi_e^{(a)}(r, R) = E_e^{(a)}(R)\,\psi_e^{(a)}(r, R)$,
where (a) represents the set of quantum numbers characterizing a given N-electron
eigenstate with energy $E_e^{(a)}$. The electronic wavefunction $\psi_e^{(a)}(r; R)$ describes an
electronic eigenstate compatible with a given geometrical configuration R of the ions:
its R-dependence is purely parametric (no $\nabla_R$ operators in Eq. (117)). Physically
this ansatz corresponds to assuming the ionic and electronic motions as decoupled
from each other, in accord with the following related observations:
• The electron mass is much smaller than the nuclear mass. As masses appear
at the denominators of the kinetic terms Eq. (2) and Eq. (3), electrons move
much faster, thus in shorter timescales, than atomic nuclei.
• Energy levels corresponding to different electronic eigenstates are usually
separated by energy gaps much larger than typical energies associated with
the motion of the atomic nuclei.
The assumption behind the factorization defined by Eqs. (116), (117), known as the
adiabatic or Born-Oppenheimer scheme, is that, once an initial electronic state has
been selected, the nuclei move slowly enough not to induce transitions to different
electronic states.
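To make use of the ansatz (116), observe that $T_e \sim \nabla_r^2$ does not act on the R
coordinates:
(118) $T_e\,[\Phi(R)\,\psi_e(r, R)] = \Phi(R)\,[T_e\,\psi_e(r, R)]$.
Differentiation is slightly more intricate for the nuclear kinetic term:
(119) $\nabla_R^2\,[\Phi(R)\,\psi_e(r, R)] = \psi_e(r, R)\,\nabla_R^2 \Phi(R) + 2\,[\nabla_R \psi_e(r, R)] \cdot \nabla_R \Phi(R) + \Phi(R)\,\nabla_R^2 \psi_e(r, R)$.
Thus, when we substitute the factorization $\Psi(r, R) = \Phi(R)\,\psi_e(r, R)$ into the Schrödinger
equation $H_{tot}\Psi = E_{tot}\Psi$ we obtain
(120) $-\sum_\alpha \frac{\hbar^2}{2M_\alpha} \left\{ 2\,[\nabla_{R_\alpha} \psi_e(r, R)] \cdot \nabla_{R_\alpha} \Phi(R) + \Phi(R)\,\nabla^2_{R_\alpha} \psi_e(r, R) + \psi_e(r, R)\,\nabla^2_{R_\alpha} \Phi(R) \right\} + \Phi(R)\,T_e\,\psi_e(r, R) + [V_{ne} + V_{ee} + V_{nn}]\,\Phi(R)\,\psi_e(r, R) = E_{tot}\,\Phi(R)\,\psi_e(r, R)$,
(omitting the coordinate dependencies of the V terms). This can be rearranged as
follows:
(121) $-\sum_\alpha \frac{\hbar^2}{2M_\alpha} \left\{ 2\,[\nabla_{R_\alpha} \psi_e(r, R)] \cdot \nabla_{R_\alpha} \Phi(R) + \Phi(R)\,\nabla^2_{R_\alpha} \psi_e(r, R) \right\}$
(122) $+\ \psi_e(r, R) \left[ -\sum_\alpha \frac{\hbar^2 \nabla^2_{R_\alpha}}{2M_\alpha} \right] \Phi(R) + \Phi(R)\,[T_e + V_{ne} + V_{ee} + V_{nn}]\,\psi_e(r, R) = E_{tot}\,\Phi(R)\,\psi_e(r, R)$,
in order to highlight the electronic Hamiltonian (Te + Vne + Vee) of Eq. (117).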
The two terms of line (121), involving derivatives $\nabla_R$ of the electronic wavefunction $\psi_e(r, R)$, are called nonadiabatic terms. The nonadiabatic terms are much
smaller (usually by a factor $\approx m_e/M_\alpha$) than the typical differences between electronic
energies in Eq. (117), dominating Eq. (122). The adiabatic approximation consists
precisely in the neglect of these terms.
In Eq. (122), drop the nonadiabatic terms and substitute a solution $\psi_e^{(a)}(r, R)$ of
Eq. (117) for $\psi_e(r, R)$:
(123) $\psi_e^{(a)}(r, R) \left[ -\frac{\hbar^2}{2} \sum_\alpha \frac{\nabla^2_{R_\alpha}}{M_\alpha} + E_e^{(a)}(R) + V_{nn}(R) \right] \Phi(R) = E_{tot}\,\psi_e^{(a)}(r, R)\,\Phi(R)$,
where the electronic wavefunction $\psi_e^{(a)}(r, R)$ is displaced to the left of the operator,
to stress that the differential part only acts on the ionic part Φ(R). Note that all
three terms Te, Vne and Vee are completely (and in principle exactly) accounted for by
the electronic eigenvalue $E_e^{(a)}(R)$, the internuclear repulsion Vnn remains indicated
explicitly, and only the nuclear kinetic term Tn is treated approximately, due to
the neglect of nonadiabatic corrections. The electronic wavefunction can now be
dropped (formally by multiplying Eq. (123) by $\psi_e^{(a)*}(r, R)$ and integrating over all
electronic coordinates r), to derive the equation for the adiabatic motion of the
nuclei described by Φ(R):
(124) $\left[ -\frac{\hbar^2}{2} \sum_\alpha \frac{\nabla^2_{R_\alpha}}{M_\alpha} + E_e^{(a)}(R) + V_{nn}(R) \right] \Phi(R) = E_{tot}\,\Phi(R)$.
The equation for the electronic system (117) describes the motion of all electrons in the piece of matter. When the number of electrons is larger than one (as is usually the case), Eq. (117) involves at least the same theoretical difficulties as the many-electron atoms discussed in Sec. 1.2. Equation (117) is usually solved within some
approximate quantum many-body method, e.g. the Hartree-Fock method sketched
in Sec. 1.2.4, or the Density-Functional theory. In the present polyatomic context,
theory faces the extra difficulty that the electronic problem depends explicitly on the
precise position R of the nuclei, through Vne. One should then solve the electronic
problem for many geometric arrangements (classical configurations R) of the nuclei,
to obtain a detailed knowledge of the eigenfunction $\psi_e^{(a)}(r, R)$ and more importantly
of the eigenvalue $E_e^{(a)}(R)$ as a function of R. The electronic energy function $E_e^{(a)}(R)$
is especially important as it drives the motion of the nuclei through Eq. (124). The
neglected nonadiabatic terms (which, remember, are of nuclear kinetic nature) would
induce transitions among different electronic states. Within the adiabatic approximation, once an electronic state $\psi_e^{(a)}(r, R)$ is chosen, the motion of the nuclei does
not let this electronic eigenstate mix with other eigenstates $\psi_e^{(a')}(r, R)$: the electronic
state follows adiabatically the slow nuclear motion.
Once the exact (or some approximate) $E_e^{(a)}(R)$ is plugged into the Schrödinger
equation for the nuclei (124), it allows us to understand the translational motion of
the nuclei as driven by a total adiabatic potential energy
(125) $V_{ad}^{(a)}(R) = V_{nn}(R) + E_e^{(a)}(R)$,
which is the sum of the direct Coulombic ion-ion repulsion Vnn plus the electronic
eigenvalue $E_e^{(a)}(R)$ (which changes as a function of the ionic coordinates, as discussed above). This second term, also called “adiabatic electronic contribution”,
can be seen as the “glue” which keeps the atoms together. $E_e^{(a)}(R)$ is generally
a complicated function of all the ionic coordinates R: contrary to Vnn, it usually
cannot be expressed as a simple sum of two-body contributions.
Of all the infinite electronic eigenstates labeled by (a), the electronic ground state
produces the especially important lowest adiabatic potential energy surface Vad (R).
Electronic excitations (a) generate different adiabatic potential energy surfaces.
In the adiabatic scheme, the translational motions of the nuclei follow the total
adiabatic potential Vad (R). As in common language “nuclear” refers to internal
dynamics of the nuclei, in practice Vad (R) is thought of as the potential energy
governing the motions of “atoms”, or “ions”. As a function of the 3Nn atomic coordinates, Vad (R) shows two very general symmetries, consequences of the symmetry
of the original Hamiltonian (1), describing an isolated system:
• Translational symmetry: if all ions are moved by an arbitrary (equal for
all) displacement $\vec{u}$, then Vad remains unchanged. Symbolically: $V_{ad}(\vec{R}_1 + \vec{u}, \vec{R}_2 + \vec{u}, ..., \vec{R}_{N_n} + \vec{u}) = V_{ad}(\vec{R}_1, \vec{R}_2, ..., \vec{R}_{N_n})$.
• Rotational symmetry: if all ions are rotated by an arbitrary (equal for
all) rotation A around a given arbitrary point in space, then Vad remains
unchanged. In symbols: $V_{ad}(A\vec{R}_1, A\vec{R}_2, ..., A\vec{R}_{N_n}) = V_{ad}(\vec{R}_1, \vec{R}_2, ..., \vec{R}_{N_n})$.
This means that Vad depends in practice on the relative positions of the atoms.
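Both symmetries are easy to verify numerically for any potential energy that depends on the positions only through interatomic distances. The sketch below uses a toy sum of pair terms purely as a stand-in: as noted above, a real $V_{ad}$ usually cannot be written as a sum of two-body contributions, so the function `v_toy`, its Lennard-Jones-like form, and the sample coordinates are all assumptions made here for illustration.

```python
import math
from itertools import combinations

def v_toy(positions):
    """Toy potential energy: sum over pairs of d**-12 - 2*d**-6 (arbitrary units)."""
    total = 0.0
    for a, b in combinations(positions, 2):
        d = math.dist(a, b)
        total += d**-12 - 2 * d**-6
    return total

def translate(positions, u):
    """Shift every atom by the same displacement vector u."""
    return [tuple(x + du for x, du in zip(p, u)) for p in positions]

def rotate_z(positions, angle):
    """Rotate every atom by the same angle around the z axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in positions]

atoms = [(0.0, 0.0, 0.0), (1.1, 0.0, 0.0), (0.3, 1.0, 0.2), (-0.5, 0.4, 0.9)]
v0 = v_toy(atoms)
v_shifted = v_toy(translate(atoms, (0.7, -1.3, 2.0)))
v_rotated = v_toy(rotate_z(atoms, 0.83))
```

The three energies coincide to rounding error, while moving a single atom relative to the others changes the result, as expected for a function of relative positions only.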
Vad (R) often shows a well defined minimum, for the atoms placed in a given
definite relative configuration RM . In the simplest example of a di-atom, one always
finds a definite interionic distance RM for which Vad (R) is minimum. Choose any
two atoms in the periodic table: the potential energy as a function of distance has
the qualitative shape of Fig. 2.1. If the potential well is deep enough, then at low
temperature the motion is confined in a neighborhood of the equilibrium position
$R_M$: the ions execute small oscillations around $R_M$. The ionic motion is sometimes
treated within classical mechanics ($M_n \frac{d^2 R}{dt^2} = -\nabla_R V_{ad}(R)$). This does not mean that
the actual atomic motion is classical, only that under some conditions (e.g. very
heavy ions), the classical limit could provide a good approximation to the actual
quantum dynamics. The classical ionic dynamics often yields useful insight into the
quantum solution of Eq. (124). For example, the classical problem of the normal
modes describing small independent harmonic oscillations around RM is mapped to
that of a collection of quantum harmonic oscillators (a single quantum oscillator,
for a diatomic molecule). We shall return to this fundamental quantum problem in
Secs. 2.3 and 4.3, and sketch its well known solutions.
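The mapping from small oscillations to a quantum harmonic oscillator can be sketched for a diatom as follows. The Morse form used for $V_{ad}(R)$, its parameters, the reduced mass, and the unit system (ħ = 1) are all hypothetical choices made here for illustration; the notes do not specify a functional form. The code estimates the curvature k at the minimum by finite differences, then uses ω = sqrt(k/μ) and the standard harmonic ladder ħω(v + 1/2).

```python
import math

def morse(R, depth=1.0, a=1.5, R_M=1.0):
    """Hypothetical model potential with a minimum -depth at R = R_M."""
    return depth * (1.0 - math.exp(-a * (R - R_M)))**2 - depth

def curvature(V, R, h=1e-4):
    """Second derivative of V at R by a central finite difference."""
    return (V(R + h) - 2.0 * V(R) + V(R - h)) / h**2

def harmonic_levels(V, R_M, mu, n_levels=3, hbar=1.0):
    """Lowest vibrational levels in the harmonic approximation around R_M."""
    omega = math.sqrt(curvature(V, R_M) / mu)
    return [V(R_M) + hbar * omega * (v + 0.5) for v in range(n_levels)]

k = curvature(morse, 1.0)                    # analytic value: 2 * depth * a**2
levels = harmonic_levels(morse, 1.0, mu=100.0)
```

For this model the numerical curvature matches the analytic value 2·depth·a² = 4.5, and the levels are equally spaced by ħω. A heavier reduced mass lowers ω, which is one way to see why the number and position of vibrational levels depend on the reduced mass of the diatom.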
In summary, the adiabatic scheme provides a basic separation of the intricate coupled electron-ion dynamics into two conceptually and practically distinct problems:
the electronic equation (117) governs the motion of the electrons in the field of the
ions (imagined as instantaneously frozen); the equation for the slower motion of the
ions (124) is a standard Schrödinger equation defined by the adiabatic potential,
which is the sum of the interionic Coulomb repulsion plus the “glue” provided by
the electronic total energy – Eq. (125).
[Figure 2.1 plot: $V_{ad}(R)$, in units of the depth of the well, as a function of $R/R_M$.]
Figure 2.1. The typical qualitative shape of the adiabatic potential
for a diatom, as a function of the interionic separation R. The zero
of energy has been taken as the sum of the total energies of the two
individual atoms at large distance. Horizontal lines represent possible
vibrational ground and low-energy excited levels. The actual number
and positions of these states depends also on the diatom reduced mass.
2.2. Chemical and nonchemical bonding
In the previous Section we anticipated a generic profile for the adiabatic potential
of a diatom (Fig. 2.1). If that sort of long-distance attractive plus short-distance
strongly repulsive behavior is really general, and persists for Nn > 2 atoms, then
it can provide the microscopic mechanism allowing collections of atoms to bind
together, forming all kinds of bound states (including molecules, liquids, solid objects
of everyday matter...), without any tendency to collapse to infinitely dense pointlike objects. To obtain both qualitative and quantitative insight into the bonding
nature of Vad (R), we investigate conveniently simple model systems.
2.2.1. $\mathrm{H}_2^+$. This investigation starts naturally with the simplest di-atom: $\mathrm{H}_2^+$,
where all complications of electron-electron repulsion Vee are avoided. We proceed
to look for evidence that the adiabatic energy of $\mathrm{H}_2^+$ has a qualitative dependence
on the inter-proton distance R of the kind sketched in Fig. 2.1: we check if, as R
is reduced to finite values, the lowering in electronic energy $E_e^{(a)}(R)$ exceeds the
increase in internuclear Coulomb repulsion $V_{nn}(R) = \frac{e^2}{R}$, thus providing bonding.
Figure 2.2. Three parallel cuts of the nuclei-electron attraction
$V_{ne}(r, R) = -e^2 \left[ \frac{1}{|\vec{r} - R\hat{x}/2|} + \frac{1}{|\vec{r} + R\hat{x}/2|} \right]$ in $\mathrm{H}_2^+$, plotted as a function of $r_x/R$ along the line through
the nuclei (the x axis) and along two lines at distances R/2 and R
from the nuclei.
This would imply that at some finite R the total adiabatic potential energy (125)
of the ion is smaller (more negative) than that at R = ∞.
To convince ourselves that this is indeed the case, consider the potential energy
$V_{ne}$ acting on the electron of $\mathrm{H}_2^+$. $V_{ne}$ is the sum of the two attractive Coulomb terms
$V_L + V_R$ produced by the Left and Right nucleus, which, in the adiabatic spirit, we
take as stationary (their distance R is the independent variable of the adiabatic
potential-energy function $V_{ad}(R)$ we investigate). Figure 2.2 shows three “cuts” of
$V_{ne}(r, R)$ as a function of the electron position r. The main observation here is that in
the intermediate region, between the two nuclei, the total potential is roughly twice
as negative as it would be if one of the two nuclei were removed to infinity. This suggests
that the electron moving in the field of both nuclei could take advantage of both
attractions and lower its average potential energy by spending a significant fraction
of its probability distribution in this intermediate extra-attractive region. We check
if this mechanism produces bonding, i.e. whether $V_{ad}(R)$ decreases below its infinite-R value
$V_{ad}(R = \infty) = E_{1s} = -\frac{1}{2} E_{Ha}$, namely the energy of an isolated hydrogen atom (ignore
the mass correction $\mu/m_e$ throughout).
[Figure 2.3 plot: energy/$E_{Ha}$ versus r/R, showing $V_{ne}(r) = V_L + V_R$, $\psi_{1s\,L}$, $\psi_{1s\,R}$, $\psi_S$, and $\psi_A$.]
Figure 2.3. Cuts along the axis through the two nuclei of: (lower
to upper) the potential energy Vne (r, R), the 1s eigenfunctions at each
isolated well, and their symmetric and antisymmetric combinations.
To estimate the energy that may be gained we use a variational approach: the
ground-state energy is lower than or equal to the average energy of any trial state of
our construction. For this, start from the 1s hydrogen orbitals, Eq. (40), $|1s\,L\rangle$
and $|1s\,R\rangle$, centered at the Left and Right nucleus respectively. Guided by the
reflection symmetry of $V_{ne}(r, R)$, build the trial electronic wavefunction for $\psi_e$ as
either the Symmetric or the Antisymmetric normalized linear combination
(126) $|S\rangle = \frac{|1s\,L\rangle + |1s\,R\rangle}{\sqrt{2\,(1 + \mathrm{Re}\,\langle 1s\,L|1s\,R\rangle)}}$, $\quad |A\rangle = \frac{|1s\,L\rangle - |1s\,R\rangle}{\sqrt{2\,(1 - \mathrm{Re}\,\langle 1s\,L|1s\,R\rangle)}}$.
The wavefunctions $\psi_{1s\,L}(r) = \langle r|1s\,L\rangle$, $\psi_S(r) = \langle r|S\rangle$ etc. are sketched in Fig. 2.3.
We have therefore $E_e^{(a)}(R) \le \langle S|T_e + V_{ne}|S\rangle$. For infinitely large internuclear separation R, $|S\rangle$ and $|A\rangle$ become exact eigenstates, of energy $E_e^{(a)}(R = \infty) = E_{1s} = -\frac{1}{2} E_{Ha}$. For finite R, the total electronic energy of these states becomes
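(127) $E_{S,A} = \langle S,A|\,T_e + V_{ne}\,|S,A\rangle = \frac{(\langle L|T_e + V_{ne}|L\rangle \pm \langle R|T_e + V_{ne}|L\rangle) + (L \leftrightarrow R)}{2\,(1 \pm \langle L|R\rangle)}$,
with the upper sign for $|S\rangle$ and the lower sign for $|A\rangle$,
where we have omitted the 1s labels for brevity and the Re(), as the overlap $\langle L|R\rangle$ is
real. The (L ↔ R) terms obtained by exchanging left and right equal the previous
ones, thus we can omit them together with the factor 2 at the denominator. Substitute
$V_L + V_R$ for $V_{ne}$, and reorganize the six matrix elements, to take advantage of the
fact that $|L\rangle$ is the ground state of $(T_e + V_L)$:
(128) $E_{S,A} = \frac{\langle L|T_e + V_L|L\rangle + \langle L|V_R|L\rangle \pm \langle R|T_e + V_L|L\rangle \pm \langle R|V_R|L\rangle}{1 \pm \langle L|R\rangle} = -\frac{E_{Ha}}{2} + \frac{\langle L|V_R|L\rangle \pm \langle R|V_R|L\rangle}{1 \pm \langle L|R\rangle}$.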
The second term contains two Coulomb matrix elements, which are both real and
negative and depend on the internuclear distance. Despite the denominator, the
symmetric state $|S\rangle$ is lower in energy than $|A\rangle$. The $\langle L|V_R|L\rangle$ term represents
the attraction that the right nucleus exerts on the electron sitting around the left
nucleus. At large R this attractive term balances almost exactly the $V_{nn} = \frac{e^2}{R}$
internuclear repulsion. This cancellation represents the classical result that the
electrostatic interaction energy of a spherically symmetrical neutral object (the H
atom) and a remote point charge (the H+ ion) decays much faster than $\frac{1}{R}$. The
cross term $\langle R|V_R|L\rangle$ decays very fast at large internuclear distance, approximately
as $-E_{Ha} \frac{R}{a_0} \exp(-R/a_0)$. Thus for large R
$\frac{\langle L|V_R|L\rangle + \langle R|V_R|L\rangle}{1 + \langle L|R\rangle} = \langle L|V_R|L\rangle\, \frac{1 + \frac{\langle R|V_R|L\rangle}{\langle L|V_R|L\rangle}}{1 + \langle L|R\rangle} \simeq \langle L|V_R|L\rangle \left( 1 + \frac{\langle R|V_R|L\rangle}{\langle L|V_R|L\rangle} - \langle L|R\rangle \right)$
$\simeq \langle L|V_R|L\rangle \left( 1 + \frac{-\frac{e^2}{R/2} \langle R|L\rangle + ...}{-\frac{e^2}{R} \langle L|L\rangle + ...} - \langle L|R\rangle \right) \simeq \langle L|V_R|L\rangle\, (1 + \langle L|R\rangle + ...)$,
where the approximation $\langle R|V_R|L\rangle \simeq -\frac{e^2}{R/2} \langle L|R\rangle$ applies since the distribution
$\psi_L(r)\,\psi_R(r)$ is very small everywhere, and peaks at the axial region between the
two atoms, at the center of which $V_R \simeq -\frac{e^2}{R/2}$. As a consequence, for large R the
attraction (last term of Eq. (128) for the symmetric state) prevails over the repulsion
$V_{nn} = \frac{e^2}{R}$, thus produces bonding. The sign of the correction to the attraction is reversed for $|A\rangle$, which therefore is not bonding as the internuclear repulsion prevails
against attraction. This simple variational model suggests that the adiabatic energy
gain is exponentially small in $R/a_0$ at large R, due to the decay of the atomic 1s
wavefunctions.
All integrals in Eq. (127) can be evaluated for arbitrary R: Fig. 2.4 depicts the
outcomes of this variational calculation. Note that:
[Figure 2.4: two panels, (a) and (b), plotting Energy/$E_{\rm Ha}$ against $R/a_0$.]
Figure 2.4. (a) The total adiabatic potential of the $|S\rangle$ variational
state of H$_2^+$ decomposed in its kinetic, potential and ion-ion contributions: $V_{ad}(R) = \langle S|T_e|S\rangle + \langle S|V_{ne}(r, R)|S\rangle + V_{nn}(R)$. (b) A blowup
of the total adiabatic potential $V_{ad}(R)$ for the $|S\rangle$ (solid) and $|A\rangle$
(dashed) state. The minimum $R_M$ of $V_{ad}(R)$ is indicated and corresponds to a molecular bond energy of 0.065 $E_{\rm Ha}$ or 1.76 eV.
• the adiabatic potential $V_{ad}^{(S)}(R)$ associated to the $|S\rangle$ state shows a minimum
at a finite separation $R_M$; this potential well can bind the two protons
together, and for this reason $|S\rangle$ is called a bonding orbital;
• the monotonically decreasing, repulsive adiabatic potential $V_{ad}^{(A)}(R)$ justifies
calling $|A\rangle$ an antibonding orbital;
• the dot-dashed curve indicates that as $R$ is reduced the lowering of the
electronic potential energy $\langle S|V_{ne}(r, R)|S\rangle$ does not compensate the rise
in internuclear repulsion, thus our initial expectation that bonding is related
to a gain in electrostatic potential energy is not confirmed by the present
variational model;
• as the electron moves in the wider potential well created by the two protons
rather than in the narrower well of an isolated proton, its kinetic energy
decreases significantly, enough to dig a minimum in $V_{ad}^{(S)}(R)$.
The conclusion of our simple variational model for H$_2^+$ is that the physical origin
of bonding is to be attributed to both kinetic and potential energy lowering of the
electron “screening” the nuclei and spending a significant fraction of probability
in the inter-nuclear region. This mechanism works optimally at $R \simeq 2.5\,a_0$, and
produces a molecular bond energy of about 1.76 eV. For smaller $R$, the adiabatic
potential shoots up due to the divergence in $V_{nn}$ (now compensated poorly), and
to a new increase of the kinetic energy, as the molecular potential well contracts to a
more atomic-like shape. The proposed variational estimate provides basic qualitative trends, but is especially inaccurate at small $R$. A more quantitative treatment
requires solving the Schrödinger equation in the double Coulomb well: this yields a
bond energy of 2.79 eV at an optimal distance of $2.00\,a_0$, in good agreement with
experiment.
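The variational numbers quoted above can be reproduced with a few lines of Python. The sketch below is only an illustration, not the procedure used to draw Fig. 2.4: it assumes the standard closed-form expressions of the 1s-1s overlap and Coulomb integrals (which are not written out in these notes), works in atomic units ($E_{\rm Ha} = a_0 = 1$), and uses the hypothetical helper names `overlap`, `direct`, `cross`, `v_ad` and `find_minimum`.

```python
import math

HARTREE_EV = 27.2114  # 1 E_Ha expressed in eV


def overlap(R):
    """<L|R> for two 1s orbitals whose centers are R (in units of a0) apart."""
    return math.exp(-R) * (1.0 + R + R * R / 3.0)


def direct(R):
    """<L|V_R|L> in E_Ha: attraction of the right nucleus on the left 1s cloud."""
    return -(1.0 - (1.0 + R) * math.exp(-2.0 * R)) / R


def cross(R):
    """<R|V_R|L> in E_Ha: the cross Coulomb matrix element."""
    return -(1.0 + R) * math.exp(-R)


def v_ad(R, sign=+1):
    """Adiabatic potential in E_Ha: Eq. (128) plus V_nn = 1/R.

    sign=+1 selects the symmetric state |S>, sign=-1 the antisymmetric |A>.
    """
    electronic = -0.5 + (direct(R) + sign * cross(R)) / (1.0 + sign * overlap(R))
    return electronic + 1.0 / R


def find_minimum(sign=+1, r_min=0.5, r_max=10.0, steps=9500):
    """Locate the minimum of v_ad on a uniform grid (step 0.001 a0 by default)."""
    grid = [r_min + i * (r_max - r_min) / steps for i in range(steps + 1)]
    r_best = min(grid, key=lambda R: v_ad(R, sign))
    return r_best, v_ad(r_best, sign)


R_M, V_M = find_minimum(+1)
bond_energy_eV = (-0.5 - V_M) * HARTREE_EV  # depth below the H + H+ limit
print(f"R_M = {R_M:.2f} a0, bond energy = {bond_energy_eV:.2f} eV")
```

Calling `v_ad(R, -1)` on the same grid gives the antibonding curve, which stays above the dissociation limit $-E_{\rm Ha}/2$.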
The electronic Schrödinger problem for one electron in the field of the two nuclei
has axial (cylindrical) symmetry rather than the spherical symmetry of an atom. Accordingly,
the adiabatic electronic states are labeled by their angular-momentum projection $m$
along the molecular axis through the nuclei, and not, as in atoms, by their total
angular momentum $l$. Thus, the electronic states of linear molecules are labeled σ,
π, δ, ..., to indicate the absolute value $|m| = 0, 1, 2, ...$ of their angular momentum
projection: non-σ states are orbitally twofold degenerate. Both $|S\rangle$ and $|A\rangle$ of
Eq. (126) are σ states, being composed of 1s ($l = 0$, thus $m = 0$) atomic states.
Standard chemical notation adds a star superscript to antibonding states, thus $|A\rangle$ would
actually be labeled $1\sigma^*$. Other excited electronic states of H$_2^+$ (whether bonding or
antibonding) can however possess axial angular momentum $m \neq 0$, thus symmetries
other than σ.
2.2.2. Covalent and ionic bonding. The ion H$_2^+$ is instructive because in its
simplicity it illustrates the physical origin of chemical bonding, allows us to introduce
the basic concepts of bonding and antibonding orbitals, and gives a numerical estimate of typical bond energies and distances in molecules. Nonetheless, a chemist
would not call the electronic structure of H$_2^+$ a proper chemical bond. An actual
covalent bond takes two electrons occupying (clearly, with antiparallel spins)
the same bonding orbital $|S\rangle$. This single-particle orbital should be determined by
some method (e.g. HF) accounting, at least approximately, for electron-electron
repulsion, but at a semi-qualitative level we could take $|S\rangle$ similar to our variational guess (126). Accordingly, the ground-state electronic structure of a neutral
H$_2$ molecule resembles roughly the following 2-electron state:
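$$\psi_e^{GS}(r_1\sigma_1, r_2\sigma_2) = \langle r_1\sigma_1, r_2\sigma_2|S\uparrow, S\downarrow\rangle^{A} = \langle r_1|S\rangle \langle r_2|S\rangle \, \frac{1}{\sqrt{2}} \begin{vmatrix} \chi_\uparrow(\sigma_1) & \chi_\uparrow(\sigma_2) \\ \chi_\downarrow(\sigma_1) & \chi_\downarrow(\sigma_2) \end{vmatrix} . \tag{129}$$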
Both electrons occupy the same spatially symmetric bonding orbital, in a global
state which, as it must, is antisymmetric for exchange of electron 1 and 2 (through
its spin part). The singlet wavefunction (129) represents a typical homonuclear
covalent bond. For H$_2$, the bond energy $V_{ad}^{(GS)}(\infty) - V_{ad}^{(GS)}(R_M) \simeq 4.7$ eV (at an
equilibrium distance $R_M \simeq 1.4\,a_0$). This bond energy is less than twice that of H$_2^+$,
since this singlet state pays a significant electron-electron repulsion.
Besides the ground state, the potential well produced by the two protons has space
for several excited states, which can be classified as single-electron excitations, with
one electron promoted to higher bonding/antibonding states. In general, excited
electronic states need not favor bonding: for example, the lowest triplet state of H$_2$,
with an electron in $|S\rangle$ and the other in $|A\rangle$, is not bound. However, higher excited
states, with both electrons in bonding orbitals, are observed to be bound, although
with lower binding energy and larger interatomic equilibrium distance [?].
When pairs of more complex atoms are allowed to interact to form diatomic molecules,
the number of involved bonding and antibonding orbitals is necessarily larger, to
accommodate all electrons. Figure 2.5 sketches a qualitative ground electronic structure of several homonuclear diatoms, leaving the deep core 1s electrons out. Several
interesting observations may be drawn from this figure:
• Electrons fill the single-particle levels according to the same scheme as in
atoms (lower to higher), and the qualitative ordering of the levels does not
change much in passing from one molecule to another.
• Covalent bonding is mainly related to an occupancy imbalance between
bonding and antibonding orbitals: the most strongly bound molecule is N$_2$,
where this imbalance (“bond order”) equals 3.
• When electrons start to fill 2p-derived molecular orbitals, the π states are
lower than the σ combination. This is surprising, as the overlap of the $m = 0$
p orbitals pointing along the molecular axis is much larger than that of the
$m = \pm 1$ ones, which are mostly located away from it: the bonding-antibonding
splitting of σ orbitals should be larger than that of the corresponding π orbitals.
However, the Coulomb repulsion of the filled 1s- and 2s-derived orbitals
pushes the 2p-derived σ orbital up by a substantial amount. The “natural”
ordering is restored in O$_2$.
• Even a dimer such as Be$_2$, for which bonding and antibonding orbitals are
equally populated, shows some amount of bonding, of the type described
in Sec. 2.2.3 below.
• In B$_2$ and O$_2$, two electrons sit in a degenerate π or π$^*$ molecular orbital,
which has room for 4 electrons: to minimize the residual Coulomb repulsion,
the ground state is a spin-triplet Hund-rule state (see Sec. 1.2.7.2).
• The Ne$_2$ dimer, like Be$_2$, has all bonding and antibonding orbitals filled: as
a result, only an exceedingly weak bond forms, as described in Sec. 2.2.3
below.
Figure 2.5. A schematic chemist’s view of the electronic structure of
a few simple homonuclear diatomic molecules. Blue indicates bonding, red indicates antibonding single-electron orbitals. Red arrows
represent electrons filling the orbitals. The observed bond energy is
indicated. All single-electron energies and splittings are purely qualitative.
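The counting rules listed above can be turned into a small bookkeeping routine. The following Python sketch is a rough illustration under stated assumptions: the orbital labels are hypothetical shorthand (they are not read off Fig. 2.5), the π-below-σ ordering is assumed for Li$_2$ through N$_2$ and the “natural” ordering for O$_2$, F$_2$ and Ne$_2$, the bond order is taken as half the bonding-antibonding occupancy imbalance, and degenerate levels are filled following Hund's first rule.

```python
# Each level: (label, capacity, is_bonding). Labels are illustrative only.
PI_BELOW_SIGMA = [
    ("sigma(2s)", 2, True), ("sigma*(2s)", 2, False),
    ("pi(2p)", 4, True), ("sigma(2p)", 2, True),
    ("pi*(2p)", 4, False), ("sigma*(2p)", 2, False),
]
SIGMA_BELOW_PI = [
    ("sigma(2s)", 2, True), ("sigma*(2s)", 2, False),
    ("sigma(2p)", 2, True), ("pi(2p)", 4, True),
    ("pi*(2p)", 4, False), ("sigma*(2p)", 2, False),
]

# Number of electrons outside the deep 1s cores.
VALENCE = {"Li2": 2, "Be2": 4, "B2": 6, "C2": 8,
           "N2": 10, "O2": 12, "F2": 14, "Ne2": 16}


def fill(molecule):
    """Return (bond_order, unpaired_electrons) for a homonuclear diatom."""
    order = SIGMA_BELOW_PI if molecule in ("O2", "F2", "Ne2") else PI_BELOW_SIGMA
    remaining = VALENCE[molecule]
    bonding = antibonding = unpaired = 0
    for _label, capacity, is_bonding in order:
        n = min(capacity, remaining)
        remaining -= n
        if is_bonding:
            bonding += n
        else:
            antibonding += n
        # Hund's rule: spins stay parallel until the level is half filled.
        unpaired += n if n <= capacity // 2 else capacity - n
    return (bonding - antibonding) / 2, unpaired


for name in VALENCE:
    bond_order, spins = fill(name)
    print(f"{name}: bond order {bond_order}, unpaired electrons {spins}")
```

With these assumptions N$_2$ comes out with the largest bond order, Be$_2$ and Ne$_2$ with zero, and B$_2$ and O$_2$ with two unpaired electrons, in line with the observations above.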
Figure 2.6. A simplified chemist’s view of the electronic structure
of two heteronuclear diatomic molecules: CO and HF. Blue indicates
bonding, red indicates antibonding single-electron orbitals. The observed bond energy is indicated (CO: 11.2 eV; HF: 5.9 eV). Single-electron energies and splittings
are purely qualitative.
Homonuclear diatomic molecules are very special diatoms, due to their peculiar
L ↔ R symmetry. Most pairs of atoms bind together, and many form covalent bonds
not so unlike those illustrated for the homonuclear molecules. For example, the CO
molecule has the same number of electrons as the molecule N$_2$. The main difference
is that the L ↔ R symmetry is broken: the attraction of the O nucleus ($Z = 8$) is
stronger than that of C ($Z = 6$). As a consequence, as sketched in Fig. 1.23, the 2s
and 2p shells of O sit deeper than the 2s and 2p shells of C: the bonding and antibonding orbitals are not pure symmetric/antisymmetric combinations of the type
(126). In the same spirit of the variational treatment of H$_2^+$ (but neglecting overlaps
$\langle L|R\rangle$ for simplicity), these orbitals may often be approximated as eigenstates of a
2 × 2 matrix
$$\begin{pmatrix} E_L & -\Delta \\ -\Delta & E_R \end{pmatrix} \equiv \begin{pmatrix} \langle L|H_1|L\rangle & \langle L|H_1|R\rangle \\ \langle R|H_1|L\rangle & \langle R|H_1|R\rangle \end{pmatrix} , \tag{130}$$
where $H_1 = T_e + V^{\rm eff}$ is the effective one-electron Hamiltonian fixed by the self-consistent potential acting on the electron under examination, and $|L\rangle$, $|R\rangle$ are the
considered atomic states (e.g. the 2s or 2p) of the left and right atom (here C and
O). The diagonal elements are mostly dictated by the energy positions of atomic
shells. As the atoms move closer, the off-diagonal term $\Delta > 0$ grows larger and
larger. The eigenvalues of the matrix (130) are:
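$$E_{a/b} = \frac{E_L + E_R}{2} \pm \sqrt{\left(\frac{E_L - E_R}{2}\right)^2 + \Delta^2} , \tag{131}$$
and the corresponding antibonding $|a\rangle$ and bonding $|b\rangle$ eigenkets can be written as
$$|a/b\rangle = \sqrt{\frac{1}{2}\left(1 \pm \frac{u}{\sqrt{1 + u^2}}\right)} \, |L\rangle \mp \sqrt{\frac{1}{2}\left(1 \mp \frac{u}{\sqrt{1 + u^2}}\right)} \, |R\rangle , \quad {\rm with} \ u = \frac{E_L - E_R}{2\Delta} . \tag{132}$$
In both equations the upper sign refers to $a$ and the lower sign to $b$.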
For $u = 0$ we recover the symmetric homonuclear molecule, with $|2\Delta|$ representing
the splitting between bonding $|b\rangle = |S\rangle$ and antibonding $|a\rangle = |A\rangle$ states (126). For
finite $u$ (assuming $E_L > E_R$), the eigenenergies $E_{a/b}$ in (131) remain centered around
$\frac{E_L + E_R}{2}$, but their splitting $E_a - E_b = \left[(E_L - E_R)^2 + (2\Delta)^2\right]^{1/2}$ is larger than both the
diagonal separation $E_L - E_R$ and the off-diagonal mixing energy $|2\Delta|$. For increasing
$u$, the eigenkets (132) resemble less and less the simple symmetric and antisymmetric
combinations (126): $|a\rangle$ acquires a larger $|L\rangle$ character, while $|b\rangle$ acquires a larger
$|R\rangle$ character. In the limit of very large $|u|$, the eigenenergies (131) $E_{a/b} \simeq E_{L/R}$, and
the eigenkets $|a/b\rangle \simeq |L/R\rangle$. As illustrated in Fig. 2.6, an intermediate value of $u \simeq 1$
applies for the 2s and 2p orbitals of CO near its equilibrium separation: bonding
molecular orbitals lie prevalently on the O side, antibonding ones on the C side,
but substantial quantum delocalization is present. The bond of CO and of many
similar heteronuclear diatoms is classified as polar covalent, since it is associated
to a nonzero permanent electric dipole due to the charge transfer associated to the
unequal charge distribution of the electrons in the bonding state $|b\rangle$.
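The closed forms (131) and (132) are easy to check numerically. The Python sketch below is a minimal illustration: the function names `two_level` and `residual` are hypothetical, and the sample energies (in arbitrary units, chosen so that $u = 1$) are invented for the example, not taken from CO data.

```python
import math


def two_level(E_L, E_R, delta):
    """Eigenpairs of the matrix (130), following Eqs. (131)-(132).

    Returns ((E_b, (c_L, c_R)), (E_a, (c_L, c_R))) for the bonding and
    antibonding states. delta must be positive.
    """
    mean = 0.5 * (E_L + E_R)
    root = math.hypot(0.5 * (E_L - E_R), delta)
    u = (E_L - E_R) / (2.0 * delta)
    w = u / math.sqrt(1.0 + u * u)
    c_plus = math.sqrt(0.5 * (1.0 + w))
    c_minus = math.sqrt(0.5 * (1.0 - w))
    antibonding = (mean + root, (c_plus, -c_minus))
    bonding = (mean - root, (c_minus, c_plus))
    return bonding, antibonding


def residual(E_L, E_R, delta, energy, vec):
    """Largest component of (H - energy) applied to vec; zero for an exact eigenpair."""
    c_L, c_R = vec
    return max(abs(E_L * c_L - delta * c_R - energy * c_L),
               abs(-delta * c_L + E_R * c_R - energy * c_R))


# Illustrative numbers only: E_L - E_R = 4 and delta = 2 give u = 1.
E_L, E_R, delta = -10.0, -14.0, 2.0
(E_b, ket_b), (E_a, ket_a) = two_level(E_L, E_R, delta)
weight_R_in_bonding = ket_b[1] ** 2
print(E_b, E_a, weight_R_in_bonding)
```

For $u = 1$ the bonding state carries most, but not all, of its weight on the deeper (right) atom, which is the polar covalent situation described above.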
HF (here Hydrogen-Fluorine, not Hartree-Fock! See Fig. 2.6) is the prototype
diatom where, for typical interatomic distances, $u$ is very large (Figure 1.23 shows
how much deeper the 2p shell of F lies than the 1s shell of H). As a result, the
relevant bonding orbital is not much different from the 2p of an isolated F atom.
Thus a radical (rather naive) description of the bond of HF involves a complete
charge transfer from H to F (which therefore completes the shell to 2p$^6$). The two
ions H$^+$ and F$^-$ would then attract each other with a $-e^2/R$ attraction as long as
they are separated enough. As the proton moves inside the typical size of the outer
shell of F$^-$, the screening of the positive charge of the F nucleus diminishes, and
the attraction gradually turns into repulsion. The origin of bonding in this simple
picture is the energy gained in moving the ions from infinity to the equilibrium
distance, from which the energy paid to form the ions from the neutral atoms (the
ionization potential of the atom which turns into a cation, here H, minus the
electron affinity of the atom acquiring the electron, here F) is to be subtracted.
Indeed, $e^2/(92\ {\rm pm}) = 15.7$ eV minus the ionization potential of H (13.6 eV) plus
the electron affinity of F (3.4 eV) yields 5.5 eV, in fair agreement
with the observed bond energy 5.9 eV. A molecular bond similar to that of HF,
where $u$ is so large that almost complete charge transfer occurs, is called an ionic bond.
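The arithmetic of this naive ionic estimate can be spelled out as follows. This is only a sketch of the balance described above: the function name is hypothetical, and the constant $e^2/(4\pi\epsilon_0) \simeq 1439.96$ eV pm is a standard value supplied here as an assumption, since the text works in Gaussian units and does not quote it.

```python
COULOMB_EV_PM = 1439.96  # e^2/(4 pi eps0) in eV pm (assumed standard value)


def ionic_bond_estimate_eV(distance_pm, ionization_eV, affinity_eV):
    """Point-charge Coulomb gain minus the net cost of forming the ion pair."""
    coulomb_gain = COULOMB_EV_PM / distance_pm
    ion_pair_cost = ionization_eV - affinity_eV
    return coulomb_gain - ion_pair_cost


# HF: bond length 92 pm, ionization potential of H, electron affinity of F.
estimate = ionic_bond_estimate_eV(92.0, 13.6, 3.4)
print(f"estimated HF bond energy: {estimate:.1f} eV (observed: 5.9 eV)")
```

The same function can be reused for other strongly ionic diatoms, provided the point-charge picture is taken with the same caution as in the text.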
Figure 2.7. 3D structures of several organic molecules: (left to
right, top to bottom) methane CH$_4$, propane C$_3$H$_8$, butane C$_4$H$_{10}$, benzene C$_6$H$_6$, ethanol CH$_3$CH$_2$OH, pyrene C$_{16}$H$_{10}$, morphine, heroin,
a peptide, three proteins of increasing complexity. In the stick representation of morphine and heroin, H atoms are left undrawn and
all corners represent C atoms. For the last protein, the single-atom
representation is given up in favor of a block representation.
The picture of bonding according to Figs. 2.5 and 2.6 and the related discussion is
extremely simplified, to the point of being inaccurate. In reality, all orbitals of the
same symmetry are actually mixed, and the correlation with atomic levels indicates
at most the prominent atomic component in the actual molecular orbital. For example, the orbitals labeled $2\sigma$, $2\sigma^*$, $3\sigma$, and $4\sigma^*$ are linear combinations each involving
a little of all $m = 0$ atomic orbitals, mostly 2s and 2p$_z$, but also 1s, 3s, 3p$_z$,... This
hybrid character of molecular orbitals, a rather boring detail for diatomic molecules,
is a crucial ingredient in determining the 3D shape of polyatomic molecules and
covalent solids. Hybrid orbitals provide adiabatic potentials which depend strongly
not only on interatomic distances but also on angles. An especially important example is the mix of 2s and 2p, at the basis of the geometry of covalent bonds in
organic molecules: sp$^3$ combinations determine the ideally 109° angle between the
bonds of tetrahedrally coordinated carbon and silicon; sp$^2$ combinations tend to
bind atoms such as carbon and nitrogen to three other atoms in the same plane,
forming ideally 120° angles. The endless combinations of the molecular orbitals of
polyatomic 3D molecular structures (Fig. 2.7) and the weaker interactions among
different molecular units are at the basis of the infinite richness of organic chemistry, the microscopic functioning core of living matter. Analogous mixtures of 3s and
3p and of 4s and 4p determine the covalency of Si and of Ge, Ga, As, in turn at the
root of the crystalline and amorphous (rather than molecular) structures that these
elements tend to form in their solid state (see Figs. 4.9, 4.25, and 4.63 below).
2.2.3. Weak nonchemical bonds. Our description of covalent bonds in terms
of linear combinations of single fixed atomic orbitals is doomed to obtain an exponential dependence of $V_{ad}(R)$ at large distance, due to the exponential decay of
the orbitals themselves. Experimentally however, at large distance, the interaction
energy between any two atoms is always attractive and decays following a universal
power law: $\sim R^{-6}$. This is due to classical electromagnetism concepts which the
previous analysis completely overlooked: atomic electric dipole moments and atomic
polarizability.
The electric dipole moment of any atom is, on average, null. As long as spherical
symmetry is unbroken, the single-electron atomic orbitals are also eigenstates of the
total orbital angular momentum $|\vec L|^2$, namely some $R(r)$ times spherical harmonics
$Y_{LM}$. These functions have definite parity $(-1)^L$, thus the angular probability distribution $|Y_{LM}(\hat r)|^2$ of each electron is even (see Fig. 1.5). Thus, the average electric
dipole $\langle LM|\vec d|LM\rangle$ vanishes, as discussed above in Sec. 1.1.10. However, electrons
move around the nucleus and occupy instantaneously specific positions, associated
to a nonzero (fluctuating) dipole moment. This dipole moment produces an instantaneous electric field $\vec E$ whose intensity decays away from the atom as $R^{-3}$, based on
elementary arguments of electromagnetism.
A field acting on a second remote atom “polarizes” it. The charge distribution of
the electrons of the second atom reacts to minimize the total energy, now including,
besides the internal terms of (1), a coupling $-\vec d \cdot \vec E$ to the external field, where $\vec d = -q_e \sum_i \vec r_i$.
The second atom responds to the instantaneous external electric field by
building up an induced dipole in the same direction as the field: $\langle \vec d \rangle = \alpha \vec E$. As the
field is weak, the atomic polarizability $\alpha$ is independent of the field (linear response).
As a result, the total energy of the two atoms at distance $R$ lowers by an amount
$-\frac{1}{2} \langle \vec d \rangle \cdot \vec E = -\frac{1}{2} \alpha |\vec E|^2 \propto R^{-6}$ with respect to the atoms placed at infinite distance. By
dimensional analysis, the prefactor of $R^{-6}$ must be a [distance]$^6$ × [energy]. As only
atomic physics is involved, the order of magnitude of this van der Waals attraction
must be
$$V_{ad}^{\rm vdW}(R) \propto -E_{\rm Ha} \left(\frac{a_0}{R}\right)^6 . \tag{133}$$
The order of magnitude of the coefficient of the $-R^{-6}$ decay of the interaction
potential is $E_{\rm Ha}\, a_0^6 \simeq 0.6$ eV Å$^6$ $\simeq 10^{-79}$ J m$^6$.
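As a quick check of this order of magnitude, the following Python sketch converts $E_{\rm Ha}\,a_0^6$ to the two unit systems quoted above and shows the steepness of the $R^{-6}$ law. The numerical constants are standard values inserted here as assumptions, and `vdw_scale_eV` is a hypothetical helper that implements only the dimensional estimate (133) with unit prefactor, not an actual atom-atom potential.

```python
HARTREE_EV = 27.211386     # E_Ha in eV (assumed standard value)
BOHR_ANGSTROM = 0.529177   # a0 in angstrom (assumed standard value)
EV_JOULE = 1.602177e-19    # 1 eV in joule

c6_eV_A6 = HARTREE_EV * BOHR_ANGSTROM ** 6
c6_J_m6 = c6_eV_A6 * EV_JOULE * 1e-60  # 1 angstrom^6 = 1e-60 m^6


def vdw_scale_eV(R_angstrom):
    """Dimensional estimate -E_Ha (a0/R)^6 of Eq. (133), in eV."""
    return -c6_eV_A6 / R_angstrom ** 6


print(f"E_Ha a0^6 = {c6_eV_A6:.2f} eV A^6 = {c6_J_m6:.1e} J m^6")
# Doubling the distance weakens the attraction by a factor 2^6 = 64.
print(vdw_scale_eV(4.0) / vdw_scale_eV(8.0))
```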
More quantitatively, the prefactor of $R^{-6}$ is proportional to the product of the two
atomic electrical polarizabilities [?] (since the dipole fluctuations are proportional to
$\alpha$ of the first atom, as reciprocity suggests). These polarizabilities can be determined
by second-order perturbation theory of atomic states. By standard perturbative
arguments, an external field distorts the ground state so that (i) the electronic
wavefunction acquires components (proportional to $|\vec E|$) of excited states with $l$
changed by ±1, and (ii) the total energy lowers quadratically with the field. For
example, in a weak external electric field $\vec E = E_z \hat z$, the ground state of an H atom
takes the form of a linear combination
$$|1, 0, 0; E_z\rangle = a\,|1, 0, 0\rangle + \sum_{n>1} b_n\, |n, 1, 0\rangle , \tag{134}$$
where the hydrogenic eigenkets $|n, l, m\rangle$ are represented in real space by the standard
eigenfunctions of Eq. (36), and $|a|^2 + \sum_n |b_n|^2 = 1$. The resulting dipole moment
$\langle 1, 0, 0; E_z|d_z|1, 0, 0; E_z\rangle = 2a \sum_n b_n \langle n, 1, 0|d_z|1, 0, 0\rangle$ is proportional to $E_z$, because
the $b_n$ coefficients are. To evaluate the polarizability $\alpha$, estimate the coefficients
$b_n$ by standard first-order perturbation theory [?]: $b_n = -E_z \langle n10|d_z|100\rangle / (E_1 - E_n)$,
where $E_n$ are the energy levels of H, Eq. (30). The weak-field polarizability $\alpha$ is then
given by
$$\alpha = \frac{\langle 1, 0, 0; E_z|d_z|1, 0, 0; E_z\rangle}{E_z} \simeq -2 \sum_{n>1} \frac{|\langle n, 1, 0|d_z|1, 0, 0\rangle|^2}{E_1 - E_n} . \tag{135}$$
The physical dimensions of $\alpha$ are [Charge]$^2$ [Length]$^2$ [Energy]$^{-1}$ = [Charge]$^2$ [Time]$^2$ [Mass]$^{-1}$,
the same as $4\pi\epsilon_0$ × [Length]$^3$. It is therefore natural to measure atomic polarizabilities in units of $4\pi\epsilon_0 a_0^3 = 1.64878 \cdot 10^{-41}$ C$^2$ s$^2$/kg. The resulting static polarizability of H is $\alpha_{\rm H} = 4.50 \; 4\pi\epsilon_0 a_0^3$. The homologous quantities for He and Li are
$\alpha_{\rm He} = 1.383 \; 4\pi\epsilon_0 a_0^3$ and $\alpha_{\rm Li} = 164.11 \; 4\pi\epsilon_0 a_0^3$. The very small polarizability of He
(huge 1s-2p gap) and the large one of Li (comparably small 2s-2p gap) are accounted
for by Eq. (135): $\alpha$ is inversely proportional to the energy gap from the highest filled
state up to the nearest empty state of different $l$ character.
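To get a feeling for the sum in Eq. (135), one can add up the bound-state terms for hydrogen. The Python sketch below relies on an ingredient that these notes do not provide and that is quoted here as an assumption: the textbook closed form of the 1s to $n$p radial dipole integral, $|\langle n1|r|10\rangle|^2 = 2^8 n^7 (n-1)^{2n-5} (n+1)^{-2n-5}\, a_0^2$, with $|\langle n,1,0|z|1,0,0\rangle|^2$ equal to one third of it. The function names are hypothetical, atomic units are used, and the constants in the unit conversion are standard values.

```python
FOUR_PI_EPS0 = 1.11265006e-10  # 4 pi eps0 in C^2 s^2 kg^-1 m^-3 (assumed value)
BOHR_M = 5.29177211e-11        # a0 in m (assumed value)

alpha_unit_SI = FOUR_PI_EPS0 * BOHR_M ** 3  # the unit 4 pi eps0 a0^3


def radial_dipole_sq(n):
    """|<n1|r|10>|^2 in a0^2 (assumed closed form, rearranged to avoid overflow)."""
    ratio = (n - 1) / (n + 1)
    return 2 ** 8 * n ** 7 * ratio ** (2 * n) / ((n - 1) ** 5 * (n + 1) ** 5)


def term(n):
    """Contribution of the level n to Eq. (135), in units of 4 pi eps0 a0^3."""
    z_sq = radial_dipole_sq(n) / 3.0   # angular factor for l=0 -> l=1, m=0
    gap = 0.5 * (1.0 - 1.0 / n ** 2)   # E_n - E_1 in E_Ha
    return 2.0 * z_sq / gap


alpha_bound = sum(term(n) for n in range(2, 2001))
print(f"unit: {alpha_unit_SI:.5e} C^2 s^2/kg")
print(f"2p term: {term(2):.3f}, bound-state sum: {alpha_bound:.3f}")
```

The bound-state sum is dominated by the 2p term and stays below the full value $4.50$ quoted above: the rest comes from the continuum states, which the discrete sum over $n$ leaves out.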
The dipole–induced-dipole mechanism is perfectly general, and it accounts for the
leading long-range attraction of all pairs of neutral atoms. As the distance $R$ decreases,
open-shell atoms, e.g. H and those forming the diatoms of Figs. 2.5 and 2.6, gradually
modify their orbitals to form robust covalent bonds. For pairs of closed-shell atoms
(Be$_2$, Ne$_2$) only the van der Waals mechanism produces (rather weak) attraction,
until short-distance repulsion sets in.
The short-distance repulsive Coulombic divergence $V_{ad} \propto R^{-1}$ found as $R$ is reduced is
peculiar to H$_2^+$ and a few other dimers involving H. For general many-electron atoms,
long before the nucleus-nucleus repulsion $V_{nn}(R)$ becomes relevant, the electronic
energy $E_e^{(a)}(R)$ blows up because of Pauli’s principle: as the core electrons are
brought together in the same region of space, their wavefunctions become less and
less orthogonal. Some of these electrons are then pushed up into some empty valence
level, which makes tiny reductions of $R$ cost tens or even hundreds of $E_{\rm Ha}$. This
rapid “hard-core” increase of the energy as two atoms collide is responsible for the
“impenetrability” of matter, i.e. the sharp increase of pressure of a sample whose
volume is so much reduced that each atom is left with less than its characteristic volume
(∼ few Å$^3$) of space. We see that a combined effect of electron indistinguishability
and quantum kinetic energy sustains matter against collapse due to electromagnetic
attraction. In a diatomic context, the $V_{ad}(R)$ blowup at short distance is sometimes
parameterized phenomenologically with an $R^{-12}$ power law.
The simplest potential capturing the long-distance van der Waals attraction and
the phenomenological short-range repulsion is the popular Lennard-Jones potential
$$V_{LJ}(R) = 4\varepsilon \left[ \left(\frac{\sigma}{R}\right)^{12} - \left(\frac{\sigma}{R}\right)^{6} \right] . \tag{136}$$
This expression is nothing but a phenomenological model (out of a large class) for
the actual $V_{ad}(R)$. Its two parameters $\varepsilon$ (the depth of the potential well at $R_M$) and
$\sigma$ (the radius where $V_{LJ}(R)$ changes sign) are listed in Table 2.1 for the noble-gas
dimers, for whose actual $V_{ad}(R)$ the Lennard-Jones potential is a fair approximation.
In this context, it is also used for describing the dynamics of a collection of
more than two noble-gas atoms as a sum of pair potentials (two-body forces). This
extrapolation is a fair approximation of the actual adiabatic potential of simple
closed-shell systems at low density: the phase diagram and correlation properties of
the Lennard-Jones solid/fluid are in close qualitative and semi-quantitative agreement
with observed data for noble-gas systems. The two-body Lennard-Jones model is instead
very inappropriate for atoms forming strong directional covalent bonds.

element   σ [pm]   ε [meV]   4εσ^6 [E_Ha a_0^6]
He        256      0.879     1.7
Ne        275      3.08      8.9
Ar        340      10.5      109
Kr        368      14.4      239
Xe        407      19.4      590

Table 2.1. Lennard-Jones parameters (fit to atom-atom scattering data) defining the pair potentials of the diatoms of the indicated
elements according to Eq. (136). Note the increase of atomic size (reflected by $\sigma$) with $Z$ and the rapid increase in atomic polarizability in
going from He to Ne (reflected by the $-R^{-6}$ coefficient $4\varepsilon\sigma^6 \simeq \alpha^2$).
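The Lennard-Jones form (136) and the last column of Table 2.1 can be checked with a short Python sketch. The helper names are hypothetical; the conversion constants ($a_0 = 52.9177$ pm, $E_{\rm Ha} = 27211.4$ meV) are standard values inserted as assumptions; the position of the minimum, $R_M = 2^{1/6}\sigma$, follows from setting the derivative of Eq. (136) to zero.

```python
BOHR_PM = 52.9177        # a0 in pm (assumed standard value)
HARTREE_MEV = 27211.4    # E_Ha in meV (assumed standard value)

# (sigma [pm], epsilon [meV]) from Table 2.1
LJ_PARAMETERS = {
    "He": (256.0, 0.879),
    "Ne": (275.0, 3.08),
    "Ar": (340.0, 10.5),
    "Kr": (368.0, 14.4),
    "Xe": (407.0, 19.4),
}


def v_lj(R, sigma, epsilon):
    """Lennard-Jones potential, Eq. (136); R and sigma in the same units."""
    x6 = (sigma / R) ** 6
    return 4.0 * epsilon * (x6 * x6 - x6)


def r_minimum(sigma):
    """Position of the minimum of Eq. (136)."""
    return 2.0 ** (1.0 / 6.0) * sigma


def c6_atomic_units(sigma_pm, epsilon_meV):
    """Coefficient 4 epsilon sigma^6 of the -R^-6 tail, in E_Ha a0^6."""
    return 4.0 * (epsilon_meV / HARTREE_MEV) * (sigma_pm / BOHR_PM) ** 6


for element, (sigma, epsilon) in LJ_PARAMETERS.items():
    r_m = r_minimum(sigma)
    print(f"{element}: R_M = {r_m:.0f} pm, V(R_M) = {v_lj(r_m, sigma, epsilon):.3f} meV, "
          f"4 eps sigma^6 = {c6_atomic_units(sigma, epsilon):.3g} E_Ha a0^6")
```

The well depth at $R_M$ equals $-\varepsilon$ by construction, and the computed coefficients agree with the tabulated column to within rounding.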
2.2.4. Classification of bonding. We have clarified the mechanism for the
general tendency, pictured in Fig. 2.1, of atoms to attract at large distance and repel when coming in contact. In contrast to this qualitative likeness, significant differences in the equilibrium distances and huge differences in well depths make different
diatoms, bonded by different mechanisms, very unlike.
When at least one of the atoms is a noble gas, the dipole–induced-dipole mechanism is the only mechanism creating attraction, little or no covalency occurs, and
consequently the equilibrium atom-atom distance $R_M$ is rather large and the bond
energy $[V_{ad}(+\infty) - V_{ad}(R_M)]$ is small (few meV, see Table 2.1). Weakly bonded van
der Waals systems ordinarily retain the monoatomic gas phase down to relatively low temperature, and show a weaker tendency to form diatomic molecules than to condense
to monoatomic liquid and solid phases.
A few atoms with open shells (O, N, F) show a prominent tendency to form diatomic
molecules, with short strong covalent bonds, with typical bond energies of the order
of a few eV. These molecular units are retained in the low-temperature liquid and
solid phases. For many other elements (e.g. Li, Be, B, C), the extra energy gain
in forming many bonds per atom makes them form extended metallic or covalent
solids, rather than diatoms. When different atoms are covalently bound together,
some amount of electronic charge moves closer to one nucleus than to the other, as
the energy of the atomic shells involved in bonding is different. This phenomenon
makes heteronuclear bonds polar. The extreme limit of complete or almost complete
charge transfer is named ionic bond (e.g. HF, LiF): energies and lengths involved
are in the same range as for the covalent bonds.
2.3. Molecular spectra
In the previous Section, we have discussed general properties of the solutions of
the electronic equation (117) for a diatom, thus acquiring information on the typical
shape of the adiabatic potential $V_{ad}$. We consider now the motion of the two nuclei
in the adiabatic force field described by $V_{ad}$, with its spectroscopic implications.
This motion can be described in terms of solutions of Eq. (124).
As remarked in Sec. 2.1, the adiabatic potential is independent of the center-of-mass position of the molecule (translational invariance) and of the orientation in
space of the straight line through the two nuclei (rotational invariance). Precisely
the same transformation (21) applied to the two-body problem of the one-electron
atom separates the center-of-mass motion of the two-body problem of the diatomic
molecule. As for atoms, due to translational symmetry, the molecular center of
mass translates freely: the random thermal translational motion in a gas-phase
sample gives rise to Doppler and collisional broadening of the spectra.
By treating the internal degree of freedom in polar coordinates, as for the one-electron atom, we convert the equation for the relative coordinate into angular and
radial equations (25), (26), and (27), where in the latter $U(r)$ is replaced by the
distance-dependent adiabatic potential $V_{ad}(R)$ (the nucleus-nucleus separation $R$
here replaces the electron-nucleus distance $r$ of the one-electron atom). The angular
equations are universal, thus the angular wavefunctions, describing the orientation in
space of the molecule (thus molecular rotations), are standard spherical harmonics
$Y_{lm}$. Rotational states $|l, m\rangle$ are labeled by the molecular angular momentum $l$,¹
plus its $\hat z$ component $m$.
¹ In the literature, the molecular angular momentum quantum number is sometimes called $r$
and sometimes $j$, rather than $l$.
The formal structure of the radial equation is the same as for the one-electron
atom. The substantial physical difference lies in the equilibrium distance (the $R$
where the potential is most attractive) which for the diatom is finite, $R = R_M > 0$,
rather than $R = 0$ as for the one-electron atom. The main consequence is that the
radial motion is mostly localized near $R_M$, in a region where the centrifugal term
$\frac{\hbar^2 l(l+1)}{2\mu R^2}$ in the equation is often fairly small. If we neglect the variations of $R^{-2}$
along a region around $R_M$ where the radial wavefunctions differ significantly from
zero, then the radial motion is approximately independent of the rotation, i.e., the
radial solutions are independent of $l$. The radial quantum number $v = 0, 1, 2, ...$
for the diatomic molecule indicates the number of radial nodes, like $n - l - 1$ for
the one-electron atom. If the adiabatic potential is Taylor-expanded around its
minimum
$$V_{ad}(R) = V_{ad}(R_M) + \frac{1}{2} \left. \frac{d^2 V_{ad}(R')}{dR'^2} \right|_{R' = R_M} (R - R_M)^2 + \dots , \tag{137}$$
and truncated at second order, then the radial motion is approximately a harmonic
motion. In this approximation, the energy spectrum contains three additive contributions, of decreasing importance:
• A large electronic term $V_{ad}(R_M)$, whose lowering at the equilibrium configuration $V_{ad}(\infty) - V_{ad}(R_M)$ measures the well depth responsible for the
chemical bond.
• A vibrational term, which in the harmonic approximation amounts to
$$E_{vib}(v) = \hbar\omega \left( v + \frac{1}{2} \right) , \tag{138}$$
measuring the energy of radial vibration around the equilibrium position
$R_M$. Here
$$\omega = \sqrt{\frac{k}{\mu}} , \quad {\rm with} \ k = \left. \frac{d^2 V_{ad}(R')}{dR'^2} \right|_{R' = R_M} , \quad {\rm and} \ \mu = \frac{M_1 M_2}{M_1 + M_2} , \tag{139}$$
where $\mu$ indicates the reduced mass of the two-body oscillator composed
by the two nuclei of mass $M_1$ and $M_2$ (see Eq. (24)). Typical vibrational
energies $\hbar\omega$ of few hundred meV or less are observed.
• A rotational contribution from the mean rotational term $\frac{\hbar^2 l(l+1)}{2\mu R^2} \simeq \frac{\hbar^2 l(l+1)}{2\mu R_M^2}$
in the radial equation (27) yields:
$$E_{rot}(l) = \frac{\hbar^2 l(l+1)}{2\mu R_M^2} = \frac{|\vec L|^2}{2I} , \tag{140}$$
where $I$ indicates the classical moment of inertia $\mu R_M^2$ of the diatom
with respect to its center of mass, assuming the interatomic distance $R$ is
frozen at the equilibrium separation $R_M$. The typical order of magnitude
of the rotational energies $\hbar^2/(2I)$ in molecular spectra is few meV or less,
H$_2$ having the largest one, 7 meV.
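The orders of magnitude of the vibrational and rotational terms can be estimated with a short Python sketch. Everything numerical in it is an assumption made for illustration: the physical constants are standard values, the H$_2$ bond length is taken as the commonly tabulated $1.40\,a_0$, and the vibrational quantum is evaluated for an Ar$_2$ dimer modelled with the Lennard-Jones parameters of Table 2.1 (a case where the notes provide an explicit $V_{ad}$), using a finite-difference second derivative as in Eqs. (137) and (139). The function names are hypothetical.

```python
import math

HBAR = 1.054571817e-34   # J s
AMU = 1.66053907e-27     # kg
EV = 1.602176634e-19     # J
BOHR_M = 5.29177211e-11  # m


def reduced_mass_kg(m1_u, m2_u):
    """Reduced mass of Eq. (139), from masses in atomic mass units."""
    return m1_u * m2_u / (m1_u + m2_u) * AMU


def rotational_quantum_meV(m1_u, m2_u, r_m):
    """hbar^2 / (2 I) with I = mu R_M^2 (Eq. (140)); r_m in metres."""
    inertia = reduced_mass_kg(m1_u, m2_u) * r_m ** 2
    return HBAR ** 2 / (2.0 * inertia) / EV * 1e3


def harmonic_quantum_meV(potential_J, r_m, m1_u, m2_u, h=1e-12):
    """hbar*omega from a central-difference estimate of k = V''(R_M)."""
    k = (potential_J(r_m + h) - 2.0 * potential_J(r_m) + potential_J(r_m - h)) / h ** 2
    omega = math.sqrt(k / reduced_mass_kg(m1_u, m2_u))
    return HBAR * omega / EV * 1e3


def lj_argon_J(R):
    """Lennard-Jones potential of Ar2 (sigma = 340 pm, epsilon = 10.5 meV), in J."""
    sigma, epsilon = 340e-12, 10.5e-3 * EV
    x6 = (sigma / R) ** 6
    return 4.0 * epsilon * (x6 * x6 - x6)


b_h2 = rotational_quantum_meV(1.00783, 1.00783, 1.40 * BOHR_M)
r_m_ar2 = 2.0 ** (1.0 / 6.0) * 340e-12
hw_ar2 = harmonic_quantum_meV(lj_argon_J, r_m_ar2, 39.948, 39.948)
print(f"H2: hbar^2/(2I) = {b_h2:.1f} meV;  Ar2 (LJ model): hbar*omega = {hw_ar2:.1f} meV")
```

The weakly bound, heavy Ar$_2$ dimer vibrates with a quantum of only a few meV, far below the few hundred meV typical of covalent bonds between light atoms.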
2.3.1. Rotational and ro-vibrational spectra. In an “adiabatic” transition
with the electrons remaining in the electronic ground state, observed spectra fulfill
the standard dipole selection rules: $\Delta l = \pm 1$. The dipole operator involved here is
the product of the internuclear separation times the charge difference permanently
attached to the two atoms. This charge difference vanishes for equal nuclei, thus
no dipole transition occurs for homonuclear molecules.² On the contrary, the large
dipole moments of strongly polar molecules, such as HF and HCl, produce intense
dipole transitions. The dipole moment of CO, a weakly polar molecule, is only about
10% that of HCl, thus producing much weaker infrared absorption.
Figure 2.8. Observed purely rotational spectrum of HCl gas. Here,
the $|l_i = 0\rangle \to |l_f = 1\rangle$ line close to 20 cm$^{-1}$ is not seen because of
limitations of the spectrometer.
² This is the reason why clean air (composed mainly of gas-phase N$_2$ and O$_2$) is essentially
transparent in the near infrared. Transparency in the visible range is associated to large gaps
(several eV) from the electronic ground state to the first allowed electronic excitation.
Rotational and vibrational molecular spectra are mostly observed in absorption
rather than emission, as the spontaneous emission rate of such low-energy transitions is very small, due to the $E^3$ dependence of the decay rate, Eq. (73). Purely
rotational spectra, usually in the far infrared region, are associated to $\Delta v = 0$,
$\Delta l = 1$ transitions. The energy difference between $|l_i\rangle$ and $|l_f\rangle = |l_i + 1\rangle$ states is
$$\Delta E_{rot}(l_i) = \frac{\hbar^2}{2I} \left[ (l_i + 1)(l_i + 1 + 1) - l_i (l_i + 1) \right] = \frac{\hbar^2}{I} \, [l_i + 1] . \tag{141}$$
Accordingly, if in the sample several initial rotational states are populated, then
the rotational spectrum contains several equally spaced lines. The energy spacing is
twice as large as the typical rotational quantum $\frac{\hbar^2}{2I}$. Figure 2.8 reports a characteristic purely rotational absorption spectrum. The measured separation of the lines
makes it possible to determine the interatomic equilibrium separation $R_M$ through Eq. (141).
Figure 2.9. Observed roto-vibrational spectrum of HCl gas: absorption in this region is associated to the “fundamental” transition
$|v = 0\rangle \to |v = 1\rangle$, decorated by the rotational P branch (at the left,
$|l_i\rangle \to |l_i - 1\rangle$) and R branch (at the right, $|l_i\rangle \to |l_i + 1\rangle$). The isotopic
duplication of the lines is visible: H$^{35}$Cl is responsible for the stronger
peaks and less abundant H$^{37}$Cl for the weaker ones.
Figure 2.10. A scheme of the rotational transitions occurring on
top of a vibrational transition of frequency $\nu_0$ in a diatomic molecule.
Figure 2.11. Normal modes of vibration of polyatomic molecules:
(a) CO$_2$ ($\hbar\omega_1 = 166$ meV, $\hbar\omega_2 = 83$ meV, $\hbar\omega_3 = 291$ meV); (b) H$_2$O
($\hbar\omega_1 = 453$ meV, $\hbar\omega_2 = 198$ meV, $\hbar\omega_3 = 466$ meV).
Roto-vibrational spectra are observed typically in near-infrared absorption. Most
of the dipole intensity concentrates in $\Delta v = 1$ transitions, although in practice
weaker transitions with $\Delta v > 1$ are also observed routinely. Figure 2.9 reports
a characteristic roto-vibrational spectrum. Again, transitions occur starting from
several initial rotational states: as a consequence, the purely vibrational peak is
“decorated” by rotational transitions, which on the low-energy side imply $\Delta l = -1$
and are called P branch, and on the high-energy side imply $\Delta l = +1$ and are called
R branch, as illustrated in Fig. 2.10. The rotational structures are equally spaced
according to Eq. (141) (for the P branch a similar result holds). The roto-vibrational
spectra are characterized by the absence of a purely vibrational peak at energy $\hbar\omega$,
which could only occur if $\Delta l = 0$ were dipole allowed (which it is not).
Roto-vibrational spectra are often investigated through Raman spectroscopy, which, unlike infrared absorption, is not based on dipole transitions. Raman experiments are based on optical (electronic) non-resonant excitations of the molecule, which rapidly decay back to the electronic ground state, possibly leaving a vibrational and/or rotational excitation. As the experiment involves two photons, the selection rules allow $\Delta l = 0, \pm 2$ transitions.
The roto-vibrational dynamics of polyatomic molecules involves further intricacies. Once translations and rotations have been accounted for, $3N_n - 6$ (or $3N_n - 5$ for linear molecules) internal degrees of freedom correspond to vibrations, which are approximately described in terms of small harmonic oscillations (the classical normal modes) around the multi-dimensional minimum of $V_{\rm ad}$. Figure 2.11 sketches
Figure 2.12. Rotational states decorating vibrational states, in turn decorating different electronic states $\psi_e^{(a')}$ and $\psi_e^{(a'')}$ producing different (bonding) adiabatic potentials $V'_{\rm ad}(R)$ and $V''_{\rm ad}(R)$.
such modes of vibration for CO$_2$ and H$_2$O. The normal modes are then treated as quantum (harmonic) oscillators.
2.3.2. Electronic spectra. In the optical and ultraviolet photon region, excited electronic states are investigated. These transitions are similar to atomic excitations, and can basically be understood in terms of promotion of an electron from a filled molecular orbital to an empty one, e.g. from a bonding to an antibonding orbital. An electronic transition $\psi_e^{(a')} \to \psi_e^{(a'')}$ in a molecule leads from one adiabatic potential surface to another, as illustrated in Fig. 2.12. In general, the shapes of different adiabatic potentials $V'_{\rm ad}(R)$ and $V''_{\rm ad}(R)$ associated to different electronic states are well distinct, and in particular $R'_M \neq R''_M$. This means that electronic transitions are usually accompanied by vibrational transitions, excited by the displacement of the equilibrium geometry. Roughly equally spaced vibrational satellites decorate the
[Figure 2.13, panel (a): energy $E$ versus $R$ for N$_2$ and N$_2^+$, with the minima $R'_M$ and $R''_M$ marked; panel (b): measured spectrum.]
Figure 2.13. (a) A sketch of the adiabatic potentials (in the harmonic approximation) of two different molecular electronic states, here the electronic ground state of neutral N$_2$, and a generic electronic state of the ion N$_2^+$. Arrows highlight the most intense Franck-Condon transitions. (b) The observed photoemission spectrum of N$_2 \to$ N$_2^+$, showing sequences of vibrational satellites built on top of each electronic state.
electronic transitions (see Fig. 2.13), the spacings being given by the harmonic quantum $\hbar\omega''$ of the final adiabatic potential. The intensities of the different satellites are distributed according to matrix elements proportional to $|\langle v'=0|v''\rangle|^2$, where the vibrational ground state $|v'=0\rangle$ of the initial adiabatic potential is projected on all the final vibrational eigenstates $|v''\rangle$ in the excited electronic adiabatic potential. The most intense transitions involve those $|v''\rangle$ states with large overlap in the region of the original minimum $R_M$, as illustrated in Fig. 2.13 (Franck-Condon principle). Accordingly, a large number of significantly excited vibrational satellites indicates a large displacement of the equilibrium position in the electronic transition. For example, the spectrum of Fig. 2.13 indicates that, in the electronic transition from the ground state of N$_2$ to N$_2^+$, $R_M$ shifts more in going to the $^2\Pi_u$ state than to the $^2\Sigma_g^+$ or $^2\Sigma_u^+$ states. Rotational structures accompanying the electronic-vibrational structures are complicated by the change in moments of inertia from $I'$ to $I''$ and by occasional changes in electronic angular momentum.
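The link between the displacement of the minimum and the number of visible satellites can be illustrated with a minimal sketch. It assumes, beyond what the text states, that the two harmonic wells have the same frequency: in that special case the overlaps $|\langle v'=0|v''\rangle|^2$ follow a Poisson distribution whose mean is the dimensionless parameter $S = \mu\omega(\Delta R)^2/(2\hbar)$. The function names and the two values of $S$ are illustrative choices, not data from Fig. 2.13.

```python
import math

def franck_condon_factors(s_parameter, n_max):
    """|<v'=0|v''>|^2 for v'' = 0..n_max, assuming two displaced harmonic
    wells of equal frequency (Poisson distribution of mean s_parameter)."""
    return [math.exp(-s_parameter) * s_parameter**v / math.factorial(v)
            for v in range(n_max + 1)]

def visible_satellites(factors, threshold=0.1):
    """Count the satellites whose intensity is at least `threshold` times the strongest."""
    strongest = max(factors)
    return sum(1 for f in factors if f >= threshold * strongest)

# Hypothetical small and large displacements of the equilibrium position.
small_shift = franck_condon_factors(0.2, 20)
large_shift = franck_condon_factors(3.5, 40)

print("small shift:", visible_satellites(small_shift), "visible satellites")
print("large shift:", visible_satellites(large_shift), "visible satellites")
```

With a small displacement almost all the intensity stays in the $v''=0$ line, while a large displacement spreads it over many satellites and moves the strongest line to $v''>0$.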
2.3.3. Zero-point effects. We conclude this Section with an amusing detail: the bond energy of a diatom is slightly less than the depth $V_{\rm ad}(+\infty) - V_{\rm ad}(R_M)$
[Figure 2.14 plot: $V_{\rm ad}(R)$ in units of the depth of the well versus $R/R_M$, with the vibrational levels $v = 0, 1, 2$ and the binding energies $E_b^{\rm (class)}$ and $E_b^{\rm (quant)}$ marked.]
Figure 2.14. The role of the zero-point vibrational motion in the precise definition of the bond energy of a diatomic molecule. The zero-point energy is a consequence of Heisenberg's uncertainty principle, or, in other words, of the nonzero value of the quantum kinetic term $\langle T_n\rangle$ in a position-localized state. The zero-point energy decreases as the atomic masses $M_\alpha$ increase, and would eventually vanish in the limit of a classical adiabatic motion of the atoms.
of the adiabatic potential well, which would apply if the nuclear masses were infinite, or equivalently if the dynamics of the ions were classical. Due to quantum zero-point motion, associated to Heisenberg's uncertainty, the actual ground-state energy includes a vibrational contribution, that in the harmonic approximation equals $E_{\rm vib}(0) = \frac{1}{2}\hbar\omega$, see Eq. (138). As illustrated in Fig. 2.14, when zero-point energy is accounted for, the actual binding energy reduces to $E_b = V_{\rm ad}(+\infty) - [V_{\rm ad}(R_M) + E_{\rm vib}(0)]$. This usually small effect can be probed by changing the nuclear isotopic masses, thus modifying the vibrational frequency $\omega \propto \mu^{-1/2}$, without significantly affecting $V_{\rm ad}(R)$. The zero-point effect is the most spectacular in $^4$He$_2$, which is so extremely weakly bound (see Table 2.1) that a single bound “vibrational” level is observed [?], with a zero-point energy $E_{\rm vib}(0)$ that almost completely cancels the adiabatic attraction (the well depth is approximately 900 µeV), to a total net binding energy $E_b \simeq 0.1$ µeV only! Of the 25% lighter $^3$He$_2$, no bound state is observed.
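The isotope effect described above can be sketched numerically. The snippet below only applies the harmonic formulas $E_b = V_{\rm ad}(+\infty) - V_{\rm ad}(R_M) - \frac{1}{2}\hbar\omega$ and $\omega \propto \mu^{-1/2}$; the well depth and vibrational quantum are hypothetical numbers chosen for illustration, not data for any specific molecule.

```python
import math

def binding_energy(well_depth, vib_quantum):
    """E_b = well depth minus the zero-point energy (harmonic approximation)."""
    return well_depth - 0.5 * vib_quantum

def isotope_scaled_quantum(vib_quantum, mu_old, mu_new):
    """hbar*omega scales as mu**(-1/2) when V_ad(R) is left unchanged."""
    return vib_quantum * math.sqrt(mu_old / mu_new)

# Hypothetical diatom: well depth 4.0 eV, vibrational quantum 0.40 eV.
depth = 4.0
quantum_light = 0.40
# Doubling both nuclear masses doubles the reduced mass mu.
quantum_heavy = isotope_scaled_quantum(quantum_light, mu_old=1.0, mu_new=2.0)

eb_light = binding_energy(depth, quantum_light)
eb_heavy = binding_energy(depth, quantum_heavy)
print(f"light isotope: E_b = {eb_light:.3f} eV")
print(f"heavy isotope: E_b = {eb_heavy:.3f} eV")
```

The heavier isotope sits lower in the same well and is therefore slightly more strongly bound, which is the measurable signature of the zero-point term.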
The present Chapter summarizes a few basic ideas and experimental evidence in the field of molecular physics. The concepts of adiabatic separation and chemical bonding stand at the heart of the physics and chemistry of matter: they open the way to all understanding of the dynamics of systems composed of more than a single atom, including not just molecules but also polymers, clusters, solids. The few concepts sketched in these pages only scratch the surface of the extremely rich field of molecular spectroscopy, which provides detailed microscopic information about the geometry and dynamics of diatomic and polyatomic molecules [?, ?, ?]. Besides spectroscopic characterization, molecules are produced, transformed, and studied by means of all sorts of chemical reactions: these can mostly be analyzed conceptually in terms of the dynamics of atoms guided by suitable multi-dimensional potential energy surfaces $V_{\rm ad}(R)$. The qualitative understanding and quantitative study of these phenomena constitute the hard core of chemistry and chemical physics.
CHAPTER 3
Statistical physics
The purpose of statistical mechanics is to relate average properties of “macroscopic” objects (thermodynamical quantities) to the fundamental interactions governing their microscopic dynamics. We have seen in the previous chapters that many
detailed properties of individual atoms and molecules can be understood on the basis
of the microscopic electromagnetic interactions driving the motion of the composing electrons and nuclei. The number of electrons and nuclei in atoms and small
molecules does not exceed a few hundred, a few thousand at most. Macroscopic
objects, as opposed to microscopic systems, are characterized by huge numbers of
degrees of freedom. For example, a sodium-chloride crystal weighing 1 g, ready to be thrown in hot water for cooking pasta, is composed of about $3\cdot10^{23}$ electrons and approximately $10^{22}$ Na and $10^{22}$ Cl nuclei. One may as well conceive some
wavefunction describing the dynamics of such a huge number of degrees of freedom,
but must also readily give up any hope to ever store the huge amount of information
that even an approximate wavefunction holds to describe in detail this system. On
the other hand, this limitation is not especially bad, since such intimate details of
the dynamics of our salt crystal as the individual motions of electrons and nuclei are
probably boring and of little practical interest. A physicist or a materials engineer
is rather interested in the measurable average macroscopic properties of objects and
substances, such as stiffness, tensile strength, heat capacity, heat and electrical conductivity, magnetic susceptibility, phase transitions... To pursue the aim of drawing
a link between the microscopic dynamics and the macroscopical thermodynamical
properties, equilibrium statistical physics borrows its mathematical tools from the
theory of probability and statistics.
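The particle counts quoted for the 1 g salt crystal can be checked with a short estimate. The molar masses and the Avogadro constant below are standard rounded values inserted here as assumptions; the function name is illustrative.

```python
AVOGADRO = 6.02214076e23   # 1/mol
MOLAR_MASS_NA = 22.990     # g/mol (rounded)
MOLAR_MASS_CL = 35.45      # g/mol (rounded)
Z_NA, Z_CL = 11, 17        # electrons carried by each Na and each Cl

def nacl_counts(mass_grams):
    """Return (number of NaCl formula units, number of electrons) in the sample."""
    formula_units = mass_grams / (MOLAR_MASS_NA + MOLAR_MASS_CL) * AVOGADRO
    electrons = formula_units * (Z_NA + Z_CL)
    return formula_units, electrons

units, electrons = nacl_counts(1.0)
print(f"{units:.2e} Na nuclei, as many Cl nuclei, {electrons:.2e} electrons")
```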
3.0.4. Probability and statistics. All of statistical reasoning is based on the notion of probability. The naive notion of probability coincides with the relative number of observations. For example, after rolling a die a large number $N$ of times, we observe “two” $N_2$ times: the ratio $N_2/N$ is an estimate of the probability $P_2$ of obtaining “two” in a single roll. However, we have also an a priori idea of probability. We will assert that $P_2 = \frac{1}{6}$, even against contradictory observation, unless we have evidence that the die is loaded. This a priori notion of probability
is at the basis of a more rigorous definition of probability as a measure defined on a
“space of events”, such that the measure of all space equals unity.
(1) It is a basic assumption of measure theory that for non intersecting sets of
events A and B, the probability of A ∪ B (i.e. that any event in either A or
B is realized) equals PA + PB .
(2) Given two spaces of events (which could intersect, or even coincide), two
sets of events A and B belonging to the first and second space are called
independent if the probability of A and B (i.e., an event is realized that
satisfies the conditions for belonging to A and at the same time those for
belonging to B) is the product P (A) · P (B).
Both these properties are trivial when probability and relative number of observations are identified.
The two basic properties of probability sketched above allow us to derive probabilities of complicated events in terms of probabilities of elementary events. For example, the probability of obtaining two ones when two independent dice are rolled is $\frac{1}{6}\cdot\frac{1}{6} = \frac{1}{36}$ (property 2); the probability for a four and a five is $\frac{1}{36} + \frac{1}{36} = \frac{1}{18}$, as both (a four on die 1 and a five on die 2) and (a five on die 1 and a four on die 2) are possible mutually exclusive events (property 1).
Statistics makes wide use of probability distributions: these are lists of probabilities of mutually exclusive sets of events which cover the whole space of events. For example, when the two rolled dice are considered, the outcome may be grouped in equal numbers (6 mutually exclusive possibilities, each of probability $\frac{1}{36}$) or different numbers ($6\cdot5/2 = 15$ mutually exclusive possibilities, each of probability $\frac{1}{18}$): using property 1, this leads to a distribution $P_{\rm equal} = \frac{1}{6}$, and $P_{\rm different} = \frac{5}{6}$. Clearly, the events considered exhaust all space, and the sum of all the probabilities in the distribution equals unity. It is a useful exercise to work out the distribution $P_{\rm both\ even}$, $P_{\rm even\ odd}$, $P_{\rm both\ odd}$ on the same space of events. Basically, a probability distribution makes probability a function of certain specifications of the events considered. Examples of popular distributions in statistical physics are the binomial distribution, the Poisson distribution, and the Gaussian distribution.
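The two-dice probabilities can be verified by brute-force enumeration of the 36 equally likely ordered outcomes. The sketch below uses exact fractions; the helper name `probability` is an illustrative choice.

```python
from fractions import Fraction
from itertools import product

# All ordered outcomes (die 1, die 2), each with a priori probability 1/36.
outcomes = list(product(range(1, 7), repeat=2))
p_each = Fraction(1, len(outcomes))

def probability(condition):
    """Sum the probabilities of the mutually exclusive outcomes satisfying `condition`."""
    return sum(p_each for a, b in outcomes if condition(a, b))

p_two_ones = probability(lambda a, b: a == 1 and b == 1)
p_four_and_five = probability(lambda a, b: {a, b} == {4, 5})
p_equal = probability(lambda a, b: a == b)
p_different = probability(lambda a, b: a != b)

# The exercise suggested in the text: parity of the two numbers.
p_both_even = probability(lambda a, b: a % 2 == 0 and b % 2 == 0)
p_even_odd = probability(lambda a, b: (a + b) % 2 == 1)
p_both_odd = probability(lambda a, b: a % 2 == 1 and b % 2 == 1)

print(p_two_ones, p_four_and_five, p_equal, p_different)
print(p_both_even, p_even_odd, p_both_odd)
```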
Quantum mechanics, even at the level of a single particle, or few of them, has an intrinsic statistical interpretation, in the probabilistic postulate of observation. When a measurement of observable $A$ is done on a quantum system initially in some state $|i\rangle$, the system will be found in the eigenstate $|a\rangle$ of $A$ associated to eigenvalue $a$, with probability $P_a(i) = |\langle a|i\rangle|^2$. The quantum state $|i\rangle$ contains in its belly all probability distributions corresponding to all possible operators associated to potential measurements that may be carried out on the system. However, statistical physics is not especially concerned about this statistical interpretation because, if the quantum mechanical system is left undisturbed, the ket evolves from
any given initial state according to the (deterministic) Schrödinger equation (7). On the contrary, the statistical description of a macroscopic system attempts to investigate its average properties without the need of a precise specification of the initial
conditions, but rather assuming that all “reasonable” initial conditions could occur
with equal probability. Ideally one would like to identify macroscopic properties
with time averages of the microscopic observables along the evolution dictated by
internal dynamics.
Two basic assumptions reconcile the irrelevance of the initial conditions with the
time-average viewpoint: equilibrium and ergodicity. Equilibrium requires that long
ago the system has undergone some initial transient, and that now, at the time of
interest, no systematic evolution is occurring any more, all collective quantities (e.g.
pressure) fluctuating in time around some well defined average value. Ergodicity
assumes that all kinds of states are randomly explored in a period of time short
with respect to the typical duration of measurements: subsequent times provide
independent random realizations of an underlying probability distribution. A dishonest dice roller violates ergodicity by controlling accurately the initial conditions,
rather than rolling blindly to generate successive truly random independent numbers. A similar violation may also occur in statistical physics, as illustrated in the
example of H2 nuclear spin. H2 molecules occur with total nuclear spin 1 (orthohydrogen) or 0 (parahydrogen). These states are almost degenerate, so they should
all occur with equal probability: one expects that the observed ratio of ortho- to
parahydrogen equals the degeneracy ratio 3:1. However ortho-para interconversion
is rather difficult: normal gaseous encounters either leave nuclear spins unaltered or
simply exchange them, thus leaving the total abundance of each species unaltered.
Normally, inter-conversions occur at the walls, in the neighborhood of magnetic
impurities. One could however keep the H2 sample in a vessel where all magnetic
impurities have been carefully removed. As a result, an anomalous abundance ratio of ortho- to parahydrogen can be stabilized for an extended time. Instead, if magnetic impurities are present, careless “shuffling” of the nuclear spins occurs, the expected 3:1 equilibrium distribution is readily recovered, and the system is ergodic. In brief, the ergodic hypothesis assumes that the system is sufficiently random that no conserved quantity prevents the access (in periods of time short with respect to the duration of the experiment) to some major part of the space of states.
When a system is at equilibrium and ergodic, time averages can safely be replaced
by averages over a suitable probability distribution of the microscopic states. We
now sketch the standard math used to describe a random distribution of quantum
states.
3.0.5. Quantum statistics and the density operator. We want to formalize our basic ignorance of the precise quantum state of a system, while describing
correctly its statistical properties [?]. A system may be found in any of a number of quantum states $|a_1\rangle, |a_2\rangle, \ldots$ with probability $w_1, w_2, \ldots$ respectively. The states $|a_1\rangle, |a_2\rangle, \ldots$ need not be eigenstates of any observable, they need not be orthogonal, and in number they could even exceed the dimension of the Hilbert space of states. The normalization of probability requires that $\sum_i w_i = 1$.
On average, the measurement of an observable $B$ on such a system should provide
(142) $[B] = \sum_i w_i \langle a_i|B|a_i\rangle = \sum_i w_i \sum_b b\, |\langle a_i|b\rangle|^2$,
where $\{|b\rangle\}$ is the basis of eigenkets of $B$, with eigenvalues $b$. We introduce the square brackets [ ] to indicate the statistical mean of the quantum average values.
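The averages (142) are naturally computed based on the statistical density operator
(143) $\hat\rho = \sum_i w_i\, |a_i\rangle\langle a_i|$,
as
(144) $[B] = \sum_i w_i \langle a_i|B|a_i\rangle = \sum_i w_i \sum_b b\, \langle b|a_i\rangle\langle a_i|b\rangle = \sum_b b\, \langle b| \Big( \sum_i w_i\, |a_i\rangle\langle a_i| \Big) |b\rangle = \sum_b b\, \langle b|\hat\rho|b\rangle = \sum_b \langle b|\hat\rho B|b\rangle = {\rm Tr}(\hat\rho B)$.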
The density operator collects all dynamical and statistical properties of the system. $\hat\rho$ is Hermitian, it can therefore be diagonalized. On its diagonal basis, the density operator is expressed as:
(145) $\hat\rho = \sum_m P_m\, |\rho_m\rangle\langle\rho_m|$,
where the kets $|\rho_m\rangle$ form a complete orthonormal basis of the Hilbert space (in contrast to the $|a_i\rangle$ of the definition (143)), associated to real eigenvalues $P_m$. Note the normalization
(146) ${\rm Tr}(\hat\rho) = \sum_b \langle b|\hat\rho|b\rangle = \sum_b \sum_i w_i \langle b|a_i\rangle\langle a_i|b\rangle = \sum_i w_i \sum_b \langle a_i|b\rangle\langle b|a_i\rangle = \sum_i w_i \langle a_i|a_i\rangle = 1$,
thus also $\sum_m P_m = 1$, and the eigenvalues $P_m$ can be interpreted as probabilities (all $P_m \ge 0$).
The special case of a statistical operator describing a single pure state $\hat\rho = |a_1\rangle\langle a_1|$ falls back to standard deterministic quantum mechanics: $[B] = {\rm Tr}(\hat\rho B) = \langle a_1|B|a_1\rangle = \langle B\rangle$. In this case (and only in this case) $\hat\rho$ has the property of a projector $\hat\rho^2 = \hat\rho$.
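The density-operator identities of this Section can be checked on the smallest possible example. The sketch below is an illustration under stated assumptions: a two-dimensional Hilbert space with real amplitudes, an equal-weight mixture of two non-orthogonal states (spin up along z and spin up along x), and $B = \sigma_z$. All helper names are hypothetical.

```python
import math

def outer(ket):
    """|a><a| for a real two-component ket."""
    return [[ket[r] * ket[c] for c in range(2)] for r in range(2)]

def mat_mul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

def trace(a):
    return a[0][0] + a[1][1]

def expectation(ket, op):
    """<a|B|a> for a real ket."""
    return sum(ket[r] * op[r][c] * ket[c] for r in range(2) for c in range(2))

up = [1.0, 0.0]
plus_x = [1.0 / math.sqrt(2.0), 1.0 / math.sqrt(2.0)]  # not orthogonal to `up`
states = [up, plus_x]
weights = [0.5, 0.5]
sigma_z = [[1.0, 0.0], [0.0, -1.0]]

# Density operator, Eq. (143).
rho = [[sum(w * outer(s)[r][c] for w, s in zip(weights, states))
        for c in range(2)] for r in range(2)]

mean_direct = sum(w * expectation(s, sigma_z) for w, s in zip(weights, states))  # Eq. (142)
mean_trace = trace(mat_mul(rho, sigma_z))                                         # Eq. (144)
norm = trace(rho)                                                                 # Eq. (146)
purity = trace(mat_mul(rho, rho))  # equals 1 only for a pure state

print(mean_direct, mean_trace, norm, purity)
```

The two ways of computing $[B]$ agree, the trace of $\hat\rho$ is 1, and ${\rm Tr}(\hat\rho^2) < 1$ signals that this mixture is not a pure state.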
Figure 3.1. The whole “universe” U partitioned in two weakly interacting parts: a “system” S plus the “rest of the universe” W.
3.1. Equilibrium ensembles
The basic postulate of equilibrium statistical mechanics is: in accordance with the ergodic hypothesis, all quantum states of the “universe” U with energy $E \le E_U \le E + \Delta E$ are equally likely. In other words, the probability of a state $|i\rangle$ is a constant $P_i^U = 1/\Omega_U(E;\Delta E)$ (the same for any $|i\rangle$) for states in this range of energy and zero otherwise (microcanonical ensemble). $\Omega_U(E;\Delta E)$ represents the number of accessible states for the whole universe in the allowed energy range $E \le E_U \le E + \Delta E$, so that probability is correctly normalized to unity. From this very “democratic” postulate, we derive the probability distribution of a system S in thermal equilibrium with the rest of the universe W (Fig. 3.1). The total Hamiltonian $H_U = H_S + H_W + H_{SW}$ involves only a very weak interaction term $H_{SW} \approx 0$, so that $E_U \approx E_S + E_W$.
We address the specific problem: What is the probability $P_m$ of finding the system S in a given quantum state $|m\rangle$ of energy $E_m$? As the coupling between S and W is weak, we can safely assume that the precise state in S and that in W are distributed independently at random, except for the necessity of energy conservation. This means that
(147) $P^U = P_m^S\, P^W$.
Microcanonical ensembles govern both U and W = U − S individually. Accordingly, like in U each state has probability $P^U = 1/\Omega_U(E;\Delta E)$, in W each state has probability $P^W = 1/\Omega_W(E - E_m;\Delta E)$ ($E - E_m$ is the residual energy of W, once the energy $E_m$ of S has been removed from the total energy). From Eq. (147) we extract
the probability distribution of S:
(148) $P_m^S = \frac{P^U}{P^W} = \frac{1/\Omega_U(E;\Delta E)}{1/\Omega_W(E - E_m;\Delta E)} = \frac{\Omega_W(E - E_m;\Delta E)}{\Omega_U(E;\Delta E)}$.
Now we recognize that for most states the energy of the system $E_m$ is a tiny part of the total energy of the universe $E$: we then Taylor-expand the numerator (rather: its logarithm$^1$) around $E$:
(149) $\ln\Omega_W(E - E_m;\Delta E) = \ln\Omega_W(E;\Delta E) - \beta E_m + \ldots$,
where $\beta = \left.\frac{\partial \ln\Omega_W(E';\Delta E)}{\partial E'}\right|_{E'=E}$. The linear approximation for the logarithm is exceedingly good as long as $E_m \ll E_W$. We substitute it back into Eq. (148) and obtain
(150) $P_m^S = \frac{\Omega_W(E;\Delta E)}{\Omega_U(E;\Delta E)}\, e^{-\beta E_m}$,
which is a very remarkable result for at least two reasons:
• the whole dependence on the state $m$ of S has been confined to the exponential factor, as the ratio $Z = \frac{\Omega_U(E;\Delta E)}{\Omega_W(E;\Delta E)}$ is independent of $E_m$;
• the probability of the state $|m\rangle$ of S only depends on its energy $E_m$, through a simple exponential.
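Why the logarithm is expanded rather than $\Omega_W$ itself can be seen in a toy model. The sketch below assumes, purely for illustration, a power-law growth $\Omega_W(E) \propto E^{N}$ with a large exponent $N$ (a hypothetical stand-in for a large environment); then $\beta = N/E$ and the exact ratio $\Omega_W(E - E_m)/\Omega_W(E)$ can be compared with both expansions.

```python
import math

N_EXPONENT = 1.0e4   # hypothetical: Omega_W(E) grows like E**N_EXPONENT
E_TOTAL = 1.0e4      # total energy E, arbitrary units
E_M = 1.0            # energy of the system state, E_M << E_TOTAL

# beta = d ln(Omega_W) / dE evaluated at E = E_TOTAL
beta = N_EXPONENT / E_TOTAL

# Exact ratio Omega_W(E - E_m) / Omega_W(E) = (1 - E_m/E)**N
exact_ratio = math.exp(N_EXPONENT * math.log1p(-E_M / E_TOTAL))

# First-order expansion of ln(Omega_W), as in Eq. (149)
from_log_expansion = math.exp(-beta * E_M)

# First-order expansion of Omega_W itself
from_direct_expansion = 1.0 - beta * E_M

print(exact_ratio, from_log_expansion, from_direct_expansion)
```

In this toy model the exponential obtained from the expansion of the logarithm is accurate to a small fraction of a percent, while the linear expansion of $\Omega_W$ itself fails completely.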
We may rewrite Eq. (150) inserting the normalization factor $Z^{-1}$ as
(151) $P_m = P(E_m) = \frac{e^{-\beta E_m}}{Z}$.
Here we omit the label S, as Eq. (151) describes the Boltzmann equilibrium probability distribution for the states of a generic system in weak thermal contact with a huge environment. This distribution is associated to the Gibbs canonical ensemble. As the probability distribution (151) is necessarily normalized, $Z$ can alternatively be expressed entirely in terms of properties of the system S under study, as
(152) $Z = \sum_m e^{-\beta E_m} = {\rm Tr}\, e^{-\beta H}$,
without any reference to U and W (so that we write H in place of HS ). Z is usually
called partition function. Note that the sum in Eq. (152) involves all microstates including, in particular, all degenerate components of degenerate levels. An equivalent
$^1$ The number $\Omega_W$ of microstates of a huge “rest of the universe” increases roughly exponentially with its energy, thus the expansion of $\ln\Omega_W$ is much more accurate and better convergent than that of $\Omega_W$.
formulation of the same sum is:
(153) $Z = \sum_E g(E)\, e^{-\beta E}$,
where $g(E)$ is the degeneracy, i.e. the number of states at energy $E$. Note also that the trace in Eq. (152) can be taken on any basis of convenience.
The density operator associated to this equilibrium probability distribution is diagonal in the energy representation:
(154) $\hat\rho_{\rm eq} = \sum_m \frac{e^{-\beta E_m}}{Z}\, |m\rangle\langle m| = \frac{1}{Z}\, e^{-\beta H}$.
This same density operator can however be written on any basis, by applying a suitable unitary transformation.
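A minimal numerical sketch of Eqs. (151)-(153) follows. The four-state spectrum (with one doubly degenerate level) and the value of β are hypothetical, chosen only to show that the sum over microstates and the sum over levels weighted by $g(E)$ give the same $Z$.

```python
import math
from collections import Counter

energies = [0.0, 1.0, 1.0, 2.0]  # hypothetical microstate energies, arbitrary units
beta = 1.5                        # inverse temperature, in the inverse of the same units

# Eq. (152): sum over all microstates, degenerate ones counted individually.
z_microstates = sum(math.exp(-beta * e) for e in energies)

# Eq. (153): sum over distinct levels, weighted by the degeneracy g(E).
degeneracy = Counter(energies)
z_levels = sum(g * math.exp(-beta * e) for e, g in degeneracy.items())

# Eq. (151): Boltzmann probabilities of the individual microstates.
probabilities = [math.exp(-beta * e) / z_microstates for e in energies]

# The ratio of two probabilities depends only on the energy difference.
ratio = probabilities[0] / probabilities[3]

print(z_microstates, z_levels)
print(probabilities, ratio)
```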
3.1.1. Connection to thermodynamics. Assume that the system S can be partitioned into two weakly interacting subsystems S$_1$ and S$_2$ ($E_{m_1 m_2} \approx E_{1 m_1} + E_{2 m_2}$). When thermal equilibrium is established between S$_1$, S$_2$, and the rest of the world W, the distribution
$P_{m_1 m_2} = Z^{-1} e^{-\beta E_{m_1 m_2}} = \frac{e^{-\beta(E_{1 m_1} + E_{2 m_2})}}{\sum_{n_1 n_2} e^{-\beta(E_{1 n_1} + E_{2 n_2})}}$,
can be factorized into
$P_{m_1 m_2} = \frac{e^{-\beta E_{1 m_1}}}{\sum_{n_1} e^{-\beta E_{1 n_1}}} \cdot \frac{e^{-\beta E_{2 m_2}}}{\sum_{n_2} e^{-\beta E_{2 n_2}}} = \frac{e^{-\beta E_{1 m_1}}}{Z_1} \cdot \frac{e^{-\beta E_{2 m_2}}}{Z_2}$.
This means that both subsystems at equilibrium follow Boltzmann statistics $P_{m_i}^{S_i} = e^{-\beta E_{i m_i}}/Z_i$, with the same β parameter. This suggests that the intensive quantity β, with dimensions of inverse energy, might be a function of temperature.
The natural statistical definition for the thermodynamic internal energy $U$ is the average of the energy operator
(155) $U = [H] = {\rm Tr}(\hat\rho H) = \sum_m E_m P_m = \frac{1}{Z} \sum_m E_m\, e^{-\beta E_m}$.
Differentiating $U$ with respect to β yields the squared energy fluctuation changed in sign:
(156) $\frac{\partial U}{\partial\beta} = \frac{-Z \sum_m E_m^2\, e^{-\beta E_m} + \left( \sum_m E_m\, e^{-\beta E_m} \right)^2}{Z^2} = -[H^2] + [H]^2 = -\left[ (H - [H])^2 \right]$.
Thus, $\frac{\partial U}{\partial\beta} \le 0$, i.e. the internal energy decreases when β increases. This suggests that β and temperature $T$ might be inversely related. The determination of the precise relationship is sketched below.
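The relation between $\partial U/\partial\beta$ and the energy fluctuation can be checked numerically with a centered finite difference. The four-level spectrum below is hypothetical; any finite spectrum would do.

```python
import math

def boltzmann_moments(energies, beta):
    """Return (mean energy [H], variance [H^2] - [H]^2) in the canonical ensemble."""
    weights = [math.exp(-beta * e) for e in energies]
    z = sum(weights)
    mean = sum(e * w for e, w in zip(energies, weights)) / z
    mean_sq = sum(e * e * w for e, w in zip(energies, weights)) / z
    return mean, mean_sq - mean * mean

energies = [0.0, 0.3, 1.0, 1.7]  # hypothetical spectrum, arbitrary units
beta, step = 2.0, 1.0e-5

# Centered finite-difference estimate of dU/dbeta.
u_plus, _ = boltzmann_moments(energies, beta + step)
u_minus, _ = boltzmann_moments(energies, beta - step)
du_dbeta = (u_plus - u_minus) / (2.0 * step)

_, variance = boltzmann_moments(energies, beta)
print(du_dbeta, -variance)
```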
First, note that Z is a multiplicative function (Z = Z1 Z2 for a system composed
of two subsystems), thus its logarithm is an additive function (ln Z = ln Z1 + ln Z2 ),
thus it must represent a thermodynamical extensive quantity. Define the function
(157) $F = -\frac{\ln Z}{\beta}$.
With this definition, we obtain a relation between the β-derivative of the extensive quantity $\ln Z$ and the internal energy $U$:
(158) $\left.\frac{\partial(\beta F)}{\partial\beta}\right|_{V,N} = -\frac{\partial(\ln Z)}{\partial\beta} = -\frac{1}{Z}\frac{\partial Z}{\partial\beta} = -\frac{\frac{\partial}{\partial\beta} \sum_m e^{-\beta E_m}}{Z} = -\frac{\sum_m \frac{\partial}{\partial\beta} e^{-\beta E_m}}{Z} = \frac{\sum_m E_m\, e^{-\beta E_m}}{Z} = [H] = U$,
the average energy, according to its statistical definition (155). Equation (158) identifies the β-derivative of $\beta F$ with the internal energy. This relation recalls a known thermodynamical identity involving the free energy and temperature: starting from the basic definition of the free energy $F = U - TS$ ($S$ = entropy), one finds
(159) $U = F + TS = F - T\frac{\partial F}{\partial T} = -T^2 \left( -T^{-2} F + T^{-1} \frac{\partial F}{\partial T} \right) = -T^2\, \frac{\partial(F/T)}{\partial T}$.
This can be written more compactly in terms of a derivative w.r.t. the inverse temperature:
(160) $\frac{\partial(F/T)}{\partial(1/T)} = U$.
In Eqs. (159) and (160) all T -derivatives are carried out at constant number of
particles N and volume V . By comparing Eq. (158) and Eq. (160) we see that
• it is perfectly natural to identify the function F , defined statistically in
Eq. (157), with the free energy F of thermodynamics;
• β must then be proportional to the inverse temperature 1/T .
The proportionality constant between $T$ and $1/\beta$ is known as the Boltzmann constant $k_B$, and represents the numerical conversion factor between temperature and energy:
(161) $\beta = \frac{1}{k_B T}$.
The numerical value of $k_B$, the ratio of the gas constant to the Avogadro constant, $k_B = R/N_A = 1.380658 \cdot 10^{-23}$ J/K $= 86.1734$ µeV/K, is obtained by comparison of Eq. (189) below with the empirical definition of temperature through the ideal-gas thermometer. In statistical mechanics, temperature always appears in the energy combination $k_B T$, and even more often in the β notation.
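As a quick illustration of the conversion between temperature and energy, the sketch below evaluates $k_B T$ and a Boltzmann factor at room temperature. It uses the value of $k_B$ quoted in the text; the function names are illustrative, and the 83 meV energy is the CO$_2$ quantum $\hbar\omega_2$ listed in the caption of Fig. 2.11.

```python
import math

K_B_EV_PER_K = 86.1734e-6   # Boltzmann constant in eV/K, value quoted in the text

def thermal_energy_mev(temperature_k):
    """k_B * T expressed in meV."""
    return K_B_EV_PER_K * temperature_k * 1.0e3

def beta_per_ev(temperature_k):
    """beta = 1 / (k_B * T) in units of 1/eV."""
    return 1.0 / (K_B_EV_PER_K * temperature_k)

def boltzmann_ratio(delta_e_mev, temperature_k):
    """Probability ratio of two states separated by delta_e_mev (upper / lower)."""
    return math.exp(-delta_e_mev / thermal_energy_mev(temperature_k))

kt_room = thermal_energy_mev(300.0)
beta_room = beta_per_ev(300.0)
ratio_bend = boltzmann_ratio(83.0, 300.0)

print(f"k_B T at 300 K = {kt_room:.2f} meV, beta = {beta_room:.1f} 1/eV")
print(f"state 83 meV above the ground state: relative probability {ratio_bend:.3f}")
```

At room temperature $k_B T$ is roughly 26 meV, so a state lying 83 meV higher is populated only at the level of a few percent relative to the ground state.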
[Figure 3.2 plots: $P(E_m)$ versus $E_m$, (a) on a linear vertical scale and (b) on a logarithmic vertical scale.]
Figure 3.2. The statistical meaning of temperature: (a) in a system at equilibrium with the “rest of the universe” the probability $P(E_m)$ of individual microstates $|m\rangle$ decays exponentially with their energy $E_m$. (b) The slope of the straight line representing $P(E_m)$ as a function of energy in a lin-log scale is precisely $-\beta = -\frac{1}{k_B T}$.
The physical meaning of temperature is now clear: the probability distribution of the microstates of any system in thermal equilibrium is determined uniquely by the energies of these states, in an extremely simple way: at a given temperature $T$, the probability $P(E_m)$ of a given state of energy $E_m$ is proportional to $\exp[-\beta E_m]$. Thus, $-\beta = -\frac{1}{k_B T}$ is the slope of the straight line representing $P(E_m)$ as a function of $E_m$ in a lin-log scale, as in Fig. 3.2.
Note that, because of the normalization factor $Z^{-1}$ in Eq. (151), all the energies of all states in the system determine the precise equilibrium probability for the occurrence of a given energy eigenstate $|m\rangle$. However, only the energy difference of two states $|m\rangle$ and $|n\rangle$ determines their probability ratio $\frac{P_m}{P_n} = \exp[\beta(E_n - E_m)]$.
The discussed relations, in particular Eq. (157), establish an explicit link between statistical mechanics and thermodynamics, i.e. between the microscopic dynamics and macroscopic observable average properties. Based on Eq. (157), one can then extract all thermodynamical properties of a system at equilibrium purely from its partition function $Z$. Here follows a summary of several basic extensive and intensive
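thermodynamical quantities:
(162) free energy: $F = -k_B T \ln Z = -\frac{\ln Z}{\beta}$
(163) internal energy: $U = F + TS = \left.\frac{\partial(\beta F)}{\partial\beta}\right|_{V,N} = -\left.\frac{\partial(\ln Z)}{\partial\beta}\right|_{V,N}$
(164) free enthalpy: $G = F + PV$
(165) enthalpy: $H = F + PV + TS$
(166) entropy: $S = -\left.\frac{\partial F}{\partial T}\right|_{V,N} = \frac{U - F}{T} = \frac{U}{T} + k_B \ln Z$
(167) heat capacity: $C_V = \left.\frac{\partial U}{\partial T}\right|_{V,N} = -k_B \beta^2 \left.\frac{\partial U}{\partial\beta}\right|_{V,N} = k_B \beta^2\, \frac{\partial^2 \ln Z}{\partial\beta^2}$
(168) pressure: $P = -\left.\frac{\partial F}{\partial V}\right|_{\beta,N} = \left(\frac{N}{V}\right)^2 \left.\frac{\partial(F/N)}{\partial(N/V)}\right|_{\beta}$.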
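The summary above can be exercised on a toy model. The sketch below evaluates Eqs. (162), (163), (166) and (167) from $\ln Z$ alone, using finite differences in β. It assumes a hypothetical two-level system with unit gap and units in which $k_B = 1$; the closed-form results for that model, $U = 1/(e^{\beta}+1)$ and $C_V = \beta^2 e^{\beta}/(e^{\beta}+1)^2$, serve as a check.

```python
import math

K_B = 1.0  # assumption: temperature measured in energy units (k_B = 1)

def ln_z(energies, beta):
    """Logarithm of the partition function, Eq. (152)."""
    return math.log(sum(math.exp(-beta * e) for e in energies))

def thermodynamics(energies, temperature, step=1.0e-4):
    """Return (F, U, S, C_V) computed from ln Z by finite differences in beta."""
    beta = 1.0 / (K_B * temperature)
    lz_minus = ln_z(energies, beta - step)
    lz_zero = ln_z(energies, beta)
    lz_plus = ln_z(energies, beta + step)
    free_energy = -lz_zero / beta                              # Eq. (162)
    internal = -(lz_plus - lz_minus) / (2.0 * step)            # Eq. (163)
    entropy = (internal - free_energy) / temperature           # Eq. (166)
    second_derivative = (lz_plus - 2.0 * lz_zero + lz_minus) / step**2
    heat_capacity = K_B * beta**2 * second_derivative          # Eq. (167)
    return free_energy, internal, entropy, heat_capacity

two_level = [0.0, 1.0]  # hypothetical two-level system with unit gap
f, u, s, c_v = thermodynamics(two_level, temperature=0.5)
print(f, u, s, c_v)
```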
3.1.2. Entropy and the second principle. The connection (166) of entropy with statistics is only valid at equilibrium. A much more general definition of entropy is available, which applies for any statistical distribution of the quantum states defined by an arbitrary density operator $\hat\rho$ (not necessarily diagonal in the energy representation):
(169) $S = -k_B\, {\rm Tr}(\hat\rho \ln\hat\rho)$.
This definition conforms to the intuitive idea of entropy as a measure of disorder. Consider the basis $|\rho_m\rangle$ where $\hat\rho$ is diagonal. Here, $\hat\rho = \sum_m P_m\, |\rho_m\rangle\langle\rho_m|$. On this basis
(170) $S = -k_B \sum_m P_m \ln P_m$.
In this form we can compute $S$ for the most ordered distribution, a pure state, with $P_m = 1$ for a single state $m$, and 0 for all others, and obtain $S = 0$ since all terms vanish. $S$ increases when several states have nonzero probabilities. At the opposite limit, a completely random distribution with equal probability $P_m = \frac{1}{\Omega}$ for a number $\Omega$ of states yields $S = -k_B \sum_{m=1}^{\Omega} \frac{1}{\Omega} \ln\left(\frac{1}{\Omega}\right) = k_B \ln(\Omega)$. We conclude that entropy is a logarithmic measure of the number of states that the system accesses.
When the system is at equilibrium, the general statistical definition (169) coincides with Eq. (166). This is readily verified by substituting the equilibrium density
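operator (154) into Eq. (169), and evaluating the trace on the energy basis:
(171) $S = -k_B \sum_m P_m \ln P_m = -k_B \sum_m \frac{e^{-\beta E_m}}{Z} \ln\frac{e^{-\beta E_m}}{Z} = -k_B\, \frac{\sum_m e^{-\beta E_m} \left( \ln e^{-\beta E_m} - \ln Z \right)}{Z} = k_B\, \frac{\sum_m e^{-\beta E_m} (\beta E_m + \ln Z)}{Z} = k_B \left( \beta [H] + \frac{Z \ln Z}{Z} \right) = \frac{U}{T} + k_B \ln Z$.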
Moreover, it is possible to show [?] that if $\hat\rho$ is the density matrix of the Gibbs-Boltzmann equilibrium distribution (154), the entropy of the system, defined according to Eq. (169), is maximum under the constraint of assigned internal energy
(172) ${\rm Tr}(\hat\rho H) = U$.
This result applies to an arbitrary system at fixed internal energy, volume, and number of particles. It tells us that any generic density operator $\hat\rho_{\rm gen}$, with no other restriction, yields an entropy smaller than or equal to that of the equilibrium distribution $\hat\rho_{\rm eq}$ of Eq. (154).
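The maximum-entropy property can be probed numerically in a small example. The sketch below is only an illustration, not a proof: it assumes three hypothetical equally spaced levels, uses units in which $k_B = 1$, and restricts attention to distributions that are diagonal in the energy basis. The perturbation $(+t, -2t, +t)$ is chosen because, for levels 0, 1, 2, it changes neither the normalization nor the mean energy.

```python
import math

def gibbs_entropy(probabilities):
    """S / k_B = -sum_m P_m ln P_m, with the convention 0 ln 0 = 0."""
    return sum(-p * math.log(p) for p in probabilities if p > 0.0)

energies = [0.0, 1.0, 2.0]  # hypothetical equally spaced levels
beta = 1.0
weights = [math.exp(-beta * e) for e in energies]
z = sum(weights)
p_eq = [w / z for w in weights]
u_eq = sum(p * e for p, e in zip(p_eq, energies))

def perturbed(t):
    """Distribution with the same normalization and mean energy as p_eq."""
    return [p_eq[0] + t, p_eq[1] - 2.0 * t, p_eq[2] + t]

s_pure = gibbs_entropy([1.0, 0.0, 0.0])       # single pure state
s_uniform = gibbs_entropy([1.0 / 3.0] * 3)    # three equally likely states
s_eq = gibbs_entropy(p_eq)                    # Boltzmann distribution
s_trial = [gibbs_entropy(perturbed(t)) for t in (-0.05, -0.01, 0.01, 0.05)]

print(s_pure, s_uniform, s_eq)
print(s_trial)
```

Every trial distribution with the same internal energy has a lower entropy than the Boltzmann one, and the equilibrium value agrees with $\beta U + \ln Z$, i.e. Eq. (171) in units $k_B = 1$.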
It is observed experimentally that any isolated system evolves spontaneously toward equilibrium: then the result of maximum entropy proves the second principle of thermodynamics, that, in a spontaneous transformation toward equilibrium $\hat\rho_{\rm gen} \to \hat\rho_{\rm eq}$, entropy is bound to increase: $\Delta S = S_{\rm eq} - S_{\rm gen} \ge 0$. This proof can be generalized to a system that is not necessarily isolated, but for our purposes it suffices to retain that the experimental fact of entropy increase in the approach to equilibrium can be understood on statistical grounds.
Many textbooks, including Refs. [?, ?, ?], delve into the ideas briefly introduced
in the present Section in greater detail.
3.2. Ideal systems
Before attempting any understanding of the complicated macroscopic properties
of structured objects such as a bicycle or a bowl of soup, physicists have wisely
chosen to investigate simpler systems: (usually) macroscopically homogeneous pure
(or controllably “doped”) bunches of atoms or molecules all of the same kind, or
of few kinds. These simpler systems constitute the wide class of materials. The
rationale behind studying materials is that the properties of a complex structured
object can be understood in terms of the functionality of the individual pieces it
is composed of, whose function, in turn, depends on their shapes and material
properties. For example, to a very good approximation the total heat capacity of a
bicycle equals the sum of the heat capacities of the composing pieces.
The macroscopic properties of a material are often studied in the limit of an
infinitely wide sample, for which the interactions with the surrounding environment
(e.g. the containing vessel) are sufficiently weak to justify the assumptions of Sec. 3.1.
As the surface atoms/molecules directly interacting with the environment usually
involve a layer about ∼ 1 nm thick, for any sample whose linear dimensions (all three
of them) are larger than, say, ∼ 1 µm, the error induced by this bulk approximation
should not be too bad.
Even within the idealizations of a homogeneous bulk sample, the recipe
(1) compute the spectrum of energies $E_m$ and eigenstates $|m\rangle$ of the system;
(2) at given temperature $T$ (or $\beta = \frac{1}{k_B T}$), compute the partition function $Z$, Eq. (152);
(3) generate the equilibrium density operator $\hat\rho$ using Eq. (154);
(4) compute macroscopic average quantities as $[B] = {\rm Tr}(\hat\rho B)$, as in Eq. (144);
is not really applicable for any realistic system, due to the difficulty of step 1.
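For a system small enough that step 1 is trivial, the recipe can be followed literally. The sketch below assumes a hypothetical two-state system with energies $\mp\epsilon$ and an observable $B$ that is diagonal in the energy basis with values $\pm1$, so that the trace in step 4 reduces to a weighted sum; for this model the result must equal $\tanh(\beta\epsilon)$. The function name and the numerical values are illustrative.

```python
import math

def equilibrium_average(energies, diagonal_b, beta):
    """Steps 2-4 of the recipe for an observable diagonal in the energy basis."""
    weights = [math.exp(-beta * e) for e in energies]
    z = sum(weights)                              # step 2: partition function, Eq. (152)
    rho_diagonal = [w / z for w in weights]       # step 3: diagonal of rho, Eq. (154)
    return sum(p * b for p, b in zip(rho_diagonal, diagonal_b))  # step 4: Tr(rho B)

# Step 1 (trivial here): spectrum of a hypothetical two-state system.
eps = 0.7
energies = [-eps, +eps]
sigma_z = [+1.0, -1.0]   # value of the observable in each energy eigenstate
beta = 1.3

average = equilibrium_average(energies, sigma_z, beta)
print(average, math.tanh(beta * eps))
```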
However, for a wide class of ideal systems the programme of statistical mechanics can be carried to satisfactory conclusion. By ideal systems we mean systems composed of individual components (particles) whose mutual interactions are completely negligible. Ideal systems have
(173) $H = \sum_{i=1}^{N} H_i$,
where $H_i$ governs the dynamics of a small set of degrees of freedom, e.g. the position and spin of a particle. The simplicity of many ideal systems makes it possible to obtain exact partition functions, and thus to understand their thermodynamics by means of statistical methods.
No ideal system exists in nature (although photons and neutrinos are a very good
approximation to non-interacting particles), and if one did exist, strictly speaking
the methods of equilibrium statistical mechanics would be irrelevant for that system
since it would be non ergodic and have no means of reaching equilibrium. However, many properties of weakly interacting systems are fairly close to those of ideal
systems. Many properties of actual materials can be understood qualitatively, and
often even quantitatively, in terms of properties of ideal systems. Of course, the attention devoted to ideal systems should never make us forget their idealized nature,
and the fact that a quantitative understanding of the real thermodynamical properties of
real materials often requires more sophisticated conceptual tools.
Figure 3.3. The typical occupancy distribution of bosons among
single-particle energy levels for (a) low temperature, (b) intermediate
temperature, and (c) high temperature, assuming that, as is often the
case, the single-particle spectrum has no upper bound, so that most
individual levels remain unoccupied.
For an ideal system, the natural basis of states is a (properly symmetrized) factored basis (89), where the $\alpha_i$ quantum numbers identify the state $|\alpha_i\rangle$ of “particle”
$i$ in a system that (for a start) we assume to be composed of a single type of identical noninteracting particles. These could be individual noninteracting electrons, or
individual noninteracting atoms, or individual noninteracting molecules... If it were
not for symmetrization/antisymmetrization, the partition function would read
$$ Z \stackrel{?}{=} \sum_{\alpha_1 \alpha_2 \dots \alpha_N}^{\rm unrestricted} \exp[-\beta(E_{\alpha_1}+E_{\alpha_2}+\dots+E_{\alpha_N})] = \sum_{\alpha_1 \alpha_2 \dots \alpha_N}^{\rm unrestricted} \exp[-\beta E_{\alpha_1}]\,\exp[-\beta E_{\alpha_2}] \dots \exp[-\beta E_{\alpha_N}] , \tag{174} $$
with $E_\alpha$ indicating the single-particle eigenenergy of state $|\alpha\rangle$. However, due to the
symmetrization, exchanging any two sets of quantum numbers αi and αj leads to
the same symmetrized state (up to possibly a sign). The number of permutations
of the quantum numbers is simply N ! for the fermions, where all αi are different.
For bosons, the number of permutations of the quantum numbers giving the same
state is more intricate, as in general many of the $\alpha_i$ could be the same. To find
the number of permutations, it is sufficient to consider some ordering of the single-particle states (for example in order of increasing energy), and to note the numbers
$n_0, n_1, n_2, \dots$ of indexes $\alpha_i$ which in a given term in the sum (174) equal the lowest,
first excited, second excited, ... states.² The actual number of permutations of a
set of indexes $\alpha_1, \alpha_2, \dots, \alpha_N$ is $N!/(n_0!\, n_1!\, n_2! \dots)$. This expression remains valid
for fermions, as all $n_\alpha$ equal either 0 or 1. We have therefore the “correct counting”
partition function
$$ Z = \sum_{\alpha_1 \alpha_2 \dots \alpha_N} \frac{n_0!\, n_1!\, n_2! \dots}{N!}\, \exp\left[-\beta \sum_{i=1}^{N} E_{\alpha_i}\right] . \tag{175} $$
Here the difference between fermions and bosons is taken care of by the sum, which
respects Pauli’s principle for fermions, and remains unrestricted for bosons.
² Clearly, the sum of these occupation numbers is $\sum_\alpha n_\alpha = N$.
3.2.1. The high-temperature limit. At low temperature, the Boltzmann exponential factor tends to privilege those states with a total energy $\sum_\alpha n_\alpha E_\alpha$ as low
as possible: the system tends to order, with few single-particle states significantly
occupied (Fig. 3.3a): we shall come back to this regime in Sec. 3.2.2. When temperature is high (small β), assuming, as is often the case, that the single-particle spectrum has unlimited high-energy excitations, a huge number of single-particle states
(those of energy < kB T ) become almost equally likely, and the system becomes extremely disordered (Fig. 3.3c). The states with all different quantum numbers are
overwhelmingly more numerous than those with two or more of them equal. As a result, at
high temperature the main contribution to the boson partition function comes from
states which have $n_\alpha = 0$ or 1 at most. For such states every factorial $n_\alpha!$ equals 1, so that
the counting factor in Eq. (175) reduces to $1/N!$. All terms with $n_\alpha > 1$ in the sum of (175),
violating Pauli’s principle for fermions, add a comparatively negligible term to a huge
Z. Thus, for both bosons and fermions at high temperature, the partition function
is approximately
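$$ Z \simeq \frac{1}{N!} \sum_{\alpha_1 \alpha_2 \dots \alpha_N}^{\rm unrestricted} \exp\left[-\beta \sum_{i=1}^{N} E_{\alpha_i}\right] = \frac{1}{N!}\, Z_1 \cdot Z_2 \cdot \dots \cdot Z_N = \frac{(Z_1)^N}{N!} , \tag{176} $$
where the single-particle partition function $Z_1 = \sum_\alpha e^{-\beta E_\alpha}$ only differs from the
total Z of Eq. (152) in that it involves single-particle states $|\alpha\rangle$ rather than collective
states $|m\rangle$.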
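The approach of the exact counting to the approximation (176) can be checked by brute force on a tiny system. The following is a minimal sketch, not taken from the notes: it assumes N = 2 spinless particles and a hypothetical equally spaced spectrum $E_n = n\,\epsilon$ truncated to a finite number of levels. The function name and parameters are illustrative only.

```python
import math
from itertools import combinations, combinations_with_replacement

def two_particle_partition_functions(kT, n_levels=400, eps=1.0):
    """Exact two-particle Z for bosons and fermions, plus the estimate Z1**2 / 2!.

    Assumed toy spectrum (not from the text): E_n = n * eps, n = 0 .. n_levels - 1.
    """
    beta = 1.0 / kT
    energies = [n * eps for n in range(n_levels)]
    z1 = sum(math.exp(-beta * e) for e in energies)
    # Bosons: every unordered pair of levels, double occupancy allowed.
    z_bose = sum(math.exp(-beta * (ea + eb))
                 for ea, eb in combinations_with_replacement(energies, 2))
    # Fermions: unordered pairs of distinct levels only (Pauli principle).
    z_fermi = sum(math.exp(-beta * (ea + eb))
                  for ea, eb in combinations(energies, 2))
    return z_bose, z_fermi, z1 ** 2 / 2.0

for kT in (0.5, 20.0):
    zb, zf, zc = two_particle_partition_functions(kT)
    print(f"kT = {kT:5.1f} eps:  Z_bose/Z_cl = {zb / zc:.3f},  Z_fermi/Z_cl = {zf / zc:.3f}")
```

In this toy model both ratios move toward 1 as kT grows well beyond the level spacing, while at low temperature the boson and fermion results depart from the classical estimate in opposite directions.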
In this limit, the free energy is
$$ F = -\frac{\ln Z}{\beta} \simeq -k_B T \ln\frac{(Z_1)^N}{N!} \simeq -N k_B T \ln\frac{e\, Z_1}{N} , \tag{177} $$
where we made use of the Stirling approximation
$$ \ln(N!) \simeq N \ln\frac{N}{e} , \tag{178} $$
which is very accurate for large N.³
³ An even more accurate version of Stirling’s formula is $\ln(N!) \simeq N \ln\frac{N}{e} + \frac{1}{2} \ln\left[\frac{\pi}{3}(1 + 6N)\right]$,
but the omitted logarithmic correction is by far negligible for large N.
If, as often occurs at high temperature, each of the N particles moves freely around
in the volume occupied by the sample, then the motion of its center of mass can
be separated as sketched in Sec. 1.1 for the one-electron atom. The translational
degrees of freedom commute with the internal ones. Accordingly, the 1-particle
partition function $Z_1$ can be factored
$$ Z_1 = Z_{1\,\rm tr}\, Z_{1\,\rm int} \tag{179} $$
into a translational times an internal part. The latter describes the statistical dynamics of the internal degrees of freedom of the particle (including rotations), and
depends therefore on the characteristic spectrum of its excitations. Instead, the
translational part is universal, in the sense that it only depends on the total mass
M of the particle.
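The quality of the Stirling approximation (178), and of the refined version quoted in footnote 3, can be checked numerically. The sketch below is only an illustration: it uses `math.lgamma` as the reference value of ln(N!), and the function names are hypothetical.

```python
import math

def ln_factorial_exact(n):
    # lgamma(n + 1) equals ln(n!) for non-negative n
    return math.lgamma(n + 1.0)

def ln_factorial_stirling(n):
    # Eq. (178): ln(N!) ~ N ln(N/e)
    return n * math.log(n / math.e)

def ln_factorial_refined(n):
    # Footnote 3: adds the logarithmic correction (1/2) ln[(pi/3)(1 + 6N)]
    return ln_factorial_stirling(n) + 0.5 * math.log(math.pi / 3.0 * (1.0 + 6.0 * n))

for n in (10, 1000, 10**6):
    exact = ln_factorial_exact(n)
    rel_simple = (ln_factorial_stirling(n) - exact) / exact
    rel_refined = (ln_factorial_refined(n) - exact) / exact
    print(f"N = {n:>7}: relative error {rel_simple:+.2e} (178), {rel_refined:+.2e} (refined)")
```

The relative error of Eq. (178) shrinks as N grows, which is why the logarithmic correction can be dropped for macroscopic N.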
To obtain Z1 tr , recall the spectrum of a freely translating particle. For simplicity,
assume that the particle is contained in a macroscopic cubic box of size L × L × L.
Take periodic boundary conditions, i.e. the wavefunction is the same at opposite
boundaries of the box, but the results (Eq. (184) onward) would not change if we
assumed that the wavefunction vanishes at the boundary (prove it!). The allowed
values of the j = x, y, z momentum components
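$$ p_j = \hbar k_j = \hbar\, \frac{2\pi}{L}\, n_j , \qquad n_j = 0, \pm 1, \pm 2, \pm 3, \dots \tag{180} $$
are associated to plane-wave eigenfunctions $\psi(\vec r) = L^{-3/2} \exp(i\, \vec k_{\vec n} \cdot \vec r)$, of translational
kinetic energy
$$ E_{\vec n} = \frac{|\vec p|^2}{2M} = \frac{(2\pi\hbar)^2 (n_x^2 + n_y^2 + n_z^2)}{2 M L^2} . \tag{181} $$
For macroscopically large L, the translational states form a “continuum”: the sum
yielding the translational partition function
$$ Z_{1\,\rm tr} = \sum_{\vec n} \exp(-\beta E_{\vec n}) = \sum_{n_x n_y n_z} \exp\left[-\beta\, \frac{(2\pi\hbar)^2 (n_x^2 + n_y^2 + n_z^2)}{2 M L^2}\right] \tag{182} $$
is conveniently approximated by the integral
$$ Z_{1\,\rm tr} = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} \exp\left[-\beta\, \frac{(2\pi\hbar)^2}{2 M L^2}\, (n_x^2 + n_y^2 + n_z^2)\right] dn_x\, dn_y\, dn_z . \tag{183} $$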
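This factorizes in the product of three identical Gaussian integrals.⁴ In total,
$$ Z_{1\,\rm tr} = \left(\frac{L}{\hbar} \sqrt{\frac{M k_B T}{2\pi}}\right)^{3} = \frac{L^3}{\Lambda^3} = \frac{V}{\Lambda^3} , \tag{184} $$
where we have introduced the thermal length
$$ \Lambda = \hbar \sqrt{\frac{2\pi}{M k_B T}} . \tag{185} $$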
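To see how well the integral (183) replaces the sum (182), one can evaluate a single Cartesian factor of the sum directly. In terms of the ratio a = L/Λ each term of the one-dimensional sum is exp(−π n²/a²), and the continuum result is simply a. The sketch below is illustrative: the function names and the cutoff `n_max` are assumptions, and the constants are the SI defining values.

```python
import math

H = 6.62607015e-34    # Planck constant, J s
K_B = 1.380649e-23    # Boltzmann constant, J/K

def thermal_length(mass_kg, temperature_k):
    """Thermal length of Eq. (185), rewritten as h / sqrt(2 pi M kB T)."""
    return H / math.sqrt(2.0 * math.pi * mass_kg * K_B * temperature_k)

def z_tr_1d_sum(box_over_lambda, n_max=2000):
    """One Cartesian factor of the sum (182), with a = L / Lambda.

    beta (2 pi hbar)^2 n^2 / (2 M L^2) = pi n^2 / a^2, so each term is exp(-pi n^2 / a^2).
    The cutoff n_max is adequate only for moderate a (terms beyond it are negligible).
    """
    a = box_over_lambda
    return sum(math.exp(-math.pi * (n / a) ** 2) for n in range(-n_max, n_max + 1))

for a in (0.5, 2.0, 10.0):
    print(f"L/Lambda = {a:4.1f}: sum = {z_tr_1d_sum(a):.6f}, integral = {a:.6f}")

# Thermal length of an electron at 300 K (mass value assumed from standard tables)
print(thermal_length(9.1093837e-31, 300.0))
```

The sum and the integral agree closely once the box is a few thermal lengths wide, and they disagree when L is comparable to or smaller than Λ, where the discreteness of the levels matters.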
We can now substitute the result for the single-particle partition function into the
global partition function (176) of the gas
$$ Z = \frac{(Z_{1\,\rm tr})^N}{N!}\, (Z_{1\,\rm int})^N = \frac{V^N}{N!\, \Lambda^{3N}}\, (Z_{1\,\rm int})^N . \tag{186} $$
⁴ Remember that $\int_{-\infty}^{\infty} \exp\left(-\frac{x^2}{2a^2}\right) dx = \sqrt{2\pi}\, a$.
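Differentiation yields the translational contribution to the internal energy
$$ U_{\rm tr} = -\frac{\partial}{\partial\beta} \ln Z_{\rm tr} = -\frac{\partial}{\partial\beta} \ln\frac{V^N}{N!\, \Lambda^{3N}} = 3N\, \frac{\partial}{\partial\beta} \ln\Lambda = 3N\, \frac{\partial}{\partial\beta} \ln\beta^{1/2} = \frac{3N}{2\beta} = \frac{3}{2}\, N k_B T , \tag{187} $$
compatible with the experimentally well established translational contribution $\frac{3}{2} k_B$
per particle to the heat capacity of high-temperature gases. Likewise, we obtain an
expression for the free-energy contribution of the translational motion:
$$ F = -N k_B T \ln\frac{e\, Z_{1\,\rm tr}\, Z_{1\,\rm int}}{N} = -N k_B T \left[\ln\frac{e V}{N \Lambda^3} + \ln Z_{1\,\rm int}\right] . \tag{188} $$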
Observe that Z1 int describes the internal degrees of freedom: it can therefore depend
on T , but not on the volume V of the sample. This observation and the definition
of pressure (168) allow us to obtain a remarkably general equation of state for the
ideal gas
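$$ P = -\left.\frac{\partial F}{\partial V}\right|_{T,N} = -\frac{\partial}{\partial V} \left.\left\{-N k_B T \left[\ln\frac{e V}{N \Lambda^3} + \ln Z_{1\,\rm int}\right]\right\}\right|_{\beta,N} = N k_B T\, \frac{\partial}{\partial V} \ln\frac{e V}{N \Lambda^3} = \frac{N k_B T}{V} . \tag{189} $$
This relation, obtained on purely statistical grounds, is equivalent to the well established
equation of state of perfect gases $P = \frac{nRT}{V}$, universally and quantitatively
valid for atomic and molecular gases at high temperature and low density. The
numerical value $k_B = \frac{R}{N_A}$ is determined by comparison $\frac{N k_B T}{V} = \frac{nRT}{V}$, with n moles
of gas containing $n N_A$ particles.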
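The step from the free energy (188) to the equation of state (189) can also be verified numerically, by differentiating F with respect to V at fixed T and N. The following sketch is illustrative only: the particle number, volume, thermal length and the value of the internal partition function are arbitrary assumed inputs, and the function names are hypothetical.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K

def free_energy(volume, n_particles, temperature, thermal_len, z_int=1.0):
    """Eq. (188): F = -N kB T [ln(e V / (N Lambda^3)) + ln Z_int]."""
    return -n_particles * K_B * temperature * (
        math.log(math.e * volume / (n_particles * thermal_len ** 3)) + math.log(z_int))

def pressure_numeric(volume, n_particles, temperature, thermal_len, z_int=1.0, rel_step=1e-5):
    """P = -dF/dV at fixed T and N, by a central finite difference."""
    dv = rel_step * volume
    f_plus = free_energy(volume + dv, n_particles, temperature, thermal_len, z_int)
    f_minus = free_energy(volume - dv, n_particles, temperature, thermal_len, z_int)
    return -(f_plus - f_minus) / (2.0 * dv)

# Assumed illustrative values: 1e22 particles in one liter at 300 K, Lambda = 0.05 nm
N, V, T, LAM = 1.0e22, 1.0e-3, 300.0, 5.0e-11
p_ideal = N * K_B * T / V
print(pressure_numeric(V, N, T, LAM), p_ideal)
# A V-independent internal partition function leaves the pressure unchanged
print(pressure_numeric(V, N, T, LAM, z_int=7.3))
```

The numerical derivative reproduces N k_B T / V whatever value is assumed for the internal partition function, as long as it does not depend on V.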
Other quantities are accessible experimentally. For example, the single-particle
center-mass kinetic-energy distribution yields the probability to find a particle in
a given energy interval. To obtain this distribution, we need the energy density of
translational states
$$ g_{\rm tr}(E) = \frac{M^{3/2}\, V}{\sqrt{2}\, \pi^2 \hbar^3}\, E^{1/2} , \tag{190} $$
compatible with the kinetic-energy expression (181).⁵ The single-particle center-mass kinetic-energy probability distribution is then (in the spirit of Eq. (153)) the
product of the density of states times the Boltzmann probability that a given state
is occupied:
$$ \frac{dP(E)}{dE} = g_{\rm tr}(E)\, \frac{e^{-\beta E}}{Z_{1\,\rm tr}} = \frac{(2M)^{3/2}\, V}{4\pi^2 \hbar^3}\, E^{1/2}\, \frac{\Lambda^3}{V}\, e^{-\beta E} = \frac{2}{\sqrt{\pi}}\, \beta^{3/2} E^{1/2} e^{-\beta E} . \tag{191} $$
⁵ This density of states can be obtained from the observation that the $\vec n$ values are evenly
distributed with unit density. According to Eq. (181), the energy is proportional to the squared
length of the $\vec n$ vector, $E_{\vec n} = A |\vec n|^2$. Therefore, the number of states with energy ≤ E equals the
volume of the $\vec n$-sphere of radius $|\vec n| = \sqrt{E/A}$. The density of states is the derivative of this
number of states with respect to energy: $g_{\rm tr}(E) = \frac{d}{dE}\left[\frac{4\pi}{3} \frac{E^{3/2}}{A^{3/2}}\right] = 2\pi\, \frac{E^{1/2}}{A^{3/2}}$, which gives Eq. (190)
after substituting the value of $A = \frac{(2\pi\hbar)^2}{2 M L^2}$ (from Eq. (181)), and $V = L^3$.
Figure 3.4. The distribution (191) of the center-mass kinetic energy
of the high-temperature ideal gas. [Plot: dP/dE in 1/meV versus E in meV, from 0 to 80 meV, with curves for T = 100 K and T = 300 K.]
This probability distribution is remarkably universal: it does not even depend on
the mass of the particles, but only on temperature (see Fig. 3.4).
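Similarly, one determines the center-mass velocity distribution. Each component
of $\vec v = \vec p / M$ is Gaussian-distributed as
$$ \frac{dP(v_j)}{dv_j} = \sqrt{\frac{\beta M}{2\pi}}\, \exp\left(-\beta\, \frac{M v_j^2}{2}\right) . \tag{192} $$
To obtain the distribution of speed $v = |\vec v|$, one can simply observe that the kinetic
energy E and speed v are connected by $E = \frac{M}{2} v^2$, and use the distribution (191):
$$ \frac{dP(v)}{dv} = \frac{dP(E)}{dE}\, \frac{dE}{dv} = M v\, \frac{dP(E)}{dE} = M v\, \frac{2}{\sqrt{\pi}}\, \beta^{3/2} \left(\frac{M v^2}{2}\right)^{1/2} e^{-\beta \frac{M}{2} v^2} = \sqrt{\frac{2}{\pi}}\, (M\beta)^{3/2}\, v^2\, e^{-\frac{\beta M}{2} v^2} . \tag{193} $$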
We find the celebrated Maxwell-Boltzmann equilibrium velocity distribution, reported in Fig. 3.5. The $v^2$ factor bears the same “polar” origin discussed for the
radial distribution of one-electron wavefunctions in Sec. 1.1.3: one should not forget
that the velocity $\vec v$ with largest probability is $\vec v = \vec 0$. Observations of the distribution
confirm the statistical analysis (see Fig. 3.6).
Figure 3.5. The distribution (193) of the single-particle center-mass
speed of the high-temperature ideal gas, rescaled by $w = (\beta M)^{-1/2}$.
This distribution peaks at $v_{\rm max} = \sqrt{2}\, w \simeq 1.414\, w$, while the mean
speed is $[v] = \sqrt{8/\pi}\, w \simeq 1.596\, w$, and the root mean square speed is
$[v^2]^{1/2} = \sqrt{3}\, w \simeq 1.732\, w$. [Plot: dP/dv in units of 1/w versus v/w from 0 to 4, with markers at $v_{\rm max}$, $[v]$ and $[v^2]^{1/2}$.]
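The three characteristic speeds quoted in the caption of Fig. 3.5 follow from the distribution (193) and can be recovered by simple numerical quadrature. The sketch below works in the rescaled variable u = v/w; the midpoint rule, the integration cutoff and the grid search for the peak are implementation choices made here for illustration, not part of the notes.

```python
import math

def speed_pdf(u):
    """Eq. (193) in units of w = (beta M)^(-1/2): dP/du = sqrt(2/pi) u^2 exp(-u^2/2)."""
    return math.sqrt(2.0 / math.pi) * u * u * math.exp(-0.5 * u * u)

def moment(power, u_max=12.0, steps=12000):
    """Midpoint-rule integral of u**power * speed_pdf(u) from 0 to u_max."""
    du = u_max / steps
    return sum(((i + 0.5) * du) ** power * speed_pdf((i + 0.5) * du)
               for i in range(steps)) * du

norm = moment(0)                    # normalization, expected close to 1
mean_speed = moment(1)              # [v] / w
rms_speed = math.sqrt(moment(2))    # [v^2]^(1/2) / w
# Most probable speed from a grid search with step 0.001
u_peak = max((i * 1e-3 for i in range(1, 5000)), key=speed_pdf)

print(norm, u_peak, mean_speed, rms_speed)
```

The ordering peak < mean < root mean square reflects the long high-speed tail of the distribution.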
Internal degrees of freedom do not affect the equation of state and the translational
distributions. However, they do contribute an additive temperature-dependent term
to the free energy (188) (the part $F_{\rm int}(T) = N F_{1\,\rm int}(T) = -N k_B T \ln Z_{1\,\rm int}$), to the
internal energy
$$ U = \frac{3}{2}\, N k_B T + N \left[F_{1\,\rm int}(T) - T F'_{1\,\rm int}(T)\right] , \tag{194} $$
and therefore also to the heat capacity
$$ C_V = \frac{3}{2}\, N k_B - N T F''_{1\,\rm int}(T) . \tag{195} $$
The internal term vanishes for structureless particles (e.g. electrons), while it contributes significantly to thermodynamics whenever the internal degrees of freedom
are associated to excitation energies not too different from kB T . This occurs occasionally for atoms, and occurs all the time for molecules, as illustrated in the
following examples.
Figure 3.6. (a) The velocity distribution of the vapor is probed [?]
by analyzing the atoms emerging from an oven through a tiny hole,
by letting them through the spiraling slot of a rotating cylinder to a
tungsten surface ionization detector. (b) Typical observed distribution
of the center-mass velocities for high-temperature K vapor, compared
to the curve of Eq. (193).
3.2.1.1. Internal degrees of freedom of molecules. We report here briefly the
example of diatomic molecules, in the approximation that rotational and vibrational motions are independent (Sec. 2.3). The partition function factorizes, $Z_{1\,\rm int} = Z_{1\,\rm vib}\, Z_{1\,\rm rot}$, and therefore the free energy $F_{1\,\rm int} = -k_B T \ln Z_{1\,\rm int} = F_{1\,\rm rot} + F_{1\,\rm vib}$.
The rotational partition function is determined uniquely by the dimensionless ratio
$\beta \hbar^2/(2I) = \Theta_{\rm rot}/T$, where the characteristic temperature $\Theta_{\rm rot} = \frac{\hbar^2}{2 I k_B}$:
$$ Z_{1\,\rm rot} = \sum_{l=0}^{\infty} (2l + 1) \exp\left[-\frac{\Theta_{\rm rot}}{T}\, l(l + 1)\right] . \tag{196} $$
This sum⁶ cannot be evaluated in closed form. However, the characteristic temperatures $\Theta_{\rm rot}$ are often very small (e.g. 85.4 K for H2 , 15.2 K for HCl, 2.86 K for N2 ).
⁶ The expression (196) is actually incorrect for the special case of homonuclear molecules,
where nuclear spin and indistinguishability play a role. Correct expressions for homonuclear
molecules are recovered essentially by multiplying $\Theta_{\rm rot}$ by two.
Figure 3.7. The temperature dependence of the rotational contribution to the internal energy per molecule $U_{1\,\rm rot}$ and heat capacity per
molecule $C_{V\,1\,\rm rot}$. [Plots: $U_{1\,\rm rot}/E_{\rm rot}$ and $C_{V\,1\,\rm rot}/k_B$ versus T/Θ from 0 to 2.]
When $T \gg \Theta_{\rm rot}$ the exponential in $Z_{1\,\rm rot}$ changes slowly with l and many terms
contribute to the sum in Eq. (196): it is a good approximation to replace it with an
integral, which is elementary with the substitution $y = l(l+1)$, $dy = (2l+1)\, dl$:
$$ Z_{1\,\rm rot} \simeq \int_0^{\infty} (2l+1) \exp\left[-\frac{\Theta_{\rm rot}}{T}\, l(l + 1)\right] dl = \int_0^{\infty} \exp\left(-\frac{\Theta_{\rm rot}}{T}\, y\right) dy = \frac{T}{\Theta_{\rm rot}} \qquad [T \gg \Theta_{\rm rot}] . \tag{197} $$
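The range of validity of the integral approximation (197) can be explored by summing the series (196) numerically. The sketch below is an illustration for the heteronuclear case; the function name and the cutoff rule for the number of terms are assumptions chosen so that the neglected terms are negligible.

```python
import math

def z_rot(t_over_theta, l_max=None):
    """Truncated sum (196), with t = T / Theta_rot (heteronuclear diatomic molecule)."""
    t = t_over_theta
    if l_max is None:
        # Generous assumed cutoff: terms with l well beyond sqrt(t) are negligible
        l_max = int(10.0 * math.sqrt(t)) + 20
    return sum((2 * l + 1) * math.exp(-l * (l + 1) / t) for l in range(l_max + 1))

for t in (0.1, 1.0, 10.0, 100.0):
    print(f"T/Theta_rot = {t:6.1f}: sum = {z_rot(t):9.4f}, T/Theta_rot estimate = {t:9.4f}")
```

At low temperature only the l = 0 term survives and the sum stays close to 1, while for T well above $\Theta_{\rm rot}$ the relative difference between the sum and T/Θ_rot becomes small.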
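The high-temperature rotational contribution to the thermodynamic functions is
therefore:
$$ F_{1\,\rm rot} = \frac{F_{\rm rot}}{N} \simeq -k_B T \ln\frac{T}{\Theta_{\rm rot}} \tag{198} $$
$$ U_{1\,\rm rot} = \frac{U_{\rm rot}}{N} \simeq k_B T \tag{199} $$
$$ C_{V\,1\,\rm rot} = \frac{C_{V\,\rm rot}}{N} \simeq k_B \tag{200} $$
$$ S_{1\,\rm rot} = \frac{S_{\rm rot}}{N} \simeq k_B + k_B \ln\frac{T}{\Theta_{\rm rot}} \tag{201} $$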
(see Eqs. (162), (163), (166), (167)). At lower temperature $T \sim \Theta_{\rm rot}$ instead,
truncating the series (196) to a finite number of terms $l_{\rm max} \approx 2 \sqrt{T/\Theta_{\rm rot}}$ approximates $Z_{1\,\rm rot}$ better. Figure 3.7 reports the temperature dependence of the rotational
heat capacity and internal energy per molecule. Characteristically, as temperature is raised, the rotational degree of freedom “unfreezes”, reaching the classical
equipartition limit at large temperature $T \gg \Theta_{\rm rot}$. Relative equilibrium populations
$P_l = (2l + 1) \exp\left[-\frac{\Theta_{\rm rot}}{T}\, l(l + 1)\right] / Z_{1\,\rm rot}$ account for the observed relative intensities
of the rotational structures in molecular spectra (Figs. 2.8 and 2.9).⁷
Figure 3.8. The exact temperature dependence of the vibrational
contribution to the internal energy per molecule $U_{1\,\rm vib}$ and heat capacity per molecule $C_{V\,1\,\rm vib}$. [Plots: $U_{1\,\rm vib}/E_{\rm vib}$ and $C_{V\,1\,\rm vib}/k_B$ versus T/Θ from 0 to 2.]
In the harmonic approximation (138), the vibrational partition function is determined uniquely by the dimensionless ratio $\tilde\beta = \beta \hbar\omega = \frac{\Theta_{\rm vib}}{T}$, where the characteristic temperature
$\Theta_{\rm vib} = \frac{\hbar\omega}{k_B}$:
$$ Z_{1\,\rm vib} = \sum_{v=0}^{\infty} \exp\left[-\frac{\Theta_{\rm vib}}{T} \left(v + \frac{1}{2}\right)\right] = \exp\left(-\frac{1}{2} \frac{\Theta_{\rm vib}}{T}\right) \sum_{v=0}^{\infty} \exp\left(-v\, \frac{\Theta_{\rm vib}}{T}\right) = \exp\left(-\frac{1}{2} \frac{\Theta_{\rm vib}}{T}\right) \frac{1}{1 - \exp\left(-\frac{\Theta_{\rm vib}}{T}\right)} = \frac{1}{2 \sinh\frac{\Theta_{\rm vib}}{2T}} . \tag{202} $$
⁷ Here the dipole matrix element averaged over the initial (2l+1)-degenerate state and summed
over the final states $l_f = l + 1$ (R branch) or $l_f = l - 1$ (P branch) is taken independent of l. Strictly, only
the radial part of the matrix element is independent of l, while calculation shows that the angular part equals $\frac{l+1}{2l+1}$
for the R branch and $\frac{l}{2l+1}$ for the P branch. For l not too small, these factors are both close to
0.5, and are therefore often taken as constant.
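Here, the series is evaluated in closed form, thus yielding an exact expression valid
at all temperatures. The vibrational thermodynamical functions are therefore:
$$ F_{1\,\rm vib} = \frac{F_{\rm vib}}{N} = \frac{\hbar\omega}{2} + k_B T \ln\left[1 - \exp(-\tilde\beta)\right] \tag{203} $$
$$ U_{1\,\rm vib} = \frac{U_{\rm vib}}{N} = \frac{\hbar\omega}{2} + \frac{\hbar\omega}{\exp(\tilde\beta) - 1} \tag{204} $$
$$ C_{V\,1\,\rm vib} = \frac{C_{V\,\rm vib}}{N} = \frac{k_B\, \tilde\beta^2}{2 \cosh(\tilde\beta) - 2} = k_B \left[\frac{\tilde\beta/2}{\sinh(\tilde\beta/2)}\right]^2 \tag{205} $$
$$ S_{1\,\rm vib} = \frac{S_{\rm vib}}{N} = -k_B \ln\left[1 - \exp(-\tilde\beta)\right] + \frac{k_B\, \tilde\beta}{\exp(\tilde\beta) - 1} \tag{206} $$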
(see Eqs. (162), (163), (166), (167)). The temperature dependence of the vibrational
heat capacity and internal energy per molecule is drawn in Fig. 3.8. In analogy to
rotations, as temperature is raised, the vibrational degree of freedom “unfreezes”,
reaching the classical equipartition limit at large temperature T ≫ Θvib . Note that
the large-T limit for a single oscillator provides a contribution kB T to U1 (and thus
kB to CV 1 ) equal to that of the two rotational degrees of freedom. The reason is that
one harmonic oscillator is associated to two quadratic terms (kinetic and potential)
in the Hamiltonian, rather than one as for each rotational degree of freedom (only
kinetic).
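The closed form (202) and the freezing of the vibrational heat capacity (205) are easy to check numerically. The sketch below works with the dimensionless ratio x = Θ_vib/T; the function names and the truncation `v_max` of the series are illustrative assumptions.

```python
import math

def z_vib_closed(x):
    """Closed form of Eq. (202), with x = Theta_vib / T."""
    return 1.0 / (2.0 * math.sinh(0.5 * x))

def z_vib_sum(x, v_max=2000):
    """Direct (truncated) sum over the harmonic levels v + 1/2."""
    return sum(math.exp(-x * (v + 0.5)) for v in range(v_max + 1))

def cv_vib(x):
    """Eq. (205) in units of kB."""
    return (0.5 * x / math.sinh(0.5 * x)) ** 2

print(z_vib_sum(1.0), z_vib_closed(1.0))
for t in (0.1, 0.5, 1.0, 10.0):   # t = T / Theta_vib
    print(f"T/Theta_vib = {t:5.1f}: C_V / kB = {cv_vib(1.0 / t):.4f}")
```

The heat capacity is exponentially small for T well below Θ_vib and approaches the equipartition value k_B for T well above it.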
Typical vibrational temperatures of a few diatomic molecules are: 6300 K for H2 ,
4300 K for H$^{35}$Cl, 3400 K for N2 , 403 K for K$^{35}$Cl. It is seen that, contrary to
the rotational unfreezing, the vibrational transition from the quantum-frozen to the
classical regime is fairly accessible to heat capacity measurements, which find good
agreement with the statistical model. Discrepancies at high temperature are mainly due
to the failure of the harmonic approximation.
3.2.1.2. Isolated (spin) degrees of freedom. It is straightforward to apply the
Boltzmann formalism to the statistics of the magnetic moments carried by atoms or
molecules in a gas, through the appropriate partition function $Z_{1\,\rm int}$. However, the
same statistical methods can also be applied to identical, isolated, noninteracting degrees
of freedom, e.g. atomic magnetic moments carried by dilute magnetic impurities in
a solid or a liquid. For brevity, we refer to “spins”: in reality the magnetic moment
is generally proportional to some total angular momentum.
For any system whose Hilbert space of states is finite-dimensional (and not too
large...), it is possible to compute explicitly the partition function $Z_{1\,\rm int}$. For example, an angular momentum J coupled to a total magnetic field $\vec B = B \hat z$ as in
Eq. (43) spans a (2J + 1)-dimensional space of states, a basis of which is labeled by
the $J_z$ projection quantum number $M_J$. The “spin” partition function is the sum of
(2J + 1) terms corresponding to the levels of Eq. (87):
$$ Z_{1\,\rm spin} = \sum_{M_J=-J}^{J} \exp\left(-\tilde\beta\, M_J\right) , \tag{207} $$
where the ratio $\tilde\beta = \beta\, g_j \mu_B B = \Theta_B / T$ is defined in terms of the characteristic temperature
$\Theta_B = g_j \mu_B B / k_B$ ($g_j$ is the relevant g-factor). It is a simple exercise to evaluate
$Z_{1\,\rm spin}$ for any given value of J.
Figure 3.9. Temperature dependence of the internal energy per
spin $U_{1\,\rm spin}$ (210) and heat capacity per spin $C_{V\,1\,\rm spin}$ (211) characterizing the thermodynamics of a magnetic moment in a uniform field B,
associated to a characteristic temperature scale $\Theta_B = \frac{1}{k_B} E_{\rm spin} = \frac{1}{k_B}\, g \mu_B B$. [Plots: $U_{1\,\rm spin}/E_{\rm spin}$ and $C_{V\,1\,\rm spin}/k_B$ versus T/Θ from 0 to 5.]
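For the reader’s convenience, we summarize here the results for the simplest case
spin $J = \frac{1}{2}$:
$$ Z_{1\,\rm spin} = 2 \cosh\frac{\Theta_B}{2T} = 2 \cosh\frac{\tilde\beta}{2} \qquad \left[{\rm spin}\text{-}\tfrac{1}{2}\right] . \tag{208} $$
The spin-$\frac{1}{2}$ thermodynamic functions are therefore:
$$ F_{1\,\rm spin} = \frac{F_{\rm spin}}{N} = -k_B T \ln\left(2 \cosh\frac{\tilde\beta}{2}\right) \tag{209} $$
$$ U_{1\,\rm spin} = \frac{U_{\rm spin}}{N} = -\frac{g \mu_B B}{2} \tanh\frac{\tilde\beta}{2} \tag{210} $$
$$ C_{V\,1\,\rm spin} = \frac{C_{V\,\rm spin}}{N} = \frac{k_B\, \tilde\beta^2}{2 \cosh(\tilde\beta) + 2} \tag{211} $$
$$ S_{1\,\rm spin} = \frac{S_{\rm spin}}{N} = k_B \left[\ln\left(2 \cosh\frac{\tilde\beta}{2}\right) - \frac{\tilde\beta}{2} \tanh\frac{\tilde\beta}{2}\right] . \tag{212} $$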
The temperature dependence of the internal energy and heat capacity of each magnetic moment is drawn in Fig. 3.9. As for molecular rotations and vibrations, as
temperature is raised, the spin degree of freedom “unfreezes”, with an increase of
the heat capacity. However, when temperature is further raised, due to the finite
spectrum, the internal energy cannot increase indefinitely: $U_{1\,\rm spin}$ flattens out, thus
the spin heat capacity decays at large temperature.
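The rise and subsequent decay of the spin heat capacity can be reproduced directly from Eqs. (207), (210) and (211). The sketch below uses the dimensionless ratio x = Θ_B/T and is only an illustration; the function names are hypothetical, and the general-J sum is included to show that it reduces to 2 cosh(x/2) for J = 1/2.

```python
import math

def z_spin(j, x):
    """Eq. (207): sum over M_J = -J .. J of exp(-x M_J), with x = Theta_B / T."""
    n_states = int(round(2 * j)) + 1
    return sum(math.exp(-x * (-j + k)) for k in range(n_states))

def u_spin_half(x):
    """Eq. (210) in units of g mu_B B."""
    return -0.5 * math.tanh(0.5 * x)

def cv_spin_half(x):
    """Eq. (211) in units of kB."""
    return x * x / (2.0 * math.cosh(x) + 2.0)

print(z_spin(0.5, 1.3), 2.0 * math.cosh(0.65))
for t in (0.1, 0.4, 1.0, 5.0, 50.0):   # t = T / Theta_B
    x = 1.0 / t
    print(f"T/Theta_B = {t:5.1f}: U = {u_spin_half(x):+.4f}, C_V / kB = {cv_spin_half(x):.4f}")
```

The heat capacity is small both at very low and at very high temperature, with a maximum at T of the order of Θ_B, while the internal energy saturates at zero.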
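The density of magnetization of such a system of magnetic moments
$$ \vec M = \frac{N}{V}\, [\vec\mu_1] \tag{213} $$
is a quantity of straightforward experimental accessibility and relevance. In such an
ideal system $\vec M$ is necessarily oriented parallel to the magnetic field, $\vec M = M \hat z$. We
have
$$ [\mu_{z\,1}] = {\rm Tr}(\rho_1\, \mu_{z\,1}) = \sum_{M_J=-J}^{J} (-g \mu_B M_J)\, \frac{\exp(-\tilde\beta M_J)}{Z_{1\,\rm spin}} = -\frac{1}{B} \sum_{M_J=-J}^{J} E_{M_J} P_{M_J} = -\frac{U_{1\,\rm spin}}{B} \tag{214} $$
$$ M_z = \frac{N\, [\mu_{z\,1}]}{V} = -\frac{N\, U_{1\,\rm spin}}{V B} = \frac{g \mu_B N}{2V} \tanh\frac{\tilde\beta}{2} \qquad \left[{\rm spin}\text{-}\tfrac{1}{2}\right] . \tag{215} $$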
The average magnetization changes with temperature following the same functional
dependency as the internal energy, apart from a trivial factor $-\frac{N}{VB}$, thus Fig. 3.9a can
also be read as magnetization as a function of temperature for the $J = \frac{1}{2}$ system.
For weak field, $\tilde\beta \to 0$, the hyperbolic tangent can be expanded to lowest order,
obtaining the linear response of the localized spins to a weak total field $\vec B$:
$$ M_z \simeq \frac{g \mu_B N}{4V}\, \tilde\beta = \chi_B B , \qquad {\rm with} \quad \chi_B = \frac{N}{V} \left(\frac{g \mu_B}{2}\right)^2 \frac{1}{k_B T} \qquad \left[{\rm spin}\text{-}\tfrac{1}{2}\right] . \tag{216} $$
χB represents the (weak-field) magnetic susceptibility. The characteristic inverse-T
dependency of the Curie susceptibility of free spins reflects the disordering effect of
temperature.
In practice it is more standard to measure the susceptibility $\chi_H$ relative to the
external applied field strength $H = \epsilon_0 c^2 B_{\rm ext}$ [in A/m]. The relation between $\chi_B$ and
$\chi_H$ derives from
$$ \vec M = \chi_B \vec B = \chi_B \left[\vec B_{\rm ext} + \vec B_{\rm int}\right] = \chi_B \left[(\epsilon_0 c^2)^{-1} \vec H + (\epsilon_0 c^2)^{-1} \vec M\right] , \tag{217} $$
where the relation $\vec B_{\rm int} = (\epsilon_0 c^2)^{-1} \vec M$ for the magnetic field of a uniformly magnetized material is used. Solving for $\vec M$,
$$ \vec M = \frac{(\epsilon_0 c^2)^{-1} \chi_B}{1 - (\epsilon_0 c^2)^{-1} \chi_B}\, \vec H , \qquad {\rm thus} \quad \chi_H = \frac{(\epsilon_0 c^2)^{-1} \chi_B}{1 - (\epsilon_0 c^2)^{-1} \chi_B} \tag{218} $$
yields the desired relation for the dimensionless susceptibility $\chi_H$.
3.2.2. Degenerate Fermi and Bose gases. In Sec. 3.2.1 we found that,
at high temperature, non-interacting fermions and bosons behave in an equivalent manner, as an ideal classical gas, plus (possibly) internal degrees of freedom. At low temperature however, spectacular fermion/boson differences show
up. The calculation of the exact partition function Z (175) at arbitrary T can
be carried out by replacing the N sums over the single-particle quantum numbers $\alpha_i$ with sums over the occupation
numbers $n_\alpha$. To this purpose, the total energy in the exponential $\sum_{i=1}^{N} E_{\alpha_i} = \sum_\alpha n_\alpha E_\alpha$ is written in terms of the occupation numbers
$n_\alpha$ of the single-particle states, so that the exponential factorizes,
$\exp\left(-\beta \sum_\alpha n_\alpha E_\alpha\right) = \prod_\alpha \exp(-\beta n_\alpha E_\alpha)$:
$$ Z = \sum_{\alpha_1 \alpha_2 \dots \alpha_N} \frac{n_0!\, n_1!\, n_2! \dots}{N!}\, e^{-\beta \sum_\alpha n_\alpha E_\alpha} = \sum_{\substack{\{n_\alpha\} \\ \sum_\alpha n_\alpha = N}} e^{-\beta \sum_\alpha n_\alpha E_\alpha} = \sum_{\substack{\{n_\alpha\} \\ \sum_\alpha n_\alpha = N}} \prod_\alpha \left(e^{-\beta E_\alpha}\right)^{n_\alpha} . \tag{219} $$
The occupancies $n_\alpha$ are 0 or 1 for fermions, and 0, 1, 2, 3, ... for bosons. The
combinatorial factor correcting the $\sum_{\alpha_1 \dots \alpha_N}$ sum for overcounting is suppressed in
the $n_\alpha$-sum, as the states identified by the occupation numbers are counted correctly,
without undue overcounting. The constraint of fixed total number of particles N
makes the sum over the occupancies in Eq. (219) extremely difficult to compute:
without this constraint, one could exchange sum and product, and the sums would
all look alike.
Figure 3.10. At fixed temperature, the internal energy is generally
a convex function of the number of particles N , which is usually minimum for N = 0. The addition of −µN shifts the equilibrium to some
finite average number of particles [N ], which is an increasing function
of the parameter µ. [Sketch: energy E versus N, showing the curves U, −µN and U − µN, with the minimum of U − µN at [N].]
To get rid of the constraint we use a trick: replace the canonical ensemble, where
the number of particles is fixed, with the grand canonical ensemble, where N is
allowed to vary, to describe a thermodynamical system which is (weakly) exchanging
not only energy but also particles with the rest of the universe. Figure 3.10 illustrates
the need to subtract µN from the energy eigenvalue in the expression (152) of the
partition function. For an arbitrary system of equal particles, summing the canonical
Z over N yields the correct grand partition function
$$ Q = \sum_{N=0}^{\infty} \sum_{m(N)} e^{-\beta (E_{m(N)} - \mu N)} = \widetilde{\rm Tr}\, e^{-\beta (H - \mu \hat N)} , \tag{220} $$
where N̂ is the operator counting the number of particles, and the T̃r indicates
summing over all states and all numbers of particles. Q plays the same role for
the grand canonical ensemble as Z for the canonical ensemble. A reasoning similar
to that of Sec. 3.1.1 yields the relation between the grand partition function and
thermodynamics. In particular, β identifies with the inverse temperature and µ
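with the chemical potential. The basic relation linking Q to the thermodynamical
potentials (analogous to Eq. (162)) is the first in the following list:
$$ J(T, V, \mu) = V P(T, \mu) = k_B T \ln Q(T, V, \mu) \tag{221} $$
$$ F = \mu\, [N] - J \tag{222} $$
$$ U = \mu\, [N] - J + T S \tag{223} $$
$$ G = J + F = \mu\, [N] \tag{224} $$
$$ [N] = \left.\frac{\partial J}{\partial \mu}\right|_{T,V} = V \left.\frac{\partial P}{\partial \mu}\right|_{T} \tag{225} $$
$$ \mu = \left.\frac{\partial F}{\partial [N]}\right|_{T,V} = \left.\frac{\partial G}{\partial [N]}\right|_{T,P} \tag{226} $$
$$ S = \left.\frac{\partial J}{\partial T}\right|_{\mu,V} = V \left.\frac{\partial P}{\partial T}\right|_{\mu} . \tag{227} $$
The listed relations allow us to compute all thermodynamical quantities for a system at equilibrium with a reservoir of energy and of particles. Further details are
discussed in Ref. [?].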
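Armed with this new tool, we proceed to compute the grand partition function
Q for a system of noninteracting identical bosons or fermions. Remembering that
$N = \sum_\alpha n_\alpha$, we first rearrange the exponent of Eq. (219) and then use the $\sum_N$ to
get rid of the constraint
$$ Q = \sum_{N=0}^{\infty} \sum_{\substack{\{n_\alpha\} \\ \sum_\alpha n_\alpha = N}} e^{-\beta \left[\left(\sum_\alpha n_\alpha E_\alpha\right) - \mu N\right]} = \sum_{N=0}^{\infty} \sum_{\substack{\{n_\alpha\} \\ \sum_\alpha n_\alpha = N}} e^{-\beta \left[\left(\sum_\alpha n_\alpha E_\alpha\right) - \mu \sum_\alpha n_\alpha\right]} $$
$$ \phantom{Q} = \sum_{\{n_\alpha\}} e^{\sum_\alpha \beta (\mu - E_\alpha) n_\alpha} = \sum_{\{n_\alpha\}} \prod_\alpha e^{\beta (\mu - E_\alpha) n_\alpha} = \prod_\alpha \sum_{n_\alpha} e^{\beta (\mu - E_\alpha) n_\alpha} = \prod_\alpha \sum_{n_\alpha} \left(e^{\beta (\mu - E_\alpha)}\right)^{n_\alpha} . $$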
Next, observe that the occupancy numbers $n_\alpha$ are dummy indexes over which it is
summed. There is no special reason to distinguish among them: just call them n.
Accordingly, we write the grand partition function as
$$ Q = \prod_\alpha \left[\sum_n \left(e^{\beta (\mu - E_\alpha)}\right)^{n}\right] . \tag{228} $$
The sum in square brackets can be carried out explicitly: for fermions n = 0, 1, thus
it equals $1 + e^{\beta(\mu - E_\alpha)}$; for bosons, n = 0, 1, 2, . . . , therefore that sum is a geometric
series, whose summation⁸ gives $(1 - e^{\beta(\mu - E_\alpha)})^{-1}$. By introducing a quantity θ, θ = +1
for bosons and θ = −1 for fermions, we can write Q in a form factorized over different
single-particle states and valid for both bosons and fermions:
$$ Q = \prod_\alpha \left(1 - \theta\, e^{\beta(\mu - E_\alpha)}\right)^{-\theta} . \tag{229} $$
⁸ This series converges only if the number $e^{\beta(\mu - E_\alpha)} < 1$, i.e. only if µ is smaller than the
smallest single-particle energy $E_\alpha$, which is usually 0. Therefore µ must be negative for bosons.
The grand potential (221) provides the connection to thermodynamics:
$$ J = P V = k_B T \ln Q = -\theta\, k_B T \sum_\alpha \ln\left(1 - \theta\, e^{\beta(\mu - E_\alpha)}\right) . \tag{230} $$
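The factorized form (229) can be compared with a direct sum over all sets of occupation numbers for a system with very few single-particle levels. The sketch below is illustrative: the three level energies, the value of µ and the cutoff on the boson occupancies are arbitrary assumptions (µ is chosen below the lowest level so that the boson series converges), and the function names are hypothetical.

```python
import math
from itertools import product

def q_product(energies, mu, beta, theta):
    """Eq. (229): product over levels of (1 - theta * exp(beta (mu - E))) ** (-theta)."""
    q = 1.0
    for e in energies:
        q *= (1.0 - theta * math.exp(beta * (mu - e))) ** (-theta)
    return q

def q_brute_force(energies, mu, beta, theta, n_boson_max=60):
    """Unconstrained sum over occupation numbers {n_alpha} of exp(beta * sum (mu - E) n)."""
    # Fermions (theta = -1): n = 0, 1. Bosons (theta = +1): n = 0 .. n_boson_max (assumed cutoff).
    occupancies = range(2) if theta == -1 else range(n_boson_max + 1)
    total = 0.0
    for ns in product(occupancies, repeat=len(energies)):
        total += math.exp(beta * sum((mu - e) * n for e, n in zip(energies, ns)))
    return total

levels = (0.0, 0.5, 1.3)   # assumed toy single-particle energies
mu, beta = -0.4, 1.5
for theta, name in ((-1, "fermions"), (+1, "bosons")):
    print(name, q_brute_force(levels, mu, beta, theta), q_product(levels, mu, beta, theta))
```

Lifting the constraint on the total number of particles is what allows the sum over configurations to factorize level by level.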
This expression and Eqs. (222)-(227) determine all the thermodynamics of noninteracting bosons/fermions. For noninteracting particles, the single-particle states
$|\alpha\rangle$ are labeled, as discussed in Sec. 3.2.1, by the single-particle momentum $\vec p$, plus
possibly internal degrees of freedom. At low temperature any nontrivial internal dynamics is usually “frozen”. Only $g_s$ degenerate states remain accessible, and we may
collect them in a spin variable σ (representing for example the ẑ projection of the
total angular momentum of the particle). In practice, like in Eq. (182), the α-sum
is a sum over $n_x$, $n_y$, $n_z$, and (possibly) spin σ. As the translational levels are very
dense, the $\vec n$-sum can be replaced by an integration over energy E, weighted (like in
Eq. (153)) with the density of translational states $g_{\rm tr}(E)$ computed in Eq. (190):
$$ \sum_\alpha \to \sum_\sigma \sum_{n_x n_y n_z} \to g_s \int_0^{\infty} dE\, g_{\rm tr}(E) \to \int_0^{\infty} dE\, g(E) . \tag{231} $$
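Here we introduce the total density of states $g(E) = g_s\, g_{\rm tr}(E)$. We obtain an equation
of state
$$ P = \frac{J}{V} = -\theta\, k_B T \int_0^{\infty} dE\, \frac{g(E)}{V} \ln\left(1 - \theta\, e^{\beta(\mu - E)}\right) = -\theta\, k_B T\, g_s\, \frac{(2M)^{3/2}}{4\pi^2 \hbar^3} \int_0^{\infty} dE\, \sqrt{E}\, \ln\left(1 - \theta\, e^{\beta(\mu - E)}\right) . \tag{232} $$
This can be rewritten in terms of the variable $\tilde E = E\beta$, and integrated by parts to
obtain:
$$ P = -\theta\, (k_B T)^{5/2}\, g_s\, \frac{(2M)^{3/2}}{4\pi^2 \hbar^3} \int_0^{\infty} d\tilde E\, \sqrt{\tilde E}\, \ln\left(1 - \theta\, e^{\beta\mu - \tilde E}\right) = k_B T\, g_s\, \Lambda^{-3}\, \frac{4}{3\sqrt{\pi}} \int_0^{\infty} d\tilde E\, \frac{\tilde E^{3/2}}{e^{\tilde E - \beta\mu} - \theta} , \tag{233} $$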
where the thermal length Λ was defined in Eq. (185). This equation of state for the
ideal gas of bosons/fermions expresses the pressure in terms of temperature⁹ and
chemical potential. This is impractical, since µ is a quantity of difficult experimental
access: it would be preferable that P was expressed in terms of T and the density
$\frac{[N]}{V}$, like in the high-temperature limit Eq. (189). Unfortunately, in general there is
no simple analytic expression of µ as a function of the density, thus a convenient
explicit equation of state is not really available. More interestingly, by computing
the internal energy U , it may be verified that the high-temperature ideal-gas relation
$$ P = \frac{2U}{3V} \tag{234} $$
applies at any temperature, for both bosons and fermions, even though from the
point of view of the equation of state (233), at low temperature ideal bosons and
fermions are far from an “ideal gas” in the thermodynamical sense.
⁹ In introducing the present discussion we addressed particularly the low-temperature regime.
In fact, the only assumption involving temperature is that all internal degrees should be either
frozen or included in $g_s$. As long as no other internal degree of freedom plays any role, Eqs. (230)
and (233) hold for any temperature.
By a high-temperature power expansion, it is possible [?] to derive the leading
correction to the ideal-gas behavior Eq. (189) due to quantum statistics:
$$ P = \frac{[N]}{V}\, k_B T \left[1 - \theta\, 2^{-5/2}\, \delta + O(\delta^2)\right] , \qquad {\rm for} \quad \delta = \frac{\Lambda^3\, [N]}{g_s V} = \frac{(2\pi)^{3/2} \hbar^3\, [N]}{g_s\, (M k_B T)^{3/2}\, V} \ll 1 . \tag{235} $$
The explicit form of the “degeneracy” parameter δ makes it clear that Eq. (235) is
a high-temperature and low-density expansion. The sign of the leading correction
shows opposite tendencies of boson statistics to reduce pressure, and of fermion
statistics to increase it. This is a first high-temperature hint of the better “social” character of bosons versus fermions. The experimental verification of these
corrections in real gases (e.g. of atoms) is extremely difficult, as interatomic interactions introduce corrections to the pressure of the same order as or larger than those
associated to indistinguishability.
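At low temperature, the behavior of ideal bosons and fermions becomes radically
different. The average occupation number $[n_\alpha]$ of single-particle states is a clear
indicator of these differences. This is obtained as
$$ [n_\alpha] = \frac{\left(\sum_n n\, e^{\beta(\mu - E_\alpha) n}\right) \cdot \prod_{\alpha' \neq \alpha} \sum_n e^{\beta(\mu - E_{\alpha'}) n}}{\prod_{\alpha''} \sum_n e^{\beta(\mu - E_{\alpha''}) n}} = \frac{\sum_n n\, e^{\beta(\mu - E_\alpha) n}}{\sum_n e^{\beta(\mu - E_\alpha) n}} . $$
This fraction is recognized as $\frac{\partial}{\partial(\beta\mu)} \ln\left[\sum_n e^{\beta(\mu - E_\alpha) n}\right]$. After Eq. (228) above, we computed the sum within the logarithm, and found the result $[1 - \theta e^{\beta(\mu - E_\alpha)}]^{-\theta}$. Therefore,
we have
$$ [n_\alpha] = \frac{\partial}{\partial(\beta\mu)} \ln\left[1 - \theta e^{\beta(\mu - E_\alpha)}\right]^{-\theta} = -\theta\, \frac{\partial}{\partial(\beta\mu)} \ln\left[1 - \theta e^{\beta(\mu - E_\alpha)}\right] = -\theta\, \frac{-\theta e^{\beta(\mu - E_\alpha)}}{1 - \theta e^{\beta(\mu - E_\alpha)}} , $$
with the final result
$$ [n_\alpha] = \frac{1}{e^{\beta(E_\alpha - \mu)} - \theta} . \tag{236} $$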
In a single formula, this equation collects the celebrated Bose-Einstein distribution
$$ [n_\alpha]_B = \frac{1}{e^{\beta(E_\alpha - \mu)} - 1} \tag{237} $$
and Fermi-Dirac distribution
$$ [n_\alpha]_F = \frac{1}{e^{\beta(E_\alpha - \mu)} + 1} . \tag{238} $$
The average occupancy of each single-particle state depends only on temperature
and on the energy of the state itself.¹⁰ However, the presence of all other particles
affects each single-particle occupancy distribution through the chemical potential µ,
which contains information about total particle density $\frac{[N]}{V}$ and temperature T .
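The two distributions (237) and (238) differ only through the sign θ in Eq. (236), and both merge into the Boltzmann factor when the level lies many k_B T above the chemical potential. A minimal sketch follows, with hypothetical function names and energies expressed in arbitrary common units.

```python
import math

def occupancy(energy, mu, kT, theta):
    """Eq. (236): theta = +1 gives Bose-Einstein (237), theta = -1 gives Fermi-Dirac (238)."""
    return 1.0 / (math.exp((energy - mu) / kT) - theta)

def boltzmann(energy, mu, kT):
    """Classical limit exp(-(E - mu) / kT), reached when E - mu >> kT."""
    return math.exp(-(energy - mu) / kT)

mu, kT = 0.0, 1.0
for e in (0.5, 2.0, 10.0):
    print(f"(E - mu)/kT = {e:4.1f}: Bose = {occupancy(e, mu, kT, +1):.3e}, "
          f"Boltzmann = {boltzmann(e, mu, kT):.3e}, Fermi = {occupancy(e, mu, kT, -1):.3e}")
```

At any given energy above µ the boson occupancy exceeds the classical value and the fermion occupancy lies below it, while the fermion occupancy equals 1/2 exactly at E = µ.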
3.2.2.1. Fermi particles. The T → 0 limit of the fermion-gas statistics is of
fundamental interest for the physics of matter. The reason is that conduction electrons in many metals can be approximately described as free non-interacting spin-$\frac{1}{2}$
fermions, which at room temperature have a huge degeneracy parameter δ ≫ 1. The
thermal length for electrons at 300 K is close to 4 nm, corresponding to a thermal
volume $\Lambda^3 \approx 80$ nm³, much larger than the volume per electron $\frac{V}{[N]} \approx 0.01$ nm³ in
a typical metal, thus yielding a degeneracy ratio δ ≈ 4000. Electrons in metals are
thus very far from the range of validity of Eq. (235). On the contrary, their properties can often be understood in terms of the ideal Fermi gas at low, and sometimes
even null, temperature.
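The order of magnitude of the degeneracy parameter quoted above can be reproduced from the definition of δ in Eq. (235) and the thermal length (185). The sketch below assumes a conduction-electron density of one electron per 0.01 nm³ and g_s = 2, as in the text; the electron mass is the standard tabulated value, and the function name is hypothetical.

```python
import math

H = 6.62607015e-34     # Planck constant, J s
K_B = 1.380649e-23     # Boltzmann constant, J/K
M_E = 9.1093837e-31    # electron mass, kg (standard tabulated value)

def degeneracy_parameter(number_density, mass, temperature, g_s=2):
    """delta = Lambda^3 [N] / (g_s V), with Lambda = h / sqrt(2 pi M kB T)."""
    lam = H / math.sqrt(2.0 * math.pi * mass * K_B * temperature)
    return lam ** 3 * number_density / g_s

# One electron per 0.01 nm^3 corresponds to 1e29 electrons per cubic meter
delta = degeneracy_parameter(1.0e29, M_E, 300.0)
print(delta)
```

Since δ scales as T^(-3/2), even heating the metal to a few thousand kelvin would leave δ far above 1.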
The T = 0 properties of a Fermi gas are relatively simple. At T = 0, the ground
state is the only N -fermion11 state which is populated: for noninteracting fermions,
elementary quantum mechanics suggests that this is simply the antisymmetric state
realized by filling the N lowest-energy single-particle levels (i.e. the $N/g_s$ lowest-$|\vec k|$
plane-wave states), up to some maximum single-particle energy $\epsilon_F$ called Fermi
energy, and leaving all the states above empty, as illustrated in Fig. 3.11b. Indeed,
by taking the β → ∞ limit of Eq. (238), the average occupancy becomes a step
function of energy:
$$ \lim_{T\to 0} [n_\alpha]_F = \begin{cases} 1, & E_\alpha < \mu \\ \frac{1}{2}, & E_\alpha = \mu \\ 0, & E_\alpha > \mu \end{cases} , \tag{239} $$
10 And on no other property. In particular, in the absence of any applied magnetic field, occupancy is independent of $m_s$: at all temperatures the ideal gas is in a spin-unpolarized nonmagnetic state.
11 Henceforth, for brevity, we adopt the symbol N for the average number of particles [N].
[Figure 3.11 image: panel (a) plots $[n]_F$ (vertical axis, 0 to 1) versus ε/µ (horizontal axis, 0 to 2) for T = 0, T = 0.05 µ/kB and T = 0.2 µ/kB; panel (b) shows the single-particle levels along an energy axis ε, filled up to $\epsilon_F$ at T = 0.]
Figure 3.11. (a) The average filling of the single-particle levels of
the ideal Fermi gas as a function of energy E (measured in units of the
chemical potential µ), for three increasing temperatures. The drawn
temperature 0.05 µ/kB corresponds to several thousand K for electrons
in simple metals. (b) The filling of the single-particle levels of noninteracting fermions (here gs = 2) at T = 0.
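which thus identifies the chemical potential at T = 0 with $\epsilon_F$. Then, by requiring
that
$$ N = \int_0^\infty dE\, g(E)\, [n_\alpha]_F = \int_0^\mu dE\, g(E) = g_s \frac{(2M)^{3/2} V}{4\pi^2\hbar^3} \int_0^\mu dE\, E^{1/2} = \frac{(2M)^{3/2} V g_s}{4\pi^2\hbar^3}\, \frac{2}{3}\, \mu^{3/2} , \tag{240} $$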
we establish the relation between the particle density and the chemical potential
$$ \epsilon_F = \mu(T=0) = \frac{\hbar^2}{2M} \left( \frac{6\pi^2 N}{g_s V} \right)^{2/3} . \tag{241} $$
In $\vec p$-space, each state within a sphere of radius $p_F$ (such that $\epsilon_F = \frac{p_F^2}{2M}$) is filled
by gs fermions, while those outside are empty. To this maximum momentum $p_F = \hbar k_F = \hbar \left( \frac{6\pi^2 N}{g_s V} \right)^{1/3}$ (the Fermi momentum), there corresponds a maximum velocity
$v_F = \frac{p_F}{M}$, the Fermi velocity. Similarly, the Fermi energy is sometimes conveniently
expressed as a Fermi temperature $T_F = \frac{\epsilon_F}{k_B}$.
In simple metals, typical densities of conduction electrons ($M = m_e$, gs = 2) of
the order $N/V \approx 10^{28} \div 10^{29}\ \mathrm{m^{-3}}$ (roughly the inverse cube of typical interatomic
separations) yield $\epsilon_F \approx 2 \div 10$ eV, i.e. $T_F \approx 20000 \div 100000$ K. This corresponds to
$k_F \approx 10^{10}\ \mathrm{m^{-1}}$, $p_F \approx 10^{-24}\ \mathrm{kg\,m\,s^{-1}}$, and typical velocities $v_F \approx 10^{6}\ \mathrm{m\,s^{-1}}$.
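These estimates follow directly from Eq. (241) and the definitions of $k_F$, $p_F$, $v_F$ and $T_F$. The following sketch evaluates them for the two limiting densities; the function name and the returned dictionary keys are illustrative choices.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant (J s)
K_B = 1.380649e-23       # Boltzmann constant (J/K)
M_E = 9.1093837015e-31   # electron mass (kg)
EV = 1.602176634e-19     # one electronvolt in joules

def fermi_quantities(density, mass=M_E, g_s=2):
    """Fermi wavevector, momentum, velocity, energy and temperature
    of an ideal Fermi gas of given number density (m^-3), Eq. (241)."""
    k_f = (6.0 * math.pi**2 * density / g_s) ** (1.0 / 3.0)
    p_f = HBAR * k_f
    e_f = p_f**2 / (2.0 * mass)
    return {
        "k_F": k_f,               # m^-1
        "p_F": p_f,               # kg m s^-1
        "v_F": p_f / mass,        # m s^-1
        "eps_F_eV": e_f / EV,     # eV
        "T_F": e_f / K_B,         # K
    }

for n in (1e28, 1e29):
    q = fermi_quantities(n)
    print(f"N/V = {n:.0e} m^-3: eps_F = {q['eps_F_eV']:.2f} eV, "
          f"T_F = {q['T_F']:.0f} K, k_F = {q['k_F']:.2e} m^-1, "
          f"v_F = {q['v_F']:.2e} m/s")
```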
At T = 0 it is also straightforward to obtain the internal energy:
$$ U = \int_0^\infty dE\, E\, g(E)\, [n_\alpha]_F = \int_0^{\epsilon_F} dE\, E\, g(E) = \frac{(2M)^{3/2} V g_s}{4\pi^2\hbar^3} \int_0^{\epsilon_F} dE\, E^{3/2} = \frac{(2M)^{3/2} V g_s}{4\pi^2\hbar^3}\, \frac{2}{5}\, \epsilon_F^{5/2} . \tag{242} $$
This can be expressed in terms of the density N/V by substituting $\epsilon_F$ from Eq. (241).
If only $\epsilon_F^{3/2}$ is substituted using Eq. (240), one can obtain the easier-to-remember
relation:
$$ U(T=0) = \frac{3}{5} N \epsilon_F . \tag{243} $$
In accord with Eq. (234), the T = 0 pressure
$$ P(T=0) = \frac{2}{5} \frac{N}{V} \epsilon_F \tag{244} $$
is very large, due to the huge average kinetic energy of the fermions, obliged by
Pauli's principle to occupy different plane-wave states. This energy depends strongly
on the available volume ($U \propto V^{-2/3}$ at fixed N, so that $P \propto V^{-5/3}$), as the spacing between the momentum states
(180) is inversely proportional to the size of the box containing the gas (which makes
$g(E) \propto V$, as in Eq. (190)). The pressure exerted by a typical electron gas in a simple
metal at T = 0 is as large as $P \approx 10^{9} \div 10^{10}$ Pa! A sharp electric-potential step at
the surface of the metal prevents the conduction electrons from escaping the solid.
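To get a feeling for Eqs. (243) and (244), the sketch below evaluates the T = 0 energy per particle and pressure of the electron gas at the two typical densities, and checks the $N^{5/3}$ scaling of the pressure at fixed volume. Function names are illustrative.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant (J s)
M_E = 9.1093837015e-31   # electron mass (kg)
EV = 1.602176634e-19     # one electronvolt in joules

def fermi_energy(density, mass=M_E, g_s=2):
    """Fermi energy in joules, Eq. (241)."""
    k_f = (6.0 * math.pi**2 * density / g_s) ** (1.0 / 3.0)
    return (HBAR * k_f) ** 2 / (2.0 * mass)

def energy_per_particle(density):
    """U / N = (3/5) eps_F at T = 0, Eq. (243), in joules."""
    return 0.6 * fermi_energy(density)

def zero_temperature_pressure(density):
    """P = (2/5) (N/V) eps_F at T = 0, Eq. (244), in pascal."""
    return 0.4 * density * fermi_energy(density)

for n in (1e28, 1e29):
    print(f"N/V = {n:.0e} m^-3: U/N = {energy_per_particle(n) / EV:.2f} eV, "
          f"P = {zero_temperature_pressure(n):.2e} Pa")
```

With these inputs the pressure comes out at about 10⁹ Pa for the lower density and a few 10¹⁰ Pa for the higher one.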
The T = 0 properties of the free electron gas explain many properties of the
electrons in simple metals at ordinary temperatures,12 but not those thermodynamical quantities that involve temperature explicitly, such as the heat capacity. In
the T ≪ TF limit, a systematic expansion of the exact thermodynamical equation
of state (233) and the relation (226) connecting µ to N/V yield these thermal properties as lowest-order corrections. The mathematical details [?] of this procedure
(called Sommerfeld expansion) are slightly intricate, but the qualitative trends are
straightforward. For example, one can estimate the leading dependency of µ and
U on temperature by considering Fig. 3.11a: when a small temperature is turned
on, the average occupancy [nα ]F changes slightly from the T = 0 step function, with
12 The accord of the free-electron theory with experimental data of many simple metals is
surprisingly good in view of the strong Coulomb interactions between electrons. The reason is that
long-range Coulomb forces are efficiently screened by the electron gas. An experimentally observed
phenomenon which is directly related to electron-electron Coulomb repulsion is that of plasmon
collective excitations, which however occur at rather high energy (few eV), and are therefore of
little importance to the thermodynamics at ordinary temperatures.
[Figure 3.12 image: the $(k_x, k_y)$ plane, with filled states inside the Fermi sphere of radius $k_F$, empty states outside, and a skin of thickness $\delta k \sim M k_B T / (\hbar^2 k_F)$ across its surface.]
Figure 3.12. Thermal excitations/deexcitations across the Fermi
sphere involve mainly the occupations of the states within a skin across
the Fermi sphere. For electrons in metals at ordinary temperature
T ≪ TF, the skin thickness ∝ kB T is greatly exaggerated here.
some weak probability that states above µ are populated and states below µ are
empty. As the density of states $g(E) \propto E^{1/2}$ is slightly larger above $\epsilon_F$ than below,
to conserve the fermion number $N = \int_0^\infty g(E)\, [n_\alpha]_F\, dE$ the chemical potential decreases slowly as T increases.13 The internal energy U increases mainly due to the
few electrons moving up from states of a skin region of thickness ≈ kB T below ǫF
into states ≈ kB T above (Fig. 3.12): the energy of each excited electron increases by
≈ kB T . The number of excited electrons is of the order of the density of states g(ǫF )
times the energy interval kB T where excitation occurs. The total internal energy
increases therefore by approximately g(ǫF )(kB T )2 . The precise expansion yields
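$$ U = U(T=0) + \frac{\pi^2}{6}\, g(\epsilon_F)\, (k_B T)^2 + \dots = \frac{3}{5} N \epsilon_F \left[ 1 + \frac{5\pi^2}{12} \left( \frac{T}{T_F} \right)^2 + \dots \right] \qquad T \ll T_F , \tag{245} $$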
where the second equality was obtained using the useful relation valid for spin-1/2 free
fermions
$$ g(\epsilon_F) = \frac{3N}{2\epsilon_F} . \tag{246} $$
13 For gs = 2 the precise T-dependence of the chemical potential is $\mu = \epsilon_F \left[ 1 - \frac{\pi^2}{12} \left( \frac{T}{T_F} \right)^2 + \dots \right]$.
Figure 3.13. Low-temperature measured molar heat capacity of
metallic potassium divided by T, as a function of $T^2$. For $C_V \simeq a\,T + b\,T^3$,
the finite intercept at $T^2 = 0$ of the $C_V/T$ curve measures
the coefficient a of the T-linear (electronic) contribution; the slope of the
graph measures the coefficient b of $T^3$, which is the contribution of
lattice vibrations (Sec. 4.3.2).
Differentiating Eq. (245) with respect to T yields the heat capacity of the low-temperature ideal Fermi
gas:
$$ C_V = N k_B\, \frac{\pi^2}{2}\, \frac{T}{T_F} + \dots \qquad T \ll T_F . \tag{247} $$
The effect of Fermi statistics is to depress the heat capacity by a factor $\sim T/T_F$ with
respect to the high-temperature ideal-gas value $\frac{3}{2} N k_B$. The reason is that only
few electrons with energy very close to the Fermi energy are involved in thermal
excitations, the large majority of electrons remaining “frozen” in deeper filled states.
Experimentally, a T-linear contribution to the total heat capacity is observed in
solid metals at low temperature (Fig. 3.13), where other (lattice, see Sec. 4.3
below) contributions are small. The T-linear contribution is attributed to electrons:
for example, the observed T-linear coefficient for potassium, $\simeq 2.1\ \mathrm{mJ\,mol^{-1}\,K^{-2}}$
(Fig. 3.13), agrees fairly well with the free-electron estimate $N_A k_B \frac{\pi^2}{2 T_F} \simeq 1.7\ \mathrm{mJ\,mol^{-1}\,K^{-2}}$
(obtained based on the experimental density $N/V \simeq 1.3 \cdot 10^{28}\ \mathrm{m^{-3}}$ of conduction
electrons in potassium, which, through Eq. (241), yields a Fermi energy $\epsilon_F \simeq 2.1$ eV).
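The free-electron estimate of the T-linear coefficient can be reproduced from the quoted density alone, combining Eq. (241) with Eq. (247) for one mole of electrons. This is only a sketch: the function name is illustrative, and the constants are rounded CODATA values.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant (J s)
K_B = 1.380649e-23       # Boltzmann constant (J/K)
M_E = 9.1093837015e-31   # electron mass (kg)
N_A = 6.02214076e23      # Avogadro constant (1/mol)
EV = 1.602176634e-19     # one electronvolt in joules

def sommerfeld_coefficient(density, mass=M_E, g_s=2):
    """Molar T-linear heat-capacity coefficient N_A k_B pi^2 / (2 T_F),
    in J mol^-1 K^-2, from Eq. (247) with T_F from Eq. (241)."""
    k_f = (6.0 * math.pi**2 * density / g_s) ** (1.0 / 3.0)
    e_f = (HBAR * k_f) ** 2 / (2.0 * mass)
    t_f = e_f / K_B
    return N_A * K_B * math.pi**2 / (2.0 * t_f)

gamma_K = sommerfeld_coefficient(1.3e28)   # potassium density from the text
print(f"free-electron coefficient for K: {gamma_K * 1e3:.2f} mJ mol^-1 K^-2")
```

With the rounded density 1.3·10²⁸ m⁻³ the result is close to the quoted 1.7 mJ mol⁻¹ K⁻², somewhat below the measured 2.1 mJ mol⁻¹ K⁻².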
The Fermi gas is nonmagnetic, as all spin states are equally occupied. Within
linear response, the application of an external magnetic field strength $\vec H = H_z \hat z$
produces a total magnetic field $\vec B = B_z \hat z$, that we assume to couple only to the spin
degrees of freedom: the gs degeneracy is lifted, and the ideal Fermi gas polarizes.
$\vec M = M_z \hat z$ denotes the volume density of magnetic moment, like in Eq. (213). For
spin-1/2 electrons (gs = 2), the magnetization is $M_z = -\mu_B [N_\uparrow - N_\downarrow]/V$. By computing the linear response to the external field, one obtains the magnetic behavior of
Figure 3.14. The measured magnetic susceptibility of metallic
Zr2 V6 Sb. The wide T -independent χ region is characteristic of metallic behavior. The increase of χ as T → 0 is due to the presence of
magnetic impurities (see Eq. (216)).
the ideal Fermi gas. At high T , according to Eq. (216), independent spins produce a
Curie susceptibility χB ∝ N/T , while at small T ≪ TF the Pauli principle freezes out
the spins of most electrons, the paired ones deep inside the Fermi sphere (Fig. 3.12).
Only the approximately g(ǫF ) kB T electrons near the Fermi surface do spin-polarize,
thus producing a substantially T -independent magnetization. Detailed calculation
yields:
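$$ M_z = \frac{\mu_B^2\, g(\epsilon_F)}{V}\, B_z = \frac{3}{2}\, \frac{\mu_B^2 N}{\epsilon_F V}\, B_z = \frac{3^{1/3}}{4\pi^{4/3}}\, \frac{q_e^2}{m_e} \left( \frac{N}{V} \right)^{1/3} B_z \qquad T \ll T_F , \tag{248} $$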
where the last expression shows the explicit density dependence. Accordingly, the
spin susceptibility of the low-temperature ideal Fermi gas is given by Eq. (218), with
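$$ (\epsilon_0 c^2)^{-1} \chi_B = \frac{3^{1/3}}{4\pi^{4/3}}\, (\epsilon_0 c^2)^{-1}\, \frac{q_e^2}{m_e} \left( \frac{N}{V} \right)^{1/3} = \alpha^2 a_0 \left( \frac{3N}{\pi V} \right)^{1/3} . \tag{249} $$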
As only the electrons near the Fermi level polarize, $\chi_H$ is tiny, $\approx 10^{-5}$: it is proportional to the 1/3 power of the density, as opposed to linear in N/V, as for isolated
spins, Eq. (216). This weak T-independent paramagnetic response (Pauli paramagnetism) is a characteristic signature of metallic behavior in the experimental study
of materials (see e.g. Fig. 3.14).
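The equivalence of the second and third forms of Eq. (248), and the order of magnitude of the resulting susceptibility, can be checked numerically. The sketch assumes that the dimensionless susceptibility is obtained by multiplying $M_z/B_z$ by $\mu_0 = (\epsilon_0 c^2)^{-1}$, as the prefactor of Eq. (249) suggests (Eq. (218) itself lies outside this section); function names are illustrative.

```python
import math

HBAR = 1.054571817e-34    # reduced Planck constant (J s)
M_E = 9.1093837015e-31    # electron mass (kg)
Q_E = 1.602176634e-19     # elementary charge (C)
MU_B = 9.2740100783e-24   # Bohr magneton (J/T)
MU_0 = 1.25663706212e-6   # vacuum permeability (N/A^2), equal to 1/(eps_0 c^2)

def response_from_fermi_energy(density):
    """M_z / B_z = 3 mu_B^2 (N/V) / (2 eps_F), second form of Eq. (248)."""
    k_f = (3.0 * math.pi**2 * density) ** (1.0 / 3.0)   # g_s = 2
    e_f = (HBAR * k_f) ** 2 / (2.0 * M_E)
    return 3.0 * MU_B**2 * density / (2.0 * e_f)

def response_from_density(density):
    """M_z / B_z with the explicit (N/V)^(1/3) dependence, last form of Eq. (248)."""
    return (3.0 ** (1.0 / 3.0) / (4.0 * math.pi ** (4.0 / 3.0))
            * Q_E**2 / M_E * density ** (1.0 / 3.0))

n = 1.3e28   # conduction-electron density of potassium, as quoted in the text
chi = MU_0 * response_from_density(n)   # dimensionless (assumed definition)
print(f"two forms of M_z/B_z: {response_from_fermi_energy(n):.4f}, "
      f"{response_from_density(n):.4f}")
print(f"dimensionless susceptibility: {chi:.2e}")
```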
3.2.2.2. Bose particles. The ground state of N non-interacting bosons is even
simpler than that of N fermions: they all occupy the lowest-energy state $\vec k = \vec 0$ of
energy E = 0. If a spin degeneracy gs is present, the average number of bosons of
each spin flavor is N/gs. This situation is reflected in the expression (237) for $[n_\alpha]_B$:
as β → ∞, the occupancies of all positive-energy states vanish, and $\mu \approx -\frac{1}{\beta N} \to 0^-$
in such a way as to ensure that the occupancy of the E = 0 state remains finite, equal to N.
However, the density of states (190) vanishes at E = 0: this indicates that the
conversion (231) of the sum over the discrete single particle states into an energy
integral is missing completely the most populated state, which becomes dominant
at low temperatures. A correct treatment including separately the population of
this state shows a phase transition at a finite temperature
$$ T_{BE} = \frac{6.625}{k_B}\, \frac{\hbar^2}{2M} \left( \frac{N}{V} \right)^{2/3} , \tag{250} $$
signaled by the macroscopic filling of the E = 0 level below $T_{BE}$.14 This low-temperature state is called a “Bose-Einstein condensate”.
Even though many atoms and molecules are bosons, they all solidify at much
higher temperatures than the relevant $T_{BE}$ for standard densities. The only exception in condensed matter is $^4$He, and indeed a superfluid transition similar to that
discussed for the ideal boson gas is observed in $^4$He at low temperature and ordinary
pressure. However, $^4$He at low temperatures is a liquid, not a gas, therefore interparticle interactions play an important role: more sophisticated tools are necessary to
understand the actual nature of the superfluid transition of $^4$He. Artificial systems
have been studied in recent years, where droplets of extremely cold atoms are kept
in a metastable gas state inside an electromagnetic trap. These droplets of boson
atoms yield the experimental realization of a Bose-Einstein condensation much more
similar to that of an ideal gas than the $^4$He superfluid state (Fig. 3.15).
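As a rough illustration of Eq. (250), the sketch below evaluates the ideal-gas $T_{BE}$ at the density of liquid $^4$He. The mass density of about 145 kg/m³ and the atomic mass are input values assumed here, not taken from these notes, and the ideal-gas formula ignores the interactions that matter in the real liquid.

```python
HBAR = 1.054571817e-34   # reduced Planck constant (J s)
K_B = 1.380649e-23       # Boltzmann constant (J/K)

def bose_einstein_temperature(density, mass):
    """Ideal-gas condensation temperature of Eq. (250), spinless bosons.
    density in m^-3, mass in kg, result in K."""
    return 6.625 * HBAR**2 / (2.0 * mass * K_B) * density ** (2.0 / 3.0)

M_HE4 = 6.6464731e-27    # mass of a 4He atom (kg)
RHO_LIQUID_HE = 145.0    # assumed mass density of liquid 4He (kg/m^3)

number_density = RHO_LIQUID_HE / M_HE4
t_be = bose_einstein_temperature(number_density, M_HE4)
print(f"N/V = {number_density:.2e} m^-3, ideal-gas T_BE = {t_be:.2f} K")
```

The estimate comes out near 3 K, the same order as the observed superfluid transition temperature of $^4$He (about 2.17 K).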
Besides actual bosons, the thermodynamical properties of fictitious particles related to harmonic oscillators are also described by the Bose-Einstein distribution.
Equation (138) yields the eigenvalues of a harmonic oscillator in terms of the quantum number v representing the number of nodes of the wavefunction considered. In
a polyatomic molecular context, or whenever several harmonic oscillators of frequencies $\omega_\alpha$ contribute to the dynamics, the vibrational state is labeled by all the
$v_\alpha$ quantum numbers, and the associated total energy is
$$ E_{vib}(v_1, v_2, \dots) = \sum_\alpha \left( v_\alpha + \frac{1}{2} \right) \hbar\omega_\alpha . \tag{251} $$
14 The average filling of every individual state $|\alpha\rangle$ is given by Eq. (236). If the size of the
system (both N and V) doubles, the average occupancy of $|\alpha\rangle$ (and of any state $|\alpha'\rangle$ close in
energy) does not change: it is the density of states (190) that doubles to take care of the extra
particles. Below $T_{BE}$, the E = 0 state marks an exception: its occupancy is a finite fraction of N,
and if the system size is doubled, also its occupancy doubles.
Figure 3.15. The experimental evidence of the realization of a Bose-Einstein condensate phase of gaseous $^{87}$Rb atoms. The graphs show a
2D plot of the single-particle momentum distribution of the atoms in
the droplet, as the gas is cooled down through the Bose-Einstein
condensation temperature $T_{BE}$. The sudden buildup of the asymmetric lump at $\vec p = \vec 0$ measures the growth of the condensate fraction as
the transition temperature is reached.
This relation should be compared to the expression
$$ E(n_1, n_2, \dots) = \sum_\alpha n_\alpha E_\alpha \tag{252} $$
used, e.g. in Eq. (175), for the energy of noninteracting particles in terms of the
occupation numbers $n_\alpha$ of the single-particle states of energy $E_\alpha$. It is seen that,
apart from an irrelevant constant zero-point shift $\frac{1}{2} \sum_\alpha \hbar\omega_\alpha$, the two expressions are
identical provided that the following identifications are made:
single oscillator α  ←→  single-particle state α
oscillator quantum number $v_\alpha$  ←→  occupancy number $n_\alpha$ of state α
oscillator energy quantum $\hbar\omega_\alpha$  ←→  single-particle energy $E_\alpha$ of state α
oscillator eigenvalue $v_\alpha \hbar\omega_\alpha$  ←→  energy $n_\alpha E_\alpha$ of $n_\alpha$ particles in state α.
As vα = 0, 1, 2, 3, ..., the identification works for bosons, not fermions. The outlined
similarities in the Hamiltonian induce a completely parallel statistical behavior. For
example, consider the average vibrational energy of one oscillator, Eq. (204): once
the zero-point term $\hbar\omega_\alpha/2$ is removed, the energy $v\,\hbar\omega_\alpha$ is proportional to the quantum
number v, thus the average energy, $\frac{\hbar\omega_\alpha}{\exp\tilde\beta - 1}$, reflects an average value of v equal to
$\frac{1}{\exp\tilde\beta - 1}$, precisely the same average occupancy $[n_\alpha]_B$, Eq. (237), of a noninteracting
boson state $|\alpha\rangle$ with $E_\alpha = \hbar\omega_\alpha$ and µ = 0. It is therefore natural to think of
the harmonic ladder as fictitious boson particles, called “phonons” or “photons”,
depending on the context. Accordingly, an oscillator in its v = 4 state is said to hold 4
photons, and an oscillator in its ground state contains no photons. The Bose-Einstein
distribution can therefore be profitably employed to describe the thermodynamics
of a set of harmonic oscillators, by replacing the $E_\alpha$ with the relevant $\epsilon = \hbar\omega_\alpha$, and
taking µ = 0. In this context, the total number N of bosons cannot be held fixed
(phonons/photons are not conserved particles), and varies freely with temperature,
contrary to fermions and “real” bosons, such as atoms in a box. The lack of µ
makes the statistics of such a system a simple exercise. We summarize here briefly
the result for the photon gas, which describes the thermodynamics of the normal
modes of the electromagnetic fields at thermal equilibrium within an isothermal
cavity.
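The main difference between photons and material massive particles is that the
dispersion relation, i.e. the $\epsilon(\vec p)$ (or, equivalently, $\omega(\vec k)$) dependency is
$$ \epsilon(\vec p) = c|\vec p| \quad \text{or} \quad \omega(\vec k) = c|\vec k| \tag{253} $$
(c is the speed of light) rather than
$$ E(\vec p) = \frac{|\vec p|^2}{2M} \quad \text{or} \quad \omega(\vec k) = \frac{\hbar |\vec k|^2}{2M} \tag{254} $$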
for a particle of mass M . The electromagnetic fields in vacuum obey the same
(Laplace) stationary equation of Schrödinger particles, thus under the same periodic
boundary conditions, the allowed values of momentum are connected to the box size
by the same Eq. (180). The “spin” degeneracy of photons is gs = 2, corresponding
to the two transverse polarizations. Counting the states within a $\vec p$-sphere yields the
total density of oscillator energies
$$ g_s\, g_{ph}(\epsilon) = \frac{V}{\pi^2 \hbar^3 c^3}\, \epsilon^2 , \tag{255} $$
to be compared with the density of states of Schrödinger particles, Eq. (190).
[Figure 3.16 image: $R(\epsilon, T)$ in $\mathrm{W\,m^{-2}\,eV^{-1}}$ (vertical axis, 0 to 8000) versus ε in eV (horizontal axis, 0 to 0.5), with one curve for 300 K and one for 400 K.]
Figure 3.16. The radiance (radiated power per unit surface and
spectral energy) $R(\epsilon, T) = \frac{c}{4} u(\epsilon, T)$ of electromagnetic fields at equilibrium at T = 300 K and 400 K. The area under each curve equals the
total radiated power per unit surface $\int_0^\infty R(\epsilon, T)\, d\epsilon = \frac{\pi^2 (k_B T)^4}{60\, \hbar^3 c^2}$ (Stefan-Boltzmann law). One square meter blackbody radiates a total 459 W
at 300 K and 1450 W at 400 K. The energy position of the maximum
of $R(\epsilon, T)$ shifts linearly with T (Wien displacement law).
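The numbers quoted in the caption of Fig. 3.16 can be checked with a short script, which evaluates the closed-form Stefan-Boltzmann expression and, independently, integrates the radiance numerically. The midpoint rule, the cutoff at 50 kB T and the number of steps are arbitrary choices of this sketch.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant (J s)
K_B = 1.380649e-23       # Boltzmann constant (J/K)
C = 2.99792458e8         # speed of light (m/s)

def radiance(eps, temperature):
    """R(eps, T) = (c/4) u(eps, T), with eps in joules; units W m^-2 J^-1."""
    return eps**3 / (4.0 * math.pi**2 * HBAR**3 * C**2
                     * math.expm1(eps / (K_B * temperature)))

def total_power_closed(temperature):
    """Stefan-Boltzmann law: pi^2 (k_B T)^4 / (60 hbar^3 c^2), in W m^-2."""
    return math.pi**2 * (K_B * temperature) ** 4 / (60.0 * HBAR**3 * C**2)

def total_power_numeric(temperature, steps=20000, cutoff=50.0):
    """Midpoint-rule integral of R(eps, T) from 0 to cutoff * k_B T."""
    d_eps = cutoff * K_B * temperature / steps
    return sum(radiance((i + 0.5) * d_eps, temperature)
               for i in range(steps)) * d_eps

for t in (300.0, 400.0):
    print(f"T = {t:.0f} K: closed form {total_power_closed(t):.1f} W/m^2, "
          f"numerical integral {total_power_numeric(t):.1f} W/m^2")
```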
The statistics of these oscillators yields the following thermodynamical relations:
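$$ U = V \int_0^\infty u(\epsilon, T)\, d\epsilon = V\, \frac{\pi^2 (k_B T)^4}{15\, \hbar^3 c^3} , \tag{256} $$
$$ \text{with} \quad u(\epsilon, T) = \frac{1}{V}\, g_s\, g_{ph}(\epsilon)\, [n_\epsilon]_B\, \epsilon = \frac{1}{\pi^2 \hbar^3 c^3}\, \frac{\epsilon^3}{e^{\epsilon / k_B T} - 1} , \tag{257} $$
$$ [N] = \int_0^\infty g_s\, g_{ph}(\epsilon)\, [n_\epsilon]_B\, d\epsilon = V\, \frac{(k_B T)^3}{\pi^2 \hbar^3 c^3} \int_0^\infty \frac{y^2}{e^y - 1}\, dy = V\, \frac{2\, \zeta(3)}{\pi^2 \hbar^3 c^3}\, (k_B T)^3 , \tag{258} $$
where ζ is the Riemann zeta function ($\zeta(3) \simeq 1.20206$). These results were first derived by
M.K.E.L. Planck to interpret the experimental data of thermal-radiation spectrum.
As a by-product, the energy spectral density $u(\epsilon, T)$ (per unit volume and energy)
yields the radiative power $R(\epsilon, T) = \frac{c}{4} u(\epsilon, T)$ (power per unit surface15 per unit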
15 The c/4 factor between radiance R and energy density u originates from the fact that
photons carry energy at the speed of light c. However, given an infinitesimal surface, only the
fraction $\frac{1}{4\pi} \int_0^1 \cos\theta\, d\cos\theta \int_0^{2\pi} d\varphi = \frac{1}{4}$ of photons in the surroundings of that surface crosses it in
the right direction.
Figure 3.17. The observed cosmic microwave background frequency
spectrum, compared to a 2.73 K blackbody spectrum equivalent to
Eq. (257).
spectral energy) of radiation at thermodynamic equilibrium.16 Figure 3.16 reports
the radiative power R(ǫ, T ) as a function of energy for two different temperatures.
The spectral distribution of energy density (and, equivalently, radiative power) of
electromagnetic fields at equilibrium is a universal function of temperature (called
blackbody spectrum), and does not depend on the precise way this equilibrium is
established (for example on the optical properties of the material of the cavity
enclosing the fields). The distribution (256) is in extremely good agreement with
experiments on radiation escaping through a tiny hole from isothermal cavities. A
spectacular realization (Fig. 3.17) of the thermal-equilibrium radiation is that of the
cosmic background, a “fossil” relic of an early stage of the universe when it was all
at thermal equilibrium.
We will apply the same statistics, with a slightly different density of states, to
understand the thermal properties of the vibrations of solids (phonons in the Debye
model, Sec. 4.3.2).
16 The same quantities are often quoted in terms of frequency ν, rather than energy
$\epsilon = 2\pi\hbar\nu = h\nu$. For example: $\tilde g(\nu, T) = 8\pi V c^{-3} \nu^2$, $U = V \int_0^\infty \tilde u(\nu, T)\, d\nu$, with $\tilde u(\nu, T) = 8\pi h c^{-3} \nu^3 / \left[ \exp\frac{h\nu}{k_B T} - 1 \right]$.
3.3. Interaction radiation-matter
In Sec. 1.1.10 we have sketched the basic semiclassical results for a material system
(e.g. an atom, a molecule) interacting with the radiation field. In the absence of any
external stimulation, the system decays spontaneously to a lower-energy state with
the rate given by Eq. (73). Here we want to describe in some detail what happens
under the action of an external field, i.e. the effect of a stimulating radiation, as, e.g.,
in an absorption experiment. Typically this radiation needs to resonate with the
energy of transition between two eigenstates of the system. Off-resonance transitions
are associated to very small rates, and we ignore them here.
Assume we have an ensemble of noninteracting quantum systems (e.g. atoms in
gas phase) and focus our attention on two single-system levels only: $|1\rangle$ and $|2\rangle$, of
energies $E_1$ and $E_2$ ($E_1 < E_2$), with populations $n_1$ and $n_2$, respectively. Resonant
radiation of energy $\epsilon = \hbar\omega = E_2 - E_1$ may induce transitions between the two levels.
In particular, the excitation $|1\rangle \to |2\rangle$ is driven by the presence of radiation, thus any
given atom initially in state $|1\rangle$ has a probability per unit time (a rate) of upward
transition proportional to the spectral energy density (per unit volume and spectral
interval) $\rho(\epsilon)$ of the electromagnetic field at the resonance energy:
$$ R_{1\to 2} = B_{12}\, \rho(\epsilon) , \tag{259} $$
where $B_{12}$ is a suitable constant of proportionality, depending on the microscopical
characteristics of the system and its coupling to the field, and we neglect nonlinear
couplings $O(\rho^2)$. On the other hand, any atom initially in state $|2\rangle$ has a probability
per unit time $A_{21}$ to decay spontaneously to state $|1\rangle$ (in the dipole approximation, $A_{21}$ equals the $\gamma_{21}$ of Eq. (73)), plus an additional probability of downward
transitions promoted by the presence of stimulating radiation, whose rate is then
proportional to the same resonant spectral density:
$$ R_{2\to 1} = A_{21} + B_{21}\, \rho(\epsilon) , \tag{260} $$
where again $B_{21}$ is a yet-to-be-determined constant of proportionality.
At any given time, the total number of systems undergoing the $|1\rangle \to |2\rangle$ transition per unit time is $n_1 R_{1\to 2}$, and the total number of systems going $|2\rangle \to |1\rangle$ per unit time is $n_2 R_{2\to 1}$. These
relations are valid under arbitrary radiation conditions, for example when the system is probed by a stimulating radiation in a spectroscopy experiment (Fig. 0.5).
However, these relations are valid in particular when the system and the radiation
field are both at equilibrium at a given temperature. At equilibrium the average
populations of different states must remain unchanged, and this implies that the
total number of $|1\rangle \to |2\rangle$ and $|2\rangle \to |1\rangle$ processes must, on average, be equal:
$$ [n_1]\, R_{1\to 2} = [n_2]\, R_{2\to 1} . \tag{261} $$
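We substitute (259) and (260) in the balance equation (261)
$$ [n_1]\, B_{12}\, \rho(\epsilon) = [n_2]\, \left( A_{21} + B_{21}\, \rho(\epsilon) \right) \tag{262} $$
and solve for $\rho(\epsilon)$ obtaining
$$ \rho(\epsilon) = \frac{A_{21} / B_{21}}{\frac{[n_1]}{[n_2]} \frac{B_{12}}{B_{21}} - 1} . \tag{263} $$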
At equilibrium, the ratio of the populations equals the probability ratio which, by
Boltzmann statistics, is simply $\frac{[n_1]}{[n_2]} = \frac{P_1}{P_2} = \exp[\beta(E_2 - E_1)] = \exp[\beta\epsilon]$ at resonance.
At equilibrium, the spectral density of the radiation field follows the universal energy
dependency described in Sec. 3.2.2.2, in particular in Eq. (257): $\rho(\epsilon) = u(\epsilon, T)$.
Accordingly,
$$ \frac{A_{21} / B_{21}}{\frac{B_{12}}{B_{21}} \exp\frac{\epsilon}{k_B T} - 1} = \rho(\epsilon) = u(\epsilon, T) = \frac{1}{\pi^2 \hbar^3 c^3}\, \frac{\epsilon^3}{\exp\frac{\epsilon}{k_B T} - 1} . \tag{264} $$
Comparison of the first and last form, which must be equal for any temperature,
implies the following conditions on the coefficients:
$$ \frac{B_{12}}{B_{21}} = 1 , \qquad \frac{A_{21}}{B_{21}} = \frac{1}{\pi^2 \hbar^3 c^3}\, \epsilon^3 , \tag{265} $$
known as Einstein relations. These relations among $B_{12}$, $B_{21}$, and $A_{21}$ were derived under
the assumption of equilibrium, but as these coefficients only depend on properties of
the individual system, they remain constant under arbitrary field conditions. Once
one of these coefficients is determined, e.g. by $A_{21} = \gamma_{21}$ of Eq. (73), the two others
are fixed by the Einstein relations.
The first relation expresses the symmetric role of the initial and final states in
quantum mechanics, implying that a radiation field induces equal rates of excitation
$|1\rangle \to |2\rangle$ and of stimulated emission $|2\rangle \to |1\rangle$. This means in particular that under
very strong applied radiation density the spontaneous emission becomes negligible,
and $R_{1\to 2} \simeq R_{2\to 1}$, thus rapidly also $n_1 \simeq n_2$ (saturated transition). Saturating the
different components of the $|n = 2\rangle \longleftrightarrow |n = 3\rangle$ transition of hydrogen is precisely
the role of the pump beam in the experiment of Fig. 1.12.
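The saturation behavior can be made explicit with a small sketch. It assumes a closed two-level ensemble in a steady state under constant illumination, so that the balance $n_1 B\rho = n_2 (A_{21} + B\rho)$ holds with $B_{12} = B_{21} = B$; the dimensionless parameter $s = B\rho / A_{21}$ and the function name are introduced here for illustration only.

```python
def steady_state_population_ratio(s):
    """n2 / n1 in the steady state of a closed two-level system,
    from n1 * B * rho = n2 * (A21 + B * rho) with s = B * rho / A21."""
    return s / (1.0 + s)

for s in (0.01, 1.0, 100.0, 1.0e6):
    print(f"B rho / A21 = {s:g}: n2/n1 = {steady_state_population_ratio(s):.6f}")
```

Under these assumptions the ratio approaches 1 for strong radiation but never exceeds it: direct illumination of a two-level system can equalize the two populations, not invert them.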
The second relation implies that, for a given spectral energy density $\rho(\epsilon)$ (independent of ε), the ratio of spontaneous emission to stimulated emission varies with $\epsilon^3$.
Accordingly, in a low-energy transition (microwaves, infrared) stimulated emission
prevails, while at higher energy (ultraviolet, X-rays) spontaneous emission prevails.
At equilibrium, the ratio of spontaneous to stimulated emission
$$ \frac{A_{21}}{B_{21}\, u(\epsilon, T)} = \exp\frac{\epsilon}{k_B T} - 1 \tag{266} $$
Figure 3.18. The principle of operation of three- and four-level
lasers, which realize in practice the needed population inversion, with
level $|2\rangle$ pumped at the expense of level $|1\rangle$. The lasing transition
(2 → 1) is slower than the other indicated downward transitions,
which are fast dipole-allowed transitions.
indicates that thermal radiation stimulates emission effectively only for temperatures $k_B T \sim \epsilon$ or higher, as was to be expected. More interestingly, in the context
of spectroscopy, the relation $B_{12} = \pi^2 \hbar^3 c^3 \epsilon^{-3} A_{21}$ shows that an emission-forbidden transition is also absorption-forbidden, and that an intense transition in absorption is
also intense in emission, except at very low energies ε.
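A few numbers illustrate the equilibrium ratio of spontaneous to stimulated emission. The sketch builds $A_{21}/B_{21}$ from the Einstein relation and $u(\epsilon, T)$ from the Planck form of Sec. 3.2.2.2; the three photon energies are illustrative choices for a microwave, an infrared and a visible transition.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant (J s)
K_B = 1.380649e-23       # Boltzmann constant (J/K)
C = 2.99792458e8         # speed of light (m/s)
EV = 1.602176634e-19     # one electronvolt in joules

def a_over_b(eps):
    """Einstein relation A21 / B21 = eps^3 / (pi^2 hbar^3 c^3), eps in joules."""
    return eps**3 / (math.pi**2 * HBAR**3 * C**3)

def planck_u(eps, temperature):
    """Equilibrium spectral energy density u(eps, T)."""
    return a_over_b(eps) / math.expm1(eps / (K_B * temperature))

def spontaneous_to_stimulated(eps, temperature):
    """Ratio A21 / (B21 u(eps, T)) of spontaneous to stimulated emission
    in thermal radiation; it equals exp(eps / k_B T) - 1."""
    return a_over_b(eps) / planck_u(eps, temperature)

for eps_ev in (1.0e-4, 0.1, 2.0):
    ratio = spontaneous_to_stimulated(eps_ev * EV, 300.0)
    print(f"eps = {eps_ev:g} eV at 300 K: spontaneous/stimulated = {ratio:.3g}")
```

At room temperature, stimulated emission by thermal radiation dominates only for the microwave transition, while for the visible one it is utterly negligible.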
3.3.1. The laser. Usual media absorb traversing electromagnetic waves, and
this is at the basis of absorption spectroscopies (Fig. 0.5). However, consider the
possibility that a bunch of non-interacting quantum systems amplifies light rather
than attenuating it. For this to occur, the total emission rate needs to exceed
absorption: n2 R2→1 > n1 R1→2 , i.e. the ratio
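$$ \frac{\text{rate of emission}}{\text{rate of absorption}} = \frac{n_2\, (A_{21} + B_{21}\, \rho(\epsilon))}{n_1\, B_{12}\, \rho(\epsilon)} = \frac{n_2}{n_1} \left( 1 + \frac{A_{21}}{B_{21}\, \rho(\epsilon)} \right) \tag{267} $$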
needs to exceed unity. As the second term becomes very small as soon as radiation
intensity builds up, this equation tells us that a single condition grants light amplification (ratio > 1): a population inversion (n2 > n1 ), with respect to Boltzmann
population of prevailing low-energy states. In practice, this radically out of equilibrium population is realized by means of electronic or optical pumping, involving
other levels beside the two relevant ones, as illustrated in Fig. 3.18. Of course,
a population inversion is highly unstable, and precisely the prevalence of emission
over absorption tends to lead the ensemble toward a regular equilibrium population
n2 < n1 . However, pumping may sustain the population inversion for an extended
period of time, at the expense of some external power source.
Figure 3.19. (a) Cr impurities in a transparent Al$_2$O$_3$ crystal play
the role of weakly interacting atoms carrying the 3 levels involved in
the population inversion. Optical pumping of the “active medium”
raises many Cr atoms from the ground state $|1\rangle$ to the broad short-lived state $|3\rangle$, which decays spontaneously to state $|2\rangle$. Further spontaneous decay $|2\rangle \to |1\rangle$ is extremely slow: decay is mainly promoted
by stimulated emission. Since the energy of metastable state $|2\rangle$ is
very sharply defined, the emitted radiation is very monochromatic.
(b) The schematic construction of the ruby laser, showing the optical
pumping lamp, the escape of photons not moving axially, and suggesting the buildup of repeatedly reflected axially moving photons,
which stimulate further coherent emission. The usable photons escape through the partially reflecting mirror at one end.
Once an optical active medium which amplifies light is realized, light can be channeled through it in a precise direction, by building a resonating optical cavity around
it (Figs. 3.19 and 3.20). A crucial feature of stimulated emission is coherence: the
emitted photon is not emitted in a random direction with a random phase, as in
spontaneous emission, but prevalently in the same direction and with the same phase
as the stimulating beam of photons.
The use of a long cavity lets photons emitted at odd directions escape basically
unamplified, while photons along the axis of the cavity bounce back and forth several
times through the active medium, thus getting strongly amplified. As a result, a
powerful highly coherent beam of radiation builds up for as long as the population
inversion is maintained. A device such as described here, producing a coherent beam
of photons by means of Light Amplification by the Stimulated Emission of Radiation
is named laser.
Massive commercial applications of such coherent beams extend from the industrial to the consumer side, including telecommunications, optical data storage,
telemetry, cutting, welding, surgery... Lasers also play a fundamental role as tools
for research in many spectroscopies, photochemistry, ultracold trapped gas cooling,
microscopy, adaptive optics of telescopes...
Figure 3.20. (a) The principle of operation of a He-Ne (four-level)
laser. Electrons accelerated by an electric field hit the He atoms, leading
them to a metastable excited state He$^*$(1s2s), which can then resonantly transfer its excitation energy to a Ne atom (He$^*$ + Ne → He +
Ne$^*$). The excited Ne atoms are now in the highest level of a 4-level configuration, with the correct lifetimes for lasing operation. The emission
of the upper transition is a characteristic red light (λ ≃ 633 nm), but
other transitions can be selected. (b) The basic components of a gas
laser.
The present Chapter introduces the basics of equilibrium statistics and its connection to thermodynamics. It also reports a few examples of applications to ordinary matter, basically restricted to ideal noninteracting systems. Statistics becomes much more complicated and stimulating when real interacting systems are
considered [?, ?]. Physicists have devised all sorts of techniques to investigate interacting systems both theoretically (diagrammatic expansions, many-body methods,
renormalization group, ...) and experimentally (definitions and measurements of
position/momentum/spin correlation functions, high-pressure techniques to investigate phase diagrams in extreme regimes, analysis of fluctuations, study of finite-size
effects, exotic magnetic phases...). In the present introduction we have completely
omitted any reference to dynamical out-of-equilibrium quantities. These are needed
to account for transport (we shall use few basic definitions to define electrical and
thermal conductivities in solids below) and general hydrodynamical properties starting from the microscopical interactions. This active and exciting sub-field of research
in statistical physics goes far beyond the scope of the present introductory course.
CHAPTER 4
Solids
Macroscopic systems realize an equilibrium thermodynamical state as a balance
between energy (which tends to decrease) and entropy (which tends to increase).
Temperature sets the relative importance of the entropic contribution over the energetic one: energy prevails at low temperature, entropy at high temperature. The
solid state is an ordered state, where each atom basically sits around a definite position: entropy is much smaller there than in a fluid state. Indeed, experience shows
that most materials solidify at low temperature.
The solid state signals the prevalence of adiabatic potential energy over nuclear
kinetic energy. Kinetic energy Tn is translationally invariant, and tends to favor
position-delocalized states, characterized by as low momentum as possible, and
scarce position correlation. On the contrary, potential energy Vad takes advantage of characteristic optimal interatomic separations (see e.g. Fig. 2.1), and tends
therefore to impose fixed interatomic relative distances and clear position localization. For essentially all materials at zero temperature, potential energy prevails,
thus leading to solid states. The only remarkable exception is He, which at ordinary
pressure remains fluid (and superfluid) down to zero temperature, due to its exceptionally weak interatomic interactions associated to scarce atomic polarizability (see
Table 2.1). The application of pressure leads to low-temperature solid phases even
for He, though.
The tendency of a solid to hold together is measured by its “total bond energy”
(in molecular language, the energy necessary to disaggregate the compound into its
atomic components). In the context of solids, this is referred to as the cohesive
energy. As for molecules, both the Vad and Tn contributions vanish in the “atomized” state, whereas in the solid state [Vad] is negative (assuming Vad = 0 in
the atomized state), and [Tn] is positive (it cannot vanish, even at T = 0, due to
the Heisenberg uncertainty principle) but much smaller in absolute value (see Fig. 2.14).
Solid matter is characterized by long-distance rigidity: a force applied to one or
few of the atoms in a solid sample acts through the whole sample, which accelerates
maintaining its average shape unchanged. This is due to the ability of solids (as
opposed to fluids) to resist shear forces. This basic macroscopic property, on which
our everyday experience relies, is far from trivial from the point of view of the
microscopic equations governing the dynamics of electrons and nuclei composing a
solid. Indeed, our analysis of the adiabatic potential acting directly between two
atoms at large distance reveals that Vad always decays with a rather fast power law
1/R^6. As the number of atoms at distance R from a given atom grows as R^2, the
interaction energy with the shell of faraway atoms at distance R decays as ∼ 1/R^4. In practice, an
atom interacts significantly only with those atoms sitting in its close neighborhood,
within a volume of a few nm^3, say. Long-distance rigidity therefore cannot be related to long-range
forces. This means that, in a solid, short-range forces propagate from one atom to
the next ones, and from those to further atoms again and again, through the whole
sample.
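The short range of the interaction can be made quantitative with a rough continuum estimate. This is only a sketch under stated assumptions, not part of the original argument: the atoms beyond distance R are smeared into a uniform number density n, and the pair potential is taken as its asymptotic form $-C_6/R^6$ (the symbols n and $C_6$ are introduced here for illustration).

```latex
% Shell of radius R' and thickness dR' holds 4\pi R'^2 n\, dR' atoms,
% each contributing -C_6 / R'^6: the shell contribution scales as 1/R'^4.
U_{>R} \simeq \int_R^{\infty} 4\pi R'^2\, n \left(-\frac{C_6}{R'^6}\right) dR'
       = -4\pi n\, C_6 \int_R^{\infty} \frac{dR'}{R'^4}
       = -\frac{4\pi n\, C_6}{3 R^3} .
```

Under these assumptions the integral converges rapidly: each shell contributes in proportion to 1/R^4, and all atoms beyond R together contribute only in proportion to 1/R^3, consistent with the statement that only the close neighborhood matters.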
4.1. The microscopic structure of solids
Many solids show ordered microscopic structures, but completely disordered
solids are very frequent as well. Before coming to the experimental evidence for
the ordered structure of many solids, we try to understand why regular spatial
arrangements of atoms should not come as much of a surprise.
In our initial study of many-atom systems (Chapter 2), we analyzed the typical
shape of the adiabatic potential (Fig. 2.1) for a diatom. We also found that the
adiabatic potential of many atoms is an explicit function of the relative positions
of all of them, including all distances and angles. For exceptionally simple systems,
such as the noble-gas elements, the total adiabatic potential of Nn atoms is decently
approximated by a sum of 2-body potentials (e.g. of Lennard-Jones type – Eq. (136))
$$ V_{ad}(R_1, R_2, \ldots, R_{N_n}) = \sum_{\alpha<\alpha'}^{N_n} V_2(|R_\alpha - R_{\alpha'}|) . \qquad (268) $$
Initially, neglect the nuclear kinetic energy: the state of minimum energy of two such
atoms is realized by placing them at the equilibrium distance RM of the potential
V2 , with an energy gain equal to the depth −ε of the potential well.1 Suppose now
that a third atom is added, and that all equal atoms are constrained to stay along
a line: the third atom can join the two on either side, at approximately a distance
RM from the nearest atom, as shown in Fig. 4.1. Neglecting the weak second-, third-, etc. neighbor forces (which induce extra attraction, as illustrated in Fig. 4.1), the
energy gain is simply −2ε. Likewise, Nn atoms along a chain place themselves at
almost perfectly regular distances,2 and gain a total cohesive energy ≃ −(Nn − 1) ε,
1 For the Lennard-Jones potential (136), $R_M = \sqrt[6]{2}\,\sigma$, with $V_2(R_M) = V_{LJ}(R_M) = -\varepsilon$.
2 Slight distortions occur at the surface. Deep inside the bulk, however, each atom is subject
to basically the same potential as its neighboring ones, thus it reaches an equilibrium position
relative to the others which involves perfectly regular spacings.
[Figure 4.1: panels (a) and (b); plot of V2(R), with axes Energy / ε versus R/RM and the distance RM marked.]
Figure 4.1. (a) A 1D solid constructed by successive addition of
atoms. The atoms after the second add at fairly regular distances,
slightly smaller than RM, the optimal equilibrium distance of the 2-body adiabatic potential. (b) The reason for the slightly smaller equilibrium separation in the solid than in the diatom: if the separation
were RM, forces to second, third... neighbors would all be attractive
and uncompensated, thus some energy is gained by contracting the
interatomic separation. This contraction is tiny, as V2 explodes at
short distance, and even tinier at the surface than in the bulk, due to
the local deficiency of second and further neighbors.
i.e. essentially −ε per atom. Depending on the precise shape of V2 the actual energy
gain turns out slightly larger due to second- and higher-neighbor attraction.
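The size of the second- and higher-neighbor corrections in the 1D chain can be checked numerically. The following is a minimal sketch, not part of the original notes: it assumes the standard Lennard-Jones form V2(R) = 4ε[(σ/R)^12 − (σ/R)^6] for Eq. (136), works in units ε = σ = 1, and truncates the neighbor sums of an infinite chain at an arbitrary cutoff `n_max`. The function names are hypothetical.

```python
import math

def lj_chain_energy_per_atom(a, n_max=2000):
    """Energy per atom (units of epsilon) of an infinite Lennard-Jones chain.

    a is the lattice spacing in units of sigma. Each atom has two n-th
    neighbors at distance n*a; the factor 1/2 avoids double counting bonds.
    """
    total = 0.0
    for n in range(1, n_max + 1):
        r = n * a
        total += 0.5 * 2 * 4.0 * (r ** -12 - r ** -6)
    return total

def minimize_golden(f, lo, hi, tol=1e-10):
    """Golden-section search for the minimum of a unimodal function on [lo, hi]."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    c, d = hi - g * (hi - lo), lo + g * (hi - lo)
    while hi - lo > tol:
        if f(c) < f(d):
            hi, d = d, c
            c = hi - g * (hi - lo)
        else:
            lo, c = c, d
            d = lo + g * (hi - lo)
    return 0.5 * (lo + hi)

R_M = 2.0 ** (1.0 / 6.0)   # diatom equilibrium distance, in units of sigma
a_eq = minimize_golden(lj_chain_energy_per_atom, 1.0, 1.3)
e_eq = lj_chain_energy_per_atom(a_eq)
print(f"a_eq / R_M = {a_eq / R_M:.4f}")
print(f"energy per atom = {e_eq:.4f} epsilon")
```

Under these assumptions the equilibrium spacing should come out roughly 0.3% below RM, and the energy per atom a few percent below −ε, in line with the qualitative statements above.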
In 1D, regularity is trivial and indeed unavoidable. In higher dimensions, the
freedom in the arrangement of atoms is significantly larger, and several regular
and irregular atomic arrangements are possible. In 2D, a third atom can add to
form an equilateral triangle, and extra atoms that join the cluster find a lowest-energy arrangement by progressively building a triangular lattice, as illustrated in
Fig. 4.2. In the limit of large Nn, when surface effects can be neglected, each
atom is surrounded by 6 nearest neighbors, so that the cohesive energy per atom
is approximately −3ε, as each bond is shared between two atoms. Second- and
higher-neighbor terms make the actual energy gain slightly larger, but the general
message is that a two-body interaction tends to favor configurations of maximal
coordination, i.e. geometric arrangements where each atom has as many nearest
Figure 4.2. A 2D solid constructed by successive addition of atoms.
The basic unit is the equilateral triangle, which repeats itself indefinitely in space, so that each atom in the bulk is (maximally) coordinated to 6 other atoms. Contractions due to second, third... neighbors
are ignored in this figure.
neighbors as possible. Indeed, if the atoms were forced to occupy a square lattice,
each atom would bind to 4 nearest neighbors, rather than 6, thus gaining a cohesive
energy per atom of ≃ −2ε only: this gives a macroscopic cohesive energy difference
∆U = U_square − U_triang ≃ Nn ε in favor of the triangular lattice over the square
lattice, which is therefore strongly unstable.
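The bond counting behind the −3ε and −2ε estimates can be reproduced with a few lines of code. This is an illustrative sketch only: the function name `coordination` and the choice of unit nearest-neighbor distance are assumptions made here, and only nearest-neighbor bonds of depth −ε are counted.

```python
import math

def coordination(primitive_vectors, n_range=3, tol=1e-9):
    """Return (nearest-neighbor distance, number of nearest neighbors)
    for the 2D Bravais lattice generated by the two given vectors."""
    (a1x, a1y), (a2x, a2y) = primitive_vectors
    dists = []
    for n1 in range(-n_range, n_range + 1):
        for n2 in range(-n_range, n_range + 1):
            if n1 == 0 and n2 == 0:
                continue  # skip the reference atom itself
            dists.append(math.hypot(n1 * a1x + n2 * a2x, n1 * a1y + n2 * a2y))
    d_min = min(dists)
    return d_min, sum(1 for d in dists if abs(d - d_min) < tol)

square = ((1.0, 0.0), (0.0, 1.0))
triangular = ((1.0, 0.0), (0.5, math.sqrt(3.0) / 2.0))

for name, vecs in (("square", square), ("triangular", triangular)):
    d, z = coordination(vecs)
    # z bonds per atom, each shared between two atoms: energy per atom = -z/2 epsilon
    print(f"{name}: z = {z}, nearest-neighbor estimate = {-z / 2:.0f} epsilon per atom")
```

The difference between the two estimates, ε per atom, is the nearest-neighbor version of ∆U ≃ Nn ε.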
In 3D, the implications of the principle of maximal coordination reach even further.
Four atoms maximize their coordination by placing themselves at the vertices of a
regular tetrahedron. Extra atoms extend this basic unit in space following either of
two different regular patterns: the face-centered cubic (fcc) lattice and the hexagonal
close-packed (hcp) lattice. As illustrated in Fig. 4.3, both lattices are the result
of a regular stacking of 2D triangular lattices. In both hcp and fcc lattices, the
second layer is stacked above the first one so that the atoms sit on top of the
centers of half the triangles of the lower layer. The third layer is also stacked on
top of half the centers of the second-layer triangles: in the hcp directly above the
atoms of the first layer, while in the fcc above the remaining triangles. In both
configurations, each atom is surrounded by 12 nearest neighbors, thus the cohesive
energy is approximately −6ε per atom. When higher-order neighbors are included,
the cohesive energy is usually marginally more favorable to fcc than to hcp.
It is indeed observed that the solid state of Ne, Ar, Kr and Xe is a regular fcc
lattice. The optimal equilibrium distance (accounting for all higher-neighbor interactions) for the Lennard-Jones fcc solid equals 0.971 RM = 1.09 σ, with a total
cohesive energy per atom of −8.6 ε (significantly more bound than the −6 ε nearest-neighbor estimate). By plugging the parameters of Table 2.1 in this simple model,
Figure 4.3. Two different close packings of rigid spheres: fcc is
realized as a stacking sequence of type abcabc..., while hcp is realized
as a stacking of type ababab... of triangular lattices. Upper panel:
decomposition in triangular layers of conventional cells of fcc and hcp.
Lower panel: two layers of “cannonballs”. If a third layer is placed on
sites of type (c), an fcc stacking is initiated. If instead the third layer
is placed on sites of type (a), an hcp stacking is realized.
one obtains the bond distances and cohesive energies of Table 4.1. Not unexpectedly,
the experimental energy is generally slightly larger (less cohesive) than the prediction of the simple Lennard-Jones model, where the ionic kinetic energy Tn and the
associated zero-point motion are neglected. The good overall agreement confirms
element   RM (exp)   1.09σ (theory)   U/Nn (exp)   −8.6 ε (theory)   Tmelt (exp)
          [pm]       [pm]             [meV]        [meV]             [K]
Ne        313        300              −20          −27               24.6
Ar        375        371              −80          −90               83.8
Kr        399        401              −110         −124              115.8
Xe        433        443              −170         −167              161.4

Table 4.1. The estimate of the equilibrium nearest-neighbor interatomic separation and cohesive energy per atom for solid noble-gas
elements based on the fcc Lennard-Jones model (Table 2.1), compared
to experimental determinations. Discrepancies of the order of a few percent are not surprising, especially for the lighter Ne, for a model which
neglects the kinetic energy of the nuclei. Note the strong correlation
between the cohesive energy per particle and the melting temperature
Tmelt of the solid ($|U/N_n| \simeq 12\, k_B T_{melt}$).
the concept of maximizing the coordination, leading to compact structures similar
to the packing of hard spheres, typically fcc, a concept valid whenever the adiabatic
potential can be decomposed in 2-body terms, as in Eq. (268).
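The values 1.09 σ and −8.6 ε quoted for the Lennard-Jones fcc solid can be recovered by direct summation over lattice sites. The sketch below is an illustration under stated assumptions, not the procedure used in the notes: it assumes V2(R) = 4ε[(σ/R)^12 − (σ/R)^6], neglects the nuclear kinetic energy, and truncates the sums to a finite cube of sites (`n_max` is an arbitrary cutoff). With d the nearest-neighbor distance and S_p the sum over all other sites of (d/r)^p, the energy per atom is 2ε[S12 (σ/d)^12 − S6 (σ/d)^6], which is minimized analytically in the code.

```python
import numpy as np

def fcc_lattice_sums(n_max=30):
    """Return (S6, S12), with S_p = sum over fcc sites j != 0 of (d / r_j)**p,
    where d is the nearest-neighbor distance.

    fcc sites are the integer triples (i, j, k) with i + j + k even, in units
    of half the cube side; in these units d = sqrt(2).
    """
    n = np.arange(-n_max, n_max + 1)
    i, j, k = np.meshgrid(n, n, n, indexing="ij")
    r2 = (i**2 + j**2 + k**2).astype(float)
    mask = ((i + j + k) % 2 == 0) & (r2 > 0)
    x = 2.0 / r2[mask]                     # (d / r)**2 for every site
    return np.sum(x**3), np.sum(x**6)

S6, S12 = fcc_lattice_sums()

# Setting the derivative of 2[S12 (sigma/d)^12 - S6 (sigma/d)^6] to zero:
d_over_sigma = (2.0 * S12 / S6) ** (1.0 / 6.0)
e_min = -S6**2 / (2.0 * S12)               # energy per atom, units of epsilon
R_M = 2.0 ** (1.0 / 6.0)                   # diatom equilibrium distance / sigma

print(f"S6 = {S6:.3f}, S12 = {S12:.3f}")
print(f"d = {d_over_sigma:.3f} sigma = {d_over_sigma / R_M:.3f} R_M")
print(f"cohesive energy per atom = {e_min:.2f} epsilon")
```

Within the truncation error of the finite cube, the output should be close to the 0.971 RM = 1.09 σ and −8.6 ε given in the text.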
The noble gases (and mixtures thereof) are not the only system where the approximation of a 2-body adiabatic potential works well: it can successfully describe
the overall structure of many solids formed by “spherically symmetric” closed-shell
molecules, e.g. methane CH4 [?]. With suitable modifications, similar 2-body models make it possible to describe other molecular solids (e.g. H2, N2, Cl2) where again weak
Van der Waals interactions provide cohesion, but the asymmetry of the individual
molecular unit can favor different lattice structures.
However, molecular solids, where each molecular unit retains many of its molecular
properties and is only weakly bound to other units, constitute a marginal class of
solids, by no means the most typical one, just as the Ar2 dimer is not the most typical
example of a molecule. Contrary to molecular solids, electrons of the outer atomic
shells change their quantum state substantially when they belong to a covalent or
metallic solid, in pretty much the same way that the electronic state of H, N and O
changes in forming the covalent molecules H2, N2, H2O, as discussed in Sec. 2.2. A
covalent or metallic solid can indeed be seen as a huge molecule, where the molecular
bonding extends to the whole sample. Experimentally, cohesive energies per atom
U/Nn of solids are comparable to bond energies of covalently bound diatomic molecules,
i.e. of the order of several eV. These are of course much larger than those of noble-gas solids (Table 4.1); for example we report the cohesive energies per atom of a few
solid elements: Li −1.65 eV, C (diamond) −7.36 eV, Si −4.64 eV, Fe −4.29 eV, Cs
Figure 4.4. (a) A sliced wafer of perfectly crystalline Si. The diameter of the cylinder is 300 mm. (b) A balls-and-sticks view “from
the inside” of the silicon crystal lattice structure. This is the same
geometrical structure as C diamond. The coordination is four only.
This view is down the (110) direction.
−0.83 eV. Ionic crystals (e.g. NaCl) show similar cohesive energies, of several eV
per ion pair.
To compute a quantitative estimate of the cohesive energy of a covalent or metallic solid, it is necessary to study the dynamics of its electrons in detail. As for
many-electron molecules, in practice, reliable estimates can be computed only on
the basis of detailed self-consistent calculations. Before coming back to the electronic states of solids in Sec. 4.2, observe that it is to be expected that the adiabatic
potential associated with such nontrivial electronic states shows strong dependence
on all bond lengths and angles, pretty much as it does in molecules where it enforces more or less rigid equilibrium molecular geometries (Fig. 2.7). Therefore, in
covalent and metallic solids, the 2-body approximation Eq. (268) is bound to fail
completely, and different structures, other than fcc, are to be expected, depending
on the detailed chemistry and thus on the relevant Vad involved. Indeed, elemental solids show several different ordered structures including (besides fcc): hcp, body-centered
cubic (bcc), diamond, and others. Long-range crystalline order reaches
spectacular levels of perfection, for example in industrial-grade Si single crystals,
where a sub-nanometer unit cell repeats itself over and over in three dimensions for
distances exceeding one meter (Fig. 4.4). We shall soon present those structures in
the standard idealized formalism, commonly employed to report and classify many
experimental data on crystalline solids, based on an elementary unit infinitely repeated periodically in space.
Figure 4.5. Point defects in an otherwise perfect triangular lattice:
(a) a vacancy, (b) an interstitial atom, and (c) an impurity atom.
Figure 4.6. (a) A dislocation (b) observed by STM in PtNi alloy. (c)
The motion of dislocations substantially reduces the shear resistance
of a real crystal with respect to that of a perfect crystal.
Figure 4.7. A stacking fault observed in an otherwise perfect GaN crystal.
Figure 4.8. A grain boundary observed in the high-resolution transmission electron microscope (TEM) image of a strontium-titanate
film.
A perfect infinitely repeated lattice is an idealization which is never exactly realized in nature. In a real material, the crystalline structure is bound to contain
defects. A localized defect, such as a vacancy or an interstitial atom (Fig. 4.5) raises
the total energy by a few times the typical bond energy ε of an atom to its neighbors.
This is irrelevant for the total cohesive energy per atom U/Nn, in the thermodynamical
limit. At low temperature, at equilibrium, one expects a small finite concentration
(of the order of ∼ e^{−βε}) of localized defects to survive. The modest difference in
energy between the fcc and hcp lattices makes the creation of extended defects such
as dislocations (Fig. 4.6) or stacking faults (Fig. 4.7) likely. The energy cost of
extended defects is macroscopically large and should therefore suppress them at
equilibrium. However defects remain easily “frozen” within the lattice, the typical
time for an extended defect to drift out of a macroscopically large sample being often astronomically large. As a result, at very low temperature a solid often remains
locked in a metastable non-equilibrium state with a finite (often large) concentration of extended defects depending on preparation (see the discussion of ortho- and
para-hydrogen of Sec. 3.0.4 for a simpler example of a metastable non-equilibrium
state surviving for a long time). Extended defects of the type of Figs. 4.6-4.8 are crucial for understanding the plastic deformations of real crystals under strain. Even
without “internal” defects, real solids are not ideal because the lattice periodicity
is forced to end at the inevitable terminating surface or interface or grain boundary
(Fig. 4.8).
Figure 4.9. High-resolution TEM image of an amorphous zirconium
(Zr) alloy. In contrast to Fig. 4.8, the spots are randomly arranged,
indicating that this material is noncrystalline.
Figure 4.10. Naturally grown (a) quartz (SiO2) and (b) fluorite
(doped CaF2) polycrystals. (c) Sapphire (doped Al2O3) cut artificially
along crystal planes.
In many materials (multi-component off-stoichiometric compounds, glasses, many
polymers, alloys...), the cost of the formation of defects is so small that it is highly
nontrivial (and often even impossible) to obtain crystalline samples. These materials
form amorphous solids, where no long-range lattice order is present (Fig. 4.9): their
microscopic structure often resembles that of a frozen liquid. The formalism that
we are going to set up for crystals is mostly useless for amorphous systems: their
investigation requires more advanced tools, exceeding the scope of the present basic
course.
The tendency to grow and to break along flat planes (cleave) at characteristic fixed
relative angles (Fig. 4.10) is macroscopic evidence of crystalline order in many
solids. Even more compelling evidence is provided by the diffraction of X-rays,
[Figure 4.11: panel (a) shows a monochromatic "radiation" source, the sample, and a "radiation" detector; panels (b) and (c) show diffraction patterns.]
Figure 4.11. (a) The scheme of a powder or polycrystalline sample
diffraction experiment. The “radiation” beam may be constituted by
any wave field interacting with matter, typically X-rays, neutrons,
or electrons. The sample must be thin with respect to the typical
attenuation length of that radiation in that material. The diffraction
patterns made by a beam of (b) X-rays and of (c) electrons passing
through the same thin Al foil. The characteristic angles of diffraction
are connected to the lattice periodicities and the wavelength of the
incident radiation.
neutrons and electrons of wavelengths in the a0 region. As Fig. 4.11 illustrates,
all sorts of wave probes of wavelength in the correct range interact with crystals
and produce diffracted beams, characteristic of a regular array of scatterers, i.e. of a
spatially periodic density. We proceed to introduce the basic mathematics describing
a periodic system, with the central concepts of direct and reciprocal lattice, and
employ the related Fourier analysis to understand diffraction experiments.
4.1.1. Lattices and crystal structures. The basic property of a crystalline
solid is the essential equivalence of different “sites”. Any Ne atom in its fcc lattice
“sees” an essentially equivalent environment, unless it lies close to the crystal surface
or to some defect. In a sufficiently “clean” crystal, most atoms are far enough, say
at least 5 neighbors away, from the nearest imperfection. The forces experienced by
one such “bulk” atom, and by its electrons, equal those the same atom would feel
if it were part of a perfect lattice extending through all space. It thus makes sense
to understand many properties of crystalline solids by modeling them as “perfect”,
ideal crystals.
Figure 4.12. A finite portion of a generic 2D lattice generated by
the primitive vectors $\vec a_1$ and $\vec a_2$. Other equally good primitive vectors
$\vec a'_1$ and $\vec a'_2$ are indicated, based on a different origin. Any lattice point
$\vec R$ can be expressed as $n_1 \vec a_1 + n_2 \vec a_2$, for example $P = -1\,\vec a_1 + 4\,\vec a_2 = -4\,\vec a'_1 + 8\,\vec a'_2$.
The main symmetry of a crystal is a discrete translational symmetry.3 Given a
point $\vec r$ in the crystal, perfectly equal physical properties (including electric potential,
electric field, mass and charge density, current, etc.) are observed at all other points
$$ \vec r\,' = \vec r + \vec R, \quad \text{with } \vec R = n_1 \vec a_1 + n_2 \vec a_2 + n_3 \vec a_3 , \qquad (269) $$
with $n_j$ arbitrary integers, and $\vec a_j$ three linearly independent vectors. All points
of the type $\vec R$ in Eq. (269) form an infinite array extending through space: it is
named a Bravais lattice. The vectors $\vec a_j$ are said to generate the lattice. They are
called primitive if, for any $\vec r$, the points $\vec r\,'$ defined by Eq. (269) are all the points
which have equal physical properties as $\vec r$ (that means that $\vec a_j$ are taken as short as
possible, to make the lattice of $\vec R$ points as dense as possible).
As a first example, Fig. 4.12 illustrates these ideas for a 2D lattice. Figure 4.13
shows a portion of a simple-cubic lattice, seen from different angles. Primitive
vectors can be chosen as a x̂, a ŷ, a ẑ, i.e. orthogonal and all of the same length
a, the side of the smallest cubes in the lattice. The drawn portion of the lattice
includes 125 points generated by 5 × 5 × 5 consecutive values of the nj indices.
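The construction of Eq. (269) translates directly into code. The following minimal sketch (the helper name `bravais_points` and the unit cube side are assumptions made here for illustration) generates the lattice vectors for a finite set of integers nj and reproduces the 125 points of the simple-cubic portion described above.

```python
import itertools

def bravais_points(a1, a2, a3, n_values):
    """Lattice points R = n1*a1 + n2*a2 + n3*a3 for n1, n2, n3 in n_values."""
    points = []
    for n1, n2, n3 in itertools.product(n_values, repeat=3):
        # zip(a1, a2, a3) yields the x, y, z components of the three vectors in turn
        points.append(tuple(n1 * u + n2 * v + n3 * w for u, v, w in zip(a1, a2, a3)))
    return points

a = 1.0  # cube side, arbitrary units
sc = bravais_points((a, 0, 0), (0, a, 0), (0, 0, a), range(5))
print(len(sc))          # number of generated lattice points
print(sc[0], sc[-1])    # first and last corner of the generated portion
```

Replacing the three vectors with any other linearly independent set generates the corresponding Bravais lattice in the same way.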
3 In an isolated atom, the rotational symmetry commuting with the effective single-electron
Hamiltonian makes the angular momenta Li of individual electrons good quantum numbers, used
to label atomic states such as, e.g., 1s^2 2s^2 2p^4. Sections 4.1.1 and 4.1.2 address the similar problem
of determining and diagonalizing the symmetry operators of electrons in a crystal, in order to provide
electrons with appropriate quantum numbers.
Figure 4.13. A finite portion (5 × 5 × 5 lattice spacings) of a simple
cubic lattice seen from (a) the ẑ direction (001), (b) the ŷ + ẑ direction
(011), and (c) the x̂ + ŷ + ẑ direction (111).
Figure 4.14. A finite portion (2 × 2 × 2 lattice spacings) of a face
centered cubic (fcc) lattice seen from (a) the ẑ direction (001), (b) the
ŷ + ẑ direction (011), and (c) the x̂ + ŷ + ẑ direction (111).
Other important examples of 3D Bravais lattices – already encountered above –
are the fcc and bcc lattices sketched in Figs. 4.14 and 4.15. These lattices are built
by adding sites to the simple cubic lattice (in the fcc, at the center of each face of
the cubes, in the bcc at the body center of the cubes). The added sites are perfectly
equivalent to the original sites. The lattice-point density of fcc is four times and
that of bcc is twice that of simple cubic of the same cube side a. For fcc and bcc,
the same cubic-lattice vectors a x̂, a ŷ, a ẑ as for the simple cubic lattice are often
conveniently used. However, these are not primitive: Fig. 4.16 indicates a standard
choice of primitive lattice vectors.
Figure 4.15. A finite portion (3 × 3 × 3 cubic lattice spacings) of a
body centered cubic (bcc) lattice seen from (a) the ẑ direction (001),
(b) the ŷ + ẑ direction (011), and (c) the x̂ + ŷ + ẑ direction (111).
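Figure 4.16. (a) Primitive lattice vectors for the fcc Bravais lattice:
$\vec a_1 = \frac{a}{2}(\hat y + \hat z)$, $\vec a_2 = \frac{a}{2}(\hat z + \hat x)$, $\vec a_3 = \frac{a}{2}(\hat x + \hat y)$. The point P, for example,
may be expressed as $P = \vec a_1 + \vec a_2 + \vec a_3$. (b) Primitive lattice vectors
for the bcc lattice: $\vec a_1 = \frac{a}{2}(\hat y + \hat z - \hat x)$, $\vec a_2 = \frac{a}{2}(\hat z + \hat x - \hat y)$, $\vec a_3 = \frac{a}{2}(\hat x + \hat y - \hat z)$.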
Taking the primitive vectors as three converging edges defines a parallelepiped of
volume $V_c = \vec a_1 \times \vec a_2 \cdot \vec a_3$. This parallelepiped contains all “different”, inequivalent
points $\vec r$ of space: any other $\vec r\,'$ lying outside this parallelepiped is equivalent to
some $\vec r$ inside, to which it can be reduced by using Eq. (269) with suitable $n_j$. This
elementary parallelepiped contains therefore all the relevant information about the
Figure 4.17. Several possible choices of primitive cells for the same
2D lattice.
Figure 4.18. Primitive cells of the fcc (a) and bcc (b) Bravais lattices. The volume of the fcc primitive cell is 1/4 of the volume a^3 of
the conventional cell. The volume of the bcc primitive cell is 1/2 of the
volume a^3 of the conventional cell.
whole of this periodic world: what happens outside it amounts to essentially boring
repetitions of what goes on inside. For this reason, this minimal volume which, by
applying lattice translations $\vec R$, fills up the whole space without overlapping takes
Figure 4.19. Wigner-Seitz primitive cell of a generic 2D lattice.
The Wigner-Seitz cell of a 2D lattice is always a hexagon, unless the
lattice is rectangular.
the special name of primitive cell, or unit cell. In fact, according to this more
general definition the primitive cell need not be a parallelepiped at all (in 2D, a
parallelogram – see Fig. 4.17). Each primitive cell contains, in particular, one and
only one Bravais lattice point.
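The primitive-cell volume is a one-line triple product. As a quick check, the sketch below evaluates it for the standard fcc and bcc primitive vectors of Fig. 4.16, in units where the cube side is a = 1; the helper name `cell_volume` is hypothetical.

```python
def cell_volume(a1, a2, a3):
    """Volume |a1 x a2 . a3| of the parallelepiped spanned by three vectors."""
    cross = (
        a1[1] * a2[2] - a1[2] * a2[1],
        a1[2] * a2[0] - a1[0] * a2[2],
        a1[0] * a2[1] - a1[1] * a2[0],
    )
    return abs(sum(c * x for c, x in zip(cross, a3)))

a = 1.0  # side of the conventional cube
h = a / 2
fcc = ((0, h, h), (h, 0, h), (h, h, 0))
bcc = ((-h, h, h), (h, -h, h), (h, h, -h))

v_fcc = cell_volume(*fcc)
v_bcc = cell_volume(*bcc)
print(f"fcc: Vc / a^3 = {v_fcc / a**3}")
print(f"bcc: Vc / a^3 = {v_bcc / a**3}")
```

Since each primitive cell holds exactly one lattice point, the ratios a^3/Vc give the number of lattice points per conventional cube: four for fcc and two for bcc.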
A special choice of primitive cell retains the full symmetry of the lattice, and
is free of the arbitrariness (illustrated in Fig. 4.12) in the choice of the primitive
vectors. Given a lattice point $\vec R$, the Wigner-Seitz cell is the set of all positions $\vec r$
closer to $\vec R$ than to any other lattice point $\vec R'$ (see for example Fig. 4.19). One can
show that the Wigner-Seitz cell is indeed a primitive cell. Figure 4.20 shows the
Wigner-Seitz cells for two common lattices.
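The definition of the Wigner-Seitz cell lends itself to a simple numerical test. The sketch below is only an illustration (function name, sample size and random seed are arbitrary choices made here): it checks, for random points in the conventional cube centered on an fcc lattice site, whether each point is closer to that site than to any other nearby lattice point. If the Wigner-Seitz cell is a primitive cell, the fraction of points inside it should approach Vc / a^3 = 1/4.

```python
import itertools
import numpy as np

def in_wigner_seitz(points, primitive_vectors, n_range=2):
    """Boolean array: True where a point is at least as close to the origin
    as to every other lattice point with |n_j| <= n_range."""
    A = np.asarray(primitive_vectors, dtype=float)   # rows are a1, a2, a3
    inside = np.ones(len(points), dtype=bool)
    for n in itertools.product(range(-n_range, n_range + 1), repeat=3):
        if n == (0, 0, 0):
            continue
        R = np.asarray(n, dtype=float) @ A
        # |r|^2 <= |r - R|^2 is equivalent to 2 r.R <= R.R
        inside &= 2.0 * (points @ R) <= R @ R
    return inside

a = 1.0
h = a / 2
fcc = [(0, h, h), (h, 0, h), (h, h, 0)]

rng = np.random.default_rng(0)
pts = rng.uniform(-a / 2, a / 2, size=(200_000, 3))  # cube centered on a lattice site
fraction = in_wigner_seitz(pts, fcc).mean()
print(f"fraction of the cube inside the Wigner-Seitz cell: {fraction:.3f}")
```

This works for the fcc case because its Wigner-Seitz cell fits entirely inside the centered conventional cube; for other lattices the sampling box would have to be chosen accordingly.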
It often happens that exactly one atom occupies each lattice site of a Bravais
lattice as, e.g., in fcc solid Ne and Al. However, even more often several atoms
belong to the same primitive cell, which is then repeated in space. For example, the
nuclei of solid NaCl (and scores of similar compounds such as LiCl, NaBr, KI, AgF,
CaO, BaSe, ...) sit around simple cubic lattice points, alternately so that each Na
has 6 nearest-neighbor Cl, and vice versa. Similarly, the nuclei of CsCl (and similar
compounds CsBr, TlCl, ...) sit at the lattice sites of a bcc lattice, alternately so
that each Cs has 8 nearest-neighbor Cl, and vice versa (see Fig. 4.21). To describe
such crystals, it suffices to recognize the true periodicity of the lattice. For example
the NaCl structure can be described as an fcc Bravais lattice containing two atoms
per cell: a Na atom at $\vec 0$ and a Cl atom at the center of the fcc primitive cell,
$\frac{1}{2}(\vec a_1 + \vec a_2 + \vec a_3) = \frac{a}{2}(\hat x + \hat y + \hat z)$. This brings us to the general necessity to introduce
a basis, i.e. a list of atoms with their positions within a primitive cell. A crystal
structure is thus defined by a Bravais lattice plus a basis. A basis is sometimes
Figure 4.20. Wigner-Seitz primitive cells. (a) Of the fcc Bravais
lattice (a rhombic dodecahedron). The surrounding cube is the conventional cubic cell drawn in Fig. 4.18 translated by, e.g., $\frac{a}{2}\hat x$, in order
to put a lattice site at its center. (b) Of the bcc lattice (a truncated octahedron). Each regular hexagon bisects a segment joining the central
point to a vertex of the cube.
needed even when all atoms in the crystal are chemically equal, because they can
be geometrically different. For example, Fig. 4.22 illustrates the fact that the 2D
honeycomb lattice (all sites hosting “equal” atoms) is not a simple Bravais lattice,
as no two neighboring atoms are geometrically equivalent. The honeycomb lattice
is the building block of the graphite form of carbon, which, as shown in Fig. 4.23,
is composed of a “vertical” stack of parallel planes of honeycomb-bound carbon
atoms. The graphite structure is a simple hexagonal lattice with a basis of 4 atoms
per primitive cell. The hcp structure (Fig. 4.24) is also a hexagonal Bravais lattice
with a 2-atom basis. Similarly, the diamond structure (of C-diamond, Si, Ge and
α-Sn, drawn in Fig. 4.25) is described by an fcc lattice with a two-point basis. Each
site has four nearest neighbors (see also Fig. 4.4).
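The "Bravais lattice plus basis" recipe can be tried out directly. The sketch below is a hedged illustration (helper names are hypothetical, and the cube side is set to a = 1): it builds the NaCl and diamond structures from the fcc primitive vectors with the two-point bases given in the text, then counts the nearest neighbors of the atom at the origin.

```python
import itertools
import math

def crystal_sites(primitive_vectors, basis, n_range=2):
    """Atomic positions R + b for lattice vectors R (|n_j| <= n_range) and basis points b."""
    sites = []
    for n in itertools.product(range(-n_range, n_range + 1), repeat=3):
        R = [sum(n[i] * primitive_vectors[i][c] for i in range(3)) for c in range(3)]
        for b in basis:
            sites.append(tuple(R[c] + b[c] for c in range(3)))
    return sites

def nearest_neighbors(sites, origin=(0.0, 0.0, 0.0), tol=1e-9):
    """Return (distance, count) of the nearest neighbors of the atom at origin."""
    dists = [math.dist(s, origin) for s in sites]
    dists = [d for d in dists if d > tol]          # drop the atom itself
    d_min = min(dists)
    return d_min, sum(1 for d in dists if d - d_min < tol)

a = 1.0
h = a / 2
fcc = ((0, h, h), (h, 0, h), (h, h, 0))

nacl = crystal_sites(fcc, basis=[(0, 0, 0), (h, h, h)])
diamond = crystal_sites(fcc, basis=[(0, 0, 0), (a / 4, a / 4, a / 4)])

d_nacl, z_nacl = nearest_neighbors(nacl)
d_dia, z_dia = nearest_neighbors(diamond)
print(f"NaCl:    z = {z_nacl}, d = {d_nacl:.4f} a")
print(f"diamond: z = {z_dia}, d = {d_dia:.4f} a")
```

The counts should match the coordinations stated in the text: 6 for the NaCl structure and 4 for diamond, the latter at a distance of one quarter of the cube body diagonal.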
Once the possibility that several atoms belong to each cell is admitted, one often finds it more
convenient to use a conventional cell containing several equal atoms, rather than
the primitive unit cell of the Bravais lattice. For example, fcc and bcc are often
conveniently described in terms of the underlying nonprimitive simple cubic cell, of
larger volume than the primitive unit cells of Fig. 4.18. The fcc lattice is then seen
as a simple cubic Bravais lattice with a 4-point basis ($\vec 0$, $\frac{a}{2}(\hat y + \hat z)$, $\frac{a}{2}(\hat z + \hat x)$, and
$\frac{a}{2}(\hat x + \hat y)$), while the bcc is seen as a simple cubic lattice with a 2-point basis ($\vec 0$,
Figure 4.21. (a) The sodium chloride structure. (b) The cesium
chloride structure. Black and white balls represent ions of two different
types. In the NaCl structure, the ions of each kind form interpenetrating fcc lattices. In the CsCl structure, the ions of each kind form
interpenetrating simple cubic lattices.
Figure 4.22. The honeycomb lattice is a 2D triangular net from which
one third of the points have been removed. This is not a simple Bravais lattice, since the “environment” of two neighboring atoms is not
equivalent. In particular, vector quantities are in general different
at those two lattice points, even though all scalar quantities are the
same. The honeycomb net can be described as a Bravais lattice with
two primitive vectors ($\vec a_1$ and $\vec a_2$ of equal length, separated by a 60°
angle), and a basis composed of, e.g., $\vec 0$ and $\frac{1}{3}(\vec a_1 + \vec a_2)$.
Figure 4.23. The structure of C-graphite: an alternating stack of 2D
honeycomb lattices. It consists of a hexagonal lattice with a basis of
4 atoms per primitive unit cell, two in plane “a” plus two in plane “b”.
See also Fig. 0.2.
Figure 4.24. The hcp structure is an alternate stacking of triangular
lattices, as illustrated in Fig. 4.3. This crystal structure is defined
by $\vec a_1$ and $\vec a_2$ of equal length a, and $\vec a_3$ of different length c (ideally
$c = \sqrt{8/3}\,a$); the basis contains 2 atoms, at $\vec 0$ and at $\frac{1}{3}\vec a_1 + \frac{1}{3}\vec a_2 + \frac{1}{2}\vec a_3$.
Figure 4.25. The diamond lattice consists of two interpenetrating
fcc Bravais lattices, displaced along the body diagonal of the cubic
cell by one quarter of the length of this diagonal. It can be regarded
as an fcc lattice with the two-point basis $\vec 0$ and $\frac{1}{4}(\vec a_1 + \vec a_2 + \vec a_3) = \frac{a}{4}(\hat x + \hat y + \hat z)$.
$\frac{a}{2}(\hat x + \hat y + \hat z)$). All results must coincide when this conventional “lattice with basis”
formalism is used in place of the genuine description in terms of a pure Bravais lattice.
We have introduced here the Bravais lattices as infinite arrays of discrete geometric points. This is a convenient means to visualize them. However, the precise
mathematical meaning of a Bravais lattice is a group of translations, precisely those
that transform the corresponding array of discrete points back into itself. Any vector $\vec R$ can be seen also as a translation operator $T_{\vec R}$ such that $T_{\vec R}\,\vec r = \vec R + \vec r$, for any
point $\vec r$. It is easily verified that the set of all lattice translations $\{T_{\vec R}\}$ of a Bravais
lattice is closed under composition ($T_{\vec R}\, T_{\vec R'} = T_{\vec R + \vec R'}$), it contains a neutral element ($T_{\vec 0}$),
and for each $T_{\vec R}$ there exists an inverse element ($T_{-\vec R}$) such that the composition of
$T_{\vec R}$ and its inverse yields the neutral element. This means that all lattice translations $\{T_{\vec R}\}$ form a group of geometric transformations. As $\vec R + \vec R' = \vec R' + \vec R$, this
group is Abelian, i.e. commutative. When the Hamiltonian has the full symmetry of
this group of discrete translations, its eigenstates are simultaneous eigenstates of all
group operations, i.e. they are labeled by the group irreducible representations. The
purpose of the following Section is precisely to find these irreducible representations:
we shall see that their structure is very general and not especially complicated.
The extra symmetry of many lattices and crystal structures leads to some additional intricacy. For example, a simple cubic lattice transforms back into itself if
rotated around any of its points by 90° around any of x̂ or ŷ or ẑ, or by 120° around
any body diagonal direction such as x̂ + ŷ + ẑ. These extra transformations extend
the group of discrete translations $\{T_{\vec R}\}$ outlined above. The full symmetry group of
the lattice (space group) is a proper combination of the point group (a finite group of
rigid rotations and reflections about one point) and the group of discrete translations
$\{T_{\vec R}\}$. Back in the first half of the 19th century it was recognized that only 7 different point groups could occur for 3D lattices. In particular, neither groups including
fivefold axes (rotations by $2\pi/5$ = 72°), nor groups including sevenfold or higher-order
axes exist, as they could not replicate infinitely in space. These 7 point groups combine differently with the translations to form 14 inequivalent Bravais lattices.4 The
introduction of a basis into the Bravais lattices may reduce the global symmetry
of the objects replicated in the primitive cell, and thus the space group. Including
all possible types of basis, the different point groups become 32, rather than 7, and
the space groups 230, rather than 14. We leave the details of the classification of
space and point groups of 3D lattices to specific solid-state courses, and only note
that these extra symmetries are important to recognize the extra degeneracies of
the electronic or vibrational states of the crystal.
4.1.2. The reciprocal lattice. The Fourier transform of a periodic function
includes only discrete “frequencies”: this is the key to understand the use of a
reciprocal lattice. This is illustrated simply in 1D. By definition, a periodic function
f (x) of period a (the lattice spacing) satisfies f (x) = f (x − na) = f (x − R) (R = na
is a lattice vector, as in Eq. (269)). Then, in its Fourier expansion
(270) $f(x) = \mathcal{F}^{-1}[\tilde f](x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{ikx}\, \tilde f(k)\, dk ,$
all Fourier components $\tilde f(k)$ vanish except those whose $\exp(ikx)$ has the same periodicity as $f(x)$. The corresponding $k$ must satisfy $\exp(ikx) = \exp[ik(x - a)]$, i.e.
$\exp(-ika) = 1$, i.e. $ka = 2\pi l$, i.e. $k = l \cdot \frac{2\pi}{a}$, for any integer $l = 0, \pm 1, \pm 2, \dots$
We indicate those special k-values compatible with the lattice periodicity with the
notation
(271) $G = G(l) = l \cdot \frac{2\pi}{a} .$
As the Fourier component $\tilde f(k)$ vanishes at any value of $k \neq G$, which does not respect the lattice periodicity, the Fourier expansion (270) can be written as a discrete
Fourier series
(272) $f(x) = \sum_{G} e^{iGx}\, \tilde f(G) ,$
with coefficients
(273) $\tilde f(G) = \frac{1}{a} \int_{0}^{a} \exp(-iGx)\, f(x)\, dx .$
Footnote 4: Two groups are equivalent if they contain the same symmetry operations (e.g., rotations,
discrete translations, etc., the latter up to suitable scaling factors).
The G points, of all k’s, acquire therefore a special role when the a-periodicity is
involved, and periodic functions are to be represented in Fourier space. According
to (271), the $G$ points build a regular lattice, of unit vector $\frac{2\pi}{a}$, in $k$ space. Apart
from the physical dimensions (inverse length rather than length), the $k$ space is not
different from the $x$ space, thus the $k$-space lattice of $G$ points holds all the properties
of a Bravais lattice on its own: the lattice of $G$ points of Eq. (271) is called reciprocal
lattice. By definition, direct-lattice points $R$ and reciprocal-lattice points $G$ satisfy
(274) $e^{iRG} = e^{i\, na\, l\, 2\pi/a} = e^{i\, 2\pi\, nl} = 1 .$
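The selection of the special wave numbers $G = l \cdot 2\pi/a$ can be checked numerically. The following is a minimal sketch, assuming an illustrative test function `f` of period `a = 2.0` (both the function and the helper name `fourier_coefficient` are hypothetical choices, not taken from these notes): it evaluates the coefficient integral of Eq. (273) by a plain Riemann sum and shows that only the harmonics actually present in `f` survive.

```python
import cmath
import math

a = 2.0  # lattice spacing (illustrative value)


def f(x):
    # Illustrative a-periodic function: a constant plus the l = 1 and l = 3 harmonics.
    return (0.5
            + math.cos(2 * math.pi * x / a)
            + 0.25 * math.sin(3 * 2 * math.pi * x / a))


def fourier_coefficient(l, n_steps=2000):
    # Eq. (273): (1/a) * integral over one period of exp(-i G x) f(x) dx,
    # with G = l * 2*pi/a as in Eq. (271), approximated by a Riemann sum.
    G = l * 2 * math.pi / a
    dx = a / n_steps
    total = sum(cmath.exp(-1j * G * (j * dx)) * f(j * dx) for j in range(n_steps))
    return total * dx / a


coefficients = {l: fourier_coefficient(l) for l in range(-4, 5)}
for l, c in coefficients.items():
    print(f"l = {l:+d}   coefficient = {c.real:+.4f} {c.imag:+.4f}i")
```

With this choice of `f`, the coefficients for $l = 0, \pm 1, \pm 3$ are nonzero, while those for $l = \pm 2, \pm 4$ vanish to within rounding errors.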
The simple 1D example introduced here can be generalized to the 3D case relevant
for actual crystals. A function $f(\vec r)$ has the periodicity of a Bravais lattice if $f(\vec r) = f(\vec r - \vec R)$ for any lattice vector $\vec R = n_1 \vec a_1 + n_2 \vec a_2 + n_3 \vec a_3$, with integer $n_j$ as in
Eq. (269). Then, the only nonzero components in its Fourier expansion satisfy
$\exp(i \vec G \cdot \vec r) = \exp[i \vec G \cdot (\vec r - \vec R)]$ for all $\vec R$ in the direct lattice. This relation is satisfied
for all $\vec G$ such that
(275) $e^{i \vec R \cdot \vec G} = 1 .$
In words, the vectors $\vec G$ of the reciprocal lattice are all the $\vec k$ vectors whose associated
plane wave has the periodicity of the direct Bravais lattice. In particular, taking the
primitive vectors $\vec a_j$ for $\vec R$, one finds that the $\vec G$ vectors are all the vectors of the
type
(276) $\vec G = \vec G(l_1, l_2, l_3) = l_1 \vec b_1 + l_2 \vec b_2 + l_3 \vec b_3$
($l_j$ are integers), with
(277) $\vec b_1 = \frac{2\pi}{V_c}\, \vec a_2 \times \vec a_3 , \quad \vec b_2 = \frac{2\pi}{V_c}\, \vec a_3 \times \vec a_1 , \quad \vec b_3 = \frac{2\pi}{V_c}\, \vec a_1 \times \vec a_2 ,$
where $V_c = \vec a_1 \times \vec a_2 \cdot \vec a_3$ is the volume of the primitive unit cell. As above, the $\vec G$ vectors
form a Bravais lattice, of which Eq. (277) yields a set of primitive unit vectors. Note
that as $\vec b_i \cdot \vec a_j = 2\pi \delta_{ij}$, the product of Eq. (275) is $\vec R \cdot \vec G = 2\pi (n_1 l_1 + n_2 l_2 + n_3 l_3)$. Any
other vector $\vec k \neq \vec G$ is associated to a plane wave which does not respect the lattice
periodicity, thus the corresponding Fourier component $\tilde f(\vec k)$ vanishes. Like in 1D,
the Fourier expansion of the periodic function is a discrete Fourier summation over
the reciprocal lattice
(278) $f(\vec r) = \sum_{\vec G} e^{i \vec G \cdot \vec r}\, \tilde f(\vec G) ,$
with coefficients
(279) $\tilde f(\vec G) = \frac{1}{V_c} \int_{V_c} \exp(-i \vec G \cdot \vec r)\, f(\vec r)\, d^3 r ,$
where the integration is carried out over a single primitive unit cell volume. The
volume $\vec b_1 \times \vec b_2 \cdot \vec b_3$ of the reciprocal-lattice primitive cell is $\frac{(2\pi)^3}{V_c}$. The Wigner-Seitz
cell of the reciprocal lattice is called first Brillouin zone.
By applying the transformations (277), it is easy to verify that the reciprocal
lattice of a simple cubic lattice of side $a$ is another simple cubic lattice of side $\frac{2\pi}{a}$. The
fcc lattice of conventional cell of side $a$ has as reciprocal lattice a bcc of conventional
side $\frac{4\pi}{a}$. Conversely, the bcc lattice of conventional cell of side $a$ has a fcc reciprocal
lattice (see footnote 5) of conventional cell side $\frac{4\pi}{a}$. As a consequence, the first Brillouin zone of
the fcc lattice has the shape of the bcc Wigner-Seitz cell (Fig. 4.20b), and that of
the bcc lattice has the shape of the fcc Wigner-Seitz cell (Fig. 4.20a). It is a useful
exercise to determine the reciprocal lattice of the hexagonal lattice starting from the
conventional primitive vectors of Fig. 4.24.
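The statements about the fcc lattice can be verified directly from Eq. (277). The sketch below is only illustrative: the helper names (`cross`, `dot`, `reciprocal_vectors`) and the specific choice of fcc primitive vectors are assumptions made here, not notation of these notes. It builds the $\vec b_i$, checks $\vec b_i \cdot \vec a_j = 2\pi \delta_{ij}$, and applies the transformation twice to recover the original lattice.

```python
import math


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def reciprocal_vectors(a1, a2, a3):
    # Eq. (277): b1 = (2*pi/Vc) a2 x a3, and cyclic permutations,
    # with Vc = a1 x a2 . a3 the primitive-cell volume.
    vc = dot(cross(a1, a2), a3)
    scale = 2 * math.pi / vc
    b1 = tuple(scale * c for c in cross(a2, a3))
    b2 = tuple(scale * c for c in cross(a3, a1))
    b3 = tuple(scale * c for c in cross(a1, a2))
    return b1, b2, b3


a = 1.0  # conventional cubic cell side (illustrative units)
# One common choice of primitive vectors for the fcc lattice (an assumption of this sketch).
fcc = ((0.0, a / 2, a / 2), (a / 2, 0.0, a / 2), (a / 2, a / 2, 0.0))

b = reciprocal_vectors(*fcc)
for i, bi in enumerate(b, start=1):
    # Print each b_i in units of 2*pi/a.
    print(f"b{i} = (2 pi / a) *", tuple(round(c * a / (2 * math.pi), 6) for c in bi))

# Applying Eq. (277) twice should give back the direct lattice.
back = reciprocal_vectors(*b)
```

The resulting $\vec b_i$ are of the type $\frac{2\pi}{a}(\pm 1, \pm 1, \pm 1)$ with one minus sign each, i.e. half the body diagonals of a cube of side $\frac{4\pi}{a}$, as expected for a bcc lattice.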
$\vec G$ vectors in the reciprocal lattice determine plane waves $e^{i \vec G \cdot \vec r}$ of the direct-lattice
periodicity. For a given plane wave, consider the constant plane-wave surfaces, the
“wave fronts”, for example those corresponding to maximum real part of the wave
function $e^{i \vec G \cdot \vec r} = 1$: these constitute a family of parallel planes, perpendicular to $\vec G$
and separated by a wavelength $\lambda = \frac{2\pi}{|\vec G|}$. Some of these planes pass through the
lattice points. In particular they all pass through the lattice points if the integer
indexes $l_1$, $l_2$, and $l_3$ have no common multiplicative factor other than 1. Otherwise,
for $\vec G$ given by $n l_1$, $n l_2$, $n l_3$, one in $n$ of these planes passes through lattice points. The
integer indexes $l_1$, $l_2$, and $l_3$ are called Miller indexes of the family of lattice planes.
These indexes are inversely proportional to the intercepts of these planes with the
crystal primitive directions. The standard notation to indicate planes and directions
in $\vec k$ space is $(l_1 l_2 l_3)$, with the convention that if any of these integers is negative,
then it carries an overbar, and commas are eliminated, as in $(2, -1, 0) = (2\, \bar 1\, 0)$.
The relation of a few commonly encountered lattice planes and the underlying cubic
cell is illustrated in Fig. 4.27. Traditionally, for all cubic lattices, including fcc and
bcc, lattice planes are labeled with respect to the conventional cubic direct- and
reciprocal-lattice directions x̂, ŷ, ẑ, not with respect to the primitive unit vectors of
Fig. 4.16.
Footnote 5: This is a consequence of the general fact that the reciprocal of the reciprocal lattice is the
original lattice. This can be verified by applying twice the transformations (277), or even more
simply by observing that the roles of $\vec R$ and $\vec G$ in Eq. (275) can be exchanged.
Figure 4.26. A $\vec G$ vector identifies a well defined family of parallel
lattice planes, of constant plane wave $e^{i \vec G \cdot \vec r}$. These are perpendicular
to $\vec G$ and separated by a distance $\frac{2\pi}{|\vec G|}$. Left: $(l_1 l_2 l_3) = (1\, 0\, 0)$ planes;
right: $(l_1 l_2 l_3) = (1\, 1\, 1)$ planes.
Figure 4.27. Standard notation to indicate lattice planes in cubic symmetry.
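As a small worked example of the relation between Miller indexes and plane spacing, the sketch below assumes a simple cubic lattice of side `a`, for which $\vec G = \frac{2\pi}{a}(l_1, l_2, l_3)$; the function name `lattice_plane_spacing` is a hypothetical helper introduced here. It returns the wave-front spacing $2\pi/|\vec G|$ multiplied by the common factor $n$ of the indexes, i.e. the distance between those planes that actually contain lattice points.

```python
import math
from functools import reduce


def lattice_plane_spacing(l1, l2, l3, a=1.0):
    # Simple cubic lattice only: G = (2*pi/a) * (l1, l2, l3).
    # The indexes must not be all zero.
    g_norm = (2 * math.pi / a) * math.sqrt(l1 ** 2 + l2 ** 2 + l3 ** 2)
    wavefront_spacing = 2 * math.pi / g_norm
    # Common factor n of the indexes: only one wave front in n holds lattice points.
    n = reduce(math.gcd, (abs(l1), abs(l2), abs(l3)))
    return n * wavefront_spacing


for indexes in [(1, 0, 0), (1, 1, 0), (1, 1, 1), (2, 0, 0), (2, -1, 0)]:
    print(indexes, "spacing / a =", round(lattice_plane_spacing(*indexes), 4))
```

Note that $(2\,0\,0)$ yields the same lattice-plane spacing as $(1\,0\,0)$: the wave fronts are twice as dense, but only every second one passes through lattice points.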
4.1.2.1. An algebraic note. The algebraic interpretation of the reciprocal lattice
is connected to group theory: any $\vec k$-vector labels an irreducible representation of
the group of the discrete direct-lattice translations. These representations are 1-dimensional and the corresponding character of a group operation $T_{\vec R}$ is simply
$\exp(-i \vec k \cdot \vec R)$.
Two irreducible representations labeled by $\vec k$ and $\vec k'$ have all equal
characters (thus are in fact the same representation) whenever $\vec k - \vec k' = \vec G$ for
some $\vec G$ in the reciprocal lattice. Indeed, $\exp(-i \vec k \cdot \vec R) = \exp[-i (\vec k' + \vec G) \cdot \vec R] = \exp(-i \vec k' \cdot \vec R)\, \exp(-i \vec G \cdot \vec R) = \exp(-i \vec k' \cdot \vec R)$, thanks to Eq. (275). Accordingly, all
possible irreducible representations of the discrete translational group are labeled by
all $\vec k$ points within one primitive zone of the reciprocal lattice, e.g. the first Brillouin
zone.
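The equivalence of $\vec k$ and $\vec k + \vec G$ can be illustrated in 1D. In the sketch below the lattice spacing `a = 1.5`, the wave number `k = 7.3` and the helper names `character` and `fold_to_first_zone` are all illustrative assumptions: an arbitrary $k$ is folded back into the first Brillouin zone $[-\pi/a, \pi/a)$ by subtracting a suitable $G$, and the characters $\exp(-ikR)$ of both labels are compared for several translations $R = na$.

```python
import cmath
import math

a = 1.5  # 1D lattice spacing (illustrative value)


def character(k, n):
    # Character of the translation by R = n*a in the representation labeled by k.
    return cmath.exp(-1j * k * n * a)


def fold_to_first_zone(k):
    # Subtract the reciprocal-lattice point G = l*2*pi/a closest to k,
    # so that the result lies in [-pi/a, pi/a).
    g = 2 * math.pi / a
    return k - g * math.floor(k / g + 0.5)


k = 7.3
k_folded = fold_to_first_zone(k)
max_difference = max(abs(character(k, n) - character(k_folded, n)) for n in range(-5, 6))
print("folded k =", round(k_folded, 5), "  largest character difference =", max_difference)
```

The two labels differ by twice the primitive reciprocal-lattice spacing $2\pi/a$, and their characters coincide to within rounding errors for every translation tested.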
4.1.3. Diffraction experiments. Diffraction of radiation is the main quantitative source of structural data about solids. Electrons of energy 1.5 to 150 eV,
electromagnetic radiation of photon energy 1 to 10 keV (Fig. 0.4), and neutrons of 1 to 100 meV: the
wavelength of all these “radiations” fits in the range 0.1 to 1 nm of the unit-cell size
of not too complicated crystals (of the order of a few typical interatomic distances
$\sim 10^{-10}$ m). Solids can and do diffract waves of the three listed kinds. However,
electrons interact very strongly with matter, and are thus sensitive to the few topmost
surface layers only. If the energy of X rays is chosen off-resonance from all the core
transitions of the atoms in the material (see Fig. 1.24), then their penetration in the
solids is generally sufficient (i.e. many unit-cell lengths) to produce clear bulk diffraction patterns. Neutrons are even more fit for structural diffraction studies, as they
only interact (weakly) with the nuclei, and are almost insensitive to the electrons.
A sufficiently small sample guarantees that the total probability of probe-sample
interaction is weak, so that most probing radiation goes unscattered through the
sample, a small fraction scatters once, and essentially none scatters twice or more.
Under these conditions scattering is easily understood.
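The correspondence between the quoted energy windows and the 0.1 to 1 nm wavelength range can be checked with a few lines of arithmetic. The sketch below assumes the non-relativistic de Broglie relation $\lambda = h/\sqrt{2mE}$ for electrons and neutrons and $\lambda = hc/E$ for photons; the function names are hypothetical helpers, and the constants are the standard SI values.

```python
import math

H = 6.62607015e-34        # Planck constant [J s]
C = 2.99792458e8          # speed of light [m/s]
EV = 1.602176634e-19      # 1 eV in joules
M_E = 9.1093837015e-31    # electron mass [kg]
M_N = 1.67492749804e-27   # neutron mass [kg]


def matter_wavelength_nm(energy_ev, mass_kg):
    # Non-relativistic de Broglie wavelength: lambda = h / sqrt(2 m E).
    return H / math.sqrt(2 * mass_kg * energy_ev * EV) * 1e9


def photon_wavelength_nm(energy_ev):
    # Photon wavelength: lambda = h c / E.
    return H * C / (energy_ev * EV) * 1e9


print("electrons, 1.5 and 150 eV :", matter_wavelength_nm(1.5, M_E), matter_wavelength_nm(150, M_E))
print("photons,   1 and 10 keV   :", photon_wavelength_nm(1e3), photon_wavelength_nm(1e4))
print("neutrons,  1 and 100 meV  :", matter_wavelength_nm(1e-3, M_N), matter_wavelength_nm(0.1, M_N))
```

All three probes come out with wavelengths between roughly 0.1 and 1 nm, the lower-energy end of each window corresponding to the longer wavelength.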
For simplicity, the incoming beam is assumed to be produced by a monochromatic
source placed at a large distance from the sample, so that it is characterized by a
well-defined wave vector $\vec k$ (or momentum $\hbar \vec k$). The detector is also very remote from
the sample, so that it detects outgoing radiation scattered elastically to another well-defined wave vector $\vec k'$ (Fig. 4.28a). As analyzed more quantitatively in the theory
of scattering, every infinitesimal volume $d^3\vec r$ of a continuous distribution of matter
scatters radiation in proportion to the amount of matter $\rho(\vec r)\, d^3\vec r$ locally present. As
suggested around Eq. (73) and sketched in Fig. 4.28b, the total rate of transition $\gamma_{\vec k \vec k'}$
is proportional to the square modulus of the matrix element $\int d^3\vec r\, e^{-i \vec k' \cdot \vec r}\, \rho(\vec r)\, e^{i \vec k \cdot \vec r} = \int d^3\vec r\, e^{-i (\vec k' - \vec k) \cdot \vec r}\, \rho(\vec r) \propto \mathcal{F}[\rho](\vec k' - \vec k) = \tilde\rho(\vec k' - \vec k)$, namely the Fourier transform of the
relevant density. These considerations suggest that the elastic scattering rate of the
incoming $\vec k$ wave radiation into the $\vec k'$ direction is proportional to the square modulus
of the Fourier transform $\tilde\rho$ of the appropriate density $\rho$, evaluated at the transferred
wave vector $\vec q = \vec k' - \vec k$ (Fig. 4.28a). The relevant density is the electronic charge
density $\rho_{el}(\vec r)$ for X rays, and the nuclear-matter density $\rho_{nuc}(\vec r)$ for neutrons:
(280) $I_{X\text{-rays}}(\vec q) \propto \left| \mathcal{F}[\rho_{el}](\vec q) \right|^2 \equiv \left| \tilde\rho_{el}(\vec q) \right|^2 , \qquad I_{neutr}(\vec q) \propto \left| \mathcal{F}[\rho_{nuc}](\vec q) \right|^2 \equiv \left| \tilde\rho_{nuc}(\vec q) \right|^2 .$
Figure 4.28. (a) In an elastic-scattering experiment ($|\vec k'| = |\vec k|$),
the transferred wave vector $\vec q = \vec k' - \vec k$ contains information equivalent
to the scattering angle $2\theta$. (b) The “far field” wave scattered by a
point-like element of matter at $\vec r$ is dephased by $\Delta\varphi_{tot} = \Delta\varphi_1 + \Delta\varphi_2 = \vec k \cdot \vec r - \vec k' \cdot \vec r = -\vec q \cdot \vec r$ with respect to
the one at the origin $\vec 0$, due to the different path length emphasized
by bold segments. Calculation of the total interfering wave from a
continuous distribution leads to a Fourier-transform “summation”.
Figure 4.29. Comparison of (a) neutron and (b) X-ray elastic scattering by a single Si atom. $q_x$ represents the x̂-component of the
change in wave vector $\vec k' - \vec k$ of the scattered radiation. Intensity
is proportional to the square modulus of the Fourier transform of
the appropriate density. Neutrons probe the nuclear-matter density,
which varies over a length of the order of $\sim 1$ fm, and is therefore
indistinguishable from a Dirac-delta distribution $\delta(\vec r - \vec R)$: its Fourier
transform is essentially independent of $q_x$, until huge $q_x \approx 10^4$ Å$^{-1}$.
X rays measure the Fourier transform of the total atomic electronic
charge density, which changes over a length of the order of 1 Å, thus
the atomic form factor varies significantly over a $q_x$ range of few Å$^{-1}$.
As a nucleus is essentially point-like, if seen at atomic length scales, the Fourier
transform of the density distribution of one nucleus is basically flat, thus one atom
scatters neutrons essentially independently of $\vec q$, as drawn in Fig. 4.29a. When
two atoms scatter neutrons, the scattered matter waves interfere: interference leads
to intensity reinforcement and reduction in alternating directions. Constructive
interference occurs whenever the two source-atom-detector path lengths differ by an
integer multiple of the radiation wavelength $\lambda = 2\pi/|\vec k|$. Quantitatively, the square modulus of the Fourier
transform $\tilde\rho_{nuc}(\vec q)$ of two equal point-like objects at $\vec R_1$ and $\vec R_2$
(281) $|\tilde\rho_{nuc}(\vec q)|^2 \propto \left| \exp(-i \vec R_1 \cdot \vec q) + \exp(-i \vec R_2 \cdot \vec q) \right|^2 = \left| \exp\left(-i\, \frac{\vec R_1 + \vec R_2}{2} \cdot \vec q\right) 2 \cos\left(\frac{\vec R_1 - \vec R_2}{2} \cdot \vec q\right) \right|^2 = 2 \left[ 1 + \cos\left( (\vec R_1 - \vec R_2) \cdot \vec q \right) \right] = 2\, [1 + \cos(a\, q_x)]$
shows characteristic oscillations (Fig. 4.30a,c). Constructive interference yields
maximum intensity in the $\vec q$ directions characterized by a projection $q_x$ along the
line joining the two nuclei (assume that $\vec R_1 - \vec R_2$ is aligned along x̂) such that
$(\vec R_1 - \vec R_2) \cdot \vec q = |\vec R_1 - \vec R_2|\, q_x \equiv a\, q_x = 2\pi \times$ an integer $l_1$. The role of interatomic
separation $a = |\vec R_1 - \vec R_2|$ on the interference pattern of the scattered waves is illustrated by the comparison of Fig. 4.30a and c: an increased separation in real space
produces closer interference maxima in $\vec q$ space. Scattered intensity is independent
of the $q_y$, $q_z$ components in the plane orthogonal to the line joining the two atoms,
which are therefore ignored by Fig. 4.30 (see also Fig. 4.33a below for a 2D example).
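The two-atom interference formula of Eq. (281) is easily checked numerically. The sketch below is a minimal illustration, assuming two unit-weight point scatterers on the x̂ axis at $x = 0$ and $x = a$ (the function name `two_atom_intensity` is a hypothetical helper): it sums the two phase factors explicitly and compares the result with the closed form $2[1 + \cos(a q_x)]$ for the two separations quoted in Fig. 4.30.

```python
import cmath
import math


def two_atom_intensity(qx, a):
    # Square modulus of the sum of the phase factors of two point scatterers
    # placed at x = 0 and x = a (left-hand side of Eq. (281)).
    amplitude = cmath.exp(-1j * qx * 0.0) + cmath.exp(-1j * qx * a)
    return abs(amplitude) ** 2


for a in (2.0, 3.0):  # separations in angstrom, as in Fig. 4.30
    print(f"a = {a} A: interference maxima every {2 * math.pi / a:.4f} A^-1")
    for qx in (0.0, 0.7, math.pi / a, 2 * math.pi / a):
        closed_form = 2 * (1 + math.cos(a * qx))
        print(f"   qx = {qx:.4f}   sum = {two_atom_intensity(qx, a):.6f}   closed form = {closed_form:.6f}")
```

The larger separation gives the smaller spacing $2\pi/a$ between maxima, consistently with the closer fringes of Fig. 4.30c.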
Figure 4.30. (a) Neutron and (b) X-ray scattering by two Si
atoms. The patterns show the characteristic interference periodicity $\frac{2\pi}{a}$, where $a = |\vec R_1 - \vec R_2|$ is the distance between the two nuclei
(here, $a = 2$ Å). The X-ray pattern is the product (285) of the neutron pattern (uniquely determined by the atomic positions) with the
square modulus of the atomic form factor, carrying information on the charge distribution of the single atom. (c) and (d) are the same as (a) and (b), but
with enlarged interatomic separation $a = 3$ Å: note that the interference fringes move closer, while the enveloping atomic form factor
remains unchanged.
Figure 4.31. Neutron and X-ray diffraction patterns, like in
Fig. 4.30, but produced by 7 (a,b) and 30 (c,d) atoms rather than
2. The strong diffraction peaks at distance $\frac{2\pi}{2.0}$ Å$^{-1}$ reflect the separation $a = 2.0$ Å. The relative intensity of the subsidiary interference
peaks decreases quickly as the number of atoms is increased, and vanishes in the limit of infinitely large crystal. In the X-ray pattern, the
diffraction scheme is multiplied by the atomic form factor.
Neutron scattering from a crystal arises from the interference of the waves scattered coherently by many atoms. The Fourier transform in Eq. (280) turns into a
discrete sum, not unlike light diffracted by an optical grating. Figure 4.31 shows the
intensity scattered by short regular chains of atoms: despite the smallness of such
1D “crystals”, narrow dominating diffraction structures emerge at regular $\vec q$ directions. The relative intensity of the weak subsidiary interference peaks in between the
strong Bragg diffraction peaks decreases quickly as the number of atoms increases.
Already for 30 atoms these structures become almost invisible (Fig. 4.31c,d), and
vanish completely in the limit of a macroscopically large regular crystal (see footnote 6). A simple
analysis reveals that the directions $\vec q$ of maximum diffracted intensity are compatible
with $\vec q \cdot (\vec R_i - \vec R_{i+1}) = q_x a = 2\pi\, l_1$: these are the positions of the 1D reciprocal lattice
vectors $q_x = G_x = \frac{2\pi}{a}\, l_1$. This occurs also for 2D and 3D periodicity: we conclude
that Bragg diffraction peaks occur for transferred wave vector equal to reciprocal-lattice points $\vec q = \vec G$.
Footnote 6: The modest regularity sufficient to produce strong diffracted peaks grants the possibility
to produce visible diffraction patterns even in the event that the coherence length of the probing
radiation available in the lab extends to several lattice-cell spacings only, much shorter than the
whole sample size.
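The behavior of short atomic chains can be reproduced with a direct sum of phase factors. The following sketch assumes $N$ equal unit-weight point scatterers at $x = 0, a, \dots, (N-1)a$ with $a = 2.0$ Å as in Fig. 4.31; the names `chain_intensity` and `relative_midgap_intensity` are hypothetical helpers, and the "mid-gap" window $q_x a \in [\pi/2, 3\pi/2]$ is an arbitrary choice made here to probe the region in between two Bragg peaks.

```python
import cmath
import math


def chain_intensity(qx, n_atoms, a=2.0):
    # Square modulus of the sum of phase factors of n_atoms point scatterers
    # at x = 0, a, 2a, ...: the discrete version of Eq. (280).
    amplitude = sum(cmath.exp(-1j * qx * j * a) for j in range(n_atoms))
    return abs(amplitude) ** 2


def relative_midgap_intensity(n_atoms, a=2.0, n_grid=2001):
    # Largest intensity found for qx*a in [pi/2, 3*pi/2], i.e. in the central half
    # of the interval between two Bragg peaks, relative to the peak height N^2.
    qs = [(math.pi / 2 + math.pi * j / (n_grid - 1)) / a for j in range(n_grid)]
    return max(chain_intensity(q, n_atoms, a) for q in qs) / n_atoms ** 2


for n in (2, 7, 30):
    peak = chain_intensity(2 * math.pi / 2.0, n)
    print(f"N = {n:2d}: Bragg peak height = {peak:.1f}, "
          f"relative mid-gap intensity = {relative_midgap_intensity(n):.5f}")
```

The Bragg peaks at $q_x = 2\pi l_1/a$ grow as $N^2$, while the structures in the middle of the gap between them are suppressed, relative to the peaks, roughly as $1/N^2$.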
Figure 4.32. Comparison of the X-ray atomic form factors of carbon
($Z = 6$) and of silicon ($Z = 14$). (a) Absolute value, showing that
the total X-ray scattering increases $\propto Z^2$; (b) The form factor of Si is
scaled to coincide with that of C at $\vec q = 0$, showing that the form factor
of heavier atoms is broader, due to the sharper charge localization of
their inner core electrons.
The patterns of Figs. 4.30-4.31 show that the X-ray diffractograms differ from
those of neutrons in an amplitude modulation. This is easily understood in steps:
• X rays scattered by a single atom probe the continuous electronic density
distribution $\rho_{at}(\vec r)$. According to Eq. (280), one atom scatters X rays elastically with the nontrivial $\vec q$ dependence given by the Fourier transform of its
electronic density, called atomic form factor $f_{at}(\vec q) \equiv \tilde\rho_{at}(\vec q)$, as drawn for
Si in Fig. 4.29b. Figure 4.32 reports two examples of X-ray atomic form
factors: these are characteristic bell-shaped functions of total weight proportional to the squared number of electrons.
• As illustrated in Secs. 2.2.1 and 2.2.2 above, chemical bonding modifies
the electronic states of a molecule or a solid, so that the electronic charge
density does differ from the sum of the densities of the individual atoms.
However only valence electrons are involved in bonding and delocalize significantly. X rays are scattered mostly by the core electrons (which, for Z not too small,
are the more numerous), which retain an atomic-like charge distribution. Accordingly, it is a fair approximation to assume that the electronic
distribution of a collection of atoms (e.g. in a molecule, or in a solid) equals
the sum of the individual atomic electronic distributions:
(282) $\rho_{el}(\vec r) = \sum_{\vec R} \rho_{at}(\vec r - \vec R) .$
• For a sample composed of many equal atoms sitting at the points $\vec R$ of a
Bravais lattice, the density of electrons can be rewritten as a convolution:
(283) $\rho_{el}(\vec r) = \sum_{\vec R} \rho_{at}(\vec r - \vec R) = \int d^3\vec s\; \rho_{at}(\vec r - \vec s) \sum_{\vec R} \delta(\vec s - \vec R) = \int d^3\vec s\; \rho_{at}(\vec r - \vec s) \cdot \rho_{Bravais}(\vec s) ,$
where $\rho_{Bravais}(\vec s) \equiv \sum_{\vec R} \delta(\vec s - \vec R)$.
• A basic theorem guarantees that the Fourier transform of the convolution
of two distributions is the plain product of the Fourier transforms of the
individual distributions:
(284) $I_{X\text{-rays}}(\vec q) \propto |\tilde\rho_{el}(\vec q)|^2 = |f_{at}(\vec q)\, \tilde\rho_{Bravais}(\vec q)|^2 = |f_{at}(\vec q)|^2\, |\tilde\rho_{Bravais}(\vec q)|^2 .$
• Finally, observe that $\rho_{nuc}(\vec r) \propto \rho_{Bravais}(\vec r)$, and conclude, recalling Eq. (280), that
(285) $I_{X\text{-rays}}(\vec q) \propto |f_{at}(\vec q)|^2\, |\tilde\rho_{nuc}(\vec q)|^2 \propto |f_{at}(\vec q)|^2\, I_{neutr}(\vec q) .$
This relation details the role of the atomic form factor $f_{at}(\vec q)$ in X-ray diffraction: X rays scatter in the same $\vec q$ directions as neutrons of the same wavelength, but the peak intensities are modulated multiplicatively by the
square modulus of the atomic form factor. This observation accounts for the compared
diffractograms of Figs. 4.29 to 4.31.
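The factorization of Eq. (284) can be tested on a toy model. The sketch below is only illustrative and rests on explicit assumptions: a 1D chain of 5 atoms, each carrying a Gaussian electron cloud of width `SIGMA` and total weight `Z` (a real atomic density is not Gaussian), and the Fourier convention $\tilde\rho(q) = \int e^{-iqx} \rho(x)\, dx$. It compares a brute-force numerical transform of the total density with the product of the Gaussian "form factor" and the lattice sum of phase factors.

```python
import cmath
import math

A = 2.0        # atomic spacing (illustrative)
SIGMA = 0.3    # width of the model Gaussian electron cloud (assumption)
Z = 14.0       # total weight of each cloud (illustrative)
N_ATOMS = 5


def rho_at(x):
    # Model atomic density: normalized Gaussian of total weight Z.
    return Z / (SIGMA * math.sqrt(2 * math.pi)) * math.exp(-x * x / (2 * SIGMA ** 2))


def rho_el(x):
    # Eq. (282): sum of atomic densities centered at the sites n*A.
    return sum(rho_at(x - n * A) for n in range(N_ATOMS))


def numerical_ft(q, x_min=-5.0, x_max=13.0, n_steps=6000):
    # Brute-force Fourier transform of the total density (Riemann sum).
    dx = (x_max - x_min) / n_steps
    total = 0j
    for j in range(n_steps):
        x = x_min + j * dx
        total += cmath.exp(-1j * q * x) * rho_el(x)
    return total * dx


def factorized_ft(q):
    # Form factor of the Gaussian cloud times the sum of site phase factors.
    form_factor = Z * math.exp(-0.5 * (SIGMA * q) ** 2)
    lattice_sum = sum(cmath.exp(-1j * q * n * A) for n in range(N_ATOMS))
    return form_factor * lattice_sum


for q in (0.0, 1.0, math.pi, 5.0):
    print(f"q = {q:.4f}   |numerical| = {abs(numerical_ft(q)):.6f}   |factorized| = {abs(factorized_ft(q)):.6f}")
```

The two evaluations agree to high accuracy at every $q$: the atomic positions fix where the intensity peaks occur, and the form factor only rescales their height.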
The 1D examples discussed above generalize simply to 2D and 3D. Figure 4.33
shows the 2D pattern of neutrons scattered by 2, 3, 4, and 5 equal atoms located at
the vertexes of regular polygons. The fact that 3 and 4 atoms produce a Bravais lattice as an interference pattern, while the pentagonal arrangement does not, suggests
that pentagonal symmetry is incompatible with repetition in space. Figure 4.34
shows the typical effect of geometric deformations of the lattice on the diffraction
pattern, as described mathematically by Eq. (277). Figures 4.35, 4.36, 4.37, 4.38,
illustrate typical 2D neutron diffraction patterns for several geometries. The handy
software of Ref. [?] allows one to explore arbitrary 2D geometries.
The 3D patterns follow similar rules: diffracted beams come out in the directions
where the transferred $\vec q$ matches a $\vec G$-vector of the 3D reciprocal lattice. In fact,
by shining a monochromatic neutron or X-ray beam on a single crystal, one generally obtains no diffracted beams, since, for that given $\vec k$, all possible $\vec k' = \vec k + \vec G$
have length $|\vec k'|$ different from $|\vec k|$, and thus would correspond to inelastic scattering.
Diffracted beams are produced only by carefully orienting the crystal, so that, for
some $\vec G$, the incident and scattered wavenumbers match:
(286) $|\vec k| = |\vec k'| = |\vec k + \vec G| .$
Figure 4.33. 2D neutron scattering patterns produced by regular
polygons of 2 to 5 equal atoms. For 2 atoms (a), as indicated by
Eq. (281), scattering is independent of the $\vec q$ component perpendicular
to the line joining them. Note that the regular triangle (b) and square
(c) produce a Bravais lattice as a “diffraction” pattern, while the
pentagon (d) does not. In this figure and in others below, $h = \frac{q_1}{2\pi}$ and $k = \frac{q_2}{2\pi}$.
Figure 4.34. 2D X-ray intensity scattered by 4 atoms: this simulates the general features of a square (a), rectangular (b), and oblique
(c) net. The relations (277) of inverse proportionality of the reciprocal-lattice and direct-lattice basis are well illustrated. The diffraction
spots at the reciprocal lattice points become much sharper when an
actual lattice produces them, rather than 4 atoms only.
Figure 4.35. The neutron diffraction pattern produced by a 9 × 9
square lattice of equal atoms. A red square marks the real-space
primitive cell; a blue square marks the reciprocal space primitive cell.
The peaks are much sharper than in the 2 × 2 example of Fig. 4.34.
Figure 4.36. The neutron diffraction pattern produced by a rectangular lattice.
Figure 4.37. The neutron diffraction pattern produced by an oblique lattice.
Figure 4.38. The neutron diffraction pattern produced by a triangular lattice.
Figure 4.39. (a) The Ewald construction. Given the incident vector
$\vec k$, draw a sphere (the Ewald sphere, in gray) of radius $|\vec k|$ about the
point $\vec k$. Diffraction peaks corresponding to reciprocal lattice vectors
$\vec G$ will be observed only if $-\vec G$ happens to lie on the surface of the
Ewald sphere, as drawn: radiation is then diffracted to the direction
$\vec k' = \vec k + \vec G$. (b) The relation between the scattering angle $2\theta$ and
the lengths of $\vec k$ and $\vec G$ vectors. The angle $\theta$ between the incident (or
the scattered) beam and the relevant Bragg plane of atoms fixes the
projection $\vec k \cdot \hat G = |\vec k| \sin\theta$, which is compatible with diffraction when
it equals $\frac{1}{2} |\vec G|$.
Figure 4.40. The Ewald construction for a powder sample: $\vec k$ defines a fixed Ewald sphere (gray). All possible orientations of the
crystal are realized in the powder sample: the whole $\vec G$ lattice (solid
points) is thus rotated in all possible ways, so that every $\vec G$ point covers a sphere (represented by a circle in the plane of the figure). Each
of these spheres intersects the Ewald sphere on a circle, whose projection in the plane of the figure is marked by two empty dots. All $\vec k'$ on
that circle make a fixed angle $2\theta$ with $\vec k$. Radiation is then scattered
to cones whose axis is the k̂ direction.
This geometric condition is represented by the Ewald construction of Fig. 4.39a.
The scattering angle, i.e. the angle between $\vec k$ and $\vec k'$, is related to $|\vec k|$ and $\vec G$ by
squaring Eq. (286), obtaining (up to the replacement of $\vec G$ with $-\vec G$, which is also a reciprocal-lattice vector)
(287) $\vec k \cdot \hat G = \frac{1}{2} |\vec G| .$
Figure 4.39b illustrates this geometric relation: the projection $\vec k \cdot \hat G = |\vec k| \sin\theta$ should
equal half the length of $\vec G$. Accordingly, scattering is observed at angles $2\theta$ connected
to the lengths of $\vec k$ and $\vec G$ by
(288) $\sin\theta = \frac{|\vec G|}{2\, |\vec k|} .$
In Sec. 4.1.2, we related $\vec G$ to a family of lattice planes (drawn in Fig. 4.39b) separated by a distance $d = n\, \frac{2\pi}{|\vec G|}$, where $n$ is the greatest common factor among the
integer Miller indexes fixing $\vec G$. On the other hand, $|\vec k|$ is connected to the radiation
wavelength $\lambda$ by $|\vec k| = \frac{2\pi}{\lambda}$: substitution in Eq. (288) yields the celebrated Bragg
condition for diffraction
(289) $2 d \sin\theta = n \lambda .$
According to this construction, no diffraction occurs for $2 |\vec k| < |\vec G|$, or equivalently
for $\lambda > 2d$.
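A short numerical example may help to fix the use of Eqs. (288) and (289). The sketch assumes illustrative values only (a family of planes with spacing $d = 4.0$ Å, indexes with no common factor so that $n = 1$, and radiation of wavelength 1.54 Å); the function name `bragg_angle_deg` is a hypothetical helper.

```python
import math


def bragg_angle_deg(g_norm, wavelength):
    # Eq. (288): sin(theta) = |G| / (2 |k|), with |k| = 2*pi/lambda.
    k_norm = 2 * math.pi / wavelength
    s = g_norm / (2 * k_norm)
    if s > 1:
        return None  # 2|k| < |G|: no diffraction is possible
    return math.degrees(math.asin(s))


wavelength = 1.54            # angstrom (illustrative)
d = 4.0                      # plane spacing in angstrom (illustrative), n = 1
g_norm = 2 * math.pi / d     # |G| of the corresponding reciprocal-lattice vector

theta = bragg_angle_deg(g_norm, wavelength)
print("theta =", theta, "deg,  scattering angle 2 theta =", 2 * theta, "deg")
print("2 d sin(theta) =", 2 * d * math.sin(math.radians(theta)), " (should equal lambda)")

# A plane family with d = 0.5 angstrom has lambda > 2d: no Bragg reflection.
print("d = 0.5 A:", bragg_angle_deg(2 * math.pi / 0.5, wavelength))
```

The second case returns `None`, reflecting the statement that no diffraction occurs when $\lambda > 2d$.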
In practice, to produce diffracted beams off a perfect single crystal, one must
move the reciprocal lattice and the Ewald sphere until some $\vec G$ point touches the
Ewald sphere, as in Fig. 4.39a. One can either vary the radiation wavelength, thus
changing the Ewald sphere diameter, or else rotate the crystal sample (its reciprocal
lattice rotates accordingly, see Eq. (277)).
Figure 4.41. Powder diffraction patterns: intensity as a function of
the scattering angle $2\theta$. (a) BaTiO$_3$, the intensity counting compared
to the film recording. (b) La$_2$Mo$_2$O$_9$, showing a structural phase transition. (c) LaPO$_4$, showing the progressive sample annealing of a clear
crystalline phase as temperature is raised, and the comparison with
the peak positions of known phases.
In the lab, it is common to characterize the structure of powder samples, i.e. collections of microcrystals rotated randomly in space. This uniform distribution of
orientations is equivalent to averaging over all possible rotations of the reciprocal
lattice. As illustrated in Fig. 4.40, radiation is scattered at fixed angles 2θ (given by
Eq. (288)) away from the incident ~k, thus it forms cones of diffracted radiation. Figure 4.11 shows an example of such a pattern. The sample in that experiment is an
Al foil, not a proper powder. The ring pattern of Fig. 4.11 indicates sharp cones of
radiation diffracted at characteristic angles: this proves that Al is microcrystalline.
Microcrystalline structures of this kind (tightly bound collections of randomly oriented microscopic individual crystals separated by grain boundaries) are responsible
for the plastic deformable character of most solid metals, as opposed to the rigidity
of many solids forming large single crystals (e.g. Si, NaCl).
Structural data about powder or microcrystalline samples are conveniently collected in plots of the diffracted intensity at each angle $2\theta$, as in the patterns
of Fig. 4.41. Here the vertical axis reports the total scattered intensity, integrated
along circles at fixed 2θ.
Till this point, the discussion assumed a single atom per unit cell of the Bravais
lattice. To extend the formalism to general crystals with an n-atom basis, replace
the atomic form factor in Eq. (285) with a suitable structure factor
(290) $S(\vec q) = \sum_{j=1}^{n} f_{at\, j}(\vec q)\, e^{i \vec q \cdot \vec d_j} ,$
representing the Fourier transform of the matter distribution of the $n$ atoms in the
cell sitting at positions $\vec d_j$. For neutrons, the $f_{at\, j}(\vec q)$ are $\vec q$-independent weights,
numerically different for different nuclei in the cell; for X-rays, $f_{at\, j}(\vec q)$ are the form
factors (see Fig. 4.32 and Eq. (285)) of the individual atoms in the cell. The decomposition $I(\vec q) \propto |S(\vec q)|^2\, |\tilde\rho_{Bravais}(\vec q)|^2$ implies a fundamental result: a given Bravais
lattice yields the same characteristic diffraction pattern irrespective of the number,
kinds and position of the atoms populating its unit cell: these details affect the
structure factor |S(~q)|2 , in turn applying a multiplicative intensity modulation to
the same peaks.
The possibility of describing lattices with a basis might make us suspect some
ambiguity. For example, a square lattice of side $a$ could also be seen as a square
lattice of side $2a$ with 4 atoms per cell: the reciprocal-lattice $\vec G$ points of the $2a$
lattice are twice as dense in each direction, but the diffracted pattern should remain
the same as we only change the formal description of the same system. Indeed,
the structure factor (drawn in Fig. 4.34a) vanishes for all $\vec G$ points of the denser
reciprocal lattice which do not belong to the reciprocal lattice of the true $a$-side
square. However, the $2a$ lattice may become the correct minimal description of the
actual lattice, e.g. in case of a structural deformation, such as one atom in each 2 × 2
square moving away from its perfect-square position. Under these conditions, the
denser $\vec G$ peaks of the $2a$ lattice acquire nonzero intensity. Similarly, a fcc lattice of conventional
cell side $a$, as a Bravais lattice, produces diffraction spots for all $\vec q = \vec G$ points of a
bcc reciprocal lattice of conventional side $\frac{4\pi}{a}$. However, the same fcc lattice can be
seen as a simple cubic lattice of side $a$ with 4 atoms/cell. The diffraction pattern of
the simple cubic lattice has denser $\vec G$ points (a simple cubic lattice of side $\frac{2\pi}{a}$) than
that of the fcc. The actual diffraction pattern must coincide regardless of the chosen
formalism. Indeed, the structure factor (290), computed for 4 equal atoms (equal
$f_{at\, j}(\vec q)$) sitting at positions $\vec 0$, $\frac{a}{2}(\hat x + \hat y)$, $\frac{a}{2}(\hat x + \hat z)$, $\frac{a}{2}(\hat y + \hat z)$, gets rid of peaks at $\vec G$
points of the simple cubic lattice (side $\frac{2\pi}{a}$) not belonging to the actual bcc reciprocal
lattice (side $\frac{4\pi}{a}$).
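The cancellation just described can be verified by evaluating Eq. (290) explicitly. The sketch below assumes 4 equal atoms with a $\vec q$-independent weight `f_at = 1` (as for neutrons) and labels the simple cubic reciprocal-lattice points as $\vec G = \frac{2\pi}{a}(h, k, l)$; the names `structure_factor` and `same_parity` are hypothetical helpers.

```python
import cmath
import math
from itertools import product

a = 1.0  # conventional cubic cell side (illustrative units)
# fcc seen as simple cubic with a 4-atom basis.
basis = [(0.0, 0.0, 0.0), (a / 2, a / 2, 0.0), (a / 2, 0.0, a / 2), (0.0, a / 2, a / 2)]


def structure_factor(h, k, l, f_at=1.0):
    # Eq. (290) at the simple cubic reciprocal-lattice point G = (2*pi/a) * (h, k, l).
    g = tuple(2 * math.pi / a * m for m in (h, k, l))
    return sum(f_at * cmath.exp(1j * sum(gc * dc for gc, dc in zip(g, d))) for d in basis)


def same_parity(h, k, l):
    # True when h, k, l are all even or all odd: the points of the bcc
    # reciprocal lattice of conventional side 4*pi/a.
    return h % 2 == k % 2 == l % 2


for h, k, l in product(range(0, 3), repeat=3):
    s = abs(structure_factor(h, k, l))
    print((h, k, l), "|S| =", round(s, 6), " bcc point" if same_parity(h, k, l) else "")
```

Only the points with $h$, $k$, $l$ all even or all odd retain $|S| = 4$; at all other simple cubic reciprocal-lattice points the four contributions cancel.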
Defects and the finite size of the crystals in the sample add a continuous background of diffuse scattering to the sharp Bragg peaks (see e.g. Fig. 4.31) of the
diffraction pattern. If disorder increases, this continuous background increases in
intensity, until for completely amorphous or liquid samples, neutron or X-ray scattering does not show the sharp Bragg peaks characteristic of lattice periodicity any
more: scattered intensity becomes a smooth function of the angle 2θ. Even for such
materials, the scattered distribution provides useful information about their average
structural properties, which can be retrieved by numerical Fourier analysis.
4.2. Electrons in crystals
Within the adiabatic framework (Sec. 2.1), electrons move in a solid according
to the electronic equation (117). The total electronic energy obtained by solving
Eq. (117), added to the internuclear repulsion, yields the adiabatic potential (125)
which, in turn, determines the dynamics of the nuclei through Eq. (124). For a
crystalline solid at low enough temperature, Vad keeps the atomic configuration
close to its minimum, characterized by a regular arrangement of the nuclei, not
unlike a finite (huge) portion of an ideal crystal structure of the kind described
in Sec. 4.1. In Sec. 4.3 we shall investigate the motions of the ions around their
equilibrium configuration. For the moment however we neglect the nuclear kinetic
energy and the movements of the ions: we assume they sit at equilibrium exactly
at the ideal crystal-structure positions, and turn our attention to the dynamics of
electrons. Within this idealized scheme, we investigate the universal properties of
the solutions of the electronic equation (117), those required by symmetry.
Equation (117) is a many-body equation, carrying the same conceptual difficulties
we discussed for the molecular case. As for atoms and molecules, the Schrödinger
problem of many electrons moving in the field of the nuclei is usually treated in the
framework of some mean-field self-consistent technique of the Hartree-Fock type.
As discussed in Sec. 1.2.4, this approximation maps the N -electron equation to
a set of single-electron self-consistent equations for the motions of an electron in
the field of the nuclei and N − 1 other electrons. We assume that the mean-field
effective potential Veff (~r) has the same symmetry as the field created by the bare
nuclei, i.e. the full crystal symmetry7 (Fig. 4.42). As in atoms, the single-electron HF
orbitals carry spherical-symmetry labels l and m, similarly in solids a single-electron
states are labeled by Bravais-lattice group representations, namely vectors ~k chosen
within a primitive zone of the reciprocal lattice, e.g. the first Brillouin zone. More
concretely, the ~k quantum number contains information on how the wavefunction
7
This assumption is far more reasonable than the assumption that the HF mean field of atoms
is spherically symmetric. The reason is that the crystal states have all equal probability in different
cells (as discussed below), while non-s atomic states have nonuniform probability distributions.
[Plot for Figure 4.42: Vne(x) and Veff(x), in Ha (vertical axis, from 0 to -20), versus x.]
Figure 4.42. The bare nuclear potential Vne (x) (solid) and the
screened effective one-electron potential Veff (x) (dashed) along a 1D
cut through a direct-lattice primitive direction in a typical solid (e.g.
Al). The effective potential is less attractive than the bare nuclear potential, but it shows the same discrete lattice translational symmetry,
$V_{\rm eff}(\vec r) = V_{\rm eff}(\vec r + \vec R)$.
changes under the action of lattice translation $T_{\vec R}$, i.e. in going from one cell to
another within the lattice. Additional quantum numbers are needed to label states
of different wavefunction within a single primitive cell, and for spin projection.
A precise determination of the wavefunction and energy for a single electron moving in the HF mean field requires a detailed calculation, which is generally carried
out by numerical means. In this context, symmetry plays a twofold role:
• it greatly simplifies the solution of the HF-Schrödinger equation in the lattice;
• it helps us understand general features of its solutions.
Lattice discrete symmetry applies to the electronic eigenstates of crystals through
Bloch’s theorem. This theorem states that all Schrödinger eigenstates in a periodic
potential [$V_{\rm eff}(\vec r) = V_{\rm eff}(\vec r + \vec R)$] can be chosen in the factorized form
\[ \psi_j(\vec r) = e^{i\vec k\cdot\vec r}\, u_{\vec k j}(\vec r) , \tag{291} \]
where the function $u_{\vec k j}(\vec r)$ has the same periodicity as the lattice [$u_{\vec k j}(\vec r) = u_{\vec k j}(\vec r + \vec R)$], and $\vec k$ is a suitable wave vector (depending on $\psi_j$, but otherwise subject to no
restriction). This fundamental result can be seen in two alternate, but equally instructive
ways:
• In a periodic context all electronic eigenfunctions have a nontrivial spatial
dependence only within one primitive unit cell: in any other cell displaced
by $T_{\vec R}$, the wavefunction is equal to that in the original cell, apart from a
constant phase factor $e^{i\vec k\cdot\vec R}$ which leaves the probability distribution $|\psi_j|^2$
unaffected. This consideration is basically also the demonstration of Bloch’s
theorem.8
• In a periodic potential, the Schrödinger eigenstates are essentially plane-wave-like states, except for a periodic (thus trivial) amplitude modulation.
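The two readings of Bloch's theorem can be checked numerically on a toy example. The sketch below is only an illustration: the lattice spacing, the value of k and the periodic modulation u(x) are hypothetical choices, not quantities taken from these notes. It builds a 1D Bloch function of the form (291) and compares it at points displaced by a lattice translation.

```python
import cmath
import math

a = 1.0                      # lattice spacing (arbitrary units, assumed)
k = 0.3 * 2.0 * math.pi / a  # a wave vector inside the first Brillouin zone

def u(x):
    """Hypothetical lattice-periodic modulation, u(x + a) = u(x)."""
    return 1.0 + 0.4 * math.cos(2.0 * math.pi * x / a)

def psi(x):
    """Bloch function of Eq. (291) in 1D: plane wave times periodic part."""
    return cmath.exp(1j * k * x) * u(x)

def translation_ratio(x, cells=1):
    """psi(x + R) / psi(x) for the lattice translation R = cells * a."""
    return psi(x + cells * a) / psi(x)

if __name__ == "__main__":
    for x in (0.0, 0.17, 0.5):
        ratio = translation_ratio(x, cells=3)
        density_change = abs(psi(x + 3 * a)) ** 2 - abs(psi(x)) ** 2
        print(x, ratio, density_change)
```

For every x the ratio should coincide with the constant phase factor exp(i k R), while the probability density is unchanged from cell to cell.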
Observe that $\vec k$ may as well be restricted to one primitive cell of the reciprocal
lattice, e.g. a primitive parallelepiped, or the first Brillouin zone9 (see Fig. 4.17).
Indeed, if $\vec k$ was outside this primitive cell, we could always find a reciprocal lattice
vector $\vec G$ such that $\vec k' = \vec k + \vec G$ is in the primitive cell of our choice. But then
\[ \psi_j(\vec r) = e^{i\vec k\cdot\vec r}\, u_{\vec k j}(\vec r) = e^{i(\vec k' - \vec G)\cdot\vec r}\, u_{\vec k j}(\vec r) = e^{i\vec k'\cdot\vec r}\, e^{-i\vec G\cdot\vec r}\, u_{\vec k j}(\vec r) , \]
and the function $u'_{\vec k' j}(\vec r) = e^{-i\vec G\cdot\vec r}\, u_{\vec k j}(\vec r)$ is lattice-periodic, thus it makes a valid Bloch function.
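In 1D the reduction of an arbitrary k to the first Brillouin zone takes a few lines of code. The following is a minimal sketch; the function name and the convention of returning the integer multiple of 2π/a are illustrative choices, not part of the notes.

```python
import math

def fold_to_first_bz(k, a):
    """Return (k_folded, n) with k_folded = k + n * (2*pi/a) in (-pi/a, pi/a]."""
    g = 2.0 * math.pi / a          # primitive reciprocal-lattice vector in 1D
    n = -math.ceil(k / g - 0.5)    # integer number of g to add to k
    return k + n * g, n

if __name__ == "__main__":
    a = 1.0
    for k in (0.6 * 2 * math.pi, -0.5 * 2 * math.pi, 2.3 * 2 * math.pi):
        k_folded, n = fold_to_first_bz(k, a)
        print(k, "->", k_folded, "using G =", n, "* 2*pi/a")
```

The folded and the original k produce the same phase factor exp(i k R) for every lattice translation R, which is why they label the same Bloch state.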
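It is instructive to find the explicit equation satisfied by the Bloch functions
$u_{\vec k j}(\vec r)$. This is obtained by substituting the decomposition (291) in the stationary Schrödinger equation for the electrons in the periodic effective potential:
\[ \left[ -\frac{\hbar^2}{2m_e}\nabla^2 + V_{\rm eff}(\vec r) \right] e^{i\vec k\cdot\vec r}\, u_{\vec k}(\vec r) = E\, e^{i\vec k\cdot\vec r}\, u_{\vec k}(\vec r) . \tag{292} \]
To deal with the kinetic term, observe that
\[ \nabla^2 \left[ e^{i\vec k\cdot\vec r}\, u_{\vec k j}(\vec r) \right] = \vec\nabla \cdot \left[ e^{i\vec k\cdot\vec r}\, \vec\nabla u_{\vec k j}(\vec r) + i\vec k\, e^{i\vec k\cdot\vec r}\, u_{\vec k j}(\vec r) \right] \]
\[ = e^{i\vec k\cdot\vec r}\, \nabla^2 u_{\vec k j}(\vec r) + 2i\vec k\cdot e^{i\vec k\cdot\vec r}\, \vec\nabla u_{\vec k j}(\vec r) - |\vec k|^2 e^{i\vec k\cdot\vec r}\, u_{\vec k j}(\vec r) = e^{i\vec k\cdot\vec r} \left( \vec\nabla + i\vec k \right)^2 u_{\vec k j}(\vec r) . \]
By substituting this decomposition into the Schrödinger equation (292), and dividing
by the common factor $e^{i\vec k\cdot\vec r}$, we obtain
\[ \left[ -\frac{\hbar^2}{2m_e} \left( \vec\nabla + i\vec k \right)^2 + V_{\rm eff}(\vec r) \right] u_{\vec k j}(\vec r) = E_{\vec k j}\, u_{\vec k j}(\vec r) . \tag{293} \]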
8 The single-electron effective Hamiltonian $T_e + V_{\rm eff}(\vec r)$ commutes with all lattice translations
$T_{\vec R}$. Accordingly, its eigenfunctions may be chosen as simultaneous eigenfunctions of all discrete
translation operators. But then $T_{\vec R}\,\psi_j(\vec r) = e^{-i\vec k'\cdot\vec R}\,\psi_j(\vec r)$ for some $\vec k'$ in the first Brillouin zone.
Accordingly, one can call $\vec k = -\vec k'$ and $u_{\vec k j}(\vec r) = \psi_j(\vec r)\, e^{i\vec k'\cdot\vec r}$ is periodic by construction. 1-dimensional group representations do not seriously break the lattice symmetry, except possibly for
phases. As a related example, the symmetric and antisymmetric wavefunctions of Eq. (126), drawn
in Fig. 2.3 have both a perfectly symmetric square modulus under reflections across the mid-plane
separating the nuclei, exactly like Bloch states have square modulus equal in all cells.
9 In 1D, any k-interval of length $2\pi/a$. For example, the interval $-\pi/a < k \le \pi/a$ is the first Brillouin
zone.
[Plot for Figure 4.43: energy E versus kx, for kx between −π/a and π/a, with three bands labeled j = 1, j = 2, j = 3.]
Figure 4.43. For each given $\vec k$ in the first Brillouin zone, the solution
of Eq. (293) provides discrete electronic energies $E_{\vec k j}$, for j = 1, 2, ...,
three of which are sketched here. These energies depend parametrically on $\vec k$. The continuous functional dependency $E_j(\vec k) = E_{\vec k j}$ is an
energy band of the solid.
This is the basic equation for the stationary states of an electron characterized
by a given $\vec k$. Thanks to the periodicity of $u_{\vec k j}$ established by Bloch’s theorem,
Eq. (293) must be solved within a single cell of the direct lattice, with applied
periodic boundary conditions. This is to be contrasted with the original Schrödinger
equation, which should instead be solved in the whole crystal. Lattice discrete
symmetry, through Bloch’s theorem, permits us to study the quantum dynamical
problem on a relatively small unit cell, where investigation can profitably be
carried out by a number of approximate techniques, rather than on a macroscopically large volume
(where it would be arduous to seek solutions: this is the reason why non-periodic solids are much harder
to deal with). Once $u_{\vec k j}(\vec r)$ is found, the true
electronic wavefunction $\psi_j(\vec r)$ is then extended to the whole lattice by means of
Eq. (291).
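As a quick consistency check of Eq. (293), consider the limit of a constant effective potential, $V_{\rm eff}(\vec r) = V_0$ (an assumption made only for this illustration). The plane waves $u(\vec r) = e^{i\vec G\cdot\vec r}$, with $\vec G$ any reciprocal-lattice vector, are lattice-periodic because $\vec G\cdot\vec R$ is an integer multiple of $2\pi$, and they solve Eq. (293) exactly:
\[ \left( \vec\nabla + i\vec k \right)^2 e^{i\vec G\cdot\vec r} = -|\vec k + \vec G|^2\, e^{i\vec G\cdot\vec r} \quad\Longrightarrow\quad E = \frac{\hbar^2 |\vec k + \vec G|^2}{2m_e} + V_0 . \]
At fixed $\vec k$ each reciprocal-lattice vector thus provides one solution, and the resulting discrete energies are just the free-electron parabola translated by all possible $\vec G$ vectors.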
Before attempting the solution of Eq. (293), we establish a few universal properties
of the single-electron eigenstates and eigenenergies in a solid. Equation (293) is a
second-order differential equation of the Schrödinger type, defined in a finite volume
$V_c$, with standard periodic boundary conditions. Apart from the $\vec k$ shift of wave
vector, Eq. (293) is equivalent to a stationary Schrödinger equation. Consequently,
for a given fixed $\vec k$, its solutions must be qualitatively similar to those of a standard
Figure 4.44. Electronic states of a solid built approaching isolated
atoms: as the atoms move closer, the lattice separation a becomes
comparable to the size of the atomic wavefunctions, and the individual
atomic levels spread out in bands. If external pressure is applied
to the solid, so that the lattice parameter decreases to a′ , then the
bandwidths increase further, and more bands may overlap.
Schrödinger equation in a finite volume, namely: a ladder of discrete eigenenergies
$E_{\vec k j}$ associated to eigenfunctions $u_{\vec k j}(\vec r)$ with larger and larger number of nodes for
increasing energy. The index j = 1, 2, 3, ... labels precisely the solutions in order of
increasing energy (Fig. 4.43).
$\vec k$ can vary arbitrarily within the first Brillouin zone, and as it does, the equation
(293) for $u_{\vec k}(\vec r)$ changes, which explains attaching the label $\vec k$ to both eigenenergies
$E_{\vec k j}$ and eigenfunctions $u_{\vec k j}(\vec r)$. The parameter $\vec k$ acts analytically in Eq. (293), thus
it is reasonable to expect that its solutions depend analytically on $\vec k$. Indeed, for
fixed j, the eigenenergies $E_{\vec k j}$ depend on $\vec k$ as continuous functions called energy
bands, or simply “bands” (Fig. 4.43). As $\vec k$ takes all its allowed values within the
first Brillouin zone, for each j, $E_{\vec k j}$ spans a continuous interval of available energies
(the range of the $E_{\vec k j}$ function, itself sometimes called a “band”). The ranges of two
successive bands $E_{\vec k j}$ and $E_{\vec k\, j+1}$ can either overlap (like bands 2 and 3 in Fig. 4.43) or
not overlap (like bands 1 and 2 in Fig. 4.43). Both possibilities are compatible with
Eq. (293), and do occur in actual solids. The main consequence of Bloch’s theorem
is then a spectrum of electronic energies involving intervals of allowed energies,
separated by ranges of forbidden energies (band gaps). In a crystal, electrons are
characterized by an energy spectrum somewhat intermediate between that of a free
[Plots for Figure 4.45: Re ψk j(x) versus x, panel (a) for j = 1 and panel (b) for j = 2.]
Figure 4.45. A sketch of the real part of some Bloch wavefunction
$\psi_{k j}(x) = e^{ikx}\, u_{k j}(x)$ in a 1D lattice, for (bottom to top) $k = 0$, $k = 0.1\,\frac{2\pi}{a}$, $k = 0.2\,\frac{2\pi}{a}$, $k = 0.3\,\frac{2\pi}{a}$, $k = 0.4\,\frac{2\pi}{a}$, $k = 0.5\,\frac{2\pi}{a}$, for the
states belonging to the two lowest bands j = 1 (a) and j = 2 (b).
Negative k values of the same length |k| yield wavefunctions whose
real part is the same. As k increases across the first Brillouin zone,
the phase difference of the wavefunction at neighboring sites increases.
At the zone boundary $k = \pi/a$, this phase difference is maximum, and
equals π, corresponding to the sign alternation of the uppermost curve.
At the same time, the shape of the wavefunction at each site also
evolves with k, due to the k dependence of the equation (293) for
$u_{k j}(x)$.
particle (all positive energies) and that of an atom (isolated eigenvalues separated by
gaps). Figure 4.44 sketches the connection between the atomic and the solid-state
spectra as the component atoms move together to form the solid.
Figure 4.45 shows the qualitative shape of a few band eigenfunctions in a 10-site
portion of a simple 1D lattice (the picture proposed here extends simply to 2D and
3D). The real part of the wavefunctions for two bands is drawn: the imaginary part
would be similar, but out of phase by one quarter of wavelength. The probability
density $|\psi_{\vec k j}(x)|^2$ is periodic, i.e. repeated equally in all cells. The primary role of
$\vec k$ is to tune the phase change of the wavefunction in going from one site to the
next. The wavelength associated to these Bloch waves is $2\pi/|\vec k|$. The values of $\vec k$ chosen
in Fig. 4.45 yield wavelengths commensurate to the 10-site region drawn. Other
[Sketches for Figure 4.46: (a) a ring of two sites, |0> and |1>; (b) a ring of sites |0>, |1>, |2>, ..., |Nn −1>.]
Figure 4.46. A 1D periodic lattice of Nn sites with periodic boundary conditions has the connectivity of a ring. (a) The Nn = 2 case,
representing a diatomic molecule. (b) A generic 1D lattice.
intermediate choices of $\vec k$ would produce incommensurate wavelengths. In addition,
$\vec k$ also modifies the local “shape” of the wavefunction $\psi_{\vec k j}(x)$ within each cell, through
the explicit $\vec k$-dependency of Eq. (293), and thus of $u_{\vec k j}(x)$. This $\vec k$-dependency is
smooth and often relatively weak, while the probability density usually changes
more substantially in comparing a band j to another j′.
The detailed calculation of the band energies $E_{\vec k j}$ and wavefunctions $u_{\vec k j}(x)$ of an
actual solid is typically carried out by means of some self-consistent calculation of the
mean-field potential $V_{\rm eff}(\vec r)$, associated to numerical solution of Eq. (293) in a direct-lattice primitive unit cell, for a sufficiently dense set of $\vec k$ points. However, some
insight into the physics of the band states can be obtained by considering substantially
simplified “model” solutions.
4.2.1. Models of bands in solids. Although Eq. (293) is not especially difficult to solve numerically in a primitive unit cell, for any reasonable number of
“sample” $\vec k$ points, it is useful to gain better insight into the properties of the band
energies and states by means of simplified approximate solutions, which allow for
some analytical treatment.
4.2.1.1. The tight-binding model. In regions near the atomic nuclei, the crystal
effective potential acting on an electron is not much different from that of an isolated
atom. A crystal may be seen as a huge molecule: accordingly, the electronic (band)
states could be approximately expressed as suitable linear combinations of atomic
wavefunctions, like the molecular states of H$_2^+$ and H$_2$.
The 2-atoms calculation of Sec. 2.2.1 is the simplest example of tight binding. It
addresses the 1s “band” of a “crystal” composed of $N_n = 2$ H atoms only. $|1s\,L\rangle$
[Plot for Figure 4.47: band energy Ek (vertical axis, ticks from -1 to 1) versus k a / 2π (horizontal axis, ticks from -0.4 to 0.4).]
Figure 4.47. The continuum of electronic states in a solid builds
up as the Nn discrete energies at the allowed k points get denser and
denser with Nn → ∞.
represents the state at the left atom. $|1s\,R\rangle$ represents that at the right atom,
obtained from $|1s\,L\rangle$ by a “lattice” translation. Artificial periodic boundary conditions bring $|1s\,R\rangle$ back again to $|1s\,L\rangle$ upon a further translation (Fig. 4.46). The
symmetry-adapted states (see Eq. (126)) can be seen as bonding and antibonding
combinations
\[ |S\rangle = \frac{1}{\sqrt 2}\left( |1s\,L\rangle + |1s\,R\rangle \right) = \frac{1}{\sqrt{N_n}} \sum_{p=0,1} |p\rangle = \frac{1}{\sqrt{N_n}} \sum_{p=0,1} e^{ikpa}\, |p\rangle , \qquad k = 0 , \]
\[ |A\rangle = \frac{1}{\sqrt 2}\left( |1s\,L\rangle - |1s\,R\rangle \right) = \frac{1}{\sqrt{N_n}} \sum_{p=0,1} (-1)^p\, |p\rangle = \frac{1}{\sqrt{N_n}} \sum_{p=0,1} e^{ikpa}\, |p\rangle , \qquad k = \frac{\pi}{a} , \]
with $|0\rangle = |1s\,L\rangle$ and $|1\rangle = |1s\,R\rangle$. The only possible phase relations in
those two-site combinations are “in phase” and “out of phase”.
More in general, for a chain of $N_n \ge 2$ H atoms (a 1D “crystal”, Fig. 4.46), the
combinations compatible with the discrete translational symmetry are the following
$N_n$ ones:
\[ \psi_k = \frac{1}{\sqrt{N_n}} \sum_{p=0}^{N_n-1} e^{ikpa}\, |p\rangle . \]
k takes the $N_n$ equally spaced values
\[ k = 0,\ \pm\frac{1}{N_n}\frac{2\pi}{a},\ \pm\frac{2}{N_n}\frac{2\pi}{a},\ \pm\frac{3}{N_n}\frac{2\pi}{a},\ \dots,\ \pm\frac{N_n/2-1}{N_n}\frac{2\pi}{a},\ \text{and}\ \frac{1}{2}\frac{2\pi}{a} \]
compatible with the “ring” periodicity. In the limit of very large $N_n$, the k states
form a continuum in the first Brillouin zone $-\pi/a < k \le \pi/a$. As k spans this range,
Figure 4.48. The band states in a solid seen as combinations of the
orbitals of the isolated atoms as they move together to form a crystal.
The Nn discrete levels of a finite solid of Nn unit cells build up the
band continuum in the thermodynamic limit Nn → ∞, as discussed
in Sec. 4.2.2. The bandwidth increases as the lattice parameter a
is reduced to its equilibrium value, due to the increasing overlap, as
illustrated for the bonding and antibonding states of the diatom in
Fig. 2.4b.
the energy of the resulting band moves continuously from the bonding value (band
bottom) to the antibonding value (band top), as illustrated in Fig. 4.47. This bandstructure is a generalization of the molecular Eq. (128). Similarly, as the interatomic
separation a is reduced, the overlap integral between atomic orbitals increases, and
so does the bandwidth (Fig. 4.48).
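The ring construction can be tried out on a deliberately simplified model. The sketch below assumes orthonormal atomic orbitals, one orbital per site, and a coupling −t between nearest neighbors only; the on-site energy eps, the hopping t and the resulting dispersion eps − 2t cos(ka) belong to this textbook toy model and are not formulas taken from these notes. The code verifies that the symmetry-adapted combinations with the allowed k values are indeed eigenstates.

```python
import cmath
import math

def ring_hamiltonian(n_sites, eps=0.0, t_hop=1.0):
    """Nearest-neighbor hopping matrix on a ring of n_sites orthonormal orbitals."""
    h = [[0.0] * n_sites for _ in range(n_sites)]
    for p in range(n_sites):
        h[p][p] += eps
        h[p][(p + 1) % n_sites] -= t_hop   # coupling to the right neighbor
        h[p][(p - 1) % n_sites] -= t_hop   # coupling to the left neighbor
    return h

def bloch_state(n_sites, m):
    """Amplitudes exp(i k p a) / sqrt(N_n), with k a = 2 pi m / N_n."""
    ka = 2.0 * math.pi * m / n_sites
    return [cmath.exp(1j * ka * p) / math.sqrt(n_sites) for p in range(n_sites)]

def residual(n_sites, m, eps=0.0, t_hop=1.0):
    """max_p |(H psi_k)_p - E_k (psi_k)_p|, with E_k = eps - 2 t cos(k a)."""
    h = ring_hamiltonian(n_sites, eps, t_hop)
    psi = bloch_state(n_sites, m)
    e_k = eps - 2.0 * t_hop * math.cos(2.0 * math.pi * m / n_sites)
    h_psi = [sum(h[p][q] * psi[q] for q in range(n_sites)) for p in range(n_sites)]
    return max(abs(h_psi[p] - e_k * psi[p]) for p in range(n_sites))

if __name__ == "__main__":
    for n_sites in (2, 6, 10):
        print(n_sites, max(residual(n_sites, m) for m in range(n_sites)))
```

In this model the k = 0 (bonding) combination sits at the band bottom eps − 2t and the k = π/a (antibonding) one at the band top eps + 2t, so the bandwidth 4t grows with the coupling between neighboring orbitals.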
These simple concepts apply to any atomic orbital of real 3D solids, as illustrated in Fig. 4.49 for solid sodium. Note that, for a given interatomic separation a,
the overlaps of shallow extended atomic states are larger than those of deeper, more
compact core states. Accordingly, in the solid state, inner levels originate narrower
bands than external optically active atomic orbitals. The initial lowering of the
band center of mass from the atomic energy as a is reduced from very large values
(Fig. 4.48) is due to two main reasons:
• a band state in a crystal is subject to the collective attraction of all nuclei,
rather than just one;
Figure 4.49. Tight-binding energy bands of solid sodium, as a function of internuclear distance. The strongly overlapping 3s, 3p, ...
bands indicate that the tight-binding method is not especially well
suited to describe the wide conduction band at the equilibrium lattice
spacing. In fact, the plane-waves method (Sec. 4.2.1.2) is better fit to
the conduction band of alkali metals.
• the kinetic energy of the more delocalized band state is lower than that of
the more compact atomic state.
If the band under examination happens to be filled with electrons, this energy lowering contributes to the crystal cohesive energy.
4.2.1.2. The plane-waves method. An approach to approximate bands alternative to tight binding starts from the observation that, as illustrated in Fig. 4.42, the
effective one-electron potential for the outer electrons is substantially screened in
the crystal, thus free-electron plane-wave states should approximate well the actual
eigenstates, except near the atomic nuclei. In the following, we use the 3D formalism, but the problem is most simply visualized in 1D, and most illustrative examples
restrict to 1D for simplicity.
Expand the one-electron lattice eigenstates on the basis of plane waves:
\[ |b\rangle = \sum_{\vec k'} b_{\vec k'}\, |\vec k'\rangle , \]
where $|\vec k'\rangle$ are suitably normalized plane-wave states,10 and, in principle, the sum
extends over all $\vec k'$ vectors, and amounts therefore to an integration.
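Application of Schrödinger equation to the candidate eigenket $|b\rangle$ and multiplication on the left by $\langle\vec k|$ (implying a volume integration over $\vec r$) maps the initial
differential problem to an algebraic (matrix) equation for the wavefunction Fourier
components $b_{\vec k'}$:
\[ H\,|b\rangle = \left[ \frac{p^2}{2m_e} + V_{\rm eff} \right] |b\rangle = E\, |b\rangle \]
\[ \langle\vec k| \left[ \frac{p^2}{2m_e} + V_{\rm eff} \right] |b\rangle = E\, \langle\vec k|b\rangle \]
\[ \sum_{\vec k'} \left[ \frac{\hbar^2 k^2}{2m_e}\, \delta_{\vec k,\vec k'} + \langle\vec k|V_{\rm eff}|\vec k'\rangle \right] b_{\vec k'} = E \sum_{\vec k'} \delta_{\vec k,\vec k'}\, b_{\vec k'} , \tag{294} \]
where appropriate orthonormality $\langle\vec k|\vec k'\rangle = \delta_{\vec k,\vec k'}$ of the plane waves and the fact that
they are eigenstates of the momentum have been used. The matrix elements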
\[ \langle\vec k|V_{\rm eff}|\vec k'\rangle = N \int e^{i(\vec k' - \vec k)\cdot\vec r}\, V_{\rm eff}(\vec r)\, d^3\vec r = \tilde V_{\rm eff}(\vec k - \vec k') \]
are the Fourier components of the potential (and N is the appropriate normalization
according to the convention discussed in footnote 10).
Until this point, no mention has been made of any lattice symmetry, and indeed
Eq. (294) is the equivalent formulation of Schrödinger equation in Fourier space for
a generic potential $V_{\rm eff}$. This formulation is generally not advantageous, since the
matrix indexes $\vec k$ of the eigenvalue problem (294) are continuous quantities, taking
infinitely many values, exactly like $\vec r$ in the real-space equation. For a periodic
potential however, as observed in the general discussion around Eq. (278), the Fourier
expansion of the periodic potential is a discrete Fourier series over the reciprocal
lattice, i.e. $\tilde V_{\rm eff}(\vec k - \vec k')$ is nonzero only for $\vec k - \vec k' = \vec G$, a vector of the reciprocal
lattice. This means that in the continuous-indexed energy matrix of Eq. (294) most
off-diagonal matrix elements vanish. In practice, given any $\vec k'$ in the first Brillouin
zone, the off-diagonal potential matrix elements connect the plane wave $|\vec k'\rangle$ only
to plane waves $|\vec k\rangle$, whose $\vec k$ is displaced by a reciprocal lattice vector: $\vec k = \vec k' + \vec G$.
10 In an infinite space, the standard normalization is $\langle\vec r|\vec k\rangle = (2\pi)^{-3/2} \exp(i\vec k\cdot\vec r)$. In a finite
volume V, the correct normalization is $\langle\vec r|\vec k\rangle = V^{-1/2} \exp(i\vec k\cdot\vec r)$.
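One can then consider separately each subset of states originated from a given $\vec k$.
Only the corresponding matrix sub-blocks need to be diagonalized. In one of these
sub-blocks, the matrix form of Eq. (294) is:
\[
\begin{pmatrix}
\ddots & \vdots & \vdots & \vdots & \\
\cdots & T(\vec k+\vec G_1) + \tilde V_{\rm eff}(\vec 0) & \tilde V_{\rm eff}(\vec G_1 - \vec G_2) & \tilde V_{\rm eff}(\vec G_1 - \vec G_3) & \cdots \\
\cdots & \tilde V_{\rm eff}(\vec G_2 - \vec G_1) & T(\vec k+\vec G_2) + \tilde V_{\rm eff}(\vec 0) & \tilde V_{\rm eff}(\vec G_2 - \vec G_3) & \cdots \\
\cdots & \tilde V_{\rm eff}(\vec G_3 - \vec G_1) & \tilde V_{\rm eff}(\vec G_3 - \vec G_2) & T(\vec k+\vec G_3) + \tilde V_{\rm eff}(\vec 0) & \cdots \\
 & \vdots & \vdots & \vdots & \ddots
\end{pmatrix}
\cdot
\begin{pmatrix} \vdots \\ b_{\vec k+\vec G_1} \\ b_{\vec k+\vec G_2} \\ b_{\vec k+\vec G_3} \\ \vdots \end{pmatrix}
= E_{\vec k}
\begin{pmatrix} \vdots \\ b_{\vec k+\vec G_1} \\ b_{\vec k+\vec G_2} \\ b_{\vec k+\vec G_3} \\ \vdots \end{pmatrix} ,
\tag{295}
\]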
where $T(\vec k) = \frac{\hbar^2 k^2}{2m_e}$. For fixed $\vec k$ in the first Brillouin zone, this matrix must be diagonalized. Diagonalization becomes trivial whenever all off-diagonal matrix elements
are identically null, i.e. for a constant potential ($\tilde V_{\rm eff}(\vec G \neq \vec 0) = 0$): the eigenvalues in
the solid are then simply the free-particle energies shifted by the constant potential
$E_{\vec k} = \frac{\hbar^2 k^2}{2m_e} + \tilde V_{\rm eff}(\vec 0)$, and the eigenstates coincide with the original plane waves $|\vec k\rangle$.
However, in any realistic solid, many $\vec G \neq \vec 0$ Fourier components of $V_{\rm eff}$ are nonzero:
these generate off-diagonal couplings among plane waves. The exact eigenstates
of the problem are linear combinations of the plane waves differing by $\vec G$ vectors,
obtained by the diagonalization of the full matrix in Eq. (295). As the $\vec G$-points
are infinite, this is again an infinite matrix, but one can cut the basis restricting
to a finite number d of relevant states, i.e. of $\vec G$-points,
and then diagonalize a finite version of Eq. (295) numerically. This method is routinely used in standard
bandstructure calculations.
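A truncated version of Eq. (295) is easy to diagonalize numerically in 1D. The sketch below is an illustration under stated assumptions, none of which come from these notes: units with ħ²/(2m_e) = 1, a model potential V(x) = 2U cos(2πx/a) whose only nonzero Fourier components are Ṽ(±2π/a) = U, and a basis of 2 n_max + 1 plane waves. It relies on NumPy's `numpy.linalg.eigvalsh` for the diagonalization.

```python
import numpy as np

def plane_wave_bands(k, U, a=1.0, n_max=5):
    """Eigenvalues of a truncated 1D version of Eq. (295).

    Assumptions (not from the notes): hbar^2 / (2 m_e) = 1 and
    V(x) = 2 U cos(2 pi x / a), so that V(+G) = V(-G) = U with G = 2 pi / a
    and V(0) = 0. The basis is |k + n G> for n = -n_max, ..., n_max.
    """
    G = 2.0 * np.pi / a
    n = np.arange(-n_max, n_max + 1)
    H = np.diag((k + n * G) ** 2)                              # kinetic terms T(k + nG)
    H = H + U * (np.eye(n.size, k=1) + np.eye(n.size, k=-1))   # couplings between neighbors
    return np.linalg.eigvalsh(H)                               # band energies, ascending

if __name__ == "__main__":
    a, U = 1.0, 0.2
    edge = plane_wave_bands(np.pi / a, U, a)
    print("gap at the zone boundary:", edge[1] - edge[0], "to compare with 2|U| =", 2 * abs(U))
```

Scanning k across the first Brillouin zone and plotting the lowest few eigenvalues reproduces the familiar picture of bands separated by gaps; increasing n_max until the lowest eigenvalues stop changing is a simple convergence check for the truncation.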
Whenever the potential is “weak” (small $|\tilde V_{\rm eff}(\vec G \neq \vec 0)|$), analytical information
can be extracted out of (295). More precisely, whenever the off-diagonal elements
$\tilde V_{\rm eff}(\vec G_2 - \vec G_1)$ are small compared to the diagonal separation $|T(\vec k+\vec G_1) - T(\vec k+\vec G_2)|$
of the coupled states, the off-diagonal term acts as a small perturbation, thus it
“perturbs” the diagonal energy only weakly. As a result, if all couplings $\tilde V_{\rm eff}(\vec G - \vec G_1)$
of a state $|\vec k + \vec G_1\rangle$ to all other plane-wave states are small with respect to their
diagonal energy separation, we can safely assume that the exact band energy shall
[Plot for Figure 4.50: T(k) in eV (vertical axis, from 0 to 30) versus k (horizontal axis, from -2π/a to 2π/a).]
Figure 4.50. A graphical solution of the equation $T(\vec k+\vec G_1) = T(\vec k+\vec G_2)$. The special $\vec k$-points associated to plane waves with degenerate
kinetic energies are obtained by translating the free-electron parabolic
dispersion to all possible reciprocal lattice G-vectors (here only two
are indicated for simplicity). In this figure, a = 2.1 Å, and $G_1 = 0$,
$G_2 = 2\pi/a$.
not differ much from the diagonal energy
\[ E_{\vec k} \approx \frac{\hbar^2 |\vec k|^2}{2m_e} + \tilde V_{\rm eff}(\vec 0) . \tag{296} \]
Even in the favorable case of tiny Fourier components of $V_{\rm eff}$, the condition
\[ \left| \tilde V_{\rm eff}(\vec G_1 - \vec G_2) \right| \ll \left| T(\vec k+\vec G_1) - T(\vec k+\vec G_2) \right| \tag{297} \]
is not verified for a few special $\vec k$ values, those such that the kinetic terms are
degenerate or nearly so: $T(\vec k + \vec G_1) \simeq T(\vec k + \vec G_2)$. At these points (highlighted in
Fig. 4.50), the off-diagonal term $\tilde V_{\rm eff}(\vec G_1 - \vec G_2)$ becomes dominating, and it displaces
the actual band significantly away from the free-electron parabola Eq. (296). Near
the degeneracy of two states only, approximate band energies can be calculated by
diagonalizing the 2 × 2 matrix
\[ \begin{pmatrix} T(\vec k+\vec G_1) + \tilde V_{\rm eff}(\vec 0) & \tilde V_{\rm eff}(\vec G_1 - \vec G_2) \\ \tilde V_{\rm eff}(\vec G_2 - \vec G_1) & T(\vec k+\vec G_2) + \tilde V_{\rm eff}(\vec 0) \end{pmatrix} . \]
This is similar to Eq. (130), but complex hermitian rather than real. The eigenenergies (131), with $T(\vec k + \vec G_1) + \tilde V_{\rm eff}(\vec 0)$ in place of $E_L$, $T(\vec k + \vec G_2) + \tilde V_{\rm eff}(\vec 0)$ in place of
$E_R$, and $|\tilde V_{\rm eff}(\vec G_1 - \vec G_2)|$ in place of ∆, also solve the 2 × 2 problem at hand. As
these eigenenergies (131) show, the off-diagonal element produces
a characteristic “repulsion” of the levels, which never get any closer than $2\,|\tilde V_{\rm eff}(\vec G_1 - \vec G_2)|$.
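For reference, the eigenvalues of this 2 × 2 hermitian matrix can be written in closed form. The shorthand $T_1 = T(\vec k+\vec G_1)$, $T_2 = T(\vec k+\vec G_2)$, $V_0 = \tilde V_{\rm eff}(\vec 0)$ and $V_{12} = \tilde V_{\rm eff}(\vec G_1 - \vec G_2)$ is introduced here only for compactness:
\[ E_\pm = \frac{T_1 + T_2}{2} + V_0 \pm \sqrt{ \left( \frac{T_1 - T_2}{2} \right)^2 + |V_{12}|^2 } . \]
At exact degeneracy, $T_1 = T_2$, the square root reduces to $|V_{12}|$ and the splitting $E_+ - E_- = 2|V_{12}|$ takes its minimum value. Far from degeneracy, $|V_{12}| \ll |T_1 - T_2|$, the two eigenvalues return to the diagonal energies $T_{1,2} + V_0$, up to corrections of order $|V_{12}|^2 / |T_1 - T_2|$.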
This model illustrates the tendency of the periodic components of the potential
to open forbidden energy intervals in the otherwise uninterrupted parabolic free-electron dispersion. As illustrated in Fig. 4.51, in 1D, gaps are guaranteed to open
at all $\vec G/2$ points (unless some Fourier component $|\tilde V_{\rm eff}(\vec G)|$ of the potential happens to
vanish). In several dimensions, degeneracies of the kinetic term occur for all $\vec k$, such
that $|\vec k + \vec G| = |\vec k|$, which is the condition (286) for Bragg scattering. This may open
a band gap in some direction in $\vec k$-space, but this does not always generate a true
gap, i.e. a range of forbidden energy, since those same energies may well be allowed
in some other $\vec k$-direction (see Fig. 4.55 below).
Both the free-electron starting point described here and the tight-binding method
sketched above lead to single-electron spectra characterized by bands of allowed energy separated by gaps of forbidden energy, in accord with Bloch’s theorem. These
models allow us to clarify the physical meaning of both quantum numbers of electronic states in solids (see Eq. (293) and Fig. 4.43): $\vec k$ (illustrated in Fig. 4.45), and
the band index j. In a tight-binding context, the band index contains indications
about the atomic nature of the state (which is a clear-cut concept especially for
the narrow bands of inner shells, see Fig. 4.49). Instead, in the quasi-free-electron
approach (especially efficient for the wide bands related to the empty atomic levels)
the band index represents basically the number of times that the $\vec k \neq \vec 0$ Fourier
components of the potential have reflected the free-electron parabola back inside
the first Brillouin zone.
4.2.2. Filling of the bands: metals and insulators. The T = 0 state
(ground state) of a system of many independent electrons in a periodic potential
is obtained by filling the one-electron band states up to a Fermi energy, like in
the free-fermion model described in Sec. 3.2.2.1. The Fermi energy separates filled
(below) from empty (above) levels. Depending on the total number of electrons in
the solid, the Fermi energy may end up either inside one (several) energy band(s),
or within a band gap. In the first case, the electrons in the partly filled band(s)
close to the Fermi energy are ready to take up excitation energy, for example by
external fields. In particular, an arbitrarily weak applied electric field can accelerate electrons, which can then conduct electric current. Such a solid is a metal. In
contrast, all electrons in completely filled bands are “frozen” by Pauli’s principle.
Any dynamical response of these electrons requires excitation across some energy
Figure 4.51. (a) The free-electron dispersion E versus k – a
parabola in 1D. (b) A G-translated free-electron parabola crosses the
original free-electron band at $\frac{1}{2} G$: in the neighborhood of this crossing
a weak periodic potential component produces the largest distortion.
(c) A gap of width $2|U_G| = 2|\tilde V_{\rm eff}(G)|$ opens at the degeneracy point
$k = \frac{1}{2} G$: the band distorts in a whole neighborhood around there. (d)
The portions of the band originated from the starting free-electron
dispersion. (e) The effects of other G components of the potential on
the same free-electron parabola. This particular way of representing
the bands is known as extended-zone scheme. (f) Same bands reported
inside the first Brillouin zone (reduced-zone scheme). (g) Same bands,
in a repeated-zone scheme.
Figure 4.52. Two qualitatively different possibilities for band filling
at T = 0: a partly filled band (the solid is a metal), and a completely
filled band (with all bands either full or empty and the Fermi energy
in a gap the solid is an insulator).
Figure 4.53. Qualitative bands of a pure semiconductor, compared
to the Fermi distribution.
gap. A solid where all bands are either filled or empty, with the Fermi energy inside
a gap is an insulator. In solids, the highest completely filled band is called valence
band, and the first empty (or partly filled) band is the conduction band.
This basic difference affects substantially all properties of these two classes of materials, even at finite temperature. The Fermi-Dirac distribution (238) applies to
independent electrons at equilibrium in a solid pretty much like in a free gas, simply
replacing the free-electron energies and plane-wave states with the band energies and
Bloch states. Thus, a nonzero temperature in a metal generates a finite concentration of electrons above the chemical potential and holes below it, similarly to what
happens around the Fermi sphere of a free-electron gas. Thus the thermodynamics
[Plots for Figure 4.54: panels (a) Ne and (b) Na; energy E versus k at the points −π/a, −2π/3a, −π/3a, 0, π/3a, 2π/3a, π/a, with bands labeled 1s, 2s, 2p, 3s, 3p and the Fermi energy εF marked.]
Figure 4.54. The filling of the bands of a symbolic Ne (a) and Na
(b) 1D crystal composed of $N_n = 6$ atoms. According to Eq. (298), the
allowed k-points in the first Brillouin zone are $k = 0$, $\pm\frac{1}{6}\frac{2\pi}{a}$, $\pm\frac{2}{6}\frac{2\pi}{a}$,
and $\frac{3}{6}\frac{2\pi}{a}$. The $-\frac{\pi}{a}$ point must be excluded, since it is not different from
$\frac{\pi}{a}$. Individual bands are assumed not to overlap. Energies are purely
qualitative and not to scale.
of electrons in metals is interesting and rich in physical consequences, including a
characteristic T-linear contribution to the heat capacity of the solid, as discussed
in Sec. 3.2.2.1. Instead, in an insulator of gap ∆ between conduction and valence
band, the average occupancy of a valence-band state is $[n_v] \simeq 1 - \exp[-\Delta/(2k_B T)]$,
extremely close to 1 at low temperature, and correspondingly the average occupancy
of a conduction-band state is $[n_c] \simeq \exp[-\Delta/(2k_B T)]$, extremely small at low temperature. Accordingly, for temperatures much smaller than $\Delta/k_B$, the electrons
of an insulator can be considered to all effects as frozen in the filled bands, their
excitation being accessible only by means of high-energy spectroscopies. For a wide-gap insulator (e.g. Al2O3 ∆ ≃ 5 eV, C diamond ∆ = 5.5 eV, SiO2 ∆ = 8.0 eV,
NaCl ∆ = 8.97 eV), any practical temperature is by far smaller than $\Delta/k_B$, and the
band-state occupancies are indistinguishable from those at T = 0. For example,
with a 4 eV gap at room temperature ($k_B T \approx 0.025$ eV), $[n_c] \approx e^{-80} \simeq 10^{-35}$.
However, small-gap insulators, usually called semiconductors (e.g. Ge ∆ = 0.75 eV,
Si ∆ = 1.17 eV, GaAs ∆ = 1.51 eV), show measurable conduction associated to
thermal electronic excitations across the gap, even at room temperature, as sketched
in Fig. 4.53, and discussed in Sec. 4.2.2.2.
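The orders of magnitude quoted above are easy to reproduce. The short sketch below evaluates the estimate [n_c] ≃ exp[−∆/(2 k_B T)] for some of the gaps listed in the text; the function name and the choice of T = 300 K as "room temperature" are illustrative assumptions.

```python
import math

K_B_EV_PER_K = 8.617333262e-5   # Boltzmann constant in eV/K

def conduction_occupancy(gap_ev, temperature_k):
    """Average occupancy of a conduction-band state, exp[-gap / (2 k_B T)]."""
    return math.exp(-gap_ev / (2.0 * K_B_EV_PER_K * temperature_k))

# Gaps (in eV) quoted in the text.
GAPS_EV = {"Ge": 0.75, "Si": 1.17, "GaAs": 1.51, "C (diamond)": 5.5, "SiO2": 8.0}

if __name__ == "__main__":
    for name, gap in GAPS_EV.items():
        occupancy = conduction_occupancy(gap, 300.0)
        print(f"{name:12s} gap = {gap:4.2f} eV   [n_c] ~ {occupancy:.1e}")
```

The many orders of magnitude separating the semiconductor values from the wide-gap ones are the quantitative content of the distinction drawn in the text.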
An infinite crystal contains an infinite number of cells and an infinite number of
electrons: to establish the expected position of the Fermi energy, a counting rule is
needed. Consider a macroscopic portion of the solid of volume $V = N_{n1} N_{n2} N_{n3} V_c$,
extending for $N_{ni}$ lattice repetitions in the $\vec a_1$, $\vec a_2$, $\vec a_3$ primitive directions. Apply
periodic boundary conditions to this portion of solid, so that discrete translational
invariance is preserved. To satisfy these periodic boundary conditions $\psi_{\vec k j}(\vec r) = \psi_{\vec k j}(\vec r + N_{ni}\vec a_i)$, the $\vec k$ label of the Bloch states is restricted to
\[ \vec k = \frac{n_1}{N_{n1}}\vec b_1 + \frac{n_2}{N_{n2}}\vec b_2 + \frac{n_3}{N_{n3}}\vec b_3 , \quad\text{with}\quad n_j = -\frac{N_{nj}}{2}+1,\ -\frac{N_{nj}}{2}+2,\ \dots,\ \frac{N_{nj}}{2}-2,\ \frac{N_{nj}}{2}-1,\ \frac{N_{nj}}{2} . \tag{298} \]
These values of $\vec k$ are the lattice equivalent to those of Eq. (180). These $N_n = N_{n1} N_{n2} N_{n3}$ discrete $\vec k$ values become dense and fill the primitive unit cell of the
reciprocal lattice as $N_{nj} \to \infty$ and the infinite Bravais direct lattice is recovered (see
Figs. 4.46 – 4.48). The number of electrons of this finite lattice portion equals $N_n$
times the number $n_{\rm cell}$ of electrons of each unit cell. Each band (orbital) state has
room for 2 electrons, one for each spin state, ↑ and ↓. If the bands are all disjoint,
one on top of another, the $N_n n_{\rm cell}$ electrons fill the $2N_n$ spin-orbital states of $\frac{1}{2} n_{\rm cell}$
bands, as illustrated in Fig. 4.54. An even value of $n_{\rm cell}$ leads to $\frac{1}{2} n_{\rm cell}$ full bands,
followed by empty bands above. For example, each atom of solid Ne (fcc, one atom
per cell) carries $n_{\rm cell} = 10$ electrons to the bands, for a total of $10N_n$ electrons in the
crystal. In a tight-binding language, $2N_n$ electrons fill the 1s band, $2N_n$ electrons
fill the 2s band, $6N_n$ electrons fill completely the 2p bands, for a total of 5 filled
bands. The Fermi energy then lies in the gap between the filled 2p band and the
empty 3s-3p bands, and the crystal is an insulator. Odd $n_{\rm cell}$ leads to $\frac{1}{2}(n_{\rm cell} - 1)$
filled bands, plus a half-filled band. For example, the 11 electrons that each Na
atom puts in the band states of its bcc crystal (one atom per cell) fill completely
the 1s, 2s, 2p bands, and fill only the $\frac{1}{2} N_n$ lowest orbital states of the 3s band. This
3s band is then half filled, the Fermi energy cuts through it: the solid is a metal.
Although it is true that any band crystal with an odd number of electrons per cell
$n_{\rm cell}$ is a metal,11 for even $n_{\rm cell}$ both insulators (like Ne) and metals are possible,
since there is no guarantee that the $(\frac{1}{2} n_{\rm cell})$-th and the $(\frac{1}{2} n_{\rm cell} + 1)$-th bands are
separated by a gap, as illustrated in Fig. 4.55. The alkaline-earth (IIA) and end-of-transition (IIB: Zn, Cd, and Hg) solids are all metals, with even $n_{\rm cell}$, precisely due
to overlapping bands at the Fermi energy.
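The counting rule can be condensed into a few lines of code. The sketch below is restricted to 1D and to the idealized case of non-overlapping bands (so it cannot reproduce the metallic behavior of solids with even n_cell); the function names are illustrative and not part of the notes.

```python
from fractions import Fraction

def allowed_k_fractions(n_sites):
    """Allowed k of Eq. (298) in 1D, in units of 2*pi/a, for an even n_sites."""
    half = n_sites // 2
    return [Fraction(n, n_sites) for n in range(-half + 1, half + 1)]

def band_filling(n_cell):
    """Return (number of completely filled bands, True if a half-filled band follows)."""
    return n_cell // 2, n_cell % 2 == 1

def classify(n_cell):
    """Metal or insulator, assuming that successive bands never overlap."""
    _, half_filled = band_filling(n_cell)
    return "metal" if half_filled else "insulator"

if __name__ == "__main__":
    print([str(f) for f in allowed_k_fractions(6)])
    for name, n_cell in (("Ne", 10), ("Na", 11)):
        print(name, band_filling(n_cell), classify(n_cell))
```

For N_n = 6 the first function returns the six k-points of Fig. 4.54, and the Ne and Na cases reproduce the five filled bands discussed above, with an extra half-filled band for Na.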
4.2.2.1. Metals. The most characteristic macroscopic feature of a metal is its
ability to conduct electric current at low temperature. In fact, all solids, even
insulators, show some measurable conductivity associated to impurities and thermal
excitations. What really characterizes the metallic state is a conductivity which
11 When the independent-electron approximation breaks down, as in strongly correlated materials, insulating states occur with odd $n_{\rm cell}$, often accompanied by magnetic order of the spins.
Figure 4.55. In 2D and 3D crystals, the band energy $E_{\vec k j}$ depends on the direction of $\vec k$. It often occurs that, even though a gap is observed in the one-electron band structure for some $\vec k$ direction (and even for all directions), with all directions taken into account the gap disappears, since the energy range forbidden in some direction $k_a$ becomes allowed for a different direction $k_b$, due to band overlap. If the Fermi energy ends up in this region, it crosses several bands, and the solid is a metal, even with an even number of electrons per cell.
decreases as T is increased, while the conductivity of insulators increases as T is increased due to increasing thermal excitations across the gap.
Consider now the prediction of band theory for the conduction of electrons in a crystal: the expected behavior is very peculiar, in sharp disagreement with observation. To describe the motion of electrons in the field of the periodic lattice plus the applied external field, a semiclassical approach is useful. As is common in quantum mechanics, an electron is represented by a wave packet, i.e. a superposition of Bloch states in a single band j, characterized by a rather sharp wave-number distribution, and thus by a large spatial extension, much larger than the crystal lattice spacing (Fig. 4.56). The equations governing the motion of the center of mass $\vec r$ of this wave packet are assumed to be:
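$$ \frac{d}{dt}\vec r = \vec v_j(\vec k) = \frac{1}{\hbar}\,\vec\nabla_{\vec k} E_{\vec k j} \tag{299} $$
$$ \hbar\,\frac{d}{dt}\vec k = -q_e\left[\vec E(\vec r,t) + \vec v_j(\vec k)\times\vec H(\vec r,t)\right] . \tag{300} $$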
Figure 4.56. Schematic picture of the electron dynamics within the semiclassical model. The length over which applied fields (dashed line) vary is much greater than the spread in the electron wave packet (solid line), which, in turn, is much larger than the lattice constant.
The external perturbing electric $\vec E(\vec r,t)$ and magnetic $\vec H(\vec r,t)$ fields12 are supposed to vary slowly on the scale of the wave-packet size (Fig. 4.56). A rigorous derivation of equations (299) and (300) goes far beyond the scope of the present course: we only propose a few heuristic arguments to support their plausibility.
• The center-of-mass velocity of the wave packet is the group velocity associated with the dispersion $E_{\vec k j}$ of the Bloch waves. Equation (299) states the basic fact of wave mechanics that a wave packet with dispersion $\omega(\vec k) = \hbar^{-1} E_{\vec k j}$ moves with its group velocity $\vec\nabla_{\vec k}\,\omega(\vec k)$. In the special case of a free electron, $E_{\vec k} = \frac{\hbar^2 k^2}{2m_e}$, this yields the usual relation $\vec v(\vec k) = \frac{\hbar\vec k}{m_e}$ between velocity and momentum.
• If the force associated with an external electric potential $\phi_{\rm ext}(\vec r,t)$ acts on the band electron, then its total energy $\left[E_{\vec k j} - q_e\phi_{\rm ext}(\vec r,t)\right]$ should remain conserved along the semiclassical motion. To verify this, we differentiate this total
12 The magnetic field is assumed to be measured in the same units (T) as the magnetic induction field $\vec B$.
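energy with respect to time (notation: $\frac{df}{dt} \equiv \dot f$). Equation (299) gives
$$ \frac{d}{dt}\left[E_{\vec k j} - q_e\phi_{\rm ext}(\vec r,t)\right] = \vec\nabla_{\vec k}E_{\vec k j}\cdot\dot{\vec k} - q_e\,\vec\nabla_{\vec r}\phi_{\rm ext}(\vec r,t)\cdot\dot{\vec r} = \frac{\vec\nabla_{\vec k}E_{\vec k j}}{\hbar}\cdot\left[\hbar\dot{\vec k} + q_e\vec E(\vec r,t)\right] = \vec v_j(\vec k)\cdot\left[\hbar\dot{\vec k} + q_e\vec E(\vec r,t)\right] . $$
If Eq. (300) is satisfied, then this derivative indeed vanishes, as the vector $\hbar\dot{\vec k} + q_e\vec E = -q_e\,\vec v_j\times\vec H$ is perpendicular to the velocity $\vec v_j$. Total energy is thus conserved.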
• Equation (300) resembles the classical equation of motion of a particle of charge $-q_e$ moving under the action of the external electromagnetic fields $\vec E(\vec r)$ and $\vec H(\vec r)$ only. The periodic forces produced by the crystal act only through the band dispersion $E_{\vec k j}$, generating the nontrivial $\vec k$-dependence of the velocity of Eq. (299), in place of the free-electron $\vec v(\vec k) = \frac{\hbar\vec k}{m_e}$. This means that in the crystal $\hbar\vec k$ does not equal the electron momentum, as it does for a free electron. It is rather called crystal momentum.
• The semiclassical equations assume that the external fields induce no inter-band transition. For weak fields, inter-band transitions are indeed exceedingly rare, while strong fields make the semiclassical approximation fail and lead to electric or magnetic breakdown. Also, if the fields are not static, their oscillation frequency ω must remain well below the inter-band gaps ($\hbar\omega \ll \Delta$). Rapidly varying fields are applied in spectroscopy precisely to induce inter-band transitions.
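The energy-conservation argument given in the second point can also be checked numerically. The sketch below integrates Eqs. (299) and (300) with a fourth-order Runge-Kutta step in units $\hbar = q_e = a = 1$. The 2D tight-binding dispersion, the field values and all function names are assumptions made for illustration only; they are not taken from these notes.

```python
import numpy as np

# Units with hbar = q_e = a = 1; all parameter values are illustrative assumptions.
T_HOP = 1.0                       # hopping energy of the model band
E_FIELD = np.array([0.05, 0.0])   # uniform electric field (x, y components)
H_FIELD = 0.3                     # uniform magnetic field along z


def band_energy(k):
    """Assumed 2D tight-binding dispersion E(k) = -2t (cos kx + cos ky)."""
    return -2.0 * T_HOP * (np.cos(k[0]) + np.cos(k[1]))


def velocity(k):
    """Eq. (299): v = grad_k E (with hbar = 1)."""
    return 2.0 * T_HOP * np.sin(k)


def rhs(state):
    """Time derivative of state = (x, y, kx, ky) from Eqs. (299) and (300)."""
    v = velocity(state[2:])
    v_cross_h = np.array([v[1] * H_FIELD, -v[0] * H_FIELD])  # v x (H z_hat)
    k_dot = -(E_FIELD + v_cross_h)
    return np.concatenate([v, k_dot])


def total_energy(state):
    """Band energy minus q_e*phi_ext, with phi_ext = -E.r for a uniform field."""
    return band_energy(state[2:]) + E_FIELD @ state[:2]


def rk4_step(state, dt):
    k1 = rhs(state)
    k2 = rhs(state + 0.5 * dt * k1)
    k3 = rhs(state + 0.5 * dt * k2)
    k4 = rhs(state + dt * k3)
    return state + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0


def energy_drift(state0, dt=1e-3, n_steps=5000):
    """Largest deviation of the total energy from its initial value."""
    state = np.array(state0, dtype=float)
    e0 = total_energy(state)
    drift = 0.0
    for _ in range(n_steps):
        state = rk4_step(state, dt)
        drift = max(drift, abs(total_energy(state) - e0))
    return drift


if __name__ == "__main__":
    print("max energy drift:", energy_drift([0.0, 0.0, 0.4, 1.1]))
```

The drift stays at the level of the integration error, although both the band energy and the potential energy vary along the trajectory.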
The semiclassical equations confirm that the electrons in a completely filled band contribute neither to the electric current nor to the heat current. The electric current carried by a wave packet representing an electron is $(-q_e)\vec v_j(\vec k)$. The total electric current density carried by a filled band j amounts to13
$$ \vec j = \int_{\rm BZ} (-q_e)\,\vec v_j(\vec k)\,\frac{2}{(2\pi)^3}\,d^3\vec k , \tag{301} $$
with the integration extended over the first Brillouin zone (BZ). This integral vanishes since, due to Eq. (299), the integrand function is the gradient of a periodic
13 Similarly to Eq. (231), the sum over states is turned into an integral by inserting the appropriate density of states. According to Eq. (180), the $\vec k$-density of states is $\frac{V}{(2\pi)^3}$. An extra factor $g_s = 2$ accounts for the spin degeneracy. The total current times the sample length is obtained by summing the charge times the velocity $(-q_e)\vec v_j(\vec k)$ of the individual electrons over a Brillouin zone. Dividing by V, we obtain the current density (charge per unit area and time).
function ($E_{\vec k j}$) over a unit cell of the reciprocal lattice.14 In other terms, in a filled band as many electrons carry current in one direction as in the opposite direction, for a vanishing net current. The same reasoning applies to the energy (heat) current
$$ \vec j_E = \int_{\rm BZ} E_{\vec k j}\,\vec v_j(\vec k)\,\frac{2}{(2\pi)^3}\,d^3\vec k , \tag{302} $$
by noting that the integrand is proportional to the $\vec k$-gradient of $(E_{\vec k j})^2$, which is also a periodic function of $\vec k$. Completely filled bands do not contribute to transport any more than completely empty bands. All electric and thermal conductivity is to be attributed to partly filled bands. This explains why no systematic increase of conductivity is observed in the metallic elements for increasing Z (for example the conductivities of fcc Cu, Ag, and Au are very similar), despite the largely different total number of electrons.
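The vanishing of the integrals (301) and (302) for a filled band can be verified numerically in 1D. The sketch below uses an arbitrary periodic dispersion (an assumed trigonometric polynomial, chosen only for illustration, in units $\hbar = 1$) and sums the velocity and the energy-weighted velocity over a uniform k grid covering the whole BZ; constant prefactors are omitted.

```python
import numpy as np


def filled_band_currents(n_k=2000, a=1.0):
    """Sum v and E*v over a full BZ for an illustrative periodic 1D band.

    Returns quantities proportional to Eqs. (301) and (302) with hbar = 1.
    """
    k = np.linspace(-np.pi / a, np.pi / a, n_k, endpoint=False)
    # Assumed dispersion: any function of k with period 2*pi/a would do.
    energy = -2.0 * np.cos(k * a) - 0.5 * np.cos(2 * k * a) + 0.3 * np.sin(k * a)
    # Analytic derivative dE/dk, i.e. the group velocity of Eq. (299).
    velocity = (2.0 * a * np.sin(k * a) + 1.0 * a * np.sin(2 * k * a)
                + 0.3 * a * np.cos(k * a))
    dk = 2.0 * np.pi / (a * n_k)
    charge_current = np.sum(velocity) * dk          # proportional to Eq. (301)
    heat_current = np.sum(energy * velocity) * dk   # proportional to Eq. (302)
    return charge_current, heat_current


if __name__ == "__main__":
    j, j_e = filled_band_currents()
    print(f"charge current ~ {j:.2e}, heat current ~ {j_e:.2e}")
```

Both sums vanish to rounding accuracy, because each integrand is the k-derivative of a periodic function.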
On the other hand, Bloch states are stationary states of the Schrödinger equation in the perfect lattice: if a wave packet of Bloch states representing an electron has a finite mean velocity (as happens unless by chance $\vec\nabla_{\vec k}E_{\vec k j} = \vec 0$), then that velocity shall persist forever. Thus, even in the absence of external electric fields, persistent currents should be observed in metals, but they are not. Furthermore, as illustrated in Fig. 4.57 for the simple 1D case, the semiclassical motion following Eq. (300) under the action of a constant field cycles the k-point across the whole BZ. Correspondingly, Eq. (299) gives an oscillating velocity, positive and negative for equal amounts of time. According to this model, a DC electric field should induce an AC current (Bloch oscillations) in a metal wire! This prediction is an artifact neither of 1D nor of the semiclassical approximation: Ohm's law15 $\vec j = \sigma\vec E$ is not consistent with the main model ingredient, Bloch electrons in an ideal lattice, taken alone. The trouble is that Bloch electrons are capable of traveling through a perfect lattice without friction, thanks to coherent interference of scattered waves.
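The oscillating response to a constant field can be made concrete with a short numerical sketch. It assumes a 1D tight-binding band $E_k = -2t\cos(ka)$, which is an illustrative choice rather than the band of Fig. 4.57, and works in units $\hbar = q_e = a = 1$; the function name and parameter values are hypothetical.

```python
import numpy as np


def bloch_oscillation(field=0.1, t_hop=1.0, n_steps=20000):
    """Collision-free semiclassical motion in a constant electric field.

    Units hbar = q_e = a = 1; the band E_k = -2 t cos k is an assumption.
    Returns time, velocity and position arrays over one full sweep of the BZ.
    """
    period = 2.0 * np.pi / field          # time needed for k to cross the BZ
    time = np.linspace(0.0, period, n_steps + 1)
    k = -field * time                     # Eq. (300) with H = 0, starting at k = 0
    v = 2.0 * t_hop * np.sin(k)           # Eq. (299): v = dE/dk
    dt = time[1] - time[0]
    # Cumulative trapezoidal integral of v(t) gives the position x(t).
    x = np.concatenate([[0.0], np.cumsum(0.5 * (v[1:] + v[:-1]) * dt)])
    return time, v, x


if __name__ == "__main__":
    time, v, x = bloch_oscillation()
    print("net displacement after one period:", x[-1])
    print("peak-to-peak excursion:", x.max() - x.min())
```

In this model the wave packet returns to its starting point after each period, with a peak-to-peak excursion of $4t/(q_e|\vec E|)$: a static field produces no net transport.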
The missing ingredient here is collisions. Any real crystal is not ideal since
14 In general, if $f(\vec r)$ is periodic with the periodicity of some Bravais lattice, the result of its integration over a primitive unit cell is independent of a translation of this cell by any $\vec r\,'$. Therefore, $0 = \vec\nabla_{\vec r\,'}\int_{\rm cell} f(\vec r + \vec r\,')\,d^3\vec r = \int_{\rm cell}\vec\nabla_{\vec r\,'} f(\vec r + \vec r\,')\,d^3\vec r = \int_{\rm cell}\vec\nabla_{\vec r} f(\vec r + \vec r\,')\,d^3\vec r$. In particular, taking $\vec r\,' = \vec 0$, we have $\int_{\rm cell}\vec\nabla_{\vec r} f(\vec r)\,d^3\vec r = \vec 0$. Of course, the same holds in $\vec k$-space for functions, such as the band energies $E_{\vec k j}$, which are periodic with the periodicity of the reciprocal lattice.
15 The current I in a wire is directly proportional to the applied potential drop: $I = R^{-1}V$. The coefficient of proportionality R depends on the length L and cross section A of the wire, but not on the current or potential drop. In terms of the current density $\vec j$ crossing perpendicularly the surface area A, $I = |\vec j|A$. In terms of the electric field $\vec E$, the potential drop is $V = L|\vec E|$. Thus Ohm's law can be rewritten as $\vec j = \frac{L}{A}R^{-1}\vec E$, where the conductivity $\sigma = \frac{L}{A}R^{-1}$ is a characteristic property of the material, and the resistivity is $\rho = \sigma^{-1}$.
Figure 4.57. (a) Electronic motion in the first BZ of a 1D lattice. Under the action of a leftward external force (uniform constant rightward electric field $\vec E$), the wave number k drifts at constant speed: at the BZ boundary it folds back $k = -\pi/a \to \pi/a$. Correspondingly, the band energy of the electron oscillates. The semiclassical-wavepacket (b) velocity, Eq. (299), and (c) acceleration, Eq. (309), are proportional to the first and second derivative of the band energy, respectively. [Plot panels, all versus ka from −3 to 3: (a) band energy $E_k$ (bands j = 1 and j = 2); (b) $dE_k/dk$, electron velocity; (c) $d^2E_k/dk^2$, electron acceleration.]
(1) it contains imperfections, as discussed in Sec. 4.1;
(2) its nuclei are not frozen at their equilibrium positions but actually vibrate
around them, as we shall discuss in Sec. 4.3.
Both these discrepancies from the ideal lattice picture are sources of collisions for
conduction electrons.16 To represent the effect of collisions as simply as possible, we
shall assume that:
(1) Each electron moves, between successive collisions, according to the semiclassical equations of motion (299), (300).
(2) Each electron experiences an instantaneous collision at random, with probability $\tau^{-1}$ per unit time. The time τ is variously known as the relaxation time, the collision time or the mean free time. Its physical interpretation is the following: any electron travels, on average, for a time τ before experiencing its next collision.
(3) After a collision, the electron emerges with a $\vec k$ whose direction is perfectly random, whose modulus reflects the (Fermi) distribution at the appropriate local temperature, and in such a way as to respect Pauli's principle. All memory of the initial $\vec k$ (i.e. velocity) is lost.
As a result of collisions, the application of an external electric field does not produce free acceleration of the electrons, thus no Bloch oscillations are observed. On the contrary, the external field produces only a weak perturbation to the thermal equilibrium distribution. Basically, collisions act mostly close to the Fermi surface (the surface in $\vec k$ space separating full and empty states at T = 0, analogous to the Fermi sphere of the free gas). As a DC electric field attempts to shift each occupied $\vec k$ state in the $-q_e\vec E$ direction (Eq. (300)), collisions rapidly transfer electrons from occupied states in the higher-energy region to the emptied region of lower energy, in a tendency to re-establish thermal equilibrium, as illustrated in Fig. 4.58a. After an initial transient, the net effect amounts to a small constant displacement of the filled states with respect to zero field (Fig. 4.58b). This displacement produces a net current density equal to the $\vec k$-space integration of $V^{-1}(-q_e)\vec v_j(\vec k)$ through the region $\delta^3\vec k$ of unbalanced occupancy. This can be roughly estimated in the free-electron parabolic band as
$$ \vec j = \int_{\delta^3\vec k} V^{-1}(-q_e)\,\vec v_j(\vec k)\,\frac{V}{4\pi^3}\,d^3\vec k \simeq \int_{\delta^3\vec k} (-q_e)\,\hat E\,v_F\,d^3\vec k \simeq -q_e\,v_F\,\hat E\,(\delta^3\vec k) , \tag{303} $$
where $\hat E$ is the unit vector along $\vec E$, and numerical factors of order 1 have been ignored. The $\vec k$-space volume $(\delta^3\vec k)$ in between the equilibrium Fermi surface and its field-shifted replica, sketched in Fig. 4.59 for free electrons, can be estimated by observing that in the average time τ between two collisions, each electron changes its wave vector by $\delta\vec k = \hbar^{-1}\tau(-q_e)\vec E$.
16 Some scattering is produced by electron-electron interaction as well, but this is essentially negligible compared to the two other sources of collisions in ordinary metals at ordinary temperature.
Figure 4.58. (a) Band occupancy under the combined effect of an applied constant leftward electric field and collisions with impurities and with instantaneous displacements of the lattice ions (phonons). The external field accelerates the electrons to the right according to Eq. (300). Fast collisions tend to reestablish equilibrium by scattering extra-energetic electrons prevalently into lower-energy states which have been left empty. (b) Under the combined effect of the field and collisions, the Fermi occupancy distribution shifts in $\vec k$ space. This constant shift, here largely exaggerated, is proportional to the applied electric field, and is responsible for electric-current transport. [Plot panels versus k: (a) band energy $E_k$, with the Fermi energy $\epsilon_F$ and the collisions marked; (b) occupancy $[n_k]$, with curves labeled E = 0 and E.]
Dropping again factors of order unity, the volume $(\delta^3\vec k)$ is approximately
$$ (\delta^3\vec k) \simeq -|\delta\vec k|\,k_F^2 \simeq (-q_e)\,\frac{\tau}{\hbar}\,|\vec E|\,k_F^2 , $$
where the minus sign recalls that the shift is opposite to the $\vec E$ field. By substituting this expression and $v_F \simeq \frac{\hbar k_F}{m_e}$ (as if electrons were free) in Eq. (303), we obtain
$$ \vec j \simeq (-q_e)\,\frac{\hbar k_F}{m_e}\,\hat E\,(-q_e)\,\frac{\tau}{\hbar}\,|\vec E|\,k_F^2 \simeq \frac{q_e^2\tau}{m_e}\,k_F^3\,\vec E \simeq \frac{q_e^2\tau}{m_e}\,\frac{N}{V}\,\vec E , \tag{304} $$
where we used the relation between $k_F$ and the electron density, $k_F^3 = 3\pi^2\frac{N}{V}$ (Sec. 3.2.2.1), again dropping factors of order unity. Equation (304) agrees with Ohm's law, and evaluates a conductivity
$$ \sigma \simeq \frac{q_e^2\tau}{m_e}\,\frac{N}{V} . \tag{305} $$
Figure 4.59. The shift $\delta\vec k$ of the Fermi sphere induced by a leftward external electric field. The $\vec k$-space volume $(\delta^3\vec k)$ is responsible for unbalanced electron velocities, thus for a net electric current. [Sketch with axes $k_x$, $k_y$, $k_z$ and labels $\delta k$, $\delta^3 k$.]
element   ρ [nΩm]    ρ [nΩm]    τ [10^-14 s]   τ [10^-14 s]
          at 77 K    at 273 K   at 77 K        at 273 K
Na        8          42         17             3.2
K         14         61         18             4.1
Rb        22         110        14             2.8
Cu        2          16         21             2.7
Ag        3          15         20             4.0
Au        5          20         12             3.0
Mg        6          39         7              1.1
Al        3          25         6.5            0.8
Table 4.2. Measured electrical resistivity $\rho = \sigma^{-1}$ for several elemental metals. The corresponding relaxation times are obtained from Eq. (305) through $\tau = m_e / \left(\rho\,q_e^2\,\frac{N}{V}\right)$, with $\frac{N}{V}$ equaling the density of electrons in the conduction band.
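As a worked instance of the relation in the table caption, the following Python sketch recomputes τ from a resistivity of Table 4.2. The conduction-electron densities are not given in this section: the values used below (about $2.65\times10^{28}$ m$^{-3}$ for Na and $8.47\times10^{28}$ m$^{-3}$ for Cu) are assumed free-electron textbook estimates, and the function name is hypothetical.

```python
M_E = 9.1093837e-31      # electron mass [kg]
Q_E = 1.602176634e-19    # elementary charge [C]


def relaxation_time(resistivity_ohm_m, electron_density_m3):
    """tau = m_e / (rho * q_e^2 * N/V), i.e. Eq. (305) solved for tau."""
    return M_E / (resistivity_ohm_m * Q_E**2 * electron_density_m3)


if __name__ == "__main__":
    # rho at 273 K from Table 4.2; electron densities are ASSUMED values.
    samples = {"Na": (42e-9, 2.65e28), "Cu": (16e-9, 8.47e28)}
    for element, (rho, density) in samples.items():
        tau = relaxation_time(rho, density)
        print(f"{element}: tau = {tau / 1e-14:.1f} x 10^-14 s")
```

With these assumed densities the results are close to the 273 K entries of Table 4.2, which suggests that the tabulated τ values follow from the same one-electron-per-atom counting.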
From the measured resistivity $\rho = \sigma^{-1}$ of a metal and the conduction-electron number density $\frac{N}{V}$, Eq. (305) allows us to estimate the relaxation time τ, as in Table 4.2. The observed trend is of increasing resistivity with temperature, corresponding to a decreasing relaxation time. Collisions become more frequent as thermal motion produces larger displacements of the nuclei from their equilibrium positions,
Figure 4.60. Measured low-temperature resistivity of Na for three samples with different defect concentrations, leading to different low-T residual resistivities (the sample of the lowest curve is a less defective Na crystal; the effect of defects can be much larger in other metals, and huge in disordered alloys). For low to intermediate T, phonon-induced resistivity grows as $T^5$, but it then rapidly reaches a T-linear regime.
as described in Sec. 4.3. The average time between collisions can then be assumed inversely proportional to the total number of phonons, $\tau^{-1} \propto \sum_\epsilon [n_\epsilon]_B$. At thermal energies $k_BT$ much larger than the characteristic phonon energies ($\epsilon \approx 1 \div 100$ meV), the total phonon number is proportional to T (expand the boson average, Eq. (237): $[n_\epsilon]_B = \frac{1}{e^{\beta\epsilon}-1} \simeq \frac{1}{\beta\epsilon} = \frac{k_BT}{\epsilon}$). Indeed, at large temperature the resistivity is linear in T, to a good degree of approximation. At low temperature, the phonon number decreases rapidly (see Eq. (258)), but impurities and all sorts of lattice defects provide a residual T-independent scattering. Accordingly, the resistivity for T → 0 converges to a sample-dependent constant, as shown for Na in Fig. 4.60.
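The high-temperature expansion of the boson average quoted above is easy to check numerically. The short sketch below compares the exact occupation with its classical limit $k_BT/\epsilon$; the function names and the sample phonon energy of 1 meV are illustrative choices.

```python
import math

K_B_MEV = 8.617333e-2   # Boltzmann constant [meV/K]


def bose_occupation(eps_mev, temperature):
    """Exact boson average 1 / (exp(eps / k_B T) - 1)."""
    return 1.0 / math.expm1(eps_mev / (K_B_MEV * temperature))


def classical_limit(eps_mev, temperature):
    """Leading high-temperature term k_B T / eps."""
    return K_B_MEV * temperature / eps_mev


if __name__ == "__main__":
    for temperature in (30, 300, 600):
        exact = bose_occupation(1.0, temperature)
        approx = classical_limit(1.0, temperature)
        print(f"T = {temperature} K: exact {exact:.2f}, k_B T / eps {approx:.2f}")
```

For $k_BT \gg \epsilon$ the two values agree within a few percent and the occupation doubles when T doubles, which is the origin of the T-linear resistivity.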
We can estimate the thermal conductivity of electrons in a metal in analogy to electrical conductivity. When a thermal gradient $\vec\nabla_{\vec r}T = \frac{dT}{dz}\hat z$ is established across the sample, at a given point in the metal, electrons around the Fermi energy
Figure 4.61. Heat transport is associated with electrons moving out from the hotter (right) region carrying (on average) higher energy than those moving out from the cooler (left) region. At a given point $\vec r$ in the sample, the distribution of $\vec k$ states is therefore slightly asymmetric in the direction of $\vec\nabla_{\vec r}T$. This asymmetry is strongly exaggerated in the figure. [Plot: occupancy $[n_k]$ versus k from $-2k_F$ to $2k_F$, with the cooler side labeled $T^-$ and the hotter side $T^+$.]
have a slightly distorted equilibrium distribution since those coming from the high-temperature side have a slightly higher temperature than those coming from the low-temperature side. This temperature difference depends on the distance these electrons have traveled after the previous collision to the given point. This average distance $d \simeq \tau v_F$ is called the mean free path. At that point therefore the electrons coming from the hot side are associated with a $T^+ \simeq T + d\,\frac{dT}{dz} \simeq T + \tau v_F\frac{dT}{dz}$ and those moving the opposite way with a $T^- \simeq T - \tau v_F\frac{dT}{dz}$ (Fig. 4.61). The average energy transported by a typical electron to that place will be therefore $\simeq k_B(T^- - T^+) \simeq -k_B\tau v_F\frac{dT}{dz}$, where the minus sign indicates that energy is transported in the direction opposite to the gradient. Multiply this individual contribution by the typical speed $\simeq v_F$ of an electron and by the density $\simeq \frac{N}{V}\frac{k_BT}{\epsilon_F}$ of electrons carrying heat (only those within $k_BT$ of the Fermi energy do, for all others the contributions cancel as in Eq. (302)): we obtain an estimate of the total heat current
$$ \vec j_E \simeq -k_B\tau v_F\,\vec\nabla_{\vec r}T\;v_F\,\frac{N}{V}\,\frac{k_BT}{\epsilon_F} = -k_B^2T\tau\,\frac{v_F^2}{\epsilon_F}\,\frac{N}{V}\,\vec\nabla_{\vec r}T \simeq -\frac{k_B^2T\tau}{m_e}\,\frac{N}{V}\,\vec\nabla_{\vec r}T , \tag{306} $$
Figure 4.62. A scheme of Hall’s experiment: a metal sample traversed by a current and immersed in a perpendicular magnetic field
develops a transverse potential due to the Lorentz force acting on the
carriers. The sign of this potential is determined by the band effective
mass m∗ .
where we employed the relation $\epsilon_F = \frac{1}{2}m_ev_F^2$. A heat current as in Eq. (306) is compatible with a thermal conductivity17
$$ K \simeq \frac{k_B^2T\tau}{m_e}\,\frac{N}{V} . \tag{307} $$
By comparing σ and K in Eqs. (305) and (307) we find the prediction of a ratio
$$ \frac{K}{\sigma} = \frac{\pi^2}{3}\,\frac{k_B^2}{q_e^2}\,T \tag{308} $$
between thermal and electric conductivities according to the relaxation-time model considered. A careful analysis of the factors of order unity previously ignored yields the factor $\frac{\pi^2}{3}$ in Eq. (307), and hence in Eq. (308). This relation tells us that good electrical conductors should also be good heat conductors. The empirical observation of this fact is known as the Wiedemann-Franz law. Experimentally, for a wide range of metals and temperatures, the agreement of the conductivities with Eq. (308) is surprisingly good for such a simple model. For example, measurement yields the following values for $\frac{K}{\sigma}\,\frac{3q_e^2}{\pi^2k_B^2T}$ (which should equal unity according to Eq. (308)): 0.868 (Na, 273 K), 0.950 (Au, 273 K), 0.966 (Au, 373 K), 1.08 (Pb, 273 K), 1.04 (Pb, 373 K).
Many other transport experiments, both in the DC and AC regime, can be interpreted in terms of this simple relaxation-time model. An especially important class of measurements is connected to the Hall effect, i.e. the potential buildup perpendicular to a current traversing a metallic sample when this is immersed in a perpendicular magnetic field $\vec H$. This potential drop is due to the lateral charge accumulation sketched in Fig. 4.62, produced by the Lorentz term in Eq. (300): for a given current direction, this term pushes the carriers toward the same side of the sample regardless of the sign of their charge, thus the sign of the transverse Hall potential probes precisely the sign of the charge carriers. In most metals, this potential
17 The underlying linear-response relation is $\vec j_E = -K\,\vec\nabla_{\vec r}T$.
drop is consistent with negative charge carriers, but many exceptions (including Be, Mg, In, Al) show a potential of reversed sign with respect to the prediction of the semiclassical relaxation-time model. This points to the importance of hitherto neglected band effects associated with lattice periodicity.
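The proportionality constant between $K/\sigma$ and T in Eq. (308) can be evaluated directly from the fundamental constants. The following sketch does so; the function names are hypothetical, and the Na ratio used in the example is the measured value quoted in the text for 273 K.

```python
import math

K_B = 1.380649e-23       # Boltzmann constant [J/K]
Q_E = 1.602176634e-19    # elementary charge [C]


def lorenz_number():
    """Prefactor (pi^2 / 3) (k_B / q_e)^2 of Eq. (308), in W Ohm / K^2."""
    return math.pi**2 / 3.0 * (K_B / Q_E)**2


def measured_k_over_sigma_t(ratio_to_model):
    """K / (sigma T) implied by a measured ratio to the model prediction."""
    return ratio_to_model * lorenz_number()


if __name__ == "__main__":
    print(f"model prefactor: {lorenz_number():.3e} W Ohm / K^2")
    print(f"Na at 273 K:     {measured_k_over_sigma_t(0.868):.3e} W Ohm / K^2")
```

The prefactor evaluates to about $2.44\times10^{-8}$ W Ω K$^{-2}$, independent of the material, which is what makes Eq. (308) a stringent test of the model.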
The main effect of the lattice periodic potential is to replace the electron mass in Eqs. (305) and (307) with an effective mass $m^*$, accounting for the actual value of the acceleration of electrons close to the Fermi energy in a realistic band rather than on the free parabola. According to Eq. (299)
$$ \frac{d^2}{dt^2}\vec r = \frac{d}{dt}\vec v_j(\vec k) = \left[\vec\nabla_{\vec k}\,\vec v_j(\vec k)\right]\cdot\frac{d\vec k}{dt} = \frac{1}{\hbar^2}\sum_{uw}\hat e_u\,\frac{\partial^2E_{\vec k j}}{\partial k_u\,\partial k_w}\,\frac{d(\hbar k_w)}{dt} = \sum_{uw}\hat e_u\,(m^*)^{-1}_{uw}\,F_w , \tag{309} $$
where $\hat e_u$ indicate the Cartesian unit vectors x̂, ŷ, and ẑ. The last equality is based on Eq. (300), replacing $\frac{d(\hbar\vec k)}{dt}$ with the external force $\vec F = -q_e(\vec E + \vec v_j\times\vec H)$ acting on the electron. The final expression (309) is a sort of Newton equation, with an inverse mass tensor of components
$$ (m^*)^{-1}_{uw} = \frac{1}{\hbar^2}\,\frac{\partial^2E_{\vec k j}}{\partial k_u\,\partial k_w} , \tag{310} $$
which is connected to the band curvature at the Fermi energy. In the 1D example of Fig. 4.57, $m^*$ is proportional to the inverse of the acceleration of panel c. In 3D, the effective mass $m^*$ to insert in Eq. (305) is a suitable average over the tensor components $(m^*)_{uw}$, and may differ substantially from the free-electron mass $m_e$. In particular $m^*$ turns out much larger than $m_e$ for narrow flat bands, characterized by weak curvatures, such as the 3d bands of transition metals or the 4f bands of rare-earth metals. According to (305) and (307), larger effective masses are associated with smaller conductivities, in accord with our intuition that external fields have a harder time accelerating “heavy” wave packets than free electrons. Note however that in the ratio (308) the $m^*$ correction cancels out, and the Wiedemann-Franz law should and does hold roughly independently of band curvature. Also, negative effective masses occur whenever the Fermi level sits in a region where the curvature of the band is downward (e.g. close to the BZ boundary of Fig. 4.58a). An electron of negative $m^*$ and negative charge accelerates in the same direction as the applied field, and can therefore be seen as a positive charge of positive mass, called a hole. A hole carries current in the same direction as the applied field (like a genuine electron), but it produces a reversed Hall effect, as its average velocity $\vec v_j(\vec k)$ aligns in the same direction as the electric field $\vec E$. Negative effective mass explains the reversed Hall potential of several metals.
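The link between band curvature and effective mass in Eq. (310) can be illustrated with a finite-difference estimate. The sketch below assumes a 1D tight-binding band $E_k = -2t\cos(ka)$ in units $\hbar = a = 1$; this band and the function names are illustrative assumptions, not the band of Fig. 4.57.

```python
import numpy as np


def band_energy(k, t_hop=1.0, a=1.0):
    """Assumed 1D tight-binding dispersion E_k = -2 t cos(k a)."""
    return -2.0 * t_hop * np.cos(k * a)


def effective_mass(k, dk=1e-4, hbar=1.0):
    """1D version of Eq. (310): m* = hbar^2 / (d^2 E / dk^2).

    The second derivative is estimated by a central finite difference.
    """
    curvature = (band_energy(k + dk) - 2.0 * band_energy(k)
                 + band_energy(k - dk)) / dk**2
    return hbar**2 / curvature


if __name__ == "__main__":
    print("band bottom (k = 0):   m* =", effective_mass(0.0))
    print("band top (k = pi/a):   m* =", effective_mass(np.pi))
```

The mass is positive at the band bottom and negative, with the same magnitude, at the band top, where the carriers are conveniently described as holes. It diverges at the inflection points $ka = \pm\pi/2$, where the curvature vanishes.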
4.2.2.2. Semiconductors. Intrinsic (i.e. pure) semiconductors are insulators characterized by a small gap between valence and conduction band. Solid Si and Ge have
Figure 4.63. Qualitative lattice-parameter dependence of the band energies for the diamond-lattice solids of the IVB group. The s and p bands, separated at large a, turn into a hybrid sp³ band as a is reduced, but eventually this wide band splits into a “bonding” and an “antibonding” band (similar to the bonding and antibonding states of CH4), whose separation (and individual width) grows as the lattice shrinks. This leads to the unusual situation of a bandgap which increases under applied external pressure, and shrinks with lattice thermal expansion.
the same structure as C diamond, with larger lattice parameter a. The bands of these materials are qualitatively illustrated in Fig. 4.63. In pure semiconductors, the Fermi energy sits inside this bandgap, the lower sp³ “bonding” band being completely full and the upper “antibonding” band completely empty. Note that the possibility of this band “splitting” is directly connected to their diamond crystal structure, and would not occur in a hypothetical simple cubic or fcc Si/Ge. Moreover, precisely this bandgap yields a particular stability to the rarefied diamond structure for those solids where the number of electrons matches the capacity of the “bonding” band.18 Semiconductors of type III-V (e.g. GaAs) and some II-VI (e.g. BeSe) crystallize in a similar crystal structure, namely the zincblende structure, with two different atomic species occupying the two geometrically inequivalent sites of the unit cell of the diamond structure (Fig. 4.25). Other structures are observed in other semiconductors. Accordingly, the bands are qualitatively and quantitatively
18 Solid Na, Mg and Al realize an energetically more stable configuration in different, more compact lattice structures (bcc, hexagonal, and fcc respectively), as they have too few electrons to fill completely the “bonding” band if the atoms arranged themselves in a hypothetical diamond structure.
Figure 4.64. Direct versus indirect gap in insulators. The name
refers to the possibility of “direct” spectroscopic observation by interband optical absorption in the first case. Absorption through an indirect gap instead must be assisted by some phonon absorption/emission
in order to grant the wave-number conservation in the process.
different in different compounds. In particular, the gap $\Delta = E_c - E_v$ between the top of the valence band and the bottom of the conduction band may be direct (same $\vec k$ for the minimum $E_c$ of $E_{\vec k c}$ and for the maximum $E_v$ of $E_{\vec k v}$, as in GaAs and InP) or indirect (when these two extrema occur at different $\vec k$ points, as in Si, Ge, GaP), see Fig. 4.64.
Transport in an intrinsic semiconductor is driven by temperature. The average number density of electrons thermally excited into the conduction band is
$$ N_c = \frac{[n_c]}{V} = \frac{1}{V}\int_{E_c}^{\infty} g(E)\,[n_E]_F\,dE = \frac{1}{V}\int_{E_c}^{\infty} g(E)\,\frac{1}{e^{\beta(E-\mu)}+1}\,dE , \tag{311} $$
where g(E) is the density of band states (an example of which is sketched in Fig. 4.65). The chemical potential lies somewhere in the gap between conduction and valence band (at low temperature close to the mid-gap energy), thus several $k_BT$ below the conduction-band bottom $E_c$ (see Fig. 4.66). It is therefore usually a very good approximation to neglect the 1 in the denominator of $[n_E]_F$, and take
Figure 4.65. The qualitative shape of the density of band states g(ε) [states/eV] as a function of energy [eV] in a lattice. Gaps are characterized by vanishing density of states. In general, several bands are present in an actual solid, with narrow high-density bands at lower energies, and broad low-density bands at larger energies. The band boundaries show a $g(E) \propto |E - E_{\rm boundary}|^{1/2}$ behavior, characteristic of the quadratic $\vec k$-dependence of $E_{\vec k j}$ near the band maximum or minimum (see Eq. (190)).
$\frac{1}{e^{\beta(E-\mu)}+1} \simeq e^{-\beta(E-\mu)} = e^{-\beta(E_c-\mu)}\,e^{-\beta(E-E_c)}$. The first exponential is the same for all states in the band: it reflects the exponential suppression of the electron occupancy of the conduction band due to its distance from μ. The second factor represents a standard Boltzmann occupancy of a gas of classical noninteracting particles, as in Eq. (151), which indicates that an extremely rarefied electron gas is essentially classical. The energies E of the states can be estimated by Taylor-expanding the conduction band around its minimum19
$$ E = E_{\vec k c} = E_c + \frac{1}{2}\sum_{uw}\left.\frac{\partial^2E_{\vec k c}}{\partial k_u\,\partial k_w}\right|_{\vec k^{\min}}(k_u - k_u^{\min})(k_w - k_w^{\min}) + \dots \simeq E_c + \frac{\hbar^2}{2m_c^*}\,|\vec k - \vec k^{\min}|^2 + \dots $$
Thus the excitation energy $\mathcal{E} = (E - E_c)$ in the second exponential approximates the kinetic energy of a free particle of mass $m_c^*$, once the wave numbers are measured from $\vec k^{\min}$. Accordingly, the density of conduction states (excluding spin) goes as
19 For energies high above the minimum of the band $E_{\vec k c}$, the quadratic expansion is inaccurate, but the statistical occupancy factor suppresses the contribution of the higher-energy states anyway.
Figure 4.66. In an intrinsic semiconductor with an energy gap ∆
large compared with kB T , the chemical potential µ lies within an order
kB T of the center of the energy gap, and is therefore far (compared to
kB T ) from both the bottom of the conduction band Ec and the top of
the valence band Ev .
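$g_{\rm tr}(\mathcal{E}) = \frac{m_c^{*\,3/2}\,V}{\sqrt{2}\,\pi^2\hbar^3}\,\mathcal{E}^{1/2}$ (see Eq. (190)). The calculation of the $\mathcal{E}$ integration in Eq. (311) is then identical to the calculation of the classical partition function $Z_{1\,\rm tr} = V/\Lambda_c^3$ of Eq. (184), with here a thermal length
$$ \Lambda_c = \hbar\,\sqrt{\frac{2\pi}{m_c^*\,k_BT}} . $$
In detail, we obtain
$$ N_c \simeq \frac{e^{\beta(\mu-E_c)}}{V}\int_{E_c}^{\infty} g(E)\,e^{-\beta(E-E_c)}\,dE \simeq \frac{e^{\beta(\mu-E_c)}}{V}\int_0^{\infty} g_s\,g_{\rm tr}(\mathcal{E})\,e^{-\beta\mathcal{E}}\,d\mathcal{E} $$
$$ = \frac{2e^{\beta(\mu-E_c)}}{V}\int\frac{V}{8\pi^3}\,e^{-\beta\frac{\hbar^2|\vec k'|^2}{2m_c^*}}\,d^3\vec k' = \frac{2e^{\beta(\mu-E_c)}}{V}\,Z_{1\,\rm tr} = \frac{2e^{\beta(\mu-E_c)}}{\Lambda_c^3} = 2e^{\beta(\mu-E_c)}\left(\frac{m_c^*\,k_BT}{2\pi\hbar^2}\right)^{3/2} , \tag{312} $$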
where the factor gs = 2 reflects the spin degeneracy.20
For $m_c^* = m_e$, at 300 K, the thermal length is $\Lambda_c = 4.3$ nm. The exponential factor, for μ sitting in the middle of a 1.2 eV gap, i.e. 0.6 eV below $E_c$, is of the order $e^{\beta(\mu-E_c)} \approx e^{-23} \simeq 10^{-10}$. This corresponds to about $N_c \approx 2\times10^{15}$ m$^{-3}$, a modest charge-carrier density compared to that of regular metals ($\approx 10^{28}$ m$^{-3}$). Note however that this carrier density depends exponentially on $T^{-1}$ (e.g. for the same conditions, at 600 K $N_c \approx 6\times10^{20}$ m$^{-3}$). If this dependence is plugged into the expression (305) for conductivity in the presence of collisions, one expects a rather poor room-temperature conductivity, which increases with temperature approximately as $e^{-\Delta/(2k_BT)}$, apart from weaker corrections due to (i) drifts of μ, (ii) the $T^{3/2}$ term in Eq. (312), and (iii) increase in collisions (reduction of τ). Indeed, a room-temperature resistivity several orders of magnitude larger than in simple metals, with a drastic T-dependence, is observed in intrinsic semiconductors (Fig. 4.67, curve 1; compare with Table 4.2). This makes pure semiconductors sensitive temperature sensors, especially at low temperatures (where metals become essentially useless, see Fig. 4.60).
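The order-of-magnitude estimates of this paragraph follow from Eq. (312) and can be reproduced with a few lines of Python. The sketch below assumes $m_c^* = m_e$ and a chemical potential fixed 0.6 eV below $E_c$, as in the text; the function names are hypothetical.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant [J s]
K_B = 1.380649e-23       # Boltzmann constant [J/K]
M_E = 9.1093837e-31      # electron mass [kg]
EV = 1.602176634e-19     # 1 eV in joules


def thermal_length(temperature, mass=M_E):
    """Lambda_c = hbar * sqrt(2 pi / (m k_B T)), in meters."""
    return HBAR * math.sqrt(2.0 * math.pi / (mass * K_B * temperature))


def intrinsic_density(temperature, ec_minus_mu_ev, mass=M_E):
    """Eq. (312): N_c = 2 exp(-(E_c - mu) / k_B T) / Lambda_c^3, in m^-3."""
    boltzmann = math.exp(-ec_minus_mu_ev * EV / (K_B * temperature))
    return 2.0 * boltzmann / thermal_length(temperature, mass)**3


if __name__ == "__main__":
    for temperature in (300.0, 600.0):
        print(f"T = {temperature:.0f} K: "
              f"Lambda_c = {thermal_length(temperature) * 1e9:.2f} nm, "
              f"N_c = {intrinsic_density(temperature, 0.6):.1e} m^-3")
```

Doubling the temperature raises the carrier density by more than five orders of magnitude, almost entirely through the exponential factor.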
In practice, semiconductors find important applications mainly as doped solids,
called extrinsic semiconductors. Doping is realized typically by substitutional impurities replacing the perfect lattice atoms, as sketched in Fig. 4.68. Pentavalent
impurities, such as P or As, replace Si/Ge atoms at the regular lattice sites, thus
establishing formally four chemical bonds. In other terms, the valence band is not
especially deformed by the impurities. However a pentavalent atom carries extra
positive nuclear charge, thus forming a potential well which tends to attract one
extra electron close to it. This forms a characteristic localized “impurity” state,
sitting in the band gap. At zero temperature, the extra electron carried by the
pentavalent atom occupies that impurity state.
The main feature of the impurity states of pentavalent dopants (donors) is their
close vicinity to the conduction band (Fig. 4.69). The reason for this weak binding
of the impurity electron to its ion is related to screening. The simplest model for
20 Essentially the same result is obtained for the number of holes in the valence band:
$$ P_v = \frac{1}{V}\int_{-\infty}^{E_v} g(E)\,(1 - [n_E]_F)\,dE \simeq e^{\beta(E_v-\mu)}\,\frac{2}{\Lambda_v^3} = 2\,e^{\beta(E_v-\mu)}\left(\frac{m_v^*\,k_BT}{2\pi\hbar^2}\right)^{3/2} . $$
The requirement of charge neutrality ($P_v = N_c$) fixes the position of the chemical potential: $e^{\beta(E_v-\mu)}\,m_v^{*\,3/2} = e^{\beta(\mu-E_c)}\,m_c^{*\,3/2}$, where we simplified common factors. By taking the logarithm of both sides we obtain
$$ \mu = \frac{1}{2}(E_v + E_c) + \frac{3}{4}\,k_BT\,\ln\frac{m_v^*}{m_c^*} , $$
which confirms that at T = 0 the chemical potential sits in the middle of the gap, and when T is raised it drifts slowly toward the band with smaller effective mass, to compensate for the smaller density of states.
Figure 4.67. The measured resistivity of antimony-doped germanium as a function of inverse temperature for several impurity concentrations [?].
Figure 4.68. Substitutional atoms of group III or V replace Si or Ge
atoms of the pure semiconductor (left), producing extrinsic semiconductors of p type (center) or n type (right) respectively. The square
lattice is just a convenient pictorial for the actual diamond lattice.
Figure 4.69. Donor and acceptor levels are very shallow. They are usually located within a few tens of meV of the borders of the conduction and valence band respectively.
the electron binding to the impurity is a particle of charge $-q_e$ and mass $m_c^*$ moving
in the screened Coulomb potential of the impurity ion, $\simeq \frac{q_e}{4\pi\epsilon_0\varepsilon}\,\frac{1}{r}$, where ε is the
static dielectric constant of the pure semiconductor (ε = 12 for Si, ε = 16 for Ge),
and r the distance between the electron and the impurity nucleus. This is an H-atom-like problem of effective nuclear charge $Z = \varepsilon^{-1}$ and effective mass $\mu = m_c^*$,
whose ground-state energy is given by Eq. (30). The bound-state ground energy
Figure 4.70. Schematic temperature dependence of the majority
carrier density (n doping). The high-temperature regime corresponds
to the prevalence of intrinsic carriers. The intermediate regime of
almost constant Nc ≃ Nd is the temperature region where extrinsic carriers prevail and are dissociated from their impurities. The
low-temperature decrease of Nc is due to “freezing” of the extrinsic
carriers, “captured” by the localized impurity states.
then equals a suitably rescaled Rydberg energy:
$$\delta E \simeq \frac{m_c^*}{m_e}\,\frac{1}{\varepsilon^2}\,\frac{E_{\rm Ha}}{2}. \qquad (313)$$
Typical values of $m_c^*$ and ε lead to binding energies of the order of $10^{-4}$ to $10^{-3}\,E_{\rm Ha}$, i.e. from a few meV to a few
tens of meV. Indeed, the separation $\delta E = E_c - E_d$ of the impurity levels of P and
As from the bottom of the conduction band of Si is observed to be 44 meV and 49 meV
respectively (in Ge it is found to be 12 meV and 13 meV).
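The order of magnitude quoted above can be checked by evaluating Eq. (313) directly. The sketch below does so for Si and Ge, using the dielectric constants given in the text; the effective-mass ratios 0.3 and 0.2 are rough illustrative assumptions, not values from these notes. It also prints the effective Bohr radius of the same hydrogen-like model, $a_0\,\varepsilon\,m_e/m_c^*$, which is not stated in the text but follows from the same rescaling.

```python
E_HA_EV = 27.211386   # Hartree energy in eV
A0_NM = 0.0529177     # Bohr radius in nm


def donor_binding_mev(m_ratio, eps):
    """Hydrogen-like donor binding energy in meV, Eq. (313):
    delta_E = (m_c*/m_e) * (1/eps^2) * E_Ha / 2."""
    return 1e3 * m_ratio / eps**2 * E_HA_EV / 2.0


def donor_radius_nm(m_ratio, eps):
    """Effective Bohr radius in nm of the bound donor state: a_0 * eps / (m_c*/m_e)."""
    return A0_NM * eps / m_ratio


# eps from the text; the effective-mass ratios m_c*/m_e are assumed for illustration
materials = {"Si": (0.3, 12.0), "Ge": (0.2, 16.0)}

for name, (m_ratio, eps) in materials.items():
    print(f"{name}: delta_E = {donor_binding_mev(m_ratio, eps):.1f} meV, "
          f"radius = {donor_radius_nm(m_ratio, eps):.2f} nm")
```

The estimates land in the same range as the measured values quoted above. The radius of several lattice spacings is what justifies, after the fact, treating the host crystal as a continuous dielectric medium.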
A trivalent impurity (acceptor) produces a localized excess negative charge, as
long as the valence band is filled. The missing electron can be represented as a
bound hole, attracted to the excess negative charge representing the impurity, with
a small binding energy. In the electron picture this bound hole manifests itself
as an additional electronic level Ea slightly above the top of the valence band. The
hole is bound when this level is empty. When an electron is promoted from the
valence band into this localized level, paying a small energy Ea − Ev , the excess
charge of the impurity is removed, and an unbound hole is left in the valence band.
Impurity levels are localized and do not contribute to transport. At T = 0, a
homogeneously doped semiconductor is an insulator, with the Fermi level sitting near
either Ea or Ed , according to whether the density Na of acceptors or that Nd of
donors is larger (p / n doping respectively). As temperature is raised from 0, the
bound charges get rapidly unbound into the band levels, mostly in the valence band,
if Na > Nd (p doping), or in the conduction band, if Nd > Na (n doping). Due to
the small binding energy of the impurity levels compared to the band gap, it is far
easier thermally to excite an electron into the conduction band from a donor level,
or a hole into the valence band from an acceptor level, than it is to excite an electron
across the entire gap ∆ from valence to conduction band. At room temperature, the
probability that an electron unbinds from the impurity levels is high. The chemical
potential moves toward the middle of the gap, but (for temperatures that are not too high) it
remains closer to the valence (p doping) or conduction (n doping) band. This leads
to a substantial concentration of carriers, which in a wide temperature range dominates over the intrinsic carriers. In this regime, to a good approximation, the density
of majority carriers (holes for p doping, electrons for n doping) equals the net
concentration of impurities (see footnote 21), as in Fig. 4.70. This carrier concentration, plugged
into Eq. (305), is compatible with the doping and temperature dependence of resistivity shown in Fig. 4.67. The carrier population in the minority band (conduction
for p doping and valence for n doping) is extremely small (but rapidly increasing
with temperature).
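How small the minority population is can be estimated from the band-occupation expressions of the intrinsic case: the product $N_c P_v = (2/\Lambda_c^3)(2/\Lambda_v^3)\,e^{-\beta\Delta}$ does not depend on the chemical potential, so fixing the majority density fixes the minority one. The sketch below applies this in the regime where the majority density equals the net impurity concentration. The gap, the effective-mass ratios and the donor density are Si-like assumptions chosen for illustration, and the function names are hypothetical.

```python
import math

K_B = 1.380649e-23       # Boltzmann constant, J/K
HBAR = 1.054571817e-34   # reduced Planck constant, J s
M_E = 9.1093837e-31      # electron mass, kg
Q_E = 1.602176634e-19    # elementary charge, C (also J per eV)


def band_prefactor(m_ratio, temperature):
    """Return 2 / Lambda^3 = 2 (m* k_B T / (2 pi hbar^2))^(3/2), in m^-3."""
    m_eff = m_ratio * M_E
    return 2.0 * (m_eff * K_B * temperature / (2.0 * math.pi * HBAR**2)) ** 1.5


def intrinsic_density(gap_ev, m_c, m_v, temperature):
    """Return n_i = sqrt(N_c P_v); the product N_c P_v is independent of mu."""
    prefactor = math.sqrt(band_prefactor(m_c, temperature) * band_prefactor(m_v, temperature))
    return prefactor * math.exp(-gap_ev * Q_E / (2.0 * K_B * temperature))


def saturation_densities(n_d, n_a, gap_ev, m_c, m_v, temperature):
    """Return (majority, minority) carrier densities in m^-3.

    Only meaningful where all impurities are ionized and n_i << |N_d - N_a|.
    """
    majority = abs(n_d - n_a)
    n_i = intrinsic_density(gap_ev, m_c, m_v, temperature)
    return majority, n_i**2 / majority


# Assumed, Si-like illustration values (not taken from the text)
GAP_EV, M_C, M_V = 1.1, 0.3, 0.6
N_D, N_A = 1.0e22, 0.0   # donors and acceptors per m^3 (n doping)

for T in (250.0, 300.0, 350.0, 400.0):
    majority, minority = saturation_densities(N_D, N_A, GAP_EV, M_C, M_V, T)
    print(f"T = {T:.0f} K   N_c = {majority:.2e} m^-3   P_v = {minority:.2e} m^-3")
```

The printed hole density stays many orders of magnitude below the electron density, while growing steeply with temperature, as stated above. The estimate stops being valid once the intrinsic density becomes comparable to the net doping.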
Doped semiconductors are interesting mostly for the properties of inhomogeneous
systems, i.e. crystals where the local impurity concentrations vary in space. Careful fabrication methods allow the doping to be tuned on a sub-µm scale, within
crystals of typical lateral size of up to several cm. This technology is at the basis
of the modern electronics industry, where solid-state devices have replaced the old
vacuum tubes as “active” components, i.e. components which permit active manipulation (mainly amplification) of electric signals. Countless other applications of
inhomogeneous semiconductors include sensors, light production, electronic data manipulation... Semiconductor devices have been since the 1950s, and still are, one of
the main areas of research and development, a common playground of fundamental
quantum mechanics, solid-state physics, materials science, industrial engineering,
and electronics. Specific courses delve into this vast field. Here we only sketch the
principle of functioning of the simplest inhomogeneous extrinsic semiconductor: the
p-n junction.
Consider a piece of semiconductor with ideal step-like p-n dopant densities
$$N_a(x) = \begin{cases} N_a, & x<0 \\ 0, & x>0 \end{cases} \qquad N_d(x) = \begin{cases} 0, & x<0 \\ N_d, & x>0 \end{cases}, \qquad (314)$$
as a function of some 1D displacement x across the sample. This is what one could
conceptually (but not in practice!) realize by assembling a p-doped and an n-doped
piece of semiconductor. When contact is realized, the chemical potential in the two
Footnote 21: For n doping, the electron density is $N_c \simeq N_d - N_a$. For p doping, the hole density is $P_v \simeq N_a - N_d$.
separate sections of the semiconductor, initially substantially different, must become
the same, as required by thermodynamical equilibrium. Starting with each portion
in a homogeneous neutral situation, the equilibration of the chemical potential
is realized by diffusion of electrons from the n (where their concentration is high)
into the p side (where their concentration is very small), and of holes in the other
direction (like when a wall between a vessel containing oxygen and one containing
nitrogen is removed and the two gases mix). As the diffusion continues, the resulting
transfer of charge builds up an electric field opposing further diffusive currents until
an equilibrium configuration is reached, when the effect of the field on the charge
carriers cancels the effect of diffusion.
It is possible to write coupled equations (based on Maxwell’s equations plus the
semiclassical equations (299) and (300) for the electron dynamics) for the potential φ(x) describing this electric field, $E(x) = -\frac{d\phi}{dx}(x)$, and for the local densities of
electrons and holes; however, here we restrict ourselves to a qualitative level of description.
Because the carriers are highly mobile, in this equilibrium configuration the carrier
densities are very low wherever the field has an appreciable value. This is represented in Fig. 4.71b. The thickness of this depletion layer typically turns out to be
in the $10$ to $10^3$ nm range, depending on the precise physical parameters (dopant
concentrations, dielectric constant of the semiconductor). The impurity ion charge
remains uncompensated in the depletion layer, thus producing the electric charge
profile illustrated in Fig. 4.71c. The double layer of charge produces the electric
field sketched in Fig. 4.71d. This corresponds to a finite potential drop (Fig. 4.71e),
which compensates for the equilibration of the chemical potential far away from the
junction:
$$q_e\,\Delta\phi_0 = q_e\,[\phi(+\infty) - \phi(-\infty)] = \mu_n - \mu_p, \qquad (315)$$
where $\mu_{p/n}$ indicate the chemical potential in the isolated bulk p or n semiconductors,
prior to the construction of the junction. This potential difference shifts rigidly the
bands (and impurity levels) away from the junction, as illustrated in Fig. 4.72a.
When metal electrodes are deposited on both sides of the junction and joined by a
wire, at thermodynamical equilibrium the chemical potential aligns everywhere in
the connected solids to a common value, and no net current flows in the circuit, as
the individual potential drops at the interfaces cancel.
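To attach numbers to Eq. (315) and to the quoted thickness range of the depletion layer, the sketch below estimates both for a symmetric junction. The bulk chemical potentials are obtained by inverting $N_c = (2/\Lambda_c^3)\,e^{\beta(\mu_n - E_c)} \simeq N_d$ and $P_v = (2/\Lambda_v^3)\,e^{\beta(E_v - \mu_p)} \simeq N_a$. The width uses the standard abrupt-junction depletion approximation, $W = \sqrt{2\varepsilon\epsilon_0\,\Delta\phi_0\,(N_a+N_d)/(q_e N_a N_d)}$, which is a textbook result not derived in these notes. Gap, effective masses and dopant densities are illustrative assumptions.

```python
import math

K_B = 1.380649e-23       # Boltzmann constant, J/K
HBAR = 1.054571817e-34   # reduced Planck constant, J s
M_E = 9.1093837e-31      # electron mass, kg
Q_E = 1.602176634e-19    # elementary charge, C (also J per eV)
EPS_0 = 8.8541878128e-12 # vacuum permittivity, F/m


def band_prefactor(m_ratio, temperature):
    """Return 2 / Lambda^3 = 2 (m* k_B T / (2 pi hbar^2))^(3/2), in m^-3."""
    return 2.0 * (m_ratio * M_E * K_B * temperature / (2.0 * math.pi * HBAR**2)) ** 1.5


def built_in_potential(n_a, n_d, gap_ev, m_c, m_v, temperature):
    """Return Delta phi_0 in volts from q_e Delta phi_0 = mu_n - mu_p (Eq. 315).

    Energies are measured from E_v = 0; nondegenerate doping is assumed.
    """
    kt = K_B * temperature
    mu_n = gap_ev * Q_E - kt * math.log(band_prefactor(m_c, temperature) / n_d)
    mu_p = kt * math.log(band_prefactor(m_v, temperature) / n_a)
    return (mu_n - mu_p) / Q_E


def depletion_width(n_a, n_d, delta_phi, eps):
    """Return the depletion-layer width in m (abrupt-junction approximation)."""
    return math.sqrt(2.0 * eps * EPS_0 * delta_phi / Q_E * (n_a + n_d) / (n_a * n_d))


# Assumed illustration values: Si-like gap and masses, eps = 12 as in the text
GAP_EV, M_C, M_V, EPS = 1.1, 0.3, 0.6, 12.0

for density in (1.0e21, 1.0e22, 1.0e23, 1.0e24):   # N_a = N_d, in m^-3
    dphi = built_in_potential(density, density, GAP_EV, M_C, M_V, 300.0)
    width_nm = 1e9 * depletion_width(density, density, dphi, EPS)
    print(f"N_a = N_d = {density:.0e} m^-3   dphi_0 = {dphi:.3f} V   W = {width_nm:.0f} nm")
```

Heavier doping raises the built-in potential only logarithmically but shrinks the depletion layer roughly as the inverse square root of the dopant density, which is why the width spans the broad range quoted above.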
At the p-n junction, for example, the net current vanishes due to cancellation of
four contributions:
• At the p side of the depletion layer few “minority” electrons appear by
thermal excitation out of the valence band: these are immediately “swept”
by the strong electric field, and drop to the n side (Fig. 4.72a). This electron
generation current $J_e^{\rm gen}$ depends exponentially on temperature, but only
weakly on the potential drop and the size of the depletion region.
Figure 4.71. (a) A sharp p-n junction in the absence of voltage
bias, in thermodynamic equilibrium. (b) The qualitative equilibrium profile of the hole-density Pv (x) (left) and electron density Nc (x)
(right). Far away from the junction, the bulk values Pv (x ≪ 0) ≃ Na ,
Nc (x ≫ 0) ≃ Nd are realized. In the region where a substantial
electric field is present, both carrier concentrations are strongly suppressed: this is therefore called the depletion region. (c) Total electric
charge density ρ(x). The net negative charge on the p-type side of the
junction and the net positive charge on the n-type side are given by
the uncompensated densities Na and Nd of acceptors and donors. (d)
These charges generate an electric field x̂ E(x), which is the physical
origin of the depletion layer. (e) The corresponding electric potential
φ(x).
Figure 4.72. Energetics from the point of view of (negatively
charged) electrons, as a function of the displacement x across the
p-n junction. (a) At equilibrium, in absence of external bias. (b) In
the presence of forward bias (V > 0). (c) In the presence of reverse
bias (V < 0). Carrier currents are indicated: note that generation
currents are essentially independent of bias.
• Few (majority) electrons at the n side of the junction acquire enough thermal energy to overcome the potential barrier to the p side. Once in the p
region, they will most likely fill a hole in the valence band: the corresponding current is therefore called recombination current $J_e^{\rm rec}$.
• Similarly, a hole generation current $J_h^{\rm gen}$ sweeps thermal holes from the n to
the p side, while a hole recombination current $J_h^{\rm rec}$ accounts for holes thermally
diffusing into the n side.
At equilibrium these currents cancel in pairs ($J_e^{\rm gen} = J_e^{\rm rec}$ and $J_h^{\rm gen} = J_h^{\rm rec}$) and no
net current flows through the junction (see footnote 22).
The shorting wire may be cut and a voltage generator inserted, to alter the potential drop across the semiconductor by an externally fixed potential V , with the sign
convention of Fig. 4.73. The bulk p and n regions are relatively good conductors,
due to a large carrier density compared to the depletion layer. We can therefore
assume that practically all the potential drop induced by the external applied field
is realized over the depletion layer, as illustrated in Fig. 4.72. When V ≠ 0 the condition of thermodynamical equilibrium is violated and a net current density $j_x$ (thus
a current I) is realized through the junction. To understand semi-quantitatively
Footnote 22: The four quantities $J_{e/h}^{\rm rec/gen}$ are defined as the (positive) norm of the corresponding number
current-density vectors, and are measured in units of $\mathrm{s^{-1}\,m^{-2}}$.
[Figure 4.73 graphics: (a) sketch of the diode with its p and n sides, the applied potential V and the current I; (b) plot of I [A] versus V [V].]
Figure 4.73. (a) The sign convention for the applied external potential V to the diode. Positive V increases the potential of the p side
and produces forward bias. (b) The I − V characteristic according to
Eq. (318).
the I − V characteristic of the p-n junction, observe that the recombination currents depend very strongly on the potential drop through the depletion layer: $J_e^{\rm rec}$
is proportional to the number of carriers acquiring sufficient energy to overcome the
potential barrier. This is now modified by V: $J_e^{\rm rec} \propto \exp\left[-q_e(\Delta\phi_0 - V)/(k_B T)\right]$.
The requirement of thermal equilibrium at V = 0 ($J_e^{\rm rec} = J_e^{\rm gen}$) and the fact that the
generation current is almost independent of V fix the proportionality constant:
$$J_e^{\rm rec} = J_e^{\rm gen}\, e^{q_e V/(k_B T)}. \qquad (316)$$
The total current density carried by electrons is therefore
$$J_e = J_e^{\rm rec} - J_e^{\rm gen} = J_e^{\rm gen}\left(e^{q_e V/(k_B T)} - 1\right). \qquad (317)$$
A similar analysis, with similar result, can be carried out for holes. The hole currents
move in the opposite direction with respect to electrons, but carry positive charge,
thus their contribution adds up to that of electrons, to give a total electric current
density
$$j_x = q_e\left(J_e^{\rm gen} + J_h^{\rm gen}\right)\left(e^{q_e V/(k_B T)} - 1\right). \qquad (318)$$
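Equation (318) is easy to evaluate numerically. In the minimal sketch below, the junction area and the generation currents are lumped into a single saturation current `I_SAT`, i.e. the junction area times $q_e (J_e^{\rm gen} + J_h^{\rm gen})$. Its value of $10^{-8}$ A is a hypothetical choice made only for illustration, as is the temperature of 300 K.

```python
import math

K_B_EV = 8.617333e-5   # Boltzmann constant in eV/K


def diode_current(voltage, i_sat, temperature=300.0):
    """Ideal p-n junction current, Eq. (318): I = I_sat (exp(q_e V / k_B T) - 1).

    voltage is in volts, so voltage / (k_B T in eV) equals q_e V / (k_B T).
    expm1 keeps precision for small |V|.
    """
    return i_sat * math.expm1(voltage / (K_B_EV * temperature))


I_SAT = 1.0e-8   # saturation current in A (assumed, illustrative)

for v in (-0.8, -0.4, -0.1, 0.0, 0.1, 0.2, 0.3, 0.4):
    print(f"V = {v:+.1f} V   I = {diode_current(v, I_SAT):+.3e} A")
```

For reverse bias the current saturates at `-I_SAT` as soon as |V| exceeds a few times $k_B T/q_e$ (about 26 mV at 300 K), while for forward bias it grows by roughly a factor of 50 for every additional 0.1 V.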
This I −V characteristic, shown in Fig. 4.73b, is strongly asymmetric and nonlinear:
basically the junction operates as a rectifier, allowing electric current to circulate in
one direction only, similarly to the old vacuum diode based on thermionic emission.
For this reason, a two-terminal device containing a single p-n junction is named a
diode. I − V curves very much like Eq. (318) are indeed observed experimentally
Figure 4.74. Typical I − V characteristics of commercial silicon
diodes: (a) a Zener diode, (b) and (c) Diotec 1N4001, (d) Bourns
0805/1206. Observe two kinds of deviations from the ideal behavior
of Eq. (318): at large negative voltage (not especially large for this
Zener diode), a reverse breakdown regime occurs; at positive voltage,
current deviates from the pure exponential of V due to series resistance
of the homogeneous segments of the semiconductor.
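The forward-bias deviation mentioned in the caption of Fig. 4.74 can be illustrated with a simple model. The sketch below assumes that the homogeneous segments act as an ohmic resistance R in series with an ideal junction, so that only V − IR drops across the depletion layer and the current solves $I = I_{\rm sat}\,[e^{q_e (V - IR)/(k_B T)} - 1]$. This implicit equation is a common modelling assumption, not a formula from these notes, and the values of `I_SAT` and `R_SERIES` are hypothetical.

```python
import math

K_B_EV = 8.617333e-5   # Boltzmann constant in eV/K


def ideal_current(voltage, i_sat, temperature=300.0):
    """Ideal characteristic of Eq. (318): I = I_sat (exp(q_e V / k_B T) - 1)."""
    return i_sat * math.expm1(voltage / (K_B_EV * temperature))


def current_with_series_resistance(voltage, i_sat, resistance, temperature=300.0):
    """Solve I = I_sat (exp(q_e (V - I R) / k_B T) - 1) for I by bisection."""
    def residual(current):
        return ideal_current(voltage - current * resistance, i_sat, temperature) - current

    # residual decreases monotonically with current, and its root lies
    # between zero and the ideal (R = 0) current
    i_ideal = ideal_current(voltage, i_sat, temperature)
    low, high = min(0.0, i_ideal), max(0.0, i_ideal)
    for _ in range(200):
        mid = 0.5 * (low + high)
        if residual(mid) > 0.0:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)


I_SAT = 1.0e-8    # saturation current in A (assumed)
R_SERIES = 1.0    # series resistance in ohm (assumed)

for v in (0.2, 0.4, 0.6, 0.8):
    ideal = ideal_current(v, I_SAT)
    real = current_with_series_resistance(v, I_SAT, R_SERIES)
    print(f"V = {v:.1f} V   ideal I = {ideal:.3e} A   with R: I = {real:.3e} A")
```

At low forward voltage the two currents coincide. At larger voltage the current with series resistance grows almost linearly in V instead of exponentially, which is the qualitative behavior described in the caption.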