Download Introduction to the Physics of Matter

Introduction to the Physics of Matter Lecture notes for the course Struttura della Materia I Laurea triennale in fisica – Università degli Studi di Milano Nicola Manini Version 3.14 – March 12, 2008 The present printout is the one official release which supersedes all previous hystorical lecture notes, and accounts for 4 years’ worth of bug fixing. ...many eyes make all bugs shallow... http://www.mi.infm.it/manini/dida/ Contents Introduction 0.1. Basic ingredients 0.1.1. Typical scales 0.1.2. Qualitative considerations 0.2. Spectra and broadening vii vii ix xii xii Chapter 1. Atoms 1.1. One-electron atom/ions 1.1.1. The energy spectrum 1.1.2. The angular wavefunction 1.1.3. The radial wavefunction 1.1.4. Orbital angular momentum and magnetic dipole moment 1.1.5. The Stern-Gerlach experiment 1.1.6. Electron Spin 1.1.7. Total angular momentum and magnetic moment 1.1.7.1. Total magnetic moment in the coupled basis 1.1.8. Fine structure 1.1.8.1. Spin-orbit coupling 1.1.8.2. The relativistic kinetic correction 1.1.8.3. The Lamb shift 1.1.9. Nuclear Spin and hyperfine structure 1.1.10. Electronic transitions, selection rules 1.1.11. Spectra in a magnetic field 1.2. Many-electron atoms 1.2.1. Identical particles 1.2.2. The independent-particles approximation 1.2.3. The 2-electron atom 1.2.4. The self-consistent field theory 1.2.5. The periodic table 1.2.6. Core levels and spectra 1.2.7. Optical spectra 1.2.7.1. Alkali atoms 1.2.7.2. Atoms with incomplete degenerate shells 1 1 5 6 9 12 13 16 17 20 22 22 24 25 27 28 31 33 33 36 39 43 48 50 53 54 56 iii iv CONTENTS 1.2.7.3. Many-electron atoms in magnetic fields 1.2.8. Dipole selection rules Chapter 2. Molecules 2.1. The adiabatic separation 2.2. Chemical and nonchemical bonding 2.2.1. H+ 2 2.2.2. Covalent and ionic bonding 2.2.3. Weak nonchemical bonds 2.2.4. Classification of bonding 2.3. Molecular spectra 2.3.1. Rotational and ro-vibrational spectra 2.3.2. Electronic spectra 2.3.3. Zero-point effects 60 62 65 65 69 69 74 80 83 84 85 89 90 Chapter 3. Statistical physics 3.0.4. Probability and statistics 3.0.5. Quantum statistics and the density operator 3.1. Equilibrium ensembles 3.1.1. Connection to thermodynamics 3.1.2. Entropy and the second principle 3.2. Ideal systems 3.2.1. The high-temperature limit 3.2.1.1. Internal degrees of freedom of molecules 3.2.1.2. Isolated (spin) degrees of freedom 3.2.2. Degenerate Fermi and Bose gases 3.2.2.1. Fermi particles 3.2.2.2. Bose particles 3.3. Interaction radiation-matter 3.3.1. The laser 93 93 95 97 99 102 103 106 111 114 117 122 127 133 135 Chapter 4. Solids 4.1. The microscopic structure of solids 4.1.1. Lattices and crystal structures 4.1.2. The reciprocal lattice 4.1.2.1. An algebraic note 4.1.3. Diffraction experiments 4.2. Electrons in crystals 4.2.1. Models of bands in solids 4.2.1.1. The tight-binding model 4.2.1.2. The plane-waves method 4.2.2. Filling of the bands: metals and insulators 139 140 149 159 162 163 178 184 184 187 191 CONTENTS 4.2.2.1. Metals 4.2.2.2. Semiconductors 4.2.3. Spectra of electrons in solids 4.3. Lattice dynamics 4.3.1. The normal modes of vibration 4.3.2. Thermal properties of phonons 4.3.3. Other phonon effects Appendix A. Conclusions and outlook A.1. Essential equations A.2. “Advanced” Topics A.3. Acknowledgments v 195 207 223 227 228 234 240 243 244 245 245 Introduction The purpose of this course is to develop some initial microscopic understanding of many basic phenomena regarding “matter” in its atomic molecular and condensed states. 0.1. Basic ingredients Fundamental experiments realized in the late XIXth and early XXth century prove that any piece of matter (e.g. a sample of pure He gas in a vessel, a block of solid ice, a screw made of metal alloy, a mobile phone, a block of wood, a cup of soup, a bee... ) is ultimately composed by a huge but finite number of negatively charged electrons and positively charged nuclei. The nuclear inner structure is usually irrelevant to most “ordinary” properties of matter: to the purpose of the physics of matter, nuclei can be treated as structureless point-like particles. If we neglect relativistic effects and the interactions of our sample of matter with its surrounding, then all internal microscopic interactions among the components are of simple Coulombic nature. The nonrelativistic motion of the electrons and nuclei in the sample is governed by the following Hamiltonian energy operator: (1) Htot = Tn + Te + Vne + Vnn + Vee where: (2) Tn = 1 X PR2 α 2 α Mα is the kinetic energy of the nuclei (PRα is the conjugate momentum to the position Rα ), 1 X 2 Pri (3) Te = 2 me i is the kinetic energy of the electrons (Pri is the conjugate momentum to ri ), q2 X X Zα (4) Vne = − e 4πǫ0 α i |Rα − ri | vii viii INTRODUCTION is the electron-ion attraction potential energy, q 2 1 X X Zα Zβ (5) Vnn = e 4πǫ0 2 α β6=α |Rα − Rβ | is the nucleus-nucleus repulsion, and finally q2 1 X X 1 (6) Vee = e 4πǫ0 2 i j6=i |ri − rj | represents the electron-electron repulsion. Basically, the distinction between a steel key and a bottle of wine is only made by the “ingredients”, i.e. the number of electrons and the number and types of nuclei (charge numbers Zα and masses Mα ) involved. A state ket |ψi containing all quantum-mechanical information describing the motion of all nuclei and electrons evolves according to Schrödinger’s equation d |ψ(t)i = Htot |ψ(t)i . dt This equation, based on Hamiltonian (1), is apparently simple and universal. This simplicity and universality indicates that in principle it is possible to understand the observable behavior of any isolated macroscopic object in terms of its microscopic interactions. In practice, however, exact solutions of Eq. (7) are available for few simple and idealized cases only. If one attempts an approximate numerical solution of Eq. (7), she/he soon faces the problem that the information contents of a N particles ket increases exponentially with N , and soon exceeds the capacity of any computer. To describe even a relatively basic system as a pure rarefied molecular gas, or an elemental solid, nontrivial approximations to the solution of Eq. (7) are called for. Applying smart approximations to Eq. (7) to understand observed materials properties and to make correct previsions of new properties is a refined art. These approximations become often important conceptual tools to link the macroscopic properties of matter with the underlying microscopic interactions. The present course gives a panoramic view of several observed phenomena in the physics of matter, introducing a few standard conceptual tools for their understanding. The proposed schemes of approximations are often rather simple and primitive idealizations: the bibliography suggests indications to expand the student’s conceptual toolbox get closer to today’s state of the art in research. We should be aware that even the smartest and most experienced physicists of matter cannot often give accurate previsions of basic properties such as the conducting behavior of a pure material of known composition and structure, without actually carrying out the experiment. Quantitative and often even qualitative understanding based on Eq. (7) of more complex systems (e.g. (7) i~ 0.1. BASIC INGREDIENTS (a) ix (b) Figure 0.1. Scanning tunneling microscope (STM) images of (a) a single-wall carbon nanotube (SWNT) and (b) a multi-wall carbon nanotube (MWNT). These are cylindrical macromolecules entirely composed of carbon atoms. biological matter) is still by far beyond the capability of today’s theoretical physics of matter. Needless to say, this basic course focuses on simple well-understood experiments phenomena and conceptual schemes with marginal hints at very few selected systems and techniques investigated in current research. Physicists chemists and biologists collect a wealth of experimental data, for which understanding is often only qualitative and partial. Creativity and insight help us to develop new conceptual schemes and approximate models to interpret these data and proceed toward a better understanding of the intimate structure and dynamics of matter. 0.1.1. Typical scales. The motions described by the Hamiltonian Htot involve several characteristic dimensional scales, dictated by the physical constants [?] in Htot , where the absence of the speed of light c is noteworth. First, observe that in Htot the elementary charge qe and electromagnetic constant ǫ0 always appear in the fixed combination e2 ≡ qe2 , 4πǫ0 x INTRODUCTION (a) (b) Figure 0.2. STM images of: (a) gold clusters (about 1 nm diameter) on graphite; (b) a reconstructed Au(111) surface (atomic corrugation 15 pm, tunneling current 0.3 nA; scanning voltage 0.3 V). of dimensions energy × length. A unique combination of e2 , Planck’s constant ~, and electron mass me yields the characteristic length ~2 = 0.529177 × 10−10 m me e2 named Bohr radius, which sets the typical length scale of electronic motions. All distances in the physics of matter could conveniently be measured in a0 units, and indeed for most materials typical interatomic distances turn out a few times a0 . Individual atoms are routinely seen, e.g. by scanning tunneling microscopy (STM): Figures 0.1-0.3 confirm that atoms in many solids are typically spaced by a fraction of nm. The interaction energy of two elementary point charges at the typical distance a0 (8) a0 = e2 me e4 = 2 = 4.35975 × 10−18 J = 27.2114 eV, a0 ~ named Hartree energy, sets the typical energy scale for phenomena involving a single electron in ordinary matter; in practice, eV units are more often used. The nuclear charge factors Zα ≤ 100 can scale the e2 coupling up by 2 decades, thus the electronic energies up by at most 4 orders of magnitude (104 EHa ≃ 300 keV). However, delicate balances often involve characteristic electronic energies as small as 1 meV. Nuclear motions are usually associated to smaller energies (∼ 10−4 ÷ 10−3 EHa ) than (9) EHa = 0.1. BASIC INGREDIENTS (a) xi (b) Figure 0.3. STM images of (a) a clean Si(100) surface (4 degree miscut); (b) the basal surface of covellite CuS (the image covers approximately 6 nm on a side; bright spots are high tunneling current sites corresponding to surface S atoms; Cu atoms do not show in this image). electronic motions, because of the at least 1836 times larger mass at the denominator of the kinetic term (2). The typical timescale of electronic motions is inversely proportional to its energy scale: (10) tB = ~ ~3 = = 2.41888 × 10−17 s. EHa me e4 Oscillations of period 2πtB have a frequency νB = ωB /(2π) = EHa /(2π~) = 6.5797 × 1015 Hz. The typical electron velocity is then set by the ratio r a0 EHa e2 (11) vB = = 2.18769 × 106 m/s, = = tB me ~ about 1% of the speed of light, which justifies a posteriori the initial neglect of relativity. In fact, the ratio r e2 1 EHa vB = = = 7.29735 × 10−3 = , (12) α= 2 c me c ~c 137.036 xii INTRODUCTION the so called fine-structure constant, measures the relative importance of the relativistic corrections to the nonrelativistic electron dynamics given by Eq. (1). As discussed above, motions of the nuclei are much slower, thus relativistic corrections to their kinetic energy Tn are usually negligible. Figure 0.4 compares the typical length and energy scales for electrons in matter to the wavelengths and photon energies of the electromagnetic waves, used to investigate matter itself as sketched mainly in Secs. 0.2, 4.1.3, and 4.2.3 below. The photon wavelength matches the typical interatomic distances (∼ 10−10 m) for photons in the X-rays region (∼ 104 eV ≫ EHa ). On the other hand, photons whose energies fit the typical atomic energy scale EHa lie in the ultraviolet, near the visible range: in this range, the characteristic electromagnetic wavelength is a fraction of µm, i.e. at least three orders of magnitude larger than the typical atomic sizes. Typical energies associated to the motion of the nuclei in matter (of order 10−2 eV) match the infrared region of the electromagnetic spectrum. 0.1.2. Qualitative considerations. In ordinary matter, electrons tend to lump around the strongest positive charges around, the nuclei, driven by the term (4). Due to quantum kinetic energy (3), electrons do not collapse to the nuclei but form atoms of finite size ≈ a0 , as sketched in Chapter 1. Atoms then act as the building blocks of matter, in its gaseous and condensed phases. Many observed macroscopic properties of extended matter such as elasticity, heat transport, heat capacity can be described in terms of the motion of the center of mass of atoms or small collections of atoms. Ultimately, the interaction among atoms, governing these motions, is driven by the quantum dynamics of charged electrons and nuclei described by Eq. (7), usually described within the adiabatic separation scheme (Sec. 2.1). Understanding the dynamics of a finite number of electrons in the field of two or several nuclei, and the motion of these nuclei themselves (Chapter 2) provides the basics of interatomic bonding, the mechanism granting the very existence of condensed matter. Finally, the methods and approximations developed for few-atom systems and for statistically large ensembles (Chapters 3) lead to new concepts and phenomena associated to the macroscopic size of extended systems (Chapter 4). 0.2. Spectra and broadening Starting from the late XIXth century, physicists have developed and employed all sorts of techniques to investigate the intimate excitations of matter. Many data have been obtained through two broad classes of spectroscopies: absorption and emission. • Absorption: the sample is crossed by a collimated beam of monochromatic light (not necessarily visible): the intensity loss of the beam going through the sample is measured as a function of light frequency (or equivalently wavelength), as sketched in Fig. 0.5. 0.2. SPECTRA AND BROADENING Figure 0.4. The spectrum of electromagnetic radiation. Wavelength, frequency and photon energy are compared on logarithmic scales. xiii xiv INTRODUCTION MONOCHROMATIC PHOTON SOURCE ω 1111111111 0000000000 0000000000 1111111111 0000000000 1111111111 0000000000 1111111111 0000000000 1111111111 0000000000 1111111111 0000000000 1111111111 0000000000 1111111111 0000000000 1111111111 0000000000 1111111111 I(ω) SAMPLE 0000000000 1111111111 0000000000 1111111111 0000000000 1111111111 0000000000 1111111111 0000000000 1111111111 PHOTON DETECTOR I ω Figure 0.5. A radical schematization of the setup for absorption spectroscopy. 1111111111 0000000000 0000000000 1111111111 0000000000 1111111111 0000000000 1111111111 0000000000 1111111111 0000000000 1111111111 0000000000 1111111111 0000000000 1111111111 0000000000 1111111111 0000000000 1111111111 0000000000 1111111111 0000000000 1111111111 SAMPLE 0000000000 1111111111 0000000000 1111111111 0000000000 1111111111 0000000000 1111111111 MONOCHROMATOR PHOTON DETECTOR ω I I(ω) ω Figure 0.6. A radical schematization of the setup for emission spectroscopy. • Emission: the sample is excited, e.g. by means of a flame or electrical discharge. The light emitted in the de-excitation transitions is collected, and its intensity is measured as a function of frequency (Fig. 0.6). Spectra of both kinds are routinely collected to probe the properties of gaseous (both atomic and molecular), liquid, and solid samples. Atomic and molecular spectra are usually characterized by sharp monochromatic peaks (also called “lines”). However, no absorption/emission peak is ever infinitely sharp (Fig. 0.7): at least 3 simultaneous effects combine to broaden each spectral Intensity [A.U.] 0.2. SPECTRA AND BROADENING 20 xv width=0.02 width=0.05 width=0.08 15 10 5 0 1.2 1.4 1.6 1.8 2 2.2 ω Figure 0.7. Broadening limits the details observable in a spectrum: when the line width exceeds the separation between two lines (dotted curve), these look like a single peak, and the “fine” detail of the two near peaks is then lost. line [?]: (i) experimental resolution of the spectrometer, (ii) “natural” broadening due to finite lifetime, (iii) Doppler broadening. Experimental resolution is typically limited by the resolution of the monochromator, noise in the photon detector, inhomogeneities in the sample or in some external field applied to it. Experimental broadening is usually generated by several random concurrent effects, and determines therefore a Gaussian line shape. Its has no fundamental nature, thus it can be (and often is) reduced by means of technological advances. According to the basic Schrödinger theory, all eigenstates of an atom are stationary states. However, this theory neglects interaction with the fluctuating radiation field, which is always present. When this interaction is included, only the ground state is really stationary. Instead, any excited state decays spontaneously to all states at lower energy. The total decay rate is the sum of the individual decay rates.1 The total decay rate determines a (random) exponential decay of the average population of atoms initially in an excited state: (13) 1 [N ](t) = N0 e−tγ tot = N0 e−t/τ , For example, as described in Sec. 1.1.10 below, the 3p state of H decays at a rate γ3p→1s = 1.67 × 108 s−1 to the ground state, and at a γ3p→2s = 2.25 × 107 s−1 to the 2s state; decay to state 2p is dipole-forbidden, thus occurs at a negligible rate. Therefore the 3p state empties at a total tot rate γ3p = γ3p→1s + γ3p→2s = 1.90 × 108 s−1 . xvi INTRODUCTION which defines the lifetime of that atomic level τ = 1/γ tot . In practice, an atom hardly lasts in an excited state longer than a few times its characteristic τ . Accordingly, τ sets the typical duration that any spectroscopy experiment involving that excited state may last. Due to the time-energy uncertainty, the energy of an atomic level cannot be measured with better precision than ~ = ~ γ tot . τ The effect of the finite lifetime of atomic levels on the spectral lines is therefore a broadening of the otherwise infinitely sharp line. This “natural” lifetime broadening appears in the spectrum as a Lorentzian peak profile γ (15) I(ω) = I0 , π[(ω − ω0 )2 + γ 2 ] (14) ∆E ≃ where ω0 = ~1 (Ef −Ei ) is the original line position set by the energy difference of the initial and final state. Atomic excited states are characterized by typical lifetimes of several ns, thus by natural spectral broadening γ of a fraction of µeV. In addition to the natural broadening due to finite lifetime, the random thermal motion in the gas-phase sample introduces an extra source of broadening: the Doppler broadening. When seen from the lab frame, atoms/molecules moving toward or away from the detector at a velocity vx have transition frequencies blue or red shifted with respect to those at rest. Since the thermal (molecular center-mass) velocities are non-relativistic, the Doppler frequency shift is given by the simple form vx (16) ω = ω0 1 ± ω0 = angular frequency at rest. c The molecular velocities are random distributed (see Eq. (192) in Sec. 3.2.1 below) depending on the gas temperature T : the average number of molecules with velocity component vx in the direction of detection is r M vx2 M dn(vx ) exp − =N , (17) dvx 2πkB T 2kB T where M is the molecular mass. Radiation intensity then spreads around the rest frequency ω0 as " 2 # M c2 ω − ω0 (18) I(ω) = I0 exp − . 2kB T ω0 This represents a Gaussian broadening of full width at half-maximum r √ kB T . (19) ∆ωDoppler = ω0 8 ln 2 M c2 0.2. SPECTRA AND BROADENING xvii Since the photons wavelength λ = 2πc/ω, the relative broadening is the same in terms of wavelengths: r kB T ∆λDoppler √ (20) . = 8 ln 2 λ0 M c2 Heavier atoms/molecules move more slowly and are less affected by Doppler broadening: this source of broadening is then relevant mainly for the spectra of the lightest atom, hydrogen. The Hα line, introduced shortly, of gas-phase H suffers of 10 µeV Doppler broadening at 300 K. CHAPTER 1 Atoms The importance of the spectroscopy of atoms and ions for the understanding of the whole physics of matter cannot be overestimated. The study of atoms starts off naturally from the exact dynamics of a single electron in the central field of a charged nucleus (Sec. 1.1) because, beside the intrinsic interest of this system, the notation and concepts developed here are at the basis of the language of all atomic physics. This language is then used to introduce (Sec. 1.2) the spectroscopy of many-electron atoms, collections of 2 to 102 electrons moving in the attractive central field of a single nucleus and subject to mutual repulsion. 1.1. One-electron atom/ions The one-electron atom is one of the few quantum systems where essentially exact solutions of the Schrödinger equation (7) are available. Here, comparison of theory and experiment allows physicists to establish the limits of validity and predictive power of the quantum mechanical model Eqs. (1-6). When relativistic effects are included (Sec. 1.1.8), the model is found in almost perfect agreement with extremely accurate experimental data, all the tiny discrepancies being satisfactorily accounted for by a perturbative treatment of residual interactions (Sec. 1.1.9). The solution of the Schrödinger equation (7) for the one-electron atom is a basic exercise in quantum mechanics.1 Both the Vee and Vnn terms in Eq. (1) vanish, and only the nuclear and electronic kinetic energies plus the Coulomb attraction Vne are 1 We often use standard Dirac notation [?], with a ket |ki representing a physical state in the Hilbert space, characterized by the full set of quantum numbers k. The bra hj| is a linear operator from the Hilbert space to the complex numbers, which takes the “component” or “overlap” hj|ki of any ket |ki along the direction |ji. If {|ji} is a basis of eigenstates of some linear operator J (representing a physical observable), the real number |hj|ki|2 represents the probability that starting from a state |ki, measurement of J yields the eigenvalue j. As an example, taking for J the position operator R, with eigenvalues r and eigenkets |ri, the overlap ψk (r) = hr|ki (called wavefunction in real-space representation of state |ki) is a complex number such that |ψk (r)|2 equals the probability (density) of finding the particle at r when position is measured in an initial state |ki. Rather than attacking directly the full time-dependent Eq. (7), it is often smarter to first solve the time-independent Schrödinger equation, i.e. the eigenvalue problem Htot |ψi = E|ψi. On the resulting basis of energy eigenstates |ji (with energy Ej ), one then expands the general solution of 1 2 1. ATOMS relevant. Detailed solutions of the one-electron atom problem are available in many textbooks, including Refs. [?, ?, ?]. Here we only summarize the strategy and main results: • Separation of the center of mass motion: the position operators of ~ (of mass M ) and of the electron ~re are replaced by the the nucleus R combinations ~ ~. ~ cm = M R + me~re and ~r = ~re − R (21) R M + me In terms of these new coordinates, the Hamiltonian separates into a decou~ cm pled kinetic term for R Hcm = − (22) ~2 ∇2 , 2(M + me ) R~ cm plus a Coulombic Hamiltonian for the relative coordinate ~r HCoul = − (23) ~2 2 Ze2 ∇ − , 2µ ~r |~r| where (24) µ= M me M + me is the reduced mass of the 2-particle system. This separation is equivalent to that done for solving the classical Kepler-Newton two-body “planetary” ~ cm translational motion is trivially described problem. The free global R in terms of plane waves. The internal atomic dynamics is that of a single particle of mass µ in the same Coulombic central field as the original nucleus-electron interaction. • Separation in spherical coordinates: given the spherical symmetry of the potential, the Schrödinger equation is conveniently rewritten in polar coordinates r, θ, ϕ. By factorizing the total wavefunction ψ(r, θ, ϕ) = R(r)Θ(θ)Φ(ϕ), the variables separate, and the original three-dimensional (3D) equation splits into three independent second-order equations for the Eq. (7) |ψ(t)i = X j |ji hj|ψ(0)i e−i Ej t/~ , where the weights are precisely the complex components hj|ψ(0)i of the initial state |ψ(0)i on this basis. In particular, the time evolution of a pure energy eigenstate |ji involves a single rotating phase factor exp(−i Ej t/~): this justifies the qualification of |ji as stationary. 1.1. ONE-ELECTRON ATOM/IONS 3 r, θ, and ϕ motions: (25) (26) (27) d2 Φ + ηΦ = 0 , dϕ2 dΘ η 1 d sin θ + λ− Θ = 0, sin θ dθ dθ sin2 θ ~2 λ ~2 1 d 2 dR − E R = 0. r + U (r) + − 2µ r2 dr dr 2µ r2 Here we indicate a general function U (r) for the potential energy −Ze2 /r: this same formalism can be applied to any central potential (e.g. in Sec. 2.3 below). • Solution of the separate eigenvalue problems: the differential equations are solved imposing the relevant boundary conditions for R(r), Θ(θ) and Φ(ϕ). The eigenvalues η, λ, and E can assume only certain values, compatible with the boundary conditions: η = ηm = m 2 , λ = λl = l(l + 1), 2 4 µ EHa Z 2 µZ e (30) E = En = − 2 2 = − , 2~ n m e 2 n2 (28) (29) m = 0, ±1, ±2, ... l = |m|, |m| + 1, |m| + 2, ... n = l + 1, l + 2, l + 3, ... The integer numbers m (magnetic q.n.), l (azimuthal q.n.), and n (principal quantum number) parameterize the eigenvalues and the corresponding eigenfunctions: (31) (32) (33) 1 Φm (ϕ) = √ eimϕ 2π Θlm (θ) = (−1) |m|−m 2 Rnl (r) = −k 3/2 s s 2l + 1 (l − |m|)! |m| P (cos θ) 2 (l + |m|)! l (n − l − 1)! −kr/2 (kr)l L2l+1 , n+l (kr) e 2n[(n + l)!]3 where k is a shorthand for 2Z/(an), a is a corrected atomic length unit a = mµe a0 = ~2 /(µe2 ). The associated Legendre functions Plm (x) (m ≥ 0) are defined by (34) Plm (x) 2 m/2 = (1 − x ) dm Pl (x), dxm 1 dl 2 Pl (x) = l (x − 1)l . l 2 l! dx 4 1. ATOMS E [EHa ] -0.850 -0.544 -0.378 -1.511 -0.1 -3.400 -0.2 -0.3 -0.4 -0.5 -13.598 n= 1 n= 2 n= 3 n= 4 n= 5 n= 6 Figure 1.1. The 6 lowest-energy levels En of the spectrum of hydrogen according to the nonrelativistic Schrödinger theory. Energies in eV near each line. The zero of the scale coincides with the onset of the continuum of unbound states. The associated Laguerre polynomials Lqp (ρ) are polynomials of degree p−q, defined by (35) Lqp (ρ) = dq Lp (ρ) , dρq Lp (ρ) = eρ dp p −ρ (ρ e ) . dρp As the individual terms are properly normalized by the square root etc. prefactors of Eqs. (31)-(33), so is the total atomic wavefunction (36) ψnlm (r, θ, ϕ) = Rnl (r) Θlm (θ) Φm (ϕ) representing the atomic state |n, l, mi. Explicitly, the orthonormality relations read: (37) Z ′ ′ ′ ∗ hn, l, m|n , l , m i = r2 dr sin θdθ dϕ ψnlm (r, θ, ϕ) ψn′ l′ m′ (r, θ, ϕ) = δnn′ δll′ δmm′ . In addition to all these bound states, a continuum of unbound states of arbitrary positive energy represents the ionic states, where the electron moves far away from the nucleus. 1.1. ONE-ELECTRON ATOM/IONS 5 1.1.1. The energy spectrum. The energy eigenvalues (30) of the nonrelativistic 1-electron atom depend on the principal quantum number n only, and show the characteristic structure drawn in Fig. 1.1. In particular, the lowest-energy state (the ground state) is |n, l, mi = |1, 0, 0i. For H, its binding energy amounts to − 21 EHa mµe = −13.5983 eV (slightly less negative than − 21 EHa = −13.6057 eV, due to the reduced-mass correction µ/me = 0.999456). Above this n = 1 state there lie a sequence of bound energy “levels”. In particular, the lowest excited level of H (n = 2, including states |2, 0, 0i, |2, 1, −1i, |2, 1, 0i, and |2, 1, 1i) is 4-fold degenerate and sits at − 21 212 − 1 EHa mµe = 83 EHa mµe = 10.1987 eV above the ground state. Further n-levels have an increasing degeneracy given by the values of l = 0, ...n − 1 compatible with that n, and by the values m = 0, ±1, ... ± l compatible with each l. This m-degeneracy (2l+1 states) occurs for any central potential and represents the possibility for the orbital angular momentum to align in any direction in 3D space without affecting the energy of the atom. In contrast, the additional degeneracy for different l (n2 states in total) is characteristic of the Coulomb −r−1 potential, and is lifted for different radial dependencies of the potential energy U (r). Transitions between levels of any n are observed (Fig. 1.2). Historically, the close agreement of the results of Schrödinger’s equation with accurate experimental data marked one of the early triumphs of quantum mechanics. The transitions group naturally in series of transitions with the same lowest state (initial state in absorption or final state in emission, see Fig. 1.2), and belonging to the same spectral region. In particular, the spacings of the Coulombic eigenvalues (30) are peculiar in making the two highest-energy series (lowest state n = 1 and 2) not overlap with any other series, since the energy distance between the lowest level (n) and the next (n + 1) is larger than the whole range of bound-state energies from level n + 1 to the ionization threshold. For hydrogen, the transitions whose lower level is n = 1 constitute the so-called Lyman series (10.2 ÷ 13.6 eV, in the ultraviolet); the transitions whose lower level is n = 2 constitute the Balmer series (1.89÷3.40 eV, visible); the transitions whose lower level is n = 3 are called Paschen series (0.66 ÷ 1.51 eV, infrared); the transitions whose lower level is n = 4 are called Brackett series (0.31 ÷ 0.85 eV, infrared). The weak dependence of the spectrum on the mass of the nucleus (through the reduced mass µ) produces a fine duplication (relative energy separations ≃< 0.1% – see Fig. 1.3) of the lines of the spectrum of a mixture of different isotopes such as 1 H and 2 H (also called deuterium D). Finally, note that the Z 2 dependence of the eigenvalues (30) makes one half of the lines of one half of the series of the He+ ion (one third of those of Li2+ , ...) almost coincident (except for the mass shift, and relativistic effects) with the series of H, as illustrated in Fig. 1.4. 6 1. ATOMS Figure 1.2. The observed spectrum of hydrogen (a), with a detail of the Balmer series (b) in the visible range. (c) A portion of the spectrum of the star ζ Tauri, showing more than 20 lines of the Balmer series. 1.1.2. The angular wavefunction. The angular solutions Ylm (θ, ϕ) = Θlm (θ)Φm (ϕ) are the normalized eigenfunctions (named spherical harmonics) of the free angular motion of one quantum-mechanical particle.2 Ylm contains complete information about an important observable: the orbital angular momentum. Indeed, ~2 × (the 2 Definitions (31) and (32) adhere to the standard conventions for the phase of the ubiquitous Ylm functions. 1.1. ONE-ELECTRON ATOM/IONS 7 Figure 1.3. A close view of the Balmer Hα line emitted by a mixture of 1 H and 2 H. angular part of the Laplace operator) occurring in the separation of variables repre~ 2 of the rotating two-body system. sents the squared orbital angular momentum |L| ~ 2 . Likewise, ~m are the eigenvalues of the ~2 λ = ~2 l(l + 1) are the eigenvalues of |L| ∂ angular momentum component Lz = −i~ ∂ϕ . Information about other components (Lx or Ly ) is not compatible with the measurement of Lz , and thus has only statistical meaning in quantum mechanics, since these other components do not commute with Lz . On the other hand, the choice of the ẑ direction in space (connected to the choice of the polar coordinate system used) is arbitrary, and due to spherical symmetry, any alternative choice would lead to the same observable results. The spherical harmonics carry complete information about the angular distribution of ~r. In a state |l, mi with fixed squared angular momentum l and ẑprojection m, the probability that the vector ~r joining the nucleus to the electron is proportional to (sin θ cos ϕ, sin θ sin ϕ, cos θ) equals |hθ, φ|l, mi|2 sin θ dθ dϕ ≡ |Ylm (θ, ϕ)|2 sin θ dθ dϕ. Equation (31) indicates that the ϕ dependence of |Ylm |2 is always trivial: |Ylm |2 are constant functions on all circles at fixed θ. Figure 1.5 collects polar plots of |Ylm |2 as a function of θ for several values of l and m (other visualization of the same objects can be found in many textbooks). It is apparent that l − |m| counts the number of zeros (nodes) of |Ylm |2 as the polar angle θ runs from 0 to π. A large number of nodes indicates a large angular momentum component perpendicular to ẑ. Notes: The quantization of angular momentum originates from the boundary conditions Φ(ϕ + 2π) = Φ(ϕ), and Θ(θ) finite at θ = 0 and π for the angular wavefunction: these are realized only for discrete values of the quantum numbers m and l [?]. The visible (Fig. 1.5) increase of |Yl0 (0, ϕ)|2 with l does not contradict normalization (37), because of the sin θ integration factor. If the rl factor taken from the radial wavefunction (33) is grouped together with Ylm (θ, ϕ), one can express rl · Ylm (θ, ϕ) in Cartesian components (rx , ry , rz ) of ~r, obtaining a homogeneous 8 1. ATOMS Figure 1.4. Observed changes in the spectra of one-electron atom/ions due to changes in the nuclear mass. The currently accepted value of the Rydberg constant R∞ = EHa /(2hc) is 109737.31568527(73) cm−1 (2007 CODATA). 1.1. ONE-ELECTRON ATOM/IONS l=0 l=1 9 l=2 l=3 0.4 0.4 0.4 0.4 0.2 0.2 0.2 0.2 z 0 0 0 0 -0.2 -0.2 -0.2 -0.2 -0.4 -0.4 -0.4 -0.4 0.1 0.2 0.3 0.4 0.5 x 0.1 0.2 0.3 0.4 0.5 x 0.1 0.2 0.3 0.4 0.5 x 0.1 0.2 0.3 0.4 0.5 x Figure 1.5. Polar plots of the lowest-l spherical harmonics: radial distance from the origin equals |Ylm (θ, ϕ)|2 . Here the x − z plane (ϕ = 0) is shown, but ϕ may be taken of any value. θ measures the angle away from the ẑ axis and varies from 0 (upward) to π (downward). Colors encode increasing value of |m|, from its minimum m = 0 (red) to its maximum m = l (violet). polynomial of degree l. For example, r r 3 3 (38) r Y10 (θ, ϕ) = rz , r Y1 ±1 (θ, ϕ) = (∓rx − i ry ) . 4π 8π This observation makes it clear that the parity of Ylm (θ, ϕ) (character for ~r → −~r) is the same as that of l, i.e. (−1)l . Finally, it easy and useful to retain the expression for the simplest spherical harmonic function (a polynomial of degree 0, i.e. a constant): Y00 (θ, ϕ) = (4π)−1/2 . Important notation: the standard spectroscopic language to indicate the value of orbital angular momentum is s, p, d, f, g, h, ... for l = 0, 1, 2, 3, 4, 5, ... respectively. 1.1.3. The radial wavefunction. The radial wavefunction Rnl (r), Eq. (33), has the structure of a product of (i) a normalization factor, (ii) a power rl (mentioned above in relation to Ylm ), (iii) an associated Laguerre polynomial of degree n−l−1 in . The power term defines the behavior of Rnl (r) for r, and (iv) the exponential of − Zr 2a r → 0. The exponential decay dominates at large r, where Rnl (r) ∼ rn−1 exp − Zn ar . The Laguerre polynomial L2l+1 n+l (ρ) vanishes at as many different points as its degree (n − l − 1): these zeroes are all located at positive ρ, thus each produces a radial 10 s 1. ATOMS 4 0.0175 0 0.015 0.0125 -2 3 0 0 5 10 15 1 20 r/a 25 30 log 0.0025 10 2 R ] a 0.005 2 a 3 R 2 0.0075 [a 0.01 3 R 2 3 -4 35 -6 0 0 10 20 r/a 30 0 p 5 10 15 20 r/a 25 30 35 0 0.02 [a R ] -2 10 0.01 log a 3 R 3 2 2 0.015 0.005 -4 -6 0 0 10 20 r/a 30 0 d 5 10 15 20 r/a 25 30 35 5 10 15 20 r/a 25 30 35 0 0.0015 10 3 log a [a 0.001 3 R 2 2 R ] -2 -4 0.0005 -6 0 0 10 20 r/a 30 0 Figure 1.6. Z = 1 hydrogenic radial s p and d squared eigenfunctions |Rnl (r)|2 . Left: linear scale; right: log10 scale. Solid: n = 1, dashed: n = 2, · – : n = 3, · · – : n = 4, · · · – : n = 5. The (n−l−1) radial nodes, where Rnl changes sign, appear as kinks in the log plots 2 1.1. ONE-ELECTRON ATOM/IONS 0.5 s 0.1 p 0.15 2 2 a r R R 0.1 a r 2 0.3 0.2 d 0.08 2 0.4 2 2 a r R 11 0.06 0.04 0.05 0.02 0.1 0 0 0 10 20 r/a 30 40 0 0 10 20 r/a 30 40 0 10 20 r/a 30 Figure 1.7. Hydrogen s p and d radial probability distribution P (r) = |r Rnl (r)|2 . Solid: n = 1, dashed: n = 2, · – : n = 3, · · – : n = 4, · · · – : n = 5. The (n−l −1) radial nodes show as tangencies to the horizontal axis, because of squaring Rnl . These curves refer to Z = 1. node, i.e. a radial distance r where Rnl (r) vanishes and changes sign. Figure 1.6 summarizes all these facts for the lowest-n radial eigenfunctions. It is important to distinguish the “radial probability distribution” P (r) = r2 |R(r)|2 of Fig. 1.7 from the |R(r)|2 of Fig. 1.6. P (r) dr provides the probability that the nucleus-electron distance is within dr of r, regardless of the direction where ~r points. The r2 weight factor is precisely the spherical-coordinates Jacobian, proportional to the surface of the sphere of radius r, or rather the volume of the spherical shell “between r and r + dr”. In contrast, the probability that the electron is found at a specific position ~r relative to the nucleus is given by (39) P3D (~r) d3~r = |ψ(~r)|2 d3~r = |ψ(r, θ, ϕ)|2 d3~r , where the polar coordinates are those representing that point ~r. Equation (39) indicates that |ψ(r, θ, ϕ)|2 gives the actual 3D probability distribution in space, not P (r). This means in particular that the probability density profile along a line through the nucleus (specifed by fixing (θ, ϕ)) is simply |R(r)|2 . Note (Fig. 1.6) that all s eigensolutions have nonzero Rn0 (0) = 2[Z/(an)]3/2 . Indeed, r = 0 is a (cusp-like) absolute maximum of |Rn0 (r)|2 . It is no surprise that the most likely point in space for the electron is ~r = ~0, the place where the nucleus sits, thus where the potential U (r) is the most attractive. For s states the misleading vanishing of P (r → 0) is entirely due to the r2 weight. For l > 0, also the probability density |Rnl (r)|2 vanishes at the origin, where the centrifugal repulsion described by the λ = l(l + 1) “effective potential” term in Eq. (27). diverges. We have here the quantum-mechanical analogous to the 40 12 1. ATOMS impossibility of a classical point particle carrying nonzero angular momentum to reach the origin of a central potential. More wave-mechanically: for l > 0 the origin is a common point of one or several nodes of the angular wavefunction: the only way for the total wavefunction to be single-valued there is to vanish. For increasing nuclear charge Z, |Rnl (r)|2 and P (r) localize closer and closer to the origin (scaling as PZ (r) = Z 3 P1 (rZ)): this explains the ∝ Z 2 (rather than ∝ Z 1 as in Vne ) dependency of the eigenenergies (30). The simplest radial wavefunction, that of the ground state, exemplifies well this Z dependency: r 3/2 k 3 −kr/2 Zr Z (40) R10 (r) = . exp − e =2 2 a a In contrast, at fixed Z and for increasing n, the radial probability distribution P (r) peaks at larger and larger distance from the origin, see Fig. 1.7. Notation: the hydrogenic kets/eigenfunctions |n, l, mi of (36) are often shorthanded as n[l], where n is the principal quantum number, [l] is the relevant letter s, p, d, ... for that value of l, and information about m is dropped. For example, 4p refers to any of ψ41 −1 , ψ41 0 , ψ41 1 . This notation is far from unambiguous, as the same 4p symbol implies different radial dependences Rnl (r) for nuclei of different mass and charge. 1.1.4. Orbital angular momentum and magnetic dipole moment. The angular momentum of an orbiting charged particle such as an electron is associated to a magnetic dipole moment. This is best illustrated in the classical case (Fig. 1.8): a particle of mass m and charge q rotating along a circular orbit of radius r at a ~ = ~r × p~ = m r v n̂, speed v, thus in a period T = 2πr/v, has angular momentum L where n̂ is the unit vector perpendicular to its trajectory. The current along the loop is simply I = q/T = q v/(2πr). The magnetic moment of a ring current equals the product of the current times the loop area: qv 2 q v r n̂ q ~ πr n̂ = = L. 2πr 2 2m One can show that this equality holds for any shape of the orbit. Relation (41) holds also in quantum mechanics, as an operatorial relation. For an electron of charge q = −qe , where the angular momentum is quantized in units of ~, it is convenient to write (41) as (41) (42) ~µ = I πr2 n̂ = ~µ = − ~ ~ ~qe L L qe ~ L=− = −gl µB , 2me 2me ~ ~ ~qe where the Bohr magneton µB = 2m = 9.27401 × 10−24 J T−1 (or equivalently A m2 ) e is the natural scale of atomic magnetic moments. gl = 1 is the orbital g-factor, 1.1. ONE-ELECTRON ATOM/IONS 13 ~ Figure 1.8. The relation between the orbital angular momentum L and the magnetic moment ~µ produced by an electron of charge −qe ~ produced by the rotating on a circular orbit. The magnetic field B circulating current is indicated by the curved lines. introduced for uniformity of notation with those situations with nontrivial g-factors g 6= 1 that we shall encounter below. The atomic angular momenta can be detected by letting the associated magnetic ~ is uniform, it induces a moments interact with a magnetic field. If the field B ~ with a frequency (the Larmor frequency) precession of ~µ around the direction of B qe B ω = 2me , routinely detected in microwave resonance experiments. If the field is nonuniform instead, a net force acts on the atom, as we discuss in the next section. 1.1.5. The Stern-Gerlach experiment. The interaction energy of a magnetic moment with a magnetic field is (43) ~. Hmagn = −~µ · B This energy is expected to remain constant in time due to the classical precession. A ~ generates a force equal to the gradient of this interaction energy. nonuniform field B (44) ~ µ · B) ~ = ∇(~ ~ µ · B) ~ . F~ = −∇(−~ 14 1. ATOMS Figure 1.9. The origin of the force that a nonuniform magnetic field produces on a magnetic moment. (a) If the magnetic dipole is seen as a circulating current, the net force originates from a force component ~ (b) If the dipole consistently pointing in the direction of increasing B. is seen as a pair magnetic monopoles, a net force arises from the unbalance between the forces on the individual monopoles. In particular, in a magnetic field with a dominant Bz component, the ẑ force component Fz is proportional to the derivative of Bz along the same direction: ~ z Bz = µz ∂Bz . (45) F~z ≃ µz ∇ ∂z The “microscopical” origin of this force is pictured in Fig. 1.9. The observation that a nonuniform magnetic field produces a force proportional to a magnetic-moment component is at the basis of the Stern-Gerlach experiment, one of the key experiments of quantum mechanics. As illustrated in Fig. 1.10, a collimated beam of neutral atoms at thermal speeds is emitted from an oven into a vacuum chamber where it traverses a region of inhomogeneous magnetic field and is finally collected by a suitable detector. Basically, the Stern-Gerlach apparatus is a device for measuring the component of atomic ~µ in the field-gradient direction. The first sections of Ref. [?] report a full analysis of the Stern-Gerlach experiment and its far reaching implications. The original experiment (1922) was carried out using Ag atoms, but similar deflections are observed using atomic H. The main outcome of the Stern-Gerlach experiment is that the ẑ component of ~µ is not distributed continuously as one would expect for a classical vector pointing at random in space, but rather peaked at discrete values. The lower panel of 1.1. ONE-ELECTRON ATOM/IONS Figure 1.10. (a) In the Stern-Gerlach experiment, a collimated beam of atoms from an oven traverses a region of inhomogeneous magnetic field created by a magnet with asymmetric core expansions: the atoms are finally detected at a collector plate. (b) In an inhomogeneous magnetic field, a magnet experiences a net force which depends on its orientation. (c) The deflection pattern recorded on the detecting plate in a Stern-Gerlach measurement of the ẑ component of the magnetic dipole moment of Ag atoms (the outcome would be the same for H atoms). Contrary to the classical prediction of an even distribution of randomly oriented magnetic moments, two discrete components are observed, due to angular-momentum quantization. 15 16 1. ATOMS Fig. 1.10 shows clustering of the atoms in two lumps. Now, quantum mechanics indeed makes the prevision that the ẑ-component Lz of angular momentum (and thus µz of magnetic moment) should show discrete eigenvalues. However, (i) the number of eigenvalues of Lz must be odd (2l + 1, with integer l = 0, 1, 2, ... – see Eq. (29)) and (ii) the ground state of hydrogen has l = 0, thus it should have no magnetic moment at all, and one undeflected lump should be observed, rather than two. This is a first hint that some extra degree of freedom must play a role in the one-electron atom. 1.1.6. Electron Spin. The Stern-Gerlach experiment, the multiplet fine structure of the spectral lines (the fine doublets of Fig. 1.3), and the Zeeman splitting of the spectral lines (see Sec. 1.1.11 below) are three pieces of evidence pointing to the existence of an extra degree of freedom of the electron, beside its position in space. W. Pauli introduced a nonclassical internal degree of freedom, later named electron spin, with properties similar to orbital angular momentum. Spin may be pictured as the intrinsic angular momentum of rotation of the electron around itself, although this picture is imprecise. Like orbital angular momentum, if spin is measured along one direction, say ẑ, one finds Sz ∼ ~ms , where again the quantum number ms takes (2s + 1) values ms = −s, . . . s. As a Stern-Gerlach deflector produces two lumps, 2 = 2s + 1 components are postulated, requiring that the intrinsic angular momentum of the electron must be s = 21 . This in turn is associated to a squared ~ 2 = 1 ( 1 + 1)~2 = 3 ~2 . spin angular momentum operator |S| 2 2 4 To avoid confusion with the spin ẑ-projection quantum number ms , hence we shall use ml for the projection quantum number associated to Lz . With this notation, a complete wavefunction, necessary to specify all degrees of freedom of the electron, is slightly more complicated than R(r)Ylml (θ, ϕ): an extra spin dependence must be inserted. Assuming, as apparent from the nonrelativistic Hamiltonian (1), that spin and orbital motions do not interact, the eigenstate of a one-electron atom with spin pointing up (↑) or down (↓) in a definite orientation is written (46) ψn l ml ms (r, θ, ϕ, σ) = Rn l (r) Yl ml (θ, ϕ) χms (σ) . Here σ is the variable for the extra degree of freedom, which can take value ± 21 , according to checking if the electron spin points up or down in some ẑ direction, while the quantum number ms = ± 21 indicates which way the spin of this specific state is actually pointing with respect to that fixed direction. These basic spin functions are therefore simply χms (σ) = hσ|ms i = δms σ . Less trivial spin wavefunctions occur when the spin points in some direction other than ẑ (non-Sz eigenstates). A Stern-Gerlach apparatus was employed to purify a beam of atoms with spin polarized some oblique direction, then analyzed by a second apparatus to measure the spin component σ along the fixed ẑ direction. 1.1. ONE-ELECTRON ATOM/IONS 17 For the oblique-spin pure state, the (now nontrivial) spin wavefunction χ(σ) bears the standard significance of a wavefunction in quantum mechanics: |χ(↑)|2 is the probability that, when Sz is measured, + 21 ~ is found, while |χ(↓)|2 is the probability is only to obtain − 12 ~. Upon Stern-Gerlach measurement, the spin ẑ component P 2 found pointing either up or down, therefore the total probability σ |χ(σ)| = |χ(↑ )|2 + |χ(↓)|2 = 1. To the present initial level of understanding, electron spin is just an extra quantum number which, in the absence of magnetic fields only provides an extra degeneracy to all atomic states: the total degeneracy of the nth level is 2n2 , rather than n2 . Spin will affect the energy levels only when relativistic effects are considered, in Sec. 1.1.8. An important novelty characteristic of spin is that the separation of the ↑ and ↓ sub-beams in a Stern-Gerlach apparatus is compatible with a g-factor for spin gs ≃ 2, rather than 1 as the orbital gl . The precise value of the electron intrinsic magnetic moment is determined extremely accurately by electron spin resonance (ESR) measurements: gs = 2.00116. 1.1.7. Total angular momentum and magnetic moment. The spin operator components Sx , Sy , Sz are assumed to follow the same algebra of commutation relations as the orbital angular momentum components, such as [Lx , Ly ] = i ~ Lz , etc. These commutation relations, characteristic of angular momentum in quantum mechanics, are all that it takes to determine all the properties (eigenvalues, eigenvectors, matrix elements) of the angular momentum operators, irrespective of their spin or orbital nature [?]. The basic physical consideration is that the deep nature of both spin and orbital angular momentum is precisely of being angular momenta. Like linear momentum p~ is the generator of translations, angular momentum gen~ is all that is needed to generate erates the rotations. Orbital angular momentum L rotations of a structureless particle. However electron spin makes electrons slightly more complicated objects: rotations must include spin, beside position degrees of freedom. The operator that generates the rotations of an electron is therefore the ~ + S. ~ total angular momentum J~ = L As [Li , Sk ] = 0, the J~ components Jx , Jy , Jz satisfy the same commutation re~ 2 has eigenvalues lations as the orbital and spin part. As a consequence, again |J| ~2 j(j + 1), with either integer or half odd integer j, and Jz has eigenvalues ~mj , with mj taking the 2j + 1 values mj = −j, −j + 1, . . . j − 1, j. ~ 2 and Jz commute with |L| ~ 2 and |S| ~ 2 individually. One can then Both operators |J| ask legitimately: which values of j are compatible with two given angular momenta l and s? The commutation relations are sufficient to answer completely this question and also to obtain the “coupled states” in terms of the eigenstates of the original 18 1. ATOMS s jmax=l+s s l l jmin =|l−s| Figure 1.11. The intuitive mnemonic rule of angular-momentum composition, Eq. (47). basis where the individual Lz and Sz components are diagonal. For our purposes, it suffices to retain the result of this interesting exercise in quantum mechanics [?], namely that the allowed values for j are j = |l − s|, |l − s| + 1, . . . l + s − 1, l + s . (47) The extremal values recall the classical picture (Fig. 1.11) of vector composition; the discrete values reproduce the quantum mechanical quantization of angular momentum. The coupling of the angular momenta realizes a change of basis: diagonalization of the operators Lz and Sz gives a basis of d = (2l + 1) · (2s + 1) states |ml , ms i labeled by all possible combinations of the allowed values for the projections of the angular momentum in the ẑ direction. In this d-dimensional space of states it is ~ 2 and Jz are diagonal often convenient to employ a different basis, one where |J| instead. The number of states must remain the same, so it must be verified that (2l+1) · (2s+1) = 2(|l−s|) + 1 + 2(|l−s|+1) + 1 + . . . + 2(l+s−1) + 1 + 2(l+s) + 1 . This is easily checked. The coupling of angular momentum states is realized through a unitary transformation in this d-dimensional space, i.e. through a d × d unitary matrix: X jm (48) |j, mj i = Cl mlj s ms |ml , ms i , ml ,ms jm Cl mlj s ms numbers (named Clebsch-Gordan coefficients, conventionally where the chosen real, tabulated in many books, e.g. Refs. [?, ?]) are the matrix elements of the basis transformation. With the basic nonrelativistic Hamiltonian (1) all the d states within the (l,s) multiplet have the same energy, and this degeneracy holds 1.1. ONE-ELECTRON ATOM/IONS 19 whether we describe them in terms of the |ml , ms i basis or of the |j, mj i basis. Any of these two bases is equally well suited to describe the system. The difference is that the |ml , ms i basis emphasizes the invariance of the system with respect to separate space and spin rotation, while the |j, mj i basis emphasizes the system invariance with respect to global rotations (of both position and spin by the same amount). Concretely, in the one-electron atom at hand, the orbital angular momentum l combines with the electron spin s = 12 . Rule (47) assigns to j either one value j = s = 12 (for l = 0, s states), or two values j = l ± 21 (for l ≥ 1, p, d, f, ... states). For s states, the transformation matrix from the |ml = 0, ms i basis to the |j = 21 , mj i 1 mj ~ basis is trivially the 2 × 2 identity C 2 1 = δm ms , since here J~ coincides with S. 00 2 ms j For l ≥ 1, the “maximally aligned” components 1 1 1 j = l+ , mj = ± l+ = ml = ±l, ms = ± ; 2 2 2 the remaining coupled states are each expressed in terms of two uncoupled states only, those whose Jz component match: j = l + 1 , m j = a 2 and the orthogonal ket ml = mj + 1 , ms = − 1 + b 2 2 ml = mj − 1 , ms = + 1 , 2 2 j = l− 1 , mj = b 2 m l = m j + 1 , m s = − 1 − a 2 2 m l = m j − 1 , m s = + 1 , 2 2 where l+ 21 a = Cl m mj 1 j+ 2 1 2 − 21 = s l + 21 − mj 2l + 1 and l+ 21 b = Cl m mj 1 j− 2 1 2 + 21 = s l + 21 + mj 2l + 1 are the quantum weights in the transformation. Clearly, a2 + b2 = 1 as required by a unitary transformation. Even more concretely, for a p (l = 1) orbital triplet, the explicit transformation matrix between the |ml , ms i basis and the coupled |j, mj i basis involves a 6 × 6 20 matrix  j =    j=    j=    j=    j=  j = 1. ATOMS as follows: q q   1 2 1 1 , mj = 2 0 0 0 − 3 2 3  q q    1 1 , mj = − 12   0 0 − 23 0 2 3     3 , mj = 23   1 0 0 0 0 2 = q q   2 1 3 , mj = 21   0 0 0 2 3 3   q q   3 1 2 , mj = − 12   0 0 0 2 3 3   3 3 0 0 0 0 0 , mj = − 2 2 = 1, ms = + 12      1  = −1, ms = + 2   , = 1, ms = − 12    1 = 0, ms = − 2   = −1, ms = − 12 = 0, ms = + 12 doublet followed by the j = 1 + 21 = 32 quartet. In practice, if we measure the total angular momentum J~ of a one-electron atom, then each multiplet of states of given l yields two groups of states characterized by two values of j: l − 21 and l + 21 . If spherical symmetry is not broken, all states at given j (but different mj ) must have the same energy. This suggests that, when we find a mechanism that splits the degeneracy of these states (i.e. a physical reason for associating different energy to different values of j), we will obtain an explanation for the two-fold fine-split structure of the observed spectral lines (Fig. 1.3). In Sec. 1.1.8 we shall discuss a weak interaction that splits states of different j, thus making the coupled basis preferable to the uncoupled one. The present discussion is much more general than the composition of orbital and spin angular momenta of a single electron. The rules for composing angular momenta are purely algebraic: they apply equally well to any kind of angular momentum. We shall rely on this formalism when dealing with many-electron atoms, especially in Secs. 1.2.4 and 1.2.7.2. Notation: the states of the coupled basis are commonly indicated as 2s+1 [l]j , where [l] is the relevant capital letter S, P, D... for that value of l = 0, 1, 2... Information about n is given otherwise, e.g. with the n[l] notation, and information about mj is lost. For example, 3d 2 D3/2 stands for any of the four n = 3, l = 2, j = 32 , mj kets. 1.1.7.1. Total magnetic moment in the coupled basis. In the presence of both orbital and spin angular momenta, the effective magnetic moment of the atom is the vector sum of them: ~ + 2S ~ ~ + gs S ~ L gl L ≃ −µB . (49) ~µ = ~µl + ~µs = −µB ~ ~ By studying the matrix elements of the ~µ operator, we can evaluate the magnetic properties of an atom, where both spin and orbital magnetic moment come into play. According to the operatorial relation (49), these properties are only determined by where we list the j = 1 − 1 2 = 1 2   m l    0   m l    0   m l ·  0   m l    0   m l   m l 1 0 1.1. ONE-ELECTRON ATOM/IONS 21 angular-momentum properties, without any reference to the radial dependence of the electron wavefunction. The matrix elements of the magnetic-moment operator on the two bases described above, are substantially different. They are related via the unitary transformation (48). In the uncoupled basis |ml , ms i, the µz operator is diagonal, with eigenvalues Lz + 2Sz (50) hml , ms |µz |ml , ms i = −µB hml , ms | |ml , ms i = −µB (ml + 2ms ) . ~ The other components have simple (nondiagonal) expressions for the matrix elements, which can be obtained from the well known matrix elements of Lx/y and Sx/y [?]. In principle one could obtain the matrix elements of ~µ on the coupled basis |j, mj i by using explicitly the Clebsch-Gordan transformation (48). However, a simpler and more instructive method yields these matrix elements within each subspace at fixed total angular momentum j. The key point is a symmetry argument: on average, all vector quantities characterizing a spherically symmetric object freely rotating in space are proportional to its total angular momentum (Wigner-Eckart theorem). This means in particular that ~ ~ ∝ hJi, ~ ~ ∝ hJi. ~ h~µi ∝ hJi, hLi and hSi As the matrix elements of J~ are well known, the only unknown quantities are the corresponding proportionality constants. To obtain them, note that by definition µB ~ + 2S|j, ~ mj i = − µB hj, mj |J~ + S|j, ~ mj i hj, mj |~µ|j, mj i = − hj, mj |L ~ ~ µB ~ mj i = −gj µB hj, mj |J|j, ~ mj i , = −(1 + γ) (51) hj, mj |J|j, ~ ~ ~ mj i and hj, mj |J|j, ~ mj i. We where we introduce the ratio γ between hj, mj |S|j, determine γ by observing that the same ratio is involved when we take the scalar ~ product with J: ~ mj i = γ hj, mj |J|j, ~ mj i , hj, mj |S|j, (52) ~ mj i = γ hj, mj |J~ · J|j, ~ mj i . hj, mj |J~ · S|j, ~ with the γ can be extracted from Eq. (52) by replacing the scalar product J~ · S expression ~2 ~2 ~ 2 ~ = |J| + |S| − |L| , (53) J~ · S 2 ~ ~ ~ obtained by squaring (J − S) = L. We obtain the proportionality constant ~ mj i ~ 2 + |S| ~ 2 − |L| ~ 2 |j, mj i hj, mj |J~ · S|j, j(j + 1) + s(s + 1) − l(l + 1) hj, mj | |J| γ= = . = 2 ~ 2 |j, mj i 2 j(j + 1) ~ 2j(j + 1) hj, mj | |J| 22 1. ATOMS Finally, we obtain the proportionality constant introduced in Eq. (51) between the magnetic moment and the total angular momentum j(j + 1) + s(s + 1) − l(l + 1) (54) gj = 1 + γ = 1 + 2j(j + 1) called Landé g-factor. gj gives a measure (in units of µB ) of the new effective atomic magnetic moment due to the combined orbital and spin contributions, within a given fixed-j multiplet. For example, the ẑ component of h~µi is (55) hj, mj |µz |j, mj i = −gj µB mj . Note that not all off-diagonal matrix elements of Sz (and thus of µz ) vanish on the coupled basis |j, mj i. Amusingly, the values of gj are not restricted to the range 1 ≤ gj ≤ 2, contrary to what one might expect for a combination of two moments with gl = 1 and gs = 2. 1.1.8. Fine structure. The smallness of the observed fine splittings (a fraction of meV, Fig. 1.3) suggests that they derive from some tiny interaction, absent in the original Hamiltonian (1). Indeed, all relativistic effects were neglected there: we consider them in detail here. 1.1.8.1. Spin-orbit coupling. A first neglected effect is due to the magnetic field experienced by electron spin, due to its own orbital motion. This is a subtle relativistic effect, due to the Lorentz transformation of the nuclear electric field into the frame of reference of the electron. Call ~v the electron velocity in the nuclear rest frame. In the electron frame of reference, the nucleus is seen to move with velocity −~v , and is therefore associated with a current −Zqe~v . According to the Biot-Savart law of electromagnetism, this current produces a magnetic field at the point (joined by a vector ~r) where the electron sits ~ ~ r) = − 1 ~r × (−Zqe~v ) = E(~r) × ~v . (56) B(~ 4πǫ0 c2 |~r|3 c2 Equation (56) uses the fact that the electric field produced by the nucleus at the same point is ~ r) = Zqe ~r , (57) E(~ 4πǫ0 |~r|3 and identifies this magnetic field as a relativistic effect. In Eq. (56) we recognize the orbital angular-momentum operator: ~ L Zqe ~ r) = Zqe ~r × ~v = (58) B(~ . 4πǫ0 c2 |~r|3 4πǫ0 c2 me |~r|3 ~ r) acts point by point on the electron. In this field, the This magnetic field B(~ interaction energy of the electron spin magnetic moment ~µs should in principle 1.1. ONE-ELECTRON ATOM/IONS 23 ~ r). However, this energy must actually be reduced by a factor 1 (first equal −~µs · B(~ 2 recognized by L.H. Thomas) due to the fact that the electron frame of reference is accelerated [?]. The correct magnetic interaction energy operator is therefore: ! ~ 1 Ze2 L 1 g µ Zq s B e ~ r) = ~ ·L ~. ~· (59) Hs−o = − ~µs · B(~ = S S 2 2 ~ 4πǫ0 c2 me |~r|3 2 m2e c2 r3 This operator, named spin-orbit interaction, has nonzero off-diagonal elements connecting states with different n, ml , ms . However, in practice states with different n have vastly different energies, so that the tiny n-off-diagonal spin-orbit couplings perturb negligibly the energy. These n-off-diagonal matrix elements are usually ignored. For given l and considering only the n-diagonal matrix elements of Hs−o , we rewrite Eq. (59) as Z ∞ Ze2 Ze2 −3 ~ ~ ~ ·L ~. hn, l|r |n, li S · L = (60) Hs−o ≃ r−3 [Rn l (r)]2 r2 dr S 2 m2e c2 2 m2e c2 0 The radial integral may be computed explicitly for hydrogenic wavefunctions Rn l , obtaining 3 2 Z −3 (61) hn, l|r |n, li = 3 a n l(l + 1)(2l + 1) (of course it diverges for l = 0). The spin-orbit Hamiltonian is thus conveniently rewritten as ~ ·L ~ S , (62) Hs−o = ξ ~2 where the spin-orbit energy parameter 3 3 2 1 µ Ze2 ~2 Z 4 2 = Z α E . (63) ξ = Ha 2 m2e c2 a n3 l(l + 1)(2l + 1) me n3 l(l + 1)(2l + 1) The last equality uses the expression for the mass-rescaled atomic length scale a = a0 mµe , the definition (9) of the Hartree energy, and the energy relation of the finestructure constant α2 = mEeHac2 . In this form, it is apparent that the typical energy scale of spin-orbit ξ • • • • ~ and S; ~ is positive, i.e. it favors antiparallel alignment of L is a leading α2 = (v/c)2 relativistic correction; is α2 ≃ 5.3 × 10−5 times smaller than the typical orbital energies; grows as Z 4 , reflecting the increase in nuclear field intensity ∝ Z and the reduction as Z −1 in average electron-nucleus distance so that hn, l|r−3 |n, li ∝ Z 3; 24 1. ATOMS • decreases as n−3 , reflecting increase with n in the average electron-nucleus distance, and the r−3 dependence of the interaction energy (59); ~L ~ ∝ l), due to the rl suppression of the radial • decreases roughly as l−3 (but S· wavefunction close to the origin, the region where spin-orbit interaction (59) dominates. 3 The energy scale of ξ for hydrogen amounts to α2 EHa mµe = 2.3178 × 10−22 J = 1.4467 meV. Note however that the lowest state for which spin-orbit applies (2p) has n3 l(l + 1)(2l + 1) = 48, thus ξ2p = 0.030139 meV only, and then for all higher levels ξ is even smaller. ~ · L. ~ On the decoupled Consider now the remaining operatorial part in Eq. (62): S ~ ·L ~ has plenty of nonzero off-diagonal matrix elebasis |l, s, ml , ms i, the operator S ~ · L, ~ within ments. To apply first-order perturbation theory, we must diagonalize S each initially degenerate space at fixed l (and s). Very generally, a tiny perturbation splits degenerate states easily, after rearranging them to the basis where the perturbation itself is diagonal. We come now to employ the coupled basis intro~ ·L ~ is diagonal on the re-coupled basis duced in Eq. (48). To convince oneself that S ~ + S) ~ = J~ and invert it as follows: |l, s, j, mj i, take the square of (L ~2 ~2 ~ 2 ~ ·L ~ = |J| − |S| − |L| . S 2 (64) (Note the similarity to Eq. (53).) All operators at the right hand side are of course diagonal on the |l, s, j, mj i basis: the expression for the eigenvalue is then hl, s, j, mj | (65) ~ ·L ~ S j(j + 1) − s(s + 1) − l(l + 1) |l, s, j ′ , m′j i = δj j ′ δmj m′j . 2 ~ 2 In summary: on the coupled basis |l, s, j, mj i, the spin-orbit interaction is diagonal and its eigenvalues are given by Eq. (65), multiplied by the energy prefactor ξ. As an example, for a p level of a one-electron atom, the two different eigenvalues ~ ~ of S·~2L are −1 (for J = 21 ) and + 12 (for J = 23 ). Therefore, spin-orbit splits any p level (3 × 2 = 6 orbital×spin states) of hydrogen into two multiplets, 2 P 1 (2 states) 2 and 2 P 3 (4 states), separated by an energy 32 ξ (= 45.21 µeV for the 2p level of H). 2 1.1.8.2. The relativistic kinetic correction. A second relativistic correction of the same order (v/c)2 must be included, beside spin-orbit. This energy contribution accounts for the leading correction to the kinetic energy expression p2 /(2µ): (66) p 1 p2 1 p4 p2 p4 2 2 2 4 2 2 Tr = µ c + p c − µc = µc 1 + − + ... − 1 = − + ... . 2 µ2 c 2 8 µ4 c 4 2µ 8µ3 c2 1.1. ONE-ELECTRON ATOM/IONS 25 As we did for Hs−o , to treat the weak perturbation −p4 /(8µ3 c2 ) at first order, we just need the diagonal matrix elements of this operator. Although p4 looks like a formidable differential operator, the trick p4 = (p2 )2 = [2µ(Htot − Vne )]2 permits to rewrite the diagonal matrix elements of p4 in terms of simple radial integrals of r−1 and r−2 . The final result is 3 p4 1 Z 4 α2 3 µ , (67) hn, l| − 3 2 |n, li = − 3 EHa − 8µ c n me 2l + 1 8n where we omit to indicate ml /ms or j/mj , which are irrelevant for such radial integrals. By combining the spin-orbit and kinetic correction (68) Hrel = Hs−o − p4 , 8µ3 c2 we obtain the diagonal matrix elements of the total relativistic correction of order α2 : 3 Z 4 α2 j(j + 1) − s(s + 1) − l(l + 1) 1 3 µ hn, l, j|Hrel |n, l, ji = EHa − + n3 me 2l(l + 1)(2l + 1) 2l + 1 8n 3 1 3 µ Z 4 α2 − (69) , = − 3 EHa n me 2j + 1 8n where the last simplification is based on having s = 12 , thus l = j ± 21 . This last expression applies even for s states (while all previous steps did not), for which separate analysis is necessary. Expression (69) can be combined with the nonrelativistic eigenvalues (30) to obtain the following equation for the eigenvalues corrected to order α2 : (70) " # 2 µ 2 1 3 Z 2 EHa µ 1 . 1 + Zα − hn, l, j|Htot + Hrel |n, l, ji = − 2 m e n2 me n 2j + 1 4n This remarkable relation yields a quantitative prevision for the spectrum that can be directly compared to experiment: all n-levels should be split, for the different values of j, but not for different values of l giving the same j. This same l degeneracy is obtained by solving the more refined theory based on Dirac’s equation, which is exact to all orders in α, not just α2 . 1.1.8.3. The Lamb shift. As the extra l-degeneracy is rather surprising, the occurrence of a splitting between levels with same j and different l was closely investigated, both theoretically and experimentally. Indeed, quantum fluctuations of the electromagnetic field and finite nuclear size should remove this degeneracy (Lamb 26 1. ATOMS d TT LI SP PHOTON SOURCE ER MONOCHROMATIC PHOTON DETECTOR 1111111 0000000 0000000 1111111 0000000 1111111 am e be 0000000 1111111 SWITCH prob 0000000 1111111 pump1111111 beam 00000 11111 0000000 00000 11111 0000000 1111111 00000 11111 0000000 1111111 00000 11111 0000000 1111111 SAMPLE 0000000 1111111 0000000 1111111 0000000 1111111 Figure 1.12. (a) Level scheme for the fine structure of the Balmer Hα line, including the relativistic corrections and the Lamb shift. (b) First direct spectroscopic observation of the Lamb shift in the Hα line, obtained [?] by the double-resonance experiment sketched in panel (d). (c) Theoretical lines with relative intensities. The three peaks at the left involve the n = 2 2 P3/2 state, and correspond to the three transitions marked at the left side of panel (a). The four lines at the right involve the n = 2 j = 21 states, illustrated by the four arrows at the right side of panel (a): in the absence of Lamb shift these lines would be two instead of four. Splittings within these four lines are associated to both spin orbit in the n = 3 multiplet and the Lamb shift between n = 2 2 S1/2 and 2 P1/2 , which is then measured of the order of 0.03 cm−1 , about 4 µeV. The average separation between the two groups of lines measures the spin-orbit splitting in the 2p, and it agrees with the theoretical evaluation ≃ 45 µeV. (Note: 0.1 cm−1 ≃ 12 µeV.) 1.1. ONE-ELECTRON ATOM/IONS 27 shift). Figure 1.12a reports the expected spectral fine structure of the Balmer Hα line, including the relativistic corrections and the Lamb shift. Due to Doppler broadening (random thermal atomic motion, see Sect. 0.2), it is exceedingly difficult to observe these tiny splittings. To circumvent this broadening and acquire the high-resolution spectrum of Fig. 1.12b, the authors of Ref. [?] used a trick based on double resonance. A powerful tunable monochromatic light beam is split into a strong interruptible “pump” beam plus a second weak “probe” beam. As the light frequency matches a resonant transition, absorption takes place and the probe beam is attenuated. However, if the pump beam saturates the transition in the sample, then absorption is strongly reduced. The spectrum records the probe absorption difference between time intervals when the pump beam is on and when it is off. Broadening is then removed since the beams are almost antiparallel, so that the same atoms are selected by the pump/probe frequency match, namely those with practically null instantaneous translational velocity in the beam direction, thus null Doppler shift. 1.1.9. Nuclear Spin and hyperfine structure. Like the electron, many nu~ For example, the proton in an H atom is a particle of clei can carry a spin I. 1 spin I = 2 . As usual for a quantum system, the nuclear magnetic moment ~µ is proportional to its angular momentum: I~ ~µN = µn . ~ Here, the nuclear magneton µn is defined by (71) µ n = gn qe ~ me = gn µ B , 2Mn Mn where the nuclear g-factor gn is a number of order unity whose value depends on the inner nuclear structure. For example, gn = 5.58569 for the proton. The nuclear spin produces a magnetic field which adds to the spin-orbit one. This field is extremely weak, since it is suppressed by the ratio me /Mn with respect to typical electronic fields. Because of this field, the nuclear spin interacts with the electron. The interaction is extremely weak for l > 0 orbitals, as the electron does not get very close to the nucleus, due to the rl term in Eq. (33). For s orbitals, the only electronic magnetic moment affected by the nuclear field is that associated to ~ Like for spin-orbit, the interaction Hamiltonian is proportional to the only spin S. scalar term one could build with the two vector quantities involved: HS~ I~ = C~µN · ~µe = C gn gs µ2B ~ me I~ · S . Mn ~ 2 28 1. ATOMS Figure 1.13. All-sky Doppler map of the 21 cm emission by 1 H in the Milky Way. The prefactor C is the relevant radial matrix element, which equals 1 2 C= Rn0 (0)2 . 2 4πǫ0 c 3 Using Rn0 (0) = 2[Z/(an)]3/2 , the characteristic coupling energy (72) Z3 2 4 Z3 2 me Z 3 gn me 2 me g = gn gs 3 E = α E = 1.06 µeV a.m.u. ξN = Cgn gs µ2B n 3 Ha Mn 3 n me c2 Ha Mn 3 n Mn Mn where we used the usual relation α2 = mEeHac2 , and the nuclear mass is measured in atomic mass units a.m.u. For 1 H, Eq. (72) yields ξN = 5.88 µeV. The electron and nuclear spins couple to a grand total angular momentum F~ = ~ ~ with the same rules derived for L ~ and S ~ in Sec. 1.1.7. With the customary I + S, trick we obtain ~ 2 ~2 ~2 ~ = |F | − |I| − |S| , I~ · S 2 2 ~ ~ so that the expectation value of I · S/~ equals 21 [f (f + 1) − i(i + 1) − s(s + 1)] on the coupled basis, where |F~ |2 and |Fz | are diagonal, rather than Iz and Sz . As s = 21 , two coupled states f = i ± 12 occur, with an energy separation of ξN i + 21 . ~ 2 = − 3 and 1 for f = 0 For 1 H, the proton has spin i = 21 , so that I~ · S/~ 4 4 and 1 respectively. The separation between these two hyperfine-split states equals −1 ≃ 21 cm, and a therefore ξN = 5.88 µeV: it corresponds to a wavelength hc ξN frequency of ξN h−1 ≃ 1.43 GHz. This transition, in the radio-frequency range, at precisely 1420405751.80 Hz, was discovered in 1951 in the spectrum of galactic atomic hydrogen, and has now become a standard tool to investigate the galactic distribution of atomic 1 H (see Fig. 1.13). 1.1.10. Electronic transitions, selection rules. Not all conceivable transitions are equally easy to observe. Basically, it is found that some transitions proceed at a large rate, while others occur immensely more slowly. The reason for this can be found in quantum mechanics. The probability per unit time that an undisturbed atom decays radiatively from an initial state |ii to a final state |f i is given by 1 ~ 2, (73) γif = E 3 |hf |d|ii| 3πǫ0 ~4 c3 if 1.1. ONE-ELECTRON ATOM/IONS 29 where Eif = Ei − Ef , and d~ is the operator describing coupling to the radiation field. In the approximation that the radiating object is much smaller than the radiation wavelength (see Fig. 0.4), the operator d~ = −qe~r is the electric dipole operator (dipole ~ vanishes are approximation). All transitions for which the matrix element hf |d|ii “forbidden”: this means that they occur at very low rates, associated to higher multipoles in the field expansion. The matrix elements of the dipole operator of the one-electron wavefunction are: Z ~ (74) hnf , lf , ml f |d|ni , li , ml i i = ψn∗ f lf ml f (~r) d~ ψni li ml i (~r) d3 r . This integration is conveniently carried out in polar coordinates: express the dipole operator d~ = −qe~r = −qe r (sin θ cos ϕ, sin θ sin ϕ, cos θ), and observe that r 2π [Y1 −1 (θ, φ) − Y1 1 (θ, φ)] rx = r 3 r 2π [Y1 −1 (θ, φ) + Y1 1 (θ, φ)] ry = r i 3 r 4π rz = r Y1 0 (θ, φ) 3 (this is the inverse of Eq. (38)), so that the dipole matrix element is proportional to |h~ri|2 = |hf |~r|ii|2 = h~ri∗ · h~ri = hrx i∗ hrx i + hry i∗ hry i + hrz i∗ hrz i 2π hY1 −1 − Y1 1 i∗ hY1 −1 − Y1 1 i −i2 hY1 −1 + Y1 1 i∗ hY1 −1 + Y1 1 i +2hY1 0 i∗ hY1 0 i = |hri|2 3 2 4π |hY1 −1 i|2 + |hY1 1 i|2 + |hY1 0 i|2 , = (75) |hri| 3 where we have shortened all bra-ket indications, omitting initial and final states indications and angular dependency of the spherical harmonics. The squared dipole matrix element in the transition rate is then written in polar coordinates as Z ∞ 2 2 2 2 ~ i , li , ml i i = q Rnf lf (r) r Rni li (r)r dr × hnf , lf , ml f |d|n e 0 2 Z Z 2π 4π X π ∗ (76) sin θ dθ dφ Ylf mf (θ, φ)Y1 m (θ, φ)Yli mi (θ, φ) . 3 m 0 0 No radial integral vanishes: multiplication by r turns no Rni li (r) into exactly another Rnl (r), thus r Rni li (r) has nonzero expansion coefficients on all other radial basis wavefunctions. The radial integral can be computed for any initial (ni , li ) and final (nf , lf ): it diminishes rapidly when ni and nf differ by large amounts, because 30 1. ATOMS Rni li (r) and Rnf lf (r) take nonnegligible values in remote places, so that their product is everywhere small. For the angular part, much sharper statements can be made: the integration in the lower row of Eq. (76) represents the angular overlap of a state with l = lf to the product Y1 m Yli mi . This product can be decomposed as two coupled angular momenta: according to the rule (47), the composition of an angular momentum l = 1 with an angular momentum l = li yields three allowed values of the coupled angular momentum in Y1 m Yli mi .3 Final states characterized by l = lf not satisfying (77) lf = |li − 1|, li , li + 1 are guaranteed to make the angular integral vanish. Moreover, note that also when lf = li the angular integral vanishes! The reason is that the parity of ~r (thus of Y1 m (θ, φ)) is negative, while the parities of Yli ml i (θ, φ) and of Yli∗ml f (θ, φ) are both (−1)li . The triple product Ylf∗ mf Y1 m Yli mi is therefore of negative parity, thus integration over all space averages to zero. In summary, in the dipole approximation, nonzero matrix element can occur only for transitions involving states with l changing by exactly unity. The allowed transitions have therefore (78) ∆l = lf − li = ±1 . This equality is called a dipole selection rule, the one regarding l. The fact that the dipole operator is associated to an orbital l = 1 implies also that its component m = −1, 0, 1. Accordingly, the only value of ml f for which the angular integral is nonzero, is obtained by adding m to ml i . From this observation we derive the ml -selection rule: (79) ∆ml = ml f − ml i = 0, ±1 . Until this point, spin was ignored, as the dipole operator is purely spatial: it does nothing to spin. Indeed on the uncoupled basis |n, l, ml , ms i, d~ can (and does) only change the spatial degrees of freedom, but it acts as the identity for spin. As a result, we have (80) (81) ∆s = sf − si = 0 ∆ms = ms f − ms i = 0 . The first spin selection rule (80) is trivial for a one-electron atom, as s ≡ but it will become relevant for many-electron atoms. 3 1 2 anyway, Indeed, it can be shown that angular integral within the absolute value is proportional to l m the Clebsch-Gordan coefficient Clif mlfi 1 m . 1.1. ONE-ELECTRON ATOM/IONS 31 By analyzing the composition of the coupled states |n, l, j, mj i in terms of the |n, l, ml , ms i states, we find that the following selection rules hold for the coupled states: (82) (83) ∆j = jf − ji = 0, ±1 (not 0 −→ 0) ∆mj = mj f − mj i = 0, ±1 . After establishing what transitions are allowed in the dipole approximation, we have better estimate the rate of these transitions according to (73). Remembering that ωif = ~−1 Eif ≈ Z 2 ~−1 EHa , and observing that the order of magnitude of ~ ≈ qe a0 /Z, we can estimate |hf |d|ii| (84) 2 3 ~ 2 2 Eif3 |hf |d|ii| Z4 e2 Z 2 4 2 2 a0 2 e γif = ωif = Z 2 α3 ωif , ≃ E ωif qe 2 ≃ e ωif = Z 3πǫ0 ~4 c3 ǫ0 ~3 c3 Ha Z (~c)3 ~c where we have used EHa a0 = e2 and α = e2 /(~c). As ωif ≃ Z 2 1016 Hz, and α3 ≃ 10−7 , we expect transition rates of the order γif ≃ Z 4 109 s−1 , i.e. typical −1 ≃ Z −4 ns. This sets the order of magnitude of the speed of atomic decay times γif transitions. The strong (Eif3 ) dependence of γif makes the transition time shorter for more energetic transitions, and longer for low-energy transitions. Much slower transition rates occur for dipole-forbidden transitions, associated to weaker higherorder couplings to the electromagnetic field (magnetic dipole, electric quadrupole...). Other, nonradiative transitions may also occur due to collisions, e.g. with fast electrons, with other atoms, with the vessel wall. These mechanisms become dominant for the decay of long-lived metastable states, which lack fast dipole-allowed decay transitions. 1.1.11. Spectra in a magnetic field. We conclude this Section with a brief analysis of atomic spectra in the condition where a maximum of information can be extracted from them, i.e. when the atomic sample is immersed in a uniform magnetic field. In these conditions, the atomic magnetic moment couples to the external field, so that different values of the component of the magnetic moment along the field ~ · ~µ, Eq. (43), detectable by spectroscopical direction acquire different energies −B investigation. The magnetic coupling with the external field rewrites: ~ ~ ~ · ~µ = µB B ~ · L + 2S = µB Bz Lz + 2Sz Hmagn = −B ~ ~ (see Eq. (49)). This operator is diagonal on the uncoupled |l, s, ml , ms i basis. Remember however (Sec. 1.1.8) that Hs−o is not diagonal on that basis, but rather on the coupled basis |l, s, j, mj i. In fact, Hs−o and Hmagn cannot be diagonalized simultaneously, as they do not commute. To obtain the spectrum and eigenvectors one (85) 32 1. ATOMS must then diagonalize the total operator Hmagn + Hs−o , within each (2l+1) · (2s+1)dimensional subspace at fixed l and s. This diagonalization is not especially complicated, but it is perhaps more instructive to understand in detail the two limiting cases where either characteristic energy scale µB |B| or ξ dominates. The simplest limit (µB |B| ≫ ξ), of magnetic field energy µB |B| much larger than the spin-orbit energy ξ, occurs for different values of the field, depending on the atom considered. For hydrogen 2p, the strong-field limit is reached for |B| ≫ 0.5 T, while for He+ 2p it takes a magnetic field as large as |B| ≫ 8 T, due to the Z 4 dependence of the spin-orbit energy, Eq. (63). In this limit of very strong field, the coupled basis is not especially good, as full rotational invariance of the atom is badly broken. The uncoupled basis |l, s, ml , ms i works fine instead: spin and orbital moments align relative to the field with an energy gain or cost depending on their different g-factors. On this basis, the large interaction Hmagn is diagonal: if we neglect the smaller Hs−o , the magnetic energy levels are simply (86) Emagn (ml , ms ) ≃ hml , ms |Hmagn |ml , ms i = µB Bz (ml + 2ms ) (Paschen-Back limit). Hs−o corrections may be added perturbatively. In the (more common) opposite weak-field limit (|B| ≪ ξ/µB ), spherical symmetry is only weakly perturbed. The states |l, s, j, mj i in the coupled basis are approximate eigenstates of Hmagn + Hs−o , and Hmagn acts as a weak perturbation. To first order in µB |B|/ξ, the energy correction due to Hmagn is provided by Eq. (51): (87) Emagn (mj ) ≃ hj, mj |Hmagn |j, mj i = hj, mj | − µz Bz |j, mj i = gj µB Bz mj (Zeeman limit), where gj is the Landé g-factor, obtained in Eq. (54). Both bases fail and neither of these approximate expressions is correct in the intermediate-field regime µB |B| ≃ ξ. Figure 1.14 shows the exact pattern of splittings of the six 2 P states under the action of a magnetic field, changing from the Zeeman (weak field) to the Paschen-Back (strong-field) limit. The initial slopes of the curves at B → 0, divided by the relevant mj , measure the values of the Landé gj . The experimentally observed spectra confirm the theory outlined here. For H, provided that a sufficiently strong magnetic field is applied, the Paschen-Bach is relatively straightforward to observe as a triplication of all lines. If very high spectral resolution can be achieved, also the weak-field Zeeman splitting of the H lines shown in the conceptual scheme of Fig. 1.15 could be detected. Such Zeeman effect is called “anomalous” since the lines are not regularly spaced, and the spacing is different for the (2 S1/2 ←→2 P1/2 )-originated and the (2 S1/2 ←→2 P3/2 )-originated lines. However, in atomic physics, this is rather the rule than an anomaly: only a few S = 0 lines of many-electron atoms happen to show “regular” Zeeman splittings (see Fig. 1.31 in Sec. 1.2.7.3). 1.2. MANY-ELECTRON ATOMS ml , ms l=1, s=1/2 Energy / ξ 33 +1,+1/2 5 0,+1/2 -1,1/2 j=3/2 j=1/2 0 +1,-1/2 0,-1/2 -5 -1,-1/2 0 2 Zeeman limit µBB / ξ 4 Paschen-Back limit Figure 1.14. Spin-orbit and magnetic splitting of a 2 P multiplet. With the shorthand b = µB B/ξ, the expression for the mj = ± 32 energies (solid lines) is simply 21 ± 2b ξ. The energies of the four other levels are 14 (−1 ± 2b + d)ξ (dotted lines) and 41 (−1 ± 2b − d)ξ p (dashed lines), where d = 9 + 4b(1 + b). 1.2. Many-electron atoms 1.2.1. Identical particles. The concept indistinguishable particles is central to understand correctly all the physics of matter beyond one-electron atoms. In classical mechanics, each particle is labeled by its own position and momentum: one could in principle follow individual trajectories through the motion, and thus tell identical particles i and j apart at any time. In quantum mechanics, indistinguishable particles are such at the deepest level. There is no way, even as a matter of principle, to ever distinguish e.g. 2 electrons. Quantum mechanics implements this perfect indistinguishableness through symmetry: any many-particles ket has a definite symmetry “character” for the permutation Pij swapping any two identical particles. As this permutation symmetry is a discrete symmetry which, once applied twice, leads back to the initial state, the eigenvalues of any permutation can only be +1 or −1. Those particles for whose swap the total ket |ai of the system is symmetric (+1 character) are called bosons. This is represented by Pij |ai = |ai. Those particles for whose swap the total ket |ai of the system is antisymmetric (−1 character) are called fermions. This is indicated by Pij |ai = −|ai. 34 1. ATOMS 3/2 1/2 2P 3/2 −1/2 45 µ eV −3/2 1/2 −1/2 2P 1/2 10 eV Field off Field on 1/2 2S 1/2 −1/2 Figure 1.15. Conceptual scheme of the Zeeman-split 1s←→2p lowest Lyman line of H. Splittings between lines of different mj are proportional to the magnetic energy µB B and the relevant g-factor: gj (2 P3/2 ) = 34 , gj (2 P1/2 ) = 23 , gj (2 S1/2 ) = 2. A simple rule connects the spin of a particle kind to its permutational symmetry: integer-spin particles are bosons, half odd-integer particles are fermions. Example of bosons: the photon (spin 1). Examples of fermions: the electron, the proton, the neutron (all spin 12 ). A collection of bosons and fermions lumped together is often treated as a single point particle. This makes sense when the internal dynamics is associated to very high excitation energy, so that the lump remains in its ground state, which can be degenerate according to the projection of the lump’s total angular momentum. One could ask what character for the permutation of two such identical lumps the collective ket shows. This is simply answered by counting the number of (−1)’s generated by the permutations of pairs of identical fermions. For example, hydrogen atoms in the same hyperfine state are bosons, since a −1 is generated by the permutation of the two electrons and a second −1 is generated by the permutation of the two protons. According to Eq. (47), these composite objects fulfil the spin rule: e.g. the grand total spin of H in its ground state is f = 0. Similarly, identical nuclei are bosons or fermions 1.2. MANY-ELECTRON ATOMS 35 He 25 Ionization energy [eV] Ne 20 Ar Kr 15 Xe Hg Rn 10 5 0 Li 0 Na Ga K 20 Tl In Rb Fr Cs 40 60 80 100 Z Figure 1.16. First ionization energies of neutral atoms (N = Z), as a function of the atomic number Z. according to whether they contain an even or odd number of nucleons (protons or neutrons). For example the deuteron ion D+ , a bound state of a proton and a neutron in a i = 1 spin state, is a boson. Likewise, the 13 C isotope of carbon is a fermion (13 nucleons + 6 electrons), while the 35 Cl isotope of chlorine is a boson (35 nucleons + 17 electrons). The permutational symmetry is crucial to understand the dynamics of many electrons and, in particular, the structure of many-electron atoms. As discussed in greater detail below, antisymmetry obliges N electrons to span N quantum states, thus effectively avoiding one another. In practice, the geometrical constraint of antisymmetry is often more effective than the dynamical electron-electron repulsion (6) in keeping electrons apart. Without permutational antisymmetry all electrons would occupy the same 1s shell in the atomic ground state. If that happened (as it could if electrons were distinguishable particles – or bosons – rather then fermions), then the atomic ionization energies should increase roughly with Z 2 and the size of atoms should decrease roughly as Z −1 . In stark contrast, relatively weak non-monotonic Z dependence of both these properties (Fig. 1.16 and 1.17) are observed: in particular, the ionization energy and the atomic size show a general tendency to respectively decrease and increase with Z. 36 1. ATOMS Figure 1.17. Empirical atomic radii of elements, as a function of Z. 1.2.2. The independent-particles approximation. Electrons interact strongly with each other. In a neutral atom, the electron-electron repulsion (6) is of the same order of magnitude as the attraction (4) to the nucleus. Exact solution of the Schrödinger equation associated to Hamiltonian (1) involves the determination of a N -electron wavefunction. For increasing N , this becomes rapidly a formidable task, as the N -electron wavefunction describes the correlated motion of all the N electrons, and thus depends in a nontrivial way on all position and spin coordinates of the N electrons. The amount of information associated to a generic N -electron wavefunction is exponentially large with N , and basically there is no way to store (let alone to compute!) the exact wavefunction for the ground state of several interacting electrons. Most approximate methods on the market are based on the observation that a basis of the Hilbert space of N -particle states can be built as the tensor product of single-particle basis states. More explicitly, if {|αj i} is a complete set of orthonormal states for a single particle, the product (88) |α1 , α2 , ...αN i = |α1 i ⊗ |α2 i ⊗ ...|αN i , realizes a basis for N particles when all possible choices of α1 , α2 , ... αN are explored. For indistinguishable particles, the correct permutational symmetry is imposed to the product state (88) by taking the properly symmetrized linear combination 1 X (89) |α1 , α2 , ...αN iS/A = √ (±1){P } |αP1 , αP2 , ...αPN i . NP P 1.2. MANY-ELECTRON ATOMS 37 Here P indicates a generic permutation of the N particles αj , and the sum extends over the N ! permutations; the normalization NP equals N ! if all the αj happen to be different; {P } in the exponent indicates the parity of the permutation P , i.e. the number of pair transpositions P is made of. The fully symmetrized basis state |α1 , α2 , ...αN iS realizes the correct permutational symmetry of N bosons; the antisymmetric combination |α1 , α2 , ...αN iA involving nontrivial (−1) signs can play the role of basis state for N fermions. For bosons, no restriction applies to the quantum numbers αj : any number of them may be equal. Instead, for fermions, all quantum numbers must necessarily be different. If two were equal, say αi = αj , in the sum of (89), the kets |αP1 , ...αPi , ..., αPj , ...αPN i and |αP1 , ...αPj , ..., αPi , ...αPN i would be equal, but with opposite parity phase factor (−1){P } , so that they all cancel in pairs in the sum, and the total ket |α1 , ...αi , ...αj , ...αN iA vanishes. In summary, the product basis kets for N fermions are characterized by N different quantum numbers: this property expresses the Pauli exclusion principle, according to which no two identical fermions can occupy the same quantum state. The symmetrized product kets (89) are substantially simpler objects than a generic boson/fermion ket: their information contents is only directly proportional to N , rather than exponentially. Despite this simplicity, they constitute a basis of the Hilbert space of the proper symmetry: any actual N -boson/N -fermion ket |aB/F i (e.g. an exact eigenstate of the interacting Hamiltonian) can be expressed as a linear combination of the factorized basis kets: (90) |aB/F i = X α1 ,α2 ,...αN caα1 ,α2 ,...αN |α1 , α2 , ...αN iS/A , where caα1 ,α2 ,...αN are the complex coefficients defining the linear combination. The number of these coefficients is exponentially large with N : the intrinsic difficulty of the correlated problem has been transfered to the expansion coefficients. Many approximate methods of resolutions of the Schrödinger problem for manyelectron systems replace the (lowest) exact eigenstate with one (as smart as possible) basis state |α1 , α2 , ...αN iS/A constructed with single-particle states |α1 i, |α2 i, solutions of some appropriate single-electron Hamiltonian. This is the so-called independent-particles approximation. For atoms, the simplest approach along this line consists in neglecting the electronelectron Coulomb interaction Vee (6) altogether. In this approximation, the Schrödinger problem for the N electrons is exactly factorized: each electron moves independently of the others in the field Vne of the nucleus of charge Zqe . The single-electron eigenstates |αj i are hydrogenic wavefunctions of the kind (46) (we neglect spin-orbit), 38 1. ATOMS N −1 Z Figure 1.18. When one electron moves away from an atom/ion, the nucleus of charge Zqe and the remaining (N −1) electrons attract it as if they were a point charge (Z − N + 1) qe . with the individual factors given in Eqs. (31) (32) (33), with the appropriate Z.4 In this atomic context αj stands for the j-th set of quantum numbers nj , lj , ml j , ms j . Choose at will any set of N different αj : the generic N -electron eigenstate is given by |α1 , α2 , ...αN iA as in (89). For example, for N = 4 electrons, an acceptable state is (91) |1, 0, 0, ↑, 3, 1, 1, ↑, 3, 1, 0, ↑, 3, 1, −1, ↓iA . The standard spectroscopic notation for this state 1s3p3 lists of the occupied singleparticle orbitals, with the corresponding “occupancies” as exponents: all information about the ml ’s and ms ’s is dropped. The total energy E tot of an atomic state is minus the work needed to decompose the atom from that given bound state to an isolated nucleus plus the N individual electrons at rest at infinite reciprocal distance. For the simple factorized eigenstates, the total energy is simply the sum of the single-particle energies. The eigenstate (91) is not the one with the lowest possible energy. The ground state can be obtained by minimizing the energy of each single-particle orbital, without violating the Pauli principle. For N = 4, the state with the lowest energy is any of the 1s2 2s2 , 1s2 2s2p, 1s2 2p2 (a total of 1 + 12 + 15 = 28 individual states), all with energy EHa Z 2 Z 2 5 tot (92) E1s2 2s2 = 2E1 + 2E2 = −2 + 2 = − Z 2 EHa . 2 2 1 2 4 2 tot 2 The excited state (91) has substantially higher total energy E1s3p 3 = − 3 Z EHa . As the maximum occupancy of the n-th hydrogenic level 2n2 grows rapidly, in the brutally simplified model at hand (complete neglect of Vee ), the minimum energy to remove an electron from the atom (first ionization energy) increases with Z only 4 We ignore the reduced-mass correction, assuming an infinite nuclear mass. The finiteness of the nuclear mass introduces extra tiny electron-electron correlations in addition to those of Coulomb origin. 1.2. MANY-ELECTRON ATOMS 39 marginally more slowly than Z 2 , at variance with experiment (Fig. 1.16). Also, in this model any atom (regardless of Z) would be able to accept any number N of electrons, always forming bound states. Experimentally, however, only certain atoms can form negatively charged ionic bound states, but never with more than 1 extra charge (N ≤ Z + 1). These difficulties indicate that, as one could imagine, complete neglect of Vee is a very poor approximation. The main reason for the failure of this model is illustrated in Fig. 1.18: in reality while an electron is removed from the atom, it does not feel the bare nuclear attraction −Ze2 /r but rather, due to electron-electron repulsion and in accord to the divergence theorem, the substantially weaker combined effect of the nuclear charge and that of the other N −1 electrons, V (r) ≃ (−Z + N −1) e2 /r. This phenomenon of “screening” reduces the ionization energy substantially, and must be included for a decent factorized description of the atomic wavefunction. The following section sketches some theory for the simplest many-electron atom, He (Z = N = 2), for which significant insight is obtained by treating Vee as a perturbation to the uncorrelated electron states. Perturbative methods fail for N ≥ 3: Sec. 1.2.4 presents a more systematic method for improving the independentelectron approximation to include screening. 1.2.3. The 2-electron atom. Put the position and spin coordinates of the jth electron together and shorthand them as wj = (~rj , σj ). The permutation operator P in Eq. (89) acts on the position eigenkets exchanging the N coordinates with one another: (93) P |w1 , w2 , ...wN i = |wP1 , wP2 , ...wPN i . Accordingly, the wavefunction associated to Eq. (89) can be written equivalently 1 X Ψα1 ,...αN (w1 , ...wN ) = hw1 , w2 , ...wN |α1 , α2 , ...αN iA = √ (−1){P } hwP1 , ...wPN |α1 , ...αN i N! P 1 X (−1){P } hwP1 |α1 i · ...hwPN |αN i (94) = √ N! P 1 X (−1){P } ψα1 (wP1 ) · ...ψαN (wPN ) . = √ N! P The sum in the last expression is the determinant of the ψαi (wj ): ψα1 (w1 ) ψα1 (w2 ) 1 ψα2 (w1 ) ψα2 (w2 ) (95) Ψα1 ,...αN (w1 , ...wN ) = √ .. .. . . N! ψ (w ) ψ (w ) αN 1 αN 2 matrix whose elements are . · · · ψαN (wN ) ··· ··· ... ψα1 (wN ) ψα2 (wN ) .. . 40 1. ATOMS This object is called a Slater determinant. In the simplest nontrivial case of N = 2 electrons (relevant, e.g., for the He atom, the Li+ , Be2+ ,... ions), the independent-electron wavefunction reads (96) 1 ψα1 (w1 ) ψα1 (w2 ) ψα1 (w1 )ψα2 (w2 ) − ψα1 (w2 )ψα2 (w1 ) √ . = Ψα1 ,α2 (w1 , w2 ) = √ 2 ψα2 (w1 ) ψα2 (w2 ) 2 In the cases where the orbital quantum numbers coincide (n1 = n2 , l1 = l2 , ml 1 = ml 2 ), the orbital part can be factorized from the determinant. The latter only involves spin: (97) Ψn,l,ml ,↑, 1 χ↑ (σ1 ) χ↑ (σ2 ) r1 ) ψn,l,ml (~r2 ) √ n,l,ml ,↓ (w1 , w2 ) = ψn,l,ml (~ 2 χ↓ (σ1 ) χ↓ (σ2 ) . This antisymmetric up–down combination of the two electron spins is an eigenstate ~ = ~s1 + ~s2 , with null eigenvalue |S| ~2 = of the square modulus of the total spin S ~ 2 , like (97), are useful because the matrix elements S(S +1)~2 = 0. Eigenstates of |S| of the (hitherto neglected) Coulomb repulsion between states of different S vanish since Vee is an orbital operator, which cannot change spin. The other S = 0 states (spin singlets), those involving two different sets of orbital quantum numbers, are: ΨS=0 n1 ,l1 ,ml 1 , n2 ,l2 ,ml 2 (w1 , w2 ) = ψn1 ,l1 ,ml 1 (~r1 ) ψn2 ,l2 ,ml 2 (~r2 ) + ψn1 ,l1 ,ml 1 (~r2 ) ψn2 ,l2 ,ml 2 (~r1 ) χ↑ (σ1 ) χ↓ (σ2 ) − χ↑ (σ2 ) χ↓ (σ1 ) √ √ (98) . 2 2 Note that these S = 0 states are not single Slater determinants of the type (96). The singlet states are characterized by an orbital part of the wavefunction which is symmetric under permutation P12 , with the spin part taking care of the required antisymmetry. ~ 2 . The other value The singlet states (97) and (98) are S = 0 eigenstates of |S| of S allowed by Eq. (47) is S = 1. The spin part of the wavefunctions of these spin-triplet states is any of: (99) (100) (101) X S=1,MS =1 (σ1 , σ2 ) = χ↑ (σ1 ) χ↑ (σ2 ) χ↑ (σ1 ) χ↓ (σ2 ) + χ↑ (σ2 ) χ↓ (σ1 ) √ X S=1,MS =0 (σ1 , σ2 ) = 2 S=1,MS =−1 X (σ1 , σ2 ) = χ↓ (σ1 ) χ↓ (σ2 ) , 1.2. MANY-ELECTRON ATOMS 41 which are all symmetric for P12 . Therefore the orbital part takes care of antisymmetry: S ΨS=1,M n1 ,l1 ,ml 1 , (102) n2 ,l2 ,ml 2 (w1 , w2 ) = ψn1 ,l1 ,ml 1 (~r1 ) ψn2 ,l2 ,ml 2 (~r2 ) − ψn1 ,l1 ,ml 1 (~r2 ) ψn2 ,l2 ,ml 2 (~r1 ) S=1,MS √ X (σ1 , σ2 ) . 2 In these states, at least one of the orbital quantum numbers for the two electrons needs to be different: (n1 , l1 , ml 1 ) 6= (n2 , l2 , ml 2 ), or else the wavefunction vanishes. The following table summarizes the basic properties of the singlet (S = 0) and triplet (S = 1) basis states |n1 , l1 , ml 1 , n2 , l2 , ml 2 , S, MS i: orbital quantum numbers orbital wavefunction spin quantum numbers spin wavefunction S = 0 – spin-singlet states S = 1 – spin-triplet states any (n1 , l1 , ml 1 ) 6= (n2 , l2 , ml 2 ) symmetric antisymmetric ↑ and ↓ (different) any antisymmetric symmetric These spin-symmetrized states are convenient 0th -order states for a perturbation theory in Vee . They are eigenstates of Te + Vne : like in the example of Eq. (92), the “unperturbed” energies 1 EHa Z 2 Z 2 1 tot (0) (103) En1 ,n2 = − , + 2 = −2 EHa + 2 n21 n2 n21 n22 where the last expression refers to He (Z = 2). The ground state n1 = n2 = 1 tot (0) (in spectroscopic terms 1s2 ) is necessarily a spin singlet and has energy E1,1 = −4 EHa ≃ −109 eV. As a next step, the effect of Vee is accounted for, considering only its diagonal matrix elements, and neglecting its off-diagonal matrix elements. This approximation leaves the states unchanged and provides the first-order (additive) correction E tot (1) = hVee i to the eigenenergies, in accord with general perturbation theory [?]: (104) tot (1) En1 ,l1 ,ml 1 , n2 ,l2 ,ml 2 , S,MS = hn1 , l1 , ml 1 , n2 , l2 , ml 2 , S, MS |Vee |n1 , l1 , ml 1 , n2 , l2 , ml 2 , S, MS i . Since the electron-electron repulsion is positive, this correction is always positive. The detailed calculation of these Coulomb integrals is a rather intricate mathematical exercise, but its qualitative outcome is very instructive. The largest of these E tot (1) corrections occurs for the most localized wavefunction, the one where the two electrons stay very close together, both in the 1s level: the ground state 1s2 . The average inter-electron distance is of the order of a0 , thus the Coulomb integral 42 1. ATOMS Figure 1.19. Energy levels of atomic He, showing some transitions. n, l refer to n2 , l2 , while n1 = 1 for all states reported here. Note that triplet states sit systematically lower than the singlets. tot (1) E1,0,0, 1,0,0, 0,0 is of order ∼ e2 /a0 = EHa . The precise value obtained from integration [?] is 45 EHa ≃ 34 eV. This brings the estimated ground-state energy of He to −74.8 eV, in fair agreement with the experimental value −79.00 eV (minus the sum of the first and second ionization energies of He). For the optically most relevant states, those with one electron sitting in 1s and the other in an excited state n2 [l2 ], perturbation theory accounts for several experimental observations: • All Coulomb integrals are smaller than the one for the ground state, and tend to decrease for increasing n2 : the Coulomb correction become less and less important as the electrons move apart from each other. 1.2. MANY-ELECTRON ATOMS 43 • The Coulomb integrals at given n2 depend weakly on l2 : they usually increase for increasing l2 . By looking at the hydrogenic radial distributions Fig. 1.7, one can note that indeed the electrons, on average, sit slightly closer when l2 is larger. This is an important finding, since it breaks the H-atom l-degeneracy of the shells, putting Ens < Enp < End < ..., in accord to experimental finding (Fig. 1.19). • The Coulomb integrals depend on S, clearly not through the spin wavefunction which has nothing to do with the purely spatial Vee , but through the different electron-electron correlation in the spatially P12 -symmetrical (S = 0) or spatially P12 -antisymmetrical (S = 1) wavefunction. In particular, S Eq. (102) shows that the triplet wavefunction ΨS=1,M n1 ,l1 ,ml 1 , n2 ,l2 ,ml 2 (w1 , w2 ) vanishes for ~r1 → ~r2 . On the contrary, the singlet wavefunction ΨS=0 n1 ,l1 ,ml 1 , n2 ,l2 ,ml 2 (w1 , w2 ) is finite at ~r1 = ~r2 . Therefore, on average, the electrons in a spin-triplet state avoid each other more effectively than in the spin-singlet state with the same orbital quantum numbers.5 Indeed, Coulomb integrals are systematically smaller for S = 1 than for S = 0 states, as explicit evaluation of the integral (104) shows. This result accounts for the experimental observation that the triplet states lie systematically lower than the corresponding singlet (Fig. 1.19). Splittings between states of different spin, here singlets and triplets, are named exchange splittings. The perturbative approach presented here is useful mostly as a conceptual tool, to understand qualitative trends, and general concepts such as those listed above. Perturbation theory is relatively successful for the 2-electron atom, but for N > 2 electrons the repulsion that a given electron experiences from all the other N − 1 electrons is comparable to the attraction generated by the nucleus, and any attempt to treat it as a small perturbation fails. A better approximate approach, based on a mean-field self-consistent evaluation of the electron-electron repulsion, yields fair quantitative accuracy for any N and is commonly used to date. The reliability of this and similar self-consistent field methods have made them standard tools for understanding experiments and making previsions of atomic properties of matter from first principles. 1.2.4. The self-consistent field theory. The problem of describing at best the ground state of a N -electron problem in terms of a single Slater determinant belongs to the general framework of variational problems. The simple idea is that 5 This means that the fixed-spin states include some degree of geometric correlation of the electronic motion, which would be absent in a pure Slater determinant (96). Employing a basis where the perturbation Vee is diagonal within the 0th -order degenerate space follows the same strategy as the choice (Secs. 1.1.7 and 1.1.8) of the |l, s, j, mj i basis to have Hs−o diagonal within the degenerate multiplets. 44 1. ATOMS the average energy E var [a] = ha|Htot |ai of any state |ai is larger than or equal to that of the ground state. The lower E var [a] is, the closer |ai gets to the ground state. When for |ai we take a generic Slater determinant, the “best” state in its class is the result of the minimization of the energy E var [ψα1 , ...ψαN ] = hα1 , ...αN |A Htot |α1 , ...αN iA Z = dw1 , ...dwN Ψ∗α1 ,...αN (w1 , ...wN )Htot Ψα1 ,...αN (w1 , ...wN ) (105) Z X 1 Xh = hαi |H1 |αi i + dw dw′ vee (w, w′ ) |ψαi (w)|2 |ψαj (w′ )|2 2 i,j i Z i − dw dw′ ψα∗ i (w)ψα∗ j (w′ ) vee (w, w′ ) ψαj (w)ψαi (w′ ) with respect to arbitrary variations of the N single-particle wavefunctions ψRαi com∗ posing the Slater determinant. i them to remain orthonormal dw ψαi (w)ψαj (w) = h We only require 2 ~ ∇~r2 − δij . In Eq. (105), H1 (w) = − 2m e vee (w, w′ ) = e2 |~ r −~ r′ | Ze2 |~ r| stands for the one-particle term, and δms m′s for the electron-electron Coulomb repulsion. Finding a minimum of E var is the problem of minimizing a functional, i.e. a function whose independent variables is a set of functions. This constrained minimization problem is formally solved if the ψαi satisfy the set of coupled nonlinear integrodifferential equations called Hartree-Fock (HF) equations: 2 1 }| z { H1 (w)ψα (w) + z Z (106) dw′ X β }| 3 { |ψβ (w′ )|2 vee (w, w′ ) ψα (w) − z Z dw′ X }| ψβ∗ (w′ ) vee (w, w′ ) ψα (w′ )ψβ (w) β = ǫα ψα (w) . Each of the three numbered terms derives from a corresponding term in the total energy (105). If one pretends that all ψβ functions were, rather than the unknown functions they really are, given fixed functions, then equations (105) would become linear in ψα . The general form of these equations is that of a Schrödinger equation: term 1 contains the kinetic energy and the Coulomb attraction of the nucleus; term 2P represents the Coulomb repulsion of the average charge distribution of all electrons ( β |ψβ (w′ )|2 represents the density distribution of the N electrons); term 3 is a nonclassical nonlocal exchange term which, in particular, removes the unphysical repulsion of the electron with itself introduced by term 2 (observe that the α = β terms in the sums of terms 2 and 3 cancel).6 The HF equations realize indeed a 6 { In the β sum of term 3 only ms β = ms α terms survive, as vee is purely orbital and does not modify spin. 1.2. MANY-ELECTRON ATOMS 45 Generate N orthogonal initial wavefunctions Generate the integro− differential operator Solve the Schroedinger− like equation Replace old wave− functions with new ones Choose the N lowest eigensolutions Compute total energy and expectation values Are NO the new wavefunctions equal to the old ones? YES Figure 1.20. An idealized resolution scheme of the self-consistent HF equations (106). very natural way to treat the electron-electron repulsion as well as possible at the mean-field level. Terms 2 and 3 of (106) depend explicitly on the (unknown) wavefunctions ψβ . The standard strategy (Fig. 1.20) for the solution of the HF equation is based on initially assuming that all ψβ in (106) are known: start from some arbitrary initial set of N orthonormal one-electron wavefunctions, put them in place of all ψβ ’s in (106), solve (usually numerically) the linear equations for ψα ; from the list of new solutions, take the N eigenfunctions with lowest single-particle eigenenergy ǫβ and re-insert them into the equations (106) in place of the ψβ ’s; iterate this procedure as long as needed. Usually, after several iterations (of the order of 10, depending on the starting ψβ ), self-consistency is reached, i.e. the wavefunctions do not change appreciably from one iteration to the next. The converged wavefunctions allow to compute several observable quantities, e.g. the total energy given by Eq. (105). The sum of the nuclear potential plus the repulsion of the charge distribution of 46 1. ATOMS r V[r] 2 4 6 8 r -2 -4 -6 -8 Figure 1.21. Qualitative form of the atomic Hartree-Fock effective potential multiplied by r (in units of e2 ). The asymptotic values are consistent with N = Z = 8. The effective potential multiplied by r and divided by −e2 defines the effective charge Zeff (r) seen at that distance from the nucleus. the other electrons represents the self-consistent potential acting on the motion of charged particles within the atom. Until now, no assumption has been made about the symmetry of the self-consistent potential. It needs not have any special symmetry, and indeed it often has none. However, a simplifying approximation is commonly made: that the charge distribution and the self-consistent potential are spherically symmetric, like the nuclear potential. This approximation has the advantage of allowing to separate variables in Eq. (106), like in the solution of the one-electron atom, and write each HF single-particle solution ψα as the product of a nontrivial radial wavefunction (determined numerically) times a spherical harmonic Ylml . In the spherical approximation, one-electron wavefunctions are labeled by hydrogen-like quantum numbers α = (n, l, ml , ms ): here angular quantum numbers l, ml label exactly the same angular dependence (31), (32) as the one-electron atom’s, while the radial wavefunction Rnl (r) has now a nontrivial radial dependency (determined self-consistently). Like for the 1-electron atom, the number of radial nodes (n−l−1) defines n. The N -electron ground state is built by filling the single-electron levels starting from 1s, 2s, ... upward. In the spherical approximation, one finds that the selfconsistent potential felt by each electron has the correct behavior for large r (namely, (N−1−Z) e2 /r) and for small r (namely, −Ze2 /r), as illustrated in Fig. 1.21. As the potential has not a simple Coulomb shape, the single-electron levels are not found 1.2. MANY-ELECTRON ATOMS 47 Figure 1.22. Radial properties of a spherically symmetrical atom (Ar, N = Z = 18) computed by means of the HF self-consistent method. (a) The radial probability distribution for the filled singleelectron states. Note that the characteristic radius of the innermost shell n = 1 is ≈ a0 /Z, while the outer filled shell (n = 3) is slightly larger than a0 . (b) The total radial probability distribution P (r) and effective charge Z(r) specifying the effective potential. at the positions (30) of the 1-electron atom, and in particular their energy depends explicitly on l, not only on n. Indeed, an ns orbital, with larger probability than np close to the origin where the effective HF potential is more strongly attractive, is placed lower in energy. Thus the Hartree-Fock method accounts quite naturally for the observed l-ordering ns, np, nd ... of the single-electron levels observed in the atomic spectra (e.g. for He in Fig. 1.19). Figure 1.22 reports the filled single-electron HF radial wavefunctions, for the Ar atom. The typical radii of the individual shells vary very substantially with n (from ≈ a0 /Z of 1s, to slightly more than a0 for the 3p valence shell), so that peaks related to the 3 filled shells emerge clearly in the total probability distribution. Accordingly, the n-dependence of the shell energy is faster for many-electron atoms than in the one-electron atom. The independent-electron self-consistent spherical-field model has become more than a simple approximation: it provides the basic language of atomic physics. Physicists are well aware that a single-determinant configuration is only an approximation of the actual eigenstate of an atom, but, in practice, atomic states are 48 1. ATOMS routinely labeled by the electron occupancies of single-electron orbitals best representing the actual many-electron state. For example, the standard notation for the electronic ground-state configuration of Mg is 1s2 2s2 2p6 3s2 . 1.2.5. The periodic table. The HF theory permits us to understand the ground state (and even many excitations) of many-electron atoms and ions. With the tot HF method, the 1s2 ground configuration of He has total energy E1s 2 = −77.8 eV, about 2 eV above the measured ground-state energy; the radial dependency of the one-electron wavefunction is of course nonhydrogenic. In Lithium, a third electron adds into 2s (configuration 1s2 2s). The HF ground-state total energy is −7.4328 EHa [?], to be compared with the experimental (1st +2nd +3rd ) ionization energy 7.4755 EHa , with an error of about 1 eV. The first ionization potential can be computed by subtracting the total energy given by a self-consistent calculation for the positive ion, N = Z −1: for Li one obtains a ionization potential of 5.34 eV, in good accord with the experimental value 5.39 eV. This value is much smaller than that of He (24.59 eV). The reason is that the 2s shell is much more weakly bound than 1s. Beryllium has 1s2 2s2 ground state. For all these atoms (N ≤ 4) involving only s orbitals, the spherical approximation is appropriate. Starting from Boron, electrons occupy progressively a degenerate p shell: as the p shell is non spherically symmetric, the spherical approximation for the self-consistent field is questionable. The 2p shell is completely filled as Neon (N = Z = 10), the P next noble2 gas, is reached. Again Ne is a spherically symmetrical atom, since m |Ylm (θ, ϕ)| is independent of θ and ϕ. The ionization potential of Ne is again very large, but not as much as that of He (see Fig. 1.23). The next atom, Na, involves one electron in the 3s shell, which is located much higher in energy than 2p. Again the ionization potential has a dip, as shown in Fig. 1.16, which can be interpreted as the starting of a new shell which is only weakly bound. As Z = N further increases, the filling of the n = 3 shell proceeds fairly smoothly, with 3s and 3p becoming more and more strongly bound until the next noble gas Ar is reached.7 For Z this large, the l-dependence of the single-particle HF energy is so strong that the HF self-consistent field puts the 4s shell lower than 3d. Indeed experiment shows that potassium has ground state 1s2 2s2 2p6 3s2 3p6 4s rather than 1s2 2s2 2p6 3s2 3p6 3d. The physical properties of this atom are similar to those of other alkali metals (Li, Na). 7 For Ar, Z = 18, the HF approximation finds a total energy of −526.817 EHa , which is 0.791 EHa = 21.5 eV in excess of the experimental energy [?]. The absolute error is rather large, which indicates that the neglect of explicit correlations in the electronic motion is a serious drawback of HF. However, the relative error is ∼ 0.15% only, and the excess energy per electron amounts to approximately 1 eV, indicating that this mean-field approximation captures the bulk of Coulomb energy. 1.2. MANY-ELECTRON ATOMS Figure 1.23. The observed position of the ground state and lowest excitations, and the spread of the atomic multiplets due to Coulomb exchange (and, to a tiny extent, spin-orbit) for Z ≤ 11. The energy of the singly ionized atom sets the zero of this scale (top of the figure). 49 50 1. ATOMS The 3d shell is then filled after 4s but before 4p, although some inversions (as in Cr and Cu) indicate that 4s and 3d are almost degenerate at the HF level, and more subtle effects of electron correlation play an interesting role. Further intersections associated to a strong l-dependence of energy occur as 4d, 4f and 5d are being filled, as listed in the periodic table. Similar properties of all elements with a given number of s or p electrons in the outermost shell suggest the arrangement of all atoms in the periodic table. The “low-energy” properties of atomic materials with incomplete d shells (transition metals) and f shells (lanthanides) are relatively similar in each group. The size of the atoms (a not especially well defined property) computed with HF is in relatively good accord with the empirical trends of Fig. 1.17. In particular, noble gases are especially small and alkali atoms especially large relative to other atoms of similar Z; on the whole, the size of neutral atoms tends to increase slowly with Z. Cations (=positive ions) can be produced with any charge Z − N : the size of these species decreases as shells get emptied and screening is less and less effective. Usually (but not always) an ion with N electrons has the same electronic configuration as the atom with Z = N (this is especially true for small charging N ≃ Z). However all single-electron wavefunctions shrink closer to the nucleus in the cation than in the neutral atom. Anions (=negative ions) are stable in gas phase only for certain atoms and only up to a maximum charging of 1 electron (i.e. N ≤ Z + 1). The HF model signals that some ionic state is not stable by never reaching self consistence. The results of HF is often in accord with experiment. For example all halogens (F, Cl, Br, I, At), thanks to their an almost complete relatively “deep” in energy np5 shell (Fig. 1.23), have positive electron affinity (defined as the ionization potential of the negative ion), which means that their negative ion is stable against loss of the extra electron. Beside ionic states, HF permits also to compute (to some extent) excited states and excitation energies. For example, after computing the ground-state properties of Na, by filling the N = 11 lowest single-electron levels as 1s2 2s2 2p6 3s, one could run a new self-consistent calculation putting the 11th valence electron in 3p rather than in 3s (configuration 1s2 2s2 2p6 3p). The self-consistent field turns out different, and the total energy is larger. The difference in total energy between the two calculations is a fair estimate of the excitation energy, here of the 3s→3p transition of Na, about 2 eV, see Fig. 1.26 below. 1.2.6. Core levels and spectra. According to HF theory, the energy of the deepest single-particle levels drops very rapidly with increasing Z, essentially as ∝ −Z 2 . Indeed, in the independent-electrons language, one could conceive exciting not just the shallow levels with the smallest binding energy, but also the deep core levels 1s, 2s ... For example for Na, a configuration such as 1s1 2s2 2p6 3s2 could be 1.2. MANY-ELECTRON ATOMS 15 51 X-ray absorption coefficient Ar Cl S 10 Na Mg Al Si P Absorption Z = 11 to 18 5 0 0 1000 2000 Energy [eV] 3000 4000 Figure 1.24. The observed absorption coefficient of all atoms in the third row of the periodic table, showing, for increasing Z, the regular displacement of the K edge, and buildup of the L edge. investigated, where an inner 1s electron has been promoted to the outer 3s shell. This state involves a huge excitation energy (of the order of 40 EHa ), and one might suspect that such an enormously unbound state has no right to exist. Indeed, the atom in this state has plenty of transitions that allow to get rid of big chunks of this excitation energy, e.g. to a substantially lower-energy state 1s2 2s2 2p5 3s2 . According to Eq. (73), the decay transition rate is very large as it grows with the third power of the energy associated to the transition, which dominates over the reduction in dipole matrix element due to the small size of the initial shell, to a total ∼ Z 4÷5 dependence – see Eq. (84). Accordingly, the lifetime broadening of core states may be huge, exceeding ~γ ≈ 1 eV. Despite such huge broadening, core-hole states are not just a theoretical prediction of the independent-electron model, but they are routinely observed in UV and X-ray spectroscopies. Like optical spectroscopies, many experiments probing core states can be classified as absorption or emission, with the same conceptual scheme of Figs. 0.5 and 0.6. Absorption data (Fig. 1.24) show a remarkable regularity of the spectra above ≈ 100 eV, and a systematic change of peak positions and intensities as Z is increased. A characteristic feature of X-ray absorption spectra is the asymmetry of the peaks, which show a sharp edge on the low-energy side and a broad slow decrease on the 52 1. ATOMS Figure 1.25. The observed highest-energy core levels for uranium, their labeling in terms of the hole quantum numbers n l and j, and the dipole-allowed transitions among them. Within each shell, note the huge l-related and spin-orbit splittings. high-energy side. The reason for the edge is that below the minimum excitation energy for the core state no absorption takes place, while above threshold, the core electron may be promoted to several empty bound and unbound states of the atom in gas phase, or of the condensed phase. The slow decrease above edge is due to an increase of the kinetic energy of the ejected electron (which equals the difference between the absorbed photon energy and the energy of the atomic core state): the final state becomes increasingly orthogonal to the initial core level, and correspondingly the dipole matrix element (74) decreases. For this same reason, an X-ray photon hitting an atom is much more likely to extract a core electron than a weakly bound outer-shell electron. Emission spectra show the same simplicity and regularity as absorption spectra. Atoms are excited initially, typically by collisions with high-energy electrons. The subsequent emission involves transitions only from levels for which enough energy is made available by excitation. For example, if 2 keV electrons are used to excite the sample, emission involving the 1s shell is observed for all Z ≤ 14 (Si), but not for P and higher-Z atoms (see Fig. 1.24). 1.2. MANY-ELECTRON ATOMS 53 Yet another needless traditional notation haunts core states and X-ray spectra: a hole (=missing electron) in shell n = 1, 2, 3, 4, ... is labeled K, L, M , N , ... The substructures related to states of different l and j acquire a Roman subscript (e.g. LIII for 2p 2 P3/2 ), as in Fig. 1.25. Dipole-allowed transitions in emission are organized in series according to the initial shell, with a Greek-letter subscript for the final shell. For example, the transition K → L (in other words the decay 1s2s2 2p6 ... → 1s2 2s2 2p5 ...) is called Kα emission line, K → M is Kβ , and L → M is Lα . In the days of the great discoveries of chemistry and physics, when the structure and classification of atoms were being understood, H. G.-J. Moseley acquired and compared characteristic emission spectra of many elements: he showed that the Kα inverse wavelength (equivalently, energy) is roughly proportional to Z 2 . It approximately fits a law: (107) 1 λKα ≈ C (Z − a)2 . In this phenomenological dependence, the proportionality constant C is close to the Rydberg constant (see Fig. 1.4), and the quantity a, accounting for screening, is approximately 2. The discovery of this regularity permitted to identify the correct value of Z of each atomic species, thus correcting several mistakes in the early periodic table. The decent accuracy of Moseley’s fit suggests that one could estimate the energy positions of the core levels (within, say, 20%) without going through a full self-consistent HF calculation. Indeed, the core states may be thought of as approximately hydrogenic states in an effective −Zeff e2 /r Coulomb potential, where the value of Zeff is determined by the average of the radial distribution of the singleelectron wavefunction at hand related to the effective potential curve, like that of Fig. 1.21. Accordingly, core energies of a shell can be estimated by means of Eq. (30), taking for Z a Zeff ≃ Z − 2 for the K shell, Zeff ≃ Z − 10 for the L shell, and in general Zeff ≃ Z− (number of electrons in inner shells up to and including the target shell). Nowadays, X-ray spectroscopies are routinely used in a number of ways, including use as position-sensitive analytic tools, local probe of the near chemical environment of different atomic species, and many others. Many more application examples of X-ray spectroscopies are reported at http://www.esrf.fr/ and at http://www.elettra.trieste.it/. 1.2.7. Optical spectra. The simplicity of the core spectra is rapidly lost when the optical and soft-UV range is explored. Different atoms show completely different spectra, but some regularity, a few qualitative statements, and many similarities can be recognized. 54 1. ATOMS The main observation necessary to understand optical spectra (and core transitions as well) is that the dipole operator driving the transition in a many-electron context is the sum of the individual dipoles of the single electrons: X X (108) d~ = d~i = −qe ~ri . i i The operator ~ri acts on a N -electron wavefunction by multiplying it by the position coordinate of the i-th electron. In the independent-electron approximation, which, we have seen, works fairly well for atoms in the HF self-consistent model, the matrix element of one such operator between two properly antisymmetrized states (89) is simply X X hα1 , ...αN |A d~i |β1 , ...βN iA = hαPi |d~i |βi i hαP1 |β1 i...hαPi−1 |βi−1 ihαPi+1 |βi+1 i...hαPN |βN i. i i P Here, all the overlap integrals vanish unless all αPk = βk , in the assumption that the single-electron states composing the initial states are essentially equal to those composing the final states. Furthermore, also the single-particle dipole matrix element hαPi |d~i |βi i vanishes, unless it satisfies the single-electron dipole selection rules. In other terms, for N electrons, dipole allowed transitions occur only between states where one single electron makes a dipole transition, and the other N −1 electrons remain in the initial single-particle state. Of course, any of the N electrons can make the transition to any empty state. The angular part of the single-electron wavefunction is again of type Ylm (spherical harmonics): the dipole selection rules (∆l = ±1, ∆s = 0) derived for the one-electron atom in Sec. 1.1.10 continue to hold for the single electron making the transition. Accordingly, a few examples of allowed transitions of Be follow: 1s2 2s2 →1s2 2s2p, 1s2 2s2p→1s2 2s4d, and 1s2 2s2 →1s2s2 3p; and a few examples of forbidden transitions: 1s2 2s2 6→1s2 2s3d, 1s2 2s2 6→1s2 2p2 , and 1s2 2s2 6→1s2s2p3p. Further selection rules (Sec. 1.2.7.2) affect the total angularmomentum quantum numbers obtained by coupling the spins and orbital angular momenta of individual electrons. 1.2.7.1. Alkali atoms. In the alkalis, the last complete np shell lies rather deep in energy (typically a few tens eV), while the outermost (partly) occupied (n+1)s level is very shallow (from 5.4 eV in Li to 3.9 eV in Cs). The ground state symmetry is 2 S1/2 , like that of H, where now the label 2S+1 [L]J refers to total angular momenta which, for the alkalis, equal the angular momenta of the outer (“optical”) electron, since the close shells (=completely filled shells) contribute null spin and null orbital angular momentum. All excitations of the inner-shells electrons are core excitations, involving energies much larger than the first ionization energy. Electrons in the inner shells remain essentially “frozen” in a spherically symmetric core providing an effective potential (Fig. 1.21) for the motion of the outer electron, which is the responsible for all low-energy excitations of the alkali atoms. The 1.2. MANY-ELECTRON ATOMS 55 Figure 1.26. (a) The observed level scheme of the first alkali-metal atoms, compared to that of H. (b) Energy-level diagram of Na atom, with some transitions indicated by arrows, with the associated wavelengths measured in Å=0.1 nm. (c) Fine structure of the lowest Na levels due to spin-orbit interaction. The zero of the energy scale is set by (a) the ground state of the ion A+ , and (b,c) the ground state of the neutral atom. associated spectrum of excitations resembles that of a one-electron atom, the main difference being the significant energy gaps between states characterized by the same n but different l. In particular, the separation of (n+1)s and (n+1)p is especially 56 1. ATOMS large, of the order of 2 to 3 eV, and originates a characteristic transition in the visible spectrum. This (n+1)s→ (n+1)p transition is especially strong due to the large dipole matrix element involving strongly overlapping and fairly extended wavefunctions. Figure 1.26 reports a typical level scheme and several optical transitions. Spin-orbit affects all non-s states. The natural generalization of Eqs. (60) and (63) to a generic radial potential yields a microscopic estimate of the relevant ξ for a given shell: (109) ξ= 1 dVeff (r) 1 ~2 4 2 hn, l| |n, li = Z α E . Ha eff s−o 2 m2e c2 r dr n3 l(l + 1)(2l + 1) where Zeff s−o is implicitly defined by this equation and provides rough estimates eff of ξ. Due to the strong localization of the mean field 1r dV dr (r) near the origin, the effective charge Zeff s−o is usually larger than that introduced above for the estimate of the level position: Zeff < Zeff s−o < Z. The spin-orbit splitting of the levels shows in splittings of all optical transitions. The observed spin-orbit splittings of the lowest excited p state of the alkali atoms are: Element Li Na K Rb Cs Z 3 11 19 37 55 single-electron excited level 2p 3p 4p 5p 6p Spin-orbit splitting 32 ξ [meV] 0.042 2.1 7.2 29.5 68.7 Zeff s−o 0.98 3.5 6.0 10.0 14.2 Remarkably, the spin-orbit splitting of Li 2p is smaller than that of H 2p: the reason is that most of the Li 2p wavefunction lies well outside the compact 1s screening shell. The fine-structure splittings of higher excited non-s states are smaller than those reported here (Fig. 1.26c). Within the HF model, Eq. (109) permits to evaluate the spin-orbit energy ξ for any shell of any atom, not just for the excited shells of alkali atoms. For example, the large effective Zeff s−o accounts for the colossal spin-orbit splittings (tens or hundreds eV) of the core shells of heavy atoms, observed in X-ray spectra (e.g. the LII –LIII splitting – Figs. 1.24 and 1.25). 1.2.7.2. Atoms with incomplete degenerate shells. The occupancies of the single orbitals are often insufficient to determine many characters of the atomic ground state, e.g. its total angular momentum J, thus its degeneracy. The ground-state symmetry of noble gases, alkali earth and, in general, all atoms in close-shell configurations, including Zn, Cd, Hg, Yb, and No is trivial: these atoms all qualify as nondegenerate spherically symmetric 1 S0 . Likewise, the ground state of alkali metals is simply 2 S1/2 , with a twofold degeneracy associated to spin. Another similar and relatively simple case is that of B and Sc (and atoms of the same groups IIIB and IIIA), where a single electron occupies a degenerate p or d shell, while all 1.2. MANY-ELECTRON ATOMS 57 inner shells are complete: here the spin and orbital angular momentum equal those of the lone electron. Spin orbit splits J = L − 21 and J = L + 21 , putting L − 21 lower. Accordingly, the ground state of B is 2 P1/2 and that of Sc is 2 D3/2 . The last relatively simple class is that of the halogen atoms (group VIIB, p5 configuration), characterized by a single hole in an otherwise full shell. Here not surprisingly the single hole carries the same spin and orbital angular momentum (S = 12 and L = 1) as a single electron in that shell. Interestingly, the effective spin-orbit interaction of the hole is reversed in sign.8 Once this is agreed upon, the ground-state symmetry of all halogens is 2 P3/2 . Similarly, the symmetry of the ground state of Tm (4f13 ) is 2 F7/2 . A more intricate situation occurs when several electrons occupy a degenerate shell, but are too few to fill it completely. The resulting independent-particle quantum state is highly degenerate, in terms of one-electron energetics. For example, in carbon two (identical) electrons occupy a p shell in 6·5/2 = 15 possible ways. In general, d! N electrons occupy d = 2(2l + 1) degenerate spin-orbitals in Nd = N !(d−N dif)! ferent ways, corresponding to physically different orthogonal quantum states. This large degeneracy is partly lifted by (i) the residual electron-electron interaction (i.e. the energy differences, ignored at the HF level, related to the correlated electronic motion), and (ii) the spin-orbit interaction. In the outer atomic shells, Coulomb energies are of the order of one to several eV (like for He, Sec. 1.2.3), while for Z smaller than about 30 spin orbit is a comparatively small interaction (ξ ≪ 1 eV), which we initially neglect. The residual Coulomb interaction (not accounted for by the HF self-consistent field) is spheri~ 2 and |L| ~ 2 . Figure 1.27 provides a cally symmetric: it thus commutes with total |S| complete list of the “multiplet” states labeled by S and L. The residual Coulomb interactions acts on HF states like the full Coulomb repulsion in He (Sec. 1.2.3): it first of all splits states of different total spin S: low-spin states sit higher in energy, as the electrons move, on average, closer to one another. The ground state will therefore have the highest possible spin: this result is in accord with the empirical first Hund rule. In degenerate configurations there often occur several states of the same spin, but different total angular momentum L. Coulomb repulsion is generally larger for 8 Observe that the total spin-orbit operator for N electrons in a degenerate shell is H s−o = PN ~ ξ i=1 li · ~si . As the shell completes with d = 2(2l + 1) electrons, it is convenient to rewrite Pd Pd Pd Hs−o = ξ i=1 ~li · ~si − ξ i=N +1 ~li · ~si = −ξ i=N +1 ~li · ~si . The first term vanishes, for closed-shell cancellation. The residual term, of opposite sign, represents the spin-orbit interaction of “missing” electrons, named holes. Halogen atoms have N = d − 1 electrons, thus 1 hole in a p valence shell, which behaves as an electron with reversed spin-orbit ξeff = −ξ < 0. 58 1. ATOMS Figure 1.27. All terms in which Coulomb correlation splits a degenerate [l]N configuration, for l = 0, 1, 2, 3 and all possible fillings N . When several states with the same 2S+1 [L] occur in the configuration, the number of occurrences is written under the letter [L] denoting the total orbital angular momentum. In LS (Russell-Saunders) coupling, weak spin-orbit interaction further splits states characterized by different J, e.g. 2 P → 2 P1/2 , 2 P3/2 , not detailed here. states with low L. Accordingly, the ground state is the state with highest possible L among those with maximum S. This is known as second Hund rule. Finally, once the total L and S are determined, the hitherto neglected spin-orbit interaction couples them together to a total angular momentum J. The allowed values of J are given by the usual rule (47), and the question of which of them is lowest in energy is decided by the sign of the effective spin-orbit parameter for that partly filled shell. While the true spin-orbit parameter is necessarily positive (see ~ with total S ~ may as Eq. (109)), the effective parameter for the coupling of total L well be negative, as discussed above for the halogens. Indeed the sign of the effective spin-orbit reverses when more than 2l + 1 electrons occupy a shell where 2(2l + 1) electrons can fit. Accordingly, the third Hund rule states that the ground state has 1.2. MANY-ELECTRON ATOMS 6 singlet states 9 triplet states 59 { (1) 1S 0 (5) 1D 2 { (5) (3) (1) 3P 3P 2 3P 1 0 Figure 1.28. The correct labeling and ordering of the p2 multiplets according to LS coupling and Hund’s rules. Degeneracies 2J + 1 are indicated at the left. J = |L − S| for less than half-filled shell and J = L + S when the shell is more than half filled. In the p2 example mentioned above, the fifteen states arrange themselves in triplets 3 P0 , 3 P1 , 3 P2 , and singlets 1 D2 , 1 S0 . No other states are compatible with Pauli’s principle [?, ?]. According to Hund’s rules, the ordering of these states is illustrated in Fig. 1.28. Similar qualitative ordering of the levels, as given by Hund’s rules, is observed also in configurations involving several incomplete shells, as in Fig. 1.29. For low-Z atoms, Fig. 1.23 shows the ranges where the multiplet of states corresponding to each configuration as observed spectroscopically. Observe that the lowest multiplets spread out by several eV, while more excited multiplets are much less disperse, as the residual Coulomb repulsion becomes weaker and weaker for more extended wavefunctions. ~ and all ~li together The described scheme of coupling all ~si together to a total S ~ (followed by spin-orbit coupling of S ~ and L ~ together) is called Russellto a total L Saunders or LS coupling. It provides a satisfactory basis of coupled states to low-Z atoms, where Coulomb exchange dominates over spin orbit. For increasing Z the spin-orbit interaction grows fast, while electron-electron repulsion weakens due to spreading of the outer occupied orbitals. For very large Z ≥ 50 spin-orbit dominates: Hs−o must be accounted for before Coulomb terms. Spin-orbit couples the spin and orbital moment of each electron to an individual ji = li ± 21 . These individual total angular momenta are then coupled to a total J by smaller Coulomb terms. This ordering of the couplings of the angular momenta is called jj coupling, and provides another basis for the many-electron states. While the LS basis is almost diagonal at small Z, the jj basis is almost diagonal for large Z (see Fig. 1.30). For intermediate Z, either basis could be used, but the matrix of Coulomb exchange plus spin-orbit 60 1. ATOMS (a) (b) Figure 1.29. The conceptual sequence of splittings caused by interactions of decreasing intensity for LS multiplets. (a) The 6·(6−1)/2! = 15 states of two “equivalent” electrons in a np2 configuration, with the splittings induced by an external magnetic field assumed in the Zeeman limit. (b) The 6·10 = 60 states of two “inequivalent” (different n and/or l) electrons in a 3d4p configuration, as occurs in the spectrum of excited Ti. must be diagonalized to get proper eigenstates, as linear combinations of the states of the chosen basis. 1.2.7.3. Many-electron atoms in magnetic fields. When an external magnetic field is applied to a many-electron atom, it may behave in two very different ways, according to whether the atom carries a magnetic moment or not. Atoms with total 1.2. MANY-ELECTRON ATOMS 61 Figure 1.30. Transition from LS coupling in carbon (Z = 6) to intermediate coupling (Coulomb exchange and spin-orbit of the same order) in germanium (Z = 32) to jj coupling in lead (Z = 82). angular momentum J = 0 have no permanent magnetic dipole to align with the field: the field induces a tiny magnetic moment, of order µB µB∆B , where ∆ is the energy gap between the ground state and the lowest excitation, but we will ignore such tiny effects. Instead, open-shell atoms with J 6= 0 carry a magnetic moment ~ ~µ = −gJ µB J/~, with a tendency to align to the field. For LS coupling, the appropriate g-factor gJ is determined by Eq. (54), with the total angular momenta J, L, and S in place of the single-electron j, l, and s. As discussed in Sec. 1.1.11, this total magnetic moment, derived by the coupling of orbital and spin contributions is relevant in the limit of weak external magnetic field (Zeeman limit). In most practical experimental situations this is the relevant limit for many-electron atoms, due to the Z 4 increase of the spin-orbit energy ξ, and the maximum field accessible in the lab (of the order of 10 T). The opposite strong-field (Paschen-Back) limit can be realized only for highly excited states, with an electron close to dissociation, thus weakly affected by the nuclear field. The simplest splitting pattern – three equally spaced lines, corresponding to ∆MJ = 1, 0 and −1 – is called regular Zeeman splitting. An example is shown in Fig. 1.31. It occurs when the initial and final g-factors are equal (typically g = gL = 1). This in principle occurs when either S = 0 or L = 0. However, L = 0 is most unlikely, as the electron making the transition changes its l by unity, 62 1. ATOMS Figure 1.31. Regular Zeeman spectrum, with its interpretation. It only occurs for S = 0 states. For example, it is observed in the 2s3d 1 D2 →2s2p 1 P1 emission of Be. Figure 1.32. Anomalous Zeeman spectrum in Na (left) and in Zn (right). thus transitions Li = 0 → Lf = 0 occur very rarely. In practice, regular Zeeman splitting is observed in optical transitions between spin-singlet states (S = 0). In all other cases, the Zeeman spectrum shows more complicated patterns due to the different initial and final g-factors (anomalous Zeeman spectrum, see Fig. 1.32). Many-electron atoms can also be introduced in Stern-Gerlach apparatuses (Sec. 1.1.5), in order to investigate the magnetic moment of their ground state. The amount of deflection measures the ẑ component of the magnetic moment, thus the Landé gfactor gJ , according to Eq. (45): ∂Bz ∂Bz F~z = µz = −µB gJ MJ . ∂z ∂z The number of sub-beams into which the inhomogeneous field splits the original beam measures directly the number of allowed MJ values, i.e. the ground-state degeneracy 2J +1. (110) 1.2.8. Dipole selection rules. As discussed at the beginning of Sec. 1.2.7, the main selection rule requires the that a single electron changes state to another state satisfying ∆l = ±1. Several dipole selection rules for the total quantum numbers J, L, and S of many-electron atoms in LS coupling also apply. They are summarized 1.2. MANY-ELECTRON ATOMS 63 below: (111) (112) (113) (114) (115) Parity changes ∆S = 0 ∆L = 0, ±1 ∆J = 0, ±1 ∆MJ = 0, ±1 (no 0 → 0 transition) (no 0 → 0 transition if ∆J = 0). As both S and L are good quantum numbers only in the limit of very small spinorbit, in practice selection rules (112) and (113) are only approximate. Figure 1.30 draws the allowed transitions in characteristic examples of LS coupling and jj coupling. In this latter case specific dipole selection rules apply, which we leave to specific atomic-physics courses. The present Chapter summarizes few basic concepts and experimental evidence in the field of atomic physics, which, in the context of a general course in physics of matter, provide a minimal background and language for understanding and describing the microscopic atomic structure, which is the at the root of the physics of matter. Important conceptual points (e.g. the seniority scheme for the labeling of LS states when L, S and J are not sufficient), several modern spectroscopic techniques (e.g. Auger), and analytical, chemical, astrophysical applications of atomic physics are omitted. These advanced subjects are left to more specific volumes [?, ?, ?]. CHAPTER 2 Molecules Diatomic molecules are the simplest system where several nuclei are bound together by their interaction with one or several electrons. Their simplicity makes them the ideal system to learn two central concepts of condensed-matter physics: the adiabatic separation of the electronic and nuclear motions, and chemical bonding. 2.1. The adiabatic separation The disentanglement of the fast electron dynamics from the slow motion of the nuclei is a crucial conceptual step to make progress in understanding and interpreting a huge amount of phenomena and observations in the physics of matter. The total Hamiltonian Htot , Eq. (1), for a piece of matter depends on the coordinates r of all electrons and R of all nuclei, and so do all of its eigenfunctions Ψ. Consider the following factorization for the total wavefunction: (116) Ψ(r, R) = Φ(R) ψe (r, R) . (a) Assume that the electronic wavefunction ψe (r, R) is a solution ψe (r, R) of the following electronic equation: (117) [Te + Vne (r, R) + Vee (r)] ψe(a) (r, R) = Ee(a) (R) ψe(a) (r, R) , where (a) represents the set of quantum numbers characterizing a given N -electron (a) (a) eigenstate with energy Ee . The electronic wavefunction ψe (r; R) describes an electronic eigenstate compatible with a given geometrical configuration R of the ions: its R-dependence is purely parametric (no ∇R operators in Eq. (117)). Physically this ansatz corresponds to assuming the ionic and electronic motions as decoupled from each other, in accord to the following related observations: • The electron mass is much smaller than the nuclear mass. As masses appear at the denominators of the kinetic terms Eq. (2) and Eq. (3), electrons move much faster, thus in shorter timescales, than atomic nuclei. • Energy levels corresponding to different electronic eigenstates are usually separated by energy gaps much larger than typical energies associated with the motion of the atomic nuclei. 65 66 2. MOLECULES The assumption behind the factorization defined by Eqs. (116), (117), known as the adiabatic or Born-Oppenheimer scheme, is that, once an initial electronic state has been selected, the nuclei move slowly enough not to induce transitions to different electronic states. To make use of the ansatz (116), observe that Te ∼ ∇2r does not act on the R coordinates: (118) Te [Φ(R) ψe (r, R)] = Φ(R) [Te ψe (r, R)] . Derivation is slightly more intricate for the nuclear kinetic term: (119) ∇2R [Φ(R) ψe (r, R)] = ψe (r, R) ∇2R Φ(R)+2 [∇R ψe (r, R)] ∇R Φ(R)+Φ(R) ∇2R ψe (r, R) . Thus, when we substitute the factorization Ψ(r, R) = Φ(R) ψe (r, R) into the Schrödinger equation Htot Ψ = Etot Ψ we obtain X ~2 − 2 [∇R α ψe (r, R)] ∇R α Φ(R) + Φ(R)∇2R α ψe (r, R) + ψe (r, R)∇2R α Φ(R) 2Mα α (120)+ Φ(R) Te ψe (r, R) + [Vne + Vee + Vnn ] Φ(R)ψe (r, R) = Etot Φ(R)ψe (r, R) , (omitting the coordinate dependencies of the V terms). This can be rearranged as follows: X ~2 − (121) 2 [∇R α ψe (r, R)] ∇R α Φ(R) + Φ(R) ∇2R α ψe (r, R) + 2Mα α # " X ~ 2 ∇2 Rα Φ(R) + Φ(R) [Te + Vne + Vee + Vnn ] ψe (r, R) = (122)− ψe (r, R) 2Mα α = Etot Φ(R)ψe (r, R) , in order to highlight the electronic Hamiltonian (Te + Vne + Vee ) of Eq. (117). The two terms of line (121), involving derivatives ∇R of the electronic wavefunction ψe (r, R) are called nonadiabatic terms. The nonadiabatic terms are much me ) than the typical differences between electronic smaller (usually by a factor ≈ M α energies in Eq. (117), dominating Eq. (122). The adiabatic approximation consists precisely in the neglect of these terms. (a) In Eq. (122), drop the nonadiabatic terms and substitute a solution ψe (r, R) of Eq. (117) for ψe (r, R): # " 2 2 X ∇ ~ Rα + Ee(a) (R) + Vnn (R) Φ(R) = Etot ψe(a) (r, R) Φ(R) , (123) ψe(a) (r, R) − 2 α Mα (a) where the electronic wavefunction ψe (r, R) is displaced to the left of the operator, to stress that the differential part only acts on the ionic part Φ(R). Note that all 2.1. THE ADIABATIC SEPARATION 67 three terms Te , Vne and Vee are completely (and in principle exactly) accounted for by (a) the electronic eigenvalue Ee (R), the internuclear repulsion Vnn remains indicated explicitly, and only the nuclear kinetic term Tn is treated approximately, due to the neglect of nonadiabatic corrections. The electronic wavefunction can now be (a) ∗ dropped (formally by multiplying Eq. (123) by ψe (r, R) and integrating over all electronic coordinates r), to derive the equation for the adiabatic motion of the nuclei described by Φ(R): " # ~2 X ∇2R α (124) − + Ee(a) (R) + Vnn (R) Φ(R) = Etot Φ(R) . 2 α Mα The equation for the electronic system (117) describes the motion of all electrons in the piece of matter. When the number of electrons is larger then one (as usually is the case), Eq. (117) involves at least the same theoretical difficulties of manyelectron atoms discussed in Sec. 1.2. Equation (117) is usually solved within some approximate quantum many-body method, e.g. the Hartree-Fock method sketched in Sec. 1.2.4, or the Density-Functional theory. In the present polyatomic context, theory faces the extra difficulty that the electronic problem depends explicitly on the precise position R of the nuclei, through Vne . One should then solve the electronic problem for many geometric arrangements (classical configurations R) of the nuclei, (a) to obtain a detailed knowledge of the eigenfunction ψe (r, R) and more importantly (a) (a) of the eigenvalue Ee (R) as a function of R. The electronic energy function Ee (R) is especially important as it drives the motion of the nuclei through Eq. (124). The neglected nonadiabatic terms (which, remember, are of nuclear kinetic nature) would induce transitions among different electronic states. Within the adiabatic approxi(a) mation, once an electronic state ψe (r, R) is chosen, the motion of the nuclei does (a′ ) not let this electronic eigenstate mix with other eigenstates ψe (r, R): the electronic state follows adiabatically the slow nuclear motion. (a) Once the exact (or some approximate) Ee (R) is plugged into the Schrödinger equation for the nuclei (124), it allows to understand the translational motion of the nuclei as driven by a total adiabatic potential energy (125) (a) Vad (R) = Vnn (R) + Ee(a) (R) , which is the sum of the direct Coulombic ion-ion repulsion Vnn plus the electronic (a) eigenvalue Ee (R) (which changes as a function of the ionic coordinates, as discussed above). This second term, also called “adiabatic electronic contribution”, (a) can be seen as the ”glue” which keeps the atoms together. Ee (R) is generally a complicated function of all the ionic coordinates R: contrary to Vnn , it usually cannot be expressed as a simple sum of two-body contributions. 68 2. MOLECULES Of all the infinite electronic eigenstates labeled by (a), the electronic ground state produces the especially important lowest adiabatic potential energy surface Vad (R). Electronic excitations (a) generate different adiabatic potential energy surfaces. In the adiabatic scheme, the translational motions of the nuclei follow the total adiabatic potential Vad (R). As in common language “nuclear” refers to internal dynamics of the nuclei, in practice Vad (R) is thought of as the potential energy governing the motions of “atoms”, or “ions”. As a function of the 3Nn atomic coordinates, Vad (R) shows two very general symmetries, consequences of the symmetry of the original Hamiltonian (1), describing an isolated system: • Translational symmetry: if all ions are moved by an arbitrary (equal for ~1+ all) displacement ~u, then Vad remains unchanged. Symbolically: Vad (R ~ 2 +~u, ..., R ~ Nn+~u) = Vad (R ~ 1, R ~ 2 , ..., R ~ Nn ). ~u, R • Rotational symmetry: if all ions are rotated by an arbitrary (equal for all) rotation A around a given arbitrary point in space, then Vad remains ~ 1 , AR ~ 2 , ..., AR ~ Nn ) = Vad (R ~ 1, R ~ 2 , ..., R ~ Nn ). unchanged. In symbols: Vad (AR This means that Vad depends in practice on the relative positions of the atoms. Vad (R) often shows a well defined minimum, for the atoms placed in a given definite relative configuration RM . In the simplest example of a di-atom, one always finds a definite interionic distance RM for which Vad (R) is minimum. Choose any two atoms in the periodic table: the potential energy as a function of distance has the qualitative shape of Fig. 2.1. If the potential well is deep enough, then at low temperature the motion is confined in a neighborhood of the equilibrium position RM : the ions execute small oscillations around RM . The ionic motion is sometimes 2 treated within classical mechanics (Mn ddtR2 = −∇R Vad (R)). This does not mean that the actual atomic motion is any classical, only that under some conditions (e.g. very heavy ions), the classical limit could provide a good approximation to the actual quantum dynamics. The classical ionic dynamics often yields useful insight into the quantum solution of Eq. (124). For example, the classical problem of the normal modes describing small independent harmonic oscillations around RM is mapped to that of a collection of quantum harmonic oscillators (a single quantum oscillator, for a diatomic molecule). We shall return to this fundamental quantum problem in Secs. 2.3 and 4.3, and sketch its well known solutions. In summary, the adiabatic scheme provides a basic separation of the intricate coupled electron-ion dynamics into two conceptually and practically distinct problems: the electronic equation (117) governs the motion of the electrons in the field of the ions (imagined as instantaneously frozen); the equation for the slower motion of the ions (124) is a standard Schrödinger equation defined by the adiabatic potential, which is the sum of the interionic Coulomb repulsion plus the “glue” provided by the electronic total energy – Eq. (125). 2.2. CHEMICAL AND NONCHEMICAL BONDING 69 Energy / depth of the well Vad(R) 1 0 -1 0 1 2 3 R/RM Figure 2.1. The typical qualitative shape of the adiabatic potential for a diatom, as a function of the interionic separation R. The zero of energy has been taken as the sum of the total energies of the two individual atoms at large distance. Horizontal lines represent possible vibrational ground and low-energy excited levels. The actual number and positions of these states depends also on the diatom reduced mass. 2.2. Chemical and nonchemical bonding In the previous Section we anticipated a generic profile for the adiabatic potential of a diatom (Fig. 2.1). If that sort of long-distance attractive plus short-distance strongly repulsive behavior is really general, and persists for Nn > 2 atoms, then it can provide the microscopic mechanism allowing collections of atoms to bind together, forming all kinds of bound states (including molecules, liquids, solid objects of everyday matter...), without any tendency to collapse to infinitely dense pointlike objects. To obtain both qualitative and quantitative insight into the bonding nature of Vad (R), we investigate conveniently simple model systems. + 2.2.1. H+ 2 . This investigation starts naturally with the simplest di-atom: H2 , where all complications of electron-electron repulsion Vee are avoided. We proceed to look for evidence that the adiabatic energy of H+ 2 has a qualitative dependence of the inter-proton distance R of the kind sketched in Fig. 2.1: we check if, as R (a) is reduced to finite values, the lowering in electronic energy Ee (R) exceeds the 2 increase in internuclear Coulomb repulsion Vnn (R) = eR , thus providing bonding. 70 2. MOLECULES [EHa / a0] 0 Vne(r,R) -5 Vne / R ry rx -10 -3 -2 -1 0 rx /R 1 2 3 Figure 2.2. Three parallel cuts iof the nuclei-electron attraction h 1 1 2 Vne (r, R) = −e |~r−Rx̂/2| + |~r+Rx̂/2| in H2+ , along the line through the nuclei (the x axis) and along two lines at distances R/2 and R from the nuclei. This would imply that at some finite R the total adiabatic potential energy (125) of the ion is smaller (more negative) than that at R = ∞. To convince ourselves that this is indeed the case, consider the potential energy Vne acting on the electron of H+ 2 . Vne is the sum of the two attractive Coulomb terms VL + VR produced by the Left and Right nucleus, which, in the adiabatic spirit, we take as stationary (their distance R is the independent variable of the adiabatic potential-energy function Vad (R) we investigate). Figure 2.2 shows three “cuts” of Vne (r, R) as a function of the electron position r. The main observation here is that in the intermediate region, between the two nuclei, the total potential is roughly twice more negative than when one of the two nuclei was removed to infinity. This suggests that the electron moving in the field of both nuclei could take advantage of both attractions and lower its average potential energy by spending a significant fraction of its probability distribution in this intermediate extra-attractive region. We check if this mechanism produces bonding, i.e. Vad (R) decreases below its infinite-R value Vad (R) = E1s = − 21 EHa , namely the energy of an isolated hydrogen atom (ignore the mass correction mµe throughout). 2.2. CHEMICAL AND NONCHEMICAL BONDING 71 ψA energy / EHa ψS 0 ψ1s L ψ1s R Vne(r)=VL+VR -5 -10 -3 -2 -1 0 r/R 1 2 3 Figure 2.3. Cuts along the axis through the two nuclei of: (lower to upper) the potential energy Vne (r, R), the 1s eigenfunctions at each isolated well, and their symmetric and antisymmetric combinations. To estimate the energy that may be gained we use a variational approach: the ground-state energy is lower or equal than the average energy of any trial state of our construction. For this, start from the 1s Hydrogen orbitals, Eq. (40), |1s Li and |1s Ri centered at the Left and Right nucleus respectively. Guided by the reflection symmetry of Vne (r, R), build the trial electronic wavefunction for ψe as either Symmetric or Antisymmetric normalized linear combination (126) |Si = p |Ai = p |1s Li + |1s Ri 2(1 + Reh1s L|1s Ri) |1s Li − |1s Ri 2(1 − Reh1s L|1s Ri) . The wavefunctions ψ1s L (r) = hr|1s Li, ψS (r) = hr|Si etc. are sketched in Fig. 2.3. (a) We have therefore Ee (R) ≤ h1s S|Htot |1s Si. For infinitely large internuclear sep(a) aration R, |Si and |Ai become exact eigenstates, of energy Ee (R = ∞) = E1s = 72 2. MOLECULES − 12 EHa . For finite R, the total electronic energy of these states becomes (127) (hL|Te + Vne |Li ± hR|Te + Vne |Li) + (L ↔ R) S S E S = h |Te + Vne | i = A A A 2(1 ± hL|Ri) where we have omitted the 1s labels for brevity and the Re(), as the overlap hL|Ri is real. The (L ↔ R) terms obtained by exchanging left and right equal the previous ones, thus we can omit them together with the factor 2 at the denominator. Replace VL + VR for Vne , and reorganize the six matrix elements, to take advantage of the fact that |Li is the ground state of (Te + VL ): (128) hL|Te + VL |Li + hL|VR |Li ± hR|Te + VL |Li ± hR|VR |Li EHa hL|VR |Li ± hR|VR |Li ES = =− + . A 1 ± hL|Ri 2 1 ± hL|Ri The second term contains two Coulomb matrix elements, which are both real and negative and depend on the internuclear distance. Despite the denominator, the symmetric state |Si is lower in energy than |Ai. The hL|VR |Li term represents the attraction that the right nucleus exerts on the electron sitting around the left 2 nucleus. At large R this attractive term balances almost exactly the Vnn = eR internuclear repulsion. This cancellation represents the classical result that the electrostatic interaction energy of a spherically symmetrical neutral object (the H atom) and a remote point charge (the H+ ion) decays much faster than R1 . The cross term hR|VR |Li, at large internuclear distance decays very fast, approximately as −EHa aR0 exp(−R/a0 ). Thus for large R R |Li 1 + hR|V hL|VR |Li + hR|VR |Li hR|VR |Li hL|VR |Li = hL|VR |Li ≃ hL|VR |Li 1 + − hL|Ri 1 + hL|Ri 1 + hL|Ri hL|VR |Li ! −e2 hR|Li + ... R/2 − hL|Ri ≃ hL|VR |Li (1 + hL|Ri + ...) , ≃ hL|VR |Li 1 + −e2 hL|Li + ... R 2 e hL|Ri applies since the distribution where the approximation hR|VR |Li ≃ − R/2 ψL (r)ψR (r) is very small everywhere, and peaks at the axial region between the e2 . As a consequence, for large R the two atoms, at the center of which VR ≃ − R/2 attraction (last term of Eq. (128) for the symmetric state) prevails over the repulsion 2 Vnn = eR , thus produces bonding. The sign of the correction to the attraction is reversed for |Ai, which therefore is not bonding as the internuclear repulsion prevails against attraction. This simple variational model suggests that the adiabatic energy gain is exponentially small in R/a0 at large-R, due to the decay of the atomic 1s wavefunctions. All integrals in Eq. (127) can be evaluated for arbitrary R: Fig. 2.4 depicts the outcomes of this variational calculation. Note that: 2.2. CHEMICAL AND NONCHEMICAL BONDING 73 1 2 (A) Vnn=e /R Vad <S|Te|S> Energy/EHa -0.3 0 Vad (S) -0.4 Vad Vnn+<S|Vne|S> -1 |A> <S|Vne|S> |S> RM (a) 0 2 4 6 8 R/a0 0 2 2.5 -0.5 -0.565 (b) 8 6 4 R/a0 Figure 2.4. (a) The total adiabatic potential of the |Si variational state of H+ 2 decomposed in its kinetic, potential and ion-ion contributions: Vad (R) = hS|Te |Si + hS|Vne (r, R)|Si + Vnn (R). (b) A blowup of the total adiabatic potential Vad (R) for the |Si (solid) and |Ai (dashed) state. The minimum RM of Vad (R) is indicated and corresponds to a molecular bond energy of 0.065 EHa or 1.76 eV. (S) • the adiabatic potential Vad (R) associated to the |Si state shows a minimum at a finite separation RM ; this potential well can bind the two protons together, and for this reason |Si is called a bonding orbital; (A) • the monotonically decreasing, repulsive adiabatic potential Vad (R) justifies calling |Ai an antibonding orbital; • the dot-dashed curve indicates that as R is reduced the lowering of the electronic potential energy hS|Vne (r, R)|Si does not compensate the raise in internuclear repulsion, thus our initial expectation that bonding is related to a gain in electrostatic potential energy is not confirmed by the present variational model; • as the electron moves in the wider potential well created by the two protons rather than in the narrower well of an isolated proton, its kinetic energy (S) decreases significantly, enough to dig a minimum in Vad (R). 74 2. MOLECULES The conclusion of our simple variational model for H+ 2 is that the physical origin of bonding is to be attributed to both kinetic and potential energy lowering of the electron “screening” the nuclei and spending a significant fraction of probability in the inter-nuclear region. This mechanism works optimally at R ≃ 2.5 a0 , and produces a molecular bond energy of about 1.76 eV. For smaller R, the adiabatic potential shoots up due to the divergence in Vnn (now compensated poorly), and a new increase of the kinetic energy, as the molecular potential well contracts to a more atomic-like shape. The proposed variational estimate provides basic qualitative trends, but is especially inaccurate at small R: a more quantitative treatment requires solving the Schrödinger equation in the double Coulomb well: this yields a bond energy of 2.79 eV at an optimal distance of 2.00 a0 , in good agreement with experiment. The electronic Schrödinger problem for one electron in the field of the two nuclei has axial (cylindrical) rather than the spherical symmetry of an atom. Accordingly, the adiabatic electronic states are labeled by their angular-momentum projection m along the molecular axis through the nuclei, and not, as in atoms, by their total angular momentum l. Thus, the electronic states of linear molecules, are labeled σ, π, δ, ..., to indicate the absolute value |m| = 0, 1, 2, ... of their angular momentum projection: non-σ states are orbitally twofold degenerate. Both |Si and |Ai of Eq. (126) are σ states, as being composed of 1s (l = 0, thus m = 0) atomic states. Standard chemical notation adds a star apex to antibonding states, thus |Ai would actually be labeled 1σ ∗ . Other excited electronic states of H+ 2 (whether bonding or antibonding) can however possess axial angular momentum m 6= 0, thus symmetries other than σ. 2.2.2. Covalent and ionic bonding. The ion H+ 2 is instructive because in its simplicity it illustrates the physical origin of chemical bonding, permits to introduce the basic concepts of bonding and antibonding orbitals, and gives a numerical estimate of typical bond energies and distances in molecules. Nonetheless, a chemist would not call the electronic structure of H+ 2 a proper chemical bond. For an actual covalent bond, it would take two electrons occupying (clearly, with antiparallel spins) the same bonding orbital |Si. This single-particle orbital should be determined by some method (e.g. HF) accounting – at least approximately – for electron-electron repulsion, but at a semi-qualitative level we could take |Si similar to our variational guess (126). Accordingly, the ground-state electronic structure of a neutral H2 molecule resembles roughly the following 2-electron state: (129) 1 χ (σ ) χ (σ ) ↑ 1 ↑ 2 . ψeGS (r1 σ1 , r2 σ2 ) = hr1 σ1 , r2 σ2 |S ↑, S ↓iA = hr1 |Sihr2 |Si √ 2 χ↓ (σ1 ) χ↓ (σ2 ) 2.2. CHEMICAL AND NONCHEMICAL BONDING 75 Both electrons occupy the same spatially symmetric bonding orbital, in a global state which, as it must, is antisymmetric for exchange of electron 1 and 2 (through its spin part). The singlet wavefunction (129) represents a typical homonuclear (GS) (GS) covalent bond. For H2 , the bond energy Vad (∞) − Vad (RM ) ≃ 4.7 eV (at an equilibrium distance RM ≃ 1.3 a0 ). This bond energy is less than twice that of H+ 2, since this singlet state pays a significant electron-electron repulsion. Beside the ground state, the potential well produced by the two protons has space for several excited states, which can be classified as single-electron excitations, with one electron promoted to higher bonding/antibonding states. In general, excited electronic states need not favor bonding: for example, the lowest triplet state of H2 , with an electron in |Si and the other in |Ai is not bond. However, higher excited states, with both electrons in bonding orbitals, are observed to be bond, although with lower binding energy and larger interatomic equilibrium distance [?]. When pairs of more complex atoms are let interact to form diatomic molecules, the number of involved bonding and antibonding orbitals is necessarily larger, to accommodate all electrons. Figure 2.5 sketches a qualitative ground electronic structure of several homonuclear diatoms, leaving the deep core 1s electrons out. Several interesting observations may be drawn from this figure: • Electrons fill the single-particle levels according to the same scheme as in atoms (lower to higher), and the qualitative ordering of the levels does not change much in passing from one molecule to another. • Covalent bonding is mainly related to an occupancy imbalance between bonding and antibonding orbitals: the most strongly bond molecule is N2 , where this imbalance (“bond order”) equals 3. • When electrons start to fill 2p-derived molecular orbitals, the π states are lower than the σ combination. This is surprising, as the overlap of the m = 0 p orbital pointing along the molecular axis is much larger than that of the m = ±1 which are mostly located away from it: the bonding-antibonding splitting of σ orbitals should be larger than the corresponding π orbitals. However, the Coulomb repulsion of the filled 1s- and 2s-derived orbitals pushes the 2p-derived σ orbital up by a substantial amount. The “natural” ordering is restored in O2 . • Even a dimer such as Be2 , for which bonding and antibonding orbitals are equally populated, shows some amount of bonding, of the type described in Sec. 2.2.3 below. • In B2 and O2 , two electrons sit in a degenerate π or π ∗ molecular orbital, which has room for 4 electrons: to minimize the residual Coulomb repulsion, the ground state is a spin-triplet Hund-rule state (see Sec. 1.2.7.2). 76 2. MOLECULES Figure 2.5. A schematic chemist’s view of the electronic structure of a few simple homonuclear diatomic molecules. Blue indicates bonding, red indicates antibonding single-electron orbitals. Red arrows represent electrons filling the orbitals. The observed bond energy is indicated. All single-electron energies and splittings are purely qualitative. • The Ne2 dimer, like Be2 , has all bonding and antibonding orbitals filled: as a result, only an exceedingly weak bond forms, as described in Sec. 2.2.3 below. Homonuclear diatomic molecules are very special diatoms, due to their peculiar L ↔ R symmetry. Most pairs of atoms bind together, and many form covalent bonds not so unlike those illustrated for the homonuclear molecules. For example, the CO molecule has the same number of electrons as the molecule N2 . The main difference 2.2. CHEMICAL AND NONCHEMICAL BONDING 4σ * C O 77 2σ * H F 1s 2p 2π * 3σ 2p 1π 2p 1π 2σ 2σ * 2s 1σ 2s 2σ CO 11.2 eV HF 2s 5.9 eV Figure 2.6. A simplified chemist’s view of the electronic structure of two heteronuclear diatomic molecules: CO and HF. Blue indicates bonding, red indicates antibonding single-electron orbitals. The observed bond energy is indicated. Single-electron energies and splittings are purely qualitative. is that the L ↔ R symmetry is broken: the attraction of the O nucleus (Z = 8) is stronger than that of C (Z = 6). As a consequence, as sketched in Fig. 1.23, the 2s and 2p shells of O sit deeper than the 2s and 2p shells of C: the bonding and antibonding orbitals are not pure symmetric/antisymmetric combinations of the type (126). In the same spirit of the variational treatment of H+ 2 (but neglecting overlaps hL|Ri for simplicity), these orbitals may often be approximated as eigenstates of a 2 × 2 matrix EL −∆ hL|H1 |Li hL|H1 |Ri , (130) ≡ −∆ ER hR|H1 |Li hR|H1 |Ri where H1 = Te + V eff is the effective one-electron Hamiltonian fixed by the selfconsistent potential acting on the electron under examination, and |Li, |Ri are the considered atomic states (e.g. the 2s or 2p) of the left and right atom (here C and O). The diagonal elements are mostly dictated by the energy positions of atomic shells. As the atoms move closer, the off-diagonal term ∆ > 0 grows larger and larger. The eigenvalues of the matrix (130) are: s 2 EL + ER EL − ER (131) E ab = ± + ∆2 , 2 2 78 2. MOLECULES and the corresponding antibonding |ai and bonding |bi eigenkets can be written as (132) s s aE 1 1 u u EL − ER = 1± √ 1∓ √ |Li ∓ |Ri , with u = . b 2 2 2∆ 1 + u2 1 + u2 For u = 0 we recover the symmetric homonuclear molecule, with |2∆| representing the splitting between bonding |bi = |Si and antibonding |ai = |Ai states (126). For finite u (assuming EL > ER ), the eigenenergies E ab in (131) remain centered around 2 EL +ER 2 1/2 , but their splitting E − E = (E − E ) + (2∆) is larger than both the a b L R 2 diagonal separation EL −ER and the off-diagonal mixing energy |2∆|. For increasing u, the eigenkets (132) resemble less and less the simple symmetric and antisymmetric combinations (126): |ai acquires a larger |Li character, while |bi acquires a larger |Ri character. In the limit of very large |u|, the eigenenergies (131) E ab ≃ E L , and R L the eigenkets | ab i ≃ | R i. As illustrated in Fig. 2.6, an intermediate value of u ≃ 1 applies for the 2s and 2p orbitals of CO near its equilibrium separation: bonding molecular orbitals lie prevalently on the O side, antibonding ones on the C side, but substantial quantum delocalization is present. The bond of CO and of many similar heteronuclear diatoms is classified as covalent polar, since it is associated to a nonzero permanent electric dipole due to the charge transfer associated to the unequal charge distribution of the electrons in the bonding state |bi. HF (here Hydrogen-Fluorine, not Hartree-Fock!, see Fig. 2.6) is the prototype diatom where, for typical interatomic distances, u is very large (Figure 1.23 shows how much deeper lies the 2p shell of F than the 1s shell of H). As a result, the relevant bonding orbital is not much different from the 2p of an isolated F atom. Thus a radical (rather naive) description of the bond of HF involves a complete charge transfer from H to F (which therefore completes the shell to 2p6 ). The two ions H+ and F− would then attract each other with a −e2 /R attraction as long as they are separated enough. As the proton moves inside the typical size of the outer shell of F− , the screening of the positive charge of the F nucleus diminishes, and the attraction gradually turns into repulsion. The origin of bonding in this simple picture is the energy gained in moving the ions from infinity to the equilibrium distance, to which the energy paid to form the ions from the neutral atoms (the ionization potential of the atom which turns into a cation – here H – minus the electron affinity of the atom acquiring the electron – here F) is to be subtracted. Indeed, [e2 /(92 pm) = 15.7 eV] minus the ionization potential of H (13.6 eV) plus the electron affinity of F (3.4 eV) yields 5.5 eV, in not especially poor agreement with the observed bond energy 5.9 eV. A molecular bond similar to that of HF, where u is so large that almost complete charge transfer occurs is named ionic bond. 2.2. CHEMICAL AND NONCHEMICAL BONDING Figure 2.7. 3D structures of several organic molecules: (left to right up to down) methane CH4 , propane C3 H8 , butane C4 H10 , benzene C6 H6 , ethanol CH3 CH2 OH, pyrene C16 H10 , morphine, heroin, a peptide, three proteins of increasing complexity. In the stickrepresentation of morphine and heroin, H atoms are left undrawn and all corners represent C atoms. For the last protein, the single-atom representation is given up in favor of a block representation. 79 80 2. MOLECULES The picture of bonding according to Figs. 2.5 and 2.6 and relative discussion is extremely simplified, at the point of being inaccurate. In reality, all orbitals of the same symmetry are actually mixed, and the correlation with atomic levels indicates at most the prominent atomic component in the actual molecular orbital. For example, the orbitals labeled 2σ, 2σ ∗ , 3σ, and 4σ ∗ , are linear combinations each involving a little of all m = 0 atomic orbitals, mostly 2s and 2pz , but also 1s, 3s, 3pz ,... This hybrid character of molecular orbitals, a rather boring detail for diatomic molecules, is a crucial ingredient in determining the 3D shape of polyatomic molecules and covalent solids. Hybrid orbitals provide adiabatic potentials which depend strongly not only on interatomic distances but also on angles. An especially important example is the mix of 2s and 2p, at the basis of the geometry of covalent bonds in organic molecules: sp3 combinations determine the ideally 109◦ angle between the bonds of tetrahedrally coordinated carbon and silicon; sp2 combinations tend to bind atoms such as carbon and nitrogen to three other atoms in the same plane, forming ideally 120◦ angles. The endless combinations of the molecular orbitals of polyatomic 3D molecular structures (Fig. 2.7) and the weaker interactions among different molecular units, are at the basis of the infinite richness of organic chemistry, the microscopic functioning core of living matter. Analogous mixtures of 3s & 3p and of 4s & 4p determine the covalency of Si and of Ge, Ga, As, in turn at the root of the crystalline and amorphous (rather than molecular) structures that these elements tend to form in their solid state (see Figs. 4.9, 4.25, and 4.63 below). 2.2.3. Weak nonchemical bonds. Our description of covalent bonds in terms of linear combinations of single fixed atomic orbitals, is doomed to obtain an exponential dependence of Vad (R) at large distance, due to the exponential decay of the orbitals themselves. Experimentally however, at large distance, the interaction energy between any two atoms is always attractive and decays following a universal power law: ∼ R−6 . This is due to classical electromagnetism concepts which the previous analysis completely overlooked: atomic electric dipole moments and atomic polarizability. The electric dipole moment of any atom is, on average, null. As long as spherical symmetry is unbroken, the single-electron atomic orbitals are also eigenstates of the ~ 2 , namely some R(r) times spherical harmonics total orbital angular momentum |L| YLM . These function have definite parity (−1)L , thus the angular probability distribution |YLM (r̂)|2 of each electron is even (see Fig. 1.5). Thus, the average electric ~ dipole hLM |d|LM i vanishes, as discussed above in Sec. 1.1.10. However, electrons move around the nucleus and occupy instantaneously specific positions, associated to nonzero (fluctuating) dipole moment. This dipole moment produces an instanta~ whose intensity decays away from the atom as R−3 , based on neous electric field E elementary arguments of electromagnetism. 2.2. CHEMICAL AND NONCHEMICAL BONDING 81 A field acting on a second remote atom “polarizes” it. The charge distribution of the electrons of the second atom reacts to minimize the total energy, now including, ~ ~ ~ beside P the inner terms of (1), a coupling −d · E to the external field, where d = −qe ~ri . The second atom responds to the instantaneous external electric field by ~ = αE. ~ As the building up an induced dipole in the same direction as the field: hdi field is weak, the atomic polarizability α is independent of the field (linear response). As a result, the total energy of the two atoms at distance R lowers by an amount ~ E ~ = − 1 α|E| ~ 2 ∝ R−6 with respect to the atoms placed at infinite distance. By − 12 hdi· 2 dimensional analysis, the prefactor of R−6 must be a [distance]6 × [energy]. As only atomic physics is involved, the order of magnitude of this van der Waals attraction must be vdW Vad (R) ∝ −EHa (133) a 6 0 R . The order of magnitude of the coefficient of the −R−6 decay of the interaction potential is EHa a60 ≃ 0.6 eV Å6 ≃ 10−79 J m6 . More quantitatively, the prefactor of R−6 is proportional to the product of the two atomic electrical polarizabilities [?] (since the dipole fluctuations are proportional to α of the first atom, as reciprocity suggests). These polarizabilities can be determined by second-order perturbation theory of atomic states. By standard perturbative arguments, an external field distorts the ground state so that (i) the electronic ~ of excited states with l wavefunction acquires components (proportional to |E|) changed by ±1, and (ii) the total energy lowers quadratically with the field. For ~ = Ez ẑ, the ground state of an H atom example, in a weak external electric field E takes the form of a linear combination |1, 0, 0; Ez i = a|1, 0, 0i + (134) X n>1 bn |n, 1, 0i , where the hydrogenic eigenkets |n, l, miP are represented in real space by the standard 2 2 eigenfunctions of Eq. (36), and |a| + n |bn | = 1. The resulting dipole moment P h1, 0, 0; Ez |dz |1, 0, 0; Ez i = 2a n bn hn, 1, 0|dz |1, 0, 0i is proportional to Ez , because the bn coefficients are. To evaluate the polarizability α, estimate the coefficients bn by standard first-order perturbation theory [?] bn = −Ez hn10|dz |100i/(E1 − En ), where En are the energy levels of H, Eq. (30). The weak-field polarizability α is then given by (135) α= X |hn, 1, 0|dz |1, 0, 0i|2 h1, 0, 0; Ez |dz |1, 0, 0; Ez i ≃ −2 . Ez E1 − En n>1 82 2. MOLECULES The physical dimensions of α are [Charge]2 [Length]2 [Energy]−1 = [Charge]2 [Time]2 [Mass]−1 , the same as 4πǫ0 × [Length]3 . It is therefore natural to measure atomic polarizabilities in units of 4πǫ0 a30 = 1.64878 · 10−41 C2 s2 /kg. The resulting static polarizability of H is αH = 4.50 4πǫ0 a30 . The homologous quantities for He and Li are αHe = 1.383 4πǫ0 a30 and αLi = 164.11 4πǫ0 a30 . The very small polarizability of He (huge 1s-2p gap) and the large one of Li (comparably small 2s-2p gap) are accounted for by Eq. (135): α is inversely proportional to the energy gap from the highest filled state up to the nearest empty state of different l character. The dipole–induced-dipole mechanism is perfectly general, and it accounts for the leading long-range attraction of all pairs of neutral atoms. As distance R reduces, open-shell atoms, e.g. H and those forming the diatoms of Figs. 2.5 and 2.6, gradually modify their orbitals to form robust covalent bonds. For pairs of close-shells atoms (Be2 , Ne2 ) only the van der Waals mechanism produces (rather weak) attraction, until short-distance repulsion turns in. As R is reduced the short-distance repulsive Coulombic divergence of Vad ∝ R−1 is peculiar to H+ 2 and few other dimers involving H. For general many-electron atoms, long before the nucleus-nucleus repulsion Vnn (R) becomes relevant, the electronic (a) energy Ee (R) blows up because of Pauli’s principle: as the core electrons are brought together in the same region of space, their wavefunctions become less and less orthogonal. Some of these electrons is then pushed up into some empty valence level, which makes tiny reductions of R cost tens or even hundreds EHa . This rapid “hard-core” increase of the energy as two atoms collide is responsible for the “impenetrability” of matter, i.e. the sharp increase of pressure of a sample whose volume is so much reduced that each atom is left less than its characteristic volume (∼ few Å3 ) of space. We see that a combined effect of electrons indistinguishableness and quantum kinetic energy sustains matter against collapse due to electromagnetic attraction. In a diatomic context, the Vad (R) blowup at short distance is sometimes parameterized phenomenologically with a R−12 power law. The simplest potential capturing the long-distance van der Waals attraction and the phenomenological short-range repulsion is the popular Lennard-Jones potential σ 12 σ 6 (136) VLJ (R) = 4ε . − R R This expression is nothing but a phenomenological model (out of a large class) for the actual Vad (R). Its two parameters ε (the depth of the potential well at RM ) and σ (the radius where VLJ (R) changes sign) are listed in Table 2.1 for the noble-gas dimers, whose actual Vad (R) the Lennard-Jones potential is a fair approximation for. In this context, it is also used for describing the dynamics of a collection of more than two noble-gas atoms as a sum of pair potentials (two-body forces). This extrapolation is a fair approximation of the actual adiabatic potential of simple 2.2. CHEMICAL AND NONCHEMICAL BONDING 83 element σ [pm] ε [meV] 4εσ 6 [EHa a60 ] He 256 0.879 1.7 Ne 275 3.08 8.9 Ar 340 10.5 109 368 14.4 239 Kr Xe 407 19.4 590 Table 2.1. Lennard-Jones parameters (fit to atom-atom scattering data) defining the pair potentials of the diatoms of the indicated elements according to Eq. (136). Note the increase of atomic size (reflected by σ) with Z and the rapid increase in atomic polarizability in going from He to Ne (reflected by the −R−6 coefficient 4εσ 6 ≃ α2 ). close-shell systems at low density: the phase diagram and correlation properties of the Lennard-Jones solid/fluid is in close qualitative and semi-quantitative agreement to observed data for noble-gas systems. The two-body Lennard-Jones model instead is very inappropriate for atoms forming strong directional covalent bonds. 2.2.4. Classification of bonding. We have clarified the mechanism for the general tendency, pictured in Fig. 2.1, of atoms to attract at large distance and repel when coming in contact. Contrasting this qualitative likeness, significant differences in the equilibrium distances and huge differences in well depths make different diatoms, bonded by different mechanisms, very unlike. When at least one of the atoms is a noble gas, the dipole–induced-dipole mechanism is the only mechanism creating attraction, little or no covalency occurs, and consequently the equilibrium atom-atom distance RM is rather large and the bond energy [Vad (+∞) − Vad (RM )] is small (few meV, see Table 2.1). Weakly bonded Van der Waals systems retain ordinarily the monoatomic gas phase to relatively low temperature, and show weaker tendency to form diatomic molecules than to condensate to monoatomic liquid and solid phases. A few atoms with open shells (O, N, F) show prominent tendency to form diatomic molecules, with short strong covalent bonds, with typical bond energies of the order of a few eV. These molecular units are retained in the low-temperature liquid and solid phases. For many other elements (e.g. Li, Be, B, C), the extra energy gain in forming many bonds per atom makes them form extended metallic or covalent solids, rather than diatoms. When different atoms are covalently bound together, some amount of electronic charge moves closer to one nucleus than to the other, as the energy of the atomic shells involved in bonding is different. This phenomenon makes heteronuclear bonds polar. The extreme limit of complete or almost complete 84 2. MOLECULES charge transfer is named ionic bond (e.g. HF, LiF): energies and lengths involved are in the same range as for the covalent bonds. 2.3. Molecular spectra In the previous Section, we have discussed general properties of the solutions of the electronic equation (117) for a diatom, thus acquiring information on the typical shape of the adiabatic potential Vad . We consider now the motion of the two nuclei in the adiabatic force field described by Vad , with its spectroscopical implications. This motion can be described in terms of solutions of Eq. (124). As remarked in Sec. 2.1, the adiabatic potential is independent of the centermass position of the molecule (translational invariance) and of the orientation in space of the straight line through the two nuclei (rotational invariance). Precisely the same transformation (21) applied to the two-body problem of the one-electron atom separates the center-mass motion of the two-body problem of the diatomic molecule. Like for atoms, due to translational symmetry, the molecular center of mass translates freely: the random thermal translational motion in a gas-phase sample originates Doppler and collisional broadening of the spectra. By treating the internal degree of freedom in polar coordinates, like for the oneelectron atom, we convert the equation for the relative coordinate into angular and radial equations (25), (26), and (27), where in the latter U (r) is replaced by the distance-dependent adiabatic potential Vad (R) (the nucleus-nucleus separation R here replaces the electron-nucleus distance r of the one-electron atom). The angular equations are universal, thus the angular wavefunction, describing the orientation in space of the molecule (thus molecular rotations) are standard spherical-harmonics Ylm . Rotational states |l, mi are labeled by the molecular angular momentum l,1 plus its ẑ component m. The formal structure of the radial equation is the same as for the one-electron atom. The substantial physical difference stands in the equilibrium distance (the R where the potential is most attractive) which for the diatom is finite R = RM > 0, rather than R = 0 as for the one-electron atom. The main consequence is that the radial motion is mostly localized near RM , in a region where the centrifugal term ~2 l(l+1) in the equation is often fairly small. If we neglect the variations of R−2 2µ R2 along a region around RM where the radial wavefunctions differ significantly from zero, then the radial motion is approximately independent of the rotation, i.e., the radial solutions are independent of l. The radial quantum number v = 0, 1, 2, ... for the diatomic molecule indicates the number of radial nodes, like n − l − 1 for 1 In the literature, the molecular angular momentum quantum number is sometimes called r and sometimes j, rather than l. 2.3. MOLECULAR SPECTRA 85 the one-electron atom. If the adiabatic potential is Taylor-expanded around its minimum 1 d2 Vad (R′ ) (137) Vad (R) = Vad (RM ) + (R − RM )2 + ... , 2 dR′2 ′ R =RM and truncated at second order, then the radial motion is approximately a harmonic motion. In this approximation, the energy spectrum contains three additive contributions, of decreasing importance: • A large electronic term Vad (RM ), whose lowering at the equilibrium configuration Vad (∞) − Vad (RM ) measures the well depth responsible for the chemical bond. • A vibrational term, which in the harmonic approximation amounts to 1 (138) Evib (v) = ~ω v + , 2 (139) measuring the energy of radial vibration around the equilibrium position RM . Here s k d2 Vad (R′ ) M 1 M2 ω= , with k = , and µ = µ dR′2 R′ =RM M1 + M2 where µ indicates the reduced mass of the two-body oscillator composed by the two nuclei of mass M1 and M2 (see Eq. (24)). Typical vibrational energies ~ω of few hundred meV or less are observed. ~2 l(l+1) ~2 l(l+1) ≃ 2µ • A rotational contribution from the mean rotational term 2µ 2 R2 RM in the radial equation (27) yields: (140) Erot (l) = ~ 2 ~2 l(l + 1) |L| = , 2 2µ RM 2I 2 where I indicates the classical momentum of inertia µRM of the diatom with respect to its center of mass, assuming the interatomic distance R is frozen at the equilibrium separation RM . The typical order of magnitude of the rotational energies ~2 /(2I) in molecular spectra is few meV or less, H2 having the largest one, 7 meV. 2.3.1. Rotational and ro-vibrational spectra. In an “adiabatic” transition with the electrons remaining in the electronic ground state, observed spectra fulfill the standard dipole selection rules: ∆l = ±1. The dipole operator involved here is the product of the internuclear separation times the charge difference permanently attached to the two atoms. This charge difference vanishes for equal nuclei, thus 86 2. MOLECULES Figure 2.8. Observed purely rotational spectrum of HCl gas. Here, the |li = 0i → |lf = 1i line close to 20 cm−1 is not seen because of limitations of the spectrometer. no dipole transition occurs for homonuclear molecules.2 On the contrary the large dipole moments of strongly polar molecules, such as HF and HCl, produce intense dipole transitions. The dipole moment of CO, a weakly polar molecule, is only about 10% that of HCl, thus producing much weaker infrared absorption. Rotational and vibrational molecular spectra are mostly observed in absorption rather than emission, as the spontaneous emission rate of such low-energy transitions is very small, due to the E 3 dependence of the decay rate, Eq. (73). Purely rotational spectra, usually in the far infrared region, are associated to ∆v = 0, ∆l = 1 transitions. The energy difference between |li i and |lf i = |li +1i states is ~2 ~2 [(li + 1)(li +1 + 1) − li (li + 1)] = [li + 1] . 2I I Accordingly, if in the sample several initial rotational states are populated, then the rotational spectrum contains several equally spaced lines. The energy spacing is 2 twice as large as the typical rotational quantum ~2I . Figure 2.8 reports a characteristic purely rotational absorption spectrum. The measured separation of the lines permits to determine the interatomic equilibrium separation RM through Eq. (141). Roto-vibrational spectra are observed typically in near-infrared absorption. Most of the dipole intensity concentrates in ∆v = 1 transitions, although in practice, also weaker transitions with ∆v > 1 are observed routinely. Figure 2.9 reports a characteristic roto-vibrational spectrum. Again, transitions occur starting from several initial rotational states: as a consequence, the purely vibrational peak is “decorated” by rotational transitions, which on the low-energy side imply ∆l = −1 (141) 2 ∆Erot (li ) = This is the reason why clean air (composed mainly of gas-phase N2 and O2 ) is essentially transparent in the near infrared. Transparency in the visible range is associated to large gaps (several eV) from the electronic ground state to the first allowed electronic excitation. 2.3. MOLECULAR SPECTRA Figure 2.9. Observed roto-vibrational spectrum of HCl gas: absorption in this region is associated to the “fundamental” transition |v = 0i → |v = 1i, decorated by the rotational P branch (at the left, |li i → |li −1i) and R branch (at the right, |li i → |li +1i). The isotopic duplication of the lines is visible: H35 Cl is responsible for the stronger peaks and less abundant H37 Cl for the weaker ones. Figure 2.10. A scheme of the rotational transitions occurring on top of a vibrational transition of frequency ν0 in a diatomic molecule. 87 88 2. MOLECULES Figure 2.11. Normal modes of vibration of polyatomic molecules: (a) CO2 (~ω1 = 166 meV, ~ω2 = 83 meV, ~ω3 = 291 meV); (b) H2 O (~ω1 = 453 meV, ~ω2 = 198 meV, ~ω3 = 466 meV). and are called P branch, and on the high-energy side imply ∆l = +1 and are called R branch, as illustrated in Fig. 2.10. The rotational structures are equally spaced according to Eq. (141) (for the P branch a similar result holds). The roto-vibrational spectra are characteristic for the absence of a purely vibrational peak at energy ~ω, with could only occur if ∆l = 0 were dipole allowed (which is not). Roto-vibrational spectra are often investigated through Raman spectroscopy, which is not based on dipole transitions as infrared absorption. Raman experiments are based on optical (electronic) non-resonant excitations of the molecule, which rapidly decay back to the electronic ground state, possibly leaving a vibrational and/or rotational excitation. As the experiment involves two photons, the selection rules allow ∆l = 0, ±2 transitions. The roto-vibrational dynamics of polyatomic molecules involves further intricacies. Once translations and rotations have been accounted for, 3Nn − 6 (or 3Nn − 5 for linear molecules) internal degrees of freedom correspond to vibrations, which approximately described in terms of small harmonic oscillations (the classical normal modes) around the multi-dimensional minimum of Vad . Figure 2.11 sketches 2.3. MOLECULAR SPECTRA 89 Figure 2.12. Rotational states decorating vibrational states, in turn (a′ ) (a′′ ) decorating different electronic states ψe and ψe producing differ′ ′′ ent (bonding) adiabatic potentials Vad (R) and Vad (R). such modes of vibration for CO2 and H2 O. The normal modes are then treated as quantum (harmonic) oscillators. 2.3.2. Electronic spectra. In the optical and ultraviolet photon region, excited electronic states are investigated. These transitions are similar to atomic excitations, and can basically be understood in terms of promotion of an electron from a filled molecular orbital to an empty one, e.g. from a bonding to an antibonding (a′ ) (a′′ ) orbital. An electronic transition ψe → ψe in a molecule leads from one adiabatic potential surface to another, as illustrated in Fig. 2.12. In general, the shapes of dif′ ′′ ferent adiabatic potentials Vad (R) and Vad (R) associated to different electronic states ′ ′′ are well distinct, and in particular RM 6= RM . This means that electronic transitions are usually accompanied by vibrational transitions, excited by the displacement of the equilibrium geometry. Roughly equally spaced vibrational satellites decorate the 90 2. MOLECULES E N+2 N2 (a) R’M R’’ M R(b) Figure 2.13. (a) A sketch of the adiabatic potentials (in the harmonic approximation) of two different molecular electronic states, here the electronic ground state of neutral N2 , and a generic electronic state of the ion N+ 2 . Arrows highlight the most intense Franck-Condon transitions. (b) The observed photoemission spectrum of N2 → N+ 2 , showing sequences of vibrational satellites build on top of each electronic state. electronic transitions (see Fig. 2.13), the spacings being given by the harmonic frequency ~ω ′′ of the final adiabatic potential. The intensities of the different satellites are distributed according to matrix elements proportional to |hv ′ = 0|v ′′ i|2 , where the vibrational ground state |v ′ = 0i of the initial adiabatic potential is projected on all the final vibrational eigenstates |v ′′ i in the excited electronic adiabatic potential. The most intense transitions involve those |v ′′ i states with large overlap in the region of the original minimum RM , as illustrated in Fig. 2.13 (Franck-Condon principle). Accordingly, a large number of significantly excited vibrational satellites indicates a large displacement of the equilibrium position in the electronic transition. For example, the spectrum of Fig. 2.13 indicates that, in the electronic transition from 2 the ground state of N2 to N+ 2 , RM shifts more in going to the Πu state than to 2 + 2 + Σg or Σu states. Rotational structures accompanying the electronic-vibrational structures are complicated by the change in momenta of inertia from I ′ to I ′′ and by occasional changes in electronic angular momentum. 2.3.3. Zero-point effects. We conclude this Section with an amusing detail: the bond energy of a diatom is slightly less than the depth Vad (+∞) − Vad (RM ) 2.3. MOLECULAR SPECTRA Energy / depth of the well 1 91 Vad(R) 0 -1 0 Eb(quant) v=2 v=1 Eb(class) v=0 1 2 R/RM 3 Figure 2.14. The role of the zero-point vibrational motion in the precise definition of the bond energy of a diatomic molecule. The zero-point energy is a consequence of the Heisenberg’s uncertainty principle, or, in other words, the nonzero value of the quantum kinetic term hTn i in a position-localized state. The zero-point energy decreases as the atomic masses Mα increase, and would eventually vanish in the limit of a classical adiabatic motion of the atoms. of the adiabatic potential well, which would apply if the nuclear masses were infinite, or equivalently if the dynamics of the ions were classical. Due to quantum zero-point motion, associated to Heisenberg’s uncertainty, the actual groundstate energy includes a vibrational contribution, that in the harmonic approximation equals Evib (0) = 21 ~ω, see Eq. (138). As illustrated in Fig. 2.14, when zero-point energy is accounted for, the actual binding energy reduces to Eb = Vad (+∞) − [Vad (RM ) + Evib (0)]. This usually small effect can be probed by changing the nuclear isotopic masses, thus modifying the vibrational frequency ω ∝ µ−1/2 , without significantly affecting Vad (R). The zero-point effect is the most spectacular in 4 He2 , which is so extremely weakly bound (see Table 2.1) that a single bound “vibrational” level is observed [?], with a zero-point energy Evib (0) that almost completely cancels the adiabatic attraction (the well depth is approximately 900 µeV), to a total net binding energy Eb ≃ 0.1 µeV only! Of the 25% lighter 3 He2 , no bound state is observed. 92 2. MOLECULES The present Chapter summarizes few basic ideas and experimental evidence in the field of molecular physics. The concepts of adiabatic separation and chemical bonding stand at the heart of the physics and chemistry of matter: they open the way to all understanding of the dynamics of systems composed by more than a single atom, including not just molecules but also polymers, clusters, solids. The few concepts sketched in these pages only scratch the surface of the extremely rich field of molecular spectroscopy, which provides detailed microscopic information about the geometry and dynamics of diatomic and polyatomic molecules [?, ?, ?]. Beside spectroscopical characterization, molecules are produced transformed and studied by means of all sorts of chemical reactions: these can mostly be analyzed conceptually in terms of the dynamics of atoms guided by suitable multi-dimensional potential energy surfaces Vad (R). The qualitative understanding and quantitative study of these phenomena constitutes the hard core of chemistry and chemical physics. CHAPTER 3 Statistical physics The purpose of statistical mechanics is to relate average properties of “macroscopic” objects (thermodynamical quantities) to the fundamental interactions governing their microscopic dynamics. We have seen in the previous chapters that many detailed properties of individual atoms and molecules can be understood on the basis of the microscopic electromagnetic interactions driving the motion of the composing electrons and nuclei. The number of electrons and nuclei in atoms and small molecules does not exceed a few hundred, a few thousand at most. Macroscopic objects, as opposed to microscopic systems, are characterized by huge numbers of degrees of freedom. For example, a sodium-chloride crystal weighting 1 g, ready to be thrown in hot water for cooking pasta, is composed of about 3 · 1023 electrons and approximately 1022 Na and 1022 Cl nuclei. One may as well conceive some wavefunction describing the dynamics of such a huge number of degrees of freedom, but must also readily give up any hope to ever store the huge amount of information that even an approximate wavefunction holds to describe in detail this system. On the other hand, this limitation is not especially bad, since such intimate details of the dynamics of our salt crystal as the individual motions of electrons and nuclei are probably boring and of little practical interest. A physicist or a materials engineer is rather interested in the measurable average macroscopic properties of objects and substances, such as stiffness, tensile strength, heat capacity, heat and electrical conductivity, magnetic susceptibility, phase transitions... To pursue the aim of drawing a link between the microscopic dynamics and the macroscopical thermodynamical properties, equilibrium statistical physics borrows its mathematical tools from the theory of probability and statistics. 3.0.4. Probability and statistics. All of statistical reasoning is based on the notion of probability. The naive notion of probability coincides with the relative number of observations. For example, after rolling a dice a large number N of times, we observe “two” N2 times: the ratio N2 /N is an estimate of the probability P2 of obtaining “two” in a single roll. However, we have also an a priori idea of probability. We will assert that P2 = 16 , even against contradictory observation, unless we have evidence that the dice is loaded. This a priori notion of probability 93 94 3. STATISTICAL PHYSICS is at the basis of a more rigorous definition of probability as a measure defined on a “space of events”, such that the measure of all space equals unity. (1) It is a basic assumption of measure theory that for non intersecting sets of events A and B, the probability of A ∪ B (i.e. that any event in either A or B is realized) equals PA + PB . (2) Given two spaces of events (which could intersect, or even coincide), two sets of events A and B belonging to the first and second space are called independent if the probability of A and B (i.e., an event is realized that satisfies the conditions for belonging to A and at the same time those for belonging to B) is the product P (A) · P (B). Both these properties are trivial when probability and relative number of observations are identified. The two basic properties of probability sketched above allow us to derive probabilities of complicated events in terms of probabilities of elementary events. For example, the probability of obtaining two ones when two independent dices are rolled 1 1 1 1 is 16 · 16 = 36 (property 2); the probability for a four and a five is 36 + 36 = 18 , as both (a four on dice 1 and a five on dice 2) and (a five on dice 1 and a four on dice 2) are possible mutually exclusive events (property 1). Statistics makes wide use of probability distributions: these are lists of probabilities of mutually exclusive sets of events which cover the whole space of events. For example, when the two rolled dices are considered, the outcome may be grouped in 1 equal numbers (6 mutually exclusive possibilities, each of probability 36 ) or different 1 ): using numbers (6 · 5/2 = 15 mutually exclusive possibilities, each of probability 18 1 5 property 1, this leads to a distribution Pequal = 6 , and Pdifferent = 6 . Clearly, the events considered exhaust all space, and the sum of all the probabilities in the distribution equals unity. It is a useful exercise to work out the distribution Pboth even , Peven odd , Pboth odd on the same space of events. Basically, a probability distribution makes probability a function of certain specifications of the events considered. Examples of popular distributions in statistical physics are the binomial distribution, the Poisson distribution, and the Gaussian distribution. Quantum mechanics, even at the level of a single particle, or few of them, has an intrinsic statistical interpretation, in the probabilistic postulate of observation. When a measurement of observable A is done on a quantum system initially in some state |ii, the system will be found in the eigenstate |ai of A associated to eigenvalue a, with probability Pa (i) = |ha|ii|2 . The quantum state |ii contains in its belly all probability distributions corresponding to all possible operators associated to potential measurements that may be carried out on the system. However, statistical physics is not especially concerned about this statistical interpretation because, if the quantum mechanical system is left undisturbed, the ket evolves from 3. STATISTICAL PHYSICS 95 any given initial state according to the (deterministic) Schrödinger equation (7). On the contrary, the statistic description of a macroscopic system attempts to investigate its average properties without the need of a precise specification of the initial conditions, but rather assuming that all “reasonable” initial conditions could occur with equal probability. Ideally one would like to identify macroscopic properties with time averages of the microscopic observables along the evolution dictated by internal dynamics. Two basic assumptions reconcile the irrelevance of the initial conditions with the time-average viewpoint: equilibrium and ergodicity. Equilibrium requires that long ago the system has undergone some initial transient, and that now, at the time of interest, no systematic evolution is occurring any more, all collective quantities (e.g. pressure) fluctuating in time around some well defined average value. Ergodicity assumes that all kinds of states are randomly explored in a period of time short with respect to the typical duration of measurements: subsequent times provide independent random realizations of an underlying probability distribution. A dishonest dice roller violates ergodicity by controlling accurately the initial conditions, rather than rolling blindly to generate successive truly random independent numbers. A similar violation may also occur in statistical physics, as illustrated in the example of H2 nuclear spin. H2 molecules occur with total nuclear spin 1 (orthohydrogen) or 0 (parahydrogen). These states are almost degenerate, so they should all occur with equal probability: one expects that the observed ratio of ortho- to parahydrogen equals the degeneracy ratio 3:1. However ortho-para interconversion is rather difficult: normal gaseous encounters either leave nuclear spins unaltered or simply exchange them, thus leaving the total abundance of each species unaltered. Normally, inter-conversions occur at the walls, in the neighborhood of magnetic impurities. One could however keep the H2 sample in a vessel where all magnetic impurities have been carefully removed. As a result, an anomalous abundance ratio of ortho- to parahydrogen can be stabilized for extended time. Instead, if magnetic impurities are present, careless “shuffling” of the nuclear spins occurs, the expected 3:1 equilibrium distribution is readily recovered, and the system is ergodic. In brief, the ergodic hypothesis assumes that the system is sufficiently random that no conserved quantities prevents the access (in periods of time short with respect to the duration of the experiment) to some major part of the space of states. When a system is at equilibrium and ergodic, time averages can safely be replaced by averages over a suitable probability distribution of the microscopic states. We now sketch the standard math used to describe a random distribution of quantum states. 3.0.5. Quantum statistics and the density operator. We want to formalize our basic ignorance of the precise quantum state of a system, while describing 96 3. STATISTICAL PHYSICS correctly its statistical properties [?]. A system may be found in any of a number of quantum states |a1 i, |a2 i, ... with probability w1 , w2 , ... respectively. The states |a1 i, |a2 i, ... need not be eigenstates of any observable, they need not be orthogonal, and in number they could even exceed the dimension of the Hilbert space of states. P The normalization of probability requires that i wi = 1. On average, the measurement of an observable B on such a system should provide X X X (142) [B] = wi hai |B|ai i = wi b |hai |bi|2 , i i b where the {|bi} is the basis of eigenkets of B, with eigenvalues b. We introduce the square brackets [ ] to indicate the statistical mean of the quantum average values. The averages (142) are naturally computed based on the statistical density operator X (143) ρ̂ = wi |ai ihai | , i as [B] = (144) X i = X b wi hai |B|ai i = b hb|ρ̂|bi = X b X i wi X b b hb|ai ihai |bi = hb|ρ̂B|bi = Tr(ρ̂B) . X b b hb| X i ! wi |ai ihai | |bi The density operator collects all dynamical and statistical properties of the system. ρ̂ is Hermitian, it can therefore be diagonalized. On its diagonal basis, the density operator is expressed as: X Pm |ρm ihρm | , (145) ρ̂ = m where the kets |ρm i form a complete orthonormal basis of the Hilbert space (in contrast to the |ai i of the definition (143)), associated to real eigenvalues Pm . Note the normalization (146) X X X X X Tr(ρ̂) = hb|ρ̂|bi = wi hb|ai ihai |bi = wi hai |bihb|ai i = wi hai |ai i = 1 . b P b i i b i thus also m Pm = 1, and the eigenvalues Pm can be interpreted as probabilities (all Pm ≥ 0). The special case of a statistical operator describing a single pure state ρ̂ = |a1 iha1 | falls back to standard deterministic quantum mechanics: [B] = Tr(ρ̂B) = ha1 |B|a1 i = hBi. In this case (and only in this case) ρ̂ has the property of a projector ρ̂2 = ρ̂. 3.1. EQUILIBRIUM ENSEMBLES 97 W S U Figure 3.1. The whole “universe” U partitioned in two weakly interacting parts: a “system” S plus the “rest of the universe” W. 3.1. Equilibrium ensembles The basic postulate of equilibrium statistical mechanics is: in accord to the ergodic hypothesis, all quantum states of the “universe” U with energy E ≤ EU ≤ E + ∆E are equally likely. In other words, the probability of a state |ii is a constant PiU = 1/ΩU (E; ∆E) (the same for any |ii) for states in this range of energy and zero otherwise (microcanonical ensemble). ΩU (E; ∆E) represents the number of accessible states for the whole universe in the allowed energy range E ≤ EU ≤ E + ∆E, so that probability is correctly normalized to unity. From this very “democratic” postulate, we derive the probability distribution of a system S in thermal equilibrium with the rest of the universe W (Fig. 3.1). The total Hamiltonian HU = HS + HW + HSW involves only a very weak interaction term HSW ≈ 0, so that EU ≈ ES + EW . We address the specific problem: What is the probability Pm of finding the system S in a given quantum state |mi of energy Em ? As the coupling between S and W is weak, we can safely assume that the precise state in S and that in W are distributed independently at random, except for the necessity of energy conservation. This means that (147) P U = PmS P W . Microcanonical ensembles govern both U and W=U−S individually. Accordingly, like in U each state has probability P U = 1/ΩU (E; ∆E), in W each state has probability P W = 1/ΩW (E − Em ; ∆E) (E − Em is the residual energy of W, once the energy Em of S has been removed from the total energy). From Eq. (147) we extract 98 3. STATISTICAL PHYSICS the probability distribution of S: PmS (148) PU = W = P 1 ΩU (E;∆E) 1 ΩW (E−Em ;∆E) = ΩW (E − Em ; ∆E) . ΩU (E; ∆E) Now we recognize that for most states the energy of the system Em is a tiny part of the total energy of the universe E: we then Taylor-expand the numerator (rather: its logarithm1) around E: (149) ln ΩW (E − Em ; ∆E) = ln ΩW (E; ∆E) − βEm + . . . , ′ where β = ∂ ln ΩW∂E(E′ ;∆E) |E ′ =E . The linear approximation for the logarithm is exceedingly good as long as Em ≪ EW . We substitute it back into Eq. (148) and obtain ΩW (E; ∆E) −βEm (150) PmS = , e ΩU (E; ∆E) which is a very remarkable result for at least two reasons: • the whole dependence on the state m of S has been confined to the expoU (E;∆E) nential factor, as the ratio Z = ΩΩW is independent of Em ; (E;∆E) • the probability of the state |mi of S only depends on its energy Em , through a simple exponential. We may rewrite Eq. (150) inserting the normalization factor Z −1 as e−βEm . Z Here we omit the label S, as Eq. (151) describes the Boltzmann equilibrium probability distribution for the states of a generic system in weak thermal contact with a huge environment. This distribution is associated to the Gibbs canonical ensemble. As the probability distribution (151) is necessarily normalized, Z can alternatively be expressed entirely in terms of properties of the system S under study, as X (152) Z= e−βEm = Tr e−βH , (151) Pm = P (Em ) = m without any reference to U and W (so that we write H in place of HS ). Z is usually called partition function. Note that the sum in Eq. (152) involves all microstates including, in particular, all degenerate components of degenerate levels. An equivalent 1 The number Ω W of microstates of a huge “rest of the universe” increases roughly exponentially with its energy, thus the expansion of log ΩW is much more accurate and better convergent than that of ΩW . 3.1. EQUILIBRIUM ENSEMBLES 99 formulation of the same sum is: (153) Z= X g(E)e−βE . E where g(E) is the degeneracy, i.e. the number of states at energy E. Note also that the trace in Eq. (152) can be taken on any basis of convenience. The density operator associated to this equilibrium probability distribution is diagonal in the energy representation: X e−βEm 1 |mihm| = e−βH . (154) ρ̂eq = Z Z m This same density operator can however be written on any basis, by applying a suitable unitary transformation. 3.1.1. Connection to thermodynamics. Assume that the system S can be partitioned into two weakly interacting subsystems S1 and S2 (Em1 m2 ≈ E1m1 + E2m2 ). When thermal equilibrium is established between S1 , S2 , and the rest of the world W, the distribution e−β(E1m1 +E2m2 ) , −β(E1n1 +E2n2 ) n1 n2 e Pm1 m2 = Z −1 e−βEm1 m2 = P can be factorized into Pm1 m2 e−βE1m1 e−βE2m2 e−βE1m1 e−βE2m2 P P = · . · = −βE1n1 −βE2n2 Z1 Z2 n1 e n2 e This means that both subsystems at equilibrium follow Boltzmann statistics PmSii = e−βEimi /Zi , with the same β parameter. This suggests that the intensive quantity β, with dimensions inverse energy, might be a function of temperature. The natural statistical definition for the thermodynamic internal energy U is the average of the energy operator X 1X (155) U = [H] = Tr(ρ̂H) = Em Pm = Em e−βEm . Z m m Deriving U with respect to β yields the squared energy fluctuation changed in sign (156) P P 2 −βEm −βEm 2 2 −Z m Em + e ∂U 2 m Em e 2 = − (H − [H]) . − [H] = = − H ∂β Z2 Thus, ∂U ≤ 0, i.e. the internal energy decreases when β increases. This suggests ∂β that β and temperature T might be inversely related. The determination of the precise relationship is sketched below. 100 3. STATISTICAL PHYSICS First, note that Z is a multiplicative function (Z = Z1 Z2 for a system composed of two subsystems), thus its logarithm is an additive function (ln Z = ln Z1 + ln Z2 ), thus it must represent a thermodynamical extensive quantity. Define the function (157) F =− ln Z . β With this definition, we obtain a relation between the β-derivative of the extensive quantity ln Z and the internal energy U : P −βEm ∂ ∂(βF ) 1 ∂Z ∂(ln Z) me ∂β = − = − = = − ∂β V,N ∂β Z ∂β Z P ∂ −βEm P Em e−βEm m ∂β e (158) = m = [H] = U , =− Z Z the average energy, according to its statistical definition (155). Equation (158), identifies the β derivative of βF to the internal energy. But this relation recalls a known thermodynamical identity involving the free energy and temperature: starting from the basic definition of the free energy F = U − T S (S = entropy), one finds ∂(F/T ) ∂F −2 −1 ∂F 2 = −T −T F + T . = −T 2 (159) U = F + T S = F − T ∂T ∂T ∂T This can be written more compactly in terms of a derivative w.r.t. the inverse temperature: (160) ∂(F/T ) =U. ∂(1/T ) In Eqs. (159) and (160) all T -derivatives are carried out at constant number of particles N and volume V . By comparing Eq. (158) and Eq. (160) we see that • it is perfectly natural to identify the function F , defined statistically in Eq. (157), with the free energy F of thermodynamics; • β must then be proportional to the inverse temperature 1/T . The proportionality constant between T and 1/β is known as Boltzmann constant kB , and represents the numerical conversion factor between temperature and energy: 1 (161) β= . kB T The numerical value of kB , the ratio of the gas constant to the Avogadro constant kB = NRA = 1.380658 · 10−23 J/K= 86.1734 µeV/K, is obtained by comparison of Eq. (189) below with the empirical definition of temperature through the ideal-gas thermometer. In statistical mechanics, temperature always appears in the energy combination kB T , and even more often in the β notation. 3.1. EQUILIBRIUM ENSEMBLES 101 1 1 P(Em) P(Em) 0.1 0.5 0.01 0.001 0 0 (a) 2 4 6 Em 8 0 (b) 2 4 6 8 Em Figure 3.2. The statistical meaning of temperature: (a) in a system at equilibrium with the “rest of the universe” the probability P (Em ) of individual microstates |mi decays exponentially with their energy Em . (b) The slope of the straight line representing P (Em ) as a function of energy in a lin-log scale is precisely −β = − kB1T . The physical meaning of temperature is now clear: the probability distribution of the microstates of any system in thermal equilibrium is determined uniquely by the energies of these states, in an extremely simple way: at a given temperature T , the probability P (Em ) of a given state of energy Em is proportional to exp[−βEm ]. Thus, −β = − kB1T is the slope of the straight line representing P (Em ) as a function of Em in a lin-log scale, as in Fig. 3.2. Note that, because of the normalization factor Z −1 in Eq. (151), all the energies of all states in the system determine the precise equilibrium probability for the occurrence of a given energy eigenstate |mi. However, only the energy difference of two states |mi and |ni determines their probability ratio PPmn = exp[β(En − Em )]. The discussed relations, in particular Eq. (157), establish an explicit link between statistical mechanics and thermodynamics, i.e. between the microscopic dynamics and macroscopic observable average properties. Based on Eq. (157), one can then extract all thermodynamical properties of a system at equilibrium purely from its partition function Z. Here follows a summary of several basic extensive and intensive 102 3. STATISTICAL PHYSICS thermodynamical quantities: (162) (163) free energy internal energy ln Z β ∂(βF ) U = F + TS = ∂β F = −kB T ln Z = − V,N (164) (165) free enthalpy enthalpy (166) entropy (167) heat capacity (168) pressure ∂(ln Z) =− ∂β V,N G = F + PV H = F + PV + TS U U −F ∂F = + kB ln Z = S=− ∂T V,N T T 2 ∂U 2 ∂U 2 ∂ ln Z = −k β = k β CV = B B ∂T V,N ∂β V,N ∂β 2 2 ∂F ∂(F/N ) N P =− = . ∂V β,N V ∂(N/V ) β 3.1.2. Entropy and the second principle. The connection (166) of entropy with statistics is only valid at equilibrium. A much more general definition of entropy is available, which applies for any statistical distribution of the quantum states defined by an arbitrary density operator ρ̂ (not necessarily diagonal in the energy representation): (169) S = −kB Tr(ρ̂ ln ρ̂) . This definition conforms to the intuitive idea of entropy P as a measure of disorder. Consider the basis |ρm i where ρ̂ is diagonal. Here, ρ̂ = m Pm |ρm ihρm |. On this basis X (170) S = −kB (Pm ln Pm ) . m In this form we can compute S for the most ordinate distribution, a pure state, with Pm = 1 for a single state m, and 0 for all others, and obtain S = 0 since all terms vanish. S increases when several states have nonzero probabilities. At the opposite limit, a completely random distribution with equal probability Pm = Ω1 for a number P 1 1 Ω of states yields S = −kB Ω m Ω ln( Ω ) = kB ln(Ω). We conclude that entropy is a logarithmic measure of the number of states that the system accesses. When the system is at equilibrium, the general statistical definition (169) coincides with Eq. (166). This is readily verified by substituting the equilibrium density 3.2. IDEAL SYSTEMS 103 operator (154) into Eq. (169), and evaluating the trace on the energy basis: P −βEm e−βEm P −βEm X ln Z (ln e−βEm − ln Z) e me S = −kB Pm ln Pm = −kB = −kB m Z Z m P −βEm (βEm + ln Z) e U Z(ln Z) (171) = kB β [H] + = + kB ln Z . = kB m Z Z T Moreover, it is possible to show [?] that if ρ̂ is the density matrix of the GibbsBoltzmann equilibrium distribution (154), the entropy of the system, defined according to Eq. (169), is maximum under the constraint of assigned internal energy (172) Tr(ρ̂H) = U . This result applies to an arbitrary system at fixed internal energy, volume, and number of particles. It tells us that any generic density operator ρ̂gen , with no other restriction, yields an entropy smaller than or equal to that of the equilibrium distribution ρ̂eq of Eq. (154). It is observed experimentally that any isolated system evolves spontaneously toward equilibrium: then the result of maximum entropy proves the second principle of thermodynamics, that, in a spontaneous transformation toward equilibrium ρ̂gen → ρ̂eq , entropy is bound to increase: ∆S = Seq − Sgen ≥ 0. This proof can be generalized to of a system not necessarily isolated, but for our purposes it suffices to retain that the experimental fact of entropy increase in the approach to equilibrium can be understood on statistical grounds. Many textbooks, including Refs. [?, ?, ?], delve into the ideas briefly introduced in the present Section in greater detail. 3.2. Ideal systems Before attempting any understanding of the complicated macroscopic properties of structured objects such as a bicycle or a bowl of soup, physicists have wisely chosen to investigate simpler systems: (usually) macroscopically homogeneous pure (or controllably “doped”) bunches of atoms or molecules all of the same kind, or of few kinds. These simpler systems constitute the wide class of materials. The rationale behind studying materials is that the properties of a complex structured object can be understood in terms of the functionality of the individual pieces it is composed of, whose function, in turn, depends on their shapes and material properties. For example, to a very good approximation the total heat capacity of a bicycle equals the sum of the heat capacities of the composing pieces. The macroscopic properties of a material are often studied in the limit of an infinitely wide sample, for which the interactions with the surrounding environment (e.g. the containing vessel) are sufficiently weak to justify the assumptions of Sec. 3.1. 104 3. STATISTICAL PHYSICS As the surface atoms/molecules directly interacting with the environment usually involve a layer about ∼ 1 nm thick, for any sample whose linear dimensions (all three of them) are larger than, say, ∼ 1 µm, the error induced by this bulk approximation should not be too bad. Even within the idealizations of a homogeneous bulk sample, the recipe (1) compute the spectrum of energies Em and eigenstates |mi of the system; (2) at given temperature T (or β = kB1T ), compute the partition function Z, Eq. (152); (3) generate the equilibrium density operator ρ̂ using Eq. (154); (4) compute macroscopic average quantities as [B] = Tr(ρ̂B), as in Eq. (144); is not really applicable for any realistic system, due to the difficulty of step 1. However, for a wide class of ideal systems the programme of statistical mechanics can be carried to satisfactory conclusion. By ideal systems we mean systems composed of individual components (particles) whose mutual interactions are completely negligible. Ideal systems have (173) H= N X Hi , i=1 where Hi governs the dynamics of a small set of degrees of freedom, e.g. the position and spin of a particle. The simplicity of many ideal systems permits to obtain exact partition functions, and thus to understand their thermodynamics by means of statistical methods. No ideal system exists in nature (although photons and neutrinos are a very good approximation to non-interacting particles), and if one did exist, strictly speaking the methods of equilibrium statistical mechanics would be irrelevant for that system since it would be non ergodic and have no means of reaching equilibrium. However, many properties of weakly interacting systems are fairly close to those of ideal systems. Many properties of actual materials can be understood qualitatively, and often even quantitatively, in terms of properties of ideal systems. Of course, the attention devoted to ideal systems should never make us forget their idealized nature, and the fact that quantitative understanding of real thermodynamical properties of real materials often require more sophisticated conceptual tools. For an ideal system, the natural basis of states is a (properly symmetrized) factored basis (89), where the αi quantum numbers identify the state |αi i of “particle” i in a system that (for a start) we assume to be composed of a single type of identical noninteracting particles. These could be individual noninteracting electrons, or individual noninteracting atoms, or individual noninteracting molecules... If it was 3.2. IDEAL SYSTEMS 105 Figure 3.3. The typical occupancy distribution of bosons among single-particle energy levels for (a) low temperature, (b) intermediate temperature, and (c) high temperature, assuming that, as is often the case, the single-particle spectrum has no upper bound, so that most individual levels remain unoccupied. not for symmetrization/antisymmetrization, the partition function would read (174) unrestricted unrestricted X X ? Z= exp[−β(Eα1 +Eα2 +· · ·+EαN )] = exp[−βEα1 ] exp[−βEα2 ] . . . exp[−βEαN ] . α1 α2 ... αN α1 α2 ... αN with Eα indicating the single-particle eigenenergies of state |αi. However, due to the symmetrization, exchanging any two sets of quantum numbers αi and αj leads to the same symmetrized state (up to possibly a sign). The number of permutations of the quantum numbers is simply N ! for the fermions, where all αi are different. For bosons, the number of permutations of the quantum numbers giving the same state is more intricate, as in general many of the αi could be the same. To find the number of permutations, it is sufficient to consider some ordering of the singleparticle states (for example in order of increasing energy), and to note the numbers n0 , n1 , n2 , ... of indexes αi which in a given term in the sum (174) equal the lowest, first excited, second excited, ... states.2 The actual number of permutations of a set of indexes α1 , α2 , . . . , αN is N !/(n0 ! n1 ! n2 ! . . . ). This expression remains valid for fermions, as all nα equal either 0 or 1. We have therefore the “correct counting” partition function # " N X X n0 ! n1 ! n2 ! . . . exp −β Eαi . (175) Z= N ! α α ... α i=1 1 2 N Here the difference between fermions and bosons is taken care by the sum, which respects Pauli’s principle for fermions, and remains unrestricted for bosons. 2 Clearly, the sum of these occupation numbers P α nα = N . 106 3. STATISTICAL PHYSICS 3.2.1. The high-temperature limit. At low temperature, thePBoltzmann exponential factor tends to privilege those states with a total energy α nα Eα as low as possible: the system tends to order, with few single-particle states significantly occupied (Fig. 3.3a): we shall come back to this regime in Sec. 3.2.2. When temperature is high (small β), assuming, as is often the case, that the single-particle spectrum has unlimited high-energy excitations, a huge number of single-particle states (those of energy < kB T ) become almost equally likely, and the system becomes extremely disordered (Fig. 3.3c). The states with all different quantum numbers are overwhelmingly more than those with two or more or them equal. As a result, at high temperature the main contribution to the boson partition function comes from states which have nα = 0 or 1 at most. All terms with nα > 1 in the sum of (175), violating Pauli’s principle for fermions, add a comparably negligible term to a huge Z. Thus, for both bosons and fermions at high temperature, the partition function is approximately # " unrestricted N X X 1 1 (Z1 )N (176) Z≃ exp −β Z1 · Z2 · ... · ZN = , Eαi = N ! α α ... α N ! N ! i=1 1 2 N P −βEα where the single-particle partition function Z1 = only differs from the αe total Z of Eq. (152) in that it involves single-particle states |αi rather than collective states |mi. In this limit, the free energy is (Z1 )N e Z1 ln Z ≃ −kB T ln ≃ −N kB T ln , β N! N where we made use of the Stirling approximation N (178) ln(N !) ≃ N ln , e 3 which is very accurate for large N . If, as often occurs at high temperature, each of the N particles moves freely around in the volume occupied by the sample, then the motion of its center of mass can be separated as sketched in Sec. 1.1 for the one-electron atom. The translational degrees of freedom commute with the internal ones. Accordingly, the 1-particle partition function Z1 can be factored (177) (179) F =− Z1 = Z1 tr Z1 int into a translational times an internal part. The latter describes the statistical dynamics of the internal degrees of freedom of the particle (including rotations), and 3 An even more accurate version of Stirling’s formula is ln(N !) ≃ N ln Ne + but the omitted logarithmic correction is by far negligible for large N . 1 2 ln π 3 (1 + 6N ) , 3.2. IDEAL SYSTEMS 107 depend therefore on the characteristic spectrum of its excitations. Instead, the translational part is universal, in the sense that it only depends on the total mass M of the particle. To obtain Z1 tr , recall the spectrum of a freely translating particle. For simplicity, assume that the particle is contained in a macroscopic cubic box of size L × L × L. Take periodic boundary conditions, i.e. the wavefunction is the same at opposite boundaries of the box, but the results (Eq. (184) onward) would not change if we assumed that the wavefunction vanishes at the boundary (prove it!). The allowed values of the j = x, y, z momentum components 2π (180) pj = ~ kj = ~ nj nj = 0, ±1 ± 2 ± 3, . . . L are associated to plane-wave eigenfunctions ψ(~r) = L−3/2 exp(i ~k~n ·~r), of translational kinetic energy (2π~)2 (n2x + n2y + n2z ) |~p|2 = . 2M 2M L2 For macroscopically large L, the translational states form a “continuum”: the sum yielding the translational partition function X X (2π~)2 (n2x + n2y + n2z ) (182) Z1 tr = exp(−βE~n ) = exp −β 2M L2 n n n E~n = (181) ~ n x y z is conveniently approximated by the integral Z ∞Z ∞Z ∞ (2π~)2 2 2 2 (n + ny + nz ) dnx dny dnz . (183) Z1 tr = exp −β 2M L2 x −∞ −∞ −∞ This factorizes in the product of three identical Gaussian integrals.4 In total, !3 r 3 L M kB T V L (184) Z1 tr = = = 3, ~ 2π Λ Λ where we have introduced the thermal length r 2π (185) Λ=~ . M kB T We can now substitute the result for the single-particle partition function into the global partition function (176) of the gas (186) 4 Remember that Z= VN (Z1 tr )N (Z1 int )N = (Z1 int )N . N! N ! Λ3N √ x2 dx = 2π a. exp − 2 2a −∞ R∞ 108 3. STATISTICAL PHYSICS Derivation yields the translational contribution to the internal energy (187) ∂ VN ∂ ∂ 3N 3 ∂ 1/2 ln Ztr = − ln = 3N ln Λ = 3N ln β = = N kB T , Utr = − ∂β ∂β N ! Λ3N ∂β ∂β 2β 2 compatible with the experimentally well established translational contribution 23 kB per particle to the heat capacity of high-temperature gases. Likewise, we obtain an expression for the free-energy contribution of the translational motion: eV e Z1 tr Z1 int = −N kB T ln + ln Z1 int . (188) F = −N kB T ln N N Λ3 Observe that Z1 int describes the internal degrees of freedom: it can therefore depend on T , but not on the volume V of the sample. This observation and the definition of pressure (168) allow us to obtain a remarkably general equation of state for the ideal gas (189) ∂ ∂F eV eV N kB T ∂ = − P =− −N k T ln + ln Z ln = . = N k T B 1 int B ∂V β,N ∂V N Λ3 ∂V N Λ3 V T,N This relation, obtained on purely statistical grounds, is equivalent to the well es, universally and quantitatively tablished equation of state of perfect gases P = nRT V valid for atomic and molecular gases at high temperature and low density. The , with n moles numerical value kB = NRA is determined by comparison N kVB T = nRT V of gas containing nNA particles. Other quantities are accessible experimentally. For example, the single-particle center-mass kinetic-energy distribution, yields the probability to find a particle in a given energy interval. To obtain this distribution, we need the energy density of translational states (190) M 3/2 V 1/2 E gtr (E) = √ 2 π 2 ~3 compatible with the kinetic-energy expression (181).5 The single-particle centermass kinetic-energy probability distribution is then (in the spirit of Eq. (153)) the product of the density of states times the Boltzmann probability that a given state 5 This density of states can be obtained from the observation that the ~n values are evenly distributed with unit density. According to Eq. (181), the energy is proportional to the squared length of the ~n vector E~n = A|~n|2 . Therefore, the number of states with energy ≤ E equals the p volume of the ~n-sphere of radius |~n| = E/A. The density of states is the derivative of this E 1/2 d 4π E 3/2 = 2π A number of states with respect to energy: gtr (E) = dE 3/2 , which gives Eq. (190) 3 A after substituting the value of A = (2π~)2 2M L2 (from Eq. (181)), and V = L3 . 3.2. IDEAL SYSTEMS 109 T=100 K 0.05 dP/de [1/meV] 0.04 0.03 0.02 T=300 K 0.01 0 0 20 40 60 80 e [meV] Figure 3.4. The distribution (191) of the center-mass kinetic energy of the high-temperature ideal gas. is occupied: (191) e−βE 2 (2M )3/2 V 1/2 Λ3 −βE dP (E) = gtr (E) E e = √ β 3/2 E 1/2 e−βE . = 2 3 dE Z1 tr 4π ~ V π This probability distribution is remarkably universal: it does not even depend on the mass of the particles, but only on temperature (see Fig. 3.4). Similarly, one determines the center-mass velocity distribution. Each component of ~v = Mp~ is Gaussian-distributed as r M vj2 dP (vj ) βM exp −β = . (192) dvj 2π 2 To obtain the distribution of speed v = |~v |, one can simply observe that the kinetic energy E and speed v are connected by E = M2 v 2 , and use the distribution (191): (193) r 1/2 βM 2 dP (E) dE dP (E) 2 3/2 M v 2 dP (v) 2 2 −β M v = = Mv = Mv √ β (M β)3/2 v 2 e− 2 v . e 2 = dv dE dv dE 2 π π We find the celebrated Maxwell-Boltzmann equilibrium velocity distribution, reported in Fig. 3.5. The v 2 factor bears the same “polar” origin discussed for the radial distribution of one-electron wavefunctions in Sec. 1.1.3: one should not forget 110 3. STATISTICAL PHYSICS 0.6 dP/dv [1/w] 0.5 0.4 0.3 | | | [v^2]^1/2 [v] vmax 0.2 0.1 0 0 1 2 3 4 v/w Figure 3.5. The distribution (193) of the single-particle center-mass speed of the high-temperature ideal√gas, rescaled by w = (βM )−1/2 . This distribution q peaks at vmax = 2 w ≃ 1.414 w, while the mean 8 velocity is [v] = w ≃ 1.596 w, and the mean square velocity is π √ [v 2 ]1/2 = 3 w ≃ 1.732 w. that the velocity ~v with largest probability is ~v = ~0. Observations of the distribution confirm the statistical analysis (see Fig. 3.6). Internal degrees of freedom do not affect the equation of state and the translational distributions. However, they do contribute an additive temperature-dependent term to the free energy (188) (the part Fint (T ) = N F1 int (T ) = −N kB T ln Z1 int ), to the internal energy 3 (194) U = N kB T + N [F1 int (T ) − T F1′ int (T )] , 2 and therefore also to the heat capacity 3 (195) CV = N kB − N T F1′′int (T ) . 2 The internal term vanishes for structureless particles (e.g. electrons), while it contributes significantly to thermodynamics whenever the internal degrees of freedom are associated to excitation energies not too different from kB T . This occurs occasionally for atoms, and occurs all the times for molecules, as illustrated in the following examples. 3.2. IDEAL SYSTEMS (a) 111 (b) Figure 3.6. (a) The velocity distribution of the vapor is probed [?] by analyzing the atoms emerging from a oven through a tiny hole, by letting them through the spiraling slot of a rotating cylinder to a tungsten surface ionization detector. (b) Typical observed distribution of the center-mass velocities for high-temperature K vapor, compared to the curve of Eq. (193). 3.2.1.1. Internal degrees of freedom of molecules. We report here briefly the example of diatomic molecules, in the approximation that rotational and vibrational motions are independent (Sec. 2.3). The partition function factorizes Z1 int = Z1 vib Z1 rot , and therefore the free energy F1 int = −kB T ln Z1 int = F1 rot + F1 vib . The rotational partition function is determined uniquely by the dimensionless ratio ~2 β ~2 /(2I) = Θrot /T , where the characteristic Θrot = 2Ik : B ∞ X Θrot l(l + 1) . (196) Z1 rot = (2l + 1) exp − T l=0 This sum6 cannot be evaluated in closed form. However, the characteristic temperatures Θrot are often very small (e.g. 85.4 K for H2 , 15.2 K for HCl, 2.86 K for N2 ). When T ≫ Θrot the exponential in Z1 rot changes slowly with l and many terms 6 The expression (196) is actually incorrect for the special case of homonuclear molecules, where nuclear spin and indistinguishableness play a role. Correct expressions for homonuclear molecules are recovered essentially by multiplying Θrot by two. 112 3. STATISTICAL PHYSICS 2 1 0.8 Cv1 rot/kB U1 rot/Erot 1.5 1 0.6 0.4 0.5 0.2 0 0 0 (a) 0.5 1 Τ/Θ 1.5 2 0 (b) 0.5 1 Τ/Θ 1.5 2 Figure 3.7. The temperature dependence of the rotational contribution to the internal energy per molecule U1 rot and heat capacity per molecule CV 1 rot . contribute to the sum in Eq. (196): it is a good approximation to replace it with an integral, which is elementary: (197) Z Z ∞ ∞ Θrot T Θrot exp − l(l + 1) dl = y dy = [T ≫ Θrot ] . (2l+1) exp − Z1 rot ≃ T T Θrot 0 0 The high-temperature rotational contribution to the thermodynamic functions is therefore: T Frot ≃ −kB T ln (198) F1 rot = N Θrot Urot U1 rot = (199) ≃ kB T N CV rot (200) ≃ kB CV 1 rot = N T Srot ≃ kB + kB ln (201) S1 rot = N Θrot (see Eqs. (162), (163), (166), (167)). At lower temperature Tp∼ Θrot instead, truncating the series (196) to a finite number of terms lmax ≈ 2 T /Θrot approximates Z1 rot better. Figure 3.7 reports the temperature dependence of the rotational heat capacity and internal energy per molecule. Characteristically, as temperature is raised, the rotational degree of freedom “unfreezes”, reaching the classical equipartition limit at large temperature T ≫ Θrot . Relative equilibrium populations 3.2. IDEAL SYSTEMS 113 1 2 1.5 Cv1 vib/kB U1 vib/Evib 0.8 1 0.6 0.4 0.2 0.5 0 0 0.5 (a) 1 Τ/Θ 1.5 2 0 (b) 0.5 1 Τ/Θ 1.5 2 Figure 3.8. The exact temperature dependence of the vibrational contribution to the internal energy per molecule U1 vib and heat capacity per molecule CV 1 vib . Pl = (2l + 1) exp − ΘTrot l(l + 1) /Z1 rot account for the observed relative intensities of the rotational structures in molecular spectra (Figs. 2.8 and 2.9).7 In the harmonic approximation (138), the vibrational partition function is determined uniquely by the dimensionless ratio β̃ = β ~ω = ΘTvib , where the characteristic Θvib = k1B ~ω: (202)Z1 vib 7 ∞ 1 1 Θvib X Θvib Θvib v+ = exp − exp −v = exp − T 2 2 T T v=0 v=0 1 1 Θvib 1 = . = exp − Θvib 2 T 1 − exp − T 2 sinh Θ2Tvib ∞ X Here the dipole matrix element averaged over the initial (2l+1)-degenerate state and summed over the final states lf = l + 1 (R branch) or lf = l − 1 (P branch) is taken independent of l. Only l+1 the radial part of the matrix element is, while calculation shows that the angular part equals 2l+1 l for the R branch and 2l+1 for the P branch. For l not too small, these factors are both close to 0.5, and are therefore often taken as constant. 114 3. STATISTICAL PHYSICS Here, the series is evaluated in closed form, thus yielding an exact expression valid at all temperatures. The vibrational thermodynamical functions are therefore: h i ~ω Fvib (203) = + kB T ln 1 − exp(−β̃) F1 vib = N 2 Uvib ~ω ~ω (204) U1 vib = = + N 2 exp(β̃) − 1 " #2 2 CV vib β̃/2 kB β̃ CV 1 vib = (205) = = kB N 2 cosh(β̃) − 2 sinh(β̃/2) h i kB β̃ Svib (206) = −kB ln 1 − exp(−β̃) + S1 vib = N exp(β̃) − 1 (see Eqs. (162), (163), (166), (167)). The temperature dependence of the vibrational heat capacity and internal energy per molecule is drawn in Fig. 3.8. In analogy to rotations, as temperature is raised, the vibrational degree of freedom “unfreezes”, reaching the classical equipartition limit at large temperature T ≫ Θvib . Note that the large-T limit for a single oscillator provides a contribution kB T to U1 (and thus kB to CV 1 ) equal to that of the two rotational degrees of freedom. The reason is that one harmonic oscillator is associated to two quadratic terms (kinetic and potential) in the Hamiltonian, rather than one as for each rotational degree of freedom (only kinetic). Typical vibrational temperatures of a few diatomic molecules are: 6300 K for H2 , 4300 K for H 35Cl, 3400 K for N2 , 403 K for K 35Cl. It is seen that, contrary to the rotational unfreezing, the vibrational transition from the quantum-frozen to the classical regime is fairly accessible to heat capacity measurement, which find good accord with the statistical model. Discrepancies at high temperature are mainly due to the failure of the harmonic approximation. 3.2.1.2. Isolated (spin) degrees of freedom. It is straightforward to apply the Boltzmann formalism to the statistics of the magnetic moments carried by atoms or molecules in a gas, through the appropriate partition function Z1 int . However, the same statistical methods can be applied also to equal isolated non interacting degrees of freedom, e.g. atomic magnetic moments carried by dilute magnetic impurities in a solid or a liquid. For brevity, we refer to “spins”: in reality the magnetic moment is generally proportional to some total angular momentum. For any system whose Hilbert space of states is finite-dimensional (and not too large...), it is possible to compute explicitly the partition function Z1 int . For ex~ = B ẑ as in ample, an angular momentum J coupled to a total magnetic field B Eq. (43) spans a (2J + 1)-dimensional space of states, a basis of which is labeled by the Jz projection quantum number MJ . The “spin” partition function is the sum of 3.2. IDEAL SYSTEMS 115 0 0.4 -0.1 Cv1 spin/kB U1 spin/Espin 0.3 -0.2 -0.3 0.2 0.1 -0.4 0 -0.5 0 1 (a) 2 Τ/Θ 3 4 5 0 (b) 1 2 Τ/Θ 3 4 5 Figure 3.9. Temperature dependence of the internal energy per spin U1 spin (210) and heat capacity per spin CV 1 spin (211) characterizing the thermodynamics of a magnetic moment in a uniform field B, associated to a characteristic temperature scale ΘB = k1B Espin = 1 gµB B. kB (2J + 1) terms corresponding to the levels of Eq. (87): (207) Z1 spin = J X MJ =−J exp −β̃ MJ , where the ratio β̃ = β gj µB B = ΘB /T is defined in terms of the characteristic ΘB = gj µB B/kB (gj is the relevant g-factor). It is a simple exercise to evaluate Z1 spin for any given value of J. For the reader’s convenience, we summarize here the results for the simplest case spin J = 21 : (208) Z1 spin ΘB = 2 cosh 2T β̃ = 2 cosh 2 1 spin− . 2 116 3. STATISTICAL PHYSICS The spin- 12 thermodynamic functions are therefore: β̃ Fspin = −kB T ln 2 cosh = N 2 (209) F1 spin (210) U1 spin = (211) CV 1 spin (212) S1 spin ! Uspin gµB B β̃ =− tanh N 2 2 2 kB β̃ CV spin = = N 2 cosh(β̃) + 2 " ! # β̃ β̃ β̃ Sspin = kB ln 2 cosh − tanh . = N 2 2 2 The temperature dependence of the internal energy and heat capacity of each magnetic moment is drawn in Fig. 3.9. Like for molecular rotations and vibrations, as temperature is raised, the spin degree of freedom “unfreezes”, with an increase of the heat capacity. However, when temperature is further raised, due to the finite spectrum, the internal energy cannot increase indefinitely: U1 spin flattens out, thus the spin heat capacity decays at large temperature. The density of magnetization of such a system of magnetic moments ~ = N [~µ1 ] M V (213) is a quantity of straightforward experimental accessibility and relevance. In such an ~ is necessarily oriented parallel to the magnetic field, M ~ = M ẑ. We ideal system M have [µ (214) z 1 ] = Tr(ρ1 µz 1 ) = J X MJ =−J Mz = (215) J exp(−β̃MJ ) 1 X U1 spin =− EMJ PMJ = − Z1 spin B M =−J B J 1 gµB N β̃ spin− . = tanh 2V 2 2 −gµB MJ N [~µz 1 ] N U1 spin =− V VB The average magnetization changes with temperature following the same functional dependency as the internal energy, apart for a trivial factor − VNB , thus Fig. 3.9a can also be read as magnetization as a function of temperature for the J = 21 system. For weak field, β̃ → 0, the hyperbolic tangent can be expanded to lowest order, ~ obtaining the linear response of the localized spins to a weak total field B: (216) 1 N gµB 2 1 gµB N spin− . β̃ = χB B , with χB = Mz ≃ 4V V 2 kB T 2 3.2. IDEAL SYSTEMS 117 χB represents the (weak-field) magnetic susceptibility. The characteristic inverse-T dependency of the Curie susceptibility of free spins reflects the disordering effect of temperature. In practice it is more standard to measure the susceptibility χH relative to the external applied field strength H = ǫ0 c2 Bext [in A/m]. The relation between χB and χH derives from h i h i ~ = χB B ~ = χB B ~ ext + B ~ int = χB (ǫ0 c2 )−1 H ~ + (ǫ0 c2 )−1 M ~ , (217) M ~ int = (ǫ0 c2 )−1 M ~ for the magnetic field of a uniformely magnewhere the relation B ~ into evidence, tized material is used. We put M ~ = M (218) (ǫ0 c2 )−1 χB ~ H, 1 − (ǫ0 c2 )−1 χB thus χH = (ǫ0 c2 )−1 χB 1 − (ǫ0 c2 )−1 χB yields the desired relation for the dimensionless susceptibility χH . 3.2.2. Degenerate Fermi and Bose gases. In Sec. 3.2.1 we found that, at high temperature, non-interacting fermions and bosons behave in an equivalent manner, as an ideal classical gas, plus (possibly) internal degrees of freedom. At low temperature however, spectacular fermion/boson differences show up. The calculation of the exact partition function Z (175) at arbitrary T can be carried out by replacing the N sums over the single-particle quantum numbers αi with sums over the occupation numbers nα . To this purpose, the toPN P tal energy in the exponential i Eαi = α nα Eα is written in terms of occupation numbers n of the single-particle states, so that the exponential factorizes α P Q exp (−β α nα Eα ) = α exp(−βnα Eα ): (219) X Y X X n0 ! n1 ! n2 ! . . . P P nα e−β α nα Eα = e−βEα e−β α nα Eα = . Z= N ! {nα } {nα } α α α ... α 1 2 N P α nα =N P α nα =N The occupancies nα are 0 or 1 for P fermions, and 0, 1, 2, 3, ... for bosons. The binomial coefficients correcting the α1 ... αN sum for overcounting, is suppressed in the nα -sum, as the states identified by the occupation numbers are counted correctly, without undue overcounting. The constraint of fixed total number of particles N makes the sum over the occupancies in Eq. (219) extremely difficult to compute: without this constraint, one could exchange sum and product, and the sums would all look alike. To get rid of the constraint we use a trick: replace the canonical ensemble, where the number of particles is fixed, with the grand canonical ensemble, where N is allowed to vary, to describe a thermodynamical system which is (weakly) exchanging not only energy but also particles with the rest of the universe. Figure 3.10 illustrates 118 3. STATISTICAL PHYSICS E U [N] N U −µ N −µ N Figure 3.10. At fixed temperature, the internal energy is generally a convex function of the number of particles N , which is usually minimum for N = 0. The addition of −µN shifts the equilibrium to some finite average number of particles [N ], which is an increasing function of the parameter µ. the need to subtract µN to the energy eigenvalue in the expression (152) of the partition function. For an arbitrary system of equal particles, summing the canonical Z over N yields the correct grand partition function (220) Q= ∞ X X N =1 m(N ) −β(Em(N ) −µN ) e −β(H−µN̂ ) = T̃r e , where N̂ is the operator counting the number of particles, and the T̃r indicates summing over all states and all numbers of particles. Q plays the same role for the grand canonical ensemble as Z for the canonical ensemble. A reasoning similar to that of Sec. 3.1.1 yields the relation between the grand partition function and thermodynamics. In particular, β identifies with the inverse temperature and µ with the chemical potential. The basic relation linking Q to the thermodynamical 3.2. IDEAL SYSTEMS 119 potentials (analogous to Eq. (162)) is the first in the following list: (221) (222) (223) (224) J(T, V, µ) F U G V P (T, µ) = kB T ln Q(T, V, µ) µ [N ] − J µ [N ] − J + T S J + F = µ [N ] ∂J ∂P [N ] = =V ∂µ T,V ∂µ T ∂G ∂F = µ = ∂[N ] T,V ∂[N ] T,P ∂J ∂P S = =V . ∂T µ,V ∂T µ (225) (226) (227) = = = = The listed relations allow us to compute all thermodynamical quantities for a system at equilibrium with a reservoir of energy and of particles. Further details are discussed in Ref. [?]. Armed with this new tool, we proceed to compute the grand partition function Q forP a system of noninteracting identical bosons or fermions. Remembering P that N = α nα , we first rearrange the exponent of Eq. (219) and then use the N to get rid of the constraint Q = ∞ X N =0 = X {nα } X P −β [( α nα Eα )−µN ] e = N =0 P {nα } α nα =N P e α β(µ−Eα )nα ∞ X = XY X e−β [( P nα Eα )−µ P α nα ] P {nα } α nα =N eβ(µ−Eα )nα = YX α {nα } α α nα eβ(µ−Eα )nα = YX α nα eβ(µ−Eα ) nα Next, observe that the occupancy numbers nα are mute indexes over which it is summed. There is no special reason to distinguish among them: just call them n. Accordingly, we write the grand partition function as " # Y X n (228) Q= . eβ(µ−Eα ) α n The sum in square brackets can be carried out explicitly: for fermions n = 0, 1, thus it equals 1 + eβ(µ−Eα ) ; for bosons, n = 0, 1, 2, . . . , therefore that sum is a geometric series, whose summation8 gives (1−eβ(µ−Eα ) )−1 . By introducing a quantity θ, θ = +1 8 This series converges only if the number eβ(µ−Eα ) < 1, i.e. only if µ is smaller than the smallest single-particle energy Eα , which is usually 0. Therefore µ must be negative for bosons. . 120 3. STATISTICAL PHYSICS for bosons and θ = −1 for fermions, we can write Q in a form factorized over different single-particle states and valid for both bosons and fermions: Y −θ (229) Q= 1 − θeβ(µ−Eα ) . α The grand potential (221) provides the connection to thermodynamics: X (230) J = P V = kB T ln Q = −θkB T ln 1 − θeβ(µ−Eα ) . α This expression and Eqs. (222)-(227) determine all the thermodynamics of noninteracting bosons/fermions. For noninteracting particles, the single-particle states |αi are labeled, as discussed in Sec. 3.2.1, by the single-particle momentum p~, plus possibly internal degrees of freedom. At low temperature any nontrivial internal dynamics is usually “frozen”. Only gs degenerate states remain accessible, and we may collect them in a spin variable σ (representing for example the ẑ projection of the total angular momentum of the particle). In practice, like in Eq. (182), the α-sum is a sum over nx , ny , nz , and (possibly) spin σ. As the translational levels are very dense, the n-sum can be replaced by an integration over energy E, weighted (like in Eq. (153)) with the density of translational states gtr (E) computed in Eq. (190): Z ∞ Z ∞ X X X (231) → → gs dE g(E) . dE gtr (E) → α σ nx ny nz 0 0 Here we introduce the total density of states g(E) = gs gtr (E). We obtain an equation of state Z ∞ g(E) J dE (232) = −θkB T ln 1 − θ eβ(µ−E) P = V V 0 Z (2M )3/2 ∞ √ β(µ−E) dE = −θkB T gs E ln 1 − θ e . 4π 2 ~3 0 This can be rewritten in terms of the variable Ẽ = Eβ, and integrated by parts to obtain: (233) Z ∞ p 2 2 3/2 Z ∞ Ẽ 3/2 βµ−Ẽ −3 5/2 (2M ) √ , Ẽ ln 1 − θe = d Ẽ d Ẽ k T g Λ P = −θ(kB T ) gs B s 4π 2 ~3 0 3 π eẼ−βµ − θ 0 where the thermal length Λ was defined in Eq. (185). This equation of state for the ideal gas of bosons/fermions expresses the pressure in terms of temperature9 and 9 In introducing the present discussion we addressed particularly the low-temperature regime. In fact, the only assumption involving temperature is that all internal degrees should be either frozen or included in gs . As long as no other internal degree of freedom plays any role, Eqs. (230) and (233) hold for any temperature. 3.2. IDEAL SYSTEMS 121 chemical potential. This is unpractical, since µ is a quantity of difficult experimental access: it would be preferable that P was expressed in terms of T and the density [N ] , like in the high-temperature limit Eq. (189). Unfortunately, in general there is V no simple analytic expression of µ as a function of the density, thus a convenient explicit equation of state is not really available. More interestingly, by computing the internal energy U , it may be verified that the high-temperature ideal-gas relation 2U 3V applies at any temperature, for both bosons and fermions, even though from the point of view of the equation of state (233), at low temperature ideal bosons and fermions are far from an “ideal gas” in the thermodynamical sense. By a high-temperature power expansion, it is possible [?] to derive the leading correction to the ideal-gas behavior Eq. (189) due to quantum statistics: (235) Λ3 [N ] (2π)3/2 ~3 [N ] [N ] for δ = kB T 1 − θ 2−5/2 δ + O(δ)2 , = ≪ 1. P = V gs V gs (M kB T )3/2 V (234) P = The explicit form of the “degeneracy” parameter δ makes it clear that Eq. (235) is a high-temperature and low-density expansion. The sign of the leading correction shows opposite tendencies of boson statistics to reduce pressure, and of fermion statistics to increase it. This is a first high-temperature hint of the better “social” character of bosons versus fermions. The experimental verification of these corrections in real gases (e.g. of atoms) is extremely difficult, as interatomic interactions introduce corrections to the pressure of the same order as or larger than those associated to indistinguishableness. At low temperature, the behavior of ideal bosons and fermions becomes radically different. The average occupation number [nα ] of single-particle states is a clear indicator of these differences. This is obtained as P Q P P ( n neβ(µ−Eα )n ) · α′ 6=α n eβ(µ−Eα′ )n neβ(µ−Eα )n Q P β(µ−E ′′ )n [nα ] = = Pn β(µ−Eα )n . α ne α′′ ne P ∂ This fraction is recognized as ∂(βµ) ln[ n eβ(µ−Eα )n ]. After Eq. (228) above, we computed the sum within the logarithm, and found the result [1−θeβ(µ−Eα ) ]−θ . Therefore, we have [nα ] = ∂ −θeβ(µ−Eα ) ∂ ln[1 − θeβ(µ−Eα ) ]−θ = −θ ln[1 − θeβ(µ−Eα ) ] = −θ , ∂(βµ) ∂(βµ) 1 − θeβ(µ−Eα ) with the final result (236) [nα ] = 1 eβ(Eα −µ) − θ . 122 3. STATISTICAL PHYSICS In a single formula, this equation collects the celebrated Bose-Einstein distribution (237) [nα ]B = 1 eβ(Eα −µ) − 1 and Fermi-Dirac distribution (238) [nα ]F = 1 eβ(Eα −µ) + 1 . The average occupancy of each single-particle state depends only on temperature and on the energy of the state itself.10 However, the presence of all other particles affects each single-particle occupancy distribution through the chemical potential µ, ] and temperature T . which contains information about total particle density [N V 3.2.2.1. Fermi particles. The T → 0 limit of the fermion-gas statistics is of fundamental interest for the physics of matter. The reason is that conduction electrons in many metals can be approximately described as free non-interacting spin- 21 fermions, which at room temperature have a huge degeneracy parameter δ ≫ 1. The thermal length for electrons at 300 K is close to 4 nm, corresponding to a thermal V volume Λ3 ≈ 80 nm3 , much larger than the volume per electron [N ≈ 0.01 nm3 in ] a typical metal, thus yielding a degeneracy ratio δ ≈ 4000. Electrons in metals are thus very far from the range of validity of Eq. (235). On the contrary, their properties can often be understood in terms of the ideal Fermi gas at low, and sometimes even null, temperature. The T = 0 properties of a Fermi gas are relatively simple. At T = 0, the ground state is the only N -fermion11 state which is populated: for noninteracting fermions, elementary quantum mechanics suggests that this is simply the antisymmetric state realized by filling the N lowest-energy single-particle levels (i.e. the lowest-|~k| N/gs plane-wave states), up to some maximum single-particle energy ǫF called Fermi energy, and leaving all the states above empty, as illustrated in Fig. 3.11b. Indeed, by taking the β → ∞ limit of Eq. (238), the average occupancy becomes a step function of energy:  Eα < µ  1, 1 , Eα = µ , (239) lim [nα ]F = T →0  2 0, Eα > µ 10 And on no other property. In particular, in the absence of any applied magnetic field, occupancy is independent of ms : at all temperatures the ideal gas is in a spin-unpolarized nonmagnetic state. 11 Hence, for brevity, we adopt the symbol N for the average number of particles [N ]. 3.2. IDEAL SYSTEMS 123 ε T=0 1 [n]F εF T=0 .... 0.5 T=0.05µ/kB T=0.2 µ/kB 0 0 (a) 1 2 ε/µ (b) Figure 3.11. (a) The average filling of the single-particle levels of the ideal Fermi gas as a function of energy E (measured in units of the chemical potential µ), for three increasing temperatures. The drawn temperature 0.05 kµB corresponds to several thousand K for electrons in simple metals. (b) The filling of the single-particle levels of noninteracting fermions (here gs = 2) at T = 0. which thus identifies the chemical potential at T = 0 with ǫF . Then, by requiring that (240) Z µ Z ∞ Z (2M )3/2 V µ (2M )3/2 V gs 2 3/2 1/2 dE g(E) = gs dE g(E) [nα ]F = N= µ , dE E = 4π 2 ~3 4π 2 ~3 3 0 0 0 we establish the relation between the particle density and the chemical potential 2/3 ~2 6π 2 N (241) ǫF = µ(T = 0) = . 2M gs V p2 F ) is filled In p~-space, each state within a sphere of radius pF (such that ǫF = 2M by gs fermions, while those outside are empty. To this maximum momentum pF = 2 1/3 N (the Fermi momentum), there corresponds a maximum velocity ~kF = ~ 6π gs V pF vF = M , the Fermi velocity. Similarly, the Fermi energy is sometimes conveniently expressed as a Fermi temperature TF = kǫFB . 124 3. STATISTICAL PHYSICS In simple metals, typical densities of conduction electrons (M = me , gs = 2) of the order N ≈ 1028 ÷ 1029 m−3 (roughly the inverse cube of typical interatomic V separations) yield ǫF ≈ 2 ÷ 10 eV, i.e. TF ≈ 20000 ÷ 100000 K. This corresponds to kF ≈ 1010 m−1 , pF ≈ 10−24 kg m s−1 , and typical velocities vF ≈ 106 m s−1 . At T = 0 it is also straightforward to obtain the internal energy: (242) Z Z ∞ Z ǫF (2M )3/2 V gs 2 5/2 (2M )3/2 V gs ǫF 3/2 ǫ . dE E = U= dE E g(E) [nα ]F = dE E g(E) = 4π 2 ~3 4π 2 ~3 5 F 0 0 0 by substituting ǫF from Eq. (241). This can be expressed in terms of the density N V 3/2 If only ǫF is substituted using Eq. (240), one can obtain the easier-to-remember relation: 3 (243) U (T = 0) = N ǫF . 5 In accord with Eq. (234), the T = 0 pressure 2N (244) P (T = 0) = ǫF 5 V is very large, due to the huge average kinetic energy of the fermions, obliged by Pauli’s principle to occupy different plane-wave states. This energy depends strongly (as V −5/3 ) on the available volume, as the spacing between the momentum states (180) is inversely proportional to the size of the box containing the gas (which makes g(E) ∝ V , as in Eq. (190)). The pressure exerted by a typical electron gas in a simple metal at T = 0 is as large as P ≈ 109 ÷ 1010 Pa! A sharp electric-potential step at the surface of the metal prevents the conduction electrons from escaping the solid. The T = 0 properties of the free electron gas explain many properties of the electrons in simple metals at ordinary temperatures,12 but not those thermodynamical quantities that involve temperature explicitly, such as the heat capacity. In the T ≪ TF limit, a systematic expansion of the exact thermodynamical equation of state (233) and the relation (226) connecting µ to N yield these thermal propV erties as lowest-order corrections. The mathematical details [?] of this procedure (called Sommerfeld expansion) are slightly intricate, but the qualitative trends are straightforward. For example, one can estimate the leading dependency of µ and U on temperature by considering Fig. 3.11a: when a small temperature is turned on, the average occupancy [nα ]F changes slightly from the T = 0 step function, with 12 The accord of the free-electron theory with experimental data of many simple metals is surprisingly good in view of the strong Coulomb interactions between electrons. The reason is that long-range Coulomb forces are efficiently screened by the electron gas. An experimentally observed phenomenon which is directly related to electron-electron Coulomb repulsion is that of plasmon collective excitations, which however occur at rather high energy (few eV), and are therefore of little importance to the thermodynamics at ordinary temperatures. 3.2. IDEAL SYSTEMS empty states filled states ky 125 2 ~ M k BT / h k F δk~ kF kx Figure 3.12. Thermal excitations/deexcitations across the Fermi sphere involve mainly the occupations of the states within a skin across the Fermi sphere. For electrons in metals at ordinary temperature T ≪ TF , the skin thickness ∝ kB T is greatly exaggerated here. some weak probability that states above µ are populated and states below µ are empty. As the density of states g(E) ∝RE 1/2 is slightly larger above ǫF than below, ∞ to conserve the fermion number N = 0 g(E) [nα ]F dE the chemical potential decreases slowly as T increases.13 The internal energy U increases mainly due to the few electrons moving up from states of a skin region of thickness ≈ kB T below ǫF into states ≈ kB T above (Fig. 3.12): the energy of each excited electron increases by ≈ kB T . The number of excited electrons is of the order of the density of states g(ǫF ) times the energy interval kB T where excitation occurs. The total internal energy increases therefore by approximately g(ǫF )(kB T )2 . The precise expansion yields (245) " # 2 2 5π 3 T π2 + ... T ≪ TF , U = U (T = 0)+ g(ǫF ) (kB T )2 +... = N ǫF 1 + 6 5 12 TF where the second equality was obtained using the useful relation valid for spin- 12 free fermions 3N (246) g(ǫF ) = . 2ǫF 13 For g s = 2 the precise T -dependence of the chemical potential is µ = ǫF 1 − π2 12 T TF 2 + ... . 126 3. STATISTICAL PHYSICS Figure 3.13. Low-temperature measured molar heat capacity of metallic potassium divided by T , as a function of T 2 . For Cv ≃ a T 1 + b T 3 , the finite intercept at T 2 = 0 of the CV /T curve measures the coefficient a of the T 1 (electronic) contribution; the slope of the graph measures the coefficient b of T 3 , which is the contribution of lattice vibrations (Sec. 4.3.2). T -derivation of Eq. (245) yields the heat capacity of the low-temperature ideal Fermi gas: (247) π2 T + ... CV = N kB 2 TF T ≪ TF . The effect of Fermi statistics is to depress the heat capacity by a factor ∼ TTF with respect to the high-temperature ideal-gas value 23 N kB . The reason is that only few electrons with energy very close to the Fermi energy are involved in thermal excitations, the large majority of electrons remaining “frozen” in deeper filled states. Experimentally, a T -linear contribution to the total heat capacity is observed in solid metals at low temperature (Fig. 3.13), where other (lattice – see Sec. 4.3 below) contributions are small. The T -linear contribution is attributed to electrons: for example, the observed T -linear coefficient for potassium ≃ 2.1 mJ mol−1 K−2 π2 ≃ 1.7 mJ mol−1 K−2 (Fig. 3.13) agrees fairly with the free-electron estimate NA kB 2T F (obtained based on the experimental density N ≃ 1.3 · 1028 m−3 of conduction V electrons in potassium, which, through Eq. (241), yields a Fermi energy ǫF ≃ 2.1 eV). The Fermi gas is nonmagnetic, as all spin states are equally occupied. Within ~ = Hz ẑ linear response, the application of an external magnetic field strength H ~ = Bz ẑ, that we assume to couple only to the spin produces a total magnetic field B degrees of freedom: the gs degeneracy is lifted, and the ideal Fermi gas polarizes. ~ = Mz ẑ denotes the volume density of magnetic moment, like in Eq. (213). For M spin- 21 electrons (gs = 2), the magnetization is Mz = −µB [N↑ − N↓ ]/V . By computing the linear response to the external field, one obtains the magnetic behavior of 3.2. IDEAL SYSTEMS 127 Figure 3.14. The measured magnetic susceptibility of metallic Zr2 V6 Sb. The wide T -independent χ region is characteristic of metallic behavior. The increase of χ as T → 0 is due to the presence of magnetic impurities (see Eq. (216)). the ideal Fermi gas. At high T , according to Eq. (216), independent spins produce a Curie susceptibility χB ∝ N/T , while at small T ≪ TF the Pauli principle freezes out the spins of most electrons, the paired ones deep inside the Fermi sphere (Fig. 3.12). Only the approximately g(ǫF ) kB T electrons near the Fermi surface do spin-polarize, thus producing a substantially T -independent magnetization. Detailed calculation yields: 1/3 µ2B g(ǫF ) 3 µ2B N 31/3 qe2 N (248) Mz = Bz T ≪ TF , Bz = Bz = 4/3 V 2 ǫF V 4π me V where the last expression shows the explicit density dependence. Accordingly, the spin susceptibility of the low-temperature ideal Fermi gas is given by Eq. (218), with 1/3 1/3 2 31/3 N πN 2 −1 2 2 −1 qe (249) (ǫ0 c ) χB = 4/3 (ǫ0 c ) = α a0 . 4π me V 3V As only the electrons near the Fermi level polarize, χH is tiny, ≈ 10−5 : it is proportional to the 13 power of the density, as opposed to linear in N/V , as for isolated spins, Eq. (216). This weak T -independent paramagnetic response (Pauli paramagnetism) is a characteristic signature of metallic behavior in the experimental study of materials (see e.g. Fig. 3.14). 3.2.2.2. Bose particles. The ground state of N non-interacting bosons is even simpler than that of N fermions: they all occupy the lowest-energy state ~k = ~0 of 128 3. STATISTICAL PHYSICS energy E = 0. If a spin degeneracy gs is present, the average number of bosons of each spin flavor is N/gs . This situation is reflected in the expression (237) for [nα ]B : 1 as β → ∞, the occupancies of all positive-energy states vanish, and µ ≈ − βN → 0− in such a way to ensure that the occupancy of the E = 0 state remains finite = N . However, the density of states (190) vanishes at E = 0: this indicates that the conversion (231) of the sum over the discrete single particle states into an energy integral is missing completely the most populated state, which becomes dominant at low temperatures. A correct treatment including separately the population of this state shows a phase transition at a finite temperature 2/3 6.625 ~2 N , (250) TBE = kB 2M V signaled by the macroscopic filling of the E = 0 level below TBE .14 This lowtemperature state is called a “Bose-Einstein condensate”. Even though many atoms and molecules are bosons, they all solidify at much larger temperatures than the relevant TBE for standard densities. The only exception in condensed-matter is 4 He, and indeed a superfluid transition similar to that discussed for the ideal boson gas is observed is 4 He at low temperature and ordinary pressure. However, 4 He at low temperatures is a liquid, not a gas, therefore interparticle interactions play an important role: more sophisticated tools are necessary to understand the actual nature of the superfluid transition of 4 He. Artificial systems are being studied in recent years, where droplets of extremely cold atoms are kept in a metastable gas state inside an electromagnetic trap. These droplets of boson atoms yield the experimental realization of a Bose-Einstein condensation much more similar to that of an ideal gas than the 4 He superfluid state (Fig. 3.15). Beside actual bosons, also the thermodynamical properties of fictitious particles related to harmonic oscillators are described by the Bose-Einstein distribution. Equation (138) yields the eigenvalues of an harmonic oscillator in terms of the quantum number v representing the number of nodes of the wavefunction considered. In a polyatomic molecular context, or whenever several harmonic oscillators of frequencies ωα contribute to the dynamics, the vibrational state is labeled by all the vα quantum numbers, and the associated total energy is X 1 ~ωα . (251) Evib (v1 , v2 , ...) = vα + 2 α 14 The average filling of every individual state |αi is given by Eq. (236). If the size of the system (both N and V ) doubles, the average occupancy of |αi (and of any state |α′ i close in energy) does not change: it is the density of states (190) that doubles to take care of the extra particles. Below TBE , the E = 0 state marks an exception: its occupancy is a finite fraction of N , and if the system size is doubled, also its occupancy doubles. 3.2. IDEAL SYSTEMS 129 Figure 3.15. The experimental evidence of the realization of a BoseEinstein condensate phase of gaseous 87 Rb atoms. The graphs show a 2D plot of the single-particle momentum distribution of the atoms in the droplet, as temperature is cooled down through the Bose-Einstein condensation temperature TBE . The sudden buildup of the asymmetric lump at p~ = ~0 measures the growth of the condensate fraction as the transition temperature is reached. This relation should be compared to the expression X (252) E(n1 , n2 , ...) = nα Eα α used, e.g. in Eq. (175), for the energy of noninteracting particles in terms of the occupation numbers nα of the single-particle statesPof energy Eα . It is seen that, apart from an irrelevant constant zero-point shift 21 α ~ωα , the two expressions are identical provided that the following identifications are made: single oscillator α oscillator quantum number vα oscillator energy quantum ~ωα oscillator eigenvalue vα ~ωα ←→ ←→ ←→ ←→ α single−particle state nα occupancy number of state α Eα single−particle energy of state α nα Eα energy of n particles in state α . As vα = 0, 1, 2, 3, ..., the identification works for bosons, not fermions. The outlined similarities in the Hamiltonian induce a completely parallel statistical behavior. For 130 3. STATISTICAL PHYSICS example, consider the average vibrational energy of one oscillator, Eq. (204): once the zero-point term ~ω2α is removed, energy v ~ωα is proportional to the quantum α number v, thus the average energy exp~ωβ̃−1 , reflects an average value of v, equal to 1 , precisely the same average occupancy [nα ]B , Eq. (237), of noninteracting exp β̃−1 boson state |αi with Eα = ~ωα and µ = 0. It is therefore natural to think of the harmonic ladder as fictitious boson particles, called “phonons” or “photons”, depending on the context. Accordingly, an oscillator in its v = 4 state is said to hold 4 photons, and an oscillator it its ground state contains no photons. The Bose-Einstein distribution can therefore be profitably employed to describe the thermodynamics of a set of harmonic oscillators, by replacing the Eα with the relevant ǫ = ~ωα , and taking µ = 0. In this context, the total number N of bosons cannot be held fixed (phonons/photons are not conserved particles), and varies freely with temperature, contrary to fermions and “real” bosons, such as atoms in a box. The lack of µ makes the statistics of such a system a simple exercise. We summarize here briefly the result for the photon gas, which describes the thermodynamics of the normal modes of the electromagnetic fields at thermal equilibrium within an isothermal cavity. The main difference between photons and material massive particles is that the dispersion relation, i.e. the ǫ(~p) (or, equivalently, ω(~k)) dependency is (253) ǫ(~p) = c|~p| or w(~k) = c|~k| , or ~|~k|2 w(~k) = 2M (c is the speed of light) rather than (254) E(~p) = |~p|2 2M for a particle of mass M . The electromagnetic fields in vacuum obey the same (Laplace) stationary equation of Schrödinger particles, thus under the same periodic boundary conditions, the allowed values of momentum are connected to the box size by the same Eq. (180). The “spin” degeneracy of photons is gs = 2, corresponding to the two transverse polarizations. Counting the states within a p~-sphere yields the total density of oscillator energies (255) gs gph (ǫ) = V π 2 ~3 c3 ǫ2 , to be compared with the dispersion of Schrödinger particles, Eq. (190). -2 -1 R(ε,T) [W m eV ] 3.2. IDEAL SYSTEMS 131 8000 400 K 6000 4000 300 K 2000 0 0 0.1 0.2 0.3 ε [eV] 0.4 0.5 Figure 3.16. The radiance (radiated power per unit surface and spectral energy) R(ǫ, T ) = 4c u(ǫ, T ) of electromagnetic fields at equilibrium at T = 300 K and 400 K. The area under each curve equals the R∞ 2 4 total radiated power per unit surface 0 R(ǫ, T ) dǫ = π60(k~B3Tc2) (StefanBoltzmann law). One square meter blackbody radiates a total 459 W at 300 K and 1450 W at 400 K. The energy position of the maximum of R(ǫ, T ) shifts linearly with T (Wien displacement law). The statistics of these oscillators yields the following thermodynamical relations: Z ∞ π 2 (kB T )4 U = V u(ǫ, T ) dǫ = V (256) , 15 ~3 c3 0 1 ǫ3 1 (257) , with u(ǫ, T ) = gs gph (ǫ) [nǫ ]B ǫ = 2 3 3 ǫ V π ~ c e kB T − 1 Z ∞ Z ∞ 2 1 y 2 ξ(3) 3 [N ] = gs gph (ǫ)[nǫ ]B dǫ = V 2 3 3 (kB T ) (258) dy = V 2 3 3 (kB T )3 , y π ~c π ~c 0 0 e −1 where ξ is the Riemann function (ξ(3) = 1.20206). These results were first derived by M.K.E.L. Planck to interpret the experimental data of thermal-radiation spectrum. As a by-product, the energy spectral density u(ǫ, T ) (per unit volume and energy) yields the radiative power R(ǫ, T ) = 4c u(ǫ, T ) (power per unit surface15 per unit 15 The c/4 factor between radiance R and energy density u originates from the fact that photons carry light c. However, given an infinitesimal surface, only the R 1 energy at the R 2πspeed of 1 1 fraction 4π of photons in the surroundings of that surface crosses it in cos θ d cos θ dϕ = 4 0 0 the right direction. 132 3. STATISTICAL PHYSICS Figure 3.17. The observed cosmic microwave background frequency spectrum, compared to a 2.73 K blackbody spectrum equivalent to Eq. (257). spectral energy) of radiation at thermodynamic equilibrium.16 Figure 3.16 reports the radiative power R(ǫ, T ) as a function of energy for two different temperatures. The spectral distribution of energy density (and, equivalently, radiative power) of electromagnetic fields at equilibrium is a universal function of temperature (called blackbody spectrum), and does not depend on the precise way this equilibrium is established (for example on the optical properties of the material of the cavity enclosing the fields). The distribution (256) is in extremely good agreement with experiments on radiation escaping through a tiny hole from isothermal cavities. A spectacular realization (Fig. 3.17) of the thermal-equilibrium radiation is that of the cosmic background, a “fossil” relic of an early stage of the universe when it was all at thermal equilibrium. We will apply the same statistics, with a slightly different density of states, to understand the thermal properties of the vibrations of solids (phonons in the Debye model, Sec. 4.3.2). 16 The same quantities are often quoted in terms of frequency ν, rather then energy R∞ −3 2 ǫ = 2π~ν =h hν. For example: g̃(ν, T ) = 8πV c ν , U = V ũ(ν, T ) dν, with ũ(ν, T ) = 0 i hν −3 3 8π h c ν / exp kB T − 1 . 3.3. INTERACTION RADIATION-MATTER 133 3.3. Interaction radiation-matter In Sec. 1.1.10 we have sketched the basic semiclassical results for a material system (e.g. an atom, a molecule) interacting with the radiation field. In the absence of any external stimulation, the system decays spontaneously to a lower-energy state with the rate given by Eq. (73). Here we want to describe in some detail what happens under the action of an external field, i.e. the effect of a stimulating radiation, as, e.g., in an absorption experiment. Typically this radiation needs to resonate with the energy of transition between two eigenstates of the system. Off-resonance transitions are associated to very small rates, and we ignore them here. Assume we have an ensemble of noninteracting quantum systems (e.g. atoms in gas phase) and focus our attention on two single-system levels only: |1i and |2i, of energies E1 and E2 (E1 < E2 ), with populations n1 and n2 , respectively. Resonant radiation of energy ǫ = ~ω = E2 − E1 may induce transitions between the two levels. In particular, the excitation |1i → |2i is driven by the presence of radiation, thus any given atom initially in state |1i has a probability per unit time (a rate) of upward transition proportional to the spectral energy density (per unit volume and spectral interval) ρ(ǫ) of the electromagnetic field at the resonance energy: (259) R1→2 = B12 ρ(ǫ) , where B12 is a suitable constant of proportionality, depending on the microscopical characteristics of the system and its coupling to the field, and we neglect nonlinear couplings O(ρ2 ). On the other hand, any atom initially in state |2i has a probability per unit time A21 to decay spontaneously to state |1i (in the dipole approximation, A21 equals the γ21 of Eq. (73)), plus an additional probability of downward transitions promoted by the presence of stimulating radiation, whose rate is then proportional to the same resonant spectral density: (260) R2→1 = A21 + B21 ρ(ǫ) , where again B21 is a yet-to-determine constant of proportionality. At any given time, the total number of systems undergoing the |1i → |2i transition is n1 R1→2 , and the total number of systems going |2i → |1i is n2 R2→1 . These relations are valid under arbitrary radiation conditions, for example when the system is probed by a stimulating radiation in a spectroscopy experiment (Fig. 0.5). However, these relations are valid in particular when the system and the radiation field are both at equilibrium at a given temperature. At equilibrium the average populations of different states must remain unchanged, and this implies that the total number of |1i → |2i and |2i → |1i processes must, on average, be equal: (261) [n1 ] R1→2 = [n2 ] R2→1 . 134 3. STATISTICAL PHYSICS We substitute (259) and (260) in the balance equation (261) (262) [n1 ] B12 ρ(ǫ) = [n2 ] (A21 + B21 ρ(ǫ)) . and solve for ρ(ǫ) obtaining (263) ρ(ǫ) = A21 B21 [n1 ] B12 − [n2 ] B21 1 . At equilibrium, the ratio of the populations equals the probability ratio which, by 1] = PP21 = exp[β(E2 − E1 )] = exp[βǫ] at resonance. Boltzmann statistics, is simply [n [n2 ] At equilibrium, the spectral density of the radiation field follows the universal energy dependency described in Sec. 3.2.2.2, in particular in Eq. (257): ρ(ǫ) = u(ǫ, T ). Accordingly, A (264) 21 ǫ3 1 B21 = u(ǫ, T ) = ρ(ǫ) = . B12 ǫ π 2 ~3 c3 exp ǫ − 1 − 1 exp kB T kB T B21 Comparison of the first and last form, which must be equal for any temperature, implies the following conditions on the coefficients: B12 A21 1 = 1, = 2 3 3 ǫ3 , B21 B21 π ~c known as Einstein relations. These relations among B12 B21 A21 were derived under the assumption of equilibrium, but as these coefficients only depend on properties of the individual system, they remain constant under arbitrary field conditions. Once one of these coefficients is determined, e.g. by A21 = γ21 of Eq. (73), the two others are fixed by the Einstein relations. The first relation expresses the symmetric role of the initial and final states in quantum mechanics, implying that a radiation field induces equal rate of excitation |1i → |2i and of stimulated emission |2i → |1i. This means in particular that under very strong applied radiation density the spontaneous emission becomes negligible, and R1→2 ≃ R2→1 , thus rapidly also n1 ≃ n2 (saturated transition). Saturating the different components of the |n = 2i ←→ |n = 3i transition of hydrogen is precisely the role of the pump beam in the experiment of Fig. 1.12. The second relation implies that, for a given spectral energy density ρ(ǫ) (independent of ǫ), the ratio of spontaneous emission to stimulated emission varies with ǫ3 . Accordingly, in a low-energy transition (microwaves, infrared) stimulated emission prevails, while at higher energy (ultraviolet, X-rays) spontaneous emission prevails. At equilibrium, the ratio of spontaneous to stimulated emission A21 ǫ (266) −1 = exp B21 u(ǫ, T ) kB T (265) 3.3. INTERACTION RADIATION-MATTER 135 Figure 3.18. The principle of operation of three- and four-level lasers, which realize in practice the needed population inversion, with level |2i pumped at the expense of level |1i. The lasing transition (2 → 1) is slower than the other indicated downward transitions, which are fast dipole-allowed transitions. indicates that thermal radiation stimulates emission effectively only for temperatures kB T ∼ ǫ or higher, as was to be expected. More interestingly, in the context of spectroscopy, the relation B12 = π 2 ~3 c3 ǫ−3 A21 shows an emission-forbidden transitions is also absorption-forbidden, and that an intense transition in absorption is also intense in emission, except at very low energies ǫ. 3.3.1. The laser. Usual media absorb traversing electromagnetic waves, and this is at the basis of absorption spectroscopies (Fig. 0.5). However, consider the possibility that a bunch of non-interacting quantum systems amplifies light rather than attenuating it. For this to occur, the total emission rate needs to exceed absorption: n2 R2→1 > n1 R1→2 , i.e. the ratio rate of emission A21 n2 (A21 + B21 ρ(ǫ)) n2 (267) . 1+ = = rate of absorption n1 B12 ρ(ǫ) n1 B21 ρ(ǫ) needs to exceed unity. As the second term becomes very small as soon as radiation intensity builds up, this equation tells us that a single condition grants light amplification (ratio > 1): a population inversion (n2 > n1 ), with respect to Boltzmann population of prevailing low-energy states. In practice, this radically out of equilibrium population is realized by means of electronic or optical pumping, involving other levels beside the two relevant ones, as illustrated in Fig. 3.18. Of course, a population inversion is highly unstable, and precisely the prevalence of emission over absorption tends to lead the ensemble toward a regular equilibrium population n2 < n1 . However, pumping may sustain the population inversion for an extended period of time, at the expense of some external power source. 136 3. STATISTICAL PHYSICS Figure 3.19. (a) Cr impurities in a transparent Al2 O3 crystal play the role of weakly interacting atoms carrying the 3 levels involved in the population inversion. Optical pumping of the “active medium” raises many Cr atoms from the ground state |1i to the broad shortlived state |3i, which decays spontaneously to state |2i. Further spontaneous decay |2i → |1i is extremely slow: decay is mainly promoted by stimulated emission. Since the energy of metastable state |2i is very sharply defined, the emitted radiation is very monochromatic. (b) The schematic construction of the ruby laser, showing the optical pumping lamp, the escape of photons not moving axially, and suggesting the buildup of repeatedly reflected axially moving photons, which stimulate further coherent emission. The usable photons escape through the partially reflecting mirror at one end. Once an optical active medium which amplifies light is realized, light can be channeled through it in a precise direction, by building a resonating optical cavity around it (Figs. 3.19 and 3.20). A crucial feature of stimulated emission is coherence: the emitted photon is not emitted in a random direction with a random phase, as in spontaneous emission, but prevalently in the same direction and with the same phase as the stimulating beam of photons. The use of a long cavity lets photons emitted at odd directions escape basically unamplified, while photons along the axis of the cavity bounce back and forth several times through the active medium, thus getting strongly amplified. As a result, a powerful highly coherent beam of radiation builds up for as long as the population inversion is maintained. A device such as described here, producing a coherent beam of photons by means of Light Amplification by the Stimulated Emission of Radiation is named laser. Massive commercial applications of such coherent beams extend from the industrial to the consumer side, including telecommunications, optical data storage, telemetry, cutting, welding, surgery... Lasers play also a fundamental role as tools for research in many spectroscopies, photochemistry, ultracold trapped gas cooling, microscopy, adaptive optics of telescopes... 3.3. INTERACTION RADIATION-MATTER 137 Figure 3.20. (a) The principle of operation of a He-Ne (four-levels) laser. Electrons accelerated by an electric field hit the He atoms, lead them to a metastable excited state He∗ (1s2s), which can then resonantly transfer its excitation energy to a Ne atom (He∗ + Ne → He + Ne∗ ). The excited Ne atoms are now in the highest of a 4-levels configuration, with the correct lifetimes for lasing operation. The emission of the upper transition is a characteristic red light (λ ≃ 633 nm), but other transitions can be selected. (b) The basic components of a gas laser. The present Chapter introduces the basics of equilibrium statistics and its connection to thermodynamics. It also reports a few examples of applications to ordinary matter, basically restricted to ideal noninteracting systems. Statistics becomes much more complicated and stimulating when real interacting systems are considered [?, ?]. Physicists have devised all sorts of techniques to investigate interacting systems both theoretically (diagrammatic expansions, many-body methods, renormalization group, ...) and experimentally (definitions and measurements of position/momentum/spin correlation functions, high-pressure techniques to investigate phase diagrams in extreme regimes, analysis of fluctuations, study of finite-size effects, exotic magnetic phases...). In the present introduction we have completely omitted any reference to dynamical out-of-equilibrium quantities. These are needed to account for transport (we shall use few basic definitions to define electrical and 138 3. STATISTICAL PHYSICS thermal conductivities in solids below) and general hydrodynamical properties starting from the microscopical interactions. This active and exciting sub-field of research in statistical physics goes far beyond the scope of the present introductory course. CHAPTER 4 Solids Macroscopic systems realize an equilibrium thermodynamical state as a balance between energy (which tends to decrease) and entropy (which tends to increase). Temperature sets the relative importance of the entropic contribution over the energetic one: energy prevails at low temperature, entropy at high temperature. The solid state is an ordered state, where each atom basically sits around a definite position: entropy is much smaller there than in a fluid state. Indeed, experience shows that most materials solidify at low temperature. The solid state signals the prevalence of adiabatic potential energy over nuclear kinetic energy. Kinetic energy Tn is translationally invariant, and tends to favor position-delocalized states, characterized by as low momentum as possible, and scarce position correlation. On the contrary, potential energy Vad takes advantage of characteristic optimal interatomic separations (see e.g. Fig. 2.1), and tends therefore to impose fixed interatomic relative distances and clear position localization. For essentially all materials at zero temperature, potential energy prevails, thus leading to solid states. The only remarkable exception is He, which at ordinary pressure remains fluid (and superfluid) down to zero temperature, due to its exceptionally weak interatomic interactions associated to scarce atomic polarizability (see Table 2.1). The application of pressure leads to low-temperature solid phases even for He, though. The tendency of a solid to hold together is measured by its “total bond energy” (in molecular language, the energy necessary to disaggregate the compound into its atomic components). In the context of solids, this is referred to as the cohesive energy. Like for molecules, both the Vad and Tn contributions vanish in the “atomized” state, whereas in the solid state, [Vad ] is negative (assuming Vad = 0 in the atomized state), and [Tn ] is positive (it cannot vanish, even at T = 0, due to Heisenberg uncertainty) but much smaller in absolute value (see Fig. 2.14). Solid matter is characterized by long-distance rigidity: a force applied to one or few of the atoms in a solid sample acts through the whole sample, which accelerates maintaining its average shape unchanged. This is due to the ability of solids (as opposed to fluids) to resist shear forces. This basic macroscopic property, on which our everyday experience relies, is far from trivial from the point of view of the 139 140 4. SOLIDS microscopic equations governing the dynamics of electrons and nuclei composing a solid. Indeed, our analysis of the adiabatic potential acting directly between two atoms at large distance reveals that Vad always decays with a rather fast power law 1/R6 . As the number of atoms at distance R from a given atom grows with R2 , the total interaction energy with faraway atoms decays with ∼ 1/R4 . In practice, an atom interacts significantly only with those atoms sitting in its close neighborhood, of a few nm3 say. Long-distance rigidity therefore cannot be related to long-range forces. This means that, in a solid, short-range forces propagate from one atom to the next ones, and from those to further atoms again and again, through the whole sample. 4.1. The microscopic structure of solids Many solids show ordered microscopical structures, but completely disordered solids are very frequent as well. Before coming to the experimental evidence for the ordered structure of many solids, we try to understand why regular spatial arrangements of atoms should not come too much as a surprise. In our initial study of many-atoms system (Chapter 2), we analyzed the typical shape of the adiabatic potential (Fig. 2.1) for a diatom. We also found that the adiabatic potential of many atoms is an explicit function of the relative positions of all of them, including all distances and angles. For exceptionally simple systems, such as the noble-gas elements, the total adiabatic potential of Nn atoms is decently approximated by a sum of 2-body potentials (e.g. of Lennard-Jones type – Eq. (136)) (268) Vad (R1 , R2 , ...RNn ) = Nn X α<α′ V2 (|Rα − Rα′ |) . Initially, neglect the nuclear kinetic energy: the state of minimum energy of two such atoms is realized by placing them at the equilibrium distance RM of the potential V2 , with an energy gain equal to the depth −ε of the potential well.1 Suppose now that a third atom is added, and that all equal atoms are constrained to stay along a line: the third atom can join the two on either side, at approximately a distance RM from the nearest atom, as shown in Fig. 4.1. Neglecting the weak second- thirdetc. -neighbor forces (which induce extra attraction, as illustrated in Fig. 4.1), the energy gain is simply −2ε. Likewise, Nn atoms along a chain place themselves at almost perfectly regular distances,2 and gain a total cohesive energy ≃ −(Nn −1) ε, √ For the Lennard-Jones potential (136), RM = 6 2 σ, with V2 (RM ) = VLJ (RM ) = −ε. 2 Slight distortions occur at the surface. Deep inside the bulk, however, each atom is subject to basically the same potential as its neighboring ones, thus it reaches an equilibrium position relative to the others which involves perfectly regular spacings. 1 4.1. THE MICROSCOPIC STRUCTURE OF SOLIDS 141 RM V2(R) Energy / ε 1 0 -1 0 (a) (b) 1 2 3 R/RM Figure 4.1. (a) A 1D solid constructed by successive addition of atoms. The atoms after the second add at fairly regular distances, slightly smaller than RM , the optimal equilibrium distance of the 2body adiabatic potential. (b) The reason of the slightly smaller equilibrium separation in the solid than in the diatom: if the separation was RM , forces to second, third... neighbors would all be attractive and uncompensated, thus some energy is gained by contracting the interatomic separation. This contraction is tiny, as V2 explodes at short distance, and even tinier at the surface than in the bulk, due to local deficiency of second, and further neighbors. i.e. essentially −ε per atom. Depending on the precise shape of V2 the actual energy gain turns out slightly larger due to second- and higher-neighbor attraction. In 1D, regularity is trivial and indeed unavoidable. In larger dimensionality, the freedom of the arrangement of atoms is significantly larger, and several regular and irregular atomic arrangements are possible. In 2D, a third atom can add to form an equilateral triangle, and extra atoms that join the cluster find a lowestenergy arrangement by progressively building a triangular lattice, as illustrated in Fig. 4.2. In the limit of large Nn , when surface effects can be neglected, each atom is surrounded by 6 nearest neighbors, so that the cohesive energy per atom is approximately −3ε, as each bond is shared between two atoms. Second- and higher-neighbor terms make the actual energy gain slightly larger, but the general message is that a two-body interaction tends to favor configurations of maximal coordination, i.e. geometric arrangements where each atom has as many nearest 142 4. SOLIDS Figure 4.2. A 2D solid constructed by successive addition of atoms. The basic unit is the equilateral triangle, which repeats itself indefinitely in space, so that each atom in the bulk is (maximally) coordinated to 6 other atoms. Contractions due to second, third... neighbors are ignored in this figure. neighbors as possible. Indeed if the atoms were forced to occupy a square lattice, each atom would bind to 4 nearest-neighbors, rather than 6, thus gaining a cohesive energy per atom of ≃ −2ε only: this gives a macroscopical cohesive energy difference ∆U = U square − U triang ≃ Nn ε in favor of the triangular lattice against the square lattice which is therefore strongly unstable. In 3D, the implications of the principle of maximal coordination reach even further. 4 atoms maximize their coordination by placing themselves at the vertexes of a regular tetrahedron. Extra atoms extend this basic unit in space following either of two different regular patterns: the face-centered cubic (fcc) lattice and the hexagonal close-packed (hcp) lattice. As illustrated in Fig. 4.3, both lattices are the result of a regular stacking of 2D triangular lattices. In both hcp and fcc lattices, the second layer is stacked above the first one so that the atoms sit on top of the centers of half the triangles of the lower layer. The third layer is also stacked on top of half the centers of the second-layer triangles: in the hcp directly above the atoms of the first layer, while in the fcc above the remaining triangles. In both configurations, each atom is surrounded by 12 nearest neighbors, thus the cohesive energy is approximately −6ε per atom. When higher-order neighbors are included, the cohesive energy is usually marginally more favorable to fcc than to hcp. It is indeed observed that the solid state of Ne, Ar, Kr and Xe is a regular fcc lattice. The optimal equilibrium distance (accounting for all higher-neighbors interactions) for the Lennard-Jones fcc solid equals 0.971 RM = 1.09 σ, with a total cohesive energy per atom of −8.6 ε (significantly more bound than the −6 ε nearestneighbor estimate). By plugging the parameters of Table 2.1 in this simple model, 4.1. THE MICROSCOPIC STRUCTURE OF SOLIDS 143 Figure 4.3. Two different close packings of rigid spheres: fcc is realized as a stacking sequence of type abcabc..., while hcp is realized as a stacking of type ababab... of triangular lattices. Upper panel: decomposition in triangular layers of conventional cells of fcc and hcp. Lower panel: two layers of “cannonballs”. If a third layer is placed on sites of type (c), an fcc stacking is initiated. If instead the third layer is placed on sites of type (a), an hcp stacking is realized. one obtains the bond distances and cohesive energies of Table 4.1. Not unexpectedly, the experimental energy is generally slightly larger (less cohesive) than the prevision of the simple Lennard-Jones model, where the ionic kinetic energy Tn and the associated zero-point motion are neglected. The good overall agreement confirms 144 4. SOLIDS exp theory element RM [pm] 1.09σ [pm] Ne 313 300 Ar 375 371 399 401 Kr Xe 433 443 exp theory exp [meV] −8.6 ε [meV] Tmelt [K] −20 −27 24.6 −80 −90 83.8 −110 −124 115.8 −170 −167 161.4 U Nn Table 4.1. The estimate of the equilibrium nearest-neighbor interatomic separation and cohesive energy per atom for solid noble-gas elements based on the fcc Lennard-Jones model (Table 2.1), compared to experimental determinations. Discrepancies of the order of few percent are not surprising, especially for the lighter Ne, for a model which neglects the kinetic energy of the nuclei. Note the strong correlation between the cohesive energy per particle and the melting temperature Tmelt of the solid (| NUn | ≃ 12 kB Tmelt ). the concept of maximizing the coordination, leading to compact structures similar to the packing of hard spheres, typically fcc, a concept valid whenever the adiabatic potential can be decomposed in 2-body terms, as in Eq. (268). The noble gases (and mixtures thereof) are not the only system where the approximation of a 2-body adiabatic potential works well: it can successfully describe the overall structure of many solids formed by “spherically symmetric” close-shell molecules, e.g. methane CH4 [?]. With suitable modifications, similar 2-body models permit to describe other molecular solids (e.g. H2 , N2 , Cl2 ) where again weak Van der Waals interactions provide cohesion, but the asymmetry of the individual molecular unit can favor different lattice structures. However, molecular solids, where each molecular unit retains many of its molecular properties and is only weakly bound to other units, constitute a marginal class of solids, by no means the most typical one, like the Ar2 dimer is not the most typical example of a molecule. Contrary to molecular solids, electrons of the outer atomic shells change substantially their quantum state when they belong to a covalent or metallic solid, in pretty much the same way that the electronic state of H N and O changes in forming the covalent molecules H2 , N2 , H2 O, as discussed in Sec. 2.2. A covalent or metallic solid can indeed be seen as a huge molecule, where the molecular bonding extends to the whole sample. Experimentally, cohesive energies per atom U of solids are comparable to bond energies of covalently bound diatomic molecules, Nn i.e. of the order of several eV. These are of course much larger than those of noblegas solids (Table 4.1); for example we report the cohesive energies per atom of a few solid elements: Li −1.65 eV, C (diamond) −7.36 eV, Si −4.64 eV, Fe −4.29 eV; Cs 4.1. THE MICROSCOPIC STRUCTURE OF SOLIDS (a) 145 (b) Figure 4.4. (a) A sliced wafer of perfectly crystalline Si. The diameter of the cylinder is 300 mm. (b) A balls-and-sticks view “from the inside” of the silicon crystal lattice structure. This is the same geometrical structure as C diamond. The coordination is four only. This view is down the (110) direction. −0.83 eV. Ionic crystals (e.g. NaCl) show similar cohesive energies, of several eV per ion pair. To compute a quantitative estimate of the cohesive energy of a covalent or metallic solid, it is necessary to study the dynamics of its electrons in detail. Like for many-electron molecules, in practice, reliable estimates can be computed only on the basis of detailed self-consistent calculations. Before coming back to the electronic states of solids in Sec. 4.2, observe that it is to be expected that the adiabatic potential associated to such nontrivial electronic states shows strong dependence on all bond lengths and angles, pretty much as it does in molecules where it enforces more or less rigid equilibrium molecular geometries (Fig. 2.7). Therefore, in covalent and metallic solids, the 2-body approximation Eq. (268) is bound to fail completely, and different structures, other than fcc, are to be expected, depending on the detailed chemistry and thus on the relevant Vad involved. Indeed, elemental solids show several different ordered structures including (beside fcc): hcp, body centered cubic (bcc), diamond, and others. Long-range crystalline order reaches spectacular levels of perfection, for example in industrial-grade Si single crystals, where a sub-nanometer unit cell repeats itself over and over in three dimensions for distances exceeding the meter (Fig. 4.4). We shall soon present those structures in the standard idealized formalism, commonly employed to report and classify many experimental data on crystalline solids, based on an elementary unit infinitely repeated periodically in space. 146 4. SOLIDS (a) (b) (c) Figure 4.5. Point defects in an otherwise perfect triangular lattice: (a) a vacancy, (b) an interstitial atom, and (c) an impurity atom. (a) (b) (c) Figure 4.6. (a) A dislocation (b) observed by STM in PtNi alloy. (c) The motion of dislocations substantially reduces the shear resistance of a real crystal with respect to that of a perfect crystal. Figure 4.7. A stacking fault observed in an otherwise perfect GaN crystal. 4.1. THE MICROSCOPIC STRUCTURE OF SOLIDS 147 Figure 4.8. A grain boundary observed in the high-resolution transmission electron microscope (TEM) image of a strontium-titanate film. A perfect infinitely repeated lattice is an idealization which is never exactly realized in nature. In a real material, the crystalline structure is bound to contain defects. A localized defect, such as a vacancy or an interstitial atom (Fig. 4.5) raises the total energy by a few times the typical bond energy ε of an atom to its neighbors. This is irrelevant for the total cohesive energy per atom NUn , in the thermodynamical limit. At low temperature, at equilibrium, one expects a small finite concentration (of the order of ∼ e−βε ) of localized defects to survive. The modest difference in energy between the fcc and hcp lattices makes the creation of extended defects such as dislocations (Fig. 4.6) or stacking faults (Fig. 4.7) likely. The energy cost of extended defects is macroscopically large and should therefore suppress them at equilibrium. However defects remain easily “frozen” within the lattice, the typical time for an extended defect to drift out of a macroscopically large sample being often astronomically large. As a result, at very low temperature a solid often remains locked in a metastable non-equilibrium state with a finite (often large) concentration of extended defects depending on preparation (see the discussion of ortho- and para-hydrogen of Sec. 3.0.4 for a simpler example of a metastable non-equilibrium state surviving for long time). Extended defects of the type of Figs. 4.6-4.8 are crucial for understanding the plastic deformations of real crystals under strain. Even without “internal” defects, real solids are not ideal because the lattice periodicity is forced to end at the inevitable terminating surface or interface or grain boundary (Fig. 4.8). 148 4. SOLIDS Figure 4.9. High-resolution TEM image of an amorphous zirconium (Zr) alloy. In contrast to Fig. 4.8, the spots are randomly arranged, indicating that this material is noncrystalline. (a) (b) (c) Figure 4.10. Naturally grown (a) quartz (SiO2 ) and (b) fluorite (doped CaF2 ) polycrystals. (c) Sapphire (doped Al2 O3 ) cut artificially along crystal planes. In many materials (many-components off-stoichiometric compounds, glasses, many polymers, alloys...), the cost of the formation of defects is so small that it is highly nontrivial (and often even impossible) to obtain crystalline samples. These materials form amorphous solids, where no long-range lattice order is present (Fig. 4.9): their microscopic structure often resembles that of a frozen liquid. The formalism that we are going to set up for crystals is mostly useless for amorphous systems: their investigation requires more advanced tools, exceeding the scope the present basic course. The tendency to grow and to break along flat planes (cleave) at characteristic fixed relative angles (Fig. 4.10) is a macroscopic evidence of crystalline order in many solids. An even more compelling evidence is provided by the diffraction of X-rays, 4.1. THE MICROSCOPIC STRUCTURE OF SOLIDS MONOCHROMATIC "RADIATION" SOURCE 149 11 00 00 11 00 11 00 11 00 11 00 11 00 11 00 11 00 11 00 11 00 11 00 11 00 11 00 11 00 11 SAMPLE (a) "RADIATION" DETECTOR (b) (c) Figure 4.11. (a) The scheme of a powder or polycrystalline sample diffraction experiment. The “radiation” beam may be constituted by any wave field interacting with matter, typically X-rays, neutrons, or electrons. The sample must be thin with respect to the typical attenuation length of that radiation in that material. The diffraction patterns made by a beam of (b) X rays and of (c) electrons passing through the same thin Al foil. The characteristic angles of diffraction are connected to the lattice periodicities and the wavelength of the incident radiation. neutrons and electrons of wavelengths in the a0 region. As Fig. 4.11 illustrates, all sorts of wave probes of wavelength in the correct range interact with crystals and produce diffracted beams, characteristic of a regular array of scatterers, i.e. of a spatially periodic density. We proceed to introduce the basic mathematics describing a periodic system, with the central concepts of direct and reciprocal lattice, and employ the related Fourier analysis to understand diffraction experiments. 4.1.1. Lattices and crystal structures. The basic property of a crystalline solid is the essential equivalence of different “sites”. Any Ne atom in its fcc lattice “sees” an essentially equivalent environment, unless it lies close to the crystal surface or to some defect. In a sufficiently “clean” crystal, most atoms are far enough, say at least 5 neighbors away, from the nearest imperfection. The forces experienced by one such “bulk” atom, and by its electrons, equal those the same atom would feel if it was part of a perfect lattice extending through all space. It thus makes sense to understand many properties of crystalline solids by modeling them as “perfect”, ideal crystals. 150 4. SOLIDS P a’1 a’2 a2 a1 Figure 4.12. A finite portion of a generic 2D lattice generated by the primitive vectors ~a1 and ~a2 . Other equally good primitive vectors ~a′1 and ~a′2 are indicated, based on a different origin. Any lattice point ~ can be expressed as n1~a1 + n2~a2 , for example P = −1 ~a1 + 4 ~a2 = R, −4 ~a′1 + 8 ~a′2 . The main symmetry of a crystal is a discrete translational symmetry.3 Given a point ~r in the crystal, perfectly equal physical properties (including electric potential, electric field, mass and charge density, current, etc.) are observed at all other points (269) ~ ~r ′ = ~r + R, ~ = n1~a1 + n2~a2 + n3~a3 , with R with nj arbitrary integers, and ~aj three linearly independent vectors. All points ~ in Eq. (269) form an infinite array extending through space: it is of the type R named a Bravais lattice. The vectors ~aj are said to generate the lattice. They are called primitive if, for any ~r, the points ~r ′ defined by Eq. (269) are all the points which have equal physical properties as ~r (that means that ~aj are taken as short as ~ points as dense as possible). possible, to make the lattice of R As a first example, Fig. 4.12 illustrates these ideas for a 2D lattice. Figure 4.13 shows a portion of a simple-cubic lattice, seen from different angles. Primitive vectors can be chosen as a x̂, a ŷ, a ẑ, i.e. orthogonal and all of the same length a, the side of the smallest cubes in the lattice. The drawn portion of the lattice includes 125 points generated by 5 × 5 × 5 consecutive values of the nj indexes. 3 In an isolated atom, the rotational symmetry commuting with the effective single-electron Hamiltonian makes the angular momenta Li of individual electrons good quantum numbers, used to label atomic states such as, e.g., 1s2 2s2 2p4 . Sections 4.1.1 and 4.1.2 address the similar problem to determine and diagonalize the symmetry operators of electrons in a crystal, in order to provide electrons with appropriate quantum numbers. 4.1. THE MICROSCOPIC STRUCTURE OF SOLIDS (a) (b) 151 (c) Figure 4.13. A finite portion (5 × 5 × 5 lattice spacings) of a simple cubic lattice seen from (a) the ẑ direction (001), (b) the ŷ + ẑ direction (011), and (c) the x̂ + ŷ + ẑ direction (111). (a) (b) (c) Figure 4.14. A finite portion (2 × 2 × 2 lattice spacings) of a face centered cubic (fcc) lattice seen from (a) the ẑ direction (001), (b) the ŷ + ẑ direction (011), and (c) the x̂ + ŷ + ẑ direction (111). Other important examples of 3D Bravais lattices – already encountered above – are the fcc and bcc lattices sketched in Figs. 4.14 and 4.15. These lattices are built by adding sites to the simple cubic lattice (in the fcc, at the center of each face of the cubes, in the bcc at the body center of the cubes). The added sites are perfectly equivalent to the original sites. The lattice-point density of fcc is four times and that of bcc is twice that of simple cubic of the same cube side a. For fcc and bcc, the same cubic-lattice vectors a x̂, a ŷ, a ẑ as for the simple cubic lattice are often conveniently used. However, these are not primitive: Fig. 4.16 indicates a standard choice of primitive lattice vectors. 152 (a) 4. SOLIDS (b) (c) Figure 4.15. A finite portion (3 × 3 × 3 cubic lattice spacings) of a body centered cubic (bcc) lattice seen from (a) the ẑ direction (001), (b) the ŷ + ẑ direction (011), and (c) the x̂ + ŷ + ẑ direction (111). (a) (b) Figure 4.16. (a) Primitive lattice vectors for the fcc Bravais lattice: ~a1 = a2 (ŷ+ ẑ), ~a2 = a2 (ẑ+ x̂), ~a3 = a2 (x̂+ ŷ). The point P , for example, may be expressed as P = ~a1 + ~a2 + ~a3 . (b) Primitive lattice vectors for the bcc lattice: ~a1 = a2 (ŷ + ẑ − x̂), ~a2 = a2 (ẑ + x̂ − ŷ), ~a3 = a (x̂ + ŷ − ẑ). 2 Taking the primitive vectors as three converging edges defines a parallelepiped of volume Vc = ~a1 × ~a2 · ~a3 . This parallelepiped contains all “different”, inequivalent points ~r of space: any other ~r ′ lying outside this parallelepiped is equivalent to some ~r inside, to which it can be reduced by using Eq. (269) with suitable nj . This elementary parallelepiped contains therefore all the relevant information about the 4.1. THE MICROSCOPIC STRUCTURE OF SOLIDS 153 Figure 4.17. Several possible choices of primitive cells for the same 2D lattice. (a) (b) Figure 4.18. Primitive cells of the fcc (a) and bcc (b) Bravais lattices. The volume of the fcc primitive cell is 14 of the volume a3 of the conventional cell. The volume of the bcc primitive cell is 12 of the volume a3 of the conventional cell. whole of this periodic world: what happens outside it amounts to essentially boring repetitions of what goes on inside. For this reason, this minimal volume which, by ~ fills up the whole space without overlapping takes applying lattice translations R, 154 4. SOLIDS Figure 4.19. Wigner-Seitz primitive cell of a generic 2D lattice. The Wigner-Seitz cell of a 2D lattice is always an hexagon, unless the lattice is rectangular. the special name of primitive cell, or unit cell. In fact, according to this more general definition the primitive cell needs not be a parallelepiped at all (in 2D, a parallelogram – see Fig. 4.17). Each primitive cell contains, in particular, one and only one Bravais lattice point. A special choice of primitive cell retains the full symmetry of the lattice, and is free of the arbitrariness (illustrated in Fig. 4.12) in the choice of the primitive ~ the Wigner-Seitz cell is the set of all positions ~r vectors. Given a lattice point R, ~ ~ ′ (see for example Fig. 4.19). One can closer to R than to any other lattice point R show that the Wigner-Seitz cell is indeed a primitive cell. Figure 4.20 shows the Wigner-Seitz cells for two common lattices. It often happens that exactly one atom occupies each lattice site of a Bravais lattice as, e.g., in fcc solid Ne and Al. However, even more often several atoms belong to the same primitive cell, which is then repeated in space. For example, the nuclei of solid NaCl (and scores of similar compounds such as LiCl, NaBr, KI, AgF, CaO, BaSe, ...) sit around simple cubic lattice points, alternately so that each Na has 6 nearest neighbor Cl, and vice versa. Similarly, the nuclei of CsCl (and similar compounds CsBr, TlCl, ...) sit at the lattice sites of a bcc lattice, alternately so that each Cs has 8 nearest neighbor Cl, and vice versa (see Fig. 4.21). To describe such crystals, it suffices to recognize the true periodicity of the lattice. For example the NaCl structure can be described as a fcc Bravais lattice containing two atoms per cell: a Na atom at ~0 and a Cl atom at the center of the fcc primitive cell 1 (~a + ~a2 + ~a3 ) = a2 (x̂ + ŷ + ẑ). This brings us to the general necessity to introduce 2 1 a basis, i.e. a list of atoms with their positions within a primitive cell. A crystal structure is thus defined by a Bravais lattice plus a basis. A basis is sometimes 4.1. THE MICROSCOPIC STRUCTURE OF SOLIDS (a) 155 (b) Figure 4.20. Wigner-Seitz primitive cells. (a) Of the fcc Bravais lattices (a rhombic dodecahedron). The surrounding cube is the conventional cubic cell drawn in Fig. 4.18 translated by, e.g. a2 x̂, in order to put a lattice site at its center. (b) Of the bcc lattice (a truncated octahedron). Each regular hexagon bisects a segment joining the central point to the a vertex of the cube. needed even when all atoms in the crystal are chemically equal, because they can be geometrically different. For example, Fig. 4.22 illustrates the fact that the 2D honeycomb lattice (all sites hosting “equal” atoms) is not a simple Bravais lattice, as no two neighboring atoms are geometrically equivalent. The honeycomb lattice is the building block of the graphite form of carbon, which, as shown in Fig. 4.23, is composed by a “vertical” stack of parallel planes of honeycomb-bound carbon atoms. The graphite structure is a simple hexagonal lattice with a basis of 4 atoms per primitive cell. The hcp structure (Fig. 4.24) is also an hexagonal Bravais lattice with a 2-atoms basis. Similarly, the diamond structure (of C-diamond, Si, Ge and α-Sn, drawn in Fig. 4.25) is described by a fcc lattice with a two-point basis. Each site has four nearest neighbors (see also Fig. 4.4). Once the possibility that several atoms belong to each cell, one often finds it more convenient to use a conventional cell containing several equal atoms, rather than the primitive unit cell of the Bravais lattice. For example, fcc and bcc are often conveniently described in terms of the underlying nonprimitive simple cubic cell, of larger volume than the primitive unit cells of Fig. 4.18. The fcc lattice is then seen as a simple cubic Bravais lattice with a 4-points basis (~0, a2 (ŷ + ẑ), a2 (ẑ + x̂), and a (x̂ + ŷ)), while the bcc is seen as a simple cubic lattice with a 2-points basis (~0, 2 156 4. SOLIDS (a) (b) Figure 4.21. (a) The sodium chloride structure. (b) The cesium chloride structure. Black and white balls represent ions of two different types. In the NaCl structure, the ions of each kind form interpenetrating fcc lattices. In the CsCl structure, the ions of each kind form interpenetrating simple cubic lattices. Figure 4.22. The honeycomb lattice is a 2D triangular net to which one third of the points has been removed. This is not a simple Bravais lattice, since the “environment” of two neighboring atoms is not equivalent. In particular, vector quantities are in general different at those two lattice points, even though all scalar quantities are the same. The honeycomb net can be described as a Bravais lattice with two primitive vectors (~a1 and ~a2 of equal length, separated by a 60◦ angle), and a basis composed by, e.g., ~0 and 13 (~a1 + ~a2 ). 4.1. THE MICROSCOPIC STRUCTURE OF SOLIDS 157 Figure 4.23. The structure of C-graphite: a alternating stack of 2D honeycomb lattices. It consists of an hexagonal lattice with a basis of 4 atoms per primitive unit cell, two in plane “a” plus 2 in plane “b”. See also Fig. 0.2. Figure 4.24. The hcp structure is an alternate stacking of triangular lattices, as illustrated in Fig. 4.3. This crystal structure is defined by ~aq a2 of equal length a, and ~a3 of different length c (ideally 1 and ~ c = 8 a); the basis contains 2 atoms, at ~0 and at 1 ~a1 + 1 ~a2 + 1 ~a3 . 3 3 3 2 158 4. SOLIDS Figure 4.25. The diamond lattice consists of two interpenetrating fcc Bravais lattices, displaced along the body diagonal of the cubic cell by one quarter of the length of this diagonal. It can be regarded as a fcc lattice with the two-point basis ~0 and 41 (~a1 + ~a2 + ~a3 ) = a (x̂ + ŷ + ẑ). 4 a 2 (x̂ + ŷ + ẑ)). All results must coincide when this conventional “lattice with basis” formalism is used in place of the genuine description in terms of pure Bravais lattice. We have introduced here the Bravais lattices as infinite arrays of discrete geometric points. This is a convenient means to visualize them. However, the precise mathematical meaning of a Bravais lattice is a group of translations, precisely those that transform the corresponding array of discrete points back into itself. Any vec~ can be seen also as a translation operator T ~ such that T ~ ~r = R ~ + ~r, for any tor R R R point ~r. It is easily verified that the set of all lattice translations {TR~ } of a Bravais lattice is closed for composition (TR~ TR~ ′ = TR+ ~ R ~ ′ ), it contains a neutral element (T~0 ), and for each TR~ there exists an inverse element (T−R~ ) such that the composition of TR~ and its inverse yields the neutral element. This means that all lattice transla~ +R ~′ = R ~ ′ + R, ~ this tions {TR~ } form a group of geometric transformations. As R group is Abelian i.e. commutative. When the Hamiltonian has the full symmetry of this group of discrete translations, its eigenstates are simultaneous eigenstates of all group operations, i.e. they are labeled by the group irreducible representations. The purpose of the following Section is precisely to find these irreducible representations: we shall see that their structure is very general and not especially complicated. 4.1. THE MICROSCOPIC STRUCTURE OF SOLIDS 159 The extra symmetry of many lattices and crystal structures leads to some additional intricacy. For example, a simple cubic lattice transforms back into itself if rotated around any of its points by 90◦ around any of x̂ or ŷ or ẑ, or by 120◦ around any body diagonal direction such as x̂ + ŷ + ẑ. These extra transformations extend the group of discrete translations {TR~ } outlined above. The full symmetry group of the lattice (space group) is a proper combination of the point group (a finite group of rigid rotations and reflections about one point) and the group of discrete translations {TR~ }. Back in the first half of the 19th century it was recognized that only 7 different point groups could occur for 3D lattices. In particular, neither groups including fivefold axes (rotations by 2π = 72◦ ), nor groups including sevenfold or higher-order 5 axes exist, as they could not replicate infinitely in space. These 7 point groups combine differently with the translations to form 14 inequivalent Bravais lattices.4 The introduction of a basis into the Bravais lattices may reduce the global symmetry of the objects replicated in the primitive cell, and thus the space group. Including all possible types of basis, the different point groups become 32, rather than 7, and the space groups 230, rather than 14. We leave the details of the classification of space and point groups of 3D lattices to specific solid-state courses, and only note that these extra symmetries are important to recognize the extra degeneracies of the electronic or vibrational states of the crystal. 4.1.2. The reciprocal lattice. The Fourier transform of a periodic function includes only discrete “frequencies”: this is the key to understand the use of a reciprocal lattice. This is illustrated simply in 1D. By definition, a periodic function f (x) of period a (the lattice spacing) satisfies f (x) = f (x − na) = f (x − R) (R = na is a lattice vector, as in Eq. (269)). Then, in its Fourier expansion Z ∞ 1 −1 ˜ (270) f (x) = F [f ](x) = √ eikx f˜(k) dk , 2π −∞ all Fourier components f˜(k) vanish except those whose exp(ikx) has the same periodicity as f (x). The corresponding k must satisfy exp(ikx) = exp[ik(x − a)], i.e. exp(−ika) = 1, i.e. ka = 2πl, i.e. k = l · 2π , for any integer l = 0, ±1, ±2, .... a We indicate those special k-values compatible with the lattice periodicity with the notation 2π . (271) G = G(l) = l · a As at any value of k 6= G which does not respect the lattice periodicity the Fourier component f˜(k) vanishes, the Fourier expansion (270) can be written as a discrete 4 Two groups are equivalent if they contain the same symmetry operations (e.g., rotations, discrete translations, etc., the latter up to suitable scaling factors). 160 4. SOLIDS Fourier series (272) f (x) = X eiGx f˜(G) , G with coefficients 1 f˜(G) = a (273) Z a exp(−iGx)f (x)dx . 0 The G points, of all k’s, acquire therefore a special role when the a-periodicity is involved, and periodic functions are to be represented in Fourier space. According to (271), the G points build a regular lattice, of unit vector 2π , in k space. Apart a from the physical dimensions (inverse length rather than length), the k space is not different from the x space, thus the k-space lattice of G points holds all the properties of a Bravais lattice on its own: the lattice of G points of Eq. (271) is called reciprocal lattice. By definition, direct-lattice points R reciprocal-lattice points G satisfy eiRG = ei na l2π/a = ei 2π nl = 1 . (274) The simple 1D example introduced here can be generalized to the 3D case relevant for actual crystals. A function f (~r) has the periodicity of a Bravais lattice if f (~r) = ~ for any lattice vector R ~ = n1~a1 + n2~a2 + n3~a3 , with integer nj as in f (~r − R) Eq. (269). Then, the only nonzero components in its Fourier expansion satisfy ~ · ~r) = exp[iG ~ · (~r − R)] ~ for all R ~ in the direct lattice. This relation is satisfied exp(iG ~ such that for all G ~ ~ eiR·G = 1 . (275) ~ of the reciprocal lattice are all the ~k vectors whose associated In words, the vectors G plane wave has the periodicity of the direct Bravais lattice. In particular, taking the ~ one finds that the G ~ vectors are all the vectors of the primitive vectors ~aj for R, type ~ = G(l ~ 1 , l2 , l3 ) = l1~b1 + l2~b2 + l3~b3 G (276) (lj are integers), with (277) ~b1 = 2π ~a2 × ~a3 , Vc ~b2 = 2π ~a3 × ~a1 , Vc ~b3 = 2π ~a1 × ~a2 , Vc ~ vectors where Vc = ~a1 ×~a2 ·~a3 is the volume of the primitive unit cell. As above, the G form a Bravais lattice, of which Eq. (277) yields a set of primitive unit vectors. Note ~ ·G ~ = 2π(n1 l1 + n2 l2 + n3 l3 ). Any that as ~bi · ~aj = 2πδij , the product of Eq. (275) R ~ is associated to a plane wave which does not respect the lattice other vector ~k 6= G periodicity, thus the corresponding Fourier component f˜(~k) vanishes. Like in 1D, 4.1. THE MICROSCOPIC STRUCTURE OF SOLIDS 161 the Fourier expansion of the periodic function is a discrete Fourier summation over the reciprocal lattice X ~ ~ , (278) f (~r) = eiG·~r f˜(G) ~ G with coefficients (279) 1 f˜(G) = Vc Z Vc ~ · ~r f (~r) d3 r , exp −iG where the integration is carried out over a single primitive unit cell volume. The 3 . The Wigner-Seitz volume ~b1 × ~b2 · ~b3 of the reciprocal-lattice primitive cell is (2π) Vc cell of the reciprocal lattice is called first Brillouin zone. By applying the transformations (277), it is easy to verify that the reciprocal . The lattice of a simple cubic lattice of side a is another simple cubic lattice of side 2π a fcc lattice of conventional cell of side a has as reciprocal lattice a bcc of conventional side 4π . Conversely, the bcc lattice of conventional cell of side a has a fcc reciprocal a . As a consequence, the first Brillouin zone of lattice5 of conventional cell side 4π a the fcc lattice has the shape of the bcc Wigner-Seitz cell (Fig. 4.20b), and that of the bcc lattice has the shape of the fcc Wigner-Seitz cell (Fig. 4.20a). It is a useful exercise to determine the reciprocal lattice of the hexagonal lattice starting from the conventional primitive vectors of Fig. 4.24. ~ r ~ vectors in the reciprocal lattice determine plane waves eiG·~ G of the direct-lattice periodicity. For a given plane wave, consider the constant plane-wave surfaces, the “wave fronts”, for example those corresponding to maximum real part of the wave ~ ~ function eiG·~r = 1: these constitute a family of parallel planes, perpendicular to G 2π and separated by a wavelength λ = |G| ~ . Some of these planes pass through the lattice points. In particular they all pass through the lattice points if the integer indexes l1 , l2 , and l3 have no common multiplicative factor other than 1. Otherwise, ~ given by nl1 , nl2 , nl3 , one in n of these planes pass through lattice points. The for G integer indexes l1 , l2 , and l3 are called Miller indexes of the family of lattice planes. These indexes are inversely proportional to the intercepts of these planes with the crystal primitive directions. The standard notation to indicate planes and directions in ~k space is (l1 l2 l3 ), with the convention that if some of these integers is negative, then it carries an overbar, and commas are eliminated, as in (2, −1, 0) = (2 1̄ 0). The relation of a few commonly encountered lattice planes and the underlying cubic cell is illustrated in Fig. 4.27. Traditionally, for all cubic lattices, including fcc and bcc, lattice planes are labeled with respect to the conventional cubic direct- and 5 This is a consequence of the general fact that the reciprocal of the reciprocal lattice is the original lattice. This can be verified by applying twice the transformations (277), or even more ~ and G ~ in Eq. (275) can be exchanged. simply by observing that the roles of R 162 4. SOLIDS ~ vector identifies a well defined family of parallel Figure 4.26. A G ~ lattice planes, of constant plane wave eiG·~r . These are perpendicular ~ and separated by a distance 2π . Left: (l1 l2 l3 ) = (1 0 0) planes; to G ~ |G| right: (l1 l2 l3 ) = (1 1 1) planes. Figure 4.27. Standard notation to indicate lattice planes in cubic symmetry. reciprocal-lattice directions x̂, ŷ, ẑ, not with respect to the primitive unit vectors of Fig. 4.16. 4.1.2.1. An algebraic note. The algebraic interpretation of the reciprocal lattice is connected to group theory: any ~k-vector labels an irreducible representation of the group of the discrete direct-lattice translations. These representations are 1dimensional and the corresponding character of a group operation TR~ is simply ~ exp(−i~k · R). Two irreducible representation labeled by ~k and ~k ′ have all equal ~ for characters (thus are in fact the same representation) whenever ~k − ~k ′ = G ~ in the reciprocal lattice. Indeed, exp(−i~k · R) ~ = exp[−i(~k ′ + G) ~ · R] ~ = some G 4.1. THE MICROSCOPIC STRUCTURE OF SOLIDS 163 ~ exp(−iG ~ · R) ~ = exp(−i~k ′ · R), ~ thanks to Eq. (275). Accordingly, all exp(−i~k ′ · R) possible irreducible representations of the discrete translational group are labeled by all ~k points within one primitive zone of the reciprocal lattice, e.g. the first Brillouin zone. 4.1.3. Diffraction experiments. Diffraction of radiation is the main quantitative source of structural data about solids. Electrons of energy 1.5 ÷ 150 eV, electromagnetic fields of 1 ÷ 10 keV (Fig. 0.4), and neutrons of 1 ÷ 100 meV: the wavelength of all these “radiations” fits in the range 0.1 ÷ 1 nm of the unit cells size of not too complicated crystals (of the order of few typical interatomic distances ∼ 10−10 m). Solids can and do diffract waves of the three listed kinds. However, electrons interact very strongly with matter, and are thus sensitive to few topmost surface layers only. If the energy of X rays is chosen off-resonance from all the core transitions of the atoms in the material (see Fig. 1.24), then their penetration in the solids is generally sufficient (i.e. many unit-cell lengths) to produce clear bulk diffraction patterns. Neutrons are even more fit for structural diffraction studies, as they only interact (weakly) with the nuclei, and are almost insensitive to the electrons. A sufficiently small sample guarantees that the total probability of probe-sample interaction is weak, so that most probing radiation goes unscattered through the sample, a small fraction scatters once, and essentially none scatters twice or more. Under these conditions scattering is easily understood. For simplicity, the incoming beam is assumed to be produced by a monochromatic source placed at a large distance from the sample, so that it is characterized by a well-defined wave vector ~k (or momentum ~~k). The detector is also very remote from the sample, so that it detects outgoing radiation scattered elastically to another welldefined wave vector ~k ′ (Fig. 4.28a). As analyzed more quantitatively in the theory of scattering, every infinitesimal volume d3~r of a continuous distribution of matter scatters radiation in proportion to the amount of matter ρ(~r)d3~r locally present. As suggested around Eq. (73) and sketched in Fig. 4.28b, the total rate of transition γ~k ~k′ R ~′ ~ is proportional to the square modulus of the matrix element d3~r e−ik ·~r ρ(~r) eik·~r = R 3 −i(~k′ −~k)·~r d ~r e ρ(~r) ∝ F [ρ](~k ′ − ~k) = ρ̃(~k ′ − ~k), namely the Fourier transform of the relevant density. These considerations suggest that the elastic scattering rate of the incoming ~k wave radiation into the ~k ′ direction is proportional to the square modulus of the Fourier transform ρ̃ of the appropriate density ρ, evaluated at the transferred wave vector ~q = ~k ′ − ~k (Fig. 4.28a). The relevant density is the electronic charge density ρel (~r) for X rays, and the nuclear-matter density ρnuc (~r) for neutrons: 2 2 (280) I X−rays (~q) ∝ F [ρ el ](~q) ≡ ρ̃ el (~q) . neutr nuc nuc 164 4. SOLIDS scritta inutile k’ y q x 0 k’ k r 2θ k (a) SAMPLE ∆ϕ1 = k ∆ϕ2 = −k’ r r (b) ∆ϕ tot= ∆ϕ 1+ ∆ϕ2 = ( k −k’) r = −q r Figure 4.28. (a) In an elastic-scattering experiment (|~k ′ | = |~k|), the transferred wave vector ~q = ~k ′ − ~k contains information equivalent to the scattering angle 2θ. (b) The “far field” wave scattered by a point-like element of matter at ~r is dephased by −~q · ~r with respect to the one at the origin ~0, due to the different path length emphasized by bold segments. Calculation of the total interfering wave from a continuous distribution leads to a Fourier-transform “summation”. As a nucleus is essentially point-like, if seen at atomic length scales, the Fourier transform of the density distribution of one nucleus is basically flat, thus one atom scatters neutrons essentially independently of ~q, as drawn in Fig. 4.29a. When two atoms diffuse neutrons, the scattered matter wave interfere: interference leads to intensity reinforcement and reduction in alternating directions. Constructive interference occurs whenever the two source-atom-detector path lengths differ by an integer multiple of the radiation wavelength λ = 2π/|~q|. Quantitatively, the Fourier ~ 1 and R ~2 transform ρ̃nuc (~q) of two equal point-like objects at R ! !2 2 ~ ~ ~ ~ R − R R + R 1 2 1 2 ~ 1 · ~q) + exp(−iR ~ 2 · ~q) = exp −i |ρ̃nuc (~q)|2 ∝ exp(−iR · ~q 2 cos · ~q 2 2 h i ~1 − R ~ 2 ) · ~q = 2 [1 + cos(a qx )] = (281) 2 1 + cos (R shows characteristic oscillations (Fig. 4.30a,c). Constructive interference yields maximum intensity in the ~q directions characterized by a projection qx along the ~1 − R ~ 2 is aligned along x̂) such that line joining the two nuclei (assume that R ~1 − R ~ 2 ) · ~q = |R ~1 − R ~ 2 | qx ≡ a qx = 2π× an integer l1 . The role of interatomic (R ~ ~ separation a = |R1 − R2 | on the interference pattern of the scattered waves is illustrated by the comparison of Fig. 4.30a and c: an increased separation in real space 4.1. THE MICROSCOPIC STRUCTURE OF SOLIDS (a) 165 (b) Figure 4.29. Comparison of (a) neutron and (b) X-ray elastic scattering by a single Si atom. qx represents the x̂-component of the change in wave vector ~k ′ − ~k of the scattered radiation. Intensity is proportional to the square modulus of the Fourier transform of the appropriate density. Neutrons probe the nuclear-matter density, which varies over a length of the order of ∼ 1 fm, and is therefore ~ its Fourier indistinguishable from a Dirac-delta distribution δ(~r − R): transform is essentially independent of qx , until huge qx ≈ 104 Å−1 . X rays measure the Fourier transform of the total atomic electronic charge density, which changes over a length of the order of 1 Å, thus the atomic form factor varies significantly over a qx range of few Å−1 . produces closer interference maxima in ~q space. Scattered intensity is independent of the qy , qz components in the plane orthogonal to the line joining the two atoms, which are therefore ignored by Fig. 4.30 (see also Fig. 4.33a below for a 2D example). Neutron scattering from a crystal arises from the interference of the waves scattered coherently by many atoms. The Fourier transform in Eq. (280) turns into a discrete sum, not unlike light diffracted by an optical grating. Figure 4.31 shows the intensity scattered by short regular chains of atoms: despite the smallness of such 1D “crystals”, narrow dominating diffraction structures emerge at regular ~q directions. The relative intensity of the weak subsidiary interference peaks in between the strong Bragg diffraction peaks decreases quickly as the number of atoms increases. Already for 30 atoms these structures become almost invisible (Fig. 4.31c,d), and vanish completely in the limit of a macroscopically large regular crystal.6 A simple analysis reveals that the directions ~q of maximum diffracted intensity are compatible 6 The modest regularity sufficient to produce strong diffracted peaks grants the possibility to produce visible diffraction patterns even in the event that the coherence length of the probing 166 4. SOLIDS (a) (b) (c) (d) Figure 4.30. (a) Neutron and (b) X-ray scattering by two Si atoms. The patterns shows the characteristic interference periodic~1 − R ~ 2 | is the distance between the two nuclei , where a = |R ity 2π a (here, a = 2 Å). The X-rays pattern is the product (285) of the neutron pattern (uniquely determined by the atomic positions) with the atomic structure factor, carrying information on the charge distribution of the single atom. (c) and (d) are the same as (a) and (b), but with enlarged interatomic separation a = 3 Å: note that the interference fringes move closer, while the enveloping atomic form factor remains unchanged. ~i −R ~ i+1 ) = qx a = 2π l1 : these are the positions of the 1D reciprocal lattice with ~q ·(R radiation available in the lab extends to several lattice cells spacings only, much shorter than the whole sample size. 4.1. THE MICROSCOPIC STRUCTURE OF SOLIDS (a) (b) (c) (d) 167 Figure 4.31. Neutron and X-rays diffraction patterns, like in Fig. 4.30, but produced by 7 (a,b) and 30 (c,d) atoms rather than 2π Å−1 reflect the sepa2. The strong diffraction peaks at distance 2.0 ration a = 2.0 Å. The relative intensity of the subsidiary interference peaks decreases quickly as the number of atoms is increased, and vanishes in the limit of infinitely large crystal. In the X-ray pattern, the diffraction scheme is multiplied by the atomic form factor. vectors qx = Gx = 2π l . This occurs also for 2D and 3D periodicity: we conclude a 1 that Bragg diffraction peaks occur for transferred wave vector equal to reciprocal~ lattice points ~q = G. The patterns of Figs. 4.30-4.31 show that the X-ray diffractograms differ from those of neutrons in an amplitude modulation. This is easily understood in steps: 168 4. SOLIDS (a) (b) Figure 4.32. Comparison of the X-ray atomic form factors of carbon (Z = 6) and of silicon (Z = 14). (a) Absolute value, showing that the total X-ray scattering increases ∝ Z 2 ; (b) The form factor of Si is scaled to coincide with that of C at ~q = 0, showing that the form factor of heavier atoms is broader, due to the sharper charge localization of their inner core electrons. • X rays scattered by a single atom probe the continuous electronic density distribution ρat (~r). According to Eq. (280), one atom scatters X rays elastically with the nontrivial ~q dependence given by Fourier transform of its electronic density, called atomic form factor fat (~q) ≡ ρ̃at (~q), as drawn for Si in Fig. 4.29b. Figure 4.32 reports two examples of X-ray atomic form factors: these are characteristic bell-shaped functions of total weight proportional to the squared number of electrons. • As illustrated in Secs. 2.2.1 and 2.2.2 above, chemical bonding modifies the electronic states of a molecule or a solid, so that the electronic charge density does differ from the sum of the densities of the individual atoms. However only valence electrons are involved in bonding and delocalize significantly. X-ray scattering are scattered mostly by the (for Z not too small more numerous) core electrons, which retain an atomic-like charge distribution. Accordingly, it is a fair approximation to assume that the electronic 4.1. THE MICROSCOPIC STRUCTURE OF SOLIDS (282) 169 distribution of a collection of atoms (e.g. in a molecule, or in a solid) equals the sum of the individual atomic electronic distributions: X ~ . ρel (~r) = ρat (~r − R) ~ R ~ of a • For a sample composed by many equal atoms sitting at the points R Bravais lattice, the density of electrons can be rewritten as a convolution: (283) Z Z X X 3 ~ = d ~s ρat (~r −~s) ~ = d3~s ρat (~r −~s) · ρBravais (~s) , ρel (~r) = ρat (~r − R) δ(~s − R) ~ R ~ R P ~ where ρBravais (~s) ≡ R~ δ(~s − R). • A basic theorem guarantees that the Fourier transform of the convolution of two distributions is the plain product of the Fourier transforms of the individual distributions: (284) IX−rays (~q) ∝ |ρ̃el (~q)|2 = |fat (~q) ρ̃Bravais (~q)|2 = |fat (~q)|2 |ρ̃Bravais (~q)|2 . • Finally, observe that ρnuc (~r) ∝ ρBravais (~s), and conclude that (285) IX−rays (~q) ∝ |fat (~q)|2 |ρ̃nuc (~q)|2 ∝ |fat (~q)|2 |Ineutr (~q)|2 . This relation details the role of the atomic form factor fat (~q) in X-ray diffraction: X-rays scatter in the same ~q directions as neutrons of the same wavelength, but the peak intensities are modulated multiplicatively by the abs square atomic form factor. This observation accounts for the compared diffractograms of Figs. 4.29 to 4.31. The 1D examples discussed above generalize simply to 2D and 3D. Figure 4.33 shows the 2D pattern of neutrons scattered by 2, 3, 4, and 5 equal atoms located at the vertexes of regular polygons. The fact that 3 and 4 atoms produce a Bravais lattice as an interference pattern, while the pentagonal arrangement does not, suggests that pentagonal symmetry is incompatible with repetition in space. Figure 4.34 shows the typical effect of geometric deformations of the lattice on the diffraction pattern, as described mathematically by Eq. (277). Figures 4.35, 4.36, 4.37, 4.38, illustrate typical 2D neutron diffraction patterns for several geometries. The handy software of Ref. [?] permits to explore arbitrary 2D geometries. The 3D patterns follow similar rules: diffracted beams come out in the directions ~ where the transferred ~q matches a G-vector of the 3D reciprocal lattice. In fact, by shining a monochromatic neutron or X-ray beam on a single crystal, one gen~ erally obtains no diffracted beams, since, for that given ~k, all possible ~k ′ = ~k + G have length |~k ′ | different from ~k, and thus would correspond to inelastic scattering. 170 4. SOLIDS (a) (b) (c) (d) Figure 4.33. 2D neutron scattering patterns produced by regular polygons of 2 to 5 equal atoms. For 2 atoms (a), as indicated by Eq. (281), scattering is independent of the ~q component perpendicular to the line joining them. Note that the regular triangle (b) and square (c) produce a Bravais lattice as a “diffraction” pattern, while the q1 pentagon (d) does not. In this figure and in others below, h = 2π and q2 k = 2π . 4.1. THE MICROSCOPIC STRUCTURE OF SOLIDS 171 (a) (b) (c) Figure 4.34. 2D X-rays intensity scattered by 4 atoms: this simulates the general features of a square (a), rectangular (b), and oblique (c) net. The relations (277) of inverse proportionality of the reciprocallattice and direct-lattice basis are well illustrated. The diffraction spots at the reciprocal lattice points become much sharper when an actual lattice produces them, rather than 4 atoms only. Diffracted beams are produced only by carefully orienting the crystal, so that, for ~ the incident and scattered wavenumbers match: some G, (286) ~ . |~k| = |~k ′ | = |~k + G| 172 4. SOLIDS Figure 4.35. The neutron diffraction pattern produced by a 9 × 9 square lattice of equal atoms. A red square marks the real-space primitive cell; a blue square marks the reciprocal space primitive cell. The peaks are much sharper than in the 2 × 2 example of Fig. 4.34. Figure 4.36. The neutron diffraction pattern produced by a rectangular lattice. 4.1. THE MICROSCOPIC STRUCTURE OF SOLIDS Figure 4.37. oblique lattice. The neutron diffraction pattern produced by an Figure 4.38. The neutron diffraction pattern produced by a triangular lattice. 173 174 4. SOLIDS PLANES LATTICE CRYSTAL SOURCE ing m bea om inc k θ θ k’ |k| sin θ G θ θ k’ ou tgo ing bea m (a) DETECTOR (b) Figure 4.39. (a) The Ewald construction. Given the incident vector ~k, draw a sphere (the Ewald sphere, in gray) of radius |~k| about the point ~k. Diffraction peaks corresponding to reciprocal lattice vectors ~ will be observed only if −G ~ happens to lie on the surface of the G Ewald sphere, as drawn: radiation is then diffracted to the direction ~k ′ = ~k + G. ~ (b) The relation between the scattering angle 2θ and ~ vectors. The angle θ between the incident (or the lengths of ~k and G the scattered) beam and the relevant Bragg plane of atoms fixes the projection ~k · Ĝ = |~k| sin θ, which is compatible with diffraction when ~ it equals 21 |G|. This geometric condition is represented by the Ewald construction of Fig. 4.39a. ~ by The scattering angle, i.e. the angle between ~k and ~k ′ is related to |~k| and G squaring Eq. (286), obtaining (287) ~k · Ĝ = 1 |G| ~ . 2 Figure 4.39b illustrates this geometric relation: the projection ~k · Ĝ = |~k| sin θ should ~ Accordingly, scattering is observed at angles 2θ connected equal half the length of G. ~ by to the lengths of ~k and G (288) sin θ = ~ |G| . 2 |~k| ~ to a family of lattice planes (drawn in Fig. 4.39b) sepIn Sec. 4.1.2, we related G arated by a distance d = n |2π ~ , where n is the greatest common factor among the G| ~ On the other hand, |~k| is connected to the radiation integer Miller indexes fixing G. 4.1. THE MICROSCOPIC STRUCTURE OF SOLIDS 175 Figure 4.40. The Ewald construction for a powder sample: ~k defines a fixed Ewald sphere (gray). All possible orientations of the ~ lattice (solid crystal are realized in the powder sample: the whole G ~ point covpoints) is thus rotated in all possible ways, so that every G ers a sphere (represented by a circle in the plane of the figure). Each of these spheres intersects the Ewald sphere on a circle, whose projection in the plane of the figure is marked by two empty dots. All ~k ′ on that circle make a fixed angle 2θ with ~k. Radiation is then scattered to cones whose axis is the k̂ direction. : substitution in Eq. (288) yields the celebrated Bragg wavelength λ by |~k| = 2π λ condition for diffraction (289) 2d sin θ = nλ . ~ or equivalently According to this construction, no diffraction occurs for 2|~k| < |G|, for λ > 2d. In practice, to produce diffracted beams off a perfect single crystal, one must ~ point touches the move the reciprocal lattice and the Ewald sphere until some G Ewald sphere, as in Fig. 4.39a. One can either vary the radiation wavelength, thus changing the Ewald sphere diameter, or else rotate the crystal sample (its reciprocal lattice rotates accordingly, see Eq. (277)). 176 4. SOLIDS (a) (b) (c) Figure 4.41. Powder diffraction patterns: intensity as a function of the scattering angle 2θ. (a) BaTiO3 , the intensity counting compared to the film recording. (b) La2 Mo2 O9 , showing a structural phase transition. (c) LaPO4 , showing the progressive sample annealing of a clear crystalline phase as temperature is raised, and the comparison with the peak positions of known phases. In the lab, it is common to characterize the structure of powder samples, i.e. collections of microcrystals rotated randomly in space. This uniform distribution of orientations is equivalent to averaging over all possible rotations of the reciprocal lattice. As illustrated in Fig. 4.40, radiation is scattered at fixed angles 2θ (given by Eq. (288)) away from the incident ~k, thus it forms cones of diffracted radiation. Figure 4.11 shows an example of such a pattern. The sample in that experiment is an Al foil, not a proper powder. The ring pattern of Fig. 4.11 indicates sharp cones of radiation diffracted at characteristic angles: this proves that Al is microcrystalline. Microcrystalline structures of this kind (tightly bound collections of randomly oriented microscopic individual crystals separated by grain boundaries) are responsible for the plastic deformable character of most solid metals, as opposed to the rigidity of many solids forming large single crystals (e.g. Si, NaCl). 4.1. THE MICROSCOPIC STRUCTURE OF SOLIDS 177 Structural data about powder or microcrystalline samples are conveniently collected in plots of the diffracted intensity at the each angle 2θ, as in the patterns of Fig. 4.41. Here the vertical axis reports the total scattered intensity, integrated along circles at fixed 2θ. Till this point, the discussion assumed a single atom per unit cell of the Bravais lattice. To extend the formalism to general crystals with a n-atoms basis, replace the atomic form factor in Eq. (285) with a suitable structure factor (290) S(~q) = n X ~ fat j (~q) ei~q·dj , j=1 representing the Fourier transform of the matter distribution of the n atoms in the cell sitting at positions d~j . For neutrons, the fat j (~q) are ~q-independent weights, numerically different for different nuclei in the cell; for X-rays, fat j (~q) are the form factors (see Fig. 4.32 and Eq. (285)) of the individual atoms in the cell. The decomposition I(~q) ∝ |S(~q)|2 |ρ̃Bravais (~q)|2 implies a fundamental result: a given Bravais lattice yields the same characteristic diffraction pattern irrespective of the number, kinds and position of the atoms populating its unit cell: these details affect the structure factor |S(~q)|2 , in turn applying a multiplicative intensity modulation to the same peaks. The possibility of describing lattices with a basis might make us suspect some ambiguity. For example, a square lattice of side a could also be seen as a square ~ points of the 2a lattice of side 2a with 4 atoms per cell: the reciprocal-lattice G lattice are twice as dense in each direction, but the diffracted pattern should remain the same as we only change the formal description of the same system. Indeed, ~ points of the denser the structure factor (drawn in Fig. 4.34a) vanishes for all G reciprocal lattice which do not belong to the reciprocal lattice of the true a-side square. However, the 2a lattice may become the correct minimal description of the actual lattice, e.g. in case of a structural deformation – one atom in each 2 × 2 square moving away from its perfect-square position. Under these conditions, the ~ peaks acquire nonzero intensity. Similarly, a fcc lattice of conventional 2a denser G ~ points of a cell side a, as a Bravais lattice, produces diffraction spots for all ~q = G 4π bcc reciprocal lattice of conventional side a . However, the same fcc lattice can be seen as a simple cubic lattice of side a with 4 atoms/cell. The diffraction pattern of ~ points (a simple cubic lattice of side 2π ) than the simple cubic lattice has denser G a that of the fcc. The actual diffraction pattern must coincide regardless of the chosen formalism. Indeed, the structure factor (290), computed for 4 equal atoms (equal ~ fat j (~q)) sitting at positions ~0, a2 (x̂ + ŷ), a2 (x̂ + ẑ), a2 (ŷ + ẑ), gets rid of peaks at G ) not belonging to the actual bcc reciprocal points of the simple cubic lattice (side 2π a 4π lattice (side a ). 178 4. SOLIDS Defects and the finite size of the crystals in the sample add a continuous background of diffuse scattering to the sharp Bragg peaks (see e.g. Fig. 4.31) of the diffraction pattern. If disorder increases, this continuous background increases in intensity, until for completely amorphous or liquid samples, neutron or X-ray scattering does not show the sharp Bragg peaks characteristic of lattice periodicity any more: scattered intensity becomes a smooth function of the angle 2θ. Even for such materials, the scattered distribution provides useful information about their average structural properties, which can be retrieved by numerical Fourier analysis. 4.2. Electrons in crystals Within the adiabatic framework (Sec. 2.1), electrons move in a solid according to the electronic equation (117). The total electronic energy obtained by solving Eq. (117), added to the internuclear repulsion, yields the adiabatic potential (125) which, in turn, determines the dynamics of the nuclei through Eq. (124). For a crystalline solid at low enough temperature, Vad keeps the atomic configuration close to its minimum, characterized by a regular arrangement of the nuclei, not unlike a finite (huge) portion of an ideal crystal structure of the kind described in Sec. 4.1. In Sec. 4.3 we shall investigate the motions of the ions around their equilibrium configuration. For the moment however we neglect the nuclear kinetic energy and the movements of the ions: we assume they sit at equilibrium exactly at the ideal crystal-structure positions, and point our attention to the dynamics of electrons. Within this idealized scheme, we investigate the universal properties of the solutions of the electronic equation (117), those required by symmetry. Equation (117) is a many-body equation, carrying the same conceptual difficulties we discussed for the molecular case. As for atoms and molecules, the Schrödinger problem of many electrons moving in the field of the nuclei is usually treated in the framework of some mean-field self-consistent technique of the Hartree-Fock type. As discussed in Sec. 1.2.4, this approximation maps the N -electron equation to a set of single-electron self-consistent equations for the motions of an electron in the field of the nuclei and N − 1 other electrons. We assume that the mean-field effective potential Veff (~r) has the same symmetry as the field created by the bare nuclei, i.e. the full crystal symmetry7 (Fig. 4.42). As in atoms, the single-electron HF orbitals carry spherical-symmetry labels l and m, similarly in solids a single-electron states are labeled by Bravais-lattice group representations, namely vectors ~k chosen within a primitive zone of the reciprocal lattice, e.g. the first Brillouin zone. More concretely, the ~k quantum number contains information on how the wavefunction 7 This assumption is far more reasonable than the assumption that the HF mean field of atoms is spherically symmetric. The reason is that the crystal states have all equal probability in different cells (as discussed below), while non-s atomic states have nonuniform probability distributions. 179 0 -10 Vne(x) Veff(x) [Ha] 4.2. ELECTRONS IN CRYSTALS -20 x Figure 4.42. The bare nuclear potential Vne (x) (solid) and the screened effective one-electron potential Veff (x) (dashed) along a 1D cut through a direct-lattice primitive direction in a typical solid (e.g. Al). The effective potential is less attractive than the bare nuclear potential, but it shows the same discrete lattice translational symmetry, ~ Veff (~r) = Veff (~r + R). changes under the action of lattice translation TR~ , i.e. in going from one cell to another within the lattice. Additional quantum numbers are needed to label states of different wavefunction within a single primitive cell, and for spin projection. A precise determination of the wavefunction and energy for a single electron moving in the HF mean field requires a detailed calculation, which is generally carried out by numerical means. In this context, symmetry plays a twofold role: • greatly simplify the solution of the HF-Schrödinger equation in the lattice; • understand general features of its solutions. Lattice discrete symmetry applies to the electronic eigenstates of crystals through Bloch’s theorem. This theorem states that all Schrödinger eigenstates in a periodic ~ can be chosen in the factorized form potential [Veff (~r) = Veff (~r + R)] (291) ~ ψj (~r) = eik·~r u~k j (~r) , where the function u~k j (~r) has the same periodicity of the lattice [u~k j (~r) = u~k j (~r + ~ and ~k is a suitable wave vector (depending on ψj , but otherwise subject to no R)], restriction). This fundamental result is seen in two alternate, but equally instructive ways: 180 4. SOLIDS • In a periodic context all electronic eigenfunctions have a nontrivial spatial dependence only within one primitive unit cell: in any other cell displaced by TR~ , the wavefunction is equal to that in the original cell, apart from a ~ ~ constant phase factor eik·R which leaves the probability distribution |ψj |2 unaffected. This consideration is basically also the demonstration of Bloch’s theorem.8 • In a periodic potential, the Schrödinger eigenstates are essentially planewave–like states, except for a periodic (thus trivial) amplitude modulation. Observe that ~k may as well be restricted to one primitive cell of the reciprocal lattice, e.g. a primitive parallelepiped, or the first Brillouin zone9 – see Fig. 4.17. Indeed, if ~k was outside this primitive cell, we could always find a reciprocal lattice ~ such that ~k ′ = ~k + G ~ is in the primitive cell of our choice. But then vector G ′ ~ ~ ~′ ~ ~ ψj (~r) = eik·~r u~k j (~r) = ei(k −G)·~r u~k j (~r) = eik ·~r e−iG·~r u~k j (~r), and the function u~′k′ j (~r) = ~ e−iG·~r u~k j (~r) is lattice-periodic, thus it makes a valid Bloch function. It is instructive to find the explicit equation satisfied by the Bloch functions u~k j (~r). This is obtained by substituting the decomposition (291) in the stationary Schrödinger equation for the electrons in the periodic effective potential: ~2 2 ~ ~ (292) − ∇ + Veff (~r) eik·~r u~k (~r) = E eik·~r u~k (~r) . 2me To deal with the kinetic term, observe that h i ~ ~ · ei~k·~r ∇u ~ ~ (~r) + i~kei~k·~r u~ (~r) ∇2 eik·~r u~k j (~r) = ∇ kj kj 2 ~ ~ ~ ~ (~r) − |~k|2 ei~k·~r u~ (~r) = ei~k·~r ∇ ~ + i~k u~ (~r) . = eik·~r ∇2 u~k j (~r) + 2i~keik·~r ∇u kj kj kj By substituting this decomposition into the Schrödinger equation (292), and dividing ~ by the common factor eik·~r , we obtain 2 ~2 ~ ∇ + i~k + Veff (~r) u~k j (~r) = E~k j u~k j (~r) . (293) − 2me 8 The single-electron effective Hamiltonian Te + Veff (~r) commutes with all lattice translations TR~ . Accordingly, its eigenfunctions may be chosen as simultaneous eigenfunctions of all discrete ~′ ~ translation operators. But then TR~ ψj (~r) = e−ik ·R ψj (~r) for some ~k ′ in the first Brillouin zone. ~′ Accordingly, one can call ~k = −~k ′ and u~k j (~r) = ψj (~r) eik ·~r is periodic by construction. 1dimensional group representations do not seriously break the lattice symmetry, except possibly for phases. As a related example, the symmetric and antisymmetric wavefunctions of Eq. (126), drawn in Fig. 2.3 have both a perfectly symmetric square modulus under reflections across the mid-plane separating the nuclei, exactly like Bloch states have square modulus equal in all cells. 9 In 1D, any k-interval of length 2π . For example, the interval − π < k ≤ π is the first Brillouin a a a zone. 4.2. ELECTRONS IN CRYSTALS 181 E j=3 j=2 π _ a − _π a kx j=1 Figure 4.43. For each given ~k in the first Brillouin zone, the solution of Eq. (293) provides discrete electronic energies E~k j , for j = 1, 2, ..., three of which are sketched here. These energies depend parametrically on ~k. The continuous functional dependency Ej (~k) = E~k j is an energy band of the solid. This is the basic equation for the stationary states of an electron characterized by a given ~k. Thanks to the periodicity of u~k j established by Bloch’s theorem, Eq. (293) must be solved within a single cell of the direct lattice, with applied periodic boundary conditions. This is to be contrasted with the original Schrödinger equation, which should instead be solved in the whole crystal. Lattice discrete symmetry, through Bloch’s theorem, permits us to study the quantum dynamical problem, rather than on a macroscopically large volume (where it would be arduous to seek for solutions – this is the reason why non-periodic solids are much harder to deal with), on a relatively small unit cell, where investigation can profitably be carried out by a number of approximate techniques. Once u~k j (~r) is found, the true electronic wavefunction ψj (~r) is then extended to the whole lattice by means of Eq. (291). Before attempting the solution of Eq. (293), we establish a few universal properties of the single-electron eigenstates and eigenenergies in a solid. Equation (293) is a second-order differential equation of the Schrödinger type, defined in a finite volume Vc , with standard periodic boundary conditions. Apart for the ~k shift of wave vector, Eq. (293) is equivalent to a stationary Schrödinger equation. Consequently, for a given fixed ~k, its solutions must be qualitatively similar to those of a standard 182 4. SOLIDS Figure 4.44. Electronic states of a solid built approaching isolated atoms: as the atoms move closer, the lattice separation a becomes comparable to the size of the atomic wavefunctions, and the individual atomic levels spread out in bands. If external pressure is applied to the solid, so that the lattice parameter decreases to a′ , then the bandwidths increase further, and more bands may overlap. Schrödinger equation in a finite volume, namely: a ladder of discrete eigenenergies E~k j associated to eigenfunctions u~k j (~r) with larger and larger number of nodes for increasing energy. The index j = 1, 2, 3, ... labels precisely the solutions in order of increasing energy (Fig. 4.43). ~k can vary arbitrarily within the first Brillouin zone, and as it does, the equation (293) for u~k (~r) changes, which explains attaching the label ~k to both eigenenergies E~k j and eigenfunctions u~k j (~r). The parameter ~k acts analytically in Eq. (293), thus it is reasonable to expect that its solutions depend analytically on ~k. Indeed, for fixed j, the eigenenergies E~k j depend on ~k as continuous functions called energy bands, or simply “bands” (Fig. 4.43). As ~k takes all its allowed values within the first Brillouin zone, for each j, E~k j spans a continuous interval of available energies (the range of the E~k j function, sometimes called itself a “band”). The ranges of two successive bands E~k j and E~k j+1 can either overlap (like bands 2 and 3 in Fig. 4.43) or not overlap (like bands 1 and 2 in Fig. 4.43). Both possibilities are compatible with Eq. (293), and do occur in actual solids. The main consequence of Bloch’s theorem is then a spectrum of electronic energies involving intervals of allowed energies, separated by ranges of forbidden energies (band gaps). In a crystal, electrons are characterized by an energy spectrum somewhat intermediate between that of a free [j=1] Re ψk j(x) Re ψk j(x) (a) 183 [j=2] 4.2. ELECTRONS IN CRYSTALS x (b) x Figure 4.45. A sketch of the real part of some Bloch wavefunction ψk j (x) = eik·x uk j (x) in a 1D lattice, for (bottom to top) k = 0, , k = 0.2 2π , k = 0.3 2π , k = 0.4 2π , k = 0.5 2π , for the k = 0.1 2π a a a a a states belonging to the two lowest bands j = 1 (a) and j = 2 (b). Negative k values of the same length |k| yield wavefunctions whose real part is the same. As k increases across the first Brillouin zone, the phase difference of the wavefunction at neighboring sites increases. At the zone boundary k = πa , this phase difference is maximum, and equals π, corresponding to the sign alternation of the upmost curve. At the same time, the shape of the wavefunction at each site also evolves with k, due to the k dependence of the equation (293) for uk j (x). particle (all positive energies) and that of an atom (isolated eigenvalues separated by gaps). Figure 4.44 sketches the connection between the atomic and the solid-state spectra as the component atoms move together to form the solid. Figure 4.45 shows the qualitative shape of a few band eigenfunctions in a 10-sites portion of a simple 1D lattice (the picture proposed here extends simply to 2D and 3D). The real part of the wavefunctions for two bands is drawn: the imaginary part would be similar, but out of phase by one quarter of wavelength. The probability density |ψ~k j (x)|2 is periodic, i.e. repeated equally in all cells. The primary role of ~k is to tune the phase change of the wavefunction in going from one site to the next. The wavelength associated to these Bloch waves is 2π . The values of ~k chosen |~k| in Fig. 4.45 yield wavelengths commensurate to the 10-sites region drawn. Other 184 4. SOLIDS |2> |1> ... |0> |1> |0> | N n −1> (a) (b) Figure 4.46. A 1D periodic lattice of Nn sites with periodic boundary conditions has the connectivity of a ring. (a) The Nn = 2 case, representing a diatomic molecule. (b) A generic 1D lattice. intermediate choices of ~k would produce incommensurate wavelengths. In addition, ~k also modifies the local “shape” of the wavefunction ψ~ (x) within each cell, through kj the explicit dependency of Eq. (293), and thus of u~k j (x). This ~k-dependency is smooth and often relatively weak, while the probability density usually changes more substantially in comparing a band j to another j ′ . The detailed calculation of the band energies E~k j and wavefunctions u~k j (x) of an actual solid is typically carried out by means of some self-consistent calculation of the mean-field potential Veff (~r), associated to numerical solution of Eq. (293) in a directlattice primitive unit cell, for a sufficiently dense set of ~k points. However, some insight in the physics of the band states can be obtained by considering substantially simplified “model” solutions. 4.2.1. Models of bands in solids. Although Eq. (293) is not especially difficult to solve numerically in a primitive unit cell, for any reasonable number of “sample” ~k points, it is useful to gain better insight in the properties of the band energies and states by means of simplified approximate solutions, which allow for some analytical treatment. 4.2.1.1. The tight-binding model. In regions near the atomic nuclei, the crystal effective potential acting on an electron is not much different from that of an isolated atom. A crystal may be seen as a huge molecule: accordingly, the electronic (band) states could be approximately expressed as suitable linear combinations of atomic wavefunctions, like the molecular states of H+ 2 and H2 . The 2-atoms calculation of Sec. 2.2.1 is the simplest example of tight binding. It addresses the 1s “band” of a “crystal” composed by Nn = 2 H atoms only. |1s Li 4.2. ELECTRONS IN CRYSTALS 185 Ek 1 0 -1 -0.4 -0.2 0 0.2 0.4 k a / 2π Figure 4.47. The continuum of electronic states in a solid builds up as the Nn discrete energies at the allowed k points get denser and denser with Nn → ∞. represents the state at the left atom. |1s Ri, represents that at the right atom obtained from |1s Li by a “lattice” translation. Artificial periodic boundary conditions bring |1s Ri back again to |1s Li upon a further translation (Fig. 4.46). The symmetry-adapted states (see Eq. (126)) can be seen as bonding and antibonding combinations 1 X ikpa 1 X 1 |pi = √ e |pi , k=0 |Si = √ (|1s Li + |1s Ri) = √ Nn p=0,1 Nn p=0,1 2 1 1 X ikpa π 1 X |Ai = √ (|1s Li − |1s Ri) = √ (−1)p |pi = √ e |pi , k = , a Nn p=0,1 Nn p=0,1 2 with |0i = |1s Li and |1i = |1s Ri. The only possible phase relations between in those two-sites combinations are “in phase” and “out of phase”. More in general, for a chain of Nn ≥ 2 H atoms (a 1D “crystal”, Fig. 4.46), the combinations compatible with the discrete translational symmetry are the following Nn ones: Nn −1 1 X eikpa |pi . ψk = √ Nn p=0 k takes the Nn equally spaced values 1 2π 2 2π 3 2π Nn /2 − 1 2π 1 2π k = 0, ± , ± , ± , ..., ± , and Nn a Nn a Nn a Nn a 2 a compatible with the “ring” periodicity. In the limit of very large Nn , the k states form a continuum in the first Brillouin zone − πa < k ≤ πa . As k spans this range, 186 4. SOLIDS Figure 4.48. The band states in a solid seen as combinations of the orbitals of the isolated atoms as they move together to form a crystal. The Nn discrete levels of a finite solid of Nn unit cells build up the band continuum in the thermodynamic limit Nn → ∞, as discussed in Sec. 4.2.2. The bandwidth increases as the lattice parameter a is reduced to its equilibrium value, due to the increasing overlap, as illustrated for the bonding and antibonding states of the diatom in Fig. 2.4b. the energy of the resulting band moves continuously from the bonding value (band bottom) to the antibonding value (band top), as illustrated in Fig. 4.47. This bandstructure is a generalization of the molecular Eq. (128). Similarly, as the interatomic separation a is reduced, the overlap integral between atomic orbital increases, and so does the bandwidth (Fig. 4.48). These simple concepts apply to any atomic orbital of for real 3D solids, as illustrated in Fig. 4.49 for solid sodium. Note that, for a given interatomic separation a, the overlaps of shallow extended atomic states are larger than those of deeper, more compact core states. Accordingly, in the solid state, inner levels originate narrower bands than external optically active atomic orbitals. The initial lowering of the band center of mass from the atomic energy as a is reduced from very large values (Fig. 4.48) is due to two main reasons: • a band state in a crystal is subject to the collective attraction of all nuclei, rather than just one; 4.2. ELECTRONS IN CRYSTALS 187 Figure 4.49. Tight-binding energy bands of solid sodium, as a function of internuclear distance. The strongly overlapping 3s, 3p, ... bands indicate that the tight-binding method is not especially well suited to describe the wide conduction band at the equilibrium lattice spacing. In fact, the plane-waves method (Sec. 4.2.1.2) is better fit to the conduction band of alkali metals. • the kinetic energy of the more delocalized band state is lower than that of the more compact atomic state. If the band in exam happens to be filled with electrons, this energy lowering contributes to the crystal cohesive energy. 4.2.1.2. The plane-waves method. An approach to approximate bands alternative to tight binding starts from observation that, as illustrated in Fig. 4.42, the effective one-electron potential for the outer electrons is substantially screened in the crystal, thus free-electron plane-wave states should approximate well the actual eigenstates, except near the atomic nuclei. In the following, we use the 3D formalism, but the problem is most simply visualized in 1D, and most illustrative examples restrict to 1D for simplicity. 188 4. SOLIDS Expand the one-electron lattice eigenstates on the basis of plane waves: X |bi = b~k′ |~k ′ i, ~k′ where |~k ′ i are suitably normalized plane-wave states,10 and, in principle, the sum extends over all ~k ′ vectors, and amounts therefore to an integration. Application of Schrödinger equation to the candidate eigenket |bi and multiplication on the left by h~k| (implying a volume integration over ~r) maps the initial differential problem to an algebraic (matrix) equation for the wavefunction Fourier components b~k′ : 2 p + Veff |bi = E |bi H |bi = 2me 2 p ~ hk| + Veff |bi = E h~k|bi 2me X X ~2 k 2 ′ (294) δ~k,~k′ + h~k|Veff |~k i b~k′ = E δ~k,~k′ b~k′ , 2m e ′ ′ ~k ~k where appropriate orthonormality h~k|~k ′ i = δ~k,~k′ of the plane waves and the fact that they are eigenstates of the momentum have been used. The matrix elements Z ~′ ~ ′ ~ ~ hk|Veff |k i = N ei(k −k)~r Veff (~r) d3~r = Ṽeff (~k − ~k ′ ) are the Fourier components of the potential (and N is the appropriate normalization according to the convention discussed in footnote 10). Until this point, no mention has been made of any lattice symmetry, and indeed Eq. (294) is the equivalent formulation of Schrödinger equation in Fourier space for a generic potential Veff . This formulation is generally not advantageous, since the matrix indexes ~k of the eigenvalue problem (294) are continuous quantities, taking infinitely many values, exactly like ~r in the real-space equation. For a periodic potential however, as observed in the general discussion around Eq. (278), the Fourier expansion of the periodic potential is a discrete Fourier series over the reciprocal ~ a vector of the reciprocal lattice, i.e. Ṽeff (~k − ~k ′ ) is nonzero only for ~k − ~k ′ = G, lattice. This means that in the continuous-indexed energy matrix of Eq. (294) most off-diagonal matrix elements vanish. In practice, given any ~k ′ in the first Brillouin zone, the off-diagonal potential matrix elements connect the plane wave |~k ′ i only ~ to plane waves |~ki, whose ~k is displaced by a reciprocal lattice vector: ~k = ~k ′ + G. In an infinite space, the standard normalization is h~r|~ki = (2π)−3/2 exp(i~k · ~r). In a finite volume V , the correct normalization is h~r|~ki = V −1/2 exp(i~k · ~r). 10 4.2. ELECTRONS IN CRYSTALS 189 One can then consider separately each subset of states originated from a given ~k. Only the corresponding matrix sub-blocks need to be diagonalized. In one of these sub-blocks, the matrix form of Eq. (294) is:  .. .. .. . . .  ~ 1 ) + Ṽeff (~0) ~ ~ ~ ~ 3)  · · · T (~k+ G Ṽeff (G1 − G2 ) Ṽeff (G1 − G   ··· ~2 − G ~ 1) ~ 2 ) + Ṽeff (~0) ~2 − G ~ 3) Ṽeff (G T (~k+ G Ṽeff (G   ··· ~3 − G ~ 1) ~3 − G ~ 2) ~ 3 ) + Ṽeff (~0) Ṽeff (G Ṽeff (G T (~k+ G  .. .. .. . . . ... (295)   ..    .  ···  b~ ~     k+G1   b · · · ·   ~k+G~ 2    ···    b~k+G~ 3  .. ... .  .  .. b   ~k+G~ 1    = E~k  b~k+G~ 2  ,    b~k+G~ 3  .. . where T (~k) = ~2mke . For fixed ~k in the first Brillouin zone, this matrix must be diagonalized. Diagonalization becomes trivial whenever all off-diagonal matrix elements ~ 6= ~0) = 0): the eigenvalues in are identically null, i.e. for a constant potential (Ṽeff (G the solid are then simply the free-particle energies shifted by the constant potential 2 2 E~k = ~2mke + Ṽeff (~0), and the eigenstates coincide with the original plane waves |~ki. ~ 6= ~0 Fourier components of Veff are nonzero: However, in any realistic solid, many G these generate off-diagonal couplings among plane waves. The exact eigenstates ~ vectors, of the problem are linear combinations of the plane waves differing by G ~ obtained by the diagonalization of the full matrix in Eq. (295). As the G-points are infinite, this is again an infinite matrix, but one can cut the basis restricting ~ to a finite number d of relevant states, i.e. of G-points, and then diagonalize a finite version of Eq. (295) numerically. This method is routinely used in standard bandstructure calculations. ~ 6= 0)|), analytical information Whenever the potential is “weak” (small |Ṽeff (G can be extracted out of (295). More precisely, whenever the off-diagonal elements ~2 −G ~ 1 ) are small compared to the diagonal separation |T (~k+G ~ 1 ) − T (~k+G ~ 2 )| Ṽeff (G of the coupled states, the off-diagonal term acts as a small perturbation, thus it ~ G ~ 1) “perturbs” the diagonal energy only weakly. As a result, if all couplings Ṽeff (G− ~ 1 i to all other plane-wave states are small with respect to their of a state |~k + G diagonal energy separation, we can safely assume that the exact band energy shall 2 2 190 4. SOLIDS T(k) [eV] 30 20 10 0 -2π/a 0 2π/a k ~ 1 ) = T (~k+ Figure 4.50. A graphical solution of the equation T (~k+G ~ 2 ). The special ~k-points associated to plane waves with degenerate G kinetic energies are obtained by translating the free-energy parabolic dispersion to all possible reciprocal lattice G-vectors (here only two are indicated for simplicity). In this figure, a = 2.1 Å, and G1 = 0, G2 = 2π . a not differ much from the diagonal energy (296) E~k ≈ ~2 |~k|2 + Ṽeff (~0) . 2me Even in the favorable case of tiny Fourier components of Veff , the condition ~ ~ ~ ~ ~ ~ T ( k+ G ) − T ( k+ G ) ≪ Ṽ ( G − G ) (297) eff 1 1 2 2 is not verified for a few special ~k values, those such that the kinetic terms are ~ 1 ) ≃ T (~k + G ~ 2 ). At these points (highlighted in degenerate or nearly so: T (~k + G ~1 −G ~ 2 ) becomes dominating, and it displaces Fig. 4.50), the off-diagonal term Ṽeff (G the actual band significantly away from the free-electron parabola Eq. (296). Near the degeneracy of two states only, approximate bands energies can be calculated by diagonalizing the 2 × 2 matrix ~ 1 ) + Ṽeff (~0) ~1 − G ~ 2) T (~k+ G Ṽeff (G ~2 − G ~ 1) ~ 2 ) + Ṽeff (~0) . Ṽeff (G T (~k+ G 4.2. ELECTRONS IN CRYSTALS 191 This is similar to Eq. (130), but complex hermitian rather than real. The eigenen~ 1 ) + Ṽeff (~0) in place of EL , T (~k + G ~ 2 ) + Ṽeff (~0) in place of ergies (131), with T (~k + G ~ ~ ER , and |Ṽeff (G1 − G2 )| in place of ∆, solve also to the 2 × 2 problem at hand. As these eigenenergies (131) show, the off-diagonal element produces a characteristic ~1 − G ~ 2 ). “repulsion” of the levels, which do never get any closer than 2 Ṽeff (G This model illustrates the tendency of the periodic components of the potential to open forbidden energy intervals in the otherwise uninterrupted parabolic freeelectron dispersion. As illustrated in Fig. 4.51, in 1D, gaps are guaranteed to open ~ ~ of the potential happens to at all G2 points (unless some Fourier component |Ṽeff (G)| vanish). In several dimensions, degeneracies of the kinetic term occur for all ~k, such ~ = |~k|, which is the condition (286) for Bragg scattering. This may open that |~k + G| a band gap in some direction in ~k-space, but this does not always generate a true gap, i.e. a range of forbidden energy, since those same energies may well be allowed in some other ~k-direction (see Fig. 4.55 below). Both the free-electron starting point described here and the tight-binding method sketched above lead to single-electron spectra characterized by bands of allowed energy separated by gaps of forbidden energy, in accord with Bloch’s theorem. These models allow us to clarify the physical meaning of both quantum numbers of electronic states in solids (see Eq. (293) and Fig. 4.43): ~k (illustrated in Fig. 4.45), and the band index j. In a tight-binding context, the band index contains indications about the atomic nature of the state (which is a clear-cut concept especially for the narrow bands of inner shells, see Fig. 4.49). Instead, in the quasi-free-electron approach (especially efficient for the wide bands related to the empty atomic levels) the band index represents basically the number of times that the ~k 6= ~0 Fourier components of the potential have reflected the free-electron parabola back inside the first Brillouin zone. 4.2.2. Filling of the bands: metals and insulators. The T = 0 state (ground state) of a system of many independent electrons in a periodic potential is obtained by filling the one-electron band states up to a Fermi energy, like in the free-fermion model described in Sec. 3.2.2.1. The Fermi energy separates filled (below) from empty (above) levels. Depending on the total number of electrons in the solid, the Fermi energy may end up either inside one (several) energy band(s), or within a band gap. In the first case, the electrons in the partly filled band(s) close to the Fermi energy are ready to take up excitation energy, for example by external fields. In particular, an arbitrarily weak applied electric field can accelerate electrons, which can then conduct electric current. Such a solid is a metal. In contrast, all electrons in completely filled bands are “frozen” by Pauli’s principle. Any dynamical response of these electrons requires excitation across some energy 192 4. SOLIDS Figure 4.51. (a) The free-electron dispersion E versus k – a parabola in 1D. (b) A G-translated free-electron parabola crosses the original free-electron band at 12 G: in the neighborhood of this crossing a weak periodic potential component produces the largest distortion. (c) A gap of width 2|UG | = 2|Ṽeff (G)| opens at the degeneracy point k = 21 G: the band distorts in a whole neighborhood around there. (d) The portions of the band originated from the starting free-electron dispersion. (e) The effects of other G components of the potential on the same free-electron parabola. This particular way of representing the bands is known as extended-zone scheme. (f) Same bands reported inside the first Brillouin zone (reduced-zone scheme). (g) Same bands, in a repeated-zone scheme. 4.2. ELECTRONS IN CRYSTALS 193 Figure 4.52. Two qualitatively different possibilities for band filling at T = 0: a partly filled band (the solid is a metal), and a completely filled band (with all bands either full or empty and the Fermi energy in a gap the solid is an insulator). Figure 4.53. Qualitative bands of a pure semiconductor, compared to the Fermi distribution. gap. A solid where all bands are either filled or empty, with the Fermi energy inside a gap is an insulator. In solids, the highest completely filled band is called valence band, and the first empty (or partly filled) band is the conduction band. This basic difference affects substantially all properties of these two classes of materials, even at finite temperature. The Fermi-Dirac distribution (238) applies to independent electrons at equilibrium in a solid pretty much like in a free gas, simply replacing the free-electron energies and plane-wave states with the band energies and Bloch states. Thus, a nonzero temperature in a metal generates a finite concentration of electrons above the chemical potential and holes below it, similarly to what happens around the Fermi sphere of a free-electron gas. Thus the thermodynamics 194 4. SOLIDS E 3p 3s E 3p 3s Ne εF εF 2p 2s 2p 2s 1s (a) k − π_ a − 2π __ 3a − π__ 3a 0 π __ 3a 2π __ 3a π _ a Na 1s (b) k − π_ a − 2π __ 3a − π__ 3a 0 π __ 3a 2π __ 3a _π a Figure 4.54. The filling of the bands of a symbolic Ne (a) and Na (b) 1D crystal composed of Nn = 6 atoms. According to Eq. (298), the allowed k-points in the first Brillouin zone are k = 0, ± 16 2π , ± 26 2π , a a 3 2π π and 6 a . The − a point must be excluded, since it is no different of π . Individual bands are assumed not to overlap. Energies are purely a qualitative and not in scale. of electrons in metals is interesting and rich of physical consequences, including a characteristic T -linear contribution to the heat capacity of the solid, as discussed in Sec. 3.2.2.1. Instead, in an insulator of gap ∆ between conduction and valence band, the average occupancy of a valence-band state is [nv ] ≃ 1 − exp[−∆/(2kB T )], extremely close to 1 at low temperature, and correspondingly the average occupancy of a conduction-band state is [nc ] ≃ exp[−∆/(2kB T )], extremely small at low temperature. On the contrary, for temperatures much smaller than k∆B , the electrons of an insulator can be considered to all effects as frozen in the filled bands, their excitation being accessible only by means of high-energy spectroscopies. For a widegap insulator (e.g. Al2 O3 ∆ ≃ 5 eV, C diamond ∆ = 5.5 eV, SiO2 ∆ = 8.0 eV, NaCl ∆ = 8.97 eV), any practical temperature is by far smaller than k∆B , and the band-state occupancies are indistinguishable from those at T = 0. For example, with a 4 eV gap at room temperature (kB T ≈ 0.025 eV), [nc ] ≈ e−80 ≃ 10−35 . However, small-gap insulators, usually called semiconductors (e.g. Ge ∆ = 0.75 eV, Si ∆ = 1.17 eV, GaAs ∆ = 1.51 eV), show measurable conduction associated to thermal electronic excitations across the gap, even at room temperature, as sketched in Fig. 4.53, and discussed in Sec. 4.2.2.2. An infinite crystal contains an infinite number of cells and an infinite number of electrons: to establish the expected position of the Fermi energy, a counting rule is 4.2. ELECTRONS IN CRYSTALS 195 needed. Consider a macroscopic portion of the solid of volume V = Nn 1 Nn 2 Nn 3 Vc , extending for Nn i lattice repetitions in the ~a1 , ~a2 , ~a3 primitive directions. Apply periodic boundary conditions to this portion of solid, so that discrete translational invariance is preserved. To satisfy these periodic boundary conditions ψ~k j (~r) = ψ~k j (~r + Nn i~ai ), the ~k label of the Bloch states is restricted to (298) ~k = n1 ~b1 + n2 ~b2 + n3 ~b3 , with nj = − Nn j+1, − Nn j+2, ... Nn j−2, Nn j−1, Nn j . Nn 1 Nn 2 Nn 3 2 2 2 2 2 These values of ~k are the lattice equivalent to those of Eq. (180). These Nn = Nn 1 Nn 2 Nn 3 discrete ~k values become dense and fill the primitive unit cell of the reciprocal lattice as Nn j → ∞ and the infinite Bravais direct lattice is recovered (see Figs. 4.46 – 4.48). The number of electrons of this finite lattice portion equals Nn times the number ncell of electrons of each unit cell. Each band (orbital) state has room for 2 electrons, one for each spin state, ↑ and ↓. If the bands are all disjoint, one on top of another, the Nn ncell electrons fill the 2Nn spin-orbital states of 21 ncell bands, as illustrated in Fig. 4.54. An even value of ncell leads to 21 ncell full bands, followed by empty bands above. For example, each atom of solid Ne (fcc, one atom per cell) carries ncell = 10 electrons to the bands, for a total of 10Nn electrons in the crystal. In a tight-binding language, 2Nn electrons fill the 1s band, 2Nn electrons fill the 2s band, 6Nn electrons fill completely the 2p bands, for a total of 5 filled bands. The Fermi energy then lies in the gap between the filled 2p band and the empty 3s-3p bands, and the crystal is an insulator. Odd ncell leads to 12 (ncell − 1) filled bands, plus a half-filled band. For example, the 11 electrons that each Na atom puts in the band states of its bcc crystal (one atom per cell) fill completely the 1s, 2s, 2p bands, and fill only the 21 Nn lowest orbital states of the 3s band. This 3s band is then half filled, the Fermi energy cuts through it: the solid is a metal. Although it is true that any band crystal with an odd number of electrons per cell ncell is a metal,11 for even ncell both insulators (like Ne) and metals are possible, since there is no guarantee that the ( 21 ncell )-th and the ( 12 ncell + 1)-th bands are separated by a gap, as illustrated in Fig. 4.55. The alkali earth (IIA) and end of the transition (IIB: Zn, Cd, and Hg) solids are all metals, with even ncell , precisely due to overlapping bands at the Fermi energy. 4.2.2.1. Metals. The most characteristic macroscopic feature of a metal is its ability to conduct electric current at low temperature. In fact, all solids, even insulators, show some measurable conductivity associated to impurities and thermal excitations. What really characterizes the metallic state is a conductivity which 11 When the independent electrons approximation breaks down, as in strongly correlated materials, there occur insulating states with odd ncell , often accompanied by magnetic order of the spins. 196 4. SOLIDS Figure 4.55. In 2D and 3D crystals, the band energy E~k j depends on the direction of ~k. It often occurs that even though a gap is observed in the one-electron bandstructure for some ~k direction (and even for all directions), with all directions taken into account the gap disappears, since the energy range forbidden in some direction ka becomes allowed for a different direction kb , due to band overlap. If the Fermi energy ends up in this region, it crosses several bands, and the solid is a metal, even with an even number of electrons per cell. decreases as T is increased, while the conductivity of insulators increases as T is increased due to increasing thermal excitations across the gap. Consider now the prevision of band theory for the conduction of electrons in a crystal: the expected behavior is very peculiar, in sharp disagreement with observation. To describe the motion of electrons in the field of the periodic lattice plus the applied external field, a semiclassical approach is useful. As common in quantum mechanics, an electron is represented by a wave packet, i.e. a superposition of Bloch states in a single band j, characterized by a rather sharp wave-number distribution, thus a large spatial extension, much larger than the crystal lattice spacing (Fig. 4.56). The equations governing the motion of the center of mass ~r of this wave packet are assumed to be: (299) (300) d 1~ ~r = ~vj (~k) = ∇ ~ E~ dt ~ k kj h i d~ ~ ~ ~ ~ k = −qe E(~r, t) + ~vj (k) × H(~r, t) . dt 4.2. ELECTRONS IN CRYSTALS 197 Figure 4.56. Schematic picture of the electron dynamics within the semiclassical model. The length over which applied fields (dashed line) vary is much greater than the spread in the electron wave packed (solid line), which, in turn, is much larger than the lattice constant. ~ r, t) and magnetic H(~ ~ r, t) fields12 are supposed The external perturbing electric E(~ to vary slowly on the scale of the wave-packet size (Fig. 4.56). A rigorous derivation of equations (299) and (300) goes far beyond the scope of the present course: we only propose a few heuristic arguments to support their plausibility. • The center-mass velocity of the wave packet is the group velocity associated to the dispersion E~k j of the Bloch waves. Equation (299) states the basic fact of wave mechanics that a wave packet with dispersion ω(~k) = ~−1 E~k j ~ ~ ω(~k). In the special moves with the velocity given by its group velocity ∇ k 2 2 ~~k case of a free electron E~k = ~2mke , this yields the usual relation ~v (~k) = m e between velocity and momentum. • If the force associated to an external electric potential φiext (~r, t) acts on h the band electron, then its total energy E~k j − qe φext (~r, t) should remain conserved along the semiclassical motion. To verify this, we derive this total 12 The magnetic field is assumed to be measured in the same units (T) as the magnetic ~ induction field B. 198 4. SOLIDS energy with respect to time (notation: i d h ~ ~ E~ E~ − qe φext (~r, t) = ∇ k kj dt k j ~ ~ E~ ∇ k kj = ~ df dt ≡ f˙). Equation (299) gives ˙ ~ ~r φext (~r, t) · ~r˙ · ~k − qe ∇ h i h i ˙ ~ r, t) = ~vj (~k) · ~~k˙ + qe E(~ ~ r, t) . · ~~k + qe E(~ If Eq. (300) is satisfied, then this derivative indeed vanishes, as the vector ˙ ~ = −qe~vj × H ~ is perpendicular to the velocity ~vj . Total energy is ~~k + qe E thus conserved. • Equation (300) resembles the classical equation of motion of a particle of charge −qe moving under the action of the external electromagnetic fields ~ r) and H(~ ~ r) only. The periodic forces produced by the crystal act only E(~ through the band dispersion E~k j , generating the nontrivial ~k-dependence ~~k . This of the velocity of Eq. (299), in place of the free-electron ~v (~k) = m e ~ means that in the crystal ~k does not equal the electron momentum, as for a free electron. It is rather called crystal momentum. • The semiclassical equations assume that the external fields induce no interband transition. For weak fields, inter-band transitions are indeed exceedingly rare, while strong fields make the semiclassical approximation fail and lead to electric or magnetic breakdown. Also, if the fields are not static, the frequency of any oscillation must not exceed inter-band gaps (~ω ≪ ∆). Rapidly varying field are applied in spectroscopy precisely to induce interband transitions. The semiclassical equations confirm that all electrons in a completely filled band do not contribute to either electric current or heat current. The electric current carried by a wave packet representing an electron is (−qe )~vj (~k). The total electric current density carried by a filled band j amounts to13 Z 2 ~j = (301) (−qe )~vj (~k) d3~k , 3 (2π) BZ with the integration extended over the first Brillouin zone (BZ). This integral vanishes since, due to Eq. (299), the integrand function is the gradient of a periodic 13 Similarly to Eq. (231), the ni sum is turned into an integral by inserting the appropriate V density of states. According to Eq. (180), the ~k-density of states is (2π) 3 . An extra factor gs = 2 accounts for the spin degeneracy. The total current times the sample length is obtained by summing the charge times the velocity (−qe )~vj (~k) of the individual electrons over a Brillouin zone. Dividing by V , we obtain the current density (charge per unit area and time). 4.2. ELECTRONS IN CRYSTALS 199 function (E~k j ) over a unit cell of the reciprocal lattice.14 In other terms, in a filled band as many electrons carry current in a direction as others carry current the opposite way, for a vanishing net current. The same reasoning applies to energy (heat) current Z 2 ~jE = (302) E~k j ~vj (~k) d3~k , 3 (2π) BZ by noting that the integrand is proportional to the ~k-gradient of (E~k j )2 , which is also a periodic function of ~k. Completely filled band do not contribute to transport any more than completely empty bands. All electric and thermal conductivity is to be attributed to partly filled bands. This explains why no systematic increase of conductivity is observed in the metallic elements for increasing Z (for example the conductivities of fcc Cu, Ag, and Au are very similar), despite the largely different total number of electrons. On the other hand, Bloch states are stationary states of the Schrödinger equation in the perfect lattice: if a wave packet of Bloch states representing an electron ~ ~ E~ = ~0), then that has a finite mean velocity (as happens unless by chance ∇ k kj velocity shall persist forever. Thus, even in the absence of external electric fields, persistent currents should be observed in metals, but they are not. Furthermore, as illustrated in Fig. 4.57 for the simple 1D case, the semiclassical motion following Eq. (300) under the action of a constant field cycles the k-point across the whole BZ. Correspondingly, Eq. (299) gives an oscillating velocity covering positive and negative velocities for the same amount of time. According to this model, a DC electric field should induce an AC current (Bloch oscillations) in a metal wire! This prevision is neither an artifact of 1D nor of the semiclassical approximation: Ohm’s ~ is not consistent with the main model ingredient, Bloch electrons in an law15 ~j = σ E ideal lattice, taken alone. Trouble is, Bloch electrons are capable of traveling through a perfect lattice without friction, thanks to coherent interference of scattered waves. The missing ingredient here is collisions. Any real crystal is not ideal since 14 In general, if f (~r) is periodic with the periodicity of some Bravais lattice, the result of its integration overR a primitive unit cell of a Rtranslation of this cell by any ~r ′ . R is independent ′ 3 ′ 3 ~ ~r f (~r + ~r ′ ) d3~r. In particular, ~ ~ ~ Therefore, 0 = ∇~r ′ cell f (~r + ~r ) d ~r = cell ∇~r ′ f (~r + ~r ) d ~r = cell ∇ R 3 ′ ~ ~r f (~r) d ~r = ~0. Of course, the same holds in ~k-space for functions, taking ~r = ~0, we have cell ∇ such as the band energies E~k j , which are periodic with the periodicity of the reciprocal lattice. 15 The current I in a wire is directly proportional to the applied potential drop: I = R−1 V . The coefficient of proportionality R depends on the length L and cross section A of the wire, but not on the current or potential drop. In terms of the current density ~j crossing perpendicularly ~ the potential drop V = L|E|. ~ Thus the surface area A, I = |~j|A. In terms of the electric field E, L −1 ~ L −1 Ohm’s law rewrites ~j = A R E, where the conductivity σ = A R is a characteristic property of the material, and the resistivity ρ = σ −1 . 200 4. SOLIDS 6 j=2 Ε Ek 4 2 0 dEk/dk 2 j=1 (a) electron velocity 0 -2 (b) electron acceleration -5 2 d Ek/dk 2 0 -10 (c) -3 -2 -1 0 1 2 3 ka Figure 4.57. (a) Electronic motion in the first BZ of a 1D lattice. Under the action of a leftward external force (uniform constant right~ the wave number k drifts at constant speed: at ward electric field E), the BZ boundary it folds back k = −π/a → π/a. Correspondingly, the band energy of the electron oscillates. The semiclassical-wavepacket (b) velocity, Eq. (299), and (c) acceleration, Eq. (309), are proportional to the first and second derivative of the band energy, respectively. (1) it contains imperfections, as discussed in Sec. 4.1; (2) its nuclei are not frozen at their equilibrium positions but actually vibrate around them, as we shall discuss in Sec. 4.3. 4.2. ELECTRONS IN CRYSTALS 201 Both these discrepancies from the ideal lattice picture are sources of collisions for conduction electrons.16 To represent the effect of collisions as simply as possible, we shall assume that: (1) Each electron moves, between successive collisions, according to the semiclassical equations of motion (299), (300). (2) Each electron experiences an instantaneous collision at random, with probability τ −1 per unit time. The time τ is variously known as the relaxation time, the collision time or the mean free time. Its physical interpretation is the following: any electron shall, on average, travel τ seconds before experiencing its next collision. (3) After a collision, the electron emerges with a ~k whose direction is perfectly random, whose modulus reflects the (Fermi) distribution at the appropriate local temperature, and in such a way to respect Pauli’s principle. All memory of the initial ~k (i.e. velocity) is lost. As a result of collisions, the application of an external electric field does not produce free acceleration of the electrons, thus no Bloch oscillations are observed. On the contrary, the external field produces only a weak perturbation to the thermal equilibrium distribution. Basically, collisions act mostly close to the Fermi surface (the surface in ~k space separating full and empty states at T = 0, analogous to the Fermi sphere of the free gas). As a DC electric field attempts to shift each occupied ~k state in the −qe E ~ direction (Eq. (300)), collisions rapidly transfer electrons from occupied states in the higher-energy region to the emptied region of lower energy, in a tendency to re-establish thermal equilibrium, as illustrated in Fig. 4.58a. After an initial transient, the net effect amounts to a small constant displacement of the filled states with respect to zero field (Fig. 4.58b). This displacement produces a net current density equal to the ~k-space integration of V −1 (−qe )~vj (~k) through the region δ 3~k of unbalanced occupancy. This can be roughly estimated in the free-electron parabolic band as Z Z V −1 3 ~j = V (−qe )~vj (~k) 3 d ~k ≃ (303) (−qe )ÊvF d3~k ≃ −qe vF Ê (δ 3~k) , 4π 3 3 δ ~k δ ~k where numerical factors of order 1 have been ignored. The ~k-space volume (δ 3~k) in between the equilibrium Fermi surface and its field-shifted replica, sketched in Fig. 4.59 for free electrons, can be estimated by observing that in the average time τ ~ between two collisions, each electron changes its wave vector by δ~k = ~−1 τ (−qe ) E. 16 Some scattering is produced by electron-electron interaction as well, but this is essentially negligible compared to the two other sources of collisions in ordinary metals at ordinary temperature. 202 4. SOLIDS E=0 0 [nk] Ek E=0 1 E εF -2 collisions 0.5 (b) (a) -kF 0 0 -kF -2kF -kF 0 kF 2kF k k Figure 4.58. (a) Band occupancy under the combined effect of an applied constant leftward electric field and collisions with impurities and with instantaneous displacements of the lattice ions (phonons). The external field accelerates the electrons to the right according to Eq. (300). Fast collision tend to reestablish equilibrium by scattering extra-energetic electrons prevalently into lower-energy states which have been left empty. (b) Under the combined effect of the field and collisions, the Fermi occupancy distribution shifts in ~k space. This constant shift, here largely exaggerated, is proportional to the applied electric field, and is responsible for electric-current transport. Dropping again factors of order unity, the volume (δ 3~k) is approximately τ ~ 2 3~ 2 ~ (δ k) ≃ −|δ k| kF ≃ (−qe ) E kF , ~ ~ field. By substituting where the minus sign recalls that the shift is opposite to the E ~kF this expression and vF ≃ me (as if electrons were free) in Eq. (303), we obtain qe2 τ 3 ~ qe2 τ N ~ ~kF τ ~ 2 ~ Ê (−qe ) E kF ≃ k E≃ E, (304) j ≃ (−qe ) me ~ me F me V (Sec. 3.2.2.1), where we used the relation of kF and electron density, kF3 = 3π 2 N V again dropping factors of order unity. Equation (304) agrees with Ohm’s law, and evaluates a conductivity (305) σ≃ qe2 τ N . me V 4.2. ELECTRONS IN CRYSTALS 203 kz 3 δ k δk ky kx Figure 4.59. The shift δ~k of the Fermi sphere induced by a leftward external electric field. The ~k-space volume (δ 3~k) is responsible for unbalanced electrons velocity, thus for a net electric current. element ρ [nΩm] ρ [nΩm] τ [10−14 s] τ [10−14 s] at 77 K at 273 K at 77 K at 273 K Na 8 42 17 3.2 14 61 18 4.1 K Rb 22 110 14 2.8 Cu 2 16 21 2.7 Ag 3 15 20 4.0 Au 5 20 12 3.0 Mg 6 39 7 1.1 Al 3 25 6.5 0.8 Table 4.2. Measured electrical resistivity ρ = σ −1 for several elemental metals. The corresponding relaxation times are obtained from N Eq. (305) through τ = me / ρqe2 N , with equaling the density of V V electrons in the conduction band. From the measured resistivity ρ = σ −1 of a metal and the conduction electrons , Eq. (305) permits to estimate the relaxation time τ , as in Tanumber density N V ble 4.2. The observed trend is of increasing resistivity with temperature, corresponding to a decreasing relaxation time. Collisions become more frequent as thermal motion produces larger displacements of the nuclei from their equilibrium positions, 204 4. SOLIDS Figure 4.60. Measured low-temperature resistivity of Na for three different samples with different defect concentrations, leading to different low-T residual resistivity, associated to different defect concentrations (the sample of the lowest curve is a less defective Na crystal; the effect of defects can be much larger in other metals, and huge in disordered alloys). For low to intermediate T , phonon-induced resistivity grows as T 5 , but it then rapidly reaches a T -linear regime. as described in Sec. 4.3. The average time between collisions can P then be assumed inversely proportional to the total number of phonons τ −1 ∝ ǫ [nǫ ]B . At thermal energies kB T much larger than the characteristic phonon energies (ǫ ≈ 1÷100 meV), the total phonon number is proportional to T (expand the boson average, Eq. (237): 1 = kBǫT ). Indeed, at large temperature the resistivity is linear [nǫ ]B = eβǫ1−1 ≃ βǫ in T , to a good degree of approximation. At low temperature, the phonon number decreases rapidly (see Eq. (258)), but impurities and all sorts of lattice defects provide a residual T -independent scattering. Accordingly, the resistivity for T → 0 converges to a sample-dependent constant, as shown for Na in Fig. 4.60. We can estimate the thermal conductivity of electrons in a metal in analogy ~ ~r T = dT ẑ is established to electrical conductivity. When a thermal gradient ∇ dz across the sample, at a given point in the metal, electrons around the Fermi energy 4.2. ELECTRONS IN CRYSTALS 1 cooler T hotter - [nk] T + 205 0.5 0 -2kF -kF 0 kF 2kF k Figure 4.61. Heat transport is associated to electrons moving out from the hotter (right) region carrying (on average) higher energy than those moving out from the cooler (left) region. At a given point ~r in the sample, the distribution of ~k states is therefore slightly asymmetric ~ ~r T . This asymmetry is strongly exaggerated in in the direction of ∇ figure. have a slightly distorted equilibrium distribution since those coming from the hightemperature side have a slightly higher temperature than those coming from the lowtemperature side. This temperature difference depends on the distance these electrons have traveled after the previous collision to the given point. This average distance d ≃ τ vF is called the mean free path. At that point therefore the electrons coming from the hot side are associated to a T + ≃ T +d dT ≃ T +τ vF dT and those moving dz dz dT − the opposite way to a T ≃ T − τ vF dz (Fig. 4.61). The average energy transported by a typical electron to that place will be therefore ≃ kB (T − − T + ) ≃ −kB τ vF dT , dz where the minus sign indicates that energy is transported in the direction opposite to the gradient. Multiply this individual contribution by the typical speed ≃ vF of kB T an electron and by the density ≃ N of electrons carrying heat (only those within V ǫF kB T of the Fermi energy do, for all others the contributions cancel as in Eq. (302)): we obtain an estimate of the total heat current 2 2 ~ ~r T vF N kB T = −kB2 T τ vF N ∇ ~ ~r T ≃ − kB T τ N ∇ ~ ~r T , (306) ~jE ≃ −kB τ vF ∇ V ǫF ǫF V me V 206 4. SOLIDS Figure 4.62. A scheme of Hall’s experiment: a metal sample traversed by a current and immersed in a perpendicular magnetic field develops a transverse potential due to the Lorentz force acting on the carriers. The sign of this potential is determined by the band effective mass m∗ . where we employed the relation ǫF = 12 me vF2 . A heat current as in Eq. (306) is compatible with a thermal conductivity17 k2 T τ N . (307) K≃ B me V By comparing σ and K in Eqs. (307) and (305) we find the prevision of a ratio π 2 kB2 K T = σ 3 qe2 between thermal and electric conductivities according to the relaxation-time model considered. A careful analysis of the factors of order unity previously ignored yields 2 the factor π3 in Eq. (307). This relation tells us that good electrical conductors should also be good heat conductors. The empirical observation of this fact is known as Wiedmann-Franz law. Experimentally, for a wide range of metals and temperatures, the accord of conductivities with Eq. (308) is surprisingly good, for such a simple model. For example, measurement yields the following values for K 3 qe2 2 T (which should equal unity according to Eq. (308)): 0.868 (Na, 273 K), σ π 2 kB 0.950 (Au, 273 K), 0.966 (Au, 373 K), 1.08 (Pb, 273 K), 1.04 (Pb, 373 K). Many other transport experiments, both in the DC and AC regime can be interpreted in terms of this simple relaxation-time model. An especially important class of measurements is connected to the Hall effect, i.e. the potential buildup perpendicular to a current transversing a metallic sample when this is immersed ~ This potential drop is due to lateral charge in a perpendicular magnetic field H. accumulation sketched in Fig. 4.62 due the Lorentz term in Eq. (300): the latter is independent of the sign of the charge carriers, thus the transverse Hall potential probes precisely the sign of the charge carriers. In most metals, this potential (308) 17 ~ ~r T . The underlying linear-response relation is ~jE = −K ∇ 4.2. ELECTRONS IN CRYSTALS 207 drop is consistent with negative charge carriers, but many exceptions (including Be, Mg, In, Al) show a potential of reversed sign w.r.t. the prevision of the semiclassic relaxation-time model. This calls for the importance of hitherto neglected band effects associated to lattice periodicity. The main effect of the lattice periodic potential is to replace the electron mass in Eqs. (305) and (307) with an effective mass m∗ , accounting for the actual value of the acceleration of electrons close to the Fermi energy in a realistic band rather than on the free parabola. According to Eq. (299) (309) " # h i d~k X ∂ 2 E~k j d(~kw ) X d2 d ~ 1 ~ ~ ~vj (~k) · êu ~r = ~vj (k) = ∇ = 2 = êu (m∗ )−1 uw Fw , k dt2 dt dt ~ uw ∂ku ∂kw dt uw where êu indicate the Cartesian versors x̂, ŷ, and ẑ. The last equality is based on ~k) ~ + ~vj × H) ~ acting on Eq. (300) – replacing d(~ with the external force F~ = −qe (E dt the electron. The final expression (309) is a sort of Newton equation, with a mass tensor of components (310) (m∗ )−1 uw 2 1 ∂ E~k j , = 2 ~ ∂ku ∂kw which is connected to the band curvature at Fermi energy. In the 1D example of Fig. 4.57, m∗ is proportional to the inverse of the acceleration of panel c. In 3D, the effective mass m∗ to insert in Eq. (305) is a suitable average over the tensor components (m∗ )uw , and may differ substantially from the free-electron mass me . In particular m∗ turns out much larger than me for narrow flat bands, characterized by weak curvatures, such as 3d bands of transition metals or 4f bands of rareearth metals. According to (305) and (307), larger effective masses are associated to smaller conductivities, which is in accord with our intuition that external fields have a harder time accelerating “heavy” wave packets than free electrons. Note however that in the ratio (308), the m∗ correction cancels out, and the Wiedmann-Franz law should and does hold roughly independently of band curvature. Also, negative effective masses occur whenever the Fermi level sits in a region where the curvature of the band is downward (e.g. close to the BZ boundary of Fig. 4.58a). An electron of negative m∗ and charge accelerates in the same direction as the applied field, and can therefore be seen as a positive charge of positive mass, called a hole. A hole carries current in the same direction as the applied field (like a genuine electron), but it produces reversed Hall effect, as its average velocity ~vj (~k) aligns in the same ~ Negative effective mass explains the reversed Hall direction as the electric field E. potential of several metals. 4.2.2.2. Semiconductors. Intrinsic (i.e. pure) semiconductors are insulators characterized by a small gap between valence and conduction band. Solid Si and Ge have 208 4. SOLIDS Figure 4.63. Qualitative lattice-parameter dependence of the band energies for the diamond-lattice solids of the IVB group. The large-a separated s and p bands turns into a hybrid sp3 band as a is reduced, but eventually this wide band splits into a “bonding” and “antibonding” band (similar to the bonding and antibonding states of CH4 ), whose separation (and individual width) grows as the lattice shrinks. This leads to the unusual situation of a bandgap which increases under applied external pressure, and shrinks with lattice thermal expansion. the same structure as C diamond, with larger lattice parameter a. The bands of these materials are qualitatively illustrated in Fig. 4.63. In pure semiconductors, the Fermi energy sits inside this bandgap, the lower sp3 “bonding” band being completely full and the upper “antibonding” band completely empty. Note that the possibility of this band “splitting” is directly connected to their diamond crystal structure, and would not occur in a hypothetical simple cubic or fcc Si/Ge. Moreover, precisely this bandgap yields a particular stability to the rarefied diamond structure for those solids where the number of electrons matches the capacity of the “bonding” band.18 Semiconductors of type III–V (e.g. GaAs) and some II-VI (e.g. BeSe) crystallize in a similar crystal structure, namely the zincblend structure, with two different atomic species occupying the two geometrically inequivalent sites of the unit cell of the diamond structure (Fig. 4.25). Other structures are observed in other semiconductors. Accordingly, the bands are qualitatively and quantitatively 18 Solid Na, Mg and Al realize an energetically more stable configuration in different, more compact lattice structures (bcc, hexagonal, and fcc respectively), as they have too few electrons to fill completely the “bonding” band if the atoms arranged themselves in a hypothetical diamond structure. 4.2. ELECTRONS IN CRYSTALS 209 Figure 4.64. Direct versus indirect gap in insulators. The name refers to the possibility of “direct” spectroscopic observation by interband optical absorption in the first case. Absorption through an indirect gap instead must be assisted by some phonon absorption/emission in order to grant the wave-number conservation in the process. different in different compounds. In particular, the gap ∆ = Ec − Ev between the top of the valence band and the bottom of the conduction band may be direct (same ~k for the maximum Ec of E~k c and for the minimum Ev of E~k v as in GaAs and InP) or indirect (when these two extrema occur at different ~k points, as in Si, Ge, GaP), see Fig. 4.64. Transport in an intrinsic semiconductor is driven by temperature. The average number density of electrons thermally excited into the conduction band is Z Z [nc ] 1 ∞ 1 ∞ 1 (311) Nc = = dE , g(E) [nE ]F dE = g(E) β(E−µ) V V Ec V Ec e +1 where g(E) is the density of band states (an example of which is sketched in Fig. 4.65). The chemical potential lies somewhere in the gap between conduction and valence band (at low temperature close to the mid-gap energy), thus several kB T below the conduction-band bottom Ec (see Fig. 4.66). It is therefore usually a very good approximation to neglect the 1 at the denominator of [nE ]F , and take 4. SOLIDS g(ε) [states/eV] 210 -5 Energy [eV] 0 Figure 4.65. The qualitative shape of the density of band states in a lattice. Gaps are characterized by vanishing density of states. In general, several bands are present in an actual solid, with narrow high-density bands at lower energies, and broad low-density bands at larger energies. The band boundaries show a characteristic g(E) ∝ |E − Eboundary |1/2 behavior, characteristic of the quadratic ~k-dependence of E~ near the band maximum or minimum (see kj Eq. (190)). 1 eβ(E−µ) +1 ≃ e−β(E−µ) = e−β(Ec −µ) e−β(E−Ec ) . The first exponential is the same for all states in the band: it reflects the exponential suppression of the electron occupancy of the conduction band due to its distance from µ. The second factor represents a standard Boltzmann occupancy of a gas of classical noninteracting particles, as in Eq. (151), which indicates that an extremely rarefied electron gas is essentially classical. The energies E of the states can be estimated by Taylor-expanding the conduction band around its minimum19 1 X ∂ 2 E~k c ~2 ~ ~ min 2 min min E = E~k c = Ec + |k − k | +... (k −k )(k −k )+... ≃ E + u w c u w 2 ∂ku ∂kw ~ min 2m∗ uw k c Thus the excitation energy E = (E − Ec ) in the second exponential approximates the kinetic energy of a free particle of mass m∗c , once the wave numbers are measured from ~k min . Accordingly, the density of conduction states (excluding spin) goes as 19 For energies high above the band minimum E~k c , the quadratic expansion is inaccurate, but the statistical occupancy factor suppresses the contribution of the higher-energy states anyway. 4.2. ELECTRONS IN CRYSTALS 211 Figure 4.66. In an intrinsic semiconductor with an energy gap ∆ large compared with kB T , the chemical potential µ lies within an order kB T of the center of the energy gap, and is therefore far (compared to kB T ) from both the bottom of the conduction band Ec and the top of the valence band Ev . √ gtr (E) = m∗c 3/2 V /( 2π 2 ~3 ) E 1/2 (see Eq. (190)). The calculation of the E integration in Eq. (311) is then identical to the calculation of the classical partition function Z1 tr = V /Λ3c of Eq. (184), with here a thermal length s 2π Λc = ~ . ∗ mc kB T In detail, we obtain Z Z eβ(µ−Ec ) ∞ eβ(µ−Ec ) ∞ −β(E−Ec ) Nc ≃ gs gtr (E) e−βE dE g(E)e dE ≃ V V Ec 0 ∗ 3/2 Z ~ k′ |2 2eβ(µ−Ec ) mc kB T V −β |~2m 2eβ(µ−Ec ) 2eβ(µ−Ec ) ∗ 3~ ′ β(µ−Ec ) c dk = = 2e = (312) e Z1 tr = , V 8π 3 V Λ3c 2π~2 212 4. SOLIDS where the factor gs = 2 reflects the spin degeneracy.20 For m∗c = me , at 300 K, the thermal length Λc = 4.3 nm. The exponential factor, for µ sitting in the middle of a 1.2 eV gap i.e. 0.6 eV below Ec , is of the order eβ(µ−Ec ) ≈ e−23 ≃ 10−10 . This corresponds to about Nc ≈ 2 × 1015 m−3 , a modest charge-carrier density compared to that of regular metals (≈ 1028 m−3 ). Note however that this carrier density varies exponentially with T −1 (e.g. for the same conditions, at 600 K Nc ≈ 3×1020 m−3 ). If this dependency is plugged into the expression (305) for conductivity in the presence of collisions, one expects a rather poor room-temperature conductivity, which is approximately increasing with the exponential of T −1 , apart for weaker corrections due to (i) drifts of µ, (ii) the T 3/2 term in Eq. (312), and (iii) increase in collisions (reduction of τ ). Indeed, a roomtemperature resistivity several orders of magnitude larger than in simple metals, with a drastic T -dependence, is observed in intrinsic semiconductors (Fig. 4.67 – curve 1, compare with Table 4.2). This makes pure semiconductors sensitive temperature sensors, especially at low temperatures (where metals become essentially useless – see Fig. 4.60). In practice, semiconductors find important applications mainly as doped solids, called extrinsic semiconductors. Doping is realized typically by substitutional impurities replacing the perfect lattice atoms, as sketched in Fig. 4.68. Pentavalent impurities, such as P or As replace Si/Ge atoms at the regular lattice sites, thus establishing formally four chemical bonds. In other terms, the valence band is not especially deformed by the impurities. However a pentavalent atom carries extra positive nuclear charge, thus forming a potential well which tends to attract one extra electron close to it. This forms a characteristic localized “impurity” state, sitting in the band gap. At zero temperature, the extra electron carried by the pentavalent atom occupies that impurity state. The main feature of the impurity states of pentavalent dopants (donors) is their close vicinity to the conduction band (Fig. 4.69). The reason for this weak binding of the impurity electron to its ion is related to screening. The simplest model for 20 Essentially the same result is obtained for the number of holes in the valence band: ∗ 3/2 Z 1 Ev mv kB T 2 Pv = . g(E)(1 − [nE ]F ) dE ≃ eβ(Ev −µ) 3 = 2 eβ(Ev −µ) V −∞ Λv 2π~2 The requirement of charge neutrality (Pv = Nc ) fixes the position of the chemical potential: eβ(Ev −µ) m∗v 3/2 = eβ(µ−Ec ) m∗c 3/2 , where we simplified common factors. By taking the logarithm of both sides we obtain 3 m∗ 1 µ = (Ev + Ec ) + kB T ln v∗ , 2 4 mc which confirms that at T = 0 the chemical potential sits in the middle of the gap, and when T is raised it drifts slowly toward the band with smaller effective mass, to compensate for the smaller density of states. 4.2. ELECTRONS IN CRYSTALS Figure 4.67. The measured resistivity of antimony-doped germanium as a function of inverse temperature for several impurity concentrations [?]. 213 214 4. SOLIDS Figure 4.68. Substitutional atoms of group III or V replace Si or Ge atoms of the pure semiconductor (left), producing extrinsic semiconductors of p type (center) or n type (right) respectively. The square lattice is just a convenient pictorial for the actual diamond lattice. Figure 4.69. Donor and acceptor levels are very shallow. They are usually located within few tens meV of the borders of the conduction and valence band respectively. the electron binding to the impurity is a particle of charge −qe and mass m∗c moving qe 1 in the screened Coulomb potential of the impurity ion ≃ 4πǫ , where ε is the 0ε r static dielectric constant of the pure semiconductor (ε = 12 for Si, ε = 16 for Ge), and r the distance between the electron and the impurity nucleus. This is an Hatom–like problem of effective nuclear charge Z = ε−1 and effective mass µ = m∗c , whose ground-state energy is given by Eq. (30). The bound-state ground energy 4.2. ELECTRONS IN CRYSTALS 215 Figure 4.70. Schematic temperature dependence of the majority carrier density (n doping). The high-temperature regime corresponds to the prevalence of intrinsic carriers. The intermediate regime of almost constant Nc ≃ Nd is the temperature region where extrinsic carriers prevail and are dissociated from their impurities. The low-temperature decrease of Nc is due to “freezing” of the extrinsic carriers, “captured” by the localized impurity states. then equals a suitably rescaled Rydberg energy: m∗ 1 EHa (313) δE ≃ c 2 . me ε 2 Typical values of m∗c and ε lead to binding energies of the order 10−4 EHa , or few tens meV. Indeed, the separation δE = Ec − Ed of the impurity levels of P and As from the bottom of the conduction band of Si is observed 44 meV and 49 meV respectively (in Ge it is found 12 meV and 13 meV). A trivalent impurity (acceptor) produces a localized excess negative charge, as long as the conduction band is filled. The missing electron can be represented as a bound hole, attracted to the excess negative charge representing the impurity, with a small binding energy. In the electron picture this bound hole will manifested itself as an additional electronic level Ea slightly above the top of the valence band. The hole is bound when this level is empty. When an electron is promoted from the valence band into this localized level, paying a small energy Ea − Ev , the excess charge of the impurity is removed, and an unbound hole is left in the valence band. Impurity levels are localized and do not contribute to transport. At T = 0, a homogeneous doped semiconductor is an insulator, with the Fermi level sitting near either Ea or Ed , according to whether the density Na of acceptors or that Nd of donors is larger (p / n doping respectively). As temperature is raised from 0, the bound charges get rapidly unbound into the band levels, mostly in the valence band, 216 4. SOLIDS if Na > Nd (p doping), or in the conduction band, if Nd > Na (n doping). Due to the small binding energy of the impurity levels compared to the band gap, it is far easier thermally to excite an electron into the conduction band from a donor level, or a hole into the valence band from an acceptor level, than it is to excite an electron across the entire gap ∆ from valence to conduction band. At room temperature, the probability that an electron unbinds from the impurity levels is high. The chemical potential moves toward the middle of the gap, but (for temperature not too high) it remains closer to the valence (p doping) or conduction (n doping) band. This leads to a substantial concentration of carriers, which in a wide temperature range dominates over the intrinsic carriers. In this regime, to a good approximation, the density of majority carriers (holes for p doping, electrons for n doping) approximates the net concentration of impurities,21 as in Fig. 4.70. This carrier concentration, plugged into Eq. (305), is compatible with the doping and temperature dependence of resistivity shown in Fig. 4.67. The carrier population in the minority band (conduction for p doping and valence for n doping) is extremely small (but rapidly increasing with temperature). Doped semiconductors are interesting mostly for the properties of inhomogeneous systems, i.e. crystals where the local impurity concentrations vary in space. Careful methods of fabrication permit to tune the doping on a sub-µm scale, within crystals of typical lateral size of up to several cm. This technology is at the basis of the modern electronics industry, where solid-state devices have replaced the old vacuum tubes as “active” components, i.e. components which permit active manipulation (mainly amplification) of electric signals. Countless other applications of inhomogeneous semiconductor include sensors, light production, electronic data manipulation... Semiconductor devices have been since the 1950’s and still are one of the main areas of research and development, a common playground of fundamental quantum mechanics, solid-state physics, materials science, industrial engineering, and electronics. Specific courses delve in this vast field. Here we only sketch the principle of functioning of the simplest inhomogeneous extrinsic semiconductor: the p-n junction. Consider a piece of semiconductor with ideal step-like p-n dopant densities Na , x < 0 Na (x) = 0, x>0 0, x<0 Nd (x) = (314) , Nd , x > 0 as a function of some 1D displacement x across the sample. This is what one could conceptually (but not in practice!) realize by assembling a p-doped and a n-doped piece of semiconductor. When contact is realized, the chemical potential in the two 21 For n doping an electron density Nc ≃ Nd − Na . For p doping a hole density Pv ≃ Na − Nd . 4.2. ELECTRONS IN CRYSTALS 217 separate sections of the semiconductor, initially substantially different, must become the same, as required by thermodynamical equilibrium. Starting with each portion in an homogeneous neutral situation, the equilibration of the chemical potential is realized by diffusion of electrons from the n (where their concentration is high) into the p side (where their concentration is very small), and of holes in the other direction (like when a wall between a vessel containing oxygen and one containing nitrogen is removed and the two gases mix). As the diffusion continues, the resulting transfer of charge builds up an electric field opposing further diffusive currents until an equilibrium configuration is reached, when the effect of the field on the charge carriers cancels the effect of diffusion. It is possible to write coupled equations (based on Maxwell’s equations plus the semiclassical equations (299) and (300) for the electron dynamics) for the poten(x) and the local densities of tial φ(x) describing this electric field E(x) = − dφ dx electrons and holes; however, here we restrict to a qualitative level of description. Because the carriers are highly mobile, in this equilibrium configuration the carrier densities are very low wherever the field has an appreciable value. This is represented in Fig. 4.71b. The thickness of this depletion layer turns out typically in the 10 ÷ 103 nm range, depending on the precise physical parameters (dopant concentrations, dielectric constant of the semiconductor). The impurity ion charge remains uncompensated in the depletion layer, thus producing the electric charge profile illustrated in Fig. 4.71c. The double layer of charge produces the electric field sketched in Fig. 4.71d. This corresponds to a finite potential drop (Fig. 4.71e), which compensates for the equilibration of the chemical potential far away from the junction: (315) qe ∆φ0 = qe [φ(+∞) − φ(−∞)] = µn − µp , where µp/n indicate the chemical potential in the isolated bulk p or n semiconductors, prior to the construction of the junction. This potential difference shifts rigidly the bands (and impurity levels) away from the junction, as illustrated in Fig. 4.72a. When metal electrodes are deposited on both sides of the junction and joined by a wire, at thermodynamical equilibrium the chemical potential aligns everywhere in the connected solids to a common value, and no net current flows in the circuit, as the individual potential drops at the interfaces cancel. At the p-n junction, for example, the net current vanishes due to cancellation of four contributions: • At the p side of the depletion layer few “minority” electrons appear by thermal excitation out of the valence band: these are immediately “swept” by the strong electric field, and drop to the n side (Fig. 4.72a). This electron generation current Jegen depends exponentially on temperature, but only weakly on the potential drop and the size of the depletion region. 218 4. SOLIDS Figure 4.71. (a) A sharp p-n junction in the absence of voltage bias, in thermodynamic equilibrium. (b) The qualitative equilibrium profile of the hole-density Pv (x) (left) and electron density Nc (x) (right). Far away from the junction, the bulk values Pv (x ≪ 0) ≃ Na , Nc (x ≫ 0) ≃ Nd are realized. In the region where a substantial electric field is present, both carrier concentrations are strongly suppressed: this is therefore called the depletion region. (c) Total electric charge density ρ(x). The net negative charge on the p-type side of the junction and the net positive charge on the n-type side are given by the uncompensated densities Na and Nd of acceptors and donors. (d) These charges generate an electric field x̂E(x), which is the physical origin the depletion layer. (e) The corresponding electric potential φ(x). 4.2. ELECTRONS IN CRYSTALS (b) (a) 219 (c) Figure 4.72. Energetics from the point of view of (negatively charged) electrons, as a function of the displacement x across the p-n junction. (a) At equilibrium, in absence of external bias. (b) In the presence of forward bias (V > 0). (c) In the presence of reverse bias (V < 0). Carrier currents are indicated: note that generation currents are essentially independent of bias. • Few (majority) electrons at the n side of the junction acquire enough thermal energy to overcome the potential barrier to the p side. Once in the p region, they will most likely fill a hole in the valence band: the corresponding current is therefore called recombination current Jerec . • Similarly, a hole generation current Jhgen sweeps thermal holes from the n to the p side, while a hole recombination current Jhrec accounts for thermally diffusing holes into the n side. At equilibrium these currents cancel in pairs (Jegen = Jerec , and Jhgen = Jhrec ) and no net current flows through the junction.22 The shorting wire may be cut and a voltage generator inserted, to alter the potential drop across the semiconductor by an externally fixed potential V , with the sign convention of Fig. 4.73. The bulk p and n regions are relatively good conductors, due to a large carrier density compared to the depletion layer. We can therefore assume that practically all the potential drop induced by the external applied field is realized over the depletion layer, as illustrated in Fig. 4.72. When V 6= 0 the condition of thermodynamical equilibrium is violated and a net current density jx (thus a current I) is realized through the junction. To understand semi-quantitatively 22 rec/gen The four quantities Je/h are defined as the (positive) norm of the corresponding number current-density vectors, and are measured in units of s−1 m−2 . 220 4. SOLIDS n 0.08 I [A] p 0.04 V I (a) + 0 − -0.8 -0.4 0 0.4 V [V] (b) Figure 4.73. (a) The sign convention for the applied external potential V to the diode. Positive V increases the potential of the p side and produces forward bias. (b) The I − V characteristic according to Eq. (318). the I − V characteristic of the p-n junction, observe that the recombination currents depend very strongly on the potential drop through the depletion layer: Jerec is proportional to the number of carriers acquiring sufficient energy to overcome the potential barrier. This is now modified by V : Jerec ∝ exp [−qe (∆φ0 − V )/(kB T )]. The requirement of thermal equilibrium at V = 0 (Jerec = Jegen ) and the fact that the generation current is almost independent of V fixes the proportionality constant: (316) qe V Jerec = Jegen e kB T . The total current density carried by electrons is therefore qe V (317) Je = Jerec − Jegen = Jegen e kB T − 1 . A similar analysis, with similar result, can be carried out for holes. The hole currents move in the opposite direction with respect to electrons, but carry positive charge, thus their contribution adds up to that of electrons, to give a total electric current density qe V (318) jx = qe (Jegen + Jhgen ) e kB T − 1 . This I −V characteristic, shown in Fig. 4.73b, is strongly asymmetric and nonlinear: basically the junction operates as a rectifier, allowing electric current to circulate in one direction only, similarly to the old vacuum diode based on thermionic emission. For this reason, a two-terminal device containing a single p-n junction is named a diode. I − V curves very much like Eq. (318) are indeed observed experimentally 0.8 4.2. ELECTRONS IN CRYSTALS (b) (a) (c) (d) Figure 4.74. Typical I − V characteristics of commercial silicon diodes: (a) a Zener diode, (b) and (c) Diotec 1N4001, (d) Bourns 0805/1206. Observe two kinds of deviations from the ideal behavior of Eq. (318): at large negative voltage (not especially large for this Zener diode), a reverse breakdown regime occurs; at positive voltage, current deviates from the pure exponential of V due to series resistance of the homogeneous segments of the semiconductor. 221

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Download Introduction to the Physics of Matter