Basic concepts in quantum mechanics
László Erdős∗
Nov 18, 2008
The emergence of quantum physics in the mid 1920’s was a fundamental change, probably the most important one in the long history of physics. Moreover, it was so strikingly new and different from anything known before that several of the best scientific minds doubted its validity. There are many ways to explain why we should believe in it, and “foundations of quantum mechanics” has become a subject in itself for those who represent the fundamentalist’s point of view. The more pragmatically minded mainstream approach, however, starts from a few basic axioms and focuses on the results one can obtain from the theory.
The only difference between the axioms of quantum mechanics and the standard axioms in mathematics (say, the axioms of elementary geometry [Euclid], the axioms of set theory [Zermelo-Fraenkel] or the axioms of integer arithmetic [Peano]) is that they are not at all obvious at first sight, and not even at second or third sight... So their justification is more indirect, but a powerful one: they work. We will follow the most pragmatic point of view: however crazy its axioms sound, quantum mechanics, as a matter of fact, has correctly predicted essentially every experiment in an enormous energy range. Quantum mechanical principles seem to be valid from subatomic physics to astrophysics. They correctly account for many phenomena on large and small scales that no other theory could tackle. A pretty rewarding prize for accepting some axioms that may sound unbelievable at the beginning...
In this course we will do quantum mechanics, which is the most basic part of quantum physics. Quantum physics includes many other disciplines, such as quantum field theory, quantum statistical mechanics, quantum gravity etc., but they all originate in quantum mechanics, just as all of classical physics (e.g. thermodynamics, fluid dynamics etc.) originates in classical mechanics. The main goal of quantum mechanics is to describe the motion of quantum particles.
∗ Parts of these notes were prepared using the web notes by Michael Loss: “Stability of Matter” and the draft of a forthcoming book by Elliott Lieb and Robert Seiringer: “The Stability of Matter in Quantum Mechanics”. I am grateful to the authors for making a draft version of the book available to me before publication.
1 Classical mechanics

1.1 Phase space
The basic object in classical mechanics is a massive point particle. It has two intrinsic and
permanent characteristics: its mass m and its charge q. The mass is always positive. The
state of the particle can be described by its position (location) in the d-dimensional Euclidean
space, x ∈ Rd , and by its momentum p ∈ Rd . Unless we say otherwise, we will always consider
d = 3. The space of the possible x’s is called configuration space or position space, the space
of possible momenta is momentum space. Depending on the physical situation, they may be
restricted to a subset of Rd (e.g. if the particle is confined to a container, Ω ⊂ Rd , then
x ∈ Ω). The product space of the configuration space and momentum space, Rd × Rd (or its
natural subspace, e.g. Ω × Rd ) is called the phase space, the pairs (x, p), describing a possible
position and momentum of a particle, are called phase space points.
It is a deep fact of Nature that the phase space point determines the state of the particle, i.e. knowing its position and momentum (two d-dimensional vectors) is sufficient to describe all its future; in other words, all that the particle “remembers” from its past, before a fixed time t0, is given via the position and momentum at time t0. Actually there are two different statements combined in this sentence. One is that exactly two vectors (points) determine everything; the other is that it is sufficient to know these two vectors at a fixed time and we can forget about the whole past. Both of these somewhat surprising facts are consequences of Newton’s equation
m d²x/dt² = F

where F = F(x, p) is the (instantaneous) force acting on the particle; the force may depend on the phase space point. The fact that Newton’s equation is a differential equation is equivalent to the fact that the past influences the future through the state at present. The fact that Newton’s equation is of second order postulates that only two quantities are sufficient, there is no need for more, because a second order equation needs two initial data to have a unique solution. Traditional Newtonian kinematics considers position and velocity as these two quantities (velocity being defined as the time derivative of position, ẋ = dx/dt); it turns out that momentum is a more canonical second quantity than velocity.
More generally, we want to describe N massive point particles in Rd or in Ω ⊂ Rd . We
label the particles by 1, 2, . . . , N. Each particle has a position and a momentum, indexed by
the particle label. The locations of the particles are given by x = (x1, x2, . . . , xN) ∈ RdN and the momenta by p = (p1, p2, . . . , pN) ∈ RdN.
1.2 Hamiltonian
The state of the system may change with time, and at time t it is described by a time-dependent phase space point (x(t), p(t)). The time is always a real variable. The dynamics (time evolution) of the system is described by the energy function or Hamilton function or Hamiltonian of the system. The Hamiltonian is a real function defined on the phase space

H : Rd × Rd → R    (1.1)
Its value H(x, p) represents the energy of the physical system in state (x, p). A basic axiom
of classical mechanics is that the Hamilton function determines the time evolution via the
Hamiltonian equations of motion
ẋ = ∇p H(x, p),    ṗ = −∇x H(x, p)    (1.2)
(dot denotes time derivative). Being a system of first order ordinary differential equations
(ODE’s), the equations (1.2), under some mild regularity condition on H, fully determine the whole future (and also past) trajectory (x(t), p(t)) once an initial datum is given, i.e. once the state of the system (x(t0), p(t0)) is known at some time t0. In other words, the Hamiltonian
function comprises all the physical laws relevant for the system.
One important property of the Hamilton equations of motion is that the Hamiltonian
(energy) is conserved with time
d/dt H(x(t), p(t)) = ∇x H · ẋ + ∇p H · ṗ = ∇x H · ∇p H − ∇p H · ∇x H = 0
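This conservation is easy to watch numerically. The following sketch (my own illustration, not from the notes) integrates the Hamiltonian equations (1.2) for a one-dimensional harmonic oscillator H = p²/(2m) + kx²/2 with a leapfrog scheme; the mass, spring constant and step size are arbitrary choices:

```python
def hamilton_flow(grad_x_H, grad_p_H, x0, p0, dt, steps):
    """Integrate x' = grad_p H, p' = -grad_x H with the (symplectic) leapfrog scheme."""
    x, p = float(x0), float(p0)
    traj = [(x, p)]
    for _ in range(steps):
        p -= 0.5 * dt * grad_x_H(x)   # half kick
        x += dt * grad_p_H(p)         # drift
        p -= 0.5 * dt * grad_x_H(x)   # half kick
        traj.append((x, p))
    return traj

# Toy Hamiltonian: harmonic oscillator H(x, p) = p^2/(2m) + k x^2/2 with m = k = 1
m, k = 1.0, 1.0
H = lambda x, p: p**2 / (2 * m) + 0.5 * k * x**2
traj = hamilton_flow(lambda x: k * x, lambda p: p / m, x0=1.0, p0=0.0, dt=0.01, steps=1000)

E0 = H(*traj[0])
drift = max(abs(H(x, p) - E0) for x, p in traj)
print(f"initial energy {E0:.6f}, max energy drift {drift:.2e}")
```

The energy drift stays tiny over the whole trajectory, reflecting the exact conservation law above (and the symplecticity of the leapfrog scheme).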
When we build a physical model, we usually give its Hamiltonian. The standard Hamiltonian of a single particle in classical mechanics (without magnetic fields) has the form
H = p²/(2m) + U(x)    (1.3)
where m > 0 is the mass of the particle and U(x) a real valued function, the potential. The
first term represents the kinetic energy of the particle, the potential describes the interaction
with an (unspecified) environment (e.g. container).
For N particles, we have

H = ∑_{j=1}^{N} pj²/(2mj) + U(x)    (1.4)
where mj is the mass of the j-th particle and U(x) is the potential. The term pj²/(2mj) represents the kinetic energy of the j-th particle; the potential describes the interactions both among the particles and with a possible environment.
In many cases, the potential function simplifies into a one-particle and a two-particle part:

U(x) = ∑_j Vj(xj) + ∑_{j<k} Wjk(xj − xk).    (1.5)
The first term is called background potential, the second one is the interaction potential. Note
that the interaction is assumed to be translationally invariant – not a necessity, but a condition
that is satisfied in most cases.
The Hamiltonian (1.4), especially with the choice (1.5), may look a bit ad hoc; in particular, one may note the asymmetric roles of the momenta and the positions. It is, however, a fact of life that the two most “visible” interactions in real life, gravity and electrostatics, depend only on the position and not on the momentum of the particles; thus the momenta are not directly coupled.
If magnetic fields are present, then the kinetic energy of the jth particle is modified to
(1/(2mj)) (pj − (qj/c) A(xj))²    (1.6)

where qj is the charge of the particle, c ≈ 300,000 km/sec is the speed of light and A : Rd → Rd is a vector field, representing the magnetic vector potential in such a way that the magnetic field
B is given by
B = ∇ × A = curl A
(in dimension d = 3). We remark that the magnetic field (and quantities derived from it, like
the flux, which is the integral of A over closed loops) is the physically measurable quantity,
the vector potential is not directly measurable. Maxwell’s equation dictates that any physical
magnetic field is divergence free, ∇ · B = 0, thus it can be written as a curl. Notice that in the presence of a magnetic field, the velocity of the jth particle is

vj = ẋj = ∇pj H = (1/mj) (pj − (qj/c) A(xj)).
The formula (1.6) identifies the Lorentz force acting on the jth particle:

Fj = (qj/c) vj × B

Exercise 1.1 Check this formula for the Lorentz force from Newton’s law, from the Hamilton equations and from the identity ∇A(v · A) − (v · ∇)A = v × (∇ × A) from vector calculus [where the first gradient acts only on A].
Using the explicit form of H, we can write the equations of motion as
ẋj(t) = pj(t)/mj,    ṗj(t) = −∇xj U(x) = −∇Vj(xj) − ∑_{k≠j} ∇Wjk(xj − xk)
The first equation just says that the velocity (defined as the time derivative of x) is the
momentum divided by the mass, the second equation is Newton’s equation if the negative
gradient of the potential is interpreted as the force.
We remark that we presented the Hamilton formalism of classical mechanics. This formalism uses the assumption that there is an absolute concept of time. In relativistic systems such
an assumption cannot hold, and a more general formalism, the Lagrangian formalism, has
been developed. The two formalisms are equivalent if time is absolute. While the Lagrangian
formalism is more general, its quantized version is much harder to define in a mathematically
rigorous way, although it is necessary for doing e.g. relativistic quantum field theory. In this
course we will consider only non-relativistic quantum systems, so we will use the Hamilton
formalism and enjoy its advantages.
1.3 Coulomb systems
The most basic objects of study in quantum mechanics are massive, charged point particles
interacting with electrostatic forces. This means that the Hamiltonian (1.4) holds (if no
magnetic fields are present) with a potential (1.5), where the interaction between the jth and
kth particle is the Coulomb potential
Wjk(xj − xk) = qj qk / |xj − xk|
Note that the potential is negative for opposite charges and it is zero for infinitely distant particles. If a potential goes to zero at infinity, then a negative potential is also called attractive, and a positive potential repulsive. Note that the physics (equations of motion) is insensitive to
adding an overall constant to the potential U. Using this freedom we will always (implicitly)
assume that the potential goes to zero at infinity, whenever this is possible (i.e. whenever
limx→∞ U(x) exists and is finite). We use the same implicit convention for all constituents of
the potential, i.e. for Vj and Wjk as well.
The background potential originates from a fixed background charge distribution ̺(x), i.e.
it is also of Coulomb type:
Vj(x) = qj ∫_{Rd} ̺(y)/|x − y| dy = qj (| · |⁻¹ ⋆ ̺)(x)
Here the star denotes the convolution; in general

(f ⋆ g)(x) = ∫_{Rd} f(x − y) g(y) dy = ∫_{Rd} f(y) g(x − y) dy
In most cases ̺ is a sum of Dirac delta masses

̺(x) = ∑_{k=1}^{K} Qk δ(x − Rk)
representing fixed point charges Qk sitting at the points Rk. In this case

Vj(x) = ∑_{k=1}^{K} qj Qk / |x − Rk|
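In code, this background potential is a one-line sum; the sketch below (a hypothetical helper of my own, in units where the Coulomb constant is 1) evaluates Vj(x) for point charges Qk at positions Rk:

```python
import numpy as np

def background_potential(x, qj, charges, positions):
    """V_j(x) = sum_k qj * Qk / |x - Rk| for fixed point charges Qk at Rk."""
    return sum(qj * Qk / np.linalg.norm(x - Rk)
               for Qk, Rk in zip(charges, positions))

# Two unit charges at distance 1 on either side of the origin:
Q = [1.0, 1.0]
R = [np.array([1.0, 0.0, 0.0]), np.array([-1.0, 0.0, 0.0])]
print(background_potential(np.zeros(3), qj=1.0, charges=Q, positions=R))  # 2.0
```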
The simplest possible model is one single point charge moving in a zero potential field. It
is called the free particle and its Hamiltonian is
H = p²/(2m).
With initial position xin and momentum pin at time t0, the equations of motion

ẋ = p/m,    ṗ = 0

have the trivial solution

x(t) = xin + (pin/m)(t − t0),    p(t) = pin
If several free particles are moving in a zero potential field, then

H = ∑_{j=1}^{N} pj²/(2mj)

and each particle follows its own trajectory

xj(t) = xj^in + (pj^in/mj)(t − t0),    pj(t) = pj^in
without ever noticing each other.
The next simplest model is a single charged particle, with mass m and charge q, moving
in the background of another particle with charge Q that is considered fixed at R ∈ Rd . The
Hamiltonian is

H = p²/(2m) + qQ/|x − R|
Notice that R is considered as a parameter, i.e. it is not a dynamical variable. By shifting the
origin, we can assume R = 0. If the single charged particle is an electron with charge q = −e,
and the fixed particle is a nucleus with proton number Z, i.e. with charge Q = Ze, then
H = p²/(2m) − Ze²/|x|    (1.7)
This is the Hamiltonian of a hydrogenic atom; if Z = 1 then it is exactly the Hydrogen atom.
In full generality, we can consider K nuclei with charges Qk = Zk e and masses Mk ,
k = 1, 2, . . . , K, located at positions R = (R1 , R2 , . . . RK ), and N electrons, each with charge
q = −e and mass m at locations x1 , . . . xN . Then the potential of the N electrons and K
nuclei is
VC(x, R) = − ∑_{j=1}^{N} ∑_{k=1}^{K} Zk e²/|xj − Rk| + ∑_{k<ℓ} Zk Zℓ e²/|Rk − Rℓ| + ∑_{j<ℓ} e²/|xj − xℓ|    (1.8)
The first term represents the attraction between the electrons and the nuclei, the second
term is the nuclei-nuclei repulsion while the last term is the electron-electron repulsion. The
attractive terms are negative, the repulsive ones are positive.
The full Hamiltonian of the N electrons is
H = ∑_{j=1}^{N} pj²/(2m) + VC(x, R)    (1.9)
if the nuclei are considered fixed. In this case their positions are parameters, and H is defined
on the phase space of the N electrons, i.e. on RdN × RdN .
If we consider the nuclei dynamical as well, we need to introduce their momentum variables,
call them (P1 , P2 , . . . , PK ). The Hamiltonian of the N electrons and K nuclei thus is given by
H = ∑_{j=1}^{N} pj²/(2m) + ∑_{k=1}^{K} Pk²/(2Mk) + VC(x, R).    (1.10)
This formula is the complete Hamiltonian of a molecule consisting of N electrons and K nuclei. Since in reality Mk ≫ m (the mass of the proton is about 1800 times the mass of the electron), we often consider the simplified model (1.9) where the nuclei are considered so heavy (formally Mk = ∞) that they are treated as static particles.
One key feature of all these Coulombic Hamiltonians is that the range of the Hamiltonian function H is the whole of R; in particular, arbitrarily negative energies can be achieved. Assuming some radiation mechanism that is able to suck energy out of the system, the energy of a Hydrogen atom (1.7) can be driven arbitrarily negative, just by placing the electron closer and closer to the attractive nucleus. Thus the Hydrogen atom would not be stable; the electron would collapse into the nucleus. Moreover, it could release an infinite amount of energy – this is clearly unphysical.
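A two-line numerical check makes the unboundedness of (1.7) tangible (my own toy calculation, in units with m = e = Z = 1): at zero momentum the energy is −1/|x|, so shrinking |x| drives H to −∞.

```python
import numpy as np

def hydrogen_energy(x, p, m=1.0, Z=1.0, e=1.0):
    """Classical hydrogenic Hamiltonian H = p^2/(2m) - Z e^2/|x|, cf. (1.7)."""
    return np.dot(p, p) / (2 * m) - Z * e**2 / np.linalg.norm(x)

p = np.zeros(3)
for r in [1.0, 1e-3, 1e-6, 1e-9]:
    x = np.array([r, 0.0, 0.0])
    print(f"|x| = {r:.0e}  ->  H = {hydrogen_energy(x, p):.3e}")
# The energies scale like -1/|x|, so inf H = -infinity.
```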
This problem was noticed well before the discovery of quantum mechanics. One possible explanation is that the assumption about point particles is wrong; indeed the nucleus has a nonzero diameter of about 10⁻¹³ cm. However, the typical size of the Hydrogen atom is 10⁻⁸ cm, i.e. several orders of magnitude bigger than the nucleus. Therefore the extended shape of the nucleus cannot explain the non-collapse of the electron on a much bigger scale.
This question is known as the problem of the stability of Hydrogen, and similarly one can ask whether a molecule of N electrons and K nuclei is stable, in the sense of whether inf H > −∞ or inf H = −∞. As the formula (1.8) immediately shows, the Hamiltonians (1.9), (1.10) are unstable in this sense. Such a scenario would have dramatic consequences for the world: it would indicate that after a long time the electrons of atoms and molecules would fall into their respective nuclei, matter would look rather like a dense soup instead of consisting of fairly well separated particles, and a huge amount of energy would be released.
It was one of the great triumphs of early quantum mechanics that it could explain why in the quantum model of the Hydrogen atom such a collapse does not occur. It took more than 40 years after that before the similar but stronger stability statement (“stability of matter of the second kind” – see the definition later) was discovered and rigorously proven for molecules and for general Coulomb systems.
2 Quantum mechanics

2.1 States
The state of a single particle in quantum mechanics is given by a complex valued wave function
ψ : Rd → C
defined on the classical configuration space, Rd, or on a subset Ω ⊂ Rd. Unlike in classical mechanics, where altogether 2d numbers were sufficient to specify the state (d position and d momentum coordinates), in quantum mechanics the state is given by a whole function, i.e. infinitely many numbers.
The non-negative function x ↦ |ψ(x)|² on Rd is interpreted as a probability density, i.e. for any subset Ω ⊂ Rd,

∫_Ω |ψ(x)|² dx = Prob{ the particle is in Ω }    (2.1)
Since we wish to interpret |ψ(x)|² as a probability density, we always assume the normalization condition

∫_{Rd} |ψ(x)|² dx = 1 .
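As a quick numerical illustration of (2.1) (my own example, not part of the notes), take the normalized Gaussian ψ(x) = π^{−1/4} e^{−x²/2} in d = 1: the density |ψ|² sums to 1 on a fine grid, and summing it over an interval Ω gives the probability of finding the particle there.

```python
import numpy as np

# Normalized Gaussian wave function in d = 1: psi(x) = pi^{-1/4} exp(-x^2/2)
x = np.linspace(-10.0, 10.0, 200001)
dx = x[1] - x[0]
psi = np.pi ** -0.25 * np.exp(-x**2 / 2)
density = np.abs(psi) ** 2

total = density.sum() * dx              # should be close to 1
mask = (x >= -1.0) & (x <= 1.0)         # Omega = [-1, 1]
prob_omega = density[mask].sum() * dx   # Prob{particle in Omega}, about erf(1)
print(f"total probability {total:.6f}, Prob(x in [-1,1]) = {prob_omega:.4f}")
```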
Therefore, the natural state space of a single quantum particle is the unit sphere in L²(Rd), the space of square integrable functions:

L²(Rd) = { ψ : Rd → C : ∫_{Rd} |ψ(x)|² dx < ∞ }
(the integral here is understood in the Lebesgue sense). We recall that the L²-space is equipped with a natural scalar product

⟨f, g⟩ = ∫_{Rd} \overline{f(x)} g(x) dx

and with a norm

‖f‖ = ‖f‖₂ = √⟨f, f⟩

and it is a Hilbert space, i.e. it is complete with respect to this norm. Since we will mostly use this L²-norm, we usually omit the subscript 2, i.e. ‖f‖ will always denote the L²-norm, by convention.
The definition (2.1) leaves a lot of room for discussion, especially as to what we mean by probability here. As we said, we are not going into fundamentalist issues; we just mention that quantum mechanics does not allow one to determine the precise position of the particle in any measurement (uncertainty principle). Moreover, we point out the experimental fact that the outcome of a quantum experiment is not a deterministic quantity, but rather a random number: if the same experiment is repeated several times, the measuring apparatus may show different numbers; it is only their statistics that is meaningful, i.e. we can ask what the probability that the gauge in the apparatus shows number 1 is, or what the expectation value of the shown number is if many identical experiments are performed.
2.2 Observables
It is a fact that not everything can be measured in quantum mechanics. The wave function
in principle contains all information about the state, nevertheless not every property of ψ is
accessible by measurements. By definition, the measurable quantities are those that can be
represented by self-adjoint (linear) operators O acting on L2 (Rd ), i.e. O : L2 (Rd ) → L2 (Rd );
these are called observables. The result of the measurement on the state ψ is given by
⟨ψ, Oψ⟩ = Expected value of the measurement O in state ψ    (2.2)

and it is always a real number. Without the normalization condition ‖ψ‖ = 1, the expected value of the measurement is given by

⟨ψ, Oψ⟩ / ⟨ψ, ψ⟩ .
Recall that, apart from non-trivial and non-negligible domain questions, self-adjointness means that O is symmetric, i.e.

⟨ψ, Oχ⟩ = ⟨Oψ, χ⟩,    ψ, χ ∈ D(O) ⊂ L²(Rd)

and that it is defined on the same domain as its adjoint, D(O) = D(O*). To facilitate the introduction, we do not worry about domain questions for the moment. For those who feel cheated, just think about bounded symmetric operators for now; it is a fact that any bounded operator can be extended uniquely to the whole Hilbert space, even if it was originally defined only on a dense subset (see Theorem I.7 of Reed-Simon Vol. I), thus any symmetric bounded operator is self-adjoint.
Note that any measurable quantity (2.2) is quadratic in ψ; e.g. it does not make sense to ask for the integral ∫ ψ(x) dx. Moreover, an overall phase factor is invisible in experiments, i.e. if we multiply the wave function by a phase factor e^{iα}, α ∈ R, then clearly

⟨e^{iα}ψ, O(e^{iα}ψ)⟩ = ∫_{Rd} e^{−iα} \overline{ψ(x)} O(e^{iα}ψ(x)) dx = e^{−iα} e^{iα} ∫_{Rd} \overline{ψ(x)} (Oψ)(x) dx = ⟨ψ, Oψ⟩
where the linearity of O has been used. This means that no measurement can distinguish between the state ψ and the state e^{iα}ψ, so one may even identify these states; in mathematical language, take the factor space with respect to the equivalence relation ψ ∼ χ iff there is α ∈ R such that χ = e^{iα}ψ.
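The phase-invariance computation is easy to replicate in a finite-dimensional toy model (my own setup: a random Hermitian matrix plays the role of the observable O):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
O = A + A.conj().T                       # Hermitian matrix: a toy observable
psi = rng.normal(size=4) + 1j * rng.normal(size=4)
psi /= np.linalg.norm(psi)               # normalization ||psi|| = 1

expval = lambda phi: np.vdot(phi, O @ phi)  # <phi, O phi>; vdot conjugates the first slot
alpha = 0.7                                  # an arbitrary phase
phase_invariant = np.allclose(expval(np.exp(1j * alpha) * psi), expval(psi))
real_valued = abs(expval(psi).imag) < 1e-12
print(f"phase invariant: {phase_invariant}, expectation real: {real_valued}")
```

Both checks succeed: the expectation value is insensitive to the overall phase and, O being Hermitian, real up to rounding.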
2.3 Position and momentum
The observable measuring the position is the multiplication operator by the variable x, i.e.

⟨ψ, xψ⟩ = ∫_{Rd} \overline{ψ(x)} x ψ(x) dx = ∫_{Rd} x |ψ(x)|² dx
which is clearly the first moment of the probability distribution |ψ(x)|². [Remark: Since x is a d-vector, this is actually a vector-valued observable; each coordinate of x = (x^{(1)}, . . . , x^{(d)}) is a real valued observable, and ⟨ψ, xψ⟩ is interpreted as a d-vector whose components are the real valued observables ⟨ψ, x^{(j)}ψ⟩, j = 1, 2, . . . , d.]
The observable measuring the momentum is −i times the derivative operator, p = −i∇ = −i∇x, i.e.

⟨ψ, (−i∇x)ψ⟩ = −i ∫_{Rd} \overline{ψ(x)} ∇x ψ(x) dx    (2.3)
Remark 2.1 Later we will insert a constant – Planck’s constant ℏ (3.2) – into the definition, i.e. p = −iℏ∇. This is clearly necessary even for dimensional reasons: the momentum has dimension (mass) · (length) · (time)⁻¹, while the derivative has dimension (length)⁻¹, thus we need a constant of dimension (mass) · (length)² · (time)⁻¹ to compensate. Its exact value determines the relation between the classical world (derivative, length) and the quantum world (quantum momentum). However, later (see Section 6) we will choose units where ℏ = 1, and in most of the course we will not see ℏ at all. So in this section, for simplicity, we drop ℏ.
Simple integration by parts in (2.3) shows that −i∇ is (formally) self-adjoint, i.e.

⟨χ, (−i∇x)ψ⟩ = ⟨(−i∇x)χ, ψ⟩

This relation certainly holds for sufficiently smooth (e.g. once continuously differentiable) functions that decay sufficiently at infinity (for example, compactly supported ones). Later we will see how to determine the correct domain of self-adjointness of −i∇. The meaning of −i∇x in position space is more obscure than that of the position operator x, but if we rewrite it in Fourier space, we see a complete duality between position and momentum.
We recall that the Fourier transform of ψ is defined (formally) as

ψ̂(k) = ∫_{Rd} ψ(x) e^{−2πix·k} dx
This definition is meaningful if ψ is an integrable function, i.e. ψ ∈ L¹(Rd), but it can be extended to L²(Rd); moreover, it turns out to be an isomorphism of L²(Rd) (i.e. the map “taking the Fourier transform” is a bijection from L²(Rd) onto L²(Rd) and it preserves the scalar product). In particular, we have Plancherel’s formula (theorem)
∫_{Rd} \overline{ψ̂(k)} χ̂(k) dk = ∫_{Rd} \overline{ψ(x)} χ(x) dx

in particular

‖ψ̂‖ = ‖ψ‖ .
The inverse Fourier transform is given by

f̌(x) = ∫_{Rd} f(k) e^{2πix·k} dk

(note the positive sign in the exponent!) and it can be shown that indeed

ψ = (ψ̂)ˇ = (ψ̌)ˆ    (2.4)
Formally the first relation can be seen from

(ψ̂)ˇ(x) = ∫_{Rd} e^{2πix·k} ( ∫_{Rd} ψ(y) e^{−2πiy·k} dy ) dk
        = ∫_{Rd} ψ(y) ( ∫_{Rd} e^{2πi(x−y)·k} dk ) dy
        = ∫_{Rd} ψ(y) δ(x − y) dy    (2.5)
        = ψ(x)
however, here the application of the Fubini theorem and the usage of the delta function are not fully rigorous; nevertheless, (2.4) can be established rigorously as an identity between L² functions (in particular, one does not expect it to hold for every x, only for almost every x).
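Numerically, the inversion (2.4) is the statement that transforming and then inverse-transforming returns the original function; the discrete analogue (a quick check of mine using numpy’s FFT pair) is:

```python
import numpy as np

rng = np.random.default_rng(2)
psi = rng.normal(size=256) + 1j * rng.normal(size=256)  # an arbitrary sample vector

# Discrete analogue of psi = (psi-hat)-check: FFT followed by inverse FFT
roundtrip = np.fft.ifft(np.fft.fft(psi))
print("psi recovered:", np.allclose(roundtrip, psi))
```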
Moreover, it follows directly from the definition of ψ̂ that

⟨ψ, (−i∇x)ψ⟩ = −i ∫_{Rd} \overline{ψ(x)} ∇x ψ(x) dx = 2π ∫_{Rd} k |ψ̂(k)|² dk

and similarly

⟨ψ, −∆ψ⟩ = ⟨ψ, (−i∇)·(−i∇)ψ⟩ = ∫_{Rd} |∇ψ(x)|² dx = (2π)² ∫_{Rd} k² |ψ̂(k)|² dk    (2.6)
In summary, the action of the momentum operator −i∇x on ψ(x) is just multiplication by 2πk in the Fourier representation ψ̂(k). This correspondence works in the other direction as well:

⟨ψ, 2πx ψ⟩ = ⟨ψ̂, i∇k ψ̂⟩

i.e. there is a complete duality between position and momentum and between the position space representation of the state, ψ(x), and its momentum space representation ψ̂(k). Derivative in one representation corresponds to multiplication by the variable (times 2π) in the other representation.
This correspondence holds even pointwise via the following formulas:

[−i∇ψ]ˆ(k) = 2πk ψ̂(k),    [2πx ψ]ˆ(k) = i∇k ψ̂(k)    (2.7)
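The correspondence (2.7) is exactly what makes FFT-based “spectral differentiation” work. A sketch (my own, on a finite grid, so only approximate equality is expected; the Gaussian and the grid sizes are arbitrary choices):

```python
import numpy as np

# Sample psi(x) = exp(-x^2) on a grid wide enough that psi is ~0 at the ends.
n, L = 1024, 20.0
x = np.linspace(-L / 2, L / 2, n, endpoint=False)
dx = x[1] - x[0]
psi = np.exp(-x**2)

# With the convention psi_hat(k) = ∫ psi(x) e^{-2πi x·k} dx, the discrete
# frequencies np.fft.fftfreq(n, d=dx) play the role of k, and (2.7) says that
# applying d/dx amounts to multiplying psi_hat by 2πik.
k = np.fft.fftfreq(n, d=dx)
dpsi_spectral = np.fft.ifft(2j * np.pi * k * np.fft.fft(psi)).real
dpsi_exact = -2 * x * psi                 # analytic derivative of exp(-x^2)

print("max error:", np.max(np.abs(dpsi_spectral - dpsi_exact)))
```

For this well-resolved Gaussian the error is near machine precision, which is the discrete shadow of the exact duality (2.7).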
Remark 2.2 (VERY IMPORTANT) It is a useful rule of thumb to think of x as carrying the dimension of a length, while the Fourier variable, k, carries the dimension of (length)⁻¹. In general, large scale properties of a function ψ(x) (i.e. behaviour for |x| ≫ 1) are reflected in the short scale properties of ψ̂(k), i.e. |k| ≪ 1; e.g. for a function ψ that decays slowly in x-space, we will have a singularity at k = 0 in its Fourier transform ψ̂(k). Vice versa: short scale properties of ψ are reflected in large scale properties of ψ̂. In physics, the first regime is called the infrared (IR) regime (long wavelength = large distance), the second one the ultraviolet (UV) regime (short wavelength = short distance).
Recall that the Fourier transform expresses oscillations in a function. The oscillation has a natural lengthscale (wavelength), and its inverse (frequency) is the corresponding Fourier variable k (often also called a mode). Thus ψ̂(k) tells us how much oscillation with wavelength k⁻¹ occurs in ψ. Oscillation is closely related to the derivative: higher frequency content (big ψ̂(k) for some large k) implies a higher derivative of ψ; this is clear from (2.7).
You can read more about the Fourier transform in Lieb-Loss: Analysis, Chapter 5, or in a handout to be published later. A final remark is that the Fourier transform always carries a 2π, and there are different conventions as to where one tucks the 2π in. We used the convention of the book Lieb-Loss: Analysis, while Reed-Simon defines the Fourier transform and its inverse as

ψ̂(k) = (2π)^{−d/2} ∫_{Rd} ψ(x) e^{−ix·k} dx,    f̌(x) = (2π)^{−d/2} ∫_{Rd} f(k) e^{ix·k} dk

The discrepancy is irrelevant as long as one is aware of it and checks at the beginning of each book which convention is used.
3 Hamiltonian: the generator of the time evolution
The energy is the most important measurable quantity; the corresponding quantum observable is the Hamilton operator H. It is a self-adjoint operator defined on L²(Rd), thus its expected value is always real (analogously, the Hamilton function (1.1) is real valued). The Hamiltonian generates the time evolution of the state of the system, i.e. the time evolution of the time dependent wave function ψ(t), via the Schrödinger equation
iℏ ∂t ψ = Hψ    (3.1)

where

ℏ = 1.05 × 10⁻³⁴ Joule·sec = 1.05 × 10⁻²⁷ g cm² sec⁻¹    (3.2)

is a universal physical constant (Planck’s constant divided by 2π) – we will discuss the units later.
Similarly to classical mechanics, the Hamiltonian (which is now an operator and not a function) contains all physical information about the system, so any modelling in physics starts with determining H. We remark that, similarly to classical mechanics, this formalism applies only to non-fully-relativistic situations, i.e. where there is an absolute time. Otherwise, the quantized version of the Lagrangian formalism is needed.
The Schrödinger equation is a first order evolution equation. With a given initial data, ψ(t0) = ψ0, at a fixed time t0, it has a unique solution

ψ(t) = e^{−i(t−t0)ℏ⁻¹H} ψ0    (3.3)

Formal substitution shows that (3.3) indeed solves (3.1). The main questions, however, are what exactly the exponential on the right hand side of (3.3) is, and whether the formal rules of differentiation really apply.
Even before we try to make sense of the formula for the solution, the first question is whether the solution exists at all, and if yes, whether it is unique. Being a simple evolution equation, from standard ODE theory we know that existence and uniqueness (at least locally) is guaranteed by Lipschitz continuity, i.e. by the existence of a constant K such that

‖Hψ − Hψ̃‖ ≤ K‖ψ − ψ̃‖

which, by the linearity of H, is equivalent to H being bounded.
If H were a bounded operator (in particular a matrix acting on the finite dimensional Hilbert space C^N), then one could define e^{itH} for any constant t ∈ R by a power series:

e^{itH} = ∑_{n=0}^{∞} (it)ⁿ Hⁿ / n!    (3.4)
Exercise 3.1 Check that this series converges in the operator norm and that the usual rules of differentiation apply, in particular (d/dt) e^{itH} = iH e^{itH}; actually the same holds for any power e^{cH} with c ∈ C. Along the way, you will have to check directly from (3.4) that

e^{itH} e^{isH} = e^{i(t+s)H}
As we will see in a moment, the most important Hamilton operators are unbounded, since they contain derivatives, and derivative operators are never bounded in L²: an inequality of the form ‖∇ψ‖ ≤ K‖ψ‖ could NEVER hold (WHY?). In particular, typical Hamiltonians are not defined on the whole of L²(Rd) [recall the Hellinger-Toeplitz theorem, Corollary to Theorem III.12 in Reed-Simon Vol. I]. Therefore not only does the series (3.4) not converge, but it is even questionable whether there is any element of the Hilbert space to which the right hand side could be applied term by term (in principle it could be that the intersection of all the domains D(Hⁿ), n = 1, 2, 3, . . ., is trivial).
It turns out that the symmetry of H is not sufficient to define e^{itH}; to define the dynamics, we will need self-adjointness. The definition will go through the spectral theorem, which is a generalization of the diagonalization of hermitian matrices to unbounded operators. Recall that if H is a finite hermitian matrix on C^N, H = H*, then, alternatively to (3.4), one could define e^{itH} as

e^{itH} = U e^{itD} U*    (3.5)

where H = UDU* is the diagonalization of H, i.e. U is a unitary matrix (U⁻¹ = U*) containing the orthonormalized eigenbasis of H and D = diag(λ1, λ2, . . . , λN) is a diagonal matrix containing the eigenvalues (with multiplicity). The exponential of the diagonal matrix, e^{itD}, is defined as the diagonal matrix with entries e^{itλj}.
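Exercise 3.2 can at least be sanity-checked numerically; in the sketch below (my own, for a random 3×3 Hermitian matrix) the truncated power series (3.4) and the spectral definition (3.5) agree to machine precision:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
H = (A + A.conj().T) / 2                  # Hermitian matrix, H = H*
t = 0.5

# Definition (3.4): truncated power series sum_n (it)^n H^n / n!
series = np.zeros((3, 3), dtype=complex)
term = np.eye(3, dtype=complex)
for n in range(40):                       # 40 terms is far past convergence here
    series = series + term
    term = term @ (1j * t * H) / (n + 1)

# Definition (3.5): diagonalize H = U D U* and exponentiate the eigenvalues
evals, U = np.linalg.eigh(H)
spectral = U @ np.diag(np.exp(1j * t * evals)) @ U.conj().T

print("definitions agree:", np.allclose(series, spectral))
print("e^{itH} unitary:", np.allclose(spectral @ spectral.conj().T, np.eye(3)))
```

The second check also confirms that e^{itH} is unitary, which is what makes it a sensible time evolution.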
Exercise 3.2 Check that the two definitions (3.4) and (3.5) coincide for hermitian matrices.
The precise formulation of the spectral theorem for unbounded operators is fairly long;
we will do it only later, when we will really need it. You can read the statement in Section
VIII.3 of Reed Simon, but it may seem scary for the moment. The essence is that with its
help one can define functions of self-adjoint operators: not only do polynomials of $H$ make sense (like $H^2, H^3, \dots$), but for essentially any function $f(\lambda)$ of a real argument $\lambda \in \mathbb{R}$ one can define an operator $f(H)$ that coincides with the usual definition for polynomials and obeys all standard "calculus" rules. E.g. with the function $f_t(\lambda) = e^{it\lambda}$ one can define the operators $e^{itH}$ for any $t$ in such a way that e.g. $e^{itH}e^{isH} = e^{i(t+s)H}$ holds. It is quite remarkable
that such a powerful calculus exists with operators that are more complicated objects than
functions. The spectral theorem says that all these are possible, if H is self-adjoint.
We remark that for certain special Hamiltonians, $e^{itH}$ can be computed easily without reference to the spectral theorem. For example, if $H = V(x)$, i.e. no kinetic energy is present, then clearly
$$\psi_t(x) = e^{-itV(x)/\hbar}\,\psi_0(x)$$
solves the Schrödinger equation (3.1). Similarly, if $H = -\hbar^2\Delta$, then $e^{it\hbar\Delta}$ acts as a multiplication in Fourier space, so
$$\widehat{\psi_t}(k) = \widehat{e^{it\hbar\Delta}\psi_0}(k) = e^{-it\hbar(2\pi k)^2}\,\widehat{\psi_0}(k)$$
is the Fourier transform of the solution to (3.1). [See Homework problem.] Unfortunately, such explicit formulas are not available in the general case $H = -\Delta + V$, and $e^{-itH}$ cannot be "put together" (at least not easily) from $e^{-itV(x)}$ and $e^{it\Delta}$, because $V(x)$ and $\Delta$ do not commute; thus
$$e^{-it(-\Delta+V)} \neq e^{it\Delta}\,e^{-itV}.$$
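This failure of exponential splitting for non-commuting operators can be illustrated on a $2\times2$ toy model (an illustration, not part of the notes: the matrices below merely stand in for the kinetic and potential parts):

```python
import numpy as np

def expm_sym(M):
    """Matrix exponential of a real symmetric matrix via diagonalization."""
    lam, U = np.linalg.eigh(M)
    return U @ np.diag(np.exp(lam)) @ U.T

# A plays the role of it*Delta (off-diagonal), B the role of -it*V
# (diagonal, like a multiplication operator); they do not commute.
A = np.array([[0., 1.], [1., 0.]])
B = np.array([[1., 0.], [0., -1.]])

assert not np.allclose(A @ B, B @ A)                              # [A, B] != 0
assert not np.allclose(expm_sym(A + B), expm_sym(A) @ expm_sym(B))

# Splitting does work asymptotically (the Trotter product formula):
n = 10000
trotter = np.linalg.matrix_power(expm_sym(A / n) @ expm_sym(B / n), n)
assert np.allclose(trotter, expm_sym(A + B), atol=1e-2)
```

The last lines hint at how $e^{-itH}$ can still be approximated by alternating the two factors, a fact used heavily in numerical practice.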
4 Hamiltonian: the energy
Similarly to the classical case, the energy $E_\psi$ of a single quantum particle in state $\psi$ is the sum of two parts, a kinetic energy and a potential energy,
$$E_\psi = T_\psi + V_\psi$$
where the kinetic energy (without magnetic fields) is
$$T_\psi = \frac{\hbar^2}{2m}\int_{\mathbb{R}^d} |\nabla\psi(x)|^2\, dx$$
and the potential energy is
$$V_\psi = \int_{\mathbb{R}^d} V(x)\,|\psi(x)|^2\, dx,$$
where $V(x)$ is a real valued function (the potential). Written with the observable notation,
$$T_\psi = \frac{1}{2m}\int_{\mathbb{R}^d} |(p\psi)(x)|^2\,dx = \Big\langle \psi, \frac{p^2}{2m}\psi\Big\rangle = \frac{1}{2m}\langle -i\hbar\nabla\psi, -i\hbar\nabla\psi\rangle = \frac{\hbar^2}{2m}\langle \psi, -\Delta\psi\rangle,$$
where we used the notation
$$p = -i\hbar\nabla_x$$
to replace the classical momentum $p$ with an operator (the notation is a bit sloppy: one really should distinguish the operator $p$ from the classical momentum, e.g. by putting a hat on it, $\hat p = -i\hbar\nabla_x$, but we will not use the classical momentum later). We see that with this replacement the quantum kinetic energy is formally the same as the classical kinetic energy in (1.3).
The potential energy is even easier; in observable form it is
$$V_\psi = \langle \psi, V\psi\rangle.$$
Thus the total energy is represented by the expectation value
$$E_\psi = \Big\langle \psi, \Big(\frac{p^2}{2m}+V\Big)\psi\Big\rangle = \frac{\hbar^2}{2m}\int_{\mathbb{R}^d}|\nabla\psi(x)|^2\,dx + \int_{\mathbb{R}^d} V(x)|\psi(x)|^2\,dx \qquad (4.1)$$
of the Hamilton operator
$$H = \frac{p^2}{2m}+V = -\frac{\hbar^2}{2m}\Delta + V,$$
which acts on any function $\psi$ as follows:
$$(H\psi)(x) = -\frac{\hbar^2}{2m}(\Delta\psi)(x) + V(x)\psi(x),$$
i.e. the potential acts as a multiplication operator. Notice that the second identity in (4.1) requires an integration by parts, so one should worry about the domain problem, i.e. precisely for which $\psi$'s both expressions are well defined.
As an example, the Hamiltonian of a hydrogenic atom, assuming that the nucleus with charge $Ze$ is fixed at the origin and we describe only the single electron with charge $-e$, is given by
$$H_{\mathrm{Hydr}} = -\frac{\hbar^2}{2m}\Delta_x - \frac{Ze^2}{|x|}. \qquad (4.2)$$
The $Z=1$ case is the usual Hydrogen atom.
In almost all cases in quantum mechanics, the Hamilton operator can be decomposed into a kinetic energy operator $H_0$ and a potential energy operator:
$$H = H_0 + V.$$
In most cases $H_0$ is a (pseudo)differential operator, while $V$ is a multiplication operator. One may think that the potential is the "easy" part and the kinetic energy is the complicated one; after all, multiplication by a function seems easier than differentiation. However, if one writes the action of $H$ in Fourier space then, by duality, differentiation and multiplication get interchanged. In particular, as (2.7) shows, a derivative in position space corresponds to multiplication in Fourier space and vice versa.
Independently of the representation, the key point is that typically $H_0$ and $V$ do not commute, since $p$ and $x$ do not commute. This is the basic reason why quantum mechanics cannot be described by classical objects like functions, which commute; one needs matrices, or "generalized matrices", i.e. operators. One of the fundamental observations of the "founding fathers" of quantum mechanics was that this non-commutativity is needed.
We mention that in the presence of a magnetic field $B = \nabla\times A$ in $d=3$ dimensions, the kinetic energy of a particle with mass $m$ and charge $q$ is modified:
$$T_{A,\psi} = \frac{1}{2m}\int_{\mathbb{R}^3} \Big|\Big(-i\hbar\nabla - \frac{q}{c}A(x)\Big)\psi(x)\Big|^2 dx = \langle\psi, H_A\psi\rangle \qquad (4.3)$$
with
$$H_A = \frac{1}{2m}\Big(p - \frac{q}{c}A(x)\Big)^2 = \frac{1}{2m}\Big(-i\hbar\nabla - \frac{q}{c}A(x)\Big)^2$$
being the kinetic energy operator. Notice that the formula is gauge invariant, i.e. for any real function $\chi:\mathbb{R}^3\to\mathbb{R}$ the kinetic energy operators with $A$ and with $A+\nabla\chi$ are unitarily equivalent:
$$H_{A+\nabla\chi} = U(\chi)\, H_A\, U^*(\chi) \qquad (4.4)$$
where $U(\chi)$ is the multiplication operator by the complex phase $\exp\big(i(q/\hbar c)\chi(x)\big)$.
Exercise 4.1 Check the formula (4.4)!
We add two remarks that will be relevant later on. First, the magnetic field itself has an energy (the price to be paid to the power company to generate this field [E. Lieb]). In standard CGS units (and in $d=3$) it is
$$\frac{1}{8\pi}\int |B(x)|^2\, dx, \qquad (4.5)$$
where the magnetic field has dimension $(\mathrm{mass})^{1/2}\cdot(\mathrm{length})^{-1/2}\cdot(\mathrm{time})^{-1}$, i.e. it is measured in $\mathrm{g}^{1/2}\,\mathrm{cm}^{-1/2}\,\mathrm{sec}^{-1}$. If we allow the system to adjust its own magnetic field, the field energy must also be included in the total energy of the system, although it is not an operator but just a number. Second, we remark that the magnetic kinetic energy (4.3) applies only to spinless particles; in particular, strictly speaking, not to electrons, which have spin $\frac12$. We will later introduce the corresponding kinetic energy (the Pauli operator) that also takes spin into account.
5 Ground state energy
In the previous section we explained that in order to define the time evolution, one first needs to establish the self-adjointness of $H$ (on a suitable domain), and second one needs the spectral theorem. Both steps require some non-trivial preparation from functional analysis and are not very intuitive. So we postpone their discussion and first focus on an aspect of quantum mechanics that can be presented with far fewer technicalities.
We mentioned that due to radiation, systems tend to settle into their low energy states, especially their ground states. Finding the lowest energy state of a system is also important because it tells us how much energy the system can release at most. If we had a system whose energy were unbounded from below, then by driving it into lower and lower energy states we could extract an infinite amount of energy from it – clearly a cheap solution to all our energy problems.
So one of the very first questions about a quantum Hamiltonian is whether it is bounded from below; more precisely, whether its ground state energy, defined as
$$E_0 = \inf\big\{ E_\psi = \langle\psi, H\psi\rangle \,:\, \|\psi\| = 1 \big\},$$
is finite or minus infinity. The infimum may not be attained even if $E_0 > -\infty$; nevertheless we still call it the ground state energy. If there is a minimizer $\psi_0$ such that $E_0 = E_{\psi_0}$, then $\psi_0$ is called the ground state. If $E_0 > -\infty$, then we say that the system satisfies stability of the first kind.
Note that to pose the question of stability one can avoid defining the domain of $H$ precisely; we can simply use the following definition of the energy:
$$E_\psi = T_\psi + V_\psi = \frac{\hbar^2}{2m}\int_{\mathbb{R}^d}|\nabla\psi(x)|^2\,dx + \int_{\mathbb{R}^d} V(x)|\psi(x)|^2\,dx.$$
The advantage of this definition (called the quadratic form of $H$) is that the kinetic energy term is the integral of a non-negative quantity, and if $\psi$ is non-differentiable, more precisely if its derivative does not coincide almost everywhere with an $L^2$-function, then we simply set $T_\psi = \infty$. The potential may have both a positive and a negative part; we decompose it as
$$V(x) = [V(x)]_+ - [V(x)]_-,$$
where for any real number $a$,
$$[a]_+ = \max\{a, 0\}, \qquad [a]_- = -\min\{0, a\}$$
denote the positive and the negative parts, respectively.
As long as the negative part of the potential is controlled by the kinetic energy, in the sense that
$$\int_{\mathbb{R}^d} [V(x)]_-\,|\psi(x)|^2\,dx \le T_\psi + K\|\psi\|^2 \qquad (5.1)$$
for some finite constant $K$ independent of $\psi$, then $E_\psi$ is bounded from below:
$$E_\psi = T_\psi + \int_{\mathbb{R}^d}[V(x)]_+|\psi(x)|^2\,dx - \int_{\mathbb{R}^d}[V(x)]_-|\psi(x)|^2\,dx \ge -K\|\psi\|^2 = -K,$$
taking into account the normalization $\|\psi\| = 1$. We have thus reduced the problem of stability to the proof of inequality (5.1), and notice that this inequality requires no worry about domains. If $T_\psi = \infty$, then the inequality always holds, so we can restrict our attention to those $\psi$'s with $T_\psi < \infty$ (this space will be called the $H^1$-Sobolev space). If the left hand side of (5.1) is infinite, then the inequality does not hold; otherwise we have to compare two finite numbers.
6 Units
Before we go on with more complicated formulas and concepts, we fix which physical units we use. The idea is that by choosing the units properly, all physical constants like the electron charge, mass, Planck constant etc. can be set equal to 1, and in this way we can focus on the structure of the formulas undisturbed by irrelevant constants.
In CGS units we have
• m= mass of the electron = 9.11 × 10−28 g
• e = (-1)× charge of the electron = 4.8 × 10−10 g1/2 cm3/2 sec−1
• c= speed of light = 3 × 1010 cm sec−1
• ~ = Planck’s constant divided by 2π = 1.055 × 10−27 g cm2 sec−1
The first three constants are known from classical physics. Planck's constant is the fundamental constant of quantum mechanics; essentially, it connects the momentum with the physical length scale through $p = -i\hbar\nabla_x$.
Out of these four constants there is a unique way to form a single dimensionless constant,
$$\alpha = \frac{e^2}{\hbar c} = \frac{1}{137.04},$$
which is called the fine structure constant. Since the electrostatic interaction potential is quadratic in the charge, $\alpha$ can be thought of as measuring the strength of this interaction.
The natural lengthscale is the Compton wavelength of the electron, defined as
$$\ell_C = \frac{\hbar}{mc} = 3.86\times 10^{-11}\ \mathrm{cm},$$
and the natural energy scale is the rest mass energy of the electron,
$$E_r = mc^2 = 8.2\times 10^{-7}\ \mathrm{ergs} = 8.2\times 10^{-7}\ \mathrm{g\,cm^2\,sec^{-2}}.$$
These are the natural units in relativistic quantum mechanics. In most of this course we will deal with non-relativistic quantum mechanics, so the appearance of the speed of light is unnatural.
Our basic unit of length will thus be
$$\ell = \frac{\ell_C}{2\alpha} = \frac{\hbar^2}{2me^2} = 2.65\times 10^{-9}\ \mathrm{cm}, \qquad (6.1)$$
which is half the Bohr radius $\hbar^2/(me^2)$, the typical size of the Hydrogen atom. Our energy unit will be
$$2mc^2\alpha^2 = \frac{2me^4}{\hbar^2} = 4\,\mathrm{Ry} = 8.73\times 10^{-11}\ \mathrm{ergs}, \qquad (6.2)$$
where 1 Rydberg (Ry) is the binding energy of the Hydrogen atom in its ground state.
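As a sanity check of these numerical values (an illustration, not part of the notes), the constants of this section can be recomputed directly from the CGS values listed above:

```python
# CGS values of the four constants listed at the start of this section.
m    = 9.11e-28     # electron mass [g]
e    = 4.8e-10      # elementary charge [g^(1/2) cm^(3/2) s^-1]
c    = 3e10         # speed of light [cm/s]
hbar = 1.055e-27    # Planck constant / 2pi [g cm^2 s^-1]

alpha = e**2 / (hbar * c)            # fine structure constant
lC    = hbar / (m * c)               # Compton wavelength [cm]
l     = hbar**2 / (2 * m * e**2)     # length unit (6.1) [cm]
E     = 2 * m * e**4 / hbar**2       # energy unit (6.2) = 4 Ry [erg]

assert abs(1/alpha - 137) < 1
assert abs(lC / l - 2 * alpha) < 1e-6    # consistency of (6.1): l = lC/(2*alpha)
print(f"alpha = 1/{1/alpha:.2f}, lC = {lC:.3g} cm, l = {l:.3g} cm, E = {E:.3g} erg")
```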
Changing the unit of length requires rescaling the wave function $\psi$; note that the normalization condition must be respected. Therefore, if $\psi(x)$ is the wavefunction in a certain unit, and we rescale space by a factor $\lambda$, i.e. $x \to X = x/\lambda$, then the wave function must be rescaled as
$$\psi(x) \;\to\; \lambda^{-3/2}\psi(x\lambda^{-1}) = \lambda^{-3/2}\psi(X) =: \widetilde\psi(X)$$
in the new unit. We can define a unitary transformation
$$(U_\lambda\psi)(x) = \lambda^{-3/2}\psi(x\lambda^{-1})$$
representing the change of the unit of length. It is easy to see that
$$U_\lambda^*\,\partial_x\,U_\lambda = \lambda^{-1}\partial_x$$
and
$$U_\lambda^*\,|\cdot|^\alpha\,U_\lambda = \lambda^\alpha\,|\cdot|^\alpha,$$
where $|\cdot|^\alpha$ denotes the multiplication operator by $|x|^\alpha$. Comparing these two formulas, we see that the derivative operator scales as $(\mathrm{length})^{-1}$, i.e. exactly as the Coulomb potential.
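These two operator identities can be checked numerically in a one-dimensional analogue (an illustration, not part of the notes; in 1D the unitarity prefactor is $\lambda^{-1/2}$ instead of $\lambda^{-3/2}$, and the test function is an arbitrary Gaussian):

```python
import numpy as np

lam = 1.7
psi = lambda x: np.exp(-x**2)                  # smooth test function

# 1D analogue of the dilation: (U_lam f)(x) = lam^{-1/2} f(x/lam)
U     = lambda f: (lambda x: lam**-0.5 * f(x / lam))
Ustar = lambda f: (lambda x: lam**+0.5 * f(x * lam))   # adjoint = inverse dilation

def D(f, h=1e-6):                              # numerical derivative d/dx
    return lambda x: (f(x + h) - f(x - h)) / (2*h)

x = np.array([0.3, -0.7, 1.1, 1.9])            # sample points away from 0

# U* (d/dx) U = lam^{-1} d/dx :
assert np.allclose(Ustar(D(U(psi)))(x), D(psi)(x) / lam, atol=1e-8)

# U* |x|^a U = lam^a |x|^a, e.g. with the Coulomb power a = -1:
a = -1.0
coulomb = lambda f: (lambda y: np.abs(y)**a * f(y))
assert np.allclose(Ustar(coulomb(U(psi)))(x), lam**a * np.abs(x)**a * psi(x))
```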
As an exercise, we compute the Hamiltonian of a hydrogenic atom (4.2) (written in CGS units) in our new units. We write the space coordinate $x$ in our unit of length (6.1) as
$$x = \ell X = \frac{\hbar^2}{2me^2}\,X,$$
then
$$\frac{\hbar^2}{2m}\Delta_x = \frac{\hbar^2}{2m}\Big(\frac{\hbar^2}{2me^2}\Big)^{-2}\Delta_X = \frac{2me^4}{\hbar^2}\,\Delta_X$$
and
$$\frac{Ze^2}{|x|} = \frac{Ze^2}{|X|}\Big(\frac{\hbar^2}{2me^2}\Big)^{-1} = \frac{2me^4}{\hbar^2}\,\frac{Z}{|X|}.$$
Thus, in the new units,
$$H = -\frac{\hbar^2}{2m}\Delta_x - \frac{Ze^2}{|x|} = \frac{2me^4}{\hbar^2}\Big(-\Delta_X - \frac{Z}{|X|}\Big),$$
so the Hamiltonian of the hydrogenic atom is
$$H = -\Delta_X - \frac{Z}{|X|} \qquad (6.3)$$
measured in our energy unit (6.2). Comparing with (4.2) we see that in our new units
$$2m = \hbar = e = 1.$$
Important rule of thumb: It is always good to check how various quantities scale with length. The derivative scales as $(\mathrm{length})^{-1}$, thus the Laplacian scales as $(\mathrm{length})^{-2}$ and the Coulomb potential scales as $(\mathrm{length})^{-1}$. Moreover, any integral $\int_{\mathbb{R}^3}(\dots)\,dx$ scales as $(\mathrm{length})^{3}$ and any Fourier variable (frequency) scales as $(\mathrm{length})^{-1}$. This indicates how each term behaves under the scaling $x \to \lambda x$. The scaling property of a term does not change under any correct mathematical manipulation, so this gives rise to a quick check: after a long calculation one can simply check whether the initial formula and the final formula scale in the same way with the length. If not, there is an error!

In particular, the two terms in the hydrogenic Hamiltonian scale differently with length, so the Hamiltonian has an intrinsic lengthscale, namely a lengthscale of order $Z^{-1}$ (measured in the units we use, in this case in $\ell$). For example, we will see that the eigenfunctions of $H$ live on a lengthscale $Z^{-1}$; this is the "built-in" lengthscale in $H$: if $Z$ had dimension $(\mathrm{length})^{-1}$, then the two terms in $H$ would scale in the same way.
Exercise 6.1 (i) Prove that the Hamiltonian of a hydrogenic atom in the relativistic units $\ell_C$ and $E_r$ is given by
$$H = -\frac{1}{2}\Delta_x - \frac{Z\alpha}{|x|}.$$
(ii) Show that a vector potential $A$ has dimension $\mathrm{g}^{1/2}\,\mathrm{cm}^{1/2}\,\mathrm{sec}^{-1}$ (i.e. charge/length) and the magnetic field $B$ has dimension $\mathrm{g}^{1/2}\,\mathrm{cm}^{-1/2}\,\mathrm{sec}^{-1}$ (i.e. charge/(length)$^2$). Choose $\ell_C^{-1}\sqrt{\hbar c}$ as the unit for $A$, and $\ell_C^{-2}\sqrt{\hbar c}$ as the unit for the magnetic field, and prove that the Hamiltonian of a hydrogenic atom in a magnetic field is given by
$$H = \frac{1}{2}\big(p + \sqrt{\alpha}\,A\big)^2 - \frac{Z\alpha}{|x|}$$
in the relativistic units $\ell_C$ and $E_r$.
(iii) Show that in the units indicated in part (ii), the field energy (4.5) remains unchanged, i.e. it is still given by
$$\frac{1}{8\pi}\int_{\mathbb{R}^3} |B(x)|^2\,dx.$$
This exercise shows that in the relativistic units
$$m = \hbar = c = 1$$
and $\sqrt{\alpha}$ is the elementary charge $e$.

Convention: To economize the formulas, we will often omit $dx$ and the domain from the integration, i.e. we write
$$\int f \quad\text{for}\quad \int_{\mathbb{R}^3} f(x)\,dx.$$
7 Stability of the hydrogenic atom
We now show that the hydrogenic atom, given by the Hamiltonian (6.3), is stable:

Theorem 7.1 Consider the set
$$M := \Big\{ \psi : \mathbb{R}^3 \to \mathbb{C} \,:\, \int_{\mathbb{R}^3}|\nabla\psi(x)|^2\,dx < \infty, \ \int_{\mathbb{R}^3}\frac{|\psi(x)|^2}{|x|}\,dx < \infty \Big\}$$
of (unnormalized) wave functions whose kinetic energy and Coulomb energy are finite. Then, for any $Z>0$, the ground state energy
$$E_0 = \inf\Big\{ \int_{\mathbb{R}^3}|\nabla\psi(x)|^2\,dx - \int_{\mathbb{R}^3}\frac{Z}{|x|}\,|\psi(x)|^2\,dx \,:\, \psi\in M,\ \|\psi\| = 1 \Big\} \qquad (7.1)$$
is finite and is given by
$$E_0 = -\frac{Z^2}{4},$$
and the function
$$\psi_0(x) = \frac{Z^{3/2}}{\sqrt{8\pi}}\,e^{-Z|x|/2} \qquad (7.2)$$
is the unique minimizer.
The key ingredient of the proof is a lemma that establishes the bound (5.1) for our potential (after the Schwarz inequality $ab \le \frac12(a^2+b^2)$ for the positive numbers $a = \|\nabla\psi\|$, $b = \|\psi\|$):

Lemma 7.2 Let $\psi\in M$. Then
$$\int \frac{|\psi(x)|^2}{|x|}\,dx \le \|\nabla\psi\|\,\|\psi\| \qquad (7.3)$$
and equality holds if and only if $\psi(x) = \mathrm{const.}\;e^{-c|x|}$ for some constant $c > 0$.
Remark. You may wonder how to figure out such an inequality. Recall that in (5.1) we wanted to control the negative part of the potential (in this case the entire potential) by the kinetic energy (plus the $L^2$-norm). Apart from the trivial constant $Z$, the left hand side of (7.3) is the potential energy. We want to bound it in terms of $\|\nabla\psi\|$ and $\|\psi\|$ – what kind of inequality has any chance to be correct at all?

Here is a "back-of-the-envelope" test that checks whether an inequality can be correct. It does not prove the inequality, but if an inequality fails this test, it is surely wrong. The idea is to test how the inequality behaves under two different rescalings.

First, notice that if $\psi$ is replaced by $\lambda\psi$ with some $\lambda > 0$, then both sides of (7.3) change by a factor $\lambda^2$. The two sides must scale in the same way with $\lambda$, otherwise the inequality cannot be correct.

Second, replace $\psi(x)$ with $\psi(x/\lambda)$. After a change of variables, the left hand side scales as
$$\int \frac{|\psi(x/\lambda)|^2}{|x|}\,dx = \lambda^{-1}\lambda^3\int\frac{|\psi(y)|^2}{|y|}\,dy.$$
Here $\lambda^3$ comes from the change of variables $y = x/\lambda$, and the extra $\lambda^{-1}$ comes from the denominator. On the right hand side we find
$$\Big(\int|\nabla[\psi(x/\lambda)]|^2\,dx\Big)^{1/2}\Big(\int|\psi(x/\lambda)|^2\,dx\Big)^{1/2} = \lambda^{-1}\lambda^3\Big(\int|\nabla\psi(y)|^2\,dy\Big)^{1/2}\Big(\int|\psi(y)|^2\,dy\Big)^{1/2},$$
where, again, $\lambda^3$ comes from the trivial volume factors and $\lambda^{-1}$ comes from the derivative. Again we see that the two sides scale in the same way, which is a necessary condition for the inequality to hold.

Actually, it is easy to see that no other combination of the form $\|\nabla\psi\|^\alpha\|\psi\|^\beta$ passes both tests apart from (7.3) with $\alpha = \beta = 1$.
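The scaling test (and the inequality (7.3) itself) can be verified numerically for a family of dilated Gaussians (an illustration, not part of the notes; the radial quadrature grid below is an arbitrary choice):

```python
import numpy as np

dr = 6e-5
r = (np.arange(200000) + 0.5) * dr          # radial grid on (0, 12), midpoint rule
w = 4*np.pi * r**2 * dr                     # 3D volume element weights

def both_sides(lam):
    """LHS and RHS of (7.3) for psi(x) = exp(-(|x|/lam)^2)."""
    psi  = np.exp(-(r/lam)**2)
    dpsi = -2*r/lam**2 * np.exp(-(r/lam)**2)   # |grad psi| = |psi'(r)| for radial psi
    lhs = np.sum(w * psi**2 / r)               # integral of |psi|^2 / |x|
    rhs = np.sqrt(np.sum(w * dpsi**2)) * np.sqrt(np.sum(w * psi**2))
    return lhs, rhs

l1, r1 = both_sides(1.0)
l2, r2 = both_sides(2.0)
assert l1 < r1 and l2 < r2                                # (7.3) holds (strictly: not e^{-c|x|})
assert abs(l2/l1 - 4) < 1e-3 and abs(r2/r1 - 4) < 1e-3    # both sides scale as lam^2
```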
Proof of Lemma 7.2. By a standard density argument it is sufficient to prove the inequality for all $\psi\in C_0^\infty$, i.e. for all smooth, compactly supported functions – see the remark at the end of this section.

Using integration by parts and the identity $\sum_{j=1}^3 \partial_{x_j}\frac{x_j}{|x|} = \frac{2}{|x|}$, compute
$$2\Big\langle \psi, \frac{1}{|x|}\psi\Big\rangle = \sum_{j=1}^3 \Big\langle \psi, \Big[\partial_{x_j}, \frac{x_j}{|x|}\Big]\psi\Big\rangle = -\sum_j\Big( \Big\langle \partial_{x_j}\psi, \frac{x_j}{|x|}\psi\Big\rangle + \Big\langle \frac{x_j}{|x|}\psi, \partial_{x_j}\psi\Big\rangle\Big) \le 2\sum_j \Big|\Big\langle \partial_{x_j}\psi, \frac{x_j}{|x|}\psi\Big\rangle\Big|,$$
where $[A,B] = AB - BA$ is the commutator. By the Schwarz inequality,
$$\Big|\Big\langle \partial_{x_j}\psi, \frac{x_j}{|x|}\psi\Big\rangle\Big| \le \|\partial_{x_j}\psi\|\,\Big\|\frac{x_j}{|x|}\psi\Big\|. \qquad (7.4)$$
After summing up, and using another Schwarz inequality for the sum (together with $\sum_j (x_j/|x|)^2 = 1$), we have
$$\sum_j \Big|\Big\langle \partial_{x_j}\psi, \frac{x_j}{|x|}\psi\Big\rangle\Big| \le \|\nabla\psi\|\,\|\psi\|.$$
Thus the proof of (7.3) is complete.

Exercise 7.3 Prove the case of equality stated in the lemma.
Proof of Theorem 7.1. Using (7.3) and $\|\psi\| = 1$, we have
$$\int|\nabla\psi|^2 - Z\int\frac{|\psi|^2}{|x|} \;\ge\; \|\nabla\psi\|^2 - Z\|\nabla\psi\| \;=\; \Big(\|\nabla\psi\| - \frac{Z}{2}\Big)^2 - \frac{Z^2}{4}$$
after completing the square. Taking the infimum over all $\psi\in M$, we obtain
$$E_0 \ge -\frac{Z^2}{4},$$
and it is easy to check that among the functions $\psi_0(x) = \mathrm{const.}\;e^{-c|x|}$ equality is achieved only for $c = Z/2$. The prefactor in (7.2) comes from the normalization. (CHECK!)
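The variational statement can also be checked numerically: among the trial functions $e^{-c|x|}$ the energy is $c^2 - Zc$, minimized at $c = Z/2$ with value $-Z^2/4$. A sketch (radial quadrature; the grid parameters and the value $Z = 3$ are arbitrary illustrations):

```python
import numpy as np

Z = 3.0
dr = 2e-4
r = (np.arange(200000) + 0.5) * dr          # radial grid on (0, 40), midpoint rule
w = 4*np.pi * r**2 * dr                     # 3D volume element weights

def energy(c):
    """E_psi of (7.1) for the normalized trial function psi(x) = e^{-c|x|}."""
    psi  = np.exp(-c*r)
    dpsi = -c*np.exp(-c*r)                  # |grad psi| = |psi'(r)| for radial psi
    T = np.sum(w * dpsi**2)
    V = -Z * np.sum(w * psi**2 / r)
    return (T + V) / np.sum(w * psi**2)

cs = np.linspace(0.2, 4.0, 96)
vals = [energy(c) for c in cs]
best = min(vals)
assert abs(best - (-Z**2 / 4)) < 1e-3               # minimum is -Z^2/4 ...
assert abs(cs[int(np.argmin(vals))] - Z/2) < 0.05   # ... attained near c = Z/2
```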
You may find the proof of Lemma 7.2 very special: the commutator trick works only for the Coulomb potential. What if we consider a different potential? This will lead us to Sobolev inequalities, which we will discuss later. One of them is the following lower bound on the $L^2$-norm of the gradient in $d=3$ dimensions: there exists a universal constant $C$ such that
$$\|\nabla\psi\| \ge C\|\psi\|_6.$$
This inequality can be used to prove the bound (5.1), since, by Hölder's inequality (CHECK the exponents!)
$$\int [V]_-\,|\psi|^2 \le \|\psi\|_6^2\,\|[V]_-\|_{3/2},$$
and thus, combining this with the Sobolev inequality, we have
$$\int [V]_-\,|\psi|^2 \le \|\nabla\psi\|^2 = T_\psi$$
as long as $\|[V]_-\|_{3/2} \le C^2$. If this latter condition is not satisfied (e.g. for the Coulomb potential, where $[V]_- = |x|^{-1}\notin L^{3/2}$), one can still use this idea with a small twist. Suppose that one can write
$$[V]_- = V_1 + V_2$$
where $V_2\in L^\infty$ and $V_1\in L^{3/2}$ with $\|V_1\|_{3/2} \le C^2$. Then
$$\int[V]_-\,|\psi|^2 = \int V_1|\psi|^2 + \int V_2|\psi|^2 \le \|\nabla\psi\|^2 + \|V_2\|_\infty\|\psi\|^2 = T_\psi + \|V_2\|_\infty\|\psi\|^2,$$
i.e. (5.1) is still satisfied with $K = \|V_2\|_\infty$.
VERY IMPORTANT REMARK: We will often use the idea that to prove an inequality like (7.3) for all functions for which both sides are finite, it is sufficient to prove the inequality for "nice" functions, most often for $C_0^\infty$ functions. In the mathematics literature this fact is referred to as a "standard density argument", and one usually does not waste more time on it. We will do the same, but once and for all I want to show on this example how it goes.

We have to know that $C_0^\infty$ is dense in the $H^1$-norm, i.e. in the $L^2$ norm and in the $L^2$ norm of the gradient:
$$\|\psi\|_{H^1}^2 := \|\psi\|^2 + \|\nabla\psi\|^2.$$
This statement can be found in Theorem 7.6 of Lieb–Loss (the density in $L^2$ is in Theorem 2.16); later we will define $H^1$ precisely, for the moment: it is the space of all $L^2$-functions whose gradient is in $L^2$.

Armed with this information, the argument goes as follows: suppose that (7.3) is proven for all $C_0^\infty$ functions, and let $\psi\in M$ be an arbitrary function. From the density of $C_0^\infty$ in $H^1$ we can find an approximating sequence $\psi_n\in C_0^\infty$ such that
$$\|\psi - \psi_n\| \to 0, \qquad \|\nabla\psi - \nabla\psi_n\| \to 0 \qquad (7.5)$$
(in particular, $\|\nabla\psi_n\|\to\|\nabla\psi\|$ and $\|\psi_n\|\to\|\psi\|$), and we have
$$\int\frac{|\psi_n(x)|^2}{|x|}\,dx \le \|\nabla\psi_n\|\,\|\psi_n\|. \qquad (7.6)$$
Moreover, $\psi_n$ is a Cauchy sequence in the weighted $L^2$ space with measure $d\mu(x) = |x|^{-1}dx$, since
$$\int|\psi_n(x)-\psi_m(x)|^2\,d\mu(x) = \int\frac{|\psi_n(x)-\psi_m(x)|^2}{|x|}\,dx \le \|\nabla(\psi_n-\psi_m)\|\,\|\psi_n-\psi_m\| \to 0$$
as $n, m\to\infty$, where we used (7.3) for $C_0^\infty$ functions. Using the Riesz–Fischer theorem (completeness of $L^p$, Theorem 2.7 of Reed–Simon) and passing to a subsequence (which we continue to denote by $\psi_n$) we can assume that $\psi_n$ converges in $L^2(\mathbb{R}^3, d\mu)$ as well. The limit must be $\psi$ (WHY? – because a further subsequence converges pointwise almost everywhere, both in $L^2$ and in $L^2(\mathbb{R}^3, d\mu)$; this fact is also part of the Riesz–Fischer theorem, and a sequence of functions has only one pointwise limit).

Taking now the $n\to\infty$ limit on both sides of the inequality (7.6), we obtain that (7.3) holds for $\psi$ as well.
8 Stability of atoms and molecules
8.1 Stability of first kind
In the previous section we proved that the very special hydrogenic atom is stable. In fact, the same proof immediately shows that any atom or molecule, or even any extended matter with Coulomb interaction, is stable.

The state of $N$ particles is described by a wave function of $N$ variables, i.e. by an element of
$$L^2(\mathbb{R}^{3N}) = \Big\{ \psi(x_1, x_2, \dots, x_N) : \mathbb{R}^{3N}\to\mathbb{C} \,:\, \int |\psi(x_1, \dots, x_N)|^2\,dx_1\dots dx_N < \infty \Big\}.$$
Later we will see that not every function is allowed, only those with a certain symmetry type (depending on whether the particles are bosons or fermions), but for the moment we consider all functions.
The Hamilton operator is (formally) given by
$$H = \sum_{j=1}^N(-\Delta_{x_j}) + V_C(x, R)$$
where $x = (x_1, x_2, \dots, x_N)$ and $V_C$ was given in (1.8) (setting $e = 1$). This is the natural quantum analogue of (1.9) (in our chosen units), and we consider here the case of static nuclei. The Laplacian $\Delta_{x_j}$ acts only on the $j$-th variable, but it is still considered as an operator acting on functions $\psi$ of $N$ variables. Often we will write $\Delta_j$ for $\Delta_{x_j}$ and similarly $\nabla_j = \nabla_{x_j}$ for the gradient.

Thus the energy of this molecule in state $\psi$ is
$$E_\psi = \sum_{j=1}^N\int|\nabla_j\psi(x)|^2\,dx + \int V_C(x, R)\,|\psi(x)|^2\,dx \qquad (8.1)$$
where $dx = dx_1\,dx_2\dots dx_N$.
Theorem 8.1 (Stability of first kind for atoms and molecules) For a fixed location of the nuclei at $R = (R_1, \dots, R_K)$, consider the set
$$M_R := \Big\{ \psi:\mathbb{R}^{3N}\to\mathbb{C} \,:\, \sum_{j=1}^N\int|\nabla_j\psi(x)|^2\,dx < \infty, \ \sum_{k=1}^K\sum_{j=1}^N\int\frac{|\psi(x)|^2}{|x_j - R_k|}\,dx < \infty \Big\}.$$
The ground state energy of the molecule with $N$ electrons and $K$ nuclei fixed at locations $R$ is given by
$$E_0(R) = \inf\big\{ E_\psi \,:\, \psi\in M_R,\ \|\psi\| = 1\big\},$$
and the total ground state energy is given by
$$E_0 = \inf_R E_0(R),$$
i.e. the ground state energy minimized over all nuclear positions. Then $E_0 > -\infty$.
Proof. Since we are looking for a lower bound, we can drop the positive repulsion terms from $V_C$, so it is sufficient to show that there exists a finite constant $C$, independent of $R$, such that
$$\sum_{j=1}^N\int|\nabla_j\psi(x)|^2\,dx - \sum_{k=1}^K\sum_{j=1}^N\int\frac{Z_k|\psi(x)|^2}{|x_j - R_k|}\,dx \ge -C\|\psi\|^2. \qquad (8.2)$$
Let $Z = \max_k Z_k$. It is clearly sufficient to prove that there is a finite $C'$ such that for each fixed $j$ and fixed $k$
$$\frac{1}{K}\int|\nabla_j\psi(x)|^2\,dx - \int\frac{Z|\psi(x)|^2}{|x_j - R_k|}\,dx \ge -C'\|\psi\|^2, \qquad (8.3)$$
and then we just add up these inequalities (for all $j = 1, 2, \dots, N$, $k = 1, 2, \dots, K$) to get the result with $C = C'NK$. We can write
$$\frac{1}{K}\int|\nabla_j\psi(x)|^2\,dx - \int\frac{Z|\psi(x)|^2}{|x_j - R_k|}\,dx = \frac{1}{K}\int dx_1\dots \widehat{dx_j}\dots dx_N\; H(x_1, \dots, \widehat{x_j}, \dots, x_N) \qquad (8.4)$$
with
$$H(x_1, \dots, \widehat{x_j}, \dots, x_N) = \int dx_j\Big[ |\nabla_j\psi(x_1, \dots, x_j, \dots, x_N)|^2 - \frac{KZ}{|x_j - R_k|}\,|\psi(x_1, \dots, x_j, \dots, x_N)|^2\Big]$$
and with the convention that a hat means omission (i.e. in $\widehat{dx_j}$ the $dx_j$ integration is missing). Here we view all but the $x_j$ variable as fixed (as parameters) and consider the function
$$g(x_j) = g_{x_1, \dots, \widehat{x_j}, \dots, x_N}(x_j) = \psi(x_1, \dots, x_j, \dots, x_N)$$
as a function of one variable,
$$g : x_j \mapsto \psi(x_1, \dots, x_j, \dots, x_N),$$
parametrized by the others.
This is an $L^2$ function of $x_j$ for almost all choices of $(x_1, \dots, \widehat{x_j}, \dots, x_N)$, since
$$\int \|g_{x_1, \dots, \widehat{x_j}, \dots, x_N}\|^2\,dx_1\dots \widehat{dx_j}\dots dx_N = \int\Big[\int dx_j\,|\psi(x_1, \dots, x_j, \dots, x_N)|^2\Big]dx_1\dots \widehat{dx_j}\dots dx_N = \|\psi\|_2^2 = 1.$$
Thus
$$H(x_1, \dots, \widehat{x_j}, \dots, x_N) = \int_{\mathbb{R}^3}\Big[|\nabla g(x_j)|^2 - \frac{KZ}{|x_j - R_k|}\,|g(x_j)|^2\Big]dx_j \qquad (8.5)$$
where $g = g_{x_1, \dots, \widehat{x_j}, \dots, x_N}$. From the stability of the hydrogenic atom we know that
$$H(x_1, \dots, \widehat{x_j}, \dots, x_N) \ge -\frac{1}{4}(KZ)^2\,\|g_{x_1, \dots, \widehat{x_j}, \dots, x_N}\|^2$$
(the fact that the nucleus is at $R_k$ and not at the origin does not change anything in the estimate, since the kinetic energy and the $L^2$ norm are both translation invariant). Plugging this estimate into (8.4), we obtain (8.3) with $C' = \frac14 Z^2K$, so we have the lower bound (8.2) with $C = \frac14 Z^2NK^2$.
This proof was presented for static nuclei. If the nuclei were dynamical as well, the proof would be even easier, since then there are additional positive kinetic energy terms in the Hamiltonian.
8.2 Stability of second kind, preliminary estimates
We have followed the dependence of the constant $C = \frac14 Z^2NK^2$ on the parameters, especially on the total particle number $N + K$. In a typical molecule or in extended matter, $N$ is close to $K$ to ensure (almost) electrostatic neutrality. Thus our lower bound is cubic in the total number of particles. This gives stability (of the first kind), but it is not completely satisfactory, since we would like to show that the ground state energy is at most proportional to the number of particles, i.e. we would like a bound that is linear in $N + K$.
Definition 8.1 We say that a Coulombic system consisting of $N$ electrons and $K$ nuclei satisfies the stability of the second kind if there is a constant $C = C_Z$, which may depend on the maximal charge of the nuclei, $Z = \max_k Z_k$, but is independent of the total number of particles, such that
$$E_\psi \ge -C_Z(N + K).$$
We will see that extended matter with Coulomb interaction actually satisfies this stronger stability criterion, but we will have to restrict the set of admissible wave functions in $M$ to antisymmetric ones. Physically this reflects the fact that the electrons are fermions; it is also called the Pauli principle. The nuclei can be of arbitrary particle type.

There will be three ingredients in such a proof:

(i) Coulomb singularities can be controlled by the kinetic energy;

(ii) electrostatic screening;

(iii) the Pauli principle.

So far we have seen (i); this is essentially the key inequality in Lemma 7.2. Electrostatic screening is a special property of Coulomb systems: very roughly, it expresses the fact that if we have a collection of particles (with both negative and positive charges) in a bounded domain, then far away the electrostatic potential generated by these particles looks as if all charges were concentrated at one point. In particular, there is a strong cancellation. Finally, the Pauli principle will strengthen the one particle Lemma 7.2 and lead to the Lieb–Thirring inequalities, which replace the Sobolev inequalities in the fermionic setup.
Why is the Pauli principle important? The following explanation should give a first (non-rigorous) insight. Suppose we have $N$ electrons and one single nucleus, say at the origin, with nuclear charge $Z$. Assume that we neglect all interactions among the electrons (these are positive, so for a lower bound we are allowed to neglect them). The Hamiltonian is
$$H = \sum_{j=1}^N\Big(-\Delta_j - \frac{Z}{|x_j|}\Big), \qquad (8.6)$$
or, in quadratic form,
$$E_\psi = \sum_{j=1}^N\int\Big[|\nabla_j\psi(x)|^2 - \frac{Z|\psi(x)|^2}{|x_j|}\Big]dx. \qquad (8.7)$$
What is the minimal energy? For one electron the energy would be $-Z^2/4$, with minimizing function $\psi_0(x) = (\mathrm{const})\,e^{-Z|x|/2}$ (see Theorem 7.1). If the electrons do not interact, then the second electron can occupy the same state as well, etc., so
$$\psi(x) = \psi(x_1, x_2, \dots, x_N) = \psi_0(x_1)\psi_0(x_2)\cdots\psi_0(x_N)$$
would be a natural guess (note that the single particle functions $\psi_0(x_j)$ should be multiplied and not added). It is easy to compute (EXERCISE) that the energy of this state is
$$E_\psi = -\frac{NZ^2}{4},$$
i.e. the energy is additive.
Exercise 8.2 Extend this argument to show rigorously that the atom with one nucleus of
charge Z always satisfies the stability of second kind.
The problem is that typical electrostatic matter has several nuclei. What kind of nuclear configuration yields the lowest energy? If we neglect the nucleus–nucleus repulsion as well (a very crude assumption), then nothing forbids the $K$ nuclei with charges $Z_1, Z_2, \dots, Z_K$ from piling up on top of each other, forming a nucleus with total charge $Z_1 + \dots + Z_K$. It can be proven that the total pile-up is indeed the lowest energy configuration, and the minimal energy is
$$E_0 = -\frac{(Z_1 + \dots + Z_K)^2 N}{4} \ge -\frac{1}{4}Z^2NK^2,$$
where $Z = \max_k Z_k$; this bound is exactly of the same form as the one we got in Theorem 8.1. To convince yourself, just check that the energy of a total pile-up is lower than if the nuclei are concentrated at, say, two different centers that are very far apart. Let $A$ be the index set of the nuclei around one center; then the energy is [WHY??]
$$E(A) = \min_{0\le N_1\le N}\Bigg[ -\frac{N_1\big(\sum_{j\in A}Z_j\big)^2}{4} - \frac{(N - N_1)\big(\sum_{j\notin A}Z_j\big)^2}{4}\Bigg].$$
Check [!!] that
$$E(A) \ge -\frac{(Z_1 + \dots + Z_K)^2 N}{4}$$
and equality holds if and only if $A = \emptyset$ or $A = \{1, 2, \dots, K\}$, i.e. there is only one center.
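The Check [!!] above can be verified by brute force for small examples (an illustration, not part of the notes; the charges and electron number below are arbitrary):

```python
from itertools import combinations

Zs = [1.0, 2.0, 3.0, 1.5]            # illustrative nuclear charges Z_1..Z_K
N  = 7                               # number of electrons
K  = len(Zs)
total = sum(Zs)

def E(A):
    """E(A) for the index set A of nuclei at the first center."""
    ZA = sum(Zs[i] for i in A)
    ZB = total - ZA
    return min(-(N1 * ZA**2 + (N - N1) * ZB**2) / 4 for N1 in range(N + 1))

pileup = -(total**2) * N / 4         # energy of the total pile-up

for size in range(K + 1):
    for A in combinations(range(K), size):
        assert E(A) >= pileup                      # pile-up is lowest
        if 0 < size < K:
            assert E(A) > pileup                   # strict for two genuine centers
assert E(()) == pileup                             # equality for a single center
```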
One may argue that we neglected the electrostatic repulsion among the nuclei (and also among the electrons), and this forbids putting all nuclei on top of each other. This is true, but correct electrostatics alone does not solve the problem: we will later prove that taking all electrostatics into account the ground state energy is of order $-C_Z(N+K)^{5/3}$. This is an improvement over the cubic behavior, but it is still not linear.
To achieve a lower bound that is linear in $N + K$, one additionally needs the Pauli principle. Very roughly, the Pauli principle forbids two electrons from occupying the same state: if the first electron is in the state $\psi_0(x_1)$, then the second one cannot be in $\psi_0(x_2)$, i.e. the function $\psi_0(x_1)\psi_0(x_2)$ is forbidden as a two-electron wavefunction.
More precisely, the Pauli principle will imply that if one electron is in the state $\psi_0$, then the second one must be in a state orthogonal to $\psi_0$, e.g. its wavefunction is $\psi_0(x_1)\psi_1(x_2)$ with $\psi_1\perp\psi_0$. Actually, the precise definition will require that the two particle wave function $\psi(x_1, x_2)$ be antisymmetric, i.e. $\psi(x_1, x_2) = -\psi(x_2, x_1)$, so the product $\psi_0(x_1)\psi_1(x_2)$ is still not correct, but the antisymmetrized product
$$(\psi_0\wedge\psi_1)(x_1, x_2) = \frac{1}{\sqrt2}\Big[\psi_0(x_1)\psi_1(x_2) - \psi_0(x_2)\psi_1(x_1)\Big]$$
will do the job (the $1/\sqrt2$ is the correct normalization to make $\psi_0\wedge\psi_1$ have norm one, CHECK!). If we choose $\psi_1$ to be the eigenfunction of the second lowest eigenvalue of the Hamiltonian $-\Delta - Z/|x|$, then $\psi_1\perp\psi_0$ (think of the orthogonality of eigenvectors of a hermitian matrix with different eigenvalues). The energy (8.7) of the function $\psi = \psi_0\wedge\psi_1$ will be the sum of the two lowest eigenvalues.
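Both the normalization CHECK and the eigenvalue-sum claim can be tested on a discretized toy model, where a random hermitian matrix stands in for the one-body Hamiltonian $-\Delta - Z/|x|$ (an illustration, not part of the notes):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6
a = rng.standard_normal((n, n))
h = (a + a.T) / 2                    # "one-body Hamiltonian" on C^n
E, V = np.linalg.eigh(h)             # eigenvalues in increasing order
psi0, psi1 = V[:, 0], V[:, 1]        # two lowest eigenvectors

# Antisymmetrized product (psi0 ∧ psi1)(x1, x2), stored as an n x n array:
wedge = (np.outer(psi0, psi1) - np.outer(psi1, psi0)) / np.sqrt(2)

assert np.isclose(np.sum(wedge**2), 1.0)       # norm one, as claimed
assert np.allclose(wedge, -wedge.T)            # antisymmetry

# Two-body Hamiltonian H = h ⊗ I + I ⊗ h acting on the array:
Hwedge = h @ wedge + wedge @ h.T
energy = np.sum(wedge * Hwedge)                # <psi, H psi> (real entries)
assert np.isclose(energy, E[0] + E[1])         # sum of the two lowest eigenvalues
```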
Exercise 8.3 Check this statement on the formal level, i.e. assuming that
$$\Big(-\Delta - \frac{Z}{|x|}\Big)\psi_j = E_j\psi_j, \qquad j = 0, 1,$$
and $E_0\neq E_1$, show that
$$\langle\psi, H\psi\rangle = E_0 + E_1,$$
where $\psi = \psi_0\wedge\psi_1$ and $H$ is given in (8.6). You can use the formal self-adjointness of $H$.
For $N$ electrons, the Pauli principle will imply that the lowest energy state of (8.6) is the antisymmetric product $\psi = \psi_0\wedge\psi_1\wedge\dots\wedge\psi_{N-1}$ of the $N$ eigenfunctions corresponding to the $N$ lowest eigenvalues (with multiplicity), i.e. to $E_0\le E_1\le E_2\le\dots$. The energy of $\psi$ is $E_0 + E_1 + \dots + E_{N-1}$.

Exercise 8.4 Reviewing the degeneracy structure of the Hydrogen eigenvalues (with nuclear charge $Z$), compute how the sum $E_0 + E_1 + \dots + E_{N-1}$ behaves in $N$ for large $N$. (Answer: $-CZ^2N^{1/3}$.)
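The asymptotics of Exercise 8.4 can be checked numerically, assuming (as will be discussed later in the notes; treat it as an assumption here) the hydrogen spectrum $E_n = -Z^2/(4n^2)$ with degeneracy $n^2$ in the units of (6.3), spin ignored. Each full shell then contributes $-Z^2/4$, and $N$ electrons fill about $(3N)^{1/3}$ shells:

```python
Z = 1.0

def lowest_sum(N):
    """Sum of the N lowest eigenvalues E_n = -Z^2/(4 n^2), degeneracy n^2."""
    total, count, n = 0.0, 0, 1
    while count < N:
        take = min(n * n, N - count)       # fill the n-th shell (degeneracy n^2)
        total += take * (-Z**2 / (4 * n * n))
        count += take
        n += 1
    return total

# Each full shell contributes n^2 * (-Z^2/(4n^2)) = -Z^2/4, and sum n^2 ~ M^3/3,
# so the sum behaves like -(Z^2/4) (3N)^{1/3}:
for N in [10**4, 10**6]:
    ratio = lowest_sum(N) / (-(Z**2 / 4) * (3 * N)**(1/3))
    assert abs(ratio - 1) < 0.05
```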
Imagining again that all nuclei are concentrated at the origin (i.e. neglecting the nucleus–nucleus interaction) and also neglecting the electron–electron interaction, but taking into account the Pauli principle, we see that the ground state energy is of order $-CZ^2K^2N^{1/3}$.
Summarizing: If we neglect repulsion (i.e. the electrostatics is wrong) and neglect the Pauli principle, then the ground state energy is $-C_Z(N+K)^3$. If we neglect the Pauli principle but take into account the proper electrostatics, then the ground state energy is of order $-C_Z(N+K)^{5/3}$, i.e. electrostatics improves the power by $4/3$. Finally, if we take into account the Pauli principle but neglect electrostatics, the ground state energy is $-C_Z(N+K)^{7/3}$ (always expressed in terms of the total number of particles, $N + K$), i.e. the Pauli principle improves the power by $2/3$. The goal will be to show that if we take both effects into account, then the bound is linear in $N + K$, i.e. the original cubic power is improved by $4/3 + 2/3 = 2$.