Basic concepts in quantum mechanics
László Erdős∗
Nov 18, 2008
The emergence of quantum physics in the mid 1920’s was a fundamental change, probably the most important one in the long history of physics. Moreover, it was so strikingly new and different from anything known before that several of the best scientific minds doubted its validity. There are many ways to explain why we should believe in it, and “foundations of quantum mechanics” has become a subject in itself for those who represent the fundamentalist’s point of view. The more pragmatically minded mainstream approach, however, starts from a few basic axioms and focuses on the results one can obtain from the theory.
The only difference between the axioms of quantum mechanics and the standard axioms in mathematics (say, the axioms of elementary geometry [Euclid], the axioms of set theory [Zermelo-Fraenkel] or the axioms of integer arithmetic [Peano]) is that they are not at all obvious at first sight, and not even at second or third sight... So their justification is more indirect, but a powerful one: they work. We will follow the most pragmatic point of view: however crazy its axioms sound, quantum mechanics, as a matter of fact, has correctly predicted essentially every experiment in an enormous energy range. Quantum mechanical principles seem to be valid from subatomic physics to astrophysics. They correctly account for many phenomena on large and small scales that no other theory could tackle. A pretty rewarding prize for accepting some axioms that may sound unbelievable at the beginning...
In this course we will do quantum mechanics, which is the most basic part of quantum physics. Quantum physics includes many other disciplines, such as quantum field theory, quantum statistical mechanics, quantum gravity etc., but they all originate in quantum mechanics, just as all of classical physics (e.g. thermodynamics, fluid dynamics etc.) originates in classical mechanics. The main goal of quantum mechanics is to describe the motion of quantum particles.
∗ Parts of these notes were prepared using the web notes by Michael Loss: “Stability of Matter” and the draft of a forthcoming book by Elliott Lieb and Robert Seiringer: “The Stability of Matter in Quantum Mechanics”. I am grateful to the authors for making a draft version of the book available to me before publication.
1 Classical mechanics

1.1 Phase space
The basic object in classical mechanics is a massive point particle. It has two intrinsic and
permanent characteristics: its mass m and its charge q. The mass is always positive. The
state of the particle can be described by its position (location) in the d-dimensional Euclidean
space, x ∈ Rd , and by its momentum p ∈ Rd . Unless we say otherwise, we will always consider
d = 3. The space of the possible x’s is called configuration space or position space, the space
of possible momenta is momentum space. Depending on the physical situation, they may be
restricted to a subset of Rd (e.g. if the particle is confined to a container, Ω ⊂ Rd , then
x ∈ Ω). The product space of the configuration space and momentum space, Rd × Rd (or its
natural subspace, e.g. Ω × Rd ) is called the phase space, the pairs (x, p), describing a possible
position and momentum of a particle, are called phase space points.
It is a deep fact of Nature that the phase space point determines the state of the particle, i.e. knowing its position and momentum (two d-dimensional vectors) is sufficient to describe all its future; in other words, all that the particle “remembers” from its past, before a fixed time t0, is given via the position and momentum at time t0. Actually there are two different statements combined in this sentence. One is that exactly two vectors (points) determine everything; the other is that it is sufficient to know these two vectors at a fixed time and we can forget about the whole past. Both of these somewhat surprising facts are consequences of Newton’s equation
m d²x/dt² = F

where F = F(x, p) is the (instantaneous) force acting on the particle; the force may depend on the phase space point. The fact that Newton’s equation is a differential equation is equivalent to the fact that the past influences the future through the state at present. The fact that Newton’s equation is of second order postulates that only two quantities are sufficient, there is no need for more, because a second order equation needs two initial data to have a unique solution. Traditional Newtonian kinematics considers position and velocity as these two quantities (velocity being defined as the time derivative of position, ẋ = dx/dt); it turns out that momentum is a more canonical second quantity than velocity.
More generally, we want to describe N massive point particles in Rd or in Ω ⊂ Rd . We
label the particles by 1, 2, . . . , N. Each particle has a position and a momentum, indexed by
the particle label. The locations of the particles are given by x = (x1, x2, . . . , xN) ∈ RdN and the momenta by p = (p1, p2, . . . , pN) ∈ RdN.
1.2 Hamiltonian
The state of the system may change with time, and at time t it is described by a time-dependent phase space point (x(t), p(t)). The time is always a real variable. The dynamics (time evolution) of the system is described by the energy function or Hamilton function or Hamiltonian of the system. The Hamiltonian is a real function defined on the phase space

H : Rd × Rd → R    (1.1)
Its value H(x, p) represents the energy of the physical system in state (x, p). A basic axiom
of classical mechanics is that the Hamilton function determines the time evolution via the
Hamiltonian equations of motion
ẋ = ∇p H(x, p),    ṗ = −∇x H(x, p)    (1.2)
(dot denotes time derivative). Being a system of first order ordinary differential equations
(ODE’s), the equations (1.2), under some mild regularity condition on H, fully determine the whole future (and also past) trajectory (x(t), p(t)) once an initial datum is given, i.e. once the state of the system (x(t0), p(t0)) is known at some time t0. In other words, the Hamiltonian
function comprises all the physical laws relevant for the system.
One important property of the Hamilton equations of motion is that the Hamiltonian
(energy) is conserved with time
d/dt H(x(t), p(t)) = ∇x H · ẋ + ∇p H · ṗ = ∇x H · ∇p H − ∇p H · ∇x H = 0
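This conservation is easy to watch numerically. The following sketch (my own illustration, not from the notes) integrates the Hamiltonian equations (1.2) for a one-dimensional harmonic oscillator H = p²/(2m) + kx²/2 with a leapfrog scheme; the mass, spring constant and step size are arbitrary choices:

```python
def hamilton_flow(grad_x_H, grad_p_H, x0, p0, dt, steps):
    """Integrate x' = grad_p H, p' = -grad_x H with the (symplectic) leapfrog scheme."""
    x, p = float(x0), float(p0)
    traj = [(x, p)]
    for _ in range(steps):
        p -= 0.5 * dt * grad_x_H(x)   # half kick
        x += dt * grad_p_H(p)         # drift
        p -= 0.5 * dt * grad_x_H(x)   # half kick
        traj.append((x, p))
    return traj

# Toy Hamiltonian: harmonic oscillator H(x, p) = p^2/(2m) + k x^2/2 with m = k = 1
m, k = 1.0, 1.0
H = lambda x, p: p**2 / (2 * m) + 0.5 * k * x**2
traj = hamilton_flow(lambda x: k * x, lambda p: p / m, x0=1.0, p0=0.0, dt=0.01, steps=1000)

E0 = H(*traj[0])
drift = max(abs(H(x, p) - E0) for x, p in traj)
print(f"initial energy {E0:.6f}, max energy drift {drift:.2e}")
```

The energy drift stays tiny over the whole trajectory, reflecting the exact conservation law above (and the symplecticity of the leapfrog scheme).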
When we build a physical model, we usually give its Hamiltonian. The standard Hamiltonian of a single particle in classical mechanics (without magnetic fields) has the form
H = p²/(2m) + U(x)    (1.3)
where m > 0 is the mass of the particle and U(x) a real valued function, the potential. The
first term represents the kinetic energy of the particle, the potential describes the interaction
with an (unspecified) environment (e.g. container).
For N particles, we have

H = ∑_{j=1}^{N} pj²/(2mj) + U(x)    (1.4)
where mj is the mass of the j-th particle and U(x) is the potential. The term pj²/(2mj) represents the kinetic energy of the j-th particle; the potential describes the interactions both among the particles and with a possible environment.
In many cases, the potential function simplifies into a one-particle and a two-particle part:

U(x) = ∑_j Vj(xj) + ∑_{j<k} Wjk(xj − xk).    (1.5)
The first term is called background potential, the second one is the interaction potential. Note
that the interaction is assumed to be translationally invariant – not a necessity, but a condition
that is satisfied in most cases.
The Hamiltonian (1.4), especially with the choice (1.5), may look a bit ad hoc; in particular, one may note the asymmetric roles of the momenta and the positions. It is, however, a fact of life that the two most “visible” interactions in real life, gravity and electrostatics, depend only on the position and not on the momentum of the particles; thus the momenta are not directly coupled.
If magnetic fields are present, then the kinetic energy of the jth particle is modified to
(1/(2mj)) (pj − (qj/c) A(xj))²    (1.6)

where qj is the charge of the particle, c ≈ 300,000 km/sec is the speed of light and A : Rd → Rd is a vector field, representing the magnetic vector potential in such a way that the magnetic field
B is given by
B = ∇ × A = curl A
(in dimension d = 3). We remark that the magnetic field (and quantities derived from it, like
the flux, which is the integral of A over closed loops) is the physically measurable quantity,
the vector potential is not directly measurable. Maxwell’s equation dictates that any physical
magnetic field is divergence free, ∇ · B = 0, thus it can be written as a curl. Notice that in the presence of a magnetic field, the velocity of the jth particle is

vj = ẋj = ∇pj H = (1/mj) (pj − (qj/c) A(xj)).
The formula (1.6) identifies the Lorentz force acting on the jth particle:

Fj = (qj/c) vj × B

Exercise 1.1 Check this formula for the Lorentz force from Newton’s law, from the Hamilton equations and from the identity ∇A(v · A) − (v · ∇)A = v × (∇ × A) from vector calculus [where the first gradient acts only on A].
Using the explicit form of H, we can write the equations of motion as
ẋj(t) = pj(t)/mj,    ṗj(t) = −∇xj U(x) = −∇Vj(xj) − ∑_{k≠j} ∇Wjk(xj − xk)
The first equation just says that the velocity (defined as the time derivative of x) is the
momentum divided by the mass, the second equation is Newton’s equation if the negative
gradient of the potential is interpreted as the force.
We remark that we presented the Hamilton formalism of classical mechanics. This formalism uses the assumption that there is an absolute concept of time. In relativistic systems such
an assumption cannot hold, and a more general formalism, the Lagrangian formalism, has
been developed. The two formalisms are equivalent if time is absolute. While the Lagrangian
formalism is more general, its quantized version is much harder to define in a mathematically
rigorous way, although it is necessary for doing e.g. relativistic quantum field theory. In this
course we will consider only non-relativistic quantum systems, so we will use the Hamilton
formalism and enjoy its advantages.
1.3 Coulomb systems
The most basic objects of study in quantum mechanics are massive, charged point particles
interacting with electrostatic forces. This means that the Hamiltonian (1.4) holds (if no
magnetic fields are present) with a potential (1.5), where the interaction between the jth and
kth particle is the Coulomb potential
Wjk(xj − xk) = qj qk / |xj − xk|
Note that the potential is negative for opposite charges and it is zero for infinitely distant particles. If a potential goes to zero at infinity, then a negative potential is also called attractive, and a positive potential repulsive. Note that the physics (equations of motion) is insensitive to
adding an overall constant to the potential U. Using this freedom we will always (implicitly)
assume that the potential goes to zero at infinity, whenever this is possible (i.e. whenever
limx→∞ U(x) exists and is finite). We use the same implicit convention for all constituents of
the potential, i.e. for Vj and Wjk as well.
The background potential originates from a fixed background charge distribution ̺(x), i.e.
it is also of Coulomb type:
Vj(x) = qj ∫_{Rd} ̺(y)/|x − y| dy = qj (| · |⁻¹ ⋆ ̺)(x)
Here the star denotes the convolution; in general

(f ⋆ g)(x) = ∫_{Rd} f(x − y) g(y) dy = ∫_{Rd} f(y) g(x − y) dy
In most cases ̺ is a sum of Dirac delta masses

̺(x) = ∑_{k=1}^{K} Qk δ(x − Rk)
representing fixed point charges Qk sitting at the points Rk. In this case

Vj(x) = ∑_{k=1}^{K} qj Qk / |x − Rk|
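In code, this background potential is a one-line sum; the sketch below (a hypothetical helper of my own, in units where the Coulomb constant is 1) evaluates Vj(x) for point charges Qk at positions Rk:

```python
import numpy as np

def background_potential(x, qj, charges, positions):
    """V_j(x) = sum_k qj * Qk / |x - Rk| for fixed point charges Qk at Rk."""
    return sum(qj * Qk / np.linalg.norm(x - Rk)
               for Qk, Rk in zip(charges, positions))

# Two unit charges at distance 1 on either side of the origin:
Q = [1.0, 1.0]
R = [np.array([1.0, 0.0, 0.0]), np.array([-1.0, 0.0, 0.0])]
print(background_potential(np.zeros(3), qj=1.0, charges=Q, positions=R))  # 2.0
```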
The simplest possible model is one single point charge moving in a zero potential field. It
is called the free particle and its Hamiltonian is
H = p²/(2m).
With initial position xin and momentum pin at time t0, the equations of motion

ẋ = p/m,    ṗ = 0

have the trivial solution

x(t) = xin + (pin/m)(t − t0),    p(t) = pin
If several free particles are moving in a zero potential field, then

H = ∑_{j=1}^{N} pj²/(2mj)

and each particle follows its own trajectory

xj(t) = xj^in + (pj^in/mj)(t − t0),    pj(t) = pj^in
without ever noticing each other.
The next simplest model is a single charged particle, with mass m and charge q, moving
in the background of another particle with charge Q that is considered fixed at R ∈ Rd . The
Hamiltonian is

H = p²/(2m) + qQ/|x − R|
Notice that R is considered as a parameter, i.e. it is not a dynamical variable. By shifting the
origin, we can assume R = 0. If the single charged particle is an electron with charge q = −e,
and the fixed particle is a nucleus with proton number Z, i.e. with charge Q = Ze, then
H = p²/(2m) − Ze²/|x|    (1.7)
This is the Hamiltonian of a hydrogenic atom; if Z = 1 then it is exactly the Hydrogen atom.
In full generality, we can consider K nuclei with charges Qk = Zk e and masses Mk ,
k = 1, 2, . . . , K, located at positions R = (R1 , R2 , . . . RK ), and N electrons, each with charge
q = −e and mass m at locations x1 , . . . xN . Then the potential of the N electrons and K
nuclei is
VC(x, R) = − ∑_{j=1}^{N} ∑_{k=1}^{K} Zk e²/|xj − Rk| + ∑_{k<ℓ} Zk Zℓ e²/|Rk − Rℓ| + ∑_{j<ℓ} e²/|xj − xℓ|    (1.8)
The first term represents the attraction between the electrons and the nuclei, the second
term is the nuclei-nuclei repulsion while the last term is the electron-electron repulsion. The
attractive terms are negative, the repulsive ones are positive.
The full Hamiltonian of the N electrons is
H = ∑_{j=1}^{N} pj²/(2m) + VC(x, R)    (1.9)
if the nuclei are considered fixed. In this case their positions are parameters, and H is defined
on the phase space of the N electrons, i.e. on RdN × RdN .
If we consider the nuclei dynamical as well, we need to introduce their momentum variables,
call them (P1 , P2 , . . . , PK ). The Hamiltonian of the N electrons and K nuclei thus is given by
H = ∑_{j=1}^{N} pj²/(2m) + ∑_{k=1}^{K} Pk²/(2Mk) + VC(x, R).    (1.10)
This formula is the complete Hamiltonian of a molecule consisting of N electrons and K nuclei. Since in reality Mk ≫ m (the mass of the proton is about 1800 times the mass of the electron), we often consider the simplified model (1.9) where the nuclei are considered so heavy (formally Mk = ∞) that they are treated as static particles.
One key feature of all these Coulombic Hamiltonians is that the range of the Hamiltonian function H is the whole of R; in particular, arbitrarily negative energies can be achieved. Assuming some radiation mechanism that is able to suck energy out of the system, the energy of a Hydrogen atom (1.7) can be driven arbitrarily negative, just by placing the electron closer and closer to the attractive nucleus. Thus the Hydrogen atom would not be stable; the electron would collapse into the nucleus. Moreover, it could release an infinite amount of energy – this is clearly unphysical.
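A two-line numerical check makes the unboundedness of (1.7) tangible (my own toy calculation, in units with m = e = Z = 1): at zero momentum the energy is −1/|x|, so shrinking |x| drives H to −∞.

```python
import numpy as np

def hydrogen_energy(x, p, m=1.0, Z=1.0, e=1.0):
    """Classical hydrogenic Hamiltonian H = p^2/(2m) - Z e^2/|x|, cf. (1.7)."""
    return np.dot(p, p) / (2 * m) - Z * e**2 / np.linalg.norm(x)

p = np.zeros(3)
for r in [1.0, 1e-3, 1e-6, 1e-9]:
    x = np.array([r, 0.0, 0.0])
    print(f"|x| = {r:.0e}  ->  H = {hydrogen_energy(x, p):.3e}")
# The energies scale like -1/|x|, so inf H = -infinity.
```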
This problem was noticed well before the discovery of quantum mechanics. One possible explanation is that the assumption about point particles is wrong; indeed the nucleus has a nonzero diameter of about 10⁻¹³ cm. However, the typical size of the Hydrogen atom is 10⁻⁸ cm, i.e. several orders of magnitude bigger than the nucleus. Therefore the extended shape of the nucleus cannot explain the non-collapse of the electron on a much bigger scale.
This question is known as the problem of the stability of Hydrogen, and similarly one can ask whether a molecule of N electrons and K nuclei is stable, in the sense of whether inf H > −∞ or inf H = −∞. As the formula (1.8) immediately shows, the Hamiltonians (1.9), (1.10) are unstable in this sense. Such a scenario would have dramatic consequences for the world: it would indicate that after a long time the electrons of atoms and molecules would fall into their respective nuclei, matter would look rather like a dense soup instead of consisting of fairly well separated particles, and a huge amount of energy would be released.
It was one of the great triumphs of early quantum mechanics that it could explain why in the quantum model of the Hydrogen atom such a collapse does not occur. It took more than 40 years after that before the similar but stronger stability statement (“stability of matter of the second kind” – see the definition later) was discovered and rigorously proven for molecules and for general Coulomb systems.
2 Quantum mechanics

2.1 States
The state of a single particle in quantum mechanics is given by a complex valued wave function
ψ : Rd → C
defined on the classical configuration space, Rd, or on a subset Ω ⊂ Rd. Unlike in classical mechanics, where altogether 2d numbers were sufficient to specify the state (d position and d momentum coordinates), in quantum mechanics the state is given by a whole function, i.e. infinitely many numbers.
The non-negative function x ↦ |ψ(x)|² on Rd is interpreted as a probability density, i.e. for any subset Ω ⊂ Rd,

∫_Ω |ψ(x)|² dx = Prob{ the particle is in Ω }    (2.1)
Since we wish to interpret |ψ(x)|² as a probability density, we always assume the normalization condition

∫_{Rd} |ψ(x)|² dx = 1 .
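As a quick numerical illustration of (2.1) (my own example, not part of the notes), take the normalized Gaussian ψ(x) = π^{−1/4} e^{−x²/2} in d = 1: the density |ψ|² sums to 1 on a fine grid, and summing it over an interval Ω gives the probability of finding the particle there.

```python
import numpy as np

# Normalized Gaussian wave function in d = 1: psi(x) = pi^{-1/4} exp(-x^2/2)
x = np.linspace(-10.0, 10.0, 200001)
dx = x[1] - x[0]
psi = np.pi ** -0.25 * np.exp(-x**2 / 2)
density = np.abs(psi) ** 2

total = density.sum() * dx              # should be close to 1
mask = (x >= -1.0) & (x <= 1.0)         # Omega = [-1, 1]
prob_omega = density[mask].sum() * dx   # Prob{particle in Omega}, about erf(1)
print(f"total probability {total:.6f}, Prob(x in [-1,1]) = {prob_omega:.4f}")
```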
Therefore, the natural state space of a single quantum particle is the unit sphere in L²(Rd), the space of square integrable functions:

L²(Rd) = { ψ : Rd → C : ∫_{Rd} |ψ(x)|² dx < ∞ }
(the integral here is understood in the Lebesgue sense). We recall that the L²-space is equipped with a natural scalar product

⟨f, g⟩ = ∫_{Rd} \overline{f(x)} g(x) dx

and with a norm

‖f‖ = ‖f‖₂ = √⟨f, f⟩

and it is a Hilbert space, i.e. it is complete with respect to this norm. Since we will mostly use this L²-norm, we usually omit the subscript 2, i.e. ‖f‖ will always denote the L²-norm, by convention.
The definition (2.1) leaves a lot of room for discussion, especially as to what we mean by probability here. As we said, we are not going into fundamentalist issues; we just mention that quantum mechanics does not allow one to determine the precise position of the particle in any measurement (uncertainty principle). Moreover, we point out the experimental fact that the outcome of a quantum experiment is not a deterministic quantity, but rather a random number: if the same experiment is repeated several times, the measuring apparatus may show different numbers; it is only their statistics that is meaningful, i.e. we can ask what the probability that the gauge in the apparatus shows number 1 is, or what the expectation value of the shown number is if many identical experiments are performed.
2.2 Observables
It is a fact that not everything can be measured in quantum mechanics. The wave function
in principle contains all information about the state, nevertheless not every property of ψ is
accessible by measurements. By definition, the measurable quantities are those that can be
represented by self-adjoint (linear) operators O acting on L2 (Rd ), i.e. O : L2 (Rd ) → L2 (Rd );
these are called observables. The result of the measurement on the state ψ is given by
⟨ψ, Oψ⟩ = Expected value of the measurement O in state ψ    (2.2)

and it is always a real number. Without the normalization condition ‖ψ‖ = 1, the expected value of the measurement is given by

⟨ψ, Oψ⟩ / ⟨ψ, ψ⟩ .
Recall that, apart from non-trivial and non-negligible domain questions, self-adjointness means that O is symmetric, i.e.

⟨ψ, Oχ⟩ = ⟨Oψ, χ⟩,    ψ, χ ∈ D(O) ⊂ L²(Rd)

and that it is defined on the same domain as its adjoint, D(O) = D(O*). To facilitate the introduction, we do not worry about domain questions for the moment. For those who feel cheated, just think about bounded symmetric operators for now; it is a fact that any bounded operator can be extended uniquely to the whole Hilbert space, even if it was originally defined only on a dense subset (see Theorem I.7 of Reed-Simon Vol. I), thus any symmetric bounded operator is self-adjoint.
Note that any measurable quantity (2.2) is quadratic in ψ; e.g. it does not make sense to ask for the integral ∫ ψ(x) dx. Moreover, an overall phase factor is invisible in experiments, i.e. if we multiply the wave function by a phase factor e^{iα}, α ∈ R, then clearly

⟨e^{iα}ψ, O(e^{iα}ψ)⟩ = ∫_{Rd} e^{−iα} \overline{ψ(x)} O(e^{iα}ψ(x)) dx = e^{−iα} e^{iα} ∫_{Rd} \overline{ψ(x)} (Oψ)(x) dx = ⟨ψ, Oψ⟩
where the linearity of O has been used. This means that no measurement can distinguish between the state ψ and the state e^{iα}ψ, so one may even identify these states; in mathematical language, take the factor space with respect to the equivalence relation ψ ∼ χ iff there is α ∈ R such that χ = e^{iα}ψ.
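The phase-invariance computation is easy to replicate in a finite-dimensional toy model (my own setup: a random Hermitian matrix plays the role of the observable O):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
O = A + A.conj().T                       # Hermitian matrix: a toy observable
psi = rng.normal(size=4) + 1j * rng.normal(size=4)
psi /= np.linalg.norm(psi)               # normalization ||psi|| = 1

expval = lambda phi: np.vdot(phi, O @ phi)  # <phi, O phi>; vdot conjugates the first slot
alpha = 0.7                                  # an arbitrary phase
phase_invariant = np.allclose(expval(np.exp(1j * alpha) * psi), expval(psi))
real_valued = abs(expval(psi).imag) < 1e-12
print(f"phase invariant: {phase_invariant}, expectation real: {real_valued}")
```

Both checks succeed: the expectation value is insensitive to the overall phase and, O being Hermitian, real up to rounding.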
2.3 Position and momentum
The observable measuring the position is the multiplication operator by the variable x, i.e.

⟨ψ, xψ⟩ = ∫_{Rd} \overline{ψ(x)} x ψ(x) dx = ∫_{Rd} x |ψ(x)|² dx
which is clearly the first moment of the probability distribution |ψ(x)|². [Remark: Since x is a d-vector, this is actually a vector-valued observable; each coordinate of x = (x^{(1)}, . . . , x^{(d)}) is a real valued observable, and ⟨ψ, xψ⟩ is interpreted as a d-vector whose components are the real valued observables ⟨ψ, x^{(j)}ψ⟩, j = 1, 2, . . . , d.]
The observable measuring the momentum is −i times the derivative operator, p = −i∇ = −i∇x, i.e.

⟨ψ, (−i∇x)ψ⟩ = −i ∫_{Rd} \overline{ψ(x)} ∇x ψ(x) dx    (2.3)
Remark 2.1 Later we will insert a constant – Planck’s constant ℏ (3.2) – into the definition, i.e. p = −iℏ∇. This is clearly necessary even for dimensional reasons: the momentum has dimension (mass) · (length) · (time)⁻¹, while the derivative has dimension (length)⁻¹, thus we need a constant of dimension (mass) · (length)² · (time)⁻¹ to compensate. Its exact value determines the relation between the classical world (derivative, length) and the quantum world (quantum momentum). However, later (see Section 6) we will choose units where ℏ = 1, and in most of the course we will not see ℏ at all. So in this section, for simplicity, we drop ℏ.
Simple integration by parts in (2.3) shows that −i∇ is (formally) self-adjoint, i.e.

⟨χ, (−i∇x)ψ⟩ = ⟨(−i∇x)χ, ψ⟩

This relation certainly holds for sufficiently smooth (e.g. once continuously differentiable) functions that decay sufficiently at infinity (for example, compactly supported ones). Later we will see how to determine the correct domain of self-adjointness of −i∇. The meaning of −i∇x in position space is more obscure than that of the position operator x, but if we rewrite it in Fourier space, we see a complete duality between position and momentum.
We recall that the Fourier transform of ψ is defined (formally) as

ψ̂(k) = ∫_{Rd} ψ(x) e^{−2πix·k} dx
This definition is meaningful if ψ is an integrable function, i.e. ψ ∈ L¹(Rd), but it can be extended to L²(Rd); moreover, it turns out to be an isomorphism of L²(Rd) (i.e. the map “taking the Fourier transform” is a bijection from L²(Rd) onto L²(Rd) and it preserves the scalar product). In particular, we have Plancherel’s formula (theorem)
∫_{Rd} \overline{ψ̂(k)} χ̂(k) dk = ∫_{Rd} \overline{ψ(x)} χ(x) dx

in particular

‖ψ̂‖ = ‖ψ‖ .
The inverse Fourier transform is given by

f̌(x) = ∫_{Rd} f(k) e^{2πix·k} dk

(note the positive sign in the exponent!) and it can be shown that indeed

ψ = (ψ̂)ˇ = (ψ̌)ˆ    (2.4)
Formally the first relation can be seen from

(ψ̂)ˇ(x) = ∫_{Rd} e^{2πix·k} ( ∫_{Rd} ψ(y) e^{−2πiy·k} dy ) dk
        = ∫_{Rd} ψ(y) ( ∫_{Rd} e^{2πi(x−y)·k} dk ) dy
        = ∫_{Rd} ψ(y) δ(x − y) dy    (2.5)
        = ψ(x)
however, here the application of the Fubini theorem and the usage of the delta function are not fully rigorous; nevertheless, (2.4) can be established rigorously as an identity between L² functions (in particular, one does not expect it to hold for every x, only for almost every x).
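Numerically, the inversion (2.4) is the statement that transforming and then inverse-transforming returns the original function; the discrete analogue (a quick check of mine using numpy’s FFT pair) is:

```python
import numpy as np

rng = np.random.default_rng(2)
psi = rng.normal(size=256) + 1j * rng.normal(size=256)  # an arbitrary sample vector

# Discrete analogue of psi = (psi-hat)-check: FFT followed by inverse FFT
roundtrip = np.fft.ifft(np.fft.fft(psi))
print("psi recovered:", np.allclose(roundtrip, psi))
```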
Moreover, it follows directly from the definition of ψ̂ that

⟨ψ, (−i∇x)ψ⟩ = −i ∫_{Rd} \overline{ψ(x)} ∇x ψ(x) dx = 2π ∫_{Rd} k |ψ̂(k)|² dk

and similarly

⟨ψ, −∆ψ⟩ = ⟨ψ, (−i∇)·(−i∇)ψ⟩ = ∫_{Rd} |∇ψ(x)|² dx = (2π)² ∫_{Rd} k² |ψ̂(k)|² dk    (2.6)
In summary, the action of the momentum operator −i∇x on ψ(x) is just multiplication by 2πk in the Fourier representation ψ̂(k). This correspondence works in the other direction as well:

⟨ψ, 2πx ψ⟩ = ⟨ψ̂, i∇k ψ̂⟩

i.e. there is a complete duality between position and momentum and between the position space representation of the state, ψ(x), and its momentum space representation ψ̂(k). Derivative in one representation corresponds to multiplication by the variable (times 2π) in the other representation.
This correspondence holds even pointwise via the following formulas:

[−i∇ψ]ˆ(k) = 2πk ψ̂(k),    [2πx ψ]ˆ(k) = i∇k ψ̂(k)    (2.7)
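The correspondence (2.7) is exactly what makes FFT-based “spectral differentiation” work. A sketch (my own, on a finite grid, so only approximate equality is expected; the Gaussian and the grid sizes are arbitrary choices):

```python
import numpy as np

# Sample psi(x) = exp(-x^2) on a grid wide enough that psi is ~0 at the ends.
n, L = 1024, 20.0
x = np.linspace(-L / 2, L / 2, n, endpoint=False)
dx = x[1] - x[0]
psi = np.exp(-x**2)

# With the convention psi_hat(k) = ∫ psi(x) e^{-2πi x·k} dx, the discrete
# frequencies np.fft.fftfreq(n, d=dx) play the role of k, and (2.7) says that
# applying d/dx amounts to multiplying psi_hat by 2πik.
k = np.fft.fftfreq(n, d=dx)
dpsi_spectral = np.fft.ifft(2j * np.pi * k * np.fft.fft(psi)).real
dpsi_exact = -2 * x * psi                 # analytic derivative of exp(-x^2)

print("max error:", np.max(np.abs(dpsi_spectral - dpsi_exact)))
```

For this well-resolved Gaussian the error is near machine precision, which is the discrete shadow of the exact duality (2.7).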
Remark 2.2 (VERY IMPORTANT) It is a useful rule of thumb to think of x as carrying the dimension of a length, while the Fourier variable, k, carries the dimension of (length)⁻¹. In general, large scale properties of a function ψ(x) (i.e. behaviour for |x| ≫ 1) are reflected in the short scale properties of ψ̂(k), i.e. |k| ≪ 1; e.g. for a function ψ that decays slowly in x-space, we will have a singularity at k = 0 in its Fourier transform ψ̂(k). Vice versa: short scale properties of ψ are reflected in large scale properties of ψ̂. In physics, the first regime is called the infrared (IR) regime (long wavelength = large distance), the second one the ultraviolet (UV) regime (short wavelength = short distance).
Recall that the Fourier transform expresses oscillations in a function. The oscillation has a natural lengthscale (wavelength), and its inverse (frequency) is the corresponding Fourier variable k (often also called a mode). Thus ψ̂(k) tells us how much oscillation with wavelength k⁻¹ occurs in ψ. Oscillation is closely related to the derivative: higher frequency content (big ψ̂(k) for some large k) implies a higher derivative of ψ; this is clear from (2.7).
You can read more about the Fourier transform in Lieb-Loss: Analysis, Chapter 5, or in a handout to be published later. A final remark is that the Fourier transform always carries a 2π, and there are different conventions as to where one tucks the 2π in. We used the convention of the book Lieb-Loss: Analysis, while Reed-Simon defines the Fourier transform and its inverse as

ψ̂(k) = (2π)^{−d/2} ∫_{Rd} ψ(x) e^{−ix·k} dx,    f̌(x) = (2π)^{−d/2} ∫_{Rd} f(k) e^{ix·k} dk

The discrepancy is irrelevant as long as one is aware of it and checks at the beginning of each book which convention is used.
3 Hamiltonian: the generator of the time evolution
The energy is the most important measurable quantity; the corresponding quantum observable is the Hamilton operator H. It is a self-adjoint operator defined on L²(Rd), thus its expected value is always real (analogously, the Hamilton function (1.1) is real valued). The Hamiltonian generates the time evolution of the state of the system, i.e. the time evolution of the time dependent wave function ψ(t), via the Schrödinger equation
iℏ ∂t ψ = Hψ    (3.1)

where

ℏ = 1.05 × 10⁻³⁴ Joule·sec = 1.05 × 10⁻²⁷ g cm² sec⁻¹    (3.2)

is a universal physical constant (Planck’s constant divided by 2π) – we will discuss the units later.
Similarly to classical mechanics, the Hamiltonian (which is now an operator and not a function) contains all physical information about the system, so any modelling in physics starts with determining H. We remark that, similarly to classical mechanics, this formalism applies only to non-fully-relativistic situations, i.e. where there is an absolute time. Otherwise, the quantized version of the Lagrangian formalism is needed.
The Schrödinger equation is a first order evolution equation. With a given initial data, ψ(t0) = ψ0, at a fixed time t0, it has a unique solution

ψ(t) = e^{−i(t−t0)ℏ⁻¹H} ψ0    (3.3)

Formal substitution shows that (3.3) indeed solves (3.1). The main questions, however, are what exactly the exponential on the right hand side of (3.3) is, and whether the formal rules of differentiation really apply.
Even before we try to make sense of the formula for the solution, the first question is whether the solution exists at all, and if yes, whether it is unique. Being a simple evolution equation, from standard ODE theory we know that existence and uniqueness (at least locally) is guaranteed by Lipschitz continuity, i.e. by the existence of a constant K such that

‖Hψ − Hψ̃‖ ≤ K‖ψ − ψ̃‖

which, by the linearity of H, is equivalent to H being bounded.
If H were a bounded operator (in particular a matrix acting on the finite dimensional Hilbert space C^N), then one could define e^{itH} for any constant t ∈ R by a power series:

e^{itH} = ∑_{n=0}^{∞} (it)ⁿ Hⁿ / n!    (3.4)
Exercise 3.1 Check that this series converges in the operator norm and that the usual rules of differentiation apply, in particular (d/dt) e^{itH} = iH e^{itH}; actually the same holds for any power e^{cH} with c ∈ C. Along the way, you will have to check directly from (3.4) that

e^{itH} e^{isH} = e^{i(t+s)H}
As we will see in a moment, the most important Hamilton operators are unbounded, since they contain derivatives, and derivative operators are never bounded in L²: an inequality of the form ‖∇ψ‖ ≤ K‖ψ‖ could NEVER hold (WHY?). In particular, typical Hamiltonians are not defined on the whole of L²(Rd) [recall the Hellinger-Toeplitz theorem, Corollary to Theorem III.12 in Reed-Simon Vol. I]. Therefore not only does the series (3.4) not converge, but it is even questionable whether there is any element of the Hilbert space to which the right hand side could be applied term by term (in principle it could be that the intersection of all the domains D(Hⁿ), n = 1, 2, 3, . . ., is trivial).
It turns out that the symmetry of H is not sufficient to define e^{itH}; to define the dynamics, we will need self-adjointness. The definition will go through the spectral theorem, which is a generalization of the diagonalization of hermitian matrices to unbounded operators. Recall that if H is a finite hermitian matrix on C^N, H = H*, then, alternatively to (3.4), one could define e^{itH} as

e^{itH} = U e^{itD} U*    (3.5)

where H = UDU* is the diagonalization of H, i.e. U is a unitary matrix (U⁻¹ = U*) containing the orthonormalized eigenbasis of H and D = diag(λ1, λ2, . . . , λN) is a diagonal matrix containing the eigenvalues (with multiplicity). The exponential of the diagonal matrix, e^{itD}, is defined as the diagonal matrix with entries e^{itλj}.
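Exercise 3.2 can at least be sanity-checked numerically; in the sketch below (my own, for a random 3×3 Hermitian matrix) the truncated power series (3.4) and the spectral definition (3.5) agree to machine precision:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
H = (A + A.conj().T) / 2                  # Hermitian matrix, H = H*
t = 0.5

# Definition (3.4): truncated power series sum_n (it)^n H^n / n!
series = np.zeros((3, 3), dtype=complex)
term = np.eye(3, dtype=complex)
for n in range(40):                       # 40 terms is far past convergence here
    series = series + term
    term = term @ (1j * t * H) / (n + 1)

# Definition (3.5): diagonalize H = U D U* and exponentiate the eigenvalues
evals, U = np.linalg.eigh(H)
spectral = U @ np.diag(np.exp(1j * t * evals)) @ U.conj().T

print("definitions agree:", np.allclose(series, spectral))
print("e^{itH} unitary:", np.allclose(spectral @ spectral.conj().T, np.eye(3)))
```

The second check also confirms that e^{itH} is unitary, which is what makes it a sensible time evolution.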
Exercise 3.2 Check that the two definitions (3.4) and (3.5) coincide for hermitian matrices.
The precise formulation of the spectral theorem for unbounded operators is fairly long;
we will do it only later, when we will really need it. You can read the statement in Section
VIII.3 of Reed Simon, but it may seem scary for the moment. The essence is that with its
help one can define functions of self-adjoint operators: not only do polynomials of $H$ make sense (like $H^2, H^3, \dots$), but for essentially any function $f(\lambda)$ of a real argument $\lambda \in \mathbb{R}$ one can define an operator $f(H)$ that coincides with the usual definition for polynomials and obeys all standard "calculus" rules. E.g. with the function $f_t(\lambda) = e^{it\lambda}$ one can define the operators $e^{itH}$ for any $t$ in such a way that e.g. $e^{itH}e^{isH} = e^{i(t+s)H}$ holds. It is quite remarkable
that such a powerful calculus exists with operators that are more complicated objects than
functions. The spectral theorem says that all these are possible, if H is self-adjoint.
We remark that for certain special Hamiltonians, $e^{itH}$ can be computed easily without reference to the spectral theorem. For example, if $H = V(x)$, i.e. no kinetic energy is present, then clearly
$$\psi_t(x) = e^{-itV(x)/\hbar}\,\psi_0(x)$$
solves the Schrödinger equation (3.1). Similarly, if $H = -\hbar^2\Delta$, then $e^{it\hbar\Delta}$ acts as a multiplication in Fourier space, so
$$\widehat{\psi_t}(k) = \widehat{e^{it\hbar\Delta}\psi_0}(k) = e^{-it\hbar(2\pi k)^2}\,\widehat{\psi_0}(k)$$
is the Fourier transform of the solution to (3.1). [See Homework problem.] Unfortunately, such explicit formulas are not available in the general case $H = -\Delta + V$, and $e^{-itH}$ cannot be "put together" (at least not easily) from $e^{-itV(x)}$ and $e^{it\Delta}$, because $V(x)$ and $\Delta$ do not commute; thus
$$e^{-it(-\Delta+V)} \neq e^{it\Delta}\,e^{-itV}.$$
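This failure of exponential splitting for non-commuting operators can be illustrated on a $2\times2$ toy model (an illustration, not part of the notes: the matrices below merely stand in for the kinetic and potential parts):

```python
import numpy as np

def expm_sym(M):
    """Matrix exponential of a real symmetric matrix via diagonalization."""
    lam, U = np.linalg.eigh(M)
    return U @ np.diag(np.exp(lam)) @ U.T

# A plays the role of it*Delta (off-diagonal), B the role of -it*V
# (diagonal, like a multiplication operator); they do not commute.
A = np.array([[0., 1.], [1., 0.]])
B = np.array([[1., 0.], [0., -1.]])

assert not np.allclose(A @ B, B @ A)                              # [A, B] != 0
assert not np.allclose(expm_sym(A + B), expm_sym(A) @ expm_sym(B))

# Splitting does work asymptotically (the Trotter product formula):
n = 10000
trotter = np.linalg.matrix_power(expm_sym(A / n) @ expm_sym(B / n), n)
assert np.allclose(trotter, expm_sym(A + B), atol=1e-2)
```

The last lines hint at how $e^{-itH}$ can still be approximated by alternating the two factors, a fact used heavily in numerical practice.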
4 Hamiltonian: the energy
Similarly to the classical case, the energy $E_\psi$ of a single quantum particle in state $\psi$ is the sum of two parts, a kinetic energy and a potential energy,
$$E_\psi = T_\psi + V_\psi$$
where the kinetic energy (without magnetic fields) is
$$T_\psi = \frac{\hbar^2}{2m}\int_{\mathbb{R}^d} |\nabla\psi(x)|^2\, dx$$
and the potential energy is
$$V_\psi = \int_{\mathbb{R}^d} V(x)\,|\psi(x)|^2\, dx,$$
where $V(x)$ is a real valued function (the potential). Written with the observable notation,
$$T_\psi = \frac{1}{2m}\int_{\mathbb{R}^d} |(p\psi)(x)|^2\,dx = \Big\langle \psi, \frac{p^2}{2m}\psi\Big\rangle = \frac{1}{2m}\langle -i\hbar\nabla\psi, -i\hbar\nabla\psi\rangle = \frac{\hbar^2}{2m}\langle \psi, -\Delta\psi\rangle,$$
where we used the notation
$$p = -i\hbar\nabla_x$$
to replace the classical momentum $p$ with an operator (the notation is a bit sloppy: one really should distinguish the operator $p$ from the classical momentum, e.g. by putting a hat on it, $\hat p = -i\hbar\nabla_x$, but we will not use the classical momentum later). We see that with this replacement the quantum kinetic energy is formally the same as the classical kinetic energy in (1.3).
The potential energy is even easier; in observable form it is
$$V_\psi = \langle \psi, V\psi\rangle.$$
Thus the total energy is represented by the expectation value
$$E_\psi = \Big\langle \psi, \Big(\frac{p^2}{2m}+V\Big)\psi\Big\rangle = \frac{\hbar^2}{2m}\int_{\mathbb{R}^d}|\nabla\psi(x)|^2\,dx + \int_{\mathbb{R}^d} V(x)|\psi(x)|^2\,dx \qquad (4.1)$$
of the Hamilton operator
$$H = \frac{p^2}{2m}+V = -\frac{\hbar^2}{2m}\Delta + V,$$
which acts on any function $\psi$ as follows:
$$(H\psi)(x) = -\frac{\hbar^2}{2m}(\Delta\psi)(x) + V(x)\psi(x),$$
i.e. the potential acts as a multiplication operator. Notice that the second identity in (4.1) requires an integration by parts, so one should worry about the domain problem, i.e. precisely for which $\psi$'s both expressions are well defined.
As an example, the Hamiltonian of a hydrogenic atom, assuming that the nucleus with charge $Ze$ is fixed at the origin and we describe only the single electron with charge $-e$, is given by
$$H_{\mathrm{Hydr}} = -\frac{\hbar^2}{2m}\Delta_x - \frac{Ze^2}{|x|}. \qquad (4.2)$$
The $Z=1$ case is the usual Hydrogen atom.
In almost all cases in quantum mechanics, the Hamilton operator can be decomposed into a kinetic energy operator $H_0$ and a potential energy operator:
$$H = H_0 + V.$$
In most cases $H_0$ is a (pseudo)differential operator, while $V$ is a multiplication operator. One may think that the potential is the "easy" part and the kinetic energy is the complicated one; after all, multiplication by a function seems easier than differentiation. However, if one writes the action of $H$ in Fourier space then, by duality, differentiation and multiplication get interchanged. In particular, as (2.7) shows, a derivative in position space corresponds to multiplication in Fourier space and vice versa.
Independently of the representation, the key point is that typically $H_0$ and $V$ do not commute, since $p$ and $x$ do not commute. This is the basic reason why quantum mechanics cannot be described by classical objects like functions, which commute; one needs matrices, or "generalized matrices", i.e. operators. One of the fundamental observations of the "founding fathers" of quantum mechanics was that this non-commutativity is needed.
We mention that in the presence of a magnetic field $B = \nabla\times A$ in $d=3$ dimensions, the kinetic energy of a particle with mass $m$ and charge $q$ is modified:
$$T_{A,\psi} = \frac{1}{2m}\int_{\mathbb{R}^3} \Big|\Big(-i\hbar\nabla - \frac{q}{c}A(x)\Big)\psi(x)\Big|^2 dx = \langle\psi, H_A\psi\rangle \qquad (4.3)$$
with
$$H_A = \frac{1}{2m}\Big(p - \frac{q}{c}A(x)\Big)^2 = \frac{1}{2m}\Big(-i\hbar\nabla - \frac{q}{c}A(x)\Big)^2$$
being the kinetic energy operator. Notice that the formula is gauge invariant, i.e. for any real function $\chi:\mathbb{R}^3\to\mathbb{R}$ the kinetic energy operators with $A$ and with $A+\nabla\chi$ are unitarily equivalent:
$$H_{A+\nabla\chi} = U(\chi)\, H_A\, U^*(\chi) \qquad (4.4)$$
where $U(\chi)$ is the multiplication operator by the complex phase $\exp\big(i(q/\hbar c)\chi(x)\big)$.
Exercise 4.1 Check the formula (4.4)!
We add two remarks that will be relevant later on. First, the magnetic field itself has an energy (the price to be paid to the power company to generate this field [E. Lieb]). In standard CGS units (and in $d=3$) it is
$$\frac{1}{8\pi}\int |B(x)|^2\, dx, \qquad (4.5)$$
where the magnetic field has dimension $(\mathrm{mass})^{1/2}\cdot(\mathrm{length})^{-1/2}\cdot(\mathrm{time})^{-1}$, i.e. it is measured in $\mathrm{g}^{1/2}\,\mathrm{cm}^{-1/2}\,\mathrm{sec}^{-1}$. If we allow the system to adjust its own magnetic field, the field energy must also be included in the total energy of the system, although it is not an operator but just a number. Second, we remark that the magnetic kinetic energy (4.3) applies only to spinless particles; in particular, strictly speaking, not to electrons, which have spin $\frac12$. We will later introduce the corresponding kinetic energy (the Pauli operator) that also takes spin into account.
5 Ground state energy
In the previous section we explained that in order to define the time evolution, one first needs to establish the self-adjointness of $H$ (on a suitable domain), and second one needs the spectral theorem. Both steps require some non-trivial preparation from functional analysis and are not very intuitive. So we postpone their discussion and first focus on an aspect of quantum mechanics that can be presented with far fewer technicalities.
We mentioned that due to radiation, systems tend to settle into their low energy states, especially their ground states. Finding the lowest energy state of a system is also important because it tells us how much energy the system can release at most. If we had a system whose energy were unbounded from below, then by driving it into lower and lower energy states we could extract an infinite amount of energy from it – clearly a cheap solution to all our energy problems.
So one of the very first questions about a quantum Hamiltonian is whether it is bounded from below; more precisely, whether its ground state energy, defined as
$$E_0 = \inf\big\{ E_\psi = \langle\psi, H\psi\rangle \,:\, \|\psi\| = 1 \big\},$$
is finite or minus infinity. The infimum may not be attained even if $E_0 > -\infty$; nevertheless we still call it the ground state energy. If there is a minimizer $\psi_0$ such that $E_0 = E_{\psi_0}$, then $\psi_0$ is called the ground state. If $E_0 > -\infty$, then we say that the system satisfies stability of the first kind.
Note that to pose the question of stability one can avoid defining the domain of $H$ precisely; we can simply use the following definition of the energy:
$$E_\psi = T_\psi + V_\psi = \frac{\hbar^2}{2m}\int_{\mathbb{R}^d}|\nabla\psi(x)|^2\,dx + \int_{\mathbb{R}^d} V(x)|\psi(x)|^2\,dx.$$
The advantage of this definition (called the quadratic form of $H$) is that the kinetic energy term is the integral of a non-negative quantity, and if $\psi$ is non-differentiable, more precisely if its derivative does not coincide almost everywhere with an $L^2$-function, then we simply set $T_\psi = \infty$. The potential may have both a positive and a negative part; we decompose it as
$$V(x) = [V(x)]_+ - [V(x)]_-,$$
where for any real number $a$,
$$[a]_+ = \max\{a, 0\}, \qquad [a]_- = -\min\{0, a\}$$
denote the positive and the negative parts, respectively.
As long as the negative part of the potential is controlled by the kinetic energy, in the sense that
$$\int_{\mathbb{R}^d} [V(x)]_-\,|\psi(x)|^2\,dx \le T_\psi + K\|\psi\|^2 \qquad (5.1)$$
for some finite constant $K$ independent of $\psi$, then $E_\psi$ is bounded from below:
$$E_\psi = T_\psi + \int_{\mathbb{R}^d}[V(x)]_+|\psi(x)|^2\,dx - \int_{\mathbb{R}^d}[V(x)]_-|\psi(x)|^2\,dx \ge -K\|\psi\|^2 = -K,$$
taking into account the normalization $\|\psi\| = 1$. We have thus reduced the problem of stability to the proof of inequality (5.1), and notice that this inequality requires no worry about domains. If $T_\psi = \infty$, then the inequality always holds, so we can restrict our attention to those $\psi$'s with $T_\psi < \infty$ (this space will be called the $H^1$-Sobolev space). If the left hand side of (5.1) is infinite, then the inequality does not hold; otherwise we have to compare two finite numbers.
6 Units
Before we go on with more complicated formulas and concepts, we fix which physical units we use. The idea is that by choosing the units properly, all physical constants like the electron charge, mass, Planck constant etc. can be set equal to 1, and in this way we can focus on the structure of the formulas undisturbed by irrelevant constants.
In CGS units we have
• m= mass of the electron = 9.11 × 10−28 g
• e = (-1)× charge of the electron = 4.8 × 10−10 g1/2 cm3/2 sec−1
• c= speed of light = 3 × 1010 cm sec−1
• ~ = Planck’s constant divided by 2π = 1.055 × 10−27 g cm2 sec−1
The first three constants are known from classical physics. Planck's constant is the fundamental constant of quantum mechanics; essentially, it connects the momentum with the physical length scale through $p = -i\hbar\nabla_x$.
Out of these four constants there is a unique way to form a single dimensionless constant,
$$\alpha = \frac{e^2}{\hbar c} = \frac{1}{137.04},$$
which is called the fine structure constant. Since the electrostatic interaction potential is quadratic in the charge, $\alpha$ can be thought of as measuring the strength of this interaction.
The natural lengthscale is the Compton wavelength of the electron, defined as
$$\ell_C = \frac{\hbar}{mc} = 3.86\times 10^{-11}\ \mathrm{cm},$$
and the natural energy scale is the rest mass energy of the electron,
$$E_r = mc^2 = 8.2\times 10^{-7}\ \mathrm{ergs} = 8.2\times 10^{-7}\ \mathrm{g\,cm^2\,sec^{-2}}.$$
These are the natural units in relativistic quantum mechanics. In most of this course we will deal with non-relativistic quantum mechanics, so the appearance of the speed of light is unnatural.
Our basic unit of length will thus be
$$\ell = \frac{\ell_C}{2\alpha} = \frac{\hbar^2}{2me^2} = 2.65\times 10^{-9}\ \mathrm{cm}, \qquad (6.1)$$
which is half the Bohr radius $\hbar^2/(me^2)$, the typical size of the Hydrogen atom. Our energy unit will be
$$2mc^2\alpha^2 = \frac{2me^4}{\hbar^2} = 4\,\mathrm{Ry} = 8.73\times 10^{-11}\ \mathrm{ergs}, \qquad (6.2)$$
where 1 Rydberg (Ry) is the binding energy of the Hydrogen atom in its ground state.
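As a sanity check of these numerical values (an illustration, not part of the notes), the constants of this section can be recomputed directly from the CGS values listed above:

```python
# CGS values of the four constants listed at the start of this section.
m    = 9.11e-28     # electron mass [g]
e    = 4.8e-10      # elementary charge [g^(1/2) cm^(3/2) s^-1]
c    = 3e10         # speed of light [cm/s]
hbar = 1.055e-27    # Planck constant / 2pi [g cm^2 s^-1]

alpha = e**2 / (hbar * c)            # fine structure constant
lC    = hbar / (m * c)               # Compton wavelength [cm]
l     = hbar**2 / (2 * m * e**2)     # length unit (6.1) [cm]
E     = 2 * m * e**4 / hbar**2       # energy unit (6.2) = 4 Ry [erg]

assert abs(1/alpha - 137) < 1
assert abs(lC / l - 2 * alpha) < 1e-6    # consistency of (6.1): l = lC/(2*alpha)
print(f"alpha = 1/{1/alpha:.2f}, lC = {lC:.3g} cm, l = {l:.3g} cm, E = {E:.3g} erg")
```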
Changing the unit of length requires rescaling the wave function $\psi$; note that the normalization condition must be respected. Therefore, if $\psi(x)$ is the wavefunction in a certain unit, and we rescale space by a factor $\lambda$, i.e. $x \to X = x/\lambda$, then the wave function must be rescaled as
$$\psi(x) \;\to\; \lambda^{-3/2}\psi(x\lambda^{-1}) = \lambda^{-3/2}\psi(X) =: \widetilde\psi(X)$$
in the new unit. We can define a unitary transformation
$$(U_\lambda\psi)(x) = \lambda^{-3/2}\psi(x\lambda^{-1})$$
representing the change of the unit of length. It is easy to see that
$$U_\lambda^*\,\partial_x\,U_\lambda = \lambda^{-1}\partial_x$$
and
$$U_\lambda^*\,|\cdot|^\alpha\,U_\lambda = \lambda^\alpha\,|\cdot|^\alpha,$$
where $|\cdot|^\alpha$ denotes the multiplication operator by $|x|^\alpha$. Comparing these two formulas, we see that the derivative operator scales as $(\mathrm{length})^{-1}$, i.e. exactly as the Coulomb potential.
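These two operator identities can be checked numerically in a one-dimensional analogue (an illustration, not part of the notes; in 1D the unitarity prefactor is $\lambda^{-1/2}$ instead of $\lambda^{-3/2}$, and the test function is an arbitrary Gaussian):

```python
import numpy as np

lam = 1.7
psi = lambda x: np.exp(-x**2)                  # smooth test function

# 1D analogue of the dilation: (U_lam f)(x) = lam^{-1/2} f(x/lam)
U     = lambda f: (lambda x: lam**-0.5 * f(x / lam))
Ustar = lambda f: (lambda x: lam**+0.5 * f(x * lam))   # adjoint = inverse dilation

def D(f, h=1e-6):                              # numerical derivative d/dx
    return lambda x: (f(x + h) - f(x - h)) / (2*h)

x = np.array([0.3, -0.7, 1.1, 1.9])            # sample points away from 0

# U* (d/dx) U = lam^{-1} d/dx :
assert np.allclose(Ustar(D(U(psi)))(x), D(psi)(x) / lam, atol=1e-8)

# U* |x|^a U = lam^a |x|^a, e.g. with the Coulomb power a = -1:
a = -1.0
coulomb = lambda f: (lambda y: np.abs(y)**a * f(y))
assert np.allclose(Ustar(coulomb(U(psi)))(x), lam**a * np.abs(x)**a * psi(x))
```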
As an exercise, we compute the Hamiltonian of a hydrogenic atom (4.2) (written in CGS units) in our new units. We write the space coordinate $x$ in our unit of length (6.1) as
$$x = \ell X = \frac{\hbar^2}{2me^2}\,X,$$
then
$$\frac{\hbar^2}{2m}\Delta_x = \frac{\hbar^2}{2m}\Big(\frac{\hbar^2}{2me^2}\Big)^{-2}\Delta_X = \frac{2me^4}{\hbar^2}\,\Delta_X$$
and
$$\frac{Ze^2}{|x|} = \frac{Ze^2}{|X|}\Big(\frac{\hbar^2}{2me^2}\Big)^{-1} = \frac{2me^4}{\hbar^2}\,\frac{Z}{|X|}.$$
Thus, in the new units,
$$H = -\frac{\hbar^2}{2m}\Delta_x - \frac{Ze^2}{|x|} = \frac{2me^4}{\hbar^2}\Big(-\Delta_X - \frac{Z}{|X|}\Big),$$
so the Hamiltonian of the hydrogenic atom is
$$H = -\Delta_X - \frac{Z}{|X|} \qquad (6.3)$$
measured in our energy unit (6.2). Comparing with (4.2) we see that in our new units
$$2m = \hbar = e = 1.$$
Important rule of thumb: It is always good to check how various quantities scale with length. The derivative scales as $(\mathrm{length})^{-1}$, thus the Laplacian scales as $(\mathrm{length})^{-2}$ and the Coulomb potential scales as $(\mathrm{length})^{-1}$. Moreover, any integral $\int_{\mathbb{R}^3}(\dots)\,dx$ scales as $(\mathrm{length})^{3}$ and any Fourier variable (frequency) scales as $(\mathrm{length})^{-1}$. This indicates how each term behaves under the scaling $x \to \lambda x$. The scaling property of a term does not change under any correct mathematical manipulation, so this gives rise to a quick check: after a long calculation one can simply check whether the initial formula and the final formula scale in the same way with the length. If not, there is an error!

In particular, the two terms in the hydrogenic Hamiltonian scale differently with length, so the Hamiltonian has an intrinsic lengthscale, namely a lengthscale of order $Z^{-1}$ (measured in the units we use, in this case in $\ell$). For example, we will see that the eigenfunctions of $H$ live on a lengthscale $Z^{-1}$; this is the "built-in" lengthscale in $H$: if $Z$ had dimension $(\mathrm{length})^{-1}$, then the two terms in $H$ would scale in the same way.
Exercise 6.1 (i) Prove that the Hamiltonian of a hydrogenic atom in the relativistic units $\ell_C$ and $E_r$ is given by
$$H = -\frac{1}{2}\Delta_x - \frac{Z\alpha}{|x|}.$$
(ii) Show that a vector potential $A$ has dimension $\mathrm{g}^{1/2}\,\mathrm{cm}^{1/2}\,\mathrm{sec}^{-1}$ (i.e. charge/length) and the magnetic field $B$ has dimension $\mathrm{g}^{1/2}\,\mathrm{cm}^{-1/2}\,\mathrm{sec}^{-1}$ (i.e. charge/(length)$^2$). Choose $\ell_C^{-1}\sqrt{\hbar c}$ as the unit for $A$, and $\ell_C^{-2}\sqrt{\hbar c}$ as the unit for the magnetic field, and prove that the Hamiltonian of a hydrogenic atom in a magnetic field is given by
$$H = \frac{1}{2}\big(p + \sqrt{\alpha}\,A\big)^2 - \frac{Z\alpha}{|x|}$$
in the relativistic units $\ell_C$ and $E_r$.
(iii) Show that in the units indicated in part (ii), the field energy (4.5) remains unchanged, i.e. it is still given by
$$\frac{1}{8\pi}\int_{\mathbb{R}^3} |B(x)|^2\,dx.$$
This exercise shows that in the relativistic units
$$m = \hbar = c = 1$$
and $\sqrt{\alpha}$ is the elementary charge $e$.

Convention: To economize the formulas, we will often omit $dx$ and the domain from the integration, i.e. we write
$$\int f \quad\text{for}\quad \int_{\mathbb{R}^3} f(x)\,dx.$$
7 Stability of the hydrogenic atom
We now show that the hydrogenic atom, given by the Hamiltonian (6.3), is stable:

Theorem 7.1 Consider the set
$$M := \Big\{ \psi : \mathbb{R}^3 \to \mathbb{C} \,:\, \int_{\mathbb{R}^3}|\nabla\psi(x)|^2\,dx < \infty, \ \int_{\mathbb{R}^3}\frac{|\psi(x)|^2}{|x|}\,dx < \infty \Big\}$$
of (unnormalized) wave functions whose kinetic energy and Coulomb energy are finite. Then, for any $Z>0$, the ground state energy
$$E_0 = \inf\Big\{ \int_{\mathbb{R}^3}|\nabla\psi(x)|^2\,dx - \int_{\mathbb{R}^3}\frac{Z}{|x|}\,|\psi(x)|^2\,dx \,:\, \psi\in M,\ \|\psi\| = 1 \Big\} \qquad (7.1)$$
is finite and is given by
$$E_0 = -\frac{Z^2}{4},$$
and the function
$$\psi_0(x) = \frac{Z^{3/2}}{\sqrt{8\pi}}\,e^{-Z|x|/2} \qquad (7.2)$$
is the unique minimizer.
The key ingredient of the proof is a lemma that establishes the bound (5.1) for our potential (after the Schwarz inequality $ab \le \frac12(a^2+b^2)$ for the positive numbers $a = \|\nabla\psi\|$, $b = \|\psi\|$):

Lemma 7.2 Let $\psi\in M$. Then
$$\int \frac{|\psi(x)|^2}{|x|}\,dx \le \|\nabla\psi\|\,\|\psi\| \qquad (7.3)$$
and equality holds if and only if $\psi(x) = \mathrm{const.}\;e^{-c|x|}$ for some constant $c > 0$.
Remark. You may wonder how to figure out such an inequality. Recall that in (5.1) we wanted to control the negative part of the potential (in this case the entire potential) by the kinetic energy (plus the $L^2$-norm). Apart from the trivial constant $Z$, the left hand side of (7.3) is the potential energy. We want to bound it in terms of $\|\nabla\psi\|$ and $\|\psi\|$ – what kind of inequality has any chance to be correct at all?

Here is a "back-of-the-envelope" test that checks whether an inequality can be correct. It does not prove the inequality, but if an inequality fails this test, it is surely wrong. The idea is to test how the inequality behaves under two different rescalings.

First, notice that if $\psi$ is replaced by $\lambda\psi$ with some $\lambda > 0$, then both sides of (7.3) change by a factor $\lambda^2$. The two sides must scale in the same way with $\lambda$, otherwise the inequality cannot be correct.

Second, replace $\psi(x)$ with $\psi(x/\lambda)$. After a change of variables, the left hand side scales as
$$\int \frac{|\psi(x/\lambda)|^2}{|x|}\,dx = \lambda^{-1}\lambda^3\int\frac{|\psi(y)|^2}{|y|}\,dy.$$
Here $\lambda^3$ comes from the change of variables $y = x/\lambda$, and the extra $\lambda^{-1}$ comes from the denominator. On the right hand side we find
$$\Big(\int|\nabla[\psi(x/\lambda)]|^2\,dx\Big)^{1/2}\Big(\int|\psi(x/\lambda)|^2\,dx\Big)^{1/2} = \lambda^{-1}\lambda^3\Big(\int|\nabla\psi(y)|^2\,dy\Big)^{1/2}\Big(\int|\psi(y)|^2\,dy\Big)^{1/2},$$
where, again, $\lambda^3$ comes from the trivial volume factors and $\lambda^{-1}$ comes from the derivative. Again we see that the two sides scale in the same way, which is a necessary condition for the inequality to hold.

Actually, it is easy to see that no other combination of the form $\|\nabla\psi\|^\alpha\|\psi\|^\beta$ passes both tests apart from (7.3) with $\alpha = \beta = 1$.
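The scaling test (and the inequality (7.3) itself) can be verified numerically for a family of dilated Gaussians (an illustration, not part of the notes; the radial quadrature grid below is an arbitrary choice):

```python
import numpy as np

dr = 6e-5
r = (np.arange(200000) + 0.5) * dr          # radial grid on (0, 12), midpoint rule
w = 4*np.pi * r**2 * dr                     # 3D volume element weights

def both_sides(lam):
    """LHS and RHS of (7.3) for psi(x) = exp(-(|x|/lam)^2)."""
    psi  = np.exp(-(r/lam)**2)
    dpsi = -2*r/lam**2 * np.exp(-(r/lam)**2)   # |grad psi| = |psi'(r)| for radial psi
    lhs = np.sum(w * psi**2 / r)               # integral of |psi|^2 / |x|
    rhs = np.sqrt(np.sum(w * dpsi**2)) * np.sqrt(np.sum(w * psi**2))
    return lhs, rhs

l1, r1 = both_sides(1.0)
l2, r2 = both_sides(2.0)
assert l1 < r1 and l2 < r2                                # (7.3) holds (strictly: not e^{-c|x|})
assert abs(l2/l1 - 4) < 1e-3 and abs(r2/r1 - 4) < 1e-3    # both sides scale as lam^2
```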
Proof of Lemma 7.2. By a standard density argument it is sufficient to prove the inequality for all $\psi\in C_0^\infty$, i.e. for all smooth, compactly supported functions – see the remark at the end of this section.

Using integration by parts and the identity $\sum_{j=1}^3 \partial_{x_j}\frac{x_j}{|x|} = \frac{2}{|x|}$, compute
$$2\Big\langle \psi, \frac{1}{|x|}\psi\Big\rangle = \sum_{j=1}^3 \Big\langle \psi, \Big[\partial_{x_j}, \frac{x_j}{|x|}\Big]\psi\Big\rangle = -\sum_j\Big( \Big\langle \partial_{x_j}\psi, \frac{x_j}{|x|}\psi\Big\rangle + \Big\langle \frac{x_j}{|x|}\psi, \partial_{x_j}\psi\Big\rangle\Big) \le 2\sum_j \Big|\Big\langle \partial_{x_j}\psi, \frac{x_j}{|x|}\psi\Big\rangle\Big|,$$
where $[A,B] = AB - BA$ is the commutator. By the Schwarz inequality,
$$\Big|\Big\langle \partial_{x_j}\psi, \frac{x_j}{|x|}\psi\Big\rangle\Big| \le \|\partial_{x_j}\psi\|\,\Big\|\frac{x_j}{|x|}\psi\Big\|. \qquad (7.4)$$
After summing up, and using another Schwarz inequality for the sum (together with $\sum_j (x_j/|x|)^2 = 1$), we have
$$\sum_j \Big|\Big\langle \partial_{x_j}\psi, \frac{x_j}{|x|}\psi\Big\rangle\Big| \le \|\nabla\psi\|\,\|\psi\|.$$
Thus the proof of (7.3) is complete.

Exercise 7.3 Prove the case of equality stated in the lemma.
Proof of Theorem 7.1. Using (7.3) and $\|\psi\| = 1$, we have
$$\int|\nabla\psi|^2 - Z\int\frac{|\psi|^2}{|x|} \;\ge\; \|\nabla\psi\|^2 - Z\|\nabla\psi\| \;=\; \Big(\|\nabla\psi\| - \frac{Z}{2}\Big)^2 - \frac{Z^2}{4}$$
after completing the square. Taking the infimum over all $\psi\in M$, we obtain
$$E_0 \ge -\frac{Z^2}{4},$$
and it is easy to check that among the functions $\psi_0(x) = \mathrm{const.}\;e^{-c|x|}$ equality is achieved only for $c = Z/2$. The prefactor in (7.2) comes from the normalization. (CHECK!)
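The variational statement can also be checked numerically: among the trial functions $e^{-c|x|}$ the energy is $c^2 - Zc$, minimized at $c = Z/2$ with value $-Z^2/4$. A sketch (radial quadrature; the grid parameters and the value $Z = 3$ are arbitrary illustrations):

```python
import numpy as np

Z = 3.0
dr = 2e-4
r = (np.arange(200000) + 0.5) * dr          # radial grid on (0, 40), midpoint rule
w = 4*np.pi * r**2 * dr                     # 3D volume element weights

def energy(c):
    """E_psi of (7.1) for the normalized trial function psi(x) = e^{-c|x|}."""
    psi  = np.exp(-c*r)
    dpsi = -c*np.exp(-c*r)                  # |grad psi| = |psi'(r)| for radial psi
    T = np.sum(w * dpsi**2)
    V = -Z * np.sum(w * psi**2 / r)
    return (T + V) / np.sum(w * psi**2)

cs = np.linspace(0.2, 4.0, 96)
vals = [energy(c) for c in cs]
best = min(vals)
assert abs(best - (-Z**2 / 4)) < 1e-3               # minimum is -Z^2/4 ...
assert abs(cs[int(np.argmin(vals))] - Z/2) < 0.05   # ... attained near c = Z/2
```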
You may find the proof of Lemma 7.2 very special: the commutator trick works only for the Coulomb potential. What if we consider a different potential? This will lead us to Sobolev inequalities, which we will discuss later. One of them is the following lower bound on the $L^2$-norm of the gradient in $d=3$ dimensions: there exists a universal constant $C$ such that
$$\|\nabla\psi\| \ge C\|\psi\|_6.$$
This inequality can be used to prove the bound (5.1), since, by Hölder's inequality (CHECK the exponents!)
$$\int [V]_-\,|\psi|^2 \le \|\psi\|_6^2\,\|[V]_-\|_{3/2},$$
and thus, combining this with the Sobolev inequality, we have
$$\int [V]_-\,|\psi|^2 \le \|\nabla\psi\|^2 = T_\psi$$
as long as $\|[V]_-\|_{3/2} \le C^2$. If this latter condition is not satisfied (e.g. for the Coulomb potential, where $[V]_- = |x|^{-1}\notin L^{3/2}$), one can still use this idea with a small twist. Suppose that one can write
$$[V]_- = V_1 + V_2$$
where $V_2\in L^\infty$ and $V_1\in L^{3/2}$ with $\|V_1\|_{3/2} \le C^2$. Then
$$\int[V]_-\,|\psi|^2 = \int V_1|\psi|^2 + \int V_2|\psi|^2 \le \|\nabla\psi\|^2 + \|V_2\|_\infty\|\psi\|^2 = T_\psi + \|V_2\|_\infty\|\psi\|^2,$$
i.e. (5.1) is still satisfied with $K = \|V_2\|_\infty$.
VERY IMPORTANT REMARK: We will often use the idea that to prove an inequality like (7.3) for all functions for which both sides are finite, it is sufficient to prove the inequality for "nice" functions, most often for $C_0^\infty$ functions. In the mathematics literature this fact is referred to as a "standard density argument", and one usually does not waste more time on it. We will do the same, but once and for all I want to show on this example how it goes.

We have to know that $C_0^\infty$ is dense in the $H^1$-norm, i.e. in the $L^2$ norm and in the $L^2$ norm of the gradient:
$$\|\psi\|_{H^1}^2 := \|\psi\|^2 + \|\nabla\psi\|^2.$$
This statement can be found in Theorem 7.6 of Lieb–Loss (the density in $L^2$ is in Theorem 2.16); later we will define $H^1$ precisely, for the moment: it is the space of all $L^2$-functions whose gradient is in $L^2$.

Armed with this information, the argument goes as follows: suppose that (7.3) is proven for all $C_0^\infty$ functions, and let $\psi\in M$ be an arbitrary function. From the density of $C_0^\infty$ in $H^1$ we can find an approximating sequence $\psi_n\in C_0^\infty$ such that
$$\|\psi - \psi_n\| \to 0, \qquad \|\nabla\psi - \nabla\psi_n\| \to 0 \qquad (7.5)$$
(in particular, $\|\nabla\psi_n\|\to\|\nabla\psi\|$ and $\|\psi_n\|\to\|\psi\|$), and we have
$$\int\frac{|\psi_n(x)|^2}{|x|}\,dx \le \|\nabla\psi_n\|\,\|\psi_n\|. \qquad (7.6)$$
Moreover, $\psi_n$ is a Cauchy sequence in the weighted $L^2$ space with measure $d\mu(x) = |x|^{-1}dx$, since
$$\int|\psi_n(x)-\psi_m(x)|^2\,d\mu(x) = \int\frac{|\psi_n(x)-\psi_m(x)|^2}{|x|}\,dx \le \|\nabla(\psi_n-\psi_m)\|\,\|\psi_n-\psi_m\| \to 0$$
as $n, m\to\infty$, where we used (7.3) for $C_0^\infty$ functions. Using the Riesz–Fischer theorem (completeness of $L^p$, Theorem 2.7 of Reed–Simon) and passing to a subsequence (which we continue to denote by $\psi_n$) we can assume that $\psi_n$ converges in $L^2(\mathbb{R}^3, d\mu)$ as well. The limit must be $\psi$ (WHY? – because a further subsequence converges pointwise almost everywhere, both in $L^2$ and in $L^2(\mathbb{R}^3, d\mu)$; this fact is also part of the Riesz–Fischer theorem, and a sequence of functions has only one pointwise limit).

Taking now the $n\to\infty$ limit on both sides of the inequality (7.6), we obtain that (7.3) holds for $\psi$ as well.
8 Stability of atoms and molecules
8.1 Stability of first kind
In the previous section we proved that the very special hydrogenic atom is stable. In fact, the same proof immediately shows that any atom or molecule, or even any extended matter with Coulomb interaction, is stable.

The state of $N$ particles is described by a wave function of $N$ variables, i.e. by an element of
$$L^2(\mathbb{R}^{3N}) = \Big\{ \psi(x_1, x_2, \dots, x_N) : \mathbb{R}^{3N}\to\mathbb{C} \,:\, \int |\psi(x_1, \dots, x_N)|^2\,dx_1\dots dx_N < \infty \Big\}.$$
Later we will see that not every function is allowed, only those with a certain symmetry type (depending on whether the particles are bosons or fermions), but for the moment we consider all functions.
The Hamilton operator is (formally) given by
$$H = \sum_{j=1}^N(-\Delta_{x_j}) + V_C(x, R)$$
where $x = (x_1, x_2, \dots, x_N)$ and $V_C$ was given in (1.8) (setting $e = 1$). This is the natural quantum analogue of (1.9) (in our chosen units), and we consider here the case of static nuclei. The Laplacian $\Delta_{x_j}$ acts only on the $j$-th variable, but it is still considered as an operator acting on functions $\psi$ of $N$ variables. Often we will write $\Delta_j$ for $\Delta_{x_j}$ and similarly $\nabla_j = \nabla_{x_j}$ for the gradient.

Thus the energy of this molecule in state $\psi$ is
$$E_\psi = \sum_{j=1}^N\int|\nabla_j\psi(x)|^2\,dx + \int V_C(x, R)\,|\psi(x)|^2\,dx \qquad (8.1)$$
where $dx = dx_1\,dx_2\dots dx_N$.
Theorem 8.1 (Stability of first kind for atoms and molecules) For a fixed location of the nuclei at $R = (R_1, \dots, R_K)$, consider the set
$$M_R := \Big\{ \psi:\mathbb{R}^{3N}\to\mathbb{C} \,:\, \sum_{j=1}^N\int|\nabla_j\psi(x)|^2\,dx < \infty, \ \sum_{k=1}^K\sum_{j=1}^N\int\frac{|\psi(x)|^2}{|x_j - R_k|}\,dx < \infty \Big\}.$$
The ground state energy of the molecule with $N$ electrons and $K$ nuclei fixed at locations $R$ is given by
$$E_0(R) = \inf\big\{ E_\psi \,:\, \psi\in M_R,\ \|\psi\| = 1\big\},$$
and the total ground state energy is given by
$$E_0 = \inf_R E_0(R),$$
i.e. the ground state energy minimized over all nuclear positions. Then $E_0 > -\infty$.
Proof. Since we are looking for a lower bound, we can drop the positive repulsion terms from $V_C$, so it is sufficient to show that there exists a finite constant $C$, independent of $R$, such that
$$\sum_{j=1}^N\int|\nabla_j\psi(x)|^2\,dx - \sum_{k=1}^K\sum_{j=1}^N\int\frac{Z_k|\psi(x)|^2}{|x_j - R_k|}\,dx \ge -C\|\psi\|^2. \qquad (8.2)$$
Let $Z = \max_k Z_k$. It is clearly sufficient to prove that there is a finite $C'$ such that for each fixed $j$ and fixed $k$
$$\frac{1}{K}\int|\nabla_j\psi(x)|^2\,dx - \int\frac{Z|\psi(x)|^2}{|x_j - R_k|}\,dx \ge -C'\|\psi\|^2, \qquad (8.3)$$
and then we just add up these inequalities (for all $j = 1, 2, \dots, N$, $k = 1, 2, \dots, K$) to get the result with $C = C'NK$. We can write
$$\frac{1}{K}\int|\nabla_j\psi(x)|^2\,dx - \int\frac{Z|\psi(x)|^2}{|x_j - R_k|}\,dx = \frac{1}{K}\int dx_1\dots \widehat{dx_j}\dots dx_N\; H(x_1, \dots, \widehat{x_j}, \dots, x_N) \qquad (8.4)$$
with
$$H(x_1, \dots, \widehat{x_j}, \dots, x_N) = \int dx_j\Big[ |\nabla_j\psi(x_1, \dots, x_j, \dots, x_N)|^2 - \frac{KZ}{|x_j - R_k|}\,|\psi(x_1, \dots, x_j, \dots, x_N)|^2\Big]$$
and with the convention that a hat means omission (i.e. in $\widehat{dx_j}$ the $dx_j$ integration is missing). Here we view all but the $x_j$ variable as fixed (as parameters) and consider the function
$$g(x_j) = g_{x_1, \dots, \widehat{x_j}, \dots, x_N}(x_j) = \psi(x_1, \dots, x_j, \dots, x_N)$$
as a function of one variable,
$$g : x_j \mapsto \psi(x_1, \dots, x_j, \dots, x_N),$$
parametrized by the others.
This is an $L^2$ function of $x_j$ for almost all choices of $(x_1, \dots, \widehat{x_j}, \dots, x_N)$, since
$$\int \|g_{x_1, \dots, \widehat{x_j}, \dots, x_N}\|^2\,dx_1\dots \widehat{dx_j}\dots dx_N = \int\Big[\int dx_j\,|\psi(x_1, \dots, x_j, \dots, x_N)|^2\Big]dx_1\dots \widehat{dx_j}\dots dx_N = \|\psi\|_2^2 = 1.$$
Thus
$$H(x_1, \dots, \widehat{x_j}, \dots, x_N) = \int_{\mathbb{R}^3}\Big[|\nabla g(x_j)|^2 - \frac{KZ}{|x_j - R_k|}\,|g(x_j)|^2\Big]dx_j \qquad (8.5)$$
where $g = g_{x_1, \dots, \widehat{x_j}, \dots, x_N}$. From the stability of the hydrogenic atom we know that
$$H(x_1, \dots, \widehat{x_j}, \dots, x_N) \ge -\frac{1}{4}(KZ)^2\,\|g_{x_1, \dots, \widehat{x_j}, \dots, x_N}\|^2$$
(the fact that the nucleus is at $R_k$ and not at the origin does not change anything in the estimate, since the kinetic energy and the $L^2$ norm are both translation invariant). Plugging this estimate into (8.4), we obtain (8.3) with $C' = \frac14 Z^2K$, so we have the lower bound (8.2) with $C = \frac14 Z^2NK^2$.
This proof was presented for static nuclei. If the nuclei were dynamical as well, the proof would be even easier, since then there are additional positive kinetic energy terms in the Hamiltonian.
8.2 Stability of second kind, preliminary estimates
We have followed the dependence of the constant $C = \frac14 Z^2NK^2$ on the parameters, especially on the total particle number $N + K$. In a typical molecule or in extended matter, $N$ is close to $K$ to ensure (almost) electrostatic neutrality. Thus our lower bound is cubic in the total number of particles. This gives stability (of the first kind), but it is not completely satisfactory, since we would like to show that the ground state energy is at most proportional to the number of particles, i.e. we would like a bound that is linear in $N + K$.
Definition 8.1 We say that a Coulombic system consisting of $N$ electrons and $K$ nuclei satisfies the stability of the second kind if there is a constant $C = C_Z$, which may depend on the maximal charge of the nuclei, $Z = \max_k Z_k$, but is independent of the total number of particles, such that
$$E_\psi \ge -C_Z(N + K).$$
We will see that extended matter with Coulomb interaction actually satisfies this stronger stability criterion, but we will have to restrict the set of admissible wave functions in $M$ to antisymmetric ones. Physically this reflects the fact that the electrons are fermions; it is also called the Pauli principle. The nuclei can be of arbitrary particle type.

There will be three ingredients in such a proof:

(i) Coulomb singularities can be controlled by the kinetic energy;

(ii) electrostatic screening;

(iii) the Pauli principle.

So far we have seen (i); this is essentially the key inequality in Lemma 7.2. Electrostatic screening is a special property of Coulomb systems: very roughly, it expresses the fact that if we have a collection of particles (with both negative and positive charges) in a bounded domain, then far away the electrostatic potential generated by these particles looks as if all charges were concentrated at one point. In particular, there is a strong cancellation. Finally, the Pauli principle will strengthen the one particle Lemma 7.2 and lead to the Lieb–Thirring inequalities, which replace the Sobolev inequalities in the fermionic setup.
Why is the Pauli principle important? The following explanation should give a first (non-rigorous) insight. Suppose we have $N$ electrons and one single nucleus, say at the origin, with nuclear charge $Z$. Assume that we neglect all interactions among the electrons (these are positive, so for a lower bound we are allowed to neglect them). The Hamiltonian is
$$H = \sum_{j=1}^N\Big(-\Delta_j - \frac{Z}{|x_j|}\Big), \qquad (8.6)$$
or, in quadratic form,
$$E_\psi = \sum_{j=1}^N\int\Big[|\nabla_j\psi(x)|^2 - \frac{Z|\psi(x)|^2}{|x_j|}\Big]dx. \qquad (8.7)$$
What is the minimal energy? For one electron the energy would be $-Z^2/4$, with minimizing function $\psi_0(x) = (\mathrm{const})\,e^{-Z|x|/2}$ (see Theorem 7.1). If the electrons do not interact, then the second electron can occupy the same state as well, etc., so
$$\psi(x) = \psi(x_1, x_2, \dots, x_N) = \psi_0(x_1)\psi_0(x_2)\cdots\psi_0(x_N)$$
would be a natural guess (note that the single particle functions $\psi_0(x_j)$ should be multiplied and not added). It is easy to compute (EXERCISE) that the energy of this state is
$$E_\psi = -\frac{NZ^2}{4},$$
i.e. the energy is additive.
Exercise 8.2 Extend this argument to show rigorously that the atom with one nucleus of
charge Z always satisfies the stability of second kind.
The problem is that typical electrostatic matter has several nuclei. What kind of nuclear configuration yields the lowest energy? If we neglect the nucleus–nucleus repulsion as well (a very crude assumption), then nothing forbids the $K$ nuclei with charges $Z_1, Z_2, \dots, Z_K$ from piling up on top of each other, forming a nucleus with total charge $Z_1 + \dots + Z_K$. It can be proven that the total pile-up is indeed the lowest energy configuration, and the minimal energy is
$$E_0 = -\frac{(Z_1 + \dots + Z_K)^2 N}{4} \ge -\frac{1}{4}Z^2NK^2,$$
where $Z = \max_k Z_k$; this bound is exactly of the same form as the one we got in Theorem 8.1. To convince yourself, just check that the energy of a total pile-up is lower than if the nuclei are concentrated at, say, two different centers that are very far apart. Let $A$ be the index set of the nuclei around one center; then the energy is [WHY??]
$$E(A) = \min_{0\le N_1\le N}\Bigg[ -\frac{N_1\big(\sum_{j\in A}Z_j\big)^2}{4} - \frac{(N - N_1)\big(\sum_{j\notin A}Z_j\big)^2}{4}\Bigg].$$
Check [!!] that
$$E(A) \ge -\frac{(Z_1 + \dots + Z_K)^2 N}{4}$$
and equality holds if and only if $A = \emptyset$ or $A = \{1, 2, \dots, K\}$, i.e. there is only one center.
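The Check [!!] above can be verified by brute force for small examples (an illustration, not part of the notes; the charges and electron number below are arbitrary):

```python
from itertools import combinations

Zs = [1.0, 2.0, 3.0, 1.5]            # illustrative nuclear charges Z_1..Z_K
N  = 7                               # number of electrons
K  = len(Zs)
total = sum(Zs)

def E(A):
    """E(A) for the index set A of nuclei at the first center."""
    ZA = sum(Zs[i] for i in A)
    ZB = total - ZA
    return min(-(N1 * ZA**2 + (N - N1) * ZB**2) / 4 for N1 in range(N + 1))

pileup = -(total**2) * N / 4         # energy of the total pile-up

for size in range(K + 1):
    for A in combinations(range(K), size):
        assert E(A) >= pileup                      # pile-up is lowest
        if 0 < size < K:
            assert E(A) > pileup                   # strict for two genuine centers
assert E(()) == pileup                             # equality for a single center
```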
One may argue that we neglected the electrostatic repulsion among the nuclei (and also among the electrons), and this forbids putting all nuclei on top of each other. This is true, but correct electrostatics alone does not solve the problem: we will later prove that taking all electrostatics into account the ground state energy is of order $-C_Z(N+K)^{5/3}$. This is an improvement over the cubic behavior, but it is still not linear.
To achieve a lower bound that is linear in $N + K$, one additionally needs the Pauli principle. Very roughly, the Pauli principle forbids two electrons from occupying the same state: if the first electron is in the state $\psi_0(x_1)$, then the second one cannot be in $\psi_0(x_2)$, i.e. the function $\psi_0(x_1)\psi_0(x_2)$ is forbidden as a two-electron wavefunction.
More precisely, the Pauli principle will imply that if one electron is in the state $\psi_0$, then the second one must be in a state orthogonal to $\psi_0$, e.g. its wavefunction is $\psi_0(x_1)\psi_1(x_2)$ with $\psi_1\perp\psi_0$. Actually, the precise definition will require that the two particle wave function $\psi(x_1, x_2)$ be antisymmetric, i.e. $\psi(x_1, x_2) = -\psi(x_2, x_1)$, so the product $\psi_0(x_1)\psi_1(x_2)$ is still not correct, but the antisymmetrized product
$$(\psi_0\wedge\psi_1)(x_1, x_2) = \frac{1}{\sqrt2}\Big[\psi_0(x_1)\psi_1(x_2) - \psi_0(x_2)\psi_1(x_1)\Big]$$
will do the job (the $1/\sqrt2$ is the correct normalization to make $\psi_0\wedge\psi_1$ have norm one, CHECK!). If we choose $\psi_1$ to be the eigenfunction of the second lowest eigenvalue of the Hamiltonian $-\Delta - Z/|x|$, then $\psi_1\perp\psi_0$ (think of the orthogonality of eigenvectors of a hermitian matrix with different eigenvalues). The energy (8.7) of the function $\psi = \psi_0\wedge\psi_1$ will be the sum of the two lowest eigenvalues.
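Both the normalization CHECK and the eigenvalue-sum claim can be tested on a discretized toy model, where a random hermitian matrix stands in for the one-body Hamiltonian $-\Delta - Z/|x|$ (an illustration, not part of the notes):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6
a = rng.standard_normal((n, n))
h = (a + a.T) / 2                    # "one-body Hamiltonian" on C^n
E, V = np.linalg.eigh(h)             # eigenvalues in increasing order
psi0, psi1 = V[:, 0], V[:, 1]        # two lowest eigenvectors

# Antisymmetrized product (psi0 ∧ psi1)(x1, x2), stored as an n x n array:
wedge = (np.outer(psi0, psi1) - np.outer(psi1, psi0)) / np.sqrt(2)

assert np.isclose(np.sum(wedge**2), 1.0)       # norm one, as claimed
assert np.allclose(wedge, -wedge.T)            # antisymmetry

# Two-body Hamiltonian H = h ⊗ I + I ⊗ h acting on the array:
Hwedge = h @ wedge + wedge @ h.T
energy = np.sum(wedge * Hwedge)                # <psi, H psi> (real entries)
assert np.isclose(energy, E[0] + E[1])         # sum of the two lowest eigenvalues
```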
Exercise 8.3 Check this statement on the formal level, i.e. assuming that
$$\Big(-\Delta - \frac{Z}{|x|}\Big)\psi_j = E_j\psi_j, \qquad j = 0, 1,$$
and $E_0\neq E_1$, show that
$$\langle\psi, H\psi\rangle = E_0 + E_1,$$
where $\psi = \psi_0\wedge\psi_1$ and $H$ is given in (8.6). You can use the formal self-adjointness of $H$.
For $N$ electrons, the Pauli principle will imply that the lowest energy state of (8.6) is the antisymmetric product $\psi = \psi_0\wedge\psi_1\wedge\dots\wedge\psi_{N-1}$ of the $N$ eigenfunctions corresponding to the $N$ lowest eigenvalues (with multiplicity), i.e. to $E_0\le E_1\le E_2\le\dots$. The energy of $\psi$ is $E_0 + E_1 + \dots + E_{N-1}$.

Exercise 8.4 Reviewing the degeneracy structure of the Hydrogen eigenvalues (with nuclear charge $Z$), compute how the sum $E_0 + E_1 + \dots + E_{N-1}$ behaves in $N$ for large $N$. (Answer: $-CZ^2N^{1/3}$.)
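The asymptotics of Exercise 8.4 can be checked numerically, assuming (as will be discussed later in the notes; treat it as an assumption here) the hydrogen spectrum $E_n = -Z^2/(4n^2)$ with degeneracy $n^2$ in the units of (6.3), spin ignored. Each full shell then contributes $-Z^2/4$, and $N$ electrons fill about $(3N)^{1/3}$ shells:

```python
Z = 1.0

def lowest_sum(N):
    """Sum of the N lowest eigenvalues E_n = -Z^2/(4 n^2), degeneracy n^2."""
    total, count, n = 0.0, 0, 1
    while count < N:
        take = min(n * n, N - count)       # fill the n-th shell (degeneracy n^2)
        total += take * (-Z**2 / (4 * n * n))
        count += take
        n += 1
    return total

# Each full shell contributes n^2 * (-Z^2/(4n^2)) = -Z^2/4, and sum n^2 ~ M^3/3,
# so the sum behaves like -(Z^2/4) (3N)^{1/3}:
for N in [10**4, 10**6]:
    ratio = lowest_sum(N) / (-(Z**2 / 4) * (3 * N)**(1/3))
    assert abs(ratio - 1) < 0.05
```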
Imagining again that all nuclei are concentrated at the origin (i.e. neglecting the nucleus–nucleus interaction) and also neglecting the electron–electron interaction, but taking into account the Pauli principle, we see that the ground state energy is of order $-CZ^2K^2N^{1/3}$.
Summarizing: If we neglect repulsion (i.e. the electrostatics is wrong) and neglect the Pauli principle, then the ground state energy is $-C_Z(N+K)^3$. If we neglect the Pauli principle but take into account the proper electrostatics, then the ground state energy is of order $-C_Z(N+K)^{5/3}$, i.e. electrostatics improves the power by $4/3$. Finally, if we take into account the Pauli principle but neglect electrostatics, the ground state energy is $-C_Z(N+K)^{7/3}$ (always expressed in terms of the total number of particles, $N + K$), i.e. the Pauli principle improves the power by $2/3$. The goal will be to show that if we take both effects into account, then the bound is linear in $N + K$, i.e. the original cubic power is improved by $4/3 + 2/3 = 2$.