FUNCTIONAL ANALYSIS LECTURE NOTES
CHAPTER 2. OPERATORS ON HILBERT SPACES
CHRISTOPHER HEIL
1. Elementary Properties and Examples
First recall the basic definitions regarding operators.
Definition 1.1 (Continuous and Bounded Operators). Let X, Y be normed linear spaces,
and let L : X → Y be a linear operator.
(a) L is continuous at a point $f \in X$ if $f_n \to f$ in $X$ implies $Lf_n \to Lf$ in $Y$.
(b) L is continuous if it is continuous at every point, i.e., if $f_n \to f$ in $X$ implies $Lf_n \to Lf$ in $Y$ for every $f$.
(c) L is bounded if there exists a finite $K \ge 0$ such that
\[ \|Lf\| \le K\,\|f\| \quad \text{for all } f \in X. \]
Note that $\|Lf\|$ is the norm of $Lf$ in $Y$, while $\|f\|$ is the norm of $f$ in $X$.
(d) The operator norm of L is
\[ \|L\| = \sup_{\|f\|=1} \|Lf\|. \]
(e) We let B(X, Y ) denote the set of all bounded linear operators mapping X into Y ,
i.e.,
B(X, Y ) = {L : X → Y : L is bounded and linear}.
If $X = Y$ then we write $B(X) = B(X, X)$.
(f) If $Y = \mathbb{F}$ then we say that L is a functional. The set of all bounded linear functionals on X is the dual space of X, and is denoted
\[ X' = B(X, \mathbb{F}) = \{L : X \to \mathbb{F} : L \text{ is bounded and linear}\}. \]
We saw in Chapter 1 that, for a linear operator, boundedness and continuity are equivalent. Further, the operator norm is a norm on the space $B(X, Y)$ of all bounded linear operators from X to Y, and we have the composition property that if $L \in B(X, Y)$ and $K \in B(Y, Z)$, then $KL \in B(X, Z)$, with $\|KL\| \le \|K\|\,\|L\|$.
Date: February 20, 2006.
These notes closely follow and expand on the text by John B. Conway, “A Course in Functional Analysis,”
Second Edition, Springer, 1990.
Exercise 1.2. Suppose that $L : X \to Y$ is a bounded map of a Banach space X into a Banach space Y. Prove that if there exists a $c > 0$ such that $\|Lf\| \ge c\,\|f\|$ for every $f \in X$, then range(L) is a closed subspace of Y.
Exercise 1.3. Let $C_b(\mathbb{R}^n)$ be the set of all bounded, continuous functions $f : \mathbb{R}^n \to \mathbb{F}$. Let $C_0(\mathbb{R}^n)$ be the set of all continuous functions $f : \mathbb{R}^n \to \mathbb{F}$ such that $\lim_{|x|\to\infty} f(x) = 0$ (i.e., for every $\varepsilon > 0$ there exists a compact set K such that $|f(x)| < \varepsilon$ for all $x \notin K$). Prove that these are closed subspaces of $L^\infty(\mathbb{R}^n)$ (under the $L^\infty$-norm; note that for a continuous function we have $\|f\|_\infty = \sup |f(x)|$).
Define $\delta : C_b(\mathbb{R}^n) \to \mathbb{F}$ by
\[ \delta(f) = f(0). \]
Prove that $\delta$ is a bounded linear functional on $C_b(\mathbb{R}^n)$, i.e., $\delta \in (C_b)'$, and find $\|\delta\|$. This linear functional is the delta distribution (see also Exercise 1.26 below).
Example 1.4. In finite dimensions, all linear operators are given by matrices; this is just standard finite-dimensional linear algebra.
Suppose that X is an n-dimensional complex normed vector space and Y is an m-dimensional complex normed vector space. By definition of dimension, this means that there exists a basis $B_X = \{x_1, \dots, x_n\}$ for X and a basis $B_Y = \{y_1, \dots, y_m\}$ for Y. If $x \in X$, then $x = c_1 x_1 + \cdots + c_n x_n$ for a unique choice of scalars $c_i$. Define the coordinates of x with respect to the basis $B_X$ to be
\[ [x]_{B_X} = \begin{bmatrix} c_1 \\ \vdots \\ c_n \end{bmatrix} \in \mathbb{C}^n. \]
The vector x is completely determined by its coordinates, and conversely each vector in $\mathbb{C}^n$ is the coordinates of a unique $x \in X$. The mapping $x \mapsto [x]_{B_X}$ is a linear mapping of X onto $\mathbb{C}^n$. We similarly define $[y]_{B_Y} \in \mathbb{C}^m$ for vectors $y \in Y$.
Let $A : X \to Y$ be a linear map (it is automatically bounded since X is finite-dimensional). Then A transforms vectors $x \in X$ into vectors $Ax \in Y$. The vector x is determined by its coordinates $[x]_{B_X}$ and likewise $Ax$ is determined by its coordinates $[Ax]_{B_Y}$. The vectors x and Ax are related through the linear map A; we will show that the coordinate vectors $[x]_{B_X}$ and $[Ax]_{B_Y}$ are related by multiplication by an $m \times n$ matrix determined by A. We call this matrix the standard matrix of A with respect to $B_X$ and $B_Y$, and denote it by $[A]_{B_X,B_Y}$. That is, the standard matrix should satisfy
\[ [Ax]_{B_Y} = [A]_{B_X,B_Y}\,[x]_{B_X}, \qquad x \in X. \]
We claim that the standard matrix is the matrix whose columns are the coordinates of the vectors $Ax_k$, i.e.,
\[ [A]_{B_X,B_Y} = \begin{bmatrix} [Ax_1]_{B_Y} & \cdots & [Ax_n]_{B_Y} \end{bmatrix}. \]
To see this, choose any $x \in X$ and let $x = c_1 x_1 + \cdots + c_n x_n$ be its unique representation with respect to the basis $B_X$. Then
\begin{align*}
[A]_{B_X,B_Y}\,[x]_{B_X}
  &= \begin{bmatrix} [Ax_1]_{B_Y} & \cdots & [Ax_n]_{B_Y} \end{bmatrix}
     \begin{bmatrix} c_1 \\ \vdots \\ c_n \end{bmatrix} \\
  &= c_1 [Ax_1]_{B_Y} + \cdots + c_n [Ax_n]_{B_Y} \\
  &= [c_1 Ax_1 + \cdots + c_n Ax_n]_{B_Y} \\
  &= [A(c_1 x_1 + \cdots + c_n x_n)]_{B_Y} \\
  &= [Ax]_{B_Y}.
\end{align*}
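The column-by-column construction above can be tried on a small concrete case. The following is a minimal sketch, not part of the original notes: it assumes X is the space of real polynomials of degree at most 2 with basis {1, x, x^2}, Y is the space of polynomials of degree at most 1 with basis {1, x}, and A is differentiation. The dictionary representation of polynomials and all function names are illustrative choices.

```python
from fractions import Fraction

# Polynomials are stored as {power: coefficient} dictionaries (illustrative choice).

def derivative(p):
    """The linear map A = d/dx acting on a polynomial."""
    return {k - 1: k * c for k, c in p.items() if k > 0}

def coords(p, dim):
    """Coordinates with respect to the monomial basis {1, x, ..., x^(dim-1)}."""
    return [Fraction(p.get(k, 0)) for k in range(dim)]

def standard_matrix(A, basis_X, dim_Y):
    """m x n matrix whose k-th column is the coordinate vector [A x_k]_{B_Y}."""
    columns = [coords(A(xk), dim_Y) for xk in basis_X]
    return [[col[i] for col in columns] for i in range(dim_Y)]

def matvec(M, v):
    """Ordinary matrix-vector product."""
    return [sum(a * b for a, b in zip(row, v)) for row in M]

basis_X = [{0: 1}, {1: 1}, {2: 1}]            # B_X = {1, x, x^2}
M = standard_matrix(derivative, basis_X, 2)   # B_Y = {1, x}

p = {0: 5, 1: -3, 2: 4}                       # p(x) = 5 - 3x + 4x^2
lhs = coords(derivative(p), 2)                # [Ap]_{B_Y}
rhs = matvec(M, coords(p, 3))                 # [A]_{B_X,B_Y} [p]_{B_X}
print(M, lhs, rhs)
```

Both `lhs` and `rhs` are coordinate vectors of the same polynomial p', which is the identity $[Ax]_{B_Y} = [A]_{B_X,B_Y}[x]_{B_X}$ in this special case.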
Exercise 1.5. Extend the idea of the preceding example to show that any linear mapping $L : \ell^2(\mathbb{N}) \to \ell^2(\mathbb{N})$ (and more generally, $L : H \to K$ with H, K separable) can be realized in terms of multiplication by an (infinite but countable) matrix.
Exercise 1.6. Let A be an $m \times n$ complex matrix, which we view as a linear transformation $A : \mathbb{C}^n \to \mathbb{C}^m$. The operator norm of A depends on the choice of norm for $\mathbb{C}^n$ and $\mathbb{C}^m$. Compute an explicit formula for $\|A\|$, in terms of the entries of A, when the norm on $\mathbb{C}^n$ and $\mathbb{C}^m$ is taken to be the $\ell^1$ norm. Then do the same for the $\ell^\infty$ norm. Compare your results to the version of Schur's Lemma given in Theorem 1.23.
The following example is one that we will return to many times.
Example 1.7. Let $\{e_n\}_{n\in\mathbb{N}}$ be an orthonormal basis for a separable Hilbert space H. Then we know that every $f \in H$ can be written
\[ f = \sum_{n=1}^{\infty} \langle f, e_n\rangle\, e_n. \]
Fix any sequence of scalars $\lambda = (\lambda_n)_{n\in\mathbb{N}}$, and formally define
\[ Lf = \sum_{n=1}^{\infty} \lambda_n \langle f, e_n\rangle\, e_n. \tag{1.1} \]
This is a "formal" definition because we do not know a priori that the series above will converge; in other words, equation (1.1) may not make sense for every f.
Note that if $H = \ell^2(\mathbb{N})$ and $\{e_n\}_{n\in\mathbb{N}}$ is the standard basis, then L is given by the formula
\[ Lx = (\lambda_1 x_1, \lambda_2 x_2, \dots), \qquad x = (x_1, x_2, \dots) \in \ell^2(\mathbb{N}). \]
We will show the following (the $\ell^\infty$-norm of the sequence $\lambda$ is $\|\lambda\|_\infty = \sup_n |\lambda_n|$).
(a) The series defining $Lf$ in (1.1) converges for each $f \in H$ if and only if $\lambda \in \ell^\infty$. In this case L is a bounded linear mapping of H into itself, and $\|L\| = \|\lambda\|_\infty$.
(b) If $\lambda \notin \ell^\infty$, then L defines an unbounded linear mapping from the domain
\[ \operatorname{domain}(L) = \Bigl\{ f \in H : \sum_{n=1}^{\infty} |\lambda_n \langle f, e_n\rangle|^2 < \infty \Bigr\} \tag{1.2} \]
(which is dense in H) into H.
Proof. (a) Suppose that $\lambda \in \ell^\infty$, i.e., $\lambda$ is a bounded sequence. Then for any f we have
\[ \sum_{n=1}^{\infty} |\lambda_n \langle f, e_n\rangle|^2 \le \sum_{n=1}^{\infty} \|\lambda\|_\infty^2\, |\langle f, e_n\rangle|^2 = \|\lambda\|_\infty^2\, \|f\|^2 < \infty, \]
so the series defining $Lf$ converges (because $\{e_n\}$ is an orthonormal sequence). Moreover, the preceding calculation also shows that $\|Lf\|^2 = \sum_{n=1}^{\infty} |\lambda_n \langle f, e_n\rangle|^2 \le \|\lambda\|_\infty^2\, \|f\|^2$, so we see that $\|L\| \le \|\lambda\|_\infty$. On the other hand, by orthonormality we have $Le_n = \lambda_n e_n$ (i.e., each $e_n$ is an eigenvector for L with eigenvalue $\lambda_n$). Since $\|e_n\| = 1$ and $\|Le_n\| = |\lambda_n|\, \|e_n\| = |\lambda_n|$ we conclude that
\[ \|L\| = \sup_{\|f\|=1} \|Lf\| \ge \sup_{n\in\mathbb{N}} \|Le_n\| = \sup_{n\in\mathbb{N}} |\lambda_n| = \|\lambda\|_\infty. \]
The converse direction will be covered by the proof of part (b).
(b) Suppose that $\lambda \notin \ell^\infty$, i.e., $\lambda$ is not a bounded sequence. Then we can find a subsequence $(\lambda_{n_k})_{k\in\mathbb{N}}$ such that $|\lambda_{n_k}| \ge k$ for each k. Let $c_{n_k} = \frac{1}{k}$ and define all other $c_n$ to be zero. Then $\sum_n |c_n|^2 = \sum_k \frac{1}{k^2} < \infty$, so $f = \sum_n c_n e_n$ converges (and $c_n = \langle f, e_n\rangle$). But the formal series $Lf = \sum_n \lambda_n c_n e_n$ does not converge, because
\[ \sum_{n=1}^{\infty} |c_n \lambda_n|^2 = \sum_{k=1}^{\infty} |c_{n_k} \lambda_{n_k}|^2 \ge \sum_{k=1}^{\infty} \frac{k^2}{k^2} = \infty. \]
In fact, the series defining $Lf$ in (1.1) only converges for those f which lie in the domain defined in (1.2). That domain is dense because it contains the finite span of $\{e_n\}_{n\in\mathbb{N}}$, which we know is dense in H. Further, that domain is a subspace of H (exercise), so it is an inner-product space. The map $L : \operatorname{domain}(L) \to H$ is a well-defined, linear map, so it remains only to show that it is unbounded. This follows from the facts that $e_n \in \operatorname{domain}(L)$, $\|e_n\| = 1$, and $\|Le_n\| = |\lambda_n|\, \|e_n\| = |\lambda_n|$, which is unbounded in n.
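The divergence in part (b) can be seen numerically on truncations. The sketch below is illustrative only: it assumes the specific unbounded sequence $\lambda_n = n$ and the coefficients $c_n = 1/n$ (so $n_k = k$ in the notation of the proof), and works with the first N coefficients of f in $\ell^2(\mathbb{N})$. The function names are hypothetical.

```python
import math

def apply_diagonal(lam, coeffs):
    """Coefficients of Lf from those of f: the sequence (lambda_n * c_n)."""
    return [l * c for l, c in zip(lam, coeffs)]

def norm(coeffs):
    """l^2 norm of a finitely supported coefficient sequence."""
    return math.sqrt(sum(abs(c) ** 2 for c in coeffs))

def truncated_norms(N):
    """Norms of the N-term truncations of f and of the formal series Lf."""
    lam = [float(n) for n in range(1, N + 1)]   # unbounded: lambda_n = n
    c = [1.0 / n for n in range(1, N + 1)]      # c_n = 1/n is square-summable
    return norm(c), norm(apply_diagonal(lam, c))

for N in (10, 100, 1000):
    norm_f, norm_Lf = truncated_norms(N)
    print(N, norm_f, norm_Lf)
```

The first norm stays below $\pi/\sqrt{6}$ for every N, while the second equals $\sqrt{N}$ up to rounding and so grows without bound, which matches the claim that this f lies outside domain(L).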
Exercise 1.8. Continuing Example 1.7, suppose that $\lambda \in \ell^\infty$ and set $\delta = \inf_n |\lambda_n|$. Prove the following.
(a) L is injective if and only if $\lambda_n \ne 0$ for every n.
(b) L is surjective if and only if $\delta > 0$ (if $\delta = 0$, use an argument similar to the one used in part (b) of Example 1.7 to show that range(L) is a proper subset of H).
(c) If $\delta = 0$ but $\lambda_n \ne 0$ for every n then range(L) is a dense but proper subspace of H.
(d) Prove that L is unitary if and only if $|\lambda_n| = 1$ for every n.
In Example 1.7, we saw an unbounded operator whose domain was a dense but proper
subspace of H. This situation is typical for unbounded operators, and we often write L : X →
Y even when L is only defined on a subset of X, as in the following example.
Example 1.9 (Differentiation). Consider the Hilbert space $H = L^2(0, 1)$, and define an operator $D : L^2(0, 1) \to L^2(0, 1)$ by $Df = f'$. Implicitly, we mean by this that D is defined on the largest domain that makes sense, namely,
\[ \operatorname{domain}(D) = \bigl\{ f \in L^2(0, 1) : f \text{ is differentiable and } f' \in L^2(0, 1) \bigr\}. \]
Note that if $f \in \operatorname{domain}(D)$, then $Df$ is well-defined, $Df \in L^2(0, 1)$, and $\|Df\|_2 < \infty$. Thus every vector in domain(D) maps to a vector in $L^2(0, 1)$ which necessarily has finite norm. Yet D is unbounded. For example, if we set $e_n(x) = e^{inx}$ then $\|e_n\|_2 = 1$, but $De_n(x) = e_n'(x) = in\,e^{inx}$ so $\|De_n\|_2 = n$. While each vector $De_n$ has finite norm, there is no upper bound to these norms. Since the $e_n$ are unit vectors, we conclude that $\|D\| = \infty$.
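As a quick numerical illustration of the norms just computed, the following sketch approximates the $L^2(0,1)$ norms of $e_n$ and $De_n$ with a midpoint rule. The quadrature rule, the sample count, and the function names are assumptions made for this illustration; the derivative is entered by hand from the formula $e_n' = in\,e_n$ rather than computed numerically.

```python
import cmath
import math

def l2_norm(g, samples=2000):
    """Midpoint-rule approximation of the L^2(0,1) norm of g."""
    h = 1.0 / samples
    total = sum(abs(g((j + 0.5) * h)) ** 2 for j in range(samples))
    return math.sqrt(total * h)

def e(n):
    """The function e_n(x) = exp(i n x)."""
    return lambda x: cmath.exp(1j * n * x)

def De(n):
    """The derivative of e_n, entered by hand: i n exp(i n x)."""
    return lambda x: 1j * n * cmath.exp(1j * n * x)

for n in (1, 10, 100):
    print(n, l2_norm(e(n)), l2_norm(De(n)))
```

The first column of norms stays at 1 while the second grows like n, so no single constant K can satisfy $\|Df\|_2 \le K\|f\|_2$ on domain(D).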
The following definitions recall the basic notions of measures and measure spaces. For full details, consult a book on real analysis.¹
Definition 1.10 (σ-Algebras, Measurable Sets and Functions). Let X be a set, and let Ω
be a collection of subsets of X. Then Ω is a σ-algebra if
(a) X ∈ Ω,
(b) If $E \in \Omega$ then $X \setminus E \in \Omega$ (i.e., $\Omega$ is closed under complements),
(c) If $E_1, E_2, \dots \in \Omega$ then $\bigcup_k E_k \in \Omega$ (i.e., $\Omega$ is closed under countable unions).
The elements of Ω are called the measurable subsets of X.
If we choose F = R then we usually allow functions on X to take extended-real values, i.e.,
f (x) is allowed to take the values ±∞. An extended-real-valued function f : X → [−∞, ∞]
is called a measurable function if {x ∈ X : f (x) > a} is measurable for each a ∈ R.
If we choose F = C then we require functions on X to take (finite) complex values—there
is no complex analogue of ±∞. A complex-valued function f : X → C is called a measurable
function if its real and imaginary parts are measurable (real-valued) functions.
Definition 1.11 (Measure Space). Let X be a set and Ω a σ-algebra of subsets of X. Then
a function µ on Ω is a (positive) measure if
(a) $0 \le \mu(E) \le +\infty$ for all $E \in \Omega$,
(b) If $E_1, E_2, \dots$ is a countable family of disjoint sets in $\Omega$, then
\[ \mu\Bigl( \bigcup_{k=1}^{\infty} E_k \Bigr) = \sum_{k=1}^{\infty} \mu(E_k). \]
In this case, $(X, \Omega, \mu)$ is called a measure space.
If $\mu(X) < \infty$, then we say that $\mu$ is a finite measure.
If there exist countably many subsets $E_1, E_2, \dots$ such that $X = \bigcup_k E_k$ and $\mu(E_k) < \infty$ for all k, then we say that $\mu$ is σ-finite. For example, Lebesgue measure on $\mathbb{R}^n$ is σ-finite.
¹ For example, R. Wheeden and A. Zygmund, "Measure and Integral," Marcel Dekker, 1977, or G. Folland, "Real Analysis," Second Edition, Wiley, 1999.
It is often useful to allow measures to take negative values.
Definition 1.12 (Signed Measure). Let X be a set and Ω a σ-algebra of subsets of X. Then
a function µ on Ω is a signed measure if
(a) $-\infty \le \mu(E) \le +\infty$ for all $E \in \Omega$ and $\mu(\emptyset) = 0$,
(b) If $E_1, E_2, \dots$ is a countable family of disjoint sets in $\Omega$, then
\[ \mu\Bigl( \bigcup_{k=1}^{\infty} E_k \Bigr) = \sum_{k=1}^{\infty} \mu(E_k). \]
Definition 1.13 (Integration). Let (X, Ω, µ) be a measure space.
(a) If $f : X \to [0, \infty]$ is a nonnegative, measurable function, then the integral of f over X with respect to $\mu$ is
\[ \int_X f \, d\mu = \int_X f(x) \, d\mu(x) = \sup \Bigl\{ \sum_{j} \Bigl( \inf_{x \in E_j} f(x) \Bigr) \mu(E_j) \Bigr\}, \]
where the supremum is taken over all decompositions $X = E_1 \cup \cdots \cup E_N$ of X as the union of a finite number of disjoint measurable sets $E_j$ (and where we take the convention that $\infty \cdot 0 = 0 \cdot \infty = 0$).
(b) If $f : X \to [-\infty, \infty]$ and we define
\[ f^+(x) = \max\{f(x), 0\}, \qquad f^-(x) = -\min\{f(x), 0\}, \]
then
\[ \int_X f \, d\mu = \int_X f^+ \, d\mu - \int_X f^- \, d\mu, \]
as long as this does not have the form $\infty - \infty$ (in that case the integral would be undefined). Since $|f| = f^+ + f^-$ and $\int_X |f| \, d\mu$ always exists (either as a finite number or as $\infty$), it follows that
\[ \int_X f \, d\mu \text{ exists and is finite} \iff \int_X |f| \, d\mu < \infty. \]
(c) If $f : X \to \mathbb{C}$, then
\[ \int_X f \, d\mu = \int_X \operatorname{Re}(f) \, d\mu + i \int_X \operatorname{Im}(f) \, d\mu, \]
as long as both integrals on the right are defined and finite.
There are many other equivalent definitions of the integral.
Definition 1.14 ($L^p$ Spaces). Let $(X, \Omega, \mu)$ be a measure space, and fix $1 \le p < \infty$. Then $L^p(X)$ consists of all measurable functions $f : X \to [-\infty, \infty]$ (if we choose $\mathbb{F} = \mathbb{R}$) or $f : X \to \mathbb{C}$ (if we choose $\mathbb{F} = \mathbb{C}$) such that
\[ \|f\|_p^p = \int_X |f(x)|^p \, d\mu(x) < \infty. \]
Then $L^p(X)$ is a vector space under the operations of addition of functions and multiplication of a function by a scalar. Additionally, the function $\|\cdot\|_p$ defines a semi-norm on $L^p(X)$. Usually we identify functions that are equal almost everywhere (we say that $f = g$ a.e. if $\mu\{x \in X : f(x) \ne g(x)\} = 0$), and then $\|\cdot\|_p$ becomes a norm on $L^p(X)$.
For $p = \infty$ we define $L^\infty(X)$ to be the set of measurable functions that are essentially bounded, i.e., for which there exists a finite constant M such that $|f(x)| \le M$ a.e. Then
\[ \|f\|_\infty = \operatorname*{ess\,sup}_{x \in X} |f(x)| = \inf \bigl\{ M \ge 0 : |f(x)| \le M \text{ a.e.} \bigr\} \]
is a semi-norm on $L^\infty(X)$, and is a norm if we identify functions that are equal almost everywhere.
For each $1 \le p \le \infty$, the space $L^p(X)$ is a Banach space under the above norm.
Exercise 1.15 ($\ell^p$ Spaces). Counting measure on a set X is defined by $\mu(E) = \operatorname{card}(E)$ if E is a finite subset of X, and $\mu(E) = \infty$ if E is an infinite subset. Let $\Omega = \mathcal{P}(X)$ (the set of all subsets of X), and show that $(X, \Omega, \mu)$ is a measure space. Show that $L^p(X, \Omega, \mu) = \ell^p(X)$. Show that $\mu$ is σ-finite if and only if X is countable.
Exercise 1.16 (The Delta Measure). Let $X = \mathbb{R}^n$ and $\Omega = \mathcal{P}(X)$. Define $\delta(E) = 1$ if $0 \in E$ and $\delta(E) = 0$ if $0 \notin E$. Prove that $\delta$ is a measure, and find a formula for
\[ \int_{\mathbb{R}^n} f(x) \, d\delta(x). \]
Sometimes this integral is written informally as $\int_{\mathbb{R}^n} f(x)\, \delta(x) \, dx$, but note that $\delta$ is a measure on $\mathbb{R}^n$, not a function on $\mathbb{R}^n$ (see also Exercise 1.26 below).
Exercise 1.17. Fix $0 \le g \in L^1(\mathbb{R}^n)$, under Lebesgue measure. Prove that $\mu(E) = \int_E g(x) \, dx$ defines a finite measure on $\mathbb{R}^n$.
With this preparation, we can give some additional examples of operators on Banach or
Hilbert spaces.
Example 1.18 (Multiplication Operators). Let $(X, \Omega, \mu)$ be a measure space, and let $\phi \in L^\infty(X)$ be a fixed measurable function. Then for any $f \in L^2(X)$ we have that $f\phi$ is measurable, and
\[ \|f\phi\|_2^2 = \int_X |f(x)\, \phi(x)|^2 \, d\mu(x) \le \int_X |f(x)|^2\, \|\phi\|_\infty^2 \, d\mu(x) = \|\phi\|_\infty^2\, \|f\|_2^2 < \infty, \]
so $f\phi \in L^2(X)$. Therefore, the multiplication operator $M_\phi : L^2(X) \to L^2(X)$ given by $M_\phi f = f\phi$ is well-defined, and the calculation above shows that $\|M_\phi f\|_2 \le \|\phi\|_\infty\, \|f\|_2$. Therefore $M_\phi$ is bounded, and $\|M_\phi\| \le \|\phi\|_\infty$.
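If we assume that $\mu$ is σ-finite, then we can show that $\|M_\phi\| = \|\phi\|_\infty$, as follows. Choose any $\varepsilon > 0$. Then by definition of the $L^\infty$-norm, the set $E = \{x \in X : |\phi(x)| > \|\phi\|_\infty - \varepsilon\}$ has positive measure. Since $\mu$ is σ-finite, we can write $X = \bigcup_m F_m$ where each $\mu(F_m) < \infty$. Since $E = \bigcup_m (E \cap F_m)$ is a countable union, we must have $\mu(E \cap F_m) > 0$ for some m. Let $F = E \cap F_m$, and set $f = \mu(F)^{-1/2} \chi_F$. Then $\|f\|_2 = 1$, but $\|M_\phi f\|_2 \ge (\|\phi\|_\infty - \varepsilon)\, \|f\|_2$. Hence $\|M_\phi\| \ge \|\phi\|_\infty - \varepsilon$, and since $\varepsilon > 0$ is arbitrary, $\|M_\phi\| \ge \|\phi\|_\infty$.
Exercise: Find an example of a measure $\mu$ that is not σ-finite and a function $\phi$ such that $\|M_\phi\| < \|\phi\|_\infty$.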
Exercise 1.19. Let (X, Ω, µ) be a measure space, and let φ be a fixed measurable function.
Prove that if f φ ∈ L2 (X) for every f ∈ L2 (X), then we must have φ ∈ L∞ (X).
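Solution. Assume $\phi \notin L^\infty(X)$. Set
\[ E_k = \{x \in X : k \le |\phi(x)| < k + 1\}. \]
The $E_k$ are measurable and disjoint, and since $\phi$ is not in $L^\infty(X)$ there must be infinitely many $E_k$ with positive measure. Choose any $E_{n_k}$, $k \in \mathbb{N}$, all with positive measure (with $n_1 < n_2 < \cdots$ and $n_k \ge k$, so that $|\phi| \ge k$ on $E_{n_k}$) and let $E = \bigcup_k E_{n_k}$. Define
\[ f(x) = \begin{cases} \dfrac{1}{k\, \mu(E_{n_k})^{1/2}}, & x \in E_{n_k}, \\[1ex] 0, & x \notin E. \end{cases} \]
Then
\[ \int_X |f|^2 \, d\mu = \sum_{k=1}^{\infty} \int_{E_{n_k}} \frac{1}{k^2\, \mu(E_{n_k})} \, d\mu = \sum_{k=1}^{\infty} \frac{1}{k^2} < \infty, \]
but
\[ \int_X |f\phi|^2 \, d\mu \ge \sum_{k=1}^{\infty} \int_{E_{n_k}} \frac{k^2}{k^2\, \mu(E_{n_k})} \, d\mu = \sum_{k=1}^{\infty} 1 = \infty, \]
which is a contradiction.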
Exercise 1.20. Continuing Example 1.18, do the following.
(a) Determine a necessary and sufficient condition on φ which implies that Mφ : L2 (X) →
L2 (X) is injective.
(b) Determine a necessary and sufficient condition on φ which implies that Mφ : L2 (X) →
L2 (X) is surjective.
(c) Prove that if $M_\phi$ is injective but not surjective then $M_\phi^{-1} : \operatorname{range}(M_\phi) \to L^2(X)$ is unbounded.
(d) Extend from the case $p = 2$ to any $1 \le p \le \infty$.
Example 1.21 (Integral Operators). Let $(X, \Omega, \mu)$ be a σ-finite measure space. An integral operator is an operator of the form
\[ Lf(x) = \int_X k(x, y)\, f(y) \, d\mu(y). \tag{1.3} \]
This is just a formal definition; we have to provide conditions under which it makes sense, and the following two theorems provide such conditions. The function k that determines the operator is called the kernel of the operator (not to be confused with the kernel/nullspace of the operator!).
Note that an integral operator is just a generalization of matrix multiplication. For, if A is an $m \times n$ matrix with entries $a_{ij}$ and $u \in \mathbb{C}^n$, then $Au \in \mathbb{C}^m$, and its components are given by
\[ (Au)_i = \sum_{j=1}^{n} a_{ij} u_j, \qquad i = 1, \dots, m. \]
Thus, the values $k(x, y)$ are analogous to the entries $a_{ij}$ of the matrix A, and the values $Lf(x)$ are analogous to the entries $(Au)_i$.
The following result shows that if the kernel is square-integrable, then the corresponding
integral operator is bounded. Later we will define the notion of a Hilbert–Schmidt operator.
For the case of integral operators mapping L2 (X) into itself, it can be shown that L is a
Hilbert–Schmidt operator if and only if the kernel k belongs to L2 (X × X).
Theorem 1.22 (Hilbert–Schmidt Integral Operators). Let $(X, \Omega, \mu)$ be a σ-finite measure space, and choose a kernel $k \in L^2(X \times X)$. That is, assume that
\[ \|k\|_2^2 = \int_X \int_X |k(x, y)|^2 \, d\mu(x) \, d\mu(y) < \infty. \]
Then the integral operator given by (1.3) defines a bounded mapping of $L^2(X)$ into itself, and $\|L\| \le \|k\|_2$.
Proof. Although a slight abuse of the order of logic (technically we should show $Lf$ exists before trying to compute its norm), the following calculation shows that L is well-defined and is a bounded mapping of $L^2(X)$ into itself:
\begin{align*}
\|Lf\|_2^2
  &= \int_X |Lf(x)|^2 \, d\mu(x) \\
  &= \int_X \Bigl| \int_X k(x, y)\, f(y) \, d\mu(y) \Bigr|^2 d\mu(x) \\
  &\le \int_X \Bigl( \int_X |k(x, y)|^2 \, d\mu(y) \Bigr) \Bigl( \int_X |f(y)|^2 \, d\mu(y) \Bigr) d\mu(x) \\
  &= \int_X \int_X |k(x, y)|^2 \, d\mu(y)\, \|f\|_2^2 \, d\mu(x) \\
  &= \|k\|_2^2\, \|f\|_2^2,
\end{align*}
where the inequality follows by applying Cauchy–Schwarz to the inner integral. Thus L is bounded, and $\|L\| \le \|k\|_2$.
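In the matrix analogy of Example 1.21 (counting measure on a finite set), the bound of Theorem 1.22 says that $\|Au\|_2 \le (\sum_{i,j} |a_{ij}|^2)^{1/2}\, \|u\|_2$. The sketch below checks this on random test vectors. The matrix, the number of trials, the seed, and the function names are all illustrative assumptions, and random sampling only gives a lower estimate of the operator norm, not its exact value.

```python
import math
import random

def matvec(A, u):
    """Matrix-vector product (Au)_i = sum_j a_ij u_j."""
    return [sum(a * x for a, x in zip(row, u)) for row in A]

def norm2(u):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(abs(x) ** 2 for x in u))

def hilbert_schmidt_norm(A):
    """(sum_{i,j} |a_ij|^2)^{1/2}, the discrete analogue of ||k||_2."""
    return math.sqrt(sum(abs(a) ** 2 for row in A for a in row))

def largest_observed_ratio(A, trials=1000, seed=0):
    """Largest value of ||Au|| / ||u|| seen over random test vectors u."""
    rng = random.Random(seed)
    best = 0.0
    for _ in range(trials):
        u = [rng.gauss(0, 1) for _ in range(len(A[0]))]
        best = max(best, norm2(matvec(A, u)) / norm2(u))
    return best

A = [[1.0, 2.0, 0.0], [0.0, -1.0, 3.0], [4.0, 0.0, 1.0]]
print(largest_observed_ratio(A), "<=", hilbert_schmidt_norm(A))
```

The inequality is typically strict: the Hilbert–Schmidt norm is only an upper bound for the operator norm.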
The following result is one version of Schur's Lemma. There are many forms of Schur's Lemma; this is one particular special case.
Exercise: Compare the hypotheses of the following result to the operator norms you
calculated in Exercise 1.6.
Theorem 1.23. Let $(X, \Omega, \mu)$ be a σ-finite measure space, and assume that k is a measurable function on $X \times X$ which satisfies the "mixed-norm" conditions
\[ C_1 = \operatorname*{ess\,sup}_{x \in X} \int_X |k(x, y)| \, d\mu(y) < \infty \quad \text{and} \quad C_2 = \operatorname*{ess\,sup}_{y \in X} \int_X |k(x, y)| \, d\mu(x) < \infty. \]
Then the integral operator given by (1.3) defines a bounded mapping of $L^2(X)$ into itself, and $\|L\| \le (C_1 C_2)^{1/2}$.
Proof. Choose any $f \in L^2(X)$. Then, by applying the Cauchy–Schwarz inequality, we have
\begin{align*}
\|Lf\|_2^2
  &= \int_X |Lf(x)|^2 \, d\mu(x) \\
  &= \int_X \Bigl| \int_X k(x, y)\, f(y) \, d\mu(y) \Bigr|^2 d\mu(x) \\
  &\le \int_X \Bigl( \int_X |k(x, y)|^{1/2}\, |k(x, y)|^{1/2}\, |f(y)| \, d\mu(y) \Bigr)^2 d\mu(x) \\
  &\le \int_X \Bigl( \int_X |k(x, y)| \, d\mu(y) \Bigr) \Bigl( \int_X |k(x, y)|\, |f(y)|^2 \, d\mu(y) \Bigr) d\mu(x) \\
  &\le \int_X C_1 \int_X |k(x, y)|\, |f(y)|^2 \, d\mu(y) \, d\mu(x) \\
  &= C_1 \int_X |f(y)|^2 \int_X |k(x, y)| \, d\mu(x) \, d\mu(y) \\
  &\le C_1 \int_X |f(y)|^2\, C_2 \, d\mu(y) \\
  &= C_1 C_2\, \|f\|_2^2,
\end{align*}
where we have used Tonelli's Theorem to interchange the order of integration (here is where we needed the fact that $\mu$ is σ-finite). Thus L is bounded and $\|L\| \le (C_1 C_2)^{1/2}$.
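For a matrix (counting measure on a finite set), $C_1$ becomes the largest absolute row sum and $C_2$ the largest absolute column sum. The following sketch compares the resulting bound $(C_1 C_2)^{1/2}$ with the Hilbert–Schmidt bound of Theorem 1.22 on a tridiagonal test matrix. The choice of matrix, the sampling parameters, and the function names are assumptions made for this illustration.

```python
import math
import random

def matvec(A, u):
    """Matrix-vector product."""
    return [sum(a * x for a, x in zip(row, u)) for row in A]

def norm2(u):
    """Euclidean norm of a real vector."""
    return math.sqrt(sum(x * x for x in u))

def schur_bound(A):
    """(C1 * C2)^{1/2}: C1 = largest absolute row sum, C2 = largest absolute column sum."""
    C1 = max(sum(abs(a) for a in row) for row in A)
    C2 = max(sum(abs(row[j]) for row in A) for j in range(len(A[0])))
    return math.sqrt(C1 * C2)

def hilbert_schmidt_bound(A):
    """Square root of the sum of squared entries (Theorem 1.22 analogue)."""
    return math.sqrt(sum(a * a for row in A for a in row))

def largest_observed_ratio(A, trials=500, seed=1):
    """Largest ||Au|| / ||u|| seen over random test vectors (a lower estimate of ||A||)."""
    rng = random.Random(seed)
    best = 0.0
    for _ in range(trials):
        u = [rng.gauss(0, 1) for _ in range(len(A[0]))]
        best = max(best, norm2(matvec(A, u)) / norm2(u))
    return best

n = 6
T = [[2.0 if i == j else (-1.0 if abs(i - j) == 1 else 0.0) for j in range(n)]
     for i in range(n)]
print(largest_observed_ratio(T), schur_bound(T), hilbert_schmidt_bound(T))
```

For this sparse matrix the Schur bound is 4 while the Hilbert–Schmidt bound is $\sqrt{34}$, so neither theorem subsumes the other in general; which bound is sharper depends on the kernel.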
Exercise 1.24. Consider what happens in the preceding example if we take 1 ≤ p ≤ ∞
instead of p = 2. In particular, in part b, show that if C1 , C2 < ∞ then L : Lp (X) → Lp (X)
is a bounded mapping for each 1 ≤ p ≤ ∞ (try to do p = 1 or p = ∞ first).
Exercise 1.25 (Volterra Operator). Define $L : L^2[0, 1] \to L^2[0, 1]$ by
\[ Lf(x) = \int_0^x f(y) \, dy. \]
Show directly that L is bounded. Then show that L is an integral operator with kernel $k : [0, 1]^2 \to \mathbb{F}$ defined by
\[ k(x, y) = \begin{cases} 1, & y \le x, \\ 0, & y > x. \end{cases} \]
Observe that $k \in L^2([0, 1]^2)$, so L is compact. This operator is called the Volterra operator.
Exercise 1.26 (Convolution). Convolution is one of the most important examples of integral operators. Consider the case of Lebesgue measure on $\mathbb{R}^n$. Given functions f, g on $\mathbb{R}^n$, their convolution is the function $f * g$ defined by
\[ (f * g)(x) = \int_{\mathbb{R}^n} f(y)\, g(x - y) \, dy, \]
provided that the integral makes sense. Note that with g fixed, the mapping $f \mapsto f * g$ is an integral operator with kernel $k(x, y) = g(x - y)$.
(a) Let $g \in L^1(\mathbb{R}^n)$ be fixed. Use Schur's Lemma (Theorem 1.23) to show that $Lf = f * g$ is a bounded mapping of $L^2(\mathbb{R}^n)$ into itself. In fact, use Exercise 1.24 to prove Young's Inequality: If $f \in L^p(\mathbb{R}^n)$ ($1 \le p \le \infty$) and $g \in L^1(\mathbb{R}^n)$, then $f * g \in L^p(\mathbb{R}^n)$, and
\[ \|f * g\|_p \le \|f\|_p\, \|g\|_1. \]
In particular, $L^1(\mathbb{R}^n)$ is closed under convolution.
(b) Note that we cannot use the Hilbert–Schmidt condition (Theorem 1.22) to prove Young's Inequality, since
\[ \int_{\mathbb{R}^n} \int_{\mathbb{R}^n} |g(x - y)|^2 \, dx \, dy = \infty, \]
even if we assume that $g \in L^2(\mathbb{R}^n)$.
(c) Prove that convolution is commutative, i.e., that $f * g = g * f$.
(d) Prove that there is no identity element in $L^1(\mathbb{R}^n)$, i.e., there is no function $g \in L^1(\mathbb{R}^n)$ such that $f * g = f$ for all $f \in L^1(\mathbb{R}^n)$. This is not trivial; it is easier to do if you make use of the Fourier transform on $\mathbb{R}^n$, and in particular use the Riemann–Lebesgue Lemma to derive a contradiction.
(e) Some texts do talk informally about a "delta function" that is an identity element for convolution, defined by the conditions
\[ \delta(x) = \begin{cases} \infty, & x = 0, \\ 0, & x \ne 0, \end{cases} \qquad \text{and} \qquad \int_{\mathbb{R}^n} \delta(x) \, dx = 1, \]
but no such function actually exists. In particular, the function $\delta$ defined on the left-hand side of the line above is equal to zero a.e., and hence is the zero function as far as Lebesgue integration is concerned. That is, we have $\int_{\mathbb{R}^n} \delta(x) \, dx = 0$, not 1. The "delta function" is really just an informal use of the delta distribution (see Exercise 1.3) or the delta measure (see Exercise 1.16). Show that if we define the convolution of a function f with the delta measure $\delta$ to be
\[ (f * \delta)(x) = \int_{\mathbb{R}^n} f(x - y) \, d\delta(y), \tag{1.4} \]
then $f * \delta = f$ for all $f \in L^1(\mathbb{R}^n)$. Note that in the "informal" notation of Exercise 1.16, (1.4) reads
\[ (f * \delta)(x) = \int_{\mathbb{R}^n} f(x - y)\, \delta(y) \, dy, \]
which perhaps explains the use of the term "delta function."
Exercise 1.27. Prove that $L^1(\mathbb{R}^n)$ is not closed under pointwise multiplication. That is, prove that there exist f, $g \in L^1(\mathbb{R}^n)$ such that the pointwise product $h(x) = (fg)(x) = f(x)g(x)$ does not belong to $L^1(\mathbb{R}^n)$.
Exercise 1.28 (Convolution Continued). (a) Consider the space $L^p[0, 1]$, where we think of functions in $L^p[0, 1]$ as being extended 1-periodically to the real line. Define convolution on the circle by
\[ (f * g)(x) = \int_0^1 f(y)\, g(x - y) \, dy, \]
where the periodicity is used to define $g(x - y)$ when $x - y$ lies outside $[0, 1]$ (equivalently, replace $x - y$ by $x - y \bmod 1$, the fractional part of $x - y$). Prove a version of Young's Inequality for $L^p[0, 1]$.
(b) Consider the sequence space $\ell^p(\mathbb{Z})$. Define convolution on $\mathbb{Z}$ by
\[ (x * y)_n = \sum_{m \in \mathbb{Z}} x_m\, y_{n-m}. \]
Prove a version of Young's Inequality for $\ell^p(\mathbb{Z})$.
Prove that $\ell^1(\mathbb{Z})$ contains an identity element with respect to convolution, i.e., there exists a sequence in $\ell^1(\mathbb{Z})$ (typically denoted $\delta$) such that $\delta * x = x$ for every $x \in \ell^p(\mathbb{Z})$.
(c) Identify the essential features needed to define convolution on more general domains, and prove a version of Young's Inequality for that setting.
Exercise 1.29 (Convolution and the Fourier Transform). Let $\mathcal{F}$ be the Fourier transform on the circle, i.e., it is the isomorphism $\mathcal{F} : L^2[0, 1] \to \ell^2(\mathbb{Z})$ given by $\mathcal{F}f = \hat{f} = \{\hat{f}(n)\}_{n\in\mathbb{Z}}$, where
\[ \hat{f}(n) = \langle f, e_n\rangle = \int_0^1 f(x)\, e^{-2\pi i n x} \, dx, \qquad e_n(x) = e^{2\pi i n x}. \]
(a) Prove that the Fourier transform converts convolution into multiplication. That is, prove that if f, $g \in L^2[0, 1]$, then $(f * g)^\wedge = \hat{f}\, \hat{g}$, i.e.,
\[ (f * g)^\wedge(n) = \hat{f}(n)\, \hat{g}(n), \qquad n \in \mathbb{Z}. \]
(b) Note that if $g \in L^2[0, 1]$, then $g \in L^1[0, 1]$, so by Young's Inequality we have that $f * g \in L^2[0, 1]$. Holding g fixed, define an operator $L : L^2[0, 1] \to L^2[0, 1]$ by $Lf = f * g$. Since $\{e_n\}_{n\in\mathbb{Z}}$ is an orthonormal basis for $L^2[0, 1]$, we have
\[ f = \sum_{n \in \mathbb{Z}} \hat{f}(n)\, e_n, \qquad f \in L^2[0, 1]. \]
Show that
\[ Lf = f * g = \sum_{n \in \mathbb{Z}} \hat{g}(n)\, \hat{f}(n)\, e_n, \qquad f \in L^2[0, 1]. \]
Thus, in the "Fourier domain," convolution acts by changing or adjusting the amount that each "component" or "frequency" $e_n$ contributes to the representation of the function in this basis: the weight $\hat{f}(n)$ for frequency n is replaced by the weight $\hat{g}(n)\, \hat{f}(n)$. Explain why this says that L is analogous to multiplication by a diagonal operator. In engineering parlance, convolution is also referred to as filtering. Explain why this terminology is appropriate. Compare this operator L to Example 1.7.
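A finite analogue of part (a) can be checked directly. The sketch below replaces the circle by the cyclic group of N points, the integral by a finite sum, and the Fourier transform by the (unnormalized) discrete Fourier transform; these substitutions, the sample vectors, and the function names are assumptions made for illustration and are not the setting of the exercise itself.

```python
import cmath

def dft(x):
    """Unnormalized discrete Fourier transform: X[n] = sum_m x[m] exp(-2 pi i n m / N)."""
    N = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * n * m / N) for m in range(N))
            for n in range(N)]

def circular_convolve(x, y):
    """(x * y)[n] = sum_m x[m] y[(n - m) mod N], the periodic analogue of convolution."""
    N = len(x)
    return [sum(x[m] * y[(n - m) % N] for m in range(N)) for n in range(N)]

x = [1.0, 2.0, 0.0, -1.0, 3.0, 0.5]
y = [0.5, 0.0, 1.0, 0.0, -2.0, 1.0]

lhs = dft(circular_convolve(x, y))              # transform of the convolution
rhs = [a * b for a, b in zip(dft(x), dft(y))]   # pointwise product of transforms
max_err = max(abs(a - b) for a, b in zip(lhs, rhs))
print(max_err)
```

The two sides agree up to rounding, so in the transformed coordinates convolution with y acts entry by entry, in the same way the operator of Example 1.7 multiplies the n-th coefficient by $\lambda_n$.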
2. The Adjoint of an Operator
Example 2.1. Note that the dot product on $\mathbb{R}^n$ is given by $x \cdot y = x^T y$, while the dot product on $\mathbb{C}^n$ is $x \cdot y = x^T \bar{y}$.
Let A be an $m \times n$ real matrix. Then $x \mapsto Ax$ defines a linear map of $\mathbb{R}^n$ into $\mathbb{R}^m$, and its transpose $A^T$ satisfies
\[ Ax \cdot y = (Ax)^T y = x^T A^T y = x \cdot (A^T y), \qquad x \in \mathbb{R}^n,\ y \in \mathbb{R}^m. \]
Similarly, if A is an $m \times n$ complex matrix, then its Hermitian or adjoint matrix $A^H = \overline{A^T}$ satisfies
\[ Ax \cdot y = (Ax)^T \bar{y} = x^T A^T \bar{y} = x \cdot (A^H y), \qquad x \in \mathbb{C}^n,\ y \in \mathbb{C}^m. \]
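The complex identity can be verified on a small example. The following sketch uses a hypothetical 2 × 3 complex matrix and vectors chosen only for illustration; the dot product follows the convention $x \cdot y = x^T \bar{y}$ stated in the example.

```python
def conj_transpose(A):
    """The Hermitian (conjugate transpose) matrix A^H."""
    return [[A[i][j].conjugate() for i in range(len(A))] for j in range(len(A[0]))]

def matvec(A, u):
    """Matrix-vector product."""
    return [sum(a * x for a, x in zip(row, u)) for row in A]

def dot(x, y):
    """Dot product on C^n: x . y = x^T conj(y)."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

A = [[1 + 2j, 0 - 1j, 3 + 0j],
     [2 - 1j, 1 + 1j, 0 + 4j]]       # a 2 x 3 complex matrix
x = [1 - 1j, 2 + 0j, 0 + 3j]         # vector in C^3
y = [2 + 1j, -1 + 1j]                # vector in C^2

lhs = dot(matvec(A, x), y)                   # Ax . y
rhs = dot(x, matvec(conj_transpose(A), y))   # x . (A^H y)
print(lhs, rhs)
```

Replacing `conj_transpose` by the plain transpose would in general break the equality, which is why the conjugate appears in the complex case.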
Theorem 2.2 (Adjoint). Let H and K be Hilbert spaces, and let $A : H \to K$ be a bounded, linear map. Then there exists a unique bounded linear map $A^* : K \to H$ such that
\[ \langle Ax, y\rangle = \langle x, A^* y\rangle \qquad \text{for all } x \in H,\ y \in K. \]
Proof. Fix $y \in K$. Then $Lx = \langle Ax, y\rangle$ is a bounded linear functional on H. By the Riesz Representation Theorem, there exists a unique vector $h \in H$ such that
\[ \langle Ax, y\rangle = Lx = \langle x, h\rangle. \]
Define $A^* y = h$. Verify that this map $A^*$ is linear (exercise). To see that it is bounded, observe that
\begin{align*}
\|A^* y\| = \|h\|
  &= \sup_{\|x\|=1} |\langle x, h\rangle| \\
  &= \sup_{\|x\|=1} |\langle Ax, y\rangle| \\
  &\le \sup_{\|x\|=1} \|Ax\|\, \|y\| \\
  &\le \sup_{\|x\|=1} \|A\|\, \|x\|\, \|y\| = \|A\|\, \|y\|.
\end{align*}
We conclude that $A^*$ is bounded, and that $\|A^*\| \le \|A\|$.
Finally, we must show that $A^*$ is unique. Suppose that $B \in B(K, H)$ also satisfied $\langle Ax, y\rangle = \langle x, By\rangle$ for all $x \in H$ and $y \in K$. Then for each fixed y we would have that $\langle x, By - A^* y\rangle = 0$ for every x, which implies $By - A^* y = 0$. Hence $B = A^*$.
Exercise 2.3 (Properties of the adjoint).
(a) If $A \in B(H, K)$ then $(A^*)^* = A$.
(b) If A, $B \in B(H, K)$ and $\alpha, \beta \in \mathbb{F}$, then $(\alpha A + \beta B)^* = \bar{\alpha} A^* + \bar{\beta} B^*$.
(c) If $A \in B(H_1, H_2)$ and $B \in B(H_2, H_3)$, then $(BA)^* = A^* B^*$.
(d) If $A \in B(H)$ is invertible in $B(H)$ (meaning that there exists $A^{-1} \in B(H)$ such that $AA^{-1} = A^{-1}A = I$), then $A^*$ is invertible in $B(H)$ and $(A^{-1})^* = (A^*)^{-1}$.
Remark 2.4. Later we will prove the Open Mapping Theorem. A remarkable consequence of this theorem is that if X and Y are Banach spaces and $A : X \to Y$ is a bounded bijection, then $A^{-1} : Y \to X$ is automatically bounded.
Proposition 2.5. If $A \in B(H, K)$, then $\|A\| = \|A^*\| = \|A^* A\|^{1/2} = \|AA^*\|^{1/2}$.
Proof. In the course of proving Theorem 2.2, we already showed that $\|A^*\| \le \|A\|$. If $f \in H$, then
\[ \|Af\|^2 = \langle Af, Af\rangle = \langle A^* Af, f\rangle \le \|A^* Af\|\, \|f\| \le \|A^*\|\, \|Af\|\, \|f\|. \tag{2.1} \]
Hence $\|Af\| \le \|A^*\|\, \|f\|$ (even if $\|Af\| = 0$, this is still true). Since this is true for all f we conclude that $\|A\| \le \|A^*\|$. Therefore $\|A\| = \|A^*\|$.
Next, we have $\|A^* A\| \le \|A\|\, \|A^*\| = \|A\|^2$. But also, from the calculation in (2.1), we have $\|Af\|^2 \le \|A^* Af\|\, \|f\|$. Taking the supremum over all unit vectors, we obtain
\[ \|A\|^2 = \sup_{\|f\|=1} \|Af\|^2 \le \sup_{\|f\|=1} \|A^* Af\|\, \|f\| = \|A^* A\|. \]
Consequently $\|A\|^2 = \|A^* A\|$. The final equality follows by interchanging the roles of A and $A^*$.
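The identities of Proposition 2.5 can be observed numerically in the simplest setting. The sketch below assumes $H = K = \mathbb{R}^2$ (so $A^* = A^T$), a particular 2 × 2 matrix chosen for illustration, and approximates each operator norm by sampling unit vectors $(\cos t, \sin t)$ on a fine grid; the grid size and the function names are hypothetical.

```python
import math

def op_norm(A, samples=20000):
    """Approximate sup ||Af|| over unit vectors f = (cos t, sin t) in R^2."""
    best = 0.0
    for j in range(samples):
        t = math.pi * j / samples
        c, s = math.cos(t), math.sin(t)
        best = max(best, math.hypot(A[0][0] * c + A[0][1] * s,
                                    A[1][0] * c + A[1][1] * s))
    return best

def transpose(A):
    """Transpose of a 2 x 2 matrix (the adjoint over the reals)."""
    return [[A[0][0], A[1][0]], [A[0][1], A[1][1]]]

def matmul(A, B):
    """Product of two 2 x 2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[1.0, 2.0], [0.0, 1.0]]
At = transpose(A)
norm_A = op_norm(A)
norm_At = op_norm(At)
norm_AtA = op_norm(matmul(At, A))
norm_AAt = op_norm(matmul(A, At))
print(norm_A, norm_At, math.sqrt(norm_AtA), math.sqrt(norm_AAt))
```

All four printed numbers agree up to the grid error; for this matrix the common value is $1 + \sqrt{2}$.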
Exercise 2.6. Prove that if $U \in B(H, K)$, then U is an isomorphism if and only if U is invertible and $U^{-1} = U^*$.
Exercise 2.7. (a) Let $\lambda = (\lambda_n)_{n\in\mathbb{N}} \in \ell^\infty(\mathbb{N})$ be given and let L be defined as in Example 1.7. Find $L^*$.
(b) Prove that the adjoint of the multiplication operator $M_\phi$ defined in Example 1.18 is the multiplication operator $M_{\bar{\phi}}$.
Exercise 2.8. Let L and R be the left- and right-shift operators on $\ell^2(\mathbb{N})$, i.e.,
\[ L(x_1, x_2, \dots) = (x_2, x_3, \dots) \qquad \text{and} \qquad R(x_1, x_2, \dots) = (0, x_1, x_2, \dots). \]
Prove that $L = R^*$.
Example 2.9. Let L be the integral operator defined in (1.3), determined by the kernel function k. Assume that k is chosen so that $L : L^2(X) \to L^2(X)$ is bounded. The adjoint is the unique operator $L^* : L^2(X) \to L^2(X)$ which satisfies
\[ \langle Lf, g\rangle = \langle f, L^* g\rangle, \qquad f, g \in L^2(X). \]
To find $L^*$, let $A : L^2(X) \to L^2(X)$ be the integral operator with kernel $\overline{k(y, x)}$, i.e.,
\[ Af(x) = \int_X \overline{k(y, x)}\, f(y) \, d\mu(y). \]
Then, given any f and $g \in L^2(X)$, we have
\begin{align*}
\langle f, L^* g\rangle = \langle Lf, g\rangle
  &= \int_X Lf(x)\, \overline{g(x)} \, d\mu(x) \\
  &= \int_X \int_X k(x, y)\, f(y) \, d\mu(y)\, \overline{g(x)} \, d\mu(x) \\
  &= \int_X f(y) \int_X k(x, y)\, \overline{g(x)} \, d\mu(x) \, d\mu(y) \\
  &= \int_X f(y)\, \overline{\int_X \overline{k(x, y)}\, g(x) \, d\mu(x)} \, d\mu(y) \\
  &= \int_X f(y)\, \overline{Ag(y)} \, d\mu(y) \\
  &= \langle f, Ag\rangle.
\end{align*}
By uniqueness of the adjoint, we must have $L^* = A$.
Exercise: Justify the interchange in the order of integration in the above calculation, i.e., provide hypotheses under which the calculations above are justified.
Exercise 2.10. Let $\{e_n\}_{n\in\mathbb{N}}$ be an orthonormal basis for a separable Hilbert space H. Define $T : H \to \ell^2(\mathbb{N})$ by $T(f) = \{\langle f, e_n\rangle\}_{n\in\mathbb{N}}$. Find a formula for $T^* : \ell^2(\mathbb{N}) \to H$.
Definition 2.11. Let A ∈ B(H).
(a) We say that A is self-adjoint or Hermitian if A = A∗ .
(b) We say that A is normal if AA∗ = A∗ A.
Example 2.12. A real n × n matrix A is self-adjoint if and only if it is symmetric, i.e.,
if A = AT . A complex n × n matrix A is self-adjoint if and only if it is Hermitian, i.e., if
A = AH .
Exercise 2.13. Show that every self-adjoint operator is normal. Show that every unitary
operator is normal, but that a unitary operator need not be self-adjoint. For H = Cn , find
examples of matrices that are not normal. Are the left- and right-shift operators on `2 (N)
normal?
Exercise 2.14. (a) Show that if A, B ∈ B(H) are self-adjoint, then AB is self-adjoint if
and only if AB = BA.
(b) Give an example of self-adjoint operators A, B such that AB is not self-adjoint.
(c) Show that if A, B ∈ B(H) are self-adjoint then A + A∗ , AA∗ , A∗ A, A + B, ABA,
and BAB are all self-adjoint. What about A − A∗ or A − B? Show that AA∗ − A∗ A is
self-adjoint.
Exercise 2.15. (a) Let $\lambda = (\lambda_n)_{n\in\mathbb{N}} \in \ell^\infty(\mathbb{N})$ be given and let L be defined as in Example 1.7. Show that L is normal, find a formula for $L^*$, and prove that L is self-adjoint if and only if each $\lambda_n$ is real.
(b) Determine a necessary and sufficient condition on $\phi$ so that the multiplication operator $M_\phi$ defined in Example 1.18 is self-adjoint.
(c) Determine a necessary and sufficient condition on the kernel k so that the integral operator L defined in (1.3) is self-adjoint.
The following result gives a useful condition for telling when an operator on a complex
Hilbert space is self-adjoint.
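Proposition 2.16. Let H be a complex Hilbert space (i.e., $\mathbb{F} = \mathbb{C}$), and let $A \in B(H)$ be given. Then:
\[ A \text{ is self-adjoint} \iff \langle Af, f\rangle \in \mathbb{R} \ \ \forall\, f \in H. \]
Proof. ⇒. Assume $A = A^*$. Then for any $f \in H$ we have
\[ \overline{\langle Af, f\rangle} = \langle f, Af\rangle = \langle A^* f, f\rangle = \langle Af, f\rangle. \]
Therefore $\langle Af, f\rangle$ is real.
⇐. Assume that $\langle Af, f\rangle$ is real for all f. Choose any f, $g \in H$. Then
\[ \langle A(f + g), f + g\rangle = \langle Af, f\rangle + \langle Af, g\rangle + \langle Ag, f\rangle + \langle Ag, g\rangle. \]
Since $\langle A(f + g), f + g\rangle$, $\langle Af, f\rangle$, and $\langle Ag, g\rangle$ are all real, we conclude that $\langle Af, g\rangle + \langle Ag, f\rangle$ is real. Hence it equals its own complex conjugate, i.e.,
\[ \langle Af, g\rangle + \langle Ag, f\rangle = \overline{\langle Af, g\rangle + \langle Ag, f\rangle} = \langle g, Af\rangle + \langle f, Ag\rangle. \tag{2.2} \]
Similarly, since
\[ \langle A(f + ig), f + ig\rangle = \langle Af, f\rangle - i\langle Af, g\rangle + i\langle Ag, f\rangle + \langle Ag, g\rangle \]
we see that
\[ -i\langle Af, g\rangle + i\langle Ag, f\rangle = \overline{-i\langle Af, g\rangle + i\langle Ag, f\rangle} = i\langle g, Af\rangle - i\langle f, Ag\rangle. \]
Multiplying through by i yields
\[ \langle Af, g\rangle - \langle Ag, f\rangle = -\langle g, Af\rangle + \langle f, Ag\rangle. \tag{2.3} \]
Adding (2.2) and (2.3) together, we obtain
\[ 2\langle Af, g\rangle = 2\langle f, Ag\rangle = 2\langle A^* f, g\rangle. \]
Since this is true for every f and g, we conclude that $A = A^*$.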
Example 2.17. The preceding result is false for real Hilbert spaces. After all, if F = R then
$\langle Af, f\rangle$ is real for every f no matter what A is. Therefore, any non-self-adjoint operator
provides a counterexample. For example, if $H = \mathbb{R}^n$ then any non-symmetric matrix A is a
counterexample.
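The contrast between the complex and real cases can be checked numerically. The following is a minimal sketch in pure Python, assuming $H = \mathbb{C}^2$ with the inner product linear in the first slot; the helper names `inner` and `matvec` and the particular matrices are illustrative choices, not part of the notes.

```python
# Illustrative check of Proposition 2.16 and Example 2.17 on 2 x 2 matrices.
# The helpers and sample matrices below are assumptions made for this sketch.

def inner(u, v):
    """Inner product <u, v>, linear in u and conjugate-linear in v."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def matvec(A, f):
    """Matrix-vector product A f, with A given as a list of rows."""
    return [sum(a * x for a, x in zip(row, f)) for row in A]

hermitian = [[2, 1 - 1j], [1 + 1j, 3]]   # equals its own conjugate transpose
nonsymmetric = [[0, 1], [0, 0]]          # real entries, but not symmetric

f_complex = [1 + 0j, 1j]
f_real = [1.0, 2.0]

# Self-adjoint case: <Af, f> is real even for a complex vector f.
q_herm = inner(matvec(hermitian, f_complex), f_complex)

# Non-self-adjoint case: <Af, f> is non-real for a suitable complex f ...
q_nonsym_complex = inner(matvec(nonsymmetric, f_complex), f_complex)

# ... but over the reals <Af, f> is automatically real, so it detects nothing.
q_nonsym_real = inner(matvec(nonsymmetric, f_real), f_real)

print(q_herm, q_nonsym_complex, q_nonsym_real)
```

The non-symmetric matrix passes the "real quadratic form" test on real vectors but fails it on a complex vector, which is the reason the hypothesis F = C is needed in Proposition 2.16.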
The next result provides a useful way of calculating the operator norm of a self-adjoint
operator.
Proposition 2.18. If A ∈ B(H) is self-adjoint, then
\[ \|A\| = \sup_{\|f\|=1} |\langle Af, f\rangle|. \]
Proof. Set $M = \sup_{\|f\|=1} |\langle Af, f\rangle|$.
By Cauchy–Schwarz and the definition of operator norm, we have
\[ M = \sup_{\|f\|=1} |\langle Af, f\rangle| \le \sup_{\|f\|=1} \|Af\|\,\|f\| \le \sup_{\|f\|=1} \|A\|\,\|f\|\,\|f\| = \|A\|. \]
To get the opposite inequality, note that if f is any nonzero vector in H then $f/\|f\|$ is a
unit vector, so $\bigl|\bigl\langle A\tfrac{f}{\|f\|}, \tfrac{f}{\|f\|}\bigr\rangle\bigr| \le M$. Rearranging, we see that
\[ \forall\, f \in H, \quad |\langle Af, f\rangle| \le M\,\|f\|^2. \tag{2.4} \]
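Now choose any f, g ∈ H with $\|f\| = \|g\| = 1$. Then, by expanding the inner products,
canceling terms, and using the fact that $A = A^*$, we see that
\[
\begin{aligned}
\langle A(f+g), f+g\rangle - \langle A(f-g), f-g\rangle &= 2\langle Af, g\rangle + 2\langle Ag, f\rangle \\
&= 2\langle Af, g\rangle + 2\langle g, Af\rangle \\
&= 4\,\mathrm{Re}\,\langle Af, g\rangle.
\end{aligned}
\]
Therefore, applying (2.4) and the Parallelogram Law, we have
\[
\begin{aligned}
4\,\mathrm{Re}\,\langle Af, g\rangle &\le |\langle A(f+g), f+g\rangle| + |\langle A(f-g), f-g\rangle| \\
&\le M\,\|f+g\|^2 + M\,\|f-g\|^2 \\
&= 2M\bigl(\|f\|^2 + \|g\|^2\bigr) = 4M.
\end{aligned}
\]
That is, $\mathrm{Re}\,\langle Af, g\rangle \le M$ for every choice of unit vectors f and g. Write $\langle Af, g\rangle = |\langle Af, g\rangle|\, e^{i\theta}$.
Then $e^{i\theta} g$ is another unit vector, so
\[ M \ge \mathrm{Re}\,\langle Af, e^{i\theta} g\rangle = \mathrm{Re}\; e^{-i\theta}\langle Af, g\rangle = |\langle Af, g\rangle|. \]
Hence
\[ \|Af\| = \sup_{\|g\|=1} |\langle Af, g\rangle| \le M. \]
Since this is true for every unit vector f, we conclude that $\|A\| \le M$.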
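As a rough numerical illustration of why self-adjointness matters in Proposition 2.18, the sketch below compares $\sup |\langle Af, f\rangle|$ with $\sup \|Af\|$ for two 2 × 2 real matrices. It samples only real unit vectors $(\cos t, \sin t)$, which is an assumption that happens to suffice for these two examples; the function name `quad_and_norm` and the matrices are hypothetical choices for this sketch.

```python
import math

def quad_and_norm(A, steps=3600):
    """Return (max |<Af, f>|, max ||Af||) over sampled real unit vectors f."""
    best_quad, best_norm = 0.0, 0.0
    for k in range(steps):
        t = math.pi * k / steps
        f = (math.cos(t), math.sin(t))
        Af = (A[0][0] * f[0] + A[0][1] * f[1],
              A[1][0] * f[0] + A[1][1] * f[1])
        best_quad = max(best_quad, abs(Af[0] * f[0] + Af[1] * f[1]))
        best_norm = max(best_norm, math.hypot(Af[0], Af[1]))
    return best_quad, best_norm

symmetric = [[2.0, 1.0], [1.0, 2.0]]   # self-adjoint, eigenvalues 1 and 3
nilpotent = [[0.0, 1.0], [0.0, 0.0]]   # not self-adjoint

sym_quad, sym_norm = quad_and_norm(symmetric)
nil_quad, nil_norm = quad_and_norm(nilpotent)
print(sym_quad, sym_norm)   # the two values agree for the symmetric matrix
print(nil_quad, nil_norm)   # the quadratic form underestimates the norm here
```

For the symmetric matrix the two suprema coincide, as the proposition predicts, while for the nilpotent matrix the quadratic form only reaches half of the operator norm.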
The following corollary is a very useful consequence.
Corollary 2.19. Assume that A ∈ B(H).
(a) If F = R, $A = A^*$, and $\langle Af, f\rangle = 0$ for every f, then A = 0.
(b) If F = C and $\langle Af, f\rangle = 0$ for every f, then A = 0.
Proof. Assume the hypotheses of either statement (a) or statement (b). In the case of
statement (a), we have by hypothesis that A is self-adjoint. In the case of statement (b), we
can conclude from Proposition 2.16 that A is self-adjoint, because $\langle Af, f\rangle = 0$ is real for every f. Hence in either
case we can apply Proposition 2.18 to conclude that
\[ \|A\| = \sup_{\|f\|=1} |\langle Af, f\rangle| = 0. \]
Lemma 2.20. If A ∈ B(H), then the following statements are equivalent.
(a) A is normal, i.e., $AA^* = A^*A$.
(b) $\|Af\| = \|A^*f\|$ for every f ∈ H.
Proof. (b) ⇒ (a). Assume that (b) holds. Then for every f we have
\[
\begin{aligned}
\langle (A^*A - AA^*)f, f\rangle &= \langle A^*Af, f\rangle - \langle AA^*f, f\rangle \\
&= \langle Af, Af\rangle - \langle A^*f, A^*f\rangle \\
&= \|Af\|^2 - \|A^*f\|^2 = 0.
\end{aligned}
\]
Since $A^*A - AA^*$ is self-adjoint, it follows from Corollary 2.19 that $A^*A - AA^* = 0$.
(a) ⇒ (b). Exercise.
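Condition (b) of Lemma 2.20 is easy to test on small matrices. The following sketch, in pure Python, compares $\|Af\|$ and $\|A^*f\|$ for one normal and one non-normal 2 × 2 matrix; the helper names and the sample data are assumptions made for illustration only.

```python
import math

def matvec(A, f):
    """Matrix-vector product A f, with A given as a list of rows."""
    return [sum(a * x for a, x in zip(row, f)) for row in A]

def adjoint(A):
    """Conjugate transpose of a matrix given as a list of rows."""
    return [[A[j][i].conjugate() for j in range(len(A))]
            for i in range(len(A[0]))]

def norm(v):
    """Euclidean norm of a complex vector."""
    return math.sqrt(sum(abs(x) ** 2 for x in v))

normal = [[1, -1], [1, 1]]   # A A* = A* A = 2I, yet A is not self-adjoint
shift = [[0, 1], [0, 0]]     # A A* and A* A differ, so A is not normal

f = [1 + 2j, 3 - 1j]

gap_normal = abs(norm(matvec(normal, f)) - norm(matvec(adjoint(normal), f)))
gap_shift = abs(norm(matvec(shift, f)) - norm(matvec(adjoint(shift), f)))
print(gap_normal, gap_shift)
```

The gap vanishes (up to rounding) for the normal matrix and is visibly nonzero for the truncated shift, in line with the equivalence stated in the lemma.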
Corollary 2.21. If A ∈ B(H) is normal, then ker(A) = ker(A∗ ).
Exercise 2.22. Suppose that A ∈ B(H) is normal. Prove that A is injective if and only if
range(A) is dense in H.
Exercise 2.23. If A ∈ B(H), then the following statements are equivalent.
(a) A is an isometry, i.e., $\|Af\| = \|f\|$ for every f ∈ H.
(b) $A^*A = I$.
(c) $\langle Af, Ag\rangle = \langle f, g\rangle$ for every f, g ∈ H.
Exercise 2.24. If H = Cn and A, B are n × n matrices, then AB = I implies BA = I.
Give a counterexample to this for an infinite-dimensional Hilbert space. Consequently, the
hypothesis A∗ A = I in the preceding result does not imply that AA∗ = I.
Exercise 2.25. If A ∈ B(H), then the following statements are equivalent.
(a) A∗ A = AA∗ = I.
(b) A is unitary, i.e., it is a surjective isometry.
(c) A is a normal isometry.
The following result provides a very useful relationship between the range of A∗ and the
kernel of A.
Theorem 2.26. Let A ∈ B(H, K).
(a) ker(A) = range(A∗ )⊥ .
(b) $\ker(A)^\perp = \overline{\mathrm{range}(A^*)}$.
(c) A is injective if and only if range(A∗ ) is dense in H.
Proof. (a) Assume that f ∈ ker(A) and let $h \in \mathrm{range}(A^*)$, i.e., $h = A^*g$ for some g ∈ K.
Then since Af = 0, we have $\langle f, h\rangle = \langle f, A^*g\rangle = \langle Af, g\rangle = 0$. Thus $f \in \mathrm{range}(A^*)^\perp$, so
$\ker(A) \subseteq \mathrm{range}(A^*)^\perp$.
Now assume that $f \in \mathrm{range}(A^*)^\perp$. Then for any h ∈ K we have $\langle Af, h\rangle = \langle f, A^*h\rangle = 0$.
But this implies Af = 0, so f ∈ ker(A). Thus $\mathrm{range}(A^*)^\perp \subseteq \ker(A)$.
(b), (c) Exercises.
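Part (a) can be sanity-checked on a small matrix. The sketch below, in pure Python, takes a rank-one complex matrix, a vector in its kernel, and a vector in the range of its adjoint, and verifies that the two are orthogonal. The matrix, the vectors, and the helper names are illustrative assumptions, not taken from the notes.

```python
def inner(u, v):
    """Inner product <u, v>, linear in u and conjugate-linear in v."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def matvec(A, f):
    """Matrix-vector product A f, with A given as a list of rows."""
    return [sum(a * x for a, x in zip(row, f)) for row in A]

def adjoint(A):
    """Conjugate transpose of a matrix given as a list of rows."""
    return [[A[j][i].conjugate() for j in range(len(A))]
            for i in range(len(A[0]))]

A = [[1, 1j], [2, 2j]]   # rank one, so ker(A) is one-dimensional
f = [-1j, 1]             # chosen so that A f = 0
g = [1 + 1j, 2]          # arbitrary vector in the target space

Af = matvec(A, f)
h = matvec(adjoint(A), g)   # an element of range(A*)
pairing = inner(f, h)       # should vanish since f is in ker(A)
print(Af, h, pairing)
```

Any other choice of g gives the same vanishing pairing, which is exactly the inclusion $\ker(A) \subseteq \mathrm{range}(A^*)^\perp$ proved above.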
3. Projections and Idempotents: Invariant and Reducing Subspaces
Definition 3.1. a. If E ∈ B(H) satisfies $E^2 = E$ then E is said to be idempotent.
b. If E ∈ B(H) satisfies $E^2 = E$ and $\ker(E) = \mathrm{range}(E)^\perp$ then E is called a projection.
Exercise 3.2. If E ∈ B(H) is an idempotent operator, then ker(E) and range(E) are closed
subspaces of H. Further, ker(E) = range(I − E) and range(E) = ker(I − E).
Lemma 3.3 (Characterization of Orthogonal Projections). Let E ∈ B(H) be a nonzero
idempotent operator. Then the following statements are equivalent.
(a) E is a projection.
(b) E is the orthogonal projection of H onto range(E).
(c) $\|E\| = 1$.
(d) E is self-adjoint.
(e) E is normal.
(f) E is positive, i.e., $\langle Ef, f\rangle \ge 0$ for every f ∈ H.
Proof. (e) ⇒ (a). Assume that $E^2 = E$ and E is normal. Then from Lemma 2.20 we know
that $\|Ef\| = \|E^*f\|$ for every f ∈ H. Hence Ef = 0 if and only if $E^*f = 0$, or in other
words, $\ker(E) = \ker(E^*)$. But we know from Theorem 2.26 that $\ker(E^*) = \mathrm{range}(E)^\perp$.
Hence we conclude that $\ker(E) = \mathrm{range}(E)^\perp$, and therefore E is a projection.
The remaining implications are exercises.
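To see that idempotence alone does not give a projection, it can help to compare an oblique idempotent with an orthogonal projection on $\mathbb{R}^2$. The sketch below is a hedged illustration in pure Python: the operator norm is only approximated by sampling real unit vectors, and the names `oblique`, `orthogonal`, and the helpers are hypothetical.

```python
import math

def matmul(A, B):
    """Product of two 2 x 2 matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def is_idempotent(E):
    return matmul(E, E) == E

def is_symmetric(E):
    return E[0][1] == E[1][0]

def op_norm(E, steps=3600):
    """Approximate operator norm by sampling real unit vectors (cos t, sin t)."""
    best = 0.0
    for k in range(steps):
        t = math.pi * k / steps
        c, s = math.cos(t), math.sin(t)
        best = max(best, math.hypot(E[0][0] * c + E[0][1] * s,
                                    E[1][0] * c + E[1][1] * s))
    return best

oblique = [[1.0, 1.0], [0.0, 0.0]]      # idempotent; kernel is not orthogonal to range
orthogonal = [[0.5, 0.5], [0.5, 0.5]]   # orthogonal projection onto span{(1, 1)}

print(is_idempotent(oblique), is_symmetric(oblique), op_norm(oblique))
print(is_idempotent(orthogonal), is_symmetric(orthogonal), op_norm(orthogonal))
```

Both matrices satisfy $E^2 = E$, but only the symmetric one has norm 1; the oblique idempotent has norm $\sqrt{2}$, consistent with the equivalence of (a), (c), and (d) in Lemma 3.3.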
Definition 3.4 (Orthogonal Direct Sum of Subspaces). Let $\{M_i\}_{i\in I}$ be a collection of closed
subspaces of H such that $M_i \perp M_j$ whenever $i \ne j$. Then the orthogonal direct sum of the
$M_i$ is the smallest closed subspace which contains every $M_i$. This space is
\[ \bigoplus_{i\in I} M_i = \overline{\mathrm{span}}\Bigl(\,\bigcup_{i\in I} M_i\Bigr). \]
Exercise 3.5. Suppose that M , N are closed subspaces of H such that M ⊥ N . Prove that
M + N = {m + n : m ∈ M, n ∈ N } is a closed subspace of H, and that
M ⊕ N = M + N.
Show that every vector x ∈ M ⊕ N can be written uniquely as x = m + n with m ∈ M and
n ∈ N.
Extend by induction to finite collections of closed, pairwise orthogonal subspaces. (Unfortunately, the analogous statement is not true for infinite collections.)
Exercise 3.6. Show that if A ∈ B(H, K) then $H = \ker(A) \oplus \overline{\mathrm{range}(A^*)}$.
Definition 3.7. Let A ∈ B(H) and M ≤ H.
(a) We say that M is invariant under A if A(M ) ⊆ M , where
A(M ) = {Ax : x ∈ M }.
That is, M is invariant if x ∈ M implies Ax ∈ M . Note that it need not be the case
that A(M ) = M .
(b) We say that M is a reducing subspace for A if both M and M ⊥ are invariant under
A, i.e., A(M ) ⊆ M and A(M ⊥ ) ⊆ M ⊥ .
Proposition 3.8. Let A ∈ B(H) and M ≤ H be given. Then the following statements are
equivalent.
(a) M is invariant under A.
(b) P AP = AP , where P = PM is the orthogonal projection of H onto M .
Exercise 3.9. Define $L : \ell^2(\mathbb{Z}) \to \ell^2(\mathbb{Z})$ by
\[ L(\dots, x_{-1}, x_0, x_1, \dots) = (\dots, x_0, x_1, x_2, \dots), \]
where on the right-hand side the entry $x_1$ sits in the 0th component position. That is,
L slides each component one unit to the left (L is called a bilateral shift). Find a closed
subspace of $\ell^2(\mathbb{Z})$ that is invariant but not reducing under L.
Exercise 3.10. Assume that M ≤ H is invariant under L ∈ B(H). Prove that M ⊥ is
invariant under L∗ .
4. Compact Operators
Definition 4.1 (Compact and Totally Bounded Sets). Let X be a Banach space, and let
E ⊆ X be given.
(a) We say that E is compact if every open cover of E contains a finite subcover. That is,
E is compact if whenever {Uα }α∈I is a collection of open sets whose union contains
E, then there exist finitely many α1 , . . . , αN such that E ⊆ Uα1 ∪ · · · ∪ UαN .
(b) We say that E is sequentially compact if every sequence {fn }n∈N of points of E
contains a convergent subsequence {fnk }k∈N whose limit belongs to E.
(c) We say that E is totally bounded if for every ε > 0 there exist finitely many points
$f_1, \dots, f_N \in E$ such that
\[ E \subseteq \bigcup_{k=1}^{N} B(f_k, \varepsilon), \]
where $B(f_k, \varepsilon)$ is the open ball of radius ε centered at $f_k$. That is, E is totally
bounded if and only if for every ε > 0 there exist finitely many points $f_1, \dots, f_N \in E$ such that every
element of E is within ε of some $f_k$.
In finite dimensions, a set is compact if and only if it is closed and bounded. In infinite
dimensions, all compact sets are closed and bounded, but the converse fails. Instead, we
have the following characterization of compact sets (this characterization actually holds in
any complete metric space).
Theorem 4.2. Let E be a subset of a Banach space X. Then the following statements are
equivalent.
(a) E is compact.
(b) E is sequentially compact.
(c) E is closed and totally bounded.
Proof. (b) ⇒ (a). (This proof is adapted from one given in J. R. Munkres, “Topology,” Second Edition, Prentice Hall, 2000.) Assume that E is sequentially compact. Our first step will be to prove
the following claim, where the diameter of a set S is defined to be
\[ \mathrm{diam}(S) = \sup\{\|f - g\| : f, g \in S\}. \]
Claim 1. For any open cover $\{U_\alpha\}_{\alpha\in I}$ of E, there exists a number δ > 0 (called a Lebesgue
number for the cover) such that if S ⊆ E satisfies diam(S) < δ, then there is an α ∈ I such
that $S \subseteq U_\alpha$.
To prove the claim, suppose that $\{U_\alpha\}_{\alpha\in I}$ was an open cover of E such that no δ with
the required property existed. Then for each n ∈ N, we could find a set $S_n \subseteq E$ with
$\mathrm{diam}(S_n) < \frac{1}{n}$ such that $S_n$ is not contained in any $U_\alpha$. Choose any $f_n \in S_n$. Since E is
sequentially compact, there must be a subsequence $\{f_{n_k}\}_{k\in\mathbb{N}}$ that converges to an element of
E, say $f_{n_k} \to a \in E$. But we must have $a \in U_\alpha$ for some α, and since $U_\alpha$ is open there must
exist some ε > 0 such that $B(a, \varepsilon) \subseteq U_\alpha$. Now choose k large enough that we have both
\[ \frac{1}{n_k} < \frac{\varepsilon}{2} \quad\text{and}\quad \|a - f_{n_k}\| < \frac{\varepsilon}{2}. \]
The first inequality above implies that $\mathrm{diam}(S_{n_k}) < \frac{\varepsilon}{2}$. Therefore, using this and the second
inequality, we have $S_{n_k} \subseteq B(a, \varepsilon) \subseteq U_\alpha$, which is a contradiction. Therefore the claim is
proved.
Next, we will prove the following claim.
Claim 2. For any ε > 0, there exist finitely many $f_1, \dots, f_N \in E$ such that
\[ E \subseteq \bigcup_{k=1}^{N} B(f_k, \varepsilon). \]
To prove this claim, assume that there is an ε > 0 such that E cannot be covered by
finitely many ε-balls centered at points of E. Choose any $f_1 \in E$. Since E cannot be
covered by a single ε-ball, we have $E \not\subseteq B(f_1, \varepsilon)$. Hence there exists $f_2 \in E \setminus B(f_1, \varepsilon)$, i.e.,
$f_2 \in E$ and $\|f_2 - f_1\| \ge \varepsilon$. But E cannot be covered by two ε-balls, so there must exist an
$f_3 \in E \setminus \bigl(B(f_1, \varepsilon) \cup B(f_2, \varepsilon)\bigr)$. In particular, we have $\|f_3 - f_1\|, \|f_3 - f_2\| \ge \varepsilon$. Continuing in
this way we obtain a sequence of points $\{f_n\}_{n\in\mathbb{N}}$ in E which has no convergent subsequence,
which is a contradiction. Hence the claim is proved.
Finally, we show that E is compact. Let $\{U_\alpha\}_{\alpha\in I}$ be any open cover of E. Let δ be the
Lebesgue number given by Claim 1, and set $\varepsilon = \frac{\delta}{3}$. By Claim 2, there exists a covering of E
by finitely many ε-balls. Each ball has diameter smaller than δ, so by Claim 1 is contained
in some $U_\alpha$. Thus we find finitely many $U_\alpha$ that cover E.
(c) ⇒ (b). Assume that E is closed and totally bounded, and let $\{f_n\}_{n\in\mathbb{N}}$ be any sequence
of points in E. Since E is covered by finitely many balls of radius $\frac{1}{2}$, one of those balls must
contain infinitely many $f_n$, say $\{f_n^{(1)}\}_{n\in\mathbb{N}}$. Then we have
\[ \forall\, m, n \in \mathbb{N}, \quad \|f_m^{(1)} - f_n^{(1)}\| < 1. \]
Since E is covered by finitely many balls of radius $\frac{1}{4}$, we can find a subsequence $\{f_n^{(2)}\}_{n\in\mathbb{N}}$ of
$\{f_n^{(1)}\}_{n\in\mathbb{N}}$ such that
\[ \forall\, m, n \in \mathbb{N}, \quad \|f_m^{(2)} - f_n^{(2)}\| < \frac{1}{2}. \]
By induction we keep constructing subsequences $\{f_n^{(k)}\}_{n\in\mathbb{N}}$ such that $\|f_m^{(k)} - f_n^{(k)}\| < \frac{1}{k}$ for
all m, n ∈ N.
Now consider the “diagonal subsequence” $\{f_n^{(n)}\}_{n\in\mathbb{N}}$. Given ε > 0, let N be large enough
that $\frac{1}{N} < \varepsilon$. If m ≥ n > N, then $f_m^{(m)}$ is one element of the sequence $\{f_k^{(n)}\}_{k\in\mathbb{N}}$, say
$f_m^{(m)} = f_k^{(n)}$. Then
\[ \|f_m^{(m)} - f_n^{(n)}\| = \|f_k^{(n)} - f_n^{(n)}\| < \frac{1}{n} < \varepsilon. \]
Thus $\{f_n^{(n)}\}_{n\in\mathbb{N}}$ is Cauchy and hence converges, since X is complete. Since E is closed, it must converge to some
element of E.
(a) ⇒ (c). Exercise.
Exercise 4.3. Show that if E is a totally bounded subset of a Banach space X, then its
closure $\overline{E}$ is compact. A set whose closure is compact is said to be precompact.
Notation 4.4. We let $\mathrm{Ball}_H$ denote the closed unit ball in H, i.e.,
\[ \mathrm{Ball}_H = \mathrm{Ball}(H) = \{f \in H : \|f\| \le 1\}. \]
Exercise 4.5. Prove that if H is infinite-dimensional, then BallH is not compact.
Definition 4.6 (Compact Operators). Let H, K be Hilbert spaces. A linear operator
T : H → K is compact if T (BallH ) has compact closure in K. We define
B0 (H, K) = {T : H → K : T is compact},
and set B0 (H) = B0 (H, H).
By definition, a compact operator is linear, and we will see that all compact operators are
bounded. Thus it will turn out that B0 (H, K) ⊆ B(H, K). In fact, we will see that B0 (H, K)
is a closed subspace of B(H, K).
The following result gives some useful reformulations of the definition of compact operator.
Proposition 4.7 (Characterizations of Compact Operators). Let T : H → K be linear.
Then the following statements are equivalent.
(a) T is compact.
(b) T (BallH ) is totally bounded.
(c) If {fn }n∈N is a bounded sequence in H, then {T fn }n∈N contains a convergent subsequence.
Proof. (a) ⇔ (b). This follows from Theorem 4.2 and Exercise 4.3.
(a) ⇒ (c). Suppose that T is compact and that $\{f_n\}_{n\in\mathbb{N}}$ is a bounded sequence in H.
By rescaling the sequence (i.e., multiplying by an appropriate scalar), we may assume that
$f_n \in \mathrm{Ball}_H$ for every n. Therefore $Tf_n \in T(\mathrm{Ball}_H) \subseteq \overline{T(\mathrm{Ball}_H)}$. Since $\overline{T(\mathrm{Ball}_H)}$ is compact,
it follows from Theorem 4.2 that $\{Tf_n\}_{n\in\mathbb{N}}$ contains a subsequence which converges to an
element of $\overline{T(\mathrm{Ball}_H)}$.
(c) ⇒ (a). Exercise.
Proposition 4.8. If T : H → K is compact, then it is bounded. That is,
B0 (H, K) ⊆ B(H, K).
Proof. Assume that T : H → K is linear but unbounded. Then there exist vectors $f_n \in H$
such that $\|f_n\| = 1$ but $\|Tf_n\| \ge n$. Therefore every subsequence of $\{Tf_n\}_{n\in\mathbb{N}}$ is unbounded,
and hence cannot converge. Therefore T is not compact by Proposition 4.7.
Exercise 4.9. Show that if H is infinite-dimensional then the identity operator on H is not
compact. Hence a bounded operator need not be compact in general.
The following exercise shows that a compact operator maps an orthonormal sequence to
a sequence that converges to the zero vector.
Exercise 4.10. (a) Let $\{h_n\}_{n\in\mathbb{N}}$ be a sequence of vectors in H, and let h ∈ H. Suppose
that every subsequence of $\{h_n\}_{n\in\mathbb{N}}$ contains a subsequence that converges to h. Prove that
$h_n \to h$.
Hint: Proceed by contradiction. Suppose that $h_n$ does not converge to h. Show that this
implies that there is an ε > 0 and a subsequence $\{h_{n_k}\}_{k\in\mathbb{N}}$ such that $\|h - h_{n_k}\| \ge \varepsilon$ for
every k.
(b) Suppose that T : H → K is compact, and let $\{e_n\}_{n\in\mathbb{N}}$ be an orthonormal sequence
in H. Show that $Te_n \to 0$.
Hint: Choose any subsequence $\{f_n\}_{n\in\mathbb{N}}$ of $\{e_n\}_{n\in\mathbb{N}}$. Since T is compact, this sequence has a subsequence
$\{g_n\}_{n\in\mathbb{N}}$ such that $\{Tg_n\}_{n\in\mathbb{N}}$ converges, say $Tg_n \to h$. Prove that $\langle Tg_n, h\rangle \to 0$
(use Bessel’s Inequality to find a bound for the $\ell^2$-norm of $\{\langle Tg_n, h\rangle\}_{n\in\mathbb{N}}$). Use part (a) to
complete the proof.
The following exercise shows that a compact operator maps weakly convergent sequences
to convergent sequences.
Definition 4.11. Let $\{f_n\}_{n\in\mathbb{N}}$ be a sequence of vectors in H and let f ∈ H. We say that
$f_n$ converges weakly to f, written $f_n \xrightarrow{w} f$, if
\[ \forall\, g \in H, \quad \langle f_n, g\rangle \to \langle f, g\rangle \ \text{ as } n \to \infty. \]
Exercise 4.12. (a) Show that if $f_n \to f$, then $f_n \xrightarrow{w} f$.
(b) Show that if $\{e_n\}_{n\in\mathbb{N}}$ is an orthonormal sequence in H, then $e_n \xrightarrow{w} 0$.
(c) Suppose that T ∈ B(H) is compact. Show that if $f_n \xrightarrow{w} f$, then $Tf_n \to Tf$.
Exercise 4.13. Let $\phi \in L^\infty(\mathbb{R}^n)$ be fixed, with $\phi \ne 0$. Then by Exercise 1.18 we know that
the multiplication operator $M_\phi : L^2(\mathbb{R}^n) \to L^2(\mathbb{R}^n)$ given by $M_\phi f = f\phi$ is bounded. Show
that $M_\phi$ is not compact.
Hint: There must exist an ε > 0 and a set $E \subseteq \mathbb{R}^n$ with positive measure such that
$|\phi(x)| \ge \varepsilon$ for all x ∈ E.
Exhibit a measure space (X, Ω, µ) and a bounded, nonzero $\phi \in L^\infty(X)$ such that $M_\phi$ is
compact. Hint: Consider Example 4.23.
Exercise 4.14. Prove that if H is infinite-dimensional and T : H → K is compact and injective, then $T^{-1} : \mathrm{range}(T) \to H$
is unbounded.
Theorem 4.15 (Limits of Compact Operators). $B_0(H, K)$ is a closed subspace of B(H, K)
(under the operator norm). That is,
(a) if $S, T \in B_0(H, K)$ and α, β ∈ F, then $\alpha S + \beta T \in B_0(H, K)$,
(b) if $T_n \in B_0(H, K)$, T ∈ B(H, K), and $\|T - T_n\| \to 0$, then $T \in B_0(H, K)$.
Proof. (a) Exercise.
(b) Assume that $T_n$ are compact operators and that $T_n \to T$ in operator norm. By
Proposition 4.7, it suffices to show that $T(\mathrm{Ball}_H)$ is a totally bounded subset of K.
Choose any ε > 0. Then there exists an n such that $\|T - T_n\| < \frac{\varepsilon}{3}$. Now, $T_n$ is compact,
so $T_n(\mathrm{Ball}_H)$ is totally bounded. Hence there exist finitely many points $h_1, \dots, h_m \in \mathrm{Ball}_H$
such that
\[ T_n(\mathrm{Ball}_H) \subseteq \bigcup_{j=1}^{m} B\bigl(T_n h_j, \tfrac{\varepsilon}{3}\bigr). \tag{4.1} \]
We will show that $T(\mathrm{Ball}_H)$ is totally bounded by showing that
\[ T(\mathrm{Ball}_H) \subseteq \bigcup_{j=1}^{m} B\bigl(T h_j, \varepsilon\bigr). \tag{4.2} \]
Choose any element of $T(\mathrm{Ball}_H)$, i.e., any point Tf with $\|f\| \le 1$. Then $T_n f \in T_n(\mathrm{Ball}_H)$,
so by (4.1) there must be some j such that $\|T_n f - T_n h_j\| < \frac{\varepsilon}{3}$. Consequently,
\[
\begin{aligned}
\|Tf - Th_j\| &\le \|Tf - T_n f\| + \|T_n f - T_n h_j\| + \|T_n h_j - Th_j\| \\
&< \|T - T_n\|\,\|f\| + \frac{\varepsilon}{3} + \|T_n - T\|\,\|h_j\| \\
&< \frac{\varepsilon}{3}\cdot 1 + \frac{\varepsilon}{3} + \frac{\varepsilon}{3}\cdot 1 \\
&= \varepsilon.
\end{aligned}
\]
Hence (4.2) follows, so T is compact.
Exercise 4.16. Another way to prove Theorem 4.15 is to apply a Cantor diagonalization
argument. Fill in the details in the following sketch of this argument.
Suppose that $\{f_n\}_{n\in\mathbb{N}}$ is a bounded sequence in H. Then since $T_1$ is compact, there exists a
subsequence $\{f_n^{(1)}\}_{n\in\mathbb{N}}$ of $\{f_n\}_{n\in\mathbb{N}}$ such that $\{T_1 f_n^{(1)}\}_{n\in\mathbb{N}}$ converges. Then since $T_2$ is compact,
there exists a subsequence $\{f_n^{(2)}\}_{n\in\mathbb{N}}$ of $\{f_n^{(1)}\}_{n\in\mathbb{N}}$ such that $\{T_2 f_n^{(2)}\}_{n\in\mathbb{N}}$ converges (and note
that $\{T_1 f_n^{(2)}\}_{n\in\mathbb{N}}$ also converges!). Continue to construct subsequences in this way, and then
show that the “diagonal subsequence” $\{T f_n^{(n)}\}_{n\in\mathbb{N}}$ converges (use the fact that there exists
a k such that $\|T - T_k\| < \varepsilon$). Therefore T is compact.
Theorem 4.17 (Compositions and Compact Operators). Let H1 , H2 , H3 be Hilbert spaces.
(a) If A : H1 → H2 is bounded and T : H2 → H3 is compact, then T A : H1 → H3 is
compact.
(b) If T : H1 → H2 is compact and A : H2 → H3 is bounded, then AT : H1 → H3 is
compact.
Proof. (b) Assume that A is bounded and T is compact. Let $\{f_n\}_{n\in\mathbb{N}}$ be any bounded
sequence in $H_1$. Then since T is compact, there is a subsequence $\{Tf_{n_k}\}_{k\in\mathbb{N}}$ that converges
in $H_2$. Since A is bounded, the subsequence $\{ATf_{n_k}\}_{k\in\mathbb{N}}$ therefore converges in $H_3$. Hence
AT is compact.
(a) Exercise.
Exercise 4.18. Prove that if T ∈ B0 (H, K), then range(T ) is a separable subspace of K.
Hints: Since T (BallH ) is compact, it is totally bounded. Hence for each n ∈ N we can
find finitely many balls of radius 1/n with centers in T (BallH ) that cover T (BallH ). If we
consider all these balls for every n, we have countably many balls that cover T (BallH ). Show
that this implies that T (BallH ) contains a countable, dense subset. Then do the same for
each ball of radius k ∈ N instead of just k = 1. Combine all of these together to get a
countable dense subset of range(T ).
Definition 4.19 (Finite-Rank Operators). Recall that the rank of an operator T : H → K
is the dimension of range(T). We say that T is a finite-rank operator if range(T) is finite-dimensional. We set
B00 (H, K) = {T ∈ B(H, K) : T is finite-rank},
and set B00 (H) = B00 (H, H).
A linear, finite-rank operator need not be bounded (that is why we include the assumption
of boundedness in the definition of B00 (H, K) above). However, the following result shows
that if a finite-rank operator is bounded, then it is actually compact.
Proposition 4.20. If T : H → K is bounded, linear, and has finite rank, then T is compact.
Thus,
B00 (H, K) ⊆ B0 (H, K).
Proof. Since T is bounded, T (BallH ) is a bounded subset of the finite-dimensional space
range(T ). All finite-dimensional spaces are closed. Hence the closure of T (BallH ) is a closed
and bounded subset of range(T ), and therefore is compact.
This gives us the following very useful way to show that a general operator T is compact:
try to construct a sequence of finite-rank operators Tn that converge to T in operator norm.
Corollary 4.21. Suppose that Tn ∈ B(H, K) are finite-rank operators, T ∈ B(H, K), and
Tn → T in operator norm. Then T is compact.
Exercise 4.22. Show that if E ∈ B(H) is compact and idempotent, then E has finite rank.
Example 4.23. Let $\{e_n\}_{n\in\mathbb{N}}$ be an orthonormal basis for a separable Hilbert space H, and
let $\lambda = (\lambda_n)_{n\in\mathbb{N}}$ be a bounded sequence of scalars. Then we know from Example 1.7 that
\[ Lf = \sum_{n=1}^{\infty} \lambda_n \langle f, e_n\rangle\, e_n \]
defines a bounded operator on H.
Suppose that $\lambda_n \to 0$ as n → ∞. Define
\[ L_N f = \sum_{n=1}^{N} \lambda_n \langle f, e_n\rangle\, e_n. \]
Since $\mathrm{range}(L_N) \subseteq \mathrm{span}\{e_1, \dots, e_N\}$ (must it be equality?), we have that $L_N$ is finite-rank.
(Exercise: Show that L is not finite-rank if there are infinitely many $\lambda_n \ne 0$.)
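Further, $L_N$ is a good approximation to L, because (using the Plancherel Theorem) we
have
\[
\begin{aligned}
\|(L - L_N)f\|^2 &= \Bigl\| \sum_{n=N+1}^{\infty} \lambda_n \langle f, e_n\rangle\, e_n \Bigr\|^2 \\
&= \sum_{n=N+1}^{\infty} |\lambda_n|^2\, |\langle f, e_n\rangle|^2 \\
&\le \Bigl(\sup_{n>N} |\lambda_n|^2\Bigr) \sum_{n=N+1}^{\infty} |\langle f, e_n\rangle|^2 \\
&\le \Bigl(\sup_{n>N} |\lambda_n|^2\Bigr) \|f\|^2.
\end{aligned}
\]
It follows that $L_N$ converges to L in operator norm:
\[ \lim_{N\to\infty} \|L - L_N\|^2 \le \lim_{N\to\infty} \sup_{n>N} |\lambda_n|^2 = \limsup_{n\to\infty} |\lambda_n|^2 = 0. \]
Since each $L_N$ is compact, we conclude that L is compact as well.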
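The tail estimate above can be illustrated with a concrete sequence. The sketch below assumes $\lambda_n = 1/n$ and works with the first M coordinates only, so it is a finite truncation rather than the operator on all of H; `M`, `tail_sup`, `residual_norm`, and the sample coefficient vector are hypothetical names and data chosen for this illustration.

```python
import math

M = 200                                   # truncation level (an assumption)
lam = [1.0 / n for n in range(1, M + 1)]  # lam[n-1] plays the role of lambda_n

def tail_sup(N):
    """sup of |lambda_n| over N < n <= M, the bound on ||L - L_N||."""
    return max(abs(x) for x in lam[N:])

def residual_norm(coeffs, N):
    """||(L - L_N) f|| for f = sum coeffs[n-1] e_n, via the Plancherel identity."""
    return math.sqrt(sum(abs(l * c) ** 2
                         for l, c in zip(lam[N:], coeffs[N:])))

f = [1.0 / n for n in range(1, M + 1)]    # coefficients <f, e_n>
f_norm = math.sqrt(sum(c * c for c in f))

# Each entry is (N, ||(L - L_N) f||, sup_{n>N} |lambda_n| * ||f||).
bounds = [(N, residual_norm(f, N), tail_sup(N) * f_norm) for N in (1, 10, 100)]
for row in bounds:
    print(row)
```

The residual stays below the bound for every N, and the bound itself shrinks like 1/(N + 1), mirroring the operator-norm convergence $L_N \to L$ derived above.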
Exercise 4.24. Continuing the preceding example, prove the following.
(a) Prove that if $\lambda_n$ does not converge to zero then L is not compact. Hint: We know at
least some of the eigenvectors of L.
(b) Prove that, with only the assumption that $\lambda \in \ell^\infty$, we have
\[ \forall\, f \in H, \quad L_N f \to Lf. \tag{4.3} \]
That is, for each individual vector f we have $\|Lf - L_N f\| \to 0$, where this is the norm
in H. A sequence of operators which satisfies (4.3) is said to converge strongly or in the
strong operator topology (SOT). Prove that strong convergence of operators does not imply
convergence in operator norm, i.e., (4.3) does not imply that $\|L - L_N\| \to 0$.
(c) Assuming that $\lambda \in \ell^\infty$, prove that L is self-adjoint if and only if every $\lambda_n$ is real.
We can characterize all the finite-rank operators, as follows.
Proposition 4.25 (Finite-Rank Operators). Let L : H → K be bounded and linear. Then
the following statements are equivalent.
(a) L has finite rank.
(b) There exist vectors $\varphi_1, \dots, \varphi_N \in H$ and $\psi_1, \dots, \psi_N \in K$ such that
\[ Lf = \sum_{k=1}^{N} \langle f, \varphi_k\rangle\, \psi_k, \qquad f \in H. \tag{4.4} \]
Proof. (a) ⇒ (b). Since L has finite rank, we know that range(L) is a finite-dimensional
subspace of K. Every finite-dimensional subspace is closed, so we can find a finite orthonormal basis
$\{\psi_k\}_{k=1}^{N}$ for range(L). Therefore, if f ∈ H then we can express Lf in terms of this
orthonormal basis:
\[ Lf = \sum_{k=1}^{N} \langle Lf, \psi_k\rangle\, \psi_k = \sum_{k=1}^{N} \langle f, L^*\psi_k\rangle\, \psi_k = \sum_{k=1}^{N} \langle f, \varphi_k\rangle\, \psi_k, \]
where $\varphi_k = L^*\psi_k$.
(b) ⇒ (a). We have range(L) ⊆ span{ψ1 , . . . , ψN }.
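Corollary 4.26. If L ∈ B(H, K) has rank 1, then there exist $\varphi \in H$ and $\psi \in K$ such that
\[ Lf = \langle f, \varphi\rangle\, \psi, \qquad f \in H. \]
In particular, if $\varphi = \psi$ are unit vectors, then $Lf = \langle f, \varphi\rangle\, \varphi$ is the orthogonal projection of
H onto $\mathrm{span}\{\varphi\}$.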
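A rank-one operator of this form is simple to realize in code. The following sketch, in pure Python on $\mathbb{R}^2$, builds $f \mapsto \langle f, \varphi\rangle \psi$ and checks the two properties one expects of the orthogonal projection onto $\mathrm{span}\{\varphi\}$ when $\varphi = \psi$ is a unit vector. The function `rank_one` and the sample vectors are hypothetical, chosen only for illustration.

```python
def inner(u, v):
    """Inner product <u, v>, linear in u and conjugate-linear in v."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def rank_one(phi, psi):
    """Return the operator f -> <f, phi> psi as a Python function."""
    def L(f):
        c = inner(f, phi)
        return [c * p for p in psi]
    return L

phi = [0.6, 0.8]          # unit vector, since 0.36 + 0.64 = 1
P = rank_one(phi, phi)    # candidate orthogonal projection onto span{phi}

f = [1.0, 2.0]
Pf = P(f)
PPf = P(Pf)                                # applying P twice
residual = [a - b for a, b in zip(f, Pf)]  # f - Pf

print(Pf, PPf, inner(residual, phi))
```

Applying P twice gives the same vector as applying it once, and the residual f − Pf is orthogonal to $\varphi$, which are the defining features of an orthogonal projection.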
Exercise 4.27. Compute the adjoint of L given by (4.4). Conclude that the adjoint of a
finite-rank operator is also finite-rank.
Exercise 4.28. Show that if A ∈ B(H) and AT = T A for every finite-rank T then A = cI
for some scalar c.
Exercise 4.29. Use the idea of Proposition 4.25 to show that if H is separable and
L : H → H is any bounded linear operator, then there exist finite-rank operators $L_N$ that
converge to L in the strong operator topology, i.e., $\|Lf - L_N f\| \to 0$ for each individual f.
However, observe that if L is not compact, then we cannot have $L_N \to L$ in operator norm.
The following result shows that not only is the operator norm limit of a sequence of finite-rank
operators compact, but every compact operator can be realized as the operator norm
limit of finite-rank operators.
Theorem 4.30. If T ∈ B(H, K), then the following statements are equivalent.
(a) T is compact.
(b) There exist finite-rank operators Tn ∈ B(H, K) such that Tn → T .
As a consequence, we have that $B_{00}(H, K)$ is a dense subspace of $B_0(H, K)$, i.e.,
\[ \overline{B_{00}(H, K)} = B_0(H, K) \qquad \text{(closure in operator norm)}. \]
Proof. (b) ⇒ (a). This follows from Theorem 4.15 and the fact that all bounded finite-rank
operators are compact.
(a) ⇒ (b). Assume that T is compact. Let $R = \overline{\mathrm{range}(T)}$. If R is finite-dimensional,
then T is finite-rank, and we are done. So, assume that R is infinite-dimensional. By
Exercise 4.18, we know that R is separable, so there exists a countable orthonormal basis
$\{e_n\}_{n\in\mathbb{N}}$ for R. For any f ∈ H we have Tf ∈ R, so
\[ Tf = \sum_{n=1}^{\infty} \langle Tf, e_n\rangle\, e_n, \qquad f \in H. \]
Define
\[ T_N f = \sum_{n=1}^{N} \langle Tf, e_n\rangle\, e_n, \qquad f \in H, \]
and note that $T_N = P_N T$ where $P_N$ is the orthogonal projection of K onto the closed
subspace $\mathrm{span}\{e_1, \dots, e_N\}$. By definition, we have that $T_N$ converges to T in the strong
operator topology, i.e., $T_N f \to Tf$ for every f. Our goal is to show more, namely, to show
that $T_N \to T$ in operator norm. That is, we need to show that $\sup_{\|f\|=1} \|Tf - T_N f\| \to 0$.
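Choose any ε > 0. Since $T(\mathrm{Ball}_H)$ is totally bounded, it is covered by finitely many balls of radius $\frac{\varepsilon}{3}$
centered at points in $T(\mathrm{Ball}_H)$. Hence, there exist $h_1, \dots, h_m \in \mathrm{Ball}_H$ such that
\[ T(\mathrm{Ball}_H) \subseteq \bigcup_{k=1}^{m} B\bigl(Th_k, \tfrac{\varepsilon}{3}\bigr). \]
Since $\lim_{N\to\infty} \|Th_k - T_N h_k\| = 0$ for k = 1, . . . , m, we can find an $N_0$ such that
\[ \forall\, N > N_0, \quad \|Th_k - T_N h_k\| < \frac{\varepsilon}{3}, \qquad k = 1, \dots, m. \]
Choose any f with $\|f\| = 1$ and any $N > N_0$. Then $Tf \in B\bigl(Th_k, \tfrac{\varepsilon}{3}\bigr)$ for some k, i.e.,
\[ \|Tf - Th_k\| < \frac{\varepsilon}{3}. \]
Therefore we also have
\[
\begin{aligned}
\|T_N f - T_N h_k\| &= \Bigl\| \sum_{n=1}^{N} \langle T(f - h_k), e_n\rangle\, e_n \Bigr\| \\
&= \Bigl( \sum_{n=1}^{N} |\langle T(f - h_k), e_n\rangle|^2 \Bigr)^{1/2} \\
&\le \Bigl( \sum_{n=1}^{\infty} |\langle T(f - h_k), e_n\rangle|^2 \Bigr)^{1/2} = \|Tf - Th_k\| < \frac{\varepsilon}{3}.
\end{aligned}
\]
Alternatively, this follows even more simply from the fact that
\[ \|T_N f - T_N h_k\| = \|P_N Tf - P_N Th_k\| \le \|P_N\|\,\|Tf - Th_k\| < 1 \cdot \frac{\varepsilon}{3}. \]
In any case, it follows that
\[ \|Tf - T_N f\| \le \|Tf - Th_k\| + \|Th_k - T_N h_k\| + \|T_N h_k - T_N f\| < \frac{\varepsilon}{3} + \frac{\varepsilon}{3} + \frac{\varepsilon}{3} = \varepsilon. \]
This is true for every unit vector, so we have $\|T - T_N\| \le \varepsilon$ for all $N > N_0$. Therefore, we
do indeed have $\|T - T_N\| \to 0$.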
Corollary 4.31. If T ∈ B(H, K), then
\[ T \text{ is compact} \iff T^* \text{ is compact}. \]
Proof. Assume that T is compact. Then there exist finite-rank operators $T_N$ such that
$T_N \to T$. Hence $T_N^* \to T^*$ (why?), but each $T_N^*$ is finite-rank, so $T^*$ is compact. The
converse is symmetrical.
Exercise 4.32. Extend Example 4.23 as follows. Let H be a separable Hilbert space, and
let $\{e_n\}_{n\in\mathbb{N}}$ be an orthonormal basis for H. Let $\lambda = (\lambda_n)_{n\in\mathbb{N}} \in \ell^\infty(\mathbb{N})$ be given. Define
$Le_n = \lambda_n e_n$. Prove that the definition of L can be extended to all of H in such a way that L
is a bounded linear operator. Prove that this operator L is compact if and only if $\lambda_n \to 0$.
The next result shows that an integral operator with a square-integrable kernel is compact.
Theorem 4.33. Let (X, Ω, µ) be a σ-finite measure space. If $k \in L^2(X \times X)$, then the
integral operator
\[ Lf(x) = \int_X k(x, y)\, f(y)\, d\mu(y), \qquad f \in L^2(X), \]
defines a compact mapping of $L^2(X)$ into itself. Further, $\|L\| \le \|k\|_2$.
Proof. Note that by Theorem 1.22 we already know that L defines a bounded operator, and
that $\|L\| \le \|k\|_2$. So, we need only show that L is compact.
For simplicity, we will consider only the case where $L^2(X)$ is separable. In this case there
exists an orthonormal basis $\{e_n\}_{n\in\mathbb{N}}$ for $L^2(X)$. Define
\[ e_{mn}(x, y) = e_m(x)\, \overline{e_n(y)}, \qquad x, y \in X. \]
Then it is easy to see that $\{e_{mn}\}_{m,n\in\mathbb{N}}$ is an orthonormal sequence in $L^2(X \times X)$, and with
more work (exercise; for details on this type of argument, see the “Real Analysis Review” handout on the instructor’s webpage) it can be shown that it is also complete and hence forms an
orthonormal basis for $L^2(X \times X)$. Since $k \in L^2(X \times X)$, we therefore have
\[ k = \sum_{m=1}^{\infty} \sum_{n=1}^{\infty} \langle k, e_{mn}\rangle\, e_{mn}, \]
where this series converges in the norm of $L^2(X \times X)$, and in fact it converges unconditionally.
For each N ∈ N define an approximation to k by setting
\[ k_N = \sum_{m=1}^{N} \sum_{n=1}^{N} \langle k, e_{mn}\rangle\, e_{mn}. \]
Then $k_N \to k$ in $L^2$-norm.
Now define an approximation to L by defining $L_N$ to be the integral operator with kernel
$k_N$, i.e.,
\[ L_N f(x) = \int_X k_N(x, y)\, f(y)\, d\mu(y), \qquad f \in L^2(X). \]
Since $k_N \in L^2(X \times X)$, we know that $L_N$ is bounded. Further, since the sums involved are
finite, we can interchange sums and integrals in the following calculation to obtain that
\[
\begin{aligned}
L_N f(x) &= \int_X k_N(x, y)\, f(y)\, d\mu(y) \\
&= \int_X \sum_{m=1}^{N} \sum_{n=1}^{N} \langle k, e_{mn}\rangle\, e_{mn}(x, y)\, f(y)\, d\mu(y) \\
&= \sum_{m=1}^{N} \sum_{n=1}^{N} \langle k, e_{mn}\rangle \int_X e_m(x)\, \overline{e_n(y)}\, f(y)\, d\mu(y) \\
&= \sum_{m=1}^{N} \sum_{n=1}^{N} \langle k, e_{mn}\rangle\, \langle f, e_n\rangle\, e_m(x).
\end{aligned}
\]
This is an equality of functions, i.e., $L_N f = \sum_{m=1}^{N} \sum_{n=1}^{N} \langle k, e_{mn}\rangle\, \langle f, e_n\rangle\, e_m$ a.e. In any case
we have $L_N f \in \mathrm{span}\{e_1, \dots, e_N\}$, so $L_N$ has finite rank. Since $L_N$ is bounded (why?), it is
therefore compact.
Consequently, if we can show that LN → L, then we can conclude that L itself is compact.
Note that $L - L_N$ is simply the integral operator with kernel $k - k_N$. Since $k - k_N \in L^2(X \times X)$,
we know that $L - L_N$ is bounded, and that
\[ \|L - L_N\| \le \|k - k_N\|_2 \to 0 \ \text{ as } N \to \infty. \]
Hence $L_N \to L$ in operator norm, so L is compact.
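The norm bound $\|L\| \le \|k\|_2$ used twice in this proof can be seen in the simplest setting, assuming X = {0, 1, 2} with counting measure, so that the integral operator is just multiplication by a 3 × 3 matrix and $\|k\|_2$ is its Frobenius norm. The kernel values, the sample vectors, and the helper names below are made up for this sketch.

```python
import math

# Kernel k(x, y) on X = {0, 1, 2} with counting measure (illustrative values).
k = [[1.0, 2.0, 0.0],
     [0.5, -1.0, 3.0],
     [2.0, 0.0, 1.0]]

def integral_operator(f):
    """(Lf)(x) = sum over y of k(x, y) f(y), the counting-measure integral."""
    return [sum(k[x][y] * f[y] for y in range(3)) for x in range(3)]

def l2_norm(v):
    return math.sqrt(sum(abs(t) ** 2 for t in v))

# ||k||_2 taken over X x X, i.e. the Frobenius norm of the matrix.
kernel_norm = math.sqrt(sum(abs(k[x][y]) ** 2
                            for x in range(3) for y in range(3)))

samples = [[1.0, 0.0, 0.0], [1.0, 1.0, 1.0], [0.0, -1.0, 2.0]]
ratios = [l2_norm(integral_operator(f)) / l2_norm(f) for f in samples]
print(kernel_norm, ratios)
```

Every ratio $\|Lf\| / \|f\|$ stays below $\|k\|_2$, which is the Cauchy–Schwarz estimate behind the bound; applied to the kernel $k - k_N$, the same estimate gives the convergence $L_N \to L$.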
For the remainder of this section, we consider eigenvalues and eigenvectors of compact
operators.
Definition 4.34. Let A ∈ B(H) be given.
(a) A scalar λ ∈ F is an eigenvalue of A if there exists a nonzero vector f ∈ H such that
Af = λf. Equivalently, λ is an eigenvalue if $\ker(A - \lambda I) \ne \{0\}$.
(b) If λ ∈ F is an eigenvalue of A, then any nonzero vector in ker(A − λI) is called
an eigenvector of A corresponding to the eigenvalue λ, or simply a λ-eigenvector for
short. Equivalently, a nonzero vector f ∈ H is a λ-eigenvector if Af = λf .
(c) If λ ∈ F is an eigenvalue of A, then ker(A − λI) is called the eigenspace of A
corresponding to the eigenvalue λ, or simply a λ-eigenspace for short. Every nonzero
vector in the λ-eigenspace is a λ-eigenvector of A.
(d) The point spectrum σp (A) of A is the set of eigenvalues of A:
σp (A) = {λ ∈ F : λ is an eigenvalue of A}.
Exercise 4.35. Let $\{e_n\}_{n\in\mathbb{N}}$ be an orthonormal basis for a separable Hilbert space H, and let
$\lambda = (\lambda_n)_{n\in\mathbb{N}} \in \ell^\infty(\mathbb{N})$ be fixed. Let L : H → H be the bounded operator $Lf = \sum_n \lambda_n \langle f, e_n\rangle\, e_n$
defined in Example 1.7.
(a) Show that $\sigma_p(L) = \{\lambda_n : n \in \mathbb{N}\}$.
(b) Show that if µ is one component of λ and $J = \{n \in \mathbb{N} : \lambda_n = \mu\}$, then the µ-eigenspace of L is $\overline{\mathrm{span}}\{e_n\}_{n\in J}$.
(c) Show that the eigenspaces of L corresponding to distinct eigenvalues are orthogonal.
Exercise 4.36. Let L ∈ B(H) be given. Prove the following.
(a) If L is self-adjoint, then all eigenvalues of L are real.
(b) If L is positive ($\langle Lf, f\rangle \ge 0$ for all f), then all eigenvalues of L are nonnegative.
(c) If L is positive definite ($\langle Lf, f\rangle > 0$ for all $f \ne 0$), then all eigenvalues of L are
strictly positive.
(d) If L is unitary, then every eigenvalue λ satisfies |λ| = 1.
Exercise 4.37. Suppose that L ∈ B(H) is normal. Prove that if $\lambda \ne \mu$ are distinct
eigenvalues of L, then the corresponding eigenspaces are orthogonal, i.e., $\ker(L - \lambda I) \perp \ker(L - \mu I)$.
While any linear operator A : Cn → Cn must have an eigenvalue, bounded operators on
infinite-dimensional Hilbert spaces need not have any eigenvalues.
Exercise 4.38. (a) Prove that the Volterra operator defined in Exercise 1.25 is compact but
has no eigenvalues, i.e., its point spectrum is empty.
(b) Prove that the right-shift operator R on $\ell^2(\mathbb{N})$ has no eigenvalues.
(c) Prove that every scalar |λ| < 1 is an eigenvalue of the left-shift operator L on $\ell^2(\mathbb{N})$, and
find the corresponding eigenvectors. Thus, this operator has uncountably many eigenvalues.
(d) Let φ(x) = x. Prove that the multiplication operator $M_\phi : L^2[0, 1] \to L^2[0, 1]$, defined
by $M_\phi f(x) = x f(x)$, is self-adjoint but has no eigenvalues.
(e) Define
\[
k(x, y) = \begin{cases} i, & y \le x, \\ -i, & y > x, \end{cases}
\]
and let $L : L^2[0,1] \to L^2[0,1]$ be the integral operator with kernel $k$. Prove that $L$ is both compact and self-adjoint. Prove that the eigenvalues of $L$ are $\lambda_k = \frac{2}{(2k+1)\pi}$ (and only these), and find the corresponding eigenvectors.
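As a numerical sanity check on part (e), and not a substitute for the proof, the following sketch discretizes the kernel on a uniform grid and compares the extreme eigenvalues of the resulting Hermitian matrix with $2/((2k+1)\pi)$ for $k = 0, 1$ and $k = -1$. The grid size `n`, the helper name `discretized_kernel_operator`, and the choice to put zeros on the diagonal (the set $y = x$ has measure zero) are assumptions made purely for illustration.

```python
import numpy as np

def discretized_kernel_operator(n):
    """Matrix h * k(x_i, x_j) on an n-point uniform grid (diagonal set to 0)."""
    h = 1.0 / n
    idx = np.arange(n)
    # sign[i, j] = +1 if x_j < x_i, -1 if x_j > x_i, 0 on the diagonal
    sign = np.sign(idx[:, None] - idx[None, :])
    return 1j * h * sign

n = 400  # illustrative grid size
M = discretized_kernel_operator(n)

# M is Hermitian, so eigvalsh returns real eigenvalues in ascending order.
eigs = np.linalg.eigvalsh(M)
largest, second_largest, smallest = eigs[-1], eigs[-2], eigs[0]

print(largest, 2 / np.pi)               # compare with k = 0
print(second_largest, 2 / (3 * np.pi))  # compare with k = 1
print(smallest, -2 / np.pi)             # compare with k = -1
```

Refining the grid brings the discrete eigenvalues closer to the claimed values, which is consistent with (though of course does not prove) the statement of the exercise.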
Exercise 4.39 (Convolution). Fix $g \in L^2[0,1]$, where we consider functions in $L^2[0,1]$ to be 1-periodically extended to the real line. Let $T$ be the convolution operator
\[
Tf(x) = (f * g)(x) = \int_0^1 g(x - y)\, f(y)\, dy.
\]
(a) Prove that T is compact. Hint: Write T as an integral operator and show that its
kernel is square-integrable. Note that the fact that [0, 1] has finite measure is important.
(b) Prove that the complex exponential functions $e_n(x) = e^{2\pi i n x}$ are eigenvectors of $T$, with corresponding eigenvalues $\hat{g}(n)$ (the Fourier coefficients of $g$).
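A discrete analogue of part (b) can be checked numerically. The sketch below replaces the integral by a Riemann sum on $N$ equally spaced points, so that $T$ becomes a circulant matrix, and verifies that a sampled exponential is an eigenvector whose eigenvalue is the corresponding discrete Fourier coefficient. The particular $g$, the grid size `N`, and the frequency `m` are hypothetical choices, and the discrete Fourier coefficient only approximates $\hat{g}(m)$.

```python
import numpy as np

N = 64                               # illustrative number of grid points
x = np.arange(N) / N
g = np.exp(np.cos(2 * np.pi * x))    # a hypothetical smooth 1-periodic g

# Riemann-sum discretization of (Tf)(x_i) = int_0^1 g(x_i - y) f(y) dy:
# C[i, j] = g(x_i - x_j) / N, with indices taken mod N by periodicity.
i = np.arange(N)
C = g[(i[:, None] - i[None, :]) % N] / N

# Discrete Fourier coefficients: g_hat[m] = (1/N) sum_l g(x_l) e^{-2 pi i m l / N}.
g_hat = np.fft.fft(g) / N

m = 3                                # illustrative frequency
e_m = np.exp(2j * np.pi * m * x)     # sampled exponential e_m
residual = np.linalg.norm(C @ e_m - g_hat[m] * e_m)
print(residual)
```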
Exercise 4.40. (a) Assume that A ∈ B(H) is normal and let λ ∈ F be given. Show that
A − λI is normal. Use this to show that ker(A − λI) = ker(A∗ − λ̄I). Conclude that if λ
is an eigenvalue of A then λ̄ is an eigenvalue of A∗ , and the corresponding eigenspaces are
equal.
Hint: Consider Corollary 2.21.
(b) Find an example of a non-normal operator for which the conclusion of part (a) fails.
Hint: Consider a shift operator.
The next result shows that the eigenspaces (if any) of a compact operator corresponding
to nonzero eigenvalues must be finite-dimensional.
Proposition 4.41. Assume that $T : H \to H$ is compact and that $\lambda \ne 0$ is an eigenvalue of $T$. Then $\ker(T - \lambda I)$ is finite-dimensional.
Proof. Since $T$ is bounded, we know that $\ker(T - \lambda I)$ is a closed subspace of $H$. Suppose that it were infinite-dimensional. Then we could find an infinite orthonormal sequence $\{e_n\}_{n \in \mathbb{N}}$ in $\ker(T - \lambda I)$. Each $e_n$ is a $\lambda$-eigenvector, i.e., $T e_n = \lambda e_n$. But then $\{e_n\}_{n \in \mathbb{N}}$ is a bounded sequence in $H$, yet for $m \ne n$,
\[
\|T e_m - T e_n\| = |\lambda|\, \|e_m - e_n\| = |\lambda| \sqrt{2},
\]
so since $\lambda \ne 0$ there can be no convergent subsequences of $\{T e_n\}_{n \in \mathbb{N}}$, which contradicts the fact that $T$ is compact.
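The constant $\sqrt{2}$ in the proof comes directly from orthonormality. Written out for $m \ne n$:

```latex
\|e_m - e_n\|^2
  = \|e_m\|^2 - 2\,\mathrm{Re}\,\langle e_m, e_n \rangle + \|e_n\|^2
  = 1 - 0 + 1
  = 2 .
```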
The following is a useful theoretical result that implies the existence of an eigenvalue of a compact operator $T$. It states that if $\inf_{\|f\|=1} \|Tf - \lambda f\| = 0$, then this infimum is actually achieved, i.e., $\|Tf - \lambda f\| = 0$ for some unit vector $f$, or in other words, there exists a $\lambda$-eigenvector for $T$.
Proposition 4.42. Assume that $T : H \to H$ is compact and that $\lambda \ne 0$ is given. Then:
\[
\inf_{\|f\|=1} \|Tf - \lambda f\| = 0 \quad \Longrightarrow \quad \lambda \in \sigma_p(T).
\]
Proof. Assume that $\inf_{\|f\|=1} \|Tf - \lambda f\| = 0$. Then we can find unit vectors $f_n$ such that $\|Tf_n - \lambda f_n\| \to 0$. Since $T$ is compact, $\{Tf_n\}_{n \in \mathbb{N}}$ has a convergent subsequence, say $Tf_{n_k} \to g \in H$. Since $\lambda \ne 0$ we have
\[
f_{n_k} = \frac{\lambda f_{n_k} - Tf_{n_k} + Tf_{n_k}}{\lambda} \to \frac{0 + g}{\lambda} = \frac{g}{\lambda}. \tag{4.5}
\]
Since the $f_{n_k}$ are unit vectors, we have $\|g\| = |\lambda|$, and in particular $g \ne 0$. Moreover, since $T$ is continuous it follows from (4.5) that $Tf_{n_k} \to Tg/\lambda$. But we also know that $Tf_{n_k} \to g$, so we conclude that $Tg/\lambda = g$, or in other words that $g$ is a $\lambda$-eigenvector.
Corollary 4.43. Assume $T : H \to H$ is compact and that $\lambda \ne 0$. If $\lambda \notin \sigma_p(T)$ and $\bar{\lambda} \notin \sigma_p(T^*)$, then $T - \lambda I$ is a bounded bijection of $H$ onto itself, and $(T - \lambda I)^{-1}$ is bounded.
Proof. Since $\lambda$ is not an eigenvalue, we know that $T - \lambda I$ is injective. Further, it follows from the preceding proposition that $\inf_{\|f\|=1} \|Tf - \lambda f\| > 0$. Hence there exists a $C > 0$ such that $\|Tf - \lambda f\| \ge C$ for every unit vector $f$, and hence
\[
\forall f \in H, \quad \|Tf - \lambda f\| \ge C\, \|f\|. \tag{4.6}
\]
It follows from Exercise 1.2 that $\mathrm{range}(T - \lambda I)$ is a closed subspace of $H$. But then, since $\bar{\lambda}$ is not an eigenvalue of $T^*$, we have that
\[
\mathrm{range}(T - \lambda I) = \overline{\mathrm{range}(T - \lambda I)} = \ker\bigl((T - \lambda I)^*\bigr)^\perp = \ker(T^* - \bar{\lambda} I)^\perp = \{0\}^\perp = H.
\]
Thus $T - \lambda I$ is a bounded bijection. It remains to show that $(T - \lambda I)^{-1}$ is bounded. Given $f \in H$ we have from (4.6) that
\[
\|f\| = \|(T - \lambda I)(T - \lambda I)^{-1} f\| \ge C\, \|(T - \lambda I)^{-1} f\|.
\]
Rearranging, we see that $\|(T - \lambda I)^{-1}\| \le 1/C < \infty$.
Actually, it can be shown that if $T : H \to H$ is compact, $\lambda \ne 0$, and $\lambda \notin \sigma_p(T)$, then $\bar{\lambda} \notin \sigma_p(T^*)$ follows automatically.
5. The Diagonalization of Compact Self-Adjoint Operators
First let us summarize the facts that have been developed regarding the operator L introduced in Example 1.7 and studied in other examples in previous sections.
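Theorem 5.1. Let $\{e_n\}_{n \in \mathbb{N}}$ be an orthonormal basis for a separable Hilbert space $H$, and let $\lambda = (\lambda_n)_{n \in \mathbb{N}} \in \ell^\infty(\mathbb{N})$ be a bounded sequence of scalars. Define
\[
Lf = \sum_{n=1}^\infty \lambda_n \langle f, e_n \rangle e_n, \quad f \in H. \tag{5.1}
\]
Then the following statements hold.
(a) $L$ is bounded, and $\|L\| = \|\lambda\|_\infty$.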
(b) $L$ is normal, and $L^* f = \sum_{n=1}^\infty \bar{\lambda}_n \langle f, e_n \rangle e_n$.
(c) L is self-adjoint if and only if λn ∈ R for every n.
(d) L is compact if and only if λn → 0.
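Parts (a) and (b) have an exact finite-dimensional counterpart that is easy to check numerically. The following sketch, under the assumption that $H = \mathbb{C}^n$ with an orthonormal basis taken from a QR factorization of a random matrix, builds $L = U \operatorname{diag}(\lambda) U^H$ and confirms that its operator norm equals $\max_n |\lambda_n|$, that its adjoint uses the conjugated $\lambda_n$, and that it is normal. The dimension and the random seed are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)   # arbitrary seed
n = 6                            # illustrative dimension

# Columns of U form an orthonormal basis e_1, ..., e_n of C^n.
U, _ = np.linalg.qr(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
lam = rng.standard_normal(n) + 1j * rng.standard_normal(n)

# L f = sum_n lam_n <f, e_n> e_n, i.e. L = U diag(lam) U^H.
L = U @ np.diag(lam) @ U.conj().T
# Candidate adjoint from part (b): conjugate each lam_n.
L_adj = U @ np.diag(lam.conj()) @ U.conj().T

op_norm = np.linalg.norm(L, 2)                   # largest singular value
sup_lam = np.max(np.abs(lam))                    # ||lam||_infinity
adjoint_matches = np.allclose(L_adj, L.conj().T)
is_normal = np.allclose(L @ L_adj, L_adj @ L)
print(op_norm, sup_lam, adjoint_matches, is_normal)
```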
Exercise 5.2. Assume that $\lambda_n \to 0$ and that $L$ is defined by (5.1). In the definition of $L$, combine those terms corresponding to identical $\lambda_n$ together. That is, let $\mu = (\mu_k)_{k \in I}$ be the sequence of distinct values in $\lambda$ (so $I$ is either $\{1, \dots, N\}$ if there are only finitely many distinct values, or $I = \mathbb{N}$ if there are infinitely many). If we set $J_k = \{n \in \mathbb{N} : \lambda_n = \mu_k\}$, then
\[
P_k f = \sum_{n \in J_k} \langle f, e_n \rangle e_n
\]
is the orthogonal projection of $H$ onto $\overline{\mathrm{span}}\{e_n\}_{n \in J_k}$. Show that the operator $L$ defined in (5.1) can be rewritten as
\[
Lf = \sum_{k \in I} \mu_k P_k f, \quad f \in H,
\]
with convergence of the series in the norm of $H$. Show further that
\[
L = \sum_{k \in I} \mu_k P_k,
\]
with convergence of the series in operator norm. Show that $\overline{\mathrm{span}}\{e_n\}_{n \in J_k}$ is the $\mu_k$-eigenspace of $L$. Show that $P_j P_k = P_k P_j = 0$ for all $j \ne k$ in $I$, and consequently the eigenspaces corresponding to distinct eigenvalues are orthogonal.
In this section we will prove a converse result, showing that all compact, self-adjoint
operators on a Hilbert space can be represented in the form of (5.1). First, however, we need
to develop some useful machinery.
Exercise 5.3. Show that if $\lambda$ is an eigenvalue of $L \in B(H)$, then $|\lambda| \le \|L\|$.
Exercise 5.4. Let A be an n × n complex matrix. Define its spectral radius to be
ρ(A) = max{|λ| : λ is an eigenvalue of A} = max{|λ| : λ ∈ σp (A)}.
By the preceding exercise, if we choose any norm on $\mathbb{C}^n$ and let $\|A\|$ be the corresponding operator norm, we have
\[
\rho(A) \le \|A\|.
\]
(a) Prove that if $A$ is self-adjoint and we use the Euclidean ($\ell^2$) norm on $\mathbb{C}^n$, then $\|A\| = \rho(A)$.
(b) Prove that the same equality holds if $A$ is normal. Find an example of a non-normal matrix for which $\rho(A) < \|A\|$.
(c) Prove that if $A$ is any $n \times n$ matrix, then (still using the Euclidean norm on $\mathbb{C}^n$), $\|A\| = \rho(A^* A)^{1/2}$.
(d) (Harder). Prove that if $A$ is a fixed but arbitrary $n \times n$ complex matrix and $\varepsilon > 0$ is given, then there exists a norm on $\mathbb{C}^n$ such that the corresponding operator norm of $A$ satisfies
\[
\|A\| \le \rho(A) + \varepsilon.
\]
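The inequality $\rho(A) \le \|A\|$ and the identities in parts (a) and (c) can be spot-checked numerically before attempting the proofs. The sketch below does this for one random complex matrix `B` and its self-adjoint part-sum `H = B + B^*`; the helper names `spectral_radius` and `op_norm`, the matrix size, and the seed are illustrative assumptions, and a numerical check on one matrix is of course not a proof.

```python
import numpy as np

def spectral_radius(A):
    """Largest modulus of an eigenvalue of the square matrix A."""
    return np.max(np.abs(np.linalg.eigvals(A)))

def op_norm(A):
    """Operator norm induced by the Euclidean norm (largest singular value)."""
    return np.linalg.norm(A, 2)

rng = np.random.default_rng(1)   # arbitrary seed
B = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))  # generic matrix
H = B + B.conj().T                                                   # self-adjoint

rho_B, norm_B = spectral_radius(B), op_norm(B)    # expect rho_B <= norm_B
rho_H, norm_H = spectral_radius(H), op_norm(H)    # part (a): expect equality
norm_via_BstarB = np.sqrt(spectral_radius(B.conj().T @ B))  # part (c)
print(rho_B, norm_B)
print(rho_H, norm_H)
print(norm_via_BstarB, norm_B)
```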
Although an arbitrary compact operator need not have any eigenvalues (see Exercise 4.38), the following result shows that a compact, self-adjoint operator must have at least one eigenvalue.
Proposition 5.5. If $T : H \to H$ is compact and self-adjoint, then either $\|T\|$ or $-\|T\|$ is an eigenvalue of $T$.
Proof. If $T = 0$ the claim is trivial, so assume that $\|T\| > 0$. Since $T$ is self-adjoint, we know from Proposition 2.18 that
\[
\|T\| = \sup_{\|f\|=1} |\langle Tf, f \rangle|.
\]
Hence, there must exist unit vectors $f_n$ such that $|\langle Tf_n, f_n \rangle| \to \|T\|$. Since $T$ is self-adjoint, all the inner products $\langle Tf_n, f_n \rangle$ are real, so we can pass to a subsequence along which $\langle Tf_n, f_n \rangle$ converges either to $\|T\|$ or to $-\|T\|$. Call this subsequence $\{g_n\}_{n \in \mathbb{N}}$, and let $\lambda$ be either $\|T\|$ or $-\|T\|$ as appropriate. Then we have $\|g_n\| = 1$ for every $n$, and $\langle Tg_n, g_n \rangle \to \lambda$. Hence, since both $\lambda$ and $\langle Tg_n, g_n \rangle$ are real,
\[
\begin{aligned}
\|Tg_n - \lambda g_n\|^2 &= \|Tg_n\|^2 - 2\lambda \langle Tg_n, g_n \rangle + \lambda^2 \|g_n\|^2 \\
&\le \|T\|^2 \|g_n\|^2 - 2\lambda \langle Tg_n, g_n \rangle + \lambda^2 \|g_n\|^2 \\
&= \lambda^2 - 2\lambda \langle Tg_n, g_n \rangle + \lambda^2 \\
&\to \lambda^2 - 2\lambda^2 + \lambda^2 = 0.
\end{aligned}
\]
Since $\lambda \ne 0$, it therefore follows from Proposition 4.42 that $\lambda$ is an eigenvalue of $T$.
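In finite dimensions the sequence $g_n$ from the proof can be produced concretely. The sketch below uses power iteration, which is a swapped-in technique and not the argument of the proof, to generate unit vectors with $\langle Tg_n, g_n \rangle \to \lambda$ and $\|Tg_n - \lambda g_n\| \to 0$. The test matrix is a hypothetical symmetric matrix with prescribed eigenvalues $-3, 1, 0.5, 2$, chosen so that the eigenvalue of largest modulus is $-\|T\|$ rather than $+\|T\|$; the seed and iteration count are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2)   # arbitrary seed
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))   # random orthogonal matrix

# Hypothetical symmetric T with eigenvalues -3, 1, 0.5, 2, so ||T|| = 3.
T = Q @ np.diag([-3.0, 1.0, 0.5, 2.0]) @ Q.T

# Power iteration: unit vectors g with |<T g, g>| approaching ||T||.
g = rng.standard_normal(4)
g /= np.linalg.norm(g)
for _ in range(200):
    g = T @ g
    g /= np.linalg.norm(g)

lam = g @ T @ g                               # Rayleigh quotient <T g, g>
residual = np.linalg.norm(T @ g - lam * g)    # ||T g - lam g||
norm_T = np.linalg.norm(T, 2)
print(lam, -norm_T, residual)
```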
Now we can prove that every compact, self-adjoint operator has a very simple and special
form.
Theorem 5.6 (Spectral Theorem for Compact Self-Adjoint Operators). Let $T : H \to H$ be compact and self-adjoint. Then there exist nonzero real numbers $\{\lambda_n\}_{n \in J}$, either finitely many or $\lambda_n \to 0$ if infinitely many, and an orthonormal basis $\{e_n\}_{n \in J}$ of $\overline{\mathrm{range}(T)}$, such that
\[
Tf = \sum_{n \in J} \lambda_n \langle f, e_n \rangle e_n, \quad f \in H.
\]
Each $\lambda_n$ is an eigenvalue of $T$, and each $e_n$ is a corresponding eigenvector.
Proof. Note that since $T$ is compact, range($T$) is separable by Exercise 4.18.
If $T = 0$ then the result is trivial, so assume that $T$ is not the zero operator.
Let $H_1 = H$ and $T_1 = T$. By Proposition 5.5, $T_1$ has an eigenvalue $\lambda_1$ which satisfies $|\lambda_1| = \|T_1\| > 0$. Let $e_1$ be a corresponding eigenvector, normalized to $\|e_1\| = 1$.
Let $H_2 = \{e_1\}^\perp$ and let $T_2 = T|_{H_2}$ (the restriction of $T$ to $H_2$). If $T_2 = 0$, then stop at this point. Otherwise, continue as follows.
Since $\mathrm{span}\{e_1\}$ is invariant under $T_1$ (after all, $e_1$ is an eigenvector), we know from Exercise 3.10 that $H_2$ is invariant under $T_1^* = T_1$. Exercise: Show that $T_2 : H_2 \to H_2$ is compact and self-adjoint. Therefore $T_2$ has an eigenvalue $\lambda_2$ such that $|\lambda_2| = \|T_2\| > 0$. Note that since $T_2$ is a restriction of $T_1$, we have $|\lambda_2| = \|T_2\| \le \|T_1\| = |\lambda_1|$. Let $e_2$ be a corresponding eigenvector, normalized to $\|e_2\| = 1$. Note that by definition of $H_2$, we have $e_2 \perp e_1$. Further, $\lambda_2$ is an eigenvalue of $T$ (not just $T_2$), and $e_2$ is the corresponding eigenvector of $T$.
Let $H_3 = \{e_1, e_2\}^\perp$ and let $T_3 = T|_{H_3}$. If $T_3 = 0$, then stop at this point. Otherwise, continue as before to construct an eigenvalue $\lambda_3$ and eigenvector $e_3$ (which will be orthogonal to both $e_1$ and $e_2$).
Continuing in this process, there are two possibilities.
Case 1: $T_{N+1} = 0$ for some $N$. In this case, since $H_{N+1} = \{e_1, \dots, e_N\}^\perp$, we have
\[
H = \mathrm{span}\{e_1, \dots, e_N\} \oplus H_{N+1}.
\]
Therefore, if $f \in H$ then we can write $f$ uniquely as
\[
f = \sum_{n=1}^N \langle f, e_n \rangle e_n + v
\]
where $v \in H_{N+1}$. Since $T(v) = T_{N+1}(v) = 0$, we therefore have
\[
Tf = \sum_{n=1}^N \langle f, e_n \rangle T(e_n) + T(v) = \sum_{n=1}^N \lambda_n \langle f, e_n \rangle e_n.
\]
In this case $T$ is finite-rank and the proof is complete.
Case 2: $T_N \ne 0$ for every $N$. In this case we obtain countably many eigenvalues $\lambda_n$ and corresponding orthonormal eigenvectors $e_n$. Since $T$ is compact, we have by Exercise 4.10 that $\lambda_n e_n = T(e_n) \to 0$. Since $\|e_n\| = 1$, we conclude that $\lambda_n \to 0$.
Let $M = \overline{\mathrm{span}}\{e_n\}_{n \in \mathbb{N}}$. Then $\{e_n\}_{n \in \mathbb{N}}$ is an orthonormal basis for $M$, and $H = M \oplus M^\perp$. Hence, if $f \in H$ then we can write $f$ uniquely as
\[
f = \sum_{n=1}^\infty \langle f, e_n \rangle e_n + v
\]
for some $v \in M^\perp$. Therefore
\[
Tf = \sum_{n=1}^\infty \langle f, e_n \rangle T(e_n) + T(v) = \sum_{n=1}^\infty \lambda_n \langle f, e_n \rangle e_n + T(v).
\]
If we show that $T(v) = 0$, then we are done.
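Note that since $\mathrm{span}\{e_1, \dots, e_N\} \subseteq M$, we have $v \in M^\perp \subseteq \mathrm{span}\{e_1, \dots, e_N\}^\perp = H_{N+1} \subseteq H_N$. Hence
\[
\|T(v)\| = \|T_N(v)\| \le \|T_N\|\, \|v\| = |\lambda_N|\, \|v\| \to 0 \quad \text{as } N \to \infty.
\]
Consequently $T(v) = 0$.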
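The construction in the proof is effectively a deflation procedure, and in finite dimensions it can be run as written. The sketch below is one possible rendering under stated assumptions: $H = \mathbb{C}^5$, the restriction $T_k = T|_{H_k}$ is represented by $P T P$ with $P$ the orthogonal projection onto $H_k$, each eigenpair with $|\lambda_k| = \|T_k\|$ is found with `numpy.linalg.eigh` (standing in for Proposition 5.5), and "$T_k = 0$" is tested against a tolerance `tol`. The function name `deflate` and the test matrix with eigenvalues $3, -2, 1, 0, 0$ are hypothetical.

```python
import numpy as np

def deflate(T, tol=1e-10):
    """Peel off eigenpairs of a Hermitian matrix T one at a time, as in the proof."""
    n = T.shape[0]
    lams, vecs = [], []
    P = np.eye(n, dtype=complex)         # orthogonal projection onto H_k
    while True:
        Tk = P @ T @ P                   # T restricted to H_k (zero on the complement)
        w, V = np.linalg.eigh(Tk)
        j = np.argmax(np.abs(w))         # eigenvalue with |lambda_k| = ||T_k||
        if abs(w[j]) < tol:              # T_k = 0 up to tolerance: stop (Case 1)
            break
        e = V[:, j]                      # unit eigenvector lying in H_k
        lams.append(w[j])
        vecs.append(e)
        P = P - np.outer(e, e.conj())    # H_{k+1}: also orthogonal to the new e
    E = np.column_stack(vecs) if vecs else np.zeros((n, 0), dtype=complex)
    return np.array(lams), E

rng = np.random.default_rng(3)           # arbitrary seed
Q, _ = np.linalg.qr(rng.standard_normal((5, 5)))
T = Q @ np.diag([3.0, -2.0, 1.0, 0.0, 0.0]) @ Q.T   # rank 3, symmetric

lams, E = deflate(T)
T_rebuilt = E @ np.diag(lams) @ E.conj().T   # sum_n lam_n <., e_n> e_n
print(lams)
print(np.allclose(T_rebuilt, T))
```

The eigenvalues come out in order of nonincreasing modulus, matching $|\lambda_{k+1}| \le |\lambda_k|$ in the proof, and the zero eigenvalue is never selected, so the vectors found span only the range of $T$.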
Since each eigenspace corresponding to nonzero eigenvalues is finite-dimensional, we can
group terms corresponding to the same eigenvalue together. Alternatively, we could write a
more efficient proof of the Spectral Theorem (as Conway does), by using the same argument
on the distinct eigenvalues and corresponding eigenspaces, instead of one eigenvalue and
eigenvector at a time. Either way, an extension of the ideas used in the preceding result
gives the following expanded form of the Spectral Theorem.
Theorem 5.7 (Spectral Theorem for Compact Self-Adjoint Operators). Let T : H → H be
compact and self-adjoint. Then the following statements hold.
(a) T has only a finite or countably infinite number of distinct eigenvalues, and each
eigenvalue is real.
(b) Let $\{\mu_1, \mu_2, \dots\} = \{\mu_k\}_{k \in I}$ be the distinct nonzero eigenvalues, where either $I = \{1, \dots, N\}$ or $I = \mathbb{N}$. Then each eigenspace
\[
E_k = \ker(T - \mu_k I)
\]
is finite-dimensional.
(c) If $I$ is infinite, then $\mu_k \to 0$ as $k \to \infty$.
(d) If $P_k$ is the orthogonal projection onto the eigenspace $E_k$, then $P_j P_k = P_k P_j = 0$ for $j \ne k$ in $I$. That is, eigenspaces corresponding to distinct eigenvalues are orthogonal.
(e) We have
\[
T = \sum_{k \in I} \mu_k P_k,
\]
where the series converges in operator norm.
(f) There exist nonzero real numbers $\{\lambda_n\}_{n \in J}$, either finitely many or $\lambda_n \to 0$ if infinitely many, and an orthonormal sequence $\{e_n\}_{n \in J}$ such that
\[
Tf = \sum_{n \in J} \lambda_n \langle f, e_n \rangle e_n.
\]
The $\lambda_n$ are obtained by repeating each $\mu_k$ according to its multiplicity (the dimension of the eigenspace $E_k$). The sequence $\{e_n\}_{n \in J}$ forms an orthonormal basis for $\ker(T)^\perp = \overline{\mathrm{range}(T)}$.
Corollary 5.8. If T : H → H is compact, self-adjoint, and injective, then H is separable.
Proof. Since $T = T^*$ and $T$ is injective, we have that $\overline{\mathrm{range}(T)} = \ker(T^*)^\perp = \{0\}^\perp = H$. On the other hand, since $T$ is compact we know from Exercise 4.18 that range($T$) is separable, and hence so is its closure $H$.
Example 5.9 (Diagonalization of Self-Adjoint Matrices). Let us examine what the Spectral
Theorem says in finite dimensions. Let A be a self-adjoint n × n matrix (i.e., symmetric
if real, and Hermitian if complex). Then the Spectral Theorem says that there exist real
nonzero eigenvalues $\lambda_1, \dots, \lambda_k$ and corresponding orthonormal eigenvectors $u_1, \dots, u_k$ such that
\[
Ax = \sum_{j=1}^k \lambda_j (x \cdot u_j)\, u_j. \tag{5.2}
\]
We can extend this representation by including the zero eigenvalues of A, as follows. From
(5.2), we see that the column space, or range, of A is
C(A) = range(A) = span{u1 , . . . , uk }.
Since A is self-adjoint, its nullspace is the orthogonal complement of its column space, for
N (A) = ker(A) = range(A)⊥ = C(A)⊥ .
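Let $u_{k+1}, \dots, u_n$ be an orthonormal basis for $N(A)$, and let $\lambda_{k+1} = \cdots = \lambda_n = 0$. Then $u_1, \dots, u_n$ is an orthonormal basis for $\mathbb{C}^n$ consisting of eigenvectors of $A$, with corresponding eigenvalues $\lambda_1, \dots, \lambda_n$. Further, we have the following representations:
\[
x = \sum_{j=1}^n (x \cdot u_j)\, u_j \quad \text{and} \quad Ax = \sum_{j=1}^n \lambda_j (x \cdot u_j)\, u_j, \quad x \in \mathbb{F}^n.
\]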
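Let us rewrite this representation as follows:
\[
\begin{aligned}
Ax &= \sum_{j=1}^n \lambda_j (x \cdot u_j)\, u_j \\
&= \begin{bmatrix} | & & | \\ u_1 & \cdots & u_n \\ | & & | \end{bmatrix}
\begin{bmatrix} \lambda_1 (x \cdot u_1) \\ \vdots \\ \lambda_n (x \cdot u_n) \end{bmatrix} \\
&= \begin{bmatrix} | & & | \\ u_1 & \cdots & u_n \\ | & & | \end{bmatrix}
\begin{bmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{bmatrix}
\begin{bmatrix} x \cdot u_1 \\ \vdots \\ x \cdot u_n \end{bmatrix} \\
&= \begin{bmatrix} | & & | \\ u_1 & \cdots & u_n \\ | & & | \end{bmatrix}
\begin{bmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{bmatrix}
\begin{bmatrix} \text{---}\; u_1^H \;\text{---} \\ \vdots \\ \text{---}\; u_n^H \;\text{---} \end{bmatrix} x \\
&= U \Lambda U^H x,
\end{aligned}
\]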
where U is the matrix that has u1 , . . . , un as columns, and Λ is the diagonal matrix with
λ1 , . . . , λn on the diagonal. On the one hand, this is nothing more than the diagonalization
of A. However, this says much more: every self-adjoint matrix can be diagonalized (even if
some eigenvalues are repeated), and furthermore, the eigenvector matrix is unitary (because
it has orthonormal columns). We summarize this next as a theorem.
Theorem 5.10 (Diagonalization of Self-Adjoint Matrices). Let A be an n × n matrix. Then
the following statements are equivalent.
(a) A is self-adjoint.
(b) A = U ΛU ∗ where U is unitary and Λ is diagonal with real scalars on its diagonal.
(c) There exist real scalars $\lambda_1, \dots, \lambda_n$ and orthonormal vectors $u_1, \dots, u_n$ such that
\[
Ax = \sum_{j=1}^n \lambda_j (x \cdot u_j)\, u_j, \quad x \in \mathbb{F}^n.
\]
(d) There exists an orthonormal basis u1 , . . . , un for Fn consisting of eigenvectors of A
with corresponding real eigenvalues λ1 , . . . , λn .
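The equivalences in Theorem 5.10 can be observed numerically for a specific matrix. The sketch below, assuming a randomly generated Hermitian matrix `A` (the size and seed are arbitrary), uses `numpy.linalg.eigh` to obtain real eigenvalues and an orthonormal eigenvector matrix, and then checks statements (b) and (c) directly. Here $x \cdot u_j$ is taken to mean $u_j^H x$, consistent with the factorization $U \Lambda U^H$ above.

```python
import numpy as np

rng = np.random.default_rng(4)   # arbitrary seed
B = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A = (B + B.conj().T) / 2         # a self-adjoint (Hermitian) matrix

# eigh returns real eigenvalues and orthonormal eigenvectors (columns of U).
lam, U = np.linalg.eigh(A)
Lambda = np.diag(lam)

# Statement (b): A = U Lambda U^*, with U unitary and Lambda real diagonal.
unitary = np.allclose(U.conj().T @ U, np.eye(4))
reconstructs = np.allclose(U @ Lambda @ U.conj().T, A)

# Statement (c): A x = sum_j lam_j (x . u_j) u_j, where x . u_j = u_j^H x.
x = rng.standard_normal(4) + 1j * rng.standard_normal(4)
Ax_sum = sum(lam[j] * (U[:, j].conj() @ x) * U[:, j] for j in range(4))
print(unitary, reconstructs, np.allclose(Ax_sum, A @ x))
```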