FUNCTIONAL ANALYSIS LECTURE NOTES
CHAPTER 2. OPERATORS ON HILBERT SPACES

CHRISTOPHER HEIL

Date: February 20, 2006. These notes closely follow and expand on the text by John B. Conway, "A Course in Functional Analysis," Second Edition, Springer, 1990.

1. Elementary Properties and Examples

First recall the basic definitions regarding operators.

Definition 1.1 (Continuous and Bounded Operators). Let $X$, $Y$ be normed linear spaces, and let $L : X \to Y$ be a linear operator.
(a) $L$ is continuous at a point $f \in X$ if $f_n \to f$ in $X$ implies $Lf_n \to Lf$ in $Y$.
(b) $L$ is continuous if it is continuous at every point, i.e., if $f_n \to f$ in $X$ implies $Lf_n \to Lf$ in $Y$ for every $f$.
(c) $L$ is bounded if there exists a finite $K \ge 0$ such that $\|Lf\| \le K \|f\|$ for all $f \in X$. Note that $\|Lf\|$ is the norm of $Lf$ in $Y$, while $\|f\|$ is the norm of $f$ in $X$.
(d) The operator norm of $L$ is
\[ \|L\| = \sup_{\|f\|=1} \|Lf\|. \]
(e) We let $B(X, Y)$ denote the set of all bounded linear operators mapping $X$ into $Y$, i.e., $B(X, Y) = \{L : X \to Y : L \text{ is bounded and linear}\}$. If $X = Y$ then we write $B(X) = B(X, X)$.
(f) If $Y = \mathbb{F}$ then we say that $L$ is a functional. The set of all bounded linear functionals on $X$ is the dual space of $X$, and is denoted $X' = B(X, \mathbb{F}) = \{L : X \to \mathbb{F} : L \text{ is bounded and linear}\}$.

We saw in Chapter 1 that, for a linear operator, boundedness and continuity are equivalent. Further, the operator norm is a norm on the space $B(X, Y)$ of all bounded linear operators from $X$ to $Y$, and we have the composition property that if $L \in B(X, Y)$ and $K \in B(Y, Z)$, then $KL \in B(X, Z)$, with $\|KL\| \le \|K\| \, \|L\|$.

Exercise 1.2. Suppose that $L : X \to Y$ is a bounded map of a Banach space $X$ into a Banach space $Y$. Prove that if there exists a $c > 0$ such that $\|Lf\| \ge c \|f\|$ for every $f \in X$, then $\mathrm{range}(L)$ is a closed subspace of $Y$.

Exercise 1.3. Let $C_b(\mathbb{R}^n)$ be the set of all bounded, continuous functions $f : \mathbb{R}^n \to \mathbb{F}$.
Let $C_0(\mathbb{R}^n)$ be the set of all continuous functions $f : \mathbb{R}^n \to \mathbb{F}$ such that $\lim_{|x| \to \infty} f(x) = 0$ (i.e., for every $\varepsilon > 0$ there exists a compact set $K$ such that $|f(x)| < \varepsilon$ for all $x \notin K$). Prove that these are closed subspaces of $L^\infty(\mathbb{R}^n)$ (under the $L^\infty$-norm; note that for a continuous function we have $\|f\|_\infty = \sup |f(x)|$). Define $\delta : C_b(\mathbb{R}^n) \to \mathbb{F}$ by $\delta(f) = f(0)$. Prove that $\delta$ is a bounded linear functional on $C_b(\mathbb{R}^n)$, i.e., $\delta \in (C_b)'$, and find $\|\delta\|$. This linear functional is the delta distribution (see also Exercise 1.26 below).

Example 1.4. In finite dimensions, all linear operators are given by matrices; this is just standard finite-dimensional linear algebra.

Suppose that $X$ is an $n$-dimensional complex normed vector space and $Y$ is an $m$-dimensional complex normed vector space. By definition of dimension, this means that there exists a basis $B_X = \{x_1, \dots, x_n\}$ for $X$ and a basis $B_Y = \{y_1, \dots, y_m\}$ for $Y$. If $x \in X$, then $x = c_1 x_1 + \cdots + c_n x_n$ for a unique choice of scalars $c_i$. Define the coordinates of $x$ with respect to the basis $B_X$ to be
\[ [x]_{B_X} = \begin{pmatrix} c_1 \\ \vdots \\ c_n \end{pmatrix} \in \mathbb{C}^n. \]
The vector $x$ is completely determined by its coordinates, and conversely each vector in $\mathbb{C}^n$ is the coordinates of a unique $x \in X$. The mapping $x \mapsto [x]_{B_X}$ is a linear mapping of $X$ onto $\mathbb{C}^n$. We similarly define $[y]_{B_Y} \in \mathbb{C}^m$ for vectors $y \in Y$.

Let $A : X \to Y$ be a linear map (it is automatically bounded since $X$ is finite-dimensional). Then $A$ transforms vectors $x \in X$ into vectors $Ax \in Y$. The vector $x$ is determined by its coordinates $[x]_{B_X}$ and likewise $Ax$ is determined by its coordinates $[Ax]_{B_Y}$. The vectors $x$ and $Ax$ are related through the linear map $A$; we will show that the coordinate vectors $[x]_{B_X}$ and $[Ax]_{B_Y}$ are related by multiplication by an $m \times n$ matrix determined by $A$. We call this matrix the standard matrix of $A$ with respect to $B_X$ and $B_Y$, and denote it by $[A]_{B_X,B_Y}$. That is, the standard matrix should satisfy
\[ [Ax]_{B_Y} = [A]_{B_X,B_Y} \, [x]_{B_X}, \qquad x \in X. \]
We claim that the standard matrix is the matrix whose columns are the coordinates of the vectors $Ax_k$, i.e.,
\[ [A]_{B_X,B_Y} = \begin{pmatrix} [Ax_1]_{B_Y} & \cdots & [Ax_n]_{B_Y} \end{pmatrix}. \]
To see this, choose any $x \in X$ and let $x = c_1 x_1 + \cdots + c_n x_n$ be its unique representation with respect to the basis $B_X$. Then
\begin{align*}
[A]_{B_X,B_Y} \, [x]_{B_X}
&= \begin{pmatrix} [Ax_1]_{B_Y} & \cdots & [Ax_n]_{B_Y} \end{pmatrix} \begin{pmatrix} c_1 \\ \vdots \\ c_n \end{pmatrix} \\
&= c_1 [Ax_1]_{B_Y} + \cdots + c_n [Ax_n]_{B_Y} \\
&= [c_1 Ax_1 + \cdots + c_n Ax_n]_{B_Y} \\
&= [A(c_1 x_1 + \cdots + c_n x_n)]_{B_Y} \\
&= [Ax]_{B_Y}.
\end{align*}

Exercise 1.5. Extend the idea of the preceding example to show that any linear mapping $L : \ell^2(\mathbb{N}) \to \ell^2(\mathbb{N})$ (and more generally, $L : H \to K$ with $H$, $K$ separable) can be realized in terms of multiplication by an (infinite but countable) matrix.

Exercise 1.6. Let $A$ be an $m \times n$ complex matrix, which we view as a linear transformation $A : \mathbb{C}^n \to \mathbb{C}^m$. The operator norm of $A$ depends on the choice of norm for $\mathbb{C}^n$ and $\mathbb{C}^m$. Compute an explicit formula for $\|A\|$, in terms of the entries of $A$, when the norm on $\mathbb{C}^n$ and $\mathbb{C}^m$ is taken to be the $\ell^1$ norm. Then do the same for the $\ell^\infty$ norm. Compare your results to the version of Schur's Lemma given in Theorem 1.23.

The following example is one that we will return to many times.

Example 1.7. Let $\{e_n\}_{n \in \mathbb{N}}$ be an orthonormal basis for a separable Hilbert space $H$. Then we know that every $f \in H$ can be written
\[ f = \sum_{n=1}^\infty \langle f, e_n \rangle \, e_n. \]
Fix any sequence of scalars $\lambda = (\lambda_n)_{n \in \mathbb{N}}$, and formally define
\[ Lf = \sum_{n=1}^\infty \lambda_n \langle f, e_n \rangle \, e_n. \tag{1.1} \]
This is a "formal" definition because we do not know a priori that the series above will converge; in other words, equation (1.1) may not make sense for every $f$. Note that if $H = \ell^2(\mathbb{N})$ and $\{e_n\}_{n \in \mathbb{N}}$ is the standard basis, then $L$ is given by the formula
\[ Lx = (\lambda_1 x_1, \lambda_2 x_2, \dots), \qquad x = (x_1, x_2, \dots) \in \ell^2(\mathbb{N}). \]
We will show the following (the $\ell^\infty$-norm of the sequence $\lambda$ is $\|\lambda\|_\infty = \sup_n |\lambda_n|$).
(a) The series defining $Lf$ in (1.1) converges for each $f \in H$ if and only if $\lambda \in \ell^\infty$.
In this case $L$ is a bounded linear mapping of $H$ into itself, and $\|L\| = \|\lambda\|_\infty$.
(b) If $\lambda \notin \ell^\infty$, then $L$ defines an unbounded linear mapping from the domain
\[ \mathrm{domain}(L) = \Bigl\{ f \in H : \sum_{n=1}^\infty |\lambda_n \langle f, e_n \rangle|^2 < \infty \Bigr\} \tag{1.2} \]
(which is dense in $H$) into $H$.

Proof. (a) Suppose that $\lambda \in \ell^\infty$, i.e., $\lambda$ is a bounded sequence. Then for any $f$ we have
\[ \sum_{n=1}^\infty |\lambda_n \langle f, e_n \rangle|^2 \le \sum_{n=1}^\infty \|\lambda\|_\infty^2 \, |\langle f, e_n \rangle|^2 = \|\lambda\|_\infty^2 \, \|f\|^2 < \infty, \]
so the series defining $Lf$ converges (because $\{e_n\}$ is an orthonormal sequence). Moreover, the preceding calculation also shows that $\|Lf\|^2 = \sum_{n=1}^\infty |\lambda_n \langle f, e_n \rangle|^2 \le \|\lambda\|_\infty^2 \, \|f\|^2$, so we see that $\|L\| \le \|\lambda\|_\infty$. On the other hand, by orthonormality we have $Le_n = \lambda_n e_n$ (i.e., each $e_n$ is an eigenvector for $L$ with eigenvalue $\lambda_n$). Since $\|e_n\| = 1$ and $\|Le_n\| = |\lambda_n| \, \|e_n\| = |\lambda_n|$, we conclude that
\[ \|L\| = \sup_{\|f\|=1} \|Lf\| \ge \sup_{n \in \mathbb{N}} \|Le_n\| = \sup_{n \in \mathbb{N}} |\lambda_n| = \|\lambda\|_\infty. \]
The converse direction will be covered by the proof of part (b).

(b) Suppose that $\lambda \notin \ell^\infty$, i.e., $\lambda$ is not a bounded sequence. Then we can find a subsequence $(\lambda_{n_k})_{k \in \mathbb{N}}$ such that $|\lambda_{n_k}| \ge k$ for each $k$. Let $c_{n_k} = \frac{1}{k}$ and define all other $c_n$ to be zero. Then $\sum_n |c_n|^2 = \sum_k \frac{1}{k^2} < \infty$, so $f = \sum_n c_n e_n$ converges (and $c_n = \langle f, e_n \rangle$). But the formal series $Lf = \sum_n \lambda_n c_n e_n$ does not converge, because
\[ \sum_{n=1}^\infty |c_n \lambda_n|^2 = \sum_{k=1}^\infty |c_{n_k} \lambda_{n_k}|^2 \ge \sum_{k=1}^\infty \frac{k^2}{k^2} = \infty. \]
In fact, the series defining $Lf$ in (1.1) only converges for those $f$ which lie in the domain defined in (1.2). That domain is dense because it contains the finite span of $\{e_n\}_{n \in \mathbb{N}}$, which we know is dense in $H$. Further, that domain is a subspace of $H$ (exercise), so it is an inner product space. The map $L : \mathrm{domain}(L) \to H$ is a well-defined, linear map, so it remains only to show that it is unbounded. This follows from the facts that $e_n \in \mathrm{domain}(L)$, $\|e_n\| = 1$, and $\|Le_n\| = |\lambda_n| \, \|e_n\| = |\lambda_n|$.

Exercise 1.8. Continuing Example 1.7, suppose that $\lambda \in \ell^\infty$ and set $\delta = \inf_n |\lambda_n|$. Prove the following.
(a) $L$ is injective if and only if $\lambda_n \ne 0$ for every $n$.
(b) $L$ is surjective if and only if $\delta > 0$ (if $\delta = 0$, use an argument similar to the one used in part (b) of Example 1.7 to show that $\mathrm{range}(L)$ is a proper subset of $H$).
(c) If $\delta = 0$ but $\lambda_n \ne 0$ for every $n$ then $\mathrm{range}(L)$ is a dense but proper subspace of $H$.
(d) Prove that $L$ is unitary if and only if $|\lambda_n| = 1$ for every $n$.

In Example 1.7, we saw an unbounded operator whose domain was a dense but proper subspace of $H$. This situation is typical for unbounded operators, and we often write $L : X \to Y$ even when $L$ is only defined on a subset of $X$, as in the following example.

Example 1.9 (Differentiation). Consider the Hilbert space $H = L^2(0, 1)$, and define an operator $D : L^2(0, 1) \to L^2(0, 1)$ by $Df = f'$. Implicitly, we mean by this that $D$ is defined on the largest domain that makes sense, namely,
\[ \mathrm{domain}(D) = \bigl\{ f \in L^2(0, 1) : f \text{ is differentiable and } f' \in L^2(0, 1) \bigr\}. \]
Note that if $f \in \mathrm{domain}(D)$, then $Df$ is well-defined, $Df \in L^2(0, 1)$, and $\|Df\|_2 < \infty$. Thus every vector in $\mathrm{domain}(D)$ maps to a vector in $L^2(0, 1)$ which necessarily has finite norm. Yet $D$ is unbounded. For example, if we set $e_n(x) = e^{inx}$ then $\|e_n\|_2 = 1$, but $De_n(x) = e_n'(x) = i n e^{inx}$, so $\|De_n\|_2 = n$. While each vector $De_n$ has finite norm, there is no upper bound to these norms. Since the $e_n$ are unit vectors, we conclude that $\|D\| = \infty$.

The following definitions recall the basic notions of measures and measure spaces. For full details, consult a book on real analysis (see footnote 1).

Definition 1.10 ($\sigma$-Algebras, Measurable Sets and Functions). Let $X$ be a set, and let $\Omega$ be a collection of subsets of $X$. Then $\Omega$ is a $\sigma$-algebra if
(a) $X \in \Omega$,
(b) if $E \in \Omega$ then $X \setminus E \in \Omega$ (i.e., $\Omega$ is closed under complements),
(c) if $E_1, E_2, \dots \in \Omega$ then $\bigcup_k E_k \in \Omega$ (i.e., $\Omega$ is closed under countable unions).
The elements of $\Omega$ are called the measurable subsets of $X$.

If we choose $\mathbb{F} = \mathbb{R}$ then we usually allow functions on $X$ to take extended-real values, i.e., $f(x)$ is allowed to take the values $\pm\infty$.
An extended-real-valued function $f : X \to [-\infty, \infty]$ is called a measurable function if $\{x \in X : f(x) > a\}$ is measurable for each $a \in \mathbb{R}$.

If we choose $\mathbb{F} = \mathbb{C}$ then we require functions on $X$ to take (finite) complex values; there is no complex analogue of $\pm\infty$. A complex-valued function $f : X \to \mathbb{C}$ is called a measurable function if its real and imaginary parts are measurable (real-valued) functions.

Definition 1.11 (Measure Space). Let $X$ be a set and $\Omega$ a $\sigma$-algebra of subsets of $X$. Then a function $\mu$ on $\Omega$ is a (positive) measure if
(a) $0 \le \mu(E) \le +\infty$ for all $E \in \Omega$,
(b) if $E_1, E_2, \dots$ is a countable family of disjoint sets in $\Omega$, then
\[ \mu\Bigl( \bigcup_{k=1}^\infty E_k \Bigr) = \sum_{k=1}^\infty \mu(E_k). \]
In this case, $(X, \Omega, \mu)$ is called a measure space. If $\mu(X) < \infty$, then we say that $\mu$ is a finite measure. If there exist countably many subsets $E_1, E_2, \dots$ such that $X = \bigcup_k E_k$ and $\mu(E_k) < \infty$ for all $k$, then we say that $\mu$ is $\sigma$-finite. For example, Lebesgue measure on $\mathbb{R}^n$ is $\sigma$-finite.

Footnote 1: For example, R. Wheeden and A. Zygmund, "Measure and Integral," Marcel Dekker, 1977, or G. Folland, "Real Analysis," Second Edition, Wiley, 1999.

It is often useful to allow measures to take negative values.

Definition 1.12 (Signed Measure). Let $X$ be a set and $\Omega$ a $\sigma$-algebra of subsets of $X$. Then a function $\mu$ on $\Omega$ is a signed measure if
(a) $-\infty \le \mu(E) \le +\infty$ for all $E \in \Omega$ and $\mu(\emptyset) = 0$,
(b) if $E_1, E_2, \dots$ is a countable family of disjoint sets in $\Omega$, then
\[ \mu\Bigl( \bigcup_{k=1}^\infty E_k \Bigr) = \sum_{k=1}^\infty \mu(E_k). \]

Definition 1.13 (Integration). Let $(X, \Omega, \mu)$ be a measure space.
(a) If $f : X \to [0, \infty]$ is a nonnegative, measurable function, then the integral of $f$ over $X$ with respect to $\mu$ is
\[ \int_X f \, d\mu = \int_X f(x) \, d\mu(x) = \sup \Bigl\{ \sum_{j} \Bigl( \inf_{x \in E_j} f(x) \Bigr) \mu(E_j) \Bigr\}, \]
where the supremum is taken over all decompositions $X = E_1 \cup \cdots \cup E_N$ of $X$ as the union of a finite number of disjoint measurable sets $E_k$ (and where we take the convention that $\infty \cdot 0 = 0 \cdot \infty = 0$).
(b) If $f : X \to [-\infty, \infty]$ and we define
\[ f^+(x) = \max\{f(x), 0\}, \qquad f^-(x) = -\min\{f(x), 0\}, \]
then
\[ \int_X f \, d\mu = \int_X f^+ \, d\mu - \int_X f^- \, d\mu, \]
as long as this does not have the form $\infty - \infty$ (in that case the integral would be undefined). Since $|f| = f^+ + f^-$ and $\int_X |f| \, d\mu$ always exists (either as a finite number or as $\infty$), it follows that
\[ \int_X f \, d\mu \text{ exists and is finite} \iff \int_X |f| \, d\mu < \infty. \]
(c) If $f : X \to \mathbb{C}$, then
\[ \int_X f \, d\mu = \int_X \mathrm{Re}(f) \, d\mu + i \int_X \mathrm{Im}(f) \, d\mu, \]
as long as both integrals on the right are defined and finite.

There are many other equivalent definitions of the integral.

Definition 1.14 ($L^p$ Spaces). Let $(X, \Omega, \mu)$ be a measure space, and fix $1 \le p < \infty$. Then $L^p(X)$ consists of all measurable functions $f : X \to [-\infty, \infty]$ (if we choose $\mathbb{F} = \mathbb{R}$) or $f : X \to \mathbb{C}$ (if we choose $\mathbb{F} = \mathbb{C}$) such that
\[ \|f\|_p^p = \int_X |f(x)|^p \, d\mu(x) < \infty. \]
Then $L^p(X)$ is a vector space under the operations of addition of functions and multiplication of a function by a scalar. Additionally, the function $\| \cdot \|_p$ defines a semi-norm on $L^p(X)$. Usually we identify functions that are equal almost everywhere (we say that $f = g$ a.e. if $\mu\{x \in X : f(x) \ne g(x)\} = 0$), and then $\| \cdot \|_p$ becomes a norm on $L^p(X)$.

For $p = \infty$ we define $L^\infty(X)$ to be the set of measurable functions that are essentially bounded, i.e., for which there exists a finite constant $M$ such that $|f(x)| \le M$ a.e. Then
\[ \|f\|_\infty = \operatorname*{ess\,sup}_{x \in X} |f(x)| = \inf \bigl\{ M \ge 0 : |f(x)| \le M \text{ a.e.} \bigr\} \]
is a semi-norm on $L^\infty(X)$, and is a norm if we identify functions that are equal almost everywhere.

For each $1 \le p \le \infty$, the space $L^p(X)$ is a Banach space under the above norm.

Exercise 1.15 ($\ell^p$ Spaces). Counting measure on a set $X$ is defined by $\mu(E) = \mathrm{card}(E)$ if $E$ is a finite subset of $X$, and $\mu(E) = \infty$ if $E$ is an infinite subset. Let $\Omega = \mathcal{P}(X)$ (the set of all subsets of $X$), and show that $(X, \Omega, \mu)$ is a measure space. Show that $L^p(X, \Omega, \mu) = \ell^p(X)$. Show that $\mu$ is $\sigma$-finite if and only if $X$ is countable.

Exercise 1.16 (The Delta Measure).
Let $X = \mathbb{R}^n$ and $\Omega = \mathcal{P}(X)$. Define $\delta(E) = 1$ if $0 \in E$ and $\delta(E) = 0$ if $0 \notin E$. Prove that $\delta$ is a measure, and find a formula for
\[ \int_{\mathbb{R}^n} f(x) \, d\delta(x). \]
Sometimes this integral is written informally as $\int_{\mathbb{R}^n} f(x) \, \delta(x) \, dx$, but note that $\delta$ is a measure on $\mathbb{R}^n$, not a function on $\mathbb{R}^n$ (see also Exercise 1.26 below).

Exercise 1.17. Fix $0 \le g \in L^1(\mathbb{R}^n)$, under Lebesgue measure. Prove that $\mu(E) = \int_E g(x) \, dx$ defines a finite measure on $\mathbb{R}^n$.

With this preparation, we can give some additional examples of operators on Banach or Hilbert spaces.

Example 1.18 (Multiplication Operators). Let $(X, \Omega, \mu)$ be a measure space, and let $\varphi \in L^\infty(X)$ be a fixed measurable function. Then for any $f \in L^2(X)$ we have that $f\varphi$ is measurable, and
\[ \|f\varphi\|_2^2 = \int_X |f(x) \, \varphi(x)|^2 \, d\mu(x) \le \int_X |f(x)|^2 \, \|\varphi\|_\infty^2 \, d\mu(x) = \|\varphi\|_\infty^2 \, \|f\|_2^2 < \infty, \]
so $f\varphi \in L^2(X)$. Therefore, the multiplication operator $M_\varphi : L^2(X) \to L^2(X)$ given by $M_\varphi f = f\varphi$ is well-defined, and the calculation above shows that $\|M_\varphi f\|_2 \le \|\varphi\|_\infty \, \|f\|_2$. Therefore $M_\varphi$ is bounded, and $\|M_\varphi\| \le \|\varphi\|_\infty$.

If we assume that $\mu$ is $\sigma$-finite, then we can show that $\|M_\varphi\| = \|\varphi\|_\infty$, as follows. Choose any $\varepsilon > 0$. Then by definition of the $L^\infty$-norm, the set $E = \{x \in X : |\varphi(x)| > \|\varphi\|_\infty - \varepsilon\}$ has positive measure. Since $\mu$ is $\sigma$-finite, we can write $X = \bigcup_m F_m$ where each $\mu(F_m) < \infty$. Since $E = \bigcup_m (E \cap F_m)$ is a countable union, we must have $\mu(E \cap F_m) > 0$ for some $m$. Let $F = E \cap F_m$, and set $f = \mu(F)^{-1/2} \, \chi_F$. Then $\|f\|_2 = 1$, but $\|M_\varphi f\|_2 \ge (\|\varphi\|_\infty - \varepsilon) \, \|f\|_2$. Hence $\|M_\varphi\| \ge \|\varphi\|_\infty - \varepsilon$. Since $\varepsilon > 0$ was arbitrary, $\|M_\varphi\| = \|\varphi\|_\infty$.

Exercise: Find an example of a measure $\mu$ that is not $\sigma$-finite and a function $\varphi$ such that $\|M_\varphi\| < \|\varphi\|_\infty$.

Exercise 1.19. Let $(X, \Omega, \mu)$ be a measure space, and let $\varphi$ be a fixed measurable function. Prove that if $f\varphi \in L^2(X)$ for every $f \in L^2(X)$, then we must have $\varphi \in L^\infty(X)$.

Solution. Assume $\varphi \notin L^\infty(X)$. Set $E_k = \{x \in X : k \le |\varphi(x)| < k + 1\}$. The $E_k$ are measurable and disjoint, and since $\varphi$ is not in $L^\infty(X)$ there must be infinitely many $E_k$ with positive measure.
Choose any $E_{n_k}$, $k \in \mathbb{N}$, all with positive measure, and let $E = \bigcup_k E_{n_k}$. Define
\[ f(x) = \begin{cases} \dfrac{1}{k \, \mu(E_{n_k})^{1/2}}, & x \in E_{n_k}, \\[1ex] 0, & x \notin E. \end{cases} \]
Then
\[ \int_X |f|^2 \, d\mu = \sum_{k=1}^\infty \int_{E_{n_k}} \frac{1}{k^2 \, \mu(E_{n_k})} \, d\mu = \sum_{k=1}^\infty \frac{1}{k^2} < \infty, \]
but, since $|\varphi| \ge n_k \ge k$ on $E_{n_k}$,
\[ \int_X |f\varphi|^2 \, d\mu \ge \sum_{k=1}^\infty \int_{E_{n_k}} \frac{k^2}{k^2 \, \mu(E_{n_k})} \, d\mu = \sum_{k=1}^\infty 1 = \infty, \]
which is a contradiction.

Exercise 1.20. Continuing Example 1.18, do the following.
(a) Determine a necessary and sufficient condition on $\varphi$ which implies that $M_\varphi : L^2(X) \to L^2(X)$ is injective.
(b) Determine a necessary and sufficient condition on $\varphi$ which implies that $M_\varphi : L^2(X) \to L^2(X)$ is surjective.
(c) Prove that if $M_\varphi$ is injective but not surjective then $M_\varphi^{-1} : \mathrm{range}(M_\varphi) \to L^2(X)$ is unbounded.
(d) Extend from the case $p = 2$ to any $1 \le p \le \infty$.

Example 1.21 (Integral Operators). Let $(X, \Omega, \mu)$ be a $\sigma$-finite measure space. An integral operator is an operator of the form
\[ Lf(x) = \int_X k(x, y) \, f(y) \, d\mu(y). \tag{1.3} \]
This is just a formal definition. We have to provide conditions under which it makes sense, and the following two theorems will provide such conditions. The function $k$ that determines the operator is called the kernel of the operator (not to be confused with the kernel/nullspace of the operator!).

Note that an integral operator is just a generalization of matrix multiplication. For, if $A$ is an $m \times n$ matrix with entries $a_{ij}$ and $u \in \mathbb{C}^n$, then $Au \in \mathbb{C}^m$, and its components are given by
\[ (Au)_i = \sum_{j=1}^n a_{ij} u_j, \qquad i = 1, \dots, m. \]
Thus, the values $k(x, y)$ are analogous to the entries $a_{ij}$ of the matrix $A$, and the values $Lf(x)$ are analogous to the entries $(Au)_i$.

The following result shows that if the kernel is square-integrable, then the corresponding integral operator is bounded. Later we will define the notion of a Hilbert–Schmidt operator. For the case of integral operators mapping $L^2(X)$ into itself, it can be shown that $L$ is a Hilbert–Schmidt operator if and only if the kernel $k$ belongs to $L^2(X \times X)$.

Theorem 1.22 (Hilbert–Schmidt Integral Operators).
Let $(X, \Omega, \mu)$ be a $\sigma$-finite measure space, and choose a kernel $k \in L^2(X \times X)$. That is, assume that
\[ \|k\|_2^2 = \int_X \int_X |k(x, y)|^2 \, d\mu(x) \, d\mu(y) < \infty. \]
Then the integral operator given by (1.3) defines a bounded mapping of $L^2(X)$ into itself, and $\|L\| \le \|k\|_2$.

Proof. Although a slight abuse of the order of logic (technically we should show $Lf$ exists before trying to compute its norm), the following calculation shows that $L$ is well-defined and is a bounded mapping of $L^2(X)$ into itself:
\begin{align*}
\|Lf\|_2^2 &= \int_X |Lf(x)|^2 \, d\mu(x) \\
&= \int_X \Bigl| \int_X k(x, y) \, f(y) \, d\mu(y) \Bigr|^2 d\mu(x) \\
&\le \int_X \Bigl( \int_X |k(x, y)|^2 \, d\mu(y) \Bigr) \Bigl( \int_X |f(y)|^2 \, d\mu(y) \Bigr) d\mu(x) \\
&= \int_X \int_X |k(x, y)|^2 \, d\mu(y) \, \|f\|_2^2 \, d\mu(x) \\
&= \|k\|_2^2 \, \|f\|_2^2,
\end{align*}
where the inequality follows by applying Cauchy–Schwarz to the inner integral. Thus $L$ is bounded, and $\|L\| \le \|k\|_2$.

The following result is one version of Schur's Lemma. There are many forms of Schur's Lemma; this is one particular special case. Exercise: Compare the hypotheses of the following result to the operator norms you calculated in Exercise 1.6.

Theorem 1.23. Let $(X, \Omega, \mu)$ be a $\sigma$-finite measure space, and assume that $k$ is a measurable function on $X \times X$ which satisfies the "mixed-norm" conditions
\[ C_1 = \operatorname*{ess\,sup}_{x \in X} \int_X |k(x, y)| \, d\mu(y) < \infty \quad \text{and} \quad C_2 = \operatorname*{ess\,sup}_{y \in X} \int_X |k(x, y)| \, d\mu(x) < \infty. \]
Then the integral operator given by (1.3) defines a bounded mapping of $L^2(X)$ into itself, and $\|L\| \le (C_1 C_2)^{1/2}$.

Proof. Choose any $f \in L^2(X)$. Then, by applying the Cauchy–Schwarz inequality, we have
\begin{align*}
\|Lf\|_2^2 &= \int_X |Lf(x)|^2 \, d\mu(x) \\
&= \int_X \Bigl| \int_X k(x, y) \, f(y) \, d\mu(y) \Bigr|^2 d\mu(x) \\
&\le \int_X \Bigl( \int_X |k(x, y)|^{1/2} \, |k(x, y)|^{1/2} \, |f(y)| \, d\mu(y) \Bigr)^2 d\mu(x) \\
&\le \int_X \Bigl( \int_X |k(x, y)| \, d\mu(y) \Bigr) \Bigl( \int_X |k(x, y)| \, |f(y)|^2 \, d\mu(y) \Bigr) d\mu(x) \\
&\le \int_X C_1 \int_X |k(x, y)| \, |f(y)|^2 \, d\mu(y) \, d\mu(x) \\
&= C_1 \int_X |f(y)|^2 \int_X |k(x, y)| \, d\mu(x) \, d\mu(y) \\
&\le C_1 \int_X |f(y)|^2 \, C_2 \, d\mu(y) \\
&= C_1 C_2 \, \|f\|_2^2,
\end{align*}
where we have used Tonelli's Theorem to interchange the order of integration (here is where we needed the fact that $\mu$ is $\sigma$-finite). Thus $L$ is bounded and $\|L\| \le (C_1 C_2)^{1/2}$.
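As an informal sanity check of Theorem 1.23, take $X = \{1, \dots, n\}$ with counting measure, so that the kernel is a matrix $k(i, j) = a_{ij}$, the constant $C_1$ is the largest absolute row sum, and $C_2$ is the largest absolute column sum. The Python sketch below is only an illustration under these assumptions; the function names `schur_bound` and `spectral_norm` are hypothetical and do not come from the notes. It estimates the $\ell^2$ operator norm by power iteration on $A^T A$ (which can only underestimate the true norm) and compares it with $(C_1 C_2)^{1/2}$.

```python
import math
import random


def schur_bound(a):
    """Return (C1 * C2) ** 0.5 for a real matrix given as a list of rows."""
    # C1: largest absolute row sum (sup over x of the sum over y of |k(x, y)|).
    c1 = max(sum(abs(v) for v in row) for row in a)
    # C2: largest absolute column sum (sup over y of the sum over x of |k(x, y)|).
    c2 = max(sum(abs(row[j]) for row in a) for j in range(len(a[0])))
    return math.sqrt(c1 * c2)


def spectral_norm(a, iterations=200, seed=0):
    """Estimate the l2 operator norm of a by power iteration on a^T a.

    The returned value is ||a x|| for a unit vector x, so it never
    exceeds the true operator norm.
    """
    rng = random.Random(seed)
    m, n = len(a), len(a[0])
    x = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    length = math.sqrt(sum(v * v for v in x))
    x = [v / length for v in x]
    for _ in range(iterations):
        y = [sum(a[i][j] * x[j] for j in range(n)) for i in range(m)]  # y = a x
        z = [sum(a[i][j] * y[i] for i in range(m)) for j in range(n)]  # z = a^T y
        norm_z = math.sqrt(sum(v * v for v in z))
        if norm_z == 0.0:
            return 0.0  # x landed in the kernel of a; 0 is still a lower estimate
        x = [v / norm_z for v in z]
    y = [sum(a[i][j] * x[j] for j in range(n)) for i in range(m)]
    return math.sqrt(sum(v * v for v in y))


if __name__ == "__main__":
    matrix = [[1.0, 2.0], [3.0, 4.0]]
    print("estimated operator norm:", spectral_norm(matrix))
    print("Schur bound (C1*C2)^(1/2):", schur_bound(matrix))
```

For the sample matrix, $C_1 = 7$ and $C_2 = 6$, so the bound is $\sqrt{42}$, while the estimated operator norm is strictly smaller. Pure-Python lists are used here only to keep the sketch free of dependencies; a numerical library would be the natural choice for larger matrices.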
Exercise 1.24. Consider what happens in the preceding example if we take $1 \le p \le \infty$ instead of $p = 2$. In particular, in part b, show that if $C_1, C_2 < \infty$ then $L : L^p(X) \to L^p(X)$ is a bounded mapping for each $1 \le p \le \infty$ (try to do $p = 1$ or $p = \infty$ first).

Exercise 1.25 (Volterra Operator). Define $L : L^2[0, 1] \to L^2[0, 1]$ by
\[ Lf(x) = \int_0^x f(y) \, dy. \]
Show directly that $L$ is bounded. Then show that $L$ is an integral operator with kernel $k : [0, 1]^2 \to \mathbb{F}$ defined by
\[ k(x, y) = \begin{cases} 1, & y \le x, \\ 0, & y > x. \end{cases} \]
Observe that $k \in L^2([0, 1]^2)$, so $L$ is compact. This operator is called the Volterra operator.

Exercise 1.26 (Convolution). Convolution is one of the most important examples of integral operators. Consider the case of Lebesgue measure on $\mathbb{R}^n$. Given functions $f$, $g$ on $\mathbb{R}^n$, their convolution is the function $f * g$ defined by
\[ (f * g)(x) = \int_{\mathbb{R}^n} f(y) \, g(x - y) \, dy, \]
provided that the integral makes sense. Note that with $g$ fixed, the mapping $f \mapsto f * g$ is an integral operator with kernel $k(x, y) = g(x - y)$.
(a) Let $g \in L^1(\mathbb{R}^n)$ be fixed. Use Schur's Lemma (Theorem 1.23) to show that $Lf = f * g$ is a bounded mapping of $L^2(\mathbb{R}^n)$ into itself. In fact, use Exercise 1.24 to prove Young's Inequality: If $f \in L^p(\mathbb{R}^n)$ ($1 \le p \le \infty$) and $g \in L^1(\mathbb{R}^n)$, then $f * g \in L^p(\mathbb{R}^n)$, and $\|f * g\|_p \le \|f\|_p \, \|g\|_1$. In particular, $L^1(\mathbb{R}^n)$ is closed under convolution.
(b) Note that we cannot use the Hilbert–Schmidt condition (Theorem 1.22) to prove Young's Inequality, since
\[ \int_{\mathbb{R}^n} \int_{\mathbb{R}^n} |g(x - y)|^2 \, dx \, dy = \infty, \]
even if we assume that $g \in L^2(\mathbb{R}^n)$.
(c) Prove that convolution is commutative, i.e., that $f * g = g * f$.
(d) Prove that there is no identity element in $L^1(\mathbb{R}^n)$, i.e., there is no function $g \in L^1(\mathbb{R}^n)$ such that $f * g = f$ for all $f \in L^1(\mathbb{R}^n)$. This is not trivial; it is easier to do if you make use of the Fourier transform on $\mathbb{R}^n$, and in particular use the Riemann–Lebesgue Lemma to derive a contradiction.
(e) Some texts do talk informally about a "delta function" that is an identity element for convolution, defined by the conditions
\[ \delta(x) = \begin{cases} \infty, & x = 0, \\ 0, & x \ne 0, \end{cases} \qquad \text{and} \qquad \int_{\mathbb{R}^n} \delta(x) \, dx = 1, \]
but no such function actually exists. In particular, the function $\delta$ defined on the left-hand side of the line above is equal to zero a.e., and hence is the zero function as far as Lebesgue integration is concerned. That is, we have $\int_{\mathbb{R}^n} \delta(x) \, dx = 0$, not 1.

The "delta function" is really just an informal use of the delta distribution (see Exercise 1.3) or the delta measure (see Exercise 1.16). Show that if we define the convolution of a function $f$ with the delta measure $\delta$ to be
\[ (f * \delta)(x) = \int_{\mathbb{R}^n} f(x - y) \, d\delta(y), \tag{1.4} \]
then $f * \delta = f$ for all $f \in L^1(\mathbb{R}^n)$. Note that in the "informal" notation of Exercise 1.16, (1.4) reads
\[ (f * \delta)(x) = \int_{\mathbb{R}^n} f(x - y) \, \delta(y) \, dy, \]
which perhaps explains the use of the term "delta function."

Exercise 1.27. Prove that $L^1(\mathbb{R}^n)$ is not closed under pointwise multiplication. That is, prove that there exist $f$, $g \in L^1(\mathbb{R}^n)$ such that the pointwise product $h(x) = (fg)(x) = f(x) g(x)$ does not belong to $L^1(\mathbb{R}^n)$.

Exercise 1.28 (Convolution Continued).
(a) Consider the space $L^p[0, 1]$, where we think of functions in $L^p[0, 1]$ as being extended 1-periodically to the real line. Define convolution on the circle by
\[ (f * g)(x) = \int_0^1 f(y) \, g(x - y) \, dy, \]
where the periodicity is used to define $g(x - y)$ when $x - y$ lies outside $[0, 1]$ (equivalently, replace $x - y$ by $x - y \bmod 1$, the fractional part of $x - y$). Prove a version of Young's Inequality for $L^p[0, 1]$.
(b) Consider the sequence space $\ell^p(\mathbb{Z})$. Define convolution on $\mathbb{Z}$ by
\[ (x * y)_n = \sum_{m \in \mathbb{Z}} x_m \, y_{n-m}. \]
Prove a version of Young's Inequality for $\ell^p(\mathbb{Z})$. Prove that $\ell^1(\mathbb{Z})$ contains an identity element with respect to convolution, i.e., there exists a sequence in $\ell^1(\mathbb{Z})$ (typically denoted $\delta$) such that $\delta * x = x$ for every $x \in \ell^p(\mathbb{Z})$.
(c) Identify the essential features needed to define convolution on more general domains, and prove a version of Young's Inequality for that setting.

Exercise 1.29 (Convolution and the Fourier Transform). Let $\mathcal{F}$ be the Fourier transform on the circle, i.e., it is the isomorphism $\mathcal{F} : L^2[0, 1] \to \ell^2(\mathbb{Z})$ given by $\mathcal{F} f = \hat{f} = \{\hat{f}(n)\}_{n \in \mathbb{Z}}$, where
\[ \hat{f}(n) = \langle f, e_n \rangle = \int_0^1 f(x) \, e^{-2\pi i n x} \, dx, \qquad e_n(x) = e^{2\pi i n x}. \]
(a) Prove that the Fourier transform converts convolution into multiplication. That is, prove that if $f$, $g \in L^2[0, 1]$, then $(f * g)^\wedge = \hat{f} \, \hat{g}$, i.e.,
\[ (f * g)^\wedge(n) = \hat{f}(n) \, \hat{g}(n), \qquad n \in \mathbb{Z}. \]
(b) Note that if $g \in L^2[0, 1]$, then $g \in L^1[0, 1]$, so by Young's Inequality we have that $f * g \in L^2[0, 1]$. Holding $g$ fixed, define an operator $L : L^2[0, 1] \to L^2[0, 1]$ by $Lf = f * g$. Since $\{e_n\}_{n \in \mathbb{Z}}$ is an orthonormal basis for $L^2[0, 1]$, we have
\[ f = \sum_{n \in \mathbb{Z}} \hat{f}(n) \, e_n, \qquad f \in L^2[0, 1]. \]
Show that
\[ Lf = f * g = \sum_{n \in \mathbb{Z}} \hat{g}(n) \, \hat{f}(n) \, e_n, \qquad f \in L^2[0, 1]. \]
Thus, in the "Fourier domain," convolution acts by changing or adjusting the amount that each "component" or "frequency" $e_n$ contributes to the representation of the function in this basis: the weight $\hat{f}(n)$ for frequency $n$ is replaced by the weight $\hat{g}(n) \, \hat{f}(n)$. Explain why this says that $L$ is analogous to multiplication by a diagonal operator. In engineering parlance, convolution is also referred to as filtering. Explain why this terminology is appropriate. Compare this operator $L$ to Example 1.7.

2. The Adjoint of an Operator

Example 2.1. Note that the dot product on $\mathbb{R}^n$ is given by $x \cdot y = x^T y$, while the dot product on $\mathbb{C}^n$ is $x \cdot y = x^T \bar{y}$.

Let $A$ be an $m \times n$ real matrix. Then $x \mapsto Ax$ defines a linear map of $\mathbb{R}^n$ into $\mathbb{R}^m$, and its transpose $A^T$ satisfies
\[ \forall\, x \in \mathbb{R}^n, \ \forall\, y \in \mathbb{R}^m, \qquad Ax \cdot y = (Ax)^T y = x^T A^T y = x \cdot (A^T y). \]
Similarly, if $A$ is an $m \times n$ complex matrix, then its Hermitian or adjoint matrix $A^H = \overline{A^T}$ satisfies
\[ \forall\, x \in \mathbb{C}^n, \ \forall\, y \in \mathbb{C}^m, \qquad Ax \cdot y = (Ax)^T \bar{y} = x^T A^T \bar{y} = x \cdot (A^H y). \]

Theorem 2.2 (Adjoint).
Let $H$ and $K$ be Hilbert spaces, and let $A : H \to K$ be a bounded, linear map. Then there exists a unique bounded linear map $A^* : K \to H$ such that
\[ \forall\, x \in H, \ \forall\, y \in K, \qquad \langle Ax, y \rangle = \langle x, A^* y \rangle. \]

Proof. Fix $y \in K$. Then $Lx = \langle Ax, y \rangle$ is a bounded linear functional on $H$. By the Riesz Representation Theorem, there exists a unique vector $h \in H$ such that
\[ \langle Ax, y \rangle = Lx = \langle x, h \rangle. \]
Define $A^* y = h$. Verify that this map $A^*$ is linear (exercise). To see that it is bounded, observe that
\begin{align*}
\|A^* y\| = \|h\| &= \sup_{\|x\|=1} |\langle x, h \rangle| \\
&= \sup_{\|x\|=1} |\langle Ax, y \rangle| \\
&\le \sup_{\|x\|=1} \|Ax\| \, \|y\| \\
&\le \sup_{\|x\|=1} \|A\| \, \|x\| \, \|y\| = \|A\| \, \|y\|.
\end{align*}
We conclude that $A^*$ is bounded, and that $\|A^*\| \le \|A\|$.

Finally, we must show that $A^*$ is unique. Suppose that $B \in B(K, H)$ also satisfied $\langle Ax, y \rangle = \langle x, By \rangle$ for all $x \in H$ and $y \in K$. Then for each fixed $y$ we would have that $\langle x, By - A^* y \rangle = 0$ for every $x$, which implies $By - A^* y = 0$. Hence $B = A^*$.

Exercise 2.3 (Properties of the adjoint).
(a) If $A \in B(H, K)$ then $(A^*)^* = A$.
(b) If $A, B \in B(H, K)$ and $\alpha, \beta \in \mathbb{F}$, then $(\alpha A + \beta B)^* = \bar{\alpha} A^* + \bar{\beta} B^*$.
(c) If $A \in B(H_1, H_2)$ and $B \in B(H_2, H_3)$, then $(BA)^* = A^* B^*$.
(d) If $A \in B(H)$ is invertible in $B(H)$ (meaning that there exists $A^{-1} \in B(H)$ such that $A A^{-1} = A^{-1} A = I$), then $A^*$ is invertible in $B(H)$ and $(A^{-1})^* = (A^*)^{-1}$.

Remark 2.4. Later we will prove the Open Mapping Theorem. A remarkable consequence of this theorem is that if $X$ and $Y$ are Banach spaces and $A : X \to Y$ is a bounded bijection, then $A^{-1} : Y \to X$ is automatically bounded.

Proposition 2.5. If $A \in B(H, K)$, then $\|A\| = \|A^*\| = \|A^* A\|^{1/2} = \|A A^*\|^{1/2}$.

Proof. In the course of proving Theorem 2.2, we already showed that $\|A^*\| \le \|A\|$. If $f \in H$, then
\[ \|Af\|^2 = \langle Af, Af \rangle = \langle A^* A f, f \rangle \le \|A^* A f\| \, \|f\| \le \|A^*\| \, \|Af\| \, \|f\|. \tag{2.1} \]
Hence $\|Af\| \le \|A^*\| \, \|f\|$ (even if $\|Af\| = 0$, this is still true). Since this is true for all $f$ we conclude that $\|A\| \le \|A^*\|$. Therefore $\|A\| = \|A^*\|$.

Next, we have $\|A^* A\| \le \|A\| \, \|A^*\| = \|A\|^2$. But also, from the calculation in (2.1), we have $\|Af\|^2 \le \|A^* A f\| \, \|f\|$.
Taking the supremum over all unit vectors, we obtain
\[ \|A\|^2 = \sup_{\|f\|=1} \|Af\|^2 \le \sup_{\|f\|=1} \|A^* A f\| \, \|f\| = \|A^* A\|. \]
Consequently $\|A\|^2 = \|A^* A\|$. The final equality follows by interchanging the roles of $A$ and $A^*$.

Exercise 2.6. Prove that if $U \in B(H, K)$, then $U$ is an isomorphism if and only if $U$ is invertible and $U^{-1} = U^*$.

Exercise 2.7. (a) Let $\lambda = (\lambda_n)_{n \in \mathbb{N}} \in \ell^\infty(\mathbb{N})$ be given and let $L$ be defined as in Example 1.7. Find $L^*$.
(b) Prove that the adjoint of the multiplication operator $M_\varphi$ defined in Example 1.18 is the multiplication operator $M_{\bar{\varphi}}$.

Exercise 2.8. Let $L$ and $R$ be the left- and right-shift operators on $\ell^2(\mathbb{N})$, i.e.,
\[ L(x_1, x_2, \dots) = (x_2, x_3, \dots) \qquad \text{and} \qquad R(x_1, x_2, \dots) = (0, x_1, x_2, \dots). \]
Prove that $L = R^*$.

Example 2.9. Let $L$ be the integral operator defined in (1.3), determined by the kernel function $k$. Assume that $k$ is chosen so that $L : L^2(X) \to L^2(X)$ is bounded. The adjoint is the unique operator $L^* : L^2(X) \to L^2(X)$ which satisfies
\[ \langle Lf, g \rangle = \langle f, L^* g \rangle, \qquad f, g \in L^2(X). \]
To find $L^*$, let $A : L^2(X) \to L^2(X)$ be the integral operator with kernel $\overline{k(y, x)}$, i.e.,
\[ Af(x) = \int_X \overline{k(y, x)} \, f(y) \, d\mu(y). \]
Then, given any $f$ and $g \in L^2(X)$, we have
\begin{align*}
\langle f, L^* g \rangle = \langle Lf, g \rangle &= \int_X Lf(x) \, \overline{g(x)} \, d\mu(x) \\
&= \int_X \int_X k(x, y) \, f(y) \, d\mu(y) \, \overline{g(x)} \, d\mu(x) \\
&= \int_X f(y) \int_X k(x, y) \, \overline{g(x)} \, d\mu(x) \, d\mu(y) \\
&= \int_X f(y) \, \overline{\int_X \overline{k(x, y)} \, g(x) \, d\mu(x)} \, d\mu(y) \\
&= \int_X f(y) \, \overline{Ag(y)} \, d\mu(y) \\
&= \langle f, Ag \rangle.
\end{align*}
By uniqueness of the adjoint, we must have $L^* = A$.

Exercise: Justify the interchange in the order of integration in the above calculation, i.e., provide hypotheses under which the calculations above are justified.

Exercise 2.10. Let $\{e_n\}_{n \in \mathbb{N}}$ be an orthonormal basis for a separable Hilbert space $H$. Define $T : H \to \ell^2(\mathbb{N})$ by $T(f) = \{\langle f, e_n \rangle\}_{n \in \mathbb{N}}$. Find a formula for $T^* : \ell^2(\mathbb{N}) \to H$.

Definition 2.11. Let $A \in B(H)$.
(a) We say that $A$ is self-adjoint or Hermitian if $A = A^*$.
(b) We say that $A$ is normal if $A A^* = A^* A$.

Example 2.12.
A real $n \times n$ matrix $A$ is self-adjoint if and only if it is symmetric, i.e., if $A = A^T$. A complex $n \times n$ matrix $A$ is self-adjoint if and only if it is Hermitian, i.e., if $A = A^H$.

Exercise 2.13. Show that every self-adjoint operator is normal. Show that every unitary operator is normal, but that a unitary operator need not be self-adjoint. For $H = \mathbb{C}^n$, find examples of matrices that are not normal. Are the left- and right-shift operators on $\ell^2(\mathbb{N})$ normal?

Exercise 2.14. (a) Show that if $A, B \in B(H)$ are self-adjoint, then $AB$ is self-adjoint if and only if $AB = BA$.
(b) Give an example of self-adjoint operators $A$, $B$ such that $AB$ is not self-adjoint.
(c) Show that if $A, B \in B(H)$ are self-adjoint then $A + A^*$, $A A^*$, $A^* A$, $A + B$, $ABA$, and $BAB$ are all self-adjoint. What about $A - A^*$ or $A - B$? Show that $A A^* - A^* A$ is self-adjoint.

Exercise 2.15. (a) Let $\lambda = (\lambda_n)_{n \in \mathbb{N}} \in \ell^\infty(\mathbb{N})$ be given and let $L$ be defined as in Example 1.7. Show that $L$ is normal, find a formula for $L^*$, and prove that $L$ is self-adjoint if and only if each $\lambda_n$ is real.
(b) Determine a necessary and sufficient condition on $\varphi$ so that the multiplication operator $M_\varphi$ defined in Example 1.18 is self-adjoint.
(c) Determine a necessary and sufficient condition on the kernel $k$ so that the integral operator $L$ defined in (1.3) is self-adjoint.

The following result gives a useful condition for telling when an operator on a complex Hilbert space is self-adjoint.

Proposition 2.16. Let $H$ be a complex Hilbert space (i.e., $\mathbb{F} = \mathbb{C}$), and let $A \in B(H)$ be given. Then:
\[ A \text{ is self-adjoint} \iff \langle Af, f \rangle \in \mathbb{R} \quad \forall\, f \in H. \]

Proof. $\Rightarrow$. Assume $A = A^*$. Then for any $f \in H$ we have
\[ \overline{\langle Af, f \rangle} = \langle f, Af \rangle = \langle A^* f, f \rangle = \langle Af, f \rangle. \]
Therefore $\langle Af, f \rangle$ is real.

$\Leftarrow$. Assume that $\langle Af, f \rangle$ is real for all $f$. Choose any $f$, $g \in H$. Then
\[ \langle A(f + g), f + g \rangle = \langle Af, f \rangle + \langle Af, g \rangle + \langle Ag, f \rangle + \langle Ag, g \rangle. \]
Since $\langle A(f + g), f + g \rangle$, $\langle Af, f \rangle$, and $\langle Ag, g \rangle$ are all real, we conclude that $\langle Af, g \rangle + \langle Ag, f \rangle$ is real.
Hence it equals its own complex conjugate, i.e.,
⟨Af, g⟩ + ⟨Ag, f⟩ = conj(⟨Af, g⟩ + ⟨Ag, f⟩) = ⟨g, Af⟩ + ⟨f, Ag⟩, (2.2)
where conj(z) denotes the complex conjugate of z. Similarly, since
⟨A(f + ig), f + ig⟩ = ⟨Af, f⟩ − i⟨Af, g⟩ + i⟨Ag, f⟩ + ⟨Ag, g⟩
is also real, we see that
−i⟨Af, g⟩ + i⟨Ag, f⟩ = conj(−i⟨Af, g⟩ + i⟨Ag, f⟩) = i⟨g, Af⟩ − i⟨f, Ag⟩.
Multiplying through by i yields
⟨Af, g⟩ − ⟨Ag, f⟩ = −⟨g, Af⟩ + ⟨f, Ag⟩. (2.3)
Adding (2.2) and (2.3) together, we obtain
2⟨Af, g⟩ = 2⟨f, Ag⟩ = 2⟨A*f, g⟩.
Since this is true for every f and g, we conclude that A = A*.

Example 2.17. The preceding result is false for real Hilbert spaces. After all, if F = R then ⟨Af, f⟩ is real for every f no matter what A is. Therefore, any non-self-adjoint operator provides a counterexample. For example, if H = R^n then any non-symmetric matrix A is a counterexample.

The next result provides a useful way of calculating the operator norm of a self-adjoint operator.

Proposition 2.18. If A ∈ B(H) is self-adjoint, then
‖A‖ = sup_{‖f‖=1} |⟨Af, f⟩|.

Proof. Set M = sup_{‖f‖=1} |⟨Af, f⟩|. By Cauchy–Schwarz and the definition of operator norm, we have
M = sup_{‖f‖=1} |⟨Af, f⟩| ≤ sup_{‖f‖=1} ‖Af‖ ‖f‖ ≤ sup_{‖f‖=1} ‖A‖ ‖f‖ ‖f‖ = ‖A‖.
To get the opposite inequality, note that if f is any nonzero vector in H then f/‖f‖ is a unit vector, so |⟨A(f/‖f‖), f/‖f‖⟩| ≤ M. Rearranging, we see that
∀ f ∈ H, |⟨Af, f⟩| ≤ M ‖f‖². (2.4)
Now choose any f, g ∈ H with ‖f‖ = ‖g‖ = 1. Then, by expanding the inner products, canceling terms, and using the fact that A = A*, we see that
⟨A(f + g), f + g⟩ − ⟨A(f − g), f − g⟩ = 2⟨Af, g⟩ + 2⟨Ag, f⟩ = 2⟨Af, g⟩ + 2⟨g, Af⟩ = 4 Re⟨Af, g⟩.
Therefore, applying (2.4) and the Parallelogram Law, we have
4 Re⟨Af, g⟩ ≤ |⟨A(f + g), f + g⟩| + |⟨A(f − g), f − g⟩| ≤ M ‖f + g‖² + M ‖f − g‖² = 2M (‖f‖² + ‖g‖²) = 4M.
That is, Re⟨Af, g⟩ ≤ M for every choice of unit vectors f and g. Write ⟨Af, g⟩ = |⟨Af, g⟩| e^{iθ}. Then e^{iθ}g is another unit vector, so
M ≥ Re⟨Af, e^{iθ}g⟩ = Re(e^{−iθ}⟨Af, g⟩) = |⟨Af, g⟩|.
Hence
‖Af‖ = sup_{‖g‖=1} |⟨Af, g⟩| ≤ M.
Since this is true for every unit vector f, we conclude that ‖A‖ ≤ M.

The following corollary is a very useful consequence.

Corollary 2.19. Assume that A ∈ B(H).
(a) If F = R, A = A*, and ⟨Af, f⟩ = 0 for every f, then A = 0.
(b) If F = C and ⟨Af, f⟩ = 0 for every f, then A = 0.

Proof. Assume the hypotheses of either statement (a) or statement (b). In the case of statement (a), we have by hypothesis that A is self-adjoint. In the case of statement (b), we can conclude that A is self-adjoint because ⟨Af, f⟩ = 0 is real for every f. Hence in either case we can apply Proposition 2.18 to conclude that
‖A‖ = sup_{‖f‖=1} |⟨Af, f⟩| = 0.

Lemma 2.20. If A ∈ B(H), then the following statements are equivalent.
(a) A is normal, i.e., AA* = A*A.
(b) ‖Af‖ = ‖A*f‖ for every f ∈ H.

Proof. (b) ⇒ (a). Assume that (b) holds. Then for every f we have
⟨(A*A − AA*)f, f⟩ = ⟨A*Af, f⟩ − ⟨AA*f, f⟩ = ⟨Af, Af⟩ − ⟨A*f, A*f⟩ = ‖Af‖² − ‖A*f‖² = 0.
Since A*A − AA* is self-adjoint, it follows from Corollary 2.19 that A*A − AA* = 0.
(a) ⇒ (b). Exercise.

Corollary 2.21. If A ∈ B(H) is normal, then ker(A) = ker(A*).

Exercise 2.22. Suppose that A ∈ B(H) is normal. Prove that A is injective if and only if range(A) is dense in H.

Exercise 2.23. If A ∈ B(H), then the following statements are equivalent.
(a) A is an isometry, i.e., ‖Af‖ = ‖f‖ for every f ∈ H.
(b) A*A = I.
(c) ⟨Af, Ag⟩ = ⟨f, g⟩ for every f, g ∈ H.

Exercise 2.24. If H = C^n and A, B are n × n matrices, then AB = I implies BA = I. Give a counterexample to this for an infinite-dimensional Hilbert space. Consequently, the hypothesis A*A = I in the preceding result does not imply that AA* = I.

Exercise 2.25. If A ∈ B(H), then the following statements are equivalent.
(a) A*A = AA* = I.
(b) A is unitary, i.e., it is a surjective isometry.
(c) A is a normal isometry.

The following result provides a very useful relationship between the range of A* and the kernel of A.
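Before stating the result, here is a small finite-dimensional sanity check, offered as a sketch rather than as part of the development in these notes. It assumes H = R^3 and K = R^2 with the usual dot product, so that the adjoint of a real matrix is simply its transpose. The matrix A and the helper names transpose, matvec, and dot are illustrative choices, not taken from the text. The check confirms that every vector in ker(A) is orthogonal to every vector of the form A*g, which is the inclusion ker(A) ⊆ range(A*)^⊥.

```python
# Sanity check of ker(A) ⊆ range(A*)^⊥ for a real matrix, where the
# adjoint A* is the transpose. All names here are illustrative assumptions.

def transpose(A):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*A)]

def matvec(A, x):
    """Multiply the matrix A (list of rows) by the vector x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def dot(x, y):
    """Real inner product of two vectors."""
    return sum(xi * yi for xi, yi in zip(x, y))

# A maps R^3 -> R^2 and has rank 1, so ker(A) is 2-dimensional.
A = [[1, 2, 3],
     [2, 4, 6]]
A_star = transpose(A)  # maps R^2 -> R^3

kernel_basis = [[2, -1, 0], [3, 0, -1]]  # both satisfy A f = 0
test_g = [[1, 0], [0, 1], [5, -7]]       # arbitrary vectors in R^2

for f in kernel_basis:
    assert matvec(A, f) == [0, 0]
    for g in test_g:
        h = matvec(A_star, g)  # h lies in range(A*)
        # <f, A* g> = <A f, g> = 0
        assert dot(f, h) == dot(matvec(A, f), g) == 0

# A vector outside ker(A) need not be orthogonal to range(A*).
f_bad = [1, 0, 0]
print(matvec(A, f_bad), dot(f_bad, matvec(A_star, [1, 0])))
```

Exact integer arithmetic is used so that the orthogonality checks are genuine equalities rather than floating-point approximations. The check only illustrates one inclusion on one example; it is not a substitute for the proof.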
Theorem 2.26. Let A ∈ B(H, K).
(a) ker(A) = range(A*)^⊥.
(b) ker(A)^⊥ is the closure of range(A*).
(c) A is injective if and only if range(A*) is dense in H.

Proof. (a) Assume that f ∈ ker(A) and let h ∈ range(A*), i.e., h = A*g for some g ∈ K. Then since Af = 0, we have ⟨f, h⟩ = ⟨f, A*g⟩ = ⟨Af, g⟩ = 0. Thus f ∈ range(A*)^⊥, so ker(A) ⊆ range(A*)^⊥.
Now assume that f ∈ range(A*)^⊥. Then for any h ∈ K we have ⟨Af, h⟩ = ⟨f, A*h⟩ = 0. But this implies Af = 0, so f ∈ ker(A). Thus range(A*)^⊥ ⊆ ker(A).
(b), (c) Exercises.

3. Projections and Idempotents: Invariant and Reducing Subspaces

Definition 3.1.
(a) If E ∈ B(H) satisfies E² = E then E is said to be idempotent.
(b) If E ∈ B(H) satisfies E² = E and ker(E) = range(E)^⊥ then E is called a projection.

Exercise 3.2. If E ∈ B(H) is an idempotent operator, then ker(E) and range(E) are closed subspaces of H. Further, ker(E) = range(I − E) and range(E) = ker(I − E).

Lemma 3.3 (Characterization of Orthogonal Projections). Let E ∈ B(H) be a nonzero idempotent operator. Then the following statements are equivalent.
(a) E is a projection.
(b) E is the orthogonal projection of H onto range(E).
(c) ‖E‖ = 1.
(d) E is self-adjoint.
(e) E is normal.
(f) E is positive, i.e., ⟨Ef, f⟩ ≥ 0 for every f ∈ H.

Proof. (e) ⇒ (a). Assume that E² = E and E is normal. Then from Lemma 2.20 we know that ‖Ef‖ = ‖E*f‖ for every f ∈ H. Hence Ef = 0 if and only if E*f = 0, or in other words, ker(E) = ker(E*). But we know from Theorem 2.26 that ker(E*) = range(E)^⊥. Hence we conclude that ker(E) = range(E)^⊥, and therefore E is a projection.
The remaining implications are exercises.

Definition 3.4 (Orthogonal Direct Sum of Subspaces). Let {M_i}_{i∈I} be a collection of closed subspaces of H such that M_i ⊥ M_j whenever i ≠ j. Then the orthogonal direct sum of the M_i is the smallest closed subspace which contains every M_i. This space is
⊕_{i∈I} M_i = the closure of span(∪_{i∈I} M_i).

Exercise 3.5.
Suppose that M, N are closed subspaces of H such that M ⊥ N. Prove that M + N = {m + n : m ∈ M, n ∈ N} is a closed subspace of H, and that M ⊕ N = M + N. Show that every vector x ∈ M ⊕ N can be written uniquely as x = m + n with m ∈ M and n ∈ N. Extend by induction to finite collections of closed, pairwise orthogonal subspaces. (Unfortunately, the analogous statement is not true for infinite collections.)

Exercise 3.6. Show that if A ∈ B(H, K) then H = ker(A) ⊕ R, where R is the closure of range(A*).

Definition 3.7. Let A ∈ B(H) and M ≤ H.
(a) We say that M is invariant under A if A(M) ⊆ M, where A(M) = {Ax : x ∈ M}. That is, M is invariant if x ∈ M implies Ax ∈ M. Note that it need not be the case that A(M) = M.
(b) We say that M is a reducing subspace for A if both M and M^⊥ are invariant under A, i.e., A(M) ⊆ M and A(M^⊥) ⊆ M^⊥.

Proposition 3.8. Let A ∈ B(H) and M ≤ H be given. Then the following statements are equivalent.
(a) M is invariant under A.
(b) PAP = AP, where P = P_M is the orthogonal projection of H onto M.

Exercise 3.9. Define L : ℓ²(Z) → ℓ²(Z) by
L(. . . , x_{−1}, x_0, x_1, . . . ) = (. . . , x_0, x_1, x_2, . . . ),
where on the right-hand side the entry x_1 sits in the 0th component position. That is, L slides each component one unit to the left (L is called a bilateral shift). Find a closed subspace of ℓ²(Z) that is invariant but not reducing under L.

Exercise 3.10. Assume that M ≤ H is invariant under L ∈ B(H). Prove that M^⊥ is invariant under L*.

4. Compact Operators

Definition 4.1 (Compact and Totally Bounded Sets). Let X be a Banach space, and let E ⊆ X be given.
(a) We say that E is compact if every open cover of E contains a finite subcover. That is, E is compact if whenever {U_α}_{α∈I} is a collection of open sets whose union contains E, then there exist finitely many α_1, . . . , α_N such that E ⊆ U_{α_1} ∪ · · · ∪ U_{α_N}.
(b) We say that E is sequentially compact if every sequence {f_n}_{n∈N} of points of E contains a convergent subsequence {f_{n_k}}_{k∈N} whose limit belongs to E.
(c) We say that E is totally bounded if for every ε > 0 there exist finitely many points f_1, . . . , f_N ∈ E such that
E ⊆ ∪_{k=1}^N B(f_k, ε),
where B(f_k, ε) is the open ball of radius ε centered at f_k. That is, E is totally bounded if and only if for every ε > 0 there exist finitely many points f_1, . . . , f_N ∈ E such that every element of E is within ε of some f_k.

In finite dimensions, a set is compact if and only if it is closed and bounded. In infinite dimensions, all compact sets are closed and bounded, but the converse fails. Instead, we have the following characterization of compact sets (this characterization actually holds in any complete metric space).

Theorem 4.2. Let E be a subset of a Banach space X. Then the following statements are equivalent.
(a) E is compact.
(b) E is sequentially compact.
(c) E is closed and totally bounded.

Proof. (b) ⇒ (a). (This proof is adapted from one given in J. R. Munkres, “Topology,” Second Edition, Prentice Hall, 2000.) Assume that E is sequentially compact. Our first step will be to prove the following claim, where the diameter of a set S is defined to be diam(S) = sup{‖f − g‖ : f, g ∈ S}.

Claim 1. For any open cover {U_α}_{α∈I} of E, there exists a number δ > 0 (called a Lebesgue number for the cover) such that if S ⊆ E satisfies diam(S) < δ, then there is an α ∈ I such that S ⊆ U_α.

To prove the claim, suppose that {U_α}_{α∈I} was an open cover of E such that no δ with the required property existed. Then for each n ∈ N, we could find a set S_n ⊆ E with diam(S_n) < 1/n such that S_n is not contained in any U_α. Choose any f_n ∈ S_n. Since E is sequentially compact, there must be a subsequence {f_{n_k}}_{k∈N} that converges to an element of E, say f_{n_k} → a ∈ E.
But we must have a ∈ U_α for some α, and since U_α is open there must exist some ε > 0 such that B(a, ε) ⊆ U_α. Now choose k large enough that we have both
1/n_k < ε/2 and ‖a − f_{n_k}‖ < ε/2.
The first inequality above implies that diam(S_{n_k}) < ε/2. Therefore, using this and the second inequality, we have S_{n_k} ⊆ B(a, ε) ⊆ U_α, which is a contradiction. Therefore the claim is proved.

Next, we will prove the following claim.

Claim 2. For any ε > 0, there exist finitely many f_1, . . . , f_N ∈ E such that
E ⊆ ∪_{k=1}^N B(f_k, ε).

To prove this claim, assume that there is an ε > 0 such that E cannot be covered by finitely many ε-balls centered at points of E. Choose any f_1 ∈ E. Since E cannot be covered by a single ε-ball, we have E ⊄ B(f_1, ε). Hence there exists f_2 ∈ E \ B(f_1, ε), i.e., f_2 ∈ E and ‖f_2 − f_1‖ ≥ ε. But E cannot be covered by two ε-balls, so there must exist an f_3 ∈ E \ (B(f_1, ε) ∪ B(f_2, ε)). In particular, we have ‖f_3 − f_1‖, ‖f_3 − f_2‖ ≥ ε. Continuing in this way we obtain a sequence of points {f_n}_{n∈N} in E with ‖f_m − f_n‖ ≥ ε for all m ≠ n. Such a sequence has no convergent subsequence, which is a contradiction. Hence the claim is proved.

Finally, we show that E is compact. Let {U_α}_{α∈I} be any open cover of E. Let δ be the Lebesgue number given by Claim 1, and set ε = δ/3. By Claim 2, there exists a covering of E by finitely many ε-balls. The part of each ball lying in E has diameter at most 2ε < δ, so by Claim 1 it is contained in some U_α. Thus we find finitely many U_α that cover E.

(c) ⇒ (b). Assume that E is closed and totally bounded, and let {f_n}_{n∈N} be any sequence of points in E. Since E is covered by finitely many balls of radius 1/2, one of those balls must contain infinitely many f_n, say {f_n^{(1)}}_{n∈N}. Then we have
∀ m, n ∈ N, ‖f_m^{(1)} − f_n^{(1)}‖ < 1.
Since E is covered by finitely many balls of radius 1/4, we can find a subsequence {f_n^{(2)}}_{n∈N} of {f_n^{(1)}}_{n∈N} such that
∀ m, n ∈ N, ‖f_m^{(2)} − f_n^{(2)}‖ < 1/2.
By induction we keep constructing subsequences {f_n^{(k)}}_{n∈N} such that ‖f_m^{(k)} − f_n^{(k)}‖ < 1/k for all m, n ∈ N.
Now consider the “diagonal subsequence” {f_n^{(n)}}_{n∈N}. Given ε > 0, let N be large enough that 1/N < ε. If m ≥ n > N, then f_m^{(m)} is one element of the sequence {f_k^{(n)}}_{k∈N}, say f_m^{(m)} = f_k^{(n)}. Then
‖f_m^{(m)} − f_n^{(n)}‖ = ‖f_k^{(n)} − f_n^{(n)}‖ < 1/n < ε.
Thus {f_n^{(n)}}_{n∈N} is Cauchy and hence converges, since X is complete. Since E is closed, it must converge to some element of E.

(a) ⇒ (c). Exercise.

Exercise 4.3. Show that if E is a totally bounded subset of a Banach space X, then the closure of E is compact. A set whose closure is compact is said to be precompact.

Notation 4.4. We let Ball_H denote the closed unit ball in H, i.e.,
Ball_H = Ball(H) = {f ∈ H : ‖f‖ ≤ 1}.

Exercise 4.5. Prove that if H is infinite-dimensional, then Ball_H is not compact.

Definition 4.6 (Compact Operators). Let H, K be Hilbert spaces. A linear operator T : H → K is compact if T(Ball_H) has compact closure in K. We define
B_0(H, K) = {T : H → K : T is compact},
and set B_0(H) = B_0(H, H).

By definition, a compact operator is linear, and we will see that all compact operators are bounded. Thus it will turn out that B_0(H, K) ⊆ B(H, K). In fact, we will see that B_0(H, K) is a closed subspace of B(H, K).

The following result gives some useful reformulations of the definition of compact operator.

Proposition 4.7 (Characterizations of Compact Operators). Let T : H → K be linear. Then the following statements are equivalent.
(a) T is compact.
(b) T(Ball_H) is totally bounded.
(c) If {f_n}_{n∈N} is a bounded sequence in H, then {T f_n}_{n∈N} contains a convergent subsequence.

Proof. (a) ⇔ (b). This follows from Theorem 4.2 and Exercise 4.3.
(a) ⇒ (c). Suppose that T is compact and that {f_n}_{n∈N} is a bounded sequence in H. By rescaling the sequence (i.e., multiplying by an appropriate scalar), we may assume that f_n ∈ Ball_H for every n. Therefore T f_n belongs to T(Ball_H), which is contained in the closure of T(Ball_H).
Since the closure of T(Ball_H) is compact, it follows from Theorem 4.2 that {T f_n}_{n∈N} contains a subsequence which converges to an element of the closure of T(Ball_H).
(c) ⇒ (a). Exercise.

Proposition 4.8. If T : H → K is compact, then it is bounded. That is, B_0(H, K) ⊆ B(H, K).

Proof. Assume that T : H → K is linear but unbounded. Then there exist vectors f_n ∈ H such that ‖f_n‖ = 1 but ‖T f_n‖ ≥ n. Therefore every subsequence of {T f_n}_{n∈N} is unbounded, and hence cannot converge. Therefore T is not compact by Proposition 4.7.

Exercise 4.9. Show that if H is infinite-dimensional then the identity operator on H is not compact. Hence a bounded operator need not be compact in general.

The following exercise shows that a compact operator maps an orthonormal sequence to a sequence that converges to the zero vector.

Exercise 4.10. (a) Let {h_n}_{n∈N} be a sequence of vectors in H, and let h ∈ H. Suppose that every subsequence of {h_n}_{n∈N} contains a subsequence that converges to h. Prove that h_n → h.
Hint: Proceed by contradiction. Suppose that h_n does not converge to h. Show that this implies that there is an ε > 0 and a subsequence {h_{n_k}}_{k∈N} such that ‖h − h_{n_k}‖ ≥ ε for every k.
(b) Suppose that T : H → K is compact, and let {e_n}_{n∈N} be an orthonormal sequence in H. Show that T e_n → 0.
Hint: Choose any subsequence {f_n}_{n∈N} of {e_n}_{n∈N}. Since T is compact, this sequence has a subsequence {g_n}_{n∈N} such that {T g_n}_{n∈N} converges, say T g_n → h. Prove that ⟨T g_n, h⟩ → 0 (use Bessel’s Inequality to find a bound for the ℓ²-norm of {⟨T g_n, h⟩}_{n∈N}). Use part (a) to complete the proof.

The following exercise shows that a compact operator maps weakly convergent sequences to convergent sequences.

Definition 4.11. Let {f_n}_{n∈N} be a sequence of vectors in H and let f ∈ H. We say that f_n converges weakly to f, written f_n →^w f, if
∀ g ∈ H, ⟨f_n, g⟩ → ⟨f, g⟩ as n → ∞.

Exercise 4.12. (a) Show that if f_n → f, then f_n →^w f.
(b) Show that if {e_n}_{n∈N} is an orthonormal sequence in H, then e_n →^w 0, i.e., e_n converges weakly to 0.
(c) Suppose that T ∈ B(H) is compact. Show that if f_n →^w f (weak convergence), then T f_n → T f.

Exercise 4.13. Let φ ∈ L^∞(R^n) be fixed, with φ ≠ 0. Then by Exercise 1.18 we know that the multiplication operator M_φ : L²(R^n) → L²(R^n) given by M_φ f = f φ is bounded. Show that M_φ is not compact.
Hint: There must exist an ε > 0 and a set E ⊆ R^n with positive measure such that |φ(x)| ≥ ε for all x ∈ E.
Exhibit a measure space (X, Ω, µ) and a bounded, nonzero φ ∈ L^∞(X) such that M_φ is compact.
Hint: Consider Exercise 4.23.

Exercise 4.14. Prove that if T : H → K is compact and injective, then T^{−1} : range(T) → H is unbounded.

Theorem 4.15 (Limits of Compact Operators). B_0(H, K) is a closed subspace of B(H, K) (under the operator norm). That is,
(a) if S, T ∈ B_0(H, K) and α, β ∈ F, then αS + βT ∈ B_0(H, K),
(b) if T_n ∈ B_0(H, K), T ∈ B(H, K), and ‖T − T_n‖ → 0, then T ∈ B_0(H, K).

Proof. (a) Exercise.
(b) Assume that T_n are compact operators and that T_n → T in operator norm. By Proposition 4.7, it suffices to show that T(Ball_H) is a totally bounded subset of K.
Choose any ε > 0. Then there exists an n such that ‖T − T_n‖ < ε/3. Now, T_n is compact, so T_n(Ball_H) is totally bounded. Hence there exist finitely many points h_1, . . . , h_m ∈ Ball_H such that
T_n(Ball_H) ⊆ ∪_{j=1}^m B(T_n h_j, ε/3). (4.1)
We will show that T(Ball_H) is totally bounded by showing that
T(Ball_H) ⊆ ∪_{j=1}^m B(T h_j, ε). (4.2)
Choose any element of T(Ball_H), i.e., any point T f with ‖f‖ ≤ 1. Then T_n f ∈ T_n(Ball_H), so by (4.1) there must be some j such that ‖T_n f − T_n h_j‖ < ε/3. Consequently,
‖T f − T h_j‖ ≤ ‖T f − T_n f‖ + ‖T_n f − T_n h_j‖ + ‖T_n h_j − T h_j‖
≤ ‖T − T_n‖ ‖f‖ + ‖T_n f − T_n h_j‖ + ‖T_n − T‖ ‖h_j‖
< (ε/3) · 1 + ε/3 + (ε/3) · 1
= ε.
Hence (4.2) follows, so T is compact.

Exercise 4.16. Another way to prove Theorem 4.15 is to apply a Cantor diagonalization argument.
Fill in the details in the following sketch of this argument.
Suppose that {f_n}_{n∈N} is a bounded sequence in H. Then since T_1 is compact, there exists a subsequence {f_n^{(1)}}_{n∈N} of {f_n}_{n∈N} such that {T_1 f_n^{(1)}}_{n∈N} converges. Then since T_2 is compact, there exists a subsequence {f_n^{(2)}}_{n∈N} of {f_n^{(1)}}_{n∈N} such that {T_2 f_n^{(2)}}_{n∈N} converges (and note that {T_1 f_n^{(2)}}_{n∈N} also converges!). Continue to construct subsequences in this way, and then show that the “diagonal subsequence” {T f_n^{(n)}}_{n∈N} converges (use the fact that there exists a k such that ‖T − T_k‖ < ε). Therefore T is compact.

Theorem 4.17 (Compositions and Compact Operators). Let H_1, H_2, H_3 be Hilbert spaces.
(a) If A : H_1 → H_2 is bounded and T : H_2 → H_3 is compact, then T A : H_1 → H_3 is compact.
(b) If T : H_1 → H_2 is compact and A : H_2 → H_3 is bounded, then AT : H_1 → H_3 is compact.

Proof. (b) Assume that A is bounded and T is compact. Let {f_n}_{n∈N} be any bounded sequence in H_1. Then since T is compact, there is a subsequence {T f_{n_k}}_{k∈N} that converges in H_2. Since A is bounded, the subsequence {AT f_{n_k}}_{k∈N} therefore converges in H_3. Hence AT is compact.
(a) Exercise.

Exercise 4.18. Prove that if T ∈ B_0(H, K), then range(T) is a separable subspace of K.
Hints: Since the closure of T(Ball_H) is compact, T(Ball_H) is totally bounded. Hence for each n ∈ N we can find finitely many balls of radius 1/n with centers in T(Ball_H) that cover T(Ball_H). If we consider all these balls for every n, we have countably many balls that cover T(Ball_H). Show that this implies that T(Ball_H) contains a countable, dense subset. Then do the same for the image of the ball of radius k for each k ∈ N instead of just k = 1. Combine all of these together to get a countable dense subset of range(T).

Definition 4.19 (Finite-Rank Operators). Recall that the rank of an operator T : H → K is the dimension of range(T). We say that T is a finite-rank operator if range(T) is finite-dimensional.
We set
B_{00}(H, K) = {T ∈ B(H, K) : T is finite-rank},
and set B_{00}(H) = B_{00}(H, H).

A linear, finite-rank operator need not be bounded (that is why we include the assumption of boundedness in the definition of B_{00}(H, K) above). However, the following result shows that if a finite-rank operator is bounded, then it is actually compact.

Proposition 4.20. If T : H → K is bounded, linear, and has finite rank, then T is compact. Thus, B_{00}(H, K) ⊆ B_0(H, K).

Proof. Since T is bounded, T(Ball_H) is a bounded subset of the finite-dimensional space range(T). All finite-dimensional spaces are closed. Hence the closure of T(Ball_H) is a closed and bounded subset of range(T), and therefore is compact.

This gives us the following very useful way to show that a general operator T is compact: try to construct a sequence of finite-rank operators T_n that converge to T in operator norm.

Corollary 4.21. Suppose that T_n ∈ B(H, K) are finite-rank operators, T ∈ B(H, K), and T_n → T in operator norm. Then T is compact.

Exercise 4.22. Show that if E ∈ B(H) is compact and idempotent, then E has finite rank.

Example 4.23. Let {e_n}_{n∈N} be an orthonormal basis for a separable Hilbert space H, and let λ = (λ_n)_{n∈N} be a bounded sequence of scalars. Then we know from Example 1.7 that
Lf = Σ_{n=1}^∞ λ_n ⟨f, e_n⟩ e_n
defines a bounded operator on H. Suppose that λ_n → 0 as n → ∞. Define
L_N f = Σ_{n=1}^N λ_n ⟨f, e_n⟩ e_n.
Since range(L_N) ⊆ span{e_1, . . . , e_N} (must it be equality?), we have that L_N is finite-rank. (Exercise: Show that L is not finite-rank if there are infinitely many λ_n ≠ 0.) Further, L_N is a good approximation to L, because (using the Plancherel Theorem) we have
‖(L − L_N)f‖² = ‖Σ_{n=N+1}^∞ λ_n ⟨f, e_n⟩ e_n‖²
= Σ_{n=N+1}^∞ |λ_n|² |⟨f, e_n⟩|²
≤ (sup_{n>N} |λ_n|²) Σ_{n=N+1}^∞ |⟨f, e_n⟩|²
≤ (sup_{n>N} |λ_n|²) ‖f‖².
It follows that L_N converges to L in operator norm:
lim_{N→∞} ‖L − L_N‖² ≤ lim_{N→∞} sup_{n>N} |λ_n|² = limsup_{n→∞} |λ_n|² = 0.
Since each L_N is compact, we conclude that L is compact as well.

Exercise 4.24. Continuing the preceding example, prove the following.
(a) Prove that if λ_n does not converge to zero then L is not compact.
Hint: We know at least some of the eigenvectors of L.
(b) Prove that, with only the assumption that λ ∈ ℓ^∞, we have
∀ f ∈ H, L_N f → Lf. (4.3)
That is, for each individual vector f we have ‖Lf − L_N f‖ → 0, where this is the norm in H. A sequence of operators which satisfies (4.3) is said to converge strongly or in the strong operator topology (SOT). Prove that strong convergence of operators does not imply convergence in operator norm, i.e., (4.3) does not imply that ‖L − L_N‖ → 0.
(c) Assuming that λ ∈ ℓ^∞, prove that L is self-adjoint if and only if every λ_n is real.

We can characterize all the finite-rank operators, as follows.

Proposition 4.25 (Finite-Rank Operators). Let L : H → K be bounded and linear. Then the following statements are equivalent.
(a) L has finite rank.
(b) There exist vectors ϕ_1, . . . , ϕ_N ∈ H and ψ_1, . . . , ψ_N ∈ K such that
Lf = Σ_{k=1}^N ⟨f, ϕ_k⟩ ψ_k, f ∈ H. (4.4)

Proof. (a) ⇒ (b). Since L has finite rank, we know that range(L) is a finite-dimensional subspace of K. Every finite-dimensional subspace is closed, so we can find a finite orthonormal basis {ψ_k}_{k=1}^N for range(L). Therefore, if f ∈ H then we can express Lf in terms of this orthonormal basis:
Lf = Σ_{k=1}^N ⟨Lf, ψ_k⟩ ψ_k = Σ_{k=1}^N ⟨f, L*ψ_k⟩ ψ_k = Σ_{k=1}^N ⟨f, ϕ_k⟩ ψ_k,
where ϕ_k = L*ψ_k.
(b) ⇒ (a). We have range(L) ⊆ span{ψ_1, . . . , ψ_N}.

Corollary 4.26. If L ∈ B(H, K) has rank 1, then there exist ϕ ∈ H and ψ ∈ K such that
Lf = ⟨f, ϕ⟩ ψ, f ∈ H.
In particular, if ϕ = ψ are unit vectors, then Lf = ⟨f, ϕ⟩ ϕ is the orthogonal projection of H onto span{ϕ}.

Exercise 4.27. Compute the adjoint of L given by (4.4). Conclude that the adjoint of a finite-rank operator is also finite-rank.

Exercise 4.28.
Show that if A ∈ B(H) and AT = T A for every finite-rank T then A = cI for some scalar c.

Exercise 4.29. Use the idea of Proposition 4.25 to show that if H is separable and L : H → H is any bounded linear operator, then there exist finite-rank operators L_N that converge to L in the strong operator topology, i.e., ‖Lf − L_N f‖ → 0 for each individual f. However, observe that if L is not compact, then we cannot have L_N → L in operator norm.

The following result shows that not only is the operator norm limit of a sequence of finite-rank operators compact, but every compact operator can be realized as the operator norm limit of finite-rank operators.

Theorem 4.30. If T ∈ B(H, K), then the following statements are equivalent.
(a) T is compact.
(b) There exist finite-rank operators T_n ∈ B(H, K) such that T_n → T in operator norm.
As a consequence, we have that B_{00}(H, K) is a dense subspace of B_0(H, K), i.e., the closure of B_{00}(H, K) in operator norm is B_0(H, K).

Proof. (b) ⇒ (a). This follows from Theorem 4.15 and the fact that all bounded finite-rank operators are compact.
(a) ⇒ (b). Assume that T is compact. Let R be the closure of range(T). If R is finite-dimensional, then T is finite-rank, and we are done. So, assume that R is infinite-dimensional. By Exercise 4.18, we know that R is separable, so there exists a countable orthonormal basis {e_n}_{n∈N} for R. For any f ∈ H we have T f ∈ R, so
T f = Σ_{n=1}^∞ ⟨T f, e_n⟩ e_n, f ∈ H.
Define
T_N f = Σ_{n=1}^N ⟨T f, e_n⟩ e_n, f ∈ H,
and note that T_N = P_N T where P_N is the orthogonal projection of K onto the closed subspace span{e_1, . . . , e_N}. By definition, we have that T_N converges to T in the strong operator topology, i.e., T_N f → T f for every f. Our goal is to show more, namely, to show that T_N → T in operator norm. That is, we need to show that sup_{‖f‖=1} ‖T f − T_N f‖ → 0.
Choose any ε > 0. Since T(Ball_H) is totally bounded, it is covered by finitely many balls of radius ε/3 centered at points in T(Ball_H). Hence, there exist h_1, . . . , h_m ∈ Ball_H such that
T(Ball_H) ⊆ ∪_{k=1}^m B(T h_k, ε/3).
Since lim_{N→∞} ‖T h_k − T_N h_k‖ = 0 for k = 1, . . . , m, we can find an N_0 such that
∀ N > N_0, ‖T h_k − T_N h_k‖ < ε/3, k = 1, . . . , m.
Choose any f with ‖f‖ = 1 and any N > N_0. Then T f ∈ B(T h_k, ε/3) for some k, i.e.,
‖T f − T h_k‖ < ε/3.
Therefore we also have
‖T_N f − T_N h_k‖ = ‖Σ_{n=1}^N ⟨T(f − h_k), e_n⟩ e_n‖
= (Σ_{n=1}^N |⟨T(f − h_k), e_n⟩|²)^{1/2}
≤ (Σ_{n=1}^∞ |⟨T(f − h_k), e_n⟩|²)^{1/2}
= ‖T f − T h_k‖ < ε/3.
Alternatively, this follows even more simply from the fact that
‖T_N f − T_N h_k‖ = ‖P_N T f − P_N T h_k‖ ≤ ‖P_N‖ ‖T f − T h_k‖ < 1 · ε/3.
In any case, it follows that
‖T f − T_N f‖ ≤ ‖T f − T h_k‖ + ‖T h_k − T_N h_k‖ + ‖T_N h_k − T_N f‖ < ε/3 + ε/3 + ε/3 = ε.
This is true for every unit vector, so we have ‖T − T_N‖ ≤ ε for all N > N_0. Therefore, we do indeed have ‖T − T_N‖ → 0.

Corollary 4.31. If T ∈ B(H, K), then T is compact ⇐⇒ T* is compact.

Proof. Assume that T is compact. Then there exist finite-rank operators T_N such that T_N → T. Hence T_N* → T* (why?), but each T_N* is finite-rank, so T* is compact. The converse is symmetrical.

Exercise 4.32. Extend Example 4.23 as follows. Let H be a separable Hilbert space, and let {e_n}_{n∈N} be an orthonormal basis for H. Let λ = (λ_n)_{n∈N} ∈ ℓ^∞(N) be given. Define L e_n = λ_n e_n. Prove that the definition of L can be extended to all of H in such a way that L is a bounded linear operator. Prove that this operator L is compact if and only if λ_n → 0.

The next result shows that an integral operator with a square-integrable kernel is compact.

Theorem 4.33. Let (X, Ω, µ) be a σ-finite measure space. If k ∈ L²(X × X), then the integral operator
Lf(x) = ∫_X k(x, y) f(y) dµ(y), f ∈ L²(X),
defines a compact mapping of L²(X) into itself. Further, ‖L‖ ≤ ‖k‖_2.

Proof. Note that by Theorem 1.22 we already know that L defines a bounded operator, and that ‖L‖ ≤ ‖k‖_2. So, we need only show that L is compact.
For simplicity, we will consider only the case where L²(X) is separable. In this case there exists an orthonormal basis {e_n}_{n∈N} for L²(X). Define
e_{mn}(x, y) = e_m(x) conj(e_n(y)), x, y ∈ X,
where conj denotes complex conjugation. Then it is easy to see that {e_{mn}}_{m,n∈N} is an orthonormal sequence in L²(X × X), and with more work (exercise; for details on this type of argument, see the “Real Analysis Review” handout on the instructor’s webpage) it can be shown that it is also complete and hence forms an orthonormal basis for L²(X × X). Since k ∈ L²(X × X), we therefore have
k = Σ_{m=1}^∞ Σ_{n=1}^∞ ⟨k, e_{mn}⟩ e_{mn},
where this series converges in the norm of L²(X × X), and in fact it converges unconditionally. For each N ∈ N define an approximation to k by setting
k_N = Σ_{m=1}^N Σ_{n=1}^N ⟨k, e_{mn}⟩ e_{mn}.
Then k_N → k in L²-norm. Now define an approximation to L by defining L_N to be the integral operator with kernel k_N, i.e.,
L_N f(x) = ∫_X k_N(x, y) f(y) dµ(y), f ∈ L²(X).
Since k_N ∈ L²(X × X), we know that L_N is bounded. Further, since the sums involved are finite, we can interchange sums and integrals in the following calculation to obtain that
L_N f(x) = ∫_X k_N(x, y) f(y) dµ(y)
= ∫_X Σ_{m=1}^N Σ_{n=1}^N ⟨k, e_{mn}⟩ e_{mn}(x, y) f(y) dµ(y)
= Σ_{m=1}^N Σ_{n=1}^N ⟨k, e_{mn}⟩ ∫_X e_m(x) conj(e_n(y)) f(y) dµ(y)
= Σ_{m=1}^N Σ_{n=1}^N ⟨k, e_{mn}⟩ ⟨f, e_n⟩ e_m(x).
This is an equality of functions, i.e., L_N f = Σ_{m=1}^N Σ_{n=1}^N ⟨k, e_{mn}⟩ ⟨f, e_n⟩ e_m a.e. In any case we have L_N f ∈ span{e_1, . . . , e_N}, so L_N has finite rank. Since L_N is bounded (why?), it is therefore compact. Consequently, if we can show that L_N → L, then we can conclude that L itself is compact.
Note that L − L_N is simply the integral operator with kernel k − k_N. Since k − k_N ∈ L²(X × X), we know that L − L_N is bounded, and that
‖L − L_N‖ ≤ ‖k − k_N‖_2 → 0 as N → ∞.
Hence L_N → L in operator norm, so L is compact.

For the remainder of this section, we consider eigenvalues and eigenvectors of compact operators.

Definition 4.34. Let A ∈ B(H) be given.
(a) A scalar λ ∈ F is an eigenvalue of A if there exists a nonzero vector f ∈ H such that Af = λf. Equivalently, λ is an eigenvalue if ker(A − λI) ≠ {0}.
(b) If λ ∈ F is an eigenvalue of A, then any nonzero vector in ker(A − λI) is called an eigenvector of A corresponding to the eigenvalue λ, or simply a λ-eigenvector for short. Equivalently, a nonzero vector f ∈ H is a λ-eigenvector if Af = λf.
(c) If λ ∈ F is an eigenvalue of A, then ker(A − λI) is called the eigenspace of A corresponding to the eigenvalue λ, or simply a λ-eigenspace for short. Every nonzero vector in the λ-eigenspace is a λ-eigenvector of A.
(d) The point spectrum σ_p(A) of A is the set of eigenvalues of A:
σ_p(A) = {λ ∈ F : λ is an eigenvalue of A}.

Exercise 4.35. Let {e_n}_{n∈N} be an orthonormal basis for a separable Hilbert space H, and let λ = (λ_n)_{n∈N} ∈ ℓ^∞(N) be fixed. Let L : H → H be the bounded operator Lf = Σ_n λ_n ⟨f, e_n⟩ e_n defined in Example 1.7.
(a) Show that σ_p(L) = {λ_n : n ∈ N}.
(b) Show that if µ is one component of λ and J = {n ∈ N : λ_n = µ}, then the µ-eigenspace of L is the closed span of {e_n}_{n∈J}.
(c) Show that the eigenspaces of L corresponding to distinct eigenvalues are orthogonal.

Exercise 4.36. Let L ∈ B(H) be given. Prove the following.
(a) If L is self-adjoint, then all eigenvalues of L are real.
(b) If L is positive (⟨Lf, f⟩ ≥ 0 for all f), then all eigenvalues of L are nonnegative.
(c) If L is positive definite (⟨Lf, f⟩ > 0 for all f ≠ 0), then all eigenvalues of L are strictly positive.
(d) If L is unitary, then every eigenvalue λ satisfies |λ| = 1.

Exercise 4.37. Suppose that L ∈ B(H) is normal. Prove that if λ ≠ µ are distinct eigenvalues of L, then the corresponding eigenspaces are orthogonal, i.e., ker(L − λI) ⊥ ker(L − µI).

While any linear operator A : C^n → C^n must have an eigenvalue, bounded operators on infinite-dimensional Hilbert spaces need not have any eigenvalues.

Exercise 4.38.
(a) Prove that the Volterra operator defined in Exercise 1.25 is compact but has no eigenvalues, i.e., its point spectrum is empty.
(b) Prove that the right-shift operator R on ℓ²(N) has no eigenvalues.
(c) Prove that every scalar |λ| < 1 is an eigenvalue of the left-shift operator L on ℓ²(N), and find the corresponding eigenvectors. Thus, this operator has uncountably many eigenvalues.
(d) Let φ(x) = x. Prove that the multiplication operator M_φ : L²[0, 1] → L²[0, 1], defined by M_φ f(x) = x f(x), is self-adjoint but has no eigenvalues.
(e) Define
k(x, y) = i if y ≤ x, and k(x, y) = −i if y > x,
and let L : L²[0, 1] → L²[0, 1] be the integral operator with kernel k. Prove that L is both compact and self-adjoint. Prove that the eigenvalues of L are λ_k = 2/((2k + 1)π) (and only these), and find the corresponding eigenvectors.

Exercise 4.39 (Convolution). Fix g ∈ L²[0, 1], where we consider functions in L²[0, 1] to be 1-periodically extended to the real line. Let T be the convolution operator
T f(x) = (f ∗ g)(x) = ∫_0^1 g(x − y) f(y) dy.
(a) Prove that T is compact.
Hint: Write T as an integral operator and show that its kernel is square-integrable. Note that the fact that [0, 1] has finite measure is important.
(b) Prove that the complex exponential functions e_n(x) = e^{2πinx} are eigenvectors of T, with corresponding eigenvalues ĝ(n) (the Fourier coefficients of g).

Exercise 4.40. (a) Assume that A ∈ B(H) is normal and let λ ∈ F be given. Show that A − λI is normal. Use this to show that ker(A − λI) = ker(A* − λ̄I). Conclude that if λ is an eigenvalue of A then λ̄ is an eigenvalue of A*, and the corresponding eigenspaces are equal.
Hint: Consider Corollary 2.21.
(b) Find an example of a non-normal operator for which the conclusion of part (a) fails.
Hint: Consider a shift operator.

The next result shows that the eigenspaces (if any) of a compact operator corresponding to nonzero eigenvalues must be finite-dimensional.

Proposition 4.41.
Assume that T : H → H is compact and that λ ≠ 0 is an eigenvalue of T. Then ker(T − λI) is finite-dimensional.

Proof. Since T is bounded, we know that ker(T − λI) is a closed subspace of H. Suppose that it was infinite-dimensional. Then we could find an infinite orthonormal sequence {e_n}_{n∈N} in ker(T − λI). Each e_n is a λ-eigenvector, i.e., T e_n = λ e_n. But then {e_n}_{n∈N} is a bounded sequence in H, yet for m ≠ n we have
‖T e_m − T e_n‖ = |λ| ‖e_m − e_n‖ = |λ| √2,
so since λ ≠ 0 there can be no convergent subsequences of {T e_n}_{n∈N}, which contradicts the fact that T is compact.

The following is one useful theoretical result which implies the existence of an eigenvalue of a compact operator T. It states that if inf_{‖f‖=1} ‖T f − λf‖ = 0, then this infimum is actually achieved, i.e., ‖T f − λf‖ = 0 for some unit vector f, or in other words, there exists a λ-eigenvector for T.

Proposition 4.42. Assume that T : H → H is compact and that λ ≠ 0 is given. Then:
inf_{‖f‖=1} ‖T f − λf‖ = 0 =⇒ λ ∈ σ_p(T).

Proof. Assume that inf_{‖f‖=1} ‖T f − λf‖ = 0. Then we can find unit vectors f_n such that ‖T f_n − λf_n‖ → 0. Since T is compact, {T f_n}_{n∈N} has a convergent subsequence, say T f_{n_k} → g ∈ H. Since λ ≠ 0 we have
f_{n_k} = (λ f_{n_k} − T f_{n_k} + T f_{n_k}) / λ → (0 + g) / λ = g / λ. (4.5)
Since the f_{n_k} are unit vectors, we conclude that g ≠ 0. Moreover, since T is continuous it follows from (4.5) that T f_{n_k} → T g/λ. But we also know that T f_{n_k} → g, so we conclude that T g/λ = g, i.e., T g = λg, or in other words that g is a λ-eigenvector.

Corollary 4.43. Assume T : H → H is compact and that λ ≠ 0. If λ ∉ σ_p(T) and λ̄ ∉ σ_p(T*), then T − λI is a bounded bijection of H onto itself, and (T − λI)^{−1} is bounded.

Proof. Since λ is not an eigenvalue, we know that T − λI is injective. Further, it follows from the preceding proposition that inf_{‖f‖=1} ‖T f − λf‖ > 0.
Hence there exists a $C > 0$ such that $\|Tf - \lambda f\| \ge C$ for every unit vector $f$, and hence
\[ \forall\, f \in H, \quad \|Tf - \lambda f\| \ge C\, \|f\|. \tag{4.6} \]
It follows from Exercise 1.2 that $\operatorname{range}(T - \lambda I)$ is a closed subspace of $H$. But then, since $\bar{\lambda}$ is not an eigenvalue of $T^*$, we have that
\[ \operatorname{range}(T - \lambda I) = \overline{\operatorname{range}(T - \lambda I)} = \ker\bigl((T - \lambda I)^*\bigr)^\perp = \ker(T^* - \bar{\lambda} I)^\perp = \{0\}^\perp = H. \]
Thus $T - \lambda I$ is a bounded bijection. It remains to show that $(T - \lambda I)^{-1}$ is bounded. Given $f \in H$ we have from (4.6) that
\[ \|f\| = \|(T - \lambda I)(T - \lambda I)^{-1} f\| \ge C\, \|(T - \lambda I)^{-1} f\|. \]
Rearranging, we see that $\|(T - \lambda I)^{-1}\| \le \frac{1}{C} < \infty$.

Actually, it can be shown that if $T : H \to H$ is compact, $\lambda \ne 0$, and $\lambda \notin \sigma_p(T)$, then $\bar{\lambda} \notin \sigma_p(T^*)$ follows automatically.

5. The Diagonalization of Compact Self-Adjoint Operators

First let us summarize the facts that have been developed regarding the operator $L$ introduced in Example 1.7 and studied in other examples in previous sections.

Theorem 5.1. Let $\{e_n\}_{n \in \mathbb{N}}$ be an orthonormal basis for a separable Hilbert space $H$, and let $\lambda = (\lambda_n)_{n \in \mathbb{N}} \in \ell^\infty(\mathbb{N})$ be a bounded sequence of scalars. Define
\[ Lf = \sum_{n=1}^\infty \lambda_n \langle f, e_n \rangle\, e_n, \qquad f \in H. \tag{5.1} \]
Then the following statements hold.

(a) $L$ is bounded, and $\|L\| = \|\lambda\|_\infty$.

(b) $L$ is normal, and $L^* f = \sum_{n=1}^\infty \bar{\lambda}_n \langle f, e_n \rangle\, e_n$.

(c) $L$ is self-adjoint if and only if $\lambda_n \in \mathbb{R}$ for every $n$.

(d) $L$ is compact if and only if $\lambda_n \to 0$.

Exercise 5.2. Assume that $\lambda_n \to 0$ and that $L$ is defined by (5.1). In the definition of $L$, combine those terms corresponding to identical $\lambda_n$ together. That is, let $\mu = (\mu_k)_{k \in I}$ be the sequence of distinct values in $\lambda$ (so $I$ is either $\{1, \dots, N\}$ if there are only finitely many distinct values, or $I = \mathbb{N}$ if there are infinitely many). If we set $J_k = \{n \in \mathbb{N} : \lambda_n = \mu_k\}$, then
\[ P_k f = \sum_{n \in J_k} \langle f, e_n \rangle\, e_n \]
is the orthogonal projection of $H$ onto $\overline{\operatorname{span}}\{e_n\}_{n \in J_k}$. Show that the operator $L$ defined in (5.1) can be rewritten as
\[ Lf = \sum_{k \in I} \mu_k P_k f, \qquad f \in H, \]
with convergence of the series in the norm of $H$.
Show further that
\[ L = \sum_{k \in I} \mu_k P_k, \]
with convergence of the series in operator norm. Show that $\overline{\operatorname{span}}\{e_n\}_{n \in J_k}$ is the $\mu_k$-eigenspace of $L$. Show that $P_j P_k = P_k P_j = 0$ for all $j \ne k \in I$, and consequently the eigenspaces corresponding to distinct eigenvalues are orthogonal.

In this section we will prove a converse result, showing that all compact, self-adjoint operators on a Hilbert space can be represented in the form of (5.1). First, however, we need to develop some useful machinery.

Exercise 5.3. If $\lambda$ is an eigenvalue of $L \in B(H)$, then $|\lambda| \le \|L\|$.

Exercise 5.4. Let $A$ be an $n \times n$ complex matrix. Define its spectral radius to be
\[ \rho(A) = \max\{|\lambda| : \lambda \text{ is an eigenvalue of } A\} = \max\{|\lambda| : \lambda \in \sigma_p(A)\}. \]
By the preceding exercise, if we choose any norm on $\mathbb{C}^n$ and let $\|A\|$ be the corresponding operator norm, we have $\rho(A) \le \|A\|$.

(a) Prove that if $A$ is self-adjoint and we use the Euclidean ($\ell^2$) norm on $\mathbb{C}^n$, then $\|A\| = \rho(A)$.

(b) Prove that the same equality holds if $A$ is normal. Find an example of a non-normal matrix for which $\rho(A) < \|A\|$.

(c) Prove that if $A$ is any $n \times n$ matrix, then (still using the Euclidean norm on $\mathbb{C}^n$), $\|A\| = \rho(A^* A)^{1/2}$.

(d) (Harder). Prove that if $A$ is a fixed but arbitrary $n \times n$ complex matrix and $\varepsilon > 0$ is given, then there exists a norm on $\mathbb{C}^n$ such that the corresponding operator norm of $A$ satisfies $\|A\| \le \rho(A) + \varepsilon$.

Although an arbitrary compact operator need not have any eigenvalues (see Exercise 4.38), the following result shows that a compact, self-adjoint operator must have at least one eigenvalue.

Proposition 5.5. If $T : H \to H$ is compact and self-adjoint, then either $\|T\|$ or $-\|T\|$ is an eigenvalue of $T$.

Proof. Since $T$ is self-adjoint, we know from Proposition 2.18 that
\[ \|T\| = \sup_{\|f\|=1} |\langle Tf, f \rangle|. \]
Hence, there must exist unit vectors $f_n$ such that $|\langle T f_n, f_n \rangle| \to \|T\|$. Since $T$ is self-adjoint, all the inner products $\langle T f_n, f_n \rangle$ are real, so we can find a subsequence of $\{f_n\}$ along which these inner products converge either to $\|T\|$ or to $-\|T\|$.
Call this subsequence $\{g_n\}_{n \in \mathbb{N}}$, and let $\lambda$ be either $\|T\|$ or $-\|T\|$ as appropriate. Then we have $\|g_n\| = 1$ for every $n$, and $\langle T g_n, g_n \rangle \to \lambda$. Hence, since both $\lambda$ and $\langle T g_n, g_n \rangle$ are real,
\begin{align*}
\|T g_n - \lambda g_n\|^2
&= \|T g_n\|^2 - 2\lambda \langle T g_n, g_n \rangle + \lambda^2 \|g_n\|^2 \\
&\le \|T\|^2 \|g_n\|^2 - 2\lambda \langle T g_n, g_n \rangle + \lambda^2 \|g_n\|^2 \\
&= \lambda^2 - 2\lambda \langle T g_n, g_n \rangle + \lambda^2 \\
&\to \lambda^2 - 2\lambda^2 + \lambda^2 = 0.
\end{align*}
If $T = 0$ then $\lambda = 0$ is trivially an eigenvalue, so we may assume $\lambda \ne 0$. It therefore follows from Proposition 4.42 that $\lambda$ is an eigenvalue of $T$.

Now we can prove that every compact, self-adjoint operator has a very simple and special form.

Theorem 5.6 (Spectral Theorem for Compact Self-Adjoint Operators). Let $T : H \to H$ be compact and self-adjoint. Then there exist nonzero real numbers $\{\lambda_n\}_{n \in J}$, either finitely many or $\lambda_n \to 0$ if infinitely many, and an orthonormal basis $\{e_n\}_{n \in J}$ of $\overline{\operatorname{range}(T)}$, such that
\[ Tf = \sum_{n \in J} \lambda_n \langle f, e_n \rangle\, e_n, \qquad f \in H. \]
Each $\lambda_n$ is an eigenvalue of $T$, and each $e_n$ is a corresponding eigenvector.

Proof. Note that since $T$ is compact, $\operatorname{range}(T)$ is separable by Exercise 4.18. If $T = 0$ then the result is trivial, so assume that $T$ is not the zero operator.

Let $H_1 = H$ and $T_1 = T$. By Proposition 5.5, $T_1$ has an eigenvalue $\lambda_1$ which satisfies $|\lambda_1| = \|T_1\| > 0$. Let $e_1$ be a corresponding eigenvector, normalized to $\|e_1\| = 1$.

Let $H_2 = \{e_1\}^\perp$ and let $T_2 = T|_{H_2}$ (the restriction of $T$ to $H_2$). If $T_2 = 0$, then stop at this point. Otherwise, continue as follows. Since $\operatorname{span}\{e_1\}$ is invariant under $T_1$ (after all, $e_1$ is an eigenvector), we know from Exercise 3.10 that $H_2$ is invariant under $T_1^* = T_1$. Exercise: Show that $T_2 : H_2 \to H_2$ is compact and self-adjoint. Therefore $T_2$ has an eigenvalue $\lambda_2$ such that $|\lambda_2| = \|T_2\| > 0$. Note that since $T_2$ is a restriction of $T_1$, we have $|\lambda_2| = \|T_2\| \le \|T_1\| = |\lambda_1|$. Let $e_2$ be a corresponding eigenvector, normalized to $\|e_2\| = 1$. Note that by definition of $H_2$, we have $e_2 \perp e_1$. Further, $\lambda_2$ is an eigenvalue of $T$ (not just $T_2$), and $e_2$ is a corresponding eigenvector of $T$.

Let $H_3 = \{e_1, e_2\}^\perp$ and let $T_3 = T|_{H_3}$. If $T_3 = 0$, then stop at this point.
Otherwise, continue as before to construct an eigenvalue $\lambda_3$ and eigenvector $e_3$ (which will be orthogonal to both $e_1$ and $e_2$).

Continuing in this process, there are two possibilities.

Case 1: $T_{N+1} = 0$ for some $N$. In this case, since $H_{N+1} = \{e_1, \dots, e_N\}^\perp$, we have $H = \operatorname{span}\{e_1, \dots, e_N\} \oplus H_{N+1}$. Therefore, if $f \in H$ then we can write $f$ uniquely as
\[ f = \sum_{n=1}^N \langle f, e_n \rangle\, e_n + v \]
where $v \in H_{N+1}$. Since $T(v) = T_{N+1}(v) = 0$, we therefore have
\[ Tf = \sum_{n=1}^N \langle f, e_n \rangle\, T(e_n) + T(v) = \sum_{n=1}^N \lambda_n \langle f, e_n \rangle\, e_n. \]
In this case $T$ is finite-rank and the proof is complete.

Case 2: $T_N \ne 0$ for every $N$. In this case we obtain countably many eigenvalues $\lambda_n$ and corresponding orthonormal eigenvectors $e_n$. Since $T$ is compact, we have by Exercise 4.10 that $\lambda_n e_n = T(e_n) \to 0$. Since $\|e_n\| = 1$, we conclude that $\lambda_n \to 0$.

Let $M = \overline{\operatorname{span}}\{e_n\}_{n \in \mathbb{N}}$. Then $\{e_n\}_{n \in \mathbb{N}}$ is an orthonormal basis for $M$, and $H = M \oplus M^\perp$. Hence, if $f \in H$ then we can write $f$ uniquely as
\[ f = \sum_{n=1}^\infty \langle f, e_n \rangle\, e_n + v \]
for some $v \in M^\perp$. Therefore
\[ Tf = \sum_{n=1}^\infty \langle f, e_n \rangle\, T(e_n) + T(v) = \sum_{n=1}^\infty \lambda_n \langle f, e_n \rangle\, e_n + T(v). \]
If we show that $T(v) = 0$, then we are done.

Note that since $\operatorname{span}\{e_1, \dots, e_N\} \subseteq M$, we have
\[ v \in M^\perp \subseteq \operatorname{span}\{e_1, \dots, e_N\}^\perp = H_{N+1}. \]
Hence
\[ \|T(v)\| = \|T_{N+1}(v)\| \le \|T_{N+1}\|\, \|v\| = |\lambda_{N+1}|\, \|v\| \to 0 \quad \text{as } N \to \infty. \]
Consequently $T(v) = 0$.

Since each eigenspace corresponding to a nonzero eigenvalue is finite-dimensional, we can group terms corresponding to the same eigenvalue together. Alternatively, we could write a more efficient proof of the Spectral Theorem (as Conway does), by using the same argument on the distinct eigenvalues and corresponding eigenspaces, instead of one eigenvalue and eigenvector at a time. Either way, an extension of the ideas used in the preceding result gives the following expanded form of the Spectral Theorem.

Theorem 5.7 (Spectral Theorem for Compact Self-Adjoint Operators). Let $T : H \to H$ be compact and self-adjoint.
Then the following statements hold.

(a) $T$ has only a finite or countably infinite number of distinct eigenvalues, and each eigenvalue is real.

(b) Let $\{\mu_1, \mu_2, \dots\} = \{\mu_k\}_{k \in I}$ be the distinct nonzero eigenvalues, where either $I = \{1, \dots, N\}$ or $I = \mathbb{N}$. Then each eigenspace $E_k = \ker(T - \mu_k I)$ is finite-dimensional.

(c) If $I$ is infinite, then $\mu_k \to 0$ as $k \to \infty$.

(d) If $P_k$ is the orthogonal projection onto the eigenspace $E_k$, then $P_j P_k = P_k P_j = 0$ for $j \ne k \in I$. That is, eigenspaces corresponding to distinct eigenvalues are orthogonal.

(e) We have
\[ T = \sum_{k \in I} \mu_k P_k, \]
where the series converges in operator norm.

(f) There exist nonzero real numbers $\{\lambda_n\}_{n \in J}$, either finitely many or $\lambda_n \to 0$ if infinitely many, and an orthonormal sequence $\{e_n\}_{n \in J}$ such that
\[ Tf = \sum_{n \in J} \lambda_n \langle f, e_n \rangle\, e_n. \]
The $\lambda_n$ are obtained by repeating each $\mu_k$ according to its multiplicity (the dimension of the eigenspace $E_k$). The sequence $\{e_n\}_{n \in J}$ forms an orthonormal basis for $\ker(T)^\perp = \overline{\operatorname{range}(T)}$.

Corollary 5.8. If $T : H \to H$ is compact, self-adjoint, and injective, then $H$ is separable.

Proof. Since $T = T^*$, we have that $\overline{\operatorname{range}(T)} = \ker(T^*)^\perp = \{0\}^\perp = H$. On the other hand, since $T$ is compact we know from Exercise 4.18 that $\operatorname{range}(T)$ is separable, and hence so is its closure $H$.

Example 5.9 (Diagonalization of Self-Adjoint Matrices). Let us examine what the Spectral Theorem says in finite dimensions. Let $A$ be a self-adjoint $n \times n$ matrix (i.e., symmetric if real, and Hermitian if complex). Then the Spectral Theorem says that there exist real nonzero eigenvalues $\lambda_1, \dots, \lambda_k$ and corresponding orthonormal eigenvectors $u_1, \dots, u_k$ such that
\[ Ax = \sum_{j=1}^k \lambda_j (x \cdot u_j)\, u_j. \tag{5.2} \]
We can extend this representation by including the zero eigenvalues of $A$, as follows. From (5.2), we see that the column space, or range, of $A$ is
\[ C(A) = \operatorname{range}(A) = \operatorname{span}\{u_1, \dots, u_k\}. \]
Since $A$ is self-adjoint, its nullspace is the orthogonal complement of its column space, for
\[ N(A) = \ker(A) = \operatorname{range}(A)^\perp = C(A)^\perp. \]
Let $u_{k+1}, \dots$
, $u_n$ be an orthonormal basis for $N(A)$, and let $\lambda_{k+1} = \dots = \lambda_n = 0$. Then $u_1, \dots, u_n$ is an orthonormal basis for $\mathbb{F}^n$ consisting of eigenvectors of $A$, with corresponding eigenvalues $\lambda_1, \dots, \lambda_n$. Further, we have the following representations:
\[ x = \sum_{j=1}^n (x \cdot u_j)\, u_j \quad \text{and} \quad Ax = \sum_{j=1}^n \lambda_j (x \cdot u_j)\, u_j, \qquad x \in \mathbb{F}^n. \]
Let us rewrite this representation as follows:
\begin{align*}
Ax &= \sum_{j=1}^n \lambda_j (x \cdot u_j)\, u_j \\
&= \begin{bmatrix} | & & | \\ u_1 & \cdots & u_n \\ | & & | \end{bmatrix}
\begin{bmatrix} \lambda_1 (x \cdot u_1) \\ \vdots \\ \lambda_n (x \cdot u_n) \end{bmatrix} \\
&= \begin{bmatrix} | & & | \\ u_1 & \cdots & u_n \\ | & & | \end{bmatrix}
\begin{bmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{bmatrix}
\begin{bmatrix} x \cdot u_1 \\ \vdots \\ x \cdot u_n \end{bmatrix} \\
&= \begin{bmatrix} | & & | \\ u_1 & \cdots & u_n \\ | & & | \end{bmatrix}
\begin{bmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{bmatrix}
\begin{bmatrix} u_1^H \\ \vdots \\ u_n^H \end{bmatrix} x \\
&= U \Lambda U^H x,
\end{align*}
where $U$ is the matrix that has $u_1, \dots, u_n$ as columns, and $\Lambda$ is the diagonal matrix with $\lambda_1, \dots, \lambda_n$ on the diagonal.

On the one hand, this is nothing more than the diagonalization of $A$. However, this says much more: every self-adjoint matrix can be diagonalized (even if some eigenvalues are repeated), and furthermore, the eigenvector matrix is unitary (because it has orthonormal columns). We summarize this next as a theorem.

Theorem 5.10 (Diagonalization of Self-Adjoint Matrices). Let $A$ be an $n \times n$ matrix. Then the following statements are equivalent.

(a) $A$ is self-adjoint.

(b) $A = U \Lambda U^*$ where $U$ is unitary and $\Lambda$ is diagonal with real scalars on its diagonal.

(c) There exist real scalars $\lambda_1, \dots, \lambda_n$ and orthonormal vectors $u_1, \dots, u_n$ such that
\[ Ax = \sum_{j=1}^n \lambda_j (x \cdot u_j)\, u_j, \qquad x \in \mathbb{F}^n. \]

(d) There exists an orthonormal basis $u_1, \dots, u_n$ for $\mathbb{F}^n$ consisting of eigenvectors of $A$ with corresponding real eigenvalues $\lambda_1, \dots, \lambda_n$.
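
As a small numerical illustration of statements (c) and (d), the following sketch checks the expansion $Ax = \sum_j \lambda_j (x \cdot u_j)\, u_j$ for a real symmetric $2 \times 2$ matrix. It is not part of the notes: the helper names `eig_sym_2x2` and `apply_spectral`, the sample matrix, and the test vector are all hypothetical choices made for illustration, and the closed-form eigenvalue formula used here is specific to the $2 \times 2$ real symmetric case.

```python
import math


def eig_sym_2x2(a, b, d):
    """Return (eigenvalues, orthonormal eigenvectors) of [[a, b], [b, d]]."""
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    eigenvalues = [mean + radius, mean - radius]
    if b == 0:
        # Already diagonal: the standard basis vectors are eigenvectors.
        e1, e2 = (1.0, 0.0), (0.0, 1.0)
        vectors = [e1, e2] if a >= d else [e2, e1]
        return eigenvalues, vectors
    vectors = []
    for lam in eigenvalues:
        # For b != 0, v = (b, lam - a) satisfies (A - lam*I) v = 0.
        norm = math.hypot(b, lam - a)
        vectors.append((b / norm, (lam - a) / norm))
    return eigenvalues, vectors


def apply_spectral(eigenvalues, vectors, x):
    """Evaluate sum_j lambda_j (x . u_j) u_j for real 2-vectors."""
    out = [0.0, 0.0]
    for lam, u in zip(eigenvalues, vectors):
        coeff = lam * (x[0] * u[0] + x[1] * u[1])
        out[0] += coeff * u[0]
        out[1] += coeff * u[1]
    return tuple(out)


# Illustrative (hypothetical) data: A = [[2, 1], [1, 2]] and a test vector x.
a, b, d = 2.0, 1.0, 2.0
x = (3.0, -1.0)
lams, us = eig_sym_2x2(a, b, d)
direct = (a * x[0] + b * x[1], b * x[0] + d * x[1])  # A x computed entrywise
spectral = apply_spectral(lams, us, x)               # spectral expansion of A x
print(lams, direct, spectral)
```

The two vectors `direct` and `spectral` should agree up to floating-point rounding, which is exactly the content of statement (c) for this particular matrix. For larger or complex Hermitian matrices one would normally rely on a numerical linear algebra library rather than a closed form.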