Class Notes for MATH 567
by S. W. Drury
Copyright © 2016, by S. W. Drury.

1 Topological Vector Spaces

1.1 Some Topology Basics

Unfortunately, sequences are not adequate to handle all aspects of topological spaces. The concept that is needed in this context is that of a net. Recall that a partially ordered set (sometimes called a poset) is a set J together with a relation ≤ on J such that
• α ≤ α for all α ∈ J.
• α, β ∈ J, α ≤ β, β ≤ α implies α = β.
• α, β, γ ∈ J, α ≤ β, β ≤ γ implies α ≤ γ.

A directed set J is a partially ordered set with one additional condition: for every pair α, β ∈ J, there exists an element γ ∈ J with α ≤ γ and β ≤ γ. The set of natural numbers N is an example of a directed set, but there are many other examples. To obtain the theory of nets from the theory of sequences, we replace N with a general directed set.

EXAMPLE Let J be the set of partitions of [0, 1] as used in Riemann integration. The relation α ≤ β holds if the partition β refines the partition α. One definition of the Riemann integral is in terms of convergence along this directed set. □

Let J be a directed set and X a topological space. A net in X is a mapping from J to X, usually denoted (x_α)_{α∈J} just as we do for sequences. The net (x_α)_{α∈J} is said to converge to x ∈ X if for every neighbourhood V of x, there exists α ∈ J such that x_β ∈ V for all β ∈ J with α ≤ β. Convergence along nets is sometimes called Moore–Smith convergence.

A topological space X is Hausdorff if and only if whenever x_1, x_2 ∈ X with x_1 ≠ x_2 there exist disjoint open subsets U_1 and U_2 with x_j ∈ U_j for j = 1, 2. There are some other so-called separation axioms:
• A topological space X is regular if both of the following hold:
  – singletons are closed,
  – whenever A is a closed subset of X and x ∈ X \ A there exist disjoint open subsets U and V such that x ∈ U and A ⊆ V.
• A topological space X is completely regular if both of the following hold:
  – singletons are closed,
  – whenever A is a closed subset of X and x ∈ X \ A there exists a continuous function f : X → [0, 1] such that f(x) = 1 and f = 0 on A.
• A topological space X is normal if both of the following hold:
  – singletons are closed,
  – whenever A and B are disjoint closed subsets of X there exist disjoint open subsets U and V such that A ⊆ U and B ⊆ V.

EXERCISE
• Show that in a Hausdorff topological space the limit of a net is unique.
• If A ⊆ X, show that x ∈ cl(A) if and only if there exists a net of points in A converging to x. To establish the difficult implication, let J be the set of neighbourhoods of x ordered by reverse inclusion.
• If (x_α)_{α∈J} and (y_α)_{α∈J} are nets over the same directed set J converging to x and y respectively, show that (x_α, y_α) → (x, y) in the product space X × Y.
• If X and Y are topological spaces and f : X → Y, show that f is continuous if and only if whenever x_α → x, f(x_α) → f(x).
• Let X be the set of all ordinal numbers that are ≤ Ω where Ω denotes the first uncountable ordinal. Consider the order topology on X. Let E = X \ {Ω}. Then E is dense in X, but no sequence in E converges to Ω. This is because every sequence in E is bounded above by some countable ordinal. Look up “order topology” and “ordinal number” on Wikipedia for more information. □

We prove the second item. Let x ∉ cl(A). Then the complement of cl(A) is a neighbourhood of x containing no point of A, so if x_α ∈ A for all α it is impossible that x_α → x. For the converse, let x ∈ cl(A). Consider the directed set of all neighbourhoods of x ordered by V ≥ W if and only if V ⊆ W. For V a neighbourhood of x, choose x_V ∈ V ∩ A. We claim that x_V → x. Indeed for every neighbourhood W of x, x_V ∈ W for every V ≥ W, i.e. for every V ⊆ W.

Let J be a directed set and K ⊆ J. Then K is cofinal in J if for all α ∈ J there exists β ∈ K such that α ≤ β.
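The definitions of directed set and cofinal subset can be checked mechanically on a finite poset. The following is a minimal sketch, not part of the notes: the helper names `is_directed`, `is_cofinal` and `divides` are hypothetical, and the posets (sets of integers ordered by divisibility) are chosen only for illustration.

```python
def is_directed(elements, leq):
    """True if every pair in `elements` has an upper bound inside `elements`."""
    elements = list(elements)
    return all(
        any(leq(a, c) and leq(b, c) for c in elements)
        for a in elements
        for b in elements
    )


def is_cofinal(subset, elements, leq):
    """True if every element of `elements` lies below some element of `subset`."""
    return all(any(leq(a, b) for b in subset) for a in elements)


def divides(a, b):
    """Partial order on positive integers: a <= b means a divides b."""
    return b % a == 0


# The divisors of 12 ordered by divisibility form a directed set (12 bounds every pair).
divisors_of_12 = [1, 2, 3, 4, 6, 12]

# {1, 2, 3} is a poset under divisibility but is not directed:
# 2 and 3 have no common upper bound inside the set.
not_directed = [1, 2, 3]
```

Here `is_cofinal([12], divisors_of_12, divides)` holds, while `[4, 6]` is not cofinal because nothing in it lies above 12.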
If K is cofinal in J, then it is easy to see that K is a directed set in the order relation that it inherits from J. Let (x_α)_{α∈J} be a net in a topological space X. Then the restriction (x_α)_{α∈K} is called a subnet in case K is cofinal in J. This generalizes the notion of subsequence. If (x_α)_{α∈J} converges to x then so does (x_α)_{α∈K}.

THEOREM 1 A topological space X is compact if and only if every net in X possesses a convergent subnet.

A topological space X is locally compact if and only if every x ∈ X possesses a base of compact neighbourhoods. Specifically, this means that if x ∈ X and V is a neighbourhood of x, then there exists W a compact neighbourhood of x such that W ⊆ V.

1.2 One point compactification

Recall that in a Hausdorff topological space, compact subsets are necessarily closed (454 notes page 77). Let X be a locally compact Hausdorff topological space. Usually, we also assume that X is not compact, but this is not necessary. We build a new topological space αX = X ∪ {∞} where ∞ is not a point of X. We define U to be open in αX if and only if either
1. U ⊆ X and U is open in X, or
2. ∞ ∈ U and X \ U is a compact subset of X.
One may show that this defines a topology on αX and that αX is compact Hausdorff in this topology. The space αX is called the one point compactification of X or the Alexandroff compactification of X.

Here are two more difficult theorems on topology.

LEMMA 2 (URYSOHN'S LEMMA) Let X be a normal topological space. Let A and B be disjoint closed subsets of X. Then there exists a continuous function f : X → [0, 1] such that A ⊆ f⁻¹({0}) and B ⊆ f⁻¹({1}), that is, f = 0 on A and f = 1 on B.

THEOREM 3 (URYSOHN METRIZATION THEOREM) Every regular topological space with a countable basis is metrizable.

THEOREM 4 (TIETZE EXTENSION THEOREM) Let X be a normal topological space, A a closed subset and g : A → [0, 1] a continuous map. Then there exists a continuous extension f of g defined on the whole of X.
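In a metric space the function promised by Urysohn's lemma can be written down explicitly as f(x) = d(x, A) / (d(x, A) + d(x, B)). The sketch below illustrates this on the real line with A = [0, 1] and B = [2, 3]; it is the standard metric-space construction, not the argument needed in a general normal space, and the function names and the two intervals are assumptions made for the example.

```python
def dist_to_interval(x, lo, hi):
    """Distance from the real number x to the closed interval [lo, hi]."""
    if x < lo:
        return lo - x
    if x > hi:
        return x - hi
    return 0.0


def urysohn(x, a=(0.0, 1.0), b=(2.0, 3.0)):
    """Continuous map R -> [0, 1] that is 0 on the interval a and 1 on the interval b.

    The intervals must be disjoint, so the denominator never vanishes.
    """
    d_a = dist_to_interval(x, *a)
    d_b = dist_to_interval(x, *b)
    return d_a / (d_a + d_b)
```

Between the two intervals the function interpolates; for instance the midpoint 1.5 is equidistant from A and B and is sent to 0.5.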
1.3 Quotient Spaces

Let X be a topological space and ∼ an equivalence relation on X. Let Q be the set of equivalence classes and let π : X → Q be the canonical projection that takes every element of X to its equivalence class. Then on Q we define the quotient topology by defining U to be an open subset of Q if and only if π⁻¹(U) is open in X. It is easy to check that this actually is a topology. In fact, it is the finest topology on Q that renders π continuous. Beware that if X is Hausdorff then Q need not be.

1.4 Uniform spaces

Uniform spaces are the natural setting for uniform continuity and completeness. The concept is almost completely absent from most literature.

DEFINITION A uniform space is a set X together with a family of subsets of X × X called vicinities. They are required to satisfy the following conditions.
1. For every vicinity V, Diag_X ⊆ V.
2. If V ⊆ W ⊆ X × X and V is a vicinity, then W is a vicinity.
3. If V_1 and V_2 are vicinities, then V_1 ∩ V_2 is a vicinity.
4. If V is a vicinity, then there exists a vicinity W such that (x_1, x_2) ∈ W implies (x_2, x_1) ∈ V.
5. If V is a vicinity, then there exists a vicinity W such that (x_1, x_2), (x_2, x_3) ∈ W implies (x_1, x_3) ∈ V.

DEFINITION If X and Y are uniform spaces and f : X → Y, then f is uniformly continuous if and only if (f × f)⁻¹(V) is a vicinity in X whenever V is a vicinity in Y.

Every metric space has a natural uniform space structure: V is a vicinity if and only if there exists δ > 0 such that d(x_1, x_2) < δ implies (x_1, x_2) ∈ V.

Every uniform space X has a natural topology. The neighbourhoods of x are simply the sets {y ∈ X; (x, y) ∈ V} as V runs over the vicinities.

EXERCISE Let V be a vicinity in a uniform space. Show that int V is also a vicinity. The interior is taken in the product topology of X × X. □

Let X be a uniform space and (x_α)_{α∈J} be a net in X.
Then (x_α)_{α∈J} is a Cauchy net in X if and only if for every vicinity V of X, there exists γ ∈ J such that α ≥ γ, β ≥ γ implies that (x_α, x_β) ∈ V. Let X be a uniform space. Then X is complete if and only if every Cauchy net in X converges to some point of X. As you might expect, every convergent net is Cauchy but this uses axiom 5 of uniform space.

EXERCISE Let F be a closed subset of a complete uniform space E. Then F is also complete in the uniform structure it inherits from E. Specifically, V is a vicinity in F if and only if there exists a vicinity W of E such that V = W ∩ (F × F). □

EXERCISE Let F be a subset of a Hausdorff uniform space E. Suppose that F is complete in the uniform structure it inherits from E. Then F is closed in E. □

If X is a compact completely regular topological space, then X has a natural uniform space structure in which V is a vicinity if and only if there exists U open in X × X (in the product topology) such that Diag_X ⊆ U ⊆ V. This is the unique uniform structure on X that gives back the correct topology.

1.5 Topological Vector Spaces

Let E be a vector space over R or C that is also a Hausdorff topological space. Then E is a topological vector space if the maps
(x, y) ↦ x + y,  E × E → E
(t, x) ↦ tx,  k × E → E
are continuous where k is the corresponding field of scalars. Note that not all authors insist that a topological vector space is necessarily Hausdorff. However, if one does not so insist, then the subset of vectors that cannot be separated from 0_E forms a linear subspace and on quotienting out this linear subspace one obtains a space that is Hausdorff.

Some basic lemmas follow directly from the definition.

LEMMA 5 We have that U is a neighbourhood of 0_E if and only if x + U is a neighbourhood of x.

LEMMA 6 Let U be a neighbourhood of 0_E in a topological vector space E. Then there exists a neighbourhood V of 0_E such that V + V ⊆ U.

The notation A + B means {a + b; a ∈ A, b ∈ B}.
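As a small illustration of the notation A + B and of Lemma 6 in the simplest case E = R, the sketch below computes the sum of two finite sets and checks V + V ⊆ U for symmetric open intervals. The helper names are hypothetical, and open intervals ]a, b[ are represented as pairs (a, b) purely for the purposes of the example.

```python
def minkowski_sum(a, b):
    """A + B = {x + y : x in A, y in B} for finite sets of numbers."""
    return {x + y for x in a for y in b}


def interval_sum(i, j):
    """Sum of two open intervals: ]a, b[ + ]c, d[ = ]a + c, b + d[."""
    return (i[0] + j[0], i[1] + j[1])


def contained_in(i, j):
    """True if the open interval i is a subset of the open interval j."""
    return j[0] <= i[0] and i[1] <= j[1]


A = {0, 1}
B = {10, 20}
total = minkowski_sum(A, B)

# In E = R, halving a symmetric neighbourhood U of 0 gives V with V + V inside U.
U = (-1.0, 1.0)
V = (-0.5, 0.5)
```

Note that U + U is strictly larger than U, which is why Lemma 6 has to shrink the neighbourhood.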
A subset A of E is balanced if x ∈ A and |t| ≤ 1 implies tx ∈ A. A subset A of E is absorbent if for all x ∈ E there exist y ∈ A and t > 0 such that tx = y.

LEMMA 7 Let U be a neighbourhood of 0_E in a topological vector space E. Then
• there exists a balanced neighbourhood V of 0_E such that V ⊆ U.
• U is absorbent.

Proof. Since scalar multiplication is continuous, for t ≠ 0 the map x ↦ tx is a homeomorphism (bijection continuous in both directions). Also since scalar multiplication is continuous, there exists an open neighbourhood W of 0_E and δ > 0 such that |t| < δ and x ∈ W implies tx ∈ U. But now V = ⋃_{0<|t|<δ} tW is a balanced open subset of U containing 0_E. To see that U is absorbent, note that the mapping t ↦ tx is continuous at 0. Hence, there exists δ > 0 such that |t| < δ implies tx ∈ U. Choosing t = ½δ > 0 does the trick.

Combining the lemmas we get

PROPOSITION 8 Let U be a neighbourhood of 0_E in a topological vector space E. Then there exists a balanced neighbourhood V of 0_E such that V − V ⊆ U.

Now let E be a topological vector space without the Hausdorff requirement. Suppose that {0_E} is closed. Then E is Hausdorff. Proof: Let x ∈ E \ {0_E}. Then since {0_E} is closed, there is a neighbourhood U of 0_E such that x ∉ U and hence a neighbourhood V of 0_E such that x ∉ V − V. So, V and x + V are disjoint neighbourhoods of 0_E and x respectively.

EXERCISE Show the converse. In a Hausdorff topological space E, singletons are closed. □

This proposition also allows one to define a vicinity of a topological vector space E as follows: V is a vicinity if and only if there exists U a neighbourhood of 0_E in E such that
x − y ∈ U ⟹ (x, y) ∈ V.

EXERCISE Show that this defines a uniform structure on E. □

This allows us to talk about complete topological vector spaces.

PROPOSITION 9 Let E be a topological vector space and F a linear subspace of E. Then cl(F) is also a linear subspace.

Proof. Let x, y ∈ cl(F).
There are nets x_α → x and y_β → y, α ∈ J, β ∈ K, x_α, y_β ∈ F, J, K directed sets. We turn I = J × K into a directed set by (α_1, β_1) ≥ (α_2, β_2) if and only if α_1 ≥ α_2 and β_1 ≥ β_2. (Exercise: Show that this defines a directed set). Now show that (x_α, y_β) → (x, y) in the product topology along the directed set I. Then by continuity of addition x_α + y_β → x + y along I. Hence x + y ∈ cl(F). The proof that tx_α → tx along J follows from the continuity of scalar multiplication. Hence tx ∈ cl(F) and cl(F) is a linear subspace.

EXERCISE Let E be a topological vector space and F a linear subspace of E. Then F is a topological vector space in the subspace topology. □

THEOREM 10 Let E be a topological vector space and F a closed linear subspace. Then the quotient space Q = E/F is a topological vector space in the quotient topology.

Before beginning the proof, we need the following.

EXERCISE Recall that a mapping between topological spaces is said to be open if the direct image of every open subset is open. If π_1 and π_2 are open mappings (between topological spaces) then so is the product mapping π_1 × π_2. □

Proof of Theorem 10. We start by showing that π is an open mapping. Let U be open in E. Then π⁻¹(π(U)) = ⋃_{y∈F} (U + y) which is open in E being a union of open subsets. By definition of the quotient topology this says that π(U) is open in Q. Hence, by the exercise, π × π is an open mapping E × E → Q × Q. Now let U be open in Q. Then {(x_1, x_2); x_1 + x_2 ∈ π⁻¹(U)} is open in E × E. But the direct image of this set by π × π, namely {(q_1, q_2); q_1 + q_2 ∈ U}, must be open in Q × Q. This shows that addition is continuous on Q. The continuity of scalar multiplication follows by much the same argument. Finally, we need to show that Q is Hausdorff. For this it will suffice to show that {0_Q} is closed in Q. But this follows by hypothesis since π⁻¹({0_Q}) = F.

THEOREM 11 Let E be a finite dimensional topological vector space with basis e_1, . . . , e_n. Then the linear isomorphism (t_1, . . .
, t_n) ↦ t_1 e_1 + ⋯ + t_n e_n is a homeomorphism from k^n to E.

Note that k^n can be given the product topology, or the topology coming from any norm.

Proof. It follows from the definition of topological vector space that the mapping is continuous. We need to show that the inverse is continuous. We start with the case n = 1 where the map is essentially t ↦ te where e is a nonzero vector of a one-dimensional E. If t_α e → 0_E we need to show that t_α → 0. Let ε > 0. Then εe ≠ 0_E. Consequently, there is a balanced neighbourhood U of 0_E such that εe ∉ U. Since U is balanced, |t| ≥ 1 implies tεe ∉ U. Rescaling, this says that if t_α e ∈ U then |t_α| < ε. This completes the claim.

The proof now works by induction on the dimension of E. We may assume that the result holds for spaces of dimensions 1 and n − 1. Let F be the linear span of the vectors e_1, . . . , e_{n−1}. Then the linear isomorphism
(t_1, . . . , t_{n−1}) ↦ t_1 e_1 + ⋯ + t_{n−1} e_{n−1}   (1.1)
is a homeomorphism from k^{n−1} to F by the induction hypothesis. But k^{n−1} is known to be complete and hence F is also complete (as a uniform space). Therefore F is closed in E and the quotient space Q = E/F is a topological vector space. The mapping t_n ↦ t_n π(e_n) = π(t_n e_n) is a linear isomorphism k → Q and hence, by the case n = 1, a homeomorphism. Since π is continuous, it now follows that the mapping t_1 e_1 + ⋯ + t_n e_n ↦ t_n is continuous as a mapping E → k. Similarly, the mapping t_1 e_1 + ⋯ + t_n e_n ↦ t_p is continuous for every p = 1, 2, . . . , n. The result now follows.

THEOREM 12 A locally compact topological vector space E is necessarily finite dimensional.

Proof. There exists a compact balanced neighbourhood K of 0_E. (Take a compact neighbourhood of 0_E, then a balanced neighbourhood of 0_E inside it, finally close that). We claim that the sets tK, as t runs over ]0, ∞[, form a base of neighbourhoods of 0_E. To see this, let W be an arbitrary neighbourhood of 0_E. Find U a balanced open neighbourhood of 0_E with U + U ⊆ W.
The collection {x + U; x ∈ K} covers K. So, there exists a finite set F ⊂ K such that F + U ⊇ K. Choose t > 0 with t < 1 such that tF ⊂ U. Then tK ⊆ tF + tU ⊆ U + U ⊆ W. The claim is proved.

Repeating more or less the compactness argument we now have a finite subset G of E such that K ⊆ G + ½K. Let M be the linear span of G. Then M is isomorphic to k^m where m is the dimension of M and hence is complete. Thus, M is closed in E. Now
K ⊆ M + ½K ⊆ M + ½M + ¼K ⊆ M + ¼K ⊆ ⋯ ⊆ M + 2^{−n}K
for every n ∈ N. We claim that M = E. If not, then since K is absorbent there exists x ∈ K \ M. Then there exists n ∈ N such that x + 2^{−n}K is disjoint from M. This follows since M is closed and the sets 2^{−n}K form a basis of neighbourhoods of 0_E. But x ∈ M + 2^{−n}K and this is a contradiction since K is balanced.

THEOREM 13 Let E be a topological vector space such that 0_E has a countable base of neighbourhoods. Then E is metrizable.

Proof omitted.

In the next section we will tackle locally convex spaces. Perhaps the best known example of a topological vector space that is not a locally convex space is L^p(X, M, µ) for 0 < p < 1. Here (X, M, µ) is a measure space. We can take
d(f, g) = ∫ |f − g|^p dµ
as a metric on this space. Show that L^p(X, M, µ) is complete in this metric. We will deal with L^p(X, M, µ) for 1 ≤ p ≤ ∞ later.

1.6 Locally Convex Spaces

A subset C of a linear space E is convex if and only if x, y ∈ C, 0 ≤ t ≤ 1 implies (1 − t)x + ty ∈ C. Informally, if x and y are in C, then C contains the line segment joining x to y. Given any subset S of a linear space E, the convex hull co(S) of S is the intersection of all convex subsets of E that contain S. (Note that E itself is convex and hence the intersection is not indexed over the empty set). An alternative description of co(S) is
{t_1 x_1 + ⋯ + t_n x_n; n ∈ N, x_1, . . . , x_n ∈ S, t_1, . . . , t_n ≥ 0, t_1 + ⋯ + t_n = 1}

EXERCISE In a topological vector space E the convex hull of an open set is open. Hint: If t_1, .
. . , t_n ≥ 0 and t_1 + ⋯ + t_n = 1 then at least one of the t_k is strictly positive. □

A locally convex space E is a topological vector space such that there exists a base of convex neighbourhoods of 0_E.

LEMMA 14 If E is a topological vector space, and U is a convex neighbourhood of 0_E, then there exists a balanced convex open neighbourhood V of 0_E with V ⊆ U.

Proof. Certainly U contains a balanced open neighbourhood W of 0_E. We take V to be the convex hull of W. Then V is a balanced open convex set and V ⊆ U since U is convex.

It may be worth noting that S balanced and convex is equivalent to S absolutely convex. We say that S is absolutely convex if and only if x, y ∈ S, t, s ∈ k, |t| + |s| ≤ 1 implies that tx + sy ∈ S. Note that neighbourhoods V of the zero vector are always absorbent, that is, every vector can be expressed in the form tv with t > 0 and v ∈ V.

A seminorm on E is a mapping p : E → [0, ∞[ such that
• p(tx) = |t|p(x) for t ∈ k and x ∈ E.
• p(x + y) ≤ p(x) + p(y) for x, y ∈ E.

LEMMA 15 If V is a balanced, convex, absorbent subset of E then its Minkowski functional
p(x) = inf{t > 0; t⁻¹x ∈ V}
is a seminorm.

The proof is easy. Further, if E is a topological vector space and V is open, then V = {x; p(x) < 1}. Thus, in a locally convex space E, we can code the topology by a family P of seminorms with each seminorm being the Minkowski functional of an open balanced convex neighbourhood. Since we are insisting that E is Hausdorff, for every x ∈ E \ {0_E}, there is an open balanced convex neighbourhood V such that x ∉ V. In particular the seminorm corresponding to V has p(x) ≥ 1. A family of seminorms P is separating if for all x ∈ E \ {0_E}, there exists p ∈ P with p(x) > 0. When we perform this recoding, we have
x_α → x ⟺ p(x_α − x) → 0 for all p ∈ P.
This is really pointwise convergence on P. We can define a locally convex space by means of an arbitrary separating family P of seminorms.
When we do this, we have to use the family of subsets {x; p(x) < t} as p runs through P and t runs through ]0, ∞[ as a subbase of neighbourhoods of 0_E. That is, a basic neighbourhood will have the form
{x; p_k(x) < t_k, k = 1, . . . , n}
where n ∈ N, p_1, . . . , p_n ∈ P, t_1, . . . , t_n > 0.

EXERCISE Let E be a locally convex space and F a linear subspace of E. Then F is a locally convex space in the subspace topology. □

LEMMA 16 The quotient Q of a locally convex space E by a closed linear subspace is again a locally convex space.

Proof. Let U be a neighbourhood of 0_Q in Q. Then π⁻¹(U) is a neighbourhood of 0_E in E. There exists an open balanced convex neighbourhood V of 0_E in E with V ⊆ π⁻¹(U). Hence π(V) ⊆ U. But π(V) is open since V is open and π is an open mapping, and π(V) is balanced and convex because V is. Further, π(V) is a neighbourhood of 0_Q since it is open and 0_Q ∈ π(V).

1.7 The Hahn–Banach Theorem

Let E be any real vector space. A sublinear functional on E is a mapping p : E → R such that
• p(tx) = tp(x) for t ≥ 0 and x ∈ E.
• p(x + y) ≤ p(x) + p(y) for x, y ∈ E.
Note that p(0_E) = 0 and compare with the definition of a seminorm.

THEOREM 17 (HAHN–BANACH THEOREM) Let p be a sublinear functional on a vector space E, F a linear subspace of E and f : F → R a linear mapping (i.e. linear functional) such that f(x) ≤ p(x) for all x ∈ F. Then there exists a linear functional f̃ : E → R extending f and such that f̃(x) ≤ p(x) for all x ∈ E.

The proof of the Hahn–Banach theorem uses the axiom of choice in the form of Zorn's Lemma. Zorn's lemma is equivalent to the axiom of choice, in the sense that either one together with the standard axioms of set theory is sufficient to prove the other. It occurs in the proofs of several theorems of crucial importance, for instance the theorem that every vector space has a linear basis, the theorem that every field has an algebraic closure and that every ring has a maximal ideal.
It is stated as follows.

LEMMA 18 (ZORN'S LEMMA) Every non-empty partially ordered set in which every chain is bounded above contains a maximal element.

The terms are defined as follows. Suppose (X, ≤) is a partially ordered set. A subset C of X is a chain if for any x, y ∈ C we have either x ≤ y or y ≤ x. A subset Y of X is bounded above if there exists u ∈ X such that y ≤ u for all y ∈ Y. Note that u is an element of X and need not be an element of Y. A maximal element of X is an element m ∈ X such that x ∈ X and m ≤ x implies x = m.

Proof of the Hahn–Banach theorem. We take the partially ordered set X to be the set of pairs (G, g) where G is a linear subspace of E with F ⊆ G, g is a linear functional on G, extending f and such that g(x) ≤ p(x) for all x ∈ G. The partial order is defined by
(G_1, g_1) ≤ (G_2, g_2) ⟺ G_1 ⊆ G_2 and g_2|_{G_1} = g_1.
Now let C be a chain in X. We define H = ⋃_{(G,g)∈C} G and for x ∈ H, we set h(x) = g(x) where x ∈ G for (G, g) ∈ C. Since C is a chain, it follows that h(x) is well defined (i.e. independent of the choice of (G, g)). We also check that h is linear on H (which also uses the fact that C is a chain) and finally that h(x) ≤ p(x) for all x ∈ H. We now have that (H, h) is an upper bound for C (it does not necessarily belong to C). Applying Zorn's lemma, we see that X possesses a maximal element. If this maximal element is of the form (E, ⋆) then we are done. If not, then we must find a contradiction. Let us denote the maximal element as (H, h). Then since H is not the whole of E we can find a linear subspace of E containing H as a linear subspace of codimension one. Let us relabel this subspace as E. Then to obtain a contradiction to the maximality of (H, h) it will suffice to show the following proposition.

PROPOSITION 19 Let p be a sublinear functional on a vector space E, H a linear subspace of E of codimension one and h : H → R a linear mapping (i.e. linear functional) such that h(x) ≤ p(x) for all x ∈ H.
Then there exists a linear functional h̃ : E → R extending h and such that h̃(x) ≤ p(x) for all x ∈ E.

Proof. Let z ∈ E \ H. We will define h̃(x + tz) = h(x) + tα for all x ∈ H and some suitable α ∈ R. We will need
h(x) + tα ≤ p(x + tz)
for all x ∈ H and all real t. If t = 0 this follows by hypothesis. If t > 0 then divide by t replacing t⁻¹x by x. We need
h(x) + α ≤ p(x + z)
for all x ∈ H. Similarly, if t < 0 then divide by −t and replace −t⁻¹x by y. We need
h(y) − α ≤ p(y − z)
for all y ∈ H. To recap, we need to have
sup_{y∈H} (h(y) − p(y − z)) ≤ α ≤ inf_{x∈H} (p(x + z) − h(x)).
If the choice of α is impossible, then there exist x, y ∈ H such that
h(y) − p(y − z) > p(x + z) − h(x).
But this implies
h(x + y) = h(x) + h(y) > p(x + z) + p(y − z) ≥ p(x + y),
a contradiction since x + y ∈ H. The contradiction shows that a suitable choice of α ∈ R is always possible.

There are a number of important corollaries.

COROLLARY 20 Let E be a real Banach space and F a closed subspace (and hence a Banach space in its own right). Let f ∈ F*, the dual space of F. Then there exists f̃ ∈ E* extending f and ‖f̃‖ = ‖f‖ for the dual norms.

Proof. Without loss of generality, ‖f‖ = 1. It then suffices to take p(x) = ‖x‖. Indeed, f̃(x) ≤ ‖x‖ and, replacing x by −x, |f̃(x)| ≤ ‖x‖, so ‖f̃‖ ≤ 1; the reverse inequality holds because f̃ extends f.

COROLLARY 21 Let E be a complex Banach space and F a closed linear subspace (and hence a Banach space in its own right). Let f ∈ F*, the dual space of F. Then there exists f̃ ∈ E* extending f and ‖f̃‖ = ‖f‖ for the dual norms.

Proof. Of course E and F can also be considered as real linear spaces. One forgets how to perform scalar multiplication by complex scalars that are not real. This process is called realification (at least by me). We again assume without loss that ‖f‖ = 1. Let h(x) = Re f(x). Then h satisfies the hypotheses of the real Hahn–Banach Theorem and we deduce the existence of an extension to a real linear form u on E with |u(x)| ≤ ‖x‖ for all x ∈ E. Now define f̃(x) = u(x) − iu(ix). Then clearly f̃ is real-linear since u is.
But
f̃(ix) = u(ix) − iu(−x) = i(u(x) − iu(ix)) = if̃(x)
showing that f̃ is actually complex-linear. Next, let x ∈ F. Then ix ∈ F and
f̃(x) = u(x) − iu(ix) = Re f(x) − i Re f(ix) = Re f(x) − i Re(if(x)) = f(x).
To see this, let f(x) = a + ib with a, b ∈ R. Then
Re f(x) − i Re(if(x)) = a − i Re(−b + ia) = a + ib = f(x).
Finally, let x ∈ E. Let f̃(x) = ω|f̃(x)| where ω ∈ C and |ω| = 1. Then we have f̃(ω̄x) = ω̄ω|f̃(x)| = |f̃(x)| ≥ 0 since we already showed that f̃ is complex-linear. But since f̃(ω̄x) is real, it must be that f̃(ω̄x) = u(ω̄x). Therefore
|f̃(x)| = f̃(ω̄x) = u(ω̄x) ≤ ‖ω̄x‖ = |ω̄|‖x‖ = ‖x‖
since ‖ ‖ is a complex norm.

COROLLARY 22 Let E be a Banach space over R or C. Let x ∈ E with x ≠ 0_E. Then there exists f ∈ E* with f(x) ≠ 0.

Colloquially, the continuous linear functionals on E separate the points of E. We have the principle of duality for Banach spaces as follows. We denote by E* the space of all continuous linear functionals on the Banach space E. We give E* the operator norm
‖ϕ‖_{E*} = sup_{x∈E, ‖x‖_E ≤ 1} |ϕ(x)|.

COROLLARY 23 We have
‖x‖_E = sup_{ϕ∈E*, ‖ϕ‖_{E*} ≤ 1} |ϕ(x)|.

Proof. The inequality ≥ follows from the definitions. If x = 0_E, the result is obvious, so we fix x a nonzero vector in E. Let F be the one-dimensional linear subspace of E spanned by x. Let ψ(tx) = t‖x‖_E define a continuous linear functional on F. We have ‖ψ‖_{F*} = 1. It can be extended to ϕ ∈ E* with ‖ϕ‖_{E*} = 1 by the Hahn–Banach theorem. But |ϕ(x)| = ‖x‖_E and the other inequality is evident.

There are also geometrical versions of the Hahn–Banach theorem. We start with

PROPOSITION 24 Let E be a real locally convex space. Let U be an open convex subset of E and y ∈ E with y ∉ U. Then there exists a closed halfspace H of E with y ∈ H and H ∩ U = ∅.

Proof. By translating we can assume that 0_E ∈ U. Define
p(x) = inf{t > 0; t⁻¹x ∈ U}
analogous to the definition of the Minkowski functional. Then p(y) ≥ 1 (since s > 0, sy ∈ U implies s < 1) and p is a sublinear functional.
Let F be the linear span of y. Define a linear functional f on F by f(y) = 1. Then f ≤ p on F: for t ≥ 0 we have f(ty) = t ≤ tp(y) = p(ty), and for t < 0 we have f(ty) = t < 0 ≤ p(ty). The Hahn–Banach theorem gives the existence of a linear functional f̃ on E such that f̃(y) = 1 and f̃(x) ≤ p(x) for all x ∈ E. The halfspace H = {x ∈ E; f̃(x) ≥ 1} contains y. For x ∈ H we have p(x) ≥ 1. For x ∈ U we have that there exists ε > 0 such that (1 + ε)x ∈ U since U is open. Thus p(x) < 1. Therefore H and U are disjoint.

It remains to show that f̃ is continuous which will imply that H is closed. We can choose a balanced open convex neighbourhood V of 0_E with V ⊆ U. Let q be the Minkowski functional of V and in particular a seminorm for the topology of E. Then p(x) ≤ q(x) for all x ∈ E and it follows that f̃(x) ≤ q(x) for all x ∈ E. But then −f̃(x) = f̃(−x) ≤ q(−x) = q(x) for all x ∈ E. So |f̃(x)| ≤ q(x). Consequently f̃ is continuous and hence H is closed.

PROPOSITION 25 Let E be a real locally convex space. Let C be a nonempty closed convex subset of E such that 0_E ∉ C. Then there exist a continuous linear form f on E and ε > 0 such that f(C) ⊆ [ε, ∞[.

Proof. Let U be a balanced convex open neighbourhood of 0_E disjoint from C. Then C + U is an open convex subset of E such that 0_E ∉ C + U. By the previous result, there exists a continuous linear form f on E such that f(C + U) ⊆ ]0, ∞[. Since C is nonempty, f is not identically zero. It cannot be that U ⊆ f⁻¹({0}) since f⁻¹({0}) is a closed linear hyperplane and U is absorbent. Therefore f(U) is a balanced convex subset of R not equal to {0}. There must exist ε > 0 such that ]−ε, ε[ ⊆ f(U). It now follows that inf f(C) ≥ ε.

PROPOSITION 26 Let E be a real locally convex space. Let A be a compact convex subset of E and B a closed convex subset of E with A and B disjoint. Then there is a closed hyperplane H separating A and B.

Proof. First of all, the subset C = A − B is closed. To see this, take a point x of the closure. Then there are nets (a_α) and (b_α) in A and B respectively such that a_α − b_α → x.
Extract from (a_α) a convergent subnet (possible since A is compact). Then since B is closed the corresponding subnet of (b_α) converges to a point of B, and so x ∈ A − B. The set A − B is also convex, and it does not contain 0_E since A and B are disjoint. The problem has been reduced to separating 0_E from a closed convex subset C with a closed hyperplane. The result follows from the previous proposition. Note that if A or B is empty, the result is trivial.

1.8 The Krein–Milman Theorem

Let E be a locally convex space and C a compact convex subset. We say that x ∈ C is an extreme point of C if there is no genuine line segment in C having x as an interior point. Technically, the definition is as follows:
y, z ∈ C, 0 < t < 1, (1 − t)y + tz = x ⟹ y = z.
We denote by ex(C) the set of extreme points of C.

THEOREM 27 (THE KREIN–MILMAN THEOREM) Let E be a locally convex space, K a nonempty compact convex subset of E. Then K = cl(co(ex(K))).

The proof of this theorem depends on the concept of a face, which may be familiar to those who know the area of computational geometry in computer science. If we take a closed cube in three dimensional space then the faces of the cube are
• the whole cube,
• the six two-dimensional faces (using face in the conventional sense),
• the twelve one-dimensional edges,
• the eight vertex singletons.

The definition is as follows. A face of a compact convex set C is a nonempty convex subset F of C such that
y, z ∈ C, 0 < t < 1, (1 − t)y + tz ∈ F ⟹ y, z ∈ F.

EXERCISE
1. C is a face of C.
2. x is an extreme point of C if and only if {x} is a face of C.
3. If F is a face of C, then ex(F) ⊆ ex(C).
4. If F is a face of C, n ≥ 2, y_1, . . . , y_n ∈ C, t_1, . . . , t_n ∈ ]0, 1[, t_1 + ⋯ + t_n = 1 and t_1 y_1 + ⋯ + t_n y_n ∈ F then y_1, . . . , y_n ∈ F. This is a fairly straightforward induction argument.
5. A nonempty intersection of faces is again a face. □

LEMMA 28 Let K be a nonempty compact convex subset of a real locally convex space E. Let f : E → R be a continuous linear form on E.
Let s = sup f(K). Then F = {x ∈ K; f(x) = s} is a closed face of K.

Proof. Clearly F is nonempty, since the continuous function f attains its supremum on the compact set K. Let y, z ∈ K, 0 < t < 1 be such that (1 − t)y + tz ∈ F. Then
s ≥ (1 − t)f(y) + tf(z) = f((1 − t)y + tz) = s.
The only way out is that f(y) = f(z) = s. Thus y, z ∈ F.

PROPOSITION 29 Let K be a nonempty compact convex subset of a real locally convex space E. Then ex(K) is nonempty.

Proof. This proof uses Zorn's lemma. Consider the poset X of closed faces of K ordered by reverse inclusion. Let C = (F_α)_{α∈I} be a chain in X. Then consider F = ⋂_{α∈I} F_α. For J a finite subset of I, ⋂_{α∈J} F_α is actually one of the F_α for some α ∈ J and hence nonempty. It follows from the finite intersection property that F is nonempty. We see that F is a closed face. Thus, every chain in X has an upper bound. Hence X possesses a maximal element G. Now suppose that y, z ∈ G with y ≠ z. Then, by the Hahn–Banach Theorem there exists a continuous linear form f : E → R with f(y) ≠ f(z). Let s = sup f(G). Then by the lemma, H = {x ∈ G; f(x) = s} is a face of G and hence also a face of K. By maximality of G we have H = G, but this contradicts f(y) ≠ f(z) since not both of f(y) and f(z) can be equal to s. Hence G is a singleton and the result follows.

Proof of the Krein–Milman Theorem. Clearly K ⊇ cl(co(ex(K))). To show the other inclusion, let x ∈ K and x ∉ cl(co(ex(K))). Then by the geometrical form of the Hahn–Banach theorem, there exists a continuous linear form f on E that satisfies f(x) > sup f(cl(co(ex(K)))). But F = {w ∈ K; f(w) = sup f(K)} is a face of K disjoint from ex(K). But by the proposition, F possesses an extreme point of F and this point is also an extreme point of K, a contradiction.

1.9 Banach Spaces

We start by introducing the concept of a norm. For an element v of the vector space E the norm of v (denoted ‖v‖) is to be thought of as the distance from 0_E to v, or as the “size” or “length” of v.
DEFINITION A norm on a vector space E over R or C is a mapping v ↦ ‖v‖ from E to R+ with the following properties.
• ‖0_E‖ = 0.
• v ∈ E, ‖v‖ = 0 ⇒ v = 0_E.
• ‖tv‖ = |t|‖v‖ ∀t a scalar, v ∈ E.
• ‖v_1 + v_2‖ ≤ ‖v_1‖ + ‖v_2‖ ∀v_1, v_2 ∈ E.

The last of these conditions is called the subadditivity inequality. There are really two definitions here, that of a real norm applicable to real vector spaces and that of a complex norm applicable to complex vector spaces. However, every complex vector space can also be considered as a real vector space: one simply “forgets” how to multiply vectors by complex scalars that are not real scalars. This process is called realification. In such a situation, the two definitions are different. For instance,

‖x + iy‖ = max(|x|, 2|y|) (x, y ∈ R)

defines a perfectly good real norm on C considered as a real vector space. On the other hand, the only complex norms on C have the form

‖x + iy‖ = t(x^2 + y^2)^{1/2}

for some t > 0.

The inequality

‖t_1 v_1 + t_2 v_2 + · · · + t_n v_n‖ ≤ |t_1|‖v_1‖ + |t_2|‖v_2‖ + · · · + |t_n|‖v_n‖

holds for scalars t_1, . . . , t_n and elements v_1, . . . , v_n of E. It is an immediate consequence of the definition.

We note that a norm is essentially a seminorm that vanishes only at 0_E.

EXERCISE Let C be a balanced convex absorbent subset of a vector space E. Then the Minkowski functional of C is a seminorm, and it is a norm provided that it vanishes only at 0_E. □

A complete normed space is called a Banach space.

1.10 Quotients of Banach Spaces

PROPOSITION 30 If E is a normed space and N is a closed linear subspace of E then the quotient space Q = E/N is again a normed space with the norm

‖q‖_Q = inf_{π(x)=q} ‖x‖_E (1.2)

for q ∈ Q. This is known as the quotient norm.

It is more or less obvious that ‖ ‖_Q is homogeneous. To show the subadditivity of the norm, we argue by contradiction. Suppose that there exist ε > 0 and q_1, q_2 ∈ Q such that ‖q_1 + q_2‖ ≥ ‖q_1‖ + ‖q_2‖ + 3ε.
(1.3) Then using the definition (1.2), we can find x_1, x_2 ∈ E such that π(x_j) = q_j and ‖x_j‖_E ≤ ‖q_j‖ + ε, for j = 1, 2. Obviously, π(x_1 + x_2) = q_1 + q_2 so that

‖q_1 + q_2‖ ≤ ‖x_1 + x_2‖ ≤ ‖x_1‖ + ‖x_2‖ ≤ ‖q_1‖ + ‖q_2‖ + 2ε.

This contradiction with (1.3) establishes the subadditivity.

There is one final detail that requires a little proof. Suppose that q ∈ Q and that ‖q‖_Q = 0. Then, using (1.2) we can find a sequence (x_j) of elements of E with π(x_j) = q for j = 1, 2, . . . and ‖x_j‖ tending to zero. Clearly x_j −→ 0_E and hence (x_1 − x_j) −→ x_1. Since (x_1 − x_j) ∈ N and since N is supposed to be closed in E, we conclude that x_1 ∈ N and consequently that q = 0_Q.

It is perhaps worth noting that ‖π(x)‖_Q ≤ ‖x‖_E for all x ∈ E, so that π is nonexpansive and in particular continuous.

EXERCISE Show that the topology induced by the quotient norm coincides with the quotient topology. □

PROPOSITION 31 Let E be a normed vector space with the property that whenever x_j ∈ E are such that ∑_{j=1}^∞ ‖x_j‖_E < ∞ we have that ∑_{j=1}^∞ x_j converges in E. Then E is complete.

Before proving this we need the following Lemma.

LEMMA 32 Let (x_n) be a Cauchy sequence in a metric space X. If (x_n) has a convergent subsequence then (x_n) converges.

We leave the proof to the reader.

Proof of Proposition 31. Let (u_n) be a Cauchy sequence in E. Then applying the definition of Cauchy sequence for ε = 2^{−k} for all k ∈ N, we find n_k, which we may take to be strictly increasing in k, such that

p, q ≥ n_k ⇒ ‖u_p − u_q‖ < 2^{−k}.

It follows from this that ‖u_{n_{k+1}} − u_{n_k}‖ < 2^{−k} and that

∑_{k=1}^∞ ‖u_{n_{k+1}} − u_{n_k}‖ < ∞.

Therefore, by hypothesis,

∑_{k=1}^∞ (u_{n_{k+1}} − u_{n_k})

converges in E. Since the partial sums telescope, this is equivalent to saying that (u_{n_k}) converges. An application of Lemma 32 now shows that (u_n) converges. But (u_n) was an arbitrary Cauchy sequence in E, so we have shown that E is complete.

We have the following Theorem.

THEOREM 33 Let E be a complete normed vector space and N a closed linear subspace. Then the quotient space Q = E/N is a complete normed space with the norm defined by (1.2).
Proof. It will suffice to show that Q satisfies the hypotheses of Proposition 31. Towards this, let q_n ∈ Q with ∑_{n=1}^∞ ‖q_n‖ < ∞. We need to show that ∑_{n=1}^∞ q_n converges in Q. By definition of the quotient norm, there exist x_n ∈ E such that π(x_n) = q_n and ‖x_n‖_E ≤ 2‖q_n‖_Q. (The 2 here could be replaced by 1 + ε, but not by 1. There are examples where the infimum defining the quotient norm is not attained.) So ∑_{n=1}^∞ ‖x_n‖ < ∞ and since E is complete, it follows that ∑_{n=1}^∞ x_n converges, say to s ∈ E. But π is continuous and it now follows that ∑_{n=1}^∞ q_n = ∑_{n=1}^∞ π(x_n) converges to π(s) in Q.

1.11 Open Mappings and Closed Graphs

The following result has a number of key applications that cannot be approached in any other way.

THEOREM 34 (BAIRE'S CATEGORY THEOREM) Let X be a complete metric space or a compact Hausdorff topological space. Let (A_k) be a sequence of closed subsets of X with int(A_k) = ∅. Then

X \ ⋃_{k=1}^∞ A_k is dense in X. (1.4)

In particular if X is nonempty we have

⋃_{k=1}^∞ A_k ≠ X.

THEOREM 35 (OPEN MAPPING THEOREM) Suppose that U and V are complete normed spaces and let T : U −→ V be a continuous surjective linear map. Then there is a constant C > 0 such that for every v ∈ V with ‖v‖ ≤ 1, there exists u ∈ U with ‖u‖ ≤ C such that T(u) = v.

The reason for the terminology is that the statement that T is an open mapping is equivalent to the conclusion of the Theorem.

Proof. There are two separate ideas in the proof. The first is to use the Baire Category Theorem and the second involves iteration. Let B_n denote {u : u ∈ U, ‖u‖ ≤ n}, the closed n-ball in U. Then, since T is onto, we have

V = ⋃_{n∈N} T(B_n).

We can't use this directly in the Baire Category Theorem because we don't know that the T(B_n) are closed. We take the easiest way around this difficulty and write simply

V = ⋃_{n∈N} cl(T(B_n)).

By the Baire Category Theorem (Theorem 34), there exists n ∈ N such that cl(T(B_n)) has nonempty interior.
This means that there exist v ∈ V and t > 0 such that U_V(v, t) ⊆ cl(T(B_n)). By symmetry, it follows that U_V(−v, t) ⊆ cl(T(B_n)). We claim that U_V(0_V, t) ⊆ cl(T(B_n)). Let w ∈ U_V(0_V, t). Then, we can find two sequences (x_k) and (y_k) in B_n such that (T(x_k)) converges to w + v and (T(y_k)) converges to w − v. Since B_n is convex, (1/2)(x_k + y_k) ∈ B_n, and the sequence (T((1/2)(x_k + y_k))) converges to w. This establishes the claim.

Now, let v be a generic element of V with ‖v‖ < t. Then v ∈ cl(T(B_n)). Hence, there exists u_0 ∈ B_n such that ‖v − T(u_0)‖ < (1/2)t. We repeat the argument, but rescaled by a factor of 1/2 and applied to v − T(u_0). Thus, there is an element u_1 ∈ U with ‖u_1‖ ≤ (1/2)n and such that ‖v − T(u_0) − T(u_1)‖ < (1/4)t. Continuing in this way leads to elements u_k ∈ U with ‖u_k‖ ≤ n 2^{−k} such that

‖v − ∑_{k=0}^ℓ T(u_k)‖ < t 2^{−ℓ−1}.

Using now the fact that U is complete (the completeness of V is needed to apply Baire's Theorem), we find that T(u) = v where

u = ∑_{k=0}^∞ u_k ∈ U

is given by an absolutely convergent series and has norm bounded by 2n. Rescaling gives the required result.

An open mapping from one topological space to another is a mapping such that the direct image of an open subset is always open. Let T be as in Theorem 35. We explain why T is an open mapping. Let Ω be open in U. We need to show that T(Ω) is open in V. Let v_0 ∈ T(Ω). Then, there exists u_0 ∈ Ω such that T(u_0) = v_0. Since Ω is open in U, there exists δ > 0 such that ‖u − u_0‖_U < δ implies that u ∈ Ω. Now let ‖v − v_0‖_V < C^{−1}δ. Then according to Theorem 35 there exists w ∈ U with ‖w‖_U ≤ C‖v − v_0‖_V < δ such that T(w) = v − v_0. Then v = v_0 + T(w) = T(u_0 + w) = T(u) with u = u_0 + w and ‖u − u_0‖_U < δ. Hence u ∈ Ω and v ∈ T(Ω). We just showed that any point v sufficiently close to v_0 in V lies in T(Ω). Hence T(Ω) is open in V.

Conversely, let T be an open linear mapping T : U −→ V with U and V normed spaces.
Then the direct image of the open ball centred at zero of radius 1 in U contains an open ball of strictly positive radius ε centred at the zero vector of V. Scaling shows that T is surjective and the conclusion of the Open Mapping Theorem follows easily with any constant C > ε^{−1}.

COROLLARY 36 Let V be a vector space with two norms ‖ ‖_1 and ‖ ‖_2, both of which make V complete. Suppose that there is a constant C such that

‖v‖_2 ≤ C‖v‖_1 ∀v ∈ V.

Then ‖ ‖_1 and ‖ ‖_2 are equivalent norms.

Proof. Apply the Open Mapping Theorem in the case where T is the identity mapping from (V, ‖ ‖_1) to (V, ‖ ‖_2).

It is possible to construct an infinite dimensional vector space with two incomparable norms both of which render it complete.

THEOREM 37 (INVERSE MAPPING THEOREM) Let E and F be Banach spaces. Let T : E −→ F be a continuous linear bijection. Then T^{−1} : F −→ E is also continuous.

Proof. This follows from the Open Mapping Theorem. Since T is onto, it is an open mapping. But, since T is bijective this just says that T^{−1} is continuous.

THEOREM 38 (CLOSED GRAPH THEOREM) Let E and F be Banach spaces. Let T : E −→ F be a linear mapping. Let G = {(x, T(x)); x ∈ E} ⊆ E × F. Then T is continuous if and only if G is closed in E × F for the product topology.

Proof. If T is continuous, then G is closed. This is the trivial part of the proof. Let (x_n, T(x_n)) be a sequence in G converging to (x, y) in E × F. Then x_n converges to x and since T is continuous T(x_n) converges to T(x). Hence T(x) = y. It follows that G is closed.

For the converse consider the space E ⊕ F, the abstract direct sum of E and F. As a vector space, it consists of objects x ⊕ y where x ∈ E and y ∈ F. We can put a norm on E ⊕ F by

‖x ⊕ y‖ = ‖x‖_E + ‖y‖_F

although the precise form of the norm is not important. However, the topology induced by this norm corresponds to the product topology on E × F. Hence, restricting the norm to G we obtain a Banach space because G is closed and E ⊕ F is complete.
But the mapping G −→ E given by (x, T(x)) ↦ x is a continuous linear bijection. Applying the inverse mapping theorem, we see that x ↦ (x, T(x)) is continuous. In particular, T is continuous.

THEOREM 39 Let E be a Banach space, G and L closed linear subspaces such that G + L is also closed. Then
• There is a constant C such that every z ∈ G + L can be written z = x + y where x ∈ G, y ∈ L with ‖x‖ ≤ C‖z‖ and ‖y‖ ≤ C‖z‖.
• There is a constant C such that d(x, G ∩ L) ≤ C(d(x, G) + d(x, L)) for all x ∈ E.

Proof. We form the abstract direct sum G ⊕ L and norm it by ‖x ⊕ y‖ = ‖x‖_E + ‖y‖_E where x ∈ G and y ∈ L. It is a Banach space since G and L are. Also G + L, being a closed linear subspace of E, is a Banach space. The mapping x ⊕ y ↦ x + y is a continuous linear surjection G ⊕ L −→ G + L and hence an open mapping. Therefore the image of the open unit ball of G ⊕ L contains a δ-ball of G + L around 0_E. Rescaling gives the first result.

For the second assertion let ε > 0. We can choose y ∈ G, z ∈ L such that ‖x − y‖ < d(x, G) + ε and ‖x − z‖ < d(x, L) + ε. Then y − z ∈ G + L and we may find y′ ∈ G and z′ ∈ L with y − z = y′ − z′ and control ‖y′‖, ‖z′‖ ≤ C‖y − z‖. Then y − y′ = z − z′ ∈ G ∩ L. So

d(x, G ∩ L) ≤ ‖x − y + y′‖
≤ ‖x − y‖ + ‖y′‖
≤ d(x, G) + ε + C‖y − z‖
≤ d(x, G) + ε + C‖y − x‖ + C‖x − z‖
≤ (1 + 2C)(d(x, G) + d(x, L) + ε).

Letting ε tend to zero gives the result.

1.12 The Banach–Steinhaus Theorem

THEOREM 40 (BANACH–STEINHAUS THEOREM) Suppose that E and F are normed spaces with E complete and let T_n : E −→ F be continuous linear maps for n ∈ N. Suppose that for every e ∈ E, we have

sup_{n∈N} ‖T_n(e)‖_F < ∞.

Then

sup_{n∈N} ‖T_n‖_op < ∞,

where ‖ ‖_op denotes the operator norm from E to F.

Proof. Let us define for k ∈ N,

A_k = {e ∈ E; ‖T_n(e)‖_F ≤ k for all n ∈ N} = ⋂_{n=1}^∞ T_n^{−1}(B_F(0, k)).

Then A_k is closed in E since it is an intersection of closed subsets of E. The hypothesis is that ⋃_{k=1}^∞ A_k = E.
Therefore by the Baire Category Theorem, there exists k ∈ N such that int(A_k) ≠ ∅. Hence there exist e ∈ E and ε > 0 such that U_E(e, ε) ⊆ A_k. As in the proof of the Open Mapping Theorem, we deduce from the symmetry and convexity of A_k that U_E(0, ε) ⊆ A_k. But this says that

x ∈ E, ‖x‖_E < ε =⇒ ‖T_n(x)‖_F ≤ k for all n ∈ N.

Rescaling this gives ‖T_n‖_op ≤ 2kε^{−1} for all n ∈ N, which is the desired conclusion.

The Banach–Steinhaus Theorem can be used to show that there exists a continuous function on the circle whose Fourier series does not converge at a point.

1.13 Hilbert Spaces

A complete inner product space is called a Hilbert space. Hilbert spaces are important because they have almost magical properties and are usually very easy to handle. They are extremely important in Physics, where they form the theoretical basis for Quantum Mechanics. The space L^2 holds a very special position among the L^p spaces because it can be given the structure of an inner product space. In this section, we will omit proofs since they were covered in Math 455.

THEOREM 41 The form

⟨f, g⟩ = ∫ f̄ g dµ

defines an inner product on L^2(X, M, µ) which is compatible with the L^2 norm.

The proof is completely straightforward, the key point being that the associated norm of the inner product is just the L^2 norm:

⟨f, f⟩ = ∫ f̄ f dµ = ∫ |f|^2 dµ = ‖f‖_2^2.

PROPOSITION 42 Let H be a Hilbert space (real or complex) and let C ⊆ H be a nonempty closed convex subset. Let x ∈ H. Then there is a unique nearest point y of C to x. In fact, this defines a mapping P_C : H −→ C called the metric projection onto C.

We do not need the Lemma below, but it is an interesting fact.

LEMMA 43 Let H be a Hilbert space (real or complex) and let C ⊆ H be a nonempty closed convex subset. Then P_C satisfies ‖P_C(x_1) − P_C(x_2)‖ ≤ ‖x_1 − x_2‖ for all x_1, x_2 ∈ H.

Let H be a Hilbert space either real or complex. Let S ⊆ H. Then we define

S^⊥ = {x; x ∈ H, ⟨s, x⟩ = 0, for all s ∈ S}.
It is clear that S^⊥ is an intersection of closed linear subspaces of H and therefore it is a closed linear subspace of H.

THEOREM 44 Let M be a closed linear subspace of H. Then we have H = M ⊕ M^⊥. Furthermore, let P and Q be the linear projection operators onto M and M^⊥ associated with the direct sum. Then P and Q are norm decreasing and in fact, more precisely, we have

‖x‖^2 = ‖P(x)‖^2 + ‖Q(x)‖^2

for all x ∈ H.

We denote S^⊥⊥ = (S^⊥)^⊥. This set has a neat characterization.

LEMMA 45 Let H be a Hilbert space either real or complex. Let S ⊆ H. Let M be the closure of the linear span of S. Then S^⊥⊥ = M.

THEOREM 46 Let H be a Hilbert space and let f be a continuous linear map from H to the base field k. Then there exists z ∈ H such that f(x) = ⟨z, x⟩ for all x ∈ H.

An orthonormal set is usually an indexed set (e_α)_{α∈I} where I is the indexing set. The key property that it has to satisfy is

⟨e_α, e_β⟩ = 1 if α = β, and ⟨e_α, e_β⟩ = 0 if α ≠ β.

Given a finite linearly independent set in an inner product space, one usually constructs an orthonormal set by using the Gram–Schmidt Orthogonalization Process. Note that if you are computing an orthonormal basis on a computer you should use the modified Gram–Schmidt Orthogonalization Process to avoid roundoff error instabilities.

THEOREM 47 Let (e_α)_{α∈I} be an orthonormal set. Then
(i) If (c_α) ∈ ℓ^2, then the series ∑_{α∈I} c_α e_α is a norm convergent unconditional sum and furthermore ‖∑_{α∈I} c_α e_α‖_H = (∑_{α∈I} |c_α|^2)^{1/2}.
(ii) If x ∈ H, then ∑_{α∈I} |⟨e_α, x⟩|^2 ≤ ‖x‖^2.
(iii) If M is the closed linear span of (e_α)_{α∈I}, then we have

P(x) = ∑_{α∈I} ⟨e_α, x⟩ e_α

where P is orthogonal projection on M.

Let H be a Hilbert space. An orthonormal basis in H is a maximal orthonormal set. It turns out that in the finite dimensional case, orthonormal bases are simply linear bases that are also orthonormal. But, in the infinite dimensional case, orthonormal bases are never linear bases.
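The modified Gram–Schmidt process mentioned above can be sketched in a few lines. This is only an illustration under our own assumptions, not a procedure taken from these notes: vectors are plain Python lists in C^n (or R^n), the inner product is conjugate linear in the first variable as in Theorem 46, and the names `inner`, `modified_gram_schmidt` and the tolerance `tol` are hypothetical.

```python
import math


def inner(u, v):
    """Standard inner product on C^n, conjugate linear in the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))


def modified_gram_schmidt(vectors, tol=1e-12):
    """Orthonormalise a linearly independent list of vectors (sketch only).

    `tol` is a hypothetical threshold below which a residual vector is
    treated as zero, i.e. the input is regarded as linearly dependent.
    """
    work = [list(v) for v in vectors]  # copies, so the input is not modified
    basis = []
    for i in range(len(work)):
        norm = math.sqrt(inner(work[i], work[i]).real)
        if norm < tol:
            raise ValueError("vectors are (numerically) linearly dependent")
        e = [a / norm for a in work[i]]
        basis.append(e)
        # Modified step: remove the component along e from every remaining
        # vector at once, so later projections act on updated residuals.
        for j in range(i + 1, len(work)):
            c = inner(e, work[j])
            work[j] = [b - c * a for a, b in zip(e, work[j])]
    return basis
```

The difference from the classical process is only the order of operations: each remaining vector is corrected as soon as a new basis vector is available, instead of being projected against the original, uncorrected vector. For example, `modified_gram_schmidt([[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])` should return three vectors that are orthonormal up to roundoff.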
First we need to address the question of existence or, more generally, extension. In this setting, we'll simply work with unindexed sets.

LEMMA 48 Every orthonormal set is contained in some orthonormal basis.

We need a theorem that characterizes orthonormal bases.

THEOREM 49 Let (e_α)_{α∈I} be an orthonormal set in a Hilbert space H. Then the following are equivalent.
(i) (e_α)_{α∈I} is an orthonormal basis.
(ii) The closed linear span M of (e_α)_{α∈I} is the whole of H.
(iii) The identity ∑_{α∈I} |⟨e_α, x⟩|^2 = ‖x‖^2 holds for all x ∈ H.
(iv) The identity ∑_{α∈I} ⟨y, e_α⟩⟨e_α, x⟩ = ⟨y, x⟩ holds for all x, y ∈ H.

There are some important consequences of this result and the existence of orthonormal bases.

COROLLARY 50
(i) If H is a finite dimensional Hilbert space, then it is linearly isometric to d dimensional Euclidean space R^d or C^d, depending on the field of scalars and where d = dim(H).
(ii) If H is an infinite dimensional but separable Hilbert space, then it is linearly isometric to ℓ^2 over the appropriate field of scalars.

1.14 Standard Banach Spaces and their Duals

If E and F are Banach spaces and T : E −→ F is a continuous linear mapping, then the operator norm of T is defined by

‖T‖_op = sup_{x∈E, ‖x‖_E ≤ 1} ‖T(x)‖_F.

EXERCISE Show that the finiteness of ‖T‖_op is equivalent to the continuity of T. □

EXERCISE Let L(E, F) denote the space of all continuous linear mappings from E to F. Show that the operator norm is a norm on L(E, F) and that L(E, F) is complete (and hence a Banach space) with this norm. □

The special case in which F = k, the base field, defines the dual space E*. We have

‖f‖_{E*} = sup_{x∈E, ‖x‖_E ≤ 1} |f(x)|.

The important Banach spaces and their duals are:
• C(K) for K a compact Hausdorff topological space with the uniform norm. The dual space can be identified to M(K) the space of finite Borel measures on K (real (i.e. signed) measures if k = R, complex if k = C) with the total variation norm.
By finite, we mean that the total variation is finite. The duality is given as follows. If g is a continuous linear form on C(K), then g(f) = ∫ f dµ where µ ∈ M(K). We will not prove this result. You can find the proof in Rudin's Real & Complex Analysis.
• C_0(X) for X a locally compact Hausdorff space, C_0(X) consisting of continuous functions vanishing at infinity. Again the norm is the uniform norm. The dual space can be identified to M(X) the space of finite regular Borel measures on X with the total variation norm and the duality is realized as above. If X is a countable union of compact subsets, then you can drop the regularity of the measure. It is satisfied automatically.
• L^p(X, M, µ), the L^p space on the measure space (X, M, µ) where µ is a positive (not necessarily finite) measure. For 1 < p < ∞, the dual space of L^p(X, M, µ) is identified to L^{p′}(X, M, µ) where p′ is the conjugate index to p, i.e. p^{−1} + p′^{−1} = 1, and the linear functional g is realized by the function h ∈ L^{p′}(X, M, µ) by g(f) = ∫ f h dµ. In case p = 1, the dual space is L^∞(X, M, µ) provided that (X, M, µ) is a σ-finite measure space, with the duality being realised in the same way.

2 Banach Space Duality

Let E be a Banach space. Let E* denote the dual space. Then we have defined the norm on E* as the operator norm

‖f‖_{E*} = sup_{x∈E, ‖x‖_E ≤ 1} |f(x)|.

It is a consequence of the Hahn–Banach theorem (Corollary 23) that

‖x‖_E = sup_{f∈E*, ‖f‖_{E*} ≤ 1} |f(x)|

and in particular, the elements of E* separate the points of E. The bidual E** is the dual of E*. There is a canonical mapping J : E −→ E** defined by J(x)(f) = f(x). It is clear that this mapping is a linear isometry. If it is bijective, we say that E is reflexive. For example L^p is reflexive for 1 < p < ∞, and Hilbert spaces are reflexive. The space c_0 is not reflexive since its dual is ℓ^1 and the dual of ℓ^1 is ℓ^∞.

There are two important locally convex topologies.
The weak topology on E denoted σ(E, E*) is the topology defined by the seminorms x ↦ |f(x)| as f runs over E*. More interesting is the weak star topology on E* denoted σ(E*, E) defined by the seminorms f ↦ |f(x)| as x runs over E. More generally you can define the topology σ(E, F) where F is a linear subspace of E*. The reason that the weak star topology is so important is the following theorem.

THEOREM 51 The closed unit ball of E* is weak star compact.

The proof depends on the Tychonov product theorem.

THEOREM 52 (TYCHONOV PRODUCT THEOREM) Let X_α be compact topological spaces for every α ∈ I where I is an index set. Then ∏_{α∈I} X_α is a compact topological space in the product topology.

There are two proofs of this theorem. Both are difficult.

Proof of Theorem 51. We give the proof in the complex case. For each x ∈ E let D_x be a copy of the closed disk of radius ‖x‖_E in the complex plane. Let

D = ∏_{x∈E} D_x.

A typical point of D will be denoted (z_x)_{x∈E}. Then by the Tychonov product theorem, D is compact for the product topology. Let Z_{x_1,x_2,t_1,t_2} be the subspace of D given by

Z_{x_1,x_2,t_1,t_2} = {(z_x); z_{t_1 x_1 + t_2 x_2} = t_1 z_{x_1} + t_2 z_{x_2}}

for t_1, t_2 ∈ C and x_1, x_2 ∈ E. The condition z_{t_1 x_1 + t_2 x_2} = t_1 z_{x_1} + t_2 z_{x_2} defines a closed subset of D_{t_1 x_1 + t_2 x_2} × D_{x_1} × D_{x_2} and it follows that Z_{x_1,x_2,t_1,t_2} is a closed subset of D since it involves only a finite number (at most three) of coordinates. Let

Z = ⋂_{x_1,x_2∈E, t_1,t_2∈C} Z_{x_1,x_2,t_1,t_2},

then Z is a closed subset of D (intersection of closed subsets) and hence compact. The final step in the proof is to realize that there is a one-to-one correspondence between {u ∈ E*; ‖u‖_{E*} ≤ 1} and Z given by u ↦ (z_x) where z_x = u(x) for all x ∈ E. In each case, the weak star topology on {u ∈ E*; ‖u‖_{E*} ≤ 1} and the topology on Z inherited from the product topology are given by pointwise convergence on the elements x ∈ E. Since Z is compact, so is the unit ball of E*.
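To see why Theorem 51 needs the weak star topology rather than the norm topology, the following standard example may help. It is a sketch under our own choice of setting, which is not part of the notes: we take E = ℓ^2, let (e_n) be the standard unit vectors, and identify E* with ℓ^2 through Theorem 46.

```latex
% Assumptions (ours): E = \ell^2, (e_n) the standard orthonormal sequence,
% and E^* identified with \ell^2 via Theorem 46.
By Theorem 47(ii), for every $x \in \ell^2$,
\[
  \sum_{n=1}^{\infty} |\langle e_n, x\rangle|^2 \le \|x\|^2
  \quad\Longrightarrow\quad
  \langle e_n, x\rangle \longrightarrow 0 ,
\]
so $e_n \to 0$ in the weak star topology. On the other hand,
\[
  \|e_n - e_m\|^2 = \|e_n\|^2 + \|e_m\|^2 = 2 \qquad (n \neq m),
\]
so $(e_n)$ has no norm convergent subsequence.
```

Thus the closed unit ball of ℓ^2 is weak star compact, as Theorem 51 asserts, but it is not compact for the norm topology.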
For F ⊆ E we define

F◦ = {f ∈ E*; f(x) = 0, ∀x ∈ F},

a closed linear subspace of E* called the annihilator of F. It should be clear that F has the same annihilator as its closed linear span. We can also define the annihilator of a subset of E* but, by convention, we take this in E and not in E**. Thus for N ⊆ E*, we have

N◦ = {x ∈ E; f(x) = 0, ∀f ∈ N}.

It should be apparent that F◦◦ is the closed linear span of F. The fact that the closed linear span of F is a subset of F◦◦ is evident. If there is x ∈ F◦◦ that is not in the closed linear span of F, then we may invoke the geometrical form of the Hahn–Banach theorem (Proposition 26) to find a closed hyperplane separating x from the closed linear span of F and hence an element f ∈ E* vanishing on F but with f(x) ≠ 0, contradicting x ∈ F◦◦.

Let E be a Banach space and F a closed linear subspace. Then a continuous linear projection of E on F is a continuous linear mapping P : E −→ F such that P|_F is the identity mapping on F. In particular P^2 = P. In this situation, it follows that ker(P) is a closed linear subspace of E such that E = F ⊕ ker(P), the (continuous) linear projection on ker(P) being I − P. The direct sum is implemented by

x = P(x) + (I − P)(x), where P(x) ∈ F and (I − P)(x) ∈ ker(P).

It should be stressed that except in trivial circumstances (like F = E or F = {0_E}) there may be many different continuous linear projections onto F.

LEMMA 53
1. If F is a finite dimensional linear subspace of E, then there is a continuous projection on F.
2. If F is a closed linear subspace of E of finite codimension, then there is a continuous projection on F.

Proof. If F is finite dimensional, then F is necessarily closed. Choose a basis e_1, . . . , e_n of F and define

ϕ_k(t_1 e_1 + · · · + t_n e_n) = t_k,

a continuous linear form on F which extends by Hahn–Banach to a continuous linear form ϕ̃_k on E. Then P(x) = ∑_{k=1}^n ϕ̃_k(x) e_k is a continuous linear projection from E to F.
For the case where F is closed and of finite codimension, form the quotient space Q = E/F which is finite dimensional and select a basis q_1, . . . , q_n. For x ∈ E we will have

π(x) = ∑_{k=1}^n t_k(x) q_k

where the t_k are continuous linear forms on E since π is continuous and Q is finite dimensional. Now choose e_1, . . . , e_n ∈ E such that π(e_k) = q_k for k = 1, . . . , n. The mapping x ↦ x − ∑_{k=1}^n t_k(x) e_k is now the desired continuous linear projection onto F.

LEMMA 54 Let E be a Banach space and F a closed linear subspace. Then the dual of the inclusion mapping gives a linear isometry E*/F◦ to F*. In particular, for f ∈ E*, we have d(f, F◦) = ‖f|_F‖_{F*}.

Proof. Let J be the inclusion of F into E. Then the dual map J* maps E* onto F* and has kernel F◦. Surjectivity follows from Hahn–Banach. Clearly J* is norm decreasing and so it induces a norm decreasing map from E*/F◦ to F*. But, by Hahn–Banach, every element of F* can be extended to E* without increase of norm. Hence the map E*/F◦ to F* is an isometry. Let f ∈ E*. Then decoding we get ‖J*(f)‖_{F*} = sup_{x∈F, ‖x‖≤1} |f(x)|. On the other hand, we may consider the quotient space Q = E*/F◦ and we find that d(f, F◦) = ‖π(f)‖_Q.

This situation can be contrasted with the following.

LEMMA 55 Let E be a Banach space and F a Banach space which is a dense linear subspace of E such that the inclusion J : F −→ E is continuous. Then the dual of the inclusion mapping gives a continuous inclusion of E* into F*.

Proof. Let f ∈ E* and suppose J*(f) = 0. Then for x ∈ F we have f(J(x)) = J*(f)(x) = 0. Thus f vanishes on F. But since F is dense in E and f is continuous it follows that f = 0. Hence J* is injective.

LEMMA 56 Let T : E −→ F be a continuous linear operator between two Banach spaces E and F. Then
(i) ker(T) = im(T*)◦.
(ii) ker(T*) = im(T)◦.
(iii) ker(T)◦ ⊇ cl(im(T*)).
(iv) ker(T*)◦ = cl(im(T)).

Proof.
We see that (iii) and (iv) follow immediately from (i) and (ii) using double annihilators. Everything hinges on (T*(f))(x) = f(T(x)). For (i) the inclusion ⊆ is obvious. On the other hand, if x ∈ im(T*)◦, then f(T(x)) = 0 for all f ∈ F*. It follows that T(x) = 0 since F* separates the points of F. Statement (ii) follows similarly.

PROPOSITION 57 Let E be a Banach space, G and L closed linear subspaces. Then
(i) G ∩ L = (G◦ + L◦)◦.
(ii) G◦ ∩ L◦ = (G + L)◦.
(iii) (G ∩ L)◦ ⊇ cl(G◦ + L◦).
(iv) (G◦ ∩ L◦)◦ = cl(G + L).

Proof. For (i) the ⊆ inclusion is obvious. To see this, let x ∈ G ∩ L and f ∈ G◦ + L◦. Then f(x) = 0. In the other direction we have G◦ ⊆ G◦ + L◦ and taking annihilators G = G◦◦ ⊇ (G◦ + L◦)◦ and similarly for L. For (ii) take (i) and replace G by G◦, L by L◦. For (iii) it is clear that (G ∩ L)◦ ⊇ G◦ + L◦ and the left-hand member is closed. For (iv), take annihilators in (ii) and use (G + L)◦◦ = cl(G + L).

Note that you cannot show (G ∩ L)◦ = cl(G◦ + L◦) by applying annihilators to (i). This is because for X ⊆ E it is true that X◦◦ is the closed linear span of X, but you cannot make the same conclusion in case X ⊆ E* because of the convention defining the annihilator of a subset of E*.

THEOREM 58 Let E be a Banach space, G and L closed linear subspaces. Then the following are equivalent:
(a) G + L is closed in E.
(b) G◦ + L◦ is closed in E*.
(c) G + L = (G◦ ∩ L◦)◦.
(d) G◦ + L◦ = (G ∩ L)◦.

Proof. We see that (a) and (c) are equivalent from (iv) above. Also (d) implies (b). It remains to show that (a) implies (d) and that (b) implies (a).

To show that (a) implies (d), since one inclusion is obvious, we need only show that G◦ + L◦ ⊇ (G ∩ L)◦. Let f ∈ (G ∩ L)◦. For x ∈ G + L, say with x = g + ℓ, we define ϕ(x) = f(g). This is well-defined since if also x = g′ + ℓ′ then the difference g − g′ = ℓ′ − ℓ lies in G ∩ L. Since G + L is closed, Theorem 39 shows that with a good choice of g we have ‖g‖ ≤ C‖x‖ and hence ϕ is continuous.
Now extend ϕ (defined on the closed linear subspace G + L) to ϕ̃ defined on the whole of E. Then f = (f − ϕ̃) + ϕ̃ with the first member in G◦ and the second in L◦.

To show that (b) implies (a), note that since G◦ + L◦ is closed, Theorem 39 gives

d(f, G◦ ∩ L◦) ≤ C(d(f, G◦) + d(f, L◦))

for all f ∈ E*. Decoding this using (ii) from Proposition 57 and Lemma 54, we get

sup_{x∈cl(G+L), ‖x‖≤1} |f(x)| ≤ C( sup_{x∈G, ‖x‖≤1} |f(x)| + sup_{x∈L, ‖x‖≤1} |f(x)| ). (2.1)

Now let x ∈ cl(G + L) with ‖x‖ ≤ 1. We claim that C^{−1}x ∈ cl(B_G + B_L). If not, then by Hahn–Banach, there exist α ∈ ]0, 1[ and f ∈ E* such that f(C^{−1}x) = 1 and f(g + ℓ) ≤ α for all g ∈ B_G, ℓ ∈ B_L. But this contradicts (2.1). The claim is proved. But this means that any vector x ∈ cl(G + L) with ‖x‖ ≤ 1 can be written in the form x = g + ℓ + z where g ∈ G, ℓ ∈ L with ‖g‖, ‖ℓ‖ ≤ C and ‖z‖ ≤ 1/2. An iteration argument shows that x ∈ G + L and completes the proof.

3 Compact Operators and the Fredholm Alternative

Let E and F be Banach spaces. A continuous operator T ∈ L(E, F) is said to be compact iff T(B_E) has compact closure in F for the norm topology on F. The set of all such operators is denoted K(E, F).

EXERCISE
• Show that T compact is equivalent to the statement that for every sequence (x_n) in B_E, the sequence (T(x_n)) has a convergent subsequence in F.
• Show that K(E, F) is a linear subspace of L(E, F).
□

LEMMA 59 We have that K(E, F) is a closed linear subspace of L(E, F).

Proof. Let T_n −→ T in the operator norm with T_n ∈ K(E, F). We need to show that T(B_E) has compact closure. But the closure of T(B_E) is closed and hence complete, so it will suffice to show that T(B_E) is totally bounded (recall that the closure of a totally bounded set is again totally bounded). Let δ > 0, then we need to cover T(B_E) with finitely many δ-balls centred in T(B_E). Choose n such that ‖T − T_n‖_op < (1/3)δ.
Since T_n(B_E) has compact closure and is hence totally bounded, we may cover T_n(B_E) by finitely many (1/3)δ-balls centred in T_n(B_E). If the centres of these balls are T_n(z_k), then the δ-balls centred at T(z_k) will cover T(B_E).

Clearly, a finite rank operator T is necessarily compact since T(B_E) is a bounded subset of a finite dimensional space. Hence an operator norm limit of finite rank operators is again compact.

The converse of this statement is in general false in Banach spaces, a fact that was established by Per Enflo. That is, not every compact operator is an operator norm limit of finite rank operators. As we shall see later, the converse is true in Hilbert spaces.

EXERCISE Let E, F, G and H be Banach spaces and let R : E −→ F, S : F −→ G and T : G −→ H be continuous linear maps with S compact. Then SR, TS (and for that matter TSR) are all compact. □

Let T : E −→ F be a continuous linear mapping between Banach spaces E and F. Then the dual map T* : F* −→ E* (some authors call this the adjoint) is defined by T*(φ)(x) = φ(T(x)) for all φ ∈ F* and x ∈ E. It is easy to see, using the Hahn–Banach theorem, that ‖T‖_op = ‖T*‖_op.

PROPOSITION 60 Let T ∈ K(E, F). Then T* ∈ K(F*, E*) and conversely.

Proof. Let ψ_n ∈ B_{F*}. We wish to extract a subsequence such that T*(ψ_{n_k}) converges in E*. Let K = cl(T(B_E)), a compact subset of F. Define ϕ_n = ψ_n|_K. These are continuous functions on the compact metric space K. We aim to apply the Ascoli–Arzela theorem. For x ∈ B_E we have |ϕ_n(T(x))| ≤ ‖T(x)‖ ≤ ‖T‖. Extending by continuity, this gives ‖ϕ_n‖_∞ ≤ ‖T‖. For x_1, x_2 ∈ B_E, we have

|ϕ_n(T(x_1)) − ϕ_n(T(x_2))| = |ψ_n(T(x_1 − x_2))| ≤ ‖T(x_1) − T(x_2)‖_F.

Passing to the limit this gives |ϕ_n(y_1) − ϕ_n(y_2)| ≤ ‖y_1 − y_2‖ for all y_1, y_2 ∈ K. So the sequence (ϕ_n) is uniformly bounded and uniformly equicontinuous on K and must therefore possess a convergent subsequence (ϕ_{n_k}) converging uniformly on K to a function ϕ. Therefore

sup_{x∈B_E} |ψ_{n_k}(T(x)) − ϕ(T(x))| → 0.

It follows that T*(ψ_{n_k}) is a Cauchy sequence in E*. Since E* is complete (where did we prove this?) the first assertion is verified.

For the second, we have that T** ∈ K(E**, F**). Hence, T**(B_E) has compact closure in F**. But T**(B_E) = T(B_E) ⊆ F (as subsets of F**). Also F is closed in F**. It follows that T(B_E) has compact closure in F.

LEMMA 61 (RIESZ'S LEMMA) Let E be a Banach space and F a closed linear subspace with F ≠ E. Then given ε > 0, there exists x ∈ E with ‖x‖ = 1 and d(x, F) ≥ 1 − ε.

Proof. Choose an element z of unit norm in the quotient space E/F, then an element y ∈ E with ‖y‖ < 1 + ε that projects down onto z. Then d(y, F) = ‖z‖ = 1. Putting x = ‖y‖^{−1} y does the trick, since ‖x‖ = 1 and d(x, F) = ‖y‖^{−1} > (1 + ε)^{−1} ≥ 1 − ε.

THEOREM 62 (THE FREDHOLM ALTERNATIVE) Let T ∈ K(E). Then
i) ker(I − T) is finite-dimensional.
ii) im(I − T) is closed, being actually ker(I − T*)◦.
iii) ker(I − T) = {0_E} if and only if im(I − T) = E.
iv) dim(ker(I − T)) = dim(ker(I − T*)).

Proof. For (i) the unit ball of ker(I − T) is closed and contained in T(B_E), and hence is compact. So ker(I − T) must be finite-dimensional.

For (ii) let y_n = x_n − T(x_n) −→ y ∈ E. We must show that y ∈ im(I − T). Let δ_n = d(x_n, ker(I − T)) ≥ 0. Since ker(I − T) is finite dimensional, there exists z_n ∈ ker(I − T) such that ‖x_n − z_n‖ = δ_n. Now

y_n = (x_n − z_n) − T(x_n − z_n). (3.1)

We claim that ‖x_n − z_n‖ remains bounded as n → ∞. If not, there is a subsequence such that ‖x_n − z_n‖ tends to ∞. We pass to this subsequence (without change of notation). Let w_n = ‖x_n − z_n‖^{−1}(x_n − z_n). It follows that w_n − T(w_n) = ‖x_n − z_n‖^{−1} y_n tends to 0_E. But since w_n is a unit vector, we may pass to a further subsequence (still without change of notation) such that T(w_n) converges. Thus w_n and T(w_n) both converge to a vector w and clearly T(w) = w, i.e. w ∈ ker(I − T). But

d(w_n, ker(I − T)) = d(x_n, ker(I − T)) / ‖x_n − z_n‖ = 1

gives a contradiction, since w_n converges to the point w of ker(I − T). Hence the claim is proved.
Since $\|x_n - z_n\|$ remains bounded as $n \to \infty$, we may extract a subsequence (again without change of notation) such that $T(x_n - z_n)$ converges, say to $u$. But now we see from (3.1) that $x_n - z_n = y_n + T(x_n - z_n)$ therefore converges to $y + u$. Substituting back into (3.1) and passing to the limit gives $y = (y + u) - T(y + u)$ so that $y \in \operatorname{im}(I - T)$ as required. It now follows from Lemma 56(iv) that $\operatorname{im}(I - T) = \ker(I - T^*)^\circ$.

For (iii) we show first that $\ker(I - T) = \{0_E\}$ implies $\operatorname{im}(I - T) = E$. Let $E_1 = \operatorname{im}(I - T)$ and suppose that $E_1 \ne E$. Let more generally $E_n = \operatorname{im}((I - T)^n)$ for $n \in \mathbb{N}$; then by (ii) the $E_n$ are all closed and, since $I - T$ is injective, each $E_{n+1}$ is strictly contained in $E_n$. Choose a unit vector $z_n \in E_n$ with $d(z_n, E_{n+1}) \ge \frac{1}{2}$. With $n > m$, we have
\[ T(z_m) - T(z_n) = z_m - \underbrace{\bigl(z_n - (z_n - T(z_n)) + (z_m - T(z_m))\bigr)}_{\in E_{m+1}} \]
and it follows that $\|T(z_m) - T(z_n)\| \ge \frac{1}{2}$, a contradiction with the compactness of $T$.

For the other implication in (iii) suppose that $\operatorname{im}(I - T) = E$. Then $\ker(I - T^*) = \{0_{E^*}\}$. Since $T^*$ is compact, we can apply the part of (iii) already proved to establish that $\operatorname{im}(I - T^*) = E^*$. But by Lemma 56(i) it follows that $\ker(I - T) = \{0_E\}$.

We move on to (iv). Suppose that $\dim(\ker(I - T)) < \dim(\ker(I - T^*))$. Then since $\ker(I - T)$ is finite-dimensional, it admits a complement in $E$ and there is a continuous linear projection $P : E \to \ker(I - T)$. On the other hand, $\operatorname{im}(I - T) = \ker(I - T^*)^\circ$ is closed and has strictly larger finite codimension in $E$. Thus, $\operatorname{im}(I - T)$ has a complement $M$ in $E$ with dimension strictly bigger than $\dim(\ker(I - T))$. Therefore, there is a linear map $J : \ker(I - T) \to M$ which is injective but not surjective. Let $S = T + JP$. Then since $JP$ has finite rank, $S$ is compact. We claim that $\ker(I - S) = \{0_E\}$. If
\[ 0_E = z - S(z) = \underbrace{(z - T(z))}_{\in \operatorname{im}(I - T)} - \underbrace{J(P(z))}_{\in M} \]
then $z - T(z) = 0_E$ and $J(P(z)) = 0_E$. Then $z \in \ker(I - T)$ so $P(z) = z$ and hence $J(z) = 0_E$. Since $J$ is injective, $z = 0_E$.
This establishes the claim. Applying (iii) to $S$ we obtain $\operatorname{im}(I - S) = E$. But we know $\operatorname{im}(I - S) \subseteq \operatorname{im}(I - T) + \operatorname{im}(J)$, a contradiction since $J$ does not map onto $M$. This shows $\dim(\ker(I - T)) \ge \dim(\ker(I - T^*))$. For the inequality $\dim(\ker(I - T)) \le \dim(\ker(I - T^*))$ we apply the same argument to $T^*$ to obtain $\dim(\ker(I - T^*)) \ge \dim(\ker(I - T^{**}))$ and the result follows since $\ker(I - T) \subseteq \ker(I - T^{**})$.

4 Spectral Theory of Hilbert Space Operators

In this chapter we take a look at various aspects of the spectral theory and symbolic calculus of linear operators on a complex Hilbert space $H$.

LEMMA 63 Let $T$ be a continuous linear operator on $H$. Then there exists an operator $T^\star$ called the adjoint of $T$ such that
\[ \langle T^\star y, x \rangle = \langle y, Tx \rangle \tag{4.1} \]
for all $x, y \in H$.

Proof. Let $y \in H$ and let $u$ be the continuous linear form $u(x) = \langle y, x \rangle$ on $H$. Then $x \mapsto u(Tx)$ is again a continuous linear form. By Theorem 46 there exists $z \in H$ such that $u(Tx) = \langle z, x \rangle$. That is, $\langle z, x \rangle = \langle y, Tx \rangle$. We now see that the mapping $y \mapsto z$ so defined is linear and continuous. Therefore we may define a continuous linear operator $T^\star$ such that (4.1) holds.

The mapping $T \mapsto T^\star$ is conjugate linear. We also observe that $(T^\star)^\star = T$ and that $\|T^\star\| = \|T\|$. In case $H$ is $\mathbb{C}^n$ with the standard inner product, $T^\star$ is essentially the complex conjugate transpose matrix of $T$.

DEFINITION A continuous linear operator $T$ on $H$ is said to be self-adjoint or hermitian if $T^\star = T$ and normal if $T^\star T = T T^\star$.

DEFINITION A continuous linear operator $T$ on $H$ is said to be a compact operator if $\{T(x);\ x \in H,\ \|x\| \le 1\}$ has compact closure in $H$ (for the norm topology).

EXERCISE For operators on Hilbert space we have:
• Every operator of finite rank is compact.
• An operator norm limit of compact operators is compact. (It suffices to show that the image of the unit ball is totally bounded.)
• If $T$ is compact and $S$ is a continuous operator, then $ST$ is compact.
• If $T$ is compact and $S$ is a continuous operator, then $TS$ is compact. □

4.1 Spectral Theory of Compact Hermitian operators

LEMMA 64 Let $T$ be a continuous hermitian operator on $H$. Then
\[ \|T\| = \sup_{\|x\| \le 1} |\langle x, Tx \rangle|. \]

Proof. Clearly the left hand side is $\ge$ the right hand side. We only need to establish the opposite inequality. Let $x, y \in H$ and assume temporarily that $Tx \ne 0$. We have
\[ \begin{aligned}
4|\Re\langle y, Tx \rangle| &= 2|\langle y, Tx \rangle + \langle Tx, y \rangle| = 2|\langle y, Tx \rangle + \langle x, Ty \rangle| \\
&= |\langle x + y, T(x + y) \rangle - \langle x - y, T(x - y) \rangle| \\
&\le |\langle x + y, T(x + y) \rangle| + |\langle x - y, T(x - y) \rangle| \\
&\le \sup_{\|z\| \le 1} |\langle z, Tz \rangle| \bigl( \|x + y\|^2 + \|x - y\|^2 \bigr) \\
&= 2 \sup_{\|z\| \le 1} |\langle z, Tz \rangle| \bigl( \|x\|^2 + \|y\|^2 \bigr).
\end{aligned} \]
Now replace $y$ by $\omega y$ with $|\omega| = 1$ and optimize to get
\[ 2|\langle y, Tx \rangle| \le \sup_{\|z\| \le 1} |\langle z, Tz \rangle| \bigl( \|x\|^2 + \|y\|^2 \bigr). \]
Put $y = \|x\| \|Tx\|^{-1} Tx$ to get
\[ 2\|x\| \|Tx\| \le 2 \sup_{\|z\| \le 1} |\langle z, Tz \rangle| \, \|x\|^2 . \]
Since $Tx \ne 0$, we have $x \ne 0$ and we find
\[ \|Tx\| \le \sup_{\|z\| \le 1} |\langle z, Tz \rangle| \, \|x\|. \]
But this holds in any case if $Tx = 0$ and the proof is complete.

LEMMA 65 Let $T$ be a compact hermitian operator on a nonzero Hilbert space $H$. Then either $\|T\|$ or $-\|T\|$ is an eigenvalue of $T$.

Proof. By the previous lemma, we may find a sequence $x_n \in H$ with $\|x_n\| = 1$ and $|\langle x_n, Tx_n \rangle| \to \|T\|$. Passing to a subsequence, we can assume without loss of generality that there exist $y \in H$ and $\lambda = \pm\|T\|$ such that $\langle x_n, Tx_n \rangle \to \lambda$ and $Tx_n \to y$ as $n \to \infty$. Then
\[ \begin{aligned}
0 \le \limsup_{n \to \infty} \|Tx_n - \lambda x_n\|^2 &= \limsup_{n \to \infty} \bigl( \|Tx_n\|^2 - 2\lambda \langle x_n, Tx_n \rangle + \lambda^2 \|x_n\|^2 \bigr) \\
&\le \|T\|^2 - 2\lambda^2 + \lambda^2 = \|T\|^2 - \lambda^2 = 0.
\end{aligned} \]
Therefore $Tx_n - \lambda x_n \to 0$ and consequently $\lambda x_n \to y$. Applying $T$ we get $\lambda Tx_n \to Ty$ and hence $\lambda y = Ty$. It remains to show that $y \ne 0$. If $y = 0$, then since $\lambda x_n \to y$ and $\|x_n\| = 1$ we have $\lambda = 0$. Consequently $\|T\| = 0$ and zero is an eigenvalue.

THEOREM 66 Let $T$ be a compact hermitian operator on $H$ and, for each eigenvalue $\lambda$, let $H_\lambda$ denote the corresponding eigenspace. Then
• The closure $K$ of $\bigoplus_\lambda H_\lambda$ is the whole of $H$.
• The eigenvalues are all real.
• The spaces $H_\lambda$ are finite dimensional for $\lambda \ne 0$.
• The only possible accumulation point of the eigenvalues is zero.

Proof.
It is easy to see that the only possible eigenvalues are real. The closed subspace $K$ is clearly invariant under $T$. Since $T$ is hermitian, $K^\perp$ is also invariant under $T$. Also the restriction of $T$ to $K^\perp$ is a compact hermitian operator on $K^\perp$. By the previous lemma, either $K^\perp$ is zero or $T|_{K^\perp}$ has an eigenvalue. The second scenario leads to $K \cap K^\perp \ne \{0\}$, a contradiction. Hence $K^\perp$ is zero and $K = H$.

Now let $\delta > 0$ and choose a unit vector $e_\lambda$ in $H_\lambda$ for every eigenvalue $\lambda$ with $|\lambda| > \delta$. Then $Te_\lambda = \lambda e_\lambda$ so that $\|Te_\lambda\| \ge \delta$. For distinct eigenvalues $\lambda_1$ and $\lambda_2$, we still have $Te_{\lambda_1} \perp Te_{\lambda_2}$, and it follows from the compactness of $T$ that there cannot be infinitely many such $\lambda$. A similar argument shows that each individual $H_\lambda$ with $\lambda \ne 0$ is finite dimensional.

THEOREM 67 Let $T : H \to K$ be a compact linear map with $H$ and $K$ Hilbert spaces. Then $T$ has a singular value decomposition. Namely, there exist a countable index set $I$, orthonormal subsets $(e_\alpha)_{\alpha \in I}$, $(f_\alpha)_{\alpha \in I}$ of $H$ and $K$ respectively and positive reals $(\sigma_\alpha)_{\alpha \in I}$ (not necessarily distinct, but not having a non-zero accumulation point) such that
\[ Tx = \sum_{\alpha \in I} \sigma_\alpha \langle e_\alpha, x \rangle f_\alpha \]
for every $x \in H$, the sum (for $Tx$) converging in $K$.

Proof. Note that both $T^\star T$ and $T T^\star$ are compact and hermitian. Let $\lambda$ be an eigenvalue of $T^\star T$. Then for $x$ a nonzero eigenvector we have
\[ \lambda \|x\|^2 = \langle x, T^\star T x \rangle = \|Tx\|^2 \]
so that $\lambda \ge 0$. Let $H_\lambda$ be the corresponding eigenspace and let $K_\lambda$ be the eigenspace of $T T^\star$ for $\lambda$. Then
\[ (T T^\star) T x = T (T^\star T) x = \lambda T x \]
showing that $T$ maps $H_\lambda$ to $K_\lambda$. Similarly $T^\star$ maps $K_\lambda$ to $H_\lambda$. If $\lambda > 0$ we have $\lambda^{-1} T^\star T x = x$ for all $x \in H_\lambda$, showing that $\lambda^{-1} T^\star$ is inverse to $T$ (as maps between $K_\lambda$ and $H_\lambda$). Hence $H_\lambda$ and $K_\lambda$ have the same dimension. It follows (using standard finite-dimensional linear algebra) that there are orthonormal bases of $H_\lambda$ and $K_\lambda$ such that the matrix representation of $T|_{H_\lambda}$ is $\sqrt{\lambda}\, I$ where $I$ is the identity matrix. Glueing these orthonormal bases together for all possible eigenvalues $\lambda > 0$ gives the result.
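The construction in the proof of Theorem 67 can be tried out by hand in the smallest nontrivial case. The following is a minimal sketch in Python (standard library only), under assumptions that are not part of the notes: the operator is a real invertible $2 \times 2$ matrix for which the off-diagonal entry of $T^\star T$ is nonzero, and the names `svd_2x2`, `apply_svd` and the sample matrix are purely illustrative. It diagonalises $T^\star T$, sets $\sigma = \sqrt{\lambda}$ and $f = \sigma^{-1} T e$, and then evaluates the sum $\sum \sigma \langle e, x \rangle f$.

```python
import math

def svd_2x2(t):
    """Singular triples (sigma, e, f) of a real invertible 2x2 matrix t,
    built as in the proof: diagonalise t^T t, then put f = t e / sigma.
    Assumption: the off-diagonal entry q of t^T t is nonzero."""
    (a, b), (c, d) = t
    # Entries of the symmetric matrix t^T t = [[p, q], [q, r]].
    p, q, r = a * a + c * c, a * b + c * d, b * b + d * d
    # Eigenvalues of a symmetric 2x2 matrix in closed form.
    mean, radius = (p + r) / 2, math.hypot((p - r) / 2, q)
    triples = []
    for lam in (mean + radius, mean - radius):
        # (q, lam - p) is an eigenvector of [[p, q], [q, r]] for lam.
        norm = math.hypot(q, lam - p)
        e = (q / norm, (lam - p) / norm)
        sigma = math.sqrt(lam)
        te = (a * e[0] + b * e[1], c * e[0] + d * e[1])
        f = (te[0] / sigma, te[1] / sigma)
        triples.append((sigma, e, f))
    return triples

def apply_svd(triples, x):
    """Evaluate the sum of sigma * <e, x> * f over the singular triples."""
    out = [0.0, 0.0]
    for sigma, e, f in triples:
        coeff = sigma * (e[0] * x[0] + e[1] * x[1])
        out[0] += coeff * f[0]
        out[1] += coeff * f[1]
    return tuple(out)

# Illustrative matrix (a hypothetical choice): T x = (2 x1 + x2, x2).
sample = ((2.0, 1.0), (0.0, 1.0))
triples = svd_2x2(sample)
image = apply_svd(triples, (1.0, -3.0))  # should agree with T(1, -3) up to rounding
```

For this sample matrix the product of the two singular values should agree with $|\det T|$ and the vectors $f$ should come out orthonormal, which is exactly what the eigenspace argument in the proof predicts.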
4.2 Tensor products of inner product spaces

In this section, we will assume that you know all about tensor products of vector spaces. Let $E$ and $F$ be inner product spaces and let $E \otimes F$ be their tensor product as vector spaces. Then there is a natural inner product on $E \otimes F$, defined essentially by
\[ \langle \xi_1 \otimes \eta_1, \xi_2 \otimes \eta_2 \rangle = \langle \xi_1, \xi_2 \rangle \langle \eta_1, \eta_2 \rangle \]
and by extending by linearity and conjugate linearity. We check that this actually is an inner product. For a tensor $\tau = \sum_{j=1}^n \xi_j \otimes \eta_j$, this yields
\[ \langle \tau, \tau \rangle = \sum_{j,k} \langle \xi_j, \xi_k \rangle \langle \eta_j, \eta_k \rangle \]
which is always nonnegative (some linear algebra required). Another way of seeing this and of completing the next step is to choose orthonormal bases $(e_\alpha)$ and $(f_\beta)$ of the linear spans of the $\xi$'s and $\eta$'s respectively, allowing us to write
\[ \xi_j = \sum_\alpha a_{\alpha,j} e_\alpha, \qquad \eta_j = \sum_\beta b_{\beta,j} f_\beta . \]
Then
\[ \langle \tau, \tau \rangle = \sum_{\alpha,\beta,j,k} \overline{a_{\alpha,j}}\, a_{\alpha,k}\, \overline{b_{\beta,j}}\, b_{\beta,k} = \sum_{\alpha,\beta} \Bigl| \sum_k a_{\alpha,k} b_{\beta,k} \Bigr|^2 \ge 0. \]
But now, if $\langle \tau, \tau \rangle = 0$, we have that $\sum_k a_{\alpha,k} b_{\beta,k} = 0$ for all $\alpha$ and $\beta$. But then
\[ \tau = \sum_{\alpha,\beta,k} a_{\alpha,k} b_{\beta,k}\, e_\alpha \otimes f_\beta = 0. \]
Hence we have a genuine inner product.

Now let $E$ and $F$ be Hilbert spaces. Then unfortunately $E \otimes F$ is not necessarily complete in its inner product as defined above. The Hilbert space tensor product $E \otimes_H F$ of $E$ and $F$ is defined as the completion of $E \otimes F$ for the corresponding norm. Now if $(e_\alpha)$ and $(f_\beta)$ are orthonormal bases of $E$ and $F$ respectively, then $(e_\alpha \otimes f_\beta)$ is an orthonormal basis of $E \otimes_H F$. This is easily seen since $(e_\alpha \otimes f_\beta)$ is orthonormal and the closure of its linear span is $E \otimes_H F$.

Next, let $E = L^2(X, \mathcal{F}, \mu)$ and $F = L^2(Y, \mathcal{G}, \nu)$. Since we wish to discuss the $L^2$ space on the product, we are obliged to assume at this point that both measure spaces are $\sigma$-finite. The product measure $\mu \times \nu$ is not defined otherwise. Then it is easy to see that the mapping
\[ T : E \otimes_H F \to L^2(X \times Y, \mathcal{F} \otimes \mathcal{G}, \mu \times \nu) \]
defined by $T(f \otimes g) = \Phi$, where $\Phi(x, y) = f(x) g(y)$, and by extending by linearity is a well defined isometry (and hence one-to-one).

EXERCISE The mapping $T$ is onto.
(Use the argument in the math 455 notes or use orthogonality.) □

THEOREM 68 Let $(e_\alpha)_{\alpha \in I}$ and $(f_\beta)_{\beta \in J}$ be orthonormal bases of $L^2(X, \mathcal{F}, \mu)$ and $L^2(Y, \mathcal{G}, \nu)$ respectively, where $\mu$ and $\nu$ are $\sigma$-finite measures. Then the functions $e_\alpha(x) f_\beta(y)$ form an orthonormal basis of $L^2$ of the product space as $(\alpha, \beta)$ runs over $I \times J$.

EXERCISE Incidentally, to show that $L^2(\mathbb{T}, \eta) \otimes L^2(\mathbb{T}, \eta)$ is not the whole of $L^2(\mathbb{T}, \eta) \otimes_H L^2(\mathbb{T}, \eta) = L^2(\mathbb{T} \times \mathbb{T}, \eta \times \eta)$, consider
\[ P(f \otimes g)(t) = \int f(t - s) g(s) \, d\eta(s). \]
Show that $P$ maps $L^2(\mathbb{T}, \eta) \otimes L^2(\mathbb{T}, \eta)$ into the space of functions with absolutely convergent Fourier series. On the other hand, given $h \in L^2(\mathbb{T})$, the function $F(t, s) = h(t + s)$ gets mapped by $P$ to $h$. Now show that for $E$ and $F$ Hilbert spaces, $E \otimes F = E \otimes_H F$ if and only if $E$ or $F$ is finite dimensional. □

4.3 Hilbert–Schmidt Operators

In this section, let $(X, \mathcal{S}, \mu)$ and $(Y, \mathcal{T}, \nu)$ be $\sigma$-finite measure spaces. We will also assume that the corresponding $L^2$ spaces are separable. Let $K$ be an $L^2$ function on the product space. We consider the operator
\[ Tf(x) = \int K(x, y) f(y) \, d\nu(y) \tag{4.2} \]
which we will see shortly is a continuous operator from $L^2(Y, \mathcal{T}, \nu)$ to $L^2(X, \mathcal{S}, \mu)$. First observe that
\[ \int |K(x, y)| |f(y)| \, d\nu(y) \]
is finite for $\mu$-almost all $x$. Now let $g \in L^2(X, \mathcal{S}, \mu)$ and observe that
\[ \begin{aligned}
\int |g(x)| \int |K(x, y)| |f(y)| \, d\nu(y) \, d\mu(x) &= \int |K(x, y)| |g(x) f(y)| \, d(\mu \times \nu)(x, y) \\
&\le \Bigl( \int |K(x, y)|^2 \, d(\mu \times \nu)(x, y) \Bigr)^{\frac{1}{2}} \Bigl( \int |g(x)|^2 |f(y)|^2 \, d(\mu \times \nu)(x, y) \Bigr)^{\frac{1}{2}} < \infty
\end{aligned} \]
by Tonelli's Theorem and the Cauchy–Schwarz inequality. This sets us up to use Fubini's Theorem and we prove easily that
\[ \Bigl| \int g(x) Tf(x) \, d\mu(x) \Bigr| = \Bigl| \int g(x) \int K(x, y) f(y) \, d\nu(y) \, d\mu(x) \Bigr| \le \|K\|_2 \|f\|_2 \|g\|_2 . \]
To conclude from this, we define sets $A_k \in \mathcal{S}$ of finite measure, increasing with union $X$, and set
\[ g_k = \mathbb{1}_{A_k} \min(k, |Tf|) \, \mathrm{sgn}(Tf) \]
which is definitely in $L^2$. Now the inequality $|g_k|^2 \le g_k Tf$ holds since it boils down to $\min(k^2, |Tf|^2) \le |Tf| \min(k, |Tf|)$.
Therefore
\[ \int |g_k(x)|^2 \, d\mu(x) \le \int g_k(x) Tf(x) \, d\mu(x) \le \|K\|_2 \|f\|_2 \|g_k\|_2 . \]
Since we know that $\|g_k\|_2 < \infty$ we can conclude $\|g_k\|_2 \le \|K\|_2 \|f\|_2$. Letting $k \to \infty$ we finally get $\|Tf\|_2 \le \|K\|_2 \|f\|_2$ by monotone convergence. We further deduce that $\|T\| \le \|K\|_2$. But choosing orthonormal bases $(e_j)$ and $(f_k)$ in $L^2(Y, \mathcal{T}, \nu)$ and $L^2(X, \mathcal{S}, \mu)$ respectively, we can write
\[ K = \sum_{j,k} c_{jk} \, e_j \otimes f_k \]
where $\|K\|_2^2 = \sum_{j,k} |c_{jk}|^2$ and the sum converges in the $L^2$ norm. It follows that $T$ is an operator norm limit of finite rank operators and hence is compact. Thus, in fact we may write from the singular value decomposition of $T$
\[ T(g) = \sum_{i=1}^\infty \sigma_i \langle e_i, g \rangle f_i \]
for possibly different orthonormal sets. We now have
\[ K(x, y) = \sum_{i=1}^\infty \sigma_i f_i(x) \overline{e_i(y)} \]
and $\|K\|_2^2 = \sum_{i=1}^\infty \sigma_i^2$.

One may define the von Neumann–Schatten classes of compact operators to be the ones for which
\[ \sum_{i=1}^\infty \sigma_i^p < \infty. \]
The corresponding quantity
\[ \Bigl( \sum_{i=1}^\infty \sigma_i^p \Bigr)^{\frac{1}{p}} \]
turns out to be a norm for $1 \le p < \infty$ and defines the class $C_p$. This fact is not entirely trivial. The class $C_2$ consists of the Hilbert–Schmidt operators and $C_1$ is the so-called trace class.

In order to set up the $C_p$ norm, we make the definition in an underhanded way. We set for $T$ a compact operator
\[ \|T\|_{C_p} = \sup \{ |\operatorname{tr}(TS)| \} \]
as $S$ runs over finite rank operators with $\sum_{k=1}^{\operatorname{rank}(S)} \sigma_k(S)^{p'} \le 1$, where $(\sigma_k)$ are the singular values of $S$. For any finite rank operator $S = \sum_{k=1}^{\operatorname{rank}(S)} t_k \, e_k \otimes f_k^*$ we define the trace of $TS$ by $\operatorname{tr}(TS) = \sum_{k=1}^{\operatorname{rank}(S)} t_k \langle f_k, T(e_k) \rangle$. It is obvious that $\| \cdot \|_{C_p}$ is a norm and by duality that
\[ \Bigl( \sum_{i=1}^\infty \sigma_i(T)^p \Bigr)^{\frac{1}{p}} \le \|T\|_{C_p} . \]
The class $C_\infty$ is the class of all compact operators with the operator norm. Of course in this case, we have that the sequence of singular values lies in $c_0$. We will need the following proposition.

PROPOSITION 69 Let $a_{j,k}$ be nonnegative for $j \in \mathbb{N}$ and $k = 1, \dots, n$. Suppose that
\[ \sum_{j=1}^\infty a_{j,k} \le 1 \quad \text{for all } k = 1, \dots, n \tag{4.3} \]
and
\[ \sum_{k=1}^n a_{j,k} \le 1 \quad \text{for all } j \in \mathbb{N}. \tag{4.4} \]
Let $(\alpha_j)_{j \in \mathbb{N}}$ and $(\beta_k)_{k=1}^n$ be decreasing sequences of nonnegative numbers.
Then
\[ \sum_{j=1}^\infty \sum_{k=1}^n a_{j,k} \alpha_j \beta_k \le \sum_{k=1}^n \alpha_k \beta_k . \]

We now apply the proposition.

PROPOSITION 70 We have
\[ \|T\|_{C_p} \le \Bigl( \sum_{i=1}^\infty \sigma_i(T)^p \Bigr)^{\frac{1}{p}} \]
showing that the right hand side actually is a norm.

Proof. We write
\[ T = \sum_{j=1}^\infty \sigma_j(T) \, e_j \otimes f_j^* \quad \text{and} \quad S = \sum_{k=1}^n \sigma_k(S) \, g_k \otimes h_k^* \]
where $e, f, g, h$ are orthonormal sets and the singular values are written in decreasing order. Then
\[ \operatorname{tr}(TS) = \sum_{j=1}^\infty \sum_{k=1}^n \sigma_j(T) \sigma_k(S) \langle h_k, e_j \rangle \langle f_j, g_k \rangle . \]
Now we claim that
\[ \sum_{j=1}^\infty |\langle h_k, e_j \rangle|^2 \le 1 \quad \text{for all } k = 1, \dots, n \]
since $h_k$ is a unit vector and $(e_j)$ is an orthonormal set. Similarly
\[ \sum_{k=1}^n |\langle h_k, e_j \rangle|^2 \le 1 \quad \text{for all } j \in \mathbb{N} \]
since $e_j$ is a unit vector and $(h_k)$ is an orthonormal set. We can make similar estimates on $|\langle f_j, g_k \rangle|^2$. Thus, putting $a_{j,k} = |\langle h_k, e_j \rangle \langle f_j, g_k \rangle|$ we have that (4.3) and (4.4) hold by applications of the Cauchy–Schwarz inequality. Therefore
\[ |\operatorname{tr}(TS)| \le \sum_{k=1}^n \sigma_k(T) \sigma_k(S) \le \Bigl( \sum_{k=1}^\infty \sigma_k(T)^p \Bigr)^{\frac{1}{p}} \Bigl( \sum_{k=1}^n \sigma_k(S)^{p'} \Bigr)^{\frac{1}{p'}} . \]
This proves the result.

PROPOSITION 71 Let $R$ be a contraction and $T \in C_p$. Then we have $\|RT\|_{C_p}, \|TR\|_{C_p} \le \|T\|_{C_p}$.

Proof. We repeat the proof of Proposition 70. We work with $RT$ and obtain
\[ \operatorname{tr}(RTS) = \sum_{j=1}^\infty \sum_{k=1}^n \sigma_j(T) \sigma_k(S) \langle h_k, Re_j \rangle \langle f_j, g_k \rangle \]
and once again
\[ \sum_{k=1}^n |\langle h_k, Re_j \rangle|^2 \le 1 \quad \text{for all } j \in \mathbb{N} \]
since $Re_j$ is a vector of norm $\le 1$ and $(h_k)$ is an orthonormal set. Similarly
\[ \sum_{j=1}^\infty |\langle h_k, Re_j \rangle|^2 = \sum_{j=1}^\infty |\langle R^* h_k, e_j \rangle|^2 \le 1 \quad \text{for all } k = 1, \dots, n \]
since $R^* h_k$ is a vector of norm $\le 1$ and $(e_j)$ is an orthonormal set. Taking sups over suitable $S$, we get the result. The proof for $TR$ is similar.

EXERCISE
• Show that $\|T\|_{C_1} = \sup |\operatorname{tr}(TS)|$ as $S$ runs over finite rank contractions.
• Deduce that $\|TR\|_{C_1} \le \|T\|_{C_p} \|R\|_{C_{p'}}$ for $1 \le p < \infty$. □

5 Interpolation

There are various methods of interpolation. The most prevalent are the complex method, Marcinkiewicz interpolation and the real method of Lions and Peetre. We start with the idea behind the complex method.
LEMMA 72 (THREE LINES LEMMA) Let $\varphi$ be a bounded continuous function in $0 \le \Re z \le 1$, analytic in $0 < \Re z < 1$. Suppose that $|\varphi(z)| \le M_0$ for $\Re z = 0$ and $|\varphi(z)| \le M_1$ for $\Re z = 1$. Then $|\varphi(z)| \le M_0^{1-t} M_1^t$ for $\Re z = t$ and $0 \le t \le 1$.

Proof. Let $\epsilon > 0$ and set $\varphi_\epsilon(z) = M_0^{z-1} M_1^{-z} \varphi(z) \exp(\epsilon(z^2 - 1))$. Then $|\varphi_\epsilon(z)| \le 1$ on the boundary of the strip $0 \le \Re z \le 1$ and it vanishes at infinity. Therefore by applying the maximum modulus principle we find that $|\varphi_\epsilon(z)| \le 1$ on the strip $0 \le \Re z \le 1$. The result follows by letting $\epsilon$ tend to zero.

Note that the maximum modulus result for the strip is proved by means of conformally mapping the strip to, say, a disk. It is important that the resulting function should be continuous on the boundary of the disk. This is assured in our case since the function $\varphi_\epsilon$ is tending to zero at infinity.

LEMMA 73 Let $f$ be a function in $L^p$ of norm 1 where $p$ is strictly between $p_0$ and $p_1$. Define $t \in \,]0, 1[$ by $p^{-1} = (1 - t) p_0^{-1} + t p_1^{-1}$ and let $\alpha = p(p_0 - p_1)/(p_0 p_1)$. Then set
\[ f_z = f |f|^{\alpha(z - t)} . \]
Then $f_z$ is a function in $L^{p_0}$ of norm 1 for $\Re z = 0$ and $f_z$ is a function in $L^{p_1}$ of norm 1 for $\Re z = 1$. Moreover $f_t = f$.

Note that in some sense, the function $z \mapsto f_z$ is analytic. We illustrate the complex method with an example.

THEOREM 74 (THE HAUSDORFF–YOUNG THEOREM) If $f \in L^p(\mathbb{R})$ for $1 \le p \le 2$, then $\hat{f} \in L^{p'}(\hat{\mathbb{R}})$.

Proof. Of course, the theorem is trivial for $p = 1$ and is a consequence of the Plancherel theorem for $p = 2$. The idea is to deduce it for values of $p$ in between 1 and 2. For $f \in L^p(\mathbb{R})$ of unit norm and $g \in L^p(\hat{\mathbb{R}})$ of unit norm, it will be enough to establish that
\[ \Bigl| \int \hat{f}(u) g(u) \, du \Bigr| \le 1 \tag{5.1} \]
by duality. In fact, it will be enough to handle the special case in which $f$ and $g$ are step functions. Now use Lemma 73 to build $f_z$ and $g_z$ appropriately. Then we will get for $\varphi(z) = \int \hat{f}_z(u) g_z(u) \, du$
\[ |\varphi(z)| = \Bigl| \int \hat{f}_z(u) g_z(u) \, du \Bigr| \le \|\hat{f}_z\|_2 \|g_z\|_2 = \|f_z\|_2 \|g_z\|_2 \le 1 \]
for $\Re z = 0$ and
\[ |\varphi(z)| = \Bigl| \int \hat{f}_z(u) g_z(u) \, du \Bigr| \le \|\hat{f}_z\|_\infty \|g_z\|_1 \le \|f_z\|_1 \|g_z\|_1 \le 1 \]
for $\Re z = 1$.
Applying the three lines lemma and taking $z = t$ (with $t$ as defined in Lemma 73) we have the required conclusion (5.1).

Here is another application of interpolation.

LEMMA 75 Let $K$ be a kernel on a measure space $(X, \mathcal{F}, \mu)$ and suppose that
• $\operatorname{ess\,sup}_x \int |K(x, y)| \, d\mu(y) \le 1$,
• $\operatorname{ess\,sup}_y \int |K(x, y)| \, d\mu(x) \le 1$.
Then the operator $T$ defined by $Tf(x) = \int K(x, y) f(y) \, d\mu(y)$ is a contraction on $L^p$ for $1 \le p \le \infty$.

Proof. The hypotheses lead to the conclusion in the cases $p = 1$ and $p = \infty$. Interpolation does the rest.

This brings to mind two further things, neither of which is connected to interpolation. The first is Gerschgorin's theorem.

THEOREM 76 (GERSCHGORIN'S THEOREM) Let $A = (a_{jk})$ be an $n \times n$ matrix. Then the eigenvalues of $A$ lie in the union $\bigcup_{j=1}^n D_j$ where $D_j$ is the closed disc (in the complex plane) with centre $a_{jj}$ and radius $\sum_{k \ne j} |a_{j,k}|$.

The proof is standard linear algebra. A corollary is

COROLLARY 77 Let $A = (a_{jk})$ be an $n \times n$ matrix. Then the spectral radius of $A$ is bounded above by $\max_{1 \le j \le n} \sum_{k=1}^n |a_{j,k}|$.

In comparison with Lemma 75, only one of the conditions occurs in the hypotheses, but the spectral radius is bounded rather than an operator norm.

The second thing that is brought to mind is Cotlar's lemma, since the hypotheses are a little reminiscent of those in Lemma 75.

LEMMA 78 (COTLAR'S LEMMA) Let $T_j$ be continuous operators from a Hilbert space $H$ to a Hilbert space $K$ for $j = 1, 2, \dots, n$. Suppose that
• $\max_i \sum_{j=1}^n \|T_i T_j^*\|^{\frac{1}{2}} \le M$,
• $\max_i \sum_{j=1}^n \|T_i^* T_j\|^{\frac{1}{2}} \le M$.
Then $\bigl\| \sum_{j=1}^n T_j \bigr\| \le M$.

For the proof, see https://terrytao.wordpress.com/2011/05/25/the-cotlar-stein-lemma/

There are ways of extending this result to infinite sums and also to integrals.

E. M. Stein devised a nice trick to use with complex interpolation. Basically, one builds the desired operator into a complex analytic family. Consider the following question. For $\sigma$ the uniform measure on the unit sphere in $\mathbb{R}^n$ do we have the convolution estimate
\[ \|\sigma * f\|_{n+1} \le C \|f\|_{\frac{n+1}{n}} \, ? \tag{5.2} \]
To establish this, one embeds $\sigma$ into an analytic family of distributions (OK we still have to talk about distributions). The definition is
\[ d\sigma_z(x) = \frac{1}{\Gamma(z)} (1 - |x|^2)^{-1+z} \mathbb{1}_D(x) \, dx \]
where $D$ is the open unit ball $D = \{x \in \mathbb{R}^n ;\ |x| < 1\}$. For $\Re z > 0$ this is a perfectly good measure. We can compute its Fourier transform as
\[ \widehat{\sigma_z}(u) = 2^z |u|^{-\frac{n-2}{2} - z} J_{\frac{n-2}{2} + z}(|u|) \]
where $J_\alpha$ denotes a Bessel function. Since $\widehat{\sigma_z}$ has at worst polynomial growth at infinity, we can consider $\sigma_z$ as a distribution. Now putting $z = 0$, we find that $\widehat{\sigma_0}(u) = |u|^{-\frac{n-2}{2}} J_{\frac{n-2}{2}}(|u|)$ which happens to be the Fourier transform of $\sigma$. Thus $\sigma$ is embedded in this analytic family.

Now consider the case $\Re z = 1$; then $\frac{1}{\Gamma(z)} (1 - |x|^2)^{-1+z} \mathbb{1}_D(x)$ is a bounded function. Unfortunately, the $L^\infty$ bound may grow as $\Im z$ goes off to infinity, but not too badly. This depends on lower bounds for $|\Gamma(z)|$. When $\Re z = -\frac{n-1}{2}$, $\widehat{\sigma_z}$ is a bounded function since the Bessel functions decay like $|u|^{-\frac{1}{2}}$ at infinity. Again, the bounds may go off to infinity with $|\Im z|$ and this depends on precise estimates for the decay of Bessel functions. The estimates one obtains are
\[ |\langle g_z, \sigma_z * f_z \rangle| \le C_z \|f_z\|_1 \|g_z\|_1 \]
for $\Re z = 1$ and
\[ |\langle g_z, \sigma_z * f_z \rangle| = |\langle \widehat{g_z}, \widehat{\sigma_z} \widehat{f_z} \rangle| \le C_z \|f_z\|_2 \|g_z\|_2 \]
for $\Re z = -\frac{n-1}{2}$. There are a number of issues here. One is the analyticity of $\varphi(z) = \langle g_z, \sigma_z * f_z \rangle$. Another is the growth of the constants $C_z$. One needs to revisit the three lines lemma with a proof in which $\epsilon$ is constant (in fact large) and the growth along the sides of the strip is controlled by $e^{|\Im z|^2}$. Suffice it to say that these details can all be worked out. Applying the interpolation idea, one comes out with (5.2). For almost the same problem, check out Terry Tao's notes at https://terrytao.wordpress.com/2011/05/03/steins-interpolation-theorem/

5.1 Lorentz Spaces

Lorentz spaces developed out of the Marcinkiewicz Interpolation Theorem.
A typical application of the Marcinkiewicz Interpolation Theorem arises in relation to the Hardy–Littlewood maximal theorem. We'll take the centred version on the line
\[ Mf(x) = \sup_{h > 0} \frac{1}{2h} \int_{x-h}^{x+h} |f(t)| \, dt. \]
This satisfies an inequality of the form
\[ \operatorname{meas} \{x ;\ Mf(x) > t\} \le C \|f\|_1 t^{-1} \]
and also $\|Mf\|_\infty \le \|f\|_\infty$. A function $f$ that satisfies the $L^p$ Tchebychev inequality
\[ \operatorname{meas} \{x ;\ |f(x)| > t\} \le C t^{-p} \]
is said to be in weak $L^p$ and this is coded as the Lorentz space $L_{p,\infty}$. There is also a strong $L^p$ denoted $L_{p,1}$ and the usual $L^p$ space is the Lorentz space $L_{p,p}$.

The Marcinkiewicz Interpolation Theorem works for sublinear operators like the Hardy–Littlewood maximal operator. Sublinear operators are positive homogeneous, i.e. $T(tf) = |t| T(f)$, and subadditive, i.e. $T(f + g) \le T(f) + T(g)$. The complex method of interpolation works only for linear operators unless special techniques are used.

For $f$ a nice function on a measure space we define
\[ d_f(s) = \operatorname{meas}(\{|f| > s\}), \]
the distribution function of $f$, and then
\[ f^*(t) = \inf \{s > 0 ;\ d_f(s) \le t\}, \]
the equimeasurable decreasing rearrangement. The function $f^*$ is positive, decreasing, right-continuous and has the same distribution function as $f$. The map $f \mapsto f^*$ is positive homogeneous, but not subadditive. An example of this is furnished by $f = \mathbb{1}_{[0,1[}$, $g = \mathbb{1}_{[1,2[}$. Then $f + g = \mathbb{1}_{[0,2[} = (f + g)^*$. But $f^* = g^* = f$ and $f^* + g^* = 2 \cdot \mathbb{1}_{[0,1[}$.

If one wishes sublinearity, then one averages the decreasing rearrangement with the Hardy averaging operator
\[ Af(x) = \frac{1}{x} \int_0^x f(t) \, dt \]
defined on functions on $[0, \infty[$. It is well-known that $\|Af\|_p \le \frac{p}{p-1} \|f\|_p$ for $1 < p \le \infty$. The sublinearity of the mapping $f \mapsto Af^*$ follows from
\[ x Af^*(x) = \int_0^x f^*(t) \, dt = \sup_{\mu(A) \le x} \int_A |f(s)| \, d\mu(s) \tag{5.3} \]
where the sup is taken over all measurable subsets $A$ of measure $\le x$. Note that $f^* \le Af^*$.
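The counterexample above, and the way Hardy averaging repairs it, can be checked mechanically on step functions. The following is a minimal sketch in Python (standard library only), assuming functions on $[0, n[$ with Lebesgue measure that are constant on each unit interval $[j, j+1[$ and are stored as lists of values; the helper names `rearrange` and `hardy_average` are illustrative and not part of the notes. For such functions the decreasing rearrangement is obtained by sorting the absolute values downwards, and the Hardy average is sampled only at the integer points $x = 1, \dots, n$.

```python
def rearrange(values):
    """Decreasing rearrangement of the step function taking the value
    values[j] on [j, j+1): sort the absolute values in decreasing order."""
    return sorted((abs(v) for v in values), reverse=True)

def hardy_average(values):
    """Hardy averages (1/x) * integral from 0 to x of the step function,
    sampled at the integer points x = 1, ..., n."""
    averages, running = [], 0.0
    for x, v in enumerate(values, start=1):
        running += v
        averages.append(running / x)
    return averages

f = [1, 0]  # indicator of [0, 1)
g = [0, 1]  # indicator of [1, 2)
h = [a + b for a, b in zip(f, g)]  # indicator of [0, 2)

f_star, g_star, h_star = rearrange(f), rearrange(g), rearrange(h)
sum_of_stars = [a + b for a, b in zip(f_star, g_star)]

# Pointwise subadditivity of the rearrangement fails on [1, 2):
# there (f + g)* equals 1 while f* + g* equals 0.
fails_somewhere = any(u > v for u, v in zip(h_star, sum_of_stars))

# After Hardy averaging the inequality holds at the sampled points.
a_h = hardy_average(h_star)
a_sum = [a + b for a, b in zip(hardy_average(f_star), hardy_average(g_star))]
holds_after_averaging = all(u <= v for u, v in zip(a_h, a_sum))
```

Sampling at integer points is enough for an illustration here, because on each unit interval the integral of a step function is linear in $x$; it is not a proof of (5.3).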
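Another key inequality is
\[ \Bigl| \int f g \, d\mu \Bigr| \le \int_0^\infty f^*(t) g^*(t) \, dt. \]
The Lorentz spaces are defined by means of the quasinorms
\[ \|f\|_{p,q} = \Bigl( \int_0^\infty \bigl( t^{\frac{1}{p}} f^*(t) \bigr)^q \frac{dt}{t} \Bigr)^{\frac{1}{q}} \]
for finite $q$ and
\[ \|f\|_{p,\infty} = \sup_{t > 0} t^{\frac{1}{p}} f^*(t) \]
for $q$ infinite.

Next we give a proof of Hardy's inequality for nonnegative $f$. Let $g(y) = y^{\frac{1}{p}} f(y)$ and $\varphi(x) = x^{-\frac{1}{p'}} \mathbb{1}_{[1,\infty[}(x)$. Then $\|g\|_{L^p(\frac{dt}{t})} = \|f\|_p$ and $\|\varphi\|_{L^1(\frac{dt}{t})} = p'$. Convolving on the multiplicative group $]0, \infty[$ we get
\[ h(x) = \int_0^\infty f(y) y^{\frac{1}{p}} (x y^{-1})^{-\frac{1}{p'}} \mathbb{1}_{[1,\infty[}(x y^{-1}) \frac{dy}{y} = x^{-\frac{1}{p'}} \int_0^x f(y) \, dy \]
and $\|h\|_{L^p(\frac{dt}{t})} \le p' \|g\|_{L^p(\frac{dt}{t})}$. But
\[ \int (Af(x))^p \, dx = \int x^{-\frac{p}{p'}} \Bigl( \int_0^x f(y) \, dy \Bigr)^p \frac{dx}{x} = \int |h(x)|^p \frac{dx}{x} \le (p')^p \|f\|_p^p . \]
The adjoint of Hardy's inequality follows with a similar proof:
\[ A^* g(x) = \int_x^\infty \frac{g(y)}{y} \, dy, \qquad \|A^* g\|_p \le p \|g\|_p, \qquad 1 \le p < \infty. \]
Next we need a variant of this inequality also due to Hardy.

LEMMA 79 For $b > 0$ and $1 \le p < \infty$
\[ \Bigl( \int_0^\infty \Bigl( \int_x^\infty |f(t)| \, dt \Bigr)^p x^{b-1} \, dx \Bigr)^{\frac{1}{p}} \le \frac{p}{b} \Bigl( \int_0^\infty |f(t)|^p t^{p+b-1} \, dt \Bigr)^{\frac{1}{p}} . \]

Proof. We may assume $f \ge 0$. Let $g(y) = f(y) y^{1 + \frac{b}{p}}$ and $\varphi(x) = x^{\frac{b}{p}} \mathbb{1}_{]0,1]}(x)$. Then $\|g\|_{L^p(\frac{dt}{t})} = \bigl( \int_0^\infty |f(t)|^p t^{p+b-1} \, dt \bigr)^{\frac{1}{p}}$ and $\|\varphi\|_{L^1(\frac{dt}{t})} = \frac{p}{b}$ since $b > 0$ and $p < \infty$. As before, let $h$ be the convolution of $g$ and $\varphi$ on $]0, \infty[$. We get $\|h\|_{L^p(\frac{dt}{t})} \le \frac{p}{b} \|g\|_{L^p(\frac{dt}{t})}$ and
\[ h(x) = \int_{y=x}^\infty f(y) x^{\frac{b}{p}} \, dy. \]
The result follows.

Let $1 \le p < \infty$ and $1 \le q < \infty$. Then
\[ \begin{aligned}
t^{\frac{1}{p}} f^*(t) &= \Bigl( \frac{q}{p} \int_0^t \bigl( s^{\frac{1}{p}} f^*(t) \bigr)^q \frac{ds}{s} \Bigr)^{\frac{1}{q}} \\
&\le \Bigl( \frac{q}{p} \int_0^t \bigl( s^{\frac{1}{p}} f^*(s) \bigr)^q \frac{ds}{s} \Bigr)^{\frac{1}{q}} \\
&\le \Bigl( \frac{q}{p} \Bigr)^{\frac{1}{q}} \|f\|_{p,q} .
\end{aligned} \]
From this, it follows easily that $L_{p,q}$ is a subspace of $L_{p,r}$ for $1 \le q < r \le \infty$.

EXERCISE Show that simple functions are dense in $L_{p,q}$ for $1 \le p < \infty$, $1 \le q < \infty$. □

5.2 Lorentz space duality

EXERCISE If $1 \le p < \infty$, $f \in L_{p,1}$ and $g \in L_{p',\infty}$ we have $\bigl| \int f g \, d\mu \bigr| \le \|f\|_{p,1} \|g\|_{p',\infty}$. □

THEOREM 80 For $\sigma$-finite measure spaces and $1 \le p < \infty$, we have $L_{p,1}^* = L_{p',\infty}$.

Proof. If $p = 1$ this is just $(L^1)^* = L^\infty$. So we assume $p > 1$. The exercise gives half the result. As in the proof of $L^p$ duality, it suffices to work on a space of finite measure. Let $u$ be a continuous linear form on $L_{p,1}$.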
Then we define a measure $\nu$ by $\nu(A) = u(\mathbb{1}_A)$. One shows that $\nu \ll \mu$ and hence by the Radon–Nikodym theorem $u(\mathbb{1}_A) = \int \mathbb{1}_A g \, d\mu$ for suitable $g \in L^1$. It follows that $u(f) = \int f g \, d\mu$ for simple functions $f$ and hence by continuity for all $f \in L^\infty$. This is similar to the proof of $L^p$ duality. Take $f = \mathrm{sgn}(g) \mathbb{1}_{|g| > s}$. Then $f \in L^\infty$ and calculations give
\[ s \mu(\{|g| > s\}) \le \int f g \, d\mu \le \|u\| \|f\|_{p,1} = p \|u\| \bigl( \mu(\{|g| > s\}) \bigr)^{\frac{1}{p}} \]
and the result follows since we know that $\mu(\{|g| > s\})$ is finite.

EXERCISE If $1 < p < \infty$, $1 < q < \infty$, $f \in L_{p,q}$ and $g \in L_{p',q'}$ we have $\bigl| \int f g \, d\mu \bigr| \le \|f\|_{p,q} \|g\|_{p',q'}$. □

THEOREM 81 For $\sigma$-finite nonatomic measure spaces and $1 < p < \infty$ and $1 < q < \infty$, we have $L_{p,q}^* = L_{p',q'}$.

Proof. We follow the approach in Theorem 80, restricting to a set of finite measure. Again we have a measurable $g \in L^1$ such that $u(f) = \int f g \, d\mu$ for functions $f$ in $L^\infty$. Now go back and start again, choosing the sets of finite measure so that $|g|$ is bounded on them. This subterfuge ensures that $\|g\|_{p',q'}$ is finite.

Since the measure space is nonatomic, we can assume without loss of generality that the action takes place on an interval $[0, a[$ with $0 < a \le \infty$ and with Lebesgue measure. In this case, $g^*$ is actually a measure preserving permutation of $|g|$. Hence, we may assume without loss that $g^* = |g|$. Now, let
\[ f(t) = \mathrm{sgn}(g(t)) \int_{t/2}^\infty s^{\frac{q'}{p'} - 1} g^*(s)^{q' - 1} \frac{ds}{s} \]
and note that since $|f(t)|$ is decreasing in $t$ we actually have $f^*(t) = |f(t)|$. Let us rewrite Lemma 79 in the form
\[ \int_0^\infty \Bigl( \int_{t/2}^\infty h(s) \, ds \Bigr)^q t^{\frac{q}{p}} \frac{dt}{t} \le C \int_0^\infty h(s)^q s^{q + \frac{q}{p}} \frac{ds}{s} \tag{5.4} \]
for nonnegative $h$. Then we obtain using (5.4) with $h(s) = s^{\frac{q'}{p'} - 2} (g^*(s))^{q' - 1}$ that
\[ \begin{aligned}
\|f\|_{p,q}^q &= \int_0^\infty \Bigl( \int_{t/2}^\infty s^{\frac{q'}{p'} - 1} g^*(s)^{q' - 1} \frac{ds}{s} \Bigr)^q t^{\frac{q}{p}} \frac{dt}{t} \\
&\le C(p, q) \int_0^\infty \bigl( s^{\frac{1}{p'}} g^*(s) \bigr)^{q'} \frac{ds}{s} \\
&= C(p, q) \|g\|_{p',q'}^{q'}
\end{aligned} \]
noting that $q(q' - 1) = q'$ and that
\[ \Bigl( \frac{q'}{p'} - 2 \Bigr) q + q + \frac{q}{p} = \frac{q q'}{p'} - q + \frac{q}{p} = \frac{q + q'}{p'} - \frac{q}{p'} = \frac{q'}{p'} . \]
Thus
\[ \int_0^\infty f^*(t) g^*(t) \, dt = u(f) \le \|u\| \|f\|_{p,q} \le C(p, q) \|u\| \|g\|_{p',q'}^{\frac{q'}{q}} . \]
On the other hand
\[ \begin{aligned}
\int_0^\infty f^*(t) g^*(t) \, dt &\ge \int_0^\infty \Bigl( \int_{t/2}^t s^{\frac{q'}{p'} - 1} g^*(s)^{q' - 1} \frac{ds}{s} \Bigr) g^*(t) \, dt \\
&\ge \int_0^\infty g^*(t)^{q'} \Bigl( \int_{t/2}^t s^{\frac{q'}{p'} - 1} \frac{ds}{s} \Bigr) dt \\
&= C(p, q) \|g\|_{p',q'}^{q'} .
\end{aligned} \]
Since $q' - \frac{q'}{q} = 1$, combining the two estimates bounds $\|g\|_{p',q'}$ by a constant multiple of $\|u\|$. Hence the result.

Next, we look at an atomic description of Lorentz spaces that we learnt from Terence Tao's webpages. See http://www.math.ucla.edu/~tao/preprints/Expository/interpolation.dvi. Consider functions $f$ on a measure space that admit a decomposition
\[ f = \sum_{k \in \mathbb{Z}} c_k 2^{-\frac{k}{p}} f_k \tag{5.5} \]
where $c \in \ell^q(\mathbb{Z})$, $|f_k| \le 1$ and $f_k$ is carried on a set of measure at most $2^k$. We claim that for $1 \le p < \infty$ and $1 \le q \le \infty$ every function in $L_{p,q}$ admits such a decomposition. To see this, let $f \in L_{p,q}$; then it will suffice to decompose $f^*$. We set
\[ f_k = \frac{1}{f^*(2^k)} f^* \mathbb{1}_{[2^k, 2^{k+1}[} \quad \text{and} \quad c_k = 2^{\frac{k}{p}} f^*(2^k). \]
Then
\[ \sum_{k \in \mathbb{Z}} c_k^q = \sum_{k \in \mathbb{Z}} \bigl( 2^{\frac{k}{p}} f^*(2^k) \bigr)^q \sim \|f\|_{p,q}^q \]
by the same kind of calculations involved in the integral test for convergence.

Also, for $1 < p < \infty$, a function given by (5.5) lies in $L_{p,q}$. To see this, let $f$ have such a decomposition. It will suffice to show that $f^* \in L_{p,q}$. Since $f^* \le Af^*$ pointwise, it will suffice to show that $Af^* \in L_{p,q}$. But we have
\[ Af^* \le \sum_{k \in \mathbb{Z}} |c_k| 2^{-\frac{k}{p}} Af_k^* \]
which follows from (5.3). We have an explicit bound for $Af_k^*$:
\[ Af_k^*(t) \le \begin{cases} 1 & \text{if } 0 \le t \le 2^k, \\ 2^k t^{-1} & \text{if } 2^k \le t. \end{cases} \]
For $2^\ell \le t < 2^{\ell+1}$, we have
\[ t^{\frac{1}{p}} Af^*(t) \le 2^{\frac{1}{p}} \Bigl( \sum_{k > \ell} |c_k| 2^{-\frac{k - \ell}{p}} + \sum_{k \le \ell} |c_k| 2^{-\frac{\ell - k}{p'}} \Bigr) = \gamma_\ell \]
where $\gamma$ is in $\ell^q(\mathbb{Z})$, being the convolution over $\mathbb{Z}$ of a function in $\ell^q$ and a function in $\ell^1$. The claim follows.

THEOREM 82 Let $1 \le p_0 < p < p_1 \le \infty$ and let $T$ be a linear restricted weak type operator of types $(p_0, p_0)$ and $(p_1, p_1)$. Then $T$ maps $L_{p,r}$ to $L_{p,r}$ for $1 \le r \le \infty$ and in particular maps $L^p$ to $L^p$.

Proof. The hypotheses mean that $\|Tf\|_{p_j,\infty} \le C_j \|f\|_{p_j,1}$ for $j = 0, 1$. We work with the bilinear form $\Lambda(f, g) = \int g \, Tf \, d\mu$. Let
\[ f = \sum_{k \in \mathbb{Z}} c_k 2^{-\frac{k}{p}} f_k \quad \text{and} \quad g = \sum_{\ell \in \mathbb{Z}} d_\ell 2^{-\frac{\ell}{p'}} g_\ell \]
with $c \in \ell^r$, $d \in \ell^{r'}$ as above.
We have
\[ \Lambda(f, g) = \sum_{k,\ell} c_k d_\ell 2^{-\frac{k}{p} - \frac{\ell}{p'}} \Lambda(f_k, g_\ell). \]
We have a choice of estimates
\[ |\Lambda(f_k, g_\ell)| \le C \begin{cases} 2^{\frac{k}{p_0} + \frac{\ell}{p_0'}} \\[2pt] 2^{\frac{k}{p_1} + \frac{\ell}{p_1'}} \end{cases} \]
from which we can choose the smaller. This leads to
\[ |\Lambda(f, g)| \le C \sum_{k,\ell} |c_k| |d_\ell| 2^{-\alpha |k - \ell|} \]
for some suitable $\alpha > 0$. The result follows as above.

We think that this theorem can easily be pushed to the case of sublinear operators and also to the off-diagonal type of interpolation.

6 Gelfand's Theory of Commutative Banach Algebras

A commutative Banach algebra $A$ is a Banach space together with a continuous multiplication so that $A$ becomes a linear commutative associative algebra. The continuity of the multiplication amounts to the existence of a constant $C$ such that
\[ \|xy\| \le C \|x\| \|y\|, \qquad \forall x, y \in A. \]
The algebra $A$ is said to be unital if it has an identity element which we will denote $\mathbb{1}_A$. In a unital algebra, it may be false that $\|\mathbb{1}_A\| = 1$, but we can always renorm the algebra with an equivalent norm that has this property. For this we use the multiplier norm
\[ \|x\|_M = \sup_{\|y\| \le 1} \|xy\|. \]
In general this may fail to define an equivalent norm, but in the unital case it does because
\[ \|x\|_M = \sup_{\|y\| \le 1} \|xy\| \le \sup_{\|y\| \le 1} C \|x\| \|y\| \le C \|x\| \]
and
\[ \|x\| = \|x \mathbb{1}_A\| \le \|\mathbb{1}_A\| \|x\|_M . \]
Generally we will therefore work with a norm that has the property that $\|\mathbb{1}_A\| = 1$. The multiplier norm has an even more important property, namely that
\[ \|xy\|_M \le \|x\|_M \|y\|_M , \]
in other words we may always assume without loss of generality that $C = 1$ (at least if we are only interested in properties that are preserved under norm equivalence).

From now on we are interested only in the case of unital commutative Banach algebras over $\mathbb{C}$. The real case presents substantially more difficulties.

The spectrum of an element $x \in A$ is a subset of the complex plane defined by
\[ \sigma(x) = \{\lambda ;\ \lambda \in \mathbb{C},\ (\lambda \mathbb{1}_A - x)^{-1} \text{ fails to exist in } A\}. \]
The spectrum has the following properties:
1. $\lambda \in \sigma(x)$ implies $|\lambda| \le \|x\|$.
2. $\sigma(x)$ is closed.
3. $\sigma(x)$ is nonempty.
The proofs are easy.
First if $|\lambda| > \|x\|$, then we can construct
\[ (\mathbb{1}_A - \lambda^{-1} x)^{-1} = \sum_{n=0}^\infty \lambda^{-n} x^n , \]
the right hand side being a norm convergent sum. It follows then that $(\lambda \mathbb{1}_A - x)^{-1}$ exists.

The second assertion is similar. If $\mu \notin \sigma(x)$, then $(\mu \mathbb{1}_A - x)^{-1}$ exists. Now we consider $\lambda$ very close to $\mu$ and observe
\[ \lambda \mathbb{1}_A - x = (\lambda - \mu) \mathbb{1}_A + (\mu \mathbb{1}_A - x) = \bigl( \mathbb{1}_A + (\lambda - \mu)(\mu \mathbb{1}_A - x)^{-1} \bigr) (\mu \mathbb{1}_A - x). \]
Provided $|\lambda - \mu| \, \|(\mu \mathbb{1}_A - x)^{-1}\| < 1$, it will be possible to construct $(\lambda \mathbb{1}_A - x)^{-1}$ with a geometric series argument. So, the complement of $\sigma(x)$ is open and therefore $\sigma(x)$ is closed.

For the third assertion, suppose the contrary. Then $(\lambda \mathbb{1}_A - x)^{-1}$ exists for all complex $\lambda$. Let $u$ be a continuous linear functional on $A$ and consider the complex-valued function
\[ \lambda \mapsto u((\lambda \mathbb{1}_A - x)^{-1}). \]
It is clear (actually by using the arguments that we have used in proving the first two assertions) that this is a holomorphic function in the whole complex plane (a so-called entire function) and also that it tends to zero at infinity since for $|\lambda| > \|x\|$
\[ \|(\lambda \mathbb{1}_A - x)^{-1}\| = \Bigl\| \sum_{n=0}^\infty \lambda^{-n-1} x^n \Bigr\| \le |\lambda|^{-1} (1 - |\lambda|^{-1} \|x\|)^{-1} = (|\lambda| - \|x\|)^{-1} . \]
It follows from the maximum modulus principle that such a function is identically zero. It then follows from the Hahn–Banach Theorem that $(\lambda \mathbb{1}_A - x)^{-1} = 0$ for all $\lambda$, which is complete nonsense since inverses can never be zero.

Having dealt with the spectrum, we now turn to the ideal structure of $A$. An ideal $I$ is said to be proper if $I \subsetneq A$. We assert that every proper ideal is contained in a maximal proper ideal. This is proved using a Zorn's Lemma argument. It is enough to show that every chain of proper ideals has an upper bound under set inclusion. Given a chain $\mathcal{C}$ of proper ideals, one simply takes
\[ B = \bigcup_{I \in \mathcal{C}} I. \]
It is easy to see that $B$ is an ideal. If it is not proper, then $\mathbb{1}_A \in B$. But then there exists $I \in \mathcal{C}$ such that $\mathbb{1}_A \in I$, contradicting the fact that $I$ is proper. (As soon as $\mathbb{1}_A \in I$, then $x = x \mathbb{1}_A \in I$ for every $x \in A$.) A similar argument shows that every maximal proper ideal is closed.
If $M$ is a maximal proper ideal, then it is clear that $\mathrm{cl}(M)$ is an ideal. So either $M = \mathrm{cl}(M)$ or $\mathrm{cl}(M)$ is not proper. In other words, either $M$ is closed or $M$ is dense. But the latter situation is not possible, since then we would be able to approximate $\mathbf{1}_A$ with elements of $M$. But any element of $A$ sufficiently close to $\mathbf{1}_A$ is invertible (by the geometric series argument yet again) and so $M$ would have to contain invertible elements and hence $\mathbf{1}_A$ itself, contradicting the fact that $M$ is proper.

Now let $M$ be a maximal proper ideal and consider $Q = A/M$. Then it is routine to check that $Q$ is a unital commutative Banach algebra in the quotient norm. Also, from ring theory, it cannot contain any ideals other than the zero ideal and $Q$ itself. (Let $\pi$ be the canonical projection $\pi : A \to Q$ and let $J$ be a nontrivial ideal of $Q$; then $\pi^{-1}(J)$ is a proper ideal of $A$ strictly bigger than $M$, a contradiction.) This implies in turn that every non-zero element of $Q$ is invertible. We claim that $Q = \mathbb{C}\mathbf{1}_Q$. Indeed, let $x \in Q$ be arbitrary and let $\lambda \in \sigma(x)$. Then $\lambda\mathbf{1}_Q - x$ fails to be invertible and must therefore be zero. So $x = \lambda\mathbf{1}_Q$.

This means then that every maximal proper ideal has codimension 1 and is the kernel of a continuous linear form $\varphi : A \to \mathbb{C}$. We are free to normalize $\varphi$ such that $\varphi(\mathbf{1}_A) = 1$. But now, let $x, y \in A$; then $x - \varphi(x)\mathbf{1}_A$ and $y - \varphi(y)\mathbf{1}_A$ are elements of $M$ since they are clearly in the kernel of $\varphi$. But now $(x - \varphi(x)\mathbf{1}_A)y = xy - \varphi(x)y$ is in $M$ and therefore also
\[ xy - \varphi(x)\varphi(y)\mathbf{1}_A = xy - \varphi(x)y + \varphi(x)\bigl(y - \varphi(y)\mathbf{1}_A\bigr). \]
It follows that this element is in the kernel of $\varphi$ and hence $\varphi(xy) = \varphi(x)\varphi(y)$. Such a $\varphi$ (with $\varphi(\mathbf{1}_A) = 1$) is called a multiplicative linear functional (mlf). Every maximal proper ideal is therefore the kernel of an mlf and conversely, it is obvious that the kernel of any mlf is a closed ideal of codimension one and hence a maximal proper ideal.

The next step in the saga is to define $M_A$, the space of all mlfs, and to give it a topology.
This is the Gelfand topology and it is simply the relative topology inherited from the weak* ($\sigma(A^*, A)$) topology. It turns out that $M_A$ is compact in this topology and it is also clearly Hausdorff. To see this, first of all we observe that any mlf has norm exactly one. Clearly $x - \varphi(x)\mathbf{1}_A$ is in the proper ideal $\ker(\varphi)$ and therefore not invertible. So $\varphi(x) \in \sigma(x)$ and hence $|\varphi(x)| \le \|x\|$. On the other hand $\varphi(\mathbf{1}_A) = 1$ and $\|\mathbf{1}_A\| = 1$. So $M_A$ can be specified as the subset of the unit ball of $A^*$ which satisfies the following closed conditions
\[ \varphi(xy) = \varphi(x)\varphi(y) \quad (x, y \in A), \qquad \varphi(\mathbf{1}_A) = 1, \]
each depending on only finitely many elements from $A$ (at most three). Since the unit ball of $A^*$ is compact for the $\sigma(A^*, A)$ topology and since $M_A$ is a $\sigma(A^*, A)$-closed subset of this ball, it follows that $M_A$ is itself $\sigma(A^*, A)$ compact.

We now write $\hat{x}(\varphi) = \varphi(x)$ and observe that $\hat{x}$ is a continuous function on $M_A$. The mapping $x \mapsto \hat{x}$ which maps from $A$ to $C(M_A)$ is called the Gelfand transform of $A$ and is an algebra homomorphism. It can happen that the Gelfand transform has a non-trivial kernel. We can even characterize the kernel of the Gelfand transform. It consists of all elements $x \in A$ such that $\sigma(x) = \{0\}$ or, from power series considerations, such that
\[ \limsup_{n\to\infty} \|x^n\|^{\frac{1}{n}} = 0. \]
This will be proved later. It is also the Jacobson radical of $A$ viewed as a ring. Also it is rarely the case that the Gelfand transform is onto or that the uniform norm of $\hat{x}$ is equivalent to the norm of $x$. In many situations the space $M_A$ is easy to understand, but there are also cases where its structure is totally mind boggling!

6.1 The non-unital case

We now come to the case that $A$ is a complex commutative Banach algebra, but it does not have a unit (identity) element. In that case, we simply adjoin an identity element and use the theory in the previous section. So, the new algebra has elements
\[ \tilde{A} = \{t\mathbf{1} + x;\ t \in \mathbb{C},\ x \in A\} \]
and we define
\[ (t\mathbf{1} + x) + (s\mathbf{1} + y) = (t+s)\mathbf{1} + (x+y), \]
\[ (t\mathbf{1} + x)(s\mathbf{1} + y) = ts\mathbf{1} + (ty + sx + xy). \]
For the norm on $\tilde{A}$ we simply take $\|t\mathbf{1} + x\| = |t| + \|x\|$ and it is straightforward to verify that this is actually a norm. If multiplication is continuous on $A$, then it is also continuous on $\tilde{A}$, and then one may replace this norm with the multiplier norm to get an equivalent submultiplicative norm. It is important to extend the norm to $\tilde{A}$ first. Taking the multiplier norm immediately does not work.

We now consider the ideals in $A$ and we need to add an extra condition. Let $I$ be an ideal in $A$. Then a modular unit (or modular identity) for $I$ is an element $u \in A$ such that $x - ux \in I$ for all $x \in A$. When we form the quotient algebra $A/I$, the image of $u$ will be an identity element. So, we say that an ideal is modular if it possesses a modular unit, and this is actually equivalent to $A/I$ having an identity element. We now have the following lemma.

LEMMA 83 Let $I$ be a modular ideal in $A$. Then there exists an ideal $J$ in $\tilde{A}$ such that $J \not\subseteq A$ and $I = J \cap A$. Conversely, if $J$ is an ideal in $\tilde{A}$ such that $J \not\subseteq A$, then $I = J \cap A$ is a modular ideal in $A$.

Proof. For the first assertion, let $I$ be a modular ideal of $A$ with modular unit $u$. Define $J = \{x;\ x \in \tilde{A},\ xu \in I\}$, clearly an ideal of $\tilde{A}$. Since $u \in A$, $u - u^2 \in I$, i.e. $(\mathbf{1} - u)u \in I$. So $\mathbf{1} - u \in J$. But $\mathbf{1} - u \notin A$, so $J \not\subseteq A$. It remains to show that $I = J \cap A$, and clearly $I \subseteq J \cap A$. So let $x \in J \cap A$. Then since $x \in J$, we have $xu \in I$, and since $x \in A$, we have $x - xu \in I$. Therefore $x = (x - xu) + xu \in I$. This completes the proof of the first assertion.

For the converse, it is clear that $I = J \cap A$ is an ideal in $A$. But $J \not\subseteq A$, so there is an element of $J$ of the form $\mathbf{1} - u$ with $u \in A$. Thus, for $x \in A$, $x - xu = x(\mathbf{1} - u) \in J$. But also $x$ and $xu$ are both elements of $A$ and hence so is $x - xu$. Thus $x - xu \in I$. We have shown that $u$ is a modular unit for $I$.
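The adjunction of an identity described above can be tried out on a small toy model. The following is a minimal sketch, not part of the notes: it assumes the hypothetical two-dimensional algebra $A$ spanned by $N$ and $N^2$ with $N^3 = 0$, stored as pairs `(a, b)`, with the norm $|a| + |b|$. All function names are illustrative only.

```python
# Hypothetical toy model (an assumption for illustration): A = span{N, N^2}
# with N^3 = 0. The pair (a, b) stands for a*N + b*N^2. A has no identity.

def mul_A(x, y):
    """Product in A: (a N + b N^2)(c N + d N^2) = (a c) N^2."""
    return (0, x[0] * y[0])

def add_A(x, y):
    """Componentwise sum in A."""
    return (x[0] + y[0], x[1] + y[1])

def scale_A(t, x):
    """Scalar multiple t * x in A."""
    return (t * x[0], t * x[1])

def norm_A(x):
    """The norm |a| + |b|, which is submultiplicative on this A."""
    return abs(x[0]) + abs(x[1])

def mul_unit(u, v):
    """Product in the unitization: (t1 + x)(s1 + y) = ts1 + (ty + sx + xy)."""
    (t, x), (s, y) = u, v
    return (t * s, add_A(add_A(scale_A(t, y), scale_A(s, x)), mul_A(x, y)))

def norm_unit(u):
    """The norm |t| + ||x|| on the unitization."""
    t, x = u
    return abs(t) + norm_A(x)

ONE = (1, (0, 0))          # the adjoined identity
u = (2, (1, -3))
v = (-1, (4, 5))
w = (3, (-2, 1))

uv = mul_unit(u, v)
print(uv, norm_unit(uv), norm_unit(u) * norm_unit(v))
```

In this particular model the norm on $A$ is already submultiplicative with $C = 1$, so the extended norm is submultiplicative as well and no passage to the multiplier norm is needed. For a general constant $C$ that step is still required.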
The consequence of the correspondence in Lemma 83 is that the maximal modular ideals of $A$ are in one-to-one correspondence with the maximal ideals of $\tilde{A}$ that are not contained in $A$. But since $A$ is itself a maximal ideal in $\tilde{A}$ because it has codimension one, the maximal modular ideal space (also denoted $M_A$) of $A$ is just the maximal ideal space $M_{\tilde{A}}$ with a single point $\varphi_0$ removed. We view this point as a “point at infinity”, so that $M_A$ is a locally compact Hausdorff space having $M_{\tilde{A}}$ as its one-point compactification. Of course, $\varphi_0$ is an mlf on $\tilde{A}$ vanishing on $A$ and hence must be given by $\varphi_0(\lambda\mathbf{1} + x) = \lambda$ for $x \in A$. Every other mlf on $\tilde{A}$ restricts to a (non-zero) mlf on $A$ and conversely, every mlf on $A$ extends to a unique mlf on $\tilde{A}$. For $x \in A$, we have $\varphi(x) \to \varphi_0(x) = 0$ as $\varphi \to \varphi_0$ in the $M_{\tilde{A}}$ topology, so its Gelfand transform $\hat{x}$ viewed as a function on $M_A$ vanishes at infinity. We see that $x \mapsto \hat{x}$ is a continuous algebra homomorphism from $A$ to $C_0(M_A)$.

6.2 Finding the Maximal Ideal Space

Usually this is either very easy or totally impossible.

EXAMPLE Let $A = C^1([0,1])$, the space of continuously differentiable functions on the unit interval. It is clearly an algebra with identity and the multiplication is continuous. Clearly the point evaluations $f \mapsto f(t)$ are mlfs for $t \in [0,1]$. It seems reasonable that these would be the only ones. How do we prove this? Let $I$ be some maximal ideal not of this form. Then, for each $t \in [0,1]$ there is a function $f_t \in I$ with $f_t(t) = 1$. Let $U_t = \{s;\ s \in [0,1],\ |f_t(s)| > \frac{1}{2}\}$. This is a neighbourhood of $t$. Applying compactness we have $t_1, \ldots, t_N$ such that the sets $U_{t_n}$ for $n = 1, \ldots, N$ cover $[0,1]$. Now, make the function $f$ as
\[ f = \sum_{n=1}^{N} \overline{f_{t_n}}\, f_{t_n} \]
and observe that $f > \frac{1}{4}$ everywhere on $[0,1]$. So the reciprocal $1/f$ is in $C^1$. But $f$ is in $I$ and hence so is $\mathbf{1}$. But this means that $I = A$, a contradiction. We leave the reader to check that the Gelfand topology is just the standard topology on $[0,1]$. □

EXAMPLE Here is another example, very different.
Let $K$ be a compact subset of $\mathbb{C}$. Consider the ring of polynomials, viewed as functions on $\mathbb{C}$, restrict them to $K$ and let $A$ be the uniform closure. Then $A$ is a closed subalgebra of $C(K)$ and it has an identity. The key to this algebra is that it is singly generated. We denote by $z$ the identity function on $K$. Then everything in $A$ is a limit of polynomials in $z$. So, if $\varphi$ is an mlf, then knowledge of $\varphi(z)$ essentially determines $\varphi$ everywhere on $A$. So $\zeta = \varphi(z) \in \mathbb{C}$ and for every polynomial $p$, we get $\varphi(p) = p(\zeta)$. We will have $\zeta \in M_A$ if and only if the map $p \mapsto p(\zeta)$ is continuous and indeed, in this case we will have
\[ |p(\zeta)| \le \sup_{z \in K} |p(z)| \quad \text{for all polynomials } p. \]
The $\zeta$ that satisfy this inequality form the polynomially convex hull $\hat{K}$ of $K$. It can be shown that $\mathbb{C} \setminus \hat{K}$ is the unbounded connected component of $\mathbb{C} \setminus K$. Again $M_A = \hat{K}$ with the usual topology. □

EXAMPLE Let $A = \ell^\infty = C(\mathbb{Z})$, the set of bounded two-sided sequences with the uniform norm. Again, $A$ is a commutative Banach algebra with identity. The maximal ideal space is horrendous. We clearly have $\mathbb{Z} \subseteq M_A$. We claim that this inclusion is dense. Suppose not. Then there is an mlf $\varphi$ which is not in the closure of $\mathbb{Z}$ interpreted as a subset of $M_A$ via the point evaluations. Now the topology of $M_A$ is the topology of convergence on finitely many elements of $A$, so there exists a neighbourhood of $\varphi$ defined by finitely many functions which avoids the closure of $\mathbb{Z}$. This means (after adding a suitable constant to each function if necessary) that there exist $N \in \mathbb{N}$ and functions $f_1, f_2, \ldots, f_N \in C(\mathbb{Z})$ such that $\varphi(f_n) = 0$ for $n = 1, 2, \ldots, N$ and the origin is not in the closure of the subset
\[ \{(f_1(k), f_2(k), \ldots, f_N(k));\ k \in \mathbb{Z}\} \]
in $\mathbb{C}^N$. Then $g = \sum_{n=1}^{N} \overline{f_n} f_n$ has $\varphi(g) = 0$ and yet on $\mathbb{Z}$, $g = \sum_{n=1}^{N} |f_n|^2$ is positive and bounded away from zero. It follows that $g$ is invertible and this is a contradiction. Therefore $M_A$ is a compactification of $\mathbb{Z}$ and in fact it is called the Stone–Čech compactification.
This is the largest possible compactification of $\mathbb{Z}$ and enjoys the following universal property. Let $K$ be a compact topological space into which $\mathbb{Z}$ is mapped injectively and densely (i.e. $K$ is a compactification of $\mathbb{Z}$). Then there is a mapping $\pi : M_A \to K$ which is continuous and onto (but not in general one-to-one) such that the diagram
\[ \begin{array}{ccc} \mathbb{Z} & \longrightarrow & M_A \\ & \searrow & \downarrow \pi \\ & & K \end{array} \]
commutes. The Stone–Čech compactification is close to being incomprehensible and we should not waste too much time trying to understand it, although of course some mathematicians have spent many years trying to do so. □

6.3 The Spectral Radius Formula

THEOREM 84 In a commutative Banach algebra we have
\[ \|\hat{x}\|_\infty = \lim_{n\to\infty} \|x^n\|^{\frac{1}{n}}. \]
The quantity $\|\hat{x}\|_\infty$ is called the spectral radius of $x$.

Proof. Without loss of generality, we can assume that the algebra possesses an identity element. Clearly
\[ \|\hat{x}\|_\infty \le \|x^n\|^{\frac{1}{n}} \]
for all $n \in \mathbb{N}$ and hence
\[ \|\hat{x}\|_\infty \le \liminf_{n\to\infty} \|x^n\|^{\frac{1}{n}}. \]
It remains to show that
\[ \limsup_{n\to\infty} \|x^n\|^{\frac{1}{n}} \le \|\hat{x}\|_\infty. \]
If $\zeta \in \mathbb{C}$ and $|\zeta| < \|x\|^{-1}$, then $(\mathbf{1} - \zeta x)^{-1} = \sum_{k=0}^{\infty} \zeta^k x^k$ and
\[ x^n = \frac{1}{2\pi i} \oint_{|\zeta|=s} (\mathbf{1} - \zeta x)^{-1} \zeta^{-(n+1)}\, d\zeta \]
for $s < \|x\|^{-1}$. Now let $t < \|\hat{x}\|_\infty^{-1}$; then since $\zeta \mapsto (\mathbf{1} - \zeta x)^{-1}$ is analytic in $|\zeta| < \|\hat{x}\|_\infty^{-1}$, we also have
\[ x^n = \frac{1}{2\pi i} \oint_{|\zeta|=t} (\mathbf{1} - \zeta x)^{-1} \zeta^{-(n+1)}\, d\zeta. \]
Taking norms in the integral, this yields
\[ \|x^n\| \le t^{-n} \sup_{|\zeta|=t} \|(\mathbf{1} - \zeta x)^{-1}\| \]
and, since the sup is finite,
\[ \limsup_{n\to\infty} \|x^n\|^{\frac{1}{n}} \le t^{-1}. \]
But, now letting $t$ approach its maximum value $\|\hat{x}\|_\infty^{-1}$, we have the desired result.

6.4 Haar Measure

In this section we just assume the results that we need, but we will state them for the nonabelian case which will be discussed later. The proofs aren't really very instructive.

A locally compact abelian (LCA) group is an abelian group which is also a locally compact Hausdorff topological space. We demand that the multiplication map is continuous as a map from $G \times G$ to $G$ and also that group inversion is continuous as a map from $G$ to $G$. We will use additive notations.
An immediate consequence of the definitions is the following proposition.

PROPOSITION 85 Given a neighbourhood $V$ of $0$ in $G$, there is a symmetric neighbourhood $U$ of $0$ in $G$ such that $U + U \subseteq V$ (or $U \cdot U \subseteq V$ in the multiplicative case).

The basic fact that we need is given by the following theorem.

THEOREM 86 On every locally compact (LC) group $G$ there is a left translation invariant nonnegative regular Borel measure $\lambda$ such that $\lambda(U) > 0$ for every non-empty open subset $U$ of $G$ and $\lambda(K) < \infty$ for every compact subset $K$ of $G$. Furthermore, the measure $\lambda$ is unique up to a positive multiplicative constant.

By left translation invariant, we mean $\lambda(xB) = \lambda(B)$ (multiplicative notations) for every $x \in G$ and every Borel subset $B$ of $G$. The measure $\lambda$ is called the Haar measure of $G$. Note that the image $\rho$ of the Haar measure under group negation (inversion) is right translation invariant. It turns out that $\lambda = \Delta\rho$ where $\Delta$ is a continuous group homomorphism of $G$ into $]0,\infty[$ (multiplicative). $\Delta$ is called the modular function of $G$. We have $\Delta(x) = 1$ for all $x \in G$ if $G$ is abelian, discrete or compact and in some other circumstances. If $G$ is discrete, the Haar measure is just the counting measure. If $G$ is compact, then it is natural to normalize the Haar measure to be a probability measure. On additive abelian topological groups, we will denote the Haar measure by $\eta$.

6.5 Translation and Convolution

For a function $f$ defined on $G$ and $x \in G$ we define
\[ f_x(y) = f(y - x) \quad \text{for all } y \in G. \]
We call $f_x$ the translate of $f$ by $x$.

LEMMA 87 Let $1 \le p < \infty$. For $f \in L^p(G)$ we have that $x \mapsto f_x$ is a uniformly continuous map from $G$ to $L^p(G)$.

Proof. Suppose first that $f \in C_c(G)$, the space of continuous functions of compact support on $G$. Then $f$ is uniformly continuous. (Note that $G$ has a natural uniform structure coming from group subtraction.) Let $K$ be the support of $f$. Let $U$ be a compact neighbourhood of $0$. Then $K + U$ is also compact. Let it have measure $t$.
Let $\epsilon > 0$; then, since $f$ is uniformly continuous, there exists a compact neighbourhood $V$ of $0$ such that
\[ x \in V \implies \|f - f_x\|_\infty < \epsilon\, t^{-\frac{1}{p}}. \]
If $x \in U \cap V$, then the support of $f - f_x$ is contained in $K + U$ and it follows that $\|f - f_x\|_p < \epsilon$.

In the general case, let $h \in L^p$ and let $\epsilon > 0$. We first approximate $h$ by a function $f \in C_c(G)$ so that $\|f - h\|_p < \epsilon$. It is in this last step that the fact $p < \infty$ is used. Then, since the underlying measure is translation invariant, $\|f_x - h_x\|_p = \|f - h\|_p < \epsilon$ and we have our result: if $x \in U \cap V$, then
\[ \|h - h_x\|_p \le \|f - h\|_p + \|f - f_x\|_p + \|f_x - h_x\|_p < 3\epsilon. \]

We now define convolution. If $f$ and $g$ are suitable functions, we set
\[ f \star g(x) = \int f(x - y) g(y)\, d\eta(y). \]
If we make the substitution $y = x - z$ in this integral we get
\[ f \star g(x) = \int f(z) g(x - z)\, d\eta(z) = g \star f(x), \]
using that $\eta$ is both translation and reflection invariant.

LEMMA 88
1. If $f \in L^1$ and $g \in L^\infty$, then $f \star g$ is bounded and uniformly continuous.
2. If $f, g \in C_c(G)$ then $f \star g \in C_c(G)$.
3. If $1 < p < \infty$, $f \in L^p$, $g \in L^{p'}$, then $f \star g \in C_0$.
4. If $f, g \in L^1$, then $f \star g \in L^1$.

Proof. In 1), clearly $f \star g$ is bounded by $\|f\|_1\|g\|_\infty$. We rewrite
\[ f \star g(x) = \int f(x - y) g(y)\, d\eta(y) = \int h(y - x) g(y)\, d\eta(y) = \int h_x(y) g(y)\, d\eta(y) \]
where $h(x) = f(-x)$, and the uniform continuity is clear since $x \mapsto h_x$ is uniformly continuous for the $L^1$ norm.

For 2), $f \star g$ is clearly continuous by 1). Also $\mathrm{supp}(f \star g) \subseteq \mathrm{supp}(f) + \mathrm{supp}(g)$. Note that $\mathrm{supp}(f) + \mathrm{supp}(g)$ is the continuous image of $\mathrm{supp}(f) \times \mathrm{supp}(g)$ under the addition map $G \times G \to G$ and hence is compact.

For 3), proceed as in 1). We see that $f \star g$ is bounded by $\|f\|_p\|g\|_{p'}$ by using Hölder's inequality. By 2) $f \star g$ is a uniform limit of continuous functions of compact support. Hence $f \star g \in C_0$.

For 4), we start by observing that if $U$ is open in $G$, then $\{(x, y);\ x \in G,\ y \in G,\ x - y \in U\}$ is an open subset of $G \times G$. It follows that if $B$ is a Borel set in $G$, then $\{(x, y);\ x \in G,\ y \in G,\ x - y \in B\}$ is Borel in $G \times G$.
So, replacing both $f$ and $g$ with Borel versions, we see that $(x, y) \mapsto f(x - y)$ and $(x, y) \mapsto f(x - y)g(y)$ are Borel functions on $G \times G$. By Fubini's Theorem, this last function is absolutely integrable on the product space because
\[ \iint |g(y) f(x - y)|\, d\eta(x)\, d\eta(y) = \|f\|_1 \int |g(y)|\, d\eta(y) = \|f\|_1\|g\|_1 < \infty. \]
It now follows that $f \star g(x) = \int f(x - y) g(y)\, d\eta(y)$ is a measurable function (finite almost everywhere) for the completion of the Borel $\sigma$-field with respect to $\eta$. It also follows from Fubini's Theorem that $f \star g \in L^1$ and that $\|f \star g\|_1 \le \|f\|_1\|g\|_1$. It is an exercise to check that different versions of $f$ and $g$ yield the same element $f \star g$ of $L^1$.

Note: OK, so I lied. The problem with the proof of 4) above is that one of the hypotheses of Fubini's Theorem is that the underlying measure space be $\sigma$-finite. Unfortunately not all LCA groups are $\sigma$-finite; for example, any discrete uncountable abelian group will fail to be $\sigma$-finite. We would have no difficulty handling the case of discrete groups, because $L^1$ functions on such groups would have to be carried by countable sets and actually by countable subgroups. In a general LCA group $G$, we take an open relatively compact symmetric neighbourhood $U$ of $0$ and consider $U_n = U + U + \cdots + U$ with $n$ summands. Note that $U_n$ is open and relatively compact. Now consider
\[ G_0 = \bigcup_{n=0}^{\infty} U_n, \]
an open subgroup of $G$ which is $\sigma$-finite. But an open subgroup of $G$ is also closed (because it is the complement of the union of all the cosets not equal to the subgroup itself) and it follows that the quotient $G/G_0$ is discrete. You can now show that given $f, g \in L^1(G)$, there is a $\sigma$-finite open and closed subgroup $H$ of $G$ such that in fact $f$ and $g$ are carried on $H$. Now you apply the argument in 4) above to $H$. We reserve the right in these notes to tell this same lie again without comment.

We now have the following theorem which is easy to check.

THEOREM 89 For $G$ an LCA group, $L^1(G)$ is a commutative Banach algebra with convolution multiplication.
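The simplest instance of this theorem can be checked by hand or by machine. The following is a minimal sketch, assuming the finite cyclic group $\mathbb{Z}_n$ with counting measure as Haar measure, so that $L^1(G)$ is just the set of lists of length $n$. The sample vectors and function names are chosen here for illustration only.

```python
def convolve(f, g):
    """Cyclic convolution (f * g)(x) = sum over y of f(x - y) g(y) on Z_n."""
    n = len(f)
    return [sum(f[(x - y) % n] * g[y] for y in range(n)) for x in range(n)]

def norm1(f):
    """The L^1 norm with respect to counting measure."""
    return sum(abs(v) for v in f)

# Sample elements of L^1(Z_5); integer entries keep the arithmetic exact.
f = [1, -2, 0, 3, 1]
g = [2, 0, -1, 1, 4]
h = [0, 1, 1, -3, 2]
delta0 = [1, 0, 0, 0, 0]   # the point mass at 0

fg = convolve(f, g)
print(fg, norm1(fg), norm1(f) * norm1(g))
```

On these samples the convolution is commutative and associative, satisfies $\|f \star g\|_1 \le \|f\|_1\|g\|_1$, and the point mass at $0$ acts as an identity.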
If $G$ is discrete, then $\delta_0$ is an identity element. It turns out that if $L^1(G)$ has an identity element, then $G$ is discrete, but this is not too obvious.

A character $\chi$ on $G$ is a continuous group homomorphism into the multiplicative group of unimodular complex numbers. We will denote the set of all characters on $G$ by $\Gamma$. We can give $\Gamma$ the structure of a group in the obvious way. We will use additive notations for consistency even though they look a trifle strange.
\[ (-\chi)(x) = \overline{\chi(x)}, \qquad (\chi_1 + \chi_2)(x) = \chi_1(x)\chi_2(x). \]
The Fourier transform $\hat{f}$ of $f \in L^1(G)$ is now given by
\[ \hat{f}(\chi) = \int f(x)\overline{\chi(x)}\, d\eta(x). \]
This is a linear functional on $L^1(G)$ and furthermore multiplicative:
\begin{align*}
\int f \star g(x)\overline{\chi(x)}\, d\eta(x) &= \iint f(x - y) g(y)\, d\eta(y)\, \overline{\chi(x)}\, d\eta(x) \tag{6.1} \\
&= \iint f(x - y)\overline{\chi(x - y)}\, \overline{\chi(y)} g(y)\, d\eta(y)\, d\eta(x) \\
&= \int f(z)\overline{\chi(z)}\, d\eta(z) \int \overline{\chi(y)} g(y)\, d\eta(y) = \hat{f}(\chi)\hat{g}(\chi).
\end{align*}
Note also that $f \star \chi = \hat{f}(\chi)\chi$.

THEOREM 90 Every mlf on $L^1(G)$ is given by a character as in (6.1).

Proof. Every bounded linear functional on $L^1$ is given by an $L^\infty$ function. So, every non-zero mlf $\varphi$, which necessarily has norm 1, would have to be given by a function $h \in L^\infty$ with $\|h\|_\infty = 1$ by
\[ \varphi(f) = \int f(x) h(x)\, d\eta(x). \]
Now
\[ \varphi(f) \int g(y) h(y)\, d\eta(y) = \varphi(f)\varphi(g) = \varphi(f \star g) = \varphi\Bigl(\int f_y\, g(y)\, d\eta(y)\Bigr) = \int \varphi(f_y) g(y)\, d\eta(y) \]
and this holds for all $g \in L^1$. Therefore
\[ \varphi(f) h(y) = \varphi(f_y) \tag{6.2} \]
for almost all $y$. Choosing $f$ so that $\varphi(f) \ne 0$ and since $y \mapsto f_y$ is continuous, we see that $h$ has a continuous version. Replacing $h$ with its continuous version, it is now clear that (6.2) holds for all $y \in G$ (a conull set must be dense). Now we have
\[ \varphi(f) h(x + y) = \varphi(f_{x+y}) = \varphi(f_x) h(y) = \varphi(f) h(x) h(y) \]
giving $h(x + y) = h(x)h(y)$. Put now $x = y = 0$ to get $h(0) = 0$ or $1$. But $h(0) = 0$ implies that $h$ and hence $\varphi$ vanishes identically and hence we must have $h(0) = 1$. But now $h(x)h(-x) = 1$ and, since $|h| \le 1$, it follows that $|h(x)| = 1$ for all $x \in G$.

6.6 The Dual Group

So, on $L^1(G)$, the Gelfand transform and the Fourier transform are the same. We note that if $G$ is discrete, then $L^1(G)$ has an identity and $\Gamma$ is compact.
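For a finite cyclic group all of this can be computed explicitly. The sketch below is an illustration only: it assumes $G = \mathbb{Z}_3$ with counting measure, the characters $\chi_k(x) = e^{2\pi i k x/3}$, and the Fourier transform taken with the complex conjugate of the character. It checks multiplicativity on one pair of elements and compares $\|x^{m}\|_1^{1/m}$ for $m = 64$ with the sup norm of the transform, in the spirit of the spectral radius formula.

```python
import cmath

def convolve(f, g):
    """Cyclic convolution on Z_n."""
    n = len(f)
    return [sum(f[(x - y) % n] * g[y] for y in range(n)) for x in range(n)]

def character(k, n):
    """Values of the character chi_k(x) = exp(2 pi i k x / n) of Z_n."""
    return [cmath.exp(2j * cmath.pi * k * x / n) for x in range(n)]

def fourier(f):
    """hat f(chi_k) = sum over x of f(x) * conj(chi_k(x)), counting measure."""
    n = len(f)
    result = []
    for k in range(n):
        chi = character(k, n)
        result.append(sum(f[x] * chi[x].conjugate() for x in range(n)))
    return result

def norm1(f):
    return sum(abs(v) for v in f)

x = [1, -1, 0]
y = [2, 0, 5]

# Multiplicativity: the transform of a convolution is the product of transforms.
lhs = fourier(convolve(x, y))
rhs = [a * b for a, b in zip(fourier(x), fourier(y))]
mult_error = max(abs(a - b) for a, b in zip(lhs, rhs))

# Sup norm of the Gelfand (Fourier) transform of x; here it equals sqrt(3).
spectral_radius = max(abs(v) for v in fourier(x))

# The 64-fold convolution power of x, computed exactly in integers.
power = x
for _ in range(63):
    power = convolve(power, x)
root_64 = norm1(power) ** (1.0 / 64)

print(mult_error, spectral_radius, root_64, norm1(x))
```

Here $\|x\|_1 = 2$ while the spectral radius is $\sqrt{3}$, so this is a case where the Gelfand transform is not isometric even though it is injective.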
If $G$ is compact, then $\eta$ has finite measure. Normally it is normalized to have total mass 1 in this case. We have
\[ \int \chi(x)\, d\eta(x) = \begin{cases} 1 & \text{if } \chi \text{ is the zero element of } \Gamma, \\ 0 & \text{otherwise,} \end{cases} \]
since if $\chi = 1$, the first assertion is obvious. Otherwise there is an element $y \in G$ such that $\chi(y) \ne 1$. Then
\[ \int \chi(x)\, d\eta(x) = \int \chi(x + y)\, d\eta(x) = \chi(y) \int \chi(x)\, d\eta(x) \]
so that
\[ (1 - \chi(y)) \int \chi(x)\, d\eta(x) = 0. \]
Note that in this case, the characters are themselves elements of $L^1(G)$. Thus $\hat{\chi}(\psi) = 1$ if $\chi = \psi$ and $= 0$ otherwise. Since $\hat{\chi}$ is a continuous function on $\Gamma$, it follows that $\Gamma$ is discrete.

THEOREM 91
1. $(x, \chi) \mapsto \chi(x)$ is jointly continuous $G \times \Gamma \to \mathbb{T}$.
2. Let $K$ and $C$ be compact in $G$ and $\Gamma$ respectively; then for $t > 0$
\[ N(K, t) = \{\chi;\ |\chi(x) - 1| < t \text{ for all } x \in K\} \]
\[ N(C, t) = \{x;\ |\chi(x) - 1| < t \text{ for all } \chi \in C\} \]
are open in $\Gamma$ and $G$ respectively.
3. The sets $N(K, t)$ and their translates form a base for the topology of $\Gamma$.
4. $\Gamma$ is an LCA group.

Proof. For 1) let $f \in L^1(G)$. We know that $x \mapsto f_x$ is continuous from $G$ to $L^1(G)$. So, since the Gelfand transform is continuous, $x \mapsto \widehat{f_x}$ is continuous from $G$ to $C_0(\Gamma)$. But
\[ \widehat{f_x}(\chi) = \overline{\chi(x)}\, \hat{f}(\chi) \]
and it follows that $(x, \chi) \mapsto \chi(x)$ is jointly continuous on the set $\{(x, \chi);\ x \in G,\ \hat{f}(\chi) \ne 0\}$. But, for each $\chi \in \Gamma$ it is easy to construct $f \in L^1(G)$ such that $\hat{f}(\chi) \ne 0$ and we see that $(x, \chi) \mapsto \chi(x)$ is jointly continuous on $G \times \Gamma$.

Next we prove 2). Let $C$ be compact in $\Gamma$ and let $x_0 \in N(C, t)$. Then $|\chi(x_0) - 1| < t$ for all $\chi \in C$. So, for each $\chi$ there is an open neighbourhood $V_\chi$ of $\chi$ in $\Gamma$ and an open neighbourhood $U_\chi$ of $x_0$ in $G$ such that $|\psi(x) - 1| < t$ for all $\psi \in V_\chi$ and $x \in U_\chi$. Finitely many such neighbourhoods $V_\chi$ cover $C$. Let $U$ be the open intersection of the corresponding $U_\chi$. Then it is clear that $x_0 \in U \subseteq N(C, t)$. The other assertion is proved similarly. Note that 2) states that the Gelfand topology in $\Gamma$ is finer than the compact open mapping topology (i.e. the topology of uniform convergence on the compact sets).
For 3), we have to show the converse and for this it is enough to show that each Gelfand transform $\hat{f}$ for $f \in L^1(G)$ is continuous for the compact open topology. If the function $f$ has compact support, this is obvious since
\[ |\hat{f}(\chi_1) - \hat{f}(\chi_2)| \le \int_{\mathrm{supp}(f)} |\chi_1(x) - \chi_2(x)||f(x)|\, d\eta(x) \le \|f\|_1 \sup_{x \in \mathrm{supp}(f)} |\chi_1(x) - \chi_2(x)|. \]
But any $L^1$ function can be approximated in $L^1$ norm by $L^1$ functions of compact support and the corresponding transforms converge uniformly. This completes the proof of 3).

To prove 4), we simply observe that the compact open topology on $\Gamma$ is clearly a group topology. This really amounts to observing that for every compact subset $K$ of $G$ and every $t > 0$ we have
\[ N(K, t/2) - N(K, t/2) \subseteq N(K, t), \]
or equivalently that the standard topology on $\mathbb{T}$ is a group topology.

The group $\Gamma$ is called the dual group of $G$.

6.7 Summability Kernels

Here we give the theory of summability kernels as it applies to LCA groups. The Bernstein approximation theorem (the proof using the Bernstein polynomials) gives an example of the idea in more general situations. Let $k_n \in L^1(G)$ be indexed over $n \in \mathbb{N}$. (In general other indexing sets are used.) We suppose
• $k_n \ge 0$.
• $\int_G k_n(x)\, d\eta(x) = 1$, for all $n \in \mathbb{N}$.
• For every measurable neighbourhood $V$ of $0$ we have $\lim_{n\to\infty} \int_{G \setminus V} k_n(x)\, d\eta(x) = 0$.

We have the following general theorem.

THEOREM 92 Let $B$ be a Banach space of objects on which $G$ acts isometrically and continuously. We will denote $b_x$ for the result of applying the group element $x$ to $b \in B$. Then
\[ \int b_x k_n(x)\, d\eta(x) \xrightarrow[n\to\infty]{} b. \]

Proof. We have
\[ b - \int b_x k_n(x)\, d\eta(x) = \int (b - b_x) k_n(x)\, d\eta(x) \]
and so
\[ \Bigl\| b - \int b_x k_n(x)\, d\eta(x) \Bigr\| \le \int \|b - b_x\| k_n(x)\, d\eta(x). \]
Now, let $\epsilon > 0$. There exists $V$ a measurable neighbourhood of $0$ such that $x \in V \implies \|b - b_x\| < \epsilon$ and then there exists $N \in \mathbb{N}$ such that
\[ n \ge N \implies \int_{G \setminus V} k_n(x)\, d\eta(x) < \epsilon. \]
We have
\begin{align*}
\int_G \|b - b_x\| k_n(x)\, d\eta(x) &\le \int_V \|b - b_x\| k_n(x)\, d\eta(x) + \int_{G \setminus V} \|b - b_x\| k_n(x)\, d\eta(x) \\
&\le \epsilon \int_V k_n(x)\, d\eta(x) + \int_{G \setminus V} (\|b\| + \|b_x\|) k_n(x)\, d\eta(x) \\
&\le \epsilon \int_G k_n(x)\, d\eta(x) + \int_{G \setminus V} (\|b\| + \|b\|) k_n(x)\, d\eta(x) \\
&\le \epsilon + 2\|b\| \int_{G \setminus V} k_n(x)\, d\eta(x) \\
&\le \epsilon + 2\|b\|\epsilon
\end{align*}
for $n \ge N$.

COROLLARY 93 Let $1 \le p < \infty$. Let $f \in L^p(G)$ and $(k_n)$ be a summability kernel. Then $k_n \star f \to f$ in $L^p$ norm.

EXAMPLE Let $\varphi$ be a bounded continuous function on $G$. Show that
\[ \int \varphi(x) k_n(x)\, d\eta(x) \xrightarrow[n\to\infty]{} \varphi(0). \]
□

6.8 Convolution of Measures

Let $\lambda$ and $\mu$ be complex Borel measures on $G$. Then we define their convolution product $\lambda * \mu$ by
\[ \lambda * \mu(B) = \lambda \otimes \mu(\alpha^{-1}(B)) \tag{6.3} \]
where $\alpha$ is the addition map $\alpha : G \times G \to G$ given by $\alpha(x, y) = x + y$. This extends to suitable measurable functions via
\[ \int_G f\, d\lambda * \mu = \int_G \int_G f(x + y)\, d\lambda(x)\, d\mu(y). \tag{6.4} \]
In fact, (6.3) is just the special case $f = \mathbf{1}_B$. It is easy to check that the convolution multiplication is associative and (on an abelian group) commutative. The totality of all complex Borel measures on $G$ is denoted $M(G)$. Since all complex Borel measures are necessarily bounded, we can put the total mass norm $\|\ \|_M$ on $M(G)$ and it can then be realised as the dual space of $C_0(G)$. Taking the supremum over all $f \in C_0(G)$ with norm bounded by one in (6.4), we see that $\|\lambda * \mu\|_M \le \|\lambda\|_M\|\mu\|_M$. It follows that $M(G)$ is a commutative Banach algebra with identity $\delta_0$.

The maximal ideal space of $M(G)$ is pathological. It is true that the mappings
\[ \mu \mapsto \hat{\mu}(\chi) = \int_G \overline{\chi(x)}\, d\mu(x) \]
for $\chi \in \Gamma$, which define the so-called Fourier–Stieltjes transform of $\mu$, are multiplicative linear functionals on $M(G)$, but there are other less obvious mlfs as well (at least when $G$ is non-discrete).

We now have the following uniqueness theorem which is the wrong way around.

THEOREM 94 Let $\mu \in M(\Gamma)$ be such that $\int_\Gamma \chi(x)\, d\mu(\chi) = 0$ for all $x \in G$. Then $\mu = 0$ identically.

Proof. Let $f$ be in $L^1(G)$. Since $\overline{\chi(x)} = \chi(-x)$, the hypothesis gives $\int_G f(x) \int_\Gamma \overline{\chi(x)}\, d\mu(\chi)\, d\eta(x) = 0$. Then we have $\int_G \int_\Gamma |f(x)|\, d|\mu|(\chi)\, d\eta(x) < \infty$ and hence by Fubini's Theorem, we have
\[ \int_\Gamma \hat{f}(\chi)\, d\mu(\chi) = \int_\Gamma \int_G f(x)\overline{\chi(x)}\, d\eta(x)\, d\mu(\chi) = 0. \tag{6.5} \]
But the set of Fourier transforms $A(\Gamma)$ of $L^1$ functions on $G$ is a self-adjoint subalgebra of $C_0(\Gamma)$ under pointwise multiplication which separates the points of $\Gamma$ (compact case) and the points of the one-point compactification of $\Gamma$ in the non-compact case. To verify the self-adjointness, we check
\[ \int \overline{f(-x)}\, \overline{\chi(x)}\, d\eta(x) = \overline{\int f(-x)\chi(x)\, d\eta(x)} = \overline{\int f(x)\chi(-x)\, d\eta(x)} \]
so that $\hat{g}(\chi) = \overline{\hat{f}(\chi)}$, where $g(x) = \overline{f(-x)}$. Therefore, by the Stone–Weierstrass Theorem, $A(\Gamma)$ is dense in $C_0(\Gamma)$. It follows from (6.5) that $\mu = 0$.

6.9 Positive Definite Functions

Let $\varphi$ be a complex-valued function on $G$; then we say that $\varphi$ is positive semidefinite if and only if the matrix $M$ given by
\[ m_{j,k} = \varphi(x_j - x_k) \]
is positive semidefinite for all choices of finitely many points $(x_j)_{j=1}^n$ from $G$. Explicitly, this means that
\[ \sum_{j=1}^{n} \sum_{k=1}^{n} c_j \overline{c_k}\, \varphi(x_j - x_k) \ge 0 \]
for all $n \in \mathbb{N}$, $c_j \in \mathbb{C}$ and $x_j \in G$.

Let $\varphi$ be a positive semidefinite function. Then clearly $\varphi(0) \ge 0$ (take $n = 1$ and $c_1 \ne 0$). Also, a positive semidefinite matrix has to be hermitian, so $\varphi(-x) = \overline{\varphi(x)}$. Now the matrix
\[ \begin{pmatrix} \varphi(0) & \varphi(x) \\ \varphi(-x) & \varphi(0) \end{pmatrix} = \begin{pmatrix} \varphi(0) & \varphi(x) \\ \overline{\varphi(x)} & \varphi(0) \end{pmatrix} \]
is positive semidefinite and has a nonnegative determinant, so $|\varphi(x)| \le \varphi(0)$ for all $x \in G$. Similarly, the matrix
\[ \begin{pmatrix} \varphi(0) & \varphi(x) & \varphi(y) \\ \varphi(-x) & \varphi(0) & \varphi(y - x) \\ \varphi(-y) & \varphi(x - y) & \varphi(0) \end{pmatrix} = \begin{pmatrix} \varphi(0) & \varphi(x) & \varphi(y) \\ \overline{\varphi(x)} & \varphi(0) & \overline{\varphi(x - y)} \\ \overline{\varphi(y)} & \varphi(x - y) & \varphi(0) \end{pmatrix} \]
is positive semidefinite and hence, using simultaneous row and column reduction, so is
\[ \begin{pmatrix} \varphi(0) & \varphi(x) - \varphi(y) \\ \overline{\varphi(x) - \varphi(y)} & 2\Re(\varphi(0) - \varphi(x - y)) \end{pmatrix}. \]
It follows that
\[ |\varphi(x) - \varphi(y)|^2 \le 2\varphi(0)\bigl(\varphi(0) - \Re\varphi(x - y)\bigr). \]

It is easy to check that if $f \in L^2(G)$, then $\tilde{f} \star f$ is a continuous positive semidefinite function on $G$ tending to zero at infinity. However, there is a complete characterization of the continuous positive semidefinite functions on $G$.
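The claim that $\tilde{f} \star f$ is positive semidefinite can be spot-checked numerically. The sketch below works under assumptions made here for illustration: the group is $\mathbb{Z}_6$ with counting measure, $\tilde{f}(x)$ is taken to mean $\overline{f(-x)}$, and the sample vector, points and coefficients are arbitrary. It evaluates the quadratic form from the definition and also tests the two inequalities derived above.

```python
def convolve(f, g):
    """Cyclic convolution on Z_n."""
    n = len(f)
    return [sum(f[(x - y) % n] * g[y] for y in range(n)) for x in range(n)]

def tilde(f):
    """Assumed convention: tilde f(x) = conj(f(-x))."""
    n = len(f)
    return [f[(-x) % n].conjugate() for x in range(n)]

def quadratic_form(phi, points, coeffs):
    """Sum over j, k of c_j * conj(c_k) * phi(x_j - x_k)."""
    n = len(phi)
    return sum(cj * ck.conjugate() * phi[(xj - xk) % n]
               for xj, cj in zip(points, coeffs)
               for xk, ck in zip(points, coeffs))

n = 6
f = [1 + 2j, -1j, 0j, 3 + 0j, -2 + 1j, 1 + 0j]
phi = convolve(tilde(f), f)

points = [0, 1, 2, 3, 4, 5]
coeffs = [1 + 0j, -2j, 3 + 1j, 0j, -1 + 0j, 2 - 2j]
q = quadratic_form(phi, points, coeffs)

# phi(0) is the squared l^2 norm of f, and |phi(x)| <= phi(0) for every x.
bounded = all(abs(phi[x]) <= phi[0].real + 1e-9 for x in range(n))

# |phi(x) - phi(y)|^2 <= 2 phi(0) (phi(0) - Re phi(x - y)) for all x, y.
modulus_ok = all(
    abs(phi[x] - phi[y]) ** 2
    <= 2 * phi[0].real * (phi[0].real - phi[(x - y) % n].real) + 1e-9
    for x in range(n) for y in range(n)
)

print(q, phi[0], bounded, modulus_ok)
```

A single choice of coefficients does not prove positive semidefiniteness, of course. The check only illustrates the definition on one example.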
THEOREM 95 (BOCHNER'S THEOREM) Every continuous positive semidefinite function $\varphi$ on $G$ has the form
\[ \varphi(x) = \int_\Gamma \chi(x)\, d\mu(\chi) \tag{6.6} \]
where $\mu$ is a nonnegative Borel measure (of finite total mass) on $\Gamma$, and conversely.

Proof. It is routine to check that if $\varphi$ is defined by (6.6) then $\varphi$ is continuous and positive semidefinite on $G$. For the converse, it is an exercise to check that (since $\varphi$ is bounded and continuous), we have
\[ \iint_{G \times G} \varphi(x - y) f(x)\overline{f(y)}\, d\eta(x)\, d\eta(y) \ge 0 \]
for $f \in L^1(G)$. We use the formula to define a quasi inner product on $L^1(G)$ by
\[ \langle f, g\rangle = \iint_{G \times G} \varphi(x - y) g(x)\overline{f(y)}\, d\eta(x)\, d\eta(y) = \int_G (\tilde{f} * g)\varphi\, d\eta. \]
This is in all respects like an inner product, except that $\langle f, f\rangle = 0$ does not necessarily imply that $f$ is the zero element of $L^1(G)$. Nevertheless, the proof of the corresponding Cauchy–Schwarz–Bunyakowski inequality goes through, giving
\[ |\langle f, g\rangle|^2 \le \langle f, f\rangle\langle g, g\rangle. \]
Now pass to the limit as $f$ runs over a summability kernel on $G$. We get
\[ \Bigl|\int_G \varphi(x) g(x)\, d\eta(x)\Bigr|^2 \le \varphi(0)\langle g, g\rangle \]
for all $g \in L^1(G)$. Let $g_1 = \tilde{g} \star g$ and $g_{n+1} = \widetilde{g_n} \star g_n$ for $n = 1, 2, \ldots$ Actually, $\widetilde{g_n} = g_n$ and it follows that $g_{n+1} = \star^{2^n} g_1$, the $2^n$-fold convolution product of $g_1$ with itself. The point is that
\[ \Bigl|\int_G \varphi(x) g_n(x)\, d\eta(x)\Bigr|^2 \le \varphi(0)\langle g_n, g_n\rangle = \varphi(0) \int_G \varphi(x) g_{n+1}(x)\, d\eta(x). \]
It follows from this and a simple induction that
\[ \Bigl|\int_G \varphi(x) g(x)\, d\eta(x)\Bigr|^{2^n} \le \varphi(0)^{2^n} \bigl\|\star^{2^{n-1}} g_1\bigr\|_1 \]
and, after taking the root of order $2^{n-1}$ and passing to the limit with the spectral radius formula, we get
\[ \Bigl|\int_G \varphi(x) g(x)\, d\eta(x)\Bigr|^2 \le \varphi(0)^2 \|\widehat{g_1}\|_\infty \le \varphi(0)^2 \|\hat{g}\|_\infty^2 \]
or
\[ \Bigl|\int_G \varphi(x) g(x)\, d\eta(x)\Bigr| \le \varphi(0)\|\hat{g}\|_\infty. \]
This tells us that $\int_G \varphi(x) g(x)\, d\eta(x)$ depends only on the value of $\hat{g}$ and (since $A(\Gamma)$ is dense in $C_0(\Gamma)$) that there is a measure $\mu$ on $\Gamma$ of total mass at most $\varphi(0)$ such that
\[ \int_G \varphi(x) g(x)\, d\eta(x) = \int_\Gamma \hat{g}(-\chi)\, d\mu(\chi). \]
But now,
\[ \int_G \varphi(x) g(x)\, d\eta(x) = \int_\Gamma \int_G g(x)\chi(x)\, d\eta(x)\, d\mu(\chi) = \int_G g(x) \int_\Gamma \chi(x)\, d\mu(\chi)\, d\eta(x). \tag{6.7} \]
The functions $\varphi$ and $x \mapsto \int_\Gamma \chi(x)\, d\mu(\chi)$ are both continuous on $G$ and since (6.7) holds for all $g \in L^1(G)$, we have (6.6) holding for all $x \in G$ as required.
Finally, put $x = 0$ in (6.6) to see that $\varphi(0) = \int d\mu \le \|\mu\| \le \varphi(0)$, forcing $\mu$ to be a positive measure.

Something special happened in the proof above. Before this theorem, we didn't know that the points of $G$ could be separated by its characters, but now we do. Given $x \ne 0$ in $G$, find a symmetric neighbourhood $V$ of $0$ such that $x \notin V + V$. Then apply Bochner's Theorem to $\mathbf{1}_V \star \mathbf{1}_V$.

We are now ready to prove a preliminary form of the inversion theorem.

THEOREM 96 Let $f \in L^1(G)$ be also given by $f(x) = \int_\Gamma \chi(x)\, d\mu_f(\chi)$ where $\mu_f$ is a complex measure. (Note that complex measures have finite total mass.) Then $\mu_f = \hat{f}\nu$ where $\nu$ is a suitably normalized Haar measure on $\Gamma$.

Proof. Let $f$ and $g$ be two such functions with associated measures $\mu_f$ and $\mu_g$. Let $h \in L^1(G)$. Then
\begin{align*}
\iint h(-x - y) f(x) g(y)\, d\eta(x)\, d\eta(y) &= \iiint h(-x - y)\chi(x) g(y)\, d\mu_f(\chi)\, d\eta(x)\, d\eta(y) \\
&= \iiint h(-x - y)\chi(x) g(y)\, d\eta(x)\, d\eta(y)\, d\mu_f(\chi) \\
&= \iiint h(x - y)\overline{\chi(x)} g(y)\, d\eta(x)\, d\eta(y)\, d\mu_f(\chi) \\
&= \iiint h(x - y)\overline{\chi(x)} g(y)\, d\eta(y)\, d\eta(x)\, d\mu_f(\chi) \\
&= \iint g \star h(x)\overline{\chi(x)}\, d\eta(x)\, d\mu_f(\chi) \\
&= \int \widehat{g \star h}(\chi)\, d\mu_f(\chi) \\
&= \int \hat{h}(\chi)\hat{g}(\chi)\, d\mu_f(\chi)
\end{align*}
and also by the symmetry of the initial expression in $f$ and $g$
\[ = \int \hat{h}(\chi)\hat{f}(\chi)\, d\mu_g(\chi). \]
Again, since $A(\Gamma)$ is dense in $C_0(\Gamma)$, we find $\hat{g}\, d\mu_f = \hat{f}\, d\mu_g$.

Now we can construct functions like $f$ and $g$ easily. Let $V$ be a measurable neighbourhood of $0$ and let $h = \mathbf{1}_V$; then by Bochner's Theorem, there is a measure $\mu_{h \star \tilde{h}}$ such that
\[ h \star \tilde{h}(x) = \int_\Gamma \chi(x)\, d\mu_{h \star \tilde{h}}(\chi), \qquad \widehat{h \star \tilde{h}}(\chi) = |\hat{h}(\chi)|^2. \]
Furthermore, if $\psi \in \Gamma$,
\[ (\psi h \star \widetilde{\psi h})(x) = \psi(x)\, h \star \tilde{h}(x) = \int_\Gamma \chi(x)\, d\mu_{h \star \tilde{h}}(\chi - \psi) = \int_\Gamma \chi(x)\, d\mu_{(\psi h \star \widetilde{\psi h})}(\chi), \]
\[ \widehat{\psi h \star \widetilde{\psi h}}(\chi) = |\hat{h}(\chi - \psi)|^2. \]
Note also that $\hat{h}$ is continuous and $\hat{h}(0)$ is nonzero. This leads to
\[ |\hat{h}(\chi - \psi)|^2\, d\mu_f(\chi) = \hat{f}(\chi)\, d\mu_{(\psi h \star \widetilde{\psi h})}(\chi) \]
showing that $\mu_f$ is uniquely determined near $\psi$ and hence everywhere on $\Gamma$. Also we may infer the existence (the details are an exercise) of a positive measure $\nu$ such that $\mu_f = \hat{f}\nu$. A straightforward compactness argument shows that $\nu$ is finite on the compact sets and charges every nonempty open set.
Now let us abbreviate $h \star \tilde{h}$ to $g$. Then
\[ \widehat{\psi g}(\chi)\, d\mu_g(\chi) = \hat{g}(\chi)\, d\mu_{\psi g}(\chi) = \hat{g}(\chi)\, d\mu_g(\chi - \psi) \]
leading to
\[ \hat{g}(\chi - \psi)\hat{g}(\chi)\, d\nu(\chi) = \hat{g}(\chi)\hat{g}(\chi - \psi)\, d\nu(\chi - \psi). \]
Now suppose that $\psi$ is given; then, choosing $\epsilon$ suitably small, choose $V$ such that $V \subseteq \{x;\ |\psi(x) - 1| < \epsilon\}$. Then for $\chi$ in a neighbourhood of $0_\Gamma$, $\hat{g}(\chi - \psi)\hat{g}(\chi) \ne 0$, showing that $d\nu(\chi) = d\nu(\chi - \psi)$ at least for values of $\chi$ in a neighbourhood of $0_\Gamma$. It follows that $\nu$ is translation invariant and hence a multiple of Haar measure on $\Gamma$. Again, the details are an exercise.

Now let $V$ be a symmetric neighbourhood of $0$ in $G$. Let $g = \eta(V)^{-1}\, \mathbf{1}_V \star \mathbf{1}_V$. Then $g(0) = 1$ and $g$ is positive definite. It follows that $g(x) = \int_\Gamma \hat{g}(\chi)\chi(x)\, d\nu(\chi)$. Now $\hat{g}$ is in $L^1(\Gamma)$ with norm 1 and there is a compact subset $C$ of $\Gamma$ such that
\[ \int_{\Gamma \setminus C} \hat{g}(\chi)\, d\nu(\chi) < \frac{1}{5}. \]
Suppose that $x \in N(C, \frac{1}{5})$. Then
\begin{align*}
\Bigl|1 - \int_\Gamma \hat{g}(\chi)\chi(x)\, d\nu(\chi)\Bigr| &\le \int_\Gamma \hat{g}(\chi)|1 - \chi(x)|\, d\nu(\chi) \\
&\le \int_{\Gamma \setminus C} \hat{g}(\chi)|1 - \chi(x)|\, d\nu(\chi) + \int_C \hat{g}(\chi)|1 - \chi(x)|\, d\nu(\chi) \\
&\le \frac{2}{5} + \frac{1}{5} = \frac{3}{5}.
\end{align*}
So $g(x) \ge \frac{2}{5}$ and $x \in V + V$. It follows from this that the compact open topology defined on $G$ by means of the duality with $\Gamma$ is finer than and therefore equivalent to the original topology on $G$.

6.10 The Plancherel Theorem

This is an immediate consequence of the inversion theorem.

THEOREM 97 (PLANCHEREL THEOREM) Let $f \in L^1(G) \cap L^2(G)$. Then
\[ \int_G |f(y)|^2\, d\eta(y) = \int_\Gamma |\hat{f}(\chi)|^2\, d\nu(\chi) \tag{6.8} \]
so that the map $f \mapsto \hat{f}$, $L^1(G) \cap L^2(G) \to L^2(\Gamma)$, extends by continuity to a surjective isometry $L^2(G) \to L^2(\Gamma)$.

Proof. Let $h = f \star \tilde{f}$; then $h(x) = \int_G f(x - y)\overline{f(-y)}\, d\eta(y)$ and $h(0) = \|f\|_2^2$. Since $h$ is both in $L^1$ and is positive definite, it can be represented by a measure $\mu_h$ of total mass $h(0)$. Also, $\mu_h = \hat{h}\nu$. Since $\hat{h}(\chi) = \widehat{f \star \tilde{f}}(\chi) = |\hat{f}(\chi)|^2$, we have (6.8). The remainder of the result is obvious, except for the fact that the isometry is surjective. To see this, suppose not. Then there is a nonzero function $\phi \in L^2(\Gamma)$ such that
\[ \int_\Gamma \hat{f}(\chi)\phi(\chi)\, d\nu(\chi) = 0 \]
for all $f \in L^1(G) \cap L^2(G)$. Fix such an $f$ and consider its translation $f_x$.
We get
\[ \int_\Gamma \chi(x)\hat f(\chi)\varphi(\chi)\,d\nu(\chi) = \int_\Gamma \widehat{f_x}(\chi)\varphi(\chi)\,d\nu(\chi) = 0 \]
for all $x \in G$. But now by Theorem 94 and since $\hat f \varphi \nu$ is a measure ($\hat f \varphi \in L^1$), we have that $\hat f \varphi$ vanishes $\nu$-almost everywhere. But we know how to choose $f$ such that $\hat f$ is non-vanishing in a neighbourhood of any given point of $\Gamma$. Hence $\varphi = 0$ almost everywhere (and as an element of $L^2$).

COROLLARY 98 Let $f, g \in L^2(G)$. Then $\widehat{fg} = \hat f \star \hat g$.

Proof. Polarizing the Plancherel identity leads to
\[ \int_G f(y)\overline{g(y)}\,d\eta(y) = \int_\Gamma \hat f(\chi)\overline{\hat g(\chi)}\,d\nu(\chi) \]
for $f, g \in L^2(G)$. The notations $\hat f$, $\hat g$ now stand for the abstract nonsense Fourier transforms of $f$ and $g$ respectively. Replace $g$ by $\overline{g}$ and $f$ by $\overline{\psi} f$ where $\psi \in \Gamma$. We get
\[ \int_G \overline{\psi(y)} f(y) g(y)\,d\eta(y) = \int_\Gamma \hat f(\chi + \psi)\hat g(-\chi)\,d\nu(\chi), \]
which after a change of variables gives exactly $\widehat{fg} = \hat f \star \hat g$.

COROLLARY 99 Let $\Omega$ be a nonempty open subset of $\Gamma$. Then there is a function $f \in L^1(G)$ such that $\hat f$ is not identically zero and $\hat f(\chi) = 0$ for all $\chi \in \Gamma \setminus \Omega$.

Proof. First, find $V_1$ and $V_2$ open, nonempty and relatively compact with $V_1 + V_2 \subset \Omega$. Then let $f_j \in L^2(G)$ be the elements such that $\hat f_j = \mathbb{1}_{V_j}$ for $j = 1, 2$. Then $f = f_1 f_2$ does the trick, since $\hat f = \mathbb{1}_{V_1} \star \mathbb{1}_{V_2}$.

6.11 The Pontryagin Duality Theorem

Let $H$ be the dual group of $\Gamma$. Every element of $G$ defines a continuous character on $\Gamma$, so there is a map $\alpha : G \to H$ which is clearly one-to-one (different elements of $G$ define different characters since we know that the characters of $G$ separate the points of $G$).

THEOREM 100 (PONTRYAGIN DUALITY THEOREM) The mapping $\alpha$ is an isomorphism of topological groups.

Proof. It is clear that $\alpha$ is an injective group homomorphism. We also know that the topologies of $G$ and $H$ can be identified to the compact open topology when these spaces are viewed as function spaces on $\Gamma$. Therefore the topology of $G$ is the subspace topology coming from $H$. Now the uniform structure of an abelian topological group is given from the topology by means of translation.
Therefore, the uniform structure on $G$ is just the restriction of the uniform structure on $H$. But $G$ is locally compact and hence, as a uniform space, it is complete. But when complete spaces occur as subsets of other spaces, they are necessarily closed. Hence $\alpha(G)$ is a closed subset of $H$.¹ It remains only to show that $\alpha(G)$ is dense in $H$. But, if not, then by one of the corollaries of the Plancherel Theorem, we can find $f \in L^1(\Gamma)$ nonzero, with $\hat f(x) = 0$ for all $x \in \alpha(G)$. But then Theorem 94 implies that $f$ is almost everywhere zero on $\Gamma$, a contradiction.

Some of the consequences of the Pontryagin Duality Theorem are as follows:

• Every compact abelian group is the dual of a discrete abelian group.
• Every discrete abelian group is the dual of a compact abelian group.
• If $\mu \in M(G)$ and $\hat\mu(\chi) = 0$ for all $\chi \in \hat G$, then $\mu = 0$. In particular, both $L^1(G)$ and $M(G)$ are semisimple Banach algebras.
• If $G$ is not discrete, then $\hat G$ is not compact and hence $L^1(G)$ does not have an identity element.
• We can restate the inversion theorem the correct way around. If $\mu \in M(G)$ and $\hat\mu \in L^1(\hat G)$, then there exists $f \in L^1(G)$ such that $\mu = f\eta$ and the inversion formula
\[ f(x) = \int_{\hat G} \hat\mu(\chi)\chi(x)\,d\eta(\chi) \]
holds.

¹ If you are reading along in Rudin's book, please note that it is in general false that a locally compact subspace of a locally compact topological space is necessarily closed. Whatever Rudin intended in §1.7 is by no means clear.

7 Distributions and Euclidean Harmonic Analysis

To define distributions, we first need a space of test functions. There are several choices, but the usual one is $C_c^\infty(\mathbb{R}^d)$. Here $\mathbb{R}^d$ can be replaced by any $C^\infty$ manifold. We topologize $C_c^\infty(\mathbb{R}^d)$ with the seminorms
\[ p_{\alpha,K}(f) = \sup_{x \in K} |\partial^\alpha f(x)| \qquad\text{where}\qquad \partial^\alpha f = \frac{\partial^{|\alpha|} f}{\partial x_1^{\alpha_1} \cdots \partial x_d^{\alpha_d}} . \]
Here $\alpha = (\alpha_1, \ldots, \alpha_d)$ runs over $d$-tuples of nonnegative integers and $K$ runs over compact subsets of $\mathbb{R}^d$. Alternatively, one can replace $K$ with balls of positive integer radius centred at $0$. The notation $|\alpha|$ is for $\sum_{j=1}^d \alpha_j$. With these seminorms, $C_c^\infty(\mathbb{R}^d)$ is a locally convex space. Unfortunately, it is not necessarily complete. The continuous linear forms on $C_c^\infty(\mathbb{R}^d)$ are called distributions.

A function $f$ on $\mathbb{R}^d$ is said to be locally integrable (written $f \in L^1_{\mathrm{loc}}(\mathbb{R}^d)$) if and only if $\int_K |f(x)|\,dx < \infty$ for every compact subset $K$ of $\mathbb{R}^d$. Here $dx = dx_1 \ldots dx_d$ is Lebesgue measure on $\mathbb{R}^d$. Locally integrable functions can be identified to distributions by
\[ \varphi \mapsto \int f\varphi\,dx, \qquad C_c^\infty(\mathbb{R}^d) \to \mathbb{C}. \]
If $f$ is differentiable, we have
\[ \int \frac{\partial}{\partial x_j}(f\varphi)\,dx = 0 \qquad\text{and so}\qquad \int \frac{\partial f}{\partial x_j}\varphi\,dx + \int f \frac{\partial\varphi}{\partial x_j}\,dx = 0. \]
This tells us how to interpret $\frac{\partial f}{\partial x_j}$ as a distribution when $f \in L^1_{\mathrm{loc}}(\mathbb{R}^d)$ but $f$ is not necessarily differentiable. The defining continuous linear form is
\[ \varphi \mapsto -\int f \frac{\partial\varphi}{\partial x_j}\,dx, \qquad C_c^\infty(\mathbb{R}^d) \to \mathbb{C}. \]
We now use this formula to define $\frac{\partial}{\partial x_j}$ of any distribution. Instead of $f$ one may use measures. The derivative of $\delta_0$ on the line is $\delta_0'$, the unit dipole at $0$.

Yet another way of defining distributions is by means of Cauchy principal value integrals. The typical such integral is
\[ \int_{-\infty}^{\infty} \varphi(x)\,\frac{dx}{x} \]
for $\varphi \in C_c^\infty(\mathbb{R})$. Of course, the integral is meaningless as it stands because of the singularity at $x = 0$. To give it precise meaning as a Cauchy principal value integral, we define
\[ \int_{-\infty}^{\infty} \varphi(x)\,\frac{dx}{x} = \lim_{\epsilon \to 0+} \int_{|x| > \epsilon} \varphi(x)\,\frac{dx}{x}. \]
To see that this makes sense, we choose a specific $\psi \in C_c^\infty(\mathbb{R})$ which is even and has $\psi(0) = 1$. Since $\psi(x)/x$ is odd, its integral over $\{|x| > \epsilon\}$ vanishes, so it is easy to see that the Cauchy principal value integral is just
\[ \int_{-\infty}^{\infty} \frac{\varphi(x) - \varphi(0)\psi(x)}{x}\,dx \]
and the singularity in the integrand is now removable. Note that in the original definition, it is vital to remove a symmetric interval $\{x;\ |x| \le \epsilon\}$ from the range of integration. Removing $\{x;\ -\epsilon \le x \le 2\epsilon\}$ will give a different answer. Yet another way of thinking about this integral is as the distribution $f'$ where $f$ is the locally integrable function $x \mapsto \ln(|x|)$.
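The remark about removing a non-symmetric interval can be checked by a short worked computation. This is only a sketch; the symbol $I_\epsilon$ is introduced here for illustration and does not appear elsewhere in these notes.

```latex
% I_\epsilon is a name introduced here for illustration only.
% It is the integral with the non-symmetric interval [-\epsilon, 2\epsilon] removed.
\begin{align*}
I_\epsilon
  &= \int_{-\infty}^{-\epsilon} \varphi(x)\,\frac{dx}{x}
   + \int_{2\epsilon}^{\infty} \varphi(x)\,\frac{dx}{x}
   = \int_{|x|>\epsilon} \varphi(x)\,\frac{dx}{x}
   - \int_{\epsilon}^{2\epsilon} \varphi(x)\,\frac{dx}{x}, \\
\int_{\epsilon}^{2\epsilon} \varphi(x)\,\frac{dx}{x}
  &= \varphi(0)\ln 2
   + \int_{\epsilon}^{2\epsilon} \frac{\varphi(x)-\varphi(0)}{x}\,dx
   \;\longrightarrow\; \varphi(0)\ln 2
   \qquad (\epsilon \to 0+),
\end{align*}
% The last integral is at most \epsilon \sup|\varphi'| by the mean value theorem.
```

So $I_\epsilon$ converges to the principal value minus $\varphi(0)\ln 2$: the non-symmetric excision changes the answer by a multiple of $\delta_0$.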
Cauchy principal value integrals can also be defined on $\mathbb{R}^d$ where one removes a ball of radius $\epsilon$ around the singularity and passes to the limit as $\epsilon \to 0+$.

In general, distributions on $\mathbb{R}^d$ do not necessarily have Fourier transforms, but one may always take the convolution of a distribution and a $C^\infty$ function. On the circle group, or indeed on the torus $\mathbb{T}^d$, distributions do have Fourier coefficients since the characters are $C^\infty$ functions. In fact, one may view distributions on $\mathbb{T}^d$ as objects whose Fourier coefficients have at most polynomial growth at infinity (on $\mathbb{Z}^d$).

After working with distributions for a while, one gets used to distributional derivatives. For example the derivative of $f(x) = |x|$ is just $f'(x) = \operatorname{sgn}(x)$. The justification is
\[ \int_{-\infty}^{\infty} |x|\varphi'(x)\,dx = -\int_{-\infty}^{\infty} \operatorname{sgn}(x)\varphi(x)\,dx \]
for $\varphi \in C_c^\infty(\mathbb{R})$.

7.1 The Hilbert Transform

It is a well known fact that any function in $C_0(\mathbb{R})$ can be extended to a function in $C_0$ of the halfspace in $\mathbb{R}^2$ which is harmonic in the interior of the halfspace. We prove this using the Poisson kernel
\[ P_y(x) = \frac{1}{\pi}\,\frac{y}{x^2 + y^2} \]
which is harmonic in the halfspace $\{(x, y);\ x \in \mathbb{R},\ y > 0\}$. It is also a summability kernel on $\mathbb{R}$ as $y \to 0+$. Given $f \in C_0(\mathbb{R})$, the harmonic extension is
\[ \tilde f(x, y) = \int_{-\infty}^{\infty} P_y(t) f(x - t)\,dt. \]
With the Fourier transform normalized as $\hat f(u) = \int f(x) e^{-2\pi i u x}\,dx$, the Fourier transform of $P_y$ is $\widehat{P_y}(u) = e^{-2\pi y|u|}$. The conjugate harmonic is
\[ \int_{-\infty}^{\infty} Q_y(t) f(x - t)\,dt \qquad\text{where}\qquad Q_y(x) = \frac{1}{\pi}\,\frac{x}{x^2 + y^2}. \]
We note that
\[ \frac{i}{\pi}\,\frac{1}{x + iy} = P_y(x) + iQ_y(x) \]
and that $\widehat{Q_y}(u) = -i\operatorname{sgn}(u)e^{-2\pi y|u|}$. There are similar formulas for the extension of functions from the unit circle into the unit disk. They are
\[ P_r(t) = \frac{1 - r^2}{1 - 2r\cos(t) + r^2}, \qquad Q_r(t) = \frac{2r\sin(t)}{1 - 2r\cos(t) + r^2}, \]
\[ \widehat{P_r}(n) = r^{|n|}, \qquad \widehat{Q_r}(n) = -i\operatorname{sgn}(n)r^{|n|}. \]
The mapping that takes a suitable function on $\mathbb{R}$ to the boundary value function of the conjugate harmonic is called the Hilbert transform.
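The circle formulas can be verified by summing geometric series. The following is a worked sketch, under the assumption that Fourier coefficients are taken with respect to normalized Haar measure on the circle; the abbreviation $z = re^{it}$ is introduced here for convenience.

```latex
% Sketch: z = r e^{it} with 0 <= r < 1 is an abbreviation introduced here.
\begin{align*}
\sum_{n\in\mathbb{Z}} r^{|n|} e^{int}
  &= 1 + 2\,\mathrm{Re}\sum_{n\ge 1} z^n
   = \mathrm{Re}\,\frac{1+z}{1-z}
   = \frac{1-|z|^2}{|1-z|^2}
   = \frac{1-r^2}{1-2r\cos(t)+r^2} = P_r(t), \\
\sum_{n\in\mathbb{Z}} \bigl(-i\,\mathrm{sgn}(n)\bigr) r^{|n|} e^{int}
  &= -i\Bigl(\sum_{n\ge 1} z^n - \sum_{n\ge 1} \bar z^{\,n}\Bigr)
   = 2\,\mathrm{Im}\,\frac{z}{1-z}
   = \frac{2r\sin(t)}{1-2r\cos(t)+r^2} = Q_r(t).
\end{align*}
% Consequently P_r(t) + i Q_r(t) = (1+z)/(1-z), which is analytic in z.
```

In particular $P_r + iQ_r = \frac{1+z}{1-z}$ is analytic in the disk, which is the sense in which $Q_r$ produces the conjugate harmonic function.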
If you pass to the limit $y \to 0+$ in $Q_y(x)$ you get $\frac{1}{\pi x}$, so it may not come as a surprise that the formula for the Hilbert transform is
\[ Hf(x) = \int f(x - t)\,\frac{1}{\pi t}\,dt \]
where of course the integral has to be interpreted as a Cauchy principal value integral. From the point of view of the Fourier transform, we have $\widehat{Hf}(u) = -i\operatorname{sgn}(u)\hat f(u)$. It should be clear from the Plancherel theorem that $H$ is an isometry on $L^2(\mathbb{R})$. On the circle, the Hilbert transform is given by the Cauchy principal value integral
\[ Hf(t) = \int f(t - s)\cot\!\left(\frac{s}{2}\right) d\eta(s). \]
The constant functions are in the kernel of $H$, but it still has operator norm equal to $1$ on $L^2(\mathbb{T})$. In both cases, the Hilbert transform is bounded on $L^p$ for $1 < p < \infty$ but not on $L^1$ nor on $L^\infty$.

The first generalization of the Hilbert transform to several variables (i.e. functions on $\mathbb{R}^d$) are the Riesz transforms. They are given by
\[ \widehat{R_j f}(u) = i\,\frac{u_j}{|u|}\,\hat f(u) \qquad\text{and}\qquad R_j f(x) = \frac{\Gamma(\frac{d+1}{2})}{\pi^{\frac{d+1}{2}}} \int \frac{t_j}{|t|^{d+1}}\,f(x - t)\,dt \]
where again the integral is to be taken in the Cauchy principal value sense, that is by removing a ball of radius $\epsilon$ around the singularity $t = 0$ and passing to the limit as $\epsilon \to 0+$. The Riesz transforms are also bounded on all the $L^p(\mathbb{R}^d)$ spaces for $1 < p < \infty$.

7.2 Schauder estimate for the Hilbert Transform

THEOREM 101 Let $f$ be a function of compact support that is Hölder continuous of index $\alpha$ for $0 < \alpha < 1$. Then its Hilbert transform $g$ is Hölder continuous of index $\alpha$ on $\mathbb{R}$ and decays at infinity like $|x|^{-1}$.

Proof. So $g$ is defined by a Cauchy principal value integral. Since $f$ has compact support and is bounded, it follows easily that $g$ decays like $|x|^{-1}$ at infinity. We would like to write
\[ g(x) = \int \frac{f(y) - f(x)}{x - y}\,dy \]
where the singularity of the integrand at $y = x$ is now removable since $|f(y) - f(x)| \le C|y - x|^\alpha$, but unfortunately we have introduced a new singularity at infinity since the integrand no longer has compact support. Hence, we had better write
\[ g(x) = \lim_{\ell \to \infty} \int_{|x - y| \le \ell} \frac{f(y) - f(x)}{x - y}\,dy. \]
We will be considering $g(x_1) - g(x_2)$ so, we have
\[ g(x_1) - g(x_2) = \lim_{\ell \to \infty} \int_{|x_1 - y| \le \ell} \frac{f(y) - f(x_1)}{x_1 - y}\,dy - \lim_{\ell \to \infty} \int_{|x_2 - y| \le \ell} \frac{f(y) - f(x_2)}{x_2 - y}\,dy. \]
But the second integral could equally well be taken over $|x_1 - y| \le \ell$ since the difference tends to zero as $\ell \to \infty$. This is roughly because
\[ \int_\ell^{\ell+1} \frac{dy}{y} = \ln(\ell + 1) - \ln(\ell) = \ln\frac{\ell + 1}{\ell} \to \ln(1) = 0 \]
as $\ell \to \infty$. Thus, we can write
\[ g(x_1) - g(x_2) = \lim_{\ell \to \infty} \int_{|x_1 - y| \le \ell} \left( \frac{f(y) - f(x_1)}{x_1 - y} - \frac{f(y) - f(x_2)}{x_2 - y} \right) dy. \tag{7.1} \]
To estimate this, we set $\delta = 2|x_1 - x_2|$ and split the integral in (7.1) into $A$ taken over $|y - x_1| \le \delta$ and $B$ taken over $\delta < |y - x_1| \le \ell$. We have
\[ |A| \le C \int_{|y - x_1| \le \delta} \left( |x_1 - y|^{-1+\alpha} + |x_2 - y|^{-1+\alpha} \right) dy \le C\delta^\alpha \le C|x_1 - x_2|^\alpha \]
since $|y - x_1| \le \delta \Rightarrow |y - x_2| \le \frac{3\delta}{2}$. We write $B$ as
\[ \int_{\delta < |y - x_1| \le \ell} \frac{f(x_2) - f(x_1)}{x_1 - y}\,dy + \int_{\delta < |y - x_1| \le \ell} \left( \frac{1}{x_1 - y} - \frac{1}{x_2 - y} \right)(f(y) - f(x_2))\,dy \]
and the first integral is zero, since its integrand is odd about $y = x_1$. Therefore
\[ B = \int_{\delta < |y - x_1| \le \ell} \frac{(x_2 - x_1)(f(y) - f(x_2))}{(x_1 - y)(x_2 - y)}\,dy \]
and
\[ |B| \le C|x_1 - x_2| \int_{\delta < |y - x_1| \le \ell} |x_1 - y|^{-1}|x_2 - y|^{-1+\alpha}\,dy. \]
But, on the range of integration $|x_2 - y| \ge \frac{1}{2}|x_1 - y|$, leading to
\[ |B| \le C|x_1 - x_2| \int_{\delta < |y - x_1|} |x_1 - y|^{-2+\alpha}\,dy \sim |x_1 - x_2|\,\delta^{-1+\alpha} \sim |x_1 - x_2|^\alpha. \]
The result is proved.

An interesting point here is that the corresponding statement for continuous functions, i.e. $\|Hf\|_\infty \le C\|f\|_\infty$ for $f \in C_c(\mathbb{R})$, is false. In his notes on Sobolev spaces Tao cites this as a reason that one might be interested in functions with a fractional degree of regularity.

Theorem 101 also applies to the Riesz transforms on $\mathbb{R}^d$ and with essentially the same proof, although the details are a little more complicated. It also applies to higher order Riesz transforms. The typical higher order Riesz transform to which it applies is
\[ K(x) = \frac{x_p x_q}{|x|^{d+2}}, \qquad \hat K(u) = c_d\,\frac{u_p u_q}{|u|^2} \]
for $1 \le p < q \le d$. We will give details later. We can see already why this is important in the theory of PDE. Consider the equation $\Delta f = g$ where $g \in C_c^\infty(\mathbb{R}^d)$.
Then formally, $\hat g(u) = -4\pi^2|u|^2 \hat f(u)$ and indeed
\[ \widehat{\frac{\partial^2 f}{\partial x_p \partial x_q}}(u) = c_d\,\frac{u_p u_q}{|u|^2}\,\hat g(u), \]
that is, $\frac{\partial^2 f}{\partial x_p \partial x_q}$ is a higher Riesz transform of $g$. Thus if $g$ is Hölder continuous of index $\alpha$ ($0 < \alpha < 1$), so are the mixed second order partials of the solution. Unfortunately, it does not apply to the straight second order partials since the corresponding kernel does not have the zero mean property. To settle this, we have the following Lemma.

LEMMA 102 Let $P_k$ be a homogeneous harmonic polynomial of degree $k \ge 1$ in $d$ real variables. Let $K(x) = |x|^{-d-k}P_k(x)$. The condition $k \ge 1$ forces $K$ to have the zero mean property. Let $Rf(x) = \lim_{\epsilon \to 0+} \int_{|y| \ge \epsilon} f(x - y)K(y)\,dy$ be the associated singular integral operator. Then for $f \in C_c^\infty(\mathbb{R}^d)$ we have
\[ \widehat{R(f)}(u) = i^k \pi^{\frac{d}{2}}\,\frac{\Gamma(\frac{k}{2})}{\Gamma(\frac{k+d}{2})}\,|u|^{-k}P_k(u)\hat f(u). \]

We will not prove this lemma. We now have

THEOREM 103 Let $f$ be a function of compact support that is Hölder continuous of index $\alpha$ for $0 < \alpha < 1$ on $\mathbb{R}^d$. Then its higher Riesz transform $g = Rf$ is Hölder continuous of index $\alpha$ on $\mathbb{R}^d$ and decays at infinity like $|x|^{-1}$.

The proof is essentially the same as that of Theorem 101 above. The main difference is in the estimation of the integral $B$ which in this case becomes
\[ \int_{\delta < |y - x_1| \le \ell} \bigl(K(x_1 - y) - K(x_2 - y)\bigr)(f(y) - f(x_2))\,dy. \]
Since $\nabla K$ is homogeneous of degree $-d-1$, we estimate
\[ |K(x_1 - y) - K(x_2 - y)| \le |x_1 - x_2| \sup_{x + y \in L(x_1, x_2)} |\nabla K(x)| \le C|x_1 - x_2|\,|x_2 - y|^{-d-1} \]
and the proof concludes much as before. We handle straight derivatives by writing for example in $\mathbb{R}^3$
\[ u_1^2 = \frac{1}{3}\left( |u|^2 + (2u_1^2 - u_2^2 - u_3^2) \right) \]
and use the harmonic polynomial $P_2(u) = 2u_1^2 - u_2^2 - u_3^2$.

The Riesz transforms are also bounded operators on $L^p(\mathbb{R}^d)$ for $1 < p < \infty$. There are quite a few technical difficulties which we will hide by stating the result in the following way.

THEOREM 104 Let $K$ be a kernel on $\mathbb{R}^d$ such that

• $|K(x)| \le C|x|^{-d}$.
• $\int_{|x| \ge 2|y|} |K(x) - K(x - y)|\,dx \le C$ for all $y \in \mathbb{R}^d \setminus \{0\}$.
• The operator $Rf(x) = \lim_{\epsilon \to 0+} \int_{|x - y| \ge \epsilon} K(x - y)f(y)\,dy$ is bounded on $L^2(\mathbb{R}^d)$, specifically $\|Rf\|_2 \le C\|f\|_2$.
Then $R$ is also bounded on $L^p(\mathbb{R}^d)$ for $1 < p \le 2$.

Sketch proof. Since $R$ is bounded on $L^2$, we need only show that $R$ is of weak type $(1,1)$ and the result will follow from the Marcinkiewicz interpolation theorem. Let $f \in L^1(\mathbb{R}^d)$. Let $t > 0$. Now let $\mathbb{R}^d$ be paved with a lattice of large dyadic cubes $Q$. The proof will follow the general line of the martingale maximal function proof, but is considerably more complicated. Initially, the cubes are chosen so large that the averages satisfy
\[ |Q|^{-1} \int_Q |f(x)|\,dx \le t. \tag{7.2} \]
This is possible since $f \in L^1$. We now proceed to subdivide the cubes recursively. Each cube is split into $2^d$ cubes of half the linear size. As soon as the left hand side of (7.2) is $> t$ we stop and put that cube aside. Otherwise we subdivide ad infinitum. At this point, we have a countable collection of cubes $Q_j$ on which the process stopped. Outside these cubes we have $|f(x)| \le t$ almost everywhere. This is a consequence of the martingale convergence theorem. Now, each cube $Q_j$ has a predecessor cube of twice the linear size on which the mean was $\le t$. It follows from this that $\int_{Q_j} |f|\,dx \le 2^d t|Q_j|$. We now split $f = g + b$, a good function plus a bad function, where $b = \sum_j b_j$ and each $b_j$ lives on $Q_j$. Outside the union of the $Q_j$ we set $g = f$. On $Q_j$ we set $g$ to be the average of $f$ on $Q_j$. (Note that actually $g = E_{\mathcal{G}_\tau} f$ for a stopping time $\tau$.) Each $b_j$ is given by
\[ b_j = \mathbb{1}_{Q_j}\left( f - |Q_j|^{-1} \int_{Q_j} f\,dx \right). \]
At this point we have

• $\|g\|_1 \le \|f\|_1$.
• $|g| \le 2^d t$.
• $\sum_j |Q_j| \le t^{-1}\|f\|_1$.
• $\sum_j \|b_j\|_1 \le 2\|f\|_1$.
• $\int_{Q_j} b_j\,dx = 0$.

Now we get $\|Rg\|_2^2 \le Ct\|f\|_1$ and hence the measure of the set where $|Rg| > t/2$ is $\le Ct^{-1}\|f\|_1$. Now let $Q_j^*$ be a cube with the same center $y_j$ as $Q_j$ but $2\sqrt{d}$ times the size, in such a way that
\[ x \notin Q_j^*,\ y \in Q_j \implies |x - y_j| \ge 2|y - y_j|. \]
Then the total measure of the $Q_j^*$ is at most $Ct^{-1}\|f\|_1$. We can ignore that set. Now let $X = \mathbb{R}^d \setminus \bigcup_j Q_j^*$. It will suffice to show the estimate
\[ \int_X |Rb|\,dx \le C\|f\|_1 \]
or indeed
\[ \int_{\mathbb{R}^d \setminus Q_j^*} |Rb_j|\,dx \le C\|b_j\|_1. \]
But, since $b_j$ has mean zero,
\begin{align*}
\int_{\mathbb{R}^d \setminus Q_j^*} |Rb_j|\,dx
 &\le \int_{\mathbb{R}^d \setminus Q_j^*} \left| \int_{Q_j} \bigl( K(x - y) - K(x - y_j) \bigr) b_j(y)\,dy \right| dx \\
 &\le \int_{Q_j} \int_{\mathbb{R}^d \setminus Q_j^*} \bigl| K(x - y) - K(x - y_j) \bigr|\,dx\,|b_j(y)|\,dy \\
 &\le \int_{Q_j} \int_{|x'| \ge 2|y - y_j|} \bigl| K(x' - (y - y_j)) - K(x') \bigr|\,dx'\,|b_j(y)|\,dy \\
 &\le C\|b_j\|_1
\end{align*}
after putting $x' = x - y_j$ and using $|x'| = |x - y_j| \ge 2|y - y_j|$. This completes the sketch.

7.3 Riesz Potentials

First of all, we need the full version of the Marcinkiewicz interpolation theorem.

THEOREM 105 Let $1 \le p_0, p_1, q_0, q_1 \le \infty$, $0 < \theta < 1$, $q_0 \ne q_1$, $p_j \le q_j$ ($j = 0, 1$). Let $T$ be a sublinear operator of weak type $(p_j, q_j)$. Then $T$ is strong type $(p, q)$ where
\[ \frac{1}{p} = \frac{1 - \theta}{p_0} + \frac{\theta}{p_1} \qquad\text{and}\qquad \frac{1}{q} = \frac{1 - \theta}{q_0} + \frac{\theta}{q_1}. \]
Here, by weak type $(p, q)$ we mean an operator from $L^p$ to $L^{q,\infty}$ and by strong type $(p, q)$ we mean an operator from $L^p$ to $L^q$.

THEOREM 106 Let $1 < p, q, r < \infty$ and suppose that
\[ \frac{1}{q} = \frac{1}{p} - \frac{1}{r'} = \frac{1}{r} - \frac{1}{p'}. \]
Let $K \in L^{r,\infty}$. Then $\|K * f\|_q \le C\|K\|_{r,\infty}\|f\|_p$, where the convolution can be taken over any locally compact group (but we are mainly interested in $\mathbb{R}^d$).

Proof. Without loss of generality we can work with nonnegative functions. We take $\|K\|_{r,\infty} = 1$ and $\|f\|_p = 1$ as normalizations. Then $K^*(t) \le t^{-\frac{1}{r}}$. Let $\tau > 0$ and define $A$ to be the set of measure $\tau$ where $K$ is largest. We cut $K_1 = \mathbb{1}_A K$ and $K_\infty = \mathbb{1}_{A^c} K$. Then we have
\[ \|K_1\|_1 \le \int_0^\tau t^{-\frac{1}{r}}\,dt \sim \tau^{1 - \frac{1}{r}} \]
and
\[ \|K_\infty\|_{p'}^{p'} \le \int_\tau^\infty t^{-\frac{p'}{r}}\,dt \sim \tau^{1 - \frac{p'}{r}}. \]
The first of these integrals converges since $r > 1$ and the second since $q < \infty$. We now get
\[ \|K_\infty * f\|_\infty \le \|K_\infty\|_{p'}\|f\|_p \le C\tau^{\frac{1}{p'} - \frac{1}{r}} = \frac{t}{2} \]
where the equality will be used to determine $\tau$ given $t > 0$. Then
\[ \|K_1 * f\|_p \le \|K_1\|_1\|f\|_p \le C\tau^{1 - \frac{1}{r}} \sim t^{\frac{p'(1 - r)}{p' - r}}. \]
We now have
\[ |\{K * f > t\}| \le |\{K_1 * f > t/2\}| \le Ct^{-p}\|K_1 * f\|_p^p \sim t^{-p}\,t^{\frac{pp'(1 - r)}{p' - r}} = t^{-q} \]
after a lengthy computation with the indices. This shows that convolution with $K$ is of weak type $(p, q)$. Using the Marcinkiewicz interpolation theorem this can now be improved to strong type $(p, q)$.
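The "lengthy computation with the indices" in the proof of Theorem 106 can be organized as follows. This is a sketch only; the abbreviations $a = 1/p$ and $b = 1/r$ are introduced here for bookkeeping and are not part of the original notes.

```latex
% Abbreviations introduced here: a = 1/p, b = 1/r, so that
% 1/p' = 1 - a  and  1/q = 1/r - 1/p' = a + b - 1.
\begin{align*}
\tau^{\frac1{p'}-\frac1r} &= \tau^{\,1-a-b} = \tau^{-\frac1q} \sim t
  &&\Longrightarrow\quad \tau \sim t^{-q}, \\
\|K_1 * f\|_p &\lesssim \tau^{\,1-\frac1r} = \tau^{\,1-b} \sim t^{-q(1-b)}, \\
|\{K * f > t\}| &\lesssim t^{-p}\,\|K_1 * f\|_p^p \sim t^{-p\,(1+q(1-b))} .
\end{align*}
% Finally 1 + q(1-b) = q/p, because dividing by q gives
% 1/q + (1 - b) = (a + b - 1) + (1 - b) = a = 1/p.
% Hence |{K * f > t}| <~ t^{-q}, which is the weak type (p,q) estimate.
```

The exponent $-q(1-b)$ agrees with $\frac{p'(1-r)}{p'-r}$, as one sees by substituting $p' = 1/(1-a)$ and $r = 1/b$.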
According to Stein's book, we have the Riesz potential operator
\[ I_k(f) = (-\Delta)^{-\frac{k}{2}}(f) \qquad (k \text{ real},\ 0 < k < d) \tag{7.3} \]
and
\[ I_k(f)(x) = \frac{1}{\gamma(k)} \int |y|^{-d+k} f(x - y)\,dy. \]
The kernel $K(y) = |y|^{-d+k}$ is locally integrable in the range given. We have
\[ \gamma(k) = \frac{\pi^{d/2}\,2^k\,\Gamma(k/2)}{\Gamma((d - k)/2)}. \]
The meaning of (7.3) is
\[ \widehat{I_k(f)}(u) = (2\pi|u|)^{-k}\hat f(u). \]
These statements are formal in the first instance, but can be verified for functions $f$ in the Schwartz class $\mathcal{S}(\mathbb{R}^d)$ with some difficulty. The Schwartz class consists of functions that are infinitely differentiable and such that derivatives of all orders are bounded when multiplied by polynomials. Applying the last theorem we have

PROPOSITION 107 (HARDY–LITTLEWOOD–SOBOLEV LEMMA) $I_k$ extends to a bounded operator $L^p(\mathbb{R}^d) \to L^q(\mathbb{R}^d)$ where $0 < k < d$, $p > 1$, $q < \infty$ and
\[ \frac{1}{q} = \frac{1}{p} - \frac{k}{d}. \]

7.4 Sobolev Spaces

We start by studying Sobolev spaces on $\mathbb{R}^d$. They may also be defined on open subsets of $\mathbb{R}^d$ and also on differentiable manifolds. Basically, the space $W^{k,p}(\mathbb{R}^d)$ consists of all functions $f$ which together with all derivatives of order $\le k$ lie in $L^p(\mathbb{R}^d)$. The derivatives have to be taken in the weak sense. So, explicitly, this means that for every multiindex $\alpha = (\alpha_1, \ldots, \alpha_d)$ with $|\alpha| \le k$ there exists a function $g_\alpha \in L^p$ with the property that
\[ \int f(x)\,\frac{\partial^{|\alpha|}\varphi}{\partial x_1^{\alpha_1} \cdots \partial x_d^{\alpha_d}}(x)\,dx = (-1)^{|\alpha|} \int g_\alpha(x)\varphi(x)\,dx \]
for every test function $\varphi \in C_c^\infty(\mathbb{R}^d)$. While there are many possible equivalent norms on $W^{k,p}$ we would typically take
\[ \|f\|_{W^{k,p}}^p = \sum_{0 \le |\alpha| \le k} \|g_\alpha\|_p^p \]
and this defines $W^{k,p}$ as a Banach space. In the case $p = 2$ the above norm would correspond to a Hilbert space.

PROPOSITION 108 Let $1 \le p < \infty$. Then the space $C_c^\infty(\mathbb{R}^d)$ is dense in $W^{k,p}(\mathbb{R}^d)$.

Proof. We choose a nonnegative function $\varphi$ in $C_c^\infty(\mathbb{R}^d)$ with integral $1$. Let $\varphi_\epsilon(x) = \epsilon^{-d}\varphi(\epsilon^{-1}x)$, so that $\varphi_\epsilon$ is a summability kernel as $\epsilon \to 0+$. Then it is clear that for $f \in W^{k,p}$, $\varphi_\epsilon * f \to f$ in $W^{k,p}$-norm. This is essentially because convolution with $\varphi_\epsilon$ and partial differentiations commute.
The resulting function $g$ in this approximation is $C^\infty$, but it does not necessarily have compact support. To fix this, we take another nonnegative function $\psi$ in $C_c^\infty(\mathbb{R}^d)$ with $\psi(0) = 1$. Let $\psi_\epsilon(x) = \psi(\epsilon x)$ and define $g_\epsilon(x) = g(x)\psi_\epsilon(x)$. Then, we obtain
\[ \partial^\alpha g_\epsilon(x) = \sum_\beta c_{\alpha,\beta}\,\partial^\beta g(x)\,\epsilon^{|\alpha| - |\beta|}\,(\partial^{\alpha - \beta}\psi)(\epsilon x) \]
for some constants $c_{\alpha,\beta}$ and where the sum is taken over multiindices $\beta$ such that $0 \le \beta \le \alpha$. Thus, it follows that
\[ \partial^\alpha g_\epsilon(x) - \partial^\alpha g(x)\,\psi_\epsilon(x) = \sum_{\beta \ne \alpha} c_{\alpha,\beta}\,\partial^\beta g(x)\,\epsilon^{|\alpha| - |\beta|}\,(\partial^{\alpha - \beta}\psi)(\epsilon x). \]
For the terms on the right, $\partial^\beta g$ is in $L^p$, $\partial^{\alpha - \beta}\psi$ is in $L^\infty$ and $|\alpha| - |\beta| \ge 1$, so the right hand side tends to zero in $L^p$ norm. But $\partial^\alpha g\,\psi_\epsilon$ tends to $\partial^\alpha g$ and the result follows.

Note that this result will not hold in the context of $W^{k,p}(\Omega)$ for $\Omega$ an open subset of $\mathbb{R}^d$. A corollary of this result is that for $f \in W^{k,p}(\mathbb{R}^d)$ any $k$th order (or less) directional derivative is in $L^p$ (in the weak sense).

Now let $1 < p < \infty$. The statement $f \in W^{k,p}(\mathbb{R}^d)$ is roughly equivalent to $\prod_j (2\pi i u_j)^{\alpha_j}\hat f(u)$ being in $\mathcal{F}L^p$. We can apply Hilbert transforms in each variable separately to show that $\prod_j |u_j|^{\alpha_j}\hat f(u)$ is in $\mathcal{F}L^p$. Indeed, similarly for any unit vector $v$ we have that $|v \cdot u|^\ell \hat f(u)$ is in $\mathcal{F}L^p$ for $0 \le \ell \le k$. Next average over all $v$ in the unit sphere and we obtain that $|u|^\ell \hat f(u)$ is in $\mathcal{F}L^p$ for $0 \le \ell \le k$. Conversely, if $|u|^\ell \hat f(u)$ is in $\mathcal{F}L^p$, then by applying suitable Riesz transforms we find that $\prod_j (2\pi i u_j)^{\alpha_j}\hat f(u)$ is in $\mathcal{F}L^p$ for every $\alpha$ with $0 \le |\alpha| \le k$.

Thus, for $1 < p < \infty$ we have the following characterization of $W^{k,p}(\mathbb{R}^d)$. We have $f \in W^{k,p}(\mathbb{R}^d)$ if and only if there exist functions $g_\ell \in L^p$ for $0 \le \ell \le k$ such that $f = I_\ell(g_\ell)$ with $I_\ell$ the Riesz potential of order $\ell$.

With a little more work one can establish the following. We have $f \in W^{k,p}(\mathbb{R}^d)$ if and only if there exists a function $g \in L^p$ such that $f = J_k(g)$ with $J_k$ the Bessel potential of order $k$.
The Bessel potential has
\[ \widehat{J_k(f)}(u) = (1 + 4\pi^2|u|^2)^{-\frac{k}{2}}\hat f(u). \]
The kernel for the Bessel potential has no simple formula, but it is a positive radial function with a singularity like $|x|^{-d+k}$ at the origin and exponential decay at infinity. Since the Bessel potential is perfectly good when $k$ is nonintegral, this allows a way to define $W^{k,p}(\mathbb{R}^d)$ for $1 < p < \infty$ and $k$ nonnegative. In case $p = 2$ this is a Hilbert space with the norm being
\[ \|f\|_{W^{k,2}}^2 = \int (1 + |u|^2)^k |\hat f(u)|^2\,du. \]

THEOREM 109 (SOBOLEV EMBEDDING THEOREM) Let $1 < p, q < \infty$, $\frac{1}{q} = \frac{1}{p} - \frac{k}{d}$. Then $W^{k,p}(\mathbb{R}^d) \subseteq L^q(\mathbb{R}^d)$.

Proof. Let $f \in W^{k,p}(\mathbb{R}^d)$. We apply the Hardy–Littlewood–Sobolev Lemma to the gadget $g \in L^p$ with Fourier transform $\hat g(u) = |u|^k \hat f(u)$.

We also have the following.

THEOREM 110 Let $1 < p < \infty$, $0 < \beta < 1$ and $-\frac{\beta}{d} = \frac{1}{p} - \frac{k}{d}$. Then any function in $W^{k,p}(\mathbb{R}^d)$ is Hölder continuous of order $\beta$.

Proof. The proof is as above. We have for $f \in W^{k,p}(\mathbb{R}^d)$,
\[ f(x) = c_k \int |x - y|^{-d+k} g(y)\,dy \]
where $g \in L^p$. Thus
\[ |f(x_1) - f(x_2)| \le c_k \int \Bigl|\, |x_1 - y|^{-d+k} - |x_2 - y|^{-d+k} \Bigr|\,|g(y)|\,dy. \]
By Hölder's inequality, we simply need to show that
\[ \int \Bigl|\, |x_1 - y|^{-d+k} - |x_2 - y|^{-d+k} \Bigr|^{p'} dy \le C|x_1 - x_2|^{\beta p'} \]
or by translation invariance that
\[ \int \Bigl|\, |y|^{-d+k} - |y - x|^{-d+k} \Bigr|^{p'} dy \le C|x|^{\beta p'}. \]
One checks that the homogeneities are correct (both sides scale like $|x|^{(k-d)p' + d}$, and $(k - d)p' + d = \beta p'$ because $\beta = k - \frac{d}{p}$) and then that the integral on the left converges at $y = 0$, $y = x$ and $y$ near infinity.

More generally, one may assert that if $f \in W^{k,p}(\mathbb{R}^d)$, then $\partial^\alpha f$ is Hölder continuous of order $\beta$ for $1 < p < \infty$, $0 < \beta < 1$, $\frac{1}{p} - \frac{k}{d} = -\frac{|\alpha| + \beta}{d}$ and $0 \le |\alpha| \le k$.

Alternative proofs of many of these results can be obtained using some tricks of Gagliardo, Loomis and Nirenberg. These even handle the case $p = 1$. We start with the following bizarre lemma.

LEMMA 111 Let $d \ge 2$ and $f_j$ be functions in $L^p(\mathbb{R}^{d-1})$ for $j = 1, \ldots, d$. Let $\pi_j : \mathbb{R}^d \to \mathbb{R}^{d-1}$ be the mapping that forgets the $j$th coordinate. Let $F = \prod_{j=1}^d f_j \circ \pi_j$. Then
\[ \|F\|_{\frac{p}{d-1}} \le \prod_{j=1}^d \|f_j\|_p. \]

Proof. The proof is by induction on $d$.
In the case $d = 2$, we have essentially $F(x_1, x_2) = f_1(x_2)f_2(x_1)$ and the result is obvious. We illustrate the induction step by deducing the case $d = 4$ from the case $d = 3$. So,
\[ F(x_1, x_2, x_3, x_4) = f_1(x_2, x_3, x_4)\,f_2(x_1, x_3, x_4)\,f_3(x_1, x_2, x_4)\,f_4(x_1, x_2, x_3). \]
Now for $j = 1, 2, 3$ we view $f_j$ as a function $g_j$ in the $x_4$ variable taking values in $L^p(\mathbb{R}^2)$. As such, its $L^p$ norm is just the $L^p$ norm of $f_j$. Applying the induction hypothesis, we find that for fixed $x_4$, the norm of the function
\[ (x_1, x_2, x_3) \mapsto f_1(x_2, x_3, x_4)\,f_2(x_1, x_3, x_4)\,f_3(x_1, x_2, x_4) \]
in $L^{\frac{p}{2}}$ is at most $\prod_{j=1}^3 \|g_j(x_4)\|_p$. Therefore, the function
\[ g(x_4; x_1, x_2, x_3) = f_1(x_2, x_3, x_4)\,f_2(x_1, x_3, x_4)\,f_3(x_1, x_2, x_4) \]
is in $L^{\frac{p}{3}}$ of the $x_4$ variable, taking values in $L^{\frac{p}{2}}$ of the $x_1, x_2, x_3$ variables. To get $F$, we multiply $g$ by $f_4$ which is in $L^\infty$ of the $x_4$ variable, taking values in $L^p$ of the $x_1, x_2, x_3$ variables. The result is a function in $L^{\frac{p}{3}}$ of the $x_4$ variable, taking values in $L^{\frac{p}{3}}$ of the $x_1, x_2, x_3$ variables, that is a function in $L^{\frac{p}{3}}(\mathbb{R}^4)$.

PROPOSITION 112 Let $h \in W^{1,1}(\mathbb{R}^d)$ (quantitatively) and also in $C^1(\mathbb{R}^d)$ (qualitatively). Then $h \in L^{\frac{d}{d-1}}(\mathbb{R}^d)$.

Indeed, since $C^1(\mathbb{R}^d)$ is dense in $W^{1,1}(\mathbb{R}^d)$, we see that $W^{1,1}(\mathbb{R}^d) \subseteq L^{\frac{d}{d-1}}(\mathbb{R}^d)$ at least in the weak sense by extension by continuity.

Proof. We have for every $j = 1, \ldots, d$ that
\[ |h(x)| \le \int_{-\infty}^{\infty} \left| \frac{\partial h}{\partial x_j}(x_1, \ldots, \overbrace{t}^{j\text{th place}}, \ldots, x_d) \right| dt. \tag{7.4} \]
We let $f_j(x)$ equal the right hand side of (7.4), a function of all the variables except $x_j$. This function is in $L^1(\mathbb{R}^{d-1})$ since $h \in W^{1,1}(\mathbb{R}^d)$. We now obtain
\[ |h(x)| \le \prod_{j=1}^d f_j(x)^{\frac{1}{d}} \]
and $f_j^{\frac{1}{d}} \in L^d(\mathbb{R}^{d-1})$. Therefore by the last lemma, $h \in L^{\frac{d}{d-1}}(\mathbb{R}^d)$, as required.

In fact, more is true: $W^{1,1}(\mathbb{R}^d) \subseteq L^{\frac{d}{d-1},1}(\mathbb{R}^d)$. This is a consequence of the coarea formula and the isoperimetric inequality.

We can now extend to

THEOREM 113 Let $h \in W^{1,p}(\mathbb{R}^d)$ (quantitatively) and also in $C^\infty(\mathbb{R}^d)$ (qualitatively).
Then $h \in L^q(\mathbb{R}^d)$ for
\[ \frac{1}{p} - \frac{1}{d} = \frac{1}{q} \tag{7.5} \]
assuming $p \ge 1$ and $q < \infty$.

Proof. This has just been proved for $p = 1$, so we assume $p > 1$. Let $g = h|h|^\alpha$ with $\alpha > 0$. Then we get $\nabla g = (1 + \alpha)|h|^\alpha \nabla h$ and indeed $g$ is in $C^1(\mathbb{R}^d)$. We get
\[ \int |\nabla g|\,dx \le (1 + \alpha)\|\nabla h\|_p\,\bigl\| |h|^\alpha \bigr\|_{p'} \]
from Hölder's inequality. We will choose $\alpha p' = q$, so that in fact this reads
\[ \int |\nabla g|\,dx \le (1 + \alpha)\|\nabla h\|_p\,\|h\|_q^\alpha. \]
Applying the previous proposition, it comes that
\[ \|h\|_q^{1+\alpha} = \|g\|_{\frac{d}{d-1}} \le (1 + \alpha)\|\nabla h\|_p\,\|h\|_q^\alpha \]
where $(1 + \alpha)\frac{d}{d-1} = q$. It's easy to check that these conditions are consistent and amount to (7.5).

This proof is easily pushed by induction to give $W^{k,p}(\mathbb{R}^d) \subseteq L^q(\mathbb{R}^d)$ for
\[ \frac{1}{p} - \frac{k}{d} = \frac{1}{q} \]
for $k$ an integer $\ge 1$, $p \ge 1$ and $q < \infty$.

We may also define Sobolev spaces on an open subset of $\mathbb{R}^d$. The space $W^{k,p}(\Omega)$ consists of all functions $f$ such that $\partial^\alpha f \in L^p(\Omega)$ for all $\alpha$ with $|\alpha| \le k$. The $L^p$ norm is taken with respect to Lebesgue measure cut on $\Omega$.

PROPOSITION 114 Let $\Omega$ be an open convex subset of $\mathbb{R}^d$ with $C^1$ boundary, $d < p < \infty$ and $p^{-1} - d^{-1} = -\beta d^{-1}$. Then $W^{1,p}(\Omega) \subseteq C^{0,\beta}(\Omega)$, the space of Hölder continuous functions of order $\beta$ on $\Omega$.

Proof. We clearly have $0 < \beta < 1$. So given two points $x_1, x_2$ of $\Omega$, it suffices to show that
\[ |f(x_1) - f(x_2)| \le C\|\nabla f\|_p\,|x_1 - x_2|^\beta \]
and indeed, it is enough to show this for $|x_1 - x_2|$ sufficiently small. Consider a family of paths joining $x_2$ to $x_1$,
\[ x_u(t) = \frac{1}{2}(x_1 + x_2) + \frac{1}{2}(x_1 - x_2)\sin(t) + \frac{1}{2}|x_1 - x_2|\,u\cos(t), \qquad -\frac{\pi}{2} \le t \le \frac{\pi}{2}, \]
as $u$ runs over vectors of norm $\le 1$ orthogonal to $x_1 - x_2$ and such that $u \cdot n > 0$ where $n$ is a suitable unit vector, typically the inward unit normal to $\partial\Omega$ near the points in question. These paths lie entirely inside $\Omega$. We get
\[ f(x_1) - f(x_2) = \int_{-\frac{\pi}{2}}^{\frac{\pi}{2}} \nabla f(x_u(t)) \cdot x_u'(t)\,dt \]
leading to
\[ |f(x_1) - f(x_2)| \le \int_{-\frac{\pi}{2}}^{\frac{\pi}{2}} |\nabla f(x_u(t))|\,|x_u'(t)|\,dt \le C|x_1 - x_2| \int_{-\frac{\pi}{2}}^{\frac{\pi}{2}} |\nabla f(x_u(t))|\,dt. \]
As $u$ varies, these paths fill out a solid hemisphere $H$.
Therefore, averaging over $u$ in a suitable way, we get
\[ |f(x_1) - f(x_2)| \le C\,\frac{|x_1 - x_2|}{\operatorname{meas}(H)} \int_H |\nabla f(y)|\,dy \]
and hence
\[ |f(x_1) - f(x_2)| \le C|x_1 - x_2|^{1-d}\,\|\nabla f\|_p\,\operatorname{meas}(H)^{\frac{1}{p'}} \le C|x_1 - x_2|^{1 - \frac{d}{p}}\,\|\nabla f\|_p \]
as required.

By replacing the hemispheres with more complicated blobs, one may hope to extend this argument to more general open sets. We state a few results without proof.

THEOREM 115 Let $\Omega$ be a Lipschitz domain in $\mathbb{R}^d$. Basically, this means that the boundary can be specified locally as the graphs of Lipschitz functions suitably oriented. Let $1 \le p \le \infty$ and $k \in \mathbb{N}$. Then there is a continuous extension operator
\[ T : W^{k,p}(\Omega) \to W^{k,p}(\mathbb{R}^d) \]
with the extension property $T(f)|_\Omega = f$.

In the case $1 < p < \infty$, where we may define $W^{k,p}(\mathbb{R}^d)$ for $k > 0$ nonintegral, we may also define $W^{k,p}(\Omega)$ as the quotient of this space by restriction. In the particular case $p = 2$, the resulting space $W^{k,2}(\Omega)$ will be a Hilbert space.

PROPOSITION 116 Let $k \in \mathbb{N}$ and $1 \le p < \infty$. For any open $\Omega \subseteq \mathbb{R}^d$, $C^\infty(\Omega) \cap W^{k,p}(\Omega)$ is dense in $W^{k,p}(\Omega)$.

Sketch proof. We construct a sequence $\varphi_j$ of $C^\infty$ functions of compact support with
\[ \operatorname{supp}(\varphi_j) \subseteq \Omega_j \subset \operatorname{cl}(\Omega_j) \subset \Omega, \]
with $\sum_{j=1}^\infty \varphi_j = \mathbb{1}_\Omega$ and with the property that every $x \in \Omega$ has a neighbourhood that meets only finitely many of the $\Omega_j$. Typically, the $\Omega_j$ are "shells" tending to the boundary of $\Omega$. Now let $f \in W^{k,p}(\Omega)$ be the function that we wish to approximate. Let $\epsilon > 0$ and $\epsilon_j = 2^{-j}\epsilon$. Then $f\varphi_j$ is certainly in $W^{k,p}(\Omega)$. We now find nonnegative $C^\infty$ functions $\psi_j$ with integral equal to one (basically a convolution approximate identity). The additional requirements on $\psi_j$ are that
\[ \|f\varphi_j - (f\varphi_j) * \psi_j\|_{W^{k,p}} < \epsilon_j \qquad\text{and}\qquad \operatorname{supp}((f\varphi_j) * \psi_j) \subseteq \Omega_j. \]
The approximation is possible since partial derivatives commute with convolutions. The approximating function $g$ is taken to be
\[ g = \sum_{j=1}^\infty (f\varphi_j) * \psi_j. \tag{7.6} \]
We have $\|f - g\|_{W^{k,p}} \le \sum_{j=1}^\infty \|f\varphi_j - (f\varphi_j) * \psi_j\|_{W^{k,p}} < \epsilon$.
Each individual $(f\varphi_j) * \psi_j$ is $C^\infty$ since $f\varphi_j$ is an $L^p$ function of compact support and $\psi_j$ is in $C_c^\infty$. The sum (7.6) is $C^\infty$ since locally it is only a finite sum of $C^\infty$ functions.

Note that there is a related topological concept called paracompactness (every open cover has a locally finite refinement). The concept of refinement of a covering is different from the concept of subcovering in that it allows each open set in the cover to be replaced by a (possibly) smaller open set. Every metric space is paracompact. The concept is defined to enable the construction of resolutions of the identity on manifolds.

LEMMA 117 (THE RELLICH LEMMA) Let $\Omega \subseteq \mathbb{R}^d$ be a bounded open set with $C^1$ boundary. Let $1 \le p < d$, $q^{-1} = p^{-1} - d^{-1}$. Then the inclusion mapping of $W^{1,p}(\Omega)$ into $L^q(\Omega)$ is not merely continuous (as may be deduced from Theorem 115) but is also a compact operator.

8 Symbolic Calculus of Hilbert space operators

8.1 Spectral theory of normal operators and projection measures

A $*$-algebra $A$ is an algebra with a $*$ operation, that is for every element $x \in A$, there is an element $x^*$. We ask that $x^{**} = x$, that $x \mapsto x^*$ is conjugate linear, that $(xy)^* = y^*x^*$ and also that $\|x^*\| = \|x\|$. A $C^*$ algebra is a Banach $*$-algebra with the additional property that $\|x^*x\| = \|x\|^2$. In fact, it can be shown that every unital $C^*$ algebra is isomorphic with a closed $*$-subalgebra of the algebra of all bounded operators on a Hilbert space. Once you know this theorem, then you might as well be dealing with subalgebras of $B(H)$.

What we are going to do in this section is to use the Gelfand theory to discuss the spectral theory of normal operators on Hilbert space. If $T$ is such a normal operator, then the closed algebra generated by $T$ and its adjoint $T^*$ is a commutative closed subalgebra of $B(H)$ with identity and hence a commutative $C^*$ algebra with identity. Some of the results that follow will apply to general commutative $C^*$ algebras.
Every element $x \in A$ can be written in the form
\[ x = \frac{1}{2}(x + x^*) - i\,\frac{1}{2}\bigl((ix) + (ix)^*\bigr), \]
so in particular in the form $x = x_1 + ix_2$ where $x_j^* = x_j$ for $j = 1, 2$. If $x^* = x$ we say that $x$ is hermitian or self-adjoint. If $x$ is hermitian and $\varphi$ is an mlf on $A$, then $\varphi(x)$ is real. To see this, we let for $t$ real $u_t = \exp(itx)$, given by a convergent power series. We get
\[ \|u_t\|^2 = \|u_t^* u_t\| = \|\exp(-itx)\exp(itx)\| = \|\exp(0)\| = 1 \]
using the commutativity to collapse the product of exponentials. It follows that
\[ |\exp(it\varphi(x))| = |\varphi(u_t)| \le \|u_t\| = 1 \]
for all real $t$. Thus, $\varphi(x)$ must be real. It follows from this that for general $x$,
\[ \varphi(x^*) = \overline{\varphi(x)}. \tag{8.1} \]
Also, if $x$ is hermitian, we have $\|x\|^2 = \|x^2\|$. Replacing $x$ by $x^2$ this gives $\|x\|^4 = \|x^2\|^2 = \|x^4\|$ and an easy induction gives $\|x\|^{2^n} = \|x^{2^n}\|$. The spectral radius formula now gives $\|\hat x\|_\infty = \|x\|$. For a general element $x \in A$, we have
\[ \|\hat x\|_\infty^2 \le \|x\|^2 = \|x^*x\| = \|\widehat{x^*x}\|_\infty \le \|\hat x\|_\infty^2 \]
since $x^*x$ is hermitian. Thus the Gelfand transform is an isometry and in particular injective. But the algebra of Gelfand transforms is a complete uniform separating subalgebra of $C(M_A)$ with identity, closed under complex conjugation by (8.1), and by the Stone–Weierstrass theorem it must be the whole of $C(M_A)$.

In the case that $A$ is generated by a normal $T$ and its adjoint $T^*$ we see that the mapping $M_A$ to $\operatorname{sp}(T)$ given by $\varphi \mapsto \varphi(T)$ is one-to-one. For, if $\varphi_1(T) = \varphi_2(T)$, then $\varphi_1(T^*) = \varphi_2(T^*)$ and hence $\varphi_1 = \varphi_2$ since $T$ and $T^*$ generate $A$. Thus, we may identify $M_A$ to $\operatorname{sp}(T)$.

Now for a continuous function $\theta$ on $\operatorname{sp}(T)$ and for $\xi, \eta \in H$, we consider the map $\theta \mapsto \langle \eta, \theta(T)\xi \rangle$. This is a continuous linear form on $C(\operatorname{sp}(T))$ and hence is given by integration against a measure $\mu_{\eta,\xi}$. We have
\[ \langle \eta, \theta(T)\xi \rangle = \int_{\operatorname{sp}(T)} \theta(z)\,d\mu_{\eta,\xi}(z). \]
Note that $\|\mu_{\eta,\xi}\| \le \|\eta\|\|\xi\|$. Also, the map $(\eta, \xi) \mapsto \mu_{\eta,\xi}$ is linear in $\xi$ and conjugate linear in $\eta$. We note also that $\overline{\theta}(T) = \theta(T)^*$. This is another way of writing (8.1).
Then we have
$$\int_{\mathrm{sp}(T)} \overline{\theta}\, d\mu_{\eta,\xi} = \langle \eta, \overline{\theta}(T)\xi \rangle = \langle \eta, \theta(T)^*\xi \rangle = \langle \theta(T)\eta, \xi \rangle = \overline{\langle \xi, \theta(T)\eta \rangle} = \overline{\int_{\mathrm{sp}(T)} \theta\, d\mu_{\xi,\eta}}$$
which confirms that $\mu_{\xi,\eta} = \overline{\mu_{\eta,\xi}}$. Hence, for $\theta$ now a bounded Borel function on $\mathrm{sp}(T)$ we may infer the existence of an operator $M(\theta)$ such that
$$\langle \eta, M(\theta)\xi \rangle = \int_{\mathrm{sp}(T)} \theta(z)\, d\mu_{\eta,\xi}(z)$$
and it follows that $M(\overline{\theta}) = M(\theta)^*$ by reversing the argument above. If $\theta, \psi \in C(\mathrm{sp}(T))$ we get
$$M(\theta\psi) = M(\theta)M(\psi) \qquad (8.2)$$
and this yields
$$\int_{\mathrm{sp}(T)} \theta\psi\, d\mu_{\eta,\xi} = \int_{\mathrm{sp}(T)} \theta\, d\mu_{\eta,M(\psi)\xi}.$$
This can now be extended to all $\theta$ bounded Borel on $\mathrm{sp}(T)$. So (8.2) holds for all $\theta$ bounded Borel and all $\psi$ continuous. Replaying the argument with $\theta$ and $\psi$ interchanged, we see that (8.2) holds for all $\theta, \psi$ bounded Borel.

Next, take a Borel subset $X$ of $\mathrm{sp}(T)$ and put $P(X) = M(1_X)$. Then $P(X)$ is a hermitian operator on $H$ and $P(X)^2 = P(X)$, i.e. $P(X)$ is a hermitian projection. We think of $P$ as a projection-valued measure and we write symbolically
$$T = \int_{\mathrm{sp}(T)} z\, dP(z).$$
The symbolic calculus of a normal operator $T$ now amounts to
$$\theta(T) = \int_{\mathrm{sp}(T)} \theta(z)\, dP(z)$$
and is valid for all bounded Borel functions $\theta$ on $\mathrm{sp}(T)$. Note that
$$P(X \cap Y) = M(1_{X \cap Y}) = M(1_X 1_Y) = M(1_X)M(1_Y) = P(X)P(Y)$$
for $X$ and $Y$ Borel sets.

Next, we justify that $P$ is in some sense a measure. The weak operator topology on $B(H)$ is defined by the seminorms
$$T \mapsto |\langle \eta, T(\xi) \rangle|$$
as $\eta$ and $\xi$ run over $H$. The strong operator topology on $B(H)$ is defined by the seminorms
$$T \mapsto \|T(\xi)\|$$
as $\xi$ runs over $H$. Both of these locally convex space topologies are weaker than the operator norm topology.

It is clear that if $X_k$ are disjoint Borel subsets of $\mathrm{sp}(T)$ and $Y_n = \bigcup_{k=1}^n X_k$ then $P(Y_n) = \sum_{k=1}^n P(X_k)$. Let $Y = \bigcup_{k=1}^\infty X_k$. Then it is also clear that $P(Y_n) \to P(Y)$ in the weak operator topology. This follows since the $\mu_{\eta,\xi}$ are countably additive measures.
Now
$$\|(P(Y) - P(Y_n))\xi\|^2 = \langle (P(Y) - P(Y_n))\xi, (P(Y) - P(Y_n))\xi \rangle = \langle \xi, (P(Y) - P(Y_n))^2 \xi \rangle = \langle \xi, (P(Y) - P(Y_n))\xi \rangle \to 0$$
as $n \to \infty$ since
$$(P(Y) - P(Y_n))^2 = P(Y)^2 - P(Y)P(Y_n) - P(Y_n)P(Y) + P(Y_n)^2 = P(Y) - P(Y_n) - P(Y_n) + P(Y_n) = P(Y) - P(Y_n).$$
Thus we have
$$P\Big(\bigcup_{k=1}^\infty X_k\Big) = \sum_{k=1}^\infty P(X_k)$$
with the sum on the right converging in the strong operator topology.

8.2 Symbolic Calculus for Hilbert space contractions

We give the amazingly short proof of von Neumann's inequality due to John Wermer. There are many proofs of the result.

THEOREM 118 Let $T$ be a linear contraction on a Hilbert space $H$. Let $p(z) = \sum_{k=0}^n p_k z^k$ be a polynomial with complex coefficients. The result $p(T) = \sum_{k=0}^n p_k T^k$ of substituting $T$ into the polynomial satisfies $\|p(T)\|_{\mathrm{op}} \le \sup_{|z| \le 1} |p(z)|$. A consequence is that $f(T)$ can be defined and satisfies $\|f(T)\|_{\mathrm{op}} \le \sup_{|z| \le 1} |f(z)|$ for any function $f$ in the disc algebra $A(D)$.

Proof. The first step is to reduce to the case of a finite-dimensional Hilbert space. Obviously, we need only prove
$$\Big| \Big\langle \eta, \sum_{k=0}^n p_k T^k \xi \Big\rangle \Big| \le \|\eta\|\|\xi\| \sup_{|z| \le 1} |p(z)|$$
for all $\xi$ and $\eta$. Let $K$ be the linear span of $\eta$ and $T^k \xi$ for $k = 0, \ldots, n$. Let $J$ denote the inclusion of $K$ into $H$ and $J^*$ the orthogonal projection of $H$ onto $K$. Since $T^k \xi \in K$ for $k = 0, \ldots, n$, an induction gives $(J^* T J)^k \xi = T^k \xi$ for these $k$, and therefore $\langle \eta, p(J^* T J)\xi \rangle = \langle \eta, p(T)\xi \rangle$. Hence it suffices to work with $J^* T J$ which is a linear contraction on $K$, and which we again denote by $T$.

After choosing a suitable (finite) orthonormal basis in $K$, we may now write the matrix of $T$ as $U \operatorname{diag}(\sigma_1, \ldots, \sigma_d) V$ where $U$ and $V$ are unitary and $\sigma_j$ are the singular values of $T$. They satisfy $0 \le \sigma_j \le 1$. Now let $T(w) = U \operatorname{diag}(w_1, \ldots, w_d) V$ allowing $w = (w_1, \ldots, w_d)$ to run over the closed polydisk $\overline{D}^d$, so that $T = T(\sigma)$. Then $q(w) = \langle \eta, p(T(w))\xi \rangle$ is a polynomial in the variables $w_1, \ldots, w_d$ and hence takes its maximum absolute value when $|w_j| = 1$ for all $j = 1, \ldots, d$. But then $T(w)$ is the product of three unitaries and hence is a unitary.
The result for unitary operators (and more generally normal operators) follows from the results in the previous section. The result follows.

9 Odds and Ends

9.1 The Hardy spaces and Blaschke Products

The Hardy spaces are defined by
$$H^p(\mathbb{T}) = \{ f \in L^p(\mathbb{T});\ \hat{f}(n) = 0 \text{ for } n < 0 \},$$
usually for $1 \le p \le \infty$. They are closed linear subspaces of $L^p(\mathbb{T})$. They can also be thought of as spaces of analytic functions in the open unit disk $\Delta = \{ z \in \mathbb{C};\ |z| < 1 \}$:
$$H^p = \Big\{ F;\ F \text{ analytic in } \Delta,\ \sup_{0 \le r < 1} \int_{\mathbb{T}} |F(re^{it})|^p\, d\eta(t) < \infty \Big\}.$$
To establish this for $1 < p \le \infty$, one uses weak* compactness. Let $f_r(t) = F(re^{it})$, then the condition $F \in H^p$ implies that $(f_r)$ is bounded in $L^p(\mathbb{T})$, for $0 \le r < 1$. Hence there is a weak* limit point $f$ in $L^p$. It follows that $\hat{f}(n)$ is the coefficient of $z^n$ in the Maclaurin expansion of $F$ for $n \ge 0$ and $\hat{f}(n) = 0$ for $n < 0$. Hence $F(re^{it}) = (f * P_r)(t)$. This does not quite work for $p = 1$ since $L^1(\mathbb{T})$ is not a dual space. If we want to take a weak* limit, then we should do so in $M(\mathbb{T})$. The situation is saved by the following theorem.

THEOREM 119 (F. & M. RIESZ THEOREM) If $\mu \in M(\mathbb{T})$ and $\hat{\mu}(n) = 0$ for all strictly negative integers $n$, then $\mu$ is absolutely continuous with respect to linear measure on $\mathbb{T}$.

One very basic question that may be asked is "what are the closed linear subspaces of $\ell^2$ that are invariant under the forward shift?" An equivalent formulation is: which closed linear subspaces $S$ of $H^2(\mathbb{T})$ have the property that $f \in S \implies e_1 f \in S$?

PROPOSITION 120 Let $S$ be a closed linear subspace of $H^2(\mathbb{T})$ with the property that $f \in S \implies e_1 f \in S$. Then either $S = \{0\}$ or there exists $q \in H^2(\mathbb{T})$ with $|q| = 1$ $\eta$-almost everywhere such that $S = qH^2$.

Proof. Assume that $S \ne \{0\}$. Then we may define
$$m = \inf\{ n \in \mathbb{Z}^+;\ \exists f \in S \text{ such that } \hat{f}(n) \ne 0 \}.$$
Then there exists $f \in S$ such that $\hat{f}(m) \ne 0$. It follows that $f \notin e_1 S$, since every element of $e_1 S$ has vanishing $m$-th Fourier coefficient. Therefore $e_1 S$ is a proper closed linear subspace of $S$. (Exercise: Why is $e_1 S$ closed?) Now let $q \in S \cap (e_1 S)^\perp$ with $\|q\|_2 = 1$.
Now for $n \ge 1$, $e_n q \in e_1 S$ (proof by induction). Therefore $q \perp e_n q$ for $n \ge 1$. This says
$$\int e_n |q|^2\, d\eta = 0$$
for $n \ge 1$ and taking complex conjugates also for $n \le -1$. On the other hand we have $\int e_0 |q|^2\, d\eta = 1$ and hence complete knowledge of the Fourier coefficients of $|q|^2$. But $|q|^2 \in L^1$. Hence by the uniqueness theorem, $|q|^2 = 1$.

We claim that if $f \in H^2$, then $qf \in S$. To see this, recall
$$f_N = \sum_{n=0}^N \Big(1 - \frac{n}{N}\Big) \hat{f}(n) e_n$$
is a trigonometric polynomial with nonnegative exponents and so $q f_N \in S$. But since $q$ is bounded, $q f_N \to qf$ in $L^2$ norm and the claim is proved since $S$ is closed.

So $qH^2 \subseteq S$. Both sets are closed linear subspaces of $H^2$. If the inclusion is strict, we may find $g \in S \cap (qH^2)^\perp$ with $\|g\|_2 = 1$. Now since $g \in (qH^2)^\perp$ we have
$$\int g\, \overline{q}\, \overline{e_n}\, d\eta = 0$$
for all $n \ge 0$. But since $q \in (e_1 S)^\perp$, and $g \in S$ we have
$$\int e_n\, g\, \overline{q}\, d\eta = 0$$
for all $n \ge 1$. This shows that $g\overline{q}$ has all its Fourier coefficients zero. Since $g\overline{q} \in L^1$, it follows that $g\overline{q} = 0$. But $|q| = 1$ and hence $g = 0$ contradicting $\|g\|_2 = 1$.

We use the notation $\Delta$ for the open unit disk.

PROPOSITION 121 Let $F \in H^1(\Delta)$ and $F$ not the zero element of $H^1$. Then the zeros of $F$, say $\alpha_j$ (as $j$ runs over a necessarily countable index set), satisfy $\sum_j (1 - |\alpha_j|) < \infty$.

Sketch proof. We start by factoring out the zeros at the origin. If $F$ has a zero of order $p$ at $z = 0$, then write $F(z) = z^p G(z)$ and replace $F$ by $G$. Hence we may always assume that $F(0) \ne 0$. Let
$$B_j(z) = \frac{\overline{\alpha_j}}{|\alpha_j|} \cdot \frac{\alpha_j - z}{1 - \overline{\alpha_j} z},$$
a gadget called a Blaschke factor. It is continuous on the closed unit disk and has $|B_j(z)| = 1$ for $|z| = 1$. Similarly, we may factor out Blaschke factors and write
$$F(z) = G(z) \prod_{j=1}^n B_j(z)$$
where $G \in H^1$ and $\|g\|_1 = \|f\|_1$. We get
$$|F(0)| = |G(0)| \prod_{j=1}^n |B_j(0)| = |G(0)| \prod_{j=1}^n |\alpha_j| \le \|g\|_1 \prod_{j=1}^n |\alpha_j| = \|f\|_1 \prod_{j=1}^n |\alpha_j|.$$
It follows that
$$\prod_{j=1}^n |\alpha_j| \ge \frac{|F(0)|}{\|f\|_1} > 0.$$
Passing to the limit as $n \to \infty$ (in case the set of zeros is denumerable) and using $1 - |\alpha_j| \le -\log |\alpha_j|$, we have the result.
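The properties of a Blaschke factor used in the sketch proof, and the link between the product $\prod |\alpha_j|$ and the sum $\sum (1 - |\alpha_j|)$, can be checked numerically. The following Python sketch is an illustration only: the function names, the sample zero `alpha` and the two sample sequences of moduli are assumptions chosen here, not taken from the notes.

```python
import cmath

def blaschke_factor(alpha, z):
    """B(z) = (conj(alpha)/|alpha|) * (alpha - z) / (1 - conj(alpha) z), 0 < |alpha| < 1."""
    return (alpha.conjugate() / abs(alpha)) * (alpha - z) / (1 - alpha.conjugate() * z)

alpha = 0.3 + 0.4j  # hypothetical zero with |alpha| = 0.5

# |B| on twelve equally spaced points of the unit circle.
circle_moduli = [abs(blaschke_factor(alpha, cmath.exp(2j * cmath.pi * k / 12)))
                 for k in range(12)]
value_at_zero = blaschke_factor(alpha, 0)       # equals |alpha|
value_at_alpha = blaschke_factor(alpha, alpha)  # equals 0

def partial_product(moduli):
    """Product of a finite list of moduli |alpha_j|."""
    result = 1.0
    for m in moduli:
        result *= m
    return result

n = 1000
# Summable case: 1 - |alpha_j| = 1/(j+1)^2; the product telescopes to (n+2)/(2(n+1)).
summable = partial_product([1 - 1 / (j + 1) ** 2 for j in range(1, n + 1)])
# Non-summable case: 1 - |alpha_j| = 1/(j+1); the product telescopes to 1/(n+1).
divergent = partial_product([1 - 1 / (j + 1) for j in range(1, n + 1)])

print(max(abs(m - 1) for m in circle_moduli))  # close to 0
print(summable, divergent)                     # first stays near 1/2, second tends to 0
```

In the summable case the partial products stay bounded away from zero, which is exactly the inequality $\prod_{j=1}^n |\alpha_j| \ge |F(0)|/\|f\|_1$; in the non-summable case they tend to zero, so such a sequence cannot be the zero set of a nonzero $H^1$ function.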
LEMMA 122 Let $p \in \mathbb{Z}^+$ and $\alpha_j \in \Delta \setminus \{0\}$ be such that $\sum_j (1 - |\alpha_j|) < \infty$. Then the Blaschke product
$$B(z) = z^p \prod_{j=1}^\infty B_j(z)$$
converges uniformly on the compacta of $\Delta$ and defines a bounded analytic function in $\Delta$ with boundary value function $b$ of unit modulus $\eta$-almost everywhere on $\mathbb{T}$.

Proof. We can assume without loss of generality that $p = 0$. Now let $H_n(z) = \prod_{j=1}^n B_j(z)$ and use $h_n$ to denote the boundary value function. (No problems here since $H_n$ is continuous on the closed unit disk.) Let $m \ge n$, then
$$\|h_m - h_n\|_2^2 = \|h_m\|_2^2 + \|h_n\|_2^2 - 2\Re \int \overline{h_n}\, h_m\, d\eta = 2\Big(1 - \prod_{j=n+1}^m |\alpha_j|\Big).$$
Since $\sum_j (1 - |\alpha_j|) < \infty$, it follows that $(h_n)$ is an $L^2$ Cauchy sequence. From this it follows that the $(H_n)$ converge uniformly on the compacta of $\Delta$ and also that the limit function on $\mathbb{T}$ has absolute value 1 almost everywhere.

In fact, we can continue to define $G_n = H_n^{-1} F$, an analytic function in $\Delta$. Clearly $G_n \in H^1(\Delta)$ and $\|g_n\|_1 = \|f\|_1$. (Remember $H_n^{-1}$ is continuous in some open set containing the unit circle.) So $(g_n)$ has a weak* limit point $g$ in $H^1(\mathbb{T})$ by the F. & M. Riesz Theorem. Here weak* means in $\sigma(M(\mathbb{T}), C(\mathbb{T}))$, but in case you were wondering, $H^1$ actually is a dual space. It follows that $(G_n)$ has a limit point $G$ for uniform convergence on compacta. Since $F = H_n G_n$, it now follows that $F = BG$. Passing to the limit at the boundary using radial limits, we now get $f = bg$ and $|f| = |g|$ almost everywhere.

We now return to the function $q$ of Proposition 120. In the special case $F = Q$ with $Q$ corresponding to $q$ we find that the function $G$ has no zeros in $\Delta$ and its boundary value function has $|g| = 1$ almost everywhere. It follows that $-\log(G(z))$ is analytic in $\Delta$ and has nonnegative real part. The real part is a harmonic function in $\Delta$ and its integral over the circle $|z| = r$ is just $-\log(|G(0)|) < \infty$ for all $0 \le r < 1$, and thus it is the Poisson integral of a nonnegative measure $\mu = k \cdot \eta + \mu_s$ where $k \in L^1(\mathbb{T})$ and $\mu_s$ is singular with respect to $\eta$.
Checking what happens over radial limits we find that $k$ vanishes identically. It follows that $-\log(G(re^{it}))$ can be obtained by convolving $\mu_s$ by the Herglotz kernel $P_r + iQ_r$. This leads to the formula
$$G(z) = \exp\Big( -\int \frac{e^{it} + z}{e^{it} - z}\, d\mu_s(t) \Big).$$
So, the most general $Q$ can be written
$$Q(z) = z^p \exp\Big( -\int \frac{e^{it} + z}{e^{it} - z}\, d\mu(t) \Big) \prod_j \frac{\overline{\alpha_j}}{|\alpha_j|} \cdot \frac{\alpha_j - z}{1 - \overline{\alpha_j} z}$$
where $\mu$ is a singular measure and $\sum_j (1 - |\alpha_j|) < \infty$.

10 Fourier Analysis on Compact Groups

In this section $G$ is a compact Hausdorff topological group. The right regular representation of $G$ on $L^2(G)$ is defined by $R(x)f(y) = f(yx)$. To understand why it should be so we see that
$$R(x_1 x_2)f(y) = f(y x_1 x_2).$$
On the other hand, if $g = R(x_2)f$ then $g(z) = f(z x_2)$ and
$$R(x_1)R(x_2)f(y) = R(x_1)g(y) = g(y x_1) = f(y x_1 x_2).$$
Hence $R(x_1 x_2) = R(x_1)R(x_2)$. We interpret the right regular representation $R$ of $G$ as a group homomorphism of the group $G$ into the group of unitary operators acting on $L^2(G)$. One may also build the left regular representation, but in order to have a group homomorphism the definition has to be $L(x)f(y) = f(x^{-1}y)$. Both of these representations are continuous from $G$ to the group of unitary operators on $L^2(G)$ given the strong operator topology.

If we have a representation $\pi$ of $G$ on a Hilbert space $H$ this means that $\pi$ is a group homomorphism of $G$ into the group of unitary operators on $H$ with the strong operator topology. An invariant subspace $K$ of $H$ is a closed linear subspace of $H$ such that $\xi \in K$ and $x \in G$ implies $\pi(x)\xi \in K$. In this case, the restriction of $\pi(x)$ to $K$ defines a representation of $G$ on $K$. However, it is also true that $K^\perp$ is a closed invariant linear subspace of $H$. To see this, let $\xi \in K$ and $\eta \in K^\perp$. Then we have
$$\langle \xi, \pi(x)\eta \rangle = \langle \pi(x)^*\xi, \eta \rangle = \langle \pi(x)^{-1}\xi, \eta \rangle = \langle \pi(x^{-1})\xi, \eta \rangle = 0.$$
Since $H = K \oplus K^\perp$ as a Hilbert space direct sum, we have decomposed the representation into two parts. Such a representation is said to be reducible.
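The need for the inverse in the left regular representation is easy to see on a small nonabelian example. A finite group with the discrete topology is a compact Hausdorff group, so the Python sketch below uses the symmetric group on three letters. Everything in it (the encoding of permutations as tuples, the names `mul`, `inv`, `R`, `L`, `naive_left` and the test function `f`) is an assumption made for illustration, not notation from the notes.

```python
from itertools import permutations

# The symmetric group S3: a permutation p is the tuple (p(0), p(1), p(2)).
G = list(permutations(range(3)))

def mul(p, q):
    """Group product p q, acting as i -> p(q(i))."""
    return tuple(p[q[i]] for i in range(3))

def inv(p):
    """Inverse permutation."""
    r = [0, 0, 0]
    for i, image in enumerate(p):
        r[image] = i
    return tuple(r)

# Functions on G are stored as dictionaries {group element: value}.
def R(x):
    """Right regular representation: (R(x)f)(y) = f(yx)."""
    return lambda f: {y: f[mul(y, x)] for y in G}

def L(x):
    """Left regular representation: (L(x)f)(y) = f(x^{-1} y)."""
    return lambda f: {y: f[mul(inv(x), y)] for y in G}

def naive_left(x):
    """Tempting but wrong definition: f -> f(xy)."""
    return lambda f: {y: f[mul(x, y)] for y in G}

# An injective test function, so that different translates are distinguishable.
f = {g: index for index, g in enumerate(G)}

def is_homomorphism(rep):
    """Check rep(x1 x2) = rep(x1) rep(x2) on the test function f."""
    return all(rep(mul(x1, x2))(f) == rep(x1)(rep(x2)(f))
               for x1 in G for x2 in G)

print(is_homomorphism(R), is_homomorphism(L), is_homomorphism(naive_left))
```

Because `f` takes distinct values on distinct group elements, agreement on `f` forces the two translations to agree as permutations of $G$, so the check is conclusive for this finite example. The naive definition reverses the order of multiplication and fails as soon as two elements do not commute.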
If a representation has no closed invariant linear subspaces (apart from the trivial ones, $\{0_H\}$ and $H$ itself) we say that it is irreducible.

LEMMA 123 Let $K$ be a closed invariant subspace of a Hilbert space $H$ for a unitary representation $\pi$. Let $P$ be orthogonal projection from $H$ to $K$. Then for all $x \in G$ we have $\pi(x)P = P\pi(x)$. Further, let $M$ be an invariant linear subspace of $H$. Then $P(M)$ is also an invariant linear subspace.

Proof. Consider $Q = \pi(x)^{-1} P \pi(x) = \pi(x)^* P \pi(x)$. Clearly $Q$ is a hermitian projection which maps onto $K$ and has kernel $K^\perp$. Hence $Q = P$. Now let $\xi \in M$ and $x \in G$. Then $\pi(x)P\xi = P\pi(x)\xi \in P(M)$.

PROPOSITION 124 Let $\pi$ be a representation on a Hilbert space $H$, then there is a maximal closed invariant linear subspace $K$ of $H$ on which $\pi$ decomposes into a (possibly infinite) Hilbert space direct sum of finite dimensional representations.

Proof. The proof is by Zorn's Lemma. An object (for the purposes of this proof) is a set of finite dimensional invariant linear subspaces of $H$ that are mutually orthogonal. We can partially order the set of objects by inclusion. It is easy to see that for any chain in this partially ordered set the union taken over the chain is again an object. The only thing that needs to be verified here is that the elements (i.e. finite dimensional linear subspaces) in the union object are mutually orthogonal. To do this, we choose two such finite dimensional linear subspaces, say $K_1$ and $K_2$. They must belong to object$_1$ and object$_2$ say. But one of these objects contains the other, so that $K_1$ and $K_2$ both belong to some object in the chain and hence are orthogonal. So, since each chain has an upper bound, it follows that there exists a maximal object. Now let $K$ be the Hilbert space direct sum of the subspaces in this maximal object. Next, we claim that $K$ is maximal among closed linear subspaces of $H$ which can be written as Hilbert space direct sums of finite dimensional invariant subspaces.
Suppose that $M$ is a larger such linear subspace. Then $M = \bigoplus_{\alpha \in I} M_\alpha$ is the corresponding direct sum with $M_\alpha$ finite dimensional and invariant. Clearly, there exist $\alpha \in I$, $\xi \in M_\alpha$ such that $\xi \notin K$. Let $P$ be orthogonal projection on $K^\perp$. Then $P\xi \ne 0$. But $P(M_\alpha)$ is a nonzero invariant linear subspace of $H$ contained in $K^\perp$. It is also closed since it is finite dimensional. Adjoining it to the maximal object would give a strictly larger object, which contradicts the maximality of $K$.

THEOREM 125 Let $G$ be a compact Hausdorff topological group. Then the right regular representation can be written as a Hilbert space direct sum of finite dimensional irreducible invariant linear subspaces.

Proof. We start by using Proposition 124 to find a maximal closed linear subspace $K$ of $L^2(G)$ for which the right regular representation breaks down as a direct sum of finite dimensional representations. We will work on $K^\perp$, a closed invariant subspace for the right regular representation. It follows that $K^\perp$ does not have any nonzero finite dimensional invariant subspaces. Assuming that $K^\perp$ is nonzero, choose a nonzero function $g$ in $K^\perp$. Next, find a compact symmetric neighbourhood $V$ of $e$ such that $\|f * g - g\|_2 < \frac{1}{2}\|g\|_2$ with $f = f_V = \operatorname{meas}(V)^{-1} 1_V$, possible since $f_V$ is a summability kernel. Note that left convolution by $f$, namely
$$L_f(h)(y) = \int h(x^{-1}y) f(x)\, dx = \int f(yz^{-1}) h(z)\, dz$$
is a hermitian operator on $L^2(G)$. This is because $f$ is real and symmetric (i.e. $f(x^{-1}) = f(x)$). But $L_f$ is also a compact operator, in fact a Hilbert-Schmidt operator since $f \in L^2$.

Let $J$ denote the inclusion of $K^\perp$ into $H = L^2(G)$; the adjoint $J^*$ of $J$ is then orthogonal projection from $H$ onto $K^\perp$. Both these operators commute with the right regular representation. But $L_f$ also commutes with the right regular representation. This is essentially because left translation and right translation commute.
If you have a group element $x$ it doesn't matter if you first multiply on the left by $y$ and then on the right by $z$ or whether you perform these operations in the opposite order: $(yx)z = y(xz)$. We have
$$J^* L_f J R(x) = J^* L_f R(x) J = J^* R(x) L_f J = R(x) J^* L_f J$$
and $J^* L_f J$ is a compact hermitian operator on $K^\perp$. Therefore
$$K^\perp = \Big( \bigoplus_\lambda H_\lambda \Big) \oplus \ker(J^* L_f J)$$
where $H_\lambda$ are the finite dimensional eigenspaces of $J^* L_f J$ as $\lambda$ runs over the nonzero eigenvalues. Each $H_\lambda$ is invariant under the right regular representation because $J^* L_f J$ commutes with $R(x)$. But, there aren't any $H_\lambda$ because we already pulled the finite dimensional invariant stuff into $K$. So, $J^* L_f J$ vanishes on $K^\perp$. But then
$$0 = |\langle g, L_f g \rangle| = |\langle g, g \rangle - \langle g, g - L_f g \rangle| \ge \|g\|_2^2 - \tfrac{1}{2}\|g\|_2^2 > 0.$$
This contradiction shows that $K^\perp$ is zero and the result is almost proved. The final step is to break down the finite dimensional invariant subspaces of $L^2(G)$ into irreducible ones. This is trivial since in view of the finite dimensionality, the breaking down procedure has to stop.
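For a finite cyclic group the decomposition of Theorem 125 can be written down explicitly: the characters are joint eigenvectors of all the right translations, so each one spans a one dimensional invariant subspace, and together they form an orthonormal basis of $L^2(G)$. The Python sketch below checks this for the additive group of integers modulo `N`. The choice `N = 6` and the names `character`, `right_translate` and `inner` are assumptions made for this illustration only.

```python
import cmath

N = 6  # hypothetical order of the cyclic group Z_N, written additively

def character(k):
    """The character y -> exp(2 pi i k y / N), stored as a list of values."""
    return [cmath.exp(2j * cmath.pi * k * y / N) for y in range(N)]

def right_translate(x, f):
    """Right regular representation on Z_N: (R(x)f)(y) = f(y + x)."""
    return [f[(y + x) % N] for y in range(N)]

def inner(f, g):
    """Inner product for normalized counting measure, conjugate linear in f."""
    return sum(a.conjugate() * b for a, b in zip(f, g)) / N

# R(x) chi_k = chi_k(x) chi_k: each character spans an invariant line.
eigen_error = max(
    abs(right_translate(x, character(k))[y] - character(k)[x] * character(k)[y])
    for k in range(N) for x in range(N) for y in range(N)
)

# The N characters are orthonormal, hence a basis of the N dimensional L^2(Z_N).
gram_error = max(
    abs(inner(character(j), character(k)) - (1 if j == k else 0))
    for j in range(N) for k in range(N)
)

print(eigen_error, gram_error)  # both at rounding-error level
```

For an abelian group all the irreducible pieces are one dimensional, as here. For a nonabelian compact group the theorem still gives finite dimensional irreducible pieces, but they need not be lines.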