Class Notes for MATH 567.
by
S. W. Drury
Copyright © 2016, by S. W. Drury.
1 Topological Vector Spaces
1.1 Some Topology Basics
Unfortunately, sequences are not adequate to handle all aspects of topological spaces.
The concept that is needed in this context is that of a net. Recall that a partially
ordered set (sometimes called a poset ) is a set J together with a relation ≤ on J
such that
• α ≤ α for all α ∈ J.
• α, β ∈ J, α ≤ β, β ≤ α implies α = β.
• α, β, γ ∈ J, α ≤ β, β ≤ γ implies α ≤ γ.
A directed set J is a partially ordered set with one additional condition: for every
pair α, β ∈ J, there exists an element γ ∈ J with α ≤ γ and β ≤ γ. The set
of natural numbers N is an example of a directed set, but there are many other
examples. To obtain the theory of nets from the theory of sequences, we replace
N with a general directed set.
EXAMPLE Let J be the set of partitions of [0, 1] as used in Riemann integration.
The relation α ≤ β holds if the partition β refines the partition α. One definition
of the Riemann integral is in terms of convergence along this directed set.
□
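The directed set of partitions can be made concrete. The following is a minimal sketch, assuming partitions are stored as increasing tuples of rational points; the helper names (`refines`, `common_refinement`, `upper_sum`) are illustrative and not from the notes. It checks that two partitions always have a common upper bound and that upper sums of an increasing function shrink along the order.

```python
from fractions import Fraction

def is_partition(points):
    """A partition of [0, 1]: a strictly increasing tuple from 0 to 1."""
    return (points[0] == 0 and points[-1] == 1
            and all(a < b for a, b in zip(points, points[1:])))

def refines(beta, alpha):
    """alpha <= beta in the directed set: beta contains every point of alpha."""
    return set(alpha) <= set(beta)

def common_refinement(alpha, beta):
    """An upper bound gamma with alpha <= gamma and beta <= gamma."""
    return tuple(sorted(set(alpha) | set(beta)))

def upper_sum(f, partition):
    """Upper Riemann sum of an increasing f (the sup sits at the right endpoint)."""
    return sum(f(b) * (b - a) for a, b in zip(partition, partition[1:]))

alpha = (Fraction(0), Fraction(1, 2), Fraction(1))
beta = (Fraction(0), Fraction(1, 3), Fraction(2, 3), Fraction(1))
gamma = common_refinement(alpha, beta)

identity = lambda x: x
sums = [upper_sum(identity, q) for q in (alpha, beta, gamma)]
```

Neither `alpha` nor `beta` refines the other, which is why a general directed set is needed here rather than a totally ordered index set such as N.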
Let J be a directed set and X a topological space. A net in X is a mapping
from J to X, usually denoted as (xα )α∈J just as we do for sequences. The net
(xα )α∈J is said to converge to x ∈ X if for every neighbourhood V of x, there
exists α ∈ J such that xβ ∈ V for all β ∈ J with α ≤ β. Convergence along nets
is sometimes called Moore–Smith convergence .
A topological space X is Hausdorff if and only if whenever x1 , x2 ∈ X with
x1 ≠ x2 there exist disjoint open subsets U1 and U2 with xj ∈ Uj for j = 1, 2.
There are some other so-called separation axioms :
• A topological space is regular if both of the following hold:
– singletons are closed,
– whenever A is a closed subset of X and x ∈ X \ A there exist disjoint
open subsets U and V such that x ∈ U and A ⊆ V .
• A topological space X is completely regular if both of the following hold:
– singletons are closed,
– whenever A is a closed subset of X and x ∈ X \ A there exists a
continuous function f : X −→ [0, 1] such that f (x) = 1 and f = 0 on
A.
• A topological space is normal if both of the following hold:
– singletons are closed,
– whenever A and B are disjoint closed subsets of X there exist disjoint
open subsets U and V such that A ⊆ U and B ⊆ V .
EXERCISE
• Show that in a Hausdorff topological space the limit of a net is unique.
• If A ⊆ X, show that x ∈ cl(A) if and only if there exists a net of points in
A converging to x. To establish the difficult implication, let J be
the set of neighbourhoods of x ordered by reverse inclusion.
• If (xα )α∈J and (yα )α∈J are nets over the same directed set J converging to x
and y respectively, show that (xα , yα ) → (x, y) in the product space X × Y .
• If X and Y are topological spaces and f : X −→ Y , show that f is continuous if and only if whenever xα → x, f (xα ) → f (x).
• Let X be the set of all ordinal numbers that are ≤ Ω where Ω denotes the
first uncountable ordinal. Consider the order topology on X. Let E =
X \ {Ω}. Then E is dense in X, but no sequence in E converges to Ω.
This is because every sequence in E is bounded above by some countable
ordinal. Look up “order topology” and “ordinal number” on Wikipedia for
more information.
□
We prove the second item. Let x ∉ cl(A). Then clearly if xα ∈ A it is
impossible that xα −→ x. For the converse, let x ∈ cl(A). Consider the directed
set of all neighbourhoods of x ordered by V ≥ W if and only if V ⊆ W . For V
a neighbourhood of x, choose xV ∈ V ∩ A. We claim that xV −→ x. Indeed for
every neighbourhood W of x, xV ∈ W for every V ≥ W , i.e. for every V ⊆ W .
Let J be a directed set and K ⊆ J. Then K is cofinal in J if for all α ∈ J
there exists β ∈ K such that α ≤ β. If K is cofinal in J, then it is easy to see
that K is a directed set in the order relation that it inherits from J. Let (xα )α∈J
be a net in a topological space X. Then the restriction (xα )α∈K is called a subnet
in case K is cofinal in J. This generalizes the notion of subsequence. If (xα )α∈J
converges to x then so does (xα )α∈K .
THEOREM 1
A topological space X is compact if and only if every net in X possesses a convergent subnet.
A topological space X is locally compact if and only if every x ∈ X possesses
a base of compact neighbourhoods. Specifically, this means that if x ∈ X and V
is a neighbourhood of x, then there exists W a compact neighbourhood of x such
that W ⊆ V .
1.2 One point compactification
Recall that in a Hausdorff topological space, compact subsets are necessarily
closed (454 notes page 77). Let X be a locally compact Hausdorff topological
space. Usually, we also assume that X is not compact, but this is not necessary.
We build a new topological space αX = X ∪ {∞} where ∞ is not a point of X.
We declare U to be open in αX if and only if either
1. U ⊆ X and U is open in X, or
2. ∞ ∈ U and X \ U is a compact subset of X.
One may show that this defines a topology on αX and that αX is compact Hausdorff in this topology. The space αX is called the one point compactification of
X or the Alexandroff compactification of X.
Here are two more difficult theorems on topology.
LEMMA 2 (URYSOHN’S LEMMA)
Let X be a normal topological space. Let
A and B be disjoint closed subsets of X. Then there exists a continuous function
f : X −→ [0, 1] such that f = 0 on A and f = 1 on B.
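In a metric space such a function can be written down explicitly. The formula below is a standard worked example rather than part of the notes; it assumes that A and B are nonempty and uses the notation d(x, A) = inf{d(x, a); a ∈ A}, which is not introduced in the text.

```latex
\[
  f(x) \;=\; \frac{d(x,A)}{d(x,A) + d(x,B)}, \qquad x \in X .
\]
```

Since A and B are closed, d(x, A) = 0 exactly when x ∈ A and d(x, B) = 0 exactly when x ∈ B. As A and B are disjoint, the denominator never vanishes, so f is continuous with values in [0, 1], f = 0 on A and f = 1 on B. In a general normal space no such formula is available, which is why the lemma is difficult.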
THEOREM 3 (URYSOHN METRIZATION THEOREM)
Every regular topological space with a countable basis is metrizable.
THEOREM 4 (TIETZE EXTENSION THEOREM)
Let X be a normal topological space, A a closed subset and g : A −→ [0, 1] a continuous map. Then there
exists a continuous extension f of g defined on the whole of X.
1.3 Quotient Spaces
Let X be a topological space and ∼ an equivalence relation on X. Let Q be the set
of equivalence classes and let π : X −→ Q be the canonical projection that takes
every element of X to its equivalence class. Then on Q we define the quotient
topology by defining U to be an open subset of Q if and only if π⁻¹(U) is open
in X. It is easy to check that this actually is a topology. In fact, it is the finest
topology on Q that renders π continuous. Beware that if X is Hausdorff then Q
need not be.
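For a finite space the definition can be run directly. The following is a small sketch, assuming a topology is given as an explicit list of open sets and the equivalence relation as a list of classes; the function name `quotient_topology` and the three-point example are hypothetical.

```python
from itertools import chain, combinations

def quotient_topology(opens, classes):
    """Return the open sets of Q, each given as a frozenset of equivalence classes.

    A set U of classes is open in Q exactly when its preimage under pi,
    i.e. the union of the classes in U, is open in X.
    """
    opens = {frozenset(o) for o in opens}
    classes = [frozenset(c) for c in classes]
    result = []
    for r in range(len(classes) + 1):
        for U in combinations(classes, r):
            preimage = frozenset(chain.from_iterable(U))
            if preimage in opens:
                result.append(frozenset(U))
    return result

# Hypothetical example: X = {0, 1, 2} with the nested topology below,
# and the equivalence relation that identifies 0 with 2.
opens_X = [set(), {0}, {0, 1}, {0, 1, 2}]
classes = [{0, 2}, {1}]
opens_Q = quotient_topology(opens_X, classes)
```

In this example only the empty set and the whole of Q survive, so the quotient carries the indiscrete topology: passing to a quotient can destroy separation properties.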
1.4 Uniform spaces
Uniform spaces are the natural setting for uniform continuity and completeness.
The concept is almost completely absent from most literature.
DEFINITION A uniform space is a set X together with a family of subsets of
X × X called vicinities. They are required to satisfy the following conditions.
1. For every vicinity V , DiagX ⊆ V .
2. If V ⊆ W ⊆ X × X and V is a vicinity, then W is a vicinity.
3. If V1 and V2 are vicinities, then V1 ∩ V2 is a vicinity.
4. If V is a vicinity, then there exists a vicinity W such that (x1 , x2 ) ∈ W
implies (x2 , x1 ) ∈ V .
5. If V is a vicinity, then there exists a vicinity W such that (x1 , x2 ), (x2 , x3 ) ∈
W implies (x1 , x3 ) ∈ V .
DEFINITION If X and Y are uniform spaces and f : X −→ Y , then f is
uniformly continuous if and only if (f × f)⁻¹(V) is a vicinity in X whenever V
is a vicinity in Y .
Every metric space has a natural uniform space structure: V is a vicinity if and
only if there exists δ > 0 such that d(x1 , x2 ) < δ implies (x1 , x2 ) ∈ V .
Every uniform space X has a natural topology. The neighbourhoods of x are
simply the sets {y ∈ X; (x, y) ∈ V } as V runs over the vicinities V .
EXERCISE Let V be a vicinity in a uniform space. Show that int V is also a
vicinity. The interior is taken in the product topology of X × X.
□
Let X be a uniform space and (xα )α∈J be a net in X. Then (xα )α∈J is a
Cauchy net in X if and only if for every vicinity V of X, there exists γ ∈ J such
that α ≥ γ, β ≥ γ implies that (xα , xβ ) ∈ V .
Let X be a uniform space. Then X is complete if and only if every Cauchy net
in X converges to some point of X.
As you might expect, every convergent net is Cauchy but this uses axiom 5 of
uniform space.
EXERCISE Let F be a closed subset of a complete uniform space E. Then
F is also complete in the uniform structure it inherits from E. Specifically V
is a vicinity in F if and only if there exists a vicinity W of E such that V =
W ∩ (F × F ).
□
EXERCISE Let F be a subset of a Hausdorff uniform space E. Suppose that F
is complete in the uniform structure it inherits from E. Then F is closed in E. □
If X is a compact completely regular topological space, then X has a natural
uniform space structure in which V is a vicinity if and only if there exists U open
in X × X (in the product topology) such that DiagX ⊆ U ⊆ V . This is the unique
uniform structure on X that gives back the correct topology.
1.5 Topological Vector Spaces
Let E be a vector space over R or C that is also a Hausdorff topological space.
Then E is a topological vector space if the maps
(x, y) −→ x + y,    E × E −→ E
(t, x) −→ tx,    k × E −→ E
are continuous where k is the corresponding field of scalars. Note that not all
authors insist that a topological vector space is necessarily Hausdorff. However,
if one does not so insist, then the subset of vectors that cannot be separated from
0E form a linear subspace and on quotienting out this linear subspace one obtains
a space that is Hausdorff.
Some basic lemmas follow directly from the definition.
LEMMA 5
We have that U is a neighbourhood of 0E if and only if x + U is a
neighbourhood of x.
LEMMA 6
Let U be a neighbourhood of 0E in a topological vector space E.
Then there exists a neighbourhood V of 0E such that V + V ⊆ U .
The notation A+B means {a+b; a ∈ A, b ∈ B}. A subset A of E is balanced
if x ∈ A and |t| ≤ 1 implies tx ∈ A.
A subset A of E is absorbent if for all x ∈ E there exists y ∈ A and t > 0
such that tx = y.
LEMMA 7
Let U be a neighbourhood of 0E in a topological vector space E. Then
• there exists a balanced neighbourhood V of 0E such that V ⊆ U .
• U is absorbent.
Proof. Since scalar multiplication is continuous, for t ≠ 0 the map x ↦ tx is
a homeomorphism (bijection continuous in both directions). Also since scalar
multiplication is continuous, there exists an open neighbourhood W of 0E and
δ > 0 such that |t| < δ and x ∈ W implies tx ∈ U . But now V = ⋃_{0<|t|<δ} tW is
a balanced open subset of U containing 0E . To see that U is absorbent, note that
the mapping t ↦ tx is continuous at 0. Hence, there exists δ > 0 such that |t| < δ
implies tx ∈ U . Choosing t = ½δ > 0 does the trick.
Combining the lemmas we get
PROPOSITION 8
Let U be a neighbourhood of 0E in a topological vector space
E. Then there exists a balanced neighbourhood V of 0E such that V − V ⊆ U .
Now let E be a topological vector space without the Hausdorff requirement.
Suppose that {0E } is closed. Then E is Hausdorff. Proof: Let x ∈ E \ {0E }.
Then since {0E } is closed, there is a neighbourhood U of 0E such that x ∉ U
and hence a neighbourhood V of 0E such that x ∉ V − V . So, V and x + V are
disjoint neighbourhoods of 0E and x respectively. By translation, any two distinct
points of E can be separated in the same way.
EXERCISE Show the converse. In a Hausdorff topological space E, singletons
are closed.
□
This proposition also allows one to define a vicinity of a topological vector
space E by V is a vicinity if and only if there exists U a neighbourhood of 0E in
E such that
x − y ∈ U =⇒ (x, y) ∈ V.
EXERCISE Show that this defines a uniform structure on E.
□
This allows us to talk about complete topological vector spaces.
PROPOSITION 9
Let E be a topological vector space and F a linear subspace
of E. Then cl(F ) is also a linear subspace.
Proof. Let x, y ∈ cl(F ). There are nets xα −→ x and yβ −→ y, α ∈ J,
β ∈ K, xα , yβ ∈ F , J, K directed sets. We turn I = J × K into a directed set by
(α1 , β1 ) ≥ (α2 , β2 ) if and only if α1 ≥ α2 and β1 ≥ β2 . (Exercise: Show that this
defines a directed set). Now show that (xα , yβ ) −→ (x, y) in the product topology
along the directed set I. Then by continuity of addition xα + yβ −→ x + y
along I. Hence x + y ∈ cl(F ). The proof that txα −→ tx along J follows from
the continuity of scalar multiplication. Hence tx ∈ cl(F ) and cl(F ) is a linear
subspace.
EXERCISE Let E be a topological vector space and F a linear subspace of E.
Then F is a topological vector space in the subspace topology.
□
THEOREM 10
Let E be a topological vector space and F a closed linear subspace. Then the quotient space Q = E/F is a topological vector space in the
quotient topology.
Before beginning the proof, we need the following.
EXERCISE Recall that a mapping between topological spaces is said to be open
if the direct image of every open subset is open. If π1 and π2 are open mappings
(between topological spaces) then so is the product mapping π1 × π2 .
□
Proof of Theorem 10. We start by showing that π is an open mapping. Let U be
open in E. Then π⁻¹(π(U)) = ⋃_{y∈F} (U + y) which is open in E being a union of
open subsets. By definition of the quotient topology this says that π(U) is open in
Q.
Hence, by the exercise, π × π is an open mapping E × E −→ Q × Q. Now
let U be open in Q. Then {(x1 , x2 ); x1 + x2 ∈ π⁻¹(U)} is open in E × E. But
the direct image of this set by π × π, namely {(q1 , q2 ); q1 + q2 ∈ U } must be
open in Q × Q. The continuity of scalar multiplication follows by much the same
argument.
Finally, we need to show that Q is Hausdorff. For this it will suffice to show
that {0Q } is closed in Q. But this follows by hypothesis since π⁻¹({0Q }) = F .
THEOREM 11
Let E be a finite dimensional topological vector space with basis e1 , . . . , en . Then the linear isomorphism
(t1 , . . . , tn ) ↦ t1 e1 + · · · + tn en
is a homeomorphism from kⁿ to E. Note that kⁿ can be given the product topology, or the topology coming from any norm.
Proof. It follows from the definition of topological vector space that the mapping
is continuous. We need to show that the inverse is continuous.
We start with the case n = 1 where the map is essentially t ↦ te where e
is a nonzero vector of a one-dimensional E. If tα e −→ 0E we need to show
that tα −→ 0. Let ε > 0. Since e ≠ 0E , there is a balanced
neighbourhood U of 0E such that e ∉ U . Since U is balanced, |t| ≥ 1 implies
te ∉ U . Rescaling, this says that if tα e ∈ εU then |tα | < ε. Since εU is again a
neighbourhood of 0E , this holds for all sufficiently large α. This completes the
claim.
The proof now works by induction on the dimension of E. We may assume
that the result holds for spaces of dimensions 1 and n − 1. Let F be the linear
span of the vectors e1 , . . . , en−1 . Then the linear isomorphism
(t1 , . . . , tn−1 ) ↦ t1 e1 + · · · + tn−1 en−1    (1.1)
is a homeomorphism from kⁿ⁻¹ to F by the induction hypothesis. But kⁿ⁻¹ is
known to be complete and hence F is also complete (as a uniform space). Therefore F is closed in E and the quotient space Q = E/F is a topological vector
space. The mapping tn ↦ tn π(en ) = π(tn en ) is a linear isomorphism k −→ Q
and hence a homeomorphism by the case n = 1. Since π is continuous, it now follows that the mapping
t1 e1 + · · · + tn en ↦ tn
is continuous as a mapping E −→ k. Similarly, the mapping
t1 e1 + · · · + tn en ↦ tp
is continuous for every p = 1, 2, . . . , n. The result now follows.
THEOREM 12
A locally compact topological vector space E is necessarily finite dimensional.
Proof. There exists a compact balanced neighbourhood K of 0E . (Take a compact neighbourhood of 0E , then a balanced neighbourhood of 0E inside it, finally
close that).
We claim that tK as t runs over ]0, ∞[ form a base of neighbourhoods of 0E .
To see this, let W be an arbitrary neighbourhood of 0E . Find U a balanced open
neighbourhood of 0E with U + U ⊆ W . The collection {x + U ; x ∈ K} covers
K. So, there exists a finite set F ⊂ K such that F + U ⊇ K. Choose t > 0 with
t < 1 such that tF ⊂ U . Then tK ⊆ U + U ⊆ W . The claim is proved.
Repeating more or less the compactness argument we now have a finite subset
G of E such that K ⊆ G + ½K. Let M be the linear span of G. Then M is
isomorphic to kᵐ where m is the dimension of M and hence is complete. Thus,
M is closed in E. Now
K ⊆ M + ½K ⊆ M + ½M + ¼K ⊆ M + ¼K ⊆ · · · ⊆ M + 2⁻ⁿK
for every n ∈ N. We claim that M = E. If not, then since K is absorbent there
exists x ∈ K \ M . Then there exists n ∈ N such that x + 2⁻ⁿK is disjoint from
M . This follows since M is closed and 2⁻ⁿK form a basis of neighbourhoods of
0E . But x ∈ M + 2⁻ⁿK and this is a contradiction since K is balanced.
THEOREM 13
Let E be a topological vector space such that 0E has a countable
base of neighbourhoods. Then E is metrizable.
Proof omitted.
In the next section we will tackle locally convex spaces. Perhaps the best
known example of a topological vector space that is not a locally convex space
is Lᵖ(X, M, µ) for 0 < p < 1. Here (X, M, µ) is a measure space. We can take
d(f, g) = ∫ |f − g|ᵖ dµ
as a metric on this space. Show that Lᵖ(X, M, µ) is complete in this metric. We
will deal with Lᵖ(X, M, µ) for 1 ≤ p ≤ ∞ later.
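A quick numerical look at this metric shows where convexity goes wrong. The sketch below assumes counting measure on a two-point set, so that the integral is a finite sum; the names `d`, `e1`, `e2` and `mid` are illustrative. Since this space is finite dimensional, its topology is still the usual locally convex one, so the example only shows that the d-balls themselves fail to be convex; the genuine failure of local convexity is an infinite-dimensional phenomenon.

```python
def d(f, g, p):
    """d(f, g) = sum |f_i - g_i|^p: the integral for counting measure on a finite set."""
    return sum(abs(a - b) ** p for a, b in zip(f, g))

p = 0.5
zero = (0.0, 0.0)
e1, e2 = (1.0, 0.0), (0.0, 1.0)
mid = (0.5, 0.5)          # midpoint of the segment from e1 to e2

dist_e1 = d(e1, zero, p)   # e1 lies in the closed d-ball of radius 1
dist_e2 = d(e2, zero, p)   # so does e2
dist_mid = d(mid, zero, p) # but their midpoint lies outside it
```

Here `dist_mid` equals 2·(1/2)^(1/2), which exceeds 1, so the ball {f; d(f, 0) ≤ 1} contains `e1` and `e2` but not their midpoint.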
1.6 Locally Convex Spaces
A subset C of a linear space E is convex if and only if x, y ∈ C, 0 ≤ t ≤ 1
implies (1 − t)x + ty ∈ C. Informally, if x and y are in C, then C contains the
line segment joining x to y. Given any subset S of a linear space E, the convex
hull co(S) of S is the intersection of all convex subsets of E that contain S. (Note
that E itself is convex and hence the intersection is not indexed over the empty
set). An alternative description of co(S) is
{t1 x1 + · · · + tn xn ; n ∈ N, x1 , . . . , xn ∈ S, t1 , . . . , tn ≥ 0, t1 + · · · + tn = 1}
EXERCISE In a topological vector space E the convex hull of an open set is
open. Hint: If t1 , . . . , tn ≥ 0 and t1 + · · · + tn = 1 then at least one of the tk is
strictly positive.
□
A locally convex space E is a topological vector space such that there exists
a base of convex neighbourhoods of 0E .
LEMMA 14
If E is a topological vector space, and U is a convex neighbourhood of 0E , then there exists a balanced convex open neighbourhood V of 0E with
V ⊆ U.
Proof. Certainly U contains a balanced open neighbourhood W of 0E . We take
V to be the convex hull of W . Then V is a balanced open convex set and V ⊆ U
since U is convex.
It may be worth noting that S is balanced and convex if and only if S is absolutely convex . We say that S is absolutely convex if and only if x, y ∈ S, t, s ∈ k,
|t| + |s| ≤ 1 implies that tx + sy ∈ S.
Note that neighbourhoods V of the zero vector are always absorbent , that is
every vector can be expressed in the form tv with t > 0 and v ∈ V .
A seminorm on E is a mapping p : E −→ [0, ∞[ such that
• p(tx) = |t|p(x) for t ∈ k and x ∈ E.
• p(x + y) ≤ p(x) + p(y) for x, y ∈ E.
LEMMA 15
If V is a balanced, convex absorbent subset of E then its
Minkowski functional
p(x) = inf{t > 0; t⁻¹x ∈ V }
is a seminorm.
The proof is easy. Further, if E is a topological vector space and V is open,
then V = {x; p(x) < 1}.
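The Minkowski functional can be approximated numerically when membership in V can be tested. The following is a sketch under stated assumptions: V is balanced, convex and absorbent, so the set of admissible t is an interval unbounded to the right, and p(x) is below the search bound `hi`. The function `minkowski` and the choice of V as the open unit ball of the ℓ¹ norm on R² are hypothetical illustrations.

```python
def minkowski(in_V, x, hi=1e6, iters=200):
    """Approximate p(x) = inf{t > 0 : x/t in V} by bisection.

    Assumes V is balanced, convex and absorbent, so that {t > 0 : x/t in V}
    is an interval unbounded to the right, and that p(x) < hi.
    """
    lo = 0.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if in_V(tuple(c / mid for c in x)):
            hi = mid      # x/mid is in V, so p(x) <= mid
        else:
            lo = mid      # x/mid is not in V, so p(x) >= mid
    return hi

def in_l1_ball(v):
    """Membership test for V = {v : |v1| + |v2| < 1}, an open balanced convex set."""
    return abs(v[0]) + abs(v[1]) < 1

x = (3.0, -4.0)
y = (1.0, 2.0)
px = minkowski(in_l1_ball, x)
py = minkowski(in_l1_ball, y)
pxy = minkowski(in_l1_ball, (x[0] + y[0], x[1] + y[1]))
```

For this V the functional is the ℓ¹ norm, so `px` is close to 7, and the computed values are consistent with subadditivity and with V = {x; p(x) < 1}.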
Thus, in a locally convex space E, we can code the topology by a family P
of seminorms with each seminorm being the Minkowski functional of an open
balanced convex neighbourhood. Since we are insisting that E is Hausdorff, for
every x ∈ E \ {0E }, there is an open balanced convex neighbourhood V such that
x ∉ V . In particular the seminorm corresponding to V has p(x) ≥ 1.
A family of seminorms P is separating if for all x ∈ E \ {0E }, there exists
p ∈ P with p(x) > 0. When we perform this recoding, we have
xα −→ x ⇐⇒ p(xα − x) −→ 0 for all p ∈ P.
This is really pointwise convergence on P.
We can define a locally convex space by means of an arbitrary separating
family P of seminorms. When we do this, we have to use the family of subsets
{x; p(x) < t} as p runs through P and t runs through ]0, ∞[ as a subbase of
neighbourhoods of 0E . That is a basic neighbourhood will have the form
{x; pk (x) < tk , k = 1, . . . , n}
where n ∈ N, p1 , . . . , pn ∈ P, t1 , . . . , tn > 0.
EXERCISE Let E be a locally convex space and F a linear subspace of E. Then
F is a locally convex space in the subspace topology.
□
LEMMA 16
The quotient Q of a locally convex space E by a closed linear
subspace is again a locally convex space.
Proof. Let U be a neighbourhood of 0Q in Q. Then π⁻¹(U) is a neighbourhood
of 0E in E. There exists an open balanced convex neighbourhood V of 0E in E with
V ⊆ π⁻¹(U). Hence π(V ) ⊆ U . But π(V ) is open since V is open and π is an
open mapping, and π(V ) is balanced and convex because V is. Further, π(V ) is a
neighbourhood of 0Q since it is open and 0Q ∈ π(V ).
1.7 The Hahn–Banach Theorem
Let E be any real vector space. A sublinear functional on E is a mapping p :
E −→ R such that
• p(tx) = tp(x) for t ≥ 0 and x ∈ E.
• p(x + y) ≤ p(x) + p(y) for x, y ∈ E.
Note that p(0E ) = 0 and compare with the definition of a seminorm.
THEOREM 17 (HAHN–BANACH THEOREM)
Let p be a sublinear functional
on a vector space E, F a linear subspace of E and f : F −→ R a linear mapping
(i.e. linear functional) such that f(x) ≤ p(x) for all x ∈ F . Then there exists
a linear functional f̃ : E −→ R extending f and such that f̃(x) ≤ p(x) for all
x ∈ E.
The proof of the Hahn–Banach theorem uses the axiom of choice in the form
of Zorn’s Lemma.
Zorn’s lemma is equivalent to the axiom of choice, in the sense that either one
together with the standard axioms of set theory is sufficient to prove the other. It
occurs in the proofs of several theorems of crucial importance, for instance the
theorem that every vector space has a linear basis, the theorem that every field
has an algebraic closure and that every ring has a maximal ideal. It is stated as
follows.
LEMMA 18 (ZORN’S LEMMA)
Every non-empty partially ordered set in
which every chain is bounded above contains a maximal element.
The terms are defined as follows. Suppose (X, ≤) is a partially ordered set.
A subset C of X is a chain if for any x, y ∈ C we have either x ≤ y or y ≤ x.
A subset Y of X is bounded above if there exists u ∈ X such that y ≤ u for
all y ∈ Y . Note that u is an element of X and need not be an element of Y . A
maximal element of X is an element m ∈ X such that x ∈ X and m ≤ x implies
x = m.
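On a finite poset these three notions can be checked mechanically. The following sketch uses the hypothetical example X = {1, 2, 3, 4, 6} ordered by divisibility; the helper names are illustrative only.

```python
def is_chain(subset, leq):
    """Every two elements of the subset are comparable."""
    return all(leq(x, y) or leq(y, x) for x in subset for y in subset)

def upper_bounds(subset, X, leq):
    """Elements u of X with y <= u for all y in the subset."""
    return [u for u in X if all(leq(y, u) for y in subset)]

def maximal_elements(X, leq):
    """Elements m of X such that m <= x implies x = m."""
    return [m for m in X if all(x == m for x in X if leq(m, x))]

X = [1, 2, 3, 4, 6]
divides = lambda a, b: b % a == 0   # a <= b means a divides b

chain_ok = is_chain([1, 2, 4], divides)
not_chain = is_chain([2, 3], divides)
bounds_23 = upper_bounds([2, 3], X, divides)
bounds_46 = upper_bounds([4, 6], X, divides)
maximal = maximal_elements(X, divides)
```

Here both 4 and 6 are maximal, and the pair {4, 6} has no upper bound in X. A maximal element is therefore not the same thing as a largest element, and Zorn's lemma only promises the former.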
Proof of the Hahn–Banach theorem. We take the partially ordered set X to be
the set of pairs (G, g) where G is a linear subspace of E with F ⊆ G, g is a linear
functional on G, extending f and such that g(x) ≤ p(x) for all x ∈ G. The partial
order is defined by
(G1 , g1 ) ≤ (G2 , g2 ) ⇐⇒ G1 ⊆ G2 and g2 |G1 = g1 .
Now let C be a chain in X. We define H = ⋃_{(G,g)∈C} G and for x ∈ H, we set
h(x) = g(x) where x ∈ G for (G, g) ∈ C. Since C is a chain, it follows that
h(x) is well defined, (i.e. independent of the choice of (G, g)). We also check
that h is linear on H (which also uses the fact that C is a chain) and finally that
h(x) ≤ p(x) for all x ∈ H.
We now have that (H, h) is an upper bound for C (it does not necessarily belong to C). Applying Zorn’s lemma, we see that X possesses a maximal element.
If this maximal element is of the form (E, g) for some g then we are done. If not, then we
must find a contradiction. Let us denote the maximal element as (H, h). Then
since H is not the whole of E we can find a linear subspace of E containing H as
a linear subspace of codimension one. Let us relabel this subspace as E. Then
to obtain a contradiction to the maximality of (H, h) it will suffice to show the
following proposition.
PROPOSITION 19
Let p be a sublinear functional on a vector space E, H a
linear subspace of E of codimension one and h : H −→ R a linear mapping (i.e.
linear functional) such that h(x) ≤ p(x) for all x ∈ H. Then there exists a linear
functional h̃ : E −→ R extending h and such that h̃(x) ≤ p(x) for all x ∈ E.
Proof. Let z ∈ E \ H. We will define h̃(x + tz) = h(x) + tα for all x ∈ H and
some suitable α ∈ R. We will need
h(x) + tα ≤ p(x + tz)
for all x ∈ H and all real t. If t = 0 this follows by hypothesis. If t > 0 then
divide by t replacing t⁻¹x by x. We need
h(x) + α ≤ p(x + z)
for all x ∈ H. Similarly, if t < 0 then divide by −t and replace −t⁻¹x by y. We
need
h(y) − α ≤ p(y − z)
for all y ∈ H. To recap, we need to have
sup_{y∈H} (h(y) − p(y − z)) ≤ α ≤ inf_{x∈H} (p(x + z) − h(x)).
If the choice of α is impossible, then there exist x, y ∈ H such that
h(y) − p(y − z) > p(x + z) − h(x).
But this implies
h(x + y) = h(x) + h(y) > p(x + z) + p(y − z) ≥ p(x + y)
a contradiction. The contradiction shows that a suitable choice of α ∈ R is always
possible.
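The admissible range for α can be computed in a small case. The sketch below assumes E = R² with p(x) = |x1| + |x2|, H the first coordinate axis, h(s, 0) = s and z = (0, 1); all of these are hypothetical choices, and the supremum and infimum are only sampled on a finite grid of exact rationals (in this example they happen to be attained there).

```python
from fractions import Fraction

def p(v):
    """Sublinear functional on R^2: the l^1 norm."""
    return abs(v[0]) + abs(v[1])

def h(s):
    """Linear functional on H = {(s, 0)}, written in the coordinate s."""
    return s

z = (Fraction(0), Fraction(1))
grid = [Fraction(k, 4) for k in range(-40, 41)]   # sample points s of H

# sup over y in H of h(y) - p(y - z), with y = (s, 0)
lower = max(h(s) - p((s - z[0], -z[1])) for s in grid)
# inf over x in H of p(x + z) - h(x), with x = (s, 0)
upper = min(p((s + z[0], z[1])) - h(s) for s in grid)

def extension_ok(alpha):
    """Check h(s) + t*alpha <= p((s, t)) on the sample grid."""
    return all(h(s) + t * alpha <= p((s, t)) for s in grid for t in grid)
```

In this example `lower` is −1 and `upper` is 1, so any α in [−1, 1] gives a dominated extension, while a value such as α = 2 fails. The extension is therefore far from unique.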
There are a number of important corollaries.
COROLLARY 20
Let E be a real Banach space and F a closed subspace (and
hence a Banach space in its own right). Let f ∈ F∗ , the dual space of F . Then
there exists f̃ ∈ E∗ extending f and ‖f̃‖ = ‖f‖ for the dual norms.
Proof. Without loss of generality, ‖f‖ = 1. It then suffices to take p(x) = ‖x‖.
COROLLARY 21
Let E be a complex Banach space and F a closed linear subspace (and hence a Banach space in its own right). Let f ∈ F∗ , the dual space of
F . Then there exists f̃ ∈ E∗ extending f and ‖f̃‖ = ‖f‖ for the dual norms.
Proof. Of course E and F can also be considered as real linear spaces. One
forgets how to perform scalar multiplication by complex scalars that are not real.
This process is called realification (at least by me). We again assume without
loss that ‖f‖ = 1. Let h(x) = ℜf(x). Then h satisfies the hypotheses of the
real Hahn–Banach Theorem and we deduce the existence of an extension to a
real linear form u on E with |u(x)| ≤ ‖x‖ for all x ∈ E. Now define f̃(x) =
u(x) − iu(ix). Then clearly f̃ is real-linear since u is. But
f̃(ix) = u(ix) − iu(−x) = i(u(x) − iu(ix)) = if̃(x)
showing that f̃ is actually complex-linear. Next, let x ∈ F . Then ix ∈ F and
f̃(x) = u(x) − iu(ix) = ℜf(x) − iℜf(ix) = ℜf(x) − iℜ(if(x)) = f(x).
To see this, let f(x) = a + ib with a, b ∈ R. Then
ℜf(x) − iℜ(if(x)) = a − iℜ(−b + ia) = a + ib = f(x).
Finally, let x ∈ E. Let f̃(x) = ω⁻¹|f̃(x)| where ω ∈ C and |ω| = 1. Then we have
f̃(ωx) = ωf̃(x) = |f̃(x)| ≥ 0 since we already showed that f̃ is complex-linear. But since
f̃(ωx) is real, it must be that f̃(ωx) = u(ωx). Therefore
|f̃(x)| = |f̃(ωx)| = |u(ωx)| ≤ ‖ωx‖ = |ω|‖x‖ = ‖x‖
since ‖ ‖ is a complex norm.
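The recovery of a complex-linear functional from its real part can be checked numerically. The sketch below assumes E = C² and an arbitrarily chosen functional f with hypothetical coefficients; it only illustrates the identity f(x) = u(x) − iu(ix) with u the real part of f, not the extension step itself.

```python
def f(x, c=(2 - 1j, 0.5 + 3j)):
    """A complex-linear functional on C^2 (the coefficients c are arbitrary)."""
    return c[0] * x[0] + c[1] * x[1]

def u(x):
    """Real part of f, a real-linear functional."""
    return f(x).real

def f_tilde(x):
    """Rebuild a complex-linear functional from u via u(x) - i u(ix)."""
    ix = tuple(1j * xi for xi in x)
    return u(x) - 1j * u(ix)

x = (1 + 2j, -3 + 0.5j)
ix = tuple(1j * xi for xi in x)

recovered = f_tilde(x)
original = f(x)
```

For this x both values equal 1 − 5.75i up to rounding, and `f_tilde(ix)` agrees with i times `f_tilde(x)`, matching the complex-linearity computation in the proof.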
COROLLARY 22
Let E be a Banach space over R or C. Let x ∈ E with
x ≠ 0E . Then there exists f ∈ E∗ with f(x) ≠ 0. Colloquially, the continuous
linear functionals on E separate the points of E.
We have the principle of duality for Banach spaces as follows. We denote by
E∗ the space of all continuous linear functionals on the Banach space E. We give
E∗ the operator norm
‖ϕ‖_{E∗} = sup_{x∈E, ‖x‖_E ≤ 1} |ϕ(x)|.
COROLLARY 23
We have
‖x‖_E = sup_{ϕ∈E∗, ‖ϕ‖_{E∗} ≤ 1} |ϕ(x)|.
Proof. The inequality ≥ follows from the definitions. If x = 0E , then the result
is obvious, so we fix a nonzero vector x in E. Let F be the one-dimensional linear
subspace of E spanned by x. Let ψ(tx) = t‖x‖_E define a continuous linear
functional on F . We have ‖ψ‖_{F∗} = 1. It can be extended to ϕ ∈ E∗ with
‖ϕ‖_{E∗} = 1 by the Hahn–Banach theorem. But |ϕ(x)| = ‖x‖_E and the other
inequality follows.
There are also geometrical versions of the Hahn–Banach theorem. We start
with
PROPOSITION 24
Let E be a real locally convex space. Let U be an open
convex subset of E and y ∈ E with y ∉ U . Then there exists a closed halfspace
H of E with y ∈ H and H ∩ U = ∅.
Proof. By translating we can assume that 0E ∈ U . Define
p(x) = inf{t > 0; t⁻¹x ∈ U }
analogous to the definition of the Minkowski functional. Then p(y) ≥ 1 (since
s > 0, sy ∈ U implies s < 1) and p is a sublinear functional. Let F be the linear
span of y. Define a linear functional f on F by f(y) = 1. The Hahn–Banach
theorem gives the existence of a linear functional f̃ on E such that f̃(y) = 1 and
f̃(x) ≤ p(x) for all x ∈ E. The halfspace H = {x ∈ E; f̃(x) ≥ 1} contains y.
For x ∈ H we have p(x) ≥ 1. For x ∈ U we have that there exists ε > 0 such that
(1 + ε)x ∈ U since U is open. Thus p(x) < 1. Therefore H and U are disjoint. It
remains to show that f̃ is continuous which will imply that H is closed. We can
choose a balanced open convex neighbourhood V of 0E with V ⊆ U . Let q be
the Minkowski functional of V and in particular a seminorm for the topology of
E. Then p(x) ≤ q(x) for all x ∈ E and it follows that f̃(x) ≤ q(x) for all x ∈ E.
But then −f̃(x) = f̃(−x) ≤ q(−x) = q(x) for all x ∈ E. So |f̃(x)| ≤ q(x).
Consequently f̃ is continuous and hence H is closed.
PROPOSITION 25
Let E be a real locally convex space. Let C be a nonempty
closed convex subset of E such that 0E ∉ C. Then there exists a continuous linear
form f on E and ε > 0 such that f(C) ⊆ [ε, ∞[.
Proof. Let U be a balanced convex open neighbourhood of 0E disjoint from C.
Then C + U is an open convex subset of E such that 0E ∉ C + U . By the previous
result, there exists a continuous linear form f on E such that f(C + U ) ⊆ ]0, ∞[.
Since C is nonempty, f is not identically zero. It cannot be that U ⊆ f⁻¹({0})
since f⁻¹({0}) is a closed linear hyperplane and U is absorbent. Therefore f(U )
is a balanced convex subset of R not equal to {0}. There must exist ε > 0 such
that ]−ε, ε[ ⊆ f(U ). It now follows that inf f(C) ≥ ε.
PROPOSITION 26
Let E be a real locally convex space. Let A be a compact
convex subset of E and B a closed convex subset of E with A and B disjoint.
Then there is a closed hyperplane H separating A and B.
Proof. First of all, the subset C = A − B is closed. To see this, take a point
x of the closure. Then there are nets (aα ) and (bα ) in A and B respectively such
that aα − bα → x. Extract from (aα ) a convergent subnet (possible since A is
compact). Then since B is closed the corresponding subnet of (bα ) converges to
a point of B. The set A − B is also convex. The problem has been reduced to
separating 0E from a closed convex subset C with a closed hyperplane. The result
follows from the previous proposition. Note that if A or B is empty, the result is
trivial.
1.8 The Krein–Milman Theorem
Let E be a locally convex space and C a compact convex subset. We say that
x ∈ C is an extreme point of C if there is no genuine line segment in C having x
as an interior point. Technically, the definition is as follows:
y, z ∈ C, 0 < t < 1, (1 − t)y + tz = x =⇒ y = z.
We denote by ex(C) the set of extreme points of C.
THEOREM 27 (THE KREIN–MILMAN THEOREM)
Let E be a locally convex
space, K a nonempty compact convex subset of E. Then K = cl(co(ex(K))).
The proof of this theorem depends on the concept of a face which may be
familiar to those familiar with the area of computational geometry in computer
science. If we take a closed cube in three dimensional space then the faces of the
cube are
• the whole cube,
• the six two-dimensional faces (using face in the conventional sense),
• the twelve one-dimensional edges,
• the eight vertex singletons.
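The count in this list can be reproduced by enumeration. The sketch below assumes the cube is [0, 1]³ and encodes a face by fixing each coordinate at 0 or 1 or leaving it free; the encoding with "*" and the helper names are illustrative choices, not notation from the notes.

```python
from itertools import product
from collections import Counter

def cube_faces(n=3):
    """Faces of [0,1]^n: fix each coordinate at 0 or 1, or leave it free ('*')."""
    return list(product((0, 1, "*"), repeat=n))

def dimension(face):
    """The dimension of a face is its number of free coordinates."""
    return sum(1 for c in face if c == "*")

faces = cube_faces(3)
counts = Counter(dimension(F) for F in faces)
```

With n = 3 this gives one face of dimension 3, six of dimension 2, twelve of dimension 1 and eight of dimension 0, for 27 faces in total, in agreement with the list above.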
The definition is as follows. A face of a compact convex set C is a nonempty
convex subset F of C such that
y, z ∈ C, 0 < t < 1, (1 − t)y + tz ∈ F =⇒ y, z ∈ F.
EXERCISE
1. C is a face of C.
2. x is an extreme point of C if and only if {x} is a face of C.
3. If F is a face of C, then ex(F ) ⊆ ex(C).
4. If F is a face of C, n ≥ 2, y1 , . . . , yn ∈ C, t1 , . . . , tn ∈]0, 1[, t1 + · · · +
tn = 1 and t1 y1 + · · · + tn yn ∈ F then y1 , . . . , yn ∈ F . This is a fairly
straightforward induction argument.
5. A nonempty intersection of faces is again a face.
□
LEMMA 28
Let K be a nonempty compact convex subset of a real locally convex space E. Let f : E −→ R be a continuous linear form on E. Let
s = sup f (K). Then
F = {x ∈ K; f (x) = s}
is a closed face of K.
Proof. Clearly F is nonempty. Let y, z ∈ K, 0 < t < 1 be such that (1 − t)y +
tz ∈ F . Then
s ≥ (1 − t)f (y) + tf (z) = f ((1 − t)y + tz) = s.
The only way out is that f (y) = f (z) = s. Thus y, z ∈ F .
PROPOSITION 29
Let K be a nonempty compact convex subset of a real locally convex
space E. Then ex(K) is nonempty.
Proof. This proof uses Zorn’s lemma. Consider the poset X of closed faces of
K ordered by reverse inclusion. Let C = (Fα )α∈I be a chain in X. Then consider
F = ⋂_{α∈I} Fα . For J a finite subset of I, ⋂_{α∈J} Fα is actually one of the Fα for
some α ∈ J and hence nonempty. It follows from the finite intersection property
that F is nonempty. We see that F is a closed face. Thus, every chain in X has an
upper bound. Hence X possesses a maximal element G.
Now suppose that y, z ∈ G with y ≠ z. Then, by the Hahn–Banach Theorem
there exists a continuous linear form f : E −→ R with f (y) ≠ f (z). Let s =
sup f (G). Then by the lemma, H = {x ∈ G; f (x) = s} is a closed face of G and
hence also a closed face of K. By maximality of G we have H = G, but this contradicts
f (y) ≠ f (z) since not both of f (y) and f (z) can be equal to s. Hence G is a
singleton and the result follows.
Proof of the Krein–Milman Theorem.
Clearly K ⊇ cl(co(ex(K))). To show the other inclusion, suppose for a contradiction that x ∈ K and
x ∉ cl(co(ex(K))). Then by the geometrical form of the Hahn–Banach theorem,
there exists a continuous linear form f on E that satisfies
f (x) > sup f (cl(co(ex(K)))).
But F = {u ∈ K; f (u) = sup f (K)} is a closed face of K disjoint from ex(K), since sup f (K) ≥ f (x). But
by the proposition, F possesses an extreme point, and this point is also an
extreme point of K, a contradiction.
1.9 Banach Spaces
We start by introducing the concept of a norm. For an element v of the vector
space E the norm of v (denoted ‖v‖) is to be thought of as the distance from 0E
to v, or as the “size” or “length” of v.
DEFINITION
A norm on a vector space E over R or C is a mapping
v ⟼ ‖v‖
from E to R+ with the following properties.
• ‖0E‖ = 0.
• v ∈ E, ‖v‖ = 0 ⇒ v = 0E .
• ‖tv‖ = |t| ‖v‖ for every scalar t and every v ∈ E.
• ‖v1 + v2‖ ≤ ‖v1‖ + ‖v2‖ for all v1 , v2 ∈ E.
The last of these conditions is called the subadditivity inequality . There are
really two definitions here, that of a real norm applicable to real vector spaces
and that of a complex norm applicable to complex vector spaces. However, every
complex vector space can also be considered as a real vector space — one simply
“forgets” how to multiply vectors by complex scalars that are not real scalars.
This process is called realification . In such a situation, the two definitions are
different. For instance,
‖x + iy‖ = max(|x|, 2|y|)   (x, y ∈ R)
defines a perfectly good real norm on C considered as a real vector space. On the
other hand, the only complex norms on C have the form
‖x + iy‖ = t (x² + y²)^{1/2}
for some t > 0.
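The difference between the two definitions can be checked numerically. The following Python sketch (an illustration added here, with helper names that are not from the notes) tests homogeneity of ‖x + iy‖ = max(|x|, 2|y|) for real scalars on random samples, and then shows that it fails for the complex scalar i.

```python
import random

def real_norm(z: complex) -> float:
    """The real norm ||x + iy|| = max(|x|, 2|y|) on C."""
    return max(abs(z.real), 2 * abs(z.imag))

def is_homogeneous(norm, scalar: complex, z: complex, tol: float = 1e-12) -> bool:
    """Check ||scalar * z|| = |scalar| * ||z|| up to rounding."""
    return abs(norm(scalar * z) - abs(scalar) * norm(z)) <= tol

rng = random.Random(0)
samples = [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(100)]

# Homogeneity for real scalars and subadditivity hold on the samples.
real_ok = all(is_homogeneous(real_norm, complex(t, 0), z)
              for z in samples for t in (-3.0, 0.5, 2.0))
triangle_ok = all(real_norm(z + w) <= real_norm(z) + real_norm(w) + 1e-12
                  for z in samples for w in samples)

# Homogeneity fails for the complex scalar i: ||i * 1|| = 2 but |i| * ||1|| = 1.
complex_ok = is_homogeneous(real_norm, 1j, 1 + 0j)
```

A finite sample cannot prove the norm axioms, of course; the point is only that a single complex scalar already witnesses the failure of complex homogeneity.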
The inequality
‖t1 v1 + t2 v2 + · · · + tn vn‖ ≤ |t1| ‖v1‖ + |t2| ‖v2‖ + · · · + |tn| ‖vn‖
holds for scalars t1 , . . . , tn and elements v1 , . . . , vn of E. It is an immediate consequence of the definition. We note that a norm is essentially a seminorm that
vanishes only at 0E .
EXERCISE Let C be a balanced convex absorbent subset of a vector space E.
Then the Minkowski functional of C is a seminorm, and it is a norm precisely when C contains no one-dimensional linear subspace of E.
□
A complete normed space is called a Banach space .
1.10 Quotients of Banach Spaces
PROPOSITION 30
If E is a normed space and N is a closed linear subspace of
E then the quotient space Q = E/N is again a normed space with the norm
\[ \|q\|_Q = \inf_{\pi(x) = q} \|x\|_E \tag{1.2} \]
for q ∈ Q, where π : E −→ Q denotes the canonical projection. This is known as the quotient norm.
It is more or less obvious that ‖ ‖Q is homogeneous. To show the subadditivity
of the norm, we argue by contradiction. Suppose that there exist ε > 0 and q1 , q2 ∈ Q
such that
‖q1 + q2‖ ≥ ‖q1‖ + ‖q2‖ + 3ε.   (1.3)
Then using the definition (1.2), we can find x1 , x2 ∈ E such that π(xj ) = qj and
‖xj‖E ≤ ‖qj‖ + ε
for j = 1, 2. Obviously, π(x1 + x2 ) = q1 + q2 so that
‖q1 + q2‖ ≤ ‖x1 + x2‖ ≤ ‖x1‖ + ‖x2‖ ≤ ‖q1‖ + ‖q2‖ + 2ε.
This contradiction with (1.3) establishes the subadditivity.
There is one final detail that requires a little proof. Suppose that q ∈ Q and
that ‖q‖Q = 0. Then, using (1.2) we can find a sequence (xj ) of elements of E
with π(xj ) = q for j = 1, 2, . . . and ‖xj‖ tending to zero. Clearly xj −→ 0E and
hence (x1 − xj ) −→ x1 . Since (x1 − xj ) ∈ N and since N is supposed to be
closed in E, we conclude that x1 ∈ N and consequently that q = 0Q .
It is perhaps worth noting that ‖π(x)‖Q ≤ ‖x‖E for all x ∈ E, so that π is
nonexpansive and in particular continuous.
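To make the infimum in (1.2) concrete, here is a Python sketch for one hypothetical example chosen for illustration: E = R² with the ℓ¹ norm and N = span{(1, 1)}. The grid search over t and the helper names are assumptions of the sketch. In this particular example the quotient norm of π(a, b) works out to |a − b|, which the sketch compares against.

```python
def l1_norm(v):
    """The l^1 norm on R^2."""
    return abs(v[0]) + abs(v[1])

def quotient_norm(v, steps=20001, radius=10.0):
    """Grid approximation of inf over t of ||v - t*(1, 1)||_1, i.e. the
    quotient norm of pi(v) in R^2 / N with N = span{(1, 1)}."""
    best = float("inf")
    for i in range(steps):
        t = -radius + 2.0 * radius * i / (steps - 1)
        best = min(best, l1_norm((v[0] - t, v[1] - t)))
    return best

def closed_form(v):
    """For this particular E and N the infimum equals |a - b|."""
    return abs(v[0] - v[1])

examples = [(3.0, 1.0), (2.0, 2.0), (-1.5, 4.0)]
approximations = [quotient_norm(v) for v in examples]
```

The second example lies in N, so its quotient norm is zero, and in every case the approximation stays below the ℓ¹ norm of the representative, in line with the remark that π is nonexpansive.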
EXERCISE Show that the topology induced by the quotient norm coincides with
the quotient topology.
□
PROPOSITION 31
Let E be a normed vector space with the property that whenever $x_j \in E$ are such that $\sum_{j=1}^{\infty} \|x_j\|_E < \infty$ we have that $\sum_{j=1}^{\infty} x_j$ converges in
E. Then E is complete.
Before proving this we need the following Lemma.
LEMMA 32
Let (xn ) be a Cauchy sequence in a metric space X. If (xn ) has a
convergent subsequence then (xn ) converges.
We leave the proof to the reader.
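Proof of Proposition 31. Let $(u_n)$ be a Cauchy sequence in E. Then applying
the definition of Cauchy sequence for $\varepsilon = 2^{-k}$ for all k ∈ N, we find $n_k$, which we may take to be strictly increasing in k, such that
\[ p, q \ge n_k \implies \|u_p - u_q\| < 2^{-k}. \]
It follows from this that $\|u_{n_{k+1}} - u_{n_k}\| < 2^{-k}$ and that
\[ \sum_{k=1}^{\infty} \|u_{n_{k+1}} - u_{n_k}\| < \infty. \]
Therefore, by hypothesis,
\[ \sum_{k=1}^{\infty} \bigl( u_{n_{k+1}} - u_{n_k} \bigr) \]
converges in E. The partial sums of this series telescope to $u_{n_{K+1}} - u_{n_1}$, so this is equivalent to saying that $(u_{n_k})$ converges. An application
of Lemma 32 now shows that $(u_n)$ converges. But $(u_n)$ was an arbitrary Cauchy
sequence in E, so we have shown that E is complete.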
We have the following Theorem.
THEOREM 33
Let E be a complete normed vector space and N a closed linear
subspace. Then the quotient space Q = E/N is a complete normed space with
the norm defined by (1.2).
Proof. It will suffice to show that Q satisfies the hypotheses of Proposition 31.
Towards this, let $q_n \in Q$ with $\sum_{n=1}^{\infty} \|q_n\|_Q < \infty$. We need to show that $\sum_{n=1}^{\infty} q_n$
converges in Q. By definition of the quotient norm, there exist $x_n \in E$ such
that $\pi(x_n) = q_n$ and $\|x_n\|_E \le 2\|q_n\|_Q$. (The 2 here could be replaced by $1 + \varepsilon$,
but not by 1. There are examples where the infimum defining the quotient norm
is not attained.) So $\sum_{n=1}^{\infty} \|x_n\| < \infty$ and since E is complete, it follows that
$\sum_{n=1}^{\infty} x_n$ converges, say to s ∈ E. But π is continuous and it now follows that
$\sum_{n=1}^{\infty} q_n = \sum_{n=1}^{\infty} \pi(x_n)$ converges to π(s) in Q.
1.11 The Open Mappings and Closed Graphs
The following result has a number of key applications that cannot be approached
in any other way.
THEOREM 34 (BAIRE’S CATEGORY THEOREM)
Let X be a complete metric
space or a compact Hausdorff topological space. Let $(A_k)$ be a sequence of closed
subsets of X with $\operatorname{int}(A_k) = \emptyset$. Then
\[ X \setminus \bigcup_{k=1}^{\infty} A_k \ \text{ is dense in } X. \tag{1.4} \]
In particular if X is nonempty we have
\[ \bigcup_{k=1}^{\infty} A_k \neq X. \]
THEOREM 35 (OPEN MAPPING THEOREM)
Suppose that U and V are complete normed spaces and let T : U −→ V be a continuous surjective linear map.
Then there is a constant C > 0 such that for every v ∈ V with ‖v‖ ≤ 1, there
exists u ∈ U with ‖u‖ ≤ C such that T (u) = v.
The reason for the terminology is that the statement that T is an open mapping
is equivalent to the conclusion of the Theorem.
Proof. There are two separate ideas in the proof. The first is to use the Baire
Category Theorem and the second involves iteration.
Let Bn denote {u : u ∈ U, ‖u‖ ≤ n}, the closed n-ball in U . Then, since T is
onto, we have
\[ V = \bigcup_{n \in \mathbb{N}} T(B_n). \]
We can’t use this directly in the Baire Category Theorem because we don’t know
that the T (Bn ) are closed. We take the easiest way around this difficulty and write
simply
\[ V = \bigcup_{n \in \mathbb{N}} \operatorname{cl}(T(B_n)). \]
By the Baire Category Theorem (Theorem 34), there exists n ∈ N such that cl(T (Bn ))
has nonempty interior. This means that there exist v ∈ V and t > 0 such that
UV (v, t) ⊆ cl(T (Bn )). By symmetry, it follows that UV (−v, t) ⊆ cl(T (Bn )). We
claim that UV (0V , t) ⊆ cl(T (Bn )). Let w ∈ UV (0V , t). Then w + v ∈ UV (v, t) and w − v ∈ UV (−v, t), so we can find two
sequences (xk ) and (yk ) in Bn such that (T (xk )) converges to w + v and (T (yk ))
converges to w − v. Since Bn is convex, ½(xk + yk ) ∈ Bn , and the sequence (T (½(xk + yk ))) converges to w.
This establishes the claim.
Now, let v be a generic element of V with ‖v‖ < t. Then v ∈ cl(T (Bn )).
Hence, there exists u0 ∈ Bn such that ‖v − T (u0 )‖ < ½ t. We repeat the argument,
but rescaled by a factor of ½ and applied to v − T (u0 ). Thus, there is an element
u1 ∈ U with ‖u1‖ ≤ ½ n and such that ‖v − T (u0 ) − T (u1 )‖ < ¼ t. Continuing in
this way leads to elements $u_k \in U$ with $\|u_k\| \le n\,2^{-k}$ such that
\[ \Bigl\| v - \sum_{k=0}^{\ell} T(u_k) \Bigr\| < t\,2^{-\ell-1}. \]
Using now the fact that U is complete (the completeness of V is needed to apply
Baire’s Theorem), we find that T (u) = v where
\[ u = \sum_{k=0}^{\infty} u_k \in U \]
is given by an absolutely convergent series and has norm bounded by 2n. Rescaling gives the required result.
An open mapping from one topological space to another is a mapping such
that the direct image of an open subset is always open. Let T be as in Theorem 35.
We explain why T is an open mapping. Let Ω be open in U . We need to show
that T (Ω) is open in V . Let v0 ∈ T (Ω). Then, there exists u0 ∈ Ω such that
T (u0 ) = v0 . Since Ω is open in U , there exists δ > 0 such that ‖u − u0‖U < δ
implies that u ∈ Ω. Now let ‖v − v0‖V < C⁻¹δ. Then according to Theorem 35
there exists w ∈ U with ‖w‖U ≤ C‖v − v0‖V < δ such that T (w) = v − v0 . Then
v = v0 + T (w) = T (u0 + w) = T (u) with u = u0 + w and ‖u − u0‖U < δ. Hence
u ∈ Ω and v ∈ T (Ω). We just showed that any point v sufficiently close to v0 in
V lies in T (Ω). Hence T (Ω) is open in V .
Conversely, let T be an open linear mapping T : U −→ V with U and V
normed spaces. Then the direct image of the open ball centred at zero of radius 1
in U contains an open ball of strictly positive radius ε centred at the zero vector
of V . Scaling shows that T is surjective and the conclusion of the Open Mapping
Theorem follows easily with any constant C > ε⁻¹.
COROLLARY 36
Let V be a vector space with two norms ‖ ‖1 and ‖ ‖2 , both
of which make V complete. Suppose that there is a constant C such that
‖v‖2 ≤ C‖v‖1   ∀v ∈ V.
Then ‖ ‖1 and ‖ ‖2 are equivalent norms.
Proof. Apply the Open Mapping Theorem in case that T is the identity mapping
from (V, ‖ ‖1 ) to (V, ‖ ‖2 ), which is continuous by hypothesis and clearly surjective.
It is possible to construct an infinite dimensional vector space with two incomparable norms both of which render it complete.
THEOREM 37 (INVERSE MAPPING THEOREM)
Let E and F be Banach
spaces. Let T : E −→ F be a continuous linear bijection. Then T⁻¹ : F −→ E is
also continuous.
Proof. This follows from the Open Mapping Theorem. Since T is onto, it is an
open mapping. But, since T is bijective this just says that T⁻¹ is continuous.
THEOREM 38 (CLOSED GRAPH THEOREM)
Let E and F be Banach spaces.
Let T : E −→ F be a linear mapping. Let G = {(x, T (x)); x ∈ E} ⊆ E × F .
Then T is continuous if and only if G is closed in E × F for the product topology.
Proof. If T is continuous, then G is closed. This is the trivial part of the proof. Let
(xn , T (xn )) be a sequence in G converging to (x, y) in E × F . Then xn converges
to x and since T is continuous T (xn ) converges to T (x). Hence T (x) = y. It
follows that G is closed.
For the converse consider the space E ⊕ F , the abstract direct sum of E and
F . As a vector space, it consists of objects x ⊕ y where x ∈ E and y ∈ F . We can
put a norm on E ⊕ F by ‖x ⊕ y‖ = ‖x‖E + ‖y‖F although the precise form of the
norm is not important. However, the topology induced by this norm corresponds
to the product topology on E × F . Hence, restricting the norm to G we obtain
a Banach space because G is closed and E ⊕ F is complete. But the mapping
G −→ E given by (x, T (x)) ↦ x is a continuous linear bijection. Applying the
inverse mapping theorem, we see that x ↦ (x, T (x)) is continuous. In particular,
T is continuous.
THEOREM 39
Let E be a Banach space and let G and L be closed subspaces such that
G + L is also closed. Then
• There is a constant C such that every z ∈ G + L can be written z = x + y
where x ∈ G, y ∈ L with ‖x‖ ≤ C‖z‖ and ‖y‖ ≤ C‖z‖.
• There is a constant C such that
d(x, G ∩ L) ≤ C (d(x, G) + d(x, L))
for all x ∈ E.
Proof. We form the abstract direct sum G ⊕ L and norm it by ‖x ⊕ y‖ = ‖x‖E +
‖y‖E where x ∈ G and y ∈ L. It is a Banach space since G and L are. Also
G + L, being a closed linear subspace of E, is a Banach space. The mapping
x ⊕ y ↦ x + y
is a continuous linear surjection G ⊕ L −→ G + L and hence an open mapping.
Therefore the open unit ball in G ⊕ L must map into a subset that contains a δ-ball
of G + L around 0E . Rescaling gives the first result.
For the second assertion let ε > 0. We can choose y ∈ G, z ∈ L such that
‖x − y‖ < d(x, G) + ε and ‖x − z‖ < d(x, L) + ε. Then y − z ∈ G + L and we may
find y′ ∈ G and z′ ∈ L with y − z = y′ − z′ and control ‖y′‖, ‖z′‖ ≤ C‖y − z‖.
Then y − y′ = z − z′ ∈ G ∩ L. So
\[ \begin{aligned} d(x, G \cap L) &\le \|x - y + y'\| \le \|x - y\| + \|y'\| \le d(x, G) + \varepsilon + C\|y - z\| \\ &\le d(x, G) + \varepsilon + C\|y - x\| + C\|x - z\| \\ &\le (1 + 2C)\bigl( d(x, G) + d(x, L) + \varepsilon \bigr). \end{aligned} \]
Letting ε tend to zero gives the result.
1.12 The Banach–Steinhaus Theorem
THEOREM 40 (BANACH–STEINHAUS THEOREM)
Suppose that E and F are
normed spaces with E complete and let $T_n : E \to F$ be continuous linear maps
for n ∈ N. Suppose that for every e ∈ E, we have
\[ \sup_{n \in \mathbb{N}} \|T_n(e)\|_F < \infty. \]
Then
\[ \sup_{n \in \mathbb{N}} \|T_n\|_{\mathrm{op}} < \infty, \]
where $\|\cdot\|_{\mathrm{op}}$ denotes the operator norm from E to F .
Proof.
Let us define for k ∈ N,
\[ A_k = \{ e \in E;\ \|T_n(e)\|_F \le k \text{ for all } n \in \mathbb{N} \} = \bigcap_{n=1}^{\infty} T_n^{-1}\bigl( B_F(0, k) \bigr). \]
Then $A_k$ is closed in E since it is an intersection of closed subsets of E. The
hypothesis is that $\bigcup_{k=1}^{\infty} A_k = E$. Therefore by the Baire Category Theorem, there
exists k ∈ N such that $\operatorname{int}(A_k) \neq \emptyset$. Hence there exist e ∈ E and $\varepsilon > 0$ such that
$U_E(e, \varepsilon) \subseteq A_k$. As in the proof of the Open Mapping Theorem, we deduce from
the symmetry and convexity of $A_k$ that $U_E(0, \varepsilon) \subseteq A_k$. But this says that
\[ x \in E,\ \|x\|_E < \varepsilon \implies \|T_n(x)\|_F \le k \text{ for all } n \in \mathbb{N}. \]
Rescaling this gives $\|T_n\|_{\mathrm{op}} \le 2k\varepsilon^{-1}$ for all n ∈ N which is the desired conclusion.
The Banach–Steinhaus Theorem can be used to show that there exists a continuous function on the circle whose Fourier series does not converge at a point.
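The Fourier series application rests on the fact that the functionals f ↦ (S_n f )(0) on the space of continuous functions on the circle, where S_n f is the n-th partial sum of the Fourier series, have operator norms equal to the Lebesgue constants L_n = (1/2π) ∫ |D_n(t)| dt, with D_n the Dirichlet kernel. These are known to grow like (4/π²) log n, so the operator norms are unbounded and Banach–Steinhaus rules out pointwise boundedness. The following Python sketch (an added illustration; the midpoint rule and the sample count are assumptions of the sketch) approximates a few of these constants numerically.

```python
import math

def lebesgue_constant(n: int, samples: int = 20000) -> float:
    """Midpoint-rule approximation of (1 / (2*pi)) times the integral over
    [-pi, pi] of |D_n(t)|, where D_n(t) = sin((n + 1/2) t) / sin(t / 2)."""
    h = 2 * math.pi / samples
    total = 0.0
    for k in range(samples):
        # Midpoints of the subintervals; with an even sample count none is 0.
        t = -math.pi + (k + 0.5) * h
        total += abs(math.sin((n + 0.5) * t) / math.sin(t / 2))
    return total * h / (2 * math.pi)

# Approximate Lebesgue constants for a few values of n.
constants = {n: lebesgue_constant(n) for n in (1, 4, 16, 64)}
```

For n = 1 the integral can be done by hand and gives 1/3 + 2√3/π, which provides a check on the quadrature; the computed values increase with n.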
1.13 Hilbert Spaces
A complete inner product space is called a Hilbert space . Hilbert spaces are important because they have almost magical properties and are usually very easy to
handle. They are extremely important in Physics, where they form the theoretical
basis for Quantum Mechanics.
The space L2 holds a very special position among the Lp spaces because it can
be given the structure of an inner product space.
In this section, we will omit proofs since they were covered in Math 455.
THEOREM 41
The form
\[ \langle f, g \rangle = \int \overline{f}\, g \, d\mu \]
defines an inner product on L2 (X, M, µ) which is compatible with the L2 norm.
The proof is completely straightforward, the key point being that the associated
norm of the inner product is just the L2 norm:
\[ \langle f, f \rangle = \int \overline{f} f \, d\mu = \int |f|^2 \, d\mu = \|f\|_2^2. \]
PROPOSITION 42
Let H be a Hilbert space (real or complex) and let C ⊆ H
be a nonempty closed convex subset. Let x ∈ H. Then there is a unique nearest
point y of C to x.
In fact, this defines a mapping PC : H −→ C called the metric projection onto
C. We do not need the Lemma below, but it is an interesting fact.
LEMMA 43
Let H be a Hilbert space (real or complex) and let C ⊆ H be a
nonempty closed convex subset. Then PC satisfies ‖PC (x1 ) − PC (x2 )‖ ≤ ‖x1 − x2‖ for all
x1 , x2 ∈ H.
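A case where the metric projection is explicit is the box C = [0, 1]^n in Euclidean space R^n, where the nearest point is obtained by clipping each coordinate. The Python sketch below (an added illustration with hypothetical helper names) uses this to test the inequality of Lemma 43 on random pairs of points.

```python
import math
import random

def project_onto_box(x, lo=0.0, hi=1.0):
    """Nearest point of the closed convex box [lo, hi]^n to x (Euclidean norm)."""
    return [min(max(c, lo), hi) for c in x]

def dist(u, v):
    """Euclidean distance between two points of R^n."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

rng = random.Random(1)
pairs = [([rng.uniform(-3, 3) for _ in range(5)],
          [rng.uniform(-3, 3) for _ in range(5)]) for _ in range(1000)]

# ||P_C(x1) - P_C(x2)|| <= ||x1 - x2|| on every sampled pair.
nonexpansive = all(
    dist(project_onto_box(x1), project_onto_box(x2)) <= dist(x1, x2) + 1e-12
    for x1, x2 in pairs
)
```

For instance, `project_onto_box([2.0, -1.0, 0.5])` clips to the point (1, 0, 0.5) of the box.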
Let H be a Hilbert space either real or complex. Let S ⊆ H. Then we define
S ⊥ = {x; x ∈ H, ⟨s, x⟩ = 0 for all s ∈ S}.
It is clear that S ⊥ is an intersection of closed linear subspaces of H and therefore
it is a closed linear subspace of H.
THEOREM 44
Let M be a closed linear subspace of H. Then we have H =
M ⊕ M ⊥ . Furthermore, let P and Q be the linear projection operators onto M
and M ⊥ associated with the direct sum. Then P and Q are norm decreasing and
in fact, more generally we have ‖x‖² = ‖P (x)‖² + ‖Q(x)‖² for all x ∈ H.
We denote S ⊥⊥ = (S ⊥ )⊥ . This set has a neat characterization.
LEMMA 45
Let H be a Hilbert space either real or complex. Let S ⊆ H. Let
M be the closure of the linear span of S. Then S ⊥⊥ = M .
THEOREM 46
Let H be a Hilbert space and let f be a continuous linear map
from H to the base field k. Then there exists z ∈ H such that f (x) = ⟨z, x⟩ for all x ∈ H.
An orthonormal set is usually an indexed set (eα )α∈I where I is the indexing
set. The key property that it has to satisfy is
\[ \langle e_\alpha, e_\beta \rangle = \begin{cases} 1 & \text{if } \alpha = \beta, \\ 0 & \text{if } \alpha \neq \beta. \end{cases} \]
Given a finite linearly independent set in an inner product space, one usually constructs an orthonormal set by using the Gram–Schmidt Orthogonalization Process.
Note that if you are computing an orthonormal basis on a computer you should
use the modified Gram–Schmidt Orthogonalization Process to avoid roundoff error instabilities.
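A minimal sketch of the modified process in pure Python follows, for real vectors given as lists of floats. It is an added illustration and not code from the course; the function names and the tolerance are assumptions. The distinguishing feature is that, as soon as a new unit vector is produced, its component is subtracted from all remaining vectors, rather than subtracting all earlier components from each vector in one pass.

```python
import math

def dot(u, v):
    """Euclidean inner product of two real vectors."""
    return sum(a * b for a, b in zip(u, v))

def modified_gram_schmidt(vectors, tol=1e-12):
    """Orthonormalize linearly independent real vectors (lists of floats)."""
    work = [list(map(float, v)) for v in vectors]
    basis = []
    for i in range(len(work)):
        norm = math.sqrt(dot(work[i], work[i]))
        if norm <= tol:
            raise ValueError("vectors are (numerically) linearly dependent")
        e = [c / norm for c in work[i]]
        basis.append(e)
        # Remove the e-component from every remaining vector immediately.
        for j in range(i + 1, len(work)):
            coeff = dot(e, work[j])
            work[j] = [c - coeff * ec for c, ec in zip(work[j], e)]
    return basis

basis = modified_gram_schmidt([[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])
```

In exact arithmetic this produces the same vectors as the classical process; the difference is only in how rounding errors propagate.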
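THEOREM 47
Let $(e_\alpha)_{\alpha \in I}$ be an orthonormal set. Then
(i) If $(c_\alpha) \in \ell^2$, then the series $\sum_{\alpha \in I} c_\alpha e_\alpha$ is a norm convergent unconditional
sum and furthermore $\bigl\| \sum_{\alpha \in I} c_\alpha e_\alpha \bigr\|_H = \bigl( \sum_{\alpha \in I} |c_\alpha|^2 \bigr)^{1/2}$.
(ii) If x ∈ H, then $\sum_{\alpha \in I} |\langle e_\alpha, x \rangle|^2 \le \|x\|^2$.
(iii) If M is the closed linear span of $(e_\alpha)_{\alpha \in I}$, then we have
\[ P(x) = \sum_{\alpha \in I} \langle e_\alpha, x \rangle e_\alpha \]
where P is orthogonal projection on M .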
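Statements (ii) and (iii) can be checked by hand in a small case. The Python sketch below (an added illustration; the two-element orthonormal set in R⁴ and the vector x are arbitrary choices) computes the coefficients ⟨e_α, x⟩, the projection P (x) and the gap in the inequality of (ii).

```python
def inner(u, v):
    """Euclidean inner product on R^n."""
    return sum(a * b for a, b in zip(u, v))

# A two-element orthonormal set in R^4 (chosen for illustration).
e1 = [0.5, 0.5, 0.5, 0.5]
e2 = [0.5, -0.5, 0.5, -0.5]
orthonormal_set = [e1, e2]

def project(x, basis):
    """P(x) = sum over the orthonormal set of <e, x> e."""
    coeffs = [inner(e, x) for e in basis]
    return [sum(c * e[i] for c, e in zip(coeffs, basis)) for i in range(len(x))]

x = [1.0, 2.0, 3.0, 4.0]
coefficients = [inner(e, x) for e in orthonormal_set]
px = project(x, orthonormal_set)
residual = [a - b for a, b in zip(x, px)]

# Gap in (ii): ||x||^2 minus the sum of squared coefficients.
bessel_gap = inner(x, x) - sum(c * c for c in coefficients)
```

Here the gap equals the squared norm of the residual x − P (x), and the residual is orthogonal to both elements of the set, as the direct sum decomposition of Theorem 44 predicts.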
Let H be a Hilbert space. An orthonormal basis in H is a maximal orthonormal set. It turns out that in the finite dimensional case, orthonormal bases are
simply linear bases that are also orthonormal. But, in the infinite dimensional
case, orthonormal bases are never linear bases. First we need to address the question of existence or, more generally extension. In this setting, we’ll simply work
with unindexed sets.
LEMMA 48
Every orthonormal set is contained in some orthonormal basis.
We need a theorem that characterizes orthonormal bases.
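THEOREM 49
Let $(e_\alpha)_{\alpha \in I}$ be an orthonormal set in a Hilbert space H. Then
the following are equivalent.
(i) $(e_\alpha)_{\alpha \in I}$ is an orthonormal basis.
(ii) The closed linear span M of $(e_\alpha)_{\alpha \in I}$ is the whole of H.
(iii) The identity
\[ \sum_{\alpha \in I} |\langle e_\alpha, x \rangle|^2 = \|x\|^2 \]
holds for all x ∈ H.
(iv) The identity
\[ \sum_{\alpha \in I} \langle y, e_\alpha \rangle \langle e_\alpha, x \rangle = \langle y, x \rangle \]
holds for all x, y ∈ H.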
There are some important consequences of this result and the existence of
orthonormal bases.
COROLLARY 50
(i) If H is a finite dimensional Hilbert space, then it is linearly isometric to d
dimensional Euclidean space Rd or Cd , depending on the field of scalars,
where d = dim(H).
(ii) If H is an infinite dimensional but separable Hilbert space, then it is linearly
isometric to ℓ2 over the appropriate field of scalars.
1.14 Standard Banach Spaces and their Duals
If E and F are Banach spaces and T : E −→ F is a continuous linear mapping,
then the operator norm of T is defined by
\[ \|T\|_{\mathrm{op}} = \sup_{x \in E,\ \|x\|_E \le 1} \|T(x)\|_F. \]
EXERCISE Show that, for a linear mapping T : E −→ F , the finiteness of ‖T‖op is equivalent to the continuity of
T.
□
EXERCISE Let L(E, F ) denote the space of all continuous linear mappings
from E to F . Show that the operator norm is a norm on L(E, F ) and that L(E, F )
is complete (and hence a Banach space) with this norm.
□
The special case in which F = k the base field defines the dual space E ∗ . We
have
\[ \|f\|_{E^*} = \sup_{x \in E,\ \|x\|_E \le 1} |f(x)|. \]
The important Banach spaces and their duals are:
• C(K) for K a compact Hausdorff topological space with the uniform norm.
The dual space can be identified with M (K) the space of finite regular Borel measures
on K (real (i.e. signed) measures if k = R, complex if k = C) with the total
variation norm. By finite, we mean that the total variation is finite. The
duality is given as follows. If g is a continuous linear form on C(K), then
g(f ) = ∫ f dµ where µ ∈ M (K).
We will not prove this result. You can find the proof in Rudin’s Real &
Complex Analysis.
• C0 (X) for X a locally compact Hausdorff space, C0 (X) consisting of continuous functions vanishing at infinity. Again the norm is the uniform norm.
The dual space can be identified with M (X) the space of finite regular Borel
measures on X with the total variation norm and the duality is realized as
above. If every open subset of X is a countable union of compact subsets, then you can drop the
regularity of the measure. It is satisfied automatically.
• Lp (X, M, µ), the Lp space on the measure space (X, M, µ) where µ is a
positive (not necessarily finite) measure. For 1 < p < ∞, the dual space of
$L^p(X, M, \mu)$ is identified with $L^{p'}(X, M, \mu)$ where $p'$ is the conjugate index
to p, i.e. $p^{-1} + (p')^{-1} = 1$, and the linear functional g is realized by the function
$h \in L^{p'}(X, M, \mu)$ by $g(f) = \int f h \, d\mu$.
In case p = 1, the dual space is L∞ (X, M, µ) provided that (X, M, µ) is a
σ-finite measure space with the duality being realised in the same way.
2 Banach Space Duality
Let E be a Banach space. Let E ∗ denote the dual space. Then we have defined
the norm on E ∗ as the operator norm
\[ \|f\|_{E^*} = \sup_{x \in E,\ \|x\|_E \le 1} |f(x)|. \]
It is a consequence of the Hahn–Banach theorem (Corollary 23) that
\[ \|x\|_E = \sup_{f \in E^*,\ \|f\|_{E^*} \le 1} |f(x)| \]
and in particular, the elements of E ∗ separate the points of E.
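In finite dimensions this duality formula can be verified directly. The Python sketch below (an added illustration) takes E = Rⁿ with the ℓ¹ norm, whose dual is Rⁿ with the sup norm. Since f ↦ |f (x)| is convex, its supremum over the cube [−1, 1]ⁿ is attained at a vertex, so it suffices to try the sign vectors.

```python
from itertools import product

def l1_norm(x):
    """The l^1 norm on R^n."""
    return sum(abs(c) for c in x)

def sup_over_dual_ball(x):
    """sup of |f(x)| over the unit ball of (R^n, l^1)^* = (R^n, sup norm).
    The supremum over the cube [-1, 1]^n is attained at a vertex, so only
    the sign vectors need to be tested."""
    return max(abs(sum(s * c for s, c in zip(signs, x)))
               for signs in product((-1.0, 1.0), repeat=len(x)))

x = (3.0, -1.0, 2.0)
norm_of_x = l1_norm(x)
dual_sup = sup_over_dual_ball(x)
```

Both quantities equal 6 for this x, the supremum being attained at the sign vector (1, −1, 1).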
The bidual E ∗∗ is the dual of E ∗ . There is a canonical mapping J : E −→ E ∗∗
defined by J(x)(f ) = f (x). It is clear that this mapping is a linear isometry. If it is
bijective, we say that E is reflexive . For example Lp is reflexive for 1 < p < ∞,
Hilbert spaces are reflexive. The space c0 is not reflexive since its dual is `1 and
the dual of `1 is `∞ .
There are two important locally convex topologies. The weak topology on
E denoted σ(E, E ∗ ) is the topology defined by the seminorms x 7→ |f (x)| as f
runs over E ∗ . More interesting is the weak star topology on E ∗ denoted σ(E ∗ , E)
defined by the seminorms f 7→ |f (x)| as x runs over E. More generally you can
define the topology σ(E, F ) where F is a linear subspace of E ∗ . The reason that
the weak star topology is so important is the following theorem.
THEOREM 51
The closed unit ball of E ∗ is weak star compact.
The proof depends on the Tychonov product theorem.
THEOREM 52 (TYCHONOV PRODUCT THEOREM)
Let $X_\alpha$ be compact topological spaces for every α ∈ I where I is an index set. Then $\prod_{\alpha \in I} X_\alpha$ is a compact
topological space in the product topology.
There are two proofs of this theorem. Both are difficult.
Proof of Theorem 51. We give the proof in the complex case. For each x ∈ E
let $D_x$ be a copy of the closed disk of radius $\|x\|_E$ in the complex plane. Let
\[ D = \prod_{x \in E} D_x. \]
A typical point of D will be denoted $(z_x)_{x \in E}$.
Then by the Tychonov product theorem, D is compact for the product topology. Let $Z_{x_1, x_2, t_1, t_2}$ be the subset of D given by
\[ Z_{x_1, x_2, t_1, t_2} = \{ (z_x);\ z_{t_1 x_1 + t_2 x_2} = t_1 z_{x_1} + t_2 z_{x_2} \} \]
for $t_1, t_2 \in \mathbb{C}$ and $x_1, x_2 \in E$. The condition $z_{t_1 x_1 + t_2 x_2} = t_1 z_{x_1} + t_2 z_{x_2}$ defines a
closed subset of $D_{t_1 x_1 + t_2 x_2} \times D_{x_1} \times D_{x_2}$ and it follows that $Z_{x_1, x_2, t_1, t_2}$ is a closed
subset of D since it involves only a finite number (at most three) of coordinates. Let
\[ Z = \bigcap_{x_1, x_2 \in E,\ t_1, t_2 \in \mathbb{C}} Z_{x_1, x_2, t_1, t_2}, \]
then Z is a closed subset of D (intersection of closed subsets) and hence compact. The final step in
the proof is to realize that there is a one-to-one correspondence between
$\{ u \in E^*;\ \|u\|_{E^*} \le 1 \}$ and Z given by $u \mapsto (z_x)$ where $z_x = u(x)$ for all x ∈ E. In
each case, the weak star topology on $\{ u \in E^*;\ \|u\|_{E^*} \le 1 \}$ and the topology on
Z inherited from the product topology are given by pointwise convergence on the
elements x ∈ E. Since Z is compact, so is the unit ball of $E^*$.
For F ⊆ E we define F ◦ = {f ∈ E ∗ ; f (x) = 0, ∀x ∈ F }, a closed linear
subspace of E ∗ called the annihilator of F . It should be clear that F has the same
annihilator as its closed linear span. We can also define the annihilator of a subset
of E ∗ but, by convention, we take this in E and not in E ∗∗ . Thus for N ⊆ E ∗ , we
have
N ◦ = {x ∈ E; f (x) = 0, ∀f ∈ N }.
It should be apparent that F ◦◦ is the closed linear span of F . The fact that the closed
linear span of F is a subset of F ◦◦ is evident. If there is x ∈ F ◦◦ that is not in the
closed linear span of F , then we may invoke the geometrical form of the Hahn–
Banach theorem (Proposition 26) to find a closed hyperplane separating x from the closed linear span of F
and hence an element f ∈ E ∗ vanishing on F but with f (x) ≠ 0, contradicting
x ∈ F ◦◦ .
Let E be a Banach space and F a closed linear subspace. Then a continuous
linear projection of E on F is a continuous linear mapping P : E −→ F such
that P |F is the identity mapping on F . In particular P 2 = P . In this situation, it
follows that ker(P ) is a closed linear subspace of E such that E = F ⊕ ker(P ),
the (continuous) linear projection on ker(P ) being I − P . The direct sum is
implemented by
\[ x = \underbrace{P(x)}_{\in F} + \underbrace{(I - P)(x)}_{\in \ker(P)}. \]
It should be stressed that except in trivial circumstances (like F = E or F =
{0E }) there may be many different continuous linear projections onto F .
LEMMA 53
1. If F is a finite dimensional linear subspace of E, then there is a continuous
projection on F .
2. If F is a closed linear subspace of E of finite codimension, then there is a
continuous projection on F .
Proof. If F is finite dimensional, then choose a basis $e_1, \dots, e_n$ of F . Since F is
necessarily closed, we define $\varphi_k(t_1 e_1 + \cdots + t_n e_n) = t_k$, a continuous linear form
on F (every linear form on a finite dimensional normed space is continuous) which extends by Hahn–Banach to a continuous linear form $\tilde{\varphi}_k$ on E. Then
$P(x) = \sum_{k=1}^{n} \tilde{\varphi}_k(x) e_k$ is a continuous linear projection from E to F .
For the case where F is closed and of finite codimension, form the quotient
space Q = E/F which is finite dimensional and select a basis q1 , . . . , qn . For
x ∈ E we will have
\[ \pi(x) = \sum_{k=1}^{n} t_k(x) q_k \]
where the $t_k$ are continuous linear forms on E since π is continuous and Q is
finite dimensional. Now choose $e_1, \dots, e_n \in E$ such that $\pi(e_k) = q_k$ for k =
1, . . . , n. The mapping $x \mapsto x - \sum_{k=1}^{n} t_k(x) e_k$ is now the desired continuous
linear projection onto F .
LEMMA 54
Let E be a Banach space and F a closed linear subspace. Then the
dual of the inclusion mapping induces a linear isometry from E ∗ /F ◦ onto F ∗ . In particular,
for f ∈ E ∗ , we have d(f, F ◦ ) = ‖f |F‖F ∗ .
Proof. Let J be the inclusion of F into E. Then the dual map J ∗ maps E ∗
onto F ∗ and has kernel F ◦ . Surjectivity follows from Hahn–Banach. Clearly
J ∗ is norm decreasing and so it induces a norm decreasing map from E ∗ /F ◦ to
F ∗ . But, by Hahn–Banach, every element of F ∗ can be extended to E ∗ without
increase of norm. Hence the map E ∗ /F ◦ to F ∗ is an isometry. Let f ∈ E ∗ . Then
decoding we get $\|J^*(f)\|_{F^*} = \sup_{x \in F,\ \|x\| \le 1} |f(x)| = \|f|_F\|_{F^*}$. On the other hand, we may
consider the quotient space Q = E ∗ /F ◦ and we find that d(f, F ◦ ) = ‖π(f )‖Q , which equals ‖J ∗ (f )‖F ∗ by the isometry.
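The identity d(f, F ◦ ) = ‖f |F‖F ∗ can be seen in a two-dimensional example. In the Python sketch below (an added illustration; the choice of space, subspace and grid search are all assumptions) E = R² carries the sup norm, so E ∗ carries the ℓ¹ norm, and F = span{(1, 1)}, so that F ◦ consists of the functionals (c, −c).

```python
def restriction_norm(f):
    """||f|_F|| for F = span{(1, 1)} in (R^2, sup norm): the unit ball of F
    is {t*(1, 1) : |t| <= 1}, so the norm is |f(1, 1)| = |a + b|."""
    a, b = f
    return abs(a + b)

def distance_to_annihilator(f, steps=20001, radius=10.0):
    """Grid approximation of the l^1 distance from f = (a, b) to the
    annihilator {(c, -c) : c real}; l^1 is the dual of the sup norm."""
    a, b = f
    best = float("inf")
    for i in range(steps):
        c = -radius + 2.0 * radius * i / (steps - 1)
        best = min(best, abs(a - c) + abs(b + c))
    return best

functionals = [(2.0, 0.5), (1.0, -1.0), (-3.0, 1.0)]
comparison = [(restriction_norm(f), distance_to_annihilator(f)) for f in functionals]
```

The second functional lies in the annihilator, so both quantities vanish there; in the other cases the two columns agree up to the grid resolution.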
This situation can be contrasted with the following.
LEMMA 55
Let E be a Banach space and F a Banach space which is a dense
linear subspace of E such that the inclusion J : F −→ E is continuous. Then the
dual of the inclusion mapping gives a continuous inclusion of E ∗ into F ∗ .
Proof. Let f ∈ E ∗ and suppose J ∗ (f ) = 0. Then for x ∈ F we have f (J(x)) =
J ∗ (f )(x) = 0. Thus f vanishes on F . But since F is dense in E and f is
continuous it follows that f = 0. Hence J ∗ is injective.
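LEMMA 56
Let T : E −→ F be a continuous linear operator between two
Banach spaces E and F . Then
(i) ker T = im(T ∗ )◦ .
(ii) ker(T ∗ ) = im(T )◦ .
(iii) ker(T )◦ ⊇ cl(im(T ∗ )), with equality if cl is taken to be the weak star closure.
(iv) ker(T ∗ )◦ = cl(im(T )).
Proof. Everything hinges on
(T ∗ (f ))(x) = f (T (x)).
For (i) the inclusion ⊆ is obvious. On the other hand, if x ∈ im(T ∗ )◦ , then
f (T (x)) = 0 for all f ∈ F ∗ . It follows that T (x) = 0 since F ∗ separates the points of F .
Statement (ii) follows similarly. Statement (iv) follows from (ii) on taking annihilators, since im(T )◦◦ is the closed linear span of im(T ). For (iii), every element of im(T ∗ ) vanishes on ker(T ), and ker(T )◦ is closed. The norm closure cannot be replaced by equality in general, because of the convention defining the annihilator of a subset of E ∗ .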
PROPOSITION 57
Let E be a Banach space, G and L closed linear subspaces. Then
(i) G ∩ L = (G◦ + L◦ )◦ .
(ii) G◦ ∩ L◦ = (G + L)◦ .
(iii) (G ∩ L)◦ ⊇ cl(G◦ + L◦ ).
(iv) (G◦ ∩ L◦ )◦ = cl(G + L).
Proof. For (i) the ⊆ inclusion is obvious. To see this, let x ∈ G ∩ L and f ∈
G◦ + L◦ . Then f (x) = 0. In the other direction we have G◦ ⊆ G◦ + L◦ and
taking annihilators G = G◦◦ ⊇ (G◦ + L◦ )◦ and similarly for L. For (ii) take
(i) and replace G by G◦ , L by L◦ . For (iii) it is clear that (G ∩ L)◦ ⊇ G◦ + L◦
and the left-hand member is closed. For (iv), take annihilators in (ii) and use
(G + L)◦◦ = cl(G + L).
Note that you cannot show (G ∩ L)◦ = cl(G◦ + L◦ ) by applying annihilators
to (i). This is because for X ⊆ E it is true that X ◦◦ is the closed linear span
of X, but you cannot make the same conclusion in case X ⊆ E ∗ because of the
convention defining the annihilator of a subset of E ∗ .
THEOREM 58
Let E be a Banach space, G and L closed linear subspaces.
Then the following are equivalent:
(a) G + L is closed in E.
(b) G◦ + L◦ is closed in E ∗ .
(c) G + L = (G◦ ∩ L◦ )◦ .
(d) G◦ + L◦ = (G ∩ L)◦ .
Proof. We see that (a) and (c) are equivalent from (iv) above. Also (d) implies
(b). It remains to show that (a) implies (d) and that (b) implies (a).
To show that (a) implies (d), since one inclusion is obvious, we need only
show that G◦ + L◦ ⊇ (G ∩ L)◦ . Let f ∈ (G ∩ L)◦ . For x ∈ G + L say with
x = g + ℓ we define ϕ(x) = f (g). This is well-defined since if also x = g′ + ℓ′
then the difference g − g′ = ℓ′ − ℓ lies in G ∩ L. Since G + L is closed, Theorem 39 shows that with a
good choice of g we have ‖g‖ ≤ C‖x‖ and hence ϕ is continuous. Now extend
ϕ (defined on the closed linear subspace G + L) by Hahn–Banach to ϕ̃ defined on the whole of E.
Then f = (f − ϕ̃) + ϕ̃ with the first member in G◦ and the second in L◦ .
To show that (b) implies (a). Since G◦ + L◦ is closed, Theorem 39 gives
\[ d(f, G^\circ \cap L^\circ) \le C \bigl( d(f, G^\circ) + d(f, L^\circ) \bigr) \]
for all f ∈ E ∗ . Decoding this using (ii) from Proposition 57 and Lemma 54, we
get
\[ \sup_{x \in \operatorname{cl}(G + L),\ \|x\| \le 1} |f(x)| \le C \Bigl( \sup_{x \in G,\ \|x\| \le 1} |f(x)| + \sup_{x \in L,\ \|x\| \le 1} |f(x)| \Bigr). \tag{2.1} \]
Now let x ∈ cl(G + L) with ‖x‖ ≤ 1. We claim that C⁻¹x ∈ cl(BG + BL ). If
not, then by Hahn–Banach, there exist α ∈ ]0, 1[ and f ∈ E ∗ such that f (C⁻¹x) = 1
and f (g + ℓ) ≤ α for all g ∈ BG , ℓ ∈ BL . But this contradicts (2.1). The claim is
proved.
But this means that any vector x ∈ cl(G + L) with ‖x‖ ≤ 1 can be written in
the form x = g + ℓ + z where g ∈ G, ℓ ∈ L with ‖g‖, ‖ℓ‖ ≤ C and ‖z‖ ≤ ½. An
iteration argument shows that x ∈ G + L and completes the proof.
3 Compact Operators and the Fredholm Alternative
Let E and F be Banach spaces. A continuous operator T ∈ L(E, F ) is said to be
compact iff T (BE ) has compact closure in F for the norm topology on F . The
set of all such operators is denoted K(E, F ).
EXERCISE
• Show that T compact is equivalent to the statement that for every sequence
(xn ) in BE , the sequence (T (xn )) has a convergent subsequence in F .
• Show that K(E, F ) is a linear subspace of L(E, F ).
□
LEMMA 59
We have that K(E, F ) is a closed linear subspace of L(E, F ).
Proof. Let Tn −→ T in the operator norm with Tn ∈ K(E, F ). We need to show
that T (BE ) has compact closure. But the closure of T (BE ) is closed and hence
complete, so it will suffice to show that T (BE ) is totally bounded (recall that the
closure of a totally bounded set is again totally bounded). Let δ > 0, then we need
to cover T (BE ) with finitely many δ-balls centred in T (BE ). Choose n such that
‖T − Tn‖op < ⅓δ. Since Tn (BE ) has compact closure and hence is totally bounded, we may
cover Tn (BE ) by finitely many ⅓δ-balls centred in Tn (BE ). If the centres of these
balls are Tn (zk ), then the δ-balls centred at T (zk ) will cover T (BE ), by the triangle inequality.
Clearly, a finite rank operator T is necessarily compact since T (BE ) is a
bounded subset of a finite dimensional space. Hence an operator norm limit of
finite rank operators is again compact.
The converse of this statement is in general false in Banach spaces, a fact that
was established by Per Enflo. That is, not every compact operator is an operator
norm limit of finite rank operators. As we shall see later, the converse does hold in
Hilbert spaces.
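A standard Hilbert space example of approximation by finite rank operators is the diagonal operator T on ℓ² with T e_k = e_k /k, approximated by the truncations T_n that keep only the first n coordinates; here ‖T − T_n‖op = 1/(n + 1). The Python sketch below is an added illustration: it works in a finite section of ℓ² of dimension `DIM` (an assumption of the sketch) and compares random unit vectors with the extremal vector e_{n+1}.

```python
import math
import random

DIM = 200  # size of the finite section of l^2 used here (an assumption)

def tail_operator(n, x):
    """Apply T - T_n, where T e_k = e_k / k and T_n keeps coordinates 1..n."""
    return [0.0 if k <= n else c / k for k, c in enumerate(x, start=1)]

def norm(x):
    """Euclidean norm."""
    return math.sqrt(sum(c * c for c in x))

def random_unit_vector(rng):
    v = [rng.gauss(0.0, 1.0) for _ in range(DIM)]
    length = norm(v)
    return [c / length for c in v]

rng = random.Random(2)
n = 5

# Largest value of ||(T - T_n) x|| seen over random unit vectors x.
worst_random = max(norm(tail_operator(n, random_unit_vector(rng)))
                   for _ in range(200))

# The bound 1 / (n + 1) is attained at the basis vector e_{n+1}.
e_next = [1.0 if k == n + 1 else 0.0 for k in range(1, DIM + 1)]
attained = norm(tail_operator(n, e_next))
```

Random vectors stay below 1/(n + 1), while e_{n+1} attains it, so the truncations converge to T in operator norm.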
EXERCISE Let E, F , G and H be Banach spaces and let R : E −→ F , S :
F −→ G and T : G −→ H be continuous linear maps with S compact. Then
SR, T S (and for that matter T SR) are all compact.
□
Let T : E −→ F be a continuous linear mapping between Banach spaces E
and F . Then the dual map T ∗ : F ∗ −→ E ∗ (some authors call this the adjoint) is
defined by
T ∗ (φ)(x) = φ(T (x))
for all φ ∈ F ∗ and x ∈ E. It should be trivial to see that ‖T‖op = ‖T ∗‖op .
PROPOSITION 60
Let T ∈ K(E, F ). Then T ∗ ∈ K(F ∗ , E ∗ ) and conversely.
Proof. Let ψn ∈ BF ∗ . We wish to extract a subsequence such that T ∗ (ψnk )
converges in E ∗ . Let K = cl(T (BE )) a compact subset of F . Define ϕn = ψn |K .
These are continuous functions on the compact metric space K. We aim to apply
the Ascoli–Arzelà theorem. For $x \in B_E$ we have $|\varphi_n(T(x))| \le \|T(x)\| \le \|T\|$.
Extending by continuity, this gives $\|\varphi_n\|_\infty \le \|T\|$. For $x_1, x_2 \in B_E$, we have
\[ |\varphi_n(T(x_1)) - \varphi_n(T(x_2))| = |\psi_n(T(x_1 - x_2))| \le \|T(x_1) - T(x_2)\|_F. \]
Passing to the limit this gives
\[ |\varphi_n(y_1) - \varphi_n(y_2)| \le \|y_1 - y_2\| \]
for all y1 , y2 ∈ K. So the sequence (ϕn ) is uniformly bounded and uniformly
equicontinuous on K and must therefore possess a convergent subsequence (ϕnk )
converging uniformly on K to a function ϕ. Therefore
\[ \sup_{x \in B_E} |\psi_{n_k}(T(x)) - \varphi(T(x))| \to 0. \]
It follows that T ∗ (ψnk ) is a Cauchy sequence in E ∗ . Since E ∗ is complete (where
did we prove this?) the first assertion is verified. For the second, we have that
T ∗∗ ∈ K(E ∗∗ , F ∗∗ ). Hence, T ∗∗ (BE ) has compact closure in F ∗∗ . But T ∗∗ (BE ) =
T (BE ) ⊆ F (as subsets of F ∗∗ ). Also F is closed in F ∗∗ . It follows that T (BE )
has compact closure in F .
L EMMA 61 (R IESZ ’ S L EMMA )
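Let $E$ be a Banach space and $F$ a closed linear subspace with $F \ne E$. Then given $\varepsilon > 0$, there exists $x \in E$ with $\|x\| = 1$
and $d(x, F) \ge 1 - \varepsilon$.
Proof. Choose an element $z$ of unit norm in the quotient space $E/F$, then an
element $y \in E$ with $\|y\| < 1 + \varepsilon$ that projects down onto $z$. Then $d(y, F) = \|z\| = 1$.
Putting $x = \|y\|^{-1} y$ gives $\|x\| = 1$ and $d(x, F) = \|y\|^{-1} > (1 + \varepsilon)^{-1} \ge 1 - \varepsilon$, which does the trick.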
T HEOREM 62 (T HE F REDHOLM A LTERNATIVE )
Let T ∈ K(E). Then
i) ker(I − T ) is finite-dimensional.
ii) im(I − T ) is closed being actually ker(I − T ∗ )◦ .
iii) ker(I − T ) = {0E } if and only if im(I − T ) = E.
iv) dim(ker(I − T )) = dim(ker(I − T ∗ ))
Proof. For (i) the unit ball of ker(I − T ) is contained in T (BE ) and hence is
compact. So ker(I − T ) must be finite-dimensional.
For (ii) Let yn = xn − T (xn ) −→ y ∈ E. We must show that y ∈ im(I − T ).
Let δn = d(xn , ker(I − T )) ≥ 0. Since ker(I − T ) is finite dimensional, there
exists zn ∈ ker(I − T ) such that kxn − zn k = δn . Now
yn = (xn − zn ) − T (xn − zn ).
(3.1)
We claim that $\|x_n - z_n\|$ remains bounded as $n \to \infty$. If not, there is a subsequence such that $\|x_n - z_n\|$ tends to $\infty$. We pass to this subsequence (without change of notation). Let $w_n = \|x_n - z_n\|^{-1}(x_n - z_n)$. It follows that
$w_n - T(w_n) = \|x_n - z_n\|^{-1} y_n$ tends to $0_E$. But since $w_n$ is a unit vector, we
may pass to a further subsequence (still without change of notation) such that
$T(w_n)$ converges. Thus $w_n$ and $T(w_n)$ both converge to a vector $w$ and clearly
$T(w) = w$, i.e. $w \in \ker(I - T)$. But
\[ d(w_n, \ker(I - T)) = \frac{d(x_n, \ker(I - T))}{\|x_n - z_n\|} = 1 \]
gives a contradiction. Hence the claim is proved.
Since kxn − zn k remains bounded as n → ∞, we may extract a subsequence
(again without change of notation) such that T (xn − zn ) converges say to u. But
now we see from (3.1) that
xn − zn = yn + T (xn − zn )
therefore converges to y + u. Substituting back into (3.1) and passing to the limit
gives y = (y + u) − T (y + u) so that y ∈ im(I − T ) as required. It now follows
from Lemma 56(iv) that im(I − T ) = ker(I − T ∗ )◦ .
For (iii) we show first that $\ker(I - T) = \{0_E\}$ implies $\operatorname{im}(I - T) = E$. Let
$E_1 = \operatorname{im}(I - T)$ and suppose that $E_1 \ne E$. Let more generally $E_n = \operatorname{im}((I - T)^n)$
for $n \in \mathbb{N}$, then by (ii) the $E_n$ are all closed and since $I - T$ is injective, each $E_{n+1}$
is strictly contained in $E_n$. By Riesz's Lemma, choose unit vectors $z_n \in E_n$ with $d(z_n, E_{n+1}) \ge \tfrac12$.
With $n > m$, we have
\[ T(z_m) - T(z_n) = z_m - \overbrace{\bigl( z_n - (z_n - T(z_n)) + (z_m - T(z_m)) \bigr)}^{\in E_{m+1}} \]
and it follows that $\|T(z_m) - T(z_n)\| \ge \tfrac12$, a contradiction with the compactness of
$T$.
For the other implication in (iii) suppose that $\operatorname{im}(I - T) = E$. Then $\ker(I - T^*) = \{0_{E^*}\}$. Since $T^*$ is compact, we can apply the part of (iii) already proved to
establish that $\operatorname{im}(I - T^*) = E^*$. But by Lemma 56(i) it follows that $\ker(I - T) = \{0_E\}$.
We move on to (iv). Suppose that dim(ker(I − T )) < dim(ker(I − T ∗ )).
Then since ker(I − T ) is finite-dimensional, it admits a complement in E and
there is a continuous linear projection P : E −→ ker(I − T ). On the other hand,
im(I − T ) = ker(I − T ∗ )◦ is closed and has strictly larger finite codimension in
E. Thus, im(I − T ) has a complement M in E with dimension strictly bigger
than dim(ker(I − T )). Therefore, there is a linear map J : ker(I − T ) −→ M
which is injective but not surjective. Let S = T + JP . Then since JP has finite
rank, S is compact.
We claim that $\ker(I - S) = \{0_E\}$. If
\[ 0_E = z - S(z) = \overbrace{(z - T(z))}^{\in \operatorname{im}(I - T)} - \overbrace{J(P(z))}^{\in M} \]
then $z - T(z) = 0_E$ and $J(P(z)) = 0_E$. Then $z \in \ker(I - T)$ so $P(z) = z$ and
hence $J(z) = 0_E$. Since $J$ is injective, $z = 0_E$. This establishes the claim.
Applying (iii) to $S$ we obtain $\operatorname{im}(I - S) = E$. But we know $\operatorname{im}(I - S) \subseteq \operatorname{im}(I - T) + \operatorname{im}(J)$, a contradiction since $J$ does not map onto $M$. This shows
$\dim(\ker(I - T)) \ge \dim(\ker(I - T^*))$.
For the inequality $\dim(\ker(I - T)) \le \dim(\ker(I - T^*))$ we apply the same
argument to $T^*$ to obtain $\dim(\ker(I - T^*)) \ge \dim(\ker(I - T^{**}))$ and the result
follows since $\ker(I - T) \subseteq \ker(I - T^{**})$.
4
Spectral Theory of Hilbert Space Operators
In this chapter we take a look at various aspects of the spectral theory and symbolic
calculus of linear operators on a complex Hilbert space H.
L EMMA 63
Let $T$ be a continuous linear operator on $H$. Then there exists an
operator $T^\star$ called the adjoint of $T$ such that
\[ \langle T^\star y, x \rangle = \langle y, T x \rangle \tag{4.1} \]
for all $x, y \in H$.
Proof. Let $y \in H$ and let $u$ be the continuous linear form $u(x) = \langle y, x \rangle$ on $H$.
Then $x \mapsto u(Tx)$ is again a continuous linear form. By Theorem 46 there exists $z \in H$ such
that $u(Tx) = \langle z, x \rangle$. That is
\[ \langle z, x \rangle = \langle y, T x \rangle. \]
We now see that the mapping $y \mapsto z$ so defined is linear and continuous. Therefore
we may define a continuous linear operator $T^\star$ such that (4.1) holds.
The mapping $T \mapsto T^\star$ is conjugate linear. We also observe that $(T^\star)^\star = T$
and that $\|T^\star\| = \|T\|$. In case $H$ is $\mathbb{C}^n$ with the standard inner product, $T^\star$ is
essentially the complex conjugate transpose matrix of $T$.
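The last remark can be checked numerically. The following is a minimal Python sketch, not part of the original notes: the helper names `inner`, `apply` and `adjoint` and the sample matrix are illustrative assumptions, and the inner product is taken conjugate linear in the first slot, which is the convention the notes appear to use.

```python
# Check <T* y, x> = <y, T x> on C^2 with T* the conjugate transpose.
# Assumption: <u, v> is conjugate linear in u and linear in v.

def inner(u, v):
    """Inner product <u, v> = sum conj(u_i) * v_i."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))

def apply(M, x):
    """Matrix-vector product M x."""
    return [sum(M[i][j] * x[j] for j in range(len(x))) for i in range(len(M))]

def adjoint(M):
    """Conjugate transpose of a matrix given as a list of rows."""
    rows, cols = len(M), len(M[0])
    return [[complex(M[j][i]).conjugate() for j in range(rows)] for i in range(cols)]

T = [[1 + 2j, 3j], [2 - 1j, 4 + 0j]]   # arbitrary sample operator
x = [1 - 1j, 2 + 0.5j]
y = [0.5j, -1 + 3j]

lhs = inner(apply(adjoint(T), y), x)    # <T* y, x>
rhs = inner(y, apply(T, x))             # <y, T x>
print(abs(lhs - rhs))                   # should be at rounding level
```

With the opposite convention (linear in the first slot) the same check goes through after swapping the roles of the two slots in `inner`.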
DEFINITION A continuous linear operator $T$ on $H$ is said to be self-adjoint or
hermitian if $T^\star = T$ and normal if $T^\star T = T T^\star$.
DEFINITION A continuous linear operator $T$ on $H$ is said to be a compact
operator if
\[ \{T(x);\ x \in H,\ \|x\| \le 1\} \]
has compact closure in $H$ (for the norm topology).
E XERCISE
For operators on Hilbert space we have:
• Every operator of finite rank is compact.
• An operator norm limit of compact operators is compact. (It suffices to
show that the image of the unit ball is totally bounded).
• If T is compact and S is a continuous operator, then ST is compact.
• If T is compact and S is a continuous operator, then T S is compact.
2
4.1
Spectral Theory of Compact Hermitian operators
L EMMA 64
Let $T$ be a continuous hermitian operator on $H$. Then
\[ \|T\| = \sup_{\|x\| \le 1} |\langle x, T x \rangle|. \]
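Proof. Clearly the left hand side is $\ge$ the right hand side. We only need to
establish the opposite inequality. Assume temporarily that $Tx \ne 0$. We have
\begin{align*}
4|\Re\langle y, Tx \rangle| &= 2|\langle y, Tx \rangle + \langle Tx, y \rangle| = 2|\langle y, Tx \rangle + \langle x, Ty \rangle| \\
&= |\langle x + y, T(x + y) \rangle - \langle x - y, T(x - y) \rangle| \\
&\le |\langle x + y, T(x + y) \rangle| + |\langle x - y, T(x - y) \rangle| \\
&\le \sup_{\|z\| \le 1} |\langle z, Tz \rangle| \left( \|x + y\|^2 + \|x - y\|^2 \right) \\
&= 2 \sup_{\|z\| \le 1} |\langle z, Tz \rangle| \left( \|x\|^2 + \|y\|^2 \right).
\end{align*}
Now replace $y$ by $\omega y$ with $|\omega| = 1$ and optimize to get
\[ 2|\langle y, Tx \rangle| \le \sup_{\|z\| \le 1} |\langle z, Tz \rangle| \left( \|x\|^2 + \|y\|^2 \right). \]
Put $y = \|x\| \|Tx\|^{-1} Tx$ to get
\[ 2\|x\| \|Tx\| \le 2 \sup_{\|z\| \le 1} |\langle z, Tz \rangle| \, \|x\|^2. \]
Since $Tx \ne 0$, we have $x \ne 0$ and we find
\[ \|Tx\| \le \sup_{\|z\| \le 1} |\langle z, Tz \rangle| \, \|x\|. \]
But this holds in any case if $Tx = 0$ and the proof is complete.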
L EMMA 65
Let $T$ be a compact hermitian operator on a nonzero Hilbert space
$H$. Then either $\|T\|$ or $-\|T\|$ is an eigenvalue of $T$.
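Proof. By the previous lemma, we may find a sequence $x_n \in H$ with $\|x_n\| = 1$
and $|\langle x_n, T x_n \rangle| \to \|T\|$. Passing to a subsequence, we can assume without loss
of generality that there exists $y \in H$ and $\lambda = \pm\|T\|$ such that
\[ \langle x_n, T x_n \rangle \to \lambda \quad\text{and}\quad T x_n \to y \]
as $n \to \infty$. Then
\begin{align*}
0 \le \limsup_{n \to \infty} \|T x_n - \lambda x_n\|^2
&= \limsup_{n \to \infty} \left( \|T x_n\|^2 - 2\lambda \langle x_n, T x_n \rangle + \lambda^2 \|x_n\|^2 \right) \\
&\le \|T\|^2 - 2\lambda^2 + \lambda^2 = \|T\|^2 - \lambda^2 = 0.
\end{align*}
Therefore $T x_n - \lambda x_n \to 0$ and consequently $\lambda x_n \to y$. Applying $T$ we get
$\lambda T x_n \to T y$ and hence $\lambda y = T y$. If $y \ne 0$, then $y$ is an eigenvector with eigenvalue $\lambda$. If $y = 0$, then
since $\lambda x_n \to y$ and $\|x_n\| = 1$ we have $\lambda = 0$. Consequently $\|T\| = 0$, so $T = 0$ and, $H$ being nonzero, zero is
an eigenvalue.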
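For a concrete feel for Lemmas 64 and 65, here is a small Python sketch on $\mathbb{R}^2$ (an illustration only, not from the notes). The helpers `sym_eigs` and `op_norm` and the sample entries are assumptions; the closed form for the eigenvalues of a symmetric $2 \times 2$ matrix is standard.

```python
import math

def sym_eigs(a, b, d):
    """Eigenvalues (smaller, larger) of the real symmetric matrix [[a, b], [b, d]]."""
    mid = (a + d) / 2.0
    rad = math.hypot((a - d) / 2.0, b)
    return mid - rad, mid + rad

def op_norm(a, b, d):
    """Operator norm of T = [[a, b], [b, d]]: sqrt of the top eigenvalue of T^T T."""
    # T^T T = T^2 is again symmetric, with the entries below.
    p, q, r = a * a + b * b, a * b + b * d, b * b + d * d
    return math.sqrt(sym_eigs(p, q, r)[1])

a, b, d = 1.0, 2.0, -3.0            # sample hermitian (real symmetric) operator
lo, hi = sym_eigs(a, b, d)
norm = op_norm(a, b, d)

# Lemma 64: sup of |<x, T x>| over unit vectors x = (cos t, sin t), sampled on a grid.
steps = 20000
sup_q = max(
    abs(a * math.cos(t) ** 2 + 2 * b * math.cos(t) * math.sin(t) + d * math.sin(t) ** 2)
    for t in (k * math.pi / steps for k in range(steps + 1))
)

print(lo, hi, norm, sup_q)
```

Up to grid error, `sup_q` agrees with `norm`, and `norm` equals `max(|lo|, |hi|)`, so one of $\pm\|T\|$ is indeed an eigenvalue.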
T HEOREM 66
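Let $T$ be a compact hermitian operator on $H$ and, for each eigenvalue $\lambda$ of $T$, let $H_\lambda$ denote the corresponding eigenspace. Then
• The closure $K$ of $\oplus H_\lambda$ is the whole of $H$.
• The eigenvalues are all real.
• The spaces $H_\lambda$ are finite dimensional for $\lambda \ne 0$.
• The only possible accumulation point of the eigenvalues is zero.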
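Proof. It is easy to see that the only possible eigenvalues are real. The closed
subspace $K$ is clearly invariant under $T$. Since $T$ is hermitian, $K^\perp$ is also invariant
under $T$. Also the restriction of $T$ to $K^\perp$ is a compact hermitian operator on $K^\perp$.
By the previous lemma, either $K^\perp$ is zero or $T|_{K^\perp}$ has an eigenvalue. The second
scenario leads to $K \cap K^\perp \ne \{0\}$, a contradiction. Hence $K^\perp$ is zero and $K = H$.
Now let $\delta > 0$ and choose a unit vector $e_\lambda$ in $H_\lambda$ for every eigenvalue $\lambda$ with
$|\lambda| > \delta$. Then $T e_\lambda = \lambda e_\lambda$ so that $\|T e_\lambda\| \ge \delta$. For distinct eigenvalues $\lambda_1$ and $\lambda_2$,
we have $e_{\lambda_1} \perp e_{\lambda_2}$ and hence still $T e_{\lambda_1} \perp T e_{\lambda_2}$, so that $\|T e_{\lambda_1} - T e_{\lambda_2}\| \ge \sqrt{2}\,\delta$. By the compactness of $T$ it follows that there cannot be infinitely many such
$\lambda$. A similar argument shows that each individual $H_\lambda$ with $\lambda \ne 0$ is finite dimensional.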
T HEOREM 67
Let $T : H \to K$ be a compact linear map with $H$ and
$K$ Hilbert spaces. Then $T$ has a singular value decomposition. Namely, there
exists a countable index set $I$, orthonormal subsets $(e_\alpha)_{\alpha \in I}$, $(f_\alpha)_{\alpha \in I}$ of $H$ and $K$
respectively and positive reals $(\sigma_\alpha)_{\alpha \in I}$ (not necessarily distinct) but not having a
non-zero accumulation point such that
\[ T x = \sum_{\alpha \in I} \sigma_\alpha \langle e_\alpha, x \rangle f_\alpha \]
for every $x \in H$, the sum (for $Tx$) converging in $K$.
Proof. Note that both $T^\star T$ and $T T^\star$ are compact and hermitian. Let $\lambda$ be an
eigenvalue of $T^\star T$. Then for $x$ a nonzero eigenvector we have
\[ \lambda \|x\|^2 = \langle x, T^\star T x \rangle = \|Tx\|^2 \]
so that $\lambda \ge 0$. Let $H_\lambda$ be the corresponding eigenspace and let $K_\lambda$ be the
eigenspace of $T T^\star$ for $\lambda$. Then
\[ (T T^\star) T x = T (T^\star T) x = \lambda T x \]
showing that $T$ maps $H_\lambda$ to $K_\lambda$. Similarly $T^\star$ maps $K_\lambda$ to $H_\lambda$. If $\lambda > 0$ we have
$\lambda^{-1} T^\star T x = x$ for all $x \in H_\lambda$ and likewise $T \lambda^{-1} T^\star y = y$ for all $y \in K_\lambda$, showing that $\lambda^{-1} T^\star|_{K_\lambda}$ is inverse to $T|_{H_\lambda}$. Hence $H_\lambda$
and $K_\lambda$ have the same dimension. It follows (using standard finite-dimensional
linear algebra) that there are orthonormal bases of $H_\lambda$ and $K_\lambda$ such that the matrix
representation of $T|_{H_\lambda}$ is $\sqrt{\lambda}\, I$ where $I$ is the identity matrix. Glueing these
orthonormal bases together for all possible eigenvalues $\lambda > 0$ gives the result.
4.2
Tensor products of inner product spaces
In this section, we will assume that you know all about tensor products of vector
spaces.
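Let $E$ and $F$ be inner product spaces and let $E \otimes F$ be their tensor product
as vector spaces. Then there is a natural inner product on $E \otimes F$, defined essentially by $\langle \xi_1 \otimes \eta_1, \xi_2 \otimes \eta_2 \rangle = \langle \xi_1, \xi_2 \rangle \langle \eta_1, \eta_2 \rangle$ and by extending by linearity and
conjugate linearity. We check that this actually is an inner product. For a tensor
$\tau = \sum_{j=1}^n \xi_j \otimes \eta_j$, this yields
\[ \langle \tau, \tau \rangle = \sum_{j,k} \langle \xi_j, \xi_k \rangle \langle \eta_j, \eta_k \rangle \]
which is always nonnegative (some linear algebra required). Another way of seeing this and of completing the next step is to choose orthonormal bases $(e_\alpha)$ and
$(f_\beta)$ of the linear spans of the $\xi$'s and $\eta$'s respectively allowing us to write
\[ \xi_j = \sum_\alpha a_{\alpha,j}\, e_\alpha, \qquad \eta_j = \sum_\beta b_{\beta,j}\, f_\beta. \]
Then
\[ \langle \tau, \tau \rangle = \sum_{\alpha,\beta,j,k} \overline{a_{\alpha,j}}\, a_{\alpha,k}\, \overline{b_{\beta,j}}\, b_{\beta,k} = \sum_{\alpha,\beta} \Bigl| \sum_k a_{\alpha,k}\, b_{\beta,k} \Bigr|^2 \ge 0. \]
But now, if $\langle \tau, \tau \rangle = 0$, we have that $\sum_k a_{\alpha,k} b_{\beta,k} = 0$ for all $\alpha$ and $\beta$. But then
\[ \tau = \sum_{\alpha,\beta,k} a_{\alpha,k}\, b_{\beta,k}\, e_\alpha \otimes f_\beta = 0. \]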
Hence we have a genuine inner product.
Now let E and F be Hilbert spaces. Then unfortunately E ⊗ F is not necessarily complete in its inner product as defined above. The Hilbert space tensor
product E ⊗H F of E and F is defined as the completion of E ⊗ F for the
corresponding norm. Now if (eα ) and (fβ ) are orthonormal bases of E and F
respectively, then (eα ⊗ fβ ) is an orthonormal basis of E ⊗H F . This is easily seen
since (eα ⊗ fβ ) is orthonormal and the closure of its linear span is E ⊗H F .
Next, let E = L2 (X, F, µ) and F = L2 (Y, G, ν). Since we wish to discuss the
L2 space on the product, we are obliged to assume at this point that both measure
spaces are σ-finite. The product measure µ × ν is not defined otherwise. Then it
is easy to see that the mapping T : E ⊗H F → L2 (X × Y, F ⊗ G, µ × ν) defined
by T (f ⊗ g) = F where F (x, y) = f (x)g(y) and by extending by linearity is a
well defined isometry (and hence one-to-one).
E XERCISE The mapping T is onto. (Use the argument in the math 455 notes or
use orthogonality).
2
T HEOREM 68
Let $(e_\alpha)_{\alpha \in I}$ and $(f_\beta)_{\beta \in J}$ be orthonormal bases of $L^2(X, \mathcal{F}, \mu)$
and $L^2(Y, \mathcal{G}, \nu)$ respectively where $\mu$ and $\nu$ are σ-finite measures. Then the functions $e_\alpha(x) f_\beta(y)$ form an orthonormal basis of $L^2$ of the product space as $(\alpha, \beta)$
runs over $I \times J$.
EXERCISE Incidentally, to show that $L^2(\mathbb{T}, \eta) \otimes L^2(\mathbb{T}, \eta)$ is not the whole of
$L^2(\mathbb{T}, \eta) \otimes_H L^2(\mathbb{T}, \eta) = L^2(\mathbb{T} \times \mathbb{T}, \eta \times \eta)$, consider
\[ P(f \otimes g)(t) = \int f(t - s)\, g(s)\, d\eta(s). \]
Show that $P$ maps $L^2(\mathbb{T}, \eta) \otimes L^2(\mathbb{T}, \eta)$ into the space of functions with absolutely
convergent Fourier series. On the other hand, given $h \in L^2(\mathbb{T})$, the function
$F(t, s) = h(t + s)$ gets mapped by $P$ to $h$.
Now show that for E and F Hilbert spaces, E ⊗ F = E ⊗H F if and only if
E or F is finite dimensional.
2
4.3
Hilbert–Schmidt Operators
In this section, let (X, S, µ) and (Y, T , ν) be σ-finite measure spaces. We will also
assume that the corresponding L2 spaces are separable. Let K be a L2 function
on the product space. We consider the operator
\[ T f(x) = \int K(x, y) f(y)\, d\nu(y) \tag{4.2} \]
which we will see shortly is a continuous operator from $L^2(Y, \mathcal{T}, \nu)$ to
$L^2(X, \mathcal{S}, \mu)$. First observe that
\[ \int |K(x, y)|\, |f(y)|\, d\nu(y) \]
is finite for $\mu$-almost all $x$. Now let $g \in L^2(X, \mathcal{S}, \mu)$ and observe that
\begin{align*}
\int |g(x)| \int |K(x, y)|\, |f(y)|\, d\nu(y)\, d\mu(x)
&= \int |K(x, y)|\, |g(x) f(y)|\, d(\mu \times \nu)(x, y) \\
&\le \left( \int |K(x, y)|^2\, d(\mu \times \nu)(x, y) \right)^{\frac12} \left( \int |g(x)|^2 |f(y)|^2\, d(\mu \times \nu)(x, y) \right)^{\frac12} < \infty
\end{align*}
by Tonelli's Theorem and the Cauchy–Schwarz inequality. This sets us up to use
Fubini's Theorem and we prove easily that
\[ \left| \int g(x)\, T f(x)\, d\mu(x) \right| = \left| \int g(x) \int K(x, y) f(y)\, d\nu(y)\, d\mu(x) \right| \le \|K\|_2 \|f\|_2 \|g\|_2. \]
To conclude from this, we define sets $A_k \in \mathcal{S}$ of finite measure increasing with union $X$ and set
$g_k = \mathbb{1}_{A_k} \min(k, |Tf|)\, \overline{\operatorname{sgn}(Tf)}$ which is definitely in $L^2$. Now the inequality
$|g_k|^2 \le g_k\, Tf$ holds since it boils down to $\min(k^2, |Tf|^2) \le |Tf| \min(k, |Tf|)$.
Therefore
\[ \int |g_k(x)|^2\, d\mu(x) \le \int g_k(x)\, T f(x)\, d\mu(x) \le \|K\|_2 \|f\|_2 \|g_k\|_2. \]
Since we know that $\|g_k\|_2 < \infty$ we can conclude
\[ \|g_k\|_2 \le \|K\|_2 \|f\|_2. \]
Letting $k \to \infty$ we finally get $\|Tf\|_2 \le \|K\|_2 \|f\|_2$ by monotone convergence.
We further deduce that $\|T\| \le \|K\|_2$.
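But choosing orthonormal bases $(e_j)$ and $(f_k)$ of $L^2(Y, \mathcal{T}, \nu)$ and $L^2(X, \mathcal{S}, \mu)$ respectively, we
can write
\[ K = \sum_{j,k} c_{jk}\, e_j \otimes f_k \]
where $\|K\|_2^2 = \sum_{j,k} |c_{jk}|^2$ and the sum converges in the $L^2$ norm. It follows that
$T$ is an operator norm limit of finite rank operators and hence is compact. Thus,
in fact we may write from the singular value decomposition of $T$
\[ T(g) = \sum_{i=1}^\infty \sigma_i \langle e_i, g \rangle f_i \]
for possibly different orthonormal families $(e_i)$ in $L^2(Y, \mathcal{T}, \nu)$ and $(f_i)$ in $L^2(X, \mathcal{S}, \mu)$. We now have
\[ K(x, y) = \sum_{i=1}^\infty \sigma_i\, f_i(x)\, \overline{e_i(y)} \]
and $\|K\|_2^2 = \sum_{i=1}^\infty \sigma_i^2$.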
One may define the von Neumann–Schatten classes of compact operators to
be the ones for which
\[ \sum_{i=1}^\infty \sigma_i^p < \infty. \]
The corresponding quantity
\[ \left( \sum_{i=1}^\infty \sigma_i^p \right)^{\frac1p} \]
turns out to be a norm for $1 \le p < \infty$ and defines the class $C_p$. This fact is not
entirely trivial. The class $C_2$ consists of the Hilbert–Schmidt operators and $C_1$ is
the so-called trace class.
In order to set up the $C_p$ norm, we make the definition in an underhanded way.
We set for $T$ a compact operator
\[ \|T\|_{C_p} = \sup\{ |\operatorname{tr}(TS)| \} \]
as $S$ runs over finite rank operators with $\sum_{k=1}^{\operatorname{rank}(S)} \sigma_k(S)^{p'} \le 1$ where $(\sigma_k)$ are the
singular values of $S$. For any finite rank operator $S = \sum_{k=1}^{\operatorname{rank}(S)} t_k\, e_k \otimes f_k^*$ we define
the trace of $TS$ by $\operatorname{tr}(TS) = \sum_{k=1}^{\operatorname{rank}(S)} t_k \langle f_k, T(e_k) \rangle$. It is obvious that $\|\cdot\|_{C_p}$ is a
norm and by duality that
\[ \left( \sum_{i=1}^\infty \sigma_i(T)^p \right)^{\frac1p} \le \|T\|_{C_p}. \]
The class $C_\infty$ is the class of all compact operators with the operator norm. Of
course in this case, we have that the sequence of singular values lies in $c_0$.
We will need the following proposition.
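PROPOSITION 69
Let $a_{j,k}$ be nonnegative for $j \in \mathbb{N}$ and $k = 1, \ldots, n$. Suppose that
\[ \sum_{j=1}^\infty a_{j,k} \le 1 \quad\text{for all } k = 1, \ldots, n \tag{4.3} \]
and
\[ \sum_{k=1}^n a_{j,k} \le 1 \quad\text{for all } j \in \mathbb{N}. \tag{4.4} \]
Let $(\alpha_j)_{j \in \mathbb{N}}$ and $(\beta_k)_{k=1}^n$ be decreasing sequences of nonnegative numbers. Then
\[ \sum_{j=1}^\infty \sum_{k=1}^n a_{j,k}\, \alpha_j \beta_k \le \sum_{k=1}^n \alpha_k \beta_k. \]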
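A quick numerical instance of Proposition 69 may help fix the statement. This Python sketch is an illustration only: it assumes $a_{j,k} = 0$ for $j > n$, so that the infinite sum is finite, and the particular matrix and sequences are arbitrary choices.

```python
def weighted_sum(a, alpha, beta):
    """Left hand side: sum over j, k of a[j][k] * alpha[j] * beta[k]."""
    return sum(a[j][k] * alpha[j] * beta[k]
               for j in range(len(a)) for k in range(len(a[0])))

n = 4
# a = (identity + cyclic shift) / 2: nonnegative with every row and column sum equal to 1,
# so hypotheses (4.3) and (4.4) hold.
a = [[0.5 if (k == j or k == (j + 1) % n) else 0.0 for k in range(n)] for j in range(n)]
alpha = [5.0, 3.0, 2.0, 0.5]   # decreasing, nonnegative
beta = [4.0, 4.0, 1.0, 0.0]    # decreasing, nonnegative

lhs = weighted_sum(a, alpha, beta)
rhs = sum(alpha[k] * beta[k] for k in range(n))
print(lhs, rhs)
```

Here `lhs` is 29.5 and `rhs` is 34.0: spreading the mass of the identity pairing over other pairs $(j, k)$ can only decrease the weighted sum when both sequences are decreasing.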
We now apply the proposition.
PROPOSITION 70
We have
\[ \|T\|_{C_p} \le \left( \sum_{i=1}^\infty \sigma_i(T)^p \right)^{\frac1p} \]
showing that the right hand side actually is a norm.
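Proof. We write
\[ T = \sum_{j=1}^\infty \sigma_j(T)\, e_j \otimes f_j^* \qquad\text{and}\qquad S = \sum_{k=1}^n \sigma_k(S)\, g_k \otimes h_k^* \]
where $e, f, g, h$ are orthonormal sets and the singular values are written in decreasing order. Then
\[ \operatorname{tr}(TS) = \sum_{j=1}^\infty \sum_{k=1}^n \sigma_j(T)\, \sigma_k(S)\, \langle h_k, e_j \rangle \langle f_j, g_k \rangle. \]
Now we claim that
\[ \sum_{j=1}^\infty |\langle h_k, e_j \rangle|^2 \le 1 \quad\text{for all } k = 1, \ldots, n \]
since $h_k$ is a unit vector and $(e_j)$ is an orthonormal set. Similarly
\[ \sum_{k=1}^n |\langle h_k, e_j \rangle|^2 \le 1 \quad\text{for all } j \in \mathbb{N} \]
since $e_j$ is a unit vector and $(h_k)$ is an orthonormal set. We can make similar
estimates on $|\langle f_j, g_k \rangle|^2$. Thus, putting $a_{j,k} = |\langle h_k, e_j \rangle \langle f_j, g_k \rangle|$ we have that (4.3)
and (4.4) hold by applications of the Cauchy–Schwarz inequality. Therefore
\[ |\operatorname{tr}(TS)| \le \sum_{k=1}^n \sigma_k(T)\, \sigma_k(S) \le \left( \sum_{k=1}^\infty \sigma_k(T)^p \right)^{\frac1p} \left( \sum_{k=1}^n \sigma_k(S)^{p'} \right)^{\frac1{p'}}. \]
This proves the result.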
PROPOSITION 71
Let $R$ be a contraction and $T \in C_p$. Then we have
\[ \|RT\|_{C_p},\ \|TR\|_{C_p} \le \|T\|_{C_p}. \]
Proof. We repeat the proof of Proposition 70. We work with $RT$ and obtain
\[ \operatorname{tr}(RTS) = \sum_{j=1}^\infty \sum_{k=1}^n \sigma_j(T)\, \sigma_k(S)\, \langle h_k, R e_j \rangle \langle f_j, g_k \rangle \]
and once again
\[ \sum_{k=1}^n |\langle h_k, R e_j \rangle|^2 \le 1 \quad\text{for all } j \in \mathbb{N} \]
since $R e_j$ is a vector of norm $\le 1$ and $(h_k)$ is an orthonormal set. Similarly
\[ \sum_{j=1}^\infty |\langle h_k, R e_j \rangle|^2 = \sum_{j=1}^\infty |\langle R^* h_k, e_j \rangle|^2 \le 1 \quad\text{for all } k = 1, \ldots, n \]
since $R^* h_k$ is a vector of norm $\le 1$ and $(e_j)$ is an orthonormal set. Taking sups
over suitable $S$, we get the result. The proof for $TR$ is similar.
E XERCISE
• Show that kT kC1 = sup | tr(T S)| as S runs over finite rank contractions.
• Deduce that kT RkC1 ≤ kT kCp kRkCp0 for 1 ≤ p < ∞.
2
5
Interpolation
There are various methods of interpolation. The most prevalent are the complex
method, Marcinkiewicz interpolation and the real method of Lions and Peetre. We
start with the idea behind the complex method.
L EMMA 72 (T HREE L INES L EMMA )
Let $\varphi$ be a bounded continuous function
in $0 \le \Re z \le 1$, analytic in $0 < \Re z < 1$. Suppose that $|\varphi(z)| \le M_0$ for $\Re z = 0$
and $|\varphi(z)| \le M_1$ for $\Re z = 1$. Then
\[ |\varphi(z)| \le M_0^{1-t} M_1^{t} \quad\text{for } \Re z = t \text{ and } 0 \le t \le 1. \]
Proof. Let $\varepsilon > 0$ and set $\varphi_\varepsilon(z) = M_0^{z-1} M_1^{-z} \varphi(z) \exp(\varepsilon(z^2 - 1))$. Then
$|\varphi_\varepsilon(z)| \le 1$ on the boundary of the strip $0 \le \Re z \le 1$ and it vanishes at infinity.
Therefore by applying the maximum modulus principle we find that $|\varphi_\varepsilon(z)| \le 1$
on the strip $0 \le \Re z \le 1$. The result follows by letting $\varepsilon$ tend to zero.
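As an illustration of the lemma (not taken from the notes), consider the test function $\varphi(z) = 1/(z+1)$, which is bounded and analytic on the strip with $M_0 = 1$ and $M_1 = \tfrac12$. The Python sketch below samples each vertical line $\Re z = t$ on a finite grid, which is an assumption of the example, and compares with the interpolated bound.

```python
def phi(z):
    """Test function, bounded and analytic on the strip 0 <= Re z <= 1."""
    return 1.0 / (z + 1.0)

M0, M1 = 1.0, 0.5          # sup of |phi| on Re z = 0 and on Re z = 1
results = []
for i in range(11):
    t = i / 10
    bound = M0 ** (1 - t) * M1 ** t
    # sample the line Re z = t for |Im z| <= 50
    sup_on_line = max(abs(phi(complex(t, y / 10))) for y in range(-500, 501))
    results.append((t, sup_on_line, bound))

ok = all(sup_on_line <= bound + 1e-12 for _, sup_on_line, bound in results)
print(ok)
```

For this $\varphi$ the supremum on the line $\Re z = t$ is $1/(1+t)$, attained on the real axis, and the lemma asserts $1/(1+t) \le 2^{-t}$ for $0 \le t \le 1$.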
Note that the maximum modulus result for the strip is proved by means of
conformally mapping the strip to say a disk. It is important that the resulting
function should be continuous on the boundary of the disk. This is assured in our
case since the function $\varphi_\varepsilon$ is tending to zero at infinity.
L EMMA 73
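Let $f$ be a function in $L^p$ of norm 1 where $p$ is strictly between $p_0$
and $p_1$. Define $t \in\ ]0, 1[$ by $p^{-1} = (1 - t) p_0^{-1} + t p_1^{-1}$ and let $\alpha = p(p_0 - p_1)/(p_0 p_1)$.
Then set
\[ f_z = f\, |f|^{\alpha(z - t)}. \]
Then $f_z$ is a function in $L^{p_0}$ of norm 1 for $\Re z = 0$ and $f_z$ is a function in $L^{p_1}$ of
norm 1 for $\Re z = 1$. Indeed $|f_z| = |f|^{1 + \alpha(\Re z - t)}$, while $1 - \alpha t = p/p_0$ and $1 + \alpha(1 - t) = p/p_1$.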
Note that in some sense, the function z 7→ fz is analytic.
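The normalisation in Lemma 73 can be checked on a toy measure space. The Python sketch below is an illustration under stated assumptions: the measure space is a five-point set with counting measure, the exponent is taken as $\alpha(z - t)$ with $t$ as in the lemma, and the helper names `lp_norm` and `f_z` are hypothetical.

```python
def lp_norm(v, p):
    """l^p norm of a finite sequence (counting measure)."""
    return sum(abs(x) ** p for x in v) ** (1.0 / p)

def f_z(f, z, alpha, t):
    """Entrywise f |f|^(alpha (z - t)), with the convention that 0 stays 0."""
    return [x * abs(x) ** (alpha * (z - t)) if x != 0 else 0.0 for x in f]

p0, p1, p = 2.0, 1.0, 1.5
t = (1 / p - 1 / p0) / (1 / p1 - 1 / p0)    # solves 1/p = (1 - t)/p0 + t/p1
alpha = p * (p0 - p1) / (p0 * p1)

raw = [3.0, -1.0, 0.5 + 2.0j, 0.0, -0.25j]
scale = lp_norm(raw, p)
f = [x / scale for x in raw]                # now the l^p norm of f is 1

norms_left = [lp_norm(f_z(f, complex(0.0, y), alpha, t), p0) for y in (-2.0, 0.0, 3.0)]
norms_right = [lp_norm(f_z(f, complex(1.0, y), alpha, t), p1) for y in (-2.0, 0.0, 3.0)]
print(norms_left, norms_right)
```

All six norms come out equal to 1 up to rounding, and `f_z(f, t, alpha, t)` returns `f` itself, which is the point of the construction: $f$ sits at $z = t$ inside an analytic family with controlled norms on the two boundary lines.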
We illustrate the complex method with an example.
THEOREM 74 (THE HAUSDORFF–YOUNG THEOREM)
If $f \in L^p(\mathbb{R})$ for $1 \le p \le 2$, then $\hat f \in L^{p'}(\hat{\mathbb{R}})$.
Proof. Of course, the theorem is trivial for $p = 1$ and is a consequence of the
Plancherel theorem for $p = 2$. The idea is to deduce it for values of $p$ in between 1
and 2. For $f \in L^p(\mathbb{R})$ of unit norm and $g \in L^p(\hat{\mathbb{R}})$ of unit norm, it will be enough
to establish that
\[ \left| \int \hat f(u)\, g(u)\, du \right| \le 1 \tag{5.1} \]
by duality. In fact, it will be enough to handle the special case in which $f$ and $g$
are step functions. Now use Lemma 73 (with $p_0 = 2$ and $p_1 = 1$) to build $f_z$ and $g_z$ appropriately. Then we
will get for $\varphi(z) = \int \hat f_z(u)\, g_z(u)\, du$
\[ |\varphi(z)| = \left| \int \hat f_z(u)\, g_z(u)\, du \right| \le \|\hat f_z\|_2 \|g_z\|_2 = \|f_z\|_2 \|g_z\|_2 \le 1 \]
for $\Re z = 0$ and
\[ |\varphi(z)| = \left| \int \hat f_z(u)\, g_z(u)\, du \right| \le \|\hat f_z\|_\infty \|g_z\|_1 \le \|f_z\|_1 \|g_z\|_1 \le 1 \]
for $\Re z = 1$. Taking $z = t$ ($t$ as defined in Lemma 73, so that $f_t = f$ and $g_t = g$), the three lines lemma gives the
required conclusion (5.1).
Here is another application of interpolation.
LEMMA 75
Let $K$ be a kernel on a measure space $(X, \mathcal{F}, \mu)$ and suppose that
• $\operatorname{ess\,sup}_x \int |K(x, y)|\, d\mu(y) \le 1$,
• $\operatorname{ess\,sup}_y \int |K(x, y)|\, d\mu(x) \le 1$.
Then the operator $T$ defined by $T f(x) = \int K(x, y) f(y)\, d\mu(y)$ is a contraction on
$L^p$ for $1 \le p \le \infty$.
Proof. The hypotheses lead to the conclusion in the cases p = 1 and p = ∞.
Interpolation does the rest.
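The simplest instance of Lemma 75 is a matrix acting on a finite set with counting measure, where the two hypotheses say that all absolute row sums and column sums are at most 1. The following Python sketch checks the contraction property for a few exponents; the matrix, the test vector and the helper names are illustrative assumptions.

```python
INF = float("inf")

def lp_norm(v, p):
    """l^p norm of a finite sequence, including p = infinity."""
    if p == INF:
        return max(abs(x) for x in v)
    return sum(abs(x) ** p for x in v) ** (1.0 / p)

def apply(K, f):
    """(T f)(i) = sum_j K[i][j] f[j], the discrete analogue of the integral operator."""
    return [sum(K[i][j] * f[j] for j in range(len(f))) for i in range(len(K))]

# Every absolute row sum and every absolute column sum equals 1.
K = [[0.5, -0.3, 0.2],
     [0.1, 0.4, -0.5],
     [-0.4, 0.3, 0.3]]
f = [1.0, -2.0, 0.5]

Tf = apply(K, f)
ratios = {p: lp_norm(Tf, p) / lp_norm(f, p) for p in (1.0, 1.5, 2.0, 4.0, INF)}
print(ratios)
```

Each ratio is at most 1. The endpoint cases $p = 1$ and $p = \infty$ use only the column sums and the row sums respectively; the intermediate exponents are where interpolation is needed.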
This brings to mind two further things neither of which is connected to interpolation. The first is Gerschgorin's theorem.
THEOREM 76 (GERSCHGORIN'S THEOREM)
Let $A = (a_{jk})$ be an $n \times n$
matrix. Then the eigenvalues of $A$ lie in the union $\bigcup_{j=1}^n D_j$ where $D_j$ is the
closed disc (in the complex plane) with centre $a_{jj}$ and radius $\sum_{k \ne j} |a_{j,k}|$.
The proof is standard linear algebra. A corollary is
COROLLARY 77
Let $A = (a_{jk})$ be an $n \times n$ matrix. Then the spectral radius
of $A$ is bounded above by $\max_{1 \le j \le n} \sum_{k=1}^n |a_{j,k}|$.
In comparison with Lemma 75, only one of the conditions occurs in the hypotheses, but the spectral radius is bounded rather than an operator norm.
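Gerschgorin's theorem and its corollary are easy to test on a $2 \times 2$ complex matrix, where the eigenvalues come from the quadratic formula. This Python sketch is an illustration only; the matrix and the helper names `eig2` and `discs` are assumptions.

```python
import cmath

def eig2(A):
    """Eigenvalues of a 2x2 complex matrix via its characteristic polynomial."""
    tr = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    root = cmath.sqrt(tr * tr - 4 * det)
    return (tr + root) / 2, (tr - root) / 2

def discs(A):
    """Gerschgorin discs as (centre, radius) pairs, one per row."""
    n = len(A)
    return [(A[j][j], sum(abs(A[j][k]) for k in range(n) if k != j)) for j in range(n)]

A = [[4 + 1j, 1.0], [0.5j, -2.0]]
eigs = eig2(A)
ds = discs(A)

# Theorem 76: every eigenvalue lies in at least one disc.
in_union = [any(abs(lam - c) <= r + 1e-12 for c, r in ds) for lam in eigs]
# Corollary 77: spectral radius versus the largest absolute row sum.
row_bound = max(sum(abs(v) for v in row) for row in A)
spectral_radius = max(abs(lam) for lam in eigs)
print(in_union, spectral_radius, row_bound)
```

The row sum bound only controls the spectral radius; to control the operator norm on $\ell^2$ one needs the column sums as well, as in Lemma 75.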
The second thing that is brought to mind is Cotlar’s lemma, since the hypotheses are a little reminiscent of those in Lemma 75.
LEMMA 78 (COTLAR'S LEMMA)
Let $T_j$ be continuous operators from a
Hilbert space $H$ to a Hilbert space $K$ for $j = 1, 2, \ldots, n$. Suppose that
• $\max_i \sum_{j=1}^n \|T_i T_j^*\|^{\frac12} \le M$,
• $\max_i \sum_{j=1}^n \|T_i^* T_j\|^{\frac12} \le M$.
Then $\bigl\| \sum_{j=1}^n T_j \bigr\| \le M$.
For the proof, see https://terrytao.wordpress.com/2011/05/25/the-cotlar-steinlemma/ There are ways of extending this result to infinite sums and also to integrals.
E. M. Stein devised a nice trick to use with complex interpolation. Basically,
one builds the desired operator into a complex analytic family. Consider the following question. For $\sigma$ the uniform measure on the unit sphere in $\mathbb{R}^n$ do we have
the convolution estimate
\[ \|\sigma * f\|_{n+1} \le C \|f\|_{\frac{n+1}{n}}\ ? \tag{5.2} \]
To establish this, one embeds $\sigma$ into an analytic family of distributions (OK we
still have to talk about distributions). The definition is
\[ d\sigma_z(x) = \frac{1}{\Gamma(z)} (1 - |x|^2)^{-1+z}\, \mathbb{1}_D(x)\, dx \]
where $D$ is the open unit ball $D = \{x \in \mathbb{R}^n;\ |x| < 1\}$. For $\Re z > 0$ this is a
perfectly good measure. We can compute its Fourier transform as
\[ \hat\sigma_z(u) = 2^z\, |u|^{-\frac{n-2}{2} - z}\, J_{\frac{n-2}{2} + z}(|u|) \]
where $J_\alpha$ denotes a Bessel function. Since $\hat\sigma_z$ has at worst polynomial growth
at infinity, we can consider $\sigma_z$ as a distribution. Now putting $z = 0$, we find
that $\hat\sigma_0(u) = |u|^{-\frac{n-2}{2}} J_{\frac{n-2}{2}}(|u|)$ which happens to be the Fourier transform of $\sigma$.
Thus $\sigma$ is embedded in this analytic family. Now consider the case $\Re z = 1$, then
$\frac{1}{\Gamma(z)} (1 - |x|^2)^{-1+z}\, \mathbb{1}_D(x)$ is a bounded function. Unfortunately, the $L^\infty$ bound may
grow as $\Im z$ goes off to infinity, but not too badly. This depends on lower bounds
for $|\Gamma(z)|$. When $\Re z = -\frac{n-1}{2}$, $\hat\sigma_z$ is a bounded function since the Bessel functions
decay like $|u|^{-\frac12}$ at infinity. Again, the bounds may go off to infinity with $|\Im z|$ and
this depends on precise estimates for the decay of Bessel functions. The estimates
one obtains are
\[ |\langle g_z, \sigma_z * f_z \rangle| \le C_z \|f_z\|_1 \|g_z\|_1 \]
for $\Re z = 1$ and
\[ |\langle g_z, \sigma_z * f_z \rangle| = |\langle \hat g_z, \hat\sigma_z \hat f_z \rangle| \le C_z \|f_z\|_2 \|g_z\|_2 \]
for $\Re z = -\frac{n-1}{2}$. There are a number of issues here. One is the analyticity of
$\varphi(z) = \langle g_z, \sigma_z * f_z \rangle$. Another is the growth of the constants $C_z$. One needs to
revisit the three lines lemma with a proof in which $\varepsilon$ is constant (in fact large) and
the growth along the sides of the strip is controlled by $e^{|\Im z|^2}$. Suffice it to say that
these details can all be worked out. Applying the interpolation idea, one comes
out with (5.2).
For almost the same problem, check out Terry Tao’s notes at
https://terrytao.wordpress.com/2011/05/03/steins-interpolation-theorem/
5.1
Lorentz Spaces
Lorentz spaces developed out of the Marcinkiewicz Interpolation Theorem. A typical application of the Marcinkiewicz Interpolation Theorem arises in relation to the
Hardy–Littlewood maximal theorem. We'll take the centred version on the line
\[ M f(x) = \sup_{h > 0} \frac{1}{2h} \int_{x-h}^{x+h} |f(t)|\, dt. \]
This satisfies an inequality of the form
\[ \operatorname{meas}\{x;\ M f(x) > t\} \le C \|f\|_1\, t^{-1} \]
and also $\|M f\|_\infty \le \|f\|_\infty$. A function $f$ that satisfies the $L^p$ Tchebychev inequality
\[ \operatorname{meas}\{x;\ |f(x)| > t\} \le C t^{-p} \]
is said to be in weak $L^p$ and this is coded as the Lorentz space $L_{p,\infty}$. There is also
a strong $L^p$ denoted $L_{p,1}$ and the usual $L^p$ space is the Lorentz space $L_{p,p}$.
The Marcinkiewicz Interpolation Theorem works for sublinear operators like the
Hardy–Littlewood maximal operator. Sublinear operators are positive homogeneous, i.e. $T(tf) = |t|\, T(f)$, and subadditive, i.e. $T(f + g) \le T(f) + T(g)$. The
complex method of interpolation works only for linear operators unless special
techniques are used.
For $f$ a nice function on a measure space we define $d_f(s) = \operatorname{meas}(\{|f| > s\})$,
the distribution function of $f$, and then $f^*(t) = \inf\{s > 0;\ d_f(s) \le t\}$, the
equimeasurable decreasing rearrangement. The function $f^*$ is positive decreasing
right-continuous and has the same distribution function as $f$. The map $f \mapsto f^*$
is positive homogeneous, but not subadditive. An example of this is furnished by
$f = \mathbb{1}_{[0,1[}$, $g = \mathbb{1}_{[1,2[}$. Then $f + g = \mathbb{1}_{[0,2[} = (f + g)^*$. But $f^* = g^* = f$ and
$f^* + g^* = 2 \cdot \mathbb{1}_{[0,1[}$.
If one wishes sublinearity, then one averages the decreasing rearrangement
with the Hardy averaging operator
\[ A f(x) = \frac{1}{x} \int_0^x f(t)\, dt \]
defined on functions on $[0, \infty[$. It is well-known that $\|A f\|_p \le \frac{p}{p-1} \|f\|_p$ for
$1 < p \le \infty$. The sublinearity of the mapping $f \mapsto A f^*$ follows from
\[ x\, A f^*(x) = \int_0^x f^*(t)\, dt = \sup_{\mu(A) \le x} \int_A |f(s)|\, d\mu(s) \tag{5.3} \]
where the sup is taken over all measurable subsets $A$ of measure $\le x$. Note that
$f^* \le A f^*$.
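A discrete analogue makes both phenomena visible. In the Python sketch below (an illustration, not from the notes) the measure space is a two-point set with counting measure, an assumption that mirrors the example $f = \mathbb{1}_{[0,1[}$, $g = \mathbb{1}_{[1,2[}$; the rearrangement is then just a sort, and the partial sums play the role of $x\,Af^*(x)$ in (5.3).

```python
def rearrange(f):
    """Decreasing rearrangement of |f| for counting measure on a finite set."""
    return sorted((abs(x) for x in f), reverse=True)

def partial_sums(v):
    """Running sums v[0], v[0] + v[1], ...: the analogue of x * A f*(x)."""
    out, total = [], 0.0
    for x in v:
        total += x
        out.append(total)
    return out

f = [1.0, 0.0]
g = [0.0, 1.0]
h = [a + b for a, b in zip(f, g)]

star_h = rearrange(h)                                            # (f + g)* = [1, 1]
star_sum = [a + b for a, b in zip(rearrange(f), rearrange(g))]   # f* + g*  = [2, 0]

avg_h = partial_sums(star_h)                                     # [1, 2]
avg_sum = [a + b for a, b in
           zip(partial_sums(rearrange(f)), partial_sums(rearrange(g)))]  # [2, 2]
print(star_h, star_sum, avg_h, avg_sum)
```

At the second point $(f+g)^* = 1 > 0 = f^* + g^*$, so the rearrangement itself is not subadditive, while the partial sums of $(f+g)^*$ are dominated termwise by the sums of the partial sums of $f^*$ and $g^*$.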
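Another key inequality is
\[ \int |f g|\, d\mu \le \int_0^\infty f^*(t)\, g^*(t)\, dt. \]
The Lorentz spaces are defined by means of the quasinorms
\[ \|f\|_{p,q} = \left( \int_0^\infty \left( t^{\frac1p} f^*(t) \right)^q \frac{dt}{t} \right)^{\frac1q} \]
for finite $q$ and
\[ \|f\|_{p,\infty} = \sup_{t > 0}\, t^{\frac1p} f^*(t) \]
for $q$ infinite.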
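Next we give a proof of Hardy's inequality.
Let $g(y) = y^{\frac1p} f(y)$ and $\varphi(x) = x^{-\frac{1}{p'}}\, \mathbb{1}_{[1,\infty[}(x)$. Then $\|g\|_{L^p(\frac{dt}{t})} = \|f\|_p$ and
$\|\varphi\|_{L^1(\frac{dt}{t})} = p'$. Convolving on the multiplicative group $]0, \infty[$ we get
\[ h(x) = \int_0^\infty f(y)\, y^{\frac1p}\, (x y^{-1})^{-\frac{1}{p'}}\, \mathbb{1}_{[1,\infty[}(x y^{-1})\, \frac{dy}{y} = x^{-\frac{1}{p'}} \int_0^x f(y)\, dy \]
and $\|h\|_{L^p(\frac{dt}{t})} \le p' \|g\|_{L^p(\frac{dt}{t})}$. But
\[ \int |A f(x)|^p\, dx = \int x^{-\frac{p}{p'}} \left| \int_0^x f(y)\, dy \right|^p \frac{dx}{x} = \int |h(x)|^p\, \frac{dx}{x} \le (p')^p \|f\|_p^p. \]
The adjoint of Hardy's inequality follows with a similar proof.
\[ A^* g(x) = \int_x^\infty \frac{g(y)}{y}\, dy, \qquad \|A^* g\|_p \le p \|g\|_p, \qquad 1 \le p < \infty. \]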
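Next we need a variant of this inequality also due to Hardy.
LEMMA 79
For $b > 0$ and $1 \le p < \infty$
\[ \left( \int_0^\infty \left( \int_x^\infty |f(t)|\, dt \right)^p x^{b-1}\, dx \right)^{\frac1p} \le \frac{p}{b} \left( \int_0^\infty |f(t)|^p\, t^{p+b-1}\, dt \right)^{\frac1p}. \]
Proof. Let $g(y) = f(y)\, y^{1 + \frac{b}{p}}$ and $\varphi(x) = x^{\frac{b}{p}}\, \mathbb{1}_{]0,1]}(x)$. Then $\|g\|_{L^p(\frac{dt}{t})} = \left( \int_0^\infty |f(t)|^p\, t^{p+b-1}\, dt \right)^{\frac1p}$ and $\|\varphi\|_{L^1(\frac{dt}{t})} = \frac{p}{b}$ since $b > 0$ and $p < \infty$. As before, let
$h$ be the convolution of $g$ and $\varphi$ on $]0, \infty[$. We get $\|h\|_{L^p(\frac{dt}{t})} \le \frac{p}{b} \|g\|_{L^p(\frac{dt}{t})}$ and
\[ h(x) = \int_{y=x}^\infty f(y)\, x^{\frac{b}{p}}\, dy. \]
The result follows.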
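Let $1 \le p < \infty$ and $1 \le q < \infty$. Then, since $f^*$ is decreasing,
\begin{align*}
t^{\frac1p} f^*(t) &= \left( \frac{q}{p} \int_0^t \left( s^{\frac1p} f^*(t) \right)^q \frac{ds}{s} \right)^{\frac1q} \\
&\le \left( \frac{q}{p} \int_0^t \left( s^{\frac1p} f^*(s) \right)^q \frac{ds}{s} \right)^{\frac1q} \\
&\le \left( \frac{q}{p} \right)^{\frac1q} \|f\|_{p,q}.
\end{align*}
From this, it follows easily that $L_{p,q}$ is a subspace of $L_{p,r}$ for $1 \le q < r \le \infty$.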
E XERCISE Show that simple functions are dense in Lp,q for 1 ≤ p < ∞,
1 ≤ q < ∞.
2
5.2
Lorentz space duality
EXERCISE If $1 \le p < \infty$, $f \in L_{p,1}$ and $g \in L_{p',\infty}$ we have $\left| \int f g\, d\mu \right| \le \|f\|_{p,1} \|g\|_{p',\infty}$.
2
THEOREM 80
For σ-finite measure spaces and $1 \le p < \infty$, we have $L_{p,1}^* = L_{p',\infty}$.
Proof. If $p = 1$ this is just $(L^1)^* = L^\infty$. So we assume $p > 1$. The exercise gives
half the result. As in the proof of $L^p$ duality, it suffices to work on a space of finite
measure. Let $u$ be a continuous linear form on $L_{p,1}$. Then we define a measure
$\nu$ by $\nu(A) = u(\mathbb{1}_A)$. One shows that $\nu \ll \mu$ and hence by the Radon–Nikodym
theorem $u(\mathbb{1}_A) = \int \mathbb{1}_A\, g\, d\mu$ for suitable $g \in L^1$. It follows that $u(f) = \int f g\, d\mu$
for simple functions $f$ and hence by continuity for all $f \in L^\infty$. This is similar to
the proof of $L^p$ duality.
Take $f = \operatorname{sgn}(g)\, \mathbb{1}_{|g| > s}$. Then $f \in L^\infty$ and calculations give
\[ s\, \mu(\{|g| > s\}) \le \int f g\, d\mu \le \|u\| \|f\|_{p,1} = p \|u\| \left( \mu(\{|g| > s\}) \right)^{\frac1p} \]
and the result follows since we know that $\mu(\{|g| > s\})$ is finite.
EXERCISE If $1 < p < \infty$, $1 < q < \infty$, $f \in L_{p,q}$ and $g \in L_{p',q'}$ we have
$\left| \int f g\, d\mu \right| \le \|f\|_{p,q} \|g\|_{p',q'}$.
2
THEOREM 81
For σ-finite nonatomic measure spaces and $1 < p < \infty$ and
$1 < q < \infty$, we have $L_{p,q}^* = L_{p',q'}$.
Proof. We follow the approach in Theorem 80 restricting to a set of finite measure. Again we have a measurable $g \in L^1$ such that $u(f) = \int f g\, d\mu$ for functions
$f$ in $L^\infty$. Now go back and start again choosing the sets of finite measure so that
$|g|$ is bounded on them. This subterfuge ensures that $\|g\|_{p',q'}$ is finite.
Since the measure space is nonatomic, we can assume without loss of generality that the action takes place on an interval $[0, a[$ with $0 < a \le \infty$ and with
Lebesgue measure. In this case, $g^*$ is actually a measure preserving permutation
of $|g|$. Hence, we may assume without loss that $g^* = |g|$. Now, let
\[ f(t) = \operatorname{sgn}(g(t)) \int_{t/2}^\infty s^{\frac{q'}{p'} - 1}\, g^*(s)^{q' - 1}\, \frac{ds}{s} \]
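and note that since $|f(t)|$ is decreasing in $t$ we actually have $f^*(t) = |f(t)|$. Let
us rewrite Lemma 79 in the form
\[ \int_0^\infty t^{\frac{q}{p}} \left( \int_{t/2}^\infty h(s)\, ds \right)^q \frac{dt}{t} \le C \int_0^\infty h(s)^q\, s^{q + \frac{q}{p}}\, \frac{ds}{s} \tag{5.4} \]
for nonnegative $h$. Then we obtain using (5.4) with $h(s) = s^{\frac{q'}{p'} - 2} (g^*(s))^{q' - 1}$ that
\begin{align*}
\|f\|_{p,q}^q &= \int_0^\infty t^{\frac{q}{p}} \left( \int_{t/2}^\infty s^{\frac{q'}{p'} - 1}\, g^*(s)^{q' - 1}\, \frac{ds}{s} \right)^q \frac{dt}{t} \\
&\le C(p, q) \int_0^\infty \left( s^{\frac{1}{p'}} g^*(s) \right)^{q'} \frac{ds}{s} \\
&= C(p, q)\, \|g\|_{p',q'}^{q'}
\end{align*}
noting that $q(q' - 1) = q'$ and that
\[ \left( \frac{q'}{p'} - 2 \right) q + q + \frac{q}{p} = \frac{q q'}{p'} - q + \frac{q}{p} = \frac{q + q'}{p'} - \frac{q}{p'} = \frac{q'}{p'}. \]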
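Thus
\[ \int_0^\infty f^*(t)\, g^*(t)\, dt = u(f) \le \|u\| \|f\|_{p,q} \le C(p, q)^{\frac1q}\, \|u\|\, \|g\|_{p',q'}^{\frac{q'}{q}}. \]
On the other hand
\begin{align*}
\int_0^\infty f^*(t)\, g^*(t)\, dt &\ge \int_0^\infty \int_{t/2}^t s^{\frac{q'}{p'} - 1}\, g^*(s)^{q' - 1}\, \frac{ds}{s}\, g^*(t)\, dt \\
&\ge \int_0^\infty g^*(t)^{q'} \int_{t/2}^t s^{\frac{q'}{p'} - 1}\, \frac{ds}{s}\, dt \\
&= c(p, q)\, \|g\|_{p',q'}^{q'}.
\end{align*}
Since $q' - q'/q = 1$ and $\|g\|_{p',q'}$ is finite, comparing the two displays gives $\|g\|_{p',q'} \le C \|u\|$. Hence the result.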
Next, we look at an atomic description of Lorentz spaces that we learnt from
Terence Tao's webpages.
See http://www.math.ucla.edu/ tao/preprints/Expository/interpolation.dvi.
Consider functions $f$ on a measure space that admit a decomposition
\[ f = \sum_{k \in \mathbb{Z}} c_k\, 2^{-\frac{k}{p}} f_k \tag{5.5} \]
where $c \in \ell^q(\mathbb{Z})$, $|f_k| \le 1$ and $f_k$ is carried on a set of measure at most $2^k$.
We claim that for $1 \le p < \infty$ and $1 \le q \le \infty$ every function in $L_{p,q}$ admits
such a decomposition. To see this, let $f \in L_{p,q}$, then it will suffice to decompose
$f^*$. We set
\[ f_k = \frac{1}{f^*(2^k)}\, f^*\, \mathbb{1}_{[2^k, 2^{k+1}[} \qquad\text{and}\qquad c_k = 2^{\frac{k}{p}} f^*(2^k). \]
Then
\[ \sum_{k \in \mathbb{Z}} c_k^q = \sum_{k \in \mathbb{Z}} \left( 2^{\frac{k}{p}} f^*(2^k) \right)^q \sim \|f\|_{p,q}^q \]
by the same kind of calculations involved in the integral test for convergence.
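Also, for $1 < p < \infty$, a function given by (5.5) lies in $L_{p,q}$. To see this, let $f$
have such a decomposition. It will suffice to show that $f^* \in L_{p,q}$. Since $f^* \le A f^*$
pointwise, it will suffice to show that $A f^* \in L_{p,q}$. But we have
\[ A f^* \le \sum_{k \in \mathbb{Z}} |c_k|\, 2^{-\frac{k}{p}} A f_k^* \]
which follows from (5.3). We have an explicit bound for $A f_k^*$
\[ A f_k^*(t) \le \begin{cases} 1 & \text{if } 0 \le t \le 2^k, \\ 2^k t^{-1} & \text{if } 2^k \le t. \end{cases} \]
For $2^\ell \le t < 2^{\ell+1}$, we have
\[ t^{\frac1p} A f^*(t) \le 2^{\frac1p} \sum_{k > \ell} |c_k|\, 2^{-\frac{k - \ell}{p}} + \sum_{k \le \ell} |c_k|\, 2^{-\frac{\ell - k}{p'}} \le \gamma_\ell \]
where $\gamma$ is in $\ell^q(\mathbb{Z})$ being the convolution over $\mathbb{Z}$ of a function in $\ell^q$ and a function
in $\ell^1$. The claim follows.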
T HEOREM 82
Let 1 ≤ p0 < p < p1 ≤ ∞ and let T be a linear restricted
weak type operator of types (p0 , p0 ) and (p1 , p1 ). Then T maps Lp,r to Lp,r for
1 ≤ r ≤ ∞ and in particular maps Lp to Lp .
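Proof. The hypotheses mean that $\|T f\|_{p_j,\infty} \le C_j \|f\|_{p_j,1}$ for $j = 0, 1$. We work
with the bilinear form $\Lambda(f, g) = \int g\, T f\, d\mu$. Let
\[ f = \sum_{k \in \mathbb{Z}} c_k\, 2^{-\frac{k}{p}} f_k \qquad\text{and}\qquad g = \sum_{\ell \in \mathbb{Z}} d_\ell\, 2^{-\frac{\ell}{p'}} g_\ell \]
with $c \in \ell^r$, $d \in \ell^{r'}$ as above. We have
\[ \Lambda(f, g) = \sum_{k,\ell} c_k d_\ell\, 2^{-\frac{k}{p} - \frac{\ell}{p'}}\, \Lambda(f_k, g_\ell). \]
We have a choice of estimates
\[ |\Lambda(f_k, g_\ell)| \le C \begin{cases} 2^{\frac{k}{p_0} + \frac{\ell}{p_0'}} \\[2pt] 2^{\frac{k}{p_1} + \frac{\ell}{p_1'}} \end{cases} \]
from which we can choose the smaller. Since $-\frac{k}{p} - \frac{\ell}{p'} + \frac{k}{p_j} + \frac{\ell}{p_j'} = (k - \ell)\bigl( \frac{1}{p_j} - \frac{1}{p} \bigr)$, this leads to
\[ |\Lambda(f, g)| \le C \sum_{k,\ell} |c_k|\, |d_\ell|\, 2^{-\alpha |k - \ell|} \]
for some suitable $\alpha > 0$. The result follows as above.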
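The exponent bookkeeping in this proof can be verified mechanically. The Python sketch below is an illustration only: the choice $p_0 = 1$, $p = 2$, $p_1 = 4$ and the helper names are assumptions, and $\alpha$ is taken to be $\min(1/p_0 - 1/p,\ 1/p - 1/p_1)$, one admissible value of the "suitable $\alpha$".

```python
INF = float("inf")

def inv(q):
    """1/q with the convention 1/infinity = 0."""
    return 0.0 if q == INF else 1.0 / q

def conj_inv(q):
    """1/q' for the conjugate exponent, i.e. 1 - 1/q."""
    return 1.0 - inv(q)

def best_exponent(k, l, p, p0, p1):
    """Smaller of the two available exponents of 2 for the (k, l) term."""
    base = -(k * inv(p) + l * conj_inv(p))
    e0 = base + k * inv(p0) + l * conj_inv(p0)   # using the (p0, p0) estimate
    e1 = base + k * inv(p1) + l * conj_inv(p1)   # using the (p1, p1) estimate
    return min(e0, e1)

p0, p, p1 = 1.0, 2.0, 4.0
alpha = min(inv(p0) - inv(p), inv(p) - inv(p1))

# best_exponent(k, l) <= -alpha * |k - l| should hold for every pair (k, l).
worst = max(best_exponent(k, l, p, p0, p1) + alpha * abs(k - l)
            for k in range(-6, 7) for l in range(-6, 7))
print(alpha, worst)
```

Here `alpha` is 0.25 and `worst` is 0, attained on the diagonal $k = \ell$. The decay away from the diagonal is what makes the double sum a convolution of $|c|$ with an $\ell^1$ kernel, paired against $|d|$.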
We think that this theorem can easily be pushed to the case of sublinear operators and also to the off-diagonal type of interpolation.
6
Gelfand’s Theory of Commutative Banach Algebra
A commutative Banach algebra A is a Banach space together with a continuous
multiplication so that A becomes a linear commutative associative algebra. The
continuity of the multiplication amounts to the existence of a constant C such that
\[ \|xy\| \le C\|x\|\|y\|, \qquad \forall x, y \in A. \]
The algebra A is said to be unital if it has an identity element which we will denote $\mathbb{1}_A$. In a unital algebra, it may be false that $\|\mathbb{1}_A\| = 1$, but we can always renorm the algebra with an equivalent norm that has this property. For this we use the multiplier norm
\[ \|x\|_M = \sup_{\|y\| \le 1} \|xy\|. \]
In general this may fail to define an equivalent norm, but in this case it does because
\[ \|x\|_M = \sup_{\|y\| \le 1} \|xy\| \le \sup_{\|y\| \le 1} C\|x\|\|y\| \le C\|x\| \]
and
\[ \|x\| = \|x\mathbb{1}_A\| \le \|\mathbb{1}_A\| \|x\|_M. \]
Generally we will therefore work with a norm that has the property that $\|\mathbb{1}_A\| = 1$. The multiplier norm has an even more important property, namely that
\[ \|xy\|_M \le \|x\|_M \|y\|_M, \]
in other words we may always assume without loss of generality that C = 1 (at least if we are only interested in properties that are preserved under norm equivalence). From now on we are interested only in the case of unital commutative Banach algebras over C. The real case presents substantially more difficulties.
The spectrum of an element x ∈ A is a subset of the complex plane defined by
\[ \sigma(x) = \{\lambda;\ \lambda \in \mathbb{C},\ (\lambda\mathbb{1}_A - x)^{-1} \text{ fails to exist in } A\}. \]
The spectrum has the following properties:
1. λ ∈ σ(x) implies $|\lambda| \le \|x\|$.
2. σ(x) is closed.
3. σ(x) is nonempty.
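The proofs are easy. First, if $|\lambda| > \|x\|$, then we can construct
\[ (\mathbb{1}_A - \lambda^{-1}x)^{-1} = \sum_{n=0}^{\infty} \lambda^{-n} x^n, \]
the right hand side being a norm convergent sum. It follows then that $(\lambda\mathbb{1}_A - x)^{-1} = \lambda^{-1}(\mathbb{1}_A - \lambda^{-1}x)^{-1}$ exists.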
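As a numerical illustration of the geometric series argument, here is a minimal sketch. The algebra is an assumption made for the example: $\ell^1(\mathbb{Z}_4)$ with cyclic convolution, a finite-dimensional commutative Banach algebra whose identity is $\delta_0$. The helper names `conv`, `norm1` and `neumann_inverse` are hypothetical.

```python
def conv(a, b):
    """Cyclic convolution on Z_N: (a*b)[m] = sum_j a[j] * b[m - j]."""
    n = len(a)
    return [sum(a[j] * b[(m - j) % n] for j in range(n)) for m in range(n)]


def norm1(a):
    """The l^1 norm, which is submultiplicative for convolution."""
    return sum(abs(t) for t in a)


def neumann_inverse(x, lam, terms=200):
    """Partial sum of sum_{n>=0} lam^(-n-1) x^n; converges when |lam| > ||x||_1."""
    n = len(x)
    power = [1.0] + [0.0] * (n - 1)  # x^0 = delta_0, the identity element
    total = [0.0] * n
    coeff = 1.0 / lam
    for _ in range(terms):
        total = [t + coeff * p for t, p in zip(total, power)]
        power = conv(power, x)
        coeff /= lam
    return total


x = [0.5, -0.25, 0.0, 0.125]  # ||x||_1 = 0.875
lam = 2.0                     # |lam| > ||x||_1, so lam is outside the spectrum
delta = [1.0, 0.0, 0.0, 0.0]
inverse = neumann_inverse(x, lam)
lam_minus_x = [lam * d - t for d, t in zip(delta, x)]
# Distance between (lam*1 - x) * inverse and the identity, in the l^1 norm.
residual = norm1([p - d for p, d in zip(conv(lam_minus_x, inverse), delta)])
```

The residual should be at the level of rounding error, since the neglected tail of the series is of size about $(\|x\|_1/|\lambda|)^{200}$.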
The second assertion is similar. If $\mu \notin \sigma(x)$, then $(\mu\mathbb{1}_A - x)^{-1}$ exists. Now we consider λ very close to µ and observe
\[ \lambda\mathbb{1}_A - x = (\lambda - \mu)\mathbb{1}_A + (\mu\mathbb{1}_A - x) = \bigl(\mathbb{1}_A + (\lambda - \mu)(\mu\mathbb{1}_A - x)^{-1}\bigr)(\mu\mathbb{1}_A - x). \]
Provided $|\lambda - \mu|\,\|(\mu\mathbb{1}_A - x)^{-1}\| < 1$, it will be possible to construct $(\lambda\mathbb{1}_A - x)^{-1}$ with a geometric series argument. So, the complement of σ(x) is open and therefore σ(x) is closed.
For the third assertion, suppose the contrary. Then $(\lambda\mathbb{1}_A - x)^{-1}$ exists for all complex λ. Let u be a continuous linear functional on A and consider the complex-valued function
\[ \lambda \mapsto u((\lambda\mathbb{1}_A - x)^{-1}). \]
It is clear (actually by using the arguments that we have used in proving the first two assertions) that this is a holomorphic function in the whole complex plane (a so-called entire function) and also that it tends to zero at infinity since for $|\lambda| > \|x\|$
\[ \|(\lambda\mathbb{1}_A - x)^{-1}\| = \Bigl\| \sum_{n=0}^{\infty} \lambda^{-n-1} x^n \Bigr\| \le |\lambda|^{-1}(1 - |\lambda|^{-1}\|x\|)^{-1} = (|\lambda| - \|x\|)^{-1}. \]
It follows from the maximum modulus principle that such a function is identically zero. It then follows from the Hahn–Banach Theorem that $(\lambda\mathbb{1}_A - x)^{-1} = 0$ for all λ which is complete nonsense since inverses can never be zero.
Having dealt with the spectrum, we now turn to the ideal structure of A. An
ideal I is said to be proper if I ⊂ A. We assert that every proper ideal is contained
in a maximal proper ideal. This is proved using a Zorn’s Lemma argument. It is
enough to show that every chain of proper ideals has an upper bound under set
inclusion. Given a chain C of proper ideals, one simply takes
\[ B = \bigcup_{I \in C} I. \]
It is easy to see that B is an ideal. If it is not proper, then $\mathbb{1}_A \in B$. But then there exists I ∈ C such that $\mathbb{1}_A \in I$, contradicting the fact that I is proper. (As soon as $\mathbb{1}_A \in I$, then $x = x\mathbb{1}_A \in I$ for every x ∈ A.)
A similar argument shows that every maximal proper ideal is closed. If M is a maximal proper ideal, then it is clear that cl(M) is an ideal. So either M = cl(M) or cl(M) is not proper. In other words, either M is closed or M is dense. But the latter situation is not possible, since then we would be able to approximate $\mathbb{1}_A$ with elements of M. But any element of A sufficiently close to $\mathbb{1}_A$ is invertible (by the geometric series argument yet again) and so M would have to contain invertible elements and hence $\mathbb{1}_A$ itself, contradicting the fact that M is proper.
Now let M be a maximal proper ideal and consider Q = A/M . Then it is
routine to check that Q is a unital commutative Banach algebra in the quotient
norm. Also, from ring theory, it cannot contain any ideals other than the zero
ideal and Q itself. (Let π be the canonical projection π : A → Q and let J be a nontrivial ideal of Q; then $\pi^{-1}(J)$ is a proper ideal of A strictly bigger than M, a contradiction.) This implies in turn that every non-zero element of Q is invertible. We claim that $Q = \mathbb{C}\mathbb{1}_Q$. Indeed, let x ∈ Q be arbitrary and let λ ∈ σ(x). Then $\lambda\mathbb{1}_Q - x$ fails to be invertible and must therefore be zero. So $x = \lambda\mathbb{1}_Q$.
This means then that every maximal proper ideal has codimension 1 and is the kernel of a continuous linear form ϕ : A → C. We are free to normalize ϕ such that $\varphi(\mathbb{1}_A) = 1$. But now, let x, y ∈ A; then $x - \varphi(x)\mathbb{1}_A$ and $y - \varphi(y)\mathbb{1}_A$ are elements of M since they are clearly in the kernel of ϕ. But now $(x - \varphi(x)\mathbb{1}_A)y = xy - \varphi(x)y$ is in M and therefore also
\[ xy - \varphi(x)\varphi(y)\mathbb{1}_A = xy - \varphi(x)y + \varphi(x)(y - \varphi(y)\mathbb{1}_A). \]
It follows that this element is in the kernel of ϕ and hence
\[ \varphi(xy) = \varphi(x)\varphi(y). \]
Such a ϕ (with $\varphi(\mathbb{1}_A) = 1$) is called a multiplicative linear functional (mlf). Every maximal proper ideal is therefore the kernel of an mlf and conversely, it is obvious that the kernel of any mlf is a closed ideal of codimension one and hence a maximal proper ideal.
The next step in the saga is to define MA the space of all mlfs and to give it
a topology. This is the Gelfand topology and it is simply the relative topology inherited from the weak* (that is, $\sigma(A^*, A)$) topology. It turns out that $M_A$ is compact in this topology and it is also clearly Hausdorff.
To see this first of all we observe that any mlf has norm exactly one. Clearly $x - \varphi(x)\mathbb{1}_A$ is in the proper ideal ker(ϕ) and therefore not invertible. So $\varphi(x) \in \sigma(x)$ and hence $|\varphi(x)| \le \|x\|$. On the other hand $\varphi(\mathbb{1}_A) = 1$ and $\|\mathbb{1}_A\| = 1$. So $M_A$ can be specified as the subset of the unit ball of $A^*$ which satisfies the following closed conditions
\[ \varphi(xy) = \varphi(x)\varphi(y) \quad (x, y \in A), \qquad \varphi(\mathbb{1}_A) = 1, \]
each depending on only finitely many elements from A (at most three).
Since the unit ball of $A^*$ is compact for the $\sigma(A^*, A)$ topology (the Banach–Alaoglu theorem) and since $M_A$ is a $\sigma(A^*, A)$-closed subset of that ball, it follows that $M_A$ is itself $\sigma(A^*, A)$ compact.
We now write $\hat{x}(\varphi) = \varphi(x)$ and observe that $\hat{x}$ is now a continuous function on $M_A$. The mapping $x \mapsto \hat{x}$ which maps from A to $C(M_A)$ is called the Gelfand transform of A and is an algebra homomorphism. It can happen that the Gelfand transform has a non-trivial kernel. We can even characterize the kernel of the Gelfand transform. It consists of all elements x ∈ A such that σ(x) = {0} or, from power series considerations, such that
\[ \limsup_{n \to \infty} \|x^n\|^{1/n} = 0. \]
This will be proved later. It is also the Jacobson radical of A viewed as a ring.
Also it is rarely the case that the Gelfand transform is onto or that the uniform
norm of x̂ is equivalent to the norm of x. In many situations the space MA is easy
to understand, but there are also cases where its structure is totally mind boggling!
6.1
The non-unital case
We now come to the case that A is a complex commutative Banach algebra, but
it does not have a unit (identity) element. In that case, we simply adjoin an identity element and use the theory in the previous section. So, the new algebra has elements
\[ \tilde{A} = \{t\mathbb{1} + x;\ t \in \mathbb{C},\ x \in A\} \]
and we define
\[ (t\mathbb{1} + x) + (s\mathbb{1} + y) = (t + s)\mathbb{1} + (x + y), \]
\[ (t\mathbb{1} + x)(s\mathbb{1} + y) = ts\mathbb{1} + (ty + sx + xy). \]
For the norm on Ã we simply take $\|t\mathbb{1} + x\| = |t| + \|x\|$ and it is straightforward to verify that this is actually a norm. If multiplication is continuous on A, then it is also on Ã and then one may replace this norm with the multiplier norm to get an equivalent submultiplicative norm. It's important to extend the norm to Ã first. Taking the multiplier norm immediately does not work.
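The adjunction of an identity can be sketched in code. Everything concrete here is an assumption chosen for illustration: A is taken to be the polynomials $a_1 z + \cdots + a_4 z^4$ without constant term, truncated above $z^4$, with the $\ell^1$ norm of the coefficients (a commutative Banach algebra with no identity), and the names `mul_A`, `norm_A` and `Unitized` are hypothetical.

```python
from dataclasses import dataclass

N = 4  # elements of A are coefficient tuples (a_1, ..., a_N) of a_1 z + ... + a_N z^N


def mul_A(x, y):
    """Product in A: multiply the polynomials and discard powers above z^N."""
    out = [0.0] * N
    for i, a in enumerate(x, start=1):
        for j, b in enumerate(y, start=1):
            if i + j <= N:
                out[i + j - 1] += a * b
    return tuple(out)


def norm_A(x):
    """The l^1 norm of the coefficients; it is submultiplicative on A."""
    return sum(abs(a) for a in x)


@dataclass(frozen=True)
class Unitized:
    """An element t*1 + x of the algebra obtained by adjoining an identity to A."""
    t: complex
    x: tuple

    def __mul__(self, other):
        # (t*1 + x)(s*1 + y) = ts*1 + (ty + sx + xy)
        xy = mul_A(self.x, other.x)
        vec = tuple(self.t * b + other.t * a + c
                    for a, b, c in zip(self.x, other.x, xy))
        return Unitized(self.t * other.t, vec)

    def norm(self):
        # ||t*1 + x|| = |t| + ||x||
        return abs(self.t) + norm_A(self.x)


one = Unitized(1.0, (0.0,) * N)
a = Unitized(2.0, (1.0, -1.0, 0.5, 0.0))
b = Unitized(-1.0, (0.0, 3.0, 0.0, 1.0))
product = a * b
```

In this example the norm $|t| + \|x\|$ is already submultiplicative because the norm on A is, so no passage to the multiplier norm is needed.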
We now consider the ideals in A and we need to add an extra condition. Let
I be an ideal in A. Then a modular unit (or modular identity) for I is an element
u ∈ A such that x−ux ∈ I for all x ∈ A. When we form the quotient algebra A/I,
the image of u will be an identity element. So, we say that an ideal is modular, if it
possesses a modular unit and this is actually equivalent to A/I having an identity
element. We now have the following lemma.
LEMMA 83
Let I be a modular ideal in A. Then there exists an ideal J in Ã such that $J \not\subseteq A$ and $I = J \cap A$.
Conversely, if J is an ideal in Ã such that $J \not\subseteq A$, then $I = J \cap A$ is a modular ideal in A.
Proof. For the first assertion, let I be a modular ideal of A with modular unit u. Define $J = \{x;\ x \in \tilde{A},\ xu \in I\}$, clearly an ideal of Ã. Since u ∈ A, $u - u^2 \in I$, i.e. $(\mathbb{1} - u)u \in I$. So, $\mathbb{1} - u \in J$. But $\mathbb{1} - u \notin A$, so $J \not\subseteq A$. It remains to show that $I = J \cap A$ and clearly $I \subseteq J \cap A$. So, let $x \in J \cap A$. Then since x ∈ J, we have xu ∈ I and since x ∈ A, we have x − xu ∈ I. Therefore x = (x − xu) + xu ∈ I. This completes the proof of the first assertion.
For the converse, it is clear that $I = J \cap A$ is an ideal in A. But $J \not\subseteq A$, so there is an element of J of the form $\mathbb{1} - u$ with u ∈ A. Thus, for x ∈ A, $x - xu = x(\mathbb{1} - u) \in J$. But also x and xu are both elements of A and hence so is x − xu. Thus x − xu ∈ I. We have shown that u is a modular unit for I.
The consequence of this correspondence is that the maximal modular ideals
of A are in one-to-one correspondence with the maximal ideals of à that are not
contained in A. But since A is itself a maximal ideal in Ã because it has codimension one, the maximal modular ideal space (also denoted $M_A$) of A is just the maximal ideal space $M_{\tilde{A}}$ of Ã with a single point ϕ0 removed. We view this point as a “point at infinity”, so that $M_A$ is a locally compact Hausdorff space having $M_{\tilde{A}}$ as its one-point compactification. Of course, ϕ0 is an mlf on Ã vanishing on A and hence must be given by
\[ \varphi_0(\lambda\mathbb{1} + x) = \lambda \]
for x ∈ A. Every other mlf on Ã restricts to a (non-zero) mlf on A and conversely, every mlf on A extends to a unique mlf on Ã. For x ∈ A, we have ϕ(x) → ϕ0(x) = 0 as ϕ → ϕ0 in the $M_{\tilde{A}}$ topology, so its Gelfand transform $\hat{x}$ viewed as a function on $M_A$ vanishes at infinity. We see that $x \mapsto \hat{x}$ is a continuous algebra homomorphism from A to $C_0(M_A)$.
6.2
Finding the Maximal Ideal Space
Usually this is either very easy or totally impossible.
EXAMPLE Let $A = C^1([0, 1])$, the space of continuously differentiable functions on the unit interval. It's clearly an algebra with identity and the multiplication is continuous. Clearly the point evaluations $f \mapsto f(t)$ are mlfs for t ∈ [0, 1]. It seems reasonable that these would be the only ones. How do we prove this? Let I be some maximal ideal not of this form. Then, for each t ∈ [0, 1] there is a function $f_t \in I$ with $f_t(t) = 1$. Let $U_t = \{s;\ s \in [0, 1],\ |f_t(s)| > \tfrac{1}{2}\}$. This is a neighbourhood of t. Applying compactness we have $t_1, \ldots, t_N$ such that the sets $U_{t_n}$ for n = 1, . . . , N cover [0, 1]. Now, make the function f as
\[ f = \sum_{n=1}^{N} f_{t_n} \overline{f_{t_n}} \]
and observe that $f > \tfrac{1}{4}$ everywhere on [0, 1]. So the reciprocal $\mathbb{1}/f$ is in $C^1$. But f is in I and hence so is $\mathbb{1}$. But this means that I = A, a contradiction. We leave the reader to check that the Gelfand topology is just the standard topology on [0, 1].
2
EXAMPLE Here is another example, very different. Let K be a compact subset of C. Consider the ring of polynomials, viewed as functions on C, restrict them to K and let A be the uniform closure. Then A is a closed subalgebra of C(K) and it has an identity. The key to this algebra is that it is singly generated. We denote by z the identity function on K. Then everything in A is a limit of polynomials in z. So, if ϕ is an mlf, then knowledge of ϕ(z) essentially determines ϕ everywhere on A. So ζ = ϕ(z) ∈ C and for every polynomial p, we get ϕ(p) = p(ζ). We will have $\zeta \in M_A$ if and only if the map $p \mapsto p(\zeta)$ is continuous and indeed, in this case we will have
\[ |p(\zeta)| \le \sup_{z \in K} |p(z)| \quad \text{for all polynomials } p. \]
The ζ that satisfy this inequality form the polynomially convex hull $\hat{K}$ of K. It can be shown that $\mathbb{C} \setminus \hat{K}$ is the unbounded connected component of $\mathbb{C} \setminus K$. Again $M_A = \hat{K}$ with the usual topology.
2
EXAMPLE Let $A = \ell^\infty = C(\mathbb{Z})$, the set of bounded two-sided sequences with
the uniform norm. Again, A is a commutative Banach algebra with identity. The
maximal ideal space is horrendous. We clearly have Z ⊆ MA . We claim that this
inclusion is dense. Suppose not. Then there is an mlf ϕ which is not in the closure
of Z interpreted as a subset of MA via the point evaluations. Now the topology of
MA is the topology of convergence on finitely many elements of A, so there exists
a neighbourhood of ϕ defined by finitely many functions which avoids the closure
of Z. This means (after adding a suitable constant to each function if necessary)
that there exists N ∈ N and functions f1 , f2 , . . . , fN ∈ C(Z) such that ϕ(fn ) = 0
for n = 1, 2, . . . , N and the origin is not in the closure of the subset
{(f1 (k), f2 (k), . . . , fN (k)); k ∈ Z}
in $\mathbb{C}^N$. Then $g = \sum_{n=1}^{N} |f_n|^2 = \sum_{n=1}^{N} f_n \overline{f_n}$ has ϕ(g) = 0 and yet on Z, g is positive and bounded away from zero. It follows that g is invertible and this is a contradiction.
Therefore MA is a compactification of Z and in fact it is called the Stone–Čech
compactification. This is the largest possible compactification of Z and enjoys the
following universal property. Let K be a compact topological space into which
Z is mapped injectively and densely (i.e. K is a compactification of Z). Then
there is a mapping π : MA → K which is continuous and onto (but not in general
one-to-one) such that the diagram
\[ \begin{array}{ccc} & & M_A \\ & \nearrow & \downarrow \pi \\ \mathbb{Z} & \longrightarrow & K \end{array} \]
commutes. The Stone–Čech compactification is close to being incomprehensible
and we should not waste too much time trying to understand it, although of course
some mathematicians have spent many years trying to do so.
2
6.3
The Spectral Radius Formula
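THEOREM 84
In a commutative Banach algebra we have
\[ \|\hat{x}\|_\infty = \lim_{n \to \infty} \|x^n\|^{1/n}. \]
The quantity $\|\hat{x}\|_\infty$ is called the spectral radius of x.
Proof. Without loss of generality, we can assume that the algebra possesses an identity element. Since $|\varphi(x)|^n = |\varphi(x^n)| \le \|x^n\|$ for every mlf ϕ, clearly
\[ \|\hat{x}\|_\infty \le \|x^n\|^{1/n} \]
for all n ∈ N and hence
\[ \|\hat{x}\|_\infty \le \liminf_{n \to \infty} \|x^n\|^{1/n}. \]
It remains to show that
\[ \limsup_{n \to \infty} \|x^n\|^{1/n} \le \|\hat{x}\|_\infty. \]
If ζ ∈ C and $|\zeta| < \|x\|^{-1}$, then $(\mathbb{1} - \zeta x)^{-1} = \sum_{k=0}^{\infty} \zeta^k x^k$ and
\[ x^n = \frac{1}{2\pi i} \oint_{|\zeta| = s} (\mathbb{1} - \zeta x)^{-1} \zeta^{-(n+1)} \, d\zeta \]
for $s < \|x\|^{-1}$. Now let $t < \|\hat{x}\|_\infty^{-1}$, then since $\zeta \mapsto (\mathbb{1} - \zeta x)^{-1}$ is analytic in $|\zeta| < \|\hat{x}\|_\infty^{-1}$, we also have
\[ x^n = \frac{1}{2\pi i} \oint_{|\zeta| = t} (\mathbb{1} - \zeta x)^{-1} \zeta^{-(n+1)} \, d\zeta. \]
Taking norms in the integral, this yields
\[ \|x^n\| \le t^{-n} \sup_{|\zeta| = t} \|(\mathbb{1} - \zeta x)^{-1}\| \]
and, since the sup is finite,
\[ \limsup_{n \to \infty} \|x^n\|^{1/n} \le t^{-1}. \]
But, now letting t approach its maximum value $\|\hat{x}\|_\infty^{-1}$, we have the desired result.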
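The spectral radius formula can be checked numerically in a small case. This is only a sketch under assumptions made for the example: the algebra is $\ell^1(\mathbb{Z}_4)$ with cyclic convolution, whose mlfs are the four characters of $\mathbb{Z}_4$, so that the Gelfand transform is the discrete Fourier transform. The function names are hypothetical.

```python
import cmath


def conv(a, b):
    """Cyclic convolution on Z_N."""
    n = len(a)
    return [sum(a[j] * b[(m - j) % n] for j in range(n)) for m in range(n)]


def norm1(a):
    return sum(abs(t) for t in a)


def gelfand_sup(a):
    """Sup norm of the Gelfand transform: max over the characters of Z_N."""
    n = len(a)
    return max(
        abs(sum(a[m] * cmath.exp(-2j * cmath.pi * j * m / n) for m in range(n)))
        for j in range(n)
    )


x = [0.5, -0.25, 0.0, 0.125]
r = gelfand_sup(x)  # the spectral radius of x

# Repeated squaring: after k squarings, power is x^(2^k).
power, n = x, 1
for _ in range(6):
    power = conv(power, power)
    n *= 2
root = norm1(power) ** (1.0 / n)  # ||x^64||^(1/64)
```

In this finite-dimensional example one has $r^n \le \|x^n\|_1 \le N r^n$ with N = 4, so `root` lies between r and $4^{1/64} r$, which makes the convergence visible already at n = 64.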
6.4
Haar Measure
In this section we just assume the results that we need, but we will state them for
the nonabelian case which will be discussed later. The proofs aren’t really very
instructive. A locally compact group (LC group) is a group which is also a locally compact Hausdorff topological space. We demand that the multiplication map is continuous as a map from G × G to G and also that group inversion is continuous as a map from G to G. In the abelian (LCA) case we will use additive notations. An immediate consequence of the definitions is the following proposition.
PROPOSITION 85
Given a neighbourhood V of 0 in G, there is a symmetric neighbourhood U of 0 in G such that U + U ⊆ V (or U · U ⊆ V in the multiplicative case).
The basic fact that we need is given by the following theorem.
THEOREM 86
On every LC group G there is a left translation invariant nonnegative regular Borel measure λ such that λ(U) > 0 for every non-empty open subset U of G and λ(K) < ∞ for every compact subset K of G. Furthermore, the measure λ is unique up to a positive multiplicative constant. By left translation invariant, we mean λ(xB) = λ(B) (multiplicative notations) for every x ∈ G and every Borel subset B of G.
The measure λ is called the Haar measure of G. Note that the image of the
Haar measure under group negation (inversion) ρ is right translation invariant. It
turns out that λ = ∆ρ where ∆ is a continuous group homomorphism of G into
]0, ∞[ multiplicative. ∆ is called the modular function of G. We have ∆(x) = 1
for all x ∈ G if G is abelian, discrete or compact and in some other circumstances.
If G is discrete, the Haar measure is just the counting measure. If G is compact,
then it is natural to normalize the Haar measure to be a probability measure. On
additive abelian topological groups, we will denote the Haar measure by η.
6.5
Translation and Convolution
For a function f defined on G and x ∈ G we define fx (y) = f (y − x) for all
y ∈ G. We call fx the translate of f by x.
LEMMA 87
Let 1 ≤ p < ∞. For $f \in L^p(G)$ we have that $x \mapsto f_x$ is a uniformly continuous map from G to $L^p(G)$.
Proof. Suppose first that $f \in C_c(G)$, the space of continuous functions of compact support on G. Then f is uniformly continuous. (Note that G has a natural uniform structure coming from group subtraction.) Let K be the support of f. Let U be a compact neighbourhood of 0. Then K + U is also compact. Let it have measure t. Let ε > 0, then, since f is uniformly continuous, there exists a compact neighbourhood V of 0 such that
\[ x \in V \implies \|f - f_x\|_\infty < \varepsilon t^{-1/p}. \]
If x ∈ U ∩ V, then the support of $f - f_x$ is contained in K + U and it follows that
\[ \|f - f_x\|_p < \varepsilon. \]
In the general case, let $h \in L^p$ and let ε > 0. We first approximate h by a function $f \in C_c(G)$ so that $\|f - h\|_p < \varepsilon$. It is in this last step that the fact p < ∞ is used. Then, since the underlying measure is translation invariant, $\|f_x - h_x\|_p = \|f - h\|_p < \varepsilon$. If x ∈ U ∩ V, then
\[ \|h - h_x\|_p \le \|f - h\|_p + \|f - f_x\|_p + \|f_x - h_x\|_p < 3\varepsilon \]
and we have our result, since $\|h_x - h_y\|_p = \|h - h_{y-x}\|_p$ by translation invariance.
We now define convolution. If f and g are suitable functions, we set
\[ f \star g(x) = \int f(x - y) g(y) \, d\eta(y). \]
If we make the substitution y = x − z in this integral we get
\[ f \star g(x) = \int f(z) g(x - z) \, d\eta(z) = g \star f(x), \]
using that η is both translation and reflection invariant.
LEMMA 88
1. If $f \in L^1$ and $g \in L^\infty$, then $f \star g$ is bounded and uniformly continuous.
2. If $f, g \in C_c(G)$ then $f \star g \in C_c(G)$.
3. If 1 < p < ∞, $f \in L^p$, $g \in L^{p'}$, then $f \star g \in C_0$.
4. If $f, g \in L^1$, then $f \star g \in L^1$.
Proof. In 1), clearly $f \star g$ is bounded by $\|f\|_1 \|g\|_\infty$. We rewrite
\[ f \star g(x) = \int f(x - y) g(y) \, d\eta(y) = \int h(y - x) g(y) \, d\eta(y) = \int h_x(y) g(y) \, d\eta(y) \]
where h(x) = f(−x) and the uniform continuity is clear since $x \mapsto h_x$ is uniformly continuous for the $L^1$ norm.
For 2), $f \star g$ is clearly continuous by 1). Also $\mathrm{supp}(f \star g) \subseteq \mathrm{supp}(f) + \mathrm{supp}(g)$. Note that supp(f) + supp(g) is the continuous image of supp(f) × supp(g) under the addition map G × G → G and hence is compact.
For 3), proceed as in 1). We see that $f \star g$ is bounded by $\|f\|_p \|g\|_{p'}$ by using Hölder's inequality. By 2), $f \star g$ is a uniform limit of continuous functions of compact support. Hence $f \star g \in C_0$.
For 4), we start by observing that if U is open in G, then $\{(x, y);\ x \in G,\ y \in G,\ x - y \in U\}$ is an open subset of G × G. It follows that if B is a Borel set in G, then $\{(x, y);\ x \in G,\ y \in G,\ x - y \in B\}$ is Borel in G × G. So, replacing both f and g with Borel versions, we see that $(x, y) \mapsto f(x - y)$ and $(x, y) \mapsto f(x - y)g(y)$ are Borel functions on G × G. By Fubini's Theorem, this last function is absolutely integrable on the product space because
\[ \iint |g(y) f(x - y)| \, d\eta(x) \, d\eta(y) = \|f\|_1 \int |g(y)| \, d\eta(y) = \|f\|_1 \|g\|_1 < \infty. \]
It now follows that $f \star g(x) = \int f(x - y) g(y) \, d\eta(y)$ is a measurable function (finite almost everywhere) for the completion of the Borel σ-field with respect to η. It also follows from Fubini's Theorem that $f \star g \in L^1$ and that $\|f \star g\|_1 \le \|f\|_1 \|g\|_1$. It is an exercise to check that different versions of f and g yield the same element $f \star g$ of $L^1$.
Note: OK, so I lied. The problem with the proof of 4) above is that one of the hypotheses of Fubini's Theorem is that the underlying measure space be σ-finite. Unfortunately not all LCA groups are σ-finite, for example any discrete uncountable abelian group will fail to be σ-finite. We would have no difficulty handling the case of discrete groups, because $L^1$ functions on such groups would have to be carried by countable sets and actually by countable subgroups.
In a general LCA group G, we take an open relatively compact symmetric neighbourhood U of 0 and consider $U_n = U + U + \cdots + U$ with n summands. Note that $U_n$ is open and relatively compact. Now consider
\[ G_0 = \bigcup_{n=1}^{\infty} U_n, \]
an open subgroup of G which is σ-finite. But an open subgroup of G is also closed (because it is the complement of the union of all the cosets not equal to the subgroup itself) and it follows that the quotient $G/G_0$ is discrete. You can now show that given $f, g \in L^1(G)$, there is a σ-finite open and closed subgroup H of G such that in fact, f and g are carried on H. Now you apply the argument in 4) above to H. We reserve the right in these notes to tell this same lie again without comment.
We now have the following theorem which is easy to check.
THEOREM 89
For G an LCA group, $L^1(G)$ is a commutative Banach algebra with convolution multiplication.
If G is discrete, then $\delta_0$ is an identity element. It turns out that if $L^1(G)$ has an identity element, then G is discrete, but this is not too obvious.
A character χ on G is a continuous group homomorphism into the multiplicative group of unimodular complex numbers. We will denote the set of all characters on G by Γ. We can give Γ the structure of a group in the obvious way. We will use additive notations for consistency even though they look a trifle strange.
\[ (-\chi)(x) = \overline{\chi(x)}, \qquad (\chi_1 + \chi_2)(x) = \chi_1(x)\chi_2(x). \]
The Fourier transform $\hat{f}$ of $f \in L^1(G)$ is now given by
\[ \hat{f}(\chi) = \int f(x) \overline{\chi(x)} \, d\eta(x). \]
This is a linear functional on $L^1(G)$ and furthermore multiplicative:
\begin{align*}
\int f \star g(x) \overline{\chi(x)} \, d\eta(x) &= \int \int f(x - y) g(y) \, d\eta(y) \, \overline{\chi(x)} \, d\eta(x) \\
&= \iint f(x - y) \overline{\chi(x - y)} \, \overline{\chi(y)} g(y) \, d\eta(y) \, d\eta(x) \tag{6.1} \\
&= \iint f(z) \overline{\chi(z)} \, d\eta(z) \, \overline{\chi(y)} g(y) \, d\eta(y) \\
&= \hat{f}(\chi) \hat{g}(\chi).
\end{align*}
Note also that
\[ f \star \chi = \hat{f}(\chi)\chi. \]
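A finite group gives a quick sanity check of these two identities. The sketch below assumes $G = \mathbb{Z}_6$ with counting measure as Haar measure and takes the Fourier transform against the conjugated character, which is the convention under which $f \star \chi = \hat{f}(\chi)\chi$ holds; the function names are hypothetical.

```python
import cmath

N = 6  # the group Z_N, Haar measure = counting measure


def character(j):
    """The j-th character of Z_N: x -> exp(2*pi*i*j*x/N)."""
    return [cmath.exp(2j * cmath.pi * j * x / N) for x in range(N)]


def conv(f, g):
    """(f * g)(x) = sum_y f(x - y) g(y)."""
    return [sum(f[(x - y) % N] * g[y] for y in range(N)) for x in range(N)]


def fourier(f, chi):
    """Fourier transform at chi: sum_x f(x) * conj(chi(x))."""
    return sum(fx * cx.conjugate() for fx, cx in zip(f, chi))


f = [1.0, 2.0, 0.0, -1.0, 0.5, 3.0]
g = [0.0, 1.0, -2.0, 0.25, 1.0, 1.0]

# Multiplicativity: the transform of f * g equals the product of the transforms.
mult_error = max(
    abs(fourier(conv(f, g), character(j))
        - fourier(f, character(j)) * fourier(g, character(j)))
    for j in range(N)
)

# Characters are eigenfunctions of convolution: f * chi = f^(chi) chi.
chi = character(2)
eigen_error = max(abs(a - fourier(f, chi) * c) for a, c in zip(conv(f, chi), chi))
```

Both errors should be at rounding level; dropping the conjugate in `fourier` keeps multiplicativity but breaks the eigenfunction identity.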
THEOREM 90
Every mlf on $L^1(G)$ is given by a character as in (6.1).
Proof. Every bounded linear functional on $L^1$ is given by an $L^\infty$ function. So, every non-zero mlf ϕ, which necessarily has norm 1, would have to be given by a function $h \in L^\infty$ with $\|h\|_\infty = 1$ by
\[ \varphi(f) = \int f(x) h(x) \, d\eta(x). \]
Now
\[ \varphi(f) \int g(y) h(y) \, d\eta(y) = \varphi(f)\varphi(g) = \varphi(f \star g) = \varphi\Bigl( \int f_y \, g(y) \, d\eta(y) \Bigr) = \int \varphi(f_y) g(y) \, d\eta(y) \]
and this holds for all $g \in L^1$. Therefore
\[ \varphi(f) h(y) = \varphi(f_y) \tag{6.2} \]
for almost all y. Choosing f so that ϕ(f) ≠ 0 and since $y \mapsto f_y$ is continuous, we see that h has a continuous version. Replacing h with its continuous version, it is now clear that (6.2) holds for all y ∈ G (a conull set must be dense). Now we have
\[ \varphi(f) h(x + y) = \varphi(f_{x+y}) = \varphi(f_x) h(y) = \varphi(f) h(x) h(y) \]
giving h(x + y) = h(x)h(y). Put now x = y = 0 to get h(0) = 0 or 1. But h(0) = 0 implies that h and hence ϕ vanishes identically and hence we must have h(0) = 1. But now h(x)h(−x) = 1 and, since $\|h\|_\infty = 1$, it follows that |h(x)| = 1 for all x ∈ G.
6.6
The Dual Group
So, on L1 (G), the Gelfand transform and the Fourier transform are the same. We
note that if G is discrete, then L1 (G) has an identity and Γ is compact. If G is
compact, then η has finite measure. Normally it is normalized to have total mass
1 in this case. We have
\[ \int \chi(x) \, d\eta(x) = \begin{cases} 1 & \text{if } \chi \text{ is the zero element of } \Gamma, \\ 0 & \text{otherwise,} \end{cases} \]
since if $\chi = \mathbb{1}$, the first assertion is obvious. Otherwise there is an element y ∈ G such that χ(y) ≠ 1. Then
\[ \int \chi(x) \, d\eta(x) = \int \chi(x + y) \, d\eta(x) = \chi(y) \int \chi(x) \, d\eta(x) \]
so that
\[ (1 - \chi(y)) \int \chi(x) \, d\eta(x) = 0. \]
Note that in this case, the characters are themselves elements of $L^1(G)$. Thus $\hat{\chi}(\psi) = 1$ if χ = ψ and = 0 otherwise. Since $\hat{\chi}$ is a continuous function on Γ, it follows that Γ is discrete.
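For a finite group the orthogonality relations can be verified directly. This is a minimal sketch assuming $G = \mathbb{Z}_8$ with Haar measure normalized to total mass 1; the names `character`, `haar_integral` and `transform` are hypothetical.

```python
import cmath

N = 8  # the compact (finite) group Z_N


def character(j):
    """The j-th character of Z_N: x -> exp(2*pi*i*j*x/N)."""
    return [cmath.exp(2j * cmath.pi * j * x / N) for x in range(N)]


def haar_integral(values):
    """Integral against the Haar measure of total mass 1 on Z_N."""
    return sum(values) / N


def transform(f, j):
    """Fourier transform of f at the j-th character (conjugated character)."""
    return haar_integral([fx * cx.conjugate() for fx, cx in zip(f, character(j))])


# Integral of each character: 1 for the trivial one, 0 for the others.
integrals = [haar_integral(character(j)) for j in range(N)]

# Transform of the j-th character at the k-th character: 1 if j == k, else 0.
gram = [[transform(character(j), k) for k in range(N)] for j in range(N)]
```

The matrix `gram` is numerically the identity, which is the statement that each $\hat{\chi}$ is the indicator of a single point of Γ.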
THEOREM 91
1. $(x, \chi) \mapsto \chi(x)$ is jointly continuous G × Γ → T.
2. Let K and C be compact in G and Γ respectively, then for t > 0
\[ N(K, t) = \{\chi;\ |\chi(x) - 1| < t \text{ for all } x \in K\} \]
\[ N(C, t) = \{x;\ |\chi(x) - 1| < t \text{ for all } \chi \in C\} \]
are open in Γ and G respectively.
3. The sets N(K, t) and their translates form a base for the topology of Γ.
4. Γ is an LCA group.
Proof. For 1) let $f \in L^1(G)$. We know that $x \mapsto f_x$ is continuous from G to $L^1(G)$. So, since the Gelfand transform is continuous, $x \mapsto \hat{f_x}$ is continuous from G to $C_0(\Gamma)$. But
\[ \hat{f_x}(\chi) = \overline{\chi(x)} \hat{f}(\chi) \]
and it follows that $(x, \chi) \mapsto \chi(x)$ is jointly continuous on the set $\{(x, \chi);\ x \in G,\ \hat{f}(\chi) \ne 0\}$. But, for each χ ∈ Γ it is easy to construct $f \in L^1(G)$ such that $\hat{f}(\chi) \ne 0$ and we see that $(x, \chi) \mapsto \chi(x)$ is jointly continuous on G × Γ.
Next we prove 2). Let C be compact in Γ and let $x_0 \in N(C, t)$. Then $|\chi(x_0) - 1| < t$ for all χ ∈ C. So, for each χ there is an open neighbourhood $V_\chi$ of χ in Γ and an open neighbourhood $U_\chi$ of $x_0$ in G such that |ψ(x) − 1| < t for all $\psi \in V_\chi$ and $x \in U_\chi$. Finitely many such neighbourhoods $V_\chi$ cover C. Let U be the open intersection of the corresponding $U_\chi$. Then it is clear that $x_0 \in U \subseteq N(C, t)$. The other assertion is proved similarly.
Note that 2) states that the Gelfand topology in Γ is finer than the compact-open topology (i.e. the topology of uniform convergence on the compact sets). For 3), we have to show the converse and for this it is enough to show that each Gelfand transform $\hat{f}$ for $f \in L^1(G)$ is continuous for the compact-open topology. If the function f has compact support, this is obvious since
\[ |\hat{f}(\chi_1) - \hat{f}(\chi_2)| \le \int_{\mathrm{supp}(f)} |\chi_1(x) - \chi_2(x)| |f(x)| \, d\eta(x) \le \|f\|_1 \sup_{x \in \mathrm{supp}(f)} |\chi_1(x) - \chi_2(x)|. \]
But any $L^1$ function can be approximated in $L^1$ norm by $L^1$ functions of compact support and the corresponding transforms converge uniformly. This completes the proof of 3).
To prove 4), we simply observe that the compact-open topology on Γ is clearly a group topology. This really amounts to observing that for every compact subset K of G and every t > 0 we have
\[ N(K, t/2) - N(K, t/2) \subseteq N(K, t), \]
or equivalently that the standard topology on T is a group topology.
The group Γ is called the dual group of G.
6.7
Summability Kernels
Here we give the theory of summability kernels as it applies to LCA groups. The proof of the Bernstein approximation theorem using the Bernstein polynomials gives an example of the idea in more general situations.
Let $k_n \in L^1(G)$ be indexed over n ∈ N. (In general other indexing sets are used.) We suppose
• $k_n \ge 0$.
• $\int_G k_n(x) \, d\eta(x) = 1$, for all n ∈ N.
• For every measurable neighbourhood V of 0 we have
\[ \lim_{n \to \infty} \int_{G \setminus V} k_n(x) \, d\eta(x) = 0. \]
We have the following general theorem.
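THEOREM 92
Let B be a Banach space of objects on which G acts isometrically and continuously. We will denote by $b_x$ the result of applying the group element x to b ∈ B. Then
\[ \int b_x k_n(x) \, d\eta(x) \longrightarrow b \quad \text{as } n \to \infty. \]
Proof. We have
\[ b - \int b_x k_n(x) \, d\eta(x) = \int (b - b_x) k_n(x) \, d\eta(x) \]
and so
\[ \Bigl\| b - \int b_x k_n(x) \, d\eta(x) \Bigr\| \le \int \|b - b_x\| k_n(x) \, d\eta(x). \]
Now, let ε > 0. There exists V a measurable neighbourhood of 0 such that $x \in V \implies \|b - b_x\| < \varepsilon$ and then there exists N ∈ N such that
\[ n \ge N \implies \int_{G \setminus V} k_n(x) \, d\eta(x) < \varepsilon. \]
We have
\begin{align*}
\int_G \|b - b_x\| k_n(x) \, d\eta(x) &\le \int_V \|b - b_x\| k_n(x) \, d\eta(x) + \int_{G \setminus V} \|b - b_x\| k_n(x) \, d\eta(x) \\
&\le \varepsilon \int_V k_n(x) \, d\eta(x) + \int_{G \setminus V} (\|b\| + \|b_x\|) k_n(x) \, d\eta(x) \\
&\le \varepsilon \int_G k_n(x) \, d\eta(x) + \int_{G \setminus V} (\|b\| + \|b\|) k_n(x) \, d\eta(x) \\
&\le \varepsilon + 2\|b\| \int_{G \setminus V} k_n(x) \, d\eta(x) \\
&\le \varepsilon + 2\|b\|\varepsilon
\end{align*}
for n ≥ N.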
COROLLARY 93
Let 1 ≤ p < ∞. Let $f \in L^p(G)$ and $(k_n)$ be a summability kernel. Then $k_n \star f \to f$ in $L^p$ norm.
EXAMPLE
Let ϕ be a bounded continuous function on G. Show that
\[ \int \varphi(x) k_n(x) \, d\eta(x) \longrightarrow \varphi(0) \quad \text{as } n \to \infty. \]
2
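A concrete instance of this exercise can be explored numerically. The choices below are assumptions made for the illustration and do not come from the notes: G is the circle group with Haar measure dx/2π, the kernel is the Fejér kernel (a standard summability kernel on the circle), and integrals are approximated by equispaced Riemann sums.

```python
import math


def fejer(n, x):
    """Fejer kernel F_n on the circle; its mean over [0, 2*pi) equals 1."""
    s = math.sin(x / 2.0)
    if abs(s) < 1e-12:
        return float(n + 1)  # limiting value at x = 0
    return (math.sin((n + 1) * x / 2.0) / s) ** 2 / (n + 1)


def haar_integral(func, points=4096):
    """Mean of func over the circle, approximated by an equispaced Riemann sum."""
    return sum(func(2.0 * math.pi * j / points) for j in range(points)) / points


def smoothed_value(phi, n):
    """Approximates the integral of phi * F_n against normalized Haar measure."""
    return haar_integral(lambda x: phi(x) * fejer(n, x))


phi = math.cos  # bounded and continuous, with phi(0) = 1
orders = (1, 9, 99)
values = [smoothed_value(phi, n) for n in orders]
masses = [haar_integral(lambda x, n=n: fejer(n, x)) for n in orders]
```

For ϕ = cos the integral is the Fejér mean n/(n + 1), so the values increase towards ϕ(0) = 1 as n grows, while each entry of `masses` stays equal to 1.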
6.8
Convolution of Measures
Let λ and µ be complex Borel measures on G. Then we define their convolution product λ ∗ µ by
\[ \lambda * \mu(B) = \lambda \otimes \mu(\alpha^{-1}(B)) \tag{6.3} \]
where α is the addition map α : G × G → G given by α(x, y) = x + y. This extends to suitable measurable functions via
\[ \int_G f \, d\lambda * \mu = \int_G \int_G f(x + y) \, d\lambda(x) \, d\mu(y). \tag{6.4} \]
In fact, (6.3) is just the special case $f = \mathbb{1}_B$. It's easy to check that the convolution multiplication is associative and (on an abelian group) commutative. The totality of all complex Borel measures on G is denoted M(G). Since all complex Borel measures are necessarily bounded, we can put the total mass norm $\|\cdot\|_M$ on M(G) and it can then be realised as the dual space of $C_0(G)$. Taking the supremum over all $f \in C_0(G)$ with norm bounded by one in (6.4), we see that $\|\lambda * \mu\|_M \le \|\lambda\|_M \|\mu\|_M$. It follows that M(G) is a commutative Banach algebra with identity $\delta_0$. The maximal ideal space of M(G) is pathological. It is true that the mappings
\[ \mu \mapsto \hat{\mu}(\chi) = \int_G \overline{\chi(x)} \, d\mu(x) \]
for χ ∈ Γ which define the so-called Fourier–Stieltjes transform of µ are multiplicative linear functionals on M(G), but there are other less obvious mlfs as well (at least when G is non-discrete).
We now have the following uniqueness theorem which is the wrong way around.
THEOREM 94
Let µ ∈ M(Γ) be such that $\int_\Gamma \chi(x) \, d\mu(\chi) = 0$ for all x ∈ G. Then µ = 0 identically.
Proof. Let f be in $L^1(G)$, then $\int_G f(x) \int_\Gamma \overline{\chi(x)} \, d\mu(\chi) \, d\eta(x) = 0$ (the inner integral vanishes by the hypothesis applied at −x). Then we have $\int_G \int_\Gamma |f(x)| \, d|\mu|(\chi) \, d\eta(x) < \infty$ and hence by Fubini's Theorem, we have
\[ \int_\Gamma \hat{f}(\chi) \, d\mu(\chi) = \int_\Gamma \int_G f(x) \overline{\chi(x)} \, d\eta(x) \, d\mu(\chi) = 0. \tag{6.5} \]
But the set of Fourier transforms A(Γ) of $L^1$ functions on G is a self-adjoint subalgebra of $C_0(\Gamma)$ under pointwise multiplication which separates the points of Γ (compact case) and the points of the one-point compactification of Γ in the non-compact case. To verify the self-adjointness, we check
\[ \int \overline{f(-x)} \, \overline{\chi(x)} \, d\eta(x) = \overline{\int f(-x) \chi(x) \, d\eta(x)} = \overline{\int f(x) \chi(-x) \, d\eta(x)} \]
so that $\hat{g}(\chi) = \overline{\hat{f}(\chi)}$, where $g(x) = \overline{f(-x)}$. Therefore, by the Stone–Weierstrass Theorem, A(Γ) is dense in $C_0(\Gamma)$. It follows from (6.5) that µ = 0.
6.9
Positive Definite Functions
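Let ϕ be a complex-valued function on G, then we say that ϕ is positive semidefinite if and only if the matrix M given by
\[ m_{j,k} = \varphi(x_j - x_k) \]
is positive semidefinite for all choices of finitely many points $(x_j)_{j=1}^{n}$ from G. Explicitly, this means that
\[ \sum_{j=1}^{n} \sum_{k=1}^{n} c_j \overline{c_k} \varphi(x_j - x_k) \ge 0 \]
for all n ∈ N, $c_j \in \mathbb{C}$ and $x_j \in G$. Let ϕ be a positive semidefinite function. Then clearly ϕ(0) ≥ 0 (take n = 1 and $c_1 \ne 0$). Also, a positive semidefinite matrix has to be hermitian, so $\varphi(-x) = \overline{\varphi(x)}$. Now the matrix
\[ \begin{pmatrix} \varphi(0) & \varphi(x) \\ \varphi(-x) & \varphi(0) \end{pmatrix} = \begin{pmatrix} \varphi(0) & \varphi(x) \\ \overline{\varphi(x)} & \varphi(0) \end{pmatrix} \]
is positive semidefinite and has a nonnegative determinant, so |ϕ(x)| ≤ ϕ(0) for all x ∈ G. Similarly, the matrix
\[ \begin{pmatrix} \varphi(0) & \varphi(x) & \varphi(y) \\ \varphi(-x) & \varphi(0) & \varphi(y - x) \\ \varphi(-y) & \varphi(x - y) & \varphi(0) \end{pmatrix} = \begin{pmatrix} \varphi(0) & \varphi(x) & \varphi(y) \\ \overline{\varphi(x)} & \varphi(0) & \overline{\varphi(x - y)} \\ \overline{\varphi(y)} & \varphi(x - y) & \varphi(0) \end{pmatrix} \]
is positive semidefinite and hence, using simultaneous row and column reduction, so is
\[ \begin{pmatrix} \varphi(0) & \varphi(x) - \varphi(y) \\ \overline{\varphi(x) - \varphi(y)} & 2\Re(\varphi(0) - \varphi(x - y)) \end{pmatrix}. \]
It follows that
\[ |\varphi(x) - \varphi(y)|^2 \le 2\varphi(0)\bigl(\varphi(0) - \Re\varphi(x - y)\bigr). \]
It is easy to check that if $f \in L^2(G)$, then $\tilde{f} \star f$ is a continuous positive semidefinite function on G tending to zero at infinity. However, there is a complete characterization of the continuous positive semidefinite functions on G.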
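These properties can be tested on a finite group. The sketch below assumes $G = \mathbb{Z}_7$ with counting measure and uses the convention $\tilde{f}(x) = \overline{f(-x)}$; it builds $\varphi = \tilde{f} \star f$ from a random f and evaluates the quadratic form for random points and coefficients. All function names are hypothetical.

```python
import random

N = 7  # the cyclic group Z_N with counting measure


def conv(f, g):
    """(f * g)(x) = sum_y f(x - y) g(y)."""
    return [sum(f[(x - y) % N] * g[y] for y in range(N)) for x in range(N)]


def tilde(f):
    """f~(x) = conj(f(-x))."""
    return [f[(-x) % N].conjugate() for x in range(N)]


def quadratic_form(phi, c, xs):
    """sum over j, k of c_j * conj(c_k) * phi(x_j - x_k)."""
    m = len(xs)
    return sum(c[j] * c[k].conjugate() * phi[(xs[j] - xs[k]) % N]
               for j in range(m) for k in range(m))


rng = random.Random(0)  # fixed seed so the run is reproducible


def random_complex():
    return complex(rng.uniform(-1, 1), rng.uniform(-1, 1))


f = [random_complex() for _ in range(N)]
phi = conv(tilde(f), f)  # phi(0) is the sum of |f(y)|^2

forms = []
for _ in range(20):
    xs = [rng.randrange(N) for _ in range(4)]
    c = [random_complex() for _ in range(4)]
    forms.append(quadratic_form(phi, c, xs))
```

Each entry of `forms` should be real and nonnegative up to rounding, and ϕ should satisfy |ϕ(x)| ≤ ϕ(0) together with the inequality for |ϕ(x) − ϕ(y)|² derived above.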
THEOREM 95 (BOCHNER'S THEOREM)
Every continuous positive semidefinite function ϕ on G has the form
\[ \varphi(x) = \int_\Gamma \chi(x) \, d\mu(\chi) \tag{6.6} \]
where µ is a nonnegative Borel measure (of finite total mass) on Γ and conversely.
Proof. It is routine to check that if ϕ is defined by (6.6) then ϕ is continuous and
positive semidefinite on G. For the converse, it is an exercise to check that (since
ϕ is bounded and continuous), we have
ZZ
ϕ(x − y)f (x)f (y)dη(x)dη(y) ≥ 0,
G×G
for f ∈ L1 (G). We use the formula to define a quasi inner product on L1 (G) by
ZZ
Z
<f, g> =
ϕ(x − y)g(x)f (y)dη(x)dη(y) = (f˜ ∗ g)ϕdη.
G×G
G
This is in all respects like an inner product, except that the implication <f, f > = 0
does not necessarily imply that f is the zero element of L1 (G). Nevertheless, the
proof of the corresponding Cauchy–Schwarz–Bunyakowski inequality goes thru,
giving
|<f, g>|2 ≤ <f, f ><g, g>
Now pass to the limit as f runs over a summability kernel on G. We get
Z
2
ϕ(x)g(x)dη(x) ≤ ϕ(0)<g, g>
G
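for all $g \in L^1(G)$. Let $g_1 = \tilde g \star g$ and $g_{n+1} = \tilde g_n \star g_n$ for n = 1, 2, . . . Actually, $\tilde g_n = g_n$ and it follows that $g_{n+1} = \star^{2^n} g_1$, the $2^n$-fold convolution product of $g_1$ with itself. The point is that
\[
\left|\int_G \varphi(x)g_n(x)\,d\eta(x)\right|^2 \le \varphi(0)\,\langle g_n,g_n\rangle = \varphi(0)\int_G \varphi(x)g_{n+1}(x)\,d\eta(x).
\]
It follows from this and a simple induction that
\[
\left|\int_G \varphi(x)g(x)\,d\eta(x)\right|^{2^n} \le \varphi(0)^{2^n}\,\bigl\|\star^{2^{n-1}} g_1\bigr\|_1
\]
and, after taking the root of order $2^{n-1}$ and passing to the limit with the spectral radius formula, we get
\[
\left|\int_G \varphi(x)g(x)\,d\eta(x)\right|^2 \le \varphi(0)^2\,\|\hat g_1\|_\infty \le \varphi(0)^2\,\|\hat g\|_\infty^2
\]
or
\[
\left|\int_G \varphi(x)g(x)\,d\eta(x)\right| \le \varphi(0)\,\|\hat g\|_\infty.
\]
This tells us that $\int_G \varphi(x)g(x)\,d\eta(x)$ depends only on the value of $\hat g$ and (since A(Γ) is dense in $C_0(\Gamma)$) that there is a measure µ on Γ of total mass at most ϕ(0) such that
\[
\int_G \varphi(x)g(x)\,d\eta(x) = \int_\Gamma \hat g(\chi)\,d\mu(\chi).
\]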
But now,
\[
\int_G \varphi(x)g(x)\,d\eta(x) = \int_\Gamma\int_G g(x)\chi(x)\,d\eta(x)\,d\mu(\chi) = \int_G g(x)\int_\Gamma \chi(x)\,d\mu(\chi)\,d\eta(x). \tag{6.7}
\]
The functions ϕ and $x \mapsto \int_\Gamma \chi(x)\,d\mu(\chi)$ are both continuous on G and since (6.7) holds for all $g \in L^1(G)$, we have (6.6) holding for all x ∈ G as required. Finally, put x = 0 in (6.6) to see that $\varphi(0) = \int d\mu \le \|\mu\| \le \varphi(0)$, forcing µ to be a positive measure.
Something special happened in the proof above. Before this theorem, we didn't know that the points of G could be separated by its characters, but now we do. Given x ≠ 0 in G, find a symmetric neighbourhood V of 0 such that $x \notin V + V$. Then apply Bochner's Theorem to $\mathbf{1}_V \star \mathbf{1}_V$.
We are now ready to prove a preliminary form of the inversion theorem.
THEOREM 96
Let $f \in L^1(G)$ be also given by $f(x) = \int_\Gamma \chi(x)\,d\mu_f(\chi)$ where $\mu_f$ is a complex measure. (Note that complex measures have finite total mass.) Then $\mu_f = \hat f\,\nu$ where ν is a suitably normalized Haar measure on Γ.
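Proof. Let f and g be two such functions with associated measures $\mu_f$ and $\mu_g$. Let $h \in L^1(G)$. Then, substituting the representation of f, exchanging the order of integration and replacing x by −x (so that χ(x) becomes $\overline{\chi(x)}$),
\begin{align*}
\iint h(-x-y)f(x)g(y)\,d\eta(x)\,d\eta(y)
&= \iiint h(-x-y)\chi(x)g(y)\,d\mu_f(\chi)\,d\eta(x)\,d\eta(y) \\
&= \iiint h(-x-y)\chi(x)g(y)\,d\eta(x)\,d\eta(y)\,d\mu_f(\chi) \\
&= \iiint h(x-y)\overline{\chi(x)}g(y)\,d\eta(x)\,d\eta(y)\,d\mu_f(\chi) \\
&= \iiint h(x-y)\overline{\chi(x)}g(y)\,d\eta(y)\,d\eta(x)\,d\mu_f(\chi) \\
&= \iint (g\star h)(x)\overline{\chi(x)}\,d\eta(x)\,d\mu_f(\chi) \\
&= \int \widehat{g\star h}(\chi)\,d\mu_f(\chi) \\
&= \int \hat h(\chi)\hat g(\chi)\,d\mu_f(\chi)
\end{align*}
and also by the symmetry of the initial expression in f and g
\[
= \int \hat h(\chi)\hat f(\chi)\,d\mu_g(\chi).
\]
Again, since A(Γ) is dense in $C_0(\Gamma)$, we find $\hat g\,d\mu_f = \hat f\,d\mu_g$.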
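Now we can construct functions like f and g easily. Let V be a measurable neighbourhood of 0 and let $h = \mathbf{1}_V$, then by Bochner's Theorem, there is a measure $\mu_{h\star\tilde h}$ such that
\[
h\star\tilde h(x) = \int_\Gamma \chi(x)\,d\mu_{h\star\tilde h}(\chi), \qquad \widehat{h\star\tilde h}(\chi) = |\hat h(\chi)|^2.
\]
Furthermore, if ψ ∈ Γ,
\[
(\psi h \star \widetilde{\psi h})(x) = \psi(x)\,h\star\tilde h(x) = \int_\Gamma \chi(x)\,d\mu_{h\star\tilde h}(\chi-\psi) = \int_\Gamma \chi(x)\,d\mu_{\psi h\star\widetilde{\psi h}}(\chi),
\]
\[
\widehat{\psi h\star\widetilde{\psi h}}(\chi) = |\hat h(\chi-\psi)|^2.
\]
Note also that $\hat h$ is continuous and $\hat h(0)$ is nonzero. This leads to
\[
|\hat h(\chi-\psi)|^2\,d\mu_f(\chi) = \hat f(\chi)\,d\mu_{\psi h\star\widetilde{\psi h}}(\chi)
\]
showing that $\mu_f$ is uniquely determined near ψ and hence everywhere on Γ.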
Also we may infer the existence (the details are an exercise) of a positive measure ν such that
\[
\mu_f = \hat f\,\nu.
\]
A straightforward compactness argument shows that ν is finite on the compact sets and charges every nonempty open set.
Now let us abbreviate $h\star\tilde h$ to g. Then
\[
\widehat{\psi g}(\chi)\,d\mu_g(\chi) = \hat g(\chi)\,d\mu_{\psi g}(\chi) = \hat g(\chi)\,d\mu_g(\chi-\psi)
\]
leading to
\[
\hat g(\chi-\psi)\hat g(\chi)\,d\nu(\chi) = \hat g(\chi)\hat g(\chi-\psi)\,d\nu(\chi-\psi).
\]
Now suppose that ψ is given, then, choosing ε > 0 suitably small, choose V such that $V \subseteq \{x;\ |\psi(x)-1| < \epsilon\}$. Then for χ in a neighbourhood of $0_\Gamma$, $\hat g(\chi-\psi)\hat g(\chi) \ne 0$, showing that dν(χ) = dν(χ − ψ) at least for values of χ in a neighbourhood of $0_\Gamma$. It follows that ν is translation invariant and hence a multiple of Haar measure on Γ. Again, the details are an exercise.
Now let V be a symmetric neighbourhood of 0 in G. Let $g = \eta(V)^{-1}\,\mathbf{1}_V \star \mathbf{1}_V$. Then g(0) = 1 and g is positive definite. It follows that $g(x) = \int_\Gamma \hat g(\chi)\chi(x)\,d\nu(\chi)$. Now $\hat g$ is in $L^1(\Gamma)$ with norm 1 and there is a compact subset C of Γ such that
\[
\int_{\Gamma\setminus C} \hat g(\chi)\,d\nu(\chi) < \frac15.
\]
Suppose that $x \in N(C, \frac15)$. Then
\begin{align*}
\left|1 - \int_\Gamma \hat g(\chi)\chi(x)\,d\nu(\chi)\right|
&\le \int_\Gamma \hat g(\chi)\,|1-\chi(x)|\,d\nu(\chi) \\
&\le \int_{\Gamma\setminus C} \hat g(\chi)\,|1-\chi(x)|\,d\nu(\chi) + \int_C \hat g(\chi)\,|1-\chi(x)|\,d\nu(\chi) \\
&\le \frac25 + \frac15 = \frac35.
\end{align*}
So, $g(x) \ge \frac25$ and x ∈ V + V. It follows from this that the compact open topology defined on G by means of the duality with Γ is finer than and therefore equivalent to the original topology on G.
6.10 The Plancherel Theorem
This is an immediate consequence of the inversion theorem.
THEOREM 97 (PLANCHEREL THEOREM)
Let $f \in L^1(G) \cap L^2(G)$. Then
\[
\int_G |f(y)|^2\,d\eta(y) = \int_\Gamma |\hat f(\chi)|^2\,d\nu(\chi) \tag{6.8}
\]
so that the map $f \mapsto \hat f$,
\[
L^1(G)\cap L^2(G) \longrightarrow L^2(\Gamma),
\]
extends by continuity to a surjective isometry
\[
L^2(G) \longrightarrow L^2(\Gamma).
\]
Proof. Let $h = f \star \tilde f$, then $h(x) = \int_G f(x-y)\overline{f(-y)}\,d\eta(y)$ and $h(0) = \|f\|_2^2$. Since h is both in $L^1$ and is positive definite, it can be represented by a measure $\mu_h$ of total mass h(0). Also, $\mu_h = \hat h\,\nu$. Since $\hat h(\chi) = \widehat{f\star\tilde f}(\chi) = |\hat f(\chi)|^2$, we have (6.8). The remainder of the result is obvious, except for the fact that the isometry is surjective. To see this, suppose not. Then there is a nonzero function $\phi \in L^2(\Gamma)$ such that
\[
\int_\Gamma \hat f(\chi)\phi(\chi)\,d\nu(\chi) = 0
\]
for all $f \in L^1(G)\cap L^2(G)$. Fix such an f and consider its translation $f_x$. We get
\[
\int_\Gamma \chi(x)\hat f(\chi)\phi(\chi)\,d\nu(\chi) = \int_\Gamma \hat f_x(\chi)\phi(\chi)\,d\nu(\chi) = 0
\]
for all x ∈ G. But now by Theorem 94 and since $\hat f\phi\,\nu$ is a measure ($\hat f\phi \in L^1$), we have that $\hat f\phi$ vanishes ν almost everywhere. But we know how to choose f such that $\hat f$ is non-vanishing in a neighbourhood of any given point of Γ. Hence φ = 0 almost everywhere (and as an element of $L^2$).
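COROLLARY 98
Let $f, g \in L^2(G)$, then $\widehat{fg} = \hat f \star \hat g$.
Proof. Polarizing the Plancherel identity leads to
\[
\int_G f(y)\overline{g(y)}\,d\eta(y) = \int_\Gamma \hat f(\chi)\overline{\hat g(\chi)}\,d\nu(\chi)
\]
for $f, g \in L^2(G)$. The notations $\hat f$, $\hat g$ now stand for the abstract nonsense Fourier transforms of f and g respectively. Replace g by $\overline{g}$ and f by $\overline{\psi}f$ where ψ ∈ Γ. We get
\[
\int_G \overline{\psi(y)}f(y)g(y)\,d\eta(y) = \int_\Gamma \hat f(\chi+\psi)\hat g(-\chi)\,d\nu(\chi)
\]
which after a change of variables gives exactly $\widehat{fg} = \hat f \star \hat g$.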
COROLLARY 99
Let Ω be a nonempty open subset of Γ. Then there is a function $f \in L^1(G)$ such that $\hat f$ is not identically zero and $\hat f(\chi) = 0$ for all χ ∈ Γ \ Ω.
Proof. First, find $V_1$ and $V_2$ open nonempty and relatively compact with $V_1 + V_2 \subset \Omega$. Then let $f_j \in L^2(G)$ be the elements such that $\hat f_j = \mathbf{1}_{V_j}$ for j = 1, 2. Then $f = f_1 f_2$ does the trick, since $\hat f = \mathbf{1}_{V_1} \star \mathbf{1}_{V_2}$.
6.11 The Pontryagin Duality Theorem
Let H be the dual group of Γ. Every element of G defines a continuous character on Γ, so there is a map α : G → H which is clearly one-to-one (different elements of G define different characters since we know that the characters of G separate the points of G).
THEOREM 100 (PONTRYAGIN DUALITY THEOREM)
The mapping α is an isomorphism of topological groups.
Proof. It is clear that α is an injective group homomorphism. We also know that
the topologies of G and H can be identified to the compact open topology when
these spaces are viewed as function spaces on Γ. Therefore the topology of G is
the subspace topology coming from H. Now the uniform structure of an abelian
topological group is given from the topology by means of translation. Therefore,
the uniform structure on G is just the restriction of the uniform structure on H.
But G is locally compact and hence as a uniform space, it is complete. But when complete spaces occur as subsets of other spaces, they are necessarily closed. Hence α(G) is a closed subset of H.¹
It remains only to show that α(G) is dense in H. But, if not, then by one of the corollaries of the Plancherel Theorem, we can find $f \in L^1(\Gamma)$ nonzero, with $\hat f(x) = 0$ for all x ∈ α(G). But then Theorem 94 implies that f is almost everywhere zero on Γ, a contradiction.
Some of the consequences of the Pontryagin Duality Theorem are as follows:
• Every compact abelian group is the dual of a discrete abelian group.
• Every discrete abelian group is the dual of a compact abelian group.
• If µ ∈ M (G) and µ̂(χ) = 0 for all χ ∈ Ĝ, then µ = 0. In particular, both
L1 (G) and M (G) are semisimple Banach algebras.
• If G is not discrete, then Ĝ is not compact and hence L1 (G) does not have
an identity element.
• We can restate the inversion theorem the correct way around. If µ ∈ M(G) and $\hat\mu \in L^1(\hat G)$, then there exists $f \in L^1(G)$ such that µ = fη and the inversion formula
\[
f(x) = \int_{\hat G} \hat\mu(\chi)\chi(x)\,d\eta(\chi)
\]
holds.

¹ If you are reading along in Rudin's book, please note that it is in general false that a locally compact subspace of a locally compact topological space is necessarily closed. Whatever Rudin intended in §1.7 is by no means clear.
7 Distributions and Euclidean Harmonic Analysis
To define distributions, we first need a space of test functions. There are several choices, but the usual one is $C_c^\infty(\mathbb{R}^d)$. Here $\mathbb{R}^d$ can be replaced by any $C^\infty$ manifold. We topologize $C_c^\infty(\mathbb{R}^d)$ with the seminorms
\[
p_{\alpha,K} : C_c^\infty(\mathbb{R}^d) \to \mathbb{C}, \qquad p_{\alpha,K}(f) = \sup_{x\in K} |\partial^\alpha f(x)|,
\]
where
\[
\partial^\alpha f = \frac{\partial^{|\alpha|} f}{\partial x_1^{\alpha_1}\cdots\partial x_d^{\alpha_d}}.
\]
Here $\alpha = (\alpha_1,\ldots,\alpha_d)$ runs over d-tuples of nonnegative integers and K runs over compact subsets of $\mathbb{R}^d$. Alternatively, one can replace K with balls of positive integer radius centred at 0. The notation |α| is for $\sum_{j=1}^d \alpha_j$.
With these seminorms, $C_c^\infty(\mathbb{R}^d)$ is a locally convex space. Unfortunately, it is not necessarily complete. The continuous linear forms on $C_c^\infty(\mathbb{R}^d)$ are called distributions.
A function f on $\mathbb{R}^d$ is said to be locally integrable (written $f \in L^1_{loc}(\mathbb{R}^d)$) if and only if $\int_K |f(x)|\,dx < \infty$ for every compact subset K of $\mathbb{R}^d$. Here $dx = dx_1\ldots dx_d$ is Lebesgue measure on $\mathbb{R}^d$. Locally integrable functions can be identified to distributions by
\[
\varphi \mapsto \int f\varphi\,dx, \qquad C_c^\infty(\mathbb{R}^d) \to \mathbb{C}.
\]
If f is differentiable, we have
\[
\int \frac{\partial}{\partial x_j}(f\varphi)\,dx = 0
\]
and so
\[
\int \frac{\partial f}{\partial x_j}\varphi\,dx + \int f\frac{\partial\varphi}{\partial x_j}\,dx = 0.
\]
We now use this formula to define $\frac{\partial f}{\partial x_j}$ as a distribution when $f \in L^1_{loc}(\mathbb{R}^d)$, but is not necessarily differentiable. The defining continuous linear form is
\[
\varphi \mapsto -\int f\frac{\partial\varphi}{\partial x_j}\,dx, \qquad C_c^\infty(\mathbb{R}^d) \to \mathbb{C}.
\]
More generally, one may take $\frac{\partial}{\partial x_j}$ of any distribution. Instead of f one may use measures. The derivative of $\delta_0$ on the line is $\delta_0'$, the unit dipole at 0.
Yet another way of defining distributions is by means of Cauchy principal value integrals. The typical such integral is
\[
\int_{-\infty}^{\infty} \varphi(x)\,\frac{dx}{x}
\]
for $\varphi \in C_c^\infty(\mathbb{R})$. Of course, the integral is meaningless as it stands because of the singularity at x = 0. To give it precise meaning as a Cauchy principal value integral, we define
\[
\int_{-\infty}^{\infty} \varphi(x)\,\frac{dx}{x} = \lim_{\epsilon\to0+} \int_{|x|>\epsilon} \varphi(x)\,\frac{dx}{x}.
\]
To see that this makes sense, we choose a specific $\psi \in C_c^\infty(\mathbb{R})$ which is even and has ψ(0) = 1. Then it is easy to see that the Cauchy principal value integral is just
\[
\int_{-\infty}^{\infty} \frac{\varphi(x)-\varphi(0)\psi(x)}{x}\,dx
\]
and the singularity in the integrand is now removable. Note that in the original definition, it is vital to remove a symmetric interval $\{x;\ |x| \le \epsilon\}$ from the range of integration. Removing $\{x;\ -\epsilon \le x \le 2\epsilon\}$ will give a different answer. Yet another way of thinking about this integral is as the distribution f′ where f is the locally integrable function $x \mapsto \ln(|x|)$.
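The dependence on the shape of the removed interval can be seen numerically. The following is a minimal sketch, not part of the notes: the test function `phi` is a hypothetical choice, rapidly decaying rather than compactly supported, picked because its principal value integral is known in closed form (the odd part of ϕ(x)/x cancels, leaving the integral of exp(−x²), which is √π). Removing [−ε, 2ε] instead of [−ε, ε] should shift the answer by about ϕ(0) ln 2.

```python
import math


def phi(x):
    """Hypothetical test function: rapidly decaying, with phi(0) = 1."""
    return math.exp(-x * x) * (1.0 + x)


def integral_outside(lo, hi, cutoff=8.0, n=100_000):
    """Midpoint-rule approximation of the integral of phi(x)/x over
    [-cutoff, lo] and [hi, cutoff], where lo < 0 < hi."""
    total = 0.0
    for a, b in ((-cutoff, lo), (hi, cutoff)):
        h = (b - a) / n
        total += h * sum(phi(a + (k + 0.5) * h) / (a + (k + 0.5) * h)
                         for k in range(n))
    return total


eps = 1e-3
symmetric = integral_outside(-eps, eps)       # remove {|x| <= eps}
lopsided = integral_outside(-eps, 2 * eps)    # remove {-eps <= x <= 2 eps}

print("symmetric removal:", symmetric, " sqrt(pi):", math.sqrt(math.pi))
print("difference:", symmetric - lopsided, " ln 2:", math.log(2))
```

The difference between the two truncations does not go away as ε shrinks, which is why the symmetric interval is part of the definition.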
Cauchy principal value integrals can also be defined on $\mathbb{R}^d$ where one removes a ball of radius ε around the singularity and passes to the limit as ε → 0+.
In general, distributions on $\mathbb{R}^d$ do not necessarily have Fourier transforms, but one may always take the convolution of a distribution and a $C^\infty$ function. On the circle group, or indeed on the torus $\mathbb{T}^d$, distributions do have Fourier coefficients since the characters are $C^\infty$ functions. In fact, one may view distributions on $\mathbb{T}^d$ as objects whose Fourier coefficients have at most polynomial growth at infinity (on $\mathbb{Z}^d$).
After working with distributions for a while, one gets used to distributional derivatives. For example the derivative of f(x) = |x| is just f′(x) = sgn(x). The justification is
\[
\int_{-\infty}^{\infty} |x|\,\varphi'(x)\,dx = -\int_{-\infty}^{\infty} \operatorname{sgn}(x)\,\varphi(x)\,dx
\]
for $\varphi \in C_c^\infty(\mathbb{R})$.
7.1 The Hilbert Transform
It is a well known fact that any function in $C_0(\mathbb{R})$ can be extended to a function in $C_0$ of the halfspace in $\mathbb{R}^2$ which is harmonic in the interior of the halfspace. We prove this using the Poisson kernel
\[
P_y(x) = \frac{1}{\pi}\,\frac{y}{x^2+y^2}
\]
which is harmonic in the halfspace {(x, y); x ∈ R, y > 0}. It is also a summability kernel on R as y → 0+. Given $f \in C_0(\mathbb{R})$, the harmonic extension is
\[
\tilde f(x,y) = \int_{-\infty}^{\infty} P_y(t)f(x-t)\,dt.
\]
With the normalization $\hat f(u) = \int f(x)e^{-2\pi iux}\,dx$, the Fourier transform of $P_y$ is $\hat P_y(u) = e^{-2\pi y|u|}$. The conjugate harmonic is
\[
\int_{-\infty}^{\infty} Q_y(t)f(x-t)\,dt \quad\text{where}\quad Q_y(x) = \frac{1}{\pi}\,\frac{x}{x^2+y^2}.
\]
We note that
\[
\frac{i}{\pi}\,\frac{1}{x+iy} = P_y(x) + iQ_y(x)
\]
and that $\hat Q_y(u) = -i\operatorname{sgn}(u)e^{-2\pi y|u|}$. There are similar formulas for the extension of functions from the unit circle into the unit disk. They are
\[
P_r(t) = \frac{1-r^2}{1-2r\cos(t)+r^2}, \qquad \hat P_r(n) = r^{|n|},
\]
\[
Q_r(t) = \frac{2r\sin(t)}{1-2r\cos(t)+r^2}, \qquad \hat Q_r(n) = -i\operatorname{sgn}(n)\,r^{|n|}.
\]
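The Fourier coefficients of the two circle kernels are easy to confirm numerically. The sketch below assumes the convention that the n-th coefficient is the integral of the kernel against exp(−int) with respect to normalized Haar measure dt/(2π); the helper names are illustrative only. Because the integrands are smooth and periodic, the equispaced trapezoid rule is very accurate here.

```python
import cmath
import math


def poisson(r, t):
    """Poisson kernel P_r(t) for the unit disk."""
    return (1 - r * r) / (1 - 2 * r * math.cos(t) + r * r)


def conjugate_poisson(r, t):
    """Conjugate Poisson kernel Q_r(t) for the unit disk."""
    return 2 * r * math.sin(t) / (1 - 2 * r * math.cos(t) + r * r)


def coefficient(kernel, r, n, samples=512):
    """n-th Fourier coefficient with respect to dt/(2 pi), approximated by
    the trapezoid rule on an equispaced grid of the circle."""
    total = 0j
    for k in range(samples):
        t = 2 * math.pi * k / samples
        total += kernel(r, t) * cmath.exp(-1j * n * t)
    return total / samples


r = 0.6
for n in (-3, -1, 0, 2):
    sign = (n > 0) - (n < 0)
    print(n,
          coefficient(poisson, r, n), "vs", r ** abs(n),
          coefficient(conjugate_poisson, r, n), "vs", -1j * sign * r ** abs(n))
```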
The mapping that takes a suitable function on R to the boundary value function of the conjugate harmonic is called the Hilbert transform. If you pass to the limit y → 0+ in $Q_y(x)$ you get $\frac{1}{\pi x}$ so it may not come as a surprise that the formula for the Hilbert transform is
\[
Hf(x) = \int f(x-t)\,\frac{1}{\pi t}\,dt
\]
where of course the integral has to be interpreted as a Cauchy principal value integral. From the point of view of the Fourier transform, we have $\widehat{Hf}(u) = -i\operatorname{sgn}(u)\hat f(u)$. It should be clear from the Plancherel theorem that H is an isometry on $L^2(\mathbb{R})$. On the circle, the Hilbert transform is given by the Cauchy principal value integral
\[
Hf(t) = \int f(t-s)\cot\Bigl(\frac{s}{2}\Bigr)\,d\eta(s).
\]
The constant functions are in the kernel of H, but it still has operator norm equal to 1 on $L^2(\mathbb{T})$. In both cases, the Hilbert transform is bounded on $L^p$ for 1 < p < ∞ but not on $L^1$ nor on $L^\infty$.
The first generalizations of the Hilbert transform to several variables (i.e. functions on $\mathbb{R}^d$) are the Riesz transforms. They are given by
\[
\widehat{R_j f}(u) = i\,\frac{u_j}{|u|}\,\hat f(u) \quad\text{and}\quad R_j f(x) = \frac{\Gamma(\frac{d+1}{2})}{\pi^{\frac{d+1}{2}}} \int \frac{t_j}{|t|^{d+1}}\,f(x-t)\,dt
\]
where again the integral is to be taken in the Cauchy principal value sense, that is by removing a ball of radius ε around the singularity t = 0 and passing to the limit as ε → 0+. The Riesz transforms are also bounded on all the $L^p(\mathbb{R}^d)$ spaces for 1 < p < ∞.
7.2 Schauder estimate for the Hilbert Transform
THEOREM 101
Let f be a function of compact support that is Hölder continuous of index α for 0 < α < 1. Then its Hilbert transform g is Hölder continuous of index α on R and decays at infinity like $|x|^{-1}$.
Proof. So g is defined by a Cauchy principal value integral. Since f has compact support and is bounded, it follows easily that g decays like $|x|^{-1}$ at infinity. We would like to write
\[
g(x) = \int \frac{f(y)-f(x)}{x-y}\,dy
\]
where the singularity of the integrand at y = x is now removable since $|f(y)-f(x)| \le C|y-x|^\alpha$, but unfortunately we have introduced a new singularity at infinity since the integrand no longer has compact support. Hence, we had better write
\[
g(x) = \lim_{\ell\to\infty} \int_{|x-y|\le\ell} \frac{f(y)-f(x)}{x-y}\,dy.
\]
We will be considering $g(x_1) - g(x_2)$ so, we have
\[
g(x_1)-g(x_2) = \lim_{\ell\to\infty} \int_{|x_1-y|\le\ell} \frac{f(y)-f(x_1)}{x_1-y}\,dy - \lim_{\ell\to\infty} \int_{|x_2-y|\le\ell} \frac{f(y)-f(x_2)}{x_2-y}\,dy.
\]
But the second integral could equally well be taken over $|x_1-y| \le \ell$ since in the limit, the difference tends to zero as ℓ → ∞. This is roughly because
\[
\int_\ell^{\ell+1} \frac{dy}{y} = \ln(\ell+1)-\ln(\ell) = \ln\frac{\ell+1}{\ell} \to \ln(1) = 0
\]
as ℓ → ∞. Thus, we can write
\[
g(x_1)-g(x_2) = \lim_{\ell\to\infty} \int_{|x_1-y|\le\ell} \left( \frac{f(y)-f(x_1)}{x_1-y} - \frac{f(y)-f(x_2)}{x_2-y} \right) dy. \tag{7.1}
\]
To estimate this, we set $\delta = 2|x_1-x_2|$ and split the integral in (7.1) into A taken over $|y-x_1| \le \delta$ and B taken over $\delta < |y-x_1| \le \ell$. We have
\[
|A| \le C \int_{|y-x_1|\le\delta} \bigl( |x_1-y|^{-1+\alpha} + |x_2-y|^{-1+\alpha} \bigr)\,dy \le C\delta^\alpha \le C|x_1-x_2|^\alpha
\]
since $|y-x_1| \le \delta \Rightarrow |y-x_2| \le \frac{3\delta}{2}$. We write B as
\[
\int_{\delta<|y-x_1|\le\ell} \frac{f(x_2)-f(x_1)}{x_1-y}\,dy + \int_{\delta<|y-x_1|\le\ell} \left( \frac{1}{x_1-y} - \frac{1}{x_2-y} \right)(f(y)-f(x_2))\,dy
\]
and the first integral is zero. Therefore
\[
B = \int_{\delta<|y-x_1|\le\ell} \frac{(x_2-x_1)(f(y)-f(x_2))}{(x_1-y)(x_2-y)}\,dy
\]
and
\[
|B| \le C|x_1-x_2| \int_{\delta<|y-x_1|\le\ell} |x_1-y|^{-1}\,|x_2-y|^{-1+\alpha}\,dy.
\]
But, on the range of integration $|x_2-y| \ge \frac12|x_1-y|$, leading to
\[
|B| \le C|x_1-x_2| \int_{\delta<|y-x_1|} |x_1-y|^{-2+\alpha}\,dy \sim |x_1-x_2|\,\delta^{-1+\alpha} \sim |x_1-x_2|^\alpha.
\]
The result is proved.
An interesting point here is that the corresponding statement for continuous functions, i.e. $\|Hf\|_\infty \le C\|f\|_\infty$ for $f \in C_c(\mathbb{R})$, is false. In his notes on Sobolev spaces Tao cites this as a reason that one might be interested in functions with a fractional degree of regularity.
Theorem 101 also applies to the Riesz transforms on $\mathbb{R}^d$ and with essentially the same proof, although the details are a little more complicated. It also applies to higher order Riesz transforms. The typical higher order Riesz transform to which it applies is
\[
K(x) = \frac{x_p x_q}{|x|^{d+2}}, \qquad \hat K(u) = c_d\,\frac{u_p u_q}{|u|^2}
\]
for 1 ≤ p < q ≤ d. We will give details later. We can see already why this is important in the theory of PDE. Consider the equation
\[
\Delta f = g
\]
where $g \in C_c^\infty(\mathbb{R}^d)$. Then formally, $\hat g(u) = -4\pi^2|u|^2\hat f(u)$ and indeed
\[
\widehat{\frac{\partial^2 f}{\partial x_p\partial x_q}}(u) = c_d\,\frac{u_p u_q}{|u|^2}\,\hat g(u),
\]
that is $\frac{\partial^2 f}{\partial x_p\partial x_q}$ is a higher Riesz transform of g. Thus if g is Hölder continuous of index α (0 < α < 1), so are the mixed second order partials of the solution. Unfortunately, it does not apply to the straight second order partials since the corresponding kernel does not have the zero mean property. To settle this, we have the following Lemma.
LEMMA 102
Let $P_k$ be a homogeneous harmonic polynomial of degree k ≥ 1 in d real variables. Let $K(x) = |x|^{-d-k}P_k(x)$. The condition k ≥ 1 forces K to have the zero mean property. Let $Rf(x) = \lim_{\epsilon\to0+} \int_{|y|\ge\epsilon} f(x-y)K(y)\,dy$ be the associated singular integral operator. Then for $f \in C_c^\infty(\mathbb{R}^d)$ we have
\[
\widehat{R(f)}(u) = i^k\,\pi^{\frac d2}\,\frac{\Gamma(\frac k2)}{\Gamma(\frac{k+d}{2})}\,|u|^{-k}P_k(u)\,\hat f(u).
\]
We will not prove this lemma. We now have
THEOREM 103
Let f be a function of compact support that is Hölder continuous of index α for 0 < α < 1 on $\mathbb{R}^d$. Then its higher Riesz transform g = Rf is Hölder continuous of index α on $\mathbb{R}^d$ and decays at infinity like $|x|^{-1}$.
The proof is essentially the same as that of Theorem 101 above. The main difference is in the estimation of the integral B which in this case becomes
\[
\int_{\delta<|y-x_1|\le\ell} \bigl(K(x_1-y)-K(x_2-y)\bigr)(f(y)-f(x_2))\,dy.
\]
Since ∇K is homogeneous of degree −d − 1, we estimate
\[
|K(x_1-y)-K(x_2-y)| \le |x_1-x_2| \sup_{x+y\in L(x_1,x_2)} |\nabla K(x)| \le C|x_1-x_2|\,|x_2-y|^{-d-1}
\]
and the proof concludes much as before.
We handle straight derivatives by writing for example in $\mathbb{R}^3$
\[
u_1^2 = \frac13\bigl( |u|^2 + (2u_1^2-u_2^2-u_3^2) \bigr)
\]
and use the harmonic polynomial $P_2(u) = 2u_1^2-u_2^2-u_3^2$.
The Riesz transforms are also bounded operators on $L^p(\mathbb{R}^d)$ for 1 < p < ∞. There are quite a few technical difficulties which we will hide by stating the result in the following way.
THEOREM 104
Let K be a kernel on $\mathbb{R}^d$ such that
• $|K(x)| \le C|x|^{-d}$.
• $\int_{|x|\ge2|y|} |K(x)-K(x-y)|\,dx \le C$ for all $y \in \mathbb{R}^d \setminus \{0\}$.
• The operator $Rf(x) = \lim_{\epsilon\to0+} \int_{|x-y|\ge\epsilon} K(x-y)f(y)\,dy$ is bounded on $L^2(\mathbb{R}^d)$, specifically $\|Rf\|_2 \le C\|f\|_2$.
Then R is also bounded on $L^p(\mathbb{R}^d)$ for 1 < p ≤ 2.
Sketch proof. Since R is bounded on $L^2$, we need only show that R is of weak type (1,1) and the result will follow from the Marcinkiewicz interpolation theorem. Let $f \in L^1(\mathbb{R}^d)$. Let t > 0. Now let $\mathbb{R}^d$ be paved with a lattice of large dyadic cubes Q. The proof will follow the general line of the martingale maximal function proof, but is considerably more complicated.
Initially, the cubes are chosen so large that the averages satisfy
\[
|Q|^{-1} \int_Q |f(x)|\,dx \le t. \tag{7.2}
\]
This is possible since $f \in L^1$. We now proceed to subdivide the cubes recursively. Each cube is split into $2^d$ cubes of half the linear size. As soon as the left hand side of (7.2) is > t we stop and put that cube aside. Otherwise we subdivide ad infinitum. At this point, we have a countable collection of cubes $Q_j$ on which the process stopped. Outside these cubes we have |f(x)| ≤ t almost everywhere. This is a consequence of the martingale convergence theorem.
Now, each cube $Q_j$ has a predecessor cube of twice the linear size on which the mean was ≤ t. It follows from this that $\int_{Q_j} |f|\,dx \le 2^d t|Q_j|$. We now split f = g + b, a good function plus a bad function, where $b = \sum_j b_j$ and each $b_j$ lives on $Q_j$. Outside the union of the $Q_j$ we set g = f. On $Q_j$ we set g to be the average of f on $Q_j$ (note that actually $g = \mathbb{E}_{\mathcal{G}_\tau} f$ for a stopping time τ). Each $b_j$ is given by
\[
b_j = \mathbf{1}_{Q_j}\left( f - |Q_j|^{-1} \int_{Q_j} f\,dx \right).
\]
At this point we have
• $\|g\|_1 \le \|f\|_1$.
• $|g| \le 2^d t$.
• $\sum_j |Q_j| \le t^{-1}\|f\|_1$.
• $\sum_j \|b_j\|_1 \le 2\|f\|_1$.
• $\int_{Q_j} b_j\,dx = 0$.
Now we get $\|Rg\|_2^2 \le Ct\|f\|_1$ and hence the measure of the set where |Rg| > t/2 is $\le Ct^{-1}\|f\|_1$. Now let $Q_j^*$ be a cube with the same center $y_j$ as $Q_j$ but $2\sqrt d$ times the size in such a way that
\[
x \notin Q_j^*,\ y \in Q_j \implies |x-y_j| \ge 2|y-y_j|.
\]
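Then the total measure of the $Q_j^*$ is at most $Ct^{-1}\|f\|_1$. We can ignore that set. Now let $X = \mathbb{R}^d \setminus \bigcup_j Q_j^*$. It will suffice to show the estimate
\[
\int_X |Rb|\,dx \le C\|f\|_1
\]
or indeed
\[
\int_{\mathbb{R}^d\setminus Q_j^*} |Rb_j|\,dx \le C\|b_j\|_1.
\]
But, using the fact that $b_j$ has mean zero,
\begin{align*}
\int_{\mathbb{R}^d\setminus Q_j^*} |Rb_j|\,dx
&\le \int_{\mathbb{R}^d\setminus Q_j^*} \left| \int_{Q_j} \bigl(K(x-y)-K(x-y_j)\bigr)b_j(y)\,dy \right| dx \\
&\le \int_{Q_j} \int_{\mathbb{R}^d\setminus Q_j^*} |K(x-y)-K(x-y_j)|\,dx\,|b_j(y)|\,dy \\
&\le \int_{Q_j} \int_{|x'|\ge2|y-y_j|} |K(x'-(y-y_j))-K(x')|\,dx'\,|b_j(y)|\,dy \\
&\le C\|b_j\|_1
\end{align*}
after putting $x' = x-y_j$ and using $|x'| = |x-y_j| \ge 2|y-y_j|$. This completes the sketch.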
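The stopping-time construction of the cubes $Q_j$ and the splitting f = g + b can be illustrated in one dimension. The following is a minimal sketch under simplifying assumptions that are not in the notes: f is a step function given by its values on $2^m$ unit cells, the single top interval plays the role of the initial large cube, and subdivision stops at single cells. The function name `cz_decompose` and the sample data are hypothetical.

```python
def cz_decompose(f, t):
    """Dyadic stopping-time decomposition of a step function at height t.

    f is a list of 2**m values (one per unit cell).  Returns the list of
    stopped intervals (a, b), the good part g and the bad part b = f - g.
    """
    n = len(f)
    if n == 0 or n & (n - 1):
        raise ValueError("len(f) must be a power of two")
    if sum(abs(v) for v in f) > t * n:
        raise ValueError("the average of |f| over the top interval must be <= t")
    stopped = []

    def subdivide(lo, hi):
        if hi - lo == 1:
            return  # a single cell that was never stopped: |f| <= t here
        mid = (lo + hi) // 2
        for a, b in ((lo, mid), (mid, hi)):
            if sum(abs(v) for v in f[a:b]) > t * (b - a):
                stopped.append((a, b))  # average exceeds t: put aside
            else:
                subdivide(a, b)

    subdivide(0, n)
    good = list(f)
    for a, b in stopped:
        mean = sum(f[a:b]) / (b - a)
        good[a:b] = [mean] * (b - a)  # g is the average of f on each Q_j
    bad = [fv - gv for fv, gv in zip(f, good)]
    return stopped, good, bad


f = [0, 0, 0, 0, 0, 0, 6, 0, 0, 1, 0, 0, 0, 0, 0, 0]
stopped, good, bad = cz_decompose(f, t=1)
print("stopped intervals:", stopped)
print("good part:", good)
print("bad part:", bad)
```

In this toy setting the listed properties can be read off directly: the good part is bounded by $2^d t$ with d = 1, the bad part has mean zero on each stopped interval, and the total length of the stopped intervals is at most $t^{-1}\|f\|_1$.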
7.3 Riesz Potentials
First of all, we need the full version of the Marcinkiewicz interpolation theorem.
THEOREM 105
Let $1 \le p_0, p_1, q_0, q_1 \le \infty$, 0 < θ < 1, $q_0 \ne q_1$, $p_j \le q_j$ (j = 0, 1). Let T be a sublinear operator of weak type $(p_j, q_j)$. Then T is of strong type (p, q) where
\[
\frac1p = \frac{1-\theta}{p_0} + \frac{\theta}{p_1} \quad\text{and}\quad \frac1q = \frac{1-\theta}{q_0} + \frac{\theta}{q_1}.
\]
Here, by weak type (p, q) we mean an operator from $L^p$ to $L^{q,\infty}$ and by strong type (p, q) we mean an operator from $L^p$ to $L^q$.
THEOREM 106
Let 1 < p, q, r < ∞ and suppose that
\[
\frac1q = \frac1p - \frac1{r'} = \frac1r - \frac1{p'}.
\]
Let $K \in L^{r,\infty}$. Then $\|K * f\|_q \le C\|K\|_{r,\infty}\|f\|_p$, where the convolution can be taken over any locally compact group (but we are mainly interested in $\mathbb{R}^d$).
Proof. Without loss of generality we can work with nonnegative functions. We take $\|K\|_{r,\infty} = 1$ and $\|f\|_p = 1$ as normalizations. Then $K^*(t) \le t^{-\frac1r}$. Let τ > 0 and define A to be the set of measure τ where K is largest. We cut $K_1 = \mathbf{1}_A K$ and $K_\infty = \mathbf{1}_{A^c} K$. Then we have
\[
\|K_1\|_1 \le \int_0^\tau t^{-\frac1r}\,dt \sim \tau^{1-\frac1r}
\]
and
\[
\|K_\infty\|_{p'}^{p'} \le \int_\tau^\infty t^{-\frac{p'}{r}}\,dt \sim \tau^{1-\frac{p'}{r}}.
\]
The first of these integrals converges since r > 1 and the second since q < ∞. We now get
\[
\|K_\infty * f\|_\infty \le \|K_\infty\|_{p'}\|f\|_p \le C\tau^{\frac1{p'}-\frac1r} = \frac t2
\]
where the equality will be used to determine τ given t > 0. Then
\[
\|K_1 * f\|_p \le \|K_1\|_1\|f\|_p \le C\tau^{1-\frac1r} \sim t^{\frac{p'(1-r)}{p'-r}}.
\]
We now have
\[
|\{K * f > t\}| \le |\{K_1 * f > t/2\}| \le Ct^{-p}\|K_1 * f\|_p^p \sim t^{-p}\,t^{\frac{pp'(1-r)}{p'-r}} = t^{-q}
\]
after a lengthy computation with the indices.
This shows that convolution with K is of weak type (p, q). Using the Marcinkiewicz interpolation theorem this can now be improved to strong type (p, q).
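The computation with the indices can be checked mechanically. The sketch below uses exact rational arithmetic to confirm, for sample values of p and r, that the exponent −p + pp′(1 − r)/(p′ − r) equals −q when 1/q = 1/p − 1/r′. The function name `young_exponents` is illustrative and not taken from the notes.

```python
from fractions import Fraction


def conjugate(p):
    """Conjugate index p' defined by 1/p + 1/p' = 1 (for p > 1)."""
    return p / (p - 1)


def young_exponents(p, r):
    """Return (q, exponent) where 1/q = 1/p - 1/r' and
    exponent = -p + p p' (1 - r) / (p' - r)."""
    p, r = Fraction(p), Fraction(r)
    inv_q = 1 / p - 1 / conjugate(r)
    if inv_q <= 0:
        raise ValueError("need 1/p + 1/r > 1 so that q is finite")
    p_conj = conjugate(p)
    exponent = -p + p * p_conj * (1 - r) / (p_conj - r)
    return 1 / inv_q, exponent


for p, r in ((2, Fraction(3, 2)), (Fraction(3, 2), 2), (Fraction(5, 4), 3)):
    q, exponent = young_exponents(p, r)
    print(f"p={p}, r={r}: q={q}, exponent={exponent}")
```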
According to Stein's book, we have the Riesz potential operator
\[
I_k(f) = (-\Delta)^{-\frac k2}(f) \qquad (k \text{ real},\ 0 < k < d) \tag{7.3}
\]
and
\[
I_k(f)(x) = \frac{1}{\gamma(k)} \int |y|^{-d+k} f(x-y)\,dy.
\]
The kernel $K(y) = |y|^{-d+k}$ is locally integrable in the range given. We have
\[
\gamma(k) = \frac{\pi^{d/2}\,2^k\,\Gamma(k/2)}{\Gamma((d-k)/2)}.
\]
The meaning of (7.3) is
\[
\widehat{I_k(f)}(u) = (2\pi|u|)^{-k}\hat f(u).
\]
These statements are formal in the first instance, but can be verified for functions f in the Schwartz class $S(\mathbb{R}^d)$ with some difficulty. The Schwartz class consists of functions that are infinitely differentiable and such that derivatives of all orders are bounded when multiplied by polynomials.
Applying the last theorem we have
PROPOSITION 107 (HARDY–LITTLEWOOD–SOBOLEV LEMMA)
$I_k$ extends to a bounded operator $L^p(\mathbb{R}^d) \longrightarrow L^q(\mathbb{R}^d)$ where 0 < k < d, p > 1, q < ∞ and
\[
\frac1q = \frac1p - \frac kd.
\]
7.4 Sobolev Spaces
We start by studying Sobolev spaces on $\mathbb{R}^d$. They may also be defined on open subsets of $\mathbb{R}^d$ and also on differentiable manifolds.
Basically, the space $W^{k,p}(\mathbb{R}^d)$ consists of all functions f which together with all derivatives of order ≤ k lie in $L^p(\mathbb{R}^d)$. The derivatives have to be taken in the weak sense. So, explicitly, this means that for every multiindex $\alpha = (\alpha_1,\ldots,\alpha_d)$ with |α| ≤ k there exists a function $g_\alpha \in L^p$ with the property that
\[
\int f(x)\,\frac{\partial^{|\alpha|}\varphi}{\partial x_1^{\alpha_1}\cdots\partial x_d^{\alpha_d}}(x)\,dx = (-1)^{|\alpha|} \int g_\alpha(x)\varphi(x)\,dx.
\]
While there are many possible equivalent norms on $W^{k,p}$ we would typically take
\[
\|f\|_{W^{k,p}}^p = \sum_{0\le|\alpha|\le k} \|g_\alpha\|_p^p
\]
and this defines $W^{k,p}$ as a Banach space. In the case p = 2 the above norm would correspond to a Hilbert space.
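PROPOSITION 108
Let 1 ≤ p < ∞. Then the space $C_c^\infty(\mathbb{R}^d)$ is dense in $W^{k,p}(\mathbb{R}^d)$.
Proof. We choose a nonnegative function ϕ in $C_c^\infty(\mathbb{R}^d)$ with integral 1. Let $\varphi_\epsilon(x) = \epsilon^{-d}\varphi(\epsilon^{-1}x)$, so that $\varphi_\epsilon$ is a summability kernel as ε → 0+. Then it is clear that for $f \in W^{k,p}$, $\varphi_\epsilon * f \to f$ in $W^{k,p}$-norm. This is essentially because convolution with $\varphi_\epsilon$ and partial differentiations commute. The resulting function g in this approximation is $C^\infty$, but it does not necessarily have compact support. To fix this, we take another nonnegative function ψ in $C_c^\infty(\mathbb{R}^d)$ with ψ(0) = 1. Let $\psi_\epsilon(x) = \psi(\epsilon x)$ and define $g_\epsilon(x) = g(x)\psi_\epsilon(x)$. Then, we obtain
\[
\partial^\alpha g_\epsilon(x) = \sum_\beta c_{\alpha,\beta}\,\partial^\beta g(x)\,\epsilon^{|\alpha|-|\beta|}\,(\partial^{\alpha-\beta}\psi)(\epsilon x)
\]
for some constants $c_{\alpha,\beta}$ and where the sum is taken over multiindices β such that 0 ≤ β ≤ α. Thus, it follows that
\[
\partial^\alpha g_\epsilon(x) - \partial^\alpha g(x)\,\psi_\epsilon(x) = \sum_{\beta\ne\alpha} c_{\alpha,\beta}\,\partial^\beta g(x)\,\epsilon^{|\alpha|-|\beta|}\,(\partial^{\alpha-\beta}\psi)(\epsilon x).
\]
For the terms on the right, $\partial^\beta g$ is in $L^p$, $\partial^{\alpha-\beta}\psi$ is in $L^\infty$ and |α| − |β| ≥ 1 so the right hand side tends to zero in $L^p$ norm. But $\partial^\alpha g\,\psi_\epsilon$ tends to $\partial^\alpha g$ and the result follows.
Note that this result will not hold in the context of $W^{k,p}(\Omega)$ for Ω an open subset of $\mathbb{R}^d$.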
A corollary of this result is that for $f \in W^{k,p}(\mathbb{R}^d)$ any kth order (or less) directional derivative is in $L^p$ (in the weak sense).
Now let 1 < p < ∞. The statement $f \in W^{k,p}(\mathbb{R}^d)$ is roughly equivalent to $\prod_j (2\pi iu_j)^{\alpha_j}\hat f(u)$ is in $\mathcal{F}L^p$. We can apply Hilbert transforms in each variable separately to show that $\prod_j |u_j|^{\alpha_j}\hat f(u)$ is in $\mathcal{F}L^p$. Indeed, similarly for any unit vector v we have that $|v\cdot u|^\ell\hat f(u)$ is in $\mathcal{F}L^p$ for 0 ≤ ℓ ≤ k. Next average over all v in the unit sphere and we obtain that $|u|^\ell\hat f(u)$ is in $\mathcal{F}L^p$ for 0 ≤ ℓ ≤ k.
Conversely, if $|u|^\ell\hat f(u)$ is in $\mathcal{F}L^p$, then by applying suitable Riesz transforms we find that $\prod_j (2\pi iu_j)^{\alpha_j}\hat f(u)$ is in $\mathcal{F}L^p$ for every α with 0 ≤ |α| ≤ k.
Thus, for 1 < p < ∞ we have the following characterization of $W^{k,p}(\mathbb{R}^d)$. We have $f \in W^{k,p}(\mathbb{R}^d)$ if and only if there exist functions $g_\ell \in L^p$ for 0 ≤ ℓ ≤ k such that $f = I_\ell(g_\ell)$ with $I_\ell$ the Riesz potential of order ℓ.
With a little more work one can establish the following. We have $f \in W^{k,p}(\mathbb{R}^d)$ if and only if there exists a function $g \in L^p$ such that $f = J_k(g)$ with $J_k$ the Bessel potential of order k. The Bessel potential has
\[
\widehat{J_k(f)}(u) = (1+4\pi^2|u|^2)^{-\frac k2}\hat f(u).
\]
The kernel for the Bessel potential has no simple formula, but it is a positive radial function with a singularity like $|x|^{-d+k}$ at the origin and exponential decay at infinity. Since the Bessel potential is perfectly good when k is nonintegral, this allows a way to define $W^{k,p}(\mathbb{R}^d)$ for 1 < p < ∞ and k nonnegative. In case p = 2 this is a Hilbert space with the norm being
\[
\|f\|_{W^{k,2}}^2 = \int (1+|u|^2)^k\,|\hat f(u)|^2\,du.
\]
THEOREM 109 (SOBOLEV EMBEDDING THEOREM)
Let 1 < p, q < ∞, $\frac1q = \frac1p - \frac kd$. Then $W^{k,p}(\mathbb{R}^d) \subseteq L^q(\mathbb{R}^d)$.
Proof. Let $f \in W^{k,p}(\mathbb{R}^d)$. We apply the Hardy–Littlewood–Sobolev Lemma to the gadget $g \in L^p$ with Fourier transform $\hat g(u) = |u|^k\hat f(u)$.
We also have the following.
THEOREM 110
Let 1 < p < ∞, 0 < β < 1 and $-\frac\beta d = \frac1p - \frac kd$. Then any function in $W^{k,p}(\mathbb{R}^d)$ is Hölder continuous of order β.
Proof. The proof is as above. We have for $f \in W^{k,p}(\mathbb{R}^d)$,
\[
f(x) = c_k \int |x-y|^{-d+k}g(y)\,dy
\]
where $g \in L^p$. Thus
\[
|f(x_1)-f(x_2)| \le c_k \int \Bigl| |x_1-y|^{-d+k} - |x_2-y|^{-d+k} \Bigr|\,|g(y)|\,dy.
\]
We simply need to show that
\[
\int \Bigl| |x_1-y|^{-d+k} - |x_2-y|^{-d+k} \Bigr|^{p'}\,dy \le C|x_1-x_2|^{\beta p'}
\]
or by translation invariance that
\[
\int \Bigl| |y|^{-d+k} - |y-x|^{-d+k} \Bigr|^{p'}\,dy \le C|x|^{\beta p'}.
\]
One checks that the homogeneities are correct and then that the integral on the left converges at y = 0, y = x and y near infinity.
More generally, one may assert that if $f \in W^{k,p}(\mathbb{R}^d)$, then $\partial^\alpha f$ is Hölder continuous of order β for
\[
1 < p < \infty, \qquad 0 < \beta < 1, \qquad \frac1p - \frac kd = -\frac{|\alpha|+\beta}{d} \quad\text{and}\quad 0 \le |\alpha| \le k.
\]
Alternative proofs of many of these results can be obtained using some tricks of Gagliardo, Loomis and Nirenberg. These even handle the case p = 1. We start with the following bizarre lemma.
LEMMA 111
Let d ≥ 2 and $f_j$ be functions in $L^p(\mathbb{R}^{d-1})$ for j = 1, . . . , d. Let $\pi_j : \mathbb{R}^d \longrightarrow \mathbb{R}^{d-1}$ be the mapping that forgets the jth coordinate. Let $F = \prod_{j=1}^d f_j\circ\pi_j$. Then
\[
\|F\|_{\frac{p}{d-1}} \le \prod_{j=1}^d \|f_j\|_p.
\]
Proof. The proof is by induction on d. In the case d = 2, we have essentially $F(x_1,x_2) = f_1(x_2)f_2(x_1)$ and the result is obvious.
We illustrate the induction step by deducing the case d = 4 from the case d = 3. So,
\[
F(x_1,x_2,x_3,x_4) = f_1(x_2,x_3,x_4)\,f_2(x_1,x_3,x_4)\,f_3(x_1,x_2,x_4)\,f_4(x_1,x_2,x_3).
\]
Now for j = 1, 2, 3 we view $f_j$ as a function $g_j$ in the $x_4$ variable taking values in $L^p(\mathbb{R}^2)$. As such, its $L^p$ norm is just the $L^p$ norm of $f_j$. Applying the induction hypothesis, we find that for fixed $x_4$, the norm of the function
\[
(x_1,x_2,x_3) \mapsto f_1(x_2,x_3,x_4)\,f_2(x_1,x_3,x_4)\,f_3(x_1,x_2,x_4)
\]
in $L^{\frac p2}$ is at most $\prod_{j=1}^3 \|g_j(x_4)\|_p$. Therefore, the function
\[
g(x_4; x_1,x_2,x_3) = f_1(x_2,x_3,x_4)\,f_2(x_1,x_3,x_4)\,f_3(x_1,x_2,x_4)
\]
is in $L^{\frac p3}$ of the $x_4$ variable, taking values in $L^{\frac p2}$ of the $x_1, x_2, x_3$ variables. To get F, we multiply g by $f_4$ which is in $L^\infty$ of the $x_4$ variable, taking values in $L^p$ of the $x_1, x_2, x_3$ variables. The result is a function in $L^{\frac p3}$ of the $x_4$ variable, taking values in $L^{\frac p3}$ of the $x_1, x_2, x_3$ variables, that is a function in $L^{\frac p3}(\mathbb{R}^4)$.
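The lemma can be tried out on a discrete analogue. The sketch below replaces $\mathbb{R}^3$ by a finite grid with counting measure (an assumption made here for illustration; the same product-measure argument applies), takes d = 3 and p = 2, and compares the $L^1$ norm of F with the product of the $L^2$ norms of three random nonnegative functions of two variables. The function name is hypothetical.

```python
import math
import random


def loomis_whitney_sides(n, seed=0):
    """Return (lhs, rhs) for d = 3, p = 2 on an n x n x n grid:
    lhs = sum over (x, y, z) of f1(y, z) f2(x, z) f3(x, y),
    rhs = product of the l^2 norms of f1, f2, f3."""
    rng = random.Random(seed)

    def random_grid():
        return [[rng.random() for _ in range(n)] for _ in range(n)]

    def l2_norm(f):
        return math.sqrt(sum(v * v for row in f for v in row))

    f1, f2, f3 = random_grid(), random_grid(), random_grid()
    lhs = sum(f1[y][z] * f2[x][z] * f3[x][y]
              for x in range(n) for y in range(n) for z in range(n))
    return lhs, l2_norm(f1) * l2_norm(f2) * l2_norm(f3)


for seed in range(3):
    lhs, rhs = loomis_whitney_sides(6, seed)
    print(f"seed {seed}: lhs = {lhs:.4f} <= rhs = {rhs:.4f}")
```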
PROPOSITION 112
Let $h \in W^{1,1}(\mathbb{R}^d)$ (quantitatively) and also in $C^1(\mathbb{R}^d)$ (qualitatively). Then $h \in L^{\frac{d}{d-1}}(\mathbb{R}^d)$.
Indeed, since $C^1(\mathbb{R}^d)$ is dense in $W^{1,1}(\mathbb{R}^d)$, we see that $W^{1,1}(\mathbb{R}^d) \subseteq L^{\frac{d}{d-1}}(\mathbb{R}^d)$ at least in the weak sense by extension by continuity.
Proof. We have for every j = 1, . . . , d that
\[
|h(x)| \le \int_{-\infty}^{\infty} \left| \frac{\partial h}{\partial x_j}(x_1,\ldots,\underbrace{t}_{j\text{th place}},\ldots,x_d) \right| dt. \tag{7.4}
\]
We let $f_j(x)$ equal the right hand side of (7.4), a function of all the variables except $x_j$. This function is in $L^1(\mathbb{R}^{d-1})$ since $h \in W^{1,1}(\mathbb{R}^d)$. We now obtain
\[
|h(x)| \le \prod_{j=1}^d f_j(x)^{\frac1d}
\]
and $f_j^{\frac1d} \in L^d(\mathbb{R}^{d-1})$. Therefore by the last lemma, $h \in L^{\frac{d}{d-1}}(\mathbb{R}^d)$, as required.
In fact, more is true: $W^{1,1}(\mathbb{R}^d) \subseteq L^{\frac{d}{d-1},1}(\mathbb{R}^d)$. This is a consequence of the coarea formula and the isoperimetric inequality . . .
We can now extend to
THEOREM 113
Let $h \in W^{1,p}(\mathbb{R}^d)$ (quantitatively) and also in $C^\infty(\mathbb{R}^d)$ (qualitatively). Then $h \in L^q(\mathbb{R}^d)$ for
\[
\frac1p - \frac1d = \frac1q \tag{7.5}
\]
assuming p ≥ 1 and q < ∞.
Proof. This has just been proved for p = 1, so we assume p > 1. Let $g = h|h|^\alpha$ with α > 0. Then we get $\nabla g = (1+\alpha)|h|^\alpha\nabla h$ and indeed g is in $C^1(\mathbb{R}^d)$. We get
\[
\int |\nabla g|\,dx \le (1+\alpha)\,\|\nabla h\|_p\,\bigl\||h|^\alpha\bigr\|_{p'}
\]
from Hölder's inequality. We will choose αp′ = q, so that in fact this reads
\[
\int |\nabla g|\,dx \le (1+\alpha)\,\|\nabla h\|_p\,\|h\|_q^\alpha.
\]
Applying the previous proposition, we obtain
\[
\|h\|_q^{1+\alpha} = \|g\|_{\frac{d}{d-1}} \le (1+\alpha)\,\|\nabla h\|_p\,\|h\|_q^\alpha
\]
where $(1+\alpha)\frac{d}{d-1} = q$. It's easy to check that these conditions are consistent and amount to (7.5).
This proof is easily pushed by induction to give $W^{k,p}(\mathbb{R}^d) \subseteq L^q(\mathbb{R}^d)$ for
\[
\frac1p - \frac kd = \frac1q
\]
for k an integer ≥ 1, p ≥ 1 and q < ∞.
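The consistency of the two conditions on α in the proof of Theorem 113 can be confirmed with exact arithmetic. The following sketch (the function name is illustrative, not from the notes) computes q from (7.5), sets α = q/p′ and checks that (1 + α)d/(d − 1) = q as the proof requires.

```python
from fractions import Fraction


def gagliardo_exponents(p, d):
    """For 1 < p < d return (q, alpha) with 1/q = 1/p - 1/d and alpha p' = q,
    after checking the second condition (1 + alpha) d / (d - 1) = q."""
    p, d = Fraction(p), Fraction(d)
    if not 1 < p < d:
        raise ValueError("need 1 < p < d so that q is finite")
    q = 1 / (1 / p - 1 / d)
    p_conj = p / (p - 1)
    alpha = q / p_conj
    if (1 + alpha) * d / (d - 1) != q:
        raise AssertionError("the two conditions on alpha are inconsistent")
    return q, alpha


for p, d in ((2, 3), (Fraction(3, 2), 2), (3, 5)):
    print(f"p={p}, d={d}: (q, alpha) = {gagliardo_exponents(p, d)}")
```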
We may also define Sobolev spaces on an open subset of Rd . The space
W k,p (Ω) consists of all functions f such that ∂ α f ∈ Lp (Ω) for all α with |α| ≤ k.
The $L^p$ norm is taken with respect to Lebesgue measure restricted to Ω.
PROPOSITION 114
Let Ω be an open convex subset of $\mathbb{R}^d$ with $C^1$ boundary, d < p < ∞ and $p^{-1} - d^{-1} = -\beta d^{-1}$. Then $W^{1,p}(\Omega) \subseteq C^{0,\beta}(\Omega)$, the space of Hölder continuous functions of order β on Ω.
Proof. We clearly have 0 < β < 1. So given two points $x_1, x_2$ of Ω, it suffices to show that $|f(x_1)-f(x_2)| \le C\|\nabla f\|_p|x_1-x_2|^\beta$ and indeed, it is enough to show this for $|x_1-x_2|$ sufficiently small. Consider a family of paths joining $x_2$ to $x_1$
\[
x_u(t) = \frac12(x_1+x_2) + \frac12(x_1-x_2)\sin(t) + \frac12|x_1-x_2|\,u\cos(t)
\]
as u runs over vectors of norm ≤ 1 orthogonal to $x_1-x_2$ and such that u · n > 0 where n is a suitable unit vector, typically the inward unit normal to ∂Ω near the points in question. These paths lie entirely inside Ω. We get
\[
f(x_1)-f(x_2) = \int_{-\frac\pi2}^{\frac\pi2} \nabla f(x_u(t))\cdot x_u'(t)\,dt
\]
leading to
\[
|f(x_1)-f(x_2)| \le \int_{-\frac\pi2}^{\frac\pi2} |\nabla f(x_u(t))|\,|x_u'(t)|\,dt \le C|x_1-x_2| \int_{-\frac\pi2}^{\frac\pi2} |\nabla f(x_u(t))|\,dt.
\]
As u varies, these paths fill out a solid hemisphere H. Therefore, averaging over u in a suitable way, we get
\[
|f(x_1)-f(x_2)| \le C\,\frac{|x_1-x_2|}{\operatorname{meas}(H)} \int_H |\nabla f(y)|\,dy
\]
and hence
\[
|f(x_1)-f(x_2)| \le C|x_1-x_2|^{1-d}\,\|\nabla f\|_p\,\operatorname{meas}(H)^{\frac1{p'}} \le C|x_1-x_2|^{1-\frac dp}\,\|\nabla f\|_p
\]
as required.
By replacing the hemispheres with more complicated blobs, one may hope to
extend this argument to more general open sets.
We state a few results without proof.
THEOREM 115
Let Ω be a Lipschitz domain in $\mathbb{R}^d$. Basically, this means that the boundary can be specified locally as the graphs of Lipschitz functions suitably oriented. Let 1 ≤ p ≤ ∞ and k ∈ N. Then there is a continuous extension operator
\[
T : W^{k,p}(\Omega) \longrightarrow W^{k,p}(\mathbb{R}^d)
\]
with the extension property $T(f)|_\Omega = f$.
In the case $1 < p < \infty$, where we may define $W^{k,p}(\mathbb{R}^d)$ for $k > 0$ nonintegral,
we may also define $W^{k,p}(\Omega)$ as the quotient of this space by restriction. In the
particular case $p = 2$, the resulting space $W^{k,2}(\Omega)$ will be a Hilbert space.
PROPOSITION 116
Let $k \in \mathbb{N}$ and $1 \le p < \infty$. For any open $\Omega \subseteq \mathbb{R}^d$, $C^\infty(\Omega) \cap W^{k,p}(\Omega)$ is dense in $W^{k,p}(\Omega)$.
Sketch proof. We construct a sequence $\varphi_j$ of $C^\infty$ functions of compact support with
\[ \operatorname{supp}(\varphi_j) \subseteq \Omega_j \subset \operatorname{cl}(\Omega_j) \subset \Omega, \]
with $\sum_{j=1}^\infty \varphi_j = 1_\Omega$ and with the property that every $x \in \Omega$ has a neighbourhood
that meets only finitely many of the $\Omega_j$. Typically, the $\Omega_j$ are "shells" tending
to the boundary of $\Omega$. Now let $f \in W^{k,p}(\Omega)$ be the function that we wish to
approximate. Let $\epsilon > 0$ and $\epsilon_j = 2^{-j}\epsilon$. Then $f\varphi_j$ is certainly in $W^{k,p}(\Omega)$. We
now find nonnegative $C^\infty$ functions $\psi_j$ with integral equal to one (basically a
convolution approximate identity). The additional requirements on $\psi_j$ are that
$\|f\varphi_j - (f\varphi_j) * \psi_j\|_{W^{k,p}} < \epsilon_j$ and $\operatorname{supp}((f\varphi_j) * \psi_j) \subseteq \Omega_j$. The approximation is
possible since partial derivatives commute with convolutions.
The approximating function $g$ is taken to be
\[ g = \sum_{j=1}^\infty (f\varphi_j) * \psi_j. \tag{7.6} \]
We have $\|f - g\|_{W^{k,p}} \le \sum_{j=1}^\infty \|f\varphi_j - (f\varphi_j) * \psi_j\|_{W^{k,p}} < \epsilon$. Each individual
$(f\varphi_j) * \psi_j$ is $C^\infty$ since $f\varphi_j$ is an $L^p$ function of compact support and $\psi_j$ is in $C_c^\infty$.
The sum (7.6) is $C^\infty$ since locally it is only a finite sum of $C^\infty$ functions.
Note that there is a related topological concept called paracompactness (every
open cover has a locally finite refinement). The concept of refinement of a covering is different from the concept of subcovering in that it allows each open set in
the cover to be replaced by a (possibly) smaller open set. Every metric space is
paracompact. The concept is defined to enable the construction of resolutions of
the identity on manifolds.
LEMMA 117 (THE RELLICH LEMMA)
Let $\Omega \subseteq \mathbb{R}^d$ be a bounded open set
with $C^1$ boundary. Let $1 \le p < d$ and $1 \le q < \infty$ with $q^{-1} > p^{-1} - d^{-1}$. Then the inclusion
mapping of $W^{1,p}(\Omega)$ into $L^q(\Omega)$ is not merely continuous (as may be deduced
from Theorem 115) but is also a compact operator.
8
Symbolic Calculus of Hilbert space operators
8.1
Spectral theory of normal operators and projection measures
A ∗-algebra $A$ is an algebra with a ∗ operation, that is for every element $x \in A$,
there is an element $x^*$. We ask that $x^{**} = x$, that $x \mapsto x^*$ is conjugate linear, that
$(xy)^* = y^*x^*$ and also that $\|x^*\| = \|x\|$. A $C^*$ algebra is a Banach algebra with
a ∗ operation as above and the additional property that $\|x^*x\| = \|x\|^2$. In fact, it can be shown that every
unital $C^*$ algebra is isomorphic with a closed ∗-subalgebra of the algebra $B(H)$ of all bounded operators on a
Hilbert space $H$. Once you know this theorem, then you might as well be dealing
with closed ∗-subalgebras of $B(H)$.
What we are going to do in this section is to use the Gelfand theory to discuss
the spectral theory of normal operators on Hilbert space. If T is such a normal
operator, then the closed algebra generated by T and its adjoint T ∗ is a commutative closed subalgebra of B(H) with identity and hence a commutative C ∗ algebra
with identity. Some of the results that follow will apply to general commutative
C ∗ algebras.
Every element x ∈ A can be written in the form
\[ x = \tfrac{1}{2}(x + x^*) - i\,\tfrac{1}{2}\big((ix) + (ix)^*\big), \]
so in particular in the form $x = x_1 + ix_2$ where $x_j^* = x_j$ for $j = 1, 2$. If $x^* = x$ we
say that $x$ is hermitian or self-adjoint. If $x$ is hermitian and $\varphi$ is an mlf on $A$, then
$\varphi(x)$ is real. To see this, we let for $t$ real $u_t = \exp(itx)$ given by a convergent
power series. We get
\[ \|u_t\|^2 = \|u_t^* u_t\| = \|\exp(-itx)\exp(itx)\| = \|\exp(0)\| = 1 \]
using the commutativity to collapse the product of exponentials. It follows that
\[ |\exp(it\varphi(x))| = |\varphi(u_t)| \le \|u_t\| = 1 \]
for all real $t$. Thus, $\varphi(x)$ must be real. It follows from this that for general $x$,
\[ \varphi(x^*) = \overline{\varphi(x)}. \tag{8.1} \]
Also, if $x$ is hermitian, we have $\|x\|^2 = \|x^2\|$. Replacing $x$ by $x^2$ this gives
$\|x\|^4 = \|x^2\|^2 = \|x^4\|$ and an easy induction gives $\|x\|^{2^n} = \|x^{2^n}\|$. The spectral
radius formula now gives $\|\hat{x}\|_\infty = \|x\|$.
For a general element $x \in A$, we have
\[ \|\hat{x}\|_\infty^2 \le \|x\|^2 = \|x^*x\| = \|\widehat{x^*x}\|_\infty \le \|\hat{x}\|_\infty^2 \]
since $x^*x$ is hermitian. Thus the Gelfand transform is an isometry and in particular
injective. But the algebra of Gelfand transforms is a complete uniform separating
subalgebra of $C(M_A)$ with identity, closed under complex conjugation by (8.1), and by the Stone–Weierstrass theorem it must
be the whole of $C(M_A)$. In the case that $A$ is generated by a normal $T$ and its
adjoint $T^*$ we see that the mapping $M_A$ to $\operatorname{sp}(T)$ given by $\varphi \mapsto \varphi(T)$ is one-to-one. For, if $\varphi_1(T) = \varphi_2(T)$, then $\varphi_1(T^*) = \varphi_2(T^*)$ and hence $\varphi_1 = \varphi_2$ since $T$
and $T^*$ generate $A$. Thus, we may identify $M_A$ with $\operatorname{sp}(T)$. Now for a continuous
function $\theta$ on $\operatorname{sp}(T)$ and for $\xi, \eta \in H$, we consider the map $\theta \mapsto \langle \eta, \theta(T)\xi \rangle$. This
is a continuous linear form on $C(\operatorname{sp}(T))$ and hence is given by integration against
a measure $\mu_{\eta,\xi}$. We have
\[ \langle \eta, \theta(T)\xi \rangle = \int_{\operatorname{sp}(T)} \theta(z)\,d\mu_{\eta,\xi}(z). \]
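Note that $\|\mu_{\eta,\xi}\| \le \|\eta\|\,\|\xi\|$. Also, the map
\[ (\eta, \xi) \mapsto \mu_{\eta,\xi} \]
is linear in $\xi$ and conjugate linear in $\eta$. We note also that $\overline{\theta}(T) = \theta(T)^*$. This is
another way of writing (8.1). Then we have
\[ \int_{\operatorname{sp}(T)} \overline{\theta}\,d\mu_{\eta,\xi} = \langle \eta, \overline{\theta}(T)\xi \rangle = \langle \eta, \theta(T)^*\xi \rangle = \langle \theta(T)\eta, \xi \rangle = \overline{\langle \xi, \theta(T)\eta \rangle} = \overline{\int_{\operatorname{sp}(T)} \theta\,d\mu_{\xi,\eta}} \]
which confirms that $\overline{\mu_{\xi,\eta}} = \mu_{\eta,\xi}$.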
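Hence, for $\theta$ now a bounded Borel function on $\operatorname{sp}(T)$ we may infer the existence of an operator $M(\theta)$ such that
\[ \langle \eta, M(\theta)\xi \rangle = \int_{\operatorname{sp}(T)} \theta(z)\,d\mu_{\eta,\xi}(z) \]
and it follows that $M(\overline{\theta}) = M(\theta)^*$ by reversing the argument above. If $\theta, \psi \in C(\operatorname{sp}(T))$ we get
\[ M(\theta\psi) = M(\theta)M(\psi) \tag{8.2} \]
and this yields
\[ \int_{\operatorname{sp}(T)} \theta\psi\,d\mu_{\eta,\xi} = \int_{\operatorname{sp}(T)} \theta\,d\mu_{\eta,M(\psi)\xi}. \]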
This can now be extended to all θ bounded Borel on sp(T ). So (8.2) holds for
all θ bounded Borel and all ψ continuous. Replaying the argument with θ and ψ
interchanged, we see that (8.2) holds for all θ, ψ bounded Borel.
Next, take a Borel subset $X$ of $\operatorname{sp}(T)$ and put $P(X) = M(1_X)$. Then $P(X)$
is a hermitian operator on $H$ and $P(X)^2 = P(X)$, i.e. $P(X)$ is a hermitian
projection.
We think of $P$ as a projection-valued measure and we write symbolically
\[ T = \int_{\operatorname{sp}(T)} z\,dP(z). \]
The symbolic calculus of a normal operator $T$ now amounts to
\[ \theta(T) = \int_{\operatorname{sp}(T)} \theta(z)\,dP(z) \]
and is valid for all bounded Borel functions $\theta$ on $\operatorname{sp}(T)$.
Note that
\[ P(X \cap Y) = M(1_{X \cap Y}) = M(1_X 1_Y) = M(1_X)M(1_Y) = P(X)P(Y) \]
for $X$ and $Y$ Borel sets.
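A finite-dimensional toy case may make the projection-valued measure concrete. The sketch below assumes that $T$ is a diagonal matrix on $\mathbb{C}^4$, so that $\operatorname{sp}(T)$ is the set of diagonal entries and $P(X)$ is the diagonal 0/1 matrix selecting the eigenvalues lying in $X$. The helper names (`spectral_projection`, `diag_product`, `functional_calculus`) are ours and are not part of the notes.

```python
def spectral_projection(eigenvalues, in_X):
    """Diagonal of P(X) for T = diag(eigenvalues): 1 where the eigenvalue lies in X, else 0."""
    return [1 if in_X(lam) else 0 for lam in eigenvalues]

def diag_product(a, b):
    """Product of two diagonal matrices, each stored as its list of diagonal entries."""
    return [x * y for x, y in zip(a, b)]

def functional_calculus(eigenvalues, theta):
    """Diagonal of theta(T), computed as the sum over z in sp(T) of theta(z) P({z})."""
    result = [0] * len(eigenvalues)
    for z in set(eigenvalues):
        p = spectral_projection(eigenvalues, lambda lam, z=z: lam == z)
        result = [r + theta(z) * entry for r, entry in zip(result, p)]
    return result

eigenvalues = [1j, 2, 2, -1]       # T = diag(i, 2, 2, -1), a normal operator on C^4
in_X = lambda z: z.real >= 0       # X = closed right half-plane
in_Y = lambda z: abs(z) <= 1       # Y = closed unit disk

P_X = spectral_projection(eigenvalues, in_X)
P_Y = spectral_projection(eigenvalues, in_Y)
P_XY = spectral_projection(eigenvalues, lambda z: in_X(z) and in_Y(z))
```

In this toy case `diag_product(P_X, P_Y)` agrees with `P_XY`, and `functional_calculus(eigenvalues, lambda z: z)` recovers the diagonal of $T$, mirroring $T = \int z\,dP(z)$.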
Next, we justify that $P$ is in some sense a measure. The weak operator topology on $B(H)$ is defined by the seminorms $T \mapsto |\langle \eta, T(\xi) \rangle|$ as $\eta$ and $\xi$ run over $H$.
The strong operator topology on $B(H)$ is defined by the seminorms $T \mapsto \|T(\xi)\|$
as $\xi$ runs over $H$. Both of these locally convex space topologies are weaker than
the operator norm topology.
It is clear that if $X_k$ are disjoint Borel subsets of $\operatorname{sp}(T)$ and $Y_n = \cup_{k=1}^n X_k$ then
$P(Y_n) = \sum_{k=1}^n P(X_k)$. Let $Y = \cup_{k=1}^\infty X_k$. Then it is also clear that $P(Y_n) \to P(Y)$
in the weak operator topology. This follows since the $\mu_{\eta,\xi}$ are countably
additive measures. Now
\begin{align*}
\|(P(Y) - P(Y_n))\xi\|^2 &= \langle (P(Y) - P(Y_n))\xi, (P(Y) - P(Y_n))\xi \rangle \\
&= \langle \xi, (P(Y) - P(Y_n))^2 \xi \rangle \\
&= \langle \xi, (P(Y) - P(Y_n))\xi \rangle \to 0
\end{align*}
as $n \to \infty$ since
\begin{align*}
(P(Y) - P(Y_n))^2 &= P(Y)^2 - P(Y)P(Y_n) - P(Y_n)P(Y) + P(Y_n)^2 \\
&= P(Y) - P(Y_n) - P(Y_n) + P(Y_n) = P(Y) - P(Y_n).
\end{align*}
Thus we have
\[ P\left(\bigcup_{k=1}^\infty X_k\right) = \sum_{k=1}^\infty P(X_k) \]
with the sum on the right converging in the strong operator topology.
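The convergence here is strong and in general not in operator norm. A minimal numerical sketch, under the assumption that $T = \operatorname{diag}(1, 1/2, 1/3, \dots)$ on $\ell^2$ truncated to its first `N` coordinates, with $X_k = \{1/k\}$ so that $P(Y) - P(Y_n)$ kills the first $n$ coordinates: the norm of $(P(Y) - P(Y_n))\xi$ shrinks for a fixed vector, while the basis vector $e_{n+1}$ shows that the operator norm of $P(Y) - P(Y_n)$ stays equal to 1. All names below are illustrative.

```python
import math

N = 2000  # truncation level: l^2 is modelled by its first N coordinates (an assumption)
xi = [1.0 / k for k in range(1, N + 1)]  # a fixed square-summable vector

def apply_tail_projection(vec, n):
    """Apply P(Y) - P(Y_n): zero out the first n coordinates and keep the rest."""
    return [0.0] * n + vec[n:]

def norm(vec):
    """Euclidean norm of a finite vector."""
    return math.sqrt(sum(abs(v) ** 2 for v in vec))

def basis_vector(index):
    """Standard basis vector e_index (1-based) in the truncated space."""
    return [1.0 if k == index else 0.0 for k in range(1, N + 1)]

# Norms of (P(Y) - P(Y_n)) xi for increasing n: these decrease towards 0.
tail_norms = [norm(apply_tail_projection(xi, n)) for n in (10, 100, 1000)]

# The unit vector e_{n+1} is fixed by P(Y) - P(Y_n), so the operator norm is 1.
witness_norm = norm(apply_tail_projection(basis_vector(1001), 1000))
```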
8.2
Symbolic Calculus for Hilbert space contractions
We give the amazingly short proof of von Neumann's inequality due to John Wermer. There are many proofs of the result.
THEOREM 118
Let $T$ be a linear contraction on a Hilbert space $H$. Let $p(z) = \sum_{k=0}^n p_k z^k$
be a polynomial with complex coefficients. The result $p(T) = \sum_{k=0}^n p_k T^k$ of substituting
$T$ into the polynomial satisfies $\|p(T)\|_{op} \le \sup_{|z| \le 1} |p(z)|$.
A consequence is that $f(T)$ can be defined and satisfies $\|f(T)\|_{op} \le \sup_{|z| \le 1} |f(z)|$
for any function $f$ in the disc algebra $A(D)$.
Proof. The first step is to reduce to the case of a finite-dimensional Hilbert space.
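Obviously, we need only prove
\[ \left| \left\langle \eta, \sum_{k=0}^n p_k T^k \xi \right\rangle \right| \le \|\eta\|\,\|\xi\| \sup_{|z| \le 1} |p(z)| \]
for all $\xi$ and $\eta$. Let $K$ be the linear span of $\eta$ and $T^k\xi$ for $k = 0, \dots, n$. Let $J$
denote the inclusion of $K$ into $H$ and $J^*$ the orthogonal projection of $H$ onto $K$.
Since $T^k\xi \in K$ for $k \le n$, we have $(J^*TJ)^k\xi = T^k\xi$ for those $k$, so that
$\langle \eta, p(J^*TJ)\xi \rangle = \langle \eta, p(T)\xi \rangle$. Hence it suffices to work with $J^*TJ$ which is a linear
contraction on $K$.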
After choosing a suitable (finite) orthonormal basis in $K$, we may now write
the matrix of $T$ as $U \operatorname{diag}(\sigma_1, \dots, \sigma_d) V$ where $U$ and $V$ are unitary and $\sigma_j$
are the singular values of $T$. They satisfy $0 \le \sigma_j \le 1$. Now let $T(w) =
U \operatorname{diag}(w_1, \dots, w_d) V$ allowing $w = (w_1, \dots, w_d)$ to run over the closed polydisk $\overline{D}^d$.
Then
\[ q(w) = \langle \eta, p(T(w))\xi \rangle \]
is a polynomial in the variables $w_1, \dots, w_d$ and hence, by the maximum modulus principle applied in each variable separately, takes its maximum absolute
value when $|w_j| = 1$ for all $j = 1, \dots, d$. But then $T(w)$ is the product of
three unitaries and hence is a unitary. The result for unitary operators (and more
generally normal operators) follows from the results in the previous section. The
result follows.
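The inequality is easy to test numerically in small dimensions. The following sketch is restricted to $2 \times 2$ matrices, where the operator norm has a closed form in terms of the trace and determinant of $A^*A$; the supremum over the disc is approximated by sampling the unit circle, which is legitimate by the maximum modulus principle. The matrix `T`, the polynomial `coeffs` and all function names are hypothetical choices for illustration only.

```python
import cmath
import math

def mat_mul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def poly_of_matrix(coeffs, t):
    """Evaluate p(T) = sum_k coeffs[k] T^k by Horner's rule (2x2 matrices only)."""
    result = [[0, 0], [0, 0]]
    for c in reversed(coeffs):
        result = mat_mul(result, t)
        result[0][0] += c  # add c times the identity
        result[1][1] += c
    return result

def op_norm(a):
    """Operator norm of a 2x2 matrix: square root of the largest eigenvalue of A*A."""
    frob_sq = sum(abs(x) ** 2 for row in a for x in row)       # trace of A*A
    det_sq = abs(a[0][0] * a[1][1] - a[0][1] * a[1][0]) ** 2   # determinant of A*A
    disc = max(frob_sq ** 2 - 4 * det_sq, 0.0)
    return math.sqrt((frob_sq + math.sqrt(disc)) / 2)

def sup_on_circle(coeffs, samples=2048):
    """Approximate the sup of |p(z)| over |z| <= 1 by sampling points of the unit circle."""
    best = 0.0
    for m in range(samples):
        z = cmath.exp(2j * math.pi * m / samples)
        best = max(best, abs(sum(c * z ** k for k, c in enumerate(coeffs))))
    return best

T = [[0.5, 0.5], [0.0, 0.5]]   # a non-normal contraction on C^2
coeffs = [1, 2, 3]             # p(z) = 1 + 2z + 3z^2

lhs = op_norm(poly_of_matrix(coeffs, T))   # ||p(T)||
rhs = sup_on_circle(coeffs)                # approximates sup |p(z)| on the closed disc
```

For this choice one finds `lhs <= rhs`, as the theorem predicts. Sampling gives only a lower bound for the true supremum, so a check of this kind can support the inequality but never refute it.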
9
Odds and Ends
9.1
The Hardy spaces and Blaschke Products
The Hardy spaces are defined by
\[ H^p(\mathbb{T}) = \{f \in L^p(\mathbb{T});\ \hat{f}(n) = 0 \text{ for } n < 0\}, \]
usually for $1 \le p \le \infty$. They are closed linear subspaces of $L^p(\mathbb{T})$. They can
also be thought of as spaces of analytic functions in the open unit disk $\Delta = \{z \in \mathbb{C};\ |z| < 1\}$:
\[ H^p = \Big\{F;\ F \text{ analytic in } \Delta,\ \sup_{0 \le r < 1} \int_{\mathbb{T}} |F(re^{it})|^p\,d\eta(t) < \infty \Big\}. \]
To establish this for $1 < p \le \infty$, one uses weak* compactness. Let $f_r(t) =
F(re^{it})$, then the condition $F \in H^p$ implies that $(f_r)$ is bounded in $L^p(\mathbb{T})$, for
$0 \le r < 1$. Hence there is a weak* limit point $f$ in $L^p$ as $r \to 1$. It follows that $\hat{f}(n)$ is the
coefficient of $z^n$ in the Maclaurin expansion of $F$ for $n \ge 0$ and $\hat{f}(n) = 0$ for
$n < 0$. Hence $F(re^{it}) = (f * P_r)(t)$, where $P_r$ is the Poisson kernel. This does not quite work for $p = 1$
since $L^1(\mathbb{T})$ is not a dual space. If we want to take a weak* limit, then we should
do so in $M(\mathbb{T})$. The situation is saved by the following theorem.
THEOREM 119 (F. & M. RIESZ THEOREM)
If $\mu \in M(\mathbb{T})$ and $\hat{\mu}(n) = 0$ for
all strictly negative integers $n$, then $\mu$ is absolutely continuous with respect to
linear measure on $\mathbb{T}$.
One very basic question that may be asked is "what are the closed linear
subspaces of $\ell^2$ that are invariant under the forward shift?" An equivalent formulation is: which closed linear subspaces $S$ of $H^2(\mathbb{T})$ have the property that
$f \in S \implies e_1 f \in S$?
PROPOSITION 120
Let $S$ be a closed linear subspace of $H^2(\mathbb{T})$ with the property that $f \in S \implies e_1 f \in S$. Then either $S = \{0\}$ or there exists $q \in H^2(\mathbb{T})$
with $|q| = 1$ $\eta$-almost everywhere such that $S = qH^2$.
Proof.
Assume that $S \ne \{0\}$. Then we may define
\[ m = \inf\{n \in \mathbb{Z}^+;\ \exists f \in S \text{ such that } \hat{f}(n) \ne 0\}. \]
Then there exists $f \in S$ such that $\hat{f}(m) \ne 0$. It follows that $f \notin e_1 S$. Therefore
$e_1 S$ is a proper closed linear subspace of $S$. (Exercise: Why is $e_1 S$ closed?) Now
let $q \in S \cap (e_1 S)^\perp$ with $\|q\|_2 = 1$. Now for $n \ge 1$, $e_n q \in e_1 S$ (proof by induction).
Therefore $q \perp e_n q$ for $n \ge 1$. This says
\[ \int e_n |q|^2\,d\eta = 0 \]
for $n \ge 1$ and taking complex conjugates also for $n \le -1$. On the other hand we
have $\int e_0 |q|^2\,d\eta = 1$ and hence complete knowledge of the Fourier coefficients of
$|q|^2$. But $|q|^2 \in L^1$. Hence by the uniqueness theorem, $|q|^2 = 1$.
We claim that if $f \in H^2$, then $qf \in S$. To see this, recall that
\[ f_N = \sum_{n=0}^N \left(1 - \frac{n}{N}\right) \hat{f}(n) e_n \]
is a trigonometric polynomial with nonnegative exponents and so $qf_N \in S$. But
since $q$ is bounded and $f_N \to f$ in $L^2$ norm, $qf_N \to qf$ in $L^2$ norm and the claim is proved since $S$ is
closed.
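The $L^2$ convergence $f_N \to f$ used in this step can be read off from Parseval's identity: $\|f - f_N\|_2^2 = \sum_n |\hat{f}(n)|^2 (1 - w_n)^2$, where $w_n = 1 - n/N$ for $n \le N$ and $w_n = 0$ otherwise. The sketch below evaluates this sum for a hypothetical choice of coefficients, $\hat{f}(n) = 2^{-n}$ for $0 \le n < 40$ and zero otherwise; the function names are ours.

```python
def fejer_weights(N, length):
    """Weights 1 - n/N for n <= N and 0 for n > N, listed for n = 0, ..., length - 1."""
    return [max(0.0, 1.0 - n / N) for n in range(length)]

def squared_l2_error(coeffs, N):
    """||f - f_N||_2^2 computed from the Fourier coefficients via Parseval's identity."""
    weights = fejer_weights(N, len(coeffs))
    return sum(abs(c) ** 2 * (1.0 - w) ** 2 for c, w in zip(coeffs, weights))

# Hypothetical example: f-hat(n) = 2^{-n} for 0 <= n < 40, zero otherwise.
coeffs = [2.0 ** (-n) for n in range(40)]

errors = [squared_l2_error(coeffs, N) for N in (10, 100, 1000)]
```

The entries of `errors` decrease as `N` grows, consistent with $qf_N \to qf$ once one uses that $q$ is bounded.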
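So $qH^2 \subseteq S$. Both sets are closed linear subspaces of $H^2$. If the inclusion is
strict, we may find $g \in S \cap (qH^2)^\perp$ with $\|g\|_2 = 1$. Now since $g \in (qH^2)^\perp$ we
have
\[ \int \overline{g}\,q\,e_n\,d\eta = 0 \]
for all $n \ge 0$. But since $q \in (e_1 S)^\perp$, and $g \in S$ we have
\[ \int e_n\,g\,\overline{q}\,d\eta = 0 \]
for all $n \ge 1$. Taking complex conjugates in the first relation, this shows that $g\overline{q}$ has all its Fourier coefficients zero. Since
$g\overline{q} \in L^1$, it follows that $g\overline{q} = 0$. But $|q| = 1$ and hence $g = 0$ contradicting
$\|g\|_2 = 1$.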
We use the notation ∆ for the open unit disk.
PROPOSITION 121
Let $F \in H^1(\Delta)$ and $F$ not the zero element of $H^1$. Then
the zeros of $F$, say $\alpha_j$ (as $j$ runs over a necessarily countable index set), satisfy
$\sum_j (1 - |\alpha_j|) < \infty$.
Sketch proof. We start by factoring out the zeros at the origin. If $F$ has a zero
of order $p$ at $z = 0$, then write $F(z) = z^p G(z)$ and replace $F$ by $G$. Hence we
may always assume that $F(0) \ne 0$. Let
\[ B_j(z) = \frac{\overline{\alpha_j}}{|\alpha_j|}\,\frac{\alpha_j - z}{1 - \overline{\alpha_j} z}, \]
a gadget called a Blaschke factor. It is continuous on the closed unit disk and has
$|B_j(z)| = 1$ for $|z| = 1$. Similarly, we may factor out Blaschke factors and write
\[ F(z) = G(z) \prod_{j=1}^n B_j(z) \]
where $G \in H^1$ and $\|g\|_1 = \|f\|_1$. We get
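\[ |F(0)| = |G(0)| \prod_{j=1}^n |B_j(0)| = |G(0)| \prod_{j=1}^n |\alpha_j| \le \|g\|_1 \prod_{j=1}^n |\alpha_j| = \|f\|_1 \prod_{j=1}^n |\alpha_j|. \]
It follows that
\[ \prod_{j=1}^n |\alpha_j| \ge \frac{|F(0)|}{\|f\|_1} > 0. \]
Passing to the limit as $n \to \infty$ (in case the set of zeros is denumerable) we have
the result, since $\sum_j (1 - |\alpha_j|) \le -\sum_j \log |\alpha_j|$.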
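The two properties of a Blaschke factor used above, $|B_j(z)| = 1$ on the unit circle and $B_j(0) = |\alpha_j|$, are easy to confirm numerically. The sketch below also evaluates the product of moduli for the hypothetical zero sequence $\alpha_j = 1 - 1/(j+1)^2$, for which $\sum_j (1 - |\alpha_j|)$ converges and $\prod_{j=1}^n |\alpha_j| = \frac{n+2}{2(n+1)}$ stays bounded away from zero. The function names and the choice of zeros are ours.

```python
import cmath

def blaschke_factor(alpha, z):
    """B(z) = (conj(alpha)/|alpha|) (alpha - z) / (1 - conj(alpha) z), for 0 < |alpha| < 1."""
    alpha = complex(alpha)
    return (alpha.conjugate() / abs(alpha)) * (alpha - z) / (1 - alpha.conjugate() * z)

def product_of_moduli(alphas):
    """Product of |alpha_j| over a finite list of zeros."""
    result = 1.0
    for a in alphas:
        result *= abs(a)
    return result

alpha = 0.3 + 0.4j                      # a sample zero with |alpha| = 0.5
boundary_point = cmath.exp(1j * 1.0)    # a point on the unit circle

value_at_origin = blaschke_factor(alpha, 0)                   # equals |alpha| up to rounding
modulus_on_circle = abs(blaschke_factor(alpha, boundary_point))

# Hypothetical zero sequence with summable 1 - |alpha_j|.
zeros = [1 - 1 / (j + 1) ** 2 for j in range(1, 1001)]
partial_product = product_of_moduli(zeros)                    # close to 1002 / 2002
```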
LEMMA 122
Let $p \in \mathbb{Z}^+$ and $\alpha_j \in \Delta \setminus \{0\}$ be such that $\sum_j (1 - |\alpha_j|) < \infty$.
Then the Blaschke product
\[ B(z) = z^p \prod_{j=1}^\infty B_j(z) \]
converges uniformly on the compacta of $\Delta$ and defines a bounded analytic function in $\Delta$ with boundary value function $b$ of unit modulus $\eta$-almost everywhere on
$\mathbb{T}$.
Proof.
We can assume without loss of generality that $p = 0$. Now let $H_n(z) = \prod_{j=1}^n B_j(z)$
and use $h_n$ to denote the boundary value function. (No problems here
since $H_n$ is continuous on the closed unit disk.) Let $m \ge n$ then
\[ \|h_m - h_n\|_2^2 = \|h_m\|_2^2 + \|h_n\|_2^2 - 2\Re \int \overline{h_n}\,h_m\,d\eta = 2\left(1 - \prod_{j=n+1}^m |\alpha_j|\right). \]
Since $\sum_j (1 - |\alpha_j|) < \infty$, it follows that $(h_n)$ is an $L^2$ Cauchy sequence. From
this it follows that the $(H_n)$ converge uniformly on the compacta of $\Delta$ and also
that the limit function on $\mathbb{T}$ has absolute value 1 almost everywhere.
In fact, we can continue to define $G_n = H_n^{-1} F$, an analytic function in $\Delta$, where $H_n = \prod_{j=1}^n B_j$ is the partial product.
Clearly $G_n \in H^1(\Delta)$ and $\|g_n\|_1 = \|f\|_1$. (Remember $H_n^{-1}$ is continuous in some
open set containing the unit circle.) So $(g_n)$ has a weak* limit point $g$ in $H^1(\mathbb{T})$
by the F. & M. Riesz Theorem. Here weak* means in $\sigma(M(\mathbb{T}), C(\mathbb{T}))$, but in case
you were wondering, $H^1$ actually is a dual space. It follows that $(G_n)$ has the limit point $G$
for uniform convergence on compacta. Since $F = H_n G_n$, it now follows that $F = BG$. Passing to the
limit at the boundary using radial limits, we now get $f = bg$ and $|f| = |g|$ almost
everywhere.
We now return to the function $q$ of Proposition 120. In the special case
$F = Q$ with $Q$ corresponding to $q$ we find that the function $G$ has no zeros
in $\Delta$ and its boundary value function has $|g| = 1$ almost everywhere. It follows that $-\log(G(z))$ is analytic in $\Delta$ and has nonnegative real part. The real
part is a harmonic function in $\Delta$ and its integral over the circle $|z| = r$ is just
$-\log(|G(0)|) < \infty$ for all $0 \le r < 1$. Thus it is the Poisson integral of a nonnegative measure $\mu = k \cdot \eta + \mu_s$ where $k \in L^1(\mathbb{T})$ and $\mu_s$ is singular with respect to
$\eta$. Checking what happens over radial limits we find that $k$ vanishes identically.
It follows that $-\log(G(re^{it}))$ can be obtained by convolving $\mu_s$ by the Herglotz
kernel $P_r + iQ_r$. This leads to the formula
\[ G(z) = \exp\left(-\int \frac{e^{it} + z}{e^{it} - z}\,d\mu_s(t)\right). \]
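So, the most general $Q$ can be written
\[ Q(z) = z^p \exp\left(-\int \frac{e^{it} + z}{e^{it} - z}\,d\mu(t)\right) \prod_j \frac{\overline{\alpha_j}}{|\alpha_j|}\,\frac{\alpha_j - z}{1 - \overline{\alpha_j} z} \]
where $\mu$ is a singular measure and $\sum_j (1 - |\alpha_j|) < \infty$.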
10
Fourier Analysis on Compact Groups
In this section G is a compact Hausdorff topological group. The right regular
representation of G on L2 (G) is defined by R(x)f (y) = f (yx). To understand
why it should be so we see that R(x1 x2 )f (y) = f (yx1 x2 ). On the other hand, if
g = R(x2 )f then g(z) = f (zx2 ) and R(x1 )R(x2 )f (y) = R(x1 )g(y) = g(yx1 ) =
f (yx1 x2 ). Hence R(x1 x2 ) = R(x1 )R(x2 ). We interpret the right regular representation R of G as a group homomorphism of the group G into the group of
unitary operators acting on L2 (G). One may also build the left regular representation, but in order to have a group homomorphism the definition has to be
$L(x)f(y) = f(x^{-1}y)$.
Both of these representations are continuous from G to the group of unitary operators on L2 (G) given the strong operator topology.
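The role of the inverse in the left regular representation can be checked by brute force on a finite group, which is a compact Hausdorff group in the discrete topology. The sketch below uses the symmetric group $S_3$ with permutations stored as tuples and functions on $G$ stored as dictionaries. The naive definition $f(xy)$ is included to show that it reverses the order of products. All names are illustrative.

```python
from itertools import permutations

def compose(p, q):
    """Group law on permutations stored as tuples: (pq)(i) = p(q(i))."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """Inverse permutation."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

G = list(permutations(range(3)))  # the symmetric group S_3

def right_regular(x, f):
    """R(x)f(y) = f(yx)."""
    return {y: f[compose(y, x)] for y in G}

def left_regular(x, f):
    """L(x)f(y) = f(x^{-1} y)."""
    x_inv = inverse(x)
    return {y: f[compose(x_inv, y)] for y in G}

def naive_left(x, f):
    """The naive guess f(xy), which turns out to be an anti-homomorphism."""
    return {y: f[compose(x, y)] for y in G}

def is_homomorphism(rep, f):
    """Check rep(x1 x2) f == rep(x1) rep(x2) f for all x1, x2 in G, on one test function f."""
    return all(rep(compose(x1, x2), f) == rep(x1, rep(x2, f)) for x1 in G for x2 in G)

# A test function taking distinct values at distinct group elements.
labels = {g: index for index, g in enumerate(G)}
```

With this test function, `is_homomorphism` holds for `right_regular` and `left_regular` and fails for `naive_left`, because $S_3$ is not abelian.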
If we have a representation π of G on a Hilbert space H this means that π
is a group homomorphism of G into the group of unitary operators on H with
the strong operator topology. An invariant subspace K of H is a closed linear
subspace of H such that ξ ∈ K and x ∈ G implies π(x)ξ ∈ K. In this case, the
restriction of π(x) to K defines a representation of G on K. However, it is also
true that K ⊥ is a closed invariant linear subspace of H. To see this, let ξ ∈ K and
η ∈ K ⊥ . Then we have
\[ \langle \xi, \pi(x)\eta \rangle = \langle \pi(x)^*\xi, \eta \rangle = \langle \pi(x)^{-1}\xi, \eta \rangle = \langle \pi(x^{-1})\xi, \eta \rangle = 0. \]
Since H = K ⊕ K ⊥ as a Hilbert space direct sum, we have decomposed the
representation into two parts. Such a representation is said to be reducible. If a
representation has no closed invariant linear subspaces (apart from the trivial ones,
{0H } and H itself) we say that it is irreducible.
LEMMA 123
Let $K$ be a closed invariant subspace of a Hilbert space $H$ for
a unitary representation $\pi$. Let $P$ be orthogonal projection from $H$ to $K$. Then
for all $x \in G$ we have $\pi(x)P = P\pi(x)$. Further, let $M$ be an invariant linear
subspace of $H$. Then $P(M)$ is also an invariant linear subspace.
Proof. Consider Q = π(x)−1 P π(x) = π(x)∗ P π(x). Clearly Q is a hermitian
projection which maps onto K and has kernel K ⊥ . Hence Q = P .
Now let $\xi \in M$ and $x \in G$. Then $\pi(x)P\xi = P\pi(x)\xi \in P(M)$.
PROPOSITION 124
Let $\pi$ be a representation on a Hilbert space $H$, then there
is a maximal closed invariant linear subspace $K$ of $H$ on which $\pi$ decomposes into
a (possibly infinite) Hilbert space direct sum of finite dimensional representations.
Proof. The proof is by Zorn’s Lemma. An object (for the purposes of this proof)
is a set of finite dimensional invariant linear subspaces of H that are mutually
orthogonal. We can partially order the set of objects by inclusion. It is easy
to see that for any chain in this partially ordered set the union taken over the
chain is again an object. The only thing that needs to be verified here is that the
elements (i.e. finite dimensional linear subspaces) in the union object are mutually
orthogonal. To do this, we choose two such finite dimensional linear subspaces
say K1 and K2 . They must belong to object1 and object2 say. But one of these
objects contains the other, so that K1 and K2 both belong to some object in the
chain and hence are orthogonal.
So, since each chain has an upper bound, it follows that there exists a maximal
object. Now let K be the Hilbert space direct sum of the subspaces in this maximal
object.
Next, we claim that $K$ is maximal among closed linear subspaces of $H$ which
can be written as Hilbert space direct sums of finite dimensional invariant subspaces. Suppose that $M$ is a larger such linear subspace. Then
\[ M = \oplus_{\alpha \in I} M_\alpha \]
the corresponding direct sum with $M_\alpha$ finite dimensional and invariant. Clearly,
there exist $\alpha \in I$, $\xi \in M_\alpha$ such that $\xi \notin K$. Let $P$ be orthogonal projection
on $K^\perp$. Then $P\xi \ne 0$. But $P(M_\alpha)$ is a nonzero invariant linear subspace of $H$
contained in $K^\perp$. It is also closed since it is finite dimensional. This contradicts
the maximality of $K$.
THEOREM 125
Let $G$ be a compact Hausdorff topological group. Then the
right regular representation can be written as a Hilbert space direct sum of finite
dimensional irreducible invariant linear subspaces.
Proof. We start by using Proposition 124 to find a maximal closed linear subspace K of L2 (G) for which the right regular representation breaks down as a
direct sum of finite dimensional representations. We will work on K ⊥ a closed
invariant subspace for the right regular representation. It follows that K ⊥ does not
have any nonzero finite dimensional invariant subspaces.
Assuming that $K^\perp$ is nonzero, choose a nonzero function $g$ in $K^\perp$. Next, find
a compact symmetric neighbourhood $V$ of $e$ such that $\|f * g - g\|_2 < \frac{1}{2}\|g\|_2$ with
$f = f_V = \operatorname{meas}(V)^{-1} 1_V$, possible since $f_V$ is a summability kernel. Note that
left convolution by $f$, namely
\[ L_f(h)(y) = \int h(x^{-1}y) f(x)\,dx = \int f(yz^{-1}) h(z)\,dz \]
is a hermitian operator on L2 (G). This is because f is real and symmetric (i.e.
f (x−1 ) = f (x)). But Lf is also a compact operator, in fact a Hilbert-Schmidt
operator since f ∈ L2 . Let J denote the inclusion of K ⊥ into H and J ∗ the
adjoint of J is then orthogonal projection from H onto K ⊥ . Both these operators
commute with the right regular representation. But Lf also commutes with the
right regular representation. This is essentially because left translation and right
translation commute. If you have a group element x it doesn’t matter if you first
multiply on the left by y and then on the right by z or whether you perform these
operations in the opposite order:
(yx)z = y(xz).
We have J ∗ Lf JR(x) = J ∗ Lf R(x)J = J ∗ R(x)Lf J = R(x)J ∗ Lf J and J ∗ Lf J
is a compact operator on K ⊥ . Therefore
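\[ K^\perp = (\oplus_\lambda H_\lambda) \oplus \ker(J^* L_f J) \]
where $H_\lambda$ are the finite dimensional eigenspaces of $J^* L_f J$ as $\lambda$ runs over the nonzero
eigenvalues. Each $H_\lambda$ is invariant since $J^* L_f J$ commutes with $R(x)$. But, there aren't any $H_\lambda$ because we already pulled the finite dimensional invariant stuff into $K$. So, $J^* L_f J$ vanishes on $K^\perp$. But then
\[ 0 = |\langle g, L_f g \rangle| = |\langle g, g \rangle - \langle g, g - L_f g \rangle| \ge \|g\|_2^2 - \tfrac{1}{2}\|g\|_2^2 > 0. \]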
This contradiction shows that K ⊥ is zero and the result is almost proved.
The final step is to break down the finite dimensional invariant subspaces of
L2 (G) into irreducible ones. This is trivial since in view of the finite dimensionality, the breaking down procedure has to stop.