Normed and Banach spaces
László Erdős
Nov 6, 2007
We recall that a norm on a vector space V is a function ‖·‖ : V → R+ satisfying the following properties:
‖x + y‖ ≤ ‖x‖ + ‖y‖        (triangle inequality)
‖cx‖ = |c| ‖x‖             (homogeneity)
‖x‖ = 0 ⇐⇒ x = 0          (definiteness)
We always assume that the field of V is R or C, i.e. there is a natural absolute value on these fields. Note that otherwise one would first have to put a norm on the field itself – the absolute value on R and C is such a norm. A vector space equipped with a norm is called a normed space (normierter Raum).
A norm defines a metric on V via the formula
d(x, y) := ‖x − y‖
i.e. normed spaces are in particular metric spaces as well.
Of course there are many norms on a vector space and there is typically no single optimal one. We have already seen the different metrics d_p on C[0, 1], which actually come from the following norms:
‖f‖_p := ( ∫_0^1 |f(x)|^p dx )^{1/p}
‖f‖_∞ := sup_{x∈[0,1]} |f(x)| = max_{x∈[0,1]} |f(x)|
They are called the L^p and L^∞ (or supremum- or maximum-) norms, respectively. The names come from the corresponding Lebesgue spaces, since the same formulas define norms on larger spaces of various integrable functions as well.
In most cases, when we talk about normed spaces, it is important to indicate not just the vector space but also the norm. E.g. we will denote the vector space of continuous functions by C[0, 1], but if we want to think of it as a normed space, we will specify the norm and use the notation (C[0, 1], ‖·‖_p). Later we will see that the supremum norm ‖·‖_∞ is in some sense the “most natural” one on C[0, 1], and thus some books take it for granted that C[0, 1] is equipped with this norm without saying so explicitly.
Homework 1.1 Prove that for any f ∈ C[0, 1]
lim_{p→∞} ‖f‖_p = ‖f‖_∞
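The statement can also be checked numerically. The following sketch (an illustration, not a proof; the test function sin(πx) is an arbitrary choice) approximates ‖f‖_p by a Riemann sum on a fine grid and shows the values approaching ‖f‖_∞ = 1 as p grows:

```python
import numpy as np

# Arbitrary continuous test function on [0, 1]; its sup-norm is 1 (at x = 1/2).
f = lambda x: np.sin(np.pi * x)

x = np.linspace(0.0, 1.0, 200_001)          # uniform grid on [0, 1]
sup_norm = np.max(np.abs(f(x)))             # approximates ||f||_inf

for p in (1, 2, 10, 100, 1000):
    # ||f||_p = (int_0^1 |f|^p dx)^(1/p); the integral is a mean over the grid
    lp_norm = np.mean(np.abs(f(x)) ** p) ** (1.0 / p)
    print(p, lp_norm)                        # increases towards sup_norm
```

For this f the values increase monotonically in p (as they must on a probability space) and get within a fraction of a percent of ‖f‖_∞ already at p = 1000.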
Even the Euclidean space R^n carries many norms, but in finite dimensions this does not make a big difference (at least not qualitatively, e.g. as far as convergence is concerned), since we have:
Lemma 1.2 On a finite dimensional vector space V any two norms are equivalent, i.e. for any two norms ‖·‖_1 and ‖·‖_2 there exist two positive constants c, C such that
c‖x‖_1 ≤ ‖x‖_2 ≤ C‖x‖_1,    ∀x ∈ V    (1.1)
Proof. Fix a basis {e_1, e_2, . . . , e_n} and represent any vector x ∈ V by the vector in C^n consisting of its coordinates:
x = Σ_{j=1}^n x_j e_j  −→  (x_1, x_2, . . . , x_n) ∈ C^n
This representation is clearly one to one. We will show that any norm ‖·‖ on V is equivalent with the norm
‖x‖_* := max_j |x_j|
i.e. with the ℓ∞-norm on the coordinates (CHECK that this is a norm!). In particular this will show that any two norms are equivalent.
Recall that in C^n any closed bounded set is compact, so consider the unit sphere in this norm,
S := {x ∈ V : max_j |x_j| = 1};
this is a compact set in the topology generated by the ‖·‖_*-norm (Heine–Borel). If you learned Heine–Borel for the Euclidean norm, then recall that the Euclidean norm and the maximum norm are equivalent, i.e.
max_j |x_j| ≤ ( Σ_j |x_j|^2 )^{1/2} ≤ √n max_j |x_j|,
hence they generate the same topology, hence their compact sets are the same.
Consider now an arbitrary norm ‖·‖ as a function on V :
f(x) := ‖x‖
Note that this function is continuous with respect to the ‖·‖_*-norm, since
| ‖x + h‖ − ‖x‖ | ≤ ‖h‖
(triangle inequality) and
‖h‖ = ‖ Σ_{j=1}^n h_j e_j ‖ ≤ max_j |h_j| Σ_{j=1}^n ‖e_j‖ ≤ C‖h‖_*
where C is a constant independent of h (it depends only on the basis that we fixed).
Therefore we have a continuous function f that we restrict onto the compact set S, i.e. we have a continuous function on a compact set. By a general theorem in analysis we know that the minimum and maximum of a continuous function on a compact set are achieved; in particular there exist two numbers m, M such that
m := min_{x∈S} f(x) ≤ f(x) ≤ M := max_{x∈S} f(x),    ∀x ∈ S
Note that M < ∞ and m > 0. The latter holds because if m were zero, then there would be an element y ∈ S such that f(y) = 0. But recall that f(y) = ‖y‖ is a norm, so this would imply y = 0, in contradiction with y ∈ S.
For an arbitrary x ∈ V , x ≠ 0, set v := x/‖x‖_* ; then v ∈ S, and by the homogeneity of the ‖·‖-norm we can write
‖x‖ = ‖x‖_* ‖v‖ = ‖x‖_* f(v),    with  0 < m ≤ f(v) ≤ M < ∞,
hence m‖x‖_* ≤ ‖x‖ ≤ M‖x‖_* , which proves (1.1). Since equivalent norms generate the same topology and the closed unit ball is compact in the ℓ∞-norm, we have also proved:
Lemma 1.3 In any finite dimensional vector space equipped with any norm the closed unit ball is compact.
This statement is false in infinite dimensional vector spaces. Consider for example the unit ball in ℓ∞:
B := {(x_1, x_2, . . .) : sup_j |x_j| ≤ 1}
Take the sequence
e_1 := (1, 0, 0, 0, . . .)
e_2 := (0, 1, 0, 0, . . .)
etc. These vectors clearly lie in B, but the sequence has no convergent subsequence, since ‖e_n − e_m‖_∞ = 1 for all n ≠ m.
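A finite truncation of this sequence already shows the obstruction: all pairwise sup-distances equal 1, so no subsequence can be Cauchy. A small numerical sketch (illustration only; the truncation length N = 50 is arbitrary):

```python
import numpy as np

N = 50                       # work with the first N members, truncated to length N
E = np.eye(N)                # row k is the truncation of e_{k+1}

# sup-distance between any two distinct members of the sequence
dists = [np.max(np.abs(E[i] - E[j])) for i in range(N) for j in range(i + 1, N)]
print(set(dists))            # every pairwise distance is exactly 1
```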
Actually this is not just an example; the following holds:
Proposition 1.4 Let V be an infinite dimensional vector space equipped with any norm ‖·‖. Then the closed unit ball
B := {x ∈ V : ‖x‖ ≤ 1}
is not compact.
Proof. The proof relies on the following version of the Riesz Lemma (there will be a very similar lemma later in Hilbert spaces, which will also be called the Riesz Lemma).
Lemma 1.5 (Riesz) Let U ⊂ X be a closed proper subspace of a normed vector space (X, ‖·‖) (proper means U ≠ X). Given any number 0 < δ < 1, there is a unit vector x = x_δ ∈ X (depending on δ) such that
‖x_δ − u‖ ≥ 1 − δ,    ∀u ∈ U
The lemma is interesting when δ is very small. To see this, define
dist(x, U) := inf{‖x − u‖ : u ∈ U}
to be the distance of x from the subspace U. Since 0 ∈ U, the distance of any unit vector from U cannot be bigger than 1. The Lemma says that there is an “almost optimal” unit vector, i.e. a unit vector whose distance from U is at least 1 − δ.
Proof of the Lemma. Choose an element x ∉ U and let d := dist(x, U). By the infimum in the definition of d, there exists a vector u_δ ∈ U such that
‖x − u_δ‖ ≤ d/(1 − δ)
(note that d/(1 − δ) > d). Define
x_δ := (x − u_δ)/‖x − u_δ‖
This is clearly a unit vector and this will be our choice. We need to check its distance from all elements of U. Let u ∈ U be arbitrary and write
‖x_δ − u‖ = ‖ (x − u_δ)/‖x − u_δ‖ − u ‖ = ‖x − (u_δ + ‖x − u_δ‖ u)‖ / ‖x − u_δ‖ ≥ d/‖x − u_δ‖ ≥ 1 − δ,
since u_δ + ‖x − u_δ‖ u ∈ U, and this completes the proof.
Question: where did we use that U is closed? Apparently nowhere, but implicitly we did use it in the inequality d/(1 − δ) > d, which holds only if d > 0 (note that from this it also follows that ‖x − u_δ‖ > 0, which was also tacitly used above). To ensure d = dist(x, U) > 0, the information x ∉ U would not be enough without the closedness of U (the distance of a boundary point of a set to the set is zero, even though the boundary point may not lie in the set itself).
You may find it weird that a subspace may not be closed. Indeed, this is often a property of infinite dimensional subspaces. For example, consider
C^1[0, 1] := {f : [0, 1] → C : f, f′ ∈ C[0, 1]},
the set of continuously differentiable functions on [0, 1] (since [0, 1] is closed, we also require the existence and continuity of the appropriate one-sided derivatives at the endpoints). Now it is clear that C^1[0, 1] is an infinite dimensional subspace of C[0, 1], but it is not closed: it is easy to find a sequence of differentiable functions that converge in supremum norm to a non-differentiable (but continuous) function (FIND AN EXAMPLE).
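One possible choice (a numerical sketch; the family f_n(x) = √((x − 1/2)^2 + 1/n^2) is our own illustrative pick) consists of smooth functions converging in supremum norm to |x − 1/2|, which is not differentiable at 1/2:

```python
import numpy as np

x = np.linspace(0.0, 1.0, 10_001)
g = np.abs(x - 0.5)                    # continuous, not differentiable at 1/2

for n in (1, 10, 100, 1000):
    f_n = np.sqrt((x - 0.5) ** 2 + 1.0 / n**2)     # lies in C^1[0,1] for every n
    print(n, np.max(np.abs(f_n - g)))              # sup-distance; equals 1/n at x = 1/2
```

Indeed √(a^2 + ε^2) − |a| ≤ ε for all a, so ‖f_n − g‖_∞ = 1/n → 0.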
However, finite dimensional subspaces are always closed:
Homework 1.6 Let U ⊂ X be a finite dimensional subspace of a normed space (X, ‖·‖); then U is closed. (Hint: let u_n → u, u_n ∈ U. Choose a basis in U and express u_n in terms of this basis. Using that any two norms on U are equivalent (Lemma 1.2), show that the coefficients of u_n in the fixed basis are Cauchy sequences, hence they have limits, thus you can represent u as a linear combination of basis vectors in U.)
Finally, we can return to the proof of Proposition 1.4. Start with a unit vector x_1 and let U_1 := span(x_1) be the one-dimensional (hence closed) subspace spanned by x_1. Fix δ = 1/2 and apply the Riesz Lemma. Then there exists a unit vector x_2 such that ‖x_1 − x_2‖ ≥ 1/2. Define U_2 := span(x_1, x_2) and apply the Riesz Lemma again to find a unit vector x_3 so that ‖x_3 − x_1‖ ≥ 1/2 and ‖x_3 − x_2‖ ≥ 1/2. Etc. The sequence x_1, x_2, . . . constructed in this way lies in the unit ball, but it clearly does not have any convergent subsequence. □
Linear maps
We know what a linear map T : V → W between two vector spaces means: this is a vector space homomorphism, i.e. a map that preserves the basic structure of the vector space (addition and scalar multiplication). The terms linear map (lineare Abbildung) and linear operator (linearer Operator) are synonyms and are used in parallel.
Definition 2.1 Let (V, ‖·‖_V) and (W, ‖·‖_W) be two normed spaces. A linear map T : V → W is called bounded (beschränkt) if there exists a finite constant C such that
‖T v‖_W ≤ C‖v‖_V ,    ∀v ∈ V    (2.2)
Note that we omit the parentheses: the fully explicit notation would be T(v) instead of T v. This is a very common abbreviation and it is used only for linear maps. The same convention is used for composed maps, e.g.
ST v = S(T(v))
Homework 2.2 Suppose V is finite dimensional. Then any linear map T : V → W is
bounded. (Hint: choose a basis in V , express T v in terms of the T -images of this basis and use
that k · kV is equivalent with the maximum-norm of the coefficients in the basis decomposition.)
Here are examples of a bounded and an unbounded map:
The “antiderivative” map on the space (C[0, 1], ‖·‖_∞),
A : f → (Af)(x) := ∫_0^x f(s) ds,
is bounded. Actually A maps C[0, 1] into the subspace C^1[0, 1] ⊂ C[0, 1].
The inverse map, i.e. the “differentiation” map
D : f → (Df)(x) := f′(x)
from (C^1[0, 1], ‖·‖_∞) into (C[0, 1], ‖·‖_∞), is unbounded (it is easy to construct a sequence f_n such that ‖f_n‖_∞ = 1 but ‖f_n′‖_∞ → ∞, thus ‖f_n′‖_∞ ≤ C‖f_n‖_∞ could never hold for any fixed constant C).
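Both claims can be illustrated numerically with the sequence f_n(x) = sin(2πnx) (our own illustrative choice): ‖f_n‖_∞ = 1 for every n and ‖Af_n‖_∞ stays bounded, while ‖f_n′‖_∞ = 2πn grows without bound. A sketch, not a proof:

```python
import numpy as np

x = np.linspace(0.0, 1.0, 100_001)
h = x[1] - x[0]

for n in (1, 5, 25, 125):
    f = np.sin(2 * np.pi * n * x)                    # ||f_n||_inf = 1 for every n
    df = 2 * np.pi * n * np.cos(2 * np.pi * n * x)   # f_n', computed analytically
    Af = np.cumsum(f) * h                            # (Af)(x) ~ int_0^x f(s) ds
    print(n, np.max(np.abs(f)), np.max(np.abs(Af)), np.max(np.abs(df)))
```

The printed sup-norms of f_n and Af_n stay of order 1, while that of f_n′ grows linearly in n; no constant C can dominate the last column by the first.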
Understandably, the derivative map is probably the most important linear map in functional analysis. It is an unfortunate (?) fact that it is unbounded: we will see that the theory of bounded maps is much easier. Actually, in this course we will almost exclusively work with bounded maps.
Comparing the derivative and the antiderivative (= integration) as linear maps, it is clear that integration behaves much better. This is a general fact: integration is usually a stable operation (e.g. numerically easier and safer to compute, less sensitive to imprecisions), while differentiation is unstable. For pedagogical reasons one usually first learns the derivative (because it is conceptually simpler) and integration only later. From a mathematical point of view, integration should be the starting point.
We go back to bounded linear maps. If T : (V, ‖·‖_V) → (W, ‖·‖_W) is a map between two normed spaces, it is meaningful to ask whether the map is continuous (normed spaces are of course topological spaces as well). The very good news is that among linear maps, bounded and continuous maps are the same. The proof is easy, but the statement is actually quite surprising – nothing like it is expected to hold for general (non-linear) maps! It is a special and strong property of linearity.
Theorem 2.3 Let T : (V, ‖·‖_V) → (W, ‖·‖_W) be a linear map. Then the following three properties are equivalent:
(i) T is bounded;
(ii) T is continuous (at every point of V );
(iii) T is continuous at one point.
In the following proof we will intentionally omit the subscripts V and W from the norms and use only ‖·‖; but think it over at every step whether the actual norm is taken in V or in W. This is a very common “sloppiness” of mathematical writing – the norm is indexed only if there is a danger of confusion. When it is clear from the context which norm we are talking about, we use only ‖·‖. The confusion usually arises when the same space is equipped with two different norms. When we have two spaces and each has its own norm, they cannot be confused, since the argument of the norm indicates which space we consider: if we know that a vector x is in the space V , then ‖x‖ must necessarily mean ‖x‖_V.
Proof. (i) → (ii): Since any normed space is metric, it is sufficient to check continuity along sequences, i.e. we need to show that if x_n → x, then T x_n → T x. But this is clear from
‖T x_n − T x‖ = ‖T(x_n − x)‖ ≤ C‖x_n − x‖ → 0
(ii) → (iii) is trivial.
(iii) → (i): Suppose T is continuous at x, i.e. we have T x_n → T x whenever x_n → x. Introducing y_n := x_n − x, we see that then T is continuous at the origin, i.e. for any y_n → 0 we have T y_n → 0. Suppose that T were not bounded. Then there would exist a sequence v_n ≠ 0 such that
‖T v_n‖ ≥ n‖v_n‖
But then, by the linearity of T and the homogeneity of the norm,
‖ v_n/(n‖v_n‖) ‖ = 1/n → 0   while   ‖ T( v_n/(n‖v_n‖) ) ‖ ≥ 1,
i.e. we have found a sequence converging to zero (namely v_n/(n‖v_n‖)) whose T-image does not converge to zero, a contradiction.
Similarly to the matrix norm generated by a vector norm, we can define the norm of a bounded linear map:
Definition 2.4 Let T : (V, ‖·‖_V) → (W, ‖·‖_W) be a bounded linear map. The norm of T is defined to be the smallest constant C so that (2.2) holds, i.e.
‖T‖ := sup{ ‖T v‖_W / ‖v‖_V : v ∈ V, v ≠ 0 }    (2.3)
By the homogeneity of the norm and the linearity of T, this is the same as
‖T‖ = sup{ ‖T v‖_W : v ∈ V, ‖v‖_V = 1 }
Note that the norm of T depends on the two norms fixed on V and W , so it would be more honest to indicate this fact, but it usually follows from the context and we do not do it.
The notation and the name already indicate that the number defined in (2.3) is indeed a norm. To make this absolutely precise, one has to introduce a vector space structure on the set of bounded linear maps between (V, ‖·‖_V) and (W, ‖·‖_W). This is done in the most natural way: if T and S are two such maps, then their sum is defined as
(T + S)v := T v + Sv
and similarly for multiplication by a scalar. Notice that the two plus signs are not the same: the one on the left indicates the addition in the space of maps (just being defined), the one on the right is the addition in W.
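For matrices the supremum in (2.3) can actually be computed. A small sketch (illustration only; the 4 × 4 random matrix is an arbitrary choice): for the maximum norm on R^4, the induced operator norm is known to equal the maximal absolute row sum, and the supremum in (2.3) is attained at ±1 sign vectors, so enumerating them reproduces it exactly:

```python
import itertools
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))

# For the max-norm on R^4, the induced operator norm is the maximal row sum.
row_sum_norm = np.max(np.sum(np.abs(A), axis=1))

# Definition (2.3): sup of ||Av||_inf over ||v||_inf = 1.  For this norm the
# supremum is attained at +-1 sign vectors, so enumerating all 16 is exact.
sup_over_signs = max(np.max(np.abs(A @ np.array(s)))
                     for s in itertools.product((-1.0, 1.0), repeat=4))
print(row_sum_norm, sup_over_signs)
```

The two printed values agree: choosing v_j = sign(a_ij) for the maximizing row i realizes the row sum of absolute values.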
Homework 2.5 Check that (2.3) indeed defines a norm on the vector space of bounded linear maps from V to W.
Finally, we remark that the norm on operators enjoys two additional properties. First, it is obvious that the identity map I : (V, ‖·‖_V) → (V, ‖·‖_V) is bounded and has norm 1:
‖I‖ = 1
A delicate remark: this relation holds only if V is considered to be equipped with the same norm both as domain and as target. For example, the identity map
I : (C[0, 2], ‖·‖_1) → (C[0, 2], ‖·‖_∞)
is unbounded, while the identity map
I : (C[0, 2], ‖·‖_∞) → (C[0, 2], ‖·‖_1)
is bounded and has norm 2.
The second property is submultiplicativity; the matrix analogue is the bound
‖AB‖ ≤ ‖A‖ ‖B‖    (2.4)
We formulate it as a lemma, whose proof is a HOMEWORK (it goes exactly as the proof of (2.4) from linear algebra).
Lemma 2.6 Let two maps be given,
T : (V, ‖·‖_V) → (W, ‖·‖_W),    S : (W, ‖·‖_W) → (U, ‖·‖_U),
and suppose that both maps are bounded. Then the composition map
ST : (V, ‖·‖_V) → (U, ‖·‖_U)
is also bounded and
‖ST‖ ≤ ‖S‖ ‖T‖
A small remark: not every matrix norm satisfies (2.4); e.g. the entrywise maximum norm ‖A‖ := max_{i,j} |a_{ij}| is not submultiplicative (for the n × n matrix A with all entries equal to 1 we have ‖A^2‖ = n > 1 = ‖A‖^2). (The Frobenius norm ‖A‖ := [Tr(AA*)]^{1/2} is in fact submultiplicative, but it is not generated by any vector norm; e.g. it gives ‖I‖ = √n.) Those matrix norms that are generated by a vector norm (according to Definition 2.4) always satisfy this property. In principle one could similarly define norms on the vector space of linear operators between two spaces that have nothing to do with any norm on the spaces themselves. For such norms submultiplicativity need not hold. We will not encounter such norms; they are unimportant.
Banach spaces
Recall the definition:
Definition 3.1 A normed vector space (V, ‖·‖) is called a Banach space (Banachraum) if it is complete (with respect to the metric generated by ‖·‖).
Obviously, every finite dimensional normed vector space is Banach. Moreover, we have also proved that (C(X), ‖·‖_∞), the space of continuous functions on a metric space equipped with the supremum norm (metric), is complete. In contrast, the space (C[0, 1], ‖·‖_1) is not complete (HOMEWORK!). It will be a fundamental question how to complete it; this will lead us to the Lebesgue integral.
The following theorem is very simple, almost tautological, but the idea behind it is fundamental.
Theorem 3.2 (Bounded extension principle) Let
T : (V, ‖·‖_V) → (W, ‖·‖_W)
be a bounded map and assume that W is a Banach space. Then T can be extended to the completion of V in a unique way, keeping the norm.
Proof. Let Ṽ denote the completion of V and let x ∈ Ṽ. Since V is dense in Ṽ, there is a sequence x_n ∈ V such that x_n → x. Clearly T x_n is Cauchy (in W), since
‖T x_n − T x_m‖ ≤ ‖T‖ ‖x_n − x_m‖
Since W is complete, T x_n converges, say T x_n → y. We want to define T̃x := y to be the extension of T, but before we can do so, we have to prove that y is independent of the choice of the sequence x_n and depends only on x. But this is easy: if x′_n → x were another sequence and T x′_n → y′, then we can merge the two sequences into one by considering
x_1, x′_1, x_2, x′_2, x_3, x′_3, . . .
This new sequence is also Cauchy, so its T-image is convergent, but then y = y′ (here we implicitly use the trivial fact that the limit in a normed – actually metric – space is unique).
Now we can really define the extension by
T̃x := y
It is immediate to check that T̃ is linear and that T is the restriction of T̃ from Ṽ to V (CHECK). To compute the norm, we have
‖T̃x‖ = lim ‖T x_n‖ ≤ lim sup ‖T‖ ‖x_n‖ = ‖T‖ ‖x‖
for any x ∈ Ṽ, thus ‖T̃‖ ≤ ‖T‖; and since T̃ is an extension of T, its norm is at least ‖T‖, so we have ‖T̃‖ = ‖T‖.
The uniqueness of T̃ is trivial (CHECK).
The main application of this principle is that it is enough to specify a bounded linear map on a dense subspace if the target space is Banach. This has enormous advantages. A very naive, almost stupid example is the following. Consider Q ⊂ R equipped with the usual absolute value as norm. Suppose we want to define the “multiplication by two” linear operator on R, i.e.
T : R → R,    T x = 2x
Since Q is dense, the above theorem says that it is sufficient to know the map
T : Q → R,    T x = 2x,
i.e. to know how to multiply rational numbers by two; from this information (and the boundedness and linearity) it follows how to multiply any real number by two.
Here is a less trivial example that you have actually seen and heavily used, but did not view in this way.
Example: [Definition of the Riemann integral]
Let P C[a, b] be the set of piecewise continuous functions on [a, b] (for definiteness, we prescribe that the functions be right continuous at the jumps), i.e. it is the set of all functions f : [a, b] → C such that there exist finitely many points x_1, . . . , x_n ∈ [a, b] so that f is continuous on (x_j, x_{j+1}) for every j = 0, 1, . . . , n (where x_0 = a, x_{n+1} = b), and f is right continuous at all x_j:
lim_{x→x_j+0} f(x) = f(x_j),    j = 0, 1, . . . , n
We equip P C[a, b] with the supremum norm.
Let us denote by S[a, b] the space of step functions, i.e. the subset of P C[a, b] consisting of functions of the form
f = Σ_{j=0}^{n−1} s_j χ_{[x_j, x_{j+1})} + s_n χ_{[x_n, b]}    (3.5)
where s_j ∈ C and χ_A denotes the characteristic function of a set A. It is easy to check (HOMEWORK) that S[a, b] is dense in P C[a, b] (with the supremum norm).
We define a linear map from S[a, b] to C as follows:
I : f → Σ_{j=0}^{n} s_j (x_{j+1} − x_j)
(with x_{n+1} = b) if f is of the form (3.5). One has to check that this is well defined, i.e. I(f) does not depend on the representation (3.5) (such a representation of a step function is not unique, WHY?).
It is easy to see that I : S[a, b] → C is a bounded map (with norm b − a), and since C is Banach, by the bounded extension principle we can extend I to P C[a, b]. The map
Ĩ : P C[a, b] → C
is the Riemann integral.
If you read the proof of the bounded extension principle, you will see that the abstract procedure there corresponds exactly to how the Riemann integral was defined from step functions for continuous (or piecewise continuous) functions. With a similar argument one can extend I to the set of all Riemann integrable functions.
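The extension procedure can be imitated numerically: approximate a continuous function by step functions of the form (3.5) on finer and finer partitions and apply I to them. A sketch (the integrand sin on [0, π] is an arbitrary choice; its exact integral is 2):

```python
import numpy as np

a, b = 0.0, np.pi        # arbitrary interval; the exact integral of sin here is 2
f = np.sin

def step_integral(n):
    """I applied to the step function taking the value f(x_j) on [x_j, x_{j+1}),
    over a uniform partition of [a, b] into n pieces."""
    xs = np.linspace(a, b, n + 1)
    s = f(xs[:-1])                     # the constants s_j of the step function
    return np.sum(s * np.diff(xs))     # sum_j s_j (x_{j+1} - x_j)

for n in (4, 16, 64, 256):
    print(n, step_integral(n))         # approaches 2 as the partition refines
```

Since these step functions converge to f in supremum norm, the values I(f_n) form exactly the Cauchy sequence whose limit the extension principle declares to be Ĩ(f).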
You may wonder if there is some cheating going on here. After all, what has happened so far is quite “soft”, and the Riemann integral just popped up almost as a triviality, while in Analysis I you spent several weeks on it. The point is the same one I emphasized for the completion theorem. Proving that some object exists via some general principle (like completion viewed as a factor space of Cauchy sequences) is usually not hard or deep. The key is always how to identify the object in a manageable, computationally feasible way. The main point behind the Riemann integral is not so much that one can take the limit of step functions and that the “area under the graph” is obtained as the limit of areas of approximating rectangles. All this may be notationally a bit painful to bookkeep (and it is done quite elegantly in the abstract language above), but nothing deep happens here. The deep and surprising fact is the Newton–Leibniz Theorem, which connects the Riemann integral with something else (the derivative) and thereby enables us to compute Riemann integrals without actually going through the step-function limit procedure.