Normed and Banach spaces
László Erdős
Nov 6, 2007

1 Norms

We recall that a norm is a function on a vectorspace V, ‖·‖ : V → R₊, satisfying the following properties:

    ‖x + y‖ ≤ ‖x‖ + ‖y‖
    ‖cx‖ = |c| ‖x‖
    ‖x‖ = 0 ⟺ x = 0

We always assume that the field of V is R or C, i.e. there is a natural absolute value on these fields. Note that otherwise one would have to start by putting a norm on the field itself – the absolute value on R and C is a norm. A vectorspace equipped with a norm is called a normed space (normierter Raum). A norm defines a metric on V via the formula

    d(x, y) := ‖x − y‖

i.e. normed spaces are in particular metric spaces as well. Of course there are many norms on a vectorspace, and there is typically no single optimal one. We have already seen the different metrics d_p on C[0, 1], which actually come from the following norms:

    ‖f‖_p := ( ∫_0^1 |f(x)|^p dx )^{1/p},    1 ≤ p < ∞

    ‖f‖_∞ := sup_x |f(x)| = max_x |f(x)|

They are called the L^p and L^∞ (or supremum- or maximum-) norms, respectively. The names come from the corresponding Lebesgue spaces, since the same formulas define norms on bigger spaces of various integrable functions as well.

In most cases, when we talk about normed spaces, it is important to indicate not just the vectorspace but also the norm. E.g. we will denote the vectorspace of continuous functions by C[0, 1], but if we want to think of it as a normed space, we will specify the norm and use the notation (C[0, 1], ‖·‖_p). Later we will see that the supremum norm ‖·‖_∞ is in some sense the "most natural" one on C[0, 1], and thus some books take it for granted that C[0, 1] is equipped with this norm without explicitly saying so.

Homework 1.1 Prove that for any f ∈ C[0, 1]

    lim_{p→∞} ‖f‖_p = ‖f‖_∞

Even the Euclidean space R^n has many norms, but in finite dimensions this does not really make a big difference (at least not qualitatively, e.g.
as far as convergence is concerned), since we have

Lemma 1.2 On a finite dimensional vectorspace V any two norms are equivalent, i.e. for any two norms ‖·‖_1 and ‖·‖_2 there exist two positive constants c, C such that

    c ‖x‖_1 ≤ ‖x‖_2 ≤ C ‖x‖_1,    ∀x ∈ V    (1.1)

Proof. Fix a basis {e_1, e_2, . . . , e_n} and represent any vector x ∈ V by the vector in C^n consisting of its coordinates:

    x = Σ_{j=1}^n x_j e_j  ⟶  (x_1, x_2, . . . , x_n) ∈ C^n

This representation is clearly one to one. We will show that any norm ‖·‖ on V is equivalent with the norm

    ‖x‖_* := max_j |x_j|

i.e. with the ℓ^∞-norm on the coordinates (CHECK that this is a norm!). In particular this will show that any two norms are equivalent. Recall that in C^n any closed bounded set is compact, so consider the unit sphere in the ℓ^∞-norm,

    S := {x ∈ V : max_j |x_j| = 1}

This is a compact set in the topology generated by the ‖·‖_*-norm (Heine–Borel). If you learned Heine–Borel for the Euclidean norm, then you have to recall that the Euclidean norm and the maximum norm are equivalent, i.e.

    max_j |x_j| ≤ ( Σ_{j=1}^n |x_j|^2 )^{1/2} ≤ √n max_j |x_j|

hence they generate the same topology, hence their compact sets are the same.

Consider now an arbitrary norm ‖·‖ as a function on V:

    f(x) := ‖x‖

Note that this function is continuous with respect to the ‖·‖_*-norm, since |‖x + h‖ − ‖x‖| ≤ ‖h‖ (triangle inequality) and

    ‖h‖ = ‖ Σ_{j=1}^n h_j e_j ‖ ≤ max_j |h_j| Σ_{j=1}^n ‖e_j‖ ≤ C ‖h‖_*

where C is a constant independent of h (it depends only on the basis that we fixed). Therefore we have a continuous function f that we can restrict onto the compact set S, i.e. we have a continuous function on a compact set. By a general theorem in analysis we know that the minimum and maximum of a continuous function on a compact set are attained; in particular there exist two numbers m, M such that

    m := min_{x∈S} f(x) ≤ f(x) ≤ M := max_{x∈S} f(x),    ∀x ∈ S

Note that M < ∞ and m > 0.
The latter holds because if m were zero, then there would be an element y ∈ S such that f(y) = 0. But recall that f(y) = ‖y‖ is a norm, so this would imply y = 0, in contradiction with y ∈ S.

For an arbitrary x ∈ V, x ≠ 0, we set v := x/‖x‖_*; then v ∈ S, and by the homogeneity of the ‖·‖-norm we can write

    ‖x‖ = ‖x‖_* ‖v‖ = ‖x‖_* f(v)

Therefore

    0 < m ≤ ‖x‖ / ‖x‖_* ≤ M < ∞

which proves (1.1).

Since equivalent norms generate the same topology, and the unit ball is compact in the ℓ^∞-norm, we have thus proved:

Lemma 1.3 In any finite dimensional vectorspace equipped with any norm the unit ball is compact.

This statement is false in infinite dimensional vectorspaces. Consider for example the unit ball in ℓ^∞:

    B := {(x_1, x_2, . . .) : sup_j |x_j| ≤ 1}

Take the sequence

    e_1 := (1, 0, 0, 0, . . .)
    e_2 := (0, 1, 0, 0, . . .)

etc. This sequence clearly lies in B, but it has no convergent subsequence (indeed ‖e_n − e_m‖_∞ = 1 for all n ≠ m, so no subsequence can be Cauchy). Actually this is not just an example; the following holds:

Proposition 1.4 Let V be an infinite dimensional vectorspace equipped with any norm ‖·‖. Then the closed unit ball B := {x ∈ V : ‖x‖ ≤ 1} is not compact.

Proof. The proof relies on the following version of the Riesz Lemma (there will be a very similar lemma later in Hilbert spaces, which will also be called the Riesz Lemma).

Lemma 1.5 (Riesz) Let U ⊂ X be a closed proper subspace of a normed vectorspace (X, ‖·‖) (proper means U ≠ X). Given any number 0 < δ < 1, there is a unit vector x = x_δ ∈ X (depending on δ) such that

    ‖x_δ − u‖ ≥ 1 − δ,    ∀u ∈ U

The lemma is interesting when δ is very small. To see this, define

    dist(x, U) := inf{‖x − u‖ : u ∈ U}

to be the distance of x from the subspace U. Since 0 ∈ U, the distance of any unit vector from U cannot be bigger than 1. The Lemma says that there is an "almost optimal" unit vector, i.e. a unit vector whose distance from U is at least 1 − δ.

Proof of the Lemma. Choose an element x ∉ U and let d = dist(x, U).
By the definition of d as an infimum, there exists a vector u_δ ∈ U such that

    ‖x − u_δ‖ ≤ d / (1 − δ)

since d/(1 − δ) > d. Define

    x_δ := (x − u_δ) / ‖x − u_δ‖

This is clearly a unit vector, and this will be our choice. We need to check its distance from the elements of U. Let u ∈ U be arbitrary and write

    ‖x_δ − u‖ = ‖ x/‖x − u_δ‖ − u_δ/‖x − u_δ‖ − u ‖
              = (1/‖x − u_δ‖) ‖ x − (u_δ + ‖x − u_δ‖ u) ‖ ≥ d/‖x − u_δ‖ ≥ 1 − δ

(the middle norm is at least d because u_δ + ‖x − u_δ‖u ∈ U), and this completes the proof.

Question: where did we use that U is closed? Apparently nowhere, but implicitly we did use it in the inequality d/(1 − δ) > d, which holds only if d > 0 (note that from this it also follows that ‖x − u_δ‖ > 0, which was also tacitly used above). To ensure d = dist(x, U) > 0, the information x ∉ U alone would not be enough without the closedness of U (the distance of a boundary point of a set to the set is zero, even though the boundary point may not lie in the set itself).

You may find it weird that a subspace may not be closed. Indeed, this is often a property of infinite dimensional subspaces. For example consider

    C^1[0, 1] := {f : [0, 1] → C : f, f′ ∈ C[0, 1]}

the set of continuously differentiable functions on [0, 1] (since [0, 1] is closed, we also require the existence and continuity of the appropriate one-sided derivatives at the endpoints). Now it is clear that C^1[0, 1] is an infinite dimensional subspace of C[0, 1], but it is not closed: it is easy to find a sequence of differentiable functions that converge in supremum norm to a non-differentiable (but continuous) function (FIND AN EXAMPLE).

However, finite dimensional subspaces are always closed:

Homework 1.6 Let U ⊂ X be a finite dimensional subspace of a normed space (X, ‖·‖); then U is closed. (Hint: let u_n → u, u_n ∈ U. Choose a basis in U and express u_n in terms of this basis.
Using that any two norms on U are equivalent (Lemma 1.2), show that the coefficients of u_n in the fixed basis are Cauchy sequences, hence they have limits; thus you can represent u as a linear combination of the basis vectors in U.)

Finally, we can return to the proof of Proposition 1.4. Start with a unit vector x_1 and let U_1 := span(x_1) be the one-dimensional (hence closed) subspace spanned by x_1. Fix δ = 1/2 and apply the Riesz Lemma. Then there exists a unit vector x_2 such that ‖x_1 − x_2‖ ≥ 1/2. Define U_2 := span(x_1, x_2) and apply the Riesz Lemma again to find a unit vector x_3 so that ‖x_3 − x_1‖ ≥ 1/2 and ‖x_3 − x_2‖ ≥ 1/2. Etc. The sequence x_1, x_2, . . . constructed in this way lies in the unit ball, but it clearly does not have any convergent subsequence.

2 Linear maps

We know what a linear map T : V → W between two vectorspaces means: this is a vectorspace homomorphism, i.e. a map that preserves the basic structure of the vectorspace (addition and scalar multiplication). The terms linear map (lineare Abbildung) and linear operator (linearer Operator) are synonyms and are used in parallel.

Definition 2.1 Let (V, ‖·‖_V) and (W, ‖·‖_W) be two normed spaces. A linear map T : V → W is called bounded (beschränkt) if there exists a finite constant C such that

    ‖T v‖_W ≤ C ‖v‖_V,    ∀v ∈ V    (2.2)

Note that we omit the parentheses: the absolutely honest notation would be T(v) instead of T v. This is a very common abbreviated notation, and it is used only for linear maps. The same convention is used for compositions of maps, e.g. ST v = S(T(v)).

Homework 2.2 Suppose V is finite dimensional. Then any linear map T : V → W is bounded. (Hint: choose a basis in V, express T v in terms of the T-images of this basis, and use that ‖·‖_V is equivalent with the maximum norm of the coefficients in the basis decomposition.)

Here are examples of a bounded and an unbounded map. The "antiderivative" map on the space (C[0, 1], ‖·‖_∞),

    A : f → (Af)(x) := ∫_0^x f(s) ds

is bounded.
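As a quick numerical sanity check (a sketch, not part of the lecture: the grid, the helper names sup_norm and antiderivative, and the sample functions are all illustrative choices), one can verify the bound ‖Af‖_∞ ≤ ‖f‖_∞ on [0, 1] for a few sample functions, i.e. that A is bounded with constant C = 1 on these examples:

```python
import numpy as np

# Sketch: check ||Af||_inf <= ||f||_inf on [0, 1] for sample functions,
# where (Af)(x) is the integral of f from 0 to x (trapezoid approximation).
xs = np.linspace(0.0, 1.0, 10_001)
dx = xs[1] - xs[0]

def sup_norm(values):
    return float(np.max(np.abs(values)))

def antiderivative(values):
    # cumulative trapezoid rule: (Af)(x_k) ~ sum of trapezoids up to x_k
    inner = np.cumsum((values[1:] + values[:-1]) / 2.0) * dx
    return np.concatenate([[0.0], inner])

for f in (np.sin(5 * xs), np.exp(xs) - 2.0, xs**3 - xs):
    # |(Af)(x)| <= x * ||f||_inf <= ||f||_inf for x in [0, 1]
    assert sup_norm(antiderivative(f)) <= sup_norm(f) + 1e-9
```

Of course a finite check proves nothing; the actual bound |(Af)(x)| ≤ x ‖f‖_∞ ≤ ‖f‖_∞ follows directly from the triangle inequality for integrals.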
Actually A maps C[0, 1] into the subspace C^1[0, 1] ⊂ C[0, 1]. The inverse map, i.e. the "differentiation" map

    D : f → (Df)(x) := f′(x)

from (C^1[0, 1], ‖·‖_∞) into (C[0, 1], ‖·‖_∞), is unbounded (it is easy to construct a sequence f_n such that ‖f_n‖_∞ = 1 but ‖f_n′‖_∞ → ∞, thus ‖f_n′‖_∞ ≤ C‖f_n‖_∞ could never hold for any fixed constant C).

Understandably, the derivative map is probably the most important linear map in functional analysis. It is an unfortunate (?) fact that it is unbounded: we will see that the theory of bounded maps is much easier. Actually, in this course we will almost exclusively work with bounded maps.

Comparing the derivative and the antiderivative (= integration) as linear maps, it is clear that integration behaves much better. This is a general fact: integration is usually a stable operation (e.g. numerically easier and safer to compute, less sensitive to imprecisions), while differentiation is unstable. For pedagogical reasons one usually first learns the derivative (because it is conceptually simpler) and integration only later. From a mathematical point of view, integration should be the starting point.

We go back to bounded linear maps. If T : (V, ‖·‖_V) → (W, ‖·‖_W) is a map between two normed spaces, it is meaningful to ask whether the map is continuous (the normed spaces are of course topological spaces as well). The very good news is that among linear maps, bounded and continuous maps are the same. The proof is easy, but the statement is actually quite surprising – nothing like that is expected to hold for general (non-linear) maps! It is a special and strong property of linearity.

Theorem 2.3 Let T : (V, ‖·‖_V) → (W, ‖·‖_W) be a linear map. Then the following three properties are equivalent:

(i) T is bounded;
(ii) T is continuous (at every point of V);
(iii) T is continuous at one point.
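Before the proof, here is a small numerical illustration of the link between unboundedness and discontinuity, using the differentiation map D from the example above (a sketch; the sample family f_n is my choice, not from the lecture): f_n(x) = sin(2πnx)/(2πn) converges to 0 in the supremum norm, yet Df_n = cos(2πnx) has supremum norm 1 for every n, so D cannot be continuous at the origin.

```python
import numpy as np

# f_n -> 0 in sup-norm, but D f_n does not -> 0: D is not continuous at 0.
xs = np.linspace(0.0, 1.0, 100_001)
for n in (1, 10, 100):
    f_n = np.sin(2 * np.pi * n * xs) / (2 * np.pi * n)  # ||f_n||_inf = 1/(2*pi*n) -> 0
    df_n = np.cos(2 * np.pi * n * xs)                   # ||D f_n||_inf = 1 for all n
    assert np.max(np.abs(f_n)) <= 1.0 / (2 * np.pi * n) + 1e-12
    assert np.max(np.abs(df_n)) == 1.0                  # attained at x = 0
```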
In the following proof we will intentionally omit the subscripts V and W from the norms and use only ‖·‖, but think over at every step whether the actual norm is in V or in W. This is a very common "lousiness" of mathematical writing – the norm is indexed only if there is a danger of confusion. When it is clear from the context which norm we are talking about, we use only ‖·‖. The confusion usually arises when the same space is equipped with two different norms. When we have two spaces and each has its own norm, they cannot be confused, since the argument of the norm indicates which space we consider: if we know that a vector x is in the space V, then ‖x‖ must necessarily mean ‖x‖_V.

Proof. (i) → (ii): Since any normed space is metric, it is sufficient to check continuity along sequences, i.e. we need to show that if x_n → x, then T x_n → T x. But this is clear from

    ‖T x_n − T x‖ = ‖T(x_n − x)‖ ≤ C ‖x_n − x‖ → 0

(ii) → (iii) is trivial.

(iii) → (i): Suppose T is continuous at x, i.e. we have T x_n → T x whenever x_n → x. Introducing y_n := x_n − x, we see that then T is continuous at the origin, i.e. for any y_n → 0 we have T y_n → 0. Suppose that T were not bounded. Then there would exist a sequence v_n such that

    ‖T v_n‖ ≥ n ‖v_n‖,    n ∈ N

But then, by linearity and the homogeneity of the norm,

    ‖ T( v_n / (n‖v_n‖) ) ‖ ≥ 1

while

    ‖ v_n / (n‖v_n‖) ‖ = 1/n → 0

i.e. we would have found a sequence converging to zero (namely v_n/(n‖v_n‖)) whose T-image does not converge to zero, a contradiction.

Similarly to the matrix norm generated by a vector norm, we can define the norm of a bounded linear map:

Definition 2.4 Let T : (V, ‖·‖_V) → (W, ‖·‖_W) be a bounded linear map. The norm of T is defined to be the smallest constant C so that (2.2) holds, i.e.
    ‖T‖ := sup{ ‖T v‖_W / ‖v‖_V : v ∈ V, v ≠ 0 }    (2.3)

By linearity, this is the same as

    ‖T‖ = sup{ ‖T v‖_W : v ∈ V, ‖v‖_V = 1 }

Note that the norm of T depends on the two norms fixed on V and W, so it would be more honest to indicate this fact, but it usually follows from the context and we do not do it. The notation and the name already indicate that the number defined in (2.3) is indeed a norm. To make this absolutely precise, one has to introduce a vectorspace structure on the set of bounded linear maps between (V, ‖·‖_V) and (W, ‖·‖_W). This is done in the most natural way: if T and S are two such maps, then their sum is defined as

    (T + S)v := T v + Sv

and similarly for multiplication by a scalar. Notice that the two plus signs are not the same: the one on the left indicates the addition in the space of maps (just being defined), the one on the right is the addition in W.

Homework 2.5 Check that (2.3) indeed defines a norm on the vectorspace of bounded linear operators.

Finally, we remark that the norm on operators enjoys two additional properties. First, it is obvious that the identity map I : (V, ‖·‖_V) → (V, ‖·‖_V) is bounded and has norm 1:

    ‖I‖ = 1

A delicate remark: this relation holds only if V is considered to be equipped with the same norm both as a domain and as a target. For example, the identity map

    I : (C[0, 2], ‖·‖_1) → (C[0, 2], ‖·‖_∞)

is unbounded, while the identity map

    I : (C[0, 2], ‖·‖_∞) → (C[0, 2], ‖·‖_1)

is bounded and has norm 2.

The second property is submultiplicativity, the matrix analogue being the bound

    ‖AB‖ ≤ ‖A‖ ‖B‖    (2.4)

We formulate it as a lemma, whose proof is a HOMEWORK (it goes exactly as the proof of (2.4) in linear algebra).

Lemma 2.6 Let two maps be given,

    T : (V, ‖·‖_V) → (W, ‖·‖_W),    S : (W, ‖·‖_W) → (U, ‖·‖_U)

and suppose that both maps are bounded.
Then the composition map ST : (V, ‖·‖_V) → (U, ‖·‖_U) is also bounded, and

    ‖ST‖ ≤ ‖S‖ ‖T‖

A small remark: not every matrix norm satisfies (2.4). For example, the entrywise maximum norm ‖A‖ := max_{i,j} |a_{ij}| does not: for the 2×2 matrix A of all ones, ‖A²‖ = 2 > 1 = ‖A‖². (The so-called Frobenius norm, ‖A‖ := [Tr(AA*)]^{1/2}, does satisfy (2.4), even though it is not generated by any vector norm in the sense of Definition 2.4.) But those matrix norms that are generated by a vector norm do satisfy this property. In principle one could similarly define norms on the vectorspace of linear operators between two spaces that have nothing to do with any norms on the spaces themselves. For those norms submultiplicativity does not hold. We will not encounter such norms; they are unimportant.

3 Banach spaces

Recall the definition:

Definition 3.1 A normed vectorspace (V, ‖·‖) is called a Banach space (Banachraum) if it is complete (with respect to the metric generated by ‖·‖).

Obviously, every finite dimensional normed vectorspace is Banach. Moreover, we have also proved that (C(X), ‖·‖_∞), the space of continuous functions on a metric space equipped with the supremum norm (metric), is complete. In contrast, the space (C[0, 1], ‖·‖_1) is not complete (HOMEWORK!). It will be a fundamental question how to complete it; this will lead us to the Lebesgue integral.

The following theorem is very simple, almost tautological, but the idea behind it is fundamental.

Theorem 3.2 (Bounded extension principle) Let T : (V, ‖·‖_V) → (W, ‖·‖_W) be a bounded map and assume that W is a Banach space. Then T can be extended to the completion of V in a unique way, keeping the norm.

Proof. Let Ṽ denote the completion of V and let x ∈ Ṽ. Since V is dense in Ṽ, there is a sequence x_n ∈ V such that x_n → x. Clearly T x_n is Cauchy (in W), since

    ‖T x_n − T x_m‖ ≤ ‖T‖ ‖x_n − x_m‖

Since W is complete, T x_n converges, say T x_n → y. We want to define T̃x := y to be the extension of T, but before we can do so, we have to prove that y is independent of the choice of the sequence x_n and depends only on x.
But this is easy: if x′_n → x were another sequence with T x′_n → y′, then we can merge the two sequences into one by considering

    x_1, x′_1, x_2, x′_2, x_3, x′_3, . . .

This new sequence is also Cauchy, so its T-image is convergent, but then y = y′ (here we implicitly use the trivial fact that the limit in a normed – actually metric – space is unique). Now we can really define the extension by

    T̃x := y

It is immediate to check that T̃ is linear and that T is the restriction of T̃ from Ṽ to V (CHECK).

To compute the norm, we have

    ‖T̃x‖ = lim_n ‖T x_n‖ ≤ lim sup_n ‖T‖ ‖x_n‖ ≤ ‖T‖ ‖x‖

for any x ∈ Ṽ; thus ‖T̃‖ ≤ ‖T‖, and since T̃ is an extension of T, its norm is at least ‖T‖, so we have ‖T̃‖ = ‖T‖. The uniqueness of T̃ is trivial (CHECK).

The main application of this principle is that it is enough to specify a bounded linear map on a dense subspace if the target space is Banach. This has enormous advantages. A very naive, almost stupid example is the following. Consider Q ⊂ R equipped with the usual absolute value as norm. Suppose we want to define the "multiplication by two" linear operator on R, i.e.

    T : R → R,    T x = 2x

Since Q is dense in R, the above theorem says that it is sufficient to know the map

    T : Q → R,    T x = 2x

i.e. to know how to multiply rational numbers by two; from this information (and boundedness and linearity) it follows how to multiply any real number by two.

Here is a less trivial example that you have actually seen and heavily used, but did not view in this way.

Example: [Definition of the Riemann integral] Let PC[a, b] be the set of piecewise continuous functions on [a, b] (for definiteness, we prescribe that the functions be right continuous at the jumps), i.e. it is the set of all functions f : [a, b] → C such that there exist finitely many points x_1, . . . , x_n ∈ [a, b] such that f is continuous on (x_j, x_{j+1}) for any j = 0, 1, . . . ,
n (where x_0 = a, x_{n+1} = b), and f is right continuous at all x_j:

    lim_{x→x_j+0} f(x) = f(x_j),    j = 0, 1, . . . , n

We equip PC[a, b] with the supremum norm. Let us denote by S[a, b] the space of stepfunctions; this is the subset of PC[a, b] consisting of functions of the form

    f = Σ_{j=0}^{n−1} s_j χ_{[x_j, x_{j+1})} + s_n χ_{[x_n, b]}    (3.5)

where s_j ∈ C and χ_A denotes the characteristic function of a set A. It is easy to check (HOMEWORK) that S[a, b] is dense in PC[a, b] (with the supremum norm).

We define a linear map from S[a, b] to C as follows:

    I : f → Σ_{j=0}^{n} s_j (x_{j+1} − x_j)

if f is of the form (3.5). One has to check that this is well defined, i.e. I(f) does not depend on the representation (3.5) (such a representation of a stepfunction is not unique; WHY?). It is easy to see that I : S[a, b] → C is a bounded map (with norm |b − a|), and since C is Banach, by the bounded extension principle we can extend I to PC[a, b]. The map Ĩ : PC[a, b] → C is the Riemann integral.

If you read the proof of the bounded extension principle, you see that the abstract procedure there corresponds exactly to the procedure by which the Riemann integral was defined, passing from stepfunctions to continuous (or piecewise continuous) functions. With a similar argument one can extend I to the set of all Riemann integrable functions.

You may wonder if there is some cheating going on here. Eventually, what has happened so far is quite "soft", and the Riemann integral just popped up almost as a triviality, while in Analysis I you spent several weeks on it. The point is the same one I emphasized in the completion theorem: proving that some object exists via a general principle (like the completion viewed as a factorspace of Cauchy sequences) is usually not hard or deep. The key is always how to identify the object in a manageable, computationally feasible way.
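The extension procedure can be mimicked numerically (a sketch; the helper names step_integral and riemann_integral are illustrative, and I approximate a continuous f by its left-endpoint stepfunctions): I is computed exactly on stepfunctions, and the integral of a continuous f is recovered as the limit of I over stepfunction approximations that converge to f in the supremum norm.

```python
# I(f) = sum_j s_j (x_{j+1} - x_j) on stepfunctions, then extension by density.
def step_integral(points, values):
    # f = sum_j values[j] * chi_[points[j], points[j+1])
    return sum(s * (b - a) for a, b, s in zip(points, points[1:], values))

def riemann_integral(f, a, b, n):
    # approximate f in sup-norm by the stepfunction with f's left-endpoint values
    points = [a + (b - a) * k / n for k in range(n + 1)]
    return step_integral(points, [f(x) for x in points[:-1]])

# |I(step approx) - integral of f| <= ||step approx - f||_inf * (b - a),
# so the values converge as the stepfunctions converge to f:
approx = [riemann_integral(lambda x: x * x, 0.0, 1.0, n) for n in (10, 100, 1000)]
assert abs(approx[-1] - 1.0 / 3.0) < 1e-3
```

The boundedness of I with constant |b − a| is exactly what makes the limiting values stable under the stepfunction approximation.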
The main point behind the Riemann integral is not so much that one can take the limit of stepfunctions and obtain the "area under the graph" as the limit of the areas of approximating rectangles. All this may be notationally a bit painful to bookkeep (and it is done quite elegantly by the above abstract language), but nothing deep happens here. The deep and surprising fact is the Newton–Leibniz Theorem, which connects the Riemann integral with something else (the derivative) and thereby enables us to compute Riemann integrals without actually going through the stepfunction-limit procedure.
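A tiny numerical illustration of this last point (a sketch; the left-endpoint stepfunction approximation and the tolerance are my choices): the stepfunction-limit value of ∫_0^1 cos agrees with the Newton–Leibniz answer sin(1) − sin(0), which requires no limit procedure at all.

```python
import math

def stepfunction_limit_integral(f, a, b, n):
    # I applied to the left-endpoint stepfunction approximation of f
    h = (b - a) / n
    return sum(f(a + k * h) for k in range(n)) * h

limit_value = stepfunction_limit_integral(math.cos, 0.0, 1.0, 100_000)
newton_leibniz = math.sin(1.0) - math.sin(0.0)  # antiderivative at the endpoints
assert abs(limit_value - newton_leibniz) < 1e-4
```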