Chapter 2
Linear Spaces

Contents
  Field of Scalars
  Vector Spaces
  Subspaces
  Sum of subsets
  Linear combinations
  Linear independence
  Basis and Dimension
  Convexity
  Normed linear spaces
  The ℓp and Lp spaces
  Topological concepts
  Open sets
  Closed sets
  Bounded sets
  Convergence of sequences
  Series
  Cauchy sequences
  Banach spaces
  Complete subsets
  Transformations
  Linear transformations
  Continuity
  Compactness
  Upper semicontinuous functions
  Quotient Spaces
  Denseness
  Separability
  Schauder basis
  Summary

© J. Fessler, October 4, 2004, 12:44 (student version)

In systems analysis, linear spaces are ubiquitous. Why?
Linear systems/models are easier to analyze; many systems, particularly in signal processing, are deliberately designed to be linear; and linear models are a useful starting point (approximation) for more complicated nonlinear cases.

Formal definitions of a vector space use the concept of a field of scalars, so we first review that.

Field of Scalars (from Applied Linear Algebra, Noble and Daniel, 2nd ed.)

A field of scalars F is a collection of elements α, β, γ, . . . along with an "addition" and a "multiplication" operator.

To every pair of scalars α, β in F, there must correspond a scalar α + β in F, called the sum of α and β, such that
• Addition is commutative: α + β = β + α
• Addition is associative: α + (β + γ) = (α + β) + γ
• There exists a unique element 0 ∈ F, called zero, for which α + 0 = α, ∀α ∈ F
• For every α ∈ F, there corresponds a unique scalar (−α) ∈ F for which α + (−α) = 0.

To every pair of scalars α, β in F, there must correspond a scalar αβ in F, called the product of α and β, such that
• Multiplication is commutative: αβ = βα
• Multiplication is associative: α(βγ) = (αβ)γ
• Multiplication distributes over addition: α(β + γ) = αβ + αγ
• There exists a unique element 1 ∈ F, called one, or unity, or the identity element, for which 1α = α, ∀α ∈ F
• For every nonzero α ∈ F, there corresponds a unique scalar α⁻¹ ∈ F, called the inverse of α, for which αα⁻¹ = 1.

Simple facts for fields:
• 0 + 0 = 0 (use α = 0 in the definition of 0)
• −0 = 0
Proof. For any α, by the associative property, (α + 0) + (−0) = α + (0 + (−0)), hence α + (−0) = α. Hence, since the zero element is unique, −0 = 0.

Example. The set of rational numbers Q (with the usual definitions of addition and multiplication) is a field.

The only fields that we will need are the field of real numbers R and the field of complex numbers C. Therefore, hereafter we will use F when describing results that hold for either R or C.
Vector Spaces

In simple words, a vector space is a space that is closed under vector addition and under scalar multiplication.

Definition. A vector space or linear space consists of the following four entities.
1. A field F of scalars.
2. A set X of elements called vectors.
3. An operation called vector addition that associates a sum x + y ∈ X with each pair of vectors x, y ∈ X such that
   • Addition is commutative: x + y = y + x
   • Addition is associative: x + (y + z) = (x + y) + z
   • There exists an element 0 ∈ X, called the zero vector, for which x + 0 = x, ∀x ∈ X
   • For every x ∈ X, there corresponds a unique vector (−x) ∈ X for which x + (−x) = 0.
4. An operation called multiplication by a scalar that associates with each scalar α ∈ F and vector x ∈ X a vector αx ∈ X, called the product of α and x, such that:
   • Associative: α(βx) = (αβ)x
   • Distributive: α(x + y) = αx + αy
   • Distributive: (α + β)x = αx + βx
   • If 1 is the identity element of F, then 1x = x, ∀x ∈ X.
   • 0x = 0 for any x ∈ X.

The requirement that x + y ∈ X and αx ∈ X is sometimes called the closure property.

Simple facts for vector spaces:
• 0 is unique.
• (−1)x = −x for x ∈ X.
Proof. x + (−1)x = 1x + (−1)x = (1 + (−1))x = 0x = 0.
• α0 = 0 for α ∈ F.
Proof. α0 = α0 + 0 = α0 + [α0 + (−α0)] = [α0 + α0] + (−α0) = α(0 + 0) + (−α0) = α0 + (−α0) = 0.
• x + y = x + z implies y = z (cancellation law)
• αx = αy and α ≠ 0 implies x = y (cancellation law)
• αx = βx and x ≠ 0 implies α = β (cancellation law)
• α(x − y) = αx − αy (distributive law)
• (α − β)x = αx − βx (distributive law)
• −αx = α(−x) = −(αx)

What are some examples? (Linear algebra classes focus on finite-dimensional examples.)

Important Vector Spaces
• Euclidean space or n-tuple space: X = Rⁿ. If x ∈ X, then x = (a1, a2, . . . , an), where ai ∈ R, and we use ordinary addition and multiplication: x + y = (a1 + b1, a2 + b2, . . . , an + bn) and αx = (αa1, . . . , αan). (As a special case, the set of real numbers R (with ordinary addition and multiplication) is a trivial vector space.)
• X = L1(R²): the set of functions f : R² → R that are absolutely (Lebesgue) integrable, i.e., ∫_{−∞}^{∞} ∫_{−∞}^{∞} |f(x, y)| dx dy < ∞, with the usual pointwise definitions of addition and scalar multiplication. To show that f, g ∈ L1(R²) implies f + g ∈ L1(R²), one can apply the triangle inequality: |f + g| ≤ |f| + |g|.
• The set of functions on the plane R² that are zero outside of the unit square.
• The set of solutions to a homogeneous linear system of equations Ax = 0.
• C[a, b]: the space of real-valued, continuous functions defined on the interval [a, b].
• The space of band-limited signals.
• Many more in Luenberger...

Example. For 1 ≤ p < ∞, define the following infinite-dimensional¹ space:
X = Rp[a, b] = { f : [a, b] → R : f is Riemann integrable and ∫_a^b |f(t)|^p dt < ∞ },
(with the usual pointwise definitions of addition of functions and multiplication of functions by a scalar). To show that this space is a vector space, the only nontrivial work is verifying closure.

Clearly if f ∈ X then αf ∈ X, since ∫_a^b |αf(t)|^p dt = |α|^p ∫_a^b |f(t)|^p dt < ∞, so X is closed under scalar multiplication.

To show that f + g ∈ X if f, g ∈ X, i.e., closure under addition, requires a bit more work. Note that since |a + b| ≤ 2 max{|a|, |b|}, it follows for p ≥ 1 that |a + b|^p ≤ 2^p max{|a|^p, |b|^p} ≤ 2^p [|a|^p + |b|^p]. Hence if f, g ∈ X:
∫_a^b |f(t) + g(t)|^p dt ≤ ∫_a^b [2^p |f(t)|^p + 2^p |g(t)|^p] dt = 2^p ∫_a^b |f(t)|^p dt + 2^p ∫_a^b |g(t)|^p dt < ∞,
showing closure under addition.

Example. Can X = (0, ∞) with F = R be a vector space? ??

Cartesian product

We can make a "larger" vector space from two vector spaces X and Y (having a common field F) by forming the Cartesian product of X and Y, denoted X × Y, which is the collection of ordered pairs (x, y) where x ∈ X and y ∈ Y:
X × Y = {(x, y) : x ∈ X and y ∈ Y}.
To be a vector space, we must define vector addition and scalar multiplication operations, which we define component-wise:
• (x1, y1) + (x2, y2) = (x1 + x2, y1 + y2)
• α(x, y) = (αx, αy), ∀α ∈ F.

Fact. With these definitions, the Cartesian product of two vector spaces is indeed a vector space.

The above definition generalizes easily to higher-order combinations. Example. R³ = R × R × R.

¹A definition of dimension is forthcoming...

2.3 Subspaces

A (nonempty) subset S of a vector space X is called a subspace of X if S, when endowed with the addition and scalar multiplication operations defined for X, is a vector space, i.e., αx + βy ∈ S whenever x, y ∈ S and α, β ∈ F.

Example. The subset of X = R2[−1, 1] consisting of symmetric functions (f(−t) = f(t)) is a subspace of X. It is clearly closed under addition and scalar multiplication.

What are the four types of subspaces of R³? ??

Intuition: think of a subspace like a line or plane (or hyperplane) through the origin.

Properties of subspaces
• 0 ∈ S
• {0} is a subspace of X
• X is a subspace of X
• A subspace not equal to the entire space X is called a proper subspace.
• If M and N are subspaces of a vector space X, then the intersection M ∩ N is also a subspace of X. Proof. See text. Think: intersection of planes (through the origin) in 3D.
• Typically the union of two subspaces is not a subspace. Think: union of planes (through the origin) in 3D.

Although unions usually fail, we can combine two subspaces by an appropriate sum, defined next.

Sum of subsets

Definition. If S and T are two subsets of a vector space, then the sum of those subsets, denoted S + T, is defined by
S + T = {x = s + t : s ∈ S, t ∈ T}.

Example. What is the sum of a plane and a line (both through the origin) in R³? ??

Example. Consider X = R², with S = {(x, 0) : x ∈ [0, 1]} and T = {(0, y) : y ∈ [0, 1]}. Then S + T is the unit square.

Proposition. If M and N are subspaces of a vector space X, then the sum M + N is a subspace of X. Proof. See text.

Does the previous example illustrate this proposition? ??

Example. Let X = {f : f(t) = a sin(t + φ) for a, φ ∈ R} (with the usual definitions of addition and scalar multiplication²). Then M = {f : f(t) = a sin(t) for a ∈ R} and N = {f : f(t) = a cos(t) for a ∈ R} are both (proper) subspaces of X.

What is M + N? ??

²It is time to stop saying this. From now on we leave it implicit whenever this is clear, which it usually is.

Linear combinations

Definition. A finite sum Σ_{i=1}^n αi xi, for xi ∈ X and αi ∈ F, is called a linear combination. (The associative property of vector addition allows us to write such a sum without parentheses.)

Depending on where the xi's originated, we get various properties of linear combinations.
• X: If xi ∈ X, i = 1, . . . , n, then Σ_{i=1}^n αi xi ∈ X. This is shown easily by induction from the definition of a vector space.
• M: If xi ∈ M, i = 1, . . . , n, where M is a subspace, then Σ_{i=1}^n αi xi ∈ M, by induction from the definition of a subspace. Any linear combination of vectors from a subspace is also in the subspace.
• S: What if we take linear combinations from a subset rather than a subspace?

Definition. If S is a subset of a vector space X, then the subspace generated by S is the subspace of linear combinations drawn from S, defined by
[S] = { x ∈ X : x = Σ_{i=1}^n αi xi, for xi ∈ S, αi ∈ F, and n ∈ N }.

• [S] is also called the span or the linear hull of S.
• [S] is indeed a subspace of X, since a linear combination of linear combinations is itself a linear combination.
• [S] is the smallest subspace of X containing S, i.e., if M is a subspace of X that contains S, then [S] ⊆ M.
• If M is a subspace of X, then [M] = ??
• Clearly S ⊆ [S].
• Note that only finite sums are involved, as in all linear combinations.

Example.
For X = R³, what is [S] when S consists of a line through the origin plus any point not on that line? ??

Intuition: a subspace of a general vector space generalizes the notion of a line or plane through the origin of Euclidean 3D space. What about lines or planes that are not through the origin?

Linear varieties (skip for now — not needed in Ch. 2 problems; wait until 3.10)

Definition. A subset V of a vector space X is called a linear variety iff V = x0 + M for some x0 ∈ X and some subspace M of X. Another term used is affine subspace. Linear varieties arise in certain minimum norm problems.

2.5 Linear independence

Often we need to quantify how "big" a subspace is.

Definition. A vector x is called linearly dependent on a set S of vectors iff x ∈ [S], i.e., x is in the span of S. Otherwise, if x ∉ [S], then x is called linearly independent of S.

Definition. A set S of vectors is called a linearly independent set if each vector in the set is linearly independent of the remaining vectors in the set, i.e.,
∀x ∈ S, x ∉ [S − {x}].

Remark. S may be uncountable, but testing whether x ∈ [S − {x}] requires consideration only of finite sums, by the definition of linear combinations.

Example. This illustrates that S can be uncountable! Let X = {f : [0, 1] → R} and define gs(t) = 1 if t = s, and 0 otherwise. Then S = {gs : s ∈ [0, 1]} is a linearly independent subset of X.

Theorem. A finite set of vectors {x1, . . . , xn} is linearly independent iff Σ_{i=1}^n αi xi = 0 implies that αi = 0, ∀i. (We are skipping proofs that are found in basic linear algebra books.)

Corollary. If a finite set S = {x1, . . . , xn} is linearly independent and y ∈ [S], then y has a unique expansion y = Σ_{i=1}^n αi xi for some α1, . . . , αn ∈ F.

Basis and Dimension

Definition. A set S is called a basis or Hamel basis [3, p. 183] for X iff S is linearly independent and [S] = X.

Luenberger [4, p. 20] says "finite set," but Naylor [3, p. 183] and Maddox [2, p. 74] do not. Let us agree to use the above definition, rather than Luenberger's. Note that X can (and usually will) have more than one basis!

Definition. If X has a basis S that is a finite set, then we call X finite dimensional. Otherwise, if no such finite S exists, we call X infinite dimensional.

Definition. A space with a basis consisting of n elements is called an n-dimensional space. This terminology is acceptable thanks to the following result.

Theorem. Any two bases for a finite-dimensional vector space contain the same number of elements.

Most, but not all, properties of (more easily understood) finite-dimensional spaces generalize to infinite-dimensional spaces.

Example. Pn = {polynomials of degree ≤ n}. A basis is {1, t, t², . . . , tⁿ}, which has n + 1 elements, so Pn has dimension n + 1. Why linearly independent? ?? Another basis is given by the Legendre polynomials: d^k/dt^k (t² − 1)^k, k = 0, 1, . . . , n.

Exercise. C[0, 1] is infinite dimensional. Hint: prove by contradiction, considering Pn+1.

A basis is a generalization of the usual concept of a coordinate system.

Fact. [3, p. 184] If S is a linearly independent set in a vector space X, then there exists a basis B for X such that S ⊆ B. Thus, every vector space has a Hamel basis (since the empty set is linearly independent).

However, "Hamel basis is not the only concept of basis that arises in analysis. There are concepts of basis that involve topological as well as linear structure. ... In applications involving infinite-dimensional spaces, a useful basis, if one even exists, is usually something other than a Hamel basis. For example, a complete orthonormal set is far more useful in an infinite-dimensional Hilbert space than a Hamel basis" [3, p. 183]. (More on this later!)

We often need sets with less rigid structure than subspaces but that nevertheless still have some structure, so we digress a bit here.
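For a finite set of vectors in Rⁿ, the theorem above reduces linear independence to checking that the only vanishing linear combination is the trivial one, which can be carried out numerically as a rank computation. The following is a small illustrative sketch in Python (not part of the original notes; the helper name `is_linearly_independent` is hypothetical):

```python
def is_linearly_independent(vectors, tol=1e-10):
    """Test linear independence of a finite list of vectors in R^n
    by computing the rank via Gaussian elimination: the set is
    independent iff the rank equals the number of vectors."""
    rows = [list(v) for v in vectors]
    rank = 0
    ncols = len(rows[0])
    for col in range(ncols):
        # Find a pivot row for this column among the unprocessed rows.
        pivot = next((r for r in range(rank, len(rows))
                      if abs(rows[r][col]) > tol), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        # Eliminate this column from every other row.
        for r in range(len(rows)):
            if r != rank and abs(rows[r][col]) > tol:
                f = rows[r][col] / rows[rank][col]
                rows[r] = [a - f * b for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank == len(vectors)

print(is_linearly_independent([(1, 0, 0), (0, 1, 0), (0, 0, 1)]))  # True
print(is_linearly_independent([(1, 2, 3), (2, 4, 6)]))             # False
```

Note the tolerance parameter: in exact arithmetic independence is a yes/no question, but in floating point a threshold is needed to decide when a pivot is "zero."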
2.4 Convexity

Luenberger describes convexity as "the fundamental algebraic concept of vector space" (p. 25). According to the Oxford English Dictionary, an algebra is "a calculus of symbols combining according to certain defined laws."

Definition. A set K in a vector space is called convex iff for any x, y ∈ K, αx + (1 − α)y ∈ K for all α ∈ [0, 1].

Geometrically: for any two points in a convex set, the "line segment" between them is also in the set.

Properties of convex sets
• Subspaces and linear varieties are convex.
• {0} is "vacuously" convex.
• For α ∈ F, αK ≜ {x = αk : k ∈ K} is convex if K is convex (magnification or minification of a set).
• K1 + K2 is convex if K1 and K2 are convex sets in a common vector space.
• If C is a collection of convex sets (in a common vector space), then ∩_{K∈C} K is a convex set. (Important for POCS methods.)

Convex hull

Definition. The convex cover or convex hull of a set S in a vector space is the smallest convex set containing S, denoted co(S). Equivalently, the convex hull of S is the intersection of all convex sets containing S:
co(S) = ∩_{K convex : S ⊆ K} K.
(Picture in 2D of a blob and its convex hull.)

Problem 2.4 gives a more constructive form for co(S).

Cone

Definition. A set C in a vector space is called a cone with vertex at the origin if x ∈ C implies that αx ∈ C for all α ∈ [0, ∞).
(Pictures: a cone; a convex cone.)

Example. The space of nonnegative continuous (real) functions is a convex cone in the vector space of continuous functions.

(Figure: diagram of the inclusions among subsets, cones, convex cones, convex sets, subspaces, and {X} — e.g., every subspace is a convex cone, and every convex cone is both a cone and a convex set.)

Exercise. An arbitrary union of subspaces (in a common vector space) is a cone.

2.6 Normed linear spaces

We've gone about as far as we can with just the basic axioms of a vector space. Fortunately, most of the vector spaces of interest have additional structure: a norm.

Definition.
A norm on a vector space X is a function ‖·‖ : X → R that satisfies the following for every x ∈ X.
• ‖x‖ ≥ 0 (nonnegativity)
• ‖x‖ = 0 iff x = 0 (positive definiteness)
• ‖αx‖ = |α| ‖x‖, ∀α ∈ F (scaling property) (homogeneity)
• ‖x + y‖ ≤ ‖x‖ + ‖y‖, ∀y ∈ X (triangle inequality)

Notice that the absolute value function |α| appears above. It is here where our "restriction" to the fields R and C enters³.

Definition. (X, ‖·‖) is a normed vector space or normed linear space or normed linear vector space or just normed space⁴.

Clearly a norm generalizes the usual notion of length. The following lemma arises remarkably frequently in proofs.

Lemma. In a normed space, ‖x‖ − ‖y‖ ≤ ‖x − y‖ for all x, y ∈ X, i.e., | ‖x‖ − ‖y‖ | ≤ ‖x − y‖.
Proof. ‖x‖ = ‖x − y + y‖ ≤ ‖x − y‖ + ‖y‖.

Example. Euclidean n-space: Eⁿ ≜ (Rⁿ, ‖x‖ = √(Σ_{i=1}^n x_i²)).
Example. Euclidean n-space with a weighted norm: (Rⁿ, ‖x‖ = √(Σ_{i=1}^n w_i x_i²)) for w_i > 0.
Example. A different norm for Rⁿ: (Rⁿ, ‖x‖ = max_k |x_k|).
Proof. Clearly this norm is nonnegative, and is zero only if |x_k| = 0 ∀k, i.e., if x = 0. And max_k |αx_k| = |α| max_k |x_k|. For the triangle inequality, we note that max_k |x_k + y_k| ≤ max_k [|x_k| + |y_k|] ≤ max_k |x_k| + max_k |y_k|.

Those are all finite-dimensional examples.

Example. The vector space of continuous functions on the interval [a, b] with the norm ‖f‖₂ = √(∫_a^b |f(t)|² dt). What if we replaced "continuous" with "square integrable"? (See footnote p. 32 regarding equivalence classes of functions that are equal a.e.)

Example. C[a, b]: the space of continuous functions on the interval [a, b] with the norm ‖f‖_∞ = max_{a≤t≤b} |f(t)|. Later we will see a reason why this "max" norm can be preferable.

Example. The space of real m × n matrices, with ‖A‖ = √(tr(AAᵀ)). Problems involving spaces of matrices arise fairly frequently in systems analysis.

R² and R

For R² there are three particularly important norms. For x = (a, b), consider:
• √(a² + b²) (Euclidean, or E²)
• |a| + |b| (ℓ1, or city block)
• max{|a|, |b|} (ℓ∞)
Are there others? Yes, e.g., weighted versions: √(w1 a² + w2 b²), w_k > 0.

What about R? One norm is ‖x‖ = |x|. Are there others? If g(x) is a norm on R, then one condition is that g(αx) = |α| g(x), ∀α ∈ R. Thus, choosing x = 1, we must have g(α) = |α| g(1), where g(1) > 0 since 1 ≠ 0. Thus all norms on R have the form ‖x‖ = w|x| for some w > 0. All such norms are equivalent, in a sense to be defined in HW, so there is no point in considering anything but the case w = 1. So we usually just speak of R, rather than E¹ = (R, ‖·‖ = |·|).

³Probably it is possible to define normed spaces over other fields for which one can define a suitable |·| function that satisfies: (i) |α| = 0 ⇐⇒ α = 0, (ii) |αβ| = |α| |β|, (iii) |α + β| ≤ |α| + |β|. But would such spaces be useful?
⁴The term normed space should suffice because the presence of a vector space is implied in the axioms of a norm, since vector addition, scalar multiplication, and the zero vector are all part of those axioms.

2.10 The ℓp and Lp spaces

These normed spaces are ubiquitous in the engineering literature, and represent perhaps the most important examples of (infinite-dimensional) normed vector spaces.

Definition. Let p ∈ [1, ∞). The space ℓp consists of all sequences of scalars a1, a2, . . . for which Σ_{i=1}^∞ |a_i|^p < ∞. The norm of a vector x ∈ ℓp is defined by
‖x‖_p = (Σ_{i=1}^∞ |a_i|^p)^{1/p} < ∞.

Definition. The space ℓ∞ consists of bounded sequences. The norm of a vector x = {a_i} in ℓ∞ is defined by
‖x‖_∞ = sup_i |a_i|.

Before referring to these spaces as normed spaces, we must confirm that each of the functionals defined above is indeed a norm.
• ‖αx‖_p = |α| ‖x‖_p is trivial to verify.
• ‖x‖_p > 0 unless x = 0.
• What about the triangle inequality? It holds due to the Minkowski inequality, which in turn follows from the Hölder inequality.

Theorem.
(The Hölder inequality) If p ∈ [1, ∞) and q ∈ [1, ∞) satisfy 1/p + 1/q = 1, and if x = (a1, a2, . . .) ∈ ℓp and y = (b1, b2, . . .) ∈ ℓq, then
Σ_{i=1}^∞ |a_i b_i| ≤ ‖x‖_p ‖y‖_q.
Moreover, equality holds iff either of the following two conditions holds:
• either x or y equals 0, or
• both x and y are nonzero and (|a_i| / ‖x‖_p)^{1/q} = (|b_i| / ‖y‖_q)^{1/p}, ∀i.
See errata: Luenberger's equality condition omits the cases where x or y equal 0.
Proof. See text.

The special case p = q = 2 is particularly important, and is known as the Cauchy–Schwarz inequality:
Σ_i |a_i b_i| ≤ √(Σ_i |a_i|²) √(Σ_i |b_i|²), and hence |Σ_i a_i b_i| ≤ √(Σ_i |a_i|²) √(Σ_i |b_i|²).

Theorem. (The Minkowski inequality) If x, y ∈ ℓp, p ∈ [1, ∞], then so is x + y, and
‖x + y‖_p ≤ ‖x‖_p + ‖y‖_p.
For p ∈ (1, ∞), equality holds iff x and y are linearly dependent. See errata: Luenberger's condition for equality is incomplete since it omits the cases where x or y equal 0.
Proof. See text. (It uses the Hölder inequality.) At one point in the proof, we have the inequality
(Σ_{i=1}^n |a_i + b_i|^p)^{1/p} ≤ (Σ_{i=1}^n |a_i|^p)^{1/p} + (Σ_{i=1}^n |b_i|^p)^{1/p}.
Taking the limit as n → ∞ on the RHS (which increases monotonically with n) yields (Σ_{i=1}^n |a_i + b_i|^p)^{1/p} ≤ ‖x‖_p + ‖y‖_p. Then taking the limit as n → ∞ on the LHS yields ‖x + y‖_p ≤ ‖x‖_p + ‖y‖_p. This technique of first working with one side and then the other arises frequently.

The trouble with Riemann integration

The Lp space, defined below, involves Lebesgue integration. Why? Because the set of Riemann integrable functions is not as general as we would like. For example, consider the function f(t) = 1 if t ∈ Q, and 0 otherwise. This function is not Riemann integrable, but it is Lebesgue integrable. (Its Lebesgue integral is zero since the function is nonzero only on a set of Lebesgue measure zero.) When a function is Riemann integrable, its Lebesgue integral will equal its Riemann integral.
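Stepping back to the Hölder and Minkowski inequalities above: they can be spot-checked numerically on random finite sequences, which stand in for elements of ℓp that are zero beyond index n. A quick Python sanity check (not from the notes; `p_norm` is a hypothetical helper):

```python
import random

def p_norm(x, p):
    """The l^p norm of a finite sequence x (viewed as an l^p vector
    padded with zeros)."""
    return sum(abs(a) ** p for a in x) ** (1.0 / p)

random.seed(0)
for _ in range(500):
    n = random.randint(1, 10)
    x = [random.uniform(-5, 5) for _ in range(n)]
    y = [random.uniform(-5, 5) for _ in range(n)]
    p = random.uniform(1.1, 4.0)
    q = p / (p - 1.0)  # conjugate exponent: 1/p + 1/q = 1

    # Hölder: sum_i |a_i b_i| <= ||x||_p ||y||_q
    assert sum(abs(a * b) for a, b in zip(x, y)) \
        <= p_norm(x, p) * p_norm(y, q) + 1e-9

    # Minkowski: ||x + y||_p <= ||x||_p + ||y||_p
    s = [a + b for a, b in zip(x, y)]
    assert p_norm(s, p) <= p_norm(x, p) + p_norm(y, p) + 1e-9

print("Hölder and Minkowski hold on all random samples")
```

Of course a random check is no proof; it only catches a misremembered inequality (e.g., swapping p and q) quickly.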
Considering Lebesgue integrable functions will be more than general enough for any engineering problems.

The Lp space

Definition. Let p ∈ [1, ∞). The space Lp[a, b] consists of all real-valued measurable functions x on the interval [a, b] for which |x(t)|^p is Lebesgue integrable. The norm on this space is defined as
‖x‖_p = (∫_a^b |x(t)|^p dt)^{1/p}.

However, there is a subtle caveat with this space. There are functions that are nonzero only on a set of measure zero, for which ‖x‖_p = 0. So to consider this space to be a normed space, we must treat functions that are equal almost everywhere (a.e.) as being equivalent. In other words, a vector in Lp is really an equivalence class of measurable functions that are all equal almost everywhere.

This makes the definition of L∞[a, b] a bit more subtle than ℓ∞. In particular, we cannot define ‖x‖_∞ to be simply the obvious choice "sup_{a≤t≤b} |x(t)|," because that value will be different for different functions in the equivalence class. Instead, we define
‖x‖_∞ = essential supremum of |x(t)| = inf_{y = x a.e.} sup_{a≤t≤b} |y(t)| = ess sup |x(t)|.
See [4, p. 33] for an example.

If x ∈ Lp[a, b] and y ∈ Lq[a, b] with p, q > 1 and 1/p + 1/q = 1, then the Hölder inequality is ∫_a^b |x(t)y(t)| dt ≤ ‖x‖_p ‖y‖_q. Similarly, ‖x + y‖_p ≤ ‖x‖_p + ‖y‖_p is the Minkowski inequality for Lp. The proofs are very similar to the ℓp cases.

Topological concepts

According to the Oxford English Dictionary, topology is the "branch of mathematics concerned with those properties of figures and surfaces which are independent of size and shape and are unchanged by any deformation that is continuous, neither creating new points nor fusing existing ones; hence, with those of abstract spaces that are invariant under homeomorphic transformations."

Anyway, we have been systematically generalizing notions from geometry to more general settings, but one important concept we have yet to generalize is that of distance.
The concept of distance is central to any discussion of optimization, since many optimization problems involve finding the element (within a set) that is the closest to some given point outside that set.

Definition. If x and y are two points in a normed space (X, ‖·‖), then the distance⁵ between x and y is defined by
d(x, y) = ‖x − y‖.
More generally, if S is a nonempty subset of X, then the distance between a point x ∈ X and the set S is defined by
d(x, S) = inf_{y∈S} ‖x − y‖.

Example. In R, d(1, (3, 4]) = 2, for the usual norm ‖x‖ = |x|. (Picture) Note that there is no "closest point," a complication that will require careful attention later.

Example. In R², what is the distance between (0, 0) and {(a, b) : 2 ≤ a ≤ 3, 1 ≤ b ≤ 2}? (Trick question) ??

How would you define the distance between two sets S and T? ??

Properties of distance d. (We will use these later.)

Lemma 2.1. |d(x, S) − d(y, S)| ≤ ‖x − y‖. (Picture)
Proof. d(x, S) = inf_{z∈S} ‖x − z‖ = inf_{z∈S} ‖(x − y) − (z − y)‖ ≥ inf_{z∈S} [‖z − y‖ − ‖x − y‖] = d(y, S) − ‖x − y‖. Now rearrange. □

Lemma 2.2. For any two subsets U and V of a normed space X, d(U, V) ≤ d(x, U) + d(x, V), ∀x ∈ X. (Picture)
Proof. d(U, V) = inf_{u∈U} d(u, V) ≤ inf_{u∈U} [d(x, V) + ‖x − u‖] = d(x, V) + d(x, U). □

Lemma. Let (X, ‖·‖) be a normed space, M ⊆ X a subspace, and x ∈ X a point. Then ∀α ∈ F:
d(αx, M) = |α| d(x, M).
Proof. Trivial for α = 0. For α ≠ 0: d(αx, M) = inf_{y∈M} ‖αx − y‖ = inf_{z∈M} ‖αx − αz‖ = |α| inf_{z∈M} ‖x − z‖ = |α| d(x, M). (Picture) □

Preview of optimization

Many problems in optimization can be expressed as follows. Given x in a normed space (X, ‖·‖), and a subset S in X, find "the" vector s ∈ S that minimizes ‖x − s‖. What questions should we ask about such problems?
• Is there any best s? I.e., does there exist s⋆ ∈ S s.t. ‖x − s⋆‖ = d(x, S)?
• If so, is s⋆ unique?
• How is s⋆ characterized? (Better yet would be an explicit formula for s⋆.)
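For a finite subset S of Rⁿ, the infimum in d(x, S) is attained and is easy to compute directly, which makes the distance lemmas above concrete. A small Python sketch (not from the notes; the helper names `dist` and `d_point_set` are hypothetical):

```python
import math

def dist(x, y):
    """Euclidean distance ||x - y|| between two points of R^n."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def d_point_set(x, S):
    """d(x, S) = inf over y in S of ||x - y||.
    For a finite set S the infimum is attained, so it is a min."""
    return min(dist(x, y) for y in S)

# A finite sample set in R^2 and two test points.
S = [(2.0, 1.0), (3.0, 2.0), (2.5, 1.5)]
x = (0.0, 0.0)
y = (1.0, 1.0)

print(d_point_set(x, S))  # distance from the origin to the nearest point of S

# Lemma 2.1: |d(x, S) - d(y, S)| <= ||x - y||
assert abs(d_point_set(x, S) - d_point_set(y, S)) <= dist(x, y) + 1e-12
```

For infinite sets such as the interval (3, 4] in the example above, the infimum need not be attained by any point of the set — precisely the existence question raised in the optimization preview.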
One purpose of some of the material that follows is to answer these questions. (But be prepared for a challenge, since some aspects of these questions remain open problems!) We will return to these questions after introducing Hilbert spaces in Ch. 3.

⁵This type of distance is a special case of the more general concept of a distance function that is called a metric if it satisfies:
• d(x, x) = 0
• d(x, y) > 0 if x ≠ y
• d(x, y) = d(y, x)
• d(x, y) ≤ d(x, z) + d(z, y) (triangle inequality)
Exercise. Verify that a metric is a more general concept than a norm, i.e., find a metric d(x, y) that is not of the form d(x, y) = ‖x − y‖ for any norm ‖·‖. Hint: a metric need not satisfy the scaling property. ??

2.7 Open sets

Looking ahead: in optimization we often use iterative algorithms that generate sequences, and we need to know when a sequence converges to a limit that belongs to the same set as the sequence itself. Such questions are related to whether a set is closed or not, so we need concepts of open and closed sets.

Definition. The open sphere centered at x of radius ε is defined by
S(x, ε) ≜ {y ∈ X : ‖x − y‖ < ε}.
It is also called the open ball, which is perhaps more descriptive, since a sphere is often considered to be a surface, whereas a ball might more often be considered a solid.

Example. Consider (X = C[0, 1], ‖·‖_∞) and x = 0. Then S(x, 2) = {continuous functions f : [0, 1] → R : max_{t∈[0,1]} |f(t)| < 2}. (Picture of continuous functions between ±2.)

Clearly this definition of a "sphere" generalizes the 3-space notion of a sphere! Now we need a definition for "inside," another generalization.

Definition. Let P be a subset of a normed space (X, ‖·‖). (Picture in 2D with a generic set.) A point p ∈ P is called an interior point of P iff ∃ε > 0 s.t. S(p, ε) ⊆ P.

Example. Let X = {(a, b) ∈ R² : a = b} (the line at 45° in the plane) with ‖x‖ = |a| if x = (a, a).
Is x = (1, 1) an interior point of X, i.e., does there exist an open ball that is a subset of X? ??
Suppose we change to X = R² with the usual Euclidean norm and S = {(a, b) ∈ R² : a = b}. Is x = (1, 1) an interior point of S? ??
Which points within an open sphere are interior points? ??

Definition. The interior of a set P, denoted Int(P), is the collection of all interior points of P:
Int(P) ≜ {p ∈ P : p is an interior point of P} = {p ∈ P : ∃ε > 0 s.t. S(p, ε) ⊆ P}.

Definition. A set P is called open iff P = Int(P).

Remark. Int(P) ⊆ P, so to show a set P is open we must show P ⊆ Int(P), i.e., show that x ∈ P =⇒ ∃ε > 0 s.t. S(x, ε) ⊆ P.

Examples of open sets.
• ∅ and X are open sets
• open spheres are open (exercise)
• Int(Int(P)) = Int(P)
• Int(P) is open
• P × Q is open in X × Y if P is open in X and Q is open in Y (problem 2.16, p. 44)
• (1, 2) × (0, 1) ⊂ E² is open

Proposition. x ∈ Int(P) ⇐⇒ d(x, P̃) > 0, where P̃ denotes the complement of P. (Picture)
Proof. x ∈ Int(P) =⇒ ∃ε > 0 s.t. S(x, ε) ⊆ P, so d(x, P̃) = inf_{y∈P̃} ‖x − y‖ = inf_{y∉P} ‖x − y‖ ≥ ε, since y ∉ P =⇒ y ∉ S(x, ε) =⇒ ‖y − x‖ ≥ ε.
We can prove the converse by proving its contrapositive⁶. Suppose x ∉ Int(P). Then ∀ε > 0, S(x, ε) ∩ P̃ ≠ ∅, so ∃y ∈ P̃ s.t. ‖x − y‖ < ε. Thus d(x, P̃) = inf_{y∈P̃} ‖x − y‖ = 0.
Alternative direct proof. δ ≜ d(x, P̃) > 0 =⇒ ‖z − x‖ ≥ δ > 0, ∀z ∈ P̃. But y ∈ S(x, δ) =⇒ ‖y − x‖ < δ =⇒ y ∉ P̃ =⇒ y ∈ P. So S(x, δ) ⊆ P and hence x ∈ Int(P). □

Fact. Similarly: x ∈ Int(P̃) ⇐⇒ d(x, P) > 0, since the complement of P̃ is P.

⁶We have shown that A =⇒ B and we want to show the converse B =⇒ A, the contrapositive of which is: not A =⇒ not B.

Closed sets

Definition. A point x ∈ X is called a closure point (or a cluster point or an adherent point) of a set P iff ∀ε > 0, ∃p ∈ P s.t. ‖x − p‖ < ε.

Fact. If P is nonempty, then x is a closure point of P iff d(x, P) = 0.

Definition.
The collection of all closure points of a set P is called the closure of P and is denoted P̄.

Example. P = {1, 1/2, 1/3, . . . , 1/10} ⟹ P̄ = P.
Example. P = [0, 1) ⟹ P̄ = [0, 1].
Example. P = {1/n : n ∈ N} ⟹ P̄ = P ∪ {0}.
Example. Q̄ = R.

Properties.
• P̄ = {x ∈ X : d(x, P) = 0} if P is nonempty. (If P is empty, then so is P̄.)
• P ⊆ P̄
• The closure of P̄ is P̄ itself.

Definition. A set P is called closed iff P = P̄.

Remark. Since P ⊆ P̄, to show a set P is closed we must show P̄ ⊆ P, i.e., if p is any closure point of P, then p ∈ P.

Examples of closed sets.
• ∅ is closed (!)
• X is closed (!)
• {x} is closed
• {y : d(x, y) ≤ 1} is closed

Proposition.
• P is open ⟹ P̃ is closed
• P is closed ⟹ P̃ is open

Proof. Suppose P is open. x ∈ P ⟹ ∃ε > 0 s.t. S(x, ε) ⊆ P ⟹ d(x, P̃) ≥ ε > 0 ⟹ x is not a closure point of P̃. The contrapositive is: if x is a closure point of P̃, then x ∉ P, i.e., the closure of P̃ is contained in P̃. Hence P̃ is closed.
Suppose P is closed. x ∈ P̃ ⟹ x ∉ P = P̄ ⟹ ε ≜ d(x, P) > 0. Thus S(x, ε) ⊂ P̃, so x ∈ Int(P̃). Thus P̃ ⊆ Int(P̃) and we conclude P̃ is open. (See Luenberger errata for p. 25.) □

Remark. We could also state the proposition simply as "P is open ⟺ P̃ is closed."

Proposition.
• The intersection of a finite number of open sets is open.
• The union of a finite number of closed sets is closed.
• The intersection of an arbitrary number of closed sets is closed.
• The union of an arbitrary number of open sets is open.
Proof. Exercise.

Example. Consider the open intervals Sn = (−1/n, 1/n) ⊂ R, n = 1, 2, . . ..
Then ∪_{n=1}^∞ Sn = (−1, 1), which is open. And ∩_{n=1}^N Sn = S_N, which is open. What is ∩_{n=1}^∞ Sn? ?? □

Now we consider open and closedness in the context of the special types of sets we considered previously: convex sets and subspaces.

Proposition. If C is a convex set in a normed space, then C̄ and Int(C) are convex.
Proof. See text.

Open and closed subspaces

Fact. If M is a subspace of X and M is open, then M = X [3, p. 229].
Fact. If M is a subspace of X, then M̄ is a subspace of X [3, p. 229].

In general, subspaces are not necessarily closed in infinite-dimensional normed spaces.

Example. The subspace M of continuous functions in L2, because the rect function is in L2 and is a closure point of M but is not in M. (Picture)
Example. The subspace M of finite-length sequences in ℓ2, because the geometric sequence (1, 1/2, 1/4, 1/8, . . .) is in ℓ2 and is a closure point of M, but is not in M.

Caution! We have now seen one of our first examples of a situation where our intuition from E³ does not generalize to general normed spaces! We have been thinking of subspaces as being like planes or hyperplanes. Yet even though planes are closed subsets of E³, subspaces are not necessarily closed in general.

Remark. We had to use infinite-dimensional examples above because (as we will show soon) finite-dimensional subspaces are closed.

Closed sets and distances

Lemma. In a normed space, let U and V be disjoint subsets. If V is closed, then d(u, V) > 0, ∀u ∈ U.
Proof. Pick any u ∈ U and suppose d(u, V) = 0. Then u ∈ V̄ = V since V is closed. But u ∈ V contradicts the assumption that U and V are disjoint. □

Lemma 2.3 In a normed space, d(y, S) > 0 ⟹ d(y, S̄) > 0. ??

Bounded sets

Definition. A set S in a normed space (X, ‖·‖) is called bounded iff ∃M < ∞ such that ‖x‖ ≤ M, ∀x ∈ S.

Are closed sets bounded? ?? Bounded sets can be open, or closed, or neither. □

Sequences

Definition. A sequence is a set of vectors indexed by the natural numbers N, e.g., {xn} = {xn : n ∈ N}. Formally, a sequence is a mapping from N to some vector space X.

Definition. If {xn} is a sequence and n1 < n2 < · · · , then {x_{ni}} is called a subsequence of {xn}.

Notation. {xn} ∈ X iff xn ∈ X, ∀n.

2.8 Convergence of sequences

Definition.
In a normed space, we say a sequence of vectors {xn} converges to a vector x iff the sequence of real numbers ‖xn − x‖ converges to zero, in which case we write xn → x or lim_{n→∞} xn = x.

In other words, considering the definition of convergence of real numbers:
xn → x ⟺ ‖xn − x‖ → 0
⟺ ∀ε > 0, ∃Nε < ∞ s.t. n ≥ Nε ⟹ ‖xn − x‖ < ε
⟺ ∀ε > 0, ∃Nε < ∞ s.t. n ≥ Nε ⟹ xn ∈ S(x, ε).

Example. (Infinite dimensional, of course.) Consider L2[0, 1] with x(t) = 1 and xn(t) = 1 + t/n.
Then xn → x because ‖x − xn‖2 = (∫₀¹ |x(t) − xn(t)|² dt)^{1/2} = (∫₀¹ |t/n|² dt)^{1/2} = 1/(√3 n) → 0 as n → ∞.

Example. (Infinite dimensional, of course.) Consider ℓp with x = (1, 1/2, 1/3, . . .) and xn = (1, 1/2, . . . , 1/n, 0, 0, . . .).
Then ‖xn − x‖p = (Σ_{k=n+1}^∞ 1/k^p)^{1/p}. That series converges for p > 1 (and hence its tail goes to zero as n → ∞), but it diverges for p = 1. So we can say "xn → x" in ℓ2, for example, but we cannot say that in ℓ1! So not all norms are equivalent in general, unlike in finite-dimensional vector spaces.

Proposition. The limit of a convergent sequence is unique.
Proof. Suppose xn → x and xn → y. Then ∀n: ‖x − y‖ = ‖x − xn + xn − y‖ ≤ ‖x − xn‖ + ‖xn − y‖, which → 0 as n → ∞. Thus ‖x − y‖ = 0, so x − y = 0, so x = y. □

Proposition. xn → x ⟹ ‖xn‖ → ‖x‖.
Proof. As shown earlier, |‖x‖ − ‖y‖| ≤ ‖x − y‖. Thus |‖xn‖ − ‖x‖| ≤ ‖xn − x‖ → 0 if xn → x, so ‖xn‖ → ‖x‖. □

Proposition. If xn → x, then sup_n ‖xn‖ < ∞, i.e., convergent sequences are bounded.
Proof. ∃N1 s.t. ‖xn − x‖ < 1, ∀n > N1. Thus for n > N1, ‖xn‖ = ‖xn − x + x‖ ≤ ‖xn − x‖ + ‖x‖ < 1 + ‖x‖, so sup_n ‖xn‖ ≤ max{1 + ‖x‖, ‖x1‖, . . . , ‖x_{N1}‖} < ∞. (Picture) □

The concept of a closed set is closely connected with limits of sequences.

Definition. If S is a subset of X, we call x ∈ X a limit point of S iff there is a sequence of elements of S that converges to x.

Proposition. x is a limit point of S iff x ∈ S̄. Thus limit points and cluster points are equivalent!
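Example. (Numerical sketch.) The following is an illustrative numerical check of this proposition, not part of the text; the set S = {1/n : n ∈ N} ⊂ R is an arbitrary choice. The sequence xn = 1/n from S converges to 0, so 0 is a limit point of S; correspondingly d(0, S) = 0, so 0 is a closure point, i.e., 0 ∈ S̄, even though 0 ∉ S.

```python
# Numerical sketch: in R, for S = {1/n : n in N}, the point 0 is a
# limit point (a sequence from S converges to it) and equivalently a
# closure point (d(0, S) = 0).  We truncate S to finitely many terms.
def dist_to_S(x, N):
    """Distance from x to the truncation {1/n : 1 <= n <= N} of S."""
    return min(abs(x - 1.0 / n) for n in range(1, N + 1))

# d(0, S) computed on growing truncations shrinks toward 0 ...
print([dist_to_S(0.0, N) for N in (10, 100, 1000)])   # → [0.1, 0.01, 0.001]

# ... and the sequence x_n = 1/n from S converges to 0:
print([abs(1.0 / n - 0.0) for n in (10, 100, 1000)])  # → [0.1, 0.01, 0.001]
```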
Proof. If x is a limit point of S, then ∃{xn} ∈ S s.t. xn → x. Thus d(x, {xn : n ∈ N}) = 0, so d(x, S) = 0.⁷ Hence x ∈ S̄.
Suppose x ∈ S̄. Then ∀ε > 0, ∃y ∈ S s.t. ‖x − y‖ < ε. Choose ε = 1/n and identify xn with the corresponding y. Since ‖xn − x‖ < 1/n, we see xn → x, so x is a limit point of S. □

⁷A ⊆ B ⟹ d(x, B) ≤ d(x, A). (Picture)

Corollary. A set is closed iff it contains its limit points.

Summary
• distance
• open sphere
• interior point & interior
• open set = its interior
• closure point: d(x, P) = 0, & closure
• closed set = its closure
• bounded set
• convergence of sequences
• limit point of a set (there exists a convergent sequence from the set to it) = cluster point
• closed sets contain their limit points

Series

(A special kind of sequence.)

Definition. An infinite series of the form Σ_{i=1}^∞ xi is said to converge to x in a normed space iff the sequence of partial sums sn = Σ_{i=1}^n xi converges to x, in which case we write x = Σ_{i=1}^∞ xi as shorthand for x = lim_{n→∞} Σ_{i=1}^n xi.

Caution. Sometimes as a lazy shorthand one might write Σ_{i=1}^∞ 1/i = ∞. Saying that Σ_{i=1}^∞ αi = ∞ is shorthand for saying ∀M < ∞, ∃N ∈ N s.t. n > N ⟹ Σ_{i=1}^n αi > M. Since ∞ ∉ R, it is truly an abuse of notation to write Σ_{n=1}^∞ 1/n = ∞, because in the above definition of convergence of an infinite series, it is implied that the limit x is an element of X.

Example. In ℓ2, consider x = (1, 1/2, 1/3, . . .) and xi = (0, . . . , 0, 1/i, 0, . . .), where the 1/i is in the ith position. Then Σ_{i=1}^∞ xi converges to x. (But it does not converge in ℓ1.)

2.11 Cauchy sequences

Often in convergence analysis it is easier to examine ‖xn − xm‖ than ‖xn − x*‖, especially if we have not yet shown that a limit x* even exists.

Definition. A sequence {xn} in a normed space is called a Cauchy sequence iff ‖xn − xm‖ → 0 as n, m → ∞. In other words, ∀ε > 0, ∃N > 0 s.t. n, m > N ⟹ ‖xn − xm‖ < ε.

Fact.
In a normed space, every convergent sequence is a Cauchy sequence, since if xn → x, then ‖xn − xm‖ ≤ ‖xn − x‖ + ‖xm − x‖ → 0.

Note the repeated use of the triangle inequality in proofs. In a normed space, it is about all we have to work with!

The converse of the above fact is not true in general: Cauchy sequences need not converge in general normed spaces. This is a bit unfortunate, since the converse is what we would usually really like to use!

Example. Consider the sequence of functions fn(t) = 1 − e^{−nt} for t ≥ 0, fn(t) = 0 for t < 0, in the normed space
X = {f : [−1, 1] → R : f continuous}, ‖f‖ = ‖f‖₁ = ∫_{−1}^{1} |f(t)| dt. (Picture)
Then ‖fn − fm‖ = ∫₀¹ |e^{−nt} − e^{−mt}| dt = |(1 − e^{−n})/n − (1 − e^{−m})/m| → 0 as n, m → ∞. So {fn} is Cauchy. But the "apparent limit" of fn is the unit step function, which is not continuous, and hence not an element of X. (The interval [−1, 1] matters here: on [0, 1] alone, fn would converge in ‖·‖₁ to the continuous function 1.)

How can we "fix" this problem?
• Broaden the vector space X to L1[−1, 1] (i.e., drop the continuity requirement).
• Replace the norm ‖f‖₁ with ‖f‖∞. One can show for this example that ‖fn − fm‖∞ does not approach zero as "n, m → ∞." (For any fixed m, ‖fn − fm‖∞ → 1 as n → ∞.)

This "incompleteness" can also arise even in subspaces of ℓ2.

Example. Let X be the set of infinite sequences of reals with only finitely many nonzero terms:
X = {(a1, . . . , ak, 0, 0, . . .) : k ≥ 1, ai ∈ R}, with the ℓp norm ‖x‖p.
Now consider the sequence with elements xn = (1, 1/2, 1/4, . . . , 1/2ⁿ, 0, 0, . . .).
Since ‖xn − xm‖p = (Σ_{k=min{n,m}+1}^{max{n,m}} (1/2^k)^p)^{1/p} ≤ (1/2^{min{n,m}}) (1/(1 − 1/2^p))^{1/p} → 0 as n, m → ∞, {xn} is Cauchy. But there is no x ∈ X to which {xn} converges.

In some sense, the problem is that X has "holes" in it. Broadening the space is the natural solution. Is there an alternate norm that would make this vector space X complete? (I doubt it; can you show otherwise?)
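The "holes" in this last example can be seen numerically; the sketch below (an illustration, not part of the text) computes the ℓ1 distances between the finitely supported xn and checks that the pairwise distances collapse geometrically, even though the only candidate limit (1, 1/2, 1/4, . . .) has infinite support and so lies outside X.

```python
# Numerical sketch: x_n = (1, 1/2, 1/4, ..., 1/2^n, 0, 0, ...) in the
# space of finitely supported real sequences with the l^1 norm.
def x(n):
    """The finitely supported element x_n (entries 2^-k for k = 0..n)."""
    return [2.0 ** (-k) for k in range(n + 1)]

def dist1(u, v):
    """l^1 distance, padding the shorter sequence with zeros."""
    L = max(len(u), len(v))
    u = u + [0.0] * (L - len(u))
    v = v + [0.0] * (L - len(v))
    return sum(abs(a - b) for a, b in zip(u, v))

# The sequence is Cauchy: ||x_n - x_m||_1 is a geometric tail, ~ 2^-min(n,m).
print([dist1(x(n), x(2 * n)) for n in (5, 10, 20)])  # shrinks rapidly

# But the only candidate limit (1, 1/2, 1/4, ...) has infinitely many
# nonzero terms, so it is not an element of the space: X has a "hole".
```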
2.11 Banach spaces

Often we prefer to use normed spaces that are free of the pathologies described in the preceding examples, so we name them.

Definition. A normed space (X, ‖·‖) is called complete iff every Cauchy sequence in X has a limit in X (and hence converges).

Definition. A complete normed space is called a Banach space.

Examples of Banach spaces
• Rⁿ with ‖·‖p, for p ∈ [1, ∞]
• C[a, b] with ‖f‖∞ = sup_{t∈[a,b]} |f(t)|
• ℓp for p ∈ [1, ∞] with the usual ‖x‖p
• Lp[a, b] for p ∈ [1, ∞] with the usual ‖f‖p

Is finding a suitable Banach space usually difficult? Fortunately not, since every normed space has a completion.

Theorem. If (X, ‖·‖_X) is a normed space, then ∃(Y, ‖·‖_Y) that is a Banach space (called a completion of X) with
• X is a subspace of Y
• X̄ = Y (the closure taken in Y)
• x ∈ X ⟹ ‖x‖_X = ‖x‖_Y.
Moreover, Y is essentially unique (i.e., all the completions of X are isometric with one another [3, p. 121]).

Example. L1[0, 1] is the completion of X = {f : [0, 1] → R : f continuous} with ‖f‖ = ‖f‖₁ = ∫₀¹ |f(t)| dt.

Example. Lp[a, b] is the completion of Rp[a, b] for p ∈ [1, ∞]. Showing such results for Lp requires measure theory.

Showing completeness of C[a, b] uses the following fact, a key result from Math 451.

Theorem. (R, | · |) is complete, i.e., if {αn} ⊂ R is Cauchy, then ∃α ∈ R s.t. αn → α.

There exist bounded functions that are Lebesgue integrable but not Riemann integrable, e.g., the indicator function of the rationals [3, p. 561]. And in fact R1[a, b] is not complete [3, p. 564].

2.12 Complete subsets

Definition. A subset S of a normed space is complete iff every Cauchy sequence from the subset converges to a limit within S.

Example. Any finite set is complete. (The next theorems give more interesting cases.)

Theorem. In a normed space, any complete subset is closed.

Proof. If P ⊆ X is complete, then every Cauchy sequence in P has a limit in P.
Thus all convergent sequences in P (which are of course Cauchy) have limits in P. Thus P is closed. □

Theorem. In a Banach space, a subset is complete if and only if it is closed.

Proof. The "only if" direction follows from the preceding theorem. Suppose P is closed and {xn} is a Cauchy sequence in P (and hence in X). Since X is Banach, ∃x ∈ X such that xn → x. Hence x is a limit point of P. Since P is closed, x ∈ P. Thus P is complete. □

Exercise. In a normed space, intersections of arbitrarily many complete subsets are complete. What about unions of complete subsets? ?? ??

There is an asymmetry between the two preceding theorems. We assumed a normed space to show complete set ⟹ closed set, but we assumed a Banach space to show closed set ⟹ complete set. Is the stronger assumption (of a Banach space) truly necessary to show closed set ⟹ complete set? ?? Ok, but in some (incomplete) X, is there a closed proper subset that is incomplete? ??

In the preceding theorem we assumed we are already working in a Banach space. What if we only have an ordinary normed space? Are there complete subsets of it? The next theorem shows that the answer can be yes, at least for finite-dimensional subspaces.

Recall that if M is a subspace of a normed space X, then d(αx, M) = |α| d(x, M) for all x ∈ X and α ∈ F.

Theorem. Any finite-dimensional subspace of a normed space is complete (and hence closed).

Proof. By induction on the dimension of the subspace.
For a 1D subspace: M = {αe : α ∈ F}, where e ∈ X is a fixed basis vector. Any Cauchy sequence {xn} in this subspace has elements of the form xn = αn e. Since ‖xn − xm‖ = |αn − αm| ‖e‖, the sequence of reals {αn} is also Cauchy and hence convergent (to some limit α) since E¹ = (R, | · |) is complete. Thus ‖xn − αe‖ = |αn − α| ‖e‖ → 0, so xn → x = αe ∈ M. Hence, any 1D subspace of a normed space is complete.
Now assume the theorem is true for subspaces of dimension N − 1. Suppose M is an N-dimensional subspace of a normed space X. We must show that M is complete. Let e1, . . . , eN denote a basis for M. For k = 1, . . . , N, define
Mk = [{e1, . . . , e_{k−1}, e_{k+1}, . . . , eN}] and δk = d(ek, Mk).
Claim: δk > 0. Suppose δk = 0. Then since Mk is an (N − 1)-dimensional subspace, it is complete by the induction hypothesis and hence closed, so d(ek, Mk) = 0 would imply ek ∈ Mk, contradicting the linear independence of {e1, . . . , eN}.
Suppose {xn} ∈ M is Cauchy. Each xn has a (unique) representation xn = Σ_{k=1}^N λ_k^n ek, so for each k ∈ {1, . . . , N}:
‖xn − xm‖ = ‖Σ_{j=1}^N (λ_j^n − λ_j^m) ej‖ = ‖(λ_k^n − λ_k^m) ek − (−Σ_{j≠k} (λ_j^n − λ_j^m) ej)‖ ≥ d((λ_k^n − λ_k^m) ek, Mk) = |λ_k^n − λ_k^m| d(ek, Mk) = |λ_k^n − λ_k^m| δk,
since −Σ_{j≠k} (λ_j^n − λ_j^m) ej ∈ Mk.
Since ‖xn − xm‖ → 0 for a Cauchy sequence, we see that |λ_k^n − λ_k^m| → 0, since δk > 0. Thus {λ_k^n}_n is Cauchy and hence (by the completeness of R) converges to some limit λk. Defining x = Σ_{k=1}^N λk ek ∈ M, we find that xn → x since
‖xn − x‖ = ‖Σ_{k=1}^N (λ_k^n − λk) ek‖ ≤ Σ_{k=1}^N |λ_k^n − λk| ‖ek‖ → 0.
Thus any such Cauchy sequence in M converges to a limit in M, so M is complete. □

Note this proof's immediate use of the basis to represent any x ∈ M. Proofs about finite-dimensional spaces often start this way.

Corollary.
• Any finite-dimensional subspace of a normed space is closed.
• Any finite-dimensional normed space is complete (and closed).

Exercise. [3, p. 218]. A (Hamel) basis for a Banach space is either finite or uncountably infinite.

2.9 Transformations

In systems theory, we analyze many systems and mathematical operations that transform one signal into another.

Definition. Let X and Y be two vector spaces (over a common field F), and let D be a subset of X. A rule "T" that assigns a single element y ∈ Y to each x ∈ D is called a transformation from X to Y with domain D. We write y = T(x) and T : X → Y
or T : D → Y. See p. 27 or the ch. 1 review notes for one-to-one and onto.

Definition. A transformation from a vector space X (over a scalar field F) into F is called a functional on X.
Example. A norm is a functional.

Linear transformations

Definition. A transformation T : X → Y (where X and Y are vector spaces over a common field) is called linear iff
T(αx1 + βx2) = α T(x1) + β T(x2), ∀x1, x2 ∈ X, ∀α, β ∈ F.

Definition. If T is a linear transformation from X into X itself, then we say T is a linear operator. However, the terminology distinguishing linear transformations from linear operators is not universal, and the two terms are often used interchangeably.

Example. Let F = R and let X = Y be the space of continuous functions on [0, 1]. Define the linear transformation T by: F = T(f) iff F(t) = ∫₀ᵗ f(τ) dτ. Integration (with suitable limits) is a linear transformation.

Simple fact for linear transformations:
• T(0) = 0. Proof. T(0) = T(0 · 0) = 0 · T(0) = 0.
This is called the "zero in, zero out" property in linear systems theory.

Caution! (From Linear Systems by T. Kailath.) By induction it follows that T(Σ_{i=1}^n αi xi) = Σ_{i=1}^n αi T(xi) for any finite n, but the above does not imply in general that linearity holds for infinite summations or integrals. Further assumptions about "smoothness" or "regularity" or "continuity" of T are needed for that. This point is always glossed over in introductory signals and systems courses, where infinite sums (and integrals) are routinely "passed through" linear systems according to the superposition property, with no attempt to verify the validity of such exchanges.

Continuity

To define such continuity, we restrict attention to normed spaces. Note that a transformation can be continuous for one pair of normed spaces but discontinuous for another pair, e.g., [3, p. 63, 65].

Definition. A transformation T from a normed space (X, ‖·‖_X) into a normed space (Y, ‖·‖_Y) is called continuous at x0 ∈ X iff
∀ε > 0, ∃δ = δ(x0, ε) > 0 s.t. ‖x − x0‖_X < δ ⟹ ‖T(x) − T(x0)‖_Y < ε.

Definition. If T is continuous at all x0 ∈ X, then we simply call T continuous.

Example. f : X → R defined by f(x) = d(x, S) is continuous for any set S in a normed space X, since |d(x, S) − d(y, S)| ≤ ‖x − y‖.

Definition. A transformation T from a normed space (X, ‖·‖_X) into a normed space (Y, ‖·‖_Y) is called uniformly continuous iff
∀ε > 0, ∃δ = δ(ε) > 0 s.t. ‖x − z‖_X < δ ⟹ ‖T(x) − T(z)‖_Y < ε.
Here δ depends only on ε. Sometimes δ(ε) is called the modulus of continuity of T.

Example. Is ‖·‖ uniformly continuous? ??

We often want to exchange transformations and limits; the following proposition shows that continuity is the key condition.

Proposition. A transformation T from a normed space (X, ‖·‖_X) into a normed space (Y, ‖·‖_Y) is continuous at x ∈ X iff xn → x ⟹ T(xn) → T(x) (for any such sequence {xn}), i.e., lim_{n→∞} T(xn) = T(lim_{n→∞} xn).
Proof. See text.

Exercise. A linear transformation from a finite-dimensional normed space to any normed space is continuous.

Thus any "counter-example" showing a linear transformation that is not continuous will be infinite dimensional, e.g., the integrator system example [3, p. 63].

Example. Consider X = Y = L2(R) and the linear operator T corresponding to the integral transformation y(t) = ∫_{−∞}^{t} x(τ) dτ. We show that T is not continuous at x0 = 0. (And in fact it is discontinuous everywhere!) For any δ > 0, consider the bipolar rectangular pulse (Picture)
x(t) = b for −a ≤ t < 0, x(t) = −b for 0 ≤ t ≤ a, x(t) = 0 otherwise, with a = 2/δ and b = (δ/2)^{3/2},
for which y = T(x) is a triangle of height ab supported on [−a, a]. Then
‖x − 0‖ = (∫ |x(t)|² dt)^{1/2} = √(2ab²) = δ/√2 < δ, but ‖T(x) − T(0)‖ = ‖y − 0‖ = √((2/3) a³b²) = √(2/3),
which does not shrink as δ → 0. So T is not continuous at 0.

An example of a continuous (linear) operator in an infinite-dimensional vector space is the discrete-time convolution operator.

Example. Consider X = Y = ℓ∞ and the transformation y = T(x) ⟺ yn = Σ_{k=−∞}^∞ h_{n−k} xk, where h ∈ ℓ1.
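A quick numerical sanity check of the bound ‖T(x)‖∞ ≤ ‖h‖₁ ‖x‖∞ for this convolution operator (a sketch; the particular h and x below are arbitrary illustrative choices, not from the text):

```python
# Numerical sketch of ||T(x)||_inf <= ||h||_1 ||x||_inf for discrete
# convolution y_n = sum_k h_{n-k} x_k with a finitely supported h.
h = [0.5, -0.25, 0.125]          # finitely supported, so h is in l^1
x = [1.0, -1.0, 1.0, 1.0, -1.0]  # a bounded input with ||x||_inf = 1

def conv(h, x):
    """Full discrete convolution of two finitely supported signals."""
    y = [0.0] * (len(h) + len(x) - 1)
    for n in range(len(y)):
        for k in range(len(x)):
            if 0 <= n - k < len(h):
                y[n] += h[n - k] * x[k]
    return y

y = conv(h, x)
h1 = sum(abs(a) for a in h)       # ||h||_1
xinf = max(abs(a) for a in x)     # ||x||_inf
yinf = max(abs(a) for a in y)     # ||T(x)||_inf
print(yinf <= h1 * xinf)          # the BIBO bound holds: prints True
```

For this particular alternating input the bound is achieved with equality (‖y‖∞ = ‖h‖₁ = 0.875), which matches the intuition that a sign-matched bounded input extracts the full ℓ1 mass of h.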
Note that h ∈ ℓ1 corresponds to BIBO stability!
Exercise. Show that ‖T(x)‖∞ ≤ ‖h‖₁ ‖x‖∞. (This also ensures that T(x) is well defined.)
Thus, since T is linear, if x, z ∈ ℓ∞ then ‖T(x) − T(z)‖∞ = ‖T(x − z)‖∞ ≤ ‖h‖₁ ‖x − z‖∞, so T is (uniformly) continuous, with δ(ε) = ε/‖h‖₁. So BIBO LTI systems are uniformly continuous, and thus for such systems we can freely exchange limits and sums. (And in fact we would need to make such exchanges to derive convolution properties like the commutative property rigorously.) We will talk much more about such bounded linear operators in Ch. 6.

2.13 Compactness

Optimization is about maximizing a functional over some set, or more precisely, usually about finding the maximizer (within some set) of a functional. When does a functional achieve its maximum? We wish that the answer were "if the set is closed and bounded," but unfortunately that is incorrect in general. The concept of a compact set helps answer this question in general.

Definition. A subset K of a normed space (X, ‖·‖) is called compact (or sequentially compact) iff every sequence {xn} in K has some subsequence {x_{ni}} that converges to a limit x ∈ K.

(This is not quite the definition of compactness typically used first in a real analysis course, but it is more convenient for our purposes. One can first define a compact set in terms of set coverings, and then prove that a metric space is compact if and only if it is sequentially compact, e.g., [2, p. 63].)

As usual, having defined a new concept, we now attempt to relate it to previously defined concepts.
Can a cone (other than {0}) be compact? ??
Exercise. In a normed space, arbitrary intersections of compact subsets are compact.
Exercise. What about unions? ?? ??

Lemma. If xn → x and {x_{ni}} is a subsequence of {xn}, then lim_{i→∞} x_{ni} = x.
Proof. Pick any ε > 0. xn → x ⟹ ∃N s.t. n > N ⟹ ‖xn − x‖ < ε. Choose I such that i > I ⟹ ni > N. Then i > I ⟹ ‖x_{ni} − x‖ < ε. □

Proposition. A compact subset K of a normed space (X, ‖·‖) is complete, closed, (and bounded).
Proof.
• Claim: K is complete, because Cauchy sequences have limits in a compact set K. Let {xn} be any Cauchy sequence in K; then ∀ε > 0, ∃M s.t. n, m > M ⟹ ‖xn − xm‖ < ε/2. Since K is compact, ∃{ni} s.t. x_{ni} → x ∈ K, i.e., ∀ε > 0, ∃I s.t. i > I ⟹ ‖x_{ni} − x‖ < ε/2. For any ε > 0, let N = max{M, n_I}. For n > N and i > I we have ‖xn − x‖ ≤ ‖xn − x_{ni}‖ + ‖x_{ni} − x‖ < ε/2 + ε/2 = ε. Thus xn → x.
• To show K is closed, we show that xn → x ⟹ x ∈ K for {xn} ∈ K. {xn} ∈ K ⟹ ∃x_{ni} → y ∈ K. But xn → x ⟹ x_{ni} → x, so x = y ∈ K. Thus K is closed.
• Before we show boundedness, we need to prove the Weierstrass theorem! □

So is the reverse true? Are closed and bounded sets compact? The following theorem shows that the answer is "yes" in finite-dimensional normed spaces (but not in general).

Theorem. In a normed space (X, ‖·‖), the following are equivalent.
(a) X is finite dimensional.
(b) Every closed and bounded subset is compact.
(c) The closed unit ball B1(0) = {x ∈ X : ‖x‖ ≤ 1} is compact [3, p. 269].
Proof. (a) ⟹ (b) is the Heine-Borel theorem (see Math 451). (b) ⟹ (c) is obvious. (c) ⟹ (a) will be a homework problem.

This theorem illustrates why infinite-dimensional spaces are more challenging than finite-dimensional spaces. The fact that every closed and bounded subset is compact can be very useful in finite-dimensional problems, and we do not have this tool in general infinite-dimensional cases. By having a notion of compactness, we can make statements (such as the Weierstrass theorem below) that apply to both infinite and finite-dimensional spaces, whereas those statements would not hold if we merely assumed that the sets were closed and bounded.

Analogy: we write sup when we think max. Here, we write compact when we think "closed and bounded."

Summary
In any normed space:
• compact ⟹ complete ⟹ closed
• compact ⟹ closed and bounded
In a Banach space: closed ⟹ complete.
In a finite-dimensional normed space: closed and bounded ⟹ compact.

Examples

As has been / will be shown, any compact set is closed, bounded, and complete, and the reverse is true in finite-dimensional spaces. But what about infinite-dimensional spaces?

Example. Consider any of the Banach spaces ℓp, for p ∈ [1, ∞], and the subset B = {x ∈ X : ‖x‖ ≤ 1}. This is a closed set, and hence it is complete (since X is complete). Furthermore, B is clearly bounded. We now proceed to exhibit a sequence {xn} in B that has no convergent subsequences. Let xn denote the sequence in ℓp whose nth element is unity and all other elements are zero. Clearly {xn} ∈ B. Yet ‖xn − xm‖p = 2^{1/p} ≥ 1 for n ≠ m (and = 1 for p = ∞), so it is impossible for this {xn} to have any convergent subsequence, since any such convergent subsequence would need to be Cauchy, which cannot occur when the pairwise distances are bounded away from zero. So the set B in ℓp is closed, bounded, and complete, but not compact.

One might begin to wonder then: are there any compact sets in infinite-dimensional spaces?

Example. Any finite set in an infinite-dimensional space is compact. But that is not very interesting, since the set {x1, . . . , xn} is just a closed and bounded subset of the finite-dimensional subspace [x1, . . . , xn], so of course it is compact.

Example. In any normed space X, suppose {xn} converges to x ∈ X. Then the set {x} ∪ {xn : n ∈ N} is compact.

This still seems like a rather contrived and limited compact set. Are there any interesting (e.g., nonempty) convex compact sets in infinite-dimensional spaces? Conjecture: the following (convex) set is (?) compact:
{x ∈ ℓ1 : Σ_{i=1}^∞ 2^i |xi| ≤ 1}.   (2-1)
Note: the following set is not compact:
{x ∈ ℓ1 : ‖x‖₁ = Σ_{i=1}^∞ |xi| ≤ 1},
as shown above, so the question is whether adding the "2^i" part is enough of a change to make a compact set. Extra credit (= 30 homework points) to anyone who can show that (2-1) is or is not compact, or who can give a different example of a nontrivial convex, compact set in an infinite-dimensional normed space.

Compactness, closedness, and distances

Lemma 2.4 In a normed space, closed subsets of compact subsets are themselves compact. ??
Lemma 2.5 In a normed space, any compact set contains the closures of all of its subsets. ??
Lemma 2.6 If U and V are compact, disjoint subsets of a normed space, then d(U, V) > 0. (Picture)

Proof. Suppose d(U, V) = 0. Then there exist {xn} ∈ U and {yn} ∈ V such that ‖xn − yn‖ → 0. Since U is compact, there is a subsequence {x_{ni}} that converges to some x ∈ U. Now ‖y_{ni} − x‖ ≤ ‖y_{ni} − x_{ni}‖ + ‖x_{ni} − x‖ → 0 as i → ∞, so y_{ni} → x ∈ U. But since V is compact, it is also closed, so the limit of {y_{ni}} must lie in V, contradicting the disjointness of U and V. □

Would it suffice for U and V to be closed? No. Consider the sets U = {(x, y) ∈ R² : y > 0, x ≥ 1/y} and V = {(x, y) ∈ R² : y < 0, x ≤ −1/y}. (Picture) These sets are closed and disjoint, yet d(U, V) = 0. Can you find a 1D example? ??

Would it suffice for U and V to be closed and bounded? (We need not look in finite-dimensional normed spaces for counter-examples, since there all closed and bounded sets are compact.) For a counter-example, we turn to the (incomplete) normed space X = ({finite-length real sequences}, ‖·‖p), where ‖·‖p is the usual ℓp norm. Define U = {xn : n ∈ N} and V = {yn : n ∈ N}, where xn = (1, 1/2, . . . , 1/2ⁿ, 0, 0, . . .) and yn = xn + (1/n, 0, 0, . . .). Then U and V are disjoint, closed (in this incomplete X), and bounded (for any p ∈ [1, ∞]), yet d(U, V) = 0. So it seems that we do need a stronger condition than "closed and bounded" in Lemma 2.6.

Would "complete and bounded" suffice? Or do we need compactness?

Upper semicontinuous functions

Definition. A (real) functional f defined on a normed space (X, ‖·‖) is called upper semicontinuous at x ∈ X iff ∀ε > 0, ∃δ > 0 s.t. ∀y ∈ X, ‖y − x‖ < δ ⟹ f(y) < f(x) + ε.
We call f lower semicontinuous at x iff −f is upper semicontinuous at x.

Fact. f is continuous iff f is both upper and lower semicontinuous.

Equivalently, one can show that f is u.s.c. at x iff f(x) ≥ lim sup_{y→x} f(y), where S = lim sup_{y→x} f(y) iff
∀ε > 0, ∃δ > 0 s.t. ‖y − x‖ < δ ⟹ f(y) < S + ε, and
∀ε > 0, ∀δ > 0, ∃y s.t. ‖y − x‖ < δ and f(y) > S − ε.
Loosely speaking, it is like a least upper bound "in the limit."

Example. Consider the following "almost piecewise continuous" real-valued function. (Figure: a function f taking values between 10 and 20, with jump discontinuities at x = 2, 4, 6.)
What is lim sup_{y→2} f(y)? We get the biggest limit approaching from the left, so lim sup_{y→2} f(y) = 20. So f is u.s.c. at x = 2. Similarly, lim sup_{y→4} f(y) = 20, so f is not u.s.c. at x = 4. But lim sup_{y→6} f(y) = 20, so f is u.s.c. at x = 6. This function is u.s.c. everywhere except ?? and lower semicontinuous everywhere except ??
Does this function achieve a maximum on [3, 5]? ?? What about on [1, 3]? ?? How about on [0, 1)? ??

Theorem. (Weierstrass) An upper semicontinuous (real) functional f on a compact subset K of a normed space (X, ‖·‖) (i) is bounded on K, and (ii) achieves a maximum on K.

Proof. Let
M = sup_{x∈K} f(x) = inf{g ∈ R : g ≥ f(x), ∀x ∈ K} if f is bounded above on K, and M = ∞ otherwise.
(At this point we do not know whether M is finite or not.) By definition of supremum, ∃ a sequence {xn} ∈ K such that f(xn) → M. (This is true even if M = ∞.) However, the definition of supremum alone does not ensure that xn converges. Since K is compact, ∃ a convergent subsequence x_{ni} → x* ∈ K, for some x*. Since subsequences of convergent sequences have the same limit, lim_{i→∞} f(x_{ni}) = M, considering {f(xn)} as a sequence in R. Since f is upper semicontinuous, f(x*) ≥ lim sup_{i→∞} f(x_{ni}) = M. Since f(x*) is real and hence finite, f(x*) ≥ M implies M must be finite, so f is bounded on K. On the other hand, by definition of supremum, since x* ∈ K, M ≥ f(x*). So we conclude f(x*) = M, meaning that x* achieves the maximum of f on K. □

Corollary. A real-valued, continuous functional f on a compact subset K of a normed space (X, ‖·‖) achieves its maximum and minimum on K.

Example. On a normed space (X, ‖·‖), the function f : X → R defined by f(x) = ‖x‖ is continuous.
Proof. Recall |‖x‖ − ‖y‖| ≤ ‖x − y‖, so x → y ⟹ ‖x − y‖ → 0 ⟹ |f(x) − f(y)| → 0.

An inf. dim. example like splines would be nice here...

Corollary (to the Weierstrass theorem). A compact subset K of a normed space is (complete, closed, and) bounded.
Proof. Since ‖·‖ is continuous, by the Weierstrass theorem ∃y ∈ K s.t. M ≜ ‖y‖ = sup_{x∈K} ‖x‖. Hence ‖x‖ ≤ M < ∞, ∀x ∈ K, i.e., K is bounded. □

2.14 Quotient Spaces

skip

2.15 Denseness

One last topological concept.

Definition. A subset D of a normed space (X, ‖·‖) is called dense iff any of the following equivalent conditions hold.
• ∀x ∈ X and ∀ε > 0, ∃y ∈ D s.t. ‖x − y‖ < ε, i.e., d(x, D) = 0.
• D̄ = X.
• ∀x ∈ X, ∃{yn} ∈ D s.t. yn → x.

Example. The set of rational numbers is dense in the real line.

Separability

Definition. A normed space X is called separable iff it contains a countable dense set, i.e., a dense set of the form D = {yn : n ∈ N}.

Example. Euclidean space Eⁿ is separable. The collection of vectors x = (a1, . . . , an) with rational components is countable and dense in Eⁿ.

Example. ℓp is separable for p ∈ [1, ∞). If Dn = {(r1, . . . , rn, 0, 0, . . .) : rk ∈ Q}, then D = ∪_{n=1}^∞ Dn is dense in ℓp. See text for proof.

Example. Lp is separable for p ∈ [1, ∞). If Dn = {Σ_{k=1}^n rk 1_{Ik}(t) : rk ∈ Q, Ik = (ak, bk), ak, bk ∈ Q} for n ∈ N, then D = ∪_{n=1}^∞ Dn is dense in Lp. Alternatively, if Dn = {f : [a, b] → R : f is piecewise linear with n breakpoints at rk ∈ Q and f(rk) ∈ Q}, then D = ∪_{n=1}^∞ Dn is dense in Lp.

Example. C[a, b] is separable. The set of all polynomials with rational coefficients is countable and dense in C[a, b]: if Dn = {r0 + r1 t + · · · + rn tⁿ : rk ∈ Q}, then D = ∪_{n=1}^∞ Dn is dense in C[a, b].

Example. ℓ∞ and L∞ are not separable (Problem 2.21).

Schauder basis

Definition. In a normed space, {xn} ∈ X is a Schauder basis for X iff for each x ∈ X there exists a unique sequence {λn} such that x = Σ_{n=1}^∞ λn xn [5] [2, p. 98].

Theorem. If a normed space has a Schauder basis, then it is separable [2, p. 100].

What about the converse? The famous Banach conjecture that every separable Banach space has a Schauder basis was shown to be incorrect by Enflo in 1973 [1] [2, p. 100]. This could be considered surprising, since separable, complete normed spaces should be about as "nice as they come."

So is the concept of separability of limited use to us? No, thanks to the following key result.

Fact. A Hilbert space has a countable orthonormal basis iff it is separable [3, p. 314]. (Any countable orthonormal basis is a Schauder basis.)

Summary

Geometrical concepts and their generalizations.
• point: x ∈ X
• line: {αx : α ∈ R}
• plane: {Σᵢ αᵢ xᵢ : αᵢ ∈ R}
• cone: x ∈ C ⟹ αx ∈ C for α ≥ 0
• length: ‖x‖
• sphere: {x ∈ X : ‖x‖ < ε}
• distance: ‖x − y‖
• coordinate system: basis

Hierarchy of spaces: vector space ⊃ normed space ⊃ Banach space

Some principal results
• A subset of a Banach space is complete iff it is closed.
• Any finite-dimensional subspace of a normed space is complete (and hence closed).
• A real-valued, continuous functional on a compact subset of a normed space achieves its maximum and minimum on that subset.
• Any closed and bounded subset of a finite-dimensional normed space is compact.

Bibliography

[1] P. Enflo. A counterexample to the approximation problem in Banach spaces. Acta Math., 130:309–17, 1973.
[2] I. J. Maddox. Elements of Functional Analysis. Cambridge, 2nd edition, 1988.
[3] A. W. Naylor and G. R. Sell. Linear Operator Theory in Engineering and Science. Springer-Verlag, New York, 2nd edition, 1982.
[4] D. G. Luenberger. Optimization by Vector Space Methods. Wiley, New York, 1969.
[5] J. Schauder. Zur Theorie stetiger Abbildungen in Funktionenräumen. Math. Zeitschr., 26:47–65, 1927.
[6] P. P. Vaidyanathan. Generalizations of the sampling theorem: seven decades after Nyquist. IEEE Trans. Circuits Syst. I, 48(9):1094–109, September 2001.
[7] F. Deutsch. The convexity of Chebyshev sets in Hilbert space. In Th. M. Rassias, H. M. Srivastava, and A. Yanushauskas, editors, Topics in Polynomials of One and Several Variables and Their Applications, pages 143–50. World Scientific, River Edge, NJ, 1993.
[8] M. Jiang. On Johnson's example of a nonconvex Chebyshev set. J. Approx. Theory, 74(2):152–8, August 1993.
[9] P. L. Combettes and H. J. Trussell. Method of successive projections for finding a common point of sets in metric spaces. J. Optim. Theory Appl., 67(3):487–507, December 1990.