Marian Mureşan
Mathematical Analysis and Applications I
Draft

Foreword

These lecture notes have been written with a computer scientist (not a typist) in mind, in our changing world. It is difficult to imagine computer science without mathematics. For those interested mainly in software we recall, as an argument, a capital (and classical) work of Knuth, [11]. Many results in discrete (and even continuous) mathematics are designed to be used in several parts of the giant called computer science.

On the one hand, we have to notice the immutability of the mathematical world in the following sense: a correct mathematical result remains true forever. Mathematics grows continuously, and the speed of this growth increases. On the other hand, the needs of computer scientists change continuously. The time allotted for study remains more or less unchanged. Hence a question arises: which parts of mathematics should be taught to computer science students, and how can they be taught in an optimal way? Not an easy question, indeed! We are pressed to take into account some parts of mathematics and to neglect many others.

Mathematical analysis offers a solid ground for many achievements in applied mathematics and discrete mathematics. In spite of the fact that these lecture notes are concerned with a part of mathematical analysis on the real axis, we have tried to include useful and relevant examples, exercises, and results enlightening the reader on the power of mathematical tools.

We recommend [1], [3], [4], [5], [6], [8], [9], [12], [14], [15], [16], [21], [23], [25].

Contents

Foreword
1 Sets
  1.1 Sets
    1.1.1 The concept of a set
    1.1.2 Operations on sets
    1.1.3 Relations and functions
    1.1.4 Exercises
  1.2 Sets of numbers
    1.2.1 An example
    1.2.2 The real number system
    1.2.3 The extended real number system
  1.3 Exercises
    1.3.1 Inequalities
2 Basic notions in topology
  2.1 Metric spaces
  2.2 Compact spaces
3 Numerical sequences and series
  3.1 Numerical sequences
    3.1.1 Convergent sequences
    3.1.2 Subsequences
    3.1.3 Cauchy sequences
    3.1.4 Monotonic sequences
    3.1.5 Upper and lower limits
    3.1.6 Stolz-Cesaro theorem and some of its consequences
    3.1.7 Some special sequences
  3.2 Numerical series
    3.2.1 Series of nonnegative terms
    3.2.2 The root and ratio tests
    3.2.3 Power series
    3.2.4 Partial summation
    3.2.5 Absolutely and conditionally convergent series
    3.2.6 Exercises
4 Euclidean spaces
  4.1 Euclidean spaces
5 Limits and Continuity
  5.1 Limits
    5.1.1 The limit of a function
    5.1.2 Right-hand side and left-hand side limits
  5.2 Continuity
    5.2.1 Continuity and compactness
    5.2.2 Uniformly continuous mappings
    5.2.3 Continuity and connectedness
    5.2.4 Discontinuities
    5.2.5 Monotonic functions
    5.2.6 Darboux functions
    5.2.7 Lipschitz functions
    5.2.8 Convex functions
    5.2.9 Jensen convex functions
6 Differential calculus
  6.1 The derivative of a real function
  6.2 Mean value theorems
    6.2.1 Consequences of the mean value theorems
  6.3 The continuity of derivatives
  6.4 L'Hospital theorem
  6.5 Higher order derivatives
  6.6 Convex functions and differentiability
    6.6.1 Inequalities
7 Integral calculus
  7.1 The Riemann integral
  7.2 The Gronwall inequality
References
Author index
Subject index

Chapter 1
Sets

The aim of this chapter is to introduce several basic notions and results concerning sets.

1.1 Sets

1.1.1 The concept of a set

The basic notion of set theory, which was first introduced by Cantor¹, will occur constantly in our results. Hence it is fruitful to discuss briefly some of the notions connected to it before studying mathematical analysis.

¹ Georg Cantor, 1845-1918

We take the notion of a set as being already known. Roughly speaking, a set (collection, class, family) is any identifiable collection of objects of any sort. We identify a set by stating what its members (elements) are. The theory of sets has been described axiomatically in terms of the notion "member of" ([10]). We shall make no effort to build the complete theory of sets, but will appeal throughout to intuition and elementary logic. The so-called "naive" theory of sets is completely satisfactory for us ([8]).

We will usually adhere to the following notational conventions. Elements of sets will be denoted by small letters: a, b, c, ..., x, y, z, α, β, γ, ... . Sets will be denoted by capital Roman letters: A, B, C, ..., X, Y, ... . Families of sets will be denoted by capital script letters: 𝒜, ℬ, 𝒞, ... .

A set is often defined by some property of its elements. We will write {x | P(x)} (where P(x) is some proposition about x) to denote the set of all x such that P(x) is true. Here | is read "such that".

If the object x is an element of the set A, we write x ∈ A, while x ∉ A means that x is not in A. We write ∅ for the empty (void) set; it has no members at all. For any object x, {x} will denote the set whose only member is x. Then x ∈ {x}, but x ≠ {x}. Similarly, {x_1, x_2, ..., x_n} is the set whose elements are precisely x_1, x_2, ..., x_n. Let us emphasize that {x, x} = {x}.

Examples 1.1.
(a) The set of natural numbers, N = {0, 1, 2, 3, ...};²
(b) The set of non-zero natural numbers, N* = {1, 2, 3, ...};
(c) The set of integers, Z = {0, ±1, ±2, ±3, ...};
(d) The set of rational numbers, Q = {p/q | p, q ∈ Z, q ≠ 0};³
(e) The set of positive integers less than 7;
(f) The set of Romanian cities having more than five million inhabitants;
(g) The set S of vowels in the English alphabet. S may be written as S = {a, e, i, o, u} or S = {x | x is a vowel in the English alphabet}. △

Let A and B be sets such that every element of A is an element of B. Then A is called a subset of B and we write A ⊂ B or B ⊃ A. If A ⊂ B and B ⊂ A, we write A = B. A ≠ B denies A = B. If A ⊂ B and A ≠ B, we say that A is a proper subset of B and we write A ⊊ B. We remark that under this idea of equality of sets, the empty set is unique, i.e., if ∅_1 and ∅_2 are any two empty sets, we have ∅_1 ⊂ ∅_2 and ∅_2 ⊂ ∅_1.

Let A be a set. By 𝒫(A) we denote the family of subsets of A. Thus 𝒫(∅) = {∅}. For A = {1, 2}, we have 𝒫(A) = {∅, {1}, {2}, {1, 2}}.

It is clear that if A is not a subset of B, the following statement has to be true: "there is an element x such that x ∈ A and x ∉ B".

1.1.2 Operations on sets

If A and B are sets, we define A ∪ B as the set {x | x ∈ A or x ∈ B} and we call A ∪ B the union of A and B, see figure 1.1. Let 𝒜 be a family of sets; then we define ∪𝒜 = {x | x ∈ A for some A ∈ 𝒜}. Similarly, if {A_α}_{α∈I} is a family of sets indexed by α, we write ∪_{α∈I} A_α = {x | x ∈ A_α for some α ∈ I}.

For given sets A and B, we define A ∩ B as the set {x | x ∈ A and x ∈ B} and we call A ∩ B the intersection of A and B, see figure 1.2. If 𝒜 is any family of sets, we define ∩𝒜 = {x | x ∈ A for all A ∈ 𝒜}.

² An axiomatic introduction of natural numbers may be found in many textbooks, e.g., [22, Chapter 1].
³ An axiomatic introduction of natural, integer, and rational numbers may be found, e.g., in [17, I.§2-§4].

Figure 1.1: Union
Figure 1.2: Intersection
Figure 1.3: Disjoint sets

Similarly, if {A_α}_{α∈I} is a family of sets indexed by α, we write ∩_{α∈I} A_α = {x | x ∈ A_α for all α ∈ I}.

Theorem 1.1. Let A, B, C be any sets. Then we have
(i) A ∪ B = B ∪ A;  (i') A ∩ B = B ∩ A;  commutative law;
(ii) A ∪ A = A;  (ii') A ∩ A = A;  idempotent law;
(iii) A ∪ ∅ = A;  (iii') A ∩ ∅ = ∅;
(iv) A ∪ (B ∪ C) = (A ∪ B) ∪ C;  (iv') A ∩ (B ∩ C) = (A ∩ B) ∩ C;  associative law;
(v) A ⊂ A ∪ B;  (v') A ∩ B ⊂ A;
(vi) A ⊂ B ⇐⇒ A ∪ B = B;  (vi') A ⊂ B ⇐⇒ A ∩ B = A.

Theorem 1.2. Let A, B, C be any sets. Then we have
(i) A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C);  distributive law;
(ii) A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C);  distributive law.

If A ∩ B = ∅, the sets A and B are said to be disjoint, see figure 1.3. If 𝒜 is a family of sets such that each pair of distinct members of 𝒜 is disjoint, 𝒜 is said to be pairwise disjoint. Thus an indexed family {A_α}_{α∈I} is pairwise disjoint if A_α ∩ A_β = ∅ whenever α, β ∈ I and α ≠ β.

Let A and B be two sets. Then A \ B := {x | x ∈ A and x ∉ B} is said to be the difference of A and B, see figure 1.4. If A ⊂ X, we define the complement of A (relative to X) as the set {x | x ∈ X, x ∉ A}. This set is denoted by ∁_X A or ∁A. Other notation: X \ A.

Theorem 1.3. (de Morgan⁴ laws)
(a) ∁(A ∪ B) = (∁A) ∩ (∁B);
(b) ∁(A ∩ B) = (∁A) ∪ (∁B);
(c) ∁(∪_{α∈I} A_α) = ∩_{α∈I} ∁A_α;
(d) ∁(∩_{α∈I} A_α) = ∪_{α∈I} ∁A_α.

⁴ de Morgan

Figure 1.4: Difference
Figure 1.5: Symmetric difference

For sets A and B, the symmetric difference of A and B is the set (A \ B) ∪ (B \ A), and we write A Δ B for this set, see figure 1.5. Note that A Δ B is the set consisting of those elements which are in exactly one of A and B, and that it may also be defined by A Δ B = (A ∩ ∁B) ∪ (∁A ∩ B).

1.1.3 Relations and functions

Sometimes it becomes significant to consider the order of the elements in a set.
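The role of order can be made concrete on a computer. The following minimal Python sketch (the helper names equal_as_sets and equal_as_ordered are ours, chosen only for illustration) compares two finite lists first as sets, where order and repetition are ignored, and then position by position, where order matters.

```python
def equal_as_sets(xs, ys):
    """Compare two finite collections as sets: order and repetition are ignored."""
    return set(xs) == set(ys)


def equal_as_ordered(xs, ys):
    """Compare two finite collections position by position: order matters."""
    return tuple(xs) == tuple(ys)


print(equal_as_sets([1, 2], [2, 1]))     # True:  {1, 2} = {2, 1}
print(equal_as_sets([1, 1], [1]))        # True:  {x, x} = {x}
print(equal_as_ordered([1, 2], [2, 1]))  # False: the order differs
print(equal_as_ordered([1, 2], [1, 2]))  # True
```

The first two comparisons mirror the convention {x, x} = {x} of section 1.1.1; the last two show what changes as soon as the position of an element is taken into account.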
If we consider a pair (x_1, x_2) of elements in which we distinguish x_1 as the first element and x_2 as the second element, then (x_1, x_2) is called an ordered pair. Thus, two ordered pairs (x, y) and (u, v) are equal if and only if x = u and y = v.

Let X and Y be sets. The Cartesian product of X and Y is the set X × Y of all ordered pairs (x, y) such that x ∈ X and y ∈ Y. Hence, X × Y := {(x, y) | x ∈ X, y ∈ Y}.

Remark. (1, 2) ≠ (2, 1) while {1, 2} = {2, 1}. △

A (binary) relation R on two sets X and Y is any subset of the Cartesian product of X and Y, i.e., R is a relation on X and Y ⇐⇒ R ⊂ X × Y.

Let R be a relation. The domain of R is the set Dom R := {x | (x, y) ∈ R for some y}. The range of R is the set Range R := {y | (x, y) ∈ R for some x}. The symbol R⁻¹ denotes the inverse of R, i.e., R⁻¹ := {(y, x) | (x, y) ∈ R}.

Let R and Q be relations. The composition (product) of the two relations R and Q is the relation Q ∘ R := {(x, z) | for some y, (x, y) ∈ R and (y, z) ∈ Q}. The composition of R and Q may be empty: Q ∘ R ≠ ∅ ⇐⇒ (Range R) ∩ (Dom Q) ≠ ∅.

Let X be a set. An equivalence relation on X is any relation ∼ ⊂ X × X such that for all x, y and z in X the following hold:
(i) x ∼ x (reflexive);
(ii) x ∼ y implies y ∼ x (symmetric);
(iii) x ∼ y and y ∼ z imply x ∼ z (transitive).

Examples 1.2.
(a) "=" is an equivalence relation on the set of rational numbers Q.
(b) Let Z be the set of integers and fix a natural number n. For any a, b ∈ Z, "a is congruent to b modulo n" if a − b = kn for some k ∈ Z. Here "congruence modulo n" is an equivalence relation on Z. △

Let P be a set. A partial ordering on P is any relation ≤ ⊂ P × P such that for all x, y and z in P we have
(i) x ≤ x (reflexive);
(ii) x ≤ y and y ≤ x imply y = x (antisymmetric);
(iii) x ≤ y and y ≤ z imply x ≤ z (transitive).

Examples 1.3.
(a) Let X be a nonempty set and take A, B ⊂ X. Define A ≤ B provided A ⊂ B. Then "≤" is a partial ordering on the class of subsets of X.
(b) For m, n ∈ N define m ≤ n provided there exists k ∈ N* such that m = kn. Then "≤" is a partial ordering on N. △

If ≤ also satisfies
(iv) x, y ∈ P implies x ≤ y or y ≤ x,
then ≤ is called a total ordering on P. If x ≤ y and x ≠ y, then we write x < y. The expression x ≥ y means y ≤ x, and x > y means y < x.

Example 1.1. The usual ordering ≤ is a total ordering on Q. △

If ≤ is a total ordering such that
(v) ∅ ≠ A ⊂ P implies there exists an element a ∈ A such that a ≤ x for each x ∈ A (a is the smallest element of A),
then ≤ is called a well ordering on P.

A partially ordered set is an ordered pair (P, ≤), where P is a set and ≤ is a partial ordering on P. If ≤ is a total ordering on P, the pair (P, ≤) is called a totally ordered set. If ≤ is a well ordering, the pair (P, ≤) is called a well-ordered set.

Example 1.2. The set N of natural numbers with the usual ordering ≤ is a well-ordered set. △

Let P be a totally ordered set. For x, y ∈ P we define max{x, y} := y if x ≤ y, and max{x, y} := x if y ≤ x. For a finite subset {x_1, ..., x_n} (not all x_j's necessarily distinct), we define max{x_1, ..., x_n} := max{x_n, max{x_1, ..., x_{n−1}}}. Similarly, we define min{x, y} := x if x ≤ y, and min{x, y} := y if y ≤ x, as well as min{x_1, ..., x_n} := min{x_n, min{x_1, ..., x_{n−1}}}.

Let (P, ≤) be a partially ordered set and A ⊂ P. An element x ∈ P is said to be
(i) a lower bound of A if x ≤ y for any y ∈ A; in this case we say that A is bounded below;
(ii) an upper bound of A if y ≤ x for any y ∈ A; in this case we say that A is bounded above;
(iii) the greatest lower bound of A or infimum of A if
  (iii1) x is a lower bound of A,
  (iii2) if x < y, then y is not a lower bound of A;
(iv) the least upper bound of A or supremum of A if
  (iv1) x is an upper bound of A,
  (iv2) if y < x, then y is not an upper bound of A.
A is bounded if it is bounded below and above.

Figure 1.6
Figure 1.7
Figure 1.8

Remark. A set A may have several lower and/or upper bounds.
A set A has at most one infimum (denoted by inf A) and at most one supremum (denoted by sup A). △

Example. Let E consist of all numbers 1/n, where n = 1, 2, ... . Then E is bounded, sup E = 1, inf E = 0, and 1 ∈ E while 0 ∉ E. △

Let f be a relation and A be a set. The image of A under f is the set f(A) := {y | (x, y) ∈ f for some x ∈ A}. Observe that f(A) ≠ ∅ ⇐⇒ A ∩ Dom f ≠ ∅. When f(A) ⊂ B, this is interpreted as "f maps the set A into the set B". The inverse image of A under f is the set f⁻¹(A) := {x | (x, y) ∈ f for some y ∈ A}; see [18, §3.2], [19, §1.1], and [20, §2.3].

A relation f is said to be single-valued if (x, y) ∈ f and (x, z) ∈ f imply y = z. In this case we write f(x) = y. A single-valued relation is called a function (mapping, application, transformation, operation). If f and f⁻¹ are both single-valued, then f is called a bijective function, figure 1.8.

Theorem 1.4. Let X and Y be sets and f ⊂ X × Y be a relation. Suppose that {A_i}_{i∈I} is a family of subsets of X and {B_j}_{j∈J} is a family of subsets of Y. Then
(a) f(∪_{i∈I} A_i) = ∪_{i∈I} f(A_i);
(b) f⁻¹(∪_{j∈J} B_j) = ∪_{j∈J} f⁻¹(B_j);
(c) f(∩_{i∈I} A_i) ⊂ ∩_{i∈I} f(A_i).
The following statements are true if f is a function, but may fail for arbitrary relations:
(d) f⁻¹(∩_{j∈J} B_j) = ∩_{j∈J} f⁻¹(B_j);
(e) f⁻¹(∁_Y B) = ∁_X (f⁻¹(B)), B ⊂ Y;
(f) f(f⁻¹(B) ∩ A) = B ∩ f(A), A ⊂ X, B ⊂ Y.

Let f be a function such that Dom f = X and Range f ⊂ Y. Then f is said to be a function from (on) X into (to) Y and we write f : X → Y. If Range f = Y, we say that f is onto (surjective), that is, f(X) = Y. This means that for every y ∈ Y there exists an x ∈ X such that y = f(x). The function f is said to be injective or one-to-one if f(x) = y and f(t) = y imply x = t. In other words, a function f : X → Y is said to be one-to-one if distinct elements of X have distinct images in Y, that is, if no two different elements in X have the same image.
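For finite sets these notions can be checked mechanically. The sketch below is only an illustration under our own assumptions: a relation is modelled as a Python set of pairs (x, y), and the helper names is_single_valued, image, is_onto and is_injective are ours, not part of the notes.

```python
def is_single_valued(rel):
    """True if (x, y) in rel and (x, z) in rel always force y == z."""
    values = {}
    for x, y in rel:
        # setdefault stores y for a new x, otherwise returns the stored value
        if values.setdefault(x, y) != y:
            return False
    return True


def image(rel, subset):
    """f(A) = {y | (x, y) in f for some x in A}."""
    return {y for (x, y) in rel if x in subset}


def is_onto(rel, codomain):
    """Range f = Y."""
    return {y for (_, y) in rel} == set(codomain)


def is_injective(rel):
    """f(x) = f(t) implies x = t, i.e. the inverse relation is single-valued."""
    return is_single_valued({(y, x) for (x, y) in rel})


# Hypothetical example 1: squaring on X = {-1, 0, 1}, with Y = {0, 1}.
square = {(t, t * t) for t in (-1, 0, 1)}
print(is_single_valued(square))  # True
print(is_onto(square, {0, 1}))   # True
print(is_injective(square))      # False, since (-1)^2 = 1^2
print(image(square, {-1, 1}))    # {1}

# Hypothetical example 2: doubling on X = {0, 1, 2}, with Y = {0, 1, 2, 3, 4}.
double = {(n, 2 * n) for n in range(3)}
print(is_injective(double))          # True
print(is_onto(double, range(5)))     # False, 1 and 3 are not attained
```

The first function is onto but not one-to-one, the second is one-to-one but not onto, so neither is bijective.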
A convenient way to decide whether or not a given map is injective is to check that, for x, t ∈ X, f(x) = f(t) implies x = t.

Figure 1.6 exhibits a surjective but not one-to-one function, figure 1.7 exhibits an injective but not onto function, while figure 1.8 presents a bijective function.

Theorem 1.5. Let X and Y be nonempty sets and f : X → Y be a function. Then f is bijective if and only if it is injective and onto.

A sequence is a function having N* as its domain. Sometimes we will consider sequences having N as their domain. If x is a sequence, we frequently write x_n instead of x(n) for the value of x at n. The value x_n is called the nth term of the sequence. The sequence whose nth term is x_n will be denoted by (x_n)_{n=1}^∞ or (x_n)_n or (x_n). A sequence (x_n) is said to be in X if x_n ∈ X for each n ∈ N*.

1.1.4 Exercises

1. Let S = {0, ±1, ±2, 3} and let Q be the set of rational numbers. For the function f : S → Q given by f(t) = t² − 1, for all t ∈ S, find f(S).

2. Decide whether or not each of the following functions is one-to-one and/or onto:
(i) the function f : N → N, defined by f(n) = 2n, n ∈ N;
(ii) the function f : Q × Q → Q, defined by f(p, q) = p, p, q ∈ Q;
(iii) the function f : Q × Q → Q × Q, defined by f(p, q) = (p, −q), p, q ∈ Q.

1.2 Sets of numbers

A satisfactory discussion of the main concepts of analysis (e.g., convergence, continuity, differentiation and integration) must be based on an accurately defined number concept. We shall not, however, enter into any discussion of the axioms governing the arithmetic of the integers, but we take the rational number system as our starting point.

1.2.1 An example

It is well known that the rational number system is inadequate for many purposes. Maybe the most frustrating case is the following. Consider an isosceles right triangle whose legs (catheti) have length 1. Can we express the length of the hypotenuse as a rational number?

Example.
Let us begin by showing that the equation
\[ p^2 = 2 \tag{2.1} \]
is not satisfied by any rational p. For, suppose that (2.1) is satisfied. Then we can write p = m/n, where m and n are integers with n ≠ 0, and we can further choose m and n so that not both are even. Let us assume that this is done. Then (2.1) implies
\[ m^2 = 2n^2. \tag{2.2} \]
This shows that m² is even. Hence m is even (if m were odd, m² would be odd), and so m² is divisible by 4. It follows that the right-hand side of (2.2) is divisible by 4, so that n² is even, which implies that n is even. Thus the assumption that (2.1) holds for a rational number leads us to the conclusion that both m and n are even, contrary to our choice of m and n. Hence (2.1) is impossible for rational p. So the length of the hypotenuse of an isosceles right triangle with legs of length 1 is not rational.

Let us examine the situation a little more closely. Let A be the set of all positive rationals p such that p² < 2, and let B be the set of all positive rationals p such that p² > 2. We shall show that A contains no largest element, and B contains no smallest element. More explicitly, for every p ∈ A we can find a rational q ∈ A such that p < q, and for every p ∈ B we can find a rational q ∈ B such that q < p.

Suppose that p ∈ A. Then p² < 2. Choose a rational h such that 0 < h < 1 and such that
\[ h < \frac{2 - p^2}{2p + 1}. \]
Put q = p + h. Then q > p, and
\[ q^2 = p^2 + (2p + h)h < p^2 + (2p + 1)h < p^2 + (2 - p^2) = 2, \]
so that q is in A. This proves the first part of our assertion.

Next, suppose that p ∈ B. Then p² > 2. Put
\[ q = p - \frac{p^2 - 2}{2p} = \frac{p}{2} + \frac{1}{p}. \]
Then 0 < q < p and
\[ q^2 = p^2 - (p^2 - 2) + \left( \frac{p^2 - 2}{2p} \right)^2 > p^2 - (p^2 - 2) = 2, \]
so that q ∈ B. △

The purpose of the above discussion has been to show that the rational number system has certain gaps, in spite of the fact that between any two rationals p < q there is another (since p < (p + q)/2 < q).

1.2.2 The real number system

There are several ways of introducing the real number set.
We say that a set X is the real number set provided there are defined two operations

X × X ∋ (x, y) ↦ x + y ∈ X,    X × X ∋ (x, y) ↦ xy ∈ X,

called addition and multiplication, as well as a binary relation ≤ called ordering, satisfying the following axioms (conditions, assumptions):

(R 1) (x + y) + z = x + (y + z), ∀ x, y, z ∈ X;
(R 2) there exists an element 0 ∈ X, called zero or null, such that x + 0 = x, ∀ x ∈ X;
(R 3) for each x ∈ X there exists an element −x ∈ X, called the opposite of x, such that x + (−x) = 0;
(R 4) x + y = y + x, ∀ x, y ∈ X;
(R 5) (xy)z = x(yz), ∀ x, y, z ∈ X;
(R 6) there exists an element 1 ∈ X \ {0}, called unity or identity, such that x1 = x, ∀ x ∈ X;
(R 7) for each element x ∈ X \ {0} there exists an element x⁻¹ ∈ X, called the inverse of x, such that xx⁻¹ = 1;
(R 8) xy = yx, ∀ x, y ∈ X;
(R 9) x(y + z) = xy + xz, ∀ x, y, z ∈ X;
(R 10) x ≤ x, for all x ∈ X;
(R 11) x ≤ y and y ≤ x imply x = y, for all x, y ∈ X;
(R 12) x ≤ y and y ≤ z imply x ≤ z, for all x, y, z ∈ X;
(R 13) for all x, y ∈ X we have x ≤ y or y ≤ x;
(R 14) x ≤ y implies x + z ≤ y + z, for all x, y, z ∈ X;
(R 15) x ≥ 0 and y ≥ 0 imply xy ≥ 0;
(R 16) for every ordered pair (A, B) of nonempty subsets of X having the property that x ≤ y for every x ∈ A and y ∈ B, there exists an element z ∈ X such that x ≤ z ≤ y, for any x ∈ A and y ∈ B.

Remarks. From (R 1)-(R 4) we have that (X, +) is an Abelian (commutative) group; from (R 5)-(R 8) we have that (X \ {0}, ·) is an Abelian (commutative) group, too; from (R 1)-(R 9) we have that (X, +, ·) is a field; from (R 10)-(R 13) we have that (X, ≤) is a totally ordered set; from (R 1)-(R 15) we have that (X, +, ·, ≤) is a totally ordered field. (R 14) and (R 15) express the compatibility of the ordering relation with the algebraic operations. (R 16) has a special role which will become clear a little later. The element x⁻¹ ∈ X \ {0} from (R 7) is also denoted by 1/x. Hence y/x = yx⁻¹. △

For a while let us ignore assumption (R 16).
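The rational numbers form a totally ordered field, so they are a convenient model while (R 16) is set aside, and the sets A and B of section 1.2.1 show what is then missing. The following Python sketch replays the two constructions of that example in exact rational arithmetic. It rests on assumptions of ours: the standard fractions.Fraction type stands in for Q, the function names larger_in_A and smaller_in_B are hypothetical, and the particular choice of h (half of the admissible bound, capped at 1/2) is just one of many that satisfy the two conditions 0 < h < 1 and h < (2 − p²)/(2p + 1).

```python
from fractions import Fraction


def larger_in_A(p):
    """Given a rational p > 0 with p^2 < 2, return a rational q > p with q^2 < 2."""
    assert p > 0 and p * p < 2
    bound = (2 - p * p) / (2 * p + 1)      # h must stay strictly below this bound
    h = min(Fraction(1, 2), bound / 2)     # one admissible choice: 0 < h < 1, h < bound
    return p + h


def smaller_in_B(p):
    """Given a rational p > 0 with p^2 > 2, return a rational q with 0 < q < p and q^2 > 2."""
    assert p > 0 and p * p > 2
    return p / 2 + 1 / p                   # q = p - (p^2 - 2) / (2p)


p = Fraction(1)
for _ in range(5):
    q = larger_in_A(p)
    assert p < q and q * q < 2             # A has no largest element
    p = q

r = Fraction(2)
for _ in range(5):
    s = smaller_in_B(r)
    assert 0 < s < r and s * s > 2         # B has no smallest element
    r = s

print(larger_in_A(Fraction(1)))            # 7/6
print(smaller_in_B(Fraction(2)))           # 3/2
print(smaller_in_B(Fraction(3, 2)))        # 17/12
```

Every element of A lies below every element of B, yet no rational z satisfies x ≤ z ≤ y for all x ∈ A and y ∈ B; this is exactly the kind of gap that assumption (R 16) rules out.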
Proposition 2.1. From (R 1)-(R 15) it follows that
(1) (a) x_1 ≤ x_2 and y_1 ≤ y_2 imply x_1 + y_1 ≤ x_2 + y_2;
    (b) x_1 < x_2 and y_1 ≤ y_2 imply x_1 + y_1 < x_2 + y_2;
(2) (a) x > 0 if and only if x⁻¹ > 0;
    (b) x ≥ 0 implies −x ≤ 0;
    (c) x > 0 implies −x < 0;
(3) (a) x ≤ y and z > 0 imply xz ≤ yz;
    (b) x < y and z > 0 imply xz < yz;
    (c) x ≤ y and z < 0 imply xz ≥ yz;
    (d) x < y and z < 0 imply xz > yz;
(4) if xy > 0, then x ≤ y if and only if 1/x ≥ 1/y;
(5) (a) 0 ≤ x_1 ≤ x_2 and 0 ≤ y_1 ≤ y_2 imply x_1 y_1 ≤ x_2 y_2;
    (b) 0 < x_1 < x_2 and 0 < y_1 ≤ y_2 imply x_1 y_1 < x_2 y_2;
    (c) x_1 ≤ x_2 ≤ 0 and y_1 ≤ y_2 ≤ 0 imply x_1 y_1 ≥ x_2 y_2;
    (d) x_1 < x_2 ≤ 0 and y_1 ≤ y_2 < 0 imply x_1 y_1 > x_2 y_2.

The absolute value function is defined, for x ∈ X, by |x| = x if x ≥ 0, and |x| = −x if x < 0.

Proposition 2.2. From (R 1)-(R 15) it follows that
(1) (a) |x| ≥ 0;
    (b) |x| = 0 ⇐⇒ x = 0;
    (c) |x| = |−x|;
(2) (a) |x + y| ≤ |x| + |y|;
    (b) |x − y| ≥ | |x| − |y| |;
(3) (a) |x| ≤ a ⇐⇒ −a ≤ x ≤ a;
    (b) |x| < a ⇐⇒ −a < x < a;
(4) (a) |xy| = |x| · |y|;
    (b) |x/y| = |x|/|y|, for y ≠ 0;
    (c) |xⁿ| = |x|ⁿ.

The distance function is defined, for x, y ∈ X, by d(x, y) = |x − y|.

Proposition 2.3. From proposition 2.2 it follows that
d(x, y) = 0 ⇐⇒ x = y;
d(x, y) = d(y, x), ∀ x, y ∈ X;
d(x, y) ≤ d(x, z) + d(z, y), ∀ x, y, z ∈ X.

Warning. There exist several systems satisfying axioms (R 1)-(R 16). But all of them are algebraically and order isomorphic, [9, theorem 5.34]. We choose one of them, call it the set of real numbers, and denote it by R = (R, +, ·, ≤).

Any element x ∈ R is called a real number. Any real number x such that 0 ≤ x is called non-negative, while if 0 < x it is called positive. Any real number x such that 0 ≥ x is called non-positive, while if 0 > x it is called negative.

Proposition 2.4. The number 1 is positive.

Proof. Suppose that 1 is non-positive, i.e., 1 ≤ 0. Adding −1 to both sides we have 0 ≤ −1. Multiplying both sides by the non-negative number −1 and using (R 15), we get 0 ≤ (−1)(−1) ⇐⇒ 0 ≤ 1.
Now, 1 is simultaneously non-negative and non-positive, so 0 = 1 by (R 11). But this contradicts (R 6). Hence 1 > 0. □

It is clear that any set of real numbers (i.e., any subset of R) having an infimum is nonempty and bounded below. The converse statement is true, too.

Theorem 2.1. Every nonempty and bounded below subset A of R has an infimum.

Proof. Denote by A′ the set of lower bounds of A. Since A is bounded below, A′ ≠ ∅. Remark that the ordered pair (A′, A) has the property that x ≤ y for every x ∈ A′ and y ∈ A. From (R 16) it follows that there exists a real number z such that x ≤ z ≤ y for any x ∈ A′ and y ∈ A. Hence z is a lower bound of A which is greater than or equal to every other lower bound, that is, z is the greatest element of A′, i.e., the infimum of A. □

Corollary 2.1. If A is a nonempty and bounded below subset of R and B is a nonempty subset of A, then inf A ≤ inf B.

Theorem 2.2. Every nonempty and bounded above subset A of R has a supremum.

Corollary 2.2. If A is a nonempty and bounded above subset of R and B is a nonempty subset of A, then sup A ≥ sup B.

Remark. The proofs of the existence of an infimum and of the existence of a supremum use axiom (R 16). At the same time it can be proved that (R 16) follows from either of these theorems. Thus (R 16) is equivalent to either of them. Hence we may replace (R 16) by either of these statements and get the same real number system. △

Theorem 2.3. Suppose X is a totally ordered field (i.e., it satisfies axioms (R 1)-(R 15)) and, moreover, every nonempty and bounded above subset of it has a supremum. Then axiom (R 16) is fulfilled.

Proof. Consider an ordered pair (A, B) of nonempty subsets of X having the property that x ≤ y for any x ∈ A and y ∈ B. Then A is nonempty and bounded above (by any element of B). It follows that there exists z ∈ X such that

(2.3)    z = sup A.

Then x ≤ z for every x ∈ A, and we have to show that z ≤ y for every y ∈ B. For, suppose there exists y_0 ∈ B such that y_0 < z.
Then y_0 is an upper bound of A strictly less than z, contradicting (2.3). □

A similar statement holds for the infimum.

Theorem 2.4. Suppose X is a totally ordered field and, moreover, every nonempty and bounded below subset of it has an infimum. Then axiom (R 16) is fulfilled.

Theorem 2.5. (Archimedes⁵ principle, [3, theorem 6.5.1, p. 72]) For any two real numbers x and y such that y > 0 there exists a natural number n such that x < ny.

⁵ Archimedes, 287-212 BC

Proof. Under the above-mentioned assumptions define

A = {u ∈ R | ∃ n ∈ N*, u < ny}

and remark that A ≠ ∅ (since, at least, y ∈ A). We wish to show that A = R. For, suppose that A ≠ R and denote B = R \ A. Obviously, B ≠ ∅. Note that u < v for any u ∈ A and v ∈ B. Indeed, for any u ∈ A there exists a natural n such that u < ny. Since v ∉ A and the real number set is totally ordered, it follows that ny ≤ v. Then u < ny ≤ v =⇒ u < v. Axiom (R 16) implies that for the ordered pair (A, B) there exists a real number z such that

(2.4)    u ≤ z ≤ v, ∀ u ∈ A, v ∈ B.

The real number z − y belongs to A, since otherwise z − y ∈ B, and then by (2.4) z ≤ z − y =⇒ y ≤ 0, contradicting the hypothesis. So z − y ∈ A. Then we can find a natural number n such that z − y < ny. We also have z + y = (z − y) + 2y < (n + 2)y, and it follows that z + y ∈ A. Then, by (2.4), z + y ≤ z, so y ≤ 0. The contradiction shows that A = R and the theorem is proved. □

Remark. The Archimedes principle follows from axiom (R 16), but it is not equivalent to it: the rational number system satisfies (R 1)-(R 15) and the Archimedes principle, while the sets A and B of section 1.2.1 show that it does not satisfy (R 16). △

Theorem 2.6. The supremum of a nonempty and bounded above set is unique.

Proof. For, suppose that sup A = a_1 and sup A = a_2 and a_1 ≠ a_2. Then either a_1 < a_2 or a_2 < a_1. In both cases we get a contradiction, since the smaller of the two numbers is an upper bound of A strictly less than the least upper bound. □

Theorem 2.7. A real number a is the supremum of a set A if and only if
(i) for any x ∈ A, x ≤ a;
(ii) for any ε > 0 there is y ∈ A such that y > a − ε.

Proof. (i) says that a is an upper bound of A, while (ii) shows that there is no upper bound less than a.
□

Theorem 2.8. ([23, theorem 1.37, p. 11]) For every real x > 0 and every integer n > 0, there is one and only one real y > 0 such that y^n = x.

Remark. This number y is written ⁿ√x or x^(1/n).

Proof. That there is at most one such y is clear, since 0 < y_1 < y_2 implies y_1^n < y_2^n. Let E be the set consisting of all positive reals t such that t^n < x. If t = x/(1 + x), then 0 < t < 1; hence t^n ≤ t < x, so E is not empty. Put t_0 = 1 + x. Then t > t_0 implies t^n ≥ t > x, so that t ∉ E, and t_0 is an upper bound of E. Let y = sup E (which exists, by theorem 2.2).

Suppose y^n < x. Choose h such that 0 < h < 1 and such that
\[ h < \frac{x - y^n}{(1 + y)^n - y^n}. \]
We have
\[ (y + h)^n = y^n + \binom{n}{1} y^{n-1} h + \binom{n}{2} y^{n-2} h^2 + \cdots + h^n \le y^n + h \left[ \binom{n}{1} y^{n-1} + \binom{n}{2} y^{n-2} + \cdots + 1 \right] = y^n + h \left[ (1 + y)^n - y^n \right] < y^n + (x - y^n) = x. \]
Thus y + h ∈ E, contradicting the fact that y is an upper bound of E.

Suppose y^n > x. Choose k such that 0 < k < 1, k < y, and such that
\[ k < \frac{y^n - x}{(1 + y)^n - y^n}. \]
Then, for t ≥ y − k, we have
\[ t^n \ge (y - k)^n = y^n - \binom{n}{1} y^{n-1} k + \binom{n}{2} y^{n-2} k^2 - \cdots + (-1)^n k^n = y^n - k \left[ \binom{n}{1} y^{n-1} - \binom{n}{2} y^{n-2} k + \cdots + (-1)^{n-1} k^{n-1} \right] \ge y^n - k \left[ \binom{n}{1} y^{n-1} + \binom{n}{2} y^{n-2} + \cdots + 1 \right] = y^n - k \left[ (1 + y)^n - y^n \right] > y^n + (x - y^n) = x. \]
Thus y − k is an upper bound of E, contradicting the fact that y = sup E. It follows that y^n = x. □

An interval A of the real number system is a subset of R such that for every x, y ∈ A and z ∈ R satisfying x ≤ z ≤ y, we have z ∈ A. An interval bounded below and above is said to be bounded. Otherwise it is called unbounded. For any nonempty and bounded interval A, the non-negative real number l(A) = sup A − inf A is said to be the length of A. We remark that for any real numbers a and b with a ≤ b the following sets are intervals:

[a, b] = {x ∈ R | a ≤ x ≤ b}    closed interval;
[a, b[ = {x ∈ R | a ≤ x < b}    left closed right open interval;
]a, b] = {x ∈ R | a < x ≤ b}    left open right closed interval;
]a, b[ = {x ∈ R | a < x < b}    open interval.

Theorem 2.9.
Let $(I_k)_{k \in \mathbb{N}^*}$ be a nested sequence of nonempty closed intervals in $\mathbb{R}$, i.e.,
$$I_{k+1} \subset I_k, \quad k \in \mathbb{N}^*. \tag{2.5}$$
Then $\bigcap_{k \in \mathbb{N}^*} I_k \ne \emptyset$.

Proof. Denote $I_k = [a_k, b_k]$, $k \in \mathbb{N}^*$. From (2.5) it follows that
$$a_k \le a_{k+1} \le b_{k+1} \le b_k, \quad k \in \mathbb{N}^*. \tag{2.6}$$
Denote $A = \{x \mid x = a_k \text{ for some } k \in \mathbb{N}^*\}$ and $B = \{y \mid y = b_k \text{ for some } k \in \mathbb{N}^*\}$. Then for any $x \in A$ and any $y \in B$ we have $x \le y$, since otherwise there are $a_k \in A$ and $b_m \in B$ such that $b_m < a_k$. We have either $m < k$ or $k < m$. Suppose $m < k$. Then $b_m < a_k \le b_k \le b_m$, thus contradicting (2.6); the case $k < m$ is similar. Axiom (R 16) supplies a real $z$ such that $a_k \le z \le b_k$, $k \in \mathbb{N}^*$. Then $z \in I_k$ for every $k \in \mathbb{N}^*$, and therefore $z \in \bigcap_{k \in \mathbb{N}^*} I_k$. □

Theorem 2.10. For every real $x$ there exists a unique integer $k$ such that $k - 1 \le x < k$.

Let $x$ be a real number. Its integer part is the integer $k - 1$, where $k$ is the unique (by theorem 2.10) integer satisfying $k - 1 \le x < k$; it is denoted by $[x]$, so that $[x] \le x < [x] + 1$. Hence, the integer part function is defined by $\mathbb{R} \ni x \mapsto [x] \in \mathbb{Z}$. The fractional part of a real number $x$ is defined as $x - [x]$ and it is denoted by $\{x\}$. So, the fractional part function is defined by $\mathbb{R} \ni x \mapsto \{x\} \in [0, 1[$.

Theorem 2.11. For every two real numbers $x$ and $y$ such that $x < y$ there exists a rational lying between them, i.e., $x < u < y$ for an element $u \in \mathbb{Q}$.

Proof. Based on the Archimedes principle (theorem 2.5), for the positive real $y - x$ there exists a natural $n$ such that $1 < n(y - x)$. Then
$$1/n < y - x. \tag{2.7}$$
From theorem 2.10 it follows that there exists an integer $m$ such that
$$m \le nx < m + 1. \tag{2.8}$$
Obviously, $u = (m + 1)/n$ is a rational, and satisfies $x < u$. From the left-hand side of (2.8) as well as from (2.7) we infer that $u$ also satisfies
$$u = \frac{m}{n} + \frac{1}{n} \le x + \frac{1}{n} < y. \qquad □$$

Corollary 2.3. Given any two real numbers $x$ and $y$ such that $x < y$, there exists an irrational number $v$ (from $\mathbb{R} \setminus \mathbb{Q}$) such that $x < v < y$.

Proof. Choose any irrational number $v_0$ ($\sqrt{2}$, for example). Then $x - v_0 < y - v_0$ and by theorem 2.11 there exists a rational $u$ such that $x - v_0 < u < y - v_0$, i.e., $x < v_0 + u < y$.
Finally, we remark that $v = v_0 + u$ is irrational, since otherwise it follows that $v_0$ itself is rational, and this is not the case. □

Let $A$ and $B$ be two sets. If there exists a bijective mapping from $A$ onto $B$, we say that $A$ and $B$ have the same cardinal number or that $A$ and $B$ are equivalent, and we write $A \sim B$.

Theorem 2.12. The relation $\sim$ defined above is an equivalence relation.

For any positive integer $n$, let $\mathbb{N}^*_n$ be the set whose elements are precisely the integers $1, 2, \dots, n$. For a set $A$ we say that
(a) $A$ is finite if $A \sim \mathbb{N}^*_n$ for some $n$ (the empty set is, by definition, finite). The number of elements of a nonempty finite set $A$ is $n$ provided $A \sim \mathbb{N}^*_n$. In this case we write $|A| = n$, and we read "the number of elements of the nonempty and finite set is equal to $n$". By definition, $|\emptyset| = 0$;
(b) $A$ is infinite if $A$ is not finite;
(c) $A$ is countable if $A \sim \mathbb{N}^*$. We write $|A| = \aleph_0$;
(d) $A$ is uncountable if $A$ is neither finite nor countable. We write $|A| \ge \aleph_1 > \aleph_0$;
(e) $A$ is at most countable (or denumerable) if $A \sim \mathbb{N}^*$ or $A \sim \mathbb{N}^*_n$ for some $n \in \mathbb{N}^*$. We write $|A| \le \aleph_0$.

Remarks 2.1. (a) For two finite sets $A$ and $B$, we obviously have $A \sim B$ if and only if $A$ and $B$ contain the same number of elements. For infinite sets, however, this is not exactly so. Indeed, let $M$ be the set of all even positive integers, $M = \{2, 4, 6, \dots\}$. It is clear that $M$ is a proper subset of $\mathbb{N}^*$. But $\mathbb{N}^* \sim M$, since $\mathbb{N}^* \ni n \mapsto 2n \in M$ is a bijection.
(b) The sets $\{1, -1, 2, -2, 3, -3, \dots\}$ and $\{0, 1, -1, 2, -2, 3, -3, \dots\}$ are equivalent. Indeed, let $a_k$ be the term of rank $k$ of the first set and define
$$b_k = \begin{cases} 0, & k = 1, \\ a_{k-1}, & k > 1. \end{cases}$$
Then $(b_k)$ lists the elements of the second set without repetitions, so both $k \mapsto a_k$ and $k \mapsto b_k$ are bijections defined on $\mathbb{N}^*$, and $a_k \mapsto b_k$ is a bijection from the first set onto the second. We conclude that the sets $\mathbb{N}^*$ and $\mathbb{Z}$ are equivalent. Therefore we can write $|\mathbb{Z}| = \aleph_0$. △

[Figure 1.9: Infinite array]

Exercise. Let $A$ be a finite set. Then $|\mathcal{P}(A)| = 2^{|A|}$.

Theorem 2.13. Every infinite subset of a countable set is countable.

Theorem 2.14. Let $\{A_n\}$, $n = 1, 2, \dots$
, be a sequence of countable sets, and put
$$B = \bigcup_{n=1}^{\infty} A_n. \tag{2.9}$$
Then $B$ is countable.

Proof. Let every set $A_n$ be arranged in a sequence $(x_{nk})_k$, $k = 1, 2, \dots$, and consider the infinite array in which the elements of $A_n$ form the $n$th row, figure 1.9. The array contains all elements of $B$. As indicated by the arrows, these elements can be arranged in a sequence
$$x_{11},\ x_{21},\ x_{12},\ x_{31},\ x_{22},\ x_{13},\ \dots. \tag{2.10}$$
If any two of the sets $A_n$ have elements in common, these will appear more than once in (2.10). Hence there is a subset $C$ of the set of all positive integers such that $C \sim B$, which shows that $B$ is at most countable (theorem 2.13). Since $A_1 \subset B$, and $A_1$ is infinite, $B$ is infinite, and thus countable. □

Corollary 2.4. Suppose $A$ is at most countable and for every $\alpha \in A$, $B_\alpha$ is at most countable. Put $C = \bigcup_{\alpha \in A} B_\alpha$. Then $C$ is at most countable.

Proof. $C$ is equivalent to a subset of (2.9). □

Theorem 2.15. Let $A$ be a countable set, and let $B_n$ be the set of all ordered $n$-tuples $(a_1, a_2, \dots, a_n)$, where $a_k \in A$ ($k = 1, \dots, n$), i.e.,
$$B_n = \underbrace{A \times A \times \cdots \times A}_{n \text{ times}}.$$
Then $B_n$ is countable.

Proof. That $B_1$ is countable is obvious, since $B_1 = A$. Suppose $B_{n-1}$ is countable ($n = 2, 3, \dots$). The elements of $B_n$ are of the form
$$(b, a) \quad (b \in B_{n-1},\ a \in A). \tag{2.11}$$
For every fixed $b$, the set of pairs $(b, a)$ is equivalent to $A$, and hence countable. Thus $B_n$ is a countable union of countable sets. By theorem 2.14, $B_n$ is countable. □

Corollary 2.5. The set of all rational numbers is countable.

Proof. We apply theorem 2.15 with $n = 2$, noting that every rational $r$ is of the form $a/b$, where $a$ and $b$ are integers and $b \ne 0$. The set of such pairs $(a, b)$, and therefore the set of fractions $a/b$, is countable. □

Therefore we can write $|\mathbb{Q}| = \aleph_0$.

Theorem 2.16. Let $A$ be the set of all sequences whose elements are the digits 0 and 1. Then $A$ is uncountable.

Proof. Let $B$ be a countable subset of $A$, and let $B$ consist of the sequences $s_1, s_2, \dots$.
We construct a sequence $s$ as follows. If the $n$th digit in $s_n$ is 1 we let the $n$th digit of $s$ be 0, and vice versa. Then the sequence $s$ differs from every member of $B$ in at least one place; hence $s \notin B$. But clearly $s \in A$, so that $B$ is a proper subset of $A$. We have shown that every countable subset of $A$ is a proper subset of $A$. It follows that $A$ is uncountable (for otherwise $A$ would be a proper subset of $A$, which is absurd). □

Corollary 2.6. The real number set is uncountable.

Proof. Use the binary representation of the real numbers and apply theorem 2.16. □

Therefore we can write $|\mathbb{R}| = \aleph_1$.

1.2.3 The extended real number system

The extended real number set consists of the real number set to which two symbols, $+\infty$ and $-\infty$, have been adjoined, with the following properties
(a) if $x$ is real, $-\infty < x < +\infty$, and
$$x + \infty = +\infty, \quad x - \infty = -\infty, \quad \frac{x}{+\infty} = \frac{x}{-\infty} = 0;$$
(b) if $x > 0$, $x(+\infty) = +\infty$, $x(-\infty) = -\infty$;
(c) if $x < 0$, $x(+\infty) = -\infty$, $x(-\infty) = +\infty$.

The extended real number system is denoted by $\overline{\mathbb{R}} = \mathbb{R} \cup \{+\infty\} \cup \{-\infty\}$ with the above-mentioned conventions. Any element of $\mathbb{R}$ is called finite while $+\infty$ and $-\infty$ are called infinities.

Let $A$ be a nonempty subset of the extended real number set. If $A$ is not bounded above (i.e., for every real $y$ there is an $x \in A$ such that $y < x$), we define $\sup A = +\infty$. Similarly, if $A$ is not bounded below (i.e., for every real $y$ there is an $x \in A$ such that $y > x$), we define $\inf A = -\infty$.

We define intervals involving infinities:

$[a, +\infty[ \; = \{x \in \mathbb{R} \mid a \le x\}$; $]a, +\infty[ \; = \{x \in \mathbb{R} \mid a < x\}$;
$[a, +\infty] = \{x \in \overline{\mathbb{R}} \mid a \le x \le +\infty\}$; $]a, +\infty] = \{x \in \overline{\mathbb{R}} \mid a < x \le +\infty\}$;
$]-\infty, a] = \{x \in \mathbb{R} \mid x \le a\}$; $]-\infty, a[ \; = \{x \in \mathbb{R} \mid x < a\}$;
$[-\infty, a] = \{x \in \overline{\mathbb{R}} \mid -\infty \le x \le a\}$; $[-\infty, a[ \; = \{x \in \overline{\mathbb{R}} \mid -\infty \le x < a\}$;
$[-\infty, \infty] = \overline{\mathbb{R}}$; $]-\infty, \infty[ \; = \mathbb{R}$.

1.3 Exercises

1.3.1 Inequalities

The proofs of the Hölder⁶ inequalities are based on the W. H. Young⁷ inequality.

Theorem 3.1.
(The integral form of Young inequality, [9, p.189]) Let $f$ be a continuous and strictly increasing function defined on $[0, \infty)$ such that $\lim_{u \to \infty} f(u) = \infty$ and $f(0) = 0$. Denote $g = f^{-1}$. For $x \in [0, \infty)$ we define the following functions
$$F(x) = \int_0^x f(u)\,du \quad \text{and} \quad G(x) = \int_0^x g(v)\,dv. \tag{3.1}$$
Then $a, b \in [0, \infty)$ imply
$$ab \le F(a) + G(b), \tag{3.2}$$
and equality holds if and only if $b = f(a)$.

⁶ Otto Hölder
⁷ W. H. Young

Corollary 3.1. (Young inequality, [9, p. 90]) Suppose $p > 1$ and $\alpha$ and $\beta$ are non-negative reals. Then
$$\alpha\beta \le \frac{\alpha^p}{p} + \frac{\beta^q}{q}, \quad \text{if } \frac{1}{p} + \frac{1}{q} = 1. \tag{3.3}$$
The equality holds if and only if $\alpha^p = \beta^q$.

Proof. The first approach runs as follows. For $u \in [0, \infty)$, define $f(u) = u^{p-1}$ and apply theorem 3.1. The second approach runs as follows. Consider the function $f : \; ]0, \infty[ \; \to \mathbb{R}$ given by
$$f(x) = \frac{x^p}{p} + \frac{x^{-q}}{q}.$$
It has an absolute minimum at $x = 1$. The required inequality follows from $f(1) \le f\left(\alpha^{1/q} \beta^{-1/p}\right)$. □

Let $a = (\alpha_1, \dots, \alpha_n) \in \mathbb{R}^n$ or $\mathbb{C}^n$. If $r \ne 0$, define the weighted mean with weight $r$ of the finite sequence $a$ as
$$M_r(a) = \left(\sum_{k=1}^{n} |\alpha_k|^r\right)^{1/r} = \left(\sum |\alpha_k|^r\right)^{1/r}. \tag{3.4}$$

Proposition 3.1. Suppose $p > 1$, $1/p + 1/q = 1$, and there are given two finite sequences $a = (\alpha_1, \dots, \alpha_n)$ and $b = (\beta_1, \dots, \beta_n)$ satisfying $M_p(a) = M_q(b) = 1$. Then
$$M(ab) = M_1(ab) \le 1, \tag{3.5}$$
where $ab = (\alpha_1\beta_1, \dots, \alpha_n\beta_n)$.

Proof. Choose $k \in \{1, \dots, n\}$. We apply the Young inequality (3.3) to $|\alpha_k|$ and $|\beta_k|$. It follows that
$$|\alpha_k \beta_k| \le \frac{|\alpha_k|^p}{p} + \frac{|\beta_k|^q}{q}. \tag{3.6}$$
Summing up (3.6) for $k = 1, 2, \dots, n$, we get
$$\sum |\alpha_k \beta_k| \le \frac{1}{p} \sum |\alpha_k|^p + \frac{1}{q} \sum |\beta_k|^q = \frac{1}{p} + \frac{1}{q} = 1. \qquad □$$

Theorem 3.2. (Hölder inequality for $p > 1$ and $q > 1$) Suppose $p > 1$, $1/p + 1/q = 1$, and $a = (\alpha_1, \dots, \alpha_n)$, $b = (\beta_1, \dots, \beta_n)$ are two finite sequences satisfying $M_p(a) > 0$ and $M_q(b) > 0$. Then
$$M(ab) = \sum |\alpha_k \beta_k| \le M_p(a) M_q(b). \tag{3.7}$$

Proof. Define
$$\bar\alpha_k = \frac{\alpha_k}{M_p(a)}, \quad \bar\beta_k = \frac{\beta_k}{M_q(b)}, \quad \bar a = (\bar\alpha_1, \dots, \bar\alpha_n), \quad \bar b = (\bar\beta_1, \dots, \bar\beta_n).$$
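We remark that $M_p(\bar a) = M_q(\bar b) = 1$, and therefore we can apply proposition 3.1. Thus we find that $M(\bar a \bar b) \le 1$, i.e., (3.7). □

Theorem 3.3. Inequality (3.7) turns into an equality if and only if the fraction $|\alpha_k|^p / |\beta_k|^q$ does not depend upon $k$ (the fraction $0/0$ is excluded).

Theorem 3.4. (Hölder inequality for positive weights, [21, Part 2, Chapter 2, problem 81.3]) Consider $m \in \mathbb{N}^*$, $m \ge 2$, $a_j = (\alpha_{j1}, \dots, \alpha_{jn})$, $j = 1, \dots, m$, and $p_1, p_2, \dots, p_m > 0$ so that $\sum 1/p_j = 1$. Suppose that $M_{p_j}(a_j) > 0$, $j = 1, \dots, m$. Then
$$\sum_{k=1}^{n} \left| \prod_{j=1}^{m} \alpha_{jk} \right| \le \prod_{j=1}^{m} M_{p_j}(a_j). \tag{3.8}$$

Proof. If $m = 2$, theorem 3.4 reduces to theorem 3.2. Suppose that $m \ge 3$; we prove (3.8) by induction. Assume that (3.8) is true for $m - 1$ and prove it for $m$. Using theorem 3.2 we have
$$\sum_{k=1}^{n} \left| \prod_{j=1}^{m} \alpha_{jk} \right| = \sum_{k=1}^{n} |\alpha_{1k}| \left| \prod_{j=2}^{m} \alpha_{jk} \right| \le M_{p_1}(a_1) \left( \sum_{k=1}^{n} \left| \prod_{j=2}^{m} \alpha_{jk} \right|^{\frac{p_1}{p_1 - 1}} \right)^{\frac{p_1 - 1}{p_1}}$$
$$= M_{p_1}(a_1) \left[ \sum_{k=1}^{n} \prod_{j=2}^{m} |\alpha_{jk}|^{\frac{p_1}{p_1 - 1}} \right]^{\frac{p_1 - 1}{p_1}}. \tag{3.9}$$
We remark that
$$\frac{p_j (p_1 - 1)}{p_1} > 0, \quad j = 2, \dots, m, \quad \text{and} \quad \sum_{j=2}^{m} \frac{p_1}{p_j (p_1 - 1)} = \frac{p_1}{p_1 - 1} \sum_{j=2}^{m} \frac{1}{p_j} = 1.$$
Thus, by the induction hypothesis applied with the weights $p_j(p_1 - 1)/p_1$, (3.9) is further evaluated as
$$\le M_{p_1}(a_1) \left[ \prod_{j=2}^{m} \left( \sum_{k=1}^{n} |\alpha_{jk}|^{\frac{p_1}{p_1 - 1} \cdot \frac{p_j (p_1 - 1)}{p_1}} \right)^{\frac{p_1}{p_j (p_1 - 1)}} \right]^{\frac{p_1 - 1}{p_1}} = M_{p_1}(a_1) \prod_{j=2}^{m} \left( \sum_{k=1}^{n} |\alpha_{jk}|^{p_j} \right)^{\frac{1}{p_j}} = \prod_{j=1}^{m} M_{p_j}(a_j). \tag{3.10}$$
□

Theorem 3.5. (Hölder inequality for $0 < p < 1$) Consider $0 < p < 1$, $1/p + 1/q = 1$, and two finite sequences of positive numbers $a = (\alpha_1, \dots, \alpha_n)$ and $b = (\beta_1, \dots, \beta_n)$. Then
$$M(ab) \ge M_p(a) M_q(b). \tag{3.11}$$

Proof. Take $u = 1/p$ ($> 1$) and $v$ such that $1/u + 1/v = 1$. We define $\gamma_k = \beta_k^{-1/u}$, $\delta_k = \beta_k^{1/u} \alpha_k^{1/u}$, $k = 1, \dots, n$. So $\gamma_k^v = \beta_k^{-v/u} = \beta_k^q$. Using theorem 3.2, we get
$$\sum_{k=1}^{n} \alpha_k^p = \sum_{k=1}^{n} \gamma_k \delta_k \le \left( \sum \delta_k^u \right)^{1/u} \left( \sum \gamma_k^v \right)^{1/v} = \left( \sum \alpha_k \beta_k \right)^p \left( \sum \beta_k^{-v/u} \right)^{1/v} = \left( \sum \alpha_k \beta_k \right)^p \left( \sum \beta_k^q \right)^{1/v}.$$
Raising both sides to the power $1/p$ and using $1/(pv) = -1/q$, we obtain $M_p(a) M_q(b) \le M(ab)$. □

Theorem 3.6. (Hölder inequality for negative weights, [4]) Consider $m \in \mathbb{N}^*$, $m \ge 2$, finite and nonzero sequences $a_j = (\alpha_{j1}, \dots, \alpha_{jn})$, $j = 1, \dots, m$, and the weights $p_1, p_2, \dots, p_{m-1} < 0$ and $p_m \in \; ]0, 1[$ satisfying $\sum_{j=1}^{m} 1/p_j = 1$.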
Then
$$\sum_{k=1}^{n} \left| \prod_{j=1}^{m} \alpha_{jk} \right| \ge \prod_{j=1}^{m} M_{p_j}(a_j). \tag{3.12}$$

Proof. If $m = 2$, this theorem reduces to theorem 3.5. Suppose that (3.12) holds for an $m \ge 2$. We show that it holds for $m + 1$. Therefore consider $p_1, p_2, \dots, p_m < 0$ and also $p_{m+1} \in \mathbb{R}$ such that $\sum_{j=1}^{m+1} 1/p_j = 1$, where the $a_j = (\alpha_{j1}, \dots, \alpha_{jn})$ are nonzero for all $j = 1, \dots, m + 1$. Thus $0 < p_{m+1} < 1$. Hence, by theorem 3.5,
$$\sum_{k=1}^{n} \left| \prod_{j=1}^{m+1} \alpha_{jk} \right| = \sum_{k=1}^{n} |\alpha_{1k}| \left| \prod_{j=2}^{m+1} \alpha_{jk} \right| \ge M_{p_1}(a_1) \left( \sum_{k=1}^{n} \left| \prod_{j=2}^{m+1} \alpha_{jk} \right|^{\frac{p_1}{p_1 - 1}} \right)^{\frac{p_1 - 1}{p_1}}$$
$$= M_{p_1}(a_1) \left[ \sum_{k=1}^{n} \prod_{j=2}^{m+1} |\alpha_{jk}|^{\frac{p_1}{p_1 - 1}} \right]^{\frac{p_1 - 1}{p_1}}. \tag{3.13}$$
We remark that
$$\frac{p_j (p_1 - 1)}{p_1} < 0, \quad j = 2, \dots, m, \qquad \frac{p_{m+1} (p_1 - 1)}{p_1} > 0, \qquad \text{and} \qquad \sum_{j=2}^{m+1} \frac{p_1}{p_j (p_1 - 1)} = \frac{p_1}{p_1 - 1} \sum_{j=2}^{m+1} \frac{1}{p_j} = 1.$$
Then (3.13) is further evaluated as
$$\ge M_{p_1}(a_1) \left[ \prod_{j=2}^{m+1} \left( \sum_{k=1}^{n} |\alpha_{jk}|^{\frac{p_1}{p_1 - 1} \cdot \frac{p_j (p_1 - 1)}{p_1}} \right)^{\frac{p_1}{p_j (p_1 - 1)}} \right]^{\frac{p_1 - 1}{p_1}} = M_{p_1}(a_1) \prod_{j=2}^{m+1} \left( \sum_{k=1}^{n} |\alpha_{jk}|^{p_j} \right)^{\frac{1}{p_j}} = \prod_{j=1}^{m+1} M_{p_j}(a_j). \qquad □$$

Theorem 3.7. (Minkowski⁸ inequality for $p \ge 1$) Suppose the assumptions of theorem 3.2 are satisfied. Then
$$M_p(a + b) \le M_p(a) + M_p(b). \tag{3.14}$$

Proof. For $p = 1$ the above inequality follows from the inequality given in (2) (a) of proposition 2.2. Suppose that $p > 1$. We apply the Hölder inequality (3.7) to the following two pairs of finite sequences: $(a_k)_{k=1}^{n}$, $(|a_k + b_k|^{p-1})_{k=1}^{n}$ and $(b_k)_{k=1}^{n}$, $(|a_k + b_k|^{p-1})_{k=1}^{n}$. Then we have the following estimations
$$\sum |a_k + b_k|^p \le \sum |a_k| \cdot |a_k + b_k|^{p-1} + \sum |b_k| \cdot |a_k + b_k|^{p-1} \le \left( M_p(a) + M_p(b) \right) \left( \sum |a_k + b_k|^p \right)^{1/q}.$$
Now, dividing both sides by $\left( \sum |a_k + b_k|^p \right)^{1/q}$, we get inequality (3.14). □

⁸ Hermann Minkowski, 1864-1909
⁹ Bernoulli

Proposition 3.2. (Bernoulli⁹ inequality) For every $n \in \mathbb{N}^*$ and every $x \ge -1$ it holds
$$(1 + x)^n \ge 1 + nx. \tag{3.15}$$

Proof. For $x = -1$ the left-hand side of (3.15) is null while the right-hand side is nonpositive. So we may suppose that $x > -1$. For $x > -1$ we prove (3.15) by induction. If $n = 1$, we have an equality. If $n = 2$, then $(1 + x)^2 = 1 + 2x + x^2 \ge 1 + 2x$, so the inequality holds.
Suppose now that inequality (3.15) holds for a $k \in \mathbb{N}^*$ and for every $x > -1$, that is
$$(1 + x)^k \ge 1 + kx. \tag{3.16}$$
We will prove that (3.15) holds for $k + 1$ and every $x > -1$, too. Since $x > -1$, $1 + x > 0$. We multiply (3.16) by $1 + x > 0$ getting
$$(1 + x)^{k+1} \ge 1 + kx + x + kx^2 = 1 + (k + 1)x + kx^2 \ge 1 + (k + 1)x.$$
So, (3.15) holds for $k + 1$ and every $x > -1$. Thus we conclude that inequality (3.15) holds for every $n \in \mathbb{N}^*$ and every $x \ge -1$. □

Proposition 3.3. (Bernoulli inequality) Consider $n$ real numbers $x_i > -1$, $i = 1, 2, \dots, n$, such that all of them have the same sign. Then
$$(1 + x_1)(1 + x_2) \cdots (1 + x_n) \ge 1 + x_1 + x_2 + \cdots + x_n. \tag{3.17}$$

Proof. Similar to the one supplied for the previous proposition. □

Proposition 3.4. (Mean inequality) Let $x_1, x_2, \dots, x_m$ be positive reals. Then the geometric mean is less than or equal to the arithmetic mean, i.e.
$$\sqrt[m]{x_1 x_2 \cdots x_m} \le \frac{x_1 + x_2 + \cdots + x_m}{m}. \tag{3.18}$$

Corollary 3.2. Let $x_1, x_2, \dots, x_m$ be positive reals. Then the harmonic mean is less than or equal to the geometric mean, i.e.
$$\frac{m}{\frac{1}{x_1} + \frac{1}{x_2} + \cdots + \frac{1}{x_m}} \le \sqrt[m]{x_1 x_2 \cdots x_m}. \tag{3.19}$$

Proof. Substitute $x_i \to 1/x_i$ in (3.18). □

Proof of proposition 3.4. Cauchy's¹⁰ approach. First we prove the mean inequality introducing an extra assumption, namely $m = 2^k$. Later on we will remove this assumption. If $k = 1$, the mean inequality is known. For $k = 2$ we proceed as follows
$$\sqrt[4]{x_1 x_2 x_3 x_4} = \sqrt{\sqrt{x_1 x_2}\,\sqrt{x_3 x_4}} \le \frac{\sqrt{x_1 x_2} + \sqrt{x_3 x_4}}{2} \le \frac{\frac{x_1 + x_2}{2} + \frac{x_3 + x_4}{2}}{2} = \frac{x_1 + x_2 + x_3 + x_4}{4}.$$

¹⁰ Augustin Louis Cauchy, 1789-1857

Now, suppose inequality (3.18) holds for $m = 2^k$, i.e.,
$$\sqrt[2^k]{x_1 x_2 \cdots x_{2^k}} \le \frac{x_1 + x_2 + \cdots + x_{2^k}}{2^k}. \tag{3.20}$$
Then
$$\sqrt[2^{k+1}]{x_1 x_2 \cdots x_{2^{k+1}}} = \sqrt{\sqrt[2^k]{x_1 \cdots x_{2^k}}\;\sqrt[2^k]{x_{2^k + 1} \cdots x_{2^{k+1}}}} \le \frac{\sqrt[2^k]{x_1 \cdots x_{2^k}} + \sqrt[2^k]{x_{2^k + 1} \cdots x_{2^{k+1}}}}{2}.$$
Now using (3.20), we get that (3.18) holds for any $m$ equal to a power of 2. It remains to treat the case of unrestricted $m$. Suppose that $m$ is not a power of 2.
Take a $k$ such that $m < 2^k$ and denote
$$l := \frac{x_1 + x_2 + \cdots + x_m}{m} \quad (> 0).$$
Take $x_{m+1} = \cdots = x_{2^k} := l$ and consider inequality (3.20). Then we write
$$\sqrt[2^k]{x_1 x_2 \cdots x_m \cdot l^{2^k - m}} \le \frac{ml + (2^k - m)l}{2^k} = l \iff x_1 x_2 \cdots x_m \cdot l^{2^k - m} \le l^{2^k} \iff x_1 \cdots x_m \le l^m. \qquad □$$

Second approach. This approach consists in considering a special case of (3.18) which is, actually, equivalent to it. This special case reads as
$$x_1, \dots, x_m > 0, \quad x_1 \cdots x_m = 1 \implies x_1 + x_2 + \cdots + x_m \ge m. \tag{3.21}$$
If $m = 1$ or $m = 2$, (3.21) is trivial. Suppose that (3.21) holds for $m = n$, i.e.,
$$x_1, \dots, x_n > 0, \quad x_1 \cdots x_n = 1 \implies x_1 + x_2 + \cdots + x_n \ge n. \tag{3.22}$$
Suppose also that there are given $n + 1$ positive real numbers satisfying $x_1 x_2 \cdots x_n x_{n+1} = 1$. We prove that their sum is at least $n + 1$. Either all $x_i$'s are equal to 1, and then the conclusion follows at once, or there are two numbers, one greater than 1, the other less than 1. We suppose that $x_n > 1$ and $x_{n+1} < 1$. Now we consider $n$ positive numbers, namely
$$x_1,\ x_2,\ \dots,\ x_{n-1},\ x_n \cdot x_{n+1}.$$
Based on the induction hypothesis (3.22), we get the following lower estimation
$$x_1 + x_2 + \cdots + x_{n-1} + x_n x_{n+1} \ge n.$$
Then we continue the estimation
$$x_1 + x_2 + \cdots + x_{n+1} \ge n - x_n x_{n+1} + x_n + x_{n+1} = n + 1 + (1 - x_{n+1})(x_n - 1) \ge n + 1.$$
Thus we infer that implication (3.21) holds for any $m \in \mathbb{N}^*$. Hence the mean inequality follows, too. □

Proposition 3.5. (Lagrange¹¹ identity) Let $a_i$ and $b_i$ be real numbers, $i = 1, \dots, m$. Then
$$\left( \sum_{i=1}^{m} a_i^2 \right) \left( \sum_{i=1}^{m} b_i^2 \right) = \left( \sum_{i=1}^{m} a_i b_i \right)^2 + \sum_{1 \le i < j \le m} (a_i b_j - a_j b_i)^2. \tag{3.23}$$

Proof. It follows at once by induction. □

Corollary 3.3. Let $a_i$ and $b_i$ be real numbers, $i = 1, \dots, m$. Then
$$\left( \sum_{i=1}^{m} a_i^2 \right) \left( \sum_{i=1}^{m} b_i^2 \right) \ge \left( \sum_{i=1}^{m} a_i b_i \right)^2. \tag{3.24}$$

Corollary 3.4. Let $a_i$ and $b_i$ be real numbers, $i = 1, \dots, m$. Then
$$\left( \sum_{i=1}^{m} a_i^2 \right) \left( \sum_{i=1}^{m} b_i^2 \right) = \left( \sum_{i=1}^{m} a_i b_i \right)^2$$
if and only if $a_i b_j - a_j b_i = 0$ for all $1 \le i < j \le m$.

Corollary 3.5. Let $a_i$ be real numbers, $i = 1, \dots, m$.
Then
$$\frac{a_1 + a_2 + \cdots + a_m}{m} \le \sqrt{\frac{a_1^2 + a_2^2 + \cdots + a_m^2}{m}}.$$

Proof. Take $b_i = 1$, $i = 1, \dots, m$, in inequality (3.24). □

¹¹ Joseph Louis Lagrange, 1736-1813

Chapter 2 Basic notions in topology

This chapter is dedicated to introducing basic notions and results concerning, mainly, metric spaces.

2.1 Metric spaces

A set $X$, whose elements are called points, is said to be a metric space if with any two points $x$ and $y$ of $X$ there is associated a real number $\rho(x, y)$, called the distance from $x$ to $y$, such that
(a) $\rho(x, y) > 0$ if $x \ne y$; $\rho(x, x) = 0$;
(b) $\rho(x, y) = \rho(y, x)$;
(c) $\rho(x, y) \le \rho(x, z) + \rho(z, y)$, for any $z \in X$.

The distance function $\rho$ is also called a metric on $X$. From (a) we have that the distance $\rho$ is non-negative. From (b) we have that the distance from $x$ to $y$ is the same as the distance from $y$ to $x$. So, the distance function is symmetric. (c) is called the triangle inequality.

Examples. (a) Consider a nonempty set $X$ and define on it the following metric
$$\rho(x, y) = \begin{cases} 0, & x = y, \\ 1, & x \ne y. \end{cases} \tag{1.1}$$
Thus we notice that on a nonempty set we can define at least one metric.
(b) We recall the definition of the distance function given on $\mathbb{R}$ at page 11. It follows that $(\mathbb{R}, \rho)$, where $\rho(x, y) = |x - y|$, is a metric space. This metric is called the Euclidean metric or the Euclidean distance on $\mathbb{R}$.
(c) Consider the complex plane $\mathbb{C}$ and define on it the following distance function
$$\rho(z_1, z_2) = |z_1 - z_2| = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}, \tag{1.2}$$
where $z_k = x_k + i y_k$, $x_k, y_k \in \mathbb{R}$, $k = 1, 2$, $i^2 = -1$, called the Euclidean distance on $\mathbb{C}$.
(d) Consider the plane $\mathbb{R}^2$ and define on it the following metrics, for $u_k = (x_k, y_k) \in \mathbb{R}^2$, $k = 1, 2$,
($d_1$) $\rho_1(u_1, u_2) = |x_1 - x_2| + |y_1 - y_2|$;
($d_2$) $\rho_2(u_1, u_2) = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$;
($d_3$) $\rho_3(u_1, u_2) = \max\{|x_1 - x_2|, |y_1 - y_2|\}$.
Metric $\rho_2$ is said to be the Euclidean metric on $\mathbb{R}^2$ while $\rho_3$ is said to be the uniform metric.
(e) From (d) it follows at once that on a given set we can define several metrics.
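The metrics of examples (a) and (d) are easy to evaluate numerically. The following Python sketch is only an illustration: the function names `rho_discrete`, `rho1`, `rho2`, `rho3`, the helper `triangle_inequality_holds`, and the sample points are chosen here and do not come from the text. It computes the three distances between two points of the plane and spot-checks the triangle inequality (c) on a few sample points; such a check is of course no substitute for a proof.

```python
import itertools
import math


def rho_discrete(x, y):
    """Discrete metric (1.1): 0 if x equals y, 1 otherwise."""
    return 0 if x == y else 1


def rho1(u1, u2):
    """Metric (d1): sum of the absolute coordinate differences."""
    return abs(u1[0] - u2[0]) + abs(u1[1] - u2[1])


def rho2(u1, u2):
    """Metric (d2): Euclidean distance in the plane."""
    return math.hypot(u1[0] - u2[0], u1[1] - u2[1])


def rho3(u1, u2):
    """Metric (d3): uniform metric, the largest absolute coordinate difference."""
    return max(abs(u1[0] - u2[0]), abs(u1[1] - u2[1]))


def triangle_inequality_holds(rho, points, tol=1e-12):
    """Spot-check rho(x, y) <= rho(x, z) + rho(z, y) on all triples of sample points.

    The small tolerance `tol` (an assumption of this sketch) absorbs rounding errors.
    """
    return all(
        rho(x, y) <= rho(x, z) + rho(z, y) + tol
        for x, y, z in itertools.product(points, repeat=3)
    )


if __name__ == "__main__":
    u1, u2 = (0.0, 0.0), (3.0, 4.0)
    print(rho1(u1, u2), rho2(u1, u2), rho3(u1, u2))  # 7.0 5.0 4.0

    samples = [(0.0, 0.0), (3.0, 4.0), (-1.0, 2.5), (2.0, -3.0)]
    for rho in (rho_discrete, rho1, rho2, rho3):
        print(rho.__name__, triangle_inequality_holds(rho, samples))
```

For the same pair of points the three values differ, which illustrates point (e); in general $\rho_3 \le \rho_2 \le \rho_1$ on $\mathbb{R}^2$.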
△

Let $(X, \rho)$ be a metric space. All points and sets mentioned below are understood to be elements and subsets of $X$.

The open ball with center at $x$ and radius $r > 0$ is the set $B(x, r)$ given by $B(x, r) = \{y \in X \mid \rho(x, y) < r\}$.
A neighborhood of a point $x$ is a set $V$ such that there exists a ball $B(x, r)$ with $B(x, r) \subset V$. We denote by $\mathcal{V}(x)$ the family of neighborhoods of $x$, i.e., $\mathcal{V}(x) = \{V \mid V \text{ is a neighborhood of } x\}$.
A point $x$ is a limit point of the set $A$ if every neighborhood of $x$ contains a point $y \ne x$ such that $y \in A$.
If $x \in A$ and $x$ is not a limit point of $A$, $x$ is said to be an isolated point of $A$.
$A$ is closed if every limit point of $A$ is a point of $A$.
A point $x$ is an interior point of $A$ if there is a neighborhood $V$ of $x$ such that $V \subset A$.
$A$ is open if every point of $A$ is an interior point of $A$. The empty set is open.
$A$ is perfect if $A$ is closed and if every point of $A$ is a limit point of $A$.
$A$ is bounded if there is a real number $m$ such that $\rho(x, y) \le m$, for any $x, y \in A$.
$A$ is dense if every point of $X$ is a limit point of $A$ or a point of $A$ (or both).

The system of open sets corresponding to a metric space $(X, \rho)$ is denoted by $\tau$ and is called the topology generated by the metric $\rho$. The pair $(X, \tau)$ is said to be a topological space. Sometimes we say that $X$ is a topological space when the topology is irrelevant.

Remarks. (a) All these topological notions (openness, closedness, etc.) are based ultimately on the notion of metric. Thus, changing the metric, the open (closed, etc.) sets change. For example, the family of open sets corresponding to the metric on $\mathbb{R}$ given by (1.1) coincides with the family of all subsets of $\mathbb{R}$. Obviously, this is not the case if we consider the Euclidean metric on $\mathbb{R}$. Indeed, a set consisting of precisely one point of $\mathbb{R}$ is open in the first case whereas it is not in the second case.
(b) In our presentation we considered the metric as a primary notion.
However, it is possible, and actually widely done, to consider the topology on a set as a basic notion, [7]. △

Theorem 1.1. Every open ball is an open set.

Proof. Consider $A = B(x, r)$ and let $y$ be any point of $A$. Then there is a positive number $h$ such that $\rho(x, y) = r - h$. For all points $z$ such that $\rho(y, z) < h$ we have then
$$\rho(x, z) \le \rho(x, y) + \rho(y, z) < r - h + h = r,$$
so that $z \in A$. Thus $y$ is an interior point of $A$. □

Theorem 1.2. If $x$ is a limit point of a set $A$, then every neighborhood of $x$ contains infinitely many points of $A$.

Proof. Suppose there is a neighborhood $V$ of $x$ which contains only a finite number of points of $A$. Choose $r_0 > 0$ with $B(x, r_0) \subset V$, let $y_1, \dots, y_n$ be those points of $V \cap A$ which are distinct from $x$, and put
$$r = \min\{r_0, \rho(x, y_1), \dots, \rho(x, y_n)\}.$$
Note that $r > 0$. The ball $B(x, r) \subset V$ contains no point $y \in A$ such that $y \ne x$, so that $x$ is not a limit point of $A$. This contradiction establishes the theorem. □

Examples 1.1. Let us consider the following subsets of $\mathbb{R}^2$ (bijective to $\mathbb{C}$).
(a) The set of all complex $z$ such that $|z| < 1$;
(b) the set of all complex $z$ such that $|z| \le 1$;
(c) a finite set;
(d) the set of all integers;
(e) the set $A = \{1/n \mid n = 1, 2, \dots\}$. $A$ has a limit point (namely $x = 0$), but no point in $A$ is a limit point of $A$; we stress the difference between having a limit point and containing one;
(f) the set of all complex numbers;
(g) the interval $]0, 1[$.

Let us note that (d), (e) and (g) can be regarded also as subsets of $\mathbb{R}$. Some properties of these sets are tabulated below.

          (a)   (b)   (c)   (d)   (e)   (f)   (g)
closed    no    yes   yes   yes   no    yes   no
open      yes   no    no    no    no    yes   -
perfect   no    yes   no    no    no    yes   no
bounded   yes   yes   yes   no    yes   no    yes

For (g) the answer in the row "open" depends on the ambient space: $]0, 1[$ is open as a subset of $\mathbb{R}$ but not as a subset of $\mathbb{R}^2$. △

Theorem 1.3. A set $A$ is open if and only if $\complement_X A$ is closed.

Proof. First, suppose that $\complement_X A$ is closed. Choose $x \in A$. Then $x \notin \complement_X A$ and $x$ is not a limit point of $\complement_X A$. Hence there exists a neighborhood $V$ of $x$ such that $\complement_X A \cap V = \emptyset$, that is $V \subset A$. Thus $x$ is an interior point of $A$, and $A$ is open. Now, suppose $A$ is open.
Let $x$ be a limit point of $\complement_X A$. Then every neighborhood of $x$ contains a point of $\complement_X A$, so that $x$ is not an interior point of $A$. Since $A$ is open, this means that $x \in \complement_X A$. It follows that $\complement_X A$ is closed. □

Corollary 1.1. A set is closed if and only if its complement is open.

Theorem 1.4. (a) For any family $\{G_\alpha\}$ of open sets, $\bigcup_\alpha G_\alpha$ is open.
(b) For any family $\{F_\alpha\}$ of closed sets, $\bigcap_\alpha F_\alpha$ is closed.
(c) For any finite family $G_1, \dots, G_n$ of open sets, $\bigcap_{i=1}^{n} G_i$ is open.
(d) For any finite family $F_1, \dots, F_n$ of closed sets, $\bigcup_{i=1}^{n} F_i$ is closed.

Proof. (a) Put $G = \bigcup_\alpha G_\alpha$. If $x \in G$, then $x \in G_\alpha$ for some $\alpha$. Since $x$ is an interior point of $G_\alpha$, $x$ is also an interior point of $G$, and $G$ is open.
(b) $F_\alpha$ is closed $\iff$ $\complement F_\alpha$ open $\implies$ $\bigcup_\alpha \complement F_\alpha$ open $\iff$ $\complement\left(\bigcap_\alpha F_\alpha\right)$ open $\iff$ $\bigcap_\alpha F_\alpha$ closed.
(c) Put $H = \bigcap_{i=1}^{n} G_i$. For every $x \in H$ there exist balls $B_i$ centered at $x$ with radii $r_i$ such that $B_i \subset G_i$, $i = 1, \dots, n$. Put $r = \min_{1 \le i \le n} r_i$ and let $B$ be the ball centered at $x$ of radius $r$. Then $B \subset G_i$, $i = 1, \dots, n$, so that $B \subset H$ and $H$ is open.
(d) Similarly to (b). □

Warning. Hereafter the real number system is considered endowed with the topology generated by the Euclidean metric.

Example. In parts (c) and (d) of the preceding theorem, the finiteness of the families is essential. Indeed, let $G_n$ be the interval $]-1/n, 1/n[$, $n \in \mathbb{N}^*$. Then $G_n$ is an open subset of $\mathbb{R}$. Put $G = \bigcap_{n=1}^{\infty} G_n$. Thus $G$ consists of a single point (that is, $x = 0$) and is not an open set in $\mathbb{R}$. △

Theorem 1.5. Let $A$ be a closed set of real numbers which is bounded above. Let $y = \sup A$. Then $y \in A$.

Proof. Suppose $y \notin A$. For any $\varepsilon > 0$ there is a point $x \in A$ such that $y - \varepsilon < x \le y$, by theorem 2.7. Thus, every neighborhood of $y$ contains a point $x \in A$, and $x \ne y$, since $y \notin A$. It follows that $y$ is a limit point of $A$ which is not a point of $A$, so that $A$ is not closed. This contradicts the hypothesis. □

Suppose $A \subset Y \subset X$, where $(X, \rho)$ is a metric space.
To say that $A$ is an open subset of $X$ means that to each point $x \in A$ there is associated a positive number $r$ such that $\rho(x, y) < r$, $y \in X$ imply that $y \in A$. First remark that $(Y, \rho|_{Y \times Y})$ is a metric space, too. We say that $A$ is open relative to $Y$ if to each $x \in A$ there is associated an $r > 0$ such that $y \in A$ whenever $\rho(x, y) < r$ and $y \in Y$.

Example. Example 1.1 (g) showed that a set may be open relative to $Y$ without being an open subset of $X$. However, there is a simple relation between these concepts. △

Theorem 1.6. Suppose $Y \subset X$. A subset $A$ of $Y$ is open relative to $Y$ if and only if $A = Y \cap G$ for some open subset $G$ of $X$.

Proof. Suppose $A$ is open relative to $Y$. To each $x \in A$ there is a positive number $r_x$ such that the conditions $\rho(x, y) < r_x$ and $y \in Y$ imply $y \in A$. Let $V_x$ be the set of all $y \in X$ such that $\rho(x, y) < r_x$, and define $G = \bigcup_{x \in A} V_x$. Then $G$ is an open subset of $X$. Since $x \in V_x$ for all $x \in A$, it is clear that $A \subset G \cap Y$. By our choice of $V_x$ we have $V_x \cap Y \subset A$ for $x \in A$, so that $G \cap Y \subset A$. Thus $A = G \cap Y$, and one half of the theorem is proved.
Conversely, if $G$ is open in $X$ and $A = G \cap Y$, every $x \in A$ has a neighborhood $V_x \subset G$. Then $V_x \cap Y \subset A$, so that $A$ is open relative to $Y$. □

Theorem 1.7. Let $(X, \rho)$ be a metric space and let $E'$ be the set of limit points of a set $E$. Then $E'$ is closed.

Proof. By definition $E'$ is closed if and only if $(E')' \subset E'$. Take $y \in (E')'$. Then for every $\varepsilon > 0$ there is an $x \in E'$ such that $0 < \rho(x, y) < \varepsilon/2$. Since $x \in E'$ there is a $v \in E$ such that $\rho(x, v) < \rho(x, y)$. Hence $v \ne y$ and
$$0 < \rho(v, y) \le \rho(x, v) + \rho(x, y) < \varepsilon.$$
Since $v \in E$, this shows that $y$ is a limit point of $E$, i.e., $y \in E'$ and the proof is complete. □

Let $E$ be a set, let $E'$ be the set of limit points of $E$, and define $\overline{E} = E \cup E'$.

Theorem 1.8. (a) $\overline{E}$ is a closed set, and $\overline{E} \subset F$ if $E \subset F$ and $F$ is closed;
(b) moreover, $\overline{E} = \bigcap_{E \subset F,\ F \text{ closed}} F$;
(c) $E$ is closed if and only if $E = \overline{E}$.
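As a small worked illustration of the closure just defined, one can take the set from example 1.1 (e), regarded as a subset of the real line with the Euclidean metric. The computation below is a sketch added for orientation; it uses only the definitions of limit point and of the closure given above.

```latex
\[
E = \Bigl\{ \tfrac{1}{n} \;\Big|\; n \in \mathbb{N}^* \Bigr\}, \qquad
E' = \{0\}, \qquad
\overline{E} = E \cup E' = \{0\} \cup \Bigl\{ \tfrac{1}{n} \;\Big|\; n \in \mathbb{N}^* \Bigr\}.
\]
% 0 is a limit point of E: every ball B(0, r) contains 1/n as soon as n > 1/r.
% No other real number is a limit point of E, so E' = {0}.
% Since 0 does not belong to E, we get E \neq \overline{E}; hence E is not closed,
% in agreement with part (c) of theorem 1.8.
```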
A metric space is said to be separable if it contains a countable dense set.

Theorem 1.9. $\mathbb{R}$ is separable.

Proof. $\mathbb{Q}$ is a countable (corollary 2.5, p. 18) and dense (theorem 2.11, p. 15) subset of $\mathbb{R}$. □

2.2 Compact spaces

A covering of a set $A$ in a topological space $X$ is a family of subsets $\{G_\alpha\}$ of $X$ such that $A \subset \bigcup_\alpha G_\alpha$. An open covering is a covering consisting of open subsets. A subset $K$ of a topological space $X$ is said to be compact if every open covering of $K$ contains a finite subcovering of it. More explicitly, the requirement is that if $\{G_\alpha\}$ is an open covering of $K$, then there are finitely many indices $\alpha_1, \dots, \alpha_n$ such that
$$K \subset G_{\alpha_1} \cup \cdots \cup G_{\alpha_n}.$$

Example. Every finite set is compact. △

Remark. We observed earlier that if $A \subset Y \subset X$, then $A$ may be open relative to $Y$ without being open relative to $X$. The property of being open thus depends on the space in which $A$ is embedded. The same is true of the property of being closed. △

Theorem 2.1. Suppose $K \subset Y \subset X$. Then $K$ is compact relative to $X$ if and only if $K$ is compact relative to $Y$.

If $K$ is a compact set of a metric space $X$ and $K = X$, then we say that $X$ is a compact space.

Theorem 2.2. Every compact subset of a metric space is closed.

Proof. Let $K$ be a compact subset of a metric space $X$. We shall prove that $\complement_X K$ is open. If $K = X$, we are done. Suppose that $\complement_X K \ne \emptyset$. Choose $x \in X$, $x \notin K$. If $y \in K$, let $V_y$ and $W_y$ be balls centered at $x$ and $y$, respectively, of radius less than $\rho(x, y)/2$ ($> 0$, since $x \ne y$). Since $K$ is compact, there are finitely many points $y_1, \dots, y_n$ in $K$ such that
$$K \subset W_{y_1} \cup \cdots \cup W_{y_n} = W.$$
If $V = V_{y_1} \cap \cdots \cap V_{y_n}$, then $V$ is a neighborhood of $x$ which does not intersect $W$. Hence $V \subset \complement_X K$, so that $x$ is an interior point of $\complement_X K$. So, $\complement_X K$ is open and $K$ is closed. □

Theorem 2.3. Every closed subset of a compact set is compact.

Proof. Suppose $F \subset K \subset X$, $F$ is closed (relative to $X$), and $K$ is compact. Let $\{V_\alpha\}$ be an open covering of $F$. Then we have
$$F \subset K \subset \bigcup_\alpha V_\alpha \cup (X \setminus F).$$
Since $K$ is compact, finitely many sets of the family $\{V_\alpha\} \cup \{X \setminus F\}$ cover $K$, and hence $F$. Since $X \setminus F$ contains no point of $F$, it may be dropped from this finite family, and what remains is a finite subfamily of $\{V_\alpha\}$ that still covers $F$. We have thus shown that finitely many members of $\{V_\alpha\}$ cover $F$. □

Corollary 2.1. If $F$ is closed and $K$ is compact, then $K \cap F$ is compact.

Proof. By theorem 2.2 $K$ is closed, by theorem 1.4 $F \cap K$ is closed, and, finally, by theorem 2.3 $F \cap K$ is compact. □

Theorem 2.4. If $\{K_\alpha\}$ is a family of compact subsets of a metric space $X$ such that the intersection of every finite subfamily of $\{K_\alpha\}$ is nonempty, then $\bigcap K_\alpha$ is nonempty.

Proof. Fix a member $K_1$ of $\{K_\alpha\}$ and put $G_\alpha = \complement_X K_\alpha$. Assume that no point of $K_1$ belongs to every $K_\alpha$. Then the sets $G_\alpha$ form an open covering of $K_1$. Since $K_1$ is compact, there are finitely many indices $\alpha_1, \dots, \alpha_n$ such that $K_1 \subset G_{\alpha_1} \cup \cdots \cup G_{\alpha_n}$. This means that $K_1 \cap K_{\alpha_1} \cap \cdots \cap K_{\alpha_n} = \emptyset$, contrary to our assumption. □

Theorem 2.5. If $A$ is an infinite subset of a compact set $K$, then it has a limit point in $K$.

Proof. If no point of $K$ were a limit point of $A$, then each $x \in K$ would have a neighborhood $V_x$ which contains at most one point of $A$ (namely $x$, if $x \in A$). It is clear that no finite subfamily of $\{V_x\}$ can cover $A$; and the same is true of $K$, since $A \subset K$. This contradicts the compactness of $K$. □

Theorem 2.6. Every closed and bounded interval is compact.

Proof. Let $I = [a, b]$ be a closed and bounded interval. Denote $\delta = |a - b| = b - a$. Then $|x - y| \le \delta$, for any $x, y \in I$. Suppose there exists an open covering $\{G_\alpha\}$ of $I$ which contains no finite subcovering of $I$. Put $c = (a + b)/2$. The intervals $Q_1 = [a, c]$ and $Q_2 = [c, b]$ are two subintervals whose union is $I$. At least one of these sets, call it $I_1$, cannot be covered by any finite subfamily of $\{G_\alpha\}$ (otherwise $I$ could be so covered). We next subdivide $I_1$ and continue the process. We obtain a sequence $\{I_n\}$ with the following properties
(i) $I \supset I_1 \supset I_2 \supset \cdots$;
(ii) $I_n$ is not covered by any finite subfamily of $\{G_\alpha\}$;
(iii) if $x, y \in I_n$, then $|x - y| \le 2^{-n}\delta$.
By (i) and theorem 2.9 there is a point $z \in \bigcap I_n$. For some $\alpha$, $z \in G_\alpha$. Since $G_\alpha$ is open, there exists $r > 0$ such that $B(z, r) \subset G_\alpha$. If $n$ is so large that $2^{-n}\delta < r$, then (iii) implies that $I_n \subset G_\alpha$, which contradicts (ii). □

Theorem 2.7. (Heine¹-Borel²) If a set $A \subset \mathbb{R}$ has one of the following properties, then it has the other two
(a) $A$ is closed and bounded;
(b) $A$ is compact;
(c) every infinite subset of $A$ has a limit point in $A$.

Proof. (a) $\implies$ (b). There exists a closed and bounded interval $I$ such that $A \subset I$. Then (b) follows from theorems 2.6 and 2.3.
(b) $\implies$ (c). This is theorem 2.5.
(c) $\implies$ (a). If $A$ is not bounded, then $A$ contains points $x_n$ with $|x_n| > n$ ($n = 1, 2, \dots$). The set $P$ consisting of these points $x_n$ is infinite and clearly has no limit point in $\mathbb{R}$, hence has none in $A$. Thus (c) implies that $A$ is bounded.
If $A$ is not closed, then there is a point $x_0 \in \mathbb{R}$ which is a limit point of $A$ but not a point of $A$. For $n = 1, 2, 3, \dots$ there are points $x_n \in A$ such that $|x_n - x_0| < 1/n$. Let $M$ be the set of these points $x_n$. Then $M$ is infinite (otherwise, $|x_n - x_0|$ would have a constant positive value for infinitely many $n$), $M$ has $x_0$ as a limit point, and $M$ has no other limit point in $\mathbb{R}$. For if $y \in \mathbb{R}$, $y \ne x_0$, then
$$|x_n - y| \ge |x_0 - y| - |x_n - x_0| \ge |x_0 - y| - \frac{1}{n} \ge \frac{1}{2} |x_0 - y|$$
for all but finitely many $n$; this shows that $y$ is not a limit point of $M$ (theorem 1.2). Thus $M$ has no limit point in $A$; hence $A$ must be closed if (c) holds. □

Remark. (b) $\iff$ (c) in any metric space. (a) does not, in general, imply (b) and (c). △

Theorem 2.8. (Weierstrass³) Every bounded infinite subset of $\mathbb{R}$ has a limit point in $\mathbb{R}$.

¹ Heine
² Emile Borel, 1871-1956
³ Karl Weierstrass, 1815-1897

Proof. Being bounded, the set $A$ in question is a subset of an interval $[a, b] = I \subset \mathbb{R}$. By theorem 2.6 $I$ is compact, and so $A$ has a limit point in $I$, by theorem 2.5.
□

Perfect sets have been defined at page 28.

Theorem 2.9. Let $P$ be a nonempty perfect set in $\mathbb{R}$. Then $P$ is uncountable.

Proof. Since $P$ has limit points, $P$ must be infinite. Suppose $P$ is countable, and denote the points of $P$ by $x_1, x_2, \dots$. We shall construct a sequence $(B_n)$ of balls, as follows. Let $B_1$ be any ball with center $x_1$. If $B_1 = B(x_1, r_1)$, the corresponding closed ball $\overline{B}_1$ is defined to be the set of all $x \in \mathbb{R}$ such that $|x - x_1| \le r_1$. (The complement of $\overline{B}_1$ is open, hence closed balls are closed.) Suppose $B_n$ has been constructed, so that $B_n \cap P \ne \emptyset$. Since every point of $P$ is a limit point of $P$, there is a ball $B_{n+1}$ such that

(i) $\overline{B}_{n+1} \subset B_n$;
(ii) $x_n \notin \overline{B}_{n+1}$;
(iii) $B_{n+1} \cap P \ne \emptyset$.

By (iii) $B_{n+1}$ satisfies our induction hypothesis, and the construction can proceed. Put $K_n = \overline{B}_n \cap P$. Since $\overline{B}_n$ is closed and bounded, $\overline{B}_n$ is compact, and so is its closed subset $K_n$. Since $x_n \notin K_{n+1}$, no point of $P$ lies in $\bigcap_{n=1}^{\infty} K_n$. Since $K_n \subset P$, this implies that $\bigcap K_n$ is empty. But each $K_n$ is nonempty, by (iii), and $K_n \supset K_{n+1}$, by (i); this contradicts theorem 2.4. □

Corollary 2.2. Every interval $[a, b]$ ($a < b$) is uncountable.

Remark. We introduce the Cantor set, showing that there exist perfect sets in $\mathbb{R}$ which contain no interval. Let $C_0$ be the interval $[0, 1]$. Remove the open interval $]1/3, 2/3[$ and let $C_1$ be the union of the intervals $[0, 1/3] \cup [2/3, 1]$. Remove the middle thirds of these intervals and let $C_2$ be the union of the intervals $[0, 1/9] \cup [2/9, 3/9] \cup [6/9, 7/9] \cup [8/9, 1]$. Continuing in this way we obtain a sequence of compact sets $C_n$ such that

(i) $C_1 \supset C_2 \supset \dots$;
(ii) $C_n$ is the union of $2^n$ intervals of length $3^{-n}$.

The set $C = \bigcap_{n=1}^{\infty} C_n$ is called the Cantor set. $C$ is clearly compact, and theorem 2.4 shows that $C$ is not empty. No interval of the form
\[ \left] \frac{3k + 1}{3^m}, \frac{3k + 2}{3^m} \right[, \tag{2.1} \]
where $k$ and $m$ are positive integers, has a point in common with $C$. Since every interval $]\alpha, \beta[$ contains a segment of the form (2.1) if $3^{-m} < (\beta - \alpha)/6$, $C$ contains no segment.
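The construction of the sets $C_n$ can be traced with exact rational arithmetic. The following is only a minimal sketch: the function names `cantor_level` and `open_segment_meets` are hypothetical helpers introduced for this illustration, not part of the notes. It lists the intervals of $C_2$ and checks that one segment of the form (2.1), namely $]4/9, 5/9[$ (taking $k = 1$, $m = 2$), misses $C_2$ and hence $C$.

```python
from fractions import Fraction


def cantor_level(n):
    """Return the closed intervals [a, b] whose union is C_n (C_0 = [0, 1])."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        refined = []
        for a, b in intervals:
            third = (b - a) / 3
            # Drop the open middle third ]a + third, b - third[ of [a, b].
            refined.append((a, a + third))
            refined.append((b - third, b))
        intervals = refined
    return intervals


def open_segment_meets(intervals, lo, hi):
    """True if the open segment ]lo, hi[ meets some closed interval [a, b]."""
    return any(a < hi and lo < b for a, b in intervals)


if __name__ == "__main__":
    for a, b in cantor_level(2):
        print(f"[{a}, {b}]")
    # The segment ]4/9, 5/9[ has the form (2.1) with k = 1, m = 2.
    print(open_segment_meets(cantor_level(2), Fraction(4, 9), Fraction(5, 9)))
```

Using `Fraction` instead of floating point keeps the end points exact, so the count $2^n$ and the common length $3^{-n}$ of property (ii) can be verified without rounding error.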
To show that $C$ is perfect, it is enough to show that $C$ contains no isolated point. Let $x \in C$, and let $I$ be any open interval containing $x$. Let $I_n$ be that interval of $C_n$ which contains $x$. Choose $n$ large enough, so that $I_n \subset I$. Let $x_n$ be an end point of $I_n$ such that $x_n \ne x$. It follows from the construction of $C$ that $x_n \in C$. Hence $x$ is a limit point of $C$, and $C$ is perfect. △

Chapter 3
Numerical sequences and series

The present chapter is devoted to introducing several results on numerical sequences and series.

3.1 Numerical sequences

3.1.1 Convergent sequences

A sequence $(x_n)_n$ in a metric space $(X, \rho)$ is said to converge if there is a point $x \in X$ with the following property: for every $\varepsilon > 0$ there is an integer $n_\varepsilon$ such that $n \ge n_\varepsilon$ implies $\rho(x_n, x) < \varepsilon$. In this case we also say that $(x_n)$ converges to $x$, or that $x$ is the limit of $(x_n)$, and we write $x_n \to x$, or
\[ \lim_{n \to \infty} x_n = \lim x_n = x. \]
If $(x_n)$ does not converge, it is said to diverge.

Remarks. (a) It might be well to point out that our definition of "convergent sequence" depends not only on $(x_n)_n$ but also on $X$; for instance, the sequence $(x_n)$, $x_n = 1/n$, converges in $\mathbb{R}$ to 0, but fails to converge in the set $]0, \infty[$ (with $\rho(x, y) = |x - y|$).

(b) For a real sequence we may say that
\[ \lim_{n \to \infty} x_n = a \iff \bigcap_{\varepsilon > 0} \bigcup_{m > 0} \bigcap_{n \ge m} \, ]x_n - \varepsilon, x_n + \varepsilon[ \; = \{a\}. \]
△

A sequence $(x_n)_n$ in a metric space $(X, \rho)$ is said to be bounded if the set $\{x_n \mid n \in \mathbb{N}^*\}$ is bounded (for bounded sets see page 28).

Theorem 1.1. Let $(x_n)$ be a sequence in a metric space $(X, \rho)$.

(a) $(x_n)$ converges to $x \in X$ if and only if every neighborhood of $x$ contains all but finitely many of the terms of $(x_n)$;
(b) if $x, y \in X$ and $(x_n)$ converges to $x$ and to $y$, then $x = y$;
(c) if $(x_n)$ converges, then $\{x_n\}$ is bounded;
(d) if $A \subset X$ and if $x$ is a limit point of $A$, then there is a sequence $(x_n)$ in $A$ such that $\lim x_n = x$.

Proof. (a) Suppose $\lim x_n = x$ and let $V \in \mathcal{V}(x)$. For some $\varepsilon > 0$, the conditions $\rho(y, x) < \varepsilon$, $y \in X$ imply $y \in V$. Corresponding to this $\varepsilon$, there exists $n_\varepsilon$ such that $n \ge n_\varepsilon$ implies $\rho(x_n, x) < \varepsilon$. Thus $n \ge n_\varepsilon$ implies $x_n \in V$. Conversely, suppose every neighborhood of $x$ contains all but finitely many of the $x_n$. Fix $\varepsilon > 0$ and take $V = B(x, \varepsilon)$. There exists $n_\varepsilon$ such that $x_n \in V$ for $n \ge n_\varepsilon$, that is, $\rho(x_n, x) < \varepsilon$ for $n \ge n_\varepsilon$; hence $x_n \to x$.

(b) Let $\varepsilon > 0$ be given. There exist integers $n_\varepsilon$ and $m_\varepsilon$ such that
\[ n \ge n_\varepsilon \implies \rho(x_n, x) < \varepsilon/2, \qquad n \ge m_\varepsilon \implies \rho(x_n, y) < \varepsilon/2. \]
Hence if $n \ge \max\{n_\varepsilon, m_\varepsilon\}$, we have
\[ 0 \le \rho(x, y) \le \rho(x_n, x) + \rho(x_n, y) < \varepsilon. \]
Since $\varepsilon$ has been chosen arbitrarily, we conclude that $\rho(x, y) = 0$.

(c) Suppose $\lim x_n = x$. There is an integer $m$ such that $n > m$ implies $\rho(x_n, x) < 1$. Put
\[ r = \max\{1, \rho(x_1, x), \rho(x_2, x), \dots, \rho(x_m, x)\}. \]
Then $\rho(x_n, x) \le r$ for $n = 1, 2, 3, \dots$.

(d) For each positive integer $n$, there is a point $x_n \in A$ such that $\rho(x_n, x) < 1/n$. Given $\varepsilon > 0$, choose $n_\varepsilon$ so that $\varepsilon n_\varepsilon > 1$. If $n > n_\varepsilon$, it follows that $\rho(x_n, x) < \varepsilon$. Hence $x_n \to x$. □

Theorem 1.2. Suppose $(x_n)$, $(y_n)$ are real sequences, and $\lim x_n = x$, $\lim y_n = y$. Then

(a) $\lim (x_n + y_n) = x + y$;
(b) $\lim c \cdot x_n = cx$ and $\lim (c + x_n) = c + x$, for any real $c$;
(c) $\lim x_n y_n = xy$;
(d) $\lim \dfrac{1}{x_n} = \dfrac{1}{x}$, provided $x_n \ne 0$ ($n = 1, 2, \dots$) and $x \ne 0$.

Proof. (a) Given $\varepsilon > 0$ there exist $n_\varepsilon$ and $m_\varepsilon$ such that
\[ n \ge n_\varepsilon \implies |x_n - x| < \varepsilon/2, \qquad n \ge m_\varepsilon \implies |y_n - y| < \varepsilon/2. \]
Put $N_\varepsilon = \max\{n_\varepsilon, m_\varepsilon\}$. If $n \ge N_\varepsilon$, then
\[ |(x_n + y_n) - (x + y)| \le |x_n - x| + |y_n - y| < \varepsilon/2 + \varepsilon/2 = \varepsilon. \]

(b) The first claim follows from (c), while the second claim follows from (a), applied in both cases with the constant sequence $(c)$.

(c) We write
\[ |x_n y_n - xy| \le |x_n| |y_n - y| + |y| |x_n - x|. \]
Since $(x_n)$ converges, it is bounded and we can find an $M > 1$ such that $|x_n| < M$ for all $n$ and $|y| < M$. Given $\varepsilon > 0$, there exist $n_\varepsilon$ and $m_\varepsilon$ such that
\[ n \ge n_\varepsilon \implies |x_n - x| < \varepsilon/(2M), \qquad n \ge m_\varepsilon \implies |y_n - y| < \varepsilon/(2M). \]
Put $N_\varepsilon = \max\{n_\varepsilon, m_\varepsilon\}$. If $n \ge N_\varepsilon$, then
\[ |x_n y_n - xy| \le M |x_n - x| + M |y_n - y| < \varepsilon/2 + \varepsilon/2 = \varepsilon. \]

(d) Choosing $m$ such that $|x_n - x| < \frac{1}{2}|x|$ if $n \ge m$, we see that
\[ |x_n| > \frac{1}{2}|x|, \qquad n \ge m. \]
Given $\varepsilon > 0$, there is an integer $n_\varepsilon \ge m$ such that $n \ge n_\varepsilon$ implies
\[ |x_n - x| < \frac{1}{2}|x|^2 \varepsilon. \]
Hence, for $n \ge n_\varepsilon$,
\[ \left| \frac{1}{x_n} - \frac{1}{x} \right| = \left| \frac{x_n - x}{x_n x} \right| < \frac{2}{|x|^2} |x_n - x| < \varepsilon. \]
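The two estimates that drive parts (c) and (d) of the proof can be checked on a concrete pair of sequences. The sketch below is an illustration only: the sequences $x_n = 2 + 1/n$ and $y_n = 3 - 1/n^2$ and all function names are assumptions chosen for the example, not taken from the notes. It evaluates both sides of $|x_n y_n - xy| \le |x_n||y_n - y| + |y||x_n - x|$ and of $|1/x_n - 1/x| < (2/|x|^2)|x_n - x|$ with exact rationals.

```python
from fractions import Fraction

X, Y = Fraction(2), Fraction(3)  # the limits x and y of the example sequences


def x(n):
    return 2 + Fraction(1, n)      # x_n -> 2


def y(n):
    return 3 - Fraction(1, n * n)  # y_n -> 3


def product_bound(n):
    """Return (|x_n y_n - xy|, |x_n||y_n - y| + |y||x_n - x|)."""
    lhs = abs(x(n) * y(n) - X * Y)
    rhs = abs(x(n)) * abs(y(n) - Y) + abs(Y) * abs(x(n) - X)
    return lhs, rhs


def reciprocal_bound(n):
    """Return (|1/x_n - 1/x|, (2/|x|^2)|x_n - x|)."""
    lhs = abs(1 / x(n) - 1 / X)
    rhs = 2 / abs(X) ** 2 * abs(x(n) - X)
    return lhs, rhs


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, product_bound(n), reciprocal_bound(n))
```

In each printed pair the first entry stays below the second, and both shrink as $n$ grows, which is the behaviour the $\varepsilon$ arguments above rely on.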
□

3.1.2 Subsequences

Given a sequence $(x_n)$, consider a sequence $(n_k)_k$ of positive integers such that $n_1 < n_2 < \dots$. Then the sequence $(x_{n_k})_k$ is called a subsequence of $(x_n)_n$. If $(x_{n_k})_k$ converges, its limit is called a subsequential limit of $(x_n)_n$.

Remark. It is clear that $(x_n)$ converges to $x$ if and only if every subsequence of $(x_n)$ converges to $x$.

Theorem 1.3. Every bounded sequence in $\mathbb{R}$ contains a convergent subsequence.

Proof. Let $E$ be the range of the bounded sequence $(x_n)$. If $E$ is finite, there is at least one point in $E$, say $x$, and a sequence $(n_k)_k$ with $n_1 < n_2 < \dots$ such that
\[ x_{n_1} = x_{n_2} = \dots = x. \]
The subsequence $(x_{n_k})_k$ so obtained evidently converges. If $E$ is infinite, then $E$ has a limit point $x \in \mathbb{R}$, by theorem 2.8, page 34. Choose $n_1$ so that $|x_{n_1} - x| < 1$. Having chosen $n_1, \dots, n_{i-1}$, we see by theorem 1.2, page 29, that there is an integer $n_i > n_{i-1}$ such that $|x_{n_i} - x| < 1/i$. The sequence $(x_{n_i})_i$ thus obtained converges to $x$. □

Theorem 1.4. The subsequential limits of a sequence $(x_n)$ in a metric space $(X, \rho)$ form a closed set in $X$.

Proof. Apply theorem 1.7, page 31. □

Example. The set of subsequential limits of the sequence
\[ 1, \frac{1}{2}, \frac{2}{2}, \frac{3}{2}, \frac{1}{4}, \frac{2}{4}, \frac{3}{4}, \frac{4}{4}, \frac{5}{4}, \dots, \frac{1}{2^n}, \dots, \frac{2^n + 1}{2^n}, \dots \]
is the closed interval $[0, 1]$, while the set of subsequential limits of the sequence
\[ 1, \frac{1}{2}, \frac{2}{2}, \frac{3}{2}, \frac{1}{4}, \frac{2}{4}, \frac{3}{4}, \frac{4}{4}, \dots, \frac{9}{4}, \dots, \frac{1}{2^n}, \frac{2}{2^n}, \dots, \frac{3^n}{2^n}, \dots \]
is the closed interval $[0, \infty)$. △

3.1.3 Cauchy sequences

A sequence $(x_n)$ in a metric space $(X, \rho)$ is said to be a Cauchy sequence or fundamental sequence if for every $\varepsilon > 0$ there is an integer $n_\varepsilon$ such that $\rho(x_n, x_m) < \varepsilon$ provided $n \ge n_\varepsilon$, $m \ge n_\varepsilon$.

Let $(X, \rho)$ be a metric space and $E \subset X$, $E \ne \emptyset$. The diameter of $E$ is
\[ \operatorname{diam} E = \sup_{x, y \in E} \rho(x, y). \]
If $(x_n)$ is a sequence in $X$ and $E_n = \{x_n, x_{n+1}, \dots\}$, then $(x_n)$ is a Cauchy sequence if and only if
\[ \lim_{n \to \infty} \operatorname{diam} E_n = 0. \]

Theorem 1.5. (a) If $\overline{E}$ is the closure of a set $E$ in a metric space $X$, then $\operatorname{diam} \overline{E} = \operatorname{diam} E$.

(b) (Cantor lemma) If $(K_n)$ is a sequence of nonempty, nested, and compact sets and if
\[ \lim_{n \to \infty} \operatorname{diam} K_n = 0, \]
then $\bigcap_{n=1}^{\infty} K_n$ consists of exactly one point.

Proof. (a) Since $E \subset \overline{E}$, it is clear that $\operatorname{diam} E \le \operatorname{diam} \overline{E}$. Fix $\varepsilon > 0$ and choose $x \in \overline{E}$ and $y \in \overline{E}$. By the definition of $\overline{E}$ there are points $x', y' \in E$ such that $\rho(x, x') < \varepsilon$, $\rho(y, y') < \varepsilon$. Hence
\[ \rho(x, y) \le \rho(x, x') + \rho(x', y') + \rho(y', y) \le 2\varepsilon + \rho(x', y') \le 2\varepsilon + \operatorname{diam} E. \]
It follows that $\operatorname{diam} \overline{E} \le 2\varepsilon + \operatorname{diam} E$, and since $\varepsilon$ was arbitrary, (a) is proved.

(b) Put $K = \bigcap_{n=1}^{\infty} K_n$. By theorem 2.4, p. 33, $K \ne \emptyset$. If $K$ contains more than one element, then $\operatorname{diam} K > 0$. But for each $n$, $K_n \supset K$, so that $\operatorname{diam} K_n \ge \operatorname{diam} K$. This contradicts the assumption that $\operatorname{diam} K_n \to 0$. □

Theorem 1.6. (a) Every convergent sequence in a metric space is a Cauchy sequence.

(b) Every Cauchy sequence in $\mathbb{R}$ converges.

Proof. (a) If $\lim_{n \to \infty} x_n = x$ and $\varepsilon > 0$, there is an integer $n_\varepsilon$ such that $\rho(x_n, x) < \varepsilon/2$ whenever $n \ge n_\varepsilon$. Hence, if $n, m \ge n_\varepsilon$, we have
\[ \rho(x_n, x_m) \le \rho(x_n, x) + \rho(x_m, x) < \varepsilon, \]
so that $(x_n)$ is a Cauchy sequence.

(b) Suppose $(x_n)$ is a Cauchy sequence in $\mathbb{R}$. Let $E_n = \{x_n, x_{n+1}, \dots\}$ and let $\overline{E}_n$ be the closure of $E_n$. By definition and theorem 1.5 (a) we see that
\[ \lim_{n \to \infty} \operatorname{diam} \overline{E}_n = 0. \tag{1.1} \]
In particular, the sets $\overline{E}_n$ are bounded. They are also closed. Hence each $\overline{E}_n$ is compact. Also $\overline{E}_n \supset \overline{E}_{n+1}$. By theorem 1.5 (b) there is a unique point $x \in \mathbb{R}$ which lies in every $\overline{E}_n$. Let $\varepsilon > 0$ be given. By (1.1) there is an integer $n_0$ such that $\operatorname{diam} \overline{E}_n < \varepsilon$ if $n \ge n_0$. Since $x \in \overline{E}_n$, this means that $|y - x| < \varepsilon$ for all $y \in \overline{E}_n$, hence for all $y \in E_n$. In other words, if $n \ge n_0$, then $|x_n - x| < \varepsilon$. But this says precisely that $x_n \to x$, and thus the proof is completed. □

A metric space $(X, \rho)$ in which every Cauchy sequence converges is said to be complete.

Remark. $\mathbb{R}$ is a complete metric space, while $\mathbb{Q}$ is not.
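To see concretely why $\mathbb{Q}$ fails to be complete, one can look at a Cauchy sequence of rationals whose limit in $\mathbb{R}$ is the irrational number $\sqrt{2}$. The sketch below is a hedged illustration, not part of the notes: it assumes the Newton iteration $x_{k+1} = (x_k + 2/x_k)/2$ with $x_0 = 2$ as the example sequence, and the name `newton_sqrt2` is hypothetical. All terms are computed exactly with `fractions.Fraction`, so each one is visibly a rational number.

```python
from fractions import Fraction


def newton_sqrt2(steps):
    """Rational Newton iterates x_{k+1} = (x_k + 2 / x_k) / 2, starting at x_0 = 2."""
    x = Fraction(2)
    terms = [x]
    for _ in range(steps):
        x = (x + 2 / x) / 2
        terms.append(x)
    return terms


if __name__ == "__main__":
    terms = newton_sqrt2(6)
    for k, x in enumerate(terms):
        # x is rational, x * x is never equal to 2, yet x * x - 2 shrinks rapidly.
        print(k, x, float(x * x - 2))
```

The consecutive gaps $|x_{k+1} - x_k|$ shrink rapidly and $x_k^2 \to 2$. The sequence converges in $\mathbb{R}$, so by theorem 1.6 (a) it is a Cauchy sequence, and it remains one when regarded in $\mathbb{Q}$. It has no limit in $\mathbb{Q}$, because no rational number has square 2.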
△

3.1.4 Monotonic sequences

A sequence $(x_n)$ of real numbers is said to be

(a) monotonically increasing if $x_n \le x_{n+1}$, $n \in \mathbb{N}^*$;
(b) monotonically decreasing if $x_n \ge x_{n+1}$, $n \in \mathbb{N}^*$.

The class of monotonic sequences consists of the increasing and the decreasing sequences.

Theorem 1.7. Suppose $(x_n)$ is monotonic. Then $(x_n)$ converges if and only if it is bounded.

Proof. Every convergent sequence is bounded, by theorem 1.1 (c). For the converse, suppose $x_n \le x_{n+1}$ (the decreasing case is analogous). Let $E$ be the range of $(x_n)$. If $(x_n)$ is bounded, let $x$ be the least upper bound of $E$. Then
\[ x_n \le x, \qquad n \in \mathbb{N}^*. \]
For every $\varepsilon > 0$ there is an integer $n_\varepsilon$ such that
\[ x - \varepsilon < x_{n_\varepsilon} \le x, \]
for otherwise $x - \varepsilon$ would be an upper bound of $E$. Since $(x_n)$ increases, $n \ge n_\varepsilon$ therefore implies
\[ x - \varepsilon < x_n \le x < x + \varepsilon, \quad \text{hence} \quad |x_n - x| < \varepsilon, \]
which shows that $(x_n)$ converges to $x$. □

Exercises 1.1. (a) Consider the sequence $(a_n)_{n \ge 1}$ defined by
\[ a_n = 1 + \frac{1}{2} + \frac{1}{3} + \dots + \frac{1}{n}, \qquad n \in \mathbb{N}^*. \]
Then $a_n \to \infty$. Indeed, the sequence is monotonically increasing, so by theorem 1.7 it is enough to show that it diverges. To check this last property it is enough to see that it is not a Cauchy sequence. The sequence $(a_n)$ is not fundamental since
\[ a_{2n} - a_n = \frac{1}{n+1} + \dots + \frac{1}{2n} \ge \frac{1}{2n} + \dots + \frac{1}{2n} = \frac{1}{2}. \]

(b) Consider a numerical sequence $(a_n)$ satisfying $|a_n - a_m| > 1/n$ for any $n < m$. Then the sequence $(a_n)$ is unbounded.

Suppose that $(a_n)$ is bounded. Then there exists a positive $M$ such that $|a_n| \le M$ for every $n$. From the hypothesis we infer that the open intervals $]a_n - 1/(2n), a_n + 1/(2n)[$ are pairwise disjoint and their union satisfies
\[ \bigcup_n \left] a_n - \frac{1}{2n}, a_n + \frac{1}{2n} \right[ \; \subset \; \left] -M - \frac{1}{2}, M + \frac{1}{2} \right[. \]
Since the length of the interval $]-M - \frac{1}{2}, M + \frac{1}{2}[$ is $2M + 1$, it follows that the sum of the lengths of the disjoint intervals $]a_n - 1/(2n), a_n + 1/(2n)[$ does not exceed $2M + 1$. On the other hand, the length of the interval $]a_n - 1/(2n), a_n + 1/(2n)[$ is equal to $1/n$. So the sum of the lengths of the first $n$ intervals is $\sum_{k=1}^{n} 1/k$, which tends to $+\infty$, according to (a).
The contradiction shows that our hypothesis regarding the boundedness of the sequence $(a_n)$ is not true. Hence the sequence is unbounded. △

Theorem 1.8. Suppose that, beginning with a certain rank, the terms of a convergent sequence $(x_n)$ satisfy the inequality $x_n \ge b$ ($x_n \le b$). Then the limit $a$ of the sequence $(x_n)$ satisfies the inequality $a \ge b$ ($a \le b$).

Proof. Suppose that there exists $N \in \mathbb{N}^*$ such that $x_n \ge b$ for every $n \ge N$. We show that $a \ge b$. If $a < b$, denote $c := b - a$ and consider $\varepsilon := c/2$. Since $a$ is the limit of the sequence $(x_n)$, there is a rank $n_\varepsilon \in \mathbb{N}^*$ such that $|x_n - a| < \varepsilon$ for all $n \ge n_\varepsilon$, i.e., $x_n < a + \varepsilon < b$ for all $n \ge n_\varepsilon$, contrary to our assumption. Hence $a \ge b$. The case $x_n \le b$ is analogous. □

Corollary 1.1. Suppose that, beginning with a certain rank, the terms $x_n$ and $y_n$ of the convergent sequences $(x_n)$ and $(y_n)$ satisfy the inequality $x_n \le y_n$. Then
\[ \lim x_n \le \lim y_n. \]

Proof. The sequence $(y_n - x_n)_n$ is convergent and has non-negative terms from a certain rank on. So its limit is non-negative. It means that
\[ \lim y_n - \lim x_n = \lim (y_n - x_n) \ge 0. \]
□

Theorem 1.9. Suppose that, beginning with a certain rank $N$, the terms $x_n$ and $y_n$ of the convergent sequences $(x_n)$ and $(y_n)$ satisfy

(i) $x_n \le y_n$, for all $n \ge N$;
(ii) $y_n - x_n \to 0$, as $n \to \infty$.

Then $\lim x_n = \lim y_n$.

Proof. Since both sequences converge, theorem 1.2 and (ii) give $\lim y_n - \lim x_n = \lim (y_n - x_n) = 0$. □

Exercise 1.1. ([24, Probl. 7, p. 9]) Let $(x_n)$ be the sequence defined by
\[ x_1 = \sqrt{a}, \quad x_2 = \sqrt{a + \sqrt{a}}, \quad \dots, \quad x_{n+1} = \sqrt{a + x_n}, \quad \dots, \qquad a > 0. \]
Show that the sequence $(x_n)$ converges.

To begin with, we remark that the sequence $(x_n)$ has positive terms and is increasing, since
\[ x_2 = \sqrt{a + \sqrt{a}} > \sqrt{a} = x_1, \qquad (x_{n+2} - x_{n+1})(x_{n+2} + x_{n+1}) = x_{n+1} - x_n, \quad \forall n \ge 1. \]
We now check whether the sequence is bounded above. Since
\[ x_1 < \frac{1 + \sqrt{1 + 4a}}{2}, \qquad x_n < \frac{1 + \sqrt{1 + 4a}}{2} \implies x_{n+1} < \frac{1 + \sqrt{1 + 4a}}{2}, \quad \forall n \ge 1, \]
we infer that the sequence is bounded. Hence, by theorem 1.7, the given sequence is convergent.

Exercise 1.2.
([24, Probl. 7, p. 9]) Show that the sequence $(x_n)$ defined by
\[ x_1 = \sqrt{a_1}, \quad x_2 = \sqrt{a_1 + \sqrt{a_2}}, \quad \dots, \quad x_n = \sqrt{a_1 + \sqrt{a_2 + \dots + \sqrt{a_n}}}, \quad \dots, \qquad a_i > 1, \]
converges if
\[ \lim_{n \to \infty} \frac{1}{n} \ln(\ln a_n) < \ln 2. \]

We introduce two other sequences,
\[ b_n = \frac{a_n}{e^{2^n}}, \quad n \ge 1, \qquad y_1 = \sqrt{b_1}, \quad y_2 = \sqrt{b_1 + \sqrt{b_2}}, \quad \dots, \quad y_n = \sqrt{b_1 + \sqrt{b_2 + \dots + \sqrt{b_n}}}, \quad \dots \]
and we note that $y_n = x_n / e$ for every $n \in \mathbb{N}^*$, i.e., the sequences $(x_n)$ and $(y_n)$ converge or diverge simultaneously. We also remark that the sequence $(y_n)$ is increasing. From the hypothesis of the exercise it follows that there exists an $n_0 \in \mathbb{N}^*$ such that for $n \ge n_0$
\[ \frac{1}{n} \ln(\ln a_n) < \ln 2 \iff a_n < e^{2^n} \iff b_n < 1. \]
Consider $a = \max\{b_1, b_2, \dots, b_{n_0}, 1\}$ and define the sequence
\[ z_1 = \sqrt{a}, \quad z_2 = \sqrt{a + \sqrt{a}}, \quad \dots, \quad z_{n+1} = \sqrt{a + z_n}, \quad \dots. \]
Then $y_n \le z_n$. Based on exercise 1.1, we deduce that the sequence $(z_n)$ converges. Therefore the increasing sequence $(y_n)$ is bounded, hence convergent, and, in the end, the sequence $(x_n)$ converges, too. △

Corollary 1.2. Suppose there are given three sequences $(x_n)$, $(a_n)$, and $(y_n)$ satisfying
\[ x_n \le a_n \le y_n \]
from a certain rank on. If the sequences $(x_n)$ and $(y_n)$ converge to the same limit, then the sequence $(a_n)$ is convergent and its limit coincides with the limit of $(x_n)$.

Proof. It is obvious since $|a_n - x_n| \le |y_n - x_n|$. □

Exercises 1.2. Find the limits

(a) $\displaystyle \lim_{n \to \infty} \left( \frac{1}{n^2 + 1} + \frac{1}{n^2 + 2} + \dots + \frac{1}{n^2 + n} \right)$;

(b) $\displaystyle \lim_{n \to \infty} \frac{1 + 2^2 + \dots + n^n}{n^n}$.

The first limit is equal to 0 since
\[ \frac{n}{n^2 + n} \le \frac{1}{n^2 + 1} + \frac{1}{n^2 + 2} + \dots + \frac{1}{n^2 + n} \le \frac{n}{n^2 + 1}, \]
and both bounds tend to 0. For the second limit we take into account the inequalities
\[ 1 \le \frac{1 + 2^2 + \dots + n^n}{n^n} \le \frac{1 + n + n^2 + \dots + n^n}{n^n} = \frac{n^{n+1} - 1}{n - 1} \cdot \frac{1}{n^n} \to 1. \]
Thus we get that the limit is 1. △

3.1.5 Upper and lower limits

Let $(x_n)$ be a sequence of real numbers with the following property: for every real $m$ there is an integer $n_m$ such that $n \ge n_m$ implies $x_n \ge m$. We then write $x_n \to \infty$.
Similarly, if for every real $m$ there is an integer $n_m$ such that $n \ge n_m$ implies $x_n \le m$, we write $x_n \to -\infty$.

Let $(x_n)$ be a sequence of real numbers. Let $E$ be the set of numbers $x$ (in the extended real number system) such that $x_{n_k} \to x$ for some subsequence $(x_{n_k})_k$. This set $E$ contains all subsequential limits, plus, possibly, the elements $+\infty$ and $-\infty$. We put
\[ x^* = \sup E, \qquad x_* = \inf E. \]
The elements $x^*$ and $x_*$ are called the upper and lower limits of $(x_n)$; we use the notation
\[ \limsup_{n \to \infty} x_n = x^*, \qquad \liminf_{n \to \infty} x_n = x_*. \]

Theorem 1.10. Let $(x_n)$ be a sequence of real numbers. Let $E$ and $x^*$ be as defined earlier. Then $x^*$ has the following two properties:

(a) $x^* \in E$;
(b) if $y > x^*$, there is an integer $m$ such that $n \ge m$ implies $x_n < y$.

Moreover, $x^*$ is the only number with the properties (a) and (b).

Proof. If $x^* = +\infty$, $E$ is not bounded above; hence $(x_n)$ is not bounded above, and there is a subsequence $(x_{n_k})$ such that $x_{n_k} \to \infty$. If $x^*$ is real, then $E$ is bounded above, and at least one subsequential limit exists, so that (a) follows from theorem 1.4 and theorem 1.5, p. 30. If $x^* = -\infty$, $E$ contains only one element, namely $-\infty$, and there is no subsequential limit; hence for any real $m$, $x_n > m$ for at most a finite number of values of $n$, so that $x_n \to -\infty$. This establishes (a) in all cases.

To prove (b), suppose there is a number $y > x^*$ such that $x_n \ge y$ for infinitely many values of $n$. In that case, there is a number $z \in E$ such that $z \ge y > x^*$, contradicting the definition of $x^*$. Thus $x^*$ satisfies (a) and (b).

To show the uniqueness, suppose there are two numbers, $p$ and $q$, which satisfy (a) and (b), and suppose $p < q$. Choose $x$ such that $p < x < q$. Since $p$ satisfies (b), we have $x_n < x$ for $n > m$. But then $q$ cannot satisfy (a). □

Exercise 1.3. Let $(a_n)$ and $(b_n)$ be two sequences of real numbers such that $\limsup_{n \to \infty} a_n = \limsup_{n \to \infty} b_n = +\infty$. Then there exist $m, n$ such that $|a_m - a_n| > 1$ and $|b_m - b_n| > 1$.

We note that the two sequences are unbounded.
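Then there exist $n_1, n_2$ satisfying $|a_{n_1} - a_{n_2}| > 2$ (since otherwise the sequence $(a_n)$ is bounded). Further, there exists $n_3$ such that $|b_{n_1} - b_{n_3}| > 1$ and $|b_{n_2} - b_{n_3}| > 1$ (since otherwise the sequence $(b_n)$ is bounded). Now, if $|a_{n_1} - a_{n_3}| > 1$, then take $n := n_1$ and $m := n_3$. If $|a_{n_1} - a_{n_3}| \le 1$, then $|a_{n_2} - a_{n_3}| \ge |a_{n_1} - a_{n_2}| - |a_{n_1} - a_{n_3}| > 1$, and hence take $n := n_2$ and $m := n_3$. △

3.1.6 Stolz-Cesàro theorem and some of its consequences

Theorem 1.11. (Stolz¹-Cesàro² theorem) Let $(a_n)$ be a sequence of real numbers and $(b_n)$ be a strictly increasing and divergent sequence. Then
\[ \lim_{n \to \infty} \frac{a_{n+1} - a_n}{b_{n+1} - b_n} = l \quad (\in [-\infty, +\infty]) \tag{1.2} \]
implies
\[ \lim_{n \to \infty} \frac{a_n}{b_n} = l. \tag{1.3} \]

¹ Otto Stolz. ² Ernesto Cesàro.

Proof. Since $(b_n)$ is strictly increasing and divergent, $b_n \to +\infty$; in particular $b_n > 0$ from a certain rank on. Suppose first that $l$ is finite. Choose a strictly positive $\varepsilon$. For $\varepsilon/3$ we find a rank $n_\varepsilon$ such that $b_n > 0$ and
\[ \left| \frac{a_{n+1} - a_n}{b_{n+1} - b_n} - l \right| < \varepsilon/3 \]
for every $n \ge n_\varepsilon$, that is,
\[ (b_{n+1} - b_n)(l - \varepsilon/3) < a_{n+1} - a_n < (b_{n+1} - b_n)(l + \varepsilon/3). \tag{1.4} \]
Taking in (1.4), successively, $n = n_\varepsilon$, $n = n_\varepsilon + 1$, ..., $n = n_\varepsilon + p - 1$, and adding all these inequalities, we get
\[ (b_{n_\varepsilon + p} - b_{n_\varepsilon})(l - \varepsilon/3) < a_{n_\varepsilon + p} - a_{n_\varepsilon} < (b_{n_\varepsilon + p} - b_{n_\varepsilon})(l + \varepsilon/3) \]
or, dividing by $b_{n_\varepsilon + p} > 0$,
\[ l - \frac{\varepsilon}{3} - \left(l - \frac{\varepsilon}{3}\right) \frac{b_{n_\varepsilon}}{b_{n_\varepsilon + p}} + \frac{a_{n_\varepsilon}}{b_{n_\varepsilon + p}} < \frac{a_{n_\varepsilon + p}}{b_{n_\varepsilon + p}} < l + \frac{\varepsilon}{3} - \left(l + \frac{\varepsilon}{3}\right) \frac{b_{n_\varepsilon}}{b_{n_\varepsilon + p}} + \frac{a_{n_\varepsilon}}{b_{n_\varepsilon + p}}. \tag{1.5} \]
Notice that the sequences $(b_{n_\varepsilon}/b_{n_\varepsilon + p})_p$ and $(a_{n_\varepsilon}/b_{n_\varepsilon + p})_p$ tend to 0. Then there exists a rank $p_\varepsilon$ such that for every $p \ge p_\varepsilon$ there hold
\[ \left| \frac{b_{n_\varepsilon}}{b_{n_\varepsilon + p}} \left(l \pm \frac{\varepsilon}{3}\right) \right| < \frac{\varepsilon}{3} \quad \text{and} \quad \left| \frac{a_{n_\varepsilon}}{b_{n_\varepsilon + p}} \right| < \frac{\varepsilon}{3}. \]
Hence, for every $n > n_\varepsilon + p_\varepsilon$, we finally have
\[ \left| \frac{a_n}{b_n} - l \right| < \varepsilon. \]

Now suppose that $l = +\infty$. Choose a positive $\varepsilon$ as large as we like. There exists a rank $n_\varepsilon$ such that $b_n > 0$ and
\[ \frac{a_{n+1} - a_n}{b_{n+1} - b_n} > 2\varepsilon, \quad \text{that is,} \quad a_{n+1} - a_n > 2\varepsilon (b_{n+1} - b_n), \]
for every $n \ge n_\varepsilon$. Adding the first $p$ such inequalities, starting with $n = n_\varepsilon$, and dividing by $b_{n_\varepsilon + p}$, we may write
\[ \frac{a_{n_\varepsilon + p}}{b_{n_\varepsilon + p}} > 2\varepsilon + \frac{a_{n_\varepsilon} - 2\varepsilon b_{n_\varepsilon}}{b_{n_\varepsilon + p}}. \]
Now, there exists a rank $p_\varepsilon$ such that for every $p > p_\varepsilon$ we have
\[ \left| \frac{a_{n_\varepsilon} - 2\varepsilon b_{n_\varepsilon}}{b_{n_\varepsilon + p}} \right| < \varepsilon. \]
Hence, for every $n > n_\varepsilon + p_\varepsilon$, we finally have
\[ \frac{a_n}{b_n} > \varepsilon. \]
The case $l = -\infty$ follows by applying the previous case to the sequence $(-a_n)$. □

Corollary 1.3.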
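Suppose there is given a strictly positive sequence $(a_n)$.

(a) If $\displaystyle \lim_{n \to \infty} \frac{a_{n+1}}{a_n} = a$ ($a > 0$), then $\displaystyle \lim_{n \to \infty} \sqrt[n]{a_n} = a$.

(b) If $\displaystyle \lim_{n \to \infty} a_n = a$ ($a > 0$), then $\displaystyle \lim_{n \to \infty} \sqrt[n]{a_1 a_2 \cdots a_n} = a$.

(c) For $p \in \mathbb{N}^*$, it holds
\[ \lim_{n \to \infty} \frac{1^p + 2^p + \dots + n^p}{n^{p+1}} = \frac{1}{p + 1}. \]

Proof. (a) Take $x_n = \ln \sqrt[n]{a_n}$. Then $x_n = (\ln a_n)/n$. Now we apply the Stolz-Cesàro theorem 1.11 and we get
\[ \lim x_n = \lim \ln \frac{a_{n+1}}{a_n} = \ln \lim \frac{a_{n+1}}{a_n} = \ln a. \]

(b) Take $x_n = \ln \sqrt[n]{a_1 a_2 \cdots a_n}$ and reason as before.

(c) Applying the Stolz-Cesàro theorem 1.11, we get
\[ \frac{(n+1)^p}{(n+1)^{p+1} - n^{p+1}} = \frac{n^p + p n^{p-1} + \dots}{(p+1) n^p + \dots} \to \frac{1}{p+1} \]
as $n \to \infty$.

3.1.7 Some special sequences

Proposition 1.1. The following limits hold:

(a) if $p > 0$, $\displaystyle \lim_{n \to \infty} \frac{1}{n^p} = 0$;
(b) if $p > 0$, $\displaystyle \lim_{n \to \infty} \sqrt[n]{p} = 1$;
(c) $\displaystyle \lim_{n \to \infty} \sqrt[n]{n} = 1$;
(d) if $p > 0$ and $\alpha$ is real, then $\displaystyle \lim_{n \to \infty} \frac{n^\alpha}{(1 + p)^n} = 0$;
(e) if $|x| < 1$, $\displaystyle \lim_{n \to \infty} x^n = 0$.

Proof. (a), (b), and (c) follow from theorem 1.11 and corollary 1.3. □

Proposition 1.2. Let the sequence $(a_n)$ be defined as
\[ a_n = \left(1 + \frac{1}{n}\right)^n. \tag{1.6} \]
Then the sequence $(a_n)$ is convergent.

Proof. First approach. Based on Newton's³ binomial formula we may write
\[ \begin{aligned} a_n &= 1 + \binom{n}{1} \frac{1}{n} + \binom{n}{2} \frac{1}{n^2} + \dots + \binom{n}{n} \frac{1}{n^n} \\ &= 1 + 1 + \frac{n(n-1)}{2!} \frac{1}{n^2} + \dots + \frac{n(n-1)(n-2) \cdots 2 \cdot 1}{n!} \frac{1}{n^n} \\ &= 1 + 1 + \frac{1}{2!}\left(1 - \frac{1}{n}\right) + \dots + \frac{1}{n!}\left(1 - \frac{1}{n}\right)\left(1 - \frac{2}{n}\right) \cdots \left(1 - \frac{n-1}{n}\right) \\ &< 1 + 1 + \frac{1}{2!} + \frac{1}{3!} + \dots + \frac{1}{n!}. \end{aligned} \tag{1.7} \]
Since $n! \ge 2^{n-1}$ for every $n \ge 2$, it follows that, for $n \ge 2$,
\[ 2 < a_n < 1 + 1 + \frac{1}{2} + \frac{1}{2^2} + \dots + \frac{1}{2^{n-1}} < 3. \]
Thus the sequence $(a_n)$ is bounded. We now study the monotonicity of the sequence $(a_n)$:
\[ \begin{aligned} a_{n+1} &= 1 + 1 + \frac{1}{2!}\left(1 - \frac{1}{n+1}\right) + \dots + \frac{1}{(n+1)!}\left(1 - \frac{1}{n+1}\right)\left(1 - \frac{2}{n+1}\right) \cdots \left(1 - \frac{n}{n+1}\right) \\ &> 1 + 1 + \frac{1}{2!}\left(1 - \frac{1}{n+1}\right) + \dots + \frac{1}{n!}\left(1 - \frac{1}{n+1}\right)\left(1 - \frac{2}{n+1}\right) \cdots \left(1 - \frac{n-1}{n+1}\right) \\ &\ge 1 + 1 + \frac{1}{2!}\left(1 - \frac{1}{n}\right) + \dots + \frac{1}{n!}\left(1 - \frac{1}{n}\right)\left(1 - \frac{2}{n}\right) \cdots \left(1 - \frac{n-1}{n}\right) = a_n. \end{aligned} \]

³ Sir Isaac Newton, 1642-1727.

Thus the sequence $(a_n)$ is increasing.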
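Taking into account that it is also bounded, by theorem 1.7, we infer that the sequence $(a_n)$ is convergent.

Second approach. We start with the identity
\[ x^{n+1} - y^{n+1} = (x - y) \sum_{i=0}^{n} x^{n-i} y^i, \]
valid for any real $x$ and $y$ and $n \in \mathbb{N}^*$. If $x > y > 0$ it results
\[ (n + 1)(x - y) y^n < x^{n+1} - y^{n+1} < (n + 1)(x - y) x^n. \tag{1.8} \]
Substitute $x = 1 + 1/n$ and $y = 1 + 1/(n + 1)$ and we get
\[ x - y = \frac{1}{n(n + 1)}, \qquad x^{n+1} = x \cdot a_n, \qquad y^{n+1} = a_{n+1}, \]
\[ \frac{x - 1}{y - 1} = \frac{n + 1}{n} = x > y \implies y^2 < x + y - 1 = 1 + \frac{1}{n} + \frac{1}{n + 1} \implies y^{n+2} < \left(1 + \frac{1}{n} + \frac{1}{n + 1}\right) y^n. \tag{1.9} \]
The right-hand side of (1.8) supplies
\[ x \cdot a_n - a_{n+1} < \frac{1}{n} a_n \implies \left(1 + \frac{1}{n}\right) a_n - \frac{1}{n} a_n < a_{n+1} \implies a_n < a_{n+1}. \]
Thus the sequence $(a_n)$ is increasing. Denote $u_n = (1 + \frac{1}{n})^{n+1}$. Then $a_n < u_n$ for every $n \in \mathbb{N}^*$. From (1.9) it follows
\[ u_{n+1} = y^{n+2} < \left(1 + \frac{1}{n} + \frac{1}{n + 1}\right) y^n = \left(\frac{1}{n} + y\right) y^n, \]
while from the left-hand side of (1.8) it results
\[ \frac{1}{n} y^n < u_n - y^{n+1} \implies \left(\frac{1}{n} + y\right) y^n < u_n \implies u_{n+1} < \left(\frac{1}{n} + y\right) y^n < u_n. \tag{1.10} \]
Thus the sequence $(u_n)$ is decreasing and
\[ a_n < u_n < \dots < u_5 = \left(1 + \frac{1}{5}\right)^6 = 2.985984 \implies a_n < 3, \quad \forall n \in \mathbb{N}^* \]
(for $n < 5$ use $a_n < a_5 < u_5$). Thus the sequence $(a_n)$ is increasing and bounded above, hence it converges. As usual, we denote the limit of the sequence $(a_n)$ by $e$, i.e.,
\[ \lim_{n \to \infty} \left(1 + \frac{1}{n}\right)^n =: e. \tag{1.11} \]
□

Corollary 1.4. There hold the inequalities
\[ \left(1 + \frac{1}{n}\right)^n < e < \left(1 + \frac{1}{n}\right)^{n+1}, \qquad \forall n \in \mathbb{N}^*. \]

Proposition 1.3. Below we introduce some properties of the number $e$.

(i) It holds that
\[ \lim_{n \to \infty} \left(1 + \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \dots + \frac{1}{n!}\right) = e; \]
(ii) $e$ can be written as
\[ e = 1 + \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \dots + \frac{1}{n!} + \frac{\theta_n}{n \cdot n!}, \tag{1.12} \]
where $\theta_n \in \, ]0, 1[$;
(iii) $e$ can be approximated by the sum in (i) with an accuracy up to $10^{-5}$, if $n \ge 8$;
(iv) $e$ is a non-rational number, $e \in \mathbb{R} \setminus \mathbb{Q}$.

Proof. (i) We saw by (1.7) that
\[ a_n < 1 + 1 + \frac{1}{2!} + \frac{1}{3!} + \dots + \frac{1}{n!}. \]
The sequence $(c_n)$ defined as
\[ c_n = 1 + 1 + \frac{1}{2!} + \frac{1}{3!} + \dots + \frac{1}{n!}, \qquad n \in \mathbb{N}^*, \]
is increasing and bounded above by 3, hence convergent, and therefore the following inequality holds
\[ e = \lim_{n \to \infty} a_n \le \lim_{n \to \infty} c_n. \tag{1.13} \]
On the other hand, for every $k \le n$,
\[ \begin{aligned} a_n = \left(1 + \frac{1}{n}\right)^n &= 1 + \binom{n}{1} \frac{1}{n} + \binom{n}{2} \frac{1}{n^2} + \dots + \binom{n}{n} \frac{1}{n^n} \\ &\ge 1 + 1 + \frac{1}{2!}\left(1 - \frac{1}{n}\right) + \dots + \frac{1}{k!}\left(1 - \frac{1}{n}\right)\left(1 - \frac{2}{n}\right) \cdots \left(1 - \frac{k-1}{n}\right). \end{aligned} \]
Letting $n \to \infty$ with $k$ fixed, we get
\[ \lim_{n \to \infty} a_n = e \ge 1 + 1 + \frac{1}{2!} + \dots + \frac{1}{k!}, \quad \forall k \ge 2 \implies e \ge c_k \implies e \ge \lim_{k \to \infty} c_k. \tag{1.14} \]
Hence, from (1.13) and (1.14), we infer
\[ \lim_{k \to \infty} c_k = e. \]

(ii) We have
\[ c_{n+m} - c_n = \frac{1}{(n+1)!} + \frac{1}{(n+2)!} + \dots + \frac{1}{(n+m)!} < \frac{1}{(n+1)!} \sum_{k=0}^{\infty} \frac{1}{(n+2)^k} = \frac{1}{(n+1)!} \cdot \frac{n+2}{n+1} < \frac{1}{n \cdot n!}. \]
For fixed $n$ and passing $m \to +\infty$ it follows
\[ 0 < e - c_n < \frac{1}{n \cdot n!}. \]
Denote
\[ 0 < \theta_n := \frac{e - c_n}{\frac{1}{n \cdot n!}} < 1 \]
and (1.12) follows.

(iii) The inequality
\[ 0 < e - c_n < \frac{1}{n \cdot n!} < 10^{-5} \]
is satisfied for any $n \ge 8$; so
\[ e \approx 2 + \frac{1}{2!} + \dots + \frac{1}{8!} \approx 2.71828. \]

(iv) Suppose that $e \in \mathbb{Q}$. Then we may write $e = m/n$, for some $m, n \in \mathbb{N}^*$. Then
\[ e = \frac{m}{n} = 1 + \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \dots + \frac{1}{n!} + \frac{\theta_n}{n \cdot n!}, \quad \theta_n \in \, ]0, 1[, \]
\[ \implies (n-1)! \cdot m - n! \left(2 + \frac{1}{2!} + \frac{1}{3!} + \dots + \frac{1}{n!}\right) = \frac{\theta_n}{n}. \]
The last equality is impossible, since its left-hand side is an integer while $\theta_n / n \in \, ]0, 1[$. Hence $e \in \mathbb{R} \setminus \mathbb{Q}$. □

Proposition 1.4. The sequence
\[ a_n = 1 + \frac{1}{2} + \dots + \frac{1}{n} - \ln n, \qquad n \ge 1, \]
is decreasing and bounded.

Proof. By corollary 1.4 we successively get
\[ \left(1 + \frac{1}{n}\right)^{n+1} > e > \left(1 + \frac{1}{n}\right)^n \implies (n + 1)[\ln(n + 1) - \ln n] > 1 > n[\ln(n + 1) - \ln n] \]
\[ \implies \frac{1}{n + 1} < \ln(n + 1) - \ln n < \frac{1}{n}. \tag{1.15} \]
Inequality (1.15) follows from the Lagrange⁴ mean value theorem 2.3 at page 97, too. Taking $n = 1, 2, \dots, k$ in (1.15) and adding, we get
\[ \sum_{n=1}^{k} \frac{1}{n + 1} < \ln(k + 1) < \sum_{n=1}^{k} \frac{1}{n}. \tag{1.16} \]
Remark that from (1.15) it follows
\[ a_{n+1} - a_n = \frac{1}{n + 1} - \ln(n + 1) + \ln n < 0, \]
so that the sequence $(a_n)$ is decreasing. From (1.16) we get
\[ \ln k < \ln(k + 1) < \sum_{n=1}^{k} \frac{1}{n}. \]
So the sequence $(a_n)$ is bounded below by 0. Hence our sequence is convergent. Its limit is denoted by $\gamma = 0.5772156649\dots$ and it is called the Euler⁵-Mascheroni⁶ constant.

Corollary 1.5. It holds
\[ \lim_{n \to \infty} \frac{1 + \frac{1}{2} + \dots + \frac{1}{n}}{\ln n} = 1. \]

Proof. Pass to the limit in
\[ \frac{1 + \frac{1}{2} + \dots + \frac{1}{n}}{\ln n} = \frac{1 + \frac{1}{2} + \dots + \frac{1}{n} - \ln n}{\ln n} + 1. \]
We may use equally well the Stolz-Cesàro theorem 1.11. □

Corollary 1.6.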
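It holds
\[ \lim_{n \to \infty} \left(\frac{1}{n + 1} + \frac{1}{n + 2} + \dots + \frac{1}{kn}\right) = \ln k, \qquad \forall k \in \{2, 3, \dots\}. \]

Proof. Pass to the limit in
\[ \frac{1}{n + 1} + \frac{1}{n + 2} + \dots + \frac{1}{kn} = \left(1 + \frac{1}{2} + \dots + \frac{1}{kn} - \ln kn\right) - \left(1 + \frac{1}{2} + \dots + \frac{1}{n} - \ln n\right) + \ln k, \quad n \ge 1. \]

⁴ J. L. Lagrange. ⁵ Leonhard Euler, 1707-1783. ⁶ Mascheroni.

Corollary 1.7. Consider the limit
\[ \lim_{n \to \infty} \left(1 + \frac{1}{2} + \dots + \frac{1}{n} - \alpha \ln n\right) \]
and determine the set of constants $\alpha$ for which the above limit exists and is finite.

Proof. We denote by $M$ the set of constants $\alpha$ for which the above limit exists and is finite. Remark that $M \ne \emptyset$ since, by proposition 1.4, $1 \in M$. Now
\[ 1 + \frac{1}{2} + \dots + \frac{1}{n} - \ln n + (1 - \alpha) \ln n \to \begin{cases} \gamma, & \alpha = 1, \\ +\infty, & \alpha < 1, \\ -\infty, & \alpha > 1. \end{cases} \]
Hence $M = \{1\}$. □

Corollary 1.8. Denote
\[ a_n = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \dots + (-1)^{n-1} \frac{1}{n}. \]
Then it holds
\[ \lim_{n \to \infty} a_n = \ln 2. \]

Proof. Recall the Catalan⁷ identity
\[ 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \dots - \frac{1}{2n} = \frac{1}{n + 1} + \frac{1}{n + 2} + \dots + \frac{1}{2n}. \]
Then
\[ \begin{aligned} a_{2n} &= \frac{1}{n + 1} + \frac{1}{n + 2} + \dots + \frac{1}{2n} \\ &= \left(1 + \frac{1}{2} + \dots + \frac{1}{2n} - \ln 2n\right) - \left(1 + \frac{1}{2} + \dots + \frac{1}{n} - \ln n\right) + \ln 2 \to \ln 2. \end{aligned} \]
Since $a_{2n+1} = a_{2n} + 1/(2n + 1)$, the odd-indexed terms tend to $\ln 2$ as well. □

Lemma 1.1. (Problem of Traian Lalescu,⁸ Gazeta Matematică, 1901, problem 579) Define
\[ a_n = \sqrt[n+1]{(n + 1)!} - \sqrt[n]{n!}. \]
Then $\lim_{n \to \infty} a_n = e^{-1}$.

The result will follow from the next two propositions.

⁷ E. Ch. Catalan, 1814-1894. ⁸ Traian Lalescu.

Proposition 1.5. The sequence
\[ x_n = \frac{n}{\sqrt[n]{n!}}, \qquad n > 3, \]
satisfies the inequalities
\[ e^{1 - \frac{1}{\sqrt{n}}} < x_n < e^{1 - \frac{1}{n}}. \tag{1.17} \]

Proof. We will prove them by induction. For $n > 3$ the next inequality holds
\[ x_n > e^{1 - \frac{1}{\sqrt{n}}}. \tag{1.18} \]
Indeed, since
\[ e < \sqrt{\frac{32}{3}} \quad \text{and} \quad x_4 = \sqrt[4]{\frac{32}{3}}, \]
it follows that (1.18) holds for $n = 4$. Suppose that (1.18) holds for some $n \ge 4$. Since $x_{n+1}^{n+1} = x_n^n \left(1 + \frac{1}{n}\right)^n$, we get
\[ x_{n+1} > \left(1 + \frac{1}{n}\right)^{\frac{n}{n+1}} \cdot e^{\frac{n - \sqrt{n}}{n + 1}}, \]
and from corollary 1.4 it follows
\[ x_{n+1} > e^{1 - f(n)}, \quad \text{where} \quad f(n) = \frac{1 + (n + 1)\sqrt{n}}{(n + 1)^2}. \]
It is easy to check that
\[ n \ge 4 \implies f(n) \le \frac{1}{\sqrt{n + 1}}. \]
So
\[ x_{n+1} > e^{1 - \frac{1}{\sqrt{n + 1}}}, \]
and (1.18) is completely proved. Now we suppose that for some $n \ge 4$, $x_n < e^{1 - 1/n}$.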
Then applying corollary 1.4 we get
\[ x_{n+1}^{n+1} = x_n^n \cdot \frac{(n + 1)^n}{n^n} < e^{n-1} \frac{(n + 1)^n}{n^n} \implies x_{n+1} < e^{\frac{n-1}{n+1}} \left(1 + \frac{1}{n}\right)^{\frac{n}{n+1}} < e^{1 - \frac{1}{n+1}}, \]
ending in this way the proof. □

Proposition 1.6. For every $\varepsilon > 0$ the inequality
\[ \left| a_n - \frac{1}{e} \right| < \varepsilon \]
is satisfied for every $n > n_\varepsilon$, where
\[ n_\varepsilon = 1 + \left[\, 8 + \frac{1}{2\varepsilon^2} + \left| 8 - \frac{1}{2\varepsilon^2} \right| \,\right] \]
and $[\,\cdot\,]$ denotes the integer part.

Proof. Using (1.17) and corollary 1.4, for $n \ge 16$, we get
\[ a_n = \frac{n\left(\sqrt[n]{x_{n+1}} - 1\right)}{x_n} < \frac{n\left(e^{\frac{1}{n+1}} - 1\right)}{e} \cdot e^{\frac{1}{\sqrt{n}}} < \frac{1}{e} e^{\frac{1}{\sqrt{n}}} < \frac{1}{e}\left(1 + \frac{1}{[\sqrt{n}] - 1}\right) < \frac{1}{e}\left(1 + \frac{1}{\sqrt{n} - 2}\right) < \frac{1}{e} + \frac{1}{\sqrt{n}}. \]
At the same time
\[ n\left(\sqrt[n]{x_{n+1}} - 1\right) > n\left[e^{\frac{1}{n + 1 + \sqrt{n + 1}}} - 1\right] > n\left[e^{\frac{1}{[n + 2 + \sqrt{n + 1}]}} - 1\right] > \frac{n}{[n + 2 + \sqrt{n + 1}]} \ge \frac{n}{n + 2 + \sqrt{n + 1}}. \]
Thus
\[ a_n > \frac{1}{e} \cdot \frac{n e^{\frac{1}{n}}}{n + 2 + \sqrt{n + 1}} > \frac{1}{e} - \frac{1}{\sqrt{n}}. \]
Further, for $n \ge 16$, we have
\[ \left| a_n - \frac{1}{e} \right| < \frac{1}{\sqrt{n}}. \]
Since $n_\varepsilon = 1 + [\max\{16, 1/\varepsilon^2\}] > \max\{16, 1/\varepsilon^2\}$, for $n \ge n_\varepsilon$ we have $1/\sqrt{n} < \varepsilon$. Hence
\[ \left| a_n - \frac{1}{e} \right| < \varepsilon, \qquad \forall n \ge n_\varepsilon. \]
□

Remark. Different approaches to the above problem may be found in [5, page 140], [2, Problem 3.20, page 437].

3.2 Numerical series

All the sequences and series under consideration in the present section are complex-valued, unless the contrary is explicitly stated.

Given a sequence $(x_n)$ we use the notation
\[ \sum_{n=p}^{q} x_n \quad (p \le q) \]
to denote the sum $x_p + x_{p+1} + \dots + x_q$. To $(x_n)_{n \ge 1}$ we associate a sequence $(s_n)$, where
\[ s_n = \sum_{k=1}^{n} x_k. \]
We also use the symbolic expression
\[ x_1 + x_2 + \dots \]
or, more concisely,
\[ \sum_{n=1}^{\infty} x_n. \tag{2.1} \]
The symbol (2.1) we call an infinite series or just a series. The numbers $s_n$ are called the partial sums of the series. If $(s_n)$ converges, say to $s$, then the series converges and we write
\[ \sum_{n=1}^{\infty} x_n = s. \]
The number $s$ is called the sum of the series; but it should be clearly understood that $s$ is the limit of a sequence of sums, and is not obtained simply by addition. If $(s_n)$ diverges, the series is said to diverge.

Sometimes, for convenience of notation, we shall consider series of the form
\[ \sum_{n=0}^{\infty} x_n. \tag{2.2} \]
And frequently, when there is no possible ambiguity, or when the distinction is immaterial, we shall simply write $\sum x_n$ in place of (2.1) or (2.2).

Theorem 2.1. (General criterion of Cauchy) $\sum x_n$ converges if and only if for every $\varepsilon > 0$ there is an integer $n_\varepsilon$ such that
\[ \left| \sum_{k=n}^{m} x_k \right| < \varepsilon \tag{2.3} \]
if $m \ge n \ge n_\varepsilon$.

Proof. Apply theorem 1.6 from page 41 to $(s_n)$. □

In particular, for $m = n$, (2.3) becomes
\[ |x_n| < \varepsilon, \qquad n \ge n_\varepsilon, \]
that is:

Theorem 2.2. (Necessary condition) If $\sum x_n$ converges, then $\lim x_n = 0$.

Remark. The condition $x_n \to 0$ is not, however, sufficient to ensure convergence of $\sum x_n$. For instance, the series
\[ \sum_{n=1}^{\infty} \frac{1}{n} \]
diverges, as results from (a) of exercises 1.1 from page 42, or as will become clear by theorem 2.19. △

Theorem 2.3. (Operations with series) Let $\sum a_n$ and $\sum b_n$ be two convergent series and $c$ a real number. Then the series $\sum c\, a_n$ and $\sum (a_n \pm b_n)$ are convergent and, moreover,
\[ \sum c\, a_n = c \sum a_n \tag{2.4} \]
and
\[ \sum (a_n \pm b_n) = \sum a_n \pm \sum b_n. \tag{2.5} \]

Proof. Indeed,
\[ c \sum a_n = c \lim_n \sum_{k=1}^{n} a_k = \lim_n \sum_{k=1}^{n} c\, a_k = \sum c\, a_n \]
and
\[ \sum a_n \pm \sum b_n = \lim_n \sum_{k=1}^{n} a_k \pm \lim_n \sum_{k=1}^{n} b_k = \lim_n \left(\sum_{k=1}^{n} a_k \pm \sum_{k=1}^{n} b_k\right) = \sum (a_n \pm b_n). \]
□

Remark. The assumption that both series are convergent is essential. Otherwise it could happen that the left-hand side of (2.5) exists while the right-hand side of it has no meaning. For instance,
\[ 0 = \sum (1 - 1) \quad \text{versus} \quad \sum 1 - \sum 1. \]
△

As for finite sums, we may group the terms of a convergent series in brackets (but no commutativity is allowed). For example
\[ a_1 + (a_2 + a_3) + (a_4 + a_5 + a_6 + a_7) + \dots. \]

Theorem 2.4. Let $\sum a_n$ be a convergent series. Then by grouping its terms we obtain a convergent series having the same sum.

Proof. The sequence of partial sums of the transformed series is a subsequence of the convergent sequence of partial sums of the original series. □

3.2.1 Series of nonnegative terms

Theorem 2.5. A series of nonnegative terms converges if and only if its partial sums form a bounded sequence.

Proof. The sequence of partial sums is increasing.
It is convergent if and only if it is bounded, according to theorem 1.7 from page 41. □

Theorem 2.6. Let $\sum a_n$ be a convergent series with nonnegative terms. Then the series $\sum b_n$ obtained from the former by rearranging (commuting) and renumbering its terms is also convergent and has the same sum.

Proof. Let $s$ be the sum of the first series and $s_n$ its partial sum of rank $n$. Denote by $p_n$ the partial sum of rank $n$ of the series $\sum b_n$. Fix a rank $n$ and consider $p_n$. Then
\[ p_n = b_1 + \dots + b_n = a_{k_1} + \dots + a_{k_n}, \]
and denote $N = \max\{k_1, \dots, k_n\}$. Obviously, $p_n \le s_N \le s$. Thus we conclude that the series $\sum b_n$ converges and, if $b$ is its sum, $b \le s$. Reasoning the other way round, we get $s \le b$, so that $\sum a_n = \sum b_n$. □

Theorem 2.7. (Comparison test) (a) If $|x_n| \le c_n$ for $n \ge n_0$, where $n_0$ is some fixed integer, and if $\sum c_n$ converges, then $\sum x_n$ converges.

(b) If $x_n \ge y_n \ge 0$ for $n \ge n_0$ and if $\sum y_n$ diverges, then $\sum x_n$ diverges.

Proof. (a) Given $\varepsilon > 0$ there exists $n_\varepsilon \ge n_0$ such that $m \ge n \ge n_\varepsilon$ implies
\[ \sum_{k=n}^{m} c_k \le \varepsilon, \]
by the Cauchy criterion. Hence
\[ \left| \sum_{k=n}^{m} x_k \right| \le \sum_{k=n}^{m} |x_k| \le \sum_{k=n}^{m} c_k \le \varepsilon, \]
and (a) follows.

(b) follows from (a), for if $\sum x_n$ converges, so must $\sum y_n$. □

Theorem 2.8. If $0 \le x < 1$,
\[ \sum_{n=0}^{\infty} x^n = \frac{1}{1 - x}. \]
If $x \ge 1$, the series diverges.

Proof. If $x \ne 1$,
\[ s_n = \sum_{k=0}^{n} x^k = \frac{1 - x^{n+1}}{1 - x}. \]
The result follows if we let $n \to \infty$. For $x = 1$, we get
\[ 1 + 1 + 1 + \dots, \]
which evidently diverges. □

Theorem 2.9. (Cauchy condensation test) Suppose $x_1 \ge x_2 \ge x_3 \ge \dots \ge 0$. Then the series $\sum_{n=1}^{\infty} x_n$ converges if and only if the series
\[ \sum_{k=0}^{\infty} 2^k x_{2^k} = x_1 + 2x_2 + 4x_4 + 8x_8 + \dots \tag{2.6} \]
converges.

Proof. By theorem 2.5 it suffices to consider the boundedness of the partial sums. Let
\[ s_n = x_1 + x_2 + \dots + x_n, \qquad t_k = x_1 + 2x_2 + \dots + 2^k x_{2^k}. \]
For $n < 2^k$ (adding terms),
\[ s_n = x_1 + \dots + x_n \le x_1 + (x_2 + x_3) + \dots + (x_{2^k} + \dots + x_{2^{k+1} - 1}) \le x_1 + 2x_2 + \dots + 2^k x_{2^k} = t_k, \]
so that $s_n \le t_k$.
(2.7)

On the other hand, if $n > 2^k$ (neglect terms),
$$s_n = x_1 + \dots + x_n \ge x_1 + x_2 + (x_3 + x_4) + \dots + (x_{2^{k-1}+1} + \dots + x_{2^k}) \ge \tfrac{1}{2} x_1 + x_2 + 2x_4 + \dots + 2^{k-1} x_{2^k} = \tfrac{1}{2} t_k,$$
so that
$$2 s_n \ge t_k. \tag{2.8}$$
By (2.7) and (2.8) the sequences $(s_n)$ and $(t_k)$ are either both bounded or both unbounded. □

Theorem 2.10. $\displaystyle \sum \frac{1}{n^p}$ converges if and only if $p > 1$.

Proof. If $p \le 0$, divergence follows from the necessary condition, i.e., theorem 2.2. If $p > 0$, the previous theorem is applicable, and we are led to the series
$$\sum_{k=0}^{\infty} 2^k \frac{1}{2^{kp}} = \sum_{k=0}^{\infty} 2^{(1-p)k}.$$
Now, $2^{1-p} < 1$ if and only if $1 - p < 0$, and the result follows by comparison with the geometric series (take $x = 2^{1-p}$ in theorem 2.8). □

Remark. The series
$$1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \dots$$
is called the harmonic series and it is divergent. From corollary 1.5 we deduce that the harmonic series diverges at the same rate as the logarithmic function. △

Theorem 2.11. If $p > 1$,
$$\sum_{n=2}^{\infty} \frac{1}{n (\ln n)^p} \tag{2.9}$$
converges; if $p \le 1$, the series diverges.

Proof. The logarithmic function increases, hence $1/[n(\ln n)^p]$ decreases. We apply theorem 2.9 to (2.9); this leads us to the series
$$\sum_{k=1}^{\infty} 2^k \frac{1}{2^k (\ln 2^k)^p} = \frac{1}{(\ln 2)^p} \sum_{k=1}^{\infty} \frac{1}{k^p}, \tag{2.10}$$
and the result follows from theorem 2.10. □

3.2.2 The root and ratio tests

Theorem 2.12. (Root test or Cauchy criterion) Given $\sum x_n$, we put $\alpha = \limsup_{n \to \infty} \sqrt[n]{|x_n|}$. Then
(a) if $\alpha < 1$, $\sum x_n$ converges;
(b) if $\alpha > 1$, $\sum x_n$ diverges;
(c) if $\alpha = 1$, the test gives no information.

Proof. (a) If $\alpha < 1$, we can choose $\beta$ so that $\alpha < \beta < 1$ and an integer $m$ such that $\sqrt[n]{|x_n|} < \beta$ for all $n \ge m$. That is, $n \ge m$ implies $|x_n| < \beta^n$. Since $0 < \beta < 1$, $\sum \beta^n$ converges. Convergence of $\sum x_n$ follows now from the comparison test, theorem 2.7.
(b) If $\alpha > 1$, then there is a sequence $(n_k)$ such that $\sqrt[n_k]{|x_{n_k}|} \to \alpha$. Hence $|x_n| > 1$ for infinitely many values of $n$, so the condition $x_n \to 0$, necessary for convergence of $\sum x_n$, does not hold.
(c) To prove (c), we consider the series
$$\sum \frac{1}{n}, \qquad \sum \frac{1}{n^2}.$$
For each of these series $\alpha = 1$, but the first diverges, while the second converges. □

Theorem 2.13. (Ratio test or D'Alembert⁹ criterion) The series $\sum x_n$
(a) converges if $\displaystyle \limsup_{n \to \infty} \left| \frac{x_{n+1}}{x_n} \right| < 1$;
(b) diverges if $\displaystyle \left| \frac{x_{n+1}}{x_n} \right| \ge 1$ for $n \ge n_0$, where $n_0$ is some fixed integer;
(c) if
$$\liminf_{n \to \infty} \left| \frac{x_{n+1}}{x_n} \right| \le 1 \le \limsup_{n \to \infty} \left| \frac{x_{n+1}}{x_n} \right|,$$
the test gives no information.

⁹ Jean Le Rond D'Alembert, 1717-1783

Proof. (a) If condition (a) holds, we can find $\beta < 1$ and an integer $m$ such that
$$\left| \frac{x_{n+1}}{x_n} \right| < \beta \quad \text{for } n \ge m.$$
In particular
$$|x_{m+1}| < \beta |x_m|, \quad |x_{m+2}| < \beta |x_{m+1}| < \beta^2 |x_m|, \quad \dots, \quad |x_{m+p}| < \beta^p |x_m|.$$
That is, $|x_n| < \beta^{n-m} |x_m|$ for $n > m$, and (a) follows from the comparison test, since $\sum \beta^n$ converges.
(b) If $|x_{n+1}| \ge |x_n|$ for $n \ge n_0$, it is easily seen that the necessary condition $x_n \to 0$ does not hold, and (b) follows.
(c) We again consider the series
$$\sum \frac{1}{n}, \qquad \sum \frac{1}{n^2}.$$
In both cases we have
$$\lim_{n \to \infty} \frac{x_{n+1}}{x_n} = 1,$$
but the first series diverges while the second one converges. □

Theorem 2.14. For any sequence $(x_n)$ of positive numbers
$$\liminf_{n \to \infty} \frac{x_{n+1}}{x_n} \le \liminf_{n \to \infty} \sqrt[n]{x_n}, \qquad \limsup_{n \to \infty} \sqrt[n]{x_n} \le \limsup_{n \to \infty} \frac{x_{n+1}}{x_n}.$$

3.2.3 Power series

Given a sequence $(c_n)$ of complex numbers, the series
$$\sum_{n=0}^{\infty} c_n z^n = c_0 + \sum_{n=1}^{\infty} c_n z^n \tag{2.11}$$
is called a power series. The numbers $c_n$ are called the coefficients of the series; $z$ is a complex number.

Theorem 2.15. (Cauchy-Hadamard¹⁰) Given the power series $\sum c_n z^n$, put
$$\alpha = \limsup_{n \to \infty} \sqrt[n]{|c_n|}, \qquad R = \frac{1}{\alpha}.$$
(If $\alpha = 0$, $R = +\infty$; if $\alpha = \infty$, $R = 0$.) Then $\sum c_n z^n$ converges if $|z| < R$, and diverges if $|z| > R$.

Remark. $R$ is called the radius of convergence of $\sum c_n z^n$.

Proof. Put $x_n = c_n z^n$ and apply the root test:
$$\limsup_{n \to \infty} \sqrt[n]{|x_n|} = |z| \limsup_{n \to \infty} \sqrt[n]{|c_n|} = \frac{|z|}{R}. \qquad □$$

Examples.
(a) $\sum n^n z^n$, $R = 0$;
(b) $\sum \dfrac{z^n}{n!}$, $R = +\infty$;
(c) $\sum z^n$, $R = 1$. If $|z| = 1$, the series diverges, since $z^n$ does not tend to 0 as $n \to \infty$;
(d) $\sum \dfrac{z^n}{n}$, $R = 1$.
On the circle of convergence, the series diverges at $z = 1$; it converges at all other points of $|z| = 1$. The last assertion follows from theorem 2.19.
(e) $\sum \dfrac{z^n}{n^2}$, $R = 1$. The series converges at all points of the circle $|z| = 1$, by the comparison test, since $\left| \dfrac{z^n}{n^2} \right| = \dfrac{1}{n^2}$ if $|z| = 1$. △

¹⁰ Jacques Hadamard, 1865-1963

3.2.4 Partial summation

Lemma 2.1. Given two sequences $(a_n)$, $(b_n)$, put
$$A_n = \sum_{k=0}^{n} a_k$$
if $n \ge 0$; put $A_{-1} = 0$. Then, if $0 \le p \le q$, we have
$$\sum_{n=p}^{q} a_n b_n = \sum_{n=p}^{q-1} A_n (b_n - b_{n+1}) + A_q b_q - A_{p-1} b_p. \tag{2.12}$$
The identity (2.12) is called the partial summation formula. It follows by writing $a_n = A_n - A_{n-1}$ and regrouping the terms.

Theorem 2.16. (Abel¹¹ theorem) Suppose there are given two sequences $(a_n)_n$ and $(b_n)_n$ such that
(i) the series $\sum_{n=1}^{\infty} |b_n - b_{n+1}|$ converges;
(ii) $\lim_{n \to \infty} b_n = 0$;
(iii) there exists a positive $\alpha$ such that for every $n \in \mathbb{N}^*$ and $m \in \mathbb{N}^*$, $m \ge n$, we have $|a_n + a_{n+1} + \dots + a_m| < \alpha$.
Then $\sum_{n=1}^{\infty} a_n b_n$ converges.

Proof. For every $n \in \mathbb{N}^*$ and $m \in \mathbb{N}^*$, $m \ge n$, we denote $\alpha_{n,m} = a_n + a_{n+1} + \dots + a_m$. Then, based on a variant of the partial summation formula (2.12), we have
$$a_n b_n + \dots + a_m b_m = \alpha_{n,n} b_n + (\alpha_{n,n+1} - \alpha_{n,n}) b_{n+1} + \dots + (\alpha_{n,m} - \alpha_{n,m-1}) b_m$$
$$= \alpha_{n,n} (b_n - b_{n+1}) + \dots + \alpha_{n,m-1} (b_{m-1} - b_m) + \alpha_{n,m} b_m. \tag{2.13}$$
Now, we choose a positive $\varepsilon$. For $\varepsilon/(2\alpha)$ we find an $n_\varepsilon \in \mathbb{N}^*$ that satisfies the following two requirements:
• it is greater than or equal to the rank supplied by the general criterion of Cauchy, theorem 2.1, applied to the convergent series $\sum_{n=1}^{\infty} |b_n - b_{n+1}|$ with $\varepsilon/(2\alpha)$ in place of $\varepsilon$;
• for every $n \ge n_\varepsilon$, $|b_n| < \varepsilon/(2\alpha)$.
For every $m \ge n \ge n_\varepsilon$ we have
• if $m = n$, $|a_n b_n| \le \alpha |b_n| < \alpha [\varepsilon/(2\alpha)] < \varepsilon$;
• if $m > n$, from (2.13) we get
$$\left| \sum_{k=n}^{m} a_k b_k \right| \le \alpha \big( |b_n - b_{n+1}| + \dots + |b_{m-1} - b_m| \big) + \alpha |b_m| < \alpha \frac{\varepsilon}{2\alpha} + \alpha \frac{\varepsilon}{2\alpha} = \varepsilon.$$
The conclusion follows from the general criterion of Cauchy, theorem 2.1. □

Remark. If $b_n \downarrow 0$, the assumption in theorem 2.16 that the series $\sum_{n=1}^{\infty} |b_n - b_{n+1}|$ converges is superfluous.
It may be dropped, since the partial sum of rank $n$ of the series $\sum_{n=1}^{\infty} |b_n - b_{n+1}|$ is equal to $b_1 - b_{n+1}$ which, in turn, tends to $b_1$. △

¹¹ Niels Henrik Abel, 1802-1829

Theorem 2.17. (Dirichlet¹² theorem) Suppose
(i) the partial sums $A_n$ of $\sum a_n$ form a bounded sequence;
(ii) $b_0 \ge b_1 \ge b_2 \ge \dots$;
(iii) $\lim_{n \to \infty} b_n = 0$.
Then $\sum a_n b_n$ converges.

Proof. It follows from theorem 2.16 and the previous remark: if $|A_n| \le M$ for all $n$, then $|a_n + \dots + a_m| = |A_m - A_{n-1}| \le 2M$, so hypothesis (iii) of theorem 2.16 holds with any $\alpha > 2M$. □

Theorem 2.18. Suppose
(i) $|c_1| \ge |c_2| \ge \dots$;
(ii) $c_{2m-1} \ge 0$, $c_{2m} \le 0$, $m = 1, 2, \dots$;
(iii) $\lim_{n \to \infty} c_n = 0$.
Then $\sum c_n$ converges.

Proof. Apply theorem 2.17 with $a_n = (-1)^{n+1}$, $b_n = |c_n|$. □

Example 2.1. For $p > 0$ the series
$$1 - \frac{1}{2^p} + \frac{1}{3^p} - \frac{1}{4^p} + \dots + (-1)^{n+1} \frac{1}{n^p} + \dots$$
converges. For $p = 1$, based on corollary 1.8, we can write the Leibniz¹³ series
$$1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \dots + (-1)^{n+1} \frac{1}{n} + \dots = \ln 2. \tag{2.14}$$
△

Theorem 2.19. Suppose the radius of convergence of $\sum c_n z^n$ is 1, and suppose $c_0 \ge c_1 \ge c_2 \ge \dots$, $\lim_{n \to \infty} c_n = 0$. Then $\sum c_n z^n$ converges at every point of the circle $|z| = 1$, except possibly at $z = 1$.

Proof. Put $a_n = z^n$, $b_n = c_n$. The hypotheses of theorem 2.17 are satisfied, since
$$|A_n| = \left| \sum_{k=0}^{n} z^k \right| = \left| \frac{1 - z^{n+1}}{1 - z} \right| \le \frac{2}{|1 - z|},$$
if $|z| = 1$, $z \ne 1$. □

¹² Gustav Lejeune Dirichlet, 1805-1859
¹³ Gottfried Wilhelm Leibniz, 1646-1716

3.2.5 Absolutely and conditionally convergent series

Consider
$$\sum a_n. \tag{2.15}$$
Series (2.15) with complex terms is said to be absolutely convergent if the series
$$\sum |a_n| \tag{2.16}$$
converges.

Theorem 2.20. The convergence of series (2.16) implies the convergence of series (2.15).

Proof. We apply the Cauchy general criterion, theorem 2.1. For arbitrary, but fixed, $\varepsilon > 0$ there exists a rank $n_\varepsilon$ such that for all ranks $n$ and $m$ with $m \ge n \ge n_\varepsilon$ it holds
$$\left| \sum_{k=n}^{m} |a_k| \right| = \sum_{k=n}^{m} |a_k| < \varepsilon.$$
But using the triangle inequality, we get
$$\left| \sum_{k=n}^{m} a_k \right| \le \sum_{k=n}^{m} |a_k| < \varepsilon.$$
Now, we apply once again theorem 2.1. □

The converse statement is not true.
It is enough to consider the series (2.14) and the series in theorem 2.10 for $p = 1$.

Remarks. It is obvious that a convergent series with nonnegative terms is absolutely convergent, too. But there are convergent series (with arbitrary terms) which are not absolutely convergent. Indeed, as we already saw, the series in example 2.1 is convergent for $p > 0$, but it is absolutely convergent only for $p > 1$, by theorem 2.10. △

Series (2.15) is said to be conditionally convergent if it converges whereas the corresponding series of absolute values (2.16) diverges.

A very important property of a sum of finitely many real (complex) summands is commutativity, that is, a rearrangement of the terms does not affect their sum. Unfortunately, this is not the case for series. Consider the convergent series (2.14), i.e.,
$$1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \dots + (-1)^{n+1} \frac{1}{n} + \dots .$$
We write it, based on theorem 2.4, as
$$\Big( 1 - \frac{1}{2} \Big) + \Big( \frac{1}{3} - \frac{1}{4} \Big) + \dots + \Big( \frac{1}{2k-1} - \frac{1}{2k} \Big) + \dots \tag{2.17}$$
and we notice, as we already saw, that its sum is $s := \ln 2$. Now we rearrange series (2.14) according to the rule: two negative terms after a positive one. We find
$$\Big( 1 - \frac{1}{2} - \frac{1}{4} \Big) + \Big( \frac{1}{3} - \frac{1}{6} - \frac{1}{8} \Big) + \dots + \Big( \frac{1}{2k-1} - \frac{1}{4k-2} - \frac{1}{4k} \Big) + \dots . \tag{2.18}$$
Denote by $S_n$ the partial sum of rank $n$ of series (2.18), taken without brackets, and by $s_n$ the partial sum of rank $n$ of series (2.14). Then we have
$$S_{3n} = \sum_{k=1}^{n} \Big( \frac{1}{2k-1} - \frac{1}{4k-2} - \frac{1}{4k} \Big) = \sum_{k=1}^{n} \Big( \frac{1}{4k-2} - \frac{1}{4k} \Big) = \frac{1}{2} \sum_{k=1}^{n} \Big( \frac{1}{2k-1} - \frac{1}{2k} \Big) = \frac{1}{2} s_{2n}.$$
Thus we can write
$$S_{3n} = \frac{1}{2} s_{2n}. \tag{2.19}$$
Also
$$S_{3n-1} = \frac{1}{2} s_{2n} + \frac{1}{4n} \quad \text{and} \quad S_{3n-2} = \frac{1}{2} s_{2n} + \frac{1}{4n} + \frac{1}{4n-2}. \tag{2.20}$$
From (2.19) and (2.20) we conclude that
$$\lim_{n \to \infty} S_{3n} = \lim_{n \to \infty} S_{3n-1} = \lim_{n \to \infty} S_{3n-2} = \frac{1}{2} \lim_{n \to \infty} s_{2n} = \frac{s}{2}.$$
Thus we have proved that series (2.18) converges and its sum is equal to $s/2$. Hence, by rearranging a conditionally convergent series we obtained a convergent series whose sum differs from the sum of the initial series.

The previous example illustrates that commutativity is no longer valid for arbitrary series.
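The identity (2.19) and the limit $s/2$ can be checked numerically. The following is a minimal sketch, not part of the original notes; the function names `rearranged_partial_sum` and `alternating_partial_sum` are illustrative choices, and floating-point arithmetic only approximates the exact sums.

```python
import math


def rearranged_partial_sum(n_groups: int) -> float:
    """S_{3n}: sum of the first n groups 1/(2k-1) - 1/(4k-2) - 1/(4k) of series (2.18)."""
    total = 0.0
    for k in range(1, n_groups + 1):
        total += 1.0 / (2 * k - 1) - 1.0 / (4 * k - 2) - 1.0 / (4 * k)
    return total


def alternating_partial_sum(n_terms: int) -> float:
    """s_n: partial sum of rank n of the alternating series (2.14)."""
    return sum((-1) ** (k + 1) / k for k in range(1, n_terms + 1))


n = 100_000
S_3n = rearranged_partial_sum(n)
s_2n = alternating_partial_sum(2 * n)
# The first two printed values agree up to rounding, as (2.19) predicts;
# both are close to ln(2)/2, the third printed value.
print(S_3n, 0.5 * s_2n, 0.5 * math.log(2))
```

Increasing `n` brings the first two printed values closer to $\frac{1}{2} \ln 2$, in agreement with the conclusion that series (2.18) has sum $s/2$.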
But let us see the positive case.

Theorem 2.21. (Cauchy, on rearrangement of absolutely convergent series) Rearranging the terms of an absolutely convergent series yields another absolutely convergent series having the same sum as the original one.

Proof. Consider
$$\sum^{\infty} a_n \tag{2.21}$$
an absolutely convergent series. Further, consider the positive, respectively the negative, parts of its terms, more precisely,
$$a_n^+ = \begin{cases} a_n, & a_n \ge 0 \\ 0, & a_n < 0 \end{cases} \qquad a_n^- = \begin{cases} -a_n, & a_n \le 0 \\ 0, & a_n > 0. \end{cases} \tag{2.22}$$
We have
$$a_n = a_n^+ - a_n^-. \tag{2.23}$$
Consider, now, the following two series with nonnegative terms
$$\sum a_n^+ \quad \text{and} \quad \sum a_n^-. \tag{2.24}$$
Both series in (2.24) converge by the comparison test, since $a_n^+ \le |a_n|$ and $a_n^- \le |a_n|$.
Let $\sum b_n$ be a rearrangement of the series (2.21). For it construct the series $\sum b_n^+$ of positive parts, respectively the series $\sum b_n^-$ of negative parts, as in (2.22). Then
$$\sum^{\infty} a_n = \sum^{\infty} (a_n^+ - a_n^-) = \sum^{\infty} a_n^+ - \sum^{\infty} a_n^- = \sum^{\infty} b_n^+ - \sum^{\infty} b_n^- = \sum^{\infty} (b_n^+ - b_n^-) = \sum^{\infty} b_n.$$
The first equality is obvious, the second equality follows from theorem 2.3, and the third one follows from theorem 2.6, since $\sum b_n^{\pm}$ is a rearrangement of $\sum a_n^{\pm}$. For the other equalities we argue in the same way. Finally, $\sum |b_n|$ is a rearrangement of $\sum |a_n|$, so it converges by theorem 2.6. □

Corollary 2.1. Suppose the series $\sum a_n$ is absolutely convergent. Then the series $\sum a_n^+$ and $\sum a_n^-$ are absolutely convergent, too.

The converse statement is true, since $|a_n| = a_n^+ + a_n^-$ and the sum of two convergent series is a convergent series, theorem 2.3.

Corollary 2.2. For the series $\sum a_n$ to be absolutely convergent it is necessary and sufficient that both series (2.24) generated by it be convergent.

Lemma 2.2. If the series $\sum a_n$ is convergent but not absolutely, both series (2.24) generated by it are divergent, but $a_n^+ \to 0$ and $a_n^- \to 0$.

Proof. From corollary 2.2 it follows that at least one of the series (2.24) generated by it is divergent, say $\sum a_n^+ = \infty$, since $a_n^+ \ge 0$. By (2.23),
$$\sum^{n} a_k^- = \sum^{n} a_k^+ - \sum^{n} a_k. \tag{2.25}$$
We examine the behaviour of the right-hand side of (2.25) as $n \to \infty$.
The second sum tends to a finite number, since the series $\sum a_n$ is convergent. The first sum increases to $\infty$. Therefore the sum on the left-hand side of (2.25) increases to $\infty$ as $n \to \infty$. In conclusion, under our assumptions, if one of the series (2.24) is divergent, the other is divergent, too.
The sequences $(a_n^{\pm})$ tend to zero since the series $\sum a_n$ is convergent, so that $a_n \to 0$ and $0 \le a_n^{\pm} \le |a_n|$. □

Theorem 2.22. (Riemann¹⁴, on rearrangement of conditionally convergent series) Let $\sum_{0}^{\infty} a_n$ and $\sum_{0}^{\infty} b_n$ be two divergent series with positive terms whose general terms tend to zero, i.e., $a_n, b_n \to 0$ as $n \to \infty$. Then for any $s \in [-\infty, \infty]$ we can construct a series
$$a_0 + a_1 + \dots + a_{m_1} - b_0 - b_1 - \dots - b_{n_1} + a_{m_1+1} + a_{m_1+2} + \dots + a_{m_2} - b_{n_1+1} - b_{n_1+2} - \dots - b_{n_2} + \dots \tag{2.26}$$
whose sum is equal to $s$. The series (2.26) contains every term of $\sum_{0}^{\infty} a_n$ and of $\sum_{0}^{\infty} b_n$ exactly once.

¹⁴ Bernhard Riemann, 1826-1866

Proof. First we suppose that $s \in \mathbb{R}$. The indices $m_1, m_2, \dots$ and $n_1, n_2, \dots$ can be chosen in this case as the smallest natural numbers for which the corresponding inequalities written below are fulfilled:
(i) $\alpha_1 = \sum_{k=0}^{m_1} a_k > s$,
(ii) $\alpha_2 = \alpha_1 - \sum_{k=0}^{n_1} b_k < s$,
(iii) $\alpha_3 = \alpha_2 + \sum_{k=m_1+1}^{m_2} a_k > s$,
(iv) $\alpha_4 = \alpha_3 - \sum_{k=n_1+1}^{n_2} b_k < s$, $\dots$.
At the $p$th step of this construction we can indeed choose the natural numbers $m_p$ and $n_p$ satisfying the $p$th inequality, since the series $\sum_{0}^{\infty} a_n$ and $\sum_{0}^{\infty} b_n$ have positive terms and diverge. The fact that the series thus constructed converges to $s$ follows from the above inequalities and from the assumption that $a_n, b_n \to 0$ as $n \to \infty$.
Now, suppose that $s = +\infty$. We can replace $s$ on the right-hand side of the inequalities (i), (ii), (iii), $\dots$ by a sequence diverging to $+\infty$ of the form $2, 1, 4, 3, 6, 5, \dots$. The case $s = -\infty$ is treated similarly. □

Corollary 2.3. Let $\sum a_n$ be a convergent series but not absolutely. Choose an $s \in [-\infty, \infty]$. Then there is a rearrangement of the series such that the resulting series has the sum $s$.

Proof.
We just split the series $\sum a_n$ into the two series $\sum a_n^+$ and $\sum a_n^-$ as indicated by (2.22), then apply lemma 2.2 and theorem 2.22. □

3.2.6 Exercises

Study the convergence of the following series and find their sum, if any:

∞ √ X n ∞ X 1 ; 2n − 1 n=1 n ; n (n + 1)2 n=1 3 ∞ √ X n2 ; n n=1 ∞ X n+1 ∞ X sin(nx) n=1 ∞ X n=1 n ; n2 3n ; ∞ X 2 ; 3n n=1 1 ; n(n + 1) n=1 ∞ X 1 arctg 2 ; 2n n=1 2 n + 5n + 6 ; ln + n 5n + 4 n=1 ∞ X 1 π sin , p ∈ R; p n n n=1 ∞ X ; ∞ X cos(nx) − cos(n + 1)x ∞ X ∞ X n=1 5n ln n ; 4 n(ln n + 1) n=1 n (−1)n−1 1 2+ ; n+1 n n=1 ; n=1 ∞ X ∞ X √ √ √ 3 3 3 ( n + 2 − n + 1 + n); n=1 ∞ X p arcsin n=1 ∞ X 2 √ (n + 1)2 − 1 − n2 − 1 ; n(n + 1) (n + 1)n ; n2 3n n n=1 √ ∞ − X e √ n=1 n n ;

Chapter 4 Euclidean spaces

Some basic facts regarding Euclidean spaces are presented in this chapter.

4.1 Euclidean spaces

For each positive integer $k$, let $\mathbb{R}^k$ be the set of all ordered $k$-tuples
$$x = (x_1, x_2, \dots, x_k),$$
where $x_1, \dots, x_k$ are real numbers, called coordinates of $x$. The elements of $\mathbb{R}^k$ are called points or vectors, especially when $k > 1$. If $y = (y_1, \dots, y_k)$ and $\alpha$ is a real number, let
$$x + y = (x_1 + y_1, x_2 + y_2, \dots, x_k + y_k), \qquad + : \mathbb{R}^k \times \mathbb{R}^k \to \mathbb{R}^k;$$
$$\alpha x = (\alpha x_1, \alpha x_2, \dots, \alpha x_k), \qquad \cdot : \mathbb{R} \times \mathbb{R}^k \to \mathbb{R}^k.$$
These define addition of vectors and, respectively, multiplication of a vector by a real number (a scalar).

Theorem 1.1.
(a) $(\mathbb{R}^k, +)$ is a commutative group;
(b) $\alpha(x + y) = \alpha x + \alpha y$, for every $\alpha \in \mathbb{R}$, $x, y \in \mathbb{R}^k$;
(c) $(\alpha + \beta)x = \alpha x + \beta x$, for every $\alpha, \beta \in \mathbb{R}$, $x \in \mathbb{R}^k$;
(d) $\alpha(\beta x) = (\alpha\beta)x$, for every $\alpha, \beta \in \mathbb{R}$, $x \in \mathbb{R}^k$.

These two operations make $\mathbb{R}^k$ into a vector (linear) space over the field of the reals. The zero element 0 of $\mathbb{R}^k$ is called the origin or the null vector, and all its coordinates are 0.

Proposition 1.1. (Calculus rules in a linear space)
(a) $0 \cdot x = 0$, $\forall x \in \mathbb{R}^k$ (the first 0 is a scalar, while the second is a vector);
(b) $\alpha \cdot 0 = 0$, $\forall \alpha \in \mathbb{R}$ (both zeros are the null vector);
(c) $(-1)x = (-x_1, \dots, -x_k)$, $\forall x \in \mathbb{R}^k$;
(d) $\alpha(x_1 + \dots + x_n) = \alpha x_1 + \dots + \alpha x_n$, for every $\alpha \in \mathbb{R}$ and $x_i \in \mathbb{R}^k$, $i = 1, 2, \dots, n$;
(e) $(\alpha_1 + \dots + \alpha_n)x = \alpha_1 x + \dots + \alpha_n x$, for every $\alpha_i \in \mathbb{R}$, $i = 1, 2, \dots, n$, and $x \in \mathbb{R}^k$.

We define the inner product of $x, y \in \mathbb{R}^k$ by
$$\langle x, y \rangle = \sum_{i=1}^{k} x_i y_i$$
and the Euclidean norm of $x$ by
$$\|x\|_2 = \sqrt{\langle x, x \rangle} = \Big( \sum_{i=1}^{k} x_i^2 \Big)^{1/2}.$$
The Minkowski norm or the $l_1$-norm of $x$ is defined by
$$\|x\|_1 = |x_1| + \dots + |x_k|,$$
while the uniform norm of $x$ is defined by
$$\|x\|_\infty = \max\{|x_1|, \dots, |x_k|\}.$$
For $1 \le p < +\infty$ we define the $l_p$-norm of $x$ by
$$\|x\|_p = (|x_1|^p + \dots + |x_k|^p)^{1/p}.$$
In order to indicate which norm we are referring to, we write $(\mathbb{R}^k, \|\cdot\|_p)$ and $(\mathbb{R}^k, \|\cdot\|_\infty)$.

Theorem 1.2. Let $\|\cdot\|$ be any of the norms defined above on $\mathbb{R}^k$. Suppose $x, y, z \in \mathbb{R}^k$, and $\alpha \in \mathbb{R}$. Then
(a) $\|x\| \ge 0$;
(b) $\|x\| = 0 \iff x = 0$;
(c) $\|\alpha x\| = |\alpha| \|x\|$;
(d) $|\langle x, y \rangle| \le \|x\|_2 \|y\|_2$;
(e) $\|x + y\| \le \|x\| + \|y\|$;
(f) $\|x - z\| \le \|x - y\| + \|y - z\|$.

Proof. (a), (b), and (c) are trivial.
(d) If $y = 0$, we have equality. Suppose $y \ne 0$ and consider a real $\lambda$. Then
$$0 \le \langle x + \lambda y, x + \lambda y \rangle = \langle x, x \rangle + 2\lambda \langle x, y \rangle + \lambda^2 \langle y, y \rangle.$$
Taking $\lambda = -\dfrac{\langle x, y \rangle}{\langle y, y \rangle}$ we get
$$0 \le \langle x, x \rangle - 2 \frac{\langle x, y \rangle^2}{\langle y, y \rangle} + \frac{\langle x, y \rangle^2}{\langle y, y \rangle^2} \langle y, y \rangle = \frac{\langle x, x \rangle \langle y, y \rangle - \langle x, y \rangle^2}{\langle y, y \rangle},$$
and (d) follows.
(e) If the norm is an $l_p$-norm, $p \ge 1$, then our inequality is actually the Minkowski inequality (see page 23). If the norm under consideration is the uniform one, then a straightforward evaluation proves this inequality.
(f) follows from (e). □

Any function $\|\cdot\| : \mathbb{R}^k \to \mathbb{R}$ satisfying (a), (b), (c), and (e) of theorem 1.2 is said to be a norm on $\mathbb{R}^k$. Given a norm on $\mathbb{R}^k$, we assign to it a metric by
$$\rho(x, y) = \|x - y\|. \tag{1.1}$$

Theorem 1.3. If $\|\cdot\|$ is a norm on $\mathbb{R}^k$, then the function $\rho$ defined by (1.1) is a metric on $\mathbb{R}^k$; thus every norm induces a metric.

Theorem 1.4. Any two metrics on $\mathbb{R}^k$ induced by any of the norms $\|\cdot\|_p$, $p \ge 1$, or $\|\cdot\|_\infty$ generate the same open sets.

Corollary 1.1.
A sequence is convergent in one of the metrics mentioned in the above theorem if and only if it converges in any other of them.

Theorem 1.5. (a) Suppose $x_n \in \mathbb{R}^k$, $n = 1, 2, \dots$, and
$$x_n = (\alpha_{1n}, \alpha_{2n}, \dots, \alpha_{kn}).$$
Then $(x_n)$ converges to $x = (\alpha_1, \dots, \alpha_k)$ if and only if
$$\lim_{n \to \infty} \alpha_{jn} = \alpha_j, \quad 1 \le j \le k. \tag{1.2}$$
(b) Suppose $(x_n)_n$, $(y_n)_n$ are sequences in $\mathbb{R}^k$, $(\beta_n)_n$ is a sequence of real numbers, and $x_n \to x$, $y_n \to y$, $\beta_n \to \beta$. Then
$$\lim (x_n + y_n) = x + y, \qquad \lim \langle x_n, y_n \rangle = \langle x, y \rangle, \qquad \lim \beta_n x_n = \beta x.$$

Proof. We use the Euclidean metric.
(a) If $x_n \to x$, the inequalities
$$|\alpha_{jn} - \alpha_j| \le \|x_n - x\|_2, \quad j = 1, \dots, k,$$
follow immediately from the definition of the norm in $\mathbb{R}^k$. These imply that (1.2) holds.
Conversely, if (1.2) holds, then to each $\varepsilon > 0$ corresponds an integer $n_\varepsilon$ such that $n \ge n_\varepsilon$ implies
$$|\alpha_{jn} - \alpha_j| < \frac{\varepsilon}{\sqrt{k}}, \quad 1 \le j \le k.$$
Hence $n \ge n_\varepsilon$ implies
$$\|x_n - x\|_2 = \Big( \sum_{j=1}^{k} |\alpha_{jn} - \alpha_j|^2 \Big)^{1/2} < \varepsilon,$$
so that $x_n \to x$. This proves (a).
(b) follows from (a) and from theorem 1.2, page 38. □

Theorem 1.6. Each of the spaces $(\mathbb{R}^k, \|\cdot\|_p)$, $1 \le p < \infty$, and $(\mathbb{R}^k, \|\cdot\|_\infty)$ is complete.

We consider the case of $(\mathbb{R}^k, \|\cdot\|_2)$. The other cases can be proved similarly. We first prove the following claim: a sequence $(x_n)_n = (\alpha_{1n}, \dots, \alpha_{kn})_n$ in $\mathbb{R}^k$ is $\|\cdot\|_2$-Cauchy if and only if every sequence $(\alpha_{jn})_n$, $1 \le j \le k$, is a Cauchy sequence.

Proof of the claim. Suppose $(x_n)$ is $\|\cdot\|_2$-Cauchy. Then for every $\varepsilon > 0$ there is an integer $n_\varepsilon$ such that $n, m \ge n_\varepsilon$ imply $\|x_n - x_m\|_2 < \varepsilon$. It follows that
$$|\alpha_{jn} - \alpha_{jm}| \le \|x_n - x_m\|_2 < \varepsilon, \quad j = 1, \dots, k.$$
Conversely, suppose that for every $\varepsilon > 0$ there is an integer $n_\varepsilon$ such that $n, m \ge n_\varepsilon$ imply
$$|\alpha_{jn} - \alpha_{jm}| < \frac{\varepsilon}{\sqrt{k}}, \quad j = 1, \dots, k.$$
Hence, $n, m \ge n_\varepsilon$ imply $\|x_n - x_m\|_2 < \varepsilon$.

Proof. Suppose $(x_n)$ is $\|\cdot\|_2$-Cauchy. Then every sequence $(\alpha_{jn})_n$, $1 \le j \le k$, is Cauchy, hence convergent to, say, $\alpha_j \in \mathbb{R}$. Based on (a) of theorem 1.5, we infer that $(x_n)$ converges to $x = (\alpha_1, \dots, \alpha_k)$. □
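Theorem 1.5 (a) and corollary 1.1 can be illustrated numerically. The sketch below is not from the original notes: the helper `norm_p` and the sample sequence `x_n` in $\mathbb{R}^3$ are hypothetical choices made for the illustration. It prints the distance from $x_n$ to the limit $(0, 1, 2)$ in the $l_1$, $l_2$ and uniform norms.

```python
import math


def norm_p(x, p):
    """l_p norm of a vector x for 1 <= p < inf; p = math.inf gives the uniform norm."""
    if p == math.inf:
        return max(abs(t) for t in x)
    return sum(abs(t) ** p for t in x) ** (1.0 / p)


def x_n(n):
    """Hypothetical sequence in R^3 whose coordinates tend to 0, 1 and 2."""
    return (1.0 / n, 1.0 + 1.0 / n**2, 2.0 - 1.0 / n)


limit = (0.0, 1.0, 2.0)
for n in (1, 10, 100, 1000):
    diff = [a - b for a, b in zip(x_n(n), limit)]
    # All three distances shrink together as n grows.
    print(n, norm_p(diff, 1), norm_p(diff, 2), norm_p(diff, math.inf))
```

The three columns decrease together because $\|x\|_\infty \le \|x\|_2 \le \|x\|_1 \le k \|x\|_\infty$ on $\mathbb{R}^k$, which is the quantitative reason why the notion of convergence does not depend on the chosen norm.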
A closed and bounded interval $I$ in $\mathbb{R}^k$ is defined as
$$I = [a_1, b_1] \times \dots \times [a_k, b_k],$$
where each $[a_i, b_i]$ is a closed and bounded interval of the real axis.

Theorem 1.7. Every closed and bounded interval $I$ in $\mathbb{R}^k$ is compact.

Theorem 1.8. Let $E$ be a set in $\mathbb{R}^k$. Then the following statements are equivalent:
(a) $E$ is closed and bounded;
(b) $E$ is compact;
(c) every infinite subset of $E$ has a limit point in $E$.

Theorem 1.9. (Weierstrass) Every bounded infinite subset of $\mathbb{R}^k$ has a limit point in $\mathbb{R}^k$.

Chapter 5 Limits and Continuity

The aim of this chapter is to introduce notions and results on limits and continuity.

5.1 Limits

5.1.1 The limit of a function

Let $X$ and $Y$ be metric spaces; suppose $\emptyset \ne E \subset X$, $f$ maps $E$ into $Y$, and $p$ is a limit point of $E$. We write $f(x) \to q$ as $x \to p$ or, equivalently,
$$\lim_{x \to p} f(x) = q \tag{1.1}$$
if there is a point $q \in Y$ with the following property: for every $\varepsilon > 0$ there exists a $\delta > 0$ such that
$$\rho(f(x), q) < \varepsilon \tag{1.2}$$
for all points $x \in E$ for which
$$0 < \rho(x, p) < \delta. \tag{1.3}$$

Remark. It should be noted that $p \in X$, but that $p$ need not be a point of $E$. Moreover, even if $p \in E$, we may very well have $f(p) \ne \lim_{x \to p} f(x)$. △

We can recast this definition in terms of sequences.

Theorem 1.1. Let $X$, $Y$, $E$, $f$, and $p$ be as in the definition given above. Then
$$\lim_{x \to p} f(x) = q \tag{1.4}$$
if and only if
$$\lim_{n \to \infty} f(p_n) = q \tag{1.5}$$
for every sequence $(p_n)_n$ in $E$ such that
$$p_n \ne p, \quad \lim_{n \to \infty} p_n = p. \tag{1.6}$$

Proof. Suppose (1.4) holds. Choose $(p_n)$ in $E$ satisfying (1.6). Let $\varepsilon > 0$ be given. Then there exists $\delta > 0$ such that $\rho(f(x), q) < \varepsilon$ if $x \in E$ and $0 < \rho(x, p) < \delta$. Also, there exists $n_\varepsilon$ such that $n \ge n_\varepsilon$ implies $0 < \rho(p_n, p) < \delta$. Thus for $n \ge n_\varepsilon$ we have $\rho(f(p_n), q) < \varepsilon$, which shows that (1.5) holds.
Conversely, suppose (1.4) is false. Then there exists some $\varepsilon > 0$ such that for every $\delta > 0$ there is a point $x \in E$ (depending on $\delta$) for which $\rho(f(x), q) \ge \varepsilon$ but $0 < \rho(x, p) < \delta$.
Taking $\delta_n = 1/n$, $n \in \mathbb{N}^*$, we thus find a sequence in $E$ satisfying (1.6) for which (1.5) is false. □

Corollary 1.1. If $f$ has a limit at $p$, this limit is unique.

Proof. Follows from theorem 1.1 (b) on page 37 and from theorem 1.1 above. □

Suppose $Y = \mathbb{R}^k$ and $f, g : E \to \mathbb{R}^k$. Define $f + g$ and $\langle f, g \rangle$ by
$$(f + g)(x) = f(x) + g(x), \qquad \langle f, g \rangle (x) = \langle f(x), g(x) \rangle, \qquad x \in E,$$
and if $\lambda$ is a real number, $(\lambda f)(x) = \lambda f(x)$.

Theorem 1.2. Suppose $X$ is a metric space, $E \subset X$, $p \in E'$ (the set of limit points of $E$), $f, g$ are functions defined on $E$ with values in $\mathbb{R}^k$ and
$$\lim_{x \to p} f(x) = A, \qquad \lim_{x \to p} g(x) = B.$$
Then
(a) $\lim_{x \to p} (f + g)(x) = A + B$;
(b) $\lim_{x \to p} \langle f, g \rangle (x) = \langle A, B \rangle$;
(c) if $Y = \mathbb{R}$ and $B \ne 0$, then $\displaystyle \lim_{x \to p} \frac{f}{g}(x) = \frac{A}{B}$.

Proof. In view of theorem 1.1, these assertions follow from the analogous properties of sequences. □

5.1.2 Right-hand side and left-hand side limits

Let $f$ be defined on $]a, b[$. Consider any point $x$ such that $a \le x < b$. We write
$$f(x+) = q$$
if $f(t_n) \to q$ as $n \to \infty$ for all sequences $(t_n)$ in $]x, b[$ such that $t_n \to x$. Consider any point $x$ such that $a < x \le b$. We write
$$f(x-) = q$$
if $f(t_n) \to q$ as $n \to \infty$ for all sequences $(t_n)$ in $]a, x[$ such that $t_n \to x$.

Theorem 1.3. Let $I$ be a nonempty interval and $p \in I$. If $f : I \to \mathbb{R}$, then
$$\lim_{x \to p} f(x) = q \iff f(p-) = f(p+) = q.$$

Proof. The necessity part is immediate. Suppose now that $f(p-) = f(p+) = q$. We use the characterization in theorem 1.1 and consider a sequence $(x_n)$ in $I$, $x_n \ne p$, $x_n \to p$. Suppose it has infinitely many terms greater than $p$ (denoted by $(y_n)$) and infinitely many terms less than $p$ (denoted by $(z_n)$). By hypothesis, $f(y_n) \to q$ and $f(z_n) \to q$. Then $f(x_n) \to q$. If all but finitely many terms lie on one side of $p$, the conclusion follows directly from the hypothesis. □

5.2 Continuity

Suppose $X$ and $Y$ are metric spaces, $E \subset X$, $p \in E$, and $f$ maps $E$ into $Y$. Then $f$ is said to be continuous at $p$ if for every $\varepsilon > 0$ there exists a $\delta > 0$ such that
$$\rho(f(x), f(p)) < \varepsilon$$
for all points $x \in E$ for which $\rho(x, p) < \delta$. If $f$ is continuous at every point of $E$, then $f$ is said to be continuous on $E$.

Remark. It should be noted that $f$ has to be defined at the point $p$ in order to be continuous at $p$.
If $p$ is an isolated point of $E$, then our definition implies that every function $f$ which has $E$ as its domain of definition is continuous at $p$. △

Theorem 2.1. In the above setting, assume also that $p$ is a limit point of $E$. Then $f$ is continuous at $p$ if and only if $\lim_{x \to p} f(x) = f(p)$.

Proof. This is clear if we compare the definition of the limit with the definition of continuity. □

Now we prove that a continuous function of a continuous function is continuous; more precisely, the following theorem holds.

Theorem 2.2. Suppose $X$, $Y$, and $Z$ are metric spaces, $E \subset X$, $f$ maps $E$ into $Y$, $g$ maps the range of $f$, $f(E)$, into $Z$, and $h$ maps $E$ into $Z$ and is defined by
$$h(x) = g(f(x)), \quad x \in E.$$
If $f$ is continuous at a point $p \in E$ and if $g$ is continuous at the point $f(p)$, then $h$ is continuous at $p$.

Proof. Let $\varepsilon > 0$ be given. Since $g$ is continuous at $f(p)$, there exists $\eta > 0$ such that if $\rho(y, f(p)) < \eta$ and $y \in f(E)$, we have $\rho(g(y), g(f(p))) < \varepsilon$. Since $f$ is continuous at $p$, there exists $\delta > 0$ such that $\rho(f(x), f(p)) < \eta$ if $\rho(x, p) < \delta$ and $x \in E$. It follows that
$$\rho(h(x), h(p)) = \rho(g(f(x)), g(f(p))) < \varepsilon$$
if $\rho(x, p) < \delta$ and $x \in E$. Thus $h$ is continuous at $p$. □

Theorem 2.3. (Characterization of continuity) A mapping $f$ of a metric space $X$ into a metric space $Y$ is continuous on $X$ if and only if $f^{-1}(V)$ is open in $X$ for every open set $V$ in $Y$.

Proof. Suppose $f$ is continuous on $X$ and $V$ is an open set in $Y$. We have to show that every point of $f^{-1}(V)$ is an interior point of $f^{-1}(V)$. Suppose $p \in X$, $f(p) \in V$. Since $V$ is open, there exists $\varepsilon > 0$ such that $y \in V$ if $\rho(f(p), y) < \varepsilon$, and since $f$ is continuous at $p$ there exists $\delta > 0$ such that $\rho(f(x), f(p)) < \varepsilon$ if $\rho(x, p) < \delta$. Thus $x \in f^{-1}(V)$ as soon as $\rho(x, p) < \delta$.
Conversely, suppose $f^{-1}(V)$ is open in $X$ for every open $V$ in $Y$. Let $p \in X$ and $\varepsilon > 0$. Let $V = \{y \in Y \mid \rho(y, f(p)) < \varepsilon\}$. Then $V$ is open; hence $f^{-1}(V)$ is open; hence, there exists $\delta > 0$ such that $x \in f^{-1}(V)$ as soon as $\rho(p, x) < \delta$. However, if $x \in f^{-1}(V)$, then $f(x) \in V$, so that $\rho(f(x), f(p)) < \varepsilon$. □

Theorem 2.4.
Let $f$ and $g$ be complex-valued continuous functions on a metric space $X$. Then $f + g$, $fg$, and $f/g$ are continuous on $X$ (in the last case we assume $g(x) \ne 0$ for all $x \in X$).

Proof. At isolated points of $X$ there is nothing to prove. At limit points, the statement follows from theorems 1.2 and 2.1. □

Theorem 2.5. (a) Let $f_1, \dots, f_k$ be real functions on a metric space $X$, and let $f$ be the mapping of $X$ into $\mathbb{R}^k$ defined by
$$f(x) = (f_1(x), \dots, f_k(x)), \quad x \in X.$$
Then $f$ is continuous if and only if each of the functions $f_1, \dots, f_k$ is continuous.
(b) If $f$ and $g$ are continuous mappings of $X$ into $\mathbb{R}^k$, then $f + g$ and $\langle f, g \rangle$ are continuous on $X$.

The functions $f_1, \dots, f_k$ are called the components of $f$.

Proof. Part (a) follows from the inequalities
$$|f_j(x) - f_j(y)| \le \|f(x) - f(y)\|_2 = \Big( \sum_{i=1}^{k} |f_i(x) - f_i(y)|^2 \Big)^{1/2},$$
for $j = 1, \dots, k$. Part (b) follows from (a) and theorem 1.2. □

5.2.1 Continuity and compactness

A mapping $f$ of a set $E$ into $\mathbb{R}^k$ is said to be bounded if there is a real number $M$ such that $\|f(x)\| < M$ for all $x \in E$.

Theorem 2.6. Suppose $f$ is a continuous mapping from a compact metric space $X$ into a metric space $Y$. Then $f(X)$ is compact.

Proof. Let $\{V_\alpha\}$ be an open covering of $f(X)$. Since $f$ is continuous, each of the sets $f^{-1}(V_\alpha)$ is open (see theorem 2.3). Since $X$ is compact, there are finitely many indices, say $\alpha_1, \dots, \alpha_n$, such that
$$X \subset f^{-1}(V_{\alpha_1}) \cup f^{-1}(V_{\alpha_2}) \cup \dots \cup f^{-1}(V_{\alpha_n}). \tag{2.1}$$
Since $f(f^{-1}(E)) \subset E$ for every $E \subset Y$, (2.1) implies that
$$f(X) \subset V_{\alpha_1} \cup V_{\alpha_2} \cup \dots \cup V_{\alpha_n}.$$
This completes the proof. □

Theorem 2.7. Suppose $f$ is a continuous mapping from a compact metric space $X$ into $\mathbb{R}^k$. Then $f(X)$ is closed and bounded. Thus $f$ is bounded.

Theorem 2.8. Suppose $f$ is a continuous real function on a compact metric space $X$, and
$$M = \sup_{p \in X} f(p), \qquad m = \inf_{p \in X} f(p).$$
Then there exist points $p, q \in X$ such that $f(p) = M$, $f(q) = m$.

Proof.
By theorem 2.7, $f(X)$ is a closed and bounded set of real numbers; hence $f(X)$ contains its sup and inf (see theorem 1.5, page 30). □

Theorem 2.9. Suppose $f$ is a continuous bijective mapping of a compact metric space $X$ onto a metric space $Y$. Then the inverse function $f^{-1}$ defined on $Y$ is a continuous mapping of $Y$ onto $X$.

Proof. Applying theorem 2.3 to $f^{-1}$ in place of $f$, we see that it suffices to prove that $f(V)$ is an open set in $Y$ for every open set $V$ in $X$. Fix such a set $V$. The complement $X \setminus V$ is closed in $X$, hence compact (theorem 2.3, page 33). It follows that $f(X \setminus V)$ is a compact subset of $Y$, which implies that it is closed in $Y$. Since $f$ is bijective, $f(X \setminus V) = Y \setminus f(V)$. Hence $f(V)$ is open. □

Exercise 2.1. (Cauchy functional equation) Let $\varphi : \mathbb{R} \to \mathbb{R}$ be a continuous function satisfying
$$\varphi(x + y) = \varphi(x) + \varphi(y), \quad \forall x, y \in \mathbb{R}. \tag{2.2}$$
Note that the solution is an odd function. Indeed, taking $x = y = 0$, we find $\varphi(0) = 2\varphi(0)$, i.e., $\varphi(0) = 0$, and then $\varphi(x) + \varphi(-x) = \varphi(0) = 0$. By induction one can show that
$$\varphi(n) = n\varphi(1), \quad n \in \mathbb{N}^*.$$
Then from
$$\varphi(1) = \varphi\Big( \frac{m}{m} \Big) = m \varphi\Big( \frac{1}{m} \Big)$$
it follows that
$$\varphi\Big( \frac{1}{m} \Big) = \frac{1}{m} \varphi(1), \quad m \in \mathbb{N}^*.$$
By induction one can further show that $\varphi(p/q) = (p/q)\varphi(1)$, $p \in \mathbb{Z}$, $q \in \mathbb{Z} \setminus \{0\}$. Now we know the representation of the solution if we restrict the problem to the system of rational numbers. Consider an arbitrary but fixed $x \in \mathbb{R} \setminus \mathbb{Q}$. Then there exists a sequence $(q_n)$ of rational numbers such that $q_n \to x$. Using the continuity hypothesis one can infer that
$$\varphi(x) = \lim \varphi(q_n) = \lim q_n \varphi(1) = [\lim q_n] \varphi(1) = x \varphi(1).$$
Denote $\varphi(1) =: a$. Then the solution of the Cauchy functional equation is $\varphi(x) = ax$, for every $x \in \mathbb{R}$. △

5.2.2 Uniformly continuous mappings

Let $f$ be a mapping on a metric space $X$ with values in a metric space $Y$. We say that $f$ is uniformly continuous on $X$ if for every $\varepsilon > 0$ there exists $\delta > 0$ such that
$$\rho(f(p), f(q)) < \varepsilon$$
for all $p, q \in X$ for which $\rho(p, q) < \delta$.

Remark.
Uniform continuity is a property of a function on a whole set, whereas continuity can be defined at a single point. Asking whether a given function is uniformly continuous at a certain point is meaningless. Secondly, if $f$ is continuous on $X$, it is possible to find, for each $\varepsilon > 0$ and each point $p \in X$, a number $\delta > 0$ having the property specified in the definition of continuity; thus $\delta = \delta(\varepsilon, p)$. However, if $f$ is uniformly continuous on $X$, then it is possible, for each $\varepsilon > 0$, to find one number $\delta > 0$ which will do for all points $p \in X$. △

It is trivial to see that every uniformly continuous function is continuous. That the two concepts are equivalent on compact sets follows from the next theorem.

Theorem 2.10. (Cantor) Let $f$ be a continuous mapping of a compact metric space $X$ into a metric space $Y$. Then $f$ is uniformly continuous on $X$.

Proof. Let $\varepsilon > 0$ be given. Since $f$ is continuous, we can associate to each point $p \in X$ a positive number $\varphi(p)$ such that
$$x \in X, \ \rho(p, x) < \varphi(p) \implies \rho(f(p), f(x)) < \varepsilon/2. \tag{2.3}$$
Let $I(p)$ be the ball defined as
$$I(p) = B\big(p, \tfrac{1}{2} \varphi(p)\big). \tag{2.4}$$
Since $p \in I(p)$, the family of all sets $I(p)$ is an open cover of $X$, and since $X$ is compact, there is a finite set of points $p_1, \dots, p_n$ in $X$ such that
$$X = I(p_1) \cup \dots \cup I(p_n). \tag{2.5}$$
Let
$$\delta = \tfrac{1}{2} \min\{\varphi(p_1), \dots, \varphi(p_n)\} \ (> 0). \tag{2.6}$$
Now, let $x$ and $p$ be points of $X$ such that $\rho(x, p) < \delta$. By (2.5), there is an integer $m$, $1 \le m \le n$, such that $p \in I(p_m)$. It follows that
$$\rho(p, p_m) < \tfrac{1}{2} \varphi(p_m), \tag{2.7}$$
and also that
$$\rho(x, p_m) \le \rho(x, p) + \rho(p, p_m) < \delta + \tfrac{1}{2} \varphi(p_m) \le \varphi(p_m).$$
This, together with (2.3), implies that
$$\rho(f(p), f(x)) \le \rho(f(p), f(p_m)) + \rho(f(p_m), f(x)) < \varepsilon. \qquad □$$

5.2.3 Continuity and connectedness

A set $E$ in a metric space $X$ is said to be connected if there are no two disjoint open sets $A$ and $B$, both of which intersect $E$, such that $E \subset A \cup B$. △

Theorem 2.11. On the real axis, a set is connected if and only if it is an interval.

Similar to theorem 2.6, we have

Theorem 2.12.
If $f$ is a continuous mapping of a connected metric space $X$ into a metric space $Y$, then $f(X)$ is connected.

Proof. If $f(X)$ is not connected, there are disjoint open sets $V$ and $W$ in $Y$, both of which intersect $f(X)$, such that $f(X) \subset V \cup W$. Since $f$ is continuous, the sets $f^{-1}(V)$ and $f^{-1}(W)$ are open in $X$; they are clearly disjoint and nonempty, and their union is $X$. This implies that $X$ is not connected, in contradiction to the hypothesis. □

Theorem 2.13. (Darboux¹ property of continuous functions) Let $f$ be a continuous real function on the interval $[a, b]$. If $f(a) < f(b)$ and if $c$ is a number such that $f(a) < c < f(b)$, there exists a point $x \in \, ]a, b[$ such that $f(x) = c$. A similar result holds if $f(a) > f(b)$.

Proof. By theorem 2.11, $[a, b]$ is connected; hence, theorem 2.12 shows that $f([a, b])$ is a connected subset of $\mathbb{R}$, and the assertion follows if we appeal once more to theorem 2.11. □

¹ Gaston Darboux, 1842-1917

5.2.4 Discontinuities

If $x$ is a point in the domain of definition of a function $f$ at which $f$ is not continuous, we say that $f$ is discontinuous at $x$, or that $f$ has a discontinuity at $x$.

Let $f$ be defined on $]a, b[$. If $f$ is discontinuous at a point $x$, and if $f(x+)$ and $f(x-)$ exist, then $f$ is said to have a discontinuity of the first kind or a simple discontinuity. Otherwise the discontinuity is said to be of the second kind.

There are two ways in which a function can have a simple discontinuity: either $f(x+) \ne f(x-)$ (in which case the value $f(x)$ is immaterial), or $f(x+) = f(x-) \ne f(x)$.

Examples. (a) Define
$$f(x) = \begin{cases} 1, & x \in \mathbb{Q}, \\ 0, & x \in \mathbb{R} \setminus \mathbb{Q}. \end{cases}$$
Then $f$ has a discontinuity of the second kind at every point $x$, since neither $f(x+)$ nor $f(x-)$ exists.
(b) Define
$$f(x) = \begin{cases} x, & x \in \mathbb{Q}, \\ 0, & x \in \mathbb{R} \setminus \mathbb{Q}. \end{cases}$$
Then $f$ is continuous at $x = 0$ and has a discontinuity of the second kind at every other point.
(c) Define
$$f(x) = \begin{cases} \sin \dfrac{1}{x}, & x \ne 0, \\ 0, & x = 0. \end{cases}$$
Since neither $f(0+)$ nor $f(0-)$ exists, $f$ has a discontinuity of the second kind at $x = 0$.
However, f is continuous at every point x ≠ 0. △

5.2.5 Monotonic functions

Let f be real on ]a, b[. Then f is said to be monotonically increasing on ]a, b[ if a < x < y < b implies f(x) ≤ f(y). If the last inequality is reversed, we obtain the definition of a monotonically decreasing function. The set of monotonic functions consists of both the increasing and the decreasing functions.

Theorem 2.14. Let f be monotonically increasing on ]a, b[. Then f(x+) and f(x−) exist at every point x ∈ ]a, b[. More precisely,

\[ \sup_{a<t<x} f(t) = f(x-) \le f(x) \le f(x+) = \inf_{x<t<b} f(t). \tag{2.8} \]

Furthermore, if a < x < y < b, then

\[ f(x+) \le f(y-). \tag{2.9} \]

Proof. The set of numbers f(t) with a < t < x is bounded above by f(x). Therefore it has a least upper bound, which we shall denote by A. It is obvious that A ≤ f(x). We have to show that A = f(x−). Let ε > 0 be given. It follows from the definition of A as a supremum that there exists δ > 0 such that a < x − δ < x and

\[ A - \varepsilon < f(x - \delta) \le A. \tag{2.10} \]

Since f is monotonic, we have

\[ f(x - \delta) \le f(t) \le A, \qquad x - \delta < t < x. \tag{2.11} \]

Combining (2.10) and (2.11), we obtain

\[ |f(t) - A| < \varepsilon, \qquad x - \delta < t < x. \]

Hence f(x−) = A. The second half of (2.8) is proved in precisely the same way. Next, if a < x < y < b, we infer from (2.8) that

\[ f(x+) = \inf_{x<t<b} f(t) = \inf_{x<t<y} f(t). \tag{2.12} \]

Similarly,

\[ f(y-) = \sup_{a<t<y} f(t) = \sup_{x<t<y} f(t). \tag{2.13} \]

Comparing (2.12) and (2.13), we get (2.9). □

Corollary 2.1. Monotonic functions have no discontinuity of the second kind.

Proof. The claim follows at once from theorem 2.14. The same theorem shows more, namely that the set of points at which a monotonic function is discontinuous is at most countable. Indeed, suppose, for the sake of definiteness, that f is increasing, and let E be the set of points at which f is discontinuous. With every point x ∈ E we associate a rational number r(x) such that f(x−) < r(x) < f(x+). If $x_1 < x_2$, it follows from (2.9) that $f(x_1+) \le f(x_2-)$, hence $r(x_1) < r(x_2)$. We note that $r(x_1) \ne r(x_2)$ if $x_1 \ne x_2$. We have thus established a one-to-one correspondence between the set E and a subset of the rational numbers. The latter is countable.
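The content of theorem 2.14 can be illustrated numerically. The following is a minimal sketch, not part of the original notes: the function `f` and the helpers `left_limit` and `right_limit` are hypothetical names chosen for this illustration, and the one-sided limits are only approximated by a maximum, respectively a minimum, over finitely many sample points approaching x.

```python
import math


def f(t: float) -> float:
    """An increasing function on ]0, 3[ with jumps at t = 1 and t = 2 (illustrative choice)."""
    return t + math.floor(t)


def left_limit(func, x: float, a: float, depth: int = 40) -> float:
    """Approximate f(x-) = sup of f on ]a, x[ using the points x - (x - a) / 2^k."""
    return max(func(x - (x - a) * 2.0 ** (-k)) for k in range(1, depth + 1))


def right_limit(func, x: float, b: float, depth: int = 40) -> float:
    """Approximate f(x+) = inf of f on ]x, b[ using the points x + (b - x) / 2^k."""
    return min(func(x + (b - x) * 2.0 ** (-k)) for k in range(1, depth + 1))


a, b = 0.0, 3.0
f_minus_1 = left_limit(f, 1.0, a)   # close to 1
f_plus_1 = right_limit(f, 1.0, b)   # close to 2
f_minus_2 = left_limit(f, 2.0, a)   # close to 3

# Inequality (2.8) at x = 1 and inequality (2.9) for x = 1 < y = 2.
print(f_minus_1, f(1.0), f_plus_1)
print(f_plus_1, f_minus_2)
```

In this sketch the jump of f at x = 1 shows up as a gap between the approximations of f(1−) and f(1+), while f(1) lies between them, as (2.8) requires.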
□

Exercise 2.2. ([24, Probl. 11, p. 9]) Let f be a continuous and increasing function defined on an interval [a, b] and such that f(a) ≥ a and f(b) ≤ b. Choose an arbitrary $x_1 \in [a, b]$ and consider the sequence $(x_n)$ obtained as $x_{n+1} = f(x_n)$, n ≥ 1. Show that the limit $\lim x_n =: x^*$ exists and that $f(x^*) = x^*$.

Since f is increasing, a ≤ f(a) ≤ f(x) ≤ f(b) ≤ b for every x ∈ [a, b], so f maps [a, b] into itself. Thus the sequence $(x_n)$ is included in the compact set [a, b], and hence it is bounded. The function f being increasing, the sequence $(x_n)$ is monotone: if $x_1 \le x_2$, then by induction $x_n \le x_{n+1}$ for every n, and if $x_1 \ge x_2$, then $x_n \ge x_{n+1}$ for every n. So we have established that the sequence $(x_n)$ converges; let $x^* \in [a, b]$ be its limit. From the estimations

\[ 0 \le |f(x_n) - x^*| = |x_{n+1} - x^*| \to 0 \quad \text{as } n \to \infty \]

and from the continuity of f, which gives $f(x_n) \to f(x^*)$, we conclude that $f(x^*) = x^*$. △

5.2.6 Darboux functions

Let I ⊂ R be a nonempty interval and f : I → R be a mapping. We say that f is a Darboux function if for any a, b ∈ I, a < b, and any λ between f(a) and f(b), there is c ∈ ]a, b[ such that f(c) = λ. We denote the set of Darboux functions on an interval I by $D_I$.

Remark. Geometrically, the Darboux property says that for any a, b ∈ I, a < b, and any λ between f(a) and f(b), the parallel to the Ox axis through y = λ meets the graph of the restriction of f to the open interval ]a, b[ in at least one point. △

Proposition 2.1. For a mapping f : I → R the next statements are equivalent:

(a) f is a Darboux function;

(b) if J ⊂ I is an interval, then f(J) is an interval, too (the image of an interval is an interval);

(c) if a, b ∈ I, a < b, then f([a, b]) is an interval.

Proof. (a) ⟹ (b). Consider any $y_1 = f(t_1)$, $y_2 = f(t_2) \in f(J)$ (we may suppose that $y_1 < y_2$) and any $\lambda \in \,]y_1, y_2[$. By (a), there is c between $t_1$ and $t_2$ such that λ = f(c). Since J is an interval and $t_1, t_2 \in J$, it follows that c ∈ J and therefore λ = f(c) ∈ f(J). Thus f(J) is an interval.

(b) ⟹ (c). It is trivial.

(c) ⟹ (a). Let a, b ∈ I, a < b, and let λ lie strictly between f(a) and f(b). By (c), f([a, b]) is an interval containing f(a) and f(b), hence λ ∈ f([a, b]), that is, there is c ∈ [a, b] such that λ = f(c). Since λ ≠ f(a) and λ ≠ f(b), we note that c ∉ {a, b} and λ = f(c).
Hence f is a Darboux mapping on I. □

Corollary 2.2. Let f : I → R and g : f(I) → R be two Darboux mappings. Then their composition g ∘ f is a Darboux mapping.

Proof. If f and g are Darboux mappings and J is a subinterval of I, then f(J) is an interval and hence (g ∘ f)(J) = g(f(J)) is an interval. Thus, by proposition 2.1, the mapping g ∘ f is a Darboux mapping. □

Corollary 2.3. Let f : I → R be a Darboux mapping whose range is an at most countable set. Then the function f is constant.

Proof. f(I) is an interval and is at most countable. Since an interval with more than one point is uncountable, |f(I)| = 1, that is, f is constant. □

Corollary 2.4. Let f : I → R be a Darboux mapping. Then f is injective if and only if it is strictly monotone.

Proof. Sufficiency. If f is strictly monotone, it is injective.

Necessity. Suppose f : I → R is a Darboux and injective function. If f is not strictly monotone, there exist $t_1 < t_2 < t_3$ such that either $f(t_1) < f(t_2) > f(t_3)$ or $f(t_1) > f(t_2) < f(t_3)$. Let us suppose that the first case holds. The other case runs similarly. The first case has two subcases.

• $f(t_1) < f(t_3) < f(t_2)$. Then there is $c \in \,]t_1, t_2[$ with $f(c) = f(t_3)$. Thus the function f is not injective, which is a contradiction.

• $f(t_3) < f(t_1) < f(t_2)$. Reasoning as above, we find $c \in \,]t_2, t_3[$ with $f(c) = f(t_1)$. Thus the injectivity of f is contradicted. □

Corollary 2.5. Let f : I → R be a Darboux mapping. Then f is also a Darboux function on any subinterval J ⊂ I.

Proof. It follows from (b) of proposition 2.1. □

Proposition 2.2. Suppose f : I → R is a Darboux mapping. Then

(a) |f|, $f^2$, and $\sqrt{|f|}$ are Darboux mappings on I;

(b) if f(t) ≠ 0 for every t ∈ I, then 1/f is a Darboux mapping on I.

Proof. Apply corollary 2.2 to the mapping f and to the mappings $g_1(t) = |t|$, $g_2(t) = t^2$, $g_3(t) = \sqrt{|t|}$, respectively g(t) = 1/t. □

Remark. We introduce two Darboux functions whose sum is not a Darboux function.
The functions

\[ f(t) = \begin{cases} \sin\frac{1}{t}, & t \ne 0, \\ 0, & t = 0, \end{cases} \qquad g(t) = \begin{cases} -\sin\frac{1}{t}, & t \ne 0, \\ 1, & t = 0, \end{cases} \]

are two Darboux functions, while their sum

\[ (f + g)(t) = \begin{cases} 0, & t \ne 0, \\ 1, & t = 0, \end{cases} \]

is not a Darboux mapping. △

Corollary 2.6. Let f : I → R be a Darboux mapping. Then f has no discontinuity point of the first kind.

Proof. [15, p. 52]. □

Theorem 2.15. (Sierpinski; Waclaw Sierpinski, 1882-1969) Any mapping f : R → R is a sum of two discontinuous Darboux mappings, i.e., there exist discontinuous Darboux functions $f_1$ and $f_2$ such that $f(t) = f_1(t) + f_2(t)$, ∀t ∈ R.

A proof of the previous theorem may be found, e.g., in [15, p. 46-48].

5.2.7 Lipschitz functions

Let I be a real interval and f : I → R be a function. Then f is said to be Lipschitz on I if there exists a nonnegative L such that

\[ |f(x) - f(y)| \le L|x - y|, \qquad \forall\, x, y \in I. \tag{2.14} \]

Remark. Every Lipschitz function on an interval is uniformly continuous on it. △

Let A be a nonempty set and f : A → A be a mapping. A point x ∈ A is said to be a fixed point of f if f(x) = x.

Let (X, ρ) be a metric space and f : X → X be a mapping. Then f is said to be a contraction if there exists a constant α ∈ ]0, 1[ such that ρ(f(x), f(y)) ≤ αρ(x, y) for every x, y ∈ X. In this case the mapping f is said to be an α-contraction.

Remark. Every α-contraction is a Lipschitz function. △

Theorem 2.16. (Banach fixed point theorem; Stefan Banach) Let (X, ρ) be a complete metric space and f : X → X be an α-contraction. Then f has a unique fixed point.

Proof. Choose an arbitrary, but fixed, point $x_0 \in X$. Define the sequence $(x_n)_{n \ge 0}$ by $x_{n+1} = f(x_n)$, n = 0, 1, . . . . For every n ∈ N* and every integer p ≥ 1, we have the estimations

\[ \rho(x_{n+1}, x_n) = \rho(f(x_n), f(x_{n-1})) \le \alpha\rho(x_n, x_{n-1}) \le \dots \le \alpha^n \rho(x_1, x_0), \]

and, by the triangle inequality,

\[
\begin{aligned}
\rho(x_{n+p}, x_n) &\le \rho(x_{n+p}, x_{n+p-1}) + \rho(x_{n+p-1}, x_{n+p-2}) + \dots + \rho(x_{n+1}, x_n) \\
&\le (\alpha^{n+p-1} + \alpha^{n+p-2} + \dots + \alpha^n)\rho(x_1, x_0) \\
&= \alpha^n(\alpha^{p-1} + \alpha^{p-2} + \dots + 1)\rho(x_1, x_0) \le \alpha^n \frac{1}{1-\alpha}\rho(x_1, x_0).
\end{aligned}
\]

Thus we may conclude that $(x_n)$ is a Cauchy sequence. The metric space X is complete, thus the sequence $(x_n)$ is convergent. Let x ∈ X be its limit. Letting p → +∞ in the previous estimations, we get

\[ \rho(x, x_n) \le \alpha^n \frac{1}{1-\alpha}\rho(x_1, x_0). \tag{2.15} \]

Now, substituting $x_n$ by $f(x_{n-1})$, letting n → +∞ and using the continuity of f, we get ρ(x, f(x)) = 0. Thus x = f(x), i.e., x is a fixed point of f.

Suppose that there are points x and y (not necessarily distinct) such that x = f(x) and y = f(y). Then

\[ \rho(x, y) = \rho(f(x), f(y)) \le \alpha\rho(x, y) \]

implies that (1 − α)ρ(x, y) ≤ 0. Hence x = y, i.e., the fixed point of the function f is unique. □

Remark. We notice that for any starting point $x_0 \in X$ we get precisely the same limit point, and this limit point is the unique fixed point of f. △

The sequence $(x_n)$ is said to be the sequence of successive approximations of x. Based on (2.15), we can establish the speed of convergence of the sequence $(x_n)$ to its limit x.

Example. Suppose we have to find the real roots of the equation

\[ x^3 + 2x - 1 = 0. \tag{2.16} \]

First we note that if we denote the left-hand side of (2.16) by g(x), we get a polynomial function; this function is strictly increasing. Hence it has at most one real root. At the same time, from

\[ \lim_{x \to -\infty} g(x) = -\infty \quad \text{and} \quad \lim_{x \to +\infty} g(x) = +\infty, \]

and, moreover, taking into account theorem 2.13, we conclude that (2.16) has precisely one real root. In order to find this real root we rewrite the given equation as

\[ x = \frac{1}{x^2 + 2}. \tag{2.17} \]

Denote the right-hand side of (2.17) by f(x); thus we get a real rational function defined on R. We check the assumptions of theorem 2.16. For x, y ∈ R we have

\[ |f(x) - f(y)| = \frac{|x + y|}{(2 + x^2)(2 + y^2)}\,|x - y|. \]

However,

\[ |t| = \frac{1}{\sqrt{2}}\sqrt{2t^2} \le \frac{2 + t^2}{2\sqrt{2}}. \]

Thus

\[ |x + y| \le |x| + |y| \le \frac{2 + x^2 + 2 + y^2}{2\sqrt{2}} \le \frac{(2 + x^2)(2 + y^2)}{2\sqrt{2}}. \]

Now we conclude that the function f is a contraction, with constant $\alpha = 1/(2\sqrt{2})$. From (2.17) it easily follows that the solution has to be strictly positive.
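The successive approximations for (2.17) and the a priori estimate (2.15) can be sketched in a few lines of code. This is only an illustration under stated assumptions: the names `successive_approximations`, `a_priori_bound` and `ALPHA` are hypothetical, the metric is ρ(x, y) = |x − y| on R, the starting point is x_0 = 0, and the contraction constant is the value 1/(2√2) read off from the estimate above.

```python
import math


def f(x: float) -> float:
    """Right-hand side of (2.17): f(x) = 1 / (x^2 + 2)."""
    return 1.0 / (x * x + 2.0)


def successive_approximations(x0: float, steps: int) -> list[float]:
    """Return [x_0, x_1, ..., x_steps] with x_{n+1} = f(x_n)."""
    xs = [x0]
    for _ in range(steps):
        xs.append(f(xs[-1]))
    return xs


# Contraction constant taken from the estimate |f(x) - f(y)| <= |x - y| / (2 sqrt 2).
ALPHA = 1.0 / (2.0 * math.sqrt(2.0))


def a_priori_bound(n: int, x0: float, x1: float) -> float:
    """Right-hand side of (2.15): alpha^n / (1 - alpha) * |x_1 - x_0|."""
    return ALPHA ** n / (1.0 - ALPHA) * abs(x1 - x0)


xs = successive_approximations(0.0, 30)
root = xs[-1]  # numerical stand-in for the fixed point

for n in range(8):
    # iterate, actual error against the stand-in, and the bound from (2.15)
    print(n, xs[n], abs(root - xs[n]), a_priori_bound(n, xs[0], xs[1]))
```

The bound (2.15) is usually pessimistic: it is computed from x_0 and x_1 alone, before any further iterate is known, so the observed error can be much smaller than the printed bound.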
Since the function f is strictly decreasing on the positive semi-axis and f([0, 1]) = [1/3, 1/2] ⊂ [0, 1], it follows that it is a contraction on the compact interval [0, 1]. Thus we may apply the Banach fixed point theorem, and if we start from $x_0 = 0$, then

\[ x_1 = 0.5,\ x_2 = 0.444444\ldots,\ x_3 = 0.455056\ldots,\ x_4 = 0.453088\ldots,\ x_5 = 0.453455\ldots,\ x_6 = 0.453386\ldots,\ x_7 = 0.453399\ldots. \]

△

5.2.8 Convex functions

Let I be a real interval and f : I → R be a function. Then f is said to be convex provided

\[ x, y \in I,\ \alpha \in [0, 1] \implies f((1 - \alpha)x + \alpha y) \le (1 - \alpha)f(x) + \alpha f(y). \tag{2.18} \]

f is said to be strictly convex provided

\[ x, y \in I,\ x \ne y,\ \alpha \in \,]0, 1[ \implies f((1 - \alpha)x + \alpha y) < (1 - \alpha)f(x) + \alpha f(y). \tag{2.19} \]

Figure 5.1 represents a convex function.

[Figure 5.1]

Lemma 2.1. (Jensen inequality; Jensen, 1859-1925) Suppose f : I → R is a convex function. Then

\[ f(\alpha_1 x_1 + \dots + \alpha_n x_n) \le \alpha_1 f(x_1) + \dots + \alpha_n f(x_n) \tag{2.20} \]

for any n ∈ N*, any $x_1, \dots, x_n \in I$, and any $\alpha_1, \dots, \alpha_n \ge 0$ with $\alpha_1 + \dots + \alpha_n = 1$.

Proof. If n = 1, the claim is trivial, and if n = 2, it is (2.18). Suppose n ≥ 3 and the claim is true for n − 1. Consider $x_1, \dots, x_n \in I$, $\alpha_1, \dots, \alpha_n \ge 0$ with $\alpha_1 + \dots + \alpha_n = 1$. We may suppose $\alpha_n < 1$, the case $\alpha_n = 1$ being trivial. Then $\sum_{i=1}^{n-1} \alpha_i/(1 - \alpha_n) = 1$, so by hypothesis

\[ f\Big(\sum_{i=1}^{n-1} \frac{\alpha_i}{1 - \alpha_n} x_i\Big) \le \sum_{i=1}^{n-1} \frac{\alpha_i}{1 - \alpha_n} f(x_i). \]

Hence

\[
\begin{aligned}
f\Big(\sum_{i=1}^{n} \alpha_i x_i\Big) &= f\Big((1 - \alpha_n)\sum_{i=1}^{n-1} \frac{\alpha_i}{1 - \alpha_n} x_i + \alpha_n x_n\Big) \\
&\le (1 - \alpha_n) f\Big(\sum_{i=1}^{n-1} \frac{\alpha_i}{1 - \alpha_n} x_i\Big) + \alpha_n f(x_n) \\
&\le (1 - \alpha_n)\sum_{i=1}^{n-1} \frac{\alpha_i}{1 - \alpha_n} f(x_i) + \alpha_n f(x_n) = \sum_{i=1}^{n} \alpha_i f(x_i).
\end{aligned}
\]

□

Corollary 2.7. Suppose f : I → R is a convex mapping. Then

\[ f\Big(\frac{\alpha_1 x_1 + \dots + \alpha_n x_n}{\alpha_1 + \dots + \alpha_n}\Big) \le \frac{\alpha_1 f(x_1) + \dots + \alpha_n f(x_n)}{\alpha_1 + \dots + \alpha_n}, \tag{2.21} \]

for any $x_1, \dots, x_n \in I$ and $\alpha_1, \dots, \alpha_n \ge 0$ with $\alpha_1 + \dots + \alpha_n > 0$.

Let f be a real function defined on an interval I and let $x_0 \in I$. Then

\[ s_{f,x_0}(x) = \frac{f(x) - f(x_0)}{x - x_0}, \qquad x \in I \setminus \{x_0\}, \]

is said to be the slope of f.

Lemma 2.2. Suppose f : I → R is a mapping. Then

(a) f is convex on I ⟹ $s_{f,x_0}$ is increasing on $I \setminus \{x_0\}$;
(b) f is strictly convex on I ⟹ $s_{f,x_0}$ is strictly increasing on $I \setminus \{x_0\}$.

Proof. □

Theorem 2.17. Suppose f : I → R is a convex mapping on an interval I. Then

(a) f is continuous on the interior of I;

(b) f is Lipschitz on every compact interval contained in the interior of I.

Proof. □

5.2.9 Jensen convex functions

Let I be a real interval and f : I → R be a function. Then f is said to be Jensen convex or J-convex provided

\[ x, y \in I \implies f\Big(\frac{x + y}{2}\Big) \le \frac{f(x) + f(y)}{2}. \tag{2.22} \]

Proposition 2.3. A continuous and J-convex function f : I → R is convex.

Proof. From (2.22), by induction, we get

\[ f\Big(\frac{1}{2^k}(x_1 + \dots + x_{2^k})\Big) \le \frac{1}{2^k}[f(x_1) + \dots + f(x_{2^k})], \tag{2.23} \]

for every k ∈ N* and $x_1, \dots, x_{2^k} \in I$. Consider x, y ∈ I and α ∈ ]0, 1[. We show that (2.18) is satisfied. Write α as

\[ \alpha = \frac{\alpha_1}{2} + \frac{\alpha_2}{2^2} + \dots + \frac{\alpha_k}{2^k} + \dots, \]

where $\alpha_i \in \{0, 1\}$ for all i ∈ N*. Denoting

\[ \alpha^{(k)} = \frac{\alpha_1}{2} + \frac{\alpha_2}{2^2} + \dots + \frac{\alpha_k}{2^k} = \frac{\alpha_1 2^{k-1} + \dots + \alpha_k}{2^k} = \frac{\beta_k}{2^k}, \]

it results that

\[ \lim_{k \to \infty} \alpha^{(k)} = \alpha, \qquad 1 - \alpha^{(k)} = \frac{2^k - \beta_k}{2^k}. \]

We take in (2.23) $x_1 = x_2 = \dots = x_{\beta_k} = x$ and $x_{\beta_k + 1} = \dots = x_{2^k} = y$. Then we get

\[ f(\alpha^{(k)} x + (1 - \alpha^{(k)}) y) = f\Big(\frac{\beta_k x + (2^k - \beta_k) y}{2^k}\Big) \le \frac{\beta_k f(x) + (2^k - \beta_k) f(y)}{2^k} = \alpha^{(k)} f(x) + (1 - \alpha^{(k)}) f(y). \]

Now letting k → ∞ and taking into account the continuity of f, the conclusion follows. □

Chapter 6

Differential calculus

This chapter is devoted to some basic results of differential calculus.

6.1 The derivative of a real function

Let f be defined on [a, b] and real-valued. For any x ∈ [a, b] form the quotient

\[ \varphi(t) = \frac{f(t) - f(x)}{t - x}, \qquad (a < t < b,\ t \ne x) \tag{1.1} \]

and define

\[ f'(x) = \lim_{t \to x} \varphi(t) \tag{1.2} \]

provided this limit exists. We thus associate to the function f a function f′ whose domain of definition is the set of points x at which the limit (1.2) exists; f′ is called the derivative of f. If f′ is defined at a point x, we say that f is differentiable at x.
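The quotient (1.1) and the limit (1.2) can be explored numerically. The following is a minimal sketch, assuming the illustrative functions t² and |t| (neither is prescribed by the definition above); the helper name `phi` is hypothetical and simply mirrors the symbol used in (1.1).

```python
def phi(f, x: float, t: float) -> float:
    """Difference quotient (1.1): (f(t) - f(x)) / (t - x), defined for t != x."""
    if t == x:
        raise ValueError("the quotient is defined only for t != x")
    return (f(t) - f(x)) / (t - x)


def square(t: float) -> float:
    return t * t


# For f(t) = t^2 at x = 1 the quotient equals t + 1, so it tends to 2 as t -> 1.
square_quotients = [phi(square, 1.0, 1.0 + 10.0 ** (-k)) for k in range(1, 7)]

# For f(t) = |t| at x = 0 the quotient is +1 for t > 0 and -1 for t < 0,
# so the limit (1.2) does not exist at x = 0.
abs_right = [phi(abs, 0.0, 10.0 ** (-k)) for k in range(1, 7)]
abs_left = [phi(abs, 0.0, -(10.0 ** (-k))) for k in range(1, 7)]

print(square_quotients)
print(abs_right, abs_left)
```

The second computation shows why differentiability is a statement about a two-sided limit: both one-sided quotients settle down, but to different values.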
If f′ is defined at every point of a set E ⊂ [a, b], we say that f is differentiable on E.

Theorem 1.1. Let f be defined on [a, b]. If f is differentiable at a point x ∈ [a, b], then f is continuous at x.

Proof. As t → x, we have

\[ f(t) - f(x) = \frac{f(t) - f(x)}{t - x}\,(t - x) \to f'(x) \cdot 0 = 0. \]

□

Remark. The converse is not true. Consider f(t) = |t|, t ∈ R, and x = 0. A stronger result holds, namely △

Theorem 1.2. (Weierstrass) There exists a continuous function on R having no point of differentiability.

Proof. Consider the function h(t) = |t|, t ∈ [−1, 1], and extend it by periodicity on R such that

\[ g(t) = \begin{cases} h(t), & t \in [-1, 1], \\ g(t - 2), & t > 1, \\ g(t + 2), & t < -1. \end{cases} \]

Then for any s, t ∈ R

\[ |g(t) - g(s)| \le |t - s|. \tag{1.3} \]

Thus the function g is Lipschitz on R, so it is continuous on R. Define

\[ f(t) = \sum_{n=0}^{\infty} \Big(\frac{3}{4}\Big)^n g(4^n t). \]

Since 0 ≤ g ≤ 1, the series is uniformly convergent. We have uniform convergence of continuous functions, so f is continuous on R.

Now we show that the function f is nowhere differentiable. Choose an arbitrary, but fixed, x ∈ R. Define $\delta_m = \pm\frac{1}{2} 4^{-m}$, where the sign is chosen such that between $4^m x$ and $4^m(x + \delta_m)$ there is no integer. Define

\[ \gamma_n = \frac{g(4^n(x + \delta_m)) - g(4^n x)}{\delta_m}. \]

If n > m, $4^n \delta_m$ is an even integer and thus $\gamma_n = 0$. If 0 ≤ n ≤ m, from (1.3) it follows that $|\gamma_n| \le 4^n$. Since $|\gamma_m| = 4^m$, we get the estimation

\[ \Big|\frac{f(x + \delta_m) - f(x)}{\delta_m}\Big| = \Big|\sum_{n=0}^{m} \Big(\frac{3}{4}\Big)^n \gamma_n\Big| \ge 3^m - \sum_{n=0}^{m-1} 3^n = \frac{1}{2}(3^m + 1). \]

If m → ∞, then $\delta_m \to 0$. Hence the function f is not differentiable at x, and the theorem is proved. □

We introduce some arithmetic properties of differentiable functions.

Theorem 1.3. Suppose f and g are defined on [a, b] and are differentiable at a point x ∈ [a, b]. Then f + g, f · g and f/g are differentiable at x, and

(a) (f + g)′(x) = f′(x) + g′(x);

(b) (fg)′(x) = f′(x)g(x) + f(x)g′(x);

(c) $\Big(\dfrac{f}{g}\Big)'(x) = \dfrac{f'(x)g(x) - f(x)g'(x)}{g^2(x)}$ (we assume that g(x) ≠ 0).

Proof. (a) is trivial.
(b) Let h = f · g. Then

\[
\begin{aligned}
\frac{h(t) - h(x)}{t - x} &= \frac{f(t)g(t) - f(x)g(x)}{t - x} = \frac{f(t)g(t) - f(x)g(t) + f(x)g(t) - f(x)g(x)}{t - x} \\
&= g(t)\,\frac{f(t) - f(x)}{t - x} + f(x)\,\frac{g(t) - g(x)}{t - x},
\end{aligned}
\]

and the claim follows by letting t → x, since g is continuous at x by theorem 1.1.

(c) Similarly,

\[ \frac{\frac{f}{g}(t) - \frac{f}{g}(x)}{t - x} = \frac{f(t)g(x) - g(t)f(x)}{(t - x)g(x)g(t)} = \frac{1}{g(t)g(x)}\Big[g(x)\,\frac{f(t) - f(x)}{t - x} - f(x)\,\frac{g(t) - g(x)}{t - x}\Big]. \]

□

Theorem 1.4. Suppose f is continuous on [a, b], f′(x) exists at some point x ∈ [a, b], g is defined on a closed interval I that contains the range of f, and g is differentiable at the point f(x). If h(t) = g(f(t)), a ≤ t ≤ b, then h is differentiable at x, and

\[ h'(x) = g'(f(x)) \cdot f'(x). \tag{1.4} \]

Proof. Let y = f(x). By the definition of the derivative, we have

\[ f(t) - f(x) = (t - x)[f'(x) + u(t)], \tag{1.5} \]

\[ g(s) - g(y) = (s - y)[g'(y) + v(s)], \tag{1.6} \]

where t ∈ [a, b], s ∈ I, u(t) → 0 as t → x, and v(s) → 0 as s → y. Let s = f(t). First we use (1.6) and then (1.5). We obtain

\[ h(t) - h(x) = g(f(t)) - g(f(x)) = [f(t) - f(x)][g'(y) + v(s)] = (t - x)[f'(x) + u(t)][g'(y) + v(s)], \]

or, if t ≠ x,

\[ \frac{h(t) - h(x)}{t - x} = [g'(y) + v(s)][f'(x) + u(t)]. \tag{1.7} \]

Let t → x and see that s → y, by the continuity of f. Thus the right-hand side of (1.7) tends to g′(y)f′(x), which is actually (1.4). □

Examples 1.1. (a) Consider

\[ f(x) = \begin{cases} x \sin\frac{1}{x}, & x \ne 0, \\ 0, & x = 0. \end{cases} \]

Then $f'(x) = \sin\frac{1}{x} - \frac{1}{x}\cos\frac{1}{x}$ for x ≠ 0, and f′(0) does not exist.

(b) Consider

\[ f(x) = \begin{cases} x^2 \sin\frac{1}{x}, & x \ne 0, \\ 0, & x = 0. \end{cases} \]

Then $f'(x) = 2x\sin\frac{1}{x} - \cos\frac{1}{x}$ for x ≠ 0. Now f′(0) = 0.

(c) The above cases are particular cases of the next one. Let a ≥ 0, c > 0 be constants and f : [−1, 1] → R be a function defined by

\[ f(x) = \begin{cases} x^a \sin(x^{-c}), & x \ne 0, \\ 0, & x = 0. \end{cases} \]

Then

(c1) f is continuous ⟺ a > 0;

(c2) ∃ f′(0) ⟺ a > 1;

(c3) f′ is bounded ⟺ a ≥ 1 + c;

(c4) f′ is continuous ⟺ a > 1 + c;

(c5) ∃ f″(0) ⟺ a > 2 + c;

(c6) f″ is bounded ⟺ a ≥ 2 + 2c;

(c7) f″ is continuous ⟺ a > 2 + 2c. △

Now we study the behaviour of the ratio from (1.1).

Proposition 1.1.
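Consider a function f : ]−1, 1[ → R such that ∃ f′(0). Choose two sequences $(\alpha_n)_n$ and $(\beta_n)_n$ satisfying $-1 < \alpha_n < \beta_n < 1$ such that $\alpha_n \to 0$ and $\beta_n \to 0$ as n → ∞. Define

\[ d_n := \frac{f(\beta_n) - f(\alpha_n)}{\beta_n - \alpha_n}. \]

Then

(a) if $\alpha_n < 0 < \beta_n$, then $\lim_{n \to \infty} d_n = f'(0)$;

(b) if $0 < \alpha_n$ and the sequence $\big(\frac{\beta_n}{\beta_n - \alpha_n}\big)_n$ is bounded, then $\lim_{n \to \infty} d_n = f'(0)$;

(c) if f′ is continuous on ]−1, 1[, then $\lim_{n \to \infty} d_n = f'(0)$;

(d) there exists a differentiable function f on ]−1, 1[ (with discontinuous derivative) such that $\alpha_n \to 0$, $\beta_n \to 0$, and $\lim_{n \to \infty} d_n$ exists, but $\lim_{n \to \infty} d_n \ne f'(0)$.

Proof. (a) We write $d_n$ as a convex combination of the following form:

\[ \frac{f(\beta_n) - f(\alpha_n)}{\beta_n - \alpha_n} = \frac{f(\beta_n) - f(0)}{\beta_n} \cdot \frac{\beta_n}{\beta_n - \alpha_n} + \frac{f(0) - f(\alpha_n)}{-\alpha_n} \cdot \frac{-\alpha_n}{\beta_n - \alpha_n}, \]

\[ \lambda_1 = \frac{\beta_n}{\beta_n - \alpha_n}, \quad \lambda_2 = \frac{-\alpha_n}{\beta_n - \alpha_n}, \quad \lambda_1, \lambda_2 > 0, \quad \lambda_1 + \lambda_2 = 1. \]

It follows that

\[ \min\Big\{\frac{f(\beta_n) - f(0)}{\beta_n}, \frac{f(0) - f(\alpha_n)}{-\alpha_n}\Big\} \le \frac{f(\beta_n) - f(\alpha_n)}{\beta_n - \alpha_n} \le \max\Big\{\frac{f(\beta_n) - f(0)}{\beta_n}, \frac{f(0) - f(\alpha_n)}{-\alpha_n}\Big\}. \]

The assumption guarantees that

\[ \lim_{n \to \infty} \frac{f(\beta_n) - f(0)}{\beta_n} = \lim_{n \to \infty} \frac{f(0) - f(\alpha_n)}{-\alpha_n} = f'(0). \]

Now invoking theorem 1.9 from page 43 we get the conclusion.

(b) Consider the following identity:

\[ \frac{f(\beta_n) - f(\alpha_n)}{\beta_n - \alpha_n} - \frac{f(0) - f(\alpha_n)}{-\alpha_n} = \frac{\beta_n}{\beta_n - \alpha_n}\Big(\frac{f(\beta_n) - f(0)}{\beta_n} - \frac{f(0) - f(\alpha_n)}{-\alpha_n}\Big). \]

The sequence $\big(\frac{\beta_n}{\beta_n - \alpha_n}\big)_n$ is bounded, and since the difference in the parentheses tends to 0, the conclusion follows.

(c) By the Lagrange mean value theorem (theorem 2.3 of section 6.2), $d_n = f'(\xi_n)$ for some $\xi_n$ between $\alpha_n$ and $\beta_n$. Since $\xi_n \to 0$ and f′ is continuous at 0, the conclusion follows.

(d) Consider the function given in (b) of examples 1.1 and the sequences

\[ \alpha_n = \frac{2}{(4n + 1)\pi}, \qquad \beta_n = \frac{1}{2n\pi}. \]

Then $f(\beta_n) = 0$, $f(\alpha_n) = \alpha_n^2$ and $\beta_n - \alpha_n = \frac{\pi}{2}\alpha_n\beta_n$, so that $d_n = -\frac{2}{\pi}\cdot\frac{\alpha_n}{\beta_n} \to -\frac{2}{\pi} \ne 0 = f'(0)$. □

Theorem 1.5. Let I, J ⊂ R be compact intervals and f : I → J be a continuous and bijective mapping. Suppose the function f is differentiable at a point $x_0 \in I$ and $f'(x_0) \ne 0$. Then the inverse function $f^{-1} : J \to I$ is differentiable at $y_0 = f(x_0)$ and it holds

\[ (f^{-1})'(y_0) = \frac{1}{f'(x_0)}. \]

Proof. Since the function f is continuous and bijective and the interval I is compact, it follows that $f^{-1}$ is continuous (theorem 2.9, page 80).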
Therefore

\[ x_n \in I \setminus \{x_0\},\ x_n \to x_0 \iff y_n = f(x_n) \in J \setminus \{y_0\},\ y_n \to y_0. \tag{1.8} \]

Take y = f(x). Then

\[ \frac{f^{-1}(y) - f^{-1}(y_0)}{y - y_0} = \frac{1}{\dfrac{f(x) - f(x_0)}{x - x_0}}, \qquad \forall x \in I \setminus \{x_0\}. \]

From (1.8) it follows that $y \to y_0 \iff x \to x_0$, hence

\[ \lim_{y \to y_0} \frac{f^{-1}(y) - f^{-1}(y_0)}{y - y_0} = \lim_{x \to x_0} \frac{1}{\dfrac{f(x) - f(x_0)}{x - x_0}} = \frac{1}{f'(x_0)}. \]

□

6.2 Mean value theorems

Let f be a real function defined on a metric space X. We say that f has a local maximum at a point p ∈ X if there exists δ > 0 such that f(q) ≤ f(p) for all q ∈ B(p, δ). Analogously, we say that f has a local minimum at a point p ∈ X if there exists δ > 0 such that f(q) ≥ f(p) for all q ∈ B(p, δ).

Theorem 2.1. (Fermat; Pierre de Fermat, 1601-1665) Let f be defined on [a, b]. If f has a local maximum at a point x ∈ ]a, b[ and if f′(x) exists, then f′(x) = 0. The analogous statement for a local minimum is true, too.

Proof. The idea is suggested in figure 6.1. Choose δ in accordance with the above definition, so that

\[ a < x - \delta < x < x + \delta < b. \]

If x − δ < t < x, then

\[ \frac{f(t) - f(x)}{t - x} \ge 0. \]

Letting t → x, we see that f′(x) ≥ 0. If x < t < x + δ, then

\[ \frac{f(t) - f(x)}{t - x} \le 0. \]

Letting t → x, we see that f′(x) ≤ 0. Hence f′(x) = 0. □

[Figure 6.1]

[Figure 6.2]

Theorem 2.2. (Cauchy) If f and g are continuous real functions on [a, b] which are differentiable in ]a, b[, then there exists a point x ∈ ]a, b[ at which

\[ [f(b) - f(a)]g'(x) = [g(b) - g(a)]f'(x). \]

Proof. Put

\[ h(t) := [f(b) - f(a)]g(t) - [g(b) - g(a)]f(t), \qquad a \le t \le b. \]

Then h is continuous on [a, b] and differentiable on ]a, b[, and

\[ h(a) = f(b)g(a) - f(a)g(b) = h(b). \tag{2.1} \]

We have to show that h′(x) = 0 for some x ∈ ]a, b[. If h is constant, then h′(x) = 0 for every x ∈ ]a, b[. If h(t) > h(a) for some t ∈ ]a, b[, let x be a point in [a, b] at which h attains its maximum value (theorem 2.6, page 79). By (2.1) it follows that x ∈ ]a, b[, while by theorem 2.1 we get the conclusion.
If h(t) < h(a) for some t ∈ ]a, b[, the same reasoning applies if we choose as x a point in [a, b] at which h attains its minimum value. □

Theorem 2.3. (Lagrange) If f is a continuous real function on [a, b] which is differentiable in ]a, b[, then there exists a point x ∈ ]a, b[ at which

\[ f(b) - f(a) = (b - a)f'(x). \]

Proof. Take g(t) = t in theorem 2.2. The idea is suggested in figure 6.2. □

Theorem 2.4. Suppose f is differentiable on ]a, b[.

(a) If f′(t) ≥ 0 for any t ∈ ]a, b[, then f is monotonically increasing on ]a, b[;

(b) if f′(t) = 0 for any t ∈ ]a, b[, then f is constant on ]a, b[;

(c) if f′(t) ≤ 0 for any t ∈ ]a, b[, then f is monotonically decreasing on ]a, b[.

Proof. All conclusions can be read off from the relation

\[ f(x_2) - f(x_1) = (x_2 - x_1)f'(x), \]

which is valid for each pair of numbers $x_1, x_2 \in \,]a, b[$, for some x between $x_1$ and $x_2$. □

6.2.1 Consequences of the mean value theorems

Lemma 2.1. Consider I = [a, b[, a < b ≤ +∞. Suppose f, g : I → R are continuous functions. Moreover,

(i) f and g are differentiable on ]a, b[ and f′(t) ≤ g′(t) for every t ∈ ]a, b[;

(ii) f(a) ≤ g(a).

Then f(t) ≤ g(t) for any t ∈ I.

Proof. Define h = g − f. The function h is differentiable on ]a, b[ and h′(t) ≥ 0 for any t ∈ ]a, b[. From theorem 2.4 and the continuity of h at a it follows that h is increasing on I, and hence h(t) ≥ h(a) ≥ 0 for all t ∈ [a, b[. □

Lemma 2.2. Consider I = [a, b], a < b. Suppose f, g : I → R are continuous functions. Moreover,

(i) f and g are differentiable on ]a, b[;

(ii) |f′(t)| ≤ g′(t) for every t ∈ ]a, b[.

Then |f(b) − f(a)| ≤ g(b) − g(a).

Proof. The conclusion is equivalent to

\[ g(a) - g(b) \le f(b) - f(a) \le g(b) - g(a), \tag{2.2} \]

that is,

\[ g(b) - f(b) - (g(a) - f(a)) \ge 0 \quad \text{and} \quad f(b) + g(b) - (f(a) + g(a)) \ge 0. \]

These two inequalities suggest considering the auxiliary functions $h_1, h_2 : I \to R$,

\[ \begin{cases} h_1(t) = g(t) - f(t) - (g(a) - f(a)), \\ h_2(t) = f(t) + g(t) - (f(a) + g(a)). \end{cases} \]
We note that $h_1(a) = h_2(a) = 0$ and

\[ \begin{cases} h_1'(t) = g'(t) - f'(t) \ge 0, \\ h_2'(t) = f'(t) + g'(t) \ge 0. \end{cases} \]

Then $h_1(t) \ge h_1(a)$ and $h_2(t) \ge h_2(a)$ on [a, b], and hence, taking t = b, (2.2) follows. □

Lemma 2.3. Let I ⊂ R be a nonempty interval, $t_0 \in I$. Consider f : I → R a continuous function on I, differentiable on $I \setminus \{t_0\}$, and let φ be the quotient (1.1) formed at the point $x = t_0$.

(a) If f′ has a left-hand side limit at $t_0$, then φ has a left-hand side limit at $t_0$ and

\[ \lim_{\substack{t \to t_0 \\ t < t_0}} f'(t) = \lim_{\substack{t \to t_0 \\ t < t_0}} \varphi(t). \]

(b) If f′ has a right-hand side limit at $t_0$, then φ has a right-hand side limit at $t_0$ and

\[ \lim_{\substack{t \to t_0 \\ t > t_0}} f'(t) = \lim_{\substack{t \to t_0 \\ t > t_0}} \varphi(t). \]

(c) If f′ has a finite limit at $t_0$, then f is differentiable at $t_0$ and f′ is continuous at $t_0$.

Proof. (a) Let $t < t_0$, t ∈ I. By the mean value theorem (theorem 2.3) there exists $c_t \in \,]t, t_0[$ such that

\[ \varphi(t) = \frac{f(t) - f(t_0)}{t - t_0} = f'(c_t). \]

Letting $t \to t_0$ we get the conclusion.

(b) Similar to (a).

(c) It follows from (a) and (b). □

Lemma 2.4. Let I ⊂ R be a nonempty interval and f : I → R be a differentiable function on I. If f′ is bounded on I, then f is Lipschitz on I.

Proof. Since f′ is bounded on I, there is a positive L such that |f′(t)| ≤ L for any t ∈ I. Then for any $t_1, t_2 \in I$, $t_1 < t_2$, there is $c \in \,]t_1, t_2[$ such that

\[ |f(t_2) - f(t_1)| = |f'(c)(t_2 - t_1)| \le L(t_2 - t_1), \]

hence the function f is Lipschitz on I. □

Theorem 2.5. (Denjoy-Bourbaki theorem, [13, vol. 2, p. 77]; Arnaud Denjoy, 1884-1974; Nikolas Bourbaki, collective pseudonym of several French mathematicians, 1939 ↑) Let I ⊂ R be a nonempty interval, a, b ∈ I, a < b, and f : I → R be a function on I. Suppose

(i) the function f is continuous on [a, b];

(ii) there is an at most countable set A ⊂ ]a, b[ such that f is right differentiable on ]a, b[ ∖ A.

Then

\[ \inf_{t \in \,]a,b[\, \setminus A} \varphi(t+) \le \frac{f(b) - f(a)}{b - a} \le \sup_{t \in \,]a,b[\, \setminus A} \varphi(t+). \]

Proof. We prove that

\[ \frac{f(b) - f(a)}{b - a} \le \sup_{t \in \,]a,b[\, \setminus A} \varphi(t+) =: M. \]

The other side follows in a similar way.
If M = +∞, the inequality is obvious. Suppose M < ∞ and consider the function g : I → R, g(t) := Mt − f(t). We remark that the function g fulfills the assumptions of lemma 2.6 on [a, b]. Thus g is increasing on [a, b]. Then g(b) ≥ g(a), that is, M(b − a) ≥ f(b) − f(a). □

Corollary 2.1. Suppose f, g : I → R are continuous on a nonempty interval I. Then

(a) f is constant on I if and only if there is an at most countable set A ⊂ I such that f is right-hand side differentiable on I ∖ A and φ(t+) = 0 for any t ∈ I ∖ A;

(b) f − g is constant on I if and only if there is an at most countable set A ⊂ I such that f and g are right-hand side differentiable on I ∖ A and $\varphi_f(t+) = \varphi_g(t+)$ for any t ∈ I ∖ A, where $\varphi_f$ and $\varphi_g$ are the φ functions corresponding to f, respectively to g.

Remark. The right-hand side differentiability in the Denjoy-Bourbaki theorem may be substituted by the left-hand side differentiability. △

Lemma 2.5. Let I ⊂ R be a nonempty interval. Suppose

(i) the function f is continuous on I;

(ii) there is an at most countable set A ⊂ I such that for every s ∈ I ∖ A and every δ > 0 there is t ∈ ]s, s + δ[ satisfying f(t) ≥ f(s).

Then the function f is increasing on I.

Proof. Consider a, b ∈ I, a < b. We show that f(a) ≤ f(b). Let λ ∉ f(A), λ < f(a). Consider the set

\[ S_\lambda := \{s \in [a, b] \mid \lambda \le f(s)\}. \]

Note that

• $S_\lambda \ne \emptyset$, since $a \in S_\lambda$;

• the set $S_\lambda$ is bounded above, since $S_\lambda \subset [a, b]$.

Hence there exists $M := \sup S_\lambda$ (∈ [a, b]). Since M is the supremum of $S_\lambda$, there is a sequence $s_n \in S_\lambda$ such that $s_n \to M$. Since f is continuous at M,

\[ \lambda \le f(s_n) \text{ for all } n \implies \lambda \le f(M). \tag{2.3} \]

So $M \in S_\lambda$. Now we show that M = b. Suppose M < b. Then M is a limit point of the complement of $S_\lambda$ in [a, b], so there is a sequence $t_n \notin S_\lambda$, $t_n \to M$. It follows that $f(t_n) < \lambda$ and, due to the continuity of f,

\[ f(M) \le \lambda. \tag{2.4} \]

From (2.3) and (2.4) it follows that f(M) = λ, and thus M ∉ A.
On the other hand, since M < b and by (ii), there is t ∈ ]M, b[ such that λ = f(M) ≤ f(t), and so $t \in S_\lambda$ and t > M, which is a contradiction. Thus M = b. Hence for any λ ∉ f(A) such that λ < f(a) it follows that λ ≤ f(b). Since f(A) is at most countable, such numbers λ can be chosen arbitrarily close to f(a). So we have that f(a) ≤ f(b). □

Lemma 2.6. Let I ⊂ R be a nonempty interval. Suppose

(i) the function f is continuous on I;

(ii) there is an at most countable set A ⊂ I such that φ(t+) exists (φ defined by (1.1)) for any t ∈ I ∖ A;

(iii) φ(t+) ≥ 0 for any t ∈ I ∖ A.

Then the function f is increasing on I.

Proof. From (ii) it follows that for every ε > 0 and t ∈ I ∖ A there is δ = δ(ε, t) such that for every h ∈ ]0, δ[ it holds

\[ \Big|\frac{f(t + h) - f(t)}{h} - \varphi(t+)\Big| < \varepsilon, \]

which, by (iii), means that

\[ -\varepsilon \le \varphi(t+) - \varepsilon < \frac{f(t + h) - f(t)}{h} < \varphi(t+) + \varepsilon, \]

and so

\[ f(t + h) - f(t) + h\varepsilon > 0, \quad \text{for every } h \in \,]0, \delta[,\ t \in I \setminus A, \]

that is,

\[ f(t + h) + (t + h)\varepsilon - [f(t) + t\varepsilon] > 0, \quad \text{for every } h \in \,]0, \delta[,\ t \in I \setminus A. \]

Denote $g_\varepsilon(t) := f(t) + t\varepsilon$. Then the last inequality is equivalent to

\[ g_\varepsilon(t + h) - g_\varepsilon(t) > 0, \quad \text{for every } h \in \,]0, \delta[,\ t \in I \setminus A. \]

From lemma 2.5 it follows that $g_\varepsilon$ is increasing on I, that is, for any $t_1, t_2 \in I$, $t_1 < t_2$, we have

\[ g_\varepsilon(t_1) \le g_\varepsilon(t_2) \iff f(t_1) \le f(t_2) + \varepsilon(t_2 - t_1). \]

Letting ε → 0 we get that $f(t_1) \le f(t_2)$, meaning that f is increasing. □

6.3 The continuity of derivatives

We have already seen, by the example

\[ f(x) = \begin{cases} x^2 \sin\frac{1}{x}, & x \ne 0, \\ 0, & x = 0, \end{cases} \]

that a function f may have a derivative f′ which exists at every point, but is discontinuous at some point. However, not every function is a derivative. In particular, derivatives which exist at every point of an interval have one important property in common with functions which are continuous on an interval: intermediate values are assumed.

Theorem 3.1. (Darboux property of derivatives) Suppose f is a real differentiable function on [a, b] and suppose f′(a) < λ < f′(b). Then there is a point x ∈ ]a, b[ such that f′(x) = λ. A similar result holds, of course, if f′(a) > f′(b).

Proof.
Put c = (a + b)/2. If a ≤ t ≤ c, define α(t) = a, β(t) = 2t − a. If c ≤ t ≤ b, define α(t) = 2t − b, β(t) = b. Then a ≤ α(t) < β(t) ≤ b in ]a, b[. Define

\[ g(t) := \frac{f(\beta(t)) - f(\alpha(t))}{\beta(t) - \alpha(t)}, \qquad a < t < b. \]

Then g is continuous on ]a, b[, g(t) → f′(a) as t → a, g(t) → f′(b) as t → b, and so theorem 2.12, page 82, implies that $g(t_0) = \lambda$ for some $t_0 \in \,]a, b[$. Fix $t_0$. By theorem 2.3, page 97, there is a point x such that $\alpha(t_0) < x < \beta(t_0)$ and such that $f'(x) = g(t_0)$. Hence f′(x) = λ. □

Corollary 3.1. If the function f is differentiable on [a, b], then f′ cannot have any discontinuities of the first kind on [a, b].

6.4 L'Hospital theorem

Theorem 4.1. (L'Hospital; Guillaume de L'Hospital, marquis de Sainte-Mesme, 1661-1704) Suppose f and g are real and differentiable in ]a, b[ and g′(x) ≠ 0 for all x ∈ ]a, b[, where −∞ ≤ a < b ≤ +∞. Suppose

\[ \frac{f'(x)}{g'(x)} \to A \quad \text{as } x \to a. \tag{4.1} \]

If

\[ f(x) \to 0 \text{ and } g(x) \to 0 \quad \text{as } x \to a, \tag{4.2} \]

or if

\[ g(x) \to +\infty \quad \text{as } x \to a, \tag{4.3} \]

then

\[ \frac{f(x)}{g(x)} \to A \quad \text{as } x \to a. \tag{4.4} \]

Proof. We first consider the case when −∞ ≤ A < +∞. Choose a real number q such that A < q, and then choose r such that A < r < q. By (4.1) there is a point c ∈ ]a, b[ such that a < x < c implies

\[ \frac{f'(x)}{g'(x)} < r. \tag{4.5} \]

If a < x < y < c, then the Cauchy theorem 2.2 shows that there is a point t ∈ ]x, y[ such that

\[ \frac{f(x) - f(y)}{g(x) - g(y)} = \frac{f'(t)}{g'(t)} < r. \tag{4.6} \]

Suppose (4.2) holds. Letting x → a in (4.6) we see that

\[ \frac{f(y)}{g(y)} \le r < q \qquad (a < y < c). \tag{4.7} \]

Next, suppose (4.3) holds. Keeping y fixed in (4.6), we can choose a point $c_1 \in \,]a, y[$ such that g(x) > g(y) and g(x) > 0 if $a < x < c_1$. Multiplying (4.6) by [g(x) − g(y)]/g(x), we get

\[ \frac{f(x)}{g(x)} < r - r\,\frac{g(y)}{g(x)} + \frac{f(y)}{g(x)} \qquad (a < x < c_1). \tag{4.8} \]

If we let x → a in (4.8), then (4.3) shows that there is a point $c_2 \in \,]a, c_1[$ such that

\[ \frac{f(x)}{g(x)} < q \qquad (a < x < c_2). \tag{4.9} \]

Summing up, (4.7) and (4.9) show that for any q, subject only to the condition A < q, there is a point $c_2$ such that f(x)/g(x) < q if $a < x < c_2$.

In the same manner, if −∞ < A ≤ +∞ and p is chosen so that p < A, we can find a point $c_3$ such that

\[ p < \frac{f(x)}{g(x)} \qquad (a < x < c_3), \tag{4.10} \]

and (4.4) follows from these two statements. □

6.5 Higher order derivatives

Theorem 5.1. (Leibniz formula) Let u and v be two functions having derivatives up to order n on an interval. Then

\[ (uv)^{(n)} = u^{(n)}v + \binom{n}{1}u^{(n-1)}v' + \dots + \binom{n}{n-1}u'v^{(n-1)} + uv^{(n)}. \]

Proof. By induction. □

Proposition 5.1. The following identities hold on R:

\[ (e^x)^{(n)} = e^x, \qquad \sin^{(n)}(x) = \sin\Big(x + n\frac{\pi}{2}\Big), \qquad \cos^{(n)}(x) = \cos\Big(x + n\frac{\pi}{2}\Big). \]

6.6 Convex functions and differentiability

Theorem 6.1. Suppose f : I → R is convex, a = inf I, b = sup I, I being an interval. Then

(a) f has side derivatives on ]a, b[ and for any $t_1, t_2 \in \,]a, b[$, $t_1 < t_2$, we have

\[ f_l'(t_1) \le f_r'(t_1) \le f_l'(t_2) \le f_r'(t_2); \]

(b) if a ∈ I (b ∈ I), then f is right-hand differentiable at a (respectively, left-hand differentiable at b) and $f_r'(a) \le f_l'(t)$ (respectively, $f_r'(t) \le f_l'(b)$), t ∈ int(I);

(c) there is an at most countable set A ⊂ I such that f is differentiable on I ∖ A.

Proof. (a) Suppose t ∈ ]a, b[. Since $s_{f,t}$ is increasing on I ∖ {t}, it follows that $s_{f,t}$ has finite side limits at t, and hence f has side derivatives at t. Suppose $t_1, t_2 \in \,]a, b[$, $t_1 < t_2$, and choose u, v, w satisfying $a < u < t_1 < v < t_2 < w < b$. Then from

\[ s_{f,t_1}(u) \le s_{f,t_1}(v) = s_{f,v}(t_1) \le s_{f,v}(t_2) = s_{f,t_2}(v) \le s_{f,t_2}(w) \]

it follows that

\[ f_l'(t_1) = s_{f,t_1}(t_1-) \le s_{f,t_1}(t_1+) = f_r'(t_1) \le s_{f,t_2}(t_2-) = f_l'(t_2) \le s_{f,t_2}(t_2+) = f_r'(t_2). \]

(b) If a ∈ I, we repeat the previous proof for $a < t_1 < u$.

(c) From (a) it follows that $f_l'$ is increasing on ]a, b[. Hence there is an at most countable set A such that $f_l'$ is continuous on ]a, b[ ∖ A. Choose $t_0 \in \,]a, b[\, \setminus A$. Then $f_l'$ is continuous at $t_0$, and for $t > t_0$

\[ f_l'(t_0) \le f_r'(t_0) \le f_l'(t). \]
Letting $t \to t_0$, we get $f'_l(t_0) = f'_r(t_0)$, that is, $f$ is differentiable at $t_0$. Hence $f$ is differentiable on $I \setminus A$. $\square$

Corollary 6.1. If $f : I \to \mathbb{R}$ is convex, then $f$ is continuous on the interior of $I$.

Remark. A convex function need not be continuous at the endpoints of $I$. Indeed, consider
\[ f : [0, 1] \to \mathbb{R}, \qquad f(t) = \begin{cases} 0, & t \in\, ]0, 1[ \\ 1, & t \in \{0, 1\}. \end{cases} \]
$\triangle$

Corollary 6.2. Suppose $f : I \to \mathbb{R}$ is convex, $a = \inf I$, $b = \sup I$, and $t_0 \in\, ]a, b[$. Then
\[ f(t) \ge f(t_0) + m(t - t_0) \]
for every $t \in I$ and $m \in [f'_l(t_0), f'_r(t_0)]$.
Proof. If $t > t_0$,
\[ s_{f,t_0}(t) = \frac{f(t) - f(t_0)}{t - t_0} \ge \inf_{s \in I,\ s > t_0} s_{f,t_0}(s) = f'_r(t_0) \ge m. \]
Similarly, if $t < t_0$,
\[ s_{f,t_0}(t) = \frac{f(t) - f(t_0)}{t - t_0} \le \sup_{s \in I,\ s < t_0} s_{f,t_0}(s) = f'_l(t_0) \le m. \]
Multiplying each inequality by $t - t_0$ (which is negative in the second case), the conclusion follows. $\square$

Corollary 6.3. Suppose $f : I \to \mathbb{R}$ is differentiable on $I$. Then $f$ is convex if and only if
\[ f(t) \ge f(t_0) + f'(t_0)(t - t_0) \tag{6.1} \]
for any $t, t_0 \in I$.
Proof. The necessity part follows from the previous corollary.
Sufficiency. From (6.1), for any $a, b \in I$ and $\alpha \in [0, 1]$ it follows that
\[ f(a) \ge f((1 - \alpha)a + \alpha b) + f'((1 - \alpha)a + \alpha b)\,\alpha(a - b), \]
\[ f(b) \ge f((1 - \alpha)a + \alpha b) - f'((1 - \alpha)a + \alpha b)\,(1 - \alpha)(a - b). \]
Multiply the first inequality by $(1 - \alpha)$, the second by $\alpha$, and then sum them. The result is
\[ (1 - \alpha)f(a) + \alpha f(b) \ge f((1 - \alpha)a + \alpha b), \]
hence $f$ is convex. $\square$

Corollary 6.4. (Fermat theorem for convex functions) Suppose $f : I \to \mathbb{R}$ is convex and differentiable on $I$. Then for any point $t_0 \in \operatorname{int}(I)$ the following statements are equivalent:
(a) $(t_0, f(t_0))$ is the global minimum of $f$;
(b) $(t_0, f(t_0))$ is a local minimum of $f$;
(c) $f'(t_0) = 0$.
Proof. (a) $\Longrightarrow$ (b) is obvious. (b) $\Longrightarrow$ (c) follows without any convexity assumption, by theorem 2.1. (c) $\Longrightarrow$ (a). This implication follows from the previous corollary. $\square$

We have seen that a convex function $f : I \to \mathbb{R}$ is continuous on the interior of $I$. An even stronger statement holds on any compact subinterval of $I$.

Corollary 6.5. Let $f : I \to \mathbb{R}$ be a convex function.
Then $f$ is Lipschitz on any compact interval $[a, b]$ in $I$.
Proof. Consider an arbitrary subinterval $[a, b] \subset I$ and $u, v \in [a, b]$, $u < v$. Denote $M := \max\{|f'_r(a)|, |f'_l(b)|\}$. Then
\[ -M \le f'_r(a) \le f'_r(u) \le \frac{f(u) - f(v)}{u - v} \le f'_l(v) \le f'_l(b) \le M. \]
Thus $|f(u) - f(v)| \le M|u - v|$ for any $u, v \in [a, b]$. $\square$

Theorem 6.2. A differentiable function $f : I \to \mathbb{R}$ is convex (strictly convex) if and only if its first derivative is increasing (respectively, strictly increasing).
Proof. The necessity part follows from theorem 6.1.
Sufficiency. Suppose, to the contrary, that $f$ is differentiable with increasing first derivative but is not convex. Then there exist $a, b, c \in I$, $a < b < c$, with $s_{f,b}(a) > s_{f,b}(c)$, that is,
\[ \frac{f(a) - f(b)}{a - b} > \frac{f(c) - f(b)}{c - b}. \]
By the Lagrange mean value theorem there are $t_1 \in\, ]a, b[$ and $t_2 \in\, ]b, c[$ with $f'(t_1) > f'(t_2)$. Since $t_1 < t_2$, this contradicts the assumption that the first derivative is increasing. $\square$

Corollary 6.6. A differentiable function $f : I \to \mathbb{R}$ is concave (strictly concave) if and only if its derivative is decreasing (respectively, strictly decreasing).
Proof. Apply the previous theorem to $-f$. $\square$

Corollary 6.7. (Jensen) Let $f : I \to \mathbb{R}$ be a function with second order derivative on $I$. Then $f$ is convex (concave) if and only if $f'' \ge 0$ (respectively, $f'' \le 0$).
Proof. The claim follows from the remark that $f'$ is increasing (decreasing) if and only if $f'' \ge 0$ (respectively, $f'' \le 0$). $\square$

Corollary 6.8. Let $f : I \to \mathbb{R}$ be a function with second order derivative on $I$. If $f'' > 0$ (respectively, $f'' < 0$) on $I$, then $f$ is strictly convex (respectively, strictly concave). The converse fails in general: $f(x) = x^4$ is strictly convex on $\mathbb{R}$, yet $f''(0) = 0$.

6.6.1 Inequalities

Proposition 6.1. (Young generalized inequality) Take $n \in \mathbb{N}^*$, $y_k > 0$, $p_k > 0$, for $k = 1, \dots, n$, and $\sum_{k=1}^{n} 1/p_k = 1$. Then
\[ \prod_{k=1}^{n} y_k \le \sum_{k=1}^{n} \frac{1}{p_k}\, y_k^{p_k}. \tag{6.2} \]
For $n = 2$ the above inequality reduces to (3.3) at page 20.
Proof. Consider the function $f(x) = \exp(x)$, $x \in \mathbb{R}$.
Since $f''(x) > 0$ for every $x \in \mathbb{R}$, we infer that the function $f$ is convex. Based on Jensen inequality (2.18) at page 89, taking $\alpha_k = 1/p_k$ and $x_k = \ln y_k^{p_k}$, we may write
\[ \exp\left(\sum_{k=1}^{n} \frac{1}{p_k} \ln y_k^{p_k}\right) \le \sum_{k=1}^{n} \frac{1}{p_k} \exp\left(\ln y_k^{p_k}\right). \]
But
\[ \exp\left(\sum_{k=1}^{n} \frac{1}{p_k} \ln y_k^{p_k}\right) = \exp\left(\sum_{k=1}^{n} \ln y_k\right) = \prod_{k=1}^{n} y_k, \qquad \sum_{k=1}^{n} \frac{1}{p_k} \exp\left(\ln y_k^{p_k}\right) = \sum_{k=1}^{n} \frac{1}{p_k}\, y_k^{p_k}, \]
and the generalized Young inequality follows. $\square$

Proposition 6.2. (Generalized mean inequality) Take $n \in \mathbb{N}^*$, $x_i > 0$, and $\alpha_i \ge 0$, $i = 1, \dots, n$, satisfying $\alpha_1 + \cdots + \alpha_n = 1$. Then
\[ x_1^{\alpha_1} x_2^{\alpha_2} \cdots x_n^{\alpha_n} \le \alpha_1 x_1 + \cdots + \alpha_n x_n. \tag{6.3} \]
Proof. Consider the function $f(x) = \ln x$, $x > 0$. Since $f''(x) < 0$ for every $x > 0$, we infer that the function $f$ is (strictly) concave. So
\[ \ln\left(\sum_{i=1}^{n} \alpha_i x_i\right) \ge \sum_{i=1}^{n} \alpha_i \ln x_i, \]
and (6.3) follows. $\square$

Corollary 6.9. Taking $\alpha_1 = \cdots = \alpha_n = 1/n$ in (6.3) it follows that
\[ \sqrt[n]{x_1 \cdots x_n} \le \frac{x_1 + x_2 + \cdots + x_n}{n}. \tag{6.4} \]
Cauchy’s original proof of the above inequality is given at page 24 and is available in many books; let us mention only one, namely [21, Part 2, Chapter 2].

Corollary 6.10. Substituting $x_i$ by $1/x_i$ in (6.3) we get
\[ \left(\frac{\alpha_1}{x_1} + \cdots + \frac{\alpha_n}{x_n}\right)^{-1} \le x_1^{\alpha_1} x_2^{\alpha_2} \cdots x_n^{\alpha_n}. \tag{6.5} \]

Corollary 6.11. Taking $\alpha_1 = \cdots = \alpha_n = 1/n$ in (6.5) it follows that
\[ \frac{n}{\dfrac{1}{x_1} + \cdots + \dfrac{1}{x_n}} \le \sqrt[n]{x_1 \cdots x_n}. \tag{6.6} \]

Chapter 7 Integral calculus

The aim of the present chapter is to introduce some basic results on integral calculus.

7.1 The Riemann integral

Let $f$ be defined on $[a, b]$ and real-valued.

7.2 The Gronwall inequality

A basic tool in many results connected to differential equations and inclusions is the following one.

Lemma 2.1. Suppose a continuous function $x : [a, b] \to \mathbb{R}$ satisfies
\[ 0 \le x(t) \le c + \int_a^t h(s)x(s)\,ds, \qquad t \in [a, b], \tag{2.1} \]
for some constant $c$ and some nonnegative integrable function $h : [a, b] \to \mathbb{R}$. Then
\[ 0 \le x(t) \le c + c \int_a^t h(r) \exp\left(\int_r^t h(s)\,ds\right) dr, \qquad t \in [a, b]. \tag{2.2} \]
Proof. Denote
\[ w(t) := c + \int_a^t h(s)x(s)\,ds. \]
Then
\[ w'(t) = h(t)x(t), \qquad w(t) > 0, \qquad w(a) = c. \]
From (2.1) it follows that $x(t) \le w(t)$ for any $t \in [a, b]$. Thus the following sequence of implications holds:
\[ w'(t) = h(t)x(t) \le h(t)w(t) \Longrightarrow \frac{w'(t)}{w(t)} \le h(t) \Longrightarrow \int_a^t \frac{w'(s)}{w(s)}\,ds \le \int_a^t h(s)\,ds \]
\[ \Longrightarrow \ln w(t) \le \int_a^t h(s)\,ds + \ln c \Longrightarrow w(t) \le c \cdot \exp\left(\int_a^t h(s)\,ds\right) \Longrightarrow x(t) \le c \cdot \exp\left(\int_a^t h(s)\,ds\right). \]
Substituting in (2.1), we get (2.2). Indeed, both the resulting bound and the right-hand side of (2.2) equal $c \cdot \exp\left(\int_a^t h(s)\,ds\right)$, since
\[ \int_a^t h(r) \exp\left(\int_a^r h(s)\,ds\right) dr = \int_a^t h(r) \exp\left(\int_r^t h(s)\,ds\right) dr = \exp\left(\int_a^t h(s)\,ds\right) - 1. \]
$\square$

Bibliography

[1] M. BALÁZS and I. KOLUMBÁN, Matématikai analizis, Dacia, Cluj-Napoca, 1978 (Hungarian).
[2] D. M. BĂTINEŢU, Şiruri (Romanian).
[3] W. W. BRECKNER, Analiză matematică. Topologia spaţiului $\mathbb{R}^n$, Universitatea din Cluj-Napoca, Cluj-Napoca, 1985 (Romanian).
[4] W.-S. CHEUNG, Generalizations of Hölder’s inequality, Internat. J. Math. & Math. Sci. 26 (2001), no. 1, 7–10.
[5] Ş. COBZAŞ, Analiză matematică, Presa Universitară Clujeană, Cluj-Napoca, 1997 (Romanian).
[6] D. I. DUCA and E. DUCA, Culegere de probleme de analiză matematică, 1, 2, GIL, Zalău, 1997 (Romanian).
[7] R. ENGELKING, General Topology, Monografie Matematyczne, PWN, Warszawa, 1977.
[8] P. R. HALMOS, Naive Set Theory, Van Nostrand, Princeton, New Jersey, 1967.
[9] E. HEWITT and K. STROMBERG, Real and Abstract Analysis, Springer-Verlag, New York, 1975, A modern treatment of the theory of functions of a real variable, Third printing, Graduate Texts in Mathematics, No. 25.
[10] T. J. JECH, Lectures on Set Theory with Particular Emphasis on the Method of Forcing, Lecture Notes on Mathematics, vol. 217, Springer, Berlin, 1971.
[11] D. E. KNUTH, Seminumerical algorithms, second ed., The Art of Computer Programming, vol. 1-7, Addison-Wesley, Reading, Massachusetts, 1973.
[12] I. MARUŞCIAC, Analiză matematică I, Universitatea Babeş-Bolyai, Cluj-Napoca, 1980 (Romanian).
[13] M. MEGAN, Bazele analizei matematice, Matuniv, vol. 4, Eurobit, Timişoara, 1997 (Romanian).
[14] M. MEGAN, Analiză matematică, vol. 1, Mirton, Timişoara, 1999 (Romanian).
[15] M. MEGAN, Analiză matematică, vol. 2, Mirton, Timişoara, 1999 (Romanian).
[16] M. MEGAN, A. L. SASU, and B. SASU, Calcul diferenţial în R, prin exerciţii şi probleme, Editura Universităţii de Vest, Timişoara, 2001 (Romanian).
[17] C. MEGHEA, Foundation of Mathematical Analysis. Treatise of Analysis, Ed. Ştiinţifică şi Enciclopedică, Bucureşti, 1977 (Romanian).
[18] M. MUREŞAN, Introducere in Control Optimal, Risoprint, Cluj-Napoca, 1999 (Romanian).
[19] M. MUREŞAN, Introduction to Set-Valued Analysis, Cluj University Press, Cluj-Napoca, 1999.
[20] M. MUREŞAN, Analiză Nenetedă şi Aplicaţii, Risoprint, Cluj-Napoca, 2001 (Romanian).
[21] G. PÓLYA and G. SZEGÖ, Problems and Theorems in Analysis I, Springer-Verlag, Berlin, 1972.
[22] T. POPOVICIU, Numerical Analysis. Basic Notions of Approximative Calculus, Calculus Theory, Numerical Analysis, and Computer Science, vol. 1, Acad. R. S. R., Bucharest, 1975 (Romanian).
[23] W. RUDIN, Principles of Mathematical Analysis, McGraw-Hill, New York, 1976.
[24] V. A. SADOVNICHIĬ, A. A. GRIGORIAN, and S. B. KONIAGIN, Problems from Students’ Mathematics Olympiads, Ed. Moscow University, 1987 (Russian).
[25] G. SIREŢCHI, Calcul diferenţial şi integral, I, II, EDP, Bucureşti, 1985 (Romanian).