* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download Constructive Analysis Ch.2
Wiles's proof of Fermat's Last Theorem wikipedia , lookup
Law of large numbers wikipedia , lookup
History of the function concept wikipedia , lookup
Vincent's theorem wikipedia , lookup
Large numbers wikipedia , lookup
Dirac delta function wikipedia , lookup
Brouwer fixed-point theorem wikipedia , lookup
Mathematics of radio engineering wikipedia , lookup
Infinitesimal wikipedia , lookup
Collatz conjecture wikipedia , lookup
Elementary mathematics wikipedia , lookup
Non-standard analysis wikipedia , lookup
Hyperreal number wikipedia , lookup
Continuous function wikipedia , lookup
Fundamental theorem of calculus wikipedia , lookup
Georg Cantor's first set theory article wikipedia , lookup
Function of several real variables wikipedia , lookup
Real number wikipedia , lookup
Fundamental theorem of algebra wikipedia , lookup
Errett Bishop Douglas Bridges Constructive Analysis Springer-Verlag Berlin Heidelberg NewYork Tokyo Chapter 2. Calculus and the Real Numbers Section 1 establishes some conventions about sets and functions. TI1e next three sections are devoted to constructing the real numbers as certain Cauchy sequences of rational numbers, and investigating their order and arithmetic. The rest of the chapter deals with the basic ideas of the calculus of one variable. Topics covered include continuity, the convergence of sequences and series of continuous functiom, differentiation, integration, Taylor's theorem, and the basic properties of the exponemial and trigonometric functions and their inverses. Most of the material is a routine constructivization of the corresponding part of classical mathematics ; for this reason it affords a good introduction co the constructive approach. We assume that the reader is familiar with the order and arithmetic of the integers and the rational numbers. For us, a rational number will be an expression of the form pj q, where p and q a re integers with q =¥ 0. Two ratio nal numbers pjq and p'fq' are equal if pq' = p' q. The integer n is identified with the rational number nf l. There are geometric magnitudes which are not represented by rational numbers, and which can only be described by a sequence of rational approximations. Certain such approximating sequences are called real numbers. In this chapter we construct the real numbers and study their basic properties. Then we develop the funda mental ideas of the calculus. 1. Sets and Functions Before constructing the real numbers, we introduce some notions which are ba~ic to much of mathematics. The totality of all mathematical objects constructed in accordance with certain requirements is called a set. The requirements of the I. Sets and Functions 15 construction, which vary with the set under consideration, d etermine the set. Thus the integers form a set, the rational numbers form a set, and (we ant icipate here the formal definition of 'sequence') the collection of all sequences of integers is a set. Each set wiiJ be endowed with a binary relation = of equality. This relation is a matter of convention, except that it must be an equiV<lience relation ; in other words, the following conditions must hold for aU objects x , y, and z in the set: (1.1) (i) X= X (ii) If x= y, then y= x (iii) If x = y and y= z, then x==· The relation of eq uality given above for rdtional numbers is an equivalence relation. In this example there is a fmite, mechanical procedure for deciding whether or not two given objects in the set are equal. Such a procedure wilJ not exist in general: there are instances in which we are unable to decide whether or not two g iven elements of a :.et a re equal; such an instance, in the theory of real numbers, will be given later. We use the standard notation a e A to denote that a is an element, or member, of the set A, or that the construction defi ning a satisfies the requirements a construction must-sat isfy in order to define an object of A. W e also use the notat ion {a 1 , a 2 , ... } for a set whose elements can be written in a (possibly finite) list. The dependence of one quantity on another is expressed by the basic notion of an operation. An operation from a set A into a set B is a linite routine f which assigns an element f(a) of B to each given element a of A. This routine must afford an explicit, fin ite, mechanical reduction of the procedure for constructing f(a) to the procedure for constructing a. If it is clear from the context what the sets A and B are, we sometimes denote f by a~-> f(a), in order to bring out the form of /(a) for a given element a of A. The set A is called the domain of the operation, and is denoted by dmnf In the most important case, we have f(a) = f(a' ) whenever a,a'e A and a = a'; the operation f is then called a fun ction, or a mapping o f A into B, or a map of A into B. For two functions f,g from A into B. f = g means that f(a) = g(a) for each element a of A . Taken with this equality relation, the collection of all functions from A into B becomes a set. The notation f: A-. B indicates that f is a function from the set A to the set B. A function x whose domain is the set Z + of positive integers is called a sequence. The object xn= x(n) is called the n•h term of the 16 Chapter 2 Calculus and the Real Numbers sequence. The finite routine x can be given explicitly, or it can be left to inference: for example, by writing the terms of the sequence in order (x 1,x 2 , ••• ) until the rule of their formation becomes clear. Different notations for the sequence whose nth term is X11 are: IU-+x., (x 1,x2 , ...), (x.):• ., and (x11). Thus the sequence whose nt 11 term is 11 2 can be written nl--+n 2, or (1,4, 9, ... ),or (n 2):::_ 1, or simply (11 2). A subsequence of a sequence (x.) consists of the sequence (x.) and a sequence (nk)~ 1 of positive integ ers such that n 1< 11 2 < .... We identify such a subsequence with the sequence whose k'h term is x.k . Sometimes we shall speak of sequences whose domain is some set of integers other than z+. For example, we shall write (x,);:-'. 0 to denote a mapping x from the set of nonnegative integers, where x,=:x(n) for each n. Another example arises as follows. If n is a positive integer, then a finite sequence of length n is a function from the set { 1, 2, ... , n} into a set B. The carresian product, or simply the produa, of sets X 1, ••• , X. is defined to be the set X =: X 1 xX 2 x . .. xX, of all ordered n-tuples (x 1, ••• ,x.) with x 1 e X 1 , x 2 e X 2 , • •• ,and x.eX,. Elements (x 1 , ••• ,x.) and (x'1, ••• ,x~) of the cartesian product are equal if the coordinates (or componencs) xk and x~ are equal elements of X 1 for each k. If x is a finite sequence of elements of a set B, then x can be identified with the element (x(l), ... , x(n)) of the cartesian product B" =: BxBx ... xB, where n is the length of x. Returning to functions in general, we say that a function f: A ..... B maps A onto B if to each element b of B there corresponds an element a of A with f(a) = b. In other words, f maps A onto B if there is an operation g from B into A such that f(g(b)) = b for each b in B. A set A is countable if there exists a mapping of z + onto A; intuitively, this means that the elements of A can be arranged in a sequence with · possible duplications. The elements of the cartesian product 7l x 7l of the set 7l of integers with itself can be arranged in a sequence as follows. We order the elements (m,n) of 7l x 7l, first according to the value of lml +In I, then according to the value of m, and finally according to the value of I. Sets and Functions J7 n. This produces the sequence ( 1.2) ((0, 0), (- 1, 0), (0, - 1), (0, 1)1 (1, 0), ( - 2, 0), ( - 1I 1)1 - • •. ), in which each element o f Z x Z occurs exactly once. In the sequence (1.2), omit every term (m,n) with n = O, and replace each term (111 11) with n =t= O by mjn ; this produces the sequence 1 (1.3) (0/ -1,0/ 1. -1 j -1, ... ). in which every expression p/q with p and cJ integers and q =t= 0 occurs exactly once. Keeping only the term 0/ 1 of (1.3) and those terms for wh ich q>0 1 p=t=O, and p is relatively prime to C/1 we obtain a sequence 1 1 1 (1.4) (0/1, - 1/1,1 / 1, ... ) which has the property that for any given rational number r there exists exactly one term equal to r. For each positive integer 11, let be the sct{0 1 11-l }.If there is a mapping of Z, onto the set A, then we say tha t A has at most 11 eleme11ts. A set with at most 11 clements for some 11 is said to be subfinite, or finitely enumerable. Note that every subfinitc or countable set has at least one element. Before we introduce stronger notions than countability and subfiniteness, we must discuss the composition of functions. T he composition of two functions f: A-+ B and g: B-+ C is the function go f: A -+ C defined by z. (g of)( a) = g(f(a)) 1 1 •• • 1 (ae A). Composition is associative: h o(g of) = (h o g) oJ whenever the compositions are defi ned. If f: A-+ B, g: B-+ A, and g{f(a)) = a for all u in A, then the function g is called a left i11verse of f, and the functi on f is called a right i11verse of g. (Note that f has a left inverse if and only if it is one-one, in the sense that a= a' for all elements u,a' o f A with f(a) = f(a').) When g is both a left and a right inverse of f, then it is simply called an inverse of f; f is then called a one-one correspondence, or a bijection, and the sets A and B are said to be in one-one correspondence with each other. A set which is in one-one correspondence with the set z+ of positive integers is said to be cowrtably i11j'inite. For example, let f be the sequence (1.4), and define a fun ction g from the set <Q of rational numbers to z+ by writing g(r) :: 11, where II is the unique positive integer for which f(n) =r. Then K is an inverse off; so that the set <Q 18 Chapter 2 Calculus and the Real Numbers is countably infinite. A similar proof using (1.2) shows that Z x Z is countably infinite. A set which is in one-one correspondence with is said to ltave n elements, and to be finite. Every finite set is countable. It is not true that every countable set is either countably infinite or subfinite. For example, let A consist of all positive integers n such that both 11 and n + 2 arc prime; then A is countable, but we do not know if it is either countably infinite or subfinite. T his does not rule out the possibility that at some time in the future A will have become countably infinite or sublinite; it is possible that tomorrow someone will show that A is subfinite. This set A has the property that if it is sublinite, then it is finite. Not all sets have this property. z. 2. The Real Number System The following definition is basic to everything that follows. (2.1) Definition. A sequence (x.) of rational numbers is regular if (2.1.1) A real number is a regular seq uence of rational numbers. Two real numbers x::(x.) and y::(y.) are equal if (2.1 .2) The set of real numbers is denoted by .R. (2.2) Proposition. Equality of rea/numbers is art equivalence relation. Proof: Parts (i) and (ii) of (1.1) are obvious. Part (iii) is a consequence of the following lemma. = (2.3) Lemma. The real numbers x (x") and y = (Y.) are equal if and only if for each positive integer j there exists a positive integer N1 such that (2.3.1 ) Proof: If x= y, then (2.3.1) holds with N 1:: 2j. 2. The Real Number System 19 Assume conversely that for each j in z + there exists N1 satisfying (2.3.1). Consider a positive integer n. If m and j are any positive integers with m;;; max {j, N1}, then lx.-y,.l ;:;;![x. -x'"l +lx'"- Y,.l + ly.,- Y,l ~(n - 1 +m-')+T 1 +(n- 1 +m- 1)<2n-• +3r '· Since this holds for all j in z+, (2.1.2) is valid. 0 Notice that the proof of Lemma (2.3) singles out a specific Ni satisfying (2.3.1). This situation is typical: every proof of a theorem which asserts the existence of an object must embody, at least implicitly, a finite routine for the construction of the object. The rational number x, is called the n•h rational approximation to the real number x::(xJ. Note that the operation from R to ~ which takes the real number x into its n•h rational approximation is not a function. For later use we wish to associate with each real number x :: (x.) an integer K"' such that This is done by letting K"' be the least integer which ts greater than lx .I+ 2. We call K"' the canonical bound for x. The development of the arithmetic of the real numbers offers no surprises: we operate with real numbers by operating with their rational approximations. (2.4) Definition. Let x:=(x,) and y=(y,) be real numbers with respective canonical bounds K" and K r Write k=max {K",K,.}. Let a be any rational number. We define (a) x+ y:=(x2,. + Y2n):., (b) xy=(xu.Yn.>:. , (c) max {x,y} := (max {x,,y,});-_ , (d) -x = ( -x.)~. 1 (e) a* := (a,a,a, ... ). (2.5) Proposition. The sequences x + y, xy, max{x,y}, -x, and a* of Definition (2.4) are real numbers. 2U Chapter 2 Calculus and the Real Numbers Proof: (a) Write z.=x 2 .+y2 •• Then x+y= (:.). For all positive integers m and n, l=no -z.l ~ lxlm - xl.l + lhm - Y2nl ~ (211)- l +(2m) I +(211) I +(2m) I = ll I +m 1. Thus x + y is a real number. (b) Write z. = xu.Yun· Then xy = (z,). For all positive integers and n, 111 lz, -z.l = lx2km(y2km- Yu.)+ Y2k,(x2h• -x21n)l ~ kiYam - Y2k.l+klx2k,.-xu.l ~ k((2km)- 1 +(2kll) - 1 +(2km) - 1 +(2kn) 1 )= n 1 +m 1 • Thus xy is a real number. (c) Write z.= max {x.,y,}. Then max {x,y} a (z,). Consider positive integers m and 11. For simplicity assume that Then lznr- :.1 =lx,.- max {x,,y,}l =x,-max {x.,,y.}~x,.-x., ~ n - 1 +m 1 Thus max {x, y} is a real number. (d) For all positive integers m and n, 1 -xm-( - x.)l =lxm - x.l~m- 1 +11 1 Thus - xis a real number. (e) This is obvious. 0 T here is no trouble in proving that (x,y)t-+X+)', (x,)•)~--+xy, <tnd are functions from RxR toR; that xt-+ - x is a function from R toR.; and that cxt-+a* is a function from~ toR. The operation Xt-+lxl :: max {x, -x} (x,y)~-+max{x.y} is therefore a function from R. to R , and the operation (x,y)t-+min {x,y} = -max { -x,- y} is a function from R. x lR. to JR. The next proposition states that the real numbers obey the same rules of arithmetic as the rational numbers. (2.6) Proposition. For arbitrary real numbers .x, y, am/ : am/ rational numbers a. and fl, (a) x+y= y+ x. xy=yx 2. The Re-.tl Number System 21 (b) (x+y)+z=x+(y+z), x(yz)=(xy)z (c) x(y+z)=xy+xz (d) O* + x=x, l*x=x (e) x-x=O* (f) lx Yl = lxiiYI (g) (a.+fJ)*=a.*+fJ*, (1XfJ)*=a.*fJ*, and (-IX)* = -IX*. We omit the simple proofs of these results. We shall use standard notations, such as x+ y+z and max {x,y,z}, without further comment. There are three basic relations defined on the set of real numbers. The first of these, the equality relation, has already been defined. The remaining relations, which pertain to order, are best introduced in terms of certain subsets It+ and .R0 + of IR. (2.7) Definition. A real number x s(x.) is positive, or xe:R +, if (2.7.1) for some n in Xn>n-1 z•. A real number xs(xn) is nonnegative, or xe:R 0 +, if (2.7.2) The following criteria are often useful. (2.8) Lemma. A real number x::(x.) is positive exists a positive integer N such that if and only if there (2.8.1) A real number x::(x.) is nonnegative there exists N. in such that z• if and only if for each n in z+ (2.8.2) Proof: Assume that xeiR +. Then x. > n N in z+ with 1 for some n in z+. Choose Then Xm~x.-lxm -x.l~x. -m- 1 -n ~x" - n- 1 - N- 1 1 > N-' whenever m ~ N. Therefore (2.8.1) is valid. Conversely, if (2.8.1) is valid, then (2.7.1) holds with n = N + l. Therefore xe.R • . 22 Chapter 2 Calculus and the Real Numbers Assume next that xe 1R0 +. Then for each positive integer n, xm~-m- 1 ~ - n - 1 (m~n). Therefore (2.8.2) is valid with N, =n. Assume finally that (2.8.2) holds. Then if k, m, a nd n are positive integers with m ~ N, , we have x~~x,..- lx... -x~l ~ - n- 1 -k- 1 -m - 1• Since m and n are arbitrary, this gives xeJR 0 + . 0 xk~ -k- 1. Therefore As a corollary of Lemma(2.8), we see that if x andy are equal real numbers, then x is positive if and only if y is positive, and x is nonnegative if and only if y is nonnegative. It is not strictly correct to say that a real number (x,) is an element of 1R + . An element of R. + consists of a real number (x,) and a posit ive integer n such that x, > n- 1, because an element of 1R + is not presented until both (x.) and n are given. One and the same real number (x,) can be associated with two distinct (but equal) elements of R. +. Nevertheless we shall continue to refer loosely to a positive real number (x,). On those occasions when we need to refer to an n for which x,>n- 1 , we shall take the position that it was there implicitly all along. The proof of the following proposition is now easy, and will be left to the reader. For convenience, R. • represents either 1R + or 1R0 +. (2.9) Proposition. Let x and y be real numbers. Then (a) x+yelR* and xyelt* whenever xe R.* and yeR• (b) x+ yelR +whenever xeR +and yeJR0 + (c) lxleJR 0 + (d) max{x,y}eJR• whenever x e lR* (e) min {x,y}elR • whenever xeR. • and yelR. •. We now define the order relations on R . (2.10) Definition. Let x andy be real numbers. We define x> y (or y<x) if x- ye-.R + x~y (or y~x) if x- yeR 0 +. and A real number X is negative if X< o• - that is, if -X is positive. 2. The Real Number System 23 Consider real numbers x, x', y, and y' such that (i) x = x', y = y', and x> y. We have x' - y' = x- yelR. + and therefore (ii) x' > y'. We express the fact that (ii) holds whenever (i) is valid by saying that > is a relation on ll. More formally, a relation on a set X is a subset S of X x X such that if x, x', y, y' are elements of X with x =x', y= y', and (x,y)eS, then (x',y')eS. We express the fact that x>y if and only if y<x by saying that> and < are transposed relations. Similarly, ;;;:;; and ~ are transposed relations. If x < y or x = y, then x ~ y. The converse is not valid : as we shall see later, it is possible that we have x~y without being able to prove that x<y or x=y. For this reason it was necessary to define the relations < and ~ independently of each other. The following rules for manipulating inequalities are easily proved from Proposition (2.9). We omit the proofs. (2.11) Proposition. For all real numbers x, y, z, and t, (a) x<z whenever either x<y and y~z or x~y and y<z (b) x ~ z whenever x~y and y~z (c) x+ y~z+ t whenever x~z and (d) x+ y<z+t whenever (e) xy~zy whenever x~z x~z and y~t and y<t y~O· (f) xy<zy whenever x<z and y>O* (g) if x < y, then - x > - y (h) (i) if x ~ y, then - x ~ - y max{x, y}~x (j) min {x, y} ~ x (k) if x~y and y~x. then x=y (l) lxi~O· (m) lx+yl~ lx l+lyl . An important property of the relation <, of which we shall make no use, is the antlsymmetry property, which states that at most one of the relations x<y and y<x is valid for given real numbers x and y. This negative statement has no place in the affirmative mathematics we are trying to develop, except as motivation. Its place is taken by the affirmative statement (k) of Proposition (2. 11). As a general principle, negative statements are only for counterexamples and motivation; they are not to be used in subsequent work. 24 Chapter 2 Calculus and the Real Numbers * (2.12) Definition. For real numbers x and y we write x y if and only if x < y or x > y. lnequality=t= is a relation because both < and > are relations. As motivation we have the negative statement that at most one of the relations x y, x = y can hold for given real numbers x and y. In other words, at most one of the relations x <y, x > y, x = y can hold. This is clear from the definitions. The following proposition defines the inverse x-I of a real number x =t=O*, and derives the basic properties of the operation x~-->x- 1 . * (2.13) Proposition. Let x be a nonzero real number (so that lxle.R +). There exists a positive integer N with lx,.I;:;;; N- 1 for m ;:;;; N. Define Y.=(xNJ)-1 (n < N) and Then x - 1= (y.):'= t is a real number which is positive if x is positive, and negative if x is negative ; also xx - I = 1*. If t is any real number for which xt = t•, then t =x- 1 • The operation x~-->x - 1 is a function. If x=t=O and y=t=O, th en (xy)- 1=x- 1 y- 1. If a=t=O is rational, then (a•)- 1 =(cc 1)*. If x=t=O, then (x - 1)- 1=x. Proof: Our definitions guarantee that IY.I~ N for all n. Consider positive integers m and n. Write j::max{m,N}, k::max{n,N}. Then ly,. - Y.l = ly,.IIY.IIxJN' -xkNll ~ N2(UN 2) - 1 +(kN2)-')=r I +k-1 ~ m- I 1 + n-1 . Therefore x - is a real number. Assume now that x > o•. Then by (2.8), x. > 0 for all sufficiently large n. Hence Y. > K ; 1 (where K,. is the canonical bound for x) for all sufficiently large n. It follows from (2.8) that x- 1 > 0*. A similar proof ShOWS that X- 1 < 0* whenever X< 0". Let k be the maximum of the canonical bounds for x and x - 1 • Write xx- 1 :: (z.). Then x2nkY2nk - X2nk(x2nN' k) - l (n ~ N). Therefore Jz• ....: l*J = Jx2nNlkJ-I JX2nk- X2nNl ti z.= ~ IY2 okl((2nk)- 1 +(2nN 2 k)- 1 ) ~ n - for n;:,;;N. It follows that xx - 1 = 1*. 1 2 The Real Number System 25 If tis any real number with xt= 1*, then x 1 = x- 1(xt) = (x- 1 x)t - (xx- 1 )t = t. If x = x', then x'x- 1 = xx - 1 = 1*. Therefore x- 1 = (x') - 1 • It follows that xt-tx If x =F O and y =F O, then 1 is a function. (x y)x- 1 y- 1 = xx- 1 y y- 1 = 1*. Therefore x- 1 y- 1 =(xy)- 1 • If a=FO is rational, then a* = (a,a, ... ).Therefore (a•)- 1 = (a- 1,a- 1 , ••• ) = (a- 1 )*. For each x in Jt+, x- 1 is in Jt+, and thus (x-•)- 1 exists. Since x - 1 x=xx- 1 = 1*, it follows that (x- 1) - 1 =x. Similarly (x-•)- 1 =x if xis negative. Therefore (x-•)- 1 = x whenever x=FO. 0 Of course, we often write xf y instead of xy - 1 when x and y are real numbers with y=FO. As the previous propositions show, (a/f)* = a•p•, (a+Pl* =a*+{J*, (- a)* = -a•, (lal)*=la*l, and (a- 1)*=(a*)- 1 for all rational numbers a and p. Also a 6-P if and only if a* 6.{1*, where 6. stands for any of the relations =, <, >, and =F. This situation is expressed by saying that the map at-ta• is an order isomorphism from ~ into JR. This justifies identifying ~ with a subset of IR, as we previously identified Z with a subset of ~. Henceforth we make no distinction between a rational number a and the corresponding real number a•. The next lemma shows that the n11' rational approximation x. to a real number x ::(x.) actually approximates x to within n- 1 • (2.14) Lemma. FOl" each real number x = (x.), we have lx-x.l~n- 1 Proof: By (2.4) and the definition of tion to n- 1 - lx- xnl is n 1 (n eZ •). I 1. the m1h rational approxima- -lx 4 ,.-x.l ~ n - 1 -((4m) - 1 +n 1 )= By (2.7), we haven 1 -(4m)- 1 >-m- •. - lx - xnle JR 0 + . Therefore lx - x.l~n - 1 • 0 (2.15) Lemma. If x :: (x.) and y :: (y.) are real numbers with x<y, then there exists a rational number a with x <a< y. 26 Chapter 2 Calculus and the Real Numbers Proof: By (2.4), we have y-x::(y 2 ,-x 2 ,.)~ 1 • Since y -xelt +,by (2.7) there exists n in z + with y 2 ,.-x 2 ,. >n - 1• Write a::t(x2.+Y2.). Then Also, Therefore x <a< y. D As a corollary, for each x in R. and r in lR + there exists a in with lx -«1 < r. Here is anot her corollary. ~ (2.16) Proposition. If x 1 , ... ,x. are real numb-ers with x 1 + ... +x. >0, then X;> O for some i (l ~i~n). Proof: By (2.15), there exists a rational number « with 0 <« <x 1 + ... + x,. For 1 ~ i ~ n let a; be a rational number with lx1- ad< (2n)- 1 «. Then • • i· l • L: xi- l:lx,-a l>i«. L:a;~ 1 i J l- 1 Therefore a; >(2n)- 1 a for some i. For this i it follows that X;~a,-lx,-a;I>O. 0 (2.17) Corollary. If x , y, and z are real numbers with y < z, then either x<z or x> y. Proof: Since z-x+x-y=z -y> O, either z-x>O or x - y>O, by (2.16). 0 The next lemma gives an extremely useful method for proving inequalities of the form x~ y. (2.18) LemDUl. Let x and y be real numbers such that the assumption x > y implies that 0 = 1. Then x ~ y. Proof: Without loss of generality, we take y=O. For each n in z +, either x,.~n- 1 or x,>n - 1 • The case x,>n - 1 is ruled out, since it implies that x>O. Therefore - x. ~ - n- 1 for all n, and so -x~ O. Thus x~O. 0 2. The Real Number System 27 (2.19) Theorem. Let (a.) be a sequence of real numbers, and let x 0 and y 0 be real numbers with x 0 <Yo· Then there exists a real number x such that x 0 ~ X ~ Yo and x =t= a,. for al/ n in z+. Proof: We construct by induction sequences (x.) and (y,.) of rational numbers such that (i) x 0 ~x .. ~xm<Ym~Y.. ~Yo (m~n~l) (ii) x.>a, or y,.<a,. (n~l) (iii)y,-x,<n - 1 (n!;:;l). Assume that n~l and that x 0 , •. • ,x,. _ ., y0 , ••• ,y, _ 1 have been constructed. Either a,.>x,_ 1 or a,<y. _ 1• In case a,.>x,._ 1, let x. be any rational number with x,. _ 1 <x,.<min {a,.,y,_ 1 }, and let y,. be any rational number with x,.<}',.<min{a,.,y,._ 1 ,x,.+n- 1 }. Then the relevant inequalities are satisfied. In case a,. <y,_ 1, let y. be any rational number with max {a.. ,x,._ 1 } <y. <y,. _ 1 , and x,. any rational number with max{a,.,x,._ 1,y,.-n - 1 }<x,.<y,. Again, the relevant inequalities are satisfied. This completes the induction. From (i) and (iii) it follows that (m~n). Similarly IYm - Ynl<n- for m~n . Therefore x::(x,) and y::(y.) are real numbers. By (i) and (iii), they are equal. By (i), x,.~x and Y. ~ Y for all n. If a.< x,, then a,< x and so a,. :t= x. If a,.> y,, then a.> y = x and so a,.=t=x. Thus x has the. required properties. 0 1 Theorem (2.19) is the famous theorem of Cantor, that the real numbers are uncountable. The proof is essentially Cantor's " diagonal" proof. Both Cantor's theorem and his method of proof are of great importance. The time has come to consider some counterexamples. Let (nk) be a sequence of integers, each of which is either 0 or 1, for which we are unable to prove either that nit= 1 for some k or that n1 = 0 for all k. This corresponds to what Brouwer calls "a fugitive property of the natural numbers". For example, such a sequence can be defined as follows. Let n~c be 0 if u' + rl =F w' for all integers u, v, w, c with 0 < u, v, w ~ k and 3 ~ t ~ 2 + k. Otherwise let n~c be 1. Then we are unable to prove n1 = I for some k, because this would disprove Fermat's last theorem. We are unable to prove Ilk =0 fo r all k, because this would prove Fermat's last theorem. Now define x1 =0 if n1=0 for all j~k, and x1 =2- m otherwise, where m is the least positive integer such that nm = 1. Then x = (xJ is a 28 Chapter 2 Calculus and the Real Numbers nonnegative real number, but we are unable to prove that X> 0 or x = 0. Since nothing is true unless and until it has been proved, it is untrue that x > 0 or x = 0. Of course, if Fermat's last theorem is proved t omorrow, we shall probably still be able to define a fugitive sequence (11") of integers. Thus it is unlikely that there will ever exist a constructive proof that for every real number x ;?; O either x >O or x = O. We express this fact by saying that there exists a real number x ;?; 0 such that it is not true t hat x>O or x= O. In much the same way we can construct a real number x such that it is not true that x !;;; 0 or x ~ 0. 3. Sequences and Series of Real Numbers We develop methods for defining a r eal number in terms of approximations by other real numbers. (3.1) Definition. A sequence (x.) of real numbers converges to a real number Xo if for each k in z+ there exists Nk in z+ with (3.1.1) The real number x 0 is then called a limit of the sequence (x.). To express the fact that (x.) converge9 to x0 we write lim x. = x 0 n- oo or x. -+ x 0 as n -+ ex:> or simply x.-+ x 0 • A sequence (x.) of real numbers is said to converge, or be convergent, if there exists a limit x 0 of (x.J. It is easily seen that if (x.) converges to both x 0 and x'0 , then x 0 =Xo. A convergent sequence is bounded: there exists r in JR. + such that l x.l ~ r for all n. A convergent sequence of real numbers is not d etermined until the limit x 0 and the sequence (Nk) are given, as well as the sequence (x.) itself. Even when they are not mentioned explicitly, these quantities are implicitly p resent. Similar comments apply to many subsequent definitions, including the following. 3. Sequences and Series of Real Numbers 29 (3.2) Definition. A sequence (x") of real numbers is a Cauchy sequence if for each kin z+ there exists M" in z+ such that (3.2.1) (3.3) Theorem. A sequence (x.) of real numbers converges it is a Cauchy sequence. if and only if Proof: Assume that (x.) converges to a real number x 0 • Let the sequence (N1) satisfy (3.1.1). Write M 1 Nu. Then = Jx., -xnl ~ l x., - x0 +lx. -x0 l ~ (2k) J 1 + (2k)- 1 = k- 1 form, n"S; Mk. Therefore (xJ is a Cauchy sequence. Assume conversely that (xJ is a Cauchy sequence. Let the sequence (M 1.) satisfy (3.2.1). Write N,. = max {3k,Ma}. Then lx., - x.l~ (2k) Let 1 (m, n ?f N,.). Y1r. be the (2k)'h rational approximation to xN.· For m;::; n, lYm- Ynl $ Jy., - xN.J +lxN'" -xNJ +lxN" - Ynl $ (2m)- 1 +(2m) 1 +(2n) 1 +(2n) 1 = m- 1 +n- 1 • Therefore y = (yJ is a real number. To see that (xJ converges to y, we consider n ~ N" and compute Jy - x.J~ Jy- Y.l +Jy. - xN.J +lxN" -x.l ~ n- 1 +(211) - 1 +(2k) - 1 ~ (3k)- 1 +(6k)- 1 + (2k)- 1 = k - 1 • 0 A subsequence of a convergent sequence converges to the same limit. If a sequence converges, then any sequence obtained from it by modifications (including, perhaps, insertions or deletions) which involve only a finite number of terms converges to the same limit. If x (xJ is a regular sequence of rational numbers, then (x!) converges to x, by (2.14). A sequence (x") is increasing (respectively, strictly increasing) if x.+ 1 ~ x. (respectively, x.+1 > x.) for each n. Decreasing and strictly decreasing sequences are defined analogously, in the obvious way. A theorem of classical mathematics states that every bounded increasing sequence of real numbers converges. A counterexample to this statement is given by any increasing sequence (x.) such that x. = 0 or x" = 1 for each n, but it is not known whether x.=O for all n. It is useful to supplement Definition (3.1) by writing = limx. =oo or 30 Chapter 2 Calculus and the Real Numbers to express the fact that for each k in x.>k for all n~Nk. We also define z• there exists Nk in z• with lim x. =- oo or to mean that lim - x. =ex:>. n-oo The next proposition shows that we may work with real numbers constructed as limits by working with their approximations. (3.4) Proposition. Assume that x.-+ x 0 as n-+ ex:>, and Yn-+ Yo as n-+ ex:>, where x 0 and y 0 are real numbers. Then (a) x.+y.-+x 0 +y0 as n-+oo (b) x.y.-+x0 y 0 as n-+oo (c) max{x.,y.}-+max{x 0 ,y0 } as n-+oo (d) x 0 =c whenever x.=c for all n (e) if x 0 =t=O and x.=t=Ofor all n, then x; 1 -+x0 1 as n-+oo (f) if x.&Y. for all n, then x 0 &y0 • Proof: (a) For each kin z• there exists Nk in z+ such that lx. - x 0 l& (2k)- 1, ly. - y 0 15(2k)- (n~Nk). 1 Then Therefore x.+y.-+x0 +y0 as n-+00. (b) Choose m in z+ such that IYol & m and lx.l & m for all n. For each k in z+ choose Nk in z+ with lx.-x 0 1 ~ (2mk)- 1 , Then for ly. - y 0 1&(2mk)- 1 (n~Nk). n~Nk, lx.y. -Xo Yo l & lx.(y.- Yo)l +IYo(x. - xo)l ~m(ly.- Yol +lx. -x 0 l)&k - 1 • Therefore x. y n ..... Xo Yo as n ..... CX). (c) Since Imax {x., Y.}- max {x0 , y 0 }1 ~max {lx.- x 0 1, IY.- y 0 1}, it follows that 3. Sequences and Series of Real Numbers 31 (d) If x. = c for all n, then (x.) converges to c. Therefo re x 0 =c. (e) Since lx 0 l > 0, lx.l ~ lx0 1-lx.- Xol > 1lx 0 1 whenever n is large enough, say for n ~ n 0 • Let k and n be positive integers such that n~n 0 and lx.- x 0 1<(2k)- 'lx0 12• Then lx; 1 -X 1 2 1 2 1 0 'I =lx.l - 'lx0 l- lx. - x 0 1 ~ 2lx 0 l- (2k)- lx 0 1 - k - • Therefore x; 1 -+ x 0 1 as n-+ oo. (f) We compute y0 -x0 = lim y.- lim x.= lim(y.-x.) = lim ly.- x.l n-oo =lim max {y.-x., x.- y.} = max{y 0 -x0 ,x0 - y0 } ~ 0. by (a), (b), (c), and (d). 0 For each sequence (x.) of real numbers the number is called the n•h partial sum of (x,.), and (s.) is called the sequence of partial sums of the sequence (x,.). A sum s0 of (x.) is a limit of the sequence (s,.) of partial sums. We write So= L x. •-I to indicate that s0 is a sum of (x.). A sequence which is meant to be summed is called a series. A series is said to converge to its sum. Thus the sequence (2 - •):'_ 1 converges to 0 as a sequence, but as a series it 00 converges to L 2- "= 1. n- I A convergent series remains convergent, but not necessarily to the same sum, after modification of finitely many of its terms. 00 The series (x. ) is often loosely referred to as the series oo If the series 00 L L1x. converges, then x. -+0 as n-+oo. ,. _ L x•. n. t O(t L A series x. is said to converge absolutely when the series lx.l •· 1 converges. •- 1 In classical analysis a series of nonnegative terms converges if the partial sums are bounded. This is not true in constructive analysis. However, we have the following result. 32 Chapter 2 Calculus and the Real Numbers 2: y. is a convergent series of nonnegative terms, (3.5) Proposition. If n. t and if lx.l ~ Yn for ~ each n, then L x. converges. •- I "" Proof: Since 2: y. is convergent, the sequence of partial sums is a ·- l Cauchy sequence. Therefore for each kin with z+ there exists an Nk in z+ Then 2: Therefore the sequence of partial sums of the series x. is a Cauchy •- 1 sequence. By (3.3), the series converges. 0 The criterion of Proposition (3.5) is known as t he comparison test. It follows from the comparison test that every absolutely convergent series is convergent. The terms of an absolutely convergent series co 2: x. may be re- •- l ordered without affecting the sum s0 of the series. More precisely, if A.: z+-+z+ 00 is a bijection, then 00 2: x l(•l exists and ,._ 1 equals s0 • This may 2: x. is merely convergent. not be true if the series •- I A sequence (x.) is said to diverge if there exists s in R + such that for each k in z+ there exist m and n in z + with m, n ~ k and lxno -x.l ~ &. The motivation for this definition is, of course, that a sequence cannot be both convergent and divergent. A series is said to diverge if the sequence of its partial sums diverges. 00 The series 2: n- 1 diverges, because "• I The series l x.l ~ r Let L x. diverges whenever there exists r in R. + •- I such that for infinitely many values of n. ro !X> "- I n- I L x. and 2: Yn be series of nonnegative terms. The compari- son test for divergence is that oo oo n- I n ... 1 L x. diverges whenever L Yn and there is a positive integer N with x. ~Y. for all n~ N. diverges 3. Sc:quence!> and Series of Real Numbers 33 The following very useful test for convergence and divergence is called the ratio test. 00 (3.6) Proposition. Let L x. be a series, c a positive number, and N a n- I "' L positive integer. Then x. converges If c < 1 and n- 1 (3.6.1) and diverges if c > 1 and (3.6.2) Proof: Assume that c < 1 and that (3.6.1) is valid. Then "' for n ~ N. By the comparison test, L x. converges. lx.l $ eft- H lxHl ... I Next, assume that c > 1 and that (3.6.2) holds. Then lx,.l ~ c·- H- 1 lx,. + tl ~ lx,. + 1 1 and Therefore L x,. diverges. (n ~ N+l) 0 .. . t A corollary of the ratio test is that if the limit L = lim lxn+ 1 x;; 11 •- oo exists, then L x. converges whenever L < 1 and diverges whenever L>l. n- 1 The ratio test says nothing in case L = 1. To handle this case, we introduce stronger tests based on Kummer's criterion. (3.7) Lemma. Let (a.) and (x.) be sequences of positive numbers, c a positive number, and N a positive imeger. Then a.x,.-+ 0 as n-+ co and (3.7.1) "" while L x. diverges "• I (3.7.2) if La;; 1 diverges and "• I L x. •- 1 converges if 34 Cha pter 2 Calculus and the Real Numbers Proof: Assume that a"x" -+0 and tha t (3.7.1) is valid. Let c be an arbitrary positive number, and choose an integer vi?; N so that akxk -a1 x1 ~ ce whenever j>k?;; v. For suchj and k we have j j L f x.r ,. .,. 1 J=- x.(a"_ 1 x._ 1 x; 1 -a.) n. lc + 1 n- k + l Thus ( L Xn ~ c- • = c - •(a1 xk - aix1) ~ e . is a Cauchy sequence, and so 1 Next assume that I x.converges. n• l OIJ L a;• diverges and that (3.7.2) holds. Then for n. 1 oo each n6 N , x.6 aNxNa; 1• Thus L x. diverges, by comparison with •• I 0 (3.8) Lemma. Let (y.) be a sequence of positive numbers, c a positive number, and N a positive integer such that n(y.y;+1 1 -1)6 c (n 6 N). Then limy.= O. n- oo Proof: For each n > N, YN y; I = (yN YN"~ .HYN+ I YN~ 2) .. . (y. _ I ~ (l +c N - 1) ... (1 +c(n - 1)- 1} y; ' ) n- 1 ~ l +c L k- 1 • k. N n- Given s> O, choose an integer v>N so that I 2:k- 1 > C 1 (e- 1 yN - 1) k- N for all n ;;:;; v. Then for such n we have Y.<&. Hence Y.-o as n-+oo. 0 The next convergence test is known as Raabe's test. 00 (3.9) Proposition. Let L x. n. J be a series of positive numbers such that ~ n(x. x;}. 1 - l ) converge.s to a limit L. Then L x. converges diverges if L < 1. •- • if L > l, and Proof: First note that n(n x,j(n + l) x.+ 1 -1)= n(n + 1)- 1 (n (x. x;+11 - 1) - 1) -+L - 1 as n-+ oo . 4. Continuous Functions 35 If L> 1, it follows from (3.8) that nx.-+0 as n -+co. We then obtain the 00 convergence of L x. by taking a.= n (neZ +) in Kummer's criterion. "• 1 00 The same choice of a. yields divergence of L x. in case L < 1. •- I 0 Important real numbers represented by series are 00 e=l+ l:(n!)- 1 •- I and 00 n = 4 L (- 1)"(2n + 1)- 1• •• o The series for e converges by the ratio test. The convergence of the series for n is a consequence of the general result that a series 00 L (- 1)" x. converges whenever (i) x.;;;;o for all n and (ii) the sequence •=I (x.) is decreasing and converges to 0. To see this, consider positive integers m and n with m ~ n. Then 0 ~ (x. -xn+ ,)+(xn+2 -x.+ 3)+ ... +( - l)m+n Xm m =( - 1)"'2: (- ll'xk =x. -(xn+ ,-xn +2)- ... +( -l)m+nxm ~ x•. It follows that the sequence of partial sums of the series is a Cauchy sequence. Therefore the series converges. 4. Continuous Functions A property P which is applicable to the elements of a set S is defined by a statement of the requirements that an element of S must satisfy in order to have property P. To construct an element of S with property P we must construct an element of S, perform certain additional constructions which depend on the property P, and prove that the entities constructed satisfy certain requirements that are characteristic of the property P. Each property P applicable to elements of a set S determines a subset of S which is denoted by {x: xe S, x has property P} or {xe S: x has property P}. 36 Chapter 2 Calculus and the Real Numbers When the context makes it clear which set S is under discussion, we also write simply {x : x has property P} . Properties applicable to elements of a set S, and subsets of S, are essentially the same things rega rded from different points of view. Among the most importa nt subsets of JR. are the intervals. (4.1) Definition. For all real numbers a and b we define (a, b)= {x: (a, b] = {x: [a,b}= {x: [a, b] = {x: xeJR, a < x < b}, xe JR, a < x ~ b}, xe JR, a ~ x < b}, xe JR, a ~ x ~ b}. Each of these sets in an interval, whose left and right end points are a and b, respectively. The interval (a, b) is open, [a, b] is closed, (a, b] is half-open on the left, and [a, b) is half-open on the r ight. If a < b, then the intervals are said to be proper. The above intervals are called finite intervals. We also introduce infinite intervals, by the following definitions : (- oo, a):: {x: (-oo, a] = {x: (a, oo):: {x: [a, oo):: {x: xe iR, x<a}, x e iR, x ~ a}, xe iR, a <x}, xeJR, a ~ x}. An interval I is nonvoid if we can construct a real number belonging to I. A nonvoid, closed, finite interval is called a compact interval. A nonvoid fin ite interval I with left and right end points a and b has length III= b - a. If an interval I is a subset of an interval J (that is, if every element of I also belongs to J), then we say that I is a subinterval of J . The rules for manipulating intervals, which we use in the sequel without mention o r proof, are implicit in Proposition (2.11). (4.2) Definition. A nonvoid set A of real numbers is bounded above if there exists a real number b, called an upper bound of A, such that x ~ b for all x in A. A real number b is called a supremum, or least upper bound, of A if it is an upper bound of A, and if for each r. > 0 there exists x in A with x > b - c. We say that A is bounded below if there exists a real number b, called a lower bound of A, such that b ~ x for all x in A. A real 4. Continuous Functions 37 number b is called an infimum, or greatest lower bound, of A 1f it is a lower bound of A, and if for each t > 0 there exists x in A with x < b +e. The supremum (respectively, infimum) of A is unique, if it exists, and is written sup A (respectively, inf A). A classical theorem asserts that every nonvoid set of real numbers that is bounded above has a supremum. A counterexample to this is provided by the set {xn: neZ +} where x n= O or x. = 1 for each n, but it is not known whether x. =O for all n. We now prove the constructive least-upper-bound principle. (4.3) Proposition. Let A be a nonvoid set of real numbers that is bounded above. Then sup A exists if and only if for all x, y in R. with x < y, either y is an upper bound of A or there exists a in A with x <a. Proof: If sup A exists and x < y, then either sup A < y or x < sup A ; in the latter case we can find a in A with sup A-(supA - x) < a and hence x <a. Thus the stated condition is necessary. Conversely, assume that the stated condition holds. Let a1 be an element of A, and choose an upper bound b 1 of A with b 1 >a 1 • We construct recursively a sequence (a,) in A and a sequence (b.) of upper bounds of A such that for each n in z+, (i) a. ~ an+ 1 <bn+ 1 ~ bn and (ii) bn + l-an+! ~l(b. - a,). Having found a 1 , ... , a. and b 1 , ... , bn, if a,.+ !(b,. - a.) is a n upper bound of A, we set b.+ 1 == a.+{ (bn -a.) and a.+ 1 ~ a.; while if there exists a in A with a>a,.+!(b.-a.), we set a,.+ 1 :a a and b.+ 1 ""' b•. This completes the recursive construction. By (i) and (ii), we have 0 ;£ b.-a. ~ (3/4)" - 1 {b 1 -a 1 ) (m: Z+). Hence the sequences (a.) and (b.) converge to a common limit I with for each n in z+. Since each b,. is an upper bound of A. so is I. On the other hand, given e >O, we can choose n so that l "?;, a,>f - e, where a.e A. Hence ! = sup A. 0 a,. ~ I ~ b. In Proposition (4.3), if A is contained in some interval I, then in order to prove that sup A exists it is sufficient to consider arbitrary points x and y in I with x < y. '\R Chnplcr 2 Cnlc uhl' ;md the Real Numbers (•1•1) Corollary. Let the subset A of 1R have the property that for each ' -.. 0 1here e:.:ist s a subfrnite set {y 1 , ..• , y.} of points of A such that for each .x in A at least one of the numbers lx- y 1 1, .. . , lx- Y.l is less than r.. (Such a set A is called totally bounded.) Then sup A and inf A exist. Proof: It will suffice to prove that sup A exists. To this end, let x and y be real numbers with x < y, and set cx =~ {y -x). Choose points a 11 ••• , aN in A such that for each a in A at least one of the numbers la -ad •... ,!a-aNI is less than ex. For some n with l ::;n::; N we have a.> max{a" ... , aN} - ex. Either x < a. or an < x + 2ex. In the latter case, if ae A and we choose k with Ia - atl <ex, we have a ~ at + la - atl <a.+ex+ex<x+4ex = y, so that y is an upper bound of A. Thus sup A exists, by (4.3). 0 Often when one real number depends on another the dependence is smooth, or continuous. An exact description of what this means is given in the following definition. (4.5) Definition. A real-valued function f defined on a compact interval I is continuous on I if for each e> 0 there exists ro(~:) > 0 such that lf(x)- J(y)l ~ t whenever x,ye i and lx - yl& co(e). The operation e Ha>(s) is called a modulus of continuity for f A real-valued function Jon an arbitrary interval J is continuous on J if it is continuous on every compact subinterval I of J. For example, when a and b are real numbers with a <b, then f is continuous on (a, b) if and only if it is continuous on [a + c5, b - c5] for each c5 with 0< c5<t (b - a). A modulus of continuity w is an indispensable part of the definition of a continuous function on a compact interval, although sometimes it is not mentioned explicitly. In the same way, moduli of continuity of the restrictions of f to each compact subinterval are indispensable parts of the definition of a continuous function f on a general intervaL Constant functions, and the identity function x.-x, are continuous on every interval. (4.6) Proposition. Iff: [a, b]-+ 1R is a continuous function on a compact interval, then the quantities sup f = sup{f(x): xe [a,b]} 4. Continuous Functions 39 and inff= inf{f(x): x e [a, b]} (called, respectively, the supremum and the infimum off on the interval [a, b)) exist. Proof: Consider any &>0. Choose real numbers a = a0 ;;! a1 ~ • • • ~ a" = b such that a;. 1 - a, ~ w(&) (O ~ i ~ n-l), where w is a modulus of continuity for f. Then for each x in [a, b] we have l x-a;l ~ w{&), and therefore lf(x)- /(a,) I ~ e, for some i. Since e is arbitrary, it foll ows that the set {f(x): x e [a, b]} is totally bounded. Therefore sup f and inf f exist, by (4.4). 0 Let f: A-It and g: A-R be two functions defined on the same set A. We define the sum function f + g: A -+It by (f + g)(x)• f(x)+ g(x) (xe A). Such functions as f g, 1/1, and max {J, g} are defined similarly. If g(x) =t= 0 for each x in A, then g- 1 : A .... R is defined by g- 1 (x) :: (g(x)) - 1 (xeA). 1 , We also write l fg for g- and f fg for the quotient funct ion f g- 1• The proof of the following proposition, which resembles the proof of Proposition (3.4), is left to the reader. (4.7) Proposition. Let f and g be continuous real-valued functions defined on an interval I . Then the functions f + g, f g, and max {J, g} are continuous on I. If f is bounded away from 0 on every compact subinterval J of I - that is, if l f(x)l~ c for all x in J and some C> O (depending on J) - then J- 1 is continuous on I . Proposition (4.7) implies that the quotient of continuous functions is continuous, provided that the denominator is bounded away from 0 on every compact subinterval. It also implies that a polynomial function is continuous on every interval, and that Il l is continuous on each interval where f is continuous. The composition of continuous functions is continuous, in the sense that iff: I -J and g: J -:R are continuous, then gof is continuous, provided that f maps every compact subinterval of I into a compact subinterval of J. To prove this, it is sufficient to consider the case in which I and J are both compact. Let w be a modulus of 40 Chapter 2 Calculus a nd the Real Numbers continuity for J, a nd a a modulus of continuity for g. Then if x, ye l , c> O, and Jx -yl ~ w(a{c)), we have lf{x) -f(y)J ~ a(s). Therefore Jg{f(x)) - g{f{y))l~t . It follows that g • f is continuous, with modulus of continuity c~-+w(a(c)). Classically, a continuous function maps an interval onto an interval. We now prove a weak version of this result, known as the Intermediate value theorem. (4.8) Theorem. Let f be a continuous map defined on an interval I , and let a, b be points of I with f(a) < f(b). 11ten for each y in [f(a), f(b)] and each e> O, there exists x in [min {a, b}, max {a, b}] such that Jf(x)- Yl <c. Proof: Since[ is continuous, we must have a =t= b. We may assume that a < b. Consider yin [f(a),f(b)] and t>O. Let m = inf{Jf(x)-yJ: a ~ x ~ b}, which exists by (4.6). Suppose that m > 0. Then f(a)- y ~ - m and f(b) - y ~ m. Let w be a modulus of continuity for f on [ a, b], and choose points a=x 0 ~ x 1 ~ ... ~ x .. = b such that Xut - x. ~w (m) for O ~ k ~ n - 1. Then for such k we have lf(xt +tl - y - {f(xt) - y)l = lf(xt +tl - f(xk)l ~ m. Since Jf{x) - yl ~ m for all x in [a, b], it follows that the quantities f(x.)-y and f(xt+tl-y are either both positive or both negative. Therefore the quantities f(x;) - y (0 ~ i ~ n) are either all positive or all negative. H ence f (a)- y and f(b) - y are either both positive or both negative. This contradiction ensures that the possibility m> 0 is ruled out; so that m < t, and the desired conclusion follows. 0 Under additional hypotheses satisfied by many of the common elementary functions of analysis, Theorem (4.8) can be strengthened to yield the conclusion that f(x) = y for some x in [min {a, b}, max {a, b}]. For example, this strong conclusion obtains whenever f is strictly increasing, in the sense that f(x)> f(x') for any two points x, x' of its domain with x>x'. For in that case, taking a<b we can construct sequences (a.), (b.) in [a, b] such that for each n in :z+, (i) a = a 1 ~ a 2 ~ ••• ~ a.. ~ b. ~ (ii) f(a.) ~ y ~ f(b.) ... ~ b 2 ~ b 1 = b (iii) b,. + 1 - a. + 1 ~ (2/3) (b,. -a.). 4. Continuous F unctions 41 ne sequences (a,), (b,) then converge to a common limit x in [a,b] vith f(x) = y. Just as sequences of real numbers can converge to real numbers, ;equences of continuous functions can converge to continuous func:ions. In fact, most of the important functions of analysis are defined lS limits of sequences of continuous functions. :4.9) DeflnitiorL A sequence (!,) of continuous functions on a compact interval I converges on I to a continuous function f if for each e> O there exists N, in z+ such that (4.9.1) 1/,(x) -J(x)l ~r. (x el, n ~ NJ. A sequence (J,) of continuous functions on an arbitrary interval J converges on J to a continuous function f if it converges to f on every compact subinterval I of J ; in that case, sequence (f,). f is called the limit of the Definition (4.9) can be recast to bear a closer resemblance to Definition (3.1). To this end, we define the norm 11! 11 1 of a continuous function f on a compact interval I to be the supremum of 1/1 on I . Then (!,) converges to f on I if and only if for each k in Z + there exists N. in z+ with 11 !,-fll r ~ k - ' (n ?; Nl). (4.10) Definition. A sequence (/,) of continuo us functions on a compact interval I is a Cauchy sequence on I if for each £ > 0 there exists M. in z+ such that (4.10.1) lf.. (x) - J,(x)l ~& (xe i ; m, n?; M,). A sequence of continuous functions on an arbitrary interval J is a Cauchy sequence on J if it is a Cauchy sequence on every compact subinterval of J. The sequence (J,) is a Cauchy sequence on the compact interval I if and only if for every kin z+ there exists M. in z+ such that Ufm- J,llr ~ k - 1 (n, n?; Mk) . Notice that a sequence (cJ of real numbers converges if and only if the corresponding sequence of constant functions, which we a lso denote by (c.), converges on a given nonvoid interval I, and that a sequence of real numbers is a Cauchy sequence if and only if the corresponding sequence of constant functions is a Cauchy sequence on I. Because of these remarks, the following theorem is a generalization of Theorem (3.3). 42 Chapter 2 Culculus and the Real Numbe.rs (4. 11) Theorem. A sequence (/.) of continuous functions on an interval J converges on J if and only if it is a Cauchy sequence on J. Proof: Assume that (/.) converges to f on J . Let I be any compact subinterval of J. For each &>0 choose N, in z+ satisfying (4.9.1), and write M, =Nt/ 2 • Then whenever m, n ~ M , and xe I, we have 1/.,(x)- J. (x~ ~ l fm(x)- f(x)l +1/.(x)-f(x)l ~ &/2+ r./2 = &. Therefore (/.) is a Cauchy sequence on /. It follows that (/.) is a Cauchy sequence on J. Assume conversely that (/.) is a Cauchy sequence on J . Then for each x in J, (/.(x)) is a Cauchy sequence of real numbers, whose limit we denote by f(x). We shall show that f : J -It is a continuous function and that <J~) converges to f on J. It is enough to show that f is continuous on each compact subinterval I of J, and that (/.) converges to f on /. To this end, choose the positive integers M, such that (4.10.1) is valid, and for each n in z+ let ro~ be a modulus of continuity for f~ on /. For each & > 0 write w(e) :: wM(E/ 3), where M -. MrtJ· Then whenever x,ye / and lx-yl ~ w(r. ), we have lf(x) - f(y~ ~ lf(x)- fM(x)i +1/M(x)-fM(Y)I+IfM{y)- f(y)l = lim 1/.(x)-fM(x)l+lfM(x) - f M(y)l + lim lfM(y)-J.(y)l Therefore f is continuous, with modulus of continuity w. Finally, if xel, &>0, and n ~ M .. then lfn(x)-f(x)l = lim 1/.(x)-/..,(x)l ~ £. Hence (/.) converges to f on/. 0 Notations to express the fact that (/.) converges to fare lim/.=! and !.-+ f We also write simply fn-+ f as n-+oo . 4. Continuous Functions 43 To each sequence (f.) of continuous functions on an interval I corresponds a sequence (g.) of partial sums, defined by If (g.) converges to a continuous function g on I, then g is the sum of <C 2: J,. , the series J g= n• I 2: J,., 2:'" 1/.1converges on I, and the series is said to converge to g on I. If co then . 2:.,J,. n- l is said to converge absolutely on I. An absolutely con- vergent series of functions converges. The comparison test and the ratio test for convergence carry over to J> 2: g.. series of functions. The comparison test states that if is a con- ,. _ ~ vergent series of nonnegative continuous functions on an interval I, 00 then the series 2: J,. of continuous functions on I converges on I ne l whenever 1/.(x)l ~ g.. (x) for all n in z + and all x in I . 00 The ratio test states that if 2: J,. is a series of continuous functions on an interval J such that for each compact subinterval I of J there exist a constant c 1 , 0 < c 1 < 1, and a positive integer N1 with lfn + l(x)l ~ c, IJ,.(x)l (x e i, n "?;;, Nt), oc then 2: f. •- 1 converges absolutely on J. ~ A power series is a series of the form 2: a.(x - x n- o . 0 )", where a. (x - x 0 )" represents the function x~--+a.(x - x 0 )" and where a0 (x- x 0 ) 0 =a 0 for all x. The ratio test has the following corollary. 00 (4.12) ProposltiorL Let the power series L a.(x - •- 0 x 0)" have the property that there exist r>O and N in 7J. + such that l a. + 1 l ~ r - 1 l a.l for all n?;; N. Then the series converges absolutely on the interval (x 0 -r,x0 + r). Proof: If I is a compact subinterval of (x 0 - r, x 0 + r), then there exists r0 with 0 < r0 < r such that lx - x 0 1.~ r0 for all x in I. Then l a.+l (x-xo)"+t l~ r- 1 rola,(x - x 0 )"1 (n "?;;, N, x ei). By the ratio test, the series therefore converges absolutely on I . 0 44 Chapter 2 Calculus and the Real Numbers 5. Differentiation The rate at which a function is changing is a fundamental property of the function. Here is the precise definition of this concept. (5.1) Defmition. Let f and g be continuous functions on a proper compact interval I, and let (J be an operation from JR. + to R + such that lf(y)- f(x)- g(x) (y- x)l ~ & ly- xl whenever &>0, x , ye l, and l y-x l ~ o(e). Then f is said to be differentiable on I, g is called a derivative off on I, and lJ is called a modulus of differentiability for f on I . Iff and g are continuous functions on a proper interval J, then g is a derivative of f on J if it is a derivative of f on every proper compact subinterval of J; f is then said to be differentiable on J . To express that g is a derivative off we write g=f, , or g= Df, or df(x) g(x) = ~· One way to interpret Definition (5. 1) is that the difference quotient (f(y) - f(x)) (y- x)- 1 approaches g(x) as y approaches x. In other words, g is the rate of change of f. Iff has two derivatives on I , then they are equal functions. (5.2) Theorem Let h and / 2 be differentiable functions on an interval I . Th en h + f 2 and h f 2 are differentiable on I. In case h is bounded away from 0 on every compact subinterval of I , then f 1- 1 is differentiable on I. The function x 1-+ x is differentiable on R. For each c in R the function xt-+c is differentiable on It. The derivatives in question are given by the following relations: (a) D(!J + fz)=D!J +Dfz (b) D(!J fz) = h D fz + fz Dft Dfi- 1 =- h- 2 Dft (c) dx (d) - = I dx (e) de = O. dx S. Differentiation 45 Proof: It is enoug h to consider the case in which I is compact. Let o1 and o2 be moduli of differentiability for ft and / 2 , respectively, o n I, and w1 a modulus of continuity for J. on I . (a) Whenever x, ye i and iy -xi ~ o(c) s min{o 1 (c/2), o 2 (e/2)}, we have if,(y) + fz(y) - {j,(x)+ f z(x)) - (fi(x)+ /2(x)) (y - x)i ~ ift(y) - fdx) - fi(x) (y - x)l +lfz(y) - fz (x) - fi(x)(y - x)i Thus ft + h. is differentable on I , with derivative fj + fi and modulus of differentiability o. (b) Let M be a common bound for 1/d, 1/2 1, and ifi i on the interval I . (For instance, define M :: max{llft ll 1, ll f 2 11 1 . ii fi ll 1 }.) Then whenever x, ye i and iy -xi ~ o(e) = min {0 1 ((3 M)- 1 e), o 2 ((3 M)- 1 e), w1 ((3 M) - 1 c)}, we have lft(y) fz(Y) - f , (x) fz(x) - (JI(x) f.i (x) + fz (x) f i (x)) (y - x)i :;; lj,(y)llfz(y) - fz (x)- f.i(x)(y-x)i + ift (y) - ft(x) l lfi(x)llY - x i + ifz(x)i ift(y) - f1(x) - f{(x) (y - x)l ~ 3M(3M) - 1 e iy - x i=e ly- x i. Therefore !t / 2 is differentiable on I, with derivative / 1 fi + / 2 j ; and modulus of differentiability o. (c) For each e>O write o(c) s min{o 1 (!M - 2 &), w1{!M - 4 t:)} where M = max { IIJ.- 1 11 1 , ~ o(s), we have IIflU,}. Then whenever x, ye I and IY- xl IJ.- 1(y)-ft- 1(x) + J.- 2 (x) f j(x) (y - x)i = lft- 1(X) ft 1(y)iift(Y) ~M 2 1Ji (y) - J. (x) - !t (x) - !t (y) J,- '(x) fj'(x)(y - x)i fl(x) {y - x)i 2 + M 1/i(x) ft(x) - 1 ilft(y)- ft(x)lly -xi ~ M 2 (! M - 2 e) ly - x l + M 4 (t M - 4 e) iy-xi=ei y -xi . Therefore !t- 1 is differentiable on / , with derivative modulus of differentiability o. J.- 2 f1 and 46 Chapter 2 Calculus and the Real Numbers (d) This is obvious. (e) This is obvious too. 0 (5.3) Corollary. For all positive integers n, (5.3.1) dx" - = nx"- 1• dx Proof: The proof is by induction on n. When n = 1, (5.3.1) is just (d) of Theorem (5.2). If (5.3.1) is true for a given value of n, then dx•+ l d(x·x") - - = - - - = x"+x(n x"- ')= (n + 1)x", dx dx by (b) of Theorem (5.2). Therefore (5.3.1) is true for all n. 0 Theorem (5.2) and its corollary imply the formula D(f, fz- 1)= f t 2U2 Df, - f1 Df2) for the derivative of a quotient, and the formula for the derivative of a polynomial. The next theorem is the so-called chain rule for the derivative of a composite function. Its intuitive meaning is that the rate of change of quantity C with respect to quantity A is the product of the rate of change of C with respect to some third quantity B by the rate of change of B with respect to A. (5.4) Theorem. Let f: I -+R. and g: J -+:R. be different iable functions such that f maps each compact subinterval of I into a compact subillterval of J. Then g of is differentiable, and (5.4.1) (go f)' = (g' of) J'. Proof: It is no loss of generality to assume that I and J are compact. o Let 1 be a modulus of differentiability and ro1 a modulus of continuity for f on I. Let o, be a modulus of differentiability for g on J . For each e>O write where 5. Differentiation Then for x,y in I and l y - x l ~ o(e) we have 1/(y) -f(x)l ~ o, (a), 47 so that lg(f(y)) - g(f (x)) - g'(f(x)) (f(y) - f(x))l ~ a lf (y)- f (x)l. Also lf(y) - /(x)l ~ llf' II ,IY - xl + lf(y) - f(x) - f '(x) (y - x)l and lf (y) - f(x) - f'(x) (y - x) l ~ PlY - xi. Using these inequalities and noting that a ll f'll 1 < t:/2, we compute lg(f(y)) - g(f(x)) -g'(f(x)) f'(x )(y - x)l ~ lg(f(y))- g(f(x)) - g'(f(x)) ( f(y)- /(x))l + lg' (f(x))l lf (y)- f (x) - f' (x ) (y- x)l ~ a l f(y) - f(x)l + llg' IIJ lf(y)- f(x) - f'(x)(y - x)l ~ a 11/' 11 IIY-xi+ (a + llg' II J) 1/(y) - / (x) - f'(x) {y -x)l e t < 21y - xl + 21y - xl = ely - xl. It follows that go f is differentiable on I, with derivative (g'of)f' and modulus of differentiability o. 0 The next lemma is known as R olle's theorem. (5.5) Lemma. Let f be differentiable on the interval [a, b], and let f(a) = f (b). Then f or each e> O there exists x in [a, b] with lf'(x)l ~ t. Proof: Let o be a modulus of differentiability for f on [a,b]. Let ms inf {lf'(x)l: x e [a,b]}, which exists, by (4.6). Suppose that m > 0. We may assume that For each x in [a,b] we have f'(x) ~ m. For if f'(x )< m, then f'(x) ~ - m, so that, by the intermediate value theorem (4.8), there exists in [a, b] with If'W I < m; this contradicts the definition of m. Now choose points a = x 0 ~ x 1 ~ ... ~ x. = b so that xk+ 1 - x. ~ o{tm) (O ~ k ~ n - 1 ). Then 0 = /(b) - f (a) f'(a) ~ m. e n- 1 = L (f(x.t+d - /(x.)) t. o ,._ 1 = ~ L f'(x .)(xk+ 11- l 1- 1<. o n- I L m(xk+ x.) + L (f(x• +d - f (x.) - f'(x.)(x . ... 1 - x.)) •- o •- l 1- x.)- It~ 0 =!m(b - a)>O. L-f m(x. + ·- 0 1- x.) 48 Chapter 2 Calculus and the Real Numbers This contradiction ensures that m = 0. The desired conclusion follows immediately. 0 Rolle's theorem implies the mean value theorem, which gives a basic estimate for the difference of two values of a differentiable function. (5.6) Theorem. Letf be diffe rentiable on the interval [a, b]. Then for each e >O there exists x in [a, b] with lf(b) - f(a)- f'(x)(b - a)l~ c. Proof: Define the function h on [a, b] by h(x)= (x - a)(f(b) - f(a)) - f(x)(b-a) (xe[a, b]). Then h(b)= h(a)= - f(a)(b - a). By (5.5), there eltists x in [a,b] with c ~ lh'(x)l= lf(b) - f(a) - f'(x)(b - a)l. 0 A function f on a proper interval I is increasing (respectively, strictly increasing) if f(x) ~ f(y) (respectively, f(x) > f(y)) whenever x, y E I and x > y. We say that f is decreasing (respectively, strictly decreasing) if - f is increasing (respectively, strictly increasing). It follows from Theorem (5.6) that if f : I-+ lR is differentiable on I and f is increasing (respectively, decreasing) on I. f'(x) ~ 0 (respectively, f'(x) ~ 0) for all x in I, then (5.7) DefinltiOIL Let f.J<1l,j< 21, ... ,j<"- 11 be differentiable functions on a proper interval I such that Df = f<l>, DJ<1l= f121, ... , Dj<"- 2'= f <n- 1), and set f<"l = DJ<"- 1>. Then f<"l is called the n'h derivative of f on 1, and is also written D"f; f is then said to be n times differentiable on I. The function f itself may be written f< 0l or D0f A natural way to simplify a continuous function and set it up for computation is to replace it by a polynomial approximation. The basic result on polynomial approltimation of differentiable functions is Taylor's theorem ((5.10) below). To see the motivation for Taylor's theorem, consider an n times differentiable function f on an interval 1, and a point a in I. It is natural to approltimate f by a polynomial of degree n whose derivatives of orders 0, 1, .. . ,n at a have the same values as the corresponding derivatives of f at a. The unique such S. Differentiation 49 polynomial is obviously " p«l(a) « -k ,- (x - a) . L- (5.8) k- 0 . This approximation is useful for a given va lue b of x only when there exists a good estimate for the remainder n JCkl(a) (5.9) R = f(b)- L - -(b-a}'. •- o k! The purpose of Taylor's theorem is to obtain such an estimate. (5.10) Theorem. Let f be an (n+ 1) times differentiable f unction on an interval I, let t be a positive number, and let a and b be points of I. Then there exists c with min {a,b} ~ c ~ max {a,b}, such that IR pn:;l(c)(b -c)"(b-a)l ~ t, where R is given by (5.9). Proof: Let M = I+ max {fll)(a)/ 1 !,f'2 l(a)/ 2!, ... , flnl(a)fn!} and let w be a modulus of continuity for f on [min{a,b}, max{a,b}]. Let 0 < c5< min {1, e/2n M, w{!e)} . Either O< la-bl or la - bl <o. In the latter case, ft IRI ~ lf(b) - f(a) l + . L l/111(a)/k!llb - all ~h +M L o•<!t + M k- 1 " L (ej 2n M) =e, k- 1 and so we can take c =: b. In case O< la function g on I by f'(x) g(x) = f(b)- f(x) ---yy-(b - x) j<•l(x) ... - - - (b-x)" - R(b II 1 Then g(a) ; g(b) "'" O and bl, define the differentiable f"(x) 2 ! (b - x) 2 x)(b-a) 1 g'(x)= - f'(x)+ f'(x)- f"(x)(b -x) + f"(x)(b - x)f<n+1l(x) f (•l(x) ... + -(b - x)•-• (b - x)"+R(b - a)- 1 (n - 1)! n! = - f (• + l)( X ) (b - x)" +R(b - a)- 1, 1 n. 50 Chapter 2 Calculus and the Real Numbers since the terms cancel in pairs except for the las t two. By Rolle's theorem, there exists c with min {a,b} ~ c ~ max {a,b} and l g'(c) I~ E.(b - a)- 1 • The required estimate for R now follows. 0 An important special case occurs when f is infinitely differentiable - that is, j<•l exists for all positive integers n - on an open interval I = (a - t, a+ c). In that case, if (5.11) r•j<• + t) - - - -+0 n! then the Taylor series for as n-+ ro (O<r<t). f about a, r. ,.,.Do-P")(a) ,- (x -a)", n. (5.12) converges to f on J. Note that although condition (.5.1 1) enables us to prove the convergence of (5.12) by the comparison test, Theorem (5.10) is needed to show that the sum of the Taylor series is f In fac~ the series (5.12) converges to f under the weaker condition r"j<") - - -+0 n! as n-+ co (O<r<t). To prove this, we need a different estimate of the remainder term, which is derived using the theory of integration from Section 6 below. (See Problem 23.) Examples of (5.12) will occur later, when some of the common functions of analysis are expanded in Taylor series. 6. Integration Differentiation deals with the instantaneous behavior of a function. We now look at a process, called integration, which deals with the behavior of a function on the average; in fact, it is essentially a rigorous formulation of the averaging process. Differentiation and integration will turn out to be inverse processes in a certain precise sense. If a= a0 ~ a1 ~ ... ~ a. = b are real numbers, the finite sequence P = (a 0 , ... , a.) is called a partition of the interval [a, b], n is called the length of P, and mesh P o; max {a;+ t - a;: O ~ i ~ n - 1} is called the mesh of P. A partition Q =: (a'0 , ... ,a;,.) of [a,b] is a refinement of P if for each i (0 ;£ i ~ n) there existsj with a) = a;. 6. Integration 51 For each partition P =: (a 0 ,a11 ••• ,a.) of [a,b] and each continuous function f on [a, b], S(J. P) will denote an arbitrary sum of the type n I L f(x1)(a,+ 1 -aJ (6.1) •· 0 where x1e [a1, al+ 1] (O ~ i ~ n - 1 ). In particular, if a1= a + i(b- a)jn (O ~ i ~ n), the quantity b- a n l ( b-a) (6.2) S(f,a,b,n) = f a+in l. o n is one of the numbers S(J. P). Two partitions P =: (a 0 , •• • ,aJ and Q = (b 0 , • •• , b.,) of [a,b] are separated if a0 < a1 < ... <a., b0 < b1 < ... < bm. and a1 =t= b1 for all i and j with l ~ i ~ n-1 and l ~j ~ m - 1. L (6.3) Theorem. If f is a continuous function on a compact interval [a, b], then the sequence (S(f, a, b, n))~. , (6.3.1) converges to a limit, which is called the integral of f from a to b and is written , Jf(x )dx . (6.3.2) a Moreover, if w is a modulus of continuity for f. then for any e > O and any partition P of [a, b] with mesh P ~ w(c), we have IS(J. P)- (6.3.3) If(x)dxl~e(b-a). Proof: Let e and {J be arbitrary positive numbers. Consider partitions P =: (a 0 , .• • ,a.) and Q=:(b 0 , .•. ,b, ) of [a, b] with meshP ~ w(e) and meshQ ~ w(b). To begin with, assume that b - a > O and that P and Q are separated. Let R =: (c0 , ... ,c,) be the common refinement of P and Q obtained by ordering the numbers a0 , ... , a., bit ... , bm- l in a strictly increasing sequence. Let z11 e [c~r ,cu.J (O ~ k ~ r - 1). For each i in {0, ... , n- 1}, let L denote summation over those indices k for which i a; ~ c,< a1 +t· Then IS(f, P)- S(J. R)l = I"I.' f(x;)(~+l - a;) - 'I.' f(z~r)(cu = l"f f(x;)L(Ck+ J i- 0 ~ 1 I •- 1 )1 -c11 ~- 0 ;. 0 -c~r) -"f Lf(z,)(cHJ-ck)l 1• 0 ,· •- 1 L L!f(x,)-f(zJI(ck+ I-Ct) ~ L L c(ct+,-c,) = e(b-a). •-o 1 1. 0 1 52 Chapler 2 Ca lculus and t.h e Real Numbers Similarly, IS(/, Q)- S(f, R)l ~ ~(b - a). It now follows that IS(/. P) - S(f, Q)l ~ (t + ~)(b - a). (6.3.4) Now consider the general case, and let IX be an arbitrary positive number. If b-a is small enough, then IS(/, P)- S(f, Q) l ~ (t + ~)(b - a) +IX. (6.3.5) If b - a> 0, then applying the fi rst part of the proof to separated partitions which a re sufficiently close approximations to P and Q, we again obtain (6.3.5). As IX> 0 is arbitrary, we now see that (6.3.4) holds in the general case also. In particular, IS(/, a,b,J) -S(f,a, b, k)l ~ 2 t (b - a) for all positive integers j, k~ w(El) - 1(b - a). Therefore (6.3.1) is a Cauchy sequence, and hence converges. For each integer k ~w(<>t 1 (b -a), inequality (6.3.4) entails IS(/, P) -S(f, a, b, k)l ~ (e + c5)(b - a). 0 Passing to the limit as k ..... oo and c5-+ 0 gives (6.3.3). If f and g are real-valued functions on a set X such that X, then we write f :5 g. f(x) ~ g(x) for all x in (6.4) Proposition. Let f and g be continuous functions on a compact interval [a,b] , and let a.,p be real numbers. Th en b (6.4.1 ) M oreover, b " if f ~ g, then b (6.4.2) b J<1Xf(x)+{Jg(x))dx = a J f(x)dx + {J Jg(x)dx. b Jf (x)dx ;£ J g(x)dx. Proof: Equation (6.4.1) follows from the fact that S(af + {Jg, a, b, n) = aS(f, a, b, n) + {JS(g, a, b, n) for all n in .z+, and (6.4.2) follows from the inequalities S( f, a, b, n) ~ S(g, a,b, n) (nE ll+). As a corollary we have b c 1 (b - a) ~ J f(x) dx :5 c2 (b - a) " 0 6. Integration whenever c 1 ~ /(x) ~ c 2 for all x in [u,b]. By setting = II / II (a, bJ• we obtain I! (6.5) f(x) (6.6) Proposition. If f is [tl, b], then ~ (6.6.1) Cl d.'(,~ II / II (o, bJ(b - - c 1 = c2 a). continuous func tion on < 53 t1 com pact illlen:al h Jf(x) dx = Jf(x) clx + Jf(x)dx (a ~ c ~ b). II Proof: Let a ~ c ~ b. For arbitrary partitions P = (a, .. . ,£')of [u,c] and Q = (c, ... ,b) of [c,b] we have S(f, P) +S(f, Q) = S(f, R ) with appropriate choices of the approximating sums, where R is the partition (a , ... , c·, ... , b) of [tt, b] formed by combining the partitions P and Q. Since mesh R = max {mesh P, mesh Q}, equality (6.6.1 ) follows by passage to the limit as mesh P -+ 0 a nd mesh Q -+ 0. 0 f For arbitrary real numbers a and b and each continuous funct ion on [min{a,h}, max{a, b}] we define (6.7) b b II II ). ). Jf(x)tL"C = f f(x)dx - Jf(x) clx, where A. :: min{a,b}. In case a ~ b this agrees with the definitionalready given. Equality (6.4.1 ) is valid for arbitrary real numbers a and b, provided that the integrals are defined (which means that f and g are continuous on the interval (min {a,b}, max {t1,b}]). Equality (6.6.1) is valid whenever f is continuous on the interval [min {a. b, c}, max {a,b,c}]. In particular, setting b = a in (6.6.1) g ives < II Jf(x) tlx = - Jf(x) dx II whenever the integrals are defined. For arbitrary real numbers a a nd b equality (6.5) holds with I / 11 1,.~ 1 replaced by the norm of f on [min {a,b}, max {a, b}]. We now prove that differentiation and integration are inverse processes, a result whose imposing title is the fundamelllal theorem of calculu.,. Chapter 2 Calculus and the Real Numbers 54 (6.8) Theorem. L et f be a continuous function on a proper interval I, let a be a point of I, and write X g(x) : Jf(t) dt (X E I). a Then g' = f Also, if g 0 is any differentiable function on I with g'o = f, then the difference g - g 0 is a constant fun ction. Proof: There is no loss of generality in taking I to be compact. Let w be a modulus of continuity for f on J. For each e> 0 and each pair of points x, yin I with lx - yl ~ w(e), we have lg(y) - g(x) - f(x)(y - x)l =I! =I! f(t)dt - ! f(x)dt l (f(t) - f(x))dtl ~ely -xi. Therefore g is differentiable on I , with derivative j and modulus of differentiability w. Ir also g0= f on I, then (g - g 0 )' = 0 on I . By the mean-value theorem, (g -g 0 )(a) = (g-g0 )(b) whenever a, bEl and a<b. Therefore g - g 0 is a constant function. 0 Our next lemma has an interest of its own. (6.9) Lemma. Let (J,.) be a sequence of continuous functions converging to a function f on a nonvoid interval I. Let a be any point of I , and write X f g(x) = f(t)dt (xEI) a and X f g.(x) = f.(t)dt (xEI) " for each positive integer 11. Then g.-+ g on I as n-+ oo. Proof: It is no loss of generality to assume that I is compact. Let r be the length of J. Then for each x in I we have jg.(x) -g(x)l =I! (f.(t)- f(t)) del ~ r ll f. -!11 1 -+0 Therefore (g.) converges to g on I . 0 as n-+oo . 7. Certain ImpOrtant Functions SS (6.10) Theorem. Let (f,.) be a sequence of differentiable fun ctions which converges to a continuous f on a proper interval I . Assume that the sequence (f:) of derivatives converges to a continuous function g on I. Then f' = g on I. Proof: It is no loss of generality to assume that I is compact Let a be any point of I . For each x in I write X h(x) ::: Jg(t)dt G and JC h.(x) = J f:(r)dt . By (6.8), we have h'= g and h~ = f,' (nEZ +). Hence f.-h,. is constant on / . By (6.9), we have h.-+ h on I as n-+ oo. Therefore f- h = lim (J,. -h,.) is constant on /. It follows that f' ~ h' = g on /, as was to be proved. 0 7. Certain Important Functions A good point of departure for the study of the exponential function is obtained from an analysis of its rate of change. Intuitively the exponential function is that differentiable function on It, normalized to have the value l at 0, which increases at a rate equal to its size. In other words, it is the solution of the differential equation Df = f normalized by the condition f(O) = 1. Clearly, such a function J, it it exists, is infinitely differentiablt; and fl" 1(0) = l for all nonnegative integers n. Thus the Taylor series for f about x = O is (7.1) Fo r each interval [ - c, c] with c>O; c" ll f 111- t,tVn !-+0 as n -+oo, and hence the series converges to f on [- c, c]. Therefore if f exists, it is unique, and is given by (7.1). On the other hand, (7.1) converges on lR, by the ratio test. We therefore define the exponential function exp by means of the series (7.1): "' x" (7.2) exp(x) = (xe lR). , . on! L- 56 Chapte r 2 Calculus and the Real Numbers Clearly exp (0) = I. Also, (7.3) d exp(x) dx a) nxn- 1 L- n- 0 a) xn- 1 ,- = L -( - l)' = exp(x), n. n. 1 n . by (6.10). The function exp has the desired properties, and is unique. (7.4) Proposition. Tlte function exp satisfies the functional equation (7.4.1) exp (x + y) = exp (x) exp (y) (x, ye lR). Proof: Let z be any positive real number. Then exp(z)=f: O, by (7.2). Define the differentiable function f: lR-+ lR by f(x) ::: (exp(z))- 1 exp(x+z). The derivative off is dfl(x) = (exp(z))- 1 exp(x + zJ"(x/+ z) = f(x), tX lX by (5.4). Also, f(O) = 1. By the uniqueness of the exponential function, f(x) = exp (x) - that is, (7.4.2) exp(x+ z)=exp(x) exp(z). Consider now arbitrary real numbers x and y. Choose z with z > 0 and z + y > 0. Then, by (7.4.2), exp (x + y) exp(z)= exp(x + y+ z) = exp(x) exp(y + z) = exp (x) exp (y) exp (z), which is equivalent to (7.4.1). 0 It can be shown that exp is the unique continuous function from 1R to :R that has the value I at 0 and satisfies (7.4.1). Equality (7.4.1) implies that exp( -x)= (exp(x))- 1• Also. (7.5) exp(x)>O (xe 1R). For, given x in lR and choosing z > 0 so that x + z > 0, we have exp (z);;;; I and exp(x+ z);;;; I; whence exp(x)=exp(x + z)(exp(z))- 1 > 0. The theory of the exponential function can also be approached as follows . To construct a function f with f(O) = I and f' = f, notice that f' = f implies, by the mean-value theorem, that f(x + o) is approximately equal to f(x) + of'(x) = (1 + i5) f(x) when o is small. Therefore, if t is any real number and n is a large positive integer, f(tf n) 7. Certain Important F unct ions is 1 + (t/ n), /(21/n) = f(t/ n +t/n) approximately is 57 approximately (I + (t/n)) 2 , and so forth , and thus f(t) = f(n(t jn)) is approxi mately (l + (tjn)t. This heuristic argument suggests that we define exp(t) = lim ·-·J (1 +.:_)". 11 It is not hard to see that t n(n - l) {t) ~ + ... (l+~t)" - l +n~+ 2! 2 a. converges on R to L t"j n! as n __. oo. This approach to •- 0 the theory of the exponential function would therefore lead to the same series representation. We want the logarithmic function to be the inverse function to the exponential fun ction, but we shall not define it in that way. Arguing heuristically on the basis of the chain rule (5.4), for each x > O we want l dx = d d d dx = dx (exp(ln (x)) =exp(ln (x)) dx In (x) = x dx In (x). This gives (7.6) d dx ln(x)= x - 1 (x > O). We therefore define In (x) to be the integral of x - 1 ; specifically J: (7.7) ln(x) = Jc 1 dt (x>O). I By (6.8), In is differentiable on (0, ex:>) and (7.6) is valid. Let y be any positive real number. By (5.4. 1), the derivative of the funct ion (7.8) XI-+ In (x y) - In (y) is x - 1 • Also, (7.8) vanishes at l. By the last statement of (6.8), ln(x) = ln(xy)-ln(y); in other words, (7.9) ln(xy) = ln(x)+ln(y) (x,y>O). This is the functional equation for the logarithmic function, corresponding to the functional equation (7.4.1) for the exponential function. 58 Chapter 2 Calculus and the Real Numbers By (7.5) and (5.4), the composite function lnoexp exists and is differentiable eve rywhere on R , and d dx (ln (exp (x))= 1 (xe ll). Since also In (exp (0)) = In (1)= 0, we have (7.10) ln(exp(x))= x (xe ll), by the last part of (6.8). Consider x > O and write y :: exp(ln(x)). Then In (y) = In (exp (In (x)))= In (x), by (7.10). Thus 1 0 = In (y) - In (x)= Jt - 1 dt. z If y> x, this gives 0 ~ y- 1 (y - x), a contradiction. By (218), we therefore have y ~ x. Similarly, x ~ y; whence x = y. Thus exp (ln (x)) = x for all x> O. It follows that the functions exp: JR _. Jt+ and In: IR.+-+R are inverse to each other. For any a > O we now define ~ = exp(x ln (a)) (xe ll). We leave the reader to confirm that the map x~--+~ has the familiar properties. Note that for a fixed x > 0 the map t 1--+ t" of IR. + into IR. + extends to a continuous map of R 0 + into IR.O+ with OX = 0; this follows from the fact that t" is arbitrarily small for all positive numbers t sufficiently close to 0. The trigonometric functions sin and cos also can be approached via an intuitive analysis of their rates of change. From this analysis we are led to believe that (7. 11) d . dx cosx= -smx, d . dx StnX = COSX (.X€ R.). We therefore define these functions by power series constructed in such a way that (7.11) will hold. Remembering that cos (0)= 1 and sin (0) = 0, we are forced to define ,., x2• (7.12) cos X E L ((2 ) I ' •- o n. (xe iR.). w "" sinx = L (•• o xln+l 1)" (2 I)' n+ . Theorem (6.10) implies that (7.11) is valid. 7. Certain lmportunt Functions 59 (7.13) Proposition. The functions sin and cos satisfy the fun ctional equations {7.13.1) sin (x+ y) = sin x cosy +cosx sin y (7.13.2) cos(x+ y) - cosx cosy -sinx sin y for all x, y in 1t. Proof: Consider both sides of (7.13.1) as functions of x, say f(x) and g(x). The functions f and g have Taylor series about x = O which converge on R. The coefficients of these series are expressed in terms of the various derivatives off and g at x = O. Therefore to show that f = g on lR it is enough to show that J<"1(0) = g<•I(O) for all nonnegative integers n. Since j< 21 = - f and g 121 = - g, it is enough to notice that f(O) = sin y = sin 0 cosy+ cos 0 sin y = g(O) and f'(O) = cos y = cos 0 cosy+( -sin 0) sin y = g'(O). Therefore (7.13.1) is valid. Equation (7.13.2) follows from (7.13.1) by differentiation with respect to x. 0 Putting y = - x in (7.13.2) gives the useful identity (7.14) Further study of the trigonometric functions depends on properties of the number n, which we construct as twice the first positive zero of the cosine function. (7.15) Theorem. The sequence (x.) of real numbers defined inductively by the equations (i) (ii) x. = l x • ..- 1 :: x.+cosx. (n ~ 1) is increasing, and converges to a limit rt/ 2 such that cos(rt/ 2)=0 and cosx>O for all x with O ~ x<rt/2. Proof: To begin with; we prove that (7.15.1) cost>O (O ~ t ~ x., ne Z ...). Consider first the case n = I, when we have 0 ~ t ~ x 1 = I. Then cost= ( 1 - ~~) + (~~ - ~~) + ... ~ I - ! > 0. 60 Chapter 2 Calculus and the Real Numbers Next assume that (7.15.1) is true for a particular value of n; then x. +1 > x •. Consider a value of t with x. < t ~ x, +1• Now Isin xi ~ I for all x in R , by (7.14); and lsinx,l <I, because cosx, > O. Hence I J cos l = cosx,- sin X dx> cosx, -(x,+ 1 -;<,) = 0. Since cos is a continuous function and cost> 0 whenever 0 ~ t ~ x, or x.<t~Xn+h cost>O whenever O~t~Xn+t· By induction, (7.15.1) holds for all n. Since cos x, > 0 for all n, the sequence (x.) is increasing. From the mean-value theorem and (7.15.1) we obtain sin x.~sin 0 =0 (7.15.2) for all n, and sin t;;:; sin I whenever 1 ~ t ~ x, for some n. Also by the mean-value theorem, for each n ~ 2 there exists t in [x,_ 1 , x,] such that x,+ 1 -x,=x.+cosx,-(x._ 1 +cosx.,_ .) =x. -x,_ 1 +cosx, -cosx._ 1 is arbitrarily near to x.-x._ 1 -(sint)(x,-x,_ 1 )~(x,.-x,_.)(1-sin 1). Therefore This gives x,+ 1 -x,~(l-sin 1)"- 1 (x 2 -x 1 ) for all n ~I. Thus (x.) is a Cauchy sequence, whose limit we denote by n/2. lfO~x<n/2, then x<x, for some 11, and thus cosx>O, by (7.15.1). Taking limits on both sides of (ii) as n .... oo we see that cos (n/ 2) =0. 0 By (7.15.2), sin(n/2)~0. Since cos(n/2)= 0, it follows from (7.14) that sin (n/ 2) = I, and hence from (7.13.1) that sin (x + n/2) = cos x for each x in JR. Similarly, sin(n/2-x)=cosx and cos(x+n/ 2) = -sinx. Hence sin(x+n) = cos(x+n/ 2) = -sinx, and thus sin(x+2n) = sinx. Similarly, cos(x+n)= -cosx and cos(x+2tt) = cosx. On the other hand, if a>O and sin(x+a)=sinx for all x, then cosa= sin(n/ 2+a) =sin (n/2) = I. Since (as the reader may prove) cos is a strictly decreasing function on [0, n] and a strictly increasing function on [n, 2n], and since cos0=cos2n= I, it follows that a ~ 2n. T hus the function sin, and similarly the function cos, is periodic, with period 2n. 7. Certain Important Functions 61 Next we study the function arcsin, which is inverse to the function sin. For motivation we take the derivative of the equation sin(arc sinx) = x and get cos (arc sin x) :x (arc sin x) = l. Since 1 = cos 2(arcsin x)+ sin 2 (arc sin x) = cos 2 (arc sin x) + x 2 , it follows that :)arcsinx)= ±(l-x 2 ) With this motivation, we define X arcsinx :: J(l-t 2 )- ! dt (-I <x< l). 0 Then arc sin is dirferentiab1e on (- 1, 1), and d . dx(arcsmx) = (l-x 2)-~. To discuss the range of arc sin, we first observe that the function sin is strictly increasing on [ -rr/ 2, rr/2]. Since sin ( - rr/ 2) = - 1 and sin (rr/ 2) = 1, it follows from the remarks after (4.8), and the continuity of sin, that for each x in (-I, I) there exists y in ( -rr/2, rr/2) with x = sin y. For any y in ( - rr./2, rr/2) we have d arcsin(siny)=(l-sin 2 y) l ty l cosy = I. Hence, as arcsin(sin 0) =0, (7.16) arcsin(siny)=y ( -rr1 2<y<n/2). It follows that the function arcsin maps the interval (-I, 1) into the interval ( -rr./2, n/2). Moreover, ir -1 <x< 1 and z = sin(arcsinx), then arcsinz=arcsinx, by (7.16); so that - J(l -r 2 ) -~ dt=0. X Therefore x = z = sin (arc sin x). It follows that the functions sin:( -rr./ 2, rr./2)->( -I, 1) and arcsin: ( -1,1)->( -nt 2,nf 2) are inverses. 62 Chapter 2 Calculus and the Real Numbers Problems The word "not" will be used throughout in the sense of the discussion at the end of Section 2. 1. Construct a set A s uch that x = y for all elements x and y of A, but A is not subfinite or void. 2. Construct a mapping f from a set A to a set B such that f is onto B, but there does not exist a mapping g: B-+ A with f(g(b)) = b for all bin B. 3. Prove that (x,y) ........ xy is a function from R x JR. into JR. 4. Let x and y be real numbers s uch that x =t= y entails 0 = 1. Show that x = y. 5. Show that if x1o ... , x" are real numbers such that x1 ... x" < 0, then for some i. X;< 0 n 6. Call a pair (S, of nonvoid subsets of the set Q of rational numbers a Dedekind cut if s < t for all s in S and t in T, and if for arbitrary rational numbers x, y with x < y either x e S or ye T Show that for each Dedekind cut (S, T) there exists a unique real number x(S, T} such that s ~ x(S, for all sinS, and t ~ x(S, T} for all t in T Show that for each real number x there exists a Dedekind cut (S, T} with x(S, n n=x. 7. Show that the real number system can be constructed from the Dedekind cuts. In other words, define directly equality, order, addition, a nd multiplication of Dedekind cuts in such a way that the map (S, T) ........ x(S, from the set of all Dedekind cuts into JR. is a bijection which preserves order and the operations of addition and multiplication. n 8. Construct a real number that is not positive or negative or equal to 0. 9. Construct a real number that does not have a decimal expansion. 10. Construct a real number that is not rational and not irrational. (A real number x is irrational if x=t=r for each rational number r.} 11. Construct a continuous function f: [0, 1]-+ JR. such that there does not exist a point x in [0, 1] with f (x} equal to the supremum off Problems 63 12. Show that a continuous function on a compact interval admits a modulus of continuity which is a continuous function. 13. A mapping f of an interval I into R is weakly injective if x = y whenever f(x) = f(y), and is injective if f(x) =F f(y) whenever x =F y. Prove that the following statements are equivalent. (i) Every weakly injective map f: [0, 1]-+ R is injective. (ii) If x is a real number such that the equality x = O is impossible, then x =t=O. (iii) For each sequence (x.) in {0, 1}, if it is impossible that x. = O for all n, then there exists n with x.= 1 (Markov's principle). (There is no known proof or counterexample for Markov's principle.) 14. Let f: [0, 1]- R be a continuous function with /(0) < /(1). Show that there is a sequence {y.) of elements of (f(0),/(1)] such that if /(0) ~ y ~ /(1) and if y =t= y. for each 11, then there exists x in [0, 1] with /(x) = y. 15. Let f : [0, 1] -+ JR. be a continuous function with /(0) < 0 and /(1)> 0, such that for arbitrary real numbers a and b with O ~ a<b ~ 1 there exists x in [a, b] with f (x)=F O. Show that there exists x in [0, 1] with f(x) = 0. 16. Let / : [0, 1]-+ JR. be a continuous function with /(0) < 0 and /(1) > 0, such that there exists a positive integer n with 1/(x)l + lf'(x)l + ... + lfl"l(x)I >O for all x in [0,1]. Show that there exists x in [0, 1] with f(x)=O. 17. Let f: [0, 1]-+ JR. be a polynomial function with /(0) < 0 and /(1)>0. Show that there exists x in [0,1] with f (x)=O. 18. Show that if f: [0,1]-+ [0,1] is a continuous function, then for each e>O there exists x in [0, 1] with 1/(x) - xl <e. 19. Let f: [0, 1]-+ R be differentiable and locally nonconstant, in the sense that if x, yE[O, 1] and x <y, then there exists z in (x, y) with f(z) =F f(x). Suppose also that /(0)= /(1). Prove that there exists x in [0, 1] with f'(x) = O. 20. Construct a continuous function /: [0,1]-+ R such that f' does not exist on any proper compact subinterval of [0,1]. (In other words, 64 Chapter 2 Calculus and the Real Numbers the assumption that to a contradiction.) f' exists on a proper compact subinterval leads 21. Let f and g be continuous functions on the compact interval [a,b], such that g(x) G O for all x in [a,b]. Prove that for each e>O there exists c in [a, b] such that l! f(x) g(x) dx - f(c)! g(x) dxl <e. 22. Prove that under the hypotheses of Taylor's theorem, b j(•+J>(c) R=J t (b-t)"dt. G II. Hence prove that for each e>O there exists c with min {a, b} 2" c ~max {a,b}, such that R l j<•+ll(c) n+ll (n+ l)! (b- a) <e. 23. Prove that if f is infinitely differentiable on the open interval I :(a-t, a+ c), and if r•J<•> - - -+0 n! then the Taylor series for as n-+oo (O<r < c), f about a converges to f on /. Notes As the definitions of Section 1 indicate, a single concept of classical mathematics may split into two or more distinct concepts when looked at constructively. For example, the distinction between finite and subfinite sets does not exist classically ; nor does the distinction between weakly injective and injective mappings (Problem 13). In [7] what we have called a map of A onto B is called a surjection; a function f: A-+ B is there said to map A onto B only if it has a right inverse. Although there is a meaning ful distinction involved here (see Problem 2), we prefer to use the simple expression 'onto ' for the more general situation. When a right inverse map exists we can say so explicitly if we need to. Problem 2 indicates that the axiom of choice is 110t true in constructive mathematics. In certain formal systems attempting to de- Notes 65 scribe constructive mathematics it can be shown that the axiom of choice entails the principle of omniscience [42]. An alternative to Definition (2.1) would be to define real !lumbers by the method of Dedekind cuts (see Problem 6). Another possibility would be to define a real number as a sequence (xn) of rational numbers and a sequence (Nk) of integers, such that lxm-xnl~k- 1 whenever m, n~Nk. Notice that a real number is a regular sequence of rational numbers, not an equivalence class of regular sequences of rational numbers. To define a real ·number to be such an equivalence class would be either pointless or incorrect: pointless (but correct) if the equivalence class is required to be specified by giving some particular regular sequence of rational numbers that belongs to it, and incorrect otherwise. Precisely what is meant by a subset (in particular, what it means for R + and :R0 + to be subsets of .IR) will be discussed in Chapter 3. In classical mathematics, a proof like that of Lemma (2.18), in which we assume a certain hypothesis, derive a contradiction, and therefore conclude that the negation of our hypothesis is true, is called an indirect proof. It is necessary to exercise caution in thinking of Lemma (2.18) in this way. Although x ~y is equivalent to the negation of x > y, it is not true that x > y is equivalent to the negation of x ~ y. There is a paradox growing out of Theorem (2.19) which the reader should resolve. Since every regular sequence of rational numbers can presumably be described by a phrase in the English language, and since the phrases in the English language can be sequentially ordered, the regular sequences of rational numbers can be sequentially ordered, in contradiction to Theorem (2.19). The counterexamples following Theorem (2.19) are deceitful. The reader is asked not to form the impression that the purpose of constructive mathematics is to consider pathological numbers like the ones discussed here. The only reason for discussing such numbers is to show that certain statements (in this case, the statement that for each x in JR 0 + either x>O or x = O) are not constructively valid. To appreciate this point better, the reader should examine the counterexamples given in Chapter I, where the treatment is more detailed. Sometimes a mathematician, when confronted with a counterexample of this sort, will say he is not interested in real numbers that have been artificially twisted out of shape. One reply to this is to point out that the set of real numbers x such that either x >0 or x ~ O is 110t complete (nor is it closed under addition). The 'choice' involved. in the proof of (b) of Proposition(3.4) is performed according to a definite rule. 66 Chapter 2 Calculus and the Real Numbers Strictly speaking, a continuous function on a compact interval I is a pair (/, w) consisting of a function f: I-+ R and a modulus of continuity w for f on I ; two continuous functions (f1, w 1) and (f 2 , w 2) are equal if / 1 and / 2 are equal functions. In practice, as in Definition (4.5), we call f itself a continuous function, in the spirit of the remarks preceding Proposition (2.9). Notice that divergence is a positive notion. As we remarked in Chapter 1, a nonvoid bounded set of real numbers need not have a least upper bound. This, of course, makes some parts of constructive mathematics harder than the corresponding classical material, but it also makes some parts more interesting. The concept introduced in Definition (4.5) is classically called uniform continuity. We abbreviate this to continuity, since the classical version of continuity (that is, pointwise continuity) is not used anywhere in the book. Classically, one proves that on a compact interval, pointwise continuity implies uniform continuity. Constructively, there is no known proof of this result, and so we assume uniform continuity from the start. Nothing essential is lost by this assumption, and a certain simplicity is gained. Similar remarks apply to Definitions (4.9) and (5.1). It would be more correct to call Lemma (5.5) a constructive substitute for Rolle's theorem: the classical result states that under the hypotheses of Lemma (5.5), f'(x) =0 for some x in (a, b). It follows from Problem 19 that as long as we are concerned with the familiar elementary functions of real analysis, the sharp classical form of Rolle's theorem is constructively valid. The proof of Theorem (5.10) illustrates the minor inconveniences caused by not being able to compare arbitrary real numbers. For Riemann integration of possibly discontinuous functions see Problem 10 of Chapter 6.