CHAPTER I - THE BASIC IDEAS

JOSEPH G. CONLON

MATH 625-2010

1. Measures induced by random variables

A probability space is generally defined as a triple $(\Omega, \mathcal{F}, P)$, where $\Omega$ is a set, $\mathcal{F}$ is a Borel algebra of subsets of $\Omega$, and $P : \mathcal{F} \to \mathbf{R}$ a probability measure. Hence $P$ is a positive measure on $\mathcal{F}$ which satisfies $P(\Omega) = 1$. A real valued random variable $X$ on $\Omega$ is then a Borel measurable function $X : \Omega \to \mathbf{R}$, which means
$$
(1.1)\qquad O \text{ an open subset of } \mathbf{R} \ \implies\ \{\omega \in \Omega : X(\omega) \in O\} \in \mathcal{F}.
$$
Rather than begin in such an abstract way, we can start with the cumulative distribution function (cdf) $F(x) = P(X \le x)$ of $X$. The function $F : \mathbf{R} \to [0,1]$ is monotone increasing on $\mathbf{R}$, and right continuous, i.e.
$$
(1.2)\qquad \lim_{x' \downarrow x} F(x') = F(x), \qquad \lim_{x \to \infty} F(x) - \lim_{x \to -\infty} F(x) = 1.
$$
Standard measure theory tells us that there is a Borel measure $P$ on $\mathbf{R}$ such that
$$
(1.3)\qquad P(\{x : a < x \le b\}) = F(b) - F(a), \qquad -\infty < a < b < \infty.
$$
Thus given the cdf $F(\cdot)$, we can construct $(\Omega, \mathcal{F}, P)$ and $X : \Omega \to \mathbf{R}$ as above, in which $\Omega = \mathbf{R}$, $\mathcal{F}$ = the Borel algebra generated by the open sets of $\mathbf{R}$, and $P$ is given by (1.3). The variable $X$ is just the identity $X(\omega) = \omega$, $\omega \in \mathbf{R}$. The probability space $(\Omega, \mathcal{F}, P)$ so constructed is the simplest probability space associated with the cdf $F(\cdot)$.

We can generalize the previous construction to many random variables $X_1, \dots, X_N$, once we know the joint cdf $F(x_1, \dots, x_N) = P(X_1 \le x_1, \dots, X_N \le x_N)$. In that case $\Omega = \mathbf{R}^N$, $\mathcal{F}$ = the Borel algebra generated by the open sets of $\mathbf{R}^N$, and $P$ is defined similarly to (1.3). This can be further generalized to a countably infinite set of variables $X_1, X_2, \dots$, provided the joint cdfs of the variables satisfy an obvious consistency condition. Thus for $1 \le j_1 < j_2 < \cdots < j_N$, let $F_{j_1, j_2, \dots, j_N}(x_1, x_2, \dots, x_N)$ be the cdf of $X_{j_1}, X_{j_2}, \dots, X_{j_N}$. Consistency then just means:
$$
(1.4)\qquad \lim_{x_k \to \infty} F_{j_1, \dots, j_N}(x_1, \dots, x_{k-1}, x_k, x_{k+1}, \dots, x_N) = F_{j_1, \dots, j_{k-1}, j_{k+1}, \dots, j_N}(x_1, \dots, x_{k-1}, x_{k+1}, \dots, x_N),
$$
for all possible choices of the $j_i \ge 1$, $x_i \in \mathbf{R}$.
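The construction above can be made concrete on a computer: taking $\Omega = [0,1]$ with $P$ = Lebesgue measure, the variable $X = F^{-1}(\omega)$ has cdf $F$, and property (1.3) can be checked by sampling. A minimal Python sketch (the helper name `sample_from_cdf` and the exponential example are ours, purely for illustration):

```python
import math
import random

def sample_from_cdf(inv_cdf, n, seed=0):
    """Inverse-transform sampling: X = F^{-1}(U) with U uniform on (0,1).

    This realizes the 'simplest probability space': Omega = [0,1] with
    Lebesgue measure, and X a function of omega with the prescribed cdf F.
    """
    rng = random.Random(seed)
    return [inv_cdf(rng.random()) for _ in range(n)]

# Example: exponential cdf F(x) = 1 - e^{-x}, so F^{-1}(u) = -log(1 - u).
draws = sample_from_cdf(lambda u: -math.log(1.0 - u), 10000)

# Check (1.3): the empirical frequency of {x : a < x <= b} should be close
# to F(b) - F(a); here a = 0, b = 1.
frac = sum(1 for x in draws if 0 < x <= 1) / len(draws)
```

With 10,000 samples the empirical frequency should match $F(1) - F(0) = 1 - e^{-1} \approx 0.632$ to within a percent or two.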
This construction of $(\Omega, \mathcal{F}, P)$ for a countably infinite set of variables is known as the Kolmogorov construction. Here $\Omega = \mathbf{R}^\infty$ and $\mathcal{F}$ is the $\sigma$ field generated by finite dimensional rectangles. For a particularly simple set of variables the Kolmogorov construction is already done for us by the construction of Lebesgue measure. Thus consider a Bernoulli variable $X$, with $X = 1$ with probability $1/2$ and $X = 0$ with probability $1/2$. We wish to carry out the Kolmogorov construction for an infinite set $X_1, X_2, \dots$ of independent Bernoulli variables. This can be done by setting $\Omega = [0,1]$, $P$ = Lebesgue measure, and $X_j(\omega)$ the $j$th entry in the binary expansion of $\omega \in [0,1]$. Thus we have
$$
(1.5)\qquad \omega = \sum_{j=1}^\infty \frac{X_j(\omega)}{2^j}, \qquad \omega \in [0,1].
$$
Note that this construction is actually simpler than the Kolmogorov construction since for that we have $\Omega = \mathbf{R}^\infty$. Here we are using the fact that the space $\{0,1\}^\infty$, with the probability measure induced by the counting measure on finite subsets, is identical to the interval $[0,1]$ with Lebesgue measure. This illustrates an important point in probability theory. What matters are the values a random variable can take and the associated cdf. Generally we try to construct the simplest probability space $(\Omega, \mathcal{F}, P)$ consistent with this information.

2. Law of Large Numbers

If we take the Bernoulli variables $X_1, X_2, \dots$ generated above, then we have a model for the problem of independent tosses of a fair coin. Thus $X_j = 1$ if the $j$th toss is H and $X_j = 0$ if it is T. Hence $S_N = X_1 + \cdots + X_N$ is the number of heads in $N$ tosses of the coin. The law of averages tells us that
$$
(2.1)\qquad \lim_{N \to \infty} \frac{S_N}{N} = \frac{1}{2} \quad \text{with probability } 1.
$$
The abstract measure theory we just presented gives us a precise way of formulating this law of averages.

Proposition 2.1. Let $X_1, X_2, \dots$ be independent identically distributed (i.i.d.) variables on $(\Omega, \mathcal{F}, P)$ which have the property that $\langle X_1^2\rangle < \infty$. Then if $S_N = X_1 + \cdots + X_N$, there is for any $\varepsilon > 0$ the limit
$$
(2.2)\qquad \lim_{N \to \infty} P\left( \left| \frac{S_N}{N} - \langle X_1\rangle \right| > \varepsilon \right) = 0.
$$
Proof. We use the Chebyshev inequality that for any random variable $Y$ and $p, \delta > 0$, one has
$$
(2.3)\qquad P(|Y| > \delta) \le \langle |Y|^p\rangle / \delta^p.
$$
Hence taking $p = 2$ in (2.3), we have that
$$
(2.4)\qquad P\left( \left| \frac{S_N}{N} - \langle X_1\rangle \right| > \varepsilon \right) \le \frac{\mathrm{var}[S_N]}{N^2 \varepsilon^2} = \frac{N\,\mathrm{var}[X_1]}{N^2 \varepsilon^2} \le \frac{\langle X_1^2\rangle}{\varepsilon^2 N}.
$$
Proposition 2.1 is known as the weak law of large numbers. It tells us that $S_N/N$ converges in probability, i.e. in measure, to the average $\langle X_1\rangle$. However (2.1) is really saying something stronger than this.

Proposition 2.2 (Strong Law of Large Numbers). Let $X_1, X_2, \dots$ be i.i.d. variables on $(\Omega, \mathcal{F}, P)$ which have the property that $\langle X_1^4\rangle < \infty$. If $S_N = X_1 + \cdots + X_N$, then
$$
(2.5)\qquad \lim_{N \to \infty} \frac{S_N}{N} = \langle X_1\rangle \quad \text{with probability } 1.
$$
Proof. Just as the weak law followed from estimating the variance of $S_N$, so the strong law (2.5) follows by estimating the fourth moment of $S_N - \langle S_N\rangle = S_N - N\langle X_1\rangle$. It is easy to see that
$$
(2.6)\qquad \langle \{S_N - \langle S_N\rangle\}^4 \rangle \le C N^2
$$
for some constant $C$. Applying (2.3) with $p = 4$ yields
$$
(2.7)\qquad P\left( \left| \frac{S_N}{N} - \langle X_1\rangle \right| > \varepsilon \right) \le \frac{C N^2}{N^4 \varepsilon^4} = \frac{C}{\varepsilon^4 N^2}.
$$
Thus for any $N_0 \ge 1$,
$$
(2.8)\qquad P\left( \limsup_{N \to \infty} \left| \frac{S_N}{N} - \langle X_1\rangle \right| > \varepsilon \right) \le \sum_{N=N_0}^\infty P\left( \left| \frac{S_N}{N} - \langle X_1\rangle \right| > \varepsilon \right) \le \sum_{N=N_0}^\infty \frac{C}{\varepsilon^4 N^2} \le \frac{C'}{\varepsilon^4 N_0},
$$
for some constant $C'$. We conclude that for any $\varepsilon > 0$,
$$
(2.9)\qquad P\left( \limsup_{N \to \infty} \left| \frac{S_N}{N} - \langle X_1\rangle \right| > \varepsilon \right) = 0.
$$
Since $\varepsilon > 0$ is arbitrary we have then that
$$
(2.10)\qquad P\left( \limsup_{N \to \infty} \left| \frac{S_N}{N} - \langle X_1\rangle \right| > 0 \right) = 0,
$$
whence (2.5) holds.

Remark 1. Proposition 2.2 applies to the coin tossing problem since in this case $\langle X_1^4\rangle = 1$. However there are many cases where the fourth moment of $X_1$ is not finite. Since $\langle X_1\rangle$ enters the RHS of (2.5), we might reasonably expect that (2.5) holds provided $\langle |X_1|\rangle < \infty$. This stronger result is not so easy to prove, and requires the development of new ideas which we shall discuss in future lectures.

3. The Central Limit Theorem

The SLLN tells us that the variable $S_N/N$ converges to $\langle X_1\rangle$ as $N \to \infty$ for sums of i.i.d. variables.
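This convergence is easy to probe numerically. The sketch below (helper names are ours) reads off i.i.d. Bernoulli variables from the binary digits of uniform samples, exactly as in the construction (1.5), and checks that $S_N/N$ is close to $\langle X_1\rangle = 1/2$:

```python
import random

def binary_digits(omega, k):
    """First k binary digits X_1(omega), ..., X_k(omega) of omega in [0,1),
    so that omega = sum_j X_j(omega) / 2^j as in (1.5)."""
    digits = []
    for _ in range(k):
        omega *= 2
        d = int(omega)   # the next binary digit, 0 or 1
        digits.append(d)
        omega -= d
    return digits

# S_N / N for N Bernoulli variables.  A double carries only ~53 binary
# digits, so we read digits in blocks of 40 from fresh uniform samples.
rng = random.Random(0)
N = 200000
s = sum(sum(binary_digits(rng.random(), 40)) for _ in range(N // 40))
avg = s / N
```

For $N = 200{,}000$ tosses the deviation $|S_N/N - 1/2|$ is typically of order $1/\sqrt{N} \approx 0.002$, consistent with the variance estimate (2.4).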
The central limit theorem allows us to estimate the fluctuation of $S_N/N$ from its average. Recall that if we define the variable $Z_N$ by
$$
(3.1)\qquad S_N = N\langle X_1\rangle + \sqrt{N\,\mathrm{var}[X_1]}\; Z_N,
$$
then $\langle Z_N\rangle = 0$ and $\mathrm{var}[Z_N] = 1$. The CLT tells us that $Z_N$ is for large $N$ approximately a standard normal variable. Recall that a variable $Z$ is standard normal if it is defined by
$$
(3.2)\qquad P(Z \in O) = \int_O \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}\, dz,
$$
for all open sets $O$. Now the characteristic function $\chi_Y : \mathbf{R} \to \mathbf{C}$ of a variable $Y$ is defined by
$$
(3.3)\qquad \chi_Y(\sigma) = \langle e^{i\sigma Y}\rangle, \qquad \sigma \in \mathbf{R}.
$$
Evidently $\chi_Y$ satisfies $\|\chi_Y(\cdot)\|_\infty \le 1$, $\chi_Y(0) = 1$. For the standard normal variable $Z$ we easily see that $\chi_Z(\sigma)$, $\sigma \in \mathbf{R}$, extends to an entire function $\chi_Z : \mathbf{C} \to \mathbf{C}$. We can also evaluate $\chi_Z(\sigma)$ for pure imaginary $\sigma = it$ with $t \in \mathbf{R}$ by change of variable,
$$
(3.4)\qquad \chi_Z(it) = \int_{-\infty}^\infty \frac{1}{\sqrt{2\pi}} \exp[-tz - z^2/2]\, dz = e^{t^2/2} \int_{-\infty}^\infty \frac{1}{\sqrt{2\pi}} \exp[-(z+t)^2/2]\, dz = e^{t^2/2} \int_{-\infty}^\infty \frac{1}{\sqrt{2\pi}} \exp[-z'^2/2]\, dz' = e^{t^2/2}.
$$
Using the fact now that $\chi_Z : \mathbf{C} \to \mathbf{C}$ is entire, we conclude that $\chi_Z(\sigma) = e^{-\sigma^2/2}$, $\sigma \in \mathbf{C}$.

Lemma 3.1. Suppose $Z_N$ is defined by (3.1), where $X_1$ satisfies $\langle |X_1|^3\rangle < \infty$. Then
$$
(3.5)\qquad \lim_{N\to\infty} \chi_{Z_N}(\sigma) = e^{-\sigma^2/2}, \qquad \sigma \in \mathbf{R}.
$$
Proof. By independence of the variables $X_1, \dots, X_N$, we have that
$$
(3.6)\qquad \log \chi_{Z_N}(\sigma) = N \log \chi_X(\sigma/\sqrt{N}),
$$
where $X$ has the distribution of $[X_1 - \langle X_1\rangle]/\sqrt{\mathrm{var}[X_1]}$, whence $\langle X\rangle = 0$, $\langle X^2\rangle = 1$ and $\langle |X|^3\rangle < \infty$. Observe now that $\chi_X(\cdot)$ is a $C^3$ function and
$$
(3.7)\qquad \sup_{\sigma\in\mathbf{R}} \left| \frac{d^3\chi_X(\sigma)}{d\sigma^3} \right| \le \langle |X|^3\rangle.
$$
Hence from Taylor's theorem and the fact that $\langle X\rangle = 0$, $\langle X^2\rangle = 1$, we conclude that
$$
(3.8)\qquad \left| \chi_X\!\left(\frac{\sigma}{\sqrt{N}}\right) - 1 + \frac{\sigma^2}{2N} \right| \le \frac{|\sigma|^3 \langle |X|^3\rangle}{6 N^{3/2}}.
$$
The result follows from (3.8) and the inequality
$$
(3.9)\qquad z \le -\log(1-z) \le z + z^2, \qquad 0 \le z < 1/2.
$$
Lemma 3.1 tells us that
$$
(3.10)\qquad \lim_{N\to\infty} \langle f(Z_N)\rangle = \langle f(Z)\rangle,
$$
provided $f$ is a function of the type $f(z) = \exp[i\sigma z]$, $z \in \mathbf{R}$, for some $\sigma \in \mathbf{R}$. We say the variables $Z_N$, $N = 1, 2, \dots$, converge in distribution to the variable $Z$ if the cdfs of $Z_N$ converge to the cdf of $Z$ as $N \to \infty$, i.e.
$$
(3.11)\qquad \lim_{N\to\infty} P(Z_N \le \xi) = P(Z \le \xi), \qquad \xi \in \mathbf{R}.
$$
Evidently we can rewrite (3.11) as (3.10) with $f(z) = H(\xi - z)$, $z \in \mathbf{R}$, where $H(\cdot)$ is the Heaviside function
$$
(3.12)\qquad H(z) = 0, \ z < 0; \qquad H(z) = 1, \ z \ge 0.
$$
We can obtain a proof of (3.11) from Lemma 3.1 and some Fourier analysis. To do this we introduce the Schwartz space $\mathcal{S}$ of functions $f : \mathbf{R} \to \mathbf{C}$ defined as follows: $f \in \mathcal{S}$ if
$$
(3.13)\qquad \sup_{z\in\mathbf{R}} (1 + |z|)^m \left| \frac{d^n f(z)}{dz^n} \right| < \infty
$$
for all non-negative integers $m, n$. If we define now the Fourier transform of a function $f : \mathbf{R} \to \mathbf{C}$ by
$$
(3.14)\qquad \mathcal{F}f(\sigma) = \hat f(\sigma) = \int_{-\infty}^\infty f(z) e^{i\sigma z}\, dz, \qquad \sigma \in \mathbf{R},
$$
then it is easy to see that $\mathcal{F}$ is an isomorphism of $\mathcal{S}$ with inverse $\mathcal{F}^{-1}$ given by
$$
(3.15)\qquad f(z) = \mathcal{F}^{-1}\hat f(z) = \frac{1}{2\pi} \int_{-\infty}^\infty \hat f(\sigma) e^{-i\sigma z}\, d\sigma, \qquad z \in \mathbf{R}.
$$
Lemma 3.2. Assume the conditions of Lemma 3.1. Then the limit (3.10) holds for all $f \in \mathcal{S}$.

Proof. From (3.15) we have that
$$
(3.16)\qquad \langle f(Z_N)\rangle = \left\langle \frac{1}{2\pi} \int_{-\infty}^\infty \hat f(\sigma) e^{-i\sigma Z_N}\, d\sigma \right\rangle = \frac{1}{2\pi} \int_{-\infty}^\infty \hat f(\sigma) \chi_{Z_N}(-\sigma)\, d\sigma.
$$
Observe that in (3.16) we have used Fubini's theorem. From Lemma 3.1 and the dominated convergence theorem we then see that
$$
(3.17)\qquad \lim_{N\to\infty} \langle f(Z_N)\rangle = \frac{1}{2\pi} \int_{-\infty}^\infty \hat f(\sigma) e^{-\sigma^2/2}\, d\sigma.
$$
Now from (3.14) we have that
$$
(3.18)\qquad \frac{1}{2\pi} \int_{-\infty}^\infty \hat f(\sigma) e^{-\sigma^2/2}\, d\sigma = \frac{1}{2\pi} \int_{-\infty}^\infty \left[ \int_{-\infty}^\infty f(z) e^{i\sigma z}\, dz \right] e^{-\sigma^2/2}\, d\sigma = \int_{-\infty}^\infty \left[ \frac{1}{2\pi} \int_{-\infty}^\infty e^{i\sigma z - \sigma^2/2}\, d\sigma \right] f(z)\, dz = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^\infty e^{-z^2/2} f(z)\, dz = \langle f(Z)\rangle.
$$
Theorem 3.1 (Central Limit Theorem). Suppose $Z_N$ is defined by (3.1), where $X_1$ satisfies $\langle |X_1|^3\rangle < \infty$. Then $Z_N$ converges in distribution as $N \to \infty$ to the standard normal variable $Z$.

Proof. Consider a finite interval $[a, b]$. Then for any $\varepsilon > 0$ there is a $C^\infty$ function $f_{1,\varepsilon} : \mathbf{R} \to \mathbf{R}$ with support in the open interval $(a, b)$ such that $\|f_{1,\varepsilon}\|_\infty \le 1$ and $f_{1,\varepsilon}(z) = 1$ for $a + \varepsilon \le z \le b - \varepsilon$. Now since $f_{1,\varepsilon} \in \mathcal{S}$, Lemma 3.2 implies that
$$
(3.19)\qquad \liminf_{N\to\infty} P(a < Z_N < b) \ge \langle f_{1,\varepsilon}(Z)\rangle.
$$
If we let $\varepsilon \to 0$ in (3.19) we conclude that
$$
(3.20)\qquad \liminf_{N\to\infty} P(a < Z_N < b) \ge P(a \le Z \le b).
$$
Similarly by choosing a $C^\infty$ function $f_{2,\varepsilon} : \mathbf{R} \to \mathbf{R}$ with support in the open interval $(a - \varepsilon, b + \varepsilon)$ such that $\|f_{2,\varepsilon}\|_\infty \le 1$ and $f_{2,\varepsilon}(z) = 1$ for $a \le z \le b$, we conclude that
$$
(3.21)\qquad \limsup_{N\to\infty} P(a \le Z_N \le b) \le P(a < Z < b).
$$
From (3.20), (3.21) it follows that for any $a < b$,
$$
(3.22)\qquad \lim_{N\to\infty} \left[ P(Z_N \le b) - P(Z_N \le a) \right] = P(Z \le b) - P(Z \le a).
$$
The result follows from (3.22) if we can show that
$$
(3.23)\qquad \limsup_{N\to\infty} P(Z_N \le a) \le K_a,
$$
where the constant $K_a$ satisfies $\lim_{a\to-\infty} K_a = 0$. To prove (3.23) we observe that for a $C^\infty$ function $f_a : \mathbf{R} \to \mathbf{R}$ with compact support in $(a, \infty)$ such that $\|f_a\|_\infty \le 1$, one has
$$
(3.24)\qquad \liminf_{N\to\infty} P(Z_N > a) \ge \langle f_a(Z)\rangle.
$$
Now for $a \ll 0$ we choose $f_a$ to have the property that $f_a(z) = 1$ for $a + 1 < z < |a|$. If we set the RHS of (3.24) to be $1 - K_a$, then $\lim_{a\to-\infty} K_a = 0$.

Remark 2. We can compare the SLLN with the CLT. Since the SLLN tells us that
$$
(3.25)\qquad \lim_{N\to\infty} Z_N/\sqrt{N} = 0 \quad \text{with probability } 1,
$$
and the CLT that $Z_N$ converges to the normal variable $Z$ in distribution, we might expect that for any $\alpha > 0$,
$$
(3.26)\qquad \lim_{N\to\infty} Z_N/N^\alpha = 0 \quad \text{with probability } 1.
$$
We might even expect that $Z_N$ converges with probability 1 to $Z$, but this is always false, whereas (3.26) is true for many i.i.d. variables. Thus the function $Z_N(\omega)$, $\omega \in \Omega$, is very oscillatory, but for fixed large $N$ the measure of the set of $\omega$ for which $Z_N(\omega)$ takes values between $a$ and $b$ remains more or less constant. As an example we might think of $Z_N$ being like the function $Z_N(\omega) = \sin 2\pi N\omega$, $\omega \in [0, 1]$. Thus $Z_N(\omega)$ is very oscillatory but
$$
(3.27)\qquad \langle f(Z_N)\rangle = \int_0^1 f(\sin 2\pi N\omega)\, d\omega = \int_0^1 f(\sin 2\pi\omega)\, d\omega.
$$
Thus all the $Z_N$, $N \ge 1$, have the same distribution but $\lim_{N\to\infty} Z_N$ does not exist with probability 1.

4. Recurrence and Tail Events

We shall abstract part of the argument used in the proof of the SLLN, in particular (2.8) to (2.10). For a probability space $(\Omega, \mathcal{F}, P)$, suppose that $A_n \in \mathcal{F}$, $n = 1, 2, \dots$, is an infinite sequence of "events".
We define the lim sup and lim inf of this sequence as follows:
$$
(4.1)\qquad \limsup_{n\to\infty} A_n = \bigcap_{N=1}^\infty \bigcup_{n=N}^\infty A_n, \qquad \liminf_{n\to\infty} A_n = \bigcup_{N=1}^\infty \bigcap_{n=N}^\infty A_n.
$$
Hence $\limsup_{n\to\infty} A_n$ is the event that $A_n$ occurs infinitely often, whereas $\liminf_{n\to\infty} A_n$ is the event that $A_n$ eventually always happens.

Proposition 4.1 (Borel-Cantelli lemma). (a) Suppose that $\sum_{n=1}^\infty P(A_n) < \infty$. Then $P(\limsup_{n\to\infty} A_n) = 0$.
(b) Suppose that $\sum_{n=1}^\infty P(A_n) = \infty$ and the $A_n$, $n = 1, 2, \dots$, are independent events. Then $P(\limsup_{n\to\infty} A_n) = 1$.

Proof. (a) From (4.1) we have that
$$
(4.2)\qquad P(\limsup_{n\to\infty} A_n) \le P\Bigl(\bigcup_{n=N}^\infty A_n\Bigr) \le \sum_{n=N}^\infty P(A_n)
$$
for any $N \ge 1$. Since by our assumption the RHS of (4.2) goes to 0 as $N \to \infty$ we are done.
(b) We observe that the complement of the set $\limsup_{n\to\infty} A_n$ in $\Omega$ is $\liminf_{n\to\infty} \tilde A_n$, where $\tilde A_n = \Omega - A_n$, $n = 1, 2, \dots$, are the complements of the $A_n$ in $\Omega$. Hence we have that
$$
(4.3)\qquad 1 - P(\limsup_{n\to\infty} A_n) = P\Bigl(\bigcup_{N=1}^\infty \bigcap_{n=N}^\infty \tilde A_n\Bigr),
$$
which in turn implies that
$$
(4.4)\qquad 1 - P(\limsup_{n\to\infty} A_n) = \lim_{N\to\infty} P\Bigl(\bigcap_{n=N}^\infty \tilde A_n\Bigr).
$$
Using the independence of the events $A_n$, $n = 1, 2, \dots$, we have that
$$
(4.5)\qquad P\Bigl(\bigcap_{n=N}^\infty \tilde A_n\Bigr) = \prod_{n=N}^\infty [1 - P(A_n)].
$$
Now from (3.9) we have that
$$
(4.6)\qquad -\log\Bigl\{ \prod_{n=N}^\infty [1 - P(A_n)] \Bigr\} \ge \sum_{n=N}^\infty P(A_n) = \infty.
$$
Remark 3. Note that in (2.8) the set $A_n$ is the set $A_n = \{\omega \in \Omega : |S_n(\omega)/n - \langle X_1\rangle| > \varepsilon\}$.

We can use Borel-Cantelli (b) to prove recurrence for the standard random walk on the integers $\mathbf{Z}$. Thus let the $X_j$, $j = 1, 2, \dots$, be Bernoulli variables taking the values $\pm 1$ with equal probability $1/2$. Then $S_N = X_1 + \cdots + X_N$ is the position after $N$ steps of the standard random walk on $\mathbf{Z}$ starting at the origin. We wish to show that
$$
(4.7)\qquad P(S_N = 0 \text{ for infinitely many } N) = 1.
$$
Thus (4.7) says that the walk recurs to its starting point 0 infinitely often with probability 1. Following the argument of Borel-Cantelli (b) we need to first show that
$$
(4.8)\qquad \sum_{n=1}^\infty P(S_n = 0) = \infty.
$$
We can see why this is plausible from the CLT.
Thus using the fact that $S_n$ takes only integer values and the fact that $S_n/\sqrt{n}$ converges in distribution to the standard normal variable $Z$, we expect that
$$
(4.9)\qquad P(S_n = 0) \simeq \frac{1}{\sqrt{2\pi}} \int_{-1/2\sqrt{n}}^{1/2\sqrt{n}} e^{-z^2/2}\, dz \simeq \frac{1}{\sqrt{2\pi n}}
$$
for large $n$. The approximation (4.9) is of course not correct because $P(S_n = 0) = 0$ if $n$ is odd. We might however be tempted then to modify (4.9) to be double the RHS of (4.9) in the case of even integer $n$. In that case we expect
$$
(4.10)\qquad P(S_{2m} = 0) \simeq 2 \cdot \frac{1}{\sqrt{2\pi \cdot 2m}} = \frac{1}{\sqrt{\pi m}}
$$
for large integer $m$. We shall show using Stirling's formula that (4.10) holds. This illustrates that approximation of $S_N/\sqrt{N}$ by a standard normal variable can be much finer than what is given by the CLT.

Lemma 4.1. For $S_N$, $N = 1, 2, \dots$, the standard random walk on $\mathbf{Z}$ starting at the origin, $\lim_{m\to\infty} \sqrt{\pi m}\, P(S_{2m} = 0) = 1$.

Proof. We have that
$$
(4.11)\qquad P(S_{2m} = 0) = \frac{(2m)!}{(m!)^2\, 2^{2m}}.
$$
Recall now Stirling's formula that
$$
(4.12)\qquad \lim_{n\to\infty} \frac{n!}{n^{n+1/2}\, e^{-n}} = \sqrt{2\pi}.
$$
Observe now that
$$
(4.13)\qquad \sqrt{m}\, P(S_{2m} = 0) = \frac{(2m)!}{(2m)^{2m+1/2}\, e^{-2m}} \left[ \frac{m^{m+1/2}\, e^{-m}}{m!} \right]^2 \sqrt{2}.
$$
From (4.12) we see that the RHS of (4.13) converges to $1/\sqrt{\pi}$ as $m \to \infty$.

Evidently Lemma 4.1 implies (4.8), but (4.8) does not imply (4.7) since the events $\{S_N = 0\}$, $N = 1, 2, \dots$, are not independent. These events are however approximately independent, and using that fact we can still prove (4.7). The notion of "approximate independence" is a bit fuzzy, but we can give it some quantitative meaning by considering covariances. Thus if $X, Y$ are two variables, the covariance of $X$ and $Y$ is defined as
$$
(4.14)\qquad \mathrm{cov}[X, Y] = \langle [X - \langle X\rangle][Y - \langle Y\rangle] \rangle = \langle XY\rangle - \langle X\rangle\langle Y\rangle.
$$
If $X, Y$ are independent then $\mathrm{cov}[X, Y] = 0$, but of course one can have $\mathrm{cov}[X, Y] = 0$ and $X, Y$ not be independent. We define the coefficient of correlation $\rho$ for the variables $X, Y$ as
$$
(4.15)\qquad \rho(X, Y) = \frac{\mathrm{cov}[X, Y]}{\sqrt{\mathrm{var}[X]\,\mathrm{var}[Y]}}.
$$
Evidently we have that $-1 \le \rho(X, Y) \le 1$.
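Lemma 4.1 and the approximation (4.10) are easy to confirm numerically from the exact formula (4.11); the function name below is ours, for illustration:

```python
import math

def p_return(m):
    """Exact return probability P(S_{2m} = 0) = (2m)! / ((m!)^2 2^{2m}) of (4.11)."""
    return math.comb(2 * m, m) / 4 ** m

# Lemma 4.1: sqrt(pi m) * P(S_{2m} = 0) -> 1 as m -> infinity.
ratio = math.sqrt(math.pi * 500) * p_return(500)
```

Already at $m = 500$ the product $\sqrt{\pi m}\, P(S_{2m} = 0)$ agrees with 1 to about three decimal places, illustrating how fine the normal approximation to $S_N/\sqrt{N}$ is.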
If $\rho(X, Y) = \pm 1$ then $X$ is a multiple of $Y$, which is the opposite of independence, and $\rho(X, Y) = 0$ if $X, Y$ are independent. Therefore we may use the coefficient of correlation as a measure of independence. In particular if $|\rho(X, Y)| \ll 1$ then we could conclude that $X, Y$ are "approximately independent".

Let us apply the above considerations to the events $\{S_N = 0\}$, $N = 1, 2, \dots$. We define $X_m$ by $X_m(\omega) = 1$ if $S_{2m}(\omega) = 0$, otherwise $X_m(\omega) = 0$. We have now, assuming from Lemma 4.1 that for large $m$ and $k \ge 1$,
$$
(4.16)\qquad \langle X_m\rangle = \frac{1}{\sqrt{\pi m}}, \qquad \langle X_m X_{m+k}\rangle = \frac{1}{\pi\sqrt{mk}},
$$
that $\rho(X_m, X_{m+k})$ is given by the formula
$$
(4.17)\qquad \rho(X_m, X_{m+k}) = \frac{\dfrac{1}{\pi\sqrt{mk}} - \dfrac{1}{\pi\sqrt{m(m+k)}}}{\left[\dfrac{1}{\sqrt{\pi m}} - \dfrac{1}{\pi m}\right]^{1/2} \left[\dfrac{1}{\sqrt{\pi(m+k)}} - \dfrac{1}{\pi(m+k)}\right]^{1/2}}.
$$
Thus we have for fixed $k$,
$$
(4.18)\qquad \lim_{m\to\infty} \rho(X_m, X_{m+k}) = \frac{1}{\sqrt{\pi k}}.
$$
We conclude that the sequence of variables $X_{rk}$, $r = 1, 2, \dots$, is approximately independent for large fixed $k$. Since Lemma 4.1 implies that
$$
(4.19)\qquad \sum_{r=1}^\infty P(X_{rk} = 1) = \infty,
$$
we can more or less conclude the recurrence (4.7) from Borel-Cantelli (b). This is of course not a rigorous argument, but still an important indicator of why the result is correct. Now to the rigorous argument:

Proposition 4.2. For $S_N$, $N = 1, 2, \dots$, the standard random walk on $\mathbf{Z}$, the recurrence (4.7) holds.

Proof. We consider the disjoint events $A_k$, $k = 0, 1, 2, \dots$, defined by
$$
(4.20)\qquad A_0 = \{\omega : S_N(\omega) \ne 0 \text{ for all } N = 1, 2, \dots\}, \qquad A_k = \{\omega : S_k(\omega) = 0,\ S_{N+k}(\omega) \ne 0 \text{ for all } N = 1, 2, \dots\}, \quad k \ge 1.
$$
Thus $A_k$ is the event that $S_N$ returns to 0 for the last time at $N = k$. Since these are disjoint events we have
$$
(4.21)\qquad \sum_{k=0}^\infty P(A_k) \le 1.
$$
Observe now that we can rewrite $A_k$ as
$$
(4.22)\qquad A_k = \{\omega : S_k(\omega) = 0,\ S_{N+k}(\omega) - S_k(\omega) \ne 0 \text{ for all } N = 1, 2, \dots\}.
$$
Hence by the independence of the events $\{S_k = 0\}$ and $\{S_{N+k} - S_k \ne 0 \text{ for all } N = 1, 2, \dots\}$, we conclude from (4.22) that
$$
(4.23)\qquad P(A_k) = P(S_k = 0)\, P(A_0).
$$
Substituting (4.23) into (4.21) and using (4.8), we conclude that $P(A_0) = 0$.
Now (4.23) implies that $P(A_k) = 0$, $k = 0, 1, 2, \dots$. To finish the proof we observe that
$$
(4.24)\qquad \bigcup_{k=0}^\infty A_k = \text{the event that } S_N = 0 \text{ only finitely often},
$$
and we have shown the probability of the LHS of (4.24) is 0.

We have observed in the Borel-Cantelli Lemma and in the previous proposition that the probability of an event is either 0 or 1. These events are so called "tail events" since they are independent of any finite number of the variables involved. The following systematizes this observation:

Proposition 4.3 (Kolmogorov zero-one law). Let $X_1, X_2, \dots$ be independent variables on a probability space $(\Omega, \mathcal{F}, P)$. Suppose $E \in \mathcal{F}$ is in the $\sigma$ field generated by the $X_1, X_2, \dots$, but is independent of any finite number of the variables $X_1, X_2, \dots$. Then $P(E) = 0$ or $P(E) = 1$.

Proof. $E$ can be approximated arbitrarily closely by sets $E_N$ in the $\sigma$ field $\mathcal{F}(X_1, X_2, \dots, X_N)$ for large enough $N$. Hence we have that
$$
(4.25)\qquad \lim_{N\to\infty} P(E_N) = P(E), \qquad \lim_{N\to\infty} P(E_N \cap E) = P(E).
$$
By assumption we have $P(E_N \cap E) = P(E_N)P(E)$, whence (4.25) implies that $P(E) = P(E)^2$.

We can apply Proposition 4.3 to the issue raised at the end of §3, in particular (3.26). Observe that for $\alpha > 0$ the set $\{\omega : \lim_{N\to\infty} Z_N(\omega)/N^\alpha \text{ exists}\}$ is a tail event, whence it follows that it occurs with probability 1 or 0. We have seen from the SLLN that it occurs with probability 1 if $\alpha = 1/2$ and the limit is 0. Next we show that if $\alpha = 0$ it occurs with probability 0.

Proposition 4.4. Let $X_1, X_2, \dots$ be i.i.d. variables on $(\Omega, \mathcal{F}, P)$ and assume that $\langle |X_1|^3\rangle < \infty$. Then $P(\{\omega : \lim_{N\to\infty} Z_N(\omega) \text{ exists}\}) = 0$.

Proof. By the Kolmogorov law we can assume for contradiction that $\{\omega : \lim_{N\to\infty} Z_N(\omega) \text{ exists}\}$ occurs with probability 1. Thus there exists a variable $Z$ and $\lim_{N\to\infty} Z_N = Z$ with probability 1. However by the CLT this limiting variable $Z$ must be the standard normal variable, whence
$$
(4.26)\qquad P(\{\omega : \lim_{N\to\infty} Z_N(\omega) > 0\}) = 1/2.
$$
Since $\{\omega : \lim_{N\to\infty} Z_N(\omega) > 0\}$ is a tail event we have obtained a contradiction to the Kolmogorov law. Thus $\{\omega : \lim_{N\to\infty} Z_N(\omega) \text{ exists}\}$ occurs with probability 0.

5. Law of the Iterated Logarithm

In this section we shall be considering the standard random walk $S_N = X_1 + \cdots + X_N$ on $\mathbf{Z}$, so the $X_j$, $j = 1, 2, \dots$, are i.i.d. Bernoulli taking values $\pm 1$ with probability $1/2$. Suppose now that $b_n$, $n = 1, 2, \dots$, is a positive sequence satisfying
$$
(5.1)\qquad \lim_{n\to\infty} b_n = \infty.
$$
By Kolmogorov the set $A$ given by
$$
(5.2)\qquad A = \{\omega : \limsup_{n\to\infty} |S_n(\omega)|/b_n < \infty\}
$$
is a tail event and hence $P(A) = 0$ or $P(A) = 1$. Assume now that $P(A) = 1$. Invoking Kolmogorov again, we see that there exists $\gamma \ge 0$ such that
$$
(5.3)\qquad \limsup_{n\to\infty} |S_n(\omega)|/b_n = \gamma \quad \text{with probability } 1.
$$
We have already seen that if $b_n = n$ then $\gamma = 0$. We also know that if $b_n = \sqrt{n}$ then $P(A) = 0$. A natural question to ask therefore is to find a sequence $b_n$ such that (5.3) holds for some $\gamma$ with $0 < \gamma < \infty$. The answer is given by the following:

Theorem 5.1 (Law of the Iterated Logarithm). If $b_n = [2n \log\log n]^{1/2}$, then (5.3) holds with $\gamma = 1$.

To prove Theorem 5.1 we use estimates on the fluctuations of the random walk which come from the CLT, but we also need an important new idea, the idea of a maximal function. We define the maximal function $M_N$ associated with the random walk $S_N$ to be
$$
(5.4)\qquad M_N(\omega) = \sup_{0 \le n \le N} S_n(\omega), \qquad \omega \in \Omega.
$$
Thus $M_N$ measures the farthest to the right the walk has wandered in $N$ steps. Evidently $M_N$ is a non-negative variable, but there is no obvious relationship between the distribution of $M_N$ and $S_N$. The reflection principle tells us that the distributions of $M_N$ and $|S_N|$ are almost identical.

Lemma 5.1. Let $r \ge 1$ be an integer. Then there is the inequality
$$
(5.5)\qquad 2P(S_N \ge r+1) \le P(M_N \ge r) \le 2P(S_N \ge r).
$$
Proof. Observe that
$$
(5.6)\qquad P(M_N \ge r) - P(S_N \ge r) = P(M_{N-1} \ge r,\ S_N < r).
$$
Reflection symmetry comes in by noting that
$$
(5.7)\qquad P(M_{N-1} \ge r,\ S_N < r) = P(M_{N-1} \ge r,\ S_N > r).
$$
To see (5.7) we use a random stopping time $n^*$ defined by
$$
(5.8)\qquad n^*(\omega) = \inf\{n \ge 1 : S_n(\omega) = r\}.
$$
Thus we have that
$$
(5.9)\qquad P(M_{N-1} \ge r,\ S_N < r) = P(\{\omega : n^*(\omega) < N,\ S_N(\omega) - S_{n^*}(\omega) < 0\}) = P(\{\omega : n^*(\omega) < N,\ S_N(\omega) - S_{n^*}(\omega) > 0\}) = P(M_{N-1} \ge r,\ S_N > r).
$$
Now (5.6), (5.7) imply that
$$
(5.10)\qquad P(M_N \ge r) = \tfrac12 P(M_{N-1} \ge r,\ S_N \ne r) + P(S_N \ge r) \le \tfrac12 P(M_N \ge r) + P(S_N \ge r),
$$
which implies the upper bound in (5.5). We also get the lower bound by using the identity in (5.10). Thus we have from (5.10) that
$$
(5.11)\qquad P(M_N \ge r) \ge \tfrac12 P(M_{N-1} \ge r) + \tfrac12 P(S_N = r) + P(S_N \ge r+1) \ge \tfrac12 P(M_N \ge r) + P(S_N \ge r+1).
$$
Next we obtain some estimates on the asymptotics for the cdf of the normal variable.

Lemma 5.2. For $Z$ the standard normal variable and $a > 0$, there are the inequalities
$$
(5.12)\qquad \frac{1}{\sqrt{2\pi}} \left( \frac{1}{a} - \frac{1}{a^3} \right) e^{-a^2/2} \le P(Z > a) \le \frac{1}{a\sqrt{2\pi}}\, e^{-a^2/2}.
$$
Proof. We have that
$$
(5.13)\qquad P(Z > a) = \frac{1}{\sqrt{2\pi}} \int_a^\infty e^{-z^2/2}\, dz = \frac{e^{-a^2/2}}{\sqrt{2\pi}} \int_0^\infty e^{-ax - x^2/2}\, dx,
$$
upon making the change of variable $z = x + a$ in the integration. The upper bound in (5.12) follows from
$$
(5.14)\qquad \int_0^\infty e^{-ax - x^2/2}\, dx \le \int_0^\infty e^{-ax}\, dx = 1/a.
$$
The estimate $1/a$ in (5.14) is just the first term in an asymptotic expansion for the integral in powers of $1/a$ which should be valid for $a \gg 1$. The second term is obtained by making the approximation
$$
(5.15)\qquad e^{-x^2/2} \simeq 1 - x^2/2,
$$
in which case we get
$$
(5.16)\qquad \int_0^\infty e^{-ax - x^2/2}\, dx \simeq \frac{1}{a} - \frac{1}{a^3}.
$$
We can turn the calculation (5.16) into a rigorous lower bound by noting that
$$
(5.17)\qquad -\frac{d}{dz} \left[ \left( \frac{1}{z} - \frac{1}{z^3} \right) e^{-z^2/2} \right] = \left( 1 - \frac{3}{z^4} \right) e^{-z^2/2}.
$$
Hence
$$
(5.18)\qquad \int_0^\infty e^{-ax - x^2/2}\, dx = e^{a^2/2} \int_a^\infty e^{-z^2/2}\, dz \ge e^{a^2/2} \int_a^\infty \left( 1 - \frac{3}{z^4} \right) e^{-z^2/2}\, dz = e^{a^2/2} \left( \frac{1}{a} - \frac{1}{a^3} \right) e^{-a^2/2} = \frac{1}{a} - \frac{1}{a^3},
$$
whence the lower bound in (5.12) follows.

The fine asymptotics in Lemma 5.2 shows us how the logarithm gets involved in the LIL.

Lemma 5.3. There is the inequality
$$
(5.19)\qquad \sum_{n=2}^\infty P\left( \frac{S_n}{[2n(1+\varepsilon)\log n]^{1/2}} > 1 \right) < \infty \ \text{ or } \ = \infty
$$
according as $\varepsilon > 0$ or $\varepsilon < 0$.

Proof.
From Lemma 5.2 and the CLT we have that
$$
(5.20)\qquad P\left( \frac{S_n}{[2n(1+\varepsilon)\log n]^{1/2}} > 1 \right) = P\left( \frac{S_n}{\sqrt{n}} > [2(1+\varepsilon)\log n]^{1/2} \right) \simeq \frac{e^{-(1+\varepsilon)\log n}}{[4\pi(1+\varepsilon)\log n]^{1/2}} = \frac{1}{[4\pi(1+\varepsilon)\log n]^{1/2}} \cdot \frac{1}{n^{1+\varepsilon}}.
$$
Since
$$
(5.21)\qquad \sum_{n=1}^\infty \frac{1}{n^{1+\varepsilon}} < \infty \ \text{ or } \ = \infty
$$
according as $\varepsilon > 0$ or $\varepsilon < 0$, the result follows.

Remark 4. From Lemma 5.3 and Borel-Cantelli (a) we conclude that
$$
(5.22)\qquad \limsup_{n\to\infty} \frac{S_n}{[2n(1+\varepsilon)\log n]^{1/2}} \le 1 \quad \text{with probability } 1
$$
if $\varepsilon > 0$, whence it follows for $\varepsilon = 0$. In order to go further than (5.22) we need to use Lemma 5.1.

Lemma 5.4. With $b_n$ the sequence in the statement of Theorem 5.1, then
$$
(5.23)\qquad \limsup_{n\to\infty} S_n/b_n \le 1 \quad \text{with probability } 1.
$$
Proof. We shall show using Lemma 5.1 that for any $\alpha > 1$
$$
(5.24)\qquad \sum_{m=1}^\infty P\left( \sup_{\alpha^m \le n < \alpha^{m+1}} \frac{S_n}{b_n} > \alpha \right) < \infty.
$$
The result follows from (5.24) and Borel-Cantelli (a) on letting $\alpha \to 1$. To prove (5.24) consider integers $r, k$ which satisfy for some integer $m \ge 1$ the inequality $\alpha^m \le r, k \le \alpha^{m+1}$. Then we have that
$$
(5.25)\qquad \frac{\gamma(m)}{\sqrt{\alpha}} \le \frac{b_r}{b_k} \le \frac{\sqrt{\alpha}}{\gamma(m)},
$$
where $\lim_{m\to\infty} \gamma(m) = 1$. Thus in intervals $\alpha^m \le n \le \alpha^{m+1}$ the sequence $b_n$ is essentially constant for $\alpha$ close to 1. Using this observation we note then that we may estimate
$$
(5.26)\qquad P\left( \sup_{\alpha^m \le n < \alpha^{m+1}} \frac{S_n}{b_n} > \alpha \right) \le P\left( \sup_{n \le \alpha^{m+1}} S_n > \alpha\, b_{\alpha^m} \right) \le 2P(S_{\alpha^{m+1}} > \alpha\, b_{\alpha^m}) = 2P\left( \frac{S_{\alpha^{m+1}}}{\sqrt{\alpha^{m+1}}} > \alpha \left[ \frac{2\xi(m)\log m}{\alpha} \right]^{1/2} \right),
$$
where $\lim_{m\to\infty} \xi(m) = 1$. Using the CLT and Lemma 5.2 we have that
$$
(5.27)\qquad P\left( \frac{S_{\alpha^{m+1}}}{\sqrt{\alpha^{m+1}}} > \alpha \left[ \frac{2\xi(m)\log m}{\alpha} \right]^{1/2} \right) \simeq \frac{\exp[-\alpha\xi(m)\log m]}{[4\pi\alpha\xi(m)\log m]^{1/2}} = \frac{1}{[4\pi\alpha\xi(m)\log m]^{1/2}} \cdot \frac{1}{m^{\alpha\xi(m)}}.
$$
The inequality (5.24) follows from (5.27) since
$$
(5.28)\qquad \sum_{m=1}^\infty \frac{1}{m^{\alpha\xi(m)}} < \infty.
$$
Lemma 5.5. With $b_n$ the sequence in the statement of Theorem 5.1, then
$$
(5.29)\qquad \limsup_{n\to\infty} S_n/b_n \ge 1 \quad \text{with probability } 1.
$$
Proof. Taking $\alpha > 1$ again and setting $\beta = \alpha/(\alpha - 1) > 1$, we consider
$$
(5.30)\qquad P\left( S_{\alpha^{m+1}} - S_{\alpha^m} > \frac{1}{\beta}\, b_{\alpha^{m+1}} \right) = P\left( \frac{S_{\alpha^m(\alpha-1)}}{\sqrt{\alpha^m(\alpha-1)}} > \left[ \frac{2\xi(m)\log m}{\beta} \right]^{1/2} \right) \simeq \left[ \frac{\beta}{4\pi\xi(m)\log m} \right]^{1/2} \exp\left[ -\frac{\xi(m)\log m}{\beta} \right] = \left[ \frac{\beta}{4\pi\xi(m)\log m} \right]^{1/2} \frac{1}{m^{\xi(m)/\beta}}.
$$
Since $\beta > 1$ we conclude that
$$
(5.31)\qquad \sum_{m=1}^\infty P\left( S_{\alpha^{m+1}} - S_{\alpha^m} > \frac{1}{\beta}\, b_{\alpha^{m+1}} \right) = \infty.
$$
Since the events in (5.31) are independent we conclude from Borel-Cantelli (b) that
$$
(5.32)\qquad S_{\alpha^{m+1}} - S_{\alpha^m} > \frac{1}{\beta}\, b_{\alpha^{m+1}} \quad \text{infinitely often with probability } 1.
$$
Combining Lemma 5.4 with (5.32) we see that
$$
(5.33)\qquad S_{\alpha^{m+1}} > \frac{1}{\beta}\, b_{\alpha^{m+1}} - \alpha^{1/4}\, b_{\alpha^m} \quad \text{infinitely often with probability } 1,
$$
whence we have that
$$
(5.34)\qquad \frac{S_{\alpha^{m+1}}}{b_{\alpha^{m+1}}} > \frac{1}{\beta} - \frac{\xi(m)}{\alpha^{1/4}} \quad \text{infinitely often with probability } 1.
$$
Now let $\alpha \to \infty$ in (5.34) to get the result, noting that then $\beta \to 1$.

Proof of Theorem 5.1. Simply observe that
$$
(5.35)\qquad \limsup_{n\to\infty} \frac{|S_n|}{b_n} = \max\left\{ \limsup_{n\to\infty} \frac{S_n}{b_n},\ -\liminf_{n\to\infty} \frac{S_n}{b_n} \right\}.
$$
From the previous lemmas we have by symmetry that
$$
(5.36)\qquad \liminf_{n\to\infty} \frac{S_n}{b_n} = -\limsup_{n\to\infty} \frac{S_n}{b_n} = -1,
$$
whence the RHS of (5.35) is 1.

University of Michigan, Department of Mathematics, Ann Arbor, MI 48109-1109
E-mail address: [email protected]