Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Lecture 6: Borel-Cantelli Lemmas 1.) First Borel-Cantelli Lemma We begin with some notation. If An , n ≥ 1 is a sequence of subsets of Ω, then we define lim sup An ≡ n→∞ lim inf An ≡ n→∞ ∞ [ ∞ \ n=1 m=n ∞ \ ∞ [ Am = {ω that are in infinitely many An } Am = {ω that are in all but finitely many An }. n=1 m=n It is common to write lim sup An = {ω : ω ∈ An i.o.}. For example, Xn → X a.s. if and only if for all > 0, P(|Xn − X| > i.o.) = 0. Our first key result is: Lemma (1.6.1): First Borel-Cantelli Lemma. If P∞ n=1 P(An ) < ∞, then P(An i.o.) = 0. P Proof: Let N = k 1Ak be the number of events that occur. By Fubini’s theorem, EN = P k P(Ak ) < ∞, so we must have N < ∞ a.s. Before we apply this, we first recall the following result from topology. Lemma (1.6.3): yn → y if and only if every subsequence (ynm ; m ≥ 1) contains a further subsequence (ynm(k) ; k ≥ 1) that converges to y. Theorem (1.6.2): Xn → X in probability if and only if every subsequence (Xnm ; m ≥ 1) contains a further subsequence (Xnm(k) ; k ≥ 1) that converges almost surely to X. Proof: Suppose that Xn → X in probability and let k ↓ 0. Given any sequence (nm : m ≥ 1), for each k there is an nm(k) > nm(k−1) such that P(|Xnm(k) − X| > k ) ≤ 2−k . Since ∞ X P(|Xnm(k) − X| > k ) < ∞, k=1 the first Borel-Cantelli lemma implies that P(|Xnm(k) − X| > k i.o.) = 0. Furthermore, since k ↓ 0, this implies that P(|Xnm(k) − X| > i.o.) = 0 for every > 0 and so Xnm(k) → X almost surely. To prove the converse, fix δ > 0 and let yn = P(|Xn − X| > δ). We need to show that yn → 0. Given any subsequence (ynm ; m ≥ 1), by assumption we know that the subsequence (Xnm ; m ≥ 1) contains a subsequence (Xnm(k) ; k ≥ 1) that converges to X almost surely. This implies that (ynm(k) ; k ≥ 0) converges to 0 and so Lemma (1.6.3) implies that yn → 0. Since this holds for every δ > 0, it follows that Xn → X in probability. 1 Corollary (1.6.4): If f is continuous and Xn → X in probability, then f (Xn ) → f (X) in probability. If f is also bounded, then Ef (Xn ) → Ef (X). Proof: If Xn(m) is a subsequence, then Theorem (1.6.2) implies that there is a further subsequence Xn(mk ) → X almost surely. Since f is continuous, f (Xn(mk ) ) → f (X) almost surely and so (1.6.2) now implies that f (Xn ) → f (X) in probability. If f is bounded, then the bounded convergence theorem implies that Ef (Xn(mk ) ) → Ef (X) and the desired result follows from Lemma (1.6.3) with yk = Ef (Xk ). 2.) The Second Borel-Cantelli Lemma The following example shows that the converse of the first Borel-Cantelli Lemma is false. Let Ω = (0, P 1), F = BorelPsets, P = Lebesgue measure. If An = (0, 1/n), then lim sup An = ∅ even though n P(An ) = n n−1 = ∞. Theorem (1.6.6): The Second Borel-Cantelli Lemma: If the events An are independent P and P(An ) = ∞, then P(An i.o.) = 1. Proof: Let M < N < ∞. Using the inequality 1 − x ≤ e−x whenever x ≥ 0, we have P c ∩N n=M An N Y = 1 − P(An ) ≤ n=M = exp − N Y exp (−P(An )) n=M N X ! P(An ) → 0. n=M ∞ So P(∪∞ n=M An ) = 1 for every M . Since ∪n=M An ↓ lim sup An , it follows that P(lim sup An ) = 1. A typical application for the second B-C lemma is: Theorem (1.6.7): If X1 , X2 , · · · are i.i.d. with E|X1 | = ∞, then P(|Xn | ≥ n i.o.) = 1. Consequently, if Sn = X1 + · · · + Xn , then P(lim Sn /n ∈ (−∞, ∞)) = 0. Proof: Since Z E|X1 | = ∞ P(|X1 | > x)dx ≤ 0 ∞ X P(|X1 | > n), n=0 the fact that E|X1 | = ∞ along with the Second Borel-Cantelli lemma imply that P(|Xn ≥ n i.o.) = 1. To prove the second claim, let C = {ω : limn→∞ Sn (ω)/n ∈ (−∞, ∞)} and notice that Sn+1 Sn Xn+1 Sn − = − . n n+1 n(n + 1) n + 1 2 Since Sn /(n(n + 1)) → 0 on C, it follows that on the set C ∩ {ω : |Xn | ≥ n, i.o.} we have Sn Sn+1 2 n − n + 1 > 3 i.o. contradicting the fact that ω ∈ C. This shows that C ∩ {ω : |Xn | ≥ n, i.o.} = ∅ and so P(C) = 0. With additional work, convergence rates can be estimated for the second Borel-Cantelli lemma. Theorem (1.6.8): If A1 , A2 , · · · are pairwise independent and n X 1Am m=1 n .X P∞ n=1 P(An ) = ∞, then P(Am ) → 1 a.s. m=1 2 = EX . Also, Proof: Let Xm = 1Am and Sn = X1 + · · · + Xn and notice that Var(Xm ) ≤ EXm m since the Am are pairwise independent, the Xm are uncorrelated and so Var(Sn ) = n X Var(Xm ) ≤ m=1 n X EXm = ESn . m=1 It then follows from Chebyshev’s inequality and our assumption that ESn → ∞ that (?) P |Sn − ESn | > δ · ESn ≤ Var(Sn )/(δ · ESn )2 → 0 as n → ∞. This shows that Sn /ESn → 1 in probability. To get almost sure convergence, we need to consider subsequences. Let nk = inf{n : ESn ≥ k 2 }. Let Tk = Snk and observe that k 2 ≤ ETk ≤ k 2 + 1. Replacing n by nk in (?) and using ETk ≥ k 2 shows that P |Tk − ETk | > δETk ≤ 1/(δ 2 k 2 ). P So k P(|Tk − ETk | > δ · ETk ) < ∞ and thus the first Borel-Cantelli lemma implies that P(|Tk − ETk | > δ · ETk i.o.) = 0. Since δ > 0 is arbitrary, it follows that Tk /ETk → 1 almost surely. To show that Sn /ESn → 1 almost surely, choose ω such that Tk (ω)/ETk → 1 and observe that if nk ≤ n ≤ nk+1 then Tk (ω) Sn (ω) Tk+1 (ω) ≤ ≤ . ETk+1 ESn ETk However, these inequalities can rewritten as ETk Tk (ω) Sn (ω) ETk+1 Tk+1 (ω) · ≤ ≤ · ETk+1 ETk ESn ETk ETk+1 and the desired result follows upon recognizing that ETk+1 /ETk → 1. 3 Example: Record Values. Let X1 , X2 , · · · be i.i.d. with continuous distribution function F (x) and let Ak = {Xk > supj<k Xj } be the event that a record value occurs on the k’th trial. We will show that the Ak are pairwise independent with P(Ak ) = 1/k. We first observe that because F is continuous, P(Xj = Xk ) = 0 for any j 6= k, so we can let Y1n > Y2n > · · · > Ynn be the (decreasing) order statistics of X1 , · · · , Xn . This induces a random permutation of {1, · · · , n} defined by πn (i) = j if Xi = Yjn . Furthermore, because the joint distribution of the random variables (X1 , · · · , Xn ) is not affected by the order of occurrence, it follows that πn is uniformly distributed over the set of n! possible permutations. To show pairwise independence, let Cn,i be the set of permutations of {1, · · · , n} with the property that π(i) < min{π(1), · · · , π(i − 1)} and notice that Ai = {πn ∈ Cn,i }. Since πn is uniformly distributed, it follows that P(Ai ) = |Cn,i | n(n − 1) · · · (i + 1)(i − 1) · · · 2 · 1 1 = = . n! n! i Similarly, if i < j, then P(Ai ∩ Aj ) = = |Cn,i ∩ Cn,j | n(n − 1) · · · (j + 1)(j − 1) · · · (i + 1)(i − 1) · · · 2 · 1 = n! n! 1 = P(Ai )P(Aj ) ij and so Ai and Aj are independent. P P Let Rn = nm=1 1Am be the number of record values at time n. Since nm=1 P(Am ) ∼ log(n) → ∞, Theorem (1.6.8) implies that as n → ∞ Rn / log(n) → 1 a.s. 4