Lecture 6: Borel-Cantelli Lemmas
1.) First Borel-Cantelli Lemma
We begin with some notation. If An, n ≥ 1, is a sequence of subsets of Ω, then we define

lim sup_{n→∞} An ≡ ⋂_{n=1}^∞ ⋃_{m=n}^∞ Am = {ω that are in infinitely many An},

lim inf_{n→∞} An ≡ ⋃_{n=1}^∞ ⋂_{m=n}^∞ Am = {ω that are in all but finitely many An}.
It is common to write lim sup An = {ω : ω ∈ An i.o.}. For example, Xn → X a.s. if and only if
for all ε > 0, P(|Xn − X| > ε i.o.) = 0. Our first key result is:
Lemma (1.6.1): First Borel-Cantelli Lemma. If ∑_{n=1}^∞ P(An) < ∞, then P(An i.o.) = 0.

Proof: Let N = ∑_k 1_{Ak} be the number of events that occur. By Fubini's theorem, EN = ∑_k P(Ak) < ∞, so we must have N < ∞ a.s.
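The lemma can be illustrated numerically. The sketch below (not part of the lecture; the helper name is ours) takes independent events An with P(An) = 1/n², so that ∑_n P(An) = π²/6 < ∞. By the lemma only finitely many An occur a.s., and the count N = ∑_n 1_{An} is an a.s. finite random variable with EN ≈ 1.64:

```python
import random

def count_occurrences(n_max, seed):
    """Count how many of A_1, ..., A_{n_max} occur in one realization,
    where the A_n are independent with P(A_n) = 1/n^2."""
    rng = random.Random(seed)
    return sum(1 for n in range(1, n_max + 1) if rng.random() < 1.0 / n**2)

# Averaging the count over many realizations approximates EN = sum 1/n^2 ~ 1.64.
runs = 1000
avg = sum(count_occurrences(2000, seed) for seed in range(runs)) / runs
print(avg)
```

Each realization produces only a handful of occurrences, as the lemma predicts, even though infinitely many events are "available".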
Before we apply this, we first recall the following result from topology.
Lemma (1.6.3): yn → y if and only if every subsequence (ynm ; m ≥ 1) contains a further
subsequence (ynm(k) ; k ≥ 1) that converges to y.
Theorem (1.6.2): Xn → X in probability if and only if every subsequence (Xnm ; m ≥ 1)
contains a further subsequence (Xnm(k) ; k ≥ 1) that converges almost surely to X.
Proof: Suppose that Xn → X in probability and let εk ↓ 0. Given any sequence (nm : m ≥ 1),
for each k there is an nm(k) > nm(k−1) such that P(|Xnm(k) − X| > εk) ≤ 2^{−k}. Since

∑_{k=1}^∞ P(|Xnm(k) − X| > εk) < ∞,

the first Borel-Cantelli lemma implies that P(|Xnm(k) − X| > εk i.o.) = 0. Furthermore, since
εk ↓ 0, this implies that P(|Xnm(k) − X| > ε i.o.) = 0 for every ε > 0, and so Xnm(k) → X almost
surely.
To prove the converse, fix δ > 0 and let yn = P(|Xn − X| > δ). We need to show that
yn → 0. Given any subsequence (ynm ; m ≥ 1), by assumption we know that the subsequence
(Xnm ; m ≥ 1) contains a subsequence (Xnm(k) ; k ≥ 1) that converges to X almost surely. This
implies that (ynm(k) ; k ≥ 1) converges to 0 and so Lemma (1.6.3) implies that yn → 0. Since this
holds for every δ > 0, it follows that Xn → X in probability.
Corollary (1.6.4): If f is continuous and Xn → X in probability, then f (Xn ) → f (X) in
probability. If f is also bounded, then Ef (Xn ) → Ef (X).
Proof: If Xn(m) is a subsequence, then Theorem (1.6.2) implies that there is a further subsequence Xn(m_k) → X almost surely. Since f is continuous, f(Xn(m_k)) → f(X) almost surely, and so (1.6.2) now implies that f(Xn) → f(X) in probability.
If f is bounded, then the bounded convergence theorem implies that Ef(Xn(m_k)) → Ef(X), and the desired result follows from Lemma (1.6.3) with yk = Ef(Xk).
2.) The Second Borel-Cantelli Lemma
The following example shows that the converse of the first Borel-Cantelli Lemma is false. Let
Ω = (0, 1), F = Borel sets, P = Lebesgue measure. If An = (0, 1/n), then lim sup An = ∅ even
though ∑_n P(An) = ∑_n n^{−1} = ∞.
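This counterexample can be checked directly, with no randomness involved. In the quick sketch below (not part of the lecture; the helper name is ours), a point ω ∈ (0, 1) lies in An = (0, 1/n) exactly when n < 1/ω, so every ω > 0 belongs to only finitely many An:

```python
import math

def membership_count(omega):
    """Number of n >= 1 with omega in An = (0, 1/n), i.e. omega < 1/n."""
    # omega < 1/n  <=>  n < 1/omega, so the count is finite for every omega > 0.
    return math.ceil(1.0 / omega) - 1

for omega in (0.5, 0.1, 0.003):
    print(omega, membership_count(omega))
```

The count blows up as ω ↓ 0 but is finite for each fixed ω, which is exactly why lim sup An = ∅ while ∑ P(An) = ∞.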
Theorem (1.6.6): The Second Borel-Cantelli Lemma: If the events An are independent
and ∑ P(An) = ∞, then P(An i.o.) = 1.
Proof: Let M < N < ∞. Using the inequality 1 − x ≤ e^{−x} for x ≥ 0, we have

P(⋂_{n=M}^N An^c) = ∏_{n=M}^N (1 − P(An)) ≤ ∏_{n=M}^N exp(−P(An)) = exp(−∑_{n=M}^N P(An)) → 0

as N → ∞. So P(⋃_{n=M}^∞ An) = 1 for every M. Since ⋃_{n=M}^∞ An ↓ lim sup An as M → ∞, it follows that P(lim sup An) = 1.
A typical application of the second Borel-Cantelli lemma is:
Theorem (1.6.7): If X1 , X2 , · · · are i.i.d. with E|X1 | = ∞, then P(|Xn | ≥ n i.o.) = 1. Consequently, if Sn = X1 + · · · + Xn , then P(lim Sn /n ∈ (−∞, ∞)) = 0.
Proof: Since

E|X1| = ∫_0^∞ P(|X1| > x) dx ≤ ∑_{n=0}^∞ P(|X1| > n),

the fact that E|X1| = ∞ along with the Second Borel-Cantelli lemma implies that P(|Xn| ≥ n i.o.) = 1. To prove the second claim, let C = {ω : lim_{n→∞} Sn(ω)/n ∈ (−∞, ∞)} and notice that

Sn+1/(n+1) − Sn/n = Xn+1/(n+1) − Sn/(n(n+1)).

Since Sn/(n(n+1)) → 0 on C, it follows that on the set C ∩ {ω : |Xn| ≥ n i.o.} we have

|Sn/n − Sn+1/(n+1)| > 2/3 i.o.,

contradicting the fact that Sn(ω)/n converges for ω ∈ C. This shows that

C ∩ {ω : |Xn| ≥ n i.o.} = ∅

and so P(C) = 0.
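Theorem (1.6.7) can be seen in simulation. The sketch below (not part of the lecture; the helper name is ours) compares running means Sn/n for i.i.d. standard Cauchy draws, which have E|X1| = ∞ and so cannot have convergent means, against i.i.d. Uniform(0, 1) draws, whose means converge to 1/2 by the strong law. Cauchy draws are generated via the inverse CDF, tan(π(U − 1/2)):

```python
import math
import random

def running_means(draw, n_total, checkpoints, seed):
    """Record S_n/n at the given checkpoint values of n."""
    rng = random.Random(seed)
    s, out = 0.0, []
    for n in range(1, n_total + 1):
        s += draw(rng)
        if n in checkpoints:
            out.append(s / n)
    return out

checkpoints = {1_000, 10_000, 100_000}
# Standard Cauchy: E|X1| = infinity, so S_n/n does not settle down.
cauchy = running_means(lambda r: math.tan(math.pi * (r.random() - 0.5)),
                       100_000, checkpoints, seed=1)
# Uniform(0, 1): finite mean, so S_n/n -> 1/2.
unif = running_means(lambda r: r.random(), 100_000, checkpoints, seed=1)
print("Cauchy running means:", cauchy)
print("Uniform running means:", unif)
```

The uniform means lock onto 1/2, while the Cauchy means keep fluctuating; in fact Sn/n for standard Cauchy draws is itself standard Cauchy at every n.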
With additional work, convergence rates can be estimated for the second Borel-Cantelli lemma.
Theorem (1.6.8): If A1, A2, · · · are pairwise independent and ∑_{n=1}^∞ P(An) = ∞, then

(∑_{m=1}^n 1_{Am}) / (∑_{m=1}^n P(Am)) → 1 a.s.
Proof: Let Xm = 1_{Am} and Sn = X1 + · · · + Xn, and notice that Var(Xm) ≤ EXm² = EXm. Also, since the Am are pairwise independent, the Xm are uncorrelated and so

Var(Sn) = ∑_{m=1}^n Var(Xm) ≤ ∑_{m=1}^n EXm = ESn.
It then follows from Chebyshev's inequality and our assumption that ESn → ∞ that

(⋆)  P(|Sn − ESn| > δ·ESn) ≤ Var(Sn)/(δ·ESn)² ≤ 1/(δ²·ESn) → 0

as n → ∞. This shows that Sn/ESn → 1 in probability.
To get almost sure convergence, we need to consider subsequences. Let nk = inf{n : ESn ≥ k²}.
Let Tk = Snk and observe that k² ≤ ETk ≤ k² + 1. Replacing n by nk in (⋆) and using ETk ≥ k²
shows that

P(|Tk − ETk| > δ·ETk) ≤ 1/(δ²k²).

So ∑_k P(|Tk − ETk| > δ·ETk) < ∞, and thus the first Borel-Cantelli lemma implies that
P(|Tk − ETk| > δ·ETk i.o.) = 0. Since δ > 0 is arbitrary, it follows that Tk/ETk → 1 almost
surely.
To show that Sn/ESn → 1 almost surely, choose ω such that Tk(ω)/ETk → 1 and observe that
if nk ≤ n ≤ nk+1, then since Sn and ESn are nondecreasing in n,

Tk(ω)/ETk+1 ≤ Sn(ω)/ESn ≤ Tk+1(ω)/ETk.

However, these inequalities can be rewritten as

(ETk/ETk+1)·(Tk(ω)/ETk) ≤ Sn(ω)/ESn ≤ (ETk+1/ETk)·(Tk+1(ω)/ETk+1),

and the desired result follows upon recognizing that ETk+1/ETk → 1.
Example: Record Values. Let X1 , X2 , · · · be i.i.d. with continuous distribution function
F (x) and let Ak = {Xk > supj<k Xj } be the event that a record value occurs on the k’th trial.
We will show that the Ak are pairwise independent with P(Ak ) = 1/k.
We first observe that because F is continuous, P(Xj = Xk) = 0 for any j ≠ k, so we can let
Y1n > Y2n > · · · > Ynn be the (decreasing) order statistics of X1 , · · · , Xn . This induces a random
permutation of {1, · · · , n} defined by πn (i) = j if Xi = Yjn . Furthermore, because the joint
distribution of the random variables (X1 , · · · , Xn ) is not affected by the order of occurrence, it
follows that πn is uniformly distributed over the set of n! possible permutations.
To show pairwise independence, let Cn,i be the set of permutations of {1, · · · , n} with the
property that π(i) < min{π(1), · · · , π(i − 1)} and notice that
Ai = {πn ∈ Cn,i }.
Since πn is uniformly distributed, it follows that

P(Ai) = |Cn,i|/n! = n(n − 1) · · · (i + 1)·(i − 1) · · · 2 · 1 / n! = 1/i.
Similarly, if i < j, then

P(Ai ∩ Aj) = |Cn,i ∩ Cn,j|/n! = n(n − 1) · · · (j + 1)·(j − 1) · · · (i + 1)·(i − 1) · · · 2 · 1 / n! = 1/(ij) = P(Ai)P(Aj),

and so Ai and Aj are independent.
Let Rn = ∑_{m=1}^n 1_{Am} be the number of record values at time n. Since ∑_{m=1}^n P(Am) = ∑_{m=1}^n 1/m ∼ log(n) → ∞, Theorem (1.6.8) implies that as n → ∞,

Rn/log(n) → 1 a.s.
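This record-value result is easy to test by simulation. The sketch below (not part of the lecture; the helper name is ours) counts records among n i.i.d. Uniform(0, 1) draws and checks that Rn/log(n) is close to 1 on average over many independent runs:

```python
import math
import random

def record_count(n, seed):
    """Count record values among n i.i.d. Uniform(0,1) draws."""
    rng = random.Random(seed)
    best, records = float('-inf'), 0
    for _ in range(n):
        x = rng.random()
        if x > best:
            best, records = x, records + 1
    return records

n, runs = 20_000, 300
avg_ratio = sum(record_count(n, seed) / math.log(n) for seed in range(runs)) / runs
print(avg_ratio)  # near 1; more precisely, ERn/log(n) = H_n/log(n) ~ 1
```

Individual runs fluctuate noticeably since Var(Rn) ∼ log(n) is of the same order as ERn, which is why Theorem (1.6.8) normalizes by the expected count.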