CHAPTER I - THE BASIC IDEAS

JOSEPH G. CONLON
1. Measures induced by random variables
A probability space is generally defined as a triple (Ω, F, P ), where Ω is a set, F
is a Borel algebra of subsets of Ω, and P : F → R a probability measure. Hence P
is a positive measure on F which satisfies P (Ω) = 1. A real valued random variable
X on Ω is then a Borel measurable function X : Ω → R, which means
(1.1)   O an open subset of R implies {ω ∈ Ω : X(ω) ∈ O} ∈ F.
Rather than begin in such an abstract way, we can start with the cumulative distribution function (cdf) F (x) = P (X ≤ x) of X. The function F : R → [0, 1] is
monotone increasing on R, and right continuous, i.e.

(1.2)   lim_{x′↓x} F(x′) = F(x),   lim_{x→∞} F(x) − lim_{x→−∞} F(x) = 1.

Standard measure theory tells us that there is a Borel measure P on R such that

(1.3)   P({x : a < x ≤ b}) = F(b) − F(a),   −∞ < a < b < ∞.
Thus given the cdf F (·), we can construct (Ω, F, P ) and X : Ω → R as above, in
which Ω = R, F = Borel algebra generated by open sets of R, and P is given by
(1.3). The variable X is just the identity X(ω) = ω, ω ∈ R. The probability space
(Ω, F, P ) so constructed is the simplest probability space associated with the cdf
F (·).
We can generalize the previous construction to many random variables X_1, ..., X_N, once we know the joint cdf F(x_1, ..., x_N) = P(X_1 ≤ x_1, ..., X_N ≤ x_N). In that case Ω = R^N, F = Borel algebra generated by open sets of R^N, and P is defined similarly to (1.3). This can be further generalized to a countably infinite set of variables X_1, X_2, ..., provided the joint cdfs of the variables satisfy an obvious consistency condition. Thus for 1 ≤ j_1 < j_2 < · · · < j_N, let F_{j_1,j_2,...,j_N}(x_1, x_2, ..., x_N) be the cdf of X_{j_1}, X_{j_2}, ..., X_{j_N}. Consistency then just means

(1.4)   lim_{x_k→∞} F_{j_1,j_2,...,j_N}(x_1, ..., x_{k−1}, x_k, x_{k+1}, ..., x_N) = F_{j_1,...,j_{k−1},j_{k+1},...,j_N}(x_1, ..., x_{k−1}, x_{k+1}, ..., x_N),

for all possible choices of the j_i ≥ 1, x_i ∈ R. This construction of (Ω, F, P) for a countably infinite set of variables is known as the Kolmogorov construction. Here Ω = R^∞ and F is the σ-field generated by finite dimensional rectangles.
For a particularly simple set of variables the Kolmogorov construction is already
done for us by the construction of Lebesgue measure. Thus consider a Bernoulli variable X = 1 with probability 1/2, and X = 0 with probability 1/2. We wish to carry out the Kolmogorov construction for an infinite set X_1, X_2, ... of independent
Bernoulli variables. This can be done by setting Ω = [0, 1], P = Lebesgue measure, and X_j(ω) the jth entry in the binary expansion of ω ∈ [0, 1]. Thus we have

(1.5)   ω = Σ_{j=1}^∞ X_j(ω)/2^j,   ω ∈ [0, 1].
Note that this construction is actually simpler than the Kolmogorov construction, since for that we have Ω = R^∞. Here we are using the fact that the space {0, 1}^∞, with probability measure induced by the counting measure on finite subsets, is identical to the interval [0, 1] with Lebesgue measure. This illustrates an important
point in probability theory. What matters are the values a random variable can
take and the associated cdf. Generally we try to construct the simplest probability
space (Ω, F, P ) consistent with this information.
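To see this identification in action, the binary digits of a point sampled from Lebesgue measure on [0, 1] behave like independent fair coin flips. A minimal Python sketch (the helper binary_digits and all parameters are illustrative choices, not part of the text), extracting the digits X_j(ω) of (1.5):

    import random

    def binary_digits(omega, n):
        # First n binary digits X_1(omega), ..., X_n(omega) as in (1.5).
        digits = []
        for _ in range(n):
            omega *= 2
            d = int(omega)          # next binary digit, 0 or 1
            digits.append(d)
            omega -= d
        return digits

    random.seed(0)
    omega = random.random()         # omega drawn from Lebesgue measure on [0, 1]
    X = binary_digits(omega, 20)
    print(X)                        # reads like 20 independent fair coin flips
    print(sum(X) / len(X))          # frequency of 1s, close to 1/2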
2. Law of Large Numbers
If we take the Bernoulli variables X_1, X_2, ..., generated above, then we have a model for the problem of independent tosses of a fair coin. Thus X_j = 1 if the jth toss is H and X_j = 0 if it is T. Hence S_N = X_1 + · · · + X_N is the number of heads in N tosses of the coin. The law of averages tells us that

(2.1)   lim_{N→∞} S_N/N = 1/2   with probability 1.

The abstract measure theory we just presented gives us a precise way of formulating this law of averages.
Proposition 2.1. Let X_1, X_2, ... be independent identically distributed (i.i.d.) variables on (Ω, F, P) which have the property that ⟨X_1^2⟩ < ∞. Then if S_N = X_1 + · · · + X_N, there is for any ε > 0 the limit

(2.2)   lim_{N→∞} P( |S_N/N − ⟨X_1⟩| > ε ) = 0.
Proof. We use the Chebyshev inequality: for any random variable Y and p, δ > 0, one has

(2.3)   P(|Y| > δ) ≤ ⟨|Y|^p⟩/δ^p.

Hence taking p = 2 in (2.3), we have that

(2.4)   P( |S_N/N − ⟨X_1⟩| > ε ) ≤ var[S_N]/(ε^2 N^2) = N var[X_1]/(ε^2 N^2) ≤ ⟨X_1^2⟩/(ε^2 N).
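As a numerical illustration (not part of the text), the middle expression in (2.4), N var[X_1]/(ε^2 N^2) = var[X_1]/(ε^2 N), can be compared with a Monte Carlo estimate of P(|S_N/N − ⟨X_1⟩| > ε) for fair 0/1 Bernoulli variables, where ⟨X_1⟩ = 1/2 and var[X_1] = 1/4. A Python sketch with arbitrarily chosen N, ε and trial count:

    import random

    random.seed(1)
    N, eps, trials = 1000, 0.05, 5000
    count = 0
    for _ in range(trials):
        S = sum(random.randint(0, 1) for _ in range(N))   # S_N for a fair coin
        if abs(S / N - 0.5) > eps:
            count += 1

    empirical = count / trials
    chebyshev = 0.25 / (eps ** 2 * N)   # var[X_1]/(eps^2 N) with var[X_1] = 1/4
    print("empirical:", empirical, "Chebyshev bound:", chebyshev)

The empirical frequency typically comes out far below the Chebyshev bound, which is crude but sufficient for the proposition.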
Proposition 2.1 is known as the weak law of large numbers. It tells us that S_N/N converges in probability, i.e. in measure, to the average ⟨X_1⟩. However (2.1) is really saying something stronger than this.
Proposition 2.2 (Strong Law of Large Numbers). Let X_1, X_2, ... be i.i.d. variables on (Ω, F, P) which have the property that ⟨X_1^4⟩ < ∞. If S_N = X_1 + · · · + X_N, then

(2.5)   lim_{N→∞} S_N/N = ⟨X_1⟩   with probability 1.
Proof. Just as the weak law followed from estimating the variance of S_N, so the strong law (2.5) follows by estimating the fourth moment of S_N − ⟨S_N⟩ = S_N − N⟨X_1⟩. It is easy to see that

(2.6)   ⟨{S_N − ⟨S_N⟩}^4⟩ ≤ CN^2

for some constant C. Applying (2.3) with p = 4 yields

(2.7)   P( |S_N/N − ⟨X_1⟩| > ε ) ≤ CN^2/(ε^4 N^4) = C/(ε^4 N^2).

Thus for any N_0 ≥ 1,

(2.8)   P( lim sup_{N→∞} |S_N/N − ⟨X_1⟩| > ε ) ≤ Σ_{N=N_0}^∞ P( |S_N/N − ⟨X_1⟩| > ε ) ≤ Σ_{N=N_0}^∞ C/(ε^4 N^2) ≤ C′/(ε^4 N_0),

for some constant C′. We conclude that for any ε > 0,

(2.9)   P( lim sup_{N→∞} |S_N/N − ⟨X_1⟩| > ε ) = 0.

Since ε > 0 is arbitrary we have then that

(2.10)   P( lim sup_{N→∞} |S_N/N − ⟨X_1⟩| > 0 ) = 0,

whence (2.5) holds.
Remark 1. Proposition 2.2 applies to the coin tossing problem since in this case ⟨X_1^4⟩ = 1/2 < ∞. However there are many cases where the fourth moment of X_1 is not finite. Since ⟨X_1⟩ enters the RHS of (2.5), we might reasonably expect that (2.5) holds provided ⟨|X_1|⟩ < ∞. This stronger result is not so easy to prove, and requires the development of new ideas which we shall discuss in future lectures.
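A single simulated coin-tossing path gives a feel for the almost sure convergence in (2.5); a minimal Python sketch (the horizon and checkpoints are arbitrary):

    import random

    random.seed(2)
    S = 0
    checkpoints = {10, 100, 1000, 10000, 100000}
    for N in range(1, 100001):
        S += random.randint(0, 1)     # X_N in {0, 1}, fair coin
        if N in checkpoints:
            print(N, S / N)           # S_N / N drifts toward 1/2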
3. The Central Limit Theorem
The SLLN tells us that the variable S_N/N converges to ⟨X_1⟩ as N → ∞ for sums of i.i.d. variables. The central limit theorem allows us to estimate the fluctuation of S_N/N from its average. Recall that if we define the variable Z_N by

(3.1)   S_N = N⟨X_1⟩ + √(N var[X_1]) Z_N,

then ⟨Z_N⟩ = 0 and var[Z_N] = 1. The CLT tells us that Z_N is for large N approximately a standard normal variable. Recall that a variable Z is standard normal if it is defined by

(3.2)   P(Z ∈ O) = ∫_O (1/√(2π)) e^{−z^2/2} dz,

for all open sets O. Now the characteristic function χ_Y : R → C of a variable Y is defined by

(3.3)   χ_Y(σ) = ⟨e^{iσY}⟩,   σ ∈ R.
Evidently χ_Y satisfies ‖χ_Y(·)‖_∞ ≤ 1, χ_Y(0) = 1. For the standard normal variable Z we easily see that χ_Z(σ), σ ∈ R, extends to an entire function χ_Z : C → C. We can also evaluate χ_Z(σ) for pure imaginary σ = it with t ∈ R by change of variable,

(3.4)   χ_Z(it) = ∫_{−∞}^∞ (1/√(2π)) exp[−tz − z^2/2] dz = e^{t^2/2} ∫_{−∞}^∞ (1/√(2π)) exp[−(z + t)^2/2] dz = e^{t^2/2} ∫_{−∞}^∞ (1/√(2π)) exp[−z′^2/2] dz′ = e^{t^2/2} χ_Z(0) = e^{t^2/2}.

Using the fact now that χ_Z : C → C is entire, we conclude that χ_Z(σ) = e^{−σ^2/2}, σ ∈ C.
Lemma 3.1. Suppose Z_N is defined by (3.1), where X_1 satisfies ⟨|X_1|^3⟩ < ∞. Then

(3.5)   lim_{N→∞} χ_{Z_N}(σ) = e^{−σ^2/2},   σ ∈ R.
Proof. By independence of the variables X_1, · · · , X_N, we have that

(3.6)   log χ_{Z_N}(σ) = N log χ_X(σ/√N),

where X has the distribution of [X_1 − ⟨X_1⟩]/√(var[X_1]), whence ⟨X⟩ = 0, ⟨X^2⟩ = 1 and ⟨|X|^3⟩ < ∞. Observe now that χ_X(·) is a C^3 function and

(3.7)   sup_{σ∈R} |d^3 χ_X(σ)/dσ^3| ≤ ⟨|X|^3⟩.

Hence from Taylor's theorem and the fact that ⟨X⟩ = 0, ⟨X^2⟩ = 1, we conclude that

(3.8)   | χ_X(σ/√N) − 1 + σ^2/(2N) | ≤ |σ|^3 ⟨|X|^3⟩/(6N^{3/2}).

The result follows from (3.8) and the inequality

(3.9)   z ≤ − log(1 − z) ≤ z + z^2,   0 ≤ z < 1/2.
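The convergence (3.5) can be tested numerically by estimating ⟨e^{iσZ_N}⟩ by Monte Carlo. The sketch below (σ, the sample sizes and the helper chi_ZN are our illustrative choices) uses the fair ±1 walk, for which ⟨X_1⟩ = 0 and var[X_1] = 1, so Z_N = S_N/√N in (3.1):

    import cmath, math, random

    random.seed(3)
    sigma, trials = 1.5, 5000

    def chi_ZN(N):
        # Monte Carlo estimate of <exp(i*sigma*Z_N)> with Z_N = S_N/sqrt(N).
        total = 0 + 0j
        for _ in range(trials):
            S = sum(random.choice((-1, 1)) for _ in range(N))
            total += cmath.exp(1j * sigma * S / math.sqrt(N))
        return total / trials

    target = math.exp(-sigma ** 2 / 2)     # the limit in (3.5)
    for N in (1, 4, 16, 64):
        print(N, abs(chi_ZN(N) - target))  # shrinks in N, up to sampling noise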
Lemma 3.1 tells us that

(3.10)   lim_{N→∞} ⟨f(Z_N)⟩ = ⟨f(Z)⟩,

provided f is a function of the type f(z) = exp[iσz], z ∈ R, for some σ ∈ R. We say the variables Z_N, N = 1, 2, ..., converge in distribution to the variable Z if the cdfs of Z_N converge to the cdf of Z as N → ∞, i.e.

(3.11)   lim_{N→∞} P(Z_N ≤ ξ) = P(Z ≤ ξ),   ξ ∈ R.

Evidently we can rewrite (3.11) as (3.10) with f(z) = H(ξ − z), z ∈ R, where H(·) is the Heaviside function

(3.12)   H(z) = 0, z < 0;   H(z) = 1, z ≥ 0.
We can obtain a proof of (3.11) from Lemma 3.1 and some Fourier analysis. To do this we introduce the Schwartz space S of functions f : R → C defined as follows: f ∈ S if

(3.13)   sup_{z∈R} (1 + |z|)^m |d^n f(z)/dz^n| < ∞

for all non-negative integers m, n. If we define now the Fourier transform of a function f : R → C by

(3.14)   Ff(σ) = f̂(σ) = ∫_{−∞}^∞ f(z) e^{iσz} dz,   σ ∈ R,

then it is easy to see that F is an isomorphism of S with inverse F^{−1} given by

(3.15)   f(z) = F^{−1}f̂(z) = (1/2π) ∫_{−∞}^∞ f̂(σ) e^{−iσz} dσ,   z ∈ R.
Lemma 3.2. Assume the conditions of Lemma 3.1. Then the limit (3.10) holds for all f ∈ S.

Proof. From (3.15) we have that

(3.16)   ⟨f(Z_N)⟩ = ⟨ (1/2π) ∫_{−∞}^∞ f̂(σ) e^{−iσZ_N} dσ ⟩ = (1/2π) ∫_{−∞}^∞ f̂(σ) χ_{Z_N}(−σ) dσ.

Observe that in (3.16) we have used Fubini's theorem. From Lemma 3.1 and the dominated convergence theorem we then see that

(3.17)   lim_{N→∞} ⟨f(Z_N)⟩ = (1/2π) ∫_{−∞}^∞ f̂(σ) e^{−σ^2/2} dσ.

Now from (3.14) we have that

(3.18)   (1/2π) ∫_{−∞}^∞ f̂(σ) e^{−σ^2/2} dσ = (1/2π) ∫_{−∞}^∞ [ ∫_{−∞}^∞ f(z) e^{iσz} dz ] e^{−σ^2/2} dσ = ∫_{−∞}^∞ [ (1/2π) ∫_{−∞}^∞ e^{iσz − σ^2/2} dσ ] f(z) dz = (1/√(2π)) ∫_{−∞}^∞ e^{−z^2/2} f(z) dz = ⟨f(Z)⟩.
Theorem 3.1 (Central Limit Theorem). Suppose Z_N is defined by (3.1), where X_1 satisfies ⟨|X_1|^3⟩ < ∞. Then Z_N converges in distribution as N → ∞ to the standard normal variable Z.

Proof. Consider a finite interval [a, b]. Then for any ε > 0 there is a C^∞ function f_{1,ε} : R → R with support in the open interval (a, b) such that ‖f_{1,ε}‖_∞ ≤ 1 and f_{1,ε}(z) = 1 for a + ε ≤ z ≤ b − ε. Now since f_{1,ε} ∈ S, Lemma 3.2 implies that

(3.19)   lim inf_{N→∞} P(a < Z_N < b) ≥ ⟨f_{1,ε}(Z)⟩.

If we let ε → 0 in (3.19) we conclude that

(3.20)   lim inf_{N→∞} P(a < Z_N < b) ≥ P(a ≤ Z ≤ b).

Similarly by choosing a C^∞ function f_{2,ε} : R → R with support in the open interval (a − ε, b + ε) such that ‖f_{2,ε}‖_∞ ≤ 1 and f_{2,ε}(z) = 1 for a ≤ z ≤ b, we conclude that

(3.21)   lim sup_{N→∞} P(a ≤ Z_N ≤ b) ≤ P(a < Z < b).
From (3.20), (3.21) it follows that for any a < b,

(3.22)   lim_{N→∞} [P(Z_N ≤ b) − P(Z_N ≤ a)] = P(Z ≤ b) − P(Z ≤ a).

The result follows from (3.22) if we can show that

(3.23)   lim sup_{N→∞} P(Z_N ≤ a) ≤ K_a,

where the constant K_a satisfies lim_{a→−∞} K_a = 0. To prove (3.23) we observe that for a C^∞ function f_a : R → R with compact support in (a, ∞) such that ‖f_a‖_∞ ≤ 1, one has

(3.24)   lim inf_{N→∞} P(Z_N > a) ≥ ⟨f_a(Z)⟩.

Now for a ≪ 0 we choose f_a to have the property that f_a(z) = 1 for a + 1 < z < |a|. If we set the RHS of (3.24) to be 1 − K_a, then lim_{a→−∞} K_a = 0.
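The convergence in distribution (3.11) can be checked directly for the fair ±1 walk by comparing the empirical cdf of Z_N = S_N/√N with the standard normal cdf, computed here via the error function as Φ(ξ) = (1/2) erfc(−ξ/√2). A Python sketch (N, the trial count, the test points ξ and the helper Phi are arbitrary choices):

    import math, random

    random.seed(4)
    N, trials = 256, 5000
    samples = []
    for _ in range(trials):
        S = sum(random.choice((-1, 1)) for _ in range(N))
        samples.append(S / math.sqrt(N))      # one sample of Z_N

    def Phi(x):
        # standard normal cdf
        return 0.5 * math.erfc(-x / math.sqrt(2))

    for xi in (-1.0, 0.0, 1.0):
        emp = sum(1 for z in samples if z <= xi) / trials
        print(xi, round(emp, 3), round(Phi(xi), 3))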
Remark 2. We can compare the SLLN with the CLT. Since the SLLN tells us that

(3.25)   lim_{N→∞} Z_N/√N = 0   with probability 1,

and the CLT that Z_N converges to the normal variable Z in distribution, we might expect that for any α > 0,

(3.26)   lim_{N→∞} Z_N/N^α = 0   with probability 1.

We might even expect that Z_N converges with probability 1 to Z, but this is always false, whereas (3.26) is true for many i.i.d. variables. Thus the function Z_N(ω), ω ∈ Ω, is very oscillatory, but for fixed large N the measure of the set of ω for which Z_N(ω) takes values between a and b remains more or less constant.
As an example we might think of Z_N being like the function Z_N(ω) = sin 2πNω, ω ∈ [0, 1]. Thus Z_N(ω) is very oscillatory but

(3.27)   ⟨f(Z_N)⟩ = ∫_0^1 f(sin 2πNω) dω = ∫_0^1 f(sin 2πω) dω.

Thus all the Z_N, N ≥ 1, have the same distribution but lim_{N→∞} Z_N does not exist with probability 1.
4. Recurrence and Tail Events
We shall abstract part of the argument used in the proof of the SLLN, in particular (2.8) to (2.10). For a probability space (Ω, F, P), suppose that A_n ∈ F, n = 1, 2, ..., is an infinite sequence of "events". We define the lim sup and lim inf of this sequence as follows:

(4.1)   lim sup_{n→∞} A_n = ∩_{N=1}^∞ ∪_{n=N}^∞ A_n,   lim inf_{n→∞} A_n = ∪_{N=1}^∞ ∩_{n=N}^∞ A_n.

Hence lim sup_{n→∞} A_n is the event that A_n occurs infinitely often, whereas lim inf_{n→∞} A_n is the event that A_n eventually always happens.
Proposition 4.1 (Borel-Cantelli lemma). (a) Suppose that Σ_{n=1}^∞ P(A_n) < ∞. Then P(lim sup_{n→∞} A_n) = 0.
(b) Suppose that Σ_{n=1}^∞ P(A_n) = ∞ and the A_n, n = 1, 2, ..., are independent events. Then P(lim sup_{n→∞} A_n) = 1.
Proof. (a) From (4.1) we have that

(4.2)   P(lim sup_{n→∞} A_n) ≤ P(∪_{n=N}^∞ A_n) ≤ Σ_{n=N}^∞ P(A_n)

for any N ≥ 1. Since by our assumption the RHS of (4.2) goes to 0 as N → ∞ we are done.
(b) We observe that the complement of the set lim sup_{n→∞} A_n in Ω is lim inf_{n→∞} Ã_n, where Ã_n = Ω − A_n, n = 1, 2, ..., are the complements of the A_n in Ω. Hence we have that

(4.3)   1 − P(lim sup_{n→∞} A_n) = P(∪_{N=1}^∞ ∩_{n=N}^∞ Ã_n),

which in turn implies that

(4.4)   1 − P(lim sup_{n→∞} A_n) = lim_{N→∞} P(∩_{n=N}^∞ Ã_n).

Using the independence of the events A_n, n = 1, 2, ..., we have that

(4.5)   P(∩_{n=N}^∞ Ã_n) = Π_{n=N}^∞ [1 − P(A_n)].

Now from (3.9) we have that

(4.6)   − log Π_{n=N}^∞ [1 − P(A_n)] ≥ Σ_{n=N}^∞ P(A_n) = ∞,

whence the RHS of (4.4) is 0.
Remark 3. Note that in (2.8) the set A_n is the set A_n = {ω ∈ Ω : |S_n(ω)/n − ⟨X_1⟩| > ε}.
We can use Borel-Cantelli (b) to prove recurrence for the standard random walk on the integers Z. Thus let the X_j, j = 1, 2, ..., be Bernoulli variables taking the values ±1 with equal probability 1/2. Then S_N = X_1 + · · · + X_N is the position after N steps of the standard random walk on Z starting at the origin. We wish to show that

(4.7)   P(S_N = 0 for infinitely many N) = 1.

Thus (4.7) says that the walk recurs to its starting point 0 infinitely often with probability 1. Following the argument of Borel-Cantelli (b) we need to first show that

(4.8)   Σ_{n=1}^∞ P(S_n = 0) = ∞.
We can see why this is plausible from the CLT. Thus using the fact that S_n takes only integer values and the fact that S_n/√n converges in distribution to the standard normal variable Z, we expect that

(4.9)   P(S_n = 0) ≃ (1/√(2π)) ∫_{−1/(2√n)}^{1/(2√n)} e^{−z^2/2} dz ≃ 1/√(2πn)
for large n. The approximation (4.9) is of course not correct because P(S_n = 0) = 0 if n is odd. We might however be tempted then to modify (4.9) to be double the RHS of (4.9) in the case of even integer n. In that case we expect

(4.10)   P(S_{2m} = 0) ≃ 2 · 1/√(2π · 2m) = 1/√(πm)

for large integer m. We shall show using Stirling's formula that (4.10) holds. This illustrates that approximation of S_N/√N by a standard normal variable can be much finer than what is given by the CLT.
Lemma 4.1. For S_N, N = 1, 2, ..., the standard random walk on Z starting at the origin, lim_{m→∞} √(πm) P(S_{2m} = 0) = 1.

Proof. We have that

(4.11)   P(S_{2m} = 0) = (2m)!/((m!)^2 2^{2m}).

Recall now Stirling's formula:

(4.12)   lim_{n→∞} n!/(n^{n+1/2} e^{−n}) = √(2π).

Observe now that

(4.13)   √m P(S_{2m} = 0) = [(2m)!/((2m)^{2m+1/2} e^{−2m})] · [(m^{m+1/2} e^{−m})^2/(m!)^2] · √2.

From (4.12) we see that the RHS of (4.13) converges to 1/√π as m → ∞.
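Since (4.11) is an exact formula, Lemma 4.1 is easy to confirm numerically with exact binomial coefficients; a short Python check (the values of m are arbitrary):

    import math

    for m in (10, 100, 1000, 10000):
        p = math.comb(2 * m, m) / 4 ** m       # (4.11): P(S_{2m} = 0)
        print(m, math.sqrt(math.pi * m) * p)   # tends to 1 as m grows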
Evidently Lemma 4.1 implies (4.8), but (4.8) does not imply (4.7) since the events {S_N = 0}, N = 1, 2, ..., are not independent. These events are however approximately independent, and using that fact we can still prove (4.7). The notion of "approximate independence" is a bit fuzzy, but we can give it some quantitative meaning by considering covariances. Thus if X, Y are two variables, the covariance of X and Y is defined as

(4.14)   cov[X, Y] = ⟨[X − ⟨X⟩][Y − ⟨Y⟩]⟩ = ⟨XY⟩ − ⟨X⟩⟨Y⟩.

If X, Y are independent then cov[X, Y] = 0, but of course one can have cov[X, Y] = 0 and X, Y not be independent. We define the coefficient of correlation ρ for the variables X, Y as

(4.15)   ρ(X, Y) = cov[X, Y]/√(var[X] var[Y]).

Evidently we have that −1 ≤ ρ(X, Y) ≤ 1. If ρ(X, Y) = ±1 then X is a multiple of Y, which is the opposite of independence, and ρ(X, Y) = 0 if X, Y are independent. Therefore we may use the coefficient of correlation as a measure of independence. In particular if |ρ(X, Y)| ≪ 1 then we could conclude that X, Y are "approximately independent".
Let us apply the above considerations to the events {S_N = 0}, N = 1, 2, .... We define X_m by X_m(ω) = 1 if S_{2m}(ω) = 0, otherwise X_m(ω) = 0. We have now, assuming from Lemma 4.1 that for large m and k ≥ 1,

(4.16)   ⟨X_m⟩ = 1/√(πm),   ⟨X_m X_{m+k}⟩ = 1/(π√(mk)),
that ρ(X_m, X_{m+k}) is given by the formula

(4.17)   ρ(X_m, X_{m+k}) = [1/(π√(mk)) − 1/(π√(m(m+k)))] / { [1/√(πm) − 1/(πm)]^{1/2} [1/√(π(m+k)) − 1/(π(m+k))]^{1/2} }.

Thus we have for fixed k,

(4.18)   lim_{m→∞} ρ(X_m, X_{m+k}) = 1/√(πk).
We conclude that the sequence of variables X_{rk}, r = 1, 2, ..., is approximately independent for large fixed k. Since Lemma 4.1 implies that

(4.19)   Σ_{r=1}^∞ P(X_{rk} = 1) = ∞,

we can more or less conclude the recurrence (4.7) from Borel-Cantelli (b). This is of course not a rigorous argument, but still an important indicator of why the result is correct. Now to the rigorous argument:

Proposition 4.2. For S_N, N = 1, 2, ..., the standard random walk on Z, the recurrence (4.7) holds.
Proof. We consider the disjoint events A_k, k = 0, 1, 2, ..., defined by

(4.20)   A_0 = {ω : S_N(ω) ≠ 0 for all N = 1, 2, ...},   A_k = {ω : S_k(ω) = 0, S_{N+k}(ω) ≠ 0 for all N = 1, 2, ...}, k ≥ 1.

Thus A_k is the event that S_N returns to 0 for the last time at N = k. Since these are disjoint events we have

(4.21)   Σ_{k=0}^∞ P(A_k) ≤ 1.

Observe now that we can rewrite A_k as

(4.22)   A_k = {ω : S_k(ω) = 0, S_{N+k}(ω) − S_k(ω) ≠ 0 for all N = 1, 2, ...}.

Hence by the independence of the events {S_k = 0} and {S_{N+k} − S_k ≠ 0 for all N = 1, 2, ...}, we conclude from (4.22) that

(4.23)   P(A_k) = P(S_k = 0) P(A_0).

Substituting (4.23) into (4.21) and using (4.8), we conclude that P(A_0) = 0. Now (4.23) implies that P(A_k) = 0, k = 0, 1, 2, .... To finish the proof we observe that

(4.24)   ∪_{k=0}^∞ A_k = event that S_N = 0 finitely often,

and we have shown the probability of the LHS of (4.24) is 0.
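A simulation gives a feel for (4.7): every excursion returns to 0 with probability 1, but the expected return time is infinite, so occasional returns take very long. A Python sketch (the step cap is our own safeguard, not part of the argument):

    import random

    random.seed(5)

    def steps_to_return(cap=10 ** 6):
        # Steps until the walk first returns to 0, or None if more than cap.
        S = 0
        for n in range(1, cap + 1):
            S += random.choice((-1, 1))
            if S == 0:
                return n
        return None

    print([steps_to_return() for _ in range(10)])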
We have observed in the Borel-Cantelli Lemma and in the previous proposition that the probability of an event is either 0 or 1. These events are so-called "tail events", since they are independent of any finite number of the variables involved. The following systematizes this observation:

Proposition 4.3 (Kolmogorov zero-one law). Let X_1, X_2, ..., be independent variables on a probability space (Ω, F, P). Suppose E ∈ F is in the σ-field generated by the X_1, X_2, ..., but is independent of any finite number of the variables X_1, X_2, .... Then P(E) = 0 or P(E) = 1.
Proof. E can be approximated arbitrarily closely by sets E_N in the σ-field F(X_1, X_2, ..., X_N) for large enough N. Hence we have that

(4.25)   lim_{N→∞} P(E_N) = P(E),   lim_{N→∞} P(E_N ∩ E) = P(E).

By assumption we have P(E_N ∩ E) = P(E_N)P(E), whence (4.25) implies that P(E) = P(E)^2.
We can apply Proposition 4.3 to the issue raised at the end of §3, in particular (3.26). Observe that for α > 0 the set {ω : lim_{N→∞} Z_N(ω)/N^α exists} is a tail event, whence it follows that it occurs with probability 1 or 0. We have seen from the SLLN that it occurs with probability 1 if α = 1/2 and the limit is 0. Next we show that if α = 0 it occurs with probability 0.

Proposition 4.4. Let X_1, X_2, ..., be i.i.d. variables on (Ω, F, P) and assume that ⟨|X_1|^3⟩ < ∞. Then P({ω : lim_{N→∞} Z_N(ω) exists}) = 0.

Proof. By the Kolmogorov law we can assume for contradiction that {ω : lim_{N→∞} Z_N(ω) exists} occurs with probability 1. Thus there exists a variable Z and lim_{N→∞} Z_N = Z with probability 1. However by the CLT this limiting variable Z must be the standard normal variable, whence

(4.26)   P({ω : lim_{N→∞} Z_N(ω) > 0}) = 1/2.

Since {ω : lim_{N→∞} Z_N(ω) > 0} is a tail event we have obtained a contradiction to the Kolmogorov law. Thus {ω : lim_{N→∞} Z_N(ω) exists} occurs with probability 0.
5. Law of the Iterated Logarithm
In this section we shall be considering the standard random walk S_N = X_1 + · · · + X_N on Z, so the X_j, j = 1, 2, ..., are i.i.d. Bernoulli taking values ±1 with probability 1/2. Suppose now that b_n, n = 1, 2, ..., is a positive sequence satisfying

(5.1)   lim_{n→∞} b_n = ∞.

By Kolmogorov the set A given by

(5.2)   A = {ω : lim sup_{n→∞} |S_n(ω)|/b_n < ∞}

is a tail event and hence P(A) = 0 or P(A) = 1. Assume now that P(A) = 1. Invoking Kolmogorov again, we see that there exists γ ≥ 0 such that

(5.3)   lim sup_{n→∞} |S_n(ω)|/b_n = γ   with probability 1.

We have already seen that if b_n = n then γ = 0. We also know that if b_n = √n then P(A) = 0. A natural question to ask therefore is to find a sequence b_n such that (5.3) holds for some γ with 0 < γ < ∞. The answer is given by the following:

Theorem 5.1 (Law of the Iterated Logarithm). If b_n = [2n log log n]^{1/2}, then (5.3) holds with γ = 1.
To prove Theorem 5.1 we use estimates on the fluctuations of the random walk which come from the CLT, but we also need an important new idea, the idea of a maximal function. We define the maximal function M_N associated with the random walk S_N to be

(5.4)   M_N(ω) = sup_{0≤n≤N} S_n(ω),   ω ∈ Ω.

Thus M_N measures the farthest to the right the walk has wandered in N steps. Evidently M_N is a non-negative variable, but there is no obvious relationship between the distributions of M_N and S_N. The reflection principle tells us that the distributions of M_N and |S_N| are almost identical.

Lemma 5.1. Let r ≥ 1 be an integer. Then there is the inequality

(5.5)   2P(S_N ≥ r + 1) ≤ P(M_N ≥ r) ≤ 2P(S_N ≥ r).
Proof. Observe that

(5.6)   P(M_N ≥ r) − P(S_N ≥ r) = P(M_{N−1} ≥ r, S_N < r).

Reflection symmetry comes in by noting that

(5.7)   P(M_{N−1} ≥ r, S_N < r) = P(M_{N−1} ≥ r, S_N > r).

To see (5.7) we use a random stopping time n* defined by

(5.8)   n*(ω) = inf{n ≥ 1 : S_n(ω) = r}.

Thus we have that

(5.9)   P(M_{N−1} ≥ r, S_N < r) = P({ω : n*(ω) < N, S_N(ω) − S_{n*(ω)}(ω) < 0}) = P({ω : n*(ω) < N, S_N(ω) − S_{n*(ω)}(ω) > 0}) = P(M_{N−1} ≥ r, S_N > r).

Now (5.6), (5.7) imply that

(5.10)   P(M_N ≥ r) = (1/2) P(M_{N−1} ≥ r, S_N ≠ r) + P(S_N ≥ r) ≤ (1/2) P(M_N ≥ r) + P(S_N ≥ r),

which implies the upper bound in (5.5). We also get the lower bound by using the identity in (5.10). Thus we have from (5.10) that

(5.11)   P(M_N ≥ r) ≥ (1/2) P(M_{N−1} ≥ r) + (1/2) P(S_N = r) + P(S_N ≥ r + 1) ≥ (1/2) P(M_N ≥ r) + P(S_N ≥ r + 1).
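The inequalities (5.5) can be checked by simulating the pair (M_N, S_N) directly; a Python sketch with arbitrary N, r and trial count (r is chosen with the same parity as N so that P(S_N = r) > 0 and both bounds in (5.5) are strict):

    import random

    random.seed(6)
    N, r, trials = 100, 6, 10000
    hit_M = hit_r = hit_r1 = 0
    for _ in range(trials):
        S = M = 0
        for _ in range(N):
            S += random.choice((-1, 1))
            M = max(M, S)              # running maximum M_N of (5.4)
        hit_M += (M >= r)
        hit_r += (S >= r)
        hit_r1 += (S >= r + 1)

    # (5.5): 2 P(S_N >= r+1) <= P(M_N >= r) <= 2 P(S_N >= r)
    print(2 * hit_r1 / trials, "<=", hit_M / trials, "<=", 2 * hit_r / trials)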
Next we obtain some estimates on the asymptotics for the cdf of the normal variable.

Lemma 5.2. For Z the standard normal variable and a > 0, there are the inequalities

(5.12)   (1/a − 1/a^3) (1/√(2π)) e^{−a^2/2} ≤ P(Z > a) ≤ (1/(a√(2π))) e^{−a^2/2}.
Proof. We have that

(5.13)   P(Z > a) = (1/√(2π)) ∫_a^∞ e^{−z^2/2} dz = (1/√(2π)) e^{−a^2/2} ∫_0^∞ e^{−ax−x^2/2} dx,

upon making the change of variable z = x + a in the integration. The upper bound in (5.12) follows from

(5.14)   ∫_0^∞ e^{−ax−x^2/2} dx ≤ ∫_0^∞ e^{−ax} dx = 1/a.
The estimate 1/a in (5.14) is just the first term in an asymptotic expansion for the integral in powers of 1/a, which should be valid for a ≫ 1. The second term is obtained by making the approximation

(5.15)   e^{−x^2/2} ≃ 1 − x^2/2,

in which case we get

(5.16)   ∫_0^∞ e^{−ax−x^2/2} dx ≃ 1/a − 1/a^3.

We can turn the calculation (5.16) into a rigorous lower bound by noting that

(5.17)   −d/dz [ (1/z − 1/z^3) e^{−z^2/2} ] = (1 − 3/z^4) e^{−z^2/2}.

Hence

(5.18)   ∫_0^∞ e^{−ax−x^2/2} dx = e^{a^2/2} ∫_a^∞ e^{−z^2/2} dz ≥ e^{a^2/2} ∫_a^∞ (1 − 3/z^4) e^{−z^2/2} dz = 1/a − 1/a^3,

whence the lower bound in (5.12) follows.
The fine asymptotics in Lemma 5.2 shows us how the logarithm gets involved in the LIL.

Lemma 5.3. There is the inequality

(5.19)   Σ_{n=2}^∞ P( S_n/[2n(1+ε) log n]^{1/2} > 1 ) < ∞ or = ∞

according as ε > 0 or ε < 0.
Proof. From Lemma 5.2 and the CLT we have that

(5.20)   P( S_n/[2n(1+ε) log n]^{1/2} > 1 ) = P( S_n/√n > [2(1+ε) log n]^{1/2} ) ≃ [4π(1+ε) log n]^{−1/2} e^{−(1+ε) log n} = [4π(1+ε) log n]^{−1/2} n^{−(1+ε)}.

Since

(5.21)   Σ_{n=1}^∞ 1/n^{1+ε} < ∞ or = ∞

according as ε > 0 or ε < 0, the result follows.
Remark 4. From Lemma 5.3 and Borel-Cantelli (a) we conclude that

(5.22)   lim sup_{n→∞} S_n/[2n(1+ε) log n]^{1/2} ≤ 1   with probability 1

if ε > 0, whence it follows for ε = 0.

In order to go further than (5.22) we need to use Lemma 5.1.

Lemma 5.4. With b_n the sequence in the statement of Theorem 5.1,

(5.23)   lim sup_{n→∞} S_n/b_n ≤ 1   with probability 1.
Proof. We shall show using Lemma 5.1 that for any α > 1

(5.24)   Σ_{m=1}^∞ P( sup_{α^m ≤ n < α^{m+1}} S_n/b_n > α ) < ∞.

The result follows from (5.24) and Borel-Cantelli (a) on letting α → 1. To prove (5.24) consider integers r, k which satisfy for some integer m ≥ 1 the inequality α^m ≤ r, k ≤ α^{m+1}. Then we have that

(5.25)   γ(m)/√α ≤ b_r/b_k ≤ √α/γ(m),

where lim_{m→∞} γ(m) = 1. Thus in intervals α^m ≤ n ≤ α^{m+1} the sequence b_n is essentially constant for α close to 1. Using this observation we note then that we may estimate

(5.26)   P( sup_{α^m ≤ n < α^{m+1}} S_n/b_n > α ) ≤ P( sup_{n ≤ α^{m+1}} S_n > α b_{α^m} ) ≤ 2P( S_{α^{m+1}} > α b_{α^m} ) = 2P( S_{α^{m+1}}/√(α^{m+1}) > √α [2ξ(m) log m]^{1/2} ),
where lim_{m→∞} ξ(m) = 1. Using the CLT and Lemma 5.2 we have that

(5.27)   P( S_{α^{m+1}}/√(α^{m+1}) > √α [2ξ(m) log m]^{1/2} ) ≃ [4παξ(m) log m]^{−1/2} exp[−αξ(m) log m] = [4παξ(m) log m]^{−1/2} m^{−αξ(m)}.

The inequality (5.24) follows from (5.27) since

(5.28)   Σ_{m=1}^∞ 1/m^{αξ(m)} < ∞.
Lemma 5.5. With b_n the sequence in the statement of Theorem 5.1,

(5.29)   lim sup_{n→∞} S_n/b_n ≥ 1   with probability 1.
Proof. Taking α > 1 again and setting β = α/(α − 1) > 1, we consider

(5.30)   P( S_{α^{m+1}} − S_{α^m} > (1/β) b_{α^{m+1}} ) = P( (S_{α^{m+1}} − S_{α^m})/√(α^m(α − 1)) > [2ξ(m) log m/β]^{1/2} ) ≃ [β/(4πξ(m) log m)]^{1/2} exp[−ξ(m) log m/β] = [β/(4πξ(m) log m)]^{1/2} m^{−ξ(m)/β}.

Since β > 1 we conclude that

(5.31)   Σ_{m=1}^∞ P( S_{α^{m+1}} − S_{α^m} > (1/β) b_{α^{m+1}} ) = ∞.

Since the events in (5.31) are independent we conclude from Borel-Cantelli (b) that

(5.32)   S_{α^{m+1}} − S_{α^m} > (1/β) b_{α^{m+1}}   infinitely often with probability 1.

Combining Lemma 5.4 with (5.32) we see that

(5.33)   S_{α^{m+1}} > (1/β) b_{α^{m+1}} − α^{1/4} b_{α^m}   infinitely often with probability 1,

whence we have that

(5.34)   S_{α^{m+1}}/b_{α^{m+1}} > 1/β − ξ(m)/α^{1/4}   infinitely often with probability 1.

Now let α → ∞ in (5.34) to get the result, noting that then β → 1.
Proof of Theorem 5.1. Simply observe that

(5.35)   lim sup_{n→∞} |S_n|/b_n = max( lim sup_{n→∞} S_n/b_n, −lim inf_{n→∞} S_n/b_n ).

From the previous lemmas we have by symmetry that

(5.36)   lim inf_{n→∞} S_n/b_n = −lim sup_{n→∞} S_n/b_n = −1,

whence the RHS of (5.35) is 1.
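Finally, a rough numerical illustration of Theorem 5.1 (in no way a substitute for the proof): along a single long walk we track the running maximum of |S_n|/b_n with b_n = [2n log log n]^{1/2}. The horizon below is far too short to pin down γ = 1 sharply, and for small n the ratio may exceed 1 since the theorem is asymptotic; all parameters are arbitrary choices. A Python sketch:

    import math, random

    random.seed(7)
    S, running_max = 0, 0.0
    for n in range(1, 10 ** 6 + 1):
        S += random.choice((-1, 1))
        if n >= 100:                 # skip small n, where log log n is tiny
            b = math.sqrt(2 * n * math.log(math.log(n)))
            running_max = max(running_max, abs(S) / b)
        if n in (10 ** 3, 10 ** 4, 10 ** 5, 10 ** 6):
            print(n, running_max)    # should hover near and below 1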
University of Michigan, Department of Mathematics, Ann Arbor, MI 48109-1109
E-mail address: [email protected]