BROWNIAN MOTION AND THE STRONG MARKOV
PROPERTY
JAMES LEINER
Abstract. This paper is an introduction to Brownian motion. After a brief
introduction to measure-theoretic probability, we begin by constructing Brownian motion over the dyadic rationals and extending this construction to R^d.
After establishing some relevant features, we introduce the strong Markov
property and its applications. We then use these tools to demonstrate the
existence of various Markov processes embedded within Brownian motion.
Contents
1. Introduction
2. Construction
3. Non-Differentiability
4. Invariance Properties
5. Markov Properties
6. Applications
7. Derivative Markov Processes
Acknowledgments
References
1. Introduction
With a varied array of uses across pure and applied mathematics, Brownian
motion is one of the most widely studied stochastic processes. This paper seeks
to provide a rigorous introduction to the topic, using [3] and [4] as our primary
references.
In order to properly ground our discussion, however, we must first become comfortable with the basic framework of measure-theoretic probability. In this section,
we provide a terse review of the subject sufficient to prepare the reader for some
of the more advanced proofs within this paper. Most of the important results in this section are presented without proof; individuals wishing for a more detailed exposition of the topic can consult [1]. A large portion of the first half of the paper can also be understood without measure theory; if so inclined, the reader can consult [2] for a more basic introduction to probability that does not require measure theory.
Definition 1.1. A σ-algebra on a set S is a subset Σ ⊆ 2^S of the power set such that
(1) S ∈ Σ;
(2) if A ∈ Σ, then Ac ∈ Σ;
(3) if A1, A2, ... ∈ Σ, then ⋃_{i=1}^∞ A_i ∈ Σ.
Note that by De Morgan’s laws,
(⋃_{i=1}^∞ A_i^c)^c = ⋂_{i=1}^∞ A_i.
Thus, a σ-algebra is closed under countable intersection as well.
Definition 1.2. A pair (S, Σ) where S is a set, and Σ is some σ-algebra over that
set, is called a measurable space.
Definition 1.3. Let C be a collection of subsets of a set S. The σ-algebra generated by C, denoted σ(C), is the smallest σ-algebra on S containing C. In other words, it is the intersection of all σ-algebras on S which have C as a subcollection.
Definition 1.4. Let S be a topological space. B(S), the Borel σ-algebra, is the
σ-algebra generated by the family of open subsets of S.
Note that in the specific case of R, we can say that
B(R) = σ({(−∞, x] : x ∈ R}).
Definition 1.5. Let (S, Σ) be a measurable space. A map µ : Σ → [0, ∞] is called a measure when µ(∅) = 0 and it is countably additive. That is, whenever F1, F2, ... ∈ Σ are pairwise disjoint,
µ( ⋃_{j=1}^∞ F_j ) = ∑_{j=1}^∞ µ(F_j).
A triple (S, Σ, µ) is then called a measure space.
Definition 1.6. Let (S, Σ) be a measurable space and let X be a function from S to R. Then, X is Σ-measurable if X^{−1}(H) ∈ Σ for all H ∈ B(R).
Definition 1.7. Let (S, Σ, µ) be a measure space. When µ(S) = 1, the map µ is termed a probability measure and the associated measure space is called a probability space.
We are now able to use this machinery to re-introduce some familiar concepts
within probability theory. First, let us introduce some standard terminology. Conventionally, the measure space (S, Σ, µ) is denoted as (Ω, F, P) and called a probability triple. The set Ω is called a sample space, while an element w of Ω is
correspondingly termed a sample point. Similarly, the σ-algebra F is called a
family of events and a particular element of F is termed an event.
Definition 1.8. Let E1, E2, ... be a sequence of events. The limit superior of these sets is defined as
lim sup E_n = ⋂_{k=1}^∞ ⋃_{n=k}^∞ E_n.
Because inclusion in the limit superior means that the sample point lies in E_n for infinitely many values of n, we henceforth denote lim sup E_n as {E_n i.o.} (for "infinitely often").
Definition 1.9. A statement S about outcomes is said to be true almost surely,
or with probability one, if F = {w : S(w) is true} ∈ F and P(F ) = 1.
Lemma 1.10 (Borel-Cantelli Lemma). Suppose E1, E2, ... is a sequence of events such that
∑_{n=1}^∞ P{E_n} < ∞.
Then, P(E_n i.o.) = 0.
Proof. Given a collection of events {E_n : n ∈ N}, we have for all k,
⋂_{k=1}^∞ ⋃_{n=k}^∞ E_n ⊂ ⋃_{n=k}^∞ E_n.
So for all k,
P(E_n i.o.) ≤ P( ⋃_{n=k}^∞ E_n ) ≤ ∑_{n=k}^∞ P(E_n).
However, by the hypothesis that ∑_{n=1}^∞ P{E_n} is finite, the sum on the right side of the equation converges to zero as k gets arbitrarily large.
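As an informal numerical illustration (not part of the argument; it assumes NumPy and uses the hypothetical events E_n = {U_n < 1/n²} for independent uniform random variables U_n, so that ∑ P(E_n) = ∑ 1/n² < ∞), the following sketch counts how many of the E_n occur along one sample path; the lemma predicts that only finitely many do.

import numpy as np

rng = np.random.default_rng(0)
N = 100_000                      # number of events examined
n = np.arange(1, N + 1)
U = rng.uniform(size=N)          # independent Uniform(0, 1) variables
occurred = U < 1.0 / n**2        # event E_n = {U_n < 1/n^2}, so P(E_n) = 1/n^2

# sum of P(E_n) is finite (pi^2/6), so Borel-Cantelli predicts that only
# finitely many E_n occur; typically just the first few indices appear.
print("number of E_n that occurred:", occurred.sum())
print("largest such n:", n[occurred].max() if occurred.any() else None)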
Definition 1.11. A random variable on a probability space (Ω, F, P) is a map
X : Ω → R that is F-measurable.
We now turn to formalize the notion of independence and dependence with
respect to events and random variables.
Definition 1.12. Let F be a σ-algebra. Sub-σ-algebras G1, G2, ... of F are independent if whenever G_i ∈ G_i for all i ∈ N and i_1, ..., i_n are distinct, then
P(G_{i_1} ∩ G_{i_2} ∩ ... ∩ G_{i_n}) = ∏_{k=1}^n P(G_{i_k}).
Definition 1.13. Random variables X1 , X2 , ... defined on a probability space (Ω, F, P)
are called independent if the σ-algebras σ(X1 ), σ(X2 ), ... are independent.
Definition 1.14. Events E1, E2, ... defined on a probability space (Ω, F, P) are called independent if the corresponding σ-algebras ε1, ε2, ... are independent, where εi = {∅, Ei, Ω\Ei, Ω}.
At this point, we will define the concept of an expectation formally. Doing so,
however, first requires an understanding of what taking an integral with respect to
a measure means exactly. It will be beyond the scope of this paper to develop this
theory in sufficient detail. Interested readers can consult [1] for such a development.
Definition 1.15. Let X be a random variable defined on a probability space
(Ω, F, P). The expectation of X, E[X], is defined by
E[X] = ∫_Ω X dP = ∫_Ω X(w) P(dw).
Definition 1.16. Let X and Y be two random variables defined on a probability
space. We define the covariance of X and Y, Cov[X, Y ], as
Cov[X, Y ] = E[(X − E[X])(Y − E[Y ])].
Definition 1.17. Let X be a random variable defined on a probability space. The
variance of X, V ar[X], is defined as
V ar[X] = Cov[X, X] = E[(X − E[X])2 ].
Definition 1.18. Let X be a random variable defined on a probability space. We
then define the law of X, Lx , as
Lx = P ◦ X −1 , Lx : B → [0, 1].
Thus, Lx defines a probability measure on (R, B). The measure Lx is uniquely determined by the function Fx : R → [0, 1] defined as follows:
Definition 1.19. Let X be a random variable defined on a probability space. The
distribution function of X is defined as
Fx (c) = Lx (−∞, c] = P(X ≤ c) = P{w : X(w) ≤ c}.
The properties of a distribution function remain as you would remember them
from basic probability theory. In particular, we have the following:
(1) Fx : R → [0, 1];
(2) Fx is increasing monotonically. That is, if a ≤ b then Fx (a) ≤ Fx (b);
(3) lim_{a→∞} Fx(a) = 1 and lim_{a→−∞} Fx(a) = 0;
(4) Fx is right-continuous.
Definition 1.20. Let X be a random variable defined on a probability space. We say that X has a probability density function fX if there exists a function fX : R → [0, ∞) such that
P(X ∈ B) = ∫_B fX(x) dx,  B ∈ B.
We now single out one particular distribution, which is intimately tied to the concept of Brownian motion and will remain prominent over the course of this paper.
Definition 1.21. A random variable X has a normal distribution with mean µ
and variance σ 2 if
P{X > x} = (1/√(2πσ²)) ∫_x^∞ e^{−(u−µ)²/(2σ²)} du.
Occasionally, the shorthand N (µ, σ 2 ) is used to refer to this distribution.
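As a quick sanity check on this tail formula (an illustration only, not part of the text; it assumes NumPy and SciPy are available, and the values µ = 1, σ² = 4, x = 2 are arbitrary), one can compare a Monte Carlo estimate of P{X > x} with a numerical evaluation of the integral above.

import numpy as np
from scipy import integrate

mu, sigma2, x = 1.0, 4.0, 2.0     # arbitrary example parameters
sigma = np.sqrt(sigma2)

# numerically integrate the N(mu, sigma^2) density over (x, infinity)
density = lambda u: np.exp(-(u - mu)**2 / (2 * sigma2)) / np.sqrt(2 * np.pi * sigma2)
tail, _ = integrate.quad(density, x, np.inf)

# Monte Carlo estimate of P{X > x}
rng = np.random.default_rng(0)
samples = mu + sigma * rng.standard_normal(1_000_000)
print(tail, (samples > x).mean())  # the two values should nearly agree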
Within the course of this paper, we will also need to concern ourselves with
continuous random variables in two dimensions. Although there is a natural way
to extend the measure theory we have defined thus far to this purpose, it will be
sufficient for our purposes to utilize the standard definitions from basic probability
theory.
Definition 1.22. Let X and Y be random variables defined on a probability space.
These variables are jointly continuous if there exists a joint probability density function fXY such that for all measurable sets of real numbers A and B, we have
P(X ∈ A, Y ∈ B) = ∫_B ∫_A fXY(x, y) dx dy.
Definition 1.23. Let X and Y be jointly continuous random variables defined on a probability space. We define the conditional probability density function fY(y | X = x) as the function satisfying, for every x ∈ R with fX(x) > 0 and every measurable set of real numbers B,
P(Y ∈ B | X = x) = ∫_B fY(y | X = x) dy = ∫_B (fXY(x, y)/fX(x)) dy.
At this point, we are now ready to present the definition of Brownian motion.
Definition 1.24. A stochastic process is a collection of random variables {Wt :
t ∈ T } on some probability space and indexed by some set of times T .
Definition 1.25. A Brownian motion started at x ∈ R is a stochastic process
with the following properties:
(1) W0 = x;
(2) For every 0 ≤ s ≤ t, Wt − Ws has a normal distribution with mean zero and variance t − s, and Wt − Ws is independent of {Wr : r ≤ s};
(3) With probability one, the function t → Wt is continuous.
A Brownian motion started at 0 is termed standard Brownian motion.
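Property (2) suggests a direct way to simulate the process on a finite grid, which we record here as an informal sketch (not part of the construction below; it assumes NumPy, and the grid of 1000 steps on [0, 1] is an arbitrary choice): over a step of length Δt the increment is N(0, Δt), independent of the past, so partial sums of such increments approximate a standard Brownian path. The continuity asserted in property (3) is what Section 2 establishes rigorously.

import numpy as np

rng = np.random.default_rng(0)
n_steps, T = 1000, 1.0
dt = T / n_steps
t = np.linspace(0.0, T, n_steps + 1)

# property (2): increments over disjoint intervals of length dt are
# independent N(0, dt); with W_0 = 0 this is standard Brownian motion.
increments = rng.normal(loc=0.0, scale=np.sqrt(dt), size=n_steps)
W = np.concatenate(([0.0], np.cumsum(increments)))

# quick check of the marginal distribution: Var(W_1) should be close to 1
paths = np.cumsum(rng.normal(scale=np.sqrt(dt), size=(5000, n_steps)), axis=1)
print("sample variance of W_1:", paths[:, -1].var())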
2. Construction
Given the definition of Brownian motion, more work is needed to demonstrate that the conditions imposed on the distributions allow a continuous random process to exist. In this section, we present a construction of Brownian motion, which shows that no contradiction arises.
Definition 2.1. The set of non-negative dyadic rationals is D = ⋃_n D_n, where D_n = {k/2^n : k = 0, 1, 2, ...}.
Definition 2.2. A standard one-dimensional Brownian motion on the dyadic rationals {W_q : q ∈ D} is a random process such that for every n, the random variables W_{k/2^n} − W_{(k−1)/2^n}, k ∈ N, are independent and N(0, 1/2^n).
Proposition 2.3. Suppose X, Y are independent normal random variables, each N(0, 1). If
Z = X/√2 + Y/√2  and  Z̃ = X/√2 − Y/√2,
then Z and Z̃ are independent N(0, 1) variables.
Proof. We have
Z = (1/√2)N(0, 1) + (1/√2)N(0, 1) = N(0, (1/√2)² + (1/√2)²) = N(0, 1).
For convenience, let us define X̂ = X/√2 and Ŷ = Y/√2. We know these variables are each N(0, 1/2) and that Z = X̂ + Ŷ. This allows us to write the joint density for (X̂, Ŷ) as
(1/√(2π(1/2))) e^{−x̂²} · (1/√(2π(1/2))) e^{−ŷ²} = (1/π) e^{−(x̂² + ŷ²)}.
The joint density for (X̂, Z) becomes (1/π) e^{−(x̂² + (z−x̂)²)} and the density for Z becomes (1/√(2π)) e^{−z²/2}. Using these, we can say that the conditional density of X̂ given Z = z is
((1/π) e^{−(x̂² + (z−x̂)²)}) / ((1/√(2π)) e^{−z²/2}) = (1/√(π/2)) e^{−2(x̂ − z/2)²}.
So, conditioned on the value of Z, we know that X̂ is N(Z/2, 1/4). Similarly, conditioned on the value of Z, Ŷ is N(Z/2, 1/4). This allows us to write
X̂ = Z/2 + Z̃/2  and  Ŷ = Z/2 − Z̃/2,
where Z̃ is N(0, 1) and independent of Z. Substituting back X and Y and rearranging terms yields the desired equations.
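The proposition is easy to check numerically (an illustration only, assuming NumPy; the sample size is arbitrary): for independent standard normals X and Y, the rotated pair (Z, Z̃) should show unit sample variances and sample covariance near zero.

import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal(1_000_000)
Y = rng.standard_normal(1_000_000)

Z = X / np.sqrt(2) + Y / np.sqrt(2)
Z_tilde = X / np.sqrt(2) - Y / np.sqrt(2)

# Z and Z_tilde should each be N(0, 1), and, being jointly normal with
# zero covariance, independent; the covariance estimate should be near 0.
print("Var(Z), Var(Z~):", Z.var(), Z_tilde.var())
print("Cov(Z, Z~):", np.cov(Z, Z_tilde)[0, 1])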
Lemma 2.4. Standard Brownian motion exists over the dyadic rationals.
Proof. Define J(k, n) as follows:
J(k, n) = 2^{n/2} [W_{k/2^n} − W_{(k−1)/2^n}].
We start with the assumption that there exists a countable collection of independent standard normal random variables {Z_q : q ∈ D} and work recursively to define W_q. We can start by defining {J(k, 0) = Z_k : k ∈ N}. Now, assume that for some n the variables {J(k, n) : k ∈ N} have been defined using only {Z_q : q ∈ D} in such a way that they are independent N(0, 1) variables. We can then define J(k, n + 1) as follows:
J(2k − 1, n + 1) = J(k, n)/√2 + Z_{(2k+1)/2^{n+1}}/√2,
J(2k, n + 1) = J(k, n)/√2 − Z_{(2k+1)/2^{n+1}}/√2.
Applying Proposition 2.3 repeatedly defines {J(k, n + 1) : k ∈ N} and shows that it is again a collection of independent N(0, 1) variables. Note also that the definitions at consecutive levels are consistent, since J(2k − 1, n + 1) + J(2k, n + 1) = √2 J(k, n). We are then able to construct W_{k/2^n} in a natural way:
W_{k/2^n} = 2^{−n/2} ∑_{j=1}^k J(j, n).
So,
2^{n/2} (W_{k/2^n} − W_{(k−1)/2^n}) = ∑_{j=1}^k J(j, n) − ∑_{j=1}^{k−1} J(j, n) = J(k, n).
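The recursion above can be carried out numerically. The following sketch (an illustrative rendering under my own indexing, assuming NumPy; it is not the paper's notation) refines a path level by level on the dyadic rationals of [0, 1]: each new midpoint value is the average of its two neighbours plus an independent Gaussian correction whose variance is one quarter of the current spacing, which is equivalent to the relations defining J(2k − 1, n + 1) and J(2k, n + 1).

import numpy as np

def dyadic_brownian(levels, rng):
    """Level-by-level refinement of Brownian motion on D_n ∩ [0, 1].

    Start from W_0 = 0 and W_1 ~ N(0, 1); at each level insert midpoints,
    setting W_mid = (W_left + W_right) / 2 + an independent N(0, dt / 4)
    correction, where dt is the current spacing.
    """
    W = np.array([0.0, rng.standard_normal()])   # values at times 0 and 1
    dt = 1.0
    for _ in range(levels):
        mids = (W[:-1] + W[1:]) / 2 + rng.normal(scale=np.sqrt(dt / 4), size=len(W) - 1)
        refined = np.empty(2 * len(W) - 1)
        refined[0::2] = W                        # keep the old grid values
        refined[1::2] = mids                     # interleave the new midpoints
        W, dt = refined, dt / 2
    return np.linspace(0.0, 1.0, len(W)), W

rng = np.random.default_rng(0)
t, W = dyadic_brownian(levels=10, rng=rng)       # values on D_10 ∩ [0, 1]
print(len(W), "points; scaled increment variance ~", np.var(np.diff(W)) * 2**10)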
Lemma 2.5. If {W_q : q ∈ D} is a standard one-dimensional Brownian motion on the dyadic rationals, then, almost surely, the function q ↦ W_q is uniformly continuous on D ∩ [a, b] for every closed interval [a, b]. (We give the proof for [0, 1]; the general case is identical.)
Proof. We begin by defining
M_n = sup{|W_s − W_t| : 0 ≤ s, t ≤ 1, |s − t| ≤ 2^{−n}, s, t ∈ D}.
First, note that M_n → 0 as n → ∞ if and only if the function t ↦ W_t is uniformly continuous on D ∩ [0, 1]. If M_n converges to zero, then given ε > 0 we can choose n with M_n < ε and then pick δ = 2^{−n} to ensure that |W_s − W_t| < ε whenever |s − t| < δ. Conversely, if the function is uniformly continuous, then for any ε > 0 there is a δ > 0, and choosing n with 2^{−n} < δ ensures that M_n ≤ ε.
Keeping this in mind, we define
K_n = max_{k=1,...,2^n} sup{ |W_q − W_{(k−1)/2^n}| : q ∈ D, (k−1)/2^n ≤ q ≤ k/2^n }.
It is clear that K_n ≤ M_n, because every pair of points appearing in the definition of K_n also appears in the definition of M_n. We can then bound M_n above using the triangle inequality: if s, t ∈ D ∩ [0, 1] with |s − t| ≤ 2^{−n} and s lies in [(k−1)/2^n, k/2^n], then t lies in that interval or in an adjacent one, so
|W_s − W_t| ≤ |W_s − W_{(k−1)/2^n}| + |W_{k/2^n} − W_{(k−1)/2^n}| + |W_t − W_{k/2^n}| ≤ 3K_n.
Thus,
K_n ≤ M_n ≤ 3K_n.
We must now demonstrate that K_n → 0. First, let us define the following:
K̃ = sup{W_q : q ∈ D ∩ [0, 1]},
L̃ = max{W_q : q ∈ D_n ∩ [0, 1]},
M̃ = max{W_{k/2^n} : k = 1, ..., 2^n}.
For fixed n and a constant a > 0, define E_k as the event that k/2^n is the first time of level n at which the path reaches the level a. More formally, let E_k be the event
{W_{k/2^n} ≥ a, and W_{j/2^n} < a for j = 1, ..., k − 1}.
From this definition, it is clear that the events E_1, ..., E_{2^n} are pairwise disjoint. That is, E_k ∩ E_j = ∅ for j ≠ k. Moreover, the union of these events is the event that M̃ ≥ a. Lastly, the random variable W_1 − W_{k/2^n} is independent of the event E_k for every k. Therefore, we can write
P[E_k ∩ {W_1 ≥ a}] ≥ P[E_k ∩ {W_1 − W_{k/2^n} ≥ 0}] = P(E_k) P{W_1 − W_{k/2^n} ≥ 0} ≥ (1/2) P(E_k).
Thus,
P{M̃ ≥ a} = ∑_{k=1}^{2^n} P(E_k) ≤ 2 ∑_{k=1}^{2^n} P[E_k ∩ {W_1 ≥ a}] = 2 P{W_1 ≥ a}.
Now note that if K̃ > a, then W_q > a for some q ∈ D ∩ [0, 1], and hence L̃ ≥ a for all sufficiently large n. Since, for a > 0, the events {L̃ ≥ a} and {M̃ ≥ a} coincide, this implies that
P{K̃ > a} ≤ lim_{n→∞} P{M̃ ≥ a} ≤ 2 P{W_1 ≥ a}.
So,
(2.6)    P{K̃ > a} ≤ 2 P{W_1 ≥ a}.
Now, let us remind ourselves that for any a > 0 the Gaussian tail can be bounded as follows:
∫_a^∞ e^{−x²/2} dx ≤ ∫_a^∞ (x/a) e^{−x²/2} dx = (1/a) e^{−a²/2},
so that
P{W_1 ≥ a} ≤ (1/√(2π)) ∫_a^∞ e^{−x²/2} dx ≤ (1/a) e^{−a²/2}.
This, together with Equation (2.6), yields
P{K̃ > a} ≤ (2/a) e^{−a²/2}.
Thus, applying the same bound to −W as well,
P{ sup{|W_q| : q ∈ D ∩ [0, 1]} > a } ≤ (4/a) e^{−a²/2}.
Finally, consider 2^{n/2} K_n, which is the maximum of 2^n variables, each with the same distribution as sup{|W_q| : q ∈ D ∩ [0, 1]}. Given a fixed a ∈ R, the probability that the maximum of a collection of random variables exceeds a is at most the sum of the individual probabilities. We can therefore write
P{K_n ≥ a 2^{−n/2}} ≤ 2^n P{ sup{|W_q| : q ∈ D ∩ [0, 1]} ≥ a } ≤ 2^n (4/a) e^{−a²/2}.
Setting a = 2√n yields
P{K_n ≥ 2√n 2^{−n/2}} ≤ (2/√n)(2/e²)^n  ⇒  ∑_{n=1}^∞ P{K_n ≥ 2√n 2^{−n/2}} < ∞.
Applying the Borel-Cantelli lemma implies that, with probability one, K_n < 2√n 2^{−n/2} for all n large enough. Thus, K_n → 0 as n → ∞, which completes the proof.
Theorem 2.7. Standard Brownian motion exists.
Proof. This follows naturally from uniform continuity. Given ε > 0, Lemma 2.5 yields some δ > 0 such that |W_t − W_s| ≤ ε/2 for all s, t ∈ D ∩ [0, 1] with |s − t| < δ. Pick n_0 ∈ N such that 1/2^{n_0} < δ/2. Now fix t ∈ [0, 1] and, for each n, pick k_n ∈ {0, 1, ..., 2^n} such that 0 ≤ t − k_n/2^n < 1/2^n. Then for n, m > n_0,
|k_n/2^n − k_{n_0}/2^{n_0}| < 1/2^n + 1/2^{n_0} < δ  and  |k_m/2^m − k_{n_0}/2^{n_0}| < 1/2^m + 1/2^{n_0} < δ.
So,
|W_{k_n/2^n} − W_{k_m/2^m}| ≤ |W_{k_n/2^n} − W_{k_{n_0}/2^{n_0}}| + |W_{k_m/2^m} − W_{k_{n_0}/2^{n_0}}| < ε.
Thus (W_{k_n/2^n})_{n ∈ N} is a Cauchy sequence in R and hence converges to a unique limit. Defining W_t to be this limit for every t ∈ [0, 1], and repeating the argument on each interval [m, m + 1], yields an extension of {W_q : q ∈ D} to a process {W_t : t ≥ 0}; the same uniform continuity estimate shows that this extension is continuous, and since its increments are almost sure limits of the Gaussian increments of the dyadic process, it satisfies the distributional requirements of Definition 1.25.
3. Non-Differentiability
Despite being continuous, the random nature of Brownian motion yields many
interesting pathological properties. The most prominent example of this is that
it is nowhere differentiable. We present a proof of this for the interval [0, 1] for
convenience, but this can be easily generalized for any arbitrary closed interval
[a, b].
Theorem 3.1. With probability one, Wt is nowhere differentiable on [0, 1].
Proof. Define M(k, n) and M_n as follows:
M(k, n) = max{ |W_{k/n} − W_{(k−1)/n}|, |W_{(k+1)/n} − W_{k/n}|, |W_{(k+2)/n} − W_{(k+1)/n}| },
M_n = min{ M(1, n), ..., M(n, n) }.
Suppose there exists a t ∈ [0, 1] at which W_t is differentiable. Then, for some M ∈ R,
lim_{r→t} (W_r − W_t)/(r − t) = M.
So, for any ε > 0, there exists δ > 0 such that |W_r − W_t| ≤ (|M| + ε)|r − t| whenever |r − t| < δ.
Given such an ε, pick n_0 such that 4/n_0 < δ. Then, for all n ≥ n_0, we have 4/n ≤ 4/n_0 < δ. This implies that, choosing k such that (k − 1)/n ≤ t < k/n,
|W_{k/n} − W_{(k−1)/n}| ≤ |W_t − W_{(k−1)/n}| + |W_t − W_{k/n}| ≤ (|M| + ε)(2/n) + (|M| + ε)(2/n) = 4(|M| + ε)/n.
Bounding the other two increments in M(k, n) in the same way (the points (k+1)/n and (k+2)/n still lie within 4/n < δ of t) demonstrates that if there exists a t ∈ [0, 1] at which W_t is differentiable, then there exist C, n_0 < ∞ such that M_n ≤ C/n for all n ≥ n_0.
Using property (2) of the definition of Brownian motion, we can see that M(k, n) is the maximum of three independent, normally distributed random variables, each with mean zero and variance 1/n. This implies that for any Brownian motion and any k, n,
P{M(k, n) ≤ C/n} = ( P{|W_{1/n}| ≤ C/n} )³ = ( ∫_0^{C/n} √(2n/π) e^{−nx²/2} dx )³ ≤ ( (C/n) √(2n/π) )³ = ( C√(2/π) (1/√n) )³.
So, there exists a c (depending only on C) such that P{M(k, n) ≤ C/n} ≤ (c/√n)³. Hence,
P{M_n ≤ C/n} ≤ P( ⋃_{k=1}^n {M(k, n) ≤ C/n} ) ≤ ∑_{k=1}^n P{M(k, n) ≤ C/n} ≤ c³/√n.
Thus,
lim_{n→∞} P{M_n ≤ C/n} = 0.
Now suppose that W_t is differentiable at some t ∈ [0, 1] with positive probability. Since we may take C and n_0 to range over the natural numbers, and a countable union of null sets is null, there would then exist fixed C, n_0 ∈ N such that the event {M_n ≤ C/n for all n ≥ n_0} has positive probability. But this event is contained in {M_n ≤ C/n} for every n ≥ n_0, and we have just shown that P{M_n ≤ C/n} → 0, a contradiction. Thus, W_t is almost surely nowhere differentiable on [0, 1].
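Numerically, non-differentiability shows up as difference quotients that fail to stay bounded as the mesh is refined. The sketch below (an illustration under my own discretization, assuming NumPy; the grid sizes are arbitrary) computes the quantity n·M_n from the proof on finer and finer grids of one simulated path; it should drift upward rather than stabilize, in line with the conclusion that M_n ≤ C/n eventually fails for every fixed C.

import numpy as np

rng = np.random.default_rng(0)
J = 16                                   # finest level: 2^J steps on [0, 1]
N = 2**J
dW = rng.normal(scale=np.sqrt(1.0 / N), size=N)
W = np.concatenate(([0.0], np.cumsum(dW)))    # one sampled path on the fine grid

for j in range(6, J + 1, 2):
    n = 2**j
    Wn = W[:: N // n]                    # values W_{k/n}, k = 0, ..., n
    inc = np.abs(np.diff(Wn))            # |W_{k/n} - W_{(k-1)/n}|
    # M(k, n): max of three consecutive increments; M_n: min over k
    M_kn = np.maximum(inc[:-2], np.maximum(inc[1:-1], inc[2:]))
    print(f"n = {n:6d}   n * M_n = {n * M_kn.min():.3f}")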
4. Invariance Properties
In this section, we begin to prove some basic properties of Brownian motions
that will become invaluable as we start delving into more complex proofs.
Lemma 4.1 (Scaling invariance). Suppose Wt is a standard Brownian motion and let a > 0. Then, the process X_t = (1/a) W_{a²t} is also a standard Brownian motion.
Proof. Under scaling, continuity of paths and independence of increments still hold. Considering X_t − X_s = (1/a)(W_{a²t} − W_{a²s}), the increment is distributed as (1/a) N(0, a²(t − s)). This has a mean of zero and a variance of (1/a²) a²(t − s) = t − s.
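A quick Monte Carlo check of the scaling relation (an illustration only, assuming NumPy; a = 2 and the grid and sample sizes are arbitrary choices): reading a simulated path of W at time a²t and dividing by a should give a process whose variances and covariances match those of a standard Brownian motion.

import numpy as np

rng = np.random.default_rng(0)
a, T, n_steps, n_paths = 2.0, 1.0, 400, 10_000
dt = a**2 * T / n_steps                      # W is simulated on [0, a^2 T]
W = np.cumsum(rng.normal(scale=np.sqrt(dt), size=(n_paths, n_steps)), axis=1)

def X(t):
    # X_t = (1/a) * W_{a^2 t}: read W at the grid index corresponding to a^2 t
    idx = int(round(t * n_steps / T)) - 1
    return W[:, idx] / a

for t in (0.25, 0.5, 1.0):
    print(f"t = {t}:  Var(X_t) ~ {X(t).var():.3f}")          # should be ~ t
print("Cov(X_0.25, X_0.5) ~", np.cov(X(0.25), X(0.5))[0, 1])  # should be ~ 0.25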
Theorem 4.2 (Time inversion). Suppose Wt is a standard Brownian motion.
Then, the process X_t defined by
X_t = 0 if t = 0,  and  X_t = t W_{1/t} if t > 0,
is a standard Brownian motion.
Proof. That the increments of this process have expected value zero is immediate. To see that the other properties are satisfied, note that for a standard Brownian motion we have, for s, t ≥ 0,
Cov[W_t, W_{t+s}] = E[W_t W_{t+s}] = E[W_t (W_{t+s} − W_t + W_t)] = E[W_t²] + E[W_t] E[W_{t+s} − W_t] = t.
Then, for X_t (with t > 0) we get
Cov[X_t, X_{t+s}] = Cov[t W_{1/t}, (t + s) W_{1/(t+s)}] = t(t + s) Cov[W_{1/t}, W_{1/(t+s)}] = t(t + s) · (1/(t + s)) = t.
Thus,
Cov[X_t, X_{t+s} − X_t] = Cov[X_t, X_{t+s}] − Var[X_t] = t − t = 0,
and
Var[X_{t+s} − X_t] = Var[X_{t+s}] + Var[X_t] − 2 Cov[X_{t+s}, X_t] = (t + s) + t − 2t = s.
This shows that the increments have the right variance. Independence of increments holds because X_{t+s} − X_t and X_t are jointly normal with zero covariance.
Lastly, we need to demonstrate continuity. For t > 0 this is clear. Now, recall that the distribution of {X_t} restricted to the rationals is the same as that of a Brownian motion. This implies that, almost surely,
lim_{t→0, t ∈ Q} X_t = 0.
Since X is continuous on (0, ∞), we have sup_{0<t<ε} |X_t| = sup_{q ∈ Q ∩ (0,ε)} |X_q|, and the right-hand side tends to zero; hence lim_{t→0} X_t = 0 = X_0, thus completing the proof.
Corollary 4.3 (Law of large numbers). Almost surely,
lim_{t→∞} W_t/t = 0.
Proof. Let X_t be defined as in the above theorem. Then W_t/t = X_{1/t}, so, almost surely,
lim_{t→∞} W_t/t = lim_{t→∞} X_{1/t} = lim_{s→0} X_s = 0.
5. Markov Properties
In this section, we begin to discuss multi-dimensional Brownian motion. Let us
start with this definition.
Definition 5.1. If W_1, ..., W_d are all independent Brownian motions started in x_1, ..., x_d, then the random process W_t given by
W_t = (W_1(t), ..., W_d(t))
is called a d-dimensional Brownian motion started in (x_1, ..., x_d). If W_t starts at the origin, it is termed a standard d-dimensional Brownian motion.
Definition 5.2. A filtration on a probability space (Ω, F, P) is a family (F(t) : t ≥ 0) of σ-algebras such that F(s) ⊂ F(t) ⊂ F whenever s ≤ t. A probability space with a filtration is termed a filtered probability space. A random process {X_t : t ≥ 0} defined on (Ω, F, P) is adapted if X_t is F(t)-measurable for all t ≥ 0.
Keeping these definitions in mind, we begin by establishing the simple Markov
Property. Intuitively, this states that knowing the current position of a Brownian
motion yields as much information as knowing the entire history of positions up to
that point.
Theorem 5.3 (Markov Property). Let {W_t : t ≥ 0} be a Brownian motion started in x ∈ R^d. Then, for every s ≥ 0, the process {W_{t+s} − W_s : t ≥ 0} is a Brownian motion started at the origin and is independent of {W_t : 0 ≤ t ≤ s}.
Proof. This follows directly from the independence of increments of Brownian motion.
However, this is rather trivial. A preliminary means of making this property slightly stronger is establishing that the increments of Brownian motion after time s are independent of the information available an infinitesimal amount of time after s.
Definition 5.4. The germ σ-algebra is defined as F⁺(0), where
F⁺(t) = ⋂_{s>t} F⁰(s)
and {F⁰(t) : t ≥ 0} is the filtration given by F⁰(t) = σ(W_s : 0 ≤ s ≤ t).
Definition 5.5. The tail σ-algebra T of a Brownian motion is defined as
T = ⋂_{t≥0} G(t),
where G(t) is the σ-algebra generated by {W_s : s ≥ t}.
Theorem 5.6. For all s ≥ 0, the random process {W_{t+s} − W_s : t ≥ 0} is independent of F⁺(s).
Proof. By continuity, we can write the following for a strictly decreasing sequence {s_n : n ∈ N} converging to s:
W_{t+s} − W_s = lim_{n→∞} (W_{t+s_n} − W_{s_n}).
However, the Markov property shows that, for each n, the right side is independent of F⁰(s_n) ⊇ F⁺(s); hence the limit is independent of F⁺(s).
Definition 5.7. We define P_x as the probability measure which makes the random process {W_t : t ≥ 0} a d-dimensional Brownian motion started in x ∈ R^d.
Theorem 5.8 (Blumenthal's 0-1 law). Let x ∈ R^d and A ∈ F⁺(0). Then P_x(A) ∈ {0, 1}.
Proof. By Theorem 5.6 applied with s = 0, the σ-algebra σ(W_t − W_0 : t ≥ 0) is independent of F⁺(0); since W_0 = x is deterministic under P_x, this σ-algebra is σ(W_t : t ≥ 0). However, F⁺(0) ⊂ σ(W_t : t ≥ 0), so any A chosen from F⁺(0) must be independent of itself. This can only be the case if P_x(A)² = P_x(A), that is, if A has probability zero or one.
Theorem 5.9 (Kolmogorov's 0-1 law). Let x ∈ R^d and A ∈ T. Then P_x(A) ∈ {0, 1}.
Proof. Let A ∈ T. We can map the tail σ-algebra into the germ σ-algebra of a time-inverted Brownian motion as follows. Let W_t be a standard Brownian motion (the case of a general starting point x follows by considering W_t − x, which generates the same tail σ-algebra) and define X_u = u W_{1/u} for u > 0, with X_0 = 0; by Theorem 4.2, X is again a Brownian motion. Now define S(t) = σ(W_s : s ≥ t). Since W_s = s X_{1/s}, we have, for t > 0,
S(t) = σ(s X_{1/s} : s ≥ t) = σ(X_{1/s} : s ≥ t) = σ(X_u : 0 < u ≤ 1/t).
Thus,
T = ⋂_{t>0} S(t) = ⋂_{t>0} σ(X_u : 0 < u ≤ 1/t) ⊆ F_X⁺(0),
where F_X⁺(0) denotes the germ σ-algebra of X. Then, for any x ∈ R^d, P_x(A) is either zero or one by Blumenthal's 0-1 law applied to X.
Definition 5.10. A random variable T with values in [0, ∞) defined on a filtered
probability space is called a stopping time if {T < t} ∈ F(t) for every t ≥ 0. It is
called a strict stopping time if for every t ≥ 0, {T ≤ t} ∈ F(t).
Remark 5.11. Every strict stopping time is also a stopping time.
Theorem 5.12. Every stopping time with respect to the filtration {F⁺(t) : t ≥ 0} is a strict stopping time.
Proof. First, let us establish the right-continuity of {F⁺(t) : t ≥ 0}. To do this, we can write
F⁺(t) = ⋂_{ε>0} F⁰(t + ε) = ⋂_{n=1}^∞ ⋂_{k=1}^∞ F⁰(t + 1/n + 1/k) = ⋂_{n=1}^∞ F⁺(t + 1/n).
Thus,
{T ≤ t} = ⋂_{k=1}^∞ {T < t + 1/k} ∈ ⋂_{n=1}^∞ F⁺(t + 1/n) = F⁺(t).
Theorem 5.13 (Strong Markov property). For every almost surely finite stopping time T, the process {WT +t − WT : t ≥ 0} is a standard Brownian motion
independent of F + (T ).
Proof. Let T be a stopping time. We can then define
T_n = (m + 1) 2^{−n}  whenever  m/2^n ≤ T < (m + 1)/2^n.
This can be thought of as a discrete approximation of T which stops at the first dyadic rational of level n after T. Keeping in mind that this definition implies that T_n is also a stopping time, we then define the following:
W_k(t) = W_{t + k/2^n} − W_{k/2^n}  and  W_k = {W_k(t) : t ≥ 0},
W_*(t) = W_{t + T_n} − W_{T_n}  and  W_* = {W_*(t) : t ≥ 0}.
Now, take E ∈ F⁺(T_n) and an event {W_* ∈ A}. We have
P({W_* ∈ A} ∩ E) = ∑_{k=1}^∞ P({W_k ∈ A} ∩ E ∩ {T_n = k/2^n}).
Note, however, that E ∩ {T_n = k/2^n} ∈ F⁺(k/2^n), which by Theorem 5.6 is independent of {W_k ∈ A}. Thus, we have
P({W_* ∈ A} ∩ E) = ∑_{k=1}^∞ P{W_k ∈ A} P(E ∩ {T_n = k/2^n}).
Now, by the Markov property, each W_k is a standard Brownian motion, so P{W_k ∈ A} does not depend on k; write P{W ∈ A} for this common value. This yields
∑_{k=1}^∞ P{W_k ∈ A} P(E ∩ {T_n = k/2^n}) = P{W ∈ A} ∑_{k=1}^∞ P(E ∩ {T_n = k/2^n}) = P{W ∈ A} P(E).
Thus, W_* has the law of a standard Brownian motion (take E = Ω) and is independent of every such E, hence independent of F⁺(T_n). Now, recall that (T_n) is a decreasing sequence converging to T, so that F⁺(T) ⊂ F⁺(T_n); hence the Brownian motion {W_{s+T_n} − W_{T_n} : s ≥ 0} is independent of F⁺(T). Then, the random process {W_{r+T} − W_T : r ≥ 0}, whose increments are given by
W_{s+t+T} − W_{t+T} = lim_{n→∞} (W_{s+t+T_n} − W_{t+T_n}),
has independent increments distributed as N(0, s), is almost surely continuous, and is independent of F⁺(T). Thus, it is a Brownian motion independent of F⁺(T).
6. Applications
Theorem 6.1 (Reflection principle). If T is a stopping time and {W_t : t ≥ 0} is a standard Brownian motion, then the random process
W*_t = W_t if 0 ≤ t ≤ T,  and  W*_t = 2W_T − W_t if t ≥ T,
is a standard Brownian motion.
Proof. Consider the following random processes:
(6.2)    {W_{t+T} − W_T : t ≥ 0},
(6.3)    {−(W_{t+T} − W_T) : t ≥ 0}.
Using the strong Markov property, we know that both (6.2) and (6.3) are Brownian motions and independent of the beginning process {W_t : 0 ≤ t ≤ T}. Consider that both (6.2) and (6.3) begin at the end point of this beginning process. So, if we attach (6.2) to the end point of the beginning process, we merely recover the continuous process {W_t : t ≥ 0}. If we attach (6.3) to the beginning process, then for all t ≥ T we have W_T − (W_t − W_T) = 2W_T − W_t, which gives {W*_t : t ≥ 0}. Because (6.2) and (6.3) have the same distribution and are Brownian motions independent of the beginning process, we therefore know that both of these concatenated processes have the same distribution and are Brownian motions.
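The "attach the reflected tail" picture in the proof is easy to reproduce in simulation. The sketch below (an illustration, assuming NumPy; the level a = 1, horizon, grid, and path count are arbitrary) reflects each simulated path at the first time it reaches a and checks that the endpoint of the reflected paths has the same variance as the original, consistent with both processes being standard Brownian motions.

import numpy as np

rng = np.random.default_rng(0)
a, T, n_steps, n_paths = 1.0, 2.0, 1000, 5000
dt = T / n_steps
W = np.cumsum(rng.normal(scale=np.sqrt(dt), size=(n_paths, n_steps)), axis=1)

# T_a = first grid time at which the path reaches level a (if it does at all)
hit = W >= a
W_star = W.copy()
for i in range(n_paths):
    if hit[i].any():
        k = hit[i].argmax()                  # index of the first hitting time
        # beyond T_a, replace W_t by 2 W_{T_a} - W_t (reflection at the hit value)
        W_star[i, k:] = 2 * W[i, k] - W[i, k:]

# both processes should be standard Brownian motions; compare Var(W_T) ~ T
print("Var of W_T :", W[:, -1].var())
print("Var of W*_T:", W_star[:, -1].var())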
We now focus on the properties of the zero set of Brownian motion.
Theorem 6.4. Let {W_t : t ≥ 0} be a Brownian motion. Then, almost surely, the zero set of Brownian motion
Zeros = {t ≥ 0 : W_t = 0}
is a closed set with no isolated points.
Proof. Note that {0} is a closed set. The zero set being closed is then immediate from the continuity of Brownian motion, as the preimage of a closed set under a continuous function is closed. Now, consider the stopping times τ_q = inf{t ≥ q : W_t = 0} for the rationals q ∈ [0, ∞). Each τ_q is an almost surely finite stopping time, and the infimum is attained because the set {t ≥ q : W_t = 0} is closed. Now, apply the strong Markov property to τ_q. This yields a new Brownian motion started at 0, namely {W_{t+τ_q} − W_{τ_q} : t ≥ 0}. However, a standard Brownian motion almost surely has a zero in every interval (0, ε) (a consequence of Blumenthal's 0-1 law). Thus, almost surely, τ_q is not isolated from the right, simultaneously for all positive rationals q (a countable union of null sets being null).
Now, consider the remaining points of the zero set, those that do not equal τ_q for any rational q. Fix such a point t, and pick a sequence of rationals q(n) increasing to t. Then q(n) ≤ τ_{q(n)} ≤ t, so the sequence τ_{q(n)} converges to t, and since t is not equal to any τ_q, these zeros are distinct from t. Thus, t is not isolated from the left.
An interesting corollary to this is that the zero set is uncountable.
Theorem 6.5. A nonempty closed set with no isolated points is uncountable.
Proof. Let A be a nonempty closed set with no isolated points. Since every point of A is a limit point of A, the set cannot be finite. Thus, A is either countably or uncountably infinite. Suppose that it is countable, so that we can write A = {a_1, a_2, a_3, ...}. We construct bounded open intervals U_n as follows. Let U_1 = (a_1 − 1, a_1 + 1). Given U_n with U_n ∩ A ≠ ∅, the points of A in U_n are not isolated, so U_n ∩ A is infinite; we may therefore pick a point of A in U_n different from a_n and choose an open interval U_{n+1} around it such that
(1) Ū_{n+1} ⊆ U_n;
(2) Ū_{n+1} does not contain the point a_n;
(3) U_{n+1} ∩ A ≠ ∅.
Then, consider the set V = ⋂_{n∈N} (Ū_n ∩ A). Each of the sets Ū_n ∩ A is compact and non-empty, and these sets are nested. Because an intersection of nested non-empty compact sets is non-empty, V contains a point of A; by (2), this point differs from every a_n, so it is not enumerated in {a_1, a_2, a_3, ...}. This yields a contradiction.
7. Derivative Markov Processes
Using the tools that we have built up to this point, we can now show the existence of several important Markov processes embedded within Brownian motion in
interesting ways.
Definition 7.1. A function p : [0, ∞) × R^d × B → R, where B denotes the Borel σ-algebra over R^d, is a Markov transition kernel provided that the following hold:
(1) p(·, ·, A) is measurable as a function of (t, x) for each A ∈ B;
(2) p(t, x, ·) is a Borel probability measure on R^d for all t ≥ 0 and x ∈ R^d; when integrating a function f with respect to this probability measure, we write ∫ f(y) p(t, x, dy);
(3) for all A ∈ B, x ∈ R^d and t, s > 0,
p(t + s, x, A) = ∫_{R^d} p(t, y, A) p(s, x, dy).
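For Brownian motion itself the transition kernel is the Gaussian kernel p(t, x, A) = ∫_A (2πt)^{−1/2} e^{−(y−x)²/(2t)} dy (by property (2) of Definition 1.25), and condition (3) is the Chapman-Kolmogorov equation. The sketch below (an illustrative numerical check in one dimension, assuming NumPy and SciPy; the particular values of t, s, x and the set A are arbitrary) verifies condition (3) for this kernel by numerical integration.

import numpy as np
from scipy import integrate

def gauss_density(t, x, y):
    # transition density of one-dimensional Brownian motion
    return np.exp(-(y - x)**2 / (2 * t)) / np.sqrt(2 * np.pi * t)

def p(t, x, A):
    # p(t, x, A) for an interval A = (lo, hi)
    lo, hi = A
    val, _ = integrate.quad(lambda y: gauss_density(t, x, y), lo, hi)
    return val

t, s, x, A = 0.5, 1.5, 0.2, (-1.0, 0.7)      # arbitrary example values

lhs = p(t + s, x, A)
# condition (3): integrate p(t, y, A) against the measure p(s, x, dy)
rhs, _ = integrate.quad(lambda y: p(t, y, A) * gauss_density(s, x, y), -np.inf, np.inf)
print(lhs, rhs)                              # the two values should agree closely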
Theorem 7.2. For any a ≥ 0, define the stopping times
T_a = inf{t ≥ 0 : W_t = a}.
Then, {T_a : a ≥ 0} is an increasing Markov process with transition kernel given by the densities
(7.3)    p(a, t, s) = ( a / √(2π(s − t)³) ) e^{−a²/(2(s−t))} 1{s > t}.
This is called the stable subordinator of index 1/2.
Proof. Fix a, b such that 0 ≤ b ≤ a. Now, note that for all t ≥ 0, we have
{T_a − T_b = t} = {W_{T_b+s} − W_{T_b} < a − b for all s < t, and W_{T_b+t} − W_{T_b} = a − b}.
Using the strong Markov property, we know that this event is independent of F⁺(T_b). We can therefore establish independence from {T_d : d ≤ b}, which, in turn, establishes the Markov property of {T_a : a ≥ 0}. Now, we are able to determine the transition kernel through the reflection principle. We have
P{T_a − T_b ≤ t} = P{T_{a−b} ≤ t} = P{ max_{0≤s≤t} W_s ≥ a − b } = 2 P{W_t ≥ a − b} = 2 ∫_{a−b}^∞ (1/√(2πt)) e^{−x²/(2t)} dx.
Using the substitution x = √(t/s)(a − b) yields the new integral
∫_0^t ( (a − b)/√(2πs³) ) e^{−(a−b)²/(2s)} ds,
so T_a − T_b has the density s ↦ ( (a − b)/√(2πs³) ) e^{−(a−b)²/(2s)} 1{s > 0}, which is exactly the transition density (7.3) with a replaced by a − b and s − t by s.
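The distribution of T_a can be checked against simulation (an illustration only, assuming NumPy and SciPy; the level a = 1, the horizon, and the grid are my own choices). The sketch compares the empirical probability P{T_a ≤ 1} from simulated paths with the integral of the density above and with the reflection-principle value 2P{W_1 ≥ a}.

import numpy as np
from scipy import integrate, stats

rng = np.random.default_rng(0)
a, T, n_steps, n_paths = 1.0, 1.0, 1000, 10_000
dt = T / n_steps
W = np.cumsum(rng.normal(scale=np.sqrt(dt), size=(n_paths, n_steps)), axis=1)

# empirical P{T_a <= 1}: fraction of paths reaching level a by time 1
# (slightly underestimated, since a path may cross a between grid points)
empirical = (W.max(axis=1) >= a).mean()

# the same probability from the density of T_a, i.e. (7.3) with t = 0
density = lambda s: a * np.exp(-a**2 / (2 * s)) / np.sqrt(2 * np.pi * s**3)
from_density, _ = integrate.quad(density, 0, 1)

# and directly from the reflection principle: 2 P{W_1 >= a}
from_reflection = 2 * (1 - stats.norm.cdf(a))
print(empirical, from_density, from_reflection)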
A similar sort of process is embedded within two-dimensional Brownian motion.
Theorem 7.4. Let {W_t : t ≥ 0} = (W_1(t), W_2(t)) be a two-dimensional Brownian motion. Given a fixed a ≥ 0, we then define
V(a) = {(x, y) ∈ R² : x = a}
and let T(a) be the first hitting time of V(a). Defining a new process {X(a) : a ≥ 0} such that X(a) = W_2(T(a)) then yields a Markov process with transition kernel given by
(7.5)    p(a, x, A) = (1/π) ∫_A a / (a² + (x − y)²) dy.
This is termed the Cauchy process.
Proof. First, note that T(a) < T(b) for all a < b. This, together with the strong Markov property of Brownian motion applied to the stopping time T(a), confirms that the Markov property applies to X(a).
Now, recall that Theorem 7.2 confirms that T(a) has the density given by (7.3). This, coupled with the independence of T(a) from W_2, allows us to write the density of W_2(T(a)) as
∫_0^∞ ( a/√(2πs³) ) e^{−a²/(2s)} · ( 1/√(2πs) ) e^{−x²/(2s)} ds.
Substituting σ = (a² + x²)/(2s) then yields
∫_0^∞ ( a e^{−σ} / (π(a² + x²)) ) dσ = a / (π(a² + x²)).
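This, too, can be seen in simulation (an illustration only, assuming NumPy; the level a = 1 and the sample size are arbitrary). It uses the fact, implicit in the proof of Theorem 7.2, that P{T_a ≤ t} = 2P{W_t ≥ a} = P{a²/Z² ≤ t} for a standard normal Z, so T(a) can be sampled as a²/Z²; since W_2 is independent of T(a), X(a) = W_2(T(a)) can be sampled as √(T(a)) times an independent standard normal. Its sample quantiles should match those of a Cauchy distribution with scale a, as (7.5) predicts.

import numpy as np

rng = np.random.default_rng(0)
a, n_samples = 1.0, 1_000_000

# sample T(a) via T(a) =_d a^2 / Z^2, and X(a) = W_2(T(a)) = sqrt(T(a)) * Z'
Z = rng.standard_normal(n_samples)
Z_prime = rng.standard_normal(n_samples)
T_a = a**2 / Z**2
X_a = np.sqrt(T_a) * Z_prime

# compare sample quantiles of X(a) with Cauchy(scale a) quantiles,
# i.e. a * tan(pi * (q - 1/2))
qs = np.array([0.6, 0.75, 0.9])
print("sample quantiles:", np.quantile(X_a, qs))
print("Cauchy quantiles:", a * np.tan(np.pi * (qs - 0.5)))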
Acknowledgments. It is a pleasure to thank my mentor, Marcelo Alvisio for his
helpful critiques on the composition of this paper, as well as his help in building
up the requisite level of mathematical maturity needed to explore this fascinating
topic. I would also like to voice my appreciation for Peter May and the rest of
the faculty who gave up their time to organize and teach in the program. Without
their help, this paper would not have been possible.
References
[1] David Williams. Probability with Martingales. Cambridge University Press, 1991.
[2] Sheldon M. Ross. A First Course in Probability. Prentice Hall, 2009.
[3] Gregory F. Lawler. Random Walk and the Heat Equation. University of Chicago Press, 2010.
[4] Peter Mörters and Yuval Peres. Brownian Motion. Cambridge University Press, 2010.