Advanced Stochastic Calculus I
Fall 2007
Prof. K. Ramanan∗
Chris Almost†
Course website available under Dr. Ramanan’s website. These notes were originally compiled during the Fall semester of 2007, with updates made during the
Fall semester of 2009.
∗ [email protected]
† [email protected]
Contents

0 Review of some probability theory

1 Brownian Motion
  1.1 Introduction to stochastic processes
  1.2 Construction of Brownian motion
  1.3 Sample path properties
  1.4 Distributional properties
  1.5 Markov property

2 Martingales
  2.1 Martingale convergence theorem
  2.2 Continuous Martingales
  2.3 Applications
  2.4 Lévy processes
  2.5 Doob-Meyer decomposition

3 Stochastic Integration
  3.1 Riemann-Stieltjes Integration
  3.2 Construction of the Itô integral
  3.3 Characterization of the Stochastic Integral
  3.4 Stochastic Integration
  3.5 Integration by parts formula for stochastic integrals
  3.6 Fisk-Stratonovich integral
  3.7 Applications of Itô's formula

Index
0 Review of some probability theory
For this course we let (Ω, F , P) be a probability space and (S, S ) be a (sufficiently
nice) topological space. The Borel σ-algebra B(S) is the σ-algebra generated by
the open sets, B(S) = σ(S ).
0.0.1 Definition. X : (Ω, F) → (S, S) is called a random element (or an F-measurable function) if X^{−1}(A) ∈ F for all A ∈ S.
We will also use the terms random variable, random vector, random process, or
random measure, as appropriate for the codomain.
0.0.2 Definition. (Comparison of random elements)
(i) If P[X = X′] = 1 then we say that X and X′ are equal a.s. or indistinguishable.
(ii) If P[X ∈ A] = P[X′ ∈ A] for all A ∈ S then we say that X and X′ are equal in distribution.
Equality in distribution can be defined for random elements defined on different probability spaces (but they must have the same codomain).
0.0.3 Example. Let Ω = {H, T} and P be the uniform probability (i.e. Bernoulli with parameter 1/2). Let X(H) = 0 = X′(T) and X(T) = 1 = X′(H). Then P[X = 1] = P[X′ = 1], so they are equal in distribution, but they are not equal a.s.; in fact they are unequal a.s.
0.0.4 Definition. (Convergence of random variables)
(i) X_n converges P-a.s. to X if P[lim sup_n |X_n − X| > 0] = 0.
(ii) X_n converges in probability to X (written X_n →(p) X) if
lim_{n→∞} P[|X_n − X| ≥ ε] = 0
for all ε > 0.
(iii) X_n converges in distribution to X (written X_n →(d) X) if
P[X_n ∈ A] → P[X ∈ A]
for all A such that P[X ∈ ∂A] = 0. Equivalently, X_n →(d) X if the distribution functions of X_n converge pointwise to the distribution function of X at every point of continuity of that function. Equivalently, X_n →(d) X if
E[f(X_n)] → E[f(X)]
for all bounded continuous functions f.
Remark. When trying to prove a sequence of random variables converges in probability, Markov’s inequality is often very useful.
Now let us recall some important theorems. Let X_i, i ∈ ℕ, be i.i.d. r.v.'s such that E|X_i| < ∞, and set µ = E[X_i] and S_n = Σ_{i=1}^n X_i. Khintchine's weak law of large numbers says that
S_n / n →(p) µ,
and Kolmogorov's strong law of large numbers says that
S_n / n → µ P-a.s.
If σ² = E[|X_1 − µ|²] < ∞ (i.e. the r.v.'s have finite variance) then the central limit theorem states
(S_n − nµ) / (σ√n) →(d) N(0, 1),
where of course the right hand side may be replaced by any standard Gaussian random element. The appearance in this theorem of the normal distribution is a big part of why this distribution comes up everywhere.
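A quick simulation makes these statements concrete. The sketch below (the function name, the Uniform(−1, 1) distribution, and all sample sizes are illustrative choices, not from the notes) checks the LLN numerically, and checks one consequence of the CLT, namely that the normalized sum falls at or below 0 about half the time:

```python
import math
import random

def lln_clt_demo(n=20000, trials=2000, m=500, seed=0):
    """Empirical check of the SLLN and CLT for i.i.d. Uniform(-1, 1)
    draws, which have mean mu = 0 and variance sigma^2 = 1/3."""
    rng = random.Random(seed)
    mu, sigma = 0.0, math.sqrt(1.0 / 3.0)

    # LLN: for one long run, S_n / n should be close to mu.
    s = sum(rng.uniform(-1.0, 1.0) for _ in range(n))
    lln_error = abs(s / n - mu)

    # CLT: (S_m - m*mu) / (sigma * sqrt(m)) is approximately N(0, 1),
    # so it is <= 0 about half the time over many independent trials.
    below = 0
    for _ in range(trials):
        sm = sum(rng.uniform(-1.0, 1.0) for _ in range(m))
        if (sm - m * mu) / (sigma * math.sqrt(m)) <= 0:
            below += 1
    return lln_error, below / trials
```

The returned pair should be (small number, roughly 0.5).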
1 Brownian Motion

1.1 Introduction to stochastic processes
1.1.1 Definition. A stochastic process on (Ω, F , P) with state space (S, S ) is (equivalently) a
(i) (one-parameter) family of random variables, {X t : Ω → S | t ∈ [0, ∞)};
(ii) random element of R[0,∞) , X = {X (·) (ω) : [0, ∞) → S | ω ∈ Ω};
(iii) random element of two variables X : [0, ∞) × Ω → S.
Notice that the concept of “measurability” would seem a priori to be different
for each of the three definitions.
1.1.2 Definition. A stochastic process is said to be measurable if X is measurable
as a random element of two variables, i.e. if for all A ∈ S ,
{(t, ω) | X (t, ω) ∈ A} ∈ B[0, ∞) × F .
Warning: In K&S the authors write B(R^[0,∞)) = ⊗_{[0,∞)} B(R), which may or may not be true according to our definitions. (Think about this?)
1.1.3 Definition. (Comparison of random processes) Let X and X′ be random processes.
(i) X and X′ are indistinguishable if P[X_t = X′_t for all t] = 1.
(ii) X is said to be a modification of X′ if P[X_t = X′_t] = 1 for all t.
(iii) If P[X ∈ A] = P[X′ ∈ A] for all A ∈ B(R^[0,∞)) then X and X′ are equal in distribution.
(iv) If for every t_1, ..., t_n we have
P[(X_{t_1}, ..., X_{t_n}) ∈ A] = P[(X′_{t_1}, ..., X′_{t_n}) ∈ A]
for all A ∈ B(R^n), then X and X′ are said to have the same finite dimensional marginal distributions (or fi.di. distributions).
As before, the definitions of equality in distribution and having the same fi.di.
distributions can be extended to processes defined on different probability spaces.
1.1.4 Definition. A sequence of random elements X_n converges weakly to X (written X_n →(w) X) if E[f(X_n)] → E[f(X)] for every bounded continuous function f : S → R.
1.2 Construction of Brownian motion
For reasons that will become clear in the section on Donsker’s invariance principle,
we look for a process B = {B t , t ≥ 0} with the following properties.
(i) B0 = 0 and B t ∼ N (0, t) for all t;
(ii) B has stationary increments (or homogeneous increments), i.e. B t − Bs has the
same distribution as B t−s for all s < t;
(iii) B has independent increments, i.e. for s < t, B t − Bs is independent of {Bu |
0 ≤ u ≤ s}; and
(iv) B has continuous paths.
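Properties (i)–(iii) already dictate how to simulate such a process on a grid: successive independent N(0, Δt) increments. A minimal sketch (the function name and parameter values are illustrative choices):

```python
import math
import random

def brownian_path(T=1.0, n_steps=1000, seed=42):
    """Sample a discretized path satisfying properties (i)-(iii): B_0 = 0
    and independent stationary increments B_t - B_s ~ N(0, t - s).
    Returns the path values on the grid k*T/n_steps, k = 0, ..., n_steps."""
    rng = random.Random(seed)
    dt = T / n_steps
    b = [0.0]
    for _ in range(n_steps):
        b.append(b[-1] + rng.gauss(0.0, math.sqrt(dt)))
    return b
```

Over many paths the endpoint has the N(0, T) law required by property (i); continuity (iv) is what the construction below must supply for the limiting object.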
The canonical version of a random element X : (Ω, F , P) → (S, S ) is constructed as follows. The image measure P X −1 of X is defined by P X −1 (A) = P[X ∈
A] for A ∈ B(S). There is always a random element defined on (S, S , P X −1 )
with the same distribution as X , namely the identity function. Thus our goal is to
construct a measure on (C[0, 1], B(C[0, 1])) that satisfies properties 1–3.
1.2.1 Lemma. Given 0 ≤ t_1 < t_2 < t_3, any process B that satisfies properties 1–3 must have the following joint probability density function:
f_{B_{t_1}, B_{t_2}, B_{t_3}}(x, y, z) = p(t_1; 0, x) p(t_2 − t_1; x, y) p(t_3 − t_2; y, z),
where p(t; x, y) := (1/√(2πt)) exp(−(y − x)²/(2t)).
This lemma extends easily to any finite number of times and it is seen that
properties 1–3 determine the fi.di. distributions of B.
To use Carathéodory’s extension theorem to construct a measure on (G, G ) one
must:
(i) Define a finitely additive set function µ0 on some algebra or π-system C ⊆
G.
(ii) Show that µ0 is countably additive on C .
(iii) Carathéodory’s theorem allows us to conclude that µ0 may be extended as
a measure to the completion of σ(C ) ⊆ G .
One could also start with a reference measure and define another measure
via a density function. Or one could define a measure as a “limit” of “simple”
measures.
Construction of Brownian motion: method 1
Let I be the space of finite increasing sequences of times
I := {(t 1 , . . . , t n ) | n ∈ N, 0 ≤ t 1 < · · · < t n < ∞}.
From the lemma, it is clear that P has finite dimensional distributions given by
Q_t(A) := P{ω ∈ C[0, ∞) | (ω_{t_1}, ..., ω_{t_n}) ∈ A}
for all A ∈ B(R^n), where
Q_t(A) = ∫_A p(t_1; 0, x_1) ⋯ p(t_n − t_{n−1}; x_{n−1}, x_n) dx_1 ⋯ dx_n.
This method will yield a unique measure on (C[0, ∞), B(C[0, ∞))), Wiener measure, with fi.di. distributions given by {Q t , t ∈ I}. (But I don’t see how, yet.)
Let C 0 be the set containing each so-called cylinder set, sets of the form
{ω ∈ C[0, ∞) | (ω t 1 , . . . , ω t n ) ∈ A}
for A ∈ B(Rn ), for n ∈ N. We have a finitely additive set function Q t defined on
C 0 . As an exercise, prove that the collection {Q t | t ∈ I} is consistent (see below).
The next step is to show that Q t is countably additive on C 0 (see Itô-McKean), and
the final step is to show that σ(C 0 ) = B(C[0, ∞)). (I must be missing something
here.)
The canonical version of Brownian motion is defined to be the canonical process on (C[0, ∞), B(C[0, ∞)), P), where P is Wiener measure.
Construction of Brownian motion: method 2
1.2.2 Theorem (Daniell-Kolmogorov). Let {Q t (·), t ∈ I} be a family of finite dimensional distributions that satisfies
(i) Q t (A) is invariant under permutation of the elements of t; and
(ii) For any t = (t 1 , . . . , t n ), if s = (t 1 , . . . , t n−1 ) and A ∈ B(Rn−1 ) then Q t (A ×
R) = Q s (A). This is the requirement that the family is consistent.
Then there is a unique probability measure P on (R^[0,∞), B(R^[0,∞))) such that
P{ω ∈ R^[0,∞) | (ω_{t_1}, ..., ω_{t_n}) ∈ A} = Q_t(A)
for all A ∈ B(R^n) and t ∈ I.
Let C̃ be the collection of cylinder sets
{ω ∈ R^[0,∞) | (ω_{t_1}, ..., ω_{t_n}) ∈ A}
for A ∈ B(R^n) and n ∈ ℕ. As in method 1, Q_t is defined on C̃ and it is
finitely additive and consistent. It is countably additive by the Daniell-Kolmogorov
extension theorem. Carathéodory’s theorem gives a measure satisfying properties
1–3 on σ(C˜). The only thing remaining is to deal with the continuity of the paths.
1.2.3 Theorem (Kolmogorov-Čentsov). If {X_t | t ∈ [0, T]} is a real valued stochastic process defined on (Ω, F, P) that satisfies
E[|X_t − X_s|^α] ≤ C|t − s|^{1+β}
for all 0 ≤ s ≤ t ≤ T and some positive constants α, β, and C, then there is a continuous modification of X which is locally Hölder continuous with exponent γ for every γ ∈ (0, β/α).
A stochastic process {X_t | t ∈ [0, T]} is said to be locally Hölder continuous with exponent γ if there is an a.s. positive r.v. h such that
P[ sup_{0<t−s<h, s,t∈[0,T]} |X_t − X_s| / |t − s|^γ ≤ δ ] = 1
for some appropriate constant δ > 0.
PROOF: The first step is to choose a countable dense subset of [0, T]. We use the dyadic rationals
D = {k2^{−n} | k = 0, 1, ..., 2^n, n ∈ ℕ}.
Define Ω̃∗ = {ω | t ↦ X_t(ω) is uniformly continuous}, where X is the canonical process on R^[0,∞). We would like to show that P(Ω̃∗) = 1. We will show instead that P(Ω∗) = 1, where
Ω∗ = {ω | t ↦ X_t(ω) is locally Hölder continuous on D with exponent γ}.
By definition, ω ∈ Ω∗ holds if there is n∗(ω) such that
max_{1≤k≤2^n} |X_{k2^{−n}}(ω) − X_{(k−1)2^{−n}}(ω)| < 2^{−nγ}
for all n ≥ n∗(ω). Call this property (P). Let
E_n = {ω | max_{1≤k≤2^n} |X_{k2^{−n}}(ω) − X_{(k−1)2^{−n}}(ω)| ≥ 2^{−nγ}}.
The set of ω for which property (P) does not hold is the set of ω for which E_n occurs infinitely often. Whence
Ω \ Ω∗ = ∩_{m∈ℕ} ∪_{n≥m} E_n.
Now
P(E_n) = P( ∪_{k=1}^{2^n} {|X_{k2^{−n}} − X_{(k−1)2^{−n}}| ≥ 2^{−nγ}} )
≤ Σ_{k=1}^{2^n} P(|X_{k2^{−n}} − X_{(k−1)2^{−n}}|^α ≥ 2^{−nγα})
≤ 2^{nγα} Σ_{k=1}^{2^n} E(|X_{k2^{−n}} − X_{(k−1)2^{−n}}|^α)
≤ C 2^{nγα} Σ_{k=1}^{2^n} (1/2^n)^{1+β}
= C 2^{(γα−β)n},
so Σ_{n=1}^∞ P(E_n) < ∞ if γ ∈ (0, β/α). Therefore by the Borel-Cantelli Lemma we have P(Ω∗) = 1.
The next step is to define the modification. We define
X̃_t(ω) = X_t(ω) if t ∈ D, ω ∈ Ω∗;
X̃_t(ω) = lim_{s_n→t, {s_n}⊆D} X_{s_n}(ω) if t ∉ D, ω ∈ Ω∗;
X̃_t(ω) = 0 if ω ∉ Ω∗;
and we must show that it is truly a modification of X. For t ∈ D, P(X_t = X̃_t) ≥ P(Ω∗) = 1. For t ∉ D, we know by construction that X̃_{s_n} → X̃_t a.s. for any {s_n} ⊆ D converging to t. We know by the K-Č inequality that X_{s_n} → X_t in probability. Therefore X̃_{s_n} → X_t in probability, and so P(X_t = X̃_t) = 1. □
To complete the construction we must check that the K-Č inequality holds for Brownian motion for some α and β; indeed E[|B_t − B_s|^4] = 3|t − s|², so α = 4, β = 1, C = 3 works, giving local Hölder continuity for every γ ∈ (0, 1/4).
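The fourth-moment identity E|B_t − B_s|^4 = 3(t − s)², which underlies the choice α = 4, β = 1, can be sanity-checked by Monte Carlo (the helper name and sample sizes are my choices):

```python
import math
import random

def fourth_moment_ratio(dt=0.1, n_samples=200000, seed=1):
    """Estimate E|B_{s+dt} - B_s|^4 / dt^2 by sampling N(0, dt) increments.
    The exact value is 3, i.e. the Kolmogorov-Centsov hypothesis holds for
    Brownian motion with alpha = 4, beta = 1, C = 3."""
    rng = random.Random(seed)
    sd = math.sqrt(dt)
    est = sum(rng.gauss(0.0, sd) ** 4 for _ in range(n_samples)) / n_samples
    return est / dt ** 2
```

The returned ratio should be close to 3 for any dt > 0, reflecting that the bound is an equality with C = 3.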
Weak convergence
There are a few books on convergence of processes: Billingsley (but not Probability), Parthasarathy, Jacod-Shiryaev.
For this section we let (S, S ) be a metrizable space.
1.2.4 Definition. A sequence of measures µ_n, all defined on the same space (S, B(S)), converges weakly to µ, denoted µ_n →(w) µ, if
E_{µ_n}[f] = ∫_S f(x) µ_n(dx) → ∫_S f(x) µ(dx) = E_µ[f]
for all f ∈ C_b(S).
It can be seen that when S = R this reduces to convergence in distribution.
A norm that may be defined on a collection of measures absolutely continuous with respect to some measure λ is the total variation norm, defined by
‖µ‖_TV = ∫_S |dµ/dλ| dλ.
1.2.5 Exercise. Suppose that µ_n is the distribution of
S_n / √n = Σ_{i=1}^n (X_i − E[X_i]) / √n,
where the X_i are i.i.d. with finite means and variances. Does µ_n converge in the total variation metric?
1.2.6 Definition. Given a sequence of probability measures µ_n on (S, B(S)) and a probability measure µ on (S, B(S)), µ_n converges weakly to µ if and only if
lim_{n→∞} ∫_S f dµ_n = ∫_S f dµ
for every bounded continuous function f on S. We write µ_n →(w) µ.
This is the weak∗ topology induced from considering the space of finitely additive measures as the dual of the space of bounded measurable functions. Note that
‖µ‖_TV = sup_{f ∈ L^∞(S), ‖f‖_∞ ≤ 1} |∫_S f dµ|,
so the total variation norm is the operator norm induced by this duality.
1.2.7 Theorem (Portmanteau). The following are equivalent.
(i) µ_n →(w) µ;
(ii) lim inf_{n→∞} µ_n(U) ≥ µ(U) for every open set U;
(iii) lim sup_{n→∞} µ_n(C) ≤ µ(C) for every closed set C;
(iv) lim_{n→∞} µ_n(A) = µ(A) for every measurable set A such that µ(∂A) = 0.
1.2.8 Continuous mapping theorem. If Φ : S → S′ is continuous and µ_n →(w) µ in (S, B(S)) then Φ(µ_n) →(w) Φ(µ) in (S′, B(S′)).
PROOF: Let g ∈ C_b(S′). Then g ∘ Φ ∈ C_b(S), so by definition
∫_{S′} g dΦ(µ_n) = ∫_S g ∘ Φ dµ_n → ∫_S g ∘ Φ dµ = ∫_{S′} g dΦ(µ),
so Φ(µ_n) →(w) Φ(µ). □
1.2.9 Definition. Let Π be a family of probability measures on (S, B(S)).
(i) Π is said to be relatively compact if every sequence in Π contains a weakly convergent subsequence. (This is just relative compactness with respect to the topology induced by weak convergence.)
(ii) Π is tight if for every ε > 0 there is a compact set K ⊆ S such that P(K) ≥ 1 − ε for all P ∈ Π.
1.2.10 Theorem (Prohorov). Suppose that (S, S ) is a Polish space (i.e. a complete, separable, metrizable space). Then a family Π of probability measures on
(S, B(S)) is tight if and only if Π is relatively compact.
We will be interested in the case when S is the space of continuous functions on [0, ∞) with the metric ρ associated with the uniform norm,
ρ(f, g) = Σ_{n=1}^∞ (1/2^n) · sup_{t∈[0,n]} |f(t) − g(t)| / (1 + sup_{t∈[0,n]} |f(t) − g(t)|).
(C[0, ∞), ρ) is a Polish space.
1.2.11 Theorem (Arzelà-Ascoli). A set A ⊆ C[0, ∞) is relatively compact if and only if the following two conditions hold.
(i) sup_{ω∈A} |ω(0)| < ∞; and
(ii) lim_{δ↓0} sup_{ω∈A} m_T(ω, δ) = 0 for all T < ∞, where
m_T(ω, δ) = sup_{0≤s≤t≤T, |t−s|≤δ} |ω(t) − ω(s)|
is the modulus of continuity. (A is said to be equicontinuous.)
1.2.12 Theorem. A sequence {P_n} of probability measures on C[0, ∞) is tight if and only if
(i) lim_{λ↑∞} sup_{n≥1} P_n{ω | |ω(0)| ≥ λ} = 0; and
(ii) lim_{δ↓0} sup_{n≥1} P_n{ω | m_T(ω, δ) > ε} = 0 for all ε > 0 and all T < ∞.
PROOF: Suppose that {P_n} is tight. Then given any η > 0 there is some compact set K_η ⊆ C[0, ∞) such that P_n(K_η) > 1 − η for all n. By the Arzelà-Ascoli theorem, given T > 0 and ε > 0 there are λ < ∞ and δ_0 > 0 such that
K_η ⊆ {ω | |ω(0)| ≤ λ, m_T(ω, δ) ≤ ε for δ ∈ (0, δ_0)},
and the two conditions follow.
Now suppose that {P_n} satisfies the conditions. Then given T > 0 and η > 0, choose λ > 0 such that
sup_{n≥1} P_n{ω | |ω(0)| > λ} ≤ η / 2^{T+1},
and for each k = 1, 2, ... choose δ_k > 0 such that
sup_{n≥1} P_n{ω | m_T(ω, δ_k) > 1/k} ≤ η / 2^{T+k+1}.
Define
A_T := {ω | |ω(0)| ≤ λ, m_T(ω, δ_k) ≤ 1/k for all k = 1, 2, ...}
and let A := ∩_{T=1}^∞ A_T. Then
P_n(A_T) ≥ 1 − Σ_{k=0}^∞ η / 2^{T+k+1} = 1 − η / 2^T,
so P_n(A) ≥ 1 − η for all n ≥ 1. By the Arzelà-Ascoli theorem A is relatively compact, so {P_n} is tight. □
It turns out that the topology on M1 (S) induced by weak convergence is metrizable, and M1 (S) with this so-called Prohorov metric is complete and separable.
Another important fact that we will use often is that probability measures on (C[0, ∞), B(C[0, ∞))) are uniquely characterized by their finite dimensional distributions, i.e. their values on cylinder sets
{ω ∈ C[0, ∞) | (ω_{t_1}, ..., ω_{t_n}) ∈ A}
for A ∈ B(R^n), for n ∈ ℕ.
Donsker’s invariance principle
Let ξ_1, ξ_2, ... be i.i.d. r.v.'s with mean zero and variance one. Let S_n = Σ_{i=1}^n ξ_i. Recall the central limit theorem, that S_n / √n →(w) N(0, 1). Following KS, let
Y_t := S_⌊t⌋ + (t − ⌊t⌋) ξ_{⌊t⌋+1}
be the continuous time process that is the linear interpolation between the partial sums. For each n ≥ 1, scale Y in space by a factor of √n and in time by a factor of n (the choice of these scaling factors will become clear in a moment) to get a process {X_t^(n) | t ∈ [0, 1]},
X_t^(n) = (1/√n) Y_{nt} = (1/√n) (S_⌊nt⌋ + {nt} ξ_{⌊nt⌋+1}).
Notice that for s = k/n and t = (k+1)/n we have X_t^(n) − X_s^(n) = ξ_{k+1} / √n, which is independent of σ(X_u^(n) | u ≤ s) = σ(ξ_1, ..., ξ_k) and has mean zero and variance 1/n = t − s. At t = 1 we have X_1^(n) = S_n / √n, which converges weakly to N(0, 1). For t = 1/2 we have (approximately)
X_{1/2}^(n) = (1/√n) S_⌊n/2⌋ = (1/√2) · S_⌊n/2⌋ / √(n/2),
which is N(0, 1/2) in the limit, by the CLT. These computations lead us to believe that the X^(n) "converge" to Brownian motion. This is made precise below.
1.2.13 Theorem. {X (n) } converges weakly to Brownian motion.
First we show that {X^(n)} is tight. From Exercise 4.11 of KS, if {X^(n)} is a sequence of continuous stochastic processes with X_0^(n) = 0 and if
sup_{n≥1} E[|X_t^(n) − X_s^(n)|^α] ≤ C_T |t − s|^{1+β}
for all T > 0 and 0 ≤ s, t ≤ T for some positive constants α, β, and C_T, then P_n = P(X^(n))^{−1} form a tight sequence. To show that {X^(n)} is tight we will show that it satisfies the conditions of this problem.
The next step is to apply Prohorov's theorem to see that {X^(n)} is relatively compact. Thus there is a weakly convergent subsequence {X^(n_k)}. Let X be such that X^(n_k) →(w) X and apply the continuous mapping theorem and central limit theorem to conclude that X has the required fi.di. distributions.
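The rescaling in Donsker's theorem is easy to watch numerically. The sketch below (names and parameters are my choices; it evaluates X^(n) only at grid points, dropping the interpolation term) uses a ±1 random walk and checks that the sample variances at t = 1/2 and t = 1 approach the Brownian values 1/2 and 1:

```python
import math
import random

def donsker_marginal_variances(n=400, paths=5000, seed=7):
    """For a +/-1 random walk S_k, form X_t^(n) ~ S_{floor(nt)} / sqrt(n)
    and return the sample variances of X_{1/2}^(n) and X_1^(n) over many
    independent paths; these should approach 1/2 and 1 respectively."""
    rng = random.Random(seed)
    half_vals, one_vals = [], []
    for _ in range(paths):
        s = 0
        s_half = 0
        for i in range(1, n + 1):
            s += rng.choice((-1, 1))
            if i == n // 2:       # record the walk at "time" t = 1/2
                s_half = s
        half_vals.append(s_half / math.sqrt(n))
        one_vals.append(s / math.sqrt(n))
    var = lambda xs: sum(x * x for x in xs) / len(xs)
    return var(half_vals), var(one_vals)
```

This only probes two one-dimensional marginals, of course; the theorem asserts weak convergence of the whole path law.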
1.3 Sample path properties
1.3.1 Proposition. If B is a Brownian motion then the following processes X are also Brownian motions with respect to their natural filtrations.
(i) X_t := (1/√c) B_{ct} for c > 0 (scaling property);
(ii) X_t := B_{t+c} − B_c for c ≥ 0 (simple Markov property);
(iii) X_t := B_T − B_{T−t} for t ∈ [0, T] (time reversal property);
(iv) X_t := t B_{1/t} for t > 0 and X_0 := 0 (time inversion property);
(v) X_t := U B_t for an orthogonal matrix U (where B is d-dimensional Brownian motion). In particular we have the reflection property, that −B is a Brownian motion.
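Two of these invariances can be spot-checked at a single time point; this only tests one-dimensional marginal laws, not the full process law (function names and parameter values are mine):

```python
import math
import random

def scaling_marginal_var(c=4.0, t=0.7, n=20000, seed=3):
    """Property (i): X_t := B_{ct} / sqrt(c). Draw B_{ct} ~ N(0, c*t)
    directly; the sample second moment of X_t should be close to t."""
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, math.sqrt(c * t)) / math.sqrt(c) for _ in range(n)]
    return sum(x * x for x in xs) / n

def inversion_marginal_var(t=0.5, n=20000, seed=4):
    """Property (iv): X_t := t * B_{1/t}. Since Var(t * B_{1/t}) = t^2 / t = t,
    the sample second moment should again be close to t."""
    rng = random.Random(seed)
    xs = [t * rng.gauss(0.0, math.sqrt(1.0 / t)) for _ in range(n)]
    return sum(x * x for x in xs) / n
```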
1.3.2 Lemma. P[sup_{t≥0} B_t = ∞ and inf_{t≥0} B_t = −∞] = 1.
PROOF: Let Z = sup_{t≥0} B_t. For any c > 0 we have
cZ = sup_{t≥0} cB_t = sup_{t≥0} cB_{t/c²} =(d) sup_{t≥0} B_t = Z,
since {cB_{t/c²}, t ≥ 0} is again a Brownian motion by the scaling property. Therefore the law of Z is concentrated on {0, ∞}. Let p = P(Z = 0). Then
p ≤ P[B_1 ≤ 0 and sup_{t≥0}(B_{1+t} − B_1) < ∞]
= P[B_1 ≤ 0] P[sup_{t≥0}(B_{1+t} − B_1) < ∞]    (B_1 ⊥ (B_{1+t} − B_1, t ≥ 0))
= P(B_1 ≤ 0) P(Z = 0)    ((B_{1+t} − B_1, t ≥ 0) is a BM)
= p/2,
so p = 0 and P(Z = ∞) = 1. The statement for the infimum follows by applying this to the Brownian motion −B. □
Remark. A direct consequence of this is that a.s. for all a ∈ R, {t | B t = a} is not
bounded above.
1.3.3 Lemma. Brownian motion is a.s. not differentiable at zero.
PROOF: The last lemma and time inversion together imply that
P[∀ ε > 0, ∃ s, t ≤ ε s.t. B_s < 0 < B_t] = 1.
Indeed, if this were not the case then there would be a set A with P[A] > 0 and the property that for all ω ∈ A there is ε = ε(ω) such that either B(u) > 0 for all u ∈ (0, ε] or B(u) < 0 for all u ∈ (0, ε]. By the time inversion property this implies that for all ω ∈ A, B̃_u = u B_{1/u} satisfies B̃_u > 0 or B̃_u < 0 for all u ∈ [1/ε, ∞), which contradicts the previous lemma.
Therefore the only possible (right) derivative of Brownian motion at zero is 0. If this were the case on a set A of positive probability, then for ω ∈ A, |B_t(ω)| ≤ t for all 0 ≤ t ≤ T(ω). Once again, using time inversion, B̃_t := t B_{1/t} is a Brownian motion. On A, for all 0 < t ≤ T(ω), |B̃_{1/t}| = |B_t| / t ≤ 1, so B̃ is bounded on [1/T(ω), ∞), which is impossible. □
1.3.4 Lemma. Brownian paths are monotone on no interval, a.s.
PROOF: We must show that the set ∪_{s,t∈ℚ} {ω | B(ω) is monotone on [s, t]} has probability zero. By the symmetry properties of Brownian motion it suffices to show that A := {ω | B(ω) is non-decreasing on [0, 1]} has probability zero. Let
A_n := ∩_{i=0}^{n−1} {B_{(i+1)/n} − B_{i/n} ≥ 0}
and notice that A = ∩_{n=1}^∞ A_n since B has continuous paths. (It follows in particular that A is measurable.) Since B has independent, normally distributed increments, P[A_n] = (1/2)^n, so by continuity of measure, P[A] = lim_{n→∞} P[A_n] = 0. □
Given a partition Π = {0 = t_0 < t_1 < ⋯ < t_{k_Π} = 1} of [0, 1], and a real-valued function f : [0, 1] → R, let
V^(p)(Π)(f) = Σ_{i=1}^{k_Π} |f(t_i) − f(t_{i−1})|^p.
The (classical) p-variation of f is defined to be
Ṽ^(p)(f) = sup_Π V^(p)(Π)(f).
If f is continuous and p = 1 then Ṽ^(1)(f) = lim_{‖Π_n‖→0} V^(1)(Π_n)(f), where Π_n is any sequence of partitions such that ‖Π_n‖ → 0 (i.e. the mesh size goes to zero). When p ≠ 1, these quantities need not be the same. Regardless, the p-variation of a stochastic process X is defined to be
V^(p)(X) = lim_{‖Π_n‖→0} V^(p)(Π_n)(X),
where the limit is taken in probability.
14
Stochastic Calculus I
1.3.5 Lemma. The quadratic (p = 2) variation of Brownian motion on the interval [0, t] is the deterministic value t, and Brownian motion does not have finite variation (p = 1) on any interval.
PROOF: Fix t and fix a partition Π = {0 = t_0 < t_1 < ⋯ < t_n = t} of [0, t]. Then
E(V^(2)(Π)(B) − t)² = E(Σ_{j=1}^n Δ_j)²,
where Δ_j = (B_{t_j} − B_{t_{j−1}})² − (t_j − t_{j−1}). It can be shown that E[Δ_j Δ_k] = 0 for j ≠ k and E[Δ_j²] = 2(t_j − t_{j−1})². Whence
E(V^(2)(Π)(B) − t)² = 2 Σ_{j=1}^n (t_j − t_{j−1})² ≤ 2‖Π‖ t,
so V^(2)(Π)(B) converges to t in L² (and hence in probability). It follows that Brownian motion cannot have finite first variation because it has continuous paths and non-zero quadratic variation. □
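The contrast between the two variations is visible on a single simulated path (a sketch with illustrative names and parameters): the squared increments sum to roughly t, while the absolute increments sum to something of order √(number of steps).

```python
import math
import random

def variations(t=1.0, n_steps=4096, seed=5):
    """Sample Brownian increments over a uniform partition of [0, t] and
    return (first variation sum, quadratic variation sum). As the mesh
    shrinks the second tends to t while the first grows like sqrt(n_steps)."""
    rng = random.Random(seed)
    sd = math.sqrt(t / n_steps)
    incs = [rng.gauss(0.0, sd) for _ in range(n_steps)]
    v1 = sum(abs(d) for d in incs)   # diverges as the partition refines
    v2 = sum(d * d for d in incs)    # concentrates near t (Lemma 1.3.5)
    return v1, v2
```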
1.3.6 Sample path properties. The following properties are true of a.e. sample path of Brownian motion.
(i) Unboundedness.
(ii) Of unbounded first variation. (This is a consequence of the fact that V^(2)(B(ω)) ≠ 0 a.s.)
(iii) Non-differentiable at zero (or anywhere).
(iv) Monotone on no interval. (This is a consequence of (ii).)
(v) Nowhere differentiable, i.e.
{ω | ∀ t ∈ [0, ∞) either D⁺W_t(ω) = ∞ or D⁻W_t(ω) = −∞}
contains an event F with P(F) = 1, where
D⁺f_t = lim sup_{h→0} (1/h)(f(t + h) − f(t)) and D⁻f_t = lim inf_{h→0} (1/h)(f(t + h) − f(t)).
(vi) Law of the Iterated Logarithm:
lim sup_{t→∞} W_t(ω) / √(2t log log t) = 1 and lim inf_{t→∞} W_t(ω) / √(2t log log t) = −1.
(vii) Exact modulus of continuity (Lévy), see 2.9.F.
1.4 Distributional properties
1.4.1 Definition. A stochastic process X is a Gaussian process if for every 0 ≤ t_1 < t_2 < ⋯ < t_n < ∞ the R^n-valued random vector (X_{t_1}, ..., X_{t_n}) has a (multi-variate) Gaussian distribution. The covariance function of a Gaussian process is defined to be ρ(s, t) := E[(X_s − E[X_s])(X_t − E[X_t])].
Brownian motion is a Gaussian process with mean zero and covariance function ρ(s, t) = s ∧ t.
1.4.2 Lemma. The CDF F_{t_1,...,t_n}(x_1, ..., x_n) of (B_{t_1}, ..., B_{t_n}) is
∫_{−∞}^{x_1} ⋯ ∫_{−∞}^{x_n} p(t_1; 0, y_1) p(t_2 − t_1; y_1, y_2) ⋯ p(t_n − t_{n−1}; y_{n−1}, y_n) dy_n ⋯ dy_1.

1.5 Markov property
Informally, X is a Markov process if there is a family of Borel measurable functions
{ fs,t } such that
P(X t ∈ A | FsX ) = fs,t (X s , A).
1.5.1 Definition. Let (Ω, F) be a measurable space. A kernel on Ω is a map N : Ω × F → [0, 1] such that
(i) A ↦ N(x, A) is a measure on (Ω, F) for all x ∈ Ω; and
(ii) x ↦ N(x, A) is F-measurable for every A ∈ F.
A kernel N is called a transition probability or stochastic kernel if N(x, Ω) = 1 for all x ∈ Ω.
Notation. If f is a non-negative F-measurable function and N is a kernel then define
N f(x) := ∫_Ω N(x, dy) f(y) = E_{N(x,·)}[f].
Likewise, if M and N are two kernels then
M N(x, A) = ∫_Ω M(x, dy) N(y, A).
Suppose there is a process X for which, for any s < t, there is a transition probability P_{s,t} such that a.s.
P(X_t ∈ A | F_s^X) = P_{s,t}(X_s, A).
Then for any positive F-measurable function f, using standard approximation arguments,
E[f(X_t) | F_s^X] = P_{s,t} f(X_s).
So if s < t < v then
P(X_v ∈ A | F_s^X) = E[P(X_v ∈ A | F_t^X) | F_s^X] = ∫ P_{s,t}(X_s, dy) P_{t,v}(y, A).
This should equal P_{s,v}(X_s, A).
1.5.2 Definition. A transition function on (Ω, F) is a family {P_{s,t} | 0 ≤ s < t} of transition probabilities on (Ω, F) such that for all s < t < v,
∫_Ω P_{s,t}(x, dy) P_{t,v}(y, A) = P_{s,v}(x, A).
These are the Chapman-Kolmogorov equations.
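For the Gaussian density p(t; x, y) from Lemma 1.2.1 the Chapman-Kolmogorov equation ∫ p(s; x, y) p(t; y, z) dy = p(s + t; x, z) can be checked by direct quadrature (a midpoint-rule sketch; function names, the grid, and the chosen evaluation points are illustrative):

```python
import math

def p(t, x, y):
    """One-dimensional Brownian transition density."""
    return math.exp(-(y - x) ** 2 / (2 * t)) / math.sqrt(2 * math.pi * t)

def chapman_kolmogorov_gap(s=0.3, t=0.5, x=0.2, z=-0.4, lim=12.0, n=4000):
    """|int p(s; x, y) p(t; y, z) dy  -  p(s + t; x, z)|, with the integral
    over [-lim, lim] by the midpoint rule; should be quadrature error only,
    i.e. essentially zero."""
    h = 2 * lim / n
    total = 0.0
    for k in range(n):
        y = -lim + (k + 0.5) * h
        total += p(s, x, y) * p(t, y, z)
    return abs(total * h - p(s + t, x, z))
```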
1.5.3 Definition. Let (Ω, F , {F t , t ≥ 0}, P) be a filtered probability space. An
adapted process X is a Markov process with respect to another filtration {G t } (containing {F tX }) with transition functions Ps,t if for all non-negative F -measurable
functions f and 0 ≤ s ≤ t, E[ f (X t ) | Gs ] = Ps,t f (X s ).
Remark.
(i) Given a transition function, one can always construct a Markov process with that transition function using Kolmogorov's extension theorem.
(ii) The transition function is said to be homogeneous if for all s < t, P_{s,t} depends on s and t only through t − s. In this case the C-K equation takes the form P_{t+s}(x, A) = ∫ P_s(x, dy) P_t(y, A).
1.5.4 Theorem. Brownian motion is a Markov process with respect to its natural
filtration. (With what transition function?)
Intuition: B_t = (B_t − B_s) + B_s, and B_t − B_s is N(0, t − s) and independent of B_s, so given B_s = y the conditional law of B_t is N(y, t − s).
1.5.5 Lemma. Suppose that X and Y are d-dimensional random vectors on (Ω, F, P), G is a sub-σ-algebra of F, X is independent of G, and Y is G-measurable. Then for every Γ ∈ B(R^d),
P[X + Y ∈ Γ | G] = P[X + Y ∈ Γ | Y]    P-a.s.
and
P[X + Y ∈ Γ | Y = y] = P[X + y ∈ Γ]    for P Y^{−1}-a.e. y.
PROOF: We will show that for D ∈ B(R^d × R^d),
P[(X, Y) ∈ D | G] = P[(X, Y) ∈ D | Y].
First look at D = D_1 × D_2 for D_1, D_2 ∈ B(R^d). The left hand side is
P[X ∈ D_1, Y ∈ D_2 | G] = 1_{{Y ∈ D_2}} P[X ∈ D_1 | G] = 1_{{Y ∈ D_2}} P[X ∈ D_1]
and the right hand side is equal to the same thing by the same logic. Since the measurable rectangles form a π-system and generate B(R^d × R^d), we are done. □
Let Ω^0 = (C([0, ∞); R))^d, F^0 = B(Ω^0), and P^0 = P^(1) × ⋯ × P^(d), where each P^(i) is Wiener measure. Let X be the canonical process on (Ω^0, F^0, P^0), so X is a d-dimensional Brownian motion started at zero. Let µ be an arbitrary initial distribution on (R^d, B(R^d)). Consider the random variable on (R^d × Ω^0, B(R^d) ⊗ F^0, µ ⊗ P^0) defined by X(x, ω_1, ..., ω_d) = x + (ω_1, ..., ω_d); this is a Brownian motion with initial distribution µ.
Another way to think about this is to think of P^µ as the image measure. How can we explicitly write P^µ in terms of P^0 and µ? Naturally, take P^x(F) = P^0(F − x), and write P^µ(F) = ∫ P^x(F) µ(dx). This integral is well-defined if for every F ∈ F^0 the map x ↦ P^x(F) is "universally measurable." The following fact is true: for every F ∈ F_∞^B the map x ↦ P^x(F) is B(R^d)-measurable. (Universal measurability is introduced so that we get this nice property for slightly larger filtrations, such as the augmented natural filtration.)
Define
U(R^d) := ∩_{µ prob} B(R^d)^µ ⊇ B(R^d),
where B(R^d)^µ is the completion of B(R^d) with respect to µ and the intersection is over all probability measures µ. A mapping R^d → R is said to be universally measurable if it is U(R^d)/B(R^d)-measurable.
Note that if F is a set of the form {ω ∈ Ω^0 | ω(0) ∈ Γ_0, ω(t_1) ∈ Γ_1} then
P^x(F) = 1_{Γ_0}(x) ∫_{Γ_1} p_d(t_1; x, y_1) dy_1, where as always
p_d(t; x, y) = (1/(2πt)^{d/2}) exp(−‖x − y‖² / (2t)).
As a consequence, Brownian motion is a homogeneous Markov process with transition function P_t(x, A) = ∫_A p_d(t; x, y) dy. Here we call p_d(t; x, y) the transition density of Brownian motion. We have seen E[f(B_{t+s}) | B_s] = P_t f(B_s). The infinitesimal generator of a homogeneous Markov process is G := lim_{s↓0} (1/s)(P_s − I).
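For Brownian motion the generator is G f = (1/2) f′′; the sketch below approximates (P_s f − f)/s for small s by quadrature against p_d (with d = 1) and compares it with f′′/2 (the function names, the test function cos, and the step s are my choices):

```python
import math

def p1(t, x, y):
    """One-dimensional Brownian transition density (p_d with d = 1)."""
    return math.exp(-(y - x) ** 2 / (2 * t)) / math.sqrt(2 * math.pi * t)

def generator_estimate(f, x, s=1e-4, lim=3.0, n=4000):
    """Approximate (P_s f - f)(x) / s with a midpoint rule; as s -> 0 this
    tends to (G f)(x) = f''(x) / 2 for smooth bounded f."""
    h = 2 * lim / n
    psf = 0.0
    for k in range(n):
        u = -lim + (k + 0.5) * h       # integration variable u = y - x
        psf += f(x + u) * p1(s, 0.0, u)
    psf *= h
    return (psf - f(x)) / s

# For f = cos we have (G f)(x) = -cos(x) / 2.
```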
1.5.6 Definition. A Markov family is an adapted process {X_t, F_t | t ≥ 0} on some (Ω, F) together with a family of probability measures {P^x | x ∈ R^d} on (Ω, F) such that
(i) x ↦ P^x(F) is universally measurable for all F ∈ F;
(ii) P^x[X_0 = x] = 1 for all x ∈ R^d;
(iii) P^x[X_{s+·} ∈ F | F_s] = P^x[X_{s+·} ∈ F | X_s] for all x ∈ R^d, for all F ∈ B(R^d)^[0,∞);
(iv) P^x[X_{s+t} ∈ Γ | X_s = y] = P^y[X_t ∈ Γ] for all x ∈ R^d, for P^x X_s^{−1}-a.e. y.
1.5.7 Definition. A process X is F -progressively measurable if the restricted map
X : [0, t] × Ω → Rd is (B[0, t] ⊗ F t )/B(Rd )-measurable for all t < ∞.
1.5.8 Definition. The σ-algebra generated by a random time T is the σ-algebra generated by {{X_T ∈ A} | A ∈ B(R^d)} ∪ {{T = ∞}}.
1.5.9 Definition. A random time T is an F -stopping time (resp. F -optional time)
if {T ≤ t} ∈ F t (resp. {T < t} ∈ F t ) for all t. The σ-algebra generated by a
stopping time T is
F T := {A ∈ F | A ∩ {T ≤ t} ∈ F t for all t ≥ 0}.
1.5.10 Exercise. Suppose that X is an adapted process with right continuous
paths and A ∈ B(Rd ). The hitting time is HA = inf{t ≥ 0 | X t ∈ A}.
(i) If A is open show that HA is an optional time.
(ii) If A is closed and X is continuous show that HA is a stopping time.
1.5.11 Definition. Let {F̄_t}_{t≥0} be the completion of {F_t^B}_{t≥0}, and for each t ≥ 0 let F_t = ∩_{s>t} F̄_s. Then {F_t}_{t≥0} is a right continuous filtration, the Brownian filtration.
1.5.12 Theorem. Let {B_t, F_t}_{t≥0} be a Brownian motion and let T be a finite valued stopping time. Then the process defined by B_t^(T) = B_{T+t} − B_T for t ≥ 0 is a Brownian motion independent of F_T.
PROOF: A simple stopping time is a stopping time whose image is countable. We claim that for any finite-valued stopping time T there is a non-increasing sequence of simple stopping times T_1 ≥ T_2 ≥ ⋯ such that lim_{n→∞} T_n = T point-wise, and moreover F_T = ⋂_n F_{T_n}. (The reason we take the approximation from above is for this latter property.) Indeed, define

T_n = Σ_{k=0}^∞ ((k+1)/2^n) 1_{[k2^{−n}, (k+1)2^{−n})} ∘ T.

Then clearly T_n ≥ T_{n+1} for all n, and T_n converges to T since 0 ≤ T_n − T ≤ 2^{−n}. Further, F_T ⊆ F_{T_n} for all n as a consequence of the general fact that S ≤ T implies F_S ⊆ F_T. (Indeed, if A ∈ F_S then

A ∩ {T ≤ t} = A ∩ {S ≤ t} ∩ {T ≤ t} ∈ F_t,

so A ∈ F_T.) Conversely, if A ∈ ⋂_n F_{T_n} then A ∩ {T_n ≤ t} ∈ F_t for all n ≥ 1 and all t ≥ 0. Therefore

A ∩ {T ≤ t} = ⋂_{ε>0} ⋃_{m≥1} ⋂_{n≥m} (A ∩ {T_n ≤ t + ε}) ∈ ⋂_{ε>0} F_{t+ε} = F_t

by the right continuity of {F_t}.
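The dyadic approximation can be sanity-checked numerically. The sketch below (my own illustration; the function name is made up) verifies, for a few deterministic sample values of T, that T_n approximates T from above with error at most 2^{−n} and is non-increasing in n.

```python
# Check the two properties of T_n = (k+1)/2^n on {T in [k 2^-n, (k+1) 2^-n)}:
# (a) 0 < T_n - T <= 2^-n, and (b) T_n is non-increasing in n.
import math

def dyadic_above(T, n):
    """Level-n dyadic approximation of a finite time T from above."""
    k = math.floor(T * 2 ** n)
    return (k + 1) / 2 ** n

samples = [0.0, 0.3, 1.0, math.pi, 7.249]
ok_gap = all(0 < dyadic_above(T, n) - T <= 2 ** -n
             for T in samples for n in range(1, 20))
ok_monotone = all(dyadic_above(T, n + 1) <= dyadic_above(T, n)
                  for T in samples for n in range(1, 20))
```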
Back to the proof of the theorem. If T is a simple stopping time then let {τ_1, τ_2, . . . } be the range of T. For any A ∈ F_T and for all C_1, . . . , C_m ∈ B(R),

P(A ∩ ⋂_{i≤m} {B_{T+t_i} − B_T ∈ C_i})
= Σ_{k=0}^∞ P(A ∩ ⋂_{i≤m} {B_{τ_k+t_i} − B_{τ_k} ∈ C_i, T = τ_k})
= Σ_{k=0}^∞ P(⋂_{i≤m} {B_{τ_k+t_i} − B_{τ_k} ∈ C_i}) P({T = τ_k} ∩ A)
= P(⋂_{i≤m} {B_{t_i} ∈ C_i}) P(A).

Now set A = Ω to deduce that t ↦ B_{T+t} − B_T is a Brownian motion. Since A ∈ F_T was arbitrary, we also get independence.
For a general stopping time T, consider the approximating sequence of simple stopping times T_n ↘ T defined above. For any A ∈ F_T and for all open C_1, . . . , C_m ∈ B(R) we have

P(A ∩ ⋂_{i≤m} {B_{T+t_i} − B_T ∈ C_i}) = lim_{n→∞} P(A ∩ ⋂_{i≤m} {B_{T_n+t_i} − B_{T_n} ∈ C_i})
= lim_{n→∞} P(⋂_{i≤m} {B_{t_i} ∈ C_i}) P(A)
= P(⋂_{i≤m} {B_{t_i} ∈ C_i}) P(A)

since A ∈ F_{T_n} for all n, the simple case applies for each T_n, and B has continuous paths. □
1.5.13 Example. This theorem is not true for general random times. Take T to
be the last time before 1 that B t is zero.
1.5.14 Lemma. Under P^0, |B| = {|B_t|, F_t} is a Markov process with transition density P^0[|W_{t+s}| ∈ dy | |W_t| = x] = p_+(s; x, y) dy, where p_+(s; x, y) = p(s; x, y) + p(s; x, −y).
1.5.15 Lemma. Define Yt = M t − B t . Under P 0 , the process {Yt , F t } is a Markov
process and has transition density
P 0 [Yt+s ∈ d y | Yt = z] = (p(s; z, y) + p(s; z, − y))d y = p+ (s; z, y)d y.
PROOF: For s > 0, t ≥ 0, b ≥ a, b ≥ 0,

P^0[B_{t+s} ≥ a, M_{t+s} ≤ b | F_t] = P^0[B_{t+s} ≥ a, M_t ≤ b, sup_{u∈[0,s]} B_{t+u} ≤ b | F_t]
= 1_{{M_t ≤ b}} P^0[B_{t+s} ≥ a, sup_{u∈[0,s]} B_{t+u} ≤ b | F_t]
= 1_{{M_t ≤ b}} P^0[B_{t+s} ≥ a, sup_{u∈[0,s]} B_{t+u} ≤ b | B_t]
since {B t } is a Markov process under P 0 . This calculation shows that (M t , B t ) is a
Markov process under P 0 . Since Yt is a function of M t and B t , it follows that for
every Γ ∈ B(R),
P 0 [Yt+s ∈ Γ | F t ] = P 0 [Yt+s ∈ Γ | B t , M t ].
It suffices to show that
P 0 [Yt+s ∈ d y | B t = x, M t = m] = p+ (s; m − x, y)d y.
For b > m > x, b ≥ a, m ≥ 0,

P^0[B_{t+s} ∈ da, M_{t+s} ∈ db | B_t = x, M_t = m]
= P^0[B_{t+s} ∈ da, max_{0≤u≤s} B_{t+u} ∈ db | B_t = x, M_t = m]
= P^x[B_s ∈ da, M_s ∈ db]
= P^0[B_s ∈ da − x, M_s ∈ db − x]
= (2(2b − a − x)/√(2πs³)) e^{−(2b−a−x)²/(2s)} da db.

For m > x, m ≥ a, m ≥ 0,

P^0[B_{t+s} ∈ da, M_{t+s} = m | B_t = x, M_t = m]
= P^0[B_{t+s} ∈ da, sup_{u∈[0,s]} B_{t+u} ≤ m | B_t = x, M_t = m]
= P^x[B_s ∈ da, M_s ≤ m]
= P^x[B_s ∈ da] − P^x[B_s ∈ da, M_s ≥ m]
= (1/√(2πs)) ( e^{−(a−x)²/(2s)} − e^{−(2m−a−x)²/(2s)} ) da.
Therefore, since either the maximum increases to a new level b > m over the
interval [t, t + s] or it stays the same, we have
P^0[Y_{t+s} ∈ dy | B_t = x, M_t = m]
= ∫_m^∞ P^0[B_{t+s} ∈ b − dy, M_{t+s} ∈ db | B_t = x, M_t = m]
  + P^0[B_{t+s} ∈ m − dy, M_{t+s} = m | B_t = x, M_t = m]
= ∫_m^∞ (2(b + y − x)/√(2πs³)) e^{−(b+y−x)²/(2s)} db dy
  + (1/√(2πs)) ( e^{−(m−y−x)²/(2s)} − e^{−(m+y−x)²/(2s)} ) dy
= p_+(s; m − x, y) dy. □
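As a numerical sanity check (not part of the original notes): for x ≥ 0 the kernel p_+(s; x, ·) is a probability density on [0, ∞), since folding the Gaussian density p(s; x, ·) at 0 preserves total mass. The grid parameters below are arbitrary.

```python
# Trapezoidal integration of p_+(s; x, y) = p(s; x, y) + p(s; x, -y) over y >= 0.
import math

def p(s, x, y):
    """One-dimensional Brownian transition density."""
    return math.exp(-(y - x) ** 2 / (2 * s)) / math.sqrt(2 * math.pi * s)

def p_plus(s, x, y):
    return p(s, x, y) + p(s, x, -y)

s, x, L, n = 1.0, 0.7, 30.0, 300_000
h = L / n
total = 0.5 * h * (p_plus(s, x, 0.0) + p_plus(s, x, L))
total += h * sum(p_plus(s, x, i * h) for i in range(1, n))
# total should be very close to 1
```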
We have the following Markov processes:
(i) Brownian motion B t
(ii) Poisson process Nt
(iii) Reflected Brownian motion |B t |
(iv) (M t , B t )
(v) Yt = M t − B t
What about
Tb = inf{t ≥ 0 | B t = b} = inf{t ≥ 0 | B t ≥ b} = inf{t ≥ 0 | M t ≥ b}?
1.5.16 Theorem. {T_b, 0 < b < ∞} is a non-decreasing, left-continuous (strong Markov) process that has stationary independent increments and is purely discontinuous (i.e. there is no interval on which b ↦ T_b is right-continuous).
PROOF: Notice that

{T_b ≤ t} = {M_t ≥ b} = ⋂_{n∈N} {M_t ≥ b − 1/n} = ⋂_{n∈N} {T_{b−1/n} ≤ t},

which implies left-continuity. That it is non-decreasing is obvious.
The time shift operator is defined by θ_s(ω)(t) := ω(s + t) for s, t ≥ 0; it can also be defined for random times. By continuity of Brownian motion we have, for 0 < a < b, T_b = T_a + T_b ∘ θ_{T_a} a.s. For all bounded measurable functions f,

E[f(T_b − T_a) | F_{T_a}] = E[f(T_b ∘ θ_{T_a}) | F_{T_a}] = E^a[f(T_b)] = E[f(T_{b−a})]

using the strong Markov property and the fact that Brownian motion has stationary, independent increments. This implies that T_b − T_a is independent of F_{T_a} and has the same distribution as T_{b−a}.
For the last part it is enough to show that for p, q ∈ Q with p < q,

P[ω | b ↦ T_b(ω) is continuous on [p, q]] = 0.

However, b ↦ T_b is continuous on this interval if and only if M_t is strictly increasing on [T_p, T_q], and for this to happen we would require B_{T_p+t} − B_{T_p} to be strictly increasing on that interval. But this last process is a Brownian motion by the strong Markov property, and so is not strictly increasing on any interval. □
1.5.17 Lemma. E^0[exp(−uT_b)] = exp(−b√(2u)).
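This Laplace transform can be checked against the well-known first-passage density P(T_b ∈ dt) = b/√(2πt³) · e^{−b²/(2t)} dt (an assumption here; it appears later, with drift, in Example 2.3.2). The quadrature parameters below are mine.

```python
# Numerically verify: int_0^oo e^{-u t} * b/sqrt(2 pi t^3) * e^{-b^2/(2t)} dt = e^{-b sqrt(2u)}.
import math

def first_passage_density(b, t):
    return b / math.sqrt(2 * math.pi * t ** 3) * math.exp(-b * b / (2 * t))

u, b = 1.0, 1.0
T, n = 80.0, 400_000
h = T / n
integral = 0.5 * h * math.exp(-u * T) * first_passage_density(b, T)
for i in range(1, n):
    t = i * h
    integral += h * math.exp(-u * t) * first_passage_density(b, t)
expected = math.exp(-b * math.sqrt(2 * u))
```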
1.5.18 Proposition. Almost surely, the set Z = {t ∈ [0, ∞) | B_t = 0} has no isolated points.

PROOF: Earlier we showed that zero is not an isolated point; indeed

E^0[exp(−u(t + T_0 ∘ θ_t))] = e^{−ut} E^0[E^{B_t}[exp(−uT_0)]] = e^{−ut} E^0[exp(−|B_t|√(2u))]

and as t → 0, the right hand side goes to one. Therefore by Fatou's Lemma,

P[lim inf_{t↘0}(t + T_0 ∘ θ_t) = 0] ≥ lim inf_{t↘0} P[(t + T_0 ∘ θ_t) = 0] = 1,

so zero is a limit point of Z a.s. For any rational q, define the time d_q := q + T_0 ∘ θ_q, the first point in Z after q. Since B_{d_q} = 0, the process {B_{d_q+t} | t ≥ 0} is a standard Brownian motion by the strong Markov property. Therefore the set

⋃_{q∈Q} {d_q is not a limit point of Z}

has P-measure zero. If h ∈ Z and h = d_q for some rational q then h is a limit point of Z. If not, choose a sequence {q_n} ⊆ Q such that q_n ↗ h. Then d_{q_n} ∈ [q_n, h], so d_{q_n} → h and h is a limit point of Z. □
2 Martingales

2.1 Martingale convergence theorem
The definitions of sub- and super-martingales are analogous in discrete and continuous time.
2.1.1 Definition. A stochastic process X is integrable if E |X a | < ∞ for all a ∈ I.
An adapted stochastic process {X a , Fa | a ∈ I} is a
(i) sub-martingale if E[X b | Fa ] ≥ X a for all a < b;
(ii) super-martingale if E[X b | Fa ] ≤ X a for all a < b;
(iii) martingale if it is both a sub- and super-martingale.
2.1.2 Doob’s Upcrossing Lemma. Let X be a super-martingale and let UN [a, b]
be the number of up-crossings of [a, b] by time N . Then
(b − a) E UN [a, b] ≤ E(X N − a)−
2.1.3 Definition. A predictable process is a process {C_n, F_n} such that C_n is F_{n−1}-measurable for all n.
PROOF: Let {X_n, F_n} be a super-martingale, C_1 = 1_{{X_0 < a}}, and

C_n = 1_{{C_{n−1}=1}} 1_{{X_{n−1} ≤ b}} + 1_{{C_{n−1}=0}} 1_{{X_{n−1} < a}}

for n > 1, and Y_n := Σ_{i=1}^n C_i (X_i − X_{i−1}). Then {C_n, F_n} is predictable and {Y_n, F_n} is a super-martingale. We have the fundamental inequality

Y_N ≥ (b − a) U_N[a, b] − (X_N − a)^−.

Therefore we may conclude that

0 ≥ E[Y_N] ≥ (b − a) E U_N[a, b] − E(X_N − a)^−. □
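The objects in this proof are easy to test on a concrete path. The sketch below (function names mine) counts upcrossings with the buy-low/sell-high strategy C_n and checks the fundamental inequality Y_N ≥ (b − a)U_N[a, b] − (X_N − a)^−.

```python
def upcrossings(xs, a, b):
    """Count moves of the path xs from below a to above b."""
    count, holding = 0, False
    for x in xs:
        if not holding and x < a:
            holding = True
        elif holding and x > b:
            count += 1
            holding = False
    return count

def winnings(xs, a, b):
    """Y_N = sum_i C_i (X_i - X_{i-1}) with C_i as in the proof."""
    C = 1 if xs[0] < a else 0          # C_1
    Y = 0.0
    for n in range(1, len(xs)):
        Y += C * (xs[n] - xs[n - 1])
        # C_{n+1} computed from C_n and X_n
        C = 1 if (C == 1 and xs[n] <= b) or (C == 0 and xs[n] < a) else 0
    return Y

xs, a, b = [0.0, -1.0, 2.0, -2.0, 3.0, 0.0], -0.5, 1.5
U = upcrossings(xs, a, b)                    # two completed upcrossings
Y = winnings(xs, a, b)
lower = (b - a) * U - max(a - xs[-1], 0.0)   # (b - a) U_N - (X_N - a)^-
```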
Now for the continuous-time analog. For a < b and F ⊆ [0, ∞) finite, let
(i) τ1 = min{t ∈ F | X t ≤ a}
(ii) σ j = min{t ∈ F | t ≥ τ j , X t > b}
(iii) τ j+1 = min{t ∈ F | t ≥ σ j , X t < a}
Given an interval I ⊆ [0, ∞) let
U I (a, b; X ) = sup{U F [a, b] | F ⊆ I finite}.
2.1.4 Theorem. Let {X_t, F_t} be a right-continuous sub-martingale, let σ ≤ τ be bounded stopping times, and let a < b and λ > 0.
(i) λ P[sup t∈[σ,τ] X t ≥ λ] ≤ E X τ+ .
(ii) λ P[inf t∈[σ,τ] X t ≤ −λ] ≤ E X τ+ − E X σ .
(iii) Up-crossings: (b − a) E U_{[σ,τ]}(a, b; X) ≤ |a| + E X_τ^+.
(iv) If X is non-negative and p > 1 then E[(sup_{t∈[σ,τ]} X_t)^p] ≤ (p/(p−1))^p E X_τ^p.
PROOF: Exercise. (Approximation arguments.) □
2.1.5 Doob’s Forward Convergence Theorem. Let {X n , Fn } be a super-martingale
bounded in L 1 (i.e. supn E |X n | < ∞). Then X ∞ = limn→∞ X n exists and is finite
a.s., and it is F∞ -measurable.
PROOF: Let

Λ = {X_n does not converge to a limit in [−∞, ∞]}
  = {lim inf_n X_n ≠ lim sup_n X_n}
  = ⋃_{a<b∈Q} {lim inf_n X_n < a < b < lim sup_n X_n}
  =: ⋃_{a<b∈Q} Λ_{a,b}.

But Λ_{a,b} ⊆ {lim_{N→∞} U_N[a, b] = ∞}. The probability of this set is zero, so P(Λ) = 0. By Fatou's Lemma

E|X_∞| = E[lim inf_n |X_n|] ≤ lim inf_n E|X_n| < ∞. □
2.1.6 Doob’s Forward Convergence Theorem. Let {X t , F t } be a cadlag supermartingale bounded in L 1 (i.e. sup t E |X t | < ∞). Then X ∞ = lim t→∞ X t exists and
is finite a.s., and it is F∞ -measurable.
2.1.7 Discrete Optional Sampling Theorem. Let M be a uniformly integrable
martingale and T be a stopping time. Then E[M∞ | F T ] = M T a.s.
2.1.8 Corollary. E |M T | < ∞ and E M T = E M0 .
2.1.9 Corollary. If S is another stopping time and S ≤ T then E[M T | FS ] = MS .
2.1.10 Continuous Optional Sampling Theorem. Let {X_t, F_t} be a right-continuous sub-martingale with a last element and let S ≤ T be {F_t}-optional times. Then E[X_T | F_{S+}] ≥ X_S a.s. If S is a stopping time then we may replace F_{S+} by F_S.
PROOF: Define

S_n = ∞ on {S = ∞}, and S_n = k2^{−n} on {(k − 1)2^{−n} ≤ S < k2^{−n}},

and T_n similarly. Then S_n and T_n are stopping times with S_n ↘ S and T_n ↘ T, and by the discrete optional sampling theorem, E[X_{T_n} | F_{S_n}] ≥ X_{S_n}. For all A ∈ F_{S_n},

∫_A X_{T_n} dP ≥ ∫_A X_{S_n} dP.

Therefore the inequality also holds for all A ∈ ⋂_{n=1}^∞ F_{S_n} = F_{S+}. (Also, since S ≤ S_n, we have F_S ⊆ F_{S_n}.) Observe that {X_{S_n}, F_{S_n}} is a backward sub-martingale and E X_{S_n} is decreasing and bounded below by E X_0, so {X_{S_n}} is uniformly integrable, and likewise {X_{T_n}}. Since the process is right continuous,

X_S = lim_{n→∞} X_{S_n} and X_T = lim_{n→∞} X_{T_n},

so we can take limits in the inequality above and interchange limit and integral to get E[X_T | F_{S+}] ≥ X_S. □
2.2 Continuous Martingales
For this section let {X t , F t } be a process with right continuous paths.
Remark.
(i) We automatically know that it has limits from the left a.s., since

{∃ t ∈ [0, n] : lim inf_{s↗t} X_s < lim sup_{s↗t} X_s} ⊆ ⋃_{a<b∈Q} {ω | U_{[0,n]}(a, b; X(ω)) = ∞}.
(ii) Recall that {F_t} is said to satisfy the usual conditions if F_0 contains all the P-negligible sets and {F_t} is right continuous. If {X_t, F_t} is a sub-martingale, {F_t} satisfies the usual conditions, and t ↦ E X_t is right continuous, then X has a right continuous modification such that {X_t, F_t} is a sub-martingale.
(iii) Continuous martingale results that are derived from discrete martingale results (e.g. OST) rely on approximation arguments for which a key ingredient
is the backward sub-martingale convergence theorem. This theorem (given
below) allows one to justify limits of the following kind.
Let T_n ↘ T and S_n ↘ S, and let {X_t, F_t} be a right-continuous sub-martingale. If X_{S_n} ≤ E[X_{T_n} | F_{S_n}] for all n then X_S ≤ E[X_T | F_{S+}]. Namely, for all A ∈ F_{S+} ⊆ F_{S_n} we have

lim_{n→∞} ∫_A X_{T_n} dP = ∫_A lim_{n→∞} X_{T_n} dP = ∫_A X_T dP,

where the first equality uses uniform integrability (which follows from the backward sub-martingale convergence theorem, using the fact that E X_{T_n} ≥ E X_0) and the second uses right continuity.
2.2.1 Theorem (BS-MCT). Let {F_n}_{n=1}^∞ be a decreasing sequence of σ-algebras and suppose that {X_n, F_n} is a backward sub-martingale (i.e. E|X_n| < ∞ and E[X_n | F_{n+1}] ≥ X_{n+1} a.s. for all n). If lim_{n→∞} E X_n > −∞ then {X_n} is u.i.
PROOF: Step 1: {X_n^+, F_n} is a backward sub-martingale. This is the case since X_{n+1} ≤ E[X_n | F_{n+1}] implies, since x ↦ x^+ is non-decreasing and convex,

X_{n+1}^+ ≤ (E[X_n | F_{n+1}])^+ ≤ E[X_n^+ | F_{n+1}]

by the conditional Jensen inequality.
Step 2: lim_{λ→∞} sup_{n≥1} P[|X_n| > λ] = 0, since

λ P[|X_n| > λ] ≤ E|X_n| = −E X_n + 2 E X_n^+ < ∞

by the assumed bound and the fact that E X_n^+ ≤ E X_1^+.
Step 3: {X_n^+} is u.i. Indeed, since {X_n^+, F_n} is a backward sub-martingale we have E[X_1^+ | F_n] ≥ X_n^+, so

∫_{{|X_n|>λ}} X_n^+ dP ≤ ∫_{{|X_n|>λ}} E[X_1^+ | F_n] dP = ∫_{{|X_n|>λ}} X_1^+ dP.

Step 4: {X_n^−} is u.i. Indeed, for λ > 0 and n > m we have

0 ≥ ∫_{{X_n<−λ}} X_n dP = E X_n − ∫_{{X_n≥−λ}} X_n dP ≥ E X_n − ∫_{{X_n≥−λ}} X_m dP = E X_n − E X_m + ∫_{{X_n<−λ}} X_m dP.

Given ε > 0, there is m large enough so that 0 ≤ E X_m − E X_n ≤ ε/2 for all n > m (since {E X_n} is bounded and monotonic). For that m choose λ > 0 such that

sup_{n>m} ∫_{{X_n<−λ}} |X_m| dP < ε/2,

so sup_{n>m} ∫_{{X_n^−>λ}} X_n^− dP < ε and {X_n^−} is u.i. □
2.2.2 Theorem (Convergence). If {X_t, F_t} is a right-continuous sub-martingale and sup_{t≥0} E X_t^+ < ∞, then X_∞ = lim_{t→∞} X_t exists a.s. and is in L¹.
2.2.3 Corollary. If {X t , F t } t≥0 is a right continuous non-negative super-martingale
then X ∞ = lim t→∞ X t exists and is in L 1 .
2.2.4 Optional Sampling Theorem. If {X_t, F_t} is a right-continuous sub-martingale with a last element (i.e. X_∞ = lim_{t→∞} X_t exists a.s. and is in L¹) and S ≤ T are {F_t}-optional times, then E[X_T | F_{S+}] ≥ X_S a.s. If S is a stopping time then we may replace F_{S+} by F_S.
2.3 Applications
Note first that if {B t , F t } is a Brownian motion then it is a martingale. Indeed,
E[B t − Bs + Bs | Fs ] = E[B t − Bs ] + Bs = Bs .
2.3.1 Lemma. Let τ = inf{t ≥ 0 | B_t ∉ (a, b)}, where a < 0 < b. Then
(i) P(B_τ = b) = −a/(b − a);
(ii) E[τ] = −ab.

PROOF: τ is a stopping time, but we cannot naively apply the OST since Brownian motion does not have a last element. Instead we look at the stopped process {B_{τ∧t}}, which is a right continuous martingale with last element. Applying the OST we get

0 = E B_{τ∧n} = b P[B_τ = b, τ ≤ n] + a P[B_τ = a, τ ≤ n] + E[B_n; τ > n].

Taking limits as n → ∞ we get

0 = b P[B_τ = b] + a P[B_τ = a].

For the next part, we show that M_t := (B_t − a)(b − B_t) + t is a martingale. (This is not too hard.) Then again applying the OST to the stopped process,

−ab = E M_{τ∧n} = E[τ ∧ n] + E[(B_{τ∧n} − a)(b − B_{τ∧n})],

and taking limits we get the result. □
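For the simple symmetric random walk the same martingale arguments give exactly P(hit b before a) = −a/(b − a) and E[exit time] = −ab for integer barriers a < 0 < b, which can be illustrated by simulation (a sketch of mine; seed and tolerances arbitrary):

```python
import random

random.seed(0)
a, b, trials = -2, 3, 20_000
hits_b = 0
total_steps = 0
for _ in range(trials):
    x, steps = 0, 0
    while a < x < b:          # walk until it leaves (a, b)
        x += random.choice((-1, 1))
        steps += 1
    hits_b += (x == b)
    total_steps += steps
p_hat = hits_b / trials            # close to -a/(b - a) = 0.4
mean_steps = total_steps / trials  # close to -a*b = 6
```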
2.3.2 Example. Let X_t = B_t + ct (Brownian motion with drift). We are interested in H_x = inf{t > 0 | X_t = x}. We need the fact that exp(θB_t − θ²t/2) is a martingale. Fix λ > 0. Then from the exponential martingale it follows that

exp(θX_t − λt) = exp(θB_t − (λ − θc)t)

is a martingale provided that λ − θc = θ²/2. Let β, α = −c ± √(c² + 2λ). Note that α < 0 < β. Thus for any λ > 0 and β as given, the martingale exp(βX_t − λt) is bounded on [0, H_x] (for x > 0). We can use the OST to conclude that

1 = E[exp(βX_{H_x} − λH_x)] = e^{βx} E[e^{−λH_x}],

from which it follows that E[e^{−λH_x}] = exp(−x(√(c² + 2λ) − c)). The Laplace transform can be inverted explicitly to give

P(H_x ∈ dt) = (x/√(2πt³)) exp(−(x − ct)²/(2t)) dt.

Take limits as λ ↘ 0 in the Laplace transform to conclude that

P[H_x < ∞] = 1 if c ≥ 0, and e^{−2|c|x} if c < 0.

Now we calculate E[e^{−λT}] where T = H_a ∧ H_b for a < 0 < b. Recall that the θ we used previously was found as a root of λ − θc = θ²/2. We know that any process of the form

M_t = C_1 e^{αX_t − λt} + C_2 e^{βX_t − λt}

is a martingale for any constants C_1, C_2. Choose M_t of the form M_t = f(X_t)e^{−λt} such that f(a) = f(b), say

M_t = (e^{βb} − e^{βa}) e^{αX_t − λt} + (e^{αa} − e^{αb}) e^{βX_t − λt}.

With this choice, M_t is bounded on [0, T], and so the OST implies

f(0) = E[M_0] = E[M_T] = f(a) E[e^{−λT}],

so

E[e^{−λT}] = (e^{βb} − e^{βa} + e^{αa} − e^{αb}) / (e^{βb+αa} − e^{βa+αb}).

In the special case c = 0 and a = −b this reduces to E[e^{−λT}] = sech(b√(2λ)).
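The inversion claim can be checked numerically: integrating e^{−λt} against the displayed density of H_x should reproduce exp(−x(√(c² + 2λ) − c)). A sketch (parameters arbitrary; c < 0 is chosen so that the density is defective):

```python
import math

def h_density(x, c, t):
    """Density of H_x for X_t = B_t + c t (defective when c < 0)."""
    return x / math.sqrt(2 * math.pi * t ** 3) * math.exp(-(x - c * t) ** 2 / (2 * t))

x, c, lam = 1.0, -0.5, 0.8
T, n = 120.0, 600_000
h = T / n
val = 0.5 * h * math.exp(-lam * T) * h_density(x, c, T)
for i in range(1, n):
    t = i * h
    val += h * math.exp(-lam * t) * h_density(x, c, t)
expected = math.exp(-x * (math.sqrt(c * c + 2 * lam) - c))
```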
2.3.3 Law of the Iterated Logarithm.

P[ lim sup_{t↘0} B_t / √(2t log log(1/t)) = 1 ] = 1.
PROOF: Write h(t) = √(2t log log(1/t)). The first step is to show that

lim sup_{t↘0} B_t/h(t) ≤ 1

P-a.s. Apply Doob's maximal inequality to the exponential martingale Z_t = exp(αB_t − α²t/2), yielding for α, β > 0

P( sup_{s∈[0,t]} (B_s − αs/2) > β ) = P( sup_{s∈[0,t]} Z_s > e^{αβ} ) ≤ e^{−αβ} E[Z_t] = e^{−αβ}.

Now fix θ, δ ∈ (0, 1) and apply the inequality with t = θ^n, α = θ^{−n}(1 + δ)h(θ^n), and β = h(θ^n)/2. Then

αβ = (1 + δ) θ^{−n} h²(θ^n)/2 = (1 + δ) log log θ^{−n}

and e^{αβ} = (n log(1/θ))^{1+δ}, so

P( sup_{s∈[0,θ^n]} (B_s − s(1 + δ)θ^{−n}h(θ^n)/2) ≥ h(θ^n)/2 ) ≤ C n^{−(1+δ)}.

By the Borel-Cantelli Lemma there is Ω_{θ,δ} ∈ F with P(Ω_{θ,δ}) = 1 such that for all ω ∈ Ω_{θ,δ} there is N_{θ,δ}(ω) such that for all n ≥ N_{θ,δ}(ω),

sup_{s∈[0,θ^n]} (B_s − s(1 + δ)θ^{−n}h(θ^n)/2) < h(θ^n)/2.

Thus for θ^{n+1} < t ≤ θ^n,

B_t ≤ sup_{s∈[0,θ^n]} B_s ≤ (2 + δ)h(θ^n)/2 ≤ (2 + δ)θ^{−1/2}h(t)/2,

where the last inequality uses the fact that h(θ^n) ≤ θ^{−1/2}h(θ^{n+1}) ≤ θ^{−1/2}h(t), so

lim sup_{t↘0} B_t/h(t) ≤ (1 + δ/2)θ^{−1/2}.

Letting δ ↘ 0 and θ ↗ 1 along countable sequences completes the proof of the first step.
For the second step, see KS pp. 112-113. □
Recall the generator of a Markov process is G = lim_{t↘0} (P_t − I)/t, where P_t f(x) = E[f(X_t) | X_0 = x]. The Brownian transition function is

p(t; x, y) = (2πt)^{−d/2} exp(−‖x − y‖²/(2t))

and satisfies ∂p_t/∂t = ½∆p_t. For Brownian motion with a time variable added, check that G f(t, x) = ∂f/∂t + ½∆f. Define

C_t^f = f(t, B_t) − f(0, B_0) − ∫_0^t G f(s, B_s) ds.

If {B_t, F_t} is a Brownian motion and f ∈ C^{1,2} then C_t^f is an {F_t}-martingale.
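The heat-equation property of p can be confirmed with finite differences in d = 1 (a quick check of mine; step sizes chosen to balance truncation against round-off):

```python
import math

def p(t, x, y):
    """d = 1 Brownian transition density."""
    return math.exp(-(x - y) ** 2 / (2 * t)) / math.sqrt(2 * math.pi * t)

t, x, y = 1.3, 0.4, -0.2
dt, dx = 1e-5, 1e-3
dpdt = (p(t + dt, x, y) - p(t - dt, x, y)) / (2 * dt)
half_lap = 0.5 * (p(t, x + dx, y) - 2 * p(t, x, y) + p(t, x - dx, y)) / dx ** 2
# dpdt and half_lap should agree to several decimal places
```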
2.4 Lévy processes
2.4.1 Definition. A Poisson process {N_t, F_t} with intensity λ is a right continuous {F_t}-adapted process with N_0 = 0 and N_t − N_s independent of F_s with N_t − N_s ∼ Poisson(λ(t − s)), i.e. for i = 0, 1, . . . ,

P[N_t − N_s = i] = e^{−λ(t−s)} λ^i (t − s)^i / i!.

The Poisson process has stationary independent increments, and {N_t − λt} and {e^{αN_t − λt(e^α−1)}}, for any α ∈ R, are martingales. This is because E[e^{αN_t}] = e^{λt(e^α−1)} is the moment generating function of the Poisson distribution.
2.4.2 Definition. A Lévy process is a right-continuous process with stationary independent increments.
2.4.3 Examples.
(i) Brownian motion (with drift)
(ii) Poisson process
(iii) Ta , the hitting time of Brownian motion to a level a.
2.4.4 Definition. A probability measure µ on R is said to be infinitely divisible if for all n there is a probability measure ν on R such that µ = ν^{∗n}. Equivalently, if Y ∼ µ then for every n there are i.i.d. r.v.'s Y_i ∼ ν such that Y =(d) Σ_{i=1}^n Y_i.
If {X_t, F_t} is a Lévy process with X_0 = 0 then for all t, X_t is infinitely divisible since it may be written as a sum of n i.i.d. increments,

X_t = Σ_{i=1}^n (X_{ti/n} − X_{t(i−1)/n}).

Conversely, given any infinitely divisible r.v. Y there is a Lévy process {X_t, F_t} such that Y =(d) X_1.
Analytical methods can be used to show that if µ is infinitely divisible then its Fourier transform is equal to e^{Ψ(θ)}, where

Ψ(θ) = iβθ − σ²θ²/2 + ∫ ( e^{iθx} − 1 − iθx/(1 + x²) ) ν(dx),

in which the first term is the drift, the second corresponds to Brownian motion, and the integral to a pure jump (Poisson-type) part; here ν is a Radon measure on R∖{0} such that ∫ x²/(1 + x²) ν(dx) < ∞. This immediately gives a complex exponential martingale associated with a Lévy process. This is the Lévy-Khintchine formula.
2.4.5 Definition. A r.v. Y is stable if for all n there are independent r.v.'s Y_1, . . . , Y_n with the same law as Y and constants a_n > 0 and b_n such that Y_1 + ⋯ + Y_n =(d) a_n Y + b_n.

2.4.6 Lemma. Stable r.v.'s are infinitely divisible.

2.4.7 Exercise. It must be the case that a_n = n^{1/α} for some α ∈ (0, 2]. When α = 2 we get the Gaussian distribution. When α ∈ (0, 2) then σ = 0 in the Lévy-Khintchine formula and the Lévy measure has density

(m_1 1_{{x<0}} + m_2 1_{{x>0}}) |x|^{−(1+α)}

for some m_1, m_2 ≥ 0.
2.5 Doob-Meyer decomposition
2.5.1 Lemma. Any non-constant continuous martingale {M t } a.s. has infinite variation.
PROOF: Let V_t be the variation of M on [0, t] and define

S_n := inf{s ≥ 0 | V_s ≥ n} ∧ inf{s ≥ 0 | |M_s| ≥ n}.

Then the stopped process M^{S_n} is of bounded variation and is a martingale by the OST. Therefore it is enough to prove that M is constant whenever it and its variation are bounded. Assume further that M_0 = 0 a.s. Fix t < ∞ and let Π = {0 = t_0 < t_1 < ⋯ < t_k = t} be a subdivision of [0, t]. Then

E[M_t²] = E[ Σ_{i=0}^{k−1} (M_{t_{i+1}}² − M_{t_i}²) ] = E[ Σ_{i=0}^{k−1} (M_{t_{i+1}} − M_{t_i})² ].

As a result

E[M_t²] ≤ E[ V_t sup_i |M_{t_{i+1}} − M_{t_i}| ],

and since M is of bounded variation and continuous, this quantity goes to zero as the mesh of the partition goes to zero, so M ≡ 0. □
Remark. Continuity is required for this proof. The compensated Poisson process
is a right continuous process of bounded variation (it is seen to be of bounded
variation since it is the difference of two increasing processes).
2.5.2 Definition. {A_n} is said to be an increasing sequence if 0 = A_0 ≤ A_1 ≤ ⋯ P-a.e. and E[A_n] < ∞ for all n ≥ 0. {A_t} is said to be an increasing process if A_0 = 0, t ↦ A_t is non-decreasing P-a.e., right continuous, and E A_t < ∞ for all t ≥ 0. Such a process is said to be integrable if E[A_∞] < ∞.
2.5.3 Definition. In discrete time, an increasing sequence {A_n} is said to be natural if for every bounded martingale {M_n},

E[M_n A_n] = E[ Σ_{k=1}^n M_{k−1} (A_k − A_{k−1}) ].

Let Y_n = Σ_{k=1}^n M_{k−1}(A_k − A_{k−1}). Then a sequence A is natural if and only if E[M_n A_n] = E[Y_n] for all n, if and only if {A_n} is predictable.
2.5.4 Definition. In continuous time, A is natural if for all bounded martingales {M_t},

E[M_t A_t] = E[ ∫_{(0,t]} M_{s−} dA_s ].
2.5.5 Lemma. E[ ∫_{(0,t]} M_s dA_s ] = E[ ∫_{(0,t]} M_{s−} dA_s ].

It will be a consequence of the definition of the stochastic integral that ∫_{(0,t]} A_s dM_s = M_t A_t − ∫_{(0,t]} M_{s−} dA_s, and so E[(A · M)_t] = 0.
2.5.6 Proposition. (In discrete time) an increasing random sequence is predictable if and only if it is natural.

PROOF: Suppose that A is natural and M is a bounded martingale. Let Y_n be the martingale transform, as defined above. Then

E[A_n(M_n − M_{n−1})] = E[Y_n] − E[Y_{n−1}] = 0. □
2.5.7 Definition. An increasing process {A_t} is said to be natural if

E[ ∫_{(0,t]} M_s dA_s ] = E[ ∫_{(0,t]} M_{s−} dA_s ]

for all bounded martingales {M_t, F_t}.

This is analogous to the discrete time version because

E[M_t A_t] = E[ ∫_{(0,t]} M_{s−} dA_s ]

(see text for the approximation argument). An increasing process is predictable if and only if it is natural. See notes for the proof of existence.
2.5.8 Definition. A right continuous process X is of class D (resp. class DL) if
{X τ }τ∈S (resp. {X τ }τ∈Sa for all a ∈ R+ ) is u.i., where S is the set of all finite
stopping times (resp. Sa is the set of all stopping times bounded by a).
2.5.9 Theorem. If X is a right continuous sub-martingale of class DL, then X =
M +A, where M is a martingale and A is a natural increasing (predictable) process.
If X is class D then M is u.i. and A is integrable.
2.5.10 Definition. A local martingale is a right continuous process M such that there exists a localizing sequence of stopping times {T_n} with T_n ↗ ∞ a.s. and such that (M − M_0)^{T_n} is a martingale for all n.
2.5.11 Theorem. A process X is a local sub-martingale if and only if it has a
decomposition X = M + A, where M is a local martingale and A is a locally integrable increasing process. The decomposition is unique when A is required to be
predictable/natural.
2.5.12 Definition. A process X is a semi-martingale if X = M + A, where M is a
local martingale and A is locally of finite variation.
2.5.13 Definition. A process X is said to be regular if for all a > 0 and every nondecreasing sequence of stopping times {Tn } bounded by a, if T = limn→∞ Tn then
limn→∞ E[X Tn ] = E[X T ].
A continuous sub-martingale is regular.
2.5.14 Theorem. For a right continuous sub-martingale X of class DL, the compensator is continuous if and only if X is regular.
2.5.15 Lemma. Non-negative sub-martingales are of class DL.

PROOF: Fix a > 0 and suppose that T is a stopping time with P[T ≤ a] = 1. Then apply the OST to {X_{T∧a}} to get E[X_a | F_T] ≥ X_T. Multiply both sides by 1_{{X_T>λ}} and take expectations to get

E[X_T 1_{{X_T>λ}}] ≤ E[X_a 1_{{X_T>λ}}].

Since X_a ∈ L¹ and

P[X_T > λ] ≤ (1/λ) E[X_T] ≤ (1/λ) E[X_a] → 0 as λ → ∞,

it follows that {X_T}_{T∈S_a} is u.i. □
3 Stochastic Integration
Naive stochastic integration (i.e. via Riemann sums) is impossible. Let x be a right continuous function on [0, 1] and Π_n a refining sequence of dyadic partitions with ‖Π_n‖ → 0. What conditions are needed on x so that the sums S_n = Σ_{Π_n} h(t_k)(x(t_{k+1}) − x(t_k)) converge to a finite limit for all continuous h?

3.0.16 Theorem. Finite variation is necessary.

PROOF: Let X = C[0, 1] and Y = R, and for h ∈ X let

T_n(h) = Σ_{Π_n} h(t_k)(x(t_{k+1}) − x(t_k)).

Construct h_n ∈ X such that h_n(t_k) = sgn(x(t_{k+1}) − x(t_k)) over Π_n and ‖h_n‖ = 1. For such an h_n we have T_n(h_n) = Σ_{Π_n} |x(t_{k+1}) − x(t_k)|, so sup_n ‖T_n‖ ≥ Var_{[0,1]}(x). On the other hand, if lim_{n→∞} T_n(h) exists for all h ∈ X then by the Banach-Steinhaus theorem sup_n ‖T_n‖ < ∞, so the total variation of x over [0, 1] is finite. □
3.1 Riemann-Stieltjes Integration
Let F V be the class of finite variation processes (differences of increasing processes) started at 0.
3.1.1 Theorem. Let A ∈ FV and let H be a (jointly) measurable process such that a.s. s ↦ H(s, ω) is continuous. Let Π_n = {T_k} be a sequence of random finite partitions of [0, t] such that lim_{n→∞} ‖Π_n‖ = 0. Then for any {S_k} with T_k ≤ S_k ≤ T_{k+1}, a.s.

lim_{n→∞} Σ_{Π_n} H_{S_k}(A_{T_{k+1}} − A_{T_k}) = ∫_0^t H_s dA_s.
3.1.2 Theorem. Let A ∈ FV be right continuous. For f ∈ C¹, the process (f(A_t))_{t≥0} is in FV and

f(A_t) = f(A_0) + ∫_0^t f′(A_{s−}) dA_s + Σ_{0<s≤t} ( ∆f(A_s) − f′(A_{s−})∆A_s ),

where ∆f(A_s) = f(A_s) − f(A_{s−}) and ∆A_s = A_s − A_{s−}.
3.1.3 Example. Let N be a Poisson process of parameter λ and M_t = N_t − λt be the compensated Poisson process. Let H be jointly measurable and (say) bounded. The natural way to define the integral of H with respect to M is

I_t^M(H) = ∫_0^t H_s dM_s = ∫_0^t H_s d(N_s − λs) = ∫_0^t H_s dN_s − λ ∫_0^t H_s ds.

Let {T_i} be the jump times of the Poisson process, so that N_t = Σ_{i=1}^∞ 1_{{T_i ≤ t}}. Then

I_t^M(H) = Σ_{n=1}^∞ H_{T_n} 1_{{T_n ≤ t}} − λ ∫_0^t H_s ds.
If H is continuous and adapted, then

E[I_t^M(H) − I_s^M(H) | F_s] = E[ ∫_s^t H_u dM_u | F_s ] = 0

applying the first theorem in the section, so the integral is seen to be a martingale. What if H is not continuous but right continuous? Let H̃_t = 1_{[0,T_1)}(t), so

∫_0^t H̃_s dM_s = Σ_{i=1}^∞ H̃_{T_i} 1_{{T_i ≤ t}} − λ ∫_0^t H̃_s ds = −λ(t ∧ T_1),

which is not a martingale.
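The failure of the martingale property is visible in the mean: E[−λ(t ∧ T_1)] = −(1 − e^{−λt}) ≠ 0, while a martingale started at 0 has mean 0. A seeded simulation (mine) confirms this:

```python
import math, random

random.seed(1)
lam, t, trials = 1.0, 2.0, 100_000
acc = 0.0
for _ in range(trials):
    T1 = random.expovariate(lam)   # first jump time of the Poisson process
    acc += -lam * min(t, T1)       # value of the integral at time t
mc_mean = acc / trials
exact_mean = -(1 - math.exp(-lam * t))   # = -lam * E[t ∧ T_1]
```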
Now for the harder case of continuous martingales (necessarily of unbounded variation). Let B be standard Brownian motion and consider a sequence {Π_n} of dyadic partitions of [0, ∞) with ‖Π_n‖ → 0 as n → ∞. Define

B_t^{(n)} = Σ_{Π_n} B_{t_k} 1_{(t_k, t_{k+1}]}(t).

We know B^{(n)} is càglàd, and B^{(n)} → B u.c.p., i.e. for all T,

sup_{t∈[0,T]} |B_t^{(n)} − B_t| → 0 in probability

as n → ∞. The martingale transform is

I_t^B(B^{(n)}) = Σ_{Π_n} B_{t_k}(B_{t_{k+1}} − B_{t_k})
= Σ_{Π_n} ( ½(B_{t_{k+1}} + B_{t_k})(B_{t_{k+1}} − B_{t_k}) − ½(B_{t_{k+1}} − B_{t_k})² )
= ½B_t² − ½ Σ_{Π_n} (B_{t_{k+1}} − B_{t_k})².

This converges u.c.p. to ½B_t² − ½t. Note that this is not what we would expect from the usual change of variable formula; there is an extra term of −½t:

∫_0^t B_s dB_s = ½B_t² − ½t.
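Both the quadratic variation term and the resulting formula can be seen on a simulated path (a sketch of mine; grid size and seed arbitrary). On a fine grid, Σ(∆B)² ≈ t, and the left-endpoint Riemann sum equals ½B_t² − ½Σ(∆B)² exactly, path by path.

```python
import math, random

random.seed(7)
t, n = 1.0, 200_000
dt = t / n
B = qv = riemann = 0.0
for _ in range(n):
    dB = random.gauss(0.0, math.sqrt(dt))
    riemann += B * dB      # left-endpoint sum  sum B_{t_k} (B_{t_{k+1}} - B_{t_k})
    B += dB
    qv += dB * dB          # quadratic variation  sum (Delta B)^2
```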
3.2 Construction of the Itô integral

See text, §3.1 and §3.2.
3.2.1 Definition. A process X is called simple if there is a strictly increasing sequence of real numbers {t_n}_{n≥0} with t_0 = 0 and lim_{n→∞} t_n = ∞, and a sequence of random variables {ξ_n}_{n≥0} with sup_{n≥0} |ξ_n| ≤ C < ∞ and ξ_n F_{t_n}-measurable for all n, such that

X_t = ξ_0 1_{{0}}(t) + Σ_{i=0}^∞ ξ_i 1_{(t_i, t_{i+1}]}(t).

This class of processes will be denoted L_0. We define the stochastic integral of X ∈ L_0 with respect to M by the martingale transform

I_t^M(X) = Σ_{i=0}^∞ ξ_i (M_{t∧t_{i+1}} − M_{t∧t_i}).

Some obvious properties of I_t^M for simple processes are
(i) I_0^M(X) = 0;
(ii) I_t^M(αX + βY) = α I_t^M(X) + β I_t^M(Y) (linearity);
(iii) {I_t^M(X), F_t}_{t≥0} is a martingale;
(iv) E[(I_t^M(X))²] = E[∫_0^t X_s² d[M]_s] (see below).

PROOF (of (iv)): Fill in all the details as an exercise:

E[(I_t^M(X))²] = E[∫_0^t X_s² d[M]_s]. □
It follows from the last property that

E[(I_t^M(X))² − (I_s^M(X))² | F_s] = E[ ∫_s^t X_u² d[M]_u | F_s ],

so if X ∈ L_0 and M ∈ M_2 then I^M(X) ∈ M_2 and

‖I^M(X)‖ = [X] = Σ_{n=0}^∞ 2^{−n} (1 ∧ [X]_n),

where [X]_n² = E[∫_0^n X_s² d[M]_s]. It is a fact that (M_2, ‖·‖) and (L, [·]) are complete metric spaces.
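Property (iv) is the discrete Itô isometry, which can be illustrated against a ±1 random walk M (for which d[M]_k = 1 per step); the integrand below is predictable because it looks only at the past. A seeded Monte Carlo sketch (mine):

```python
import random

random.seed(3)
steps, trials = 20, 50_000
acc_sq = acc_qv = 0.0
for _ in range(trials):
    M = I = qv = 0.0
    for _ in range(steps):
        xi = 1.0 if M > 0 else -1.0     # predictable bet on the current sign
        dM = random.choice((-1.0, 1.0))
        I += xi * dM                     # martingale transform I^M(X)
        qv += xi * xi                    # X^2 d[M] with d[M] = 1 per step
        M += dM
    acc_sq += I * I
    acc_qv += qv
lhs = acc_sq / trials    # estimates E[(I^M(X))^2]
rhs = acc_qv / trials    # estimates E[sum X^2 d[M]]  (= 20 exactly here)
```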
3.2.2 Lemma. Let X be a bounded, measurable, adapted process. Then there is a sequence {X^{(m)}}_{m≥1} of simple processes such that

sup_{T>0} lim_{m→∞} E[ ∫_0^T |X_t^{(m)} − X_t|² dt ] = 0.

PROOF: Fix T > 0.
(i) Suppose that X has continuous paths. Then X can be approximated by

X_t^{(m)} = X_0 1_{{0}}(t) + Σ_{k=0}^∞ X_{k2^{−m}} 1_{(k2^{−m}, (k+1)2^{−m}]}(t).

Indeed, X^{(m)} → X a.s. as m → ∞ and by the BCT [X − X^{(m)}]_T → 0.
(ii) Suppose that X is progressively measurable. Define, for t ∈ [0, T],

F_t = ∫_0^t X_s ds and X̃_t^{(m)} = m(F_t − F_{t−1/m}).

Note that for F to be well-defined we require only that X is measurable. F is continuous, so X̃^{(m)} is continuous for all m. Progressive measurability of F follows from that of X. Indeed, if

g : ([0, t] × Ω, B[0, t] ⊗ F_t) → (R, B(R))

is measurable then ∫_0^t g(s, ω) ds is also (B[0, t] ⊗ F_t)-measurable by Fubini. Therefore F is continuous and progressively measurable. For all ω ∈ Ω, X̃_t^{(m)}(ω) → X_t(ω) by the fundamental theorem of calculus. The BCT gives the rest, and standard diagonalization gives that X can be approximated by a sequence in L_0.
(iii) Let X be measurable and adapted. As before, F is continuous and measurable, but we cannot be sure that it is progressively measurable. We will show that F is indeed adapted. By KS Proposition 1.1.2, X has a progressively measurable modification Y. Let G_t = ∫_0^t Y_s ds for t ∈ [0, T]. We know from the second part that G is F_t-adapted. It suffices to show that F is a modification of G, since the filtration is complete. Fix t ∈ [0, T]. Then

{F_t ≠ G_t} ⊆ { ∫_0^T 1_{{X_s ≠ Y_s}} ds > 0 },

so

P[F_t ≠ G_t] ≤ ∫_0^T P[X_s ≠ Y_s] ds = 0.

Since the filtration is complete, F is adapted. Proceed as in the previous step. □
3.2.3 Proposition. If t ↦ [M]_t is absolutely continuous then L_0 is dense in (L(M), [·]_M).

PROOF: If X ∈ L(M) is bounded then the assertion essentially follows from the previous lemma. Choose a subsequence of {X^{(m)}} along which

{ lim_{k→∞} X^{(m_k)} = X }^c

has zero µ_B-measure, and therefore zero µ_M-measure. By BCT we have convergence in [·]_M. If X is not bounded then use DCT instead (truncate and take limits). □
Integrand                     Integrator
L(M)  (measurable, adapted)   M ∈ M_2^c, t ↦ [M]_t a.c. (Itô)
L*(M) (prog. measurable)      M ∈ M_2^c (Itô)
P(M)  (predictable)           M ∈ M_2 (Kunita-Watanabe)
Using localization we can replace M_2^c by M^{c,loc} and replace the condition

E[ ∫_0^T X_s² d[M]_s ] < ∞ by P[ ∫_0^T X_s² d[M]_s < ∞ ] = 1.

We must deal with questions like whether I(X^T) = (I(X))^T for stopping times T.
3.3 Characterization of the Stochastic Integral
For M ∈ M_2^c and X ∈ L*(M) we have shown that I_t^M(X) = ∫_0^t X_s dM_s is well-defined. We know that I^M(X) ∈ M_2^c with quadratic variation ∫_0^t X_s² d[M]_s.
What is the cross variation of I^M(X) and I^N(Y)? Recall that the cross variation may be characterized as the unique predictable process (of finite variation) such that

I^M(X) I^N(Y) − [I^M(X), I^N(Y)]

is a martingale. First, when X and Y are simple, suppose without loss of generality that

X = ξ_0 1_{{0}}(t) + Σ_{i=1}^∞ ξ_i 1_{(t_i, t_{i+1}]}(t) and Y = η_0 1_{{0}}(t) + Σ_{i=1}^∞ η_i 1_{(t_i, t_{i+1}]}(t)

over a common partition. Remember that

I_t^M(X) = Σ_{i=1}^∞ ξ_i (M_{t_{i+1}∧t} − M_{t_i∧t}) and I_t^N(Y) = Σ_{i=1}^∞ η_i (N_{t_{i+1}∧t} − N_{t_i∧t}).

Fix 0 ≤ s ≤ t < ∞ and suppose that n and m are such that t_m ≤ s < t_{m+1} and t_n ≤ t < t_{n+1} (in fact, suppose for now that s = t_m and t = t_{n+1}).
E[(I_t^M(X) − I_s^M(X))(I_t^N(Y) − I_s^N(Y)) | F_s]
= E[ Σ_{i=m}^n Σ_{j=m}^n ξ_i η_j (M_{t_{i+1}} − M_{t_i})(N_{t_{j+1}} − N_{t_j}) | F_s ]
= E[ Σ_{i=m}^n ξ_i η_i (M_{t_{i+1}} − M_{t_i})(N_{t_{i+1}} − N_{t_i}) | F_s ]
= Σ_{i=m}^n E[ ξ_i η_i (M_{t_{i+1}} − M_{t_i})(N_{t_{i+1}} − N_{t_i}) | F_s ]
= Σ_{i=m}^n E[ ξ_i η_i E[M_{t_{i+1}} N_{t_{i+1}} − M_{t_i} N_{t_i} | F_{t_i}] | F_s ]
= Σ_{i=m}^n E[ ξ_i η_i E[[M, N]_{t_{i+1}} − [M, N]_{t_i} | F_{t_i}] | F_s ]
= E[ Σ_{i=m}^n ξ_i η_i ([M, N]_{t_{i+1}} − [M, N]_{t_i}) | F_s ]
= E[ ∫_s^t X_u Y_u d[M, N]_u | F_s ].
3.3.1 Proposition. Let α, β, γ be right continuous functions [0, ∞) → R with α(0) = β(0) = γ(0) = 0. Let α be of finite variation and β and γ be increasing. Suppose further that for all s ≤ t we have

∫_s^t dα_u ≤ ( ∫_s^t dβ_u )^{1/2} ( ∫_s^t dγ_u )^{1/2}.

Then for any measurable functions f, g we have

∫_s^t |f g| d|α| ≤ ( ∫_s^t f² dβ )^{1/2} ( ∫_s^t g² dγ )^{1/2}.

PROOF: Monotone class theorem. □
3.3.2 Theorem (Kunita-Watanabe inequality, 1967). If M, N ∈ M_2^c, X ∈ L*(M), Y ∈ L*(N) then a.s.

∫_0^t |X_s Y_s| d|[M, N]|_s ≤ ( ∫_0^t X_s² d[M]_s )^{1/2} ( ∫_0^t Y_s² d[N]_s )^{1/2}.
PROOF: By the previous result we only need to show that there is a negligible set
Z such that
$$\int_s^t d|[M,N]|_u \le \Big(\int_s^t d[M]_u\Big)^{1/2} \Big(\int_s^t d[N]_u\Big)^{1/2}$$
holds path-wise for all s, t. Let Z be the null set such that if ω ∉ Z then
$$0 \le \int_s^t d[M + rN, M + rN]_u$$
for all r, s, t with s ≤ t and r, s, t ∈ Q. Then for ω ∉ Z,
$$0 \le [M + rN]_t - [M + rN]_s = r^2([N]_t - [N]_s) + 2r([N,M]_t - [N,M]_s) + ([M]_t - [M]_s).$$
The right hand side is non-negative for all rational r, so it holds for all real r
by continuity. The discriminant of this quadratic in r must therefore be non-positive,
which gives us the desired inequality. Since we have it for rational s, t, by right
continuity of the paths we have it for all s, t. □
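Along a partition the Kunita-Watanabe inequality reduces to the Cauchy-Schwarz inequality, with the bracket increments approximated by products of martingale increments. A minimal numerical sketch (not part of the notes; the correlation ρ and the integrands are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n, dt = 10_000, 1e-4

# Two correlated Brownian motions M, N with d[M, N]_t = rho dt.
rho = 0.6
dM = rng.normal(0.0, np.sqrt(dt), n)
dN = rho * dM + np.sqrt(1 - rho**2) * rng.normal(0.0, np.sqrt(dt), n)

# Arbitrary bounded integrands X, Y evaluated on the grid.
t = np.arange(n) * dt
X, Y = np.sin(t), np.cos(3 * t)

# Discrete bracket increments: d[M] ~ dM^2, d[N] ~ dN^2, d[M, N] ~ dM dN.
lhs = np.sum(np.abs(X * Y) * np.abs(dM * dN))
rhs = np.sqrt(np.sum(X**2 * dM**2)) * np.sqrt(np.sum(Y**2 * dN**2))
assert lhs <= rhs + 1e-12  # Cauchy-Schwarz, hence Kunita-Watanabe discretely
```

The assertion holds deterministically, for every realization, since it is exactly Cauchy-Schwarz applied to the vectors (X_i dM_i) and (Y_i dN_i).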
3.3.3 Lemma. If M, N ∈ M2c, X ∈ L∗(M), {X^(n)}_{n=1}^∞ ⊆ L∗(M), with
$$\lim_{n\to\infty} \int_0^T |X_s^{(n)} - X_s|^2 \, d[M]_s = 0$$
a.s.-P, then, for all 0 ≤ t ≤ T,
$$\lim_{n\to\infty} [I^M(X^{(n)}), N]_t = [I^M(X), N]_t.$$
PROOF: By bilinearity and the Kunita-Watanabe inequality,
$$\big|[I^M(X^{(n)}) - I^M(X), N]_t\big| \le \Big(\int_0^t |X_s^{(n)} - X_s|^2 \, d[M]_s\Big)^{1/2} [N]_t^{1/2} \le \Big(\int_0^T |X_s^{(n)} - X_s|^2 \, d[M]_s\Big)^{1/2} [N]_T^{1/2} \to 0. \qquad \square$$
3.3.4 Lemma. If M, N ∈ M2c and X ∈ L∗(M) then
$$[I^M(X), N]_t = \int_0^t X_s \, d[M,N]_s.$$
PROOF: We showed there is a sequence of simple processes for which the condition
of the previous lemma holds, and we showed above that the present identity holds
for simple processes; combining the two gives the result. □
3.3.5 Theorem. Consider a martingale M ∈ M2c and X ∈ L∗(M). Then I^M(X) is
the unique martingale Φ ∈ M2c such that
$$[\Phi, N]_t = \int_0^t X_s \, d[M,N]_s$$
for all N ∈ M2c.
3.3.6 Corollary. If M ∈ M2c, X ∈ L∗(M), N = I^M(X), Y ∈ L∗(N), then XY ∈
L∗(M) and I^N(Y) = I^M(XY).
PROOF: [N]_t = ∫_0^t X_s² d[M]_s, so
$$E\Big[\int_0^T X_s^2 Y_s^2 \, d[M]_s\Big] = E\Big[\int_0^T Y_s^2 \, d[N]_s\Big] < \infty.$$
So XY ∈ L∗(M). For any Ñ ∈ M2c, the previous theorem showed that
$$d[N, \tilde N]_s = X_s \, d[M, \tilde N]_s$$
and so
$$[I^M(XY), \tilde N]_t = \int_0^t X_s Y_s \, d[M, \tilde N]_s = \int_0^t Y_s \, d[N, \tilde N]_s = [I^N(Y), \tilde N]_t.$$
By the characterization of the integral, I^M(XY) = I^N(Y). □
Today we have shown that $[I^M(X), I^N(Y)]_t = \int_0^t X_s Y_s \, d[M,N]_s$.
Read the proof of Itô’s formula.

3.4 Stochastic Integration
Today we extend the definition of the stochastic integral to all of M^{c,loc}. For
M ∈ M2c with t ↦ [M]_t absolutely continuous and X ∈ L∗(M), we used the fact
that for all T < ∞ there is a sequence {X^(m)} ⊆ L_0 such that
$$E\Big[\int_0^T |X_t^{(m)} - X_t|^2 \, dt\Big] \to 0$$
as m → ∞. We use “time changes” to handle the general case.
3.4.1 Theorem. For M ∈ M2c, L_0(M) is dense in L∗(M) with respect to [·].
PROOF: The general case follows from the following more general lemma. □
3.4.2 Lemma. Let {A_t} be a continuous (resp. right-continuous) increasing, F-
adapted process. If X is progressively measurable and satisfies
$$E\Big[\int_0^T X_t^2 \, dA_t\Big] < \infty$$
for all T > 0, then there exists a sequence {X^(n)}_{n=1}^∞ of simple processes such
that
$$\sup_{T > 0} \lim_{n\to\infty} E\Big[\int_0^T |X_t^{(n)} - X_t|^2 \, dA_t\Big] = 0.$$
PROOF: Assume without loss of generality that X is bounded, say by C. It suffices
to fix T > 0 and show there exists {X^(n)}_{n=1}^∞ ⊆ L_0 such that
$$\lim_{n\to\infty} E\Big[\int_0^T |X_t^{(n)} - X_t|^2 \, dA_t\Big] = 0.$$
The process A_t + t is strictly increasing and continuous, so it has a continuous,
strictly increasing inverse function T_s defined by A_{T_s} + T_s = s for all ω. In
particular, T_s ≤ s and
$$\{T_s \le t\} = \{A_t + t \ge s\} \in \mathcal{F}_t.$$
Therefore for all s ≥ 0, T_s is an F-stopping time. Define G_s = F_{T_s} and Y_s = X_{T_s}.
Since X is progressively measurable, Y is G-adapted. Without loss of generality,
assume X_t ≡ 0 for t ≥ T. Also,
$$E\Big[\int_0^\infty Y_s^2 \, ds\Big] = E\Big[\int_0^\infty 1_{\{T_s \le T\}} X_{T_s}^2 \, ds\Big] = E\Big[\int_0^{A_T + T} 1_{\{T_s \le T\}} X_{T_s}^2 \, ds\Big] \le C^2 (E[A_T] + T) < \infty.$$
For any n ∈ N, choose R < ∞ such that $E[\int_R^\infty Y_s^2 \, ds] < \frac{1}{2n}$. By the old result, there
is a simple process Ỹ^(n) such that
$$E\Big[\int_0^R |\tilde Y_s^{(n)} - Y_s|^2 \, ds\Big] < \frac{1}{2n}.$$
Define Y_s^(n) = 1_{[0,R]}(s) Ỹ_s^(n). Then
$$E\Big[\int_0^\infty |Y_s^{(n)} - Y_s|^2 \, ds\Big] < \frac{1}{n}.$$
But
$$Y_s^{(n)} = \xi_0 1_{\{0\}}(s) + \sum_{i \ge 0} \xi_i 1_{(s_i, s_{i+1}]}(s)$$
where each ξ_i is G_{s_i}-measurable. Define
$$X_t^{(n)} = Y_{A_t + t}^{(n)} = \xi_0 1_{\{0\}}(t) + \sum_{i \ge 0} \xi_i 1_{(T_{s_i}, T_{s_{i+1}}]}(t).$$
To see that X_t^(n) is F_t-adapted, simply observe that ξ_i is F_{T_{s_i}}-measurable,
so ξ_i 1_{(T_{s_i}, T_{s_{i+1}}]}(t) is F_t-measurable. □
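The time change in the proof is easy to realize numerically: invert s = A_t + t on a grid and check the defining relation A_{T_s} + T_s = s. A sketch (the choice of A is illustrative, any continuous increasing process starting at 0 works):

```python
import numpy as np

t = np.linspace(0.0, 5.0, 50_001)
A = t + 0.5 * np.sin(t)           # continuous, strictly increasing, A_0 = 0
s_grid = A + t                    # s = A_t + t, strictly increasing in t

def T(s):
    """Numerical inverse: the time T_s with A_{T_s} + T_s = s."""
    return np.interp(s, s_grid, t)

for s in [0.3, 1.7, 4.0, 8.5]:
    ts = T(s)
    assert ts <= s                                      # T_s <= s since A >= 0
    assert abs(np.interp(ts, t, A) + ts - s) < 1e-3     # A_{T_s} + T_s ≈ s
```

Since A_t + t is strictly increasing, `np.interp` against the monotone grid `s_grid` is a valid inverse up to discretization error.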
Now we extend to M^{c,loc}. For simplicity we assume that M_0 = 0. Define
T_n := inf{t > 0 | |M_t| > n} for n ∈ N. Then M^{T_n} is a bounded martingale. Recall
the (generalized) Doob decomposition: if M ∈ M^{c,loc} then there is a unique,
continuous, increasing process [M] such that M² − [M] ∈ M^{c,loc}. For M, N ∈ M_0^{c,loc}
define
$$[M,N] = \frac{1}{4}\big([M+N] - [M-N]\big).$$
3.4.3 Definition. Let M ∈ M_0^{c,loc} and X progressively measurable with
$$P\Big(\int_0^T X_t^2 \, d[M]_t < \infty\Big) = 1$$
for all T < ∞. Then I_t^M(X) is defined to be
$$I_t^M(X) = I_t^{M^{T_n}}(X^{T_n}) = \int_0^t X^{T_n} \, dM^{T_n}$$
for all t ∈ [0, T_n], where T_n = S_n ∧ R_n and
$$S_n = \inf\Big\{t > 0 \,\Big|\, \int_0^t X_s^2 \, d[M]_s \ge n\Big\}
\quad\text{and}\quad
R_n = \inf\{t > 0 \mid |M_t| \ge n\}.$$
Note that M^{T_n} ∈ M2c and X^{T_n} ∈ L∗(M^{T_n}) by the definition of S_n. So I^{M^{T_n}}(X^{T_n})
is well-defined.
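The localizing times in the definition are straightforward to compute on a simulated path. A sketch (not from the notes; M = B a Brownian motion, so [M]_t = t, and the integrand is illustrative) showing that T_n = S_n ∧ R_n is nondecreasing in n:

```python
import numpy as np

rng = np.random.default_rng(2)
n, T = 100_000, 10.0
dt = T / n
B = np.concatenate([[0.0], np.cumsum(rng.normal(0.0, np.sqrt(dt), n))])
t = np.linspace(0.0, T, n + 1)

X = np.exp(B)                                            # integrand
QV = np.concatenate([[0.0], np.cumsum(X[:-1]**2 * dt)])  # ∫_0^t X^2 d[B]

def first_time(cond):
    """First grid time at which the boolean condition holds, or inf."""
    return t[np.argmax(cond)] if cond.any() else np.inf

Tn = []
for lvl in [1, 2, 4, 8]:
    S = first_time(QV >= lvl)          # S_n: quadratic-variation budget exceeded
    R = first_time(np.abs(B) >= lvl)   # R_n: path exits [-n, n]
    Tn.append(min(S, R))

assert all(a <= b for a, b in zip(Tn, Tn[1:]))  # localizing times increase
```

Monotonicity holds pathwise: both first-passage times are nondecreasing in the level, hence so is their minimum.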
3.4.4 Definition. A (continuous) semi-martingale X is a process that admits a
decomposition X = M + A, where M ∈ M^{c,loc} and A ∈ FV^{c,loc}, the collection of
continuous adapted processes that are of finite variation on every bounded interval.
This decomposition is unique. Now we can define the stochastic integral with
respect to a continuous semi-martingale in the obvious way.
3.4.5 Proposition. We have the following properties of I^M(H).
(i) Linearity;
(ii) $[I^M(H)]_t = \int_0^t H_s^2 \, d[M]_s$;
(iii) $[I^M(H), I^N(K)]_t = \int_0^t H_s K_s \, d[M,N]_s$.
Furthermore, the stochastic integral I^M(H) is characterized as the unique Φ ∈
M^{c,loc} such that $[\Phi, N]_t = \int_0^t H_s \, d[M,N]_s$ for all N ∈ M2c.
In particular, we can no longer make statements involving conditional expectations,
since local martingales need not be integrable.
3.5 Integration by parts formula for stochastic integrals

Recall that, for a partition Π of [0, t] (with M_0 = 0),
$$M_t^2 = \sum_{\Pi} (M_{t_k}^2 - M_{t_{k-1}}^2)
= \sum_{\Pi} (M_{t_k} - M_{t_{k-1}})(M_{t_k} + M_{t_{k-1}})
= 2 \sum_{\Pi} M_{t_{k-1}} (M_{t_k} - M_{t_{k-1}}) + \sum_{\Pi} (M_{t_k} - M_{t_{k-1}})^2.$$
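The decomposition above is an exact algebraic identity along any partition. A quick numerical sketch (not from the notes) with a simulated Brownian path, also showing that the last sum approximates the quadratic variation:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100_000
dB = rng.normal(0.0, np.sqrt(1.0 / n), n)   # increments over [0, 1]
B = np.concatenate([[0.0], np.cumsum(dB)])

lhs = B[-1]**2
rhs = 2 * np.sum(B[:-1] * dB) + np.sum(dB**2)  # 2 Σ M_{t_{k-1}} ΔM + Σ (ΔM)^2
assert abs(lhs - rhs) < 1e-8   # exact up to floating point (telescoping)

# In the limit of fine partitions, Σ (ΔB)^2 → [B]_1 = 1:
assert abs(np.sum(dB**2) - 1.0) < 0.05
```

The first assertion is deterministic; the second is probabilistic but safe at this resolution since Σ(ΔB)² has standard deviation √(2/n) ≈ 0.0045.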
3.5.1 Lemma. Let M be a bounded continuous martingale and A be a continuous
adapted process of finite variation. Then
(i) $M_t^2 = 2 \int_0^t M_s \, dM_s + [M]_t$; and
(ii) $M_t A_t = \int_0^t A_s \, dM_s + \int_0^t M_s \, dA_s$.
PROOF: Define T_0^n := 0 and
$$T_{k+1}^n := \inf\big\{t > T_k^n \,\big|\, |M_t - M_{T_k^n}| > 2^{-n}\big\},$$
and define t_k^n = t ∧ T_k^n. Then we have
$$M_t^2 = 2 \sum_{k \ge 1} M_{t_{k-1}^n} (M_{t_k^n} - M_{t_{k-1}^n}) + \sum_{k \ge 1} (M_{t_k^n} - M_{t_{k-1}^n})^2.$$
Define
$$X^n := \sum_{k \ge 1} M_{T_{k-1}^n} 1_{(T_{k-1}^n, T_k^n]}
\quad\text{and}\quad
A_t^n := \sum_{k \ge 1} (M_{t_k^n} - M_{t_{k-1}^n})^2,$$
so we can rewrite this as M_t² = 2 I_t^M(X^n) + A_t^n. We showed earlier that A^n →
[M] a.s. (at least along a subsequence), and sup_t |X_t^n − X_t^{n+1}| ≤ 2^{−n−1} and
sup_t |X_t^n − M_t| ≤ 2^{−n}. Taking limits, we have M_t² = 2 I_t^M(M) + [M]_t.
The second part is a similar argument, taking t_k^n = (k 2^{−n}) ∧ t and
$$X^n = \sum_{k \ge 1} A_{t_{k-1}^n} 1_{(t_{k-1}^n, t_k^n]}. \qquad \square$$
3.5.2 Theorem. Let X and Y be continuous semi-martingales. Then
$$X_t Y_t - X_0 Y_0 = \int_0^t X_s \, dY_s + \int_0^t Y_s \, dX_s + [X,Y]_t.$$
PROOF: We assume without loss of generality that X_0 = Y_0 = 0. Suppose that
X = M + A and Y = N + V. Use the first part of the last lemma applied to M + N
and M − N to get
$$M_t N_t = \int_0^t M_s \, dN_s + \int_0^t N_s \, dM_s + [M,N]_t.$$
Combine this with ordinary Lebesgue-Stieltjes integration to get the result. □
Recall the following results.
3.5.3 Theorem (Chain Rule).
If X is a continuous semi-martingale and U and V are progressively measurable
processes with V ∈ L ∗ (X ) then U ∈ L ∗ (V · X ) if and only if U V ∈ L ∗ (X ), in
which case U · (V · X ) = (U V ) · X .
3.5.4 Theorem (Integration-by-parts).
For continuous semi-martingales X and Y,
$$X_t Y_t - X_0 Y_0 = \int_0^t X_s \, dY_s + \int_0^t Y_s \, dX_s + [X,Y]_t.$$
We prove today the following.
3.5.5 Theorem (Itô’s Formula).
For an R^d-valued continuous semi-martingale X and a function f ∈ C²(R^d),
$$f(X_t) - f(X_0) = \sum_i \int_0^t \frac{\partial f}{\partial x_i}(X_s) \, dX_s^i + \frac{1}{2} \sum_{i,j} \int_0^t \frac{\partial^2 f}{\partial x_i \partial x_j}(X_s) \, d[X^i, X^j]_s.$$
PROOF: We prove the case d = 1. Fix X and let C be the collection of C²
functions f for which the formula holds. Clearly C is a linear subspace of C²(R)
that contains all linear functions. We show that C is closed under multiplication.
Let f, g ∈ C and define F = f(X) and G = g(X). Then F and G are continuous
semi-martingales. By the integration-by-parts formula,
$$\begin{aligned}
(fg)(X_t) - (fg)(X_0) &= F_t G_t - F_0 G_0 \\
&= (F \cdot G)_t + (G \cdot F)_t + [F, G]_t \\
&= f(X) \cdot \big(g'(X) \cdot X + \tfrac{1}{2} g''(X) \cdot [X]\big) + g(X) \cdot \big(f'(X) \cdot X + \tfrac{1}{2} f''(X) \cdot [X]\big) + f'(X) g'(X) \cdot [X] \\
&= (fg' + gf')(X) \cdot X + \tfrac{1}{2}(fg'' + 2f'g' + gf'')(X) \cdot [X] \\
&= (fg)'(X) \cdot X + \tfrac{1}{2}(fg)''(X) \cdot [X],
\end{aligned}$$
since
$$[f(X), g(X)] = [f'(X) \cdot X, g'(X) \cdot X] = f'(X) g'(X) \cdot [X]$$
by Kunita-Watanabe. Therefore C contains all polynomials. Let f ∈ C² be
arbitrary. By the Weierstrass approximation theorem there are polynomials p_n such
that
$$\sup_{|x| \le c} |p_n(x) - f''(x)| \to 0$$
as n → ∞ for all c > 0. Integrate p_n twice to get polynomials F_n such that
$$\sup_{|x| \le c} |F_n(x) - f(x)| \vee |F_n'(x) - f'(x)| \vee |F_n''(x) - f''(x)| \to 0$$
as n → ∞ for all c > 0. In particular, F_n(X_t) → f(X_t) for all t ≥ 0. Letting
X = M̃ + Ã, by the dominated convergence theorem for Stieltjes integrals,
$$\big(F_n'(X) \cdot \tilde A + \tfrac{1}{2} F_n''(X) \cdot [\tilde M]\big) \to \big(f'(X) \cdot \tilde A + \tfrac{1}{2} f''(X) \cdot [\tilde M]\big).$$
All that remains to show is that
$$\int_0^t F_n'(X_s) \, d\tilde M_s \to \int_0^t f'(X_s) \, d\tilde M_s.$$
This sequence does converge to this limit in L² because
$$E\Big[\Big(\int_0^t (F_n'(X_s) - f'(X_s)) \, d\tilde M_s\Big)^2\Big] = E\Big[\int_0^t (F_n'(X_s) - f'(X_s))^2 \, d[\tilde M]_s\Big] \to 0$$
as n → ∞ (this is the Itô isometry). Therefore there is a subsequence along which
the convergence is a.s. □
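Itô's formula can be sanity-checked pathwise on a fine grid: the left-endpoint Riemann sums for the two integrals should reproduce f(B_t) − f(B_0). A sketch (not from the notes; f = sin is illustrative, and the tolerance is heuristic):

```python
import numpy as np

rng = np.random.default_rng(4)
n, T = 400_000, 1.0
dt = T / n
dB = rng.normal(0.0, np.sqrt(dt), n)
B = np.concatenate([[0.0], np.cumsum(dB)])

f, fp, fpp = np.sin, np.cos, lambda x: -np.sin(x)   # f in C^2 with f', f''

lhs = f(B[-1]) - f(B[0])
# Σ f'(B_{k-1}) ΔB_k  +  (1/2) Σ f''(B_{k-1}) Δt, since [B]_t = t
rhs = np.sum(fp(B[:-1]) * dB) + 0.5 * np.sum(fpp(B[:-1]) * dt)
assert abs(lhs - rhs) < 0.05
```

The discretization error is of order √(dt), so at this grid size the two sides agree to roughly three decimal places; the assertion tolerance is deliberately loose.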
3.6 Fisk-Stratonovich integral

3.6.1 Definition. The Fisk-Stratonovich integral of X with respect to Y is
$$\int_0^t X_s \circ dY_s := \int_0^t X_s \, dY_s + \frac{1}{2}[X,Y]_t.$$
We have an Itô rule and an integration-by-parts rule for this type of integral. We
also have the following fact: for partitions Π = {0 = t_0 < · · · < t_m = t} with mesh
tending to zero, the midpoint sums converge in probability,
$$S(\Pi) = \sum_{i=0}^{m-1} \frac{1}{2}(B_{t_i} + B_{t_{i+1}})(B_{t_{i+1}} - B_{t_i}) \overset{(p)}{\longrightarrow} \int_0^t B_s \circ dB_s.$$
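The midpoint sums above in fact telescope exactly to ½B_t², which is the Stratonovich value of ∫ B ∘ dB, while the left-endpoint (Itô) sums differ from them by half the discrete quadratic variation. A sketch (not from the notes):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 50_000
dB = rng.normal(0.0, np.sqrt(1.0 / n), n)   # increments over [0, 1]
B = np.concatenate([[0.0], np.cumsum(dB)])

strat = np.sum(0.5 * (B[:-1] + B[1:]) * dB)   # midpoint (Stratonovich) sums
ito = np.sum(B[:-1] * dB)                     # left-endpoint (Itô) sums

# Telescopes exactly: Σ (B_{k}^2 - B_{k-1}^2)/2 = B_1^2 / 2.
assert abs(strat - 0.5 * B[-1]**2) < 1e-8
# The two sums differ by half the discrete quadratic variation.
assert abs(strat - ito - 0.5 * np.sum(dB**2)) < 1e-8
```

Both assertions hold for every path, up to floating-point rounding; the stochastic content is only in the limit Σ(ΔB)² → t.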
3.6.2 Example. Consider the ODE dN/dt = a(t)N(t), where a is the rate of growth
and N is the number of people. For whatever reason, we may think of a(t) as
r(t) + ξ(t), a deterministic part plus a process of random fluctuations. Empirically,
“ξ(t) = dB_t/dt”, so we write
$$dN_t = r(t) N_t \, dt + \sigma N_t \, dB_t.$$
It is not clear which integral we should use to make sense of this equation. The
choice of integral depends on the model. In finance the Itô integral is used because
we cannot look into the future. For stochastic processes on a manifold the
Stratonovich integral is used.
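The choice matters quantitatively: with constant coefficients, the Itô equation has solution N_0 exp((r − σ²/2)t + σB_t), while the Stratonovich interpretation gives N_0 exp(rt + σB_t). A sketch comparing an Euler-Maruyama path with both closed forms (not from the notes; the parameters r, σ, N_0 are illustrative):

```python
import numpy as np

rng = np.random.default_rng(6)
n, T = 100_000, 1.0
dt = T / n
r, sigma, N0 = 0.05, 0.2, 1.0        # illustrative parameters

dB = rng.normal(0.0, np.sqrt(dt), n)
B_T = dB.sum()

# Euler-Maruyama discretization of the Itô equation dN = r N dt + sigma N dB.
N = N0 * np.prod(1.0 + r * dt + sigma * dB)

ito_exact = N0 * np.exp((r - 0.5 * sigma**2) * T + sigma * B_T)
strat_exact = N0 * np.exp(r * T + sigma * B_T)

assert abs(N - ito_exact) < 0.01 * ito_exact   # Euler tracks the Itô solution
# The two interpretations differ pathwise by the factor exp(sigma^2 T / 2).
assert abs(strat_exact / ito_exact - np.exp(0.5 * sigma**2 * T)) < 1e-9
```

Since the forward (non-anticipating) Euler scheme converges to the Itô solution, this also illustrates why the Itô integral is the natural choice when coefficients may not look into the future.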
3.7 Applications of Itô’s formula

Regular conditional probabilities

Given a probability measure P on (R, B(R)), its characteristic function is
$$f(\theta) = \int_{\mathbb{R}} e^{i\theta x} \, P(dx) = E[e^{i\theta Z}]$$
for any r.v. Z with law P. Now suppose you were able to show that for P-a.e.
ω ∈ Ω,
$$f(\theta) = E[e^{i\theta Z} \mid \mathcal{F}_T](\omega).$$
We would expect that f(θ) should be the characteristic function of the conditional
law of Z given F_T. In order to do that we would like to be able to write
$$E[e^{i\theta Z} \mid \mathcal{F}_T](\omega) = \int e^{i\theta x} \, (P \mid \mathcal{F}_T)(\omega, dx)$$
for some measure (P | F_T). This motivates the following.
3.7.1 Definition. Given a r.v. Z on (Ω, F, P) taking values in (S, S), and a
sub-σ-algebra F_T ⊆ F, we will say (P | F_T) is a regular conditional probability if
(i) for all ω ∈ Ω, (P | F_T)(ω, ·) defines a probability measure on (S, S);
(ii) for all A ∈ S, (P | F_T)(·, A) is F_T-measurable; and
(iii) for all A ∈ S and ω ∈ Ω, P(Z ∈ A | F_T)(ω) = (P | F_T)(ω, A).
For fixed A, we notice P(Z ∈ A | F_T)(ω) = E[1_A(Z) | F_T](ω) P-a.s. The existence
of a regular conditional probability asks whether there is a “modification” of
{P(Z ∈ A | F_T) | A ∈ S} (defined as above) that satisfies the first condition (as it
certainly satisfies the second).
3.7.2 Theorem. If S is a complete, separable metric space and S = B(S), then
regular conditional probabilities exist.
3.7.3 Lemma. Let X be a d-dimensional random vector on (Ω, F, P). Suppose
that G is a sub-σ-field of F and suppose that for each ω ∈ Ω there is a function
ϕ(ω, ·) : R^d → C such that for all u ∈ R^d,
$$\varphi(\omega, u) = E[e^{i\langle u, X \rangle} \mid \mathcal{G}](\omega)$$
P-a.e. If for each ω, ϕ(ω, ·) is the characteristic function of some probability
measure P_ω on (R^d, B(R^d)), i.e. if
$$\varphi(\omega, u) = \int_{\mathbb{R}^d} e^{i\langle u, x \rangle} \, P_\omega(dx),$$
then for all A ∈ B(R^d),
$$P[X \in A \mid \mathcal{G}](\omega) = P_\omega(A) =: (P \mid \mathcal{G})(\omega, A)$$
P-a.e. ω.
PROOF: Let (Q | G) be a regular conditional probability for X given G, so that for
each fixed u ∈ R^d,
$$\varphi(\omega, u) = E[e^{i\langle u, X \rangle} \mid \mathcal{G}](\omega) = \int_{\mathbb{R}^d} e^{i\langle u, x \rangle} \, (Q \mid \mathcal{G})(\omega, dx)$$
P-a.e. ω. The set of ω for which this holds could depend on u. Take a countable
dense subset D and Ω̃ ∈ F with P(Ω̃) = 1 so that the equation above holds for all
u ∈ D and all ω ∈ Ω̃. Use continuity with respect to u of both sides to conclude for
all u ∈ R^d and all ω ∈ Ω̃. (See page 85 in KS.) □
This can be used to prove the strong Markov property in a different way.
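Lemma 3.7.3 in miniature: take X = Y + Z with Y, Z independent N(0, 1) and G = σ(Y). Then E[e^{iuX} | G](ω) = e^{iuY(ω)} e^{−u²/2}, which for each ω is the characteristic function of the N(Y(ω), 1) law P_ω. A Monte Carlo sketch (not from the notes; u and the conditioning value y are illustrative):

```python
import numpy as np

rng = np.random.default_rng(7)
u, y = 1.3, 0.4                      # fixed frequency u and a fixed value of Y

# Conditionally on {Y = y}, X = y + Z with Z ~ N(0, 1), so the conditional
# characteristic function is e^{iuy} e^{-u^2/2}: the ch.f. of N(y, 1).
Z = rng.normal(0.0, 1.0, 1_000_000)
mc = np.mean(np.exp(1j * u * (y + Z)))
exact = np.exp(1j * u * y) * np.exp(-u**2 / 2)
assert abs(mc - exact) < 0.01
```

Varying y shows condition (i) of the definition in action: for each ω (each value of Y) one obtains a genuine probability measure, here N(Y, 1).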
Martingale characterization of Brownian motion

Recall that if B is a standard d-dimensional Brownian motion then the covariation
among the components is [B^{(i)}, B^{(j)}]_t = δ_{ij} t.
3.7.4 Theorem (Lévy, 1948, Kunita-Watanabe, 1967).
Suppose that X is a d-dimensional continuous adapted process such that for every
component 1 ≤ k ≤ d the process M_t^{(k)} := X_t^{(k)} − X_0^{(k)} is a continuous local
martingale and [M^{(i)}, M^{(j)}]_t = δ_{ij} t. Then X is a Brownian motion.
PROOF: We will show that for all 0 ≤ s ≤ t, X_t − X_s is independent of F_s and has
the d-variate normal distribution with mean zero and covariance matrix (t − s)I_d.
To do this we will show that
$$E[e^{i\langle u, X_t - X_s \rangle} \mid \mathcal{F}_s] = e^{-\frac{1}{2} \|u\|^2 (t-s)}.$$
For fixed u, f(x) = e^{i⟨u,x⟩} satisfies
$$\frac{\partial f}{\partial x_j}(x) = i u_j f(x)
\quad\text{and}\quad
\frac{\partial^2 f}{\partial x_i \partial x_j}(x) = -u_i u_j f(x).$$
Applying Itô’s formula to the real and imaginary parts we have
$$e^{i\langle u, X_t \rangle} = e^{i\langle u, X_s \rangle} + i \sum_{j=1}^{d} u_j \int_s^t e^{i\langle u, X_v \rangle} \, dM_v^{(j)} - \frac{1}{2} \sum_{j=1}^{d} u_j^2 \int_s^t e^{i\langle u, X_v \rangle} \, dv.$$
Now |f(x)| ≤ 1 for all x ∈ R^d, and because [M^{(j)}]_t = t is bounded on any interval,
M^{(j)} ∈ M2c. Thus the real and imaginary parts of $\{\int_0^t e^{i\langle u, X_v \rangle} \, dM_v^{(j)}\}$ lie in M2c (not
just M^{c,loc}). Taking conditional expectations,
$$E\Big[\int_s^t e^{i\langle u, X_v \rangle} \, dM_v^{(j)} \,\Big|\, \mathcal{F}_s\Big] = 0$$
P-a.s. For A ∈ F_s, multiplying by e^{−i⟨u,X_s⟩} 1_A gives the result. (Fill this in.) □
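The target identity of the proof, E[e^{i⟨u, X_t − X_s⟩} | F_s] = e^{−½‖u‖²(t−s)}, is easy to test by simulation when X actually is a Brownian motion. A Monte Carlo sketch (not from the notes; d, u, s, t are illustrative):

```python
import numpy as np

rng = np.random.default_rng(8)
d, paths = 2, 500_000
s, t = 0.5, 1.5
u = np.array([0.7, -0.4])

# Increments of a d-dimensional Brownian motion over [s, t]: N(0, (t-s) I_d).
inc = rng.normal(0.0, np.sqrt(t - s), size=(paths, d))

mc = np.mean(np.exp(1j * inc @ u))
exact = np.exp(-0.5 * np.dot(u, u) * (t - s))
assert abs(mc - exact) < 0.01
```

Since |e^{i⟨u,·⟩}| ≤ 1, the Monte Carlo error is at most of order 1/√paths ≈ 0.0014 here, well inside the asserted tolerance.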
Itô’s formula for general f ∈ C^{1,2} and X = M + A a semi-martingale is
$$f(t, X_t) = f(0, X_0) + \int_0^t \frac{\partial f}{\partial t}(s, X_s) \, ds + \sum_i \int_0^t \frac{\partial f}{\partial x_i}(s, X_s) \, dX_s^i + \frac{1}{2} \sum_{i,j} \int_0^t \frac{\partial^2 f}{\partial x_i \partial x_j}(s, X_s) \, d[X^i, X^j]_s.$$
Bessel process

Let B be a d-dimensional Brownian motion and let
$$R_t = \|B_t\| = \sqrt{(B_t^{(1)})^2 + \cdots + (B_t^{(d)})^2}.$$
By the rotation (orthogonal transformation) property of Brownian motion, if ‖y‖ =
‖x‖ then R has the same distribution under P^x and P^y. Use this to show that R is
a Markov process (see hand-written notes).
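A quick simulation sketch of the Bessel process (not from the notes; d, t and the starting points are illustrative), checking E[R_t²] = d·t from the origin and the rotation-invariance claim for two starting points with the same norm:

```python
import numpy as np

rng = np.random.default_rng(9)
d, paths, t = 3, 400_000, 2.0

B = rng.normal(0.0, np.sqrt(t), size=(paths, d))   # B_t sampled across paths
R = np.linalg.norm(B, axis=1)                      # R_t = ||B_t||

# E[R_t^2] = Σ_k E[(B_t^(k))^2] = d t.
assert abs(np.mean(R**2) - d * t) < 0.1

# Rotation invariance: the law of R under P^x depends on x only through ||x||.
x = np.array([1.0, 0.0, 0.0])
y = np.array([0.0, 0.0, 1.0])
Rx = np.linalg.norm(x + B, axis=1)
Ry = np.linalg.norm(y + B, axis=1)
assert abs(np.mean(Rx) - np.mean(Ry)) < 0.01
```

The second check uses the same Gaussian samples for both starting points; since ‖x‖ = ‖y‖ and the Gaussian cloud is isotropic, the two empirical means agree to Monte Carlo accuracy.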
Index
Borel σ-algebra, 3
Brownian filtration, 18
canonical version, 5
central limit theorem, 4, 11
Chapman-Kolmogorov equations, 16
class D, 32
class DL, 32
consistent, 6
converges P-a.s., 3
converges in distribution, 3
converges in probability, 3
converges weakly, 5, 8
covariance function, 15
cylinder set, 6
dyadic rationals, 7
equal a.s., 3
equal in distribution, 3, 4
equicontinuous, 10
fi.di. distributions, 5
finite dimensional marginal distributions, 5
Fisk-Stratonovich integral, 45
Gaussian process, 15
hitting time, 18
homogeneous, 16
homogeneous increments, 5
image measure, 5
increasing sequence, 30
independent increments, 5
indistinguishable, 3, 4
infinitely divisible, 29
infinitesimal generator, 17
integrable, 22, 30
kernel, 15
Lévy process, 29
Lévy-Khintchine formula, 30
local martingale, 32
localizing sequence, 32
locally Hölder continuous, 7
Markov family, 17
Markov process, 15, 16
Markov’s inequality, 3
martingale, 22
measurable, 4
measurable function, 3
modification, 4
modulus of continuity, 10
natural, 31
optional time, 18
Poisson process, 29
Polish space, 10
predictable process, 22
progressively measurable, 17
Prohorov metric, 11
random element, 3
random measure, 3
random process, 3
random variable, 3
random vector, 3
reflection property, 12
regular, 32
regular conditional probability, 46
relatively compact, 9
scaling property, 12
semi-martingale, 32, 42
simple, 35
simple Markov property, 12
simple stopping time, 18
stable, 30
state space, 4
stationary increments, 5
stochastic kernel, 15
stochastic process, 4
stopped process, 26
stopping time, 18
strong law of large numbers, 4
sub-martingale, 22
super-martingale, 22
tight, 10
time inversion property, 12
time reversal property, 12
time shift operator, 21
total variation norm, 8
transition density, 17
transition function, 16
transition probability, 15
universally measurable, 17
usual conditions, 24
variation, 13
weak law of large numbers, 4
Wiener measure, 6