An Introduction to Stochastic Calculus with Applications
to Finance
Ovidiu Calin
Department of Mathematics
Eastern Michigan University
Ypsilanti, MI 48197 USA
[email protected]
Preface
The goal of this work is to introduce elementary Stochastic Calculus to senior undergraduate as well as master's students with Mathematics, Economics and Business
majors. The author's goal was to capture as much as possible of the spirit of elementary Calculus, to which the students have already been exposed at the beginning
of their majors. This calls for a presentation that mimics the corresponding properties of deterministic Calculus, which facilitates the understanding of the more complicated concepts
of Stochastic Calculus. Since deterministic Calculus books usually start with a brief
presentation of elementary functions, and then continue with limits and other properties of functions, we employ here a similar approach, starting with elementary
stochastic processes, different types of limits, and continuing with properties of stochastic processes. The chapters regarding differentiation and integration follow the same
pattern. For instance, there is a product rule, a chain-type rule and an integration by
parts in Stochastic Calculus, which are modifications of the well-known rules from
elementary Calculus.
Just as deterministic Calculus can be used for modeling regular business problems, in
the second part of the book we deal with stochastic modeling of business applications,
such as Financial Derivatives, whose modeling is based solely on Stochastic Calculus.
In order to make the book available to a wider audience, we sacrificed rigor for
clarity. Most of the time we assumed maximal regularity conditions for which the
computations hold and the statements are valid. This should appeal to both
Business and Economics students, who might otherwise get lost in a very profound
mathematical textbook where the view of the forest is obscured by the sight of the trees.
An important feature of this textbook is the large number of solved problems and
examples, from which both the beginner and the advanced student will benefit.
This book grew from a series of lectures and courses given by the author at
Eastern Michigan University (USA), Kuwait University (Kuwait) and Fu-Jen University
(Taiwan). Several students read the first draft of these notes and provided valuable
feedback, supplying a list of corrections, which is by no means exhaustive. Any typos or
comments regarding the present material are welcome.
The Author,
Ann Arbor, October 2012
Contents

Part I  Stochastic Calculus

1 Introduction
   1.1 A Few Introductory Problems
       1.1.1 Stochastic population growth models
       1.1.2 Pricing zero-coupon bonds
       1.1.3 Diffusions of particles
       1.1.4 White noise
   1.2 Bounded Variation

2 Basic Notions
   2.1 Probability Space
   2.2 Sample Space
   2.3 Events and Probability
   2.4 Random Variables
   2.5 Integration in Probability Measure
   2.6 Distribution Functions
   2.7 Independence
   2.8 Expectation
       2.8.1 The best approximation of a random variable
       2.8.2 Change of measure in an expectation
   2.9 Basic Distributions
   2.10 Conditional Expectations
   2.11 Inequalities of Random Variables
   2.12 Limits of Sequences of Random Variables
   2.13 Properties of Mean-Square Limit
   2.14 Stochastic Processes

3 Useful Stochastic Processes
   3.1 The Brownian Motion
   3.2 Geometric Brownian Motion
   3.3 Integrated Brownian Motion
   3.4 Exponential Integrated Brownian Motion
   3.5 Brownian Bridge
   3.6 Brownian Motion with Drift
   3.7 Bessel Process
   3.8 The Poisson Process
   3.9 Interarrival times
   3.10 Waiting times
   3.11 The Integrated Poisson Process
   3.12 Submartingales

4 Properties of Stochastic Processes
   4.1 Stopping Times
   4.2 Stopping Theorem for Martingales
   4.3 The First Passage of Time
   4.4 The Arc-sine Laws
   4.5 More on Hitting Times
   4.6 The Inverse Laplace Transform Method
   4.7 Limits of Stochastic Processes
   4.8 Mean Square Convergence
   4.9 The Martingale Convergence Theorem
   4.10 Quadratic Variations
       4.10.1 The Quadratic Variation of Wt
       4.10.2 The Quadratic Variation of Nt − λt

5 Stochastic Integration
   5.1 Nonanticipating Processes
   5.2 Increments of Brownian Motions
   5.3 The Ito Integral
   5.4 Examples of Ito integrals
       5.4.1 The case Ft = c, constant
       5.4.2 The case Ft = Wt
   5.5 Properties of the Ito Integral
   5.6 The Wiener Integral
   5.7 Poisson Integration
   5.8 A Worked-Out Example: the case Ft = Mt
   5.9 The distribution function of XT = ∫_0^T g(t) dNt

6 Stochastic Differentiation
   6.1 Differentiation Rules
   6.2 Basic Rules
   6.3 Ito’s Formula
       6.3.1 Ito diffusions
       6.3.2 Ito’s formula for Poisson processes
       6.3.3 Ito’s multidimensional formula

7 Stochastic Integration Techniques
   7.1 Notational Conventions
   7.2 Stochastic Integration by Parts
   7.3 The Heat Equation Method
   7.4 Table of Usual Stochastic Integrals

8 Stochastic Differential Equations
   8.1 Definitions and Examples
   8.2 Finding Mean and Variance from the Equation
   8.3 The Integration Technique
   8.4 Exact Stochastic Equations
   8.5 Integration by Inspection
   8.6 Linear Stochastic Differential Equations
   8.7 The Method of Variation of Parameters
   8.8 Integrating Factors
   8.9 Existence and Uniqueness

9 Applications of Brownian Motion
   9.1 The Directional Derivative
   9.2 The Generator of an Ito Diffusion
   9.3 Dynkin’s Formula
   9.4 Kolmogorov’s Backward Equation
   9.5 Exit Time from an Interval
   9.6 Transience and Recurrence of Brownian Motion
   9.7 Application to Parabolic Equations

10 Martingales and Girsanov’s Theorem
   10.1 Examples of Martingales
   10.2 Girsanov’s Theorem
Part I
Stochastic Calculus
Chapter 1
Introduction
1.1 A Few Introductory Problems
Deterministic Calculus is an excellent tool for modeling real-life problems; however, when randomness comes into play, Stochastic Calculus provides a more
accurate model of the problem. In real-life applications involving trajectories, measurements, noisy signals, etc., the effects of many unpredictable factors can be averaged
out, via the Central Limit Theorem, as a normal random variable. This is related to the
Brownian motion, which was introduced to model the irregular movements of pollen
grains in a liquid. In the following we shall discuss a few problems involving random
perturbations which can be modeled by Stochastic Calculus.
1.1.1 Stochastic population growth models
Exponential growth model Let P(t) denote the population at time t. In the time interval
∆t the population increases by the amount ∆P(t) = P(t + ∆t) − P(t). The classical
model of population growth suggests that the relative increase in population
is proportional to the time interval, i.e.

∆P(t)/P(t) = r∆t,

where the constant r > 0 denotes the population growth rate. Allowing for infinitesimal
time intervals, the aforementioned equation can be written as

dP(t) = rP(t)dt.

This differential equation has the solution P(t) = P0 e^{rt}, where P0 is the initial population size. The evolution of the population is driven by its growth rate r. In real life
this rate is not constant. It might be a function of time t, or, more generally, it
might oscillate irregularly around some deterministic average a(t):

rt = a(t) + “noise”.
In this case, rt becomes a random variable indexed over time t. The associated equation
becomes a stochastic differential equation

dP(t) = (a(t) + “noise”) P(t)dt.    (1.1.1)

Solving an equation of type (1.1.1) is a problem of Stochastic Calculus.
Logistic growth model The previous exponential growth model allows the population
to increase indefinitely. However, due to competition, limited space or resources, the
population will increase more and more slowly. This model was introduced by P.F. Verhulst
in 1838 and rediscovered by R. Pearl in the twentieth century. The main assumption
of the model is that the amount of competition is proportional to the number of
encounters between the population members, which is proportional to the square of
the population size:

dP(t) = rP(t)dt − kP(t)² dt.    (1.1.2)

The solution is given by the logistic function

P(t) = P0 K / (P0 + (K − P0) e^{−rt}),

where K = r/k is the saturation level of the population. One of the stochastic variants
of equation (1.1.2) is given by

dP(t) = rP(t)dt − kP(t)² dt + β(“noise”)P(t),

where β ∈ R is a measure of the size of the noise in the system. This equation is used
to model the growth of a population in a stochastic, crowded environment.
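Although the tools needed to solve such equations are developed only later in the book, the following minimal Python sketch (added here for illustration; it is not part of the original text) shows one common way to simulate the stochastic logistic model: the term “noise” dt is replaced by a Brownian increment √∆t·Z with Z ∼ N(0, 1). All parameter values are arbitrary choices.

import numpy as np

# Hypothetical parameters, chosen only for illustration
r, k, beta = 1.0, 0.01, 0.1      # growth rate, competition, noise size
P0, T, n = 10.0, 10.0, 10_000    # initial population, horizon, number of steps
dt = T / n

rng = np.random.default_rng(0)
P = np.empty(n + 1)
P[0] = P0
for i in range(n):
    dW = rng.normal(0.0, np.sqrt(dt))   # Brownian increment plays the role of "noise" dt
    P[i + 1] = P[i] + r * P[i] * dt - k * P[i]**2 * dt + beta * P[i] * dW

K = r / k
print(f"saturation level K = {K}, simulated P(T) = {P[-1]:.2f}")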
1.1.2 Pricing zero-coupon bonds
A bond is a financial instrument which pays back at the end of its lifetime, T, an
amount equal to B, and provides some periodic payments, called coupons. If
the coupons are equal to zero, the bond is called a zero-coupon bond or a discount bond.
Using the time value of money, the price of a bond at time t is B(t) = Be^{−r(T−t)}, where
r is the risk-free interest rate. The bond satisfies the ordinary differential equation

dB(t) = rB(t)dt

with the final condition B(T) = B. In a “noisy” market the constant interest rate r
is replaced by rt = r(t) + “noise”, a fact that makes the bond pricing more complicated.
The pricing can then be carried out using Stochastic Calculus.
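A rough numerical illustration of this effect (added sketch, not from the text): one may compare the deterministic price B e^{−rT} with the average of e^{−∫_0^T rs ds} when the rate is perturbed by Brownian noise. The modeling choice and all numbers below are arbitrary assumptions made only for this illustration.

import numpy as np

B, r, T = 100.0, 0.05, 5.0            # face value, base rate, maturity (arbitrary)
sigma, n, paths = 0.01, 500, 20_000   # noise size, time steps, sample paths
dt = T / n

rng = np.random.default_rng(1)
print("deterministic price:", B * np.exp(-r * T))

# noisy rate: the integral of r_s over [0, T] becomes r*T plus a scaled Brownian term
dW = rng.normal(0.0, np.sqrt(dt), size=(paths, n))
integral = r * T + sigma * dW.sum(axis=1)          # one value of ∫_0^T r_s ds per path
print("average noisy price:", (B * np.exp(-integral)).mean())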
Stochastic pendulum
The free oscillations of a simple pendulum of unit mass can be described by the nonlinear equation θ̈(t) = −k² sin θ(t), where θ(t) is the angle between the string and the
vertical direction. If the pendulum is moving under the influence of a time-dependent
exterior force F = F(t), then the equation of the pendulum with forced oscillations is
given by θ̈(t) + k² sin θ(t) = F(t). We may encounter the situation when the force is
not deterministic and we have

F(t) = f(t) + (“noise”).

How does the noisy force influence the angle deviation θ(t)? Stochastic Calculus can
be used to answer this question.
1.1.3 Diffusions of particles
Consider a flowing fluid with the velocity field v(x). A particle that moves with the
fluid has a trajectory φ(t) described by the equation φ′(t) = v(φ(t)). A small particle,
that is also subject to molecular bombardments, will be described by an equation of
the type φ′(t) = v(φ(t)) + σ(“noise”), where the constant σ > 0 determines the size of
the noise and controls the diffusion of the small particle in the fluid.
1.1.4 White noise
All the aforementioned problems involved a “noise” influence. This noise is directly related
to the trajectory of a small particle which diffuses in a liquid due to molecular
bombardments (just consider the last example in the case of a static fluid, v = 0). This
was first observed by R. Brown in 1827; it is called Brownian motion and
will be denoted by Bt. It is a very irregular, continuous trajectory, which from the
mathematical point of view is nowhere differentiable.
The adjective “white” comes from signal processing, where it refers to the fact
that the noise is completely unpredictable, i.e. it is not biased towards any specific
“frequency” (white light is an equal mixture of radiations of all visible frequencies). If Nt denotes the “white noise” at time t, the trajectory φ(t) of a
diffused particle satisfies

φ′(t) = σNt,   t ≥ 0.

The solution is a scaled Brownian motion starting at x, i.e., φ(t) = x + σBt.
Therefore the white noise is the instantaneous rate of change of the Brownian motion,
and can be written informally as

Nt = dBt/dt.

This looks contradictory, since Bt is not differentiable. However, there is a way of
making sense of the previous formula by considering the derivative in the following
“generalized sense”:

∫_R Nt f(t) dt = − ∫_R Bt f′(t) dt,

for any compactly supported, smooth function f.
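The generalized-derivative identity can be checked numerically. The sketch below (added here; not part of the book) approximates ∫ Nt f(t) dt by the sum Σ f(t_k)∆B_k and compares it with −∫ Bt f′(t) dt for a smooth bump function; the test function and discretization are arbitrary choices.

import numpy as np

rng = np.random.default_rng(2)
n, T = 200_000, 1.0
dt = T / n
t = np.linspace(0.0, T, n + 1)

dB = rng.normal(0.0, np.sqrt(dt), n)        # Brownian increments
B = np.concatenate(([0.0], np.cumsum(dB)))  # Brownian path B_t

# smooth bump supported inside (0, 1): f(t) = exp(-1/(t(1-t))), zero at the endpoints
f = np.zeros_like(t)
inside = (t > 0) & (t < 1)
f[inside] = np.exp(-1.0 / (t[inside] * (1.0 - t[inside])))
fprime = np.gradient(f, t)

lhs = np.sum(f[:-1] * dB)                   # ≈ ∫ N_t f(t) dt, using N_t dt ≈ dB_t
rhs = -np.sum(B[:-1] * fprime[:-1]) * dt    # −∫ B_t f'(t) dt
print(lhs, rhs)                             # the two numbers should be close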
1.2 Bounded Variation
Let f : [a, b] → R be a continuously differentiable function, and consider the partition
a = x0 < x1 < · · · < xn = b of the interval [a, b]. A smooth curve y = f(x),
a ≤ x ≤ b, is rectifiable (has length) if the sum of the lengths of the line segments with
vertices at P0(x0, f(x0)), . . . , Pn(xn, f(xn)) is bounded by a given constant, which is
independent of the number n and the choice of the division points xi. Assuming the
division equidistant, ∆x = (b − a)/n, the curve length is given by

ℓ = sup_{xi} Σ_{k=0}^{n−1} |Pk Pk+1| = sup_{xi} Σ_{k=0}^{n−1} √((∆x)² + (∆f)²).

Furthermore, if f is continuously differentiable, the computation can be continued as

ℓ = lim_{n→∞} Σ_{k=0}^{n−1} √(1 + (∆f/∆x)²) ∆x = ∫_a^b √(1 + f′(x)²) dx,
where we used that the limit of an increasing sequence is equal to its superior limit.
Definition 1.2.1 The function f(x) has bounded variation on the interval [a, b] if for
any division a = x0 < x1 < · · · < xn = b the sum

Σ_{k=0}^{n−1} |f(xk+1) − f(xk)|

is bounded above by a given constant.
The total variation of f on [a, b] is defined by

V(f) = sup_{xi} Σ_{k=0}^{n−1} |f(xk+1) − f(xk)|.    (1.2.3)

The amount V(f) measures in a certain sense the “roughness” of the function. If f is
a constant function, then V(f) = 0. If f is a stair-type function, then V(f) is the sum
of the absolute values of its jumps.
We note that if f is continuously differentiable, then the total variation can be
written as an integral

V(f) = sup_{xi} Σ_{k=0}^{n−1} |f(xk+1) − f(xk)| = lim_{n→∞} Σ_{k=0}^{n−1} (|f(xk+1) − f(xk)| / (xk+1 − xk)) ∆x = ∫_a^b |f′(x)| dx.
The next result states a relation between the length of the graph and the total
variation of a function f .
Proposition 1.2.2 Let f : [a, b] → R be a function. Then the graph y = f (x) has
length if and only if V (f ) < ∞.
Proof: Consider the simplifying notations (∆f)k = f(xk+1) − f(xk) and ∆x = xk+1 − xk. Taking the summation in the double inequality

|(∆f)k| ≤ √((∆x)² + |(∆f)k|²) ≤ ∆x + |(∆f)k|

and then applying the “sup” yields

V(f) ≤ ℓ ≤ (b − a) + V(f),

which implies the desired conclusion.
By virtue of the previous result, functions with infinite total variation have
graphs of infinite length.
The following informal computation shows that the Brownian motion has infinite
total variation:

V(Bt) = sup_{tk} Σ_{k=0}^{n−1} |B_{tk+1} − B_{tk}| = ∫_a^b |dBt/dt| dt = ∫_a^b |Nt| dt = ∞,

since the area under the curve t → |Nt| is infinite.
We can try to capture a finer “roughness” of the function using the quadratic
variation of f on the interval [a, b]:

V^(2)(f) = sup_{xi} Σ_{k=0}^{n−1} |f(xk+1) − f(xk)|²,    (1.2.4)

where the “sup” is taken over all divisions a = x0 < x1 < · · · < xn = b.
It is worth noting that if f has bounded total variation, V(f) < ∞, then V^(2)(f) = 0.
This comes from the following inequality:

V^(2)(f) ≤ max_{xk} |f(xk+1) − f(xk)| · V(f) → 0,   as |∆x| → 0.
The total variation for the Brownian motion does not provide much information.
It turns out that the correct measure for the roughness of the Brownian motion is the
quadratic variation

V^(2)(Bt) = sup_{ti} Σ_{k=0}^{n−1} |B(tk+1) − B(tk)|².    (1.2.5)

It will be shown that V^(2)(Bt) is equal to the time interval b − a.
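A quick numerical illustration of the contrast between the two notions (added sketch; not in the original text): for a simulated Brownian path on [a, b] = [0, 1], the sum of |∆B| over a partition grows without bound as the partition is refined, while the sum of (∆B)² stays close to b − a = 1.

import numpy as np

rng = np.random.default_rng(3)
a, b = 0.0, 1.0

for n in (100, 1_000, 10_000, 100_000):
    dt = (b - a) / n
    dB = rng.normal(0.0, np.sqrt(dt), n)   # increments of a Brownian path over [a, b]
    total_var = np.abs(dB).sum()           # Σ |B(t_{k+1}) - B(t_k)|   -> grows like sqrt(n)
    quad_var = (dB**2).sum()               # Σ |B(t_{k+1}) - B(t_k)|^2 -> b - a
    print(n, round(total_var, 2), round(quad_var, 4))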
Chapter 2
Basic Notions
2.1 Probability Space
The modern theory of probability stems from the work of A. N. Kolmogorov published
in 1933. Kolmogorov associates a random experiment with a probability space, which
is a triplet, (Ω, F, P ), consisting of the set of outcomes, Ω, a σ-field, F, with Boolean
algebra properties, and a probability measure, P . In the following sections, each of
these elements will be discussed in more detail.
2.2 Sample Space
A random experiment in the theory of probability is an experiment whose outcomes
cannot be determined in advance. When an experiment is performed, all possible
outcomes form a set called the sample space, which will be denoted by Ω.
For instance, flipping a coin produces the sample space with two states Ω = {H, T },
while rolling a die yields a sample space with six states Ω = {1, · · · , 6}. Choosing
randomly a number between 0 and 1 corresponds to a sample space which is the entire
segment Ω = (0, 1).
In financial markets one can regard Ω as the set of states of the world, understanding by
this all possible states the world might have. The number of states of the world that
affect the stock market is huge. These would contain all possible values of the vector
of parameters that describe the world, which is practically infinite.
All subsets of the sample space Ω form a set denoted by 2^Ω. The reason for this
notation is that the set of parts of Ω can be put into a bijective correspondence with the
set of binary functions f : Ω → {0, 1}. The number of elements of this set is 2^|Ω|, where
|Ω| denotes the cardinal of Ω. If the set is finite, |Ω| = n, then 2^Ω has 2^n elements. If
Ω is infinitely countable (i.e. can be put into a bijective correspondence with the set
of natural numbers), then 2^Ω is infinite and its cardinal is the same as that of the real
number set R.
Remark 2.2.1 Pick a natural number at random. Any subset of the sample space
corresponds to a sequence formed with 0 and 1. For instance, the subset {1, 3, 5, 6}
corresponds to the sequence 10101100000 . . . having 1 on the 1st, 3rd, 5th and 6th
places and 0 in rest. It is known that the number of these sequences is infinite and
can be put into a bijective correspondence with the real number set R. This can be
also written as |2^N| = |R|, and stated by saying that the set of all subsets of natural
numbers N has the same cardinal as the real number set R.
2.3 Events and Probability
The set of parts 2^Ω satisfies the following properties:
1. It contains the empty set Ø;
2. If it contains a set A, then it also contains its complement Ā = Ω\A;
3. It is closed with regard to unions, i.e., if A1, A2, . . . is a sequence of sets, then
their union A1 ∪ A2 ∪ · · · also belongs to 2^Ω.
Any subset F of 2^Ω that satisfies the previous three properties is called a σ-field. The
sets belonging to F are called events. This way, the complement of an event, or the
union of events is also an event. We say that an event occurs if the outcome of the
experiment is an element of that subset.
The chance of occurrence of an event is measured by a probability function P :
F → [0, 1] which satisfies the following two properties:
1. P (Ω) = 1;
2. For any mutually disjoint events A1 , A2 , · · · ∈ F,
P (A1 ∪ A2 ∪ · · · ) = P (A1 ) + P (A2 ) + · · · .
The triplet (Ω, F, P ) is called a probability space. This is the main setup in which
the probability theory works.
Example 2.3.1 In the case of flipping a coin, the probability space has the following
elements: Ω = {H, T}, F = {Ø, {H}, {T}, {H, T}} and P defined by P(Ø) = 0,
P({H}) = 1/2, P({T}) = 1/2, P({H, T}) = 1.
Example 2.3.2 Consider a finite sample space Ω = {s1, . . . , sn}, with the σ-field F =
2^Ω, and probability given by P(A) = |A|/n, ∀A ∈ F. Then (Ω, F, P) is a probability
space.
Figure 2.1: If any set X^{−1}((a, b)) is “known”, then the random variable X : Ω → R is
2^Ω-measurable.
Example 2.3.3 Let Ω = [0, 1] and consider the σ-field B([0, 1]) given by the set of
all open or closed intervals on [0, 1], or any unions, intersections, and complementary
sets. Define P (A) = λ(A), where λ stands for the Lebesgue measure (in particular,
if A = (a, b), then P (A) = b − a is the length of the interval). It can be shown that
(Ω, B([0, 1]), P ) is a probability space.
2.4 Random Variables
Since the σ-field F provides the knowledge about which events are possible on the
considered probability space, then F can be regarded as the information component
of the probability space (Ω, F, P ). A random variable X is a function that assigns a
numerical value to each state of the world, X : Ω → R, such that the values taken by
X are known to someone who has access to the information F. More precisely, given
any two numbers a, b ∈ R, then all the states of the world for which X takes values
between a and b form a set that is an event (an element of F), i.e.
{ω ∈ Ω; a < X(ω) < b} ∈ F.
Another way of saying this is that X is an F-measurable function.
Example 2.4.1 Let X(ω) be the number of people who want to buy houses, given the
state of the market ω. Is X measurable? This would mean that given two numbers,
say a = 10, 000 and b = 50, 000, we know all the market situations ω for which there
are at least 10,000 and at most 50,000 people willing to purchase houses. In theory, it
often makes sense to assume that we have enough knowledge to consider X
measurable.
Example 2.4.2 Consider the experiment of flipping three coins. In this case Ω is the
set of all possible triplets. Consider the random variable X which gives the number of
tails obtained. For instance X(HHH) = 0, X(HHT ) = 1, etc. The sets
{ω; X(ω) = 0} = {HHH},
{ω; X(ω) = 3} = {T T T },
{ω; X(ω) = 1} = {HHT, HT H, T HH},
{ω; X(ω) = 2} = {HT T, T HT, T T H}
belong to 2^Ω, and hence X is a random variable.
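A small enumeration (added for illustration; not part of the book) reproduces the sets of Example 2.4.2 and the probabilities P(X = k) for a fair coin.

from itertools import product

omega = ["".join(w) for w in product("HT", repeat=3)]   # sample space of all triplets
X = {w: w.count("T") for w in omega}                    # number of tails obtained

for k in range(4):
    event = [w for w in omega if X[w] == k]
    print(k, event, len(event) / len(omega))            # P(X = k) = |event| / 8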
2.5 Integration in Probability Measure
The notion of expectation is based on integration on measure spaces. In this section
we recall briefly the definition of an integral with respect to the probability measure
P.
Let X : Ω → R be a random variable on the probability space (Ω, F, P). A partition
(Ωi)_{1≤i≤n} of Ω is a family of subsets Ωi ⊂ Ω, with Ωi ∈ F, satisfying
1. Ωi ∩ Ωj = Ø, for i ≠ j;
2. ∪_{i=1}^n Ωi = Ω.
Each Ωi is an event and its associated probability is P(Ωi). Consider the characteristic
function of a set A ⊂ Ω defined by χA(ω) = 1 if ω ∈ A, and χA(ω) = 0 if ω ∉ A.
More properties of χA can be found in Exercise 2.10.8.
A simple function is a sum of characteristic functions, f = Σ_{i=1}^n ci χ_{Ωi}. This means
f(ω) = ck for ω ∈ Ωk. The integral of the simple function f is defined by

∫_Ω f dP = Σ_{i=1}^n ci P(Ωi).
If X : Ω → R is a random variable, then from measure theory it is known that
there is a sequence of simple functions (fn)_{n≥1} satisfying lim_{n→∞} fn(ω) = X(ω). Furthermore, if X ≥ 0, then we may assume that fn ≤ fn+1. Then from the monotone
convergence theorem we can write

∫_Ω X dP = lim_{n→∞} ∫_Ω fn dP.

If X is not non-negative, we can write it as X = X1 − X2 with Xi ≥ 0 and define

∫_Ω X dP = ∫_Ω X1 dP − ∫_Ω X2 dP.

From now on, the integral notations ∫_Ω X dP and ∫_Ω X(ω) dP(ω) will be used interchangeably. In the rest of the chapter the integral notation will be used informally,
without requiring a direct use of the previous definition.
2.6 Distribution Functions
Let X be a random variable on the probability space (Ω, F, P). The distribution
function of X is the function FX : R → [0, 1] defined by

FX(x) = P(ω; X(ω) ≤ x).

It is worth observing that since X is a random variable, the set {ω; X(ω) ≤ x}
belongs to the information set F.
The distribution function is non-decreasing and satisfies the limits

lim_{x→−∞} FX(x) = 0,    lim_{x→+∞} FX(x) = 1.

If we have

d/dx FX(x) = p(x),

then we say that p(x) is the probability density function of X.
It is important to note the following relation among the distribution function, the probability and the probability density function of the random variable X:

FX(x) = P(X ≤ x) = ∫_{{X≤x}} dP(ω) = ∫_{−∞}^x p(u) du.    (2.6.1)

The probability density function p(x) has the following properties:
(i) p(x) ≥ 0;
(ii) ∫_{−∞}^∞ p(u) du = 1.
The first one is a consequence of the fact that the distribution function FX(x) is
nondecreasing. The second follows from (2.6.1) by making x → ∞:

∫_{−∞}^∞ p(u) du = ∫_Ω dP = P(Ω) = 1.

As an extension of formula (2.6.1) we have, for any measurable function h,

∫_Ω h(X(ω)) dP(ω) = ∫_R h(x)p(x) dx.    (2.6.2)

Another useful property, which follows from the Fundamental Theorem of Calculus, is

P(a < X < b) = P(ω; a < X(ω) < b) = ∫_a^b p(x) dx.

In the case of discrete random variables the aforementioned integral is replaced by the
following sum:

P(a < X < b) = Σ_{a<x<b} P(X = x).

For more details the reader is referred to a traditional probability book, such as Wackerly et al. [?].
2.7 Independence
Roughly speaking, two random variables X and Y are independent if the occurrence
of one of them does not change the probability density of the other. More precisely, if
for any two open intervals A, B ⊂ R, the events
E = {ω; X(ω) ∈ A},
F = {ω; Y (ω) ∈ B}
are independent, i.e., P (E ∩ F ) = P (E)P (F ), then X and Y are called independent
random variables.
Proposition 2.7.1 Let X and Y be independent random variables with density functions pX (x) and pY (y). Then the joint density function of (X, Y ) is given by pX,Y (x, y) =
pX (x) pY (y).
Proof: Using the independence of sets, we have1
pX,Y (x, y) dxdy = P (x < X < x + dx, y < Y < y + dy)
= P (x < X < x + dx)P (y < Y < y + dy)
= pX (x) dx pY (y) dy
= pX (x)pY (y) dxdy.
Dropping the factor dxdy yields the desired result. We note that the converse holds
true.
The σ-algebra generated by a random variable X : Ω → R is the σ-algebra generated
by the unions, intersections and complements of events of the form {ω; X(ω) ∈ (a, b)},
with a < b real numbers. It will be denoted by AX .
Two σ-fields G and H included in F are called independent if
P (G ∩ H) = P (G)P (H),
∀G ∈ G, H ∈ H.
The random variable X and the σ-field G are called independent if the algebras AX
and G are independent.
2.8 Expectation
A random variable X : Ω → R is called integrable if

∫_Ω |X(ω)| dP(ω) = ∫_R |x| p(x) dx < ∞,

where p(x) denotes the probability density function of X. The previous identity is
based on changing the domain of integration from Ω to R.
¹ We are using the useful approximation P(x < X < x + dx) = ∫_x^{x+dx} p(u) du = p(x)dx.
The expectation of an integrable random variable X is defined by

E[X] = ∫_Ω X(ω) dP(ω) = ∫_R x p(x) dx.

Customarily, the expectation of X is denoted by µ and it is also called the mean. In
general, for any measurable function h : R → R, we have

E[h(X)] = ∫_Ω h(X(ω)) dP(ω) = ∫_R h(x)p(x) dx.

In the case of a discrete random variable X the expectation is defined as

E[X] = Σ_{k≥1} xk P(X = xk).
Proposition 2.8.1 The expectation operator E is linear, i.e. for any integrable random variables X and Y
1. E[cX] = cE[X],
∀c ∈ R;
2. E[X + Y ] = E[X] + E[Y ].
Proof: It follows from the fact that the integral is a linear operator.
Proposition 2.8.2 Let X and Y be two independent integrable random variables.
Then
E[XY ] = E[X]E[Y ].
Proof: This is a variant of Fubini’s theorem, which in this case states that a double
integral is a product of two simple integrals. Let pX , pY , pX,Y denote the probability densities of X, Y and (X, Y ), respectively. Since X and Y are independent, by
Proposition 2.7.1 we have
E[XY] = ∫∫ xy p_{X,Y}(x, y) dxdy = (∫ x pX(x) dx)(∫ y pY(y) dy) = E[X]E[Y].
Definition 2.8.3 The covariance of two random variables is defined by

Cov(X, Y) = E[XY] − E[X]E[Y].

The variance of X is given by

Var(X) = Cov(X, X).
Proposition 2.8.2 states that if X and Y are independent, then Cov(X, Y) = 0. It
is worth noting that the converse is not necessarily true, see Exercise 2.8.6.
Exercise 2.8.4 Show that
(a) Cov(X, Y ) = E[(X − µX )(Y − µY )], where µX = E[X] and µY = E[Y ];
(b) V ar(X) = E[(X − µX )2 ];
From Exercise 2.8.4 (b), we have Var(X) ≥ 0, so there is a real number σ ≥ 0 such
that Var(X) = σ². The number σ is called the standard deviation.
Exercise 2.8.5 Let µ and σ denote the mean and the standard deviation of the random
variable X. Show that
E[X 2 ] = µ2 + σ 2 .
Exercise 2.8.6 Consider two random variables with the following table of joint probabilities:

Y \ X    −1      0      1
 −1     1/16   3/16   1/16
  0     3/16    0     3/16
  1     1/16   3/16   1/16
Show the following:
(a) E[X] = E[Y ] = E[XY ] = 0;
(b) Cov(X, Y ) = 0.
(c) P(0, 0) ≠ PX(0)PY(0);
(d) X and Y are not independent.
The covariance can be standardized in the following way. Let σX and σY be the
standard deviations of X and Y, respectively. The correlation coefficient of X and Y
is defined as

ρ(X, Y) = Cov(X, Y) / (σX σY).
Example 2.8.1 (a) Show the inequality
−1 ≤ ρ(X, Y ) ≤ 1.
(b) What can you say about the random variables X and Y if ρ(X, Y ) = 1?
2.8.1 The best approximation of a random variable
Let X be a random variable. We would like to approximate X by a single (nonrandom)
number x. The “best” value of x is chosen in the sense of “least squares”, i.e. x
is picked such that the expectation of the squared error (X − x)² is minimum. Denote
µ = E[X] and σ² = Var(X). Since

E[(X − x)²] = E[X²] − 2xE[X] + x²
            = σ² + µ² − 2xµ + x²
            = σ² + (x − µ)²,

the minimum is obtained for x = µ, and in this case

min_x E[(X − x)²] = σ².

It follows that the mean is the best approximation of the random variable X in the
least squares sense.
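A short sampling check of this fact (an added sketch; the distribution is an arbitrary choice): the mean squared error E[(X − x)²] is smallest when x is close to the sample mean.

import numpy as np

rng = np.random.default_rng(4)
X = rng.exponential(scale=2.0, size=100_000)   # any integrable X; exponential chosen arbitrarily

xs = np.linspace(0.0, 5.0, 501)
mse = [np.mean((X - x)**2) for x in xs]        # E[(X - x)^2] as a function of x
best = xs[int(np.argmin(mse))]
print("argmin of E[(X-x)^2]:", best, " sample mean:", X.mean())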
2.8.2 Change of measure in an expectation
Let P, Q : F → R be two probability measures on Ω such that there is an integrable
random variable f : Ω → R with dQ = f dP. This means

Q(A) = ∫_A dQ = ∫_A f(ω) dP(ω),   ∀A ∈ F.

Denote by E^P and E^Q the expectations with respect to the measures P and Q, respectively. Then we have

E^Q[X] = ∫_Ω X(ω) dQ(ω) = ∫_Ω X(ω)f(ω) dP(ω) = E^P[fX].
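The identity E^Q[X] = E^P[fX] can be illustrated with a small sampling experiment (added sketch; the choices of Ω, f and X below are arbitrary): take Ω = [0, 1] with P the Lebesgue measure and dQ = f dP with f(ω) = 2ω.

import numpy as np

rng = np.random.default_rng(5)
omega = rng.uniform(0.0, 1.0, 1_000_000)   # samples from P, the Lebesgue measure on [0, 1]

f = lambda w: 2.0 * w                      # density dQ/dP; it integrates to 1 on [0, 1]
X = lambda w: np.cos(w)                    # an arbitrary random variable

# E^Q[X] computed as E^P[f X] ...
lhs = np.mean(f(omega) * X(omega))
# ... versus direct sampling from Q (its cdf is w^2, so Q-samples are sqrt of uniforms)
q_samples = np.sqrt(rng.uniform(0.0, 1.0, 1_000_000))
rhs = np.mean(X(q_samples))
print(lhs, rhs)                            # the two estimates should agree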
Exercise 2.8.7 Let g : [0, 1] → [0, ∞) be an integrable function with ∫_0^1 g(x) dx = 1.
Consider Q : B([0, 1]) → R, given by Q(A) = ∫_A g(x) dx. Show that Q is a probability
measure on (Ω = [0, 1], B([0, 1])).
2.9 Basic Distributions
We shall recall a few basic distributions, which are most often seen in applications.
Normal distribution A random variable X is said to have a normal distribution if
its probability density function is given by

p(x) = 1/(σ√(2π)) e^{−(x−µ)²/(2σ²)},

with µ and σ > 0 constant parameters, see Fig.2.2a. The mean and variance are given by

E[X] = µ,   Var[X] = σ².

If X has a normal distribution with mean µ and variance σ², we shall write X ∼ N(µ, σ²).
Exercise 2.9.1 Let α, β ∈ R. Show that if X is normally distributed, with X ∼
N(µ, σ²), then Y = αX + β is also normally distributed, with Y ∼ N(αµ + β, α²σ²).
Log-normal distribution Let X be normally distributed with mean µ and variance
σ². Then the random variable Y = e^X is said to be log-normally distributed. The mean
and variance of Y are given by

E[Y] = e^{µ+σ²/2},   Var[Y] = e^{2µ+σ²}(e^{σ²} − 1).

The density function of the log-normally distributed random variable Y is given by

p(x) = 1/(xσ√(2π)) e^{−(ln x−µ)²/(2σ²)},   x > 0,

see Fig.2.2b.
Definition 2.9.2 The moment generating function of a random variable X is the function mX(t) = E[e^{tX}] = ∫ e^{tx} p(x) dx, where p(x) is the probability density function of
X, provided the integral exists.
The name comes from the fact that the moments of X, given by µn = E[X^n], are
generated as derivatives of mX(t):

d^n mX(t)/dt^n |_{t=0} = µn.

It is worth noting the relation between the Laplace transform and the moment generating function: L(p(x))(t) = ∫ e^{−tx} p(x) dx = mX(−t).
Exercise 2.9.3 Find the moment generating function for the exponential distribution
p(x) = λe−xλ , x ≥ 0, λ > 0.
Exercise 2.9.4 Show that if X and Y are two independent random variables, then
mX+Y (t) = mX (t)mY (t).
21
0.4
0.5
0.3
0.4
0.3
0.2
0.2
0.1
0.1
-4
0
-2
2
4
0
2
4
a
6
8
b
3.5
Α = 8, Β = 3
Α = 3, Β = 9
3.0
0.20
Α = 3, Β = 2
2.5
0.15
2.0
0.10
1.5
Α = 4, Β = 3
1.0
0.05
0.5
0
5
10
15
20
0.0
0.2
0.4
c
0.6
0.8
1.0
d
Figure 2.2: a Normal distribution; b Log-normal distribution; c Gamma distributions;
d Beta distributions.
Exercise 2.9.5 Given that the moment generating function of a normally distributed
random variable X ∼ N(µ, σ²) is m(t) = E[e^{tX}] = e^{µt + t²σ²/2}, show that
(a) E[Y^n] = e^{nµ + n²σ²/2}, where Y = e^X;
(b) the mean and variance of the log-normal random variable Y = e^X are

E[Y] = e^{µ+σ²/2},   Var[Y] = e^{2µ+σ²}(e^{σ²} − 1).
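These log-normal formulas are easy to verify by simulation. The sketch below is an added illustration; the parameter values µ and σ are arbitrary.

import numpy as np

rng = np.random.default_rng(6)
mu, sigma = 0.3, 0.5                       # arbitrary parameters
X = rng.normal(mu, sigma, 2_000_000)
Y = np.exp(X)                              # log-normal sample

print(Y.mean(), np.exp(mu + sigma**2 / 2))                          # E[Y]
print(Y.var(), np.exp(2*mu + sigma**2) * (np.exp(sigma**2) - 1))    # Var[Y]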
Gamma distribution A random variable X is said to have a gamma distribution with
parameters α > 0, β > 0 if its density function is given by

p(x) = x^{α−1} e^{−x/β} / (β^α Γ(α)),   x ≥ 0,

where Γ(α) denotes the gamma function,² see Fig.2.2c. The mean and variance are

E[X] = αβ,   Var[X] = αβ².

The case α = 1 is known as the exponential distribution, see Fig.2.3a. In this case

p(x) = (1/β) e^{−x/β},   x ≥ 0.

² Recall the definition of the gamma function Γ(α) = ∫_0^∞ y^{α−1} e^{−y} dy; if α = n, an integer, then Γ(n) = (n − 1)!.
Figure 2.3: a Exponential distribution; b Poisson distribution.
The particular case when α = n/2 and β = 2 becomes the χ²-distribution with n
degrees of freedom. This also characterizes the distribution of a sum of squares of n
independent standard normal random variables.
Beta distribution A random variable X is said to have a beta distribution with
parameters α > 0, β > 0 if its probability density function is of the form

p(x) = x^{α−1}(1 − x)^{β−1} / B(α, β),   0 ≤ x ≤ 1,

where B(α, β) denotes the beta function.³ See Fig.2.2d for two particular density
functions. In this case

E[X] = α/(α + β),   Var[X] = αβ/((α + β)²(α + β + 1)).
Poisson distribution A discrete random variable X is said to have a Poisson probability distribution if

P(X = k) = (λ^k / k!) e^{−λ},   k = 0, 1, 2, . . . ,

with λ > 0 a parameter, see Fig.2.3b. In this case E[X] = λ and Var[X] = λ.
Pearson 5 distribution Let α, β > 0. A random variable X with the density function

p(x) = (1/(βΓ(α))) e^{−β/x} / (x/β)^{α+1},   x ≥ 0,

is said to have a Pearson 5 distribution⁴ with positive parameters α and β. It can be
shown that

E[X] = β/(α − 1) if α > 1, and ∞ otherwise;
Var(X) = β²/((α − 1)²(α − 2)) if α > 2, and ∞ otherwise.

The mode of this distribution is equal to β/(α + 1).
The Inverse Gaussian distribution Let µ, λ > 0. A random variable X has an
inverse Gaussian distribution with parameters µ and λ if its density function is given
by

p(x) = √(λ/(2πx³)) e^{−λ(x−µ)²/(2µ²x)},   x > 0.    (2.9.3)

We shall write X ∼ IG(µ, λ). Its mean, variance and mode are given by

E[X] = µ,   Var(X) = µ³/λ,   Mode(X) = µ(√(1 + 9µ²/(4λ²)) − 3µ/(2λ)),

where the mode denotes the value xm for which p(x) is maximum, i.e., p(xm) =
max_x p(x). This distribution will be used to model the time instance when a Brownian
motion with drift exceeds a certain barrier for the first time.
³ Two definition formulas for the beta function are B(α, β) = Γ(α)Γ(β)/Γ(α + β) = ∫_0^1 y^{α−1}(1 − y)^{β−1} dy.
⁴ The Pearson family of distributions was designed by Pearson between 1890 and 1895. There are
several Pearson distributions, this one being distinguished by the number 5.
2.10 Conditional Expectations
Let X be a random variable on the probability space (Ω, F, P), and let G be a σ-field
contained in F. Since X is F-measurable, the expectation of X given the information
F must be X itself, a fact that can be written as E[X|F] = X (for details see Example
2.10.5).
On the other hand, the information G does not completely determine X. The random
variable that makes a prediction for X based on the information G is denoted by E[X|G],
and is called the conditional expectation of X given G. This is defined as the random
variable Y = E[X|G] which is the best approximation of X in the least squares sense,
i.e.

E[(X − Y)²] ≤ E[(X − Z)²],    (2.10.4)

for any G-measurable random variable Z.
The set of all random variables on Ω forms a Hilbert space with the inner product

⟨X, Y⟩ = E[XY].

This defines the norm ‖X‖² = E[X²], which induces the distance d(X, Y) = ‖X − Y‖.
Denote by S_G the set of all G-measurable random variables on Ω. We shall show that the
element of S_G that is the closest to X in the aforementioned distance is the conditional
expectation Y = E[X|G]. Let X⊥ denote the orthogonal projection of X on the space
S_G. This satisfies

E[(X − X⊥)(Z − X⊥)] = 0,   ∀Z ∈ S_G.    (2.10.5)

The Pythagorean relation

‖X − X⊥‖² + ‖Z − X⊥‖² = ‖X − Z‖²,   ∀Z ∈ S_G

implies the inequality

‖X − X⊥‖² ≤ ‖X − Z‖²,   ∀Z ∈ S_G,

which is equivalent to

E[(X − X⊥)²] ≤ E[(X − Z)²],   ∀Z ∈ S_G,

which yields X⊥ = E[X|G], so the conditional expectation is the orthogonal projection
of X on the space S_G. The uniqueness of this projection is a consequence of the
Pythagorean relation. The orthogonality relation (2.10.5) can be written equivalently
as

E[(X − Y)U] = 0,   ∀U ∈ S_G.

Therefore, the conditional expectation Y satisfies the identity

E[XU] = E[YU],   ∀U ∈ S_G.

In particular, if we choose U = χA, the characteristic function of a set A ∈ G, then the
foregoing relation yields

∫_A X dP = ∫_A Y dP,   ∀A ∈ G.

We arrived at the following equivalent characterization of the conditional expectation.
The conditional expectation of X given G is a random variable satisfying:
1. E[X|G] is G-measurable;
2. ∫_A E[X|G] dP = ∫_A X dP,   ∀A ∈ G.
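A concrete discrete illustration of this characterization (added here; not from the book): roll a fair die, let X be the face shown, and let G be the σ-field generated by the partition {odd, even}. Then E[X|G] is constant on each partition set, equal to the average of X over that set, and the defining integral identity holds on each set of the partition.

from fractions import Fraction

omega = [1, 2, 3, 4, 5, 6]
P = {w: Fraction(1, 6) for w in omega}                 # uniform probability
partition = {"odd": [1, 3, 5], "even": [2, 4, 6]}      # generates the sigma-field G

# E[X|G](w) = average of X over the partition set containing w
cond_exp = {}
for A in partition.values():
    avg = sum(P[w] * w for w in A) / sum(P[w] for w in A)
    for w in A:
        cond_exp[w] = avg
print(cond_exp)   # E[X|G] equals 3 on odd outcomes and 4 on even outcomes

# check property 2: the integrals of E[X|G] and of X agree on each A in G
for name, A in partition.items():
    lhs = sum(P[w] * cond_exp[w] for w in A)
    rhs = sum(P[w] * w for w in A)
    print(name, lhs, rhs)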
Exercise 2.10.1 Consider the probability space (Ω, F, P), and let G be a σ-field included in F. If X is a G-measurable random variable such that

∫_A X dP = 0,   ∀A ∈ G,

then X = 0 a.s.
It is worth mentioning here an equivalent result, which relates to conditional expectations:
Theorem 2.10.2 (Radon-Nikodym) Let (Ω, F, P) be a probability space and G be a
σ-field included in F. Then for any random variable X there is a G-measurable random
variable Y such that

∫_A X dP = ∫_A Y dP,   ∀A ∈ G.    (2.10.6)
Radon-Nikodym’s theorem states the existence of Y . In fact this is unique almost
surely by the application of Exercise 2.10.1.
Exercise 2.10.3 Show that if G = {Ø, Ω}, then E[X|G] = E[X].
Proof: We need to show that E[X] satisfies conditions 1 and 2. The first one is
obviously satisfied since any constant is G-measurable. The latter condition is checked
on each set of G. We have

∫_Ω X dP = E[X] = E[X] ∫_Ω dP = ∫_Ω E[X] dP,
∫_Ø X dP = ∫_Ø E[X] dP.
Exercise 2.10.4 Show that E[E[X|G]] = E[X], i.e. all conditional expectations have
the same mean, which is the mean of X.
Proof: Using the definition of expectation and taking A = Ω in the second relation of
the aforementioned definition yields

E[E[X|G]] = ∫_Ω E[X|G] dP = ∫_Ω X dP = E[X],

which ends the proof.
Example 2.10.5 The conditional expectation of X given the total information F is
the random variable X itself, i.e.

E[X|F] = X.

Proof: The random variables X and E[X|F] are both F-measurable (from the definition of the random variable). From the definition of the conditional expectation we
have

∫_A E[X|F] dP = ∫_A X dP,   ∀A ∈ F.

Exercise 2.10.1 implies that E[X|F] = X almost surely.
General properties of the conditional expectation are stated below without proof.
The proof involves more or less simple manipulations of integrals and can be taken as
an exercise for the reader.
Proposition 2.10.6 Let X and Y be two random variables on the probability space
(Ω, F, P ). We have
1. Linearity:
E[aX + bY |G] = aE[X|G] + bE[Y |G],
∀a, b ∈ R;
2. Factoring out the measurable part:
E[XY |G] = XE[Y |G]
if X is G-measurable. In particular, E[X|G] = X.
3. Tower property (“the least information wins”):
E[E[X|G]|H] = E[E[X|H]|G] = E[X|H], if H ⊂ G;
4. Positivity:
E[X|G] ≥ 0, if X ≥ 0;
5. Expectation of a constant is a constant:
E[c|G] = c.
6. An independent condition drops out:
E[X|G] = E[X],
if X is independent of G.
Exercise 2.10.7 Prove the property 3 (tower property) given in the previous proposition.
Exercise 2.10.8 Let X be a random variable on the probability space (Ω, F, P), which
is independent of the σ-field G ⊂ F. Consider the characteristic function of a set A ⊂ Ω
defined by χA(ω) = 1 if ω ∈ A, and χA(ω) = 0 if ω ∉ A. Show the following:
(a) χA is G-measurable for any A ∈ G;
(b) P (A) = E[χA ];
(c) X and χA are independent random variables;
(d) E[χA X] = E[X]P (A) for any A ∈ G;
(e) E[X|G] = E[X].
Figure 2.4: Jensen’s inequality ϕ(E[X]) < E[ϕ(X)] for a convex function ϕ.
2.11 Inequalities of Random Variables
This section prepares the reader for the limits of sequences of random variables and
limits of stochastic processes. We recall first that an infinitely differentiable function
f(x) has a Taylor series at a if

f(x) = f(a) + f′(a)/1! (x − a) + f′′(a)/2! (x − a)² + f′′′(a)/3! (x − a)³ + . . . ,

where x belongs to an interval neighborhood of a.
Exercise 2.11.1 Let f(x) be n + 1 times differentiable on an interval I containing a.
Show that there is a ξ ∈ I such that for any x ∈ I

f(x) = f(a) + f′(a)/1! (x − a) + f′′(a)/2! (x − a)² + · · · + f^(n)(a)/n! (x − a)^n + f^(n+1)(ξ)/(n + 1)! (x − a)^{n+1}.
We shall start with a classical inequality result regarding expectations:
Theorem 2.11.2 (Jensen’s inequality) Let ϕ : R → R be a convex function and
let X be an integrable random variable on the probability space (Ω, F, P ). If ϕ(X) is
integrable, then
ϕ(E[X]) ≤ E[ϕ(X)].
Proof: We shall assume ϕ twice differentiable with ϕ′′ continuous. Let µ = E[X].
Expand ϕ in a Taylor series about µ, see Exercise 2.11.1, and get

ϕ(x) = ϕ(µ) + ϕ′(µ)(x − µ) + (1/2)ϕ′′(ξ)(x − µ)²,

with ξ in between x and µ. Since ϕ is convex, ϕ′′ ≥ 0, and hence

ϕ(x) ≥ ϕ(µ) + ϕ′(µ)(x − µ),

which means the graph of ϕ(x) lies above its tangent line at (µ, ϕ(µ)). Replacing x by
the random variable X, and taking the expectation, yields

E[ϕ(X)] ≥ E[ϕ(µ) + ϕ′(µ)(X − µ)] = ϕ(µ) + ϕ′(µ)(E[X] − µ)
        = ϕ(µ) = ϕ(E[X]),

which proves the result.
Fig.2.4 provides a graphical interpretation of Jensen’s inequality. If the distribution
of X is symmetric, then the distribution of ϕ(X) is skewed, with ϕ(E[X]) < E[ϕ(X)].
It is worth noting that the inequality is reversed for ϕ concave. We shall present
next a couple of applications.
A random variable X : Ω → R is called square integrable if

E[X²] = ∫_Ω |X(ω)|² dP(ω) = ∫_R x² p(x) dx < ∞.
Application 2.11.3 If X is a square integrable random variable, then it is integrable.
Proof: Jensen’s inequality with ϕ(x) = x2 becomes
E[X]2 ≤ E[X 2 ].
Since the right side is finite, it follows that E[X] < ∞, so X is integrable.
Application 2.11.4 If mX (t) denotes the moment generating function of the random
variable X with mean µ, then
mX (t) ≥ etµ .
Proof: Applying Jensen inequality with the convex function ϕ(x) = ex yields
eE[X] ≤ E[eX ].
Substituting tX for X yields
eE[tX] ≤ E[etX ].
(2.11.7)
Using the definition of the moment generating function mX (t) = E[etX ] and that
E[tX] = tE[X] = tµ, then (2.11.7) leads to the desired inequality.
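Both applications can be illustrated numerically. The sketch below is an added example; the distribution and the value of t are arbitrary choices.

import numpy as np

rng = np.random.default_rng(7)
X = rng.exponential(scale=1.5, size=1_000_000)   # arbitrary integrable X, mean 1.5

# Jensen with phi(x) = x^2:  E[X]^2 <= E[X^2]
print(X.mean()**2, (X**2).mean())

# Application 2.11.4: m_X(t) >= e^{t*mu}, checked here for t = 0.3
t = 0.3
print(np.exp(t * X.mean()), np.mean(np.exp(t * X)))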
The variance of a square integrable random variable X is defined by

Var(X) = E[X²] − E[X]².

By Application 2.11.3 we have Var(X) ≥ 0, so there is a constant σX ≥ 0, called the
standard deviation, such that

σX² = Var(X).
Exercise 2.11.5 Prove the following identity:
V ar[X] = E[(X − E[X])2 ].
Exercise 2.11.6 Prove that a non-constant random variable has a non-zero standard
deviation.
Exercise 2.11.7 Prove the following extension of Jensen’s inequality: If ϕ is a convex
function, then for any σ-field G ⊂ F we have
ϕ(E[X|G]) ≤ E[ϕ(X)|G].
Exercise 2.11.8 Show the following:
(a) |E[X]| ≤ E[|X|];
(b) |E[X|G]| ≤ E[|X| |G], for any σ-field G ⊂ F;
(c) |E[X]|r ≤ E[|X|r ], for r ≥ 1;
(d) |E[X|G]|r ≤ E[|X|r |G], for any σ-field G ⊂ F and r ≥ 1.
Theorem 2.11.9 (Markov's inequality) For any λ, p > 0, we have the following
inequality:

P(ω; |X(ω)| ≥ λ) ≤ (1/λ^p) E[|X|^p].

Proof: Let A = {ω; |X(ω)| ≥ λ}. Then

E[|X|^p] = ∫_Ω |X(ω)|^p dP(ω) ≥ ∫_A |X(ω)|^p dP(ω) ≥ ∫_A λ^p dP(ω) = λ^p ∫_A dP(ω) = λ^p P(A) = λ^p P(|X| ≥ λ).

Dividing by λ^p leads to the desired result.
Theorem 2.11.10 (Tchebychev's inequality) If X is a random variable with mean
µ and variance σ², then

P(ω; |X(ω) − µ| ≥ λ) ≤ σ²/λ².

Proof: Let A = {ω; |X(ω) − µ| ≥ λ}. Then

σ² = Var(X) = E[(X − µ)²] = ∫_Ω (X − µ)² dP ≥ ∫_A (X − µ)² dP ≥ λ² ∫_A dP = λ² P(A) = λ² P(ω; |X(ω) − µ| ≥ λ).

Dividing by λ² leads to the desired inequality.
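Both inequalities are easy to check empirically. The sketch below is an added illustration; the distribution and the threshold λ are arbitrary.

import numpy as np

rng = np.random.default_rng(8)
X = rng.gamma(shape=2.0, scale=1.0, size=1_000_000)   # an arbitrary nonnegative distribution
mu, var = X.mean(), X.var()

lam = 4.0
# Markov (p = 1): P(|X| >= lam) <= E[|X|] / lam
print(np.mean(np.abs(X) >= lam), np.mean(np.abs(X)) / lam)
# Tchebychev: P(|X - mu| >= lam) <= var / lam^2
print(np.mean(np.abs(X - mu) >= lam), var / lam**2)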
The next result deals with exponentially decreasing bounds on tail distributions.
Theorem 2.11.11 (Chernoff bounds) Let X be a random variable. Then for any
λ > 0 we have
1. P(X ≥ λ) ≤ E[e^{tX}]/e^{λt},   ∀t > 0;
2. P(X ≤ λ) ≤ E[e^{tX}]/e^{λt},   ∀t < 0.
Proof: 1. Let t > 0 and denote Y = e^{tX}. By Markov's inequality

P(Y ≥ e^{λt}) ≤ E[Y]/e^{λt}.

Then we have

P(X ≥ λ) = P(tX ≥ λt) = P(e^{tX} ≥ e^{λt}) = P(Y ≥ e^{λt}) ≤ E[Y]/e^{λt} = E[e^{tX}]/e^{λt}.

2. The case t < 0 is similar.
In the following we shall present an application of the Chernoff bounds to normally
distributed random variables.
Let X be a random variable normally distributed with mean µ and variance σ². It
is known that its moment generating function is given by

m(t) = E[e^{tX}] = e^{µt + t²σ²/2}.

Using the first Chernoff bound we obtain

P(X ≥ λ) ≤ m(t)/e^{λt} = e^{(µ−λ)t + t²σ²/2},   ∀t > 0,

which implies

P(X ≥ λ) ≤ e^{min_{t>0} [(µ−λ)t + t²σ²/2]}.

It is easy to see that the quadratic function f(t) = (µ − λ)t + t²σ²/2 attains its minimum
at t = (λ − µ)/σ². Since t > 0, λ needs to satisfy λ > µ. Then

min_{t>0} f(t) = f((λ − µ)/σ²) = −(λ − µ)²/(2σ²).

Substituting into the previous formula, we obtain the following result:
Proposition 2.11.12 If X is a normally distributed variable, with X ∼ N(µ, σ²),
then for any λ > µ

P(X ≥ λ) ≤ e^{−(λ−µ)²/(2σ²)}.
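The quality of this bound can be inspected numerically by comparing it with the exact normal tail probability (added sketch; the parameters are arbitrary, here a standard normal).

from math import erf, exp, sqrt

mu, sigma = 0.0, 1.0
for lam in (1.0, 2.0, 3.0):
    exact = 0.5 * (1.0 - erf((lam - mu) / (sigma * sqrt(2.0))))   # P(X >= lam)
    bound = exp(-(lam - mu)**2 / (2.0 * sigma**2))                # bound of Proposition 2.11.12
    print(lam, round(exact, 5), round(bound, 5))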
Exercise 2.11.13 Let X be a Poisson random variable with mean λ > 0.
(a) Show that the moment generating function of X is m(t) = e^{λ(e^t − 1)};
(b) Use a Chernoff bound to show that

P(X ≥ k) ≤ e^{λ(e^t − 1) − tk},   t > 0.
Markov’s, Tchebychev’s and Chernoff’s inequalities will be useful later when computing limits of random variables.
Proposition 2.11.14 Let X be a random variable and f and g be two functions, both
increasing or decreasing. Then
E[f(X)g(X)] ≥ E[f(X)]E[g(X)].    (2.11.8)

Proof: For any two independent random variables X and Y, we have

(f(X) − f(Y))(g(X) − g(Y)) ≥ 0.

Applying the expectation yields

E[f(X)g(X)] + E[f(Y)g(Y)] ≥ E[f(X)]E[g(Y)] + E[f(Y)]E[g(X)].
Considering Y as an independent copy of X we obtain
2E[f (X)g(X)] ≥ 2E[f (X)]E[g(X)].
Exercise 2.11.15 Show the following inequalities:
(a) E[X 2 ] ≥ E[X]2 ;
(b) E[X sinh(X)] ≥ E[X]E[sinh(X)];
(c) E[X 6 ] ≥ E[X]E[X 5 ];
(d) E[X 6 ] ≥ E[X 3 ]2 .
Exercise 2.11.16 For any n, k ≥ 1, show that
E[X 2(n+k+1) ] ≥ E[X 2k+1 ]E[X 2n+1 ].
2.12 Limits of Sequences of Random Variables
Consider a sequence (Xn)_{n≥1} of random variables defined on the probability space
(Ω, F, P). There are several ways of making sense of the limit expression X = lim_{n→∞} Xn,
and they will be discussed in the following sections.
Almost Sure Limit
The sequence Xn converges almost surely to X if, for all states of the world ω, except
a set of probability zero, we have

lim_{n→∞} Xn(ω) = X(ω).

More precisely, this means

P(ω; lim_{n→∞} Xn(ω) = X(ω)) = 1,

and we shall write as-lim_{n→∞} Xn = X. An important example where this type of limit
occurs is the Strong Law of Large Numbers:
If Xn is a sequence of independent and identically distributed random variables with
the same mean µ, then as-lim_{n→∞} (X1 + · · · + Xn)/n = µ.
It is worth noting that this type of convergence is also known under the name of
strong convergence. This is the reason that the aforementioned theorem bears its name.
Mean Square Limit
Another possibility of convergence is to look at the mean square deviation of Xn from
X. We say that Xn converges to X in the mean square if

lim_{n→∞} E[(Xn − X)²] = 0.

More precisely, this should be interpreted as

lim_{n→∞} ∫_Ω (Xn(ω) − X(ω))² dP(ω) = 0.

This limit will be abbreviated by ms-lim_{n→∞} Xn = X. The mean square convergence is
useful when defining the Ito integral.
Example 2.12.1 Consider a sequence Xn of random variables such that there is a
constant k with E[Xn] → k and Var(Xn) → 0 as n → ∞. Show that ms-lim_{n→∞} Xn = k.
Proof: Since we have

E[|Xn − k|²] = E[Xn² − 2kXn + k²] = E[Xn²] − 2kE[Xn] + k²
             = E[Xn²] − E[Xn]² + E[Xn]² − 2kE[Xn] + k²
             = Var(Xn) + (E[Xn] − k)²,

the right side tends to 0 when taking the limit n → ∞.
Exercise 2.12.1 Show the following relation:

E[(X − Y)²] = Var[X] + Var[Y] + (E[X] − E[Y])² − 2Cov(X, Y).
Exercise 2.12.2 If Xn tends to X in mean square, with E[X 2 ] < ∞, show that:
(a) E[Xn ] → E[X] as n → ∞;
(b) E[Xn2 ] → E[X 2 ] as n → ∞;
(c) V ar[Xn ] → V ar[X] as n → ∞;
(d) Cov(Xn , X) → V ar[X] as n → ∞.
Exercise 2.12.3 If Xn tends to X in mean square, show that E[Xn |H] tends to
E[X|H] in mean square.
Limit in Probability
The random variable X is the limit in probability of Xn if for n large enough the
probability of deviation from X can be made smaller than any arbitrary ε. More
precisely, for any ε > 0,

lim_{n→∞} P(ω; |Xn(ω) − X(ω)| ≤ ε) = 1.

This can also be written as

lim_{n→∞} P(ω; |Xn(ω) − X(ω)| > ε) = 0.

This limit is denoted by p-lim_{n→∞} Xn = X.
It is worth noting that both almost sure convergence and convergence in mean
square imply the convergence in probability.
Proposition 2.12.4 The convergence in the mean square implies the convergence in
probability.
Proof: Let ms-lim_{n→∞} Yn = Y. Let ε > 0 be arbitrarily fixed. Applying Markov's
inequality with X = Yn − Y, p = 2 and λ = ε, yields

0 ≤ P(|Yn − Y| ≥ ε) ≤ (1/ε²) E[|Yn − Y|²].

The right side tends to 0 as n → ∞. Applying the Squeeze Theorem we obtain

lim_{n→∞} P(|Yn − Y| ≥ ε) = 0,

which means that Yn converges stochastically to Y.
Example 2.12.5 Let Xn be a sequence of random variables such that E[|Xn|] → 0 as
n → ∞. Prove that p-lim_{n→∞} Xn = 0.
Proof: Let ε > 0 be arbitrarily fixed. We need to show

lim_{n→∞} P(ω; |Xn(ω)| ≥ ε) = 0.    (2.12.9)

From Markov's inequality (see Theorem 2.11.9) we have

0 ≤ P(ω; |Xn(ω)| ≥ ε) ≤ E[|Xn|]/ε.

Using the Squeeze Theorem we obtain (2.12.9).
Remark 2.12.6 The conclusion still holds true even in the case when there is a p > 0
such that E[|Xn|^p] → 0 as n → ∞.
Limit in Distribution
We say the sequence Xn converges in distribution to X if for any continuous bounded
function ϕ(x) we have

lim_{n→∞} E[ϕ(Xn)] = E[ϕ(X)].

We make the remark that this type of limit is even weaker than the stochastic convergence, i.e. it is implied by it.
An application of the limit in distribution is obtained if we consider ϕ(x) = e^{itx}. In
this case the expectation becomes the Fourier transform of the probability density,

E[ϕ(X)] = ∫ e^{itx} p(x) dx = p̂(t),

and it is called the characteristic function of the random variable X. It follows that if
Xn converges in distribution to X, then the characteristic function of Xn converges to
the characteristic function of X. From the properties of the Fourier transform, the
probability density of Xn approaches the probability density of X.
It can be shown that the convergence in distribution is equivalent to

lim_{n→∞} Fn(x) = F(x),

whenever F is continuous at x, where Fn and F denote the distribution functions of
Xn and X, respectively. This is the reason that this convergence bears its name.
2.13 Properties of Mean-Square Limit
Lemma 2.13.1 If ms-lim_{n→∞} Xn = 0 and ms-lim_{n→∞} Yn = 0, then

ms-lim_{n→∞} (Xn + Yn) = 0.

Proof: It follows from the inequality

(x + y)² ≤ 2x² + 2y².

The details are left to the reader.
Proposition 2.13.2 If the sequences of random variables Xn and Yn converge in the
mean square, then
1. ms-lim_{n→∞} (Xn + Yn) = ms-lim_{n→∞} Xn + ms-lim_{n→∞} Yn;
2. ms-lim_{n→∞} (cXn) = c · ms-lim_{n→∞} Xn,   ∀c ∈ R.
Proof: 1. Let ms-lim_{n→∞} Xn = X and ms-lim_{n→∞} Yn = Y. Consider the sequences Xn′ =
Xn − X and Yn′ = Yn − Y. Then ms-lim_{n→∞} Xn′ = 0 and ms-lim_{n→∞} Yn′ = 0. Applying Lemma
2.13.1 yields

ms-lim_{n→∞} (Xn′ + Yn′) = 0.

This is equivalent to

ms-lim_{n→∞} (Xn − X + Yn − Y) = 0,

which becomes

ms-lim_{n→∞} (Xn + Yn) = X + Y.
Remark 2.13.3 It is worth noting that, in general,

ms-lim_{n→∞} (Xn Yn) ≠ ms-lim_{n→∞} (Xn) · ms-lim_{n→∞} (Yn).

Counterexamples can be found, see Exercise 2.13.5.
Exercise 2.13.4 Use a computer algebra system to show the following:
(a) ∫_0^∞ x²/(x⁵ + 1) dx = (√2 π/5) √(1 − 5^{−1/2});
(b) ∫_0^∞ x⁴/(x⁵ + 1) dx = ∞;
(c) ∫_0^∞ 1/(x⁵ + 1) dx = (1/5) Γ(1/5) Γ(4/5).
Exercise 2.13.5 Let X be a random variable with the probability density function

p(x) = 5/(Γ(1/5)Γ(4/5)) · 1/(x⁵ + 1),   x ≥ 0.

(a) Show that E[X²] < ∞ and E[X⁴] = ∞;
(b) Construct the sequences of random variables Xn = Yn = (1/n)X. Show that
ms-lim_{n→∞} Xn = 0, ms-lim_{n→∞} Yn = 0, but ms-lim_{n→∞} (Xn Yn) = ∞.
2.14 Stochastic Processes
A stochastic process on the probability space (Ω, F, P) is a family of random variables
Xt parameterized by t ∈ T, where T ⊂ R. If T is an interval we say that Xt is a
stochastic process in continuous time. If T = {1, 2, 3, . . . } we shall say that Xt is a
stochastic process in discrete time. The latter case describes a sequence of random variables. The aforementioned types of convergence can be easily extended to continuous
time. For instance, Xt converges almost surely to X as t → ∞ if

P(ω; lim_{t→∞} Xt(ω) = X(ω)) = 1.

The evolution in time of a given state of the world ω ∈ Ω, given by the function
t ↦ Xt(ω), is called a path or realization of Xt. The study of stochastic processes
using computer simulations is based on retrieving information about the process Xt
given a large number of its realizations.
Next we shall structure the information field F with an order relation parametrized
by the time t. Consider that all the information accumulated until time t is contained
by the σ-field Ft . This means that Ft contains the information of which events have
already occurred until time t, and which did not. Since the information is growing in
time, we have
Fs ⊂ Ft ⊂ F
for any s, t ∈ T with s ≤ t. The family Ft is called a filtration.
A stochastic process Xt is called adapted to the filtration Ft if Xt is Ft - measurable,
for any t ∈ T. This means that the information at time t determines the value of the
random variable Xt .
Example 2.14.1 Here are a few examples of filtrations:
1. Ft represents the information about the evolution of a stock until time t, with
t > 0.
2. Ft represents the information about the evolution of a Black-Jack game until
time t, with t > 0.
3. Ft represents the medical information of a patient until time t.
Example 2.14.2 If X is a random variable, consider the conditional expectation
Xt = E[X|Ft ].
From the definition of conditional expectation, the random variable Xt is Ft -measurable,
and can be regarded as the measurement of X at time t using the information Ft . If
the accumulated knowledge Ft increases and eventually equals the σ-field F, then
X = E[X|F], i.e. we obtain the entire random variable. The process Xt is adapted to
Ft .
Example 2.14.3 Don Joe goes to a doctor to get an estimation of how long he still has
to live. The age at which he will pass away is a random variable, denoted by X. Given
his medical condition today, which is contained in Ft , the doctor can infer an average
age, which is the average of all random instances that agree with the information to
date; this is given by the conditional expectation Xt = E[X|Ft ]. The stochastic process
Xt is adapted to the medical knowledge Ft .
We shall define next an important type of stochastic process: the martingale. (The concept of martingale was introduced by Lévy in 1934.)
Definition 2.14.4 A process Xt , t ∈ T, is called a martingale with respect to the
filtration Ft if
1. Xt is integrable for each t ∈ T;
2. Xt is adapted to the filtration Ft ;
3. Xs = E[Xt |Fs ], ∀s < t.
Remark 2.14.5 The first condition states that the unconditional forecast is finite: E[|Xt|] = ∫_Ω |Xt| dP < ∞. Condition 2 says that the value Xt is known, given the information set Ft; this can also be stated by saying that Xt is Ft-measurable. The third relation asserts that the best forecast of unobserved future values is the last observation on Xt.
Example 2.14.1 Let Xt denote Mr. Li Zhu’s salary after t years of work at the same
company. Since Xt is known at time t and it is bounded above, as all salaries are,
then the first two conditions hold. Being honest, Mr. Zhu expects today that his future
salary will be the same as today’s, i.e. Xs = E[Xt |Fs ], for s < t. This means that Xt
is a martingale.
Exercise 2.14.6 Let X be an integrable random variable on (Ω, F, P) and let Ft be a filtration. Prove that Xt = E[X|Ft] is a martingale.
Exercise 2.14.7 Let Xt and Yt be martingales with respect to the filtration Ft. Show that for any a, b, c ∈ R the process Zt = aXt + bYt + c is an Ft-martingale.
Exercise 2.14.8 Let Xt and Yt be martingales with respect to the filtration Ft .
(a) Is the process Xt Yt always a martingale with respect to Ft ?
(b) What about the processes Xt2 and Yt2 ?
Exercise 2.14.9 Two processes Xt and Yt are called conditionally uncorrelated, given
Ft , if
E[(Xt − Xs )(Yt − Ys )|Fs ] = 0,
∀ 0 ≤ s < t < ∞.
Let Xt and Yt be martingale processes. Show that the process Zt = Xt Yt is a martingale
if and only if Xt and Yt are conditionally uncorrelated. Assume that Xt , Yt and Zt are
integrable.
In the following, if Xt is a stochastic process, the minimum amount of information
resulted from knowing the process Xs until time t is denoted by Ft = σ(Xs ; s ≤ t).
This is the σ-algebra generated by the events {ω; Xs (ω) ∈ (a, b)}, for any real numbers
a < b and s ≤ t.
In the case of a discrete process, the minimum amount of information resulted from
knowing the process Xk until time n is Fn = σ(Xk ; k ≤ n), the σ-algebra generated by
the events {ω; Xk (ω) ∈ (a, b)}, for any real numbers a < b and k ≤ n.
Exercise 2.14.10 Let Xn , n ≥ 0 be a sequence of integrable independent random
variables, with E[Xn ] < ∞, for all n ≥ 0. Let S0 = X0 , Sn = X0 + · · · + Xn . Show the
following:
(a) Sn − E[Sn ] is an Fn -martingale.
(b) If E[Xn ] = 0 and E[Xn2 ] < ∞, ∀n ≥ 0, then Sn2 − V ar(Sn ) is an Fn -martingale.
Exercise 2.14.11 Let Xn , n ≥ 0 be a sequence of independent, integrable random
variables such that E[Xn ] = 1 for n ≥ 0. Prove that Pn = X0 · X1 · · · · Xn is an
Fn -martingale.
Exercise 2.14.12 (a) Let X be a normally distributed random variable with mean µ ≠ 0 and variance σ². Prove that there is a unique θ ≠ 0 such that E[e^{θX}] = 1.

(b) Let (Xi)_{i≥0} be a sequence of independent, identically normally distributed random variables with mean µ ≠ 0. Consider the sum Sn = Σ_{j=0}^n Xj. Show that Zn = e^{θSn} is a martingale, with θ defined in part (a).
In section 10.1 we shall encounter several processes which are martingales.
Chapter 3
Useful Stochastic Processes
This chapter deals with the most commonly used stochastic processes and their basic
properties. The two main basic processes are the Brownian motion and the Poisson
process. The other processes described in this chapter are derived from the previous
two.
3.1 The Brownian Motion
The observation made first by the botanist Robert Brown in 1827, that small pollen
grains suspended in water have a very irregular and unpredictable state of motion, led
to the definition of the Brownian motion, which is formalized in the following:
Definition 3.1.1 A Brownian motion process is a stochastic process Bt , t ≥ 0, which
satisfies
1. The process starts at the origin, B0 = 0;
2. Bt has independent increments;
3. The process Bt is continuous in t;
4. The increments Bt − Bs are normally distributed with mean zero and variance
|t − s|,
Bt − Bs ∼ N (0, |t − s|).
The process Xt = x + Bt has all the properties of a Brownian motion that starts
at x. Condition 4 states that the increments of a Brownian motion are stationary, i.e.
the distribution of Bt − Bs depends only on the time interval t − s
P (Bt+s − Bs ≤ a) = P (Bt − B0 ≤ a) = P (Bt ≤ a).
It is worth noting that even if Bt is continuous, it is nowhere differentiable. From
condition 4 we get that Bt is normally distributed with mean E[Bt ] = 0 and V ar[Bt ] = t
Bt ∼ N (0, t).
This implies also that the second moment is E[Bt2 ] = t. Let 0 < s < t. Since the
increments are independent, we can write
E[Bs Bt ] = E[(Bs − B0 )(Bt − Bs ) + Bs2 ] = E[Bs − B0 ]E[Bt − Bs ] + E[Bs2 ] = s.
Consequently, Bs and Bt are not independent.
Condition 4 also has a physical explanation. A pollen grain suspended in water is kicked by a very large number of water molecules. The influence of each molecule on the grain is independent of the other molecules. These effects average out into a resultant increment of the grain coordinate. According to the Central Limit Theorem, this increment has to be normally distributed.
Proposition 3.1.2 A Brownian motion process Bt is a martingale with respect to the
information set Ft = σ(Bs ; s ≤ t).
Proof: The integrability of Bt follows from Jensen’s inequality
E[|Bt |]2 ≤ E[Bt2 ] = V ar(Bt ) = |t| < ∞.
Bt is obviously Ft -measurable. Let s < t and write Bt = Bs + (Bt − Bs ). Then
E[Bt |Fs ] = E[Bs + (Bt − Bs )|Fs ]
= E[Bs |Fs ] + E[Bt − Bs |Fs ]
= Bs + E[Bt − Bs ] = Bs + E[Bt−s − B0 ] = Bs ,
where we used that Bs is Fs-measurable (hence E[Bs|Fs] = Bs) and that the
increment Bt − Bs is independent of previous values of Bt contained in the information
set Ft = σ(Bs ; s ≤ t).
A process with similar properties as the Brownian motion was introduced by Wiener.
Definition 3.1.3 A Wiener process Wt is a process adapted to a filtration Ft such that
1. The process starts at the origin, W0 = 0;
2. Wt is an Ft -martingale with E[Wt2 ] < ∞ for all t ≥ 0 and
E[(Wt − Ws )2 ] = t − s,
s ≤ t;
3. The process Wt is continuous in t.
Since Wt is a martingale, its increments satisfy
E[Wt − Ws ] = E[Wt − Ws |Fs ] = E[Wt |Fs ] − Ws = Ws − Ws = 0,
and hence E[Wt ] = 0. It is easy to show that
V ar[Wt − Ws ] = |t − s|,
V ar[Wt ] = t.
Exercise 3.1.4 Show that a Brownian motion process Bt is a Wiener process.
The only property Bt has and Wt seems not to have is that the increments are normally distributed. However, there is no distinction between these two processes, as the
following result states.
Theorem 3.1.5 (Lévy) A Wiener process is a Brownian motion process.
In stochastic calculus we often need to use infinitesimal notation and its properties.
If dWt denotes the infinitesimal increment of a Wiener process in the time interval dt,
the aforementioned properties become dWt ∼ N (0, dt), E[dWt ] = 0, and E[(dWt )2 ] =
dt.
Proposition 3.1.6 If Wt is a Wiener process with respect to the information set Ft ,
then Yt = Wt2 − t is a martingale.
Proof: Yt is integrable since
E[|Yt |] ≤ E[Wt2 + t] = 2t < ∞,
t > 0.
Let s < t. Using that the increments Wt − Ws and (Wt − Ws )2 are independent of the
information set Fs and applying Proposition 2.10.6 yields
E[Wt2 |Fs ] = E[(Ws + Wt − Ws )2 |Fs ]
= E[Ws2 + 2Ws (Wt − Ws ) + (Wt − Ws )2 |Fs ]
= E[Ws2 |Fs ] + E[2Ws (Wt − Ws )|Fs ] + E[(Wt − Ws )2 |Fs ]
= Ws2 + 2Ws E[Wt − Ws |Fs ] + E[(Wt − Ws )2 |Fs ]
= Ws2 + 2Ws E[Wt − Ws ] + E[(Wt − Ws )2 ]
= Ws2 + t − s,
and hence E[Wt2 − t|Fs ] = Ws2 − s, for s < t.
Exercise 3.1.7 Show that the process

Xt = Wt³ − 3∫_0^t Ws ds

is a martingale with respect to the information set Ft = σ{Ws; s ≤ t}.
The following result states the memoryless property of the Brownian motion Wt. (Processes with this property are called Markov processes.)
Proposition 3.1.8 The conditional distribution of Wt+s , given the present Wt and the
past Wu , 0 ≤ u < t, depends only on the present.
Proof: Using the independent increment assumption, we have
P (Wt+s ≤ c|Wt = x, Wu , 0 ≤ u < t)
= P (Wt+s − Wt ≤ c − x|Wt = x, Wu , 0 ≤ u < t)
= P (Wt+s − Wt ≤ c − x)
= P (Wt+s ≤ c|Wt = x).
Since Wt is normally distributed with mean 0 and variance t, its density function is

φt(x) = (1/√(2πt)) e^{−x²/(2t)}.

Then its distribution function is

Ft(x) = P(Wt ≤ x) = (1/√(2πt)) ∫_{−∞}^x e^{−u²/(2t)} du.

The probability that Wt is between the values a and b is given by

P(a ≤ Wt ≤ b) = (1/√(2πt)) ∫_a^b e^{−u²/(2t)} du, a < b.
Even if the increments of a Brownian motion are independent, their values are still
correlated.
Proposition 3.1.9 Let 0 ≤ s ≤ t. Then

1. Cov(Ws, Wt) = s;

2. Corr(Ws, Wt) = √(s/t).
Proof: 1. Using the properties of covariance
Cov(Ws , Wt ) = Cov(Ws , Ws + Wt − Ws )
= Cov(Ws , Ws ) + Cov(Ws , Wt − Ws )
= V ar(Ws ) + E[Ws (Wt − Ws )] − E[Ws ]E[Wt − Ws ]
= s + E[Ws ]E[Wt − Ws ]
= s,
since E[Ws ] = 0.
We can also arrive at the same result starting from the formula
Cov(Ws , Wt ) = E[Ws Wt ] − E[Ws ]E[Wt ] = E[Ws Wt ].
Using that conditional expectations have the same expectation, factoring the predictable part out, and using that Wt is a martingale, we have
E[Ws Wt ] = E[E[Ws Wt |Fs ]] = E[Ws E[Wt |Fs ]]
= E[Ws Ws ] = E[Ws2 ] = s,
so Cov(Ws , Wt ) = s.
2. The correlation formula yields

Corr(Ws, Wt) = Cov(Ws, Wt)/(σ(Ws)σ(Wt)) = s/(√s·√t) = √(s/t).
Remark 3.1.10 Removing the order relation between s and t, the previous relations
can also be stated as
Cov(Ws, Wt) = min{s, t};

Corr(Ws, Wt) = √(min{s, t}/max{s, t}).
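These formulas are easy to verify by simulation; the following sketch (assuming numpy; the values of s, t and the sample size are arbitrary) estimates the covariance and correlation from sampled pairs (Ws, Wt).

    import numpy as np

    # Monte Carlo check of Cov(W_s, W_t) = min(s, t) and Corr = sqrt(s/t), s < t.
    rng = np.random.default_rng(1)
    s, t, n_paths = 0.7, 2.0, 200_000

    Ws = rng.normal(0.0, np.sqrt(s), n_paths)             # W_s ~ N(0, s)
    Wt = Ws + rng.normal(0.0, np.sqrt(t - s), n_paths)    # independent increment

    print(np.cov(Ws, Wt)[0, 1], min(s, t))                # both close to 0.7
    print(np.corrcoef(Ws, Wt)[0, 1], np.sqrt(s / t))      # both close to 0.59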
The following exercises state the translation and the scaling invariance of the Brownian motion.
Exercise 3.1.11 For any t0 ≥ 0, show that the process Xt = Wt+t0 −Wt0 is a Brownian
motion. This can be also stated as saying that the Brownian motion is translation
invariant.
Exercise 3.1.12 For any λ > 0, show that the process Xt = (1/√λ)·W_{λt} is a Brownian motion. This says that the Brownian motion is invariant by scaling.
Exercise 3.1.13 Let 0 < s < t < u. Show the following multiplicative property
Corr(Ws , Wt )Corr(Wt , Wu ) = Corr(Ws , Wu ).
Exercise 3.1.14 Find the expectations E[Wt3 ] and E[Wt4 ].
Exercise 3.1.15 (a) Use the martingale property of Wt2 −t to find E[(Wt2 −t)(Ws2 −s)];
(b) Evaluate E[Wt2 Ws2 ];
(c) Compute Cov(Wt2 , Ws2 );
(d) Find Corr(Wt2 , Ws2 ).
Figure 3.1: (a) Three simulations of the Brownian motion process Wt; (b) two simulations of the geometric Brownian motion process e^{Wt}.
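Simulations such as those in Figure 3.1 can be produced along the following lines; this is only a sketch (assuming numpy and matplotlib are available), with an arbitrary horizon, step size and number of paths.

    import numpy as np
    import matplotlib.pyplot as plt

    # Simulate W_t on [0, T] by summing independent N(0, dt) increments,
    # then exponentiate the paths to obtain the geometric Brownian motion e^{W_t}.
    rng = np.random.default_rng(2)
    T, n_steps, n_paths = 3.5, 350, 3
    dt = T / n_steps
    time = np.linspace(0.0, T, n_steps + 1)

    dW = rng.normal(0.0, np.sqrt(dt), (n_paths, n_steps))
    W = np.hstack([np.zeros((n_paths, 1)), np.cumsum(dW, axis=1)])

    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
    ax1.plot(time, W.T)
    ax1.set_title("Brownian motion W_t")
    ax2.plot(time, np.exp(W[:2].T))
    ax2.set_title("Geometric Brownian motion e^{W_t}")
    plt.show()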
Exercise 3.1.16 Consider the process Yt = t·W_{1/t}, t > 0, and define Y0 = 0.

(a) Find the distribution of Yt;
(b) Find the probability density of Yt;
(c) Find Cov(Ys, Yt);
(d) Find E[Yt − Ys] and Var(Yt − Ys) for s < t.

It is worth noting that the process Yt = t·W_{1/t}, t > 0, with Y0 = 0 is a Brownian motion.
Exercise 3.1.17 The process Xt = |Wt| is called Brownian motion reflected at the origin. Show that

(a) E[|Wt|] = √(2t/π);
(b) Var(|Wt|) = (1 − 2/π)t.
Exercise 3.1.18 Let 0 < s < t. Find E[Wt2 |Fs ].
Exercise 3.1.19 Let 0 < s < t. Show that
(a) E[Wt3 |Fs ] = 3(t − s)Ws + Ws3 ;
(b) E[Wt4 |Fs ] = 3(t − s)2 + 6(t − s)Ws2 + Ws4 .
Exercise 3.1.20 Show that the following processes are Brownian motions
(a) Xt = WT − WT −t , 0 ≤ t ≤ T ;
(b) Yt = −Wt , t ≥ 0.
Exercise 3.1.21 Let Wt and W̃t be two independent Brownian motions and ρ be a constant with |ρ| ≤ 1.

(a) Show that the process Xt = ρWt + √(1 − ρ²)·W̃t is continuous and has the distribution N(0, t);
(b) Is Xt a Brownian motion?
Exercise 3.1.22 Let Y be a random variable distributed as N(0, 1). Consider the process Xt = √t·Y. Is Xt a Brownian motion?
3.2 Geometric Brownian Motion

The geometric Brownian motion with drift µ and volatility σ is the process

Xt = e^{σWt + (µ − σ²/2)t}, t ≥ 0.

In the standard case, when µ = 0 and σ = 1, the process becomes Xt = e^{Wt}, t ≥ 0. A few simulations of this process are contained in Fig. 3.1 b. The following result will be useful in what follows.

Lemma 3.2.1 E[e^{αWt}] = e^{α²t/2}, for α ≥ 0.
Proof: Using the definition of the expectation,

E[e^{αWt}] = ∫ e^{αx} φt(x) dx = (1/√(2πt)) ∫ e^{−x²/(2t) + αx} dx = e^{α²t/2},

where we have used the integral formula

∫ e^{−ax² + bx} dx = √(π/a)·e^{b²/(4a)}, a > 0,

with a = 1/(2t) and b = α.
Proposition 3.2.2 The geometric Brownian motion Xt = eWt is log-normally distributed with mean et/2 and variance e2t − et .
Proof: Since Wt is normally distributed, then Xt = eWt will have a log-normal distribution. Using Lemma 3.2.1 we have
E[Xt ] = E[eWt ] = et/2
E[Xt2 ] = E[e2Wt ] = e2t ,
and hence the variance is
V ar[Xt ] = E[Xt2 ] − E[Xt ]2 = e2t − (et/2 )2 = e2t − et .
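A quick Monte Carlo check of these moments (a sketch assuming numpy; t and the sample size are arbitrary):

    import numpy as np

    # Check E[e^{W_t}] = e^{t/2} and Var(e^{W_t}) = e^{2t} - e^{t} by simulation.
    rng = np.random.default_rng(3)
    t, n = 1.0, 1_000_000
    X = np.exp(rng.normal(0.0, np.sqrt(t), n))   # samples of e^{W_t}

    print(X.mean(), np.exp(t / 2))               # ~ 1.6487
    print(X.var(), np.exp(2 * t) - np.exp(t))    # ~ 4.6708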
The distribution function of Xt = e^{Wt} can be obtained by reducing it to the distribution function of a Brownian motion:

F_{Xt}(x) = P(Xt ≤ x) = P(e^{Wt} ≤ x) = P(Wt ≤ ln x) = F_{Wt}(ln x) = (1/√(2πt)) ∫_{−∞}^{ln x} e^{−u²/(2t)} du.
The density function of the geometric Brownian motion Xt = e^{Wt} is given by

p(x) = (d/dx) F_{Xt}(x) = (1/(x√(2πt))) e^{−(ln x)²/(2t)} if x > 0, and p(x) = 0 elsewhere.
Exercise 3.2.3 Show that

E[e^{Wt − Ws}] = e^{(t−s)/2}, s < t.

Exercise 3.2.4 Let Xt = e^{Wt}.

(a) Show that Xt is not a martingale.
(b) Show that e^{−t/2} Xt is a martingale.
(c) Show that for any constant c ∈ R, the process Yt = e^{cWt − c²t/2} is a martingale.
Exercise 3.2.5 If Xt = eWt , find Cov(Xs , Xt )
(a) by direct computation;
(b) by using Exercise 3.2.4 (b).
Exercise 3.2.6 Show that

E[e^{2Wt²}] = (1 − 4t)^{−1/2} for 0 ≤ t < 1/4, and E[e^{2Wt²}] = ∞ otherwise.

3.3 Integrated Brownian Motion
The stochastic process

Zt = ∫_0^t Ws ds, t ≥ 0

is called integrated Brownian motion. Obviously, Z0 = 0.

Let 0 = s0 < s1 < ··· < sk < ··· < sn = t, with sk = kt/n. Then Zt can be written as a limit of Riemann sums

Zt = lim_{n→∞} Σ_{k=1}^n W_{sk} ∆s = t·lim_{n→∞} (W_{s1} + ··· + W_{sn})/n,

where ∆s = s_{k+1} − s_k = t/n. Since the W_{sk} are not independent, we first need to transform the previous expression into a sum of independent normally distributed random variables. A straightforward computation shows that

W_{s1} + ··· + W_{sn} = n(W_{s1} − W_0) + (n − 1)(W_{s2} − W_{s1}) + ··· + (W_{sn} − W_{s_{n−1}}) = X1 + X2 + ··· + Xn.   (3.3.1)
Since the increments of a Brownian motion are independent and normally distributed, we have

X1 ∼ N(0, n²∆s), X2 ∼ N(0, (n − 1)²∆s), X3 ∼ N(0, (n − 2)²∆s), ..., Xn ∼ N(0, ∆s).
Recall now the following well known theorem on the addition formula for Gaussian
random variable.
Theorem 3.3.1 If Xj are independent random variables normally distributed with
mean µj and variance σj2 , then the sum X1 + · · · + Xn is also normally distributed
with mean µ1 + · · · + µn and variance σ12 + · · · + σn2 .
Then

X1 + ··· + Xn ∼ N(0, (1 + 2² + 3² + ··· + n²)∆s) = N(0, n(n + 1)(2n + 1)∆s/6),

with ∆s = t/n. Using (3.3.1) yields

t·(W_{s1} + ··· + W_{sn})/n ∼ N(0, (n + 1)(2n + 1)t³/(6n²)).

“Taking the limit” we get

Zt ∼ N(0, t³/3).
Proposition 3.3.2 The integrated Brownian motion Zt has a normal distribution with
mean 0 and variance t3 /3.
Remark 3.3.3 The aforementioned limit was taken heuristically, without specifying
the type of the convergence. In order to make this to work, the following result is
usually used:
If Xn is a sequence of normal random variables that converges in mean square to X,
then the limit X is normally distributed, with E[Xn] → E[X] and Var(Xn) → Var(X),
as n → ∞.
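The limiting distribution N(0, t³/3) can be observed numerically by approximating Zt with Riemann sums over simulated paths; the following sketch (assuming numpy; the parameters are arbitrary) compares the sample variance with t³/3.

    import numpy as np

    # Approximate Z_t = \int_0^t W_s ds by Riemann sums over many simulated paths
    # and compare the sample variance with t^3/3.
    rng = np.random.default_rng(4)
    t, n_steps, n_paths = 2.0, 500, 20_000
    dt = t / n_steps

    dW = rng.normal(0.0, np.sqrt(dt), (n_paths, n_steps))
    W = np.cumsum(dW, axis=1)             # W at the grid points dt, 2dt, ..., t
    Z = W.sum(axis=1) * dt                # Riemann sum approximation of Z_t

    print(Z.mean())                       # close to 0
    print(Z.var(), t**3 / 3)              # both close to 8/3 ~ 2.667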
The mean and the variance can also be computed in a direct way as follows. By Fubini's theorem we have

E[Zt] = E[∫_0^t Ws ds] = ∫_Ω (∫_0^t Ws ds) dP = ∫_0^t (∫_Ω Ws dP) ds = ∫_0^t E[Ws] ds = 0,
since E[Ws] = 0. Then the variance is given by

Var[Zt] = E[Zt²] − E[Zt]² = E[Zt²]
= E[∫_0^t Wu du · ∫_0^t Wv dv] = E[∫_0^t ∫_0^t Wu Wv du dv]
= ∫_0^t ∫_0^t E[Wu Wv] du dv = ∫∫_{[0,t]×[0,t]} min{u, v} du dv
= ∫∫_{D1} min{u, v} du dv + ∫∫_{D2} min{u, v} du dv,   (3.3.2)

where

D1 = {(u, v); u > v, 0 ≤ u ≤ t},  D2 = {(u, v); u < v, 0 ≤ u ≤ t}.
The first integral can be evaluated using Fubini's theorem:

∫∫_{D1} min{u, v} du dv = ∫∫_{D1} v du dv = ∫_0^t (∫_0^u v dv) du = ∫_0^t u²/2 du = t³/6.

Similarly, the latter integral is equal to

∫∫_{D2} min{u, v} du dv = t³/6.

Substituting in (3.3.2) yields

Var[Zt] = t³/6 + t³/6 = t³/3.
Exercise 3.3.4 (a) Prove that the moment generating function of Zt is m(u) = e^{u²t³/6}.

(b) Use the first part to find the mean and variance of Zt.
Exercise 3.3.5 Let s < t. Show that the covariance of the integrated Brownian motion is given by

Cov(Zs, Zt) = s²(t/2 − s/6), s < t.
Exercise 3.3.6 Show that

(a) Cov(Zt, Zt − Z_{t−h}) = (1/2)t²h + o(h), where o(h) denotes a quantity such that lim_{h→0} o(h)/h = 0;

(b) Cov(Zt, Wt) = t²/2.
Exercise 3.3.7 Show that

E[e^{Ws + Wu}] = e^{(u+s)/2}·e^{min{s,u}}.

Exercise 3.3.8 Consider the process Xt = ∫_0^t e^{Ws} ds.

(a) Find the mean of Xt;
(b) Find the variance of Xt.
Exercise 3.3.9 Consider the process Zt = ∫_0^t Wu du, t > 0.

(a) Show that E[ZT|Ft] = Zt + Wt(T − t), for any t < T;
(b) Prove that the process Mt = Zt − tWt is an Ft-martingale.
3.4 Exponential Integrated Brownian Motion

If Zt = ∫_0^t Ws ds denotes the integrated Brownian motion, the process

Vt = e^{Zt}

is called exponential integrated Brownian motion. The process starts at V0 = e^0 = 1. Since Zt is normally distributed, Vt is log-normally distributed. We compute the mean and the variance in a direct way. Using Exercises 3.2.5 and 3.3.4 we have

E[Vt] = E[e^{Zt}] = m(1) = e^{t³/6}
E[Vt²] = E[e^{2Zt}] = m(2) = e^{4t³/6} = e^{2t³/3}
Var(Vt) = E[Vt²] − E[Vt]² = e^{2t³/3} − e^{t³/3}
Cov(Vs, Vt) = e^{(t+3s)/2}.

Exercise 3.4.1 Show that E[VT|Ft] = Vt·e^{(T−t)Wt + (T−t)³/6} for t < T.

3.5 Brownian Bridge
The process Xt = Wt − tW1 is called the Brownian bridge fixed at both 0 and 1. Since
we can also write
Xt = Wt − tWt − tW1 + tWt
= (1 − t)(Wt − W0 ) − t(W1 − Wt ),
using that the increments Wt − W0 and W1 − Wt are independent and normally distributed, with
Wt − W0 ∼ N (0, t),
W1 − Wt ∼ N (0, 1 − t),
it follows that Xt is normally distributed with
E[Xt ] = (1 − t)E[(Wt − W0 )] − tE[(W1 − Wt )] = 0
V ar[Xt ] = (1 − t)2 V ar[(Wt − W0 )] + t2 V ar[(W1 − Wt )]
= (1 − t)2 (t − 0) + t2 (1 − t)
= t(1 − t).
This can also be stated by saying that the Brownian bridge tied at 0 and 1 is a Gaussian process with mean 0 and variance t(1 − t), so Xt ∼ N(0, t(1 − t)).
Exercise 3.5.1 Let Xt = Wt − tW1 , 0 ≤ t ≤ 1 be a Brownian bridge fixed at 0 and 1.
Let Yt = Xt2 . Show that Y0 = Y1 = 0 and find E[Yt ] and V ar(Yt ).
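The mean and variance of the Brownian bridge are easily checked by simulation; a sketch (assuming numpy; the grid and sample size are arbitrary) follows.

    import numpy as np

    # Simulate the bridge X_t = W_t - t*W_1 and check Var(X_t) = t(1 - t).
    rng = np.random.default_rng(5)
    n_steps, n_paths = 500, 10_000
    dt = 1.0 / n_steps
    time = np.linspace(0.0, 1.0, n_steps + 1)

    dW = rng.normal(0.0, np.sqrt(dt), (n_paths, n_steps))
    W = np.hstack([np.zeros((n_paths, 1)), np.cumsum(dW, axis=1)])
    X = W - time * W[:, [-1]]                  # bridge pinned at 0 and 1

    t_idx = n_steps // 4                       # check at t = 0.25
    t = time[t_idx]
    print(X[:, t_idx].mean())                  # close to 0
    print(X[:, t_idx].var(), t * (1 - t))      # both close to 0.1875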
3.6 Brownian Motion with Drift
The process Yt = µt + Wt , t ≥ 0, is called Brownian motion with drift. The process Yt
tends to drift off at a rate µ. It starts at Y0 = 0 and it is a Gaussian process with mean
E[Yt ] = µt + E[Wt ] = µt
and variance
V ar[Yt ] = V ar[µt + Wt ] = V ar[Wt ] = t.
Exercise 3.6.1 Find the distribution and the density functions of the process Yt .
3.7 Bessel Process

This section deals with the process satisfied by the Euclidean distance from the origin to a particle following a Brownian motion in Rⁿ. More precisely, if W1(t), ..., Wn(t) are independent Brownian motions, let W(t) = (W1(t), ..., Wn(t)) be a Brownian motion in Rⁿ, n ≥ 2. The process

Rt = dist(O, W(t)) = √(W1(t)² + ··· + Wn(t)²)

is called the n-dimensional Bessel process.
The probability density of this process is given by the following result.

Proposition 3.7.1 The probability density function of Rt, t > 0, is given by

pt(ρ) = (2/((2t)^{n/2} Γ(n/2))) ρ^{n−1} e^{−ρ²/(2t)} for ρ ≥ 0, and pt(ρ) = 0 for ρ < 0,

with

Γ(n/2) = (n/2 − 1)! for n even, and Γ(n/2) = (n/2 − 1)(n/2 − 2)···(3/2)(1/2)√π for n odd.
Proof: Since the Brownian motions W1(t), ..., Wn(t) are independent, their joint density function is

f_{W1···Wn}(t) = f_{W1}(t) ··· f_{Wn}(t) = (1/(2πt)^{n/2}) e^{−(x1² + ··· + xn²)/(2t)}, t > 0.
In the next computation we shall use the following integration formula, which follows from the use of polar coordinates:

∫_{|x|≤ρ} f(x) dx = σ(S^{n−1}) ∫_0^ρ r^{n−1} g(r) dr,   (3.7.3)

where f(x) = g(|x|) is a function on Rⁿ with spherical symmetry, and where

σ(S^{n−1}) = 2π^{n/2}/Γ(n/2)

is the area of the (n − 1)-dimensional sphere in Rⁿ.
Let ρ ≥ 0. The distribution function of Rt is

F_R(ρ) = P(Rt ≤ ρ) = ∫_{Rt ≤ ρ} f_{W1···Wn}(t) dx1 ··· dxn
= ∫_{x1²+···+xn² ≤ ρ²} (1/(2πt)^{n/2}) e^{−(x1² + ··· + xn²)/(2t)} dx1 ··· dxn
= ∫_0^ρ r^{n−1} (∫_{S(0,1)} (1/(2πt)^{n/2}) e^{−r²/(2t)} dσ) dr
= (σ(S^{n−1})/(2πt)^{n/2}) ∫_0^ρ r^{n−1} e^{−r²/(2t)} dr.

Differentiating yields

pt(ρ) = (d/dρ) F_R(ρ) = (σ(S^{n−1})/(2πt)^{n/2}) ρ^{n−1} e^{−ρ²/(2t)} = (2/((2t)^{n/2} Γ(n/2))) ρ^{n−1} e^{−ρ²/(2t)}, ρ > 0, t > 0.
It is worth noting that in the 2-dimensional case the aforementioned density becomes a particular case of a Weibull distribution with parameters m = 2 and α = 2t, namely the Rayleigh distribution

pt(x) = (x/t) e^{−x²/(2t)}, x > 0, t > 0.
Exercise 3.7.2 Let P(Rt ≤ ρ) be the probability of a 2-dimensional Brownian motion being inside the disk D(0, ρ) at time t > 0. Show that

(ρ²/(2t))(1 − ρ²/(4t)) < P(Rt ≤ ρ) < ρ²/(2t).
Exercise 3.7.3 Let Rt be a 2-dimensional Bessel process. Show that

(a) E[Rt] = √(2πt)/2;
(b) Var(Rt) = 2t(1 − π/4).
Exercise 3.7.4 Let Xt = Rt/t, t > 0, where Rt is a 2-dimensional Bessel process. Show that Xt → 0 as t → ∞ in mean square.
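In the 2-dimensional case the moments from Exercise 3.7.3 can be checked by direct sampling; the following sketch (assuming numpy; t and the sample size are arbitrary) does this.

    import numpy as np

    # Sample R_t = sqrt(W1(t)^2 + W2(t)^2) at a fixed time and compare with
    # E[R_t] = sqrt(2*pi*t)/2 and Var(R_t) = 2t(1 - pi/4).
    rng = np.random.default_rng(6)
    t, n = 1.5, 1_000_000
    W1 = rng.normal(0.0, np.sqrt(t), n)
    W2 = rng.normal(0.0, np.sqrt(t), n)
    R = np.hypot(W1, W2)

    print(R.mean(), np.sqrt(2 * np.pi * t) / 2)   # ~ 1.534
    print(R.var(), 2 * t * (1 - np.pi / 4))       # ~ 0.644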
3.8 The Poisson Process
A Poisson process describes the number of occurrences of a certain event before time t,
such as: the number of electrons arriving at an anode until time t; the number of cars
arriving at a gas station until time t; the number of phone calls received on a certain
day until time t; the number of visitors entering a museum on a certain day until time
t; the number of earthquakes that occurred in Chile during the time interval [0, t]; the
number of shocks in the stock market from the beginning of the year until time t; the
number of twisters that might hit Alabama from the beginning of the century until
time t.
The definition of a Poisson process is stated more precisely in the following.
Definition 3.8.1 A Poisson process is a stochastic process Nt , t ≥ 0, which satisfies
1. The process starts at the origin, N0 = 0;
2. Nt has independent increments;
3. The process Nt is right continuous in t, with left hand limits;
4. The increments Nt − Ns , with 0 < s < t, have a Poisson distribution with
parameter λ(t − s), i.e.
P(Nt − Ns = k) = (λ^k (t − s)^k / k!) e^{−λ(t−s)}.
It can be shown that condition 4 in the previous definition can be replaced by the
following two conditions:
P (Nt − Ns = 1) = λ(t − s) + o(t − s)
(3.8.4)
P (Nt − Ns ≥ 2) = o(t − s),
(3.8.5)
where o(h) denotes a quantity such that limh→0 o(h)/h = 0. Then the probability
that a jump of size 1 occurs in the infinitesimal interval dt is equal to λdt, and the
probability that at least 2 events occur in the same small interval is zero. This implies
that the random variable dNt may take only two values, 0 and 1, and hence satisfies
P (dNt = 1) = λ dt
(3.8.6)
P (dNt = 0) = 1 − λ dt.
(3.8.7)
Exercise 3.8.2 Show that if condition 4 is satisfied, then conditions (3.8.4 − 3.8.5)
hold.
Exercise 3.8.3 Which of the following expressions are o(h)?

(a) f(h) = 3h² + h;
(b) f(h) = √h + 5;
(c) f(h) = h ln |h|;
(d) f(h) = h e^h.
The fact that Nt − Ns is stationary can be stated as

P(N_{t+s} − Ns ≤ n) = P(Nt − N0 ≤ n) = P(Nt ≤ n) = Σ_{k=0}^n ((λt)^k / k!) e^{−λt}.
From condition 4 we get the mean and variance of increments
E[Nt − Ns ] = λ(t − s),
V ar[Nt − Ns ] = λ(t − s).
In particular, the random variable Nt is Poisson distributed with E[Nt ] = λt and
V ar[Nt ] = λt. The parameter λ is called the rate of the process. This means that the
events occur at the constant rate λ.
Since the increments are independent, we have for 0 < s < t
E[Ns Nt ] = E[(Ns − N0 )(Nt − Ns ) + Ns2 ]
= E[Ns − N0 ]E[Nt − Ns ] + E[Ns2 ]
= λs · λ(t − s) + (V ar[Ns ] + E[Ns ]2 )
= λ²st + λs.   (3.8.8)

As a consequence we have the following result:
Proposition 3.8.4 Let 0 ≤ s ≤ t. Then

1. Cov(Ns, Nt) = λs;

2. Corr(Ns, Nt) = √(s/t).

Proof: 1. Using (3.8.8) we have

Cov(Ns, Nt) = E[Ns Nt] − E[Ns]E[Nt] = λ²st + λs − λs·λt = λs.

2. Using the formula for the correlation yields

Corr(Ns, Nt) = Cov(Ns, Nt)/(Var[Ns]Var[Nt])^{1/2} = λs/(λs·λt)^{1/2} = √(s/t).
It is worth noting the similarity with Proposition 3.1.9.
Proposition 3.8.5 Let Nt be Ft -adapted. Then the process Mt = Nt − λt is an Ft martingale.
Proof: Let s < t and write Nt = Ns + (Nt − Ns ). Then
E[Nt |Fs ] = E[Ns + (Nt − Ns )|Fs ]
= E[Ns |Fs ] + E[Nt − Ns |Fs ]
= Ns + E[Nt − Ns ]
= Ns + λ(t − s),
where we used that Ns is Fs-measurable (and hence E[Ns|Fs] = Ns) and that the
increment Nt − Ns is independent of previous values of Ns and the information set Fs .
Subtracting λt yields
E[Nt − λt|Fs ] = Ns − λs,
or E[Mt |Fs ] = Ms . Since it is obvious that Mt is integrable and Ft -adapted, it follows
that Mt is a martingale.
It is worth noting that the Poisson process Nt is not a martingale. The martingale
process Mt = Nt − λt is called the compensated Poisson process.
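The Poisson process is easily simulated from its exponential interarrival times (see Section 3.9); the sketch below (assuming numpy; λ, t and the number of paths are arbitrary) checks E[Nt] = λt, Var[Nt] = λt and that the compensated process has mean zero.

    import numpy as np

    # Simulate Poisson processes by summing exponential interarrival times
    # and check E[N_t] = lam*t, Var(N_t) = lam*t, and E[N_t - lam*t] = 0.
    rng = np.random.default_rng(7)
    lam, t, n_paths = 2.0, 5.0, 100_000

    # N_t is the number of waiting times S_1 < S_2 < ... that fall in [0, t].
    max_jumps = 60                                         # comfortably above lam*t here
    T = rng.exponential(1.0 / lam, (n_paths, max_jumps))   # interarrival times
    S = np.cumsum(T, axis=1)                               # waiting (arrival) times
    N_t = (S <= t).sum(axis=1)

    print(N_t.mean(), lam * t)      # both ~ 10
    print(N_t.var(), lam * t)       # both ~ 10
    print((N_t - lam * t).mean())   # compensated process has mean ~ 0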
Exercise 3.8.6 Compute E[Nt2 |Fs ] for s < t. Is the process Nt2 an Fs -martingale?
Exercise 3.8.7 (a) Show that the moment generating function of the random variable Nt is

m_{Nt}(x) = e^{λt(e^x − 1)}.
(b) Deduce the expressions for the first few moments
E[Nt ] = λt
E[Nt2 ] = λ2 t2 + λt
E[Nt3 ] = λ3 t3 + 3λ2 t2 + λt
E[Nt4 ] = λ4 t4 + 6λ3 t3 + 7λ2 t2 + λt.
(c) Show that the first few central moments are given by
E[Nt − λt] = 0
E[(Nt − λt)2 ] = λt
E[(Nt − λt)3 ] = λt
E[(Nt − λt)4 ] = 3λ2 t2 + λt.
Exercise 3.8.8 Find the mean and variance of the process Xt = eNt .
Exercise 3.8.9 (a) Show that the moment generating function of the random variable Mt is

m_{Mt}(x) = e^{λt(e^x − x − 1)}.
(b) Let s < t. Verify that
E[Mt − Ms ] = 0,
E[(Mt − Ms )2 ] = λ(t − s),
E[(Mt − Ms )3 ] = λ(t − s),
E[(Mt − Ms )4 ] = λ(t − s) + 3λ2 (t − s)2 .
Exercise 3.8.10 Let s < t. Show that
V ar[(Mt − Ms )2 ] = λ(t − s) + 2λ2 (t − s)2 .
3.9 Interarrival Times
For each state of the world, ω, the path t → Nt (ω) is a step function that exhibits unit
jumps. Each jump in the path corresponds to an occurrence of a new event. Let T1
be the random variable which describes the time of the 1st jump. Let T2 be the time
between the 1st jump and the second one. In general, denote by Tn the time elapsed
between the (n − 1)th and nth jumps. The random variables Tn are called interarrival
times.
Proposition 3.9.1 The random variables Tn are independent and exponentially distributed with mean E[Tn ] = 1/λ.
Proof: We start by noticing that the events {T1 > t} and {Nt = 0} are the same,
since both describe the situation that no events occurred until after time t. Then
P (T1 > t) = P (Nt = 0) = P (Nt − N0 = 0) = e−λt ,
and hence the distribution function of T1 is
FT1 (t) = P (T1 ≤ t) = 1 − P (T1 > t) = 1 − e−λt .
Differentiating yields the density function
f_{T1}(t) = (d/dt) F_{T1}(t) = λe^{−λt}.

It follows that T1 has an exponential distribution, with E[T1] = 1/λ.

In order to show that the random variables T1 and T2 are independent, it suffices to show that

P(T2 ≤ t) = P(T2 ≤ t | T1 = s),
i.e. the distribution function of T2 is independent of the values of T1 . We note first
that from the independent increments property
P 0 jumps in (s, s + t], 1 jump in (0, s] = P (Ns+t − Ns = 0, Ns − N0 = 1)
= P (Ns+t − Ns = 0)P (Ns − N0 = 1) = P 0 jumps in (s, s + t] P 1 jump in (0, s] .
Then the conditional distribution of T2 is
F (t|s) = P (T2 ≤ t|T1 = s) = 1 − P (T2 > t|T1 = s)
P (T2 > t, T1 = s)
= 1−
P (T1 = s)
P 0 jumps in (s, s + t], 1 jump in (0, s]
= 1−
P (T1 = s)
P 0 jumps in (s, s + t] P 1 jump in (0, s]
= 1−
= 1 − P 0 jumps in (s, s + t]
P 1 jump in (0, s]
= 1 − P (Ns+t − Ns = 0) = 1 − e−λt ,
which is independent of s. Then T2 is independent of T1 and exponentially distributed.
A similar argument for any Tn leads to the desired result.
3.10 Waiting Times
The random variable Sn = T1 + T2 + · · · + Tn is called the waiting time until the nth
jump. The event {Sn ≤ t} means that there are n jumps that occurred before or at
time t, i.e. there are at least n events that happened up to time t; the event is equal
to {Nt ≥ n}. Hence the distribution function of Sn is given by
F_{Sn}(t) = P(Sn ≤ t) = P(Nt ≥ n) = e^{−λt} Σ_{k=n}^∞ (λt)^k/k!.

Differentiating we obtain the density function of the waiting time Sn:

f_{Sn}(t) = (d/dt) F_{Sn}(t) = λe^{−λt}(λt)^{n−1}/(n − 1)!.

Writing

f_{Sn}(t) = t^{n−1} e^{−λt} / ((1/λ)^n Γ(n)),

it turns out that Sn has a gamma distribution with parameters α = n and β = 1/λ. It follows that

E[Sn] = n/λ,  Var[Sn] = n/λ².
The relation lim_{n→∞} E[Sn] = ∞ states that the expectation of the waiting time is unbounded as n → ∞.
Exercise 3.10.1 Prove that

(d/dt) F_{Sn}(t) = λe^{−λt}(λt)^{n−1}/(n − 1)!.
Exercise 3.10.2 Using that the interarrival times T1 , T2 , · · · are independent and exponentially distributed, compute directly the mean E[Sn ] and variance V ar(Sn ).
3.11 The Integrated Poisson Process
The function u → Nu is continuous with the exception of a countable set of jumps of size 1. It is known that such functions are Riemann integrable, so it makes sense to define the process

Ut = ∫_0^t Nu du,
called the integrated Poisson process. The next result provides a relation between the
process Ut and the partial sum of the waiting times Sk .
Proposition 3.11.1 The integrated Poisson process can be expressed as

Ut = tNt − Σ_{k=1}^{Nt} Sk.
Figure 3.2: The Poisson process Nt and the waiting times S1 , S2 , · · · Sn . The shaded
rectangle has area n(Sn+1 − t).
Proof: Let Nt = n. Since Nu is equal to k between the waiting times Sk and S_{k+1}, the process Ut, which is equal to the area of the subgraph of Nu between 0 and t, can be expressed as

Ut = ∫_0^t Nu du = 1·(S2 − S1) + 2·(S3 − S2) + ··· + n(S_{n+1} − Sn) − n(S_{n+1} − t).

Since Sn < t < S_{n+1}, the difference of the last two terms represents the area of the last rectangle, which has length t − Sn and height n. Using associativity, a computation yields

1·(S2 − S1) + 2·(S3 − S2) + ··· + n(S_{n+1} − Sn) = nS_{n+1} − (S1 + S2 + ··· + Sn).

Substituting in the aforementioned relation yields

Ut = nS_{n+1} − (S1 + S2 + ··· + Sn) − n(S_{n+1} − t) = nt − (S1 + S2 + ··· + Sn) = tNt − Σ_{k=1}^{Nt} Sk,

where we replaced n by Nt.
The conditional distribution of the waiting times is provided by the following useful
result.
Theorem 3.11.2 Given that Nt = n, the waiting times S1, S2, ..., Sn have the joint density function given by

f(s1, s2, ..., sn) = n!/tⁿ, 0 < s1 ≤ s2 ≤ ··· ≤ sn < t.
This is the same as the density of an ordered sample of size n from a uniform distribution on the interval (0, t). A naive explanation of this result is as follows. If we know that there will be exactly n events during the time interval (0, t), since the events can occur at any time, each of them can be considered uniformly distributed, with the density f(sk) = 1/t. Since it makes sense to consider the events independent, taking into consideration all possible n! permutations, the joint density function becomes

f(s1, ..., sn) = n! f(s1) ··· f(sn) = n!/tⁿ.
Exercise 3.11.3 Find the following means:

(a) E[Ut];
(b) E[Σ_{k=1}^{Nt} Sk].
Exercise 3.11.4 Show that Var(Ut) = λt³/3.
Exercise 3.11.5 Can you apply a similar proof as in Proposition 3.3.2 to show that
the integrated Poisson process Ut is also a Poisson process?
Exercise 3.11.6 Let Y : Ω → N be a discrete random variable. Then for any random variable X we have

E[X] = Σ_{y≥0} E[X|Y = y] P(Y = y).
Exercise 3.11.7 Use Exercise 3.11.6 to solve Exercise 3.11.3 (b).
Exercise 3.11.8 (a) Let Tk be the kth interarrival time. Show that

E[e^{−σTk}] = λ/(λ + σ), σ > 0.

(b) Let n = Nt. Show that

Ut = nt − [nT1 + (n − 1)T2 + ··· + 2T_{n−1} + Tn].

(c) Find the conditional expectation

E[e^{−σUt} | Nt = n].

(Hint: If we know that there are exactly n jumps in the interval [0, T], it makes sense to consider the arrival times of the jumps independent and uniformly distributed on [0, T].)

(d) Find the expectation E[e^{−σUt}].
3.12 Submartingales

A stochastic process Xt on the probability space (Ω, F, P) is called a submartingale with respect to the filtration Ft if:

(a) ∫_Ω |Xt| dP < ∞ (Xt is integrable);
(b) Xt is known if Ft is given (Xt is adapted to Ft);
(c) E[X_{t+s}|Ft] ≥ Xt, ∀t, s ≥ 0 (future predictions exceed the present value).
Example 3.12.1 We shall prove that the process Xt = µt + σWt , with µ > 0 is a
submartingale.
The integrability follows from the inequality |Xt (ω)| ≤ µt + |Wt (ω)| and integrability
of Wt . The adaptability of Xt is obvious, and the last property follows from the
computation:
E[Xt+s |Ft ] = E[µt + σWt+s |Ft ] + µs > E[µt + σWt+s |Ft ]
= µt + σE[Wt+s |Ft ] = µt + σWt = Xt ,
where we used that Wt is a martingale.
Example 3.12.2 We shall show that the square of the Brownian motion, Wt2 , is a
submartingale.
Using that Wt² − t is a martingale, we have

E[W²_{t+s}|Ft] = E[W²_{t+s} − (t + s)|Ft] + t + s = Wt² − t + t + s = Wt² + s ≥ Wt².
The following result supplies examples of submartingales starting from martingales or submartingales.
Proposition 3.12.3 (a) If Xt is a martingale and φ a convex function such that φ(Xt )
is integrable, then the process Yt = φ(Xt ) is a submartingale.
(b) If Xt is a submartingale and φ an increasing convex function such that φ(Xt ) is
integrable, then the process Yt = φ(Xt ) is a submartingale.
(c) If Xt is a martingale and f (t) is an increasing, integrable function, then Yt =
Xt + f (t) is a submartingale.
Proof: (a) Using Jensen’s inequality for conditional probabilities, Exercise 2.11.7, we
have
E[Yt+s |Ft ] = E[φ(Xt+s )|Ft ] ≥ φ E[Xt+s |Ft ] = φ(Xt ) = Yt .
(b) From the submartingale property and monotonicity of φ we have

φ(E[X_{t+s}|Ft]) ≥ φ(Xt).

Then apply a similar computation as in part (a).

(c) We shall check only the forecast property, since the others are obvious.
E[Yt+s |Ft ] = E[Xt+s + f (t + s)|Ft ] = E[Xt+s |Ft ] + f (t + s)
= Xt + f (t + s) ≥ Xt + f (t) = Yt ,
∀s, t > 0.
Corollary 3.12.4 (a) Let Xt be a right continuous martingale. Then Xt2 , |Xt |, eXt
are submartingales.
(b) Let µ > 0. Then eµt+σWt is a submartingale.
Proof: (a) Results from part (a) of Proposition 3.12.3.
(b) It follows from Example 3.12.1 and part (b) of Proposition 3.12.3.
The following result provides important inequalities involving submartingales.
Proposition 3.12.5 (Doob's Submartingale Inequality) (a) Let Xt be a non-negative submartingale. Then

P(sup_{s≤t} Xs ≥ x) ≤ E[Xt]/x, ∀x > 0.

(b) If Xt is a right continuous submartingale, then for any x > 0

P(sup_{s≤t} Xs ≥ x) ≤ E[Xt⁺]/x,

where Xt⁺ = max{Xt, 0}.
Exercise 3.12.6 Let x > 0. Show the inequalities:

(a) P(sup_{s≤t} Ws² ≥ x) ≤ t/x;

(b) P(sup_{s≤t} |Ws| ≥ x) ≤ √(2t/π)/x.

Exercise 3.12.7 Show that st-lim_{t→∞} (sup_{s≤t} |Ws|)/t = 0.
Exercise 3.12.8 Show that for any martingale Xt we have the inequality

P(sup_{s≤t} Xs² > x) ≤ E[Xt²]/x, ∀x > 0.

It is worth noting that Doob's inequality implies Markov's inequality. Since sup_{s≤t} Xs ≥ Xt, then P(Xt ≥ x) ≤ P(sup_{s≤t} Xs ≥ x). Then Doob's inequality

P(sup_{s≤t} Xs ≥ x) ≤ E[Xt]/x

implies Markov's inequality (see Theorem 2.11.9)

P(Xt ≥ x) ≤ E[Xt]/x.
Exercise 3.12.9 Let Nt denote the Poisson process and consider the information set
Ft = σ{Ns ; s ≤ t}.
(a) Show that Nt is a submartingale;
(b) Is Nt2 a submartingale?
Exercise 3.12.10 It can be shown that for any 0 < σ < τ we have the inequality

E[sup_{σ≤t≤τ} (Nt/t − λ)²] ≤ 4τλ/σ².

Using this inequality prove that ms-lim_{t→∞} Nt/t = λ.
Chapter 4
Properties of Stochastic Processes
4.1 Stopping Times

Consider the probability space (Ω, F, P) and the filtration (Ft)_{t≥0}, i.e.

Ft ⊂ Fs ⊂ F, ∀t < s.
Assume that the decision to stop playing a game before or at time t is determined by
the information Ft available at time t. Then this decision can be modeled by a random
variable τ : Ω → [0, ∞] which satisfies
{ω; τ (ω) ≤ t} ∈ Ft ,
∀t ≥ 0.
This means that given the information set Ft , we know whether the event {ω; τ (ω) ≤ t}
had occurred or not. We note that the possibility τ = ∞ is included, since the decision
to continue the game for ever is also a possible event. A random variable τ with the
previous properties is called a stopping time.
The next example illustrates a few cases when a decision is or is not a stopping time.
In order to accomplish this, think of the situation that τ is the time when some random
event related to a given stochastic process occurs first.
Example 4.1.1 Let Ft be the information available until time t regarding the evolution
of a stock. Assume the price of the stock at time t = 0 is $50 per share. The following
decisions are stopping times:
(a) Sell the stock when it reaches for the first time the price of $100 per share;
(b) Buy the stock when it reaches for the first time the price of $10 per share;
(c) Sell the stock at the end of the year;
(d) Sell the stock either when it reaches for the first time $80 or at the end of the
year.
(e) Keep the stock either until the initial investment doubles or until the end of the
year;
The following decision is not a stopping time:
(f ) Sell the stock when it reaches the maximum level it will ever be;
Part (f ) is not a stopping time because it requires information about the future that
is not contained in Ft . In part (e) there are two conditions; the latter one has the
occurring probability equal to 1.
Exercise 4.1.2 Show that any positive constant, τ = c, is a stopping time with respect
to any filtration.
Exercise 4.1.3 Let τ (ω) = inf{t > 0; |Wt (ω)| ≥ K}, with K > 0 constant. Show that
τ is a stopping time with respect to the filtration Ft = σ(Ws ; s ≤ t).
The random variable τ is called the first exit time of the Brownian motion Wt from the
interval (−K, K). In a similar way one can define the first exit time of the process Xt
from the interval (a, b):
τ(ω) = inf{t > 0; Xt(ω) ∉ (a, b)} = inf{t > 0; Xt(ω) ≥ b or Xt(ω) ≤ a}.
Let X0 < a. The first entry time of Xt in the interval [a, b] is defined as
τ (ω) = inf{t > 0; Xt (ω) ∈ [a, b]}.
If we let b = ∞, we obtain the first hitting time of the level a:

τ(ω) = inf{t > 0; Xt(ω) ≥ a}.
We shall deal with hitting times in more detail in section 4.3.
Exercise 4.1.4 Let Xt be a continuous stochastic process. Prove that the first exit
time of Xt from the interval (a, b) is a stopping time.
We shall present in the following some properties regarding operations with stopping
times. Consider the notations τ1 ∨ τ2 = max{τ1 , τ2 }, τ1 ∧ τ2 = min{τ1 , τ2 }, τ̄n =
supn≥1 τn and τ n = inf n≥1 τn .
Proposition 4.1.5 Let τ1 and τ2 be two stopping times with respect to the filtration
Ft . Then
1. τ1 ∨ τ2
2. τ1 ∧ τ2
3. τ1 + τ2
are stopping times.
Proof: 1. We have
{ω; τ1 ∨ τ2 ≤ t} = {ω; τ1 ≤ t} ∩ {ω; τ2 ≤ t} ∈ Ft ,
since {ω; τ1 ≤ t} ∈ Ft and {ω; τ2 ≤ t} ∈ Ft . From the subadditivity of P we get
P {ω; τ1 ∨ τ2 = ∞}
= P {ω; τ1 = ∞} ∪ {ω; τ2 = ∞}
≤ P {ω; τ1 = ∞} + P {ω; τ2 = ∞} .
Since
P {ω; τi = ∞} = 1 − P {ω; τi < ∞} = 0,
i = 1, 2,
it follows that P {ω; τ1 ∨ τ2 = ∞} = 0 and hence P {ω; τ1 ∨ τ2 < ∞} = 1. Then
τ1 ∨ τ2 is a stopping time.
2. The event {ω; τ1 ∧ τ2 ≤ t} ∈ Ft if and only if {ω; τ1 ∧ τ2 > t} ∈ Ft .
{ω; τ1 ∧ τ2 > t} = {ω; τ1 > t} ∩ {ω; τ2 > t} ∈ Ft ,
since {ω; τ1 > t} ∈ Ft and {ω; τ2 > t} ∈ Ft (the σ-algebra Ft is closed under complements).
The fact that τ1 ∧ τ2 < ∞ almost surely has a similar proof.
3. We note that τ1 + τ2 ≤ t if there is a c ∈ (0, t) such that
τ1 ≤ c,
τ2 ≤ t − c.
Using that the rational numbers are dense in R, we can write
[
{ω; τ1 ≤ c} ∩ {ω; τ2 ≤ t − c} ∈ Ft ,
{ω; τ1 + τ2 ≤ t} =
0<c<t,c∈Q
since
{ω; τ1 ≤ c} ∈ Fc ⊂ Ft ,
{ω; τ2 ≤ t − c} ∈ Ft−c ⊂ Ft .
Writing
{ω; τ1 + τ2 = ∞} = {ω; τ1 = ∞} ∪ {ω; τ2 = ∞}
yields
P {ω; τ1 + τ2 = ∞} ≤ P {ω; τ1 = ∞} + P {ω; τ2 = ∞} = 0,
Hence P {ω; τ1 + τ2 < ∞} = 1. It follows that τ1 + τ2 is a stopping time.
A filtration Ft is called right-continuous if Ft = ∩_{n=1}^∞ F_{t+1/n}, for t > 0. This means
that the information available at time t is a good approximation for any future infinitesimal information Ft+ǫ ; or, equivalently, nothing more can be learned by peeking
infinitesimally far into the future.
Exercise 4.1.6 (a) Let Ft = σ{Ws ; s ≤ t}, where Wt is a Brownian motion. Is Ft
right-continuous?
(b) Let Nt = σ{Ns; s ≤ t}, where Nt is a Poisson process. Is Nt right-continuous?
Proposition 4.1.7 Let Ft be right-continuous and (τn )n≥1 be a sequence of bounded
stopping times.
(a) Then supn τn and inf τn are stopping times.
(b) If the sequence (τn )n≥1 converges to τ , τ 6= 0, then τ is a stopping time.
Proof: (a) The fact that sup_n τn is a stopping time follows from

{ω; sup_n τn ≤ t} = ∩_{n≥1} {ω; τn ≤ t} ∈ Ft,

and from the boundedness, which implies P(τn < ∞) = 1.

In order to show that inf_n τn is a stopping time we shall proceed as follows. Using that Ft is right-continuous and closed under complements, it suffices to show that {ω; inf_n τn ≥ t} ∈ Ft. This follows from

{ω; inf_n τn ≥ t} = ∩_{n≥1} {ω; τn ≥ t} ∈ Ft.

(b) Let lim_{n→∞} τn = τ. Then there is an increasing (or decreasing) subsequence τ_{nk} of stopping times that tends to τ, so sup_k τ_{nk} = τ (or inf_k τ_{nk} = τ). Since the τ_{nk} are stopping
times, by part (a), it follows that τ is a stopping time.
The condition that τn is bounded is significant, since if we take τn = n as stopping times, then sup_n τn = ∞ with probability 1, which does not satisfy the stopping time definition.
Exercise 4.1.8 Let τ be a stopping time.
(a) Let c ≥ 1 be a constant. Show that cτ is a stopping time.
(b) Let f : [0, ∞) → R be a continuous, increasing function satisfying f (t) ≥ t.
Prove that f (τ ) is a stopping time.
(c) Show that eτ is a stopping time.
Exercise 4.1.9 Let τ be a stopping time and c > 0 a constant. Prove that τ + c is a
stopping time.
Exercise 4.1.10 Let a be a constant and define τ = inf{t ≥ 0; Wt = a}. Is τ a
stopping time?
Exercise 4.1.11 Let τ be a stopping time. Consider the following sequence τn =
(m + 1)2−n if m2−n ≤ τ < (m + 1)2−n (stop at the first time of the form k2−n after
τ ). Prove that τn is a stopping time.
4.2 Stopping Theorem for Martingales
The next result states that in a fair game, the expected final fortune of a gambler, who
is using a stopping time to quit the game, is the same as the expected initial fortune.
From the financial point of view, the theorem says that if you buy an asset at some
initial time and adopt a strategy of deciding when to sell it, then the expected price at
the selling time is the initial price; so one cannot make money by buying and selling an
asset whose price is a martingale. Fortunately, the price of a stock is not a martingale,
and people can still expect to make money buying and selling stocks.
If (Mt )t≥0 is an Ft -martingale, then taking the expectation in
E[Mt |Fs ] = Ms ,
∀s < t
and using Example 2.10.4 yields
E[Mt ] = E[Ms ],
∀s < t.
In particular, E[Mt ] = E[M0 ], for any t > 0. The next result states necessary conditions
under which this identity holds if t is replaced by any stopping time τ .
Theorem 4.2.1 (Optional Stopping Theorem) Let (Mt )t≥0 be a right continuous
Ft -martingale and τ be a stopping time with respect to Ft such that
τ is bounded, i.e. ∃N < ∞ such that τ ≤ N .
Then E[Mτ ] = E[M0 ]. If Mt is an Ft -submartingale, then E[Mτ ] ≥ E[M0 ].
Proof: Taking the expectation in relation
Mτ = Mτ ∧t + (Mτ − Mt )1{τ >t} ,
see Exercise 4.2.3, yields
E[Mτ ] = E[Mτ ∧t ] + E[Mτ 1{τ >t} ] − E[Mt 1{τ >t} ].
Since Mτ ∧t is a martingale, see Exercise 4.2.4 (b), then E[Mτ ∧t ] = E[M0 ]. The previous
relation becomes
E[Mτ ] = E[M0 ] + E[Mτ 1{τ >t} ] − E[Mt 1{τ >t} ],
∀t > 0.
Taking the limit yields

E[Mτ] = E[M0] + lim_{t→∞} E[Mτ 1_{τ>t}] − lim_{t→∞} E[Mt 1_{τ>t}].   (4.2.1)

We shall show that both limits are equal to zero.
Since |Mτ 1_{τ>t}| ≤ |Mτ|, ∀t > 0, and Mτ is integrable, see Exercise 4.2.4 (a), by the dominated convergence theorem we have

lim_{t→∞} E[Mτ 1_{τ>t}] = lim_{t→∞} ∫_Ω Mτ 1_{τ>t} dP = ∫_Ω lim_{t→∞} Mτ 1_{τ>t} dP = 0.

For the second limit,

lim_{t→∞} E[Mt 1_{τ>t}] = lim_{t→∞} ∫_Ω Mt 1_{τ>t} dP = 0,

since for t > N the integrand vanishes. Hence relation (4.2.1) yields E[Mτ] = E[M0].
It is worth noting that the previous theorem is a special case of the more general
Optional Stopping Theorem of Doob:
Theorem 4.2.2 Let Mt be a right continuous martingale and σ, τ be two bounded
stopping times, with σ ≤ τ . Then Mσ , Mτ are integrable and
E[Mτ |Fσ ] = Mσ
a.s.
In particular, taking expectations, we have
E[Mτ ] = E[Mσ ]
a.s.
In the case when Mt is a submartingale then E[Mτ ] ≥ E[Mσ ]
a.s.
Exercise 4.2.3 Show that
Mτ = Mτ ∧t + (Mτ − Mt )1{τ >t} ,
where 1_{τ>t}(ω) = 1 if τ(ω) > t, and 1_{τ>t}(ω) = 0 if τ(ω) ≤ t, is the indicator function of the set {τ > t}.
Exercise 4.2.4 Let Mt be a right continuous martingale and τ be a bounded stopping
time. Show that
(a) Mτ is integrable;
(b) Mτ ∧t is a martingale.
Exercise 4.2.5 Show that letting σ = 0 in Theorem 4.2.2 yields Theorem 4.2.1.
Figure 4.1: The first hitting time Ta given by W_{Ta} = a.
4.3 The First Passage of Time
The first passage of time is a particular type of hitting time, which is useful in finance
when studying barrier options and lookback options. For instance, knock-in options
enter into existence when the stock price hits for the first time a certain barrier before
option maturity. A lookback option is priced using the maximum value of the stock
until the present time. The stock price is not a Brownian motion, but it depends on
one. Hence the need for studying the hitting time for the Brownian motion.
The first result deals with the first hitting time for a Brownian motion to reach the
barrier a ∈ R, see Fig.4.1.
Lemma 4.3.1 Let Ta be the first time the Brownian motion Wt hits a. Then the distribution function of Ta is given by

P(Ta ≤ t) = (2/√(2π)) ∫_{|a|/√t}^∞ e^{−y²/2} dy.
Proof: If A and B are two events, then

P(A) = P(A ∩ B) + P(A ∩ B̄) = P(A|B)P(B) + P(A|B̄)P(B̄).   (4.3.2)
Let a > 0. Using formula (4.3.2) for A = {ω; Wt (ω) ≥ a} and B = {ω; Ta (ω) ≤ t}
yields
P(Wt ≥ a) = P(Wt ≥ a|Ta ≤ t)P(Ta ≤ t) + P(Wt ≥ a|Ta > t)P(Ta > t).   (4.3.3)
If Ta > t, the Brownian motion did not reach the barrier a yet, so we must have Wt < a.
Therefore
P (Wt ≥ a|Ta > t) = 0.
If Ta ≤ t, then WTa = a. Since the Brownian motion is a Markov process, it starts
fresh at Ta . Due to symmetry of the density function of a normal variable, Wt has
equal chances to go up or go down after the time interval t − Ta . It follows that
P(Wt ≥ a|Ta ≤ t) = 1/2.
Substituting into (4.3.3) yields

P(Ta ≤ t) = 2P(Wt ≥ a) = (2/√(2πt)) ∫_a^∞ e^{−x²/(2t)} dx = (2/√(2π)) ∫_{a/√t}^∞ e^{−y²/2} dy.

If a < 0, symmetry implies that the distribution of Ta is the same as that of T_{−a}, so we get

P(Ta ≤ t) = P(T_{−a} ≤ t) = (2/√(2π)) ∫_{−a/√t}^∞ e^{−y²/2} dy.
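The distribution of Ta can be observed by simulating discretized Brownian paths; the sketch below (assuming numpy; the barrier, step size and horizon are arbitrary, and discrete monitoring slightly underestimates the hitting probability) compares the empirical distribution with the formula of Lemma 4.3.1.

    import numpy as np
    from math import erf, sqrt

    # Estimate P(T_a <= t) by simulation and compare with 2*(1 - Phi(|a|/sqrt(t))).
    rng = np.random.default_rng(8)
    a, t_horizon, dt, n_paths = 1.0, 4.0, 0.002, 5_000
    n_steps = int(t_horizon / dt)

    dW = rng.normal(0.0, np.sqrt(dt), (n_paths, n_steps))
    W = np.cumsum(dW, axis=1)
    hit = (W >= a).any(axis=1)                      # did the path reach the barrier?
    first = np.argmax(W >= a, axis=1) * dt          # first grid time with W >= a

    for t in [1.0, 2.0, 4.0]:
        empirical = np.mean(hit & (first <= t))
        exact = 2 * (1 - 0.5 * (1 + erf(a / sqrt(t) / sqrt(2))))
        print(t, empirical, exact)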
Remark 4.3.2 The previous proof is based on a more general principle called the Reflection Principle: If τ is a stopping time for the Brownian motion Wt , then the Brownian motion reflected at τ is also a Brownian motion.
Theorem 4.3.3 Let a ∈ R be fixed. Then the Brownian motion hits a (in a finite
amount of time) with probability 1.
Proof: The probability that Wt hits a (in a finite amount of time) is
P(Ta < ∞) = lim_{t→∞} P(Ta ≤ t) = lim_{t→∞} (2/√(2π)) ∫_{|a|/√t}^∞ e^{−y²/2} dy = (2/√(2π)) ∫_0^∞ e^{−y²/2} dy = 1,

where we used the well known integral

∫_0^∞ e^{−y²/2} dy = (1/2) ∫_{−∞}^∞ e^{−y²/2} dy = √(2π)/2.
The previous result stated that the Brownian motion hits the barrier a almost
surely. The next result shows that the expected time to hit the barrier is infinite.
Proposition 4.3.4 The random variable Ta has a Pearson 5 distribution given by

p(t) = (|a|/√(2π)) e^{−a²/(2t)} t^{−3/2}, t > 0.

It has the mean E[Ta] = ∞ and the mode a²/3.
Figure 4.2: The distribution of the first hitting time Ta.
Proof: Differentiating in the formula of the distribution function

F_{Ta}(t) = P(Ta ≤ t) = (2/√(2π)) ∫_{a/√t}^∞ e^{−y²/2} dy

(one may use Leibniz's formula (d/dt) ∫_{ϕ(t)}^{ψ(t)} f(u) du = f(ψ(t))ψ′(t) − f(ϕ(t))ϕ′(t)) yields the following probability density function

p(t) = dF_{Ta}(t)/dt = (a/√(2π)) e^{−a²/(2t)} t^{−3/2}, t > 0.

This is a Pearson 5 distribution with parameters α = 1/2 and β = a²/2. The expectation is

E[Ta] = ∫_0^∞ t p(t) dt = (a/√(2π)) ∫_0^∞ (1/√t) e^{−a²/(2t)} dt.

Using the inequality e^{−a²/(2t)} > 1 − a²/(2t), t > 0, we have the estimation

E[Ta] > (a/√(2π)) ∫_0^∞ (1/√t) dt − (a³/(2√(2π))) ∫_0^∞ (1/t^{3/2}) dt = ∞,   (4.3.4)

since the integral of t^{−1/2} diverges at infinity, while the integral of t^{−3/2} converges there.

The mode of Ta is given by

β/(α + 1) = (a²/2)/(1/2 + 1) = a²/3.
Remark 4.3.5 The distribution has a peak at a2 /3. Then if we need to pick a small
time interval [t − dt, t + dt] in which the probability that the Brownian motion hits the
barrier a is maximum, we need to choose t = a2 /3.
Remark 4.3.6 Formula (4.3.4) states that the expected waiting time for Wt to reach
the barrier a is infinite. However, the expected waiting time for the Brownian motion
Wt to hit either a or −a is finite, see Exercise 4.3.9.
Corollary 4.3.7 A Brownian motion process returns to the origin in a finite amount of time with probability 1.

Proof: Choose a = 0 and apply Theorem 4.3.3.
Exercise 4.3.8 Try to apply the proof of Lemma 4.3.1 for the following stochastic processes:

(a) Xt = µt + σWt, with µ, σ > 0 constants;
(b) Xt = ∫_0^t Ws ds.

Where is the difficulty?
Exercise 4.3.9 Let a > 0 and consider the hitting time
τa = inf{t > 0; Wt = a or Wt = −a} = inf{t > 0; |Wt | = a}.
Prove that E[τa ] = a2 .
Exercise 4.3.10 (a) Show that the distribution function of the process Xt = max_{s∈[0,t]} Ws is given by

P(Xt ≤ a) = (2/√(2π)) ∫_0^{a/√t} e^{−y²/2} dy.

(b) Show that E[Xt] = √(2t/π) and Var(Xt) = t(1 − 2/π).
Exercise 4.3.11 (a) Find the distribution of Yt = |Wt|, t ≥ 0;

(b) Show that E[max_{0≤t≤T} |Wt|] = √(πT/2).
The fact that a Brownian motion returns to the origin or hits a barrier almost surely
is a property characteristic to the first dimension only. The next result states that in
larger dimensions this is no longer possible.
Theorem 4.3.12 Let (a, b) ∈ R². The 2-dimensional Brownian motion W(t) = (W1(t), W2(t)) (with W1(t) and W2(t) independent) hits the point (a, b) with probability zero. The same result is valid for any n-dimensional Brownian motion, with n ≥ 2.

However, if the point (a, b) is replaced by the disk Dǫ(x0) = {x ∈ R²; |x − x0| ≤ ǫ}, there is a difference in the behavior of the Brownian motion from n = 2 to n > 2.

Theorem 4.3.13 The 2-dimensional Brownian motion W(t) = (W1(t), W2(t)) hits the disk Dǫ(x0) with probability one.

Theorem 4.3.14 Let n > 2. The n-dimensional Brownian motion W(t) = (W1(t), ..., Wn(t)) hits the ball Dǫ(x0) with probability

P = (|x0|/ǫ)^{2−n} < 1.
The previous results can be stated by saying that the Brownian motion is transient in Rⁿ, for n > 2. If n = 2 the previous probability equals 1. We shall come back with proofs of the aforementioned results in a later chapter.
Remark 4.3.15 If life spreads according to a Brownian motion, the aforementioned results explain why life is more extensive on earth rather than in space. The probability for a form of life to reach a planet of radius R situated at distance d is R/d. Since d is large, the probability is very small, unlike in the plane, where the probability is always 1.
Exercise 4.3.16 Is the one-dimensional Brownian motion transient or recurrent in
R?
4.4 The Arc-sine Laws

In this section we present a few results which provide certain probabilities related to the behavior of a Brownian motion in terms of the arc-sine of a quotient of two time instances. These results are generally known under the name of laws of arc-sine.

The following result will be used in the proof of the first law of arc-sine.
Proposition 4.4.1 (a) If X : Ω → N is a discrete random variable, then for any subset A ⊂ Ω, we have

P(A) = Σ_{x∈N} P(A|X = x) P(X = x).

(b) If X : Ω → R is a continuous random variable, then

P(A) = ∫ P(A|X = x) dP = ∫ P(A|X = x) f_X(x) dx.
Figure 4.3: The event A(a; t1, t2) in the Law of Arc-sine.
Proof: (a) The sets X^{−1}(x) = {X = x} = {ω; X(ω) = x} form a partition of the sample space Ω, i.e.:

(i) Ω = ∪_x X^{−1}(x);
(ii) X^{−1}(x) ∩ X^{−1}(y) = Ø for x ≠ y.

Then A = ∪_x (A ∩ X^{−1}(x)) = ∪_x (A ∩ {X = x}), and hence

P(A) = Σ_x P(A ∩ {X = x}) = Σ_x [P(A ∩ {X = x})/P({X = x})] P({X = x}) = Σ_x P(A|X = x) P(X = x).

(b) In the case when X is continuous, the sum is replaced by an integral and the probability P({X = x}) by f_X(x)dx, where f_X is the density function of X.
The zero set of a Brownian motion Wt is defined by {t ≥ 0; Wt = 0}. Since Wt is
continuous, the zero set is closed with no isolated points almost surely. The next result
deals with the probability that the zero set does not intersect the interval (t1 , t2 ).
Theorem 4.4.2 (The law of Arc-sine) The probability that a Brownian motion Wt
does not have any zeros in the interval (t1 , t2 ) is equal to
P(Wt ≠ 0, t1 ≤ t ≤ t2) = (2/π) arcsin √(t1/t2).
Proof: Let A(a; t1 , t2 ) denote the event that the Brownian motion Wt takes on the
value a between t1 and t2 . In particular, A(0; t1 , t2 ) denotes the event that Wt has (at
least) a zero between t1 and t2 . Substituting A = A(0; t1 , t2 ) and X = Wt1 into the
formula provided by Proposition 4.4.1
P(A) = ∫ P(A|X = x) f_X(x) dx

yields

P(A(0; t1, t2)) = ∫ P(A(0; t1, t2)|W_{t1} = x) f_{W_{t1}}(x) dx
= (1/√(2πt1)) ∫_{−∞}^∞ P(A(0; t1, t2)|W_{t1} = x) e^{−x²/(2t1)} dx.   (4.4.5)
Using the properties of Wt with respect to time translation and symmetry we have
P A(0; t1 , t2 )|Wt1 = x = P A(0; 0, t2 − t1 )|W0 = x
= P A(−x; 0, t2 − t1 )|W0 = 0
= P A(|x|; 0, t2 − t1 )|W0 = 0
= P A(|x|; 0, t2 − t1 )
= P T|x| ≤ t2 − t1 ,
the last identity stating that Wt hits |x| before t2 − t1. Using Lemma 4.3.1 yields

P(A(0; t1, t2)|W_{t1} = x) = (2/√(2π(t2 − t1))) ∫_{|x|}^∞ e^{−y²/(2(t2−t1))} dy.
Substituting into (4.4.5) we obtain

P(A(0; t1, t2)) = (1/√(2πt1)) ∫_{−∞}^∞ [(2/√(2π(t2 − t1))) ∫_{|x|}^∞ e^{−y²/(2(t2−t1))} dy] e^{−x²/(2t1)} dx
= (1/(π√(t1(t2 − t1)))) ∫_0^∞ ∫_{|x|}^∞ e^{−y²/(2(t2−t1)) − x²/(2t1)} dy dx.

The above integral can be evaluated to get (see Exercise 4.4.3)

P(A(0; t1, t2)) = 1 − (2/π) arcsin √(t1/t2).

Using P(Wt ≠ 0, t1 ≤ t ≤ t2) = 1 − P(A(0; t1, t2)) we obtain the desired result.
Exercise 4.4.3 Use polar coordinates to show

(1/(π√(t1(t2 − t1)))) ∫_0^∞ ∫_{|x|}^∞ e^{−y²/(2(t2−t1)) − x²/(2t1)} dy dx = 1 − (2/π) arcsin √(t1/t2).
Exercise 4.4.4 Find the probability that a 2-dimensional Brownian motion W(t) = (W1(t), W2(t)) stays in the same quadrant for the time interval t ∈ (t1, t2).
Exercise 4.4.5 Find the probability that a Brownian motion Wt does not take the
value a in the interval (t1 , t2 ).
Exercise 4.4.6 Let a 6= b. Find the probability that a Brownian motion Wt does not
take any of the values {a, b} in the interval (t1 , t2 ). Formulate and prove a generalization.
We provide below without proof a few similar results dealing with arc-sine probabilities.
The first result deals with the amount of time spent by a Brownian motion on the
positive half-axis.
Theorem 4.4.7 (Arc-sine Law of Lévy) Let L⁺_t = ∫_0^t sgn⁺(Ws) ds be the amount of time a Brownian motion Wt is positive during the time interval [0, t]. Then

P(L⁺_t ≤ τ) = (2/π) arcsin √(τ/t).
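This law can be observed numerically; the sketch below (assuming numpy; the parameters are arbitrary) estimates the occupation time of the positive half-axis from discretized paths and compares its distribution with (2/π) arcsin √(τ/t).

    import numpy as np

    # Estimate the occupation time of (0, infinity) by W on [0, t] and compare
    # with the arcsine law P(L_t^+ <= tau) = (2/pi) * arcsin(sqrt(tau/t)).
    rng = np.random.default_rng(9)
    t, n_steps, n_paths = 1.0, 1000, 10_000
    dt = t / n_steps

    dW = rng.normal(0.0, np.sqrt(dt), (n_paths, n_steps))
    W = np.cumsum(dW, axis=1)
    L_plus = (W > 0).sum(axis=1) * dt        # time spent on the positive half-axis

    for tau in [0.1, 0.25, 0.5, 0.9]:
        empirical = np.mean(L_plus <= tau)
        exact = 2 / np.pi * np.arcsin(np.sqrt(tau / t))
        print(tau, empirical, exact)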
The next result deals with the Arc-sine law for the last exit time of a Brownian
motion from 0.
Theorem 4.4.8 (Arc-sine Law of exit from 0) Let γ_t = sup{0 ≤ s ≤ t; W_s = 0}. Then
P(γ_t ≤ τ) = (2/π) arcsin √(τ/t),   0 ≤ τ ≤ t.
The Arc-sine law for the time the Brownian motion attains its maximum on the
interval [0, t] is given by the next result.
Theorem 4.4.9 (Arc-sine Law of maximum) Let M_t = max_{0≤s≤t} W_s and define
θ_t = sup{0 ≤ s ≤ t; W_s = M_t}.
Then
P(θ_t ≤ s) = (2/π) arcsin √(s/t),   0 ≤ s ≤ t, t > 0.
4.5 More on Hitting Times
In this section we shall deal with results regarding hitting times of Brownian motion
with drift. They will be useful in the sequel when pricing barrier options.
Theorem 4.5.1 Let X_t = µt + W_t denote a Brownian motion with nonzero drift rate µ, and consider α, β > 0. Then
P(X_t goes up to α before down to -β) = (e^{2µβ} - 1)/(e^{2µβ} - e^{-2µα}).
Proof: Let T = inf{t > 0; X_t = α or X_t = -β} be the first exit time of X_t from the interval (-β, α), which is a stopping time, see Exercise 4.1.4. The exponential process
M_t = e^{cW_t - c²t/2},   t ≥ 0,
is a martingale, see Exercise 3.2.4(c). Then E[M_t] = E[M_0] = 1. By the Optional Stopping Theorem (see Theorem 4.2.1), we get E[M_T] = 1. This can be written as
1 = E[e^{cW_T - c²T/2}] = E[e^{cX_T - (cµ + c²/2)T}].   (4.5.6)
Choosing c = -2µ yields E[e^{-2µX_T}] = 1. Since the random variable X_T takes only the values α and -β, if we let p_α = P(X_T = α), the previous relation becomes
e^{-2µα} p_α + e^{2µβ}(1 - p_α) = 1.
Solving for p_α yields
p_α = (e^{2µβ} - 1)/(e^{2µβ} - e^{-2µα}).   (4.5.7)
Noting that
p_α = P(X_t goes up to α before down to -β)
leads to the desired answer.
It is worth noting how the previous formula changes in the case when the drift rate is zero, i.e. when µ = 0 and X_t = W_t. The previous probability is computed by taking the limit µ → 0 and using L'Hospital's rule:
lim_{µ→0} (e^{2µβ} - 1)/(e^{2µβ} - e^{-2µα}) = lim_{µ→0} 2βe^{2µβ}/(2βe^{2µβ} + 2αe^{-2µα}) = β/(α + β).
Hence
P(W_t goes up to α before down to -β) = β/(α + β).
Taking the limit β → ∞ we recover the following result:
P(W_t hits α) = 1.
If α = β we obtain
P(W_t goes up to α before down to -α) = 1/2,
which shows that the Brownian motion is equally likely to go up or down an amount α in a given time interval.
If T_α and T_{-β} denote the times when the process X_t reaches α and -β, respectively, then the aforementioned probabilities can be written using inequalities. For instance, the first identity becomes
P(T_α ≤ T_{-β}) = (e^{2µβ} - 1)/(e^{2µβ} - e^{-2µα}).
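As a numerical illustration of Theorem 4.5.1, the following Monte Carlo sketch in Python (numpy assumed; the drift, barriers, step size and path count are arbitrary choices, and the Euler discretization introduces a small bias) estimates the probability of hitting α before -β:

    import numpy as np

    rng = np.random.default_rng(1)
    mu, alpha, beta, dt, n_paths = 0.5, 1.0, 1.5, 1e-3, 5000
    hits_alpha = 0
    for _ in range(n_paths):
        x = 0.0
        while -beta < x < alpha:                      # run until the first exit from (-beta, alpha)
            x += mu*dt + np.sqrt(dt)*rng.normal()
        if x >= alpha:
            hits_alpha += 1
    theory = (np.exp(2*mu*beta) - 1) / (np.exp(2*mu*beta) - np.exp(-2*mu*alpha))
    print("simulated:", hits_alpha / n_paths, " theory:", theory)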
Exercise 4.5.2 Let X_t = µt + W_t denote a Brownian motion with nonzero drift rate µ, and consider α > 0.
(a) If µ > 0, show that
P(X_t goes up to α) = 1.
(b) If µ < 0, show that
P(X_t goes up to α) = e^{2µα} < 1.
Formula (a) can be written equivalently as
P(sup_{t≥0}(W_t + µt) ≥ α) = 1,   µ ≥ 0,
while formula (b) becomes
P(sup_{t≥0}(W_t + µt) ≥ α) = e^{2µα},   µ < 0,
or
P(sup_{t≥0}(W_t - γt) ≥ α) = e^{-2γα},   γ > 0,
which is known as one of Doob's inequalities. This can also be described in terms of stopping times as follows. Define the stopping time τ_α = inf{t > 0; W_t - γt ≥ α}. Using
P(τ_α < ∞) = P(sup_{t≥0}(W_t - γt) ≥ α)
yields the identities
P(τ_α < ∞) = e^{-2αγ},   γ > 0,
P(τ_α < ∞) = 1,   γ ≤ 0.
Exercise 4.5.3 Let X_t = µt + W_t denote a Brownian motion with nonzero drift rate µ, and consider β > 0. Show that the probability that X_t never hits -β is given by
1 - e^{-2µβ}, if µ > 0;   0, if µ < 0.
Recall that T is the first time when the process Xt hits α or −β.
Exercise 4.5.4 (a) Show that
E[X_T] = (αe^{2µβ} + βe^{-2µα} - α - β)/(e^{2µβ} - e^{-2µα}).
(b) Find E[X_T²];
(c) Compute Var(X_T).
The next result deals with the time one has to wait (in expectation) for the process X_t = µt + W_t to reach either α or -β.
Proposition 4.5.5 The expected value of T is
E[T] = (αe^{2µβ} + βe^{-2µα} - α - β)/(µ(e^{2µβ} - e^{-2µα})).
Proof: Using that W_t is a martingale, with E[W_T] = E[W_0] = 0, applying the Optional Stopping Theorem, Theorem 4.2.1, yields
0 = E[W_T] = E[X_T - µT] = E[X_T] - µE[T].
Then by Exercise 4.5.4(a) we get
E[T] = E[X_T]/µ = (αe^{2µβ} + βe^{-2µα} - α - β)/(µ(e^{2µβ} - e^{-2µα})).
Exercise 4.5.6 Take the limit µ → 0 in the formula provided by Proposition 4.5.5 to
find the expected time for a Brownian motion to hit either α or −β.
Exercise 4.5.7 Find E[T 2 ] and V ar(T ).
Exercise 4.5.8 (Wald’s identities) Let T be a finite and bounded stopping time for
the Brownian motion Wt . Show that:
(a) E[WT ] = 0;
(b) E[WT2 ] = E[T ].
The previous techniques can also be applied to right-continuous martingales. Let a > 0 and consider the hitting time of the barrier a by the Poisson process,
τ = inf{t > 0; N_t ≥ a}.
Proposition 4.5.9 The expected waiting time for N_t to reach the barrier a is E[τ] = a/λ.
Proof: Since M_t = N_t - λt is a right-continuous martingale, by the Optional Stopping Theorem E[M_τ] = E[M_0] = 0. Then E[N_τ - λτ] = 0 and hence E[τ] = (1/λ)E[N_τ] = a/λ.
4.6 The Inverse Laplace Transform Method
In this section we shall use the Optional Stopping Theorem in conjunction with the inverse Laplace transform to obtain the probability density for hitting times.
A. The case of standard Brownian motion. Let x > 0. The first hitting time τ = T_x = inf{t > 0; W_t = x} is a stopping time. Since M_t = e^{cW_t - c²t/2}, t ≥ 0, is a martingale, with E[M_t] = E[M_0] = 1, by the Optional Stopping Theorem, see Theorem 4.2.1, we have
E[M_τ] = 1.
This can be written equivalently as E[e^{cW_τ} e^{-c²τ/2}] = 1. Using W_τ = x, we get
E[e^{-c²τ/2}] = e^{-cx}.
It is worth noting that c > 0. This is implied by the fact that e^{-c²τ/2} < 1 and τ, x > 0.
Substituting s = c²/2, the previous relation becomes
E[e^{-sτ}] = e^{-√(2s) x}.   (4.6.8)
This relation has a couple of useful applications.
Proposition 4.6.1 The moments of the first hitting time are all infinite: E[τⁿ] = ∞, n ≥ 1.
Proof: The n-th moment of τ can be obtained by differentiating and taking s = 0:
(dⁿ/dsⁿ) E[e^{-sτ}] |_{s=0} = E[(-τ)ⁿ e^{-sτ}] |_{s=0} = (-1)ⁿ E[τⁿ].
Using (4.6.8) yields
E[τⁿ] = (-1)ⁿ (dⁿ/dsⁿ) e^{-√(2s)x} |_{s=0}.
Since by induction we have
(dⁿ/dsⁿ) e^{-√(2s)x} = (-1)ⁿ e^{-√(2s)x} Σ_{k=0}^{n-1} M_k x^{n-k} / (2^{r_k/2} s^{(n+k)/2}),
with M_k, r_k positive integers, it easily follows that E[τⁿ] = ∞.
For instance, in the case n = 1 we have
E[τ] = -(d/ds) e^{-√(2s)x} |_{s=0} = lim_{s→0⁺} (x/√(2s)) e^{-√(2s)x} = +∞.
Another application involves the inverse Laplace transform to get the probability
density. This way we can retrieve the result of Proposition 4.3.4.
Proposition 4.6.2 The probability density of the hitting time τ is given by
p(t) = (|x|/√(2πt³)) e^{-x²/(2t)},   t > 0.   (4.6.9)
Proof: Let x > 0. The expectation
E[e^{-sτ}] = ∫_0^∞ e^{-sτ} p(τ) dτ = L{p(τ)}(s)
is the Laplace transform of p(τ). Applying the inverse Laplace transform yields
p(τ) = L^{-1}{E[e^{-sτ}]}(τ) = L^{-1}{e^{-√(2s)x}}(τ) = (x/√(2πτ³)) e^{-x²/(2τ)},   τ > 0.
In the case x < 0 we obtain
p(τ) = (-x/√(2πτ³)) e^{-x²/(2τ)},   τ > 0,
which leads to (4.6.9).
The computation of the inverse Laplace transform L^{-1}{e^{-√(2s)x}}(τ) is beyond the goal of this book. The reader can obtain the value of this inverse Laplace transform using the Mathematica software. However, the more mathematically inclined reader is referred to the method of complex integration in a book on inverse Laplace transforms.
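A cheap numerical consistency check is also possible: one can verify that the density (4.6.9) indeed has Laplace transform (4.6.8). The following Python sketch (assuming numpy and scipy are available; the values of x and s are arbitrary) computes ∫_0^∞ e^{-st} p(t) dt by quadrature and compares it with e^{-√(2s)x}:

    import numpy as np
    from scipy.integrate import quad

    x, s = 1.0, 0.7
    p = lambda t: x / np.sqrt(2*np.pi*t**3) * np.exp(-x**2/(2*t))   # density (4.6.9) for x > 0
    numeric, _ = quad(lambda t: np.exp(-s*t)*p(t), 0, np.inf)
    print("numerical Laplace transform:", numeric)
    print("closed form e^{-sqrt(2s)x} :", np.exp(-np.sqrt(2*s)*x))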
Another application of formula (4.6.8) is the following inequality.
Proposition 4.6.3 (Chernoff bound) Let τ denote the first hitting time when the Brownian motion W_t hits the barrier x, x > 0. Then
P(τ ≤ λ) ≤ e^{-x²/(2λ)},   ∀λ > 0.
Proof: Let s = -t in part 2 of Theorem 2.11.11 and use (4.6.8) to get
P(τ ≤ λ) ≤ E[e^{tτ}]/e^{λt} = E[e^{-sτ}]/e^{-λs} = e^{λs - x√(2s)},   ∀s > 0.
Then P(τ ≤ λ) ≤ e^{min_{s>0} f(s)}, where f(s) = λs - x√(2s). Since f′(s) = λ - x/√(2s), the function f(s) reaches its minimum at the critical point s_0 = x²/(2λ²). The minimum value is
min_{s>0} f(s) = f(s_0) = -x²/(2λ).
Substituting in the previous inequality leads to the required result.
B. The case of Brownian motion with drift
Consider the Brownian motion with drift X_t = µt + σW_t, with µ, σ > 0. Let
τ = inf{t > 0; X_t = x}
denote the first hitting time of the barrier x, with x > 0. We shall compute the distribution of the random variable τ and its first two moments.
Applying the Optional Stopping Theorem (Theorem 4.2.1) to the martingale M_t = e^{cW_t - c²t/2} yields
E[M_τ] = E[M_0] = 1.
Using that W_τ = (X_τ - µτ)/σ and X_τ = x, the previous relation becomes
E[e^{-(cµ/σ + c²/2)τ}] = e^{-cx/σ}.   (4.6.10)
Substituting s = cµ/σ + c²/2 and completing the square yields
2s + µ²/σ² = (c + µ/σ)².
Solving for c we get the solutions
c = -µ/σ + √(2s + µ²/σ²),   c = -µ/σ - √(2s + µ²/σ²).
Assume c < 0. Then substituting the second solution into (4.6.10) yields
E[e^{-sτ}] = e^{(µ + √(2sσ² + µ²))x/σ²}.
This relation is contradictory since e^{-sτ} < 1 while e^{(µ + √(2sσ² + µ²))x/σ²} > 1, where we used that s, x, τ > 0. Hence it follows that c > 0. Substituting the first solution into (4.6.10) leads to
E[e^{-sτ}] = e^{(µ - √(2sσ² + µ²))x/σ²}.
We arrived at the following result:
Proposition 4.6.4 Assume µ, x > 0. Let τ be the time the process X_t = µt + σW_t hits x for the first time. Then we have
E[e^{-sτ}] = e^{(µ - √(2sσ² + µ²))x/σ²},   s > 0.   (4.6.11)
Proposition 4.6.5 Let τ be the time the process X_t = µt + σW_t hits x, with x > 0 and µ > 0.
(a) Then the density function of τ is given by
p(τ) = (x/(σ√(2π) τ^{3/2})) e^{-(x - µτ)²/(2τσ²)},   τ > 0.   (4.6.12)
(b) The mean and variance of τ are
E[τ] = x/µ,   Var(τ) = xσ²/µ³.
Proof: (a) Let p(τ) be the density function of τ. Since
E[e^{-sτ}] = ∫_0^∞ e^{-sτ} p(τ) dτ = L{p(τ)}(s)
is the Laplace transform of p(τ), applying the inverse Laplace transform yields
p(τ) = L^{-1}{E[e^{-sτ}]} = L^{-1}{e^{(µ - √(2sσ² + µ²))x/σ²}} = (x/(σ√(2π) τ^{3/2})) e^{-(x - µτ)²/(2τσ²)},   τ > 0.
It is worth noting that the computation of the previous inverse Laplace transform is non-elementary; however, it can be easily computed using the Mathematica software.
(b) The moments are obtained by differentiating the moment generating function and taking the value at s = 0:
E[τ] = -(d/ds) E[e^{-sτ}] |_{s=0} = -(d/ds) e^{(µ - √(2sσ² + µ²))x/σ²} |_{s=0}
= (x/√(µ² + 2sσ²)) e^{(µ - √(2sσ² + µ²))x/σ²} |_{s=0} = x/µ;
E[τ²] = (d²/ds²) E[e^{-sτ}] |_{s=0} = (d²/ds²) e^{(µ - √(2sσ² + µ²))x/σ²} |_{s=0}
= (x(σ² + x√(µ² + 2sσ²))/(µ² + 2sσ²)^{3/2}) e^{(µ - √(2sσ² + µ²))x/σ²} |_{s=0} = xσ²/µ³ + x²/µ².
Hence
Var(τ) = E[τ²] - E[τ]² = xσ²/µ³.
It is worth noting that we can arrive at the formula E[τ] = x/µ in the following heuristic way. Taking the expectation in the equation µτ + σW_τ = x yields µE[τ] = x, where we used that E[W_τ] = 0 for any finite stopping time τ (see Exercise 4.5.8 (a)). Solving for E[τ] yields the aforementioned formula.
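The mean and variance formulas of Proposition 4.6.5 can also be checked by simulation. A minimal Python sketch (numpy assumed; the parameter values, the Euler step and the number of paths are arbitrary choices, and the time discretization introduces a small upward bias in the hitting time):

    import numpy as np

    rng = np.random.default_rng(3)
    mu, sigma, x, dt, n_paths = 1.0, 0.8, 2.0, 1e-3, 5000
    taus = np.empty(n_paths)
    for i in range(n_paths):
        X, t = 0.0, 0.0
        while X < x:                                   # run until the barrier x is crossed
            X += mu*dt + sigma*np.sqrt(dt)*rng.normal()
            t += dt
        taus[i] = t
    print("E[tau]  sim:", taus.mean(), " theory:", x/mu)
    print("Var(tau) sim:", taus.var(),  " theory:", x*sigma**2/mu**3)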
Even though the computations are similar to those of the previous result, we shall treat next the case of the negative barrier in full detail, because of its particular importance in pricing perpetual American puts.
Proposition 4.6.6 Assume µ, x > 0. Let τ be the time the process X_t = µt + σW_t hits -x for the first time.
(a) We have
E[e^{-sτ}] = e^{-(µ + √(2sσ² + µ²))x/σ²},   s > 0.   (4.6.13)
(b) Then the density function of τ is given by
p(τ) = (x/(σ√(2π) τ^{3/2})) e^{-(x + µτ)²/(2τσ²)},   τ > 0.   (4.6.14)
(c) The mean of τ is
E[τ] = (x/µ) e^{-2µx/σ²}.
Proof: (a) Consider the stopping time τ = inf{t > 0; X_t = -x}. Applying the Optional Stopping Theorem (Theorem 4.2.1) to the martingale M_t = e^{cW_t - c²t/2} yields
1 = M_0 = E[M_τ] = E[e^{cW_τ - c²τ/2}] = E[e^{(c/σ)(X_τ - µτ) - c²τ/2}]
= E[e^{-cx/σ - (cµ/σ)τ - (c²/2)τ}] = e^{-cx/σ} E[e^{-(cµ/σ + c²/2)τ}].
Therefore
E[e^{-(cµ/σ + c²/2)τ}] = e^{cx/σ}.   (4.6.15)
If we let s = cµ/σ + c²/2, then solving for c yields c = -µ/σ ± √(2s + µ²/σ²), but only the negative solution works out; this comes from the fact that both sides of equation (4.6.15) have to be less than 1. Hence (4.6.15) becomes
E[e^{-sτ}] = e^{-(µ + √(2sσ² + µ²))x/σ²},   s > 0.
(b) Relation (4.6.13) can be written equivalently as
L{p(τ)} = e^{-(µ + √(2sσ² + µ²))x/σ²}.
Taking the inverse Laplace transform, and using the Mathematica software to compute it, we obtain
p(τ) = L^{-1}{e^{-(µ + √(2sσ² + µ²))x/σ²}}(τ) = (x/(σ√(2π) τ^{3/2})) e^{-(x + µτ)²/(2τσ²)},   τ > 0.
(c) Differentiating and evaluating at s = 0 we obtain
E[τ] = -(d/ds) E[e^{-sτ}] |_{s=0} = (x/√(2sσ² + µ²)) e^{-(µ + √(2sσ² + µ²))x/σ²} |_{s=0} = (x/µ) e^{-2µx/σ²}.
Exercise 4.6.7 Assume the hypothesis of Proposition 4.6.6 satisfied. Find V ar(τ ).
Exercise 4.6.8 Find the modes of distributions (4.6.12) and (4.6.14). What do you
notice?
Exercise 4.6.9 Let Xt = 2t + 3Wt and Yt = 2t + Wt .
(a) Show that the expected times for Xt and Yt to reach any barrier x > 0 are the
same.
(b) If Xt and Yt model the prices of two stocks, which one would you like to own?
Exercise 4.6.10 Does 4t + 2Wt hit 9 faster (in expectation) than 5t + 3Wt hits 14?
Exercise 4.6.11 Let τ be the first time the Brownian motion with drift X_t = µt + W_t hits x, where µ, x > 0. Prove the inequality
P(τ ≤ λ) ≤ e^{-(x² + λ²µ²)/(2λ) + µx},   ∀λ > 0.
C. The double barrier case
In the following we shall consider the case of double barrier. Consider the Brownian
motion with drift Xt = µt + Wt , µ > 0. Let α, β > 0 and define the stopping time
T = inf{t > 0; Xt = α or Xt = −β}.
Relation (4.5.6) states
E[e^{cX_T} e^{-(cµ + c²/2)T}] = 1.
Since the random variables T and X_T are independent (why?), we have
E[e^{cX_T}] E[e^{-(cµ + c²/2)T}] = 1.
Using E[e^{cX_T}] = e^{cα} p_α + e^{-cβ}(1 - p_α), with p_α given by (4.5.7), then
E[e^{-(cµ + c²/2)T}] = 1/(e^{cα} p_α + e^{-cβ}(1 - p_α)).
If we substitute s = cµ + c²/2, then
E[e^{-sT}] = 1/(e^{(-µ + √(2s + µ²))α} p_α + e^{-(-µ + √(2s + µ²))β}(1 - p_α)).   (4.6.16)
The probability density of the stopping time T is obtained by taking the inverse Laplace transform of the right-side expression,
p(T) = L^{-1}{ 1/(e^{(-µ + √(2s + µ²))α} p_α + e^{-(-µ + √(2s + µ²))β}(1 - p_α)) }(τ),
an expression which does not admit a closed-form solution. However, expression (4.6.16) is useful for computing the price of double barrier derivatives.
Exercise 4.6.12 Use formula (4.6.16) to find the expectation E[T ].
Exercise 4.6.13 Denote by M_t = N_t - λt the compensated Poisson process and let c > 0 be a constant.
(a) Show that
X_t = e^{cM_t - λt(e^c - c - 1)}
is an F_t-martingale, with F_t = σ(N_u; u ≤ t).
(b) Let a > 0 and T = inf{t > 0; M_t > a} be the first hitting time of the level a. Use the Optional Stopping Theorem to show that
E[e^{-λsT}] = e^{-ϕ(s)a},   s > 0,
where ϕ : [0, ∞) → [0, ∞) is the inverse function of f(x) = e^x - x - 1.
(c) Show that E[T] = ∞.
(d) Can you use the inverse Laplace transform to find the probability density function of T?
4.7 Limits of Stochastic Processes
Let (X_t)_{t≥0} be a stochastic process. We can make sense of the limit expression X = lim_{t→∞} X_t in a similar way as we did in section 2.12 for sequences of random variables. We shall rewrite the definitions for the continuous case.
Almost Certain Limit
The process X_t converges almost certainly to X if, for all states of the world ω, except a set of probability zero, we have
lim_{t→∞} X_t(ω) = X(ω).
We shall write ac-lim_{t→∞} X_t = X. This is also sometimes called strong convergence.
Mean Square Limit
We say that the process X_t converges to X in the mean square if
lim_{t→∞} E[(X_t - X)²] = 0.
In this case we write ms-lim_{t→∞} X_t = X.
Limit in Probability or Stochastic Limit
The stochastic process X_t converges in stochastic limit to X if
lim_{t→∞} P(ω; |X_t(ω) - X(ω)| > ε) = 0.
This limit is abbreviated by st-lim_{t→∞} X_t = X.
It is worth noting that, like in the case of sequences of random variables, both almost
certain convergence and convergence in mean square imply the stochastic convergence,
which implies the limit in distribution.
Limit in Distribution
We say that X_t converges in distribution to X if for any continuous bounded function ϕ(x) we have
lim_{t→∞} E[ϕ(X_t)] = E[ϕ(X)].
It is worth noting that the stochastic convergence implies the convergence in distribution.
4.8 Mean Square Convergence
The following property is a reformulation of Exercise 2.12.1 in the continuous setup.
Proposition 4.8.1 Consider a stochastic process X_t such that E[X_t] → k, a constant, and Var(X_t) → 0 as t → ∞. Then ms-lim_{t→∞} X_t = k.
It is worth noting that the previous statement holds true if the limit to infinity is replaced by a limit to any other number.
Next we shall provide a few applications.
Application 4.8.2 If α > 1/2, then
ms-lim_{t→∞} W_t/t^α = 0.
Proof: Let X_t = W_t/t^α. Then E[X_t] = E[W_t]/t^α = 0, and Var[X_t] = Var[W_t]/t^{2α} = t/t^{2α} = 1/t^{2α-1}, for any t > 0. Since 1/t^{2α-1} → 0 as t → ∞, applying Proposition 4.8.1 yields
ms-lim_{t→∞} W_t/t^α = 0.
Corollary 4.8.3 ms-lim_{t→∞} W_t/t = 0.
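Mean square convergence can be visualized numerically: since W_t ~ N(0, t), the quantity E[(W_t/t)²] = 1/t can be estimated by sampling. A minimal Python sketch (numpy assumed; the time points and sample sizes are arbitrary choices):

    import numpy as np

    rng = np.random.default_rng(4)
    for t in (10.0, 100.0, 1000.0):
        w_t = rng.normal(0.0, np.sqrt(t), 100000)         # samples of W_t ~ N(0, t)
        print("t =", t, " E[(W_t/t)^2] ~", np.mean((w_t/t)**2), " vs 1/t =", 1/t)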
Application 4.8.4 Let Z_t = ∫_0^t W_s ds. If β > 3/2, then
ms-lim_{t→∞} Z_t/t^β = 0.
Proof: Let X_t = Z_t/t^β. Then E[X_t] = E[Z_t]/t^β = 0, and Var[X_t] = Var[Z_t]/t^{2β} = t³/(3t^{2β}) = 1/(3t^{2β-3}), for any t > 0. Since 1/(3t^{2β-3}) → 0 as t → ∞, applying Proposition 4.8.1 leads to the desired result.
Application 4.8.5 For any p > 0, c ≥ 1 we have
ms-lim_{t→∞} e^{W_t - ct}/t^p = 0.
Proof: Consider the process X_t = e^{W_t - ct}/t^p = e^{W_t}/(t^p e^{ct}). Since
E[X_t] = E[e^{W_t}]/(t^p e^{ct}) = e^{t/2}/(t^p e^{ct}) = 1/(t^p e^{(c - 1/2)t}) → 0, as t → ∞,
Var[X_t] = Var[e^{W_t}]/(t^{2p} e^{2ct}) = (e^{2t} - e^t)/(t^{2p} e^{2ct}) = (1/t^{2p})(1/e^{2t(c-1)} - 1/e^{t(2c-1)}) → 0,
as t → ∞, Proposition 4.8.1 leads to the desired result.
Application 4.8.6 Show that
ms-lim_{t→∞} (max_{0≤s≤t} W_s)/t = 0.
Proof: Let X_t = (max_{0≤s≤t} W_s)/t. Since max_{0≤s≤t} W_s has the same distribution as |W_t| (see Exercise 4.3.10), we have
E[max_{0≤s≤t} W_s] = √(2t/π),   Var(max_{0≤s≤t} W_s) ≤ E[(max_{0≤s≤t} W_s)²] = t,
and hence
E[X_t] = √(2/(πt)) → 0,   Var[X_t] ≤ t/t² = 1/t → 0,   t → ∞.
Apply Proposition 4.8.1 to get the desired result.
Remark 4.8.7 One of the strongest results regarding limits of Brownian motions is the law of iterated logarithms, first proved by Lamperti:
lim sup_{t→∞} W_t/√(2t ln(ln t)) = 1,
almost certainly.
Exercise 4.8.8 Find a stochastic process X_t such that both conditions are satisfied:
(i) ms-lim_{t→∞} X_t = 0;
(ii) ms-lim_{t→∞} X_t² ≠ 0.
Exercise 4.8.9 Let X_t be a stochastic process. Show that
ms-lim_{t→∞} X_t = 0 ⟺ ms-lim_{t→∞} |X_t| = 0.
4.9 The Martingale Convergence Theorem
We state now, without proof, a result which is a powerful way of proving almost sure convergence. We start with the discrete version:
Theorem 4.9.1 Let X_n be a martingale with bounded means:
∃M > 0 such that E[|X_n|] ≤ M,   ∀n ≥ 1.   (4.9.17)
Then there is a random variable L such that as-lim_{n→∞} X_n = L, i.e.
P(ω; lim_{n→∞} X_n(ω) = L(ω)) = 1.
Since E[|X_n|]² ≤ E[X_n²], the condition (4.9.17) can be replaced by its stronger version
∃M > 0 such that E[X_n²] ≤ M,   ∀n ≥ 1.
The following result deals with the continuous version of the Martingale Convergence Theorem. Denote the limiting information set by F_∞ = σ(∪_t F_t).
Theorem 4.9.2 Let Xt be an Ft -martingale such that
∃M > 0 such that E[|Xt |] < M,
∀t > 0.
Then there is an F∞ -measurable random variable X∞ such that Xt → X∞ a.s. as
t → ∞.
The next exercise deals with a process that is a.s-convergent but is not ms-convergent.
Exercise 4.9.3 It is known that X_t = e^{W_t - t/2} is a martingale. Since
E[|X_t|] = E[e^{W_t - t/2}] = e^{-t/2} E[e^{W_t}] = e^{-t/2} e^{t/2} = 1,
by the Martingale Convergence Theorem there is a random variable L such that X_t → L a.s. as t → ∞.
(a) What is the limit L? How did you make your guess?
(b) Show that
E[|X_t - 1|²] = Var(X_t) + (E[X_t] - 1)².
(c) Show that X_t does not converge in the mean square to 1.
(d) Prove that X_t is a.s.-convergent but not ms-convergent.
4.10 Quadratic Variations
In order to gauge the roughness of stochastic processes we shall consider the sum of the p-th powers of consecutive increments of the process, as the norm of the partition decreases to zero.
Definition 4.10.1 Let X_t be a continuous stochastic process on Ω, and p > 0 be a constant. The p-th variation process of X_t, denoted by ⟨X, X⟩_t^{(p)}, is defined by
⟨X, X⟩_t^{(p)}(ω) = p-lim_{max_i |t_{i+1} - t_i| → 0} Σ_{i=0}^{n-1} |X_{t_{i+1}}(ω) - X_{t_i}(ω)|^p,
where 0 = t_1 < t_2 < ⋯ < t_n = t.
If p = 1, the process ⟨X, X⟩_t^{(1)} is called the total variation process of X_t. For p = 2, the process ⟨X, X⟩_t^{(2)} = ⟨X, X⟩_t is called the quadratic variation process of X_t. It can be shown that the limit exists for continuous square integrable martingales X_t.
Next we introduce a symmetric and bilinear operation.
Definition 4.10.2 The quadratic covariation of two processes X_t and Y_t is defined as
⟨X, Y⟩_t = (1/4)(⟨X + Y, X + Y⟩_t - ⟨X - Y, X - Y⟩_t).
Exercise 4.10.3 Prove that:
(a) ⟨X, Y⟩_t = ⟨Y, X⟩_t;
(b) ⟨aX + bY, Z⟩_t = a⟨X, Z⟩_t + b⟨Y, Z⟩_t.
Exercise 4.10.4 Prove that the total variation on the interval [0, t] of a Brownian
motion is infinite a.s.
We shall encounter in the following a few important examples that will be useful
when dealing with stochastic integrals.
4.10.1 The Quadratic Variation of Wt
The next result states that the quadratic variation of the Brownian motion W_t on the interval [0, T] is T. More precisely, we have:
Proposition 4.10.5 Let T > 0 and consider the equidistant partition 0 = t_0 < t_1 < ⋯ < t_{n-1} < t_n = T. Then
ms-lim_{n→∞} Σ_{i=0}^{n-1} (W_{t_{i+1}} - W_{t_i})² = T.   (4.10.18)
Proof: Consider the random variable
X_n = Σ_{i=0}^{n-1} (W_{t_{i+1}} - W_{t_i})².
Since the increments of a Brownian motion are independent, Proposition 5.2.1 yields
E[X_n] = Σ_{i=0}^{n-1} E[(W_{t_{i+1}} - W_{t_i})²] = Σ_{i=0}^{n-1} (t_{i+1} - t_i) = t_n - t_0 = T;
Var(X_n) = Σ_{i=0}^{n-1} Var[(W_{t_{i+1}} - W_{t_i})²] = Σ_{i=0}^{n-1} 2(t_{i+1} - t_i)² = n · 2(T/n)² = 2T²/n,
where we used that the partition is equidistant. Since X_n satisfies the conditions
E[X_n] = T, ∀n ≥ 1;   Var[X_n] → 0, n → ∞,
by Proposition 4.8.1 we obtain ms-lim_{n→∞} X_n = T, or
ms-lim_{n→∞} Σ_{i=0}^{n-1} (W_{t_{i+1}} - W_{t_i})² = T.   (4.10.19)
Since the mean square convergence implies convergence in probability, it follows that
p-lim_{n→∞} Σ_{i=0}^{n-1} (W_{t_{i+1}} - W_{t_i})² = T,
i.e. the quadratic variation of W_t on [0, T] is T.
Exercise 4.10.6 Prove that the quadratic variation of the Brownian motion Wt on
[a, b] is equal to b − a.
The Fundamental Relation dWt² = dt
The relation discussed in this section can be regarded as the fundamental relation of Stochastic Calculus. We shall start by recalling relation (4.10.19)
ms-lim_{n→∞} Σ_{i=0}^{n-1} (W_{t_{i+1}} - W_{t_i})² = T.   (4.10.20)
The right side can be regarded as a regular Riemann integral
T = ∫_0^T dt,
while the left side can be regarded as a stochastic integral with respect to dW_t²,
∫_0^T (dW_t)² := ms-lim_{n→∞} Σ_{i=0}^{n-1} (W_{t_{i+1}} - W_{t_i})².
Substituting into (4.10.20) yields
∫_0^T (dW_t)² = ∫_0^T dt,   ∀T > 0.
The differential form of this integral equation is
dW_t² = dt.
In fact, this expression also holds in the mean square sense, as can be inferred from the next exercise.
Exercise 4.10.7 Show that
(a) E[dW_t² - dt] = 0;
(b) Var(dW_t² - dt) = o(dt);
(c) ms-lim_{dt→0} (dW_t² - dt) = 0.
Roughly speaking, the process dWt2 , which is the square of infinitesimal increments
of a Brownian motion, is totally predictable. This relation plays a central role in
Stochastic Calculus and will be useful when dealing with Ito’s lemma.
The following exercise states that dtdWt = 0, which is another important stochastic
relation useful in Ito’s lemma.
Exercise 4.10.8 Consider the equidistant partition 0 = t_0 < t_1 < ⋯ < t_{n-1} < t_n = T. Show that
ms-lim_{n→∞} Σ_{i=0}^{n-1} (W_{t_{i+1}} - W_{t_i})(t_{i+1} - t_i) = 0.   (4.10.21)
4.10.2 The Quadratic Variation of Nt − λt
The following result deals with the quadratic variation of the compensated Poisson process M_t = N_t - λt.
Proposition 4.10.9 Let a < b and consider the partition a = t_0 < t_1 < ⋯ < t_{n-1} < t_n = b. Then
ms-lim_{‖Δ_n‖→0} Σ_{k=0}^{n-1} (M_{t_{k+1}} - M_{t_k})² = N_b - N_a,   (4.10.22)
where ‖Δ_n‖ := sup_{0≤k≤n-1} (t_{k+1} - t_k).
Proof: For the sake of simplicity we shall use the following notations:
Δt_k = t_{k+1} - t_k,   ΔM_k = M_{t_{k+1}} - M_{t_k},   ΔN_k = N_{t_{k+1}} - N_{t_k}.
The relation we need to prove can also be written as
ms-lim_{n→∞} Σ_{k=0}^{n-1} [(ΔM_k)² - ΔN_k] = 0.
Let
Y_k = (ΔM_k)² - ΔN_k = (ΔM_k)² - ΔM_k - λΔt_k.
It suffices to show that
E[Σ_{k=0}^{n-1} Y_k] = 0,   (4.10.23)
lim_{n→∞} Var[Σ_{k=0}^{n-1} Y_k] = 0.   (4.10.24)
The first identity follows from the properties of Poisson processes (see Exercise 3.8.9):
E[Σ_{k=0}^{n-1} Y_k] = Σ_{k=0}^{n-1} E[Y_k] = Σ_{k=0}^{n-1} (E[(ΔM_k)²] - E[ΔN_k]) = Σ_{k=0}^{n-1} (λΔt_k - λΔt_k) = 0.
For the proof of the identity (4.10.24) we need to find first the variance of Y_k:
Var[Y_k] = Var[(ΔM_k)² - (ΔM_k + λΔt_k)] = Var[(ΔM_k)² - ΔM_k]
= Var[(ΔM_k)²] + Var[ΔM_k] - 2Cov[(ΔM_k)², ΔM_k]
= λΔt_k + 2λ²(Δt_k)² + λΔt_k - 2(E[(ΔM_k)³] - E[(ΔM_k)²]E[ΔM_k])
= 2λ²(Δt_k)²,
where we used Exercise 3.8.9 and the fact that E[ΔM_k] = 0. Since M_t is a process with independent increments, Cov[Y_k, Y_j] = 0 for k ≠ j. Then
Var[Σ_{k=0}^{n-1} Y_k] = Σ_{k=0}^{n-1} Var[Y_k] + 2Σ_{k≠j} Cov[Y_k, Y_j] = Σ_{k=0}^{n-1} Var[Y_k]
= 2λ² Σ_{k=0}^{n-1} (Δt_k)² ≤ 2λ² ‖Δ_n‖ Σ_{k=0}^{n-1} Δt_k = 2λ²(b - a)‖Δ_n‖,
and hence Var[Σ_{k} Y_k] → 0 as ‖Δ_n‖ → 0. According to Example 2.12.1, we obtain the desired limit in mean square.
The previous result states that the quadratic variation of the martingale Mt between
a and b is equal to the jump of the Poisson process between a and b.
The Fundamental Relation dMt² = dNt
Recall relation (4.10.22)
ms-lim_{n→∞} Σ_{k=0}^{n-1} (M_{t_{k+1}} - M_{t_k})² = N_b - N_a.   (4.10.25)
The right side can be regarded as a Riemann–Stieltjes integral
N_b - N_a = ∫_a^b dN_t,
while the left side can be regarded as a stochastic integral with respect to (dM_t)²,
∫_a^b (dM_t)² := ms-lim_{n→∞} Σ_{k=0}^{n-1} (M_{t_{k+1}} - M_{t_k})².
Substituting in (4.10.25) yields
∫_a^b (dM_t)² = ∫_a^b dN_t,
for any a < b. The equivalent differential form is
(dM_t)² = dN_t.   (4.10.26)
The Relations dt dMt = 0, dWt dMt = 0
In order to show that dt dM_t = 0 in the mean square sense, we need to prove the limit
ms-lim_{n→∞} Σ_{k=0}^{n-1} (t_{k+1} - t_k)(M_{t_{k+1}} - M_{t_k}) = 0.   (4.10.27)
This can be thought of as a vanishing integral of the increment process dM_t with respect to dt,
∫_a^b dM_t dt = 0,   ∀a, b ∈ R.
Denote
X_n = Σ_{k=0}^{n-1} (t_{k+1} - t_k)(M_{t_{k+1}} - M_{t_k}) = Σ_{k=0}^{n-1} Δt_k ΔM_k.
In order to show (4.10.27) it suffices to prove that
1. E[X_n] = 0;
2. lim_{n→∞} Var[X_n] = 0.
Using the additivity of the expectation and Exercise 3.8.9 (ii),
E[X_n] = E[Σ_{k=0}^{n-1} Δt_k ΔM_k] = Σ_{k=0}^{n-1} Δt_k E[ΔM_k] = 0.
Since the Poisson process N_t has independent increments, the same property holds for the compensated Poisson process M_t. Then Δt_k ΔM_k and Δt_j ΔM_j are independent for k ≠ j, and using the properties of variance we have
Var[X_n] = Var[Σ_{k=0}^{n-1} Δt_k ΔM_k] = Σ_{k=0}^{n-1} (Δt_k)² Var[ΔM_k] = λ Σ_{k=0}^{n-1} (Δt_k)³,
where we used
Var[ΔM_k] = E[(ΔM_k)²] - (E[ΔM_k])² = λΔt_k,
see Exercise 3.8.9 (ii). If we let ‖Δ_n‖ = max_k Δt_k, then
Var[X_n] = λ Σ_{k=0}^{n-1} (Δt_k)³ ≤ λ‖Δ_n‖² Σ_{k=0}^{n-1} Δt_k = λ(b - a)‖Δ_n‖² → 0
as n → ∞. Hence we proved the stochastic differential relation
dt dM_t = 0.   (4.10.28)
For showing the relation dW_t dM_t = 0, we need to prove
ms-lim_{n→∞} Y_n = 0,   (4.10.29)
where we have denoted
Y_n = Σ_{k=0}^{n-1} (W_{t_{k+1}} - W_{t_k})(M_{t_{k+1}} - M_{t_k}) = Σ_{k=0}^{n-1} ΔW_k ΔM_k.
Since the Brownian motion W_t and the process M_t have independent increments and ΔW_k is independent of ΔM_k, we have
E[Y_n] = Σ_{k=0}^{n-1} E[ΔW_k ΔM_k] = Σ_{k=0}^{n-1} E[ΔW_k]E[ΔM_k] = 0,
where we used E[ΔW_k] = E[ΔM_k] = 0. Using also E[(ΔW_k)²] = Δt_k, E[(ΔM_k)²] = λΔt_k, and invoking the independence of ΔW_k and ΔM_k, we get
Var[ΔW_k ΔM_k] = E[(ΔW_k)²(ΔM_k)²] - (E[ΔW_k ΔM_k])²
= E[(ΔW_k)²]E[(ΔM_k)²] - E[ΔW_k]²E[ΔM_k]²
= λ(Δt_k)².
Then using the independence of the terms in the sum, we get
Var[Y_n] = Σ_{k=0}^{n-1} Var[ΔW_k ΔM_k] = λ Σ_{k=0}^{n-1} (Δt_k)² ≤ λ‖Δ_n‖ Σ_{k=0}^{n-1} Δt_k = λ(b - a)‖Δ_n‖ → 0,
as n → ∞. Since Y_n is a random variable with mean zero and variance decreasing to zero, it follows that Y_n → 0 in the mean square sense. Hence we proved that
dW_t dM_t = 0.   (4.10.30)
Exercise 4.10.10 Show the following stochastic differential relations:
(a) dt dN_t = 0;   (b) dW_t dN_t = 0;   (c) dt dW_t = 0;
(d) (dN_t)² = dN_t;   (e) (dM_t)² = dN_t;   (f) (dM_t)⁴ = dN_t.
The relations proved in this section will be useful in Part II when developing the stochastic model of a stock price that exhibits jumps modeled by a Poisson process.
Chapter 5
Stochastic Integration
This chapter deals with one of the most useful stochastic integrals, called the Ito integral. This type of integral was introduced in 1944 by the Japanese mathematician K.
Ito, and was originally motivated by a construction of diffusion processes.
5.1
Nonanticipating Processes
Consider the Brownian motion Wt . A process Ft is called a nonanticipating process
if Ft is independent of any future increment Wt′ − Wt for any t and t′ with t < t′ .
Consequently, the process Ft is independent of the behavior of the Brownian motion
in the future, i.e. it cannot anticipate the future. For instance, Wt , eWt , Wt2 − Wt + t
are examples of nonanticipating processes, while W_{t+1} or (1/2)(W_{t+1} - W_t)² are not.
Nonanticipating processes are important because the Ito integral concept applies
only to them.
If Ft denotes the information known until time t, where this information is generated
by the Brownian motion {Ws ; s ≤ t}, then any Ft -adapted process Ft is nonanticipating.
5.2
Increments of Brownian Motions
In this section we shall discuss a few basic properties of the increments of a Brownian
motion, which will be useful when computing stochastic integrals.
Proposition 5.2.1 Let Wt be a Brownian motion. If s < t, we have
1. E[(Wt − Ws )2 ] = t − s.
2. V ar[(Wt − Ws )2 ] = 2(t − s)2 .
Proof: 1. Using that Wt − Ws ∼ N (0, t − s), we have
E[(Wt − Ws )2 ] = E[(Wt − Ws )2 ] − (E[Wt − Ws ])2 = V ar(Wt − Ws ) = t − s.
2. Dividing by the standard deviation yields the standard normal random variable (W_t - W_s)/√(t - s) ∼ N(0, 1). Its square, (W_t - W_s)²/(t - s), is χ²-distributed with 1 degree of freedom (recall that a χ²-distributed random variable with n degrees of freedom has mean n and variance 2n). Its mean is 1 and its variance is 2. This implies
E[(W_t - W_s)²/(t - s)] = 1 ⟹ E[(W_t - W_s)²] = t - s;
Var[(W_t - W_s)²/(t - s)] = 2 ⟹ Var[(W_t - W_s)²] = 2(t - s)².
Remark 5.2.2 The infinitesimal version of the previous result is obtained by replacing t - s with dt:
1. E[dW_t²] = dt;
2. Var[dW_t²] = 2dt².
We shall see in an upcoming section that in fact dW_t² and dt are equal in a mean square sense.
Exercise 5.2.3 Show that
(a) E[(Wt − Ws )4 ] = 3(t − s)2 ;
(b) E[(Wt − Ws )6 ] = 15(t − s)3 .
5.3 The Ito Integral
The Ito integral is defined in a way that is similar to the Riemann integral. The Ito integral is taken with respect to infinitesimal increments of a Brownian motion, dW_t, which are random variables, while the Riemann integral considers integration with respect to the predictable infinitesimal changes dt. It is worth noting that the Ito integral is a random variable, while the Riemann integral is just a real number. Despite this fact, there are several common properties and relations between these two types of integrals.
Consider 0 ≤ a < b and let F_t = f(W_t, t) be a nonanticipating process with
E[∫_a^b F_t² dt] < ∞.   (5.3.1)
The role of the previous condition will be made clearer when we discuss the martingale property of the Ito integral. Divide the interval [a, b] into n subintervals using the partition points
a = t_0 < t_1 < ⋯ < t_{n-1} < t_n = b,
and consider the partial sums
S_n = Σ_{i=0}^{n-1} F_{t_i}(W_{t_{i+1}} - W_{t_i}).
We emphasize that the intermediate points are the left endpoints of each interval, and this is the way they should always be chosen. Since the process F_t is nonanticipating, the random variables F_{t_i} and W_{t_{i+1}} - W_{t_i} are independent; this is an important feature in the definition of the Ito integral.
The Ito integral is the limit of the partial sums S_n,
ms-lim_{n→∞} S_n = ∫_a^b F_t dW_t,
provided the limit exists. It can be shown that the choice of partition does not influence the value of the Ito integral. This is the reason why, for practical purposes, it suffices to assume the intervals equidistant, i.e.
t_{i+1} - t_i = (b - a)/n,   i = 0, 1, ⋯, n - 1.
The previous convergence is in the mean square sense, i.e.
lim_{n→∞} E[(S_n - ∫_a^b F_t dW_t)²] = 0.
Existence of the Ito integral
It is known that the Ito stochastic integral ∫_a^b F_t dW_t exists if the process F_t = f(W_t, t) satisfies the following properties:
1. The paths t → F_t(ω) are continuous on [a, b] for any state of the world ω ∈ Ω;
2. The process F_t is nonanticipating for t ∈ [a, b];
3. E[∫_a^b F_t² dt] < ∞.
For instance, the following stochastic integrals exist:
∫_0^T W_t² dW_t,   ∫_0^T sin(W_t) dW_t,   ∫_a^b (cos(W_t)/t) dW_t.
5.4 Examples of Ito integrals
As in the case of the Riemann integral, using the definition is not an efficient way of computing integrals. The same philosophy applies to Ito integrals. We shall compute in the following two simple Ito integrals. In later sections we shall introduce more efficient methods for computing Ito integrals.
5.4.1 The case Ft = c, constant
In this case the partial sums can be computed explicitly:
S_n = Σ_{i=0}^{n-1} F_{t_i}(W_{t_{i+1}} - W_{t_i}) = Σ_{i=0}^{n-1} c(W_{t_{i+1}} - W_{t_i}) = c(W_b - W_a),
and since the answer does not depend on n, we have
∫_a^b c dW_t = c(W_b - W_a).
In particular, taking c = 1, a = 0, and b = T, since the Brownian motion starts at 0, we have the following formula:
∫_0^T dW_t = W_T.
5.4.2 The case Ft = Wt
We shall integrate the process W_t between 0 and T. Considering an equidistant partition, we take t_k = kT/n, k = 0, 1, ⋯, n - 1. The partial sums are given by
S_n = Σ_{i=0}^{n-1} W_{t_i}(W_{t_{i+1}} - W_{t_i}).
Since
xy = (1/2)[(x + y)² - x² - y²],
letting x = W_{t_i} and y = W_{t_{i+1}} - W_{t_i} yields
W_{t_i}(W_{t_{i+1}} - W_{t_i}) = (1/2)W_{t_{i+1}}² - (1/2)W_{t_i}² - (1/2)(W_{t_{i+1}} - W_{t_i})².
Then after pair cancellations the sum becomes
S_n = (1/2)Σ_{i=0}^{n-1} W_{t_{i+1}}² - (1/2)Σ_{i=0}^{n-1} W_{t_i}² - (1/2)Σ_{i=0}^{n-1} (W_{t_{i+1}} - W_{t_i})²
= (1/2)W_{t_n}² - (1/2)Σ_{i=0}^{n-1} (W_{t_{i+1}} - W_{t_i})².
Using t_n = T, we get
S_n = (1/2)W_T² - (1/2)Σ_{i=0}^{n-1} (W_{t_{i+1}} - W_{t_i})².
Since the first term on the right side is independent of n, using Proposition 4.10.5, we have
ms-lim_{n→∞} S_n = (1/2)W_T² - ms-lim_{n→∞} (1/2)Σ_{i=0}^{n-1} (W_{t_{i+1}} - W_{t_i})²   (5.4.2)
= (1/2)W_T² - (1/2)T.   (5.4.3)
We have now obtained the following explicit formula of a stochastic integral:
∫_0^T W_t dW_t = (1/2)W_T² - (1/2)T.
In a similar way one can obtain
∫_a^b W_t dW_t = (1/2)(W_b² - W_a²) - (1/2)(b - a).
It is worth noting that the right side contains random variables depending on the limits of integration a and b.
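The formula can be observed numerically along a single simulated path: the left-point Ito sums approach (1/2)W_T² - (1/2)T as the partition is refined. A minimal Python sketch (numpy assumed; T, the step count and the seed are arbitrary choices):

    import numpy as np

    rng = np.random.default_rng(6)
    T, n = 1.0, 100000
    dW = rng.normal(0.0, np.sqrt(T/n), n)
    W = np.concatenate(([0.0], np.cumsum(dW)))    # W at the partition points t_0, ..., t_n
    ito_sum = np.sum(W[:-1] * dW)                 # left endpoints, as required by the Ito integral
    print("Ito sum          :", ito_sum)
    print("W_T^2/2 - T/2    :", 0.5*W[-1]**2 - 0.5*T)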
Exercise 5.4.1 Show the following identities:
(a) E[∫_0^T dW_t] = 0;
(b) E[∫_0^T W_t dW_t] = 0;
(c) Var[∫_0^T W_t dW_t] = T²/2.
5.5 Properties of the Ito Integral
We shall start with some properties which are similar to those of the Riemann integral.
Proposition 5.5.1 Let f(W_t, t), g(W_t, t) be nonanticipating processes and c ∈ R. Then we have
1. Additivity:
∫_0^T [f(W_t, t) + g(W_t, t)] dW_t = ∫_0^T f(W_t, t) dW_t + ∫_0^T g(W_t, t) dW_t.
2. Homogeneity:
∫_0^T c f(W_t, t) dW_t = c ∫_0^T f(W_t, t) dW_t.
3. Partition property:
∫_0^T f(W_t, t) dW_t = ∫_0^u f(W_t, t) dW_t + ∫_u^T f(W_t, t) dW_t,   ∀ 0 < u < T.
Proof: 1. Consider the partial sum sequences
X_n = Σ_{i=0}^{n-1} f(W_{t_i}, t_i)(W_{t_{i+1}} - W_{t_i}),   Y_n = Σ_{i=0}^{n-1} g(W_{t_i}, t_i)(W_{t_{i+1}} - W_{t_i}).
Since ms-lim_{n→∞} X_n = ∫_0^T f(W_t, t) dW_t and ms-lim_{n→∞} Y_n = ∫_0^T g(W_t, t) dW_t, using Proposition 2.13.2 yields
∫_0^T [f(W_t, t) + g(W_t, t)] dW_t = ms-lim_{n→∞} Σ_{i=0}^{n-1} [f(W_{t_i}, t_i) + g(W_{t_i}, t_i)](W_{t_{i+1}} - W_{t_i})
= ms-lim_{n→∞} (X_n + Y_n) = ms-lim_{n→∞} X_n + ms-lim_{n→∞} Y_n
= ∫_0^T f(W_t, t) dW_t + ∫_0^T g(W_t, t) dW_t.
The proofs of parts 2 and 3 are left as an exercise for the reader.
Some other properties, such as monotonicity, do not hold in general. It is possible to have a nonnegative process F_t for which the random variable ∫_0^T F_t dW_t takes negative values. More precisely, let F_t = 1. Then F_t > 0, but ∫_0^T 1 dW_t = W_T is not always positive; the probability that it is negative is P(W_T < 0) = 1/2.
Some of the random variable properties of the Ito integral are given by the following
result:
Proposition 5.5.2 We have
1. Zero mean:
E[∫_a^b f(W_t, t) dW_t] = 0.
2. Isometry:
E[(∫_a^b f(W_t, t) dW_t)²] = E[∫_a^b f(W_t, t)² dt].
3. Covariance:
E[(∫_a^b f(W_t, t) dW_t)(∫_a^b g(W_t, t) dW_t)] = E[∫_a^b f(W_t, t)g(W_t, t) dt].
We shall discuss the previous properties, giving rough reasons why they hold true. The detailed proofs are beyond the goal of this book.
1. The Ito integral I = ∫_a^b f(W_t, t) dW_t is the mean square limit of the partial sums S_n = Σ_{i=0}^{n-1} f_{t_i}(W_{t_{i+1}} - W_{t_i}), where we denoted f_{t_i} = f(W_{t_i}, t_i). Since f(W_t, t) is a nonanticipative process, f_{t_i} is independent of the increment W_{t_{i+1}} - W_{t_i}, and hence we have
E[S_n] = E[Σ_{i=0}^{n-1} f_{t_i}(W_{t_{i+1}} - W_{t_i})] = Σ_{i=0}^{n-1} E[f_{t_i}(W_{t_{i+1}} - W_{t_i})] = Σ_{i=0}^{n-1} E[f_{t_i}]E[W_{t_{i+1}} - W_{t_i}] = 0,
because the increments have mean zero. Applying the Squeeze Theorem in the double inequality
0 ≤ (E[S_n - I])² ≤ E[(S_n - I)²] → 0,   n → ∞,
yields E[S_n] - E[I] → 0. Since E[S_n] = 0 it follows that E[I] = 0, i.e. the Ito integral has zero mean.
2. Since the square of the partial sum can be written as
S_n² = (Σ_{i=0}^{n-1} f_{t_i}(W_{t_{i+1}} - W_{t_i}))² = Σ_{i=0}^{n-1} f_{t_i}²(W_{t_{i+1}} - W_{t_i})² + 2Σ_{i≠j} f_{t_i}(W_{t_{i+1}} - W_{t_i}) f_{t_j}(W_{t_{j+1}} - W_{t_j}),
using the independence yields
E[S_n²] = Σ_{i=0}^{n-1} E[f_{t_i}²]E[(W_{t_{i+1}} - W_{t_i})²] + 2Σ_{i≠j} E[f_{t_i}]E[W_{t_{i+1}} - W_{t_i}]E[f_{t_j}]E[W_{t_{j+1}} - W_{t_j}]
= Σ_{i=0}^{n-1} E[f_{t_i}²](t_{i+1} - t_i),
which are the Riemann sums of the integral ∫_a^b E[f_t²] dt = E[∫_a^b f_t² dt], where the last identity follows from Fubini's theorem. Hence E[S_n²] converges to the aforementioned integral.
3. Consider the partial sums
S_n = Σ_{i=0}^{n-1} f_{t_i}(W_{t_{i+1}} - W_{t_i}),   V_n = Σ_{j=0}^{n-1} g_{t_j}(W_{t_{j+1}} - W_{t_j}).
Their product is
S_n V_n = Σ_{i=0}^{n-1} f_{t_i} g_{t_i}(W_{t_{i+1}} - W_{t_i})² + Σ_{i≠j} f_{t_i} g_{t_j}(W_{t_{i+1}} - W_{t_i})(W_{t_{j+1}} - W_{t_j}).
Using that f_t and g_t are nonanticipative and that
E[(W_{t_{i+1}} - W_{t_i})(W_{t_{j+1}} - W_{t_j})] = E[W_{t_{i+1}} - W_{t_i}]E[W_{t_{j+1}} - W_{t_j}] = 0, i ≠ j,
E[(W_{t_{i+1}} - W_{t_i})²] = t_{i+1} - t_i,
it follows that
E[S_n V_n] = Σ_{i=0}^{n-1} E[f_{t_i} g_{t_i}]E[(W_{t_{i+1}} - W_{t_i})²] = Σ_{i=0}^{n-1} E[f_{t_i} g_{t_i}](t_{i+1} - t_i),
which is the Riemann sum for the integral ∫_a^b E[f_t g_t] dt.
From 1 and 2 it follows that the random variable ∫_a^b f(W_t, t) dW_t has mean zero and variance
Var[∫_a^b f(W_t, t) dW_t] = E[∫_a^b f(W_t, t)² dt].
From 1 and 3 it follows that
Cov[∫_a^b f(W_t, t) dW_t, ∫_a^b g(W_t, t) dW_t] = ∫_a^b E[f(W_t, t)g(W_t, t)] dt.
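The zero-mean and isometry properties can be checked by simulation. For instance, for f(W_t, t) = W_t on [0, 1] the isometry gives E[(∫_0^1 W_t dW_t)²] = E[∫_0^1 W_t² dt] = ∫_0^1 t dt = 1/2. A minimal Python sketch (numpy assumed; the path and step counts are arbitrary choices):

    import numpy as np

    rng = np.random.default_rng(7)
    n_paths, n = 10000, 500
    dW = rng.normal(0.0, np.sqrt(1.0/n), (n_paths, n))
    W = np.cumsum(dW, axis=1) - dW                 # values at the left endpoints t_0, ..., t_{n-1}
    I = np.sum(W * dW, axis=1)                     # Ito sums, one per path
    print("sample mean of I   :", I.mean(), " (theory 0)")
    print("sample mean of I^2 :", (I**2).mean(), " (theory 1/2)")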
Corollary 5.5.3 (Cauchy's integral inequality) Let f_t = f(W_t, t) and g_t = g(W_t, t). Then
(∫_a^b E[f_t g_t] dt)² ≤ (∫_a^b E[f_t²] dt)(∫_a^b E[g_t²] dt).
Proof: It follows from the previous theorem and from the correlation formula
|Corr(X, Y)| = |Cov(X, Y)| / [Var(X)Var(Y)]^{1/2} ≤ 1.
Let F_t be the information set at time t. This implies that f_{t_i} and W_{t_{i+1}} - W_{t_i} are known at time t, for any t_{i+1} ≤ t. It follows that the partial sum S_n = Σ_{i=0}^{n-1} f_{t_i}(W_{t_{i+1}} - W_{t_i}) is F_t-measurable. The following result, whose proof is omitted for technical reasons, states that this is also valid after taking the limit in the mean square:
Proposition 5.5.4 The Ito integral ∫_0^t f_s dW_s is F_t-measurable.
The following two results state that if the upper limit of an Ito integral is replaced by the parameter t we obtain a continuous martingale.
Proposition 5.5.5 For any s < t we have
E[∫_0^t f(W_u, u) dW_u | F_s] = ∫_0^s f(W_u, u) dW_u.
Proof: Using part 3 of Proposition 5.5.1 we get
E[∫_0^t f(W_u, u) dW_u | F_s] = E[∫_0^s f(W_u, u) dW_u + ∫_s^t f(W_u, u) dW_u | F_s]
= E[∫_0^s f(W_u, u) dW_u | F_s] + E[∫_s^t f(W_u, u) dW_u | F_s].   (5.5.4)
Since ∫_0^s f(W_u, u) dW_u is F_s-predictable (see Proposition 5.5.4), by part 2 of Proposition 2.10.6
E[∫_0^s f(W_u, u) dW_u | F_s] = ∫_0^s f(W_u, u) dW_u.
Since ∫_s^t f(W_u, u) dW_u contains only information between s and t, it is independent of the information set F_s, so we can drop the condition in the expectation; using that Ito integrals have zero mean we obtain
E[∫_s^t f(W_u, u) dW_u | F_s] = E[∫_s^t f(W_u, u) dW_u] = 0.
Substituting into (5.5.4) yields the desired result.
Proposition 5.5.6 Consider the process X_t = ∫_0^t f(W_s, s) dW_s. Then X_t is continuous, i.e. for almost any state of the world ω ∈ Ω, the path t → X_t(ω) is continuous.
Proof: A rigorous proof is beyond the purpose of this book. We shall provide a rough sketch. Assume the process f(W_t, t) satisfies E[f(W_t, t)²] < M, for some M > 0. Let t_0 be fixed and consider h > 0. Consider the increment Y_h = X_{t_0+h} - X_{t_0}. Using the aforementioned properties of the Ito integral we have
E[Y_h] = E[X_{t_0+h} - X_{t_0}] = E[∫_{t_0}^{t_0+h} f(W_t, t) dW_t] = 0,
E[Y_h²] = E[(∫_{t_0}^{t_0+h} f(W_t, t) dW_t)²] = ∫_{t_0}^{t_0+h} E[f(W_t, t)²] dt < M ∫_{t_0}^{t_0+h} dt = Mh.
The process Y_h has zero mean for any h > 0 and its variance tends to 0 as h → 0. Using a convergence theorem yields that Y_h tends to 0 in mean square, as h → 0. This is equivalent with the continuity of X_t at t_0.
Proposition 5.5.7 Let X_t = ∫_0^t f(W_s, s) dW_s, with E[∫_0^∞ f(W_s, s)² ds] < ∞. Then X_t is a continuous F_t-martingale.
Proof: We shall check in the following the properties of a martingale.
Integrability: Using properties of Ito integrals,
E[X_t²] = E[(∫_0^t f(W_s, s) dW_s)²] = E[∫_0^t f(W_s, s)² ds] < E[∫_0^∞ f(W_s, s)² ds] < ∞,
and then from the inequality E[|X_t|]² ≤ E[X_t²] we obtain E[|X_t|] < ∞, for all t ≥ 0.
Predictability: X_t is F_t-measurable from Proposition 5.5.4.
Forecast: E[X_t|F_s] = X_s for s < t by Proposition 5.5.5.
Continuity: See Proposition 5.5.6.
5.6 The Wiener Integral
The Wiener integral is a particular case of the Ito stochastic integral. It is obtained by replacing the nonanticipating stochastic process f(W_t, t) by the deterministic function f(t). The Wiener integral ∫_a^b f(t) dW_t is the mean square limit of the partial sums
S_n = Σ_{i=0}^{n-1} f(t_i)(W_{t_{i+1}} - W_{t_i}).
All properties of Ito integrals also hold for Wiener integrals. The Wiener integral is a random variable with zero mean
E[∫_a^b f(t) dW_t] = 0
and variance
E[(∫_a^b f(t) dW_t)²] = ∫_a^b f(t)² dt.
However, in the case of Wiener integrals we can say something about their distribution.
Proposition 5.6.1 The Wiener integral I(f) = ∫_a^b f(t) dW_t is a normal random variable with mean 0 and variance
Var[I(f)] = ∫_a^b f(t)² dt := ‖f‖²_{L²}.
Proof: Since the increments W_{t_{i+1}} - W_{t_i} are normally distributed with mean 0 and variance t_{i+1} - t_i, then
f(t_i)(W_{t_{i+1}} - W_{t_i}) ∼ N(0, f(t_i)²(t_{i+1} - t_i)).
Since these random variables are independent, by the Central Limit Theorem (see Theorem 3.3.1), their sum is also normally distributed, with
S_n = Σ_{i=0}^{n-1} f(t_i)(W_{t_{i+1}} - W_{t_i}) ∼ N(0, Σ_{i=0}^{n-1} f(t_i)²(t_{i+1} - t_i)).
Taking n → ∞ and max_i |t_{i+1} - t_i| → 0, the normal distribution tends to
N(0, ∫_a^b f(t)² dt).
The previous convergence holds in distribution, and it still needs to be shown in the mean square. However, we shall omit this essential proof detail.
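Proposition 5.6.1 lends itself to a simulation check. For instance, for f(t) = t on [0, 1], the Wiener integral ∫_0^1 t dW_t should be N(0, ∫_0^1 t² dt) = N(0, 1/3). A minimal Python sketch (numpy and scipy assumed; the sample sizes are arbitrary choices, and the Kolmogorov–Smirnov test is just one convenient way to inspect normality):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(8)
    n_paths, n = 10000, 500
    t_left = np.linspace(0.0, 1.0, n, endpoint=False)        # left endpoints of the partition
    dW = rng.normal(0.0, np.sqrt(1.0/n), (n_paths, n))
    I = dW @ t_left                                           # Wiener sums f(t_i)(W_{t_{i+1}} - W_{t_i})
    print("sample variance:", I.var(), " (theory 1/3)")
    print("KS p-value vs N(0, 1/3):",
          stats.kstest(I, "norm", args=(0.0, np.sqrt(1/3))).pvalue)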
Exercise 5.6.2 Show that the random variable X = ∫_1^T (1/√t) dW_t is normally distributed with mean 0 and variance ln T.
Exercise 5.6.3 Let Y = ∫_1^T √t dW_t. Show that Y is normally distributed with mean 0 and variance (T² - 1)/2.
Exercise 5.6.4 Find the distribution of the integral ∫_0^t e^{t-s} dW_s.
Exercise 5.6.5 Show that X_t = ∫_0^t (2t - u) dW_u and Y_t = ∫_0^t (3t - 4u) dW_u are Gaussian processes with mean 0 and variance (7/3)t³.
Exercise 5.6.6 Show that ms-lim_{t→0} (1/t) ∫_0^t u dW_u = 0.
Exercise 5.6.7 Find all constants a, b such that X_t = ∫_0^t (a + bu/t) dW_u is normally distributed with variance t.
Exercise 5.6.8 Let n be a positive integer. Prove that
Cov(W_t, ∫_0^t uⁿ dW_u) = t^{n+1}/(n + 1).
Formulate and prove a more general result.
5.7 Poisson Integration
In this section we deal with integration with respect to the compensated Poisson process M_t = N_t - λt, which is a martingale. Consider 0 ≤ a < b and let F_t = F(t, M_t) be a nonanticipating process with
E[∫_a^b F_t² dt] < ∞.
Consider the partition
a = t_0 < t_1 < ⋯ < t_{n-1} < t_n = b
of the interval [a, b], and associate the partial sums
S_n = Σ_{i=0}^{n-1} F_{t_i-}(M_{t_{i+1}} - M_{t_i}),
where F_{t_i-} is the left-hand limit of F at t_i. For predictability reasons, the intermediate points are the left-hand limits at the left endpoints of each interval. Since the process F_t is nonanticipative, the random variables F_{t_i-} and M_{t_{i+1}} - M_{t_i} are independent.
The integral of F_{t-} with respect to M_t is the mean square limit of the partial sums S_n,
ms-lim_{n→∞} S_n = ∫_a^b F_{t-} dM_t,
provided the limit exists. More precisely, this convergence means that
lim_{n→∞} E[(S_n - ∫_a^b F_{t-} dM_t)²] = 0.
Exercise 5.7.1 Let c be a constant. Show that
∫_a^b c dM_t = c(M_b - M_a).
5.8 A Worked-Out Example: the case Ft = Mt
We shall integrate the process M_{t-} between 0 and T with respect to M_t. Consider the equidistant partition points t_k = kT/n, k = 0, 1, ⋯, n - 1. Then the partial sums are given by
S_n = Σ_{i=0}^{n-1} M_{t_i-}(M_{t_{i+1}} - M_{t_i}).
Using xy = (1/2)[(x + y)² - x² - y²], by letting x = M_{t_i-} and y = M_{t_{i+1}} - M_{t_i}, we get
M_{t_i-}(M_{t_{i+1}} - M_{t_i}) = (1/2)(M_{t_{i+1}} - M_{t_i} + M_{t_i-})² - (1/2)M_{t_i-}² - (1/2)(M_{t_{i+1}} - M_{t_i})².
Let J be the set of jump instances between 0 and T. Using that M_{t_i} = M_{t_i-} for t_i ∉ J, and M_{t_i} = 1 + M_{t_i-} for t_i ∈ J, yields
M_{t_{i+1}} - M_{t_i} + M_{t_i-} = M_{t_{i+1}}, if t_i ∉ J;   M_{t_{i+1}} - 1, if t_i ∈ J.
Splitting the sum, canceling in pairs, and applying the difference of squares formula we have
S_n = (1/2)Σ_{i=0}^{n-1}(M_{t_{i+1}} - M_{t_i} + M_{t_i-})² - (1/2)Σ_{i=0}^{n-1} M_{t_i-}² - (1/2)Σ_{i=0}^{n-1}(M_{t_{i+1}} - M_{t_i})²
= (1/2)Σ_{t_i∈J}(M_{t_{i+1}} - 1)² + (1/2)Σ_{t_i∉J} M_{t_{i+1}}² - (1/2)Σ_{i=0}^{n-1} M_{t_i-}² - (1/2)Σ_{i=0}^{n-1}(M_{t_{i+1}} - M_{t_i})²
= (1/2)Σ_{t_i∈J}[(M_{t_{i+1}} - 1)² - M_{t_i-}²] + (1/2)M_{t_n}² - (1/2)Σ_{i=0}^{n-1}(M_{t_{i+1}} - M_{t_i})²
= (1/2)Σ_{t_i∈J}(M_{t_{i+1}} - 1 - M_{t_i-})(M_{t_{i+1}} - 1 + M_{t_i-}) + (1/2)M_{t_n}² - (1/2)Σ_{i=0}^{n-1}(M_{t_{i+1}} - M_{t_i})²
= (1/2)M_{t_n}² - (1/2)Σ_{i=0}^{n-1}(M_{t_{i+1}} - M_{t_i})²,
since the factor M_{t_{i+1}} - 1 - M_{t_i-} vanishes on each jump interval.
Hence we have arrived at the following formula:
∫_0^T M_{t-} dM_t = (1/2)M_T² - (1/2)N_T.
Similarly, one can obtain
∫_a^b M_{t-} dM_t = (1/2)(M_b² - M_a²) - (1/2)(N_b - N_a).
Exercise 5.8.1 (a) Show that E[∫_a^b M_{t-} dM_t] = 0.
(b) Find Var[∫_a^b M_{t-} dM_t].
Remark 5.8.2 (a) Let ω be a fixed state of the world and assume the sample path t → N_t(ω) has a jump in the interval (a, b). Even if beyond the scope of this book, it can be shown that the integral
∫_a^b N_t(ω) dN_t
does not exist in the Riemann–Stieltjes sense.
(b) Let N_{t-} denote the left-hand limit of N_t. It can be shown that N_{t-} is predictable, while N_t is not.
The previous remarks provide the reason why in the following we shall work with M_{t-} instead of M_t: the integral ∫_a^b M_t dN_t might not exist, while ∫_a^b M_{t-} dN_t does exist.
Exercise 5.8.3 Show that
∫_0^T N_{t-} dM_t = (1/2)(N_T² - N_T) - λ∫_0^T N_t dt.
Exercise 5.8.4 Find the variance of ∫_0^T N_{t-} dM_t.
The following integrals with respect to a Poisson process N_t are considered in the Riemann–Stieltjes sense.
Proposition 5.8.5 For any continuous function f we have
(a) E[∫_0^t f(s) dN_s] = λ∫_0^t f(s) ds;
(b) E[(∫_0^t f(s) dN_s)²] = λ∫_0^t f(s)² ds + λ²(∫_0^t f(s) ds)²;
(c) E[e^{∫_0^t f(s) dN_s}] = e^{λ∫_0^t (e^{f(s)} - 1) ds}.
Proof: (a) Consider the equidistant partition 0 = s_0 < s_1 < ⋯ < s_n = t, with s_{k+1} - s_k = Δs. Then
E[∫_0^t f(s) dN_s] = lim_{n→∞} E[Σ_{i=0}^{n-1} f(s_i)(N_{s_{i+1}} - N_{s_i})] = lim_{n→∞} Σ_{i=0}^{n-1} f(s_i)E[N_{s_{i+1}} - N_{s_i}]
= λ lim_{n→∞} Σ_{i=0}^{n-1} f(s_i)(s_{i+1} - s_i) = λ∫_0^t f(s) ds.
(b) Using that N_t is stationary and has independent increments, we have respectively
E[(N_{s_{i+1}} - N_{s_i})²] = E[N²_{s_{i+1} - s_i}] = λ(s_{i+1} - s_i) + λ²(s_{i+1} - s_i)² = λΔs + λ²(Δs)²,
E[(N_{s_{i+1}} - N_{s_i})(N_{s_{j+1}} - N_{s_j})] = E[N_{s_{i+1}} - N_{s_i}]E[N_{s_{j+1}} - N_{s_j}] = λ(s_{i+1} - s_i)λ(s_{j+1} - s_j) = λ²(Δs)².
Applying the expectation to the formula
(Σ_{i=0}^{n-1} f(s_i)(N_{s_{i+1}} - N_{s_i}))² = Σ_{i=0}^{n-1} f(s_i)²(N_{s_{i+1}} - N_{s_i})² + 2Σ_{i≠j} f(s_i)f(s_j)(N_{s_{i+1}} - N_{s_i})(N_{s_{j+1}} - N_{s_j})
yields
E[(Σ_{i=0}^{n-1} f(s_i)(N_{s_{i+1}} - N_{s_i}))²] = Σ_{i=0}^{n-1} f(s_i)²(λΔs + λ²(Δs)²) + 2Σ_{i≠j} f(s_i)f(s_j)λ²(Δs)²
= λ Σ_{i=0}^{n-1} f(s_i)²Δs + λ²[Σ_{i=0}^{n-1} f(s_i)²(Δs)² + 2Σ_{i≠j} f(s_i)f(s_j)(Δs)²]
= λ Σ_{i=0}^{n-1} f(s_i)²Δs + λ²(Σ_{i=0}^{n-1} f(s_i)Δs)²
→ λ∫_0^t f(s)² ds + λ²(∫_0^t f(s) ds)²,   as n → ∞.
(c) Using that N_t is stationary with independent increments and has the moment generating function E[e^{kN_t}] = e^{λ(e^k - 1)t}, we have
E[e^{∫_0^t f(s) dN_s}] = lim_{n→∞} E[e^{Σ_{i=0}^{n-1} f(s_i)(N_{s_{i+1}} - N_{s_i})}] = lim_{n→∞} E[Π_{i=0}^{n-1} e^{f(s_i)(N_{s_{i+1}} - N_{s_i})}]
= lim_{n→∞} Π_{i=0}^{n-1} E[e^{f(s_i)(N_{s_{i+1}} - N_{s_i})}] = lim_{n→∞} Π_{i=0}^{n-1} e^{λ(e^{f(s_i)} - 1)(s_{i+1} - s_i)}
= lim_{n→∞} e^{λ Σ_{i=0}^{n-1} (e^{f(s_i)} - 1)(s_{i+1} - s_i)} = e^{λ∫_0^t (e^{f(s)} - 1) ds}.
Since f is continuous, the Poisson integral ∫_0^t f(s) dN_s can be computed in terms of the waiting times S_k:
∫_0^t f(s) dN_s = Σ_{k=1}^{N_t} f(S_k).
This formula can be used to give another proof of the previous result. For instance, taking the expectation and conditioning on N_t = n yields
E[∫_0^t f(s) dN_s] = E[Σ_{k=1}^{N_t} f(S_k)] = Σ_{n≥0} E[Σ_{k=1}^{n} f(S_k) | N_t = n] P(N_t = n)
= Σ_{n≥0} (n/t)(∫_0^t f(x) dx) e^{-λt}(λt)ⁿ/n!
= e^{-λt} (1/t)(∫_0^t f(x) dx) Σ_{n≥1} (λt)ⁿ/(n - 1)!
= e^{-λt} (1/t)(∫_0^t f(x) dx) λt e^{λt} = λ∫_0^t f(x) dx.
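The jump-time representation also makes Proposition 5.8.5 (a) easy to check by simulation: the integral is just the sum of f evaluated at the Poisson jump instants. A minimal Python sketch (numpy assumed; f(s) = s², λ, t and the sample count are arbitrary choices) compares the sample mean with λ∫_0^t s² ds = λt³/3:

    import numpy as np

    rng = np.random.default_rng(9)
    lam, t, n_paths = 3.0, 2.0, 50000
    vals = np.empty(n_paths)
    for i in range(n_paths):
        jumps = np.cumsum(rng.exponential(1/lam, size=50))   # enough Exp(lambda) waiting times
        jumps = jumps[jumps <= t]                             # jump instants S_k in [0, t]
        vals[i] = np.sum(jumps**2)                            # int_0^t s^2 dN_s for this path
    print("simulated:", vals.mean(), " theory:", lam * t**3 / 3)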
Exercise 5.8.6 Solve parts (b) and (c) of Proposition 5.8.5 using a similar idea to the one presented above.
Exercise 5.8.7 Show that
E[(∫_0^t f(s) dM_s)²] = λ∫_0^t f(s)² ds,
where M_t = N_t - λt is the compensated Poisson process.
Exercise 5.8.8 Prove that
Var[∫_0^t f(s) dN_s] = λ∫_0^t f(s)² ds.
Exercise 5.8.9 Find E[e^{∫_0^t f(s) dM_s}].
Proposition 5.8.10 Let F_t = σ(N_s; 0 ≤ s ≤ t). Then for any constant c, the process
M_t = e^{cN_t + λ(1 - e^c)t},   t ≥ 0
is an F_t-martingale.
Proof: Let s < t. Since N_t - N_s is independent of F_s and N_t is stationary, we have
E[e^{c(N_t - N_s)} | F_s] = E[e^{c(N_t - N_s)}] = E[e^{cN_{t-s}}] = e^{λ(e^c - 1)(t - s)}.
On the other side, taking out the predictable part yields
E[e^{c(N_t - N_s)} | F_s] = e^{-cN_s} E[e^{cN_t} | F_s].
Equating the last two relations we arrive at
E[e^{cN_t + λ(1 - e^c)t} | F_s] = e^{cN_s + λ(1 - e^c)s},
which is equivalent with the martingale condition E[M_t | F_s] = M_s.
We shall present an application of the previous result. Consider the waiting time until the n-th jump, S_n = inf{t > 0; N_t = n}, which is a stopping time, and the filtration F_t = σ(N_s; 0 ≤ s ≤ t). Since
M_t = e^{cN_t + λ(1 - e^c)t}
is an F_t-martingale, by the Optional Stopping Theorem (Theorem 4.2.1) we have E[M_{S_n}] = E[M_0] = 1, which is equivalent with E[e^{λ(1 - e^c)S_n}] = e^{-cn}. Substituting s = -λ(1 - e^c), then c = ln(1 + s/λ). Since s, λ > 0, then c > 0. The previous expression becomes
E[e^{-sS_n}] = e^{-n ln(1 + s/λ)} = (λ/(λ + s))ⁿ.
Since the expectation on the left side is the Laplace transform of the probability density of S_n, then
p(t) = L^{-1}{E[e^{-sS_n}]}(t) = L^{-1}{(λ/(λ + s))ⁿ}(t) = e^{-λt} t^{n-1} λⁿ / Γ(n),
which shows that S_n has a gamma distribution.
5.9 The distribution function of XT = ∫_0^T g(t) dNt
In this section we consider the function g(t) continuous. Let S_1 < S_2 < ⋯ < S_{N_T} denote the waiting times until time T. Since the increments dN_t are equal to 1 at S_k and 0 otherwise, the integral can be written as
X_T = ∫_0^T g(t) dN_t = g(S_1) + ⋯ + g(S_{N_T}).
The distribution function of the random variable X_T = ∫_0^T g(t) dN_t can be obtained by conditioning on N_T:
P(X_T ≤ u) = Σ_{k≥0} P(X_T ≤ u | N_T = k) P(N_T = k)
= Σ_{k≥0} P(g(S_1) + ⋯ + g(S_{N_T}) ≤ u | N_T = k) P(N_T = k)
= Σ_{k≥0} P(g(S_1) + ⋯ + g(S_k) ≤ u) P(N_T = k).   (5.9.5)
Considering S_1, S_2, ⋯, S_k independent and uniformly distributed over the interval [0, T], we have
P(g(S_1) + ⋯ + g(S_k) ≤ u) = ∫_{D_k} (1/T^k) dx_1 ⋯ dx_k = vol(D_k)/T^k,
where
D_k = {g(x_1) + g(x_2) + ⋯ + g(x_k) ≤ u} ∩ {0 ≤ x_1, ⋯, x_k ≤ T}.
Substituting back in (5.9.5) yields
P(X_T ≤ u) = Σ_{k≥0} P(g(S_1) + ⋯ + g(S_k) ≤ u) P(N_T = k)
= Σ_{k≥0} (vol(D_k)/T^k) e^{-λT}(λT)^k/k! = e^{-λT} Σ_{k≥0} λ^k vol(D_k)/k!.   (5.9.6)
In general, the volume of the k-dimensional solid D_k is not easy to obtain. However, there are simple cases when it can be computed explicitly.
A Particular Case. We shall do an explicit computation of the distribution function of X_T = ∫_0^T s² dN_s. In this case the solid D_k is the intersection between the k-dimensional ball of radius √u centered at the origin and the k-dimensional cube [0, T]^k. There are three possible shapes for D_k, depending on the size of √u:
(a) if 0 ≤ √u < T, then D_k is a (1/2^k)-part of a k-dimensional ball;
(b) if T ≤ √u < T√k, then D_k has a complicated shape;
(c) if T√k ≤ √u, then D_k is the entire k-dimensional cube, and then vol(D_k) = T^k.
Since the volume of the k-dimensional ball of radius R is given by π^{k/2}R^k/Γ(k/2 + 1), the volume of D_k in case (a) becomes
vol(D_k) = π^{k/2}u^{k/2}/(2^k Γ(k/2 + 1)).
Substituting in (5.9.6) yields
P(X_T ≤ u) = e^{-λT} Σ_{k≥0} (λ√(πu)/2)^k / (k! Γ(k/2 + 1)),   0 ≤ √u < T.
It is worth noting that for u → ∞ the inequality T√k ≤ √u is satisfied for each fixed k ≥ 0; hence relation (5.9.6) yields
lim_{u→∞} P(X_T ≤ u) = e^{-λT} Σ_{k≥0} λ^k T^k/k! = e^{-λT} e^{λT} = 1.
The computation in case (b) is more complicated and will be omitted.
Exercise 5.9.1 Calculate the expectation E[∫_0^T e^{ks} dN_s] and the variance Var[∫_0^T e^{ks} dN_s].
Exercise 5.9.2 Compute the distribution function of X_T = ∫_0^T s dN_s.
Chapter 6
Stochastic Differentiation
6.1 Differentiation Rules
Most stochastic processes are not differentiable. For instance, the Brownian motion process W_t is a continuous process which is nowhere differentiable. Hence, derivatives like dW_t/dt do not make sense in stochastic calculus. The only quantities allowed to be used are the infinitesimal changes of the process, in our case dW_t.
The infinitesimal change of a process
The change in the process Xt between instances t and t + ∆t is given by ∆Xt =
Xt+∆t − Xt . When ∆t is infinitesimally small, we obtain the infinitesimal change of a
process Xt
dXt = Xt+dt − Xt .
Sometimes it is useful to use the equivalent formula Xt+dt = Xt + dXt .
6.2
Basic Rules
The following rules are the analog of some familiar differentiation rules from elementary
Calculus.
The constant multiple rule
If Xt is a stochastic process and c is a constant, then
d(c Xt ) = c dXt .
The verification follows from a straightforward application of the infinitesimal change
formula
d(c Xt ) = c Xt+dt − c Xt = c(Xt+dt − Xt ) = c dXt .
The sum rule
If Xt and Yt are two stochastic processes, then
d(Xt + Yt ) = dXt + dYt .
The verification is as in the following:
d(Xt + Yt ) = (Xt+dt + Yt+dt ) − (Xt + Yt )
= (Xt+dt − Xt ) + (Yt+dt − Yt )
= dXt + dYt .
The difference rule
If Xt and Yt are two stochastic processes, then
d(Xt − Yt ) = dXt − dYt .
The proof is similar to the one for the sum rule.
The product rule
If Xt and Yt are two stochastic processes, then
d(Xt Yt ) = Xt dYt + Yt dXt + dXt dYt .
The proof is as follows:
d(Xt Yt ) = Xt+dt Yt+dt − Xt Yt
= Xt (Yt+dt − Yt ) + Yt (Xt+dt − Xt ) + (Xt+dt − Xt )(Yt+dt − Yt )
= Xt dYt + Yt dXt + dXt dYt ,
where the second identity is verified by direct computation.
If the process Xt is replaced by the deterministic function f (t), then the aforementioned formula becomes
d(f (t)Yt ) = f (t) dYt + Yt df (t) + df (t) dYt .
Since in most practical cases the process Yt is an Ito diffusion
dYt = a(t, Wt )dt + b(t, Wt )dWt ,
using the relations dt dWt = dt2 = 0, the last term vanishes
df (t) dYt = f ′ (t)dtdYt = 0,
and hence
d(f (t)Yt ) = f (t) dYt + Yt df (t).
This relation looks like the usual product rule.
The quotient rule
If X_t and Y_t are two stochastic processes, then
d(X_t/Y_t) = (Y_t dX_t - X_t dY_t - dX_t dY_t)/Y_t² + (X_t/Y_t³)(dY_t)².
The proof follows from Ito's formula and will be addressed in section 6.3.3.
When the process Y_t is replaced by the deterministic function f(t), and X_t is an Ito diffusion, then the previous formula becomes
d(X_t/f(t)) = (f(t) dX_t - X_t df(t))/f(t)².
Example 6.2.1 We shall show that
d(Wt2 ) = 2Wt dWt + dt.
Applying the product rule and the fundamental relation (dWt )2 = dt, yields
d(Wt2 ) = Wt dWt + Wt dWt + dWt dWt = 2Wt dWt + dt.
Example 6.2.2 Show that
d(Wt3 ) = 3Wt2 dWt + 3Wt dt.
Applying the product rule and the previous exercise yields
d(Wt3 ) = d(Wt · Wt2 ) = Wt d(Wt2 ) + Wt2 dWt + d(Wt2 ) dWt
= Wt (2Wt dWt + dt) + Wt2 dWt + dWt (2Wt dWt + dt)
= 2Wt2 dWt + Wt dt + Wt2 dWt + 2Wt (dWt )2 + dt dWt
= 3Wt2 dWt + 3Wt dt,
where we used (dWt )2 = dt and dt dWt = 0.
Example 6.2.3 Show that d(tWt ) = Wt dt + t dWt .
Using the product rule and dt dWt = 0, we get
d(tWt ) = Wt dt + t dWt + dt dWt
= Wt dt + t dWt .
Example 6.2.4 Let Z_t = ∫_0^t W_u du be the integrated Brownian motion. Show that
dZ_t = W_t dt.
The infinitesimal change of Z_t is
dZ_t = Z_{t+dt} - Z_t = ∫_t^{t+dt} W_s ds = W_t dt,
since W_s is a continuous function in s.
Example 6.2.5 Let A_t = (1/t)Z_t = (1/t)∫_0^t W_u du be the average of the Brownian motion on the time interval [0, t]. Show that
dA_t = (1/t)(W_t - (1/t)Z_t) dt.
We have
dA_t = d(1/t) Z_t + (1/t) dZ_t + d(1/t) dZ_t
= -(1/t²)Z_t dt + (1/t)W_t dt - (1/t²)W_t dt²
= (1/t)(W_t - (1/t)Z_t) dt,
where we used dt² = 0.
Exercise 6.2.1 Let G_t = (1/t)∫_0^t e^{W_u} du be the average of the geometric Brownian motion on [0, t]. Find dG_t.
6.3 Ito's Formula
Ito's formula is the analog of the chain rule from elementary Calculus. We shall start by reviewing a few concepts regarding function approximations.
Let f be a twice continuously differentiable function of a real variable x. Let x_0 be fixed and consider the changes Δx = x - x_0 and Δf(x) = f(x) - f(x_0). It is known from Calculus that the following second order Taylor approximation holds:
Δf(x) = f′(x)Δx + (1/2)f″(x)(Δx)² + O(Δx)³.
When x is infinitesimally close to x_0, we replace Δx by the differential dx and obtain
df(x) = f′(x)dx + (1/2)f″(x)(dx)² + O(dx)³.   (6.3.1)
In elementary Calculus, all terms involving powers of dx equal to or higher than dx² are neglected; then the aforementioned formula becomes
df(x) = f′(x)dx.
Now, if we consider x = x(t) to be a differentiable function of t, substituting into the previous formula we obtain the differential form of the well known chain rule
df(x(t)) = f′(x(t)) dx(t) = f′(x(t)) x′(t) dt.
We shall present a similar formula for the stochastic environment. In this case the deterministic function x(t) is replaced by a stochastic process X_t. The composition between the differentiable function f and the process X_t is denoted by F_t = f(X_t).
Neglecting the increment powers higher than or equal to (dX_t)³, the expression (6.3.1) becomes
dF_t = f′(X_t) dX_t + (1/2)f″(X_t)(dX_t)².   (6.3.2)
In the computation of dX_t we may take into account stochastic relations such as dW_t² = dt, or dt dW_t = 0.
Theorem 6.3.1 (Ito’s formula) Consider Xt a stochastic process satisfying
dXt = bt dt + σt dWt ,
with bt (ω) and σt (ω) ”good behaving” processes. Let Ft = f (Xt ), with f twice continuously differentiable. Then
h
i
σ2
dFt = bt f ′ (Xt ) + t f ′′ (Xt ) dt + σt f ′ (Xt ) dWt .
2
(6.3.3)
Proof: We shall provide an informal proof. Using relations dWt2 = dt and dt2 =
dWt dt = 0, we have
(dXt )2 =
bt dt + σt dWt
2
= b2t dt2 + 2bt σt dWt dt + σt2 dWt2
= σt2 dt.
Substituting into (6.3.2) yields
2
1
dFt = f ′ Xt dXt + f ′′ Xt dXt
2
1
= f ′ Xt bt dt + σt dWt + f ′′ Xt σt2 dt
2
h
i
2
σ
= bt f ′ (Xt ) + t f ′′ (Xt ) dt + σt f ′ (Xt ) dWt .
2
In the case Xt = Wt we obtain the following consequence:
122
Corollary 6.3.2 Let Ft = f (Wt ). Then
1
dFt = f ′′ (Wt )dt + f ′ (Wt ) dWt .
2
(6.3.4)
Particular cases
1. If f (x) = xα , with α constant, then f ′ (x) = αxα−1 and f ′′ (x) = α(α − 1)xα−2 . Then
(6.3.4) becomes the following useful formula
d(Wtα ) =
1
α(α − 1)Wtα−2 dt + αWtα−1 dWt .
2
A couple of useful cases easily follow:
d(Wt2 ) = 2Wt dWt + dt
d(Wt3 ) = 3Wt2 dWt + 3Wt dt.
2. If f (x) = ekx , with k constant, f ′ (x) = kekx , f ′′ (x) = k2 ekx . Therefore
1
d(ekWt ) = kekWt dWt + k2 ekWt dt.
2
In particular, for k = 1, we obtain the increments of a geometric Brownian motion
1
d(eWt ) = eWt dWt + eWt dt.
2
3. If f (x) = sin x, then
d(sin Wt ) = cos Wt dWt −
1
sin Wt dt.
2
Exercise 6.3.3 Use the previous rules to find the following increments
(a) d(Wt eWt )
(b) d(3Wt2 + 2e5Wt )
2
(c) d(et+Wt )
(d) d (t + Wt )n .
1 Z t
Wu du
(e) d
t 0
1 Z t
eWu du , where α is a constant.
(f ) d α
t 0
123
In the case when the function f = f (t, x) is also time dependent, the analog of
(6.3.1) is given by
1
df (t, x) = ∂t f (t, x)dt + ∂x f (t, x)dx + ∂x2 f (t, x)(dx)2 + O(dx)3 + O(dt)2 .
2
(6.3.5)
Substituting x = Xt yields
1
df (t, Xt ) = ∂t f (t, Xt )dt + ∂x f (t, Xt )dXt + ∂x2 f (t, Xt )(dXt )2 .
2
(6.3.6)
If Xt is an Ito diffusion we obtain an extra-term in formula (6.3.3)
dFt =
h
∂t f (t, Xt ) + a(Wt , t)∂x f (t, Xt ) +
+b(Wt , t)∂x f (t, Xt ) dWt .
i
b(Wt , t)2 2
∂x f (t, Xt ) dt
2
(6.3.7)
Exercise 6.3.4 Show that
d(tWt2 ) = (t + Wt2 )dt + 2tWt dWt .
Exercise 6.3.5 Find the following increments
(a) d(tWt )
(b) d(et Wt )
6.3.1
(c) d(t2 cos Wt )
(d) d(sin t Wt2 ).
Ito diffusions
Consider the process Xt given by
dXt = b(Xt , t)dt + σ(Xt , t)dWt .
(6.3.8)
A process Xt = (Xti ) ∈ Rn satisfying this relation is called an Ito diffusion in Rn .
Equation (6.3.8) models the position of a small particle that moves under the influence
of a drift force b(Xt , t), and is subject to random deviations. This situation occurs in
the physical world when a particle suspended in a moving liquid is subject to random
molecular bombardments. The amount 12 σσ T is called the diffusion coefficient and
describes the difussion of the particle.
Exercise 6.3.6 Consider the time-homogeneous Ito diffusion
dXt = b(Xt )dt + σ(Xt )dWt .
Show that:
E[(Xti − xi )]
.
t→0
t
E[(Xti − xi )(Xtj − xj )]
.
(b) (σσ T )ij (x) = lim
t→0
t
(a) bi (x) = lim
124
6.3.2
Ito’s formula for Poisson processes
Consider the process Ft = F (Mt ), where Mt = Nt − λt is the compensated Poisson
process. Ito’s formula for the process Ft takes the following integral form. For a proof
the reader can consult Kuo [?].
Proposition 6.3.7 Let F be a twice differentiable function. Then for any a < t we
have
Z t
X F ′ (Ms− ) dMs +
∆F (Ms ) − F ′ (Ms− )∆Ms ,
Ft = Fa +
a
a<s≤t
where ∆Ms = Ms − Ms− and ∆F (Ms ) = F (Ms ) − F (Ms− ).
We shall apply the aforementioned result for the case Ft = F (Mt ) = Mt2 . We have
Z t
X 2
Ms− dMs +
Ms2 − Ms−
− 2Ms− (Ms − Ms− ) .
(6.3.9)
Mt2 = Ma2 + 2
a
a<s≤t
Since the jumps in Ns are of size 1, we have (∆Ns )2 = ∆Ns . Since the difference of the
processes Ms and Ns is continuous, then ∆Ms = ∆Ns . Using these formulas we have
2
Ms2 − Ms−
− 2Ms− (Ms − Ms− )
= (Ms − Ms− ) Ms + Ms− − 2Ms−
= (Ms − Ms− )2 = (∆Ms )2 = (∆Ns )2
= ∆Ns = Ns − Ns− .
P
Since the sum of the jumps between s and t is a<s≤t ∆Ns = Nt − Na , formula (6.3.9)
becomes
Z t
Ms− dMs + Nt − Na .
(6.3.10)
Mt2 = Ma2 + 2
a
The differential form is
d(Mt2 ) = 2Mt− dMt + dNt ,
which is equivalent with
d(Mt2 ) = (1 + 2Mt− ) dMt + λdt,
since dNt = dMt + λdt.
Exercise 6.3.8 Show that
Z
T
0
1
Mt− dMt = (MT2 − NT ).
2
Exercise 6.3.9 Use Ito’s formula for the Poison process to find the conditional expectation E[Mt2 |Fs ] for s < t.
125
6.3.3
Ito’s multidimensional formula
If the process Ft depends on several Ito diffusions, say Ft = f (t, Xt , Yt ), then a similar
formula to (6.3.7) leads to
dFt =
∂f
∂f
∂f
(t, Xt , Yt )dt +
(t, Xt , Yt )dXt +
(t, Xt , Yt )dYt
∂t
∂x
∂y
1 ∂2f
1 ∂2f
+
(t, Xt , Yt )(dXt )2 +
(t, Xt , Yt )(dYt )2
2
2 ∂x
2 ∂y 2
∂2f
+
(t, Xt , Yt )dXt dYt .
∂x∂y
Particular cases
In the case when Ft = f (Xt , Yt ), with Xt = Wt1 , Yt = Wt2 independent Brownian
motions, we have
∂f
∂f
dWt1 +
dWt2 +
∂x
∂y
1 ∂2f
dWt1 dWt2
+
2 ∂x∂y
∂f
∂f
=
dWt1 +
dWt2 +
∂x
∂y
dFt =
The expression
∆f =
1 ∂2f
1 ∂2f
1 2
(dW
)
+
(dWt2 )2
t
2 ∂x2
2 ∂y 2
1 ∂2f
∂2f +
dt
2 ∂x2
∂y 2
∂2f 1 ∂2f
+
2 ∂x2
∂y 2
is called the Laplacian of f . We can rewrite the previous formula as
dFt =
∂f
∂f
dWt1 +
dWt2 + ∆f dt
∂x
∂y
A function f with ∆f = 0 is called harmonic. The aforementioned formula in the case
of harmonic functions takes the simple form
dFt =
∂f
∂f
dWt1 +
dWt2 .
∂x
∂y
(6.3.11)
Exercise 6.3.10 Let Wt1 , Wt2 be two independent Brownian motions. If the function
f is harmonic, show that Ft = f (Wt1 , Wt2 ) is a martingale. Is the converse true?
Exercise 6.3.11 Use the previous formulas to find dFt in the following cases
(a)
Ft = (Wt1 )2 + (Wt2 )2
(b)
Ft = ln[(Wt1 )2 + (Wt2 )2 ].
126
p
Exercise 6.3.12 Consider the Bessel process Rt = (Wt1 )2 + (Wt2 )2 , where Wt1 and
Wt2 are two independent Brownian motions. Prove that
dRt =
1
W1
W2
dt + t dWt1 + t dWt2 .
2Rt
Rt
Rt
Example 6.3.1 (The product rule) Let Xt and Yt be two Ito difussions. Show that
d(Xt Yt ) = Yt dXt + Xt dYt + dXt dYt .
Consider the function f (x, y) = xy. Since ∂x f = y, ∂y f = x, ∂x2 f = ∂y2 f = 0, ∂x ∂y = 1,
then Ito’s multidimensional formula yields
d(Xt Yt ) = d f (X, Yt ) = ∂x f dXt + ∂y f dYt
1
1
+ ∂x2 f (dXt )2 + ∂y2 f (dYt )2 + ∂x ∂y f dXt dYt
2
2
= Yt dXt + Xt dYt + dXt dYt .
Example 6.3.2 (The quotient rule) Let Xt and Yt be two Ito difussions. Show that
X Y dX − X dY − dX dY
Xt
t
t
t
t
t
t
t
d
+ 2 (dYt )2 .
=
Yt
Yt2
Yt
Consider the function f (x, y) = xy . Since ∂x f = y1 , ∂y f = − yx2 , ∂x2 f = 0, ∂y2 f = − yx2 ,
∂x ∂y = y12 , then applying Ito’s multidimensional formula yields
X t
= d f (X, Yt ) = ∂x f dXt + ∂y f dYt
d
Yt
1
1
+ ∂x2 f (dXt )2 + ∂y2 f (dYt )2 + ∂x ∂y f dXt dYt
2
2
1
1
Xt
=
dXt − 2 dYt − 2 dXt dYt
Yt
Yt
Yt
Yt dXt − Xt dYt − dXt dYt
Xt
=
+ 2 (dYt )2 .
2
Yt
Yt
Chapter 7
Stochastic Integration Techniques
Computing a stochastic integral starting from the definition of the Ito integral is a quite
inefficient method. Like in the elementary Calculus, several methods can be developed
to compute stochastic integrals. We tried to keep the analogy with the elementary
Calculus as much as possible. The integration by substitution is more complicated in
the stochastic environment and we have considered only a particular case of it, which
we called The method of heat equation.
7.1
Notational Conventions
The intent of this section is to discuss some equivalent integral notations for a stochastic
differential equation. Consider a process Xt whose increments satisfy the stochastic differential equation dXt = f (t, Wt )dWt . This can be written equivalently in the integral
form as
Z
t
dXs =
a
Z
t
f (s, Ws )dWs .
(7.1.1)
a
If we consider the partition 0 = t0 < t1 < · · · < tn−1 < tn = t, then the left side
becomes
Z t
n−1
X
(Xtj+1 − Xtj ) = Xt − Xa ,
dXs = ms-lim
a
n→∞
j=0
after cancelling the terms in pairs. Substituting into formula (7.1.1) yields the equivalent form
Z t
f (s, Ws )dWs .
Xt = Xa +
a
Z t
f (s, Ws )dWs , since Xa is a constant. Using
This can be written also as dXt = d
a
dXt = f (t, Wt )dWt , the previous formula can be written in the following two equivalent
ways:
127
128
(i) For any a < t, we have
Z t
d
f (s, Ws )dWs = f (t, Wt )dWt .
a
(ii) If Yt is a stochastic process, such that Yt dWt = dFt , then
Z
b
a
Yt dWt = Fb − Fa .
These formulas are equivalent ways of writing the stochastic differential equation (7.1.1),
and are useful in computations. A few applications follow.
Example 7.1.1 Verify the stochastic formula
Z t
t
W2
Ws dWs = t − .
2
2
0
Let Xt =
Rt
t
Wt2
− . From Ito’s formula
2
2
t 1
W 2 1
t
−d
= (2Wt dWt + dt) − dt = Wt dWt ,
dYt = d
2
2
2
2
0 Ws dWs and Yt =
and from the formula (i) we get
Z t
dXt = d
Ws dWs = Wt dWt .
0
Hence dXt = dYt , or d(Xt − Yt ) = 0. Since the process Xt − Yt has zero increments,
then Xt − Yt = c, constant. Taking t = 0, yields
Z 0
W 2 0
0
Ws dWs −
c = X0 − Y0 =
= 0,
−
2
2
0
and hence c = 0. It follows that Xt = Yt , which verifies the desired relation.
Example 7.1.2 Verify the formula
Z t
Z
t 2 t 1 t 2
sWs dWs =
Wt −
−
W ds.
2
2
2 0 s
0
Rt
t 2
Wt − 1 , and Zt =
Consider the stochastic processes Xt = 0 sWs dWs , Yt =
2
R
1 t
2 ds. Formula (i) yields
W
s
2 0
dXt = tWt dWt
1 2
W dt.
dZt =
2 t
129
Applying Ito’s formula, we get
t2 t
t 1
Wt2 −
= d(tWt2 ) − d
dYt = d
2
2
2
4
i 1
1h
=
(t + Wt2 )dt + 2tWt dWt − tdt
2
2
1 2
=
W dt + tWt dWt .
2 t
We can easily see that
dXt = dYt − dZt .
This implies d(Xt −Yt +Zt ) = 0, i.e. Xt −Yt +Zt = c, constant. Since X0 = Y0 = Z0 = 0,
it follows that c = 0. This proves the desired relation.
Example 7.1.3 Show that
Z
t
0
1
(Ws2 − s) dWs = Wt3 − tWt .
3
Consider the function f (t, x) = 31 x3 − tx, and let Ft = f (t, Wt ). Since ∂t f = −x,
∂x f = x2 − t, and ∂x2 f = 2x, then Ito’s formula provides
1
dFt = ∂t f dt + ∂x f dWt + ∂x2 f (dWt )2
2
1
2
= −Wt dt + (Wt − t) dWt + 2Wt dt
2
= (Wt2 − t)dWt .
From formula (i) we get
Z t
Z t
1
dFs = Ft − F0 = Ft = Wt3 − tWt .
(Ws2 − s) dWs =
3
0
0
Exercise
Z t
Z t 7.1.1 Show that
1
2
ln Ws dWs = 2Wt (ln Wt − 1) −
(a)
ds;
0 Ws
0
Z t
s
t
e 2 cos Ws dWs = e 2 sin Wt ;
(b)
Z0 t
t
s
e 2 sin Ws dWs = 1 − e 2 cos Wt ;
(c)
Z0 t
t
s
eWs − 2 dWs = eWt − 2 − 1;
(d)
Z 0t
Z
1 t
(e)
cos Ws dWs = sin Wt +
sin Ws ds;
2 0
0
Z
Z t
1 t
cos Ws ds.
sin Ws dWs = 1 − cos Wt −
(f )
2 0
0
130
7.2
Stochastic Integration by Parts
Consider the process Ft = f (t)g(Wt ), with f and g differentiable. Using the product
rule yields
dFt = df (t) g(Wt ) + f (t) dg(Wt )
1
= f ′ (t)g(Wt )dt + f (t) g ′ (Wt )dWt + g′′ (Wt )dt
2
1
′′
′
= f (t)g(Wt )dt + f (t)g (Wt )dt + f (t)g′ (Wt )dWt .
2
Writing the relation in the integral form, we obtain the first integration by parts formula:
Z b
Z
b Z b
1 b
f ′ (t)g(Wt ) dt −
f (t)g′ (Wt ) dWt = f (t)g(Wt ) −
f (t)g′′ (Wt ) dt.
2
a
a
a
a
This formula is to be used when integrating a product between a function of t and
a function of the Brownian motion Wt , for which an antiderivative is known. The
following two particular cases are important and useful in applications.
1. If g(Wt ) = Wt , the aforementioned formula takes the simple form
Z
b
a
t=b Z b
f ′ (t)Wt dt.
−
f (t) dWt = f (t)Wt t=a
(7.2.2)
a
It is worth noting that the left side is a Wiener integral.
2. If f (t) = 1, then the formula becomes
Z
a
b
t=b 1 Z b
−
g (Wt ) dWt = g(Wt )
g′′ (Wt ) dt.
t=a
2 a
′
Application 1 Consider the Wiener integral IT =
Z
(7.2.3)
T
t dWt . From the general theory,
0
see Proposition 5.6.1, it is known that I is a random variable normally distributed with
mean 0 and variance
Z T
T3
t2 dt =
.
V ar[IT ] =
3
0
Recall the definition of integrated Brownian motion
Z t
Wu du.
Zt =
0
Formula (7.2.2) yields a relationship between I and the integrated Brownian motion
Z T
Z T
Wt dt = T WT − ZT ,
t dWt = T WT −
IT =
0
0
131
and hence IT + ZT = T WT . This relation can be used to compute the covariance
between IT and ZT .
Cov(IT + ZT , IT + ZT ) = V ar[T WT ] ⇐⇒
V ar[IT ] + V ar[ZT ] + 2Cov(IT , ZT ) = T 2 V ar[WT ] ⇐⇒
T 3 /3 + T 3 /3 + 2Cov(IT , ZT ) = T 3 ⇐⇒
Cov(IT , ZT ) = T 3 /6,
where we used that V ar[ZT ] = T 3 /3. The processes It and Zt are not independent.
Their correlation coefficient is 0.5 as the following calculation shows
Corr(IT , ZT ) =
T 3 /6
Cov(IT , ZT )
1/2 = T 3 /3
V ar[IT ]V ar[ZT ]
= 1/2.
x2
2
Application 2 If we let g(x) =
Z
in formula (7.2.3), we get
b
Wt dWt =
a
Wb2 − Wa2 1
− (b − a).
2
2
It is worth noting that letting a = 0 and b = T , we retrieve a formula that was proved
by direct methods in chapter 3
Z
T
Wt dWt =
0
Similarly, if we let g(x) =
x3
3
Z
a
WT2
T
− .
2
2
in (7.2.3) yields
b
Wt2 dWt
Z b
Wt3 b
Wt dt.
=
−
3 a
a
Application 3
Choosing f (t) = eαt and g(x) = cos x, we shall compute the stochastic integral
R T αt
0 e cos Wt dWt using the formula of integration by parts
Z T
Z T
αt
eαt (sin Wt )′ dWt
e cos Wt dWt =
0
0
Z
T Z T
1 T αt
e (cos Wt )′′ dt
(eαt )′ sin Wt dt −
= eαt sin Wt −
2
0
0
0
Z T
Z T
1
eαt sin Wt dt +
= eαT sin WT − α
eαt sin Wt dt
2
0
0
Z T
1
eαt sin Wt dt.
= eαT sin WT − α −
2 0
132
The particular case α =
1
2
leads to the following exact formula of a stochastic integral
Z
T
t
T
(7.2.4)
e 2 cos Wt dWt = e 2 sin WT .
0
RT
In a similar way, we can obtain an exact formula for the stochastic integral 0 eβt sin Wt dWt
as follows
Z T
Z T
βt
eβt (cos Wt )′ dWt
e sin Wt dWt = −
0
0
Z T
Z
T
1 T βt
eβt cos Wt dt −
= −eβt cos Wt + β
e cos Wt dt.
0
2 0
0
Taking β =
1
2
yields the closed form formula
Z
T
t
T
e 2 sin Wt dWt = 1 − e 2 cos WT .
0
(7.2.5)
A consequence of the last two formulas and of Euler’s formula
eiWt = cos Wt + i sin Wt ,
is
Z
T
0
T
t
e 2 +iWt dWt = i(1 − e 2 +iWT ).
The proof details are left to the reader.
A general form of the integration by parts formula
In general, if Xt and Yt are two Ito diffusions, from the product formula
d(Xt Yt ) = Xt dYt + Yt dXt + dXt dYt .
Integrating between the limits a and b
Z b
Z b
Z b
Z b
dXt dYt .
Yt dXt +
Xt dYt +
d(Xt Yt ) =
a
a
a
a
From the Fundamental Theorem
Z b
d(Xt Yt ) = Xb Yb − Xa Ya ,
a
so the previous formula takes the following form of integration by parts
Z
b
a
Xt dYt = Xb Yb − Xa Ya −
Z
a
b
Yt dXt −
Z
b
dXt dYt .
a
This formula is of theoretical value. In practice, the term dXt dYt needs to be computed
using the rules Wt2 = dt, and dt dWt = 0.
133
Exercise 7.2.1 (a) Use integration by parts to get
Z
T
0
1
dWt = tan−1 (WT ) +
1 + Wt2
(b) Show that
E[tan
−1
(WT )] = −
Z
Z
T
0
T
E
0
h
Wt
dt,
(1 + Wt2 )2
T > 0.
i
Wt
dt.
(1 + Wt2 )2
(c) Prove the double inequality
√
√
3 3
x
3 3
−
≤
≤
,
16
(1 + x2 )2
16
∀x ∈ R.
(d) Use part (c) to obtain
√
√
Z T
3 3
3 3
Wt
−
T ≤
2 2 dt ≤ 16 T.
16
0 (1 + Wt )
(e) Use part (d) to get
√
√
3 3
3 3
−1
−
T ≤ E[tan (WT )] ≤
T.
16
16
(f ) Does part (e) contradict the inequality
−
π
π
< tan−1 (WT ) < ?
2
2
Exercise 7.2.2 (a) Show the relation
Z
T
Wt
e
WT
dWt = e
0
1
−1−
2
Z
T
eWt dt.
0
(b) Use part (a) to find E[eWt ].
Exercise 7.2.3 (a) Use integration by parts to show
Z
T
0
Wt eWt dWt = 1 + WT eWT − eWT −
1
2
Z
T
eWt (1 + Wt ) dt;
0
(b) Use part (a) to find E[Wt eWt ];
(c) Show that Cov(Wt , eWt ) = tet/2 ;
r
(d) Prove that Corr(Wt, eWt ) =
et
t
, and compute the limits as t → 0 and t → ∞.
−1
134
Exercise 7.2.4 (a) Let T > 0. Show the following relation using integration by parts
Z
T
0
2Wt
dWt = ln(1 + WT2 ) −
1 + Wt2
Z
T
1 − Wt2
dt.
(1 + Wt2 )2
0
(b) Show that for any real number x the following double inequality holds
−
1
1 − x2
≤
≤ 1.
8
(1 + x2 )2
(c) Use part (b) to show that
1
− T ≤
8
Z
T
0
1 − Wt2
dt ≤ T.
(1 + Wt2 )2
(d) Use parts (a) and (c) to get
−
T
≤ E[ln(1 + WT2 )] ≤ T.
8
(e) Use Jensen’s inequality to get
E[ln(1 + WT2 )] ≤ ln(1 + T ).
Does this contradict the upper bound provided in (d)?
Exercise 7.2.5 Use integration by parts to show
Z
t
0
1
1
arctan Ws dWs = Wt arctan Wt − ln(1 + Wt2 ) −
2
2
Z
t
0
1
ds.
1 + Ws2
Exercise 7.2.6 (a) Using integration by parts prove the identity
Z
t
Ws
Ws e
Wt
dWs = 1 + e
0
1
(Wt − 1) −
2
Z
t
(1 + Ws )eWs ds;
0
(b) Use part (a) to compute E[Wt eWt ].
Exercise 7.2.7 Check the following formulas using integration by parts
Z t
Z
1 t
Ws sin Ws dWs = sin Wt − Wt cos Wt −
(a)
(sin Ws + Ws cos Ws ) ds;
2 0
0
Z
Z t
q
1 t
Ws
2
p
dWs = 1 + Wt − 1 −
(1 + Ws2 )−3/2 ds.
(b)
2 0
1 + Ws2
0
135
7.3
The Heat Equation Method
In elementary Calculus, integration by substitution is the inverse application of the
chain rule. In the stochastic environment, this will be the inverse application of Ito’s
formula. This is difficult to apply in general, but there is a particular case of great
importance.
Let ϕ(t, x) be a solution of the equation
1
∂t ϕ + ∂x2 ϕ = 0.
2
(7.3.6)
This is called the heat equation without sources. The non-homogeneous equation
1
∂t ϕ + ∂x2 ϕ = G(t, x)
2
(7.3.7)
is called heat equation with sources. The function G(t, x) represents the density of heat
sources, while the function ϕ(t, x) is the temperature at point x at time t in a onedimensional wire. If the heat source is time independent, then G = G(x), i.e. G is a
function of x only.
Example 7.3.1 Find all solutions of the equation (7.3.6) of type ϕ(t, x) = a(t) + b(x).
Substituting into equation (7.3.6) yields
1 ′′
b (x) = −a′ (t).
2
Since the left side is a function of x only, while the right side is a function of variable
t, the only where the previous equation is satisfied is when both sides are equal to the
same constant C. This is called a separation constant. Therefore a(t) and b(x) satisfy
the equations
1 ′′
a′ (t) = −C,
b (x) = C.
2
Integrating yields a(t) = −Ct + C0 and b(x) = Cx2 + C1 x + C2 . It follows that
ϕ(t, x) = C(x2 − t) + C1 x + C3 ,
with C0 , C1 , C2 , C3 arbitrary constants.
Example 7.3.2 Find all solutions of the equation (7.3.6) of the type ϕ(t, x) = a(t)b(x).
Substituting into the equation and dividing by a(t)b(x) yields
a′ (t) 1 b′′ (x)
+
= 0.
a(t)
2 b(x)
136
There is a separation constant C such that
b′′ (x)
a′ (t)
= −C and
= 2C. There are
a(t)
b(x)
three distinct cases to discuss:
1. C = 0. In this case a(t) = a0 and b(x) = b1 x+b0 , with a0 , a1 , b0 , b1 real constants.
Then
ϕ(t, x) = a(t)b(x) = c1 x + c0 ,
c0 , c1 ∈ R
is just a linear function in x.
2
2. C > 0. Let λ > 0 such that 2C = λ2 . Then a′ (t) = − λ2 a(t) and b′′ (x) = λ2 b(x),
with solutions
2 t/2
a(t) = a0 e−λ
b(x) = c1 eλx + c2 e−λx .
The general solution of (7.3.6) is
2 t/2
ϕ(t, x) = e−λ
(c1 eλx + c2 e−λx ),
c1 , c2 ∈ R.
3. C < 0. Let λ > 0 such that 2C = −λ2 . Then a′ (t) =
2
−λ b(x). Solving yields
λ2
2 a(t)
and b′′ (x) =
2 t/2
a(t) = a0 eλ
b(x) = c1 sin(λx) + c2 cos(λx).
The general solution of (7.3.6) in this case is
2 t/2
ϕ(t, x) = eλ
c1 sin(λx) + c2 cos(λx) ,
c1 , c2 ∈ R.
In particular, the functions x, x2 − t, ex−t/2 , e−x−t/2 , et/2 sin x and et/2 cos x, or any
linear combination of them are solutions of the heat equation (7.3.6). However, there
are other solutions which are not of the previous type.
Exercise 7.3.1 Prove that ϕ(t, x) = 13 x3 −tx is a solution of the heat equation (7.3.6).
Exercise 7.3.2 Show that ϕ(t, x) = t−1/2 e−x
(7.3.6) for t > 0.
2 /(2t)
is a solution of the heat equation
x
Exercise 7.3.3 Let ϕ = u(λ), with λ = √ , t > 0. Show that ϕ satisfies the heat
2 t
equation (7.3.6) if and only if u′′ + 2λu′ = 0.
2
Exercise 7.3.4 Let erf c(x) = √
π
solution of the equation (7.3.6).
Z
∞
x
√
2
e−r dr. Show that ϕ = erf c(x/(2 t)) is a
137
2
Exercise 7.3.5 (the fundamental solution) Show that ϕ(t, x) =
satisfies the equation (7.3.6).
x
√ 1 e− 4t
4πt
, t>0
Sometimes it is useful to generate new solutions for the heat equation from other
solutions. Below we present a few ways to accomplish this:
(i) by linear combination: if ϕ1 and ϕ2 are solutions, then a1 ϕ1 + a1 ϕ2 is a solution,
where a1 , a2 constants.
(ii) by translation: if ϕ(t, x) is a solution, then ϕ(t − τ, x − ξ) is a solution, where
(τ, ξ) is a translation vector.
(iii) by affine transforms: if ϕ(t, x) is a solution, then ϕ(λt, λ2 x) is a solution, for
any constant λ.
∂ n+m
(iv) by differentiation: if ϕ(t, x) is a solution, then n m ϕ(t, x) is a solution.
∂ x∂ t
(v) by convolution: if ϕ(t, x) is a solution, then so are
Z
b
ϕ(t, x − ξ)f (ξ) dξ
a
Z
a
b
ϕ(t − τ, x)g(t) dt.
For more detail on the subject the reader can consult Widder [?] and Cannon [?].
Theorem 7.3.6 Let ϕ(t, x) be a solution of the heat equation (7.3.6) and denote
f (t, x) = ∂x ϕ(t, x). Then
Z
b
a
f (t, Wt ) dWt = ϕ(b, Wb ) − ϕ(a, Wa ).
Proof: Let Ft = ϕ(t, Wt ). Applying Ito’s formula we get
1
dFt = ∂x ϕ(t, Wt ) dWt + ∂t ϕ + ∂x2 ϕ dt.
2
Since ∂t ϕ + 12 ∂x2 ϕ = 0 and ∂x ϕ(t, Wt ) = f (t, Wt ), we have
dFt = f (t, Wt ) dWt .
Applying the Fundamental Theorem yields
Z
b
f (t, Wt ) dWt =
a
Z
b
a
dFt = Fb − Fa = ϕ(b, Wb ) − ϕ(a, Wa ).
138
Application 7.3.7 Show that
Z
T
1
1
Wt dWt = WT2 − T.
2
2
0
Choose the solution of the heat equation (7.3.6) given by ϕ(t, x) = x2 − t. Then
f (t, x) = ∂x ϕ(t, x) = 2x. Theorem 7.3.6 yields
Z
Z
T
2Wt dWt =
T
0
0
0
T
f (t, Wt ) dWt = ϕ(t, x) = WT2 − T.
Dividing by 2 leads to the desired result.
Application 7.3.8 Show that
Z
T
0
(Wt2 − t) dWt =
1 3
W − T WT .
3 T
Consider the function ϕ(t, x) = 31 x3 − tx, which is a solution of the heat equation
(7.3.6), see Exercise 7.3.1. Then f (t, x) = ∂x ϕ(t, x) = x2 − t. Applying Theorem 7.3.6
yields
Z
T
0
(Wt2
− t) dWt =
Z
T
1
f (t, Wt ) dWt = ϕ(t, Wt ) = WT3 − T WT .
3
0
T
0
Application 7.3.9 Let λ > 0. Prove the identities
Z
T
e−
λ2 t
±λWt
2
dWt =
0
Consider the function ϕ(t, x) = e−
λ2 t
±λx
2
1 − λ2 T ±λWT
e 2
−1 .
±λ
, which is a solution of the homogeneous heat
equation (7.3.6), see Example 7.3.2. Then f (t, x) = ∂x ϕ(t, x) = ±λe−
Theorem 7.3.6 to get
Z
T
0
2
− λ2 t ±λx
±λe
dWt =
Z
T
λ2 t
±λx
2
. Apply
T
λ2 T
f (t, Wt ) dWt = ϕ(t, Wt ) = e− 2 ±λWT − 1.
0
0
Dividing by the constant ±λ ends the proof.
In particular, for λ = 1 the aforementioned formula becomes
Z
T
0
t
T
e− 2 +Wt dWt = e− 2 +WT − 1.
(7.3.8)
139
Application 7.3.10 Let λ > 0. Prove the identity
Z
T
e
λ2 t
2
cos(λWt ) dWt =
0
1 λ2 T
e 2 sin(λWT ).
λ
From the Example 7.3.2 we know that ϕ(t, x) = e
λ2 t
2
sin(λx) is a solution of the heat
equation. Applying Theorem 7.3.6 to the function f (t, x) = ∂x ϕ(t, x) = λe
yields
Z
T
λe
λ2 t
2
Z
λ2 t
2
cos(λx),
T
f (t, Wt ) dWt = ϕ(t, Wt )
0
0
T
λ2 T
λ2 t
= e 2 sin(λWt ) = e 2 sin(λWT ).
cos(λWt ) dWt =
0
T
0
Divide by λ to end the proof.
If we choose λ = 1 we recover a result already familiar to the reader from section
7.2
Z
T
t
T
(7.3.9)
e 2 cos(Wt ) dWt = e 2 sin WT .
0
Application 7.3.11 Let λ > 0. Show that
Z
T
e
λ2 t
2
sin(λWt ) dWt =
0
Choose ϕ(t, x) = e
λ2 t
2
λ2 T
1
1 − e 2 cos(λWT ) .
λ
cos(λx) to be a solution of the heat equation. Apply Theorem
7.3.6 for the function f (t, x) = ∂x ϕ(t, x) = −λe
Z
0
T
(−λ)e
λ2 t
2
λ2 t
2
sin(λx) to get
T
sin(λWt ) dWt = ϕ(t, Wt )
0
= e
and then divide by −λ.
λ2 T
2
T
λ2 T
cos(λWt ) = e 2 cos(λWT ) − 1,
0
Application 7.3.12 Let 0 < a < b. Show that
Z
b
a
3
t− 2 Wt e−
Wt2
2t
1
dWt = a− 2 e−
2
Wa
2a
1
− b− 2 e−
Wb2
2b
.
(7.3.10)
140
2
From Exercise 7.3.2 we have that ϕ(t, x) = t−1/2 e−x /(2t) is a solution of the homoge2
neous heat equation. Since f (t, x) = ∂x ϕ(t, x) = −t−3/2 xe−x /(2t) , applying Theorem
7.3.6 yields to the desired result. The reader can easily fill in the details.
Integration techniques will be used when solving stochastic differential equations in
the next chapter.
Exercise 7.3.13 Find the value of the following stochastic integrals
Z 1
√
et cos( 2Wt ) dWt
(a)
0
Z 3
e2t cos(2Wt ) dWt
(b)
0
Z 4
√
e−t+ 2Wt dWt .
(c)
0
Exercise 7.3.14 Let ϕ(t, x) be a solution of the following non-homogeneous heat equation with time-dependent and uniform heat source G(t)
1
∂t ϕ + ∂x2 ϕ = G(t).
2
Denote f (t, x) = ∂x ϕ(t, x). Show that
Z
b
a
f (t, Wt ) dWt = ϕ(b, Wb ) − ϕ(a, Wa ) −
Z
b
G(t) dt.
a
How does the formula change if the heat source G is constant?
7.4
Table of Usual Stochastic Integrals
Now we present a user-friendly table, which enlists identities that are among the most
important of our standard techniques. This table is far too complicated to be memorized in full. However, the first couple of identities in this table are the most memorable,
and should be remembered.
Let a < b and 0 < T . Then we have:
Z b
dWt = Wb − Wa ;
1.
a
2.
3.
Z
Z
T
Wt dWt =
0
T
0
WT2
T
− ;
2
2
(Wt2 − t) dWt =
WT2
− T WT ;
3
141
4.
5.
6.
7.
8.
9.
10.
11.
12.
13.
Z
Z
Z
Z
Z
Z
Z
Z
Z
Z
T
t dWt = T WT −
0
T
0
Wt2 dWt
T
Z
W3
= T −
3
T
Wt dt,
0 < T;
0
Z
T
Wt dt;
0
T
t
e 2 cos Wt dWt = e 2 sin WT ;
0
T
t
T
e 2 sin Wt dWt = 1 − e 2 cos WT ;
0
T
0
T
t
e− 2 +Wt dWt = e− 2 +WT − 1;
T
e
λ2 t
2
e
λ2 t
2
cos(λWt ) dWt =
1 λ2 T
e 2 sin(λWT );
λ
sin(λWt ) dWt =
λ2 T
1
1 − e 2 cos(λWT ) ;
λ
0
T
0
T
e−
λ2 t
±λWt
2
dWt =
0
b
3
t− 2 Wt e−
Wt2
2t
1 − λ2 T ±λWT
e 2
−1 ;
±λ
1
dWt = a− 2 e−
2
Wa
2a
a
T
0
1
cos Wt dWt = sin WT +
2
Z
1
− b− 2 e−
Wb2
2b
;
T
sin Wt dt;
0
Z
1 T
cos Wt dt;
sin Wt dWt = 1 − cos WT −
14.
2 0
0
Z t
15. d
f (s, Ws ) dWs = f (t, Wt ) dWt ;
Z
T
a
16.
17.
18.
Z
Z
Z
b
Yt dWt = Fb − Fa , if Yt dWt = dFt ;
a
b
a
b
a
f (t) dWt = f (t)Wt |ba −
Z
b
f ′ (t)Wt dt;
a
b 1 Z b
g (Wt ) dWt = g(Wt ) −
g′′ (Wt ) dt.
2 a
a
′
142
Chapter 8
Stochastic Differential Equations
8.1
Definitions and Examples
Let Xt be a continuous stochastic process. If small changes in the process Xt can
be written as a linear combination of small changes in t and small increments of the
Brownian motion Wt , we may write
dXt = a(t, Wt , Xt )dt + b(t, Wt , Xt ) dWt
(8.1.1)
and call it a stochastic differential equation. In fact, this differential relation has the
following integral meaning:
Xt = X0 +
Z
t
a(s, Ws , Xs ) ds +
Z
t
b(s, Ws , Xs ) dWs ,
(8.1.2)
0
0
where the last integral is taken in the Ito sense. Relation (8.1.2) is taken as the definition
for the stochastic differential equation (8.1.1). However, since it is convenient to use
stochastic differentials informally, we shall approach stochastic differential equations by
analogy with the ordinary differential equations, and try to present the same methods
of solving equations in the new stochastic environment.
The functions a(t, Wt , Xt ) and b(t, Wt , Xt ) are called drift rate and volatility, respectively. A process Xt is called a (strong) solution for the stochastic equation (8.1.1)
if it satisfies the equation. We shall start with an example.
Example 8.1.1 (The Brownian bridge) Let a, b ∈ R. Show that the process
Z t
1
dWs , 0 ≤ t < 1
Xt = a(1 − t) + bt + (1 − t)
1
−
s
0
is a solution of the stochastic differential equation
dXt =
b − Xt
dt + dWt ,
1−t
143
0 ≤ t < 1, X0 = a.
144
We shall perform a routine verification to show that Xt is a solution. First we compute
b − Xt
the quotient
:
1−t
Z
b − Xt = b − a(1 − t) − bt − (1 − t)
Z t
= (b − a)(1 − t) − (1 − t)
0
and dividing by 1 − t yields
b − Xt
=b−a−
1−t
Using
the product rule yields
Z
d
t
0
Z
t
0
t
1
dWs
0 1−s
1
dWs ,
1−s
1
dWs .
1−s
(8.1.3)
1
1
dWs =
dWt ,
1−s
1−t
Z t
Z t 1
1
dWs + (1 − t)d
dWs
dXt = a d(1 − t) + bdt + d(1 − t)
0 1−s
0 1−s
Z t
1
dWs dt + dWt
=
b−a−
0 1−s
b − Xt
=
dt + dWt ,
1−t
where the last identity comes from (8.1.3). We just verified that the process Xt is
a solution of the given stochastic equation. The question of how this solution was
obtained in the first place, is the subject of study for the next few sections.
8.2
Finding Mean and Variance from the Equation
For most practical purposes, the most important information one needs to know about
a process is its mean and variance. These can be found directly from the stochastic
equation in some particular cases without solving explicitly the equation. We shall deal
in the present section with this problem.
Taking the expectation in (8.1.2) and using the property of the Ito integral as a
zero mean random variable yields
Z t
E[a(s, Ws , Xs )] ds.
(8.2.4)
E[Xt ] = X0 +
0
Applying the Fundamental Theorem of Calculus we obtain
d
E[Xt ] = E[a(t, Wt , Xt )].
dt
145
We note that Xt is not differentiable, but its expectation E[Xt ] is. This equation can
be solved exactly in a few particular cases.
1. If a(t, Wt , Xt ) = a(t), then
Rt
X0 + 0 a(s) ds.
d
dt E[Xt ]
= a(t) with the exact solution E[Xt ] =
2. If a(t, Wt , Xt ) = α(t)Xt + β(t), with α(t) and β(t) continuous deterministic
function, then
d
E[Xt ] = α(t)E[Xt ] + β(t),
dt
which is a linear differential equation in E[Xt ]. Its solution is given by
A(t)
E[Xt ] = e
Z t
e−A(s) β(s) ds ,
X0 +
(8.2.5)
0
Rt
where A(t) = 0 α(s) ds. It is worth noting that the expectation E[Xt ] does not depend
on the volatility term b(t, Wt , Xt ).
Exercise 8.2.1 If dXt = (2Xt + e2t )dt + b(t, Wt , Xt )dWt , then show that
E[Xt ] = e2t (X0 + t).
Proposition 8.2.2 Let Xt be a process satisfying the stochastic equation
dXt = α(t)Xt dt + b(t)dWt .
Then the mean and variance of Xt are given by
E[Xt ] = eA(t) X0
Z t
2A(t)
V ar[Xt ] = e
e−A(s) b2 (s) ds,
0
where A(t) =
Rt
0
α(s) ds.
Proof: The expression of E[Xt ] follows directly from formula (8.2.5) with β = 0. In
order to compute the second moment we first compute
(dXt )2 = b2 (t) dt;
d(Xt2 ) = 2Xt dXt + (dXt )2
= 2Xt α(t)Xt dt + b(t)dWt + b2 (t)dt
= 2α(t)Xt2 + b2 (t) dt + 2b(t)Xt dWt ,
where we used Ito’s formula. If we let Yt = Xt2 , the previous equation becomes
p
dYt = 2α(t)Yt + b2 (t) dt + 2b(t) Yt dWt .
146
Applying formula (8.2.5) with α(t) replaced by 2α(t) and β(t) by b2 (t), yields
Z t
2A(t)
e−2A(s) b2 (s) ds ,
E[Yt ] = e
Y0 +
0
which is equivalent with
E[Xt2 ]
2A(t)
=e
It follows that the variance is
V ar[Xt ] =
E[Xt2 ]
X02
+
Z
e−2A(s) b2 (s) ds .
t
0
2
2A(t)
− (E[Xt ]) = e
Z
t
e−2A(s) b2 (s) ds.
0
Remark 8.2.3 We note that the previous equation is of linear type. This shall be
solved explicitly in a future section.
The mean and variance for a given stochastic process can be computed by working
out the associated stochastic equation. We shall provide next a few examples.
Example 8.2.1 Find the mean and variance of ekWt , with k constant.
From Ito’s formula
1
d(ekWt ) = kekWt dWt + k2 ekWt dt,
2
and integrating yields
kWt
e
=1+k
Z
t
kWs
e
0
1
dWs + k2
2
Z
t
ekWs ds.
0
Taking the expectations we have
kWt
E[e
1
] = 1 + k2
2
Z
t
E[ekWs ] ds.
0
If we let f (t) = E[ekWt ], then differentiating the previous relations yields the differential
equation
1
f ′ (t) = k2 f (t)
2
with the initial condition f (0) = E[ekW0 ] = 1. The solution is f (t) = ek
E[ekWt ] = ek
2 t/2
.
The variance is
V ar(ekWt ) = E[e2kWt ] − (E[ekWt ])2 = e4k
2
2
= ek t (ek t − 1).
2 t/2
− ek
2t
2 t/2
, and hence
147
Example 8.2.2 Find the mean of the process Wt eWt .
We shall set up a stochastic differential equation for Wt eWt . Using the product formula
and Ito’s formula yields
d(Wt eWt ) = eWt dWt + Wt d(eWt ) + dWt d(eWt )
1
= eWt dWt + (Wt + dWt )(eWt dWt + eWt dt)
2
1
Wt
Wt
Wt
Wt
= ( Wt e + e )dt + (e + Wt e )dWt .
2
Integrating and using that W0 eW0 = 0 yields
Z t
Z t
1
Wt
Ws
Ws
( Ws e + e ) ds +
Wt e =
(eWs + Ws eWs ) dWs .
2
0
0
Since the expectation of an Ito integral is zero, we have
Z t
1
Wt
E[Ws eWs ] + E[eWs ] ds.
E[Wt e ] =
2
0
Let f (t) = E[Wt eWt ]. Using E[eWs ] = es/2 , the previous integral equation becomes
Z t
1
( f (s) + es/2 ) ds,
f (t) =
0 2
Differentiating yields the following linear differential equation
1
f ′ (t) = f (t) + et/2
2
with the initial condition f (0) = 0. Multiplying by e−t/2 yields the following exact
equation
(e−t/2 f (t))′ = 1.
The solution is f (t) = tet/2 . Hence we obtained that
E[Wt eWt ] = tet/2 .
Exercise 8.2.4 Find a E[Wt2 eWt ]; (b) E[Wt ekWt ].
Example 8.2.3 Show that for any integer k ≥ 0 we have
E[Wt2k ] =
(2k)! k
t ,
2k k!
In particular, E[Wt4 ] = 3t2 , E[Wt6 ] = 15t3 .
E[Wt2k+1 ] = 0.
148
From Ito’s formula we have
d(Wtn ) = nWtn−1 dWt +
n(n − 1) n−2
Wt dt.
2
Integrate and get
Wtn = n
Z
t
0
Wsn−1 dWs +
n(n − 1)
2
Z
0
t
Wsn−2 ds.
Since the expectation of the first integral on the right side is zero, taking the expectation
yields the following recursive relation
Z
n(n − 1) t
n
E[Wt ] =
E[Wsn−2 ] ds.
2
0
Using the initial values E[Wt ] = 0 and E[Wt2 ] = t, the method of mathematical intk . The details are left to the
duction implies that E[Wt2k+1 ] = 0 and E[Wt2k ] = (2k)!
2k k!
reader.
Exercise 8.2.5 (a) Is Wt4 − 3t2 an Ft -martingale?
(b) What about Wt3 ?
Example 8.2.4 Find E[sin Wt ].
From Ito’s formula
d(sin Wt ) = cos Wt dWt −
1
sin Wt dt,
2
then integrating yields
sin Wt =
Z
t
0
1
cos Ws dWs −
2
Z
t
sin Ws ds.
0
Taking expectations we arrive at the integral equation
Z
1 t
E[sin Wt ] = −
E[sin Ws ] ds.
2 0
Let f (t) = E[sin Wt ]. Differentiating yields the equation f ′ (t) = − 12 f (t) with f (0) =
E[sin W0 ] = 0. The unique solution is f (t) = 0. Hence
E[sin Wt ] = 0.
Exercise 8.2.6 Let σ be a constant. Show that
(a) E[sin(σWt )] = 0;
2
(b) E[cos(σWt )] = e−σ t/2 ;
2
(c) E[sin(t + σWt )] = e−σ t/2 sin t;
2
(d) E[cos(t + σWt )] = e−σ t/2 cos t;
149
Exercise 8.2.7 Use the previous exercise and the definition of expectation to show that
Z
∞
π 1/2
;
e1/4
−∞
r
Z ∞
2π
−x2 /2
.
e
cos x dx =
(b)
e
−∞
(a)
2
e−x cos x dx =
Exercise 8.2.8 Using expectations show that
r Z ∞
π b b2 /(4a)
2
xe−ax +bx dx =
(a)
e
;
a 2a
−∞
r
Z ∞
b2 b2 /(4a)
π 1
2
1+
e
;
x2 e−ax +bx dx =
(b)
a 2a
2a
−∞
(c) Can you apply a similar method to find a close form expression for the integral
Z ∞
2
xn e−ax +bx dx?
−∞
Exercise 8.2.9 Using the result given by Example 8.2.3 show that
3
(a) E[cos(tWt )] = e−t /2 ;
(b) E[sin(tWt )] = 0;
(c) E[etWt ] = 0.
For general drift rates we cannot find the mean, but in the case of concave drift
rates we can find an upper bound for the expectation E[Xt ]. The following result will
be useful.
Lemma 8.2.10 (Gronwall’s inequality) Let f (t) be a non-negative function satisfying the inequality
Z t
f (s) ds
f (t) ≤ C + M
0
for 0 ≤ t ≤ T , with C, M constants. Then
f (t) ≤ CeM t ,
0 ≤ t ≤ T.
Proposition 8.2.11 Let Xt be a continuous stochastic process such that
dXt = a(Xt )dt + b(t, Wt , Xt ) dWt ,
with the function a(·) satisfying the following conditions
1. a(x) ≥ 0, for 0 ≤ x ≤ T ;
2. a′′ (x) < 0, for 0 ≤ x ≤ T ;
3. a′ (0) = M .
Then E[Xt ] ≤ X0 eM t , for 0 ≤ Xt ≤ T .
150
Proof: From the mean value theorem there is ξ ∈ (0, x) such that
a(x) = a(x) − a(0) = (x − 0)a′ (ξ) ≤ xa′ (0) = M x,
(8.2.6)
where we used that a′ (x) is a decreasing function. Applying Jensen’s inequality for
concave functions yields
E[a(Xt )] ≤ a(E[Xt ]).
Combining with (8.2.6) we obtain E[a(Xt )] ≤ M E[Xt ]. Substituting in the identity
(8.2.4) implies
Z t
E[Xs ] ds.
E[Xt ] ≤ X0 + M
0
Applying Gronwall’s inequality we obtain E[Xt ] ≤ X0 eM t .
Exercise 8.2.12 State the previous result in the particular case when a(x) = sin x,
with 0 ≤ x ≤ π.
Not in all cases can the mean and the variance be obtained directly from the stochastic equation. In these cases we need more powerful methods that produce closed form
solutions. In the next sections we shall discuss several methods of solving stochastic
differential equation.
8.3
The Integration Technique
We shall start with the simple case when both the drift and the volatility are just
functions of time t.
Proposition 8.3.1 The solution Xt of the stochastic differential equation
dXt = a(t)dt + b(t)dWt
is Gaussian distributed with mean X0 +
Rt
0
a(s) ds and variance
Proof: Integrating in the equation yields
Xt − X0 =
Z
t
dXs =
0
Z
t
a(s) ds +
0
Z
Rt
0
b2 (s) ds.
t
b(s) dWs .
0
Rt
Using the property of Wiener integrals, 0 b(s) dWs is Gaussian distributed with mean
Rt
0 and variance 0 b2 (s) ds. Then Xt is Gaussian (as a sum between a deterministic
151
function and a Gaussian), with
Z
Z
t
t
b(s) dWs ]
a(s) ds +
E[Xt ] = E[X0 +
0
0
Z t
Z t
a(s) ds + E[ b(s) dWs ]
= X0 +
0
0
Z t
a(s) ds,
= X0 +
0
Z t
Z t
b(s) dWs ]
a(s) ds +
V ar[Xt ] = V ar[X0 +
0
0
hZ t
i
= V ar
b(s) dWs
0
Z t
b2 (s) ds,
=
0
which ends the proof.
Exercise 8.3.2 Solve the following stochastic differential equations for t ≥ 0 and determine the mean and the variance of the solution
(a) dXt = cos t dt − sin t dWt , X0 = 1.
√
(b) dXt = et dt + t dWt , X0 = 0.
(c) dXt =
t
1+t2
dt + t3/2 dWt , X0 = 1.
If the drift and the volatility depend on both variables t and Wt , the stochastic
differential equation
dXt = a(t, Wt )dt + b(t, Wt )dWt ,
t≥0
defines an Ito diffusion. Integrating yields the solution
Xt = X0 +
Z
t
a(s, Ws ) ds +
0
Z
t
b(s, Ws ) dWs .
0
There are several cases when both integrals can be computed explicitly. In order to
compute the second integral we shall often use the table of usual stochastic integrals
provided in section 7.4.
Example 8.3.1 Find the solution of the stochastic differential equation
dXt = dt + Wt dWt ,
X0 = 1.
152
Integrate between 0 and t and get
Z t
Z t
t
W2
Ws dWs = 1 + t + t −
ds +
Xt = 1 +
2
2
0
0
1
2
(W + t) + 1.
=
2 t
Example 8.3.2 Solve the stochastic differential equation
dXt = (Wt − 1)dt + Wt2 dWt ,
X0 = 0.
Rt
Let Zt = 0 Ws ds denote the integrated Brownian motion process. Integrating the
equation between 0 and t yields
Z t
Z t
Z t
Ws2 dWs
(Ws − 1)ds +
dXs =
Xt =
0
0
0
1
= Zt − t + Wt3 − Zt
3
1 3
=
W − t,
3 t
where we used that
Rt
0
Ws2 dWs = 13 Wt3 − Zt .
Example 8.3.3 Solve the stochastic differential equation
dXt = t2 dt + et/2 cos Wt dWt ,
X0 = 0,
and find E[Xt ] and V ar(Xt ).
Integrating yields
Xt =
=
Z
t
2
s ds +
0
Z
t
es/2 cos Ws dWs
0
t3
+ et/2 sin Wt ,
3
(8.3.7)
where we used (7.3.9). Even if the process Xt is not Gaussian, we can still compute its
mean and variance. By Ito’s formula we have
d(sin Wt ) = cos Wt dWt −
1
sin Wt dt
2
Integrating between 0 and t yields
Z t
Z
1 t
cos Ws dWs −
sin Wt =
sin Ws ds,
2 0
0
153
where we used that sin W0 = sin 0 = 0. Taking the expectation in the previous relation
yields
hZ t
i 1Z t
E[sin Wt ] = E
cos Ws dWs −
E[sin Ws ] ds.
2 0
0
From the properties of the Ito integral, the first expectation on the right side is zero.
Denoting µ(t) = E[sin Wt ], we obtain the integral equation
Z
1 t
µ(s) ds.
µ(t) = −
2 0
Differentiating yields the differential equation
1
µ′ (t) = − µ(t)
2
with the solution µ(t) = ke−t/2 . Since k = µ(0) = E[sin W0 ] = 0, it follows that
µ(t) = 0. Hence
E[sin Wt ] = 0.
Taking expectation in (8.3.7) leads to
E[Xt ] = E
h t3 i
3
+ et/2 E[sin Wt ] =
t3
.
3
Since the variance of deterministic functions is zero,
V ar[Xt ] = V ar
h t3
3
i
+ et/2 sin Wt = (et/2 )2 V ar[sin Wt ]
= et E[sin2 Wt ] =
et
(1 − E[cos 2Wt ]).
2
(8.3.8)
In order to compute the last expectation we use Ito’s formula
d(cos 2Wt ) = −2 sin 2Wt dWt − 2 cos 2Wt dt
and integrate to get
cos 2Wt = cos 2W0 − 2
Z
t
0
sin 2Ws dWs − 2
Z
t
cos 2Ws ds
0
Taking the expectation and using that Ito integrals have zero expectation, yields
Z t
E[cos 2Ws ] ds.
E[cos 2Wt ] = 1 − 2
0
If we denote m(t) = E[cos 2Wt ], the previous relation becomes an integral equation
Z t
m(s) ds.
m(t) = 1 − 2
0
154
Differentiate and get
m′ (t) = −2m(t),
with the solution m(t) = ke−2t . Since k = m(0) = E[cos 2W0 ] = 1, we have m(t) =
e−2t . Substituting into (8.3.8) yields
V ar[Xt ] =
et
et − e−t
(1 − e−2t ) =
= sinh t.
2
2
In conclusion, the solution Xt has the mean and the variance given by
E[Xt ] =
t3
,
3
V ar[Xt ] = sinh t.
Example 8.3.4 Solve the following stochastic differential equation
et/2 dXt = dt + eWt dWt ,
X0 = 0,
and find the distribution of the solution Xt and its mean and variance.
Dividing by et/2 , integrating between 0 and t, and using formula (7.3.8) yields
Z t
Z t
−s/2
e−s/2+Ws dWs
e
ds +
Xt =
0
−t/2 Wt
0
−t/2
= 2(1 − e
−t/2
= 1+e
)+e
Wt
(e
− 2).
e
−1
Since eWt is a geometric Brownian motion, using Proposition 3.2.2 yields
E[Xt ] = E[1 + e−t/2 (eWt − 2)] = 1 − 2e−t/2 + e−t/2 E[eW
t ]
= 2 − 2e−t/2 .
V ar(Xt ) = V ar[1 + e−t/2 (eWt − 2)] = V ar[e−t/2 eWt ] = e−t V ar[eWt ]
= e−t (e2t − et ) = et − 1.
The process Xt has the following distribution:
F (y) = P (Xt ≤ y) = P 1 + e−t/2 (eWt − 2) ≤ y
W
1
t
= P Wt ≤ ln 2 + et/2 (y − 1) = P √ ≤ √ ln 2 + et/2 (y − 1)
t
t
1
= N √ ln 2 + et/2 (y − 1) ,
t
Z u
1
2
where N (u) = √
e−s /2 ds is the distribution function of a standard normal
2π −∞
distributed random variable.
155
Example 8.3.5 Solve the stochastic differential equation
2
dXt = dt + t−3/2 Wt e−Wt /(2t) dWt ,
X1 = 1.
Integrating between 1 and t and applying formula (7.3.10) yields
Z t
Z t
2
s−3/2 Ws e−Ws /(2s) dWs
ds +
Xt = X1 +
1
1
−W12 /2
= 1+t−1−e
2
= t − e−W1 /2 −
1
t1/2
−
1
t1/2
2
e−Wt /(2t)
2
e−Wt /(2t) ,
∀t ≥ 1.
Exercise 8.3.3 Solve the following stochastic differential equations by the method of
integration
1
(a) dXt = (t − sin Wt )dt + (cos Wt )dWt , X0 = 0;
2
1
(b) dXt = ( cos Wt − 1)dt + (sin Wt )dWt , X0 = 0;
2
1
(c) dXt = (sin Wt + Wt cos Wt )dt + (Wt sin Wt )dWt , X0 = 0.
2
8.4
Exact Stochastic Equations
The stochastic differential equation
dXt = a(t, Wt )dt + b(t, Wt )dWt
(8.4.9)
is called exact if there is a differentiable function f (t, x) such that
1
a(t, x) = ∂t f (t, x) + ∂x2 f (t, x)
2
b(t, x) = ∂x f (t, x).
(8.4.10)
(8.4.11)
Assume the equation is exact. Then substituting in (8.4.9) yields
1
dXt = ∂t f (t, Wt ) + ∂x2 f (t, Wt ) dt + ∂x f (t, Wt )dWt .
2
Applying Ito’s formula, the previous equations becomes
dXt = d f (t, Wt ) ,
which implies Xt = f (t, Wt ) + c, with c constant.
Solving the partial differential equations system (8.4.10)–(8.4.11) requires the following steps:
156
1. Integrating partially with respect to x in the second equation to obtain f (t, x)
up to an additive function T (t);
2. Substitute into the first equation and determine the function T (t);
3. The solution is Xt = f (t, Wt ) + c, with c determined from the initial condition
on Xt .
Example 8.4.1 Solve the stochastic differential equation
dXt = et (1 + Wt2 )dt + (1 + 2et Wt )dWt ,
X0 = 0.
In this case a(t, x) = et (1 + x2 ) and b(t, x) = 1 + 2et x. The associated system is
1
et (1 + x2 ) = ∂t f (t, x) + ∂x2 f (t, x)
2
t
1 + 2e x = ∂x f (t, x).
Integrate partially in x in the second equation yields
Z
f (t, x) = (1 + 2et x) dx = x + et x2 + T (t).
Then ∂t f = et x2 + T ′ (t) and ∂x2 f = 2et . Substituting in the first equation yields
et (1 + x2 ) = et x2 + T ′ (t) + et .
This implies T ′ (t) = 0, or T = c constant. Hence f (t, x) = x + et x2 + c, and Xt =
f (t, Wt ) = Wt + et Wt2 + c. Since X0 = 0, it follows that c = 0. The solution is
Xt = Wt + et Wt2 .
Example 8.4.2 Find the solution of
dXt = 2tWt3 + 3t2 (1 + Wt ) dt + (3t2 Wt2 + 1)dWt ,
X0 = 0.
The coefficient functions are a(t, x) = 2tx3 + 3t2 (1 + x) and b(t, x) = 3t2 x2 + 1. The
associated system is given by
1
2tx3 + 3t2 (1 + x) = ∂t f (t, x) + ∂x2 f (t, x)
2
3t2 x2 + 1 = ∂x f (t, x).
Integrating partially in the second equation yields
Z
f (t, x) =
(3t2 x2 + 1) dx = t2 x3 + x + T (t).
Then ∂t f = 2tx3 + T ′ (t) and ∂x2 f = 6t2 x, and substituting into the first equation we
get
1
2tx3 + 3t2 (1 + x) = 2tx3 + T ′ (t) + 6t2 x.
2
157
After cancellations we get T ′ (t) = 3t2 , so T (t) = t3 + c. Then
f (t, x) = t2 x3 + x + t3 + c.
The solution process is given by Xt = f (t, Wt ) = t2 Wt3 + Wt + t3 + c. Using X0 = 0 we
get c = 0. Hence the solution is Xt = t2 Wt3 + Wt + t3 .
The next result deals with a condition regarding the closeness of the stochastic
differential equation.
Theorem 8.4.1 If the stochastic differential equation (8.4.9) is exact, then the coefficient functions a(t, x) and b(t, x) satisfy the condition
1
∂x a = ∂t b + ∂x2 b.
2
(8.4.12)
Proof: If the stochastic equation is exact, there is a function f (t, x) satisfying the
system (8.4.10)–(8.4.11). Differentiating the first equation of the system with respect
to x yields
1
∂x a = ∂t ∂x f + ∂x2 ∂x f.
2
Substituting b = ∂x f yields the desired relation.
Remark 8.4.2 The equation (8.4.12) has the meaning of a heat equation. The function b(t, x) represents the temperature measured at x at the instance t, while ∂x a is the
density of heat sources. The function a(t, x) can be regarded as the potential from which
the density of heat sources is derived by taking the gradient in x.
It is worth noting that equation (8.4.12) is just a necessary condition for exactness.
This means that if this condition is not satisfied, then the equation is not exact. In
that case we need to try a different method to solve the equation.
Example 8.4.3 Is the stochastic differential equation
dXt = (1 + Wt2 )dt + (t4 + Wt2 )dWt
exact?
Collecting the coefficients, we have a(t, x) = 1 + x2 , b(t, x) = t4 + x2 . Since ∂x a = 2x,
∂t b = 4t3 , and ∂x2 b = 2, the condition (8.4.12) is not satisfied, and hence the equation
is not exact.
Exercise 8.4.3 Solve the following exact stochastic differential equations
X0 = 1;
(a) dXt = et dt + (Wt2 − t)dWt ,
2
X0 = −1;
(b) dXt = (sin t)dt + (Wt − t)dWt ,
t
(c) dXt = t2 dt + eWt − 2 dWt ,
X0 = 0;
t/2
(d) dXt = tdt + e (cos Wt ) dWt ,
X0 = 1.
158
Exercise 8.4.4 Verify the closeness condition and then solve the following exact stochastic differential equations
X0 = 0;
(a) dXt = Wt + 32 Wt2 dt + (t + Wt3 )dWt ,
(b) dXt = 2tWt dt + (t2 + Wt )dWt ,
X0 = 0;
1
t
t
(c) dXt = e Wt + 2 cos Wt dt + (e + sin Wt )dWt ,
(d) dXt = eWt (1 + 2t )dt + teWt dWt ,
8.5
X0 = 0;
X0 = 2.
Integration by Inspection
When solving a stochastic differential equation by inspection we look for opportunities
to apply the product or the quotient formulas:
d(f (t)Yt ) = f (t) dYt + Yt df (t).
X f (t)dX − X df (t)
t
t
t
d
=
.
f (t)
f (t)2
For instance, if a stochastic differential equation can be written as
dXt = f ′ (t)Wt dt + f (t)dWt ,
the product rule brings the equation into the exact form
dXt = d f (t)Wt ,
which after integration leads to the solution
Xt = X0 + f (t)Wt .
Example 8.5.1 Solve
dXt = (t + Wt2 )dt + 2tWt dWt ,
X0 = a.
We can write the equation as
dXt = Wt2 dt + t(2Wt dWt + dt),
which can be contracted to
dXt = Wt2 dt + td(Wt2 ).
Using the product rule we can bring it to the exact form
dXt = d(tWt2 ),
with the solution Xt = tWt2 + a.
159
Example 8.5.2 Solve the stochastic differential equation
dXt = (Wt + 3t2 )dt + tdWt .
If we rewrite the equation as
dXt = 3t2 dt + (Wt dt + tdWt ),
we note the exact expression formed by the last two terms Wt dt + tdWt = d(tWt ).
Then
dXt = d(t3 ) + d(tWt ),
which is equivalent with d(Xt ) = d(t3 + tWt ). Hence Xt = t3 + tWt + c, c ∈ R.
Example 8.5.3 Solve the stochastic differential equation
e−2t dXt = (1 + 2Wt2 )dt + 2Wt dWt .
Multiply by e2t to get
dXt = e2t (1 + 2Wt2 )dt + 2e2t Wt dWt .
After regrouping, this becomes
dXt = (2e2t dt)Wt2 + e2t (2Wt dWt + dt).
Since d(e2t ) = 2e2t dt and d(Wt2 ) = 2Wt dWt + dt, the previous relation becomes
dXt = d(e2t )Wt2 + e2t d(Wt2 ).
By the product rule, the right side becomes exact
dXt = d(e2t Wt2 ),
and hence the solution is Xt = e2t Wt2 + c, c ∈ R.
Example 8.5.4 Solve the equation
t3 dXt = (3t2 Xt + t)dt + t6 dWt ,
X1 = 0.
The equation can be written as
t3 dXt − 3Xt t2 dt = tdt + t6 dWt .
Divide by t6 :
t3 dXt − Xt d(t3 )
= t−5 dt + dWt .
(t3 )2
160
Applying the quotient rule yields
t−4 X t
+ dWt .
d 3 = −d
t
4
Integrating between 1 and t, yields
Xt
t−4
+ Wt − W1 + c
=
−
t3
4
so
Xt = ct3 −
1
+ t3 (Wt − W1 ),
4t
c ∈ R.
Using X1 = 0 yields c = 1/4 and hence the solution is
Xt =
1 3 1
t −
+ t3 (Wt − W1 ),
4
t
c ∈ R.
Exercise 8.5.1 Solve the following stochastic differential equations by the inspection
method
(a) dXt = (1 + Wt )dt + (t + 2Wt )dWt ,
(b)
(c)
(d)
t2 dXt
(2t3
− Wt )dt + tdWt ,
X1 =
1
e−t/2 dXt = 2 Wt dt + dWt ,
X0 = 0;
2
dXt = 2tWt dWt + Wt dt,
X0 = 0;
=
(e) dXt = 1 +
8.6
X0 = 0;
1
√
W
2 t t
√
dt + t dWt ,
0;
X1 = 0.
Linear Stochastic Differential Equations
Consider the stochastic differential equation with drift term linear in Xt
dXt = α(t)Xt + β(t) dt + b(t, Wt )dWt ,
t ≥ 0.
This also can be written as
dXt − α(t)Xt dt = β(t)dt + b(t, Wt )dWt .
Rt
Let A(t) = 0 α(s) ds. Multiplying by the integrating factor e−A(t) , the left side of the
previous equation becomes an exact expression
= e−A(t) β(t)dt + e−A(t) b(t, Wt )dWt
e−A(t) dXt − α(t)Xt dt
= e−A(t) β(t)dt + e−A(t) b(t, Wt )dWt .
d e−A(t) Xt
161
Integrating yields
Z t
e−A(s) b(s, Ws ) dWs
e−A(s) β(s) ds +
0
0
Z t
Z t
−A(s)
A(t)
A(t)
e−A(s) b(s, Ws ) dWs .
e
β(s) ds +
= X0 e
+e
e−A(t) Xt = X0 +
Xt
Z
t
0
0
The first integral within the previous parentheses is a Riemann integral, and the latter
one is an Ito stochastic integral. Sometimes, in practical applications these integrals
can be computed explicitly.
When b(t, Wt ) = b(t), the latter integral becomes a Wiener integral. In this case
the solution Xt is Gaussian with mean and variance given by
Z t
A(t)
A(t)
E[Xt ] = X0 e
+e
e−A(s) β(s) ds
0
Z t
V ar[Xt ] = e2A(t)
e−2A(s) b(s)2 ds.
0
Another important particular case is when α(t) = α 6= 0, β(t) = β are constants
and b(t, Wt ) = b(t). The equation in this case is
dXt = αXt + β dt + b(t)dWt ,
t ≥ 0,
and the solution takes the form
αt
Xt = X0 e
β
+ (eαt − 1) +
α
Z
t
eα(t−s) b(s) dWs .
0
Example 8.6.1 Solve the linear stochastic differential equation
dXt = (2Xt + 1)dt + e2t dWt .
Write the equation as
dXt − 2Xt dt = dt + e2t dWt
and multiply by the integrating factor e−2t to get
d e−2t Xt = e−2t dt + dWt .
Integrate between 0 and t and multiply by e2t , and obtain
Z t
Z t
2t
2t
−2s
2t
Xt = X0 e + e
e ds + e
dWs
0
1
= X0 e + (e2t − 1) + e2t Wt .
2
2t
0
162
Example 8.6.2 Solve the linear stochastic differential equation
dXt = (2 − Xt )dt + e−t Wt dWt .
Multiplying by the integrating factor et yields
et (dXt + Xt dt) = 2et dt + Wt dWt .
Since et (dXt + Xt dt) = d(et Xt ), integrating between 0 and t we get
Z t
Z t
t
t
Ws dWs .
2e dt +
e Xt = X0 +
0
0
Dividing by
et
and performing the integration yields
1
Xt = X0 e−t + 2(1 − e−t ) + e−t (Wt2 − t).
2
Example 8.6.3 Solve the linear stochastic differential equation
1
dXt = ( Xt + 1)dt + et cos Wt dWt .
2
Write the equation as
1
dXt − Xt dt = dt + et cos Wt dWt
2
and multiply by the integrating factor e−t/2 to get
d(e−t/2 Xt ) = e−t/2 dt + et/2 cos Wt dWt .
Integrating yields
−t/2
e
Xt = X0 +
Z
0
t
−s/2
e
ds +
Z
t
es/2 cos Ws dWs
0
Multiply by et/2 and use formula (7.3.9) to obtain the solution
Xt = X0 et/2 + 2(et/2 − 1) + et sin Wt .
Exercise 8.6.1 Solve the following linear stochastic differential equations
(a) dXt = (4Xt − 1)dt + 2dWt ;
(b) dXt = (3Xt − 2)dt + e3t dWt ;
(c) dXt = (1 + Xt )dt + et Wt dWt ;
(d) dXt = (4Xt + t)dt + e4t dWt ;
(e) dXt = t + 12 Xt dt + et sin Wt dWt ;
(f ) dXt = −Xt dt + e−t dWt .
163
In the following we present an important example of stochastic differential equations, which can be solved by the method presented in this section.
Proposition 8.6.2 (The mean-reverting Ornstein-Uhlenbeck process) Let m and
α be two constants. Then the solution Xt of the stochastic equation
dXt = (m − Xt )dt + αdWt
is given by
−t
Xt = m + (X0 − m)e
+α
Z
t
(8.6.13)
es−t dWs .
(8.6.14)
0
Xt is Gaussian with mean and variance given by
E[Xt ] = m + (X0 − m)e−t
α2
V ar(Xt ) =
(1 − e−2t ).
2
Proof: Adding Xt dt to both sides and multiplying by the integrating factor et we get
d(et Xt ) = met dt + αet dWt ,
which after integration yields
et Xt = X0 + m(et − 1) + α
Z
t
es dWs .
0
Hence
−t
−t
−t
Z
t
+ αe
es dWs
0
Z t
es−t dWs .
= m + (X0 − m)e−t + α
Xt = X0 e
+m−e
0
Since Xt is the sum between a predictable function and a Wiener integral, then we can
use Proposition 5.6.1 and it follows that Xt is Gaussian, with
i
h Z t
−t
es−t dWs = m + (X0 − m)e−t
E[Xt ] = m + (X0 − m)e + E α
0
Z t
i
h Z t
2 −2t
s−t
e dWs = α e
e2s ds
V ar(Xt ) = V ar α
= α2 e−2t
0
2t
e
0
1
−1
= α2 (1 − e−2t ).
2
2
164
The name mean-reverting comes from the fact that
lim E[Xt ] = m.
t→∞
The variance also tends to zero exponentially, lim V ar[Xt ] = 0. According to Propot→∞
sition 4.8.1, the process Xt tends to m in the mean square sense.
Proposition 8.6.3 (The Brownian bridge) For a, b ∈ R fixed, the stochastic differential equation
dXt =
b − Xt
dt + dWt ,
1−t
0 ≤ t < 1, X0 = a
has the solution
Z
Xt = a(1 − t) + bt + (1 − t)
t
0
1
dWs , 0 ≤ t < 1.
1−s
The solution has the property lim Xt = b, almost certainly.
t→1
Proof: If we let Yt = b − Xt the equation becomes linear in Yt
dYt +
1
Yt dt = −dWt .
1−t
1
1−t
Multiplying by the integrating factor ρ(t) =
yields
Y 1
t
d
dWt ,
=−
1−t
1−t
which leads by integration to
Yt
=c−
1−t
Z
t
0
Making t = 0 yields c = a − b, so
1
dWs .
1−s
b − Xt
=a−b−
1−t
Z
Xt = a(1 − t) + bt + (1 − t)
Z
t
1
dWs .
1−s
t
1
dWs , 0 ≤ t < 1.
1−s
0
Solving for Xt yields
0
(8.6.15)
165
Let Ut = (1 − t)
Rt
1
0 1−s
dWs . First we notice that
Z
t
1
dWs = 0,
0 1−s
Z
Z t
t 1
1
V ar(Ut ) = (1 − t)2 V ar
dWs = (1 − t)2
ds
2
0 1−s
0 (1 − s)
1
− 1 = t(1 − t).
= (1 − t)2
1−t
E[Ut ] = (1 − t)E
In order to show ac-limt→1 Xt = b, we need to prove
P ω; lim Xt (ω) = b = 1.
t→1
Since Xt = a(1 − t) + bt + Ut , it suffices to show that
P ω; lim Ut (ω) = 0 = 1.
t→1
(8.6.16)
We evaluate the probability of the complementary event
P ω; lim Ut (ω) 6= 0 = P ω; |Ut (ω)| > ǫ, ∀t ,
t→1
for some ǫ > 0. Since by Markov’s inequality
P ω; |Ut (ω)| > ǫ) <
V ar(Ut )
t(1 − t)
=
2
ǫ
ǫ2
holds for any 0 ≤ t < 1, choosing t → 1 implies that
P ω; |Ut (ω)| > ǫ, ∀t) = 0,
which implies (8.6.16).
The process (8.6.15) is called the Brownian bridge because it joins X0 = a with
X1 = b. Since Xt is the sum between a deterministic linear function in t and a Wiener
integral, it follows that it is a Gaussian process, with mean and variance
E[Xt ] = a(1 − t) + bt
V ar(Xt ) = V ar(Ut ) = t(1 − t).
It is worth noting that the variance is maximum at the midpoint t = (b − a)/2 and zero
at the end points a and b.
ms b as t → 1.
Exercise 8.6.4 Show that the Brownian bridge (8.6.15) satisfies Xt −→
Exercise 8.6.5 Find Cov(Xs , Xt ), 0 < s < t for the following cases:
(a) Xt is a mean reverting Orstein-Uhlenbeck process;
(b) Xt is a Brownian bridge process.
166
Stochastic equations with respect to a Poisson process
Similar techniques can be applied in the case when the Brownian motion process Wt
is replaced by a Poisson process Nt with constant rate λ. For instance, the stochastic
differential equation
dXt = 3Xt dt + e3t dNt
X0 = 1
can be solved multiplying by the integrating factor e−3t to obtain
d(e−3t Xt ) = dNt .
Integrating yields e−3t Xt = Nt + C. Making t = 0 yields C = 1, so the solution is given
by Xt = e3t (1 + Nt ).
The following equation
dXt = (m − Xt )dt + αdNt
is similar with the equation defining the mean-reverting Ornstein-Uhlenbeck process.
As we shall see, in this case, the process is no more mean-reverting, but it reverts to a
certain constant. A similar method yields the solution
Z t
Xt = m + (X0 − m)e−t + αe−t
es dNs .
0
Since from Proposition 5.8.5 and Exercise 5.8.8 we have
Z
Z t
t s
e dNs = λ
E
es ds = λ(et − 1)
0
0
Z t
Z t
λ
e2s ds = (e2t − 1),
es dNs = λ
V ar
2
0
0
it follows that
E[Xt ] = m + (X0 − m)e−t + αλ(1 − e−t ) → m + αλ
λα2
(1 − e−2t ).
V ar(Xt ) =
2
It is worth noting that in this case the process Xt is not Gaussian any more.
8.7
The Method of Variation of Parameters
Let’s start by considering the following stochastic equation
dXt = αXt dWt ,
(8.7.17)
167
with α constant. This is the equation which, in physics is known to model the linear
noise. Dividing by Xt yields
dXt
= α dWt .
Xt
Switch to the integral form
Z
dXt
=
Xt
Z
α dWt ,
and integrate “blindly” to get ln Xt = αWt + c, with c an integration constant. This
leads to the “pseudo-solution”
Xt = eαWt +c .
The nomination “pseudo” stands for the fact that Xt does not satisfy the initial equation. We shall find a correct solution by letting the parameter c be a function of t. In
other words, we are looking for a solution of the following type:
Xt = eαWt +c(t) ,
(8.7.18)
where the function c(t) is subject to be determined. Using Ito’s formula we get
dXt = d(eαWt +c(t) ) = eαWt +c(t) (c′ (t) + α2 /2)dt + αeαWt +c(t) dWt
= Xt (c′ (t) + α2 /2)dt + αXt dWt .
Substituting the last term from the initial equation (8.7.17) yields
dXt = Xt (c′ (t) + α2 /2)dt + dXt ,
which leads to the equation
c′ (t) + α2 /2 = 0
2
with the solution c(t) = − α2 t + k. Substituting into (8.7.18) yields
Xt = eαWt −
α2
t+k
2
.
The value of the constant k is determined by taking t = 0. This leads to X0 = ek .
Hence we have obtained the solution of the equation (8.7.17)
Xt = X0 eαWt −
α2
t
2
.
Example 8.7.1 Use the method of variation of parameters to solve the stochastic differential equation
dXt = µXt dt + σXt dWt ,
with µ and σ constants.
168
After dividing by Xt we bring the equation into the equivalent integral form
∫ dXt/Xt = ∫ µ dt + ∫ σ dWt.
Integrate the left side "blindly" and get
ln Xt = µt + σWt + c,
where c is an integration constant. We arrive at the following "pseudo-solution"
Xt = e^{µt + σWt + c}.
Assume the constant c is replaced by a function c(t), so we are looking for a solution of the form
Xt = e^{µt + σWt + c(t)}. (8.7.19)
Apply Ito's formula and get
dXt = Xt(µ + c′(t) + σ²/2)dt + σXt dWt.
Subtracting the initial equation yields
(c′(t) + σ²/2)dt = 0,
which is satisfied for c′(t) = −σ²/2, with the solution c(t) = −σ²t/2 + k, k ∈ R. Substituting into (8.7.19) yields the solution
Xt = e^{µt + σWt − σ²t/2 + k} = e^{(µ − σ²/2)t + σWt + k} = X0 e^{(µ − σ²/2)t + σWt}.
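The closed-form solution just obtained can be compared numerically with an Euler discretization of the equation driven by the same Brownian increments. The following sketch is only an illustration; µ, σ, X0 and the step size are assumed values, and the small gap between the two numbers is the discretization error.

```
# Sketch: closed form X_T = X0*exp((mu - sigma^2/2)T + sigma*W_T) versus an
# Euler scheme for dX = mu*X dt + sigma*X dW on the same Brownian path
# (mu, sigma, X0 below are illustrative).
import numpy as np

mu, sigma, X0 = 0.1, 0.3, 1.0
T, n_steps = 1.0, 100000
dt = T / n_steps

rng = np.random.default_rng(2)
dW = rng.normal(0.0, np.sqrt(dt), n_steps)

X_euler = X0
for k in range(n_steps):
    X_euler += mu * X_euler * dt + sigma * X_euler * dW[k]

X_exact = X0 * np.exp((mu - sigma**2 / 2) * T + sigma * dW.sum())
print("Euler:", X_euler, "  closed form:", X_exact)
```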
Exercise 8.7.1 Use the method of variation of parameters to solve the equation
dXt = Xt Wt dWt
by following the next two steps:
(a) Divide by Xt and integrate "blindly" to get the "pseudo-solution"
Xt = e^{Wt²/2 − t/2 + c},
with c constant.
(b) Consider c = c(t, Wt) and find a solution of type
Xt = e^{Wt²/2 − t/2 + c(t, Wt)}.
8.8 Integrating Factors
The method of integrating factors can be applied to a class of stochastic differential equations of the type
dXt = f(t, Xt)dt + g(t)Xt dWt, (8.8.20)
where f and g are continuous deterministic functions. The integrating factor is given by
ρt = e^{−∫_0^t g(s) dWs + (1/2)∫_0^t g²(s) ds}.
The equation can be brought into the following exact form
d(ρt Xt) = ρt f(t, Xt)dt.
Substituting Yt = ρt Xt, we obtain that Yt satisfies the deterministic differential equation
dYt = ρt f(t, Yt/ρt)dt,
which can be solved by either integration or as an exact equation. We shall exemplify
the method of integrating factors with a few examples.
Example 8.8.1 Solve the stochastic differential equation
dXt = r dt + αXt dWt, (8.8.21)
with r and α constants.
The integrating factor is given by ρt = e^{α²t/2 − αWt}. Using Ito's formula, we can easily check that
dρt = ρt(α² dt − α dWt).
Using dt2 = dt dWt = 0, (dWt )2 = dt we obtain
dXt dρt = −α2 ρt Xt dt.
Multiplying by ρt , the initial equation becomes
ρt dXt − αρt Xt dWt = rρt dt,
and adding and subtracting α2 ρt Xt dt from the left side yields
ρt dXt − αρt Xt dWt + α2 ρt Xt dt − α2 ρt Xt dt = rρt dt.
This can be written as
ρt dXt + Xt dρt + dρt dXt = rρt dt,
which, with the virtue of the product rule, becomes
d(ρt Xt ) = rρt dt.
Integrating yields
ρt Xt = ρ0 X0 + r ∫_0^t ρs ds
and hence the solution is
Xt = (1/ρt) X0 + (r/ρt) ∫_0^t ρs ds
   = X0 e^{αWt − α²t/2} + r ∫_0^t e^{−α²(t−s)/2 + α(Wt − Ws)} ds.
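The formula above can be verified numerically on a single Brownian path by approximating the integral ∫_0^t ρs ds with a Riemann sum and comparing with an Euler scheme. In the sketch below r, α, X0 and the grid are illustrative assumptions.

```
# Sketch: check the integrating-factor solution of dX_t = r dt + alpha*X_t dW_t
# against an Euler scheme on one Brownian path (r, alpha, X0 are illustrative).
import numpy as np

r, alpha, X0 = 0.5, 0.4, 1.0
T, n = 1.0, 200000
dt = T / n

rng = np.random.default_rng(3)
dW = rng.normal(0.0, np.sqrt(dt), n)
W = np.concatenate([[0.0], dW.cumsum()])
t = np.linspace(0.0, T, n + 1)

rho = np.exp(0.5 * alpha**2 * t - alpha * W)                 # integrating factor
X_formula = (X0 + r * np.cumsum(rho[:-1]) * dt) / rho[1:]    # (X0 + r*int_0^t rho ds)/rho_t

X_euler = X0
for k in range(n):
    X_euler += r * dt + alpha * X_euler * dW[k]

print("Euler X_T:", X_euler, "  formula X_T:", X_formula[-1])
```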
Exercise 8.8.1 Let α be a constant. Solve the following stochastic differential equations by the method of integrating factors:
(a) dXt = αXt dWt;
(b) dXt = Xt dt + αXt dWt;
(c) dXt = (1/Xt) dt + αXt dWt, X0 > 0.

Exercise 8.8.2 Let Xt be the solution of the stochastic equation dXt = σXt dWt, with σ constant. Let At = (1/t)∫_0^t Xs dWs be the stochastic average of Xt. Find the stochastic equation satisfied by At, and the mean and variance of At.
8.9 Existence and Uniqueness
An exploding solution
Consider the non-linear stochastic differential equation
dXt = Xt³ dt + Xt² dWt, X0 = 1/a. (8.9.22)
We shall look for a solution of the type Xt = f(Wt). Ito's formula yields
dXt = f′(Wt)dWt + (1/2)f′′(Wt)dt.
Equating the coefficients of dt and dWt in the last two equations yields
f′(Wt) = Xt² ⟹ f′(Wt) = f(Wt)² (8.9.23)
(1/2)f′′(Wt) = Xt³ ⟹ f′′(Wt) = 2f(Wt)³ (8.9.24)
We note that equation (8.9.23) implies (8.9.24) by differentiation. So it suffices to solve only the ordinary differential equation
f′(x) = f(x)², f(0) = 1/a.
Separating and integrating we have
∫ df/f² = ∫ dx ⟹ f(x) = 1/(a − x).
Hence a solution of equation (8.9.22) is
Xt = 1/(a − Wt).
Let Ta be the first time the Brownian motion Wt hits a. Then the process Xt is defined
only for 0 ≤ t < Ta . Ta is a random variable with P (Ta < ∞) = 1 and E[Ta ] = ∞, see
section 4.3.
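The explosive behaviour can be seen in simulation: on every simulated path the Brownian motion eventually hits a (so the solution blows up), yet the sample mean of the hitting time keeps growing as the time horizon increases, consistent with P(Ta < ∞) = 1 and E[Ta] = ∞. The sketch below is only an illustration; a, the step size and the horizons are assumed values.

```
# Sketch: X_t = 1/(a - W_t) explodes at T_a = inf{t : W_t = a}; the fraction of
# exploded paths approaches 1 while the sample mean of T_a keeps increasing
# with the horizon (a, dt and the horizons are illustrative).
import numpy as np

a, dt = 1.0, 0.01
rng = np.random.default_rng(4)

for T in (10.0, 100.0, 1000.0):
    n = int(T / dt)
    hit_times = []
    for _ in range(200):
        W = np.concatenate([[0.0], rng.normal(0, np.sqrt(dt), n).cumsum()])
        idx = np.argmax(W >= a)              # first index with W >= a (0 if never)
        if W[idx] >= a:
            hit_times.append(idx * dt)
    frac = len(hit_times) / 200
    mean_hit = np.mean(hit_times) if hit_times else float("nan")
    print(f"horizon {T:6.0f}: fraction hit = {frac:.2f}, mean hitting time = {mean_hit:.1f}")
```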
The following theorem is the analog of Picard’s uniqueness result from ordinary
differential equations:
Theorem 8.9.1 (Existence and Uniqueness) Consider the stochastic differential equation
dXt = b(t, Xt)dt + σ(t, Xt)dWt, X0 = c
where c is a constant and b and σ are continuous functions on [0, T] × R satisfying
1. |b(t, x)| + |σ(t, x)| ≤ C(1 + |x|), x ∈ R, t ∈ [0, T];
2. |b(t, x) − b(t, y)| + |σ(t, x) − σ(t, y)| ≤ K|x − y|, x, y ∈ R, t ∈ [0, T],
with C, K positive constants. Then there is a unique solution process Xt that is continuous and satisfies
E[∫_0^T Xt² dt] < ∞.
The first condition says that the drift and volatility increase no faster than a linear function in x. The second condition states that the functions are Lipschitz in the second argument.
Example 8.9.1 Consider the stochastic differential equation
dXt = (1/2)(√(1 + Xt²) + Xt)dt + √(1 + Xt²) dWt, X0 = x0.
(a) Solve the equation;
(b) Show that there is a unique solution.
Chapter 9
Applications of Brownian Motion
9.1 The Directional Derivative
Consider a smooth curve x : [0, ∞) → Rⁿ, starting at x(0) = x0 with the initial velocity v = x′(0). Then the derivative of a function f : Rⁿ → R in the direction v is defined by
Dv f(x0) = lim_{t↘0} [f(x(t)) − f(x0)]/t = (d/dt) f(x(t)) |_{t=0⁺}.
Applying the chain rule yields
Dv f(x0) = Σ_{k=1}^n (∂f/∂xk)(x(t)) (dxk(t)/dt) |_{t=0⁺} = Σ_{k=1}^n (∂f/∂xk)(x0) vk(0) = ⟨∇f(x0), v⟩,
where ∇f stands for the gradient of f and ⟨ , ⟩ denotes the scalar product. The linear
differential operator Dv is called the directional derivative with respect to the vector
v. In the next section we shall extend this definition to the case when the curve x(t) is
replaced by an Ito diffusion Xt ; in this case the corresponding “directional derivative”
will be a second order partial differential operator.
9.2 The Generator of an Ito Diffusion
Let (Xt )t≥0 be a stochastic process with X0 = x0 . We shall consider an operator that
describes infinitesimally the rate of change of a function which depends smoothly on
Xt .
More precisely, the generator of the stochastic process Xt is the second order partial differential operator A defined by
Af(x) = lim_{t↘0} (E[f(Xt)] − f(x))/t,
for any smooth function (at least of class C²) with compact support f : Rⁿ → R. Here E stands for the expectation operator given the initial condition X0 = x, i.e.,
E[f(Xt)] = E[f(Xt)|X0 = x] = ∫_{Rⁿ} f(y) pt(x, y) dy,
where pt(x, y) = p(x, y; t, 0) is the transition density of Xt, given X0 = x.
In the following we shall find the generator associated with the Ito diffusion
dXt = b(Xt )dt + σ(Xt )dW (t),
t ≥ 0, X0 = x,
(9.2.1)
where W (t) = W1 (t), . . . , Wm (t) is an m-dimensional Brownian motion, b : Rn → Rn
and σ : Rn → Rn×m are measurable functions.
The main tool used in deriving the formula for the generator A is Ito's formula in several variables. If we let Ft = f(Xt), then using Ito's formula we have
dFt = Σ_i (∂f/∂xi)(Xt) dXtⁱ + (1/2) Σ_{i,j} (∂²f/∂xi∂xj)(Xt) dXtⁱ dXtʲ, (9.2.2)
where Xt = (Xt¹, ..., Xtⁿ) satisfies the Ito diffusion (9.2.1) on components, i.e.,
dXtⁱ = bi(Xt)dt + [σ(Xt) dW(t)]i = bi(Xt)dt + Σ_k σik dWk(t). (9.2.3)
Using the stochastic relations dt² = dt dWk(t) = 0 and dWk(t) dWr(t) = δ_{kr} dt, a computation provides
dXtⁱ dXtʲ = (bi dt + Σ_k σik dWk(t))(bj dt + Σ_k σjk dWk(t))
          = Σ_{k,r} σik σjr dWk(t) dWr(t) = Σ_k σik σjk dt = (σσᵀ)ij dt.
Therefore
dXtⁱ dXtʲ = (σσᵀ)ij dt. (9.2.4)
Substituting (9.2.3) and (9.2.4) into (9.2.2) yields
dFt = [Σ_i bi(Xt)(∂f/∂xi)(Xt) + (1/2) Σ_{i,j} (σσᵀ)ij (∂²f/∂xi∂xj)(Xt)] dt + Σ_{i,k} (∂f/∂xi)(Xt) σik(Xt) dWk(t).
Integrating, we obtain
Ft = F0 + ∫_0^t [Σ_i bi (∂f/∂xi) + (1/2) Σ_{i,j} (σσᵀ)ij (∂²f/∂xi∂xj)](Xs) ds + Σ_k ∫_0^t Σ_i σik (∂f/∂xi)(Xs) dWk(s).
Since F0 = f(X0) = f(x) and E[f(x)] = f(x), applying the expectation operator in the previous relation we obtain
E[Ft] = f(x) + E[∫_0^t (Σ_i bi (∂f/∂xi) + (1/2) Σ_{i,j} (σσᵀ)ij (∂²f/∂xi∂xj))(Xs) ds]. (9.2.5)
Using the commutativity between the operator E and the integral ∫_0^t, and applying l'Hospital's rule (see Exercise 9.2.7), yields
lim_{t↘0} (E[Ft] − f(x))/t = (1/2) Σ_{i,j} (σσᵀ)ij ∂²f(x)/∂xi∂xj + Σ_k bk ∂f(x)/∂xk.
We conclude the previous computations with the following result.
Theorem 9.2.1 The generator of the Ito diffusion (9.2.1) is given by
A = (1/2) Σ_{i,j} (σσᵀ)ij ∂²/∂xi∂xj + Σ_k bk ∂/∂xk. (9.2.6)
The matrix σ is called the dispersion matrix and the product σσᵀ is called the diffusion matrix. These names are related to their physical significance. Substituting (9.2.6) in (9.2.5) we obtain the following formula
E[f(Xt)] = f(x) + E[∫_0^t Af(Xs) ds], (9.2.7)
for any f ∈ C0²(Rⁿ).
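The defining limit of the generator can be checked by Monte Carlo for a concrete diffusion. The sketch below uses the one-dimensional geometric Brownian motion dX = µX dt + σX dW, whose generator acts as Af(x) = µx f′(x) + (1/2)σ²x² f′′(x); the function f(x) = x², the point x and the parameters are illustrative assumptions.

```
# Sketch: Monte Carlo check of (E[f(X_t)] - f(x))/t -> A f(x) for small t,
# for dX = mu*X dt + sigma*X dW and f(x) = x^2, where
# A f(x) = mu*x*f'(x) + 0.5*sigma^2*x^2*f''(x) (values are illustrative).
import numpy as np

mu, sigma, x = 0.2, 0.3, 1.5
f = lambda y: y**2
Af = mu * x * (2 * x) + 0.5 * sigma**2 * x**2 * 2    # generator applied to f at x

rng = np.random.default_rng(5)
t, n_paths = 1e-3, 2_000_000
W = rng.normal(0.0, np.sqrt(t), n_paths)
Xt = x * np.exp((mu - sigma**2 / 2) * t + sigma * W)  # exact GBM step from x

estimate = (f(Xt).mean() - f(x)) / t
print("MC estimate:", estimate, "  A f(x):", Af)
```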
Exercise 9.2.2 Find the generator operator associated with the n-dimensional Brownian motion.
Exercise 9.2.3 Find the Ito diffusion corresponding to the generator Af(x) = f′′(x) + f′(x).
Exercise 9.2.4 Let ∆G = (1/2)(∂²_{x1} + x1² ∂²_{x2}) be the Grushin operator.
(a) Find the diffusion process associated with the generator ∆G.
(b) Find the diffusion and dispersion matrices and show that they are degenerate.
Exercise 9.2.5 Let Xt and Yt be two one-dimensional independent Ito diffusions with
infinitesimal generators AX and AY . Let Zt = (Xt , Yt ) with the infinitesimal generator
AZ . Show that AZ = AX + AY .
Exercise 9.2.6 Let Xt be an Ito diffusion with infinitesimal generator AX . Consider
the process Yt = (t, Xt ). Show that the infinitesimal generator of Yt is given by AY =
∂t + AX .
Exercise 9.2.7 Let Xt be an Ito diffusion with X0 = x, and ϕ a smooth function. Use l'Hospital's rule to show that
lim_{t→0⁺} (1/t) E[∫_0^t ϕ(Xs) ds] = ϕ(x).
9.3 Dynkin's Formula
Formula (9.2.7) holds under more general conditions, when t is a stopping time. First we need the following result, which deals with a continuity-type property in the upper limit of an Ito integral.
Lemma 9.3.1 Let g be a bounded measurable function and τ be a stopping time for Xt with E[τ] < ∞. Then
lim_{k→∞} E[∫_0^{τ∧k} g(Xs) dWs] = E[∫_0^τ g(Xs) dWs]; (9.3.8)
lim_{k→∞} E[∫_0^{τ∧k} g(Xs) ds] = E[∫_0^τ g(Xs) ds]. (9.3.9)
Proof: Let |g| < K. Using the properties of Ito integrals, we have
E[(∫_0^τ g(Xs) dWs − ∫_0^{τ∧k} g(Xs) dWs)²] = E[(∫_{τ∧k}^τ g(Xs) dWs)²] = E[∫_{τ∧k}^τ g²(Xs) ds] ≤ K² E[τ − τ∧k] → 0, k → ∞.
Since E[X]² ≤ E[X²], it follows that
E[∫_0^τ g(Xs) dWs − ∫_0^{τ∧k} g(Xs) dWs] → 0, k → ∞,
which is equivalent to relation (9.3.8).
The second relation can be proved similarly and is left as an exercise for the reader.
Exercise 9.3.2 Assume the hypothesis of the previous lemma. Let 1_{s<τ} be the characteristic function of the interval (−∞, τ):
1_{s<τ}(u) = 1, if u < τ; 0, otherwise.
Show that
(a) ∫_0^{τ∧k} g(Xs) dWs = ∫_0^k 1_{s<τ} g(Xs) dWs,
(b) ∫_0^{τ∧k} g(Xs) ds = ∫_0^k 1_{s<τ} g(Xs) ds.
Theorem 9.3.3 (Dynkin's formula) Let f ∈ C0²(Rⁿ), and Xt be an Ito diffusion starting at x. If τ is a stopping time with E[τ] < ∞, then
E[f(Xτ)] = f(x) + E[∫_0^τ Af(Xs) ds], (9.3.10)
where A is the infinitesimal generator of Xt .
Proof: Replace t by k and f by 1_{s<τ} f in (9.2.7) and obtain
E[1_{s<τ} f(Xk)] = 1_{s<τ} f(x) + E[∫_0^k A(1_{s<τ} f)(Xs) ds],
which can be written as
E[f(X_{k∧τ})] = 1_{s<τ} f(x) + E[∫_0^k 1_{s<τ}(s) A(f)(Xs) ds]
             = 1_{s<τ} f(x) + E[∫_0^{k∧τ} A(f)(Xs) ds]. (9.3.11)
Since by Lemma 9.3.1 (b)
E[f(X_{k∧τ})] → E[f(Xτ)], k → ∞,
E[∫_0^{k∧τ} A(f)(Xs) ds] → E[∫_0^τ A(f)(Xs) ds], k → ∞,
using Exercise 9.3.2, relation (9.3.11) yields (9.3.10).
Exercise 9.3.4 Write Dynkin’s formula for the case of a function f (t, Xt ). Use Exercise 9.2.6.
In the following sections we shall present a few important results of stochastic
calculus that can be obtained as direct consequences of Dynkin’s formula.
9.4 Kolmogorov's Backward Equation
For any function f ∈ C02 (Rn ) let v(t, x) = E[f (Xt )], given that X0 = x. The operator
E denotes the expectation given the initial condition X0 = x. Then v(0, x) = f (x),
and differentiating in Dynkin’s formula (9.2.7)
v(t, x) = f(x) + ∫_0^t E[Af(Xs)] ds
yields
∂v/∂t = E[Af(Xt)].
Since we are allowed to differentiate inside of an integral, the operators A and E commute:
E[Af(Xt)] = AE[f(Xt)].
Therefore
∂v/∂t = E[Af(Xt)] = AE[f(Xt)] = Av(t, x).
Hence, we arrived at the following result.
Theorem 9.4.1 (Kolmogorov's backward equation) For any f ∈ C0²(Rⁿ) the function v(t, x) = E[f(Xt)] satisfies the following Cauchy problem
∂v/∂t = Av, t > 0,
v(0, x) = f(x),
where A denotes the generator of the Ito diffusion (9.2.1).
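For Brownian motion the generator is A = (1/2) d²/dx², so v(t, x) = E[f(x + Wt)] solves the heat equation v_t = (1/2)v_xx with v(0, x) = f(x). The sketch below checks this by Monte Carlo for the assumed choice f(x) = cos x, for which E[cos(x + Wt)] = e^{−t/2} cos x can be computed exactly; x and t are illustrative values.

```
# Sketch: for X_t = x + W_t the function v(t,x) = E[f(x + W_t)] solves
# v_t = (1/2) v_xx, v(0,x) = f(x). For f(x) = cos(x) the exact value is
# v(t,x) = exp(-t/2)*cos(x) (x, t are illustrative).
import numpy as np

x, t = 0.7, 0.5
rng = np.random.default_rng(6)
W = rng.normal(0.0, np.sqrt(t), 1_000_000)

v_mc = np.cos(x + W).mean()
v_exact = np.exp(-t / 2) * np.cos(x)
print("Monte Carlo v(t,x):", v_mc, "  exact:", v_exact)
```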
9.5 Exit Time from an Interval
Let Xt = x0 + Wt be a one-dimensional Brownian motion starting at x0, with x0 ∈ (a, b). Consider the exit time of the process Xt from the strip (a, b)
τ = inf{t > 0; Xt ∉ (a, b)}.
Assuming E[τ] < ∞, applying Dynkin's formula yields
E[f(Xτ)] = f(x0) + E[∫_0^τ (1/2) (d²f/dx²)(Xs) ds]. (9.5.12)
Choosing f(x) = x in (9.5.12) we obtain
E[Xτ] = x0. (9.5.13)
Exercise 9.5.1 Prove relation (9.5.13) using the Optional Stopping Theorem for the
martingale Xt .
Let pa = P(Xτ = a) and pb = P(Xτ = b) be the exit probabilities of Xt from the interval (a, b). Obviously, pa + pb = 1, since the probability that the Brownian motion will stay forever inside the bounded interval is zero. Using the definition of expectation, relation (9.5.13) yields
a pa + b(1 − pa) = x0.
Solving for pa and pb we get the following exit probabilities
pa = (b − x0)/(b − a) (9.5.14)
pb = 1 − pa = (x0 − a)/(b − a). (9.5.15)
It is worth noting that if b → ∞ then pa → 1 and if a → −∞ then pb → 1. This can
be stated by saying that a Brownian motion starting at x0 reaches any level (below or
above x0 ) with probability 1.
Next we shall compute the mean of the exit time, E[τ]. Choosing f(x) = x² in (9.5.12) yields
E[(Xτ)²] = x0² + E[τ].
From the definition of the mean and formulas (9.5.14)-(9.5.15) we obtain
E[τ] = a² pa + b² pb − x0² = a² (b − x0)/(b − a) + b² (x0 − a)/(b − a) − x0²
     = [a²b − ab² + x0(b − a)(b + a)]/(b − a) − x0²
     = −ab + x0(b + a) − x0²
     = (b − x0)(x0 − a). (9.5.16)
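Both the exit probabilities (9.5.14)-(9.5.15) and the mean exit time (9.5.16) are easy to check by simulation. The sketch below uses illustrative values for a, b, x0 and the time step; a small bias due to time discretization is to be expected.

```
# Sketch: Monte Carlo check of P(X_tau = a) = (b - x0)/(b - a) and of
# E[tau] = (b - x0)(x0 - a); a, b, x0 and dt are illustrative assumptions.
import numpy as np

a, b, x0 = -1.0, 2.0, 0.5
dt, n_paths = 1e-3, 5000
rng = np.random.default_rng(7)

hits_a, taus = 0, []
for _ in range(n_paths):
    x, t = x0, 0.0
    while a < x < b:
        x += rng.normal(0.0, np.sqrt(dt))
        t += dt
    hits_a += x <= a
    taus.append(t)

print("P(exit at a):", hits_a / n_paths, "  theory:", (b - x0) / (b - a))
print("E[tau]      :", np.mean(taus),    "  theory:", (b - x0) * (x0 - a))
```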
Exercise 9.5.2 (a) Show that the equation x² − (b − a)x + E[τ] = 0 cannot have complex roots;
(b) Prove that E[τ] ≤ (b − a)²/4;
(c) Find the point x0 ∈ (a, b) such that the expectation of the exit time, E[τ], is maximum.
9.6 Transience and Recurrence of Brownian Motion
We shall consider first the expectation of the exit time from a ball. Then we shall
extend it to an annulus and compute the transience probabilities.
Figure 9.1: The Brownian motion Xt in the ball B(0, R).
1. Consider the process Xt = a + W(t), where W(t) = (W1(t), . . . , Wn(t)) is an n-dimensional Brownian motion, and a = (a1, . . . , an) ∈ Rⁿ is a fixed vector, see Fig. 9.1. Let R > 0 be such that R > |a|. Consider the exit time of the process Xt from the ball B(0, R)
τ = inf{t > 0; |Xt| > R}. (9.6.17)
Assuming E[τ] < ∞ and letting f(x) = |x|² = x1² + · · · + xn² in Dynkin's formula
E[f(Xτ)] = f(x) + E[∫_0^τ (1/2)∆f(Xs) ds]
yields
R² = |a|² + E[∫_0^τ n ds],
and hence
E[τ] = (R² − |a|²)/n. (9.6.18)
In particular, if the Brownian motion starts from the center, i.e. a = 0, the expectation of the exit time is
E[τ] = R²/n.
(i) Since R²/2 > R²/3, the previous relation implies that it takes longer, on average, for a Brownian motion to exit a disk of radius R than a ball of the same radius.
(ii) The expected time for a Brownian motion to leave the interval (−R, R) is twice the expected time for a 2-dimensional Brownian motion to exit the disk B(0, R).
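Formula (9.6.18) and the dimension dependence noted in (i)-(ii) can be checked by simulating the exit time from the ball in a few dimensions. In the sketch below R, the starting point, the time step and the number of paths are illustrative assumptions.

```
# Sketch: Monte Carlo check of E[tau] = (R^2 - |a|^2)/n for the exit time of
# a + W(t) from the ball B(0,R); R, a and dt are illustrative.
import numpy as np

R, dt, n_paths = 1.0, 1e-3, 2000
rng = np.random.default_rng(8)

for n in (1, 2, 3):
    a = np.zeros(n); a[0] = 0.3           # starting point with |a| = 0.3
    taus = []
    for _ in range(n_paths):
        x, t = a.copy(), 0.0
        while np.dot(x, x) < R**2:
            x += rng.normal(0.0, np.sqrt(dt), n)
            t += dt
        taus.append(t)
    print(f"n={n}:  MC E[tau] = {np.mean(taus):.3f}   theory = {(R**2 - 0.3**2)/n:.3f}")
```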
Exercise 9.6.1 Prove that E[τ ] < ∞, where τ is given by (9.6.17).
Exercise 9.6.2 Apply the Optional Stopping Theorem for the martingale Mt = Wt² − t to show that E[τ] = R², where
τ = inf{t > 0; |Wt| > R}
is the first exit time of the Brownian motion from (−R, R).
2. Let b ∈ Rⁿ such that b ∉ B(0, R), i.e. |b| > R, and consider the annulus
Ak = {x; R < |x| < kR},
where k > 0 is such that b ∈ Ak. Consider the process Xt = b + W(t) and let
τk = inf{t > 0; Xt ∉ Ak}
be the first exit time of Xt from the annulus Ak. Let f : Ak → R be defined by
f(x) = −ln |x|, if n = 2;  f(x) = 1/|x|^{n−2}, if n > 2.
A straightforward computation shows that ∆f = 0. Substituting into Dynkin's formula
E[f(X_{τk})] = f(b) + E[∫_0^{τk} (1/2)∆f(Xs) ds]
yields
E[f(X_{τk})] = f(b). (9.6.19)
This can be stated by saying that the value of f at a point b in the annulus is equal to the expected value of f at the first exit time of a Brownian motion starting at b.
Since |X_{τk}| is a random variable with two outcomes, we have
E[f(X_{τk})] = pk f(R) + qk f(kR),
where pk = P(|X_{τk}| = R), qk = P(|X_{τk}| = kR) and pk + qk = 1. Substituting in (9.6.19) yields
pk f(R) + qk f(kR) = f(b). (9.6.20)
There are the following two distinguished cases:
(i) If n = 2 we obtain
−pk ln R − qk(ln k + ln R) = −ln |b|.
Using pk = 1 − qk, solving for pk yields
pk = 1 − ln(|b|/R)/ln k.
Hence
P(τ < ∞) = lim_{k→∞} pk = 1,
where τ = inf{t > 0; |Xt| < R} is the first time Xt hits the ball B(0, R). Hence in R² a Brownian motion hits any ball with probability 1. This is stated equivalently by saying that the Brownian motion is recurrent in R².
(ii) If n > 2 the equation (9.6.20) becomes
pk/R^{n−2} + qk/(k^{n−2} R^{n−2}) = 1/|b|^{n−2}.
Taking the limit k → ∞ yields
lim_{k→∞} pk = (R/|b|)^{n−2} < 1.
Then in Rⁿ, n > 2, a Brownian motion starting outside of a ball hits it with a probability less than 1. This is usually stated by saying that the Brownian motion is transient.
3. We shall recover the previous results using the n-dimensional Bessel process
Rt = dist(0, W(t)) = √(W1(t)² + · · · + Wn(t)²).
Consider the process Yt = α + Rt, with 0 ≤ α < R, see section 3.7. It can be shown that the generator of Yt is the Bessel operator of order n
A = (1/2) d²/dx² + ((n − 1)/(2x)) d/dx.
Consider the exit time
τ = inf{t > 0; Yt > R}.
Applying Dynkin's formula
E[f(Yτ)] = f(Y0) + E[∫_0^τ (Af)(Ys) ds]
for f(x) = x² yields R² = α² + E[∫_0^τ n ds]. This leads to
E[τ] = (R² − α²)/n,
which recovers (9.6.18) with α = |a|.
In the following assume n ≥ 3 and consider the annulus
Ar,R = {x ∈ Rn ; r < |x| < R}.
Figure 9.2: The Brownian motion Xt in the annulus Ar,R .
Consider the stopping time τ = inf{t > 0; Xt ∉ A_{r,R}} = inf{t > 0; Yt ∉ (r, R)}, where |Y0| = α ∈ (r, R). Applying Dynkin's formula for f(x) = x^{2−n} yields E[f(Yτ)] = f(α). This can be written as
pr r^{2−n} + pR R^{2−n} = α^{2−n},
where
pr = P(|Xτ| = r), pR = P(|Xτ| = R), pr + pR = 1.
Solving for pr and pR yields
pr = [(R/α)^{n−2} − 1]/[(R/r)^{n−2} − 1],  pR = [(r/α)^{n−2} − 1]/[(r/R)^{n−2} − 1].
The transience probability is obtained by taking the limit to infinity
pr = lim_{R→∞} p_{r,R} = lim_{R→∞} [α^{2−n} R^{n−2} − 1]/[r^{2−n} R^{n−2} − 1] = (r/α)^{n−2},
where pr is the probability that a Brownian motion starting outside the ball of radius r will hit the ball, see Fig. 9.2.
Exercise 9.6.3 Solve the equation (1/2)f′′(x) + ((n − 1)/(2x)) f′(x) = 0 by looking for a solution of monomial type f(x) = x^k.
9.7 Application to Parabolic Equations
This section deals with solving first and second order parabolic equations using the
integral of the cost function along a certain characteristic solution. The first order
equations are related to predictable characteristic curves, while the second order equations depend on stochastic characteristic curves.
1. Predictable characteristics. Let ϕ(s) be the solution of the following one-dimensional ODE
dX(s)/ds = a(s, X(s)), X(t) = x, t ≤ s ≤ T,
and define the cumulative cost between t and T along the solution ϕ
u(t, x) = ∫_t^T c(s, ϕ(s)) ds, (9.7.21)
where c denotes a continuous cost function. Differentiating both sides with respect to t,
(∂/∂t) u(t, ϕ(t)) = (∂/∂t) ∫_t^T c(s, ϕ(s)) ds
∂t u + ∂x u · ϕ′(t) = −c(t, ϕ(t)).
Hence (9.7.21) is a solution of the following final value problem
∂t u(t, x) + a(t, x) ∂x u(t, x) = −c(t, x)
u(T, x) = 0.
It is worth mentioning that this is a variant of the method of characteristics.1 The
curve given by the solution ϕ(s) is called a characteristic curve.
Exercise 9.7.1 Using the previous method solve the following final boundary problems:
(a) ∂t u + x ∂x u = −x, u(T, x) = 0.
(b) ∂t u + tx ∂x u = ln x, x > 0, u(T, x) = 0.
¹This is a well known method of solving linear partial differential equations.
2. Stochastic characteristics. Consider the diffusion
dXs = a(s, Xs)ds + b(s, Xs)dWs, Xt = x, t ≤ s ≤ T,
and define the stochastic cumulative cost function
u(t, Xt) = ∫_t^T c(s, Xs) ds, (9.7.22)
with the conditional expectation
u(t, x) = E[u(t, Xt)|Xt = x] = E[∫_t^T c(s, Xs) ds | Xt = x].
Taking increments in both sides of (9.7.22) yields
du(t, Xt) = d ∫_t^T c(s, Xs) ds.
Applying Ito's formula on one side and the Fundamental Theorem of Calculus on the other, we obtain
∂t u(t, Xt)dt + ∂x u(t, Xt)dXt + (1/2) ∂x² u(t, Xt)(dXt)² = −c(t, Xt)dt.
Taking the expectation E[·|Xt = x] on both sides yields
∂t u(t, x)dt + ∂x u(t, x) a(t, x)dt + (1/2) ∂x² u(t, x) b²(t, x)dt = −c(t, x)dt.
Hence, the expected cost
u(t, x) = E[∫_t^T c(s, Xs) ds | Xt = x]
is a solution of the following second order parabolic equation
∂t u + a(t, x) ∂x u + (1/2) b²(t, x) ∂x² u = −c(t, x)
u(T, x) = 0.
This represents the probabilistic interpretation of the solution of a parabolic equation.
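The probabilistic representation above can be used numerically: simulating paths of the diffusion and averaging the accumulated cost gives an approximation of the solution of the parabolic equation. The sketch below takes the assumed special case a = 0, b = 1, c(s, x) = x, for which the equation ∂t u + (1/2)∂x² u = −x with u(T, x) = 0 has the exact solution u(t, x) = x(T − t); the point (t, x) and the grid are illustrative.

```
# Sketch: Monte Carlo evaluation of u(t,x) = E[ int_t^T c(s,X_s) ds | X_t = x ]
# for dX_s = dW_s (a=0, b=1) and cost c(s,x) = x; here the parabolic equation
# u_t + (1/2) u_xx = -x, u(T,x) = 0 has exact solution u(t,x) = x*(T - t).
import numpy as np

x, t, T = 1.2, 0.0, 1.0
n_steps, n_paths = 200, 100000
ds = (T - t) / n_steps
rng = np.random.default_rng(9)

X = np.full(n_paths, x)
cost = np.zeros(n_paths)
for _ in range(n_steps):
    cost += X * ds                       # accumulate c(s, X_s) ds with c(s,x) = x
    X += rng.normal(0.0, np.sqrt(ds), n_paths)

print("Monte Carlo u(t,x):", cost.mean(), "  exact:", x * (T - t))
```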
Exercise 9.7.2 Solve the following final boundary problems:
(a) ∂t u + ∂x u + (1/2) ∂x² u = −x, u(T, x) = 0.
(b) ∂t u + ∂x u + (1/2) ∂x² u = e^x, u(T, x) = 0.
(c) ∂t u + µx ∂x u + (1/2) σ²x² ∂x² u = −x, u(T, x) = 0.
Chapter 10
Martingales and Girsanov's Theorem
10.1 Examples of Martingales
In this section we shall use the knowledge acquired in previous chapters to present a
few important examples of martingales. These will be useful in the proof of Girsanov’s
theorem in the next section.
We start by recalling that a process Xt , 0 ≤ t ≤ T , is an Ft -martingale if
1. E[|Xt |] < ∞ (Xt integrable for each t);
2. Xt is Ft -measurable;
3. The forecast of future values is the last observation: E[Xt |Fs ] = Xs , ∀s < t.
We shall present three important examples of martingales and some of their particular
cases.
Example 10.1.1 If v(s) is a continuous function on [0, T], then
Xt = ∫_0^t v(s) dWs
is an Ft-martingale.
Taking out the predictable part,
E[Xt|Fs] = E[∫_0^s v(τ) dWτ + ∫_s^t v(τ) dWτ | Fs] = Xs + E[∫_s^t v(τ) dWτ | Fs] = Xs,
where we used that ∫_s^t v(τ) dWτ is independent of Fs and the conditional expectation equals the usual expectation
E[∫_s^t v(τ) dWτ | Fs] = E[∫_s^t v(τ) dWτ] = 0.
Example 10.1.2 Let Xt = ∫_0^t v(s) dWs be a process as in Example 10.1.1. Then
Mt = Xt² − ∫_0^t v²(s) ds
is an Ft-martingale.
The process Xt satisfies the stochastic equation dXt = v(t)dWt. By Ito's formula
d(Xt²) = 2Xt dXt + (dXt)² = 2v(t)Xt dWt + v²(t)dt. (10.1.1)
Integrating between s and t yields
Xt² − Xs² = 2 ∫_s^t Xτ v(τ) dWτ + ∫_s^t v²(τ) dτ.
Then separating the predictable from the unpredictable part we have
E[Mt|Fs] = E[Xt² − ∫_0^t v²(τ) dτ | Fs]
         = E[Xt² − Xs² − ∫_s^t v²(τ) dτ + Xs² − ∫_0^s v²(τ) dτ | Fs]
         = Xs² − ∫_0^s v²(τ) dτ + E[Xt² − Xs² − ∫_s^t v²(τ) dτ | Fs]
         = Ms + 2E[∫_s^t Xτ v(τ) dWτ | Fs] = Ms,
where we used relation (10.1.1) and that ∫_s^t Xτ v(τ) dWτ is totally unpredictable given the information set Fs. In the following we shall mention a few particular cases.
1. If v(s) = 1, then Xt = Wt. In this case Mt = Wt² − t is an Ft-martingale.
2. If v(s) = s, then Xt = ∫_0^t s dWs, and hence
Mt = (∫_0^t s dWs)² − t³/3
is an Ft-martingale.
Example 10.1.3 Let u : [0, T] → R be a continuous function. Then
Mt = e^{∫_0^t u(s) dWs − (1/2)∫_0^t u²(s) ds}
is an Ft-martingale for 0 ≤ t ≤ T.
Consider the process Ut = ∫_0^t u(s) dWs − (1/2)∫_0^t u²(s) ds. Then
dUt = u(t)dWt − (1/2)u²(t)dt,
(dUt)² = u²(t)dt.
Then Ito's formula yields
dMt = d(e^{Ut}) = e^{Ut} dUt + (1/2)e^{Ut}(dUt)²
    = e^{Ut}(u(t)dWt − (1/2)u²(t)dt + (1/2)u²(t)dt)
    = u(t)Mt dWt.
Integrating between s and t yields
Mt = Ms + ∫_s^t u(τ)Mτ dWτ.
Since ∫_s^t u(τ)Mτ dWτ is independent of Fs, then
E[∫_s^t u(τ)Mτ dWτ | Fs] = E[∫_s^t u(τ)Mτ dWτ] = 0,
and hence
E[Mt|Fs] = E[Ms + ∫_s^t u(τ)Mτ dWτ | Fs] = Ms.
Remark 10.1.4 The condition that u(s) is continuous on [0, T] can be relaxed by asking only
u ∈ L²[0, T] = {u : [0, T] → R; measurable and ∫_0^T |u(s)|² ds < ∞}.
It is worth noting that the conclusion still holds if the function u(s) is replaced by a stochastic process u(t, ω) satisfying Novikov's condition
E[e^{(1/2)∫_0^T u²(s,ω) ds}] < ∞.
The previous process has a distinguished importance in the theory of martingales and it will be useful when proving the Girsanov theorem.
Definition 10.1.5 Let u ∈ L²[0, T] be a deterministic function. Then the stochastic process
Mt = e^{∫_0^t u(s) dWs − (1/2)∫_0^t u²(s) ds}
is called the exponential process induced by u.
Particular cases of exponential processes.
1. Let u(s) = σ, constant; then Mt = e^{σWt − σ²t/2} is an Ft-martingale.
2. Let u(s) = s. Integrating in d(tWt) = t dWt + Wt dt yields
∫_0^t s dWs = tWt − ∫_0^t Ws ds.
Let Zt = ∫_0^t Ws ds be the integrated Brownian motion. Then
Mt = e^{∫_0^t s dWs − (1/2)∫_0^t s² ds} = e^{tWt − Zt − t³/6}
is an Ft-martingale.
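A basic consequence of the martingale property of the first particular case is that E[Mt] = E[M0] = 1 for all t. The sketch below checks this by sampling Wt directly; σ and the time points are illustrative assumptions.

```
# Sketch: sample check that M_t = exp(sigma*W_t - sigma^2*t/2) has constant
# expectation E[M_t] = 1 (sigma and the time points are illustrative).
import numpy as np

sigma, n_paths = 0.8, 1_000_000
rng = np.random.default_rng(10)

for t in (0.5, 1.0, 2.0):
    W = rng.normal(0.0, np.sqrt(t), n_paths)
    M = np.exp(sigma * W - 0.5 * sigma**2 * t)
    print(f"t = {t}:  E[M_t] ~ {M.mean():.4f}   (theory: 1)")
```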
Example 10.1.6 Let Xt be a solution of dXt = u(t)dt + dWt, with u(s) a bounded function. Consider the exponential process
Mt = e^{−∫_0^t u(s) dWs − (1/2)∫_0^t u²(s) ds}.
Then Yt = Mt Xt is an Ft-martingale.
In Example 10.1.3 we obtained dMt = −u(t)Mt dWt. Then
dMt dXt = −u(t)Mt dt.
The product rule yields
dYt = Mt dXt + Xt dMt + dMt dXt
    = Mt(u(t)dt + dWt) − Xt u(t)Mt dWt − u(t)Mt dt
    = Mt(1 − u(t)Xt) dWt.
Integrating between s and t yields
Yt = Ys + ∫_s^t Mτ(1 − u(τ)Xτ) dWτ. (10.1.2)
Since ∫_s^t Mτ(1 − u(τ)Xτ) dWτ is independent of Fs,
E[∫_s^t Mτ(1 − u(τ)Xτ) dWτ | Fs] = E[∫_s^t Mτ(1 − u(τ)Xτ) dWτ] = 0,
and hence
E[Yt|Fs] = Ys.
Exercise 10.1.7 Prove that (Wt + t)e^{−Wt − t/2} is an Ft-martingale.
Exercise 10.1.8 Let h be a continuous function. Using the properties of the Wiener integral and log-normal random variables, show that
E[e^{∫_0^t h(s) dWs}] = e^{(1/2)∫_0^t h(s)² ds}.
Exercise 10.1.9 Let Mt be the exponential process (10.1.2). Use the previous exercise to show that for any t > 0
(a) E[Mt] = 1;
(b) E[Mt²] = e^{∫_0^t u(s)² ds}.
Exercise 10.1.10 Let Ft = σ{Wu ; u ≤ t}. Show that the following processes are
Ft -martingales:
(a) et/2 cos Wt ;
(b) et/2 sin Wt .
Recall that the Laplacian of a twice differentiable function f is defined by ∆f(x) = Σ_{j=1}^n ∂²_{xj} f.
Example 10.1.11 Consider the smooth function f : Rn → R, such that
(i) ∆f = 0;
(ii) E |f (Wt )| < ∞, ∀t > 0 and x ∈ R.
Then the process Xt = f (Wt ) is an Ft -martingale.
Proof: It follows from the more general Example 10.1.13.
Exercise 10.1.12 Let W1 (t) and W2 (t) be two independent Brownian motions. Show
that Xt = eW1 (t) cos W2 (t) is a martingale.
Example 10.1.13 Let f : Rⁿ → R be a smooth function such that
(i) E[|f(Wt)|] < ∞;
(ii) E[∫_0^t |∆f(Ws)| ds] < ∞.
Then the process Xt = f(Wt) − (1/2)∫_0^t ∆f(Ws) ds is a martingale.
Proof: For 0 ≤ s < t we have
E[Xt|Fs] = E[f(Wt)|Fs] − E[(1/2)∫_0^t ∆f(Wu) du | Fs]
         = E[f(Wt)|Fs] − (1/2)∫_0^s ∆f(Wu) du − (1/2)∫_s^t E[∆f(Wu)|Fs] du. (10.1.3)
Let p(t, y, x) be the probability density function of Wt. Integrating by parts and using that p satisfies Kolmogorov's backward equation, we have
(1/2)E[∆f(Wu)|Fs] = (1/2)∫ p(u − s, Ws, x) ∆f(x) dx = (1/2)∫ ∆x p(u − s, Ws, x) f(x) dx
                  = ∫ (∂/∂u) p(u − s, Ws, x) f(x) dx.
Then, using the Fundamental Theorem of Calculus, we obtain
∫_s^t (1/2)E[∆f(Wu)|Fs] du = ∫_s^t ∫ (∂/∂u) p(u − s, Ws, x) f(x) dx du
                           = ∫ p(t − s, Ws, x) f(x) dx − lim_{ε↘0} ∫ p(ε, Ws, x) f(x) dx
                           = E[f(Wt)|Fs] − ∫ δ(x − Ws) f(x) dx
                           = E[f(Wt)|Fs] − f(Ws).
Substituting in (10.1.3) yields
E[Xt|Fs] = E[f(Wt)|Fs] − (1/2)∫_0^s ∆f(Wu) du − E[f(Wt)|Fs] + f(Ws) (10.1.4)
         = f(Ws) − (1/2)∫_0^s ∆f(Wu) du (10.1.5)
         = Xs. (10.1.6)
Hence Xt is an Ft-martingale.
Exercise 10.1.14 Use Example 10.1.13 to show that the following processes are martingales:
(a) Xt = Wt² − t;
(b) Xt = Wt³ − 3∫_0^t Ws ds;
(c) Xt = (1/(n(n−1))) Wtⁿ − (1/2)∫_0^t Ws^{n−2} ds;
(d) Xt = e^{cWt} − (1/2)c² ∫_0^t e^{cWs} ds, with c constant;
(e) Xt = sin(cWt) + (1/2)c² ∫_0^t sin(cWs) ds, with c constant.
Exercise 10.1.15 Let f : Rⁿ → R be a function such that
(i) E[|f(Wt)|] < ∞;
(ii) ∆f = λf, λ constant.
Show that the process Xt = f(Wt) − (λ/2)∫_0^t f(Ws) ds is a martingale.
10.2 Girsanov's Theorem
In this section we shall present and prove a particular version of Girsanov’s theorem,
which will suffice for the purpose of later applications. The application of Girsanov’s
theorem to finance is to show that in the study of security markets the differences
between the mean rates of return can be removed.
We shall recall first a few basic notions. Let (Ω, F, P ) be a probability space. When
dealing with an Ft -martingale on the aforementioned probability space, the filtration
Ft is considered to be the sigma-algebra generated by the given Brownian motion Wt ,
i.e. Ft = σ{Wu ; 0 ≤ u ≤ s}. By default, a martingale is considered with respect to the
probability measure P , in the sense that the expectations involve an integration with
respect to P
E^P[X] = ∫_Ω X(ω) dP(ω).
We have not used the superscript until now since there was no doubt which probability measure was used. In this section we shall also use another probability measure given by
dQ = MT dP,
where MT is an exponential process. This means that Q : F → R is given by
Q(A) = ∫_A dQ = ∫_A MT dP, ∀A ∈ F.
Since MT > 0 and M0 = 1, using the martingale property of Mt yields
Q(A) > 0, A ≠ Ø;
Q(Ω) = ∫_Ω MT dP = E^P[MT] = E^P[MT|F0] = M0 = 1,
which shows that Q is a probability on F, and hence (Ω, F, Q) becomes a probability space. The following transformation of expectation from the probability measure Q to P will be useful in part II. If X is a random variable,
E^Q[X] = ∫_Ω X(ω) dQ(ω) = ∫_Ω X(ω)MT(ω) dP(ω) = E^P[X MT].
The following result will play a central role in proving Girsanov’s theorem:
Lemma 10.2.1 Let Xt be the Ito process
dXt = u(t)dt + dWt, X0 = 0, 0 ≤ t ≤ T,
with u(s) a bounded function. Consider the exponential process
Mt = e^{−∫_0^t u(s) dWs − (1/2)∫_0^t u²(s) ds}.
Then Xt is an Ft-martingale with respect to the measure
dQ(ω) = MT(ω)dP(ω).
Proof: We need to prove that Xt is an Ft-martingale with respect to Q, so it suffices to show the following three properties:
1. Integrability of Xt. This part usually follows from standard norm estimates. We shall do it here in detail, but we shall omit it in other proofs. Integrating in the equation of Xt between 0 and t yields
Xt = ∫_0^t u(s) ds + Wt. (10.2.7)
We start with an estimation of the expectation with respect to P:
E^P[Xt²] = E^P[(∫_0^t u(s) ds)² + 2(∫_0^t u(s) ds)Wt + Wt²]
         = (∫_0^t u(s) ds)² + 2(∫_0^t u(s) ds)E^P[Wt] + E^P[Wt²]
         = (∫_0^t u(s) ds)² + t < ∞, ∀0 ≤ t ≤ T,
where the last inequality follows from the norm estimation
∫_0^t u(s) ds ≤ ∫_0^t |u(s)| ds ≤ t^{1/2} (∫_0^t |u(s)|² ds)^{1/2} ≤ T^{1/2} (∫_0^T |u(s)|² ds)^{1/2} = T^{1/2} ||u||_{L²[0,T]}.
Next we obtain an estimation with respect to Q:
(E^Q[|Xt|])² = (∫_Ω |Xt| MT dP)² ≤ ∫_Ω |Xt|² dP ∫_Ω MT² dP = E^P[Xt²] E^P[MT²] < ∞,
since E^P[Xt²] < ∞ and E^P[MT²] = e^{∫_0^T u(s)² ds} = e^{||u||²_{L²[0,T]}}, see Exercise 10.1.9.
2. Ft-measurability of Xt. This follows from equation (10.2.7) and the fact that Wt is Ft-measurable.
3. Conditional expectation of Xt. From Examples 10.1.3 and 10.1.6 recall that for any 0 ≤ t ≤ T:
Mt is an Ft-martingale with respect to the probability measure P;
Xt Mt is an Ft-martingale with respect to the probability measure P.
We need to verify that
E^Q[Xt|Fs] = Xs, ∀s ≤ t,
which can be written as
∫_A Xt dQ = ∫_A Xs dQ, ∀A ∈ Fs.
Since dQ = MT dP, the previous relation becomes
∫_A Xt MT dP = ∫_A Xs MT dP, ∀A ∈ Fs.
This can be written in terms of conditional expectation as
E^P[Xt MT|Fs] = E^P[Xs MT|Fs]. (10.2.8)
We shall prove this identity by showing that both terms are equal to Xs Ms. Since Xs is Fs-measurable and Mt is a martingale, the right side term becomes
E^P[Xs MT|Fs] = Xs E^P[MT|Fs] = Xs Ms, ∀s ≤ T.
Let s < t. Using the tower property (see Proposition 2.10.6, part 3), the left side term becomes
E^P[Xt MT|Fs] = E^P[E^P[Xt MT|Ft]|Fs] = E^P[Xt E^P[MT|Ft]|Fs] = E^P[Xt Mt|Fs] = Xs Ms,
where we used that Mt and Xt Mt are martingales and Xt is Ft-measurable. Hence (10.2.8) holds and Xt is an Ft-martingale with respect to the probability measure Q.
Lemma 10.2.2 Consider the process
Xt = ∫_0^t u(s) ds + Wt, 0 ≤ t ≤ T,
with u ∈ L²[0, T] a deterministic function, and let dQ = MT dP. Then
E^Q[Xt²] = t.
Proof: Denote U(t) = ∫_0^t u(s) ds. Then
E^Q[Xt²] = E^P[Xt² MT] = E^P[U²(t)MT + 2U(t)Wt MT + Wt² MT]
         = U²(t)E^P[MT] + 2U(t)E^P[Wt MT] + E^P[Wt² MT]. (10.2.9)
From Exercise 10.1.9 (a) we have E^P[MT] = 1. In order to compute E^P[Wt MT] we use the tower property and the martingale property of Mt:
E^P[Wt MT] = E[E^P[Wt MT|Ft]] = E[Wt E^P[MT|Ft]] = E[Wt Mt]. (10.2.10)
Using the product rule,
d(Wt Mt) = Mt dWt + Wt dMt + dWt dMt = (Mt − u(t)Mt Wt) dWt − u(t)Mt dt,
where we used dMt = −u(t)Mt dWt. Integrating between 0 and t yields
Wt Mt = ∫_0^t (Ms − u(s)Ms Ws) dWs − ∫_0^t u(s)Ms ds.
Taking the expectation and using the property of Ito integrals we have
E[Wt Mt] = −∫_0^t u(s)E[Ms] ds = −∫_0^t u(s) ds = −U(t). (10.2.11)
Substituting into (10.2.10) yields
E^P[Wt MT] = −U(t). (10.2.12)
For computing E^P[Wt² MT] we proceed in a similar way:
E^P[Wt² MT] = E[E^P[Wt² MT|Ft]] = E[Wt² E^P[MT|Ft]] = E[Wt² Mt]. (10.2.13)
Using the product rule yields
d(Wt² Mt) = Mt d(Wt²) + Wt² dMt + d(Wt²)dMt
          = Mt(2Wt dWt + dt) − Wt² u(t)Mt dWt − (2Wt dWt + dt) u(t)Mt dWt
          = Mt Wt(2 − u(t)Wt) dWt + (Mt − 2u(t)Wt Mt) dt.
Integrating between 0 and t,
Wt² Mt = ∫_0^t Ms Ws(2 − u(s)Ws) dWs + ∫_0^t (Ms − 2u(s)Ws Ms) ds,
and taking the expected value we get
E[Wt² Mt] = ∫_0^t (E[Ms] − 2u(s)E[Ws Ms]) ds = ∫_0^t (1 + 2u(s)U(s)) ds = t + U²(t),
where we used (10.2.11). Substituting into (10.2.13) yields
E^P[Wt² MT] = t + U²(t). (10.2.14)
Substituting (10.2.12) and (10.2.14) into relation (10.2.9) yields
E^Q[Xt²] = U²(t) − 2U(t)² + t + U²(t) = t, (10.2.15)
which ends the proof of the lemma.
Now we are prepared to prove one of the most important results of stochastic calculus.
Theorem 10.2.3 (Girsanov Theorem) Let u ∈ L²[0, T] be a deterministic function. Then the process
Xt = ∫_0^t u(s) ds + Wt, 0 ≤ t ≤ T,
is a Brownian motion with respect to the probability measure Q given by
dQ = e^{−∫_0^T u(s) dWs − (1/2)∫_0^T u(s)² ds} dP.
Proof: In order to prove that Xt is a Brownian motion on the probability space
(Ω, F, Q) we shall apply Lévy’s characterization theorem, see Theorem 3.1.5. Hence it
suffices to show that Xt is a Wiener process. Lemma 10.2.1 implies that the process
Xt satisfies the following properties:
1. X0 = 0;
2. Xt is continuous in t;
3. Xt is a square integrable Ft -martingale on the space (Ω, F, Q).
The only property we still need to show is
4. E Q [(Xt − Xs )2 ] = t − s,
s < t.
Using Lemma 10.2.2, the martingale property of Xt under Q, the additivity and the tower property of expectations, we have
E^Q[(Xt − Xs)²] = E^Q[Xt²] − 2E^Q[Xt Xs] + E^Q[Xs²]
               = t − 2E^Q[Xt Xs] + s
               = t − 2E^Q[E^Q[Xt Xs|Fs]] + s
               = t − 2E^Q[Xs E^Q[Xt|Fs]] + s
               = t − 2E^Q[Xs²] + s
               = t − 2s + s = t − s,
which proves relation 4.
Choosing u(s) = λ, constant, we obtain the following consequence that will be useful later in finance applications.
Corollary 10.2.4 Let Wt be a Brownian motion on the probability space (Ω, F, P). Then the process
Xt = λt + Wt, 0 ≤ t ≤ T,
is a Brownian motion on the probability space (Ω, F, Q), where
dQ = e^{−(1/2)λ²T − λW_T} dP.
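The corollary can be illustrated numerically: sampling X_T = λT + W_T under P and reweighting by the density M_T = e^{−λW_T − λ²T/2} should reproduce the Brownian moments under Q. The sketch below uses illustrative values for λ and T.

```
# Sketch: numerical illustration of Corollary 10.2.4. Under P, X_T = lam*T + W_T;
# reweighting by M_T = exp(-lam*W_T - lam^2*T/2) should recover the Brownian
# moments E^Q[X_T] = 0 and E^Q[X_T^2] = T (lam, T are illustrative).
import numpy as np

lam, T, n_paths = 0.7, 1.0, 2_000_000
rng = np.random.default_rng(11)

W_T = rng.normal(0.0, np.sqrt(T), n_paths)
X_T = lam * T + W_T
M_T = np.exp(-lam * W_T - 0.5 * lam**2 * T)      # dQ/dP on F_T

print("E^P[X_T]   :", X_T.mean(),            "  (= lam*T)")
print("E^Q[X_T]   :", (X_T * M_T).mean(),    "  (theory: 0)")
print("E^Q[X_T^2] :", (X_T**2 * M_T).mean(), "  (theory: T =", T, ")")
```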
Chapter 1
2.9.1 Let X ∼ N(µ, σ²). Then the distribution function of Y = αX + β is
FY(y) = P(Y < y) = P(αX + β < y) = P(X < (y − β)/α)
      = (1/(√(2π) σ)) ∫_{−∞}^{(y−β)/α} e^{−(x−µ)²/(2σ²)} dx = (1/(√(2π) ασ)) ∫_{−∞}^{y} e^{−(z−(αµ+β))²/(2α²σ²)} dz
      = (1/(√(2π) σ′)) ∫_{−∞}^{y} e^{−(z−µ′)²/(2(σ′)²)} dz,
with µ′ = αµ + β, σ′ = ασ.
2.9.5 (a) Making t = n yields E[Yⁿ] = E[e^{nX}] = e^{µn + n²σ²/2}.
(b) Let n = 1 and n = 2 in (a) to get the first two moments and then use the formula
of variance.
2.10.7 The tower property
E E[X|G]|H = E[X|H],
is equivalent with
Z
A
E[X|G] dP =
Z
X dP,
H⊂G
∀A ∈ H.
Since A ∈ G, the previous relation holds by the definition on E[X|G].
2.10.8 (a) Direct application of the definition.
R
R
(b) P (A) = A dP = Ω χA (ω) dP (ω) = E[χA ].
(d) E[χA X] = E[χA ]E[X] = P (A)E[X].
(e) We have the sequence of equivalencies
Z
Z
X dP, ∀A ∈ G ⇔
E[X] dP =
E[X|G] = E[X] ⇔
A
A
Z
X dP ⇔ E[X]P (A) = E[χA ],
E[X]P (A) =
A
which follows from (d).
2.11.5 If µ = E[X] then
E[(X − E[X])2 ] = E[X 2 − 2µX + µ2 ] = E[X 2 ] − 2µ2 + µ2
= E[X 2 ] − E[X]2 = V ar[X].
2.11.6 From 2.11.5 we have V ar(X) = 0 ⇔ X = E[X], i.e. X is a constant.
2.11.7 The same proof as the one in Jensen’s inequality.
2.11.8 It follows from Jensen’s inequality or using properties of integrals.
P
k e−λ
t
= eλ(e −1) ;
2.11.13 (a) m(t) = E[etX ] = k etk λ k!
(b) It follows from the first Chernoff bound.
2.11.16 Choose f (x) = x2k+1 and g(x) = x2n+1 .
2.12.1 By direct computation we have
E[(X − Y )2 ] = E[X 2 ] + E[Y 2 ] − 2E[XY ]
= V ar(X) + E[X]2 + V ar[Y ] + E[Y ]2 − 2E[X]E[Y ]
+2E[X]E[Y ] − 2E[XY ]
= V ar(X) + V ar[Y ] + (E[X] − E[Y ])2 − 2Cov(X, Y ).
2.12.2 (a) Since
E[(X − Xn )2 ] ≥ E[X − Xn ]2 = (E[Xn ] − E[X])2 ≥ 0,
the Squeeze theorem yields limn→∞ (E[Xn ] − E[X]) = 0.
(b) Writing
Xn2 − X 2 = (Xn − X)2 − 2X(X − Xn ),
and taking the expectation we get
E[Xn2 ] − E[X 2 ] = E[(Xn − X)2 ] − 2E[X(X − Xn )].
The right side tends to zero since
E[(Xn − X)2 ] → 0
Z
|X(X − Xn )| dP
|E[X(X − Xn )]| ≤
Ω
Z
1/2 Z
1/2
2
≤
X dP
(X − Xn )2 dP
Ω
pΩ
E[X 2 ]E[(X − Xn )2 ] → 0.
=
(c) It follows from part (b).
(d) Apply Exercise 2.12.1.
2.12.3 Using Jensen’s inequality we have
2 2 E E[Xn |H] − E[X|H]
= E E[Xn − X|H]
≤ E E[(Xn − X)2 |H]
= E[(Xn − X)2 ] → 0,
as n → 0.
2.14.6 The integrability of Xt follows from
E[|Xt |] = E |E[X|Ft ]| ≤ E E[|X| |Ft ] = E[|X|] < ∞.
Xt is Ft -predictable by the definition of the conditional expectation. Using the tower
property yields
s < t.
E[Xt |Fs ] = E E[X|Ft ]|Fs = E[X|Fs ] = Xs ,
2.14.7 Since
E[|Zt |] = E[|aXt + bYt + c|] ≤ |a|E[|Xt |] + |b|E[|Yt |] + |c| < ∞
then Zt is integrable. For s < t, using the martingale property of Xt and Yt we have
E[Zt |Fs ] = aE[Xt |Fs ] + bE[Yt |Fs ] + c = aXs + bYs + c = Zs .
2.14.8 In general the answer is no. For instance, if Xt = Yt the process Xt2 is not a
martingale, since the Jensen’s inequality
2
E[Xt2 |Fs ] ≥ E[Xt |Fs ] = Xs2
is not necessarily an identity. For instance Bt2 is not a martingale, with Bt the Brownian
motion process.
2.14.9 It follows from the identity
E[(Xt − Xs )(Yt − Ys )|Fs ] = E[Xt Yt − Xs Ys |Fs ].
2.14.10 (a) Let Yn = Sn − E[Sn ]. We have
Yn+k = Sn+k − E[Sn+k ]
= Yn +
k
X
j=1
Xn+j −
k
X
E[Xn+j ].
j=1
Using the properties of expectation we have
E[Yn+k |Fn ] = Yn +
= Yn +
k
X
j=1
k
X
j=1
= Yn .
E[Xn+j |Fn ] −
E[Xn+j ] −
k
X
k
X
E[E[Xn+j ]|Fn ]
j=1
E[Xn+j ]
j=1
(b) Let Zn = Sn2 − V ar(Sn ). The process Zn is an Fn -martingale iff
E[Zn+k − Zn |Fn ] = 0.
Let U = Sn+k − Sn . Using the independence we have
2
Zn+k − Zn = (Sn+k
− Sn2 ) − V ar(Sn+k − V ar(Sn )
= (Sn + U )2 − Sn2 − V ar(Sn+k − V ar(Sn )
= U 2 + 2U Sn − V ar(U ),
so
E[Zn+k − Zn |Fn ] = E[U 2 ] + 2Sn E[U ] − V ar(U )
= E[U 2 ] − (E[U 2 ] − E[U ]2 )
= 0,
since E[U ] = 0.
2.14.11 Let Fn = σ(Xk ; k ≤ n). Using the independence
E[|Pn |] = E[|X0 |] · · · E[|Xn |] < ∞,
so |Pn | integrable. Taking out the predictable part we have
E[Pn+k |Fn ] = E[Pn Xn+1 · · · Xn+k |Fn ] = Pn E[Xn+1 · · · Xn+k |Fn ]
= Pn E[Xn+1 ] · · · E[Xn+k ] = Pn .
2.14.12 (a) Since the random variable Y = θX is normally distributed with mean θµ
and variance θ 2 σ 2 , then
1 2 2
E[eθX ] = eθµ+ 2 θ σ .
Hence E[eθX ] = 1 iff θµ + 12 θ 2 σ 2 = 0 which has the nonzero solution θ = −2µ/σ 2 .
(b) Since eθXi are independent, integrable and satisfy E[eθXi ] = 1, by Exercise
2.14.11 we get that the product Zn = eθSn = eθX1 · · · eθXn is a martingale.
Chapter 2
3.1.4 Bt starts at 0 and is continuous in t. By Proposition 3.1.2 Bt is a martingale
with E[Bt2 ] = t < ∞. Since Bt − Bs ∼ N (0, |t − s|), then E[(Bt − Bs )2 ] = |t − s|.
3.1.11 It is obvious that Xt = Wt+t0 − Wt0 satisfies X0 = 0 and that Xt is continuous
in t. The increments are normal distributed Xt − Xs = Wt+t0 − Ws+t0 ∼ N (0, |t − s|).
If 0 < t1 < · · · < tn , then 0 < t0 < t1 + t0 < · · · < tn + t0 . The increments
Xtk+1 − Xtk = Wtk+1 +t0 − Wtk +t0 are obviously independent and stationary.
3.1.12 For any λ > 0, show that the process Xt = √1λ Wλt is a Brownian motion. This
says that the Brownian motion is invariant by scaling. Let s < t. Then Xt − Xs =
√1 (Wλt − Wλs ) ∼ √1 N 0, λ(t − s) = N (0, t − s). The other properties are obvious.
λ
λ
3.1.13 Apply Property 3.1.9.
3.1.14 Using the moment generating function, we get E[Wt3 ] = 0, E[Wt4 ] = 3t2 .
3.1.15 (a) Let s < t. Then
i
i
h
h
E[(Wt2 − t)(Ws2 − s)] = E E[(Wt2 − t)(Ws2 − s)]|Fs = E (Ws2 − s)E[(Wt2 − t)]|Fs
i
i
h
h
= E (Ws2 − s)2 = E Ws4 − 2sWs2 + s2
= E[Ws4 ] − 2sE[Ws2 ] + s2 = 3s2 − 2s2 + s2 = 2s2 .
(b) Using part (a) we have
2s2 = E[(Wt2 − t)(Ws2 − s)] = E[Ws2 Wt2 ] − sE[Wt2 ] − tE[Ws2 ] + ts
= E[Ws2 Wt2 ] − st.
Therefore E[Ws2 Wt2 ] = ts + 2s2 .
(c) Cov(Wt2 , Ws2 ) = E[Ws2 Wt2 ] − E[Ws2 ]E[Wt2 ] = ts + 2s2 − ts = 2s2 .
(d) Corr(Wt2 , Ws2 ) =
2s2
2ts
= st , where we used
V ar(Wt2 ) = E[Wt4 ] − E[Wt2 ]2 = 3t2 − t2 = 2t2 .
3.1.16 (a) The distribution function of Yt is given by
F (x) = P (Yt ≤ x) = P (tW1/t ≤ x) = P (W1/t ≤ x/t)
Z x/t p
Z x/t
2
φ1/t (y) dy =
=
t/(2π)ety /2 dy
0
Z
=
0
√
x/ t
0
1
2
√ e−u /2 du.
2π
(b) The probability density of Yt is obtained by differentiating F (x)
d
p(x) = F (x) =
dx
′
Z
√
x/ t
0
1
2
1
2
√ e−u /2 du = √ e−x /2 ,
2π
2π
and hence Yt ∼ N (0, t).
(c) Using that Yt has independent increments we have
Cov(Ys , Yt ) = E[Ys Yt ] − E[Ys ]E[Yt ] = E[Ys Yt ]
i
h
= E Ys (Yt − Ys ) + Ys2 = E[Ys ]E[Yt − Ys ] + E[Ys2 ]
= 0 + s = s.
(d) Since
Yt − Ys = (t − s)(W1/t − W0 ) − s(W1/s − W1/t )
E[Yt − Ys ] = (t − s)E[W1/t ] − sE[W1/s − W1/t ] = 0,
and
1 1
1
V ar(Yt − Ys ) = E[(Yt − Ys )2 ] = (t − s)2 + s2 ( − )
t
s
t
2
(t − s) + s(t − s)
=
= t − s.
t
3.1.17 (a) Applying the definition of expectation we have
Z ∞
Z ∞
1 − x2
1 − x2
e 2t dx =
e 2t dx
2x √
|x| √
E[|Wt |] =
2πt
2πt
0
−∞
Z ∞
p
y
1
e− 2t dy = 2t/π.
= √
2πt 0
(b) Since E[|Wt |2 ] = E[Wt2 ] = t, we have
V ar(|Wt |) = E[|Wt |2 ] − E[|Wt |]2 = t −
2
2t
= t(1 − ).
π
π
3.1.18 By the martingale property of Wt2 − t we have
E[Wt2 |Fs ] = E[Wt2 − t|Fs ] + t = Ws2 + t − s.
3.1.19 (a) Expanding
(Wt − Ws )3 = Wt3 − 3Wt2 Ws + 3Wt Ws2 − Ws3
and taking the expectation
E[(Wt − Ws )3 |Fs ] = E[Wt3 |Fs ] − 3Ws E[Wt2 ] + 3Ws2 E[Wt |Fs ] − Ws3
= E[Wt3 |Fs ] − 3(t − s)Ws − Ws3 ,
so
E[Wt3 |Fs ] = 3(t − s)Ws + Ws3 ,
since
3
] = 0.
E[(Wt − Ws )3 |Fs ] = E[(Wt − Ws )3 ] = E[Wt−s
(b) Hint: Start from the expansion of (Wt − Ws )4 .
3.2.3 Using that eWt −Ws is stationary, we have
1
E[eWt −Ws ] = E[eWt−s ] = e 2 (t−s) .
3.2.4 (a)
E[Xt |Fs ] = E[eWt |Fs ] = E[eWt −Ws eWs |Fs ]
= eWs E[eWt −Ws |Fs ] = eWs E[eWt −Ws ]
= eWs et/2 e−s/2 .
(b) This can be written also as
E[e−t/2 eWt |Fs ] = e−s/2 eWs ,
which shows that e−t/2 eWt is a martingale.
(c) From the stationarity we have
1 2
(t−s)
E[ecWt −cWs ] = E[ec(Wt −Ws ) ] = E[ecWt−s ] = e 2 c
.
Then for any s < t we have
E[ecWt |Fs ] = E[ec(Wt −Ws ) ecWs |Fs ] = ecWs E[ec(Wt −Ws ) |Fs ]
1 2
(t−s)
= ecWs E[ec(Wt −Ws ) ] = ecWs e 2 c
1 2
t
Multiplying by e− 2 c
yields the desired result.
1 2
= Ys e 2 c t .
3.2.5 (a) Using Exercise 3.2.3 we have
Cov(Xs , Xt ) = E[Xs Xt ] − E[Xs ]E[Xt ] = E[Xs Xt ] − et/2 es/2
= E[eWs +Wt ] − et/2 es/2 = E[eWt −Ws e2(Ws −W0 ) ] − et/2 es/2
= E[eWt −Ws ]E[e2(Ws −W0 ) ] − et/2 es/2 = e
= e
t+3s
2
− e(t+s)/2 .
t−s
2
e2s − et/2 es/2
(b) Using Exercise 3.2.4 (b), we have
h
i
h
i
E[Xs Xt ] = E E[Xs Xt |Fs ] = E Xs E[Xt |Fs ]
h
i
h
i
= et/2 E Xs E[e−t/2 Xt |Fs ] = et/2 E Xs e−s/2 Xs ]
= e(t−s)/2 E[Xs2 ] = e(t−s)/2 E[e2Ws ]
= e(t−s)/2 e2s = e
t+3s
2
,
and continue like in part (a).
3.2.6 Using the definition of the expectation we have
Z
Z
x2
1
2
2
e2x φt (x) dx = √
e2x e− 2t dx
2πt
Z
1
2
1
− 1−4t
x
= √
e 2t
dx = √
,
1 − 4t
2πt
if
Z 1 − 4t >p0. Otherwise, the integral is infinite. We used the standard integral
2
e−ax = π/a, a > 0.
2
E[e2Wt ] =
3.3.4 It follows from the fact that Zt is normally distributed.
3.3.5 Using the definition of covariance we have
= E[Zs Zt ] − E[Zs ]E[Zt ] = E[Zs Zt ]
Cov Zs , Zt
Z t
i
hZ sZ t
hZ s
i
Wv dv = E
= E
Wu du ·
Wu Wv dudv
0
0
0
0
Z sZ t
Z sZ t
=
E[Wu Wv ] dudv =
min{u, v} dudv
0
0
0
0
t
s
,
s < t.
−
= s2
2 6
3.3.6 (a) Using Exercise 3.3.5
Cov(Zt , Zt − Zt−h ) = Cov(Zt , Zt ) − Cov(Zt , Zt−h )
t3
t
t−h
=
− (t − h)2 ( −
)
3
2
6
1 2
=
t h + o(h).
2
(b) Using Zt − Zt−h =
Rt
t−h Wu du
= hWt + o(h),
Cov(Zt , Wt ) =
=
1
Cov(Zt , Zt − Zt−h )
h
1
1 1 2
t h + o(h) = t2 .
h 2
2
3.3.7 Let s < u. Since Wt has independent increments, taking the expectation in
eWs +Wt = eWt −Ws e2(Ws −W0 )
we obtain
E[eWs +Wt ] = E[eWt −Ws ]E[e2(Ws −W0 ) ] = e
= e
3.3.8 (a) E[Xt ] =
Rt
0
E[eWs ] ds =
u+s
2
Rt
0
es = e
u+s
2
u−s
2
e2s
emin{s,t} .
E[es/2 ] ds = 2(et/2 − 1)
(b) Since V ar(Xt ) = E[Xt2 ] − E[Xt ]2 , it suffices to compute E[Xt2 ]. Using Exercise
3.3.7 we have
Z t
hZ t
i
hZ tZ t
i
E[Xt2 ] = E
eWu du = E
eWt ds ·
eWs eWu dsdu
0
0
0
0
Z tZ t
Z tZ t
u+s
e 2 emin{s,t} dsdu
E[eWs +Wu ] dsdu =
=
0
0
0
0
ZZ
ZZ
u+s
u+s
s
e 2 eu duds
e 2 e duds +
=
D2
ZD
Z1
u+s
3
4 1 2t
u
e 2 e duds =
= 2
,
e − 2et/2 +
3 2
2
D2
where D1 {0 ≤ s < u ≤ t} and D2 {0 ≤ u < s ≤ t}. In the last identity we applied
Fubini’s theorem. For the variance use the formula V ar(Xt ) = E[Xt2 ] − E[Xt ]2 .
3.3.9 (a) Splitting the integral at t and taking out the predictable part, we have
Z T
Z t
Wu du|Ft ]
Wu du|Ft ] = E[ Wu du|Ft ] + E[
t
0
0
Z T
Wu du|Ft ]
Zt + E[
t
Z T
Zt + E[
(Wu − Wt + Wt ) du|Ft ]
t
Z T
(Wu − Wt ) du|Ft ] + Wt (T − t)
Zt + E[
t
Z T
(Wu − Wt ) du] + Wt (T − t)
Zt + E[
t
Z T
E[Wu − Wt ] du + Wt (T − t)
Zt +
Z
E[ZT |Ft ] = E[
=
=
=
=
=
T
t
= Zt + Wt (T − t),
since E[Wu − Wt ] = 0.
(b) Let 0 < t < T . Using (a) we have
E[ZT − T WT |Ft ] = E[ZT |Ft ] − T E[WT |Ft ]
= Zt + Wt (T − t) − T Wt
= Zt − tWt .
3.4.1
h Rt
i
RT
E[VT |Ft ] = E e 0 Wu du+ t Wu du |Ft
i
h RT
Rt
= e 0 Wu du E e t Wu du |Ft
i
h RT
Rt
= e 0 Wu du E e t (Wu −Wt ) du+(T −t)Wt |Ft
i
h RT
= Vt e(T −t)Wt E e t (Wu −Wt ) du |Ft
i
h RT
= Vt e(T −t)Wt E e t (Wu −Wt ) du
h R T −t
i
(T −t)Wt
Wτ dτ
0
E e
= Vt e
3
1 (T −t)
3
= Vt e(T −t)Wt e 2
.
3.6.1
F (x) = P (Yt ≤ x) = P (µt + Wt ≤ x) = P (Wt ≤ x − µt)
Z x−µt
1 − u2
√
e 2t du;
=
2πt
0
1 − (x−µt)2
2t
e
f (x) = F ′ (x) = √
.
2πt
3.7.2 Since
P (Rt ≤ ρ) =
Z
ρ
0
1 − x2
xe 2t dx,
t
use the inequality
1−
x2
x2
< e− 2t < 1
2t
to get the desired result.
3.7.3
E[Rt ] =
Z
∞
xpt (x) dx =
Z
∞
1 2 − x2
x e 2t dx
t
0
√ Z ∞ 3 −1 −z
dy = 2t
z 2 e dz
0
Z
1 ∞ 1/2 − y
=
y e 2t
2t 0
√
√
2πt
3
.
=
=
2t Γ
2
2
0
Since E[Rt2 ] = E[W1 (t)2 + W2 (t)2 ] = 2t, then
V ar(Rt ) = 2t −
2πt
π
= 2t 1 − .
4
4
3.7.4
E[Xt ] =
V ar(Xt ) =
√
r
2πt
π
E[Xt ]
=
=
→ 0,
t
2t
2t
1
2
π
V ar(Rt ) = (1 − ) → 0,
2
t
t
4
By Example 2.12.1 we get Xt → 0, t → ∞ in mean square.
t → ∞;
t → ∞.
3.8.2
P (Nt − Ns = 1) = λ(t − s)e−λ(t−s)
= λ(t − s) 1 − λ(t − s) + o(t − s)
= λ(t − s) + o(t − s).
P (Nt − Ns ≥ 1) = 1 − P (Nt − Ns = 0) + P (Nt − Ns = 1)
= 1 − e−λ(t−s) − λ(t − s)e−λ(t−s)
= 1 − 1 − λ(t − s) + o(t − s)
−λ(t − s) 1 − λ(t − s) + o(t − s)
= λ2 (t − s)2 = o(t − s).
3.8.6 Write first as
Nt2 = Nt (Nt − Ns ) + Nt Ns
= (Nt − Ns )2 + Ns (Nt − Ns ) + (Nt − λt)Ns + λtNs ,
then
E[Nt2 |Fs ] = E[(Nt − Ns )2 |Fs ] + Ns E[Nt − Ns |Fs ] + E[Nt − λt|Fs ]Ns + λtNs
= E[(Nt − Ns )2 ] + Ns E[Nt − Ns ] + E[Nt − λt]Ns + λtNs
= λ(t − s) + λ2 (t − s)2 + λtNs + Ns2 − λsNs + λtNs
= λ(t − s) + λ2 (t − s)2 + 2λ(t − s)Ns + Ns2
= λ(t − s) + [Ns + λ(t − s)]2 .
Hence E[Nt2 |Fs ] 6= Ns2 and hence the process Nt2 is not an Fs -martingale.
3.8.7 (a)
mNt (x) = E[exNt ] =
=
X
X
exh P (Nt = k)
k≥0
k k
xk −λt λ t
e e
k≥0
k!
x
= e−λt eλte = eλt(e
x −1)
.
(b) E[Nt2 ] = m′′Nt (0) = λ2 t2 + λt. Similarly for the other relations.
3.8.8 E[Xt ] = E[eNt ] = mNt (1) = eλt(e−1) .
3.8.9 (a) Since exMt = ex(Nt −λt) = e−λtx exNt , the moment generating function is
mMt (x) = E[exMt ] = e−λtx E[exNt ]
= e−λtx eλt(e
x −1)
= eλt(e
x −x−1)
.
(b) For instance
E[Mt3 ] = m′′′
Mt (0) = λt.
Since Mt is a stationary process, E[(Mt − Ms )3 ] = λ(t − s).
3.8.10
V ar[(Mt − Ms )2 ] = E[(Mt − Ms )4 ] − E[(Mt − Ms )2 ]2
= λ(t − s) + 3λ2 (t − s)2 − λ2 (t − s)2
= λ(t − s) + 2λ2 (t − s)2 .
3.11.3 (a) E[Ut ] = E
(b) E
Nt
X
k=1
hR
t
0 Nu du
i
=
Rt
0
E[Nu ] du =
Rt
0
λu du =
λt2
2 .
λt2
λt2
=
.
Sk = E[tNt − Ut ] = tλt −
2
2
3.11.5 The proof cannot go through because a product between a constant and a
Poisson process is not a Poisson process.
3.11.6 Let pX (x) be the probability
P density of X. If p(x, y) is the joint probability
density of X and Y , then pX (x) = y p(x, y). We have
X
E[X|Y = y]P (Y = y) =
y≥0
XZ
xpX|Y =y (x|y)P (Y = y) dx
y≥0
=
XZ
y≥0
=
Z
xp(x, y)
P (Y = y) dx =
P (Y = y)
Z
x
X
p(x, y) dx
y≥0
xpX (x) dx = E[X].
3.11.8 (a) Since Tk has an exponential distribution with parameter λ
Z ∞
λ
−σTk
E[e
] =
e−σx λe−λx dx =
.
λ+σ
0
(b) We have
Ut = T2 + 2T3 + 3T4 + · · · + (n − 2)Tn−1 + (n − 1)Tn + (t − Sn )n
= T2 + 2T3 + 3T4 + · · · + (n − 2)Tn−1 + (n − 1)Tn + nt − n(T1 + T2 + · · · + Tn )
= nt − [nT1 + (n − 1)T2 + · · · + Tn ].
(c) Using that the arrival times Sk , k = 1, 2, . . . n, have the same distribution as the
order statistics U(k) corresponding to n independent random variables uniformly dis-
tributed on the interval (0, t), we get
i
h
PNt
E e−σUt Nt = n = E[e−σ(tNt − k=1 Sk ) |Nt = n]
= e−nσt E[eσ
−nσt
Pn
i=1
U(i)
σU1
] = e−nσt E[eσ
Pn
i=1
Ui
]
σUn
E[e ] · · · E[e
]
Z t
Z
1 t σxn
σx1
−nσt 1
e
e
dxn
dx1 · · ·
= e
t 0
t 0
(1 − e−σt )n
=
·
σ n tn
= e
(d) Using Exercise 3.11.6 we have
h
h
i
X
E e−σUt ] =
P (Nt = n)E e−σUt Nt = n
n≥0
X e−λ tn λn (1 − e−σt )n
=
σ n tn
n!
n≥0
= eλ(1−e
−σt )/σ−λ
·
4.10.10 (a) dt dNt = dt(dMt + λdt) = dt dMt + λdt2 = 0
(b) dWt dNt = dWt (dMt + λdt) = dWt dMt + λdWt dt = 0
2
2
3.12.6 Use Doob’s
pinequality for the submartingales Wt and |Wt |, and use that E[Wt ] =
t and E[|Wt |] = 2t/π, see Exercise 3.1.17 (a).
3.12.7 Divide by t in the inequality from Exercise 3.12.6 part (b).
3.12.10 Let σ = n and τ = n + 1. Then
E
h
sup
n≤t≤n=1
N
t
t
−λ
2 i
4λ(n + 1)
·
n2
≤
The result follows by taking n → ∞ in the sequence of inequalities
0≤E
h N
t
t
−λ
2 i
≤E
h
sup
n≤t≤n=1
N
2 i 4λ(n + 1)
−λ
≤
·
t
n2
t
Chapter 3
4.1.2 We have
{ω; τ (ω) ≤ t} =
and use that Ø, Ω ∈ Ft .
Ω, if c ≤ t
Ø, if c > t
4.1.3 First we note that
{ω; τ (ω) < t} =
[
0<s<t
{ω; |Ws (ω)| > K}.
(10.2.16)
This can be shown by double inclusion. Let As = {ω; |Ws (ω)| > K}.
“ ⊂ ” Let ω ∈ {ω; τ (ω) < t}, so inf{s > 0; |Ws (ω)| > K} < t. Then exists τ < u < t
such that |Wu (ω)|
S > K, and hence ω ∈ Au .
“ ⊃ ” Let ω ∈ 0<s<t {ω; |Ws (ω)| > K}. Then there is 0 < s < t such that |Ws (ω)| >
K. This implies τ (ω) < s and since s < t it follows that τ (ω) < t.
Since Wt is continuous, then (10.2.16) can be also written as
[
{ω; τ (ω) < t} =
{ω; |Wr (ω)| > K},
0<r<t,r∈Q
which implies {ω; τ (ω) < t} ∈ Ft since {ω; |Wr (ω)| > K} ∈ Ft , for 0 < r < t. Next we
shall show that P (τ < ∞) = 1.
[
P ({ω; τ (ω) < ∞}) = P
{ω; |Ws (ω)| > K} > P ({ω; |Ws (ω)| > K})
0<s
= 1−
Z
|x|<K
√
y2
2K
1
e− 2s dy > 1 − √
→ 1, s → ∞.
2πs
2πs
Hence τ is a stopping time.
4.1.4 Let Km = [a +
1
m, b
−
1
m ].
We can write
\ [
{ω; τ ≤ t} =
{ω; Xr ∈
/ Km } ∈ Ft ,
m≥1 r<t,r∈Q
since {ω; Xr ∈
/ Km } = {ω; Xr ∈ K m } ∈ Fr ⊂ Ft .
4.1.6 (a) No. The event A = {ω; Wt (ω) has a local maximum at time t} is not in
σ{Ws ; s ≤ t} but is in σ{Ws ; s ≤ t + ǫ}.
4.1.8 (a) We have {ω; cτ ≤ t} = {ω; τ ≤ t/c} ∈ Ft/c ⊂ Ft . And P (cτ < ∞) = P (τ <
∞) = 1.
(b) {ω; f (τ ) ≤ t} = {ω; τ ≤ f −1 (t)} = Ff −1 (t) ⊂ Ft , since f −1 (t) ≤ t. If f is bounded,
then it is obvious
that P (f (τ ) < ∞) = 1. If limt→∞ f (t) = ∞, then P (f (τ ) < ∞) =
P τ < f −1 (∞) = P (τ < ∞) = 1.
(c) Apply (b) with f (x) = ex .
T
4.1.10 If let G(n) = {x; |x − a| < n1 }, then {a} = n≥1 G(n). Then τn = inf{t ≥
0; Wt ∈ G(n)} are stopping times. Since supn τn = τ , then τ is a stopping time.
4.2.3 The relation is proved by verifying two cases:
(i) If ω ∈ {ω; τ > t} then (τ ∧ t)(ω) = t and the relation becomes
Mτ (ω) = Mt (ω) + Mτ (ω) − Mt (ω).
(ii) If ω ∈ {ω; τ ≤ t} then (τ ∧ t)(ω) = τ (ω) and the relation is equivalent with the
obvious relation
Mτ = Mτ .
4.2.5 Taking the expectation in E[Mτ |Fσ ] = Mσ yields E[Mτ ] = E[Mσ ], and then
make σ = 0.
4.3.9 Since Mt = WT2 − t is a martingale with E[Mt ] = 0, by the Optional Stopping
Theorem we get E[Mτa ] = E[M0 ] = 0, so E[Wτ2a − τa ] = 0, from where E[τa ] =
E[Wτ2a ] = a2 , since Wτa = a.
4.3.10 (a)
F (a) = P (Xt ≤ a) = 1 − P (Xt > a) = 1 − P ( max Ws > a)
0≤s≤t
Z ∞
2
−y 2 /2
dy
= 1 − P (Ta ≤ t) = 1 − √
√ e
2π |a|/ t
Z ∞
Z ∞
2
2
−y 2 /2
−y 2 /2
dy
= √
e
dy − √
√ e
2π 0
2π |a|/ t
Z |a|/√t
2
2
e−y /2 dy.
= √
2π 0
(b) The density function is p(a) = F ′ (a) =
E[Xt ] =
E[Xt2 ] =
=
V ar(Xt ) =
2
√ 2 e−a /(2t) ,
2πt
a > 0. Then
r
Z ∞
2
2t
−x2 /(2t)
xp(x) dx = √
xe
dy =
π
2πt 0
0
Z ∞
Z ∞
2
2
2
4t
√
x2 e−x /(2t) dx = √
y 2 e−y dy
π 0
2πt 0
Z
2 1 ∞ 1/2 −u
1
√
Γ(3/2) = t
u e du = √
2πt 2 0
2πt
2
.
E[Xt2 ] − E[Xt ]2 = t 1 −
π
Z
∞
4.3.16 It is recurrent since P (∃t > 0 : a < Wt < b) = 1.
4.4.4 Since
1
1
P (Wt > 0; t1 ≤ t ≤ t2 ) = P (Wt =
6 0; t1 ≤ t ≤ t2 ) = arcsin
2
π
r
t1
,
t2
using the independence
P (Wt1
>
0, Wt2 )
=
P (Wt1
>
0)P (Wt2
1
> 0) = 2 arcsin
π
r
t1 2
.
t2
The probability for Wt =
(Wt1 , Wt2 )
4.5.2 (a) We have
4
to be in one of the quadrants is 2 arcsin
π
r
t1 2
.
t2
e2µβ − 1
= 1.
β→∞ e2µβ − e−2µα
P (Xt goes up to α) = P (Xt goes up to α before down to −∞) = lim
4.5.3
e2µβ − 1
.
α→∞ e2µβ − e−2µα
P (Xt never hits − β) = P (Xt goes up to ∞ before down to − β) = lim
4.5.4 (a) Use that E[XT ] = αpα − β(1 − pα ); (b) E[XT2 ] = α2 pα + β 2 (1 − pα ), with
e2µβ −1
pα = e2µβ
; (c) Use V ar(T ) = E[T 2 ] − E[T ]2 .
−e−2µα
4.5.7 Since Mt = Wt2 − t is a martingale, with E[Mt ] = 0, by the Optional Stopping
Theorem we get E[WT2 − T ] = 0. Using WT = XT − µT yields
E[XT2 − 2µT XT + µ2 T 2 ] = E[T ].
Then
E[T 2 ] =
E[T ](1 + 2µE[XT ]) − E[XT2 ]
.
µ2
Substitute E[XT ] and E[XT2 ] from Exercise 4.5.4 and E[T ] from Proposition 4.5.5.
4.6.11 See the proof of Proposition 4.6.3.
4.6.12
E[T ] = −
1
E[XT ]
d
E[e−sT ]|s=0 = (αpα + β(1 − pβ )) =
.
ds
µ
µ
4.6.13 (b) Applying the Optional Stopping Theorem
E[ecMT −λT (e
c −c−1)
] = E[X0 ] = 1
E[eca−λT f (c) ] = 1
E[e−λT f (c) ] = e−ac .
Let s = f (c), so c = ϕ(s). Then E[e−λsT ] = e−aϕ(s) .
(c) Differentiating and taking s = 0 yields
−λE[T ] = −ae−aϕ(0) ϕ′ (0)
1
= −∞,
= −a ′
f (0)
so E[T ] = ∞.
(d) The inverse Laplace transform L−1 e−aϕ(s) cannot be represented by elementary
functions.
4.8.9 Use E[(|Xt | − 0)2 ] = E[|Xt |2 ] = E[(Xt − 0)2 ].
?? Let an = ln bn . Then Gn = eln Gn = e
a1 +···+an
n
→ eln L = L.
4.9.3 (a) L = 1. (b) A computation shows
E[(Xt − 1)2 ] = E[Xt2 − 2Xt + 1] = E[Xt2 ] − 2E[Xt ] + 1
= V ar(Xt ) + (E[Xt ] − 1)2 .
(c) Since E[Xt ] = 1, we have E[(Xt − 1)2 ] = V ar(Xt ). Since
V ar(Xt ) = e−t E[eWt ] = e−t (e2t − et ) = et − 1,
then E[(Xt − 1)2 ] does not tend to 0 as t → ∞.
4.10.7 (a) E[(dWt )2 − dt2 ] = E[(dWt )2 ] − dt2 = 0.
(b)
Var((dWt)² − dt) = E[((dWt)² − dt)²] = E[(dWt)⁴ − 2dt (dWt)² + dt²]
= 3dt² − 2dt² + dt² = 2dt².
Chapter 4
5.2.3 (a) Use either the definition or the moment generation function to show that
4
] = 3(t − s)2 .
E[Wt4 ] = 3t2 . Using stationarity, E[(Wt − Ws )4 ] = E[Wt−s
Z T
dWt ] = E[WT ] = 0.
5.4.1 (a) E[
0
Z T
1
1
Wt dWt ] = E[ WT2 − T ] = 0.
(b) E[
2
2
0Z
Z T
T
1
1
T2
1
.
Wt dWt )2 ] = E[ WT2 + T 2 − T WT2 ] =
Wt dWt ) = E[(
(c) V ar(
4
4
2
2
0
0
RT
5.6.2 X ∼ N (0, 1 1t dt) = N (0, ln T ).
5.6.3 Y ∼ N (0,
RT
1
tdt) = N 0, 12 (T 2 − 1)
5.6.4 Normally distributed with zero mean and variance
Z
t
0
1
e2(t−s) ds = (e2t − 1).
2
5.6.5 Using the property of Wiener integrals, both integrals have zero mean and vari7t3
ance
.
3
5.6.6 The mean is zero and the variance is t/3 → 0 as t → 0.
5.6.7 Since it is a Wiener integral, Xt is normally distributed with zero mean and
variance
Z t
bu 2
b2
a+
du = (a2 +
+ ab)t.
t
3
0
Hence a2 +
b2
3
+ ab = 1.
5.6.8 Since both Wt and
Rt
0
f (s) dWs have the mean equal to zero,
Z t
Z t
Z t
Z t
dWs
f (s) dWs ] = E[
f (s) dWs ]
= E[Wt ,
f (s) dWs
Cov Wt ,
0
0
0
0
Z t
Z t
f (u) ds.
= E[ f (u) ds] =
0
0
The general result is
Z t
Z t
f (s) ds.
f (s) dWs =
Cov Wt ,
0
0
Choosing f (u) = un yields the desired identity.
5.8.6 Apply the expectation to
Nt
X
k=1
Nt
Nt
2 X
X
f (Sk )f (Sj ).
f 2 (Sk ) + 2
f (Sk ) =
k6=j
k=1
5.9.1
E
V ar
hZ
Z
T
0
T
eks dNs
eks dNs
0
i
=
=
λ kT
(e − 1)
k
λ 2kT
(e
− 1).
2k
Chapter 5
6.2.1 Let Xt =
Rt
0
eWu du. Then
dGt = d(
Xt
tdXt − Xt dt
teWt dt − Xt dt
1 Wt
−
G
e
)=
=
=
t dt.
t
t2
t2
t
6.3.3
(a) eWt (1 + 12 Wt )dt + eWt (1 + Wt )dWt ;
(b) (6Wt + 10e5Wt )dWt + (3 + 25e5Wt )dt;
2
2
(c) 2et+Wt (1 + Wt2 )dt + 2et+Wt Wt dWt ;
(d) n(t + Wt )n−2 (t + Wt + n−1
)dt
+
(t
+
W
)dW
t
t ;
2
!
Z
1 t
1
Wt −
Wu du dt;
(e)
t
t 0
!
Z
α t Wu
1
Wt
(f ) α e −
e du dt.
t
t 0
6.3.4
d(tWt2 ) = td(Wt2 ) + Wt2 dt = t(2Wt dWt + dt) + Wt2 dt = (t + Wt2 )dt + 2tWt dWt .
6.3.5 (a) t\,dW_t + W_t\,dt;
(b) e^t(W_t\,dt + dW_t);
(c) (2 − t/2)t\cos W_t\,dt − t^2\sin W_t\,dW_t;
(d) (\sin t + W_t^2\cos t)\,dt + 2\sin t\,W_t\,dW_t.
6.3.8 It follows from (6.3.10).
6.3.9 Take the conditional expectation in
M_t^2 = M_s^2 + 2∫_s^t M_{u−}\,dM_u + N_t − N_s
and obtain
E[M_t^2 | F_s] = M_s^2 + 2E[∫_s^t M_{u−}\,dM_u | F_s] + E[N_t | F_s] − N_s
= M_s^2 + E[M_t + λt | F_s] − N_s
= M_s^2 + M_s + λt − N_s
= M_s^2 + λ(t − s).
6.3.10 Integrating in (6.3.11) yields
F_t = F_s + ∫_s^t \frac{∂f}{∂x}\,dW_t^1 + ∫_s^t \frac{∂f}{∂y}\,dW_t^2.
One can check that E[F_t | F_s] = F_s.
6.3.11 (a) dF_t = 2W_t^1\,dW_t^1 + 2W_t^2\,dW_t^2 + 2\,dt; (b) dF_t = \frac{2W_t^1\,dW_t^1 + 2W_t^2\,dW_t^2}{(W_t^1)^2 + (W_t^2)^2}.
6.3.12 Consider the function f(x, y) = \sqrt{x^2 + y^2}. Since
\frac{∂f}{∂x} = \frac{x}{\sqrt{x^2 + y^2}}, \quad \frac{∂f}{∂y} = \frac{y}{\sqrt{x^2 + y^2}}, \quad ∆f = \frac{1}{\sqrt{x^2 + y^2}},
we get
dR_t = \frac{∂f}{∂x}\,dW_t^1 + \frac{∂f}{∂y}\,dW_t^2 + \frac{1}{2}∆f\,dt = \frac{W_t^1}{R_t}\,dW_t^1 + \frac{W_t^2}{R_t}\,dW_t^2 + \frac{1}{2R_t}\,dt.
Chapter 6
7.2.1 (a) Use the integration formula with g(x) = tan^{−1}(x):
∫_0^T (tan^{−1})′(W_t)\,dW_t = ∫_0^T \frac{1}{1 + W_t^2}\,dW_t = tan^{−1} W_T + \frac{1}{2}∫_0^T \frac{2W_t}{(1 + W_t^2)^2}\,dt.
(b) Use E[∫_0^T \frac{1}{1 + W_t^2}\,dW_t] = 0.
(c) Use Calculus to find the minima and maxima of the function ϕ(x) = \frac{x}{(1 + x^2)^2}.
7.2.2 (a) Use integration by parts with g(x) = e^x and get
∫_0^T e^{W_t}\,dW_t = e^{W_T} − 1 − \frac{1}{2}∫_0^T e^{W_t}\,dt.
(b) Applying the expectation we obtain
E[e^{W_T}] = 1 + \frac{1}{2}∫_0^T E[e^{W_t}]\,dt.
Letting φ(T) = E[e^{W_T}], φ satisfies the integral equation
φ(T) = 1 + \frac{1}{2}∫_0^T φ(t)\,dt.
Differentiating yields the ODE φ′(T) = \frac{1}{2}φ(T), with φ(0) = 1. Solving yields φ(T) = e^{T/2}.
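As an illustration only, the value φ(T) = e^{T/2} is easy to confirm by direct sampling of W_T; the sketch below assumes NumPy and an arbitrary T.

    import numpy as np

    rng = np.random.default_rng(3)
    T = 1.7
    W_T = np.sqrt(T) * rng.standard_normal(10**6)   # W_T ~ N(0, T)
    print(np.exp(W_T).mean(), np.exp(T / 2))        # the two values should be close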
7.2.3 (a) Apply integration by parts with g(x) = (x − 1)e^x to get
∫_0^T W_t e^{W_t}\,dW_t = ∫_0^T g′(W_t)\,dW_t = g(W_T) − g(0) − \frac{1}{2}∫_0^T g″(W_t)\,dt.
(b) Applying the expectation yields
E[W_T e^{W_T}] = E[e^{W_T}] − 1 + \frac{1}{2}∫_0^T \big(E[e^{W_t}] + E[W_t e^{W_t}]\big)\,dt
= e^{T/2} − 1 + \frac{1}{2}∫_0^T \big(e^{t/2} + E[W_t e^{W_t}]\big)\,dt.
Then φ(T) = E[W_T e^{W_T}] satisfies the ODE φ′(T) − \frac{1}{2}φ(T) = e^{T/2} with φ(0) = 0.
7.2.4 (a) Use integration by parts with g(x) = ln(1 + x^2).
(e) Since ln(1 + T) ≤ T, the upper bound obtained in (e) is better than the one in (d), without contradicting it.
7.3.1 By straightforward computation.
7.3.2 By computation.
7.3.13 (a) \frac{e}{\sqrt{2}}\sin(\sqrt{2}\,W_1); (b) \frac{1}{2}e^{2T}\sin(2W_3); (c) \frac{1}{\sqrt{2}}\big(e^{\sqrt{2}\,W_4 − 4} − 1\big).
7.3.14 Apply Ito's formula to get
dϕ(t, W_t) = \big(∂_t ϕ(t, W_t) + \frac{1}{2}∂_x^2 ϕ(t, W_t)\big)\,dt + ∂_x ϕ(t, W_t)\,dW_t = G(t)\,dt + f(t, W_t)\,dW_t.
Integrating between a and b yields
ϕ(t, W_t)\big|_a^b = ∫_a^b G(t)\,dt + ∫_a^b f(t, W_t)\,dW_t.
Chapter 7
8.2.1 Integrating yields X_t = X_0 + ∫_0^t (2X_s + e^{2s})\,ds + ∫_0^t b\,dW_s. Taking the expectation we get
E[X_t] = X_0 + ∫_0^t (2E[X_s] + e^{2s})\,ds.
Differentiating we obtain f′(t) = 2f(t) + e^{2t}, where f(t) = E[X_t], with f(0) = X_0. Multiplying by the integrating factor e^{−2t} yields (e^{−2t}f(t))′ = 1. Integrating yields f(t) = e^{2t}(t + X_0).
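As a numerical aside (not part of the book), the mean formula E[X_t] = e^{2t}(t + X_0) can be checked with an Euler-Maruyama simulation of dX_t = (2X_t + e^{2t}) dt + b dW_t; the sketch assumes NumPy, and X_0, b, the horizon and the grid are arbitrary illustrative choices.

    import numpy as np

    rng = np.random.default_rng(4)
    X0, b = 1.0, 0.5
    t_end, n_steps, n_paths = 1.0, 1000, 20000
    dt = t_end / n_steps
    X = np.full(n_paths, X0)
    t = 0.0
    for _ in range(n_steps):
        X = X + (2 * X + np.exp(2 * t)) * dt + b * np.sqrt(dt) * rng.standard_normal(n_paths)
        t += dt
    print(X.mean(), np.exp(2 * t_end) * (t_end + X0))   # should be close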
8.2.4 (a) Using the product rule and Ito's formula, we get
d(W_t^2 e^{W_t}) = e^{W_t}(1 + 2W_t + \frac{1}{2}W_t^2)\,dt + e^{W_t}(2W_t + W_t^2)\,dW_t.
Integrating and taking expectations yields
E[W_t^2 e^{W_t}] = ∫_0^t \big(E[e^{W_s}] + 2E[W_s e^{W_s}] + \frac{1}{2}E[W_s^2 e^{W_s}]\big)\,ds.
Since E[e^{W_s}] = e^{s/2} and E[W_s e^{W_s}] = s e^{s/2}, letting f(t) = E[W_t^2 e^{W_t}] we get by differentiation
f′(t) = e^{t/2} + 2t e^{t/2} + \frac{1}{2}f(t), \quad f(0) = 0.
Multiplying by the integrating factor e^{−t/2} yields (f(t)e^{−t/2})′ = 1 + 2t. Integrating yields the solution f(t) = t(1 + t)e^{t/2}. (b) Similar method.
8.2.5 (a) Using Exercise 3.1.19,
E[W_t^4 − 3t^2 | F_s] = E[W_t^4 | F_s] − 3t^2 = 3(t − s)^2 + 6(t − s)W_s^2 + W_s^4 − 3t^2 = (W_s^4 − 3s^2) + 6s^2 − 6ts + 6(t − s)W_s^2 ≠ W_s^4 − 3s^2.
Hence W_t^4 − 3t^2 is not a martingale. (b)
E[W_t^3 | F_s] = W_s^3 + 3(t − s)W_s ≠ W_s^3,
and hence W_t^3 is not a martingale.
8.2.6 (a) Similar method as in Example 8.2.4.
(b) Applying Ito's formula,
d(cos(σW_t)) = −σ\sin(σW_t)\,dW_t − \frac{σ^2}{2}\cos(σW_t)\,dt.
Let f(t) = E[cos(σW_t)]. Then f′(t) = −\frac{σ^2}{2}f(t), f(0) = 1. The solution is f(t) = e^{−σ^2 t/2}.
(c) Since sin(t + σW_t) = \sin t\cos(σW_t) + \cos t\sin(σW_t), taking the expectation and using (a) and (b) yields
E[\sin(t + σW_t)] = \sin t\,E[\cos(σW_t)] = e^{−σ^2 t/2}\sin t.
(d) Similarly, starting from cos(t + σW_t) = \cos t\cos(σW_t) − \sin t\sin(σW_t).
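The two closed forms in (b) and (c) are easy to confirm numerically (an aside, not from the text); the sketch assumes NumPy and arbitrary σ and t.

    import numpy as np

    rng = np.random.default_rng(5)
    sigma, t = 1.3, 0.8
    W_t = np.sqrt(t) * rng.standard_normal(10**6)
    print(np.cos(sigma * W_t).mean(), np.exp(-sigma**2 * t / 2))                  # part (b)
    print(np.sin(t + sigma * W_t).mean(), np.exp(-sigma**2 * t / 2) * np.sin(t))  # part (c)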
8.2.7 From Exercise 8.2.6 (b) we have E[cos(W_t)] = e^{−t/2}. From the definition of expectation,
E[cos(W_t)] = ∫_{−∞}^{∞} \cos x\,\frac{1}{\sqrt{2πt}}e^{−x^2/(2t)}\,dx.
Then choose t = 1/2 and t = 1 to get (a) and (b), respectively.
8.2.8 (a) Using a standard method involving Ito's formula we can get E(W_t e^{bW_t}) = bt\,e^{b^2 t/2}. Let a = 1/(2t). We can write
∫ x e^{−ax^2 + bx}\,dx = \sqrt{2πt}∫ x e^{bx}\,\frac{1}{\sqrt{2πt}}e^{−x^2/(2t)}\,dx = \sqrt{2πt}\,E(W_t e^{bW_t}) = \sqrt{2πt}\,bt\,e^{b^2 t/2} = \sqrt{\frac{π}{a}}\,\frac{b}{2a}\,e^{b^2/(4a)}.
The same method works for (b) and (c).
8.2.9 (a) We have
E[\cos(tW_t)] = E\Big[\sum_{n≥0}(−1)^n\frac{W_t^{2n} t^{2n}}{(2n)!}\Big] = \sum_{n≥0}(−1)^n\frac{E[W_t^{2n}]\,t^{2n}}{(2n)!} = \sum_{n≥0}(−1)^n\frac{t^{2n}}{(2n)!}\,\frac{(2n)!\,t^n}{2^n n!} = \sum_{n≥0}(−1)^n\frac{t^{3n}}{2^n n!} = e^{−t^3/2}.
(b) Similar computation using E[W_t^{2n+1}] = 0.
8.3.2 (a) X_t = 1 + \sin t − ∫_0^t \sin s\,dW_s, E[X_t] = 1 + \sin t, Var(X_t) = ∫_0^t (\sin s)^2\,ds = \frac{t}{2} − \frac{1}{4}\sin(2t);
(b) X_t = e^t − 1 + ∫_0^t \sqrt{s}\,dW_s, E[X_t] = e^t − 1, Var(X_t) = \frac{t^2}{2};
(c) X_t = 1 + \frac{1}{2}\ln(1 + t^2) + ∫_0^t s^{3/2}\,dW_s, E[X_t] = 1 + \frac{1}{2}\ln(1 + t^2), Var(X_t) = \frac{t^4}{4}.
8.3.4
X_t = 2(1 − e^{−t/2}) + ∫_0^t e^{−s/2 + W_s}\,dW_s = 1 − 2e^{−t/2} + e^{−t/2 + W_t} = 1 + e^{−t/2}(e^{W_t} − 2).
Its distribution function is given by
F(y) = P(X_t ≤ y) = P(1 + e^{−t/2}(e^{W_t} − 2) ≤ y) = P(W_t ≤ \ln(2 + (y − 1)e^{t/2})) = \frac{1}{\sqrt{2πt}}∫_{−∞}^{\ln(2 + (y − 1)e^{t/2})} e^{−x^2/(2t)}\,dx.
E[X_t] = 2(1 − e^{−t/2}), Var(X_t) = Var(e^{−t/2} e^{W_t}) = e^{−t}Var(e^{W_t}) = e^t − 1.
8.4.3 (a) X_t = \frac{1}{3}W_t^3 − tW_t + e^t; (b) X_t = \frac{1}{3}W_t^3 − tW_t − \cos t;
(c) X_t = e^{W_t − t/2} + \frac{t^3}{3}; (d) X_t = e^{t/2}\sin W_t + \frac{t^2}{2} + 1.
8.4.4 (a) X_t = \frac{1}{4}W_t^4 + tW_t; (b) X_t = \frac{1}{2}W_t^2 + t^2 W_t − \frac{t}{2};
(c) X_t = e^t W_t − \cos W_t + 1; (d) X_t = t e^{W_t} + 2.
8.5.1 (a)
dX_t = (2W_t\,dW_t + dt) + W_t\,dt + t\,dW_t = d(W_t^2) + d(tW_t) = d(tW_t + W_t^2),
so X_t = tW_t + W_t^2. (b) We have
dX_t = \Big(2t − \frac{1}{t^2}W_t\Big)\,dt + \frac{1}{t}\,dW_t = 2t\,dt + d\Big(\frac{1}{t}W_t\Big) = d\Big(t^2 + \frac{1}{t}W_t\Big),
so X_t = t^2 + \frac{1}{t}W_t − 1 − W_1.
(c) dX_t = \frac{1}{2}e^{t/2}W_t\,dt + e^{t/2}\,dW_t = d(e^{t/2}W_t), so X_t = e^{t/2}W_t. (d) We have
dX_t = t(2W_t\,dW_t + dt) − t\,dt + W_t^2\,dt = t\,d(W_t^2) + W_t^2\,dt − \frac{1}{2}d(t^2) = d\Big(tW_t^2 − \frac{t^2}{2}\Big),
so X_t = tW_t^2 − \frac{t^2}{2}.
(e) dX_t = dt + d(\sqrt{t}\,W_t) = d(t + \sqrt{t}\,W_t), so X_t = t + \sqrt{t}\,W_t − W_1.
8.6.1 (a) X_t = X_0 e^{4t} + \frac{1}{4}(1 − e^{4t}) + 2∫_0^t e^{4(t−s)}\,dW_s;
(b) X_t = X_0 e^{3t} + \frac{2}{3}(1 − e^{3t}) + e^{3t}W_t;
(c) X_t = e^t\Big(X_0 + 1 + \frac{1}{2}W_t^2 − \frac{t}{2}\Big) − 1;
(d) X_t = X_0 e^{4t} − \frac{t}{4} − \frac{1}{16}(1 − e^{4t}) + e^{4t}W_t;
(e) X_t = X_0 e^{t/2} − 2t − 4 + 5e^{t/2} − e^t\cos W_t;
(f) X_t = X_0 e^{−t} + e^{−t}W_t.
8.8.1 (a) The integrating factor is ρ_t = e^{−∫_0^t α\,dW_s + \frac{1}{2}∫_0^t α^2\,ds} = e^{−αW_t + \frac{α^2}{2}t}, which transforms the equation into the exact form d(ρ_t X_t) = 0. Then ρ_t X_t = X_0 and hence
X_t = X_0 e^{αW_t − \frac{α^2}{2}t}.
(b) ρ_t = e^{−αW_t + \frac{α^2}{2}t}, d(ρ_t X_t) = ρ_t X_t\,dt, dY_t = Y_t\,dt, Y_t = Y_0 e^t, ρ_t X_t = X_0 e^t, X_t = X_0 e^{(1 − \frac{α^2}{2})t + αW_t}.
8.8.2 σt\,dA_t = dX_t − σA_t\,dt, E[A_t] = 0, Var(A_t) = E[A_t^2] = \frac{1}{t^2}∫_0^t E[X_s^2]\,ds = \frac{X_0^2}{t}.
Chapter 8
9.2.2 A = \frac{1}{2}∆ = \frac{1}{2}\sum_{k=1}^n ∂_k^2.
9.2.4
X_1(t) = x_1^0 + W_1(t),
X_2(t) = x_2^0 + ∫_0^t X_1(s)\,dW_2(s) = x_2^0 + x_1^0 W_2(t) + ∫_0^t W_1(s)\,dW_2(s).
9.5.2 E[τ ] is maximum if b − x0 = x0 − a, i.e. when x0 = (a + b)/2. The maximum
value is (b − a)2 /4.
9.6.1 Let τ_k = min(k, τ), so that τ_k ր τ as k → ∞. Apply Dynkin's formula for τ_k to show that
E[τ_k] ≤ \frac{1}{n}(R^2 − |a|^2),
and take k → ∞.
9.6.3 x0 and x2−n .
9.7.1 (a) We have a(t, x) = x, c(t, x) = x, ϕ(s) = xe^{s−t} and u(t, x) = ∫_t^T xe^{s−t}\,ds = x(e^{T−t} − 1).
(b) a(t, x) = tx, c(t, x) = −\ln x, ϕ(s) = xe^{(s^2−t^2)/2} and
u(t, x) = −∫_t^T \ln\big(xe^{(s^2−t^2)/2}\big)\,ds = −(T − t)\Big[\ln x + \frac{T}{6}(T + t) − \frac{t^2}{3}\Big].
9.7.2 (a) u(t, x) = x(T − t) + \frac{1}{2}(T − t)^2. (b) u(t, x) = \frac{2}{3}e^x\big(e^{\frac{3}{2}(T−t)} − 1\big). (c) We have a(t, x) = µx, b(t, x) = σx, c(t, x) = x. The associated diffusion is dX_s = µX_s\,ds + σX_s\,dW_s, X_t = x, which is the geometric Brownian motion
X_s = xe^{(µ−\frac{1}{2}σ^2)(s−t)+σ(W_s−W_t)}, \quad s ≥ t.
The solution is
u(t, x) = E\Big[∫_t^T xe^{(µ−\frac{1}{2}σ^2)(s−t)+σ(W_s−W_t)}\,ds\Big]
= x∫_t^T e^{(µ−\frac{1}{2}σ^2)(s−t)}E[e^{σ(W_s−W_t)}]\,ds
= x∫_t^T e^{(µ−\frac{1}{2}σ^2)(s−t)}e^{σ^2(s−t)/2}\,ds
= x∫_t^T e^{µ(s−t)}\,ds = \frac{x}{µ}\big(e^{µ(T−t)} − 1\big).
Chapter 9
10.1.7 Apply Example 10.1.6 with u = 1.
10.1.8 X_t = ∫_0^t h(s)\,dW_s ∼ N(0, ∫_0^t h^2(s)\,ds). Then e^{X_t} is log-normal with E[e^{X_t}] = e^{\frac{1}{2}Var(X_t)} = e^{\frac{1}{2}∫_0^t h(s)^2\,ds}.
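As an aside, the log-normal mean formula of 10.1.8 can be checked by simulating the Wiener integral for a concrete integrand, say h(s) = s (so that ½∫_0^t h(s)^2 ds = t^3/6); the sketch assumes NumPy and uses arbitrary discretization choices.

    import numpy as np

    rng = np.random.default_rng(6)
    t, n_steps, n_paths = 1.0, 1000, 200000
    ds = t / n_steps
    X = np.zeros(n_paths)
    for i in range(n_steps):
        s = i * ds                                          # left endpoint of the subinterval
        X += s * np.sqrt(ds) * rng.standard_normal(n_paths) # increment h(s) dW_s with h(s) = s
    print(np.exp(X).mean(), np.exp(t**3 / 6))               # the two values should be close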
10.1.9 (a) Using Exercise 10.1.8 we have
E[M_t] = E\big[e^{−∫_0^t u(s)\,dW_s − \frac{1}{2}∫_0^t u(s)^2\,ds}\big] = e^{−\frac{1}{2}∫_0^t u(s)^2\,ds}\,E\big[e^{−∫_0^t u(s)\,dW_s}\big] = e^{−\frac{1}{2}∫_0^t u(s)^2\,ds}\,e^{\frac{1}{2}∫_0^t u(s)^2\,ds} = 1.
(b) Similar computation to (a).
10.1.10 (a) Applying the product rule and Ito's formula we get
d(e^{t/2}\cos W_t) = −e^{t/2}\sin W_t\,dW_t.
Integrating yields
e^{t/2}\cos W_t = 1 − ∫_0^t e^{s/2}\sin W_s\,dW_s,
which is an Ito integral, and hence a martingale; (b) Similarly.
10.1.12 Use that the function f(x_1, x_2) = e^{x_1}\cos x_2 satisfies ∆f = 0.
10.1.14 (a) f(x) = x^2; (b) f(x) = x^3; (c) f(x) = x^n/(n(n − 1)); (d) f(x) = e^{cx}; (e) f(x) = \sin(cx).
Chapter 10
?? (a) Since r_t ∼ N(µ, s^2), with µ = b + (r_0 − b)e^{−at} and s^2 = \frac{σ^2}{2a}(1 − e^{−2at}), we have
P(r_t < 0) = ∫_{−∞}^0 \frac{1}{\sqrt{2π}\,s}e^{−\frac{(x−µ)^2}{2s^2}}\,dx = \frac{1}{\sqrt{2π}}∫_{−∞}^{−µ/s} e^{−v^2/2}\,dv = N(−µ/s),
where by computation
−\frac{µ}{s} = −\frac{1}{σ}\sqrt{\frac{2a}{e^{2at} − 1}}\,\big(r_0 + b(e^{at} − 1)\big).
(b) Since
\lim_{t→∞}\frac{µ}{s} = \frac{\sqrt{2a}}{σ}\lim_{t→∞}\frac{r_0 + b(e^{at} − 1)}{\sqrt{e^{2at} − 1}} = \frac{b\sqrt{2a}}{σ},
then
\lim_{t→∞} P(r_t < 0) = \lim_{t→∞} N(−µ/s) = N\Big(−\frac{b\sqrt{2a}}{σ}\Big).
It is worth noting that the previous probability is less than 0.5.
(c) The rate of change is
\frac{d}{dt}P(r_t < 0) = −\frac{1}{\sqrt{2π}}e^{−\frac{µ^2}{2s^2}}\,\frac{d}{dt}\Big(\frac{µ}{s}\Big) = −\frac{1}{\sqrt{2π}}e^{−\frac{µ^2}{2s^2}}\,\frac{a e^{2at}\big[b(e^{2at} − 1)(e^{−at} − 1) − r_0\big]}{(e^{2at} − 1)^{3/2}}.
?? By Ito's formula
d(r_t^n) = \big[na(b − r_t)r_t^{n−1} + \frac{1}{2}n(n − 1)σ^2 r_t^{n−1}\big]\,dt + nσ r_t^{n−\frac{1}{2}}\,dW_t.
Integrating between 0 and t and taking the expectation yields
µ_n(t) = r_0^n + ∫_0^t \big[nab\,µ_{n−1}(s) − na\,µ_n(s) + \frac{n(n − 1)}{2}σ^2\,µ_{n−1}(s)\big]\,ds.
Differentiating yields
µ_n′(t) + na\,µ_n(t) = \Big(nab + \frac{n(n − 1)}{2}σ^2\Big)µ_{n−1}(t).
Multiplying by the integrating factor e^{nat} yields the exact equation
[e^{nat}µ_n(t)]′ = e^{nat}\Big(nab + \frac{n(n − 1)}{2}σ^2\Big)µ_{n−1}(t).
Integrating yields the following recursive formula for the moments:
µ_n(t) = r_0^n e^{−nat} + \Big(nab + \frac{n(n − 1)}{2}σ^2\Big)∫_0^t e^{−na(t−s)}µ_{n−1}(s)\,ds.
?? (a) The spot rate r_t follows the process
d(\ln r_t) = θ(t)\,dt + σ\,dW_t.
Integrating yields
\ln r_t = \ln r_0 + ∫_0^t θ(u)\,du + σW_t,
which is normally distributed. (b) Then r_t = r_0 e^{∫_0^t θ(u)\,du + σW_t} is log-normally distributed. (c) The mean and variance are
E[r_t] = r_0 e^{∫_0^t θ(u)\,du}\,e^{\frac{1}{2}σ^2 t}, \quad Var(r_t) = r_0^2 e^{2∫_0^t θ(u)\,du + σ^2 t}\big(e^{σ^2 t} − 1\big).
?? Substitute u_t = \ln r_t and obtain the linear equation
du_t + a(t)u_t\,dt = θ(t)\,dt + σ(t)\,dW_t.
The equation can be solved by multiplying by the integrating factor e^{∫_0^t a(s)\,ds}.
Chapter 11
?? Apply the same method used in Application ??. The price of the bond is
P(t, T) = e^{−r_t(T−t)}\,E\big[e^{−σ∫_0^{T−t} N_s\,ds}\big].
?? The price of an infinitely lived bond at time t is given by \lim_{T→∞} P(t, T). Using \lim_{T→∞} B(t, T) = \frac{1}{a} and
\lim_{T→∞} A(t, T) = \begin{cases} +∞, & \text{if } b < σ^2/(2a) \\ 1, & \text{if } b > σ^2/(2a) \\ \big(\frac{1}{a} + t\big)\big(b − \frac{σ^2}{2a^2}\big) − \frac{σ^2}{4a^3}, & \text{if } b = σ^2/(2a), \end{cases}
we can get the price of the bond in all three cases.
Chapter 12
?? Dividing the equations
S_t = S_0 e^{(µ−\frac{σ^2}{2})t + σW_t}, \quad S_u = S_0 e^{(µ−\frac{σ^2}{2})u + σW_u}
yields
S_t = S_u e^{(µ−\frac{σ^2}{2})(t−u) + σ(W_t−W_u)}.
Taking the predictable part out and dropping the independent condition we obtain
E[S_t | F_u] = S_u e^{(µ−\frac{σ^2}{2})(t−u)}E[e^{σ(W_t−W_u)} | F_u] = S_u e^{(µ−\frac{σ^2}{2})(t−u)}E[e^{σ(W_t−W_u)}] = S_u e^{(µ−\frac{σ^2}{2})(t−u)}E[e^{σW_{t−u}}] = S_u e^{(µ−\frac{σ^2}{2})(t−u)}e^{\frac{1}{2}σ^2(t−u)} = S_u e^{µ(t−u)}.
Similarly we obtain
E[S_t^2 | F_u] = S_u^2 e^{2(µ−\frac{σ^2}{2})(t−u)}E[e^{2σ(W_t−W_u)} | F_u] = S_u^2 e^{2µ(t−u)}e^{σ^2(t−u)}.
Then
Var(S_t | F_u) = E[S_t^2 | F_u] − E[S_t | F_u]^2 = S_u^2 e^{2µ(t−u)}e^{σ^2(t−u)} − S_u^2 e^{2µ(t−u)} = S_u^2 e^{2µ(t−u)}\big(e^{σ^2(t−u)} − 1\big).
When u = t we get E[S_t | F_t] = S_t and Var(S_t | F_t) = 0.
?? By Ito's formula
d(\ln S_t) = \frac{1}{S_t}\,dS_t − \frac{1}{2}\frac{1}{S_t^2}(dS_t)^2 = \Big(µ − \frac{σ^2}{2}\Big)dt + σ\,dW_t,
so \ln S_t = \ln S_0 + (µ − \frac{σ^2}{2})t + σW_t, and hence \ln S_t is normally distributed with E[\ln S_t] = \ln S_0 + (µ − \frac{σ^2}{2})t and Var(\ln S_t) = σ^2 t.
?? (a) d\Big(\frac{1}{S_t}\Big) = (σ^2 − µ)\frac{1}{S_t}\,dt − \frac{σ}{S_t}\,dW_t;
(b) d(S_t^n) = n\Big(µ + \frac{n−1}{2}σ^2\Big)S_t^n\,dt + nσS_t^n\,dW_t;
(c)
d\big((S_t − 1)^2\big) = d(S_t^2) − 2\,dS_t = 2S_t\,dS_t + (dS_t)^2 − 2\,dS_t = \big[(2µ + σ^2)S_t^2 − 2µS_t\big]dt + 2σS_t(S_t − 1)\,dW_t.
?? (b) Since S_t = S_0 e^{(µ−\frac{σ^2}{2})t + σW_t}, then S_t^n = S_0^n e^{(nµ−\frac{nσ^2}{2})t + nσW_t} and hence
E[S_t^n] = S_0^n e^{(nµ−\frac{nσ^2}{2})t}E[e^{nσW_t}] = S_0^n e^{(nµ−\frac{nσ^2}{2})t}e^{\frac{1}{2}n^2σ^2 t} = S_0^n e^{(nµ + \frac{n(n−1)}{2}σ^2)t}.
(a) Let n = 2 in (b) and obtain E[S_t^2] = S_0^2 e^{(2µ+σ^2)t}.
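As a numerical aside, the moment formula can be checked by sampling the geometric Brownian motion directly; the Python sketch assumes NumPy and uses arbitrary illustrative parameters.

    import numpy as np

    rng = np.random.default_rng(7)
    S0, mu, sigma, t, n = 1.0, 0.08, 0.25, 1.0, 3
    W_t = np.sqrt(t) * rng.standard_normal(10**6)
    S_t = S0 * np.exp((mu - sigma**2 / 2) * t + sigma * W_t)
    print((S_t**n).mean(), S0**n * np.exp((n * mu + n * (n - 1) * sigma**2 / 2) * t))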
?? (a) Let X_t = S_t W_t. The product rule yields
d(X_t) = W_t\,dS_t + S_t\,dW_t + dS_t\,dW_t = W_t(µS_t\,dt + σS_t\,dW_t) + S_t\,dW_t + (µS_t\,dt + σS_t\,dW_t)dW_t = S_t(µW_t + σ)\,dt + (1 + σW_t)S_t\,dW_t.
Integrating,
X_s = ∫_0^s S_t(µW_t + σ)\,dt + ∫_0^s (1 + σW_t)S_t\,dW_t.
Let f(s) = E[X_s]. Then
f(s) = ∫_0^s \big(µf(t) + σS_0 e^{µt}\big)\,dt.
Differentiating yields
f′(s) = µf(s) + σS_0 e^{µs}, \quad f(0) = 0,
with the solution f(s) = σS_0 s e^{µs}. Hence E[W_t S_t] = σS_0 t e^{µt}.
(b) Cov(S_t, W_t) = E[S_t W_t] − E[S_t]E[W_t] = E[S_t W_t] = σS_0 t e^{µt}. Then
Corr(S_t, W_t) = \frac{Cov(S_t, W_t)}{σ_{S_t}σ_{W_t}} = \frac{σS_0 t e^{µt}}{S_0 e^{µt}\sqrt{e^{σ^2 t} − 1}\,\sqrt{t}} = σ\sqrt{\frac{t}{e^{σ^2 t} − 1}} → 0, \quad t → ∞.
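A short simulation (an aside, not from the book) confirms the covariance formula; the sketch assumes NumPy and arbitrary parameters.

    import numpy as np

    rng = np.random.default_rng(8)
    S0, mu, sigma, t = 1.0, 0.1, 0.3, 2.0
    W_t = np.sqrt(t) * rng.standard_normal(10**6)
    S_t = S0 * np.exp((mu - sigma**2 / 2) * t + sigma * W_t)
    print(np.cov(S_t, W_t)[0, 1], sigma * S0 * t * np.exp(mu * t))   # should be close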
?? d = Sd /S0 = 0.5, u = Su /S0 = 1.5, γ = −6.5, p = 0.98.
?? Let T_a be the time S_t reaches level a for the first time. Then
F(a) = P(\overline{S}_t < a) = P(T_a > t) = \ln\frac{a}{S_0}∫_t^{∞}\frac{1}{\sqrt{2πσ^3τ^3}}\,e^{−\frac{(\ln\frac{a}{S_0} − αστ)^2}{2τσ^3}}\,dτ,
where α = µ − \frac{1}{2}σ^2.
?? (a) P(T_a < T) = \ln\frac{a}{S_0}∫_0^T \frac{1}{\sqrt{2πσ^3τ^3}}\,e^{−(\ln\frac{a}{S_0} − αστ)^2/(2τσ^3)}\,dτ.
(b) ∫_{T_1}^{T_2} p(τ)\,dτ.
?? E[A_t] = S_0\Big(1 + \frac{µt}{2}\Big) + O(t^2), \quad Var(A_t) = \frac{S_0^2}{3}σ^2 t + O(t^2).
?? (a) Since \ln G_t = \frac{1}{t}∫_0^t \ln S_u\,du, using the product rule yields
d(\ln G_t) = d\Big(\frac{1}{t}\Big)∫_0^t \ln S_u\,du + \frac{1}{t}\,d\Big(∫_0^t \ln S_u\,du\Big) = −\frac{1}{t^2}\Big(∫_0^t \ln S_u\,du\Big)dt + \frac{1}{t}\ln S_t\,dt = \frac{1}{t}(\ln S_t − \ln G_t)\,dt.
(b) Let X_t = \ln G_t. Then G_t = e^{X_t} and Ito's formula yields
dG_t = d(e^{X_t}) = e^{X_t}\,dX_t + \frac{1}{2}e^{X_t}(dX_t)^2 = e^{X_t}\,dX_t = G_t\,d(\ln G_t) = \frac{G_t}{t}(\ln S_t − \ln G_t)\,dt.
?? Using the product rule we have
d\Big(\frac{H_t}{t}\Big) = d\Big(\frac{1}{t}\Big)H_t + \frac{1}{t}\,dH_t = −\frac{H_t}{t^2}\,dt + \frac{H_t}{t^2}\Big(1 − \frac{H_t}{S_t}\Big)dt = −\frac{1}{t^2}\frac{H_t^2}{S_t}\,dt,
so, for dt > 0, d(H_t/t) < 0 and hence H_t/t is decreasing. For the second part, try to apply l'Hospital's rule.
?? By continuity, the inequality (??) is preserved under the limit.
?? (a) Use that \frac{1}{n}\sum S_{t_k}^α = \frac{1}{t}\sum S_{t_k}^α\,\frac{t}{n} → \frac{1}{t}∫_0^t S_u^α\,du. (b) Let I_t = ∫_0^t S_u^α\,du. Then
d\Big(\frac{1}{t}I_t\Big) = d\Big(\frac{1}{t}\Big)I_t + \frac{1}{t}\,dI_t = \frac{1}{t}\Big(S_t^α − \frac{I_t}{t}\Big)dt.
Let X_t = A_t^α. Then by Ito's formula we get
dX_t = α\Big(\frac{1}{t}I_t\Big)^{α−1}d\Big(\frac{1}{t}I_t\Big) + \frac{1}{2}α(α − 1)\Big(\frac{1}{t}I_t\Big)^{α−2}\Big(d\Big(\frac{1}{t}I_t\Big)\Big)^2
= α\Big(\frac{1}{t}I_t\Big)^{α−1}d\Big(\frac{1}{t}I_t\Big) = α\Big(\frac{1}{t}I_t\Big)^{α−1}\frac{1}{t}\Big(S_t^α − \frac{I_t}{t}\Big)dt = \frac{α}{t}\Big(S_t^α (A_t^α)^{1−1/α} − A_t^α\Big)dt.
(c) If α = 1 we get the continuous arithmetic mean, A_t^1 = \frac{1}{t}∫_0^t S_u\,du. If α = −1 we obtain the continuous harmonic mean, A_t^{−1} = \frac{t}{∫_0^t \frac{1}{S_u}\,du}.
?? (a) By the product rule we have
dX_t = d\Big(\frac{1}{t}\Big)∫_0^t S_u\,dW_u + \frac{1}{t}\,d\Big(∫_0^t S_u\,dW_u\Big) = −\frac{1}{t^2}\Big(∫_0^t S_u\,dW_u\Big)dt + \frac{1}{t}S_t\,dW_t = −\frac{1}{t}X_t\,dt + \frac{1}{t}S_t\,dW_t.
Using the properties of Ito integrals, we have
E[X_t] = \frac{1}{t}E\Big[∫_0^t S_u\,dW_u\Big] = 0,
Var(X_t) = E[X_t^2] − E[X_t]^2 = E[X_t^2] = \frac{1}{t^2}E\Big[\Big(∫_0^t S_u\,dW_u\Big)\Big(∫_0^t S_u\,dW_u\Big)\Big] = \frac{1}{t^2}∫_0^t E[S_u^2]\,du = \frac{1}{t^2}∫_0^t S_0^2 e^{(2µ+σ^2)u}\,du = \frac{S_0^2}{t^2}\,\frac{e^{(2µ+σ^2)t} − 1}{2µ + σ^2}.
(b) The stochastic differential equation of the stock price can be written in the form
σS_t\,dW_t = dS_t − µS_t\,dt.
Integrating between 0 and t yields
σ∫_0^t S_u\,dW_u = S_t − S_0 − µ∫_0^t S_u\,du.
Dividing by t yields the desired relation.
?? Using independence, E[S_t] = S_0 e^{µt}E[(1 + ρ)^{N_t}]. Then use
E[(1 + ρ)^{N_t}] = \sum_{n≥0} E[(1 + ρ)^n | N_t = n]P(N_t = n) = \sum_{n≥0}(1 + ρ)^n e^{−λt}\frac{(λt)^n}{n!} = e^{λρt}.
?? E[\ln S_t] = \ln S_0 + \big(µ − λρ − \frac{σ^2}{2} + λ\ln(ρ + 1)\big)t.
?? E[S_t | F_u] = S_u e^{(µ−λρ−\frac{σ^2}{2})(t−u)}(1 + ρ)^{N_{t−u}}.
Chapter 13
?? (a) Substitute \lim_{S_t→∞} N(d_1) = \lim_{S_t→∞} N(d_2) = 1 and \lim_{S_t→∞}\frac{K}{S_t} = 0 in
\frac{c(t)}{S_t} = N(d_1) − e^{−r(T−t)}\frac{K}{S_t}N(d_2).
(b) Use \lim_{S_t→0} N(d_1) = \lim_{S_t→0} N(d_2) = 0. (c) It comes from an analysis similar to the one used at (a).
?? Differentiating in c(t) = S_t N(d_1) − Ke^{−r(T−t)}N(d_2) yields
\frac{dc(t)}{dS_t} = N(d_1) + S_t\frac{dN(d_1)}{dS_t} − Ke^{−r(T−t)}\frac{dN(d_2)}{dS_t}
= N(d_1) + S_t N′(d_1)\frac{∂d_1}{∂S_t} − Ke^{−r(T−t)}N′(d_2)\frac{∂d_2}{∂S_t}
= N(d_1) + S_t\frac{1}{\sqrt{2π}}e^{−d_1^2/2}\frac{1}{σ\sqrt{T−t}\,S_t} − Ke^{−r(T−t)}\frac{1}{\sqrt{2π}}e^{−d_2^2/2}\frac{1}{σ\sqrt{T−t}\,S_t}
= N(d_1) + \frac{1}{\sqrt{2π}}\frac{1}{σ\sqrt{T−t}}\frac{1}{S_t}e^{−d_2^2/2}\Big[S_t e^{(d_2^2−d_1^2)/2} − Ke^{−r(T−t)}\Big].
It suffices to show
S_t e^{(d_2^2−d_1^2)/2} − Ke^{−r(T−t)} = 0.
Since
d_2^2 − d_1^2 = (d_2 − d_1)(d_2 + d_1) = −σ\sqrt{T−t}\;\frac{2\ln(S_t/K) + 2r(T−t)}{σ\sqrt{T−t}} = −2\big(\ln S_t − \ln K + r(T−t)\big),
then we have
e^{(d_2^2−d_1^2)/2} = e^{−\ln S_t + \ln K − r(T−t)} = \frac{1}{S_t}e^{\ln K}e^{−r(T−t)} = \frac{1}{S_t}Ke^{−r(T−t)}.
Therefore
S_t e^{(d_2^2−d_1^2)/2} − Ke^{−r(T−t)} = S_t\frac{1}{S_t}Ke^{−r(T−t)} − Ke^{−r(T−t)} = 0,
which shows that \frac{dc(t)}{dS_t} = N(d_1).
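The identity dc/dS_t = N(d_1) can also be verified numerically with a centered finite difference of the Black-Scholes call price; the sketch below assumes NumPy and SciPy are available, and all parameter values are illustrative.

    import numpy as np
    from scipy.stats import norm

    def call_price(S, K, r, sigma, tau):
        # Black-Scholes European call price with time to maturity tau = T - t
        d1 = (np.log(S / K) + (r + sigma**2 / 2) * tau) / (sigma * np.sqrt(tau))
        d2 = d1 - sigma * np.sqrt(tau)
        return S * norm.cdf(d1) - K * np.exp(-r * tau) * norm.cdf(d2)

    S, K, r, sigma, tau, h = 100.0, 95.0, 0.05, 0.2, 0.75, 1e-4
    delta_fd = (call_price(S + h, K, r, sigma, tau) - call_price(S - h, K, r, sigma, tau)) / (2 * h)
    d1 = (np.log(S / K) + (r + sigma**2 / 2) * tau) / (sigma * np.sqrt(tau))
    print(delta_fd, norm.cdf(d1))   # the two numbers should agree to high accuracy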
?? The payoff is
f_T = \begin{cases} 1, & \text{if } \ln K_1 ≤ X_T ≤ \ln K_2 \\ 0, & \text{otherwise,} \end{cases}
where X_T ∼ N\big(\ln S_t + (µ − \frac{σ^2}{2})(T − t),\, σ^2(T − t)\big).
\hat{E}_t[f_T] = E[f_T | F_t, µ = r] = ∫_{−∞}^{∞} f_T(x)p(x)\,dx = ∫_{\ln K_1}^{\ln K_2}\frac{1}{σ\sqrt{2π}\sqrt{T−t}}\,e^{−\frac{[x − \ln S_t − (r−\frac{σ^2}{2})(T−t)]^2}{2σ^2(T−t)}}\,dx
= ∫_{−d_2(K_1)}^{−d_2(K_2)}\frac{1}{\sqrt{2π}}e^{−y^2/2}\,dy = ∫_{d_2(K_2)}^{d_2(K_1)}\frac{1}{\sqrt{2π}}e^{−y^2/2}\,dy = N\big(d_2(K_1)\big) − N\big(d_2(K_2)\big),
where
d_2(K_1) = \frac{\ln S_t − \ln K_1 + (r − \frac{σ^2}{2})(T − t)}{σ\sqrt{T−t}}, \quad d_2(K_2) = \frac{\ln S_t − \ln K_2 + (r − \frac{σ^2}{2})(T − t)}{σ\sqrt{T−t}}.
Hence the value of a box-contract at time t is f_t = e^{−r(T−t)}\big[N(d_2(K_1)) − N(d_2(K_2))\big].
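For convenience, the box-contract value can be coded directly from the formula above; the sketch assumes NumPy and SciPy, and the strikes and market parameters are illustrative.

    import numpy as np
    from scipy.stats import norm

    def d2(S, K, r, sigma, tau):
        return (np.log(S) - np.log(K) + (r - sigma**2 / 2) * tau) / (sigma * np.sqrt(tau))

    def box_contract_value(S, K1, K2, r, sigma, tau):
        # f_t = e^{-r tau} [ N(d2(K1)) - N(d2(K2)) ], with tau = T - t and K1 < K2
        return np.exp(-r * tau) * (norm.cdf(d2(S, K1, r, sigma, tau))
                                   - norm.cdf(d2(S, K2, r, sigma, tau)))

    print(box_contract_value(S=100.0, K1=90.0, K2=110.0, r=0.05, sigma=0.2, tau=0.5))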
?? The payoff is
f_T = \begin{cases} e^{X_T}, & \text{if } X_T > \ln K \\ 0, & \text{otherwise.} \end{cases}
f_t = e^{−r(T−t)}\hat{E}_t[f_T] = e^{−r(T−t)}∫_{\ln K}^{∞} e^x p(x)\,dx.
The computation is similar to that of the integral I_2 from Proposition ??.
?? (a) The payoff is
f_T = \begin{cases} e^{nX_T}, & \text{if } X_T > \ln K \\ 0, & \text{otherwise.} \end{cases}
\hat{E}_t[f_T] = ∫_{\ln K}^{∞} e^{nx}p(x)\,dx = \frac{1}{σ\sqrt{2π(T−t)}}∫_{\ln K}^{∞} e^{nx}e^{−\frac{[x − \ln S_t − (r−σ^2/2)(T−t)]^2}{2σ^2(T−t)}}\,dx
= \frac{1}{\sqrt{2π}}∫_{−d_2}^{∞} e^{nσ\sqrt{T−t}\,y}\,e^{n\ln S_t}\,e^{n(r−\frac{σ^2}{2})(T−t)}\,e^{−\frac{1}{2}y^2}\,dy
= \frac{1}{\sqrt{2π}}S_t^n e^{n(r−\frac{σ^2}{2})(T−t)}∫_{−d_2}^{∞} e^{−\frac{1}{2}y^2 + nσ\sqrt{T−t}\,y}\,dy
= \frac{1}{\sqrt{2π}}S_t^n e^{n(r−\frac{σ^2}{2})(T−t)}e^{\frac{1}{2}n^2σ^2(T−t)}∫_{−d_2}^{∞} e^{−\frac{1}{2}(y − nσ\sqrt{T−t})^2}\,dy
= \frac{1}{\sqrt{2π}}S_t^n e^{(nr + n(n−1)\frac{σ^2}{2})(T−t)}∫_{−d_2 − nσ\sqrt{T−t}}^{∞} e^{−\frac{z^2}{2}}\,dz
= S_t^n e^{(nr + n(n−1)\frac{σ^2}{2})(T−t)}N\big(d_2 + nσ\sqrt{T−t}\big).
The value of the contract at time t is
f_t = e^{−r(T−t)}\hat{E}_t[f_T] = S_t^n e^{(n−1)(r + \frac{nσ^2}{2})(T−t)}N\big(d_2 + nσ\sqrt{T−t}\big).
(b) Using g_t = S_t^n e^{(n−1)(r + \frac{nσ^2}{2})(T−t)}, we get f_t = g_t N(d_2 + nσ\sqrt{T−t}).
(c) In the particular case n = 1, we have N(d_2 + σ\sqrt{T−t}) = N(d_1) and the value takes the form f_t = S_t N(d_1).
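The same formula can be evaluated directly in code, and the particular case n = 1 provides a built-in check against the asset-or-nothing value S_t N(d_1); the sketch assumes NumPy and SciPy, with illustrative parameters.

    import numpy as np
    from scipy.stats import norm

    def power_asset_or_nothing(S, K, r, sigma, tau, n):
        # f_t = S^n e^{(n-1)(r + n sigma^2/2) tau} N(d2 + n sigma sqrt(tau)), tau = T - t
        d2 = (np.log(S / K) + (r - sigma**2 / 2) * tau) / (sigma * np.sqrt(tau))
        return S**n * np.exp((n - 1) * (r + n * sigma**2 / 2) * tau) * norm.cdf(d2 + n * sigma * np.sqrt(tau))

    S, K, r, sigma, tau = 100.0, 95.0, 0.05, 0.2, 0.75
    d1 = (np.log(S / K) + (r + sigma**2 / 2) * tau) / (sigma * np.sqrt(tau))
    print(power_asset_or_nothing(S, K, r, sigma, tau, n=1), S * norm.cdf(d1))   # should agree
    print(power_asset_or_nothing(S, K, r, sigma, tau, n=2))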
?? Since \ln S_T ∼ N\big(\ln S_t + (µ − \frac{σ^2}{2})(T − t),\, σ^2(T − t)\big), we have
\hat{E}_t[f_T] = E[(\ln S_T)^2 | F_t, µ = r] = Var(\ln S_T | F_t, µ = r) + E[\ln S_T | F_t, µ = r]^2 = σ^2(T − t) + \Big(\ln S_t + (r − \frac{σ^2}{2})(T − t)\Big)^2,
so
f_t = e^{−r(T−t)}\Big[σ^2(T − t) + \Big(\ln S_t + (r − \frac{σ^2}{2})(T − t)\Big)^2\Big].
?? \ln S_T^n is normally distributed with
\ln S_T^n = n\ln S_T ∼ N\Big(n\ln S_t + n\big(µ − \frac{σ^2}{2}\big)(T − t),\, n^2σ^2(T − t)\Big).
Redo the computation of Proposition ?? in this case.
?? Let n = 2. Then
f_t = e^{−r(T−t)}\hat{E}_t[(S_T − K)^2] = e^{−r(T−t)}\hat{E}_t[S_T^2] + e^{−r(T−t)}K^2 − 2Ke^{−r(T−t)}\hat{E}_t[S_T] = S_t^2 e^{(r+σ^2)(T−t)} + e^{−r(T−t)}K^2 − 2KS_t.
Let n = 3. Then
f_t = e^{−r(T−t)}\hat{E}_t[(S_T − K)^3] = e^{−r(T−t)}\hat{E}_t[S_T^3 − 3KS_T^2 + 3K^2S_T − K^3]
= e^{−r(T−t)}\hat{E}_t[S_T^3] − 3Ke^{−r(T−t)}\hat{E}_t[S_T^2] + e^{−r(T−t)}\big(3K^2\hat{E}_t[S_T] − K^3\big)
= e^{2(r + 3σ^2/2)(T−t)}S_t^3 − 3Ke^{(r+σ^2)(T−t)}S_t^2 + 3K^2S_t − e^{−r(T−t)}K^3.
?? Choose cn = 1/n! in the general contract formula.
?? A computation similar to the one done for the call option.
?? Write the payoff as a difference of two puts, one with strike price K1 and the other
with strike price K2 . Then apply the superposition principle.
?? Write the payoff as fT = c1 + c3 − 2c2 , where ci is a call with strike price Ki . Apply
the superposition principle.
?? (a) By inspection. (b) Write the payoff as the sum between a call and a put both
with strike price K, and then apply the superposition principle.
?? (a) By inspection. (b) Write the payoff as the sum between a call with strike price
K2 and a put with strike price K1 , and then apply the superposition principle. (c) The
strangle is cheaper.
?? The payoff can be written as a sum between the payoffs of a (K3 , K4 )-bull spread
and a (K1 , K2 )-bear spread. Apply the superposition principle.
?? By computation.
?? The risk-neutral valuation yields
f_t = e^{−r(T−t)}\hat{E}_t[S_T − A_T] = e^{−r(T−t)}\hat{E}_t[S_T] − e^{−r(T−t)}\hat{E}_t[A_T]
= S_t − e^{−r(T−t)}\frac{t}{T}A_t − \frac{S_t}{rT}\big(1 − e^{−r(T−t)}\big)
= S_t\Big(1 − \frac{1}{rT} + \frac{1}{rT}e^{−r(T−t)}\Big) − \frac{t}{T}e^{−r(T−t)}A_t.
?? We have
\lim_{t→0} f_t = S_0 e^{−rT + (r − \frac{σ^2}{2})\frac{T}{2} + \frac{σ^2 T}{6}} − e^{−rT}K = S_0 e^{−\frac{1}{2}(r + \frac{σ^2}{6})T} − e^{−rT}K.
?? Since G_T < A_T, then \hat{E}_t[G_T] < \hat{E}_t[A_T] and hence
e^{−r(T−t)}\hat{E}_t[G_T − K] < e^{−r(T−t)}\hat{E}_t[A_T − K],
so the forward contract on the geometric average is cheaper.
?? Dividing
G_t = S_0 e^{(µ−\frac{σ^2}{2})\frac{t}{2}}\,e^{\frac{σ}{t}∫_0^t W_u\,du}, \quad G_T = S_0 e^{(µ−\frac{σ^2}{2})\frac{T}{2}}\,e^{\frac{σ}{T}∫_0^T W_u\,du}
yields
\frac{G_T}{G_t} = e^{(µ−\frac{σ^2}{2})\frac{T−t}{2}}\,e^{\frac{σ}{T}∫_0^T W_u\,du − \frac{σ}{t}∫_0^t W_u\,du}.
An algebraic manipulation leads to
G_T = G_t e^{(µ−\frac{σ^2}{2})\frac{T−t}{2}}\,e^{σ(\frac{1}{T} − \frac{1}{t})∫_0^t W_u\,du}\,e^{\frac{σ}{T}∫_t^T W_u\,du}.
?? Use that
P\big(\max_{t≤T} S_t ≥ z\big) = P\Big(\sup_{t≤T}\big[(µ − \frac{σ^2}{2})t + σW_t\big] ≥ \ln\frac{z}{S_0}\Big) = P\Big(\sup_{t≤T}\big[\frac{1}{σ}(µ − \frac{σ^2}{2})t + W_t\big] ≥ \frac{1}{σ}\ln\frac{z}{S_0}\Big).
Chapter 15
?? P = F − ∆_F S = c − N(d_1)S = −Ke^{−r(T−t)}N(d_2).
Chapter 17
?? We have
P(S_t < b) = P\Big((r − \frac{σ^2}{2})t + σW_t < \ln\frac{b}{S_0}\Big) = P\Big(σW_t < \ln\frac{b}{S_0} − (r − \frac{σ^2}{2})t\Big)
= P(σW_t < −λ) = P(σW_t > λ) ≤ e^{−\frac{λ^2}{2σ^2 t}} = e^{−\frac{1}{2σ^2 t}[\ln(S_0/b) + (r − σ^2/2)t]^2},
where we used Proposition 2.11.12 with µ = 0.