Advanced Stochastic Calculus I
Fall 2007
Prof. K. Ramanan∗        Chris Almost†

Course website available under Dr. Ramanan’s website. These notes were originally compiled during the Fall semester of 2007, with updates made during the Fall semester of 2009.

∗ [email protected]
† [email protected]

Contents

0  Review of some probability theory
1  Brownian Motion
   1.1  Introduction to stochastic processes
   1.2  Construction of Brownian motion
   1.3  Sample path properties
   1.4  Distributional properties
   1.5  Markov property
2  Martingales
   2.1  Martingale convergence theorem
   2.2  Continuous Martingales
   2.3  Applications
   2.4  Lévy processes
   2.5  Doob-Meyer decomposition
3  Stochastic Integration
   3.1  Riemann-Stieltjes Integration
   3.2  Construction of the Itô integral
   3.3  Characterization of the Stochastic Integral
   3.4  Stochastic Integration
   3.5  Integration by parts formula for stochastic integrals
   3.6  Fisk-Stratonovich integral
   3.7  Applications of Itô’s formula
Index
0  Review of some probability theory

For this course we let (Ω, F, P) be a probability space and (S, S) be a (sufficiently nice) topological space. The Borel σ-algebra B(S) is the σ-algebra generated by the open sets, B(S) = σ(S).

0.0.1 Definition. X : (Ω, F) → (S, S) is called a random element (or an F-measurable function) if X⁻¹(A) ∈ F for all A ∈ S. We will also use the terms random variable, random vector, random process, or random measure, as appropriate for the codomain.

0.0.2 Definition. (Comparison of random elements)
(i) If P[X = X′] = 1 then we say that X and X′ are equal a.s. or indistinguishable.
(ii) If P[X ∈ A] = P[X′ ∈ A] for all A ∈ S then we say that X and X′ are equal in distribution. Equality in distribution can be defined for random elements defined on unequal probability spaces (but they must have the same codomain).

0.0.3 Example. Let Ω = {H, T} and P be the uniform probability (i.e. Bernoulli with parameter 1/2). Let X(H) = 0 = X′(T) and X(T) = 1 = X′(H). Then P[X = 1] = P[X′ = 1], so they are equal in distribution, but they are not equal a.s.; in fact they are unequal a.s.

0.0.4 Definition. (Convergence of random variables)
(i) X_n converges P-a.s. to X if P[lim sup_n |X_n − X| > 0] = 0.
(ii) X_n converges in probability to X (written X_n →(p) X) if lim_{n→∞} P[|X_n − X| ≥ ε] = 0 for all ε > 0.
(iii) X_n converges in distribution to X (written X_n →(d) X) if P[X_n ∈ A] → P[X ∈ A] for all A such that P[X ∈ ∂A] = 0. Equivalently, X_n →(d) X if the distribution functions of X_n converge pointwise to the distribution function of X at every point of continuity of that function. Equivalently, X_n →(d) X if E[f(X_n)] → E[f(X)] for all bounded continuous functions f.

Remark. When trying to prove a sequence of random variables converges in probability, Markov’s inequality is often very useful.

Now let us recall some important theorems.
Let X_i, i ∈ N, be i.i.d. r.v.’s such that E|X_i| < ∞ with µ = E[X_i], and let S_n = Σ_{i=1}^n X_i. Khintchine’s weak law of large numbers says that

S_n/n →(p) µ,

and Kolmogorov’s strong law of large numbers says that

S_n/n → µ  P-a.s.

If σ² = E[|X_1 − µ|²] < ∞ (i.e. the r.v.’s have finite variance) then the central limit theorem states

(S_n − nµ)/(σ√n) →(d) N(0, 1),

where of course the right hand side may be replaced by any standard Gaussian random element. The appearance in this theorem of the normal distribution is a big part of why this distribution comes up everywhere.

1  Brownian Motion

1.1  Introduction to stochastic processes

1.1.1 Definition. A stochastic process on (Ω, F, P) with state space (S, S) is (equivalently) a
(i) (one-parameter) family of random variables, {X_t : Ω → S | t ∈ [0, ∞)};
(ii) random element of R^{[0,∞)}, X = {X_{(·)}(ω) : [0, ∞) → S | ω ∈ Ω};
(iii) random element of two variables, X : [0, ∞) × Ω → S.
Notice that the concept of “measurability” would seem a priori to be different for each of the three definitions.

1.1.2 Definition. A stochastic process is said to be measurable if X is measurable as a random element of two variables, i.e. if for all A ∈ S, {(t, ω) | X(t, ω) ∈ A} ∈ B[0, ∞) ⊗ F.

Warning: In K&S the authors write B(R^{[0,∞)}) = ⊗_{[0,∞)} B(R), which may or may not be true according to our definitions. (Think about this?)

1.1.3 Definition. (Comparison of random processes) Let X and X′ be random processes.
(i) X and X′ are indistinguishable if P[X_t = X′_t for all t] = 1.
(ii) X is said to be a modification of X′ if P[X_t = X′_t] = 1 for all t.
(iii) If P[X ∈ A] = P[X′ ∈ A] for all A ∈ B(R^{[0,∞)}) then X and X′ are equal in distribution.
(iv) If for every t_1, . . . , t_n we have P[(X_{t_1}, . . . , X_{t_n}) ∈ A] = P[(X′_{t_1}, . . . , X′_{t_n}) ∈ A] for all A ∈ B(Rⁿ), then X and X′ are said to have the same finite dimensional marginal distributions (or fi.di. distributions).
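As a quick numerical illustration of the limit theorems recalled above (a sketch, not part of the notes; the Exp(1) distribution and the sample sizes are arbitrary choices):

```python
import math
import random

def lln_clt_demo(n=20000, trials=2000, m=500, seed=0):
    """Empirical check of the SLLN and CLT for i.i.d. Exp(1) variables,
    for which mu = 1 and sigma = 1 (an illustrative choice)."""
    rng = random.Random(seed)
    # Law of large numbers: S_n / n for one long sample; should be near mu.
    mean = sum(rng.expovariate(1.0) for _ in range(n)) / n
    # CLT: distribution of (S_m - m*mu) / (sigma*sqrt(m)) over many trials.
    zs = []
    for _ in range(trials):
        s = sum(rng.expovariate(1.0) for _ in range(m))
        zs.append((s - m) / math.sqrt(m))
    # For a N(0,1) limit, P(|Z| <= 1) is about 0.683.
    frac_within_one = sum(abs(z) <= 1.0 for z in zs) / trials
    return mean, frac_within_one
```

Running `lln_clt_demo()` should return a sample mean near µ = 1 and roughly 68% of the normalized sums inside [−1, 1], consistent with the N(0, 1) limit.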
As before, the definitions of equality in distribution and having the same fi.di. distributions can be extended to processes defined on different probability spaces.

1.1.4 Definition. A sequence of random elements X_n converges weakly to X (written X_n →(w) X) if E[f(X_n)] → E[f(X)] for every bounded continuous function f : S → R.

1.2  Construction of Brownian motion

For reasons that will become clear in the section on Donsker’s invariance principle, we look for a process B = {B_t, t ≥ 0} with the following properties.
(i) B_0 = 0 and B_t ∼ N(0, t) for all t;
(ii) B has stationary increments (or homogeneous increments), i.e. B_t − B_s has the same distribution as B_{t−s} for all s < t;
(iii) B has independent increments, i.e. for s < t, B_t − B_s is independent of {B_u | 0 ≤ u ≤ s}; and
(iv) B has continuous paths.

The canonical version of a random element X : (Ω, F, P) → (S, S) is constructed as follows. The image measure P X⁻¹ of X is defined by P X⁻¹(A) = P[X ∈ A] for A ∈ B(S). There is always a random element defined on (S, S, P X⁻¹) with the same distribution as X, namely the identity function. Thus our goal is to construct a measure on (C[0, 1], B(C[0, 1])) that satisfies properties (i)–(iii).

1.2.1 Lemma. Given 0 ≤ t_1 < t_2 < t_3, any process B that satisfies properties (i)–(iii) must have the joint probability density function

f_{B_{t_1}, B_{t_2}, B_{t_3}}(x, y, z) = p(t_1; 0, x) p(t_2 − t_1; x, y) p(t_3 − t_2; y, z),

where p(t; x, y) := (1/√(2πt)) exp(−(y − x)²/(2t)).

This lemma extends easily to any finite number of times, and it is seen that properties (i)–(iii) determine the fi.di. distributions of B.

To use Carathéodory’s extension theorem to construct a measure on (G, G) one must:
(i) Define a finitely additive set function µ_0 on some algebra or π-system C ⊆ G.
(ii) Show that µ_0 is countably additive on C.
(iii) Carathéodory’s theorem allows us to conclude that µ_0 may be extended as a measure to the completion of σ(C) ⊆ G.
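Lemma 1.2.1 pins down the fi.di. distributions. As a quick numerical illustration (a sketch, not part of the notes; the times and sample sizes are arbitrary choices), one can sample (B_{t_1}, . . . , B_{t_n}) from independent Gaussian increments and check the covariance E[B_s B_t] = s ∧ t:

```python
import math
import random

def p(t, x, y):
    """The Gaussian transition density p(t; x, y) from Lemma 1.2.1."""
    return math.exp(-(y - x) ** 2 / (2.0 * t)) / math.sqrt(2.0 * math.pi * t)

def sample_bm(times, rng):
    """Sample (B_{t_1}, ..., B_{t_n}) at increasing times via independent
    N(0, t_i - t_{i-1}) increments, exactly as properties (i)-(iii) dictate."""
    b, prev, values = 0.0, 0.0, []
    for t in times:
        b += rng.gauss(0.0, math.sqrt(t - prev))
        prev = t
        values.append(b)
    return values

# Empirical check that Cov(B_s, B_t) = min(s, t); here s = 0.5, t = 1.0.
rng = random.Random(1)
n = 40000
acc = 0.0
for _ in range(n):
    b_s, b_t = sample_bm([0.5, 1.0], rng)
    acc += b_s * b_t
cov = acc / n  # should be close to min(0.5, 1.0) = 0.5
```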
One could also start with a reference measure and define another measure via a density function. Or one could define a measure as a “limit” of “simple” measures.

Construction of Brownian motion: method 1

Let I be the space of finite increasing sequences of times,

I := {(t_1, . . . , t_n) | n ∈ N, 0 ≤ t_1 < · · · < t_n < ∞}.

From the lemma, it is clear that P must have finite dimensional distributions given by

Q_t(A) := P{ω ∈ C[0, ∞) | (ω_{t_1}, . . . , ω_{t_n}) ∈ A} for all A ∈ B(Rⁿ),

where

Q_t(A) = ∫_A p(t_1; 0, x_1) · · · p(t_n − t_{n−1}; x_{n−1}, x_n) dx_1 · · · dx_n.

This method will yield a unique measure on (C[0, ∞), B(C[0, ∞))), Wiener measure, with fi.di. distributions given by {Q_t, t ∈ I}. (But I don’t see how, yet.)

Let C_0 be the collection of so-called cylinder sets, sets of the form {ω ∈ C[0, ∞) | (ω_{t_1}, . . . , ω_{t_n}) ∈ A} for A ∈ B(Rⁿ), for n ∈ N. We have a finitely additive set function Q_t defined on C_0. As an exercise, prove that the collection {Q_t | t ∈ I} is consistent (see below). The next step is to show that Q_t is countably additive on C_0 (see Itô-McKean), and the final step is to show that σ(C_0) = B(C[0, ∞)). (I must be missing something here.)

The canonical version of Brownian motion is defined to be the canonical process on (C[0, ∞), B(C[0, ∞)), P), where P is Wiener measure.

Construction of Brownian motion: method 2

1.2.2 Theorem (Daniell-Kolmogorov). Let {Q_t(·), t ∈ I} be a family of finite dimensional distributions that satisfies
(i) Q_t(A) is invariant under permutation of the elements of t; and
(ii) for any t = (t_1, . . . , t_n), if s = (t_1, . . . , t_{n−1}) and A ∈ B(R^{n−1}) then Q_t(A × R) = Q_s(A).
This is the requirement that the family is consistent. Then there is a unique probability measure P on (R^{[0,∞)}, B(R^{[0,∞)})) such that

P{ω ∈ R^{[0,∞)} | (ω_{t_1}, . . . , ω_{t_n}) ∈ A} = Q_t(A)

for all A ∈ B(Rⁿ) and t ∈ I.

Let C̃ be the collection of cylinder sets {ω ∈ R^{[0,∞)} | (ω_{t_1}, . . . , ω_{t_n}) ∈ A} for A ∈ B(Rⁿ), for n ∈ N. As in method 1, Q_t is defined on C̃ and it is finitely additive and consistent. It is countably additive by the Daniell-Kolmogorov extension theorem. Carathéodory’s theorem gives a measure satisfying properties (i)–(iii) on σ(C̃). The only thing remaining is to deal with the continuity of the paths.

1.2.3 Theorem (Kolmogorov-Čentsov). If {X_t | t ∈ [0, T]} is a real valued stochastic process defined on (Ω, F, P) that satisfies

E[|X_t − X_s|^α] ≤ C|t − s|^{1+β}

for all 0 ≤ s ≤ t ≤ T and some positive constants α, β, and C, then there is a continuous modification of X which is locally Hölder continuous with exponent γ, for all γ ∈ (0, β/α).

A stochastic process {X_t | t ∈ [0, T]} is said to be locally Hölder continuous with exponent γ if there is an a.s. positive r.v. h such that

P[ sup_{0<t−s<h, s,t∈[0,T]} |X_t − X_s| / |t − s|^γ ≤ δ ] = 1

for some appropriate constant δ > 0.

PROOF: The first step is to choose a countable dense subset of [0, T]. We use the dyadic rationals D = {k2^{−n} | k = 0, . . . , 2ⁿ, n ∈ N}. Define Ω̃∗ = {ω | t ↦ X_t(ω) is uniformly continuous on D}, where X is the canonical process on R^{[0,∞)}. We would like to show that P(Ω̃∗) = 1. We will show instead that P(Ω∗) = 1, where

Ω∗ = {ω | t ↦ X_t(ω) is locally Hölder continuous on D with exponent γ}.

By definition, ω ∈ Ω∗ holds if there is n∗(ω) such that

max_{1≤k≤2ⁿ} |X_{k2^{−n}}(ω) − X_{(k−1)2^{−n}}(ω)| < 2^{−nγ}

for all n ≥ n∗(ω). Call this property (P). Let

E_n = {ω | max_{1≤k≤2ⁿ} |X_{k2^{−n}}(ω) − X_{(k−1)2^{−n}}(ω)| ≥ 2^{−nγ}}.

The set of ω for which property (P) does not hold is the set of ω for which E_n occurs infinitely often. Whence

Ω \ Ω∗ = ∩_{m∈N} ∪_{n≥m} E_n.

Now

P(E_n) = P( ∪_{k=1}^{2ⁿ} {|X_{k2^{−n}} − X_{(k−1)2^{−n}}| ≥ 2^{−nγ}} )
 ≤ Σ_{k=1}^{2ⁿ} P(|X_{k2^{−n}} − X_{(k−1)2^{−n}}|^α ≥ 2^{−nγα})
 ≤ 2^{nγα} Σ_{k=1}^{2ⁿ} E(|X_{k2^{−n}} − X_{(k−1)2^{−n}}|^α)
 ≤ C 2^{nγα} Σ_{k=1}^{2ⁿ} (1/2ⁿ)^{1+β}
 = C 2^{(γα−β)n},

so Σ_{n=1}^∞ P(E_n) < ∞ if γ ∈ (0, β/α).
Therefore by the Borel-Cantelli Lemma we have P(Ω∗) = 1.

The next step is to define the modification. We define

X̃_t(ω) = X_t(ω)  if t ∈ D and ω ∈ Ω∗;
X̃_t(ω) = lim_{s_n→t, {s_n}⊆D} X_{s_n}(ω)  if t ∉ D and ω ∈ Ω∗;
X̃_t(ω) = 0  if ω ∉ Ω∗;

and we must show that it is truly a modification of X. For t ∈ D, P(X_t = X̃_t) = P(Ω∗) = 1. For t ∉ D, we know by construction that X̃_{s_n} → X̃_t a.s. for any {s_n} ⊆ D converging to t. We know by the K-Č inequality that X_{s_n} → X_t in probability. Therefore X̃_{s_n} → X_t in probability, and so P(X_t = X̃_t) = 1.

To complete the construction we must check the K-Č inequality holds for Brownian motion for some α and β.

Weak convergence

There are a few books on convergence of processes: Billingsley (but not Probability), Parthasarathy, Jacod-Shiryaev. For this section we let (S, S) be a metrizable space.

1.2.4 Definition. A sequence of measures µ_n, all defined on the same space (S, B(S)), converges weakly to µ, denoted µ_n →(w) µ, if

E_{µ_n}[f] = ∫_S f(x) µ_n(dx) → ∫_S f(x) µ(dx) = E_µ[f]

for all f ∈ C_b(S). It can be seen that when S = R this reduces to convergence in distribution.

A norm that may be defined on a collection of measures absolutely continuous with respect to some measure λ is the total variation norm, defined by

‖µ‖_TV = ∫_S |dµ/dλ| dλ.

1.2.5 Exercise. Suppose that µ_n is the distribution of

S_n/√n = Σ_{i=1}^n (X_i − E[X_i]) / √n,

where the X_i are i.i.d. with finite means and variances. Does µ_n converge in the total variation metric?

1.2.6 Definition. Given a sequence of probability measures µ_n on (S, B(S)) and a probability measure µ on (S, B(S)), µ_n converges weakly to µ if and only if

lim_{n→∞} ∫_S f dµ_n = ∫_S f dµ

for every bounded continuous function f on S. We write µ_n →(w) µ. This is the weak∗ topology induced from considering the space of finitely additive measures as the dual of the space of bounded measurable functions.
Note that

‖µ‖_TV = sup_{f∈L^∞(S), ‖f‖_∞≤1} |∫_S f dµ|,

so the total variation norm is the operator norm induced by this duality.

1.2.7 Theorem (Portmanteau). The following are equivalent.
(i) µ_n →(w) µ;
(ii) lim inf_{n→∞} µ_n(U) ≥ µ(U) for every open set U;
(iii) lim sup_{n→∞} µ_n(C) ≤ µ(C) for every closed set C;
(iv) lim_{n→∞} µ_n(A) = µ(A) for every measurable set A such that µ(∂A) = 0.

1.2.8 Continuous mapping theorem. If Φ : S → S′ is continuous and µ_n →(w) µ in (S, B(S)) then Φ(µ_n) →(w) Φ(µ) in (S′, B(S′)).

PROOF: Let g ∈ C_b(S′). Then g ∘ Φ ∈ C_b(S), so by definition

∫_{S′} g dΦ(µ_n) = ∫_S g ∘ Φ dµ_n → ∫_S g ∘ Φ dµ = ∫_{S′} g dΦ(µ),

so Φ(µ_n) →(w) Φ(µ).

1.2.9 Definition. Let Π be a family of probability measures on (S, B(S)).
(i) Π is said to be relatively compact if every sequence in Π contains a weakly convergent subsequence. (This is just relative compactness with respect to the topology induced by weak convergence.)
(ii) Π is tight if for every ε > 0 there is a compact set K ⊆ S such that P(K) ≥ 1 − ε for all P ∈ Π.

1.2.10 Theorem (Prohorov). Suppose that (S, S) is a Polish space (i.e. a complete, separable, metrizable space). Then a family Π of probability measures on (S, B(S)) is tight if and only if Π is relatively compact.

We will be interested in the case when S is the space of continuous functions on [0, ∞) with the metric ρ associated with the uniform norm,

ρ(f, g) = Σ_{n=1}^∞ (1/2ⁿ) · sup_{t∈[0,n]} |f(t) − g(t)| / (1 + sup_{t∈[0,n]} |f(t) − g(t)|).

(C[0, ∞), ρ) is a Polish space.

1.2.11 Theorem (Arzela-Ascoli). A set A ⊆ C[0, ∞) is relatively compact if and only if the following two conditions hold.
(i) sup_{ω∈A} |ω(0)| < ∞; and
(ii) lim_{δ↘0} sup_{ω∈A} m_T(ω, δ) = 0 for all T < ∞, where

m_T(ω, δ) = sup_{0≤s≤t≤T, |t−s|≤δ} |ω(t) − ω(s)|

is the modulus of continuity. (A is then said to be equicontinuous.)

1.2.12 Theorem.
A sequence {P_n} of probability measures on C[0, ∞) is tight if and only if
(i) lim_{λ↗∞} sup_{n≥1} P_n{ω | |ω(0)| ≥ λ} = 0; and
(ii) lim_{δ↘0} sup_{n≥1} P_n{ω | m_T(ω, δ) > ε} = 0 for all ε > 0 and all T < ∞.

PROOF: Suppose that {P_n} is tight. Then given any η > 0 there is some compact set K_η ⊆ C[0, ∞) such that P_n(K_η) > 1 − η for all n. By the Arzela-Ascoli theorem, given T > 0 and ε > 0 there are λ < ∞ and δ_0 > 0 such that

K_η ⊆ K := {ω | |ω(0)| ≤ λ, m_T(ω, δ) ≤ ε for δ ∈ (0, δ_0)}.

Then P_n(K) > 1 − η for all n, and both conditions follow.

Now suppose that {P_n} satisfies the conditions. Then given T > 0 and η > 0, choose λ > 0 such that

sup_{n≥1} P_n{ω | |ω(0)| > λ} ≤ η/2^{T+1}.

For each k = 1, 2, . . . choose δ_k > 0 such that

sup_{n≥1} P_n{ω | m_T(ω, δ_k) > 1/k} ≤ η/2^{T+k+1}.

Define

A_T := {ω | |ω(0)| ≤ λ, m_T(ω, δ_k) ≤ 1/k for all k = 1, 2, . . . }

and let A := ∩_{T=1}^∞ A_T. Then

P_n(A_T) ≥ 1 − Σ_{k=0}^∞ η/2^{T+k+1} = 1 − η/2^T,

so P_n(A) ≥ 1 − η for all n ≥ 1. By the Arzela-Ascoli theorem A is relatively compact, so {P_n} is tight.

It turns out that the topology on M_1(S) induced by weak convergence is metrizable, and M_1(S) with this so-called Prohorov metric is complete and separable. Another important fact that we will use often is that probability measures on (C[0, ∞), B(C[0, ∞))) are uniquely characterized by their finite dimensional distributions, i.e. their values on cylinders {ω ∈ C[0, ∞) | (ω_{t_1}, . . . , ω_{t_n}) ∈ A}, for A ∈ B(Rⁿ), for n ∈ N.

Donsker’s invariance principle

Let ξ_1, ξ_2, . . . be i.i.d. r.v.’s with mean zero and variance one. Let S_n = Σ_{i=1}^n ξ_i. Recall the central limit theorem, that S_n/√n →(w) N(0, 1). Following KS, let

Y_t := S_{⌊t⌋} + (t − ⌊t⌋) ξ_{⌊t⌋+1}

be the continuous time process that is the linear interpolation between the partial sums. For each n ≥ 1, scale Y in space by a factor of √n and in time by a factor of n (the choice of these scaling factors will become clear in a moment) to get a process {X_t^{(n)} | t ∈ [0, 1]},

X_t^{(n)} = (1/√n) Y_{nt} = (1/√n) (S_{⌊nt⌋} + {nt} ξ_{⌊nt⌋+1}),

where {nt} denotes the fractional part of nt. Notice that for s = k/n and t = (k + 1)/n we have X_t^{(n)} − X_s^{(n)} = ξ_{k+1}/√n, which is independent of σ(X_u^{(n)} | u ≤ s) = σ(ξ_1, . . . , ξ_k), with mean zero and variance 1/n. At t = 1 we have X_1^{(n)} = S_n/√n, which converges weakly to N(0, 1). For t = 1/2 we have (approximately)

X_{1/2}^{(n)} = (1/√n) S_{⌊n/2⌋} = (1/√2) · S_{⌊n/2⌋}/√(n/2),

which is N(0, 1/2) in the limit, by the CLT. These computations lead us to believe that the X^{(n)} “converge” to Brownian motion. This is made precise below.

1.2.13 Theorem. {X^{(n)}} converges weakly to Brownian motion.

First we show that {X^{(n)}} is tight. From Exercise 4.11 of KS, if {X^{(n)}} is a sequence of continuous stochastic processes with X_0^{(n)} = 0 and if

sup_{n≥1} E[|X_t^{(n)} − X_s^{(n)}|^α] ≤ C_T |t − s|^{1+β}

for all T > 0 and 0 ≤ s, t ≤ T, for some constants α, β, and C_T, then the measures P_n = P(X^{(n)})⁻¹ form a tight sequence. To show that {X^{(n)}} is tight we will show that it satisfies the conditions of this problem.

The next step is to apply Prohorov’s theorem to see that {X^{(n)}} is relatively compact. Thus there is a weakly convergent subsequence {X^{(n_k)}}. Let X be such that X^{(n_k)} →(w) X and apply the continuous mapping theorem and central limit theorem to conclude that X has the required fi.di. distributions.

1.3  Sample path properties

1.3.1 Proposition. If B is a Brownian motion then the following processes X are also Brownian motions with respect to their natural filtrations.
(i) X_t := (1/√c) B_{ct} for c > 0 (scaling property);
(ii) X_t := B_{t+c} − B_c for c ≥ 0 (simple Markov property);
(iii) X_t := B_T − B_{T−t} for t ∈ [0, T] (time reversal property);
(iv) X_t := t B_{1/t} for t > 0 and X_0 := 0 (time inversion property);
(v) X_t := U B_t for an orthogonal matrix U (where B is d-dimensional Brownian motion).
In particular we have the reflection property, that −B is a Brownian motion.

1.3.2 Lemma. P[sup_{t≥0} B_t = ∞ and inf_{t≥0} B_t = −∞] = 1.

PROOF: Let Z = sup_{t≥0} B_t.
For any c > 0 we have

cZ = sup_{t≥0} cB_t =(d) sup_{t≥0} B_{c²t} = Z.

Therefore the law of Z is concentrated on {0, ∞}. Let p = P(Z = 0). Then

p ≤ P[B_1 ≤ 0 and sup_{t≥0} (B_{1+t} − B_1) < ∞]
 = P[B_1 ≤ 0] P[sup_{t≥0} (B_{1+t} − B_1) < ∞]   (B_1 ⊥ (B_{1+t} − B_1, t ≥ 0))
 = P(B_1 ≤ 0) P(Z = 0)   ((B_{1+t} − B_1, t ≥ 0) is a BM)
 = p/2,

so p = 0 and P(Z = ∞) = 1.

Remark. A direct consequence of this is that a.s. for all a ∈ R, {t | B_t = a} is not bounded above.

1.3.3 Lemma. Brownian motion is a.s. not differentiable at zero.

PROOF: The last lemma and time inversion together imply that

P[∀ ε > 0, ∃ s, t ≤ ε s.t. B_s < 0 < B_t] = 1.

Indeed, if this were not the case then there would be a set A with P[A] > 0 and the property that for all ω ∈ A there is ε = ε(ω) such that either B(u) > 0 for all u ∈ (0, ε] or B(u) < 0 for all u ∈ (0, ε]. By the time inversion property this implies that for all ω ∈ A, B̃_u = u B_{1/u} satisfies B̃_u > 0 for all u ∈ [1/ε, ∞) or B̃_u < 0 for all u ∈ [1/ε, ∞), which contradicts the previous lemma.

Therefore the only possible (right) derivative of Brownian motion at zero is 0. If this were the case on a set A of positive probability, then for ω ∈ A, |B_t(ω)| ≤ t for all 0 ≤ t ≤ T(ω). Once again, using time inversion, B̃_t := t B_{1/t} is a Brownian motion. On A, for all 0 < t ≤ T(ω), |B̃_{1/t}| = |(1/t) B_t| ≤ 1, so B̃ is bounded on [1/T(ω), ∞), which is impossible.

1.3.4 Lemma. Brownian paths are monotone on no interval, a.s.

PROOF: We must show that the set ∪_{s,t∈Q} {ω | B(ω) is monotone on [s, t]} has probability zero. By the symmetry properties of Brownian motion it suffices to show that A := {ω | B(ω) is non-decreasing on [0, 1]} has probability zero. Let A_n := ∩_{i=0}^{n−1} {B_{(i+1)/n} − B_{i/n} ≥ 0} and notice that A = ∩_{n=1}^∞ A_n since B has continuous paths. (It follows in particular that A is measurable.) Since B has independent, normally distributed increments, P[A_n] = (1/2)ⁿ, so by continuity of measure, P[A] = lim_{n→∞} P[A_n] = 0.
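The invariance properties in Proposition 1.3.1 lend themselves to quick statistical sanity checks. A sketch (not part of the notes; the grid size, c, and t are arbitrary choices) verifying that X_t = B_{ct}/√c has Var(X_t) = t, as the scaling property demands:

```python
import math
import random

def bm_at(t, rng, steps=100):
    """Sample B_t by summing independent Gaussian increments over a grid
    of `steps` equal subintervals of [0, t]."""
    dt = t / steps
    return sum(rng.gauss(0.0, math.sqrt(dt)) for _ in range(steps))

def scaled_variance(c, t, trials=10000, seed=4):
    """Sample second moment of X_t = B_{ct} / sqrt(c); by the scaling
    property (Proposition 1.3.1(i)) X is again a Brownian motion, so the
    result should be close to Var(B_t) = t, for any c > 0."""
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(trials):
        x = bm_at(c * t, rng) / math.sqrt(c)
        acc += x * x
    return acc / trials
```

For instance, `scaled_variance(4.0, 0.5)` should be close to 0.5, and the same holds for any other value of c.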
Given a partition Π = {0 = t_0 < t_1 < · · · < t_{k_Π} = 1} of [0, 1], and a real-valued function f : [0, 1] → R, let

V^{(p)}(Π)(f) = Σ_{i=1}^{k_Π} |f(t_i) − f(t_{i−1})|^p.

The (classical) p-variation of f is defined to be

Ṽ^{(p)}(f) = sup_Π V^{(p)}(Π)(f).

If f is continuous and p = 1 then Ṽ^{(p)}(f) = lim_{‖Π_n‖→0} V^{(p)}(Π_n)(f), where Π_n is any sequence of partitions such that ‖Π_n‖ → 0 (i.e. the mesh size goes to zero). When p ≠ 1, these quantities need not be the same. Regardless, the p-variation of a stochastic process X is defined to be

V^{(p)}(X) = lim_{‖Π_n‖→0} V^{(p)}(Π_n)(X),

where the limit is taken in probability.

1.3.5 Lemma. The quadratic (p = 2) variation of Brownian motion on the interval [0, t] is the deterministic value t, and Brownian motion does not have finite variation (p = 1) on any interval.

PROOF: Fix t and fix a partition Π = {0 = t_0 < t_1 < · · · < t_n = t} of [0, t]. Then

E(V^{(2)}(Π)(B) − t)² = E[(Σ_{j=1}^n ∆_j)²],

where ∆_j = (B_{t_j} − B_{t_{j−1}})² − (t_j − t_{j−1}). It can be shown that E[∆_j ∆_k] = 0 for j ≠ k and E[∆_j²] = 2(t_j − t_{j−1})². Whence

E(V^{(2)}(Π)(B) − t)² = 2 Σ_{j=1}^n (t_j − t_{j−1})² ≤ 2‖Π‖t,

so V^{(2)}(Π)(B) converges to t in L² (and hence in probability). It follows that Brownian motion cannot have finite first variation, because it has continuous paths.

1.3.6 Sample path properties. The following properties are true of a.e. sample path of Brownian motion.
(i) Unboundedness.
(ii) Of unbounded first variation. (This is a consequence of the fact that V^{(2)}(B(ω)) ≠ 0 a.s.)
(iii) Non-differentiable at zero (or anywhere).
(iv) Monotone on no interval. (This is a consequence of (ii).)
(v) Nowhere differentiable, i.e. {ω | ∀ t ∈ [0, ∞) either D⁺W_t(ω) = ∞ or D⁻W_t(ω) = −∞} contains an event F with P(F) = 1, where

D⁺f(t) = lim sup_{h→0} (1/h)(f(t + h) − f(t)) and D⁻f(t) = lim inf_{h→0} (1/h)(f(t + h) − f(t)).

(vi) Law of the Iterated Logarithm:

lim sup_{t→∞} W_t(ω)/√(2t log log t) = 1 and lim inf_{t→∞} W_t(ω)/√(2t log log t) = −1.

(vii) Exact modulus of continuity (Lévy), see 2.9.F.

1.4  Distributional properties

1.4.1 Definition. A stochastic process X is a Gaussian process if for every 0 ≤ t_1 < t_2 < · · · < t_n < ∞ the Rⁿ-valued random vector (X_{t_1}, . . . , X_{t_n}) has a (multi-variate) Gaussian distribution. The covariance function of a Gaussian process is defined to be

ρ(s, t) := E[(X_s − E[X_s])(X_t − E[X_t])].

Brownian motion is a Gaussian process with mean zero and covariance function ρ(s, t) = s ∧ t.

1.4.2 Lemma. The CDF F_{t_1,...,t_n}(x_1, . . . , x_n) of (B_{t_1}, . . . , B_{t_n}) is

∫_{−∞}^{x_1} · · · ∫_{−∞}^{x_n} p(t_1; 0, y_1) p(t_2 − t_1; y_1, y_2) · · · p(t_n − t_{n−1}; y_{n−1}, y_n) dy_n · · · dy_1.

1.5  Markov property

Informally, X is a Markov process if there is a family of Borel measurable functions {f_{s,t}} such that P(X_t ∈ A | F_s^X) = f_{s,t}(X_s, A).

1.5.1 Definition. Let (Ω, F) be a measurable space. A kernel on Ω is a map N : Ω × F → [0, 1] such that
(i) A ↦ N(x, A) is a measure on (Ω, F) for all x ∈ Ω; and
(ii) x ↦ N(x, A) is F-measurable for every A ∈ F.
A kernel N is called a transition probability or stochastic kernel if N(x, Ω) = 1 for all x ∈ Ω.

Notation. If f is a non-negative F-measurable function and N is a kernel, then we define the function

N f(x) := ∫_Ω N(x, dy) f(y) = E_{N(x,·)}[f].

Likewise, if M and N are two kernels then

M N(x, A) = ∫_Ω M(x, dy) N(y, A).

Suppose there is a process X for which, for any s < t, there is a transition probability P_{s,t} such that a.s. P(X_t ∈ A | F_s^X) = P_{s,t}(X_s, A). Then for any positive F-measurable function f, using standard approximation arguments,

E[f(X_t) | F_s^X] = P_{s,t} f(X_s).

So if s < t < v then

P(X_v ∈ A | F_s^X) = E[P(X_v ∈ A | F_t^X) | F_s^X] = ∫ P_{s,t}(X_s, dy) P_{t,v}(y, A).

This should equal P_{s,v}(X_s, A).

1.5.2 Definition.
A transition function on (Ω, F) is a family {P_{s,t} | 0 ≤ s < t} of transition probabilities on (Ω, F) such that for all s < t < v,

∫_Ω P_{s,t}(x, dy) P_{t,v}(y, A) = P_{s,v}(x, A).

These are the Chapman-Kolmogorov equations.

1.5.3 Definition. Let (Ω, F, {F_t, t ≥ 0}, P) be a filtered probability space. An adapted process X is a Markov process with respect to another filtration {G_t} (containing {F_t^X}) with transition functions P_{s,t} if for all non-negative F-measurable functions f and 0 ≤ s ≤ t,

E[f(X_t) | G_s] = P_{s,t} f(X_s).

Remark. (i) Given a transition function, one can always construct a Markov process with that transition function using Kolmogorov’s extension theorem.
(ii) The transition function is said to be homogeneous if for all s < t, P_{s,t} depends on s and t only through t − s. In this case the C-K equation takes the form P_{t+s}(x, A) = ∫ P_s(x, dy) P_t(y, A).

1.5.4 Theorem. Brownian motion is a Markov process with respect to its natural filtration. (With what transition function?)

Intuition: B_t = (B_t − B_s) + B_s and B_t − B_s is N(0, t − s), so given B_s = y, B_t is distributed as B_{t−s} under P^y.

1.5.5 Lemma. Suppose that X and Y are d-dimensional random vectors on (Ω, F, P), G is a sub-σ-algebra of F, X is independent of G, and Y is G-measurable. Then for every Γ ∈ B(R^d),

P[X + Y ∈ Γ | G] = P[X + Y ∈ Γ | Y]  P-a.s.

and

P[X + Y ∈ Γ | Y = y] = P[X + y ∈ Γ]  for P Y⁻¹-a.e. y.

PROOF: We will show that for D ∈ B(R^d × R^d),

P[(X, Y) ∈ D | G] = P[(X, Y) ∈ D | Y].

First look at D = D_1 × D_2 for D_1, D_2 ∈ B(R^d). The left hand side is

P[X ∈ D_1, Y ∈ D_2 | G] = 1_{{Y∈D_2}} P[X ∈ D_1 | G] = 1_{{Y∈D_2}} P[X ∈ D_1],

and the right hand side is equal to the same thing by the same logic. Since the measurable rectangles form a π-system and generate B(R^d × R^d), we are done.

Let Ω⁰ = (C([0, ∞); R))^d, F⁰ = B(Ω⁰), and P⁰ = P^{(1)} × · · · × P^{(d)}, where each P^{(i)} is Wiener measure.
Let X be the canonical process on (Ω⁰, F⁰, P⁰), so X is a d-dimensional Brownian motion started at zero. Let µ be an arbitrary initial distribution on (R^d, B(R^d)). Consider the random variable on (R^d × Ω⁰, B(R^d) ⊗ F⁰, µ ⊗ P⁰) defined by X(x, ω_1, . . . , ω_d) = x + (ω_1, . . . , ω_d); this is a Brownian motion with initial distribution µ.

Another way to think about this is to think of P^µ as the image measure. How can we explicitly write P^µ in terms of P⁰ and µ? Naturally, take P^x(F) = P⁰(F − x), and write P^µ(F) = ∫ P^x(F) µ(dx). This integral is well-defined if for every F ∈ F⁰ the map x ↦ P^x(F) is “universally measurable.” The following fact is true: for every F ∈ F_∞^B the map x ↦ P^x(F) is B(R^d)-measurable. (Universal measurability is introduced so that we get this nice property for slightly larger filtrations, such as the augmented natural filtration.) Define

U(R^d) := ∩_{µ a probability measure} B(R^d)^µ ⊇ B(R^d),

where B(R^d)^µ denotes the completion of B(R^d) with respect to µ. A mapping R^d → R is said to be universally measurable if it is U(R^d)/B(R^d)-measurable.

Note that if F is a set of the form {ω ∈ Ω⁰ | ω(0) ∈ Γ_0, ω(t_1) ∈ Γ_1} then P^x(F) = 1_{Γ_0}(x) ∫_{Γ_1} p_d(t_1; x, y_1) dy_1, where as always

p_d(t; x, y) = (2πt)^{−d/2} exp(−‖x − y‖²/(2t)).

As a consequence, Brownian motion is a homogeneous Markov process with transition function P_t(x, A) = ∫_A p_d(t; x, y) dy. Here we call p_d(t; x, y) the transition density of Brownian motion. We have seen E[f(B_{t+s}) | B_s] = P_t f(B_s). The infinitesimal generator of a homogeneous Markov process is G := lim_{s↘0} (1/s)(P_s − I).

1.5.6 Definition.
A Markov family is an adapted process {X_t, F_t | t ≥ 0} on some (Ω, F) together with a family of probability measures {P^x | x ∈ R^d} on (Ω, F) such that
(i) x ↦ P^x(F) is universally measurable for all F ∈ F;
(ii) P^x[X_0 = x] = 1 for all x ∈ R^d;
(iii) P^x[X_{s+·} ∈ F | F_s] = P^x[X_{s+·} ∈ F | X_s] for all x ∈ R^d, for all F ∈ B(R^d)^{[0,∞)};
(iv) P^x[X_{s+t} ∈ Γ | X_s = y] = P^y[X_t ∈ Γ] for all x ∈ R^d, for P^x X_s⁻¹-a.e. y.

1.5.7 Definition. A process X is F-progressively measurable if the restricted map X : [0, t] × Ω → R^d is (B[0, t] ⊗ F_t)/B(R^d)-measurable for all t < ∞.

1.5.8 Definition. The σ-algebra generated by a random time T is the σ-algebra generated by the sets {X_T ∈ A}, A ∈ B(R^d), together with {T = ∞}.

1.5.9 Definition. A random time T is an F-stopping time (resp. F-optional time) if {T ≤ t} ∈ F_t (resp. {T < t} ∈ F_t) for all t. The σ-algebra generated by a stopping time T is

F_T := {A ∈ F | A ∩ {T ≤ t} ∈ F_t for all t ≥ 0}.

1.5.10 Exercise. Suppose that X is an adapted process with right continuous paths and A ∈ B(R^d). The hitting time is H_A = inf{t ≥ 0 | X_t ∈ A}.
(i) If A is open, show that H_A is an optional time.
(ii) If A is closed and X is continuous, show that H_A is a stopping time.

1.5.11 Definition. Let {F̄_t}_{t≥0} be the completion of {F_t^B}_{t≥0}, and for each t ≥ 0 let F_t = ∩_{s>t} F̄_s. Then {F_t}_{t≥0} is a right continuous filtration, the Brownian filtration.

1.5.12 Theorem. Let {B_t, F_t}_{t≥0} be a Brownian motion and let T be a finite valued stopping time. Then the process defined by B_t^{(T)} = B_{T+t} − B_T for t ≥ 0 is a Brownian motion independent of F_T.

PROOF: A simple stopping time is a stopping time whose image is countable. We claim that given any finite stopping time T there is a non-increasing sequence of simple stopping times T_1 ≥ T_2 ≥ · · · such that lim_{n→∞} T_n = T point-wise. In addition, F_T = ∩_n F_{T_n}. (The reason we take the approximation from above is for this latter property.)
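As a concrete sketch of this dyadic approximation from above (not part of the notes; the particular time value is an arbitrary choice), the map t ↦ (⌊2ⁿt⌋ + 1)/2ⁿ, applied pathwise to T(ω), produces the simple stopping times T_n:

```python
import math

def dyadic_above(t, n):
    """T_n = (floor(2^n t) + 1) / 2^n: the n-th dyadic approximation of a
    time t from above. Applied pathwise to a stopping time T, this gives
    a simple stopping time T_n >= T with countable range."""
    return (math.floor(2 ** n * t) + 1) / 2 ** n

# The sequence T_n decreases to t, with 0 < T_n - t <= 2^{-n}:
t = 0.7371
approx = [dyadic_above(t, n) for n in range(1, 13)]
gaps = [a - t for a in approx]
```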
Indeed, define

T_n = Σ_{k=0}^∞ ((k + 1)/2ⁿ) 1_{[k2^{−n}, (k+1)2^{−n})}(T).

Then clearly T_n ≥ T_{n+1} for all n, and T_n converges to T since 0 ≤ T_n − T ≤ 2^{−n}. Further, F_T ⊆ F_{T_n} for all n as a consequence of the general fact that S ≤ T implies F_S ⊆ F_T. (Indeed, if A ∈ F_S then A ∩ {T ≤ t} = A ∩ {S ≤ t} ∩ {T ≤ t} ∈ F_t, so A ∈ F_T.)

If A ∈ ∩_n F_{T_n} then A ∩ {T_n ≤ t} ∈ F_t for all n ≥ 1 and all t ≥ 0. Therefore

A ∩ {T ≤ t} = ∩_{ε>0} ∪_{m≥1} ∩_{n≥m} (A ∩ {T_n ≤ t + ε}) ∈ ∩_{ε>0} F_{t+ε} = F_t

by the right continuity of {F_t}.

Back to the proof of the theorem. If T is a simple stopping time then let {τ_1, τ_2, . . . } be the range of T. For any A ∈ F_T and for all C_1, . . . , C_m ∈ B(R),

P(A ∩ (∩_{i≤m} {B_{T+t_i} − B_T ∈ C_i}))
 = Σ_{k=0}^∞ P(A ∩ (∩_{i≤m} {B_{τ_k+t_i} − B_{τ_k} ∈ C_i, T = τ_k}))
 = Σ_{k=0}^∞ P(∩_{i≤m} {B_{τ_k+t_i} − B_{τ_k} ∈ C_i}) P({T = τ_k} ∩ A)
 = P(∩_{i≤m} {B_{t_i} ∈ C_i}) P(A).

Now set A = Ω to deduce that t ↦ B_{T+t} − B_T is a BM. Since A ∈ F_T was arbitrary, we also get independence.

For a general stopping time T, consider the approximating sequence of simple stopping times T_n ↘ T defined above. For any A ∈ F_T and for all open C_1, . . . , C_m we have

P(A ∩ (∩_{i≤m} {B_{T+t_i} − B_T ∈ C_i})) = lim_{n→∞} P(A ∩ (∩_{i≤m} {B_{T_n+t_i} − B_{T_n} ∈ C_i}))
 = lim_{n→∞} P(∩_{i≤m} {B_{t_i} ∈ C_i}) P(A)
 = P(∩_{i≤m} {B_{t_i} ∈ C_i}) P(A),

since A ∈ F_{T_n} for all n.

1.5.13 Example. This theorem is not true for general random times. Take T to be the last time before 1 that B_t is zero.

1.5.14 Lemma. Under P⁰, |B| = {|B_t|, F_t} is a Markov process with transition density P⁰[|W_{t+s}| ∈ dy | |W_t| = x] = p⁺(s; x, y) dy, where p⁺(s; x, y) = p(s; x, y) + p(s; x, −y).

1.5.15 Lemma. Define Y_t = M_t − B_t, where M_t = max_{0≤u≤t} B_u is the running maximum. Under P⁰, the process {Y_t, F_t} is a Markov process and has transition density

P⁰[Y_{t+s} ∈ dy | Y_t = z] = (p(s; z, y) + p(s; z, −y)) dy = p⁺(s; z, y) dy.
PROOF: For s > 0, t ≥ 0, b ≥ a, b ≥ 0, P 0 [B t+s ≥ a, M t+s ≤ b | F t ] = P 0 [B t+s ≥ a, M t ≤ b, ( sup B t+u ) ≤ b | F t ] u∈[0,s] 0 = 1{M t ≤b} P [B t+s ≥ a, ( sup B t+u ) ≤ b | F t ] u∈[0,s] 20 Stochastic Calculus I = 1{M t ≤b} P 0 [B t+s ≥ a, ( sup B t+u ) ≤ b | B t ] u∈[0,s] since {B t } is a Markov process under P 0 . This calculation shows that (M t , B t ) is a Markov process under P 0 . Since Yt is a function of M t and B t , it follows that for every Γ ∈ B(R), P 0 [Yt+s ∈ Γ | F t ] = P 0 [Yt+s ∈ Γ | B t , M t ]. It suffices to show that P 0 [Yt+s ∈ d y | B t = x, M t = m] = p+ (s; m − x, y)d y. For b > m > x, b ≥ a, m ≥ 0, P 0 [B t+s ∈ d a, M t+s ∈ d b | B t = x, M t = m] = P 0 [B t+s ∈ da, max B t+u ∈ d b | B t = x, M t = m] 0≤u≤s = P x [Bs ∈ da, Ms ∈ d b] = P 0 [Bs ∈ da − x, Ms ∈ d b − x] (2b−a−x)2 2 =p (2b − a − x)e− 2s da d b 2πs3 For m > x, m ≥ a, m ≥ 0, P 0 [B t+s ∈ d a, M t+s = m | B t = x, M t = m] = P 0 [B t+s ∈ da, sup B t+u ≤ m | B t = x, M t = m] u∈[0,s] = P [Bs ∈ da, Ms ≤ m] x = P x [Bs ∈ da] − P x [Bs ∈ da, Ms ≥ m] (2m−a−x)2 1 − (a−x)2 =p e 2s − e− 2s da 2πs Therefore, since either the maximum increases to a new level b > m over the interval [t, t + s] or it stays the same, we have P 0 [Yt+s ∈ d y | B t = x, M t = m] Z∞ P 0 [B t+s ∈ b − d y, M t+s ∈ d b | B t = x, M t = m] = m + P 0 [B t+s ∈ m − d y, M t+s = m | B t = x, M t = m] = Z ∞ m 2 p 2πs3 (b + y − x)e− (b+ y−x)2 2s d y db (m+ y−x)2 1 − (m− y−x)2 2s e − e− 2s dy +p 2πs = p+ (s; m − x, y)d y Markov property 21 We have the following Markov processes: (i) Brownian motion B t (ii) Poisson process Nt (iii) Reflected Brownian motion |B t | (iv) (M t , B t ) (v) Yt = M t − B t What about Tb = inf{t ≥ 0 | B t = b} = inf{t ≥ 0 | B t ≥ b} = inf{t ≥ 0 | M t ≥ b}? 1.5.16 Theorem. {Tb , 0 < b < ∞} is a non-decreasing left-continuous (strong Markov) process that has stationary independent increments and is purely discontinuous (i.e. there is no interval on which b 7→ Tb is right-continuous). 
PROOF: Notice that {Tb ≤ t} = {M t ≥ b} = \ {M t ≥ b − 1n } = n∈N \ {Tb− 1 ≤ t}. n∈N n which implies left-continuity. That it is non-decreasing is obvious. The time shift operator, defined by θs (ω)(t) := ω(s + t) for s, t ≥ 0. The operator can also be defined for random times. We have for 0 < a < b, Tb = Ta + Tb ◦ θ Ta a.s. For all F Tb -measurable functions f , E[ f (Tb − Ta ) | F Ta ] = E[ f (Tb ◦ θ Ta ) | F Ta ] = Ea [ f (Tb )] = E[ f (Tb−a )] using the continuity of Brownian motion and the fact that Brownian motion has stationary, independent increments. This implies that Tb − Ta is independent of Fa and has the same distribution as Tb−a . For the last part it is enough to show that for p, q ∈ Q, P[ω | b 7→ Tb (ω) is cts on [p, q]] = 0. However, b 7→ Tb is continuous on this interval if and only if M t is strictly increasing on [Tp , Tq ], and for this to happen we would require B Tp +t − B Tp to be strictly increasing on that interval. But this last process is a Brownian motion by the strong Markov property, and so is not strictly increasing anywhere. p 1.5.17 Lemma. E0 [exp(−uTb )] = exp(−b 2u). 1.5.18 Proposition. Almost surely, the set Z = {t ∈ [0, ∞) | B t = 0} has no isolated points. PROOF: Earlier we showed that zero is not an isolated point, indeed p E0 [exp(−u(t + T0 ◦ θ t ))] = e−ut E0 [EB t [exp(−uT0 )]] = e−ut E0 [exp(−|B t | 2u)]] 22 Stochastic Calculus I and as t → 0, the right hand side goes to one. Therefore by Fatou’s Lemma, P[lim inf(t + T0 ◦ θ t ) = 0] ≥ lim inf P[(t + T0 ◦ θ t ) = 0] = 1, t&0 t&0 so zero is a limit point of Z a.s. For any rational q, the define the time dq to be q + Tq ◦ θq , the first point in Z after q. However, Bdq = 0, so {Bdq +t | t ≥ 0} is a standard BM by the strong Markov property. Therefore the set [ {dq is not a limit point of Z} q∈Q has P-measure zero. If h ∈ Z and h = dq then h is a limit point of Z. If not, choose a sequence {qn } ⊆ Q such that qn % h. 
Then dqn ∈ [qn , h], so dqn → h and h is a limit point of Z. 2 2.1 Martingales Martingale convergence theorem The definitions of sub- and super-martingales are analogous in discrete- and continuoustime. 2.1.1 Definition. A stochastic process X is integrable if E |X a | < ∞ for all a ∈ I. An adapted stochastic process {X a , Fa | a ∈ I} is a (i) sub-martingale if E[X b | Fa ] ≥ X a for all a < b; (ii) super-martingale if E[X b | Fa ] ≤ X a for all a < b; (iii) martingale if it is both a sub- and super-martingale. 2.1.2 Doob’s Upcrossing Lemma. Let X be a super-martingale and let UN [a, b] be the number of up-crossings of [a, b] by time N . Then (b − a) E UN [a, b] ≤ E(X N − a)− 2.1.3 Definition. A predictable process is a process {Cn , Fn } such that Cn is Fn−1 measurable for all n. PROOF: Let {X n , Fn } be a super-martingale, C1 = 1{X 0 <a} and Cn = 1{Cn−1 =1} 1{X n−1 ≤b} + 1{Cn−1 =0} 1{X n−1 <a} Pn for n > 1, and Yn := i=1 Ci (X i − X i−1 ). Then {Cn , Fn } is predicable and {Yn , Fn } is a super-martingale. We have fundamental inequality YN ≥ (b − a)UN [a, b] − (X N − a)− . Therefore we may conclude that 0 ≥ E[YN ] ≥ (b − a) E UN [a, b] − E(X N − a)− . Martingale convergence theorem 23 Now for the continuous-time analog. For a < b and F ⊆ [0, ∞) finite, let (i) τ1 = min{t ∈ F | X t ≤ a} (ii) σ j = min{t ∈ F | t ≥ τ j , X t > b} (iii) τ j+1 = min{t ∈ F | t ≥ σ j , X t < a} Given an interval I ⊆ [0, ∞) let U I (a, b; X ) = sup{U F [a, b] | F ⊆ I finite}. 2.1.4 Theorem. Let {X t , F t } be a right-continuous sub-martingale, a < b, and λ > 0. (i) λ P[sup t∈[σ,τ] X t ≥ λ] ≤ E X τ+ . (ii) λ P[inf t∈[σ,τ] X t ≤ −λ] ≤ E X τ+ − E X σ . (iii) Up-crossings: (b − a) E U[σ,τ] (a, b; X ) ≤ |a| + E X τ+ . p (iv) If X is non-negative then E[(sup t∈[σ,τ] X t ) p ] ≤ ( p−1 ) p E X τp . PROOF: Exercise. (Approximation arguments.) 2.1.5 Doob’s Forward Convergence Theorem. Let {X n , Fn } be a super-martingale bounded in L 1 (i.e. supn E |X n | < ∞). 
Then X ∞ = limn→∞ X n exists and is finite a.s., and it is F∞ -measurable. PROOF: Let Λ = {X n does not converge to a limit in [−∞, ∞]} = {lim inf X n 6= lim sup X n } n n [ = {lim inf X n < a < b < lim sup X n } a<b∈Q =: [ n n Λa,b . a<b∈Q But Λa,b ⊆ {limN →∞ UN [a, b] = ∞}. The probability of this set is zero, so P(Λ) = 0. By Fatou’s Lemma E |X ∞ | = E[lim inf |X n |] ≤ lim inf E |X n | < ∞. n n 2.1.6 Doob’s Forward Convergence Theorem. Let {X t , F t } be a cadlag supermartingale bounded in L 1 (i.e. sup t E |X t | < ∞). Then X ∞ = lim t→∞ X t exists and is finite a.s., and it is F∞ -measurable. 2.1.7 Discrete Optional Sampling Theorem. Let M be a uniformly integrable martingale and T be a stopping time. Then E[M∞ | F T ] = M T a.s. 2.1.8 Corollary. E |M T | < ∞ and E M T = E M0 . 24 Stochastic Calculus I 2.1.9 Corollary. If S is another stopping time and S ≤ T then E[M T | FS ] = MS . 2.1.10 Continuous Optional Sampling Theorem. Let {X t , F t } be a right-continuous sub-martingale with a last element and let S and T be F t -optional times. Then E[X T | FS+ ] ≥ MS a.s. If S is a stopping time then we may replace FS+ by FS . PROOF: Define ¨ Sn = ∞ k2−n S=∞ (k − 1)2−n ≤ S < k2−n and Tn similarly. Then Sn and Tn are stopping times and Sn & W and Tn & T , and by the discrete optional sampling theorem, E[X Tn | FSn ] ≥ X Sn . For all A ∈ FSn , Z Z X Tn d P ≥ X Sn d P . A A T∞ Therefore it also holds for all A ∈ n=1 FSn = FS+ . Also, since S ≤ Sn , FS ⊆ FSn . Observe that {X Sn , FSn } are backward sub-martingales and E X Sn is decreasing and bounded below by E X 0 . Therefore {X Sn } are u.i., and likewise for {X Tn }. Since the process is right continuous X S = lim X Sn n→∞ and X T = lim X Tn , n→∞ so we can take limits in the equation above and interchange the limits to get E[X T | FS+ ] ≥ E X S . 2.2 Continuous Martingales For this section let {X t , F t } be a process with right continuous paths. Remark. 
(i) We automatically know that it has limits from the left a.s. since {∃ t ∈ [0, n] lim inf X s s%t < lim sup X s } ⊆ s%t [ {ω | U[0,n] (a, b, X (ω)) = ∞}. a<b∈Q (ii) Recall that {F t } is said to satisfy the usual conditions if F0 contains all the Pnegligible sets and {F t } is right continuous. If {X t , F t } is a sub-martingale and {F t } satisfies the usual conditions then t 7→ E X t is right continuous then X t has a right continuous modification such that {X t , F t } is a submartingale. (iii) Continuous martingale results that are derived from discrete martingale results (e.g. OST) rely on approximation arguments for which a key ingredient is the backward sub-martingale convergence theorem. This theorem (given below) allows one to justify limits of the following kind. Continuous Martingales 25 Let Tn & T , Sn & S, and {X n , Fn } is a right-continuous submartingale. If X Sn ≤ E[X Tn | FSn ] for all n then X S ≤ E[X T | FS+ ]. Namely, for all A ∈ FS+ ⊆ FSn we have Z lim n→∞ X Tn d P = Z u.i. A A lim X Tn d P = Z r t c ts n→∞ XT d P A using the fact that E X Tn ≥ E X 0 . 2.2.1 Theorem (BS-MCT). Let {Fn }∞ n=1 be a decreasing sequence of σ-algebra and suppose that {X n , Fn } is a backward sub-martingale (i.e. E |X n | < ∞ and E[X n | Fn+1 ] ≥ X n+1 a.s. for all n). If limn→∞ E X n > −∞ then {X n } is u.i. PROOF: Step 1: Show that {X n+ , Fn } is a backward sub-martingale. This is the case since X n+1 ≤ E[X n | Fn+1 ] implies, since x 7→ x + is non-decreasing, + X n+1 ≤ E[X n | Fn+1 ]+ ≤ E[X n+ | Fn+1 ] by the condition Jensen inequality. Step 2: limλ→∞ supn≥1 P[|X n | > λ] = 0 since λ P[|X n | > λ] ≤ E |X n | = − E X n + 2 E X n+ < ∞ by the assumed bound and the fact that E X n+ ≤ E X 1+ . + Step 3: X n+ is u.i. Indeed, since {X n+ , Fn } is a sub-martingale we have E[X n−1 | + Fn ] ≥ X n so Z X n+ d P= Z E[X 1+ | Fn ]d P ≤ {|X n |>λ} {|X n |>λ} {|X n |>λ} Z Step 4: X n− is u.i. 
Indeed, for λ > 0 and n > m we have Z 0≥ Xnd P X n <−λ = E Xn − Z Xnd P X n ≥−λ Z ≥ E Xn − Xmd P X n ≥−λ = E Xn − E Xm + Z Xmd P X n <−λ Given " > 0, there is m large enough so that 0 ≤ E Xm − E Xn ≤ " 2 X 1+ d P 26 Stochastic Calculus I for all n > m (since X is L 1 -bounded and E X n is monotonic). For that m choose λ > 0 such that Z " sup |X m |d P < 2 n>m X <−λ n R so supn>m X − >λ X n− d P < " and X n1 is u.i. n 2.2.2 Theorem (Convergence). If {X t , F t } is a sub-martingale and sup E X t+ < ∞ t≥0 then X t has a limit a.s. and in L 1 . 2.2.3 Corollary. If {X t , F t } t≥0 is a right continuous non-negative super-martingale then X ∞ = lim t→∞ X t exists and is in L 1 . 2.2.4 Optional Sampling Theorem. If {X t , F t } is a right-continuous sub-martingale with a last element (i.e. X ∞ = lim t→∞ X t exists a.s. and is in L 1 ) and S and T are {F t }-optional times then E[X T | FS+ ] ≥ MS a.s. If S is a stopping time then we may replace FS+ by FS . 2.3 Applications Note first that if {B t , F t } is a Brownian motion then it is a martingale. Indeed, E[B t − Bs + Bs | Fs ] = E[B t − Bs ] + Bs = Bs . 2.3.1 Lemma. Let τ = inf{t ≥ 0 | B t ∈ / (a, b)}, where a < 0 < b. Then −a (i) P(Bτ = b) = b−a (ii) E[τ] = −a b PROOF: τ is a stopping time, but we cannot naively apply the OST since Brownian motion does not have a last element. Instead we look at the stopped process {B t∧n }, which is a right continuous martingale with last element. Applying the OST we get 0 = E Bτ∧n = b P[Bτ = b, τ ≤ n] + a P[Bτ = a, τ ≤ n] + E[Bn ; τ > n]. Taking limits as n → ∞ we get 0 = b P[Bτ = b] + a P[Bτ = a] For the next part, we show that M t := (B t − a)(b − B t ) + t is a martingale. (This is not too hard.) Then again applying the OST to the stopped process −a b = E Mτ∧n = E[τ ∧ n] + E[(Bτ∧n − a)(b − Bτ∧n )] and taking limits we get the result. Applications 27 2.3.2 Example. Let X t = B t + c t (Brownian motion with drift). We are interested in H x = inf{t > 0 | X t = x}. 
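The exit law in Lemma 2.3.1 can be sanity-checked numerically. A simple symmetric random walk is an exact discrete analogue (gambler's ruin): started at 0 with absorbing barriers at integers a < 0 < b, it hits b first with probability −a/(b − a) and is absorbed after −ab steps on average. A minimal sketch, with illustrative parameters of my own choosing (not from the notes):

```python
import numpy as np

# Gambler's-ruin analogue of Lemma 2.3.1: exit of a symmetric walk from (a, b).
rng = np.random.default_rng(1)
a, b, n_paths = -3, 5, 20000          # theory: P(hit b) = 3/8, E[tau] = 15
pos = np.zeros(n_paths, dtype=int)    # current position of each walk
tau = np.zeros(n_paths, dtype=int)    # absorption time of each walk
active = np.ones(n_paths, dtype=bool)
for step in range(1, 5000):
    moves = 2 * rng.integers(0, 2, size=int(active.sum())) - 1  # +/-1 steps
    pos[active] += moves
    tau[active] = step
    active = active & (pos > a) & (pos < b)   # deactivate absorbed walks
    if not active.any():
        break
p_hit_b = float(np.mean(pos == b))    # Monte Carlo estimate of -a/(b - a)
mean_tau = float(tau.mean())          # Monte Carlo estimate of -a*b
```

With 20000 paths the estimates typically land within a percent or two of 3/8 and 15. Checking Brownian motion directly would require discretizing time, which introduces a boundary-overshoot bias that the integer walk avoids.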
We need the fact that exp(θ B t − 21 θ 2 t) is a martingale. Fix λ > 0. Then from the exponential martingale it follows that exp(θ X t − λt) = exp(θ B t − (λ − θ c)t) p is a martingale provided that λ − θ c = 12 θ 2 . Let β, α = −c ± c 2 + 2λ. Note that α < 0 < β. Thus for any λ > 0 and β as given the martingale exp(β X t − λt) is bounded on [0, H x ]. We can use the OST to conclude that E[exp(β X H x − λH x )] = eβ x E[e−λH x ] p from which it follows that E[e−λH x ] = exp(−x( c 2 + 2λ− c)). The Laplace transform can be inverted explicitly to give x (x − c t)2 . P(H x ∈ d t) = p exp − 2t 2πt 3 Take limits as λ & 0 in the Laplace transform to conclude that ¨ 1 if c ≥ 0 P[H x < ∞] = −2|c|x e if c < 0 Now we calculate E[e−λT ] where T = H a ∧ H b for a < 0 < b. Recall that the θ that we used previously was found as a root of λ − θ c = 21 θ 2 . We know that any process of the form M t = C1 eαX t −λt + C2 eβ X t −λt is a martingale for any constants C1 , C2 . Choose M t of the form M t = f (X t )e−λt such that f (a) = f (b), say M t = (eβ b − eβ a )eαX t −λt + (eαa − eαb )eβ X t −λt . With this choice, M t is bounded on [0, T ], and so the OST implies f (0) = E[M0 ] = E[M T ] = E[ f (a)e−λT ] so E[e−λT ] = eβ b − eβ a + eαa − eαb eβ b+αa − eβ a+αb p In the special case c = 0 and a = −b this reduces to E[e−λT ] = sech(b 2λ). 2.3.3 Law of the Iterated Logarithm. P lim sup Æ t&0 Bt 2t log log( 1t ) =1 28 Stochastic Calculus I PROOF: Write h(t) = Æ 2t log log( 1t ). The first step is to show that lim sup t&0 Bt h(t) ≤1 P-a.s. Apply Doob’s maximal inequality to the exponential martingale Z t = exp(αB t − α2 t) yielding for α > 0 2 P sup (B t − s∈[0,t] 1 αs) 2 >β =P sup Z t > e αβ ≤ e−αβ E[Z t ] = e−αβ . s∈[0,t] Now fix θ , δ ∈ (0, 1) and apply the inequality with t = θ n , α = θ −n (1 + δ)h(θ n ), and β = 12 h(θ n ). Then αβ = 1 2 n 2 (1 + δ)θ h (θ ) = (1 + δ) log log n n 1 θ and eαβ = log(n log( θ1 ))1+δ = O(n1+δ ). 
So 1 1 sup P(Bs − s(1 + δ)θ −n h(θ n )) ≥ h(θ n )) ≤ C n−(1+δ) . n 2 2 s∈[0,θ ] By the Borel-Cantelli Lemma there is Θ0θ ,δ ∈ F with P Ω0 = 1 such that for all ω ∈ Ω0 there is Nθ ,δ (ω) such that for all n ≥ Nθ ,δ (ω) 1 1 maxn (Bs − s(1 + δ)θ −n h(θ n )) < h(θ n ) x∈[0,θ ] 2 2 Thus for θ n+1 < t ≤ θ n B t ≤ sup Bs ≤ s∈[0,θ n ] 1 2 (2 + δ)θ −n h(θ n ) ≤ 1 2 1 (2 + δ)θ − 2 h(t) 1 1 where the last inequality uses the fact that h(θ n ) ≤ θ − 2 h(θ n+1 ) ≤ θ − 2 h(t), so lim sup t&0 Bt t 1 ≤ (1 + δ2 )θ − 2 . Letting δ & 0 and θ % 1 along countable sequences we complete the proof of the first step. For the second step, see KS p.112–113 Recall the generator of a Markov process is G = lim t&0 E[ f (X t ) | X 0 = x]. The Brownian transition function is p(t; x, y) = 1 (2πt) d 2 exp(− kx − yk2 2t ) Pt −I t where Pt f (x) = Lévy processes and satisfies 1 ∆f 2 ∂ pt ∂t 29 = 12 ∆p t (x). For Brownian motion, check that G f (t, x) = ∂f ∂t + . Define f Ct = f (t, B t ) − f (0, B t ) − Z t G f (s, Bs )ds. 0 f If {B t , F t } is a Brownian motion and f ∈ C 1,2 then C t is an {F t }-martingale. 2.4 Lévy processes 2.4.1 Definition. A Poisson process {Nt , F t } with intensity λ is a right continuous {F t }-adapted process with N0 = 0 and Nt − Ns ∼ Poisson(λ(t − s)), i.e. for i = 0, 1, . . . , λi (t − s)i P[Nt − Ns = i] = e−λ(t−s) i! The Poisson process has stationary independent increments and {Nt − λt} and α α {eαNt −λt(e −1) }, for any α ∈ R, are martingales. This is because E[eαNt ] = eλt(e −1) is the moment generating function for the Poisson distribution. 2.4.2 Definition. A Lévy process is a right-continuous process with stationary independent increments. 2.4.3 Examples. (i) Brownian motion (with drift) (ii) Poisson process (iii) Ta , the hitting time of Brownian motion to a level a. 2.4.4 Definition. A probability measure µ on R is said to be infinitely divisible if for all n there is a probability measure ν on R such that µ = ν ∗nP . 
Equivalently, if n Y ∼ µ then for every n there are i.i.d. r.v.’s Yi ∼ ν such that Y = i=1 Yi . If {X t , F t } is a Lévy process with X 0 = 0 then for all t, X t is infinitely divisible since it may be written as a sum of n i.i.d. increments, Xt = n X (X t i − X t i−1 ). i=1 n n Conversely, given any infinitely divisible r.v. Y there is a Lévy process {X t , F t } such (d) that Y = X 1 . Analytical methods can be used to show that if µ is infinitely divisible then its Fourier transform is equal to eΨ(θ ) , where BM Poisson z }| { Z z }| { z}|{ 1 iθ x 2 2 iθ x e − 1− ν(d x) Ψ(θ ) = iβθ − σ θ + 2 1 + x2 | {z } drift pure jump process 30 Stochastic Calculus I R x and ν is a Radon measure on R\{0} such that 1+x 2 ν(d x) < ∞. This immediately gives you a complex exponential martingale associated with a Lévy process. This is the Lévy-Khintchine formula 2.4.5 Definition. A r.v. Y is stable if for all n there are independent r.v.’s with the (d) same law as Y and constants an > 0 and bn such that Y1 + · · · + Yn = an Y + bn . 2.4.6 Lemma. Stable r.v.’s are infinitely divisible. 1 2.4.7 Exercise. It must be the case that an = n α for some α ∈ (0, 2]. When α = 2 we get the Gaussian distribution. When α ∈ (0, 2] then σ = 0 in the L-K formula and the Lévy measure has density (m1 1{x<0} + m2 1{x>0} )|x|−(1+α) for some m1 , m2 ≥ 0. 2.5 Doob-Meyer decomposition 2.5.1 Lemma. Any non-constant continuous martingale {M t } a.s. has infinite variation. PROOF: Let Vt be the variation of M on [0, t] and define Sn := inf{s ≥ 0 | Vs ≥ n} ∧ inf{s ≥ 0 | |Ms | ≥ n}. Then the stopped process M Sn is of bounded variation and is a martingale by the OST. Therefore it is enough to prove that M is constant whenever it and its variation are bounded. Assume further that M0 = 0 a.s. Fix t < ∞ and let Π = {0 = t 0 < t 1 < · · · < t k = t} be a subdivision of [0, t]. 
Then
E[M_t²] = E[Σ_{i=0}^{k−1} (M_{t_{i+1}}² − M_{t_i}²)] = E[Σ_{i=0}^{k−1} (M_{t_{i+1}} − M_{t_i})²],
where the second equality holds because the cross terms vanish by the martingale property. As a result
E[M_t²] ≤ E[V_t sup_i |M_{t_{i+1}} − M_{t_i}|],
and since M is of bounded variation and continuous, this quantity goes to zero (by bounded convergence) as the mesh of the partition goes to zero, so M ≡ 0.

Remark. Continuity is required for this proof. The compensated Poisson process is a right continuous martingale of bounded variation (it is seen to be of bounded variation since it is the difference of two increasing processes).

2.5.2 Definition. A random sequence {A_n} is said to be increasing if 0 = A_0 ≤ A_1 ≤ ··· P-a.e. and E[A_n] < ∞ for all n ≥ 0. A process {A_t} is said to be increasing if A_0 = 0, t ↦ A_t is non-decreasing P-a.e. and right continuous, and E[A_t] < ∞ for all t ≥ 0. Such a process is said to be integrable if E[A_∞] < ∞.

2.5.3 Definition. In discrete time, an increasing sequence is said to be natural if for every bounded martingale {M_n},
E[M_n A_n] = E[Σ_{k=1}^n M_{k−1} (A_k − A_{k−1})].
Let Y_n := Σ_{k=1}^n A_k (M_k − M_{k−1}), so that M_n A_n = Y_n + Σ_{k=1}^n M_{k−1}(A_k − A_{k−1}) by summation by parts. Then a sequence A is natural if and only if E[Y_n] = 0 for all n and every bounded martingale M, if and only if {A_n} is predictable.

2.5.4 Definition. In continuous time, A is natural if for all bounded martingales M,
E[M_t A_t] = E[∫_{(0,t]} M_{s−} dA_s].

2.5.5 Lemma.
E[∫_{(0,t]} M_s dA_s] = E[∫_{(0,t]} M_{s−} dA_s].

It will be a consequence of the definition of the stochastic integral that
M_t A_t − ∫_{(0,t]} M_{s−} dA_s = ∫_{(0,t]} A_s dM_s,
and so A is natural if and only if E[(A · M)_t] = 0.

2.5.6 Proposition. (In discrete time) an increasing random sequence is predictable if and only if it is natural.

PROOF: Suppose that A is natural and M is a bounded martingale. Let Y_n be the martingale transform, as defined above. Then
E[A_n (M_n − M_{n−1})] = E[Y_n] − E[Y_{n−1}] = 0.
Taking M_k = E[sgn(A_n − E[A_n | F_{n−1}]) | F_k] then yields E|A_n − E[A_n | F_{n−1}]| = 0, so A_n is F_{n−1}-measurable.

2.5.7 Definition. An increasing process {A_t} is said to be natural if
E[∫_{(0,t]} M_s dA_s] = E[∫_{(0,t]} M_{s−} dA_s]
for all bounded martingales {M_t, F_t}.
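The distinction between M_s and M_{s−} in this definition matters exactly when the increasing process and the martingale jump at the same times: the two integrals differ by Σ_{s≤t} ∆M_s ∆A_s. A quick illustration (my own sketch, not from the notes; the boundedness requirement on M is ignored here, which stopping would repair): take A = N a Poisson process and M_t = N_t − λt the compensated Poisson martingale, so the gap equals N_t, with nonzero mean λt. Hence the increasing process N is not natural, while its continuous compensator λt is.

```python
import numpy as np

# Compare int M_s dN_s with int M_{s-} dN_s for M_t = N_t - lam*t and A = N.
rng = np.random.default_rng(7)
lam, t, n_paths = 2.0, 1.0, 5000
gaps = np.empty(n_paths)
for i in range(n_paths):
    n = rng.poisson(lam * t)                      # N_t
    jumps = np.sort(rng.uniform(0.0, t, size=n))  # jump times given N_t = n
    k = np.arange(1, n + 1)                       # N_s at the k-th jump
    int_left = np.sum((k - 1) - lam * jumps)      # M_{s-} summed over the jumps
    int_right = np.sum(k - lam * jumps)           # M_s summed over the jumps
    gaps[i] = int_right - int_left                # equals the jump count N_t
mean_gap = float(gaps.mean())                     # should be close to lam*t = 2
```

The simulation uses the standard fact that, conditionally on N_t = n, the jump times are distributed as n ordered uniforms on [0, t].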
This is analogous to the discrete time version because
E[M_t A_t] = E[∫_{(0,t]} M_{s−} dA_s]
(see text for the approximation argument). An increasing process is predictable if and only if it is natural. See notes for the proof of existence.

2.5.8 Definition. A right continuous process X is of class D (resp. class DL) if {X_τ}_{τ∈S} (resp. {X_τ}_{τ∈S_a} for all a ∈ R_+) is u.i., where S is the set of all finite stopping times (resp. S_a is the set of all stopping times bounded by a).

2.5.9 Theorem (Doob-Meyer). If X is a right continuous sub-martingale of class DL, then X = M + A, where M is a martingale and A is a natural increasing (predictable) process. If X is of class D then M is u.i. and A is integrable.

2.5.10 Definition. A local martingale is a right continuous process M such that there exists a localizing sequence of stopping times {T_n} with T_n ↑ ∞ a.s. and such that (M − M_0)^{T_n} is a martingale for all n.

2.5.11 Theorem. A process X is a local sub-martingale if and only if it has a decomposition X = M + A, where M is a local martingale and A is a locally integrable increasing process. The decomposition is unique when A is required to be predictable/natural.

2.5.12 Definition. A process X is a semi-martingale if X = M + A, where M is a local martingale and A is locally of finite variation.

2.5.13 Definition. A process X is said to be regular if for all a > 0 and every non-decreasing sequence of stopping times {T_n} bounded by a, if T = lim_{n→∞} T_n then lim_{n→∞} E[X_{T_n}] = E[X_T]. A continuous sub-martingale is regular.

2.5.14 Theorem. For a right continuous sub-martingale X of class DL, the compensator is continuous if and only if X is regular.

2.5.15 Lemma. Non-negative sub-martingales are of class DL.

PROOF: Fix a > 0 and suppose that T is a stopping time with P[T ≤ a] = 1. Then apply the OST to {X_{T∧a}} to get E[X_a | F_T] ≥ X_T. Multiply both sides by 1_{{X_T > λ}} and take expectations to get E[X_T 1_{{X_T > λ}}] ≤ E[X_a 1_{{X_T > λ}}].
Since X_a ∈ L¹ and P[X_T > λ] ≤ (1/λ) E[X_T] ≤ (1/λ) E[X_a] → 0 as λ → ∞, it follows that {X_T}_{T∈S_a} is u.i.

3 Stochastic Integration

Naive stochastic integration (i.e. via Riemann sums) is impossible. Let x be a right continuous function on [0, 1] and {Π_n} a refining sequence of dyadic partitions such that ‖Π_n‖ → 0. What conditions are needed on x so that the sums S_n = Σ_{Π_n} h(t_k)(x(t_{k+1}) − x(t_k)) converge to a finite limit for all continuous h?

3.0.16 Theorem. Finite variation is necessary.

PROOF: Let X = C[0, 1] and Y = R, and for h ∈ X let
T_n(h) = Σ_{Π_n} h(t_k)(x(t_{k+1}) − x(t_k)).
Construct h_n ∈ X such that h_n(t_k) = sgn(x(t_{k+1}) − x(t_k)) for each t_k ∈ Π_n and ‖h_n‖ = 1. For such an h_n we have T_n(h_n) = Σ_{Π_n} |x(t_{k+1}) − x(t_k)|, so sup_n ‖T_n‖ ≥ Var_{[0,1]}(x). On the other hand, if lim_{n→∞} T_n(h) exists for all h ∈ X then by the Banach-Steinhaus theorem sup_n ‖T_n‖ < ∞, so the total variation of x over [0, 1] is finite.

3.1 Riemann-Stieltjes Integration

Let FV be the class of finite variation processes (differences of increasing processes) started at 0.

3.1.1 Theorem. Let A ∈ FV and H be a (jointly) measurable process such that a.s. s ↦ H(s, ω) is continuous. Let {Π_n}, Π_n = {0 = T_0 ≤ T_1 ≤ ···}, be a sequence of random finite partitions of [0, t] such that lim_{n→∞} ‖Π_n‖ = 0. Then for any {S_k} with T_k ≤ S_k ≤ T_{k+1}, a.s.
lim_{n→∞} Σ_{Π_n} H_{S_k}(A_{T_{k+1}} − A_{T_k}) = ∫_0^t H_s dA_s.

3.1.2 Theorem. Let A ∈ FV be right continuous. For f ∈ C¹, the process (f(A_t))_{t≥0} is in FV and is equal to
f(0) + ∫_0^t f′(A_{s−}) dA_s + Σ_{0<s≤t} (∆f(A_s) − f′(A_{s−}) ∆A_s),
where ∆f(A_s) = f(A_s) − f(A_{s−}) and ∆A_s = A_s − A_{s−}.

3.1.3 Example. Let N be a Poisson process of parameter λ and M_t = N_t − λt be the compensated Poisson process. Let H be jointly measurable and (say) bounded. The natural way to define the integral of H with respect to M is
I_t^M(H) = ∫_0^t H_s dM_s = ∫_0^t H_s dN_s − λ ∫_0^t H_s ds.
Let {T_i} be the jump times of the Poisson process, so that N_t = Σ_{i=1}^∞ 1_{{T_i ≤ t}} and
I_t^M(H) = Σ_{n=1}^∞ H_{T_n} 1_{{T_n ≤ t}} − λ ∫_0^t H_s ds.
If H is continuous and adapted, then
E[I_t^M(H) − I_s^M(H) | F_s] = E[∫_s^t H_u dM_u | F_s] = 0,
applying the first theorem in the section, so the integral is seen to be a martingale. What if H is not continuous but right continuous? Let H̃_t = 1_{[0,T_1)}(t), so
∫_0^t H̃_s dM_s = Σ_{i=1}^∞ H̃_{T_i} 1_{{T_i ≤ t}} − λ ∫_0^t H̃_s ds = −λ(t ∧ T_1),
which is not a martingale.

Now for the harder case of continuous martingales (necessarily of unbounded variation). Let B be standard Brownian motion and consider a sequence {Π_n} of dyadic partitions of [0, ∞) with ‖Π_n‖ → 0 as n → ∞. Define
B_t^{(n)} = Σ_{Π_n} B_{t_k} 1_{(t_k, t_{k+1}]}(t).
We know B^{(n)} is càglàd, and B^{(n)} → B u.c.p., i.e. for all T,
sup_{t∈[0,T]} |B_t^{(n)} − B_t| → 0 in probability
as n → ∞. The martingale transform is
I_t^B(B^{(n)}) = Σ_{Π_n} B_{t_k}(B_{t_{k+1}} − B_{t_k})
= Σ_{Π_n} (½(B_{t_{k+1}} + B_{t_k})(B_{t_{k+1}} − B_{t_k}) − ½(B_{t_{k+1}} − B_{t_k})(B_{t_{k+1}} − B_{t_k}))
= ½ B_t² − ½ Σ_{Π_n} (B_{t_{k+1}} − B_{t_k})².
This converges u.c.p. to ½B_t² − ½t. Note that this is not what we would expect from the usual change of variable formula. There is an extra term of −½t:
∫_0^t B_s dB_s = ½ B_t² − ½ t.

3.2 Construction of the Itô integral

See text, §3.1 and §3.2.

3.2.1 Definition. A process X is called simple if there is a strictly increasing sequence of real numbers {t_n}_{n≥0} with t_0 = 0 and lim_{n→∞} t_n = ∞, and a sequence of random variables {ξ_n}_{n≥0} with sup_{n≥0} |ξ_n| ≤ C < ∞ and ξ_n F_{t_n}-measurable for all n, such that
X_t = ξ_0 1_{{0}}(t) + Σ_{i=0}^∞ ξ_i 1_{(t_i, t_{i+1}]}(t).
This class of processes will be denoted L_0. We define the stochastic integral of X ∈ L_0 with respect to M by the martingale transform
I_t^M(X) = Σ_{i=0}^∞ ξ_i (M_{t∧t_{i+1}} − M_{t∧t_i}).
Some obvious properties of I_t^M for simple processes are
(i) I_0^M(X) = 0;
(ii) I_t^M(αX + βY) = α I_t^M(X) + β I_t^M(Y) (linearity);
(iii) {I_t^M(X), F_t}_{t≥0} is a martingale.
Rt (iv) E[(I tM (X ))2 ] = E[ 0 X s2 d[M ]s ] (see below) PROOF (OF 4.): E[(I tM (X ))2 ] = fill in all details as an exercise Z t X s2 d[M ]s ] = E[ 0 It follows from the last property that E[(I tM (X ))2 − (IsM (X ))2 | Fs ] = E[ Z t X s2 d[M ]s ], s so if X ∈ L0 and M ∈ M2 then I M (X ) ∈ M2 and kI M (X )k = [X ] = ∞ X 1 n=0 2n 1 ∧ [X ]n , Rn where [X ]2n = E[ 0 X s2 d[M ]s ]. It is a fact that (M2 , k · k) and (L , [·]) are complete metric spaces. 3.2.2 Lemma. Let X be a bounded, measurable, adapted process. Then there is a sequence {X (m) }m≥1 of simple processes such that Z T >0 m→∞ PROOF: Fix T > 0. T (m) |X t sup lim E 0 − X t |2 d t = 0. 36 Stochastic Calculus I (i) Suppose that X has continuous paths. Then X can be approximated by (m) Xt = X 0 1{0} (t) + ∞ X X k2−m 1(k2−m ,(k+1)2−m ] (t). k=1 Indeed, X (m) → X a.s. as m → ∞ and by the BCT [X − X (m) ] T → 0. (ii) Suppose that X is progressively measurable. Define, for t ∈ [0, T ], Z t Ft = X s ds (m) and X̃ t = m(F t − F t− 1 ). 0 m Note that for F to be well-defined we require only that X is measurable. F is continuous, so X̃ (m) is continuous for all m. Progressive measurability of F follows from that of X . Indeed, if g : ([0, t] × Ω, B[0, t] ⊗ F t ) → (R, B(R)) Rt is measurable then 0 g(s, ω)ds is also (B[0, t] ⊗ F t )-measurable by Fubini(?). Therefore F is continuous and progressively measurable. For all (m) ω ∈ Ω, X t (ω) → X t (ω) by the fundamental theorem of calculus. The BCT gives you the rest. Standard diagonalization gives you that X can be approximated by a sequence in L0 . (iii) Let X be measurable and adapted. As before, F is continuous and measurable but we can not be sure that it is progressively measurable. We will show that F is indeed adapted. By KS Proposition 1.1.2, X has a progresRt sively measurable modification Y . Let G t = 0 Ys ds for t ∈ [0, T ]. We know from the second part that G is F t -adapted. 
It suffices to show that F is a modification of G since the filtration is complete. Fix t ∈ [0, T ]. Z T {F t 6= G t } ⊆ { 1X t 6=Yt d t > 0} 0 so P[F t 6= G t ] ≤ Z T P[X t 6= Yt ]d t = 0. 0 Since {F t } is complete, F t is adapted. Proceed as in the previous step. 3.2.3 Proposition. If t 7→ [M ] t is absolutely continuous then L0 is dense in (L (M ), [·] M ). PROOF: If X ∈ L is bounded then the assertion essentially follows from the previous lemma. Choose a subsequence of {X (m) } along which n oC lim X (mk ) = X k→∞ has zero µB measure, and therefore zero µ M -measure. By BCT we have convergence in [·] M . If not bounded then use DCT instead (truncation, take limits). Characterization of the Stochastic Integral Integrand L (M ) (meas, adapted) L ∗ (M ) (prog meas) P (M ) (predictable) 37 Integrator M ∈ M2c , t 7→ [M ] t a.c. (Itô) M ∈ M2c (Itô) M ∈ M2 (Kunita-Watanabe) Using localization we can replace M2c by M c,loc and replace Z E[ T X s2 d[M ]s ] <∞ Z by 0 P[ T X s2 d[M ]s ] < ∞. 0 We must deal with questions like whether I(X T ) = (I(X )) T for stopping times T . 3.3 Characterization of the Stochastic Integral Rt For M ∈ M2c , X ∈ L ∗ (M ) we have shown that I tM (X ) = 0 X s d Ms is well-defined. Rt We know what I M (X ) ∈ M2c with quadratic variation 0 X s2 d[M ]s . What is the cross variation of I M (X ) and I Y (N )? Recall that the cross variation may be characterized as the unique predictable process (of finite variation) such that I M (X )I N (Y ) − [I M (X ), I N (Y )] is a martingale. First, when X and Y are simple, suppose without loss of generality that ∞ X ξi 1(t i ,t i+1 ] (t) X = ξ0 1{0} (t) + i=1 and Y = η0 1{0} (t) + ∞ X ηi 1(t i ,t i+1 ] (t). i=1 Remember that I tM (X ) = ∞ X ξi (M t i+1 ∧t − M t i ∧t ) and i=1 I tN (Y ) = ∞ X ηi (Nt i+1 ∧t − Nt i ∧t ). i=1 Fix 0 ≤ s ≤ t < t and suppose that n and m are such that t m ≤ s < t m+1 and t n ≤ t < t n+1 (in fact, suppose for now that s = t m and t = t m+1 ). 
E[(I tM (X ) − IsM (X ))(I tN (Y ) − IsN (Y )) | Fs ] n X n X = E[ ξi η j (M t i+1 − M t i )(Nt j+1 − Nt j ) | Fs ] i=m j=m = E[ n X i=m ξi ηi (M t i+1 − M t i )(Nt i+1 − Nt i ) | Fs ] 38 Stochastic Calculus I = = = n X i=m n X i=m n X E[ξi ηi (M t i+1 − M t i )(Nt i+1 − Nt i ) | Fs ] E[ξi ηi E[M t i+1 Nt i+1 − M t i Nt i | F t i ] | Fs ] E[ξi ηi E[[M , N ] t i+1 − [M , N ] t i | F t i ] | Fs ] i=m = E[ = E[ n X i=m Z t ξi ηi ([M , N ] t i+1 − [M , N ] t i ) | Fs ] X u Yu d[M , N ]u | Fs ] s 3.3.1 Proposition. Let α, β, γ be right continuous functions [0, ∞) → R with α(0) = β(0) = γ(0) = 0. Let α be of finite variation and β and γ be increasing. Suppose further that for all s ≤ t we have Z t Z t 1 Z t 1 2 2 dαu ≤ dβu dγu . s s s Then for any measurable functions f , g we have Z t Z t 1 1 Z t 2 2 2 2 | f g|d|α| ≤ g dγ . f dβ s s s PROOF: Monotone class theorem. 3.3.2 Theorem (Kunita-Watanabe inequality, 1967). If M , N ∈ M2c , X ∈ L ∗ (M ), Y ∈ L ∗ (N ) then a.s. Z t Z t 1 Z X s2 d[M ]s |X s Ys |d|[M , N ]s | ≤ 0 2 0 t Ys2 d[N ]s 1 2 . 0 PROOF: By the previous result we only need to show that there is a negligible set Z such that Z t Z t 1 Z t 1 2 2 d|[M , N ]u | ≤ d[M ]u d[N ]u s s s holds path-wise for all s, t. Let Z be the null set such that if ω ∈ / Z then Z t d[M + r N , M + r N ]u 0≤ s for all r, s, t with s ≤ t and r, s, t ∈ Q. Then Z t Z t d[M + r N , M + r N ] t − 0≤ s s d[M + r N , M + r N ]s Characterization of the Stochastic Integral 39 = r s ([N ] t − [N ]s ) + 2r([N , M ] t − [N , M ]s ) + ([M ] t − [M ]s ) The right hand side is non-negative for all rational r, so it holds for all real r by continuity. The discriminant of the quadratic equation must be non-negative, which gives us the desired inequality. Since we have it for rational s, t, by right continuity of the paths we have it for all s, t. ∗ 3.3.3 Lemma. 
If M , N ∈ M2c , X ∈ L ∗ (M ), {X (n) }∞ n=1 ⊆ L (M ), with Z lim n→∞ T |X s(n) − X s |2 d[M ]s = 0 0 a.s.-P, then, for all 0 ≤ t ≤ T , lim [I M (X (n) ), N ] t = [I M (X ), N ] t . n→∞ PROOF: T Z |[] t | ≤ [] t [] t ≤ [] T 0 3.3.4 Lemma. If M , N ∈ M2c and X ∈ L ∗ (M ) then [I M (X ), N ] t = t Z X s d[M , N ]s . 0 PROOF: We showed there is a sequence of simple processes such that the condition in the above lemma holds. But we showed that the condition in this lemma holds for simple processes. 3.3.5 Theorem. Consider a martingale M ∈ M2c and X ∈ L ∗ (M ). Then I M (X ) is the unique martingale Φ ∈ M2c such that [Φ, N ] t = Z t X s d[M , N ]s 0 for all N ∈ M2c . 3.3.6 Corollary. If M ∈ M2c , X ∈ L ∗ (M ), N = I M (X ), Y ∈ L ∗ (N ), then X Y ∈ L ∗ (M ) and I N (Y ) = I M (X Y ). PROOF: [N ] t = Rt 0 X s2 d[M ]s , so Z E[ 0 T X s2 Ys2 d[M ]s ] = E[ Z 0 T Ys2 d[N ]s ] < ∞. 40 Stochastic Calculus I So X , Y ∈ L ∗ (M ). For any Ñ ∈ M2c , the previous theorem showed that d[N , Ñ ]s = X s d[M , Ñ ]s and so [I M (X Y ), Ñ ] t = Z t X s Ys d[M , Ñ ]s = 0 Z t Ys d[N , Ñ ]s = [I N (Y ), Ñ ] t . 0 By the characterization of the integral, I M (X Y ) = I N (Y ). Today we have shown that [I M (X ), I N (Y )] = Read the proof of Itô’s formula. 3.4 Rt 0 X s Ys d[M , N ]s . Stochastic Integration Today we extend the definition of the stochastic integral to all of M c,loc . For M ∈ M2c with t 7→ [M ] t absolutely continuous and X ∈ L ∗ (M ). We used the fact that for all T < ∞ there is a sequence {X (m) } ⊆ L0 such that T Z (m) |X t E[ − X t |2 d t] → 0 0 as m → ∞. We use “time changes” to do the general case. 3.4.1 Theorem. For M ∈ M2c , L0 (M ) is dense in L ∗ (M ) with respect to [·]. PROOF: The proof in the general case follows from the following more general lemma. 3.4.2 Lemma. Let {A t } is a continuous (resp. right-continuous) increasing, F adapted process. If X is progressively measurable and satisfies T Z X t2 dA t ] < ∞ E[ 0 for all T > 0, then there exists a.s. 
a sequence $\{X^{(n)}\}_{n=1}^\infty$ of simple processes such that
\[ \sup_{T>0} \lim_{n\to\infty} E\Big[\int_0^T |X_t^{(n)} - X_t|^2 \, dA_t\Big] = 0. \]

PROOF: Assume without loss of generality that $X$ is bounded, say by $C$. It suffices to fix $T > 0$ and show there exists $\{X^{(n)}\}_{n=1}^\infty \subseteq L_0$ such that
\[ \lim_{n\to\infty} E\Big[\int_0^T |X_t^{(n)} - X_t|^2 \, dA_t\Big] = 0. \]
The process $A_t + t$ is strictly increasing and continuous, so it has a continuous, strictly increasing inverse function $T_s$ defined by $A_{T_s} + T_s = s$ for all $\omega$. In particular, $T_s \le s$ and $\{T_s \le t\} = \{A_t + t \ge s\} \in \mathcal{F}_t$. Therefore for all $s \ge 0$, $T_s$ is an $\mathcal{F}$-stopping time. Define $\mathcal{G}_s = \mathcal{F}_{T_s}$ and $Y_s = X_{T_s}$. Since $X$ is progressively measurable, $Y$ is $\mathcal{G}$-adapted. Without loss of generality, assume $X_t \equiv 0$ for $t \ge T$. Also,
\[ E\Big[\int_0^\infty Y_s^2 \, ds\Big] = E\Big[\int_0^\infty 1_{\{T_s \le T\}} X_{T_s}^2 \, ds\Big] = E\Big[\int_0^{A_T + T} 1_{\{T_s \le T\}} X_{T_s}^2 \, ds\Big] \le C^2 (E[A_T] + T) < \infty. \]
For any $n \in \mathbb{N}$, choose $R < \infty$ such that $E[\int_R^\infty Y_s^2\, ds] < \frac{1}{2n}$. By the old result, there is a simple process $\tilde Y^{(n)}$ such that
\[ E\Big[\int_0^R |\tilde Y_s^{(n)} - Y_s|^2 \, ds\Big] < \frac{1}{2n}. \]
Define $Y_s^{(n)} = 1_{[0,R]}(s)\, \tilde Y_s^{(n)}$. Then
\[ E\Big[\int_0^\infty |Y_s^{(n)} - Y_s|^2 \, ds\Big] < \frac{1}{n}. \]
But
\[ Y_s^{(n)} = \xi_0 1_{\{0\}}(s) + \sum_{i \ge 0} \xi_i 1_{(s_i, s_{i+1}]}(s), \]
where each $\xi_i$ is $\mathcal{G}_{s_i}$-measurable. Define
\[ X_t^{(n)} = Y_{A_t + t}^{(n)} = \xi_0 1_{\{0\}}(t) + \sum_{i \ge 0} \xi_i 1_{(T_{s_i}, T_{s_{i+1}}]}(t). \]
To see that $X^{(n)}$ is $\mathcal{F}$-adapted, simply observe that $\xi_i$, restricted to $(T_{s_i}, T_{s_{i+1}}]$, is $\mathcal{F}_t$-measurable.

Now we extend to $\mathcal{M}^{c,\mathrm{loc}}$. For simplicity we assume that $M_0 = 0$. Define $T_n := \inf\{t > 0 : |M_t| > n\}$ for $n \in \mathbb{N}$. Then $M^{T_n}$ is a bounded martingale. Recall the (generalized) Doob decomposition: if $M \in \mathcal{M}^{c,\mathrm{loc}}$ then there is a unique, continuous, increasing process $[M]$ such that $M^2 - [M] \in \mathcal{M}^{c,\mathrm{loc}}$. For $M, N \in \mathcal{M}_0^{c,\mathrm{loc}}$ define
\[ [M,N] = \tfrac{1}{4}([M+N] - [M-N]). \]

3.4.3 Definition. Let $M \in \mathcal{M}_0^{c,\mathrm{loc}}$ and let $X$ be progressively measurable with
\[ P\Big[\int_0^T X_t^2 \, d[M]_t < \infty\Big] = 1 \]
for all $T < \infty$.
Then $I_t^M(X)$ is defined to be
\[ I_t^M(X) = I_t^{M^{T_n}}(X^{T_n}) = \int_0^t X^{T_n} \, dM^{T_n} \]
for all $t \in [0, T_n]$, where $T_n = S_n \wedge R_n$ and
\[ S_n = \inf\Big\{t > 0 \,\Big|\, \int_0^t X_s^2 \, d[M]_s \ge n\Big\} \quad\text{and}\quad R_n = \inf\{t > 0 \mid |M_t| \ge n\}. \]
Note that $M^{T_n} \in \mathcal{M}_2^c$ and $X^{T_n} \in L^*(M^{T_n})$ by the definition of $S_n$. So $I^{M^{T_n}}(X^{T_n})$ is well-defined.

3.4.4 Definition. A (continuous) semi-martingale $X$ is a process that admits a decomposition $X = M + A$, where $M \in \mathcal{M}^{c,\mathrm{loc}}$ and $A \in FV^{c,\mathrm{loc}}$, the collection of continuous adapted processes that are of finite variation on every bounded interval. This decomposition is unique.

Now we can define the stochastic integral with respect to a continuous semi-martingale in the obvious way.

3.4.5 Proposition. We have the following properties of $I^M(H)$.
(i) Linearity;
(ii) $[I^M(H)]_t = \int_0^t H_s^2 \, d[M]_s$;
(iii) $[I^M(H), I^N(K)]_t = \int_0^t H_s K_s \, d[M,N]_s$.
Furthermore, the stochastic integral $I^M(H)$ is characterized as the unique $\Phi \in \mathcal{M}^{c,\mathrm{loc}}$ such that $[\Phi, N]_t = \int_0^t H_s \, d[M,N]_s$ for all $N \in \mathcal{M}_2^c$. In particular, we can no longer say anything about conditional expectations, since a local martingale need not be integrable.

3.5 Integration by parts formula for stochastic integrals

Recall that, for a partition $\Pi$ of $[0,t]$,
\[ M_t^2 = \sum_\Pi (M_{t_k}^2 - M_{t_{k-1}}^2) = \sum_\Pi (M_{t_k} - M_{t_{k-1}})(M_{t_k} + M_{t_{k-1}}) = 2\sum_\Pi M_{t_{k-1}}(M_{t_k} - M_{t_{k-1}}) + \sum_\Pi (M_{t_k} - M_{t_{k-1}})^2. \]

3.5.1 Lemma. Let $M$ be a bounded continuous martingale and $A$ be a continuous adapted process of finite variation. Then
(i) $M_t^2 = 2\int_0^t M_s \, dM_s + [M]_t$; and
(ii) $M_t A_t = \int_0^t A_s \, dM_s + \int_0^t M_s \, dA_s$.

PROOF: Define $T_0^n := 0$ and
\[ T_{k+1}^n := \inf\{t > T_k^n \mid |M_t - M_{T_k^n}| > 2^{-n}\}, \]
and define $t_k^n = t \wedge T_k^n$. Then we have
\[ M_t^2 = 2\sum_{k\ge1} M_{t_{k-1}^n}(M_{t_k^n} - M_{t_{k-1}^n}) + \sum_{k\ge1} (M_{t_k^n} - M_{t_{k-1}^n})^2. \]
Define
\[ X^n := \sum_{k\ge1} M_{T_{k-1}^n} 1_{(T_{k-1}^n, T_k^n]} \quad\text{and}\quad A^n := \sum_{k\ge1} (M_{t_k^n} - M_{t_{k-1}^n})^2, \]
so we can rewrite this as $M_t^2 = 2 I_t^M(X^n) + A_t^n$. We showed earlier that $A^n \to [M]$ a.s. (at least along a subsequence).
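As a numerical aside (not part of the proof), the convergence of the discrete quadratic sums $A^n$ can be observed by simulation in the basic case $M = B$, a standard Brownian motion, where $[B]_t = t$. The sketch below is illustrative only; the function name and parameters are ours, and the partition is uniform rather than the stopping-time partition used in the proof.

```python
import random

def quadratic_variation(t=1.0, steps=100_000, seed=0):
    """Simulate a Brownian path on [0, t] with a uniform partition and
    return the sum of squared increments, which approximates [B]_t = t."""
    rng = random.Random(seed)
    dt = t / steps
    qv = 0.0
    for _ in range(steps):
        db = rng.gauss(0.0, dt ** 0.5)  # Brownian increment ~ N(0, dt)
        qv += db * db                   # accumulate squared increment
    return qv

# For a fine partition the discrete quadratic sum is close to [B]_t = t.
print(quadratic_variation(t=1.0))
```

The fluctuation of the discrete sum around $t$ has standard deviation of order $\sqrt{2t\,\delta}$ for mesh $\delta$, so refining the partition visibly tightens the approximation.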
Since $\sup_t |X_t^n - X_t^{n+1}| \le 2^{-n-1}$ and $\sup_t |X_t^n - M_t| \le 2^{-n}$, we have $I^M(X^n) \to I^M(M)$. Taking limits, $M_t^2 = 2 I_t^M(M) + [M]_t$. The second part is a similar argument, taking $t_k^n = (k 2^{-n}) \wedge t$ and $X^n = \sum_{k\ge1} A_{t_{k-1}^n} 1_{(t_{k-1}^n, t_k^n]}$.

3.5.2 Theorem. Let $X$ and $Y$ be continuous semi-martingales. Then
\[ X_t Y_t - X_0 Y_0 = \int_0^t X_s \, dY_s + \int_0^t Y_s \, dX_s + [X,Y]_t. \]

PROOF: We assume without loss of generality that $X_0 = Y_0 = 0$. Suppose that $X = M + A$ and $Y = N + V$. Apply the first part of the last lemma to $M + N$ and $M - N$ to get
\[ M_t N_t = \int_0^t M_s \, dN_s + \int_0^t N_s \, dM_s + [M,N]_t. \]
Combine this with ordinary Lebesgue–Stieltjes integration by parts for the finite variation terms to get the result.

Recall the following results.

3.5.3 Theorem (Chain Rule). If $X$ is a continuous semi-martingale and $U$ and $V$ are progressively measurable processes with $V \in L^*(X)$, then $U \in L^*(V \cdot X)$ if and only if $UV \in L^*(X)$, in which case $U \cdot (V \cdot X) = (UV) \cdot X$.

3.5.4 Theorem (Integration-by-parts). For continuous semi-martingales $X$ and $Y$,
\[ X_t Y_t - X_0 Y_0 = \int_0^t X_s \, dY_s + \int_0^t Y_s \, dX_s + [X,Y]_t. \]

We prove today the following.

3.5.5 Theorem (Itô's Formula). For a continuous semi-martingale $X$ and a function $f \in C^2(\mathbb{R}^d)$,
\[ f(X_t) - f(X_0) = \sum_i \int_0^t \frac{\partial f}{\partial x_i}(X_s) \, dX_s^i + \frac{1}{2} \sum_{i,j} \int_0^t \frac{\partial^2 f}{\partial x_i \partial x_j}(X_s) \, d[X^i, X^j]_s. \]

PROOF: We prove the case $d = 1$. Fix $X$ and let $\mathcal{C}$ be the collection of smooth functions $f$ for which the formula holds. Clearly $\mathcal{C}$ is a linear subspace of $C^2(\mathbb{R})$ that contains all linear functions. We show that $\mathcal{C}$ is closed under multiplication. Let $f, g \in \mathcal{C}$ and define $F = f(X)$ and $G = g(X)$. Then $F$ and $G$ are continuous semi-martingales.
By the integration-by-parts formula,
\[
\begin{aligned}
(fg)(X_t) - (fg)(X_0) &= F_t G_t - F_0 G_0 = (F \cdot G)_t + (G \cdot F)_t + [F,G]_t \\
&= f(X) \cdot \big(g'(X) \cdot X + \tfrac{1}{2} g''(X) \cdot [X]\big) + g(X) \cdot \big(f'(X) \cdot X + \tfrac{1}{2} f''(X) \cdot [X]\big) + f'(X) g'(X) \cdot [X] \\
&= (fg' + gf')(X) \cdot X + \tfrac{1}{2}(fg'' + gf'' + 2f'g')(X) \cdot [X] \\
&= (fg)'(X) \cdot X + \tfrac{1}{2}(fg)''(X) \cdot [X],
\end{aligned}
\]
since $[f(X), g(X)] = [f'(X) \cdot X,\, g'(X) \cdot X] = f'(X) g'(X) \cdot [X]$ by Kunita–Watanabe. Therefore $\mathcal{C}$ contains all polynomials.

Let $f \in C^2$ be arbitrary. By the Weierstrass approximation theorem there are polynomials $p_n$ such that
\[ \sup_{|x| \le c} |p_n(x) - f''(x)| \to 0 \]
as $n \to \infty$ for all $c > 0$. Integrate $p_n$ twice to get polynomials $F_n$ such that
\[ \sup_{|x| \le c} |F_n(x) - f(x)| \vee |F_n'(x) - f'(x)| \vee |F_n''(x) - f''(x)| \to 0 \]
as $n \to \infty$ for all $c > 0$. In particular, $F_n(X_t) \to f(X_t)$ for all $t \ge 0$. Letting $X = \tilde M + \tilde A$, by the dominated convergence theorem for Stieltjes integrals,
\[ \big(F_n'(X) \cdot \tilde A + \tfrac{1}{2} F_n''(X) \cdot [\tilde M]\big) \to \big(f'(X) \cdot \tilde A + \tfrac{1}{2} f''(X) \cdot [\tilde M]\big). \]
All that remains is to show that
\[ \int_0^t F_n'(X_s) \, d\tilde M_s \to \int_0^t f'(X_s) \, d\tilde M_s. \]
This sequence does converge to this limit in $L^2$ because
\[ E\Big[\Big(\int_0^t (F_n'(X_s) - f'(X_s)) \, d\tilde M_s\Big)^2\Big] = E\Big[\int_0^t (F_n'(X_s) - f'(X_s))^2 \, d[\tilde M]_s\Big] \to 0 \]
as $n \to \infty$ (this is the Itô isometry). Therefore there is a subsequence along which the convergence is a.s.

3.6 Fisk–Stratonovich integral

3.6.1 Definition. The Fisk–Stratonovich integral of $X$ with respect to $Y$ is
\[ \int_0^t X_s \circ dY_s := \int_0^t X_s \, dY_s + \frac{1}{2}[X,Y]_t. \]

We have an Itô rule and an integration-by-parts rule for this type of integral. We also have the following fact:
\[ S(\Pi) = \sum_{i=0}^{m-1} \tfrac{1}{2}(B_{t_i} + B_{t_{i+1}})(B_{t_{i+1}} - B_{t_i}) \xrightarrow{(p)} \int_0^t B_s \circ dB_s. \]

3.6.2 Example. Consider the ODE $\frac{dN}{dt} = a(t) N(t)$, where $a$ is the rate of growth and $N$ is the number of people.
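Before adding randomness to the rate $a$, it may help to record the deterministic solution of this ODE, which the stochastic model perturbs (a standard fact, spelled out here for reference):

```latex
% Separating variables in dN/dt = a(t) N(t) and integrating gives
N(t) = N(0)\,\exp\!\Big(\int_0^t a(s)\,ds\Big),
% so for a constant rate a this is exponential growth, N(t) = N(0)\,e^{at}.
```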
For whatever reason, we may think of $a(t)$ as $r(t) + \xi(t)$, a deterministic part plus a process of random fluctuations. Empirically, "$\xi(t) = \frac{dB_t}{dt}$", so we write
\[ dN_t = r(t) N_t \, dt + \sigma N_t \, dB_t. \]
It is not clear which integral we should use to make sense of $N_t$. The choice of integral depends on the model. In finance the Itô integral is used because we cannot look into the future. For stochastic processes on a manifold the Stratonovich integral is used.

3.7 Applications of Itô's formula

Regular conditional probabilities

Given a probability measure $P$ on $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$, its characteristic function is
\[ f(\theta) = \int e^{i\theta x}\, P(dx) = E[e^{i\theta Z}] \]
for any r.v. $Z$ with law $P$. Now suppose you were able to show that for $P$-a.e. $\omega \in \Omega$,
\[ f(\theta) = E[e^{i\theta Z} \mid \mathcal{F}_T](\omega). \]
We would expect that $f(\theta)$ should be the characteristic function of the conditional law of $Z$ given $\mathcal{F}_T$. In order to do that we would like to be able to write
\[ E[e^{i\theta Z} \mid \mathcal{F}_T](\omega) = \int e^{i\theta x}\, (P \mid \mathcal{F}_T)(\omega, dx) \]
for some measure $(P \mid \mathcal{F}_T)$. This motivates the following.

3.7.1 Definition. Given a r.v. $Z$ on $(\Omega, \mathcal{F}, P)$ taking values in $(S, \mathcal{S})$, and a sub-$\sigma$-algebra $\mathcal{F}_T \subseteq \mathcal{F}$, we will say $(P \mid \mathcal{F}_T)$ is a regular conditional probability if
(i) for all $\omega \in \Omega$, $(P \mid \mathcal{F}_T)(\omega, \cdot)$ defines a probability measure on $(S, \mathcal{S})$;
(ii) for all $A \in \mathcal{S}$, $(P \mid \mathcal{F}_T)(\cdot, A)$ is $\mathcal{F}_T$-measurable; and
(iii) for all $A \in \mathcal{S}$ and $P$-a.e. $\omega \in \Omega$, $P(Z \in A \mid \mathcal{F}_T)(\omega) = (P \mid \mathcal{F}_T)(\omega, A)$.

For fixed $A$, we notice $P(Z \in A \mid \mathcal{F}_T)(\omega) = E[1_{\{Z \in A\}} \mid \mathcal{F}_T](\omega)$ $P$-a.s. The existence of a regular conditional probability asks whether there is a "modification" of $\{P(Z \in A \mid \mathcal{F}_T) \mid A \in \mathcal{S}\}$ (defined as above) that satisfies the first condition (as it certainly satisfies the second).

3.7.2 Theorem. If $S$ is a complete, separable metric space and $\mathcal{S} = \mathcal{B}(S)$, then regular conditional probabilities exist.

3.7.3 Lemma. Let $X$ be a $d$-dimensional random vector on $(\Omega, \mathcal{F}, P)$.
Suppose that $\mathcal{G}$ is a sub-$\sigma$-field of $\mathcal{F}$ and suppose that for each $\omega \in \Omega$ there is a function $\varphi(\omega, \cdot): \mathbb{R}^d \to \mathbb{C}$ such that for all $u \in \mathbb{R}^d$,
\[ \varphi(\omega, u) = E[e^{i\langle u, X \rangle} \mid \mathcal{G}](\omega) \quad P\text{-a.e.} \]
If for each $\omega$, $\varphi(\omega, \cdot)$ is the characteristic function of some probability measure $P_\omega$ on $(\mathbb{R}^d, \mathcal{B}(\mathbb{R}^d))$, i.e. if
\[ \varphi(\omega, u) = \int_{\mathbb{R}^d} e^{i\langle u, x \rangle}\, P_\omega(dx), \]
then for all $A \in \mathcal{B}(\mathbb{R}^d)$,
\[ P[X \in A \mid \mathcal{G}](\omega) = P_\omega(A) =: (P \mid \mathcal{G})(\omega, A) \quad P\text{-a.e. } \omega. \]

PROOF: Let $(Q \mid \mathcal{G})$ be a regular conditional probability for $X$ given $\mathcal{G}$, so that for each fixed $u \in \mathbb{R}^d$,
\[ \varphi(\omega, u) = E[e^{i\langle u, X \rangle} \mid \mathcal{G}](\omega) = \int_{\mathbb{R}^d} e^{i\langle u, x \rangle}\, (Q \mid \mathcal{G})(\omega, dx) \]
$P$-a.e. $\omega$. The set of $\omega$ for which this holds could depend on $u$. Take a countable dense subset $D \subseteq \mathbb{R}^d$ and $\tilde\Omega \in \mathcal{F}$ with $P(\tilde\Omega) = 1$ so that the equation above holds for all $u \in D$ and all $\omega \in \tilde\Omega$. Use continuity with respect to $u$ of both sides (both are characteristic functions) to conclude for all $u \in \mathbb{R}^d$ and all $\omega \in \tilde\Omega$. (See page 85 in KS.)

This can be used to prove the strong Markov property in a different way.

Martingale characterization of Brownian motion

Recall that if $B$ is a standard $d$-dimensional Brownian motion then the covariation among the components is $[B^{(i)}, B^{(j)}]_t = \delta_{ij} t$.

3.7.4 Theorem (Lévy, 1948; Kunita–Watanabe, 1967). Suppose that $X$ is a $d$-dimensional continuous adapted process such that for every component $1 \le k \le d$ the process $M_t^{(k)} := X_t^{(k)} - X_0^{(k)}$ is a continuous local martingale and $[M^{(i)}, M^{(j)}]_t = \delta_{ij} t$. Then $X$ is a Brownian motion.

PROOF: We will show that for all $0 \le s \le t$, $X_t - X_s$ is independent of $\mathcal{F}_s$ and has the $d$-variate normal distribution with mean zero and covariance matrix $(t-s) I_d$. To do this we will show that
\[ E[e^{i\langle u, X_t - X_s \rangle} \mid \mathcal{F}_s] = e^{-\frac{1}{2}\|u\|^2 (t-s)}. \]
For fixed $u$, $f(x) = e^{i\langle u, x \rangle}$ satisfies $\frac{\partial f}{\partial x_j}(x) = i u_j f(x)$ and $\frac{\partial^2 f}{\partial x_i \partial x_j}(x) = -u_i u_j f(x)$. Applying Itô's formula to the real and imaginary parts we have
\[ e^{i\langle u, X_t \rangle} = e^{i\langle u, X_s \rangle} + i \sum_{j=1}^d u_j \int_s^t e^{i\langle u, X_v \rangle} \, dM_v^{(j)} - \frac{1}{2} \sum_{j=1}^d u_j^2 \int_s^t e^{i\langle u, X_v \rangle} \, dv. \]
Now $|f(x)| \le 1$ for all $x \in \mathbb{R}^d$, and because $[M^{(j)}]_t = t$ is bounded on any interval, $M^{(j)} \in \mathcal{M}_2^c$. Thus the real and imaginary parts of $\{\int_0^t e^{i\langle u, X_v \rangle} \, dM_v^{(j)}\}$ lie in $\mathcal{M}_2^c$ (not just $\mathcal{M}^{c,\mathrm{loc}}$). Taking conditional expectations,
\[ E\Big[\int_s^t e^{i\langle u, X_v \rangle} \, dM_v^{(j)} \,\Big|\, \mathcal{F}_s\Big] = 0 \]
$P$-a.s. For $A \in \mathcal{F}_s$, multiplying by $e^{-i\langle u, X_s \rangle} 1_A$ gives the result. (Fill this in.)

Itô's formula for general $f \in C^{1,2}$ and a semi-martingale $X = M + A$ is
\[ f(t, X_t) = f(0, X_0) + \int_0^t \frac{\partial f}{\partial t}(s, X_s) \, ds + \sum_i \int_0^t \frac{\partial f}{\partial x_i}(s, X_s) \, dX_s^i + \frac{1}{2} \sum_{i,j} \int_0^t \frac{\partial^2 f}{\partial x_i \partial x_j}(s, X_s) \, d[X^i, X^j]_s. \]

Bessel process

Let $B$ be a $d$-dimensional Brownian motion and let
\[ R_t = \|B_t\| = \sqrt{(B_t^{(1)})^2 + \cdots + (B_t^{(d)})^2}. \]
By the rotation (orthogonal transformation) property of Brownian motion, if $\|y\| = \|x\|$ then $R$ has the same distribution under $P^x$ and $P^y$. Use this to show that $R$ is a Markov process (see hand-written notes).

Index

Borel σ-algebra, 3
Brownian filtration, 18
canonical version, 5
central limit theorem, 4, 11
Chapman-Kolmogorov equations, 16
class D, 32
class DL, 32
consistent, 6
converges P-a.s., 3
converges in distribution, 3
converges in probability, 3
converges weakly, 5, 8
covariance function, 15
cylinder set, 6
dyadic rationals, 7
equal a.s., 3
equal in distribution, 3, 4
equicontinuous, 10
fi.di. distributions, 5
finite dimensional marginal distributions, 5
Fisk-Stratonovich integral, 45
Gaussian process, 15
hitting time, 18
homogeneous, 16
homogeneous increments, 5
image measure, 5
increasing sequence, 30
independent increments, 5
indistinguishable, 3, 4
infinitely divisible, 29
infinitesimal generator, 17
integrable, 22, 30
kernel, 15
Lévy process, 29
Lévy-Khintchine formula, 30
local martingale, 32
localizing sequence, 32
locally Hölder continuous, 7
Markov family, 17
Markov process, 15, 16
Markov's inequality, 3
martingale, 22
measurable, 4
measurable function, 3
modification, 4
modulus of continuity, 10
natural, 31
optional time, 18
Poisson process, 29
Polish space, 10
predictable process, 22
progressively measurable, 17
Prohorov metric, 11
random element, 3
random measure, 3
random process, 3
random variable, 3
random vector, 3
reflection property, 12
regular, 32
regular conditional probability, 46
relatively compact, 9
scaling property, 12
semi-martingale, 32, 42
simple, 35
simple Markov property, 12
simple stopping time, 18
stable, 30
state space, 4
stationary increments, 5
stochastic kernel, 15
stochastic process, 4
stopped process, 26
stopping time, 18
strong law of large numbers, 4
sub-martingale, 22
super-martingale, 22
tight, 10
time inversion property, 12
time reversal property, 12
time shift operator, 21
total variation norm, 8
transition density, 17
transition function, 16
transition probability, 15
universally measurable, 17
usual conditions, 24
variation, 13
weak law of large numbers, 4
Wiener measure, 6