An Introduction to Stochastic Calculus with Applications to Finance

Ovidiu Calin
Department of Mathematics
Eastern Michigan University
Ypsilanti, MI 48197 USA
[email protected]

Preface

The goal of this work is to introduce elementary Stochastic Calculus to senior undergraduate as well as master's students with Mathematics, Economics and Business majors. The author's goal was to capture as much as possible of the spirit of elementary Calculus, to which the students have already been exposed at the beginning of their majors. This calls for a presentation that mimics the corresponding properties of deterministic Calculus, which facilitates the understanding of the more complicated concepts of Stochastic Calculus. Since deterministic Calculus books usually start with a brief presentation of elementary functions, and then continue with limits and other properties of functions, we employ here a similar approach, starting with elementary stochastic processes and different types of limits, and pursuing properties of stochastic processes. The chapters regarding differentiation and integration follow the same pattern. For instance, there is a product rule, a chain-type rule and an integration by parts formula in Stochastic Calculus, which are modifications of the well-known rules from elementary Calculus. Since deterministic Calculus can be used for modeling regular business problems, in the second part of the book we deal with the stochastic modeling of business applications, such as financial derivatives, whose modeling is based solely on Stochastic Calculus. In order to make the book available to a wider audience, we sacrificed rigor for clarity. Most of the time we assumed maximal regularity conditions under which the computations hold and the statements are valid. This will be found attractive by both Business and Economics students, who might otherwise get lost in a very profound mathematical textbook where the forest's scenery is obscured by the sight of the trees.
An important feature of this textbook is the large number of solved problems and examples, from which both the beginner and the advanced student will benefit. This book grew from a series of lectures and courses given by the author at Eastern Michigan University (USA), Kuwait University (Kuwait) and Fu-Jen University (Taiwan). Several students read the first draft of these notes and provided valuable feedback, supplying a list of corrections which is far from exhaustive. Any typos or comments regarding the present material are welcome.

The Author, Ann Arbor, October 2012

Contents

Part I: Stochastic Calculus

1 Introduction
1.1 A Few Introductory Problems
1.1.1 Stochastic population growth models
1.1.2 Pricing zero-coupon bonds
1.1.3 Diffusions of particles
1.1.4 White noise
1.2 Bounded Variation

2 Basic Notions
2.1 Probability Space
2.2 Sample Space
2.3 Events and Probability
2.4 Random Variables
2.5 Integration in Probability Measure
2.6 Distribution Functions
2.7 Independence
2.8 Expectation
2.8.1 The best approximation of a random variable
2.8.2 Change of measure in an expectation
2.9 Basic Distributions
2.10 Conditional Expectations
2.11 Inequalities of Random Variables
2.12 Limits of Sequences of Random Variables
2.13 Properties of Mean-Square Limit
2.14 Stochastic Processes
3 Useful Stochastic Processes
3.1 The Brownian Motion
3.2 Geometric Brownian Motion
3.3 Integrated Brownian Motion
3.4 Exponential Integrated Brownian Motion
3.5 Brownian Bridge
3.6 Brownian Motion with Drift
3.7 Bessel Process
3.8 The Poisson Process
3.9 Interarrival times
3.10 Waiting times
3.11 The Integrated Poisson Process
3.12 Submartingales

4 Properties of Stochastic Processes
4.1 Stopping Times
4.2 Stopping Theorem for Martingales
4.3 The First Passage of Time
4.4 The Arc-sine Laws
4.5 More on Hitting Times
4.6 The Inverse Laplace Transform Method
4.7 Limits of Stochastic Processes
4.8 Mean Square Convergence
4.9 The Martingale Convergence Theorem
4.10 Quadratic Variations
4.10.1 The Quadratic Variation of W_t
4.10.2 The Quadratic Variation of N_t − λt

5 Stochastic Integration
5.1 Nonanticipating Processes
5.2 Increments of Brownian Motions
5.3 The Ito Integral
5.4 Examples of Ito integrals
5.4.1 The case F_t = c, constant
5.4.2 The case F_t = W_t
5.5 Properties of the Ito Integral
5.6 The Wiener Integral
5.7 Poisson Integration
5.8 A Worked-Out Example: the case F_t = M_t
5.9 The distribution function of X_T = ∫_0^T g(t) dN_t

6 Stochastic Differentiation
6.1 Differentiation Rules
6.2 Basic Rules
6.3 Ito's Formula
6.3.1 Ito diffusions
6.3.2 Ito's formula for Poisson processes
6.3.3 Ito's multidimensional formula

7 Stochastic Integration Techniques
7.1 Notational Conventions
7.2 Stochastic Integration by Parts
7.3 The Heat Equation Method
7.4 Table of Usual Stochastic Integrals

8 Stochastic Differential Equations
8.1 Definitions and Examples
8.2 Finding Mean and Variance from the Equation
8.3 The Integration Technique
8.4 Exact Stochastic Equations
8.5 Integration by Inspection
8.6 Linear Stochastic Differential Equations
8.7 The Method of Variation of Parameters
8.8 Integrating Factors
8.9 Existence and Uniqueness

9 Applications of Brownian Motion
9.1 The Directional Derivative
9.2 The Generator of an Ito Diffusion
9.3 Dynkin's Formula
9.4 Kolmogorov's Backward Equation
9.5 Exit Time from an Interval
9.6 Transience and Recurrence of Brownian Motion
9.7 Application to Parabolic Equations
10 Martingales and Girsanov's Theorem
10.1 Examples of Martingales
10.2 Girsanov's Theorem

Part I

Stochastic Calculus

Chapter 1

Introduction

1.1 A Few Introductory Problems

Deterministic Calculus is an excellent tool for modeling real-life problems; however, when it comes to randomness, Stochastic Calculus is the one that can model the problem more accurately. In real-life applications involving trajectories, measurements, noisy signals, etc., the effects of many unpredictable factors can be averaged out, via the Central Limit Theorem, as a normal random variable. This is related to the Brownian motion, which was introduced to model the irregular movements of pollen grains in a liquid. In the following we shall discuss a few problems involving random perturbations that can be modeled by Stochastic Calculus.

1.1.1 Stochastic population growth models

Exponential growth model Let P(t) denote the population at time t. In the time interval ∆t the population increases by the amount ∆P(t) = P(t + ∆t) − P(t). The classical model of population growth suggests that the relative increase in population is proportional to the time interval, i.e.

∆P(t)/P(t) = r ∆t,

where the constant r > 0 denotes the population growth rate. Allowing for infinitesimal time intervals, the aforementioned equation becomes

dP(t) = r P(t) dt.

This differential equation has the solution P(t) = P_0 e^{rt}, where P_0 is the initial population size. The evolution of the population is driven by its growth rate r. In real life this rate is not constant. It might be a function of time t or, even more generally, it might oscillate irregularly around some deterministic average a(t):

r_t = a(t) + "noise".

In this case, r_t becomes a random variable indexed over time t.
The associated equation becomes a stochastic differential equation

dP(t) = (a(t) + "noise") P(t) dt.    (1.1.1)

Solving an equation of type (1.1.1) is a problem of Stochastic Calculus.

Logistic growth model The previous exponential growth model allows the population to increase indefinitely. However, due to competition and limited space or resources, the population will increase more and more slowly. This model was introduced by P. F. Verhulst in 1838 and rediscovered by R. Pearl in the twentieth century. The main assumption of the model is that the amount of competition is proportional to the number of encounters between the population members, which in turn is proportional to the square of the population size:

dP(t) = r P(t) dt − k P(t)² dt.    (1.1.2)

The solution is given by the logistic function

P(t) = P_0 K / (P_0 + (K − P_0) e^{−rt}),

where K = r/k is the saturation level of the population. One of the stochastic variants of equation (1.1.2) is given by

dP(t) = r P(t) dt − k P(t)² dt + β("noise") P(t),

where β ∈ R is a measure of the size of the noise in the system. This equation is used to model the growth of a population in a stochastic, crowded environment.

1.1.2 Pricing zero-coupon bonds

A bond is a financial instrument which pays back at the end of its lifetime, T, an amount equal to B, and provides some periodic payments, called coupons. If the coupons are equal to zero, the bond is called a zero-coupon bond or a discount bond. Using the time value of money, the price of the bond at time t is

B(t) = B e^{−r(T−t)},

where r is the risk-free interest rate. The bond satisfies the ordinary differential equation dB(t) = r B(t) dt with the final condition B(T) = B. In a "noisy" market the constant interest rate r is replaced by r_t = r(t) + "noise", a fact that makes the bond pricing more complicated. This can be handled by Stochastic Calculus.
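Equations of type (1.1.1) can already be explored numerically. Below is a minimal Euler–Maruyama sketch, with Brownian increments playing the role of the "noise"; the function name, the constant noise size sigma, and the discretization choices are illustrative assumptions, not constructions made in the text.

```python
import math
import random

def simulate_growth(P0, a, sigma, T=1.0, n=1000, seed=0):
    """Euler-Maruyama sketch for dP = a(t) P dt + sigma P dB_t,
    one way to make the "noise" in an equation like (1.1.1) precise.
    The scheme and the constant noise size sigma are assumptions."""
    rng = random.Random(seed)
    dt = T / n
    P = P0
    for k in range(n):
        dB = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment over dt
        P += a(k * dt) * P * dt + sigma * P * dB
    return P

# With sigma = 0 the scheme reduces to the deterministic model and
# should approach P0 * e^{a T} for a constant rate a.
deterministic = simulate_growth(100.0, lambda t: 0.05, 0.0)
```

With a positive sigma, repeated runs with different seeds give a spread of terminal populations around the deterministic value.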
Stochastic pendulum The free oscillations of a simple pendulum of unit mass can be described by the nonlinear equation

θ̈(t) = −k² sin θ(t),

where θ(t) is the angle between the string and the vertical direction. If the pendulum is moving under the influence of a time-dependent exterior force F = F(t), then the equation of the pendulum with forced oscillations is given by

θ̈(t) + k² sin θ(t) = F(t).

We may encounter the situation in which the force is not deterministic and we have F(t) = f(t) + ("noise"). How does the noisy force influence the angle deviation θ(t)? Stochastic Calculus can be used to answer this question.

1.1.3 Diffusions of particles

Consider a flowing fluid with the velocity field v(x). A particle that moves with the fluid has a trajectory φ(t) described by the equation φ′(t) = v(φ(t)). A small particle, which is also subject to molecular bombardments, will be described by an equation of the type

φ′(t) = v(φ(t)) + σ("noise"),

where the constant σ > 0 determines the size of the noise and controls the diffusion of the small particle in the fluid.

1.1.4 White noise

All the aforementioned problems involved a "noise" influence. This noise is directly related to the trajectory of a small particle which diffuses in a liquid due to molecular bombardments (just consider the last example in the case of a static fluid, v = 0). This type of motion was first observed by R. Brown in 1827; it is called Brownian motion and will be denoted by B_t. It is a very irregular, continuous trajectory, which from the mathematical point of view is nowhere differentiable. The adjective "white" comes from signal processing, where it refers to the fact that the noise is completely unpredictable, i.e. it is not biased towards any specific "frequency".¹ If N_t denotes the "white noise" at time t, the trajectory φ(t) of a diffused particle satisfies

φ′(t) = σ N_t, t ≥ 0.

The solution is a scaled Brownian motion starting at x, i.e. φ(t) = x + σ B_t.
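The solution φ(t) = x + σ B_t can be sampled on a time grid from independent normal increments; a minimal sketch (the function name and grid choices are illustrative):

```python
import math
import random
import statistics

def brownian_path(T=1.0, n=100, x0=0.0, sigma=1.0, seed=0):
    """Sample phi(t) = x0 + sigma * B_t on an n-step grid over [0, T],
    building B_t from independent N(0, dt) increments."""
    rng = random.Random(seed)
    dt = T / n
    path = [x0]
    for _ in range(n):
        path.append(path[-1] + sigma * rng.gauss(0.0, math.sqrt(dt)))
    return path

# Across many independent paths, the endpoint phi(T) has mean x0 and
# variance sigma**2 * T, consistent with phi(T) = x0 + sigma * B_T.
endpoints = [brownian_path(seed=s)[-1] for s in range(2000)]
endpoint_mean = statistics.fmean(endpoints)
endpoint_var = statistics.pvariance(endpoints)
```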
Therefore white noise is the instantaneous rate of change of the Brownian motion, and can be written informally as

N_t = dB_t / dt.

This looks contradictory, since B_t is not differentiable. However, there is a way of making sense of the previous formula by considering the derivative in the following "generalized sense":

∫_R N_t f(t) dt = − ∫_R B_t f′(t) dt,

for any compactly supported, smooth function f.

¹ The white light is an equal mixture of radiations of all visible frequencies.

1.2 Bounded Variation

Let f : [a, b] → R be a continuously differentiable function, and consider the partition a = x_0 < x_1 < · · · < x_n = b of the interval [a, b]. A smooth curve y = f(x), a ≤ x ≤ b, is rectifiable (has length) if the sum of the lengths of the line segments with vertices at P_0(x_0, f(x_0)), ..., P_n(x_n, f(x_n)) is bounded by a given constant, independent of the number n and the choice of the division points x_i. Assuming the division equidistant, ∆x = (b − a)/n, the curve length is given by

ℓ = sup_{x_i} Σ_{k=0}^{n−1} |P_k P_{k+1}| = sup_{x_i} Σ_{k=0}^{n−1} √((∆x)² + (∆f)²).

Furthermore, if f is continuously differentiable, the computation can be continued as

ℓ = lim_{n→∞} Σ_{k=0}^{n−1} √(1 + (∆f/∆x)²) ∆x = ∫_a^b √(1 + f′(x)²) dx,

where we used that the limit of an increasing sequence is equal to its superior limit.

Definition 1.2.1 The function f(x) has bounded variation on the interval [a, b] if for any division a = x_0 < x_1 < · · · < x_n = b the sum

Σ_{k=0}^{n−1} |f(x_{k+1}) − f(x_k)|

is bounded above by a given constant. The total variation of f on [a, b] is defined by

V(f) = sup_{x_i} Σ_{k=0}^{n−1} |f(x_{k+1}) − f(x_k)|.    (1.2.3)

The amount V(f) measures, in a certain sense, the "roughness" of the function. If f is a constant function, then V(f) = 0. If f is a stair-type function, then V(f) is the sum of the absolute values of its jumps.
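The variation sum (1.2.3), and the quadratic variation introduced later in this section, can be computed for a function sampled on a fixed division. A small numerical sketch (the function names are illustrative); the Brownian-path experiment anticipates the quadratic variation result quoted at the end of the section:

```python
import math
import random

def total_variation(values):
    """Sum of |f(x_{k+1}) - f(x_k)| over a division, as in (1.2.3)."""
    return sum(abs(b - a) for a, b in zip(values, values[1:]))

def quadratic_variation(values):
    """Sum of squared increments over the same division."""
    return sum((b - a) ** 2 for a, b in zip(values, values[1:]))

# A smooth function: the total variation converges to a finite value
# while the quadratic variation shrinks toward 0 under refinement.
smooth = [math.sin(2 * math.pi * k / 10_000) for k in range(10_001)]

# A simulated Brownian path on [0, 1]: its quadratic variation stays
# close to the length of the time interval, here 1.
rng = random.Random(0)
n = 100_000
bm = [0.0]
for _ in range(n):
    bm.append(bm[-1] + rng.gauss(0.0, math.sqrt(1.0 / n)))
```

For a staircase-like sequence the total variation is just the sum of the jump sizes, matching the remark above.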
We note that if f is continuously differentiable, then the total variation can be written as an integral:

V(f) = sup_{x_i} Σ_{k=0}^{n−1} |f(x_{k+1}) − f(x_k)| = lim_{n→∞} Σ_{k=0}^{n−1} (|f(x_{k+1}) − f(x_k)| / (x_{k+1} − x_k)) ∆x = ∫_a^b |f′(x)| dx.

The next result states a relation between the length of the graph and the total variation of a function f.

Proposition 1.2.2 Let f : [a, b] → R be a function. Then the graph y = f(x) has length if and only if V(f) < ∞.

Proof: Consider the simplifying notations (∆f)_k = f(x_{k+1}) − f(x_k) and ∆x = x_{k+1} − x_k. Taking the summation in the double inequality

|(∆f)_k| ≤ √((∆x)² + |(∆f)_k|²) ≤ ∆x + |(∆f)_k|

and then applying the "sup" yields

V(f) ≤ ℓ ≤ (b − a) + V(f),

which implies the desired conclusion.

In virtue of the previous result, functions with infinite total variation have graphs of infinite length. The following informal computation shows that the Brownian motion has infinite total variation:

V(B_t) = sup_{t_k} Σ_{k=0}^{n−1} |B_{t_{k+1}} − B_{t_k}| = ∫_a^b |dB_t/dt| dt = ∫_a^b |N_t| dt = ∞,

since the area under the curve t → |N_t| is infinite.

We can try to capture a finer "roughness" of the function using the quadratic variation of f on the interval [a, b]:

V^(2)(f) = sup_{x_i} Σ_{k=0}^{n−1} |f(x_{k+1}) − f(x_k)|²,    (1.2.4)

where the "sup" is taken over all divisions a = x_0 < x_1 < · · · < x_n = b. It is worth noting that if f has bounded total variation, V(f) < ∞, then V^(2)(f) = 0. This comes from the following inequality:

V^(2)(f) ≤ max_{x_k} |f(x_{k+1}) − f(x_k)| · V(f) → 0, as |∆x| → 0.

The total variation of the Brownian motion does not provide much information. It turns out that the correct measure of the roughness of the Brownian motion is the quadratic variation

V^(2)(B_t) = sup_{t_i} Σ_{k=0}^{n−1} |B(t_{k+1}) − B(t_k)|².    (1.2.5)

It will be shown that V^(2)(B_t) is equal to the time interval b − a.

Chapter 2

Basic Notions

2.1 Probability Space

The modern theory of probability stems from the work of A. N.
Kolmogorov, published in 1933. Kolmogorov associates a random experiment with a probability space, which is a triplet (Ω, F, P) consisting of the set of outcomes, Ω, a σ-field, F, with Boolean algebra properties, and a probability measure, P. In the following sections, each of these elements will be discussed in more detail.

2.2 Sample Space

A random experiment in the theory of probability is an experiment whose outcomes cannot be determined in advance. When an experiment is performed, all possible outcomes form a set called the sample space, which will be denoted by Ω. For instance, flipping a coin produces the sample space with two states Ω = {H, T}, while rolling a die yields a sample space with six states Ω = {1, ..., 6}. Choosing randomly a number between 0 and 1 corresponds to a sample space which is the entire segment Ω = (0, 1).

In financial markets one can regard Ω as the set of states of the world, understanding by this all possible states the world might have. The number of states of the world that affect the stock market is huge. These would contain all possible values for the vector of parameters that describe the world, which is practically infinite.

All subsets of the sample space Ω form a set denoted by 2^Ω. The reason for this notation is that the set of parts of Ω can be put into a bijective correspondence with the set of binary functions f : Ω → {0, 1}. The number of elements of this set is 2^{|Ω|}, where |Ω| denotes the cardinality of Ω. If the set is finite, |Ω| = n, then 2^Ω has 2^n elements. If Ω is countably infinite (i.e. it can be put into a bijective correspondence with the set of natural numbers), then 2^Ω is infinite and its cardinality is the same as that of the set of real numbers R.

Remark 2.2.1 Pick a natural number at random. Any subset of the sample space corresponds to a sequence formed with 0 and 1. For instance, the subset {1, 3, 5, 6} corresponds to the sequence 10101100000... having 1 on the 1st, 3rd, 5th and 6th places and 0 elsewhere.
It is known that the set of these sequences is infinite and can be put into a bijective correspondence with the set of real numbers R. This can also be written as |2^N| = |R|, and stated by saying that the set of all subsets of the natural numbers N has the same cardinality as the set of real numbers R.

2.3 Events and Probability

The set of parts 2^Ω satisfies the following properties:

1. It contains the empty set Ø;
2. If it contains a set A, then it also contains its complement Ā = Ω\A;
3. It is closed under unions, i.e., if A_1, A_2, ... is a sequence of sets, then their union A_1 ∪ A_2 ∪ · · · also belongs to 2^Ω.

Any subset F of 2^Ω that satisfies the previous three properties is called a σ-field. The sets belonging to F are called events. This way, the complement of an event, or the union of events, is also an event. We say that an event occurs if the outcome of the experiment is an element of that subset.

The chance of occurrence of an event is measured by a probability function P : F → [0, 1] which satisfies the following two properties:

1. P(Ω) = 1;
2. For any mutually disjoint events A_1, A_2, · · · ∈ F,
   P(A_1 ∪ A_2 ∪ · · ·) = P(A_1) + P(A_2) + · · ·.

The triplet (Ω, F, P) is called a probability space. This is the main setup in which probability theory works.

Example 2.3.1 In the case of flipping a coin, the probability space has the following elements: Ω = {H, T}, F = {Ø, {H}, {T}, {H, T}} and P defined by P(Ø) = 0, P({H}) = 1/2, P({T}) = 1/2, P({H, T}) = 1.

Example 2.3.2 Consider a finite sample space Ω = {s_1, ..., s_n}, with the σ-field F = 2^Ω, and probability given by P(A) = |A|/n, ∀A ∈ F. Then (Ω, F, P) is a probability space.

Figure 2.1: If any set X^{−1}((a, b)) is "known", then the random variable X : Ω → R is 2^Ω-measurable.

Example 2.3.3 Let Ω = [0, 1] and consider the σ-field B([0, 1]) generated by the open and closed intervals of [0, 1] through unions, intersections, and complements.
Define P(A) = λ(A), where λ stands for the Lebesgue measure (in particular, if A = (a, b), then P(A) = b − a is the length of the interval). It can be shown that (Ω, B([0, 1]), P) is a probability space.

2.4 Random Variables

Since the σ-field F provides the knowledge about which events are possible on the considered probability space, F can be regarded as the information component of the probability space (Ω, F, P). A random variable X is a function that assigns a numerical value to each state of the world, X : Ω → R, such that the values taken by X are known to someone who has access to the information F. More precisely, given any two numbers a, b ∈ R, all the states of the world for which X takes values between a and b form a set that is an event (an element of F), i.e.

{ω ∈ Ω; a < X(ω) < b} ∈ F.

Another way of saying this is that X is an F-measurable function.

Example 2.4.1 Let X(ω) be the number of people who want to buy houses, given the state of the market ω. Is X measurable? This would mean that, given two numbers, say a = 10,000 and b = 50,000, we know all the market situations ω for which there are at least 10,000 and at most 50,000 people willing to purchase houses. In theory it often makes sense to assume that we have enough knowledge to consider X measurable.

Example 2.4.2 Consider the experiment of flipping three coins. In this case Ω is the set of all possible triplets. Consider the random variable X which gives the number of tails obtained. For instance X(HHH) = 0, X(HHT) = 1, etc. The sets

{ω; X(ω) = 0} = {HHH}, {ω; X(ω) = 3} = {TTT},
{ω; X(ω) = 1} = {HHT, HTH, THH}, {ω; X(ω) = 2} = {HTT, THT, TTH}

belong to 2^Ω, and hence X is a random variable.

2.5 Integration in Probability Measure

The notion of expectation is based on integration on measure spaces. In this section we briefly recall the definition of an integral with respect to the probability measure P.
Let X : Ω → R be a random variable on the probability space (Ω, F, P). A partition (Ω_i)_{1≤i≤n} of Ω is a family of subsets Ω_i ⊂ Ω, with Ω_i ∈ F, satisfying

1. Ω_i ∩ Ω_j = Ø, for i ≠ j;
2. ∪_{i=1}^n Ω_i = Ω.

Each Ω_i is an event and its associated probability is P(Ω_i). Consider the characteristic function of a set A ⊂ Ω, defined by χ_A(ω) = 1 if ω ∈ A, and χ_A(ω) = 0 if ω ∉ A. More properties of χ_A can be found in Exercise 2.10.8.

A simple function is a sum of characteristic functions, f = Σ_{i=1}^n c_i χ_{Ω_i}. This means f(ω) = c_k for ω ∈ Ω_k. The integral of the simple function f is defined by

∫_Ω f dP = Σ_{i=1}^n c_i P(Ω_i).

If X : Ω → R is a random variable, then from measure theory it is known that there is a sequence of simple functions (f_n)_{n≥1} satisfying lim_{n→∞} f_n(ω) = X(ω). Furthermore, if X ≥ 0, then we may assume that f_n ≤ f_{n+1}. Then, from the monotone convergence theorem, we can write

∫_Ω X dP = lim_{n→∞} ∫_Ω f_n dP.

If X is not non-negative, we can write it as X = X_1 − X_2 with X_i ≥ 0 and define

∫_Ω X dP = ∫_Ω X_1 dP − ∫_Ω X_2 dP.

From now on, the integral notations ∫_Ω X dP and ∫_Ω X(ω) dP(ω) will be used interchangeably. In the rest of the chapter the integral notation will be used informally, without requiring a direct use of the previous definition.

2.6 Distribution Functions

Let X be a random variable on the probability space (Ω, F, P). The distribution function of X is the function F_X : R → [0, 1] defined by

F_X(x) = P(ω; X(ω) ≤ x).

It is worth observing that, since X is a random variable, the set {ω; X(ω) ≤ x} belongs to the information set F. The distribution function is non-decreasing and satisfies the limits

lim_{x→−∞} F_X(x) = 0, lim_{x→+∞} F_X(x) = 1.

If we have

dF_X(x)/dx = p(x),

then we say that p(x) is the probability density function of X. It is important to note the following relation among the distribution function, probability and probability density function of the random variable X:
F_X(x) = P(X ≤ x) = ∫_{X≤x} dP(ω) = ∫_{−∞}^x p(u) du.    (2.6.1)

The probability density function p(x) has the following properties:
(i) p(x) ≥ 0;
(ii) ∫_{−∞}^∞ p(u) du = 1.

The first is a consequence of the fact that the distribution function F_X(x) is non-decreasing. The second follows from (2.6.1) by making x → ∞:

∫_{−∞}^∞ p(u) du = ∫_Ω dP = P(Ω) = 1.

As an extension of formula (2.6.1), we have for any measurable function h

∫_Ω h(X(ω)) dP(ω) = ∫_R h(x) p(x) dx.    (2.6.2)

Another useful property, which follows from the Fundamental Theorem of Calculus, is

P(a < X < b) = P(ω; a < X(ω) < b) = ∫_a^b p(x) dx.

In the case of discrete random variables the aforementioned integral is replaced by the sum

P(a < X < b) = Σ_{a<x<b} P(X = x).

For more details the reader is referred to a traditional probability book, such as Wackerly et al. [?].

2.7 Independence

Roughly speaking, two random variables X and Y are independent if the occurrence of one of them does not change the probability density of the other. More precisely, if for any two open intervals A, B ⊂ R the events

E = {ω; X(ω) ∈ A}, F = {ω; Y(ω) ∈ B}

are independent, i.e. P(E ∩ F) = P(E)P(F), then X and Y are called independent random variables.

Proposition 2.7.1 Let X and Y be independent random variables with density functions p_X(x) and p_Y(y). Then the joint density function of (X, Y) is given by p_{X,Y}(x, y) = p_X(x) p_Y(y).

Proof: Using the independence of sets, we have¹

p_{X,Y}(x, y) dxdy = P(x < X < x + dx, y < Y < y + dy)
= P(x < X < x + dx) P(y < Y < y + dy)
= p_X(x) dx p_Y(y) dy = p_X(x) p_Y(y) dxdy.

Dropping the factor dxdy yields the desired result. We note that the converse also holds true.

The σ-algebra generated by a random variable X : Ω → R is the σ-algebra generated by the unions, intersections and complements of events of the form {ω; X(ω) ∈ (a, b)}, with a < b real numbers. It will be denoted by A_X. Two σ-fields G and H included in F are called independent if

P(G ∩ H) = P(G)P(H), ∀G ∈ G, H ∈ H.
The random variable X and the σ-field G are called independent if the σ-algebras A_X and G are independent.

2.8 Expectation

A random variable X : Ω → R is called integrable if

∫_Ω |X(ω)| dP(ω) = ∫_R |x| p(x) dx < ∞,

where p(x) denotes the probability density function of X. The previous identity is based on changing the domain of integration from Ω to R.

¹ We are using the useful approximation P(x < X < x + dx) = ∫_x^{x+dx} p(u) du = p(x) dx.

The expectation of an integrable random variable X is defined by

E[X] = ∫_Ω X(ω) dP(ω) = ∫_R x p(x) dx.

Customarily, the expectation of X is denoted by µ and is also called the mean. In general, for any measurable function h : R → R, we have

E[h(X)] = ∫_Ω h(X(ω)) dP(ω) = ∫_R h(x) p(x) dx.

In the case of a discrete random variable X the expectation is defined as

E[X] = Σ_{k≥1} x_k P(X = x_k).

Proposition 2.8.1 The expectation operator E is linear, i.e. for any integrable random variables X and Y:
1. E[cX] = c E[X], ∀c ∈ R;
2. E[X + Y] = E[X] + E[Y].

Proof: It follows from the fact that the integral is a linear operator.

Proposition 2.8.2 Let X and Y be two independent integrable random variables. Then E[XY] = E[X]E[Y].

Proof: This is a variant of Fubini's theorem, which in this case states that a double integral is a product of two simple integrals. Let p_X, p_Y, p_{X,Y} denote the probability densities of X, Y and (X, Y), respectively. Since X and Y are independent, by Proposition 2.7.1 we have

E[XY] = ∫∫ xy p_{X,Y}(x, y) dxdy = ∫ x p_X(x) dx ∫ y p_Y(y) dy = E[X]E[Y].

Definition 2.8.3 The covariance of two random variables is defined by

Cov(X, Y) = E[XY] − E[X]E[Y].

The variance of X is given by Var(X) = Cov(X, X).

Proposition 2.8.2 states that if X and Y are independent, then Cov(X, Y) = 0. It is worth noting that the converse is not necessarily true; see Exercise 2.8.6.
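Proposition 2.8.2 is easy to check by Monte Carlo. The sketch below uses two independent uniform(0, 1) draws, for which E[X] = E[Y] = 1/2 and hence E[XY] = 1/4 (the sample size and variable names are illustrative):

```python
import random
import statistics

# Monte Carlo check of E[XY] = E[X] E[Y] for independent draws.
rng = random.Random(0)
pairs = [(rng.random(), rng.random()) for _ in range(200_000)]

mean_x = statistics.fmean(x for x, _ in pairs)
mean_y = statistics.fmean(y for _, y in pairs)
mean_xy = statistics.fmean(x * y for x, y in pairs)
# mean_xy should be close to 1/4 and to mean_x * mean_y.
```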
Exercise 2.8.4 Show that
(a) Cov(X, Y) = E[(X − µ_X)(Y − µ_Y)], where µ_X = E[X] and µ_Y = E[Y];
(b) Var(X) = E[(X − µ_X)²].

From Exercise 2.8.4 (b) we have Var(X) ≥ 0, so there is a real number σ ≥ 0 such that Var(X) = σ². The number σ is called the standard deviation.

Exercise 2.8.5 Let µ and σ denote the mean and the standard deviation of the random variable X. Show that E[X²] = µ² + σ².

Exercise 2.8.6 Consider two random variables with the following table of joint probabilities:

Y\X    −1      0      1
−1    1/16   3/16   1/16
 0    3/16    0     3/16
 1    1/16   3/16   1/16

Show the following:
(a) E[X] = E[Y] = E[XY] = 0;
(b) Cov(X, Y) = 0;
(c) P(0, 0) ≠ P_X(0) P_Y(0);
(d) X and Y are not independent.

The covariance can be standardized in the following way. Let σ_X and σ_Y be the standard deviations of X and Y, respectively. The correlation coefficient of X and Y is defined as

ρ(X, Y) = Cov(X, Y) / (σ_X σ_Y).

Example 2.8.1 (a) Show the inequality −1 ≤ ρ(X, Y) ≤ 1.
(b) What can you say about the random variables X and Y if ρ(X, Y) = 1?

2.8.1 The best approximation of a random variable

Let X be a random variable. We would like to approximate X by a single (nonrandom) number x. The "best" value of x is chosen in the sense of "least squares", i.e. x is picked such that the expectation of the squared error (X − x)² is minimal. Denote µ = E[X] and σ² = Var(X). Since

E[(X − x)²] = E[X²] − 2xE[X] + x² = σ² + µ² − 2xµ + x² = σ² + (x − µ)²,

the minimum is obtained for x = µ, and in this case

min_x E[(X − x)²] = σ².

It follows that the mean is the best approximation of the random variable X in the least squares sense.

2.8.2 Change of measure in an expectation

Let P, Q : F → R be two probability measures on Ω such that there is an integrable random variable f : Ω → R with dQ = f dP. This means

Q(A) = ∫_A dQ = ∫_A f(ω) dP(ω), ∀A ∈ F.

Denote by E^P and E^Q the expectations with respect to the measures P and Q, respectively.
Then we have
E^Q[X] = \int_\Omega X(\omega)\, dQ(\omega) = \int_\Omega X(\omega) f(\omega)\, dP(\omega) = E^P[fX].

Exercise 2.8.7 Let g : [0, 1] → [0, ∞) be an integrable function with \int_0^1 g(x)\, dx = 1. Consider Q : B([0, 1]) → R, given by Q(A) = \int_A g(x)\, dx. Show that Q is a probability measure on (Ω = [0, 1], B([0, 1])).

2.9 Basic Distributions

We shall recall a few basic distributions, which are most often seen in applications.

Normal distribution A random variable X is said to have a normal distribution if its probability density function is given by
p(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2/(2\sigma^2)},
with µ and σ > 0 constant parameters, see Fig. 2.2a. The mean and variance are given by
E[X] = µ, \qquad Var[X] = σ^2.
If X has a normal distribution with mean µ and variance σ^2, we shall write X ∼ N(µ, σ^2).

Exercise 2.9.1 Let α, β ∈ R. Show that if X is normally distributed, with X ∼ N(µ, σ^2), then Y = αX + β is also normally distributed, with Y ∼ N(αµ + β, α^2σ^2).

Log-normal distribution Let X be normally distributed with mean µ and variance σ^2. Then the random variable Y = e^X is said to be log-normally distributed. The mean and variance of Y are given by
E[Y] = e^{\mu + \sigma^2/2}, \qquad Var[Y] = e^{2\mu + \sigma^2}(e^{\sigma^2} - 1).
The density function of the log-normally distributed random variable Y is given by
p(x) = \frac{1}{x\sigma\sqrt{2\pi}}\, e^{-(\ln x - \mu)^2/(2\sigma^2)}, \quad x > 0,
see Fig. 2.2b.

Definition 2.9.2 The moment generating function of a random variable X is the function
m_X(t) = E[e^{tX}] = \int e^{tx} p(x)\, dx,
where p(x) is the probability density function of X, provided the integral exists. The name comes from the fact that the moments of X, given by µ_n = E[X^n], are generated as derivatives of m_X(t):
\frac{d^n}{dt^n} m_X(t)\Big|_{t=0} = \mu_n.
It is worth noting the relation between the Laplace transform and the moment generating function:
L(p(x))(t) = \int e^{-tx} p(x)\, dx = m_X(-t).

Exercise 2.9.3 Find the moment generating function for the exponential distribution p(x) = λe^{-λx}, x ≥ 0, λ > 0.
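The log-normal mean and variance formulas above can be sanity-checked numerically. The sketch below (not from the text) uses plain Python with an ad-hoc midpoint-rule quadrature; the parameters µ = 0.2, σ = 0.5 and the integration cutoff are illustrative choices.

```python
import math

def lognormal_pdf(x, mu, sigma):
    """Density of Y = e^X with X ~ N(mu, sigma^2)."""
    return math.exp(-(math.log(x) - mu) ** 2 / (2 * sigma ** 2)) \
        / (x * sigma * math.sqrt(2 * math.pi))

def moment(k, mu, sigma, hi=200.0, n=100_000):
    """Midpoint-rule approximation of E[Y^k] = int_0^hi x^k p(x) dx."""
    h = hi / n
    return h * sum((i * h + h / 2) ** k * lognormal_pdf(i * h + h / 2, mu, sigma)
                   for i in range(n))

mu, sigma = 0.2, 0.5                       # illustrative parameters
m1, m2 = moment(1, mu, sigma), moment(2, mu, sigma)
var = m2 - m1 ** 2
# Compare with E[Y] = exp(mu + sigma^2/2) and Var[Y] = exp(2mu+sigma^2)(exp(sigma^2)-1).
```

The cutoff at 200 is harmless here because the integrand is negligible beyond roughly ten standard deviations of ln x.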
Exercise 2.9.4 Show that if X and Y are two independent random variables, then
m_{X+Y}(t) = m_X(t)\, m_Y(t).

Figure 2.2: a Normal distribution; b Log-normal distribution; c Gamma distributions; d Beta distributions.

Exercise 2.9.5 Given that the moment generating function of a normally distributed random variable X ∼ N(µ, σ^2) is m(t) = E[e^{tX}] = e^{\mu t + t^2\sigma^2/2}, show that
(a) E[Y^n] = e^{n\mu + n^2\sigma^2/2}, where Y = e^X;
(b) the mean and variance of the log-normal random variable Y = e^X are
E[Y] = e^{\mu + \sigma^2/2}, \qquad Var[Y] = e^{2\mu + \sigma^2}(e^{\sigma^2} − 1).

Gamma distribution A random variable X is said to have a gamma distribution with parameters α > 0, β > 0 if its density function is given by
p(x) = \frac{x^{\alpha-1} e^{-x/\beta}}{\beta^\alpha \Gamma(\alpha)}, \quad x ≥ 0,
where Γ(α) denotes the gamma function, Γ(α) = \int_0^\infty y^{\alpha-1} e^{-y}\, dy; if α = n is an integer, then Γ(n) = (n − 1)!. See Fig. 2.2c. The mean and variance are
E[X] = αβ, \qquad Var[X] = αβ^2.
The case α = 1 is known as the exponential distribution, see Fig. 2.3a. In this case
p(x) = \frac{1}{\beta}\, e^{-x/\beta}, \quad x ≥ 0.

Figure 2.3: a Exponential distribution; b Poisson distribution.

The particular case when α = n/2 and β = 2 becomes the χ^2-distribution with n degrees of freedom. This is also the distribution of a sum of squares of n independent standard normal random variables.

Beta distribution A random variable X is said to have a beta distribution with parameters α > 0, β > 0 if its probability density function is of the form
p(x) = \frac{x^{\alpha-1}(1-x)^{\beta-1}}{B(\alpha, \beta)}, \quad 0 ≤ x ≤ 1,
where B(α, β) denotes the beta function, given by the two equivalent formulas
B(\alpha, \beta) = \frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha+\beta)} = \int_0^1 y^{\alpha-1}(1-y)^{\beta-1}\, dy.
See Fig. 2.2d for two particular density functions. In this case
E[X] = \frac{\alpha}{\alpha+\beta}, \qquad Var[X] = \frac{\alpha\beta}{(\alpha+\beta)^2(\alpha+\beta+1)}.
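The gamma moments E[X] = αβ and Var[X] = αβ² can also be checked by simulation. This sketch (not from the text) relies on the standard library's `random.gammavariate`, whose parameterization matches the density above; the seed, sample size and parameters α = 3, β = 2 are arbitrary choices.

```python
import random
import statistics

random.seed(7)                    # fixed seed so the run is reproducible
alpha, beta = 3.0, 2.0            # illustrative parameters
draws = [random.gammavariate(alpha, beta) for _ in range(200_000)]

mean = statistics.fmean(draws)              # theory: alpha * beta = 6
var = statistics.pvariance(draws, mean)     # theory: alpha * beta^2 = 12
```

With 200,000 draws the sample mean is typically within about 0.01 of the true value, so a mismatch here would signal a real error rather than sampling noise.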
Poisson distribution A discrete random variable X is said to have a Poisson probability distribution if
P(X = k) = \frac{\lambda^k}{k!}\, e^{-\lambda}, \quad k = 0, 1, 2, \dots,
with λ > 0 a parameter, see Fig. 2.3b. In this case E[X] = λ and Var[X] = λ.

Pearson 5 distribution Let α, β > 0. A random variable X with the density function
p(x) = \frac{1}{\beta\Gamma(\alpha)}\, \frac{e^{-\beta/x}}{(x/\beta)^{\alpha+1}}, \quad x > 0,
is said to have a Pearson 5 distribution with positive parameters α and β. (The Pearson family of distributions was designed by Pearson between 1890 and 1895; there are several Pearson distributions, this one being distinguished by the number 5.) It can be shown that
E[X] = \frac{\beta}{\alpha-1} if α > 1, and ∞ otherwise;
Var(X) = \frac{\beta^2}{(\alpha-1)^2(\alpha-2)} if α > 2, and ∞ otherwise.
The mode of this distribution is equal to β/(α + 1).

The Inverse Gaussian distribution Let µ, λ > 0. A random variable X has an inverse Gaussian distribution with parameters µ and λ if its density function is given by
p(x) = \sqrt{\frac{\lambda}{2\pi x^3}}\, e^{-\frac{\lambda(x-\mu)^2}{2\mu^2 x}}, \quad x > 0. \qquad (2.9.3)
We shall write X ∼ IG(µ, λ). Its mean, variance and mode are given by
E[X] = µ, \qquad Var(X) = \frac{\mu^3}{\lambda}, \qquad Mode(X) = \mu\Big(\sqrt{1 + \frac{9\mu^2}{4\lambda^2}} - \frac{3\mu}{2\lambda}\Big),
where the mode denotes the value x_m for which p(x) is maximal, i.e., p(x_m) = max_x p(x). This distribution will be used to model the first time a Brownian motion with drift exceeds a certain barrier.

2.10 Conditional Expectations

Let X be a random variable on the probability space (Ω, F, P), and let G be a σ-field contained in F. Since X is F-measurable, the expectation of X given the information F must be X itself, a fact that can be written as E[X|F] = X (for details see Example 2.10.5). On the other hand, the information G does not completely determine X. The random variable that makes a prediction for X based on the information G is denoted by E[X|G], and is called the conditional expectation of X given G.
This is defined as the random variable Y = E[X|G] which is the best approximation of X in the least squares sense, i.e.,
E[(X − Y)^2] ≤ E[(X − Z)^2], \qquad (2.10.4)
for any G-measurable random variable Z.

The set of all square integrable random variables on Ω forms a Hilbert space with the inner product ⟨X, Y⟩ = E[XY]. This defines the norm ∥X∥^2 = E[X^2], which induces the distance d(X, Y) = ∥X − Y∥. Denote by SG the set of all G-measurable random variables on Ω. We shall show that the element of SG that is closest to X in the aforementioned distance is the conditional expectation Y = E[X|G].

Let X⊥ denote the orthogonal projection of X on the space SG. This satisfies
E[(X − X⊥)(Z − X⊥)] = 0, \quad ∀Z ∈ SG. \qquad (2.10.5)
The Pythagorean relation
∥X − X⊥∥^2 + ∥Z − X⊥∥^2 = ∥X − Z∥^2, \quad ∀Z ∈ SG,
implies the inequality
∥X − X⊥∥^2 ≤ ∥X − Z∥^2, \quad ∀Z ∈ SG,
which is equivalent to
E[(X − X⊥)^2] ≤ E[(X − Z)^2], \quad ∀Z ∈ SG.
This yields X⊥ = E[X|G], so the conditional expectation is the orthogonal projection of X on the space SG. The uniqueness of this projection is a consequence of the Pythagorean relation.

The orthogonality relation (2.10.5) can be written equivalently as
E[(X − Y)U] = 0, \quad ∀U ∈ SG.
Therefore, the conditional expectation Y satisfies the identity
E[XU] = E[Y U], \quad ∀U ∈ SG.
In particular, if we choose U = χA, the characteristic function of a set A ∈ G, the foregoing relation yields
\int_A X\, dP = \int_A Y\, dP, \quad ∀A ∈ G.
We arrived at the following equivalent characterization of the conditional expectation. The conditional expectation of X given G is a random variable satisfying:
1. E[X|G] is G-measurable;
2. \int_A E[X|G]\, dP = \int_A X\, dP, ∀A ∈ G.

Exercise 2.10.1 Consider the probability space (Ω, F, P), and let G be a σ-field included in F. If X is a G-measurable random variable such that
\int_A X\, dP = 0, \quad ∀A ∈ G,
then X = 0 a.s.
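On a finite probability space this characterization becomes concrete: when G is generated by a partition of Ω, E[X|G] is the random variable obtained by averaging X over each partition block. The sketch below (not from the text; the six-point space, uniform measure, partition and X(ω) = ω² are all illustrative choices) verifies property 2 exactly.

```python
from fractions import Fraction as F

# Finite probability space: Omega = {0,...,5} with uniform measure P.
omega = range(6)
P = {w: F(1, 6) for w in omega}
# G is the sigma-field generated by the partition {0,1}, {2,3}, {4,5}.
partition = [{0, 1}, {2, 3}, {4, 5}]
X = {w: w * w for w in omega}          # illustrative random variable X(w) = w^2

def cond_exp(X, partition):
    """E[X|G]: constant on each block, equal to the block's P-weighted average."""
    Y = {}
    for block in partition:
        avg = sum(P[w] * X[w] for w in block) / sum(P[w] for w in block)
        for w in block:
            Y[w] = avg
    return Y

Y = cond_exp(X, partition)
# Property 2: the integrals of X and E[X|G] agree on every generating block.
for block in partition:
    assert sum(P[w] * X[w] for w in block) == sum(P[w] * Y[w] for w in block)
```

Exact fractions make the defining identity hold with equality rather than up to rounding, which is exactly what the characterization demands.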
It is worth to mention here an equivalent result, which relates to conditional expectations: Theorem 2.10.2 (Radon-Nikodym) Let (Ω, F, P ) be a probability space and G be a σ-field included in F. Then for any random variable X there is a G-measurable random variable Y such that Z Z Y dP, ∀A ∈ G. (2.10.6) X dP = A A 25 Radon-Nikodym’s theorem states the existence of Y . In fact this is unique almost surely by the application of Exercise 2.10.1. Exercise 2.10.3 Show that if G = {Ø, Ω}, then E[X|G] = E[X]. Proof: We need to show that E[X] satisfies conditions 1 and 2. The first one is obviously satisfied since any constant is G-measurable. The latter condition is checked on each set of G. We have Z Z Z E[X]dP dP = X dP = E[X] = E[X] Ω Ω Ω Z Z E[X]dP. X dP = Ø Ø Exercise 2.10.4 Show that E[E[X|G]] = E[X], i.e. all conditional expectations have the same mean, which is the mean of X. Proof: Using the definition of expectation and taking A = Ω in the second relation of the aforementioned definition, yields Z Z XdP = E[X], E[X|G] dP = E[E[X|G]] = Ω Ω which ends the proof. Example 2.10.5 The conditional expectation of X given the total information F is the random variable X itself, i.e. E[X|F] = X. Proof: The random variables X and E[X|F] are both F-measurable (from the definition of the random variable). From the definition of the conditional expectation we have Z Z X dP, ∀A ∈ F. E[X|F] dP = A A Exercise 2.10.1 implies that E[X|F] = X almost surely. General properties of the conditional expectation are stated below without proof. The proof involves more or less simple manipulations of integrals and can be taken as an exercise for the reader. 26 Proposition 2.10.6 Let X and Y be two random variables on the probability space (Ω, F, P ). We have 1. Linearity: E[aX + bY |G] = aE[X|G] + bE[Y |G], ∀a, b ∈ R; 2. Factoring out the measurable part: E[XY |G] = XE[Y |G] if X is G-measurable. In particular, E[X|G] = X. 3. 
Tower property ("the least information wins"):
E[E[X|G]|H] = E[E[X|H]|G] = E[X|H], if H ⊂ G;
4. Positivity: E[X|G] ≥ 0 if X ≥ 0;
5. Expectation of a constant is a constant: E[c|G] = c;
6. An independent condition drops out: E[X|G] = E[X] if X is independent of G.

Exercise 2.10.7 Prove property 3 (the tower property) given in the previous proposition.

Exercise 2.10.8 Let X be a random variable on the probability space (Ω, F, P) which is independent of the σ-field G ⊂ F. Consider the characteristic function of a set A ⊂ Ω, defined by
χA(ω) = 1 if ω ∈ A, and 0 if ω ∉ A.
Show the following:
(a) χA is G-measurable for any A ∈ G;
(b) P(A) = E[χA];
(c) X and χA are independent random variables;
(d) E[χA X] = E[X]P(A) for any A ∈ G;
(e) E[X|G] = E[X].

Figure 2.4: Jensen's inequality ϕ(E[X]) < E[ϕ(X)] for a convex function ϕ.

2.11 Inequalities of Random Variables

This section prepares the reader for the limits of sequences of random variables and limits of stochastic processes. We recall first that an infinitely differentiable function f(x) has a Taylor series at a,
f(x) = f(a) + \frac{f'(a)}{1!}(x-a) + \frac{f''(a)}{2!}(x-a)^2 + \frac{f'''(a)}{3!}(x-a)^3 + \dots,
where x belongs to an interval neighborhood of a.

Exercise 2.11.1 Let f(x) be n + 1 times differentiable on an interval I containing a. Show that for any x ∈ I there is a ξ ∈ I, between a and x, such that
f(x) = f(a) + \frac{f'(a)}{1!}(x-a) + \frac{f''(a)}{2!}(x-a)^2 + \dots + \frac{f^{(n)}(a)}{n!}(x-a)^n + \frac{f^{(n+1)}(\xi)}{(n+1)!}(x-a)^{n+1}.

We shall start with a classical inequality result regarding expectations:

Theorem 2.11.2 (Jensen's inequality) Let ϕ : R → R be a convex function and let X be an integrable random variable on the probability space (Ω, F, P). If ϕ(X) is integrable, then
ϕ(E[X]) ≤ E[ϕ(X)].

Proof: We shall assume ϕ twice differentiable with ϕ'' continuous. Let µ = E[X]. Expanding ϕ in a Taylor series about µ, see Exercise 2.11.1, we get
ϕ(x) = ϕ(µ) + ϕ'(µ)(x − µ) + \frac{1}{2}ϕ''(ξ)(x − µ)^2,
with ξ in between x and µ.
Since ϕ is convex, ϕ′′ ≥ 0, and hence ϕ(x) ≥ ϕ(µ) + ϕ′ (µ)(x − µ), which means the graph of ϕ(x) is above the tangent line at x, ϕ(x) . Replacing x by the random variable X, and taking the expectation yields E[ϕ(X)] ≥ E[ϕ(µ) + ϕ′ (µ)(X − µ)] = ϕ(µ) + ϕ′ (µ)(E[X] − µ) = ϕ(µ) = ϕ(E[X]), which proves the result. Fig.2.4 provides a graphical interpretation of Jensen’s inequality. If the distribution of X is symmetric, then the distribution of ϕ(X) is skewed, with ϕ(E[X]) < E[ϕ(X)]. It is worth noting that the inequality is reversed for ϕ concave. We shall present next a couple of applications. A random variable X : Ω → R is called square integrable if Z Z 2 2 x2 p(x) dx < ∞. |X(ω)| dP (ω) = E[X ] = Ω R Application 2.11.3 If X is a square integrable random variable, then it is integrable. Proof: Jensen’s inequality with ϕ(x) = x2 becomes E[X]2 ≤ E[X 2 ]. Since the right side is finite, it follows that E[X] < ∞, so X is integrable. Application 2.11.4 If mX (t) denotes the moment generating function of the random variable X with mean µ, then mX (t) ≥ etµ . Proof: Applying Jensen inequality with the convex function ϕ(x) = ex yields eE[X] ≤ E[eX ]. Substituting tX for X yields eE[tX] ≤ E[etX ]. (2.11.7) Using the definition of the moment generating function mX (t) = E[etX ] and that E[tX] = tE[X] = tµ, then (2.11.7) leads to the desired inequality. The variance of a square integrable random variable X is defined by V ar(X) = E[X 2 ] − E[X]2 . By Application 2.11.3 we have V ar(X) ≥ 0, so there is a constant σX > 0, called standard deviation, such that 2 = V ar(X). σX 29 Exercise 2.11.5 Prove the following identity: V ar[X] = E[(X − E[X])2 ]. Exercise 2.11.6 Prove that a non-constant random variable has a non-zero standard deviation. Exercise 2.11.7 Prove the following extension of Jensen’s inequality: If ϕ is a convex function, then for any σ-field G ⊂ F we have ϕ(E[X|G]) ≤ E[ϕ(X)|G]. 
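Jensen's inequality can be watched at work on a small discrete distribution. The following sketch (not from the text; the three-point distribution is an arbitrary choice) checks ϕ(E[X]) ≤ E[ϕ(X)] exactly for several convex functions ϕ.

```python
import math
from fractions import Fraction as F

# Illustrative discrete distribution for X.
dist = {-1: F(1, 4), 0: F(1, 4), 2: F(1, 2)}
assert sum(dist.values()) == 1

E = lambda h: sum(p * h(x) for x, p in dist.items())
mean = E(lambda x: x)                       # E[X] = 3/4

# Convex choices of phi: x^2, exp(x), |x|.
for phi in (lambda x: x * x, math.exp, abs):
    assert phi(mean) <= E(phi)              # Jensen's inequality
```

The case ϕ(x) = x² is just E[X]² ≤ E[X²], i.e. nonnegativity of the variance, as in Application 2.11.3.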
Exercise 2.11.8 Show the following: (a) |E[X]| ≤ E[|X|]; (b) |E[X|G]| ≤ E[|X| |G], for any σ-field G ⊂ F; (c) |E[X]|r ≤ E[|X|r ], for r ≥ 1; (d) |E[X|G]|r ≤ E[|X|r |G], for any σ-field G ⊂ F and r ≥ 1. Theorem 2.11.9 (Markov’s inequality) For any λ, p > 0, we have the following inequality: 1 P (ω; |X(ω)| ≥ λ) ≤ p E[|X|p ]. λ Proof: Let A = {ω; |X(ω)| ≥ λ}. Then Z Z Z p p p λp dP (ω) |X(ω)| dP (ω) ≥ |X(ω)| dP (ω) ≥ E[|X| ] = A A ΩZ = λp dP (ω) = λp P (A) = λp P (|X| ≥ λ). A Dividing by λp leads to the desired result. Theorem 2.11.10 (Tchebychev’s inequality) If X is a random variable with mean µ and variance σ 2 , then σ2 P (ω; |X(ω) − µ| ≥ λ) ≤ 2 . λ Proof: Let A = {ω; |X(ω) − µ| ≥ λ}. Then Z Z 2 2 2 σ = V ar(X) = E[(X − µ) ] = (X − µ) dP ≥ (X − µ)2 dP A Ω Z ≥ λ2 dP = λ2 P (A) = λ2 P (ω; |X(ω) − µ| ≥ λ). A Dividing by λ2 leads to the desired inequality. The next result deals with exponentially decreasing bounds on tail distributions. 30 Theorem 2.11.11 (Chernoff bounds) Let X be a random variable. Then for any λ > 0 we have E[etX ] , ∀t > 0; 1. P (X ≥ λ) ≤ eλt 2. P (X ≤ λ) ≤ E[etX ] , ∀t < 0. eλt Proof: 1. Let t > 0 and denote Y = etX . By Markov’s inequality P (Y ≥ eλt ) ≤ E[Y ] . eλt Then we have P (X ≥ λ) = P (tX ≥ λt) = P (etX ≥ eλt ) E[etX ] E[Y ] . = P (Y ≥ eλt ) ≤ λt = e eλt 2. The case t < 0 is similar. In the following we shall present an application of the Chernoff bounds for the normal distributed random variables. Let X be a random variable normally distributed with mean µ and variance σ 2 . It is known that its moment generating function is given by 1 2 2 σ m(t) = E[etX ] = eµt+ 2 t . Using the first Chernoff bound we obtain P (X ≥ λ) ≤ 1 2 2 m(t) = e(µ−λ)t+ 2 t σ , ∀t > 0, λt e which implies 1 min[(µ − λ)t + t2 σ 2 ] 2 P (X ≥ λ) ≤ e t>0 . It is easy to see that the quadratic function f (t) = (µ − λ)t + 12 t2 σ 2 has the minimum λ−µ value reached for t = . Since t > 0, λ needs to satisfy λ > µ. Then σ2 λ − µ (λ − µ)2 min f (t) = f . 
= − t>0 σ2 2σ 2 Substituting into the previous formula, we obtain the following result: Proposition 2.11.12 If X is a normally distributed variable, with X ∼ N (µ, σ 2 ), then for any λ > µ (λ − µ)2 − 2σ 2 . P (X ≥ λ) ≤ e 31 Exercise 2.11.13 Let X be a Poisson random variable with mean λ > 0. (a) Show that the moment generating function of X is m(t) = eλ(e (b) Use a Chernoff bound to show that P (X ≥ k) ≤ eλ(e t −1)−tk , t −1) ; t > 0. Markov’s, Tchebychev’s and Chernoff’s inequalities will be useful later when computing limits of random variables. Proposition 2.11.14 Let X be a random variable and f and g be two functions, both increasing or decreasing. Then E[f (X)g(X)] ≥ E[f (X)]E[g(X)]. (2.11.8) Proof: For any two independent random variables X and Y , we have f (X) − f (Y ) g(X) − g(Y ) ≥ 0. Applying expectation yields E[f (X)g(X)] + E[f (Y )g(Y )] ≥ E[f (X)]E[f (Y )] + E[f (Y )]E[f (X)]. Considering Y as an independent copy of X we obtain 2E[f (X)g(X)] ≥ 2E[f (X)]E[g(X)]. Exercise 2.11.15 Show the following inequalities: (a) E[X 2 ] ≥ E[X]2 ; (b) E[X sinh(X)] ≥ E[X]E[sinh(X)]; (c) E[X 6 ] ≥ E[X]E[X 5 ]; (d) E[X 6 ] ≥ E[X 3 ]2 . Exercise 2.11.16 For any n, k ≥ 1, show that E[X 2(n+k+1) ] ≥ E[X 2k+1 ]E[X 2n+1 ]. 2.12 Limits of Sequences of Random Variables Consider a sequence (Xn )n≥1 of random variables defined on the probability space (Ω, F, P ). There are several ways of making sense of the limit expression X = lim Xn , n→∞ and they will be discussed in the following sections. 32 Almost Sure Limit The sequence Xn converges almost surely to X, if for all states of the world ω, except a set of probability zero, we have lim Xn (ω) = X(ω). n→∞ More precisely, this means P ω; lim Xn (ω) = X(ω) = 1, n→∞ and we shall write as-lim Xn = X. An important example where this type of limit n→∞ occurs is the Strong Law of Large Numbers: If Xn is a sequence of independent and identically distributed random variables with X1 + · · · + Xn = µ. 
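The Gaussian tail bound of Proposition 2.11.12 can be compared against the exact tail probability, which plain Python provides through `math.erfc`. This sketch is not from the text; the parameters µ = 1, σ = 2 and the grid of λ values are illustrative.

```python
import math

def normal_tail(lmbda, mu, sigma):
    """Exact P(X >= lmbda) for X ~ N(mu, sigma^2), via the complementary error function."""
    return 0.5 * math.erfc((lmbda - mu) / (sigma * math.sqrt(2)))

def chernoff_bound(lmbda, mu, sigma):
    """Optimized Chernoff bound exp(-(lmbda - mu)^2 / (2 sigma^2)), valid for lmbda > mu."""
    return math.exp(-(lmbda - mu) ** 2 / (2 * sigma ** 2))

mu, sigma = 1.0, 2.0
for lmbda in (1.5, 3.0, 6.0, 9.0):
    assert normal_tail(lmbda, mu, sigma) <= chernoff_bound(lmbda, mu, sigma)
```

For large λ the two quantities decay at the same exponential rate, which is why Chernoff-type bounds are considered sharp for Gaussian tails.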
the same mean µ, then as-lim n→∞ n It is worth noting that this type of convergence is also known under the name of strong convergence. This is the reason that the aforementioned theorem bears its name. Mean Square Limit Another possibility of convergence is to look at the mean square deviation of Xn from X. We say that Xn converges to X in the mean square if lim E[(Xn − X)2 ] = 0. n→∞ More precisely, this should be interpreted as Z 2 Xn (ω) − X(ω) dP (ω) = 0. lim n→∞ Ω This limit will be abbreviated by ms-lim Xn = X. The mean square convergence is n→∞ useful when defining the Ito integral. Example 2.12.1 Consider a sequence Xn of random variables such that there is a constant k with E[Xn ] → k and V ar(Xn ) → 0 as n → ∞. Show that ms-lim Xn = k. n→∞ Proof: Since we have E[|Xn − k|2 ] = E[Xn2 − 2kXn + k2 ] = E[Xn2 ] − 2kE[Xn ] + k2 = E[Xn2 ] − E[Xn ]2 + E[Xn ]2 − 2kE[Xn ] + k2 2 = V ar(Xn ) + E[Xn ] − k , the right side tends to 0 when taking the limit n → ∞. 33 Exercise 2.12.1 Show the following relation 2 E[(X − Y )2 ] = V ar[X] + V ar[Y ] + E[X] − E[Y ] − 2Cov(X, Y ). Exercise 2.12.2 If Xn tends to X in mean square, with E[X 2 ] < ∞, show that: (a) E[Xn ] → E[X] as n → ∞; (b) E[Xn2 ] → E[X 2 ] as n → ∞; (c) V ar[Xn ] → V ar[X] as n → ∞; (d) Cov(Xn , X) → V ar[X] as n → ∞. Exercise 2.12.3 If Xn tends to X in mean square, show that E[Xn |H] tends to E[X|H] in mean square. Limit in Probability The random variable X is the limit in probability of Xn if for n large enough the probability of deviation from X can be made smaller than any arbitrary ǫ. More precisely, for any ǫ > 0 lim P ω; |Xn (ω) − X(ω)| ≤ ǫ = 1. n→∞ This can be written also as lim P ω; |Xn (ω) − X(ω)| > ǫ = 0. n→∞ This limit is denoted by p-lim Xn = X. n→∞ It is worth noting that both almost certain convergence and convergence in mean square imply the convergence in probability. Proposition 2.12.4 The convergence in the mean square implies the convergence in probability. Proof: Let ms-lim Yn = Y . 
Let ǫ > 0 be arbitrarily fixed. Applying Markov’s n→∞ inequality with X = Yn − Y , p = 2 and λ = ǫ, yields 0 ≤ P (|Yn − Y | ≥ ǫ) ≤ 1 E[|Yn − Y |2 ]. ǫ2 The right side tends to 0 as n → ∞. Applying the Squeeze Theorem we obtain lim P (|Yn − Y | ≥ ǫ) = 0, n→∞ which means that Yn converges stochastically to Y . 34 Example 2.12.5 Let Xn be a sequence of random variables such that E[|Xn |] → 0 as n → ∞. Prove that p-lim Xn = 0. n→∞ Proof: Let ǫ > 0 be arbitrarily fixed. We need to show lim P ω; |Xn (ω)| ≥ ǫ = 0. n→∞ (2.12.9) From Markov’s inequality (see Exercise 2.11.9) we have E[|Xn |] . 0 ≤ P ω; |Xn (ω)| ≥ ǫ ≤ ǫ Using Squeeze Theorem we obtain (2.12.9). Remark 2.12.6 The conclusion still holds true even in the case when there is a p > 0 such that E[|Xn |p ] → 0 as n → ∞. Limit in Distribution We say the sequence Xn converges in distribution to X if for any continuous bounded function ϕ(x) we have lim E[ϕ(Xn )] = E[ϕ(X)]. n→∞ We make the remark that this type of limit is even weaker than the stochastic convergence, i.e. it is implied by it. An application of the limit in distribution is obtained if we consider ϕ(x) = eitx . In this case the expectation becomes the Fourier transform of the probability density E[ϕ(X)] = Z eitx p(x) dx = p̂(t), and it is called the characteristic function of the random variable X. It follows that if Xn converges in distribution to X, then the characteristic function of Xn converges to the characteristic function of X. From the properties of the the Fourier transform, the probability density of Xn approaches the probability density of X. It can be shown that the convergence in distribution is equivalent with lim Fn (x) = F (x), n→∞ whenever F is continuous at x, where Fn and F denote the distribution functions of Xn and X, respectively. This is the reason that this convergence bares its name. 35 2.13 Properties of Mean-Square Limit Lemma 2.13.1 If ms-lim Xn = 0 and ms-lim Yn = 0, then n→∞ n→∞ ms-lim (Xn + Yn ) = 0. 
n→∞ Proof: It follows from the inequality (x + y)2 ≤ 2x2 + 2y 2 . The details are left to the reader. Proposition 2.13.2 If the sequences of random variables Xn and Yn converge in the mean square, then Proof: 1. ms-lim (Xn + Yn ) = ms-lim Xn + ms-lim Yn 2. ms-lim (cXn ) = c · ms-lim Xn , n→∞ n→∞ n→∞ n→∞ n→∞ ∀c ∈ R. 1. Let ms-lim Xn = X and ms-lim Yn = Y . Consider the sequences Xn′ = n→∞ n→∞ Xn − X and Yn′ = Yn − Y . Then ms-lim Xn′ = 0 and ms-lim Yn′ = 0. Applying Lemma n→∞ n→∞ 2.13.1 yields ms-lim (Xn′ + Yn′ ) = 0. n→∞ This is equivalent with which becomes ms-lim (Xn − X + Yn − Y ) = 0, n→∞ ms-lim (Xn + Yn ) = X + Y . n→∞ Remark 2.13.3 It is worthy to note that ms-lim (Xn Yn ) 6= ms-lim (Xn )· ms-lim (Yn ). n→∞ n→∞ Counter-examples can be found, see Exercise 2.13.5. n→∞ Exercise 2.13.4 Use a computer algebra system to show the following: √ Z ∞ x2 π 2p dx = (a) 1 − 5−1/2 ; x5 + 1 4 0 Z ∞ x4 (b) dx = ∞; x5 + 1 0 Z ∞ 1 1 1 4 (c) Γ . dx = Γ x5 + 1 5 5 5 0 36 Exercise 2.13.5 Let X be a random variable with the probability density function p(x) = 1 5 , 5 Γ(1/5)Γ(4/5) x + 1 (a) Show that E[X 2 ] < ∞ and E[X 4 ] = ∞; x ≥ 0. (b) Construct the sequences of random variables Xn = Yn = lim Xn = 0, ms-lim Yn = 0, but ms-lim (Xn Yn ) = ∞. n→∞ 2.14 n→∞ 1 n X. Show that ms- n→∞ Stochastic Processes A stochastic process on the probability space (Ω, F, P ) is a family of random variables Xt parameterized by t ∈ T, where T ⊂ R. If T is an interval we say that Xt is a stochastic process in continuous time. If T = {1, 2, 3, . . . } we shall say that Xt is a stochastic process in discrete time. The latter case describes a sequence of random variables. The aforementioned types of convergence can be easily extended to continuous time. For instance, Xt converges almost surely to X as t → ∞ if P ω; lim Xt (ω) = X(ω) = 1. t→∞ The evolution in time of a given state of the world ω ∈ Ω given by the function t 7−→ Xt (ω) is called a path or realization of Xt . 
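The convergence notions of the previous sections are easy to explore through simulated realizations. The seeded sketch below (not from the text; the uniform distribution and sample sizes are arbitrary choices) illustrates the Strong Law of Large Numbers: sample averages approach the mean 1/2 along a fixed realization.

```python
import random

random.seed(42)   # fix the seed so this particular realization is reproducible

def running_average(n):
    """Average of n i.i.d. Uniform(0, 1) samples; the SLLN suggests -> 1/2."""
    return sum(random.random() for _ in range(n)) / n

errors = [abs(running_average(n) - 0.5) for n in (100, 10_000, 1_000_000)]
# The deviation from the mean typically shrinks at the rate 1/sqrt(n).
```

A single run only shows one path ω; the almost sure statement is about the set of paths on which this convergence happens having probability one.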
The study of stochastic processes using computer simulations is based on retrieving information about the process Xt given a large number of it realizations. Next we shall structure the information field F with an order relation parametrized by the time t. Consider that all the information accumulated until time t is contained by the σ-field Ft . This means that Ft contains the information of which events have already occurred until time t, and which did not. Since the information is growing in time, we have Fs ⊂ Ft ⊂ F for any s, t ∈ T with s ≤ t. The family Ft is called a filtration. A stochastic process Xt is called adapted to the filtration Ft if Xt is Ft - measurable, for any t ∈ T. This means that the information at time t determines the value of the random variable Xt . Example 2.14.1 Here there are a few examples of filtrations: 1. Ft represents the information about the evolution of a stock until time t, with t > 0. 2. Ft represents the information about the evolution of a Black-Jack game until time t, with t > 0. 3. Ft represents the medical information of a patient until time t. 37 Example 2.14.2 If X is a random variable, consider the conditional expectation Xt = E[X|Ft ]. From the definition of conditional expectation, the random variable Xt is Ft -measurable, and can be regarded as the measurement of X at time t using the information Ft . If the accumulated knowledge Ft increases and eventually equals the σ-field F, then X = E[X|F], i.e. we obtain the entire random variable. The process Xt is adapted to Ft . Example 2.14.3 Don Joe goes to a doctor to get an estimation of how long he still has to live. The age at which he will pass away is a random variable, denoted by X. Given his medical condition today, which is contained in Ft , the doctor can infer an average age, which is the average of all random instances that agree with the information to date; this is given by the conditional expectation Xt = E[X|Ft ]. 
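A filtration and an adapted process can be made concrete on a finite space. In the sketch below (not from the text), Ω consists of three fair coin tosses, Fn is generated by the first n tosses, X is the total number of heads, and Xn = E[X|Fn] is computed exactly; averaging over the next toss recovers the martingale property of such processes.

```python
from fractions import Fraction as F
from itertools import product

# Omega: all outcomes of three fair coin tosses (1 = heads); X = total heads.
omega = list(product((0, 1), repeat=3))
X = {w: sum(w) for w in omega}

def cond_exp(n, prefix):
    """E[X | F_n] on the event that the first n tosses equal `prefix`."""
    matching = [w for w in omega if w[:n] == prefix]
    return sum(X[w] for w in matching) * F(1, len(matching))

# X_n = E[X|F_n]: averaging X_{n+1} over the next (fair) toss gives back X_n.
for n in range(3):
    for prefix in product((0, 1), repeat=n):
        avg_next = (cond_exp(n + 1, prefix + (0,)) + cond_exp(n + 1, prefix + (1,))) / 2
        assert avg_next == cond_exp(n, prefix)
```

Here F0 gives the unconditional mean 3/2, while F3 pins down X completely, matching the idea that growing information refines the prediction of X.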
The stochastic process Xt is adapted to the medical knowledge Ft . We shall define next an important type of stochastic process.5 Definition 2.14.4 A process Xt , t ∈ T, is called a martingale with respect to the filtration Ft if 1. Xt is integrable for each t ∈ T; 2. Xt is adapted to the filtration Ft ; 3. Xs = E[Xt |Fs ], ∀s < t. Remark 2.14.5 The first condition states that the unconditional forecast is finite Z |Xt | dP < ∞. Condition 2 says that the value Xt is known, given the E[|Xt ]] = Ω information set Ft . This can be also stated by saying that Xt is Ft -measurable. The third relation asserts that the best forecast of unobserved future values is the last observation on Xt . Example 2.14.1 Let Xt denote Mr. Li Zhu’s salary after t years of work at the same company. Since Xt is known at time t and it is bounded above, as all salaries are, then the first two conditions hold. Being honest, Mr. Zhu expects today that his future salary will be the same as today’s, i.e. Xs = E[Xt |Fs ], for s < t. This means that Xt is a martingale. Exercise 2.14.6 If X is an integrable random variable on (Ω, F, P ), and Ft is a filtration. Prove that Xt = E[X|Ft ] is a martingale. Exercise 2.14.7 Let Xt and Yt be martingales with respect to the filtration Ft . Show that for any a, b, c ∈ R the process Zt = aXt + bYt + c is a Ft -martingale. 5 The concept of martingale was introduced by Lévy in 1934. 38 Exercise 2.14.8 Let Xt and Yt be martingales with respect to the filtration Ft . (a) Is the process Xt Yt always a martingale with respect to Ft ? (b) What about the processes Xt2 and Yt2 ? Exercise 2.14.9 Two processes Xt and Yt are called conditionally uncorrelated, given Ft , if E[(Xt − Xs )(Yt − Ys )|Fs ] = 0, ∀ 0 ≤ s < t < ∞. Let Xt and Yt be martingale processes. Show that the process Zt = Xt Yt is a martingale if and only if Xt and Yt are conditionally uncorrelated. Assume that Xt , Yt and Zt are integrable. 
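Martingales can also be probed by simulation. The seeded sketch below (not from the text) builds the product of independent mean-one factors, a standard martingale example; the Uniform(0, 2) factors, path count and horizon are arbitrary choices. The key prediction is that E[P_n] stays equal to 1 for every n.

```python
import random

random.seed(1)                       # reproducible run
N_PATHS, N_STEPS = 200_000, 10

def product_path(n):
    """P_n = X_0 * X_1 * ... * X_n with X_i ~ Uniform(0, 2), so E[X_i] = 1."""
    p = 1.0
    for _ in range(n + 1):
        p *= random.uniform(0.0, 2.0)
    return p

est = sum(product_path(N_STEPS) for _ in range(N_PATHS)) / N_PATHS
# For a martingale, E[P_n] = E[P_0] = E[X_0] = 1 at every step n.
```

Note that individual paths are wildly dispersed (the variance of P_n grows geometrically), so a large number of paths is needed before the sample mean settles near 1.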
In the following, if Xt is a stochastic process, the minimum amount of information resulted from knowing the process Xs until time t is denoted by Ft = σ(Xs ; s ≤ t). This is the σ-algebra generated by the events {ω; Xs (ω) ∈ (a, b)}, for any real numbers a < b and s ≤ t. In the case of a discrete process, the minimum amount of information resulted from knowing the process Xk until time n is Fn = σ(Xk ; k ≤ n), the σ-algebra generated by the events {ω; Xk (ω) ∈ (a, b)}, for any real numbers a < b and k ≤ n. Exercise 2.14.10 Let Xn , n ≥ 0 be a sequence of integrable independent random variables, with E[Xn ] < ∞, for all n ≥ 0. Let S0 = X0 , Sn = X0 + · · · + Xn . Show the following: (a) Sn − E[Sn ] is an Fn -martingale. (b) If E[Xn ] = 0 and E[Xn2 ] < ∞, ∀n ≥ 0, then Sn2 − V ar(Sn ) is an Fn -martingale. Exercise 2.14.11 Let Xn , n ≥ 0 be a sequence of independent, integrable random variables such that E[Xn ] = 1 for n ≥ 0. Prove that Pn = X0 · X1 · · · · Xn is an Fn -martingale. Exercise 2.14.12 (a) Let X be a normally distributed random variable with mean µ 6= 0 and variance σ 2 . Prove that there is a unique θ 6= 0 such that E[eθX ] = 1. (b) Let (Xi )i≥0 be a sequence of identically P normally distributed random variables with mean µ 6= 0. Consider the sum Sn = nj=0 Xj . Show that Zn = eθSn is a martingale, with θ defined in part (a). In section 10.1 we shall encounter several processes which are martingales. Chapter 3 Useful Stochastic Processes This chapter deals with the most common used stochastic processes and their basic properties. The two main basic processes are the Brownian motion and the Poisson process. The other processes described in this chapter are derived from the previous two. 
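Both building blocks of this chapter lend themselves to straightforward simulation. As a first taste, the seeded sketch below (not from the text; the step count, horizon and sample size are arbitrary choices) generates Brownian-type values at time t as sums of independent normal increments and checks that Var(B_t) ≈ t.

```python
import math
import random

random.seed(3)    # reproducible run

def brownian_endpoint(t, n_steps=50):
    """Simulate B_t as a sum of n_steps independent N(0, dt) increments."""
    dt = t / n_steps
    return sum(random.gauss(0.0, math.sqrt(dt)) for _ in range(n_steps))

t = 2.0
samples = [brownian_endpoint(t) for _ in range(20_000)]
mean = sum(samples) / len(samples)                          # ~ E[B_t] = 0
var = sum((s - mean) ** 2 for s in samples) / len(samples)  # ~ Var(B_t) = t
```

Keeping the partial sums along the way (rather than only the endpoint) yields a discretized sample path of the process.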
3.1 The Brownian Motion

The observation made first by the botanist Robert Brown in 1827, that small pollen grains suspended in water have a very irregular and unpredictable state of motion, led to the definition of the Brownian motion, which is formalized in the following:

Definition 3.1.1 A Brownian motion process is a stochastic process Bt, t ≥ 0, which satisfies:
1. The process starts at the origin, B0 = 0;
2. Bt has independent increments;
3. The process Bt is continuous in t;
4. The increments Bt − Bs are normally distributed with mean zero and variance |t − s|, i.e. Bt − Bs ∼ N(0, |t − s|).

The process Xt = x + Bt has all the properties of a Brownian motion that starts at x. Condition 4 states that the increments of a Brownian motion are stationary, i.e., the distribution of Bt − Bs depends only on the time interval t − s:
P(B_{t+s} − B_s ≤ a) = P(B_t − B_0 ≤ a) = P(B_t ≤ a).
It is worth noting that even if Bt is continuous, it is nowhere differentiable. From condition 4 we get that Bt is normally distributed with mean E[Bt] = 0 and Var[Bt] = t, that is, Bt ∼ N(0, t). This implies also that the second moment is E[Bt^2] = t. Let 0 < s < t. Since the increments are independent, we can write
E[B_s B_t] = E[(B_s − B_0)(B_t − B_s) + B_s^2] = E[B_s − B_0]E[B_t − B_s] + E[B_s^2] = s.
Consequently, Bs and Bt are not independent.

Condition 4 also has a physical explanation. A pollen grain suspended in water is kicked by a very large number of water molecules. The influence of each molecule on the grain is independent of the other molecules. These effects average out into a resultant increment of the grain coordinate. According to the Central Limit Theorem, this increment has to be normally distributed.

Proposition 3.1.2 A Brownian motion process Bt is a martingale with respect to the information set Ft = σ(Bs; s ≤ t).

Proof: The integrability of Bt follows from Jensen's inequality:
E[|B_t|]^2 ≤ E[B_t^2] = Var(B_t) = t < ∞.
Bt is obviously Ft-measurable.
Let s < t and write Bt = Bs + (Bt − Bs). Then
E[B_t|F_s] = E[B_s + (B_t − B_s)|F_s] = E[B_s|F_s] + E[B_t − B_s|F_s] = B_s + E[B_t − B_s] = B_s + E[B_{t−s} − B_0] = B_s,
where we used that Bs is Fs-measurable (whence E[Bs|Fs] = Bs) and that the increment Bt − Bs is independent of the previous values of the process contained in the information set Fs = σ(Bu; u ≤ s).

A process with similar properties as the Brownian motion was introduced by Wiener.

Definition 3.1.3 A Wiener process Wt is a process adapted to a filtration Ft such that:
1. The process starts at the origin, W0 = 0;
2. Wt is an Ft-martingale with E[Wt^2] < ∞ for all t ≥ 0 and
E[(W_t − W_s)^2] = t − s, \quad s ≤ t;
3. The process Wt is continuous in t.

Since Wt is a martingale, its increments satisfy
E[W_t − W_s|F_s] = E[W_t|F_s] − W_s = W_s − W_s = 0,
and hence E[Wt − Ws] = 0 and E[Wt] = 0. It is easy to show that
Var[W_t − W_s] = |t − s|, \qquad Var[W_t] = t.

Exercise 3.1.4 Show that a Brownian motion process Bt is a Wiener process.

The only property Bt has and Wt seems not to have is that the increments are normally distributed. However, there is no distinction between these two processes, as the following result states:

Theorem 3.1.5 (Lévy) A Wiener process is a Brownian motion process.

In stochastic calculus we often need to use infinitesimal notation and its properties. If dWt denotes the infinitesimal increment of a Wiener process in the time interval dt, the aforementioned properties become
dW_t ∼ N(0, dt), \qquad E[dW_t] = 0, \qquad E[(dW_t)^2] = dt.

Proposition 3.1.6 If Wt is a Wiener process with respect to the information set Ft, then Yt = Wt^2 − t is a martingale.

Proof: Yt is integrable since
E[|Y_t|] ≤ E[W_t^2 + t] = 2t < \infty, \quad t > 0.
Let s < t.
Using that the increments Wt − Ws and (Wt − Ws )2 are independent of the information set Fs and applying Proposition 2.10.6 yields E[Wt2 |Fs ] = E[(Ws + Wt − Ws )2 |Fs ] = E[Ws2 + 2Ws (Wt − Ws ) + (Wt − Ws )2 |Fs ] = E[Ws2 |Fs ] + E[2Ws (Wt − Ws )|Fs ] + E[(Wt − Ws )2 |Fs ] = Ws2 + 2Ws E[Wt − Ws |Fs ] + E[(Wt − Ws )2 |Fs ] = Ws2 + 2Ws E[Wt − Ws ] + E[(Wt − Ws )2 ] = Ws2 + t − s, and hence E[Wt2 − t|Fs ] = Ws2 − s, for s < t. Exercise 3.1.7 Show that the process Xt = Wt3 −3 Z t Ws ds 0 is a martingale with respect to the information set Ft = σ{Ws ; s ≤ t}. The following result states the memoryless property of Brownian motion1 Wt . Proposition 3.1.8 The conditional distribution of Wt+s , given the present Wt and the past Wu , 0 ≤ u < t, depends only on the present. 1 These type of processes are called Markov processes. 42 Proof: Using the independent increment assumption, we have P (Wt+s ≤ c|Wt = x, Wu , 0 ≤ u < t) = P (Wt+s − Wt ≤ c − x|Wt = x, Wu , 0 ≤ u < t) = P (Wt+s − Wt ≤ c − x) = P (Wt+s ≤ c|Wt = x). Since Wt is normally distributed with mean 0 and variance t, its density function is φt (x) = √ 1 − x2 e 2t . 2πt Then its distribution function is 1 Ft (x) = P (Wt ≤ x) = √ 2πt Z x u2 e− 2t du −∞ The probability that Wt is between the values a and b is given by P (a ≤ Wt ≤ b) = √ 1 2πt Z b u2 e− 2t du, a < b. a Even if the increments of a Brownian motion are independent, their values are still correlated. Proposition 3.1.9 Let 0 ≤ s ≤ t. Then 1. Cov(Ws , Wt ) = s;r 2. Corr(Ws , Wt ) = s . t Proof: 1. Using the properties of covariance Cov(Ws , Wt ) = Cov(Ws , Ws + Wt − Ws ) = Cov(Ws , Ws ) + Cov(Ws , Wt − Ws ) = V ar(Ws ) + E[Ws (Wt − Ws )] − E[Ws ]E[Wt − Ws ] = s + E[Ws ]E[Wt − Ws ] = s, since E[Ws ] = 0. We can also arrive at the same result starting from the formula Cov(Ws , Wt ) = E[Ws Wt ] − E[Ws ]E[Wt ] = E[Ws Wt ]. 
Using that conditional expectations preserve expectations, factoring the Fs-measurable part out, and using that Wt is a martingale, we have

E[Ws Wt] = E[E[Ws Wt|Fs]] = E[Ws E[Wt|Fs]] = E[Ws Ws] = E[Ws²] = s,

so Cov(Ws, Wt) = s.

2. The correlation formula yields

Corr(Ws, Wt) = Cov(Ws, Wt)/(σ(Ws)σ(Wt)) = s/(√s √t) = √(s/t).

Remark 3.1.10 Removing the order relation between s and t, the previous relations can also be stated as

Cov(Ws, Wt) = min{s, t};
Corr(Ws, Wt) = √( min{s, t}/max{s, t} ).

The following exercises state the translation and the scaling invariance of the Brownian motion.

Exercise 3.1.11 For any t0 ≥ 0, show that the process Xt = Wt+t0 − Wt0 is a Brownian motion. This can also be stated by saying that the Brownian motion is translation invariant.

Exercise 3.1.12 For any λ > 0, show that the process Xt = (1/√λ) Wλt is a Brownian motion. This says that the Brownian motion is invariant under scaling.

Exercise 3.1.13 Let 0 < s < t < u. Show the following multiplicative property:

Corr(Ws, Wt) Corr(Wt, Wu) = Corr(Ws, Wu).

Exercise 3.1.14 Find the expectations E[Wt³] and E[Wt⁴].

Exercise 3.1.15 (a) Use the martingale property of Wt² − t to find E[(Wt² − t)(Ws² − s)];
(b) Evaluate E[Wt² Ws²];
(c) Compute Cov(Wt², Ws²);
(d) Find Corr(Wt², Ws²).

Figure 3.1: (a) Three simulations of the Brownian motion process Wt; (b) two simulations of the geometric Brownian motion process e^{Wt}.

Exercise 3.1.16 Consider the process Yt = t W1/t, t > 0, and define Y0 = 0.
(a) Find the distribution of Yt;
(b) Find the probability density of Yt;
(c) Find Cov(Ys, Yt);
(d) Find E[Yt − Ys] and Var(Yt − Ys) for s < t.

It is worth noting that the process Yt = t W1/t, t > 0, with Y0 = 0, is a Brownian motion.

Exercise 3.1.17 The process Xt = |Wt| is called Brownian motion reflected at the origin. Show that
(a) E[|Wt|] = √(2t/π);
(b) Var(|Wt|) = (1 − 2/π)t.

Exercise 3.1.18 Let 0 < s < t.
Find E[Wt²|Fs].

Exercise 3.1.19 Let 0 < s < t. Show that
(a) E[Wt³|Fs] = 3(t − s)Ws + Ws³;
(b) E[Wt⁴|Fs] = 3(t − s)² + 6(t − s)Ws² + Ws⁴.

Exercise 3.1.20 Show that the following processes are Brownian motions:
(a) Xt = WT − WT−t, 0 ≤ t ≤ T;
(b) Yt = −Wt, t ≥ 0.

Exercise 3.1.21 Let Wt and W̃t be two independent Brownian motions and ρ be a constant with |ρ| ≤ 1.
(a) Show that the process Xt = ρWt + √(1 − ρ²) W̃t is continuous and has the distribution N(0, t);
(b) Is Xt a Brownian motion?

Exercise 3.1.22 Let Y be a random variable distributed as N(0, 1). Consider the process Xt = √t Y. Is Xt a Brownian motion?

3.2 Geometric Brownian Motion

The geometric Brownian motion with drift µ and volatility σ is the process

Xt = e^{σWt + (µ − σ²/2)t}, t ≥ 0.

In the standard case, when µ = 0 and σ = 1, the process becomes Xt = e^{Wt}, t ≥ 0. A few simulations of this process are contained in Fig. 3.1 b. The following result will be useful in what follows.

Lemma 3.2.1 E[e^{αWt}] = e^{α²t/2}, for α ≥ 0.

Proof: Using the definition of the expectation,

E[e^{αWt}] = ∫ e^{αx} φt(x) dx = (1/√(2πt)) ∫ e^{−x²/(2t) + αx} dx = e^{α²t/2},

where we have used the integral formula

∫ e^{−ax² + bx} dx = √(π/a) e^{b²/(4a)}, a > 0,

with a = 1/(2t) and b = α.

Proposition 3.2.2 The geometric Brownian motion Xt = e^{Wt} is log-normally distributed with mean e^{t/2} and variance e^{2t} − e^{t}.

Proof: Since Wt is normally distributed, Xt = e^{Wt} has a log-normal distribution. Using Lemma 3.2.1 we have

E[Xt] = E[e^{Wt}] = e^{t/2},
E[Xt²] = E[e^{2Wt}] = e^{2t},

and hence the variance is

Var[Xt] = E[Xt²] − E[Xt]² = e^{2t} − (e^{t/2})² = e^{2t} − e^{t}.

The distribution function of Xt = e^{Wt} can be obtained by reducing it to the distribution function of a Brownian motion:

F_Xt(x) = P(Xt ≤ x) = P(e^{Wt} ≤ x) = P(Wt ≤ ln x) = F_Wt(ln x) = (1/√(2πt)) ∫_{−∞}^{ln x} e^{−u²/(2t)} du.

The density function of the geometric Brownian motion Xt = e^{Wt} is therefore

p(x) = d/dx F_Xt(x) = (1/(x√(2πt))) e^{−(ln x)²/(2t)} if x > 0, and p(x) = 0 elsewhere.
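Proposition 3.2.2 lends itself to a quick Monte Carlo check. The following Python sketch is an editorial illustration, not part of the text; the sample size and seed are arbitrary choices of mine.

```python
import math
import numpy as np

# Monte Carlo sketch checking Proposition 3.2.2 for X_t = exp(W_t):
# E[X_t] = e^{t/2} and Var[X_t] = e^{2t} - e^t.
rng = np.random.default_rng(3)
t, n = 1.0, 2_000_000
X = np.exp(rng.normal(0.0, math.sqrt(t), n))   # W_t ~ N(0, t), X_t = e^{W_t}

print(X.mean(), math.exp(t / 2))               # both ≈ 1.6487
print(X.var(), math.exp(2 * t) - math.exp(t))  # both ≈ 4.6708
```

The heavy right tail of the log-normal distribution makes the variance estimate noticeably noisier than the mean estimate, which is why a large sample is used.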
Exercise 3.2.3 Show that E[e^{Wt − Ws}] = e^{(t−s)/2}, s < t.

Exercise 3.2.4 Let Xt = e^{Wt}.
(a) Show that Xt is not a martingale.
(b) Show that e^{−t/2} Xt is a martingale.
(c) Show that for any constant c ∈ R, the process Yt = e^{cWt − c²t/2} is a martingale.

Exercise 3.2.5 If Xt = e^{Wt}, find Cov(Xs, Xt)
(a) by direct computation;
(b) by using Exercise 3.2.4 (b).

Exercise 3.2.6 Show that

E[e^{2Wt²}] = (1 − 4t)^{−1/2} for 0 ≤ t < 1/4, and ∞ otherwise.

3.3 Integrated Brownian Motion

The stochastic process

Zt = ∫0^t Ws ds, t ≥ 0,

is called the integrated Brownian motion. Obviously, Z0 = 0. Let 0 = s0 < s1 < · · · < sk < · · · < sn = t, with sk = kt/n. Then Zt can be written as a limit of Riemann sums

Zt = lim_{n→∞} Σ_{k=1}^{n} Wsk Δs = t lim_{n→∞} (Ws1 + · · · + Wsn)/n,

where Δs = sk+1 − sk = t/n. Since the Wsk are not independent, we first need to transform the previous expression into a sum of independent, normally distributed random variables. A straightforward computation shows that

Ws1 + · · · + Wsn = n(Ws1 − W0) + (n − 1)(Ws2 − Ws1) + · · · + (Wsn − Wsn−1) = X1 + X2 + · · · + Xn.   (3.3.1)

Since the increments of a Brownian motion are independent and normally distributed, we have

X1 ∼ N(0, n²Δs), X2 ∼ N(0, (n − 1)²Δs), X3 ∼ N(0, (n − 2)²Δs), . . . , Xn ∼ N(0, Δs).

Recall now the following well-known theorem on the addition of Gaussian random variables.

Theorem 3.3.1 If Xj are independent random variables normally distributed with mean µj and variance σj², then the sum X1 + · · · + Xn is also normally distributed, with mean µ1 + · · · + µn and variance σ1² + · · · + σn².

Then

X1 + · · · + Xn ∼ N(0, (1 + 2² + 3² + · · · + n²)Δs) = N(0, n(n + 1)(2n + 1)Δs/6),

with Δs = t/n. Using (3.3.1) yields

t (Ws1 + · · · + Wsn)/n ∼ N(0, (n + 1)(2n + 1)t³/(6n²)).

"Taking the limit" we get

Zt ∼ N(0, t³/3).

Proposition 3.3.2 The integrated Brownian motion Zt has a normal distribution with mean 0 and variance t³/3.
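Proposition 3.3.2 can also be checked by simulation. The sketch below is mine, not from the text; the discretization step, path count, and seed are arbitrary choices, and Zt is approximated by a Riemann sum over each simulated path.

```python
import math
import numpy as np

# Approximate Z_t = \int_0^t W_s ds by a Riemann sum along each simulated
# Brownian path, then check Var[Z_t] ≈ t^3/3 and E[Z_t] ≈ 0.
rng = np.random.default_rng(4)
t, steps, n_paths = 2.0, 400, 20_000
dt = t / steps

dW = rng.normal(0.0, math.sqrt(dt), (n_paths, steps))
W = np.cumsum(dW, axis=1)              # W at times dt, 2dt, ..., t
Z = W.sum(axis=1) * dt                 # Riemann-sum approximation of Z_t

print(Z.mean())                        # ≈ 0
print(Z.var(), t**3 / 3)               # both ≈ 2.667
```

The agreement improves as the step size shrinks; the residual discrepancy is a mix of discretization bias and Monte Carlo noise.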
Remark 3.3.3 The aforementioned limit was taken heuristically, without specifying the type of convergence. In order to make the argument rigorous, the following result is usually used: if Xn is a sequence of normal random variables that converges in mean square to X, then the limit X is normally distributed, with E[Xn] → E[X] and Var(Xn) → Var(X), as n → ∞.

The mean and the variance can also be computed in a direct way as follows. By Fubini's theorem we have

E[Zt] = E[∫0^t Ws ds] = ∫_Ω ∫0^t Ws ds dP = ∫0^t ∫_Ω Ws dP ds = ∫0^t E[Ws] ds = 0,

since E[Ws] = 0. Then the variance is given by

Var[Zt] = E[Zt²] − E[Zt]² = E[Zt²] = E[∫0^t Wu du · ∫0^t Wv dv] = E[∫0^t ∫0^t Wu Wv du dv]
= ∫0^t ∫0^t E[Wu Wv] du dv = ∫∫_{[0,t]×[0,t]} min{u, v} du dv
= ∫∫_{D1} min{u, v} du dv + ∫∫_{D2} min{u, v} du dv,   (3.3.2)

where D1 = {(u, v); u > v, 0 ≤ u ≤ t} and D2 = {(u, v); u < v, 0 ≤ u ≤ t}. The first integral can be evaluated using Fubini's theorem:

∫∫_{D1} min{u, v} du dv = ∫∫_{D1} v du dv = ∫0^t ( ∫0^u v dv ) du = ∫0^t u²/2 du = t³/6.

Similarly, the latter integral is equal to

∫∫_{D2} min{u, v} du dv = t³/6.

Substituting in (3.3.2) yields

Var[Zt] = t³/6 + t³/6 = t³/3.

Exercise 3.3.4 (a) Prove that the moment generating function of Zt is m(u) = e^{u²t³/6}.
(b) Use the first part to find the mean and variance of Zt.

Exercise 3.3.5 Let s < t. Show that the covariance of the integrated Brownian motion is given by

Cov(Zs, Zt) = s²( t/2 − s/6 ), s < t.

Exercise 3.3.6 Show that
(a) Cov(Zt, Zt − Zt−h) = (1/2)t²h + o(h), where o(h) denotes a quantity such that limh→0 o(h)/h = 0;
(b) Cov(Zt, Wt) = t²/2.

Exercise 3.3.7 Show that E[e^{Ws + Wu}] = e^{(u+s)/2} e^{min{s,u}}.

Exercise 3.3.8 Consider the process Xt = ∫0^t e^{Ws} ds.
(a) Find the mean of Xt;
(b) Find the variance of Xt.

Exercise 3.3.9 Consider the process Zt = ∫0^t Wu du, t > 0.
(a) Show that E[ZT|Ft] = Zt + Wt(T − t), for any t < T;
(b) Prove that the process Mt = Zt − tWt is an Ft-martingale.
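The covariance formula of Exercise 3.3.5 can be tested the same way. The following sketch (times, discretization, and seed are my own choices) compares a Monte Carlo estimate of Cov(Zs, Zt) with s²(t/2 − s/6):

```python
import math
import numpy as np

# Monte Carlo check of Cov(Z_s, Z_t) = s^2 (t/2 - s/6) for s < t,
# with Z approximated by cumulative Riemann sums along each path.
rng = np.random.default_rng(5)
s, t, steps, n_paths = 1.0, 2.0, 400, 20_000
dt = t / steps

dW = rng.normal(0.0, math.sqrt(dt), (n_paths, steps))
W = np.cumsum(dW, axis=1)
Z = np.cumsum(W, axis=1) * dt          # Z at times dt, 2dt, ..., t

k = int(s / dt)                        # grid index of time s
cov = np.cov(Z[:, k - 1], Z[:, -1])[0, 1]
print(cov, s**2 * (t / 2 - s / 6))     # both ≈ 0.833
```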
3.4 Exponential Integrated Brownian Motion

If Zt = ∫0^t Ws ds denotes the integrated Brownian motion, the process Vt = e^{Zt} is called the exponential integrated Brownian motion. The process starts at V0 = e^0 = 1. Since Zt is normally distributed, Vt is log-normally distributed. We compute the mean and the variance in a direct way. Using Exercises 3.2.5 and 3.3.4 we have

E[Vt] = E[e^{Zt}] = m(1) = e^{t³/6},
E[Vt²] = E[e^{2Zt}] = m(2) = e^{4t³/6} = e^{2t³/3},
Var(Vt) = E[Vt²] − E[Vt]² = e^{2t³/3} − e^{t³/3}.

Also, by Exercise 3.3.7, E[e^{Ws + Wt}] = e^{(t+3s)/2} for s < t.

Exercise 3.4.1 Show that E[VT|Ft] = Vt e^{(T−t)Wt + (T−t)³/6}, for t < T.

3.5 Brownian Bridge

The process Xt = Wt − tW1 is called the Brownian bridge fixed at both 0 and 1. Since we can also write

Xt = Wt − tWt + tWt − tW1 = (1 − t)(Wt − W0) − t(W1 − Wt),

using that the increments Wt − W0 and W1 − Wt are independent and normally distributed, with

Wt − W0 ∼ N(0, t), W1 − Wt ∼ N(0, 1 − t),

it follows that Xt is normally distributed with

E[Xt] = (1 − t)E[Wt − W0] − tE[W1 − Wt] = 0,
Var[Xt] = (1 − t)² Var[Wt − W0] + t² Var[W1 − Wt] = (1 − t)²t + t²(1 − t) = t(1 − t).

This can also be stated by saying that the Brownian bridge tied at 0 and 1 is a Gaussian process with mean 0 and variance t(1 − t), so Xt ∼ N(0, t(1 − t)).

Exercise 3.5.1 Let Xt = Wt − tW1, 0 ≤ t ≤ 1, be a Brownian bridge fixed at 0 and 1. Let Yt = Xt². Show that Y0 = Y1 = 0 and find E[Yt] and Var(Yt).

3.6 Brownian Motion with Drift

The process Yt = µt + Wt, t ≥ 0, is called Brownian motion with drift. The process Yt tends to drift off at the rate µ. It starts at Y0 = 0 and it is a Gaussian process with mean E[Yt] = µt + E[Wt] = µt and variance Var[Yt] = Var[µt + Wt] = Var[Wt] = t.

Exercise 3.6.1 Find the distribution and the density functions of the process Yt.

3.7 Bessel Process

This section deals with the process satisfied by the Euclidean distance from the origin to a particle following a Brownian motion in Rn.
More precisely, if W1(t), · · · , Wn(t) are independent Brownian motions, let W(t) = (W1(t), · · · , Wn(t)) be a Brownian motion in Rn, n ≥ 2. The process

Rt = dist(O, W(t)) = √( W1(t)² + · · · + Wn(t)² )

is called the n-dimensional Bessel process. The probability density of this process is given by the following result.

Proposition 3.7.1 The probability density function of Rt, t > 0, is given by

pt(ρ) = ( 2/((2t)^{n/2} Γ(n/2)) ) ρ^{n−1} e^{−ρ²/(2t)} for ρ ≥ 0, and pt(ρ) = 0 for ρ < 0,

with

Γ(n/2) = (n/2 − 1)! for n even, and Γ(n/2) = (n/2 − 1)(n/2 − 2) · · · (3/2)(1/2)√π for n odd.

Proof: Since the Brownian motions W1(t), . . . , Wn(t) are independent, their joint density function is

f_{W1···Wn}(t) = f_{W1(t)} · · · f_{Wn(t)} = (1/(2πt)^{n/2}) e^{−(x1² + ··· + xn²)/(2t)}, t > 0.

In the next computation we shall use the following integration formula, which follows from the use of polar coordinates:

∫_{|x|≤ρ} f(x) dx = σ(S^{n−1}) ∫0^ρ r^{n−1} g(r) dr,   (3.7.3)

where f(x) = g(|x|) is a function on Rn with spherical symmetry, and where σ(S^{n−1}) = 2π^{n/2}/Γ(n/2) is the area of the (n − 1)-dimensional unit sphere in Rn. Let ρ ≥ 0. The distribution function of Rt is

F_R(ρ) = P(Rt ≤ ρ) = ∫_{Rt ≤ ρ} f_{W1···Wn}(t) dx1 · · · dxn
= (1/(2πt)^{n/2}) ∫_{x1² + ··· + xn² ≤ ρ²} e^{−(x1² + ··· + xn²)/(2t)} dx1 · · · dxn
= (σ(S^{n−1})/(2πt)^{n/2}) ∫0^ρ r^{n−1} e^{−r²/(2t)} dr.

Differentiating yields

pt(ρ) = d/dρ F_R(ρ) = (σ(S^{n−1})/(2πt)^{n/2}) ρ^{n−1} e^{−ρ²/(2t)} = ( 2/((2t)^{n/2} Γ(n/2)) ) ρ^{n−1} e^{−ρ²/(2t)}, ρ > 0, t > 0.

It is worth noting that in the 2-dimensional case the aforementioned density becomes a particular case of the Weibull distribution with parameters m = 2 and α = 2t, known as the Rayleigh distribution:

pt(x) = (x/t) e^{−x²/(2t)}, x > 0, t > 0.

Exercise 3.7.2 Let P(Rt ≤ ρ) be the probability of a 2-dimensional Brownian motion to be inside of the disk D(0, ρ) at time t > 0. Show that

(ρ²/(2t))(1 − ρ²/(4t)) < P(Rt ≤ ρ) < ρ²/(2t).

Exercise 3.7.3 Let Rt be a 2-dimensional Bessel process.
Show that √ (a) E[Rt ] = 2πt/2; (b) V ar(Rt ) = 2t 1 − π4 . Rt , t > 0, where Rt is a 2-dimensional Bessel process. Show Exercise 3.7.4 Let Xt = t that Xt → 0 as t → ∞ in mean square. 3.8 The Poisson Process A Poisson process describes the number of occurrences of a certain event before time t, such as: the number of electrons arriving at an anode until time t; the number of cars arriving at a gas station until time t; the number of phone calls received on a certain day until time t; the number of visitors entering a museum on a certain day until time t; the number of earthquakes that occurred in Chile during the time interval [0, t]; the number of shocks in the stock market from the beginning of the year until time t; the number of twisters that might hit Alabama from the beginning of the century until time t. The definition of a Poisson process is stated more precisely in the following. Definition 3.8.1 A Poisson process is a stochastic process Nt , t ≥ 0, which satisfies 1. The process starts at the origin, N0 = 0; 2. Nt has independent increments; 3. The process Nt is right continuous in t, with left hand limits; 4. The increments Nt − Ns , with 0 < s < t, have a Poisson distribution with parameter λ(t − s), i.e. P (Nt − Ns = k) = λk (t − s)k −λ(t−s) e . k! 53 It can be shown that condition 4 in the previous definition can be replaced by the following two conditions: P (Nt − Ns = 1) = λ(t − s) + o(t − s) (3.8.4) P (Nt − Ns ≥ 2) = o(t − s), (3.8.5) where o(h) denotes a quantity such that limh→0 o(h)/h = 0. Then the probability that a jump of size 1 occurs in the infinitesimal interval dt is equal to λdt, and the probability that at least 2 events occur in the same small interval is zero. This implies that the random variable dNt may take only two values, 0 and 1, and hence satisfies P (dNt = 1) = λ dt (3.8.6) P (dNt = 0) = 1 − λ dt. (3.8.7) Exercise 3.8.2 Show that if condition 4 is satisfied, then conditions (3.8.4 − 3.8.5) hold. 
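The rate conditions (3.8.4)–(3.8.5) and the identities E[Nt] = Var[Nt] = λt can be illustrated numerically. The sketch below is an editorial addition; the rate, times, and sample sizes are arbitrary choices of mine.

```python
import numpy as np

# Numeric sketch of the Poisson process facts above:
# E[N_t] = Var[N_t] = lam*t, and over a short interval of length h,
# P(one jump)/h -> lam while P(two or more jumps)/h -> 0.
rng = np.random.default_rng(7)
lam, n = 2.0, 1_000_000

N_t = rng.poisson(lam * 3.0, n)        # N_t ~ Poisson(lam * t) with t = 3
print(N_t.mean(), N_t.var())           # both ≈ lam * t = 6.0

for h in (0.1, 0.01, 0.001):
    N_h = rng.poisson(lam * h, n)
    print(h, np.mean(N_h == 1) / h, np.mean(N_h >= 2) / h)
    # first ratio → lam = 2, second ratio → 0 as h shrinks
```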
Exercise 3.8.3 Which of the following expressions are o(h)? (a) f (h) = 3h2 + h; √ (b) f (h) = h + 5; (c) f (h) = h ln |h|; (d) f (h) = heh . The fact that Nt − Ns is stationary can be stated as P (Nt+s − Ns ≤ n) = P (Nt − N0 ≤ n) = P (Nt ≤ n) = n X (λt)k k=0 k! e−λt . From condition 4 we get the mean and variance of increments E[Nt − Ns ] = λ(t − s), V ar[Nt − Ns ] = λ(t − s). In particular, the random variable Nt is Poisson distributed with E[Nt ] = λt and V ar[Nt ] = λt. The parameter λ is called the rate of the process. This means that the events occur at the constant rate λ. Since the increments are independent, we have for 0 < s < t E[Ns Nt ] = E[(Ns − N0 )(Nt − Ns ) + Ns2 ] = E[Ns − N0 ]E[Nt − Ns ] + E[Ns2 ] = λs · λ(t − s) + (V ar[Ns ] + E[Ns ]2 ) = λ2 st + λs. As a consequence we have the following result: (3.8.8) 54 Proposition 3.8.4 Let 0 ≤ s ≤ t. Then 1. Cov(Ns , Nt ) = λs; r 2. Corr(Ns , Nt ) = s . t Proof: 1. Using (3.8.8) we have Cov(Ns , Nt ) = E[Ns Nt ] − E[Ns ]E[Nt ] = λ2 st + λs − λsλt = λs. 2. Using the formula for the correlation yields Corr(Ns , Nt ) = Cov(Ns , Nt ) λs = = 1/2 (V ar[Ns ]V ar[Nt ]) (λsλt)1/2 r s . t It worth noting the similarity with Proposition 3.1.9. Proposition 3.8.5 Let Nt be Ft -adapted. Then the process Mt = Nt − λt is an Ft martingale. Proof: Let s < t and write Nt = Ns + (Nt − Ns ). Then E[Nt |Fs ] = E[Ns + (Nt − Ns )|Fs ] = E[Ns |Fs ] + E[Nt − Ns |Fs ] = Ns + E[Nt − Ns ] = Ns + λ(t − s), where we used that Ns is Fs -predictable (and hence E[Ns |Fs ] = Ns ) and that the increment Nt − Ns is independent of previous values of Ns and the information set Fs . Subtracting λt yields E[Nt − λt|Fs ] = Ns − λs, or E[Mt |Fs ] = Ms . Since it is obvious that Mt is integrable and Ft -adapted, it follows that Mt is a martingale. It is worth noting that the Poisson process Nt is not a martingale. The martingale process Mt = Nt − λt is called the compensated Poisson process. Exercise 3.8.6 Compute E[Nt2 |Fs ] for s < t. 
Is the process Nt2 an Fs -martingale? 55 Exercise 3.8.7 (a) Show that the moment generating function of the random variable Nt is x mNt (x) = eλt(e −1) . (b) Deduce the expressions for the first few moments E[Nt ] = λt E[Nt2 ] = λ2 t2 + λt E[Nt3 ] = λ3 t3 + 3λ2 t2 + λt E[Nt4 ] = λ4 t4 + 6λ3 t3 + 7λ2 t2 + λt. (c) Show that the first few central moments are given by E[Nt − λt] = 0 E[(Nt − λt)2 ] = λt E[(Nt − λt)3 ] = λt E[(Nt − λt)4 ] = 3λ2 t2 + λt. Exercise 3.8.8 Find the mean and variance of the process Xt = eNt . Exercise 3.8.9 (a) Show that the moment generating function of the random variable Mt is x mMt (x) = eλt(e −x−1) . (b) Let s < t. Verify that E[Mt − Ms ] = 0, E[(Mt − Ms )2 ] = λ(t − s), E[(Mt − Ms )3 ] = λ(t − s), E[(Mt − Ms )4 ] = λ(t − s) + 3λ2 (t − s)2 . Exercise 3.8.10 Let s < t. Show that V ar[(Mt − Ms )2 ] = λ(t − s) + 2λ2 (t − s)2 . 3.9 Interarrival times For each state of the world, ω, the path t → Nt (ω) is a step function that exhibits unit jumps. Each jump in the path corresponds to an occurrence of a new event. Let T1 be the random variable which describes the time of the 1st jump. Let T2 be the time between the 1st jump and the second one. In general, denote by Tn the time elapsed between the (n − 1)th and nth jumps. The random variables Tn are called interarrival times. 56 Proposition 3.9.1 The random variables Tn are independent and exponentially distributed with mean E[Tn ] = 1/λ. Proof: We start by noticing that the events {T1 > t} and {Nt = 0} are the same, since both describe the situation that no events occurred until after time t. Then P (T1 > t) = P (Nt = 0) = P (Nt − N0 = 0) = e−λt , and hence the distribution function of T1 is FT1 (t) = P (T1 ≤ t) = 1 − P (T1 > t) = 1 − e−λt . Differentiating yields the density function fT1 (t) = d FT (t) = λe−λt . dt 1 It follows that T1 is has an exponential distribution, with E[T1 ] = 1/λ. 
In order to show that the random variables T1 and T2 are independent, if suffices to show that P (T2 ≤ t) = P (T2 ≤ t|T1 = s), i.e. the distribution function of T2 is independent of the values of T1 . We note first that from the independent increments property P 0 jumps in (s, s + t], 1 jump in (0, s] = P (Ns+t − Ns = 0, Ns − N0 = 1) = P (Ns+t − Ns = 0)P (Ns − N0 = 1) = P 0 jumps in (s, s + t] P 1 jump in (0, s] . Then the conditional distribution of T2 is F (t|s) = P (T2 ≤ t|T1 = s) = 1 − P (T2 > t|T1 = s) P (T2 > t, T1 = s) = 1− P (T1 = s) P 0 jumps in (s, s + t], 1 jump in (0, s] = 1− P (T1 = s) P 0 jumps in (s, s + t] P 1 jump in (0, s] = 1− = 1 − P 0 jumps in (s, s + t] P 1 jump in (0, s] = 1 − P (Ns+t − Ns = 0) = 1 − e−λt , which is independent of s. Then T2 is independent of T1 and exponentially distributed. A similar argument for any Tn leads to the desired result. 57 3.10 Waiting times The random variable Sn = T1 + T2 + · · · + Tn is called the waiting time until the nth jump. The event {Sn ≤ t} means that there are n jumps that occurred before or at time t, i.e. there are at least n events that happened up to time t; the event is equal to {Nt ≥ n}. Hence the distribution function of Sn is given by FSn (t) = P (Sn ≤ t) = P (Nt ≥ n) = e−λt ∞ X (λt)k · k! k=n Differentiating we obtain the density function of the waiting time Sn fSn (t) = d λe−λt (λt)n−1 FSn (t) = · dt (n − 1)! Writing fSn (t) = tn−1 e−λt , (1/λ)n Γ(n) it turns out that Sn has a gamma distribution with parameters α = n and β = 1/λ. It follows that n n E[Sn ] = , V ar[Sn ] = 2 · λ λ The relation lim E[Sn ] = ∞ states that the expectation of the waiting time is unn→∞ bounded as n → ∞. Exercise 3.10.1 Prove that λe−λt (λt)n−1 d FSn (t) = · dt (n − 1)! Exercise 3.10.2 Using that the interarrival times T1 , T2 , · · · are independent and exponentially distributed, compute directly the mean E[Sn ] and variance V ar(Sn ). 
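The gamma moments E[Sn] = n/λ and Var[Sn] = n/λ² can be confirmed directly by summing simulated exponential interarrival times, in the spirit of Exercise 3.10.2. This sketch and its parameters are my own choices, not part of the text.

```python
import numpy as np

# The waiting time S_n = T_1 + ... + T_n, a sum of independent Exp(lam)
# interarrival times, should have mean n/lam and variance n/lam^2.
rng = np.random.default_rng(8)
lam, n_jumps, n_paths = 1.5, 10, 200_000

T = rng.exponential(1.0 / lam, (n_paths, n_jumps))  # interarrival times
S_n = T.sum(axis=1)                                 # waiting time of 10th jump

print(S_n.mean(), n_jumps / lam)        # both ≈ 6.667
print(S_n.var(), n_jumps / lam**2)      # both ≈ 4.444
```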
3.11 The Integrated Poisson Process The function u → Nu is continuous with the exception of a set of countable jumps of size 1. It is known that such functions are Riemann integrable, so it makes sense to define the process Z t Nu du, Ut = 0 called the integrated Poisson process. The next result provides a relation between the process Ut and the partial sum of the waiting times Sk . Proposition 3.11.1 The integrated Poisson process can be expressed as Ut = tNt − Nt X k=1 Sk . 58 Figure 3.2: The Poisson process Nt and the waiting times S1 , S2 , · · · Sn . The shaded rectangle has area n(Sn+1 − t). Let Nt = n. Since Nt is equal to k between the waiting times Sk and Sk+1 , the process Ut , which is equal to the area of the subgraph of Nu between 0 and t, can be expressed as Z t Nu du = 1 · (S2 − S1 ) + 2 · (S3 − S2 ) + · · · + n(Sn+1 − Sn ) − n(Sn+1 − t). Ut = 0 Since Sn < t < Sn+1 , the difference of the last two terms represents the area of last the rectangle, which has the length t − Sn and the height n. Using associativity, a computation yields 1 · (S2 − S1 ) + 2 · (S3 − S2 ) + · · · + n(Sn+1 − Sn ) = nSn+1 − (S1 + S2 + · · · + Sn ). Substituting in the aforementioned relation yields Ut = nSn+1 − (S1 + S2 + · · · + Sn ) − n(Sn+1 − t) = nt − (S1 + S2 + · · · + Sn ) = tNt − where we replaced n by Nt . Nt X Sk , k=1 The conditional distribution of the waiting times is provided by the following useful result. Theorem 3.11.2 Given that Nt = n, the waiting times S1 , S2 , · · · , Sn have the the joint density function given by f (s1 , s2 , · · · , sn ) = n! , tn 0 < s1 ≤ s2 ≤ · · · ≤ sn < t. 59 This is the same as the density of an ordered sample of size n from a uniform distribution on the interval (0, t). A naive explanation of this result is as follows. If we know that there will be exactly n events during the time interval (0, t), since the events can occur at any time, each of them can be considered uniformly distributed, with the density f (sk ) = 1/t. 
Since it makes sense to consider the events independent, taking into consideration all possible n! permutaions, the joint density function becomes n! f (s1 , · · · , sn ) = n!f (s1 ) · · · f (sn ) = n . t Exercise 3.11.3 Find the following means (a) E[Ut ]. Nt X (b) E Sk . k=1 Exercise 3.11.4 Show that V ar(Ut ) = λt3 . 3 Exercise 3.11.5 Can you apply a similar proof as in Proposition 3.3.2 to show that the integrated Poisson process Ut is also a Poisson process? Exercise 3.11.6 Let Y : Ω → N be a discrete random variable. Then for any random variable X we have X E[X] = E[X|Y = y]P (Y = y). y≥0 Exercise 3.11.7 Use Exercise 3.11.6 to solve Exercise 3.11.3 (b). Exercise 3.11.8 (a) Let Tk be the kth interarrival time. Show that E[e−σTk ] = λ , λ+σ σ > 0. (b) Let n = Nt . Show that Ut = nt − [nT1 + (n − 1)T2 + · · · + 2Tn−1 + Tn ]. (c) Find the conditional expectation h i Nt = n . −σUt E e (Hint: If know that there are exactly n jumps in the interval [0, T ], it makes sense to consider the arrival time of the jumps Ti independent and uniformly distributed on [0, T ]). (d) Find the expectation i h E e−σUt . 60 3.12 Submartingales A stochastic process Xt on the probability space (Ω, F, P ) is called submartingale with respect to the filtration Ft if: R (a) Ω |Xt | dP < ∞ (Xt integrable); (b) Xt is known if Ft is given (Xt is adaptable to Ft ); (c) E[Xt+s |Ft ] ≥ Xt , ∀t, s ≥ 0 (future predictions exceed the present value). Example 3.12.1 We shall prove that the process Xt = µt + σWt , with µ > 0 is a submartingale. The integrability follows from the inequality |Xt (ω)| ≤ µt + |Wt (ω)| and integrability of Wt . The adaptability of Xt is obvious, and the last property follows from the computation: E[Xt+s |Ft ] = E[µt + σWt+s |Ft ] + µs > E[µt + σWt+s |Ft ] = µt + σE[Wt+s |Ft ] = µt + σWt = Xt , where we used that Wt is a martingale. Example 3.12.2 We shall show that the square of the Brownian motion, Wt2 , is a submartingale. 
Using that Wt2 − t is a martingale, we have 2 2 E[Wt+s |Ft ] = E[Wt+s − (t + s)|Ft ] + t + s = Wt2 − t + t + s = Wt2 + s ≥ Wt2 . The following result supplies examples of submatingales starting from martingales or submartingales. Proposition 3.12.3 (a) If Xt is a martingale and φ a convex function such that φ(Xt ) is integrable, then the process Yt = φ(Xt ) is a submartingale. (b) If Xt is a submartingale and φ an increasing convex function such that φ(Xt ) is integrable, then the process Yt = φ(Xt ) is a submartingale. (c) If Xt is a martingale and f (t) is an increasing, integrable function, then Yt = Xt + f (t) is a submartingale. Proof: (a) Using Jensen’s inequality for conditional probabilities, Exercise 2.11.7, we have E[Yt+s |Ft ] = E[φ(Xt+s )|Ft ] ≥ φ E[Xt+s |Ft ] = φ(Xt ) = Yt . 61 (b) From the submatingale property and monotonicity of φ we have φ E[Xt+s |Ft ] ≥ φ(Xt ). Then apply a similar computation as in part (a). (c) We shall check only the forcast property, since the other are obvious. E[Yt+s |Ft ] = E[Xt+s + f (t + s)|Ft ] = E[Xt+s |Ft ] + f (t + s) = Xt + f (t + s) ≥ Xt + f (t) = Yt , ∀s, t > 0. Corollary 3.12.4 (a) Let Xt be a right continuous martingale. Then Xt2 , |Xt |, eXt are submartingales. (b) Let µ > 0. Then eµt+σWt is a submartingale. Proof: (a) Results from part (a) of Proposition 3.12.3. (b) It follows from Example 3.12 and part (b) of Proposition 3.12.3. The following result provides important inequalities involving submartingales. Proposition 3.12.5 (Doob’s Submartingale Inequality) (a) Let Xt be a non-negative submartingale. Then P (sup Xs ≥ x) ≤ s≤t E[Xt ] , x ∀x > 0. (b) If Xt is a right continuous submartingale, then for any x > 0 P (sup Xt ≥ x) ≤ s≤t E[Xt+ ] , x where Xt+ = max{Xt , 0}. Exercise 3.12.6 Let x > 0. Show the inequalities: t (a) P (sup Ws2 ≥ x) ≤ . x s≤t p 2t/π . (b) P (sup |Ws | ≥ x) ≤ x s≤t sups≤t |Ws | = 0. 
t→∞ t Exercise 3.12.7 Show that st-lim 62 Exercise 3.12.8 Show that for any martingale Xt we have the inequality P (sup Xt2 > x) ≤ s≤t E[Xt2 ] , x ∀x > 0. It is worth noting that Doob’s inequality implies Markov’s inequality. Since sup Xs ≥ s≤t Xt , then P (Xt ≥ x) ≤ P (sup Xs ≥ x). Then Doob’s inequality s≤t P (sup Xs ≥ x) ≤ s≤t E[Xt ] x implies Markov’s inequality (see Theorem 2.11.9) P (Xt ≥ x) ≤ E[Xt ] . x Exercise 3.12.9 Let Nt denote the Poisson process and consider the information set Ft = σ{Ns ; s ≤ t}. (a) Show that Nt is a submartingale; (b) Is Nt2 a submartingale? Exercise 3.12.10 It can be shown that for any 0 < σ < τ we have the inequality E 2 i 4τ λ h X N t −λ ≤ 2 · t σ σ≤t≤τ Nt = λ. t→∞ t Using this inequality prove that ms–lim Chapter 4 Properties of Stochastic Processes 4.1 Stopping Times Consider the probability space (Ω, F, P ) and the filtration (Ft )t≥0 , i.e. Ft ⊂ Fs ⊂ F, ∀t < s. Assume that the decision to stop playing a game before or at time t is determined by the information Ft available at time t. Then this decision can be modeled by a random variable τ : Ω → [0, ∞] which satisfies {ω; τ (ω) ≤ t} ∈ Ft , ∀t ≥ 0. This means that given the information set Ft , we know whether the event {ω; τ (ω) ≤ t} had occurred or not. We note that the possibility τ = ∞ is included, since the decision to continue the game for ever is also a possible event. A random variable τ with the previous properties is called a stopping time. The next example illustrates a few cases when a decision is or is not a stopping time. In order to accomplish this, think of the situation that τ is the time when some random event related to a given stochastic process occurs first. Example 4.1.1 Let Ft be the information available until time t regarding the evolution of a stock. Assume the price of the stock at time t = 0 is $50 per share. 
The following decisions are stopping times: (a) Sell the stock when it reaches for the first time the price of $100 per share; (b) Buy the stock when it reaches for the first time the price of $10 per share; (c) Sell the stock at the end of the year; 63 64 (d) Sell the stock either when it reaches for the first time $80 or at the end of the year. (e) Keep the stock either until the initial investment doubles or until the end of the year; The following decision is not a stopping time: (f ) Sell the stock when it reaches the maximum level it will ever be; Part (f ) is not a stopping time because it requires information about the future that is not contained in Ft . In part (e) there are two conditions; the latter one has the occurring probability equal to 1. Exercise 4.1.2 Show that any positive constant, τ = c, is a stopping time with respect to any filtration. Exercise 4.1.3 Let τ (ω) = inf{t > 0; |Wt (ω)| ≥ K}, with K > 0 constant. Show that τ is a stopping time with respect to the filtration Ft = σ(Ws ; s ≤ t). The random variable τ is called the first exit time of the Brownian motion Wt from the interval (−K, K). In a similar way one can define the first exit time of the process Xt from the interval (a, b): τ (ω) = inf{t > 0; Xt (ω) ∈ / (a, b)} = inf{t > 0; Xt (ω) ≥ b or Xt (ω) ≤ a)}. Let X0 < a. The first entry time of Xt in the interval [a, b] is defined as τ (ω) = inf{t > 0; Xt (ω) ∈ [a, b]}. If let b = ∞, we obtain the first hitting time of the level a τ (ω) = inf{t > 0; Xt (ω) ≥ a)}. We shall deal with hitting times in more detail in section 4.3. Exercise 4.1.4 Let Xt be a continuous stochastic process. Prove that the first exit time of Xt from the interval (a, b) is a stopping time. We shall present in the following some properties regarding operations with stopping times. Consider the notations τ1 ∨ τ2 = max{τ1 , τ2 }, τ1 ∧ τ2 = min{τ1 , τ2 }, τ̄n = supn≥1 τn and τ n = inf n≥1 τn . 
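The first exit time of Exercise 4.1.3 is easy to simulate. The sketch below (discretization, horizon, and parameters are mine) estimates E[τ] and compares it with K²; that reference value is the standard fact E[τ] = K², obtained heuristically by applying the stopping theorem of the next section to the martingale Wt² − t.

```python
import math
import numpy as np

# Simulate tau = inf{t : |W_t| >= K} on a time grid and estimate its mean.
# Discretization slightly overestimates tau (crossings between grid points
# are missed), so the match with K^2 is approximate.
rng = np.random.default_rng(13)
K, dt, n_paths, t_max = 1.0, 0.001, 20_000, 20.0

W = np.zeros(n_paths)
tau = np.full(n_paths, t_max)
alive = np.ones(n_paths, dtype=bool)   # paths that have not yet exited
t = 0.0
while t < t_max and alive.any():
    t += dt
    W[alive] += rng.normal(0.0, math.sqrt(dt), alive.sum())
    crossed = alive & (np.abs(W) >= K)
    tau[crossed] = t
    alive &= ~crossed

print(tau.mean())                      # ≈ K^2 = 1.0
```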
Proposition 4.1.5 Let τ1 and τ2 be two stopping times with respect to the filtration Ft . Then 1. τ1 ∨ τ2 2. τ1 ∧ τ2 3. τ1 + τ2 are stopping times. 65 Proof: 1. We have {ω; τ1 ∨ τ2 ≤ t} = {ω; τ1 ≤ t} ∩ {ω; τ2 ≤ t} ∈ Ft , since {ω; τ1 ≤ t} ∈ Ft and {ω; τ2 ≤ t} ∈ Ft . From the subadditivity of P we get P {ω; τ1 ∨ τ2 = ∞} = P {ω; τ1 = ∞} ∪ {ω; τ2 = ∞} ≤ P {ω; τ1 = ∞} + P {ω; τ2 = ∞} . Since P {ω; τi = ∞} = 1 − P {ω; τi < ∞} = 0, i = 1, 2, it follows that P {ω; τ1 ∨ τ2 = ∞} = 0 and hence P {ω; τ1 ∨ τ2 < ∞} = 1. Then τ1 ∨ τ2 is a stopping time. 2. The event {ω; τ1 ∧ τ2 ≤ t} ∈ Ft if and only if {ω; τ1 ∧ τ2 > t} ∈ Ft . {ω; τ1 ∧ τ2 > t} = {ω; τ1 > t} ∩ {ω; τ2 > t} ∈ Ft , since {ω; τ1 > t} ∈ Ft and {ω; τ2 > t} ∈ Ft (the σ-algebra Ft is closed to complements). The fact that τ1 ∧ τ2 < ∞ almost surely has a similar proof. 3. We note that τ1 + τ2 ≤ t if there is a c ∈ (0, t) such that τ1 ≤ c, τ2 ≤ t − c. Using that the rational numbers are dense in R, we can write [ {ω; τ1 ≤ c} ∩ {ω; τ2 ≤ t − c} ∈ Ft , {ω; τ1 + τ2 ≤ t} = 0<c<t,c∈Q since {ω; τ1 ≤ c} ∈ Fc ⊂ Ft , {ω; τ2 ≤ t − c} ∈ Ft−c ⊂ Ft . Writing {ω; τ1 + τ2 = ∞} = {ω; τ1 = ∞} ∪ {ω; τ2 = ∞} yields P {ω; τ1 + τ2 = ∞} ≤ P {ω; τ1 = ∞} + P {ω; τ2 = ∞} = 0, Hence P {ω; τ1 + τ2 < ∞} = 1. It follows that τ1 + τ2 is a stopping time. A filtration Ft is called right-continuous if Ft = ∞ \ n=1 Ft+ 1 , for t > 0. This means n that the information available at time t is a good approximation for any future infinitesimal information Ft+ǫ ; or, equivalently, nothing more can be learned by peeking infinitesimally far into the future. 66 Exercise 4.1.6 (a) Let Ft = σ{Ws ; s ≤ t}, where Wt is a Brownian motion. Is Ft right-continuous? (b) Let Nt = σ{Ns ; s ≤ t}, where Nt is a Poisson motion. Is Nt right-continuous? Proposition 4.1.7 Let Ft be right-continuous and (τn )n≥1 be a sequence of bounded stopping times. (a) Then supn τn and inf τn are stopping times. (b) If the sequence (τn )n≥1 converges to τ , τ 6= 0, then τ is a stopping time. 
Proof: (a) The fact that τ̄n is a stopping time follows from \ {ω; τ̄n ≤ t} ⊂ {ω; τn ≤ t} ∈ Ft , n≥1 and from the boundedness, which implies P (τn < ∞) = 1. In order to show that τ n is a stopping time we shall proceed as in the following. Using that Ft is right-continuous and closed to complements, it suffices to show that {ω; τ n ≥ t} ∈ Ft . This follows from \ {ω; τn > t} ∈ Ft . {ω; τ n ≥ t} = n≥1 (b) Let lim τn = τ . Then there is an increasing (or decreasing) subsequence τnk of n→∞ stopping times that tends to τ , so sup τnk = τ (or inf τnk = τ ). Since τnk are stopping times, by part (a), it follows that τ is a stopping time. The condition that τn is bounded is significant, since if take τn = n as stopping times, then supn τn = ∞ with probability 1, which does not satisfy the stopping time definition. Exercise 4.1.8 Let τ be a stopping time. (a) Let c ≥ 1 be a constant. Show that cτ is a stopping time. (b) Let f : [0, ∞) → R be a continuous, increasing function satisfying f (t) ≥ t. Prove that f (τ ) is a stopping time. (c) Show that eτ is a stopping time. Exercise 4.1.9 Let τ be a stopping time and c > 0 a constant. Prove that τ + c is a stopping time. Exercise 4.1.10 Let a be a constant and define τ = inf{t ≥ 0; Wt = a}. Is τ a stopping time? Exercise 4.1.11 Let τ be a stopping time. Consider the following sequence τn = (m + 1)2−n if m2−n ≤ τ < (m + 1)2−n (stop at the first time of the form k2−n after τ ). Prove that τn is a stopping time. 67 4.2 Stopping Theorem for Martingales The next result states that in a fair game, the expected final fortune of a gambler, who is using a stopping time to quit the game, is the same as the expected initial fortune. From the financial point of view, the theorem says that if you buy an asset at some initial time and adopt a strategy of deciding when to sell it, then the expected price at the selling time is the initial price; so one cannot make money by buying and selling an asset whose price is a martingale. 
Fortunately, the price of a stock is not a martingale, and people can still expect to make money buying and selling stocks. If (Mt)t≥0 is an Ft-martingale, then taking the expectation in E[Mt|Fs] = Ms, ∀s < t, and using Example 2.10.4 yields E[Mt] = E[Ms], ∀s < t. In particular, E[Mt] = E[M0], for any t > 0. The next result states sufficient conditions under which this identity still holds if t is replaced by a stopping time τ. Theorem 4.2.1 (Optional Stopping Theorem) Let (Mt)t≥0 be a right continuous Ft-martingale and τ be a stopping time with respect to Ft such that τ is bounded, i.e. ∃N < ∞ such that τ ≤ N. Then E[Mτ] = E[M0]. If Mt is an Ft-submartingale, then E[Mτ] ≥ E[M0]. Proof: Taking the expectation in the relation Mτ = Mτ∧t + (Mτ − Mt)1{τ>t}, see Exercise 4.2.3, yields E[Mτ] = E[Mτ∧t] + E[Mτ 1{τ>t}] − E[Mt 1{τ>t}]. Since Mτ∧t is a martingale, see Exercise 4.2.4 (b), we have E[Mτ∧t] = E[M0]. The previous relation becomes E[Mτ] = E[M0] + E[Mτ 1{τ>t}] − E[Mt 1{τ>t}], ∀t > 0. Taking the limit yields E[Mτ] = E[M0] + lim_{t→∞} E[Mτ 1{τ>t}] − lim_{t→∞} E[Mt 1{τ>t}]. (4.2.1) We shall show that both limits are equal to zero. Since |Mτ 1{τ>t}| ≤ |Mτ|, ∀t > 0, and Mτ is integrable, see Exercise 4.2.4 (a), by the dominated convergence theorem we have lim_{t→∞} E[Mτ 1{τ>t}] = lim_{t→∞} ∫_Ω Mτ 1{τ>t} dP = ∫_Ω lim_{t→∞} Mτ 1{τ>t} dP = 0. For the second limit, lim_{t→∞} E[Mt 1{τ>t}] = lim_{t→∞} ∫_Ω Mt 1{τ>t} dP = 0, since for t > N the integrand vanishes. Hence relation (4.2.1) yields E[Mτ] = E[M0]. It is worth noting that the previous theorem is a special case of the more general Optional Stopping Theorem of Doob: Theorem 4.2.2 Let Mt be a right continuous martingale and σ, τ be two bounded stopping times, with σ ≤ τ. Then Mσ, Mτ are integrable and E[Mτ|Fσ] = Mσ a.s. In particular, taking expectations, we have E[Mτ] = E[Mσ]. In the case when Mt is a submartingale, E[Mτ] ≥ E[Mσ].
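The Optional Stopping Theorem can be checked numerically. The sketch below is a hypothetical illustration with parameters of our own choosing: it estimates E[W_τ] by Monte Carlo for the bounded stopping time τ = min(first time |W| reaches 1, 5); since Wt is a martingale with W0 = 0, the estimate should be near zero.

```python
import math
import random

def stopped_value(rng, barrier=1.0, horizon=5.0, dt=0.01):
    """Run one discretized Brownian path until |W| reaches the barrier or
    the horizon is reached (a bounded stopping time); return W at that moment."""
    w, t = 0.0, 0.0
    while t < horizon:
        w += rng.gauss(0.0, math.sqrt(dt))
        t += dt
        if abs(w) >= barrier:
            break
    return w

rng = random.Random(0)
n = 5000
# By the Optional Stopping Theorem, E[W_tau] = E[W_0] = 0,
# so the Monte Carlo estimate should be close to zero.
estimate = sum(stopped_value(rng) for _ in range(n)) / n
```

Averaging over many paths, the stopped values near +1 and −1 (and the unstopped values at the horizon) balance out, which is exactly the content of the theorem for this bounded τ.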
Exercise 4.2.3 Show that Mτ = Mτ∧t + (Mτ − Mt)1{τ>t}, where 1{τ>t}(ω) = 1 if τ(ω) > t and 0 if τ(ω) ≤ t is the indicator function of the set {τ > t}. Exercise 4.2.4 Let Mt be a right continuous martingale and τ be a bounded stopping time. Show that (a) Mτ is integrable; (b) Mτ∧t is a martingale. Exercise 4.2.5 Show that letting σ = 0 in Theorem 4.2.2 yields Theorem 4.2.1. Figure 4.1: The first hitting time Ta given by W_{Ta} = a. 4.3 The First Passage of Time The first passage of time is a particular type of hitting time, which is useful in finance when studying barrier options and lookback options. For instance, knock-in options enter into existence when the stock price hits for the first time a certain barrier before option maturity. A lookback option is priced using the maximum value of the stock until the present time. The stock price is not a Brownian motion, but it depends on one; hence the need for studying the hitting time for the Brownian motion. The first result deals with the first hitting time for a Brownian motion to reach the barrier a ∈ R, see Fig. 4.1. Lemma 4.3.1 Let Ta be the first time the Brownian motion Wt hits a. Then the distribution function of Ta is given by P(Ta ≤ t) = (2/√(2π)) ∫_{|a|/√t}^{∞} e^{−y²/2} dy. Proof: If A and B are two events, then P(A) = P(A ∩ B) + P(A ∩ B̄) = P(A|B)P(B) + P(A|B̄)P(B̄). (4.3.2) Let a > 0. Using formula (4.3.2) for A = {ω; Wt(ω) ≥ a} and B = {ω; Ta(ω) ≤ t} yields P(Wt ≥ a) = P(Wt ≥ a|Ta ≤ t)P(Ta ≤ t) + P(Wt ≥ a|Ta > t)P(Ta > t). (4.3.3) If Ta > t, the Brownian motion has not reached the barrier a yet, so we must have Wt < a. Therefore P(Wt ≥ a|Ta > t) = 0. If Ta ≤ t, then W_{Ta} = a. Since the Brownian motion is a Markov process, it starts fresh at Ta. Due to the symmetry of the density function of a normal variable, Wt has equal chances to go up or go down after the time interval t − Ta. It follows that P(Wt ≥ a|Ta ≤ t) = 1/2.
Substituting P(Wt ≥ a|Ta ≤ t) = 1/2 into (4.3.3) yields P(Ta ≤ t) = 2P(Wt ≥ a) = (2/√(2πt)) ∫_a^∞ e^{−x²/(2t)} dx = (2/√(2π)) ∫_{a/√t}^∞ e^{−y²/2} dy. If a < 0, symmetry implies that the distribution of Ta is the same as that of T_{−a}, so we get P(Ta ≤ t) = P(T_{−a} ≤ t) = (2/√(2π)) ∫_{−a/√t}^∞ e^{−y²/2} dy. Remark 4.3.2 The previous proof is based on a more general principle called the Reflection Principle: if τ is a stopping time for the Brownian motion Wt, then the Brownian motion reflected at τ is also a Brownian motion. Theorem 4.3.3 Let a ∈ R be fixed. Then the Brownian motion hits a (in a finite amount of time) with probability 1. Proof: The probability that Wt hits a (in a finite amount of time) is P(Ta < ∞) = lim_{t→∞} P(Ta ≤ t) = lim_{t→∞} (2/√(2π)) ∫_{|a|/√t}^∞ e^{−y²/2} dy = (2/√(2π)) ∫_0^∞ e^{−y²/2} dy = 1, where we used the well-known integral ∫_0^∞ e^{−y²/2} dy = (1/2) ∫_{−∞}^∞ e^{−y²/2} dy = (1/2)√(2π). The previous result stated that the Brownian motion hits the barrier a almost surely. The next result shows that the expected time to hit the barrier is infinite. Proposition 4.3.4 The random variable Ta has a Pearson 5 distribution given by p(t) = (|a|/√(2π)) e^{−a²/(2t)} t^{−3/2}, t > 0. It has the mean E[Ta] = ∞ and the mode a²/3. Proof: Let a > 0. Differentiating in the formula of the distribution function¹ F_{Ta}(t) = P(Ta ≤ t) = (2/√(2π)) ∫_{a/√t}^∞ e^{−y²/2} dy yields the probability density function p(t) = dF_{Ta}(t)/dt = (a/√(2π)) e^{−a²/(2t)} t^{−3/2}, t > 0. This is a Pearson 5 distribution with parameters α = 1/2 and β = a²/2. The expectation is E[Ta] = ∫_0^∞ t p(t) dt = (a/√(2π)) ∫_0^∞ (1/√t) e^{−a²/(2t)} dt. Using the inequality e^{−a²/(2t)} > 1 − a²/(2t), t > 0, and estimating the integral over [1, ∞), we have E[Ta] > (a/√(2π)) ∫_1^∞ (1/√t) dt − (a³/(2√(2π))) ∫_1^∞ (1/t^{3/2}) dt = ∞, since ∫_1^∞ t^{−1/2} dt is divergent while ∫_1^∞ t^{−3/2} dt is convergent. The mode of Ta is given by β/(α + 1) = (a²/2)/(1/2 + 1) = a²/3.
¹ One may use Leibniz's formula: d/dt ∫_{ϕ(t)}^{ψ(t)} f(u) du = f(ψ(t))ψ′(t) − f(ϕ(t))ϕ′(t). Figure 4.2: The distribution of the first hitting time Ta. Remark 4.3.5 The distribution has a peak at a²/3. Hence, if we need to pick a small time interval [t − dt, t + dt] in which the probability that the Brownian motion hits the barrier a is maximal, we need to choose t = a²/3. Remark 4.3.6 The relation E[Ta] = ∞ (4.3.4) states that the expected waiting time for Wt to reach the barrier a is infinite. However, the expected waiting time for the Brownian motion Wt to hit either a or −a is finite, see Exercise 4.3.9. Corollary 4.3.7 A Brownian motion process returns to the origin in a finite amount of time with probability 1. Proof: Choose a = 0 and apply Theorem 4.3.3. Exercise 4.3.8 Try to apply the proof of Lemma 4.3.1 to the following stochastic processes: (a) Xt = µt + σWt, with µ, σ > 0 constants; (b) Xt = ∫_0^t Ws ds. Where is the difficulty? Exercise 4.3.9 Let a > 0 and consider the hitting time τa = inf{t > 0; Wt = a or Wt = −a} = inf{t > 0; |Wt| = a}. Prove that E[τa] = a². Exercise 4.3.10 (a) Show that the distribution function of the process Xt = max_{s∈[0,t]} Ws is given by P(Xt ≤ a) = (2/√(2π)) ∫_0^{a/√t} e^{−y²/2} dy. (b) Show that E[Xt] = √(2t/π) and Var(Xt) = t(1 − 2/π). Exercise 4.3.11 (a) Find the distribution of Yt = |Wt|, t ≥ 0; (b) Show that E[max_{0≤t≤T} |Wt|] = √(πT/2). The fact that a Brownian motion returns to the origin or hits a barrier almost surely is a property characteristic of the first dimension only. The next result states that in larger dimensions this is no longer the case. Theorem 4.3.12 Let (a, b) ∈ R². The 2-dimensional Brownian motion W(t) = (W1(t), W2(t)) (with W1(t) and W2(t) independent) hits the point (a, b) with probability zero. The same result is valid for any n-dimensional Brownian motion, with n ≥ 2.
However, if the point (a, b) is replaced by the disk Dǫ(x0) = {x ∈ R²; |x − x0| ≤ ǫ}, there is a difference in the behavior of the Brownian motion between n = 2 and n > 2. Theorem 4.3.13 The 2-dimensional Brownian motion W(t) = (W1(t), W2(t)) hits the disk Dǫ(x0) with probability one. Theorem 4.3.14 Let n > 2. The n-dimensional Brownian motion W(t) = (W1(t), ..., Wn(t)) hits the ball Dǫ(x0) with probability P = (|x0|/ǫ)^{2−n} < 1. The previous results can be stated by saying that Brownian motion is transient in Rⁿ, for n > 2. If n = 2 the previous probability equals 1. We shall come back with proofs of the aforementioned results in a later chapter. Remark 4.3.15 If life spreads according to a Brownian motion, the aforementioned results explain why life is more extensive on earth than in space. The probability for a form of life to reach a planet of radius R situated at distance d is R/d. Since d is large, this probability is very small, unlike in the plane, where the probability is always 1. Exercise 4.3.16 Is the one-dimensional Brownian motion transient or recurrent in R? 4.4 The Arc-sine Laws In this section we present a few results which provide certain probabilities related to the behavior of a Brownian motion in terms of the arc-sine of a quotient of two time instances. These results are generally known under the name of the law of arc-sines. The following result will be used in the proof of the first law of arc-sine. Proposition 4.4.1 (a) If X : Ω → N is a discrete random variable, then for any subset A ⊂ Ω, we have P(A) = Σ_{x∈N} P(A|X = x)P(X = x). (b) If X : Ω → R is a continuous random variable, then P(A) = ∫ P(A|X = x) dP = ∫ P(A|X = x) f_X(x) dx. Figure 4.3: The event A(a; t1, t2) in the Law of Arc-sine. Proof: (a) The sets X^{−1}(x) = {X = x} = {ω; X(ω) = x} form a partition of the sample space Ω, i.e.: (i) Ω = ⋃_x X^{−1}(x); (ii) X^{−1}(x) ∩ X^{−1}(y) = Ø for x ≠ y.
[ [ Then A = A ∩ X −1 (x) = A ∩ {X = x} , and hence x x P (A) = X P A ∩ {X = x} x X P (A ∩ {X = x}) P ({X = x}) = P ({X = x}) x X = P (A|X = x)P (X = x). x (b) In the case when X is continuous, the sum is replaced by an integral and the probability P ({X = x}) by fX (x)dx, where fX is the density function of X. The zero set of a Brownian motion Wt is defined by {t ≥ 0; Wt = 0}. Since Wt is continuous, the zero set is closed with no isolated points almost surely. The next result deals with the probability that the zero set does not intersect the interval (t1 , t2 ). Theorem 4.4.2 (The law of Arc-sine) The probability that a Brownian motion Wt does not have any zeros in the interval (t1 , t2 ) is equal to 2 P (Wt = 6 0, t1 ≤ t ≤ t2 ) = arcsin π r t1 . t2 Proof: Let A(a; t1 , t2 ) denote the event that the Brownian motion Wt takes on the value a between t1 and t2 . In particular, A(0; t1 , t2 ) denotes the event that Wt has (at least) a zero between t1 and t2 . Substituting A = A(0; t1 , t2 ) and X = Wt1 into the formula provided by Proposition 4.4.1 75 P (A) = yields P A(0; t1 , t2 ) = = Z P (A|X = x)fX (x) dx Z P A(0; t1 , t2 )|Wt1 = x fWt (x) dx 1 Z ∞ − x2 1 √ P A(0; t1 , t2 )|Wt1 = x e 2t1 dx 2πt1 −∞ (4.4.5) Using the properties of Wt with respect to time translation and symmetry we have P A(0; t1 , t2 )|Wt1 = x = P A(0; 0, t2 − t1 )|W0 = x = P A(−x; 0, t2 − t1 )|W0 = 0 = P A(|x|; 0, t2 − t1 )|W0 = 0 = P A(|x|; 0, t2 − t1 ) = P T|x| ≤ t2 − t1 , the last identity stating that Wt hits |x| before t2 − t1 . Using Lemma 4.3.1 yields Z ∞ 2 2 − y P A(0; t1 , t2 )|Wt1 = x = p e 2(t2 −t1 ) dy. 2π(t2 − t1 ) |x| Substituting into (4.4.5) we obtain Z ∞ Z ∞ 2 x2 2 1 − y − p e 2(t2 −t1 ) dy e 2t1 dx P A(0; t1 , t2 ) = √ 2πt1 −∞ 2π(t2 − t1 ) |x| Z ∞Z ∞ 2 2 1 − y −x p e 2(t2 −t1 ) 2t1 dydx. = π t1 (t2 − t1 ) 0 |x| The above integral can be evaluated to get (see Exercise 4.4.3 ) r t1 2 . 
P A(0; t1 , t2 ) = 1 − arcsin π t2 Using P (Wt 6= 0, t1 ≤ t ≤ t2 ) = 1−P A(0; t1 , t2 ) we obtain the desired result. Exercise 4.4.3 Use polar coordinates to show r Z ∞Z ∞ 2 x2 2 1 t1 − 2(t y−t ) − 2t 2 1 1 dydx = 1 − p arcsin . e π t2 π t1 (t2 − t1 ) 0 |x| Exercise 4.4.4 Find the probability that a 2-dimensional Brownian motion W (t) = W1 (t), W2 (t) stays in the same quadrant for the time interval t ∈ (t1 , t2 ). 76 Exercise 4.4.5 Find the probability that a Brownian motion Wt does not take the value a in the interval (t1 , t2 ). Exercise 4.4.6 Let a 6= b. Find the probability that a Brownian motion Wt does not take any of the values {a, b} in the interval (t1 , t2 ). Formulate and prove a generalization. We provide below without proof a few similar results dealing with arc-sine probabilities. The first result deals with the amount of time spent by a Brownian motion on the positive half-axis. Rt + Theorem 4.4.7 (Arc-sine Law of Lévy) Let L+ t = 0 sgn Ws ds be the amount of time a Brownian motion Wt is positive during the time interval [0, t]. Then r τ 2 + . P (Lt ≤ τ ) = arcsin π t The next result deals with the Arc-sine law for the last exit time of a Brownian motion from 0. Theorem 4.4.8 (Arc-sine Law of exit from 0) Let γt = sup{0 ≤ s ≤ t; Ws = 0}. Then r 2 τ , 0 ≤ τ ≤ t. P (γt ≤ τ ) = arcsin π t The Arc-sine law for the time the Brownian motion attains its maximum on the interval [0, t] is given by the next result. Theorem 4.4.9 (Arc-sine Law of maximum) Let Mt = max Ws and define 0≤s≤t θt = sup{0 ≤ s ≤ t; Ws = Mt }. Then 4.5 2 P (θt ≤ s) = arcsin π r s , t 0 ≤ s ≤ t, t > 0. More on Hitting Times In this section we shall deal with results regarding hitting times of Brownian motion with drift. They will be useful in the sequel when pricing barrier options. Theorem 4.5.1 Let Xt = µt + Wt denote a Brownian motion with nonzero drift rate µ, and consider α, β > 0. Then P (Xt goes up to α before down to − β) = e2µβ − 1 . 
e2µβ − e−2µα 77 Proof: Let T = inf{t > 0; Xt = α or Xt = −β} be the first exit time of Xt from the interval (−β, α), which is a stopping time, see Exercise 4.1.4. The exponential process c2 Mt = ecWt − 2 t , t≥0 is a martingale, see Exercise 3.2.4(c). Then E[Mt ] = E[M0 ] = 1. By the Optional Stopping Theorem (see Theorem 4.2.1), we get E[MT ] = 1. This can be written as 1 2 )T 1 2 1 = E[ecWT − 2 c T ] = E[ecXT −(cµ+ 2 c ]. (4.5.6) Choosing c = −2µ yields E[e−2µXT ] = 1. Since the random variable XT takes only the values α and −β, if let pα = P (XT = α), the previous relation becomes e−2µα pα + e2µβ (1 − pα ) = 1. Solving for pα yields pα = Noting that e2µβ − 1 . e2µβ − e−2µα (4.5.7) pα = P (Xt goes up to α before down to − β) leads to the desired answer. It is worth noting how the previous formula changes in the case when the drift rate is zero, i.e. when µ = 0, and Xt = Wt . The previous probability is computed by taking the limit µ → 0 and using L’Hospital’s rule lim 2βe2µβ β e2µβ − 1 = lim = . −2µα 2µβ µ→0 2βe −e + 2αe−2µα α+β µ→0 e2µβ Hence P (Wt goes up to α before down to − β) = β . α+β Taking the limit β → ∞ we recover the following result P (Wt hits α) = 1. If α = β we obtain P (Wt goes up to α before down to − α) = 1 , 2 which shows that the Brownian motion is equally likely to go up or down an amount α in a given time interval. If Tα and Tβ denote the times when the process Xt reaches α and β, respectively, then the aforementioned probabilities can be written using inequalities. For instance the first identity becomes P (Tα ≤ T−β ) = e2µβ − 1 . e2µβ − e−2µα 78 Exercise 4.5.2 Let Xt = µt + Wt denote a Brownian motion with nonzero drift rate µ, and consider α > 0. (a) If µ > 0 show that P (Xt goes up to α) = 1. (b) If µ < 0 show that P (Xt goes up to α) = e2µα < 1. 
Formula (a) can be written equivalently as P (sup(Wt + µt) ≥ α) = 1, µ ≥ 0, t≥0 while formula (b) becomes P (sup(Wt + µt) ≥ α) = e2µα , µ < 0, P (sup(Wt − γt) ≥ α) = e−2γα , γ > 0, t≥0 or t≥0 which is known as one of the Doob’s inequalities. This can be also described in terms of stopping times as follows. Define the stopping time τα = inf{t > 0; Wt − γt ≥ α}. Using P (τα < ∞) = P sup(Wt − γt) ≥ α t≥0 yields the identities P (τα < ∞) = e−2αγ , P (τα < ∞) = 1, γ > 0, γ ≤ 0. Exercise 4.5.3 Let Xt = µt + Wt denote a Brownian motion with nonzero drift rate µ, and consider β > 0. Show that the probability that Xt never hits −β is given by 1 − e−2µβ , if µ > 0 0, if µ < 0. Recall that T is the first time when the process Xt hits α or −β. Exercise 4.5.4 (a) Show that E[XT ] = (b) Find E[XT2 ]; (c) Compute V ar(XT ). αe2µβ + βe−2µα − α − β . e2µβ − e−2µα 79 The next result deals with the time one has to wait (in expectation) for the process Xt = µt + Wt to reach either α or −β. Proposition 4.5.5 The expected value of T is E[T ] = αe2µβ + βe−2µα − α − β . µ(e2µβ − e−2µα ) Proof: Using that Wt is a martingale, with E[Wt ] = E[W0 ] = 0, applying the Optional Stopping Theorem, Theorem 4.2.1, yields 0 = E[WT ] = E[XT − µT ] = E[XT ] − µE[T ]. Then by Exercise 4.5.4(a) we get E[T ] = αe2µβ + be−2µα − α − β E[XT ] = . µ µ(e2µβ − e−2µα ) Exercise 4.5.6 Take the limit µ → 0 in the formula provided by Proposition 4.5.5 to find the expected time for a Brownian motion to hit either α or −β. Exercise 4.5.7 Find E[T 2 ] and V ar(T ). Exercise 4.5.8 (Wald’s identities) Let T be a finite and bounded stopping time for the Brownian motion Wt . Show that: (a) E[WT ] = 0; (b) E[WT2 ] = E[T ]. The previous techniques can be also applied to right continuous martingales. Let a > 0 and consider the hitting time of the Poisson process of the barrier a τ = inf{t > 0; Nt ≥ a}. Proposition 4.5.9 The expected waiting time for Nt to reach the barrier a is E[τ ] = λa . 
Proof: Since Mt = Nt − λt is a right continuous martingale, by the Optional Stopping Theorem E[Mτ ] = E[M0 ] = 0. Then E[Nτ − λτ ] = 0 and hence E[τ ] = λ1 E[Nτ ] = λa . 80 4.6 The Inverse Laplace Transform Method In this section we shall use the Optional Stopping Theorem in conjunction with the inverse Laplace transform to obtain the probability density for hitting times. A. The case of standard Brownian motion Let x > 0. The first hitting time 1 2 τ = Tx = inf{t > 0; Wt = x} is a stopping time. Since Mt = ecWt − 2 c t , t ≥ 0, is a martingale, with E[Mt ] = E[M0 ] = 1, by the Optional Stopping Theorem, see Theorem 4.2.1, we have E[Mτ ] = 1. 1 2 This can be written equivalently as E[ecWτ e− 2 c τ ] = 1. Using Wτ = x, we get 1 2 E[e− 2 c τ ] = e−cx . 1 2 τ It is worth noting that c > 0. This is implied from the fact that e− 2 c τ, x > 0. Substituting s = 21 c2 , the previous relation becomes E[e−sτ ] = e− √ 2sx . < 1 and (4.6.8) This relation has a couple of useful applications. Proposition 4.6.1 The moments of the first hitting time are all infinite E[τ n ] = ∞, n ≥ 1. Proof: The nth moment of τ can be obtained by differentiating and taking s = 0 dn = (−1)n E[τ n ]. = E[(−τ )n e−sτ ] E[e−sτ ] dsn s=0 s=0 Using (4.6.8) yields E[τ n ] = (−1)n Since by induction we have dn −√2sx e dsn . s=0 n−1 √ X Mk xn−k dn −√2sx n − 2sx = (−1) e e , dsn 2rk /2 s(n+k)/2 k=0 with Mk , rk positive integers, it easily follows that E[τ n ] = ∞. For instance, in the case n = 1 we have E[τ ] = − √ x d −√2sx − 2sx √ = lim e e = +∞. ds s→0+ 2 2sx s=0 Another application involves the inverse Laplace transform to get the probability density. This way we can retrieve the result of Proposition 4.3.4. 81 Proposition 4.6.2 The probability density of the hitting time τ is given by |x| − x2 p(t) = √ e 2t , 2πt3 Proof: Let x > 0. The expectation Z E[e−sτ ] = 0 ∞ t > 0. (4.6.9) e−sτ p(τ ) dτ = L{p(τ )}(s) is the Laplace transform of p(τ ). 
Applying the inverse Laplace transform yields p(τ ) = L−1 {E[e−sτ ]}(τ ) = L−1 {e− x2 x e− 2τ , τ > 0. = √ 2πτ 3 √ 2sx }(τ ) In the case x < 0 we obtain −x − x2 e 2τ , p(τ ) = √ 2πτ 3 τ > 0, which leads to (4.6.9). √ The computation on the inverse Laplace transform L−1 {e− 2sx }(τ ) is beyound the goal of this book. The reader can obtain the value of this inverse Laplace transform using the Mathematica software. However, the more mathematically interested reader is referred to consult the method of complex integration in a book on inverse Laplace transforms. Another application of formula (4.6.8) is the following inequality. Proposition 4.6.3 (Chernoff bound) Let τ denote the first hitting time when the Brownian motion Wt hits the barrier x, x > 0. Then x2 P (τ ≤ λ) ≤ e− 2λ , ∀λ > 0. Proof: Let s = −t in the part 2 of Theorem 2.11.11 and use (4.6.8) to get √ E[etX ] E[e−sX ] ∀s > 0. = = eλs−x 2s , λt −λs e e √ Then P (τ ≤ λ) ≤ emins>0 f (s) , where f (s) = λs − x 2s. Since f ′ (s) = λ − P (τ ≤ λ) ≤ f (s) reaches its minimum at the critical point s0 = min f (s) = f (s0 ) = − s>0 x2 2λ2 √x , 2s . The minimum value is x2 . 2λ Substituting in the previous inequality leads to the required result. then 82 B. The case of Brownian motion with drift Consider the Brownian motion with drift Xt = µt + σWt , with µ, σ > 0. Let τ = inf{t > 0; Xt = x} denote the first hitting time of the barrier x, with x > 0. We shall compute the distribution of the random variable τ and its first two moments. Applying the Optional Stopping Theorem (Theorem 4.2.1) to the martingale Mt = cWt − 21 c2 t yields e E[Mτ ] = E[M0 ] = 1. Using that Wτ = σ1 (Xτ − µτ ) and Xτ = x, the previous relation becomes cµ 1 2 )τ E[e−( σ + 2 c Substituting s = c ] = e− σ x . (4.6.10) cµ 1 2 + c and completing to a square yields σ 2 µ2 µ 2 2s + 2 = c + . σ σ Solving for c we get the solutions r µ2 µ c = − + 2s + 2 , σ σ µ c=− − σ r 2s + µ2 . σ2 Assume c < 0. 
Then substituting the second solution into (4.6.10) yields √ 1 (µ+ 2sσ2 +µ2 )x −sτ 2 σ . E[e ] = e √ 1 2 2 This relation is contradictory since e−sτ < 1 while e σ2 (µ+ 2sσ +µ )x > 1, where we used that s, x, τ > 0. Hence it follows that c > 0. Substituting the first solution into (4.6.10) leads to √ 1 E[e−sτ ] = e σ2 (µ− 2sσ2 +µ2 )x . We arrived at the following result: Proposition 4.6.4 Assume µ, x > 0. Let τ be the time the process Xt = µt + σWt hits x for the first time. Then we have √ 2 2 1 s > 0. (4.6.11) E[e−sτ ] = e σ2 (µ− 2sσ +µ )x , Proposition 4.6.5 Let τ be the time the process Xt = µt + σWt hits x, with x > 0 and µ > 0. (a) Then the density function of τ is given by (x−µτ )2 x p(τ ) = √ e− 2τ σ2 , σ 2πτ 3/2 τ > 0. (4.6.12) 83 (b) The mean and variance of τ are E[τ ] = x , µ V ar(τ ) = xσ 2 . µ3 Proof: (a) Let p(τ ) be the density function of τ . Since Z ∞ −sτ e−sτ p(τ ) dτ = L{p(τ )}(s) E[e ] = 0 is the Laplace transform of p(τ ), applying the inverse Laplace transform yields √ 2 2 1 p(τ ) = L−1 {E[e−sτ ]} = L−1 {e σ2 (µ− 2sσ +µ )x } (x−µτ )2 x √ = τ > 0. e− 2τ σ2 , σ 2πτ 3/2 It is worth noting that the computation of the previous inverse Laplace transform is non-elementary; however, it can be easily computed using the Mathematica software. (b) The moments are obtained by differentiating the moment generating function and taking the value at s = 0 √ 2 2 d 1 d E[e−sτ ] = − e σ2 (µ− 2sσ +µ )x ds ds s=0 s=0 √ 1 x (µ− 2sσ2 +µ2 )x 2 p eσ µ2 + 2sµ s=0 x . µ E[τ ] = − = = √ d2 12 (µ− 2sσ2 +µ2 )x d2 −sτ σ E[e ] e = ds2 ds2 s=0 s=0 p √ 2 2 2 2 2 1 x(σ + x µ + 2sσ ) 2 (µ− 2sσ +µ )x eσ (µ2 + 2sσ 2 )3/2 s=0 xσ 2 x2 + 2. µ3 µ E[τ 2 ] = (−1)2 = = Hence xσ 2 . µ3 It is worth noting that we can arrive at the formula E[τ ] = µx in the following heuristic way. Taking the expectation in the equation µτ + σWτ = x yields µE[τ ] = x, where we used that E[Wτ ] = 0 for any finite stopping time τ (see Exercise 4.5.8 (a)). Solving for E[τ ] yields the aforementioned formula. 
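The mean and variance of the hitting time of a drifted Brownian motion, E[τ] = x/µ and Var(τ) = xσ²/µ³, can be tested by simulation. The following sketch is a hypothetical illustration with µ = σ = x = 1 (parameters chosen by us, not from the text); it discretizes X_t = µt + σW_t, records the first time the barrier is reached, and compares the sample mean and variance with the theoretical values 1 and 1.

```python
import math
import random

def hitting_time(rng, mu, sigma, x, dt=0.002, horizon=50.0):
    """First time the discretized process X_t = mu*t + sigma*W_t reaches the
    barrier x; the horizon cap is a practical truncation (its probability of
    being binding is negligible for these parameters)."""
    t, xt = 0.0, 0.0
    while t < horizon:
        xt += mu * dt + sigma * rng.gauss(0.0, math.sqrt(dt))
        t += dt
        if xt >= x:
            return t
    return horizon

rng = random.Random(1)
samples = [hitting_time(rng, mu=1.0, sigma=1.0, x=1.0) for _ in range(4000)]
mean_tau = sum(samples) / len(samples)   # theory: x/mu = 1
var_tau = sum((s - mean_tau) ** 2 for s in samples) / len(samples)  # theory: x*sigma^2/mu^3 = 1
```

Discrete monitoring slightly overestimates τ (crossings between grid points are missed), which is why a small time step is used.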
V ar(τ ) = E[τ 2 ] − E[τ ]2 = Even if the computations are more or less similar with the previous result, we shall treat next the case of the negative barrier in its full length. This is because of its particular importance in being useful in pricing perpetual American puts. 84 Proposition 4.6.6 Assume µ, x > 0. Let τ be the time the process Xt = µt + σWt hits −x for the first time. (a) We have √ −1 (µ+ 2sσ2 +µ2 )x −sτ 2 σ , s > 0. (4.6.13) E[e ] = e (b) Then the density function of τ is given by (x+µτ )2 x e− 2τ σ2 , p(τ ) = √ σ 2πτ 3/2 (c) The mean of τ is E[τ ] = Proof: τ > 0. (4.6.14) x − 2µx e σ2 . µ (a) Consider the stopping time τ = inf{t > 0; Xt = −x}. By the Optional c2 Stopping Theorem (Theorem 4.2.1) applied to the martingale Mt = ecWt − 2 t yields Therefore c c2 c2 1 = M0 = E[Mτ ] = E ecWτ − 2 τ = E e σ (Xτ −µτ )− 2 τ cµ c2 c cµ c2 c = E e− σ x− σ τ − 2 τ = e− σ x E e−( σ + 2 )τ . cµ c2 c E e−( σ + 2 )τ = e σ x . (4.6.15) q 2 µ c2 If let s = cµ 2s + σµ2 , but only the negative σ + 2 , then solving for c yields c = − σ ± solution works out; this comes from the fact that both terms of the equation (4.6.15) have to be less than 1. Hence (4.6.15) becomes √ −1 2 2 E[e−sτ ] = e σ2 (µ+ 2sσ +µ )x , s > 0. (b) Relation (4.6.13) can be written equivalently as √ −1 (µ+ 2sσ2 +µ2 )x 2 σ L p(τ ) = e . Taking the inverse Laplace transform, and using Mathematica software to compute it, we obtain −1 √ 2 2 (x+µτ )2 x −1 e σ2 (µ+ 2sσ +µ )x (τ ) = √ p(τ ) = L e− 2τ σ2 , τ > 0. σ 2πτ 3/2 (c) Differentiating and evaluating at s = 0 we obtain E[τ ] = − 2µ x 2µx d x σ2 E[e−sτ ] = e− σ2 x 2 p = e− σ2 . ds σ µ µ2 s=0 85 Exercise 4.6.7 Assume the hypothesis of Proposition 4.6.6 satisfied. Find V ar(τ ). Exercise 4.6.8 Find the modes of distributions (4.6.12) and (4.6.14). What do you notice? Exercise 4.6.9 Let Xt = 2t + 3Wt and Yt = 2t + Wt . (a) Show that the expected times for Xt and Yt to reach any barrier x > 0 are the same. 
(b) If Xt and Yt model the prices of two stocks, which one would you like to own? Exercise 4.6.10 Does 4t + 2Wt hit 9 faster (in expectation) than 5t + 3Wt hits 14? Exercise 4.6.11 Let τ be the first time the Brownian motion with drift Xt = µt + Wt hits x, where µ, x > 0. Prove the inequality P (τ ≤ λ) ≤ e− x2 +λ2 µ2 +µx 2λ ∀λ > 0. , C. The double barrier case In the following we shall consider the case of double barrier. Consider the Brownian motion with drift Xt = µt + Wt , µ > 0. Let α, β > 0 and define the stopping time T = inf{t > 0; Xt = α or Xt = −β}. Relation (4.5.6) states 1 2 )T E[ecXT e−(cµ+ 2 c ] = 1. Since the random variables T and XT are independent (why?), we have 1 2 )T E[ecXT ]E[e−(cµ+ 2 c ] = 1. Using E[ecXT ] = ecα pα + e−cβ (1 − pα ), with pα given by (4.5.7), then 1 2 )T E[e−(cµ+ 2 c ]= 1 ecα p α + e−cβ (1 − pα ) · If substitute s = cµ + 12 c2 , then E[e−sT ] = e(−µ+ √ 1 2s+µ2 )α p −(−µ+ α+e √ 2s+µ2 )β (1 − pα ) · (4.6.16) The probability density of the stopping time T is obtained by taking the inverse Laplace transform of the right side expression n o 1 √ √ p(T ) = L−1 (τ ), 2 2 e(−µ+ 2s+µ )α pα + e−(−µ+ 2s+µ )β (1 − pα ) an expression which is not feasible for having closed form solution. However, expression (4.6.16) would be useful for computing the price for double barrier derivatives. 86 Exercise 4.6.12 Use formula (4.6.16) to find the expectation E[T ]. Exercise 4.6.13 Denote by Mt = Nt − λt the compensated Poisson process and let c > 0 be a constant. (a) Show that Xt = ecMt −λt(e c −c−1) is an Ft -martingale, with Ft = σ(Nu ; u ≤ t). (b) Let a > 0 and T = inf{t > 0; Mt > a} be the first hitting time of the level a. Use the Optional Stopping Theorem to show that E[e−λsT ] = e−ϕ(s)a , s > 0, where ϕ : [0, ∞) → [0, ∞) is the inverse function of f (x) = ex − x − 1. (c) Show that E[T ] = ∞. (d) Can you use the inverse Laplace transform to find the probability density function of T ? 
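To close the discussion of hitting times, the double-barrier quantities of Section 4.5 can also be verified numerically. The sketch below is our own illustration (the parameter choices µ = α = β = 1 are arbitrary): it simulates X_t = µt + W_t until it exits (−β, α) and compares the exit probability of Theorem 4.5.1 and the expected exit time of Proposition 4.5.5 with their Monte Carlo estimates.

```python
import math
import random

def exit_data(rng, mu=1.0, alpha=1.0, beta=1.0, dt=0.002, horizon=30.0):
    """Simulate X_t = mu*t + W_t until it leaves (-beta, alpha);
    return (hit_upper_barrier, exit_time)."""
    t, x = 0.0, 0.0
    while t < horizon:
        x += mu * dt + rng.gauss(0.0, math.sqrt(dt))
        t += dt
        if x >= alpha:
            return True, t
        if x <= -beta:
            return False, t
    return x >= 0, t   # horizon cap; essentially never binding here

rng = random.Random(7)
runs = [exit_data(rng) for _ in range(4000)]
p_alpha = sum(1 for up, _ in runs if up) / len(runs)
mean_T = sum(t for _, t in runs) / len(runs)

mu, a, b = 1.0, 1.0, 1.0
# Theorem 4.5.1: probability of reaching alpha before -beta
p_theory = (math.exp(2 * mu * b) - 1) / (math.exp(2 * mu * b) - math.exp(-2 * mu * a))
# Proposition 4.5.5: expected exit time
ET_theory = (a * math.exp(2 * mu * b) + b * math.exp(-2 * mu * a) - a - b) / (
    mu * (math.exp(2 * mu * b) - math.exp(-2 * mu * a)))
```

For these parameters p_theory ≈ 0.88 and ET_theory ≈ 0.76, and the Monte Carlo estimates land close to both.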
4.7 Limits of Stochastic Processes Let (Xt )t≥0 be a stochastic process. We can make sense of the limit expression X = lim Xt , in a similar way as we did in section 2.12 for sequences of random variables. t→∞ We shall rewrite the definitions for the continuous case. Almost Certain Limit The process Xt converges almost certainly to X, if for all states of the world ω, except a set of probability zero, we have lim Xt (ω) = X(ω). t→∞ We shall write ac-lim Xt = X. It is also sometimes called strong convergence. t→∞ Mean Square Limit We say that the process Xt converges to X in the mean square if lim E[(Xt − X)2 ] = 0. t→∞ In this case we write ms-lim Xt = X. t→∞ 87 Limit in Probability or Stochastic Limit The stochastic process Xt converges in stochastic limit to X if lim P ω; |Xt (ω) − X(ω)| > ǫ = 0. t→∞ This limit is abbreviated by st-lim Xt = X. t→∞ It is worth noting that, like in the case of sequences of random variables, both almost certain convergence and convergence in mean square imply the stochastic convergence, which implies the limit in distribution. Limit in Distribution We say that Xt converges in distribution to X if for any continuous bounded function ϕ(x) we have lim E[ϕ(Xt )] = E[ϕ(X)]. t→∞ It is worth noting that the stochastic convergence implies the convergence in distribution. 4.8 Mean Square Convergence The following property is a reformulation of Exercise 2.12.1 in the continuous setup. Proposition 4.8.1 Consider a stochastic process Xt such that E[Xt ] → k, a constant, and V ar(Xt ) → 0 as t → ∞. Then ms-lim Xt = k. t→∞ It is worthy to note that the previous statement holds true if the limit to infinity is replaced by a limit to any other number. Next we shall provide a few applications. Application 4.8.2 If α > 1/2, then ms-lim t→∞ Wt = 0. tα E[Wt ] 1 t Wt = 0, and V ar[Xt ] = 2α V ar[Wt ] = 2α = Proof: Let Xt = α . Then E[Xt ] = α t t t t 1 1 , for any t > 0. Since → 0 as t → ∞, applying Proposition 4.8.1 yields t2α−1 t2α−1 Wt ms-lim α = 0. 
t→∞ t Wt = 0. t→∞ t Corollary 4.8.3 ms-lim 88 Application 4.8.4 Let Zt = Rt 0 Ws ds. If β > 3/2, then Zt = 0. t→∞ tβ ms-lim E[Zt ] 1 t3 Zt = 0, and V ar[X ] = V ar[Z ] = = Proof: Let Xt = β . Then E[Xt ] = t t t tβ t2β 3t2β 1 1 , for any t > 0. Since 2β−3 → 0 as t → ∞, applying Proposition 4.8.1 leads to 3t2β−3 3t the desired result. Application 4.8.5 For any p > 0, c ≥ 1 we have eWt −ct = 0. t→∞ tp ms-lim Proof: Consider the process Xt = eWt eWt −ct = . Since tp tp ect 1 E[eWt ] et/2 1 = = → 0, as t → ∞ 1 p ct p ct p )t (c− t e t e e 2 t e2t − et 1 1 V ar[eWt ] 1 = = − → 0, t2p e2ct t2p e2ct t2p e2t(c−1) et(2c−1) E[Xt ] = V ar[Xt ] = as t → ∞, Proposition 4.8.1 leads to the desired result. Application 4.8.6 Show that max Ws 0≤s≤t ms-lim t→∞ t = 0. max Ws Proof: Let Xt = 0≤s≤t t . Since by Exercise 4.3.10 E[ max Ws ] = 0 0≤s≤t √ V ar( max Ws ) = 2 t, 0≤s≤t then E[Xt ] = 0 V ar[Xt ] = √ 2 t → 0, t2 Apply Proposition 4.8.1 to get the desired result. t → ∞. 89 Remark 4.8.7 One of the strongest result regarding limits of Brownian motions is called the law of iterated logarithms and was first proved by Lamperti: lim sup p t→∞ almost certainly. Wt = 1, 2t ln(ln t) Exercise 4.8.8 Find a stochastic process Xt such that both conditions are satisfied: (i) ms-lim Xt = 0 t→∞ (ii) ms-lim Xt2 6= 0. t→∞ Exercise 4.8.9 Let Xt be a stochastic process. Show that ms-lim Xt = 0 ⇐⇒ ms-lim |Xt | = 0. t→∞ 4.9 t→∞ The Martingale Convergence Theorem We state now, without proof, a result which is a powerful way of proving the almost sure convergence. We start with the discrete version: Theorem 4.9.1 Let Xn be a martingale with bounded means ∃M > 0 such that E[|Xn |] ≤ M, ∀n ≥ 1. (4.9.17) Then there is a random variable L such that as- lim Xn = L, i.e. n→∞ P ω; lim Xn (ω) = L(ω) = 1. n→∞ Since E[|Xn |]2 ≤ E[Xn2 ], the condition (4.9.17) can be replaced by its stronger version ∃M > 0 such that E[Xn2 ] ≤ M, ∀n ≥ 1. 
The following result is the continuous version of the Martingale Convergence Theorem. Denote the infinite knowledge by F_∞ = σ(∪_t F_t).

Theorem 4.9.2 Let X_t be an F_t-martingale such that

∃M > 0 such that E[|X_t|] < M, ∀t > 0.

Then there is an F_∞-measurable random variable X_∞ such that X_t → X_∞ a.s. as t → ∞.

The next exercise deals with a process that is a.s.-convergent but not ms-convergent.

Exercise 4.9.3 It is known that X_t = e^{W_t − t/2} is a martingale. Since

E[|X_t|] = E[e^{W_t − t/2}] = e^{−t/2} E[e^{W_t}] = e^{−t/2} e^{t/2} = 1,

by the Martingale Convergence Theorem there is a random variable L such that X_t → L a.s. as t → ∞.
(a) What is the limit L? How did you make your guess?
(b) Show that E[|X_t − 1|^2] = Var(X_t) + (E[X_t] − 1)^2.
(c) Show that X_t does not converge in the mean square to 1.
(d) Prove that the family X_t is a.s.-convergent but not ms-convergent.

4.10 Quadratic Variations

In order to gauge the roughness of stochastic processes we shall consider the sum of the p-th powers of consecutive increments of the process, as the norm of the partition decreases to zero.

Definition 4.10.1 Let X_t be a continuous stochastic process on Ω, and let p > 0 be a constant. The p-th variation process of X_t, denoted by ⟨X, X⟩_t^{(p)}, is defined by

⟨X, X⟩_t^{(p)}(ω) = p-lim_{max_i |t_{i+1} − t_i| → 0} Σ_{i=0}^{n−1} |X_{t_{i+1}}(ω) − X_{t_i}(ω)|^p,

where 0 = t_0 < t_1 < ··· < t_n = t.

If p = 1, the process ⟨X, X⟩_t^{(1)} is called the total variation process of X_t. For p = 2, ⟨X, X⟩_t^{(2)} = ⟨X, X⟩_t is called the quadratic variation process of X_t. It can be shown that the limit exists for continuous square integrable martingales X_t.

Next we introduce a symmetric and bilinear operation.

Definition 4.10.2 The quadratic covariation of two processes X_t and Y_t is defined as

⟨X, Y⟩_t = (1/4)(⟨X + Y, X + Y⟩_t − ⟨X − Y, X − Y⟩_t).

Exercise 4.10.3 Prove that:
(a) ⟨X, Y⟩_t = ⟨Y, X⟩_t;
(b) ⟨aX + bY, Z⟩_t = a⟨X, Z⟩_t + b⟨Y, Z⟩_t.
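The dichotomy in Exercise 4.9.3 can be watched in a quick simulation. The sketch below (not from the book; it assumes NumPy) samples X_t = e^{W_t − t/2} at a large time: although E[X_t] = 1 exactly for every t, the typical sample is already essentially at zero, which is a strong hint about the almost sure limit.

```python
import numpy as np

rng = np.random.default_rng(1)
t, n_paths = 50.0, 100_000
W = rng.normal(0.0, np.sqrt(t), size=n_paths)   # W_t ~ N(0, t)
X = np.exp(W - t / 2)                           # martingale X_t = e^{W_t - t/2}; E[X_t] = 1 exactly

med = float(np.median(X))            # typical path value, of order e^{-t/2}
frac_small = float(np.mean(X < 1e-3))  # fraction of paths already near the a.s. limit
print("median X_t    ~", med)
print("P(X_t < 1e-3) ~", frac_small)
```

Note that the sample mean of X is a poor estimate of E[X_t] = 1 here: the variance e^t − 1 is astronomically large, which is precisely why X_t cannot converge to 1 in mean square.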
Exercise 4.10.4 Prove that the total variation of a Brownian motion on the interval [0, t] is infinite a.s.

In the following we shall encounter a few important examples that will be useful when dealing with stochastic integrals.

4.10.1 The Quadratic Variation of W_t

The next result states that the quadratic variation of the Brownian motion W_t on the interval [0, T] is T. More precisely, we have:

Proposition 4.10.5 Let T > 0 and consider the equidistant partition 0 = t_0 < t_1 < ··· < t_{n−1} < t_n = T. Then

ms-lim_{n→∞} Σ_{i=0}^{n−1} (W_{t_{i+1}} − W_{t_i})^2 = T. (4.10.18)

Proof: Consider the random variable

X_n = Σ_{i=0}^{n−1} (W_{t_{i+1}} − W_{t_i})^2.

Since the increments of a Brownian motion are independent, Proposition 5.2.1 yields

E[X_n] = Σ_{i=0}^{n−1} E[(W_{t_{i+1}} − W_{t_i})^2] = Σ_{i=0}^{n−1} (t_{i+1} − t_i) = t_n − t_0 = T;

Var(X_n) = Σ_{i=0}^{n−1} Var[(W_{t_{i+1}} − W_{t_i})^2] = Σ_{i=0}^{n−1} 2(t_{i+1} − t_i)^2 = n · 2(T/n)^2 = 2T^2/n,

where we used that the partition is equidistant. Since X_n satisfies the conditions

E[X_n] = T, ∀n ≥ 1; Var[X_n] → 0, n → ∞,

by Proposition 4.8.1 we obtain ms-lim_{n→∞} X_n = T, or

ms-lim_{n→∞} Σ_{i=0}^{n−1} (W_{t_{i+1}} − W_{t_i})^2 = T. (4.10.19)

Since mean square convergence implies convergence in probability, it follows that

p-lim_{n→∞} Σ_{i=0}^{n−1} (W_{t_{i+1}} − W_{t_i})^2 = T,

i.e. the quadratic variation of W_t on [0, T] is T.

Exercise 4.10.6 Prove that the quadratic variation of the Brownian motion W_t on [a, b] is equal to b − a.

The Fundamental Relation dW_t^2 = dt

The relation discussed in this section can be regarded as the fundamental relation of Stochastic Calculus. We start by recalling relation (4.10.19)

ms-lim_{n→∞} Σ_{i=0}^{n−1} (W_{t_{i+1}} − W_{t_i})^2 = T. (4.10.20)

The right side can be regarded as a regular Riemann integral

T = ∫_0^T dt,

while the left side can be regarded as a stochastic integral with respect to dW_t^2,

∫_0^T (dW_t)^2 := ms-lim_{n→∞} Σ_{i=0}^{n−1} (W_{t_{i+1}} − W_{t_i})^2.

Substituting into (4.10.20) yields

∫_0^T (dW_t)^2 = ∫_0^T dt, ∀T > 0.
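Relation (4.10.18) is easy to observe numerically. The sketch below (not from the book; it assumes NumPy) sums the squared increments of a simulated Brownian path on [0, T] over finer and finer equidistant partitions; by the variance estimate 2T²/n in the proof, the sums concentrate around T as n grows. A fresh independent path is drawn for each n, which is enough for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
T = 2.0
qv = {}
for n in [100, 10_000, 1_000_000]:
    dW = rng.normal(0.0, np.sqrt(T / n), size=n)   # increments over an equidistant partition
    qv[n] = float(np.sum(dW**2))                   # quadratic variation estimate X_n
    print(f"n = {n:8d}   sum (dW)^2 ~ {qv[n]:.4f}   (T = {T})")
```

The spread around T shrinks like sqrt(2T²/n), matching Var(X_n) = 2T²/n computed in the proof.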
The differential form of this integral equation is dW_t^2 = dt. In fact, this expression holds in the mean square sense, as can be inferred from the next exercise.

Exercise 4.10.7 Show that
(a) E[dW_t^2 − dt] = 0;
(b) Var(dW_t^2 − dt) = o(dt);
(c) ms-lim_{dt→0} (dW_t^2 − dt) = 0.

Roughly speaking, the process dW_t^2, which is the square of the infinitesimal increments of a Brownian motion, is totally predictable. This relation plays a central role in Stochastic Calculus and will be useful when dealing with Ito's lemma.

The following exercise states that dt dW_t = 0, which is another important stochastic relation useful in Ito's lemma.

Exercise 4.10.8 Consider the equidistant partition 0 = t_0 < t_1 < ··· < t_{n−1} < t_n = T. Show that

ms-lim_{n→∞} Σ_{i=0}^{n−1} (W_{t_{i+1}} − W_{t_i})(t_{i+1} − t_i) = 0. (4.10.21)

4.10.2 The Quadratic Variation of N_t − λt

The following result deals with the quadratic variation of the compensated Poisson process M_t = N_t − λt.

Proposition 4.10.9 Let a < b and consider the partition a = t_0 < t_1 < ··· < t_{n−1} < t_n = b. Then

ms-lim_{‖Δ_n‖→0} Σ_{k=0}^{n−1} (M_{t_{k+1}} − M_{t_k})^2 = N_b − N_a, (4.10.22)

where ‖Δ_n‖ := sup_{0≤k≤n−1} (t_{k+1} − t_k).

Proof: For the sake of simplicity we shall use the following notations:

Δt_k = t_{k+1} − t_k, ΔM_k = M_{t_{k+1}} − M_{t_k}, ΔN_k = N_{t_{k+1}} − N_{t_k}.

The relation we need to prove can also be written as

ms-lim_{n→∞} Σ_{k=0}^{n−1} [(ΔM_k)^2 − ΔN_k] = 0.

Let

Y_k = (ΔM_k)^2 − ΔN_k = (ΔM_k)^2 − ΔM_k − λΔt_k.

It suffices to show that

E[Σ_{k=0}^{n−1} Y_k] = 0, (4.10.23)

lim_{n→∞} Var[Σ_{k=0}^{n−1} Y_k] = 0. (4.10.24)

The first identity follows from the properties of Poisson processes (see Exercise 3.8.9):

E[Σ_{k=0}^{n−1} Y_k] = Σ_{k=0}^{n−1} E[Y_k] = Σ_{k=0}^{n−1} (E[(ΔM_k)^2] − E[ΔN_k]) = Σ_{k=0}^{n−1} (λΔt_k − λΔt_k) = 0.

For the proof of identity (4.10.24) we first need to find the variance of Y_k.
Using Exercise 3.8.9 and the fact that E[ΔM_k] = 0, we have

Var[Y_k] = Var[(ΔM_k)^2 − (ΔM_k + λΔt_k)] = Var[(ΔM_k)^2 − ΔM_k]
= Var[(ΔM_k)^2] + Var[ΔM_k] − 2Cov[(ΔM_k)^2, ΔM_k]
= λΔt_k + 2λ^2(Δt_k)^2 + λΔt_k − 2(E[(ΔM_k)^3] − E[(ΔM_k)^2]E[ΔM_k])
= 2λ^2(Δt_k)^2.

Since M_t is a process with independent increments, Cov[Y_k, Y_j] = 0 for k ≠ j. Then

Var[Σ_{k=0}^{n−1} Y_k] = Σ_{k=0}^{n−1} Var[Y_k] + 2Σ_{k≠j} Cov[Y_k, Y_j] = Σ_{k=0}^{n−1} Var[Y_k]
= 2λ^2 Σ_{k=0}^{n−1} (Δt_k)^2 ≤ 2λ^2 ‖Δ_n‖ Σ_{k=0}^{n−1} Δt_k = 2λ^2 (b − a)‖Δ_n‖,

and hence Var[Σ_{k=0}^{n−1} Y_k] → 0 as ‖Δ_n‖ → 0. According to Example 2.12.1, we obtain the desired limit in mean square.

The previous result states that the quadratic variation of the martingale M_t between a and b is equal to the number of jumps of the Poisson process between a and b.

The Fundamental Relation dM_t^2 = dN_t

Recall relation (4.10.22)

ms-lim_{n→∞} Σ_{k=0}^{n−1} (M_{t_{k+1}} − M_{t_k})^2 = N_b − N_a. (4.10.25)

The right side can be regarded as a Riemann–Stieltjes integral

N_b − N_a = ∫_a^b dN_t,

while the left side can be regarded as a stochastic integral with respect to (dM_t)^2,

∫_a^b (dM_t)^2 := ms-lim_{n→∞} Σ_{k=0}^{n−1} (M_{t_{k+1}} − M_{t_k})^2.

Substituting in (4.10.25) yields

∫_a^b (dM_t)^2 = ∫_a^b dN_t,

for any a < b. The equivalent differential form is

(dM_t)^2 = dN_t. (4.10.26)

The Relations dt dM_t = 0, dW_t dM_t = 0

In order to show that dt dM_t = 0 in the mean square sense, we need to prove the limit

ms-lim_{n→∞} Σ_{k=0}^{n−1} (t_{k+1} − t_k)(M_{t_{k+1}} − M_{t_k}) = 0. (4.10.27)

This can be thought of as a vanishing integral of the increment process dM_t with respect to dt:

∫_a^b dM_t dt = 0, ∀a, b ∈ R.

Denote

X_n = Σ_{k=0}^{n−1} (t_{k+1} − t_k)(M_{t_{k+1}} − M_{t_k}) = Σ_{k=0}^{n−1} Δt_k ΔM_k.

In order to show (4.10.27) it suffices to prove that

1. E[X_n] = 0;
2. lim_{n→∞} Var[X_n] = 0.

Using the additivity of the expectation and Exercise 3.8.9 (ii),

E[X_n] = E[Σ_{k=0}^{n−1} Δt_k ΔM_k] = Σ_{k=0}^{n−1} Δt_k E[ΔM_k] = 0.
Since the Poisson process N_t has independent increments, the same property holds for the compensated Poisson process M_t. Then Δt_k ΔM_k and Δt_j ΔM_j are independent for k ≠ j, and using the properties of variance we have

Var[X_n] = Var[Σ_{k=0}^{n−1} Δt_k ΔM_k] = Σ_{k=0}^{n−1} (Δt_k)^2 Var[ΔM_k] = λ Σ_{k=0}^{n−1} (Δt_k)^3,

where we used

Var[ΔM_k] = E[(ΔM_k)^2] − (E[ΔM_k])^2 = λΔt_k,

see Exercise 3.8.9 (ii). If we let ‖Δ_n‖ = max_k Δt_k, then

Var[X_n] = λ Σ_{k=0}^{n−1} (Δt_k)^3 ≤ λ‖Δ_n‖^2 Σ_{k=0}^{n−1} Δt_k = λ(b − a)‖Δ_n‖^2 → 0, as n → ∞.

Hence we proved the stochastic differential relation

dt dM_t = 0. (4.10.28)

For showing the relation dW_t dM_t = 0, we need to prove

ms-lim_{n→∞} Y_n = 0, (4.10.29)

where we have denoted

Y_n = Σ_{k=0}^{n−1} (W_{t_{k+1}} − W_{t_k})(M_{t_{k+1}} − M_{t_k}) = Σ_{k=0}^{n−1} ΔW_k ΔM_k.

Since the Brownian motion W_t and the process M_t have independent increments and ΔW_k is independent of ΔM_k, we have

E[Y_n] = Σ_{k=0}^{n−1} E[ΔW_k ΔM_k] = Σ_{k=0}^{n−1} E[ΔW_k]E[ΔM_k] = 0,

where we used E[ΔW_k] = E[ΔM_k] = 0. Using also E[(ΔW_k)^2] = Δt_k, E[(ΔM_k)^2] = λΔt_k, and invoking the independence of ΔW_k and ΔM_k, we get

Var[ΔW_k ΔM_k] = E[(ΔW_k)^2 (ΔM_k)^2] − (E[ΔW_k ΔM_k])^2
= E[(ΔW_k)^2]E[(ΔM_k)^2] − (E[ΔW_k])^2(E[ΔM_k])^2 = λ(Δt_k)^2.

Then, using the independence of the terms in the sum, we get

Var[Y_n] = Σ_{k=0}^{n−1} Var[ΔW_k ΔM_k] = λ Σ_{k=0}^{n−1} (Δt_k)^2 ≤ λ‖Δ_n‖ Σ_{k=0}^{n−1} Δt_k = λ(b − a)‖Δ_n‖ → 0,

as n → ∞. Since Y_n is a random variable with mean zero and variance decreasing to zero, it follows that Y_n → 0 in the mean square sense. Hence we proved that

dW_t dM_t = 0. (4.10.30)

Exercise 4.10.10 Show the following stochastic differential relations:
(a) dt dN_t = 0; (b) dW_t dN_t = 0; (c) dt dW_t = 0;
(d) (dN_t)^2 = dN_t; (e) (dM_t)^2 = dN_t; (f) (dM_t)^4 = dN_t.

The relations proved in this section will be useful in Part II when developing the stochastic model of a stock price that exhibits jumps modeled by a Poisson process.
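Proposition 4.10.9 can also be illustrated numerically. The sketch below (not from the book; it assumes NumPy) builds one compensated Poisson path M_t = N_t − λt on a fine grid and checks that the sum of squared increments is close to the number of jumps N_b − N_a (here with a = 0).

```python
import numpy as np

rng = np.random.default_rng(3)
lam, b, n = 4.0, 10.0, 1_000_000
t = np.linspace(0.0, b, n + 1)

# one Poisson path: given their Poisson(lam*b) count, jump times are uniform on [0, b]
jumps = np.sort(rng.uniform(0.0, b, size=rng.poisson(lam * b)))
N = np.searchsorted(jumps, t, side="right")     # N_{t_k} = number of jumps up to t_k
M = N - lam * t                                 # compensated process M_t
dM = np.diff(M)

qv = float(np.sum(dM**2))
print("sum (dM)^2 ~", qv, "   N_b - N_a =", int(N[-1]))
```

Each interval containing a jump contributes roughly (1 − λΔt)² ≈ 1, while jump-free intervals contribute only (λΔt)², so the sum counts the jumps, exactly as the proposition states.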
Chapter 5

Stochastic Integration

This chapter deals with one of the most useful stochastic integrals, the Ito integral. This type of integral was introduced in 1944 by the Japanese mathematician K. Ito, and was originally motivated by a construction of diffusion processes.

5.1 Nonanticipating Processes

Consider the Brownian motion W_t. A process F_t is called a nonanticipating process if F_t is independent of any future increment W_{t′} − W_t for any t and t′ with t < t′. Consequently, the process F_t is independent of the behavior of the Brownian motion in the future, i.e. it cannot anticipate the future. For instance, W_t, e^{W_t}, W_t^2 − W_t + t are examples of nonanticipating processes, while W_{t+1} or (1/2)(W_{t+1} − W_t)^2 are not.

Nonanticipating processes are important because the Ito integral concept applies only to them. If F_t denotes the information known until time t, where this information is generated by the Brownian motion {W_s; s ≤ t}, then any F_t-adapted process F_t is nonanticipating.

5.2 Increments of Brownian Motions

In this section we shall discuss a few basic properties of the increments of a Brownian motion, which will be useful when computing stochastic integrals.

Proposition 5.2.1 Let W_t be a Brownian motion. If s < t, we have
1. E[(W_t − W_s)^2] = t − s.
2. Var[(W_t − W_s)^2] = 2(t − s)^2.

Proof: 1. Using that W_t − W_s ∼ N(0, t − s), and hence E[W_t − W_s] = 0, we have

E[(W_t − W_s)^2] = E[(W_t − W_s)^2] − (E[W_t − W_s])^2 = Var(W_t − W_s) = t − s.

2. Dividing by the standard deviation yields the standard normal random variable (W_t − W_s)/√(t − s) ∼ N(0, 1). Its square, (W_t − W_s)^2/(t − s), is χ^2-distributed with 1 degree of freedom (a χ^2-distributed random variable with n degrees of freedom has mean n and variance 2n). Its mean is 1 and its variance is 2. This implies

E[(W_t − W_s)^2/(t − s)] = 1 ⟹ E[(W_t − W_s)^2] = t − s;
Var[(W_t − W_s)^2/(t − s)] = 2 ⟹ Var[(W_t − W_s)^2] = 2(t − s)^2.

Remark 5.2.2 The infinitesimal version of the previous result is obtained by replacing t − s with dt:
1. E[dW_t^2] = dt;
2. Var[dW_t^2] = 2dt^2.
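Both statements of Proposition 5.2.1 are straightforward to verify by simulation. The sketch below (not from the book; it assumes NumPy) samples the increment W_t − W_s ~ N(0, t − s) and estimates the mean and variance of its square.

```python
import numpy as np

rng = np.random.default_rng(4)
s, t = 1.0, 3.5                                        # any s < t
inc = rng.normal(0.0, np.sqrt(t - s), size=1_000_000)  # W_t - W_s ~ N(0, t - s)
sq = inc**2

mean_sq, var_sq = float(sq.mean()), float(sq.var())
print("E[(W_t - W_s)^2]   ~", mean_sq, "   exact:", t - s)
print("Var[(W_t - W_s)^2] ~", var_sq,  "   exact:", 2 * (t - s) ** 2)
```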
We shall see in an upcoming section that dW_t^2 and dt are in fact equal in the mean square sense.

Exercise 5.2.3 Show that
(a) E[(W_t − W_s)^4] = 3(t − s)^2;
(b) E[(W_t − W_s)^6] = 15(t − s)^3.

5.3 The Ito Integral

The Ito integral is defined in a way that is similar to the Riemann integral. The Ito integral is taken with respect to the infinitesimal increments of a Brownian motion, dW_t, which are random variables, while the Riemann integral considers integration with respect to the predictable infinitesimal changes dt. It is worth noting that the Ito integral is a random variable, while the Riemann integral is just a real number. Despite this fact, there are several common properties and relations between these two types of integrals.

Consider 0 ≤ a < b and let F_t = f(W_t, t) be a nonanticipating process with

E[∫_a^b F_t^2 dt] < ∞. (5.3.1)

The role of this condition will become clearer when we discuss the martingale property of the Ito integral. Divide the interval [a, b] into n subintervals using the partition points

a = t_0 < t_1 < ··· < t_{n−1} < t_n = b,

and consider the partial sums

S_n = Σ_{i=0}^{n−1} F_{t_i}(W_{t_{i+1}} − W_{t_i}).

We emphasize that the intermediate points are the left endpoints of each interval, and this is the way they should always be chosen. Since the process F_t is nonanticipating, the random variables F_{t_i} and W_{t_{i+1}} − W_{t_i} are independent; this is an important feature in the definition of the Ito integral.

The Ito integral is the limit of the partial sums S_n,

ms-lim_{n→∞} S_n = ∫_a^b F_t dW_t,

provided the limit exists. It can be shown that the choice of partition does not influence the value of the Ito integral. This is the reason why, for practical purposes, it suffices to assume the intervals equidistant, i.e.

t_{i+1} − t_i = (b − a)/n, i = 0, 1, ···, n − 1.

The previous convergence is in the mean square sense, i.e.

lim_{n→∞} E[(S_n − ∫_a^b F_t dW_t)^2] = 0.
Existence of the Ito integral

It is known that the Ito stochastic integral ∫_a^b F_t dW_t exists if the process F_t = f(W_t, t) satisfies the following properties:

1. The paths t → F_t(ω) are continuous on [a, b] for any state of the world ω ∈ Ω;
2. The process F_t is nonanticipating for t ∈ [a, b];
3. E[∫_a^b F_t^2 dt] < ∞.

For instance, the following stochastic integrals exist:

∫_0^T W_t^2 dW_t, ∫_0^T sin(W_t) dW_t, ∫_a^b (cos(W_t)/t) dW_t.

5.4 Examples of Ito Integrals

As in the case of the Riemann integral, using the definition is not an efficient way of computing integrals. The same philosophy applies to Ito integrals. We shall compute in the following two simple Ito integrals. In later sections we shall introduce more efficient methods for computing Ito integrals.

5.4.1 The case F_t = c, constant

In this case the partial sums can be computed explicitly:

S_n = Σ_{i=0}^{n−1} F_{t_i}(W_{t_{i+1}} − W_{t_i}) = Σ_{i=0}^{n−1} c(W_{t_{i+1}} − W_{t_i}) = c(W_b − W_a),

and since the answer does not depend on n, we have

∫_a^b c dW_t = c(W_b − W_a).

In particular, taking c = 1, a = 0, and b = T, since the Brownian motion starts at 0, we have the following formula:

∫_0^T dW_t = W_T.

5.4.2 The case F_t = W_t

We shall integrate the process W_t between 0 and T. Considering an equidistant partition, we take t_k = kT/n, k = 0, 1, ···, n. The partial sums are given by

S_n = Σ_{i=0}^{n−1} W_{t_i}(W_{t_{i+1}} − W_{t_i}).

Since

xy = (1/2)[(x + y)^2 − x^2 − y^2],

letting x = W_{t_i} and y = W_{t_{i+1}} − W_{t_i} yields

W_{t_i}(W_{t_{i+1}} − W_{t_i}) = (1/2)W_{t_{i+1}}^2 − (1/2)W_{t_i}^2 − (1/2)(W_{t_{i+1}} − W_{t_i})^2.

Then, after pair cancellations, the sum becomes

S_n = (1/2)Σ_{i=0}^{n−1} (W_{t_{i+1}}^2 − W_{t_i}^2) − (1/2)Σ_{i=0}^{n−1} (W_{t_{i+1}} − W_{t_i})^2
= (1/2)W_{t_n}^2 − (1/2)Σ_{i=0}^{n−1} (W_{t_{i+1}} − W_{t_i})^2.

Using t_n = T, we get

S_n = (1/2)W_T^2 − (1/2)Σ_{i=0}^{n−1} (W_{t_{i+1}} − W_{t_i})^2.

Since the first term on the right side is independent of n, using Proposition 4.10.5, we have

ms-lim_{n→∞} S_n = (1/2)W_T^2 − (1/2) ms-lim_{n→∞} Σ_{i=0}^{n−1} (W_{t_{i+1}} − W_{t_i})^2 (5.4.2)
= (1/2)W_T^2 − (1/2)T. (5.4.3)
We have now obtained the following explicit formula of a stochastic integral:

∫_0^T W_t dW_t = (1/2)W_T^2 − (1/2)T.

In a similar way one can obtain

∫_a^b W_t dW_t = (1/2)(W_b^2 − W_a^2) − (1/2)(b − a).

It is worth noting that the right side contains random variables depending on the limits of integration a and b.

Exercise 5.4.1 Show the following identities:
(a) E[∫_0^T dW_t] = 0;
(b) E[∫_0^T W_t dW_t] = 0;
(c) Var[∫_0^T W_t dW_t] = T^2/2.

5.5 Properties of the Ito Integral

We shall start with some properties which are similar to those of the Riemann integral.

Proposition 5.5.1 Let f(W_t, t), g(W_t, t) be nonanticipating processes and c ∈ R. Then we have

1. Additivity:
∫_0^T [f(W_t, t) + g(W_t, t)] dW_t = ∫_0^T f(W_t, t) dW_t + ∫_0^T g(W_t, t) dW_t.

2. Homogeneity:
∫_0^T c f(W_t, t) dW_t = c ∫_0^T f(W_t, t) dW_t.

3. Partition property:
∫_0^T f(W_t, t) dW_t = ∫_0^u f(W_t, t) dW_t + ∫_u^T f(W_t, t) dW_t, ∀ 0 < u < T.

Proof: 1. Consider the partial sum sequences

X_n = Σ_{i=0}^{n−1} f(W_{t_i}, t_i)(W_{t_{i+1}} − W_{t_i}),
Y_n = Σ_{i=0}^{n−1} g(W_{t_i}, t_i)(W_{t_{i+1}} − W_{t_i}).

Since ms-lim_{n→∞} X_n = ∫_0^T f(W_t, t) dW_t and ms-lim_{n→∞} Y_n = ∫_0^T g(W_t, t) dW_t, using Proposition 2.13.2 yields

∫_0^T [f(W_t, t) + g(W_t, t)] dW_t
= ms-lim_{n→∞} Σ_{i=0}^{n−1} [f(W_{t_i}, t_i) + g(W_{t_i}, t_i)](W_{t_{i+1}} − W_{t_i})
= ms-lim_{n→∞} [Σ_{i=0}^{n−1} f(W_{t_i}, t_i)(W_{t_{i+1}} − W_{t_i}) + Σ_{i=0}^{n−1} g(W_{t_i}, t_i)(W_{t_{i+1}} − W_{t_i})]
= ms-lim_{n→∞} (X_n + Y_n) = ms-lim_{n→∞} X_n + ms-lim_{n→∞} Y_n
= ∫_0^T f(W_t, t) dW_t + ∫_0^T g(W_t, t) dW_t.

The proofs of parts 2 and 3 are left as an exercise for the reader.

Some other properties, such as monotonicity, do not hold in general. It is possible to have a nonnegative process F_t for which the random variable ∫_0^T F_t dW_t takes negative values. More precisely, let F_t = 1. Then F_t > 0, but ∫_0^T 1 dW_t = W_T is not always positive. The probability that it is negative is P(W_T < 0) = 1/2.
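The formula ∫_0^T W_t dW_t = ½W_T² − ½T can be watched emerging from the left-point sums themselves. The sketch below (not from the book; it assumes NumPy) builds one Brownian path and compares the Ito partial sum with the closed form evaluated on that same path.

```python
import numpy as np

rng = np.random.default_rng(5)
T, n = 1.0, 1_000_000
dW = rng.normal(0.0, np.sqrt(T / n), size=n)
W = np.concatenate(([0.0], np.cumsum(dW)))     # W_{t_0} = 0, ..., W_{t_n} = W_T

ito_sum = float(np.sum(W[:-1] * dW))           # left endpoints, as the Ito integral requires
closed_form = 0.5 * W[-1] ** 2 - 0.5 * T
print("sum W dW      ~", ito_sum)
print("W_T^2/2 - T/2 =", closed_form)
```

The two numbers differ only by ½(T − Σ(ΔW)²), which the quadratic variation result makes small for large n.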
Some of the random variable properties of the Ito integral are given by the following result:

Proposition 5.5.2 We have

1. Zero mean:
E[∫_a^b f(W_t, t) dW_t] = 0.

2. Isometry:
E[(∫_a^b f(W_t, t) dW_t)^2] = E[∫_a^b f(W_t, t)^2 dt].

3. Covariance:
E[(∫_a^b f(W_t, t) dW_t)(∫_a^b g(W_t, t) dW_t)] = E[∫_a^b f(W_t, t)g(W_t, t) dt].

We shall discuss these properties, giving rough reasons why they hold true. The detailed proofs are beyond the goal of this book.

1. The Ito integral I = ∫_a^b f(W_t, t) dW_t is the mean square limit of the partial sums S_n = Σ_{i=0}^{n−1} f_{t_i}(W_{t_{i+1}} − W_{t_i}), where we denoted f_{t_i} = f(W_{t_i}, t_i). Since f(W_t, t) is a nonanticipating process, f_{t_i} is independent of the increment W_{t_{i+1}} − W_{t_i}, and hence we have

E[S_n] = E[Σ_{i=0}^{n−1} f_{t_i}(W_{t_{i+1}} − W_{t_i})] = Σ_{i=0}^{n−1} E[f_{t_i}(W_{t_{i+1}} − W_{t_i})]
= Σ_{i=0}^{n−1} E[f_{t_i}] E[W_{t_{i+1}} − W_{t_i}] = 0,

because the increments have mean zero. Applying the Squeeze Theorem in the double inequality

0 ≤ (E[S_n − I])^2 ≤ E[(S_n − I)^2] → 0, n → ∞,

yields E[S_n] − E[I] → 0. Since E[S_n] = 0 it follows that E[I] = 0, i.e. the Ito integral has zero mean.

2. The square of the partial sum can be written as

S_n^2 = (Σ_{i=0}^{n−1} f_{t_i}(W_{t_{i+1}} − W_{t_i}))^2
= Σ_{i=0}^{n−1} f_{t_i}^2 (W_{t_{i+1}} − W_{t_i})^2 + 2Σ_{i<j} f_{t_i}(W_{t_{i+1}} − W_{t_i}) f_{t_j}(W_{t_{j+1}} − W_{t_j}).

Using the independence yields

E[S_n^2] = Σ_{i=0}^{n−1} E[f_{t_i}^2] E[(W_{t_{i+1}} − W_{t_i})^2]
+ 2Σ_{i<j} E[f_{t_i}] E[W_{t_{i+1}} − W_{t_i}] E[f_{t_j}] E[W_{t_{j+1}} − W_{t_j}]
= Σ_{i=0}^{n−1} E[f_{t_i}^2](t_{i+1} − t_i),

which are the Riemann sums of the integral

∫_a^b E[f_t^2] dt = E[∫_a^b f_t^2 dt],

where the last identity follows from Fubini's theorem. Hence E[S_n^2] converges to the aforementioned integral.

3. Consider the partial sums

S_n = Σ_{i=0}^{n−1} f_{t_i}(W_{t_{i+1}} − W_{t_i}), V_n = Σ_{j=0}^{n−1} g_{t_j}(W_{t_{j+1}} − W_{t_j}).
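The zero-mean and isometry properties can be tested on the example f(W_t, t) = W_t, for which the isometry predicts E[(∫_0^T W_t dW_t)²] = ∫_0^T E[W_t²] dt = ∫_0^T t dt = T²/2. A Monte Carlo sketch (not from the book; it assumes NumPy):

```python
import numpy as np

rng = np.random.default_rng(6)
T, n, n_paths = 1.0, 500, 20_000
dW = rng.normal(0.0, np.sqrt(T / n), size=(n_paths, n))
W = np.hstack([np.zeros((n_paths, 1)), np.cumsum(dW, axis=1)])

I = np.sum(W[:, :-1] * dW, axis=1)        # one Ito partial sum per path, approximating ∫_0^T W dW
mean_I, second_moment = float(I.mean()), float((I**2).mean())
print("E[I]   ~", mean_I,        "   (exact 0)")
print("E[I^2] ~", second_moment, "   isometry value T^2/2 =", T**2 / 2)
```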
Their product is

S_n V_n = (Σ_{i=0}^{n−1} f_{t_i}(W_{t_{i+1}} − W_{t_i}))(Σ_{j=0}^{n−1} g_{t_j}(W_{t_{j+1}} − W_{t_j}))
= Σ_{i=0}^{n−1} f_{t_i} g_{t_i}(W_{t_{i+1}} − W_{t_i})^2 + Σ_{i≠j} f_{t_i} g_{t_j}(W_{t_{i+1}} − W_{t_i})(W_{t_{j+1}} − W_{t_j}).

Using that f_t and g_t are nonanticipating and that

E[(W_{t_{i+1}} − W_{t_i})(W_{t_{j+1}} − W_{t_j})] = E[W_{t_{i+1}} − W_{t_i}] E[W_{t_{j+1}} − W_{t_j}] = 0, i ≠ j,
E[(W_{t_{i+1}} − W_{t_i})^2] = t_{i+1} − t_i,

it follows that

E[S_n V_n] = Σ_{i=0}^{n−1} E[f_{t_i} g_{t_i}] E[(W_{t_{i+1}} − W_{t_i})^2] = Σ_{i=0}^{n−1} E[f_{t_i} g_{t_i}](t_{i+1} − t_i),

which is the Riemann sum for the integral ∫_a^b E[f_t g_t] dt.

From 1 and 2 it follows that the random variable ∫_a^b f(W_t, t) dW_t has mean zero and variance

Var[∫_a^b f(W_t, t) dW_t] = E[∫_a^b f(W_t, t)^2 dt].

From 1 and 3 it follows that

Cov[∫_a^b f(W_t, t) dW_t, ∫_a^b g(W_t, t) dW_t] = ∫_a^b E[f(W_t, t)g(W_t, t)] dt.

Corollary 5.5.3 (Cauchy's integral inequality) Let f_t = f(W_t, t) and g_t = g(W_t, t). Then

(∫_a^b E[f_t g_t] dt)^2 ≤ (∫_a^b E[f_t^2] dt)(∫_a^b E[g_t^2] dt).

Proof: It follows from the previous result and from the correlation formula

|Corr(X, Y)| = |Cov(X, Y)| / [Var(X)Var(Y)]^{1/2} ≤ 1.

Let F_t be the information set at time t. Then f_{t_i} and W_{t_{i+1}} − W_{t_i} are known at time t for any t_{i+1} ≤ t. It follows that the partial sum S_n = Σ_{i=0}^{n−1} f_{t_i}(W_{t_{i+1}} − W_{t_i}) is F_t-measurable. The following result, whose proof is omitted for technical reasons, states that this remains valid after taking the limit in the mean square:

Proposition 5.5.4 The Ito integral ∫_0^t f_s dW_s is F_t-measurable.

The following two results state that if the upper limit of an Ito integral is replaced by the parameter t, we obtain a continuous martingale.

Proposition 5.5.5 For any s < t we have

E[∫_0^t f(W_u, u) dW_u | F_s] = ∫_0^s f(W_u, u) dW_u.

Proof: Using the partition property (part 3 of Proposition 5.5.1) we get

E[∫_0^t f(W_u, u) dW_u | F_s]
= E[∫_0^s f(W_u, u) dW_u + ∫_s^t f(W_u, u) dW_u | F_s]
= E[∫_0^s f(W_u, u) dW_u | F_s] + E[∫_s^t f(W_u, u) dW_u | F_s]. (5.5.4)
Since ∫_0^s f(W_u, u) dW_u is F_s-measurable (see Proposition 5.5.4), by part 2 of Proposition 2.10.6

E[∫_0^s f(W_u, u) dW_u | F_s] = ∫_0^s f(W_u, u) dW_u.

Since ∫_s^t f(W_u, u) dW_u contains only information between s and t, it is independent of the information set F_s, so we can drop the conditioning in the expectation; using that Ito integrals have zero mean we obtain

E[∫_s^t f(W_u, u) dW_u | F_s] = E[∫_s^t f(W_u, u) dW_u] = 0.

Substituting into (5.5.4) yields the desired result.

Proposition 5.5.6 Consider the process X_t = ∫_0^t f(W_s, s) dW_s. Then X_t is continuous, i.e. for almost any state of the world ω ∈ Ω, the path t → X_t(ω) is continuous.

Proof: A rigorous proof is beyond the purpose of this book. We shall provide a rough sketch. Assume the process f(W_t, t) satisfies E[f(W_t, t)^2] < M, for some M > 0. Let t_0 be fixed, consider h > 0, and let Y_h = X_{t_0 + h} − X_{t_0}. Using the aforementioned properties of the Ito integral we have

E[Y_h] = E[X_{t_0 + h} − X_{t_0}] = E[∫_{t_0}^{t_0 + h} f(W_t, t) dW_t] = 0,

E[Y_h^2] = E[(∫_{t_0}^{t_0 + h} f(W_t, t) dW_t)^2] = ∫_{t_0}^{t_0 + h} E[f(W_t, t)^2] dt < M ∫_{t_0}^{t_0 + h} dt = Mh.

The process Y_h has zero mean for any h > 0 and its variance tends to 0 as h → 0. Using a convergence theorem yields that Y_h tends to 0 in mean square as h → 0. This is equivalent to the continuity of X_t at t_0.

Proposition 5.5.7 Let X_t = ∫_0^t f(W_s, s) dW_s, with E[∫_0^∞ f(W_s, s)^2 ds] < ∞. Then X_t is a continuous F_t-martingale.

Proof: We shall check the properties of a martingale.

Integrability: Using the properties of Ito integrals,

E[X_t^2] = E[(∫_0^t f(W_s, s) dW_s)^2] = E[∫_0^t f(W_s, s)^2 ds] ≤ E[∫_0^∞ f(W_s, s)^2 ds] < ∞,

and then from the inequality (E[|X_t|])^2 ≤ E[X_t^2] we obtain E[|X_t|] < ∞ for all t ≥ 0.

Predictability: X_t is F_t-measurable from Proposition 5.5.4.

Forecast: E[X_t | F_s] = X_s for s < t, by Proposition 5.5.5.

Continuity: See Proposition 5.5.6.
5.6 The Wiener Integral

The Wiener integral is a particular case of the Ito stochastic integral. It is obtained by replacing the nonanticipating stochastic process f(W_t, t) with the deterministic function f(t). The Wiener integral ∫_a^b f(t) dW_t is the mean square limit of the partial sums

S_n = Σ_{i=0}^{n−1} f(t_i)(W_{t_{i+1}} − W_{t_i}).

All properties of Ito integrals also hold for Wiener integrals. The Wiener integral is a random variable with zero mean,

E[∫_a^b f(t) dW_t] = 0,

and variance

E[(∫_a^b f(t) dW_t)^2] = ∫_a^b f(t)^2 dt.

However, in the case of Wiener integrals we can say more about their distribution.

Proposition 5.6.1 The Wiener integral I(f) = ∫_a^b f(t) dW_t is a normal random variable with mean 0 and variance

Var[I(f)] = ∫_a^b f(t)^2 dt := ‖f‖_{L^2}^2.

Proof: Since the increments W_{t_{i+1}} − W_{t_i} are normally distributed with mean 0 and variance t_{i+1} − t_i, then

f(t_i)(W_{t_{i+1}} − W_{t_i}) ∼ N(0, f(t_i)^2(t_{i+1} − t_i)).

Since these random variables are independent, their sum is also normally distributed (see Theorem 3.3.1), with

S_n ∼ N(0, Σ_{i=0}^{n−1} f(t_i)^2(t_{i+1} − t_i)).

Taking n → ∞ and max_i |t_{i+1} − t_i| → 0, the normal distribution tends to

N(0, ∫_a^b f(t)^2 dt).

The previous convergence holds in distribution, and it still needs to be shown in the mean square. However, we shall omit this essential proof detail.

Exercise 5.6.2 Show that the random variable X = ∫_1^T (1/√t) dW_t is normally distributed with mean 0 and variance ln T.

Exercise 5.6.3 Let Y = ∫_1^T √t dW_t. Show that Y is normally distributed with mean 0 and variance (T^2 − 1)/2.

Exercise 5.6.4 Find the distribution of the integral ∫_0^t e^{t−s} dW_s.

Exercise 5.6.5 Show that X_t = ∫_0^t (2t − u) dW_u and Y_t = ∫_0^t (3t − 4u) dW_u are Gaussian processes with mean 0 and variance (7/3)t^3.

Exercise 5.6.6 Show that ms-lim_{t→0} (1/t) ∫_0^t u dW_u = 0.

Exercise 5.6.7 Find all constants a, b such that X_t = ∫_0^t (a + bu/t) dW_u is normally distributed with variance t.
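Proposition 5.6.1 (with Exercise 5.6.2 as a special case) can be checked by simulation. The sketch below (not from the book; it assumes NumPy) approximates the Wiener integral for the deterministic integrand f(t) = 1/√t over many paths and compares the sample mean and variance with 0 and ∫_a^b f(t)² dt = ln(b/a).

```python
import numpy as np

rng = np.random.default_rng(7)
a, b, n, n_paths = 1.0, 4.0, 2_000, 50_000
t_left = np.linspace(a, b, n + 1)[:-1]          # left endpoints of the partition
f = 1.0 / np.sqrt(t_left)                       # deterministic integrand f(t) = 1/sqrt(t)

dW = rng.normal(0.0, np.sqrt((b - a) / n), size=(n_paths, n))
I = dW @ f                                      # one Wiener-integral sample per path

mean_I, var_I = float(I.mean()), float(I.var())
print("mean ~", mean_I, "   var ~", var_I, "   exact var = ln(b/a) =", float(np.log(b / a)))
```

A histogram of I (not shown) would also look Gaussian, in line with the proposition.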
Exercise 5.6.8 Let n be a positive integer. Prove that

Cov(W_t, ∫_0^t u^n dW_u) = t^{n+1}/(n + 1).

Formulate and prove a more general result.

5.7 Poisson Integration

In this section we deal with integration with respect to the compensated Poisson process M_t = N_t − λt, which is a martingale. Consider 0 ≤ a < b and let F_t = F(t, M_t) be a nonanticipating process with

E[∫_a^b F_t^2 dt] < ∞.

Consider the partition a = t_0 < t_1 < ··· < t_{n−1} < t_n = b of the interval [a, b], and associate the partial sums

S_n = Σ_{i=0}^{n−1} F_{t_i−}(M_{t_{i+1}} − M_{t_i}),

where F_{t_i−} is the left-hand limit of F at t_i. For predictability reasons, the intermediate points are the left-hand limits at the left endpoints of each interval. Since the process F_t is nonanticipating, the random variables F_{t_i−} and M_{t_{i+1}} − M_{t_i} are independent.

The integral of F_{t−} with respect to M_t is the mean square limit of the partial sums S_n,

ms-lim_{n→∞} S_n = ∫_a^b F_{t−} dM_t,

provided the limit exists. More precisely, this convergence means that

lim_{n→∞} E[(S_n − ∫_a^b F_{t−} dM_t)^2] = 0.

Exercise 5.7.1 Let c be a constant. Show that

∫_a^b c dM_t = c(M_b − M_a).

5.8 A Worked-Out Example: the case F_t = M_t

We shall integrate the process M_{t−} between 0 and T with respect to M_t. Consider the equidistant partition points t_k = kT/n, k = 0, 1, ···, n. Then the partial sums are given by

S_n = Σ_{i=0}^{n−1} M_{t_i−}(M_{t_{i+1}} − M_{t_i}).

Using xy = (1/2)[(x + y)^2 − x^2 − y^2] with x = M_{t_i−} and y = M_{t_{i+1}} − M_{t_i}, we get

M_{t_i−}(M_{t_{i+1}} − M_{t_i}) = (1/2)(M_{t_{i+1}} − M_{t_i} + M_{t_i−})^2 − (1/2)M_{t_i−}^2 − (1/2)(M_{t_{i+1}} − M_{t_i})^2.

Let J be the set of jump instances between 0 and T. Using that M_{t_i} = M_{t_i−} for t_i ∉ J, and M_{t_i} = 1 + M_{t_i−} for t_i ∈ J, yields

M_{t_{i+1}} − M_{t_i} + M_{t_i−} = M_{t_{i+1}}, if t_i ∉ J; M_{t_{i+1}} − 1, if t_i ∈ J.
Splitting the sum, canceling in pairs, and applying the difference of squares formula, we have

S_n = (1/2)Σ_{i=0}^{n−1} (M_{t_{i+1}} − M_{t_i} + M_{t_i−})^2 − (1/2)Σ_{i=0}^{n−1} M_{t_i−}^2 − (1/2)Σ_{i=0}^{n−1} (M_{t_{i+1}} − M_{t_i})^2
= (1/2)Σ_{t_i∈J} (M_{t_{i+1}} − 1)^2 + (1/2)Σ_{t_i∉J} M_{t_{i+1}}^2 − (1/2)Σ_{i=0}^{n−1} M_{t_i−}^2 − (1/2)Σ_{i=0}^{n−1} (M_{t_{i+1}} − M_{t_i})^2
= (1/2)Σ_{t_i∈J} [(M_{t_{i+1}} − 1)^2 − M_{t_i−}^2] + (1/2)M_{t_n}^2 − (1/2)Σ_{i=0}^{n−1} (M_{t_{i+1}} − M_{t_i})^2
= (1/2)Σ_{t_i∈J} (M_{t_{i+1}} − 1 − M_{t_i−})(M_{t_{i+1}} − 1 + M_{t_i−}) + (1/2)M_{t_n}^2 − (1/2)Σ_{i=0}^{n−1} (M_{t_{i+1}} − M_{t_i})^2
= (1/2)M_{t_n}^2 − (1/2)Σ_{i=0}^{n−1} (M_{t_{i+1}} − M_{t_i})^2,

since for t_i ∈ J the first factor is M_{t_{i+1}} − 1 − M_{t_i−} = M_{t_{i+1}} − M_{t_i}, which vanishes as the partition is refined. Hence, using Proposition 4.10.9, we have arrived at the following formula:

∫_0^T M_{t−} dM_t = (1/2)M_T^2 − (1/2)N_T.

Similarly, one can obtain

∫_a^b M_{t−} dM_t = (1/2)(M_b^2 − M_a^2) − (1/2)(N_b − N_a).

Exercise 5.8.1 (a) Show that E[∫_a^b M_{t−} dM_t] = 0.
(b) Find Var[∫_a^b M_{t−} dM_t].

Remark 5.8.2 (a) Let ω be a fixed state of the world and assume the sample path t → N_t(ω) has a jump in the interval (a, b). Even if beyond the scope of this book, it can be shown that the integral

∫_a^b N_t(ω) dN_t

does not exist in the Riemann–Stieltjes sense.

(b) Let N_{t−} denote the left-hand limit of N_t. It can be shown that N_{t−} is predictable, while N_t is not.

The previous remarks provide the reason why in the following we shall work with M_{t−} instead of M_t: the integral ∫_a^b M_t dN_t might not exist, while ∫_a^b M_{t−} dN_t does exist.

Exercise 5.8.3 Show that

∫_0^T N_{t−} dM_t = (1/2)(N_T^2 − N_T) − λ ∫_0^T N_t dt.

Exercise 5.8.4 Find the variance of ∫_0^T N_{t−} dM_t.

The following integrals with respect to a Poisson process N_t are considered in the Riemann–Stieltjes sense.

Proposition 5.8.5 For any continuous function f we have

(a) E[∫_0^t f(s) dN_s] = λ ∫_0^t f(s) ds;

(b) E[(∫_0^t f(s) dN_s)^2] = λ ∫_0^t f(s)^2 ds + λ^2 (∫_0^t f(s) ds)^2;

(c) E[e^{∫_0^t f(s) dN_s}] = e^{λ ∫_0^t (e^{f(s)} − 1) ds}.

Proof: (a) Consider the equidistant partition 0 = s_0 < s_1 < ··· < s_n = t, with s_{k+1} − s_k = Δs.
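The closed form ∫_0^T M_{t−} dM_t = ½M_T² − ½N_T can be illustrated on a simulated path. The sketch below (not from the book; it assumes NumPy) discretizes [0, T] finely; since the grid points almost surely avoid the jump times, the left endpoint value M_{t_i} plays the role of M_{t_i−}.

```python
import numpy as np

rng = np.random.default_rng(8)
lam, T, n = 3.0, 5.0, 2_000_000
t = np.linspace(0.0, T, n + 1)

jumps = np.sort(rng.uniform(0.0, T, size=rng.poisson(lam * T)))
N = np.searchsorted(jumps, t, side="right")     # Poisson counts N_{t_k}
M = N - lam * t                                 # compensated Poisson path
dM = np.diff(M)

S = float(np.sum(M[:-1] * dM))                  # left-point sums approximate ∫ M_{t-} dM_t
closed_form = 0.5 * M[-1] ** 2 - 0.5 * N[-1]
print("sum M dM        ~", S)
print("M_T^2/2 - N_T/2 =", closed_form)
```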
Then

E[∫_0^t f(s) dN_s] = lim_{n→∞} E[Σ_{i=0}^{n−1} f(s_i)(N_{s_{i+1}} − N_{s_i})] = lim_{n→∞} Σ_{i=0}^{n−1} f(s_i) E[N_{s_{i+1}} − N_{s_i}]
= λ lim_{n→∞} Σ_{i=0}^{n−1} f(s_i)(s_{i+1} − s_i) = λ ∫_0^t f(s) ds.

(b) Using that N_t is stationary and has independent increments, we have respectively

E[(N_{s_{i+1}} − N_{s_i})^2] = E[N_{s_{i+1}−s_i}^2] = λ(s_{i+1} − s_i) + λ^2(s_{i+1} − s_i)^2 = λΔs + λ^2(Δs)^2,
E[(N_{s_{i+1}} − N_{s_i})(N_{s_{j+1}} − N_{s_j})] = E[N_{s_{i+1}} − N_{s_i}] E[N_{s_{j+1}} − N_{s_j}] = λ(s_{i+1} − s_i) λ(s_{j+1} − s_j) = λ^2(Δs)^2, i ≠ j.

Applying the expectation to the formula

(Σ_{i=0}^{n−1} f(s_i)(N_{s_{i+1}} − N_{s_i}))^2 = Σ_{i=0}^{n−1} f(s_i)^2(N_{s_{i+1}} − N_{s_i})^2 + 2Σ_{i<j} f(s_i)f(s_j)(N_{s_{i+1}} − N_{s_i})(N_{s_{j+1}} − N_{s_j})

yields

E[(Σ_{i=0}^{n−1} f(s_i)(N_{s_{i+1}} − N_{s_i}))^2] = Σ_{i=0}^{n−1} f(s_i)^2(λΔs + λ^2(Δs)^2) + 2Σ_{i<j} f(s_i)f(s_j)λ^2(Δs)^2
= λ Σ_{i=0}^{n−1} f(s_i)^2 Δs + λ^2 [Σ_{i=0}^{n−1} f(s_i)^2(Δs)^2 + 2Σ_{i<j} f(s_i)f(s_j)(Δs)^2]
= λ Σ_{i=0}^{n−1} f(s_i)^2 Δs + λ^2 (Σ_{i=0}^{n−1} f(s_i)Δs)^2
→ λ ∫_0^t f(s)^2 ds + λ^2 (∫_0^t f(s) ds)^2, as n → ∞.

(c) Using that N_t is stationary with independent increments and has the moment generating function E[e^{kN_t}] = e^{λ(e^k − 1)t}, we have

E[e^{∫_0^t f(s) dN_s}] = lim_{n→∞} E[e^{Σ_{i=0}^{n−1} f(s_i)(N_{s_{i+1}} − N_{s_i})}] = lim_{n→∞} E[Π_{i=0}^{n−1} e^{f(s_i)(N_{s_{i+1}} − N_{s_i})}]
= lim_{n→∞} Π_{i=0}^{n−1} E[e^{f(s_i)(N_{s_{i+1}} − N_{s_i})}] = lim_{n→∞} Π_{i=0}^{n−1} e^{λ(e^{f(s_i)} − 1)(s_{i+1} − s_i)}
= lim_{n→∞} e^{λ Σ_{i=0}^{n−1} (e^{f(s_i)} − 1)(s_{i+1} − s_i)} = e^{λ ∫_0^t (e^{f(s)} − 1) ds}.

Since f is continuous, the Poisson integral can be computed in terms of the waiting times S_k:

∫_0^t f(s) dN_s = Σ_{k=1}^{N_t} f(S_k).

This formula can be used to give another proof of the previous result. For instance, taking the expectation, conditioning over N_t = n, and using that, given N_t = n, the waiting times are uniformly distributed over [0, t], yields

E[∫_0^t f(s) dN_s] = E[Σ_{k=1}^{N_t} f(S_k)] = Σ_{n≥0} E[Σ_{k=1}^{n} f(S_k) | N_t = n] P(N_t = n)
= Σ_{n≥0} (n/t)(∫_0^t f(x) dx) e^{−λt}(λt)^n/n!
= e^{−λt}(∫_0^t f(x) dx) λ Σ_{n≥1} (λt)^{n−1}/(n − 1)!
= e^{−λt}(∫_0^t f(x) dx) λ e^{λt} = λ ∫_0^t f(x) dx.
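Part (a) of Proposition 5.8.5 and the jump-time representation Σ_{k≤N_t} f(S_k) can be checked together by Monte Carlo. The sketch below (not from the book; it assumes NumPy) uses f(s) = s², for which λ∫_0^t f(s) ds = λt³/3.

```python
import numpy as np

rng = np.random.default_rng(9)
lam, t, n_paths = 2.0, 3.0, 100_000
f = lambda s: s**2                              # any continuous f will do

counts = rng.poisson(lam * t, size=n_paths)     # N_t per path
vals = np.empty(n_paths)
for i, k in enumerate(counts):
    S = rng.uniform(0.0, t, size=k)             # given N_t = k, jump times are iid U(0, t)
    vals[i] = np.sum(f(S))                      # ∫_0^t f dN = sum of f over the jump times

mean_val = float(vals.mean())
print("E[∫ f dN] ~", mean_val, "   exact λ t^3/3 =", lam * t**3 / 3)
```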
Exercise 5.8.6 Solve parts (b) and (c) of Proposition 5.8.5 using an idea similar to the one presented above.

Exercise 5.8.7 Show that

E[(∫_0^t f(s) dM_s)^2] = λ ∫_0^t f(s)^2 ds,

where M_t = N_t − λt is the compensated Poisson process.

Exercise 5.8.8 Prove that

Var[∫_0^t f(s) dN_s] = λ ∫_0^t f(s)^2 ds.

Exercise 5.8.9 Find E[e^{∫_0^t f(s) dM_s}].

Proposition 5.8.10 Let F_t = σ(N_s; 0 ≤ s ≤ t). Then for any constant c, the process

M_t = e^{cN_t + λ(1 − e^c)t}, t ≥ 0,

is an F_t-martingale.

Proof: Let s < t. Since N_t − N_s is independent of F_s and N_t is stationary, we have

E[e^{c(N_t − N_s)} | F_s] = E[e^{c(N_t − N_s)}] = E[e^{cN_{t−s}}] = e^{λ(e^c − 1)(t − s)}.

On the other side, taking out the predictable part yields

E[e^{c(N_t − N_s)} | F_s] = e^{−cN_s} E[e^{cN_t} | F_s].

Equating the last two relations we arrive at

E[e^{cN_t + λ(1 − e^c)t} | F_s] = e^{cN_s + λ(1 − e^c)s},

which is equivalent to the martingale condition E[M_t | F_s] = M_s.

We shall present an application of the previous result. Consider the waiting time until the n-th jump, S_n = inf{t > 0; N_t = n}, which is a stopping time, and the filtration F_t = σ(N_s; 0 ≤ s ≤ t). Since M_t = e^{cN_t + λ(1 − e^c)t} is an F_t-martingale, by the Optional Stopping Theorem (Theorem 4.2.1) we have E[M_{S_n}] = E[M_0] = 1, which is equivalent to

E[e^{λ(1 − e^c)S_n}] = e^{−cn}.

Substituting s = −λ(1 − e^c) yields c = ln(1 + s/λ). Since s, λ > 0, then c > 0, and the previous expression becomes

E[e^{−sS_n}] = e^{−n ln(1 + s/λ)} = (λ/(λ + s))^n.

Since the expectation on the left side is the Laplace transform of the probability density of S_n, then

p_{S_n}(t) = L^{−1}{(λ/(λ + s))^n} = λ^n t^{n−1} e^{−λt}/Γ(n),

which shows that S_n has a gamma distribution.

5.9 The distribution function of X_T = ∫_0^T g(t) dN_t

In this section we consider the function g(t) continuous. Let S_1 < S_2 < ··· < S_{N_T} denote the waiting times until time T. Since the increments dN_t are equal to 1 at the jump times S_k and 0 otherwise, the integral can be written as

X_T = ∫_0^T g(t) dN_t = g(S_1) + ··· + g(S_{N_T}).
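The Laplace-transform computation above can be sanity-checked by simulation. The sketch below (not from the book; it assumes NumPy) uses the standard fact that the interarrival times of a Poisson process are independent Exp(λ) random variables, so S_n is their sum, and compares E[e^{−sS_n}] with the derived value (λ/(λ + s))^n alongside the gamma mean n/λ.

```python
import numpy as np

rng = np.random.default_rng(10)
lam, n, n_paths, s = 2.0, 5, 400_000, 1.0
S_n = rng.exponential(1.0 / lam, size=(n_paths, n)).sum(axis=1)  # n-th waiting time

mean_Sn = float(S_n.mean())
laplace = float(np.mean(np.exp(-s * S_n)))
print("E[S_n]        ~", mean_Sn, "   gamma mean n/λ =", n / lam)
print("E[e^{-s S_n}] ~", laplace, "   (λ/(λ+s))^n =", (lam / (lam + s)) ** n)
```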
The distribution function of the random variable $X_T = \int_0^T g(t)\,dN_t$ can be obtained by conditioning over $N_T$:
$$P(X_T \le u) = \sum_{k\ge 0} P(X_T \le u\,|\,N_T = k)\,P(N_T = k) = \sum_{k\ge 0} P\big(g(S_1)+\cdots+g(S_k)\le u\,\big|\,N_T = k\big)\,P(N_T = k)$$
$$= \sum_{k\ge 0} P\big(g(S_1)+\cdots+g(S_k)\le u\big)\,P(N_T = k). \qquad (5.9.5)$$
Considering $S_1, S_2, \dots, S_k$ independent and uniformly distributed over the interval $[0,T]$, we have
$$P\big(g(S_1)+\cdots+g(S_k)\le u\big) = \frac{1}{T^k}\int_{D_k} dx_1\cdots dx_k = \frac{vol(D_k)}{T^k},$$
where $D_k = \{g(x_1)+g(x_2)+\cdots+g(x_k)\le u\}\cap\{0\le x_1,\dots,x_k\le T\}$. Substituting back in (5.9.5) yields
$$P(X_T\le u) = \sum_{k\ge 0}\frac{vol(D_k)}{T^k}\,e^{-\lambda T}\frac{\lambda^k T^k}{k!} = e^{-\lambda T}\sum_{k\ge 0}\frac{vol(D_k)\,\lambda^k}{k!}. \qquad (5.9.6)$$
In general, the volume of the $k$-dimensional solid $D_k$ is not easy to obtain. However, there are simple cases in which it can be computed explicitly.

A Particular Case. We shall do an explicit computation of the distribution function of $X_T = \int_0^T s^2\,dN_s$. In this case the solid $D_k$ is the intersection between the $k$-dimensional ball of radius $\sqrt{u}$ centered at the origin and the $k$-dimensional cube $[0,T]^k$. There are three possible shapes for $D_k$, depending on the size of $\sqrt{u}$:
(a) if $0\le\sqrt{u}<T$, then $D_k$ is a $\frac{1}{2^k}$-part of a $k$-dimensional ball;
(b) if $T\le\sqrt{u}<T\sqrt{k}$, then $D_k$ has a complicated shape;
(c) if $T\sqrt{k}\le\sqrt{u}$, then $D_k$ is the entire $k$-dimensional cube, and then $vol(D_k) = T^k$.

Since the volume of the $k$-dimensional ball of radius $R$ is given by $\frac{\pi^{k/2}R^k}{\Gamma(\frac{k}{2}+1)}$, the volume of $D_k$ in case (a) becomes
$$vol(D_k) = \frac{\pi^{k/2}u^{k/2}}{2^k\,\Gamma(\frac{k}{2}+1)}.$$
Substituting in (5.9.6) yields
$$P(X_T\le u) = e^{-\lambda T}\sum_{k\ge 0}\frac{(\lambda^2\pi u)^{k/2}}{2^k\,k!\,\Gamma(\frac{k}{2}+1)}, \qquad 0\le\sqrt{u}<T.$$
It is worth noting that for $u\to\infty$ the inequality $T\sqrt{k}\le\sqrt{u}$ is satisfied for all $k\ge 0$; hence relation (5.9.6) yields
$$\lim_{u\to\infty} P(X_T\le u) = e^{-\lambda T}\sum_{k\ge 0}\frac{\lambda^k T^k}{k!} = e^{-\lambda T}e^{\lambda T} = 1.$$
The computation in case (b) is more complicated and will be omitted.
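In the regime $0\le\sqrt{u}<T$ the series can be compared with a direct simulation of $X_T = \sum_k S_k^2$. The following sketch (an added illustration, assuming NumPy; the values $\lambda = 1$, $T = 1$, $u = 0.49$ are arbitrary choices with $\sqrt{u} < T$) does exactly that.

```python
import math
import numpy as np

# Compare the series  P(X_T <= u) = e^{-lam T} sum_k (lam^2 pi u)^{k/2} / (2^k k! Gamma(k/2+1))
# for X_T = int_0^T s^2 dN_s (case sqrt(u) < T) against a Monte Carlo estimate.
lam, T, u = 1.0, 1.0, 0.49                       # sqrt(u) = 0.7 < T

series = math.exp(-lam * T) * sum(
    (lam**2 * math.pi * u) ** (k / 2)
    / (2**k * math.factorial(k) * math.gamma(k / 2 + 1))
    for k in range(40)
)

rng = np.random.default_rng(2)
n_paths = 400_000
counts = rng.poisson(lam * T, size=n_paths)      # N_T per path
m = counts.max()
s = rng.uniform(0.0, T, size=(n_paths, m))       # jump times, uniform given N_T
mask = np.arange(m) < counts[:, None]
X = np.where(mask, s**2, 0.0).sum(axis=1)        # sum of S_k^2
p_mc = (X <= u).mean()
print(series, p_mc)
```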
Exercise 5.9.1 Calculate the expectation $E\big[\int_0^T e^{ks}\,dN_s\big]$ and the variance $Var\big(\int_0^T e^{ks}\,dN_s\big)$.

Exercise 5.9.2 Compute the distribution function of $X_T = \int_0^T s\,dN_s$.

Chapter 6

Stochastic Differentiation

6.1 Differentiation Rules

Most stochastic processes are not differentiable. For instance, the Brownian motion process $W_t$ is a continuous process which is nowhere differentiable. Hence derivatives such as $\frac{dW_t}{dt}$ do not make sense in stochastic calculus. The only quantities allowed to be used are the infinitesimal changes of the process, in our case $dW_t$.

The infinitesimal change of a process

The change in the process $X_t$ between the instances $t$ and $t+\Delta t$ is given by $\Delta X_t = X_{t+\Delta t} - X_t$. When $\Delta t$ is infinitesimally small, we obtain the infinitesimal change of the process $X_t$:
$$dX_t = X_{t+dt} - X_t.$$
Sometimes it is useful to use the equivalent formula $X_{t+dt} = X_t + dX_t$.

6.2 Basic Rules

The following rules are the analogs of familiar differentiation rules from elementary Calculus.

The constant multiple rule. If $X_t$ is a stochastic process and $c$ is a constant, then
$$d(cX_t) = c\,dX_t.$$
The verification follows from a straightforward application of the infinitesimal change formula:
$$d(cX_t) = cX_{t+dt} - cX_t = c(X_{t+dt} - X_t) = c\,dX_t.$$

The sum rule. If $X_t$ and $Y_t$ are two stochastic processes, then
$$d(X_t + Y_t) = dX_t + dY_t.$$
The verification is as follows:
$$d(X_t+Y_t) = (X_{t+dt}+Y_{t+dt}) - (X_t+Y_t) = (X_{t+dt}-X_t) + (Y_{t+dt}-Y_t) = dX_t + dY_t.$$

The difference rule. If $X_t$ and $Y_t$ are two stochastic processes, then
$$d(X_t - Y_t) = dX_t - dY_t.$$
The proof is similar to the one for the sum rule.

The product rule. If $X_t$ and $Y_t$ are two stochastic processes, then
$$d(X_tY_t) = X_t\,dY_t + Y_t\,dX_t + dX_t\,dY_t.$$
The proof is as follows:
$$d(X_tY_t) = X_{t+dt}Y_{t+dt} - X_tY_t = X_t(Y_{t+dt}-Y_t) + Y_t(X_{t+dt}-X_t) + (X_{t+dt}-X_t)(Y_{t+dt}-Y_t) = X_t\,dY_t + Y_t\,dX_t + dX_t\,dY_t,$$
where the second identity is verified by direct computation.
If the process $X_t$ is replaced by the deterministic function $f(t)$, then the aforementioned formula becomes
$$d(f(t)Y_t) = f(t)\,dY_t + Y_t\,df(t) + df(t)\,dY_t.$$
Since in most practical cases the process $Y_t$ is an Ito diffusion, $dY_t = a(t,W_t)\,dt + b(t,W_t)\,dW_t$, using the relations $dt\,dW_t = dt^2 = 0$ the last term vanishes,
$$df(t)\,dY_t = f'(t)\,dt\,dY_t = 0,$$
and hence
$$d(f(t)Y_t) = f(t)\,dY_t + Y_t\,df(t).$$
This relation looks like the usual product rule.

The quotient rule. If $X_t$ and $Y_t$ are two stochastic processes, then
$$d\Big(\frac{X_t}{Y_t}\Big) = \frac{Y_t\,dX_t - X_t\,dY_t - dX_t\,dY_t}{Y_t^2} + \frac{X_t}{Y_t^3}(dY_t)^2.$$
The proof follows from Ito's formula and will be addressed in Section 6.3.3. When the process $Y_t$ is replaced by the deterministic function $f(t)$, and $X_t$ is an Ito diffusion, the previous formula becomes
$$d\Big(\frac{X_t}{f(t)}\Big) = \frac{f(t)\,dX_t - X_t\,df(t)}{f(t)^2}.$$

Example 6.2.1 We shall show that $d(W_t^2) = 2W_t\,dW_t + dt$.
Applying the product rule and the fundamental relation $(dW_t)^2 = dt$ yields
$$d(W_t^2) = W_t\,dW_t + W_t\,dW_t + dW_t\,dW_t = 2W_t\,dW_t + dt.$$

Example 6.2.2 Show that $d(W_t^3) = 3W_t^2\,dW_t + 3W_t\,dt$.
Applying the product rule and the previous example yields
$$d(W_t^3) = d(W_t\cdot W_t^2) = W_t\,d(W_t^2) + W_t^2\,dW_t + d(W_t^2)\,dW_t$$
$$= W_t(2W_t\,dW_t + dt) + W_t^2\,dW_t + dW_t(2W_t\,dW_t + dt) = 2W_t^2\,dW_t + W_t\,dt + W_t^2\,dW_t + 2W_t(dW_t)^2 + dt\,dW_t = 3W_t^2\,dW_t + 3W_t\,dt,$$
where we used $(dW_t)^2 = dt$ and $dt\,dW_t = 0$.

Example 6.2.3 Show that $d(tW_t) = W_t\,dt + t\,dW_t$.
Using the product rule and $dt\,dW_t = 0$, we get
$$d(tW_t) = W_t\,dt + t\,dW_t + dt\,dW_t = W_t\,dt + t\,dW_t.$$

Example 6.2.4 Let $Z_t = \int_0^t W_u\,du$ be the integrated Brownian motion. Show that $dZ_t = W_t\,dt$.
The infinitesimal change of $Z_t$ is
$$dZ_t = Z_{t+dt} - Z_t = \int_t^{t+dt} W_s\,ds = W_t\,dt,$$
since $W_s$ is a continuous function in $s$.

Example 6.2.5 Let $A_t = \frac{1}{t}Z_t = \frac{1}{t}\int_0^t W_u\,du$ be the average of the Brownian motion on the time interval $[0,t]$. Show that
$$dA_t = \frac{1}{t}\Big(W_t - \frac{1}{t}Z_t\Big)\,dt.$$
We have
$$dA_t = d\Big(\frac{1}{t}\Big)Z_t + \frac{1}{t}\,dZ_t + d\Big(\frac{1}{t}\Big)\,dZ_t = \frac{-1}{t^2}Z_t\,dt + \frac{1}{t}W_t\,dt - \frac{1}{t^2}W_t\,\underbrace{dt^2}_{=0} = \frac{1}{t}\Big(W_t - \frac{1}{t}Z_t\Big)\,dt.$$
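Example 6.2.1 has a simple pathwise interpretation: along a fine partition, $W_T^2$ must coincide with the accumulated increments $\sum 2W_{t_i}\Delta W_i + T$. The following sketch (an added illustration, NumPy assumed; step count and horizon are arbitrary) checks this on a simulated path.

```python
import numpy as np

# Pathwise check of d(W_t^2) = 2 W_t dW_t + dt: on a discretized Brownian path,
# W_T^2 should match the Ito sum  sum 2 W_{t_i} dW_i + T.
rng = np.random.default_rng(3)
T, n = 1.0, 200_000
dt = T / n
dW = rng.normal(0.0, np.sqrt(dt), size=n)       # Brownian increments
W = np.concatenate(([0.0], np.cumsum(dW)))      # W at grid points, W_0 = 0

lhs = W[-1] ** 2                                # W_T^2
rhs = np.sum(2.0 * W[:-1] * dW) + T             # left-point Ito sum plus T
print(lhs, rhs)
```

The small residual is $\sum(\Delta W_i)^2 - T$, which vanishes in mean square as the partition is refined; this is the discrete face of the rule $(dW_t)^2 = dt$.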
Exercise 6.2.1 Let $G_t = \frac{1}{t}\int_0^t e^{W_u}\,du$ be the average of the geometric Brownian motion on $[0,t]$. Find $dG_t$.

6.3 Ito's Formula

Ito's formula is the analog of the chain rule from elementary Calculus. We shall start by reviewing a few concepts regarding function approximations.

Let $f$ be a twice continuously differentiable function of a real variable $x$. Let $x_0$ be fixed and consider the changes $\Delta x = x - x_0$ and $\Delta f(x) = f(x) - f(x_0)$. It is known from Calculus that the following second order Taylor approximation holds:
$$\Delta f(x) = f'(x)\,\Delta x + \frac12 f''(x)(\Delta x)^2 + O\big((\Delta x)^3\big).$$
When $x$ is infinitesimally close to $x_0$, we replace $\Delta x$ by the differential $dx$ and obtain
$$df(x) = f'(x)\,dx + \frac12 f''(x)(dx)^2 + O\big((dx)^3\big). \qquad (6.3.1)$$
In elementary Calculus, all terms of order equal to or higher than $dx^2$ are neglected; then the aforementioned formula becomes $df(x) = f'(x)\,dx$. Now, if we consider $x = x(t)$ to be a differentiable function of $t$, substituting into the previous formula we obtain the differential form of the well-known chain rule:
$$df\big(x(t)\big) = f'\big(x(t)\big)\,dx(t) = f'\big(x(t)\big)\,x'(t)\,dt.$$

We shall present a similar formula for the stochastic environment. In this case the deterministic function $x(t)$ is replaced by a stochastic process $X_t$. The composition between the differentiable function $f$ and the process $X_t$ is denoted by $F_t = f(X_t)$. Neglecting the increment powers equal to or higher than $(dX_t)^3$, the expression (6.3.1) becomes
$$dF_t = f'(X_t)\,dX_t + \frac12 f''(X_t)(dX_t)^2. \qquad (6.3.2)$$
In the computation of $(dX_t)^2$ we may take into account stochastic relations such as $(dW_t)^2 = dt$ and $dt\,dW_t = 0$.

Theorem 6.3.1 (Ito's formula) Consider a stochastic process $X_t$ satisfying $dX_t = b_t\,dt + \sigma_t\,dW_t$, with $b_t(\omega)$ and $\sigma_t(\omega)$ "well-behaving" processes. Let $F_t = f(X_t)$, with $f$ twice continuously differentiable. Then
$$dF_t = \Big[b_t f'(X_t) + \frac{\sigma_t^2}{2}f''(X_t)\Big]dt + \sigma_t f'(X_t)\,dW_t. \qquad (6.3.3)$$
Proof: We shall provide an informal proof.
Using the relations $(dW_t)^2 = dt$ and $dt^2 = dW_t\,dt = 0$, we have
$$(dX_t)^2 = (b_t\,dt + \sigma_t\,dW_t)^2 = b_t^2\,dt^2 + 2b_t\sigma_t\,dW_t\,dt + \sigma_t^2(dW_t)^2 = \sigma_t^2\,dt.$$
Substituting into (6.3.2) yields
$$dF_t = f'(X_t)\,dX_t + \frac12 f''(X_t)(dX_t)^2 = f'(X_t)(b_t\,dt + \sigma_t\,dW_t) + \frac12 f''(X_t)\,\sigma_t^2\,dt = \Big[b_t f'(X_t) + \frac{\sigma_t^2}{2}f''(X_t)\Big]dt + \sigma_t f'(X_t)\,dW_t.$$

In the case $X_t = W_t$ we obtain the following consequence:

Corollary 6.3.2 Let $F_t = f(W_t)$. Then
$$dF_t = \frac12 f''(W_t)\,dt + f'(W_t)\,dW_t. \qquad (6.3.4)$$

Particular cases

1. If $f(x) = x^\alpha$, with $\alpha$ constant, then $f'(x) = \alpha x^{\alpha-1}$ and $f''(x) = \alpha(\alpha-1)x^{\alpha-2}$, and (6.3.4) becomes the following useful formula:
$$d(W_t^\alpha) = \frac12\alpha(\alpha-1)W_t^{\alpha-2}\,dt + \alpha W_t^{\alpha-1}\,dW_t.$$
A couple of useful cases easily follow:
$$d(W_t^2) = 2W_t\,dW_t + dt, \qquad d(W_t^3) = 3W_t^2\,dW_t + 3W_t\,dt.$$

2. If $f(x) = e^{kx}$, with $k$ constant, then $f'(x) = ke^{kx}$ and $f''(x) = k^2e^{kx}$. Therefore
$$d(e^{kW_t}) = ke^{kW_t}\,dW_t + \frac12 k^2 e^{kW_t}\,dt.$$
In particular, for $k = 1$, we obtain the increments of a geometric Brownian motion:
$$d(e^{W_t}) = e^{W_t}\,dW_t + \frac12 e^{W_t}\,dt.$$

3. If $f(x) = \sin x$, then
$$d(\sin W_t) = \cos W_t\,dW_t - \frac12 \sin W_t\,dt.$$

Exercise 6.3.3 Use the previous rules to find the following increments:
(a) $d(W_t e^{W_t})$
(b) $d(3W_t^2 + 2e^{5W_t})$
(c) $d(e^{t+W_t^2})$
(d) $d\big((t+W_t)^n\big)$
(e) $d\big(\frac{1}{t}\int_0^t W_u\,du\big)$
(f) $d\big(\frac{1}{t^\alpha}\int_0^t e^{W_u}\,du\big)$, where $\alpha$ is a constant.

In the case when the function $f = f(t,x)$ is also time dependent, the analog of (6.3.1) is given by
$$df(t,x) = \partial_t f(t,x)\,dt + \partial_x f(t,x)\,dx + \frac12\partial_x^2 f(t,x)(dx)^2 + O\big((dx)^3\big) + O\big(dt^2\big). \qquad (6.3.5)$$
Substituting $x = X_t$ yields
$$df(t,X_t) = \partial_t f(t,X_t)\,dt + \partial_x f(t,X_t)\,dX_t + \frac12\partial_x^2 f(t,X_t)(dX_t)^2. \qquad (6.3.6)$$
If $X_t$ is an Ito diffusion, $dX_t = a(W_t,t)\,dt + b(W_t,t)\,dW_t$, we obtain an extra term in formula (6.3.3):
$$dF_t = \Big[\partial_t f(t,X_t) + a(W_t,t)\,\partial_x f(t,X_t) + \frac{b(W_t,t)^2}{2}\,\partial_x^2 f(t,X_t)\Big]dt + b(W_t,t)\,\partial_x f(t,X_t)\,dW_t. \qquad (6.3.7)$$

Exercise 6.3.4 Show that $d(tW_t^2) = (t + W_t^2)\,dt + 2tW_t\,dW_t$.

Exercise 6.3.5 Find the following increments:
(a) $d(tW_t)$  (b) $d(e^tW_t)$  (c) $d(t^2\cos W_t)$  (d) $d(\sin t\cdot W_t^2)$.
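Corollary 6.3.2 can also be checked pathwise: integrating the right-hand side of $d(\sin W_t) = \cos W_t\,dW_t - \frac12\sin W_t\,dt$ along a discretized path should recover $\sin W_T$. A minimal sketch (an added illustration, NumPy assumed; step count and horizon are arbitrary):

```python
import numpy as np

# Pathwise check of Ito's formula for f(x) = sin(x):
#   sin(W_T) - sin(W_0) = int cos(W_t) dW_t - (1/2) int sin(W_t) dt.
rng = np.random.default_rng(4)
T, n = 1.0, 400_000
dt = T / n
dW = rng.normal(0.0, np.sqrt(dt), size=n)
W = np.concatenate(([0.0], np.cumsum(dW)))

lhs = np.sin(W[-1])                                              # sin W_T (W_0 = 0)
rhs = np.sum(np.cos(W[:-1]) * dW) - 0.5 * np.sum(np.sin(W[:-1])) * dt
print(lhs, rhs)
```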
6.3.1 Ito diffusions

Consider the process $X_t$ given by
$$dX_t = b(X_t,t)\,dt + \sigma(X_t,t)\,dW_t. \qquad (6.3.8)$$
A process $X_t = (X_t^i) \in \mathbb{R}^n$ satisfying this relation is called an Ito diffusion in $\mathbb{R}^n$. Equation (6.3.8) models the position of a small particle that moves under the influence of a drift force $b(X_t,t)$ and is subject to random deviations. This situation occurs in the physical world when a particle suspended in a moving liquid is subject to random molecular bombardments. The quantity $\frac12\sigma\sigma^T$ is called the diffusion coefficient and describes the diffusion of the particle.

Exercise 6.3.6 Consider the time-homogeneous Ito diffusion $dX_t = b(X_t)\,dt + \sigma(X_t)\,dW_t$, started at $X_0 = x$. Show that:
(a) $b^i(x) = \lim_{t\to 0}\dfrac{E[X_t^i - x^i]}{t}$;
(b) $(\sigma\sigma^T)^{ij}(x) = \lim_{t\to 0}\dfrac{E[(X_t^i - x^i)(X_t^j - x^j)]}{t}$.

6.3.2 Ito's formula for Poisson processes

Consider the process $F_t = F(M_t)$, where $M_t = N_t - \lambda t$ is the compensated Poisson process. Ito's formula for the process $F_t$ takes the following integral form; for a proof the reader can consult Kuo [?].

Proposition 6.3.7 Let $F$ be a twice differentiable function. Then for any $a < t$ we have
$$F_t = F_a + \int_a^t F'(M_{s-})\,dM_s + \sum_{a<s\le t}\big[\Delta F(M_s) - F'(M_{s-})\,\Delta M_s\big],$$
where $\Delta M_s = M_s - M_{s-}$ and $\Delta F(M_s) = F(M_s) - F(M_{s-})$.

We shall apply the aforementioned result in the case $F_t = F(M_t) = M_t^2$. We have
$$M_t^2 = M_a^2 + 2\int_a^t M_{s-}\,dM_s + \sum_{a<s\le t}\big[M_s^2 - M_{s-}^2 - 2M_{s-}(M_s - M_{s-})\big]. \qquad (6.3.9)$$
Since the jumps of $N_s$ are of size 1, we have $(\Delta N_s)^2 = \Delta N_s$. Since the difference of the processes $M_s$ and $N_s$ is continuous, $\Delta M_s = \Delta N_s$. Using these formulas we have
$$M_s^2 - M_{s-}^2 - 2M_{s-}(M_s - M_{s-}) = (M_s - M_{s-})\big(M_s + M_{s-} - 2M_{s-}\big) = (M_s - M_{s-})^2 = (\Delta M_s)^2 = (\Delta N_s)^2 = \Delta N_s = N_s - N_{s-}.$$
Since the sum of the jumps between $a$ and $t$ is $\sum_{a<s\le t}\Delta N_s = N_t - N_a$, formula (6.3.9) becomes
$$M_t^2 = M_a^2 + 2\int_a^t M_{s-}\,dM_s + N_t - N_a. \qquad (6.3.10)$$
The differential form is $d(M_t^2) = 2M_{t-}\,dM_t + dN_t$, which is equivalent to $d(M_t^2) = (1 + 2M_{t-})\,dM_t + \lambda\,dt$, since $dN_t = dM_t + \lambda\,dt$.

Exercise 6.3.8 Show that
$$\int_0^T M_{t-}\,dM_t = \frac12\big(M_T^2 - N_T\big).$$
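Formula (6.3.10) is a pathwise identity, so it can be verified exactly on a simulated Poisson path. The sketch below (an added illustration, NumPy assumed; $\lambda$ and $T$ are arbitrary) computes $\int_0^T M_{s-}\,dM_s$ as a jump part minus $\lambda$ times a Lebesgue part, using that $M_s$ decreases linearly with slope $-\lambda$ between jumps.

```python
import numpy as np

# Pathwise check of (6.3.10) with a = 0:  M_T^2 = 2 * int_0^T M_{s-} dM_s + N_T,
# where  int M_{s-} dM_s = sum_k M_{S_k -}  -  lambda * int_0^T M_s ds.
rng = np.random.default_rng(5)
lam, T = 3.0, 2.0
N = rng.poisson(lam * T)
S = np.sort(rng.uniform(0.0, T, size=N))      # jump times of N_t

times = np.concatenate(([0.0], S, [T]))
jump_int, lebesgue_int, M = 0.0, 0.0, 0.0     # M = value of M at segment start
for k in range(len(times) - 1):
    seg = times[k + 1] - times[k]
    lebesgue_int += M * seg - lam * seg**2 / 2  # int of M_s over the segment
    M_left = M - lam * seg                      # M_{s-} just before the endpoint
    if k + 1 <= N:                              # endpoint is the jump time S_{k+1}
        jump_int += M_left
        M = M_left + 1.0                        # jump of size 1
    else:                                       # endpoint is T, no jump
        M = M_left

stoch_int = jump_int - lam * lebesgue_int     # int_0^T M_{s-} dM_s
lhs = (N - lam * T) ** 2                      # M_T^2
rhs = 2.0 * stoch_int + N                     # 2 int M_{s-} dM_s + N_T
print(lhs, rhs)
```

Since no discretization is involved, the identity holds to floating-point precision.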
Exercise 6.3.9 Use Ito's formula for the Poisson process to find the conditional expectation $E[M_t^2\,|\,\mathcal{F}_s]$ for $s < t$.

6.3.3 Ito's multidimensional formula

If the process $F_t$ depends on several Ito diffusions, say $F_t = f(t, X_t, Y_t)$, then a formula similar to (6.3.7) leads to
$$dF_t = \frac{\partial f}{\partial t}\,dt + \frac{\partial f}{\partial x}\,dX_t + \frac{\partial f}{\partial y}\,dY_t + \frac12\frac{\partial^2 f}{\partial x^2}(dX_t)^2 + \frac12\frac{\partial^2 f}{\partial y^2}(dY_t)^2 + \frac{\partial^2 f}{\partial x\,\partial y}\,dX_t\,dY_t,$$
with all partial derivatives evaluated at $(t, X_t, Y_t)$.

Particular cases

In the case when $F_t = f(X_t, Y_t)$, with $X_t = W_t^1$, $Y_t = W_t^2$ independent Brownian motions, using $(dW_t^1)^2 = (dW_t^2)^2 = dt$ and $dW_t^1\,dW_t^2 = 0$, we have
$$dF_t = \frac{\partial f}{\partial x}\,dW_t^1 + \frac{\partial f}{\partial y}\,dW_t^2 + \frac12\Big(\frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2}\Big)dt.$$
The expression
$$\Delta f = \frac12\Big(\frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2}\Big)$$
is called the Laplacian of $f$. We can rewrite the previous formula as
$$dF_t = \frac{\partial f}{\partial x}\,dW_t^1 + \frac{\partial f}{\partial y}\,dW_t^2 + \Delta f\,dt.$$
A function $f$ with $\Delta f = 0$ is called harmonic. The aforementioned formula in the case of harmonic functions takes the simple form
$$dF_t = \frac{\partial f}{\partial x}\,dW_t^1 + \frac{\partial f}{\partial y}\,dW_t^2. \qquad (6.3.11)$$

Exercise 6.3.10 Let $W_t^1$, $W_t^2$ be two independent Brownian motions. If the function $f$ is harmonic, show that $F_t = f(W_t^1, W_t^2)$ is a martingale. Is the converse true?

Exercise 6.3.11 Use the previous formulas to find $dF_t$ in the following cases:
(a) $F_t = (W_t^1)^2 + (W_t^2)^2$  (b) $F_t = \ln\big[(W_t^1)^2 + (W_t^2)^2\big]$.

Exercise 6.3.12 Consider the Bessel process $R_t = \sqrt{(W_t^1)^2 + (W_t^2)^2}$, where $W_t^1$ and $W_t^2$ are two independent Brownian motions. Prove that
$$dR_t = \frac{1}{2R_t}\,dt + \frac{W_t^1}{R_t}\,dW_t^1 + \frac{W_t^2}{R_t}\,dW_t^2.$$

Example 6.3.1 (The product rule) Let $X_t$ and $Y_t$ be two Ito diffusions. Show that
$$d(X_tY_t) = Y_t\,dX_t + X_t\,dY_t + dX_t\,dY_t.$$
Consider the function $f(x,y) = xy$. Since $\partial_x f = y$, $\partial_y f = x$, $\partial_x^2 f = \partial_y^2 f = 0$ and $\partial_x\partial_y f = 1$, Ito's multidimensional formula yields
$$d(X_tY_t) = d\big(f(X_t,Y_t)\big) = \partial_x f\,dX_t + \partial_y f\,dY_t + \frac12\partial_x^2 f\,(dX_t)^2 + \frac12\partial_y^2 f\,(dY_t)^2 + \partial_x\partial_y f\,dX_t\,dY_t = Y_t\,dX_t + X_t\,dY_t + dX_t\,dY_t.$$
Example 6.3.2 (The quotient rule) Let $X_t$ and $Y_t$ be two Ito diffusions. Show that
$$d\Big(\frac{X_t}{Y_t}\Big) = \frac{Y_t\,dX_t - X_t\,dY_t - dX_t\,dY_t}{Y_t^2} + \frac{X_t}{Y_t^3}(dY_t)^2.$$
Consider the function $f(x,y) = \frac{x}{y}$. Since $\partial_x f = \frac{1}{y}$, $\partial_y f = -\frac{x}{y^2}$, $\partial_x^2 f = 0$, $\partial_y^2 f = \frac{2x}{y^3}$ and $\partial_x\partial_y f = -\frac{1}{y^2}$, applying Ito's multidimensional formula yields
$$d\Big(\frac{X_t}{Y_t}\Big) = d\big(f(X_t,Y_t)\big) = \partial_x f\,dX_t + \partial_y f\,dY_t + \frac12\partial_x^2 f\,(dX_t)^2 + \frac12\partial_y^2 f\,(dY_t)^2 + \partial_x\partial_y f\,dX_t\,dY_t$$
$$= \frac{1}{Y_t}\,dX_t - \frac{X_t}{Y_t^2}\,dY_t + \frac{X_t}{Y_t^3}(dY_t)^2 - \frac{1}{Y_t^2}\,dX_t\,dY_t = \frac{Y_t\,dX_t - X_t\,dY_t - dX_t\,dY_t}{Y_t^2} + \frac{X_t}{Y_t^3}(dY_t)^2.$$

Chapter 7

Stochastic Integration Techniques

Computing a stochastic integral starting from the definition of the Ito integral is quite inefficient. As in elementary Calculus, several methods can be developed to compute stochastic integrals. We have tried to keep the analogy with elementary Calculus as much as possible. Integration by substitution is more complicated in the stochastic environment, and we have considered only a particular case of it, which we call the method of the heat equation.

7.1 Notational Conventions

The intent of this section is to discuss some equivalent integral notations for a stochastic differential equation. Consider a process $X_t$ whose increments satisfy the stochastic differential equation $dX_t = f(t,W_t)\,dW_t$. This can be written equivalently in the integral form
$$\int_a^t dX_s = \int_a^t f(s,W_s)\,dW_s. \qquad (7.1.1)$$
If we consider the partition $a = t_0 < t_1 < \cdots < t_{n-1} < t_n = t$, then the left side becomes
$$\int_a^t dX_s = \text{ms-lim}_{n\to\infty}\sum_{j=0}^{n-1}\big(X_{t_{j+1}} - X_{t_j}\big) = X_t - X_a,$$
after cancelling the terms in pairs. Substituting into formula (7.1.1) yields the equivalent form
$$X_t = X_a + \int_a^t f(s,W_s)\,dW_s.$$
Since $X_a$ is a constant, this can also be written as $dX_t = d\big(\int_a^t f(s,W_s)\,dW_s\big)$. Using $dX_t = f(t,W_t)\,dW_t$, the previous formula can be written in the following two equivalent ways:

(i) For any $a < t$, we have
$$d\Big(\int_a^t f(s,W_s)\,dW_s\Big) = f(t,W_t)\,dW_t.$$
(ii) If $Y_t$ is a stochastic process such that $Y_t\,dW_t = dF_t$, then
$$\int_a^b Y_t\,dW_t = F_b - F_a.$$
These formulas are equivalent ways of writing the stochastic differential equation (7.1.1) and are useful in computations. A few applications follow.

Example 7.1.1 Verify the stochastic formula
$$\int_0^t W_s\,dW_s = \frac{W_t^2}{2} - \frac{t}{2}.$$
Let $X_t = \int_0^t W_s\,dW_s$ and $Y_t = \frac{W_t^2}{2} - \frac{t}{2}$. From Ito's formula,
$$dY_t = d\Big(\frac{W_t^2}{2}\Big) - d\Big(\frac{t}{2}\Big) = \frac12\big(2W_t\,dW_t + dt\big) - \frac12\,dt = W_t\,dW_t,$$
and from formula (i) we get
$$dX_t = d\Big(\int_0^t W_s\,dW_s\Big) = W_t\,dW_t.$$
Hence $dX_t = dY_t$, or $d(X_t - Y_t) = 0$. Since the process $X_t - Y_t$ has zero increments, $X_t - Y_t = c$, a constant. Taking $t = 0$ yields
$$c = X_0 - Y_0 = \int_0^0 W_s\,dW_s - \Big(\frac{W_0^2}{2} - \frac{0}{2}\Big) = 0,$$
and hence $c = 0$. It follows that $X_t = Y_t$, which verifies the desired relation.

Example 7.1.2 Verify the formula
$$\int_0^t sW_s\,dW_s = \frac{t}{2}W_t^2 - \frac{t^2}{4} - \frac12\int_0^t W_s^2\,ds.$$
Consider the stochastic processes $X_t = \int_0^t sW_s\,dW_s$, $Y_t = \frac{t}{2}W_t^2 - \frac{t^2}{4}$, and $Z_t = \frac12\int_0^t W_s^2\,ds$. Formula (i) yields
$$dX_t = tW_t\,dW_t, \qquad dZ_t = \frac12 W_t^2\,dt.$$
Applying Ito's formula, we get
$$dY_t = d\Big(\frac{t}{2}W_t^2 - \frac{t^2}{4}\Big) = \frac12\,d(tW_t^2) - d\Big(\frac{t^2}{4}\Big) = \frac12\big[(t + W_t^2)\,dt + 2tW_t\,dW_t\big] - \frac{t}{2}\,dt = \frac12 W_t^2\,dt + tW_t\,dW_t.$$
We can easily see that $dX_t = dY_t - dZ_t$. This implies $d(X_t - Y_t + Z_t) = 0$, i.e. $X_t - Y_t + Z_t = c$, a constant. Since $X_0 = Y_0 = Z_0 = 0$, it follows that $c = 0$. This proves the desired relation.

Example 7.1.3 Show that
$$\int_0^t (W_s^2 - s)\,dW_s = \frac13 W_t^3 - tW_t.$$
Consider the function $f(t,x) = \frac13 x^3 - tx$, and let $F_t = f(t,W_t)$. Since $\partial_t f = -x$, $\partial_x f = x^2 - t$, and $\partial_x^2 f = 2x$, Ito's formula provides
$$dF_t = \partial_t f\,dt + \partial_x f\,dW_t + \frac12\partial_x^2 f\,(dW_t)^2 = -W_t\,dt + (W_t^2 - t)\,dW_t + W_t\,dt = (W_t^2 - t)\,dW_t.$$
From formula (i) we get
$$\int_0^t (W_s^2 - s)\,dW_s = \int_0^t dF_s = F_t - F_0 = \frac13 W_t^3 - tW_t.$$
Exercise 7.1.1 Show that
(a) $\displaystyle\int_0^t 2\ln W_s\,dW_s = 2W_t(\ln W_t - 1) - \int_0^t \frac{1}{W_s}\,ds$;
(b) $\displaystyle\int_0^t e^{\frac{s}{2}}\cos W_s\,dW_s = e^{\frac{t}{2}}\sin W_t$;
(c) $\displaystyle\int_0^t e^{\frac{s}{2}}\sin W_s\,dW_s = 1 - e^{\frac{t}{2}}\cos W_t$;
(d) $\displaystyle\int_0^t e^{W_s - \frac{s}{2}}\,dW_s = e^{W_t - \frac{t}{2}} - 1$;
(e) $\displaystyle\int_0^t \cos W_s\,dW_s = \sin W_t + \frac12\int_0^t \sin W_s\,ds$;
(f) $\displaystyle\int_0^t \sin W_s\,dW_s = 1 - \cos W_t - \frac12\int_0^t \cos W_s\,ds$.

7.2 Stochastic Integration by Parts

Consider the process $F_t = f(t)g(W_t)$, with $f$ and $g$ differentiable ($g$ twice). Using the product rule yields
$$dF_t = df(t)\,g(W_t) + f(t)\,dg(W_t) = f'(t)g(W_t)\,dt + f(t)\Big[g'(W_t)\,dW_t + \frac12 g''(W_t)\,dt\Big] = f'(t)g(W_t)\,dt + \frac12 f(t)g''(W_t)\,dt + f(t)g'(W_t)\,dW_t.$$
Writing the relation in integral form, we obtain the first integration by parts formula:
$$\int_a^b f(t)g'(W_t)\,dW_t = f(t)g(W_t)\Big|_a^b - \int_a^b f'(t)g(W_t)\,dt - \frac12\int_a^b f(t)g''(W_t)\,dt.$$
This formula is to be used when integrating a product between a function of $t$ and a function of the Brownian motion $W_t$, for which an antiderivative is known. The following two particular cases are important and useful in applications.

1. If $g(W_t) = W_t$, the aforementioned formula takes the simple form
$$\int_a^b f(t)\,dW_t = f(t)W_t\Big|_{t=a}^{t=b} - \int_a^b f'(t)W_t\,dt. \qquad (7.2.2)$$
It is worth noting that the left side is a Wiener integral.

2. If $f(t) = 1$, the formula becomes
$$\int_a^b g'(W_t)\,dW_t = g(W_t)\Big|_{t=a}^{t=b} - \frac12\int_a^b g''(W_t)\,dt. \qquad (7.2.3)$$

Application 1. Consider the Wiener integral $I_T = \int_0^T t\,dW_t$. From the general theory (see Proposition 5.6.1), it is known that $I_T$ is a random variable normally distributed with mean 0 and variance
$$Var[I_T] = \int_0^T t^2\,dt = \frac{T^3}{3}.$$
Recall the definition of the integrated Brownian motion, $Z_t = \int_0^t W_u\,du$. Formula (7.2.2) yields a relationship between $I_T$ and the integrated Brownian motion:
$$I_T = \int_0^T t\,dW_t = TW_T - \int_0^T W_t\,dt = TW_T - Z_T,$$
and hence $I_T + Z_T = TW_T$. This relation can be used to compute the covariance between $I_T$ and $Z_T$.
$$Cov(I_T + Z_T,\, I_T + Z_T) = Var[TW_T] \iff Var[I_T] + Var[Z_T] + 2\,Cov(I_T, Z_T) = T^2\,Var[W_T]$$
$$\iff \frac{T^3}{3} + \frac{T^3}{3} + 2\,Cov(I_T, Z_T) = T^3 \iff Cov(I_T, Z_T) = \frac{T^3}{6},$$
where we used $Var[Z_T] = T^3/3$. The processes $I_t$ and $Z_t$ are not independent. Their correlation coefficient is 0.5, as the following calculation shows:
$$Corr(I_T, Z_T) = \frac{Cov(I_T,Z_T)}{\big(Var[I_T]\,Var[Z_T]\big)^{1/2}} = \frac{T^3/6}{T^3/3} = \frac12.$$

Application 2. If we let $g(x) = \frac{x^2}{2}$ in formula (7.2.3), we get
$$\int_a^b W_t\,dW_t = \frac{W_b^2 - W_a^2}{2} - \frac12(b - a).$$
It is worth noting that letting $a = 0$ and $b = T$, we retrieve a formula that was proved by direct methods in Chapter 3:
$$\int_0^T W_t\,dW_t = \frac{W_T^2}{2} - \frac{T}{2}.$$
Similarly, letting $g(x) = \frac{x^3}{3}$ in (7.2.3) yields
$$\int_a^b W_t^2\,dW_t = \frac{W_t^3}{3}\Big|_a^b - \int_a^b W_t\,dt.$$

Application 3. Choosing $f(t) = e^{\alpha t}$ and $g(x) = \sin x$, we shall compute the stochastic integral $\int_0^T e^{\alpha t}\cos W_t\,dW_t$ using the formula of integration by parts:
$$\int_0^T e^{\alpha t}\cos W_t\,dW_t = \int_0^T e^{\alpha t}(\sin W_t)'\,dW_t = e^{\alpha t}\sin W_t\Big|_0^T - \int_0^T (e^{\alpha t})'\sin W_t\,dt - \frac12\int_0^T e^{\alpha t}(\sin W_t)''\,dt$$
$$= e^{\alpha T}\sin W_T - \alpha\int_0^T e^{\alpha t}\sin W_t\,dt + \frac12\int_0^T e^{\alpha t}\sin W_t\,dt = e^{\alpha T}\sin W_T - \Big(\alpha - \frac12\Big)\int_0^T e^{\alpha t}\sin W_t\,dt.$$
The particular case $\alpha = \frac12$ leads to the following exact formula for a stochastic integral:
$$\int_0^T e^{\frac{t}{2}}\cos W_t\,dW_t = e^{\frac{T}{2}}\sin W_T. \qquad (7.2.4)$$
In a similar way, we can obtain an exact formula for the stochastic integral $\int_0^T e^{\beta t}\sin W_t\,dW_t$ as follows:
$$\int_0^T e^{\beta t}\sin W_t\,dW_t = -\int_0^T e^{\beta t}(\cos W_t)'\,dW_t = -e^{\beta t}\cos W_t\Big|_0^T + \beta\int_0^T e^{\beta t}\cos W_t\,dt - \frac12\int_0^T e^{\beta t}\cos W_t\,dt.$$
Taking $\beta = \frac12$ yields the closed-form formula
$$\int_0^T e^{\frac{t}{2}}\sin W_t\,dW_t = 1 - e^{\frac{T}{2}}\cos W_T. \qquad (7.2.5)$$
A consequence of the last two formulas and of Euler's formula $e^{iW_t} = \cos W_t + i\sin W_t$ is
$$\int_0^T e^{\frac{t}{2} + iW_t}\,dW_t = i\big(1 - e^{\frac{T}{2} + iW_T}\big).$$
The proof details are left to the reader.

A general form of the integration by parts formula. In general, if $X_t$ and $Y_t$ are two Ito diffusions, the product formula gives
$$d(X_tY_t) = X_t\,dY_t + Y_t\,dX_t + dX_t\,dY_t.$$
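The identities $I_T + Z_T = TW_T$, $Var[I_T] = T^3/3$ and $Cov(I_T, Z_T) = T^3/6$ can all be checked by simulation. The sketch below (an added illustration, NumPy assumed; grid and sample sizes are arbitrary) approximates $I_T$ by a left-point Ito sum and $Z_T$ by a left Riemann sum, so the pathwise relation holds only up to a small $O(\Delta t)$ residual.

```python
import numpy as np

# Monte Carlo check for I_T = int_0^T t dW_t and Z_T = int_0^T W_t dt with T = 1:
# I_T + Z_T = T W_T (up to discretization), Var[I_T] = T^3/3, Cov(I_T, Z_T) = T^3/6.
rng = np.random.default_rng(6)
T, n_steps, n_paths = 1.0, 200, 20_000
dt = T / n_steps
t = np.arange(n_steps) * dt                          # left endpoints t_i

dW = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
W = np.cumsum(dW, axis=1)                            # W at right endpoints
I = dW @ t                                           # sum t_i * dW_i
Z = np.hstack([np.zeros((n_paths, 1)), W[:, :-1]]).sum(axis=1) * dt  # left sum of W dt

resid = np.max(np.abs(I + Z - T * W[:, -1]))         # discretization residual
var_I = I.var()
cov_IZ = np.mean(I * Z) - I.mean() * Z.mean()
print(var_I, cov_IZ, resid)
```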
Integrating between the limits $a$ and $b$,
$$\int_a^b d(X_tY_t) = \int_a^b X_t\,dY_t + \int_a^b Y_t\,dX_t + \int_a^b dX_t\,dY_t.$$
From the Fundamental Theorem, $\int_a^b d(X_tY_t) = X_bY_b - X_aY_a$, so the previous formula takes the following form of integration by parts:
$$\int_a^b X_t\,dY_t = X_bY_b - X_aY_a - \int_a^b Y_t\,dX_t - \int_a^b dX_t\,dY_t.$$
This formula is of theoretical value. In practice, the term $dX_t\,dY_t$ needs to be computed using the rules $(dW_t)^2 = dt$ and $dt\,dW_t = 0$.

Exercise 7.2.1 (a) Use integration by parts to get
$$\int_0^T \frac{1}{1+W_t^2}\,dW_t = \tan^{-1}(W_T) + \int_0^T \frac{W_t}{(1+W_t^2)^2}\,dt, \qquad T > 0.$$
(b) Show that
$$E[\tan^{-1}(W_T)] = -\int_0^T E\Big[\frac{W_t}{(1+W_t^2)^2}\Big]\,dt.$$
(c) Prove the double inequality
$$-\frac{3\sqrt3}{16} \le \frac{x}{(1+x^2)^2} \le \frac{3\sqrt3}{16}, \qquad \forall x\in\mathbb{R}.$$
(d) Use part (c) to obtain
$$-\frac{3\sqrt3}{16}T \le \int_0^T \frac{W_t}{(1+W_t^2)^2}\,dt \le \frac{3\sqrt3}{16}T.$$
(e) Use part (d) to get
$$-\frac{3\sqrt3}{16}T \le E[\tan^{-1}(W_T)] \le \frac{3\sqrt3}{16}T.$$
(f) Does part (e) contradict the inequality $-\frac{\pi}{2} < \tan^{-1}(W_T) < \frac{\pi}{2}$?

Exercise 7.2.2 (a) Show the relation
$$\int_0^T e^{W_t}\,dW_t = e^{W_T} - 1 - \frac12\int_0^T e^{W_t}\,dt.$$
(b) Use part (a) to find $E[e^{W_t}]$.

Exercise 7.2.3 (a) Use integration by parts to show
$$\int_0^T W_te^{W_t}\,dW_t = 1 + W_Te^{W_T} - e^{W_T} - \frac12\int_0^T e^{W_t}(1+W_t)\,dt;$$
(b) Use part (a) to find $E[W_te^{W_t}]$;
(c) Show that $Cov(W_t, e^{W_t}) = te^{t/2}$;
(d) Prove that $Corr(W_t, e^{W_t}) = \sqrt{\dfrac{t}{e^t-1}}$, and compute the limits as $t\to 0$ and $t\to\infty$.

Exercise 7.2.4 (a) Let $T > 0$. Show the following relation using integration by parts:
$$\int_0^T \frac{2W_t}{1+W_t^2}\,dW_t = \ln(1+W_T^2) - \int_0^T \frac{1-W_t^2}{(1+W_t^2)^2}\,dt.$$
(b) Show that for any real number $x$ the following double inequality holds:
$$-\frac18 \le \frac{1-x^2}{(1+x^2)^2} \le 1.$$
(c) Use part (b) to show that
$$-\frac{T}{8} \le \int_0^T \frac{1-W_t^2}{(1+W_t^2)^2}\,dt \le T.$$
(d) Use parts (a) and (c) to get
$$-\frac{T}{8} \le E[\ln(1+W_T^2)] \le T.$$
(e) Use Jensen's inequality to get $E[\ln(1+W_T^2)] \le \ln(1+T)$. Does this contradict the upper bound provided in (d)?

Exercise 7.2.5 Use integration by parts to show
$$\int_0^t \arctan W_s\,dW_s = W_t\arctan W_t - \frac12\ln(1+W_t^2) - \frac12\int_0^t \frac{1}{1+W_s^2}\,ds.$$
Exercise 7.2.6 (a) Using integration by parts, prove the identity
$$\int_0^t W_se^{W_s}\,dW_s = 1 + e^{W_t}(W_t - 1) - \frac12\int_0^t (1+W_s)e^{W_s}\,ds;$$
(b) Use part (a) to compute $E[W_te^{W_t}]$.

Exercise 7.2.7 Check the following formulas using integration by parts:
(a) $\displaystyle\int_0^t W_s\sin W_s\,dW_s = \sin W_t - W_t\cos W_t - \frac12\int_0^t (\sin W_s + W_s\cos W_s)\,ds;$
(b) $\displaystyle\int_0^t \frac{W_s}{\sqrt{1+W_s^2}}\,dW_s = \sqrt{1+W_t^2} - 1 - \frac12\int_0^t (1+W_s^2)^{-3/2}\,ds.$

7.3 The Heat Equation Method

In elementary Calculus, integration by substitution is the inverse application of the chain rule. In the stochastic environment, this will be the inverse application of Ito's formula. This is difficult to apply in general, but there is a particular case of great importance.

Let $\varphi(t,x)$ be a solution of the equation
$$\partial_t\varphi + \frac12\partial_x^2\varphi = 0. \qquad (7.3.6)$$
This is called the heat equation without sources. The non-homogeneous equation
$$\partial_t\varphi + \frac12\partial_x^2\varphi = G(t,x) \qquad (7.3.7)$$
is called the heat equation with sources. The function $G(t,x)$ represents the density of heat sources, while the function $\varphi(t,x)$ is the temperature at the point $x$ at time $t$ in a one-dimensional wire. If the heat source is time independent, then $G = G(x)$, i.e. $G$ is a function of $x$ only.

Example 7.3.1 Find all solutions of the equation (7.3.6) of the type $\varphi(t,x) = a(t) + b(x)$.
Substituting into equation (7.3.6) yields
$$\frac12 b''(x) = -a'(t).$$
Since the left side is a function of $x$ only, while the right side is a function of $t$ only, the only case in which the previous equation is satisfied is when both sides are equal to the same constant $C$, called a separation constant. Therefore $a(t)$ and $b(x)$ satisfy the equations
$$a'(t) = -C, \qquad \frac12 b''(x) = C.$$
Integrating yields $a(t) = -Ct + C_0$ and $b(x) = Cx^2 + C_1x + C_2$. It follows that
$$\varphi(t,x) = C(x^2 - t) + C_1x + C_3,$$
with $C_0, C_1, C_2, C_3$ arbitrary constants ($C_3 = C_0 + C_2$).

Example 7.3.2 Find all solutions of the equation (7.3.6) of the type $\varphi(t,x) = a(t)b(x)$.
Substituting into the equation and dividing by $a(t)b(x)$ yields
$$\frac{a'(t)}{a(t)} + \frac12\,\frac{b''(x)}{b(x)} = 0.$$
There is a separation constant $C$ such that $\dfrac{a'(t)}{a(t)} = -C$ and $\dfrac{b''(x)}{b(x)} = 2C$. There are three distinct cases to discuss:

1. $C = 0$. In this case $a(t) = a_0$ and $b(x) = b_1x + b_0$, with $a_0, b_0, b_1$ real constants. Then
$$\varphi(t,x) = a(t)b(x) = c_1x + c_0, \qquad c_0, c_1\in\mathbb{R},$$
is just a linear function in $x$.

2. $C > 0$. Let $\lambda > 0$ be such that $2C = \lambda^2$. Then $a'(t) = -\frac{\lambda^2}{2}a(t)$ and $b''(x) = \lambda^2 b(x)$, with solutions
$$a(t) = a_0e^{-\lambda^2 t/2}, \qquad b(x) = c_1e^{\lambda x} + c_2e^{-\lambda x}.$$
The corresponding solution of (7.3.6) is
$$\varphi(t,x) = e^{-\lambda^2 t/2}\big(c_1e^{\lambda x} + c_2e^{-\lambda x}\big), \qquad c_1, c_2\in\mathbb{R}.$$

3. $C < 0$. Let $\lambda > 0$ be such that $2C = -\lambda^2$. Then $a'(t) = \frac{\lambda^2}{2}a(t)$ and $b''(x) = -\lambda^2 b(x)$. Solving yields
$$a(t) = a_0e^{\lambda^2 t/2}, \qquad b(x) = c_1\sin(\lambda x) + c_2\cos(\lambda x).$$
The corresponding solution of (7.3.6) in this case is
$$\varphi(t,x) = e^{\lambda^2 t/2}\big(c_1\sin(\lambda x) + c_2\cos(\lambda x)\big), \qquad c_1, c_2\in\mathbb{R}.$$

In particular, the functions $x$, $x^2 - t$, $e^{x - t/2}$, $e^{-x - t/2}$, $e^{t/2}\sin x$ and $e^{t/2}\cos x$, or any linear combination of them, are solutions of the heat equation (7.3.6). However, there are other solutions which are not of the previous type.

Exercise 7.3.1 Prove that $\varphi(t,x) = \frac13 x^3 - tx$ is a solution of the heat equation (7.3.6).

Exercise 7.3.2 Show that $\varphi(t,x) = t^{-1/2}e^{-x^2/(2t)}$ is a solution of the heat equation (7.3.6) for $t > 0$.

Exercise 7.3.3 Let $\varphi = u(\lambda)$, with $\lambda = \dfrac{x}{2\sqrt{t}}$, $t > 0$. Show that $\varphi$ satisfies the heat equation (7.3.6) if and only if $u'' + 2\lambda u' = 0$.

Exercise 7.3.4 Let $erfc(x) = \dfrac{2}{\sqrt{\pi}}\displaystyle\int_x^\infty e^{-r^2}\,dr$. Show that $\varphi = erfc\big(x/(2\sqrt{t})\big)$ is a solution of the equation (7.3.6).

Exercise 7.3.5 (the fundamental solution) Show that $\varphi(t,x) = \dfrac{1}{\sqrt{4\pi t}}e^{-\frac{x^2}{4t}}$, $t > 0$, satisfies the equation (7.3.6).

Sometimes it is useful to generate new solutions of the heat equation from known ones. Below we present a few ways to accomplish this:

(i) by linear combination: if $\varphi_1$ and $\varphi_2$ are solutions, then $a_1\varphi_1 + a_2\varphi_2$ is a solution, where $a_1, a_2$ are constants;
(ii) by translation: if $\varphi(t,x)$ is a solution, then $\varphi(t-\tau, x-\xi)$ is a solution, where $(\tau,\xi)$ is a translation vector;

(iii) by dilation: if $\varphi(t,x)$ is a solution, then $\varphi(\lambda^2 t, \lambda x)$ is a solution, for any constant $\lambda$;

(iv) by differentiation: if $\varphi(t,x)$ is a solution, then $\dfrac{\partial^{n+m}}{\partial x^n\,\partial t^m}\varphi(t,x)$ is a solution;

(v) by convolution: if $\varphi(t,x)$ is a solution, then so are
$$\int_a^b \varphi(t, x-\xi)f(\xi)\,d\xi \qquad \text{and} \qquad \int_a^b \varphi(t-\tau, x)g(\tau)\,d\tau.$$

For more detail on the subject the reader can consult Widder [?] and Cannon [?].

Theorem 7.3.6 Let $\varphi(t,x)$ be a solution of the heat equation (7.3.6) and denote $f(t,x) = \partial_x\varphi(t,x)$. Then
$$\int_a^b f(t,W_t)\,dW_t = \varphi(b,W_b) - \varphi(a,W_a).$$
Proof: Let $F_t = \varphi(t,W_t)$. Applying Ito's formula we get
$$dF_t = \partial_x\varphi(t,W_t)\,dW_t + \Big(\partial_t\varphi + \frac12\partial_x^2\varphi\Big)\,dt.$$
Since $\partial_t\varphi + \frac12\partial_x^2\varphi = 0$ and $\partial_x\varphi(t,W_t) = f(t,W_t)$, we have $dF_t = f(t,W_t)\,dW_t$. Applying the Fundamental Theorem yields
$$\int_a^b f(t,W_t)\,dW_t = \int_a^b dF_t = F_b - F_a = \varphi(b,W_b) - \varphi(a,W_a).$$

Application 7.3.7 Show that
$$\int_0^T W_t\,dW_t = \frac12 W_T^2 - \frac12 T.$$
Choose the solution of the heat equation (7.3.6) given by $\varphi(t,x) = x^2 - t$. Then $f(t,x) = \partial_x\varphi(t,x) = 2x$. Theorem 7.3.6 yields
$$\int_0^T 2W_t\,dW_t = \int_0^T f(t,W_t)\,dW_t = \varphi(T,W_T) - \varphi(0,W_0) = W_T^2 - T.$$
Dividing by 2 leads to the desired result.

Application 7.3.8 Show that
$$\int_0^T (W_t^2 - t)\,dW_t = \frac13 W_T^3 - TW_T.$$
Consider the function $\varphi(t,x) = \frac13 x^3 - tx$, which is a solution of the heat equation (7.3.6); see Exercise 7.3.1. Then $f(t,x) = \partial_x\varphi(t,x) = x^2 - t$. Applying Theorem 7.3.6 yields
$$\int_0^T (W_t^2 - t)\,dW_t = \int_0^T f(t,W_t)\,dW_t = \varphi(T,W_T) - \varphi(0,W_0) = \frac13 W_T^3 - TW_T.$$

Application 7.3.9 Let $\lambda > 0$. Prove the identities
$$\int_0^T e^{-\frac{\lambda^2 t}{2} \pm \lambda W_t}\,dW_t = \frac{1}{\pm\lambda}\Big(e^{-\frac{\lambda^2 T}{2} \pm \lambda W_T} - 1\Big).$$
Consider the function $\varphi(t,x) = e^{-\frac{\lambda^2 t}{2} \pm \lambda x}$, which is a solution of the homogeneous heat equation (7.3.6); see Example 7.3.2. Then $f(t,x) = \partial_x\varphi(t,x) = \pm\lambda e^{-\frac{\lambda^2 t}{2} \pm \lambda x}$. Apply Theorem 7.3.6 to get
$$\int_0^T \pm\lambda e^{-\frac{\lambda^2 t}{2} \pm \lambda W_t}\,dW_t = \int_0^T f(t,W_t)\,dW_t = \varphi(T,W_T) - \varphi(0,W_0) = e^{-\frac{\lambda^2 T}{2} \pm \lambda W_T} - 1.$$
Dividing by the constant $\pm\lambda$ ends the proof. In particular, for $\lambda = 1$ the aforementioned formula becomes
$$\int_0^T e^{-\frac{t}{2} + W_t}\,dW_t = e^{-\frac{T}{2} + W_T} - 1. \qquad (7.3.8)$$

Application 7.3.10 Let $\lambda > 0$. Prove the identity
$$\int_0^T e^{\frac{\lambda^2 t}{2}}\cos(\lambda W_t)\,dW_t = \frac{1}{\lambda}e^{\frac{\lambda^2 T}{2}}\sin(\lambda W_T).$$
From Example 7.3.2 we know that $\varphi(t,x) = e^{\frac{\lambda^2 t}{2}}\sin(\lambda x)$ is a solution of the heat equation. Applying Theorem 7.3.6 to the function $f(t,x) = \partial_x\varphi(t,x) = \lambda e^{\frac{\lambda^2 t}{2}}\cos(\lambda x)$ yields
$$\int_0^T \lambda e^{\frac{\lambda^2 t}{2}}\cos(\lambda W_t)\,dW_t = \int_0^T f(t,W_t)\,dW_t = \varphi(T,W_T) - \varphi(0,W_0) = e^{\frac{\lambda^2 T}{2}}\sin(\lambda W_T).$$
Divide by $\lambda$ to end the proof. If we choose $\lambda = 1$ we recover a result already familiar to the reader from Section 7.2:
$$\int_0^T e^{\frac{t}{2}}\cos W_t\,dW_t = e^{\frac{T}{2}}\sin W_T. \qquad (7.3.9)$$

Application 7.3.11 Let $\lambda > 0$. Show that
$$\int_0^T e^{\frac{\lambda^2 t}{2}}\sin(\lambda W_t)\,dW_t = \frac{1}{\lambda}\Big(1 - e^{\frac{\lambda^2 T}{2}}\cos(\lambda W_T)\Big).$$
Choose $\varphi(t,x) = e^{\frac{\lambda^2 t}{2}}\cos(\lambda x)$, a solution of the heat equation. Apply Theorem 7.3.6 to the function $f(t,x) = \partial_x\varphi(t,x) = -\lambda e^{\frac{\lambda^2 t}{2}}\sin(\lambda x)$ to get
$$\int_0^T (-\lambda)e^{\frac{\lambda^2 t}{2}}\sin(\lambda W_t)\,dW_t = \varphi(T,W_T) - \varphi(0,W_0) = e^{\frac{\lambda^2 T}{2}}\cos(\lambda W_T) - 1,$$
and then divide by $-\lambda$.

Application 7.3.12 Let $0 < a < b$. Show that
$$\int_a^b t^{-\frac32}W_te^{-\frac{W_t^2}{2t}}\,dW_t = a^{-\frac12}e^{-\frac{W_a^2}{2a}} - b^{-\frac12}e^{-\frac{W_b^2}{2b}}. \qquad (7.3.10)$$
From Exercise 7.3.2 we have that $\varphi(t,x) = t^{-1/2}e^{-x^2/(2t)}$ is a solution of the homogeneous heat equation. Since $f(t,x) = \partial_x\varphi(t,x) = -t^{-3/2}xe^{-x^2/(2t)}$, applying Theorem 7.3.6 leads to the desired result. The reader can easily fill in the details.

Integration techniques will be used when solving stochastic differential equations in the next chapter.

Exercise 7.3.13 Find the value of the following stochastic integrals:
(a) $\displaystyle\int_0^1 e^t\cos(\sqrt2\,W_t)\,dW_t$
(b) $\displaystyle\int_0^3 e^{2t}\cos(2W_t)\,dW_t$
(c) $\displaystyle\int_0^4 e^{-t+\sqrt2\,W_t}\,dW_t$.

Exercise 7.3.14 Let $\varphi(t,x)$ be a solution of the following non-homogeneous heat equation with time-dependent and uniform heat source $G(t)$:
$$\partial_t\varphi + \frac12\partial_x^2\varphi = G(t).$$
Denote $f(t,x) = \partial_x\varphi(t,x)$.
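The closed-form identities produced by the heat equation method are pathwise statements, so they can be checked along a single simulated path. The sketch below (an added illustration, NumPy assumed; step count and horizon are arbitrary) verifies formula (7.3.9) by comparing the left-point Ito sum with the closed form.

```python
import numpy as np

# Pathwise check of the heat-equation-method identity (7.3.9):
#   int_0^T e^{t/2} cos(W_t) dW_t = e^{T/2} sin(W_T).
rng = np.random.default_rng(7)
T, n = 1.0, 400_000
dt = T / n
t = np.arange(n) * dt
dW = rng.normal(0.0, np.sqrt(dt), size=n)
W = np.concatenate(([0.0], np.cumsum(dW)))

ito_sum = np.sum(np.exp(t / 2) * np.cos(W[:-1]) * dW)   # left-point Ito sum
closed_form = np.exp(T / 2) * np.sin(W[-1])
print(ito_sum, closed_form)
```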
Show that
$$\int_a^b f(t,W_t)\,dW_t = \varphi(b,W_b) - \varphi(a,W_a) - \int_a^b G(t)\,dt.$$
How does the formula change if the heat source $G$ is constant?

7.4 Table of Usual Stochastic Integrals

We now present a user-friendly table which lists identities that are among the most important of our standard techniques. This table is far too long to be memorized in full; however, the first few identities are the most memorable and should be remembered. Let $a < b$ and $0 < T$. Then we have:

1. $\displaystyle\int_a^b dW_t = W_b - W_a$;
2. $\displaystyle\int_0^T W_t\,dW_t = \frac{W_T^2}{2} - \frac{T}{2}$;
3. $\displaystyle\int_0^T (W_t^2 - t)\,dW_t = \frac{W_T^3}{3} - TW_T$;
4. $\displaystyle\int_0^T t\,dW_t = TW_T - \int_0^T W_t\,dt$, $0 < T$;
5. $\displaystyle\int_0^T W_t^2\,dW_t = \frac{W_T^3}{3} - \int_0^T W_t\,dt$;
6. $\displaystyle\int_0^T e^{\frac{t}{2}}\cos W_t\,dW_t = e^{\frac{T}{2}}\sin W_T$;
7. $\displaystyle\int_0^T e^{\frac{t}{2}}\sin W_t\,dW_t = 1 - e^{\frac{T}{2}}\cos W_T$;
8. $\displaystyle\int_0^T e^{-\frac{t}{2}+W_t}\,dW_t = e^{-\frac{T}{2}+W_T} - 1$;
9. $\displaystyle\int_0^T e^{\frac{\lambda^2 t}{2}}\cos(\lambda W_t)\,dW_t = \frac{1}{\lambda}e^{\frac{\lambda^2 T}{2}}\sin(\lambda W_T)$;
10. $\displaystyle\int_0^T e^{\frac{\lambda^2 t}{2}}\sin(\lambda W_t)\,dW_t = \frac{1}{\lambda}\Big(1 - e^{\frac{\lambda^2 T}{2}}\cos(\lambda W_T)\Big)$;
11. $\displaystyle\int_0^T e^{-\frac{\lambda^2 t}{2}\pm\lambda W_t}\,dW_t = \frac{1}{\pm\lambda}\Big(e^{-\frac{\lambda^2 T}{2}\pm\lambda W_T} - 1\Big)$;
12. $\displaystyle\int_a^b t^{-\frac32}W_te^{-\frac{W_t^2}{2t}}\,dW_t = a^{-\frac12}e^{-\frac{W_a^2}{2a}} - b^{-\frac12}e^{-\frac{W_b^2}{2b}}$;
13. $\displaystyle\int_0^T \cos W_t\,dW_t = \sin W_T + \frac12\int_0^T \sin W_t\,dt$;
14. $\displaystyle\int_0^T \sin W_t\,dW_t = 1 - \cos W_T - \frac12\int_0^T \cos W_t\,dt$;
15. $\displaystyle d\Big(\int_a^t f(s,W_s)\,dW_s\Big) = f(t,W_t)\,dW_t$;
16. $\displaystyle\int_a^b Y_t\,dW_t = F_b - F_a$, if $Y_t\,dW_t = dF_t$;
17. $\displaystyle\int_a^b f(t)\,dW_t = f(t)W_t\big|_a^b - \int_a^b f'(t)W_t\,dt$;
18. $\displaystyle\int_a^b g'(W_t)\,dW_t = g(W_t)\big|_a^b - \frac12\int_a^b g''(W_t)\,dt$.

Chapter 8

Stochastic Differential Equations

8.1 Definitions and Examples

Let $X_t$ be a continuous stochastic process. If small changes in the process $X_t$ can be written as a linear combination of small changes in $t$ and small increments of the Brownian motion $W_t$, we may write
$$dX_t = a(t, W_t, X_t)\,dt + b(t, W_t, X_t)\,dW_t \qquad (8.1.1)$$
and call it a stochastic differential equation. In fact, this differential relation has the following integral meaning:
$$X_t = X_0 + \int_0^t a(s, W_s, X_s)\,ds + \int_0^t b(s, W_s, X_s)\,dW_s, \qquad (8.1.2)$$
where the last integral is taken in the Ito sense.
Relation (8.1.2) is taken as the definition of the stochastic differential equation (8.1.1). However, since it is convenient to use stochastic differentials informally, we shall approach stochastic differential equations by analogy with ordinary differential equations, and try to present the same methods of solving equations in the new stochastic environment. The functions $a(t, W_t, X_t)$ and $b(t, W_t, X_t)$ are called the drift rate and the volatility, respectively. A process $X_t$ is called a (strong) solution of the stochastic equation (8.1.1) if it satisfies the equation. We shall start with an example.

Example 8.1.1 (The Brownian bridge) Let a, b ∈ R. Show that the process
$$X_t = a(1-t) + bt + (1-t)\int_0^t \frac{1}{1-s}\, dW_s, \qquad 0 \le t < 1,$$
is a solution of the stochastic differential equation
$$dX_t = \frac{b - X_t}{1-t}\, dt + dW_t, \qquad 0 \le t < 1,\quad X_0 = a.$$

We shall perform a routine verification to show that $X_t$ is a solution. First we compute the quotient $\frac{b - X_t}{1-t}$:
$$b - X_t = b - a(1-t) - bt - (1-t)\int_0^t \frac{1}{1-s}\, dW_s = (b-a)(1-t) - (1-t)\int_0^t \frac{1}{1-s}\, dW_s,$$
and dividing by $1-t$ yields
$$\frac{b - X_t}{1-t} = b - a - \int_0^t \frac{1}{1-s}\, dW_s. \qquad (8.1.3)$$
Using
$$d\Big(\int_0^t \frac{1}{1-s}\, dW_s\Big) = \frac{1}{1-t}\, dW_t,$$
the product rule yields
$$dX_t = a\, d(1-t) + b\,dt + d(1-t)\int_0^t \frac{1}{1-s}\, dW_s + (1-t)\, d\Big(\int_0^t \frac{1}{1-s}\, dW_s\Big) = \Big(b - a - \int_0^t \frac{1}{1-s}\, dW_s\Big)\, dt + dW_t = \frac{b - X_t}{1-t}\, dt + dW_t,$$
where the last identity comes from (8.1.3). We have just verified that the process $X_t$ is a solution of the given stochastic equation. The question of how this solution was obtained in the first place is the subject of study for the next few sections.

8.2 Finding Mean and Variance from the Equation

For most practical purposes, the most important information one needs about a process is its mean and variance. In some particular cases these can be found directly from the stochastic equation, without solving the equation explicitly. We deal with this problem in the present section.
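Before turning to the analytic computations, note that the verification in Example 8.1.1 can also be replayed numerically: build $X_t$ from its formula with a discretized Wiener integral and check, step by step, that its increments match $\frac{b-X_t}{1-t}\,dt + dW_t$. A minimal sketch, where the endpoints, grid size and tolerance are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(1)
a, b = 0.5, 2.0          # endpoints of the bridge
n, T = 2000, 0.5         # stop well before t = 1, where 1/(1-t) blows up
dt = T / n
t = np.arange(n + 1) * dt
dW = rng.normal(0.0, np.sqrt(dt), size=n)

# Discretized Wiener integral I_t = int_0^t dW_s / (1 - s)  (left-point sums)
I = np.concatenate(([0.0], np.cumsum(dW / (1.0 - t[:-1]))))
X = a * (1.0 - t) + b * t + (1.0 - t) * I       # formula of Example 8.1.1

# Residual of the SDE: dX_t - [ (b - X_t)/(1 - t) dt + dW_t ], step by step
resid = np.diff(X) - ((b - X[:-1]) / (1.0 - t[:-1]) * dt + dW)
print(np.max(np.abs(resid)))  # tiny: the formula satisfies the equation
```

The residual is of order $dt^{3/2}$ per step, the discretization error of the product rule used in the verification.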
Taking the expectation in (8.1.2) and using that the Ito integral is a zero-mean random variable yields
$$E[X_t] = X_0 + \int_0^t E[a(s, W_s, X_s)]\, ds. \qquad (8.2.4)$$
Applying the Fundamental Theorem of Calculus we obtain
$$\frac{d}{dt} E[X_t] = E[a(t, W_t, X_t)].$$
We note that $X_t$ is not differentiable, but its expectation $E[X_t]$ is. This equation can be solved exactly in a few particular cases.

1. If $a(t, W_t, X_t) = a(t)$, then $\frac{d}{dt}E[X_t] = a(t)$, with the exact solution $E[X_t] = X_0 + \int_0^t a(s)\, ds$.

2. If $a(t, W_t, X_t) = \alpha(t) X_t + \beta(t)$, with $\alpha(t)$ and $\beta(t)$ continuous deterministic functions, then
$$\frac{d}{dt}E[X_t] = \alpha(t)E[X_t] + \beta(t),$$
which is a linear differential equation in $E[X_t]$. Its solution is given by
$$E[X_t] = e^{A(t)}\Big(X_0 + \int_0^t e^{-A(s)}\beta(s)\, ds\Big), \qquad (8.2.5)$$
where $A(t) = \int_0^t \alpha(s)\, ds$. It is worth noting that the expectation $E[X_t]$ does not depend on the volatility term $b(t, W_t, X_t)$.

Exercise 8.2.1 If $dX_t = (2X_t + e^{2t})\,dt + b(t, W_t, X_t)\,dW_t$, show that $E[X_t] = e^{2t}(X_0 + t)$.

Proposition 8.2.2 Let $X_t$ be a process satisfying the stochastic equation
$$dX_t = \alpha(t)X_t\, dt + b(t)\,dW_t.$$
Then the mean and variance of $X_t$ are given by
$$E[X_t] = e^{A(t)} X_0, \qquad Var[X_t] = e^{2A(t)} \int_0^t e^{-2A(s)} b^2(s)\, ds,$$
where $A(t) = \int_0^t \alpha(s)\, ds$.

Proof: The expression of $E[X_t]$ follows directly from formula (8.2.5) with β = 0. In order to compute the second moment we first compute, using Ito's formula,
$$(dX_t)^2 = b^2(t)\, dt;$$
$$d(X_t^2) = 2X_t\, dX_t + (dX_t)^2 = 2X_t\big(\alpha(t)X_t\, dt + b(t)\,dW_t\big) + b^2(t)\,dt = \big(2\alpha(t)X_t^2 + b^2(t)\big)\,dt + 2b(t)X_t\, dW_t.$$
If we let $Y_t = X_t^2$, the previous equation becomes
$$dY_t = \big(2\alpha(t)Y_t + b^2(t)\big)\,dt + 2b(t)\sqrt{Y_t}\, dW_t.$$
Applying formula (8.2.5), with α(t) replaced by 2α(t) and β(t) by $b^2(t)$, yields
$$E[Y_t] = e^{2A(t)}\Big(Y_0 + \int_0^t e^{-2A(s)} b^2(s)\, ds\Big),$$
which is equivalent to
$$E[X_t^2] = e^{2A(t)}\Big(X_0^2 + \int_0^t e^{-2A(s)} b^2(s)\, ds\Big).$$
It follows that the variance is
$$Var[X_t] = E[X_t^2] - (E[X_t])^2 = e^{2A(t)} \int_0^t e^{-2A(s)} b^2(s)\, ds.$$

Remark 8.2.3 We note that the previous equation is of linear type.
It will be solved explicitly in a future section.

The mean and variance of a given stochastic process can be computed by working with the associated stochastic equation. We provide a few examples next.

Example 8.2.1 Find the mean and variance of $e^{kW_t}$, with k constant.

From Ito's formula
$$d(e^{kW_t}) = k e^{kW_t}\, dW_t + \frac{1}{2} k^2 e^{kW_t}\, dt,$$
and integrating yields
$$e^{kW_t} = 1 + k\int_0^t e^{kW_s}\, dW_s + \frac{1}{2} k^2 \int_0^t e^{kW_s}\, ds.$$
Taking expectations we have
$$E[e^{kW_t}] = 1 + \frac{1}{2} k^2 \int_0^t E[e^{kW_s}]\, ds.$$
If we let $f(t) = E[e^{kW_t}]$, differentiating the previous relation yields the differential equation
$$f'(t) = \frac{1}{2} k^2 f(t)$$
with the initial condition $f(0) = E[e^{kW_0}] = 1$. The solution is $f(t) = e^{k^2 t/2}$, i.e. $E[e^{kW_t}] = e^{k^2 t/2}$. The variance is
$$Var(e^{kW_t}) = E[e^{2kW_t}] - \big(E[e^{kW_t}]\big)^2 = e^{2k^2 t} - e^{k^2 t} = e^{k^2 t}\big(e^{k^2 t} - 1\big).$$

Example 8.2.2 Find the mean of the process $W_t e^{W_t}$.

We shall set up a stochastic differential equation for $W_t e^{W_t}$. Using the product formula and Ito's formula yields
$$d(W_t e^{W_t}) = e^{W_t}\, dW_t + W_t\, d(e^{W_t}) + dW_t\, d(e^{W_t}) = e^{W_t}\, dW_t + (W_t + dW_t)\Big(e^{W_t}\, dW_t + \frac{1}{2} e^{W_t}\, dt\Big) = \Big(\frac{1}{2} W_t e^{W_t} + e^{W_t}\Big)\,dt + \big(e^{W_t} + W_t e^{W_t}\big)\, dW_t.$$
Integrating and using that $W_0 e^{W_0} = 0$ yields
$$W_t e^{W_t} = \int_0^t \Big(\frac{1}{2} W_s e^{W_s} + e^{W_s}\Big)\, ds + \int_0^t \big(e^{W_s} + W_s e^{W_s}\big)\, dW_s.$$
Since the expectation of an Ito integral is zero, we have
$$E[W_t e^{W_t}] = \int_0^t \Big(\frac{1}{2} E[W_s e^{W_s}] + E[e^{W_s}]\Big)\, ds.$$
Let $f(t) = E[W_t e^{W_t}]$. Using $E[e^{W_s}] = e^{s/2}$, the previous integral equation becomes
$$f(t) = \int_0^t \Big(\frac{1}{2} f(s) + e^{s/2}\Big)\, ds.$$
Differentiating yields the linear differential equation
$$f'(t) = \frac{1}{2} f(t) + e^{t/2}$$
with the initial condition f(0) = 0. Multiplying by $e^{-t/2}$ yields the exact equation
$$\big(e^{-t/2} f(t)\big)' = 1.$$
The solution is $f(t) = t e^{t/2}$. Hence we have obtained $E[W_t e^{W_t}] = t e^{t/2}$.

Exercise 8.2.4 Find (a) $E[W_t^2 e^{W_t}]$; (b) $E[W_t e^{kW_t}]$.

Example 8.2.3 Show that for any integer k ≥ 0 we have
$$E[W_t^{2k}] = \frac{(2k)!}{2^k\, k!}\, t^k, \qquad E[W_t^{2k+1}] = 0.$$
In particular, $E[W_t^4] = 3t^2$ and $E[W_t^6] = 15t^3$.
From Ito's formula we have
$$d(W_t^n) = n W_t^{n-1}\, dW_t + \frac{n(n-1)}{2} W_t^{n-2}\, dt.$$
Integrating, we get
$$W_t^n = n\int_0^t W_s^{n-1}\, dW_s + \frac{n(n-1)}{2}\int_0^t W_s^{n-2}\, ds.$$
Since the expectation of the first integral on the right side is zero, taking the expectation yields the recursive relation
$$E[W_t^n] = \frac{n(n-1)}{2}\int_0^t E[W_s^{n-2}]\, ds.$$
Using the initial values $E[W_t] = 0$ and $E[W_t^2] = t$, the method of mathematical induction implies that $E[W_t^{2k+1}] = 0$ and $E[W_t^{2k}] = \frac{(2k)!}{2^k k!} t^k$. The details are left to the reader.

Exercise 8.2.5 (a) Is $W_t^4 - 3t^2$ an $\mathcal{F}_t$-martingale? (b) What about $W_t^3$?

Example 8.2.4 Find $E[\sin W_t]$.

From Ito's formula
$$d(\sin W_t) = \cos W_t\, dW_t - \frac{1}{2} \sin W_t\, dt,$$
then integrating yields
$$\sin W_t = \int_0^t \cos W_s\, dW_s - \frac{1}{2} \int_0^t \sin W_s\, ds.$$
Taking expectations we arrive at the integral equation
$$E[\sin W_t] = -\frac{1}{2} \int_0^t E[\sin W_s]\, ds.$$
Let $f(t) = E[\sin W_t]$. Differentiating yields the equation $f'(t) = -\frac{1}{2} f(t)$ with $f(0) = E[\sin W_0] = 0$. The unique solution is f(t) = 0. Hence $E[\sin W_t] = 0$.

Exercise 8.2.6 Let σ be a constant. Show that
(a) $E[\sin(\sigma W_t)] = 0$;
(b) $E[\cos(\sigma W_t)] = e^{-\sigma^2 t/2}$;
(c) $E[\sin(t + \sigma W_t)] = e^{-\sigma^2 t/2} \sin t$;
(d) $E[\cos(t + \sigma W_t)] = e^{-\sigma^2 t/2} \cos t$.

Exercise 8.2.7 Use the previous exercise and the definition of expectation to show that
(a) $\int_{-\infty}^\infty e^{-x^2} \cos x\, dx = \pi^{1/2}\, e^{-1/4}$;
(b) $\int_{-\infty}^\infty e^{-x^2/2} \cos x\, dx = \sqrt{2\pi/e}$.

Exercise 8.2.8 Using expectations show that
(a) $\int_{-\infty}^\infty x\, e^{-ax^2+bx}\, dx = \frac{b}{2a}\sqrt{\frac{\pi}{a}}\; e^{b^2/(4a)}$;
(b) $\int_{-\infty}^\infty x^2 e^{-ax^2+bx}\, dx = \frac{1}{2a}\Big(1 + \frac{b^2}{2a}\Big)\sqrt{\frac{\pi}{a}}\; e^{b^2/(4a)}$;
(c) Can you apply a similar method to find a closed-form expression for the integral $\int_{-\infty}^\infty x^n e^{-ax^2+bx}\, dx$?

Exercise 8.2.9 Using the result given by Example 8.2.3, show that
(a) $E[\cos(t W_t)] = e^{-t^3/2}$;
(b) $E[\sin(t W_t)] = 0$;
(c) $E[e^{t W_t}] = e^{t^3/2}$.

For general drift rates we cannot find the mean, but in the case of concave drift rates we can find an upper bound for the expectation $E[X_t]$. The following result will be useful.
Lemma 8.2.10 (Gronwall's inequality) Let f(t) be a non-negative function satisfying the inequality
$$f(t) \le C + M \int_0^t f(s)\, ds$$
for 0 ≤ t ≤ T, with C, M constants. Then
$$f(t) \le C e^{Mt}, \qquad 0 \le t \le T.$$

Proposition 8.2.11 Let $X_t$ be a continuous stochastic process such that
$$dX_t = a(X_t)\,dt + b(t, W_t, X_t)\, dW_t,$$
with the function a(·) satisfying the following conditions:
1. a(x) ≥ 0, for 0 ≤ x ≤ T;
2. a''(x) < 0, for 0 ≤ x ≤ T;
3. a'(0) = M.
Then $E[X_t] \le X_0\, e^{Mt}$, for $0 \le X_t \le T$.

Proof: By the mean value theorem there is ξ ∈ (0, x) such that
$$a(x) = a(x) - a(0) = (x-0)\, a'(\xi) \le x\, a'(0) = Mx, \qquad (8.2.6)$$
where we used that a'(x) is a decreasing function. Applying Jensen's inequality for concave functions yields
$$E[a(X_t)] \le a\big(E[X_t]\big).$$
Combining with (8.2.6) we obtain $E[a(X_t)] \le M\, E[X_t]$. Substituting in the identity (8.2.4) implies
$$E[X_t] \le X_0 + M \int_0^t E[X_s]\, ds.$$
Applying Gronwall's inequality we obtain $E[X_t] \le X_0\, e^{Mt}$.

Exercise 8.2.12 State the previous result in the particular case when a(x) = sin x, with 0 ≤ x ≤ π.

The mean and the variance cannot be obtained directly from the stochastic equation in all cases. When they cannot, we need more powerful methods that produce closed-form solutions. In the next sections we discuss several methods of solving stochastic differential equations.

8.3 The Integration Technique

We start with the simple case when both the drift and the volatility are functions of time t only.

Proposition 8.3.1 The solution $X_t$ of the stochastic differential equation
$$dX_t = a(t)\,dt + b(t)\,dW_t$$
is Gaussian distributed with mean $X_0 + \int_0^t a(s)\, ds$ and variance $\int_0^t b^2(s)\, ds$.

Proof: Integrating the equation yields
$$X_t - X_0 = \int_0^t dX_s = \int_0^t a(s)\, ds + \int_0^t b(s)\, dW_s.$$
By the properties of Wiener integrals, $\int_0^t b(s)\, dW_s$ is Gaussian distributed with mean 0 and variance $\int_0^t b^2(s)\, ds$.
Then $X_t$ is Gaussian (as the sum of a deterministic function and a Gaussian), with
$$E[X_t] = E\Big[X_0 + \int_0^t a(s)\, ds + \int_0^t b(s)\, dW_s\Big] = X_0 + \int_0^t a(s)\, ds + E\Big[\int_0^t b(s)\, dW_s\Big] = X_0 + \int_0^t a(s)\, ds,$$
$$Var[X_t] = Var\Big[X_0 + \int_0^t a(s)\, ds + \int_0^t b(s)\, dW_s\Big] = Var\Big[\int_0^t b(s)\, dW_s\Big] = \int_0^t b^2(s)\, ds,$$
which ends the proof.

Exercise 8.3.2 Solve the following stochastic differential equations for t ≥ 0 and determine the mean and the variance of the solution:
(a) $dX_t = \cos t\, dt - \sin t\, dW_t$, $X_0 = 1$;
(b) $dX_t = e^t\, dt + \sqrt{t}\, dW_t$, $X_0 = 0$;
(c) $dX_t = \frac{t}{1+t^2}\, dt + t^{3/2}\, dW_t$, $X_0 = 1$.

If the drift and the volatility depend on both variables t and $W_t$, the stochastic differential equation
$$dX_t = a(t, W_t)\,dt + b(t, W_t)\,dW_t, \qquad t \ge 0,$$
defines an Ito diffusion. Integrating yields the solution
$$X_t = X_0 + \int_0^t a(s, W_s)\, ds + \int_0^t b(s, W_s)\, dW_s.$$
There are several cases when both integrals can be computed explicitly. In order to compute the second integral we shall often use the table of usual stochastic integrals provided in section 7.4.

Example 8.3.1 Find the solution of the stochastic differential equation
$$dX_t = dt + W_t\, dW_t, \qquad X_0 = 1.$$
Integrating between 0 and t, we get
$$X_t = 1 + \int_0^t ds + \int_0^t W_s\, dW_s = 1 + t + \frac{W_t^2}{2} - \frac{t}{2} = \frac{1}{2}(W_t^2 + t) + 1.$$

Example 8.3.2 Solve the stochastic differential equation
$$dX_t = (W_t - 1)\,dt + W_t^2\, dW_t, \qquad X_0 = 0.$$
Let $Z_t = \int_0^t W_s\, ds$ denote the integrated Brownian motion process. Integrating the equation between 0 and t yields
$$X_t = \int_0^t dX_s = \int_0^t (W_s - 1)\, ds + \int_0^t W_s^2\, dW_s = Z_t - t + \frac{1}{3} W_t^3 - Z_t = \frac{1}{3} W_t^3 - t,$$
where we used that $\int_0^t W_s^2\, dW_s = \frac{1}{3} W_t^3 - Z_t$.

Example 8.3.3 Solve the stochastic differential equation
$$dX_t = t^2\, dt + e^{t/2}\cos W_t\, dW_t, \qquad X_0 = 0,$$
and find $E[X_t]$ and $Var(X_t)$.

Integrating yields
$$X_t = \int_0^t s^2\, ds + \int_0^t e^{s/2}\cos W_s\, dW_s = \frac{t^3}{3} + e^{t/2}\sin W_t, \qquad (8.3.7)$$
where we used (7.3.9). Even if the process $X_t$ is not Gaussian, we can still compute its mean and variance.
By Ito's formula we have
$$d(\sin W_t) = \cos W_t\, dW_t - \frac{1}{2} \sin W_t\, dt.$$
Integrating between 0 and t yields
$$\sin W_t = \int_0^t \cos W_s\, dW_s - \frac{1}{2} \int_0^t \sin W_s\, ds,$$
where we used that $\sin W_0 = \sin 0 = 0$. Taking the expectation in the previous relation yields
$$E[\sin W_t] = E\Big[\int_0^t \cos W_s\, dW_s\Big] - \frac{1}{2} \int_0^t E[\sin W_s]\, ds.$$
From the properties of the Ito integral, the first expectation on the right side is zero. Denoting $\mu(t) = E[\sin W_t]$, we obtain the integral equation
$$\mu(t) = -\frac{1}{2} \int_0^t \mu(s)\, ds.$$
Differentiating yields the differential equation $\mu'(t) = -\frac{1}{2}\mu(t)$, with solution $\mu(t) = k e^{-t/2}$. Since $k = \mu(0) = E[\sin W_0] = 0$, it follows that μ(t) = 0. Hence $E[\sin W_t] = 0$. Taking the expectation in (8.3.7) leads to
$$E[X_t] = E\Big[\frac{t^3}{3}\Big] + e^{t/2}\, E[\sin W_t] = \frac{t^3}{3}.$$
Since the variance of a deterministic function is zero,
$$Var[X_t] = Var\Big[\frac{t^3}{3} + e^{t/2}\sin W_t\Big] = \big(e^{t/2}\big)^2\, Var[\sin W_t] = e^t\, E[\sin^2 W_t] = \frac{e^t}{2}\big(1 - E[\cos 2W_t]\big). \qquad (8.3.8)$$
In order to compute the last expectation we use Ito's formula
$$d(\cos 2W_t) = -2\sin 2W_t\, dW_t - 2\cos 2W_t\, dt$$
and integrate to get
$$\cos 2W_t = \cos 2W_0 - 2\int_0^t \sin 2W_s\, dW_s - 2\int_0^t \cos 2W_s\, ds.$$
Taking the expectation and using that Ito integrals have zero expectation yields
$$E[\cos 2W_t] = 1 - 2\int_0^t E[\cos 2W_s]\, ds.$$
If we denote $m(t) = E[\cos 2W_t]$, the previous relation becomes the integral equation
$$m(t) = 1 - 2\int_0^t m(s)\, ds.$$
Differentiating gives $m'(t) = -2m(t)$, with solution $m(t) = k e^{-2t}$. Since $k = m(0) = E[\cos 2W_0] = 1$, we have $m(t) = e^{-2t}$. Substituting into (8.3.8) yields
$$Var[X_t] = \frac{e^t}{2}\big(1 - e^{-2t}\big) = \frac{e^t - e^{-t}}{2} = \sinh t.$$
In conclusion, the solution $X_t$ has mean and variance given by
$$E[X_t] = \frac{t^3}{3}, \qquad Var[X_t] = \sinh t.$$

Example 8.3.4 Solve the stochastic differential equation
$$e^{t/2}\, dX_t = dt + e^{W_t}\, dW_t, \qquad X_0 = 0,$$
and find the distribution of the solution $X_t$ as well as its mean and variance.
Dividing by $e^{t/2}$, integrating between 0 and t, and using formula (7.3.8) yields
$$X_t = \int_0^t e^{-s/2}\, ds + \int_0^t e^{-s/2+W_s}\, dW_s = 2(1 - e^{-t/2}) + e^{-t/2+W_t} - 1 = 1 + e^{-t/2}\big(e^{W_t} - 2\big).$$
Since $e^{W_t}$ is a geometric Brownian motion, using Proposition 3.2.2 yields
$$E[X_t] = E\big[1 + e^{-t/2}(e^{W_t} - 2)\big] = 1 - 2e^{-t/2} + e^{-t/2}\, E[e^{W_t}] = 2 - 2e^{-t/2},$$
$$Var(X_t) = Var\big[1 + e^{-t/2}(e^{W_t} - 2)\big] = Var\big[e^{-t/2} e^{W_t}\big] = e^{-t}\, Var[e^{W_t}] = e^{-t}\big(e^{2t} - e^t\big) = e^t - 1.$$
The process $X_t$ has the following distribution:
$$F(y) = P(X_t \le y) = P\big(1 + e^{-t/2}(e^{W_t} - 2) \le y\big) = P\Big(W_t \le \ln\big(2 + e^{t/2}(y-1)\big)\Big) = P\Big(\frac{W_t}{\sqrt{t}} \le \frac{1}{\sqrt{t}}\ln\big(2 + e^{t/2}(y-1)\big)\Big) = N\Big(\frac{1}{\sqrt{t}}\ln\big(2 + e^{t/2}(y-1)\big)\Big),$$
where $N(u) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^u e^{-s^2/2}\, ds$ is the distribution function of a standard normal random variable.

Example 8.3.5 Solve the stochastic differential equation
$$dX_t = dt + t^{-3/2}\, W_t\, e^{-W_t^2/(2t)}\, dW_t, \qquad X_1 = 1.$$
Integrating between 1 and t and applying formula (7.3.10) yields
$$X_t = X_1 + \int_1^t ds + \int_1^t s^{-3/2} W_s\, e^{-W_s^2/(2s)}\, dW_s = 1 + (t - 1) + e^{-W_1^2/2} - \frac{1}{t^{1/2}}\, e^{-W_t^2/(2t)} = t + e^{-W_1^2/2} - \frac{1}{t^{1/2}}\, e^{-W_t^2/(2t)}, \qquad \forall t \ge 1.$$

Exercise 8.3.3 Solve the following stochastic differential equations by the method of integration:
(a) $dX_t = \big(t - \frac{1}{2}\sin W_t\big)\,dt + (\cos W_t)\,dW_t$, $X_0 = 0$;
(b) $dX_t = \big(\frac{1}{2}\cos W_t - 1\big)\,dt + (\sin W_t)\,dW_t$, $X_0 = 0$;
(c) $dX_t = \frac{1}{2}\big(\sin W_t + W_t\cos W_t\big)\,dt + (W_t\sin W_t)\,dW_t$, $X_0 = 0$.

8.4 Exact Stochastic Equations

The stochastic differential equation
$$dX_t = a(t, W_t)\,dt + b(t, W_t)\,dW_t \qquad (8.4.9)$$
is called exact if there is a differentiable function f(t, x) such that
$$a(t, x) = \partial_t f(t, x) + \frac{1}{2} \partial_x^2 f(t, x), \qquad (8.4.10)$$
$$b(t, x) = \partial_x f(t, x). \qquad (8.4.11)$$
Assume the equation is exact. Then substituting in (8.4.9) yields
$$dX_t = \Big(\partial_t f(t, W_t) + \frac{1}{2} \partial_x^2 f(t, W_t)\Big)\,dt + \partial_x f(t, W_t)\,dW_t.$$
Applying Ito's formula, the previous equation becomes $dX_t = d\big(f(t, W_t)\big)$, which implies $X_t = f(t, W_t) + c$, with c constant. Solving the system of partial differential equations (8.4.10)-(8.4.11) requires the following steps:
1. Integrate partially with respect to x in the second equation to obtain f(t, x) up to an additive function T(t);
2. Substitute into the first equation and determine the function T(t);
3. The solution is $X_t = f(t, W_t) + c$, with c determined from the initial condition on $X_t$.

Example 8.4.1 Solve the stochastic differential equation
$$dX_t = e^t(1 + W_t^2)\,dt + (1 + 2e^t W_t)\,dW_t, \qquad X_0 = 0.$$
In this case $a(t, x) = e^t(1 + x^2)$ and $b(t, x) = 1 + 2e^t x$. The associated system is
$$e^t(1 + x^2) = \partial_t f(t, x) + \frac{1}{2} \partial_x^2 f(t, x),$$
$$1 + 2e^t x = \partial_x f(t, x).$$
Integrating partially in x in the second equation yields
$$f(t, x) = \int (1 + 2e^t x)\, dx = x + e^t x^2 + T(t).$$
Then $\partial_t f = e^t x^2 + T'(t)$ and $\partial_x^2 f = 2e^t$. Substituting in the first equation yields
$$e^t(1 + x^2) = e^t x^2 + T'(t) + e^t.$$
This implies T'(t) = 0, so T = c, a constant. Hence $f(t, x) = x + e^t x^2 + c$, and $X_t = f(t, W_t) = W_t + e^t W_t^2 + c$. Since $X_0 = 0$, it follows that c = 0. The solution is $X_t = W_t + e^t W_t^2$.

Example 8.4.2 Find the solution of
$$dX_t = \big(2t W_t^3 + 3t^2(1 + W_t)\big)\,dt + (3t^2 W_t^2 + 1)\,dW_t, \qquad X_0 = 0.$$
The coefficient functions are $a(t, x) = 2tx^3 + 3t^2(1 + x)$ and $b(t, x) = 3t^2 x^2 + 1$. The associated system is given by
$$2tx^3 + 3t^2(1 + x) = \partial_t f(t, x) + \frac{1}{2} \partial_x^2 f(t, x),$$
$$3t^2 x^2 + 1 = \partial_x f(t, x).$$
Integrating partially in the second equation yields
$$f(t, x) = \int (3t^2 x^2 + 1)\, dx = t^2 x^3 + x + T(t).$$
Then $\partial_t f = 2tx^3 + T'(t)$ and $\partial_x^2 f = 6t^2 x$, and substituting into the first equation we get
$$2tx^3 + 3t^2(1 + x) = 2tx^3 + T'(t) + \frac{1}{2}\, 6t^2 x.$$
After cancellations we get $T'(t) = 3t^2$, so $T(t) = t^3 + c$. Then $f(t, x) = t^2 x^3 + x + t^3 + c$. The solution process is given by $X_t = f(t, W_t) = t^2 W_t^3 + W_t + t^3 + c$. Using $X_0 = 0$ we get c = 0. Hence the solution is $X_t = t^2 W_t^3 + W_t + t^3$.

The next result deals with a closedness condition that the coefficients of an exact stochastic differential equation must satisfy.
Theorem 8.4.1 If the stochastic differential equation (8.4.9) is exact, then the coefficient functions a(t, x) and b(t, x) satisfy the condition
$$\partial_x a = \partial_t b + \frac{1}{2} \partial_x^2 b. \qquad (8.4.12)$$

Proof: If the stochastic equation is exact, there is a function f(t, x) satisfying the system (8.4.10)-(8.4.11). Differentiating the first equation of the system with respect to x yields
$$\partial_x a = \partial_t \partial_x f + \frac{1}{2} \partial_x^2\, \partial_x f.$$
Substituting $b = \partial_x f$ yields the desired relation.

Remark 8.4.2 Equation (8.4.12) has the meaning of a heat equation: the function b(t, x) represents the temperature measured at x at the instant t, while $\partial_x a$ is the density of heat sources. The function a(t, x) can be regarded as the potential from which the density of heat sources is derived by taking the gradient in x.

It is worth noting that equation (8.4.12) is just a necessary condition for exactness. This means that if the condition is not satisfied, then the equation is not exact; in that case we need to try a different method to solve the equation.

Example 8.4.3 Is the stochastic differential equation
$$dX_t = (1 + W_t^2)\,dt + (t^4 + W_t^2)\,dW_t$$
exact?

Collecting the coefficients, we have $a(t, x) = 1 + x^2$ and $b(t, x) = t^4 + x^2$. Since $\partial_x a = 2x$, $\partial_t b = 4t^3$, and $\partial_x^2 b = 2$, the condition (8.4.12) is not satisfied, and hence the equation is not exact.

Exercise 8.4.3 Solve the following exact stochastic differential equations:
(a) $dX_t = e^t\,dt + (W_t^2 - t)\,dW_t$, $X_0 = 1$;
(b) $dX_t = (\sin t)\,dt + (W_t^2 - t)\,dW_t$, $X_0 = -1$;
(c) $dX_t = t^2\,dt + e^{W_t - \frac{t}{2}}\,dW_t$, $X_0 = 0$;
(d) $dX_t = t\,dt + e^{t/2}(\cos W_t)\,dW_t$, $X_0 = 1$.

Exercise 8.4.4 Verify the closedness condition (8.4.12) and then solve the following exact stochastic differential equations:
(a) $dX_t = \big(W_t + \frac{3}{2} W_t^2\big)\,dt + (t + W_t^3)\,dW_t$, $X_0 = 0$;
(b) $dX_t = 2tW_t\,dt + (t^2 + W_t)\,dW_t$, $X_0 = 0$;
(c) $dX_t = \big(e^t W_t + \frac{1}{2}\cos W_t\big)\,dt + (e^t + \sin W_t)\,dW_t$, $X_0 = 0$;
(d) $dX_t = e^{W_t}\big(1 + \frac{t}{2}\big)\,dt + t\, e^{W_t}\,dW_t$, $X_0 = 2$.
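Condition (8.4.12) is easy to test mechanically. The sketch below evaluates both sides on a grid of sample points, using the hand-computed partial derivatives of the coefficients in Example 8.4.1 (exact) and Example 8.4.3 (not exact); the grid itself is an arbitrary choice.

```python
import numpy as np

ts = np.linspace(0.1, 1.0, 5)
xs = np.linspace(-1.0, 1.0, 5)
T, X = np.meshgrid(ts, xs)

# Example 8.4.1: a = e^t (1 + x^2), b = 1 + 2 e^t x
lhs1 = 2 * np.exp(T) * X                 # d_x a
rhs1 = 2 * np.exp(T) * X + 0.5 * 0.0     # d_t b + (1/2) d_x^2 b  (d_x^2 b = 0)
exact1 = np.allclose(lhs1, rhs1)

# Example 8.4.3: a = 1 + x^2, b = t^4 + x^2
lhs2 = 2 * X                             # d_x a
rhs2 = 4 * T**3 + 0.5 * 2.0              # d_t b + (1/2) d_x^2 b  (d_x^2 b = 2)
exact2 = np.allclose(lhs2, rhs2)

print(exact1, exact2)  # True False: only the first equation passes the test
```

Passing the test does not prove exactness, since (8.4.12) is only a necessary condition, but failing it rules exactness out immediately.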
8.5 Integration by Inspection

When solving a stochastic differential equation by inspection we look for opportunities to apply the product or the quotient formulas:
$$d\big(f(t)\, Y_t\big) = f(t)\, dY_t + Y_t\, df(t),$$
$$d\Big(\frac{X_t}{f(t)}\Big) = \frac{f(t)\,dX_t - X_t\, df(t)}{f(t)^2}.$$
For instance, if a stochastic differential equation can be written as
$$dX_t = f'(t)\, W_t\, dt + f(t)\, dW_t,$$
the product rule brings the equation into the exact form $dX_t = d\big(f(t)\, W_t\big)$, which after integration leads to the solution $X_t = X_0 + f(t)\, W_t$.

Example 8.5.1 Solve
$$dX_t = (t + W_t^2)\,dt + 2tW_t\, dW_t, \qquad X_0 = a.$$
We can write the equation as
$$dX_t = W_t^2\, dt + t\,(2W_t\, dW_t + dt),$$
which can be contracted to $dX_t = W_t^2\, dt + t\, d(W_t^2)$. Using the product rule we can bring it to the exact form $dX_t = d(t W_t^2)$, with the solution $X_t = tW_t^2 + a$.

Example 8.5.2 Solve the stochastic differential equation
$$dX_t = (W_t + 3t^2)\,dt + t\, dW_t.$$
If we rewrite the equation as
$$dX_t = 3t^2\, dt + (W_t\, dt + t\, dW_t),$$
we note the exact expression formed by the last two terms: $W_t\, dt + t\, dW_t = d(tW_t)$. Then $dX_t = d(t^3) + d(tW_t)$, which is equivalent to $dX_t = d(t^3 + tW_t)$. Hence $X_t = t^3 + tW_t + c$, with c ∈ R.

Example 8.5.3 Solve the stochastic differential equation
$$e^{-2t}\, dX_t = (1 + 2W_t^2)\,dt + 2W_t\, dW_t.$$
Multiply by $e^{2t}$ to get
$$dX_t = e^{2t}(1 + 2W_t^2)\,dt + 2e^{2t}\, W_t\, dW_t.$$
After regrouping, this becomes
$$dX_t = (2e^{2t}\,dt)\, W_t^2 + e^{2t}\,(2W_t\, dW_t + dt).$$
Since $d(e^{2t}) = 2e^{2t}\,dt$ and $d(W_t^2) = 2W_t\, dW_t + dt$, the previous relation becomes
$$dX_t = d(e^{2t})\, W_t^2 + e^{2t}\, d(W_t^2).$$
By the product rule, the right side becomes exact, $dX_t = d(e^{2t}\, W_t^2)$, and hence the solution is $X_t = e^{2t}\, W_t^2 + c$, c ∈ R.

Example 8.5.4 Solve the equation
$$t^3\, dX_t = (3t^2 X_t + t)\,dt + t^6\, dW_t, \qquad X_1 = 0.$$
The equation can be written as
$$t^3\, dX_t - 3X_t\, t^2\, dt = t\,dt + t^6\, dW_t.$$
Divide by $t^6$:
$$\frac{t^3\, dX_t - X_t\, d(t^3)}{(t^3)^2} = t^{-5}\,dt + dW_t.$$
Applying the quotient rule yields
$$d\Big(\frac{X_t}{t^3}\Big) = -d\Big(\frac{t^{-4}}{4}\Big) + dW_t.$$
Integrating between 1 and t yields
$$\frac{X_t}{t^3} = -\frac{t^{-4}}{4} + W_t - W_1 + c,$$
so
$$X_t = ct^3 - \frac{1}{4t} + t^3(W_t - W_1), \qquad c \in \mathbb{R}.$$
Using $X_1 = 0$ yields c = 1/4, and hence the solution is
$$X_t = \frac{1}{4} t^3 - \frac{1}{4t} + t^3(W_t - W_1).$$

Exercise 8.5.1 Solve the following stochastic differential equations by the inspection method:
(a) $dX_t = (1 + W_t)\,dt + (t + 2W_t)\,dW_t$, $X_0 = 0$;
(b) $t^2\, dX_t = (2t^3 - W_t)\,dt + t\, dW_t$, $X_1 = 1$;
(c) $e^{-t/2}\, dX_t = \frac{1}{2} W_t\, dt + dW_t$, $X_0 = 0$;
(d) $dX_t = 2tW_t\, dW_t + W_t^2\, dt$, $X_0 = 0$;
(e) $dX_t = \Big(1 + \frac{1}{2\sqrt{t}}\, W_t\Big)\,dt + \sqrt{t}\, dW_t$, $X_1 = 0$.

8.6 Linear Stochastic Differential Equations

Consider the stochastic differential equation with drift term linear in $X_t$,
$$dX_t = \big(\alpha(t)\, X_t + \beta(t)\big)\,dt + b(t, W_t)\,dW_t, \qquad t \ge 0.$$
This can also be written as
$$dX_t - \alpha(t)\, X_t\, dt = \beta(t)\,dt + b(t, W_t)\,dW_t.$$
Let $A(t) = \int_0^t \alpha(s)\, ds$. Multiplying by the integrating factor $e^{-A(t)}$, the left side of the previous equation becomes an exact expression:
$$d\big(e^{-A(t)}\, X_t\big) = e^{-A(t)}\big(dX_t - \alpha(t)X_t\, dt\big) = e^{-A(t)}\beta(t)\,dt + e^{-A(t)}\, b(t, W_t)\,dW_t.$$
Integrating yields
$$e^{-A(t)}\, X_t = X_0 + \int_0^t e^{-A(s)}\beta(s)\, ds + \int_0^t e^{-A(s)}\, b(s, W_s)\, dW_s,$$
$$X_t = X_0\, e^{A(t)} + e^{A(t)}\Big(\int_0^t e^{-A(s)}\beta(s)\, ds + \int_0^t e^{-A(s)}\, b(s, W_s)\, dW_s\Big).$$
The first integral within the previous parentheses is a Riemann integral, and the latter one is an Ito stochastic integral. In practical applications these integrals can sometimes be computed explicitly.

When $b(t, W_t) = b(t)$, the latter integral becomes a Wiener integral. In this case the solution $X_t$ is Gaussian, with mean and variance given by
$$E[X_t] = X_0\, e^{A(t)} + e^{A(t)}\int_0^t e^{-A(s)}\beta(s)\, ds,$$
$$Var[X_t] = e^{2A(t)}\int_0^t e^{-2A(s)}\, b(s)^2\, ds.$$
Another important particular case is when α(t) = α ≠ 0 and β(t) = β are constants and $b(t, W_t) = b(t)$. The equation in this case is
$$dX_t = (\alpha X_t + \beta)\,dt + b(t)\,dW_t, \qquad t \ge 0,$$
and the solution takes the form
$$X_t = X_0\, e^{\alpha t} + \frac{\beta}{\alpha}\big(e^{\alpha t} - 1\big) + \int_0^t e^{\alpha(t-s)}\, b(s)\, dW_s.$$

Example 8.6.1 Solve the linear stochastic differential equation
$$dX_t = (2X_t + 1)\,dt + e^{2t}\,dW_t.$$
Write the equation as
$$dX_t - 2X_t\, dt = dt + e^{2t}\,dW_t$$
and multiply by the integrating factor $e^{-2t}$ to get
$$d\big(e^{-2t}\, X_t\big) = e^{-2t}\,dt + dW_t.$$
Integrating between 0 and t and multiplying by $e^{2t}$, we obtain
$$X_t = X_0\, e^{2t} + e^{2t}\int_0^t e^{-2s}\, ds + e^{2t}\int_0^t dW_s = X_0\, e^{2t} + \frac{1}{2}\big(e^{2t} - 1\big) + e^{2t}\, W_t.$$

Example 8.6.2 Solve the linear stochastic differential equation
$$dX_t = (2 - X_t)\,dt + e^{-t}\, W_t\, dW_t.$$
Multiplying by the integrating factor $e^t$ yields
$$e^t\big(dX_t + X_t\, dt\big) = 2e^t\,dt + W_t\, dW_t.$$
Since $e^t(dX_t + X_t\, dt) = d(e^t X_t)$, integrating between 0 and t we get
$$e^t\, X_t = X_0 + \int_0^t 2e^s\, ds + \int_0^t W_s\, dW_s.$$
Dividing by $e^t$ and performing the integration yields
$$X_t = X_0\, e^{-t} + 2\big(1 - e^{-t}\big) + \frac{1}{2} e^{-t}\big(W_t^2 - t\big).$$

Example 8.6.3 Solve the linear stochastic differential equation
$$dX_t = \Big(\frac{1}{2} X_t + 1\Big)\,dt + e^t\cos W_t\, dW_t.$$
Write the equation as
$$dX_t - \frac{1}{2} X_t\, dt = dt + e^t\cos W_t\, dW_t$$
and multiply by the integrating factor $e^{-t/2}$ to get
$$d\big(e^{-t/2}\, X_t\big) = e^{-t/2}\,dt + e^{t/2}\cos W_t\, dW_t.$$
Integrating yields
$$e^{-t/2}\, X_t = X_0 + \int_0^t e^{-s/2}\, ds + \int_0^t e^{s/2}\cos W_s\, dW_s.$$
Multiply by $e^{t/2}$ and use formula (7.3.9) to obtain the solution
$$X_t = X_0\, e^{t/2} + 2\big(e^{t/2} - 1\big) + e^t\sin W_t.$$

Exercise 8.6.1 Solve the following linear stochastic differential equations:
(a) $dX_t = (4X_t - 1)\,dt + 2\,dW_t$;
(b) $dX_t = (3X_t - 2)\,dt + e^{3t}\,dW_t$;
(c) $dX_t = (1 + X_t)\,dt + e^t\, W_t\,dW_t$;
(d) $dX_t = (4X_t + t)\,dt + e^{4t}\,dW_t$;
(e) $dX_t = \big(t + \frac{1}{2} X_t\big)\,dt + e^t\sin W_t\,dW_t$;
(f) $dX_t = -X_t\,dt + e^{-t}\,dW_t$.

In the following we present an important example of a stochastic differential equation that can be solved by the method of this section.

Proposition 8.6.2 (The mean-reverting Ornstein-Uhlenbeck process) Let m and α be two constants. Then the solution $X_t$ of the stochastic equation
$$dX_t = (m - X_t)\,dt + \alpha\,dW_t \qquad (8.6.13)$$
is given by
$$X_t = m + (X_0 - m)\,e^{-t} + \alpha\int_0^t e^{s-t}\, dW_s. \qquad (8.6.14)$$
$X_t$ is Gaussian, with mean and variance given by
$$E[X_t] = m + (X_0 - m)\,e^{-t}, \qquad Var(X_t) = \frac{\alpha^2}{2}\big(1 - e^{-2t}\big).$$

Proof: Adding $X_t\, dt$ to both sides and multiplying by the integrating factor $e^t$ we get
$$d\big(e^t X_t\big) = m\,e^t\,dt + \alpha\,e^t\,dW_t,$$
which after integration yields
$$e^t\, X_t = X_0 + m\big(e^t - 1\big) + \alpha\int_0^t e^s\, dW_s.$$
Hence
$$X_t = X_0\, e^{-t} + m - m\,e^{-t} + \alpha\, e^{-t}\int_0^t e^s\, dW_s = m + (X_0 - m)\,e^{-t} + \alpha\int_0^t e^{s-t}\, dW_s.$$
Since $X_t$ is the sum of a deterministic function and a Wiener integral, by Proposition 5.6.1 it follows that $X_t$ is Gaussian, with
$$E[X_t] = m + (X_0 - m)\,e^{-t} + E\Big[\alpha\int_0^t e^{s-t}\, dW_s\Big] = m + (X_0 - m)\,e^{-t},$$
$$Var(X_t) = Var\Big[\alpha\int_0^t e^{s-t}\, dW_s\Big] = \alpha^2\, e^{-2t}\int_0^t e^{2s}\, ds = \alpha^2\, e^{-2t}\,\frac{e^{2t} - 1}{2} = \frac{\alpha^2}{2}\big(1 - e^{-2t}\big).$$

The name mean-reverting comes from the fact that $\lim_{t\to\infty} E[X_t] = m$. Note that the variance does not tend to zero; it tends exponentially to the limiting value $\alpha^2/2$, so the process keeps fluctuating around m in the long run. Only in the degenerate case α = 0 does the variance vanish and, according to Proposition 4.8.1, the process $X_t$ tend to m in the mean square sense.

Proposition 8.6.3 (The Brownian bridge) For a, b ∈ R fixed, the stochastic differential equation
$$dX_t = \frac{b - X_t}{1-t}\,dt + dW_t, \qquad 0 \le t < 1,\quad X_0 = a,$$
has the solution
$$X_t = a(1-t) + bt + (1-t)\int_0^t \frac{1}{1-s}\, dW_s, \qquad 0 \le t < 1. \qquad (8.6.15)$$
The solution has the property $\lim_{t\to 1} X_t = b$, almost certainly.

Proof: If we let $Y_t = b - X_t$, the equation becomes linear in $Y_t$:
$$dY_t + \frac{1}{1-t}\, Y_t\, dt = -dW_t.$$
Multiplying by the integrating factor $\rho(t) = \frac{1}{1-t}$ yields
$$d\Big(\frac{Y_t}{1-t}\Big) = -\frac{1}{1-t}\,dW_t,$$
which leads by integration to
$$\frac{Y_t}{1-t} = c - \int_0^t \frac{1}{1-s}\, dW_s.$$
Setting t = 0 yields $c = Y_0 = b - a$, so
$$\frac{b - X_t}{1-t} = b - a - \int_0^t \frac{1}{1-s}\, dW_s.$$
Solving for $X_t$ yields
$$X_t = a(1-t) + bt + (1-t)\int_0^t \frac{1}{1-s}\, dW_s, \qquad 0 \le t < 1.$$
Let $U_t = (1-t)\int_0^t \frac{1}{1-s}\, dW_s$. First we notice that
$$E[U_t] = (1-t)\, E\Big[\int_0^t \frac{1}{1-s}\, dW_s\Big] = 0,$$
$$Var(U_t) = (1-t)^2\, Var\Big[\int_0^t \frac{1}{1-s}\, dW_s\Big] = (1-t)^2\int_0^t \frac{1}{(1-s)^2}\, ds = (1-t)^2\Big(\frac{1}{1-t} - 1\Big) = t(1-t).$$
In order to show ac-$\lim_{t\to 1} X_t = b$, we need to prove
$$P\big(\omega;\ \lim_{t\to 1} X_t(\omega) = b\big) = 1.$$
Since $X_t = a(1-t) + bt + U_t$, it suffices to show that
$$P\big(\omega;\ \lim_{t\to 1} U_t(\omega) = 0\big) = 1. \qquad (8.6.16)$$
We evaluate the probability of the complementary event,
$$P\big(\omega;\ \lim_{t\to 1} U_t(\omega) \ne 0\big) = P\big(\omega;\ |U_t(\omega)| > \epsilon,\ \forall t\big),$$
for some ε > 0. Since by Markov's inequality
$$P\big(\omega;\ |U_t(\omega)| > \epsilon\big) < \frac{Var(U_t)}{\epsilon^2} = \frac{t(1-t)}{\epsilon^2}$$
holds for any 0 ≤ t < 1, letting t → 1 implies that $P\big(\omega;\ |U_t(\omega)| > \epsilon,\ \forall t\big) = 0$, which implies (8.6.16).
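The computation Var(U_t) = t(1 − t) can be checked by simulating formula (8.6.15) directly; below is a minimal Monte Carlo sketch in which the endpoints, evaluation time, sample sizes and tolerances are all arbitrary illustrative choices.

```python
import numpy as np

a, b, t_eval = 0.0, 1.0, 0.5
n_steps, n_paths = 1000, 50000
dt = t_eval / n_steps

rng = np.random.default_rng(5)
I = np.zeros(n_paths)                 # int_0^t dW_s/(1-s), left-point sums
for k in range(n_steps):
    dW = rng.normal(0.0, np.sqrt(dt), size=n_paths)
    I += dW / (1.0 - k * dt)

X = a * (1.0 - t_eval) + b * t_eval + (1.0 - t_eval) * I   # formula (8.6.15)
print(X.mean(), X.var())  # near a(1-t)+bt = 0.5 and t(1-t) = 0.25
```

At t = 1/2 the simulated mean and variance match the Gaussian law with mean a(1 − t) + bt and variance t(1 − t) up to Monte Carlo noise.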
The process (8.6.15) is called the Brownian bridge because it joins $X_0 = a$ with $X_1 = b$. Since $X_t$ is the sum of a deterministic linear function in t and a Wiener integral, it is a Gaussian process, with mean and variance
$$E[X_t] = a(1-t) + bt, \qquad Var(X_t) = Var(U_t) = t(1-t).$$
It is worth noting that the variance is maximal at the midpoint t = 1/2 and zero at the endpoints t = 0 and t = 1.

Exercise 8.6.4 Show that the Brownian bridge (8.6.15) satisfies $X_t \xrightarrow{ms} b$ as t → 1.

Exercise 8.6.5 Find $Cov(X_s, X_t)$, 0 < s < t, for the following cases:
(a) $X_t$ is a mean-reverting Ornstein-Uhlenbeck process;
(b) $X_t$ is a Brownian bridge process.

Stochastic equations with respect to a Poisson process

Similar techniques can be applied when the Brownian motion process $W_t$ is replaced by a Poisson process $N_t$ with constant rate λ. For instance, the stochastic differential equation
$$dX_t = 3X_t\,dt + e^{3t}\,dN_t, \qquad X_0 = 1,$$
can be solved by multiplying by the integrating factor $e^{-3t}$ to obtain
$$d\big(e^{-3t}\, X_t\big) = dN_t.$$
Integrating yields $e^{-3t}\, X_t = N_t + C$. Setting t = 0 yields C = 1, so the solution is given by $X_t = e^{3t}(1 + N_t)$.

The equation
$$dX_t = (m - X_t)\,dt + \alpha\,dN_t$$
is similar to the equation defining the mean-reverting Ornstein-Uhlenbeck process. As we shall see, in this case the process is no longer mean-reverting; it reverts to a certain constant instead. A similar method yields the solution
$$X_t = m + (X_0 - m)\,e^{-t} + \alpha\, e^{-t}\int_0^t e^s\, dN_s.$$
Since from Proposition 5.8.5 and Exercise 5.8.8 we have
$$E\Big[\int_0^t e^s\, dN_s\Big] = \lambda\int_0^t e^s\, ds = \lambda\big(e^t - 1\big),$$
$$Var\Big[\int_0^t e^s\, dN_s\Big] = \lambda\int_0^t e^{2s}\, ds = \frac{\lambda}{2}\big(e^{2t} - 1\big),$$
it follows that
$$E[X_t] = m + (X_0 - m)\,e^{-t} + \alpha\lambda\big(1 - e^{-t}\big) \to m + \alpha\lambda,$$
$$Var(X_t) = \frac{\lambda\alpha^2}{2}\big(1 - e^{-2t}\big).$$
It is worth noting that in this case the process $X_t$ is no longer Gaussian.

8.7 The Method of Variation of Parameters

Let us start by considering the stochastic equation
$$dX_t = \alpha X_t\, dW_t, \qquad (8.7.17)$$
This is the equation which, in physics is known to model the linear noise. Dividing by Xt yields dXt = α dWt . Xt Switch to the integral form Z dXt = Xt Z α dWt , and integrate “blindly” to get ln Xt = αWt + c, with c an integration constant. This leads to the “pseudo-solution” Xt = eαWt +c . The nomination “pseudo” stands for the fact that Xt does not satisfy the initial equation. We shall find a correct solution by letting the parameter c be a function of t. In other words, we are looking for a solution of the following type: Xt = eαWt +c(t) , (8.7.18) where the function c(t) is subject to be determined. Using Ito’s formula we get dXt = d(eαWt +c(t) ) = eαWt +c(t) (c′ (t) + α2 /2)dt + αeαWt +c(t) dWt = Xt (c′ (t) + α2 /2)dt + αXt dWt . Substituting the last term from the initial equation (8.7.17) yields dXt = Xt (c′ (t) + α2 /2)dt + dXt , which leads to the equation c′ (t) + α2 /2 = 0 2 with the solution c(t) = − α2 t + k. Substituting into (8.7.18) yields Xt = eαWt − α2 t+k 2 . The value of the constant k is determined by taking t = 0. This leads to X0 = ek . Hence we have obtained the solution of the equation (8.7.17) Xt = X0 eαWt − α2 t 2 . Example 8.7.1 Use the method of variation of parameters to solve the stochastic differential equation dXt = µXt dt + σXt dWt , with µ and σ constants. 168 After dividing by Xt we bring the equation into the equivalent integral form Z Z Z dXt = µ dt + σ dWt . Xt Integrate on the left “blindly” and get ln Xt = µt + σWt + c, where c is an integration constant. We arrive at the following “pseudo-solution” Xt = eµt+σWt +c . Assume the constant c is replaced by a function c(t), so we are looking for a solution of the form Xt = eµt+σWt +c(t) . (8.7.19) Apply Ito’s formula and get dXt = Xt µ + c′ (t) + Subtracting the initial equation yields c′ (t) + 2 σ2 dt + σXt dWt . 2 σ2 dt = 0, 2 2 which is satisfied for c′ (t) = − σ2 , with the solution c(t) = − σ2 t+k, k ∈ R. 
Substituting into (8.7.19) yields the solution
$$X_t = e^{\mu t + \sigma W_t - \frac{\sigma^2}{2}t + k} = e^{(\mu - \frac{\sigma^2}{2})t + \sigma W_t + k} = X_0\, e^{(\mu - \frac{\sigma^2}{2})t + \sigma W_t}.$$

Exercise 8.7.1 Use the method of variation of parameters to solve the equation $dX_t = X_t\, W_t\, dW_t$ by following the next two steps:
(a) Divide by $X_t$ and integrate "blindly" to get the "pseudo-solution"
$$X_t = e^{\frac{W_t^2}{2} - \frac{t}{2} + c},$$
with c constant.
(b) Consider c = c(t, W_t) and find a solution of the form
$$X_t = e^{\frac{W_t^2}{2} - \frac{t}{2} + c(t, W_t)}.$$

8.8 Integrating Factors

The method of integrating factors can be applied to a class of stochastic differential equations of the type
$$dX_t = f(t, X_t)\,dt + g(t)\, X_t\,dW_t, \qquad (8.8.20)$$
where f and g are continuous deterministic functions. The integrating factor is given by
$$\rho_t = e^{-\int_0^t g(s)\, dW_s + \frac{1}{2}\int_0^t g^2(s)\, ds}.$$
The equation can be brought into the exact form
$$d(\rho_t X_t) = \rho_t\, f(t, X_t)\,dt.$$
Substituting $Y_t = \rho_t X_t$, we obtain that $Y_t$ satisfies the deterministic differential equation
$$dY_t = \rho_t\, f(t, Y_t/\rho_t)\,dt,$$
which can be solved by either integration or as an exact equation. We exemplify the method of integrating factors on a few examples.

Example 8.8.1 Solve the stochastic differential equation
$$dX_t = r\,dt + \alpha X_t\,dW_t, \qquad (8.8.21)$$
with r and α constants.

The integrating factor is given by $\rho_t = e^{\frac{1}{2}\alpha^2 t - \alpha W_t}$. Using Ito's formula, we can easily check that
$$d\rho_t = \rho_t\big(\alpha^2\,dt - \alpha\,dW_t\big).$$
Using $dt^2 = dt\, dW_t = 0$ and $(dW_t)^2 = dt$, we obtain
$$dX_t\, d\rho_t = -\alpha^2\, \rho_t X_t\, dt.$$
Multiplying by $\rho_t$, the initial equation becomes
$$\rho_t\, dX_t - \alpha\rho_t X_t\, dW_t = r\rho_t\, dt,$$
and adding and subtracting $\alpha^2\rho_t X_t\, dt$ on the left side yields
$$\rho_t\, dX_t - \alpha\rho_t X_t\, dW_t + \alpha^2\rho_t X_t\, dt - \alpha^2\rho_t X_t\, dt = r\rho_t\, dt.$$
This can be written as
$$\rho_t\, dX_t + X_t\, d\rho_t + d\rho_t\, dX_t = r\rho_t\, dt,$$
which, by virtue of the product rule, becomes
$$d(\rho_t X_t) = r\rho_t\, dt.$$
Integrating yields
$$\rho_t X_t = \rho_0 X_0 + r\int_0^t \rho_s\, ds,$$
and hence the solution is
$$X_t = \frac{X_0}{\rho_t} + \frac{r}{\rho_t}\int_0^t \rho_s\, ds = X_0\, e^{\alpha W_t - \frac{1}{2}\alpha^2 t} + r\int_0^t e^{-\frac{1}{2}\alpha^2(t-s) + \alpha(W_t - W_s)}\, ds.$$

Exercise 8.8.1 Let α be a constant.
Solve the following stochastic differential equations by the method of integrating factors:
(a) $dX_t = \alpha X_t\, dW_t$;
(b) $dX_t = X_t\, dt + \alpha X_t\, dW_t$;
(c) $dX_t = \dfrac{1}{X_t}\, dt + \alpha X_t\, dW_t$, $X_0 > 0$.

Exercise 8.8.2 Let $X_t$ be the solution of the stochastic equation $dX_t = \sigma X_t\, dW_t$, with $\sigma$ constant. Let $A_t = \frac{1}{t}\int_0^t X_s\, ds$ be the stochastic average of $X_t$. Find the stochastic equation satisfied by $A_t$, and the mean and variance of $A_t$.

8.9 Existence and Uniqueness

An exploding solution. Consider the non-linear stochastic differential equation
$$dX_t = X_t^3\, dt + X_t^2\, dW_t, \qquad X_0 = 1/a. \qquad (8.9.22)$$
We shall look for a solution of the type $X_t = f(W_t)$. Ito's formula yields
$$dX_t = f'(W_t)\, dW_t + \frac{1}{2} f''(W_t)\, dt.$$
Equating the coefficients of $dt$ and $dW_t$ in the last two equations yields
$$f'(W_t) = X_t^2 \;\Longrightarrow\; f'(W_t) = f(W_t)^2, \qquad (8.9.23)$$
$$\frac{1}{2} f''(W_t) = X_t^3 \;\Longrightarrow\; f''(W_t) = 2 f(W_t)^3. \qquad (8.9.24)$$
We note that equation (8.9.23) implies (8.9.24) by differentiation. So it suffices to solve only the ordinary differential equation
$$f'(x) = f(x)^2, \qquad f(0) = 1/a.$$
Separating and integrating we have
$$\int \frac{df}{f^2} = \int dx \;\Longrightarrow\; f(x) = \frac{1}{a - x}.$$
Hence a solution of equation (8.9.22) is
$$X_t = \frac{1}{a - W_t}.$$
Let $T_a$ be the first time the Brownian motion $W_t$ hits $a$. Then the process $X_t$ is defined only for $0 \le t < T_a$. $T_a$ is a random variable with $P(T_a < \infty) = 1$ and $E[T_a] = \infty$, see Section 4.3.

The following theorem is the analog of Picard's uniqueness result from ordinary differential equations:

Theorem 8.9.1 (Existence and Uniqueness) Consider the stochastic differential equation
$$dX_t = b(t, X_t)\,dt + \sigma(t, X_t)\,dW_t, \qquad X_0 = c,$$
where $c$ is a constant and $b$ and $\sigma$ are continuous functions on $[0, T] \times \mathbb{R}$ satisfying
1. $|b(t,x)| + |\sigma(t,x)| \le C(1 + |x|)$, $x \in \mathbb{R}$, $t \in [0,T]$;
2. $|b(t,x) - b(t,y)| + |\sigma(t,x) - \sigma(t,y)| \le K|x - y|$, $x, y \in \mathbb{R}$, $t \in [0,T]$,
with $C$, $K$ positive constants. Then there is a unique solution process $X_t$ that is continuous and satisfies
$$E\Big[\int_0^T X_t^2\, dt\Big] < \infty.$$
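Returning to the exploding solution above, the ODE reductions (8.9.23) and (8.9.24) can be verified numerically for the candidate $f(x) = 1/(a - x)$. The value $a = 2$, the grid, and the step size below are arbitrary illustration choices, and the derivatives are approximated by central finite differences (a sketch, not part of the text):

```python
import numpy as np

def f(x, a=2.0):
    # Candidate solution f(x) = 1/(a - x) of f' = f^2, f(0) = 1/a
    return 1.0 / (a - x)

h = 1e-5
x = np.linspace(-1.0, 1.0, 201)               # a = 2, so the pole lies outside the grid
f1 = (f(x + h) - f(x - h)) / (2 * h)           # numerical f'
f2 = (f(x + h) - 2 * f(x) + f(x - h)) / h**2   # numerical f''

assert np.max(np.abs(f1 - f(x)**2)) < 1e-5     # f'  = f^2   (eq. 8.9.23)
assert np.max(np.abs(f2 - 2 * f(x)**3)) < 1e-3 # f'' = 2 f^3 (eq. 8.9.24)
print("f(x) = 1/(a - x) satisfies both ODEs")
```

The check also illustrates the blow-up mechanism: as $x$ approaches $a$, both $f$ and its derivatives diverge, matching the explosion of $X_t = 1/(a - W_t)$ at the hitting time $T_a$.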
The first condition says that the drift and volatility increase no faster than a linear function in $x$. The second condition states that the functions are Lipschitz in the second argument.

Example 8.9.1 Consider the stochastic differential equation
$$dX_t = \Big(\frac{1}{2} X_t + \sqrt{1 + X_t^2}\Big)\,dt + \sqrt{1 + X_t^2}\; dW_t, \qquad X_0 = x_0.$$
(a) Solve the equation; (b) Show that there is a unique solution.

Chapter 9
Applications of Brownian Motion

9.1 The Directional Derivative

Consider a smooth curve $x : [0, \infty) \to \mathbb{R}^n$, starting at $x(0) = x_0$ with the initial velocity $v = x'(0)$. Then the derivative of a function $f : \mathbb{R}^n \to \mathbb{R}$ in the direction $v$ is defined by
$$D_v f(x_0) = \lim_{t \searrow 0} \frac{f\big(x(t)\big) - f(x_0)}{t} = \frac{d}{dt} f\big(x(t)\big)\Big|_{t=0^+}.$$
Applying the chain rule yields
$$D_v f(x_0) = \sum_{k=1}^n \frac{\partial f}{\partial x_k}\big(x(t)\big)\frac{dx_k(t)}{dt}\Big|_{t=0^+} = \sum_{k=1}^n \frac{\partial f}{\partial x_k}(x_0)\, v_k = \langle \nabla f(x_0), v\rangle,$$
where $\nabla f$ stands for the gradient of $f$ and $\langle\,,\,\rangle$ denotes the scalar product. The linear differential operator $D_v$ is called the directional derivative with respect to the vector $v$. In the next section we shall extend this definition to the case when the curve $x(t)$ is replaced by an Ito diffusion $X_t$; in this case the corresponding "directional derivative" will be a second order partial differential operator.

9.2 The Generator of an Ito Diffusion

Let $(X_t)_{t \ge 0}$ be a stochastic process with $X_0 = x_0$. We shall consider an operator that describes infinitesimally the rate of change of a function which depends smoothly on $X_t$. More precisely, the generator of the stochastic process $X_t$ is the second order partial differential operator $A$ defined by
$$Af(x) = \lim_{t \searrow 0} \frac{E[f(X_t)] - f(x)}{t},$$
for any smooth function (at least of class $C^2$) with compact support $f : \mathbb{R}^n \to \mathbb{R}$. Here $E$ stands for the expectation operator given the initial condition $X_0 = x$, i.e.
$$E[f(X_t)] = E[f(X_t)\,|\,X_0 = x] = \int_{\mathbb{R}^n} f(y)\, p_t(x, y)\, dy,$$
where $p_t(x, y) = p(x, y; t, 0)$ is the transition density of $X_t$, given $X_0 = x$.
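The limiting quotient defining $Af(x)$ can be probed by Monte Carlo simulation. For an $n$-dimensional Brownian motion started at $x$ one has $E[|x + W_t|^2] = |x|^2 + nt$ exactly, so for $f(x) = |x|^2$ the quotient $(E[f(X_t)] - f(x))/t$ should come out near $n$. The sample size, the small time $t$, and the starting point below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
n, t = 3, 0.01
x = np.array([1.0, -0.5, 2.0])

# X_t = x + W_t for an n-dimensional Brownian motion; f(x) = |x|^2
W = rng.normal(scale=np.sqrt(t), size=(1_000_000, n))
f_Xt = np.sum((x + W)**2, axis=1)

# (E[f(X_t)] - f(x)) / t approximates the generator applied to f at x
Af_est = (f_Xt.mean() - np.sum(x**2)) / t
assert abs(Af_est - n) < 0.25   # should be near n = 3
print(Af_est)
```

Dividing by a small $t$ amplifies the Monte Carlo error, which is why a large sample and a loose tolerance are used here.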
In the following we shall find the generator associated with the Ito diffusion
$$dX_t = b(X_t)\,dt + \sigma(X_t)\,dW(t), \qquad t \ge 0,\quad X_0 = x, \qquad (9.2.1)$$
where $W(t) = \big(W_1(t), \dots, W_m(t)\big)$ is an $m$-dimensional Brownian motion, and $b : \mathbb{R}^n \to \mathbb{R}^n$ and $\sigma : \mathbb{R}^n \to \mathbb{R}^{n \times m}$ are measurable functions.

The main tool used in deriving the formula for the generator $A$ is Ito's formula in several variables. If we let $F_t = f(X_t)$, then using Ito's formula we have
$$dF_t = \sum_i \frac{\partial f}{\partial x_i}(X_t)\, dX_t^i + \frac{1}{2}\sum_{i,j} \frac{\partial^2 f}{\partial x_i \partial x_j}(X_t)\, dX_t^i\, dX_t^j, \qquad (9.2.2)$$
where $X_t = (X_t^1, \dots, X_t^n)$ satisfies the Ito diffusion (9.2.1) on components, i.e.
$$dX_t^i = b_i(X_t)\,dt + [\sigma(X_t)\, dW(t)]_i = b_i(X_t)\,dt + \sum_k \sigma_{ik}\, dW_k(t). \qquad (9.2.3)$$
Using the stochastic relations $dt^2 = dt\, dW_k(t) = 0$ and $dW_k(t)\, dW_r(t) = \delta_{kr}\, dt$, a computation provides
$$dX_t^i\, dX_t^j = \Big(b_i\,dt + \sum_k \sigma_{ik}\, dW_k(t)\Big)\Big(b_j\,dt + \sum_k \sigma_{jk}\, dW_k(t)\Big) = \sum_{k,r} \sigma_{ik}\sigma_{jr}\, dW_k(t)\, dW_r(t) = \sum_k \sigma_{ik}\sigma_{jk}\, dt = (\sigma\sigma^T)_{ij}\, dt.$$
Therefore
$$dX_t^i\, dX_t^j = (\sigma\sigma^T)_{ij}\, dt. \qquad (9.2.4)$$
Substituting (9.2.3) and (9.2.4) into (9.2.2) yields
$$dF_t = \Big[\frac{1}{2}\sum_{i,j}(\sigma\sigma^T)_{ij}\frac{\partial^2 f}{\partial x_i \partial x_j}(X_t) + \sum_i b_i(X_t)\frac{\partial f}{\partial x_i}(X_t)\Big]dt + \sum_{i,k}\frac{\partial f}{\partial x_i}(X_t)\,\sigma_{ik}(X_t)\, dW_k(t).$$
Integrating, we obtain
$$F_t = F_0 + \int_0^t \Big[\frac{1}{2}\sum_{i,j}(\sigma\sigma^T)_{ij}\frac{\partial^2 f}{\partial x_i \partial x_j} + \sum_i b_i \frac{\partial f}{\partial x_i}\Big](X_s)\, ds + \sum_k \int_0^t \sum_i \sigma_{ik}\frac{\partial f}{\partial x_i}(X_s)\, dW_k(s).$$
Since $F_0 = f(X_0) = f(x)$ and $E[f(x)] = f(x)$, applying the expectation operator in the previous relation we obtain
$$E(F_t) = f(x) + E\Big[\int_0^t \Big(\frac{1}{2}\sum_{i,j}(\sigma\sigma^T)_{ij}\frac{\partial^2 f}{\partial x_i \partial x_j} + \sum_i b_i \frac{\partial f}{\partial x_i}\Big)(X_s)\, ds\Big]. \qquad (9.2.5)$$
Using the commutativity between the operator $E$ and the integral (see Exercise 9.2.7), and applying l'Hospital's rule as $t \searrow 0$, yields
$$\lim_{t \searrow 0} \frac{E(F_t) - f(x)}{t} = \frac{1}{2}\sum_{i,j}(\sigma\sigma^T)_{ij}\frac{\partial^2 f(x)}{\partial x_i \partial x_j} + \sum_k b_k \frac{\partial f(x)}{\partial x_k}.$$
We conclude the previous computations with the following result.

Theorem 9.2.1 The generator of the Ito diffusion (9.2.1) is given by
$$A = \frac{1}{2}\sum_{i,j}(\sigma\sigma^T)_{ij}\frac{\partial^2}{\partial x_i \partial x_j} + \sum_k b_k \frac{\partial}{\partial x_k}. \qquad (9.2.6)$$
The matrix $\sigma$ is called the dispersion and the product $\sigma\sigma^T$ is called the diffusion matrix.
These names are related to their physical significance. Substituting (9.2.6) in (9.2.5) we obtain the following formula
$$E[f(X_t)] = f(x) + E\Big[\int_0^t Af(X_s)\, ds\Big], \qquad (9.2.7)$$
for any $f \in C_0^2(\mathbb{R}^n)$.

Exercise 9.2.2 Find the generator operator associated with the $n$-dimensional Brownian motion.

Exercise 9.2.3 Find the Ito diffusion corresponding to the generator $Af(x) = f''(x) + f'(x)$.

Exercise 9.2.4 Let $\Delta_G = \frac{1}{2}\big(\partial_{x_1}^2 + x_1^2\, \partial_{x_2}^2\big)$ be Grushin's operator.
(a) Find the diffusion process associated with the generator $\Delta_G$.
(b) Find the diffusion and dispersion matrices and show that they are degenerate.

Exercise 9.2.5 Let $X_t$ and $Y_t$ be two one-dimensional independent Ito diffusions with infinitesimal generators $A_X$ and $A_Y$. Let $Z_t = (X_t, Y_t)$ with the infinitesimal generator $A_Z$. Show that $A_Z = A_X + A_Y$.

Exercise 9.2.6 Let $X_t$ be an Ito diffusion with infinitesimal generator $A_X$. Consider the process $Y_t = (t, X_t)$. Show that the infinitesimal generator of $Y_t$ is given by $A_Y = \partial_t + A_X$.

Exercise 9.2.7 Let $X_t$ be an Ito diffusion with $X_0 = x$, and $\varphi$ a smooth function. Use l'Hospital's rule to show that
$$\lim_{t \to 0^+} \frac{1}{t}\, E\Big[\int_0^t \varphi(X_s)\, ds\Big] = \varphi(x).$$

9.3 Dynkin's Formula

Formula (9.2.7) holds under more general conditions, when $t$ is a stopping time. First we need the following result, which deals with a continuity-type property in the upper limit of an Ito integral.

Lemma 9.3.1 Let $g$ be a bounded measurable function and $\tau$ be a stopping time for $X_t$ with $E[\tau] < \infty$. Then
$$\lim_{k \to \infty} E\Big[\int_0^{\tau \wedge k} g(X_s)\, dW_s\Big] = E\Big[\int_0^\tau g(X_s)\, dW_s\Big]; \qquad (9.3.8)$$
$$\lim_{k \to \infty} E\Big[\int_0^{\tau \wedge k} g(X_s)\, ds\Big] = E\Big[\int_0^\tau g(X_s)\, ds\Big]. \qquad (9.3.9)$$

Proof: Let $|g| < K$. Using the properties of Ito integrals, we have
$$E\Big[\Big(\int_0^\tau g(X_s)\, dW_s - \int_0^{\tau \wedge k} g(X_s)\, dW_s\Big)^2\Big] = E\Big[\Big(\int_{\tau \wedge k}^{\tau} g(X_s)\, dW_s\Big)^2\Big] = E\Big[\int_{\tau \wedge k}^{\tau} g^2(X_s)\, ds\Big] \le K^2\, E[\tau - \tau \wedge k] \to 0, \quad k \to \infty.$$
Since $E[X]^2 \le E[X^2]$, it follows that
$$E\Big[\int_0^\tau g(X_s)\, dW_s - \int_0^{\tau \wedge k} g(X_s)\, dW_s\Big] \to 0, \quad k \to \infty,$$
which is equivalent to relation (9.3.8).
The second relation can be proved similarly and is left as an exercise for the reader.

Exercise 9.3.2 Assume the hypothesis of the previous lemma. Let $1_{\{s < \tau\}}$ be the characteristic function of the interval $(-\infty, \tau)$,
$$1_{\{s < \tau\}}(u) = \begin{cases} 1, & \text{if } u < \tau \\ 0, & \text{otherwise.} \end{cases}$$
Show that
(a) $\displaystyle\int_0^{\tau \wedge k} g(X_s)\, dW_s = \int_0^k 1_{\{s < \tau\}}\, g(X_s)\, dW_s$;
(b) $\displaystyle\int_0^{\tau \wedge k} g(X_s)\, ds = \int_0^k 1_{\{s < \tau\}}\, g(X_s)\, ds$.

Theorem 9.3.3 (Dynkin's formula) Let $f \in C_0^2(\mathbb{R}^n)$, and let $X_t$ be an Ito diffusion starting at $x$. If $\tau$ is a stopping time with $E[\tau] < \infty$, then
$$E[f(X_\tau)] = f(x) + E\Big[\int_0^\tau Af(X_s)\, ds\Big], \qquad (9.3.10)$$
where $A$ is the infinitesimal generator of $X_t$.

Proof: Replacing $t$ by $k$ in (9.2.7) and using Exercise 9.3.2, we obtain
$$E[f(X_{k \wedge \tau})] = f(x) + E\Big[\int_0^k 1_{\{s < \tau\}}(s)\, Af(X_s)\, ds\Big] = f(x) + E\Big[\int_0^{k \wedge \tau} Af(X_s)\, ds\Big]. \qquad (9.3.11)$$
Since, by Lemma 9.3.1 (b),
$$E[f(X_{k \wedge \tau})] \to E[f(X_\tau)], \qquad E\Big[\int_0^{k \wedge \tau} Af(X_s)\, ds\Big] \to E\Big[\int_0^\tau Af(X_s)\, ds\Big], \quad k \to \infty,$$
relation (9.3.11) yields (9.3.10).

Exercise 9.3.4 Write Dynkin's formula for the case of a function $f(t, X_t)$. Use Exercise 9.2.6.

In the following sections we shall present a few important results of stochastic calculus that can be obtained as direct consequences of Dynkin's formula.

9.4 Kolmogorov's Backward Equation

For any function $f \in C_0^2(\mathbb{R}^n)$ let $v(t, x) = E[f(X_t)]$, given that $X_0 = x$. The operator $E$ denotes the expectation given the initial condition $X_0 = x$. Then $v(0, x) = f(x)$, and differentiating in Dynkin's formula (9.2.7),
$$v(t, x) = f(x) + \int_0^t E[Af(X_s)]\, ds,$$
yields $\dfrac{\partial v}{\partial t} = E[Af(X_t)]$. Since we are allowed to differentiate inside the integral, the operators $A$ and $E$ commute, i.e. $E[Af(X_t)] = AE[f(X_t)]$. Therefore
$$\frac{\partial v}{\partial t} = E[Af(X_t)] = AE[f(X_t)] = Av(t, x).$$
Hence we have arrived at the following result.
Theorem 9.4.1 (Kolmogorov's backward equation) For any $f \in C_0^2(\mathbb{R}^n)$ the function $v(t, x) = E[f(X_t)]$ satisfies the following Cauchy problem
$$\frac{\partial v}{\partial t} = Av, \qquad t > 0,$$
$$v(0, x) = f(x),$$
where $A$ denotes the generator of the Ito diffusion (9.2.1).

9.5 Exit Time from an Interval

Let $X_t = x_0 + W_t$ be a one-dimensional Brownian motion starting at $x_0$, with $x_0 \in (a, b)$. Consider the exit time of the process $X_t$ from the strip $(a, b)$,
$$\tau = \inf\{t > 0;\; X_t \notin (a, b)\}.$$
Assuming $E[\tau] < \infty$, applying Dynkin's formula yields
$$E\big[f(X_\tau)\big] = f(x_0) + E\Big[\int_0^\tau \frac{1}{2}\frac{d^2 f}{dx^2}(X_s)\, ds\Big]. \qquad (9.5.12)$$
Choosing $f(x) = x$ in (9.5.12) we obtain
$$E[X_\tau] = x_0. \qquad (9.5.13)$$

Exercise 9.5.1 Prove relation (9.5.13) using the Optional Stopping Theorem for the martingale $X_t$.

Let $p_a = P(X_\tau = a)$ and $p_b = P(X_\tau = b)$ be the exit probabilities of $X_t$ from the interval $(a, b)$. Obviously, $p_a + p_b = 1$, since the probability that the Brownian motion stays forever inside the bounded interval is zero. Using the definition of the expectation, relation (9.5.13) yields
$$a p_a + b(1 - p_a) = x_0.$$
Solving for $p_a$ and $p_b$ we get the following exit probabilities
$$p_a = \frac{b - x_0}{b - a}, \qquad (9.5.14)$$
$$p_b = 1 - p_a = \frac{x_0 - a}{b - a}. \qquad (9.5.15)$$
It is worth noting that if $b \to \infty$ then $p_a \to 1$, and if $a \to -\infty$ then $p_b \to 1$. This can be stated by saying that a Brownian motion starting at $x_0$ reaches any level (below or above $x_0$) with probability 1.

Next we shall compute the mean of the exit time, $E[\tau]$. Choosing $f(x) = x^2$ in (9.5.12) yields
$$E[(X_\tau)^2] = x_0^2 + E[\tau].$$
From the definition of the mean and formulas (9.5.14)-(9.5.15) we obtain
$$E[\tau] = a^2 p_a + b^2 p_b - x_0^2 = a^2\,\frac{b - x_0}{b - a} + b^2\,\frac{x_0 - a}{b - a} - x_0^2 = \frac{1}{b - a}\big[a^2 b - a b^2 + x_0(b - a)(b + a)\big] - x_0^2 = -ab + x_0(a + b) - x_0^2 = (b - x_0)(x_0 - a). \qquad (9.5.16)$$

Exercise 9.5.2 (a) Show that the equation $x^2 - (b - a)x + E[\tau] = 0$ cannot have complex roots;
(b) Prove that $E[\tau] \le \dfrac{(b - a)^2}{4}$;
(c) Find the point $x_0 \in (a, b)$ for which the expectation of the exit time, $E[\tau]$, is maximum.
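Formulas (9.5.14) and (9.5.16) have exact discrete analogues for the symmetric $\pm 1$ random walk (the classical gambler's ruin problem): the walk started at an integer $x_0 \in (a, b)$ hits $a$ with probability $(b - x_0)/(b - a)$ and takes $(b - x_0)(x_0 - a)$ steps on average. This makes both formulas easy to test by simulation; the barriers $a = 0$, $b = 10$, the start $x_0 = 3$, and the sample size are arbitrary choices for illustration:

```python
import random

random.seed(1)
a, b, x0 = 0, 10, 3          # integer barriers: discrete analogue of the interval (a, b)

hits_a, steps_total, N = 0, 0, 20_000
for _ in range(N):
    x, steps = x0, 0
    while a < x < b:
        x += random.choice((-1, 1))   # symmetric +-1 step, the discrete Brownian motion
        steps += 1
    hits_a += (x == a)
    steps_total += steps

p_a_est = hits_a / N
Etau_est = steps_total / N
assert abs(p_a_est - (b - x0) / (b - a)) < 0.02   # p_a = 0.7 by (9.5.14)
assert abs(Etau_est - (b - x0) * (x0 - a)) < 1.0  # E[tau] = 21 by (9.5.16)
print(p_a_est, Etau_est)
```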
9.6 Transience and Recurrence of Brownian Motion

We shall first consider the expectation of the exit time from a ball. Then we shall extend it to an annulus and compute the transience probabilities.

Figure 9.1: The Brownian motion $X_t$ in the ball $B(0, R)$.

1. Consider the process $X_t = a + W(t)$, where $W(t) = \big(W_1(t), \dots, W_n(t)\big)$ is an $n$-dimensional Brownian motion, and $a = (a_1, \dots, a_n) \in \mathbb{R}^n$ is a fixed vector, see Fig. 9.1. Let $R > 0$ be such that $R > |a|$. Consider the exit time of the process $X_t$ from the ball $B(0, R)$,
$$\tau = \inf\{t > 0;\; |X_t| > R\}. \qquad (9.6.17)$$
Assuming $E[\tau] < \infty$ and letting $f(x) = |x|^2 = x_1^2 + \cdots + x_n^2$ in Dynkin's formula
$$E\big[f(X_\tau)\big] = f(a) + E\Big[\int_0^\tau \frac{1}{2}\Delta f(X_s)\, ds\Big]$$
yields
$$R^2 = |a|^2 + E\Big[\int_0^\tau n\, ds\Big],$$
and hence
$$E[\tau] = \frac{R^2 - |a|^2}{n}. \qquad (9.6.18)$$
In particular, if the Brownian motion starts from the center, i.e. $a = 0$, the expectation of the exit time is
$$E[\tau] = \frac{R^2}{n}.$$
(i) Since $R^2/2 > R^2/3$, the previous relation implies that it takes longer, on average, for a Brownian motion to exit a disk of radius $R$ than to exit a three-dimensional ball of the same radius.
(ii) The expected time for a one-dimensional Brownian motion to leave the interval $(-R, R)$ is twice the expected time for a two-dimensional Brownian motion to exit the disk $B(0, R)$.

Exercise 9.6.1 Prove that $E[\tau] < \infty$, where $\tau$ is given by (9.6.17).

Exercise 9.6.2 Apply the Optional Stopping Theorem to the martingale $M_t = W_t^2 - t$ to show that $E[\tau] = R^2$, where $\tau = \inf\{t > 0;\; |W_t| > R\}$ is the first exit time of the Brownian motion from $(-R, R)$.

2. Let $b \in \mathbb{R}^n$ be such that $b \notin B(0, R)$, i.e. $|b| > R$, and consider the annulus
$$A_k = \{x;\; R < |x| < kR\},$$
where $k > 0$ is chosen such that $b \in A_k$. Consider the process $X_t = b + W(t)$ and let
$$\tau_k = \inf\{t > 0;\; X_t \notin A_k\}$$
be the first exit time of $X_t$ from the annulus $A_k$. Let $f : A_k \to \mathbb{R}$ be defined by
$$f(x) = \begin{cases} -\ln |x|, & \text{if } n = 2 \\[4pt] \dfrac{1}{|x|^{n-2}}, & \text{if } n > 2. \end{cases}$$
A straightforward computation shows that $\Delta f = 0$. Substituting into Dynkin's formula
$$E\big[f(X_{\tau_k})\big] = f(b) + E\Big[\int_0^{\tau_k} \frac{1}{2}\Delta f(X_s)\, ds\Big]$$
yields $E\big[f(X_{\tau_k})\big] = f(b)$.
This relation,
$$E\big[f(X_{\tau_k})\big] = f(b), \qquad (9.6.19)$$
states that the value of $f$ at a point $b$ in the annulus is equal to the expected value of $f$ at the first exit time of a Brownian motion starting at $b$.

Since $|X_{\tau_k}|$ is a random variable with two outcomes, we have
$$E\big[f(X_{\tau_k})\big] = p_k f(R) + q_k f(kR),$$
where $p_k = P(|X_{\tau_k}| = R)$, $q_k = P(|X_{\tau_k}| = kR)$ and $p_k + q_k = 1$. Substituting in (9.6.19) yields
$$p_k f(R) + q_k f(kR) = f(b). \qquad (9.6.20)$$
There are two distinguished cases:

(i) If $n = 2$ we obtain
$$-p_k \ln R - q_k(\ln k + \ln R) = -\ln |b|.$$
Using $p_k = 1 - q_k$ and solving for $p_k$ yields
$$p_k = 1 - \frac{\ln(|b|/R)}{\ln k}.$$
Hence
$$P(\tau < \infty) = \lim_{k \to \infty} p_k = 1,$$
where $\tau = \inf\{t > 0;\; |X_t| < R\}$ is the first time $X_t$ hits the ball $B(0, R)$. Hence in $\mathbb{R}^2$ a Brownian motion hits any ball with probability 1. This is stated equivalently by saying that the Brownian motion is recurrent in $\mathbb{R}^2$.

(ii) If $n > 2$, equation (9.6.20) becomes
$$\frac{p_k}{R^{n-2}} + \frac{q_k}{(kR)^{n-2}} = \frac{1}{|b|^{n-2}}.$$
Taking the limit $k \to \infty$ yields
$$\lim_{k \to \infty} p_k = \Big(\frac{R}{|b|}\Big)^{n-2} < 1.$$
Then in $\mathbb{R}^n$, $n > 2$, a Brownian motion starting outside of a ball hits it with a probability less than 1. This is usually stated by saying that the Brownian motion is transient.

3. We shall recover the previous results using the $n$-dimensional Bessel process
$$R_t = \mathrm{dist}\big(0, W(t)\big) = \sqrt{W_1(t)^2 + \cdots + W_n(t)^2}.$$
Consider the process $Y_t = \alpha + R_t$, with $0 \le \alpha < R$, see Section 3.7. It can be shown that the generator of $Y_t$ is the Bessel operator of order $n$,
$$A = \frac{1}{2}\frac{d^2}{dx^2} + \frac{n-1}{2x}\frac{d}{dx}.$$
Consider the exit time $\tau = \inf\{t > 0;\; Y_t > R\}$. Applying Dynkin's formula
$$E\big[f(Y_\tau)\big] = f(Y_0) + E\Big[\int_0^\tau (Af)(Y_s)\, ds\Big]$$
for $f(x) = x^2$ yields $R^2 = \alpha^2 + E\big[\int_0^\tau n\, ds\big]$. This leads to
$$E[\tau] = \frac{R^2 - \alpha^2}{n},$$
which recovers (9.6.18) with $\alpha = |a|$.

In the following assume $n \ge 3$ and consider the annulus
$$A_{r,R} = \{x \in \mathbb{R}^n;\; r < |x| < R\}.$$

Figure 9.2: The Brownian motion $X_t$ in the annulus $A_{r,R}$.

Consider the stopping time $\tau = \inf\{t > 0;\; X_t \notin A_{r,R}\} = \inf\{t > 0;\; Y_t \notin (r, R)\}$, where $Y_0 = \alpha \in (r, R)$.
Applying Dynkin's formula for $f(x) = x^{2-n}$, which satisfies $Af = 0$, yields $E\big[f(Y_\tau)\big] = f(\alpha)$. This can be written as
$$p_r\, r^{2-n} + p_R\, R^{2-n} = \alpha^{2-n},$$
where $p_r = P(Y_\tau = r)$, $p_R = P(Y_\tau = R)$, and $p_r + p_R = 1$. Solving for $p_r$ and $p_R$ yields
$$p_r = \frac{\big(R/\alpha\big)^{n-2} - 1}{\big(R/r\big)^{n-2} - 1}, \qquad p_R = \frac{\big(r/\alpha\big)^{n-2} - 1}{\big(r/R\big)^{n-2} - 1}.$$
The transience probability is obtained by taking the limit to infinity,
$$\lim_{R \to \infty} p_r = \lim_{R \to \infty} \frac{\alpha^{2-n} - R^{2-n}}{r^{2-n} - R^{2-n}} = \frac{\alpha^{2-n}}{r^{2-n}} = \Big(\frac{r}{\alpha}\Big)^{n-2},$$
where $p_r$ is the probability that a Brownian motion starting outside the ball of radius $r$ will hit the ball, see Fig. 9.2.

Exercise 9.6.3 Solve the equation $\frac{1}{2}f''(x) + \frac{n-1}{2x}f'(x) = 0$ by looking for a solution of monomial type $f(x) = x^k$.

9.7 Application to Parabolic Equations

This section deals with solving first and second order parabolic equations using the integral of the cost function along a certain characteristic solution. The first order equations are related to predictable characteristic curves, while the second order equations depend on stochastic characteristic curves.

1. Predictable characteristics. Let $\varphi(s)$ be the solution of the following one-dimensional ODE
$$\frac{dX(s)}{ds} = a\big(s, X(s)\big), \qquad X(t) = x, \quad t \le s \le T,$$
and define the cumulative cost between $t$ and $T$ along the solution $\varphi$,
$$u(t, x) = \int_t^T c\big(s, \varphi(s)\big)\, ds, \qquad (9.7.21)$$
where $c$ denotes a continuous cost function. Differentiating both sides with respect to $t$,
$$\frac{d}{dt}\, u\big(t, \varphi(t)\big) = \frac{\partial}{\partial t}\int_t^T c\big(s, \varphi(s)\big)\, ds,$$
gives
$$\partial_t u + \partial_x u\;\varphi'(t) = -c\big(t, \varphi(t)\big).$$
Hence (9.7.21) is a solution of the following final value problem
$$\partial_t u(t, x) + a(t, x)\,\partial_x u(t, x) = -c(t, x),$$
$$u(T, x) = 0.$$
It is worth mentioning that this is a variant of the method of characteristics, a well-known method of solving linear partial differential equations. The curve given by the solution $\varphi(s)$ is called a characteristic curve.

Exercise 9.7.1 Using the previous method solve the following final boundary problems:
(a) $\partial_t u + x\,\partial_x u = -x$, $\quad u(T, x) = 0$.
(b) $\partial_t u + tx\,\partial_x u = \ln x$, $x > 0$, $\quad u(T, x) = 0$.
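The construction above can be exercised on a toy case. Taking the drift $a(s, x) = 1$ and the cost $c(s, x) = x$ (choices made purely for illustration), the characteristic is $\varphi(s) = x + (s - t)$ and (9.7.21) integrates to $u(t, x) = x(T - t) + (T - t)^2/2$, which indeed solves $\partial_t u + \partial_x u = -x$ with $u(T, x) = 0$. A short numerical check of the cumulative-cost integral against this closed form:

```python
import numpy as np

# Characteristic ODE dX/ds = a(s, X) with a = 1 and cost c(s, x) = x,
# so phi(s) = x + (s - t) and u(t, x) = x (T - t) + (T - t)^2 / 2.
t, T, x = 0.0, 2.0, 1.5
h = 1e-4
s_grid = np.arange(t, T, h)

phi = x + (s_grid - t)      # exact solution of the characteristic ODE
u_num = np.sum(phi) * h     # Riemann sum of the cumulative cost (9.7.21)

u_exact = x * (T - t) + (T - t)**2 / 2
assert abs(u_num - u_exact) < 1e-3
print(u_num, u_exact)
```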
2. Stochastic characteristics. Consider the diffusion
$$dX_s = a(s, X_s)\,ds + b(s, X_s)\,dW_s, \qquad X_t = x, \quad t \le s \le T,$$
and define the stochastic cumulative cost function
$$u(t, X_t) = \int_t^T c(s, X_s)\, ds, \qquad (9.7.22)$$
with the conditional expectation
$$u(t, x) = E\big[u(t, X_t)\,|\,X_t = x\big] = E\Big[\int_t^T c(s, X_s)\, ds\,\Big|\,X_t = x\Big].$$
Taking increments in both sides of (9.7.22) yields
$$du(t, X_t) = d\int_t^T c(s, X_s)\, ds.$$
Applying Ito's formula on one side and the Fundamental Theorem of Calculus on the other, we obtain
$$\partial_t u(t, X_t)\,dt + \partial_x u(t, X_t)\,dX_t + \frac{1}{2}\partial_x^2 u(t, X_t)\,(dX_t)^2 = -c(t, X_t)\,dt.$$
Taking the expectation $E[\,\cdot\,|\,X_t = x]$ on both sides yields
$$\partial_t u(t, x)\,dt + a(t, x)\,\partial_x u(t, x)\,dt + \frac{1}{2} b^2(t, x)\,\partial_x^2 u(t, x)\,dt = -c(t, x)\,dt.$$
Hence, the expected cost
$$u(t, x) = E\Big[\int_t^T c(s, X_s)\, ds\,\Big|\,X_t = x\Big]$$
is a solution of the following second order parabolic equation
$$\partial_t u + a(t, x)\,\partial_x u + \frac{1}{2} b^2(t, x)\,\partial_x^2 u = -c(t, x),$$
$$u(T, x) = 0.$$
This represents the probabilistic interpretation of the solution of a parabolic equation.

Exercise 9.7.2 Solve the following final boundary problems:
(a) $\partial_t u + \partial_x u + \frac{1}{2}\partial_x^2 u = -x$, $\quad u(T, x) = 0$.
(b) $\partial_t u + \partial_x u + \frac{1}{2}\partial_x^2 u = e^x$, $\quad u(T, x) = 0$.
(c) $\partial_t u + \mu x\,\partial_x u + \frac{1}{2}\sigma^2 x^2\,\partial_x^2 u = -x$, $\quad u(T, x) = 0$.

Chapter 10
Martingales and Girsanov's Theorem

10.1 Examples of Martingales

In this section we shall use the knowledge acquired in previous chapters to present a few important examples of martingales. These will be useful in the proof of Girsanov's theorem in the next section.

We start by recalling that a process $X_t$, $0 \le t \le T$, is an $\mathcal{F}_t$-martingale if
1. $E[|X_t|] < \infty$ ($X_t$ is integrable for each $t$);
2. $X_t$ is $\mathcal{F}_t$-measurable;
3. the forecast of future values is the last observation: $E[X_t\,|\,\mathcal{F}_s] = X_s$, $\forall s < t$.
We shall present three important examples of martingales and some of their particular cases.

Example 10.1.1 If $v(s)$ is a continuous function on $[0, T]$, then
$$X_t = \int_0^t v(s)\, dW_s$$
is an $\mathcal{F}_t$-martingale.
Taking out the predictable part,
$$E[X_t\,|\,\mathcal{F}_s] = E\Big[\int_0^s v(\tau)\, dW_\tau + \int_s^t v(\tau)\, dW_\tau\,\Big|\,\mathcal{F}_s\Big] = X_s + E\Big[\int_s^t v(\tau)\, dW_\tau\,\Big|\,\mathcal{F}_s\Big] = X_s,$$
where we used that $\int_s^t v(\tau)\, dW_\tau$ is independent of $\mathcal{F}_s$, so the conditional expectation equals the usual expectation,
$$E\Big[\int_s^t v(\tau)\, dW_\tau\,\Big|\,\mathcal{F}_s\Big] = E\Big[\int_s^t v(\tau)\, dW_\tau\Big] = 0.$$

Example 10.1.2 Let $X_t = \int_0^t v(s)\, dW_s$ be a process as in Example 10.1.1. Then
$$M_t = X_t^2 - \int_0^t v^2(s)\, ds$$
is an $\mathcal{F}_t$-martingale.

The process $X_t$ satisfies the stochastic equation $dX_t = v(t)\,dW_t$. By Ito's formula,
$$d(X_t^2) = 2X_t\, dX_t + (dX_t)^2 = 2v(t)X_t\, dW_t + v^2(t)\,dt. \qquad (10.1.1)$$
Integrating between $s$ and $t$ yields
$$X_t^2 - X_s^2 = 2\int_s^t X_\tau\, v(\tau)\, dW_\tau + \int_s^t v^2(\tau)\, d\tau.$$
Then, separating the predictable from the unpredictable part, we have
$$E[M_t\,|\,\mathcal{F}_s] = E\Big[X_t^2 - \int_0^t v^2(\tau)\, d\tau\,\Big|\,\mathcal{F}_s\Big] = X_s^2 - \int_0^s v^2(\tau)\, d\tau + E\Big[X_t^2 - X_s^2 - \int_s^t v^2(\tau)\, d\tau\,\Big|\,\mathcal{F}_s\Big] = M_s + 2E\Big[\int_s^t X_\tau\, v(\tau)\, dW_\tau\,\Big|\,\mathcal{F}_s\Big] = M_s,$$
where we used relation (10.1.1) and the fact that $\int_s^t X_\tau\, v(\tau)\, dW_\tau$ is totally unpredictable given the information set $\mathcal{F}_s$.

In the following we shall mention a few particular cases.
1. If $v(s) = 1$, then $X_t = W_t$. In this case $M_t = W_t^2 - t$ is an $\mathcal{F}_t$-martingale.
2. If $v(s) = s$, then $X_t = \int_0^t s\, dW_s$, and hence
$$M_t = \Big(\int_0^t s\, dW_s\Big)^2 - \frac{t^3}{3}$$
is an $\mathcal{F}_t$-martingale.

Example 10.1.3 Let $u : [0, T] \to \mathbb{R}$ be a continuous function. Then
$$M_t = e^{\int_0^t u(s)\, dW_s - \frac{1}{2}\int_0^t u^2(s)\, ds}$$
is an $\mathcal{F}_t$-martingale for $0 \le t \le T$.

Consider the process $U_t = \int_0^t u(s)\, dW_s - \frac{1}{2}\int_0^t u^2(s)\, ds$. Then
$$dU_t = u(t)\,dW_t - \frac{1}{2}u^2(t)\,dt, \qquad (dU_t)^2 = u^2(t)\,dt.$$
Then Ito's formula yields
$$dM_t = d(e^{U_t}) = e^{U_t}\, dU_t + \frac{1}{2} e^{U_t} (dU_t)^2 = e^{U_t}\Big(u(t)\,dW_t - \frac{1}{2}u^2(t)\,dt + \frac{1}{2}u^2(t)\,dt\Big) = u(t)\, M_t\, dW_t.$$
Integrating between $s$ and $t$ yields
$$M_t = M_s + \int_s^t u(\tau)\, M_\tau\, dW_\tau.$$
Since $\int_s^t u(\tau)\, M_\tau\, dW_\tau$ is independent of $\mathcal{F}_s$,
$$E\Big[\int_s^t u(\tau)\, M_\tau\, dW_\tau\,\Big|\,\mathcal{F}_s\Big] = E\Big[\int_s^t u(\tau)\, M_\tau\, dW_\tau\Big] = 0,$$
and hence
$$E[M_t\,|\,\mathcal{F}_s] = E\Big[M_s + \int_s^t u(\tau)\, M_\tau\, dW_\tau\,\Big|\,\mathcal{F}_s\Big] = M_s.$$
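The martingale property of Example 10.1.3 is easy to probe by simulation, since for the constant choice $u(s) = \sigma$ the process reduces to $M_t = e^{\sigma W_t - \sigma^2 t/2}$ with $E[M_t] = 1$ (immediate from $M_0 = 1$) and, by the lognormal moment formula, $E[M_t^2] = e^{\sigma^2 t}$. The values $\sigma = 1$, $t = 1$ and the sample size are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(42)
sigma, t = 1.0, 1.0

# Exponential process M_t = exp(sigma * W_t - sigma^2 t / 2), with W_t ~ N(0, t)
W = rng.normal(scale=np.sqrt(t), size=2_000_000)
M = np.exp(sigma * W - 0.5 * sigma**2 * t)

assert abs(M.mean() - 1.0) < 0.01                        # E[M_t] = 1
assert abs((M**2).mean() - np.exp(sigma**2 * t)) < 0.08  # E[M_t^2] = e^{sigma^2 t}
print(M.mean(), (M**2).mean())
```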
Remark 10.1.4 The condition that $u(s)$ is continuous on $[0, T]$ can be relaxed by asking only
$$u \in L^2[0, T] = \Big\{u : [0, T] \to \mathbb{R};\; u \text{ measurable and } \int_0^T |u(s)|^2\, ds < \infty\Big\}.$$
It is worth noting that the conclusion still holds if the function $u(s)$ is replaced by a stochastic process $u(t, \omega)$ satisfying Novikov's condition
$$E\big[e^{\frac{1}{2}\int_0^T u^2(s, \omega)\, ds}\big] < \infty.$$

The previous process has a distinguished importance in the theory of martingales and it will be useful when proving the Girsanov theorem.

Definition 10.1.5 Let $u \in L^2[0, T]$ be a deterministic function. Then the stochastic process
$$M_t = e^{\int_0^t u(s)\, dW_s - \frac{1}{2}\int_0^t u^2(s)\, ds}$$
is called the exponential process induced by $u$.

Particular cases of exponential processes.
1. Let $u(s) = \sigma$, a constant. Then $M_t = e^{\sigma W_t - \frac{\sigma^2 t}{2}}$ is an $\mathcal{F}_t$-martingale.
2. Let $u(s) = s$. Integrating in $d(tW_t) = t\, dW_t + W_t\, dt$ yields
$$\int_0^t s\, dW_s = tW_t - \int_0^t W_s\, ds.$$
Let $Z_t = \int_0^t W_s\, ds$ be the integrated Brownian motion. Then
$$M_t = e^{\int_0^t s\, dW_s - \frac{1}{2}\int_0^t s^2\, ds} = e^{tW_t - Z_t - \frac{t^3}{6}}$$
is an $\mathcal{F}_t$-martingale.

Example 10.1.6 Let $X_t$ be a solution of $dX_t = u(t)\,dt + dW_t$, with $u(s)$ a bounded function. Consider the exponential process
$$M_t = e^{-\int_0^t u(s)\, dW_s - \frac{1}{2}\int_0^t u^2(s)\, ds}.$$
Then $Y_t = M_t X_t$ is an $\mathcal{F}_t$-martingale.

As in Example 10.1.3 (with $u$ replaced by $-u$) we obtain $dM_t = -u(t)\, M_t\, dW_t$. Then $dM_t\, dX_t = -u(t)\, M_t\, dt$. The product rule yields
$$dY_t = M_t\, dX_t + X_t\, dM_t + dM_t\, dX_t = M_t\big(u(t)\,dt + dW_t\big) - X_t\, u(t)\, M_t\, dW_t - u(t)\, M_t\, dt = M_t\big(1 - u(t) X_t\big)\,dW_t.$$
Integrating between $s$ and $t$ yields
$$Y_t = Y_s + \int_s^t M_\tau\big(1 - u(\tau) X_\tau\big)\, dW_\tau. \qquad (10.1.2)$$
Since $\int_s^t M_\tau\big(1 - u(\tau) X_\tau\big)\, dW_\tau$ is independent of $\mathcal{F}_s$,
$$E\Big[\int_s^t M_\tau\big(1 - u(\tau) X_\tau\big)\, dW_\tau\,\Big|\,\mathcal{F}_s\Big] = E\Big[\int_s^t M_\tau\big(1 - u(\tau) X_\tau\big)\, dW_\tau\Big] = 0,$$
and hence $E[Y_t\,|\,\mathcal{F}_s] = Y_s$.

Exercise 10.1.7 Prove that $(W_t + t)\, e^{-W_t - \frac{1}{2}t}$ is an $\mathcal{F}_t$-martingale.

Exercise 10.1.8 Let $h$ be a continuous function. Using the properties of the Wiener integral and of log-normal random variables, show that
$$E\big[e^{\int_0^t h(s)\, dW_s}\big] = e^{\frac{1}{2}\int_0^t h^2(s)\, ds}.$$

Exercise 10.1.9 Let $M_t$ be the exponential process of Definition 10.1.5.
Use the previous exercise to show that for any $t > 0$
(a) $E[M_t] = 1$;
(b) $E[M_t^2] = e^{\int_0^t u(s)^2\, ds}$.

Exercise 10.1.10 Let $\mathcal{F}_t = \sigma\{W_u;\; u \le t\}$. Show that the following processes are $\mathcal{F}_t$-martingales:
(a) $e^{t/2}\cos W_t$;
(b) $e^{t/2}\sin W_t$.

Recall that the Laplacian of a twice differentiable function $f$ is defined by $\Delta f(x) = \sum_{j=1}^n \partial_{x_j}^2 f$.

Example 10.1.11 Consider a smooth function $f : \mathbb{R}^n \to \mathbb{R}$ such that
(i) $\Delta f = 0$;
(ii) $E\big[|f(W_t)|\big] < \infty$, $\forall t > 0$.
Then the process $X_t = f(W_t)$ is an $\mathcal{F}_t$-martingale.
Proof: It follows from the more general Example 10.1.13.

Exercise 10.1.12 Let $W_1(t)$ and $W_2(t)$ be two independent Brownian motions. Show that $X_t = e^{W_1(t)}\cos W_2(t)$ is a martingale.

Example 10.1.13 Let $f : \mathbb{R}^n \to \mathbb{R}$ be a smooth function such that
(i) $E\big[|f(W_t)|\big] < \infty$;
(ii) $E\big[\int_0^t |\Delta f(W_s)|\, ds\big] < \infty$.
Then the process $X_t = f(W_t) - \frac{1}{2}\int_0^t \Delta f(W_s)\, ds$ is a martingale.

Proof: For $0 \le s < t$ we have
$$E[X_t\,|\,\mathcal{F}_s] = E[f(W_t)\,|\,\mathcal{F}_s] - \frac{1}{2} E\Big[\int_0^t \Delta f(W_u)\, du\,\Big|\,\mathcal{F}_s\Big] = E[f(W_t)\,|\,\mathcal{F}_s] - \frac{1}{2}\int_0^s \Delta f(W_u)\, du - \frac{1}{2}\int_s^t E\big[\Delta f(W_u)\,|\,\mathcal{F}_s\big]\, du. \qquad (10.1.3)$$
Let $p(t, y, x)$ be the probability density function of $W_t$. Integrating by parts and using that $p$ satisfies Kolmogorov's backward equation, we have
$$\frac{1}{2}E\big[\Delta f(W_u)\,|\,\mathcal{F}_s\big] = \frac{1}{2}\int p(u - s, W_s, x)\,\Delta f(x)\, dx = \frac{1}{2}\int \Delta_x p(u - s, W_s, x)\, f(x)\, dx = \int \frac{\partial}{\partial u} p(u - s, W_s, x)\, f(x)\, dx.$$
Then, using the Fundamental Theorem of Calculus, we obtain
$$\frac{1}{2}\int_s^t E\big[\Delta f(W_u)\,|\,\mathcal{F}_s\big]\, du = \int_s^t \int \frac{\partial}{\partial u} p(u - s, W_s, x)\, f(x)\, dx\, du = \int p(t - s, W_s, x)\, f(x)\, dx - \lim_{\epsilon \searrow 0} \int p(\epsilon, W_s, x)\, f(x)\, dx = E[f(W_t)\,|\,\mathcal{F}_s] - \int \delta(x - W_s)\, f(x)\, dx = E[f(W_t)\,|\,\mathcal{F}_s] - f(W_s).$$
Substituting in (10.1.3) yields
$$E[X_t\,|\,\mathcal{F}_s] = E[f(W_t)\,|\,\mathcal{F}_s] - \frac{1}{2}\int_0^s \Delta f(W_u)\, du - E[f(W_t)\,|\,\mathcal{F}_s] + f(W_s) \qquad (10.1.4)$$
$$= f(W_s) - \frac{1}{2}\int_0^s \Delta f(W_u)\, du \qquad (10.1.5)$$
$$= X_s. \qquad (10.1.6)$$
Hence $X_t$ is an $\mathcal{F}_t$-martingale.
Exercise 10.1.14 Use Example 10.1.13 to show that the following processes are martingales:
(a) $X_t = W_t^2 - t$;
(b) $X_t = W_t^3 - 3\int_0^t W_s\, ds$;
(c) $X_t = \frac{1}{n(n-1)} W_t^n - \frac{1}{2}\int_0^t W_s^{n-2}\, ds$;
(d) $X_t = e^{cW_t} - \frac{1}{2}c^2 \int_0^t e^{cW_s}\, ds$, with $c$ constant;
(e) $X_t = \sin(cW_t) + \frac{1}{2}c^2 \int_0^t \sin(cW_s)\, ds$, with $c$ constant.

Exercise 10.1.15 Let $f : \mathbb{R}^n \to \mathbb{R}$ be a function such that
(i) $E\big[|f(W_t)|\big] < \infty$;
(ii) $\Delta f = \lambda f$, with $\lambda$ constant.
Show that the process $X_t = f(W_t) - \frac{\lambda}{2}\int_0^t f(W_s)\, ds$ is a martingale.

10.2 Girsanov's Theorem

In this section we shall present and prove a particular version of Girsanov's theorem, which will suffice for the purpose of later applications. The application of Girsanov's theorem to finance is to show that in the study of security markets the differences between the mean rates of return can be removed.

We shall first recall a few basic notions. Let $(\Omega, \mathcal{F}, P)$ be a probability space. When dealing with an $\mathcal{F}_t$-martingale on the aforementioned probability space, the filtration $\mathcal{F}_t$ is considered to be the sigma-algebra generated by the given Brownian motion $W_t$, i.e. $\mathcal{F}_t = \sigma\{W_u;\; 0 \le u \le t\}$. By default, a martingale is considered with respect to the probability measure $P$, in the sense that the expectations involve an integration with respect to $P$,
$$E^P[X] = \int_\Omega X(\omega)\, dP(\omega).$$
We have not used the superscript until now since there was no doubt which probability measure was used. In this section we shall also use another probability measure given by
$$dQ = M_T\, dP,$$
where $M_T$ is an exponential process. This means that $Q : \mathcal{F} \to \mathbb{R}$ is given by
$$Q(A) = \int_A dQ = \int_A M_T\, dP, \qquad \forall A \in \mathcal{F}.$$
Since $M_T > 0$ and $M_0 = 1$, using the martingale property of $M_t$ yields
$$Q(A) > 0, \quad A \ne \emptyset;$$
$$Q(\Omega) = \int_\Omega M_T\, dP = E^P[M_T] = E^P[M_T\,|\,\mathcal{F}_0] = M_0 = 1,$$
which shows that $Q$ is a probability measure on $\mathcal{F}$, and hence $(\Omega, \mathcal{F}, Q)$ becomes a probability space. The following transformation of expectation from the probability measure $Q$ to $P$ will be useful in Part II.
If $X$ is a random variable, then
$$E^Q[X] = \int_\Omega X(\omega)\, dQ(\omega) = \int_\Omega X(\omega)\, M_T(\omega)\, dP(\omega) = E^P[X M_T].$$
The following result will play a central role in proving Girsanov's theorem:

Lemma 10.2.1 Let $X_t$ be the Ito process
$$dX_t = u(t)\,dt + dW_t, \qquad X_0 = 0, \quad 0 \le t \le T,$$
with $u(s)$ a bounded function. Consider the exponential process
$$M_t = e^{-\int_0^t u(s)\, dW_s - \frac{1}{2}\int_0^t u^2(s)\, ds}.$$
Then $X_t$ is an $\mathcal{F}_t$-martingale with respect to the measure $dQ(\omega) = M_T(\omega)\,dP(\omega)$.

Proof: We need to prove that $X_t$ is an $\mathcal{F}_t$-martingale with respect to $Q$, so it suffices to show the following three properties:

1. Integrability of $X_t$. This part usually follows from standard norm estimations. We shall do it here in detail, but we shall omit it in other proofs. Integrating the equation of $X_t$ between $0$ and $t$ yields
$$X_t = \int_0^t u(s)\, ds + W_t. \qquad (10.2.7)$$
We start with an estimation of the expectation with respect to $P$:
$$E^P[X_t^2] = E^P\Big[\Big(\int_0^t u(s)\, ds\Big)^2 + 2\Big(\int_0^t u(s)\, ds\Big) W_t + W_t^2\Big] = \Big(\int_0^t u(s)\, ds\Big)^2 + 2\Big(\int_0^t u(s)\, ds\Big) E^P[W_t] + E^P[W_t^2] = \Big(\int_0^t u(s)\, ds\Big)^2 + t < \infty, \qquad \forall\, 0 \le t \le T,$$
where the finiteness follows from the norm estimation
$$\Big|\int_0^t u(s)\, ds\Big| \le \int_0^t |u(s)|\, ds \le t^{1/2}\Big[\int_0^t |u(s)|^2\, ds\Big]^{1/2} \le T^{1/2}\Big[\int_0^T |u(s)|^2\, ds\Big]^{1/2} = T^{1/2}\|u\|_{L^2[0,T]}.$$
Next we obtain an estimation with respect to $Q$. By the Cauchy-Schwarz inequality,
$$\big(E^Q[|X_t|]\big)^2 = \Big(\int_\Omega |X_t|\, M_T\, dP\Big)^2 \le \Big(\int_\Omega |X_t|^2\, dP\Big)\Big(\int_\Omega M_T^2\, dP\Big) = E^P[X_t^2]\; E^P[M_T^2] < \infty,$$
since $E^P[X_t^2] < \infty$ and $E^P[M_T^2] = e^{\int_0^T u(s)^2\, ds} = e^{\|u\|^2_{L^2[0,T]}} < \infty$, see Exercise 10.1.9.
This can be written in terms of conditional expectation as E P [Xt MT |Fs ] = E P [Xs MT |Fs ]. (10.2.8) We shall prove this identity by showing that both terms are equal to Xs Ms . Since Xs is Fs -predictable and Mt is a martingale, the right side term becomes E P [Xs MT |Fs ] = Xs E P [MT |Fs ] = Xs Ms , ∀s ≤ T. Let s < t. Using the tower property (see Proposition 2.10.6, part 3), the left side term becomes E P [Xt MT |Fs ] = E P E P [Xt MT |Ft ]|Fs = E P Xt E P [MT |Ft ]|Fs = E P Xt Mt |Fs = Xs Ms , where we used that Mt and Xt Mt are martingales and Xt is Ft -measurable. Hence (10.2.8) holds and Xt is an Ft -martingale with respect to the probability measure Q. Lemma 10.2.2 Consider the process Z t Xt = u(s) ds + Wt , 0 0 ≤ t ≤ T, with u ∈ L2 [0, T ] a deterministic function, and let dQ = MT dP . Then E Q [Xt2 ] = t. 196 Proof: Denote U (t) = Rt 0 u(s) ds. Then E Q [Xt2 ] = E P [Xt2 MT ] = E P [U 2 (t)MT + 2U (t)Wt MT + Wt2 MT ] = U 2 (t)E P [MT ] + 2U (t)E P [Wt MT ] + E P [Wt2 MT ]. (10.2.9) From Exercise 10.1.9 (a) we have E P [MT ] = 1. In order to compute E P [Wt MT ] we use the tower property and the martingale property of Mt E P [Wt MT ] = E[E P [Wt MT |Ft ]] = E[Wt E P [MT |Ft ]] = E[Wt Mt ]. (10.2.10) Using the product rule d(Wt Mt ) = Mt dWt + Wt dMt + dWt dMt = Mt − u(t)Mt Wt dWt − u(t)Mt dt, where we used dMt = −u(t)Mt dWt . Integrating between 0 and t yields Z t Z t Ms − u(s)Ms Ws dWs − Wt Mt = u(s)Ms ds. 0 0 Taking the expectation and using the property of Ito integrals we have Z t Z t u(s) ds = −U (t). u(s)E[Ms ] ds = − E[Wt Mt ] = − (10.2.11) 0 0 Substituting into (10.2.10) yields E P [Wt MT ] = −U (t). (10.2.12) For computing E P [Wt2 MT ] we proceed in a similar way E P [Wt2 MT ] = E[E P [Wt2 MT |Ft ]] = E[Wt2 E P [MT |Ft ]] = E[Wt2 Mt ]. (10.2.13) Using the product rule yields d(Wt2 Mt ) = Mt d(Wt2 ) + Wt2 dMt + d(Wt2 )dMt = Mt (2Wt dWt + dt) − Wt2 u(t)Mt dWt −(2Wt dWt + dt) u(t)Mt dWt = Mt Wt 2 − u(t)Wt dWt + Mt − 2u(t)Wt Mt dt. 
Integrating between $0$ and $t$,
$$W_t^2 M_t = \int_0^t M_s W_s\big(2 - u(s) W_s\big)\, dW_s + \int_0^t \big(M_s - 2u(s)\, W_s M_s\big)\, ds,$$
and taking the expected value we get
$$E[W_t^2 M_t] = \int_0^t \big(E[M_s] - 2u(s)\, E[W_s M_s]\big)\, ds = \int_0^t \big(1 + 2u(s)\, U(s)\big)\, ds = t + U^2(t),$$
where we used (10.2.11). Substituting into (10.2.13) yields
$$E^P[W_t^2 M_T] = t + U^2(t). \qquad (10.2.14)$$
Substituting (10.2.12) and (10.2.14) into relation (10.2.9) yields
$$E^Q[X_t^2] = U^2(t) - 2U(t)^2 + t + U^2(t) = t, \qquad (10.2.15)$$
which ends the proof of the lemma.

Now we are prepared to prove one of the most important results of stochastic calculus.

Theorem 10.2.3 (Girsanov Theorem) Let $u \in L^2[0, T]$ be a deterministic function. Then the process
$$X_t = \int_0^t u(s)\, ds + W_t, \qquad 0 \le t \le T,$$
is a Brownian motion with respect to the probability measure $Q$ given by
$$dQ = e^{-\int_0^T u(s)\, dW_s - \frac{1}{2}\int_0^T u(s)^2\, ds}\, dP.$$

Proof: In order to prove that $X_t$ is a Brownian motion on the probability space $(\Omega, \mathcal{F}, Q)$ we shall apply Lévy's characterization theorem, see Theorem 3.1.5. Hence it suffices to show that $X_t$ is a Wiener process. Lemma 10.2.1 implies that the process $X_t$ satisfies the following properties:
1. $X_0 = 0$;
2. $X_t$ is continuous in $t$;
3. $X_t$ is a square integrable $\mathcal{F}_t$-martingale on the space $(\Omega, \mathcal{F}, Q)$.
The only property we still need to show is
4. $E^Q[(X_t - X_s)^2] = t - s$, $s < t$.
Using Lemma 10.2.2, the martingale property of $X_t$ under $Q$, and the additivity and the tower property of expectations, we have
$$E^Q[(X_t - X_s)^2] = E^Q[X_t^2] - 2E^Q[X_t X_s] + E^Q[X_s^2] = t - 2E^Q[X_t X_s] + s = t - 2E^Q\big[E^Q[X_t X_s\,|\,\mathcal{F}_s]\big] + s = t - 2E^Q\big[X_s\, E^Q[X_t\,|\,\mathcal{F}_s]\big] + s = t - 2E^Q[X_s^2] + s = t - 2s + s = t - s,$$
which proves relation 4.

Choosing $u(s) = \lambda$, a constant, we obtain the following consequence that will be useful later in finance applications.

Corollary 10.2.4 Let $W_t$ be a Brownian motion on the probability space $(\Omega, \mathcal{F}, P)$. Then the process
$$X_t = \lambda t + W_t, \qquad 0 \le t \le T,$$
is a Brownian motion on the probability space $(\Omega, \mathcal{F}, Q)$, where
$$dQ = e^{-\lambda W_T - \frac{1}{2}\lambda^2 T}\, dP.$$

Hints and Solutions

Chapter 2

2.9.1 Let $X \sim N(\mu, \sigma^2)$.
Then the distribution function of $Y = \alpha X + \beta$ is
\begin{align*}
F_Y(y) &= P(Y < y) = P(\alpha X + \beta < y) = P\Big(X < \frac{y - \beta}{\alpha}\Big) \\
&= \frac{1}{\sqrt{2\pi}\,\sigma} \int_{-\infty}^{\frac{y-\beta}{\alpha}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}\, dx
= \frac{1}{\sqrt{2\pi}\,\alpha\sigma} \int_{-\infty}^{y} e^{-\frac{(z-(\alpha\mu+\beta))^2}{2\alpha^2\sigma^2}}\, dz \\
&= \frac{1}{\sqrt{2\pi}\,\sigma'} \int_{-\infty}^{y} e^{-\frac{(z-\mu')^2}{2(\sigma')^2}}\, dz,
\end{align*}
with $\mu' = \alpha\mu + \beta$, $\sigma' = \alpha\sigma$.

2.9.5 (a) Making $t = n$ yields $E[Y^n] = E[e^{nX}] = e^{\mu n + n^2\sigma^2/2}$.
(b) Let $n = 1$ and $n = 2$ in (a) to get the first two moments and then use the formula of variance.

2.10.7 The tower property
\[
E\big[E[X\,|\,\mathcal{G}]\,\big|\,\mathcal{H}\big] = E[X\,|\,\mathcal{H}], \qquad \mathcal{H} \subset \mathcal{G},
\]
is equivalent with
\[
\int_A E[X\,|\,\mathcal{G}]\, dP = \int_A X\, dP, \qquad \forall A \in \mathcal{H}.
\]
Since $A \in \mathcal{G}$, the previous relation holds by the definition of $E[X\,|\,\mathcal{G}]$.

2.10.8 (a) Direct application of the definition.
(b) $P(A) = \int_A dP = \int_\Omega \chi_A(\omega)\, dP(\omega) = E[\chi_A]$.
(d) $E[\chi_A X] = E[\chi_A] E[X] = P(A) E[X]$.
(e) We have the sequence of equivalences
\[
E[X\,|\,\mathcal{G}] = E[X] \iff \int_A E[X]\, dP = \int_A X\, dP,\ \forall A \in \mathcal{G}
\iff E[X] P(A) = \int_A X\, dP \iff E[X] P(A) = E[\chi_A X],
\]
which follows from (d).

2.11.5 If $\mu = E[X]$ then
\[
E[(X - E[X])^2] = E[X^2 - 2\mu X + \mu^2] = E[X^2] - 2\mu^2 + \mu^2 = E[X^2] - E[X]^2 = Var[X].
\]

2.11.6 From 2.11.5 we have $Var(X) = 0 \iff X = E[X]$, i.e. $X$ is a constant.

2.11.7 The same proof as the one in Jensen's inequality.

2.11.8 It follows from Jensen's inequality or using properties of integrals.

2.11.13 (a) $m(t) = E[e^{tX}] = \sum_k e^{tk} e^{-\lambda} \frac{\lambda^k}{k!} = e^{\lambda(e^t - 1)}$;
(b) It follows from the first Chernoff bound.

2.11.16 Choose $f(x) = x^{2k+1}$ and $g(x) = x^{2n+1}$.

2.12.1 By direct computation we have
\begin{align*}
E[(X - Y)^2] &= E[X^2] + E[Y^2] - 2E[XY] \\
&= Var(X) + E[X]^2 + Var(Y) + E[Y]^2 - 2E[X]E[Y] + 2E[X]E[Y] - 2E[XY] \\
&= Var(X) + Var(Y) + \big(E[X] - E[Y]\big)^2 - 2\,Cov(X, Y).
\end{align*}

2.12.2 (a) Since
\[
E[(X - X_n)^2] \ge E[X - X_n]^2 = \big(E[X_n] - E[X]\big)^2 \ge 0,
\]
the Squeeze Theorem yields $\lim_{n\to\infty}\big(E[X_n] - E[X]\big) = 0$.
(b) Writing
\[
X_n^2 - X^2 = (X_n - X)^2 - 2X(X - X_n),
\]
and taking the expectation we get
\[
E[X_n^2] - E[X^2] = E[(X_n - X)^2] - 2E[X(X - X_n)].
\]
The right side tends to zero since $E[(X_n - X)^2] \to 0$ and
\begin{align*}
|E[X(X - X_n)]| &\le \int_\Omega |X(X - X_n)|\, dP
\le \Big(\int_\Omega X^2\, dP\Big)^{1/2} \Big(\int_\Omega (X - X_n)^2\, dP\Big)^{1/2} \\
&= \sqrt{E[X^2]\, E[(X - X_n)^2]} \to 0.
\end{align*}
(c) It follows from part (b).
(d) Apply Exercise 2.12.1.

2.12.3 Using Jensen's inequality we have
\[
E\Big[\big(E[X_n\,|\,\mathcal{H}] - E[X\,|\,\mathcal{H}]\big)^2\Big]
= E\Big[\big(E[X_n - X\,|\,\mathcal{H}]\big)^2\Big]
\le E\Big[E[(X_n - X)^2\,|\,\mathcal{H}]\Big] = E[(X_n - X)^2] \to 0,
\]
as $n \to \infty$.

2.14.6 The integrability of $X_t$ follows from
\[
E[|X_t|] = E\big[\,|E[X\,|\,\mathcal{F}_t]|\,\big] \le E\big[E[\,|X|\,\big|\,\mathcal{F}_t]\big] = E[|X|] < \infty.
\]
$X_t$ is $\mathcal{F}_t$-predictable by the definition of the conditional expectation. Using the tower property yields
\[
E[X_t\,|\,\mathcal{F}_s] = E\big[E[X\,|\,\mathcal{F}_t]\,\big|\,\mathcal{F}_s\big] = E[X\,|\,\mathcal{F}_s] = X_s, \qquad s < t.
\]

2.14.7 Since
\[
E[|Z_t|] = E[|aX_t + bY_t + c|] \le |a| E[|X_t|] + |b| E[|Y_t|] + |c| < \infty,
\]
then $Z_t$ is integrable. For $s < t$, using the martingale property of $X_t$ and $Y_t$ we have
\[
E[Z_t\,|\,\mathcal{F}_s] = a E[X_t\,|\,\mathcal{F}_s] + b E[Y_t\,|\,\mathcal{F}_s] + c = a X_s + b Y_s + c = Z_s.
\]

2.14.8 In general the answer is no. For instance, if $X_t = Y_t$, the process $X_t^2$ is not a martingale, since Jensen's inequality
\[
E[X_t^2\,|\,\mathcal{F}_s] \ge \big(E[X_t\,|\,\mathcal{F}_s]\big)^2 = X_s^2
\]
is not necessarily an identity. For instance $B_t^2$ is not a martingale, with $B_t$ the Brownian motion process.

2.14.9 It follows from the identity
\[
E[(X_t - X_s)(Y_t - Y_s)\,|\,\mathcal{F}_s] = E[X_t Y_t - X_s Y_s\,|\,\mathcal{F}_s].
\]

2.14.10 (a) Let $Y_n = S_n - E[S_n]$. We have
\[
Y_{n+k} = S_{n+k} - E[S_{n+k}] = Y_n + \sum_{j=1}^k X_{n+j} - \sum_{j=1}^k E[X_{n+j}].
\]
Using the properties of expectation we have
\begin{align*}
E[Y_{n+k}\,|\,\mathcal{F}_n] &= Y_n + \sum_{j=1}^k E[X_{n+j}\,|\,\mathcal{F}_n] - \sum_{j=1}^k E\big[E[X_{n+j}]\,\big|\,\mathcal{F}_n\big] \\
&= Y_n + \sum_{j=1}^k E[X_{n+j}] - \sum_{j=1}^k E[X_{n+j}] = Y_n.
\end{align*}
(b) Let $Z_n = S_n^2 - Var(S_n)$. The process $Z_n$ is an $\mathcal{F}_n$-martingale iff $E[Z_{n+k} - Z_n\,|\,\mathcal{F}_n] = 0$. Let $U = S_{n+k} - S_n$. Using the independence we have
\begin{align*}
Z_{n+k} - Z_n &= (S_{n+k}^2 - S_n^2) - \big(Var(S_{n+k}) - Var(S_n)\big) \\
&= (S_n + U)^2 - S_n^2 - \big(Var(S_{n+k}) - Var(S_n)\big) \\
&= U^2 + 2U S_n - Var(U),
\end{align*}
so
\[
E[Z_{n+k} - Z_n\,|\,\mathcal{F}_n] = E[U^2] + 2S_n E[U] - Var(U) = E[U^2] - \big(E[U^2] - E[U]^2\big) = 0,
\]
since $E[U] = 0$.

2.14.11 Let $\mathcal{F}_n = \sigma(X_k;\, k \le n)$. Using the independence,
\[
E[|P_n|] = E[|X_0|] \cdots E[|X_n|] < \infty,
\]
so $|P_n|$ is integrable. Taking out the predictable part we have
\begin{align*}
E[P_{n+k}\,|\,\mathcal{F}_n] &= E[P_n X_{n+1} \cdots X_{n+k}\,|\,\mathcal{F}_n] = P_n E[X_{n+1} \cdots X_{n+k}\,|\,\mathcal{F}_n] \\
&= P_n E[X_{n+1}] \cdots E[X_{n+k}] = P_n.
\end{align*}
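The product-martingale property of 2.14.11 can be illustrated numerically. The following is a minimal sketch, assuming i.i.d. factors $X_i$ uniform on $(0, 2)$ (so $E[X_i] = 1$); the martingale property forces the sample mean of $P_n = X_1 \cdots X_n$ to stay near $1$ for every $n$, even though the variance of $P_n$ grows with $n$.

```python
import random

random.seed(42)

def product_means(n_steps, n_paths):
    """Estimate E[P_n] for P_n = X_1 * ... * X_n, X_i ~ Uniform(0, 2)."""
    means = []
    for n in range(1, n_steps + 1):
        total = 0.0
        for _ in range(n_paths):
            p = 1.0
            for _ in range(n):
                p *= random.uniform(0.0, 2.0)  # E[X_i] = 1
            total += p
        means.append(total / n_paths)
    return means

means = product_means(n_steps=5, n_paths=200_000)
print(means)  # every entry should be close to 1
```

With $2 \times 10^5$ paths the standard error at $n = 5$ is about $0.004$, so deviations from $1$ beyond a few percent would signal an error in the martingale argument.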
2.14.12 (a) Since the random variable Y = θX is normally distributed with mean θµ and variance θ 2 σ 2 , then 1 2 2 E[eθX ] = eθµ+ 2 θ σ . Hence E[eθX ] = 1 iff θµ + 12 θ 2 σ 2 = 0 which has the nonzero solution θ = −2µ/σ 2 . (b) Since eθXi are independent, integrable and satisfy E[eθXi ] = 1, by Exercise 2.14.11 we get that the product Zn = eθSn = eθX1 · · · eθXn is a martingale. Chapter 2 3.1.4 Bt starts at 0 and is continuous in t. By Proposition 3.1.2 Bt is a martingale with E[Bt2 ] = t < ∞. Since Bt − Bs ∼ N (0, |t − s|), then E[(Bt − Bs )2 ] = |t − s|. 3.1.11 It is obvious that Xt = Wt+t0 − Wt0 satisfies X0 = 0 and that Xt is continuous in t. The increments are normal distributed Xt − Xs = Wt+t0 − Ws+t0 ∼ N (0, |t − s|). If 0 < t1 < · · · < tn , then 0 < t0 < t1 + t0 < · · · < tn + t0 . The increments Xtk+1 − Xtk = Wtk+1 +t0 − Wtk +t0 are obviously independent and stationary. 3.1.12 For any λ > 0, show that the process Xt = √1λ Wλt is a Brownian motion. This says that the Brownian motion is invariant by scaling. Let s < t. Then Xt − Xs = √1 (Wλt − Wλs ) ∼ √1 N 0, λ(t − s) = N (0, t − s). The other properties are obvious. λ λ 3.1.13 Apply Property 3.1.9. 3.1.14 Using the moment generating function, we get E[Wt3 ] = 0, E[Wt4 ] = 3t2 . 3.1.15 (a) Let s < t. Then i i h h E[(Wt2 − t)(Ws2 − s)] = E E[(Wt2 − t)(Ws2 − s)]|Fs = E (Ws2 − s)E[(Wt2 − t)]|Fs i i h h = E (Ws2 − s)2 = E Ws4 − 2sWs2 + s2 = E[Ws4 ] − 2sE[Ws2 ] + s2 = 3s2 − 2s2 + s2 = 2s2 . (b) Using part (a) we have 2s2 = E[(Wt2 − t)(Ws2 − s)] = E[Ws2 Wt2 ] − sE[Wt2 ] − tE[Ws2 ] + ts = E[Ws2 Wt2 ] − st. Therefore E[Ws2 Wt2 ] = ts + 2s2 . (c) Cov(Wt2 , Ws2 ) = E[Ws2 Wt2 ] − E[Ws2 ]E[Wt2 ] = ts + 2s2 − ts = 2s2 . 203 (d) Corr(Wt2 , Ws2 ) = 2s2 2ts = st , where we used V ar(Wt2 ) = E[Wt4 ] − E[Wt2 ]2 = 3t2 − t2 = 2t2 . 3.1.16 (a) The distribution function of Yt is given by F (x) = P (Yt ≤ x) = P (tW1/t ≤ x) = P (W1/t ≤ x/t) Z x/t p Z x/t 2 φ1/t (y) dy = = t/(2π)ety /2 dy 0 Z = 0 √ x/ t 0 1 2 √ e−u /2 du. 
2π (b) The probability density of Yt is obtained by differentiating F (x) d p(x) = F (x) = dx ′ Z √ x/ t 0 1 2 1 2 √ e−u /2 du = √ e−x /2 , 2π 2π and hence Yt ∼ N (0, t). (c) Using that Yt has independent increments we have Cov(Ys , Yt ) = E[Ys Yt ] − E[Ys ]E[Yt ] = E[Ys Yt ] i h = E Ys (Yt − Ys ) + Ys2 = E[Ys ]E[Yt − Ys ] + E[Ys2 ] = 0 + s = s. (d) Since Yt − Ys = (t − s)(W1/t − W0 ) − s(W1/s − W1/t ) E[Yt − Ys ] = (t − s)E[W1/t ] − sE[W1/s − W1/t ] = 0, and 1 1 1 V ar(Yt − Ys ) = E[(Yt − Ys )2 ] = (t − s)2 + s2 ( − ) t s t 2 (t − s) + s(t − s) = = t − s. t 3.1.17 (a) Applying the definition of expectation we have Z ∞ Z ∞ 1 − x2 1 − x2 e 2t dx = e 2t dx 2x √ |x| √ E[|Wt |] = 2πt 2πt 0 −∞ Z ∞ p y 1 e− 2t dy = 2t/π. = √ 2πt 0 (b) Since E[|Wt |2 ] = E[Wt2 ] = t, we have V ar(|Wt |) = E[|Wt |2 ] − E[|Wt |]2 = t − 2 2t = t(1 − ). π π 204 3.1.18 By the martingale property of Wt2 − t we have E[Wt2 |Fs ] = E[Wt2 − t|Fs ] + t = Ws2 + t − s. 3.1.19 (a) Expanding (Wt − Ws )3 = Wt3 − 3Wt2 Ws + 3Wt Ws2 − Ws3 and taking the expectation E[(Wt − Ws )3 |Fs ] = E[Wt3 |Fs ] − 3Ws E[Wt2 ] + 3Ws2 E[Wt |Fs ] − Ws3 = E[Wt3 |Fs ] − 3(t − s)Ws − Ws3 , so E[Wt3 |Fs ] = 3(t − s)Ws + Ws3 , since 3 ] = 0. E[(Wt − Ws )3 |Fs ] = E[(Wt − Ws )3 ] = E[Wt−s (b) Hint: Start from the expansion of (Wt − Ws )4 . 3.2.3 Using that eWt −Ws is stationary, we have 1 E[eWt −Ws ] = E[eWt−s ] = e 2 (t−s) . 3.2.4 (a) E[Xt |Fs ] = E[eWt |Fs ] = E[eWt −Ws eWs |Fs ] = eWs E[eWt −Ws |Fs ] = eWs E[eWt −Ws ] = eWs et/2 e−s/2 . (b) This can be written also as E[e−t/2 eWt |Fs ] = e−s/2 eWs , which shows that e−t/2 eWt is a martingale. (c) From the stationarity we have 1 2 (t−s) E[ecWt −cWs ] = E[ec(Wt −Ws ) ] = E[ecWt−s ] = e 2 c . Then for any s < t we have E[ecWt |Fs ] = E[ec(Wt −Ws ) ecWs |Fs ] = ecWs E[ec(Wt −Ws ) |Fs ] 1 2 (t−s) = ecWs E[ec(Wt −Ws ) ] = ecWs e 2 c 1 2 t Multiplying by e− 2 c yields the desired result. 1 2 = Ys e 2 c t . 
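The identity $E[e^{cW_t}] = e^{c^2 t/2}$ underlying 3.2.4 can be checked by simulation. A small sketch, assuming arbitrary sample values $c = 0.5$, $t = 1$, and using that $W_t \sim N(0, t)$:

```python
import math
import random

random.seed(7)

def mc_exp_moment(c, t, n_samples=200_000):
    """Monte Carlo estimate of E[exp(c * W_t)] with W_t ~ N(0, t)."""
    sigma = math.sqrt(t)
    total = 0.0
    for _ in range(n_samples):
        total += math.exp(c * random.gauss(0.0, sigma))
    return total / n_samples

c, t = 0.5, 1.0
estimate = mc_exp_moment(c, t)
exact = math.exp(c * c * t / 2)  # e^{c^2 t / 2}
print(estimate, exact)
```

Multiplying the simulated $e^{cW_t}$ by $e^{-c^2 t/2}$ would give a unit-mean quantity, which is exactly the martingale $e^{cW_t - c^2 t/2}$ of part (c).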
205 3.2.5 (a) Using Exercise 3.2.3 we have Cov(Xs , Xt ) = E[Xs Xt ] − E[Xs ]E[Xt ] = E[Xs Xt ] − et/2 es/2 = E[eWs +Wt ] − et/2 es/2 = E[eWt −Ws e2(Ws −W0 ) ] − et/2 es/2 = E[eWt −Ws ]E[e2(Ws −W0 ) ] − et/2 es/2 = e = e t+3s 2 − e(t+s)/2 . t−s 2 e2s − et/2 es/2 (b) Using Exercise 3.2.4 (b), we have h i h i E[Xs Xt ] = E E[Xs Xt |Fs ] = E Xs E[Xt |Fs ] h i h i = et/2 E Xs E[e−t/2 Xt |Fs ] = et/2 E Xs e−s/2 Xs ] = e(t−s)/2 E[Xs2 ] = e(t−s)/2 E[e2Ws ] = e(t−s)/2 e2s = e t+3s 2 , and continue like in part (a). 3.2.6 Using the definition of the expectation we have Z Z x2 1 2 2 e2x φt (x) dx = √ e2x e− 2t dx 2πt Z 1 2 1 − 1−4t x = √ e 2t dx = √ , 1 − 4t 2πt if Z 1 − 4t >p0. Otherwise, the integral is infinite. We used the standard integral 2 e−ax = π/a, a > 0. 2 E[e2Wt ] = 3.3.4 It follows from the fact that Zt is normally distributed. 3.3.5 Using the definition of covariance we have = E[Zs Zt ] − E[Zs ]E[Zt ] = E[Zs Zt ] Cov Zs , Zt Z t i hZ sZ t hZ s i Wv dv = E = E Wu du · Wu Wv dudv 0 0 0 0 Z sZ t Z sZ t = E[Wu Wv ] dudv = min{u, v} dudv 0 0 0 0 t s , s < t. − = s2 2 6 3.3.6 (a) Using Exercise 3.3.5 Cov(Zt , Zt − Zt−h ) = Cov(Zt , Zt ) − Cov(Zt , Zt−h ) t3 t t−h = − (t − h)2 ( − ) 3 2 6 1 2 = t h + o(h). 2 206 (b) Using Zt − Zt−h = Rt t−h Wu du = hWt + o(h), Cov(Zt , Wt ) = = 1 Cov(Zt , Zt − Zt−h ) h 1 1 1 2 t h + o(h) = t2 . h 2 2 3.3.7 Let s < u. Since Wt has independent increments, taking the expectation in eWs +Wt = eWt −Ws e2(Ws −W0 ) we obtain E[eWs +Wt ] = E[eWt −Ws ]E[e2(Ws −W0 ) ] = e = e 3.3.8 (a) E[Xt ] = Rt 0 E[eWs ] ds = u+s 2 Rt 0 es = e u+s 2 u−s 2 e2s emin{s,t} . E[es/2 ] ds = 2(et/2 − 1) (b) Since V ar(Xt ) = E[Xt2 ] − E[Xt ]2 , it suffices to compute E[Xt2 ]. 
Using Exercise 3.3.7 we have Z t hZ t i hZ tZ t i E[Xt2 ] = E eWu du = E eWt ds · eWs eWu dsdu 0 0 0 0 Z tZ t Z tZ t u+s e 2 emin{s,t} dsdu E[eWs +Wu ] dsdu = = 0 0 0 0 ZZ ZZ u+s u+s s e 2 eu duds e 2 e duds + = D2 ZD Z1 u+s 3 4 1 2t u e 2 e duds = = 2 , e − 2et/2 + 3 2 2 D2 where D1 {0 ≤ s < u ≤ t} and D2 {0 ≤ u < s ≤ t}. In the last identity we applied Fubini’s theorem. For the variance use the formula V ar(Xt ) = E[Xt2 ] − E[Xt ]2 . 3.3.9 (a) Splitting the integral at t and taking out the predictable part, we have 207 Z T Z t Wu du|Ft ] Wu du|Ft ] = E[ Wu du|Ft ] + E[ t 0 0 Z T Wu du|Ft ] Zt + E[ t Z T Zt + E[ (Wu − Wt + Wt ) du|Ft ] t Z T (Wu − Wt ) du|Ft ] + Wt (T − t) Zt + E[ t Z T (Wu − Wt ) du] + Wt (T − t) Zt + E[ t Z T E[Wu − Wt ] du + Wt (T − t) Zt + Z E[ZT |Ft ] = E[ = = = = = T t = Zt + Wt (T − t), since E[Wu − Wt ] = 0. (b) Let 0 < t < T . Using (a) we have E[ZT − T WT |Ft ] = E[ZT |Ft ] − T E[WT |Ft ] = Zt + Wt (T − t) − T Wt = Zt − tWt . 3.4.1 h Rt i RT E[VT |Ft ] = E e 0 Wu du+ t Wu du |Ft i h RT Rt = e 0 Wu du E e t Wu du |Ft i h RT Rt = e 0 Wu du E e t (Wu −Wt ) du+(T −t)Wt |Ft i h RT = Vt e(T −t)Wt E e t (Wu −Wt ) du |Ft i h RT = Vt e(T −t)Wt E e t (Wu −Wt ) du h R T −t i (T −t)Wt Wτ dτ 0 E e = Vt e 3 1 (T −t) 3 = Vt e(T −t)Wt e 2 . 208 3.6.1 F (x) = P (Yt ≤ x) = P (µt + Wt ≤ x) = P (Wt ≤ x − µt) Z x−µt 1 − u2 √ e 2t du; = 2πt 0 1 − (x−µt)2 2t e f (x) = F ′ (x) = √ . 2πt 3.7.2 Since P (Rt ≤ ρ) = Z ρ 0 1 − x2 xe 2t dx, t use the inequality 1− x2 x2 < e− 2t < 1 2t to get the desired result. 3.7.3 E[Rt ] = Z ∞ xpt (x) dx = Z ∞ 1 2 − x2 x e 2t dx t 0 √ Z ∞ 3 −1 −z dy = 2t z 2 e dz 0 Z 1 ∞ 1/2 − y = y e 2t 2t 0 √ √ 2πt 3 . = = 2t Γ 2 2 0 Since E[Rt2 ] = E[W1 (t)2 + W2 (t)2 ] = 2t, then V ar(Rt ) = 2t − 2πt π = 2t 1 − . 4 4 3.7.4 E[Xt ] = V ar(Xt ) = √ r 2πt π E[Xt ] = = → 0, t 2t 2t 1 2 π V ar(Rt ) = (1 − ) → 0, 2 t t 4 By Example 2.12.1 we get Xt → 0, t → ∞ in mean square. t → ∞; t → ∞. 
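The moments of the 2-dimensional Bessel process $R_t = \sqrt{W_1(t)^2 + W_2(t)^2}$ computed in 3.7.3, $E[R_t] = \sqrt{2\pi t}/2$ and $Var(R_t) = 2t(1 - \pi/4)$, can be verified by sampling. A hedged sketch at the arbitrary time $t = 1$:

```python
import math
import random

random.seed(1)

def bessel2d_moments(t, n_samples=200_000):
    """Sample R_t = sqrt(W1(t)^2 + W2(t)^2) and estimate (mean, variance)."""
    sigma = math.sqrt(t)
    vals = [math.hypot(random.gauss(0.0, sigma), random.gauss(0.0, sigma))
            for _ in range(n_samples)]
    mean = sum(vals) / n_samples
    var = sum((v - mean) ** 2 for v in vals) / n_samples
    return mean, var

t = 1.0
mean, var = bessel2d_moments(t)
print(mean, math.sqrt(2 * math.pi * t) / 2)  # E[R_t]
print(var, 2 * t * (1 - math.pi / 4))        # Var(R_t)
```

Since $R_t$ is Rayleigh distributed with scale $\sqrt{t}$, both estimates follow directly from the Rayleigh moments; the simulation is only a sanity check on the Gamma-function computation in 3.7.3.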
3.8.2
\begin{align*}
P(N_t - N_s = 1) &= \lambda(t-s)\, e^{-\lambda(t-s)} = \lambda(t-s)\big(1 - \lambda(t-s) + o(t-s)\big) \\
&= \lambda(t-s) + o(t-s).
\end{align*}
\begin{align*}
P(N_t - N_s \ge 2) &= 1 - P(N_t - N_s = 0) - P(N_t - N_s = 1) \\
&= 1 - e^{-\lambda(t-s)} - \lambda(t-s)\, e^{-\lambda(t-s)} \\
&= 1 - \big(1 - \lambda(t-s) + o(t-s)\big) - \lambda(t-s)\big(1 - \lambda(t-s) + o(t-s)\big) \\
&= O\big((t-s)^2\big) = o(t-s).
\end{align*}

3.8.6 Write first
\[
N_t^2 = N_t(N_t - N_s) + N_t N_s = (N_t - N_s)^2 + N_s(N_t - N_s) + (N_t - \lambda t) N_s + \lambda t N_s,
\]
then
\begin{align*}
E[N_t^2\,|\,\mathcal{F}_s] &= E[(N_t - N_s)^2\,|\,\mathcal{F}_s] + N_s E[N_t - N_s\,|\,\mathcal{F}_s] + E[N_t - \lambda t\,|\,\mathcal{F}_s]\, N_s + \lambda t N_s \\
&= E[(N_t - N_s)^2] + N_s E[N_t - N_s] + (N_s - \lambda s) N_s + \lambda t N_s \\
&= \lambda(t-s) + \lambda^2(t-s)^2 + \lambda(t-s) N_s + N_s^2 - \lambda s N_s + \lambda t N_s \\
&= \lambda(t-s) + \lambda^2(t-s)^2 + 2\lambda(t-s) N_s + N_s^2 \\
&= \lambda(t-s) + \big[N_s + \lambda(t-s)\big]^2.
\end{align*}
Hence $E[N_t^2\,|\,\mathcal{F}_s] \ne N_s^2$, and hence the process $N_t^2$ is not an $\mathcal{F}_t$-martingale.

3.8.7 (a)
\[
m_{N_t}(x) = E[e^{x N_t}] = \sum_{k \ge 0} e^{xk} P(N_t = k)
= \sum_{k \ge 0} e^{xk} e^{-\lambda t} \frac{\lambda^k t^k}{k!}
= e^{-\lambda t} e^{\lambda t e^x} = e^{\lambda t(e^x - 1)}.
\]
(b) $E[N_t^2] = m''_{N_t}(0) = \lambda^2 t^2 + \lambda t$. Similarly for the other relations.

3.8.8 $E[X_t] = E[e^{N_t}] = m_{N_t}(1) = e^{\lambda t(e - 1)}$.

3.8.9 (a) Since $e^{x M_t} = e^{x(N_t - \lambda t)} = e^{-\lambda t x} e^{x N_t}$, the moment generating function is
\[
m_{M_t}(x) = E[e^{x M_t}] = e^{-\lambda t x} E[e^{x N_t}] = e^{-\lambda t x} e^{\lambda t(e^x - 1)} = e^{\lambda t(e^x - x - 1)}.
\]
(b) For instance $E[M_t^3] = m'''_{M_t}(0) = \lambda t$. Since $M_t$ is a stationary process, $E[(M_t - M_s)^3] = \lambda(t - s)$.

3.8.10
\begin{align*}
Var[(M_t - M_s)^2] &= E[(M_t - M_s)^4] - E[(M_t - M_s)^2]^2 \\
&= \lambda(t-s) + 3\lambda^2(t-s)^2 - \lambda^2(t-s)^2 = \lambda(t-s) + 2\lambda^2(t-s)^2.
\end{align*}

3.11.3 (a) $E[U_t] = E\big[\int_0^t N_u\, du\big] = \int_0^t E[N_u]\, du = \int_0^t \lambda u\, du = \dfrac{\lambda t^2}{2}$.
(b)
\[
E\Big[\sum_{k=1}^{N_t} S_k\Big] = E[t N_t - U_t] = t\lambda t - \frac{\lambda t^2}{2} = \frac{\lambda t^2}{2}.
\]

3.11.5 The proof cannot go through because a product between a constant and a Poisson process is not a Poisson process.

3.11.6 Let $p_X(x)$ be the probability density of $X$. If $p(x, y)$ is the joint probability density of $X$ and $Y$, then $p_X(x) = \sum_y p(x, y)$. We have
\begin{align*}
\sum_{y \ge 0} E[X\,|\,Y = y]\, P(Y = y) &= \sum_{y \ge 0} \int x\, p_{X|Y=y}(x|y)\, P(Y = y)\, dx \\
&= \sum_{y \ge 0} \int x\, \frac{p(x, y)}{P(Y = y)}\, P(Y = y)\, dx = \int x \sum_{y \ge 0} p(x, y)\, dx \\
&= \int x\, p_X(x)\, dx = E[X].
\end{align*}
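The moment formula $E[N_t^2] = \lambda^2 t^2 + \lambda t$ obtained in 3.8.7 (b) above can be checked directly against the Poisson series $\sum_k k^2\, e^{-\lambda t}(\lambda t)^k / k!$, truncated once the tail is negligible. A small deterministic sketch (the values of $\lambda$ and $t$ are arbitrary choices):

```python
import math

def poisson_second_moment(lam, t, k_max=100):
    """Compute E[N_t^2] = sum_k k^2 * P(N_t = k) by truncating the series."""
    lt = lam * t
    term = math.exp(-lt)  # P(N_t = 0)
    total = 0.0
    for k in range(1, k_max + 1):
        term *= lt / k    # recursion: P(N_t = k) = P(N_t = k-1) * lt / k
        total += k * k * term
    return total

lam, t = 2.0, 3.0
series = poisson_second_moment(lam, t)
formula = (lam * t) ** 2 + lam * t  # lambda^2 t^2 + lambda t
print(series, formula)  # lam*t = 6, so both are approximately 42
```

Computing the probabilities by the recursion $P(N_t = k) = P(N_t = k-1)\,\lambda t / k$ avoids the overflow that raw powers and factorials would cause for large $k$.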
3.11.8 (a) Since Tk has an exponential distribution with parameter λ Z ∞ λ −σTk E[e ] = e−σx λe−λx dx = . λ+σ 0 (b) We have Ut = T2 + 2T3 + 3T4 + · · · + (n − 2)Tn−1 + (n − 1)Tn + (t − Sn )n = T2 + 2T3 + 3T4 + · · · + (n − 2)Tn−1 + (n − 1)Tn + nt − n(T1 + T2 + · · · + Tn ) = nt − [nT1 + (n − 1)T2 + · · · + Tn ]. (c) Using that the arrival times Sk , k = 1, 2, . . . n, have the same distribution as the order statistics U(k) corresponding to n independent random variables uniformly dis- 211 tributed on the interval (0, t), we get i h PNt E e−σUt Nt = n = E[e−σ(tNt − k=1 Sk ) |Nt = n] = e−nσt E[eσ −nσt Pn i=1 U(i) σU1 ] = e−nσt E[eσ Pn i=1 Ui ] σUn E[e ] · · · E[e ] Z t Z 1 t σxn σx1 −nσt 1 e e dxn dx1 · · · = e t 0 t 0 (1 − e−σt )n = · σ n tn = e (d) Using Exercise 3.11.6 we have h h i X E e−σUt ] = P (Nt = n)E e−σUt Nt = n n≥0 X e−λ tn λn (1 − e−σt )n = σ n tn n! n≥0 = eλ(1−e −σt )/σ−λ · 4.10.10 (a) dt dNt = dt(dMt + λdt) = dt dMt + λdt2 = 0 (b) dWt dNt = dWt (dMt + λdt) = dWt dMt + λdWt dt = 0 2 2 3.12.6 Use Doob’s pinequality for the submartingales Wt and |Wt |, and use that E[Wt ] = t and E[|Wt |] = 2t/π, see Exercise 3.1.17 (a). 3.12.7 Divide by t in the inequality from Exercise 3.12.6 part (b). 3.12.10 Let σ = n and τ = n + 1. Then E h sup n≤t≤n=1 N t t −λ 2 i 4λ(n + 1) · n2 ≤ The result follows by taking n → ∞ in the sequence of inequalities 0≤E h N t t −λ 2 i ≤E h sup n≤t≤n=1 N 2 i 4λ(n + 1) −λ ≤ · t n2 t Chapter 3 4.1.2 We have {ω; τ (ω) ≤ t} = and use that Ø, Ω ∈ Ft . Ω, if c ≤ t Ø, if c > t 212 4.1.3 First we note that {ω; τ (ω) < t} = [ 0<s<t {ω; |Ws (ω)| > K}. (10.2.16) This can be shown by double inclusion. Let As = {ω; |Ws (ω)| > K}. “ ⊂ ” Let ω ∈ {ω; τ (ω) < t}, so inf{s > 0; |Ws (ω)| > K} < t. Then exists τ < u < t such that |Wu (ω)| S > K, and hence ω ∈ Au . “ ⊃ ” Let ω ∈ 0<s<t {ω; |Ws (ω)| > K}. Then there is 0 < s < t such that |Ws (ω)| > K. This implies τ (ω) < s and since s < t it follows that τ (ω) < t. 
Since Wt is continuous, then (10.2.16) can be also written as [ {ω; τ (ω) < t} = {ω; |Wr (ω)| > K}, 0<r<t,r∈Q which implies {ω; τ (ω) < t} ∈ Ft since {ω; |Wr (ω)| > K} ∈ Ft , for 0 < r < t. Next we shall show that P (τ < ∞) = 1. [ P ({ω; τ (ω) < ∞}) = P {ω; |Ws (ω)| > K} > P ({ω; |Ws (ω)| > K}) 0<s = 1− Z |x|<K √ y2 2K 1 e− 2s dy > 1 − √ → 1, s → ∞. 2πs 2πs Hence τ is a stopping time. 4.1.4 Let Km = [a + 1 m, b − 1 m ]. We can write \ [ {ω; τ ≤ t} = {ω; Xr ∈ / Km } ∈ Ft , m≥1 r<t,r∈Q since {ω; Xr ∈ / Km } = {ω; Xr ∈ K m } ∈ Fr ⊂ Ft . 4.1.6 (a) No. The event A = {ω; Wt (ω) has a local maximum at time t} is not in σ{Ws ; s ≤ t} but is in σ{Ws ; s ≤ t + ǫ}. 4.1.8 (a) We have {ω; cτ ≤ t} = {ω; τ ≤ t/c} ∈ Ft/c ⊂ Ft . And P (cτ < ∞) = P (τ < ∞) = 1. (b) {ω; f (τ ) ≤ t} = {ω; τ ≤ f −1 (t)} = Ff −1 (t) ⊂ Ft , since f −1 (t) ≤ t. If f is bounded, then it is obvious that P (f (τ ) < ∞) = 1. If limt→∞ f (t) = ∞, then P (f (τ ) < ∞) = P τ < f −1 (∞) = P (τ < ∞) = 1. (c) Apply (b) with f (x) = ex . T 4.1.10 If let G(n) = {x; |x − a| < n1 }, then {a} = n≥1 G(n). Then τn = inf{t ≥ 0; Wt ∈ G(n)} are stopping times. Since supn τn = τ , then τ is a stopping time. 4.2.3 The relation is proved by verifying two cases: (i) If ω ∈ {ω; τ > t} then (τ ∧ t)(ω) = t and the relation becomes Mτ (ω) = Mt (ω) + Mτ (ω) − Mt (ω). 213 (ii) If ω ∈ {ω; τ ≤ t} then (τ ∧ t)(ω) = τ (ω) and the relation is equivalent with the obvious relation Mτ = Mτ . 4.2.5 Taking the expectation in E[Mτ |Fσ ] = Mσ yields E[Mτ ] = E[Mσ ], and then make σ = 0. 4.3.9 Since Mt = WT2 − t is a martingale with E[Mt ] = 0, by the Optional Stopping Theorem we get E[Mτa ] = E[M0 ] = 0, so E[Wτ2a − τa ] = 0, from where E[τa ] = E[Wτ2a ] = a2 , since Wτa = a. 4.3.10 (a) F (a) = P (Xt ≤ a) = 1 − P (Xt > a) = 1 − P ( max Ws > a) 0≤s≤t Z ∞ 2 −y 2 /2 dy = 1 − P (Ta ≤ t) = 1 − √ √ e 2π |a|/ t Z ∞ Z ∞ 2 2 −y 2 /2 −y 2 /2 dy = √ e dy − √ √ e 2π 0 2π |a|/ t Z |a|/√t 2 2 e−y /2 dy. 
= √ 2π 0 (b) The density function is p(a) = F ′ (a) = E[Xt ] = E[Xt2 ] = = V ar(Xt ) = 2 √ 2 e−a /(2t) , 2πt a > 0. Then r Z ∞ 2 2t −x2 /(2t) xp(x) dx = √ xe dy = π 2πt 0 0 Z ∞ Z ∞ 2 2 2 4t √ x2 e−x /(2t) dx = √ y 2 e−y dy π 0 2πt 0 Z 2 1 ∞ 1/2 −u 1 √ Γ(3/2) = t u e du = √ 2πt 2 0 2πt 2 . E[Xt2 ] − E[Xt ]2 = t 1 − π Z ∞ 4.3.16 It is recurrent since P (∃t > 0 : a < Wt < b) = 1. 4.4.4 Since 1 1 P (Wt > 0; t1 ≤ t ≤ t2 ) = P (Wt = 6 0; t1 ≤ t ≤ t2 ) = arcsin 2 π r t1 , t2 using the independence P (Wt1 > 0, Wt2 ) = P (Wt1 > 0)P (Wt2 1 > 0) = 2 arcsin π r t1 2 . t2 214 The probability for Wt = (Wt1 , Wt2 ) 4.5.2 (a) We have 4 to be in one of the quadrants is 2 arcsin π r t1 2 . t2 e2µβ − 1 = 1. β→∞ e2µβ − e−2µα P (Xt goes up to α) = P (Xt goes up to α before down to −∞) = lim 4.5.3 e2µβ − 1 . α→∞ e2µβ − e−2µα P (Xt never hits − β) = P (Xt goes up to ∞ before down to − β) = lim 4.5.4 (a) Use that E[XT ] = αpα − β(1 − pα ); (b) E[XT2 ] = α2 pα + β 2 (1 − pα ), with e2µβ −1 pα = e2µβ ; (c) Use V ar(T ) = E[T 2 ] − E[T ]2 . −e−2µα 4.5.7 Since Mt = Wt2 − t is a martingale, with E[Mt ] = 0, by the Optional Stopping Theorem we get E[WT2 − T ] = 0. Using WT = XT − µT yields E[XT2 − 2µT XT + µ2 T 2 ] = E[T ]. Then E[T 2 ] = E[T ](1 + 2µE[XT ]) − E[XT2 ] . µ2 Substitute E[XT ] and E[XT2 ] from Exercise 4.5.4 and E[T ] from Proposition 4.5.5. 4.6.11 See the proof of Proposition 4.6.3. 4.6.12 E[T ] = − 1 E[XT ] d E[e−sT ]|s=0 = (αpα + β(1 − pβ )) = . ds µ µ 4.6.13 (b) Applying the Optional Stopping Theorem E[ecMT −λT (e c −c−1) ] = E[X0 ] = 1 E[eca−λT f (c) ] = 1 E[e−λT f (c) ] = e−ac . Let s = f (c), so c = ϕ(s). Then E[e−λsT ] = e−aϕ(s) . (c) Differentiating and taking s = 0 yields −λE[T ] = −ae−aϕ(0) ϕ′ (0) 1 = −∞, = −a ′ f (0) so E[T ] = ∞. 215 (d) The inverse Laplace transform L−1 e−aϕ(s) cannot be represented by elementary functions. 4.8.9 Use E[(|Xt | − 0)2 ] = E[|Xt |2 ] = E[(Xt − 0)2 ]. ?? Let an = ln bn . Then Gn = eln Gn = e a1 +···+an n → eln L = L. 4.9.3 (a) L = 1. 
(b) A computation shows E[(Xt − 1)2 ] = E[Xt2 − 2Xt + 1] = E[Xt2 ] − 2E[Xt ] + 1 = V ar(Xt ) + (E[Xt ] − 1)2 . (c) Since E[Xt ] = 1, we have E[(Xt − 1)2 ] = V ar(Xt ). Since V ar(Xt ) = e−t E[eWt ] = e−t (e2t − et ) = et − 1, then E[(Xt − 1)2 ] does not tend to 0 as t → ∞. 4.10.7 (a) E[(dWt )2 − dt2 ] = E[(dWt )2 ] − dt2 = 0. (b) V ar((dWt )2 − dt) = E[(dWt2 − dt)2 ] = E[(dWt )4 − 2dtdWt + dt2 ] = 3st2 − 2dt · 0 + dt2 = 4dt2 . Chapter 4 5.2.3 (a) Use either the definition or the moment generation function to show that 4 ] = 3(t − s)2 . E[Wt4 ] = 3t2 . Using stationarity, E[(Wt − Ws )4 ] = E[Wt−s Z T dWt ] = E[WT ] = 0. 5.4.1 (a) E[ 0 Z T 1 1 Wt dWt ] = E[ WT2 − T ] = 0. (b) E[ 2 2 0Z Z T T 1 1 T2 1 . Wt dWt )2 ] = E[ WT2 + T 2 − T WT2 ] = Wt dWt ) = E[( (c) V ar( 4 4 2 2 0 0 RT 5.6.2 X ∼ N (0, 1 1t dt) = N (0, ln T ). 5.6.3 Y ∼ N (0, RT 1 tdt) = N 0, 12 (T 2 − 1) 5.6.4 Normally distributed with zero mean and variance Z t 0 1 e2(t−s) ds = (e2t − 1). 2 5.6.5 Using the property of Wiener integrals, both integrals have zero mean and vari7t3 ance . 3 5.6.6 The mean is zero and the variance is t/3 → 0 as t → 0. 216 5.6.7 Since it is a Wiener integral, Xt is normally distributed with zero mean and variance Z t bu 2 b2 a+ du = (a2 + + ab)t. t 3 0 Hence a2 + b2 3 + ab = 1. 5.6.8 Since both Wt and Rt 0 f (s) dWs have the mean equal to zero, Z t Z t Z t Z t dWs f (s) dWs ] = E[ f (s) dWs ] = E[Wt , f (s) dWs Cov Wt , 0 0 0 0 Z t Z t f (u) ds. = E[ f (u) ds] = 0 0 The general result is Z t Z t f (s) ds. f (s) dWs = Cov Wt , 0 0 Choosing f (u) = un yields the desired identity. 5.8.6 Apply the expectation to Nt X k=1 Nt Nt 2 X X f (Sk )f (Sj ). f 2 (Sk ) + 2 f (Sk ) = k6=j k=1 5.9.1 E V ar hZ Z T 0 T eks dNs eks dNs 0 i = = λ kT (e − 1) k λ 2kT (e − 1). 2k Chapter 5 6.2.1 Let Xt = Rt 0 eWu du. Then dGt = d( Xt tdXt − Xt dt teWt dt − Xt dt 1 Wt − G e )= = = t dt. 
t t2 t2 t 6.3.3 (a) eWt (1 + 12 Wt )dt + eWt (1 + Wt )dWt ; (b) (6Wt + 10e5Wt )dWt + (3 + 25e5Wt )dt; 2 2 (c) 2et+Wt (1 + Wt2 )dt + 2et+Wt Wt dWt ; 217 (d) n(t + Wt )n−2 (t + Wt + n−1 )dt + (t + W )dW t t ; 2 ! Z 1 t 1 Wt − Wu du dt; (e) t t 0 ! Z α t Wu 1 Wt (f ) α e − e du dt. t t 0 6.3.4 d(tWt2 ) = td(Wt2 ) + Wt2 dt = t(2Wt dWt + dt) + Wt2 dt = (t + Wt2 )dt + 2tWt dWt . 6.3.5 (a) tdWt + Wt dt; (b) et (Wt dt + dWt ); (c) (2 − t/2)t cos Wt dt − t2 sin Wt dWt ; (d) (sin t + Wt2 cos t)dt + 2 sin t Wt dWt ; 6.3.8 It follows from (6.3.10). 6.3.9 Take the conditional expectation in Z t Mu− dMu + Nt − Ns Mt2 = Ms2 + 2 s and obtain E[Mt2 |Fs ] = Ms2 Z t + 2E[ Mu− dMu |Fs ] + E[Nt |Fs ] − Ns s = Ms2 + E[Mt + λt|Fs ] − Ns = Ms2 + Ms + λt − Ns = Ms2 + λ(t − s). 6.3.10 Integrating in (6.3.11) yields Z Ft = Fs + t s ∂f dWt1 + ∂x Z s t ∂f dWt2 . ∂y One can check that E[Ft |Fs ] = Fs . 2Wt1 dWt1 + 2Wt2 dWt2 . (Wt1 )2 + (Wt2 )2 p ∂f x ∂f x2 + y 2 . Since 6.3.12 Consider the function f (x, y) = = p = , 2 2 ∂x ∂y x +y 1 y p , ∆f = p , we get 2 2 2 x +y 2 x + y2 6.3.11 (a) dFt = 2Wt1 dWt1 + 2Wt2 dWt2 + 2dt; (b) dFt = dRt = ∂f ∂f W1 W2 1 dWt1 + dWt2 + ∆f dt = t dWt1 + t dWt2 + dt. ∂x ∂y Rt Rt 2Rt 218 Chapter 6 7.2.1 (a) Use integration formula with g(x) = tan−1 (x). Z T Z T Z 1 T 1 2Wt −1 ′ −1 (tan ) (Wt ) dWt = tan WT + dWt = dt. 2 2 0 (1 + Wt )2 0 0 1 + Wt i hZ T 1 = 0. dW (b) Use E t 2 0 1 + Wt x (c) Use Calculus to find minima and maxima of the function ϕ(x) = . (1 + x2 )2 7.2.2 (a) Use integration by parts with g(x) = ex and get Z T Z 1 T Wt eWt dWt = eWT − 1 − e dt. 2 0 0 (b) Applying the expectation we obtain WT E[e 1 ] =1+ 2 Z T E[eWt ] dt. 0 If let φ(T ) = E[eWT ], then φ satisfies the integral equation Z 1 T φ(T ) = 1 + φ(t) dt. 2 0 Differentiating yields the ODE φ′ (T ) = 21 φ(T ), with φ(0) = 1. Solving yields φ(T ) = eT /2 . 7.2.3 (a) Apply integration by parts with g(x) = (x − 1)ex to get Z T Z T Z 1 T ′′ ′ Wt g (Wt ) dWt = g(WT ) − g(0) − Wt e dWt = g (Wt ) dt. 
2 0 0 0 (b) Applying the expectation yields WT E[WT e Z i 1 Th ] = E[e ] − 1 + E[eWt ] + E[Wt eWt ] dt 2 0 Z T 1 et/2 + E[Wt eWt ] dt. = eT /2 − 1 + 2 0 WT Then φ(T ) = E[WT eWT ] satisfies the ODE φ′ (T ) − φ(T ) = eT /2 with φ(0) = 0. 7.2.4 (a) Use integration by parts with g(x) = ln(1 + x2 ). (e) Since ln(1 + T ) ≤ T , the upper bound obtained in (e) is better than the one in (d), without contradicting it. 219 7.3.1 By straightforward computation. 7.3.2 By computation. √ 7.3.13 (a) √e2 sin( 2W1 ); (b) 12 e2T sin(2W3 ); (c) √ √1 (e 2W4 −4 2 − 1). 7.3.14 Apply Ito’s formula to get 1 dϕ(t, Wt ) = (∂t ϕ(t, Wt ) + ∂x2 ϕ(t, Wt ))dt + ∂x ϕ(t, Wt )dWt 2 = G(t)dt + f (t, Wt ) dWt . Integrating between a and b yields ϕ(t, Wt )|ba = Z 8.2.1 Integrating yields Xt = X0 + we get b G(t) dt + b f (t, Wt ) dWt . a a Rt Z Chapter 7 Rt 2s 0 (2Xs + e ) ds + 0 b dWs . E[Xt ] = X0 + Z t Taking the expectation (2E[Xs ] + e2s ) ds. 0 Differentiating we obtain f ′ (t) = 2f (t) + e2t , where f (t) = E[Xt ], with f (0) = X0 . Multiplying by the integrating factor e−2t yields (e−2t f (t))′ = 1. Integrating yields f (t) = e2t (t + X0 ). 8.2.4 (a) Using product rule and Ito’s formula, we get 1 d(Wt2 eWt ) = eWt (1 + 2Wt + Wt2 )dt + eWt (2Wt + Wt2 )dWt . 2 Integrating and taking expectations yields Z t 1 2 Wt E[eWs ] + 2E[Ws eWs ] + E[Ws2 eWs ] ds. E[Wt e ] = 2 0 Since E[eWs ] = et/2 , E[Ws eWs ] = tet/2 , if let f (t) = E[Wt2 eWt ], we get by differentiation 1 f ′ (t) = et/2 + 2tet/2 + f (t), 2 f (0) = 0. Multiplying by the integrating factor e−t/2 yields (f (t)e−t/2 )′ = 1 + 2t. Integrating yields the solution f (t) = t(1 + t)et/2 . (b) Similar method. 8.2.5 (a) Using Exercise 3.1.19 E[Wt4 − 3t2 |Ft ] = E[Wt4 |Ft ] − 3t2 = 3(t − s)2 + 6(t − s)Ws2 + Ws4 − 3t2 = (Ws4 − 3s2 ) + 6s2 − 6ts + 6(t − s)Ws2 6= Ws4 − 3s2 220 Hence Wt4 − 3t2 is not a martingale. (b) E[Wt3 |Fs ] = Ws3 + 3(t − s)Ws 6= Ws3 , and hence Wt3 is not a martingale. 8.2.6 (a) Similar method as in Example 8.2.4. 
(b) Applying Ito’s formula 1 d(cos(σWt )) = −σ sin(σWt )dWt − σ 2 cos(σWt )dt. 2 σ2 2 Let f (t) = E[cos(σWt )]. Then f ′ (t) = − σ2 f (t), f (0) = 1. The solution is f (t) = e− 2 t . (c) Since sin(t+σWt ) = sin t cos(σWt )+cos t sin(σWt ), taking the expectation and using (a) and (b) yields E[sin(t + σWt )] = sin tE[cos(σWt )] = e− σ2 t 2 sin t. (d) Similarly starting from cos(t + σWt ) = cos t cos(σWt ) − sin t sin(σWt ). 8.2.7 From Exercise 8.2.6 (b) we have E[cos(Wt )] = e−t/2 . From the definition of expectation Z ∞ 1 − x2 e 2t dx. cos x √ E[cos(Wt )] = 2πt −∞ Then choose t = 1/2 and t = 1 to get (a) and (b), respectively. 8.2.8 (a) Using a standard method involving Ito’s formula we can get E(Wt ebWt ) = 2 bteb t/2 . Let a = 1/(2t). We can write Z Z √ 1 − x2 2 2πt xebx √ xe−ax +bx dx = e 2t dx 2πt r √ √ π b b2 /(4a) bWt b2 t/2 e . 2πtE(Wt e ) = 2πtbte = = a 2a The same method for (b) and (c). 8.2.9 (a) We have E[cos(tWt )] = E X (−1)n n≥0 = X Wt2n t2n X E[Wt2n ]t2n = (−1)n (2n)! (2n)! n≥0 n≥0 −t3 /2 = e n≥0 3n t2n (2n)!tn X n t = (−1) (−1)n (2n)! 2n n! 2n n! . (b) Similar computation using E[Wt2n+1 ] = 0. 221 Rt Rt 8.3.2 (a) Xt = 1 + sin t − 0 sin s dWs , E[Xt ] = 1 + sin t, V ar[Xt ] = 0 (sin s)2 ds = t 1 2 − 4 sin(2t); Rt√ 2 (b) Xt = et − 1 + 0 s dWs , E[Xt ] = et − 1, V ar[Xt ] = t2 ; Rt 4 (c) Xt = 1 + 12 ln(1 + t2 ) + 0 s3/2 dWs , E[Xt ] = 1 + 12 ln(1 + t2 ), V ar(Xt ) = t4 . 8.3.4 −t/2 Xt = 2(1 − e )+ Z t 0 s e− 2 +Ws dWs = 1 − 2e−t/2 + e−t/2+Wt = 1 + et/2 (eWt − 2); Its distribution function is given by F (y) = P (Xt ≤ y) = P (1 + et/2 (eWt − 2) ≤ y) = P Wt ≤ ln(2 + (y − 1)et/2 ) Z ln(2+(y−1)et/2 ) x2 1 = √ e− 2t dx. 2πt −∞ E[Xt ] = 2(1 − e−t/2 ), V ar(Xt ) = V ar(e−t/2 eWt ) = e−t V ar(eWt ) = et − 1. 8.4.3 (a) Xt = 13 Wt3 − tWt + et ; (b) Xt = 31 Wt3 − tWt − cos t; t 3 2 (c) Xt = eWt − 2 + t3 ; (d) Xt = et/2 sin Wt + t2 + 1. 8.4.4 (a) Xt = 14 Wt4 + tWt ; (b) Xt = 21 Wt2 + t2 Wt − 2t ; (c) Xt = et Wt − cos Wt + 1; (d) Xt = teWt + 2. 
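The identity $E[\cos(\sigma W_t)] = e^{-\sigma^2 t/2}$ derived in 8.2.6 (b) above can be confirmed by numerical integration against the Gaussian density, since $W_t \sim N(0, t)$. A sketch using a simple midpoint rule (the values of $\sigma$ and $t$ are arbitrary choices):

```python
import math

def expected_cos(sigma, t, half_width=10.0, n=20_000):
    """Midpoint-rule approximation of E[cos(sigma * W_t)], W_t ~ N(0, t)."""
    a = half_width * math.sqrt(t)  # integrate over [-a, a]; Gaussian tails beyond are negligible
    h = 2 * a / n
    total = 0.0
    for i in range(n):
        x = -a + (i + 0.5) * h
        density = math.exp(-x * x / (2 * t)) / math.sqrt(2 * math.pi * t)
        total += math.cos(sigma * x) * density * h
    return total

sigma, t = 1.5, 2.0
numeric = expected_cos(sigma, t)
exact = math.exp(-sigma * sigma * t / 2)  # e^{-sigma^2 t / 2}
print(numeric, exact)
```

The same quadrature with $\sin$ in place of $\cos$ returns approximately $0$, matching $E[\sin(\sigma W_t)] = 0$ by symmetry, which is what makes the expansion of $\sin(t + \sigma W_t)$ in part (c) collapse to a single term.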
8.5.1 (a) dXt = (2Wt dWt + dt) + Wt dt + tdWt = d(Wt2 ) + d(tWt ) = d(tWt + Wt2 ) so Xt = tWt + Wt2 . (b) We have 1 1 Wt )dt + dWt t2 t 1 1 = 2tdt + d( Wt ) = d(t2 + Wt ), t t dXt = (2t − so Xt = t2 + 1t Wt − 1 − W1 . 1 (c) dXt = et/2 Wt dt + et/2 dWt = d(et/2 Wt ), so Xt = et/2 Wt . (d) We have 2 dXt = t(2Wt dWt + dt) − tdt + Wt2 dt 1 = td(Wt2 ) + Wt2 dt − d(t2 ) 2 2 t = d(tWt2 − ), 2 222 √ √ √ (e) dXt = dt + d( tWt ) = d(t + tWt ), so Xt = t + tWt − W1 . Z t 1 4t 4t e4(t−s) dWs ; 8.6.1 (a) Xt = X0 e + (1 − e ) + 2 4 0 2 (b) Xt = X0 e3t + (1 − e3t ) + e3t Wt ; 3 1 t t (c) Xt = e (X0 + 1 + Wt2 − ) − 1; 2 2 t 1 (d) Xt = X0 e4t − − (1 − e4t ) + e4t Wt ; 4 16 (e) Xt = X0 et/2 − 2t − 4 + 5et/2 − et cos Wt ; (f ) Xt = X0 e−t + e−t Wt . so Xt = tWt − t2 2. Rt 1 Rt α2 2 8.8.1 (a) The integrating factor is ρt = e− 0 α dWs + 2 0 α ds = e−αWt + 2 t , which transforms the equation in the exact form d(ρt Xt ) = 0. Then ρt Xt = X0 and hence Xt = X0 eαWt − α2 t 2 . 2 −αWt + α2 t (b) ρt = e 2 (1− α2 )t+αWt X0 e , d(ρt Xt ) = ρt Xt dt, dYt = Yt dt, Yt = Y0 et , ρt Xt = X0 et , Xt = . 8.8.2 σt dAt = dXt − σAt dt, E[At ] = 0, V ar(At ) = E[A2t ] = Chapter 8 9.2.2 A = 21 ∆ = 9.2.4 1 t2 Rt 2 0 E[Xs ] ds = X02 t . 1 Pn 2 2 k=1 ∂k . X1 (t) = x01 + W1 (t) Z t 0 X1 (s) dW2 (s) X2 (t) = x2 + 0 Z t W1 (s) dW2 (s). = x02 + x01 W2 (t) + 0 9.5.2 E[τ ] is maximum if b − x0 = x0 − a, i.e. when x0 = (a + b)/2. The maximum value is (b − a)2 /4. 9.6.1 Let τk = min(k, τ ) ր τ and k → ∞. Apply Dynkin’s formula for τk to show that E[τk ] ≤ 1 2 (R − |a|2 ), n and take k → ∞. 9.6.3 x0 and x2−n . 9.7.1 (a) We have a(t, x) = x, c(t, x) = x, ϕ(s) = xes−t and u(t, x) = x(eT −t − 1). RT t xes−t ds = 223 (b) a(t, x) = tx, c(t, x) = − ln x, ϕ(s) = xe(s 2 −(T − t)[ln x + T6 (T + t) − t3 ]. 2 −t2 )/2 and u(t, x) = − RT t ln xe(s 2 −t2 )/2 ds = 3 9.7.2 (a) u(t, x) = x(T − t) + 12 (T − t)2 . (b) u(t, x) = 32 ex e 2 (T −t) − 1 . (c) We have a(t, x) = µx, b(t, x) = σx, c(t, x) = x. 
The associated diffusion is dXs = µXs ds + σXs dWs , Xt = x, which is the geometric Brownian motion 1 Xs = xe(µ− 2 σ 2 )(s−t)+σ(W s −Wt ) s ≥ t. , The solution is hZ i T 1 2 u(t, x) = E xe(µ− 2 σ )(s−t)+σ(Ws −Wt ) ds t Z T 1 e(µ− 2 (s−t))(s−t) E[eσ(s−t) ] ds = x t Z T 1 2 e(µ− 2 (s−t))(s−t) eσ (s−t)/2 ds = x t Z T x µ(T −t) eµ(s−t) ds = = x e −1 . µ t Chapter 9 10.1.7 Apply Example 10.1.6 with u = 1. Rt Rt 10.1.8 Xt = 0 h(s) dWs ∼ N (0, 0 h2 (s) ds). Then eXt is log-normal with E[eXt ] = 1 1 e 2 V ar(Xt ) = e 2 Rt 0 h(s)2 ds . 10.1.9 (a) Using Exercise 10.1.8 we have E[Mt ] = E[e− − 12 = e Rt Rt 0 0 u(s) dWs − 12 u(s)2 e ds E[e− Rt 0 Rt 0 u(s)2 ds u(s) dWs ] 1 ] = e− 2 Rt 0 u(s)2 ds 1 e2 Rt 0 u(s)2 ds (b) Similar computation with (a). 10.1.10 (a) Applying the product and Ito’s formulas we get d(et/2 cos Wt ) = −e−t/2 sin Wt dWt . Integrating yields et/2 cos Wt = 1 − Z t e−s/2 sin Ws dWs , 0 which is an Ito integral, and hence a martingale; (b) Similarly. 10.1.12 Use that the function f (x1 , x2 ) = ex1 cos x2 satisfies ∆f = 0. = 1. 224 10.1.14 (a) f (x) = x2 ; (b) f (x) = x3 ; (c) f (x) = xn /(n(n − 1)); (d) f (x) = ecx ; (e) f (x) = sin(cx). Chapter 10 ?? (a) Since rt ∼ N (µ, s2 ), with µ = b + (r0 − b)e−at and s2 = Z P (rt < 0) = 0 −∞ (x−µ)2 1 1 √ e− 2s2 dx = √ 2πs 2π Z −µ/s e−v σ2 2a (1 2 /2 − e−2at ). Then dv = N (−µ/s), −∞ where by computation µ 1 − =− s σ (b) Since µ lim = t→∞ s then r 2a r0 + b(eat − 1) . −1 e2at √ √ 2a r0 + b(eat − 1) b 2a √ lim , = σ t→∞ σ e2at − 1 √ b 2a ). lim P (rt < 0) = lim N (−µ/s) = N (− t→∞ t→∞ σ It is worth noting that the previous probability is less than 0.5. (c) The rate of change is µ µ2 d d 1 P (rt < 0) = − √ e− 2s2 dt ds s 2π 2 µ ae2at [b(e2at − 1)(e−at − 1) − r0 ] 1 . = − √ e− 2s2 (e2at − 1)3/2 2π ?? By Ito’s formula 1 n− 1 d(rtn ) = na(b − rt )rtn−1 + n(n − 1)σ 2 rtn−1 dt + nσrt 2 dWt . 
2 Integrating between 0 and t and taking the expectation yields Z t n(n − 1) 2 σ µn−1 (s)] ds [nabµn−1 (s) − naµn (s) + µn (t) = r0n + 2 0 Differentiating yields µ′n (t) + naµn (t) = (nab + n(n − 1) 2 σ )µn−1 (t). 2 Multiplying by the integrating factor enat yields the exact equation [enat µn (t)]′ = enat (nab + n(n − 1) 2 σ )µn−1 (t). 2 225 Integrating yields the following recursive formula for moments Z t n(n − 1) 2 n −nat e−na(t−s)µn−1 (s) ds . σ ) µn (t) = r0 e + (nab + 2 0 ?? (a) The spot rate rt follows the process d(ln rt ) = θ(t)dt + σdWt . Integrating yields ln rt = ln r0 + Z t θ(u) du + σWt , 0 Rt which is normally distributed. (b) Then rt = r0 e tributed. (c) The mean and variance are Rt E[rt ] = r0 e 0 θ(u) du 1 2 e 2 σ t, V ar(rt ) = r02 e2 0 θ(u) du+σWt Rt 0 θ(u) du σ2 t e is log-normally dis2 (eσ t − 1). ?? Substitute ut = ln rt and obtain the linear equation dut + a(t)ut dt = θ(t)dt + σ(t)dWt . Rt The equation can be solved multiplying by the integrating factor e 0 a(s) ds . Chapter 11 ?? Apply the same method used in the Application ??. The price of the bond is: R T −t P (t, T ) = e−rt (T −t) E e−σ 0 Ns ds . ?? The price of an infinitely lived bond at time t is given by limT →∞ P (t, T ). Using 1 lim B(t, T ) = , and T →∞ a if b < σ 2 /(2a) +∞, 1, if b > σ 2 /(2a) lim A(t, T ) = 1 2 2 T →∞ σ σ if b = σ 2 /(2a), ( a + t)(b − 2a 2 ) − 4a3 , we can get the price of the bond in all three cases. Chapter 12 ?? Dividing the equations St = S0 e(µ− Su = S0 e(µ− yields St = Su e(µ− σ2 )t+σWt 2 σ2 )u+σWu 2 σ2 )(t−u)+σ(Wt −Wu ) 2 . 226 Taking the predictable part out and dropping the independent condition we obtain E[St |Fu ] = Su e(µ− = Su e(µ− σ2 )(t−u) 2 σ2 )(t−u) 2 2 (µ− σ2 )(t−u) = Su e = Su e(µ− σ2 )(t−u) 2 E[eσ(Wt −Wu ) |Fu ] E[eσ(Wt −Wu ) ] E[eσWt−u ] 1 e2σ 2 (t−u) = Su eµ(t−u) Similarly we obtain E[St2 |Fu ] = Su2 e2(µ− σ2 )(t−u) 2 = Su2 e2µ(t−u) eσ E[e2σ(Wt −Wu ) |Fu ] 2 (t−u) . 
Then
\begin{align*}
Var(S_t\,|\,\mathcal{F}_u) &= E[S_t^2\,|\,\mathcal{F}_u] - E[S_t\,|\,\mathcal{F}_u]^2
= S_u^2\,e^{2\mu(t-u)}\,e^{\sigma^2(t-u)} - S_u^2\,e^{2\mu(t-u)} \\
&= S_u^2\,e^{2\mu(t-u)}\big(e^{\sigma^2(t-u)} - 1\big).
\end{align*}
When $u = t$ we get $E[S_t\,|\,\mathcal{F}_t] = S_t$ and $Var(S_t\,|\,\mathcal{F}_t) = 0$.

?? By Ito's formula
\[
d(\ln S_t) = \frac{1}{S_t}\,dS_t - \frac{1}{2S_t^2}(dS_t)^2 = \Big(\mu - \frac{\sigma^2}{2}\Big)dt + \sigma\,dW_t,
\]
so $\ln S_t = \ln S_0 + (\mu - \frac{\sigma^2}{2})t + \sigma W_t$, and hence $\ln S_t$ is normally distributed with $E[\ln S_t] = \ln S_0 + (\mu - \frac{\sigma^2}{2})t$ and $Var(\ln S_t) = \sigma^2 t$.

?? (a) $d\big(\frac{1}{S_t}\big) = (\sigma^2 - \mu)\frac{1}{S_t}\,dt - \frac{\sigma}{S_t}\,dW_t$;
(b) $d(S_t^n) = n\big(\mu + \frac{n-1}{2}\sigma^2\big)S_t^n\,dt + n\sigma S_t^n\,dW_t$;
(c) $d\big((S_t - 1)^2\big) = d(S_t^2) - 2\,dS_t = 2S_t\,dS_t + (dS_t)^2 - 2\,dS_t = \big[(2\mu + \sigma^2)S_t^2 - 2\mu S_t\big]dt + 2\sigma S_t(S_t - 1)\,dW_t$.

?? (b) Since $S_t = S_0\,e^{(\mu - \frac{\sigma^2}{2})t + \sigma W_t}$, then $S_t^n = S_0^n\,e^{(n\mu - \frac{n\sigma^2}{2})t + n\sigma W_t}$, and hence
\[
E[S_t^n] = S_0^n\,e^{(n\mu - \frac{n\sigma^2}{2})t}\,E[e^{n\sigma W_t}]
= S_0^n\,e^{(n\mu - \frac{n\sigma^2}{2})t}\,e^{\frac{1}{2}n^2\sigma^2 t}
= S_0^n\,e^{\big(n\mu + \frac{n(n-1)}{2}\sigma^2\big)t}.
\]
(a) Let $n = 2$ in (b) and obtain $E[S_t^2] = S_0^2\,e^{(2\mu + \sigma^2)t}$.

?? (a) Let $X_t = S_t W_t$. The product rule yields
\begin{align*}
d(X_t) &= W_t\,dS_t + S_t\,dW_t + dS_t\,dW_t \\
&= W_t(\mu S_t\,dt + \sigma S_t\,dW_t) + S_t\,dW_t + (\mu S_t\,dt + \sigma S_t\,dW_t)\,dW_t \\
&= S_t(\mu W_t + \sigma)\,dt + (1 + \sigma W_t)S_t\,dW_t.
\end{align*}
Integrating,
\[
X_s = \int_0^s S_t(\mu W_t + \sigma)\,dt + \int_0^s (1 + \sigma W_t)S_t\,dW_t.
\]
Let $f(s) = E[X_s]$. Then
\[
f(s) = \int_0^s \big(\mu f(t) + \sigma S_0 e^{\mu t}\big)\,dt.
\]
Differentiating yields $f'(s) = \mu f(s) + \sigma S_0 e^{\mu s}$, $f(0) = 0$, with the solution $f(s) = \sigma S_0\,s\,e^{\mu s}$. Hence $E[W_t S_t] = \sigma S_0\,t\,e^{\mu t}$.
(b) $Cov(S_t, W_t) = E[S_t W_t] - E[S_t]E[W_t] = E[S_t W_t] = \sigma S_0\,t\,e^{\mu t}$. Then
\[
Corr(S_t, W_t) = \frac{Cov(S_t, W_t)}{\sigma_{S_t}\,\sigma_{W_t}}
= \frac{\sigma S_0\,t\,e^{\mu t}}{S_0\,e^{\mu t}\sqrt{e^{\sigma^2 t} - 1}\,\sqrt{t}}
= \sigma\sqrt{\frac{t}{e^{\sigma^2 t} - 1}} \to 0, \qquad t \to \infty.
\]

?? $d = S_d/S_0 = 0.5$, $u = S_u/S_0 = 1.5$, $\gamma = -6.5$, $p = 0.98$.

?? Let $T_a$ be the time $S_t$ reaches level $a$ for the first time. Then
\[
F(a) = P(S_t < a) = P(T_a > t)
= \ln\frac{a}{S_0}\int_t^\infty \frac{1}{\sqrt{2\pi\sigma^3\tau^3}}\,e^{-\frac{(\ln\frac{a}{S_0} - \alpha\sigma\tau)^2}{2\tau\sigma^3}}\,d\tau,
\]
where $\alpha = \mu - \frac{1}{2}\sigma^2$.

?? (a) $P(T_a < T) = \ln\frac{a}{S_0}\displaystyle\int_0^T \frac{1}{\sqrt{2\pi\sigma^3\tau^3}}\,e^{-(\ln\frac{a}{S_0} - \alpha\sigma\tau)^2/(2\tau\sigma^3)}\,d\tau$.
(b) $\displaystyle\int_{T_1}^{T_2} p(\tau)\,d\tau$.

?? $E[A_t] = S_0\big(1 + \frac{\mu t}{2} + O(t^2)\big)$, $Var(A_t) = \frac{1}{3}S_0^2\sigma^2 t + O(t^2)$.
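The correlation formula $Corr(S_t, W_t) = \sigma\sqrt{t/(e^{\sigma^2 t} - 1)}$ derived above is easy to test by simulation, since $W_t = \sqrt{t}\,Z$ and $S_t = S_0 e^{(\mu - \sigma^2/2)t + \sigma W_t}$ for $Z \sim N(0,1)$. A sketch with arbitrary illustrative parameters:

```python
import math
import random

def gbm_corr_mc(S0, mu, sigma, t, n=200000, seed=3):
    """Sample (S_t, W_t) pairs and return their sample correlation."""
    rng = random.Random(seed)
    ws, ss = [], []
    for _ in range(n):
        w = math.sqrt(t) * rng.gauss(0.0, 1.0)
        ws.append(w)
        ss.append(S0 * math.exp((mu - 0.5 * sigma ** 2) * t + sigma * w))
    mw, ms = sum(ws) / n, sum(ss) / n
    cov = sum((w - mw) * (s - ms) for w, s in zip(ws, ss)) / n
    vw = sum((w - mw) ** 2 for w in ws) / n
    vs = sum((s - ms) ** 2 for s in ss) / n
    return cov / math.sqrt(vw * vs)

S0, mu, sigma, t = 1.0, 0.05, 0.3, 1.0
formula = sigma * math.sqrt(t / (math.exp(sigma ** 2 * t) - 1))  # about 0.978 here
est = gbm_corr_mc(S0, mu, sigma, t)
print(formula, est)
```

Because the correlation is close to $1$ for small $\sigma^2 t$, the sample estimate converges quickly; the two printed values should match to roughly two decimal places.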
?? (a) Since $\ln G_t = \frac{1}{t}\int_0^t \ln S_u\,du$, using the product rule yields
\begin{align*}
d(\ln G_t) &= d\Big(\frac{1}{t}\Big)\int_0^t \ln S_u\,du + \frac{1}{t}\,d\Big(\int_0^t \ln S_u\,du\Big) \\
&= -\frac{1}{t^2}\Big(\int_0^t \ln S_u\,du\Big)dt + \frac{1}{t}\ln S_t\,dt
= \frac{1}{t}(\ln S_t - \ln G_t)\,dt.
\end{align*}
(b) Let $X_t = \ln G_t$. Then $G_t = e^{X_t}$ and Ito's formula yields
\[
dG_t = de^{X_t} = e^{X_t}\,dX_t + \frac{1}{2}e^{X_t}(dX_t)^2 = e^{X_t}\,dX_t
= G_t\,d(\ln G_t) = \frac{G_t}{t}(\ln S_t - \ln G_t)\,dt.
\]

?? Using the product rule we have
\begin{align*}
d\Big(\frac{H_t}{t}\Big) &= d\Big(\frac{1}{t}\Big)H_t + \frac{1}{t}\,dH_t
= -\frac{1}{t^2}H_t\,dt + \frac{1}{t^2}H_t\Big(1 - \frac{H_t}{S_t}\Big)dt \\
&= -\frac{1}{t^2}\,\frac{H_t^2}{S_t}\,dt,
\end{align*}
so, for $dt > 0$, $d\big(\frac{H_t}{t}\big) < 0$ and hence $\frac{H_t}{t}$ is decreasing. For the second part, try to apply l'Hospital's rule.

?? By continuity, the inequality (??) is preserved under the limit.

?? (a) Use that $\frac{1}{n}\sum S_{t_k}^\alpha = \frac{1}{t}\sum S_{t_k}^\alpha\,\frac{t}{n} \xrightarrow{P} \frac{1}{t}\int_0^t S_u^\alpha\,du$.
(b) Let $I_t = \int_0^t S_u^\alpha\,du$. Then
\[
d\Big(\frac{1}{t}I_t\Big) = d\Big(\frac{1}{t}\Big)I_t + \frac{1}{t}\,dI_t = \frac{1}{t}\Big(S_t^\alpha - \frac{I_t}{t}\Big)dt.
\]
Let $X_t = A_t^\alpha$. Then by Ito's formula we get
\begin{align*}
dX_t &= \alpha\Big(\frac{1}{t}I_t\Big)^{\alpha-1} d\Big(\frac{1}{t}I_t\Big)
+ \frac{1}{2}\alpha(\alpha-1)\Big(\frac{1}{t}I_t\Big)^{\alpha-2}\Big(d\Big(\frac{1}{t}I_t\Big)\Big)^2 \\
&= \alpha\Big(\frac{1}{t}I_t\Big)^{\alpha-1}\frac{1}{t}\Big(S_t^\alpha - \frac{I_t}{t}\Big)dt
= \frac{\alpha}{t}\big(S_t^\alpha (A_t^\alpha)^{1 - 1/\alpha} - A_t^\alpha\big)dt.
\end{align*}
(c) If $\alpha = 1$ we get the continuous arithmetic mean, $A_t^1 = \frac{1}{t}\int_0^t S_u\,du$. If $\alpha = -1$ we obtain the continuous harmonic mean, $A_t^{-1} = \dfrac{t}{\int_0^t \frac{1}{S_u}\,du}$.

?? (a) By the product rule we have
\begin{align*}
dX_t &= d\Big(\frac{1}{t}\Big)\int_0^t S_u\,dW_u + \frac{1}{t}\,d\Big(\int_0^t S_u\,dW_u\Big) \\
&= -\frac{1}{t^2}\Big(\int_0^t S_u\,dW_u\Big)dt + \frac{1}{t}S_t\,dW_t
= -\frac{1}{t}X_t\,dt + \frac{1}{t}S_t\,dW_t.
\end{align*}
Using the properties of Ito integrals, we have
\begin{align*}
E[X_t] &= \frac{1}{t}E\Big[\int_0^t S_u\,dW_u\Big] = 0, \\
Var(X_t) &= E[X_t^2] - E[X_t]^2 = E[X_t^2]
= \frac{1}{t^2}E\Big[\Big(\int_0^t S_u\,dW_u\Big)^2\Big] \\
&= \frac{1}{t^2}\int_0^t E[S_u^2]\,du
= \frac{1}{t^2}\int_0^t S_0^2\,e^{(2\mu + \sigma^2)u}\,du
= \frac{S_0^2}{t^2}\,\frac{e^{(2\mu + \sigma^2)t} - 1}{2\mu + \sigma^2}.
\end{align*}
(b) The stochastic differential equation of the stock price can be written in the form $\sigma S_t\,dW_t = dS_t - \mu S_t\,dt$. Integrating between $0$ and $t$ yields
\[
\sigma\int_0^t S_u\,dW_u = S_t - S_0 - \mu\int_0^t S_u\,du.
\]
Dividing by $t$ yields the desired relation.

?? Using independence, $E[S_t] = S_0\,e^{\mu t}\,E[(1+\rho)^{N_t}]$. Then use
\[
E[(1+\rho)^{N_t}] = \sum_{n \ge 0}(1+\rho)^n\,P(N_t = n)
= e^{-\lambda t}\sum_{n \ge 0}\frac{\big((1+\rho)\lambda t\big)^n}{n!} = e^{\lambda\rho t}.
\]

?? $E[\ln S_t] = \ln S_0 + \big(\mu - \lambda\rho - \frac{\sigma^2}{2} + \lambda\ln(\rho + 1)\big)t$.
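The variance formula for $X_t = \frac{1}{t}\int_0^t S_u\,dW_u$ obtained above can be checked by discretizing the Ito integral as $\sum_i S_{t_i}(W_{t_{i+1}} - W_{t_i})$. A numerical sketch (illustrative parameters only; the left-point sampling of $S$ is essential for the Ito sum):

```python
import math
import random

def avg_ito_integral_var_mc(S0, mu, sigma, t, n_paths=20000, n_steps=200, seed=4):
    """Estimate Var((1/t) * int_0^t S_u dW_u) for geometric Brownian motion S."""
    rng = random.Random(seed)
    dt = t / n_steps
    vals = []
    for _ in range(n_paths):
        S, acc = S0, 0.0
        for _ in range(n_steps):
            dW = math.sqrt(dt) * rng.gauss(0.0, 1.0)
            acc += S * dW                                        # left-point (Ito) sum
            S *= math.exp((mu - 0.5 * sigma ** 2) * dt + sigma * dW)  # exact GBM step
        vals.append(acc / t)
    m = sum(vals) / n_paths
    return sum((v - m) ** 2 for v in vals) / n_paths

S0, mu, sigma, t = 1.0, 0.05, 0.2, 1.0
exact = S0 ** 2 / t ** 2 * (math.exp((2 * mu + sigma ** 2) * t) - 1) / (2 * mu + sigma ** 2)
est = avg_ito_integral_var_mc(S0, mu, sigma, t)    # exact value is about 1.073 here
print(exact, est)
```

The empirical mean of the discretized $X_t$ is near zero and its variance matches the closed form up to discretization and sampling error.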
?? $E[S_t\,|\,\mathcal{F}_u] = S_u\,e^{(\mu - \lambda\rho - \frac{\sigma^2}{2})(t-u)}\,(1+\rho)^{N_{t-u}}$.

Chapter 13

?? (a) Substitute $\lim_{S_t\to\infty} N(d_1) = \lim_{S_t\to\infty} N(d_2) = 1$ and $\lim_{S_t\to\infty}\frac{K}{S_t} = 0$ in
\[
\frac{c(t)}{S_t} = N(d_1) - e^{-r(T-t)}\frac{K}{S_t}N(d_2).
\]
(b) Use $\lim_{S_t\to 0} N(d_1) = \lim_{S_t\to 0} N(d_2) = 0$.
(c) It comes from an analysis similar to the one used at (a).

?? Differentiating in $c(t) = S_t N(d_1) - Ke^{-r(T-t)}N(d_2)$ yields
\begin{align*}
\frac{dc(t)}{dS_t} &= N(d_1) + S_t\frac{dN(d_1)}{dS_t} - Ke^{-r(T-t)}\frac{dN(d_2)}{dS_t} \\
&= N(d_1) + S_t N'(d_1)\frac{\partial d_1}{\partial S_t} - Ke^{-r(T-t)}N'(d_2)\frac{\partial d_2}{\partial S_t} \\
&= N(d_1) + S_t\frac{1}{\sqrt{2\pi}}e^{-d_1^2/2}\frac{1}{S_t\,\sigma\sqrt{T-t}}
- Ke^{-r(T-t)}\frac{1}{\sqrt{2\pi}}e^{-d_2^2/2}\frac{1}{S_t\,\sigma\sqrt{T-t}} \\
&= N(d_1) + \frac{1}{\sqrt{2\pi}}\frac{e^{-d_2^2/2}}{\sigma\sqrt{T-t}}\frac{1}{S_t}
\Big[S_t\,e^{(d_2^2 - d_1^2)/2} - Ke^{-r(T-t)}\Big].
\end{align*}
It suffices to show $S_t\,e^{(d_2^2 - d_1^2)/2} - Ke^{-r(T-t)} = 0$. Since
\[
d_2^2 - d_1^2 = (d_2 - d_1)(d_2 + d_1)
= -\sigma\sqrt{T-t}\,\frac{2\ln(S_t/K) + 2r(T-t)}{\sigma\sqrt{T-t}}
= -2\big[\ln S_t - \ln K + r(T-t)\big],
\]
then we have
\[
e^{(d_2^2 - d_1^2)/2} = e^{-\ln S_t + \ln K - r(T-t)} = \frac{1}{S_t}\,e^{\ln K}\,e^{-r(T-t)} = \frac{1}{S_t}\,Ke^{-r(T-t)}.
\]
Therefore
\[
S_t\,e^{(d_2^2 - d_1^2)/2} - Ke^{-r(T-t)} = S_t\,\frac{1}{S_t}\,Ke^{-r(T-t)} - Ke^{-r(T-t)} = 0,
\]
which shows that $\frac{dc(t)}{dS_t} = N(d_1)$.

?? The payoff is
\[
f_T = \begin{cases} 1, & \text{if } \ln K_1 \le X_T \le \ln K_2\\ 0, & \text{otherwise,} \end{cases}
\]
where $X_T \sim N\big(\ln S_t + (\mu - \frac{\sigma^2}{2})(T-t),\ \sigma^2(T-t)\big)$. Then
\begin{align*}
\widehat{E}_t[f_T] &= E[f_T\,|\,\mathcal{F}_t, \mu = r] = \int_{-\infty}^\infty f_T(x)\,p(x)\,dx \\
&= \int_{\ln K_1}^{\ln K_2}\frac{1}{\sigma\sqrt{2\pi}\sqrt{T-t}}\,
e^{-\frac{[x - \ln S_t - (r - \frac{\sigma^2}{2})(T-t)]^2}{2\sigma^2(T-t)}}\,dx \\
&= \int_{-d_2(K_1)}^{-d_2(K_2)}\frac{1}{\sqrt{2\pi}}e^{-y^2/2}\,dy
= \int_{d_2(K_2)}^{d_2(K_1)}\frac{1}{\sqrt{2\pi}}e^{-y^2/2}\,dy
= N\big(d_2(K_1)\big) - N\big(d_2(K_2)\big),
\end{align*}
where
\[
d_2(K_1) = \frac{\ln S_t - \ln K_1 + (r - \frac{\sigma^2}{2})(T-t)}{\sigma\sqrt{T-t}}, \qquad
d_2(K_2) = \frac{\ln S_t - \ln K_2 + (r - \frac{\sigma^2}{2})(T-t)}{\sigma\sqrt{T-t}}.
\]
Hence the value of a box-contract at time $t$ is
\[
f_t = e^{-r(T-t)}\big[N\big(d_2(K_1)\big) - N\big(d_2(K_2)\big)\big].
\]

?? The payoff is
\[
f_T = \begin{cases} e^{X_T}, & \text{if } X_T > \ln K\\ 0, & \text{otherwise.} \end{cases}
\]
Then
\[
f_t = e^{-r(T-t)}\widehat{E}_t[f_T] = e^{-r(T-t)}\int_{\ln K}^\infty e^x\,p(x)\,dx.
\]
The computation is similar to that of the integral $I_2$ from Proposition ??.

?? (a) The payoff is
\[
f_T = \begin{cases} e^{nX_T}, & \text{if } X_T > \ln K\\ 0, & \text{otherwise.} \end{cases}
\]
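The computation above showing $\frac{dc(t)}{dS_t} = N(d_1)$ can be verified with a central finite difference on the Black-Scholes formula. A short sketch (the parameter values are arbitrary illustrations):

```python
import math

def N(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(S, K, r, sigma, tau):
    """Black-Scholes call price c = S N(d1) - K e^{-r tau} N(d2)."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return S * N(d1) - K * math.exp(-r * tau) * N(d2)

S, K, r, sigma, tau = 100.0, 95.0, 0.05, 0.2, 0.5
d1 = (math.log(S / K) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
delta_exact = N(d1)
h = 1e-4
fd_delta = (bs_call(S + h, K, r, sigma, tau) - bs_call(S - h, K, r, sigma, tau)) / (2 * h)
print(delta_exact, fd_delta)   # the two values should agree to many decimal places
```

The central difference has $O(h^2)$ error, so with $h = 10^{-4}$ the numerical delta matches $N(d_1)$ far beyond plotting accuracy.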
Then
\begin{align*}
\widehat{E}_t[f_T] &= \int_{\ln K}^\infty e^{nx}\,p(x)\,dx
= \frac{1}{\sigma\sqrt{2\pi(T-t)}}\int_{\ln K}^\infty e^{nx}\,
e^{-\frac{1}{2}\frac{[x - \ln S_t - (r - \sigma^2/2)(T-t)]^2}{\sigma^2(T-t)}}\,dx \\
&= \frac{1}{\sqrt{2\pi}}\int_{-d_2}^\infty e^{n\sigma\sqrt{T-t}\,y}\,e^{n\ln S_t}\,
e^{n(r - \frac{\sigma^2}{2})(T-t)}\,e^{-\frac{1}{2}y^2}\,dy \\
&= \frac{1}{\sqrt{2\pi}}\,S_t^n\,e^{n(r - \frac{\sigma^2}{2})(T-t)}
\int_{-d_2}^\infty e^{-\frac{1}{2}y^2 + n\sigma\sqrt{T-t}\,y}\,dy \\
&= \frac{1}{\sqrt{2\pi}}\,S_t^n\,e^{n(r - \frac{\sigma^2}{2})(T-t)}\,e^{\frac{1}{2}n^2\sigma^2(T-t)}
\int_{-d_2}^\infty e^{-\frac{1}{2}(y - n\sigma\sqrt{T-t})^2}\,dy \\
&= \frac{1}{\sqrt{2\pi}}\,S_t^n\,e^{(nr + n(n-1)\frac{\sigma^2}{2})(T-t)}
\int_{-d_2 - n\sigma\sqrt{T-t}}^\infty e^{-\frac{z^2}{2}}\,dz \\
&= S_t^n\,e^{(nr + n(n-1)\frac{\sigma^2}{2})(T-t)}\,N\big(d_2 + n\sigma\sqrt{T-t}\big).
\end{align*}
The value of the contract at time $t$ is
\[
f_t = e^{-r(T-t)}\widehat{E}_t[f_T] = S_t^n\,e^{(n-1)(r + \frac{n\sigma^2}{2})(T-t)}\,N\big(d_2 + n\sigma\sqrt{T-t}\big).
\]
(b) Using $g_t = S_t^n\,e^{(n-1)(r + \frac{n\sigma^2}{2})(T-t)}$, we get $f_t = g_t\,N\big(d_2 + n\sigma\sqrt{T-t}\big)$.
(c) In the particular case $n = 1$, we have $N(d_2 + \sigma\sqrt{T-t}) = N(d_1)$ and the value takes the form $f_t = S_t N(d_1)$.

?? Since $\ln S_T \sim N\big(\ln S_t + (\mu - \frac{\sigma^2}{2})(T-t),\ \sigma^2(T-t)\big)$, we have
\begin{align*}
\widehat{E}_t[f_T] &= E\big[(\ln S_T)^2\,\big|\,\mathcal{F}_t, \mu = r\big]
= Var\big(\ln S_T\,\big|\,\mathcal{F}_t, \mu = r\big) + E\big[\ln S_T\,\big|\,\mathcal{F}_t, \mu = r\big]^2 \\
&= \sigma^2(T-t) + \Big[\ln S_t + \Big(r - \frac{\sigma^2}{2}\Big)(T-t)\Big]^2,
\end{align*}
so
\[
f_t = e^{-r(T-t)}\Big\{\sigma^2(T-t) + \Big[\ln S_t + \Big(r - \frac{\sigma^2}{2}\Big)(T-t)\Big]^2\Big\}.
\]

?? $\ln S_T^n$ is normally distributed with
\[
\ln S_T^n = n\ln S_T \sim N\Big(n\ln S_t + n\Big(\mu - \frac{\sigma^2}{2}\Big)(T-t),\ n^2\sigma^2(T-t)\Big).
\]
Redo the computation of Proposition ?? in this case.

?? Let $n = 2$. Then
\begin{align*}
f_t &= e^{-r(T-t)}\widehat{E}_t[(S_T - K)^2]
= e^{-r(T-t)}\widehat{E}_t[S_T^2] + e^{-r(T-t)}K^2 - 2K e^{-r(T-t)}\widehat{E}_t[S_T] \\
&= S_t^2\,e^{(r + \sigma^2)(T-t)} + e^{-r(T-t)}K^2 - 2KS_t.
\end{align*}
Let $n = 3$. Then
\begin{align*}
f_t &= e^{-r(T-t)}\widehat{E}_t[(S_T - K)^3]
= e^{-r(T-t)}\widehat{E}_t[S_T^3 - 3KS_T^2 + 3K^2 S_T - K^3] \\
&= e^{-r(T-t)}\widehat{E}_t[S_T^3] - 3Ke^{-r(T-t)}\widehat{E}_t[S_T^2]
+ e^{-r(T-t)}\big(3K^2\widehat{E}_t[S_T] - K^3\big) \\
&= e^{2(r + 3\sigma^2/2)(T-t)}S_t^3 - 3Ke^{(r + \sigma^2)(T-t)}S_t^2 + 3K^2 S_t - e^{-r(T-t)}K^3.
\end{align*}

?? Choose $c_n = 1/n!$ in the general contract formula.

?? Similar computation to the one done for the call option.

?? Write the payoff as a difference of two puts, one with strike price $K_1$ and the other with strike price $K_2$. Then apply the superposition principle.

?? Write the payoff as $f_T = c_1 + c_3 - 2c_2$, where $c_i$ is a call with strike price $K_i$. Apply the superposition principle.
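The $n = 2$ valuation above, $f_t = S_t^2 e^{(r+\sigma^2)(T-t)} + e^{-r(T-t)}K^2 - 2KS_t$, can be checked by risk-neutral Monte Carlo: sample $S_T = S_t e^{(r - \sigma^2/2)(T-t) + \sigma\sqrt{T-t}\,Z}$ and discount the payoff $(S_T - K)^2$. A sketch with arbitrary illustrative parameters:

```python
import math
import random

def squared_contract_mc(S, K, r, sigma, tau, n=200000, seed=5):
    """Risk-neutral Monte Carlo price of the payoff (S_T - K)^2."""
    rng = random.Random(seed)
    drift = (r - 0.5 * sigma ** 2) * tau
    vol = sigma * math.sqrt(tau)
    total = 0.0
    for _ in range(n):
        ST = S * math.exp(drift + vol * rng.gauss(0.0, 1.0))
        total += (ST - K) ** 2
    return math.exp(-r * tau) * total / n

S, K, r, sigma, tau = 100.0, 100.0, 0.05, 0.2, 1.0
exact = S ** 2 * math.exp((r + sigma ** 2) * tau) + math.exp(-r * tau) * K ** 2 - 2 * K * S
est = squared_contract_mc(S, K, r, sigma, tau)   # closed form is about 454 here
print(exact, est)
```

Since the payoff has a heavy right tail, the Monte Carlo estimate carries a few units of sampling noise at this path count, but it brackets the closed-form value.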
?? (a) By inspection. (b) Write the payoff as the sum of a call and a put, both with strike price $K$, and then apply the superposition principle.

?? (a) By inspection. (b) Write the payoff as the sum of a call with strike price $K_2$ and a put with strike price $K_1$, and then apply the superposition principle. (c) The strangle is cheaper.

?? The payoff can be written as the sum of the payoffs of a $(K_3, K_4)$-bull spread and a $(K_1, K_2)$-bear spread. Apply the superposition principle.

?? By computation.

?? The risk-neutral valuation yields
\begin{align*}
f_t &= e^{-r(T-t)}\widehat{E}_t[S_T - A_T]
= e^{-r(T-t)}\widehat{E}_t[S_T] - e^{-r(T-t)}\widehat{E}_t[A_T] \\
&= S_t - e^{-r(T-t)}\frac{t}{T}A_t - \frac{S_t}{rT}\big(1 - e^{-r(T-t)}\big) \\
&= S_t\Big(1 - \frac{1}{rT} + \frac{1}{rT}e^{-r(T-t)}\Big) - \frac{t}{T}e^{-r(T-t)}A_t.
\end{align*}

?? We have
\[
\lim_{t\to 0} f_t = S_0\,e^{-rT + (r - \frac{\sigma^2}{2})\frac{T}{2} + \frac{\sigma^2 T}{6}} - e^{-rT}K
= S_0\,e^{-\frac{1}{2}(r + \frac{\sigma^2}{6})T} - e^{-rT}K.
\]

?? Since $G_T < A_T$, then $\widehat{E}_t[G_T] < \widehat{E}_t[A_T]$, and hence
\[
e^{-r(T-t)}\widehat{E}_t[G_T - K] < e^{-r(T-t)}\widehat{E}_t[A_T - K],
\]
so the forward contract on the geometric average is cheaper.

?? Dividing
\[
G_T = S_0\,e^{(\mu - \frac{\sigma^2}{2})\frac{T}{2}}\,e^{\frac{\sigma}{T}\int_0^T W_u\,du}, \qquad
G_t = S_0\,e^{(\mu - \frac{\sigma^2}{2})\frac{t}{2}}\,e^{\frac{\sigma}{t}\int_0^t W_u\,du}
\]
yields
\[
\frac{G_T}{G_t} = e^{(\mu - \frac{\sigma^2}{2})\frac{T-t}{2}}\,
e^{\frac{\sigma}{T}\int_0^T W_u\,du - \frac{\sigma}{t}\int_0^t W_u\,du}.
\]
An algebraic manipulation leads to
\[
G_T = G_t\,e^{(\mu - \frac{\sigma^2}{2})\frac{T-t}{2}}\,
e^{\sigma\big(\frac{1}{T} - \frac{1}{t}\big)\int_0^t W_u\,du}\,
e^{\frac{\sigma}{T}\int_t^T W_u\,du}.
\]

?? Use that
\[
P\big(\max_{t \le T} S_t \ge z\big)
= P\Big(\sup_{t \le T}\Big[\Big(\mu - \frac{\sigma^2}{2}\Big)t + \sigma W_t\Big] \ge \ln\frac{z}{S_0}\Big)
= P\Big(\sup_{t \le T}\Big[\frac{1}{\sigma}\Big(\mu - \frac{\sigma^2}{2}\Big)t + W_t\Big] \ge \frac{1}{\sigma}\ln\frac{z}{S_0}\Big).
\]

Chapter 15

?? $P = F - \Delta_F S = c - N(d_1)S = -Ke^{-r(T-t)}N(d_2)$.

Chapter 17

??
\begin{align*}
P(S_t < b) &= P\Big(\Big(r - \frac{\sigma^2}{2}\Big)t + \sigma W_t < \ln\frac{b}{S_0}\Big)
= P\Big(\sigma W_t < \ln\frac{b}{S_0} - \Big(r - \frac{\sigma^2}{2}\Big)t\Big) \\
&= P(\sigma W_t < -\lambda) = P(\sigma W_t > \lambda)
\le e^{-\frac{\lambda^2}{2\sigma^2 t}}
= e^{-\frac{1}{2\sigma^2 t}\big[\ln(S_0/b) + (r - \sigma^2/2)t\big]^2},
\end{align*}
where $\lambda = \ln(S_0/b) + (r - \frac{\sigma^2}{2})t$ and we used Proposition 2.11.12 with $\mu = 0$.
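The exponential bound $P(\sigma W_t > \lambda) \le e^{-\lambda^2/(2\sigma^2 t)}$ used above reduces, with $x = \lambda/(\sigma\sqrt{t})$, to the standard Gaussian tail estimate $P(Z > x) \le e^{-x^2/2}$ for $x \ge 0$, which can be checked numerically. A brief sketch:

```python
import math

def normal_tail(x):
    """Exact tail P(Z > x) for Z ~ N(0,1), via the complementary error function."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def exp_bound(x):
    """The bound e^{-x^2/2} appearing in the barrier estimate."""
    return math.exp(-x ** 2 / 2.0)

# the bound dominates the exact tail for every x >= 0
for x in [0.0, 0.5, 1.0, 2.0, 3.0, 5.0]:
    assert normal_tail(x) <= exp_bound(x)
    print(x, normal_tail(x), exp_bound(x))
```

The gap between the two grows with $x$ (the exact tail decays like $e^{-x^2/2}/x$), so the bound is crude but sufficient for the limiting argument.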