ANDREW TULLOCH

ADVANCED PROBABILITY

TRINITY COLLEGE
THE UNIVERSITY OF CAMBRIDGE

Contents

1 Conditional Expectation
  1.1 Discrete Case
  1.2 Existence and Uniqueness
  1.3 Conditional Jensen's Inequalities
  1.4 Product Measures and Fubini's Theorem
  1.5 Examples of Conditional Expectation
  1.6 Notation for Example Sheet 1
2 Discrete Time Martingales
  2.1 Optional Stopping
  2.2 Hitting Probabilities for a Simple Symmetric Random Walk
  2.3 Martingale Convergence Theorem
  2.4 Uniform Integrability
  2.5 Backwards Martingales
  2.6 Applications of Martingales
    2.6.1 Martingale proof of the Radon-Nikodym theorem
3 Stochastic Processes in Continuous Time
4 Bibliography

1 Conditional Expectation

Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space: $\Omega$ is a set, $\mathcal{F}$ is a σ-algebra on $\Omega$, and $\mathbb{P}$ is a probability measure on $(\Omega, \mathcal{F})$.

Definition 1.1. $\mathcal{F}$ is a σ-algebra on $\Omega$ if it satisfies
• $\emptyset, \Omega \in \mathcal{F}$,
• $A \in \mathcal{F} \Rightarrow A^c \in \mathcal{F}$,
• if $(A_n)_{n \geq 0}$ is a collection of sets in $\mathcal{F}$, then $\bigcup_n A_n \in \mathcal{F}$.

Definition 1.2. $\mathbb{P}$ is a probability measure on $(\Omega, \mathcal{F})$ if
• $\mathbb{P} : \mathcal{F} \to [0, 1]$ is a set function,
• $\mathbb{P}(\emptyset) = 0$, $\mathbb{P}(\Omega) = 1$,
• if $(A_n)_{n \geq 0}$ is a collection of pairwise disjoint sets in $\mathcal{F}$, then $\mathbb{P}(\bigcup_n A_n) = \sum_n \mathbb{P}(A_n)$.

Definition 1.3. The Borel σ-algebra $\mathcal{B}(\mathbb{R})$ is the σ-algebra generated by the open sets of $\mathbb{R}$. Writing $\mathcal{O}$ for the collection of open subsets of $\mathbb{R}$,
\[ \mathcal{B}(\mathbb{R}) = \bigcap \{\xi : \xi \text{ is a σ-algebra containing } \mathcal{O}\}. \tag{1.1} \]

Definition 1.4. Let $\mathcal{A}$ be a collection of subsets of $\Omega$. Then we write
\[ \sigma(\mathcal{A}) = \bigcap \{\xi : \xi \text{ is a σ-algebra containing } \mathcal{A}\}. \]

Definition 1.5. $X$ is a random variable on $(\Omega, \mathcal{F})$ if $X : \Omega \to \mathbb{R}$ is a function with the property that $X^{-1}(V) \in \mathcal{F}$ for all open sets $V$ in $\mathbb{R}$.

Exercise 1.6. If $X$ is a random variable, then $\{B \subseteq \mathbb{R} : X^{-1}(B) \in \mathcal{F}\}$ is a σ-algebra and contains $\mathcal{B}(\mathbb{R})$.

If $(X_i, i \in I)$ is a collection of random variables, then we write
\[ \sigma(X_i, i \in I) = \sigma\big(\{\omega \in \Omega : X_i(\omega) \in B\},\ i \in I,\ B \in \mathcal{B}(\mathbb{R})\big), \]
and it is the smallest σ-algebra that makes all the $X_i$ measurable.

Definition 1.7.
First we define the expectation for positive simple random variables:
\[ \mathbb{E}\Big(\sum_{i=1}^n c_i I(A_i)\Big) = \sum_{i=1}^n c_i \mathbb{P}(A_i), \tag{1.2} \]
with $c_i$ positive constants and $A_i \in \mathcal{F}$. We extend this to any positive random variable $X \geq 0$ by approximating $X$ as an increasing limit of positive simple random variables. For a general $X$, we write $X = X^+ - X^-$ with $X^+ = \max(X, 0)$, $X^- = \max(-X, 0)$. If at least one of $\mathbb{E}(X^+)$ or $\mathbb{E}(X^-)$ is finite, then we define $\mathbb{E}(X) = \mathbb{E}(X^+) - \mathbb{E}(X^-)$. We call $X$ integrable if $\mathbb{E}(|X|) < \infty$.

Definition 1.8. Let $A, B \in \mathcal{F}$ with $\mathbb{P}(B) > 0$. Then
\[ \mathbb{P}(A \mid B) = \frac{\mathbb{P}(A \cap B)}{\mathbb{P}(B)}, \qquad \mathbb{E}[X \mid B] = \frac{\mathbb{E}(X I(B))}{\mathbb{P}(B)}. \]

Goal: we want to define $\mathbb{E}(X \mid \mathcal{G})$ as a random variable that is measurable with respect to the σ-algebra $\mathcal{G}$.

1.1 Discrete Case

Suppose $\mathcal{G}$ is countably generated: $(B_i)_{i \in \mathbb{N}}$ is a collection of pairwise disjoint sets in $\mathcal{F}$ with $\bigcup_i B_i = \Omega$, and $\mathcal{G} = \sigma(B_i, i \in \mathbb{N})$. It is easy to check that $\mathcal{G} = \{\bigcup_{j \in J} B_j : J \subseteq \mathbb{N}\}$. Let $X$ be an integrable random variable and set
\[ X' = \mathbb{E}(X \mid \mathcal{G}) = \sum_{i \in \mathbb{N}} \mathbb{E}(X \mid B_i)\, I(B_i). \]
Then:
(i) $X'$ is $\mathcal{G}$-measurable (check).
(ii) $\mathbb{E}(|X'|) \leq \mathbb{E}(|X|)$, (1.3) and so $X'$ is integrable.
(iii) For all $G \in \mathcal{G}$, $\mathbb{E}(X I(G)) = \mathbb{E}(X' I(G))$ (1.4) (check).

1.2 Existence and Uniqueness

Definition 1.9. For $A \in \mathcal{F}$, $A$ happens almost surely (a.s.) if $\mathbb{P}(A) = 1$.

Theorem 1.10 (Monotone Convergence Theorem). If $X_n \geq 0$ is a sequence of random variables and $X_n \uparrow X$ a.s. as $n \to \infty$, then
\[ \mathbb{E}(X_n) \uparrow \mathbb{E}(X) \tag{1.5} \]
as $n \to \infty$.

Theorem 1.11 (Dominated Convergence Theorem). If $(X_n)$ is a sequence of random variables such that $|X_n| \leq Y$ for an integrable random variable $Y$, and $X_n \to X$ a.s., then $\mathbb{E}(X_n) \to \mathbb{E}(X)$.

Definition 1.12. For $p \in [1, \infty)$ and $f$ a measurable function,
\[ \|f\|_p = \big(\mathbb{E}[|f|^p]\big)^{1/p}, \tag{1.6} \]
\[ \|f\|_\infty = \inf\{\lambda : |f| \leq \lambda \text{ a.e.}\}. \tag{1.7} \]

Definition 1.13. $L^p = L^p(\Omega, \mathcal{F}, \mathbb{P}) = \{f : \|f\|_p < \infty\}$. Formally, $L^p$ is the collection of equivalence classes where two functions are equivalent if they are equal a.e.
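We will just represent an element of $L^p$ by a function, but remember that equality is a.e.

Theorem 1.14. The space $(L^2, \|\cdot\|_2)$ is a Hilbert space with $\langle U, V \rangle = \mathbb{E}(UV)$. Suppose $H$ is a closed subspace. Then for all $f \in L^2$ there exists a unique $g \in H$ such that $\|f - g\|_2 = \inf\{\|f - h\|_2 : h \in H\}$ and $\langle f - g, h \rangle = 0$ for all $h \in H$. We call $g$ the orthogonal projection of $f$ onto $H$.

Theorem 1.15. Let $(\Omega, \mathcal{F}, \mathbb{P})$ be an underlying probability space, let $X$ be an integrable random variable, and let $\mathcal{G} \subset \mathcal{F}$ be a sub-σ-algebra. Then there exists a random variable $Y$ such that
(a) $Y$ is $\mathcal{G}$-measurable,
(b) if $A \in \mathcal{G}$, then
\[ \mathbb{E}(X I(A)) = \mathbb{E}(Y I(A)), \tag{1.8} \]
and $Y$ is integrable.
Moreover, if $Y'$ also satisfies the above properties, then $Y = Y'$ a.s.

Remark 1.16. $Y$ is called a version of the conditional expectation of $X$ given $\mathcal{G}$, and we write $Y = \mathbb{E}(X \mid \mathcal{G})$. If $\mathcal{G} = \sigma(Z)$, we write $Y = \mathbb{E}(X \mid Z)$.

Remark 1.17. (b) could be replaced by the following condition: for all bounded $\mathcal{G}$-measurable random variables $Z$,
\[ \mathbb{E}(XZ) = \mathbb{E}(YZ). \tag{1.9} \]

Proof. Uniqueness: let $Y'$ satisfy (a) and (b), and consider $A = \{Y' - Y > 0\}$, which is $\mathcal{G}$-measurable. From (b),
\[ \mathbb{E}((Y' - Y) I(A)) = \mathbb{E}(X I(A)) - \mathbb{E}(X I(A)) = 0, \]
and hence $\mathbb{P}(Y' - Y > 0) = 0$, which implies that $Y' \leq Y$ a.s. Similarly, $Y' \geq Y$ a.s.

Existence: we complete the following three steps.

(i) Suppose first that $X \in L^2(\Omega, \mathcal{F}, \mathbb{P})$, which is a Hilbert space with $\langle U, V \rangle = \mathbb{E}(UV)$. The space $L^2(\Omega, \mathcal{G}, \mathbb{P})$ is a closed subspace: if $X_n \to X$ in $L^2$, then $X_n \to X$ in probability, so there exists a subsequence with $X_{n_k} \to X$ a.s., and
\[ X' = \limsup_k X_{n_k} \tag{1.10} \]
is $\mathcal{G}$-measurable and equal to $X$ a.s. We can therefore write
\[ L^2(\Omega, \mathcal{F}, \mathbb{P}) = L^2(\Omega, \mathcal{G}, \mathbb{P}) + L^2(\Omega, \mathcal{G}, \mathbb{P})^\perp, \qquad X = Y + Z. \]
Set $Y = \mathbb{E}(X \mid \mathcal{G})$. Then $Y$ is $\mathcal{G}$-measurable, and for $A \in \mathcal{G}$,
\[ \mathbb{E}(X I(A)) = \mathbb{E}(Y I(A)) + \underbrace{\mathbb{E}(Z I(A))}_{=0}. \]

(ii) If $X \geq 0$, then $Y \geq 0$ a.s. Indeed, consider $A = \{Y < 0\}$; then
\[ 0 \leq \mathbb{E}(X I(A)) = \mathbb{E}(Y I(A)) \leq 0. \tag{1.11} \]
Thus $\mathbb{P}(A) = 0$, so $Y \geq 0$ a.s. Now let $X \geq 0$ and set $0 \leq X_n = \min(X, n) \leq n$, so $X_n \in L^2$ for all $n$. Write $Y_n = \mathbb{E}(X_n \mid \mathcal{G})$; then $Y_n \geq 0$ a.s. and $Y_n$ is increasing a.s. Set $Y = \limsup_n Y_n$, so $Y$ is $\mathcal{G}$-measurable. We will show $Y = \mathbb{E}(X \mid \mathcal{G})$ a.s.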
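For all $A \in \mathcal{G}$, we need to check $\mathbb{E}(X I(A)) = \mathbb{E}(Y I(A))$. We know that $\mathbb{E}(X_n I(A)) = \mathbb{E}(Y_n I(A))$, with $X_n \uparrow X$ and $Y_n \uparrow Y$ a.s. Thus, by the monotone convergence theorem, $\mathbb{E}(X I(A)) = \mathbb{E}(Y I(A))$. If $X$ is integrable, setting $A = \Omega$ shows that $Y$ is integrable.

(iii) Let $X$ be a general integrable random variable, not necessarily in $L^2$ or $\geq 0$. Then $X = X^+ - X^-$, and we define $\mathbb{E}(X \mid \mathcal{G}) = \mathbb{E}(X^+ \mid \mathcal{G}) - \mathbb{E}(X^- \mid \mathcal{G})$. This satisfies (a), (b).

Remark 1.18. If $X \geq 0$, we can always define $Y = \mathbb{E}(X \mid \mathcal{G})$ a.s. The integrability condition on $Y$ may not be satisfied.

Definition 1.19. Let $\mathcal{G}_0, \mathcal{G}_1, \dots$ be sub-σ-algebras of $\mathcal{F}$. They are called independent if for all $n$, all distinct indices $i_1, \dots, i_n \in \mathbb{N}$ and all $G_{i_k} \in \mathcal{G}_{i_k}$,
\[ \mathbb{P}(G_{i_1} \cap \dots \cap G_{i_n}) = \prod_{k=1}^n \mathbb{P}(G_{i_k}). \]

Theorem 1.20.
(i) If $X \geq 0$ then $\mathbb{E}(X \mid \mathcal{G}) \geq 0$. (1.12)
(ii) $\mathbb{E}(\mathbb{E}(X \mid \mathcal{G})) = \mathbb{E}(X)$ (take $A = \Omega$).
(iii) If $X$ is $\mathcal{G}$-measurable, then $\mathbb{E}(X \mid \mathcal{G}) = X$ a.s.
(iv) If $X$ is independent of $\mathcal{G}$, then $\mathbb{E}(X \mid \mathcal{G}) = \mathbb{E}(X)$.

Theorem 1.21 (Fatou's lemma). If $X_n \geq 0$ for all $n$, then
\[ \mathbb{E}(\liminf_n X_n) \leq \liminf_n \mathbb{E}(X_n). \tag{1.13} \]

Theorem 1.22 (Conditional Monotone Convergence). Let $X_n \geq 0$ with $X_n \uparrow X$ a.s. Then
\[ \mathbb{E}(X_n \mid \mathcal{G}) \uparrow \mathbb{E}(X \mid \mathcal{G}) \quad \text{a.s.} \tag{1.14} \]

Proof. Set $Y_n = \mathbb{E}(X_n \mid \mathcal{G})$. Then $Y_n \geq 0$ and $Y_n$ is increasing. Set $Y = \limsup_n Y_n$. Then $Y$ is $\mathcal{G}$-measurable. For $A \in \mathcal{G}$ we have $\mathbb{E}(X_n I(A)) = \mathbb{E}(Y_n I(A))$, and applying the monotone convergence theorem to both sides gives $\mathbb{E}(X I(A)) = \mathbb{E}(Y I(A))$, so $Y = \mathbb{E}(X \mid \mathcal{G})$ a.s.

Theorem 1.23 (Conditional Fatou's Lemma). If $X_n \geq 0$ for all $n$, then
\[ \mathbb{E}(\liminf_n X_n \mid \mathcal{G}) \leq \liminf_n \mathbb{E}(X_n \mid \mathcal{G}) \quad \text{a.s.} \tag{1.15} \]

Proof. Let $X$ denote the limit inferior of the $X_n$. For every natural number $k$ define pointwise the random variable $Y_k = \inf_{n \geq k} X_n$. Then the sequence $Y_1, Y_2, \dots$ is increasing and converges pointwise to $X$. For $k \leq n$, we have $Y_k \leq X_n$, so that
\[ \mathbb{E}(Y_k \mid \mathcal{G}) \leq \mathbb{E}(X_n \mid \mathcal{G}) \quad \text{a.s.} \tag{1.16} \]
by the monotonicity of conditional expectation, hence
\[ \mathbb{E}(Y_k \mid \mathcal{G}) \leq \inf_{n \geq k} \mathbb{E}(X_n \mid \mathcal{G}) \quad \text{a.s.} \tag{1.17} \]
because the countable union of the exceptional sets of probability zero is again a null set.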
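Using the definition of $X$, its representation as pointwise limit of the $Y_k$, the monotone convergence theorem for conditional expectations, the last inequality, and the definition of the limit inferior, it follows that almost surely
\begin{align}
\mathbb{E}(\liminf_{n \to \infty} X_n \mid \mathcal{G}) &= \mathbb{E}(X \mid \mathcal{G}) \tag{1.18} \\
&= \mathbb{E}(\lim_{k \to \infty} Y_k \mid \mathcal{G}) \tag{1.19} \\
&= \lim_{k \to \infty} \mathbb{E}(Y_k \mid \mathcal{G}) \tag{1.20} \\
&\leq \lim_{k \to \infty} \inf_{n \geq k} \mathbb{E}(X_n \mid \mathcal{G}) \tag{1.21} \\
&= \liminf_{n \to \infty} \mathbb{E}(X_n \mid \mathcal{G}). \tag{1.22}
\end{align}

Theorem 1.24 (Conditional dominated convergence). TODO

1.3 Conditional Jensen's Inequalities

Let $X$ be an integrable random variable and $\varphi : \mathbb{R} \to \mathbb{R}$ a convex function such that $\varphi(X)$ is integrable or $\varphi$ is non-negative. Suppose $\mathcal{G} \subset \mathcal{F}$ is a σ-algebra. Then
\[ \mathbb{E}(\varphi(X) \mid \mathcal{G}) \geq \varphi(\mathbb{E}(X \mid \mathcal{G})) \tag{1.23} \]
almost surely. In particular, if $1 \leq p < \infty$, then
\[ \|\mathbb{E}(X \mid \mathcal{G})\|_p \leq \|X\|_p. \tag{1.24} \]

Proof. Every convex function can be written as $\varphi(x) = \sup_{i \in \mathbb{N}} (a_i x + b_i)$ with $a_i, b_i \in \mathbb{R}$. Then, for each $i$,
\[ \mathbb{E}(\varphi(X) \mid \mathcal{G}) \geq a_i \mathbb{E}(X \mid \mathcal{G}) + b_i \quad \text{a.s.,} \]
and taking the supremum over the countable index set,
\[ \mathbb{E}(\varphi(X) \mid \mathcal{G}) \geq \sup_{i \in \mathbb{N}} \big(a_i \mathbb{E}(X \mid \mathcal{G}) + b_i\big) = \varphi(\mathbb{E}(X \mid \mathcal{G})). \]
The second part follows from
\[ \|\mathbb{E}(X \mid \mathcal{G})\|_p^p = \mathbb{E}(|\mathbb{E}(X \mid \mathcal{G})|^p) \leq \mathbb{E}(\mathbb{E}(|X|^p \mid \mathcal{G})) = \mathbb{E}(|X|^p) = \|X\|_p^p. \tag{1.25} \]

Proposition 1.25 (Tower Property). Let $X \in L^1$ and let $\mathcal{H} \subset \mathcal{G} \subset \mathcal{F}$ be sub-σ-algebras. Then
\[ \mathbb{E}(\mathbb{E}(X \mid \mathcal{G}) \mid \mathcal{H}) = \mathbb{E}(X \mid \mathcal{H}) \tag{1.26} \]
almost surely.

Proof. Clearly $\mathbb{E}(X \mid \mathcal{H})$ is $\mathcal{H}$-measurable. Let $A \in \mathcal{H}$. Then, since also $A \in \mathcal{G}$,
\[ \mathbb{E}(\mathbb{E}(X \mid \mathcal{H}) I(A)) = \mathbb{E}(X I(A)) = \mathbb{E}(\mathbb{E}(X \mid \mathcal{G}) I(A)). \tag{1.27} \]

Proposition 1.26. Let $X \in L^1$ and let $\mathcal{G} \subset \mathcal{F}$ be a sub-σ-algebra. Suppose that $Y$ is bounded and $\mathcal{G}$-measurable. Then
\[ \mathbb{E}(XY \mid \mathcal{G}) = Y \mathbb{E}(X \mid \mathcal{G}) \tag{1.28} \]
almost surely.

Proof. Clearly $Y \mathbb{E}(X \mid \mathcal{G})$ is $\mathcal{G}$-measurable. Let $A \in \mathcal{G}$. Then
\[ \mathbb{E}(Y \mathbb{E}(X \mid \mathcal{G}) I(A)) = \mathbb{E}\big(\mathbb{E}(X \mid \mathcal{G}) \underbrace{(Y I(A))}_{\mathcal{G}\text{-measurable, bounded}}\big) = \mathbb{E}(XY I(A)). \tag{1.29} \]

Definition 1.27. A collection $\mathcal{A}$ of subsets of $\Omega$ is called a π-system if $A \cap B \in \mathcal{A}$ for all $A, B \in \mathcal{A}$.

Proposition 1.28 (Uniqueness of extension). Suppose that $\xi$ is a σ-algebra on $E$. Let $\mu_1, \mu_2$ be two measures on $(E, \xi)$ that agree on a π-system generating $\xi$, with $\mu_1(E) = \mu_2(E) < \infty$. Then $\mu_1 = \mu_2$ everywhere on $\xi$.

Theorem 1.29. Let $X \in L^1$ and let $\mathcal{G}, \mathcal{H} \subset \mathcal{F}$ be two sub-σ-algebras.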
If $\sigma(X, \mathcal{G})$ is independent of $\mathcal{H}$, then
\[ \mathbb{E}(X \mid \sigma(\mathcal{G}, \mathcal{H})) = \mathbb{E}(X \mid \mathcal{G}) \tag{1.30} \]
almost surely.

Proof. Take $A \in \mathcal{G}$, $B \in \mathcal{H}$. Then
\begin{align}
\mathbb{E}(\mathbb{E}(X \mid \mathcal{G}) I(A) I(B)) &= \mathbb{P}(B)\, \mathbb{E}(\mathbb{E}(X \mid \mathcal{G}) I(A)) \\
&= \mathbb{P}(B)\, \mathbb{E}(X I(A)) \\
&= \mathbb{E}(X I(A) I(B)) \\
&= \mathbb{E}(\mathbb{E}(X \mid \sigma(\mathcal{G}, \mathcal{H})) I(A \cap B)).
\end{align}
Assume $X \geq 0$; the general case follows by writing $X = X^+ - X^-$. For $F \in \mathcal{F}$, define $\mu(F) = \mathbb{E}(\mathbb{E}(X \mid \mathcal{G}) I(F))$ and $\nu(F) = \mathbb{E}(\mathbb{E}(X \mid \sigma(\mathcal{G}, \mathcal{H})) I(F))$. Then $\mu, \nu$ are two measures on $(\Omega, \mathcal{F})$. Set $\mathcal{A} = \{A \cap B : A \in \mathcal{G}, B \in \mathcal{H}\}$. Then $\mathcal{A}$ is a π-system, and by the computation above $\mu, \nu$ are two measures that agree on $\mathcal{A}$, with $\mu(\Omega) = \mathbb{E}(\mathbb{E}(X \mid \mathcal{G})) = \mathbb{E}(X) = \nu(\Omega) < \infty$ since $X$ is integrable. Note that $\mathcal{A}$ generates $\sigma(\mathcal{G}, \mathcal{H})$. So, by the uniqueness of extension theorem, $\mu, \nu$ agree everywhere on $\sigma(\mathcal{G}, \mathcal{H})$.

Remark 1.30. If we only had $X$ independent of $\mathcal{H}$ and $\mathcal{G}$ independent of $\mathcal{H}$, the conclusion can fail. For example, consider independent coin tosses $X, Y$ taking values $0, 1$ with probability $\tfrac12$ each, and $Z = I(X = Y)$.

1.4 Product Measures and Fubini's Theorem

Definition 1.31. A measure space $(E, \xi, \mu)$ is called σ-finite if there exist sets $(S_n)_n$ with $\bigcup_n S_n = E$ and $\mu(S_n) < \infty$ for all $n$.

Let $(E_1, \xi_1, \mu_1)$ and $(E_2, \xi_2, \mu_2)$ be two σ-finite measure spaces, with $\mathcal{A} = \{A_1 \times A_2 : A_1 \in \xi_1, A_2 \in \xi_2\}$ a π-system of subsets of $E = E_1 \times E_2$. Define $\xi = \xi_1 \otimes \xi_2 = \sigma(\mathcal{A})$.

Definition 1.32 (Product measure). Let $(E_1, \xi_1, \mu_1)$ and $(E_2, \xi_2, \mu_2)$ be two σ-finite measure spaces. Then there exists a unique measure $\mu = \mu_1 \otimes \mu_2$ on $(E, \xi)$ satisfying
\[ \mu(A_1 \times A_2) = \mu_1(A_1)\, \mu_2(A_2) \tag{1.31} \]
for all $A_1 \in \xi_1$, $A_2 \in \xi_2$.

Theorem 1.33 (Fubini's Theorem). Let $(E_1, \xi_1, \mu_1)$ and $(E_2, \xi_2, \mu_2)$ be σ-finite measure spaces. Let $f \geq 0$ be $\xi$-measurable. Then
\[ \mu(f) = \int_{E_1} \Big( \int_{E_2} f(x_1, x_2)\, \mu_2(dx_2) \Big) \mu_1(dx_1). \tag{1.32} \]
If $f$ is integrable, then $x_2 \mapsto f(x_1, x_2)$ is $\mu_2$-integrable for $\mu_1$-almost all $x_1$. Moreover, $x_1 \mapsto \int_{E_2} f(x_1, x_2)\, \mu_2(dx_2)$ is $\mu_1$-integrable and $\mu(f)$ is given by (1.32).

1.5 Examples of Conditional Expectation

Definition 1.34.
A random vector $(X_1, X_2, \dots, X_n) \in \mathbb{R}^n$ is called a Gaussian random vector if and only if for all $a_1, \dots, a_n \in \mathbb{R}$,
\[ a_1 X_1 + \dots + a_n X_n \tag{1.33} \]
is a Gaussian random variable. $(X_t)_{t \geq 0}$ is called a Gaussian process if for all $0 \leq t_1 \leq t_2 \leq \dots \leq t_n$, the vector $(X_{t_1}, \dots, X_{t_n})$ is a Gaussian random vector.

Example 1.35 (Gaussian case). Let $(X, Y)$ be a Gaussian vector in $\mathbb{R}^2$. We want to calculate
\[ \mathbb{E}(X \mid Y) = \mathbb{E}(X \mid \sigma(Y)) = X', \tag{1.34} \]
where $X' = f(Y)$ with $f$ a Borel function. Let us try a linear function, $X' = aY + b$ with $a, b \in \mathbb{R}$ to be determined. Note that $\mathbb{E}(X) = \mathbb{E}(X')$ and $\mathbb{E}((X' - X)Y) = 0$, so $\mathrm{Cov}(X - X', Y) = 0$, by the laws of conditional expectation. Then we have that
\[ a\,\mathbb{E}(Y) + b = \mathbb{E}(X), \qquad \mathrm{Cov}(X, Y) = a\,\mathbb{V}(Y). \tag{1.35} \]
TODO - continue inference

1.6 Notation for Example Sheet 1

(i) $\mathcal{G} \vee \mathcal{H} = \sigma(\mathcal{G}, \mathcal{H})$.

(ii) Let $X, Y$ be two random variables taking values in $\mathbb{R}$ with joint density $f_{X,Y}(x, y)$, and let $h : \mathbb{R} \to \mathbb{R}$ be a Borel function such that $h(X)$ is integrable. We want to calculate
\[ \mathbb{E}(h(X) \mid Y) = \mathbb{E}(h(X) \mid \sigma(Y)). \tag{1.36} \]
Let $g$ be bounded and measurable. Then
\begin{align}
\mathbb{E}(h(X) g(Y)) &= \iint h(x) g(y) f_{X,Y}(x, y)\, dx\, dy \tag{1.37} \\
&= \iint h(x) g(y) \frac{f_{X,Y}(x, y)}{f_Y(y)} f_Y(y)\, dx\, dy \tag{1.38} \\
&= \int \Big( \int h(x) \frac{f_{X,Y}(x, y)}{f_Y(y)}\, dx \Big) g(y) f_Y(y)\, dy, \tag{1.39}
\end{align}
with the convention $0/0 = 0$. Set $\varphi(y) = \int h(x) \frac{f_{X,Y}(x, y)}{f_Y(y)}\, dx$ if $f_Y(y) > 0$, and $0$ otherwise. Then we have
\[ \mathbb{E}(h(X) \mid Y) = \varphi(Y) \tag{1.40} \]
almost surely, and
\[ \mathbb{E}(h(X) \mid Y) = \int h(x)\, \nu(Y, dx), \tag{1.41} \]
with $\nu(y, dx) = \frac{f_{X,Y}(x, y)}{f_Y(y)} I(f_Y(y) > 0)\, dx = f_{X|Y}(x \mid y)\, dx$. Here $\nu(y, dx)$ is called the conditional distribution of $X$ given $Y = y$, and $f_{X|Y}(x \mid y)$ is the conditional density of $X$ given $Y = y$.

2 Discrete Time Martingales

Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space and $(E, \xi)$ a measurable space. Usually $E = \mathbb{R}, \mathbb{R}^d, \mathbb{C}$. For us, $E = \mathbb{R}$. A sequence $X = (X_n)_{n \geq 0}$ of random variables taking values in $E$ is called a stochastic process.
A filtration is an increasing family $(\mathcal{F}_n)_{n \geq 0}$ of sub-σ-algebras of $\mathcal{F}$, so $\mathcal{F}_n \subseteq \mathcal{F}_{n+1}$. Intuitively, $\mathcal{F}_n$ is the information available to us at time $n$. To every stochastic process $X$ we associate a filtration called the natural filtration $(\mathcal{F}_n^X)_{n \geq 0}$,
\[ \mathcal{F}_n^X = \sigma(X_k, k \leq n). \tag{2.1} \]
A stochastic process $X$ is called adapted to $(\mathcal{F}_n)_{n \geq 0}$ if $X_n$ is $\mathcal{F}_n$-measurable for all $n$. A stochastic process $X$ is called integrable if $X_n$ is integrable for all $n$.

Definition 2.1. An adapted integrable process $(X_n)_{n \geq 0}$ taking values in $\mathbb{R}$ is called a
(i) martingale if $\mathbb{E}(X_n \mid \mathcal{F}_m) = X_m$ for all $n \geq m$,
(ii) super-martingale if $\mathbb{E}(X_n \mid \mathcal{F}_m) \leq X_m$ for all $n \geq m$,
(iii) sub-martingale if $\mathbb{E}(X_n \mid \mathcal{F}_m) \geq X_m$ for all $n \geq m$.

Remark 2.2. A (sub, super)-martingale with respect to a filtration $(\mathcal{F}_n)$ is also a (sub, super)-martingale with respect to the natural filtration of $(X_n)$ (by the tower property).

Example 2.3. Suppose $(\xi_i)$ are iid random variables with $\mathbb{E}(\xi_i) = 0$. Set $X_n = \sum_{i=1}^n \xi_i$. Then $(X_n)$ is a martingale.

Example 2.4. As above, but let $(\xi_i)$ be iid with $\mathbb{E}(\xi_i) = 1$. Then $X_n = \prod_{i=1}^n \xi_i$ is a martingale.

Definition 2.5. A random variable $T : \Omega \to \mathbb{Z}_+ \cup \{\infty\}$ is called a stopping time if $\{T \leq n\} \in \mathcal{F}_n$ for all $n$. Equivalently, $\{T = n\} \in \mathcal{F}_n$ for all $n$.

Example 2.6.
(i) Constant times are trivial stopping times.
(ii) Let $A \in \mathcal{B}(\mathbb{R})$. Define $T_A = \inf\{n \geq 0 : X_n \in A\}$, with $\inf \emptyset = \infty$. Then $T_A$ is a stopping time.

Proposition 2.7. Let $S, T, (T_n)$ be stopping times on the filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_n), \mathbb{P})$. Then $S \wedge T$, $S \vee T$, $\inf_n T_n$, $\liminf_n T_n$, $\limsup_n T_n$ are stopping times.

Notation. For a stopping time $T$, we write $X_T(\omega) = X_{T(\omega)}(\omega)$. The stopped process $X^T$ is defined by $X^T_t = X_{T \wedge t}$. We set $\mathcal{F}_T = \{A \in \mathcal{F} : A \cap \{T \leq t\} \in \mathcal{F}_t \ \forall t\}$.

Proposition 2.8. Let $(\Omega, \mathcal{F}, (\mathcal{F}_n), \mathbb{P})$ be a filtered probability space and let $X = (X_n)_{n \geq 0}$ be adapted.
(i) If $S \leq T$ are stopping times, then $\mathcal{F}_S \subseteq \mathcal{F}_T$.
(ii) $X_T I(T < \infty)$ is $\mathcal{F}_T$-measurable.
(iii) If $T$ is a stopping time, then $X^T$ is adapted.
(iv) If $X$ is integrable, then $X^T$ is integrable.

Proof. Let $A \in \xi$.
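We need to show that $\{X_T I(T < \infty) \in A\} \in \mathcal{F}_T$. We have
\[ \{X_T I(T < \infty) \in A\} \cap \{T \leq t\} = \bigcup_{s \leq t} \underbrace{\{T = s\}}_{\in \mathcal{F}_s \subseteq \mathcal{F}_t} \cap \underbrace{\{X_s \in A\}}_{\in \mathcal{F}_s \subseteq \mathcal{F}_t} \in \mathcal{F}_t. \tag{2.2} \]

2.1 Optional Stopping

Theorem 2.9. Let $X$ be a martingale.
(i) If $T$ is a stopping time, then $X^T$ is also a martingale. In particular, $\mathbb{E}(X_{T \wedge t}) = \mathbb{E}(X_0)$ for all $t$.
(ii)
(iii)
(iv)

Proof. By the tower property, it is sufficient to check
\begin{align}
\mathbb{E}(X_{T \wedge t} \mid \mathcal{F}_{t-1}) &= \mathbb{E}\Big( \sum_{s=0}^{t-1} \underbrace{X_s I(T = s)}_{\in \mathcal{F}_s \subseteq \mathcal{F}_{t-1}} \,\Big|\, \mathcal{F}_{t-1} \Big) + \mathbb{E}(X_t I(T > t-1) \mid \mathcal{F}_{t-1}) \\
&= \sum_{s=0}^{t-1} I(T = s) X_s + I(T > t-1) X_{t-1} = X_{T \wedge (t-1)}.
\end{align}
Since $X^T$ is a martingale, $\mathbb{E}(X_{T \wedge t}) = \mathbb{E}(X_0)$.

Theorem 2.10. Let $X$ be a martingale.
(i) If $T$ is a stopping time, then $X^T$ is also a martingale, so in particular
\[ \mathbb{E}(X_{T \wedge t}) = \mathbb{E}(X_0). \tag{2.3} \]
(ii) If $S \leq T$ are bounded stopping times, then $\mathbb{E}(X_T \mid \mathcal{F}_S) = X_S$ almost surely.

Proof. Let $S \leq T \leq n$. Then
\[ X_T = (X_T - X_{T-1}) + (X_{T-1} - X_{T-2}) + \dots + (X_{S+1} - X_S) + X_S = X_S + \sum_{k=0}^{n} (X_{k+1} - X_k) I(S \leq k < T). \]
Let $A \in \mathcal{F}_S$. Then
\begin{align}
\mathbb{E}(X_T I(A)) &= \mathbb{E}(X_S I(A)) + \sum_{k=0}^{n} \mathbb{E}((X_{k+1} - X_k) I(S \leq k < T) I(A)) \tag{2.4} \\
&= \mathbb{E}(X_S I(A)), \tag{2.5}
\end{align}
where each term of the sum vanishes by the martingale property, because $A \cap \{S \leq k < T\} \in \mathcal{F}_k$.

Remark 2.11. The optional stopping theorem also holds for super/sub-martingales with the respective martingale inequalities in the statement.

Example 2.12. Suppose that $(\xi_i)_i$ are independent random variables with
\[ \mathbb{P}(\xi_i = 1) = \mathbb{P}(\xi_i = -1) = \tfrac12. \tag{2.6} \]
Set $X_0 = 0$, $X_n = \sum_{i=1}^n \xi_i$. This is a simple symmetric random walk on $\mathbb{Z}$. Let $T = \inf\{n \geq 0 : X_n = 1\}$. Then $\mathbb{P}(T < \infty) = 1$, but $T$ is not bounded.

Proposition 2.13. If $X$ is a positive supermartingale and $T$ is a stopping time which is finite almost surely ($\mathbb{P}(T < \infty) = 1$), then
\[ \mathbb{E}(X_T) \leq \mathbb{E}(X_0). \tag{2.7} \]

Proof. By Fatou's lemma,
\[ \mathbb{E}(X_T) = \mathbb{E}(\liminf_{t \to \infty} X_{t \wedge T}) \leq \liminf_{t \to \infty} \mathbb{E}(X_{t \wedge T}) \leq \mathbb{E}(X_0). \tag{2.8} \]

2.2 Hitting Probabilities for a Simple Symmetric Random Walk

Let $(\xi_i)$ be iid, taking the values $\pm 1$ with equal probability. Let $X_0 = 0$, $X_n = \sum_{i=1}^n \xi_i$. For all $x \in \mathbb{Z}$ let
\[ T_x = \inf\{n \geq 0 : X_n = x\}, \tag{2.9} \]
which is a stopping time. We want to explore hitting probabilities $\mathbb{P}(T_{-a} < T_b)$ for $a, b > 0$.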
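The hitting probability just introduced can also be explored numerically. The following is a minimal Monte Carlo sketch, not part of the original notes: the function name `estimate_hit_lower_first`, the number of trials and the seed are all illustrative assumptions. It runs independent simple symmetric random walks from 0 until they leave the interval $(-a, b)$ and records how often $-a$ is reached first, for comparison with the value $b/(a+b)$ obtained in this section from optional stopping.

```python
import random


def estimate_hit_lower_first(a, b, trials=20000, seed=0):
    """Estimate P(T_{-a} < T_b) for a simple symmetric random walk started at 0.

    Hypothetical helper for illustration; a and b are positive integers.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    hits_lower = 0
    for _ in range(trials):
        x = 0
        # Walk until the first exit from the open interval (-a, b).
        while -a < x < b:
            x += 1 if rng.random() < 0.5 else -1
        if x == -a:
            hits_lower += 1
    return hits_lower / trials


a, b = 2, 3
estimate = estimate_hit_lower_first(a, b)
exact = b / (a + b)  # value derived from optional stopping
print(f"estimate = {estimate:.3f}, b/(a+b) = {exact:.3f}")
```

The loop terminates almost surely because the walk exits any bounded interval in finite time; the estimate should approach $b/(a+b)$ as the number of trials grows, up to Monte Carlo error.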
Let $T = T_{-a} \wedge T_b$. If $\mathbb{E}(T) < \infty$, then by (iv) in Theorem 2.10, $\mathbb{E}(X_T) = \mathbb{E}(X_0) = 0$. Hence
\[ \mathbb{E}(X_T) = -a\, \mathbb{P}(T_{-a} < T_b) + b\, \mathbb{P}(T_b < T_{-a}) = 0, \tag{2.10} \]
and thus we obtain that
\[ \mathbb{P}(T_{-a} < T_b) = \frac{b}{a+b}. \tag{2.11} \]
It remains to check $\mathbb{E}(T) < \infty$. We have $\mathbb{P}(\xi_1 = 1, \dots, \xi_{a+b} = 1) = \frac{1}{2^{a+b}}$.

2.3 Martingale Convergence Theorem

Theorem 2.14. Let $X = (X_n)_{n \geq 0}$ be a (super-)martingale bounded in $L^1$, that is, $\sup_{n \geq 0} \mathbb{E}(|X_n|) < \infty$. Then $X_n$ converges as $n \to \infty$ almost surely towards an a.s. finite limit $X_\infty \in L^1(\mathcal{F}_\infty)$, with $\mathcal{F}_\infty = \sigma(\mathcal{F}_n, n \geq 0)$.

To prove it we will use Doob's trick, which counts up-crossings of intervals with rational endpoints.

Corollary 2.15. Let $X$ be a positive supermartingale. Then it converges to an almost surely finite limit as $n \to \infty$.

Proof.
\[ \mathbb{E}(|X_n|) = \mathbb{E}(X_n) \leq \mathbb{E}(X_0) < \infty. \tag{2.12} \]

Proof (of Theorem 2.14). Let $x = (x_n)_n$ be a sequence of real numbers, and let $a < b$ be two real numbers. Let $T_0(x) = 0$ and inductively for $k \geq 0$,
\[ S_{k+1}(x) = \inf\{n \geq T_k(x) : x_n \leq a\}, \qquad T_{k+1}(x) = \inf\{n \geq S_{k+1}(x) : x_n \geq b\}, \tag{2.13} \]
with the usual convention that $\inf \emptyset = \infty$. Define $N_n([a, b], x) = \sup\{k \geq 0 : T_k(x) \leq n\}$, the number of up-crossings of the interval $[a, b]$ by the sequence $x$ by time $n$. As $n \to \infty$, we have
\[ N_n([a, b], x) \uparrow N([a, b], x) = \sup\{k \geq 0 : T_k(x) < \infty\}, \tag{2.14} \]
the total number of up-crossings of the interval $[a, b]$.

Lemma 2.16. A sequence of real numbers $x = (x_n)_n$ converges in $\bar{\mathbb{R}} = \mathbb{R} \cup \{\pm\infty\}$ if and only if $N([a, b], x) < \infty$ for all rationals $a < b$.

Proof. Assume $x$ converges. If for some $a < b$ we had $N([a, b], x) = \infty$, then $\liminf_n x_n \leq a < b \leq \limsup_n x_n$, which is a contradiction. Conversely, suppose that $x$ does not converge. Then $\liminf_n x_n < \limsup_n x_n$, and so taking rationals $a < b$ between these two numbers gives $N([a, b], x) = \infty$, as required.

Theorem 2.17 (Doob's up-crossing inequality). Let $X$ be a supermartingale and $a < b$ be two real numbers. Then for all $n \geq 0$,
\[ (b - a)\, \mathbb{E}(N_n([a, b], X)) \leq \mathbb{E}\big((X_n - a)^-\big). \tag{2.15} \]

Proof.
For all $k$,
\[ X_{T_k} - X_{S_k} \geq b - a. \tag{2.16} \]

2.4 Uniform Integrability

Theorem 2.18. Suppose $X \in L^1$. Then the collection of random variables
\[ \{\mathbb{E}(X \mid \mathcal{G}) : \mathcal{G} \subseteq \mathcal{F} \text{ a sub-σ-algebra}\} \tag{2.17} \]
is uniformly integrable.

Proof. Since $X \in L^1$, for all $\varepsilon > 0$ there exists $\delta > 0$ such that if $A \in \mathcal{F}$ and $\mathbb{P}(A) < \delta$, then $\mathbb{E}(|X| I(A)) \leq \varepsilon$. Set $Y = \mathbb{E}(X \mid \mathcal{G})$. Then $\mathbb{E}(|Y|) \leq \mathbb{E}(|X|)$. Choose $\lambda < \infty$ such that $\mathbb{E}(|X|) \leq \lambda \delta$. Then
\[ \mathbb{P}(|Y| \geq \lambda) \leq \frac{\mathbb{E}(|Y|)}{\lambda} \leq \delta \tag{2.18} \]
by Markov's inequality. Then
\begin{align}
\mathbb{E}(|Y| I(|Y| \geq \lambda)) &\leq \mathbb{E}(\mathbb{E}(|X| \mid \mathcal{G}) I(|Y| \geq \lambda)) \tag{2.19} \\
&= \mathbb{E}(|X| I(|Y| \geq \lambda)) \tag{2.20} \\
&\leq \varepsilon. \tag{2.21}
\end{align}

Definition 2.19. A process $X = (X_n)_{n \geq 0}$ is called a uniformly integrable martingale if it is a martingale and the collection $(X_n)$ is uniformly integrable.

Theorem 2.20. Let $X$ be a martingale. Then the following are equivalent.
(i) $X$ is a uniformly integrable martingale.
(ii) $X$ converges almost surely and in $L^1$ to a limit $X_\infty$ as $n \to \infty$.
(iii) There exists a random variable $Z \in L^1$ such that $X_n = \mathbb{E}(Z \mid \mathcal{F}_n)$ almost surely for all $n \geq 0$.

Theorem 2.21 (Chapter 13 of Williams). Let $X_n, X \in L^1$ for all $n \geq 0$ and suppose that $X_n \to X$ a.s. as $n \to \infty$. Then $X_n$ converges to $X$ in $L^1$ if and only if $(X_n)$ is uniformly integrable.

Proof (of Theorem 2.20). We proceed as follows.

(i) ⇒ (ii). Since $X$ is uniformly integrable, it is bounded in $L^1$, and by the martingale convergence theorem $X_n$ converges almost surely to a finite limit $X_\infty$. Theorem 2.21 then gives $L^1$ convergence.

(ii) ⇒ (iii). Set $Z = X_\infty$. We need to show that $X_n = \mathbb{E}(Z \mid \mathcal{F}_n)$ almost surely for all $n \geq 0$. For all $m \geq n$, by the martingale property we have
\[ \|X_n - \mathbb{E}(X_\infty \mid \mathcal{F}_n)\|_1 = \|\mathbb{E}(X_m - X_\infty \mid \mathcal{F}_n)\|_1 \leq \|X_m - X_\infty\|_1 \to 0 \tag{2.22} \]
as $m \to \infty$.

(iii) ⇒ (i). $\mathbb{E}(Z \mid \mathcal{F}_n)$ is a martingale by the tower property of conditional expectation. Uniform integrability follows from Theorem 2.18.

Remark 2.22. If $X$ is UI, then $X_\infty = \mathbb{E}(Z \mid \mathcal{F}_\infty)$ a.s., where $\mathcal{F}_\infty = \sigma(\mathcal{F}_n, n \geq 0)$.

Remark 2.23.
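If $X$ is a UI supermartingale (respectively submartingale), then it converges almost surely and in $L^1$ to a finite limit $X_\infty$ with $\mathbb{E}(X_\infty \mid \mathcal{F}_n) \leq X_n$ (respectively $\geq X_n$) almost surely.

Example 2.24. Let $X_1, X_2, \dots$ be iid random variables with $\mathbb{P}(X_i = 0) = \mathbb{P}(X_i = 2) = \tfrac12$. Set $Y_n = X_1 \cdots X_n$. Then $(Y_n)$ is a martingale. As $\mathbb{E}(Y_n) = 1$ for all $n$, $(Y_n)$ is bounded in $L^1$, and it converges almost surely to $0$. But $\mathbb{E}(Y_n) = 1$ for all $n$, and hence it does not converge in $L^1$.

If $X$ is a UI martingale and $T$ is a stopping time, then we can unambiguously define
\[ X_T = \sum_{n=0}^{\infty} X_n I(T = n) + X_\infty I(T = \infty). \tag{2.23} \]

Theorem 2.25 (Optional stopping for UI martingales). Let $X$ be a UI martingale and let $S, T$ be stopping times with $S \leq T$. Then
\[ \mathbb{E}(X_T \mid \mathcal{F}_S) = X_S \tag{2.24} \]
almost surely.

Proof. We first show that $\mathbb{E}(X_\infty \mid \mathcal{F}_T) = X_T$ almost surely for any stopping time $T$. First, check that $X_T \in L^1$. Since $|X_n| \leq \mathbb{E}(|X_\infty| \mid \mathcal{F}_n)$, we have
\begin{align}
\mathbb{E}(|X_T|) &= \sum_{n=0}^{\infty} \mathbb{E}(|X_n| I(T = n)) + \mathbb{E}(|X_\infty| I(T = \infty)) \tag{2.25} \\
&\leq \sum_{n \in \mathbb{Z}_+ \cup \{\infty\}} \mathbb{E}(|X_\infty| I(T = n)) \tag{2.26} \\
&= \mathbb{E}(|X_\infty|). \tag{2.27}
\end{align}
Let $B \in \mathcal{F}_T$. Then
\begin{align}
\mathbb{E}(I(B) X_T) &= \sum_{n \in \mathbb{Z}_+ \cup \{\infty\}} \mathbb{E}(I(B) I(T = n) X_n) \tag{2.28} \\
&= \sum_{n \in \mathbb{Z}_+ \cup \{\infty\}} \mathbb{E}(I(B) I(T = n) X_\infty) \tag{2.29} \\
&= \mathbb{E}(I(B) X_\infty), \tag{2.30}
\end{align}
where for the second equality we used that $\mathbb{E}(X_\infty \mid \mathcal{F}_n) = X_n$ almost surely and that $B \cap \{T = n\} \in \mathcal{F}_n$. Clearly $X_T$ is $\mathcal{F}_T$-measurable, and hence $\mathbb{E}(X_\infty \mid \mathcal{F}_T) = X_T$ almost surely. Using the tower property of conditional expectation, we have for stopping times $S \leq T$ (as $\mathcal{F}_S \subseteq \mathcal{F}_T$),
\begin{align}
\mathbb{E}(X_T \mid \mathcal{F}_S) &= \mathbb{E}(\mathbb{E}(X_\infty \mid \mathcal{F}_T) \mid \mathcal{F}_S) \tag{2.31} \\
&= \mathbb{E}(X_\infty \mid \mathcal{F}_S) \tag{2.32} \\
&= X_S \tag{2.33}
\end{align}
almost surely.

2.5 Backwards Martingales

Let $\dots \subseteq \mathcal{G}_{-2} \subseteq \mathcal{G}_{-1} \subseteq \mathcal{G}_0$ be a sequence of ....

TODO: fill in proof from lecture notes

2.6 Applications of Martingales

Theorem 2.26 (Kolmogorov's 0-1 law). Let $(X_i)_{i \geq 1}$ be a sequence of iid random variables. Let $\mathcal{F}_n = \sigma(X_k, k \geq n)$ and $\mathcal{F}_\infty = \bigcap_{n \geq 0} \mathcal{F}_n$. Then $\mathcal{F}_\infty$ is trivial, that is, every $A \in \mathcal{F}_\infty$ has probability $\mathbb{P}(A) \in \{0, 1\}$.

Proof. Let $\mathcal{G}_n = \sigma(X_k, k \leq n)$ and $A \in \mathcal{F}_\infty$.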
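Since $\mathcal{G}_n$ is independent of $\mathcal{F}_{n+1}$, we have that
\[ \mathbb{E}(I(A) \mid \mathcal{G}_n) = \mathbb{P}(A). \tag{2.34} \]
Theorem 2.26 (LN) [TODO: link to correct theorem] gives that $\mathbb{P}(A) = \mathbb{E}(I(A) \mid \mathcal{G}_n)$ converges to $\mathbb{E}(I(A) \mid \mathcal{G}_\infty)$ as $n \to \infty$, where $\mathcal{G}_\infty = \sigma(\mathcal{G}_n, n \geq 0)$. Then we deduce that $\mathbb{E}(I(A) \mid \mathcal{G}_\infty) = I(A) = \mathbb{P}(A)$ almost surely, as $\mathcal{F}_\infty \subseteq \mathcal{G}_\infty$. Therefore, $\mathbb{P}(A) \in \{0, 1\}$.

Theorem 2.27 (Strong law of large numbers). Let $(X_i)_{i \geq 1}$ be a sequence of iid random variables in $L^1$ with $\mu = \mathbb{E}(X_i)$. Let $S_n = \sum_{i=1}^n X_i$ and $S_0 = 0$. Then $S_n / n \to \mu$ as $n \to \infty$ almost surely and in $L^1$.

Proof.

Theorem 2.28 (Kakutani's product martingale theorem). Let $(X_n)_{n \geq 0}$ be a sequence of independent non-negative random variables of mean 1. Let $M_0 = 1$, $M_n = \prod_{i=1}^n X_i$ for $n \in \mathbb{N}$. Then $(M_n)_{n \geq 0}$ is a non-negative martingale and $M_n \to M_\infty$ a.s. as $n \to \infty$ for some random variable $M_\infty$. We set $a_n = \mathbb{E}(\sqrt{X_n})$; then $a_n \in (0, 1]$. Moreover,
(i) if $\prod_n a_n > 0$, then $M_n \to M_\infty$ in $L^1$ and $\mathbb{E}(M_\infty) = 1$,
(ii) if $\prod_n a_n = 0$, then $M_\infty = 0$ almost surely.

TODO: fill in, this is somewhat involved.

Proof. TODO: fill in

2.6.1 Martingale proof of the Radon-Nikodym theorem

Let $\mathbb{P}, \mathbb{Q}$ be two probability measures on the measurable space $(\Omega, \mathcal{F})$. Assume that $\mathcal{F}$ is countably generated, that is, there exists a collection of sets $(F_n)_{n \in \mathbb{N}}$ such that $\mathcal{F} = \sigma(F_n, n \in \mathbb{N})$. Then the following are equivalent.
(i) $\mathbb{P}(A) = 0 \Rightarrow \mathbb{Q}(A) = 0$ for all $A \in \mathcal{F}$. That is, $\mathbb{Q}$ is absolutely continuous with respect to $\mathbb{P}$, and we write $\mathbb{Q} \ll \mathbb{P}$.
(ii) For all $\varepsilon > 0$, there exists $\delta > 0$ such that $\mathbb{P}(A) \leq \delta \Rightarrow \mathbb{Q}(A) \leq \varepsilon$.
(iii) There exists a non-negative random variable $X$ such that
\[ \mathbb{Q}(A) = \mathbb{E}_{\mathbb{P}}(X I(A)). \tag{2.35} \]

Proof. (i) ⇒ (ii). If (ii) does not hold, then there exists $\varepsilon > 0$ such that for all $n \geq 1$ there exists a set $A_n$ with $\mathbb{P}(A_n) \leq \frac{1}{n^2}$ and $\mathbb{Q}(A_n) \geq \varepsilon$. By Borel-Cantelli, we get that $\mathbb{P}(A_n \text{ i.o.}) = 0$. Therefore from (i) we get that $\mathbb{Q}(A_n \text{ i.o.}) = 0$. But
\[ \mathbb{Q}(A_n \text{ i.o.}) = \mathbb{Q}\Big(\bigcap_n \bigcup_{k \geq n} A_k\Big) = \lim_{n \to \infty} \mathbb{Q}\Big(\bigcup_{k \geq n} A_k\Big) \geq \varepsilon, \tag{2.36} \]
which is a contradiction.

(ii) ⇒ (iii). Consider the filtration $\mathcal{F}_n = \sigma(F_k, k \leq n)$.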
Let
\[ \mathcal{A}_n = \{H_1 \cap \dots \cap H_n : H_i = F_i \text{ or } F_i^c\}; \tag{2.37} \]
then it is easy to see that $\mathcal{F}_n = \sigma(\mathcal{A}_n)$. Note also that the sets in $\mathcal{A}_n$ are disjoint.

TODO: continue proof

3 Stochastic Processes in Continuous Time

Our setting is a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ with time index $t \in J \subseteq \mathbb{R}_+ = [0, \infty)$.

Definition 3.1. A filtration on $(\Omega, \mathcal{F}, \mathbb{P})$ is an increasing collection of σ-algebras $(\mathcal{F}_t)_{t \in J}$, satisfying $\mathcal{F}_s \subseteq \mathcal{F}_t$ for $t \geq s$. A stochastic process in continuous time is an ordered collection of random variables on $\Omega$.

4 Bibliography