SOME TOOLS FROM MEASURE THEORY

Abstract. Some more measure theory.

1. Dealing with sigma-algebras

1.1. Measurable functions. Let (Ω, F) be a measurable space. Given a collection C of subsets of Ω, we let σ(C) denote the smallest sigma-algebra that contains the sets C; we also say that C generates the sigma-algebra σ(C). Thus the open sets generate the Borel sets, and B = σ(G), where G is the collection of open sets and B is the collection of Borel sets.

Recall that a function f : Ω → R is measurable if f⁻¹(B) = {f⁻¹(B) : B ∈ B} ⊂ F; that is, f⁻¹(B) ∈ F for all B ∈ B. We claimed that this condition is equivalent to checking the easier condition that f⁻¹(C) ⊂ F, in the case that C is the set of intervals of the form (−∞, x) for x ∈ R.

Lemma 1. Let C be the set of intervals of the form (−∞, x) for x ∈ R. Then σ(C) = B.

Proof. First, we argue that every open interval is in σ(C). Let a < b be real numbers. Clearly, (−∞, b) ∈ σ(C), and

(−∞, a] = ⋂_{n>0} (−∞, a + n⁻¹) ∈ σ(C).

Hence (−∞, a]ᶜ = (a, ∞) ∈ σ(C), from which we can conclude that (a, ∞) ∩ (−∞, b) = (a, b) ∈ σ(C). Second, we note that it is a theorem that every open subset of R is a countable disjoint union of open intervals, from which it follows that σ(C) contains every open set. Finally, since B was defined to be the smallest sigma-algebra containing the open sets, we have that σ(C) = B.

Exercise 1.1. Let (Ω, F) be a measurable space. Let C be a collection of subsets of R that generates B. Show that f : Ω → R is measurable if and only if f⁻¹(C) ⊂ F.

We will also speak of random variables that are not real-valued, for example, random vectors or random sequences. In general, if (Ω, F) is a measurable space and (S, S) is another measurable space, we say that f : Ω → S is measurable if f⁻¹(S) ⊂ F.

Exercise 1.2. Consider the measurable space (N, 2^N); here 2^N is the set of all subsets of N. Check that the set of all singletons {n} such that n ∈ N generates 2^N.

Exercise 1.3.
Prove a version of Exercise 1.1 for the case of general measurable functions.

Let (Ω, F, P) be a probability space. We say that a sigma-algebra T ⊂ F is trivial if for every A ∈ T, we have P(A) ∈ {0, 1}.

Exercise 1.4. Let X be a real-valued random variable. Show that if σ(X) is trivial, then there exists a constant c ∈ R such that P(X = c) = 1.

Exercise 1.5. Let (Ω, F, P) be a probability space, and let (S, S) be a measurable space. (a) Show that if S = {∅, S}, then every function X : Ω → S is a random variable, and σ(X) is trivial. (b) Show that if σ(X) is trivial, then there does not exist a partition of S given by S1 ∪ S2 = S with S1, S2 ∈ S such that X takes values in both S1 and S2. (c) Assume that S contains all the singletons of S; that is, all sets of the form {s}, with s ∈ S. Show that if X is discrete and σ(X) is trivial, then there exists c ∈ S such that P(X = c) = 1. (d) What happens if we do not know a priori that X is discrete?

1.2. Measures. Recall that in elementary probability courses, we said that two random variables X and Y are independent if P(X ≤ x, Y ≤ y) = P(X ≤ x)P(Y ≤ y) for all x, y ∈ R. This condition is equivalent to the condition that P(X ∈ A, Y ∈ B) = P(X ∈ A)P(Y ∈ B) for all A, B ∈ B. One way to justify this fact is via π-systems. Let Ω be a set. A collection of subsets I of Ω is a π-system if it is closed under finite intersections; that is, if A, B ∈ I, then A ∩ B ∈ I.

Theorem 2 (Uniqueness via π-systems). Let (Ω, F) be a measurable space and µ and ν be finite measures on Ω with µ(Ω) = ν(Ω). If µ and ν agree on a π-system that generates F, then µ and ν are equal on all of F.

Exercise 1.6. We claimed that there exists a nice ‘Borel’ measure λ on (R, B), with properties such as λ(a, b) = b − a and translation-invariance. Show that there is only one such Borel measure.

Let (Ω, F, P) be a probability space. Let G and H be sub-sigma-algebras of F; that is, G, H ⊂ F. We say that G and H are independent if P(A ∩ B) = P(A)P(B) for all A ∈ G and all B ∈ H.
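In a finite toy example, the passage from independence on a small generating family of events to independence on all events can be checked by brute force; the following is a minimal sketch in Python (the marginal distributions px and py are made-up illustrative values):

```python
from itertools import chain, combinations

def subsets(points):
    """All subsets of a finite collection of points."""
    pts = list(points)
    return chain.from_iterable(combinations(pts, r) for r in range(len(pts) + 1))

# Marginal laws of X and Y (illustrative values) and their product joint law,
# so that X and Y are independent by construction.
px = {0: 0.5, 1: 0.3, 2: 0.2}
py = {0: 0.25, 1: 0.25, 2: 0.5}
joint = {(x, y): px[x] * py[y] for x in px for y in py}

# The factorization P(X in A, Y in B) = P(X in A) P(Y in B) holds for
# every pair of events A, B, not just the generating ones.
for A in subsets(px):
    for B in subsets(py):
        pAB = sum(joint[x, y] for x in A for y in B)
        pA = sum(px[x] for x in A)
        pB = sum(py[y] for y in B)
        assert abs(pAB - pA * pB) < 1e-12
```

On an infinite space the brute-force check is unavailable, which is exactly why the π-system argument of Theorem 2 is needed.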
Recall that X⁻¹(B) = {X⁻¹(B) : B ∈ B} is a sigma-algebra. Write σ(X) = X⁻¹(B). We say that X and Y are independent if σ(X) is independent of σ(Y).

Exercise 1.7. Check that this definition agrees with the usual definition.

Exercise 1.8. Let f be a Borel measurable function. Check that Y = f(X) is measurable with respect to σ(X); that is, Y⁻¹(B) ∈ σ(X) for all B ∈ B.

Exercise 1.9. Check that if X and Y are independent, then f(X) is independent of g(Y), for Borel measurable functions f and g.

Exercise 1.10. Let f : [0, 1] → R be a continuous function. Let Ui be a sequence of i.i.d. random variables uniformly distributed in [0, 1]. Show that

(1/n) ∑_{i=1}^n f(Ui) → ∫₀¹ f(x) dx.

Exercise 1.11. Let X be a Bernoulli random variable. Let Y ∈ σ(X) be a real-valued random variable. Show that Y is a discrete random variable with at most two distinct values.

2. Approximating and constructing measures

2.1. Caratheodory’s extension theorem. Let Ω be a set. An algebra A on Ω is a collection of subsets of Ω that contains Ω and is closed under complements and finite unions; that is, Ω ∈ A; if A ∈ A, then Aᶜ ∈ A; and if A, B ∈ A, then A ∪ B ∈ A.

Theorem 3 (Caratheodory’s extension theorem). Let Ω be a set, and let A be an algebra on Ω. If µ̃ is a measure on A, then there exists a measure µ on the measurable space (Ω, σ(A)) such that µ = µ̃ on A.

The proof of Theorem 3 involves defining a set function on all subsets of Ω using µ̃: set

µ*(E) = inf ∑_{i=1}^∞ µ̃(Ai),

where the infimum is taken over all sequences (Ai) for which E ⊂ ∪i Ai and Ai ∈ A. The set function µ* is called an outer measure. We have that µ* = µ̃ on A, but the outer measure is only countably subadditive: even if the Ei are disjoint sets, we only have

µ*(∪i Ei) ≤ ∑i µ*(Ei);

however, restricting µ* to σ(A) results in a measure µ. Using outer measure to construct measures has the benefit that measurable sets can be approximated by more basic sets.
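Exercise 1.10 can be sanity-checked numerically: the empirical average of f at uniform samples approaches the integral as n grows. A minimal sketch (the test function, sample size, and tolerance are arbitrary choices for illustration):

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

def monte_carlo(f, n):
    """Estimate the integral of f over [0, 1] by averaging f at n i.i.d.
    uniform sample points, as in Exercise 1.10."""
    return sum(f(random.random()) for _ in range(n)) / n

# f(x) = x**2 has integral 1/3 over [0, 1]; the empirical average is close.
estimate = monte_carlo(lambda x: x * x, 200_000)
assert abs(estimate - 1 / 3) < 0.01
```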
In the case of Borel measure on [0, 1), we can take A to be the set of all finite unions of intervals of the form [a, b), and define µ̃[a, b) = b − a.

Corollary 4. Let ε > 0. If E ∈ σ(A), then there exists A ∈ A such that µ(A △ E) < ε. Here A △ E = (A \ E) ∪ (E \ A).

Exercise 2.1. In Theorem 3, show that if µ̃(Ω) < ∞, then the extension µ is unique.

Exercise 2.2. Let f : [0, 1] → R be a measurable function. Show that there exists a sequence of step functions fn such that fn → f almost surely with respect to Borel measure. Here a step function is any finite linear combination of indicator functions of intervals.

Exercise 2.3. Check that the set A of all finite unions of intervals of the form [a, b), for 0 ≤ a < b ≤ 1, together with the empty set, is an algebra on [0, 1). Show that A is not a sigma-algebra.

Solution. Notice that if A = [a, b) ∪ [c, d), where a < b < c < d, then Aᶜ = [0, a) ∪ [b, c) ∪ [d, 1). Thus it is easy to see that A is closed under complements, as well as finite unions. Notice also that A does not contain any interval of the form (a, b) with a > 0, even though (a, b) is a countable union of intervals [a + 1/n, b) ∈ A; hence A is not closed under countable unions and is not a sigma-algebra.

A semialgebra A₀ is a collection of sets that is closed under finite intersections and has the property that for any A ∈ A₀, the complement Aᶜ is a finite disjoint union of members of A₀; note that Aᶜ does not have to be in A₀. The algebra generated by A₀ is the collection A containing the empty set and all finite disjoint unions of sets in A₀.

2.2. Product spaces and Kolmogorov’s extension theorem. Let (Ω1, F1) and (Ω2, F2) be measurable spaces. We define the product space to be (Ω1 × Ω2, F1 ⊗ F2), where F1 ⊗ F2 is defined to be the sigma-algebra generated by all sets of the form A × B with A ∈ F1 and B ∈ F2. Often we will abuse notation and write F1 × F2 both for this collection of product sets and for the sigma-algebra σ(F1 × F2) it generates.

Theorem 5 (Product measures). Let (Ωi, Fi, µi) be sigma-finite measure spaces. There exists a unique measure µ on the product space (Ω1 × Ω2, F1 ⊗ F2) such that µ(A × B) = µ1(A)µ2(B) for all A × B ∈ F1 × F2.
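For a continuous f, the approximating step functions of Exercise 2.2 can be written down explicitly: partition [0, 1) into n equal subintervals and let fn take the value of f at the left endpoint of each. A minimal sketch (the test function f(x) = x², which is 2-Lipschitz on [0, 1], is an illustrative choice):

```python
def step_approx(f, n):
    """Step-function approximation of f on [0, 1): the approximant is
    constant on each interval [k/n, (k+1)/n), taking the value f(k/n)."""
    def fn(x):
        k = min(int(x * n), n - 1)  # index of the subinterval containing x
        return f(k / n)
    return fn

# For f(x) = x**2 we have |f(x) - f(y)| <= 2|x - y| on [0, 1], so the
# n-piece step approximation is within 2/n of f everywhere on [0, 1).
f = lambda x: x * x
n = 1000
fn = step_approx(f, n)
error = max(abs(fn(i / 10000) - f(i / 10000)) for i in range(10000))
assert error <= 2 / n
```

For a merely measurable f, one instead approximates via Corollary 4, and the convergence is only almost sure rather than uniform.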
The proof of Theorem 5 is harder than you might guess, but it is a corollary of Theorem 3.

In the case of infinite product spaces, we consider the set Ω = ∏_{i∈Z⁺} Ωi, which is the set of all sequences ω such that ω(i) ∈ Ωi for all i ∈ Z⁺, and the sigma-algebra generated by all the finite-dimensional sets, which are sets given by finite intersections of sets of the form {ω ∈ Ω : ω(i) ∈ Fi}; such sets are sometimes also called cylinder sets. Note that by de Morgan’s laws, the cylinder sets form a semialgebra; for example,

({ω ∈ Ω : ω(1) ∈ F1} ∩ {ω ∈ Ω : ω(2) ∈ F2})ᶜ = {ω ∈ Ω : ω(1) ∈ F1ᶜ} ∪ {ω ∈ Ω : ω(2) ∈ F2ᶜ}.

Sometimes even members of the algebra generated by the cylinder sets are called cylinder sets.

Exercise 2.4. Use Theorem 3 to construct a probability space for an infinite sequence of i.i.d. fair coin flips.

Exercise 2.5 (Ergodicity). Let X = (Xi)i∈Z be i.i.d. Bernoulli random variables. Let Q = P(X ∈ ·) be the law of X, so that Q is a probability measure on the space of bi-infinite sequences Ω = {0, 1}^Z endowed with the product sigma-algebra F. Define the left-shift T via (Tω)i = ωi+1. An event A ∈ F is said to be translation-invariant if Q(A △ T⁻¹(A)) = 0. (a) Argue that Q is translation-invariant; that is, Q ∘ T⁻¹ = Q. (b) Show that for any two cylinder sets C1 and C2, there exists a finite N > 0 such that Q(C1 ∩ T⁻ᴺ C2) = Q(C1)Q(C2). (c) Show that every translation-invariant event is trivial, in the sense that it has probability zero or one. Hint: Show that Q(A) = Q(A)². Approximate A by a finite disjoint union of cylinder sets.

A sequence of probability measures µn is consistent if

µn+1(F1 × · · · × Fn × Ωn+1) = µn(F1 × · · · × Fn), for all Fi ∈ Fi.

Theorem 6 (Kolmogorov extension theorem). If (µn) is a sequence of consistent probability measures on (Rⁿ, Bⁿ), then there exists a unique probability measure µ on the infinite product space such that µ agrees with the µn on the cylinder sets.

Remark 1.
Theorem 6 also holds in the case of an infinite product of a general measurable space (Ω, F), provided that it is a standard Borel space; that is, there exist a Borel set B ⊂ R and a bijection φ : Ω → B such that both φ and φ⁻¹ are measurable.

Exercise 2.6. Use Theorem 6 to construct a Markov chain X with a transition matrix P on a countable state space S that is started at the probability measure µ0.

Solution. Without loss of generality, we may assume that S = N ⊂ R. Thus we can start with the probability measure µ0 defined on (R, B). We can define µ1 on (R², B²) by µ1(a0, a1) = µ0(a0) p_{a0,a1}. It follows from the fact that P is a transition matrix that µ1 is also a probability measure, supported on S × S, and µ1(a0, R) = µ0(a0). Similarly, we can define µn on (Rⁿ⁺¹, Bⁿ⁺¹) via

µn(a0, a1, . . . , an) = µ0(a0) p_{a0,a1} · · · p_{a_{n−1},a_n},

and we obtain a sequence of consistent probability measures. From the Kolmogorov extension theorem, there exists a unique probability measure P on (R^N, B^N) that agrees with the µn on the finite-dimensional sets. Let X be the random variable on the probability space (R^N, B^N, P) defined by X(ω) = ω for all ω ∈ R^N. Set Xi(ω) = X(ω)i = ωi. If you are not convinced we are done, then consider the following calculation. Let a, b ∈ S. We have that

P(Xn = a) = µn(Rⁿ, a) = ∑_{(a0,...,an) ∈ Sⁿ⁺¹ : an = a} µ0(a0) p_{a0,a1} · · · p_{a_{n−1},a_n}.

Similarly, we have

P(Xn+1 = b, Xn = a) = µn+1(Rⁿ, a, b)
= ∑_{(a0,...,an+1) ∈ Sⁿ⁺² : an = a, an+1 = b} µ0(a0) p_{a0,a1} · · · p_{a_n,a_{n+1}}
= p_{a,b} ∑_{(a0,...,an) ∈ Sⁿ⁺¹ : an = a} µ0(a0) p_{a0,a1} · · · p_{a_{n−1},a_n}
= p_{a,b} P(Xn = a).

Hence we obtain that

P(Xn+1 = b | Xn = a) = P(Xn+1 = b, Xn = a) / P(Xn = a) = p_{a,b},

as required.

Exercise 2.7. Let X = (X0, X1, . . .) be a Markov chain with transition matrix P, started at a stationary distribution. Extend X to include all negative integer times.

Exercise 2.8. Define a stationary Markov chain Y = (. . . , Y−1, Y0, Y1, . . .)
such that there exists a translation-invariant event A that is not trivial; see Exercise 2.5.

3. Stopping times and the strong Markov property, again

Let X be a Markov chain taking values in a countable state space S, defined on a probability space (Ω, F, P). Set Fn = σ(X0, . . . , Xn). Let T be a stopping time. You will soon be able to prove that {T = n} ∈ Fn; in fact, this is the usual definition. We set FT to be the sigma-algebra of events F ∈ F such that F ∩ {T = n} ∈ Fn for all n ≥ 0.

Lemma 7. The stopping time T is measurable with respect to FT; that is, T⁻¹(B) ∈ FT for all B ⊂ N.

Proof. If n ∈ B, then T⁻¹(B) ∩ {T = n} = {T = n} ∈ Fn; otherwise, T⁻¹(B) ∩ {T = n} = ∅ ∈ Fn.

Lemma 8. Let X be a Markov chain taking values in a countable state space S. Let T be a stopping time. Add to S a symbol △ to obtain S△. Set

Z = (X0, X1, . . . , XT, △, △, . . .),

so that Z takes values in (S△)^N. Then Z is measurable with respect to FT.

Proof. Let n ∈ N. It suffices to check that Z⁻¹(B) ∩ {T = n} ∈ Fn for all sets B of the form B = {(b0, b1, . . . , bk, △, △, . . .)}, where bi ∈ S. Note that by definition {T = n} ∈ Fn. If k ≠ n, then the intersection is empty, and thus in Fn; otherwise, k = n, and Z⁻¹(B) ∩ {T = n} = {X0 = b0, . . . , Xn = bn} ∩ {T = n} ∈ Fn.

Exercise 3.1. Let X be a Markov chain taking values in a countable state space S, defined on a probability space (Ω, F, P). Let T1 and T2 be stopping times. (a) Check that FT1 is indeed a sigma-algebra. (b) Check that min{T1, T2} is a stopping time. (c) Check that T1 + T2 is also a stopping time.

Theorem 9. Let X be a Markov chain taking values in a state space S with a transition matrix P. Let T be a stopping time, with P(T < ∞) = 1. Let s ∈ S. Conditional on XT = s, we have that Y = (X_{T+k})_{k=0}^∞ is a Markov chain started at s with transition matrix P that is independent of Z = (Xk)_{k=0}^T.

Proof.
Note that it suffices to check that conditional on {XT = s}, we have that Y is a Markov chain started at s that is independent of FT. Let C ∈ FT. We should check that for all measurable A ⊂ S^N, we have that

P(Y ∈ A, C | XT = s) = P(X ∈ A | X0 = s) P(C | XT = s).   (1)

Hopefully, you will believe (see Exercise 3.2) that we have enough measure theory to justify that we only need to consider cylinder sets A of the form

A = {a ∈ S^N : a0 = z0, . . . , ak = zk}.

Note that C is given by the disjoint union of the events Bn = C ∩ {T = n}, where Bn ∈ Fn. Let z0, . . . , zk ∈ S. Check using the Markov property that

P(XT = z0, . . . , XT+k = zk, Bn, T = n, XT = s) = P(X0 = z0, . . . , Xk = zk | X0 = s) P(Bn, T = n, XT = s).

Summing over all n ≥ 0, we have

P(XT = z0, . . . , XT+k = zk, C, XT = s) = P(X0 = z0, . . . , Xk = zk | X0 = s) P(C, XT = s),

and dividing by P(XT = s), we obtain the required result.

Exercise 3.2. Check that it really is enough to just check cylinder sets in the proof of Theorem 9.

Solution. This is a consequence of the π-system lemma. Fix C ∈ FT and s ∈ S. The random variables X and Y take values in (S^N, F), where F is the product sigma-algebra generated by the cylinder sets. Consider the finite measures µ and ν defined on (S^N, F) via µ(A) = P(Y ∈ A, C | XT = s) and ν(A) = P(X ∈ A | X0 = s) P(C | XT = s); these are just the left- and right-hand sides of (1). We checked in the proof of Theorem 9 that µ(A) = ν(A) for all cylinder sets A. Note that the cylinder sets are a π-system that generates F. Hence Theorem 2 gives that µ = ν on all of F.

4. Stationary stochastic processes

Let (Ω, F, P) be a probability space. Let X = (Xi)i∈Z be a bi-infinite sequence of random variables taking values in R. Let T : R^Z → R^Z be given by (Tx)i = xi+1 for all i ∈ Z; thus (TX)i = Xi+1 for all i ∈ Z. We say that X is stationary if TX has the same distribution as X; that is, P(X ∈ A) = P(TX ∈ A) for all A ∈ B^Z.
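Theorem 9 can be sanity-checked by simulation: for a two-state chain and T the first hitting time of state 1, the conditional law of X_{T+1} should be the row p_{1,·} of the transition matrix. A minimal sketch (the transition matrix is a made-up example, and the tolerance is loose enough that the final assertion holds with overwhelming probability at this sample size):

```python
import random

random.seed(2)  # fixed seed for reproducibility

# Two-state chain with an illustrative transition matrix; row s is p_{s, .}.
P = [[0.7, 0.3],
     [0.2, 0.8]]

def run_chain(start, steps):
    """Sample a path X_0, ..., X_steps of the chain."""
    x, path = start, [start]
    for _ in range(steps):
        x = 0 if random.random() < P[x][0] else 1
        path.append(x)
    return path

# T = first hitting time of state 1, a stopping time. On {T < infinity},
# X_T = 1 automatically, and Theorem 9 says X_{T+1} has law p_{1, .}.
count = moved = 0
for _ in range(100_000):
    path = run_chain(0, 20)
    T = next((n for n, x in enumerate(path) if x == 1), None)
    if T is not None and T + 1 < len(path):
        count += 1
        moved += (path[T + 1] == 0)
freq = moved / count
assert abs(freq - P[1][0]) < 0.01
```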
We can also use the same definition in the case that X = (Xi)i∈N is a unilateral sequence of random variables.

Exercise 4.1. Check that X is stationary if and only if for all ℓ ∈ Z, any finite collection n1, . . . , nk ∈ Z, and Borel sets B1, . . . , Bk ∈ B, we have P(Xn1 ∈ B1, . . . , Xnk ∈ Bk) = P(X_{n1+ℓ} ∈ B1, . . . , X_{nk+ℓ} ∈ Bk).

Exercise 4.2. Show that a Markov chain started at a stationary distribution is indeed a stationary process.

Exercise 4.3. Let X be a stationary real-valued stochastic process on (Ω, F, P). Show that if we set µ to be the law of X, so that µ(A) = P(X ∈ A) for all A ∈ B^Z, then (R^Z, B^Z, µ, T) is a measure-preserving system.

Exercise 4.4. Let (R^Z, B^Z, µ, T) be a probability measure-preserving system, where (Tx)i = xi+1. Show that if we set Xi(x) = xi for all i ∈ Z and x ∈ R^Z, then X = (Xi)i∈Z is a stationary process.

Exercise 4.5. Let X be an irreducible Markov chain on a finite state space, so that it has a unique stationary distribution. Use the Poincaré recurrence theorem to show that if X is started at the stationary distribution, then it will visit every state almost surely.
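Exercise 4.2 in a two-state example: if π P = π, then the law of Xn is π for every n, so the pair law P(Xn = a, Xn+1 = b) = π_a p_{a,b} does not depend on n. A minimal numerical sketch (the matrix P and its stationary distribution π = (1/3, 2/3) are illustrative choices):

```python
# Two-state transition matrix and its stationary distribution (illustrative).
P = [[0.5, 0.5],
     [0.25, 0.75]]
pi = [1 / 3, 2 / 3]

def push(mu, P):
    """One step of the chain: (mu P)_b = sum_a mu_a p_{a,b}."""
    return [sum(mu[a] * P[a][b] for a in range(len(mu))) for b in range(len(mu))]

# pi is stationary: pushing it forward one step leaves it unchanged ...
assert all(abs(x - y) < 1e-12 for x, y in zip(push(pi, P), pi))

# ... so the pair law at times (n, n+1) equals pi_a p_{a,b} for every n.
law_n = pi
for _ in range(5):
    for a in range(2):
        for b in range(2):
            assert abs(law_n[a] * P[a][b] - pi[a] * P[a][b]) < 1e-12
    law_n = push(law_n, P)
```

The same computation, extended to all finite-dimensional marginals via Exercise 4.1, gives the full stationarity claim.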