MTH 202 : Probability and Statistics
Lecture 5 - 8 : 15, 20, 21, 23 January, 2013
Random Variables and their Probability Distributions

3.1 : Random Variables

When computing probabilities of events we often have to carry out set operations that turn out to be tricky. It is usually easier to work with real-valued functions defined on the points of the sample space $\Omega$. We call these functions random variables. Before introducing them we need a specific $\sigma$-field built from the open intervals in $\mathbb{R}$, known as the Borel $\sigma$-field.

Theorem 3.1.1 : Let $X$ be a non-empty set and let $\mathcal{P}(X)$ denote its power set. For any collection $\mathcal{C} \subseteq \mathcal{P}(X)$ of subsets of $X$, there exists a smallest $\sigma$-field of subsets of $X$ which contains every set in $\mathcal{C}$.

Proof : There is at least one $\sigma$-field containing $\mathcal{C}$, namely $\mathcal{P}(X)$. Collect all such $\sigma$-fields :
\[ \mathcal{T} := \{\, \mathcal{U} : \mathcal{U} \text{ is a } \sigma\text{-field of subsets of } X \text{ and } \mathcal{C} \subseteq \mathcal{U} \,\}. \]
The collection $\mathcal{T}$ is non-empty since $\mathcal{P}(X)$ is a member of it. The intersection $\bigcap_{\mathcal{U} \in \mathcal{T}} \mathcal{U}$ fulfils the defining properties of a $\sigma$-field (Verify !) and contains $\mathcal{C}$. Being contained in every member of $\mathcal{T}$, it is the smallest $\sigma$-field containing $\mathcal{C}$.

Note 3.1.2 : The $\sigma$-field constructed above is often denoted by $\langle \mathcal{C} \rangle$ and is called the $\sigma$-field generated by $\mathcal{C}$. In practice it is hard to describe the members of $\langle \mathcal{C} \rangle$ even when the collection $\mathcal{C} \subseteq \mathcal{P}(X)$ is given explicitly. Hence, for practical purposes, we will need a condition that lets us avoid dealing with arbitrary sets from $\langle \mathcal{C} \rangle$.

Definition 3.1.3 : Let $\mathcal{C}$ be the collection of all open intervals of the form $(a, b)$ where $a, b \in \mathbb{R}$. The smallest $\sigma$-field $\langle \mathcal{C} \rangle$ (in $\mathbb{R}$) generated by $\mathcal{C}$, often denoted by $\mathcal{B}(\mathbb{R})$, is called the Borel $\sigma$-field. The sets in $\mathcal{B}(\mathbb{R})$ are called Borel sets.

Exercise 3.1.4 : Show that $\mathcal{B}(\mathbb{R})$ is also generated by each of the following collections :
(i) $\{(-\infty, x] : x \in \mathbb{R}\}$,
(ii) $\{[x, \infty) : x \in \mathbb{R}\}$,
(iii) $\{(-\infty, x) : x \in \mathbb{R}\}$,
(iv) $\{(x, \infty) : x \in \mathbb{R}\}$,
(v) $\{[x, y] : x, y \in \mathbb{R}\}$,
(vi) $\{[x, y) : x, y \in \mathbb{R}\}$,
(vii) $\{(x, y] : x, y \in \mathbb{R}\}$.
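
As an illustration of Theorem 3.1.1, when $X$ is finite the generated $\sigma$-field can be computed explicitly. The sketch below is not the construction used in the proof (an intersection of all $\sigma$-fields containing the collection). Instead it builds the family from below, closing it under complements and pairwise unions until nothing new appears, which gives the same family in the finite case. The function name `generated_sigma_field` and the sample sets are illustrative assumptions, not part of the lecture.

```python
from itertools import combinations


def generated_sigma_field(universe, collection):
    """Smallest family of subsets of a FINITE `universe` that contains
    `collection` and is closed under complement and union.
    (On a finite set, closure under finite unions is enough.)"""
    universe = frozenset(universe)
    family = {frozenset(), universe} | {frozenset(c) for c in collection}
    changed = True
    while changed:
        changed = False
        current = list(family)
        # close under complements
        for a in current:
            comp = universe - a
            if comp not in family:
                family.add(comp)
                changed = True
        # close under pairwise unions
        for a, b in combinations(current, 2):
            union = a | b
            if union not in family:
                family.add(union)
                changed = True
    return family


# Hypothetical example: X = {1, 2, 3, 4}, C = {{1}, {1, 2}}
sigma = generated_sigma_field({1, 2, 3, 4}, [{1}, {1, 2}])
print(len(sigma))  # 8 sets: all unions of the atoms {1}, {2}, {3, 4}
print(sorted(sorted(s) for s in sigma))
```

Here the set {2} belongs to the generated family although it is not in the generating collection, while {3} does not belong to it. Even in this tiny case the members are not obvious from the generators alone, which is the difficulty Note 3.1.2 points at.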

We are now ready to encounter random variables :

Definition 3.1.5 : Let $(\Omega, \mathcal{S}, P)$ be a probability space. A function $F : \Omega \to \mathbb{R}$ is called a random variable (RV in short) if
\[ F^{-1}(B) = \{\omega \in \Omega : F(\omega) \in B\} \in \mathcal{S} \]
for every Borel set $B \subseteq \mathbb{R}$.

It is customary to denote a random variable by $X$ instead of the usual function notation $F$. From the previous exercise we can deduce :

Theorem 3.1.6 : Let $(\Omega, \mathcal{S}, P)$ be a probability space. Then $X : \Omega \to \mathbb{R}$ is a random variable if and only if
\[ X^{-1}((-\infty, x]) = \{\omega \in \Omega : X(\omega) \le x\} \in \mathcal{S} \quad \text{for all } x \in \mathbb{R}. \]

Example 3.1.7 : Suppose that we toss a coin thrice and count the number of heads that turn up in each outcome. Here the sample space is $\Omega = \{abc : a, b, c \in \{H, T\}\}$. Define the probability on $\Omega$ by $P(\{abc\}) := 1/8$ for all $abc \in \Omega$. Suppose we want to calculate $P(A)$ for the event $A$ = {at least two H's}. Instead of talking about the set $A$, we can introduce a random variable $X : \Omega \to \mathbb{R}$ by
\[ X(\omega) := \text{number of H's in } \omega. \]
For example, $X(HTH) = 2 = X(HHT)$, $X(HHH) = 3$, etc. The function $X$ is a random variable simply because $\mathcal{S}$ is the whole power set $\mathcal{P}(\Omega)$. Finally, we write $P(A)$ as $P(X \ge 2)$ and calculate that $P(A) = P(X^{-1}([2, \infty))) = 1/2$.

Exercise 3.1.8 : Let $X$ be an RV. Is $|X|$ also an RV? If $X$ is an RV that takes only non-negative values, is $\sqrt{X}$ also an RV?

Solution : Let $U_x$ be the set
\[ U_x := |X|^{-1}((-\infty, x]) = \{\omega \in \Omega : |X(\omega)| \le x\}. \]
Then
\[ U_x = \begin{cases} \emptyset & \text{if } x < 0, \\ X^{-1}(\{0\}) & \text{if } x = 0, \\ X^{-1}([-x, x]) & \text{if } x > 0. \end{cases} \]
These sets are in $\mathcal{S}$, since $X$ is an RV and $\{0\}$, $[-x, x]$ are Borel sets. Hence $|X|$ is an RV by Theorem 3.1.6.

Next we recall that for $x \ge 0$ in $\mathbb{R}$ the symbol $\sqrt{x}$ denotes the non-negative square root of $x$. Let $V_x$ be the set
\[ V_x := (\sqrt{X})^{-1}((-\infty, x]) = \{\omega \in \Omega : -\infty < \sqrt{X(\omega)} \le x\}. \]
Then $V_x = \emptyset$ if $x < 0$. If $x \ge 0$ we have
\[ V_x = \{\omega : 0 \le \sqrt{X(\omega)} \le x\} = X^{-1}([0, x^2]). \]
Hence $\sqrt{X}$ is also an RV.

Exercise 3.1.9 : Let $\Omega = [0, 1]$ and let $\mathcal{S}$ be the Borel $\sigma$-field of subsets of $\Omega$. Define $X : \Omega \to \mathbb{R}$ by
\[ X(\omega) = \begin{cases} \omega & \text{if } 0 \le \omega \le \frac{1}{2}, \\ \omega - \frac{1}{2} & \text{if } \frac{1}{2} < \omega \le 1. \end{cases} \]
Is $X$ an RV? If so, what is the event $\{\omega : X(\omega) \in (\frac{1}{4}, \frac{1}{2})\}$?

Solution : We notice that
\[ X^{-1}((-\infty, x]) = \begin{cases} \emptyset & \text{if } x < 0, \\ [0, x] \cup (\frac{1}{2}, \frac{1}{2} + x] & \text{if } 0 \le x < \frac{1}{2}, \\ [0, 1] & \text{if } x \ge \frac{1}{2}. \end{cases} \]
Each of these is a Borel subset of $[0, 1]$, so $X$ is an RV. The event in question is $\{\omega : X(\omega) \in (\frac{1}{4}, \frac{1}{2})\} = (\frac{1}{4}, \frac{1}{2}) \cup (\frac{3}{4}, 1)$.

3.2 : Probability distribution of a Random Variable

Theorem 3.2.1 : The RV $X$ defined on a probability space $(\Omega, \mathcal{S}, P)$ induces a probability space $(\mathbb{R}, \mathcal{B}(\mathbb{R}), Q)$, where
\[ Q(B) := P(X^{-1}(B)) = P(\{\omega : X(\omega) \in B\}) \quad \text{for all } B \in \mathcal{B}(\mathbb{R}). \]

Proof : Ref. Pg 43, Sec. 2.3, Theorem 1 [RS].

Before we speak about the probability distribution, we first define the idea of a distribution function in general.

Definition 3.2.2 : A function $F : \mathbb{R} \to \mathbb{R}$ is called a distribution function (DF) if :
(i) $x < y$ implies $F(x) \le F(y)$ for all $x, y \in \mathbb{R}$ (non-decreasing),
(ii) $\lim_{x \to a+} F(x) = F(a)$ for all $a \in \mathbb{R}$ (right continuous),
(iii) $F(-\infty) = 0$ (i.e., $\lim_{x \to -\infty} F(x) = 0$) and $F(+\infty) = 1$ (i.e., $\lim_{x \to +\infty} F(x) = 1$).

Exercise 3.2.3 : Do the following functions define DF's?
\[ \text{(a)} \quad F(x) = \begin{cases} 0 & \text{if } x < 0, \\ x & \text{if } 0 \le x < \frac{1}{2}, \\ 1 & \text{if } x \ge \frac{1}{2}, \end{cases} \qquad \text{(b)} \quad F(x) = \frac{1}{\pi} \tan^{-1} x \quad (x \in \mathbb{R}). \]

Solution : (a) Property (i) can be easily checked. Since the function is defined by patches of continuous functions, we only need to verify (ii) at $x = 0$ and $x = \frac{1}{2}$. We see that $\lim_{x \to 0+} F(x) = \lim_{x \to 0+} x = 0 = F(0)$. Similarly, at $x = \frac{1}{2}$ we have $\lim_{x \to \frac{1}{2}+} F(x) = 1 = F(\frac{1}{2})$. The third property is clear since $F$ coincides with the constant functions 0 and 1 near $-\infty$ and $+\infty$ respectively. Hence $F$ is a DF.

(b) Here $F(-\infty) = \lim_{x \to -\infty} \frac{1}{\pi} \tan^{-1} x = -\frac{1}{2} \ne 0$. Similarly, $F(+\infty) = \frac{1}{2} \ne 1$. Hence this is not a distribution function.

Theorem 3.2.4 : The set of points where a DF $F$ is discontinuous is at most countable.

Proof : Ref. Pg 44, Sec. 2.3, Theorem 2 [RS].

We now define the DF of an RV.

Definition 3.2.5 : Let $X$ be an RV defined on a probability space $(\Omega, \mathcal{S}, P)$. The function $F : \mathbb{R} \to \mathbb{R}$ defined by
\[ F(x) = Q((-\infty, x]) = P(\{\omega \in \Omega : X(\omega) \le x\}) \quad (x \in \mathbb{R}) \]
is called the distribution function of the RV $X$.

The name "distribution function of an RV" is justified by the following :

Theorem 3.2.6 : The function $F$ defined as above is a DF.

Proof : Ref. Pg 45, Sec. 2.3, Theorem 3 [RS].

In fact, every DF can be shown to be the DF of an RV on some probability space. The proof of this will not be discussed in this course.

From now on we adopt the following notation : $P(\{\omega \in \Omega : X(\omega) \le \alpha\})$ is denoted by $P(X \le \alpha)$, $P(\{\omega \in \Omega : X(\omega) < \alpha\})$ is denoted by $P(X < \alpha)$, etc.

Exercise 3.2.7 : Does the following function define a DF? If so, find $P(-\infty < X < 2)$.
\[ F(x) = \begin{cases} 1 - e^{-x} & \text{if } x \ge 0, \\ 0 & \text{if } x < 0. \end{cases} \]

Solution : $F'(x) = e^{-x} > 0$ for $x > 0$ shows that the function is strictly increasing on the positive half of the real line. It is constant on the negative side, and $0 < 1 - e^{-x}$ for all $x > 0$. Hence $F$ is non-decreasing. $F$ is continuous on $x \ge 0$, which implies that $F$ is right continuous at $x = 0$. At every other point $F$ is indeed continuous. Finally, $F$ is the constant function 0 on the negative side of the real line, showing $F(-\infty) = 0$. Since $\lim_{x \to +\infty} \frac{1}{e^x} = 0$, we have $F(+\infty) = 1$. Thus $F$ is a DF.

Since $F$ is a continuous function, $P(X = a) = 0$ for all $a \in \mathbb{R}$ (Why?). Therefore
\[ P(-\infty < X < 2) = P(-\infty < X \le 2) - P(X = 2) = F(2) - 0 = 1 - e^{-2}. \]

3.3 : Discrete and Continuous Random Variables

There are essentially two distinct types of RV's we will be dealing with. We first discuss discrete RV's. Roughly speaking, a discrete RV is one for which the complete probability mass is concentrated at some discrete points (i.e., points which are separated from each other by a certain positive distance). First, we briefly recall the notion of a countable set.

Definition 3.3.1 : A set $E$ is said to be countable if it is either finite, or else there is a bijection $f : \mathbb{N} \to E$.

The set $E$ being finite means that if you, along with some others, try to count the elements of $E$ by the numbers 1, 2, 3, . . .
, the counting would theoretically stop at some point, even if the sun is extinct by then, or the earth has been evacuated by the rest of humanity, as long as no one has managed to change your interest in counting $E$.

[Figure 1. WALL-E and EVA]

On the other hand, a countably infinite set cannot be counted in any finite amount of time. However, as in the previous case, suppose that while counting with the numbers 1, 2, 3, . . . you also tag each element with its number. We call $E$ countably infinite if every element receives a number tag "$n$", however large $n$ may be. In the previous definition, the bijection $f$ ensures that the tags "$n$" given to the elements "$f(n)$" are all distinct.

Definition 3.3.2 : An RV $X$ defined on a probability space $(\Omega, \mathcal{S}, P)$ is said to be of discrete type (or simply discrete) if there is a countable set $E \subseteq \mathbb{R}$ such that $P(X \in E) = 1$.

A relevant query at this point is whether countable sets in $\mathbb{R}$ are Borel sets; otherwise it would be meaningless to talk about $P(X \in E) = P(X^{-1}(E))$. First we note that every singleton set $\{x\}$ in $\mathbb{R}$ is a Borel set, by means of the infinite nested intersection
\[ \{x\} = \bigcap_{n=1}^{\infty} \left( x - \frac{1}{n},\; x + \frac{1}{n} \right). \]
Thus every countable subset of $\mathbb{R}$ is a Borel set, being a countable union (either finite or infinite) of singletons. Now write $E = \{x_1, x_2, \ldots\}$. If it is known that $P(X = x_i) = p_i \ge 0$ for all $x_i \in E$, we have from the definition of probability that
\[ \sum_{n=1}^{\infty} p_n = 1. \]

Definition 3.3.3 : The collection of non-negative real numbers $\{p_i\}_{i=1}^{\infty}$ satisfying $P(X = x_i) = p_i$ for all $i \in \mathbb{N}$ and $\sum_{i=1}^{\infty} p_i = 1$ is called the probability mass function (PMF) of the RV $X$. The DF $F$ of $X$ is given by
\[ F(x) = P(X \le x) = \sum_{x_i \le x} p_i \quad (x \in \mathbb{R}). \]

The name "probability mass function" for the collection $\{p_i\}_{i=1}^{\infty}$ of non-negative real numbers may be misleading. In fact, it can be written precisely as a function $p : \mathbb{R} \to \mathbb{R}$ by
\[ p(x) = \begin{cases} p_k & \text{if } x = x_k \ (k = 1, 2, \ldots), \\ 0 & \text{otherwise.} \end{cases} \]

In general :

Definition 3.3.4 : Let $\{p_i\}_{i=1}^{\infty}$ be a collection of non-negative real numbers such that $\sum_{i=1}^{\infty} p_i = 1$. Then $\{p_i\}_{i=1}^{\infty}$ is the PMF of some RV $X$.

Exercise 3.3.5 : For what value of $K$ does the following define the probability mass function of some random variable :
\[ f(x) = K/N \quad (x = 1, 2, \ldots, N). \]

Solution : We need $\sum_{i=1}^{N} K/N = K = 1$.

Next we consider the RV's whose DF's are of continuous type.

Definition 3.3.6 : Let $X$ be an RV defined on a probability space $(\Omega, \mathcal{S}, P)$ with DF $F$. Then $X$ is said to be of continuous type if there is an integrable function $f : \mathbb{R} \to [0, \infty)$ such that
\[ F(x) = \int_{-\infty}^{x} f(t)\,dt \quad (x \in \mathbb{R}). \]
The function $f$ is called the probability density function (PDF) of the RV $X$.

Properties 3.3.7 : Let $f$ be the PDF of the RV $X$ on the probability space $(\Omega, \mathcal{S}, P)$. Then :
\[ \text{(i)} \ \int_{-\infty}^{\infty} f(t)\,dt = 1, \qquad \text{(ii)} \ P(a < X \le b) = \int_{a}^{b} f(t)\,dt. \]

In general :

Theorem 3.3.8 : Every non-negative real function $f$ that is integrable over $\mathbb{R}$ and satisfies
\[ \int_{-\infty}^{\infty} f(t)\,dt = 1 \]
is the PDF of some continuous RV $X$.

As a special note, we make a few comments regarding continuity of the distribution function.

Theorem 3.3.10 : Let $F$ be the distribution function corresponding to an RV $X$ over the probability space $(\Omega, \mathcal{S}, P)$. If $F$ is continuous at $x = a$, then $P(X = a) = 0$. Otherwise $P(X = a) = F(a) - F(a-) > 0$.

Proof : Consider the sequence of events
\[ E_n := \{\omega \in \Omega : a - \tfrac{1}{n} < X(\omega) \le a\} = X^{-1}((a - \tfrac{1}{n}, a]). \]
Since $(a - \frac{1}{n}, a]$ is a Borel set, $E_n \in \mathcal{S}$ for all $n \in \mathbb{N}$. We see that $E_1 \supseteq E_2 \supseteq E_3 \supseteq \ldots$, i.e., the sequence $\{E_n\}_{n=1}^{\infty}$ is decreasing, and we have
\[ \bigcap_{n=1}^{\infty} E_n = X^{-1}\left( \bigcap_{n=1}^{\infty} (a - \tfrac{1}{n}, a] \right) = X^{-1}(\{a\}). \]
Since $\{E_n\}_{n=1}^{\infty}$ is decreasing we have (see the corollary to Thm. 6, Pg 13, [RS])
\[ \lim_{n \to \infty} P(E_n) = P\left( \bigcap_{n=1}^{\infty} E_n \right) = P(X^{-1}(\{a\})) = P(X = a). \]
But $P(E_n) = P(a - \frac{1}{n} < X \le a) = F(a) - F(a - \frac{1}{n})$. Hence
\[ \lim_{n \to \infty} P(E_n) = F(a) - \lim_{n \to \infty} F(a - \tfrac{1}{n}) = F(a) - F(a-). \]
Now if $F$ is continuous at $x = a$, it is left continuous there as well. Hence $F(a) = F(a-)$ and $P(X = a) = 0$. Next, suppose $F$ is not continuous at $x = a$. Since $F$ is right continuous, it must fail to be left continuous at $a$, so $F(a-) \ne F(a)$. As $F$ is non-decreasing, $F(a - \frac{1}{n}) \le F(a-) < F(a)$ for all $n \in \mathbb{N}$. Thus $\{P(E_n)\}_{n=1}^{\infty}$ is a sequence of positive real numbers whose limit exists (since $F$ is non-decreasing) and equals $F(a) - F(a-) > 0$.

We finally note that if $X$ is of continuous type, then $F$ is absolutely continuous : $F$ has a derivative almost everywhere and is recovered by integrating that derivative. This notion is much stronger than continuity. For details, you may consult Chap-5, Section 4, Cor. 12, [ROY]. In short, we have the following conclusion :

Corollary 3.3.11 : Let $F$ be the distribution function corresponding to an RV $X$ of continuous type over the probability space $(\Omega, \mathcal{S}, P)$. Then $P(X = a) = 0$ for all $a \in \mathbb{R}$. In particular, $F$ is a continuous function.

Moreover, there are RV's which are neither of continuous type nor of discrete type. The DF's of these are not absolutely continuous. However, they may still have a corresponding density (or probability) function, which is a little tricky to describe. For example :

Example 3.3.12 : Is the following function a DF? If so, find the corresponding density or probability function :
\[ F(x) = \begin{cases} 0 & \text{if } x < 1, \\ \frac{(x-1)^2}{8} & \text{if } 1 \le x < 3, \\ 1 & \text{if } x \ge 3. \end{cases} \]

Solution : Outside the interval $[1, 3)$, the function $F$ is constant. In the open interval $(1, 3)$, we have $F'(x) = (x - 1)/4 > 0$, and at $x = 3$ we have $F(3-) = \frac{1}{2} < 1 = F(3)$. Hence $F$ is non-decreasing. Clearly, $F(-\infty) = 0$ and $F(+\infty) = 1$. Finally, $F$ is defined piecewise by functions which are right continuous on their intervals, implying that $F$ is right continuous. Therefore, $F$ is a DF. The corresponding density function $f$ is given by
\[ f(x) = \begin{cases} 0 & \text{if } x < 1, \\ \frac{x-1}{4} & \text{if } 1 \le x < 3, \\ 0 & \text{if } x \ge 3. \end{cases} \]
We note that $F$ is not continuous at $x = 3$. In fact,
\[ P(X = 3) = F(3) - F(3-) = 1 - \frac{1}{2} = \frac{1}{2} > 0. \]
Thus the density $f$ accounts only for the mass $\int_{1}^{3} f(t)\,dt = \frac{1}{2}$, and the remaining mass $\frac{1}{2}$ sits at the single point $x = 3$.

References :
[ROY] Real Analysis, H.L. Royden, 3rd Edition, Macmillan Publishing Co.
[RS] An Introduction to Probability and Statistics, V.K. Rohatgi and A.K. Saleh, Second Edition, Wiley Students Edition.