Discrete Random Variables: Joint PMFs, Conditioning and Independence

Berlin Chen
Department of Computer Science & Information Engineering
National Taiwan Normal University

Reference: D. P. Bertsekas, J. N. Tsitsiklis, Introduction to Probability, Sections 2.5-2.7

Motivation

• Given an experiment, e.g., a medical diagnosis
  – The result of a blood test is modeled as the numerical value of a random variable $X$
  – The result of magnetic resonance imaging (MRI) is also modeled as the numerical value of a random variable $Y$
• We would like to consider probabilities of events involving the numerical values of these two variables simultaneously, and to investigate their mutual couplings: $P(X = x, Y = y) = ?$

Joint PMF of Random Variables

• Let $X$ and $Y$ be random variables associated with the same experiment (the same sample space and probability law). The joint PMF of $X$ and $Y$ is defined by
  $p_{X,Y}(x,y) = P(\{X = x\} \cap \{Y = y\}) = P(X = x, Y = y)$
• If event $A$ is the set of all pairs $(x,y)$ that have a certain property, then the probability of $A$ can be calculated by
  $P((X,Y) \in A) = \sum_{(x,y) \in A} p_{X,Y}(x,y)$
  – Here $A$ is specified in terms of $X$ and $Y$

Marginal PMFs of Random Variables (1/2)

• The PMFs of the random variables $X$ and $Y$ can be calculated from their joint PMF:
  $p_X(x) = \sum_y p_{X,Y}(x,y)$,  $p_Y(y) = \sum_x p_{X,Y}(x,y)$
  – $p_X(x)$ and $p_Y(y)$ are often referred to as the marginal PMFs
  – The first of the two equations can be verified by
    $p_X(x) = P(X = x) = \sum_y P(X = x, Y = y) = \sum_y p_{X,Y}(x,y)$

Marginal PMFs of Random Variables (2/2)

• Tabular method: given that the joint PMF of random variables $X$ and $Y$ is specified in a two-dimensional table, the marginal PMF of $X$ or $Y$ at a given value is obtained by adding the table entries along the corresponding column or row, respectively

Functions of Multiple Random Variables (1/2)

• A function $Z = g(X,Y)$ of the random variables $X$ and $Y$ defines another random variable.
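The tabular marginalization just described can be sketched in a few lines of code; the joint PMF values below are made-up for illustration, not taken from the slides:

```python
# Marginal PMFs from a joint PMF table (illustrative values, not from the slides).
# Keys are (x, y) pairs; probabilities must sum to 1.
joint = {
    (1, 1): 0.1, (1, 2): 0.2,
    (2, 1): 0.3, (2, 2): 0.4,
}

def marginal(joint_pmf, axis):
    """Sum the joint PMF over the other variable (axis 0 -> p_X, axis 1 -> p_Y)."""
    pmf = {}
    for pair, prob in joint_pmf.items():
        key = pair[axis]
        pmf[key] = pmf.get(key, 0.0) + prob
    return pmf

p_X = marginal(joint, 0)   # sums each "row":    p_X[1] ≈ 0.3, p_X[2] ≈ 0.7
p_Y = marginal(joint, 1)   # sums each "column": p_Y[1] ≈ 0.4, p_Y[2] ≈ 0.6
```

Each marginal necessarily sums to 1, since the joint PMF does.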
Its PMF can be calculated from the joint PMF $p_{X,Y}$:
  $p_Z(z) = \sum_{\{(x,y)\,\mid\,g(x,y) = z\}} p_{X,Y}(x,y)$
• The expectation of a function of several random variables:
  $E[Z] = E[g(X,Y)] = \sum_x \sum_y g(x,y)\, p_{X,Y}(x,y)$

Functions of Multiple Random Variables (2/2)

• If the function of several random variables is linear, of the form $Z = g(X,Y) = aX + bY + c$, then
  $E[Z] = aE[X] + bE[Y] + c$
  – How can we verify the above equation?

An Illustrative Example

• Given random variables $X$ and $Y$ whose joint PMF is given in a figure in the original slides (entries in units of $1/20$), and a new random variable $Z = X + 2Y$, calculate $E[Z]$
  – Method 1 (linearity):
    $E[X] = 1 \cdot \tfrac{3}{20} + 2 \cdot \tfrac{6}{20} + 3 \cdot \tfrac{8}{20} + 4 \cdot \tfrac{3}{20} = \tfrac{51}{20}$
    $E[Y] = 1 \cdot \tfrac{3}{20} + 2 \cdot \tfrac{7}{20} + 3 \cdot \tfrac{7}{20} + 4 \cdot \tfrac{3}{20} = \tfrac{50}{20}$
    $E[Z] = E[X] + 2E[Y] = \tfrac{51}{20} + 2 \cdot \tfrac{50}{20} = \tfrac{151}{20} = 7.55$
  – Method 2 (direct): compute $p_Z(z) = \sum_{\{(x,y)\,\mid\,x + 2y = z\}} p_{X,Y}(x,y)$:
    $p_Z(3) = \tfrac{1}{20}$, $p_Z(4) = \tfrac{1}{20}$, $p_Z(5) = \tfrac{2}{20}$, $p_Z(6) = \tfrac{2}{20}$, $p_Z(7) = \tfrac{4}{20}$, $p_Z(8) = \tfrac{3}{20}$, $p_Z(9) = \tfrac{3}{20}$, $p_Z(10) = \tfrac{2}{20}$, $p_Z(11) = \tfrac{1}{20}$, $p_Z(12) = \tfrac{1}{20}$
    then $E[Z] = \sum_z z\, p_Z(z) = \tfrac{151}{20} = 7.55$

More than Two Random Variables (1/2)

• The joint PMF of three random variables $X$, $Y$ and $Z$ is defined in analogy with the above as
  $p_{X,Y,Z}(x,y,z) = P(X = x, Y = y, Z = z)$
  – The corresponding marginal PMFs:
    $p_{X,Y}(x,y) = \sum_z p_{X,Y,Z}(x,y,z)$ and $p_X(x) = \sum_y \sum_z p_{X,Y,Z}(x,y,z)$

More than Two Random Variables (2/2)

• The expectation of a function of the random variables $X$, $Y$ and $Z$:
  $E[g(X,Y,Z)] = \sum_x \sum_y \sum_z g(x,y,z)\, p_{X,Y,Z}(x,y,z)$
  – If the function is linear, of the form $aX + bY + cZ + d$, then
    $E[aX + bY + cZ + d] = aE[X] + bE[Y] + cE[Z] + d$
• A generalization to more than three random variables:
  $E[a_1 X_1 + a_2 X_2 + \cdots + a_n X_n] = a_1 E[X_1] + a_2 E[X_2] + \cdots + a_n E[X_n]$

An Illustrative Example

• Example 2.10. Mean of the Binomial. Your probability class has 300 students and each student has probability 1/3 of getting an A, independently of any other student.
  – What is the mean of $X$, the number of students that get an A?
  – Let $X_i = 1$ if the $i$-th student gets an A, and $X_i = 0$ otherwise. Then $X_1, X_2, \ldots, X_{300}$ are Bernoulli random variables with common mean $p = 1/3$
  – Their sum $X = X_1 + X_2 + \cdots + X_{300}$ can be interpreted as a binomial random variable with parameters $n = 300$ and $p = 1/3$; that is, $X$ is the number of successes in $n = 300$ independent trials
  – $E[X] = E[X_1 + X_2 + \cdots + X_{300}] = \sum_{i=1}^{300} E[X_i] = 300 \cdot \tfrac{1}{3} = 100$

Conditioning

• Recall that conditional probability provides us with a way to reason about the outcome of an experiment, based on partial information
• In the same spirit, we can define conditional PMFs, given the occurrence of a certain event or given the value of another random variable

Conditioning a Random Variable on an Event (1/2)

• The conditional PMF of a random variable $X$, conditioned on a particular event $A$ with $P(A) > 0$, is defined by (where $X$ and $A$ are associated with the same experiment)
  $p_{X|A}(x) = P(X = x \mid A) = \dfrac{P(\{X = x\} \cap A)}{P(A)}$
• Normalization property
  – Note that the events $\{X = x\} \cap A$ are disjoint for different values of $x$, and their union is $A$; by the total probability theorem, $P(A) = \sum_x P(\{X = x\} \cap A)$, and hence
    $\sum_x p_{X|A}(x) = \sum_x \dfrac{P(\{X = x\} \cap A)}{P(A)} = \dfrac{P(A)}{P(A)} = 1$

Conditioning a Random Variable on an Event (2/2)

• A graphical illustration: $p_{X|A}(x)$ is obtained by adding the probabilities of the outcomes that give rise to $X = x$ and belong to the conditioning event $A$

Illustrative Examples (1/2)

• Example 2.12. Let $X$ be the roll of a fair six-sided die and let $A$ be the event that the roll is an even number
  $p_{X|A}(x) = P(X = x \mid \text{roll is even}) = \dfrac{P(X = x \text{ and } X \text{ is even})}{P(X \text{ is even})} = \begin{cases} 1/3, & \text{if } x = 2, 4, 6 \\ 0, & \text{otherwise} \end{cases}$

Illustrative Examples (2/2)

• Example 2.13. A student will take a certain test repeatedly, up to a maximum of $n$ times, each time with a probability $p$ of passing, independently of the number of previous attempts.
  – What is the PMF of the number of attempts, given that the student passes the test?
  – Let $X$ be a geometric random variable with parameter $p$, representing the number of attempts until the first success comes up: $p_X(x) = (1-p)^{x-1} p$, for $x = 1, 2, \ldots$
  – Let $A$ be the event that the student passes the test within $n$ attempts, i.e., $A = \{X \le n\}$, so $P(A) = \sum_{m=1}^{n} (1-p)^{m-1} p$
  – Then
    $p_{X|A}(x) = \begin{cases} \dfrac{(1-p)^{x-1} p}{\sum_{m=1}^{n} (1-p)^{m-1} p}, & \text{if } x = 1, 2, \ldots, n \\ 0, & \text{otherwise} \end{cases}$

Conditioning a Random Variable on Another (1/2)

• Let $X$ and $Y$ be two random variables associated with the same experiment. The conditional PMF $p_{X|Y}$ of $X$ given $Y$ is defined as
  $p_{X|Y}(x \mid y) = P(X = x \mid Y = y) = \dfrac{P(X = x, Y = y)}{P(Y = y)} = \dfrac{p_{X,Y}(x,y)}{p_Y(y)}$
• Normalization property: with $Y$ fixed at some value $y$,
  $\sum_x p_{X|Y}(x \mid y) = 1$
• The conditional PMF is often convenient for calculating the joint PMF, via the multiplication (chain) rule:
  $p_{X,Y}(x,y) = p_Y(y)\, p_{X|Y}(x \mid y) = p_X(x)\, p_{Y|X}(y \mid x)$

Conditioning a Random Variable on Another (2/2)

• The conditional PMF can also be used to calculate the marginal PMFs:
  $p_X(x) = \sum_y p_{X,Y}(x,y) = \sum_y p_Y(y)\, p_{X|Y}(x \mid y)$
• Visualization of the conditional PMF $p_{X|Y}$: for each fixed $y$,
  $p_{X|Y}(x \mid y) = \dfrac{p_{X,Y}(x,y)}{p_Y(y)} = \dfrac{p_{X,Y}(x,y)}{\sum_x p_{X,Y}(x,y)}$
  is a "slice" of the joint PMF, renormalized to sum to 1

An Illustrative Example (1/2)

• Example 2.14. Professor May B. Right often has her facts wrong, and answers each of her students' questions incorrectly with probability 1/4, independently of other questions. In each lecture May is asked 0, 1, or 2 questions with equal probability 1/3.
  – What is the probability that she gives at least one wrong answer?
  – Let $X$ be the number of questions asked and $Y$ the number of questions answered wrong. Given $X = n$, $Y$ is binomial: $P(Y = k \mid X = n) = \binom{n}{k} \left(\tfrac{1}{4}\right)^k \left(\tfrac{3}{4}\right)^{n-k}$
  – Then
    $P(Y \ge 1) = P(Y = 1) + P(Y = 2) = P(X = 1, Y = 1) + P(X = 2, Y = 1) + P(X = 2, Y = 2)$
    $= P(X = 1) P(Y = 1 \mid X = 1) + P(X = 2) P(Y = 1 \mid X = 2) + P(X = 2) P(Y = 2 \mid X = 2)$
    $= \tfrac{1}{3} \cdot \tfrac{1}{4} + \tfrac{1}{3} \cdot 2 \cdot \tfrac{1}{4} \cdot \tfrac{3}{4} + \tfrac{1}{3} \cdot \left(\tfrac{1}{4}\right)^2 = \tfrac{11}{48}$

An Illustrative Example (2/2)

• Calculation of the joint PMF $p_{X,Y}(x,y)$ in Example 2.14 (shown as a table in the original slides)
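The calculation in Example 2.14 can be reproduced by building the joint PMF from the multiplication rule and summing over the event of interest; a minimal sketch (variable names are mine, not from the text):

```python
from math import comb

# Example 2.14: P(at least one wrong answer).
p_wrong = 1 / 4                  # probability each individual answer is wrong
p_X = {0: 1/3, 1: 1/3, 2: 1/3}   # number of questions asked per lecture

def p_Y_given_X(k, n, p=p_wrong):
    """Binomial conditional PMF P(Y = k | X = n)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Joint PMF via the multiplication rule p_{X,Y}(x,y) = p_X(x) p_{Y|X}(y|x).
joint = {(x, y): p_X[x] * p_Y_given_X(y, x)
         for x in p_X for y in range(x + 1)}

p_at_least_one_wrong = sum(prob for (x, y), prob in joint.items() if y >= 1)
print(p_at_least_one_wrong)      # ≈ 0.22917, i.e. 11/48
```

Summing the joint PMF over the complementary event $\{Y = 0\}$ and subtracting from 1 would give the same answer.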
Conditional Expectation

• Recall that a conditional PMF can be thought of as an ordinary PMF over a new universe determined by the conditioning event
• In the same spirit, a conditional expectation is the same as an ordinary expectation, except that it refers to the new universe, and all probabilities and PMFs are replaced by their conditional counterparts

Summary of Facts About Conditional Expectations

• Let $X$ and $Y$ be two random variables associated with the same experiment
  – The conditional expectation of $X$ given an event $A$ with $P(A) > 0$ is defined by
    $E[X \mid A] = \sum_x x\, p_{X|A}(x)$
  • For a function $g(X)$, it is given by
    $E[g(X) \mid A] = \sum_x g(x)\, p_{X|A}(x)$

Total Expectation Theorem (1/2)

• The conditional expectation of $X$ given a value $y$ of $Y$ is defined by
  $E[X \mid Y = y] = \sum_x x\, p_{X|Y}(x \mid y)$
  – We have
    $E[X] = \sum_y p_Y(y)\, E[X \mid Y = y]$
• Let $A_1, \ldots, A_n$ be disjoint events that form a partition of the sample space, and assume that $P(A_i) > 0$ for all $i$. Then
  $E[X] = \sum_{i=1}^{n} P(A_i)\, E[X \mid A_i]$

Total Expectation Theorem (2/2)

• Let $A_1, \ldots, A_n$ be disjoint events that form a partition of an event $B$, and assume that $P(A_i \cap B) > 0$ for all $i$. Then
  $E[X \mid B] = \sum_{i=1}^{n} P(A_i \mid B)\, E[X \mid A_i \cap B]$
• Verification of the total expectation theorem:
  $E[X] = \sum_x x\, p_X(x) = \sum_x x \sum_y p_Y(y)\, p_{X|Y}(x \mid y) = \sum_y p_Y(y) \sum_x x\, p_{X|Y}(x \mid y) = \sum_y p_Y(y)\, E[X \mid Y = y]$

An Illustrative Example (1/2)

• Example 2.17. Mean and Variance of the Geometric Random Variable.
  – A geometric random variable $X$ has PMF $p_X(x) = (1-p)^{x-1} p$, $x = 1, 2, \ldots$
  – Let $A_1$ be the event $\{X = 1\}$ and $A_2$ the event $\{X > 1\}$, so that
    $E[X] = P(A_1)\, E[X \mid A_1] + P(A_2)\, E[X \mid A_2]$, where $P(A_1) = p$ and $P(A_2) = 1 - p$
  – The conditional PMFs are
    $p_{X|A_1}(x) = \begin{cases} 1, & x = 1 \\ 0, & \text{otherwise} \end{cases}$,  $p_{X|A_2}(x) = \begin{cases} (1-p)^{x-2} p, & x = 2, 3, \ldots \\ 0, & \text{otherwise} \end{cases}$
  – (Compare Example 2.13: conditioned on $\{X \le n\}$, the geometric PMF is renormalized over $x = 1, \ldots, n$; here, conditioned on $\{X > 1\}$, it is the geometric PMF shifted up by one)
  – Thus $E[X \mid A_1] = 1$, and
    $E[X \mid A_2] = \sum_{x=2}^{\infty} x (1-p)^{x-2} p = \sum_{x=1}^{\infty} (x+1)(1-p)^{x-1} p = E[X] + 1$
  – Hence
    $E[X] = p \cdot 1 + (1-p)(1 + E[X]) = 1 + (1-p)E[X]$, which gives $E[X] = \dfrac{1}{p}$

An Illustrative Example (2/2)

• Similarly, for the second moment:
  $E[X^2] = P(A_1)\, E[X^2 \mid A_1] + P(A_2)\, E[X^2 \mid A_2]$
  where $E[X^2 \mid A_1] = 1$ and
  $E[X^2 \mid A_2] = \sum_{x=2}^{\infty} x^2 (1-p)^{x-2} p = \sum_{x=1}^{\infty} (x+1)^2 (1-p)^{x-1} p = E[(X+1)^2] = E[X^2] + 2E[X] + 1$
• Therefore
  $E[X^2] = p + (1-p)\left(E[X^2] + 2E[X] + 1\right) = 1 + (1-p)\left(E[X^2] + 2E[X]\right)$
  – Since we have shown that $E[X] = 1/p$, solving for $E[X^2]$ gives
    $E[X^2] = \dfrac{2}{p^2} - \dfrac{1}{p}$
• Finally,
  $\operatorname{var}(X) = E[X^2] - (E[X])^2 = \dfrac{2}{p^2} - \dfrac{1}{p} - \dfrac{1}{p^2} = \dfrac{1-p}{p^2}$

Independence of a Random Variable from an Event

• A random variable $X$ is independent of an event $A$ if
  $P(\{X = x\} \cap A) = P(X = x)\, P(A)$, for all $x$
  – This requires the two events $\{X = x\}$ and $A$ to be independent for all $x$
• If a random variable $X$ is independent of an event $A$ and $P(A) > 0$, then
  $p_{X|A}(x) = \dfrac{P(\{X = x\} \cap A)}{P(A)} = \dfrac{P(X = x)\, P(A)}{P(A)} = P(X = x) = p_X(x)$, for all $x$

An Illustrative Example

• Example 2.19. Consider two independent tosses of a fair coin.
  – Let random variable $X$ be the number of heads
  – Let random variable $Y$ be 0 if the first toss is a head, and 1 if the first toss is a tail
  – Let $A$ be the event that the number of heads is even
  • Possible outcomes: (T,T), (T,H), (H,T), (H,H)
  – We have $P(A) = 1/2$ and
    $p_X(x) = \begin{cases} 1/4, & \text{if } x = 0 \\ 1/2, & \text{if } x = 1 \\ 1/4, & \text{if } x = 2 \end{cases}$,  $p_{X|A}(x) = \begin{cases} 1/2, & \text{if } x = 0 \\ 0, & \text{if } x = 1 \\ 1/2, & \text{if } x = 2 \end{cases}$
    so $p_{X|A}(x) \ne p_X(x)$: $X$ and $A$ are not independent!
  – On the other hand,
    $p_Y(y) = \begin{cases} 1/2, & \text{if } y = 0 \\ 1/2, & \text{if } y = 1 \end{cases}$,  $p_{Y|A}(y) = \dfrac{P(\{Y = y\} \cap A)}{P(A)} = \begin{cases} 1/2, & \text{if } y = 0 \\ 1/2, & \text{if } y = 1 \end{cases}$
    so $p_{Y|A}(y) = p_Y(y)$: $Y$ and $A$ are independent!
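Example 2.19 can be verified by brute-force enumeration of the four-outcome sample space; a small sketch (helper names are my own):

```python
from itertools import product
from fractions import Fraction

# Example 2.19: two independent fair coin tosses; check independence of
# X (number of heads) and Y (first-toss indicator) from A = {even # of heads}.
outcomes = list(product("HT", repeat=2))
p_outcome = Fraction(1, 4)                  # each outcome is equally likely

def X(w): return w.count("H")               # number of heads
def Y(w): return 0 if w[0] == "H" else 1    # 0 if first toss is a head

A = [w for w in outcomes if X(w) % 2 == 0]  # even number of heads
P_A = len(A) * p_outcome                    # = 1/2

def pmf(var, sample):
    """PMF of var restricted (and renormalized) to the given set of outcomes."""
    probs = {}
    total = len(sample) * p_outcome
    for w in sample:
        probs[var(w)] = probs.get(var(w), Fraction(0)) + p_outcome / total
    return probs

print(pmf(X, A) == pmf(X, outcomes))   # False: X and A are not independent
print(pmf(Y, A) == pmf(Y, outcomes))   # True:  Y and A are independent
```

Using `Fraction` keeps the comparison exact, so equality of the conditional and unconditional PMFs is tested without floating-point tolerance.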
Independence of Random Variables (1/2)

• Two random variables $X$ and $Y$ are independent if
  $p_{X,Y}(x,y) = p_X(x)\, p_Y(y)$, for all $x, y$
  or, equivalently, $P(X = x, Y = y) = P(X = x)\, P(Y = y)$, for all $x, y$
• If a random variable $X$ is independent of a random variable $Y$, then
  $p_{X|Y}(x \mid y) = \dfrac{p_{X,Y}(x,y)}{p_Y(y)} = \dfrac{p_X(x)\, p_Y(y)}{p_Y(y)} = p_X(x)$, for all $y$ with $p_Y(y) > 0$ and all $x$

Independence of Random Variables (2/2)

• Random variables $X$ and $Y$ are said to be conditionally independent, given a positive-probability event $A$, if
  $p_{X,Y|A}(x,y) = p_{X|A}(x)\, p_{Y|A}(y)$, for all $x, y$
  – Or, equivalently, $p_{X|Y,A}(x \mid y) = p_{X|A}(x)$, for all $y$ with $p_{Y|A}(y) > 0$ and all $x$
• Note here that, as in the case of events, conditional independence may not imply unconditional independence, and vice versa

An Illustrative Example (1/2)

• Figure 2.15: an example illustrating that conditional independence may not imply unconditional independence
  – For the PMF shown, the random variables $X$ and $Y$ are not independent
  • To show that $X$ and $Y$ are not independent, we only have to find one pair of values $(x, y)$ of $X$ and $Y$ such that $p_{X|Y}(x \mid y) \ne p_X(x)$
  – For example,
    $p_{X|Y}(1 \mid 1) = 0 \ne p_X(1) = \tfrac{3}{20}$
    so $X$ and $Y$ are not independent

An Illustrative Example (2/2)

• To show that $X$ and $Y$ are conditionally independent given an event, we must instead check that $p_{X|Y,A}(x \mid y) = p_{X|A}(x)$ for all pairs $(x, y)$
  – For this PMF, $X$ and $Y$ are independent conditioned on the event $A = \{X \le 2, Y \ge 3\}$, for which $P(A) = \tfrac{9}{20}$
  – Checking the conditional PMFs:
    $p_{X|Y,A}(1 \mid 3) = \dfrac{1/20}{3/20} = \dfrac{1}{3}$,  $p_{X|Y,A}(1 \mid 4) = \dfrac{2/20}{6/20} = \dfrac{1}{3}$
    $p_{X|Y,A}(2 \mid 3) = \dfrac{2/20}{3/20} = \dfrac{2}{3}$,  $p_{X|Y,A}(2 \mid 4) = \dfrac{4/20}{6/20} = \dfrac{2}{3}$
  – and, using $p_{X|A}(x) = \dfrac{\sum_y P(X = x, Y = y, A)}{P(A)}$,
    $p_{X|A}(1) = \dfrac{3/20}{9/20} = \dfrac{1}{3}$,  $p_{X|A}(2) = \dfrac{6/20}{9/20} = \dfrac{2}{3}$
    so $p_{X|Y,A}(x \mid y) = p_{X|A}(x)$ throughout the conditioning event

Functions of Two Independent Random Variables

• Let $X$ and $Y$ be two independent random variables, and let $g(X)$ and $h(Y)$ be functions of $X$ and $Y$, respectively. Show that $g(X)$ and $h(Y)$ are independent.
  – Let $U = g(X)$ and $V = h(Y)$. Then
    $p_{U,V}(u,v) = \sum_{\{(x,y)\,\mid\,g(x) = u,\, h(y) = v\}} p_{X,Y}(x,y) = \sum_{\{(x,y)\,\mid\,g(x) = u,\, h(y) = v\}} p_X(x)\, p_Y(y)$
    $= \sum_{\{x\,\mid\,g(x) = u\}} p_X(x) \sum_{\{y\,\mid\,h(y) = v\}} p_Y(y) = p_U(u)\, p_V(v)$

More Facts About Independent Random Variables (1/2)

• If $X$ and $Y$ are independent random variables, then
  $E[XY] = E[X]\, E[Y]$
  – As shown by the following calculation:
    $E[XY] = \sum_x \sum_y xy\, p_{X,Y}(x,y) = \sum_x \sum_y xy\, p_X(x)\, p_Y(y)$ (by independence)
    $= \sum_x x\, p_X(x) \sum_y y\, p_Y(y) = E[X]\, E[Y]$
• Similarly, if $X$ and $Y$ are independent random variables, then
  $E[g(X)\, h(Y)] = E[g(X)]\, E[h(Y)]$

More Facts About Independent Random Variables (2/2)

• If $X$ and $Y$ are independent random variables, then
  $\operatorname{var}(X + Y) = \operatorname{var}(X) + \operatorname{var}(Y)$
  – As shown by the following calculation:
    $\operatorname{var}(X + Y) = E\big[(X + Y - E[X + Y])^2\big] = E\big[(X + Y)^2\big] - (E[X] + E[Y])^2$
    $= E[X^2] + 2E[XY] + E[Y^2] - (E[X])^2 - 2E[X]E[Y] - (E[Y])^2$
    $= E[X^2] - (E[X])^2 + E[Y^2] - (E[Y])^2$ (since $E[XY] = E[X]E[Y]$ by independence)
    $= \operatorname{var}(X) + \operatorname{var}(Y)$

More than Two Random Variables

• Independence of several random variables
  – Three random variables $X$, $Y$ and $Z$ are independent if
    $p_{X,Y,Z}(x,y,z) = p_X(x)\, p_Y(y)\, p_Z(z)$, for all $x, y, z$
    (compare with the conditions to be satisfied by three independent events $A_1$, $A_2$ and $A_3$, on p. 39 of the textbook)
  • If $X$, $Y$ and $Z$ are independent, then any three random variables of the form $f(X)$, $g(Y)$ and $h(Z)$ are also independent
• Variance of the sum of independent random variables
  – If $X_1, X_2, \ldots, X_n$ are independent random variables, then
    $\operatorname{var}(X_1 + X_2 + \cdots + X_n) = \operatorname{var}(X_1) + \operatorname{var}(X_2) + \cdots + \operatorname{var}(X_n)$

Illustrative Examples (1/3)

• Example 2.20. Variance of the Binomial. We consider $n$ independent coin tosses, with each toss having probability $p$ of coming up a head. For each $i$, we let $X_i$ be the Bernoulli random variable which is equal to 1 if the $i$-th toss comes up a head, and 0 otherwise.
  – Then $X = X_1 + X_2 + \cdots + X_n$ is a binomial random variable, and since the $X_i$ are independent,
    $\operatorname{var}(X_i) = p(1-p)$ for all $i$, so $\operatorname{var}(X) = \sum_{i=1}^{n} \operatorname{var}(X_i) = np(1-p)$
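The identity $\operatorname{var}(X) = np(1-p)$ in Example 2.20 can be checked against the exact binomial PMF; the parameter values $n = 10$, $p = 0.3$ below are arbitrary illustrative choices:

```python
from math import comb

def binomial_pmf(n, p):
    """Exact PMF of the number of heads in n independent tosses."""
    return {k: comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)}

def mean_var(pmf):
    """Mean and variance of a PMF given as {value: probability}."""
    m = sum(k * q for k, q in pmf.items())
    v = sum((k - m) ** 2 * q for k, q in pmf.items())
    return m, v

n, p = 10, 0.3                   # arbitrary illustrative parameters
m, v = mean_var(binomial_pmf(n, p))
print(m, v)                      # close to n*p = 3.0 and n*p*(1-p) = 2.1
```

The same check works for any $n$ and $p$, mirroring the general formulas $E[X] = np$ and $\operatorname{var}(X) = np(1-p)$.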
Illustrative Examples (2/3)

• Example 2.21. Mean and Variance of the Sample Mean. We wish to estimate the approval rating of a president, to be called B. To this end, we ask $n$ persons drawn at random from the voter population, and we let $X_i$ be a random variable that encodes the response of the $i$-th person:
  $X_i = \begin{cases} 1, & \text{if the } i\text{-th person approves B's performance} \\ 0, & \text{if the } i\text{-th person disapproves B's performance} \end{cases}$
  – Assume that the $X_i$ are independent and identically distributed (i.i.d.) Bernoulli random variables with common parameter $p$, which is unknown to us
  – The sample mean $S_n$ (itself a random variable) is defined as
    $S_n = \dfrac{X_1 + X_2 + \cdots + X_n}{n}$

Illustrative Examples (3/3)

  – The expectation of $S_n$ is the true mean of the $X_i$:
    $E[S_n] = E\!\left[\dfrac{X_1 + X_2 + \cdots + X_n}{n}\right] = \dfrac{1}{n} \sum_{i=1}^{n} E[X_i] = E[X_i]$  ($= p$ for the Bernoulli assumed here)
  – The variance of $S_n$ approaches 0 as $n$ grows:
    $\lim_{n \to \infty} \operatorname{var}(S_n) = \lim_{n \to \infty} \dfrac{\sum_{i=1}^{n} \operatorname{var}(X_i)}{n^2} = \lim_{n \to \infty} \dfrac{np(1-p)}{n^2} = \lim_{n \to \infty} \dfrac{p(1-p)}{n} = 0$
  • This means that $S_n$ is a good estimate of $E[X_i]$ if $n$ is large enough

Recitation

• SECTION 2.5 Joint PMFs of Multiple Random Variables
  – Problems 27, 28, 30
• SECTION 2.6 Conditioning
  – Problems 33, 34, 35, 37
• SECTION 2.7 Independence
  – Problems 42, 43, 45, 46
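As a closing illustration of Example 2.21, a small simulation (the value $p = 0.6$ and the sample sizes below are arbitrary choices, not from the text) shows the variance of the sample mean shrinking like $p(1-p)/n$:

```python
import random

random.seed(0)

def sample_mean(n, p):
    """One realization of S_n for n i.i.d. Bernoulli(p) responses."""
    return sum(random.random() < p for _ in range(n)) / n

p = 0.6                 # hypothetical true approval rating
for n in (10, 100, 1000):
    trials = [sample_mean(n, p) for _ in range(2000)]
    mean = sum(trials) / len(trials)
    var = sum((s - mean) ** 2 for s in trials) / len(trials)
    # The empirical variance should be close to p*(1-p)/n, which shrinks with n.
    print(n, round(mean, 3), round(var, 5), round(p * (1 - p) / n, 5))
```

At each $n$ the empirical mean stays near $p$, while the empirical variance tracks $p(1-p)/n$, which is the sense in which $S_n$ becomes a good estimate of the true mean.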