Chapter 4: Joint and Conditional Distributions

Yang Zhenlin
http://www.mysmu.edu/faculty/zlyang/

Chapter Contents

• Joint Distribution
• Special Joint Distributions: Multinomial and Bivariate Normal
• Covariance and Correlation Coefficient
• Conditional Distribution
• Conditional Expectation
• Conditional Variance

STAT151, Term II, 09/10 / STAT306, Term II, 14-15 © Zhenlin Yang, SMU

Introduction

In many applications, more than one variable is needed to describe a quantity or a phenomenon of interest, e.g.,

• To describe the size of a man, one needs at least height (X) and weight (Y).
• To describe a point in a rectangle, one needs the X coordinate and the Y coordinate.

In general, a set of k r.v.s corresponds to the same "unit": the r.v.s are defined on the same sample space and jointly take values in a k-dimensional Euclidean space.

In this chapter, we focus mainly on the case of two r.v.s, and deal separately with two cases:

• both X and Y are discrete
• both X and Y are continuous

Joint Distributions

Definition 4.1. (Joint CDF) The joint cumulative distribution function of r.v.s X and Y is the function defined by

$$F(x, y) = P(X \le x,\; Y \le y).$$

Definition 4.1 extends naturally to cases of more than two r.v.s. It applies to both discrete and continuous r.v.s.

Definition 4.2. Let X and Y be two discrete random variables defined on the same sample space. The joint probability mass function of X and Y is defined to be

$$p(x, y) = P(X = x,\; Y = y)$$

for all possible values of X and Y. Definition 4.2 extends directly to cases of more than two r.v.s.

Example 4.1. Xavier and Yvette are two real estate agents. Let X and Y denote the number of houses that Xavier and Yvette will sell next week, respectively. Suppose that there are only four houses for sale next week.
The joint probability mass function and its graph are presented below. Find $P(X \ge 1, Y \ge 1)$ and $P(Y \ge 1)$.

Joint pmf p(x, y):

          X = 0   X = 1   X = 2
Y = 0      .12     .42     .06
Y = 1      .21     .06     .03
Y = 2      .07     .02     .01

[Figure: bar plot of the joint pmf p(x, y) over x = 0, 1, 2 and y = 0, 1, 2]

Answer: $P(X \ge 1, Y \ge 1) = .06 + .03 + .02 + .01 = 0.12$ and $P(Y \ge 1) = .21 + .06 + .03 + .07 + .02 + .01 = 0.40$.

Example 4.2. A bin contains 1000 flower seeds, of which 400 are red, 400 are white and 200 are pink. Ten seeds are selected at random without replacement. Let X be the number of red flower seeds and Y be the number of white flower seeds selected.
(a) Find the joint pmf of X and Y.
(b) Calculate P(X = 2, Y = 3) and P(X = Y).

Solution: (a) From the counting techniques in Chapter 1, we obtain

$$p(x, y) = \frac{\binom{400}{x}\binom{400}{y}\binom{200}{10-x-y}}{\binom{1000}{10}}, \quad x \ge 0,\; y \ge 0,\; x + y \le 10.$$

(b)

$$P(X = 2, Y = 3) = \frac{\binom{400}{2}\binom{400}{3}\binom{200}{5}}{\binom{1000}{10}} \approx 0.0081,$$

$$P(X = Y) = \sum_{i=0}^{5} P(X = i, Y = i) = \sum_{i=0}^{5} \frac{\binom{400}{i}\binom{400}{i}\binom{200}{10-2i}}{\binom{1000}{10}} \approx 0.139.$$

A function p(x, y) is the joint pmf of discrete r.v.s X and Y if and only if, for all possible values (x, y),

(i) $p(x, y) \ge 0$, and (ii) $\sum_x \sum_y p(x, y) = 1$.

Example 4.3. Let the joint pmf of X and Y be given by

$$p(x, y) = \begin{cases} k(x^2 + y^2), & \text{if } (x, y) = (1,1),\ (1,2),\ (2,3),\ (3,3) \\ 0, & \text{otherwise.} \end{cases}$$

(a) Find the value of the constant k.
(b) Calculate $P(X > Y)$, $P(X + Y \le 4)$, and $P(Y \ge X)$.

Solution: (a)

$$1 = \sum_x \sum_y p(x, y) = k\left[(1^2 + 1^2) + (1^2 + 2^2) + (2^2 + 3^2) + (3^2 + 3^2)\right] = 38k, \quad \text{so } k = 1/38.$$

(b) $P(X > Y) = 0$, $P(X + Y \le 4) = 7/38$, and $P(Y \ge X) = 1$.

Definition 4.3. A function f(x, y) is said to be the joint probability density function of the continuous r.v.s X and Y if the joint CDF of X and Y can be written as

$$F(x, y) = \int_{-\infty}^{x} \int_{-\infty}^{y} f(u, v)\,dv\,du, \quad \text{for all } x \text{ and } y.$$
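The discrete calculations in Examples 4.2 and 4.3 can be reproduced with exact rational arithmetic. The following is a minimal sketch, not part of the original slides; the function and variable names (`seed_pmf`, `pmf43`, and so on) are illustrative choices.

```python
from fractions import Fraction
from math import comb

def seed_pmf(x, y, n=10, red=400, white=400, pink=200):
    """Joint pmf of Example 4.2: x red and y white seeds among n drawn
    without replacement (the remaining n - x - y seeds are pink)."""
    if x < 0 or y < 0 or x + y > n:
        return Fraction(0)
    ways = comb(red, x) * comb(white, y) * comb(pink, n - x - y)
    return Fraction(ways, comb(red + white + pink, n))

# Condition (ii): the pmf sums to 1 over the support x, y >= 0, x + y <= 10.
support = [(x, y) for x in range(11) for y in range(11 - x)]
total_mass = sum(seed_pmf(x, y) for x, y in support)

p_2_3 = seed_pmf(2, 3)                              # P(X = 2, Y = 3)
p_equal = sum(seed_pmf(i, i) for i in range(6))     # P(X = Y)

# Example 4.3: p(x, y) = k (x^2 + y^2) on four support points.
points = [(1, 1), (1, 2), (2, 3), (3, 3)]
k = Fraction(1, sum(x * x + y * y for x, y in points))
pmf43 = {(x, y): k * (x * x + y * y) for x, y in points}
p_x_gt_y = sum(p for (x, y), p in pmf43.items() if x > y)
p_sum_le_4 = sum(p for (x, y), p in pmf43.items() if x + y <= 4)
p_y_ge_x = sum(p for (x, y), p in pmf43.items() if y >= x)

print("total mass:", total_mass)
print("P(X=2, Y=3):", float(p_2_3))
print("P(X=Y):", float(p_equal))
print("k:", k, "| P(X>Y):", p_x_gt_y,
      "| P(X+Y<=4):", p_sum_le_4, "| P(Y>=X):", p_y_ge_x)
```

Using `Fraction` keeps every probability exact, so the normalization check is an equality rather than a floating-point comparison.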
A function f(x, y) is the joint pdf of continuous r.v.s X and Y if and only if, for all possible values (x, y),

(i) $f(x, y) \ge 0$, and (ii) $\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x, y)\,dx\,dy = 1$.

Marginal pmf: $p_X(x) = \sum_y p(x, y)$; $p_Y(y) = \sum_x p(x, y)$.

Marginal pdf: $f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy$; $f_Y(y) = \int_{-\infty}^{\infty} f(x, y)\,dx$.

In Example 4.3, the marginal pmfs of X and Y are given below:

x          1       2       3
p_X(x)     7/38    13/38   18/38

y          1       2       3
p_Y(y)     2/38    5/38    31/38

Example 4.4. Let the joint pdf be given by

$$f(x, y) = \begin{cases} k x y^2, & \text{if } 0 \le x \le y \le 1 \\ 0, & \text{otherwise.} \end{cases}$$

(a) Find the value of the constant k.
(b) Find the marginal pdfs of X and Y.
(c) Calculate $P(X + Y < 1)$, $P(2X < Y)$, and $P(X = Y)$.

Solution: Some points to note:

• Finding the constant k and the probabilities are matters of double integration.
• It is important to draw the regions over which the integrations are carried out, so that the integration limits can be determined.

[Figure: the triangular region 0 ≤ x ≤ y ≤ 1, above the line X = Y in the unit square]

(a)

$$1 = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x, y)\,dx\,dy = k \int_0^1 \int_0^y x y^2\,dx\,dy = k \int_0^1 \left[\tfrac{1}{2} x^2 y^2\right]_{x=0}^{x=y} dy = \frac{k}{2} \int_0^1 y^4\,dy = \frac{k}{10}, \quad \text{so } k = 10.$$

(b) The marginal pdfs are

$$f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy = \int_x^1 10 x y^2\,dy = \frac{10}{3}\,x(1 - x^3), \quad 0 \le x \le 1,$$

$$f_Y(y) = \int_{-\infty}^{\infty} f(x, y)\,dx = \int_0^y 10 x y^2\,dx = 5y^4, \quad 0 \le y \le 1.$$

(c)

[Figure: the region x ≤ y, x + y ≤ 1, bounded by the lines X = Y and X + Y = 1]

$$P(X + Y < 1) = \int_0^{0.5} \int_x^{1-x} 10 x y^2\,dy\,dx = \frac{10}{3} \int_0^{0.5} x\left[(1 - x)^3 - x^3\right] dx = \frac{10}{3} \int_0^{0.5} \left(x - 3x^2 + 3x^3 - 2x^4\right) dx = 0.1146.$$

[Figure: the region 2x ≤ y ≤ 1, bounded by the lines 2X = Y and X = Y]

$$P(2X < Y) = \int_0^{0.5} \int_{2x}^{1} 10 x y^2\,dy\,dx = \frac{10}{3} \int_0^{0.5} x(1 - 8x^3)\,dx = \frac{1}{4}.$$

Finally,

$$P(X = Y) = \int_0^1 \int_x^x 10 x y^2\,dy\,dx = 0.$$

Definition 4.4. Two random variables X and Y are said to be independent if and only if

$$P(X \le x,\; Y \le y) = P(X \le x)\,P(Y \le y)$$

for all possible values (x, y) of (X, Y).
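The double integrals in Example 4.4 can be checked numerically. The sketch below is an illustration rather than part of the original slides: it uses a plain midpoint rule with y-limits that depend on x, and the helper name `double_integral` and the grid size `n` are assumptions made for this example. The region for k is written as x ≤ y ≤ 1 with x from 0 to 1, which is the same triangle as 0 ≤ x ≤ y in the text.

```python
def double_integral(f, x_lo, x_hi, y_lo, y_hi, n=400):
    """Midpoint-rule approximation of the integral of f(x, y) over
    x_lo <= x <= x_hi and y_lo(x) <= y <= y_hi(x)."""
    hx = (x_hi - x_lo) / n
    total = 0.0
    for i in range(n):
        x = x_lo + (i + 0.5) * hx
        a, b = y_lo(x), y_hi(x)
        hy = (b - a) / n
        inner = sum(f(x, a + (j + 0.5) * hy) for j in range(n))
        total += inner * hy * hx
    return total

def kernel(x, y):
    """Unnormalized density x * y^2 from Example 4.4."""
    return x * y ** 2

# (a) k is the reciprocal of the kernel's integral over 0 <= x <= y <= 1.
mass = double_integral(kernel, 0.0, 1.0, lambda x: x, lambda x: 1.0)
k = 1.0 / mass

def f(x, y):
    """Joint pdf on its support, using the numerically estimated k."""
    return k * kernel(x, y)

# (c) P(X + Y < 1): x <= y <= 1 - x, which requires x <= 0.5.
p_sum_lt_1 = double_integral(f, 0.0, 0.5, lambda x: x, lambda x: 1.0 - x)
# (c) P(2X < Y): 2x <= y <= 1, which also requires x <= 0.5.
p_2x_lt_y = double_integral(f, 0.0, 0.5, lambda x: 2.0 * x, lambda x: 1.0)

print("k ~", round(k, 4))
print("P(X + Y < 1) ~", round(p_sum_lt_1, 4))
print("P(2X < Y) ~", round(p_2x_lt_y, 4))
```

The midpoint rule is used only because it needs no external library; any quadrature routine would serve the same purpose.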
Note: This definition states that X and Y are independent if and only if their joint CDF can be written as the product of their marginal CDFs, i.e., $F(x, y) = F_X(x)\,F_Y(y)$.

• When X and Y are both discrete, the independence condition can be written as $P(X = x, Y = y) = P(X = x)\,P(Y = y)$ for all x and y, i.e., the joint pmf is the product of the marginal pmfs.
• When X and Y are both continuous, the independence condition can be written as $f(x, y) = f_X(x)\,f_Y(y)$, i.e., the joint pdf is the product of the marginal pdfs.

Definition 4.4 extends naturally to the case of many random variables.

Example 4.5. Stores A and B, which belong to the same owner, are located in two different towns. The probability density function of the weekly profit of each store, in thousands of dollars, is given by

$$f(x) = \begin{cases} x/4, & \text{if } 1 \le x \le 3 \\ 0, & \text{otherwise,} \end{cases}$$

and the profit of one store is independent of the other. What is the probability that next week one store makes at least $500 more than the other store?

Solution: Let X and Y denote, respectively, next week's profits of stores A and B. The desired probability is

$$P(X \ge Y + 1/2) + P(Y \ge X + 1/2).$$

Since X and Y are independent and have the same distribution, by symmetry,

$$P(X \ge Y + 1/2) + P(Y \ge X + 1/2) = 2\,P(X \ge Y + 1/2).$$

To calculate $P(X \ge Y + 1/2)$, we need the joint pdf of X and Y. Since X and Y are independent, we have

$$f(x, y) = f_X(x)\,f_Y(y) = \begin{cases} xy/16, & \text{if } 1 \le x \le 3,\ 1 \le y \le 3 \\ 0, & \text{otherwise.} \end{cases}$$

To find $P(X \ge Y + 1/2)$, one needs to integrate f(x, y) over the region defined by the conditions $1 \le X \le 3$, $1 \le Y \le 3$, and $X \ge Y + 1/2$.

[Figure: the region X ≥ Y + 1/2 inside the square 1 ≤ X ≤ 3, 1 ≤ Y ≤ 3, below the line X = Y + 1/2]

$$2\,P(X \ge Y + 1/2) = 2 \int_{3/2}^{3} \int_{1}^{x - 1/2} \frac{xy}{16}\,dy\,dx = \frac{1}{16} \int_{3/2}^{3} \left[x y^2\right]_{y=1}^{y=x-1/2} dx = \frac{1}{16} \int_{3/2}^{3} \left(x^3 - x^2 - \tfrac{3}{4}x\right) dx = 0.54.$$

Example 4.6.
Prove that the two random variables X and Y with the following joint probability density function are not independent:

$$f(x, y) = \begin{cases} 8xy, & \text{if } 0 \le x \le y \le 1 \\ 0, & \text{otherwise.} \end{cases}$$

Solution: The marginal pdfs are

$$f_X(x) = \int_x^1 8xy\,dy = 4x(1 - x^2), \quad 0 \le x \le 1,$$

$$f_Y(y) = \int_0^y 8xy\,dx = 4y^3, \quad 0 \le y \le 1.$$

Since $f(x, y) \ne f_X(x)\,f_Y(y)$, X and Y are NOT independent.

Special Joint Distributions

Certain special joint distributions, such as the multinomial and the bivariate normal, deserve some detailed attention.

The multinomial is a direct generalization of the binomial. An experiment has k possible outcomes with probabilities $\pi_1, \pi_2, \ldots, \pi_k$. Let $X_i$ be the number of times that the ith outcome occurs among a total of n independent trials of such an experiment, $i = 1, 2, \ldots, k$. Then the joint distribution of $X_1, X_2, \ldots, X_k$ is called the Multinomial Distribution, with joint pmf of the following form:

$$p(x_1, x_2, \ldots, x_k) = \frac{n!}{x_1!\,x_2! \cdots x_k!}\; \pi_1^{x_1} \pi_2^{x_2} \cdots \pi_k^{x_k},$$

where $\pi_1 + \pi_2 + \cdots + \pi_k = 1$ and $x_1 + x_2 + \cdots + x_k = n$.

A Bivariate Normal distribution has the following joint pdf:

$$f(x_1, x_2) = \frac{1}{2\pi \sigma_1 \sigma_2 \sqrt{1 - \rho^2}} \exp\left\{ -\frac{1}{2(1 - \rho^2)} \left[ \left(\frac{x_1 - \mu_1}{\sigma_1}\right)^2 - 2\rho \left(\frac{x_1 - \mu_1}{\sigma_1}\right)\left(\frac{x_2 - \mu_2}{\sigma_2}\right) + \left(\frac{x_2 - \mu_2}{\sigma_2}\right)^2 \right] \right\}.$$

[Figure: plots of the bivariate normal pdf for µ1 = µ2 = 0, σ1 = σ2 = 1, with ρ = 0.1 and with ρ = 0.9]

It can be shown that $\rho$ is the correlation coefficient between $X_1$ and $X_2$. When $\rho = 0$, we have

$$f(x_1, x_2) = \frac{1}{2\pi \sigma_1 \sigma_2} \exp\left\{ -\frac{1}{2} \left[ \left(\frac{x_1 - \mu_1}{\sigma_1}\right)^2 + \left(\frac{x_2 - \mu_2}{\sigma_2}\right)^2 \right] \right\}$$

$$= \frac{1}{\sqrt{2\pi}\,\sigma_1} \exp\left\{ -\frac{1}{2} \left(\frac{x_1 - \mu_1}{\sigma_1}\right)^2 \right\} \cdot \frac{1}{\sqrt{2\pi}\,\sigma_2} \exp\left\{ -\frac{1}{2} \left(\frac{x_2 - \mu_2}{\sigma_2}\right)^2 \right\} = f(x_1)\,f(x_2).$$

So, in this case, $X_1$ and $X_2$ are independent.

For two random variables that are jointly (bivariate) normal, being uncorrelated (i.e., having zero covariance) implies that they are independent. This conclusion may not apply to other random variables.
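The factorization at ρ = 0, and the warning that zero covariance need not imply independence in general, can both be illustrated numerically. The sketch below is not from the original slides. The parameter values are arbitrary, and the discrete counterexample (X uniform on {−1, 0, 1} with Y = X²) is a standard illustration added here as an assumption, using Cov(X, Y) = E[XY] − E[X]E[Y].

```python
import math
from fractions import Fraction

def bvn_pdf(x1, x2, mu1=0.0, mu2=0.0, s1=1.0, s2=1.0, rho=0.0):
    """Bivariate normal pdf with means mu1, mu2, standard deviations
    s1, s2 and correlation coefficient rho (|rho| < 1)."""
    z1 = (x1 - mu1) / s1
    z2 = (x2 - mu2) / s2
    q = (z1 * z1 - 2.0 * rho * z1 * z2 + z2 * z2) / (1.0 - rho * rho)
    return math.exp(-q / 2.0) / (2.0 * math.pi * s1 * s2 * math.sqrt(1.0 - rho * rho))

def normal_pdf(x, mu=0.0, s=1.0):
    """Univariate normal pdf with mean mu and standard deviation s."""
    z = (x - mu) / s
    return math.exp(-z * z / 2.0) / (math.sqrt(2.0 * math.pi) * s)

# With rho = 0 the joint pdf equals the product of the marginal pdfs.
points = [(-1.0, 0.5), (0.0, 0.0), (2.0, -1.5)]
max_gap_rho0 = max(
    abs(bvn_pdf(a, b, 1.0, -2.0, 0.5, 3.0, 0.0)
        - normal_pdf(a, 1.0, 0.5) * normal_pdf(b, -2.0, 3.0))
    for a, b in points
)

# With rho = 0.9 the joint pdf no longer factorizes.
gap_rho09 = abs(bvn_pdf(0.0, 0.0, rho=0.9) - normal_pdf(0.0) * normal_pdf(0.0))

# Non-normal counterexample: X uniform on {-1, 0, 1}, Y = X^2.
pmf = {(-1, 1): Fraction(1, 3), (0, 0): Fraction(1, 3), (1, 1): Fraction(1, 3)}
e_x = sum(x * p for (x, y), p in pmf.items())
e_y = sum(y * p for (x, y), p in pmf.items())
e_xy = sum(x * y * p for (x, y), p in pmf.items())
cov = e_xy - e_x * e_y
p_x0 = sum(p for (x, y), p in pmf.items() if x == 0)
p_y0 = sum(p for (x, y), p in pmf.items() if y == 0)

print("max |joint - product| at rho = 0:", max_gap_rho0)
print("|joint - product| at rho = 0.9, origin:", gap_rho09)
print("Cov(X, Y) =", cov, "but p(0,0) =", pmf[(0, 0)], "vs p_X(0) p_Y(0) =", p_x0 * p_y0)
```

In the counterexample Y is a function of X, so the two are clearly dependent even though their covariance is zero.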
Covariance and Correlation Coefficient

Definition 4.5. The covariance between any two jointly distributed r.v.s X and Y, denoted by Cov(X, Y), is defined by

$$\mathrm{Cov}(X, Y) = E[(X - \mu_X)(Y - \mu_Y)] = E[XY] - \mu_X \mu_Y,$$

where $\mu_X = E[X]$ and $\mu_Y = E[Y]$.

Properties of Covariance: For any two r.v.s X and Y, and constants a, b, c and d,

• Cov(X, X) = Var(X)
• Cov(X, Y) = Cov(Y, X)
• Cov(aX + b, cY + d) = ac Cov(X, Y)
• If X and Y are independent, then Cov(X, Y) = 0.

Definition 4.6. The correlation coefficient between any two jointly distributed r.v.s X and Y, denoted by $\rho(X, Y)$, is defined by

$$\rho(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\sqrt{\mathrm{Var}(X)\,\mathrm{Var}(Y)}}.$$

It measures the degree of linear association between X and Y, and takes values in $[-1, 1]$.

Properties of Correlation Coefficient: For any two r.v.s X and Y, and constants a, b, c and d,

$$\rho(aX + b,\; cY + d) = \begin{cases} \rho(X, Y), & \text{if } ac > 0 \\ -\rho(X, Y), & \text{if } ac < 0. \end{cases}$$

Conditional Distributions

Conditional probability and conditional expectation are among the most useful concepts in probability theory, because

• In practice, some partial information is often available, and hence calculations of probabilities and expectations should be conditional upon the given information;
• In calculating a desired probability or expectation, it is often extremely useful to first "condition" on some appropriate random variables.

The concept of conditional probability, $P(A \mid B) = P(A \cap B)/P(B)$, can be extended directly to give a definition of the conditional distribution of X given Y = y, where X and Y are two r.v.s, discrete or continuous.

Definition 4.7.
For two discrete r.v.s X and Y, the conditional pmf of X given Y = y is

$$p_{X|Y}(x \mid y) = P(X = x \mid Y = y) = \frac{P(X = x, Y = y)}{P(Y = y)} = \frac{p(x, y)}{p_Y(y)}, \quad \text{where } p_Y(y) > 0.$$

The conditional expectation of X given Y = y is defined as

$$E[X \mid Y = y] = \sum_x x\, p_{X|Y}(x \mid y).$$

Clearly, when X is independent of Y, $p_{X|Y}(x \mid y) = p_X(x)$.

Definition 4.8. For two continuous r.v.s X and Y, the conditional pdf of X given Y = y is

$$f_{X|Y}(x \mid y) = \frac{f(x, y)}{f_Y(y)}, \quad \text{where } f_Y(y) > 0.$$

The conditional expectation of X given Y = y is defined as

$$E[X \mid Y = y] = \int_{-\infty}^{\infty} x\, f_{X|Y}(x \mid y)\,dx.$$

Example 4.7. Roll a fair die successively. Let X be the number of rolls until the first 4 and Y be the number of rolls until the first 5.
(a) Find the conditional pmf of X given Y = 4.
(b) Calculate P(X > 2 | Y = 4).
(c) Calculate E[X | Y = 4].

Solution: (a) $p_{X|Y}(x \mid 4) = \dfrac{p(x, 4)}{p_Y(4)}$, where

$p_Y(4) = P(Y = 4) = (5/6)^3 (1/6) = 125/1296$,
$p(1, 4) = P(X = 1, Y = 4) = (1/6)(5/6)^2(1/6) = 25/1296$,
$p(2, 4) = P(X = 2, Y = 4) = (4/6)(1/6)(5/6)(1/6) = 20/1296$,
$p(3, 4) = P(X = 3, Y = 4) = (4/6)^2(1/6)(1/6) = 16/1296$,
$p(4, 4) = P(X = 4, Y = 4) = 0$, and

$$p(x, 4) = \left(\frac{4}{6}\right)^3 \left(\frac{1}{6}\right) \left(\frac{5}{6}\right)^{x-5} \left(\frac{1}{6}\right), \quad \text{for } x = 5, 6, 7, \ldots.$$

Therefore,

$p_{X|Y}(1 \mid 4) = p(1, 4)/p_Y(4) = 25/125$,
$p_{X|Y}(2 \mid 4) = p(2, 4)/p_Y(4) = 20/125$,
$p_{X|Y}(3 \mid 4) = p(3, 4)/p_Y(4) = 16/125$,
$p_{X|Y}(4 \mid 4) = p(4, 4)/p_Y(4) = 0$, and

$$p_{X|Y}(x \mid 4) = \left(\frac{4}{5}\right)^3 \left(\frac{5}{6}\right)^{x-5} \left(\frac{1}{6}\right), \quad \text{for } x = 5, 6, 7, \ldots.$$

(b) $P(X > 2 \mid Y = 4) = 1 - P(X = 1 \mid Y = 4) - P(X = 2 \mid Y = 4) = 80/125$.

(c)

$$E(X \mid Y = 4) = (1)\frac{25}{125} + (2)\frac{20}{125} + (3)\frac{16}{125} + (4)(0) + \sum_{x=5}^{\infty} x \left(\frac{4}{5}\right)^3 \left(\frac{5}{6}\right)^{x-5} \left(\frac{1}{6}\right)$$

$$= \frac{113}{125} + \left(\frac{4}{5}\right)^3 \sum_{y=1}^{\infty} (y + 4) \left(\frac{5}{6}\right)^{y-1} \left(\frac{1}{6}\right) \quad \text{(in the above, } y = x - 4\text{)}$$

$$= \frac{113}{125} + \left(\frac{4}{5}\right)^3 (6 + 4) = 6.024,$$

where the last sum equals 6 + 4 because a geometric r.v. with success probability 1/6 has mean 6.
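The conditional pmf and conditional mean in Example 4.7 can be reproduced directly from the roll-by-roll description. The sketch below is an illustration, not part of the original slides; `joint_pmf` and `cond_pmf_x_given_y` are hypothetical helper names, and the infinite sum for the mean is truncated at 500 terms, where the remaining tail is negligible.

```python
from fractions import Fraction

def joint_pmf(x, y):
    """P(X = x, Y = y): the first 4 appears on roll x and the first 5 on
    roll y, for a fair six-sided die."""
    if x == y:
        return Fraction(0)          # one roll cannot show both faces
    first, last = min(x, y), max(x, y)
    # Rolls before `first` show neither 4 nor 5 (prob 4/6 each); roll `first`
    # shows the earlier face (1/6); rolls strictly between avoid only the
    # later face (5/6 each); roll `last` shows the later face (1/6).
    return (Fraction(4, 6) ** (first - 1) * Fraction(1, 6)
            * Fraction(5, 6) ** (last - first - 1) * Fraction(1, 6))

def cond_pmf_x_given_y(x, y):
    """p_{X|Y}(x | y) = p(x, y) / p_Y(y), with p_Y geometric(1/6)."""
    p_y = Fraction(5, 6) ** (y - 1) * Fraction(1, 6)
    return joint_pmf(x, y) / p_y

# (a) Conditional pmf at x = 1, ..., 4 given Y = 4, as exact fractions.
cond = {x: cond_pmf_x_given_y(x, 4) for x in range(1, 5)}

# (b) P(X > 2 | Y = 4).
p_x_gt_2 = 1 - cond[1] - cond[2]

# (c) E[X | Y = 4], truncating the infinite sum at x = 500.
cond_mean = sum(x * float(cond_pmf_x_given_y(x, 4)) for x in range(1, 501))

print("conditional pmf:", cond)
print("P(X > 2 | Y = 4) =", p_x_gt_2)
print("E[X | Y = 4] ~", round(cond_mean, 6))
```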
Example 4.8. The joint pdf of X and Y is given by

$$f(x, y) = \begin{cases} \frac{12}{5}\,x(2 - x - y), & \text{if } 0 \le x \le 1,\ 0 \le y \le 1 \\ 0, & \text{otherwise.} \end{cases}$$

(a) Compute the conditional pdf of X given Y = y, where $0 \le y \le 1$.
(b) Calculate P(X > 0.5 | Y = 0.5).
(c) Calculate E(X | Y = 0.5).

Solution: (a)

$$f_{X|Y}(x \mid y) = \frac{f(x, y)}{f_Y(y)} = \frac{f(x, y)}{\int_{-\infty}^{\infty} f(x, y)\,dx} = \frac{x(2 - x - y)}{\int_0^1 x(2 - x - y)\,dx} = \frac{6x(2 - x - y)}{4 - 3y}, \quad \text{for } 0 \le x \le 1,\ 0 \le y \le 1.$$

(b) Setting y = 0.5 gives $f_{X|Y}(x \mid 0.5) = \frac{6}{5}\,x(3 - 2x)$, so

$$P(X > 0.5 \mid Y = 0.5) = \int_{0.5}^{1} f_{X|Y}(x \mid 0.5)\,dx = \int_{0.5}^{1} \frac{6}{5}\,x(3 - 2x)\,dx = 0.65.$$

(c)

$$E(X \mid Y = 0.5) = \int_0^1 x\, f_{X|Y}(x \mid 0.5)\,dx = \int_0^1 \frac{6}{5}\,x^2(3 - 2x)\,dx = 0.6.$$

Definition 4.9. Let X and Y be jointly distributed r.v.s. The conditional variance of X given Y = y is given by

$$\mathrm{Var}(X \mid Y = y) = E\left[(X - \mu_{X|Y})^2 \mid Y = y\right], \quad \text{where } \mu_{X|Y} = E(X \mid Y = y).$$

Continuing with Example 4.8, we now want to find Var(X | Y = 0.5):

$$E(X^2 \mid Y = 0.5) = \int_0^1 x^2 f_{X|Y}(x \mid 0.5)\,dx = \int_0^1 \frac{6}{5}\,x^3(3 - 2x)\,dx = 0.42,$$

$$\mathrm{Var}(X \mid Y = 0.5) = E(X^2 \mid Y = 0.5) - [E(X \mid Y = 0.5)]^2 = 0.42 - 0.6^2 = 0.06.$$
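Because the conditional pdf in Example 4.8 is a polynomial in x, the probability, mean and variance above can be verified exactly. The following is a minimal sketch under that observation, not part of the original slides; `cond_pdf_coeffs` and `integrate_poly` are hypothetical helper names.

```python
from fractions import Fraction

def cond_pdf_coeffs(y):
    """Coefficients [c0, c1, c2] of f_{X|Y}(x | y) = 6x(2 - x - y)/(4 - 3y),
    written as the polynomial c0 + c1*x + c2*x^2 on 0 <= x <= 1."""
    d = 4 - 3 * y
    return [Fraction(0), 6 * (2 - y) / d, Fraction(-6) / d]

def integrate_poly(coeffs, a, b, power=0):
    """Exact integral over [a, b] of x**power * (c0 + c1*x + c2*x^2 + ...)."""
    total = Fraction(0)
    for i, c in enumerate(coeffs):
        m = i + power + 1
        total += c * (b ** m - a ** m) / m
    return total

y = Fraction(1, 2)
coeffs = cond_pdf_coeffs(y)

total_mass = integrate_poly(coeffs, 0, 1)              # should be 1
p_tail = integrate_poly(coeffs, Fraction(1, 2), 1)     # P(X > 0.5 | Y = 0.5)
mean = integrate_poly(coeffs, 0, 1, power=1)           # E(X | Y = 0.5)
second = integrate_poly(coeffs, 0, 1, power=2)         # E(X^2 | Y = 0.5)
var = second - mean ** 2                               # Var(X | Y = 0.5)

print("mass:", total_mass, "| P(X > 0.5 | Y = 0.5):", p_tail)
print("mean:", mean, "| second moment:", second, "| variance:", var)
```

Passing a different `Fraction` for `y` gives the same quantities for any other conditioning value in [0, 1].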