Chapter 5 Joint Probability Distributions

If X and Y are two random variables, the probability distribution that defines their simultaneous behavior is called a joint probability distribution.

5-1 Two Discrete Random Variables
5-1.1 Joint Probability Distributions

Example 5-1
- In the development of a new receiver for the transmission of digital information, each received bit is rated as acceptable, suspect, or unacceptable, with probabilities 0.9, 0.08, and 0.02, respectively.
- Assume that the ratings of the bits are independent.
- For the first four bits transmitted, let
  X: the number of acceptable bits
  Y: the number of suspect bits
- The distribution of X is binomial with n = 4 and p = 0.9, and the distribution of Y is binomial with n = 4 and p = 0.08.
- The probability of each point is shown in Fig. 5-1. For example, the probability of four unacceptable bits is (0.02)^4 = (2 × 10^-2)^4 = 1.6 × 10^-7.

If X and Y are discrete random variables, the joint probability distribution of X and Y is a description of the set of points (x, y) in the range of (X, Y), along with the probability of each point.
- The joint probability distribution of two random variables is sometimes referred to as the bivariate probability distribution or bivariate distribution of the random variables.
- The joint probability mass function P(X = x and Y = y) is usually written P(X = x, Y = y).

Definition
The joint probability mass function of the discrete random variables X and Y, denoted f_XY(x, y), satisfies
  (1) f_XY(x, y) ≥ 0
  (2) Σ_x Σ_y f_XY(x, y) = 1
  (3) f_XY(x, y) = P(X = x, Y = y)    (5-1)

Example 5-2
- See Fig. 5-1.
- For example, P(X = 2, Y = 1) is the probability that exactly two acceptable bits and exactly one suspect bit are received among the four bits transferred.
- Let a denote acceptable (p = 0.9), s suspect (p = 0.08), and u unacceptable (p = 0.02). By the assumption of independence, one such sequence has probability
  P(aasu) = 0.9(0.9)(0.08)(0.02) = 0.0013
  and there are 4!/(2! 1! 1!) = 12 such sequences, so
  f_XY(2, 1) = P(X = 2, Y = 1) = 12 × 0.0013 = 0.0156
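The joint pmf of Example 5-1 is a multinomial probability, so each point in Fig. 5-1 can be computed directly. A short sketch (the function name is mine, not from the text):

```python
from math import factorial

def joint_pmf(x, y, n=4, pa=0.9, ps=0.08, pu=0.02):
    """f_XY(x, y): probability of x acceptable and y suspect bits among n."""
    u = n - x - y          # the remaining bits must be unacceptable
    if u < 0:
        return 0.0
    coeff = factorial(n) // (factorial(x) * factorial(y) * factorial(u))
    return coeff * pa**x * ps**y * pu**u

print(round(joint_pmf(2, 1), 4))  # → 0.0156, matching Example 5-2
```

Summing `joint_pmf` over all (x, y) with x + y ≤ 4 returns 1, as property (2) of Equation 5-1 requires.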
5-1.2 Marginal Probability Distributions
- The individual probability distribution of a random variable is referred to as its marginal probability distribution.
- To determine P(X = x), we sum P(X = x, Y = y) over all points in the range of (X, Y) for which X = x.

Example 5-3
- See Fig. 5-1. Find the marginal probability distribution of X.
- For example,
  P(X = 3) = P(X = 3, Y = 0) + P(X = 3, Y = 1) = 0.0583 + 0.2333 = 0.2916 = C(4,3)(0.9)^3(0.1)^1
  and P(X = 0), P(X = 1), P(X = 2), and P(X = 4) are found in the same way.

Figure 5-2 Marginal probability distributions of X and Y from Fig. 5-1

Definition
If X and Y are discrete random variables with joint probability mass function f_XY(x, y), then the marginal probability mass functions of X and Y are
  f_X(x) = P(X = x) = Σ_{Rx} f_XY(x, y)  and  f_Y(y) = P(Y = y) = Σ_{Ry} f_XY(x, y)    (5-2)
where Rx denotes the set of all points in the range of (X, Y) for which X = x and Ry denotes the set of all points in the range of (X, Y) for which Y = y.

Definition - Mean and Variance from a Joint Distribution
If the marginal probability distribution of X has probability mass function f_X(x), then
  E(X) = µ_X = Σ_x x f_X(x) = Σ_x x Σ_{Rx} f_XY(x, y) = Σ_R x f_XY(x, y)    (5-3)
and
  V(X) = σ_X² = Σ_x (x − µ_X)² f_X(x) = Σ_x (x − µ_X)² Σ_{Rx} f_XY(x, y) = Σ_R (x − µ_X)² f_XY(x, y)
where Rx denotes the set of all points in the range of (X, Y) for which X = x and R denotes the set of all points in the range of (X, Y).

Example 5-4
- In Example 5-1,
  E(X) = 0[f_XY(0,0) + f_XY(0,1) + f_XY(0,2) + f_XY(0,3) + f_XY(0,4)]
       + 1[f_XY(1,0) + f_XY(1,1) + f_XY(1,2) + f_XY(1,3)]
       + 2[f_XY(2,0) + f_XY(2,1) + f_XY(2,2)]
       + 3[f_XY(3,0) + f_XY(3,1)] + 4[f_XY(4,0)]
       = 0(0.0001) + 1(0.0036) + 2(0.0486) + 3(0.2916) + 4(0.6561) = 3.6
- Alternatively, because the marginal probability distribution of X is binomial with n = 4 and p = 0.9,
  E(X) = np = 4(0.9) = 3.6 and V(X) = np(1 − p) = 4(0.9)(0.1) = 0.36
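Equations 5-2 and 5-3 amount to summing the joint pmf over the other variable. A sketch reproducing Examples 5-3 and 5-4 (the joint pmf is rebuilt as a multinomial, and the function names are mine):

```python
from math import factorial

# Joint pmf of Example 5-1: x acceptable and y suspect bits among n = 4
def f_xy(x, y, n=4, pa=0.9, ps=0.08, pu=0.02):
    u = n - x - y
    if u < 0:
        return 0.0
    c = factorial(n) // (factorial(x) * factorial(y) * factorial(u))
    return c * pa**x * ps**y * pu**u

# Marginal of X (Eq. 5-2): sum the joint pmf over all y with x fixed
def f_x(x, n=4):
    return sum(f_xy(x, y) for y in range(n - x + 1))

mean_x = sum(x * f_x(x) for x in range(5))  # Eq. 5-3
print(round(f_x(3), 4), round(mean_x, 2))   # → 0.2916 3.6
```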
5-1.3 Conditional Probability Distributions

Example 5-5
- See Example 5-1 (Fig. 5-1). Recall that P(A | B) = P(A ∩ B)/P(B).
- The probability that Y = 0 given that X = 3:
  P(Y = 0 | X = 3) = P(X = 3, Y = 0)/P(X = 3) = f_XY(3, 0)/f_X(3) = 0.05832/0.2916 = 0.200
- The probability that Y = 1 given that X = 3:
  P(Y = 1 | X = 3) = P(X = 3, Y = 1)/P(X = 3) = f_XY(3, 1)/f_X(3) = 0.2333/0.2916 = 0.800
- Notice that P(Y = 0 | X = 3) + P(Y = 1 | X = 3) = 1.

Definition
Given discrete random variables X and Y with joint probability mass function f_XY(x, y), the conditional probability mass function of Y given X = x is
  f_{Y|x}(y) = P(Y = y | X = x) = P(X = x, Y = y)/P(X = x) = f_XY(x, y)/f_X(x)  for f_X(x) > 0    (5-4)
- The function f_{Y|x}(y) is used to find the probabilities of the possible values for Y given that X = x.
- Because a conditional probability mass function f_{Y|x}(y) is a probability mass function for all y in Rx, the following properties are satisfied:
  (1) f_{Y|x}(y) ≥ 0
  (2) Σ_{Rx} f_{Y|x}(y) = 1    (5-5)
  (3) P(Y = y | X = x) = f_{Y|x}(y)

Example 5-6
- For the joint probability distribution in Fig. 5-1,
  f_{Y|x}(y) = f_XY(x, y)/f_X(x) = P(X = x, Y = y)/P(X = x)
  The function f_{Y|x}(y) is shown in Fig. 5-3. For example,
  f_{Y|0}(0) = P(Y = 0 | X = 0) = P(X = 0, Y = 0)/P(X = 0) = (1.6 × 10^-7)/0.0001 = 0.0016

Definition
Let Rx denote the set of all points in the range of (X, Y) for which X = x. The conditional mean of Y given X = x, denoted E(Y | x) or µ_{Y|x}, is
  E(Y | x) = Σ_{Rx} y f_{Y|x}(y)    (5-6)
and the conditional variance of Y given X = x, denoted V(Y | x) or σ²_{Y|x}, is
  V(Y | x) = Σ_{Rx} (y − µ_{Y|x})² f_{Y|x}(y) = Σ_{Rx} y² f_{Y|x}(y) − µ²_{Y|x}

Example 5-7
- The conditional mean of Y given X = 2 is obtained from the conditional distribution in Fig.
5-3:
  E(Y | 2) = µ_{Y|2} = 0(0.040) + 1(0.320) + 2(0.640) = 1.6
- The conditional variance of Y given X = 2:
  V(Y | 2) = (0 − µ_{Y|2})²(0.040) + (1 − µ_{Y|2})²(0.320) + (2 − µ_{Y|2})²(0.640) = 0.32

5-1.4 Independence

Example 5-8
- In a plastic molding operation, each part is classified as to whether it conforms to color and length specifications.
- Define the random variables X and Y:
  X = 1 if the part conforms to color specifications, 0 otherwise
  Y = 1 if the part conforms to length specifications, 0 otherwise
- The joint probability distribution of X and Y is defined by f_XY(x, y) in Fig. 5-4(a).
- Notice that for any x, f_{Y|x}(y) = f_Y(y); the two random variables are independent.
- If two random variables are independent, then
  f_{Y|x}(y) = f_XY(x, y)/f_X(x) = f_X(x) f_Y(y)/f_X(x) = f_Y(y)
- For example (Fig. 5-4), f_{Y|1}(1) = 0.9702/0.99 = 0.98 and f_{Y|0}(0) = 0.0002/0.01 = 0.02

For discrete random variables X and Y, if any one of the following properties is true, the others are also true, and X and Y are independent:
  (1) f_XY(x, y) = f_X(x) f_Y(y) for all x and y
  (2) f_{Y|x}(y) = f_Y(y) for all x and y with f_X(x) > 0
  (3) f_{X|y}(x) = f_X(x) for all x and y with f_Y(y) > 0
  (4) P(X ∈ A, Y ∈ B) = P(X ∈ A) P(Y ∈ B) for any sets A and B in the range of X and Y, respectively.    (5-7)

Rectangular Range for (X, Y)
- If the set of points in two-dimensional space that receives positive probability under f_XY(x, y) does not form a rectangle, X and Y are not independent, because knowledge of X can restrict the range of values of Y that receive positive probability.

Example 5-9
- In a large shipment of parts, 1% of the parts do not conform to specifications.
- The supplier inspects a random sample of 30 parts; let X be the number of parts in the sample that do not conform to specifications.
- The purchaser inspects another random sample of 20 parts; let Y be the number of parts in this sample that do not conform to specifications.
- What is the probability that X ≤ 1 and Y ≤ 1?
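Under the independence assumption of Example 5-9, this probability factors into a product of two binomial probabilities (property (4) above). A quick numerical sketch of that product:

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for a binomial(n, p) random variable."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

# X ~ binomial(30, 0.01) for the supplier, Y ~ binomial(20, 0.01) for the purchaser
p = binom_cdf(1, 30, 0.01) * binom_cdf(1, 20, 0.01)
print(round(p, 3))  # → 0.948
```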
- The sampling is with replacement, so X and Y are independent.
- The marginal probability distribution of X is binomial with n = 30 and p = 0.01.
- The marginal probability distribution of Y is binomial with n = 20 and p = 0.01.
- If independence between X and Y were not assumed,
  P(X ≤ 1, Y ≤ 1) = P(X = 0, Y = 0) + P(X = 1, Y = 0) + P(X = 0, Y = 1) + P(X = 1, Y = 1)
                  = f_XY(0, 0) + f_XY(1, 0) + f_XY(0, 1) + f_XY(1, 1)
- However, with independence, P(X ≤ 1, Y ≤ 1) = P(X ≤ 1) P(Y ≤ 1), where
  P(X ≤ 1) = P(X = 0) + P(X = 1) = C(30,0)(0.01)^0(0.99)^30 + C(30,1)(0.01)^1(0.99)^29 = 0.7397 + 0.2242 = 0.9639
  P(Y ≤ 1) = P(Y = 0) + P(Y = 1) = C(20,0)(0.01)^0(0.99)^20 + C(20,1)(0.01)^1(0.99)^19 = 0.9831
- Therefore, P(X ≤ 1, Y ≤ 1) = 0.9639 × 0.9831 = 0.948
- Suppose the supplier and the purchaser change their policies so that the shipment is acceptable only if zero nonconforming parts are found in the sample. Even then, the probability that the shipment is accepted for production is still quite high:
  P(X = 0, Y = 0) = P(X = 0) P(Y = 0) = 0.605

Exercise 5-1: 5-1, 5-3, 5-5, 5-7, 5-9, 5-13, 5-15

5-2 Multiple Discrete Random Variables
5-2.1 Joint Probability Distributions
- Given discrete random variables X1, X2, ..., Xp, the joint probability distribution of X1, X2, ..., Xp is a description of the set of points (x1, x2, ..., xp) in the range of (X1, X2, ..., Xp), along with the probability of each point.

Definition
The joint probability mass function of X1, X2, ..., Xp is
  f_{X1 X2 ... Xp}(x1, x2, ..., xp) = P(X1 = x1, X2 = x2, ..., Xp = xp)    (5-8)
for all points (x1, x2, ..., xp) in the range of X1, X2, ..., Xp.

Definition
If X1, X2, ..., Xp are discrete random variables with joint probability mass function f_{X1 X2 ... Xp}(x1, x2, ..., xp), the marginal probability mass function of any Xi is
  f_{Xi}(xi) = P(Xi = xi) = Σ_{Rxi} f_{X1 X2 ... Xp}(x1, x2, ..., xp)    (5-9)
where R_{xi} denotes the set of points in the range of (X1, X2, ..., Xp) for which Xi = xi.

Example 5-11
- The joint probability distribution of three random variables X1, X2, X3 is shown in Fig. 5-5; the points satisfy x1 + x2 + x3 = 3.
- The marginal probability distribution of X2 is found as follows (writing f for f_{X1 X2 X3}):
  P(X2 = 0) = f(3, 0, 0) + f(0, 0, 3) + f(1, 0, 2) + f(2, 0, 1)
  P(X2 = 1) = f(2, 1, 0) + f(0, 1, 2) + f(1, 1, 1)
  P(X2 = 2) = f(1, 2, 0) + f(0, 2, 1)
  P(X2 = 3) = f(0, 3, 0)

Figure 5-5 Joint probability distribution of X1, X2, and X3

Mean and Variance from a Joint Distribution
  E(Xi) = Σ_R xi f_{X1 X2 ... Xp}(x1, x2, ..., xp)
and
  V(Xi) = Σ_R (xi − µ_{Xi})² f_{X1 X2 ... Xp}(x1, x2, ..., xp)    (5-10)
where R is the set of all points in the range of X1, X2, ..., Xp.

Distribution of a Subset of the Random Variables
If X1, X2, ..., Xp are discrete random variables with joint probability mass function f_{X1 X2 ... Xp}(x1, x2, ..., xp), the joint probability mass function of X1, X2, ..., Xk, k < p, is
  f_{X1 X2 ... Xk}(x1, x2, ..., xk) = P(X1 = x1, X2 = x2, ..., Xk = xk) = Σ_{R_{x1 x2 ... xk}} f_{X1 X2 ... Xp}(x1, x2, ..., xp)    (5-11)
where R_{x1 x2 ... xk} denotes the set of all points in the range of (X1, X2, ..., Xp) for which X1 = x1, X2 = x2, ..., Xk = xk.
- That is, P(X1 = x1, X2 = x2, ..., Xk = xk) is the sum of the probabilities over all points in the range of (X1, X2, ..., Xp) for which X1 = x1, X2 = x2, ..., and Xk = xk.

Conditional Probability Distributions
- Conditional probability distributions can be developed for multiple discrete random variables by an extension of the ideas used for two discrete random variables.
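Marginalizing a multivariate joint pmf, as in Example 5-11, is just a sum over the remaining coordinates. Since Fig. 5-5's probabilities are not reproduced here, a hypothetical trinomial joint pmf with class probabilities 0.5, 0.3, 0.2 stands in (an assumption for illustration only):

```python
from math import factorial

# Hypothetical joint pmf on x1 + x2 + x3 = 3: trinomial with p = (0.5, 0.3, 0.2)
def f(x1, x2, x3, p=(0.5, 0.3, 0.2)):
    c = factorial(3) // (factorial(x1) * factorial(x2) * factorial(x3))
    return c * p[0]**x1 * p[1]**x2 * p[2]**x3

# Marginal of X2 (Eq. 5-9): sum over all (x1, x3) with x1 + x2 + x3 = 3
def marginal_x2(x2):
    return sum(f(x1, x2, 3 - x2 - x1) for x1 in range(3 - x2 + 1))

print([round(marginal_x2(k), 4) for k in range(4)])  # → [0.343, 0.441, 0.189, 0.027]
```

The result is binomial(3, 0.3), consistent with the multinomial marginal property given later in Equation 5-14.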
- For example, the conditional joint probability mass function of X1, X2, X3 given X4, X5 is
  f_{X1 X2 X3 | x4 x5}(x1, x2, x3) = f_{X1 X2 X3 X4 X5}(x1, x2, x3, x4, x5) / f_{X4 X5}(x4, x5)  for f_{X4 X5}(x4, x5) > 0

Definition
Discrete random variables X1, X2, ..., Xp are independent if and only if
  f_{X1 X2 ... Xp}(x1, x2, ..., xp) = f_{X1}(x1) f_{X2}(x2) ... f_{Xp}(xp)    (5-12)
for all x1, x2, ..., xp.
- It can be shown that if X1, X2, ..., Xp are independent,
  P(X1 ∈ A1, X2 ∈ A2, ..., Xp ∈ Ap) = P(X1 ∈ A1) P(X2 ∈ A2) ... P(Xp ∈ Ap)
for any sets A1, A2, ..., Ap.

5-2.2 Multinomial Probability Distribution

Example 5-12
- Of 20 bits received, what is the probability that 14 are excellent (E), 3 are good (G), 2 are fair (F), and 1 is poor (P)?
- The probabilities of E, G, F, and P are 0.6, 0.3, 0.08, and 0.02, respectively.
- One such sequence of 20 bits is EEEEEEEEEEEEEE GGG FF P, with probability
  P(EEEEEEEEEEEEEEGGGFFP) = 0.6^14 × 0.3^3 × 0.08^2 × 0.02^1 = 2.708 × 10^-9
- The number of such sequences is 20!/(14! 3! 2! 1!) = 2,325,600
- The requested probability is
  P(14 E's, three G's, two F's, and one P) = 2,325,600 × (2.708 × 10^-9) = 0.0063

Multinomial Distribution
Suppose a random experiment consists of a series of n trials. Assume that
  (1) the result of each trial is classified into one of k classes,
  (2) the probability of a trial generating a result in class 1, class 2, ..., class k is constant over the trials and equal to p1, p2, ..., pk, respectively, and
  (3) the trials are independent.
The random variables X1, X2, ..., Xk that denote the number of trials that result in class 1, class 2, ..., class k, respectively, have a multinomial distribution, and the joint probability mass function is
  P(X1 = x1, X2 = x2, ..., Xk = xk) = [n!/(x1! x2! ... xk!)] p1^x1 p2^x2 ... pk^xk    (5-13)
for x1 + x2 + ... + xk = n and p1 + p2 + ... + pk = 1.

Example 5-13
In Example 5-12, let the random variables X1, X2, X3, and X4 denote the number of bits that are E, G, F, and P, respectively, in a transmission of 20 bits.
The probability that 12 of the bits received are E, 6 are G, 2 are F, and 0 are P is
  P(X1 = 12, X2 = 6, X3 = 2, X4 = 0) = [20!/(12! 6! 2! 0!)] 0.6^12 × 0.3^6 × 0.08^2 × 0.02^0 = 0.0358

If X1, X2, ..., Xk have a multinomial distribution, the marginal probability distribution of Xi is binomial with
  E(Xi) = n pi and V(Xi) = n pi(1 − pi)    (5-14)

Example 5-14
- In Example 5-13, the marginal probability distribution of X2 is binomial with n = 20 and p = 0.3.
- The joint marginal probability distribution of X2 and X3: P(X2 = x2, X3 = x3) is the probability that exactly x2 trials result in G and exactly x3 trials result in F.
- The remaining n − x2 − x3 trials must result in either E or P.
- The three classes are {G}, {F}, and {E, P}, with probabilities 0.3, 0.08, and 0.6 + 0.02 = 0.62, so
  f_{X2 X3}(x2, x3) = P(X2 = x2, X3 = x3) = [n!/(x2! x3! (n − x2 − x3)!)] (0.3)^x2 (0.08)^x3 (0.62)^(n − x2 − x3)

Exercise 5-2: 5-17, 5-19, 5-23, 5-25, 5-27, 5-29

5-3 Two Continuous Random Variables
5-3.1 Joint Probability Distributions
- A joint probability density function can be defined over two-dimensional space. The double integral of f_XY(x, y) over a region R provides the probability that (X, Y) assumes a value in R.
- The integral can be interpreted as the volume under the surface f_XY(x, y) over the region R.

Definition
A joint probability density function for the continuous random variables X and Y, denoted f_XY(x, y), satisfies the following properties:
  (1) f_XY(x, y) ≥ 0 for all x, y
  (2) ∫_{-∞}^{∞} ∫_{-∞}^{∞} f_XY(x, y) dx dy = 1
  (3) For any region R of two-dimensional space, P([X, Y] ∈ R) = ∫∫_R f_XY(x, y) dx dy    (5-15)

Example 5-15
- Let the random variable X denote the time until a computer server connects to your machine (in milliseconds), and let Y denote the time until the server authorizes you as a valid user (in milliseconds), so that X < Y.
- The joint probability density function for X and Y is
  f_XY(x, y) = 6 × 10^-6 exp(−0.001x − 0.002y)  for x < y
- The region with nonzero probability is shaded in Fig. 5-8.
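A numerical sanity check (a sketch, not part of the original example) that this density integrates to 1 over the wedge x < y, and that P(X < 1000, Y < 2000) comes out near the value derived next, using a midpoint Riemann sum:

```python
from math import exp

def f(x, y):
    """Joint density of Example 5-15, to be evaluated only for 0 < x < y."""
    return 6e-6 * exp(-0.001 * x - 0.002 * y)

h = 20.0        # grid step in ms; contributions beyond ~15000 ms are negligible
total = 0.0
p = 0.0         # accumulates P(X < 1000, Y < 2000)
for i in range(750):
    x = (i + 0.5) * h                  # midpoint in x
    for j in range(750):
        y = x + (j + 0.5) * h          # midpoint in y, staying inside y > x
        cell = f(x, y) * h * h
        total += cell
        if x < 1000 and y < 2000:
            p += cell
print(round(total, 3), round(p, 3))
```

This prints values near 1.0 and 0.915, agreeing with the analytic integrals in the text to the accuracy of the grid.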
Figure 5-8 The joint probability density function of X and Y is nonzero over the shaded region

- The density integrates to 1 over the region x < y:
  ∫_{-∞}^{∞} ∫_{-∞}^{∞} f_XY(x, y) dy dx = 6 × 10^-6 ∫_0^∞ ( ∫_x^∞ e^{-0.002y} dy ) e^{-0.001x} dx
    = 6 × 10^-6 ∫_0^∞ (e^{-0.002x}/0.002) e^{-0.001x} dx = 0.003 ∫_0^∞ e^{-0.003x} dx = 0.003(1/0.003) = 1
- The probability that X < 1000 and Y < 2000:
  P(X ≤ 1000, Y ≤ 2000) = ∫_0^1000 ∫_x^2000 f_XY(x, y) dy dx
    = 6 × 10^-6 ∫_0^1000 ( ∫_x^2000 e^{-0.002y} dy ) e^{-0.001x} dx
    = 6 × 10^-6 ∫_0^1000 [(e^{-0.002x} − e^{-4})/0.002] e^{-0.001x} dx
    = 0.003 ∫_0^1000 (e^{-0.003x} − e^{-4} e^{-0.001x}) dx
    = 0.003 [ (1 − e^{-3})/0.003 − e^{-4}(1 − e^{-1})/0.001 ]
    = 0.003(316.738 − 11.578) = 0.915

Figure 5-9 Region of integration for the probability that X < 1000 and Y < 2000 is darkly shaded

5-3.2 Marginal Probability Distributions

Definition
If the joint probability density function of continuous random variables X and Y is f_XY(x, y), the marginal probability density functions of X and Y are
  f_X(x) = ∫_{Rx} f_XY(x, y) dy  and  f_Y(y) = ∫_{Ry} f_XY(x, y) dx    (5-16)
where Rx denotes the set of all points in the range of (X, Y) for which X = x and Ry denotes the set of all points in the range of (X, Y) for which Y = y.
- A probability involving only X, say P(a < X < b), can be found from the marginal density or from the joint density:
  P(a < X < b) = ∫_a^b ( ∫_{Rx} f_XY(x, y) dy ) dx = ∫_a^b f_X(x) dx

Mean and Variance from a Joint Distribution
  E(X) = µ_X = ∫_{-∞}^{∞} x f_X(x) dx = ∫_{-∞}^{∞} x [ ∫_{Rx} f_XY(x, y) dy ] dx = ∫∫_R x f_XY(x, y) dx dy    (5-17)
and
  V(X) = σ_X² = ∫_{-∞}^{∞} (x − µ_X)² f_X(x) dx = ∫_{-∞}^{∞} (x − µ_X)² [ ∫_{Rx} f_XY(x, y) dy ] dx = ∫∫_R (x − µ_X)² f_XY(x, y) dx dy
where Rx denotes the set of all points in the range of (X, Y) for which X = x.

Example 5-16
- For the random variables that denote times in Example 5-15, calculate the probability that Y exceeds 2000 milliseconds.
- Splitting the region y > 2000 at x = 2000,
  P(Y > 2000) = ∫_0^2000 ( ∫_2000^∞ 6 × 10^-6 e^{-0.001x − 0.002y} dy ) dx + ∫_2000^∞ ( ∫_x^∞ 6 × 10^-6 e^{-0.001x − 0.002y} dy ) dx
              = 0.0475 + 0.0025 = 0.05
- Alternatively, the probability can be calculated from the marginal probability distribution of Y. For y > 0,
  f_Y(y) = ∫_0^y 6 × 10^-6 e^{-0.001x − 0.002y} dx = 6 × 10^-3 e^{-0.002y} (1 − e^{-0.001y})
  and
  P(Y > 2000) = ∫_2000^∞ f_Y(y) dy = 6 × 10^-3 ∫_2000^∞ e^{-0.002y} (1 − e^{-0.001y}) dy = 0.05

5-3.3 Conditional Probability Distributions

Definition
Given continuous random variables X and Y with joint probability density function f_XY(x, y), the conditional probability density function of Y given X = x is
  f_{Y|x}(y) = f_XY(x, y)/f_X(x)  for f_X(x) > 0    (5-18)
- The function f_{Y|x}(y) is used to find the probabilities of the possible values for Y given that X = x.
- Let Rx denote the set of all points in the range of (X, Y) for which X = x. The conditional probability density function provides the conditional probabilities for the values of Y in the set Rx.
- Because the conditional probability density function f_{Y|x}(y) is a probability density function for all y in Rx, the following properties are satisfied:
  (1) f_{Y|x}(y) ≥ 0
  (2) ∫_{Rx} f_{Y|x}(y) dy = 1    (5-19)
  (3) P(Y ∈ B | X = x) = ∫_B f_{Y|x}(y) dy for any set B in the range of Y

Example 5-17
- For the random variables that denote times in Example 5-15, determine the conditional probability density function for Y given that X = x.
- The marginal density function of X is
  f_X(x) = ∫_x^∞ 6 × 10^-6 e^{-0.001x − 0.002y} dy = 0.003 e^{-0.003x}  for x > 0
  This is an exponential distribution with λ = 0.003.
- For 0 < x and x < y, the conditional probability density function is
  f_{Y|x}(y) = f_XY(x, y)/f_X(x) = (6 × 10^-6 e^{-0.001x − 0.002y})/(0.003 e^{-0.003x}) = 0.002 e^{0.002x − 0.002y}
- Determine the probability that Y exceeds 2000, given that x = 1500.
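The conditional probability just posed is a tail integral of the conditional density 0.002 e^{0.002x − 0.002y}; a numerical sketch evaluating it directly:

```python
from math import exp

def f_y_given_x(y, x):
    """Conditional density of Y given X = x from Example 5-17 (valid for y > x)."""
    return 0.002 * exp(0.002 * x - 0.002 * y)

# P(Y > 2000 | x = 1500) by a midpoint Riemann sum over y in (2000, 10000];
# the tail beyond 10000 ms is negligible
x, h = 1500.0, 1.0
p = sum(f_y_given_x(2000 + (j + 0.5) * h, x) * h for j in range(8000))
print(round(p, 3))  # → 0.368, i.e. exp(-1)
```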
  P(Y > 2000 | x = 1500) = ∫_2000^∞ f_{Y|1500}(y) dy = ∫_2000^∞ 0.002 e^{0.002(1500) − 0.002y} dy = 0.368

Definition
Let Rx denote the set of all points in the range of (X, Y) for which X = x. The conditional mean of Y given X = x, denoted E(Y | x) or µ_{Y|x}, is
  E(Y | x) = ∫_{Rx} y f_{Y|x}(y) dy    (5-20)
and the conditional variance of Y given X = x, denoted V(Y | x) or σ²_{Y|x}, is
  V(Y | x) = ∫_{Rx} (y − µ_{Y|x})² f_{Y|x}(y) dy = ∫_{Rx} y² f_{Y|x}(y) dy − µ²_{Y|x}

Example 5-18
- For the random variables that denote times in Example 5-15, determine the conditional mean of Y given that x = 1500.
- The conditional probability density function for Y was determined in Example 5-17:
  f_{Y|x}(y) = 0.002 e^{0.002x − 0.002y}  for 0 < x and x < y
- Then
  E(Y | x = 1500) = ∫_1500^∞ y (0.002 e^{0.002(1500) − 0.002y}) dy = 0.002 e³ ∫_1500^∞ y e^{-0.002y} dy = 2000

5-3.4 Independence
- If f_XY(x, y) = f_X(x) f_Y(y) for all x and y, X and Y are independent.

Definition
For continuous random variables X and Y, if any one of the following properties is true, the others are also true, and X and Y are said to be independent:
  (1) f_XY(x, y) = f_X(x) f_Y(y) for all x and y
  (2) f_{Y|x}(y) = f_Y(y) for all x and y with f_X(x) > 0
  (3) f_{X|y}(x) = f_X(x) for all x and y with f_Y(y) > 0
  (4) P(X ∈ A, Y ∈ B) = P(X ∈ A) P(Y ∈ B) for any sets A and B in the range of X and Y, respectively.    (5-21)

Example 5-19
- For the joint distribution of times in Example 5-15:
- From Ex. 5-16, P(Y > 2000) = 0.05.
- From Ex. 5-17, P(Y > 2000 | x = 1500) = 0.368.
- Because P(Y > 2000) ≠ P(Y > 2000 | x = 1500), X and Y are not independent.

Example 5-20
- Example 5-15 is modified so that the joint probability density function of X and Y is
  f_XY(x, y) = 2 × 10^-6 e^{-0.001x − 0.002y}  for x ≥ 0 and y ≥ 0
- Show that X and Y are independent, and determine P(X > 1000, Y < 1000).
- The marginal probability density function of X:
  f_X(x) = ∫_0^∞ 2 × 10^-6 e^{-0.001x − 0.002y} dy = 0.001 e^{-0.001x}  for x > 0
- The marginal probability density function of Y:
  f_Y(y) = ∫_0^∞ 2 × 10^-6 e^{-0.001x − 0.002y} dx = 0.002 e^{-0.002y}  for y > 0
- Therefore, f_XY(x, y) = f_X(x) f_Y(y) for all x and y, so X and Y are independent, and
  P(X > 1000, Y < 1000) = P(X > 1000) P(Y < 1000) = e^{-1}(1 − e^{-2}) = 0.318

Example 5-21
- The random variables X and Y denote the lengths of two dimensions of a machined part.
- Assume that X and Y are independent random variables.
- The distribution of X is normal with mean 10.5 millimeters and variance 0.0025 (millimeter)².
- The distribution of Y is normal with mean 3.2 millimeters and variance 0.0036 (millimeter)².
- Determine the probability that 10.4 < X < 10.6 and 3.15 < Y < 3.25.
- Because X and Y are independent,
  P(10.4 < X < 10.6, 3.15 < Y < 3.25) = P(10.4 < X < 10.6) P(3.15 < Y < 3.25)
    = P((10.4 − 10.5)/0.05 < Z < (10.6 − 10.5)/0.05) P((3.15 − 3.2)/0.06 < Z < (3.25 − 3.2)/0.06)
    = P(−2 < Z < 2) P(−0.833 < Z < 0.833) = 0.566

Exercise: 5-35, 5-37, 5-39, 5-49, 5-51, 5-53

5-5 Covariance and Correlation
- The covariance is a measure that describes the relationship between two random variables.

Definition
  E[h(X, Y)] = Σ_R h(x, y) f_XY(x, y)         if X, Y are discrete
  E[h(X, Y)] = ∫∫_R h(x, y) f_XY(x, y) dx dy  if X, Y are continuous    (5-27)

Example 5-27
- For the joint probability distribution of the two random variables in Fig.
5-12, calculate E[(X − µ_X)(Y − µ_Y)].
- The joint distribution in Fig. 5-12 assigns
  f_XY(1, 1) = 0.1, f_XY(1, 2) = 0.2, f_XY(3, 1) = 0.2, f_XY(3, 2) = 0.2, f_XY(3, 3) = 0.3
  so the marginals are f_X(1) = 0.3, f_X(3) = 0.7 and f_Y(1) = 0.3, f_Y(2) = 0.4, f_Y(3) = 0.3.
- µ_X = 1 × 0.3 + 3 × 0.7 = 2.4 and µ_Y = 1 × 0.3 + 2 × 0.4 + 3 × 0.3 = 2.0
- E[(X − µ_X)(Y − µ_Y)] = (1 − 2.4)(1 − 2.0)(0.1) + (1 − 2.4)(2 − 2.0)(0.2) + (3 − 2.4)(1 − 2.0)(0.2)
    + (3 − 2.4)(2 − 2.0)(0.2) + (3 − 2.4)(3 − 2.0)(0.3) = 0.2

Definition
The covariance between the random variables X and Y, denoted cov(X, Y) or σ_XY, is
  σ_XY = E[(X − µ_X)(Y − µ_Y)] = E(XY) − µ_X µ_Y    (5-28)
- If the points in the joint probability distribution of X and Y that receive positive probability tend to fall along a line of positive (or negative) slope, σ_XY is positive (or negative).
- Covariance is a measure of the linear relationship between the random variables.

Figure: example joint distributions with (a) positive covariance, (b) zero covariance, (c) negative covariance, and (d) zero covariance with all points of equal probability

- The equality of the two expressions for covariance in Equation 5-28 is shown for continuous random variables as follows:
  σ_XY = E[(X − µ_X)(Y − µ_Y)] = ∫_{-∞}^{∞} ∫_{-∞}^{∞} (x − µ_X)(y − µ_Y) f_XY(x, y) dx dy
       = ∫_{-∞}^{∞} ∫_{-∞}^{∞} [xy − µ_X y − x µ_Y + µ_X µ_Y] f_XY(x, y) dx dy
  Here, for example, ∫∫ µ_X y f_XY(x, y) dx dy = µ_X [∫∫ y f_XY(x, y) dx dy] = µ_X µ_Y, so
  σ_XY = ∫_{-∞}^{∞} ∫_{-∞}^{∞} xy f_XY(x, y) dx dy − µ_X µ_Y − µ_X µ_Y + µ_X µ_Y
       = ∫_{-∞}^{∞} ∫_{-∞}^{∞} xy f_XY(x, y) dx dy − µ_X µ_Y = E(XY) − µ_X µ_Y

Definition
The correlation between random variables X and Y, denoted ρ_XY, is
  ρ_XY = cov(X, Y)/√(V(X) V(Y)) = σ_XY/(σ_X σ_Y)    (5-29)
For any two random variables X and Y,
  −1 ≤ ρ_XY ≤ +1    (5-30)
- Two random variables with nonzero correlation are said to be correlated.
- The correlation is also a measure of the linear relationship between random variables.

Example 5-29
- For the discrete random variables X and Y with the joint distribution shown in Fig.
5-14, determine σ_XY and ρ_XY.
- The joint distribution in Fig. 5-14 assigns
  f_XY(0, 0) = 0.2, f_XY(1, 1) = 0.1, f_XY(1, 2) = 0.1, f_XY(2, 1) = 0.1, f_XY(2, 2) = 0.1, f_XY(3, 3) = 0.4
- The calculations for E(XY), E(X), and V(X):
  E(XY) = 0 × 0 × 0.2 + 1 × 1 × 0.1 + 1 × 2 × 0.1 + 2 × 1 × 0.1 + 2 × 2 × 0.1 + 3 × 3 × 0.4 = 4.5
  E(X) = 0 × 0.2 + 1 × 0.2 + 2 × 0.2 + 3 × 0.4 = 1.8
  V(X) = (0 − 1.8)²(0.2) + (1 − 1.8)²(0.2) + (2 − 1.8)²(0.2) + (3 − 1.8)²(0.4) = 1.36
- By the symmetry of the distribution, E(Y) = 1.8 and V(Y) = 1.36. Therefore,
  σ_XY = E(XY) − E(X)E(Y) = 4.5 − (1.8)(1.8) = 1.26
  ρ_XY = σ_XY/(σ_X σ_Y) = 1.26/(√1.36 √1.36) = 0.926

If X and Y are independent random variables,
  σ_XY = ρ_XY = 0    (5-31)
Proof: σ_XY = E(XY) − µ_X µ_Y = E(X)E(Y) − µ_X µ_Y = 0.

Example 5-31
- For the two random variables in Fig. 5-16, with f_XY(x, y) = xy/16 for 0 < x < 2 and 0 < y < 4, show that σ_XY = 0.
  E(XY) = ∫_0^4 ∫_0^2 xy f_XY(x, y) dx dy = (1/16) ∫_0^4 [ ∫_0^2 x² y² dx ] dy = 32/9
  E(X) = ∫_0^4 ∫_0^2 x f_XY(x, y) dx dy = (1/16) ∫_0^4 y [ ∫_0^2 x² dx ] dy = 4/3
  E(Y) = ∫_0^4 ∫_0^2 y f_XY(x, y) dx dy = (1/16) ∫_0^4 y² [ ∫_0^2 x dx ] dy = 8/3
- Therefore, σ_XY = E(XY) − E(X)E(Y) = 32/9 − (4/3)(8/3) = 0.

Exercise 5-5: 5-67, 5-71, 5-73, 5-75

5-6 Bivariate Normal Distribution
- The bivariate normal distribution is an extension of the normal distribution to two random variables.

Definition
The probability density function of a bivariate normal distribution is
  f_XY(x, y; σ_X, σ_Y, µ_X, µ_Y, ρ) = [1/(2π σ_X σ_Y √(1 − ρ²))] ×
    exp{ −1/(2(1 − ρ²)) [ (x − µ_X)²/σ_X² − 2ρ(x − µ_X)(y − µ_Y)/(σ_X σ_Y) + (y − µ_Y)²/σ_Y² ] }    (5-32)
for −∞ < x < ∞ and −∞ < y < ∞, with parameters σ_X > 0, σ_Y > 0, −∞ < µ_X < ∞, −∞ < µ_Y < ∞, and −1 < ρ < 1.

Example 5-33
- The joint probability density function f_XY(x, y) = (1/2π) e^{−0.5(x² + y²)} is a special case of a bivariate normal distribution with σ_X = 1, σ_Y = 1, µ_X = 0, µ_Y = 0, and ρ = 0. See Fig. 5-18.

Figure 5-18 Bivariate normal probability density function with σ_X = 1, σ_Y = 1, ρ = 0, µ_X = 0, and µ_Y = 0
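Equation 5-32 translates directly into code, and with ρ = 0 the density factors into two standard normal densities, as Example 5-33 notes. A sketch (function names are mine):

```python
from math import exp, pi, sqrt

def bivariate_normal_pdf(x, y, sx=1.0, sy=1.0, mx=0.0, my=0.0, rho=0.0):
    """Bivariate normal density of Eq. 5-32."""
    zx, zy = (x - mx) / sx, (y - my) / sy
    q = (zx**2 - 2 * rho * zx * zy + zy**2) / (2 * (1 - rho**2))
    return exp(-q) / (2 * pi * sx * sy * sqrt(1 - rho**2))

def std_normal_pdf(t):
    return exp(-0.5 * t * t) / sqrt(2 * pi)

# With rho = 0 the joint density equals the product of the marginals
x, y = 0.7, -1.2
print(abs(bivariate_normal_pdf(x, y) - std_normal_pdf(x) * std_normal_pdf(y)) < 1e-12)  # → True
```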
Marginal Distributions of Bivariate Normal Random Variables
If X and Y have a bivariate normal distribution with joint probability density f_XY(x, y; σ_X, σ_Y, µ_X, µ_Y, ρ), the marginal probability distributions of X and Y are normal with means µ_X and µ_Y and standard deviations σ_X and σ_Y, respectively.    (5-33)

If X and Y have a bivariate normal distribution with joint probability density function f_XY(x, y; σ_X, σ_Y, µ_X, µ_Y, ρ), the correlation between X and Y is ρ.    (5-34)

If X and Y have a bivariate normal distribution with ρ = 0, X and Y are independent.    (5-35)

Exercise 5-6: 5-81

5-7 Linear Combinations of Random Variables

Definition
Given random variables X1, X2, ..., Xp and constants c1, c2, ..., cp,
  Y = c1 X1 + c2 X2 + ... + cp Xp    (5-36)
is a linear combination of X1, X2, ..., Xp.

Mean of a Linear Combination
If Y = c1 X1 + c2 X2 + ... + cp Xp,
  E(Y) = c1 E(X1) + c2 E(X2) + ... + cp E(Xp)    (5-37)

Variance of a Linear Combination
If X1, X2, ..., Xp are random variables and Y = c1 X1 + c2 X2 + ... + cp Xp, then in general
  V(Y) = c1² V(X1) + c2² V(X2) + ... + cp² V(Xp) + 2 ΣΣ_{i<j} ci cj cov(Xi, Xj)    (5-38)
If X1, X2, ..., Xp are independent,
  V(Y) = c1² V(X1) + c2² V(X2) + ... + cp² V(Xp)    (5-39)

Proof:
  V[Y] = E[(c1 X1 + c2 X2 + ... + cp Xp − c1 µ1 − c2 µ2 − ... − cp µp)²]
       = E[(c1(X1 − µ1) + c2(X2 − µ2) + ... + cp(Xp − µp))²]
       = c1² E[(X1 − µ1)²] + ... + cp² E[(Xp − µp)²] + 2 ΣΣ_{i<j} ci cj E[(Xi − µi)(Xj − µj)]
       = c1² V[X1] + ... + cp² V[Xp] + 2 ΣΣ_{i<j} ci cj cov(Xi, Xj)

Example 5-36
- Suppose the random variables X1 and X2 denote the length and width, respectively, of a manufactured part.
- E(X1) = 2 centimeters with standard deviation 0.1 centimeter.
- E(X2) = 5 centimeters with standard deviation 0.2 centimeter.
- The covariance between X1 and X2 is −0.005.
- Y = 2X1 + 2X2 is a random variable that represents the perimeter of the part.
- E(Y) = 2(2) + 2(5) = 14 centimeters
- V(Y) = 2²(0.1²) + 2²(0.2²) + 2 × 2 × 2 × (−0.005) = 0.16 centimeters squared

Mean and Variance of an Average
If X̄ = (X1 + X2 + ... + Xp)/p with E(Xi) = µ for i = 1, 2, ..., p, then
  E(X̄) = µ    (5-40a)
If X1, X2, ..., Xp are also independent with V(Xi) = σ² for i = 1, 2, ..., p, then
  V(X̄) = σ²/p    (5-40b)
Proof: V(X̄) = (1/p)² σ² + ... + (1/p)² σ² (p terms) = σ²/p

Reproductive Property of the Normal Distribution
If X1, X2, ..., Xp are independent, normal random variables with E(Xi) = µi and V(Xi) = σi² for i = 1, 2, ..., p, then
  Y = c1 X1 + c2 X2 + ... + cp Xp
is a normal random variable with
  E(Y) = c1 µ1 + c2 µ2 + ... + cp µp and V(Y) = c1² σ1² + c2² σ2² + ... + cp² σp²    (5-41)

Example 5-37
- Let the random variables X1 and X2 denote the length and width, respectively, of a manufactured part.
- X1 is normal with E(X1) = 2 centimeters and standard deviation 0.1 centimeter.
- X2 is normal with E(X2) = 5 centimeters and standard deviation 0.2 centimeter.
- Assume that X1 and X2 are independent. Determine the probability that the perimeter exceeds 14.5 centimeters.
Sol:
- Y = 2X1 + 2X2 is a normal random variable that represents the perimeter of the part.
- E(Y) = 2 × 2 + 2 × 5 = 14 and V(Y) = 4 × 0.1² + 4 × 0.2² = 0.2
- Standardize first, then consult the normal table:
  P(Y > 14.5) = P[(Y − µ_Y)/σ_Y > (14.5 − 14)/√0.2] = P(Z > 1.12) = 0.13

Example 5-38
- Soft-drink cans are filled by an automated filling machine.
- The mean fill volume is 12.1 fluid ounces, and the standard deviation is 0.1 fluid ounce.
- Assume that the fill volumes of the cans are independent, normal random variables.
- What is the probability that the average volume of 10 cans selected from this process is less than 12 fluid ounces?
Sol:
- Let X1, X2, ..., X10 denote the fill volumes of the 10 cans.
- E(X̄) = 12.1 and V(X̄) = 0.1²/10 = 0.001
- P(X̄ < 12) = P[(X̄ − µ_X̄)/σ_X̄ < (12 − 12.1)/√0.001] = P(Z < −3.16) = 0.00079

Exercise 5-7: 5-87, 5-89, 5-91, 5-93, 5-95
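Examples 5-37 and 5-38 can be checked with the standard library's `statistics.NormalDist` (a sketch; table-based answers in the text may differ slightly in the last digit):

```python
from math import sqrt
from statistics import NormalDist

# Example 5-37: perimeter Y = 2*X1 + 2*X2 of independent normal dimensions
mu_y = 2 * 2 + 2 * 5                  # Eq. 5-37
var_y = 4 * 0.1**2 + 4 * 0.2**2       # Eq. 5-41 (independent case)
p_perimeter = 1 - NormalDist(mu_y, sqrt(var_y)).cdf(14.5)

# Example 5-38: average fill volume of 10 independent cans, V(Xbar) = sigma^2 / 10
p_avg = NormalDist(12.1, sqrt(0.1**2 / 10)).cdf(12)

print(round(p_perimeter, 2), round(p_avg, 4))  # → 0.13 0.0008
```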