Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Indeterminism wikipedia , lookup
Inductive probability wikipedia , lookup
Ars Conjectandi wikipedia , lookup
Birthday problem wikipedia , lookup
Infinite monkey theorem wikipedia , lookup
Probability box wikipedia , lookup
Probability interpretations wikipedia , lookup
Random variable wikipedia , lookup
Central limit theorem wikipedia , lookup
Module U8: Bivariate Random Variables 1 Module PE.PAS.U8.3 Bivariate Random Variables Primary Author: Email Address: Co-author: Email Address: Last Update: Reviews: Prerequisite Competencies: James D. McCalley, Iowa State University [email protected] None None 8/1/02 None 1. Know basic probability relations (module U5) 2. Define random variables and distinguish between discrete (module U6) and continuous (module U7) random variables. 3. Relate probability density functions and distributions for discrete and continuous random variables (module U7). Module Objectives: 1. Use and relate joint, marginal, and conditional pdfs and cdfs. U8.1 Introduction I n this module, we describe the so-called bivariate case of uncertain situations where two quantities are modeled as random variables. Associated analytic models for density functions and distributions, and the relations between then, will be discussed. Although the bivariate case occurs in many disciplines, we will confine our discussion to examples associated with power system engineering. It is important to note that the bivariate case is the simplest of the more general multivariate case and as such, serves well to illustrate some fundamental concepts associated with uncertain situations requiring multivariate random variables. This module addresses only the continuous bivariate case; similar treatment may be accorded the discrete bivariate case. U8.2 A Motivating Illustration Power system engineers have always computed current ratings for transmission conductors. This is necessary because excessive currents lead to conductor heating, expansion of the conductor material, sagging, and violation of clearance requirements (as set by law). The approach is to define a maximum conductor temperature, max, beyond which it is unsafe to operate, and then compute the conductor temperature under various currents i until the current is found for which the conductor temperate exceeds max. What one must recognize is, however, that the conductor temperature depends on other factors besides current alone, the most significant of which are current duration d, ambient temperature t (the temperature of the air surrounding the conductor), and normal wind speed u (the component of the wind velocity that is perpendicular to the conductor). There are other less influential factors as well, including solar radiation, but we will ignore these for the moment. Thus, we see that conductor temperature may be written as a function of i, d, t, and u. Let’s denote the function characterizing this relation as g, so that g (i, d , t , u ) We simplify the problem by making the following assumptions: (U8.1) Module U8: Bivariate Random Variables 2 1. The current is fixed at a high level of i0, and we desire to see if the conductor temperature is within its safe range for a time period between 3 and 4 pm on a summer day. 2. The current level of i0 will last for a duration of d=1hour. 3. The transmission line is very short so that the temperature and wind speed conditions have no spatial variation, i.e., these conditions are the same at one point on the line as another. 4. The ambient temperature and wind speed both vary during the next hour, but their variation is independent. Thus, we may treat them as independent random variables T and U, respectively. 5. We have information that allows us to characterize the uncertainty associated with the ambient temperature t and the wind speed u during this hour, according to the following uniform probability density functions (pdfs). 0.1, 20 t 30 f T (t ) 0, otherwise 6. 0.2, 5 u 10 fU (u ) 0, otherwise (U8.2) We are able to utilize (U8.1) to identify the range of safe operating conditions according to the following relation: u 0.5t 5 (U8.3) The inequality of (U8.3) for safe conductor operation can be illustrated on a diagram of the possible range and possible safe range of t and u, as indicated in Fig. U8.1. u 25 20 u=0.5t-5 safe range 15 10 possible range safe 5 unsafe 10 20 30 t 40 Fig. U8.1: Illustration of possible and safe ranges of conductor operation We desire to compute the probability that the conductor will be operating safely in the hour of interest. This is the probability that the conductor temperature does not exceed max, i.e., Pr(<max). It is also the probability that the (T,U) random variable combination falls within the safe range of operation, which can be expressed as Pr(20<T<30, 0.5T-5<U<10). To obtain this probability, we call on assumption 4 from above, which is that the two random variables T and U are independent. Because of this assumption, we Module U8: Bivariate Random Variables 3 may compute a joint density from their individual densities in the same way that we compute a joint probability from individual probabilities. That is, if A and B are independent events, then Pr(AB)=Pr(A)Pr(B). Likewise, if T and U are independent random variables, then we may compute a joint density function as: f T ,U (t , u ) f T (t ) f U (u ) (U8.4) In the case of our particular problem here, this would be given as (0.1)(0.2)=0.02 for the range on which T and U are defined, that is: 0.02, (20 t 30), (5 u 10) f T ,U (t , u ) 0, otherwise (U8.5) This joint density function is illustrated in Fig. U8.2, where we have taken the u-t coordinate axis frame from Fig. U8.1 and laid it in the “horizontal” plane, with the “vertical” plane corresponding to the value of the density function. u fT,U(t,u) Safe range 0.03 25 20 0.02 15 10 0.01 5 t 10 20 30 40 Fig. U8.2: Illustration of a joint density function Module U7 shows that, for the case of a single random variable (which we will heretofore refer to as the univariate case), the probability that it is within a certain range is evaluated as the integration of the pdf over that range. The bivariate case is analogous in that here, we evaluate the probability of the two random variables being within some defined range by integrating the joint density function over that range. In mechanics, we come across a similar situation where we functionally express a density as a mass per unit area. The total mass within a specified area is then found as an integration of that function about the area. In this particular problem, we are interested in the random variables being within the safe range as indicated in Fig. U8.2. Thus, Pr({T ,U } safe range) Pr( 20 T 30, 0.5T 5 U 10) (U8.6) which is given by Pr({T ,U } safe range) f safe range Evaluation results in T ,U (t , u )dudt 30 10 20 0.5t 5 0.02dudt (U8.7) Module U8: Bivariate Random Variables 4 0.01t 2 Pr({T ,U } safe range) 0.02u dt (0.3 0.1t )dt 0.3t 0.5t 5 2 20 20 30 10 30 30 0.5 20 So we see that the probability of operating the conductor in the safe range is 0.5. We obtain 0.5 because the safe range is exactly half of the possible range and the joint pdf is uniform over the possible range. Example U10.1 repeats the analysis for the case when the pdf is not uniform over the possible range. Example U8.1 Repeat the above analysis to compute the probability of operating the conductor safely for the following pdfs: 0.1, 20 t 30 f T (t ) 0, otherwise 0.8u 0.4, 5 u 10 fU (u ) 0, otherwise The safe operating region is still specified by u 0.5t 5 . Now the joint pdf is given by 0.008 0.04, (20 t 30), (5 u 10) f T ,U (t , u ) f T (t ) f U (u ) = 0, otherwise Again, we have that Pr({T ,U } safe range) Pr( 20 T 30, 0.5T 5 U 10) f T ,U (t , u )dudt safe range 30 10 20 0.5t 5 0.008u 0.04dudt 2 / 3 After some messy but straightforward effort, the above double integral evaluates to 2/3. The probability of operating safely has increased. Why? Comparison of the two situations indicates that everything is the same except for the wind speed pdf. In the first case, the wind speed pdf is uniform over the range (20,30). In the second case, it is linearly increasing over the range (20,30). Thus, the density is higher at the upper range for the second case, indicating that it is more likely to see higher wind speeds in the second case than in the first case. Higher wind speeds mean cooler conductors. U8.3 Joint Cumulative Distribution Functions We have seen in module U7 that, for the univariate case, the cumulative distribution function (cdf) gives the probability that the random variable is less than the argument of the function, i.e., FX ( x) Pr( X x) (U8.8) and that it is obtained from the pdf according to x FX ( x ) f X ( ) d (U8.9) Module U8: Bivariate Random Variables 5 Similarly, in the bivariate case, the joint cdf gives the probability that the two random variables are jointly less than the arguments of the function. In the case of our example, where the random variables are wind speed U and ambient temperature t, this is FT ,U (t , u ) Pr(T t ,U u ) (U8.10) The joint cdf is obtained from the joint pdf according to t u FT ,U (t , u ) f T ,U ( , )dd (U8.11) The joint pdf for our illustration is given by (U8.5) above. We use this in obtaining the joint cdf according to (U8.11) as follows: t u FT ,U (t , u) f T ,U ( , )dd t u 0.02dd 0.02ut 0.4u 0.1t 2 (U8.12) 20 5 Now one must be careful with using the cdf outside the range where the pdf is non-zero. Recall that the pdf is defined for 20t30, 5u10 (see Fig. U8.1). It is possible to evaluate the cdf for values outside of these ranges according to the following. 0.02ut 0.4u 0.1t 2, 20 t 30, and 0, t 20 or FT ,U (t , u ) 0.1t 2, 20 t 30 and 0.2u 1, t 30 and 5 u 10 u5 u 10 5 u 10 (U8.13) The bottom three expressions of (U8.13) are obtained by Second expression: Use t=20 or u=5 in the top expression Third expression: Use u=10 in the top expression Fourth expression: Use t=30 in the top expression U8.4 Marginal Statistics Consider in the illustration of section U8.2 that we used the joint pdf to obtain the probability of the two random variables being within some region or range. That is, we computed Pr({T ,U } safe range) Pr( 20 T 30, 0.5T 5 U 10) 30 f T ,U (t , u )dudt safe range 20 10 0.02dudt (U8.14) 0.5t 5 From the above, it is clear that the range of integration on the joint pdf specifies the particular outcome for which we want the probability. In general, we have that b d Pr[( a X b) (c Y d )] f X ,Y ( , )dd (U8.15) a c Now what if we were only interested in a range for the random variable X, but we do not care about the random variable Y? What does “do not care about the random variable Y” mean? It means that Y may be anything, where the event “Y may be anything” may be specified by (-Y). Thus, to compute the probability that X is within a certain range without regard to Y, we may write that: b Pr[( a X b) ( Y )] f a X ,Y ( x, y)dyd x (U8.16) Module U8: Bivariate Random Variables 6 Consider the right-hand side of the above relation; in particular, consider the inner integral, and note that this integral is providing a probability that X is within the range (a,b) without regard to Y. This is precisely what the univariate pdf, called fX(x), has done for us, and it is no coincidence. In fact, the inner integral is the univariate pdf, fX(x). When we obtain it from a joint pdf, however, we refer to it as the marginal pdf. So, more formally, the marginal pdf with respect to X, is given by f X ( x) f X ,Y ( x, y )dy (U8.17) Likewise, we may define the marginal pdf with respect to Y, given by f fY ( y) X ,Y ( x, y )dx (U8.18) Returning to our illustration, let’s start with the joint pdf of T and U and then obtain the respective marginal pdfs. This is done as follows: fU (u ) 30 f T (t ) f T ,U (t , u )dt 0.02dt 0.2 5 u 10 (U8.19) 20 t 30 (U8.20) 20 10 f T ,U (t , u)du 0.02du 0.1 5 Note that fU(u) is zero outside of the specified range (5,10), and f T(t) is zero outside of the specified range (20,30). We may also obtain marginal cdfs. We already know from our work on the univariate case that we may obtain the marginal cdf from the marginal pdf according to: x FX ( x ) f X ( ) d (U8.21) Substitution of (U8.17) into (U8.21) results in: x FX ( x ) f X ,Y ( , y )dyd (U8.22) which shows how to obtain the marginal cdf from the joint pdf. Likewise, we can show in a similar way that y FY ( y ) f X ,Y ( x, )dxd (U8.23) Going back to our illustration using random variables T and U, we can obtain the marginal cdfs as t FT (t ) f T ,U ( , u )dud u FU (u ) t 10 0.02dud 0.1t 2, 20 t 30 20 5 u 30 f T ,U (t , )dtd 0.02dtd 0.2u 1, 5 20 5 u 10 Module U8: Bivariate Random Variables 7 Again, as with the marginal pdfs, note that fU(u) is zero outside of the specified range (5,10), and f T(t) is zero outside of the specified range (20,30). U8.5 Conditional Statistics We recall that the conditional probability associated with two events A and B may be expressed as Pr( A | B) Pr( A B) Pr( B) (U8.24) We desire to express the cdf and pdf of a random variable when it is conditioned on another random variable. The joint statistics described by bivariate cdfs and pdfs facilitates this. We derive the appropriate relations in this section. Let’s denote two events A and B in the following way: A : [ X x] B : [ y Y y y ] (U8.25) Then, by (U8.24), we Pr( A | B) Pr( A B) Pr[( X x) ( y Y y y )] Pr( B) Pr( y Y y y ) (U8.26) Inspection of the right-hand-side of (U8.26) reveals that it expresses a conditional cdf in that it provides the probability of the event (X<x) (making it a cdf) given the event (y<Y<y+y) (making it conditional on this event). Notationally, we write FX ( x | ( y Y y y )) Pr[( X x) ( y Y y y )] Pr( y Y y y ) (U8.27) The expression of (U8.27) provides the cdf for X given that Y is known to be in some interval. We can express this in terms of the joint pdfs on X and Y by recognizing that the numerator and denominator of (U8.27) is given by: x y y f Pr[( X x) ( y Y y y )] X ,Y ( , )dd (U8.28) y y y Pr( y Y y y ) f Y ( )d (U8.29) y Substitution of (U8.28) and (U8.29) into (U8.27) yields: x y y FX ( x | ( y Y y y )) f X ,Y ( , )dd y (U8.30) y y f Y ( )d y So (U8.30) provides the cdf of X conditioned on the event that Y is within some range. Let’s take it one step further and consider the cdf of X conditioned on the event that Y is exactly some number. This conditional cdf we will denote as FX(x|y). It is obtained from (U8.30) by letting y0, i.e., Module U8: Bivariate Random Variables 8 x y y FX ( x | y ) lim FX ( x | ( y Y y y )) lim y 0 f X ,Y ( , )dd y (U8.31) y y y 0 f Y ( )d y Noting that the limit of the ratio is the limit of the numerator divided by the limit of the denominator, (U8.31) becomes: x y y lim FX ( x | y ) y 0 f X ,Y ( , )dd y (U8.32) y y f lim y 0 Y ( )d y and we may move the limit operation of the numerator inside the first integral since the first integral is with respect to x, not y, resulting in: x FX ( x | y ) lim y 0 y y f X ,Y ( , )dd y (U8.33) y y lim y 0 f Y ( )d y Now consider what the limit operation does to the inner integral of the numerator. If y becomes very small, then the inner integration of the numerator can be approximated by the area of a rectangle having width of y and height of fX,Y(,y), so that the numerator of (U8.33) becomes y y x lim f y 0 X ,Y ( , )dd x f x X ,Y y ( , y )yd y f X ,Y ( , y )d (U8.34) Likewise, the denominator of (U8.33) becomes y y lim y 0 f Y ( )d yf Y ( y ) (U8.35) y Substitution of (U8.34) and (U8.35) into (U8.33) results in x FX ( x | y ) y f X ,Y ( , y )d yf Y ( y ) x f X ,Y ( , y )d fY ( y) (U8.36) We may also go through similar reasoning to find y FY ( y | x) f X ,Y ( x, )d f X ( x) (U8.37) The above expressions (U8.36) and (U8.37) provide us with the ability to compute a conditional cdf for the bivariate case given that we have the joint pdf of the random variables. These expressions also require the Module U8: Bivariate Random Variables 9 marginal pdfs in their denominators, but the marginal pdfs may also be obtained from the joint pdfs via (U8.17) and (U8.18). We may also obtain marginal pdfs as follows. Recall that for the univariate case, x f X ( ) d F X ( x ) (U8.38) Using the fundamental theorem of calculus to differentiate the left-hand-side of (U8.38), we obtain f X ( x) dFX ( x) dx (U8.39) In a similar way, we can relate the conditional cdf and pdf as follows: x f X ( | y )d FX ( x | y ) (U8.40) Again, using the fundamental theorem of calculus to differentiate the left-hand-side of (U8.40), we obtain f X ( x | y) dFX ( x | y ) dx (U8.41) Substituting (U8.36) into (U8.41), we have x f X ,Y ( , y)d dF ( x | y ) d f X ,Y ( x, y) f X ( x | y) X dx dx f Y ( y) f Y ( y) (U8.42) Likewise, we may obtain x f X ,Y ( x, )d dF ( y | x) d f X ,Y ( x, y) f X ( y | x) Y dy dy f X ( x) f X ( x) (U8.43) Again, going back to our illustration using random variables T and U, the conditionals are found as follows: t FT (t | u ) f T ,U FU (u | t ) U8.6 ( , y )d fU (u ) u f T ,U t f T (t ) 0.2 0.02t 0.4 0.1t 2, 0.2 0.02u 0.1 0.2u 1, 0.1 20 t 30 u ( , y )d 0.02d 20 0.02d 5 0.1 Independence Two events A and B may be tested for independence in two ways. 5 t 10 Module U8: Bivariate Random Variables 10 First, one may examine their joint probability. If Pr(A∩B)=Pr(A)Pr(B), then A and B are independent. Likewise, for bivariate random variables, if fX,Y(x,y)=fX(x)fY(y), then the X and Y are independent. Second, one may examine the conditional probabilities. If Pr(A|B)=Pr(A) or if Pr(B|A)=Pr(B), then A and B are independent. Likewise, for bivariate random variables, if fX(x|y)=fX(x) or if fY(y|x)=fY(y), then X and Y are independent. Example U8.2 Consider the following joint pdf, f X ,Y ( x, y ) x 2 xy , 3 0 x 1, a. Determine the marginal densities b. Determine the conditional densities c. Determine if X and Y are independent. 0 y2 Solution: 2 xy xy 2 2 f X ( x) f X ,Y ( x, y )dy x dy x y 3 6 0 2 2 a. 0 2 2 2 x x, 0 x 1 3 0 otherwise 1 1 1 xy x3 x 2 y y, 0 y 2 f Y ( y ) f X ,Y ( x, y )dx x dx 3 6 3 3 6 0 0 0 otherwise 1 2 xy 6 x 2 2 xy f X ,Y ( x, y ) 0 x 1, 0 y 2 3 f X ( x | y) 2 y 1 1 f Y ( y) y 0 otherwise 3 6 x2 b. f X ( y | x) c. f X ,Y ( x, y ) f X ( x) xy 3x y 3 6x 2 2 2 x 2 x 0 3 x2 0 x 1, 0 y2 otherwise Two methods to determine independence: First method (by comparing joint density to product of marginal densities): f X ( x) f Y ( y ) (2 x 2 2 1 1 2 1 2 1 xy x)( y ) x 2 x 2 y x y f X ,Y ( x, y ) x 2 3 3 6 3 3 9 9 3 Since the above shows that fX,Y(x,y)≠fX(x)fY(y), we can be sure that X and Y are not independent, i.e., they are dependent. We can check this by using the second method. Second method (by comparing the conditional densities to the marginal densities): Inspection of the appropriate densities above shows that fX(x|y)≠fX(x) and if fY(y|x)≠fY(y), either one of which shows that X and Y are not independent. Module U8: Bivariate Random Variables 11 Problems Problem 1 Two random variables have the joint density function xy, 0 x 1, 0 y 2 f X ,Y ( x, y ) otherwise 0, a. Find the probability that both random variables will be less than 1.0 b. Find the joint cumulative distribution c. Find both marginal density functions d. Find an expression for the conditional density function of x given y e. Find an expression for the conditional density function of x given y=1 f. Determine whether X and Y are independent or dependent. Explain how you know. Problem 2 Let the random variable X represent the time (in minutes) that a programmer spends to locate and correct a routine problem, and let random variable Y represent the length of time (in minutes) that he needs to spend reading procedures for correcting the problem. These two random variables are represented by the joint probability density c( x1 / 3 y 1 / 5 ), 0 x 60, 0 y 10 f X ,Y ( x, y) 0, otherwise Determine a. The value of c b. The probability that a programmer will be able to take care of the problem in less than 10 minutes using less than 2 minutes to read the necessary procedures. c. Whether the two random variables are independent. Problem 3 A joint probability density is given by f(u,t)=0.02 for 20<t<30 and 5<u<10. Find the CDF and then use this CDF to obtain 1. P[U<5,T<20] 2. P[U<10,T<30] Problem 4 Module U8: Bivariate Random Variables 12 For the following joint probability density function, find the probability that the random variable is between the limits 0 and 1 AND that the random variable Y is also between the limits 0 and 1. 9 e 3 x e 3 y , x, y 0 f X ,Y ( x, y) otherwise 0, Problem 5 For the following joint probability density function, find the probability that the random variable X is between the limits 0 and 2 AND that the random variable Y is between the limits 0 and 4. 4 xy, 0 x 3, 0 y 5 f X ,Y ( x, y ) 225 0, otherwise Problem 6 For the following joint probability density function, find both marginal pdfs and both conditional pdfs. 4 xy, 0 x 3, 0 y 5 f X ,Y ( x, y ) 225 0, otherwise Problem 7 For the following joint probability density function, find the value of A, both marginal pdfs, and both conditional pdfs. Also, determine if X and Y are independent. A x y , 0 x 1, 0 y 1 f X ,Y ( x, y ) otherwise 0, Problem 8 Two random variables, x and y, are described by the following probability density function. 4 xy, 0 x 1, 0 y 1 f X ,Y ( x, y ) otherwise 0, a. Prove that this is a valid joint probability density function (pdf). b. Find the cumulative distribution function (CDF) for these random variables for the range of values over which the pdf is non-zero. c. Find the CDF for x<0, y<0. d. Find the CDF for x>1, y>1. Module U8: Bivariate Random Variables 13 e. Find the CDF for x>1, 0<y<1 f. An particular event is specified by the range A: (X+Y)/2<0.5. Use the pdf to evaluate the probability P[A]. g. Find the marginal pdfs fX(x) and fY(y). h. Are the two random variables X and Y independent? Problem 9 1. Two random variable, X and Y, are described by the following bi-variate probability density function: fXY(x,y) =6x2y 0.8736<x<1, 1<y<2 =0 otherwise a. Prove that this function is a valid pdf. b. Find both marginal distributions. c. Find the conditional pdf, fX(x|y). d. Compute Prob[X<0.9, Y< 1.3]. e. Prove that X and Y are dependent, or prove that they are independent. f. Compute the mean value of X, X. References [1] [2] [3] [4] [5] [6] [7] [8] “Power System Reliability Evaluation,” IEEE Tutorial Course 82 EHO 195-8-PWR, 1982, IEEE. P. Anderson, “Power System Protection,” 1996 unpublished review copy. S. Ross, “Introduction to Probability Models,” Academic Press, New York, 1972. M. Modarres, M. Kaminskiy, and V. Krivsov, “Reliability Engineering and Risk Analysis, A Practical Guide,” Marcel Dekker, Inc. New York, 1999. L. Bain and M. Engelhardt, “Introduction to Probability and Mathematical Statistics,” Duxbury Press, Belmont, California, 1992. A. Papoulis, “Probability, Random Variables, and Stochastic Processes,” McGraw-Hill, New York, 1984. W. Blischke and D. Murthy, “Reliability: Modeling, Prediction, and Optimization,” John Wiley, New York, 2000. R. Billinton and R. Allan, “Reliability Evaluation of Engineering Systems,” Plenum Press, New York, 1992.