Chapter 8. Jointly Distributed Random Variables

Suppose we have two random variables $X$ and $Y$ defined on a sample space $S$ with probability measure $P(E)$, $E \subseteq S$. Note that $S$ could be the set of outcomes from two 'experiments', with the sample space points two-dimensional; then perhaps $X$ could relate to the first experiment, and $Y$ to the second.

From before we know to define the marginal probability distributions $P_X$ and $P_Y$ by, for example,
$$P_X(B) = P(X^{-1}(B)), \qquad B \subseteq \mathbb{R}.$$
We now define the joint probability distribution:

Definition 8.0.1. Given a pair of random variables, $X$ and $Y$, we define the joint probability distribution $P_{XY}$ as follows:
$$P_{XY}(B_X, B_Y) = P\{X^{-1}(B_X) \cap Y^{-1}(B_Y)\}, \qquad B_X, B_Y \subseteq \mathbb{R}.$$

So $P_{XY}(B_X, B_Y)$, the probability that $X \in B_X$ and $Y \in B_Y$, is given by the probability of the set of all points in the sample space that are mapped both into $B_X$ by $X$ and into $B_Y$ by $Y$. More generally, for a single region $B_{XY} \subseteq \mathbb{R}^2$, find the collection of sample space elements
$$S_{XY} = \{s \in S \mid (X(s), Y(s)) \in B_{XY}\}$$
and define $P_{XY}(B_{XY}) = P(S_{XY})$.

8.0.1 Joint Cumulative Distribution Function

We define the joint cumulative distribution as follows:

Definition 8.0.2. Given a pair of random variables, $X$ and $Y$, the joint cumulative distribution function is defined as
$$F_{XY}(x, y) = P_{XY}(X \le x, Y \le y), \qquad x, y \in \mathbb{R}.$$

It is easy to check that the marginal cdfs for $X$ and $Y$ can be recovered by
$$F_X(x) = F_{XY}(x, \infty), \quad x \in \mathbb{R}, \qquad F_Y(y) = F_{XY}(\infty, y), \quad y \in \mathbb{R},$$
and that the two definitions will agree.

8.0.2 Properties of Joint CDF $F_{XY}$

For $F_{XY}$ to be a valid cdf, we need to make sure the following conditions hold.

1. $0 \le F_{XY}(x, y) \le 1$, $\forall x, y \in \mathbb{R}$;
2. Monotonicity: $\forall x_1, x_2, y_1, y_2 \in \mathbb{R}$, $x_1 < x_2 \Rightarrow F_{XY}(x_1, y_1) \le F_{XY}(x_2, y_1)$ and $y_1 < y_2 \Rightarrow F_{XY}(x_1, y_1) \le F_{XY}(x_1, y_2)$;
3. $\forall x, y \in \mathbb{R}$, $F_{XY}(x, -\infty) = 0$, $F_{XY}(-\infty, y) = 0$ and $F_{XY}(\infty, \infty) = 1$.

Suppose we are interested in whether the random variable pair $(X, Y)$ lies in the interval cross product $(x_1, x_2] \times (y_1, y_2]$; that is, whether $x_1 < X \le x_2$ and $y_1 < Y \le y_2$.

[Figure: the rectangle $(x_1, x_2] \times (y_1, y_2]$ in the $(X, Y)$ plane.]

First note that
$$P_{XY}(x_1 < X \le x_2, Y \le y) = F_{XY}(x_2, y) - F_{XY}(x_1, y).$$
It is then easy to see that $P_{XY}(x_1 < X \le x_2, y_1 < Y \le y_2)$ is given by
$$F_{XY}(x_2, y_2) - F_{XY}(x_1, y_2) - F_{XY}(x_2, y_1) + F_{XY}(x_1, y_1).$$

8.0.3 Joint Probability Mass Functions

Definition 8.0.3. If $X$ and $Y$ are both discrete random variables, then we can define the joint probability mass function as
$$p_{XY}(x, y) = P_{XY}(X = x, Y = y), \qquad x, y \in \mathbb{R}.$$

We can recover the marginal pmfs $p_X$ and $p_Y$ since, by the law of total probability, $\forall x, y \in \mathbb{R}$,
$$p_X(x) = \sum_y p_{XY}(x, y), \qquad p_Y(y) = \sum_x p_{XY}(x, y).$$

Properties of Joint PMFs

For $p_{XY}$ to be a valid pmf, we need to make sure the following conditions hold.

1. $0 \le p_{XY}(x, y) \le 1$, $\forall x, y \in \mathbb{R}$;
2. $\sum_y \sum_x p_{XY}(x, y) = 1$.
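As an illustration of the marginal pmf formulas above, the following Python sketch (not part of the original notes; the joint pmf values are invented for the example, and numpy is assumed available) stores a small joint pmf as a 2D array and recovers the marginals by summing over the other variable.

```python
import numpy as np

# Hypothetical joint pmf p_XY(x, y) for x in {0, 1, 2} (rows) and y in {0, 1} (columns).
p_xy = np.array([
    [0.10, 0.15],
    [0.20, 0.25],
    [0.05, 0.25],
])

# Property 2: the joint pmf must sum to 1 over all (x, y).
assert np.isclose(p_xy.sum(), 1.0)

# Marginals via the law of total probability:
# p_X(x) = sum over y of p_XY(x, y),  p_Y(y) = sum over x of p_XY(x, y).
p_x = p_xy.sum(axis=1)  # sum over y -> [0.25, 0.45, 0.30]
p_y = p_xy.sum(axis=0)  # sum over x -> [0.35, 0.65]
print(p_x, p_y)
```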
8.0.4 Joint Probability Density Functions

On the other hand, if there exists $f_{XY} : \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ such that
$$P_{XY}(B_{XY}) = \iint_{(x,y) \in B_{XY}} f_{XY}(x, y)\,dx\,dy, \qquad B_{XY} \subseteq \mathbb{R} \times \mathbb{R},$$
then we say $X$ and $Y$ are jointly continuous and we refer to $f_{XY}$ as the joint probability density function of $X$ and $Y$. In this case we have
$$F_{XY}(x, y) = \int_{t=-\infty}^{y} \int_{s=-\infty}^{x} f_{XY}(s, t)\,ds\,dt, \qquad x, y \in \mathbb{R}.$$

Definition 8.0.4. By the Fundamental Theorem of Calculus we can identify the joint pdf as
$$f_{XY}(x, y) = \frac{\partial^2}{\partial x\,\partial y} F_{XY}(x, y).$$

Furthermore, we can recover the marginal densities $f_X$ and $f_Y$:
$$f_X(x) = \frac{d}{dx} F_X(x) = \frac{d}{dx} F_{XY}(x, \infty) = \frac{d}{dx} \int_{y=-\infty}^{\infty} \int_{s=-\infty}^{x} f_{XY}(s, y)\,ds\,dy.$$
By the Fundamental Theorem of Calculus, and through a symmetric argument for $Y$, we thus get
$$f_X(x) = \int_{y=-\infty}^{\infty} f_{XY}(x, y)\,dy, \qquad f_Y(y) = \int_{x=-\infty}^{\infty} f_{XY}(x, y)\,dx.$$

Properties of Joint PDFs

For $f_{XY}$ to be a valid pdf, we need to make sure the following conditions hold.

1. $f_{XY}(x, y) \ge 0$, $\forall x, y \in \mathbb{R}$;
2. $\int_{y=-\infty}^{\infty} \int_{x=-\infty}^{\infty} f_{XY}(x, y)\,dx\,dy = 1$.

8.1 Independence and Expectation

8.1.1 Independence

Two random variables $X$ and $Y$ are independent if and only if
$$P_{XY}(B_X, B_Y) = P_X(B_X)\,P_Y(B_Y), \qquad \forall B_X, B_Y \subseteq \mathbb{R}.$$
More specifically:

Definition 8.1.1. Two random variables $X$ and $Y$ are independent if and only if
$$f_{XY}(x, y) = f_X(x)\,f_Y(y), \qquad \forall x, y \in \mathbb{R}.$$

Definition 8.1.2. For two random variables $X, Y$ we define the conditional probability distribution $P_{Y|X}$ by
$$P_{Y|X}(B_Y \mid B_X) = \frac{P_{XY}(B_X, B_Y)}{P_X(B_X)}, \qquad B_X, B_Y \subseteq \mathbb{R}.$$
This is the revised probability of $Y$ falling inside $B_Y$ given that we now know $X \in B_X$. Then $X$ and $Y$ are independent $\iff P_{Y|X}(B_Y \mid B_X) = P_Y(B_Y)$, $\forall B_X, B_Y \subseteq \mathbb{R}$.

Definition 8.1.3. For random variables $X, Y$ we define the conditional probability density function $f_{Y|X}$ by
$$f_{Y|X}(y \mid x) = \frac{f_{XY}(x, y)}{f_X(x)}, \qquad x, y \in \mathbb{R}.$$

Note: The random variables $X$ and $Y$ are independent $\iff f_{Y|X}(y \mid x) = f_Y(y)$, $\forall x, y \in \mathbb{R}$.

8.1.2 Expectation

Suppose we have a (measurable) bivariate function of interest of the random variables $X$ and $Y$, $g : \mathbb{R} \times \mathbb{R} \to \mathbb{R}$.

Definition 8.1.4. If $X$ and $Y$ are discrete, we define $E\{g(X, Y)\}$ by
$$E_{XY}\{g(X, Y)\} = \sum_y \sum_x g(x, y)\,p_{XY}(x, y).$$

Definition 8.1.5. If $X$ and $Y$ are jointly continuous, we define $E\{g(X, Y)\}$ by
$$E_{XY}\{g(X, Y)\} = \int_{y=-\infty}^{\infty} \int_{x=-\infty}^{\infty} g(x, y)\,f_{XY}(x, y)\,dx\,dy.$$

Immediately from these definitions we have the following:

• If $g(X, Y) = g_1(X) + g_2(Y)$, then $E_{XY}\{g_1(X) + g_2(Y)\} = E_X\{g_1(X)\} + E_Y\{g_2(Y)\}$.
• If $g(X, Y) = g_1(X)\,g_2(Y)$ and $X$ and $Y$ are independent, then $E_{XY}\{g_1(X)\,g_2(Y)\} = E_X\{g_1(X)\}\,E_Y\{g_2(Y)\}$.

In particular, considering $g(X, Y) = XY$ for independent $X, Y$ we have $E_{XY}(XY) = E_X(X)\,E_Y(Y)$.

Warning! In general (that is, without independence) $E_{XY}(XY) \neq E_X(X)\,E_Y(Y)$.

8.1.3 Conditional Expectation

Suppose $X$ and $Y$ are discrete random variables with joint pmf $p(x, y)$. If we are given the value $x$ of the random variable $X$, our revised pmf for $Y$ is the conditional pmf $p(y \mid x)$, for $y \in \mathbb{R}$.

Definition 8.1.6. The conditional expectation of $Y$ given $X = x$ is therefore
$$E_{Y|X}(Y \mid X = x) = \sum_y y\,p(y \mid x).$$

Similarly,

Definition 8.1.7. If $X$ and $Y$ were continuous,
$$E_{Y|X}(Y \mid X = x) = \int_{y=-\infty}^{\infty} y\,f(y \mid x)\,dy.$$

In either case, the conditional expectation is a function of $x$ but not of the unknown $Y$.

For a single variable $X$ we considered the expectation of $g(X) = (X - \mu_X)(X - \mu_X)$, called the variance and denoted $\sigma_X^2$. The bivariate extension of this is the expectation of $g(X, Y) = (X - \mu_X)(Y - \mu_Y)$. We define the covariance of $X$ and $Y$ by
$$\sigma_{XY} = \mathrm{Cov}(X, Y) = E_{XY}[(X - \mu_X)(Y - \mu_Y)].$$
Covariance measures how the random variables move in tandem with one another, and so is closely related to the idea of correlation.

Definition 8.1.8. We define the correlation of $X$ and $Y$ by
$$\rho_{XY} = \mathrm{Cor}(X, Y) = \frac{\sigma_{XY}}{\sigma_X \sigma_Y}.$$
Unlike the covariance, the correlation is invariant to the scale of the random variables $X$ and $Y$. It is easily shown that if $X$ and $Y$ are independent random variables, then $\sigma_{XY} = \rho_{XY} = 0$.
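The following simulation sketch (illustrative only, not from the notes; the distributions and parameters are chosen arbitrarily, and numpy is assumed available) checks numerically that for independently generated $X$ and $Y$ the sample versions of $E_{XY}(XY)$ and $E_X(X)\,E_Y(Y)$ agree, and that the sample covariance and correlation are close to zero.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

# X and Y drawn independently: X ~ Exponential(mean 2), Y ~ Normal(mean 3, sd 1.5).
x = rng.exponential(scale=2.0, size=n)
y = rng.normal(loc=3.0, scale=1.5, size=n)

print(np.mean(x * y), np.mean(x) * np.mean(y))  # both approximately 2 * 3 = 6
print(np.cov(x, y)[0, 1])                       # sample covariance, approximately 0
print(np.corrcoef(x, y)[0, 1])                  # sample correlation, approximately 0
```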
8.2 Examples

Example. Suppose that the lifetime, $X$, and brightness, $Y$, of a light bulb are modelled as continuous random variables. Let their joint pdf be given by
$$f(x, y) = \lambda_1 \lambda_2 e^{-\lambda_1 x - \lambda_2 y}, \qquad x, y > 0.$$

Question: Are lifetime and brightness independent?

Solution: If the lifetime and brightness are independent we would have $f(x, y) = f(x)\,f(y)$ for all $x, y \in \mathbb{R}$. The marginal pdf for $X$ is
$$f(x) = \int_{y=-\infty}^{\infty} f(x, y)\,dy = \int_{y=0}^{\infty} \lambda_1 \lambda_2 e^{-\lambda_1 x - \lambda_2 y}\,dy = \lambda_1 e^{-\lambda_1 x}.$$
Similarly $f(y) = \lambda_2 e^{-\lambda_2 y}$. Hence $f(x, y) = f(x)\,f(y)$ and $X$ and $Y$ are independent. □

Example. Suppose continuous random variables $(X, Y) \in \mathbb{R}^2$ have joint pdf
$$f(x, y) = \begin{cases} 1, & |x| + |y| < 1/\sqrt{2}, \\ 0, & \text{otherwise.} \end{cases}$$

Question: Determine the marginal pdfs for $X$ and $Y$.

Solution: We have $|x| + |y| < 1/\sqrt{2} \iff |y| < 1/\sqrt{2} - |x|$. So, for $|x| < 1/\sqrt{2}$,
$$f(x) = \int_{y=-(1/\sqrt{2} - |x|)}^{1/\sqrt{2} - |x|} 1\,dy = \sqrt{2} - 2|x|,$$
and $f(x) = 0$ otherwise. Similarly $f(y) = \sqrt{2} - 2|y|$. Hence $f(x, y) \neq f(x)\,f(y)$ and $X$ and $Y$ are not independent. □
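As a rough numerical check of this last example (an illustrative sketch, not part of the notes; numpy is assumed available), the code below samples $(X, Y)$ uniformly on the diamond $|x| + |y| < 1/\sqrt{2}$ by rejection from the enclosing square and compares the empirical marginal density of $X$ near a chosen point with the formula $f(x) = \sqrt{2} - 2|x|$ derived above.

```python
import numpy as np

rng = np.random.default_rng(1)
half = 1 / np.sqrt(2)

# Rejection sampling: draw uniformly on the square [-half, half]^2 and keep
# points inside the diamond |x| + |y| < half (which has area 1, so the joint
# density there is 1, matching f(x, y) = 1 in the example).
pts = rng.uniform(-half, half, size=(2_000_000, 2))
inside = pts[np.abs(pts[:, 0]) + np.abs(pts[:, 1]) < half]

# Estimate f_X(x0) by P(|X - x0| < h) / (2h) and compare with sqrt(2) - 2|x0|.
x0, h = 0.25, 0.01
empirical = np.mean(np.abs(inside[:, 0] - x0) < h) / (2 * h)
theoretical = np.sqrt(2) - 2 * abs(x0)
print(empirical, theoretical)  # both approximately 0.91
```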