Jointly Distributed Random Variables
COMP 245 STATISTICS
Dr N A Heard

Contents

1 Jointly Distributed Random Variables
  1.1 Definition
  1.2 Joint cdfs
  1.3 Joint Probability Mass Functions
  1.4 Joint Probability Density Functions

2 Independence and Expectation
  2.1 Independence
  2.2 Expectation
  2.3 Conditional Expectation
  2.4 Covariance and Correlation

3 Examples

1 Jointly Distributed Random Variables

1.1 Definition

Joint Probability Distribution

Suppose we have two random variables $X$ and $Y$ defined on a sample space $S$ with probability measure $P(E)$, $E \subseteq S$. Jointly, they form a map $(X, Y) : S \to \mathbb{R}^2$, where $s \mapsto (X(s), Y(s))$.

From before, we know to define the marginal probability distributions $P_X$ and $P_Y$ by, for example,
\[ P_X(B) = P(X^{-1}(B)), \quad B \subseteq \mathbb{R}. \]

We now define the joint probability distribution $P_{XY}$ by
\[ P_{XY}(B_X, B_Y) = P\{X^{-1}(B_X) \cap Y^{-1}(B_Y)\}, \quad B_X, B_Y \subseteq \mathbb{R}. \]

So $P_{XY}(B_X, B_Y)$, the probability that $X \in B_X$ and $Y \in B_Y$, is the probability $P$ of the set of all points in the sample space that are mapped both into $B_X$ by $X$ and into $B_Y$ by $Y$.

More generally, for a single region $B_{XY} \subseteq \mathbb{R}^2$, find the collection of sample space elements
\[ S_{B_{XY}} = \{s \in S \mid (X(s), Y(s)) \in B_{XY}\} \]
and define
\[ P_{XY}(B_{XY}) = P(S_{B_{XY}}). \]

1.2 Joint cdfs

Joint Cumulative Distribution Function

We thus define the joint cumulative distribution function as
\[ F_{XY}(x, y) = P_{XY}((-\infty, x], (-\infty, y]), \quad x, y \in \mathbb{R}. \]

It is easy to check that the marginal cdfs for $X$ and $Y$ can be recovered by
\[ F_X(x) = F_{XY}(x, \infty), \quad x \in \mathbb{R}, \qquad F_Y(y) = F_{XY}(\infty, y), \quad y \in \mathbb{R}, \]
and that the two definitions agree.

Properties of a Joint cdf

For $F_{XY}$ to be a valid cdf, the following conditions must hold:

1. $0 \le F_{XY}(x, y) \le 1$, $\forall x, y \in \mathbb{R}$;
2. Monotonicity: $\forall x_1, x_2, y_1, y_2 \in \mathbb{R}$,
   $x_1 < x_2 \Rightarrow F_{XY}(x_1, y_1) \le F_{XY}(x_2, y_1)$ and $y_1 < y_2 \Rightarrow F_{XY}(x_1, y_1) \le F_{XY}(x_1, y_2)$;
3. $\forall x, y \in \mathbb{R}$, $F_{XY}(x, -\infty) = 0$, $F_{XY}(-\infty, y) = 0$ and $F_{XY}(\infty, \infty) = 1$.

Interval Probabilities

Suppose we are interested in whether the random variable pair $(X, Y)$ lies in the interval cross product $(x_1, x_2] \times (y_1, y_2]$; that is, whether $x_1 < X \le x_2$ and $y_1 < Y \le y_2$.

[Figure: the rectangle $(x_1, x_2] \times (y_1, y_2]$ in the $(X, Y)$-plane.]

First note that
\[ P_{XY}((x_1, x_2], (-\infty, y]) = F_{XY}(x_2, y) - F_{XY}(x_1, y). \]
It is then easy to see that $P_{XY}((x_1, x_2], (y_1, y_2])$ is given by
\[ F_{XY}(x_2, y_2) - F_{XY}(x_1, y_2) - F_{XY}(x_2, y_1) + F_{XY}(x_1, y_1). \]

1.3 Joint Probability Mass Functions

If $X$ and $Y$ are both discrete random variables, then we can define the joint probability mass function as
\[ p_{XY}(x, y) = P_{XY}(\{x\}, \{y\}), \quad x, y \in \mathbb{R}. \]

We can recover the marginal pmfs $p_X$ and $p_Y$ since, by the law of total probability, $\forall x, y \in \mathbb{R}$,
\[ p_X(x) = \sum_y p_{XY}(x, y), \qquad p_Y(y) = \sum_x p_{XY}(x, y). \]

Properties of a Joint pmf

For $p_{XY}$ to be a valid pmf, the following conditions must hold:

1. $0 \le p_{XY}(x, y) \le 1$, $\forall x, y \in \mathbb{R}$;
2. $\sum_y \sum_x p_{XY}(x, y) = 1$.
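These definitions are easy to check numerically. Below is a minimal Python sketch, not part of the original notes: the toy pmf values, variable names and index choices are illustrative assumptions. It stores a joint pmf as a table, verifies the two pmf properties, recovers both marginals by the law of total probability, and confirms the rectangle identity for interval probabilities from Section 1.2.

```python
import numpy as np

# Hypothetical joint pmf p_XY(x, y) for X in {0, 1, 2} (rows)
# and Y in {0, 1} (columns); the numbers are made up for illustration.
x_vals = np.array([0, 1, 2])
y_vals = np.array([0, 1])
p_xy = np.array([[0.10, 0.20],
                 [0.25, 0.15],
                 [0.20, 0.10]])

# Property 1: every entry lies in [0, 1].
assert np.all((p_xy >= 0) & (p_xy <= 1))
# Property 2: the entries sum to 1.
assert np.isclose(p_xy.sum(), 1.0)

# Marginals by the law of total probability:
# p_X(x) = sum_y p_XY(x, y),  p_Y(y) = sum_x p_XY(x, y).
p_x = p_xy.sum(axis=1)   # [0.30, 0.40, 0.30]
p_y = p_xy.sum(axis=0)   # [0.55, 0.45]

# Joint cdf F_XY on the grid via cumulative sums, then the identity
# P(x1 < X <= x2, y1 < Y <= y2)
#   = F(x2,y2) - F(x1,y2) - F(x2,y1) + F(x1,y1).
F = p_xy.cumsum(axis=0).cumsum(axis=1)
rect = F[2, 1] - F[0, 1] - F[2, 0] + F[0, 0]   # X in {1,2}, Y = 1
assert np.isclose(rect, p_xy[1:, 1:].sum())
```

On a finite grid the joint cdf is just a double cumulative sum of the pmf table, which is why the rectangle identity reduces to two array lookups per corner.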
1.4 Joint Probability Density Functions

On the other hand, if there exists $f_{XY} : \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ such that
\[ P_{XY}(B_{XY}) = \iint_{(x,y) \in B_{XY}} f_{XY}(x, y)\,dx\,dy, \quad B_{XY} \subseteq \mathbb{R} \times \mathbb{R}, \]
then we say $X$ and $Y$ are jointly continuous, and we refer to $f_{XY}$ as the joint probability density function of $X$ and $Y$. In this case, we have
\[ F_{XY}(x, y) = \int_{t=-\infty}^{y} \int_{s=-\infty}^{x} f_{XY}(s, t)\,ds\,dt, \quad x, y \in \mathbb{R}. \]
By the Fundamental Theorem of Calculus we can identify the joint pdf as
\[ f_{XY}(x, y) = \frac{\partial^2}{\partial x\,\partial y} F_{XY}(x, y). \]

Furthermore, we can recover the marginal densities $f_X$ and $f_Y$:
\[ f_X(x) = \frac{d}{dx} F_X(x) = \frac{d}{dx} F_{XY}(x, \infty) = \frac{d}{dx} \int_{y=-\infty}^{\infty} \int_{s=-\infty}^{x} f_{XY}(s, y)\,ds\,dy. \]
By the Fundamental Theorem of Calculus, and through a symmetric argument for $Y$, we thus get
\[ f_X(x) = \int_{y=-\infty}^{\infty} f_{XY}(x, y)\,dy, \qquad f_Y(y) = \int_{x=-\infty}^{\infty} f_{XY}(x, y)\,dx. \]

Properties of a Joint pdf

For $f_{XY}$ to be a valid pdf, the following conditions must hold:

1. $f_{XY}(x, y) \ge 0$, $\forall x, y \in \mathbb{R}$;
2. $\int_{y=-\infty}^{\infty} \int_{x=-\infty}^{\infty} f_{XY}(x, y)\,dx\,dy = 1$.

2 Independence and Expectation

2.1 Independence

Independence of Random Variables

Two random variables $X$ and $Y$ are independent iff
\[ P_{XY}(B_X, B_Y) = P_X(B_X) P_Y(B_Y), \quad \forall B_X, B_Y \subseteq \mathbb{R}. \]
More specifically, two discrete random variables $X$ and $Y$ are independent iff
\[ p_{XY}(x, y) = p_X(x)\,p_Y(y), \quad \forall x, y \in \mathbb{R}; \]
and two continuous random variables $X$ and $Y$ are independent iff
\[ f_{XY}(x, y) = f_X(x)\,f_Y(y), \quad \forall x, y \in \mathbb{R}. \]

Conditional Distributions

For two r.v.s $X, Y$ we define the conditional probability distribution $P_{Y|X}$ by
\[ P_{Y|X}(B_Y \mid B_X) = \frac{P_{XY}(B_X, B_Y)}{P_X(B_X)}, \quad B_X, B_Y \subseteq \mathbb{R},\ P_X(B_X) \neq 0. \]
This is the revised probability of $Y$ falling inside $B_Y$ given that we now know $X \in B_X$. Then we have that $X$ and $Y$ are independent
\[ \iff P_{Y|X}(B_Y \mid B_X) = P_Y(B_Y), \quad \forall B_X, B_Y \subseteq \mathbb{R}. \]

For discrete r.v.s $X, Y$ we define the conditional probability mass function $p_{Y|X}$ by
\[ p_{Y|X}(y \mid x) = \frac{p_{XY}(x, y)}{p_X(x)}, \quad x, y \in \mathbb{R},\ p_X(x) \neq 0, \]
and for continuous r.v.s $X, Y$ we define the conditional probability density function $f_{Y|X}$ by
\[ f_{Y|X}(y \mid x) = \frac{f_{XY}(x, y)}{f_X(x)}, \quad x, y \in \mathbb{R}. \]

[Justification:
\[ P(Y \le y \mid X \in [x, x+dx)) = \frac{P_{XY}([x, x+dx), (-\infty, y])}{P_X([x, x+dx))} = \frac{F(y, x+dx) - F(y, x)}{F(x+dx) - F(x)} = \frac{\{F(y, x+dx) - F(y, x)\}/dx}{\{F(x+dx) - F(x)\}/dx} \]
\[ \Rightarrow f(y \mid X \in [x, x+dx)) = \frac{\frac{\partial}{\partial y}\{F(y, x+dx) - F(y, x)\}/dx}{\{F(x+dx) - F(x)\}/dx} \]
\[ \Rightarrow f(y \mid x) = \lim_{dx \to 0} f(y \mid X \in [x, x+dx)) = \frac{f(y, x)}{f(x)}.\ ] \]

In either case, $X$ and $Y$ are independent
\[ \iff p_{Y|X}(y \mid x) = p_Y(y) \ \text{ or } \ f_{Y|X}(y \mid x) = f_Y(y), \quad \forall x, y \in \mathbb{R}. \]

2.2 Expectation

$E\{g(X, Y)\}$

Suppose we have a bivariate function of interest of the random variables $X$ and $Y$, $g : \mathbb{R} \times \mathbb{R} \to \mathbb{R}$.

If $X$ and $Y$ are discrete, we define $E\{g(X, Y)\}$ by
\[ E_{XY}\{g(X, Y)\} = \sum_y \sum_x g(x, y)\,p_{XY}(x, y). \]
If $X$ and $Y$ are jointly continuous, we define $E\{g(X, Y)\}$ by
\[ E_{XY}\{g(X, Y)\} = \int_{y=-\infty}^{\infty} \int_{x=-\infty}^{\infty} g(x, y)\,f_{XY}(x, y)\,dx\,dy. \]

Immediately from these definitions we have the following:

• If $g(X, Y) = g_1(X) + g_2(Y)$, then
\[ E_{XY}\{g_1(X) + g_2(Y)\} = E_X\{g_1(X)\} + E_Y\{g_2(Y)\}. \]
• If $g(X, Y) = g_1(X)\,g_2(Y)$ and $X$ and $Y$ are independent, then
\[ E_{XY}\{g_1(X)\,g_2(Y)\} = E_X\{g_1(X)\}\,E_Y\{g_2(Y)\}. \]

In particular, considering $g(X, Y) = XY$ for independent $X, Y$ we have $E_{XY}(XY) = E_X(X) E_Y(Y)$. In general, however, $E_{XY}(XY) \neq E_X(X) E_Y(Y)$.
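The product rule for independent variables, and its failure under dependence, can be illustrated by simulation. The following Python sketch is my own illustration rather than part of the notes; the particular distributions (two independent exponentials, then a deliberately dependent pair), the seed and the sample size are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

# Independent draws: E(XY) should match E(X)E(Y).
x = rng.exponential(scale=1.0, size=n)   # X ~ Exp(1), so E(X) = 1
y = rng.exponential(scale=0.5, size=n)   # Y ~ Exp(2), so E(Y) = 0.5
print((x * y).mean(), x.mean() * y.mean())   # both close to 0.5

# Dependent pair: Z = X + noise, so E(XZ) = E(X^2) + 0 = 2,
# while E(X)E(Z) = 1; the product rule fails.
z = x + rng.normal(size=n)
print((x * z).mean(), x.mean() * z.mean())   # ~2.0 vs ~1.0
```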
2.3 Conditional Expectation

$E_{Y|X}(Y \mid X = x)$

Suppose $X$ and $Y$ are discrete r.v.s with joint pmf $p(x, y)$. If we are given the value $x$ of the r.v. $X$, our revised pmf for $Y$ is the conditional pmf $p(y \mid x)$, $y \in \mathbb{R}$. The conditional expectation of $Y$ given $X = x$ is therefore
\[ E_{Y|X}(Y \mid X = x) = \sum_y y\,p(y \mid x). \]
Similarly, if $X$ and $Y$ are continuous,
\[ E_{Y|X}(Y \mid X = x) = \int_{y=-\infty}^{\infty} y\,f(y \mid x)\,dy. \]
In either case, the conditional expectation is a function of $x$ but not of the unknown $Y$.

2.4 Covariance and Correlation

Covariance

For a single variable $X$ we considered the expectation of $g(X) = (X - \mu_X)(X - \mu_X)$, called the variance of $X$ and denoted $\sigma_X^2$. The bivariate extension of this is the expectation of $g(X, Y) = (X - \mu_X)(Y - \mu_Y)$. We define the covariance of $X$ and $Y$ by
\[ \sigma_{XY} = \mathrm{Cov}(X, Y) = E_{XY}[(X - \mu_X)(Y - \mu_Y)]. \]

Correlation

Covariance measures how the random variables move in tandem with one another, and so is closely related to the idea of correlation. We define the correlation of $X$ and $Y$ by
\[ \rho_{XY} = \mathrm{Cor}(X, Y) = \frac{\sigma_{XY}}{\sigma_X \sigma_Y}. \]
Unlike the covariance, the correlation is invariant to the scales of the r.v.s $X$ and $Y$. It is easily shown that if $X$ and $Y$ are independent random variables, then $\sigma_{XY} = \rho_{XY} = 0$.

3 Examples

Example 1

Suppose that the lifetime, $X$, and brightness, $Y$, of a light bulb are modelled as continuous random variables with joint pdf
\[ f(x, y) = \lambda_1 \lambda_2 e^{-\lambda_1 x - \lambda_2 y}, \quad x, y > 0. \]
Are lifetime and brightness independent?

The marginal pdf for $X$ is
\[ f(x) = \int_{y=-\infty}^{\infty} f(x, y)\,dy = \int_{y=0}^{\infty} \lambda_1 \lambda_2 e^{-\lambda_1 x - \lambda_2 y}\,dy = \lambda_1 e^{-\lambda_1 x}. \]
Similarly, $f(y) = \lambda_2 e^{-\lambda_2 y}$. Hence $f(x, y) = f(x) f(y)$, and $X$ and $Y$ are independent.

Example 2

Suppose continuous r.v.s $(X, Y) \in \mathbb{R}^2$ have joint pdf
\[ f(x, y) = \begin{cases} \dfrac{1}{\pi}, & x^2 + y^2 \le 1, \\[4pt] 0, & \text{otherwise}. \end{cases} \]
Determine the marginal pdfs for $X$ and $Y$.

Well, $x^2 + y^2 \le 1 \iff |y| \le \sqrt{1 - x^2}$. So
\[ f(x) = \int_{y=-\infty}^{\infty} f(x, y)\,dy = \int_{y=-\sqrt{1-x^2}}^{\sqrt{1-x^2}} \frac{1}{\pi}\,dy = \frac{2}{\pi}\sqrt{1 - x^2}. \]
Similarly, $f(y) = \frac{2}{\pi}\sqrt{1 - y^2}$. Hence $f(x, y) \neq f(x) f(y)$, and $X$ and $Y$ are not independent.
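Example 2's marginal can be checked by simulation. The sketch below is an illustration of my own, assuming NumPy is available; rejection sampling from the enclosing square is one standard way to draw uniformly from the unit disc, and the seed, sample size and bin count are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)

# Sample uniformly on the unit disc by rejection from the square [-1, 1]^2:
# keep only the points with x^2 + y^2 <= 1 (about pi/4 of the draws).
pts = rng.uniform(-1.0, 1.0, size=(4_000_000, 2))
pts = pts[(pts ** 2).sum(axis=1) <= 1.0]
x = pts[:, 0]

# Compare an empirical histogram of X with the derived marginal
# f(x) = (2 / pi) * sqrt(1 - x^2), evaluated at the bin midpoints.
bins = np.linspace(-1.0, 1.0, 41)
hist, edges = np.histogram(x, bins=bins, density=True)
mids = 0.5 * (edges[:-1] + edges[1:])
f_marginal = (2.0 / np.pi) * np.sqrt(1.0 - mids ** 2)

# Deviations are small; the largest appear in the outermost bins near
# x = +/-1, where the density is steep and a midpoint value is a
# coarser summary of the bin average.
print(np.max(np.abs(hist - f_marginal)))
```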