Chapter 8.
Jointly Distributed Random Variables
Suppose we have two random variables X and Y defined on a sample space S with probability measure P(E), E ⊆ S.
Note that S could be the set of outcomes from two ‘experiments’, and the sample space
points be two-dimensional; then perhaps X could relate to the first experiment, and Y to the
second.
Recall from before that the marginal probability distributions P_X and P_Y are defined by, for example,
P_X(B) = P(X^{−1}(B)),    B ⊆ R.
We now define the joint probability distribution:
Definition 8.0.1. Given a pair of random variables, X and Y, we define the joint probability
distribution PXY as follows:
P_{XY}(B_X, B_Y) = P{X^{−1}(B_X) ∩ Y^{−1}(B_Y)},    B_X, B_Y ⊆ R.
So P_{XY}(B_X, B_Y), the probability that X ∈ B_X and Y ∈ B_Y, is given by the probability of the set of all points in the sample space that get mapped both into B_X by X and into B_Y by Y.
More generally, for a single region B_{XY} ⊆ R^2, find the collection of sample space elements
S_{XY} = {s ∈ S | (X(s), Y(s)) ∈ B_{XY}}
and define
P_{XY}(B_{XY}) = P(S_{XY}).
8.0.1 Joint Cumulative Distribution Function
We define the joint cumulative distribution function as follows:
Definition 8.0.2. Given a pair of random variables, X and Y, the joint cumulative distribution
function is defined as
F_{XY}(x, y) = P_{XY}(X ≤ x, Y ≤ y),    x, y ∈ R.
It is easy to check that the marginal cdfs for X and Y can be recovered by
F_X(x) = F_{XY}(x, ∞),    x ∈ R,
F_Y(y) = F_{XY}(∞, y),    y ∈ R,
and that the two definitions will agree.
8.0.2 Properties of Joint CDF FXY
For FXY to be a valid cdf, we need to make sure the following conditions hold.
1. 0 ≤ F_{XY}(x, y) ≤ 1, ∀ x, y ∈ R;
2. Monotonicity: ∀ x_1, x_2, y_1, y_2 ∈ R,
x_1 < x_2 ⇒ F_{XY}(x_1, y_1) ≤ F_{XY}(x_2, y_1) and y_1 < y_2 ⇒ F_{XY}(x_1, y_1) ≤ F_{XY}(x_1, y_2);
3. ∀ x, y ∈ R,
F_{XY}(x, −∞) = 0, F_{XY}(−∞, y) = 0 and F_{XY}(∞, ∞) = 1.
Suppose we are interested in whether the random variable pair (X, Y) lies in the interval cross product (x_1, x_2] × (y_1, y_2]; that is, whether x_1 < X ≤ x_2 and y_1 < Y ≤ y_2.
[Figure: the rectangle (x_1, x_2] × (y_1, y_2] in the (X, Y) plane.]
First note that P_{XY}(x_1 < X ≤ x_2, Y ≤ y) = F_{XY}(x_2, y) − F_{XY}(x_1, y). It is then easy to see that P_{XY}(x_1 < X ≤ x_2, y_1 < Y ≤ y_2) is given by
F_{XY}(x_2, y_2) − F_{XY}(x_1, y_2) − F_{XY}(x_2, y_1) + F_{XY}(x_1, y_1).
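As a quick numerical illustration (a sketch, not part of the original notes), assume X and Y are independent Exponential(1) random variables, so that F_{XY}(x, y) = (1 − e^{−x})(1 − e^{−y}) for x, y > 0; the rectangle below is an arbitrary choice. The Python snippet evaluates the four-term cdf expression and compares it with a Monte Carlo estimate.

import numpy as np

# Assumed joint cdf: X, Y independent Exponential(1), so F_XY(x, y) = F_X(x) * F_Y(y)
def F_XY(x, y):
    return (1.0 - np.exp(-x)) * (1.0 - np.exp(-y))

x1, x2, y1, y2 = 0.5, 1.5, 0.2, 1.0  # arbitrary rectangle (x1, x2] x (y1, y2]

# Four-term expression from the notes
p_cdf = F_XY(x2, y2) - F_XY(x1, y2) - F_XY(x2, y1) + F_XY(x1, y1)

# Monte Carlo check using simulated pairs
rng = np.random.default_rng(0)
X = rng.exponential(size=1_000_000)
Y = rng.exponential(size=1_000_000)
p_mc = np.mean((X > x1) & (X <= x2) & (Y > y1) & (Y <= y2))

print(p_cdf, p_mc)  # the two values should agree to about three decimal places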
8.0.3 Joint Probability Mass Functions
Definition 8.0.3. If X and Y are both discrete random variables, then we can define the joint
probability mass function as
f_{XY}(x, y) = P_{XY}(X = x, Y = y),    x, y ∈ R.
We can recover the marginal pmfs f_X and f_Y since, by the law of total probability, ∀ x, y ∈ R,
f_X(x) = ∑_y f_{XY}(x, y),    f_Y(y) = ∑_x f_{XY}(x, y).
Properties of Joint PMFs
For f XY to be a valid pmf, we need to make sure the following conditions hold.
1. 0 ≤ f_{XY}(x, y) ≤ 1, ∀ x, y ∈ R;
2. ∑_y ∑_x f_{XY}(x, y) = 1.
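As a small illustration (a sketch using made-up probabilities, not values from the notes), the array below plays the role of f_{XY} for X ∈ {0, 1} and Y ∈ {0, 1, 2}; the code checks the two conditions above and recovers the marginal pmfs by summing over the other variable.

import numpy as np

# Hypothetical joint pmf f_XY(x, y): rows index x in {0, 1}, columns index y in {0, 1, 2}
f_XY = np.array([[0.10, 0.20, 0.10],
                 [0.25, 0.15, 0.20]])

# Validity: every entry lies in [0, 1] and the double sum equals 1
assert np.all((f_XY >= 0) & (f_XY <= 1))
assert np.isclose(f_XY.sum(), 1.0)

# Marginal pmfs by the law of total probability
f_X = f_XY.sum(axis=1)  # sum over y for each x -> [0.40, 0.60]
f_Y = f_XY.sum(axis=0)  # sum over x for each y -> [0.35, 0.35, 0.30]
print(f_X, f_Y)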
8.0.4 Joint Probability Density Functions
On the other hand, if ∃ f_{XY} : R × R → R s.t.
P_{XY}(B_{XY}) = ∫∫_{(x,y) ∈ B_{XY}} f_{XY}(x, y) dx dy,    B_{XY} ⊆ R × R,
then we say X and Y are jointly continuous and we refer to f XY as the joint probability
density function of X and Y.
In this case, we have
F_{XY}(x, y) = ∫_{t=−∞}^{y} ∫_{s=−∞}^{x} f_{XY}(s, t) ds dt,    x, y ∈ R.
Definition 8.0.4. By the Fundamental Theorem of Calculus we can identify the joint pdf as
f_{XY}(x, y) = ∂^2/∂x∂y F_{XY}(x, y).
Furthermore, we can recover the marginal densities f_X and f_Y:
f_X(x) = d/dx F_X(x) = d/dx F_{XY}(x, ∞) = d/dx ∫_{y=−∞}^{∞} ∫_{s=−∞}^{x} f_{XY}(s, y) ds dy.
By the Fundamental Theorem of Calculus, and through a symmetric argument for Y, we thus get
f_X(x) = ∫_{y=−∞}^{∞} f_{XY}(x, y) dy,    f_Y(y) = ∫_{x=−∞}^{∞} f_{XY}(x, y) dx.
Properties of Joint PDFs
For f XY to be a valid pdf, we need to make sure the following conditions hold.
1. f_{XY}(x, y) ≥ 0, ∀ x, y ∈ R;
2. ∫_{y=−∞}^{∞} ∫_{x=−∞}^{∞} f_{XY}(x, y) dx dy = 1.
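As a numerical sketch (the density f(x, y) = x + y on the unit square is an assumed example, not one from the notes), a crude midpoint-rule approximation can check the normalisation condition and recover a marginal density by integrating out y.

import numpy as np

# Assumed example density: f(x, y) = x + y on the unit square, and 0 elsewhere
f = lambda x, y: x + y

# Midpoint grid on [0, 1]^2 for a crude numerical double integral
n = 2000
u = (np.arange(n) + 0.5) / n
X, Y = np.meshgrid(u, u, indexing="ij")

print(np.sum(f(X, Y)) / n**2)  # ~1.0: the joint pdf integrates to one

# Marginal density of X at x = 0.3: integrate f(0.3, y) over y; exact value is 0.3 + 0.5 = 0.8
print(np.sum(f(0.3, u)) / n)   # ~0.8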
8.1 Independence and Expectation
8.1.1 Independence
Two random variables X and Y are independent if and only if, ∀ B_X, B_Y ⊆ R,
P_{XY}(B_X, B_Y) = P_X(B_X) P_Y(B_Y).
More specifically,
Definition 8.1.1. Two random variables X and Y are independent if and only if
f_{XY}(x, y) = f_X(x) f_Y(y),    ∀ x, y ∈ R.
Definition 8.1.2. For two random variables X, Y we define the conditional probability
distribution PY |X by
P_{Y|X}(B_Y | B_X) = P_{XY}(B_X, B_Y) / P_X(B_X),    B_X, B_Y ⊆ R.
This is the revised probability of Y falling inside BY given that we now know X ∈ BX .
Then X and Y are independent ⇐⇒ P_{Y|X}(B_Y | B_X) = P_Y(B_Y), ∀ B_X, B_Y ⊆ R.
Definition 8.1.3. For random variables X, Y we define the conditional probability density
function f Y |X by
f_{Y|X}(y | x) = f_{XY}(x, y) / f_X(x),    x, y ∈ R.
Note: The random variables X and Y are independent ⇐⇒ f_{Y|X}(y | x) = f_Y(y), ∀ x, y ∈ R.
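A brief discrete sketch of these two criteria, reusing the made-up joint pmf from the earlier pmf example: the conditional pmf f_{Y|X}(y | x) is obtained by dividing each row of the joint pmf by the corresponding marginal f_X(x), and independence is tested by comparing the joint pmf with the outer product of its marginals.

import numpy as np

# Made-up joint pmf: rows index x, columns index y
f_XY = np.array([[0.10, 0.20, 0.10],
                 [0.25, 0.15, 0.20]])
f_X = f_XY.sum(axis=1)
f_Y = f_XY.sum(axis=0)

# Conditional pmf f_{Y|X}(y | x): divide row x by f_X(x); each row then sums to 1
f_Y_given_X = f_XY / f_X[:, None]
print(f_Y_given_X)

# Independence holds iff f_XY(x, y) = f_X(x) f_Y(y) at every support point
print(np.allclose(f_XY, np.outer(f_X, f_Y)))  # False for this particular pmf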
8.1.2 Expectation
Suppose we have a (measurable) bivariate function of interest of the random variables X and
Y, g : R × R → R.
Definition 8.1.4. If X and Y are discrete, we define E{g(X, Y)} by
E_{XY}{g(X, Y)} = ∑_y ∑_x g(x, y) f_{XY}(x, y).
Definition 8.1.5. If X and Y are jointly continuous, we define E{g(X, Y)} by
E_{XY}{g(X, Y)} = ∫_{y=−∞}^{∞} ∫_{x=−∞}^{∞} g(x, y) f_{XY}(x, y) dx dy.
Immediately from these definitions we have the following:
• If g(X, Y) = g_1(X) + g_2(Y),
E_{XY}{g_1(X) + g_2(Y)} = E_X{g_1(X)} + E_Y{g_2(Y)}.
• If g(X, Y) = g_1(X) g_2(Y) and X and Y are independent,
E_{XY}{g_1(X) g_2(Y)} = E_X{g_1(X)} E_Y{g_2(Y)}.
In particular, considering g(X, Y) = XY for independent X, Y we have
E_{XY}(XY) = E_X(X) E_Y(Y).
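To illustrate the double-sum definition and the product rule (a sketch with assumed marginal pmfs for independent X and Y), the code below evaluates E_{XY}(XY) directly and compares it with E_X(X) E_Y(Y).

import numpy as np

x_vals = np.array([0.0, 1.0])
y_vals = np.array([0.0, 1.0, 2.0])

# Assumed marginal pmfs; independence means the joint pmf is their outer product
p_X = np.array([0.4, 0.6])
p_Y = np.array([0.5, 0.3, 0.2])
f_XY = np.outer(p_X, p_Y)

# E[g(X, Y)] with g(x, y) = x * y, as a double sum over the joint pmf
g = np.outer(x_vals, y_vals)
E_XY = np.sum(g * f_XY)

E_X = np.sum(x_vals * p_X)
E_Y = np.sum(y_vals * p_Y)
print(E_XY, E_X * E_Y)  # both 0.42 because X and Y are independent here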
Warning! In general E_{XY}(XY) ≠ E_X(X) E_Y(Y).
8.1.3 Conditional Expectation
Suppose X and Y are discrete random variables with joint pmf f_{XY}(x, y). If we are given the value x of the random variable X, our revised pmf for Y is the conditional pmf f(y | x), for y ∈ R.
Definition 8.1.6. The conditional expectation of Y given X = x is therefore
E_{Y|X}(Y | X = x) = ∑_y y f(y | x).
Similarly,
Definition 8.1.7. If X and Y are jointly continuous,
E_{Y|X}(Y | X = x) = ∫_{y=−∞}^{∞} y f(y | x) dy.
In either case, the conditional expectation is a function of x but not the unknown Y.
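A small sketch of the discrete case, reusing the made-up joint pmf from above: E_{Y|X}(Y | X = x) is the mean of y under the conditional pmf f(y | x), giving one number for each value of x.

import numpy as np

y_vals = np.array([0.0, 1.0, 2.0])
f_XY = np.array([[0.10, 0.20, 0.10],   # made-up joint pmf; rows index x, columns index y
                 [0.25, 0.15, 0.20]])
f_X = f_XY.sum(axis=1)

# E[Y | X = x] = sum_y y * f(y | x): a function of x only, not of the unknown Y
E_Y_given_X = (f_XY / f_X[:, None]) @ y_vals
print(E_Y_given_X)  # conditional mean of Y for x = 0 and x = 1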
For a single variable X we considered the expectation of g(X) = (X − µ_X)(X − µ_X), called the variance and denoted σ_X^2.
The bivariate extension of this is the expectation of g(X, Y) = (X − µ_X)(Y − µ_Y). We define the covariance of X and Y by
σ_{XY} = Cov(X, Y) = E_{XY}[(X − µ_X)(Y − µ_Y)].
Covariance measures how the random variables move in tandem with one another, and so is
closely related to the idea of correlation.
Definition 8.1.8. We define the correlation of X and Y by
ρ_{XY} = Cor(X, Y) = σ_{XY} / (σ_X σ_Y).
Unlike the covariance, the correlation is invariant to the scale of the random variables X
and Y.
It is easily shown that if X and Y are independent random variables, then σ_{XY} = ρ_{XY} = 0.
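As a sketch, both quantities can be computed from the same made-up joint pmf via the double-sum definition of expectation; for an independent pair the printed values would both be zero.

import numpy as np

x_vals = np.array([0.0, 1.0])
y_vals = np.array([0.0, 1.0, 2.0])
f_XY = np.array([[0.10, 0.20, 0.10],   # made-up joint pmf
                 [0.25, 0.15, 0.20]])
f_X, f_Y = f_XY.sum(axis=1), f_XY.sum(axis=0)

mu_X = np.sum(x_vals * f_X)
mu_Y = np.sum(y_vals * f_Y)

# Covariance: E[(X - mu_X)(Y - mu_Y)] as a double sum over the joint pmf
cov = np.sum(np.outer(x_vals - mu_X, y_vals - mu_Y) * f_XY)

# Correlation: covariance divided by the product of the standard deviations
sd_X = np.sqrt(np.sum((x_vals - mu_X) ** 2 * f_X))
sd_Y = np.sqrt(np.sum((y_vals - mu_Y) ** 2 * f_Y))
print(cov, cov / (sd_X * sd_Y))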
8.2 Examples
Example Suppose that the lifetime, X, and brightness, Y, of a light bulb are modelled as continuous random variables. Let their joint pdf be given by
f(x, y) = λ_1 λ_2 e^{−λ_1 x − λ_2 y},    x, y > 0.
Question Are lifetime and brightness independent?
Solution The joint pdf factorises as f(x, y) = {λ_1 e^{−λ_1 x}} × {λ_2 e^{−λ_2 y}} for x, y > 0. Integrating out y gives the marginal f_X(x) = λ_1 e^{−λ_1 x}, and similarly f_Y(y) = λ_2 e^{−λ_2 y}, so f(x, y) = f_X(x) f_Y(y) for all x, y, and hence lifetime and brightness are independent.
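A quick numerical sanity check of this factorisation (a sketch; the rate values λ_1 = 2 and λ_2 = 0.5 are arbitrary choices):

import numpy as np

lam1, lam2 = 2.0, 0.5  # arbitrary rates for the check

f_joint = lambda x, y: lam1 * lam2 * np.exp(-lam1 * x - lam2 * y)
f_X = lambda x: lam1 * np.exp(-lam1 * x)   # Exponential(lam1) marginal
f_Y = lambda y: lam2 * np.exp(-lam2 * y)   # Exponential(lam2) marginal

# The joint pdf equals the product of the marginals over a grid of points
xs, ys = np.meshgrid(np.linspace(0.1, 5.0, 50), np.linspace(0.1, 5.0, 50))
print(np.allclose(f_joint(xs, ys), f_X(xs) * f_Y(ys)))  # True: X and Y are independent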
Example Suppose continuous random variables (X, Y) ∈ R^2 have joint pdf
f(x, y) = 1 if |x| + |y| < 1/√2, and f(x, y) = 0 otherwise.
Question Determine the marginal pdfs for X and Y.
Solution The support of (X, Y) is the square with vertices (±1/√2, 0) and (0, ±1/√2). For |x| < 1/√2,
f_X(x) = ∫_{−∞}^{∞} f(x, y) dy = ∫_{|y| < 1/√2 − |x|} 1 dy = √2 − 2|x|,
and f_X(x) = 0 otherwise. By the symmetry of the region, f_Y(y) = √2 − 2|y| for |y| < 1/√2 and 0 otherwise.
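A numerical sketch confirming the triangular marginal: integrate the joint pdf over y on a fine grid for a few (arbitrary) values of x and compare with √2 − 2|x|.

import numpy as np

c = 1.0 / np.sqrt(2.0)
f = lambda x, y: np.where(np.abs(x) + np.abs(y) < c, 1.0, 0.0)  # joint pdf from the example

ys = np.linspace(-c, c, 200001)  # fine grid covering the support in y
dy = ys[1] - ys[0]

for x in (0.0, 0.3, 0.6):                       # arbitrary test points
    f_X_numeric = np.sum(f(x, ys)) * dy         # Riemann-sum approximation of the y-integral
    f_X_exact = max(np.sqrt(2.0) - 2.0 * abs(x), 0.0)
    print(x, round(float(f_X_numeric), 4), round(f_X_exact, 4))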