Jointly Distributed Random Variables
COMP 245 STATISTICS
Dr N A Heard
Contents

1 Jointly Distributed Random Variables
  1.1 Definition
  1.2 Joint cdfs
  1.3 Joint Probability Mass Functions
  1.4 Joint Probability Density Functions

2 Independence and Expectation
  2.1 Independence
  2.2 Expectation
  2.3 Conditional Expectation
  2.4 Covariance and Correlation

3 Examples
1 Jointly Distributed Random Variables

1.1 Definition
Joint Probability Distribution
Suppose we have two random variables X and Y defined on a sample space S with probability measure P(E), E ⊆ S. Jointly, they form a map (X, Y) : S → R^2, where s ↦ (X(s), Y(s)).
From before, we know to define the marginal probability distributions P_X and P_Y by, for example,

P_X(B) = P(X^{-1}(B)),   B ⊆ R.

We now define the joint probability distribution P_XY by

P_XY(B_X, B_Y) = P{X^{-1}(B_X) ∩ Y^{-1}(B_Y)},   B_X, B_Y ⊆ R.

So P_XY(B_X, B_Y), the probability that X ∈ B_X and Y ∈ B_Y, is given by the probability P of the set of all points in the sample space that get mapped both into B_X by X and into B_Y by Y.
More generally, for a single region B_XY ⊆ R^2, find the collection of sample space elements

S_{B_XY} = {s ∈ S | (X(s), Y(s)) ∈ B_XY}

and define

P_XY(B_XY) = P(S_{B_XY}).
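As a concrete illustration of these definitions, here is a minimal sketch (a hypothetical example, not from the notes) in which the sample space is the 36 outcomes of two fair dice, X is the value of the first die and Y is the sum of the two dice; the joint probability P_XY(B_X, B_Y) is computed by summing P over the sample points mapped into B_X by X and into B_Y by Y.

from itertools import product

# Sample space for two fair dice; each outcome has probability 1/36.
S = list(product(range(1, 7), range(1, 7)))
P = {s: 1 / 36 for s in S}

X = lambda s: s[0]           # X(s): value of the first die
Y = lambda s: s[0] + s[1]    # Y(s): sum of the two dice

def P_XY(B_X, B_Y):
    """P_XY(B_X, B_Y) = P{ X^{-1}(B_X) ∩ Y^{-1}(B_Y) }."""
    return sum(P[s] for s in S if X(s) in B_X and Y(s) in B_Y)

print(P_XY({1, 2, 3}, {4, 5}))     # 6/36: first die at most 3 and sum 4 or 5
print(P_XY({1, 2}, {10, 11, 12}))  # 0.0: impossible, since 2 + 6 < 10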
1.2 Joint cdfs
Joint Cumulative Distribution Function
We thus define the joint cumulative distribution function as

F_XY(x, y) = P_XY((−∞, x], (−∞, y]),   x, y ∈ R.

It is easy to check that the marginal cdfs for X and Y can be recovered by

F_X(x) = F_XY(x, ∞),   x ∈ R,
F_Y(y) = F_XY(∞, y),   y ∈ R,

and that the two definitions will agree.
Properties of a Joint cdf
For F_XY to be a valid cdf, we need to make sure the following conditions hold.
1. 0 ≤ F_XY(x, y) ≤ 1, ∀ x, y ∈ R;
2. Monotonicity: ∀ x_1, x_2, y_1, y_2 ∈ R,
   x_1 < x_2 ⇒ F_XY(x_1, y_1) ≤ F_XY(x_2, y_1) and y_1 < y_2 ⇒ F_XY(x_1, y_1) ≤ F_XY(x_1, y_2);
3. ∀ x, y ∈ R,
   F_XY(x, −∞) = 0, F_XY(−∞, y) = 0 and F_XY(∞, ∞) = 1.
Interval Probabilities
Suppose we are interested in whether the random variable pair (X, Y) lies in the interval cross product (x_1, x_2] × (y_1, y_2]; that is, whether x_1 < X ≤ x_2 and y_1 < Y ≤ y_2.

[Figure: the rectangle (x_1, x_2] × (y_1, y_2] in the (X, Y) plane.]
First note that P_XY((x_1, x_2], (−∞, y]) = F_XY(x_2, y) − F_XY(x_1, y).
It is then easy to see that P_XY((x_1, x_2], (y_1, y_2]) is given by

F_XY(x_2, y_2) − F_XY(x_1, y_2) − F_XY(x_2, y_1) + F_XY(x_1, y_1).
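This inclusion-exclusion identity is easy to check numerically. The sketch below is a minimal illustration assuming two independent Exponential(1) variables (chosen only because their joint cdf has the simple closed form F_XY(x, y) = (1 − e^{−x})(1 − e^{−y})); it compares the rectangle probability from the identity with a Monte Carlo estimate.

import numpy as np

# Assumed joint cdf of two independent Exponential(1) random variables.
def F_XY(x, y):
    return (1 - np.exp(-x)) * (1 - np.exp(-y))

x1, x2, y1, y2 = 0.5, 2.0, 1.0, 3.0

# Rectangle probability via the joint-cdf identity.
p_rect = F_XY(x2, y2) - F_XY(x1, y2) - F_XY(x2, y1) + F_XY(x1, y1)

# Monte Carlo check by direct simulation.
rng = np.random.default_rng(0)
X = rng.exponential(size=1_000_000)
Y = rng.exponential(size=1_000_000)
p_mc = np.mean((X > x1) & (X <= x2) & (Y > y1) & (Y <= y2))

print(p_rect, p_mc)   # the two values should agree to about three decimals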
1.3 Joint Probability Mass Functions
If X and Y are both discrete random variables, then we can define the joint probability mass function as

p_XY(x, y) = P_XY({x}, {y}),   x, y ∈ R.

We can recover the marginal pmfs p_X and p_Y since, by the law of total probability, ∀ x, y ∈ R,

p_X(x) = ∑_y p_XY(x, y),   p_Y(y) = ∑_x p_XY(x, y).
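The marginalisation step is just a row or column sum when the joint pmf is stored as a table. A minimal sketch with hypothetical numbers:

import numpy as np

# Hypothetical joint pmf p_XY(x, y), rows x in {0, 1, 2}, columns y in {0, 1}.
p_XY = np.array([[0.10, 0.20],
                 [0.25, 0.15],
                 [0.05, 0.25]])
assert np.isclose(p_XY.sum(), 1.0)   # a valid joint pmf sums to 1

p_X = p_XY.sum(axis=1)   # sum over y gives the marginal pmf of X
p_Y = p_XY.sum(axis=0)   # sum over x gives the marginal pmf of Y
print(p_X)               # [0.3 0.4 0.3]
print(p_Y)               # [0.4 0.6]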
Properties of a Joint pmf
For p_XY to be a valid pmf, we need to make sure the following conditions hold.
1. 0 ≤ p_XY(x, y) ≤ 1, ∀ x, y ∈ R;
2. ∑_y ∑_x p_XY(x, y) = 1.

1.4 Joint Probability Density Functions
On the other hand, if ∃ f_XY : R × R → R s.t.

P_XY(B_XY) = ∫∫_{(x,y) ∈ B_XY} f_XY(x, y) dx dy,   B_XY ⊆ R × R,

then we say X and Y are jointly continuous and we refer to f_XY as the joint probability density function of X and Y.
In this case, we have

F_XY(x, y) = ∫_{t=−∞}^{y} ∫_{s=−∞}^{x} f_XY(s, t) ds dt,   x, y ∈ R.

By the Fundamental Theorem of Calculus we can identify the joint pdf as

f_XY(x, y) = ∂^2 F_XY(x, y) / ∂x ∂y.
Furthermore, we can recover the marginal densities f_X and f_Y:

f_X(x) = d/dx F_X(x) = d/dx F_XY(x, ∞) = d/dx ∫_{y=−∞}^{∞} ∫_{s=−∞}^{x} f_XY(s, y) ds dy.

By the Fundamental Theorem of Calculus, and through a symmetric argument for Y, we thus get

f_X(x) = ∫_{y=−∞}^{∞} f_XY(x, y) dy,   f_Y(y) = ∫_{x=−∞}^{∞} f_XY(x, y) dx.
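The marginalisation formula can be checked numerically by integrating out one variable on a grid. A minimal sketch, assuming the joint pdf f_XY(x, y) = x + y on the unit square (not one of the lecture examples), whose exact marginal is f_X(x) = x + 1/2:

import numpy as np

# Assumed joint pdf: f_XY(x, y) = x + y on [0, 1] x [0, 1], zero elsewhere.
def f_XY(x, y):
    return x + y

y = np.linspace(0.0, 1.0, 10_001)
dy = y[1] - y[0]

for x in (0.25, 0.50, 0.75):
    f_X_numerical = np.sum(f_XY(x, y)) * dy   # crude Riemann sum for the integral over y
    print(x, f_X_numerical, x + 0.5)          # numerical value vs exact marginal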
Properties of a Joint pdf
For f_XY to be a valid pdf, we need to make sure the following conditions hold.
1. f_XY(x, y) ≥ 0, ∀ x, y ∈ R;
2. ∫_{y=−∞}^{∞} ∫_{x=−∞}^{∞} f_XY(x, y) dx dy = 1.
2 Independence and Expectation

2.1 Independence
Independence of Random Variables
Two random variables X and Y are independent iff, ∀ B_X, B_Y ⊆ R,

P_XY(B_X, B_Y) = P_X(B_X) P_Y(B_Y).

More specifically, two discrete random variables X and Y are independent iff

p_XY(x, y) = p_X(x) p_Y(y),   ∀ x, y ∈ R;

and two continuous random variables X and Y are independent iff

f_XY(x, y) = f_X(x) f_Y(y),   ∀ x, y ∈ R.
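For discrete variables stored as a pmf table, the factorisation condition can be checked by comparing the table with the outer product of its marginals. A minimal sketch with a hypothetical table that happens to factorise:

import numpy as np

# Hypothetical joint pmf (rows: values of x, columns: values of y).
p_XY = np.array([[0.12, 0.18, 0.30],
                 [0.08, 0.12, 0.20]])

p_X = p_XY.sum(axis=1)   # marginal pmf of X
p_Y = p_XY.sum(axis=0)   # marginal pmf of Y

# Independence <=> p_XY(x, y) = p_X(x) p_Y(y) for every (x, y) cell.
print(np.allclose(p_XY, np.outer(p_X, p_Y)))   # True for this table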
Conditional Distributions
For two r.v.s X, Y we define the conditional probability distribution P_{Y|X} by

P_{Y|X}(B_Y | B_X) = P_XY(B_X, B_Y) / P_X(B_X),   B_X, B_Y ⊆ R, P_X(B_X) ≠ 0.

This is the revised probability of Y falling inside B_Y given that we now know X ∈ B_X.
Then we have that X and Y are independent ⇐⇒ P_{Y|X}(B_Y | B_X) = P_Y(B_Y), ∀ B_X, B_Y ⊆ R.
For discrete r.v.s X, Y we define the conditional probability mass function p_{Y|X} by

p_{Y|X}(y | x) = p_XY(x, y) / p_X(x),   x, y ∈ R, p_X(x) ≠ 0,

and for continuous r.v.s X, Y we define the conditional probability density function f_{Y|X} by

f_{Y|X}(y | x) = f_XY(x, y) / f_X(x),   x, y ∈ R, f_X(x) ≠ 0.
[Justification:

P(Y ≤ y | X ∈ [x, x + dx)) = P_XY([x, x + dx), (−∞, y]) / P_X([x, x + dx))
                           = {F(y, x + dx) − F(y, x)} / {F(x + dx) − F(x)}
                           = [{F(y, x + dx) − F(y, x)}/dx] / [{F(x + dx) − F(x)}/dx]

⇒ f(y | X ∈ [x, x + dx)) = [d{F(y, x + dx) − F(y, x)}/(dx dy)] / [{F(x + dx) − F(x)}/dx]

⇒ f(y | x) = lim_{dx→0} f(y | X ∈ [x, x + dx)) = f(y, x) / f(x).]

In either case, X and Y are independent ⇐⇒ p_{Y|X}(y | x) = p_Y(y) or f_{Y|X}(y | x) = f_Y(y), ∀ x, y ∈ R.
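For a discrete joint pmf stored as a table, conditioning on X = x simply normalises the corresponding row. A minimal sketch with hypothetical numbers:

import numpy as np

# Hypothetical joint pmf (rows: x = 0, 1; columns: y = 0, 1, 2).
p_XY = np.array([[0.10, 0.20, 0.10],
                 [0.30, 0.10, 0.20]])

p_X = p_XY.sum(axis=1)

# Conditional pmf p_{Y|X}(y | x) = p_XY(x, y) / p_X(x): normalise each row.
p_Y_given_X = p_XY / p_X[:, None]
print(p_Y_given_X)                # row x is the revised pmf for Y given X = x
print(p_Y_given_X.sum(axis=1))    # each row sums to 1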
2.2 Expectation
E{g(X, Y)}
Suppose we have a bivariate function of interest of the random variables X and Y, g : R × R → R.
If X and Y are discrete, we define E{g(X, Y)} by

E_XY{g(X, Y)} = ∑_y ∑_x g(x, y) p_XY(x, y).

If X and Y are jointly continuous, we define E{g(X, Y)} by

E_XY{g(X, Y)} = ∫_{y=−∞}^{∞} ∫_{x=−∞}^{∞} g(x, y) f_XY(x, y) dx dy.
Immediately from these definitions we have the following:
• If g(X, Y) = g_1(X) + g_2(Y),

  E_XY{g_1(X) + g_2(Y)} = E_X{g_1(X)} + E_Y{g_2(Y)}.

• If g(X, Y) = g_1(X) g_2(Y) and X and Y are independent,

  E_XY{g_1(X) g_2(Y)} = E_X{g_1(X)} E_Y{g_2(Y)}.

In particular, considering g(X, Y) = XY for independent X, Y we have E_XY(XY) = E_X(X) E_Y(Y).
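In the discrete case, E{g(X, Y)} is a double sum that can be evaluated directly over the joint pmf table. A minimal sketch with hypothetical numbers, which also confirms the additivity property for g_1(X) = X and g_2(Y) = Y:

import numpy as np

# Hypothetical joint pmf, rows x in {0, 1, 2}, columns y in {0, 1}.
p_XY = np.array([[0.10, 0.20],
                 [0.25, 0.15],
                 [0.05, 0.25]])
x_vals, y_vals = np.array([0, 1, 2]), np.array([0, 1])
X, Y = np.meshgrid(x_vals, y_vals, indexing="ij")   # grids matching p_XY's shape

# E{g(X, Y)} = sum_y sum_x g(x, y) p_XY(x, y), here with g(x, y) = (x - y)^2.
E_g = np.sum((X - Y) ** 2 * p_XY)

# Additivity check: E(X + Y) = E(X) + E(Y).
E_X, E_Y = np.sum(X * p_XY), np.sum(Y * p_XY)
E_sum = np.sum((X + Y) * p_XY)
print(E_g)
print(E_sum, E_X + E_Y)   # these two values are equal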
2.3 Conditional Expectation
E_{Y|X}(Y | X = x)
In general E_XY(XY) ≠ E_X(X) E_Y(Y).
Suppose X and Y are discrete r.v.s with joint pmf p(x, y). If we are given the value x of the r.v. X, our revised pmf for Y is the conditional pmf p(y | x), for y ∈ R.
The conditional expectation of Y given X = x is therefore

E_{Y|X}(Y | X = x) = ∑_y y p(y | x).

Similarly, if X and Y were continuous,

E_{Y|X}(Y | X = x) = ∫_{y=−∞}^{∞} y f(y | x) dy.

In either case, the conditional expectation is a function of x but not the unknown Y.
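Continuing the discrete sketch from above, E(Y | X = x) is the mean of the normalised row of the joint pmf table, giving one number for each value of x (hypothetical numbers again):

import numpy as np

# Hypothetical joint pmf (rows: x = 0, 1; columns: y = 0, 1, 2).
p_XY = np.array([[0.10, 0.20, 0.10],
                 [0.30, 0.10, 0.20]])
y_vals = np.array([0, 1, 2])

p_Y_given_X = p_XY / p_XY.sum(axis=1, keepdims=True)   # conditional pmf p(y | x)

# E(Y | X = x) = sum_y y p(y | x); one value for each x.
E_Y_given_X = p_Y_given_X @ y_vals
print(E_Y_given_X)   # a function of x only, e.g. [1.0, 0.833...]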
2.4 Covariance and Correlation
Covariance
For a single variable X we considered the expectation of g(X) = (X − μ_X)(X − μ_X), called the variance and denoted σ_X^2.
The bivariate extension of this is the expectation of g(X, Y) = (X − μ_X)(Y − μ_Y). We define the covariance of X and Y by

σ_XY = Cov(X, Y) = E_XY[(X − μ_X)(Y − μ_Y)].
Correlation
Covariance measures how the random variables move in tandem with one another, and so is closely related to the idea of correlation.
We define the correlation of X and Y by

ρ_XY = Cor(X, Y) = σ_XY / (σ_X σ_Y).

Unlike the covariance, the correlation is invariant to the scale of the r.v.s X and Y.
It is easily shown that if X and Y are independent random variables, then σ_XY = ρ_XY = 0.
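Both quantities can be computed directly from a discrete joint pmf table. A minimal sketch with the same hypothetical table as above; rescaling X (say to 10X) would multiply σ_XY by 10 but leave ρ_XY unchanged:

import numpy as np

# Hypothetical joint pmf, rows x in {0, 1, 2}, columns y in {0, 1}.
p_XY = np.array([[0.10, 0.20],
                 [0.25, 0.15],
                 [0.05, 0.25]])
x_vals, y_vals = np.array([0, 1, 2]), np.array([0, 1])
X, Y = np.meshgrid(x_vals, y_vals, indexing="ij")

mu_X, mu_Y = np.sum(X * p_XY), np.sum(Y * p_XY)
cov_XY = np.sum((X - mu_X) * (Y - mu_Y) * p_XY)      # sigma_XY
sigma_X = np.sqrt(np.sum((X - mu_X) ** 2 * p_XY))
sigma_Y = np.sqrt(np.sum((Y - mu_Y) ** 2 * p_XY))
rho_XY = cov_XY / (sigma_X * sigma_Y)                # correlation

print(cov_XY, rho_XY)   # nonzero here, so X and Y are not independent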
3 Examples
Example 1
Suppose that the lifetime, X, and brightness, Y, of a light bulb are modelled as continuous random variables. Let their joint pdf be given by

f(x, y) = λ_1 λ_2 e^{−λ_1 x − λ_2 y},   x, y > 0.
Are lifetime and brightness independent?
The marginal pdf for X is

f(x) = ∫_{y=−∞}^{∞} f(x, y) dy = ∫_{y=0}^{∞} λ_1 λ_2 e^{−λ_1 x − λ_2 y} dy = λ_1 e^{−λ_1 x}.

Similarly f(y) = λ_2 e^{−λ_2 y}. Hence f(x, y) = f(x) f(y) and X and Y are independent.
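This factorisation can also be checked numerically for particular rates. A minimal sketch, assuming λ_1 = 0.5 and λ_2 = 2.0 and truncating the integral at y = 50, where the integrand is negligible:

import numpy as np

lam1, lam2 = 0.5, 2.0   # assumed rates, for illustration only

def f_XY(x, y):
    return lam1 * lam2 * np.exp(-lam1 * x - lam2 * y)

# Integrate out y on a grid to recover the marginal f(x).
y = np.linspace(0.0, 50.0, 200_001)
dy = y[1] - y[0]
for x in (0.5, 1.0, 2.0):
    f_X_numerical = np.sum(f_XY(x, y)) * dy
    print(f_X_numerical, lam1 * np.exp(-lam1 * x))   # matches lambda_1 e^{-lambda_1 x}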
Example 2
Suppose continuous r.v.s (X, Y) ∈ R^2 have joint pdf

f(x, y) = 1/π,  x^2 + y^2 ≤ 1,
          0,    otherwise.
Determine the marginal pdfs for X and Y.
Well x^2 + y^2 ≤ 1 ⇐⇒ |y| ≤ √(1 − x^2). So

f(x) = ∫_{y=−∞}^{∞} f(x, y) dy = ∫_{y=−√(1−x^2)}^{√(1−x^2)} (1/π) dy = (2/π) √(1 − x^2).

Similarly f(y) = (2/π) √(1 − y^2). Hence f(x, y) ≠ f(x) f(y) and X and Y are not independent.
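A quick numerical check of the semicircle marginal, and of the failure of independence at a point outside the disc but inside the product of the marginal supports (a sketch, with the grid and test points chosen arbitrarily):

import numpy as np

# Joint pdf: 1/pi on the unit disc, 0 outside.
def f_XY(x, y):
    return np.where(x**2 + y**2 <= 1.0, 1.0 / np.pi, 0.0)

y = np.linspace(-1.0, 1.0, 200_001)
dy = y[1] - y[0]

x = 0.3
f_X_numerical = np.sum(f_XY(x, y)) * dy
f_X_exact = (2.0 / np.pi) * np.sqrt(1.0 - x**2)
print(f_X_numerical, f_X_exact)                 # the semicircle marginal

# Non-independence: at (0.8, 0.8) the joint pdf is 0 (the point lies outside
# the disc) but the product of the marginal pdfs is strictly positive.
f_X = lambda t: (2.0 / np.pi) * np.sqrt(max(1.0 - t**2, 0.0))
print(f_XY(0.8, 0.8), f_X(0.8) * f_X(0.8))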