Probability
STAT 416
Spring 2007
4 Jointly distributed random variables
1. Introduction
2. Independent Random Variables
3. Transformations
4. Covariance, Correlation
5. Conditional Distributions
6. Bivariate Normal Distribution
4.1 Introduction
Computation of probabilities with more than one random variable
two random variables . . . bivariate
two or more random variables. . . multivariate
Concepts:
• Joint distribution function
• purely discrete: Joint probability function
• purely continuous: Joint density
Joint distribution function
Bivariate case: Random variables X and Y
Define joint distribution function as
F(x, y) := P(X ≤ x, Y ≤ y),   −∞ < x, y < ∞
Bivariate distribution thus fully specified:
P(x1 < X ≤ x2, y1 < Y ≤ y2) = F(x2, y2) − F(x1, y2) − F(x2, y1) + F(x1, y1)
for x1 < x2 and y1 < y2
Marginal distribution:
FX(x) := P(X ≤ x) = F(x, ∞)
Idea: P(X ≤ x) = P(X ≤ x, Y < ∞) = lim_{y→∞} F(x, y)
Analogously: FY(y) := P(Y ≤ y) = F(∞, y)
Bivariate continuous random variable
X and Y are jointly continuous if there exists a joint density function
f(x, y) = ∂²F(x, y) / (∂x ∂y)
The joint distribution function is then obtained by integration:
F(a, b) = ∫_{y=−∞}^{b} ∫_{x=−∞}^{a} f(x, y) dx dy
The density of the marginal distribution of X is obtained by integration over ΩY:
fX(x) = ∫_{y=−∞}^{∞} f(x, y) dy
Example: Bivariate uniform distribution
X and Y uniformly distributed on [0, 1] × [0, 1]
⇒ density f(x, y) = 1,   0 ≤ x, y ≤ 1
Joint distribution function
F(a, b) = ∫_{y=0}^{b} ∫_{x=0}^{a} f(x, y) dx dy = a·b,   0 ≤ a, b ≤ 1
Density of marginal distribution:
fX(x) = ∫_{y=−∞}^{∞} f(x, y) dy = 1,   0 ≤ x ≤ 1
i.e. the density of the univariate uniform distribution
Exercise: Bivariate continuous distribution
Let joint density of X and Y be given as

f(x, y) = a x² y²  for 0 < x < y < 1,   and f(x, y) = 0 else
1. Determine the constant a
2. Determine the CDF F (x, y)
3. Compute the probability P (X < 1/2, Y > 2/3)
4. Compute the probability P (X < 2, Y > 2/3)
5. Compute the probability P (X < 2, Y > 3)
Hint: Be careful with the domain of integration
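A minimal Python sketch (not from the original slides; seed and sample size are arbitrary) to check one's hand calculation of the constant and of one of the probabilities by Monte Carlo over the unit square:

```python
# Sketch: Monte Carlo check of the normalizing constant a and of P(X < 1/2, Y > 2/3).
import numpy as np

rng = np.random.default_rng(0)
n = 10**6
x, y = rng.random(n), rng.random(n)                       # uniform points in the unit square
inside = x < y                                            # the support 0 < x < y < 1
integral = np.mean(np.where(inside, x**2 * y**2, 0.0))    # estimates the integral of x^2 y^2 over the support
a = 1 / integral
print(a)                                                  # the constant that makes f integrate to 1

p = np.mean(np.where(inside & (x < 0.5) & (y > 2/3), a * x**2 * y**2, 0.0))
print(p)                                                  # estimate of P(X < 1/2, Y > 2/3)
```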
Sheldon Ross: Example 1c
Let joint density of X and Y be given as

f(x, y) = e^{−(x+y)}  for 0 < x < ∞, 0 < y < ∞,   and f(x, y) = 0 else
Find the density function of the random variable X/Y
Again most important to consider the correct domain of integration
P(X/Y ≤ a) = ∫_{y=0}^{∞} ∫_{x=0}^{ay} e^{−(x+y)} dx dy = 1 − 1/(a + 1)
(More details are given in the book)
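A quick simulation sketch of this result (not part of the original example; seed arbitrary): for independent Exp(1) variables the empirical value of P(X/Y ≤ a) should match 1 − 1/(a + 1).

```python
# Sketch: simulate independent Exp(1) variables and compare with 1 - 1/(a+1).
import numpy as np

rng = np.random.default_rng(1)
x = rng.exponential(size=10**6)
y = rng.exponential(size=10**6)
for a in (0.5, 1.0, 3.0):
    print(a, np.mean(x / y <= a), 1 - 1 / (a + 1))
```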
Bivariate discrete random variable
X and Y both discrete
Define joint probability function
p(x, y) = P (X = x, Y = y)
Naturally we have
p(x, y) = F(x, y) − F(x⁻, y) − F(x, y⁻) + F(x⁻, y⁻)
Obtain probability function of X by summation over ΩY:
pX(x) = P(X = x) = Σ_{y∈ΩY} p(x, y)
Example
Bowl with 3 red, 4 white and 5 blue balls;
draw 3 balls randomly without replacement
X . . . number of red drawn balls
Y . . . number of white drawn balls
e.g.: p(0, 1) = P(0R, 1W, 2B) = (3 choose 0)(4 choose 1)(5 choose 2) / (12 choose 3) = 40/220
 i \ j |    0        1        2        3     |   pX
   0   |  10/220   40/220   30/220    4/220  |  84/220
   1   |  30/220   60/220   18/220    0      | 108/220
   2   |  15/220   12/220    0        0      |  27/220
   3   |   1/220    0        0        0      |   1/220
  pY   |  56/220  112/220   48/220    4/220  |   1
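The table can be reproduced by direct counting; a short Python sketch (not from the slides):

```python
# Sketch: joint pmf of (X, Y) = (#red, #white) when drawing 3 balls from 3 red, 4 white, 5 blue.
from math import comb

total = comb(12, 3)
p = {(i, j): comb(3, i) * comb(4, j) * comb(5, 3 - i - j) / total
     for i in range(4) for j in range(4) if i + j <= 3}

print(p[(0, 1)])                                           # 40/220
pX = [sum(v for (i, j), v in p.items() if i == k) for k in range(4)]
print([round(v * 220) for v in pX])                        # [84, 108, 27, 1] (in units of 1/220)
```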
Multivariate random variable
More than two random variables
joint distribution function for n random variables
F (x1 , . . . , xn ) = P (X1 ≤ x1 , . . . , Xn ≤ xn )
Discrete: Joint probability function:
p(x1 , . . . , xn ) = P (X1 = x1 , . . . , Xn = xn )
Marginals again by summation over all components that are not of interest, e.g.
pX1(x1) = Σ_{x2∈Ω2} ··· Σ_{xn∈Ωn} p(x1, ..., xn)
Multinomial distribution
one of the most important multivariate discrete distributions
n independent experiments, each with r possible results occurring with probabilities p1, ..., pr
Xi ... number of experiments with the i-th result, then
P(X1 = n1, ..., Xr = nr) = n!/(n1! ··· nr!) · p1^{n1} ··· pr^{nr},   whenever Σ_{i=1}^{r} ni = n
Generalization of binomial distribution (r = 2)
Exercise: Poker Dice (throw 5 dice)
Compute the probability of a (high) straight, four of a kind, and a full house
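A sketch of the multinomial pmf in Python, applied to poker dice; the 6·5 factor counting the possible faces of the triple and the pair in a full house is added bookkeeping, not taken from the slide:

```python
# Sketch: multinomial pmf and a poker-dice application (5 dice, r = 6 equally likely faces).
from math import factorial, prod

def multinomial_pmf(counts, probs):
    n = sum(counts)
    coef = factorial(n) // prod(factorial(c) for c in counts)
    return coef * prod(q**c for q, c in zip(probs, counts))

p = [1/6] * 6
print(multinomial_pmf([3, 2, 0, 0, 0, 0], p))          # one specific full-house pattern (three 1s, two 2s)
print(6 * 5 * multinomial_pmf([3, 2, 0, 0, 0, 0], p))  # any full house: 6*5 ordered choices of the two faces
```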
4.2 Independent random variables
Two random variables X and Y are called independent if for all
events A and B
P (X ∈ A, Y ∈ B) = P (X ∈ A)P (Y ∈ B)
Information on X does not give any information on Y
X and Y independent if and only if
P (X ≤ a, Y ≤ b) = P (X ≤ a)P (Y ≤ b)
i.e. F (a, b) = FX (a) FY (b) for all a, b.
Also equivalent with f (x, y) = fX (x) fY (y) in the continuous case
and with p(x, y) = pX (x) pY (y) in the discrete case for all x, y
Example: continuous
Looking back at the example

f(x, y) = a x² y²  for 0 < x < y < 1,   and f(x, y) = 0 else
Are the random variables X and Y with joint density as specified
above independent?
What changes if we look at

f(x, y) = a x² y²  for 0 < x < 1, 0 < y < 1,   and f(x, y) = 0 else
Example
Z ∼ P(λ) ... number of persons who enter a bar
p ... proportion of male visitors
X, Y ... number of male and female visitors respectively
Claim: X and Y are independent and Poisson distributed with parameters pλ and qλ (q = 1 − p) respectively
Solution: Law of total probability:
P (X = i, Y = j) = P (X = i, Y = j|X + Y = i + j)P (X + Y = i + j)
by definition:
P(X = i, Y = j | X + Y = i + j) = (i+j choose i) p^i q^j
P(X + Y = i + j) = e^{−λ} λ^{i+j} / (i+j)!
Together:
P(X = i, Y = j) = e^{−λ} (λp)^i (λq)^j / (i! j!) = [e^{−λp} (λp)^i / i!] · [e^{−λq} (λq)^j / j!]
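A simulation sketch of this "thinning" result (not from the slides; parameter values arbitrary): generate the total, mark each visitor male with probability p, and compare X with Poisson(pλ).

```python
# Sketch: thinning a Poisson count; X should behave like Poisson(p*lambda).
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
lam, p, n = 5.0, 0.4, 10**6
z = rng.poisson(lam, n)                  # total number of visitors
x = rng.binomial(z, p)                   # male visitors
print(x.mean(), x.var())                 # both close to p*lam = 2.0
print(np.mean(x == 3), stats.poisson.pmf(3, p * lam))
```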
Example: Two dice
X, Y . . . uniformly distributed on {1, . . . , 6}
Due to independence we have p(x, y) = pX(x) pY(y) = 1/36
Distribution function:
FX(x) = FY(x) = i/6, if i ≤ x < i + 1, 0 < x < 7
F(x, y) = FX(x) FY(y) = ij/36, if i ≤ x < i + 1 and j ≤ y < j + 1
What is the distribution of X + Y?
P (X + Y = 2) = p(1, 1) = 1/36
P (X + Y = 3) = p(1, 2) + p(2, 1) = 2/36
P (X + Y = 4) = p(1, 3) + p(2, 2) + p(3, 1) = 3/36
P (X + Y = k) = p(1, k − 1) + p(2, k − 2) + · · · + p(k − 1, 1)
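The whole pmf of X + Y follows from the discrete convolution; a short numerical sketch (assuming numpy is available):

```python
# Sketch: pmf of the sum of two independent fair dice via np.convolve.
import numpy as np

die = np.full(6, 1/6)                    # pmf on {1,...,6}
pmf_sum = np.convolve(die, die)          # pmf of X + Y on {2,...,12}
for k, pk in enumerate(pmf_sum, start=2):
    print(k, round(pk * 36), "/36")
```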
Sum of two independent random variables
Sum of random variables itself a random variable
Computation of distribution via convolution
Discrete random variables:
P(X + Y = k) = Σ_{x+y=k} pX(x) pY(y) = Σ_{y∈ΩY} pX(k − y) pY(y)
Continuous random variables:
fX+Y(x) = ∫_{y=−∞}^{∞} fX(x − y) fY(y) dy
Exercise: X ∼ P(λ1) and Y ∼ P(λ2) independent  ⇒  X + Y ∼ P(λ1 + λ2)
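A numerical sanity check of this exercise (a sketch; the truncation point kmax is an arbitrary choice): convolving the two Poisson pmfs reproduces the Poisson(λ1 + λ2) pmf.

```python
# Sketch: convolution of Poisson pmfs equals the pmf of Poisson(lam1 + lam2).
import numpy as np
from scipy import stats

lam1, lam2, kmax = 2.0, 3.5, 60
k = np.arange(kmax)
conv = np.convolve(stats.poisson.pmf(k, lam1), stats.poisson.pmf(k, lam2))[:kmax]
print(np.max(np.abs(conv - stats.poisson.pmf(k, lam1 + lam2))))   # essentially 0
```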
Example of convolution: Continuous case
X, Y independent, uniformly distributed on [0, 1]
i.e. f (x, y) = 1, (x, y) ∈ [0, 1] × [0, 1]
fX (x) = 1, 0 ≤ x ≤ 1, fY (y) = 1, 0 ≤ y ≤ 1
Computation of density of Z := X + Y
fZ(x) = ∫_{y=−∞}^{∞} fX(x − y) fY(y) dy
      = ∫_{y=0}^{x} dy = x,          for 0 < x ≤ 1
      = ∫_{y=x−1}^{1} dy = 2 − x,    for 1 < x ≤ 2
Reason: fY(y) = 1 if 0 ≤ y ≤ 1, and fX(x − y) = 1 if 0 ≤ x − y ≤ 1  ⇔  y ≤ x ≤ y + 1
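A simulation sketch of the triangular density just derived (seed and bin count arbitrary):

```python
# Sketch: histogram of Z = X + Y for independent Uniform(0,1) vs. the triangular density.
import numpy as np

rng = np.random.default_rng(3)
z = rng.random(10**6) + rng.random(10**6)
hist, edges = np.histogram(z, bins=20, range=(0, 2), density=True)
mid = (edges[:-1] + edges[1:]) / 2
print(np.max(np.abs(hist - np.where(mid <= 1, mid, 2 - mid))))   # small (Monte Carlo error only)
```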
Another example of convolution
X, Y indep., Γ-distributed with parameters t1, t2 and the same λ
fX(x) = λ e^{−λx} (λx)^{t1−1} / Γ(t1),   fY(y) = λ e^{−λy} (λy)^{t2−1} / Γ(t2),   x, y ≥ 0
fZ(x) = ∫_{y=−∞}^{∞} fX(x − y) fY(y) dy
      = ∫_{y=0}^{x} [λ e^{−λ(x−y)} (λ(x − y))^{t1−1} / Γ(t1)] · [λ e^{−λy} (λy)^{t2−1} / Γ(t2)] dy
      = [λ^{t1+t2} e^{−λx} / (Γ(t1) Γ(t2))] ∫_{y=0}^{x} (x − y)^{t1−1} y^{t2−1} dy
        (substitution y = xz, dy = x dz, and ∫_{0}^{1} (1 − z)^{t1−1} z^{t2−1} dz = Γ(t1)Γ(t2)/Γ(t1 + t2))
      = λ e^{−λx} (λx)^{t1+t2−1} / Γ(t1 + t2)
i.e. X + Y is Γ-distributed with parameters t1 + t2 and λ
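A simulation sketch of this convolution result (parameter values arbitrary; scipy's gamma uses shape t and scale 1/λ):

```python
# Sketch: X + Y for independent Gamma(t1, lam) and Gamma(t2, lam) should be Gamma(t1+t2, lam).
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
t1, t2, lam, n = 2.0, 3.5, 1.5, 10**5
z = stats.gamma.rvs(t1, scale=1/lam, size=n, random_state=rng) \
    + stats.gamma.rvs(t2, scale=1/lam, size=n, random_state=rng)
print(stats.kstest(z, "gamma", args=(t1 + t2, 0, 1/lam)))        # large p-value expected
```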
4.3 Transformations
As in the univariate case, one also considers transformations of random variables in the multivariate case
We restrict ourselves to the bivariate case
X1 and X2 jointly distributed with density fX1 ,X2
Y1 = g1 (X1 , X2 ) and Y2 = g2 (X1 , X2 )
Major question: What is the joint density fY1 ,Y2 ?
Density of Transformation
Assumptions:
• y1 = g1 (x1 , x2 ) and y2 = g2 (x1 , x2 ) can be uniquely solved for
x1 and x2 , say x1 = h1 (y1 , y2 ) and x2 = h2 (y1 , y2 )
• g1 and g2 are C¹ such that
  J(x1, x2) = det( [∂g1/∂x1  ∂g1/∂x2 ; ∂g2/∂x1  ∂g2/∂x2] ) ≠ 0
Under these conditions we have
fY1,Y2(y1, y2) = fX1,X2(x1, x2) |J(x1, x2)|^{−1}
Proof: Calculus
Note the similarity to the one-dimensional case
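A small symbolic sketch of this recipe (using sympy, not from the slides) for the transformation Y1 = X1 + X2, Y2 = X1 − X2, which is Example 7a on the next slide:

```python
# Sketch: Jacobian and inverse map for y1 = x1 + x2, y2 = x1 - x2.
import sympy as sp

x1, x2, y1, y2 = sp.symbols("x1 x2 y1 y2")
g = sp.Matrix([x1 + x2, x1 - x2])
J = g.jacobian([x1, x2])
print(J, J.det())                                              # [[1, 1], [1, -1]], det = -2
inv = sp.solve([sp.Eq(y1, x1 + x2), sp.Eq(y2, x1 - x2)], [x1, x2])
print(inv)                                                     # x1 = (y1+y2)/2, x2 = (y1-y2)/2
# Hence f_{Y1,Y2}(y1, y2) = f_{X1,X2}((y1+y2)/2, (y1-y2)/2) * 1/|J|, with |J| = 2.
```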
Examples
• Sheldon Ross: Example 7a
Y1 = X1 + X2
Y2 = X1 − X2
• Sheldon Ross: Example 7c
X and Y independent gamma random variables
U =X +Y
V = X/(X + Y )
Expectation of bivariate RV
X and Y discrete with joint probability p(x, y).
Like in the one-dimensional case:
E(g(X, Y)) = Σ_{x∈ΩX} Σ_{y∈ΩY} g(x, y) p(x, y)
Exercise:
Let X and Y be the numbers shown by two fair dice, thrown independently
Compute the expectation of the difference |X − Y|
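A direct-summation sketch for this exercise (exact fractions; not part of the original slide):

```python
# Sketch: E|X - Y| for two independent fair dice by summing over the joint pmf.
from fractions import Fraction

e = sum(Fraction(1, 36) * abs(x - y) for x in range(1, 7) for y in range(1, 7))
print(e, float(e))
```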
Expectation of bivariate RV
X and Y continuous with joint density f (x, y).
Like in the one-dimensional case:
E(g(X, Y)) = ∫_{x∈R} ∫_{y∈R} g(x, y) f(x, y) dy dx
Exercise:
An accident occurs at point X on a road of length L. At that time the ambulance is at point Y. Assuming both X and Y are uniformly distributed on [0, L] and independent, calculate the average distance between them. This is again E(|X − Y|)
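A simulation sketch for this exercise (the value of L and the seed are arbitrary):

```python
# Sketch: Monte Carlo estimate of E|X - Y| for X, Y independent Uniform(0, L).
import numpy as np

rng = np.random.default_rng(5)
L = 2.0
x, y = rng.uniform(0, L, 10**6), rng.uniform(0, L, 10**6)
print(np.mean(np.abs(x - y)))        # compare with the exact answer obtained by integration
```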
Expectation of the sum of two RV
X and Y discrete with joint probability p(x, y)
For g(x, y) = x + y we obtain
E(X + Y) = Σ_{x∈ΩX} Σ_{y∈ΩY} (x + y) p(x, y) = E(X) + E(Y)
Analogously in the continuous case:
E(X + Y) = ∫_{y=−∞}^{∞} ∫_{x=−∞}^{∞} (x + y) f(x, y) dx dy = E(X) + E(Y)
Be aware: additivity does not hold for variances in general!
Expectation of functions under independence
If X and Y are independent random variables, then for any
functions h and g we obtain
E (g(X)h(Y )) = E (g(X)) E (h(Y ))
Proof:
E(g(X)h(Y)) = ∫∫ g(x) h(y) f(x, y) dx dy
            = ∫∫ g(x) h(y) fX(x) fY(y) dx dy
            = ∫ g(x) fX(x) dx · ∫ h(y) fY(y) dy
            = E(g(X)) E(h(Y))
Second equality uses independence
Expectation of random samples
X1, ..., Xn i.i.d. like X (independent, identically distributed)
Definition: X̄ := (1/n) Σ_{i=1}^{n} Xi
For E(X) = µ and Var(X) = σ² we obtain:
E(X̄) = µ,   Var(X̄) = σ²/n
Proof:
E(Σ_{i=1}^{n} Xi) = Σ_{i=1}^{n} E(Xi) = nµ,   hence E(X̄) = µ
Var(X̄) = E((1/n) Σ_{i=1}^{n} Xi − µ)² = (1/n²) E(Σ_{i=1}^{n} (Xi − µ))² = (1/n²) E(Σ_{i=1}^{n} (Xi − µ)²) = σ²/n
Last equality due to independence of the Xi (the cross terms have expectation 0)
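A simulation sketch of Var(X̄) = σ²/n (distribution, σ², n and the number of repetitions are arbitrary choices):

```python
# Sketch: the variance of the sample mean shrinks like sigma^2 / n.
import numpy as np

rng = np.random.default_rng(6)
n, reps = 25, 10**5
samples = rng.exponential(scale=2.0, size=(reps, n))   # exponential with mean 2, so sigma^2 = 4
print(samples.mean(axis=1).var(), 4.0 / n)             # both close to 0.16
```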
4.4 Covariance and Correlation
Describe the relation between two random variables
Definition Covariance:
Cov(X, Y) = E[(X − E(X))(Y − E(Y))]
Usual notation: σXY := Cov (X, Y )
Just like for variances we have
σXY = E(XY ) − E(X)E(Y )
Definition of correlation:
ρ(X, Y) := σXY / (σX σY)
Example Correlation
[Four scatter plots of bivariate samples with correlations ρ = 0.9, ρ = −0.6, ρ = 0.3 and ρ = 0.0]
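Scatter data of this kind can be generated from a bivariate normal distribution with the desired ρ; a sketch (not from the slides, anticipating Section 4.6):

```python
# Sketch: samples with prescribed correlation rho, checked with the sample correlation.
import numpy as np

rng = np.random.default_rng(7)
for rho in (0.9, -0.6, 0.3, 0.0):
    cov = [[1.0, rho], [rho, 1.0]]
    xy = rng.multivariate_normal([0.0, 0.0], cov, size=5000)
    print(rho, np.corrcoef(xy[:, 0], xy[:, 1])[0, 1])
```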
Example Covariance
Discrete bivariate distribution (ΩX = ΩY = {0, 1, 2, 3}) given by
 i \ j |   0      1      2      3    |  pX
   0   |  1/20   4/20   3/20   2/20  | 10/20
   1   |  3/20   2/20   2/20   0     |  7/20
   2   |  1/20   1/20   0      0     |  2/20
   3   |  1/20   0      0      0     |  1/20
  pY   |  6/20   7/20   5/20   2/20  | 20/20
Compute Cov (X, Y )
Solution:
Cov(X, Y) = E(XY) − E(X) E(Y) = 8/20 − (14/20) · (23/20) = −162/400
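The same computation as a short numpy sketch over the pmf table (not from the slides):

```python
# Sketch: Cov(X, Y) from the joint pmf table (rows: values of X, columns: values of Y).
import numpy as np

p = np.array([[1, 4, 3, 2],
              [3, 2, 2, 0],
              [1, 1, 0, 0],
              [1, 0, 0, 0]]) / 20
vals = np.arange(4)
ex = (p.sum(axis=1) * vals).sum()          # E(X) = 14/20
ey = (p.sum(axis=0) * vals).sum()          # E(Y) = 23/20
exy = (p * np.outer(vals, vals)).sum()     # E(XY) = 8/20
print(exy - ex * ey)                       # -162/400
```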
Example 2: Covariance:
Compute covariance between X and Y for

f(x, y) = 24xy  for 0 ≤ x ≤ 1, 0 ≤ y ≤ 1, x + y ≤ 1,   and f(x, y) = 0 else
Compute first the CDF of X:
P(X ≤ a) = ∫_{y=0}^{1−a} ∫_{x=0}^{a} 24xy dx dy + ∫_{y=1−a}^{1} ∫_{x=0}^{1−y} 24xy dx dy
         = 6(1 − a)²a² + 1 − 6(1 − a)² + 8(1 − a)³ − 3(1 − a)⁴
and by differentiation
fX(x) = 12(1 − x)²x − 12(1 − x)x² + 12(1 − x) − 24(1 − x)² + 12(1 − x)³ = 12(1 − x)²x
Example 2: Covariance continued
Because of symmetry we have fY(x) = fX(x) and we get
E(X) = E(Y) = ∫_{x=0}^{1} 12(1 − x)²x² dx = 2/5
Furthermore
E(XY) = ∫_{y=0}^{1} ∫_{x=0}^{1−y} 24x²y² dx dy = 2/15
and finally
Cov(X, Y) = 2/15 − (2/5)·(2/5) = (10 − 12)/75 = −2/75
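A Monte Carlo sketch confirming Cov(X, Y) = −2/75 (uniform points on the unit square weighted with the density; seed and sample size arbitrary):

```python
# Sketch: covariance for f(x, y) = 24xy on the triangle x + y <= 1, by weighted Monte Carlo.
import numpy as np

rng = np.random.default_rng(8)
n = 2 * 10**6
x, y = rng.random(n), rng.random(n)
w = np.where(x + y <= 1, 24 * x * y, 0.0)     # density values, used as weights
ex, ey, exy = (w * x).mean(), (w * y).mean(), (w * x * y).mean()
print(exy - ex * ey, -2 / 75)                 # both approximately -0.0267
```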
Covariance for independent RV
X and Y independent  ⇒  σXY = 0
follows immediately from σXY = E(XY) − E(X)E(Y) and
E(XY) = Σ_x Σ_y x y p(x, y) = (Σ_x x pX(x)) (Σ_y y pY(y))
Converse not true:
X uniformly distributed on {−1, 0, 1} and Y = 0 if X ≠ 0, Y = 1 if X = 0
E(X) = 0 and XY = 0 ⇒ E(XY) = 0,
thus Cov(X, Y) = 0, although X and Y are not independent:
e.g. P(X = 1, Y = 0) = P(X = 1) = 1/3, but P(X = 1) P(Y = 0) = (1/3)(2/3) = 2/9
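The counterexample in a few lines of Python (exact arithmetic; not from the slides):

```python
# Sketch: zero covariance, yet clearly dependent.
from fractions import Fraction

support = [(-1, 0), (0, 1), (1, 0)]                     # (x, y) values, each with probability 1/3
p = Fraction(1, 3)
ex = sum(p * x for x, _ in support)
ey = sum(p * y for _, y in support)
exy = sum(p * x * y for x, y in support)
print(exy - ex * ey)                                    # Cov(X, Y) = 0
print(p, p * Fraction(2, 3))                            # P(X=1, Y=0) = 1/3 vs. P(X=1)P(Y=0) = 2/9
```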
Properties of Covariance
Obviously
Cov (X, Y ) = Cov (Y, X),
and Cov (X, X) = Var (X)
Covariance is a bilinear form:
Cov(aX, Y) = a Cov(X, Y),   a ∈ R
and
Cov(Σ_{i=1}^{n} Xi, Σ_{j=1}^{m} Yj) = Σ_{i=1}^{n} Σ_{j=1}^{m} Cov(Xi, Yj)
Proof by simple computation ...
Variance of a sum of RV
Due to the properties shown above
à n
!
n X
n
X
X
Xi
=
Cov (Xi , Xj )
Var
i=1
i=1 j=1
n
X
=
Var (Xi ) +
i=1
Extreme cases:
• independent RV:
n
P
i=1
¶
Xi
µ
• X1 = X2 = · · · = Xn :
Cov (Xi , Xj )
i=1 j6=i
µ
Var
n X
X
Var
n
P
i=1
34
=
n
P
i=1
Var (Xi )
¶
Xi
= n2 Var (X1 )
Correlation
Definition:
ρ(X, Y) := σXY / (σX σY)
We have:
−1 ≤ ρ(X, Y ) ≤ 1
Proof:
0 ≤ Var(X/σX + Y/σY) = Var(X)/σX² + Var(Y)/σY² + 2 Cov(X, Y)/(σX σY) = 2[1 + ρ(X, Y)]
0 ≤ Var(X/σX − Y/σY) = Var(X)/σX² + Var(Y)/σY² − 2 Cov(X, Y)/(σX σY) = 2[1 − ρ(X, Y)]
Exercise Correlation
Let X and Y be independently uniform on [0, 1]
Compute correlation between X and Z for
1. Z = X + Y
2. Z = X 2 + Y 2
3. Z = (X + Y )2
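A Monte Carlo sketch for this exercise (seed and sample size arbitrary; it only gives numerical estimates, the exact values are left as the exercise):

```python
# Sketch: estimated correlations between X and the three choices of Z.
import numpy as np

rng = np.random.default_rng(9)
x, y = rng.random(10**6), rng.random(10**6)
for name, z in [("X+Y", x + y), ("X^2+Y^2", x**2 + y**2), ("(X+Y)^2", (x + y)**2)]:
    print(name, np.corrcoef(x, z)[0, 1])
```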
4.5 Conditional Distribution
Conditional probability for two events E and F:
P(E|F) = P(EF) / P(F)
Corresponding definition for random variables X and Y
Discrete:
pX|Y(x|y) := P(X = x | Y = y) = p(x, y) / pY(y)
Exercise: Let p(x, y) given by
p(0, 0) = 0.4,
p(0, 1) = 0.2,
p(1, 0) = 0.1,
p(1, 1) = 0.3,
Compute conditional probability function of X for Y = 1
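A two-line sketch of the computation asked for (not from the slides):

```python
# Sketch: conditional pmf of X given Y = 1.
p = {(0, 0): 0.4, (0, 1): 0.2, (1, 0): 0.1, (1, 1): 0.3}
pY1 = sum(v for (x, y), v in p.items() if y == 1)
print({x: p[(x, 1)] / pY1 for x in (0, 1)})
```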
Discrete conditional distribution
Conditional CDF:
FX|Y (x|y) := P (X ≤ x|Y = y) =
X
pX|Y (k|y)
k≤x
If X and Y are independent, then pX|Y(x|y) = pX(x)
Proof: Easy computation
Exercise: Let X ∼ P(λ1 ) and Y ∼ P(λ2 ) be independent.
Compute conditional distribution of X, for X + Y = n
P(X = k | X + Y = n) = P(X = k) P(Y = n − k) / P(X + Y = n),   with X + Y ∼ P(λ1 + λ2)
⇒  X | (X + Y = n) ∼ B(n, λ1/(λ1 + λ2))
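A numerical sketch of this identity with scipy (arbitrary parameter values):

```python
# Sketch: P(X = k | X + Y = n) equals the Binomial(n, lam1/(lam1+lam2)) pmf.
from scipy import stats

lam1, lam2, n, k = 2.0, 3.0, 7, 4
num = stats.poisson.pmf(k, lam1) * stats.poisson.pmf(n - k, lam2)
den = stats.poisson.pmf(n, lam1 + lam2)
print(num / den, stats.binom.pmf(k, n, lam1 / (lam1 + lam2)))   # equal
```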
Continuous conditional distribution
Continuous:
fX|Y(x|y) := f(x, y) / fY(y)   for fY(y) > 0
Definition in the continuous case can be motivated by the discrete case
(probabilities for small neighborhoods of x and y)
Computation of conditional probabilities:
P(X ∈ A | Y = y) = ∫_A fX|Y(x|y) dx
Conditional CDF:
FX|Y(a|y) := P(X ∈ (−∞, a] | Y = y) = ∫_{x=−∞}^{a} fX|Y(x|y) dx
Example
Joint density of X and Y given by

f(x, y) = c·x(2 − x − y)  for x ∈ [0, 1], y ∈ [0, 1],   and f(x, y) = 0 otherwise
Compute c, fX|Y (x|y) and P (X < 1/2|Y = 1/3)
Solution:
fY(y) = c ∫_{x=0}^{1} x(2 − x − y) dx = c(2/3 − y/2)
1 = ∫_{y=0}^{1} fY(y) dy = c(2/3 − 1/4)   ⇒   c = 12/5
fX|Y(x|y) = f(x, y)/fY(y) = x(2 − x − y)/(2/3 − y/2) = 6x(2 − x − y)/(4 − 3y)
P(X < 1/2 | Y = 1/3) = ∫_{x=0}^{1/2} 6x(2 − x − 1/3)/(4 − 3·(1/3)) dx = ··· = 1/3
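A numerical check of c and of P(X < 1/2 | Y = 1/3) with scipy (a sketch, not from the slides; scipy's dblquad integrates func(y, x)):

```python
# Sketch: verify the normalizing constant and the conditional probability numerically.
from scipy.integrate import quad, dblquad

mass = dblquad(lambda y, x: x * (2 - x - y), 0, 1, lambda x: 0, lambda x: 1)[0]
c = 1 / mass
print(c)                                                      # 12/5

fY = c * quad(lambda x: x * (2 - x - 1/3), 0, 1)[0]           # fY(1/3)
print(quad(lambda x: c * x * (2 - x - 1/3) / fY, 0, 0.5)[0])  # 1/3
```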
Conditional expectation and variance
Computation in continuous case with conditional density:
E(X|Y = y) = ∫_{x=−∞}^{∞} x fX|Y(x|y) dx
Var(X|Y = y) = ∫_{x=−∞}^{∞} (x − E(X|Y = y))² fX|Y(x|y) dx
Example continued:
E(X|Y = y) = ∫_{x=0}^{1} 6x²(2 − x − y)/(4 − 3y) dx = (5/2 − 2y)/(4 − 3y)
Specifically: E(X|Y = 1/3) = (5/2 − 2/3)/(4 − 1) = 11/18
Var(X|Y = y) = ...
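A one-integral check of the value just substituted (a sketch with scipy):

```python
# Sketch: E(X | Y = 1/3) computed numerically from the conditional density.
from scipy.integrate import quad

y = 1/3
e = quad(lambda x: x * 6 * x * (2 - x - y) / (4 - 3 * y), 0, 1)[0]
print(e, (5/2 - 2*y) / (4 - 3*y))      # both equal 11/18
```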
Conditional expectation and variance
Computation in discrete case with conditional probability function:
E(X|Y = y) = Σ_{x∈ΩX} x pX|Y(x|y)
Var(X|Y = y) = Σ_{x∈ΩX} (x − E(X|Y = y))² pX|Y(x|y)
Exercise: Compute expectation and variance for X given Y = j
 i \ j |   0      1      2      3    |  pX
   0   |  1/20   4/20   3/20   2/20  | 10/20
   1   |  3/20   2/20   2/20   0     |  7/20
   2   |  1/20   1/20   0      0     |  2/20
   3   |  1/20   0      0      0     |  1/20
  pY   |  6/20   7/20   5/20   2/20  | 20/20
Computation of expectation by conditioning
E(X|Y = y) can be seen as a function of y, and therefore the
expectation of this function can be computed
It holds:
E(X) = E(E(X|Y))
Proof (for the continuous case):
E(E(X|Y)) = ∫_{y=−∞}^{∞} E(X|Y = y) fY(y) dy
          = ∫_{y=−∞}^{∞} ∫_{x=−∞}^{∞} x fX|Y(x|y) fY(y) dx dy
          = ∫_{y=−∞}^{∞} ∫_{x=−∞}^{∞} x [f(x, y)/fY(y)] fY(y) dx dy = E(X)
Verify formula for previous examples
The conditional variance formula
Var (X) = E(Var (X|Y )) + Var (E(X|Y ))
Proof: From
Var (X|Y ) = E(X 2 |Y ) − (E(X|Y ))2
we get
E(Var (X|Y )) = E(E(X 2 |Y ))−E((E(X|Y ))2 ) = E(X 2 )−E(E(X|Y )2 )
On the other hand
Var (E(X|Y )) = E(E(X|Y )2 )−(E(E(X|Y )))2 = E(E(X|Y )2 )−E(X)2
The sum of both expressions gives the result.
Formula important in the theory of linear regression!
4.6 Bivariate normal distribution
Univariate normal distribution:  f(x) = (1/(√(2π) σ)) e^{−(x−µ)²/(2σ²)}
Standard normal distribution:  φ(x) = (1/√(2π)) e^{−x²/2}
Let X1 and X2 be independent, both normally distributed N(µi, σi²), i = 1, 2. Then
f(x1, x2) = (1/(2π σ1 σ2)) e^{−(x1−µ1)²/(2σ1²) − (x2−µ2)²/(2σ2²)}
          = (1/(2π |Σ|^{1/2})) e^{−(x−µ)ᵀ Σ^{−1} (x−µ)/2}
where x = (x1, x2)ᵀ, µ = (µ1, µ2)ᵀ, Σ = [σ1²  0 ; 0  σ2²]
Density function in general
X = (X1, X2) bivariate normally distributed if the joint density has the form
f(x) = (1/(2π |Σ|^{1/2})) e^{−(x−µ)ᵀ Σ^{−1} (x−µ)/2}
Covariance matrix:  Σ = [σ1²  σ12 ; σ12  σ2²]
Notation:  ρ := σ12/(σ1 σ2)
• |Σ| = σ1²σ2² − σ12² = σ1²σ2²(1 − ρ²)
• Σ^{−1} = 1/(σ1²σ2²(1 − ρ²)) · [σ2²  −ρσ1σ2 ; −ρσ1σ2  σ1²]
Density in general
Finally we obtain
f(x1, x2) = 1/(2π σ1 σ2 √(1 − ρ²)) · exp{ −(z1² − 2ρ z1 z2 + z2²) / (2(1 − ρ²)) }
where z1 = (x1 − µ1)/σ1 and z2 = (x2 − µ2)/σ2 (compare standardization)
Notation suggests that µi and σi² are expectation and variance of the marginals Xi, and that ρ is the correlation between X1 and X2
Important formula:
f(x1, x2) = (1/(√(2π) σ1)) e^{−z1²/2} · (1/(√(2π(1 − ρ²)) σ2)) e^{−(ρz1 − z2)²/(2(1 − ρ²))}
(completion of the square in the exponent)
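A numerical sketch comparing this factorized formula with scipy's bivariate normal density (arbitrary parameter values):

```python
# Sketch: the factorized density equals the multivariate normal pdf.
import numpy as np
from scipy.stats import multivariate_normal, norm

mu1, mu2, s1, s2, rho = 1.0, -0.5, 2.0, 1.5, 0.6
cov = [[s1**2, rho * s1 * s2], [rho * s1 * s2, s2**2]]
x1, x2 = 0.7, 0.3
z1, z2 = (x1 - mu1) / s1, (x2 - mu2) / s2
factor1 = norm.pdf(z1) / s1
factor2 = norm.pdf((z2 - rho * z1) / np.sqrt(1 - rho**2)) / (s2 * np.sqrt(1 - rho**2))
print(factor1 * factor2, multivariate_normal([mu1, mu2], cov).pdf([x1, x2]))   # equal
```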
Bivariate normal distribution plot
Marginals standard normal distributed N (0, 1), ρ = 0:
Example density
Let (X, Y) be bivariate normally distributed with µ1 = µ2 = 0,
σ1 = σ2 = 1 and ρ = 1/2; compute the joint density
Solution: µ = (0, 0)ᵀ,   Σ = [1  1/2 ; 1/2  1]
|Σ| = 1 − 1/4 = 3/4,   Σ^{−1} = (4/3) [1  −1/2 ; −1/2  1]
(x, y) Σ^{−1} (x, y)ᵀ = (4/3)(x, y)·(x − y/2, −x/2 + y)ᵀ = (4/3)(x² − xy + y²)
f(x, y) = (1/(√3 π)) e^{−(2/3)(x² − xy + y²)}
Density after completion of the square in the exponent:
f(x, y) = (1/√(2π)) e^{−x²/2} · (1/√(2π·3/4)) e^{−(y − x/2)²/(2·3/4)}
Example continued
f(x, y) = (1/√(2π)) e^{−x²/2} · (1/√(2π·3/4)) e^{−(y − x/2)²/(2·3/4)}
The joint density is the product of a standard normal density (in x) and a normal density (in y) with mean x/2 and variance 3/4.
Compute the density of X:
fX(x) = (1/√(2π)) e^{−x²/2} ∫_{y=−∞}^{∞} (1/√(2π·3/4)) e^{−(y − x/2)²/(2·3/4)} dy = (1/√(2π)) e^{−x²/2}
fX(x) is the density of a standard normal distribution
The integral equals 1, because we integrate a density function!
Interpretation of µi, σi² and ρ
In general we have for a bivariate normal distribution
1. X1 ∼ N(µ1, σ1²) and X2 ∼ N(µ2, σ2²)
2. Correlation coefficient ρ(X1, X2) = σ12/(σ1 σ2)
Proof: 1. Use the formula with the completed square in the exponent and integrate:
fX1(x1) = (1/(√(2π) σ1)) e^{−z1²/2} ∫_{x2=−∞}^{∞} (1/(√(2π(1 − ρ²)) σ2)) e^{−(ρz1 − z2)²/(2(1 − ρ²))} dx2
        = (1/(√(2π) σ1)) e^{−z1²/2} ∫_{s=−∞}^{∞} (1/√(2π)) e^{−(1/2)(ρz1/√(1 − ρ²) − s)²} ds
        = (1/(√(2π) σ1)) e^{−z1²/2}
with the substitution s ← z2/√(1 − ρ²) = (x2 − µ2)/(√(1 − ρ²) σ2)
Proof continued
2. Again the formula with completed square in the exponent and the substitutions z1 ← (x1 − µ1)/σ1, z2 ← (x2 − µ2)/σ2:
Cov(X1, X2) = ∫_{x1=−∞}^{∞} ∫_{x2=−∞}^{∞} (x1 − µ1)(x2 − µ2) f(x1, x2) dx2 dx1
            = ∫_{x1=−∞}^{∞} [(x1 − µ1)/(√(2π) σ1)] e^{−z1²/2} ∫_{x2=−∞}^{∞} [(x2 − µ2)/(√(2π(1 − ρ²)) σ2)] e^{−(ρz1 − z2)²/(2(1 − ρ²))} dx2 dx1
            = σ1 σ2 ∫_{z1} z1 φ(z1) [ ∫_{z2} (z2/√(1 − ρ²)) φ((ρz1 − z2)/√(1 − ρ²)) dz2 ] dz1
            = σ1 σ2 ∫_{z1} z1 φ(z1) ρ z1 dz1 = σ1 σ2 ρ = σ12
Conditional distribution
Interpretation of the formula
f(x1, x2) = (1/(√(2π) σ1)) e^{−z1²/2} · (1/(√(2π(1 − ρ²)) σ2)) e^{−(ρz1 − z2)²/(2(1 − ρ²))}
f(x1, x2) = f1(x1) f2|1(x2|x1)
From (ρz1 − z2)²/(1 − ρ²) = (µ2 + σ2 ρ z1 − x2)²/(σ2²(1 − ρ²)) we conclude:
Conditional distribution is again a normal distribution with
µ2|1 = µ2 + ρ (x1 − µ1) σ2/σ1,   σ2|1² = σ2²(1 − ρ²)
For the bivariate normal distribution: ρ = 0 ⇒ independence
(In general this implication is not correct!)
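A simulation sketch of the conditional mean formula (conditioning on a thin slice around x1; all parameter values are arbitrary):

```python
# Sketch: E(X2 | X1 = x1) should be close to mu2 + rho*(s2/s1)*(x1 - mu1).
import numpy as np

rng = np.random.default_rng(10)
mu1, mu2, s1, s2, rho = 1.0, -2.0, 2.0, 0.5, 0.7
cov = [[s1**2, rho * s1 * s2], [rho * s1 * s2, s2**2]]
xy = rng.multivariate_normal([mu1, mu2], cov, size=10**6)
x1 = 1.8
near = np.abs(xy[:, 0] - x1) < 0.05
print(xy[near, 1].mean(), mu2 + rho * (s2 / s1) * (x1 - mu1))
```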
Sum of bivariate normal distributions
Let X1, X2 be bivariate normal with parameters µ1, µ2, σ1², σ2², σ12
Then the random variable Z = X1 + X2 is again normal, with
X1 + X2 ∼ N(µ1 + µ2, σ1² + σ2² + 2σ12)
Proof: For the density of the sum we have
fZ(z) = ∫_{x2=−∞}^{∞} f(z − x2, x2) dx2
The result follows again by completion of the square (lengthy calculation)
Intuition: mean and variance of Z are obtained from the formulas for general random variables
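A final simulation sketch of this fact (arbitrary parameter values):

```python
# Sketch: X1 + X2 has mean mu1 + mu2 and variance s1^2 + s2^2 + 2*s12.
import numpy as np

rng = np.random.default_rng(11)
mu = [1.0, 2.0]
cov = [[1.0, 0.3], [0.3, 2.0]]                   # s1^2 = 1, s2^2 = 2, s12 = 0.3
z = rng.multivariate_normal(mu, cov, size=10**6).sum(axis=1)
print(z.mean(), z.var())                         # approx 3.0 and 3.6
```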