STATISTICS I
Week 11
Covariance and Correlation
Definition:
The covariance between X and Y is given by
$$\operatorname{Cov}(X, Y) = E(XY) - E(X)\,E(Y)$$
The correlation between X and Y is given by
$$\operatorname{Corr}(X, Y) = \frac{\operatorname{Cov}(X, Y)}{SD(X)\,SD(Y)}$$
Example 1
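Suppose X and Y have the following joint distribution (rows give the values of Y, columns the values of X):
$$\begin{array}{c|ccc}
 & X=-1 & X=0 & X=1 \\ \hline
Y=-1 & 1/9 & 1/3 & 1/9 \\
Y=0 & 0 & 1/9 & 0 \\
Y=1 & 2/9 & 0 & 1/9
\end{array}$$
Find the covariance and correlation between X and Y.

Only the cells where both values are nonzero contribute to E(XY). Going through the table row by row:
$$E(XY) = (-1)(-1)\tfrac{1}{9} + (-1)(1)\tfrac{1}{9} + (1)(-1)\tfrac{2}{9} + (1)(1)\tfrac{1}{9} = -\tfrac{1}{9}$$
Distribution of X is given by
$$\begin{array}{c|ccc}
x & -1 & 0 & 1 \\ \hline
P(X=x) & 1/3 & 4/9 & 2/9
\end{array}$$
$$E(X) = -\tfrac{1}{3} + \tfrac{2}{9} = -\tfrac{1}{9}, \qquad E(X^2) = \tfrac{1}{3} + \tfrac{2}{9} = \tfrac{5}{9}$$
$$V(X) = \tfrac{5}{9} - \tfrac{1}{81} = \tfrac{44}{81}, \qquad SD(X) = \tfrac{\sqrt{44}}{9}$$
Distribution of Y is given by
$$\begin{array}{c|ccc}
y & -1 & 0 & 1 \\ \hline
P(Y=y) & 5/9 & 1/9 & 1/3
\end{array}$$
$$E(Y) = -\tfrac{5}{9} + \tfrac{1}{3} = -\tfrac{2}{9}, \qquad E(Y^2) = \tfrac{5}{9} + \tfrac{1}{3} = \tfrac{8}{9}$$
$$V(Y) = \tfrac{8}{9} - \tfrac{4}{81} = \tfrac{68}{81}, \qquad SD(Y) = \tfrac{\sqrt{68}}{9}$$
$$\operatorname{Cov}(X, Y) = E(XY) - E(X)\,E(Y) = -\tfrac{1}{9} - \left(-\tfrac{1}{9}\right)\left(-\tfrac{2}{9}\right) = -\tfrac{11}{81}$$
$$\operatorname{Corr}(X, Y) = \frac{\operatorname{Cov}(X, Y)}{SD(X)\,SD(Y)} = \frac{-\frac{11}{81}}{\frac{\sqrt{44}}{9}\cdot\frac{\sqrt{68}}{9}} = -\frac{11}{\sqrt{44 \cdot 68}} \approx -0.201$$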
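The arithmetic in this example can be checked with exact fractions. The sketch below is only an illustration: the dictionary `pmf` and the helper `expect` are names introduced here, and the joint probabilities are the ones consistent with the marginal distributions and the E(XY) sum given in the example.

```python
from fractions import Fraction as F
from math import sqrt

# Joint pmf keyed by (x, y); only the nonzero cells are listed.
pmf = {
    (-1, -1): F(1, 9), (0, -1): F(1, 3), (1, -1): F(1, 9),
    (0, 0): F(1, 9),
    (-1, 1): F(2, 9), (1, 1): F(1, 9),
}

def expect(g):
    """E[g(X, Y)] under the joint pmf."""
    return sum(p * g(x, y) for (x, y), p in pmf.items())

ex, ey = expect(lambda x, y: x), expect(lambda x, y: y)
cov = expect(lambda x, y: x * y) - ex * ey
var_x = expect(lambda x, y: x * x) - ex ** 2
var_y = expect(lambda x, y: y * y) - ey ** 2
corr = float(cov) / sqrt(var_x * var_y)

print(cov, var_x, var_y, round(corr, 3))  # -11/81 44/81 68/81 -0.201
```

Working with `Fraction` keeps the intermediate values in the same form as the hand calculation, so each line can be compared directly with the steps above.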
Example:
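Let X and Y have joint density
$$f(x, y) = k\,(x + y)\, I(0 < x < 1,\ 0 < y < 1)$$
Find Corr(X,Y)
Here, by integrating the density function, it is easy to see that k = 1. So
$$f(x, y) = (x + y)\, I(0 < x < 1,\ 0 < y < 1)$$
From this, we find the marginal densities.
$$f_X(x) = \left(x + \tfrac{1}{2}\right) I(0 < x < 1), \qquad f_Y(y) = \left(y + \tfrac{1}{2}\right) I(0 < y < 1)$$
The two integrals below are equal by symmetry, so
$$E(XY) = \int_0^1\!\!\int_0^1 xy\,(x + y)\,dx\,dy = \int_0^1\!\!\int_0^1 x^2 y\,dx\,dy + \int_0^1\!\!\int_0^1 x y^2\,dx\,dy = 2\int_0^1 y\left(\int_0^1 x^2\,dx\right)dy = \frac{2}{3}\int_0^1 y\,dy = \frac{1}{3}$$
$$E(X) = \int_0^1 x\left(x + \tfrac{1}{2}\right)dx = \left[\frac{x^3}{3} + \frac{x^2}{4}\right]_0^1 = \frac{7}{12}$$
$$E(X^2) = \int_0^1 x^2\left(x + \tfrac{1}{2}\right)dx = \left[\frac{x^4}{4} + \frac{x^3}{6}\right]_0^1 = \frac{5}{12}$$
$$V(X) = \frac{5}{12} - \left(\frac{7}{12}\right)^2 = \frac{11}{144}$$
Similarly, $E(Y) = \frac{7}{12}$ and $V(Y) = \frac{11}{144}$.
$$\operatorname{Cov}(X, Y) = \frac{1}{3} - \left(\frac{7}{12}\right)\left(\frac{7}{12}\right) = -\frac{1}{144}$$
$$\operatorname{Corr}(X, Y) = \frac{\operatorname{Cov}(X, Y)}{\sqrt{V(X)\,V(Y)}} = \frac{-\frac{1}{144}}{\frac{11}{144}} = -\frac{1}{11}$$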
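Because the density is a polynomial on the unit square, every moment in this example can be computed exactly from the fact that $x^a y^b$ integrates to $1/((a+1)(b+1))$ over the square. The following is a minimal sketch of that check; the representation `density` and the helper `moment` are illustrative names, not part of the notes.

```python
from fractions import Fraction as F

# Density f(x, y) = x + y on the unit square,
# stored as {(power of x, power of y): coefficient}.
density = {(1, 0): F(1), (0, 1): F(1)}

def moment(p, q):
    """E[X^p Y^q], integrating each monomial term exactly over the unit square."""
    return sum(c / ((a + p + 1) * (b + q + 1)) for (a, b), c in density.items())

total = moment(0, 0)                 # equals 1, which confirms k = 1
ex, ey = moment(1, 0), moment(0, 1)
var_x = moment(2, 0) - ex ** 2
var_y = moment(0, 2) - ey ** 2
cov = moment(1, 1) - ex * ey
corr = cov / var_x                   # sqrt(V(X) V(Y)) = V(X) because V(X) = V(Y)

print(total, ex, var_x, cov, corr)   # 1 7/12 11/144 -1/144 -1/11
```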
PROPERTIES OF COVARIANCE AND CORRELATION
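1. $\operatorname{Cov}\left(\sum_{i=1}^{m} a_i X_i,\ \sum_{j=1}^{n} b_j Y_j\right) = \sum_{i=1}^{m}\sum_{j=1}^{n} a_i b_j \operatorname{Cov}(X_i, Y_j)$
2. $\operatorname{Cov}(X, X) = V(X)$
3. $\operatorname{Cov}(aX + b,\ cY + d) = ac\,\operatorname{Cov}(X, Y)$
4. $V(aX + bY) = a^2 V(X) + b^2 V(Y) + 2ab\,\operatorname{Cov}(X, Y)$
5. $V(X + Y) = V(X) + V(Y) + 2\operatorname{Cov}(X, Y)$
6. $SD(X + Y) = \sqrt{[SD(X)]^2 + [SD(Y)]^2 + 2\operatorname{Cov}(X, Y)} = \sqrt{V(X) + V(Y) + 2\operatorname{Cov}(X, Y)}$
7. $-1 \le \operatorname{Corr}(X, Y) \le 1$
8. $\operatorname{Corr}(aX + b,\ cY + d) = \operatorname{sgn}(ac)\,\operatorname{Corr}(X, Y)$ where $\operatorname{sgn}(x) = I[x > 0] - I[x < 0]$
   (In other words, $\operatorname{sgn}(x) = \begin{cases} 1 & \text{if } x > 0 \\ 0 & \text{if } x = 0 \\ -1 & \text{if } x < 0 \end{cases}$)
9. If X and Y are independent,
   a. $\operatorname{Cov}(X, Y) = 0$
   b. $\operatorname{Corr}(X, Y) = 0$
   c. $V(X + Y) = V(X) + V(Y)$
   d. $SD(X + Y) = \sqrt{[SD(X)]^2 + [SD(Y)]^2} = \sqrt{V(X) + V(Y)}$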
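Properties 3, 4 and 8 hold for any joint distribution, including the empirical distribution of a data set, so they can be spot-checked numerically. The sketch below does this with made-up numbers; the data, the constants and the helper names `cov` and `corr` are assumptions chosen for illustration only.

```python
from math import sqrt, isclose

def cov(u, v):
    """Population covariance of two equal-length lists (divide by n)."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / n

def corr(u, v):
    """Correlation; cov(u, u) is the variance of u."""
    return cov(u, v) / sqrt(cov(u, u) * cov(v, v))

# Arbitrary illustrative data and constants (not from the notes).
x = [2.0, -2.0, 0.0, 1.0, 3.5]
y = [4.0, 5.0, 6.0, 9.0, 1.0]
a, b, c, d = 2.0, 3.0, -4.0, 2.0

ax_b = [a * xi + b for xi in x]
cy_d = [c * yi + d for yi in y]
lin = [a * xi + c * yi for xi, yi in zip(x, y)]

# Property 3: shifts drop out and scale factors multiply.
assert isclose(cov(ax_b, cy_d), a * c * cov(x, y))
# Property 4: variance of a linear combination.
assert isclose(cov(lin, lin),
               a**2 * cov(x, x) + c**2 * cov(y, y) + 2 * a * c * cov(x, y))
# Property 8: ac < 0 here, so the correlation changes sign.
assert isclose(corr(ax_b, cy_d), -corr(x, y))
```

A passing run is not a proof, but it is a quick way to catch a sign or factor error when applying the rules by hand.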
Example 1:
Let X and Y be independent with SD(X) = 4, SD(Y) = 5. Find
(i) SD(X + Y)
(ii) Cov(X + Y, X – Y)
(iii) Corr(X + Y, X – Y)
Solution:
(i) V(X + Y) = V(X) + V(Y) = 16 + 25 = 41, so $SD(X + Y) = \sqrt{41}$
(ii) $\operatorname{Cov}(X + Y,\ X - Y) = \operatorname{Cov}(X, X) + \operatorname{Cov}(Y, X) - \operatorname{Cov}(X, Y) - \operatorname{Cov}(Y, Y) = V(X) - V(Y) = -9$
(iii) V(X – Y) = V(X) + V(Y) = 16 + 25 = 41, so $SD(X - Y) = \sqrt{41}$ and
$$\operatorname{Corr}(X + Y,\ X - Y) = \frac{\operatorname{Cov}(X + Y,\ X - Y)}{SD(X + Y)\,SD(X - Y)} = \frac{-9}{\sqrt{41}\,\sqrt{41}} = -\frac{9}{41}$$
Example 2:
Let X and Y be two random variables with V(X) = 4, V(Y) = 9, Cov(X, Y) = -2. Find
(i) Cov(2X + 3, -4Y + 2)
(ii) Cov(2X – 1, 4 – X)
(iii) V(2X – Y)
(iv) Corr(X,Y)
(v) Corr(3X + 1, -2Y + 2)
Solution:
(i) $\operatorname{Cov}(2X + 3,\ -4Y + 2) = (2)(-4)\operatorname{Cov}(X, Y) = (2)(-4)(-2) = 16$
(ii) $\operatorname{Cov}(2X - 1,\ 4 - X) = (2)(-1)\operatorname{Cov}(X, X) = -2V(X) = -8$
(iii) $V(2X - Y) = 4V(X) + V(Y) - 4\operatorname{Cov}(X, Y) = 16 + 9 + 8 = 33$ (the covariance is negative, so subtracting $4\operatorname{Cov}(X, Y)$ adds 8)
(iv) $\operatorname{Corr}(X, Y) = \dfrac{\operatorname{Cov}(X, Y)}{SD(X)\,SD(Y)} = \dfrac{-2}{\sqrt{4}\,\sqrt{9}} = -\dfrac{1}{3}$
(v) $\operatorname{Corr}(3X + 1,\ -2Y + 2) = \operatorname{sgn}(-6)\,\operatorname{Corr}(X, Y) = (-1)\left(-\tfrac{1}{3}\right) = \tfrac{1}{3}$
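Parts (i) to (iii) are all instances of Property 1: once the variances and the covariance are arranged in a matrix, the covariance of any two linear combinations is a double sum over that matrix. The sketch below uses a hypothetical helper `cov_linear` to show this. Note that part (iii) evaluates to 16 + 9 + 8 = 33, because Cov(X, Y) is negative.

```python
def cov_linear(a, b, sigma):
    """Cov(sum_i a[i]*Z[i], sum_j b[j]*Z[j]) = sum_i sum_j a[i]*b[j]*sigma[i][j]."""
    return sum(a[i] * b[j] * sigma[i][j]
               for i in range(len(a)) for j in range(len(b)))

# Covariance matrix of (X, Y): V(X) = 4, V(Y) = 9, Cov(X, Y) = -2.
sigma = [[4, -2],
         [-2, 9]]

# Additive constants do not affect covariance, so only the coefficients appear.
cov_i = cov_linear([2, 0], [0, -4], sigma)     # Cov(2X + 3, -4Y + 2)
cov_ii = cov_linear([2, 0], [-1, 0], sigma)    # Cov(2X - 1, 4 - X)
var_iii = cov_linear([2, -1], [2, -1], sigma)  # V(2X - Y) = Cov(2X - Y, 2X - Y)

print(cov_i, cov_ii, var_iii)  # 16 -8 33
```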
Computation of covariance and correlation for sample data.
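Let $(x_1, y_1), (x_2, y_2), (x_3, y_3), \ldots, (x_n, y_n)$ be a set of paired sample data. The sample covariance between the two variables is given by
$$s_{xy} = \frac{1}{n - 1}\left(\sum_{i=1}^{n} x_i y_i - n\,\bar{x}\,\bar{y}\right)$$
and the correlation is given by
$$r_{xy} = \frac{s_{xy}}{s_x s_y}$$
where $s_x$ and $s_y$ are the sample standard deviations of the two variables.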
Note: Division by n – 1 in the above formula is because it is for a sample. If we are
dealing with a population, the division factor is n.
For the computation of correlation it does not matter whether you divide by n or n – 1
provided the same factor is used for sxy, sx and sy. You may divide by anything you
want, or none at all, so long as you are consistent.
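Thus,
$$r_{xy} = \frac{\sum_{i=1}^{n} x_i y_i - n\,\bar{x}\,\bar{y}}{\sqrt{\left(\sum_{i=1}^{n} x_i^2 - n\,\bar{x}^2\right)\left(\sum_{i=1}^{n} y_i^2 - n\,\bar{y}^2\right)}}$$
Example:
$$\begin{array}{c|cccc}
x & 2 & -2 & 0 & 1 \\ \hline
y & 4 & 5 & 6 & 9
\end{array}$$
$$\bar{x} = \tfrac{1}{4} = .25; \qquad \bar{y} = 6$$
$$s_{xy} = \tfrac{1}{3}\left(\sum_{i=1}^{4} x_i y_i - 4\left(\tfrac{1}{4}\right)(6)\right) = \tfrac{1}{3}(7 - 6) = \tfrac{1}{3}$$
$$s_x^2 = \tfrac{1}{3}\left(\sum_{i=1}^{4} x_i^2 - 4\bar{x}^2\right) = \tfrac{1}{3}\left(9 - \tfrac{1}{4}\right) = \tfrac{35}{12}; \qquad s_y^2 = \tfrac{1}{3}\left(\sum_{i=1}^{4} y_i^2 - 4\bar{y}^2\right) = \tfrac{1}{3}(158 - 144) = \tfrac{14}{3}$$
$$r_{xy} = \frac{s_{xy}}{s_x s_y} = \frac{\frac{1}{3}}{\sqrt{\frac{35}{12}\cdot\frac{14}{3}}} = \frac{2}{7\sqrt{10}} = 0.09035$$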
Note: You can calculate these directly using your bivariate calculator.
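For readers without a bivariate calculator, the same numbers can be reproduced in a few lines of Python. This is a sketch using only the standard library; the function names and the `ddof` argument (1 to divide by n – 1, 0 to divide by n) are choices made here for illustration.

```python
from math import sqrt

x = [2, -2, 0, 1]
y = [4, 5, 6, 9]

def sample_cov(u, v, ddof=1):
    """(sum of u_i * v_i  -  n * mean(u) * mean(v)) / (n - ddof)."""
    n = len(u)
    ubar, vbar = sum(u) / n, sum(v) / n
    return (sum(a * b for a, b in zip(u, v)) - n * ubar * vbar) / (n - ddof)

def corr(u, v, ddof=1):
    """Correlation, using the same divisor for s_xy, s_x and s_y."""
    return sample_cov(u, v, ddof) / sqrt(sample_cov(u, u, ddof) * sample_cov(v, v, ddof))

s_xy = sample_cov(x, y)
r_sample = corr(x, y, ddof=1)   # divide by n - 1 throughout
r_pop = corr(x, y, ddof=0)      # divide by n throughout

print(round(s_xy, 4), round(r_sample, 5), round(r_pop, 5))  # 0.3333 0.09035 0.09035
```

The last two printed values agree, which illustrates the earlier remark that the divisor cancels in the correlation as long as it is used consistently.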
Bivariate Normal Distribution
This is the bivariate extension of the normal distribution. In its most general form its density function is given by
$$f(x, y) = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1 - \rho^2}}\exp\left\{-\frac{1}{2(1 - \rho^2)}\left[\left(\frac{x - \mu_1}{\sigma_1}\right)^2 - 2\rho\left(\frac{x - \mu_1}{\sigma_1}\right)\left(\frac{y - \mu_2}{\sigma_2}\right) + \left(\frac{y - \mu_2}{\sigma_2}\right)^2\right]\right\}$$
where $\sigma_1 > 0$, $\sigma_2 > 0$, and $-1 < \rho < 1$.
When the means are both zero and the standard deviations are both 1, the joint density becomes
$$f(x, y) = \frac{1}{2\pi\sqrt{1 - \rho^2}}\exp\left\{-\frac{x^2 - 2\rho xy + y^2}{2(1 - \rho^2)}\right\}$$
and when, in addition, the correlation is zero, it becomes
$$f(x, y) = \frac{1}{2\pi}\,e^{-\frac{1}{2}\left(x^2 + y^2\right)}$$
If X and Y have the bivariate normal density function given above, then
$$E(X) = \mu_1, \quad E(Y) = \mu_2, \quad SD(X) = \sigma_1, \quad SD(Y) = \sigma_2$$
and the correlation between X and Y is ρ.
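The general bivariate normal density translates directly into code. The function below is a minimal sketch; its name and argument names are chosen here for illustration, and it evaluates the density at a single point, with the standardized case as the default.

```python
from math import exp, pi, sqrt

def bivariate_normal_pdf(x, y, mu1=0.0, mu2=0.0, sigma1=1.0, sigma2=1.0, rho=0.0):
    """Bivariate normal density; needs sigma1 > 0, sigma2 > 0 and -1 < rho < 1."""
    if sigma1 <= 0 or sigma2 <= 0 or not -1 < rho < 1:
        raise ValueError("need sigma1 > 0, sigma2 > 0 and -1 < rho < 1")
    zx = (x - mu1) / sigma1          # standardized x
    zy = (y - mu2) / sigma2          # standardized y
    q = (zx * zx - 2 * rho * zx * zy + zy * zy) / (1 - rho * rho)
    return exp(-q / 2) / (2 * pi * sigma1 * sigma2 * sqrt(1 - rho * rho))

# Zero means, unit standard deviations, rho = 0: the value at the origin is 1 / (2*pi).
peak = bivariate_normal_pdf(0.0, 0.0)
print(round(peak, 4))  # 0.1592
```

The restriction on ρ is checked explicitly because the formula divides by $1 - \rho^2$, which is zero when ρ = ±1.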
Theorem:
If X and Y have a bivariate normal distribution with respective means μ1 and μ2, respective standard deviations σ1 and σ2, and correlation ρ, then the conditional distribution of Y given X = x is $N(\mu_{Y|x}, \sigma_{Y|x})$ (the second parameter being the standard deviation), where
$$\mu_{Y|x} = \mu_2 + \rho\,\frac{\sigma_2}{\sigma_1}\,(x - \mu_1)$$
$$\sigma_{Y|x}^2 = \sigma_2^2\left(1 - \rho^2\right)$$
Similarly, the conditional distribution of X given Y = y is $N(\mu_{X|y}, \sigma_{X|y})$ where
$$\mu_{X|y} = \mu_1 + \rho\,\frac{\sigma_1}{\sigma_2}\,(y - \mu_2)$$
$$\sigma_{X|y}^2 = \sigma_1^2\left(1 - \rho^2\right)$$
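Example 1: Exercise 6.47 (Page 224)
$$\mu_1 = 2, \quad \mu_2 = 5, \quad \sigma_1 = 3, \quad \sigma_2 = 6, \quad \rho = \tfrac{2}{3}$$
$$\mu_{Y|x} = \mu_2 + \rho\,\frac{\sigma_2}{\sigma_1}\,(x - \mu_1) = 5 + \tfrac{2}{3}(2)(x - 2) = \tfrac{4}{3}x + \tfrac{7}{3}$$
$$\sigma_{Y|x} = \sqrt{\sigma_2^2\left(1 - \rho^2\right)} = 6\sqrt{1 - \tfrac{4}{9}} = 2\sqrt{5}$$
$$\mu_{Y|1} = \tfrac{4}{3} + \tfrac{7}{3} = \tfrac{11}{3}; \qquad \sigma_{Y|x} = 2\sqrt{5}$$
So the conditional distribution of Y given X = 1 is $N\left(\tfrac{11}{3},\ 2\sqrt{5}\right)$.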
Example 2: Exercise 6.83
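(a) $\mu_{Y|17} = \mu_2 + \rho\,\dfrac{\sigma_2}{\sigma_1}\,(17 - \mu_1) = 15 + (.75)\left(\tfrac{2}{3}\right)(17 - 18) = 14.5$
(b) $\mu_{X|20} = \mu_1 + \rho\,\dfrac{\sigma_1}{\sigma_2}\,(20 - \mu_2) = 18 + (.75)\left(\tfrac{3}{2}\right)(20 - 15) = 23.625$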
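The conditional mean and standard deviation from the theorem are easy to wrap in a small function. The sketch below uses a hypothetical helper name and reproduces the numbers of Exercise 6.47, where μ1 = 2, μ2 = 5, σ1 = 3, σ2 = 6 and ρ = 2/3.

```python
from math import sqrt

def conditional_y_given_x(x, mu1, mu2, sigma1, sigma2, rho):
    """Return (mean, sd) of Y given X = x for a bivariate normal pair."""
    mean = mu2 + rho * (sigma2 / sigma1) * (x - mu1)
    sd = sigma2 * sqrt(1 - rho ** 2)
    return mean, sd

# Exercise 6.47 with x = 1: the hand calculation gives mean 11/3 and sd 2*sqrt(5).
mean, sd = conditional_y_given_x(1, mu1=2, mu2=5, sigma1=3, sigma2=6, rho=2 / 3)
print(round(mean, 4), round(sd, 4))  # 3.6667 4.4721
```

The conditional distribution of X given Y = y follows from the same function by swapping the roles of the two variables, that is, by passing (y, μ2, μ1, σ2, σ1, ρ).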

Example 3:
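X and Y are jointly normally distributed. X has mean 2 and variance 4. Given X = x, Y has mean $\mu_{Y|x} = 2 + \frac{x}{4}$ and variance 0.75.
(a) Find the mean of Y.
(b) Given that the correlation between X and Y is 0.5, find the variance of Y.
Solution:
(a) From the theorem,
$$\mu_{Y|x} = \mu_2 + \rho\,\frac{\sigma_2}{\sigma_1}\,(x - \mu_1) \ \Rightarrow\ \mu_{Y|2} = \mu_2 + \rho\,\frac{\sigma_2}{\sigma_1}\,(2 - 2) = \mu_2$$
while from the given conditional mean,
$$\mu_{Y|x} = 2 + \frac{x}{4} \ \Rightarrow\ \mu_{Y|2} = 2 + \tfrac{1}{2} = 2.5$$
So $\mu_2 = 2.5$
(b) $\sigma_{Y|x}^2 = \sigma_2^2\left(1 - \rho^2\right) = \sigma_2^2\left(\tfrac{3}{4}\right)$; but $\sigma_{Y|x}^2 = \tfrac{3}{4}$, so $\sigma_2^2 = 1$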
SAMPLING DISTRIBUTIONS
Let $X_1, X_2, \ldots, X_n$ be independent identically distributed (i.i.d.) with mean μ and standard deviation σ. Define the Sample Mean and the Sample Variance as follows:
$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$$
$$S^2 = \frac{1}{n - 1}\left(\sum_{i=1}^{n} X_i^2 - n\bar{X}^2\right)$$
Please note that these are random variables. The meaning of sample mean and sample
variance that is familiar to you, the numerical summaries that you get out of a set of
data, is nothing other than a realization of these random variables.
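Theorem
$$E(\bar{X}) = \mu, \qquad V(\bar{X}) = \frac{\sigma^2}{n}, \qquad E(S^2) = \sigma^2$$
Proof:
$$E(\bar{X}) = E\left(\frac{X_1 + X_2 + \cdots + X_n}{n}\right) = \frac{1}{n}\left[E(X_1) + E(X_2) + \cdots + E(X_n)\right] = \frac{1}{n}(n\mu) = \mu$$
Since the $X_i$ are independent, the variance of their sum is the sum of their variances, so
$$V(\bar{X}) = V\left(\frac{X_1 + X_2 + \cdots + X_n}{n}\right) = \frac{1}{n^2}\left[V(X_1) + V(X_2) + \cdots + V(X_n)\right] = \frac{n\sigma^2}{n^2} = \frac{\sigma^2}{n}$$
For the third part observe that since $\sigma^2 = E(X^2) - \mu^2$, it follows that $E(X^2) = \sigma^2 + \mu^2$. Similarly, since $\frac{\sigma^2}{n} = V(\bar{X}) = E(\bar{X}^2) - \left[E(\bar{X})\right]^2 = E(\bar{X}^2) - \mu^2$, it follows that $E(\bar{X}^2) = \frac{\sigma^2}{n} + \mu^2$.
Now,
$$E(S^2) = \frac{1}{n - 1}\left[\sum_{i=1}^{n} E(X_i^2) - nE(\bar{X}^2)\right] = \frac{1}{n - 1}\left[\sum_{i=1}^{n}\left(\sigma^2 + \mu^2\right) - n\left(\frac{\sigma^2}{n} + \mu^2\right)\right] = \frac{1}{n - 1}\left[n\sigma^2 - \sigma^2\right] = \sigma^2$$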
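The theorem on the mean and variance of the sample mean, and on the expected sample variance, can be verified exactly for a small discrete population by listing every possible sample together with its probability. The sketch below does this; the three-point distribution and the sample size n = 3 are arbitrary assumptions chosen for illustration, not values from the notes.

```python
from fractions import Fraction as F
from itertools import product

# A small illustrative population distribution (an assumption, not from the notes).
values = [0, 1, 4]
probs = [F(1, 2), F(1, 3), F(1, 6)]
n = 3  # sample size

mu = sum(p * v for v, p in zip(values, probs))
sigma2 = sum(p * v * v for v, p in zip(values, probs)) - mu ** 2

e_mean = e_mean_sq = e_s2 = F(0)
for sample in product(range(len(values)), repeat=n):
    p = F(1)
    xs = []
    for idx in sample:
        p *= probs[idx]          # independence: probabilities multiply
        xs.append(values[idx])
    xbar = F(sum(xs), n)
    s2 = (sum(x * x for x in xs) - n * xbar ** 2) / (n - 1)
    e_mean += p * xbar
    e_mean_sq += p * xbar ** 2
    e_s2 += p * s2

var_mean = e_mean_sq - e_mean ** 2
print(e_mean == mu, var_mean == sigma2 / n, e_s2 == sigma2)  # True True True
```

Here each realized `xbar` and `s2` is one outcome of the random variables $\bar{X}$ and $S^2$, and the weighted sums are their expectations.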
CENTRAL LIMIT THEOREM
The above results tell us only the mean and variance of the sample mean; they tell us nothing about its distribution. To calculate probabilities related to $\bar{X}$ we need its distribution, which of course depends on how $X_1, X_2, \ldots, X_n$ are distributed. But if n is large, the distribution of $\bar{X}$ is approximately normal, no matter what the original distribution is.
Theorem (Central Limit Theorem)
Let $X_1, X_2, \ldots, X_n$ be independent identically distributed with mean μ and standard deviation σ. Let $\bar{X}_n$ denote the mean of these n i.i.d. random variables. Then, as $n \to \infty$,
$$\frac{\bar{X}_n - \mu}{\sigma / \sqrt{n}} \longrightarrow N(0, 1) \quad \text{in distribution}$$
Note: If the original distribution is normal, the sample mean has a normal distribution even if n is small.
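A small simulation gives a feel for the theorem. The sketch below draws exponential samples with mean 5 (so σ = 5 as well), standardizes each sample mean, and records how often the result falls below zero. For a standard normal that probability is 0.5, while for n = 1 the exact value is 1 − e⁻¹ ≈ 0.632. The seed, the number of repetitions and the sample sizes are arbitrary choices made here.

```python
import random
from math import sqrt

def standardized_mean(n, rng, mean=5.0):
    """Return (X-bar - mu) / (sigma / sqrt(n)) for n exponential draws (sigma = mu = mean)."""
    xbar = sum(rng.expovariate(1.0 / mean) for _ in range(n)) / n
    return (xbar - mean) / (mean / sqrt(n))

rng = random.Random(0)   # fixed seed so that reruns give the same numbers
reps = 5000
below_zero = {}
for n in (1, 5, 50):
    z = [standardized_mean(n, rng) for _ in range(reps)]
    below_zero[n] = sum(zi <= 0 for zi in z) / reps
    print(n, round(below_zero[n], 3))   # the proportion should drift toward 0.5 as n grows
```

The skewness of the exponential distribution is what keeps the proportion above 0.5 for small n; it fades as n increases.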
Example 1:
The content of 500 ml bottles of Coca-Cola is normally distributed with mean 503 ml
and standard deviation 5 ml.
(i) Find the probability that a randomly chosen bottle has less than 500 ml cola in it.
(ii) Find the probability that a randomly chosen 6-pack of bottles has an average of less than 500 ml.
Solution:
(i) $P(X < 500) = P\left(Z < \dfrac{500 - 503}{5}\right) = P(Z < -.6) = .2743$
(ii) $P(\bar{X} < 500) = P\left(Z < \dfrac{500 - 503}{5 / \sqrt{6}}\right) = P(Z < -1.47) = .0708$
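The two table look-ups can be reproduced with the standard normal CDF, which Python's `math.erf` makes available without any extra packages. In the sketch below, the helper name `phi` is introduced here for illustration.

```python
from math import erf, sqrt

def phi(z):
    """Standard normal CDF, written in terms of the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

mu, sigma = 503.0, 5.0

p_single = phi((500 - mu) / sigma)                 # one bottle: z = -0.6
p_six_pack = phi((500 - mu) / (sigma / sqrt(6)))   # mean of six bottles: z is about -1.47

print(round(p_single, 4), round(p_six_pack, 4))    # 0.2743 0.0708
```

Averaging over six bottles shrinks the standard deviation from 5 to 5/√6, which is why the same 3 ml shortfall is much less likely for the pack average than for a single bottle.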


Example 2:
The waiting time for a particular service for a person is exponentially distributed with mean 5 minutes. Find the probability that the total service time for a hundred people is less than 8 hours.
Solution:
For the exponential distribution the mean is the same as the standard deviation, so σ = 5.
Applying the CLT we know that the sample mean is approximately normal. Eight hours is 480 minutes, so a total under 8 hours means an average under 480/100 = 4.8 minutes per person.
$$P(\text{Total time} < 8 \text{ hours}) = P(\text{Average time} < 4.8 \text{ minutes}) = P(\bar{X} < 4.8) \approx P\left(Z < \frac{4.8 - 5}{5 / \sqrt{100}}\right) = P(Z < -.4) = .3446$$
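As a rough check on how good the normal approximation is in this example, the CLT value can be compared with an exact calculation. The comparison is an addition to the notes: it relies on the standard fact that a sum of n i.i.d. exponential times has an Erlang distribution, whose CDF is a finite Poisson-type sum. The helper and variable names are illustrative.

```python
from math import erf, exp, sqrt

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

n, mean, limit = 100, 5.0, 480.0   # people, minutes per person, 8 hours in minutes

# CLT approximation: the sample mean is roughly normal with sd mean / sqrt(n).
p_clt = phi((limit / n - mean) / (mean / sqrt(n)))

# Exact value from the Erlang CDF:
# P(T <= t) = 1 - sum_{k=0}^{n-1} exp(-t/mean) * (t/mean)**k / k!
lam_t = limit / mean
term = exp(-lam_t)                 # k = 0 term
tail = 0.0
for k in range(n):
    tail += term
    term *= lam_t / (k + 1)        # turn the k-th term into the (k+1)-th
p_exact = 1.0 - tail

print(round(p_clt, 4), round(p_exact, 4))
```

The two numbers are close but not identical, which is expected: with n = 100 the sum of skewed exponential times is only approximately normal.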