Approximations to Probability Distributions: Limit Theorems
Sequences of Random Variables
• Interested in the behavior of functions of random variables such as means, variances, and proportions
• For large samples, exact distributions can be difficult or impossible to obtain
• Limit Theorems can be used to obtain properties of estimators as the sample size tends to infinity
– Convergence in Probability – limit of an estimator
– Convergence in Distribution – limit of a CDF
– Central Limit Theorem – large-sample distribution of the sample mean of a random sample
Convergence in Probability
• The sequence of random variables $X_1,\ldots,X_n$ is said to converge in probability to the constant $c$ if, for every $\epsilon > 0$,

$$\lim_{n\to\infty} P\left(|X_n - c| < \epsilon\right) = 1$$
• Weak Law of Large Numbers (WLLN): Let $X_1,\ldots,X_n$ be iid random variables with $E(X_i)=\mu$ and $V(X_i)=\sigma^2 < \infty$. Then the sample mean converges in probability to $\mu$:

$$\lim_{n\to\infty} P\left(\left|\bar{X}_n - \mu\right| \ge \epsilon\right) = 0 \quad\text{or}\quad \lim_{n\to\infty} P\left(\left|\bar{X}_n - \mu\right| < \epsilon\right) = 1$$

where $\bar{X}_n = \dfrac{\sum_{i=1}^n X_i}{n}$ and $E\left(\bar{X}_n\right) = \mu_{\bar{X}} = \mu$.
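As an illustration (an addition to this transcript, not part of the original slides), here is a minimal Python sketch of the WLLN: for growing n, the simulated probability that the sample mean falls within ε of μ approaches 1. The Exponential(1) distribution, the sample sizes, and ε are arbitrary choices for the demo.

```python
import numpy as np

rng = np.random.default_rng(0)
mu, eps = 1.0, 0.1           # true mean of Exponential(1) and tolerance
reps = 5000                  # Monte Carlo replications per sample size

for n in [10, 100, 1000, 10000]:
    # draw `reps` samples of size n and compute their sample means
    xbar = rng.exponential(scale=1.0, size=(reps, n)).mean(axis=1)
    # estimate P(|Xbar_n - mu| < eps); this should tend to 1 as n grows
    prob = np.mean(np.abs(xbar - mu) < eps)
    print(f"n={n:6d}  P(|Xbar - mu| < {eps}) ~ {prob:.3f}")
```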
Proof of WLLN

$$E\left(\bar{X}_n\right) = \mu_{\bar{X}} = \mu \qquad V\left(\bar{X}_n\right) = \frac{\sigma^2}{n} \qquad \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$

Chebyshev's Inequality: $P\left(\mu_{\bar{X}} - k\sigma_{\bar{X}} \le \bar{X} \le \mu_{\bar{X}} + k\sigma_{\bar{X}}\right) \ge 1 - \dfrac{1}{k^2}$ $(k \ge 1)$

$$\Rightarrow P\left(\left|\bar{X} - \mu_{\bar{X}}\right| < k\sigma_{\bar{X}}\right) \ge 1 - \frac{1}{k^2} \quad\Rightarrow\quad P\left(\left|\bar{X} - \mu_{\bar{X}}\right| \ge k\sigma_{\bar{X}}\right) \le \frac{1}{k^2}$$

Let $\epsilon = k\sigma_{\bar{X}} = \dfrac{k\sigma}{\sqrt{n}}$, so that $k = \dfrac{\sqrt{n}\,\epsilon}{\sigma}$, $k^2 = \dfrac{n\epsilon^2}{\sigma^2}$, and $\dfrac{1}{k^2} = \dfrac{\sigma^2}{n\epsilon^2}$.

$$\Rightarrow P\left(\left|\bar{X}_n - \mu_{\bar{X}}\right| \ge \epsilon\right) \le \frac{\sigma^2}{n\epsilon^2}$$

$$\Rightarrow \lim_{n\to\infty} P\left(\left|\bar{X}_n - \mu_{\bar{X}}\right| \ge \epsilon\right) \le \lim_{n\to\infty} \frac{\sigma^2}{n\epsilon^2} = 0 \quad \forall\, \epsilon > 0 \qquad\Rightarrow\qquad \bar{X}_n \xrightarrow{\text{Prob}} \mu$$
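A quick numerical sanity check (added here, not from the slides): for fixed ε, the simulated tail probability P(|X̄ₙ − μ| ≥ ε) stays below the Chebyshev bound σ²/(nε²), and both shrink toward 0 as n grows. The Uniform(0,1) distribution and the constants are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(1)
mu, sigma2 = 0.5, 1 / 12     # mean and variance of Uniform(0, 1)
eps, reps = 0.05, 20000

for n in [50, 200, 800, 3200]:
    xbar = rng.uniform(size=(reps, n)).mean(axis=1)
    tail = np.mean(np.abs(xbar - mu) >= eps)   # simulated P(|Xbar - mu| >= eps)
    bound = sigma2 / (n * eps**2)              # Chebyshev bound sigma^2 / (n eps^2)
    print(f"n={n:5d}  tail ~ {tail:.4f}  bound = {bound:.4f}")
```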
Other Cases/Rules
• Binomial Sample Proportions (a simulation sketch follows this list):

$$X \sim \text{Binomial}(n, p) \qquad X_i = \begin{cases} 1 & \text{if Trial } i \text{ is a Success} \\ 0 & \text{if Trial } i \text{ is a Failure} \end{cases}$$

$$E(X_i) = p \qquad V(X_i) = p(1-p)$$

$$X = \sum_{i=1}^n X_i \;\Rightarrow\; E(X) = np, \quad V(X) = np(1-p)$$

$$\text{Let } \hat{p} = \frac{X}{n} = \frac{\sum_{i=1}^n X_i}{n} \;\Rightarrow\; E\left(\hat{p}\right) = p, \quad V\left(\hat{p}\right) = \frac{p(1-p)}{n} \qquad\Rightarrow\qquad \hat{p} \xrightarrow{\text{Prob}} p$$
• Useful Generalizations: Suppose $X_n \xrightarrow{\text{Prob}} \mu_X$ and $Y_n \xrightarrow{\text{Prob}} \mu_Y$. Then:

1) $X_n + Y_n \xrightarrow{\text{Prob}} \mu_X + \mu_Y$

2) $X_n Y_n \xrightarrow{\text{Prob}} \mu_X \mu_Y$

3) $X_n / Y_n \xrightarrow{\text{Prob}} \mu_X / \mu_Y$ (provided $\mu_Y \neq 0$)

4) $\sqrt{X_n} \xrightarrow{\text{Prob}} \sqrt{\mu_X}$ (provided $P(X_n \ge 0) = 1$)
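A minimal sketch (an addition, not from the slides) of the sample proportion converging in probability to p; the value p = 0.3 and the sample sizes are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2)
p = 0.3                                # true success probability (arbitrary choice)

for n in [10, 100, 1000, 100000]:
    x = rng.binomial(1, p, size=n)     # n Bernoulli(p) trials
    p_hat = x.mean()                   # sample proportion X / n
    print(f"n={n:6d}  p_hat = {p_hat:.4f}  |p_hat - p| = {abs(p_hat - p):.4f}")
```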
Convergence in Distribution
• Let $Y_n$ be a random variable with CDF $F_n(y)$.
• Let $Y$ be a random variable with CDF $F(y)$.
• If the limit as $n \to \infty$ of $F_n(y)$ equals $F(y)$ for every point $y$ where $F(y)$ is continuous, then we say that $Y_n$ converges in distribution to $Y$.
• $F(y)$ is called the limiting distribution function of $Y_n$.
• If $M_n(t) = E\left(e^{tY_n}\right)$ converges to $M(t) = E\left(e^{tY}\right)$ for all $t$ in a neighborhood of 0, then $Y_n$ converges in distribution to $Y$.
Example – Binomial → Poisson
• $X_n \sim \text{Binomial}(n, p)$. Let $\lambda = np$, so $p = \lambda/n$.
• $M_n(t) = \left(pe^t + (1-p)\right)^n = \left(1 + p(e^t - 1)\right)^n = \left(1 + \lambda(e^t - 1)/n\right)^n$
• Aside: $\lim_{n\to\infty} \left(1 + a/n\right)^n = e^a$
• $\Rightarrow \lim_{n\to\infty} M_n(t) = \lim_{n\to\infty} \left(1 + \lambda(e^t - 1)/n\right)^n = \exp\left(\lambda(e^t - 1)\right)$
• $\exp\left(\lambda(e^t - 1)\right) \equiv$ MGF of Poisson($\lambda$)
• $\Rightarrow X_n$ converges in distribution to Poisson($\lambda = np$)
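An illustrative sketch (not in the original) comparing the Binomial(n, λ/n) and Poisson(λ) probability mass functions with scipy; λ = 3 and the n values are arbitrary choices.

```python
import numpy as np
from scipy.stats import binom, poisson

lam = 3.0
k = np.arange(11)            # points at which to compare the PMFs

for n in [10, 100, 1000]:
    p = lam / n              # keep n*p = lambda fixed as n grows
    max_diff = np.max(np.abs(binom.pmf(k, n, p) - poisson.pmf(k, lam)))
    print(f"n={n:5d}  max |Binomial PMF - Poisson PMF| = {max_diff:.5f}")
```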
Example – Scaled Poisson → N(0,1)

$$X \sim \text{Poisson}(\lambda) \;\Rightarrow\; E(X) = \lambda = V(X), \qquad M_X(t) = e^{\lambda\left(e^t - 1\right)}$$

$$Y = \frac{X - E(X)}{\sqrt{V(X)}} = \frac{X - \lambda}{\sqrt{\lambda}} = aX + b \qquad a = \frac{1}{\sqrt{\lambda}}, \quad b = -\sqrt{\lambda}$$

$$M_{aX+b}(t) = e^{bt} M_X(at) \;\Rightarrow\; M_Y(t) = e^{-t\sqrt{\lambda}}\, e^{\lambda\left(e^{t/\sqrt{\lambda}} - 1\right)}$$

Aside: $e^x = \sum_{i=0}^{\infty} \dfrac{x^i}{i!} \;\Rightarrow\; e^{t/\sqrt{\lambda}} = 1 + \dfrac{t}{\sqrt{\lambda}} + \dfrac{t^2/\lambda}{2!} + \dfrac{t^3/\lambda^{3/2}}{3!} + \cdots$

$$\Rightarrow M_Y(t) = \exp\left\{ -t\sqrt{\lambda} + \lambda\left( \frac{t}{\sqrt{\lambda}} + \frac{t^2/\lambda}{2!} + \frac{t^3/\lambda^{3/2}}{3!} + \cdots \right) \right\} = \exp\left\{ -t\sqrt{\lambda} + t\sqrt{\lambda} + \frac{t^2}{2!} + \frac{t^3/\lambda^{1/2}}{3!} + \cdots \right\} = \exp\left\{ \frac{t^2}{2!} + \frac{t^3/\lambda^{1/2}}{3!} + \cdots \right\}$$

Now taking the limit as $\lambda \to \infty$:

$$\lim_{\lambda\to\infty} M_Y(t) = \lim_{\lambda\to\infty} \exp\left\{ \frac{t^2}{2!} + \frac{t^3/\lambda^{1/2}}{3!} + \cdots \right\} = e^{t^2/2} \equiv \text{MGF}\left(N(0,1)\right)$$
[Figure: Poisson/Normal CDF comparison for Y = (X − λ)/√λ with λ = 25; curves: Poisson CDF and standard normal ("Z") CDF; vertical axis F(y) from 0 to 1, horizontal axis y from −6 to 8.]
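A sketch (added) that reproduces the comparison in the figure above: the CDF of the standardized Poisson(λ = 25) variable versus the standard normal CDF, assuming scipy is available.

```python
import numpy as np
from scipy.stats import poisson, norm

lam = 25
y = np.linspace(-4, 4, 9)                  # grid of standardized values
x = lam + y * np.sqrt(lam)                 # back-transform: X = lambda + y*sqrt(lambda)
pois_cdf = poisson.cdf(np.floor(x), lam)   # P(X <= x) = P((X - lam)/sqrt(lam) <= y)
z_cdf = norm.cdf(y)

for yi, pc, zc in zip(y, pois_cdf, z_cdf):
    print(f"y={yi:5.1f}  Poisson CDF={pc:.4f}  Z CDF={zc:.4f}")
```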
Central Limit Theorem
• Let $X_1, X_2, \ldots, X_n$ be a sequence of independently and identically distributed random variables with finite mean $\mu$ and finite variance $\sigma^2$. Then:

$$\frac{\sqrt{n}\left(\bar{X} - \mu\right)}{\sigma} \xrightarrow{\text{Dist}} N(0,1) \qquad \text{where } \bar{X} = \frac{\sum_{i=1}^n X_i}{n}$$

• Thus the limiting distribution of the standardized sample mean is a standard normal distribution, regardless of the distribution of the individual measurements.
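A simulation sketch (an addition to the transcript): standardized means of heavily skewed Exponential(1) samples are compared against N(0,1) quantiles; the sample size, replication count, and quantile levels are arbitrary.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(3)
n, reps = 200, 20000
mu, sigma = 1.0, 1.0                              # mean and sd of Exponential(1)

x = rng.exponential(scale=1.0, size=(reps, n))
z = np.sqrt(n) * (x.mean(axis=1) - mu) / sigma    # sqrt(n)(Xbar - mu)/sigma

for q in [0.05, 0.25, 0.5, 0.75, 0.95]:
    print(f"q={q:.2f}  simulated={np.quantile(z, q):6.3f}  N(0,1)={norm.ppf(q):6.3f}")
```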
Proof of Central Limit Theorem (I)
• Additional Assumptions for this Proof:
• The moment-generating function of $X$, $M_X(t)$, exists in a neighborhood of 0 (for all $|t| < h$, $h > 0$).
• The third derivative of the MGF is bounded in a neighborhood of 0 ($M^{(3)}(t) \le B < \infty$ for all $|t| < h$, $h > 0$).
• Elements of Proof:
• Work with $Y_i = (X_i - \mu)/\sigma$
• Use Taylor's Theorem (Lagrange form)
• Calculus result: $\lim_{n\to\infty}\left[1 + (a_n/n)\right]^n = e^a$ if $\lim_{n\to\infty} a_n = a$
Proof of CLT (II)
Define: $Y_i = \dfrac{X_i - \mu}{\sigma} \;\Rightarrow\; E(Y_i) = 0, \quad V(Y_i) = 1$ (independent)

$$M_{Y_i}(t) = E\left(e^{t(X_i - \mu)/\sigma}\right) = e^{-t\mu/\sigma}\, E\left(e^{tX_i/\sigma}\right) = e^{-t\mu/\sigma}\, M_{X_i}\!\left(\frac{t}{\sigma}\right)$$

$$\bar{Y} = \frac{1}{n}\sum_{i=1}^n \frac{X_i - \mu}{\sigma} = \frac{\bar{X} - \mu}{\sigma} \;\Rightarrow\; \frac{\sqrt{n}\left(\bar{X} - \mu\right)}{\sigma} = \sqrt{n}\,\bar{Y} = \frac{1}{\sqrt{n}}\sum_{i=1}^n Y_i$$

$$\Rightarrow M_{\frac{\sqrt{n}(\bar{X} - \mu)}{\sigma}}(t) = M_{\frac{1}{\sqrt{n}}\sum_{i=1}^n Y_i}(t) \equiv M_n(t) \quad \text{This is our "target".}$$
Proof of CLT (III)

$$M_n(t) = E\left(e^{(t/\sqrt{n})\sum_{i=1}^n Y_i}\right) = E\left(e^{(t/\sqrt{n})Y_1} \cdots e^{(t/\sqrt{n})Y_n}\right) = M_{Y_1}\!\left(\frac{t}{\sqrt{n}}\right) \cdots M_{Y_n}\!\left(\frac{t}{\sqrt{n}}\right) = \left[M_Y\!\left(\frac{t}{\sqrt{n}}\right)\right]^n$$

Aside: Taylor's Theorem (Lagrange form):

$$f(x) = f(a) + f'(a)(x - a) + \cdots + \frac{f^{(k)}(a)}{k!}(x - a)^k + \frac{f^{(k+1)}(t_x)}{(k+1)!}(x - a)^{k+1}$$

where $r_k(x) = \dfrac{f^{(k+1)}(t_x)}{(k+1)!}(x - a)^{k+1}$, with $t_x$ strictly between $a$ and $x$.
Proof of CLT (IV)
( k 1)
f
(t x )
f ( k ) (a)
f ( x)  f (a )  f ' (a )( x  a )   
( x  a) k 
( x  a ) k 1
k!
(k  1)!
t x  min( a,x) , max( a,x) 
Current Applicatio n :
f ()  M Y ()
t
x
n
a0
 
 t 
t x   0,

n

k 2
f (a )  M Y (0)  E e 0Y  E (1)  1
f ' (a )  M Y ' (0)  E (Y )  0
 
f ( 2 ) (a )  M Y( 2 ) (0)  E Y 2  V (Y )  E (Y )  1  0  1
f (3) (t x )  M Y( 3) (t x )  Bn  B  
2
(Previous assumption )
2
3
Bnt 3
t2
 t 
 t
 1 t
 Bn  t

 MY 
 0  
 0  
 0  1
 3/ 2
  1  0
3!  n
2n 6n
 n
 n
 2!  n


Proof of CLT (V)
$$\lim_{n\to\infty} M_n(t) = \lim_{n\to\infty}\left[M_Y\!\left(\frac{t}{\sqrt{n}}\right)\right]^n = \lim_{n\to\infty}\left[1 + \frac{t^2}{2n} + \frac{B_n t^3}{6 n^{3/2}}\right]^n = \lim_{n\to\infty}\left[1 + \frac{1}{n}\left(\frac{t^2}{2} + \frac{B_n t^3}{6 n^{1/2}}\right)\right]^n = \lim_{n\to\infty}\left[1 + \frac{a_n}{n}\right]^n$$

where $a_n = \dfrac{t^2}{2} + \dfrac{B_n t^3}{6 n^{1/2}}$.

$$\lim_{n\to\infty} a_n = \lim_{n\to\infty}\left(\frac{t^2}{2} + \frac{B_n t^3}{6 n^{1/2}}\right) = \frac{t^2}{2} \equiv a \qquad \text{(since } B_n \le B < \infty\text{)}$$

$$\Rightarrow \lim_{n\to\infty} M_n(t) = e^a = e^{t^2/2} \equiv \text{MGF}\left(N(0,1)\right) \qquad\Rightarrow\qquad \frac{\sqrt{n}\left(\bar{X} - \mu\right)}{\sigma} \xrightarrow{\text{Dist}} N(0,1)$$
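As a numerical companion to the proof (an addition to the slides), the sketch below computes the "target" M_n(t) = [M_Y(t/√n)]^n exactly for standardized Bernoulli(p) variables and compares it with e^{t²/2}; p = 0.3 and t = 1 are arbitrary choices.

```python
import numpy as np

p = 0.3
s = np.sqrt(p * (1 - p))                 # standard deviation of Bernoulli(p)

def m_y(t):
    # MGF of the standardized Bernoulli variable Y = (X - p)/s
    return p * np.exp(t * (1 - p) / s) + (1 - p) * np.exp(-t * p / s)

t = 1.0
for n in [10, 100, 10000, 1000000]:
    m_n = m_y(t / np.sqrt(n)) ** n       # M_n(t) = [M_Y(t/sqrt(n))]^n
    print(f"n={n:8d}  M_n({t}) = {m_n:.5f}   e^(t^2/2) = {np.exp(t**2 / 2):.5f}")
```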