February 6, 2017
CENTRAL LIMIT THEOREM, DELTA METHOD, AND
ASYMPTOTIC EFFICIENCY OF MLE
1. Introduction
Let us recall our regularity conditions on a family of pdfs {fθ }θ .
(A) The pdfs are distinct; that is, if θ ≠ θ′, then fθ ≠ fθ′.
(B) The pdfs have common support.
(C) The true value θ0 is an interior point of Θ; that is, there is an open
interval containing θ0 that is a subset of Θ.
(D) The pdf f(x; θ) is differentiable as a function of θ.
(E) The pdf f(x; θ) is twice differentiable as a function of θ.
(F) The integral ∫ f(x; θ) dx can be differentiated twice under the
integral sign as a function of θ.
Theorem 1. Let X = (X1 , . . . , Xn ) be a random sample from fθ0 ,
where θ0 is unknown, and all the regularity conditions are satisfied. If
0 < I(θ0 ) < ∞, then any consistent sequence Tn of solutions to the mle
equations satisfies
√n(Tn − θ0) →D N(0, 1/I(θ0)).
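For concreteness, here is a minimal simulation sketch of Theorem 1 (the Exponential model, sample sizes, and variable names are illustrative choices, not part of the notes): for an Exponential with rate θ0, the mle is 1/X̄ and I(θ) = 1/θ², so the empirical variance of √n(Tn − θ0) should be close to θ0².

```python
# Illustrative sketch: asymptotic normality of the mle for an assumed
# Exponential(rate = theta0) model, where the mle is 1 / sample mean
# and the Fisher information is I(theta) = 1 / theta**2.
import numpy as np

rng = np.random.default_rng(0)
theta0, n, reps = 2.0, 2000, 5000

# Draw `reps` independent samples of size n and compute the mle for each.
samples = rng.exponential(scale=1.0 / theta0, size=(reps, n))
mle = 1.0 / samples.mean(axis=1)

z = np.sqrt(n) * (mle - theta0)             # approx N(0, 1/I(theta0))
print("empirical variance:", z.var())       # close to theta0**2 = 4.0
print("theoretical 1/I(theta0):", theta0**2)
```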
The proof of Theorem 1 will make use of Taylor’s theorem.
Theorem 2. Let f be k times differentiable at the point a. Then there
exists a function r such that r(x) → 0 as x → a and
f(x) = f(a) + f′(a)(x − a) + · · · + (f^(k)(a)/k!)(x − a)^k + r(x)(x − a)^k.
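A quick numerical check of Theorem 2 (illustrative only; f = exp, a = 0, and k = 2 are chosen for simplicity): the remainder r(x) = (e^x − 1 − x − x²/2)/x² should tend to 0 as x → 0.

```python
# Numerical check of Taylor's theorem for f = exp at a = 0 with k = 2:
# r(x) = (f(x) - [1 + x + x**2/2]) / x**2 should vanish as x -> 0.
import numpy as np

for x in [1e-1, 1e-2, 1e-3, 1e-4]:
    r = (np.exp(x) - 1 - x - x**2 / 2) / x**2
    print(f"x = {x:.0e}   r(x) = {r:.2e}")   # shrinks roughly like x/6
```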
To warm up to the proof of Theorem 1, we review the central limit
theorem and a result (which is not needed for the proof) that is sometimes called the Delta method.
Theorem 3 (Central limit theorem). Let X1 , X2 , . . . be i.i.d. random
variables with mean 0 and unit variance. If Sn := X1 + · · · + Xn, then
Sn/√n →D N(0, 1).
Sketch proof. The proof relies on three facts: characteristic functions
can be used to characterize convergence in distribution, the sum of
two independent random variables has a characteristic function given
by the product of their characteristic functions, and the characteristic
function of a standard normal is given by the function t ↦ e^(−t²/2).
Let φ(t) = E e^(itX1) be the characteristic function of X1. Let ψ be the
characteristic function of Sn/√n. Since the Xi's are i.i.d. we have that
ψ(t) = [φ(t/√n)]^n.
A Taylor expansion of φ at t = 0 gives
φ(t) ≈ 1 − t²/2,
for t small, since the mean is zero and the variance is one. Thus for n large
ψ(t) ≈ [1 − t²/(2n)]^n → e^(−t²/2),
as desired.
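As a sanity check, here is a simulation sketch of Theorem 3 (the standardized uniform draws and the values of n and reps are arbitrary illustrative choices): the normalized sum should look standard normal.

```python
# Simulation sketch of the CLT: S_n / sqrt(n) for mean-0, variance-1 draws.
import numpy as np

rng = np.random.default_rng(1)
n, reps = 1000, 10000

# Uniform(-sqrt(3), sqrt(3)) has mean 0 and variance 1.
x = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(reps, n))
s = x.sum(axis=1) / np.sqrt(n)

print("mean:", s.mean(), "var:", s.var())        # approx 0 and 1
print("P(S_n/sqrt(n) <= 1):", (s <= 1).mean())   # approx Phi(1) = 0.8413
```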
Exercise 4. State and prove the central limit theorem in the case where
the mean is µ ∈ R and the variance is σ² > 0.
Theorem 5 (Delta method). Let Tn be a sequence of random variables
such that for some θ0 ∈ R and σ > 0, we have
√n(Tn − θ0) →D N(0, σ²).
If g is differentiable and g′(θ0) ≠ 0, then
√n(g(Tn) − g(θ0)) →D N(0, [g′(θ0)σ]²).
Proof. First, observe that Tn → θ0 in probability. Next, using Theorem
2, write
g(Tn) = g(θ0) + g′(θ0)(Tn − θ0) + r(Tn)(Tn − θ0).
Rearranging gives
√n(g(Tn) − g(θ0)) = g′(θ0) · √n(Tn − θ0) + r(Tn) · √n(Tn − θ0).
By assumption we have that
g′(θ0) · √n(Tn − θ0) →D N(0, [g′(θ0)σ]²).
By Slutsky's theorem, it thus suffices to show that the second term
goes to 0 in probability. This too follows from Slutsky's theorem: since
Tn → θ0 in probability, the definition of r gives r(Tn) → 0 in probability,
while √n(Tn − θ0) converges in distribution, so their product converges
to 0 in probability.
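Here is a simulation sketch of Theorem 5 (the choices g(x) = x², normal draws, and all parameter values are illustrative, not from the notes): with Tn the mean of n i.i.d. N(θ0, σ²) draws, the limiting variance should be [g′(θ0)σ]² = (2θ0σ)².

```python
# Simulation sketch of the Delta method with g(x) = x**2 and T_n a sample mean.
import numpy as np

rng = np.random.default_rng(2)
theta0, sigma, n, reps = 1.5, 2.0, 5000, 5000

t = rng.normal(theta0, sigma, size=(reps, n)).mean(axis=1)
z = np.sqrt(n) * (t**2 - theta0**2)        # sqrt(n) * (g(T_n) - g(theta0))

print("empirical variance:", z.var())
print("theoretical [g'(theta0)*sigma]**2:", (2 * theta0 * sigma) ** 2)
```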
Recall that if Z ∼ N(0, 1), then the random variable given by χ² = Z²
has a chi-squared distribution with 1 degree of freedom.
Exercise 6 (Second order Delta method). Let Tn be a sequence of
random variables such that for some θ0 ∈ R, we have
√n(Tn − θ0) →D N(0, 1).
If g is twice differentiable, g′(θ0) = 0, and g″(θ0) ≠ 0, then
n(g(Tn) − g(θ0)) →D (g″(θ0)/2) χ².
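A simulation sketch of Exercise 6 (illustrative choices: g(x) = x² at θ0 = 0, so g′(0) = 0 and g″(0) = 2, making the limit exactly a chi-squared with 1 degree of freedom, which has mean 1 and variance 2):

```python
# Simulation sketch of the second-order Delta method: g(x) = x**2, theta0 = 0.
import numpy as np

rng = np.random.default_rng(3)
n, reps = 2000, 20000

t = rng.normal(0.0, 1.0, size=(reps, n)).mean(axis=1)   # sqrt(n)*T_n ~ N(0, 1)
w = n * t**2                                            # n * (g(T_n) - g(0))

print("mean:", w.mean(), "var:", w.var())   # approx 1 and 2 for chi-squared(1)
```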
2. The proof
Proof of Theorem 1. By Theorem 2 applied to (∂/∂θ)ℓ(θ; X) = ℓ′(θ) at the
point θ0, we write
ℓ′(Tn) = ℓ′(θ0) + ℓ″(θ0)[Tn − θ0] + r(Tn)[Tn − θ0].
By assumption, ℓ′(Tn) = 0. Rearranging and multiplying both sides by 1/√n, we
obtain
(−1/√n)ℓ′(θ0) = (1/n)ℓ″(θ0) · √n[Tn − θ0] + (1/√n)r(Tn)[Tn − θ0].
With the following exercises and Slutsky's theorem, we are done: since
(1/n)ℓ″(θ0) → −I(θ0) almost surely and (−1/√n)ℓ′(θ0) →D N(0, I(θ0)),
solving for √n[Tn − θ0] yields the limit N(0, I(θ0)/I(θ0)²) = N(0, 1/I(θ0)).
Exercise 7. Show that
(1/√n)ℓ′(θ0) →D N(0, I(θ0)).
Exercise 8. Show that almost surely we have
(1/n)ℓ″(θ0) → −I(θ0).
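A numerical sketch of Exercises 7 and 8 (the Poisson model is an illustrative choice of ours: for Poisson(θ), ℓ′(θ) = Σxi/θ − n, ℓ″(θ) = −Σxi/θ², and I(θ) = 1/θ):

```python
# Numerical sketch of Exercises 7-8 for an assumed Poisson(theta0) model:
# the scaled score (1/sqrt(n)) * l'(theta0) should have variance near I(theta0),
# and (1/n) * l''(theta0) should be near -I(theta0) = -1/theta0.
import numpy as np

rng = np.random.default_rng(4)
theta0, n, reps = 3.0, 2000, 5000

x = rng.poisson(theta0, size=(reps, n))
score = x.sum(axis=1) / theta0 - n          # l'(theta0) for each sample
hess = -x.sum(axis=1) / theta0**2           # l''(theta0) for each sample

print("var of score/sqrt(n):", (score / np.sqrt(n)).var())   # approx 1/3
print("mean of hess/n:", (hess / n).mean())                  # approx -1/3
```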
Exercise 9 (Functions of mle estimators). Show using Theorem 1 that
if g is differentiable with g′(θ0) ≠ 0, then
√n(g(Tn) − g(θ0)) →D N(0, g′(θ0)²/I(θ0)).