Maximum Likelihood Estimator
All of Statistics (Chapter 9)
Outline
• MLE
• Properties of MLE
– Consistency
– Asymptotic normality
– Efficiency
– Invariance
Definition of MLE
• Joint density function: $f(x_1, \dots, x_n; \theta) = \prod_{i=1}^n f(x_i; \theta)$
• Likelihood function: $L_n(\theta) = \prod_{i=1}^n f(X_i; \theta)$
• Log-likelihood function: $\ell_n(\theta) = \log L_n(\theta) = \sum_{i=1}^n \log f(X_i; \theta)$
• MLE: $\hat\theta_{MLE}$ is the value of $\theta$ that maximizes $L_n(\theta)$, or equivalently $\ell_n(\theta)$ (a numeric sketch follows below)
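To make the definition concrete, here is a minimal numeric sketch, assuming i.i.d. draws from an Exponential($\theta$) model with density $f(x; \theta) = \theta e^{-\theta x}$; the true rate 2.0, the sample size, and the use of scipy.optimize.minimize_scalar are illustrative choices, not part of the original slides.

```python
# A minimal sketch of computing an MLE numerically (assumed example:
# i.i.d. draws from an Exponential(rate) distribution).
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(0)
x = rng.exponential(scale=1 / 2.0, size=1000)  # true rate theta_0 = 2 (illustrative)

def neg_log_likelihood(theta):
    # l_n(theta) = sum_i log f(X_i; theta) with f(x; theta) = theta * exp(-theta x)
    return -(len(x) * np.log(theta) - theta * x.sum())

res = minimize_scalar(neg_log_likelihood, bounds=(1e-6, 100), method="bounded")
print(res.x, 1 / x.mean())  # numeric MLE vs. the closed form theta_hat = 1 / x_bar
```

For this model the MLE has a closed form, $\hat\theta = 1/\bar{x}$, which the numeric optimizer should reproduce.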
Properties of MLE
The MLE has the following nice properties:
• Consistency: $P_{\theta_0}\big(|\hat\theta_{MLE} - \theta_0| > \epsilon\big) \to 0$ as $n \to \infty$
• Asymptotic normality: $\sqrt{n}\,(\hat\theta_{MLE} - \theta_0) \rightsquigarrow N\big(0, 1/I(\theta_0)\big)$
• Asymptotic optimality: the MLE has the smallest asymptotic variance
• Invariance property
Consistency: $P_{\theta_0}\big(|\hat\theta - \theta_0| > \epsilon\big) \to 0$ as $n \to \infty$
Scaled log-likelihood function: $M_n(\theta) = \frac{1}{n}\,\ell_n(\theta) = \frac{1}{n}\sum_{i=1}^n \log f(X_i; \theta)$
Expectation: $M(\theta) = E_{\theta_0}\big[\log f(X; \theta)\big]$
Outline of Proof
1. $\hat\theta$ is the maximizer of $M_n(\theta)$
2. $\theta_0$ is the maximizer of $M(\theta)$
3. For every $\theta$, $M_n(\theta) \to M(\theta)$ in probability by the Law of Large Numbers
4. Based on 1, 2, 3, $P_{\theta_0}\big(|\hat\theta - \theta_0| > \epsilon\big) \to 0$ as $n \to \infty$
2: $\theta_0$ is the maximizer of $M(\theta)$
For any $\theta$, we have $M(\theta) \le M(\theta_0)$.
Proof: Since $\log$ is concave, Jensen's inequality gives
$M(\theta) - M(\theta_0) = E_{\theta_0}\Big[\log \frac{f(X;\theta)}{f(X;\theta_0)}\Big] \le \log E_{\theta_0}\Big[\frac{f(X;\theta)}{f(X;\theta_0)}\Big] = \log \int f(x;\theta)\,dx = \log 1 = 0$.
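As a quick numeric check of this step, here is a sketch, assuming a Bernoulli($p_0$) model (an illustrative choice), which locates the maximizer of $M(p) = p_0 \log p + (1-p_0)\log(1-p)$ on a grid:

```python
# Sketch checking that M(theta) = E_{theta_0}[log f(X; theta)] peaks at theta_0
# (assumed Bernoulli(p0) example, where M(p) = p0 log p + (1 - p0) log(1 - p)).
import numpy as np

p0 = 0.3
p = np.linspace(0.01, 0.99, 99)
M = p0 * np.log(p) + (1 - p0) * np.log(1 - p)
print(p[np.argmax(M)])  # ~0.3: the maximizer of M is the true parameter
```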
3: $M_n(\theta) \to M(\theta)$ in probability
• Law of Large Numbers: the sample average converges to the expectation in probability (provable by Chebyshev's inequality)
• Sample average: $M_n(\theta) = \frac{1}{n}\sum_{i=1}^n \log f(X_i; \theta)$
• Expectation: $M(\theta) = E_{\theta_0}\big[\log f(X; \theta)\big]$
Recap:
1. $\hat\theta$ is the maximizer of $M_n(\theta)$
2. $\theta_0$ is the maximizer of $M(\theta)$
3. For every $\theta$, $M_n(\theta) \to M(\theta)$ in probability by the LLN
Target: Based on 1, 2, 3, the maximizer of $M_n$ converges to the maximizer of $M$, i.e., $P_{\theta_0}\big(|\hat\theta - \theta_0| > \epsilon\big) \to 0$.
Consistency:
The distributions of the estimators become more and more
concentrated near the true value of the parameter being estimated.
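A simulation sketch of this concentration, assuming a Bernoulli($p_0$) model where the MLE is the sample mean (the parameter value, tolerance, and replication counts are illustrative):

```python
# Sketch illustrating consistency (assumed example: Bernoulli(p0), where
# the MLE is the sample mean). As n grows, the MLE concentrates at p0.
import numpy as np

rng = np.random.default_rng(1)
p0 = 0.3
for n in (10, 100, 1_000, 10_000):
    p_hat = np.array([rng.binomial(n, p0) / n for _ in range(2_000)])
    print(n, np.mean(np.abs(p_hat - p0) > 0.05))  # P(|p_hat - p0| > eps) -> 0
```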
MLE is Asymptotically Normal
Fisher Information
• Notation: the score function is $s(X; \theta) = \frac{\partial}{\partial \theta} \log f(X; \theta)$
• Fisher information is defined as $I(\theta) = \mathrm{Var}_\theta\big(s(X; \theta)\big) = E_\theta\big[s(X; \theta)^2\big]$
Fisher information measures how quickly the pdf changes with $\theta$: larger Fisher information $\Rightarrow$ the pdf changes quickly at $\theta_0$ $\Rightarrow$ the distribution at $\theta_0$ can be well distinguished from distributions with other parameters $\Rightarrow$ it is easier to estimate $\theta_0$ from the data.
Why is $I(\theta) = -E_\theta\big[\frac{\partial^2}{\partial \theta^2} \log f(X; \theta)\big]$?
Since $f$ is a pdf: $\int f(x; \theta)\,dx = 1$.
Taking the derivative: $\int \frac{\partial}{\partial \theta} f(x; \theta)\,dx = 0$.
Equivalently, $\int \frac{\partial \log f(x;\theta)}{\partial \theta}\, f(x; \theta)\,dx = 0$. (4)
Writing (4) as an expectation: $E_\theta\big[s(X; \theta)\big] = 0$.
Differentiating (4): $\int \frac{\partial^2 \log f}{\partial \theta^2}\, f\,dx + \int \Big(\frac{\partial \log f}{\partial \theta}\Big)^2 f\,dx = 0$.
Second term: $E_\theta\big[s(X; \theta)^2\big] = I(\theta)$. First term: $E_\theta\big[\frac{\partial^2}{\partial \theta^2}\log f(X; \theta)\big]$. Hence $I(\theta) = -E_\theta\big[\frac{\partial^2}{\partial \theta^2} \log f(X; \theta)\big]$.
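A numeric sketch of this identity, assuming a Bernoulli($p$) model (an illustrative choice), estimating both expectations by Monte Carlo and comparing them with the known value $I(p) = 1/(p(1-p))$:

```python
# Sketch checking the identity I(theta) = E[s^2] = -E[d^2/dtheta^2 log f]
# for an assumed Bernoulli(p) example, where I(p) = 1 / (p (1 - p)).
import numpy as np

p = 0.3
rng = np.random.default_rng(2)
x = rng.binomial(1, p, size=1_000_000).astype(float)

score = x / p - (1 - x) / (1 - p)        # d/dp log f(x; p)
hess = -x / p**2 - (1 - x) / (1 - p)**2  # d^2/dp^2 log f(x; p)
print(np.mean(score**2), -np.mean(hess), 1 / (p * (1 - p)))  # all ~ 4.76
```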
Theorem (Asymptotic normality of MLE): $\sqrt{n}\,(\hat\theta - \theta_0) \rightsquigarrow N\big(0, 1/I(\theta_0)\big)$.
Since the MLE $\hat\theta$ is the maximizer of $\ell_n(\theta)$, we have $\ell_n'(\hat\theta) = 0$.
By the Mean Value Theorem, $0 = \ell_n'(\hat\theta) = \ell_n'(\theta_0) + \ell_n''(\bar\theta)\,(\hat\theta - \theta_0)$ for some $\bar\theta$ between $\hat\theta$ and $\theta_0$, so
$\sqrt{n}\,(\hat\theta - \theta_0) = \dfrac{\ell_n'(\theta_0)/\sqrt{n}}{-\ell_n''(\bar\theta)/n}$.
First, consider the numerator: $\frac{1}{\sqrt{n}}\,\ell_n'(\theta_0) = \sqrt{n}\cdot\frac{1}{n}\sum_{i=1}^n s(X_i; \theta_0) \rightsquigarrow N\big(0, I(\theta_0)\big)$, convergence in distribution by the Central Limit Theorem, since $E[s(X;\theta_0)] = 0$ and $\mathrm{Var}\big[s(X;\theta_0)\big] = I(\theta_0)$. (1)
Next, consider the denominator: since $\bar\theta \to \theta_0$, $-\frac{1}{n}\,\ell_n''(\bar\theta) \to I(\theta_0)$, convergence in probability by the LLN. (2)
Theorem (Asymptotic normality of MLE)
Combining (1) and (2), we get $\sqrt{n}\,(\hat\theta - \theta_0) \rightsquigarrow N\big(0, 1/I(\theta_0)\big)$.
With this normal approximation, we can construct confidence bounds and hypothesis tests for the parameters.
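For example, here is a sketch of a 95% Wald confidence interval built from this normal approximation, assuming a Bernoulli($p$) model, so that $I(p) = 1/(p(1-p))$ and the standard error is $\sqrt{\hat p(1-\hat p)/n}$; the sample size and seed are illustrative:

```python
# Sketch of a Wald confidence interval from asymptotic normality (assumed
# Bernoulli(p) example): p_hat +/- z * sqrt(1 / (n I(p_hat))).
import numpy as np

rng = np.random.default_rng(3)
n, p0, z = 500, 0.3, 1.96              # z for a 95% interval
x = rng.binomial(1, p0, size=n)
p_hat = x.mean()                       # the MLE
se = np.sqrt(p_hat * (1 - p_hat) / n)  # sqrt(1 / (n I(p_hat)))
print(p_hat - z * se, p_hat + z * se)  # should cover p0 ~95% of the time
```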
Asymptotic optimality (efficiency)
• The Cramér–Rao bound expresses a lower bound on the variance of estimators.
• The variance of an unbiased estimator is bounded by $\mathrm{Var}(\hat\theta) \ge \dfrac{1}{n\,I(\theta)}$ (see the simulation sketch after this list).
• MLE: asymptotically, $\mathrm{Var}(\hat\theta_{MLE}) \approx \dfrac{1}{n\,I(\theta_0)}$, so as $n \to \infty$ the MLE is an unbiased estimator with the smallest variance.
• The MLE has the smallest asymptotic variance, and we say that the MLE is asymptotically efficient and asymptotically optimal.
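A simulation sketch comparing the MLE's variance with this bound, assuming an Exponential(rate $\theta$) model, for which $I(\theta) = 1/\theta^2$ and hence the bound is $\theta^2/n$ (the sample and replication sizes are illustrative):

```python
# Sketch comparing the empirical variance of the MLE with the Cramér–Rao
# bound 1 / (n I(theta)) (assumed Exponential(rate) example, I(theta) = 1/theta^2).
import numpy as np

rng = np.random.default_rng(4)
theta0, n = 2.0, 500
theta_hat = 1 / rng.exponential(1 / theta0, size=(10_000, n)).mean(axis=1)
print(theta_hat.var(), theta0**2 / n)  # empirical variance ~ the bound for large n
```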
Functional invariance
Let $X \sim f(x \mid \theta)$, $\theta \in \Theta$, and let $\eta = g(\theta)$ be an invertible mapping. Then $X \sim f^*(x \mid \eta)$, $\eta \in g(\Theta)$, where $\theta = g^{-1}(\eta)$.
Outline of Proof
$L^*_x(\eta) = f^*(x \mid \eta) = f\big(x \mid g^{-1}(\eta)\big) = L_x\big(g^{-1}(\eta)\big) = L_x(\theta)$
Thus, the maximum of $L^*_x(\eta)$ is attained at $\hat\eta = g(\hat\theta)$.
Discussion
• Questions??
Asymptotic optimality (efficiency)
The MLE has the smallest asymptotic variance, and we say that the MLE is asymptotically efficient and asymptotically optimal.
• Let $Y$ be a statistic with mean $E[Y] = \psi(\theta)$; then we have $\mathrm{Var}(Y) \ge \dfrac{[\psi'(\theta)]^2}{n\,I(\theta)}$.
• The statistic $Y$ is called an efficient estimator of $\psi(\theta)$ iff the variance of $Y$ attains the Rao–Cramér lower bound.
• When $Y$ is an unbiased estimator of $\theta$, the Rao–Cramér inequality becomes $\mathrm{Var}(Y) \ge \dfrac{1}{n\,I(\theta)}$.
As $n$ converges to infinity, the MLE is an unbiased estimator with the smallest variance.
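As a closing sketch, assuming a Bernoulli($p$) model (an illustrative choice): the sample mean is unbiased with variance $p(1-p)/n$, which exactly attains the Rao–Cramér bound $1/(nI(p))$, so it is an efficient estimator:

```python
# Sketch: for Bernoulli(p), the sample mean is an efficient (unbiased) estimator;
# its variance p(1-p)/n exactly attains the Rao–Cramér bound 1/(n I(p)).
import numpy as np

rng = np.random.default_rng(6)
p0, n = 0.3, 50
p_hat = rng.binomial(n, p0, size=200_000) / n
print(p_hat.var(), p0 * (1 - p0) / n)  # empirical variance vs. the bound
```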