Maximum Likelihood Estimator
All of Statistics (Chapter 9)

Outline
• MLE
• Properties of MLE
  – Consistency
  – Asymptotic normality
  – Efficiency
  – Invariance

Definition of MLE
• Joint density function: $f(x_1, \dots, x_n; \theta) = \prod_{i=1}^n f(x_i; \theta)$
• Likelihood function: $\mathcal{L}_n(\theta) = \prod_{i=1}^n f(X_i; \theta)$
• Log-likelihood function: $\ell_n(\theta) = \log \mathcal{L}_n(\theta) = \sum_{i=1}^n \log f(X_i; \theta)$
• The MLE $\hat{\theta}_{\mathrm{MLE}}$ is the value of $\theta$ that maximizes $\mathcal{L}_n(\theta)$; equivalently, it maximizes $\ell_n(\theta)$.

Properties of MLE
The MLE has the following nice properties:
• Consistency: $\hat{\theta}_{\mathrm{MLE}} \xrightarrow{P} \theta_0$, i.e. $P_{\theta_0}(|\hat{\theta}_{\mathrm{MLE}} - \theta_0| > \varepsilon) \to 0$ as $n \to \infty$ for every $\varepsilon > 0$
• Asymptotic normality: $\sqrt{n}\,(\hat{\theta}_{\mathrm{MLE}} - \theta_0) \rightsquigarrow N(0, 1/I(\theta_0))$
• Asymptotic optimality: the MLE has the smallest asymptotic variance
• Invariance property

Consistency: $\hat{\theta}_{\mathrm{MLE}} \xrightarrow{P} \theta_0$
• Scaled log-likelihood function: $M_n(\theta) = \frac{1}{n} \sum_{i=1}^n \log f(X_i; \theta)$
• Expectation: $M(\theta) = \mathbb{E}_{\theta_0}[\log f(X; \theta)]$

Outline of proof
1. $\hat{\theta}_{\mathrm{MLE}}$ is the maximizer of $M_n(\theta)$.
2. $\theta_0$ is the maximizer of $M(\theta)$.
3. $M_n(\theta) \to M(\theta)$ in probability, by the law of large numbers.
4. Based on 1, 2, 3: $\hat{\theta}_{\mathrm{MLE}} \to \theta_0$ in probability.

Step 2: $\theta_0$ is the maximizer of $M(\theta)$
For any $\theta$, we have
$$M(\theta_0) - M(\theta) = \mathbb{E}_{\theta_0}\!\left[\log \frac{f(X; \theta_0)}{f(X; \theta)}\right] \ge -\log \mathbb{E}_{\theta_0}\!\left[\frac{f(X; \theta)}{f(X; \theta_0)}\right] = -\log \int f(x; \theta)\, dx = 0,$$
where the inequality is Jensen's inequality applied to the convex function $-\log$. (The middle expression is the Kullback–Leibler divergence between $f(\cdot; \theta_0)$ and $f(\cdot; \theta)$, which is always nonnegative.)

Step 3: $M_n(\theta) \to M(\theta)$ in probability
• Law of large numbers: the sample average converges to the expectation in probability (provable by Chebyshev's inequality).
• Sample average: $M_n(\theta) = \frac{1}{n} \sum_{i=1}^n \log f(X_i; \theta)$
• Expectation: $M(\theta) = \mathbb{E}_{\theta_0}[\log f(X; \theta)]$

Consistency: the distributions of the estimators become more and more concentrated near the true value of the parameter being estimated.

MLE is Asymptotically Normal

Fisher Information
• Notation: the score function is $s(X; \theta) = \frac{\partial \log f(X; \theta)}{\partial \theta}$, and the Fisher information is defined as $I(\theta) = \mathbb{V}_\theta[s(X; \theta)]$.
• It measures how quickly the pdf changes with $\theta$. Larger Fisher information → the pdf changes quickly at $\theta_0$ → the distribution at $\theta_0$ can be well distinguished from distributions with other parameters → it is easier to estimate $\theta_0$ from the data.

Why $I(\theta) = \mathbb{E}_\theta[s(X; \theta)^2]$?
• Since $f$ is a pdf: $\int f(x; \theta)\, dx = 1$.
• Taking the derivative: $\int \frac{\partial f(x; \theta)}{\partial \theta}\, dx = 0$. Equivalently, $\int \frac{\partial \log f(x; \theta)}{\partial \theta}\, f(x; \theta)\, dx = 0$.  … (4)
• Writing (4) as an expectation: $\mathbb{E}_\theta[s(X; \theta)] = 0$, hence $I(\theta) = \mathbb{V}_\theta[s] = \mathbb{E}_\theta[s(X; \theta)^2]$.
• Differentiating (4): $\int \frac{\partial^2 \log f}{\partial \theta^2}\, f\, dx + \int \left(\frac{\partial \log f}{\partial \theta}\right)^2 f\, dx = 0$. The second term is $\mathbb{E}_\theta[s(X; \theta)^2] = I(\theta)$; the first term then gives the alternative formula $I(\theta) = -\mathbb{E}_\theta\!\left[\frac{\partial^2 \log f(X; \theta)}{\partial \theta^2}\right]$.

Theorem (Asymptotic normality of MLE). $\sqrt{n}\,(\hat{\theta} - \theta_0) \rightsquigarrow N(0, 1/I(\theta_0))$.

Since the MLE is the maximizer of $\ell_n(\theta)$, we have $\ell_n'(\hat{\theta}) = 0$. By the mean value theorem,
$$0 = \ell_n'(\hat{\theta}) = \ell_n'(\theta_0) + (\hat{\theta} - \theta_0)\, \ell_n''(\bar{\theta})$$
for some $\bar{\theta}$ between $\hat{\theta}$ and $\theta_0$, so
$$\sqrt{n}\,(\hat{\theta} - \theta_0) = \frac{\ell_n'(\theta_0)/\sqrt{n}}{-\ell_n''(\bar{\theta})/n}.$$

First, consider the numerator: since $\mathbb{E}_{\theta_0}[s(X; \theta_0)] = 0$ and $\mathbb{V}_{\theta_0}[s(X; \theta_0)] = I(\theta_0)$,
$$\frac{1}{\sqrt{n}}\, \ell_n'(\theta_0) = \frac{1}{\sqrt{n}} \sum_{i=1}^n s(X_i; \theta_0) \rightsquigarrow N(0, I(\theta_0)) \quad (1)$$
— convergence in distribution by the central limit theorem.

Next, consider the denominator: since $\bar{\theta} \xrightarrow{P} \theta_0$,
$$-\frac{1}{n}\, \ell_n''(\bar{\theta}) \xrightarrow{P} -\mathbb{E}_{\theta_0}\!\left[\frac{\partial^2 \log f(X; \theta_0)}{\partial \theta^2}\right] = I(\theta_0) \quad (2)$$
— convergence in probability by the LLN.

Combining (1) and (2) (by Slutsky's theorem), we get
$$\sqrt{n}\,(\hat{\theta} - \theta_0) \rightsquigarrow \frac{N(0, I(\theta_0))}{I(\theta_0)} = N\!\left(0, \frac{1}{I(\theta_0)}\right).$$
With the normal property, we can construct confidence bounds and hypothesis tests for the parameters.

Asymptotically optimal (efficient)
• The Cramér–Rao bound expresses a lower bound on the variance of estimators.
• The variance of an unbiased estimator is bounded by $\mathbb{V}(\hat{\theta}) \ge \frac{1}{n I(\theta)}$.
• MLE: asymptotically, the MLE is an unbiased estimator whose variance attains this bound.
• The MLE has the smallest asymptotic variance, and we say that the MLE is asymptotically efficient and asymptotically optimal.

Functional invariance
• If $X \sim f(x \mid \theta)$ and $\eta = g(\theta)$ for an invertible mapping $g$, then the MLE of $\eta$ is $\hat{\eta} = g(\hat{\theta})$.

Outline of proof: the induced likelihood is
$$L^*_x(\eta) = f(x \mid g^{-1}(\eta)) = L_x(g^{-1}(\eta)),$$
so $\sup_\eta L^*_x(\eta) = \sup_\theta L_x(\theta)$. Thus the maximum of $L^*_x(\eta)$ is attained at $\hat{\eta} = g(\hat{\theta})$.

Discussion
• Questions?

Asymptotically optimal (efficient), continued: the Rao–Cramér lower bound
• Let $Y$ be a statistic with mean $\mathbb{E}_\theta[Y] = k(\theta)$; then we have $\mathbb{V}(Y) \ge \frac{[k'(\theta)]^2}{n I(\theta)}$.
• The statistic $Y$ is called an efficient estimator of $\theta$ iff the variance of $Y$ attains the Rao–Cramér lower bound.
• When $Y$ is an unbiased estimator of $\theta$, the Rao–Cramér inequality becomes $\mathbb{V}(Y) \ge \frac{1}{n I(\theta)}$.
• As $n \to \infty$, the MLE is an unbiased estimator with the smallest variance.

The short code sketches below illustrate consistency, asymptotic normality, confidence bounds, and invariance numerically.
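Consistency sketch. A minimal Python simulation of the consistency property, assuming NumPy and SciPy are available; the model ($X_i \sim N(\theta_0, 1)$ with $\theta_0 = 2$), the sample sizes, and the optimizer bounds are illustrative assumptions, not choices made in the slides.

```python
# Minimal sketch: the MLE concentrates near theta0 as n grows.
# Model assumption (for the demo only): X_i ~ N(theta0, 1) with theta0 = 2.
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import norm

rng = np.random.default_rng(0)
theta0 = 2.0  # true parameter (assumed for the demo)

def mle(x):
    # Maximize l_n(theta) = sum_i log f(x_i; theta) by minimizing its negative.
    neg_loglik = lambda theta: -np.sum(norm.logpdf(x, loc=theta, scale=1.0))
    return minimize_scalar(neg_loglik, bounds=(-10.0, 10.0), method="bounded").x

for n in (10, 100, 1000, 10000):
    x = rng.normal(theta0, 1.0, size=n)
    print(n, mle(x))  # estimates approach theta0 as n grows
```

As $n$ grows the printed estimates cluster ever more tightly around $\theta_0$, matching the consistency statement above.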
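Asymptotic normality sketch. A simulation of $\sqrt{n}\,(\hat{\theta} - \theta_0) \rightsquigarrow N(0, 1/I(\theta_0))$, assuming NumPy. The Bernoulli model (where the MLE is the sample mean and $I(p) = \frac{1}{p(1-p)}$), $p_0$, $n$, and the replication count are illustrative assumptions.

```python
# Sketch: empirical variance of sqrt(n)*(p_hat - p0) should approach
# 1/I(p0) = p0*(1 - p0) for Bernoulli(p0), whose MLE is the sample mean.
import numpy as np

rng = np.random.default_rng(1)
p0, n, reps = 0.3, 500, 20000

x = rng.binomial(1, p0, size=(reps, n))
p_hat = x.mean(axis=1)            # MLE of p for each replication
z = np.sqrt(n) * (p_hat - p0)     # centered and scaled estimates

print("empirical variance:", z.var())
print("1 / I(p0)         :", p0 * (1 - p0))  # theoretical asymptotic variance
```

A histogram of `z` would likewise look approximately normal, which is the content of the theorem.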
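Confidence-bound sketch. The slides note that the normal property yields confidence bounds; below is one standard way to do that (a Wald interval, $\hat{\theta} \pm z_{\alpha/2}/\sqrt{n I(\hat{\theta})}$), again under an assumed Bernoulli model with illustrative $p_0$ and $n$.

```python
# Sketch: approximate 95% Wald interval from asymptotic normality of the MLE.
import numpy as np

rng = np.random.default_rng(2)
p0, n = 0.3, 500
x = rng.binomial(1, p0, size=n)

p_hat = x.mean()                       # MLE of p
se = np.sqrt(p_hat * (1 - p_hat) / n)  # plug-in 1 / sqrt(n * I(p_hat))
print("95% CI:", (p_hat - 1.96 * se, p_hat + 1.96 * se))
```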
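Invariance sketch. A check that maximizing the induced likelihood of $\eta = g(\theta)$ returns $g(\hat{\theta})$, assuming NumPy and SciPy. The model (normal with known mean 0 and unknown variance $\theta = \sigma^2$) and the mapping $g(\theta) = \sqrt{\theta}$ are illustrative assumptions.

```python
# Sketch: the MLE of eta = g(theta) equals g(theta_hat) (functional invariance).
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import norm

rng = np.random.default_rng(3)
x = rng.normal(0.0, 2.0, size=1000)  # known mean 0, unknown variance

# MLE of theta = sigma^2, by direct maximization of the log-likelihood.
neg_ll_var = lambda v: -np.sum(norm.logpdf(x, loc=0.0, scale=np.sqrt(v)))
var_hat = minimize_scalar(neg_ll_var, bounds=(1e-6, 50.0), method="bounded").x

# MLE of eta = g(theta) = sqrt(theta), by maximizing the induced likelihood L*(eta).
neg_ll_sd = lambda s: -np.sum(norm.logpdf(x, loc=0.0, scale=s))
sd_hat = minimize_scalar(neg_ll_sd, bounds=(1e-3, 10.0), method="bounded").x

print(np.sqrt(var_hat), sd_hat)  # agree up to optimizer tolerance: g(theta_hat) = eta_hat
```

The two printed values coincide (up to optimizer tolerance), illustrating that reparametrizing before or after maximizing makes no difference.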