LECTURE 6

Chapter 8.4: The method of moments

The kth moment of a random variable X (or the kth moment of the distribution of X) is defined as µ_k = E(X^k). If X_1, . . . , X_n are i.i.d. random variables each having the same distribution as X, the kth sample moment is defined to be

    µ̂_k = (1/n) ∑_{i=1}^n X_i^k.

µ̂_k is a natural estimate of µ_k.

The method of moments estimates parameters by finding expressions for them in terms of the lowest possible order moments and then substituting sample moments into these expressions. E.g., suppose we wish to estimate the parameter θ = (θ_1, θ_2). Suppose θ_1 and θ_2 can be expressed in terms of the first two moments as θ_1 = f_1(µ_1, µ_2) and θ_2 = f_2(µ_1, µ_2). Then the method of moments estimates of θ_1 and θ_2 are θ̂_1 = f_1(µ̂_1, µ̂_2) and θ̂_2 = f_2(µ̂_1, µ̂_2).

To summarize, the construction of a method of moments estimate involves 3 steps:

1. Calculate low order moments. Find expressions for the moments in terms of the parameters. Typically the number of low order moments required equals the number of unknown parameters.
2. Invert the expressions in Step 1. This gives new expressions for the unknown parameters in terms of the moments.
3. Substitute the sample moments in place of the (population) moments in the expressions of Step 2. This gives parameter estimates in terms of the sample moments.

Example A: Poisson distribution

Let X be a random variable having the Poisson distribution with parameter (or mean) λ, i.e., λ = E(X). We write this as X ∼ Poi(λ). If X_1, . . . , X_n is an i.i.d. sequence of random variables each distributed as Poi(λ), the 1st sample moment is

    µ̂_1 = (1/n) ∑_{i=1}^n X_i = X̄.

Thus the method of moments estimate of λ is λ̂ = µ̂_1 = X̄.

In order to gauge the accuracy of λ̂ as an estimate of λ, we shall derive the sampling distribution of λ̂ or an approximation to that distribution. Let the true value of λ be λ_0 and S = ∑_{i=1}^n X_i. Then λ̂ = S/n. From Chapter 4.5, recall that S ∼ Poi(nλ_0), since a sum of independent Poisson random variables has a Poisson distribution. Observing that λ̂ is a discrete random variable, the probability mass function (pmf) of λ̂ is

    P(λ̂ = v) = P(S = nv) = (nλ_0)^{nv} e^{−nλ_0} / (nv)!,

for v such that nv is a nonnegative integer.

Since S ∼ Poi(nλ_0) and λ̂ = S/n, we have

    E(λ̂) = (1/n) E(S) = λ_0,
    Var(λ̂) = (1/n²) Var(S) = λ_0/n.

Thus we conclude that λ̂ is an unbiased estimate and the standard error of λ̂ is

    σ_λ̂ = √(λ_0/n).

We note that σ_λ̂ is unknown but we can estimate it using its sample analogue:

    s_λ̂ = √(λ̂/n).

Example B: Normal distribution

Let X ∼ N(µ, σ²). Then the 1st and 2nd moments of X are µ_1 = E(X) = µ and µ_2 = E(X²) = µ² + σ². Inverting these equations, we obtain

    µ = µ_1,   σ² = µ_2 − µ_1².    (1)

If X_1, . . . , X_n is an i.i.d. sequence of N(µ, σ²) random variables, the 1st and 2nd sample moments are µ̂_1 = X̄ and µ̂_2 = (1/n) ∑_{i=1}^n X_i². Method of moments estimates µ̂, σ̂² of µ, σ² can be obtained by substituting µ̂_1, µ̂_2 for µ_1, µ_2, respectively, in (1). More precisely, we get

    µ̂ = X̄,
    σ̂² = (1/n) ∑_{i=1}^n X_i² − X̄² = (1/n) ∑_{i=1}^n (X_i − X̄)².

Note that σ̂² ≠ s², where s² is the sample variance given by

    s² = 1/(n−1) ∑_{i=1}^n (X_i − X̄)².

Since E(s²) = σ², we conclude that E(σ̂²) ≠ σ², which implies that σ̂² is a biased estimate of σ². From Chapter 6.3, X̄ ∼ N(µ, σ²/n), nσ̂²/σ² ∼ χ²_{n−1}, and X̄, σ̂² are independent.
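The estimates in Examples A and B are simple enough to compute directly. As a minimal numerical sketch (not part of the original notes), the following Python code simulates Poisson and normal samples with NumPy and evaluates λ̂ = X̄ with its estimated standard error s_λ̂ = √(λ̂/n), and µ̂ = X̄ with σ̂² = (1/n) ∑(X_i − X̄)². The true parameter values and the sample size are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500  # sample size (arbitrary choice for illustration)

# --- Example A: Poisson(lambda) ---
lam0 = 2.5                                   # true parameter (illustrative)
x = rng.poisson(lam0, size=n)
lam_hat = x.mean()                           # method of moments: lambda-hat = X-bar
se_lam_hat = np.sqrt(lam_hat / n)            # estimated standard error sqrt(lambda-hat / n)
print(f"lambda-hat = {lam_hat:.3f}, estimated SE = {se_lam_hat:.4f}")

# --- Example B: N(mu, sigma^2) ---
mu0, sigma0 = 1.0, 2.0                       # true parameters (illustrative)
y = rng.normal(mu0, sigma0, size=n)
mu_hat = y.mean()                            # mu-hat = X-bar
sigma2_hat = np.mean((y - y.mean()) ** 2)    # sigma^2-hat with divisor n (biased estimate)
print(f"mu-hat = {mu_hat:.3f}, sigma^2-hat = {sigma2_hat:.3f}")
```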
Example C: Gamma distribution

Let X be a random variable having a gamma distribution with parameters (α, λ). The pdf of X is

    g(t) = (λ^α / Γ(α)) t^{α−1} e^{−λt},   t ≥ 0.

We write this as X ∼ Γ(α, λ). See Chapter 2.2.2 for the properties of the gamma distribution. From Example B of Chapter 4.5, the first 2 moments of the gamma distribution are

    µ_1 = α/λ,              (2)
    µ_2 = α(α + 1)/λ².      (3)

To apply the method of moments, we need to invert these equations. From (2), we get α = λµ_1. Substituting this into (3), we have

    µ_2 = µ_1² + µ_1/λ,

or equivalently,

    λ = µ_1 / (µ_2 − µ_1²).

Finally we have

    α = λµ_1 = µ_1² / (µ_2 − µ_1²).

Let X_1, . . . , X_n be an i.i.d. sequence of Γ(α, λ) random variables with 1st and 2nd sample moments given by µ̂_1 = X̄ and µ̂_2 = (1/n) ∑_{i=1}^n X_i². We write

    σ̂² = µ̂_2 − µ̂_1² = (1/n) ∑_{i=1}^n (X_i − X̄)².

Then the method of moments estimates for α and λ are

    λ̂ = X̄/σ̂²,   α̂ = X̄²/σ̂².

Example D: An angular distribution

The angle θ at which electrons are emitted in muon decay has a distribution with the density

    f(x|α) = (1 + αx)/2,   −1 ≤ x ≤ 1,  −1 ≤ α ≤ 1,

where x = cos(θ). The method of moments may be applied to estimate α from an i.i.d. sample of measurements X_1, . . . , X_n. We note that the mean of the density is

    µ = E(X_1) = ∫_{−1}^{1} x (1 + αx)/2 dx = α/3.

Thus α = 3µ and the method of moments estimate of α is α̂ = 3X̄.

Consistency of an estimate

Definition. Let θ̂_n be an estimate of a parameter θ based on a sample of size n. Then θ̂_n is said to be consistent in probability if θ̂_n converges in probability to θ as n approaches infinity; i.e., for any ε > 0,

    P(|θ̂_n − θ| > ε) → 0 as n → ∞.

An intuitive interpretation of θ̂_n being a consistent estimate of θ is that θ̂_n approaches θ as the sample size n increases to infinity.

Example. By the weak law of large numbers (from ST2131), the kth sample moment µ̂_k converges in probability to the kth population moment µ_k as the sample size n → ∞. Thus µ̂_k is a consistent estimate of µ_k. This implies the consistency of method of moments estimates, provided the parameters are continuous functions of the moments.

To conclude this section, we shall show that consistency of method of moments estimates can be used to provide a justification for a procedure that is commonly used in estimating standard errors. Suppose we are interested in the standard error of the method of moments estimate θ̂. Denoting the true parameter by θ_0, suppose the standard error of θ̂ has the form

    σ_θ̂ = σ(θ_0)/√n.

We estimate σ_θ̂ by

    s_θ̂ = σ(θ̂)/√n.

If σ(·) is a continuous function, then σ(θ̂) → σ(θ_0) in probability as n → ∞ (since θ̂ is consistent). This implies that

    s_θ̂ / σ_θ̂ → 1

in probability as n → ∞.

Tutorial 3: # 8.10.5, 8.10.7, 8.10.16, 8.10.19.
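As a closing sketch (again not part of the original notes), the following Python code applies the formulas of Examples C and D to simulated data: it draws a gamma sample with NumPy, forms σ̂² with divisor n, and computes λ̂ = X̄/σ̂² and α̂ = X̄²/σ̂²; it then draws from the angular density by rejection sampling and computes α̂ = 3X̄. The true parameter values, sample sizes, and the rejection sampler are illustrative choices, not taken from the notes.

```python
import numpy as np

rng = np.random.default_rng(1)

# --- Example C: Gamma(alpha, lambda) ---
alpha0, lam0 = 3.0, 2.0                # true parameters (illustrative)
n = 1000                               # sample size (illustrative)
# NumPy parameterizes the gamma distribution by shape and scale = 1/lambda.
x = rng.gamma(alpha0, 1.0 / lam0, size=n)
xbar = x.mean()
sigma2_hat = np.mean((x - xbar) ** 2)  # sigma^2-hat = mu2-hat - mu1-hat^2, divisor n
lam_hat = xbar / sigma2_hat            # lambda-hat = X-bar / sigma^2-hat
alpha_hat = xbar ** 2 / sigma2_hat     # alpha-hat  = X-bar^2 / sigma^2-hat
print(f"alpha-hat = {alpha_hat:.3f}, lambda-hat = {lam_hat:.3f}")

# --- Example D: angular density f(x|alpha) = (1 + alpha*x)/2 on [-1, 1] ---
def sample_angular(alpha, size, rng):
    """Draw from f(x|alpha) by rejection sampling (an illustrative sampler,
    not part of the notes); the density is bounded above by (1 + |alpha|)/2."""
    out = []
    while len(out) < size:
        x = rng.uniform(-1.0, 1.0)
        u = rng.uniform(0.0, (1.0 + abs(alpha)) / 2.0)
        if u <= (1.0 + alpha * x) / 2.0:
            out.append(x)
    return np.array(out)

alpha_true = 0.6                       # true parameter (illustrative)
z = sample_angular(alpha_true, 2000, rng)
alpha_hat_angular = 3.0 * z.mean()     # alpha-hat = 3 * X-bar
print(f"angular alpha-hat = {alpha_hat_angular:.3f}")
```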