Mathematical statistics
Week 2/b: Finite sample properties of estimators
Péter Elek and Ádám Reiff
26th September, 2013
1. Unbiasedness, relative efficiency and MSE-criterion
Unbiasedness.
• Consider an estimator $\hat{\theta}$ that is used to estimate a population parameter $\theta$.
– $\hat{\theta}$ is a random variable and its distribution depends on the true value of $\theta$.
• $\hat{\theta}$ is said to be an unbiased estimator of $\theta$ if $E(\hat{\theta}) = \theta$ for all possible values of $\theta$.
• If $E(\hat{\theta}) \neq \theta$ for at least one value of $\theta$, then $\hat{\theta}$ is a biased estimator of $\theta$.
• The bias of $\hat{\theta}$ is $\mathrm{Bias}_{\hat{\theta}}(\theta) = E(\hat{\theta}) - \theta$.
– The bias is also a function of the true parameter value $\theta$.
Example.
• Example (from previous lecture): the (true) sampling distributions of the sample mean x̄ and the sample median m are:

v           0     1     2     3      4     5     6     8     9     12
Pr(x̄ = v)  1/27  3/27  3/27  1/27   3/27  6/27  3/27  3/27  3/27  1/27
Pr(m = v)   7/27  0     0     13/27  0     0     0     0     0     7/27
• Which is an unbiased estimator of the population mean?
Relative efficiency.
• Consider $\hat{\theta}_1$ and $\hat{\theta}_2$, which are two unbiased estimators of a population parameter $\theta$.
• The estimator $\hat{\theta}_1$ is said to be relatively more efficient than $\hat{\theta}_2$ (to estimate $\theta$) if $Var(\hat{\theta}_1) \le Var(\hat{\theta}_2)$ for all $\theta$, with strict inequality for at least one value of $\theta$.
– i.e. the variance of its sampling distribution is smaller.
Efficiency (best unbiased estimator).
• The unbiased estimator $\hat{\theta}$ is said to be efficient (or the best unbiased estimator) if it has the smallest sampling variance among all unbiased estimators.
– That is, for any other unbiased estimator $\hat{\theta}_2$, $Var(\hat{\theta}) \le Var(\hat{\theta}_2)$ for all possible values of $\theta$.
– Note: efficiency is sometimes defined in a different way.
• Since the sampling variances may depend on the true value of $\theta$ (and hence functions of $\theta$ are compared in the definition), a best unbiased estimator does not always exist.
Mean squared error (MSE) criterion.
• Comparing variances is useful only for unbiased estimators.
• To compare more general estimators, we can use the mean squared error.
• $MSE(\hat{\theta}) = E\left[(\hat{\theta} - \theta)^2\right]$.
• MSE criterion: we choose the estimator that has a smaller MSE.
• Proposition: $MSE(\hat{\theta}) = Var(\hat{\theta}) + \mathrm{Bias}^2(\hat{\theta})$.
Proof of $MSE(\hat{\theta}) = Var(\hat{\theta}) + \mathrm{Bias}^2(\hat{\theta})$.
• We have:
$$MSE(\hat{\theta}) = E\left[(\hat{\theta} - \theta)^2\right] = E\left\{\left[(\hat{\theta} - E(\hat{\theta})) - (\theta - E(\hat{\theta}))\right]^2\right\}$$
$$= E\left[(\hat{\theta} - E(\hat{\theta}))^2\right] - 2E\left[\hat{\theta} - E(\hat{\theta})\right]\left(\theta - E(\hat{\theta})\right) + \left(\theta - E(\hat{\theta})\right)^2.$$
• Here the first term is $Var(\hat{\theta})$, the second term is zero (since $E[\hat{\theta} - E(\hat{\theta})] = 0$ and $\theta - E(\hat{\theta})$ is a constant), and the third term is not a random variable, so it is $(\theta - E(\hat{\theta}))^2 = \mathrm{Bias}^2(\hat{\theta})$.
• Therefore $MSE(\hat{\theta}) = Var(\hat{\theta}) + \mathrm{Bias}^2(\hat{\theta})$.
Example (cont.).
• Which of the previous two sample statistics (sample mean and sample median) has smaller MSE at the particular true parameter value?
– "True parameter value" now means: the three possible outcomes occur
with equal probability.
• $Var(\bar{x}) = \frac{1}{27}(0-5)^2 + \frac{3}{27}(1-5)^2 + \ldots + \frac{1}{27}(12-5)^2 = 8.6667$, and $MSE(\bar{x}) = Var(\bar{x})$ because of unbiasedness.
• $Var(m) = \frac{7}{27}(0-4.5556)^2 + \frac{13}{27}(3-4.5556)^2 + \frac{7}{27}(12-4.5556)^2 = 20.9136$, and $MSE(m) > Var(m)$ because of the bias: $MSE(m) = 20.9136 + (4.5556 - 5)^2 = 21.1111$.
• Hence $\bar{x}$ is better than $m$ in terms of the MSE-criterion. (And also the former is unbiased, while the latter is not.)
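This MSE comparison can be reproduced by brute force. Below is a minimal Python sketch (not part of the original notes) that assumes, as the table above implies, an i.i.d. sample of size 3 drawn with replacement from a population taking the values 0, 3 and 12 with equal probability:

```python
from itertools import product
from statistics import mean, median

values = [0, 3, 12]        # assumed population from the previous lecture
true_mean = mean(values)   # = 5

# All 27 equally likely samples of size 3 (sampling with replacement)
samples = list(product(values, repeat=3))

def mse(statistic):
    # Average squared deviation of the statistic from the true mean
    return mean((statistic(s) - true_mean) ** 2 for s in samples)

print(mse(mean))    # ~8.6667  (= Var(xbar), since xbar is unbiased)
print(mse(median))  # ~21.1111 (= Var(m) + Bias^2 = 20.9136 + 0.1975)
```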
2. Finite sample properties of sample mean and sample variance
2.1. Properties of the sample mean
Properties of the sample mean of an i.i.d. sample.
• Suppose $X_1, X_2, \ldots, X_n$ is an i.i.d. sample from a distribution with unknown population mean (expected value) $\mu$ and variance $\sigma^2$.
• Then the sample mean is unbiased for µ.
– $E(\bar{X}) = E\left(\frac{\sum_{i=1}^n X_i}{n}\right) = \frac{1}{n} E\left(\sum_{i=1}^n X_i\right) = \frac{1}{n} \sum_{i=1}^n E(X_i) = \frac{1}{n} \sum_{i=1}^n \mu = \frac{1}{n}(n\mu) = \mu$.
• Its standard deviation is proportional to $1/\sqrt{n}$.
– $Var(\bar{X}) = Var\left(\frac{\sum_{i=1}^n X_i}{n}\right) = \frac{1}{n^2} Var\left(\sum_{i=1}^n X_i\right) = \frac{1}{n^2} \sum_{i=1}^n Var(X_i) = \frac{1}{n^2}\, n\sigma^2 = \frac{\sigma^2}{n}$ (using independence in the third step).
– $sd(\bar{X}) = \sigma/\sqrt{n}$
Best linear unbiased estimator (BLUE).
• Best linear unbiased estimator (BLUE): an unbiased estimator is BLUE if it has the smallest variance among all unbiased estimators that are linear combinations of the sample elements.
• The sample mean is not always the best unbiased estimator for µ. (There are
"weird" counter-examples.)
• But it is the best linear unbiased estimator for µ in the case of an i.i.d.
sample.
Proof of the BLUE-property of the sample mean in an i.i.d. sample.
• Let $\hat{\theta}$ be an arbitrary unbiased linear estimator: $\hat{\theta} = \sum_{i=1}^n a_i X_i$, with $E(\hat{\theta}) = \mu$.
• Then $E(\hat{\theta}) = E\left(\sum_{i=1}^n a_i X_i\right) = \sum_{i=1}^n a_i E(X_i) = \mu \sum_{i=1}^n a_i$, so by unbiasedness we have $\sum_{i=1}^n a_i = 1$.
• Also, $Var(\hat{\theta}) = Var\left(\sum_{i=1}^n a_i X_i\right) = \sum_{i=1}^n a_i^2 Var(X_i) = \sigma^2 \sum_{i=1}^n a_i^2$ by independence.
• Hence, since $Var(\bar{X}) = \frac{\sigma^2}{n}$, we only have to prove that $\sum_{i=1}^n a_i^2 \ge \frac{1}{n}$ if $\sum_{i=1}^n a_i = 1$.
• But this is true because of the inequality between the quadratic and arithmetic means: $\sqrt{\frac{\sum_{i=1}^n a_i^2}{n}} \ge \frac{\sum_{i=1}^n a_i}{n} = \frac{1}{n}$, so $\sum_{i=1}^n a_i^2 \ge \frac{1}{n}$. So $\bar{X}$ is indeed the Best Linear Unbiased Estimator of $\mu$.
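As a small numerical illustration (a sketch, not from the notes; the unequal weights below are arbitrary examples), the variance $\sigma^2 \sum a_i^2$ of a linear unbiased estimator is smallest for the equal weights $a_i = 1/n$ of the sample mean:

```python
# Variance of a linear unbiased estimator sum(a_i * X_i) in an i.i.d. sample
# is sigma^2 * sum(a_i^2); unbiasedness forces the weights to sum to 1.
sigma2, n = 4.0, 5

def est_variance(weights):
    assert abs(sum(weights) - 1.0) < 1e-12  # unbiasedness constraint
    return sigma2 * sum(a * a for a in weights)

print(est_variance([1 / n] * n))                # sample mean: sigma^2/n = 0.8
print(est_variance([0.4, 0.3, 0.1, 0.1, 0.1]))  # unequal weights: 1.12 > 0.8
```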
Properties of the sample mean in a normal sample.
• Suppose $X_1, X_2, \ldots, X_n$ is an i.i.d. sample from a $N(\mu, \sigma^2)$ distribution with unknown parameters.
• Then the sample mean is not only BLUE but it is the best unbiased estimator
for µ.
– i.e. it is the best among all (even nonlinear) estimators. We do not prove
this.
• Moreover, as a linear combination of normal random variables is also normally distributed, $\bar{X}$ is itself normally distributed: $\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$.
Exercise 1.
• A filling machine is set to pour 500 g of cereal into a box container. Denote the actual weight of cereal filled into the container by $X$, and assume that $X \sim N(500, 20^2)$.
• A random sample of $n = 25$ boxes is selected (i.e. $x_1, \ldots, x_{25}$ is drawn), and the plant manager stops the process if $\bar{x} > 510$ or $\bar{x} < 490$.
• What is the probability of stopping?
Solution.
• Since $X \sim N(500, 20^2)$ and $n = 25$, $\bar{X} \sim N\left(500, \frac{20^2}{25} = 16\right)$, and $\frac{\bar{X} - 500}{20/\sqrt{25}} \sim N(0, 1)$.
• Therefore
$$\Pr(\text{stop}) = 1 - \Pr(\text{no stop}) = 1 - \Pr(490 < \bar{X} < 510) = 1 - \Pr\left(\frac{490 - 500}{20/\sqrt{25}} < \frac{\bar{X} - 500}{20/\sqrt{25}} < \frac{510 - 500}{20/\sqrt{25}}\right)$$
$$= 1 - \Pr\left(-2.5 < \frac{\bar{X} - 500}{20/\sqrt{25}} < 2.5\right) = 1 - \left[\Phi(2.5) - \Phi(-2.5)\right] = 2 - 2\Phi(2.5) = 0.0124.$$
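The same probability can be checked numerically; a short sketch using scipy.stats (an assumed dependency, not part of the course material):

```python
from scipy.stats import norm

# sd(Xbar) = 20 / sqrt(25) = 4, so the stopping thresholds are 2.5 sd away
z = (510 - 500) / 4           # = 2.5
p_stop = 2 - 2 * norm.cdf(z)  # = 2 * (1 - Phi(2.5))
print(round(p_stop, 4))       # 0.0124
```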
2.2. Properties of the sample variance
"Ideal" sampling variance.
• The "ideal" sample variance is obtained when
we treat the expected value µ
P
n
2
i=1 (Xi −µ)
as known in the variance formula: s2ideal =
n
.
– "Ideal" because we have the generally unknown µ instead of X in the
expression.
• In an i.i.d. sample, s2ideal is an unbiased estimator of σ 2 .
– Proof: E s2ideal =
Pn
i=1
E [(Xi −µ)2 ]
n
=
nσ 2
n
= σ2
ns2
• Moreover, in a normally distributed sample, σideal
follows a chi-squared
2
distribution with n degrees of freedom.
2
2 Pn
ns2
Xi −µ
– Proof: s2ideal = σn
. Hence σideal
is the sum of squa2
i=1
σ
res of n independent standard normal variables.
Exercise 2.
• Assume that the size of the output of some production process is distributed normally: $X \sim N(10, 0.1^2)$ (i.e. the true variance of the size is known).
• We draw a sample of $n = 25$ observations from a large number of outputs.
• What is the probability that the ideal sample variance will exceed 0.014?
Solution.
• $\Pr(s^2_{ideal} > 0.014) = \Pr\left(\frac{n s^2_{ideal}}{\sigma^2} > \frac{25 \cdot 0.014}{0.01}\right) = \Pr\left(\frac{n s^2_{ideal}}{\sigma^2} > 35\right) = 1 - \Pr\left(\frac{n s^2_{ideal}}{\sigma^2} < 35\right) \approx 0.1$
• from the table of the chi-squared distribution with 25 degrees of freedom.
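Instead of the printed chi-squared table, the tail probability can be computed directly; a sketch assuming scipy.stats is available:

```python
from scipy.stats import chi2

# P(n * s2_ideal / sigma^2 > 35), where the statistic is chi-squared with 25 df
p = chi2.sf(35, df=25)  # survival function = 1 - CDF
print(round(p, 3))      # ~0.09, consistent with the ~0.1 read from a coarse table
```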
"Actual" uncorrected sample variance.
• "Actual" uncorrected sample variance is: s2 =
Pn
2
i=1
(Xi −X )
n
– Actual because µ is replaced by x
• E(s2 ) =
n−1 2
n σ ,
hence it is a biased estimator of σ 2 .
• Proof:
– $s^2 = \frac{\sum_{i=1}^n (X_i - \bar{X})^2}{n} = \frac{\sum_{i=1}^n \left[(X_i - \mu) - (\bar{X} - \mu)\right]^2}{n} = \ldots = \frac{\sum_{i=1}^n (X_i - \mu)^2}{n} - \left(\bar{X} - \mu\right)^2$.
– In the expected value we have $E\left[(X_i - \mu)^2\right] = Var(X_i) = \sigma^2$, and also we have $E\left[(\bar{X} - \mu)^2\right] = E\left[(\bar{X} - E(\bar{X}))^2\right] = Var(\bar{X}) = \frac{\sigma^2}{n}$.
– Therefore $E(s^2) = \sigma^2 - \frac{\sigma^2}{n} = \frac{n-1}{n}\sigma^2$, so $s^2$ is a biased estimator of $\sigma^2$.
Corrected sample variance.
• $s^{*2} = \frac{\sum_{i=1}^n (X_i - \bar{X})^2}{n-1} = \frac{n}{n-1} \cdot \frac{\sum_{i=1}^n (X_i - \bar{X})^2}{n}$ is unbiased for $\sigma^2$, and it is called the corrected sample variance.
• Moreover, in a normally distributed sample, $s^{*2} = \frac{\sum_{i=1}^n (X_i - \bar{X})^2}{n-1} = \ldots = \frac{n}{n-1}\left[\frac{\sum_{i=1}^n (X_i - \mu)^2}{n} - (\bar{X} - \mu)^2\right]$, so $\frac{(n-1)s^{*2}}{\sigma^2} = \sum_{i=1}^n \left(\frac{X_i - \mu}{\sigma}\right)^2 - \left(\frac{\bar{X} - \mu}{\sigma/\sqrt{n}}\right)^2$ is distributed as $\chi^2_{n-1}$.
– We "lose" one degree of freedom because $\mu$ was replaced by $\bar{X}$.
– The correct proof is based on induction (not covered in class).
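A small Monte Carlo sketch (illustrative only; numpy is an assumed dependency) of the bias of $s^2$ and the unbiasedness of $s^{*2}$ in a normal sample:

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n, reps = 10.0, 2.0, 5, 200_000

samples = rng.normal(mu, sigma, size=(reps, n))
s2 = samples.var(axis=1, ddof=0)       # uncorrected: divides by n
s2_star = samples.var(axis=1, ddof=1)  # corrected: divides by n - 1

print(s2.mean())       # ~3.2 = (n-1)/n * sigma^2  (biased downward)
print(s2_star.mean())  # ~4.0 = sigma^2            (unbiased)
```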
Further properties in case of a normal sample.
• Since $(n-1)s^{*2} = ns^2$, $\frac{ns^2}{\sigma^2} = \frac{(n-1)s^{*2}}{\sigma^2}$ also follows a $\chi^2_{n-1}$ distribution.
• Further, $Var\left(\frac{(n-1)s^{*2}}{\sigma^2}\right) = 2(n-1)$ (why?).
• Therefore $Var(s^{*2}) = \frac{\sigma^4}{(n-1)^2} Var\left(\frac{(n-1)s^{*2}}{\sigma^2}\right) = \frac{\sigma^4}{(n-1)^2} \cdot 2(n-1) = \frac{2\sigma^4}{n-1}$.
• By similar arguments it is easy to see that $Var(s^2) = \frac{2(n-1)\sigma^4}{n^2}$.
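These two variance formulas can be verified in the same Monte Carlo setup (again a sketch assuming numpy):

```python
import numpy as np

rng = np.random.default_rng(1)
sigma, n, reps = 2.0, 5, 500_000
samples = rng.normal(0.0, sigma, size=(reps, n))

print(samples.var(axis=1, ddof=1).var())  # ~8.00 = 2*sigma^4/(n-1)
print(samples.var(axis=1, ddof=0).var())  # ~5.12 = 2*(n-1)*sigma^4/n^2
```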
"Normalized" sample mean in a normal sample.
• "Normalized" sample mean:
X−µ
√
s∗ / n
• Suppose X1 , X2 , . . . , Xn is an i.i.d. sample from a N (µ, σ 2 ) distribution
with unknown parameters.
• We know that
–
–
X−µ
√
σ/ n
q
∼ N (0, 1) and
(n−1)s∗2 /σ 2
n−1
=
s∗
σ
q
∼
χ2n−1
n−1 .
• Further, these two random variables are independent.
– We do not prove this.
• So
X−µ
√
σ/ n
s∗ /σ
=
X−µ
√
s∗ / n
follows a tn−1 -distribution.
Exercise 3.
• Assume that the test-scores at the CEU math entry test are distributed normally.
• From a random sample of 25 students we find that $\bar{x} = 70$ and $s^{*2} = 400$.
• Find the "probability" that the expected value of the results lies in the $[60, 80]$ interval!
Solution.
• We know that $\frac{\bar{X} - \mu}{s^*/\sqrt{n}} \sim t_{24}$,
• so $\Pr(60 < \mu < 80) = \Pr(-80 < -\mu < -60) = \Pr\left(\frac{\bar{X} - 80}{20/\sqrt{25}} < \frac{\bar{X} - \mu}{s^*/\sqrt{n}} < \frac{\bar{X} - 60}{20/\sqrt{25}}\right) = \Pr\left(-2.5 < \frac{\bar{X} - \mu}{s^*/\sqrt{n}} < 2.5\right) = 0.98$,
• where the last result is taken from the table of the t-distribution with 24 degrees of freedom.
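The t-table lookup can also be reproduced directly; a sketch assuming scipy.stats:

```python
from scipy.stats import t

# P(-2.5 < T < 2.5) for T ~ t with 24 degrees of freedom
p = t.cdf(2.5, df=24) - t.cdf(-2.5, df=24)
print(round(p, 2))   # 0.98
```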
Material.
• Further exercises are on a separate sheet (with solutions).
• Wooldridge Appendix C.1-C.2
• Casella-Berger 5.1-5.3 (except for Theorem 5.2.11), 7.1, 7.3.1-7.3.2.
– only to the extent covered in the course