Module 4: Point Estimation
Statistics (OA3102)
Professor Ron Fricker
Naval Postgraduate School, Monterey, California
Reading assignment: WM&S chapters 8.1-8.4
Revision: 1-12

Goals for this Module
• Define and distinguish between point estimates and point estimators
• Discuss characteristics of good point estimates
  – Unbiasedness and minimum variance
  – Mean square error
  – Consistency, efficiency, robustness
• Quantify and calculate the precision of an estimator via the standard error
  – Discuss the bootstrap as a way to empirically estimate standard errors

Welcome to Statistical Inference!
• Problem:
  – We have a simple random sample of data
  – We want to use it to estimate a population quantity (usually a parameter of a distribution)
  – In point estimation, the estimate is a single number
• Issue: there are often many possible estimates
  – E.g., estimate E(X) with the sample mean x̄, the sample median x̃, or ???
• This module: what makes a "good" point estimate?
  – Module 5: interval estimators
  – Module 6: methods for finding good estimators

Point Estimation
• A point estimate of a parameter θ is a single number that is a sensible value for θ
  – I.e., it is a numerical estimate of θ
  – We will use θ to represent a generic parameter; it could be μ, σ, p, etc.
• The point estimate is a statistic calculated from a sample of data
  – The statistic is called a point estimator
  – Using "hat" notation, we will denote it as θ̂
  – For example, we might use x̄ to estimate μ, so in this case μ̂ = x̄

Definition: Estimator
An estimator is a rule, often expressed as a formula, that tells how to calculate the value of an estimate based on the measurements contained in a sample

An Example
• You are testing a new missile and want to estimate the probability of kill (against a particular target under specific conditions)
  – You do a test with n = 25 shots
  – The parameter to be estimated is p_k, the probability of kill
• Let X be the number of kills
  – In your test you observed x = 15
• A reasonable estimator and estimate is
  estimator: p̂_k = X/n        estimate: p̂_k = x/n = 15/25 = 0.6

A More Difficult Example
• On another test, you are estimating the mean time to failure (MTTF) of a piece of electronic equipment
• Measurements were taken for n = 20 tests (in units of 1,000 hrs) [data table omitted]
• It turns out a normal distribution fits the data quite well
• So, what we want to do is estimate μ, the MTTF
• How best to do this?

Example, cont'd
• Here are some possible estimators for μ and their values for this data set:
  (1) μ̂ = X̄, the sample mean, so x̄ = (1/20) Σᵢ₌₁²⁰ xᵢ = 27.793
  (2) μ̂ = X̃, the sample median, so x̃ = (27.94 + 27.98)/2 = 27.960
  (3) μ̂ = X_e = [min(Xᵢ) + max(Xᵢ)]/2, the midrange, so x_e = (24.46 + 30.88)/2 = 27.670
  (4) μ̂ = X̄_tr(10), the 10% trimmed mean, so x̄_tr(10) = (1/16) Σᵢ₌₃¹⁸ x₍ᵢ₎ = 27.838
• Which estimator should you use?
  – I.e., which is likely to give estimates closer to the true (but unknown) population value?

Another Example
• In a wargame computer simulation, you want to estimate a scenario's run-time variability (σ²)
• The run times (in seconds) for eight runs are: [data omitted]
• Two possible estimates:
  (1) σ̂² = S² = [1/(n − 1)] Σᵢ₌₁ⁿ (Xᵢ − X̄)², so s² = 0.25125
  (2) σ̂² = (1/n) Σᵢ₌₁ⁿ (Xᵢ − X̄)², giving the estimate 0.220
• Why prefer (1) over (2)?
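To see why (1) is preferred, a quick Monte Carlo sketch helps (not from the slides; the N(0, 1) population and sample size n = 8 are hypothetical choices). Dividing by n − 1 gives estimates that average out to the true variance, while dividing by n systematically underestimates it:

```python
# Monte Carlo sketch: compare the two variance estimators on repeated
# samples from a known population with true variance sigma^2 = 1.
import random
import statistics

random.seed(1)

n = 8            # sample size, as in the eight simulation runs
reps = 20000     # number of repeated samples

s2_unbiased = []  # estimator (1): divide by n - 1
s2_biased = []    # estimator (2): divide by n
for _ in range(reps):
    y = [random.gauss(0.0, 1.0) for _ in range(n)]
    ybar = sum(y) / n
    ss = sum((yi - ybar) ** 2 for yi in y)
    s2_unbiased.append(ss / (n - 1))
    s2_biased.append(ss / n)

print(statistics.mean(s2_unbiased))  # close to 1.0, the true variance
print(statistics.mean(s2_biased))    # close to (n-1)/n = 0.875: biased low
```

The averages show estimator (2) undershooting the true variance by a factor of (n − 1)/n, which is exactly the bias derived in the next slides.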
Bias
• Definition: Let θ̂ be a point estimator for a parameter θ
  – θ̂ is an unbiased estimator if E(θ̂) = θ
  – If E(θ̂) ≠ θ, θ̂ is said to be biased
• Definition: The bias of a point estimator θ̂ is given by B(θ̂) = E(θ̂) − θ
• E.g.: [figures omitted]
* Figures from Probability and Statistics for Engineering and the Sciences, 7th ed., Duxbury Press, 2008.

Proving Unbiasedness
• Proposition: Let X be a binomial r.v. with parameters n and p. The sample proportion p̂ = X/n is an unbiased estimator of p.
• Proof:

Remember: Rules for Linear Combinations of Random Variables
• For random variables X₁, X₂, …, Xₙ
  – Whether or not the Xᵢ are independent:
    E(a₁X₁ + a₂X₂ + ⋯ + aₙXₙ) = a₁E(X₁) + a₂E(X₂) + ⋯ + aₙE(Xₙ)
  – If X₁, X₂, …, Xₙ are independent:
    Var(a₁X₁ + a₂X₂ + ⋯ + aₙXₙ) = a₁²Var(X₁) + a₂²Var(X₂) + ⋯ + aₙ²Var(Xₙ)

Example 8.1
• Let Y₁, Y₂, …, Yₙ be a random sample with E(Yᵢ) = μ and Var(Yᵢ) = σ². Show that
  S′² = (1/n) Σᵢ₌₁ⁿ (Yᵢ − Ȳ)²
  is a biased estimator for σ², while S² is an unbiased estimator of σ².
• Solution:

Another Biased Estimator
• Let X be the reaction time to a stimulus with X ~ U[0, θ], where we want to estimate θ based on a random sample X₁, X₂, …, Xₙ
• Since θ is the largest possible reaction time, consider the estimator θ̂₁ = max(X₁, X₂, …, Xₙ)
• However, unbiasedness would imply that we can observe values of the estimator both bigger and smaller than θ
• Why?
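Every observation satisfies Xᵢ ≤ θ, so max(Xᵢ) can never land above θ, and its expectation must therefore fall below θ. A Monte Carlo sketch makes this visible (not from the slides; the values n = 5 and θ = 10 are hypothetical):

```python
# Simulate E[max(X1,...,Xn)] for Xi ~ U[0, theta].  Theory gives
# E[max] = n*theta/(n+1), which is strictly less than theta.
import random

random.seed(2)

n, theta = 5, 10.0
reps = 50000

est1 = [max(random.uniform(0, theta) for _ in range(n)) for _ in range(reps)]
mean_est1 = sum(est1) / reps

print(mean_est1)   # averages about n*theta/(n+1) = 8.33, well below theta
```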
• Thus, θ̂₁ must be a biased estimator

Fixing the Biased Estimator
• For the same problem, consider the estimator
  θ̂₂ = [(n + 1)/n] max(X₁, X₂, …, Xₙ)
• Show this estimator is unbiased

One Criterion for Choosing Among Estimators
• Principle of minimum variance unbiased estimation: among all estimators of θ that are unbiased, choose the one that has the minimum variance
  – The resulting estimator θ̂ is called the minimum variance unbiased estimator (MVUE) of θ
  – In the figure (omitted), estimator θ̂₁ is preferred to θ̂₂
* Figure from Probability and Statistics for Engineering and the Sciences, 7th ed., Duxbury Press, 2008.

Example of an MVUE
• Let X₁, X₂, …, Xₙ be a random sample from a normal distribution with parameters μ and σ. Then the estimator μ̂ = X̄ is the MVUE for μ
  – The proof is beyond the scope of the class
• Note this only applies to the normal distribution
  – When estimating the population mean E(X) = μ for other distributions, X̄ may not be the appropriate estimator
  – E.g., for the Cauchy distribution E(X) does not even exist!

How Variable is My Point Estimate? The Standard Error
• The precision of a point estimate is given by its standard error
• The standard error of an estimator θ̂ is its standard deviation: σ_θ̂ = √Var(θ̂)
• If the standard error itself involves unknown parameters whose values are estimated, substituting these estimates into σ_θ̂ yields the estimated standard error
  – The estimated standard error is denoted by σ̂_θ̂ or s_θ̂

Deriving Some Standard Errors (1)
• Proposition: If Y₁, Y₂, …, Yₙ are distributed iid with variance σ², then for a sample of size n, Var(Ȳ) = σ²/n. Thus σ_Ȳ = σ/√n.
• Proof:

Deriving Some Standard Errors (2)
• Proposition: If Y ~ Bin(n, p), then σ_p̂ = √(pq/n), where q = 1 − p and p̂ = Y/n.
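Both standard-error propositions can be checked by simulation before proving them. A sketch (not from the slides; the parameter values μ = 0, σ = 2, n = 25 for the mean and p = 0.3, n = 100 for the proportion are hypothetical):

```python
# Empirically check sigma_Ybar = sigma/sqrt(n) and sigma_phat = sqrt(p*q/n)
# by computing many sample means / proportions and taking their std. dev.
import math
import random
import statistics

random.seed(3)
reps = 20000

# Proposition 1: std. dev. of the sample mean is sigma / sqrt(n)
n, mu, sigma = 25, 0.0, 2.0
means = [statistics.fmean(random.gauss(mu, sigma) for _ in range(n))
         for _ in range(reps)]
print(statistics.stdev(means), sigma / math.sqrt(n))        # both near 0.4

# Proposition 2: std. dev. of the sample proportion is sqrt(p*q/n)
n, p = 100, 0.3
props = [sum(random.random() < p for _ in range(n)) / n for _ in range(reps)]
print(statistics.stdev(props), math.sqrt(p * (1 - p) / n))  # both near 0.046
```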
• Proof:

Expected Values and Standard Errors of Some Common Point Estimators

  Target parameter θ   Sample size(s)   Point estimator θ̂   E(θ̂)      Standard error σ_θ̂
  μ                    n                Ȳ                   μ         σ/√n
  p                    n                p̂ = Y/n             p         √(pq/n)
  μ₁ − μ₂              n₁ and n₂        Ȳ₁ − Ȳ₂             μ₁ − μ₂   √(σ₁²/n₁ + σ₂²/n₂)
  p₁ − p₂              n₁ and n₂        p̂₁ − p̂₂             p₁ − p₂   √(p₁q₁/n₁ + p₂q₂/n₂)

  (The last two rows assume the populations are independent.)

However, Unbiased Estimators Aren't Always to be Preferred
• Sometimes an estimator with a small bias can be preferred to an unbiased estimator
• Example:
• A more detailed discussion is beyond the scope of the course; just know that unbiasedness isn't necessarily required for a good estimator

Mean Square Error
• Definition: The mean square error (MSE) of a point estimator θ̂ is
  MSE(θ̂) = E[(θ̂ − θ)²]
• The MSE of an estimator θ̂ is a function of both its variance and its bias
  – I.e., it can be shown (extra credit problem) that
    MSE(θ̂) = E[(θ̂ − θ)²] = Var(θ̂) + [B(θ̂)]²
  – So, for unbiased estimators, MSE(θ̂) = Var(θ̂)

Error of Estimation
• Definition: The error of estimation ε is the distance between an estimator and its target parameter: ε = |θ̂ − θ|
  – Since θ̂ is a random variable, so is the error of estimation ε
  – But we can bound the error:
    Pr(θ − b < θ̂ < θ + b) = Pr(|θ̂ − θ| < b) = Pr(−b < θ̂ − θ < b)

Bounding the Error of Estimation
• Tchebysheff's Theorem: Let Y be a random variable with finite mean μ and variance σ². Then for any k > 0,
  Pr(|Y − μ| < kσ) ≥ 1 − 1/k²
  – Note that this holds for any distribution
  – It is a (generally conservative) bound
  – E.g., for any distribution we are guaranteed that the probability Y is within 2 standard deviations of its mean is at least 0.75
• So, for unbiased estimators, a good bound to use on the error of estimation is b = 2σ_θ̂

Example 8.2
• In a sample of n = 1,000 randomly selected voters, y = 560 are in favor of candidate Jones
• Estimate p, the fraction of voters in the population favoring Jones, and put a 2-standard-error bound on the error of estimation
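The arithmetic for Example 8.2 can be mechanized directly (this follows the slide's setup; since p is unknown, the estimated standard error plugs in p̂ and q̂):

```python
# Example 8.2: n = 1,000 voters sampled, y = 560 favor Jones
import math

n, y = 1000, 560

p_hat = y / n                              # point estimate of p
q_hat = 1 - p_hat
# 2-standard-error bound on the error of estimation, using the
# estimated standard error sqrt(p_hat * q_hat / n)
bound = 2 * math.sqrt(p_hat * q_hat / n)

print(p_hat)            # 0.56
print(round(bound, 4))  # 0.0314
```

So we estimate 56% support, and the error of estimation is unlikely to exceed about 3.1 percentage points.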
• Solution:

Example 8.3
• Car tire durability was measured on samples of two types of tires, with n₁ = n₂ = 100. The numbers of miles until wear-out were recorded, with the following results:
  ȳ₁ = 26,400 miles        ȳ₂ = 25,100 miles
  s₁² = 1,440,000 miles²   s₂² = 1,960,000 miles²
• Estimate the difference in mean miles to wear-out and put a 2-standard-error bound on the error of estimation
• Solution:

Other Properties of Good Estimators
• An estimator is efficient if it has a small standard deviation compared to other unbiased estimators
• An estimator is robust if it is not sensitive to outliers, distributional assumptions, etc.
  – That is, robust estimators work reasonably well under a wide variety of conditions
• An estimator θ̂ₙ is consistent if Pr(|θ̂ₙ − θ| > ε) → 0 as n → ∞
• For more detail, see WM&S chapters 9.1-9.5

A Useful Aside: Using the Bootstrap to Empirically Estimate Standard Errors

The Hard Way to Empirically Estimate Standard Errors
• Draw multiple (R) samples x₁, x₂, …, x_R from the population distribution F, where xᵢ = {x₁ᵢ, x₂ᵢ, …, xₙᵢ}
• Calculate the parameter estimate θ̂(xᵢ) for each sample
• Estimate the standard error of the estimator using the standard deviation of the estimates:
  s.e.[θ̂(X)] ≈ √( [1/(R − 1)] Σᵢ₌₁ᴿ [θ̂(xᵢ) − θ̄]² ), where θ̄ = (1/R) Σᵢ₌₁ᴿ θ̂(xᵢ)
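The "hard way" can be sketched in code for an estimator with no simple standard-error formula, such as the sample median (a sketch, not from the slides; it assumes we can sample an N(0, 1) "population" at will, which is exactly what is rarely possible in practice):

```python
# The hard way: draw R independent samples from the population, compute
# the estimator on each, and take the std. dev. of the R estimates.
import random
import statistics

random.seed(4)

R, n = 5000, 20
estimates = []
for _ in range(R):
    sample = [random.gauss(0.0, 1.0) for _ in range(n)]  # "population" ~ N(0,1)
    estimates.append(statistics.median(sample))          # theta_hat = median

print(statistics.stdev(estimates))   # empirical standard error of the median
```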
The Bootstrap
• The "hard way" is either not possible or is wasteful in practice
• The bootstrap:
  – Is useful when you don't know, or worse, simply cannot analytically derive, the sampling distribution
  – Provides a computer-intensive method to empirically estimate the sampling distribution
• It has only become feasible recently, with the widespread availability of significant computing power

Plug-in Principle
• We have been doing this throughout the class
• If you need a parameter for a calculation, simply "plug in" the equivalent statistic
• For example, we defined Var(X) = E[X − E(X)]², and then we sometimes did the calculation using X̄ in place of E(X)
• This is relevant for the bootstrap, as we will "plug in" the empirical distribution in place of the population distribution

Empirically Estimating Standard Errors Using the Bootstrap
• Draw multiple (B) resamples x*₁, x*₂, …, x*_B from the observed data x = {x₁, x₂, …, xₙ}, i.e., from the empirical distribution F̂, where x*ᵢ = {x*₁ᵢ, x*₂ᵢ, …, x*ₙᵢ}
• Calculate the bootstrap estimate θ̂(x*ᵢ) for each resample
• Estimate the standard error from the bootstrap estimates:
  s_θ̂ = √( [1/(B − 1)] Σᵢ₌₁ᴮ [θ̂(x*ᵢ) − θ̄*]² ), where θ̄* = (1/B) Σᵢ₌₁ᴮ θ̂(x*ᵢ)

Some Key Ideas
• Bootstrap samples are drawn with replacement from the empirical distribution
  – So observations can occur in a bootstrap sample more frequently than they occurred in the actual sample
• The empirical distribution substitutes for the actual population distribution
  – We can draw lots of bootstrap samples from the empirical distribution to calculate the statistic of interest
• Make B as big as can be run in a reasonable timeframe
  – Bootstrap resamples are the same size as the original sample (n)
• Because this is all empirical, we don't need to analytically solve for the sampling distribution of the statistic of interest

What We Covered in this Module
• Defined and distinguished between point estimates and point estimators
• Discussed characteristics of good point estimates
  – Unbiasedness and minimum variance
  – Mean square error
  – Consistency, efficiency, robustness
• Quantified and calculated the precision of an estimator via the standard error
  – Discussed the bootstrap as a way to empirically estimate standard errors

Homework
• WM&S chapters 8.1-8.4
  – Required exercises: 2, 8, 21, 23, 27
  – Extra credit: 1, 6
• Useful hints:
  – Problem 8.2: Don't just give the obvious answer; show why it's true mathematically
  – Problem 8.8: Don't do the calculations for the θ̂₄ estimator
  – Extra credit problem 8.6: The a term is a constant between 0 and 1
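Finally, the bootstrap standard-error procedure described above can be sketched in a few lines of code. This is a sketch, not course material: the data vector is hypothetical, and the sample mean stands in for a generic θ̂ (in practice you would bootstrap a statistic whose standard error you cannot derive analytically):

```python
# Bootstrap standard error: resample the observed data WITH replacement
# (same size n), compute theta_hat on each resample, take the std. dev.
import random
import statistics

random.seed(5)

x = [27.94, 27.98, 24.46, 30.88, 26.40, 28.12, 27.67, 29.01]  # hypothetical data
n, B = len(x), 2000

def theta_hat(sample):
    return statistics.fmean(sample)   # any statistic would work here

# B resamples drawn with replacement from the empirical distribution
boot = [theta_hat(random.choices(x, k=n)) for _ in range(B)]

se_boot = statistics.stdev(boot)      # bootstrap estimate of the standard error
print(se_boot)
```

Note that observations can appear more than once in a resample; that is what drawing from the empirical distribution means.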