Notes 22 - Wharton Statistics
Statistics 512 Notes 22: Wrap-up of Sufficiency, Most Powerful Tests
Rao-Blackwell Theorem:
Theorem 7.3.1 (stated a little differently): Let X_1, ..., X_n be an iid sample from the pdf or pmf f(x; θ), θ ∈ Θ. Let u(X_1, ..., X_n) be a sufficient statistic for θ, and let θ̂ = W(X_1, ..., X_n) be an estimator of θ. Because u(X_1, ..., X_n) is a sufficient statistic for θ,
θ̃ = E(θ̂ | u(X_1, ..., X_n) = u)
is a function of X_1, ..., X_n that does not depend on θ. The theorem is that for all θ,
MSE_θ(θ̃) ≤ MSE_θ(θ̂).
The inequality is strict unless θ̃ = θ̂.
Note: If u(X_1, ..., X_n) is not a sufficient statistic for θ, then E(θ̂ | u(X_1, ..., X_n) = u; θ) is not an estimator, since it depends on θ.
Application of the Rao-Blackwell Theorem:
Suppose X_1, X_2, X_3 are iid Bernoulli random variables with success probability p. Consider the estimator
p̂ = (X_1 + 2X_2 + 3X_3)/6.
Summing over the possible samples (X_1, X_2, X_3) ∈ {(0,0,0), (0,0,1), (0,1,0), (0,1,1), (1,0,0), (1,0,1), (1,1,0), (1,1,1)}, we have
MSE_p(p̂) = (1 − p)³(0 − p)² + p(1 − p)²(1/2 − p)² + p(1 − p)²(1/3 − p)²
  + p²(1 − p)(5/6 − p)² + p(1 − p)²(1/6 − p)² + p²(1 − p)(2/3 − p)²
  + p²(1 − p)(1/2 − p)² + p³(1 − p)².
p̂ is not a function of the sufficient statistic Y = X_1 + X_2 + X_3. We can use the Rao-Blackwell Theorem to improve p̂. Consider the estimator
p̃ = E[(X_1 + 2X_2 + 3X_3)/6 | X_1 + X_2 + X_3 = y].
We have
E[(X_1 + 2X_2 + 3X_3)/6 | X_1 + X_2 + X_3 = 0] = 0
E[(X_1 + 2X_2 + 3X_3)/6 | X_1 + X_2 + X_3 = 1] = 1/3
E[(X_1 + 2X_2 + 3X_3)/6 | X_1 + X_2 + X_3 = 2] = 2/3
E[(X_1 + 2X_2 + 3X_3)/6 | X_1 + X_2 + X_3 = 3] = 1.
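These conditional expectations can be verified by direct enumeration: given Y = y, every arrangement of y successes among the three trials is equally likely, so the conditional expectation does not depend on p. A quick sketch in Python (our code, not part of the original notes):

```python
from fractions import Fraction
from itertools import product

# p-hat evaluated at a sample point (x1, x2, x3), kept exact with Fraction
def p_hat(x):
    return Fraction(x[0] + 2 * x[1] + 3 * x[2], 6)

# E[p-hat | Y = y]: given the sum y, each arrangement of successes
# is equally likely, so we can just average p-hat over those points
cond = {}
for y in range(4):
    points = [x for x in product((0, 1), repeat=3) if sum(x) == y]
    cond[y] = sum(p_hat(x) for x in points) / len(points)

for y, value in cond.items():
    print(y, value)  # 0 -> 0, 1 -> 1/3, 2 -> 2/3, 3 -> 1
```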
p (1  p )
. Here is a
3
comparison of the MSEs for p and p̂ for some values of
p.
p
MSE ( pˆ )
MSE ( pˆ )
.25
0.073
0.063
.5
0.097
0.083
Thus, p  X . We have MSE p ( p ) 
p
p
.75
0.073
0.063
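The table entries can be reproduced by brute-force enumeration over the eight sample points; a minimal Python sketch (function names are ours, not the notes'):

```python
from itertools import product

def mse(estimator, p):
    """MSE of an estimator of p, summing over the 8 samples of 3 Bernoulli(p) trials."""
    total = 0.0
    for x in product((0, 1), repeat=3):
        prob = p ** sum(x) * (1 - p) ** (3 - sum(x))
        total += prob * (estimator(x) - p) ** 2
    return total

p_hat = lambda x: (x[0] + 2 * x[1] + 3 * x[2]) / 6
p_tilde = lambda x: sum(x) / 3  # the Rao-Blackwellized estimator, i.e. the sample mean

for p in (0.25, 0.5, 0.75):
    print(f"p={p}: MSE(p_hat)={mse(p_hat, p):.4f}, MSE(p_tilde)={mse(p_tilde, p):.4f}")
```

For every p, the enumeration recovers MSE_p(p̃) = p(1 − p)/3 and gives a strictly smaller MSE than p̂, as the Rao-Blackwell Theorem guarantees.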
Limitation of the Rao-Blackwell Theorem for Finding a Best Estimator: Suppose there are two estimators θ̂_1 and θ̂_2 having the same expectation. Assuming that a sufficient statistic Y exists, we may construct two other estimators θ̃_1 and θ̃_2 by conditioning on Y. The Rao-Blackwell Theorem gives no clue as to which of these two is better. If the probability distribution of Y has a property called completeness, then θ̃_1 and θ̃_2 are identical, by a theorem of Lehmann and Scheffé. This topic is pursued in the rest of Chapter 7, but we shall not cover it.
Optimal Hypothesis Testing
Review on Hypothesis Testing
Goal: Decide between two hypotheses about a parameter of interest θ:
H_0: θ ∈ Θ_0
H_1: θ ∈ Θ_1,
where Θ_0 and Θ_1 are disjoint subsets of Θ.
Null vs. Alternative Hypothesis: The alternative hypothesis is the hypothesis for which we are looking for strong evidence. The null hypothesis is the default hypothesis that we retain unless there is strong evidence for the alternative hypothesis.
Critical region: A test is defined by its critical region. Let S denote the support of the random sample (X_1, ..., X_n). The subset C of S for which we reject the null hypothesis is called the critical region, i.e., our decision rule is
Reject H_0 if (X_1, ..., X_n) ∈ C
Retain (do not reject) H_0 if (X_1, ..., X_n) ∈ Cᶜ.
Note: Here I am following the book in defining a critical region in terms of the sample space rather than a test statistic as I did in Notes 5-6.
Errors in hypothesis testing:

                        True State of Nature
Decision                H_0 is true       H_1 is true
Reject H_0              Type I error      Correct decision
Accept (retain) H_0     Correct decision  Type II error
The best critical region would make the probability of a Type I error small when H_0 is true and the probability of a Type II error small when H_1 is true. But in general there is a tradeoff between these two types of errors.
Size of test, power of test: The power function of a test with critical region C is
γ_C(θ) = P_θ((X_1, ..., X_n) ∈ C) = probability of rejecting the null hypothesis when the true parameter is θ.
Size of test = max_{θ ∈ Θ_0} γ_C(θ).
Power at an alternative θ ∈ Θ_1 = γ_C(θ).
Neyman-Pearson paradigm: Choose the size of the test to be reasonably small to protect against Type I error, typically 0.05 or 0.01. Among tests that have the prescribed size, choose the most powerful test.
What is the most powerful test?
Example: Consider one random variable X that has a binomial distribution with n = 5 and p = θ. Suppose we want to test
H_0: θ = 0.5 vs. H_1: θ = 0.75.
Let f(x; θ) denote the pmf of X. The following table gives, at points of positive probability mass, the values of f(x; 0.5), f(x; 0.75), and the ratio f(x; 0.5)/f(x; 0.75).
x   f(x; 0.5)   f(x; 0.75)   f(x; 0.5)/f(x; 0.75)
0   1/32        1/1024       32/1
1   5/32        15/1024      32/3
2   10/32       90/1024      32/9
3   10/32       270/1024     32/27
4   5/32        405/1024     32/81
5   1/32        243/1024     32/243
What is the best critical region of size α = 1/32?
Two critical regions have size α = 1/32:
(1) C_1 = {X = 0}
(2) C_2 = {X = 5}
Power of C_1 = P(X = 0; θ = 0.75) = 1/1024.
Power of C_2 = P(X = 5; θ = 0.75) = 243/1024.
The test with critical region C_2 = {X = 5} is the most powerful test of size α = 1/32.
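The table and the two powers can be recomputed directly from the binomial pmf; a short sketch (our code, not the notes'):

```python
from math import comb

def pmf(x, n, theta):
    """Binomial(n, theta) probability mass at x."""
    return comb(n, x) * theta ** x * (1 - theta) ** (n - x)

n = 5
for x in range(n + 1):
    f0, f1 = pmf(x, n, 0.5), pmf(x, n, 0.75)
    print(x, f0, f1, f0 / f1)

# Both {X = 0} and {X = 5} have size 1/32 under H_0,
# but their powers under theta = 0.75 differ sharply.
power_C1 = pmf(0, n, 0.75)  # = 1/1024
power_C2 = pmf(5, n, 0.75)  # = 243/1024
```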
The Neyman-Pearson Lemma provides a systematic way of
finding most powerful tests for testing a simple null
hypothesis versus a simple alternative hypothesis.
Theorem 8.1.1 (Neyman-Pearson Lemma): Let X_1, ..., X_n be an iid sample from the pdf or pmf f(x; θ). Suppose we want to test
H_0: θ = θ′ vs. H_1: θ = θ″.
Let L(θ; X_1, ..., X_n) = f(X_1; θ) ⋯ f(X_n; θ) denote the likelihood. Then
(1) any test with a critical region C of the following form (where k is some positive number) is a most powerful test of size α:
(a) L(θ′; X_1, ..., X_n)/L(θ″; X_1, ..., X_n) ≤ k for each point (X_1, ..., X_n) ∈ C;
(b) L(θ′; X_1, ..., X_n)/L(θ″; X_1, ..., X_n) ≥ k for each point (X_1, ..., X_n) ∈ Cᶜ;
(c) α = P_{H_0}[(X_1, ..., X_n) ∈ C].
(2) A necessary condition for a test to be a most powerful test of level α is that it satisfies conditions (a), (b), and (c).
Proof: We will follow the proof in the textbook for (1). Remark 8.1.1 discusses (2).
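For the binomial example above, the lemma's conclusion can also be checked by brute force: enumerate every nonrandomized critical region of exact size 1/32 and confirm that the region picked out by the smallest likelihood ratio has the largest power. A sketch using exact rational arithmetic (our code, not part of the notes):

```python
from fractions import Fraction
from itertools import combinations
from math import comb

def pmf(x, theta):
    """Binomial(5, theta) probability mass at x, computed exactly."""
    return comb(5, x) * theta ** x * (1 - theta) ** (5 - x)

f0 = {x: pmf(x, Fraction(1, 2)) for x in range(6)}  # null pmf
f1 = {x: pmf(x, Fraction(3, 4)) for x in range(6)}  # alternative pmf

# All nonrandomized critical regions with size exactly 1/32 under H_0
regions = [C for r in range(1, 7) for C in combinations(range(6), r)
           if sum(f0[x] for x in C) == Fraction(1, 32)]

# The most powerful such region is {X = 5}, where the ratio f0/f1 is
# smallest -- exactly as the Neyman-Pearson Lemma predicts
best = max(regions, key=lambda C: sum(f1[x] for x in C))
print(regions, best)
```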