Notes 22 - Wharton Statistics
Statistics 512 Notes 22: Wrap-up of Sufficiency, Most Powerful Tests
Rao-Blackwell Theorem:
Theorem 7.3.1 (stated a little differently): Let X_1, ..., X_n be an iid sample from the pdf or pmf f(x; θ), θ ∈ Θ. Let u(X_1, ..., X_n) be a sufficient statistic for θ, and let θ̂ = W(X_1, ..., X_n) be an estimator of θ. Because u(X_1, ..., X_n) is a sufficient statistic for θ,
θ̃ = E(θ̂ | u(X_1, ..., X_n) = u)
is a function of X_1, ..., X_n that does not depend on θ. The theorem is that for all θ,
MSE_θ(θ̃) ≤ MSE_θ(θ̂).
The inequality is strict unless θ̃ = θ̂.
Note: If u(X_1, ..., X_n) is not a sufficient statistic for θ, then E(θ̂ | u(X_1, ..., X_n) = u; θ) is not an estimator, since it depends on θ.
Application of the Rao-Blackwell Theorem:
Suppose X_1, X_2, X_3 are iid Bernoulli random variables with success probability p. Consider the estimator
p̂ = (X_1 + 2X_2 + 3X_3)/6.
Summing over the possible samples (X_1, X_2, X_3) ∈ {(0,0,0), (0,0,1), (0,1,0), (0,1,1), (1,0,0), (1,0,1), (1,1,0), (1,1,1)}, we have
MSE_p(p̂) = (1 − p)³(0 − p)² + p(1 − p)²(1/2 − p)² + p(1 − p)²(1/3 − p)²
  + p²(1 − p)(5/6 − p)² + p(1 − p)²(1/6 − p)² + p²(1 − p)(2/3 − p)²
  + p²(1 − p)(1/2 − p)² + p³(1 − p)².
p̂ is not a function of the sufficient statistic Y = X_1 + X_2 + X_3. We can use the Rao-Blackwell Theorem to improve p̂. Consider the estimator
p̃ = E[(X_1 + 2X_2 + 3X_3)/6 | X_1 + X_2 + X_3 = y].
We have
E[(X_1 + 2X_2 + 3X_3)/6 | X_1 + X_2 + X_3 = 0] = 0
E[(X_1 + 2X_2 + 3X_3)/6 | X_1 + X_2 + X_3 = 1] = 1/3
E[(X_1 + 2X_2 + 3X_3)/6 | X_1 + X_2 + X_3 = 2] = 2/3
E[(X_1 + 2X_2 + 3X_3)/6 | X_1 + X_2 + X_3 = 3] = 1.
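These conditional expectations can be verified by direct enumeration: given Y = y, every arrangement of y successes among the three trials is equally likely, so the conditional expectation does not depend on p. A quick sketch in Python (our code, not part of the original notes):

```python
from fractions import Fraction
from itertools import product

# p-hat evaluated at a sample point (x1, x2, x3), kept exact with Fraction
def p_hat(x):
    return Fraction(x[0] + 2 * x[1] + 3 * x[2], 6)

# E[p-hat | Y = y]: given the sum y, each arrangement of successes
# is equally likely, so we can just average p-hat over those points
cond = {}
for y in range(4):
    points = [x for x in product((0, 1), repeat=3) if sum(x) == y]
    cond[y] = sum(p_hat(x) for x in points) / len(points)

for y, value in cond.items():
    print(y, value)  # 0 -> 0, 1 -> 1/3, 2 -> 2/3, 3 -> 1
```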
p (1  p )
. Here is a
3
comparison of the MSEs for p and p̂ for some values of
p.
p
MSE ( pˆ )
MSE ( pˆ )
.25
0.073
0.063
.5
0.097
0.083
Thus, p  X . We have MSE p ( p ) 
p
p
.75
0.073
0.063
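The table entries can be reproduced by brute-force enumeration over the eight sample points; a minimal Python sketch (function names are ours, not the notes'):

```python
from itertools import product

def mse(estimator, p):
    """MSE of an estimator of p, summing over the 8 samples of 3 Bernoulli(p) trials."""
    total = 0.0
    for x in product((0, 1), repeat=3):
        prob = p ** sum(x) * (1 - p) ** (3 - sum(x))
        total += prob * (estimator(x) - p) ** 2
    return total

p_hat = lambda x: (x[0] + 2 * x[1] + 3 * x[2]) / 6
p_tilde = lambda x: sum(x) / 3  # the Rao-Blackwellized estimator, i.e. the sample mean

for p in (0.25, 0.5, 0.75):
    print(f"p={p}: MSE(p_hat)={mse(p_hat, p):.4f}, MSE(p_tilde)={mse(p_tilde, p):.4f}")
```

For every p, the enumeration recovers MSE_p(p̃) = p(1 − p)/3 and gives a strictly smaller MSE than p̂, as the Rao-Blackwell Theorem guarantees.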
Limitation of the Rao-Blackwell Theorem for Finding a Best Estimator: Suppose there are two estimators θ̂_1 and θ̂_2 having the same expectation. Assuming that a sufficient statistic Y exists, we may construct two other estimators θ̃_1 and θ̃_2 by conditioning on Y. The Rao-Blackwell Theorem gives no clue as to which of these two is better. If the probability distribution of Y has a property called completeness, then θ̃_1 and θ̃_2 are identical, by a theorem of Lehmann and Scheffé. This topic is pursued in the rest of Chapter 7, but we shall not cover it.
Optimal Hypothesis Testing
Review on Hypothesis Testing
Goal: Decide between two hypotheses about a parameter of interest θ:
H_0: θ ∈ Θ_0
H_1: θ ∈ Θ_1,
where Θ_0 and Θ_1 are disjoint subsets of Θ.
Null vs. Alternative Hypothesis: The alternative hypothesis is the hypothesis for which we are looking for strong evidence. The null hypothesis is the default hypothesis that we retain unless there is strong evidence for the alternative hypothesis.
Critical region: A test is defined by its critical region. Let S denote the support of the random sample (X_1, ..., X_n). The subset C of S for which we reject the null hypothesis is called the critical region, i.e., our decision rule is
Reject H_0 if (X_1, ..., X_n) ∈ C
Retain (do not reject) H_0 if (X_1, ..., X_n) ∈ Cᶜ.
Note: Here I am following the book in defining a critical region in terms of the sample space rather than a test statistic as I did in Notes 5-6.
Errors in hypothesis testing:

                        True State of Nature
Decision                H_0 is true       H_1 is true
Reject H_0              Type I error      Correct decision
Accept (retain) H_0     Correct decision  Type II error
The best critical region would make the probability of a Type I error small when H_0 is true and the probability of a Type II error small when H_1 is true. But in general there is a tradeoff between these two types of errors.
Size of test, power of test: The power function of a test with critical region C is
γ_C(θ) = P_θ((X_1, ..., X_n) ∈ C) = probability of rejecting the null hypothesis when the true parameter is θ.
Size of test = max_{θ ∈ Θ_0} γ_C(θ).
Power at an alternative θ ∈ Θ_1 = γ_C(θ).
Neyman-Pearson paradigm: Choose the size of the test to be reasonably small to protect against Type I error, typically 0.05 or 0.01. Among tests that have the prescribed size, choose the most powerful test.
What is the most powerful test?
Example: Consider one random variable X that has a binomial distribution with n = 5 and p = θ. Suppose we want to test
H_0: θ = 0.5 vs. H_1: θ = 0.75.
Let f(x; θ) denote the pmf of X. The following table gives, at points of positive probability mass, the values of f(x; 0.5), f(x; 0.75), and the ratio f(x; 0.5)/f(x; 0.75).
x   f(x; 0.5)   f(x; 0.75)   f(x; 0.5)/f(x; 0.75)
0   1/32        1/1024       32/1
1   5/32        15/1024      32/3
2   10/32       90/1024      32/9
3   10/32       270/1024     32/27
4   5/32        405/1024     32/81
5   1/32        243/1024     32/243
What is the best critical region of size α = 1/32?
Two critical regions have size α = 1/32:
(1) C_1 = {X = 0}
(2) C_2 = {X = 5}
Power of C_1 = P(X = 0; θ = 0.75) = 1/1024.
Power of C_2 = P(X = 5; θ = 0.75) = 243/1024.
The test with critical region C_2 = {X = 5} is the most powerful test of size α = 1/32.
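The table and the two powers can be recomputed directly from the binomial pmf; a short sketch (our code, not the notes'):

```python
from math import comb

def pmf(x, n, theta):
    """Binomial(n, theta) probability mass at x."""
    return comb(n, x) * theta ** x * (1 - theta) ** (n - x)

n = 5
for x in range(n + 1):
    f0, f1 = pmf(x, n, 0.5), pmf(x, n, 0.75)
    print(x, f0, f1, f0 / f1)

# Both {X = 0} and {X = 5} have size 1/32 under H_0,
# but their powers under theta = 0.75 differ sharply.
power_C1 = pmf(0, n, 0.75)  # = 1/1024
power_C2 = pmf(5, n, 0.75)  # = 243/1024
```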
The Neyman-Pearson Lemma provides a systematic way of
finding most powerful tests for testing a simple null
hypothesis versus a simple alternative hypothesis.
Theorem 8.1.1 (Neyman-Pearson Lemma): Let X_1, ..., X_n be an iid sample from the pdf or pmf f(x; θ). Suppose we want to test
H_0: θ = θ′ vs. H_1: θ = θ″.
Let L(θ; X_1, ..., X_n) = f(X_1; θ) ⋯ f(X_n; θ) denote the likelihood. Then
(1) any test with a critical region C of the following form (where k is some positive number) is a most powerful test of size α:
(a) L(θ′; X_1, ..., X_n)/L(θ″; X_1, ..., X_n) ≤ k for each point (X_1, ..., X_n) ∈ C;
(b) L(θ′; X_1, ..., X_n)/L(θ″; X_1, ..., X_n) ≥ k for each point (X_1, ..., X_n) ∈ Cᶜ;
(c) α = P_{H_0}[(X_1, ..., X_n) ∈ C].
(2) A necessary condition for a test to be a most powerful test of level α is that it satisfies conditions (a), (b), and (c).
Proof: We will follow the proof in the textbook for (1). Remark 8.1.1 discusses (2).
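For the binomial example above, the lemma's conclusion can also be checked by brute force: enumerate every nonrandomized critical region of exact size 1/32 and confirm that the region picked out by the smallest likelihood ratio has the largest power. A sketch using exact rational arithmetic (our code, not part of the notes):

```python
from fractions import Fraction
from itertools import combinations
from math import comb

def pmf(x, theta):
    """Binomial(5, theta) probability mass at x, computed exactly."""
    return comb(5, x) * theta ** x * (1 - theta) ** (5 - x)

f0 = {x: pmf(x, Fraction(1, 2)) for x in range(6)}  # null pmf
f1 = {x: pmf(x, Fraction(3, 4)) for x in range(6)}  # alternative pmf

# All nonrandomized critical regions with size exactly 1/32 under H_0
regions = [C for r in range(1, 7) for C in combinations(range(6), r)
           if sum(f0[x] for x in C) == Fraction(1, 32)]

# The most powerful such region is {X = 5}, where the ratio f0/f1 is
# smallest -- exactly as the Neyman-Pearson Lemma predicts
best = max(regions, key=lambda C: sum(f1[x] for x in C))
print(regions, best)
```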