Download Notes 7 - Wharton Statistics

Statistics 512 Notes 7
Hypothesis Testing Continued
Testing a normal mean
Example: A highway patrol officer believes that the
average speed of cars traveling over a certain stretch of
highway exceeds the posted limit of 55 mph. The speeds of
a random sample of 200 cars were recorded. The standard
deviation of speeds is known to be 5 and it is assumed that
the distribution of speeds is normal. Test the patrol
officer’s claim.
Distributions: Speeds
[Histogram of Speeds; horizontal axis from 40 to 65 mph.]

Moments
  Mean              55.8
  Std Dev           4.4676391
  Std Err Mean      0.3159098
  Upper 95% Mean    56.42296
  Lower 95% Mean    55.17704
  N                 200
Suppose $X_1, \ldots, X_n$ are iid $N(\mu, \sigma^2)$ with the variance $\sigma^2$ known.
We want to test
$H_0: \mu = \mu_0$ vs. $H_1: \mu > \mu_0$.
Consider the test statistic
$$z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$$
and critical region $C = \{z : z \ge c\}$. What do we need to choose $c$ to be so that
the size of the test is 0.05?
$$P_{\mu_0}\left( \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \ge c \right) = P(Z \ge c),$$
where $Z$ is a standard normal random variable.
Thus, we want to choose $c$ to be the 0.95 quantile of the
standard normal distribution, which equals 1.645.
For the speed limit data,
$$z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} = \frac{55.8 - 55}{5/\sqrt{200}} = 2.26.$$
Since $z > 1.645$, we reject the null hypothesis; there is
strong evidence that the average speed is above 55 mph.
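The arithmetic for the speed data can be checked with a short script. This is a minimal sketch using only Python's standard library; the variable names are illustrative and not part of the notes.

```python
from math import sqrt
from statistics import NormalDist

# Summary values taken from the highway speed example.
xbar, mu0, sigma, n = 55.8, 55.0, 5.0, 200
alpha = 0.05

z = (xbar - mu0) / (sigma / sqrt(n))      # observed test statistic
c = NormalDist().inv_cdf(1 - alpha)       # 0.95 quantile of N(0, 1)
reject = z >= c                           # critical region {z : z >= c}

print(f"z = {z:.2f}, c = {c:.3f}, reject H0: {reject}")
# prints: z = 2.26, c = 1.645, reject H0: True
```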

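Suppose we wanted to test $H_0: \mu \le \mu_0$ vs. $H_1: \mu > \mu_0$.
The size of the test with test statistic
$$z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$$
and critical region $C = \{z : z \ge c\}$ is
$$\max_{\mu \le \mu_0} P_\mu\left( \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \ge c \right).$$
We have
$$\begin{aligned}
P_\mu\left( \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \ge c \right)
&= P_\mu\left( \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \ge c + \frac{\mu_0 - \mu}{\sigma/\sqrt{n}} \right) \\
&= P\left( Z \ge c + \frac{\mu_0 - \mu}{\sigma/\sqrt{n}} \right) \\
&= 1 - \Phi\left( c + \frac{\mu_0 - \mu}{\sigma/\sqrt{n}} \right),
\end{aligned}$$
where $\Phi$ is the standard normal CDF.
Because $1 - \Phi\left( c + \frac{\mu_0 - \mu}{\sigma/\sqrt{n}} \right)$ is an
increasing function of $\mu$, the size of the test is
$$P_{\mu_0}\left( \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \ge c \right).$$
Thus a test of size 0.05 for testing
$H_0: \mu \le \mu_0$ vs. $H_1: \mu > \mu_0$ is the same as the test of size
0.05 for testing $H_0: \mu = \mu_0$ vs. $H_1: \mu > \mu_0$: the critical
region is $C = \{z : z \ge 1.645\}$, where
$$z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}.$$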
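To see numerically why the maximum over the composite null is attained at $\mu = \mu_0$, the rejection probability can be evaluated at several null values. The sketch below reuses the speed example's $\mu_0 = 55$, $\sigma = 5$, $n = 200$ as illustrative defaults; the function name `rejection_prob` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def rejection_prob(mu, mu0=55.0, sigma=5.0, n=200, c=1.645):
    """P_mu(z >= c) = 1 - Phi(c + (mu0 - mu) / (sigma / sqrt(n)))."""
    shift = (mu0 - mu) / (sigma / sqrt(n))
    return 1 - NormalDist().cdf(c + shift)

# Values of mu inside the null hypothesis mu <= mu0.
null_values = [54.0, 54.5, 54.9, 55.0]
probs = [rejection_prob(mu) for mu in null_values]
for mu, p in zip(null_values, probs):
    print(f"mu = {mu:5.1f}  P(reject) = {p:.4f}")
```

The printed probabilities increase with `mu`, and the largest one (at `mu = 55.0`) is approximately 0.05, the size of the test.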
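Power function: The power function of the test with critical
region $C = \{z : z \ge 1.645\}$ is the following:
$$\begin{aligned}
\gamma_C(\mu)
&= P_\mu\left( \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \ge 1.645 \right) \\
&= P_\mu\left( \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} + \frac{\mu_0 - \mu}{\sigma/\sqrt{n}} \ge 1.645 + \frac{\mu_0 - \mu}{\sigma/\sqrt{n}} \right) \\
&= P_\mu\left( \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \ge 1.645 + \frac{\mu_0 - \mu}{\sigma/\sqrt{n}} \right) \\
&= 1 - \Phi\left( 1.645 + \frac{\mu_0 - \mu}{\sigma/\sqrt{n}} \right).
\end{aligned}$$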
For $H_0: \mu \le 0$ vs. $H_a: \mu > 0$ and $\sigma = 1$, the power function
is plotted for $n = 10$ and $n = 100$.
[Figure: two panels, "Power when n=10" and "Power when n=100"; horizontal axis mu from -1.0 to 2.0, vertical axis Power from 0.0 to 1.0.]
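The power curves for the two sample sizes can be tabulated directly from the formula $1 - \Phi\left(1.645 + \frac{\mu_0 - \mu}{\sigma/\sqrt{n}}\right)$. This is a minimal sketch with $\mu_0 = 0$ and $\sigma = 1$; the function name `power` and the grid of $\mu$ values are illustrative choices.

```python
from math import sqrt
from statistics import NormalDist

def power(mu, n, mu0=0.0, sigma=1.0, c=1.645):
    """Power of the test {z : z >= c} at the true mean mu."""
    return 1 - NormalDist().cdf(c + (mu0 - mu) / (sigma / sqrt(n)))

grid = [-1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]
for mu in grid:
    print(f"mu = {mu:4.1f}  n=10: {power(mu, 10):.3f}  n=100: {power(mu, 100):.3f}")
```

At `mu = 0` both tests have power of about 0.05 (the size); for `mu > 0` the test with `n = 100` has higher power than the test with `n = 10`.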
Two sided tests: Suppose we want to test $H_0: \mu = \mu_0$ vs.
$H_1: \mu \ne \mu_0$. Using the test statistic
$$z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$$
still seems reasonable, but now it makes sense to reject for both very
large and very small values of $z$. We can use a critical
region of the form $C = \{z : |z| \ge c\}$. A test of size 0.05 has
critical region $C = \{z : |z| \ge 1.96\}$ because
$$P_{\mu_0}\left( \left| \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \right| \ge c \right) = P(|Z| \ge c),$$
and $P(|Z| \ge 1.96) = 0.05$ since 1.96 is the 0.975 quantile of the standard normal distribution.
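The two-sided critical value can be checked the same way. In this sketch, applying the two-sided test to the speed data is an illustration of my own, not something the notes do.

```python
from math import sqrt
from statistics import NormalDist

alpha = 0.05
c = NormalDist().inv_cdf(1 - alpha / 2)   # 0.975 quantile of N(0, 1)
size = 2 * (1 - NormalDist().cdf(c))      # P(|Z| >= c) by symmetry

# Illustration only: two-sided test applied to the speed data.
z = (55.8 - 55.0) / (5.0 / sqrt(200))
print(f"c = {c:.2f}, size = {size:.3f}, |z| = {abs(z):.2f}, reject: {abs(z) >= c}")
```

Splitting $\alpha$ equally between the two tails is what moves the cutoff from 1.645 to 1.96.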


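Duality between tests and confidence intervals
Suppose we want to test $H_0: \mu = \mu_0$ vs. $H_1: \mu \ne \mu_0$ and
use the rejection region $C = \{z : |z| \ge 1.96\}$. Then, the set of
$\mu_0$ for which $H_0: \mu = \mu_0$ is not rejected is
$$\begin{aligned}
\left\{ \mu_0 : \left| \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \right| < 1.96 \right\}
&= \left\{ \mu_0 : -1.96 < \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} < 1.96 \right\} \\
&= \left\{ \mu_0 : \bar{X} - 1.96 \frac{\sigma}{\sqrt{n}} < \mu_0 < \bar{X} + 1.96 \frac{\sigma}{\sqrt{n}} \right\},
\end{aligned}$$
which is the 95% confidence interval for $\mu$ that we have
used.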
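The duality can be checked on the speed data: a value $\mu_0$ is retained by the two-sided test exactly when it lies inside $\bar{X} \pm 1.96\,\sigma/\sqrt{n}$. This is a sketch under the known-variance assumption ($\sigma = 5$); the candidate values of $\mu_0$ and the helper name `not_rejected` are illustrative. The interval differs slightly from the "Upper/Lower 95% Mean" values in the software output, which appear to be based on the sample standard deviation rather than the known $\sigma$.

```python
from math import sqrt

xbar, sigma, n = 55.8, 5.0, 200
half_width = 1.96 * sigma / sqrt(n)
lower, upper = xbar - half_width, xbar + half_width

def not_rejected(mu0):
    """True if H0: mu = mu0 is NOT rejected by the test {z : |z| >= 1.96}."""
    return abs((xbar - mu0) / (sigma / sqrt(n))) < 1.96

candidates = [55.0, 55.2, 56.0, 56.6]
for mu0 in candidates:
    print(f"mu0 = {mu0}: not rejected = {not_rejected(mu0)}, in CI = {lower < mu0 < upper}")
print(f"95% CI: ({lower:.3f}, {upper:.3f})")
```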
In general, there is a duality between tests and confidence
intervals.
Suppose we have a family of tests of size $\alpha$ of
$H_0: \theta = \theta_0$ vs. $H_a: \theta \ne \theta_0$ for each $\theta_0$. Then
$$\{\theta_0 : \text{test of } H_0: \theta = \theta_0 \text{ vs. } H_1: \theta \ne \theta_0 \text{ is not rejected}\}$$
is a $(1 - \alpha)$ confidence interval for $\theta$.
Proof: Let
$$CI(X_1, \ldots, X_n) = \{\theta_0 : \text{test of } H_0: \theta = \theta_0 \text{ vs. } H_1: \theta \ne \theta_0 \text{ is not rejected}\}.$$
Then
$$P_{\theta_0}[\theta_0 \in CI(X_1, \ldots, X_n)] = 1 - \text{size}(\text{test of } H_0: \theta = \theta_0 \text{ vs. } H_a: \theta \ne \theta_0) = 1 - \alpha.$$
Conversely, suppose we have a $(1 - \alpha)$ confidence interval
$CI(X_1, \ldots, X_n)$ for $\theta$. Then a test of size at most $\alpha$ of
$H_0: \theta = \theta_0$ vs. $H_a: \theta \ne \theta_0$ is to reject the null hypothesis if
and only if $\theta_0$ does not belong to the confidence region.
Proof: We have $P_{\theta_0}[\theta_0 \notin CI(X_1, \ldots, X_n)] \le \alpha$ because
$CI(X_1, \ldots, X_n)$ is a $(1 - \alpha)$ confidence interval. Thus, the
test is of size at most $\alpha$.
Large sample tests for mean
One of the issues that came up in a recent municipal
election was the high cost of housing. A candidate seeking
to unseat an incumbent claimed that the average family
spends more than 30% of its annual income on housing. A
housing expert was asked to investigate the claim. A random
sample of 125 households was drawn, and each household
was asked to report the percentage of household income
spent on housing costs. Is there strong evidence in favor of
the candidate’s claim?
Distributions: Costs
[Histogram of Costs; horizontal axis from 15 to 50 percent of income.]

Moments
  Mean              31.952
  Std Dev           7.1907826
  Std Err Mean      0.6431632
  Upper 95% Mean    33.225
  Lower 95% Mean    30.679
  N                 125
We want to test $H_0: \mu \le 30$ vs. $H_a: \mu > 30$.
More generally, test $H_0: \mu \le \mu_0$ vs. $H_a: \mu > \mu_0$.
Test statistic
$$t = \frac{\bar{X} - \mu_0}{S/\sqrt{n}}, \quad \text{where } S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$$
is the sample variance.
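Consider the test with critical region $\{t : t \ge c\}$. By the
central limit theorem,
$$\begin{aligned}
P_\mu\left( \frac{\bar{X} - \mu_0}{S/\sqrt{n}} \ge c \right)
&= P_\mu\left( \frac{\bar{X} - \mu_0}{S/\sqrt{n}} + \frac{\mu_0 - \mu}{S/\sqrt{n}} \ge c + \frac{\mu_0 - \mu}{S/\sqrt{n}} \right) \\
&= P_\mu\left( \frac{\bar{X} - \mu}{S/\sqrt{n}} \ge c + \frac{\mu_0 - \mu}{S/\sqrt{n}} \right) \\
&\approx 1 - \Phi\left( c + \frac{\mu_0 - \mu}{S/\sqrt{n}} \right).
\end{aligned}$$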
Note that the approximate probability of rejecting the null
hypothesis is an increasing function of $\mu$, so that the size is
equal to the probability of rejecting the null hypothesis
when $\mu = \mu_0$. Thus, if we choose
$c = \Phi^{-1}(1 - \alpha) = z_\alpha$, where $\Phi$ is the standard normal CDF,
then the approximate size of the test that has critical region
$\{t : t \ge c\}$ is $1 - \Phi(\Phi^{-1}(1 - \alpha)) = \alpha$.
For the data on family spending on annual housing,
$$t = \frac{31.952 - 30}{7.191/\sqrt{125}} = 3.03.$$
Since $t > 1.645$, we reject the null
hypothesis at the 0.05 significance level; there is strong
evidence for the candidate's claim that the average family
spends more than 30% of its annual income on housing.
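The housing calculation can be reproduced from the summary statistics. This is a minimal sketch; the approximate p-value at the end is an extra quantity computed from the normal approximation and is not reported in the notes.

```python
from math import sqrt
from statistics import NormalDist

# Summary values from the housing cost output.
xbar, mu0, s, n = 31.952, 30.0, 7.1907826, 125

t = (xbar - mu0) / (s / sqrt(n))          # large-sample test statistic
z_alpha = NormalDist().inv_cdf(0.95)      # critical value for alpha = 0.05
approx_p = 1 - NormalDist().cdf(t)        # normal-approximation p-value

print(f"t = {t:.3f}, z_alpha = {z_alpha:.3f}, reject: {t >= z_alpha}")
print(f"approximate p-value = {approx_p:.4f}")
```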
t-test for normal mean
Suppose $X_1, \ldots, X_n$ are iid $N(\mu, \sigma^2)$ with the variance
unknown. Suppose we want to test $H_0: \mu = \mu_0$ vs.
$H_a: \mu > \mu_0$. Consider the test statistic
$$t = \frac{\bar{X} - \mu_0}{S/\sqrt{n}}.$$
The test with rejection region $\{t : t \ge t_{\alpha, n-1}\}$ [where $t_{\alpha, n-1}$ is
the $(1 - \alpha)$ quantile of the t-distribution with $n-1$ degrees of
freedom, i.e., $\alpha = P(T \ge t_{\alpha, n-1})$] has exact size $\alpha$ because
when $\mu = \mu_0$,
$$t = \frac{\bar{X} - \mu_0}{S/\sqrt{n}}$$
has a t-distribution with $n-1$
degrees of freedom.
Note the difference between the rejection rules
$\{t : t \ge t_{\alpha, n-1}\}$ and $\{t : t \ge z_\alpha\}$. The large sample rule
$\{t : t \ge z_\alpha\}$ has approximate size $\alpha$, while $\{t : t \ge t_{\alpha, n-1}\}$ has
exact size $\alpha$. Of course, we now have to assume that
$X_i$ has a normal distribution. In practice, we may not be
willing to assume that the population is normal. In general,
t critical values are larger than z critical values (i.e.,
$t_{\alpha, n-1} > z_\alpha$), so the t-test is conservative relative to the large
sample test. So in practice, many statisticians often use the
t-test even if they do not believe the data is normally
distributed. Note that $\lim_{n \to \infty} t_{\alpha, n-1} = z_\alpha$.
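The relationship $t_{\alpha, n-1} > z_\alpha$ and the convergence as $n$ grows can be seen numerically. The sketch below avoids third-party libraries by integrating the t density with Simpson's rule and solving for the quantile by bisection; the helper names (`t_pdf`, `t_cdf`, `t_upper_quantile`) and the numerical settings are assumptions of this example, not a method from the notes.

```python
from math import exp, lgamma, log, pi
from statistics import NormalDist

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_c - (df + 1) / 2 * log(1 + x * x / df))

def t_cdf(x, df, steps=2000):
    """CDF for x >= 0: 0.5 plus Simpson's rule on [0, x] (steps is even)."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_upper_quantile(alpha, df):
    """Bisection for the value q with P(T >= q) = alpha, assuming alpha < 0.5."""
    lo, hi = 0.0, 50.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if 1 - t_cdf(mid, df) > alpha:
            lo = mid          # tail still too heavy: move right
        else:
            hi = mid
    return (lo + hi) / 2

z_alpha = NormalDist().inv_cdf(0.95)
for n in [5, 10, 30, 125, 1000]:
    q = t_upper_quantile(0.05, n - 1)
    print(f"n = {n:4d}  t_(0.05, n-1) = {q:.3f}  z_0.05 = {z_alpha:.3f}")
```

The t critical values decrease toward 1.645 as `n` increases, which is why the two rejection rules nearly coincide for large samples.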
How well does the t-test work in moderate sized samples
when the data is not normal, i.e., what is its true size in
moderate sized samples? We will look at this question
using the Monte Carlo method (Section 5.8) on Thursday.
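As a preview of that question, the true size of the one-sided t-test can be estimated by simulation. The sketch below is only an illustration under assumptions of my own: the data are drawn from an Exponential(1) population (true mean 1, so $H_0: \mu = 1$ holds), $n = 30$, and the critical value 1.699 is the tabulated $t_{0.05, 29}$. The function name `estimated_size`, the seed, and the number of replications are arbitrary.

```python
import random
from math import sqrt

def estimated_size(n=30, reps=20000, crit=1.699, seed=512):
    """Fraction of simulated samples with t >= crit when H0: mu = 1 is true."""
    rng = random.Random(seed)
    mu0 = 1.0                              # true mean of Exponential(1)
    rejections = 0
    for _ in range(reps):
        x = [rng.expovariate(1.0) for _ in range(n)]
        xbar = sum(x) / n
        s2 = sum((xi - xbar) ** 2 for xi in x) / (n - 1)   # sample variance
        t = (xbar - mu0) / sqrt(s2 / n)
        if t >= crit:
            rejections += 1
    return rejections / reps

size_hat = estimated_size()
print(f"estimated size: {size_hat:.4f} (nominal 0.05)")
```

Comparing the printed estimate with the nominal 0.05 shows how far the actual size is from the intended one for this particular non-normal population and sample size.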
Review of hypothesis testing
Goal: Decide between two hypotheses about a parameter of
interest $\theta \in \Omega$:
$H_0: \theta \in \omega_0$
$H_1: \theta \in \omega_1$,
where $\omega_0 \cup \omega_1 = \Omega$.
Null vs. Alternative Hypothesis: The alternative hypothesis
is the hypothesis we are trying to see if there is strong
evidence for. The null hypothesis is the default hypothesis
that we will retain unless there is strong evidence for the
alternative hypothesis.
Test statistic and critical region: Test is defined by test
statistic and critical region. Critical region is region of
values of test statistic for which we will reject the null
hypothesis.
Errors in hypothesis testing: Type I and Type II errors.
Size of test, power of test: Power function of test =
$\gamma_C(\theta) = P_\theta(W(X_1, \ldots, X_n) \in C)$ =
Probability of rejecting null hypothesis when true
parameter is $\theta$.
Size of test = $\max_{\theta \in \omega_0} \gamma_C(\theta)$
Power at an alternative $\theta \in \omega_1$ = $\gamma_C(\theta)$
Neyman-Pearson paradigm: Choose size of test to be
reasonably small to protect against Type I error, typically
0.05 or 0.01. Among tests which have prescribed size,
choose the most powerful test.
P-values: Measure of evidence against the null hypothesis.
Smallest sized test in a family of tests for which we would
reject the null hypothesis.
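For the one-sided z-test of the speed data, this definition can be checked directly: the test of size $\alpha$ rejects exactly when $\alpha$ is at least $P(Z \ge z_{\text{obs}})$. This is a minimal sketch; the grid of $\alpha$ values and the helper name `rejects` are illustrative.

```python
from math import sqrt
from statistics import NormalDist

z = (55.8 - 55.0) / (5.0 / sqrt(200))     # observed statistic, speed data
p_value = 1 - NormalDist().cdf(z)         # P(Z >= observed z)

def rejects(alpha):
    """Does the size-alpha test {z : z >= z_alpha} reject H0?"""
    return z >= NormalDist().inv_cdf(1 - alpha)

alphas = [0.10, 0.05, 0.02, 0.01, 0.005]
for alpha in alphas:
    print(f"alpha = {alpha:<5}  reject: {rejects(alpha)}  alpha >= p: {alpha >= p_value}")
print(f"p-value = {p_value:.4f}")
```

The two columns agree for every `alpha`, so the p-value is the smallest size at which the null hypothesis would be rejected.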
In chapter 8, we will discuss how to choose most powerful
tests.