Chap 9: Testing Hypotheses &
Assessing Goodness of Fit
Section 9.1: INTRODUCTION
In section 8.2, we fitted a Poisson dist’n to
counts. This chapter provides tools for
testing hypotheses (the Neyman-Pearson Paradigm) and for
assessing how good such a fit is.
9.2: The Neyman-Pearson
Paradigm
Definitions: Null Hypothesis $H_0$ vs Alternative
(one-sided or two-sided) Hypothesis $H_A$.
The decision to reject $H_0$ in favor of $H_A$ is based on a
statistic $T(X_1, \ldots, X_n)$, a function of the sample values.
Acceptance region vs Rejection region.
Type I error vs Type II error.
Significance level & Power of the test.
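To make the last two lines concrete: a Type I error is rejecting $H_0$ when it is true, a Type II error is failing to reject $H_0$ when it is false, and the power is one minus the Type II error probability. The sketch below assumes a one-sided test on a normal mean with known standard deviation; the function name and all numbers are hypothetical illustrations, not taken from the slides.

```python
from math import sqrt
from statistics import NormalDist

def rejection_probability(mu_true, mu0, sigma, n, alpha):
    """P(reject H0) when the true mean is mu_true, for the test that rejects
    H0: mu = mu0 when Z = (Xbar - mu0) / (sigma / sqrt(n)) >= z_alpha."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha)
    # Under mu_true, Z is normal with mean `shift` and variance 1.
    shift = (mu_true - mu0) / (sigma / sqrt(n))
    return 1 - std_normal.cdf(z_alpha - shift)

alpha = 0.05  # significance level (hypothetical choice)
# Type I error probability: reject although the null value is the truth.
type_1 = rejection_probability(mu_true=0.0, mu0=0.0, sigma=1.0, n=25, alpha=alpha)
# Power at one (hypothetical) alternative value of the mean.
power = rejection_probability(mu_true=0.5, mu0=0.0, sigma=1.0, n=25, alpha=alpha)
type_2 = 1 - power  # Type II error probability at that alternative

print(f"Type I: {type_1:.3f}, power: {power:.3f}, Type II: {type_2:.3f}")
```

By construction the Type I error probability equals the significance level; the power depends on which alternative value is considered.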
9.3: Optimal Tests:
The Neyman-Pearson Lemma
Ideal: in a class of tests at a level of significance $\alpha$,
we would like to select the most powerful one.
Lemma: (simple null vs simple alternative)
Assume an LRT at level $\alpha$ that rejects $H_0$ when
$$\frac{f_0(x)}{f_A(x)} < c$$
Then any other test at level $\alpha^* \le \alpha$ will be less
or equally powerful than that LRT.
Rationale: The LR (Likelihood Ratio) measures the
relative plausibilities of the null and the
alternative. The LRT (Likelihood Ratio Test) is
optimal and rejects for small values of the LR.
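As an illustration of the likelihood ratio, here is a minimal sketch assuming i.i.d. normal observations with known standard deviation and two simple hypotheses about the mean. The function name, the means `mu0` and `muA`, and the two small samples are hypothetical.

```python
from math import exp

def likelihood_ratio(xs, mu0, muA, sigma=1.0):
    """f_0(x) / f_A(x) for i.i.d. N(mu, sigma^2) data with sigma known.
    The normalizing constants cancel, so only the exponents remain."""
    log_lr = sum(((x - muA) ** 2 - (x - mu0) ** 2) / (2 * sigma ** 2) for x in xs)
    return exp(log_lr)

# Hypothetical samples: one centered near mu0 = 0, one near muA = 1.
lr_near_null = likelihood_ratio([0.1, -0.2, 0.3], mu0=0.0, muA=1.0)
lr_near_alt = likelihood_ratio([1.1, 0.8, 1.3], mu0=0.0, muA=1.0)

print(lr_near_null, lr_near_alt)
```

A ratio above 1 means the data are more plausible under the null; a small ratio favors the alternative, which is why the LRT rejects for small values.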
9.4: The Duality of Confidence
Intervals & Hypothesis Testing
Thm A: $\forall\, \theta_0 \in \Theta$, $\exists$ a test at level $\alpha$ for $H_0: \theta = \theta_0$.
Then the set $C(X) = \{\theta : X \in A(\theta)\}$ is a $100(1-\alpha)\%$
confidence region for $\theta$, where $A(\theta_0)$ denotes
the acceptance region of the test.
Theorem B: Suppose that $C(X)$ is a $100(1-\alpha)\%$
confidence region for $\theta$;
that is, $\forall\, \theta_0$, $P[\theta_0 \in C(X) \mid \theta = \theta_0] = 1 - \alpha$.
Then an acceptance region for a test at level $\alpha$ of
$H_0: \theta = \theta_0$ is $A(\theta_0) = \{X \mid \theta_0 \in C(X)\}$
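The duality can be checked numerically. The sketch below assumes a normal mean with known standard deviation and a two-sided z-test; function names and numbers are hypothetical. A null value is accepted exactly when it lies inside the confidence interval.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(x_bar, sigma, n, alpha):
    """100(1 - alpha)% confidence interval for the mean, sigma known."""
    half_width = NormalDist().inv_cdf(1 - alpha / 2) * sigma / sqrt(n)
    return x_bar - half_width, x_bar + half_width

def accepts(x_bar, mu0, sigma, n, alpha):
    """True if the two-sided level-alpha z-test does not reject H0: mu = mu0."""
    z = (x_bar - mu0) / (sigma / sqrt(n))
    return abs(z) <= NormalDist().inv_cdf(1 - alpha / 2)

# Hypothetical data summary.
x_bar, sigma, n, alpha = 10.0, 2.0, 16, 0.05
lower, upper = z_interval(x_bar, sigma, n, alpha)

# Compare "mu0 is in the interval" with "the test accepts mu0" over a grid.
grid = [8.0 + 0.5 * k for k in range(9)]
agree = all((lower <= mu0 <= upper) == accepts(x_bar, mu0, sigma, n, alpha)
            for mu0 in grid)
print(lower, upper, agree)
```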
9.5: Generalized
Likelihood Ratio Tests
Test statistic:
$$\Lambda = \frac{\max_{\theta \in \omega_0} L(\theta)}{\max_{\theta \in \Omega} L(\theta)}$$
Theorem:
Under smoothness conditions on the PMF or PDF,
the null dist’n of $-2\log\Lambda$ tends to a chi-square
dist’n with degrees of freedom equal to $\dim\Omega - \dim\omega_0$
as the sample size tends to infinity.
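A minimal sketch of a generalized LRT, assuming i.i.d. Poisson counts and the simple null hypothesis that the rate equals a given value `lam0`. Here the full parameter space has dimension 1 and the null has dimension 0, so the limiting chi-square has 1 degree of freedom. The function name and the counts are hypothetical.

```python
from math import log, sqrt
from statistics import NormalDist, mean

def poisson_glrt(counts, lam0):
    """Return (-2 log Lambda, approximate p-value) for H0: lambda = lam0.
    The unrestricted maximum of the likelihood is at the sample mean.
    Assumes the sample mean is positive."""
    n = len(counts)
    x_bar = mean(counts)
    stat = 2 * n * (lam0 - x_bar + x_bar * log(x_bar / lam0))
    stat = max(stat, 0.0)  # guard against tiny negative rounding error
    # A chi-square with 1 df is the square of a standard normal, so
    # P(chi2_1 > stat) = 2 * (1 - Phi(sqrt(stat))).
    p_value = 2 * (1 - NormalDist().cdf(sqrt(stat)))
    return stat, p_value

counts = [3, 5, 2, 4, 6, 3, 4, 5]  # hypothetical counts, sample mean 4
stat_far, p_far = poisson_glrt(counts, lam0=2.0)
stat_at_mle, p_at_mle = poisson_glrt(counts, lam0=4.0)
print(stat_far, p_far, stat_at_mle, p_at_mle)
```

With so few observations the chi-square approximation is only rough; the example is meant to show the mechanics, not to justify the asymptotics.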
9.6: Likelihood Ratio Tests
for Multinomial Distribution
Problem:
A generalized LRT of the Goodness of Fit of a
model for Multinomial cell probabilities will be
derived.
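Here, the large sample dist’n of
$$-2\log\Lambda = 2\sum_{i=1}^{m} O_i \log\left(\frac{O_i}{E_i}\right)$$
where $O_i = n\hat{p}_i$ and $E_i = n p_i(\hat{\theta})$,
is a chi-square with m-k-1 degrees of freedom
(m = number of cells, k = number of parameters estimated under the model).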
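The statistic can be computed directly from observed and expected counts. The sketch below assumes a hypothetical three-cell model whose cell probabilities are fully specified, so no parameter is estimated (k = 0) and the degrees of freedom are m - 1 = 2. The counts and probabilities are made up for illustration.

```python
from math import log

def lr_goodness_of_fit(observed, expected):
    """-2 log Lambda = 2 * sum O_i * log(O_i / E_i).
    Cells with O_i = 0 contribute 0 to the sum."""
    return 2 * sum(o * log(o / e) for o, e in zip(observed, expected) if o > 0)

observed = [18, 55, 27]             # hypothetical cell counts, n = 100
cell_probs = [0.25, 0.50, 0.25]     # hypothetical fully specified model
n = sum(observed)
expected = [n * p for p in cell_probs]

stat = lr_goodness_of_fit(observed, expected)
df = len(observed) - 0 - 1          # m - k - 1 with k = 0
critical_value = 5.991              # chi-square 0.95 quantile, 2 df (from tables)
print(stat, df, stat >= critical_value)
```

Here the statistic falls below the tabulated 5% critical value, so this hypothetical model would not be rejected.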
9.7: The Poisson Dispersion Test
Here, the LRT statistic reduces to:
$$-2\log\Lambda = 2\sum_{i=1}^{n} x_i \log\left(\frac{x_i}{\bar{x}}\right)$$
where the $x_i$ are the given counts.
The Poisson Dispersion Test is:
$$-2\log\Lambda \approx \frac{1}{\bar{x}}\sum_{i=1}^{n} (x_i - \bar{x})^2$$
which tends to a chi-square with $\dim\Omega - \dim\omega_0 = n - 1$
degrees of freedom as the sample size tends to
infinity.
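Both forms of the statistic are easy to compute side by side. This is a sketch on hypothetical counts; the function names are illustrative and the critical value is the tabulated 5% point of a chi-square with n - 1 = 7 degrees of freedom.

```python
from math import log
from statistics import mean

def lr_dispersion(counts):
    """2 * sum x_i * log(x_i / x_bar); zero counts contribute 0."""
    x_bar = mean(counts)
    return 2 * sum(x * log(x / x_bar) for x in counts if x > 0)

def poisson_dispersion(counts):
    """(1 / x_bar) * sum (x_i - x_bar)^2, the approximation to -2 log Lambda."""
    x_bar = mean(counts)
    return sum((x - x_bar) ** 2 for x in counts) / x_bar

counts = [3, 5, 2, 4, 6, 3, 4, 5]   # hypothetical counts, sample mean 4
lr_stat = lr_dispersion(counts)
disp_stat = poisson_dispersion(counts)
critical_value = 14.067             # chi-square 0.95 quantile, 7 df (from tables)
print(lr_stat, disp_stat, disp_stat >= critical_value)
```

The two values are close, as expected from the approximation, and neither exceeds the critical value for these made-up counts.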
9.9: Probability Plots
Let the ordered sample values be denoted by the
order statistics $X_{(1)} < \ldots < X_{(n)}$.
If the underlying dist’n is uniform on [0, 1], Page 155 #17 implies that
$$E[X_{(j)}] = \frac{j}{n+1}$$
so the plot of the ordered observations against their expected
values should look linear.
For examples, please visit MINITAB.
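Without MINITAB, the plotting positions can be built by hand. The sketch below pairs each ordered observation with j/(n+1), its expected value under a uniform dist’n on [0, 1]; the function name and the sample are hypothetical, and the resulting points can be passed to any plotting tool.

```python
def uniform_plot_points(sample):
    """Pairs (expected value of the j-th order statistic, j-th ordered value)
    under a uniform dist'n on [0, 1]."""
    n = len(sample)
    ordered = sorted(sample)
    expected = [j / (n + 1) for j in range(1, n + 1)]
    return list(zip(expected, ordered))

sample = [0.62, 0.11, 0.87, 0.35]   # hypothetical observations
points = uniform_plot_points(sample)
for expected_value, observed_value in points:
    print(f"{expected_value:.2f}  {observed_value:.2f}")
```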
9.10: Tests for Normality
A goodness-of-fit test can be based on the
coefficients of SKEWNESS or KURTOSIS
but their sampling distributions are difficult
to evaluate in closed form.
We will base our goodness-of-fit test on the
linearity of the Probability Plot, as
measured by the correlation coefficient, r,
of the x and y components. Such a test
rejects for small values of r.
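A sketch of the correlation measure follows. It assumes, as a common approximation that the slides do not specify, that the x components are standard normal quantiles at j/(n+1). The function name and both samples are hypothetical, and critical values for r must come from published tables, which are not reproduced here.

```python
from math import sqrt
from statistics import NormalDist, mean

def probability_plot_r(sample):
    """Correlation between ordered observations (y) and approximate
    normal scores (x), taken here as Phi^{-1}(j / (n + 1))."""
    n = len(sample)
    ys = sorted(sample)
    xs = [NormalDist().inv_cdf(j / (n + 1)) for j in range(1, n + 1)]
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Idealized "perfectly normal" sample: exact N(10, 2) quantiles.
normal_like = [NormalDist(10, 2).inv_cdf(j / 11) for j in range(1, 11)]
# Strongly right-skewed hypothetical sample.
skewed = [1, 1, 1, 1, 2, 2, 3, 5, 9, 30]

r_normal = probability_plot_r(normal_like)
r_skewed = probability_plot_r(skewed)
print(r_normal, r_skewed)
```

The skewed sample gives a visibly smaller r, which is the direction in which the test rejects.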
9.11: Conclusion
Estimation (Chap 8) & Hypotheses Testing
(Chap 9) were introduced when fitting
probability distributions and testing models
based on the LRT (with a chi-square dist’n as
a large-sample approximation when the exact null dist’n is not available).
• estimating parameter from data
• testing hypotheses about parameter value
As a graphical method, we discussed the
Probability Plot technique.
STEPS for Testing Hypotheses:
1. Formulate hypotheses $H_0$ (null) vs $H_A$ (alternative)
2. State the test statistic & the form of the RR = rejection region
3. With a specified level $\alpha$, determine the RR
4. Calculate the test statistic from the data
5. Draw a conclusion:
• either REJECT the null hypothesis $H_0$ at level $\alpha$
• or FAIL to REJECT the null hypothesis $H_0$ at level $\alpha$
INTERPRET the conclusion in the context of the problem.
CALCULATE the p-value to strengthen the conclusion.
EXAMPLE:
5-week weight loss program
Subscriptions for a new diet program state
that the participants are expected to lose
over 22 pounds in five weeks. From the
data of 56 participants, the sample mean
and the sample standard deviation are
found to be 23.5 pounds and 10.2 pounds,
respectively. Could the statement in the
brochure be substantiated on the basis of
these findings? Test at level 5%. Calculate
the p-value and interpret the result.
SOLUTION:
0. Let $\mu$ denote the population mean weight loss from
the five weeks of participation in the program.
1. Formulation: $H_0: \mu = 22$ vs $H_A: \mu > 22$
2a. Test statistic: $Z = \frac{\bar{X} - \mu_0}{S/\sqrt{n}}$, where $\bar{X}$ = sample mean weight loss
2b. Since it’s a 1-sided test, $H_A: \mu > 22 \Rightarrow RR = \{\bar{X} \ge c\}$
3. c will be found (next slide) from the definition of $\alpha$
4. $\bar{x} = 23.5$; $\mu_0 = 22$; $s = 10.2$; $n = 56$
$$z = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{23.5 - 22}{10.2/\sqrt{56}} = 1.10 \quad \text{(observed value)}$$
Solutions (cont’d)
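In step 3, the specified level of significance 5%
determines the critical value c such that:
$$\alpha = P(\text{reject } H_0 \mid H_0 \text{ true}) = P(\bar{X} \ge c \mid \mu = 22)$$
$$= P\left(\frac{\bar{X} - \mu_0}{s/\sqrt{n}} \ge \frac{c - 22}{10.2/\sqrt{56}}\right) \approx 1 - \Phi\left(\frac{c - 22}{10.2/\sqrt{56}}\right)$$
$$\Rightarrow \frac{c - 22}{10.2/\sqrt{56}} = \Phi^{-1}(1 - \alpha) = z_\alpha = z_{0.05} = 1.645$$
$$\Rightarrow c = 22 + \frac{10.2}{\sqrt{56}} \times 1.645 = 24.24$$
Thus, $RR = \{\bar{X} \ge 24.24\}$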
Solutions (cont’d)
5. Conclusion:
$RR = \{\bar{X} \ge c\} \Leftrightarrow RR = \{Z \ge z_\alpha\}$,
i.e. $RR = \{\bar{X} \ge 24.24\} = \{Z \ge 1.645\}$
C1: $1.10 = z \notin RR = \{Z \ge 1.645\}$
Since the observed value z = 1.10 is NOT in the
Rejection Region, we fail to reject the null
hypothesis in favor of the alternative hypothesis.
C2: p-value $= P(Z \ge z_{obs}) = P(Z \ge 1.10) = 1 - \Phi(1.10) = 1 - 0.8643 = 0.1357$
Note, 0.1357 = p-value > $\alpha$ = 0.05 $\Rightarrow$ fail to reject $H_0$
The data do not provide evidence to reject the null.
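The arithmetic of this example can be re-checked with a few lines. The sketch uses only the numbers given in the problem and the large-sample normal approximation; the variable names are illustrative. The p-value may differ from 0.1357 in the fourth decimal place because the slide rounds z to 1.10 before the table lookup.

```python
from math import sqrt
from statistics import NormalDist

# Numbers from the weight-loss example.
x_bar, mu_0, s, n, alpha = 23.5, 22.0, 10.2, 56, 0.05

std_normal = NormalDist()
std_error = s / sqrt(n)

z_obs = (x_bar - mu_0) / std_error          # observed test statistic
z_alpha = std_normal.inv_cdf(1 - alpha)     # critical value on the Z scale
c = mu_0 + z_alpha * std_error              # critical value on the X-bar scale
p_value = 1 - std_normal.cdf(z_obs)         # one-sided (upper-tail) p-value
reject = z_obs >= z_alpha

print(f"z = {z_obs:.2f}, c = {c:.2f}, p-value = {p_value:.4f}, reject H0: {reject}")
```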
TESTS about pop’n MEAN:
Case 1: $X_1, \ldots, X_n \sim N(\mu, \sigma^2)$, $\sigma^2$ known

$H_0$                  $H_A$                  RR at level $\alpha$
$\mu = \mu_0$      $\mu > \mu_0$      $Z \ge z_\alpha$
$\mu = \mu_0$      $\mu < \mu_0$      $Z \le -z_\alpha$
$\mu = \mu_0$      $\mu \ne \mu_0$    $|Z| \ge z_{\alpha/2}$
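The three rejection rules for a Z statistic can be collected in one helper. This is a sketch; the function name and the labels "greater", "less" and "two-sided" are hypothetical conventions chosen for the example.

```python
from statistics import NormalDist

def z_test_rejects(z, alpha, alternative):
    """Apply the level-alpha rejection rule for a Z statistic.
    alternative: "greater" (mu > mu0), "less" (mu < mu0) or "two-sided"."""
    std_normal = NormalDist()
    if alternative == "greater":
        return z >= std_normal.inv_cdf(1 - alpha)
    if alternative == "less":
        return z <= -std_normal.inv_cdf(1 - alpha)
    if alternative == "two-sided":
        return abs(z) >= std_normal.inv_cdf(1 - alpha / 2)
    raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")

print(z_test_rejects(1.10, 0.05, "greater"))    # weight-loss example value of z
print(z_test_rejects(-1.70, 0.05, "less"))      # hypothetical z
print(z_test_rejects(2.00, 0.05, "two-sided"))  # hypothetical z
```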

TESTS about pop’n MEAN:
Case 2: Large Sample (n > 30)
Random samples come from any pop’n dist’n
with unknown mean & variance.
Table is the same for Case 1 & Case 2.
Test statistics are different for Case 1 & Case 2.
$$Z_{case1} = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \sim N(0,1)$$
$$Z_{case2} = \frac{\bar{X} - \mu_0}{s/\sqrt{n}} \approx N(0,1)$$
TESTS about pop’n MEAN:
Case 3: $X_1, \ldots, X_n \sim N(\mu, \sigma^2)$, $\sigma^2$ unknown
$$T_{case3} = \frac{\bar{X} - \mu_0}{s/\sqrt{n}} \sim t_{n-1}$$

$H_0$                  $H_A$                  RR at level $\alpha$
$\mu = \mu_0$      $\mu > \mu_0$      $T \ge t_{\alpha, n-1}$
$\mu = \mu_0$      $\mu < \mu_0$      $T \le -t_{\alpha, n-1}$
$\mu = \mu_0$      $\mu \ne \mu_0$    $|T| \ge t_{\alpha/2, n-1}$
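A sketch of Case 3 on a small hypothetical sample of 10 weight losses. The Python standard library has no t dist’n, so the critical values are taken from a t table with 9 degrees of freedom (1.833 for the upper 5% point, 2.262 for the upper 2.5% point) and passed in by hand. Function names and data are illustrative only.

```python
from math import sqrt
from statistics import mean, stdev

def t_statistic(sample, mu0):
    """T = (x_bar - mu0) / (s / sqrt(n)), with s the sample standard deviation."""
    n = len(sample)
    return (mean(sample) - mu0) / (stdev(sample) / sqrt(n))

def t_test_rejects(t, t_crit, alternative):
    """t_crit is t_{alpha, n-1} for one-sided tests and
    t_{alpha/2, n-1} for the two-sided test, read from a table."""
    if alternative == "greater":
        return t >= t_crit
    if alternative == "less":
        return t <= -t_crit
    if alternative == "two-sided":
        return abs(t) >= t_crit
    raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")

# Hypothetical sample, n = 10, sample mean 23.5.
sample = [24.1, 19.8, 26.3, 22.7, 25.0, 21.4, 23.9, 27.2, 20.6, 24.0]
t_obs = t_statistic(sample, mu0=22.0)

one_sided = t_test_rejects(t_obs, t_crit=1.833, alternative="greater")
two_sided = t_test_rejects(t_obs, t_crit=2.262, alternative="two-sided")
print(t_obs, one_sided, two_sided)
```

For these made-up data the one-sided test rejects at level 5% while the two-sided test does not, which shows why the alternative must be fixed before looking at the data.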