Download 1.017 Class 10: Common Distributions

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Inductive probability wikipedia , lookup

History of statistics wikipedia , lookup

Psychometrics wikipedia , lookup

Foundations of statistics wikipedia , lookup

Statistical hypothesis testing wikipedia , lookup

Resampling (statistics) wikipedia , lookup

Student's t-test wikipedia , lookup

Misuse of statistics wikipedia , lookup

Transcript
1.017/1.010 Class 16
Testing Hypotheses about a Single Population
Formulating Hypothesis Testing Problems
Hypotheses about a random variable x are often formulated in terms of its
distributional properties. Example, if property is a:
Null hypothesis H0: a = a0
Objective of hypothesis testing is to decide whether or not to reject this
hypothesis. Decision is based on estimator â of a:
Reject H0: If observed estimate â lies in rejection region
Ra0 ( aˆ  Ra 0 )
Do not reject H0: Otherwise ( aˆ  Ra 0 )
Select rejection region to obtain desired error properties:
Test Result
Do not reject H0
aˆ  Ra 0
Reject H0
aˆ  Ra 0
H0 true
P(H0|H0) =1 - 
P(~H0|H0) = 
(Type I Error)
H0 false
P(H0|~H0) = 
(Type II Error)
P(~H0|~H0) =
1- 
True situation
Type I error probability is called the test significance level.
Deriving Hypothesis Rejection Regions for Large Sample Tests
Hypothesis test is often based on a standardized statistic that depends
on unknown true property and its estimate. Basic concepts are the same
as used to derive confidence intervals (see Class 14).
An example is the z statistic:
z (aˆ , a) 
aˆ - a
SD[aˆ ]
If the estimate is unbiased E[z] = 0 and Var[z] = 1.
1
Define a rejection region Rz0 in terms of z as:
R z 0 : z (aˆ , a0 )  z L
z (aˆ , a0 )  zU
As rejection region grows Type I error increases and Type II error
decreases (test is more likely to reject hypothesis).
As rejection region shrinks Type I error decreases and Type II error
increases (test is less likely to reject hypothesis)
Usual practice is to select rejection region to insure that Type I error
probability is equal to a specified value 
For a two-sided test require that Type I error probability is distributed
equally between intervals below zL (probability = /2) and above zU
(probability = /2).
These probabilities are:
P[ z (aˆ , a)  z L | H0]  P[ z (aˆ , a0 )  z L ]  Fz ( z L ) 

2
P[ z (aˆ , a)  zU | H0]  P[ z (aˆ , a0 )  zU ]  1  Fz ( zU ) 
 
z L  Fz-1  
2

2
 
zU  Fz-1 1  
2

For large samples z (aˆ , a0 ) has a unit normal distribution. Use the
MATLAB function norminv to evaluate Fz-1 
If the definition of z is applied a two-sided rejection region Ra0 can also be
written directly in terms of the estimate â :
 
Ra 0 : aˆ  a L  a0  Fz-1   SD[aˆ ]
2
 
aˆ  aU  a0  Fz-1 1   SD[aˆ ]
2

p Values
p value is largest significance level resulting in acceptance of H0.
For a symmetric two-sided rejection region and a large sample:
2
 aˆ  a0 
p / 2  1  Fz 

 SD(aˆ ) 
 aˆ  a0 
p / 2  Fz 

 SD(aˆ ) 
aˆ  a
aˆ  a0 0
For large samples use the MATLAB function normcdf to
compute p from â and SD[ â ].
Special Case -- Sample mean
Consider hypothesis about value of population mean a = E[x]:
H0: a = E[x] = a0
Base test on sample mean estimator mx. Obtain SD[mx] from sample
standard deviation:
SD[m x ] 
SD[ x ]
N

sx
N
Example: Testing whether mean is significantly different from zero
Suppose a0 = 0,
sx = 3, N = 9, mx = 1.2 and = .05:
 0.05  3
Ra 0 : m x  a L  0  Fz-1 
 1.96

 2  9
 0.05  3
m x  aU  0  Fz-1 1 
 1.96

2  9

In this case hypothesis is not rejected since mx = 1.2 does not lie in Ra0.
The two-sided p-value is (see plot):
 m  a0 
1  p / 2  Fz  x
  Fz
s
/
N
 x

p = 0.22
3
1.2  0 

  Fz 1.2  .89
3
/
9


1-p/2
z
Copyright 2003 Massachusetts Institute of Technology
Last modified Oct. 8, 2003
4