PHP 2510
Hypothesis testing: One sample
We have discussed methods of point and interval estimation for a
parameter of interest.
Researchers often have preconceived ideas about what the
parameter might be and wish to test whether the data are
consistent with these ideas.
PHP 2510 – October 29, 2009
Example: Testing hypothesis about a mean
Suppose that the average birthweight of full-term, live-born infants
is 120 oz.
A researcher hypothesizes that mothers with low socioeconomic
status (SES) deliver babies whose birthweights are lower than
“normal”.
Two hypotheses are considered.
The average birthweight of these newborns is 120 oz
The average birthweight of these newborns is lower than 120 oz
Procedure for hypothesis testing
1. State your hypothesis about a parameter of interest. This is
usually framed as a null hypothesis and an alternative
hypothesis. Example:
Null hypothesis: mean is 120 oz
Alternative hypothesis: mean is lower than 120 oz
2. Collect data and compute an estimate of the parameter.
3. Draw conclusion based on whether the estimate is close to the
null hypothesis value.
The primary methodologic issue is: how do we define ‘close’ ?
That comes later.
Back to birthweight example
Let µ denote the average birthweight of these newborns.
Null hypothesis: µ = 120 oz
Alternative hypothesis: µ < 120 oz
Conducting the test
After collecting the data and testing the hypothesis, decide whether
to accept or reject the null hypothesis.
In this case, if we accept that µ = 120, we say that these
newborns have normal birthweight.
If we reject, we conclude that these newborns have lower
birthweight, and the associated risks should be investigated further.
Four possible outcomes can occur:
(1) We accept the null hypothesis when the null hypothesis is in
fact true.
(2) We reject the null hypothesis when the null hypothesis is in
fact true.
In this case, we say that we make a Type I error.
(3) We accept the null hypothesis when the alternative hypothesis
is in fact true.
In this case, we say that we make a Type II error.
(4) We reject the null hypothesis when the alternative hypothesis
is in fact true.
Testing hypothesis with data
Consider the birthweight example. Collect data on 100 such newborns,
and find the following:
X̄ = µ̂ = 115
S = σ̂ = 24
To test the hypothesis, determine how far X̄ is from the null
hypothesis value.
• If it is far away, reject the null
• If it is close, do not reject
What do we mean by ‘close’ ?
1. Testing hypotheses with a confidence interval
We can draw some conclusions by forming a confidence interval.
A 95% interval is
\[
\bar{X} \pm 1.96 \times \hat{\sigma}/\sqrt{n} \;=\; 115 \pm 1.96 \times 24/10 \;\Rightarrow\; (110.3,\ 119.7)
\]
Use the interval to draw a conclusion about the null hypothesis
that µ = 120.
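The interval calculation can be checked numerically. The following is a minimal sketch, not part of the original slides; the function name z_confidence_interval is hypothetical, and it assumes the normal approximation with the sample SD used in place of σ.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(xbar, sd, n, level=0.95):
    """Normal-approximation confidence interval for a mean (hypothetical helper)."""
    # Two-sided critical value, about 1.96 when level = 0.95
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    half_width = z * sd / sqrt(n)
    return xbar - half_width, xbar + half_width

# Birthweight example: sample mean 115, sample SD 24, n = 100
low, high = z_confidence_interval(xbar=115, sd=24, n=100)
print(f"95% CI: ({low:.1f}, {high:.1f})")
print("null value 120 inside the interval:", low <= 120 <= high)
```

If the null value falls outside the interval, the data are not consistent with µ = 120 at the corresponding two-sided level.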
2. Testing hypotheses with a test statistic
The usual approach:
1. State null hypothesis about a parameter of interest (e.g.
µ = 120 oz)
2. Decide on a statistic that will estimate µ (like X̄)
3. Characterize the random variation of your statistic under the
assumption that the null hypothesis is true (e.g. if the true
value of µ is 120, what is the distribution of X̄?). This is the
crucial step.
4. Collect data and compute a value for the statistic.
5. Compare the observed value of your statistic to its
distribution under the null hypothesis
(a) If the observed value is consistent with the distribution of the
statistic under the hypothesis, accept the hypothesis
(b) If not, then reject the hypothesis
Example: Test hypothesis about birthweight
Before we collect data, we need to characterize the distribution of
X̄ if the null hypothesis is true. The null hypothesis mean is
usually denoted by µ0.
Using the central limit theorem, we know that for any sample mean,
\[
\bar{X} \sim N\left(\mu_0, \frac{\sigma^2}{n}\right)
\]
If the null hypothesis is true, then µ = µ0 = 120, and we expect
that
\[
\bar{X} \sim N\left(120, \frac{\sigma^2}{n}\right)
\]
Equivalently, we expect that
\[
\frac{\bar{X} - 120}{\sigma/\sqrt{n}} \sim N(0, 1)
\]
Accept or reject?
Once I observe X̄, I need to make a decision to accept or reject the
hypothesis. Remember that the hypothesis is either true or false!
I will make a decision to accept or reject the null hypothesis based
on X̄.
The question I will ask is, ‘If the true mean is 120, what is the
probability of observing X̄, or something farther from 120 than X̄?’
If the probability is low, then I reject the null. The threshold for
‘low’ can vary, depending on the setting.
Carrying out the test
The probability of interest is calculated using the null distribution
of X̄; that is, its distribution when the null hypothesis is true.
The standardized distance from X̄ to µ0 is
\[
Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} = \frac{115 - 120}{24/\sqrt{100}} = -2.08
\]
The associated probability is P(Z < −2.08) = .02. This is a
one-sided p-value because the only alternatives we consider are in
the downward direction (i.e., lower birthweight than the nationwide
average).
The two-sided p-value is
P(Z < −2.08) + P(Z > 2.08) = .02 + .02 = .04.
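The test statistic and both p-values can be reproduced in a few lines. This is a sketch under the slide's assumptions (normal null distribution, sample SD 24 standing in for σ); the function name one_sample_z_test is hypothetical and not from the original slides.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z_test(xbar, mu0, sd, n):
    """Return z with lower-tail, upper-tail, and two-sided p-values (hypothetical helper)."""
    z = (xbar - mu0) / (sd / sqrt(n))      # standardized distance from xbar to mu0
    p_lower = NormalDist().cdf(z)          # P(Z < z), used when H1 is mu < mu0
    p_upper = 1 - p_lower                  # P(Z > z), used when H1 is mu > mu0
    p_two_sided = 2 * min(p_lower, p_upper)
    return z, p_lower, p_upper, p_two_sided

# Birthweight example: xbar = 115, mu0 = 120, SD = 24, n = 100
z, p_lower, p_upper, p_two_sided = one_sample_z_test(xbar=115, mu0=120, sd=24, n=100)
print(f"z = {z:.2f}")
print(f"one-sided p-value = {p_lower:.2f}")
print(f"two-sided p-value = {p_two_sided:.2f}")
```

The code keeps z unrounded, so the third decimal of the p-values may differ slightly from a table lookup at z = −2.08.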
[Figure: Histogram of Sample Means under mu = 120, n = 100; vertical axis: Frequency]
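A histogram like this one can be approximated by simulation. The sketch below is not the code used for the original figure. It assumes a normal population with σ = 24 (the sample SD from the birthweight example), 1000 replicate samples, and an arbitrary random seed. The function name is hypothetical.

```python
import random
from math import sqrt
from statistics import mean, stdev

def simulate_sample_means(mu=120, sigma=24, n=100, replicates=1000, seed=2510):
    """Draw `replicates` samples of size n from N(mu, sigma^2); return their means."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [mean([rng.gauss(mu, sigma) for _ in range(n)])
            for _ in range(replicates)]

sample_means = simulate_sample_means()
print(f"mean of sample means: {mean(sample_means):.2f}")
print(f"SD of sample means:   {stdev(sample_means):.2f} "
      f"(theory: {24 / sqrt(100):.2f})")

# How often does a sample mean as low as the observed 115 occur under H0?
share_below = sum(m <= 115 for m in sample_means) / len(sample_means)
print(f"share of simulated means at or below 115: {share_below:.3f}")
```

The simulated means cluster around 120 with a spread close to σ/√n, and only a small share fall at or below 115.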
Types of Hypotheses
Simple vs simple
H0 : µ = µ0 vs H1 : µ = µ1 .
e.g.
H0 : µ = 3 vs H1 : µ = 5.
H0 : µ = 3 vs H1 : µ = 1.
H0 : average birthweight is µ = 120 vs
H1 : average birthweight is µ = 110.
Simple vs Composite
H0 : µ = µ0 vs H1 : µ > µ0 .
H0 : µ = µ0 vs H1 : µ < µ0 .
H0 : µ = µ0 vs H1 : µ ≠ µ0 .
e.g.
H0 : µ = 5 vs H1 : µ < 5.
H0 : average birthweight is µ = 120 vs
H1 : average birthweight is µ < 120.
This is the type of hypothesis we see most commonly.
Type I error rate
α = Type I error rate
  = P(reject null hypothesis | null is true)
A Type I error can be made only if the null hypothesis is true.
We can set α to be 0.01, 0.05, 0.10, ... The most commonly used value
is 0.05.
We have control over α. Why?
Recall that we assume H0 is true before drawing a conclusion.
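One way to see that α is under our control is to generate data for which H0 is true and count how often the test rejects. The sketch below is illustrative only. It assumes a normal population with known σ = 24, n = 100, a lower-tailed z test at α = 0.05, and 4000 simulated studies. The function name is hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist

def type_one_error_rate(mu0=120, sigma=24, n=100, alpha=0.05,
                        replicates=4000, seed=1):
    """Estimate P(reject H0 | H0 true) for the lower-tailed z test by simulation."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(alpha)  # lower-tail cutoff, about -1.645 for alpha = 0.05
    rejections = 0
    for _ in range(replicates):
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]  # data generated under H0
        z = (sum(sample) / n - mu0) / (sigma / sqrt(n))
        if z < z_crit:
            rejections += 1
    return rejections / replicates

rate = type_one_error_rate()
print(f"estimated Type I error rate: {rate:.3f} (target alpha = 0.05)")
```

Because the rejection cutoff is computed from the null distribution, the long-run rejection rate under H0 should sit close to whatever α we choose.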
One-tailed test
A one-tailed test is one that locates the rejection region in only one
tail of the sampling distribution of the test statistic.
To detect H1 : µ > µ0 , place the rejection region in the upper tail of
the distribution of X̄ under H0 .
To detect H1 : µ < µ0 , place the rejection region in the lower tail of
the distribution of X̄ under H0 .
Consider the case, H0 : µ = µ0 versus H1 : µ < µ0 .
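H0 will be rejected for small values of X̄. But how small? Suppose that
H0 is rejected for all values of X̄ < c. Then
\begin{align*}
\alpha &= P(\text{reject null} \mid \text{null is true}) \\
       &= P\left(\bar{X} < c \mid \bar{X} \sim N(\mu_0, \sigma^2/n)\right) \\
       &= P\left(\frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} < \frac{c - \mu_0}{\sigma/\sqrt{n}} \;\middle|\; \bar{X} \sim N(\mu_0, \sigma^2/n)\right) \\
       &= P\left(Z < \frac{c - \mu_0}{\sigma/\sqrt{n}}\right)
\end{align*}
Let zα satisfy P(Z > zα) = α. By symmetry of the standard normal,
P(Z < −zα) = α, so
\[
\frac{c - \mu_0}{\sigma/\sqrt{n}} = -z_\alpha
\quad\Rightarrow\quad
c = \mu_0 - z_\alpha \, \sigma/\sqrt{n}
\]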
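As a numerical illustration of the cutoff c = µ0 − zα σ/√n, the sketch below applies it to the birthweight example. It assumes α = 0.05 and treats the sample SD of 24 as the known σ; both choices are assumptions made for illustration, and the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def lower_tail_cutoff(mu0, sigma, n, alpha):
    """Cutoff c such that P(Xbar < c | mu = mu0) = alpha."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # P(Z > z_alpha) = alpha
    return mu0 - z_alpha * sigma / sqrt(n)

c = lower_tail_cutoff(mu0=120, sigma=24, n=100, alpha=0.05)
print(f"reject H0 when the sample mean is below {c:.2f}")
print("observed mean 115 falls in the rejection region:", 115 < c)

# Sanity check: under H0 the probability of landing below c is alpha
prob_below_c = NormalDist(mu=120, sigma=24 / sqrt(100)).cdf(c)
print(f"P(Xbar < c | H0) = {prob_below_c:.3f}")
```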
Two-tailed test
A two-tailed test is one that locates the rejection regions in both
tails of the sampling distribution of the test statistic.
To detect H1 : µ ≠ µ0 , place the rejection region in both the upper
and lower tails of the distribution of X̄ under H0 .
To test H0 : µ = µ0 versus H1 : µ ≠ µ0 , H0 will be rejected for
small X̄ < c1 or large X̄ > c2 .
In this case
α = P (reject null | null is true)
  = P (X̄ < c1 or X̄ > c2 | X̄ ∼ N (µ0 , σ²/n))
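Usually, α is split equally between the two tails, so that
\begin{align*}
\alpha/2 &= P\left(\bar{X} < c_1 \mid \bar{X} \sim N(\mu_0, \sigma^2/n)\right) \\
         &= P\left(\frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} < \frac{c_1 - \mu_0}{\sigma/\sqrt{n}} \;\middle|\; \bar{X} \sim N(\mu_0, \sigma^2/n)\right) \\
         &= P\left(Z < \frac{c_1 - \mu_0}{\sigma/\sqrt{n}}\right)
\end{align*}
Let zα/2 satisfy P(Z > zα/2) = α/2. Then
\[
c_1 = \mu_0 - z_{\alpha/2} \, \sigma/\sqrt{n}
\]
Similarly
\[
c_2 = \mu_0 + z_{\alpha/2} \, \sigma/\sqrt{n}
\]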
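The two cutoffs can be computed the same way as in the one-tailed case. This is a sketch for the birthweight numbers, assuming α = 0.05 and σ = 24 treated as known; the function name two_tail_cutoffs is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def two_tail_cutoffs(mu0, sigma, n, alpha):
    """Cutoffs (c1, c2) that put probability alpha/2 in each tail under H0."""
    z_half = NormalDist().inv_cdf(1 - alpha / 2)  # P(Z > z_half) = alpha/2
    margin = z_half * sigma / sqrt(n)
    return mu0 - margin, mu0 + margin

c1, c2 = two_tail_cutoffs(mu0=120, sigma=24, n=100, alpha=0.05)
print(f"reject H0 when the sample mean is below {c1:.2f} or above {c2:.2f}")

xbar = 115  # observed sample mean from the birthweight example
print("reject H0:", xbar < c1 or xbar > c2)
```

Note that the lower cutoff here sits farther from µ0 than the one-tailed cutoff at the same α, because only α/2 is placed in each tail.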
p-value
Alternatively, we can draw conclusions using a p-value.
The p-value can be thought of as the probability, under the null
hypothesis, of a result as extreme as or more extreme than the one
actually observed.
Example:
A certain brand of cigarettes is advertised by the manufacturer as
having a mean nicotine content of 15 mg/cigarette. A sample of 200
cigarettes is tested by a lab and found to have an average of 16.2 mg
of nicotine with SD = 3.6 mg.
Using a 0.01 level of significance, can we conclude that the actual
mean nicotine content of this brand is greater than 15 mg?
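One possible way to work this example is an upper-tailed z test of H0: µ = 15 versus H1: µ > 15. The sketch below is not a solution given in the slides. It assumes the normal approximation holds with n = 200 and uses the sample SD of 3.6 in place of σ.

```python
from math import sqrt
from statistics import NormalDist

mu0, xbar, sd, n, alpha = 15, 16.2, 3.6, 200, 0.01

z = (xbar - mu0) / (sd / sqrt(n))          # standardized distance above mu0
z_crit = NormalDist().inv_cdf(1 - alpha)   # upper-tail cutoff, about 2.33
p_value = NormalDist().cdf(-z)             # P(Z > z), computed by symmetry

print(f"z = {z:.2f}, critical value = {z_crit:.2f}")
print(f"one-sided p-value = {p_value:.2g}")
print("reject H0 at the 0.01 level:", z > z_crit)
```

Under these assumptions the observed mean lies well above the cutoff, so the data support the claim that the mean nicotine content exceeds 15 mg.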
Type II error rate
β = Type II error rate
  = P(accept null hypothesis | alternative is true)
In general, we do not have control over the Type II error rate β. Why?
So, we say “we fail to reject the null hypothesis” instead of “we
accept the null”.
When we test “simple vs simple” hypotheses, we can determine the
Type II error rate.
Simple vs simple hypotheses
Consider H0 : µ = µ0 vs H1 : µ = µ1 .
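To illustrate why β can be determined in the simple vs simple case, the sketch below computes it for the earlier example H0: µ = 120 vs H1: µ = 110. The values σ = 24, n = 100, and α = 0.05 are assumptions carried over from the birthweight example and are not stated on this slide. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def type_two_error_rate(mu0, mu1, sigma, n, alpha):
    """beta for the lower-tailed z test of H0: mu = mu0 vs H1: mu = mu1 (mu1 < mu0)."""
    se = sigma / sqrt(n)
    # Reject H0 when Xbar < c, with c chosen so that the Type I error rate is alpha
    c = mu0 - NormalDist().inv_cdf(1 - alpha) * se
    # beta = P(fail to reject | mu = mu1) = P(Xbar >= c | Xbar ~ N(mu1, se^2))
    return 1 - NormalDist(mu=mu1, sigma=se).cdf(c)

beta = type_two_error_rate(mu0=120, mu1=110, sigma=24, n=100, alpha=0.05)
print(f"beta = {beta:.4f}, power = {1 - beta:.4f}")
```

The calculation is possible only because H1 names a single value µ1, which fixes the distribution of X̄ under the alternative.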