Statistics II Chapter 2: Hypothesis testing in one population
Chapter 2. Hypothesis testing in one population
Contents
- Introduction, the null and alternative hypotheses
- Hypothesis testing process
- Type I and Type II errors, power
- Test statistic, level of significance and rejection/acceptance regions in upper-, lower- and two-tail tests
- Test of hypothesis: procedure
- p-value
- Two-tail tests and confidence intervals
- Examples with various parameters
- Power and sample size calculations
Learning goals
At the end of this chapter you should be able to:
- Perform a test of hypothesis in a one-population setting
- Formulate the null and alternative hypotheses
- Understand Type I and Type II errors, define the significance level, define the power
- Choose a suitable test statistic and identify the corresponding rejection region in upper-, lower- and two-tail tests
- Use the p-value to perform a test
- Know the connection between a two-tail test and a confidence interval
- Calculate the power of a test and identify a sample size needed to achieve a desired power
References
- Newbold, P. "Statistics for Business and Economics", Chapter 9 (9.1-9.5)
- Ross, S. "Introduction to Statistics", Chapter 9
Test of hypothesis: introduction
A test of hypothesis is a procedure that:
- is based on a data sample
- allows us to make a decision
- about the validity of some conjecture or hypothesis about the population X, typically the value of a population parameter θ (θ can be any of the parameters we covered so far: µ, p, σ², etc.)

This hypothesis, called the null hypothesis (H0):
- Can be thought of as the hypothesis being supported (before the test is carried out)
- Will be believed unless sufficient contrary sample evidence is produced
- When sample information is collected, this hypothesis is put in jeopardy, or tested
The null hypothesis: examples
1. A manufacturer who produces boxes of cereal claims that, on average, their contents weigh at least 20 ounces. To check this claim, the contents of a random sample of boxes are weighed and inference is made.
Population: X = 'weight of a box of cereal (in oz)'
Null hypothesis, H0: µ ≥ 20 (so µ0 = 20)
Does the sample data produce evidence against H0?
2. A company receiving a large shipment of parts accepts their delivery only if no more than 50% of the parts are defective. The decision is based on a check of a random sample of these parts.
Population: X = 1 if a part is defective and 0 otherwise
X ∼ Bernoulli(p), p = proportion of defective parts in the entire shipment
Null hypothesis, H0: p ≤ 0.5 (so p0 = 0.5)
Does the sample data produce evidence against H0?
Null hypothesis, H0
- States the assumption to be tested
- We begin with the assumption that the null hypothesis is true (similar to the notion of innocent until proven guilty)
- Refers to the status quo
- Always contains a '=', '≤' or '≥' sign (closed set)
- May or may not be rejected
- Simple hypothesis (specifies a single value): H0: µ = 5, H0: p = 0.6, H0: σ² = 9. In general: H0: θ = θ0.
  Parameter space under this null: Θ0 = {θ0}
- Composite hypothesis (specifies a range of values): H0: µ ≤ 5, H0: p ≥ 0.6. In general: H0: θ ≤ θ0 or H0: θ ≥ θ0.
  Parameter space under this null: Θ0 = (−∞, θ0] or Θ0 = [θ0, ∞)
Alternative hypothesis, H1
If the null hypothesis is not true, then some alternative must be true, and in carrying out a hypothesis test, the investigator formulates an alternative hypothesis against which the null hypothesis is tested.
The alternative hypothesis H1:
- Is the opposite of the null hypothesis
- Challenges the status quo
- Never contains a '=', '≤' or '≥' sign
- May or may not be supported
- Is generally the hypothesis that the researcher is trying to support
- One-sided hypothesis:
  (upper-tail) H1: µ > 5
  (lower-tail) H1: p < 0.6
  In general: H1: θ > θ0 or H1: θ < θ0
  Parameter space under this alternative: Θ1 = (θ0, ∞) or Θ1 = (−∞, θ0)
- Two-sided hypothesis (two-tail): H1: σ² ≠ 9. In general: H1: θ ≠ θ0.
  Parameter space under this alternative: Θ1 = (−∞, θ0) ∪ (θ0, ∞)
The alternative hypothesis: examples
1. A manufacturer who produces boxes of cereal claims that, on average, their contents weigh at least 20 ounces. To check this claim, the contents of a random sample of boxes are weighed and inference is made.
Population: X = 'weight of a box of cereal (in oz)'
Null hypothesis, H0: µ ≥ 20 versus alternative hypothesis, H1: µ < 20
Does the sample data produce evidence against H0 in favour of H1?
2. A company receiving a large shipment of parts accepts their delivery only if no more than 50% of the parts are defective. The decision is based on a check of a random sample of these parts.
Population: X = 1 if a part is defective and 0 otherwise
X ∼ Bernoulli(p), p = proportion of defective parts in the entire shipment
Null hypothesis, H0: p ≤ 0.5 versus alternative hypothesis, H1: p > 0.5
Does the sample data produce evidence against H0 in favour of H1?
Hypothesis testing process
Population: X = 'height of a UC3M student (in m)'
Claim: On average, students are shorter than 1.6 m ⇒ Hypotheses:
H0: µ ≤ 1.6 versus H1: µ > 1.6
Sample (SRS): suppose the sample mean height is 1.65 m, x̄ = 1.65.
Is it likely to observe a sample mean x̄ = 1.65 if the population mean is µ ≤ 1.6?
If not likely, reject the null hypothesis in favour of the alternative.
Hypothesis testing process
- Having specified the null and alternative hypotheses and collected the sample information, a decision concerning the null hypothesis (reject or fail to reject H0) must be made.
- The decision rule is based on the value of a "distance" between the sample data we have collected and those values that would have a high probability under the null hypothesis.
- This distance is calculated as the value of a so-called test statistic (closely related to the pivotal quantities we talked about in Chapter 1). We will discuss specific cases later on.
- However, whatever decision is made, there is some chance of reaching an erroneous conclusion about the population parameter, because all that we have available is a sample and thus we cannot know for sure whether the null hypothesis is true or not.
- There are two possible states of nature and thus two errors can be committed: Type I and Type II errors.
Type I and Type II errors, power
- Type I error: to reject a true null hypothesis. A Type I error is considered a serious type of error. The probability of a Type I error is equal to α and is called the significance level.
  α = P(reject the null | H0 is true)
- Type II error: to fail to reject a false null hypothesis. The probability of a Type II error is β.
  β = P(fail to reject the null | H1 is true)
- Power: the probability of rejecting a null hypothesis that is false.
  power = 1 − β = P(reject the null | H1 is true)

Decision          | H0 true            | H0 false
Do not reject H0  | No error (1 − α)   | Type II error (β)
Reject H0         | Type I error (α)   | No error (1 − β = power)
Type I and Type II errors, power
- Type I and Type II errors cannot happen at the same time:
  - a Type I error can only occur if H0 is true
  - a Type II error can only occur if H0 is false
- If the Type I error probability (α) ⇑, then the Type II error probability β ⇓
- All else being equal:
  - β ⇑ when the difference between the hypothesized parameter value and its true value ⇓
  - β ⇑ when α ⇓
  - β ⇑ when σ ⇑
  - β ⇑ when n ⇓
- The power of the test increases as the sample size increases
- For θ ∈ Θ1: power(θ) = 1 − β; for θ ∈ Θ0: power(θ) ≤ α
Test statistic, level of significance and rejection region
Test statistic, T
- Allows us to decide whether the sample data is "likely" or "unlikely" to occur, assuming the null hypothesis is true.
- It is the pivotal quantity from Chapter 1 calculated under the null hypothesis.
- The decision in the test of hypothesis is based on the observed value of the test statistic, t.
- The idea is that, if the data provide evidence against the null hypothesis, the observed test statistic should be "extreme", that is, very unusual. It should be "typical" otherwise.
In distinguishing between "extreme" and "typical" we use:
- the sampling distribution of the test statistic
- the significance level α, to define the so-called rejection (or critical) region and the acceptance region.
Test statistic, level of significance and rejection region
Rejection region (RR) and acceptance region (AR) in size α tests (Tα denotes the upper α quantile of the null distribution of T):
- Upper-tail test H1: θ > θ0: RRα = {t : t > Tα}, ARα = {t : t ≤ Tα}
- Lower-tail test H1: θ < θ0: RRα = {t : t < T1−α}, ARα = {t : t ≥ T1−α}
- Two-tail test H1: θ ≠ θ0: RRα = {t : t < T1−α/2 or t > Tα/2}, ARα = {t : T1−α/2 ≤ t ≤ Tα/2}
(The accompanying figures show the critical value(s) separating AR from RR, with tail area α in one-tail tests and α/2 in each tail of a two-tail test.)
Test statistics
Let (X1, . . . , Xn) be a s.r.s. from a population X with mean µ and variance σ², α a significance level, zα the upper α quantile of N(0,1), µ0 the population mean under H0, etc.

Parameter  | Assumptions                   | Test statistic
Mean       | Normal data, known variance   | (X̄ − µ0)/(σ/√n) ∼ N(0, 1)
Mean       | Non-normal data, large sample | (X̄ − µ0)/(σ̂/√n) ∼ approx. N(0, 1)
Proportion | Bernoulli data, large sample  | (p̂ − p0)/√(p0(1 − p0)/n) ∼ approx. N(0, 1)
Mean       | Normal data, unknown variance | (X̄ − µ0)/(s/√n) ∼ tn−1
Variance   | Normal data                   | (n − 1)s²/σ0² ∼ χ²n−1
St. dev.   | Normal data                   | (n − 1)s²/σ0² ∼ χ²n−1
RRα in two-tail tests (for each test statistic above):
- Mean, normal data, known variance:
  RRα = { x̄ : (x̄ − µ0)/(σ/√n) < z1−α/2 or (x̄ − µ0)/(σ/√n) > zα/2 }
- Mean, non-normal data, large sample:
  RRα = { x̄ : (x̄ − µ0)/(σ̂/√n) < z1−α/2 or (x̄ − µ0)/(σ̂/√n) > zα/2 }
- Proportion, Bernoulli data, large sample:
  RRα = { p̂ : (p̂ − p0)/√(p0(1 − p0)/n) < z1−α/2 or (p̂ − p0)/√(p0(1 − p0)/n) > zα/2 }
- Mean, normal data, unknown variance:
  RRα = { x̄ : (x̄ − µ0)/(s/√n) < tn−1;1−α/2 or (x̄ − µ0)/(s/√n) > tn−1;α/2 }
- Variance (and st. dev.), normal data:
  RRα = { s² : (n − 1)s²/σ0² < χ²n−1;1−α/2 or (n − 1)s²/σ0² > χ²n−1;α/2 }
Question: How would you define RRα in upper- and lower-tail tests?
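The test statistics in the table above are simple to evaluate in code. The following is an illustrative sketch (the function names are our own, not from the text):

```python
import math

# Test statistics from the table above (illustrative helper names are our own).

def z_mean_known_sigma(xbar, mu0, sigma, n):
    # Normal data, known variance: Z = (x̄ - µ0) / (σ/√n) ~ N(0, 1)
    return (xbar - mu0) / (sigma / math.sqrt(n))

def z_proportion(phat, p0, n):
    # Bernoulli data, large sample: Z = (p̂ - p0) / √(p0(1-p0)/n) ~ approx. N(0, 1)
    return (phat - p0) / math.sqrt(p0 * (1 - p0) / n)

def t_mean_unknown_sigma(xbar, mu0, s, n):
    # Normal data, unknown variance: T = (x̄ - µ0) / (s/√n) ~ t_{n-1}
    return (xbar - mu0) / (s / math.sqrt(n))

def chi2_variance(s2, sigma2_0, n):
    # Normal data: χ² = (n-1)s² / σ0² ~ χ²_{n-1}
    return (n - 1) * s2 / sigma2_0
```

For instance, z_mean_known_sigma(5.038, 5, 0.1, 16) gives the observed statistic z = 1.52 used in Example 9.1 later in the chapter.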
Test of hypothesis: procedure
1. State the null and alternative hypotheses.
2. Calculate the observed value of the test statistic (see the formula sheet).
3. For a given significance level α, define the rejection region (RRα).
   Reject the null hypothesis H0 if the test statistic is in RRα, and fail to reject H0 otherwise.
4. Write down the conclusions in a sentence.
Upper-tail test for the mean, variance known: example
Example: 9.1 (Newbold) When a process producing ball bearings is operating correctly, the weights of the ball bearings have a normal distribution with mean 5 ounces and standard deviation 0.1 ounces. The process has been adjusted and the plant manager suspects that this has raised the mean weight of the ball bearings, while leaving the standard deviation unchanged. A random sample of sixteen bearings is selected and their mean weight is found to be 5.038 ounces. Is the manager right? Carry out a suitable test at a 5% level of significance.
Population: X = "weight of a ball bearing (in oz)", X ∼ N(µ, σ² = 0.1²)
SRS: n = 16; sample: x̄ = 5.038
Objective: test H0: µ = 5 against H1: µ > 5 (upper-tail test), with µ0 = 5
Test statistic: Z = (X̄ − µ0)/(σ/√n) ∼ N(0, 1)
Observed test statistic (σ = 0.1, µ0 = 5, n = 16, x̄ = 5.038):
z = (x̄ − µ0)/(σ/√n) = (5.038 − 5)/(0.1/√16) = 1.52
Upper-tail test for the mean, variance known: example
Example: 9.1 (cont.)
Rejection (or critical) region:
RR0.05 = {z : z > z0.05} = {z : z > 1.645}
Since z = 1.52 ∉ RR0.05, we fail to reject H0 at the 5% significance level. (In the N(0,1) density plot, z = 1.52 falls in the AR, to the left of the critical value zα = 1.645.)
Conclusion: The sample data did not provide sufficient evidence to reject the claim that the average weight of the bearings is 5 oz.
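The rejection-region decision of Example 9.1 can be reproduced with a short sketch (the helper name is ours; the critical value 1.645 = z0.05 is taken from the normal table, as on the slide):

```python
import math

def upper_tail_z_test(xbar, mu0, sigma, n, z_crit=1.645):
    # Returns the observed statistic z and whether it falls in RR = {z > z_crit}.
    z = (xbar - mu0) / (sigma / math.sqrt(n))
    return z, z > z_crit

z, reject = upper_tail_z_test(xbar=5.038, mu0=5, sigma=0.1, n=16)
# z = 1.52 and reject is False: we fail to reject H0 at the 5% level
```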
Definition of p-value
- It is the probability of obtaining a test statistic at least as extreme (≤ or ≥) as the observed one, given that H0 is true
- Also called the observed level of significance
- It is the smallest value of α for which H0 can be rejected
- Can be used in step 3 of the testing procedure with the following rule:
  - If p-value < α, reject H0
  - If p-value ≥ α, fail to reject H0
- Roughly:
  - a "small" p-value is evidence against H0
  - a "large" p-value is evidence in favour of H0
p-value
p-value when t is the observed value of the test statistic T:
- Upper-tail test H1: θ > θ0: p-value = P(T ≥ t)
- Lower-tail test H1: θ < θ0: p-value = P(T ≤ t)
- Two-tail test H1: θ ≠ θ0: p-value = P(T ≤ −|t|) + P(T ≥ |t|)
(In the accompanying density plots, the p-value is the tail area beyond the observed test statistic: the right tail in an upper-tail test, the left tail in a lower-tail test, and the sum of the two tail areas beyond ∓|t| in a two-tail test.)
p-value: example
Example: 9.1 (cont.)
Population: X = "weight of a ball bearing (in oz)", X ∼ N(µ, σ² = 0.1²)
SRS: n = 16; sample: x̄ = 5.038
Objective: test H0: µ = 5 against H1: µ > 5 (upper-tail test)
Test statistic: Z = (X̄ − µ0)/(σ/√n) ∼ N(0, 1); observed test statistic: z = 1.52
p-value = P(Z ≥ z) = P(Z ≥ 1.52) = 0.0643, where Z ∼ N(0, 1) (the area to the right of z = 1.52 under the N(0,1) density)
Since p-value = 0.0643 ≥ α = 0.05, we fail to reject H0 (but would reject at any α greater than 0.0643, e.g., α = 0.1).
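The probability P(Z ≥ 1.52) need not come from a table. A small sketch using the standard identity P(Z ≥ z) = erfc(z/√2)/2 (the helper name is ours):

```python
import math

def normal_sf(z):
    # Standard normal upper-tail probability via the complementary error function.
    return 0.5 * math.erfc(z / math.sqrt(2))

p_value = normal_sf(1.52)  # ≈ 0.0643, as on the slide
```

Since 0.0643 ≥ 0.05, the decision matches the table-based one.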
The p-value and the probability of the null hypothesis
- The p-value:
  - is not the probability of H0 nor the Type I error α;
  - but it can be used as a test statistic to be compared with α (i.e. reject H0 if p-value < α).
- We are interested in answering: how probable is the null given the data?
  - Remember that we defined the p-value as the probability of the data (or values even more extreme) given the null.
  - We cannot answer exactly.
  - But under fairly general conditions, and assuming that with no observations Pr(H0) = Pr(H1) = 1/2, then for p-values p such that p < 0.36:¹
    Pr(H0 | Observed Data) ≥ −e p ln(p) / (1 − e p ln(p))
¹ Sellke, Bayarri and Berger, The American Statistician, 2001
The p-value and the probability of the null hypothesis
This table helps to calibrate a desired p-value as a function of the probability of the null hypothesis:

p-value | Pr(H0 | Observed Data) ≥
0.1     | 0.39
0.05    | 0.29
0.01    | 0.11
0.001   | 0.02

desired Pr(H0 | Data) ≤ | required p-value
0.1                     | 0.00860
0.05                    | 0.00341
0.01                    | 0.00004
0.001                   | ≤ 0.00001

- For a p-value equal to 0.05, the null has a probability of at least 29% of being true.
- While if we want the probability of the null being true to be at most 5%, the p-value should be no larger than 0.0034.
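The bound from the previous slide is easy to evaluate directly. A sketch (the function name is our own):

```python
import math

def pr_h0_lower_bound(p):
    # Sellke, Bayarri and Berger (2001) bound, valid for p < 1/e ≈ 0.368,
    # assuming prior probabilities Pr(H0) = Pr(H1) = 1/2.
    b = -math.e * p * math.log(p)   # lower bound on the Bayes factor for H0
    return b / (1 + b)

# pr_h0_lower_bound(0.05) ≈ 0.29: a p-value of 0.05 still leaves the null
# with a probability of at least roughly 29% of being true.
```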
Confidence intervals and two-tail tests: duality
A two-tail test of hypothesis at a significance level α can be carried out
using a (two-tail) 100(1 − α)% confidence interval in the following way:
1. State the null and two-sided alternative
H0 : θ = θ0
against H1 : θ 6= θ0
2. Find a 100(1 − α)% confidence interval for θ
3. If θ0 doesn’t belong to this interval, reject the null.
If θ0 belongs to this interval, fail to reject the null.
4. Write down the conclusions in a sentence.
Two-tail test for the mean, variance known: example
Example: 9.2 (Newbold) A drill is used to make holes in sheet metal. When the drill is functioning properly, the diameters of these holes have a normal distribution with mean 2 in and a standard deviation of 0.06 in. To check that the drill is functioning properly, the diameters of a random sample of nine holes are measured. Their mean diameter was 1.95 in. Perform a two-tailed test at a 5% significance level using a CI-approach.
Population: X = "diameter of a hole (in inches)", X ∼ N(µ, σ² = 0.06²)
SRS: n = 9; sample: x̄ = 1.95
Objective: test H0: µ = 2 against H1: µ ≠ 2 (two-tail test), with µ0 = 2
100(1 − α)% = 95% confidence interval for µ:
CI0.95(µ) = x̄ ∓ 1.96 σ/√n = 1.95 ∓ 1.96 · 0.06/√9 = (1.9108, 1.9892)
Since µ0 = 2 ∉ CI0.95(µ), we reject H0 at the 5% significance level.
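Example 9.2 via the CI duality, as a sketch (the helper name is ours; 1.96 = zα/2 for α = 0.05, from the normal table):

```python
import math

def two_tail_ci_test(xbar, sigma, n, theta0, z_half=1.96):
    # Build the 100(1-α)% CI and reject H0: θ = θ0 iff θ0 falls outside it.
    half = z_half * sigma / math.sqrt(n)
    ci = (xbar - half, xbar + half)
    return ci, not (ci[0] <= theta0 <= ci[1])

ci, reject = two_tail_ci_test(xbar=1.95, sigma=0.06, n=9, theta0=2)
# ci = (1.9108, 1.9892) and reject is True, as on the slide
```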
Two-tail test for the proportion: example
Example: 9.6 (Newbold) In a random sample of 199 audit partners in U.S. accounting firms, 104 partners indicated some measure of agreement with the statement: "Cash flow from operations is a valid measure of profitability". Test at the 10% level against a two-sided alternative the null hypothesis that one-half of the members of this population would agree with the preceding statement.
Population: X = 1 if a member agrees with the statement and 0 otherwise, X ∼ Bernoulli(p)
SRS: n = 199 (large n); sample: p̂ = 104/199 = 0.523
Objective: test H0: p = 0.5 against H1: p ≠ 0.5 (two-tail test), with p0 = 0.5
Test statistic: Z = (p̂ − p0)/√(p0(1 − p0)/n) ∼ approx. N(0, 1)
Observed test statistic (p0 = 0.5, n = 199, p̂ = 0.523):
z = (p̂ − p0)/√(p0(1 − p0)/n) = (0.523 − 0.5)/√(0.5(1 − 0.5)/199) = 0.65
Two-tail test for the proportion: example
Example: 9.6 (cont.)
Rejection (or critical) region:
RR0.10 = {z : z > z0.05} ∪ {z : z < −z0.05} = {z : z > 1.645} ∪ {z : z < −1.645}
Since z = 0.65 ∉ RR0.10, we fail to reject H0 at the 10% significance level. (In the N(0,1) density plot, z = 0.65 lies in the AR between the critical values ∓zα/2 = ∓1.645.)
Conclusion: The sample data does not contain sufficiently strong evidence against the hypothesis that one-half of all audit partners agree that cash flow from operations is a valid measure of profitability.
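Example 9.6 as a sketch, including the p-value for comparison (the helper name is ours; P(Z ≥ |z|) is computed with erfc rather than a table, and p̂ is kept unrounded, so z comes out as 0.64 rather than the slide's 0.65 obtained from p̂ ≈ 0.523):

```python
import math

def two_tail_proportion_test(x, n, p0):
    # z statistic and two-tail p-value for H0: p = p0 (large-sample normal approx.)
    phat = x / n
    z = (phat - p0) / math.sqrt(p0 * (1 - p0) / n)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # = 2 P(Z >= |z|)
    return z, p_value

z, p_value = two_tail_proportion_test(x=104, n=199, p0=0.5)
# z ≈ 0.64 and p-value ≈ 0.52 > α = 0.10, so we fail to reject H0
```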
Lower-tail test for the mean, variance unknown: example
Example: 9.4 (Newbold, modified) A retail chain knows that, on average, sales in its stores are 20% higher in December than in November. For a random sample of six stores the percentages of sales increases were found to be: 19.2, 18.4, 19.8, 20.2, 20.4, 19.0. Assuming a normal population, test at a 10% significance level the null hypothesis (use a p-value approach) that the true mean percentage sales increase is at least 20, against a one-sided alternative.
Population: X = "store's increase in sales from Nov to Dec (in %)", X ∼ N(µ, σ²), σ² unknown
SRS: n = 6 (small n)
Sample: x̄ = 117/6 = 19.5, s² = (2284.44 − 6(19.5)²)/(6 − 1) = 0.588, s = √0.588 = 0.767
Objective: test H0: µ ≥ 20 against H1: µ < 20 (lower-tail test), with µ0 = 20
Test statistic: T = (X̄ − µ0)/(s/√n) ∼ tn−1
Observed test statistic (µ0 = 20, x̄ = 19.5, n = 6, s = 0.767):
t = (x̄ − µ0)/(s/√n) = (19.5 − 20)/(0.767/√6) = −1.597
Lower-tail test for the mean, variance unknown: example
Example: 9.4 (cont.)
p-value = P(T ≤ −1.597) ∈ (0.05, 0.1), because
−t5;0.05 = −2.015 < −1.597 < −1.476 = −t5;0.10
Hence, given that p-value < α = 0.1, we reject the null hypothesis at this level. (In the tn−1 density plot, the p-value is the area to the left of t = −1.597, which lies between −2.015 and −1.476.)
Conclusion: The sample data gave enough evidence to reject the claim that the average increase in sales was at least 20%.
p-value interpretation: if the null hypothesis were true, the probability of obtaining such sample data would be less than 10%, which is quite unlikely, so we reject the null hypothesis.
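Example 9.4 recomputed from the raw data with the standard library only (statistics.stdev returns the sample quasi-standard deviation s used on the slide):

```python
import math
import statistics

data = [19.2, 18.4, 19.8, 20.2, 20.4, 19.0]
n = len(data)
xbar = statistics.mean(data)            # 19.5
s = statistics.stdev(data)              # √0.588 ≈ 0.767
t = (xbar - 20) / (s / math.sqrt(n))    # ≈ -1.597
# From the t table, -t5;0.05 = -2.015 < t < -1.476 = -t5;0.10,
# so 0.05 < p-value < 0.10 and we reject H0 at α = 0.10.
```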
Lower-tail test for the mean, variance unknown: example
Example: 9.4 (cont.) in Excel: go to the menu Data, submenu Data Analysis, and choose the function "two-sample t-test with unequal variances". Put the data in column A and n repetitions of µ0 = 20 in column B; the output (highlighted in yellow on the slide) contains the observed t statistic, the p-value and tn−1;α.
Upper-tail test for the variance: example
Example: 9.5 (Newbold) In order to meet the standards in consignments of a chemical product, it is important that the variance of their percentage impurity levels does not exceed 4. A random sample of twenty consignments had a sample quasi-variance of 5.62 for impurity level percentages.
a) Perform a suitable test of hypothesis (α = 0.1).
b) Find the power of the test. What is the power at σ1² = 7?
c) What sample size would guarantee a power of 0.9 at σ1² = 7?
Population: X = "impurity level of a consignment of a chemical (in %)", X ∼ N(µ, σ²)
SRS: n = 20; sample: s² = 5.62
Objective: test H0: σ² ≤ 4 against H1: σ² > 4 (upper-tail test), with σ0² = 4
Test statistic: χ² = (n − 1)s²/σ0² ∼ χ²n−1
Observed test statistic (σ0² = 4, n = 20, s² = 5.62):
χ² = (n − 1)s²/σ0² = (20 − 1) · 5.62/4 = 26.695
Upper-tail test for the variance: example
Example: 9.5 a) (cont.)
p-value = P(χ² ≥ 26.695) ∈ (0.1, 0.25), because
χ²19;0.25 = 22.7 < 26.695 < 27.2 = χ²19;0.1
Hence, given that the p-value exceeds α = 0.1, we cannot reject the null hypothesis at this level. (In the χ²n−1 density plot, the p-value is the area to the right of χ² = 26.695.)
Conclusion: The sample data did not provide enough evidence to reject the claim that the variance of the percentage impurity levels in consignments of this chemical is at most 4.
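The bracketed p-value can also be computed without tables. A sketch: the χ²k upper-tail probability equals 1 − P(k/2, x/2), where P is the regularized lower incomplete gamma function, evaluated here by its standard power series (all names are our own):

```python
import math

def chi2_sf(x, k):
    # P(χ²_k >= x) via the series for the regularized lower incomplete gamma.
    a, y = k / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    i = 0
    while abs(term) > 1e-15 * total:
        i += 1
        term *= y / (a + i)
        total += term
    cdf = total * math.exp(-y + a * math.log(y) - math.lgamma(a))
    return 1.0 - cdf

p_value = chi2_sf(26.695, 19)
# ≈ 0.11, consistent with the (0.1, 0.25) bracket from the tables
```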
Upper-tail test for the variance: power
Example: 9.5 b) Recall that power = P(reject H0 | H1 is true).
When do we reject H0? When the observed statistic falls in the rejection region:
RR0.1 = { (n − 1)s²/σ0² > χ²n−1;0.1 } = { (n − 1)s² > χ²19;0.1 · σ0² = 27.2 · 4 = 108.8 }
Hence the power at a value σ1² in the alternative is:
power(σ1²) = P(reject H0 | σ² = σ1²)
           = P((n − 1)s² > 108.8 | σ² = σ1²)
           = P((n − 1)s²/σ1² > 108.8/σ1²)
           = P(χ² > 108.8/σ1²) = 1 − Fχ²(108.8/σ1²),
where Fχ² is the cdf of χ²n−1 (note that (n − 1)s²/σ1² ∼ χ²n−1 when σ² = σ1²). Hence power(7) = P(χ² > 108.8/7) = 0.6874.
(The accompanying plot shows power(σ²) = 1 − β(σ²) versus σ²: it equals α at σ0² = 4 and increases towards 1 as σ² moves into Θ1.)
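The power value can be checked numerically. A sketch using a series-based χ² survival function (108.8 = χ²19;0.1 · σ0², as on the slide; all names are our own):

```python
import math

def chi2_sf(x, k):
    # P(χ²_k >= x), series for the regularized lower incomplete gamma.
    a, y = k / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    i = 0
    while abs(term) > 1e-15 * total:
        i += 1
        term *= y / (a + i)
        total += term
    return 1.0 - total * math.exp(-y + a * math.log(y) - math.lgamma(a))

def power(sigma2_1, threshold=108.8, df=19):
    # power(σ1²) = P(χ²_{n-1} > threshold / σ1²)
    return chi2_sf(threshold / sigma2_1, df)

# power(7) ≈ 0.6874 and power(4) ≈ α = 0.1, as on the slides
```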
Upper-tail test for the variance: sample size calculations
Example: 9.5 c)
From our previous calculations, we know that
power(σ1²) = P((n − 1)s²/σ1² > χ²n−1;0.1 · σ0²/σ1²), where (n − 1)s²/σ1² ∼ χ²n−1.
Our objective is to find the smallest n such that:
power(7) = P((n − 1)s²/σ1² > χ²n−1;0.1 · 4/7) ≥ 0.9, with 4/7 = 0.571.
The last equation means that we need a χ²n−1 distribution whose upper 0.9-quantile satisfies χ²n−1;0.9 ≥ 0.571 · χ²n−1;0.1.
From the chi-square table: χ²43;0.9 / χ²43;0.1 = 0.573 > 0.571 ⇒ n − 1 = 43.
Thus, if we collect 44 observations we should be able to detect the alternative value σ1² = 7 with at least 90% chance.
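The table search can be automated: compute χ² quantiles by bisection on a series-based cdf and scan n upwards until the quantile condition holds. This is our own sketch, not from the text:

```python
import math

def chi2_cdf(x, k):
    # P(χ²_k <= x), series for the regularized lower incomplete gamma.
    a, y = k / 2.0, x / 2.0
    if y <= 0.0:
        return 0.0
    term = 1.0 / a
    total = term
    i = 0
    while abs(term) > 1e-15 * total:
        i += 1
        term *= y / (a + i)
        total += term
    return total * math.exp(-y + a * math.log(y) - math.lgamma(a))

def chi2_upper_quantile(alpha, k):
    # The value q with P(χ²_k > q) = alpha, found by bisection.
    lo, hi = 0.0, 10.0 * k + 100.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if 1.0 - chi2_cdf(mid, k) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def smallest_n(ratio=4.0 / 7.0):
    # Smallest n with χ²_{n-1;0.9} >= (σ0²/σ1²) · χ²_{n-1;0.1}, i.e. power >= 0.9.
    k = 2
    while chi2_upper_quantile(0.9, k) < ratio * chi2_upper_quantile(0.1, k):
        k += 1
    return k + 1

# smallest_n() returns 44, matching the slide's n - 1 = 43
```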
Another power example: lower-tail test for the mean, normal population, known σ²
- H0: µ ≥ µ0 versus H1: µ < µ0 at α = 0.05
- Say that µ0 = 5, n = 16, σ = 0.1
- We reject H0 if (x̄ − µ0)/(σ/√n) < −zα = −1.645, that is, when x̄ < 4.96; hence
  power(µ1) = P(x̄ < 4.96 | µ = µ1) = P(Z < (4.96 − µ1)/(σ/√n))
(The accompanying plot shows power(µ) = 1 − β(µ) versus µ for n = 4, 9 and 16: the power decreases from 1 towards α as µ increases through Θ1 = (−∞, 5), equals α at µ0 = 5, and stays below α on Θ0; larger n gives uniformly higher power on Θ1.)
Another power example: lower-tail test for the mean, normal population, known σ²
Note that the power = 1 − P(Type II error) function has the following features (everything else being equal):
- The farther the true mean µ1 from the hypothesized µ0, the greater the power
- The smaller the α, the smaller the power; that is, reducing the probability of a Type I error will increase the probability of a Type II error
- The larger the population variance, the lower the power (we are less likely to detect small departures from µ0 when there is greater variability in the population)
- The larger the sample size, the greater the power of the test (the more information from the population, the greater the chance of detecting any departures from the null hypothesis).
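These features can be verified directly from the power function of the example above. A sketch (names are ours; Φ is computed via erfc):

```python
import math

def normal_cdf(z):
    # Φ(z) via the complementary error function.
    return 0.5 * math.erfc(-z / math.sqrt(2))

def power(mu1, mu0=5.0, sigma=0.1, n=16, z_alpha=1.645):
    # Reject H0 when x̄ < µ0 - z_α σ/√n, so
    # power(µ1) = Φ((µ0 - µ1)/(σ/√n) - z_α).
    return normal_cdf((mu0 - mu1) / (sigma / math.sqrt(n)) - z_alpha)

# power(5.0) ≈ α = 0.05; power(4.9) > 0.99 (farther µ1, greater power);
# power(4.9, n=16) > power(4.9, n=4) (larger n, greater power).
```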