Download Lecture14 - Department of Statistics, Purdue University

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Foundations of statistics wikipedia , lookup

History of statistics wikipedia , lookup

Statistics wikipedia , lookup

Transcript
Statistics 511: Statistical Methods
Dr. Levine
Purdue University
Spring 2011
Lecture 15: Tests about Population Means and Population
Proportions
Devore: Section 8.2-8.3
April, 2011
Page 1
Statistics 511: Statistical Methods
Dr. Levine
Purdue University
Spring 2011
A Normal Population with known σ
• This case is not common in practice. We will use it to illustrate
basic principles of test procedure design
• Let X1 , . . . , Xn be a sample size n from the normal
population. The null value of the mean is usually denoted µ0
and we consider testing either of the three possible alternatives
µ > µ0 , µ < µ0 and µ 6= µ0
• The test statistic that we will use is
X̄ − µ0
√
Z=
σ/ n
It measures the distance of X̄ from µ0 in standard deviation
units.
April, 2011
Page 2
Statistics 511: Statistical Methods
Dr. Levine
Purdue University
Spring 2011
• Consider Ha : µ > µ0 as an alternative. The outcome that
would allow us to reject the null hypothesis H0 : µ = µ0 is
z ≥ c for some c > 0
• How do you select c? We need to control the probability of Type
I Error. For a test of level α, we have
α = P ( Type I Error ) = P (Z ≥ c|Z ∼ N (0, 1))
• Therefore, we need to choose c = zα . Such a test procedure is
called upper-tailed.
• It is easy to understand that for Ha : µ < µ0 we will have the
rejection region of the form z ≤ c. For the test to have the level
α, we need to choose c = −zα . Such a test is called a
lower-tailed test.
April, 2011
Page 3
Statistics 511: Statistical Methods
Dr. Levine
Purdue University
Spring 2011
• Now consider the case of Ha : µ 6= µ0 . The rejection region
here consists of z ≥ c and z ≤ −c.
• For simplicity, consider the case α = 0.05. Then,
0.05 = P (Z ≥ c or Z ≤ −c|Z ∼ N (0, 1))
= Φ(−c) + 1 − Φ(c) = 2[1 − Φ(c)]
• Therefore, we select c such that
1 − Φ(c) = P (Z ≥ c) = 0.025; it is z0.025 = 1.96. This test
is called a two-tailed test.
April, 2011
Page 4
Statistics 511: Statistical Methods
Dr. Levine
Purdue University
Spring 2011
Summary
• Let H0 : µ = µ0 ; define the test statistic Z =
1.
X̄−µ
√0.
σ/ n
Ha : µ > µ0 has the rejection region z ≥ zα and is called
an upper-tailed test
2.
Ha : µ < µ0 has the rejection region z ≤ −zα and is
called an lower-tailed test
3.
Ha : µ 6= µ0 has the rejection region z ≥ zα/2 or
z ≤ −zα/2 and is called a two-tailed test
April, 2011
Page 5
Statistics 511: Statistical Methods
Dr. Levine
Purdue University
Spring 2011
Recommended Steps for Testing Hypotheses about a Parameter
1. Identify the parameter of interest and describe it in the
context of the problem situation.
2. Determine the null value and state the null hypothesis.
3. State the alternative hypothesis.
4. Give the formula for the computed value of the test statistic.
5. State the rejection region for the selected significance level
6. Compute any necessary sample quantities, substitute into
the formula for the test statistic value, and compute that
value.
• The formulation of hypotheses (steps 2 and 3) should be done
before examining the data.
April, 2011
Page 6
Statistics 511: Statistical Methods
Dr. Levine
Purdue University
Spring 2011
Example
• A manufacturer of sprinkler systems used for fire protection in
office buildings claims that the true average system-activation
temperature is 130. Can we believe his claim?
• Parameter of interest is µ = true average activation
temperature.
• H0 : µ = 130; Ha : µ 6= 130.
• Test statistic is
x̄ − µ0
x̄ − 130
√ =
√
Z=
σ/ n
1.5/ n
April, 2011
Page 7
Statistics 511: Statistical Methods
Dr. Levine
Purdue University
Spring 2011
• Rejection region is z ≤ −zα/2 or z ≥ zα/2 . If α = 0.01, we
have z ≤ −z0.005 = −2.58 and z ≥ z0.005 = 2.58
• With n = 9 and x̄ = 131.08,
131.08 − 130
√
z=
= 2.16
1.5/ 9
• Since 2.16 is outside the rejection region, we fail to reject H0 at
significance level 0.01.
April, 2011
Page 8
Statistics 511: Statistical Methods
Dr. Levine
Purdue University
Spring 2011
Type II Error
• As an example, consider an upper-tailed test with the rejection
region z ≥ zα .
√
• H0 is not rejected when x̄ < µ0 + zα · σ/ n
0
• For a particular µ > µ, the probability of Type II error is then
√
0
0
β(µ ) = P (X̄ < µ0 + zα · σ/ n|µ = µ )
0
0
X̄ − µ
µ0 − µ
0
√
√
=P
< zα +
|µ = µ
σ/ n
σ/ n
0
µ0 − µ
√
= Φ zα +
σ/ n
April, 2011
Page 9
Statistics 511: Statistical Methods
Dr. Levine
Purdue University
Spring 2011
• Similar derivations can help us to derive Type II error
probabilities for a lower-tailed test and a two-tailed test. Results
can be summarized as follows:
1.
Ha : µ > µ0 has
the probability of Type II Error
0
Φ zα +
µ0 −µ
√
σ/ n
2.
Ha : µ< µ0 has the probability
of Type II Error
0
µ0 −µ
1 − Φ −zα + σ/√n
3.
Ha : µ 6= µ0 has the probability
of Type II Error
0
Φ zα/2 +
µ0 −µ
√
σ/ n
0
− Φ −zα/2 +
µ0 −µ
√
σ/ n
April, 2011
Page 10
Statistics 511: Statistical Methods
Dr. Levine
Purdue University
Spring 2011
Sample Size Determination
• Sometimes, we want to bound the value of Type II error for a
0
specific value µ .
• Consider the same sprinkler example. Fix α and specify β for
0
such an alternative value. For µ = 132 we may want to require
β(132) = 0.1 in addition to α = .01.
• The sample size required for that purpose is such that
0
µ0 − µ
√
=β
Φ zα +
σ/ n
April, 2011
Page 11
Statistics 511: Statistical Methods
Dr. Levine
Purdue University
Spring 2011
• Solving for n, we obtain
σ(zα + zβ )
n=
µ0 − µ0
2
and the same answer is true for a lower-tailed test
• For a two-tailed test, it is only possible to give an approximate
solution. It is
σ(zα/2 + zβ )
n≈
µ0 − µ0
2
April, 2011
Page 12
Statistics 511: Statistical Methods
Dr. Levine
Purdue University
Spring 2011
Large Sample Tests
• When the sample size is large, the z tests described earlier are
modified to yield valid test procedures without requiring either a
normal population distribution or a known σ .
• Let us assume n > 40. Then, the test statistic
X̄ − µ0
√
Z=
S/ n
is approximately standard normal
• The use of the same rejection regions as before results in a test
procedure for which the significance level is approximately α.
April, 2011
Page 13
Statistics 511: Statistical Methods
Dr. Levine
Purdue University
Spring 2011
Example
• To test H0 : µ = 20 vs. H1 : µ > 20 at significance level
α = 0.05, we collect 400 observations with x̄n = 21.82935
and sn = 24.70037. Should H0 be rejected?
• The value of test statistic is
21.82935 − 20
= 1.481234 < 1.645 = z0.05
t=
24.70037/20
- should not be rejected
April, 2011
Page 14
Statistics 511: Statistical Methods
Dr. Levine
Purdue University
Spring 2011
A Normal Population Distribution
• When n is small, we can no longer invoke CLT as a justification
for the large sample test
• Remember that for a normally distributed random sample
X1 , . . . , Xn , the statistic
X̄ − µ
√
T =
S/ n
has a t distribution with n − 1 df
• Therefore, we have the test with H0 : µ = µ0 and a test
x̄−µ
statistic value t = s/√n0 .
April, 2011
Page 15
Statistics 511: Statistical Methods
Dr. Levine
Purdue University
Spring 2011
Summary of the Three Possible t-Tests
• If Ha : µ > µ0 , the rejection region of the level α test is
t ≥ tα,n−1
• If Ha : µ < µ0 , the rejection region of the level α test is
t ≤ −tα,n−1
• If Ha : µ 6= µ0 , the rejection region of the level α test is
t ≥ tα/2,n−1 or t ≤ −tα/2,n−1
April, 2011
Page 16
Statistics 511: Statistical Methods
Dr. Levine
Purdue University
Spring 2011
Example
• The Edison Electric Institute publishes figures on the annual
number of kilowatt hours expended by various home appliances.
• It is claimed that a vacuum cleaner expends an average of 46
kilowatt hours per year.
• Suppose a planned study includes a random sample of 12
homes and it indicates that VC’s expend an average of 42
kilowatt hours per year with s = 11.9 kilowatt hours.
• Assuming the population normality, design a 0.05 level test to
see whether VC’s spend less than 46 kilowatt hours annually
April, 2011
Page 17
Statistics 511: Statistical Methods
Dr. Levine
Purdue University
Spring 2011
• H0 : µ = 46 kilowatt hours and Ha : µ < 46 kilowatt hours
• Assuming α = 0.05, we have a critical region t < −1.796
where
x̄ − µ0
√
t=
s/ n
with 11 df
• The value of the statistic is
42 − 46
√ = −1.16
t=
11.9/ 12
• Since t is not in the rejection region, we fail to reject H0 .
April, 2011
Page 18
Statistics 511: Statistical Methods
Dr. Levine
Purdue University
Spring 2011
A β curve for the t-test
• It is much more difficult to compute the probability of the Type II
Error in this case than in the normal case
• The reason is that it requires the knowledge of distribution of
√ 0 under the alternative Ha . To do it precisely, we
T = X̄−µ
S/ n
must compute
0
0
β(µ ) = P (T < tα,n−1 when µ = µ )
• There exist extensive tables of these probabilities for both oneand two-tailed tests.
April, 2011
Page 19
Statistics 511: Statistical Methods
Dr. Levine
Purdue University
Spring 2011
Figure 1:
April, 2011
Page 20
Statistics 511: Statistical Methods
Dr. Levine
Purdue University
Spring 2011
Calculating β
0
• First, we select µ and the estimated value for unknown σ .
0
Then, we find an estimated value of d = |µ0 − µ |/σ . Finally,
the value of β is the height of the n − 1 df curve above the
value of d
• If n − 1 is not the value for which the corresponding curve
appears visual interpolation is necessary
April, 2011
Page 21
Statistics 511: Statistical Methods
Dr. Levine
Purdue University
Spring 2011
• One can also calculate the sample size n needed to keep the
Type II Error probability below β for specified α.
1. First, we compute d
2. Then, the point (d, β) is located on the relevant set of graphs
3. The curve below and closest to the point gives n − 1 and
thus n
4. Interpolation, of course, is often necessary
April, 2011
Page 22
Statistics 511: Statistical Methods
Dr. Levine
Purdue University
Spring 2011
Large-Sample Tests
• Let p denote the proportion of individuals or objects in a
population who possess a specified property; thus, each object
either possesses a desired property (S) or it doesn’t (F).
• Consider a simple random sample X1 , . . . , Xn . If the sample
size n is small relative to the population size, the number of
successes in the sample X has an approximately binomial
distribution. If n itself is also large, both X and the sample
proportion p̂ = X/n are approximately normally distributed
• Large-sample tests concerning p are a special case of the more
general large-sample procedures for an arbitrary parameter θ .
We considered such a large-sample test before for the mean µ
of an arbitrary distribution.
April, 2011
Page 23
Statistics 511: Statistical Methods
Dr. Levine
Purdue University
Spring 2011
• Some basic properties of p̂ are;
1. Estimator p̂ is unbiased:
E p̂ = p.
2. Second, it is approximately normal and its standard deviation
p
(SD) is σp̂ =
p(1 − p)/n
3. Note that σp̂ does not include any unknown parameters.
This is not always the case. It is enough to remember the
large-sample test of the mean where σµ̂
= σX̄ = σ 2 /n
which is in general unknown unless σ 2 is specified.
April, 2011
Page 24
Statistics 511: Statistical Methods
Dr. Levine
Purdue University
Spring 2011
• Let us consider first an upper-tailed test. It means having a null
hypothesis H0 : p = p0 vs. an alternative Ha : p > p0 .
• Under the null hypothesis, we have E (p̂) = p0 and
p
σp̂ = p0 (1 − p0 )/n; therefore, for large n the test statistic
Z=p
p̂ − p0
p0 (1 − p0 )/n
has approximately standard normal distribution
• The rejection region is, clearly, z ≥ zα for a test of
approximately level α.
April, 2011
Page 25
Statistics 511: Statistical Methods
Dr. Levine
Purdue University
Spring 2011
• The lower-tailed test has a rejection region z ≤ zα
• The two-tailed test has a rejection region |z| ≥ zα/2 . The last
expression is a concise way of saying that z ≥ zα/2 or
z ≤ −zα/2 .
• These tests are applicable whenever the normal approximation
of the binomial distribution is reasonable: np0 ≥ 10,
n(1 − p0 ) > 10.
April, 2011
Page 26
Statistics 511: Statistical Methods
Dr. Levine
Purdue University
Spring 2011
Example
• 1276 individuals in a sample of 4115 adults were found to be
obese as per their BMI. A 1998 survey based on
self-assessment revealed that 20% of the adult population
considered themselves obese. Does the most recent data
suggest that the true proportion of obese adults is more than
1.5 times the percentage from the self-assessment survey?
• The null hypothesis here is that p = 0.3 and the alternative is
p > .30
• Note that the rule of thumb is satisfied: np0 = 4115(.3) > 10
abd n(1 − p0 ) = 4115(.7) > 10
April, 2011
Page 27
Statistics 511: Statistical Methods
Dr. Levine
Purdue University
Spring 2011
• Thus, we use the large-sample test with
p
z = (p̂ − .3)/ (.3)(.7)/n
• For a significance level α = .1 we use zα = 1.28. The sample
proportion is p̂ = 1276/4115 = .310. Plugging this value in z ,
we obtain 1.40 > 1.28. Thus, the null hypothesis is rejected.
April, 2011
Page 28
Statistics 511: Statistical Methods
Dr. Levine
Purdue University
Spring 2011
Type II Error and sample size determination
• Type II Error probability can be computed exactly as before. If
0
H0 is not true, the true proportion p = p 6= p0 . Under
0
Ha : p = p we have Z is still approximately normal; however,
0
p − p0
E(Z) = p
p0 (1 − p0 )/n
and
0
0
p (1 − p )/n
V (Z) =
p0 (1 − p0 )/n
April, 2011
Page 29
Statistics 511: Statistical Methods
Dr. Levine
Purdue University
Spring 2011
• The formulas for the type II error are very similar to what we saw
before for the mean test. We only give the upper-tailed test
formula (Ha
: p > p0 )
"
#
p
0
p0 − p + zα p0 (1 − p0 )/n
0
p 0
β(p ) = Φ
p (1 − p0 )/n
and the lower-tailed test formula (Ha
: p < p0 )
"
#
p
0
p0 − p − zα p0 (1 − p0 )/n
0
p 0
β(p ) = 1 − Φ
p (1 − p0 )/n
• Sample size formulas can also be easily derived. In the
two-tailed case, the formula is approximate as before
April, 2011
Page 30
Statistics 511: Statistical Methods
Dr. Levine
Purdue University
Spring 2011
Example
• A package delivery service advertises that at least 90% of all
packages brought to its office by 9 A.M. are delivered by noon
the same day. How likely is it that a level 0.1 test based on
n = 225 packages will detect a departure of 10% from this
goal?
• The null hypothesis is H0 : p = 0.9 vs. Ha : p < 0.9.
0
• For p = 0.8, we have
β(.8) = 1 − Φ
!
p
.9 − .8 − 2.33 (.9)(.1)/225
p
= 0.228
(.8)(.2)/225
• Thus, the probability of type II error under the alternative
0
p = 0.8 is about 23%.
April, 2011
Page 31
Statistics 511: Statistical Methods
Dr. Levine
Purdue University
Spring 2011
Small Sample tests
• These are test procedures for proportions when the sample size
n is small. They are based directly on the binomial distribution
rather than the normal approximation.
• Consider the alternative hypothesis Ha : p > p0 and let X be
yet again the number of successes in the sample size n. For a
test level α, we find the rejection region from
P (X ≥ c when X ∼ Bin(n; p0 ))
= 1 − P (X ≤ c − 1 when X ∼ Bin(n; p0 ))
= 1 − B(c − 1; n, p0 )
April, 2011
Page 32
Statistics 511: Statistical Methods
Dr. Levine
Purdue University
Spring 2011
• It is usually not possible to find an exact value of c in this case;
the usual way out is to use the largest rejection region of the
form {c, c + 1, . . . , n} satisfying the bound on the Type I error
0
• To compute the Type II error for an alternative p > p0 , we first
0
note that X ∼ Bin(n, p ) if the alternative is true. Then,
0
0
0
β(p ) = P (X < c when X ∼ Bin(n, p )) = B(c−1; n, p )
Note that this is a result of a straightforward binomial probability
calculation
April, 2011
Page 33
Statistics 511: Statistical Methods
Dr. Levine
Purdue University
Spring 2011
Example
• A builder claims that heat pumps are installed in 70% of all
homes being constructed today in the city of Richmond, VA.
Would you agree with this claim if a random survey of new
homes in this city shows that 8 out of 15 had heat pumps
installed? Use a 0.1 level of significance.
• H0 : p = 0.7 vs. Ha : p < 0.7 with α = 0.10
• The test statistic is X ∼ Bin(0.7, 15)
April, 2011
Page 34
Statistics 511: Statistical Methods
Dr. Levine
Purdue University
Spring 2011
• We have x = 8 and np0 = (15)(0.7) = 10.5. Thus, we must
find c such that
P (X ≥ c) = 1 − B(c − 1; 15, 0.7) = 0.1
∼ Bin(0.7, 15). It is easy to check that the rejection
region will be {13, 14, 15}.
for X
April, 2011
Page 35