1 p-Values
Consider the null hypothesis that the mean cost of 4 years of college is $40000. The alternative hypothesis is
that the mean cost of 4 years of college is more than $40000. A sample of size 144 is used to do a right-tail
test of the null hypothesis with significance level α = 0.01. For our sample, the mean is $45000 and the
standard deviation is $24000. We would like to find the critical values, rejection and nonrejection regions,
and the conclusion.
For the right-tailed test, the rejection and nonrejection regions are arranged as indicated below.
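[Figure: number line with the nonrejection region to the left of the critical value and the rejection region to its right; 40000 lies in the nonrejection region.]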
The fact that α = 0.01 tells us that, if µ = 40000, then the probability that x̄ is greater than the critical
value is 0.01. So, to obtain the critical value, assume that µ = 40000. In this case, since n ≥ 30, we know by
the central limit theorem that x̄ is approximately normally distributed with mean 40000.
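Looking up an area of 0.49 on the standard normal table (the area between the mean and the critical value,
0.5 − 0.01), we see that the critical value has a z-score of z = 2.33. Since n ≥ 30, σx̄ is well approximated by
\[ s_{\bar{x}} = \frac{s}{\sqrt{n}} = \frac{24000}{\sqrt{144}} = 2000. \]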
Therefore, this z-score corresponds to a critical value of
x = µ + zsx̄ = 40000 + 2.33 · 2000 = 44660.
The rejection and nonrejection regions are indicated below.
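[Figure: number line with the nonrejection region to the left of the critical value 44660 and the rejection region to its right; 40000 lies in the nonrejection region.]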
Since our sample mean lies in the rejection region, we reject the null hypothesis.
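The right-tail calculation above can be checked numerically. The following is a minimal sketch using only Python's standard library; the variable names are illustrative and not part of the original notes. It computes the z-score with `statistics.NormalDist` instead of a table, so the critical value comes out slightly below the 44660 obtained with the rounded table value z = 2.33.

```python
from math import sqrt
from statistics import NormalDist

# Values taken from the example; the names are illustrative.
mu0 = 40000      # mean under the null hypothesis
x_bar = 45000    # sample mean
s = 24000        # sample standard deviation
n = 144          # sample size
alpha = 0.01     # significance level

std_error = s / sqrt(n)                     # 24000 / 12 = 2000
z_crit = NormalDist().inv_cdf(1 - alpha)    # about 2.326 (the table lookup rounds to 2.33)
critical_value = mu0 + z_crit * std_error   # about 44653 (44660 with z = 2.33)

# Right-tailed test: reject when the sample mean exceeds the critical value.
reject_null = x_bar > critical_value
print(round(z_crit, 3), round(critical_value), reject_null)
```

The small difference between the two critical values comes only from rounding z; the conclusion is the same either way.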
In fact, we would reject the null hypothesis even at lower significance levels. We would like to find the
minimum significance level α at which we would reject the null hypothesis. This is known as the p-value.
The minimum significance level at which we would reject the null hypothesis occurs when x̄ is a critical
value. To find this, we need to find the z-score corresponding to x̄, namely
\[ z = \frac{\bar{x} - \mu}{s_{\bar{x}}} = \frac{45000 - 40000}{2000} = 2.5. \]
From the standard normal table, this corresponds to area 0.4938. This means that the significance level and
thus the p-value is 0.5 − 0.4938 = 0.0062.
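As a rough check of the p-value, the right-tail area beyond z = 2.5 can be computed directly instead of read from a table. This is a sketch with illustrative variable names, assuming the same normal approximation used above.

```python
from statistics import NormalDist

# Values from the example; the names are illustrative.
mu0 = 40000        # mean under the null hypothesis
x_bar = 45000      # sample mean
std_error = 2000   # s / sqrt(n) = 24000 / 12

z = (x_bar - mu0) / std_error        # 2.5
p_value = 1 - NormalDist().cdf(z)    # area to the right of z under the standard normal curve

# The null hypothesis is rejected at any significance level at or above the p-value.
print(z, round(p_value, 4))
```

Because the p-value is below α = 0.01, this agrees with the rejection found from the critical value.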
2 Hypothesis Testing using a Small Sample
Again consider the age of high school students upon graduation. Assume that this follows a normal distribution. The null hypothesis is that the mean age is 18.5 years. The alternative hypothesis is that the mean
age is not 18.5 years. A sample of size 7 is used to do a test of the null hypothesis with significance level
α = 0.05. The sample mean is x̄ = 18.15 and the sample standard deviation is s = 0.15. We would like to
find the critical values, rejection and nonrejection regions, and the conclusion.
The alternative hypothesis tells us to use a two-tailed test. For the two-tailed test, the rejection and nonrejection regions are arranged as indicated below.
[Figure: number line with rejection regions to the left of the left critical value and to the right of the right critical value, and the nonrejection region between them, centered at 18.5.]
The fact that α = 0.05 tells us that, if µ = 18.5, then the probability that x̄ is less than the left critical
value or more than the right critical value is 0.05. So, to obtain the critical values, assume that µ = 18.5. In
this case, since the population data is normally distributed, the standardized sample mean (x̄ − µ)/sx̄ follows
a Student t distribution with n − 1 = 6 degrees of freedom, so x̄ is distributed symmetrically about the mean
18.5. Since the left and right critical values are symmetrically located with respect to this mean, the
probability that x̄ is less than the left critical value is 0.025 and the probability that x̄ is greater than the
right critical value is 0.025.
Looking up an area of 0.025 on the Student t table with 6 degrees of freedom, we see that the critical values
have t-scores of ±2.447. Since
\[ s_{\bar{x}} = \frac{s}{\sqrt{n}} = \frac{0.15}{\sqrt{7}} = 0.0567, \]
these t-scores correspond to critical values of
x = µ + tsx̄ = 18.5 + 2.447 · 0.0567 = 18.64
and
x = µ − tsx̄ = 18.5 − 2.447 · 0.0567 = 18.36.
The rejection and nonrejection regions are indicated below.
[Figure: number line with rejection regions to the left of 18.36 and to the right of 18.64, and the nonrejection region between them, centered at 18.5.]
Since our sample mean x̄ = 18.15 is less than the left critical value 18.36, it lies in the rejection region, so we reject the null hypothesis.
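The two-tailed small-sample test can be sketched in a few lines. Python's standard library has no Student t distribution, so this sketch takes the critical t-score 2.447 (6 degrees of freedom, 0.025 in each tail) from the table lookup in the text as a given; the variable names are illustrative.

```python
from math import sqrt

# Values from the example; the names are illustrative.
mu0 = 18.5       # mean under the null hypothesis
x_bar = 18.15    # sample mean
s = 0.15         # sample standard deviation
n = 7            # sample size
t_crit = 2.447   # table value assumed from the text: 6 degrees of freedom, 0.025 per tail

std_error = s / sqrt(n)             # about 0.0567
lower = mu0 - t_crit * std_error    # about 18.36
upper = mu0 + t_crit * std_error    # about 18.64

# Equivalent check on the t scale: the sample mean's t-score.
t_stat = (x_bar - mu0) / std_error  # about -6.17

# Two-tailed test: reject when the sample mean falls outside [lower, upper].
reject_null = x_bar < lower or x_bar > upper
print(round(lower, 2), round(upper, 2), round(t_stat, 2), reject_null)
```

Comparing x̄ with the critical values and comparing the t-score with ±2.447 are the same test expressed on two different scales.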
3 Testing Hypotheses about Population Proportions
It is claimed that 130 of the days this year are sunny. We want to test this hypothesis with a lower-tailed test
with significance level 0.2. We randomly select 15 days. Of these, 3 days are sunny. We would like to find the
alternative hypothesis, the critical values, the rejection and nonrejection regions, and state the conclusion.
First observe that the null hypothesis is that p = 130/365 = 0.356. For the lower-tailed test, the alternative
hypothesis is that the population proportion is less than that claimed in the null hypothesis. So the alternative
hypothesis is that p < 0.356.
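For the lower-tailed test, the rejection and nonrejection regions are arranged as indicated below.
[Figure: number line with the rejection region to the left of the critical value and the nonrejection region to its right; 0.356 lies in the nonrejection region.]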
The fact that α = 0.2 tells us that, if p = 0.356 then the probability that p̄ is less than the critical value is
0.2. So, to obtain the critical value, assume that p = 0.356. In this case, since np = 15 · 0.356 = 5.34 ≥ 5 and
n(1 − p) = 15 · (1 − 0.356) = 9.66 ≥ 5, we know by the central limit theorem for the sample proportion that
p̄ is approximately normally distributed with mean
\[ \mu_{\bar{p}} = p = 0.356 \]
and standard deviation
\[ \sigma_{\bar{p}} = \sqrt{\frac{p(1 - p)}{n}} = \sqrt{\frac{0.356(1 - 0.356)}{15}} = 0.124. \]
We did not need the finite population correction factor (FPCF) here because the sample size is 15 and the
population size is 365, and 15 is less than 5 percent of 365.
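The two conditions checked above (the normal approximation rule and the 5 percent rule for the correction factor) are simple enough to express as helper functions. Both function names are hypothetical, and the thresholds of 5 and 5 percent are the rules of thumb used in this text, not universal constants.

```python
def normal_approximation_ok(n, p, threshold=5):
    """Rule of thumb from the text: both np and n(1 - p) must be at least 5."""
    return n * p >= threshold and n * (1 - p) >= threshold


def needs_finite_population_correction(n, population_size, fraction=0.05):
    """Rule of thumb from the text: no correction when n is below 5 percent of the population."""
    return n >= fraction * population_size


# Values from the example: 15 sampled days out of 365, p = 0.356.
print(normal_approximation_ok(15, 0.356))           # np = 5.34 and n(1 - p) = 9.66
print(needs_finite_population_correction(15, 365))  # 5 percent of 365 is 18.25
```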
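We can find the z-score corresponding to the critical value by looking up an area of 0.3 on the standard
normal table (the area between the critical value and the mean, 0.5 − 0.2). Since the critical value lies
below the mean, the result is that z = −0.84.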
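The critical value is thus
p + zσp̄ = 0.356 − 0.84 · 0.124 = 0.252.
Therefore the rejection and nonrejection regions look like this:
[Figure: number line with the rejection region to the left of the critical value 0.252 and the nonrejection region to its right; 0.356 lies in the nonrejection region.]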
Since the sample proportion is p̄ = 3/15 = 0.2, which is in the rejection region, we reject the null hypothesis.
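The lower-tailed proportion test can be reproduced as follows. This is a sketch under the same normal approximation used in the text, with illustrative variable names; the z-score is computed rather than read from a table, so intermediate values may differ from the hand calculation in the third decimal place.

```python
from math import sqrt
from statistics import NormalDist

# Values from the example; the names are illustrative.
p0 = 0.356     # proportion under the null hypothesis (130/365, rounded as in the text)
n = 15         # number of days sampled
sunny = 3      # sunny days observed in the sample
alpha = 0.2    # significance level

std_error = sqrt(p0 * (1 - p0) / n)        # about 0.124
z_crit = NormalDist().inv_cdf(alpha)       # about -0.84 (negative: lower tail)
critical_value = p0 + z_crit * std_error   # about 0.25

p_hat = sunny / n                          # 0.2

# Lower-tailed test: reject when the sample proportion is below the critical value.
reject_null = p_hat < critical_value
print(round(std_error, 3), round(z_crit, 2), reject_null)
```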