Hypothesis Testing
1.
Consider the following exercise: The policy of a particular bank branch is that its ATMs
must be stocked with enough cash to satisfy customers making withdrawals over an
entire weekend. At this branch the expected (i.e., population) average amount of money
withdrawn from ATM machines per customer transaction over the weekend is $160 with
an expected (i.e., population) standard deviation of $30. Suppose that a random sample
of 36 customer transactions is examined and it is observed that the sample mean
withdrawal is $172. Note that the standard deviation that we are to use for this problem
did not come from the sample. Therefore, this will be a Z-test, not a t-test. For this
problem, we have σ = 30, n = 36, and x̄ = 172.
(a)
State the null and alternative hypothesis.
This is not very clear, but apparently we are to check if the average exceeds $160, which
would mean the ATMs are not "stocked with enough cash." This is what we will try to
show, so it should go in the alternative hypothesis. Therefore the hypotheses would be:
H0: μ ≤ 160
H1: μ > 160
(b)
At the .05 level of significance, using the critical value approach to hypothesis testing, is
there enough evidence to believe that the true average withdrawal is greater than $160?
Let us get the test statistic:
Z = (x̄ - μ) / (σ/√n) = (172 - 160) / (30/√36) = 12/5 = 2.4
Set up the rejection region by drawing a Z-curve and shade the last 5% of the right
tail. We need the Z critical value associated with this area, which is
Z.05 = 1.645. Use the invNorm function with .95 as the argument, since, as
your hand-drawn curve should clearly show, the area from -∞ to Z.05 is .95.
The test statistic falls into the rejection region, i.e., 2.4 > 1.645, therefore we reject
H0. Yes, there is enough evidence that the average is more than $160.
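For readers without the calculator, the critical-value test in part (b) can be sketched in Python. This is only an illustration using the standard library's NormalDist in place of invNorm; the variable names are our own and are not part of the exercise.

```python
from math import sqrt
from statistics import NormalDist

# Values from the problem statement (names are illustrative).
mu0, sigma, n, xbar, alpha = 160, 30, 36, 172, 0.05

# Z test statistic: how many standard errors the sample mean is above mu0.
z = (xbar - mu0) / (sigma / sqrt(n))

# Right-tail critical value; inv_cdf(.95) plays the role of invNorm(.95).
z_crit = NormalDist().inv_cdf(1 - alpha)

reject = z > z_crit
print(f"Z = {z:.4f}, critical value = {z_crit:.4f}, reject H0: {reject}")
```

The argument to inv_cdf is 1 - alpha because the function, like invNorm, takes the area to the left of the value it returns.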
(c)
At the .05 level of significance, using the p-value approach to hypothesis testing, is there
enough evidence to believe that the true average withdrawal is greater than $160?
Now you should draw another Z-curve, this time shading the area to the right of the test
statistic, 2.4. Alternatively, you could use the same curve you drew in (b) and shade the
new area with a different color pen. This would make it clear that the area we wish to
obtain will be less than α = .05. The shaded area is of course the p-value, and we can
get it with the normalcdf function. You have done this before (in Chapter 4) when you
needed a probability.
Now the probability we need is P(Z > 2.4) = normalcdf(2.4, 1E99) = .0082. This
p-value is smaller than α = .05, thus we reject H0.
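The p-value approach of part (c) can be checked the same way. The sketch below assumes Python's statistics.NormalDist as a stand-in for normalcdf; the right-tail area is one minus the cumulative area up to the test statistic.

```python
from statistics import NormalDist

z = 2.4        # test statistic for the ATM problem
alpha = 0.05

# Area to the right of z under the standard normal curve.
p_value = 1 - NormalDist().cdf(z)

print(f"p-value = {p_value:.4f}, reject H0: {p_value < alpha}")
```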
(d)
Interpret the meaning of the p-value in this problem.
The probability of obtaining a sample whose mean is $172 or more when H0 is true is
.0082.
(e)
Compare your conclusions in (b) and (c).
The conclusions are the same, of course. A "large" test statistic, one that is larger than
the critical value, is associated with a "small" p-value, one that is smaller than alpha.
And the decision rule is:
Reject H0 if the test statistic falls in the rejection region [part (b)], or
Reject H0 if the p-value is less than alpha [part(c)].
(f)
Now let us re-do parts (b) and (c) using the Z-Test function. Under STAT TESTS, choose
Z-Test. We do not have a list of data for this problem; instead we have summary statistics
from the textbook. Choose Stats. Enter the information for this problem: μ0 = 160,
σ = 30, x̄ = 172, n = 36, and the alternative >μ0.
With the cursor on Calculate, press ENTER.
This gives the same results as before. Note, however, that this method gives the p-value,
and does not give the critical value.
Another useful option is Draw. Go back and choose Draw instead of Calculate.
The graph should match what you drew by hand. Actually, for "extreme" test statistics,
the shading will not show up on the graph. This one shows up just barely!
2.
Consider the following exercise.
The marketing manager for an automobile manufacturer is interested in determining the
proportion of new compact-car owners who would have purchased a passenger-side
inflatable air bag if it had been available for an additional cost of $300. The manager
believes from previous information that the proportion is .30. Suppose that a survey of
200 new compact-car owners is selected and 79 indicate that they would have purchased
the air bags.
Since this is a hypothesis test for the proportion, it will be a Z-test. For this problem, we
have n = 200, and ps = 79/200 = .395.
(a)
At the .10 level of significance, is there enough evidence that the population proportion is
different from .30?
The hypotheses here will be
H0: p = .30
H1: p ≠ .30
Let us get the test statistic:
Z = (ps - p) / √(p(1 - p)/n) = (.395 - .30) / √((.30)(.70)/200) = 2.93176
Set up the rejection region by drawing a Z-curve and shade the most extreme 5% of each
tail (10% in total).
We need the Z critical value associated with this, which is
±Z.05 = ±1.645. Again, use the invNorm function with .95
as the argument. The test statistic falls into the rejection region, i.e., 2.93176 > 1.645,
therefore we reject H0. Yes, there is enough evidence that the population proportion is
different from .30.
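The two-tailed test in part (a) can be sketched in Python as well. This is a minimal illustration with the standard library, not the calculator's own routine; the names p0, x and p_s are ours. Note that the standard error uses the hypothesized proportion .30, not the sample proportion.

```python
from math import sqrt
from statistics import NormalDist

p0, n, x, alpha = 0.30, 200, 79, 0.10

p_s = x / n                                 # sample proportion, .395
z = (p_s - p0) / sqrt(p0 * (1 - p0) / n)    # standard error is computed under H0

# Two-tailed test: alpha is split evenly between the two tails.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

reject = abs(z) > z_crit
print(f"Z = {z:.5f}, critical values = ±{z_crit:.3f}, reject H0: {reject}")
```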
(b)
Compute the p-value and interpret its meaning.
Now you should draw another Z-curve, this time shading the area to the right of the test
statistic, Z = 2.93176, and also the area to the left of -2.93176. Of course, you could use
the same curve you drew in (a) and shade the new area with a different color. This would
make it easy to see that the area we wish to obtain will be less than α = .10. The shaded
area is the p-value, and we get it with the normalcdf function, like before.
We have two identical shaded areas (although the shading isn't very visible), so we
calculate:
p-value = 2 × P(Z > 2.93176) = 2 × normalcdf(2.93176, 1E99) ≈ .0034. This p-value is
smaller than α = .10, thus we reject H0.
(c)
What is your answer to (a) if 70 new owners indicated that they would have purchased
the air bags?
Now we have ps = 70/200 = .35. The test statistic changes to:
Z = (.35 - .30) / √((.30)(.70)/200) = 1.54303
This test statistic does not fall into the rejection region, i.e., 1.54303 < 1.645, therefore
we do not reject H0. No, there is not enough evidence that the population proportion is
different from .30.
Now, the p-value should be larger than α = .10. Indeed,
p-value = 2 × P(Z > 1.54303) ≈ .1228 > .10. Do not reject
H0.
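To compare the two survey outcomes (79 and 70 purchasers) side by side, the two-tailed p-value can be wrapped in a small helper. The function name two_sided_prop_test is hypothetical, and the sketch relies only on Python's standard library.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_prop_test(x, n, p0):
    """Return (Z, two-tailed p-value) for H0: p = p0 against H1: p != p0."""
    z = (x / n - p0) / sqrt(p0 * (1 - p0) / n)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))   # both tails
    return z, p_value

alpha = 0.10
z79, p79 = two_sided_prop_test(79, 200, 0.30)
z70, p70 = two_sided_prop_test(70, 200, 0.30)

for x, z, p in [(79, z79, p79), (70, z70, p70)]:
    print(f"x = {x}: Z = {z:.5f}, p-value = {p:.4f}, reject H0: {p < alpha}")
```

Doubling the one-tail area works because the standard normal curve is symmetric about zero.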
(d)
Now let us re-do part (a) using the 1-PropZTest function. Under STAT TESTS, choose
1-PropZTest. Enter the information for this problem: p0 = .30, x = 79, n = 200, and the
alternative ≠p0.
With the cursor on Calculate, press ENTER.
This gives the same results as before. Again note that this method gives the p-value, and
does not give the critical value.
Again, go back and choose Draw.
For this one, the shading will not show up on the graph.
Let us use the 1-PropZTest function for part (c), with x = 70.
This time, we can clearly see the shading.
3.
Consider the following exercise: The director of admissions at a large university advises
parents of incoming students about the cost of textbooks during a typical semester. A
sample of 100 students enrolled in the university indicates a sample average cost of
$315.40 with a sample standard deviation of $43.20. Note that the standard deviation
did come from the sample. Therefore, this will be a t-test, not a Z-test. For this
problem, we have s = 43.20, n = 100, and x̄ = 315.40.
(a)
Using the .10 level of significance, is there enough evidence that the population average
is above $300?
We are asked if the data shows that the mean is greater than $300. This will be the
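alternative hypothesis. Thus the hypotheses here will be
H0: μ ≤ 300
H1: μ > 300
Now let us get the test statistic:
t = (x̄ - μ) / (s/√n) = (315.40 - 300) / (43.20/√100) = 15.40/4.32 = 3.5648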
Set up the rejection region by drawing a t-curve (just draw a symmetric bell-shaped curve
like usual) and shade the last 10% of the right tail. We need the t critical value
associated with this, which is t.10 with 99 degrees of freedom. We get this value with the
EQUATION SOLVER, just like we did for t confidence intervals. Under the MATH menu,
choose Solver. Recall the method explained on the confidence intervals example page
for finding t critical values. This equation should still be in your calculator, but if not,
enter it with variables for the arguments, for example 0 = tcdf(L,U,D) - A.
L is for Lower bound, U is for Upper bound, D is for Degrees of freedom, and A is for
Area.
We want the value such that there is 10% of the area to the right of that value. Let us
have the calculator solve for L, so enter zero on the first line for a "guess." The upper
bound is ∞, so enter 1E99 for U. The degrees of freedom for this problem are n - 1 =
100 - 1 = 99, and we want the area between the lower and upper bound to be .10.
With the cursor on the L=0 line, press SOLVE (ALPHA ENTER). Remember that this
calculation takes about 15-20 seconds.
We now see that t.10 = 1.2902. The test statistic falls into the rejection
region, i.e., 3.5648 > 1.2902, therefore we reject H0. Yes, there is enough evidence that
the population average is above $300.
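The Solver step, which finds the lower bound L that leaves a given area to its right, can be imitated in Python. The sketch below is only an assumption-laden stand-in for the calculator: it integrates the t density with Simpson's rule and solves for L by bisection, using nothing beyond the standard library. The helper names t_pdf, t_upper_tail and t_critical are our own.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    return math.exp(log_c) / math.sqrt(df * math.pi) * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """P(T > t): one half minus the Simpson's-rule area between 0 and |t|."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    upper = 0.5 - total * h / 3
    return upper if t >= 0 else 1 - upper

def t_critical(area, df, lo=0.0, hi=50.0):
    """Solve t_upper_tail(L, df) = area for L by bisection (requires area < 0.5)."""
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_upper_tail(mid, df) > area:
            lo = mid      # too much area to the right, so move right
        else:
            hi = mid
    return (lo + hi) / 2

crit = t_critical(0.10, 99)
t_stat = (315.40 - 300) / (43.20 / math.sqrt(100))
print(f"critical value = {crit:.4f}, t = {t_stat:.4f}, reject H0: {t_stat > crit}")
```

Bisection is used here because the right-tail area decreases steadily as L grows, so the search interval can be halved at each step.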
Now let us get the p-value.
Draw another t-curve, this time shading the area to the right of the test statistic,
3.5648. Again, you could use the same curve you drew in (a) and shade the new area
with a different color. The shaded area is the p-value, and we get it with the tcdf
function.
p-value = P(t > 3.5648) = tcdf(3.5648, 1E99, 99) ≈ .0003
This p-value is smaller than α = .10, thus we reject H0.
(b)
What is your answer in (a) if the standard deviation is $75 and the .05 level of
significance is used?
Now, s = 75 and α = .05. The test statistic will change to:
t = (315.40 - 300) / (75/√100) = 15.40/7.5 = 2.0533
The rejection region now has only 5% of the area shaded in the right tail. Go back to the
EQUATION SOLVER and make the change to the area part of our equation.
Now t.05 = 1.6604. The test statistic falls into the rejection region, i.e.,
2.0533 > 1.6604, therefore we reject H0.
Now let us get the p-value.
p-value = P(t > 2.0533) = tcdf(2.0533, 1E99, 99) ≈ .021
This p-value is smaller than α = .05, thus we reject H0.
(c)
What is your answer in (a) if the sample average is $305.11?
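Now, x̄ = 305.11. The test statistic will change to:
t = (305.11 - 300) / (43.20/√100) = 5.11/4.32 = 1.1829
The rejection region will be the same as it was in part (a).
Our new test statistic does not fall into the rejection region, i.e., 1.1829 < 1.2902,
therefore we do not reject H0.
Now let us get the p-value.
p-value = P(t > 1.1829) = tcdf(1.1829, 1E99, 99) ≈ .12, which is larger than α = .10.
Do not reject H0.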
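The three right-tailed t-tests in parts (a), (b) and (c) differ only in their inputs, so they can be run through one helper. The sketch below is an illustration under our own assumptions: tcdf is approximated by numerically integrating the t density with Simpson's rule, and the names t_upper_tail and one_sample_t are hypothetical.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    return math.exp(log_c) / math.sqrt(df * math.pi) * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """P(T > t): one half minus the Simpson's-rule area between 0 and |t|."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    upper = 0.5 - total * h / 3
    return upper if t >= 0 else 1 - upper

def one_sample_t(xbar, mu0, s, n):
    """Return (t statistic, right-tail p-value) for H1: mu > mu0."""
    t = (xbar - mu0) / (s / math.sqrt(n))
    return t, t_upper_tail(t, n - 1)

# part: (sample mean, sample standard deviation, alpha)
cases = {"(a)": (315.40, 43.20, 0.10),
         "(b)": (315.40, 75.00, 0.05),
         "(c)": (305.11, 43.20, 0.10)}

results = {}
for part, (xbar, s, alpha) in cases.items():
    t, p = one_sample_t(xbar, 300, s, 100)
    results[part] = (t, p, p < alpha)
    print(f"{part} t = {t:.4f}, p-value = {p:.4f}, reject H0: {p < alpha}")
```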
(d)
Now let us re-do part (a) using the T-Test function. Under STAT TESTS, choose T-Test.
We do not have a list of data for this problem; instead we have summary statistics from
the textbook. Choose Stats. Enter the information for this problem: μ0 = 300,
x̄ = 315.40, Sx = 43.20, n = 100, and the alternative >μ0. With the cursor on Calculate,
press ENTER.
This gives the same results as before. Note again that this gives the p-value, but not the
critical value.
T-Test also has the Draw option. Go back and choose Draw instead of Calculate.
Like before, our test statistic is too extreme to be shown on the screen. Imagine it
off-screen to the right.
4.
Consider the following exercise.
A manufacturer claims that the average capacity of a certain type of battery the company
produces is at least 140 ampere-hours. An independent consumer protection agency
wishes to test the credibility of the manufacturer's claim and measures the capacity of 20
batteries from a recently produced batch. The results, in ampere-hours, are as follows:
137.4  140.0  138.8  139.1  144.4  139.2  141.8  137.3  133.5  138.2
141.1  139.7  136.7  136.3  135.6  138.0  140.9  140.6  136.7  134.1
We were not given a mean or standard deviation; we'll have to get them ourselves from
the data. Of course, the standard deviation we get will be a sample standard deviation,
which makes this a t-test, not a Z-test. Enter the data into your calculator, into L1,
say. Obtain the summary statistics from STAT CALC 1-Var Stats.
So for this data, we have x̄ = 138.47, s = 2.6589, and n = 20.
(a)
Using the .05 level of significance, is there enough evidence that the manufacturer's claim
is being overstated?
The claim is that the average capacity is at least 140, and we will try to show that it is in
fact less than 140. Thus the hypotheses here will be
H0: μ ≥ 140
H1: μ < 140
Now let us get the test statistic:
t = (x̄ - μ) / (s/√n) = (138.47 - 140) / (2.6589/√20) = -2.5734
Set up the rejection region by drawing a t-curve and shade the leftmost 5% of the left
tail.
We need the t critical value associated with this, which is
-t.05 = -1.7291 (with 19 degrees of freedom). We get this t critical value with the EQUATION
SOLVER, just like we did for the last problem.
The test statistic falls into the rejection region, i.e., -2.5734 < -1.7291, therefore we reject
H0. Yes, there is enough evidence that the manufacturer's claim is overstated.
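The summary statistics and the test statistic for the battery data can be reproduced in Python. This is a sketch with the standard library's mean and stdev standing in for 1-Var Stats; the critical value -1.7291 is taken from the text rather than recomputed, and the variable names are ours.

```python
from math import sqrt
from statistics import mean, stdev

capacities = [137.4, 141.1, 140.0, 139.7, 138.8, 136.7, 139.1, 136.3, 144.4, 135.6,
              139.2, 138.0, 141.8, 140.9, 137.3, 140.6, 133.5, 136.7, 138.2, 134.1]

mu0 = 140
n = len(capacities)
xbar = mean(capacities)
s = stdev(capacities)          # sample standard deviation (divides by n - 1)

t_stat = (xbar - mu0) / (s / sqrt(n))

t_crit = -1.7291               # left-tail critical value quoted in the text (19 df)
reject = t_stat < t_crit
print(f"mean = {xbar:.2f}, s = {s:.4f}, t = {t_stat:.4f}, reject H0: {reject}")
```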
Now let us get the p-value.
Draw another t-curve, this time shading the area to the left of the test statistic,
-2.5734. Again, you could use the same curve you drew in (a) and shade the new area
with a different color. The shaded area is the p-value, and we get it with the tcdf
function.
p-value = P(t < -2.5734) = tcdf(-1E99, -2.5734, 19) ≈ .009
This p-value is smaller than α = .05, thus we reject H0.
(b)
What assumption must hold in order to perform the test in (a)?
The population of battery capacities must be (approximately) normally distributed.
(c)
Evaluate this assumption through a graphical approach.
Okay, let us see how likely it is that this data came from a normal population. First, let
us look at a boxplot.
This is very symmetric. Looks good so far.
Now let us look at a normal probability plot for this data.
This is a very straight line. Both the boxplot and the normal probability plot seem to be
telling us that our data does indeed come from a normal population.
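The two graphical checks can be backed up with numbers. The sketch below is an illustration under our own assumptions: quartiles are taken as medians of the lower and upper halves of the sorted data, and the normal scores use the plotting positions (i - 0.375)/(n + 0.25), which may differ from the positions the calculator uses. A correlation close to 1 between the sorted data and the normal scores corresponds to a nearly straight normal probability plot.

```python
from statistics import NormalDist, mean, median

capacities = [137.4, 141.1, 140.0, 139.7, 138.8, 136.7, 139.1, 136.3, 144.4, 135.6,
              139.2, 138.0, 141.8, 140.9, 137.3, 140.6, 133.5, 136.7, 138.2, 134.1]

data = sorted(capacities)
n = len(data)
half = n // 2

# Boxplot ingredients: quartiles as medians of the lower and upper halves.
q1, med, q3 = median(data[:half]), median(data), median(data[n - half:])
print(f"min = {data[0]}, Q1 = {q1}, median = {med}, Q3 = {q3}, max = {data[-1]}")

# Normal probability plot ingredients: expected Z-scores for each ordered value.
scores = [NormalDist().inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]

# Pearson correlation between the normal scores and the sorted data.
mx, my = mean(scores), mean(data)
sxy = sum((x - mx) * (y - my) for x, y in zip(scores, data))
sxx = sum((x - mx) ** 2 for x in scores)
syy = sum((y - my) ** 2 for y in data)
r = sxy / (sxx * syy) ** 0.5
print(f"correlation with normal scores = {r:.4f}")
```

Equal distances from the median to each quartile are what make the box look symmetric.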
(d)
What is your answer in (a) if the last two values are 146.7 and 144.1 instead of 136.7 and
134.1?
Go to STAT Edit and change the last two values.
Obtain the summary statistics again. The mean and standard deviation will be
different: now x̄ = 139.47 and s = 3.1749. The test statistic changes to:
t = (139.47 - 140) / (3.1749/√20) = -0.7466
The rejection region stays the same, so we see that this time, the test statistic does not fall
into the rejection region, i.e., -0.7466 > -1.7291. Therefore, we do not reject H0.
For the p-value, shade the area under the t-curve to the left of the test statistic, -0.7466.
p-value = P(t < -0.7466) = tcdf(-1E99, -0.7466, 19) ≈ .23, which is larger than α = .05.
Do not reject H0.
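The effect of changing the last two values can be checked quickly in Python. As before, this is a sketch using the standard library, with the critical value -1.7291 taken from the text; the list name modified is ours.

```python
from math import sqrt
from statistics import mean, stdev

# Battery data with the last two values of the second row changed to 146.7 and 144.1.
modified = [137.4, 141.1, 140.0, 139.7, 138.8, 136.7, 139.1, 136.3, 144.4, 135.6,
            139.2, 138.0, 141.8, 140.9, 137.3, 140.6, 133.5, 146.7, 138.2, 144.1]

mu0 = 140
n = len(modified)
xbar = mean(modified)
s = stdev(modified)

t_stat = (xbar - mu0) / (s / sqrt(n))

t_crit = -1.7291               # same left-tail critical value as in part (a)
reject = t_stat < t_crit
print(f"mean = {xbar:.2f}, s = {s:.4f}, t = {t_stat:.4f}, reject H0: {reject}")
```

Two larger values raise the mean toward 140 and also inflate the standard deviation, and both changes pull the test statistic toward zero.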
(e)
Now, instead of changing the data back to what it was originally in part (a), let us leave it
as it is and re-do part (d) using the T-Test function.
This time, we do have a list of data, so choose Data. Enter the information for this
problem: μ0 = 140, List: L1, Freq: 1, and the alternative <μ0. With the cursor on
Calculate, press ENTER.
This gives the same results as before.
Again, go back and choose Draw.
For this one, the shading is visible.