Elementary Statistics
Triola, Elementary Statistics 11/e
Unit 23 Hypotheses Testing, Left and Right Tails
There are three basic types of hypotheses testing, and they are named for the tails they involve: Left
Tail (LT), Right Tail (RT) and Two Tail (2T). To understand the difference between these tests and why
we would use one over the other, let’s work with an example.
Suppose we get in a shipment of 50,000 washers. (These are circular discs with a hole in the center.)
We might be concerned that the hole is too small, in which case we would use a left-tail test. Or we
might be concerned that the outside diameter of the washer is too big, in which case we would use a
right-tail test. Finally, we might be concerned that the hole is too small or too big, in which case we
would use the two-tail test.
Let’s take a deeper look at why we would use the left-tail test if we were concerned about the hole
being too small. Our whole focus is on the hole being too small. We are not at all concerned about it
being too big. Now take a look at the following two graphs.
The first graph shows a two-tail test and the next one shows a left-tail test. In both cases,
$\alpha = 0.05$. However, in the Two-Tail Test, $\alpha$ is split between the two tails, but in the
Left-Tail Test, all of $\alpha$ lies under the left tail. Therefore, the critical values are going to be
different. In the Two-Tail Test, the critical values are $\pm z_{\alpha/2} = \pm 1.96$, and in the
Left-Tail Test, the critical value is $z_{\alpha} = -1.645$. Think about what this means. If you used the
same sample for doing both a two-tail and a left-tail test, and hence used the same test score for both,
which one, the two-tail test or the one-tail test, is more likely to have the null hypothesis rejected?
Remember, we reject the null hypothesis if the test score falls in the tail being tested and its
absolute value is greater than the absolute value of the critical value.
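The two critical values quoted above can be reproduced with a short sketch. It assumes the standard normal distribution and uses Python's standard-library `statistics.NormalDist`; the variable names are illustrative only.

```python
from statistics import NormalDist

ALPHA = 0.05  # significance level used in the text

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

# Two-tail test: alpha is split, so each tail holds alpha / 2.
z_two_tail = std_normal.inv_cdf(ALPHA / 2)

# Left-tail test: all of alpha sits under the left tail.
z_left_tail = std_normal.inv_cdf(ALPHA)

print(f"two-tail critical values: +/-{abs(z_two_tail):.3f}")
print(f"left-tail critical value: {z_left_tail:.3f}")
```

Because the left-tail test puts all of $\alpha$ in one tail, its critical value sits closer to zero than the two-tail one.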
Question #1
With the same test score, which test is more likely to have the null hypothesis rejected, a one-tail test or
a two-tail test?
Suppose the manufacturer is claiming that on average, the hole size is 25.00 mm. We are either going to
reject this claim because we have very convincing evidence that it is wrong, or we will not reject the
claim and end up accepting the batch. So let’s examine the two cases, Case 1 where we cannot reject
the claim, and Case 2, where we do reject the claim.
Case 1.
Let’s say that our sample yields the following results, $\bar{x} = 24.25$, $s = 3.45$, $n = 30$,
$\mu = 25.00$ (this is the claim). The t value is $t = -1.1907$.
x̄        s       n     t
24.25    3.45    30    -1.1907
If we are concerned about the hole being too small, and hence conduct a left-tail test, then assuming
that $\alpha = 0.05$, the critical value is,
$$t_{\alpha} = \text{T.INV}(0.05, 29) = -1.699$$
The test score is,
$$t = \frac{\bar{x} - \text{claim}}{s}\sqrt{n} = \frac{24.25 - 25.0}{3.45}\sqrt{30} = -1.1907$$
Do we reject the null hypothesis? The answer is no because $|t| < |t_{\alpha}|$, i.e. t does not fall
to the left of $t_{\alpha}$ and hence does not fall in the “red zone”. In other words, the sample
mean of 24.25 just isn’t far enough away from the claim of 25.00 to reject the claim. Hence, we would
end up accepting the batch. Remember, we just took a simple random sample of only 30 washers. If we
were to take another simple random sample of 30 washers, we could very easily end up with a sample
hole diameter average of 25.75, clearly not too small.
Case 2.
In this case, let’s say we have the following results, $\bar{x} = 24.05$, $s = 1.74$, $n = 30$,
$\mu = 25.00$. The t statistic is $-2.9904$.
x̄        s       n     t
24.05    1.74    30    -2.9904
This time we see that the test score, t, is in the “red zone”. Given a standard deviation of 1.74 and a
sample size of 30, 24.05 is just too “far” from 25.00, forcing us to reject the claim. If the claim were
correct, a sample mean this far below 25.00 would turn up less than 5% of the time, so it is very
unlikely that our sample was just a fluke.
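The arithmetic for the two cases can be checked with a minimal sketch. The function names are hypothetical, and the critical value is simply the T.INV(0.05, 29) figure quoted in the text rather than something computed here.

```python
import math

def t_statistic(sample_mean, claim, sample_sd, n):
    """Test score t = (x-bar - claim) / s * sqrt(n)."""
    return (sample_mean - claim) / sample_sd * math.sqrt(n)

def reject_left_tail(t, critical_value):
    """In a left-tail test, reject H0 only when t falls to the left of the critical value."""
    return t < critical_value

T_CRITICAL = -1.699  # T.INV(0.05, 29), taken from the text

# (sample mean, sample standard deviation) for each case; claim = 25.00, n = 30
cases = {
    "Case 1": (24.25, 3.45),
    "Case 2": (24.05, 1.74),
}

results = {}
for name, (xbar, s) in cases.items():
    t = t_statistic(xbar, claim=25.00, sample_sd=s, n=30)
    results[name] = (t, reject_left_tail(t, T_CRITICAL))
    print(f"{name}: t = {t:.4f}, reject H0: {results[name][1]}")
```

Only the comparison `t < critical_value` is used, so a test score far to the right of zero would never be rejected by this left-tail check.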
You try one.
Question #2
Given that $\bar{x} = 24.10$, $s = 2.45$, $n = 30$, $CL = 95\%$ (the claim is still $\mu = 25.00$),
would you reject the null hypothesis in a left-tail test? $t_{\alpha} = -1.699$.
There are five mathematical relationships to consider, and we use them to characterize the type of tail
test that we will be conducting:
Relation        Tail Type
Less than       LT
Greater than    RT
At most         RT
At least        LT
Equals          2T
All you have to do is determine which of the relational tests you want to run and then according to the
table above, select the appropriate tail test. The key is to find the words in the problem statement
that match (or mean the same thing) as the words in the table. The homework will give you plenty
of practice doing this.
Finally, there is one more step to setting up a hypotheses test. We need to formally state the
hypotheses. This involves defining the null hypothesis, $H_0$, and the alternative hypothesis, $H_1$. The
null hypothesis is always stated one way, that $\mu = \text{claim}$. No matter how the problem is worded,
i.e., regardless of whether we are testing whether $\mu$ is less than, greater than, at most, at least or
equal to some number, the null hypothesis is always stated as,
$$H_0: \mu = \text{claim}$$
where claim is some numerical value.
The alternative hypothesis reflects the wording of the problem, but can only use the following relations,
$<$, $>$, $\ne$. To get the right alternative, you need a table,

Relation        Tail Type    Alt Hyp
Less than       LT           $H_1: \mu < \text{claim}$
Greater than    RT           $H_1: \mu > \text{claim}$
At most         RT           $H_1: \mu > \text{claim}$
At least        LT           $H_1: \mu < \text{claim}$
Equals          2T           $H_1: \mu \ne \text{claim}$
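It may seem that the alternative hypotheses for “at most” and “at least” contradict the wording, but
this results from the fact that we cannot use $\le$ or $\ge$ in the alternative hypothesis statement. A
claim of “at most” ($\mu \le \text{claim}$) includes equality, so the alternative has to be the opposite
case, $\mu > \text{claim}$; likewise, the alternative to “at least” is $\mu < \text{claim}$. Remember, the
null hypothesis, $H_0$, is always stated one way and one way only, $H_0: \mu = \text{claim}$.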
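The table lookup can be sketched as a small dictionary. The names `TAIL_TABLE` and `set_up_hypotheses` are hypothetical, and the sketch assumes the problem wording has already been reduced to one of the five relations in the table.

```python
# Relation wording -> (tail type, symbol used in the alternative hypothesis)
TAIL_TABLE = {
    "less than": ("LT", "<"),
    "greater than": ("RT", ">"),
    "at most": ("RT", ">"),
    "at least": ("LT", "<"),
    "equals": ("2T", "≠"),
}

def set_up_hypotheses(relation, claim):
    """Return the tail type plus the null and alternative hypothesis statements."""
    tail, symbol = TAIL_TABLE[relation.lower()]
    h0 = f"H0: mu = {claim}"            # the null is always stated with "="
    h1 = f"H1: mu {symbol} {claim}"     # the alternative uses only <, > or ≠
    return tail, h0, h1

tail, h0, h1 = set_up_hypotheses("less than", 5000)
print(tail)
print(h0)
print(h1)
```

The hard part in practice is still reading the problem statement and deciding which of the five relations it matches; the lookup only mechanizes the last step.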
Try your hand at this.
Question #3
Consider the claim that the mean weight of airline passengers (including carry-on baggage) is at most
195 lbs. Set up the hypotheses statement to test this claim.
Power of a Test
There is always the possibility of error when doing a hypotheses test. We can never be 100% confident.
For example, if we are running the test at a 95% confidence level and the null hypothesis is actually
true, there is a 5% chance that we picked a sample that leads us to reject it anyway. Remember, we place
the burden of proof on the alternative hypothesis. We do not reject the null hypothesis unless we have
very strong evidence to do so. Still, that means that there is a small chance that we are rejecting a
true null hypothesis. In other words, the claim might actually be true, and we still have a 5% chance of
rejecting it, which is known as a Type I Error.
This is why we typically use a 99% confidence level when testing things like drugs. The test is set up so
that the null hypothesis represents the claim that the drug is ineffective. The burden of proof is on the
alternative hypothesis to demonstrate that the drug is effective. At the 99% level, if the drug really is
ineffective, there is only a 1% chance that the sample we selected is going to lead us to conclude that
it works.
When we reject a null hypothesis that is true, we have committed an error, which is called a Type I
Error. $\alpha$, the significance level, is the probability of committing a Type I Error when the null
hypothesis is true. In other words, if the confidence level is 95%, then $\alpha = 0.05$, and therefore
there is a 5% chance that the sample we selected is going to cause us to reject a true claim.
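A small simulation can make the 5% figure concrete. This is only a sketch under stated assumptions: the population is taken to be normal with mean 25.0 (so the claim is true) and a made-up standard deviation of 2.0, the test is a left-tail test with n = 30, and the critical value −1.6991 is the T.INV(0.05, 29) figure quoted elsewhere in this unit.

```python
import math
import random

def t_from_sample(sample, claim):
    """Compute t = (x-bar - claim) / s * sqrt(n) from raw sample values."""
    n = len(sample)
    mean = sum(sample) / n
    variance = sum((x - mean) ** 2 for x in sample) / (n - 1)  # sample variance
    return (mean - claim) / math.sqrt(variance) * math.sqrt(n)

def type_i_error_rate(trials=10_000, n=30, claim=25.0, sigma=2.0,
                      t_critical=-1.6991, seed=1):
    """Share of samples that reject H0 even though the claim is true."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        # Draw from a population whose true mean equals the claim.
        sample = [rng.gauss(claim, sigma) for _ in range(n)]
        if t_from_sample(sample, claim) < t_critical:
            rejections += 1
    return rejections / trials

rate = type_i_error_rate()
print(f"share of true claims rejected: {rate:.3f}")
```

The printed share should land close to 0.05, which is what $\alpha$ promises: roughly one sample in twenty rejects a claim that is in fact true.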
On the other hand, there is plenty of room for failing to reject a null hypothesis when it’s false. We call
this a Type II Error, the error of failing to reject a false null hypothesis. Consider the drug example
again. The null hypothesis basically states that the drug is ineffective. Therefore, if the drug is
effective, the null hypothesis would be wrong. However, we need strong evidence before we reject the null
hypothesis, so even an effective drug might end up being rejected. Don’t worry, to the best of my
knowledge, we have never thrown away a drug that would have cured all cancer. If the drug had really
been effective, and we ended up rejecting it, it was most likely because the drug was only
marginally effective. We had a very good chance of selecting a sample that would have led us to
conclude that the drug was “ineffective”. When it comes to drugs, we tend to err on the side of
safety.
If $\beta$ is the probability of failing to reject a false null hypothesis, then the probability of
rejecting a false null hypothesis would be $1 - \beta$. This is called the power of the test. The actual
computation of $\beta$ is a bit complicated, depending on $\alpha$, the sample size, and some other
factors. Many software packages used for doing statistical analysis will give you $\beta$; Excel does not.
The higher the power, the better the test. Power is about rejecting a false null hypothesis. The greater
the difference between the value of the null hypothesis and the actual value of the population parameter
that we are testing, the greater will be the power. We are not actually going to calculate the power of a
test, because doing so is beyond the scope of this class. However, it should be noted that, for a given
$\alpha$, just about the only thing you can do to increase the power of a test is to increase the size of
your sample. On the other hand, the cost of a study goes up as the sample size increases.
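For readers who want to see how sample size drives power, here is a rough sketch. It is not the t-based calculation a statistics package would do: it assumes a left-tail test, a known population standard deviation, and a normal approximation, and the true mean of 24.5 and standard deviation of 2.0 are made-up values for illustration.

```python
import math
from statistics import NormalDist

def approx_power_left_tail(mu_null, mu_true, sigma, n, alpha=0.05):
    """Approximate power (1 - beta) of a left-tail z-test with known sigma."""
    std_normal = NormalDist()
    z_critical = std_normal.inv_cdf(alpha)  # e.g. about -1.645 for alpha = 0.05
    # How far the true mean sits from the null value, in standard-error units.
    shift = (mu_true - mu_null) / (sigma / math.sqrt(n))
    # Probability that the test score lands to the left of the critical value.
    return std_normal.cdf(z_critical - shift)

for n in (30, 60, 120):
    power = approx_power_left_tail(mu_null=25.0, mu_true=24.5, sigma=2.0, n=n)
    print(f"n = {n:3d}: approximate power = {power:.3f}")
```

Under these assumptions the power rises as n grows, and it also rises when the true mean is further from the null value. When the true mean equals the null value, the function returns $\alpha$ itself, since every rejection is then a Type I Error.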
Worked Example
In this example, we are going to make only one claim about the weight of elephants. We are going to
claim that the average weight of elephants is less than 5000 lbs, and we are going to test this hypothesis
at the 95% confidence level. We collect a simple random sample of 30 elephants and find that
$\bar{x} = 4900$, $s = 370$.
Hypotheses Statement
$H_0: \mu = 5000$
$H_1: \mu < 5000$
Calculate the test statistic,
$$t = \frac{\bar{x} - \mu}{s}\sqrt{n} = \frac{4900 - 5000}{370}\sqrt{30} = -1.4803$$
The critical value for this problem is,
$$t_{\alpha} = \text{T.INV}(0.05, 29) = -1.6991$$
The absolute value of t is less than the absolute value of the critical value,
$$|-1.4803| < |-1.6991|$$
Therefore, the sample average is not “far” enough away from the null hypothesis, and so we cannot
reject it. So, what exactly does this mean? It means that our sample does not support the alternative
hypothesis. We are saying that, on the basis of our sample, there is not sufficient evidence to conclude
that elephants weigh less than 5000 lbs on average. This is not the same as proving that they weigh
5000 lbs or more. Please read this over a few times until it makes sense.
This is the end of Unit 23.
Now turn to MyMathLab to get more practice with
these concepts.