Elementary Statistics
Triola, Elementary Statistics 11/e

Unit 23
Hypotheses Testing, Left and Right Tails

There are three basic types of hypotheses tests, named for the tails they involve: Left Tail (LT), Right Tail (RT), and Two Tail (2T). To understand the difference between these tests and why we would use one over another, let's work with an example.

Suppose we get in a shipment of 50,000 washers. (These are circular discs with a hole in the center.) We might be concerned that the hole is too small, in which case we would use a left-tail test. Or we might be concerned that the outside diameter of the washer is too big, in which case we would use a right-tail test. Finally, we might be concerned that the hole is too small or too big, in which case we would use the two-tail test.

Let's take a deeper look at why we would use the left-tail test if we were concerned about the hole being too small. Our whole focus is on the hole being too small. We are not at all concerned about it being too big. Now take a look at the following two graphs. The first graph shows a two-tail test and the next one shows a left-tail test.

In both cases, $\alpha = 0.05$. However, in the two-tail test, $\alpha$ is split between the two tails, but in the left-tail test, all of $\alpha$ lies under the left tail. Therefore, the critical values are going to be different. In the two-tail test, the left critical value is $z_{\alpha/2} = -1.96$, and in the left-tail test it is $z_{\alpha} = -1.645$.

Think about what this means. If you used the same sample for both a two-tail and a left-tail test, and hence used the same test score for both, which one, the two-tail test or the one-tail test, is more likely to have the null hypothesis rejected? Remember, we reject the null hypothesis if the absolute value of the test score is greater than the absolute value of the critical value.

Question #1
With the same test score, which test is more likely to have the null hypothesis rejected, a one-tail test or a two-tail test?
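The two critical values quoted above can be checked numerically. The following is a minimal sketch that uses Python's standard-library `statistics.NormalDist` in place of a z table; the function name `z_critical` and the tail labels `"left"`, `"right"`, and `"two"` are assumptions made for this illustration and do not come from the textbook.

```python
from statistics import NormalDist


def z_critical(alpha, tail):
    """Return the critical z value(s) for significance level alpha.

    tail is "left", "right", or "two" (labels chosen for this sketch).
    """
    std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1
    if tail == "left":
        # All of alpha sits under the left tail.
        return (std_normal.inv_cdf(alpha),)
    if tail == "right":
        # All of alpha sits under the right tail.
        return (std_normal.inv_cdf(1 - alpha),)
    if tail == "two":
        # alpha is split evenly between the two tails.
        z = std_normal.inv_cdf(1 - alpha / 2)
        return (-z, z)
    raise ValueError("tail must be 'left', 'right', or 'two'")


print([round(z, 3) for z in z_critical(0.05, "two")])   # [-1.96, 1.96]
print([round(z, 3) for z in z_critical(0.05, "left")])  # [-1.645]
```

Splitting $\alpha$ pushes the two-tail cutoff farther from zero than the one-tail cutoff, which is the comparison Question #1 asks about.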
Suppose the manufacturer is claiming that on average, the hole size is 25.00 mm. We are either going to reject this claim because we have very convincing evidence that it is wrong, or we will not reject the claim and end up accepting the batch. So let's examine the two cases: Case 1, where we cannot reject the claim, and Case 2, where we do reject the claim.

Case 1. Let's say that our sample yields the following results: $\bar{x} = 24.25$, $s = 3.45$, $n = 30$, $\mu = 25.00$ (this is the claim). The t value is $t = -1.1907$.

If we are concerned about the hole being too small, and hence conduct a left-tail test, then assuming that $\alpha = 0.05$, the critical value is

$$t_{\alpha} = \text{T.INV}(0.05, 29) = -1.699$$

Because this is a one-tail test, all of $\alpha$ goes into the left tail and the critical value is $t_{\alpha}$, not $t_{\alpha/2}$.

The test score is

$$t = \frac{\bar{x} - \text{claim}}{s/\sqrt{n}} = \frac{24.25 - 25.00}{3.45/\sqrt{30}} = -1.1907$$

Do we reject the null hypothesis? The answer is no, because $|t| < |t_{\alpha}|$, i.e., t does not fall to the left of $t_{\alpha}$ and hence does not fall in the "red zone". In other words, the sample mean of 24.25 just isn't far enough away from the claim of 25.00 to reject the claim. Hence, we would end up accepting the batch. Remember, we took a simple random sample of only 30 washers. If we were to take another simple random sample of 30 washers, we could very easily end up with a sample average hole diameter of 25.75, clearly not too small.

Case 2. In this case, let's say we have the following results: $\bar{x} = 24.05$, $s = 1.74$, $n = 30$, $\mu = 25.00$. The t statistic is $-2.9904$.

x̄        s       n     t
24.05    1.74    30    -2.9904

This time we see that the test score, t, is in the red zone. Given a standard deviation of 1.74 and a sample size of 30, 24.05 is just too "far" from 25.00, forcing us to reject the claim. If the claim were correct, a sample mean this far below 25.00 would turn up less than 5% of the time, so it is very unlikely that our sample was just a fluke.

You try one.
Question #2
Given that $\bar{x} = 24.10$, $s = 2.45$, $n = 30$, $CL = 95\%$, and the same claim of $\mu = 25.00$, would you reject the null hypothesis in a left-tail test? $t_{\alpha} = -1.699$.

There are five mathematical relationships to consider, and we use them to characterize the type of tail test that we will be conducting:

Relation        Tail Type
Less than       LT
Greater than    RT
At most         RT
At least        LT
Equals          2T

All you have to do is determine which of the relational tests you want to run and then, according to the table above, select the appropriate tail test. The key is to find the words in the problem statement that match (or mean the same thing as) the words in the table. The homework will give you plenty of practice doing this.

Finally, there is one more step to setting up a hypotheses test. We need to formally state the hypotheses. This involves defining the null hypothesis, $H_0$, and the alternative hypothesis, $H_1$. The null hypothesis is always stated one way, that $\mu = \text{claim}$. No matter how the problem is worded, i.e., regardless of whether we are testing that $\mu$ is less than, greater than, at most, at least, or equal to some number, the null hypothesis is always stated as

$$H_0: \mu = \text{claim}$$

where claim is some numerical value. The alternative hypothesis reflects the wording of the problem, but can only use the relations $<$, $>$, and $\neq$. To get the right alternative, you need a table:

Relation        Tail Type    Alt Hyp
Less than       LT           $H_1: \mu < \text{claim}$
Greater than    RT           $H_1: \mu > \text{claim}$
At most         RT           $H_1: \mu > \text{claim}$
At least        LT           $H_1: \mu < \text{claim}$
Equals          2T           $H_1: \mu \neq \text{claim}$

It may seem that the alternative hypotheses for "at most" and "at least" are contradictory, but this results from the fact that we cannot use $\leq$ or $\geq$ in an alternative hypothesis statement. A claim of "at most" can only be contradicted by a value greater than the claim, so its alternative is $\mu > \text{claim}$; likewise, "at least" can only be contradicted by a smaller value. Remember, the null hypothesis $H_0$ is always stated one way and one way only, $H_0: \mu = \text{claim}$.

Try your hand at this.
Question #3
Consider the claim that the mean weight of airline passengers (including carry-on baggage) is at most 195 lbs. Set up the hypotheses statement to test this claim.

Power of a Test

There is always the possibility of error when doing a hypotheses test. We can never be 100% confident. For example, if we run the test at a 95% confidence level and the null hypothesis is true, there is a 5% chance that we pick a sample that leads us to reject it anyway. Remember, we place the burden of proof on the alternative hypothesis. We do not reject the null hypothesis unless we have very strong evidence to do so. Still, that means that there is a small chance that we are rejecting a true null hypothesis. For example, the claim might actually be true, but we have a 5% chance of committing a Type I Error.

This is why we typically use a 99% confidence level when testing things like drugs. The test is set up so that the null hypothesis represents the claim that the drug is ineffective. The burden of proof is on the alternative hypothesis to demonstrate that the drug is effective. At the 99% level, if the drug really is ineffective, there is only a 1% chance that the sample we selected leads us to conclude that it works.

When we reject a null hypothesis that is true, we have committed an error, which is called a Type I Error. $\alpha$, the significance level, is the probability of incurring a Type I Error. In other words, if the confidence level is 95%, then $\alpha = 0.05$, and therefore there is a 5% chance that the sample we selected is going to cause us to reject a true claim.

On the other hand, there is plenty of room for failing to reject a null hypothesis when it's false. We call this a Type II Error, the error of failing to reject a false null hypothesis. Consider the drug example again. The null hypothesis basically states that the drug is ineffective. Therefore, if the drug is effective, the null hypothesis would be wrong.
However, we need strong evidence before we reject the null hypothesis, so even an effective drug might end up being rejected. Don't worry, to the best of my knowledge, we have never thrown away a drug that would have cured all cancer. If a drug really was effective and we ended up rejecting it, it was most likely because the drug was only marginally effective. In that situation, there is a very good chance of selecting a sample that leads us to conclude that the drug is "ineffective". When it comes to drugs, we tend to err on the side of safety.

If $\beta$ is the probability of failing to reject a false null hypothesis, then the probability of rejecting a false null hypothesis is $1 - \beta$. This is called the power of the test. The actual computation of $\beta$ is a bit complicated, depending on $\alpha$, the sample size, and some other factors. Many software packages used for statistical analysis will give you $\beta$; Excel does not.

The higher the power, the better the test. Power is about rejecting a false hypothesis. The greater the difference between the value in the null hypothesis and the actual value of the population parameter that we are testing, the greater the power will be.

We are not actually going to calculate the power of a test, because doing so is beyond the scope of this class. However, it should be noted that, without raising $\alpha$, just about the only thing you can do to increase the power of a test is to increase the size of your sample. On the other hand, there is a price to pay, as the cost of a study goes up as the sample size increases.

Worked Example

In this example, we are going to make only one claim about the weight of elephants. We are going to claim that the average weight of elephants is less than 5000 lbs, and we are going to test this hypothesis at the 95% confidence level. We collect a simple random sample of 30 elephants and find that $\bar{x} = 4900$, $s = 370$.
Hypotheses Statement

$$H_0: \mu = 5000$$
$$H_1: \mu < 5000$$

Calculate the test statistic,

$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}} = \frac{4900 - 5000}{370/\sqrt{30}} = -1.4803$$

The critical value for this left-tail test is

$$t_{\alpha} = \text{T.INV}(0.05, 29) = -1.6991$$

The absolute value of t is less than the absolute value of the critical value,

$$|-1.4803| < |-1.6991|$$

Therefore, the sample average is not "far" enough away from the null hypothesis, and so we cannot reject it.

So, what exactly does this mean? It means that the sample does not give us enough evidence to support the alternative hypothesis. We are saying that, on the basis of our sample, we cannot conclude that elephants weigh less than 5000 lbs on average. This is not proof that they weigh 5000 lbs or more; it only means the evidence was not strong enough to reject the null hypothesis. Please read this over a few times until it makes sense.

This is the end of Unit 23. Now turn to MyMathLab to get more practice with these concepts.
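As a check on the arithmetic in this unit, the left-tail tests for the two washer cases and the elephant example can be reproduced with a short script. This is only a sketch: the function names `t_statistic` and `reject_left_tail` are made up for this illustration, and the critical value is copied from the text (the T.INV(0.05, 29) result) and is not computed here.

```python
import math


def t_statistic(sample_mean, sample_sd, n, claim):
    """Test score: t = (x_bar - claim) / (s / sqrt(n))."""
    return (sample_mean - claim) / (sample_sd / math.sqrt(n))


def reject_left_tail(t, critical_value):
    """In a left-tail test, reject H0 when t falls to the left of the critical value."""
    return t < critical_value


# Critical value quoted in the text for alpha = 0.05 with 29 degrees of freedom.
T_CRIT_LEFT_05_DF29 = -1.6991

# (sample mean, sample standard deviation, sample size, claimed mean)
examples = {
    "washers, Case 1": (24.25, 3.45, 30, 25.00),
    "washers, Case 2": (24.05, 1.74, 30, 25.00),
    "elephants": (4900, 370, 30, 5000),
}

for label, (x_bar, s, n, claim) in examples.items():
    t = t_statistic(x_bar, s, n, claim)
    if reject_left_tail(t, T_CRIT_LEFT_05_DF29):
        decision = "reject H0"
    else:
        decision = "fail to reject H0"
    print(f"{label}: t = {t:.4f}, {decision}")
```

Running it should reproduce the test scores given in the text (-1.1907, -2.9904, and -1.4803), with only Case 2 falling to the left of the critical value.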