9.2 Hypothesis Tests for Population Means

LEARNING GOAL
Understand and interpret one- and two-tailed hypothesis tests for claims made about population means, and learn to recognize and avoid common errors (type I and type II errors) in hypothesis tests.

Copyright © 2009 Pearson Education, Inc.

One-Tailed Hypothesis Tests

Consider the following hypothetical situation. Columbia College advertises that the mean starting salary of its graduates is $39,000. An independent organization suspects that this claim is exaggerated and decides to conduct a hypothesis test to seek evidence to support its suspicion. The null and alternative hypotheses are

H0: μ = $39,000
Ha: μ < $39,000

Because the alternative hypothesis has a "less than" form, we are dealing with a left-tailed hypothesis test.

The organization selects a random sample of 100 recent graduates from the college. The mean salary of the graduates in the sample turns out to be x̄ = $37,000.

From the four-step hypothesis test process on page 376:

Step 1: The mean starting salary has been identified as the population parameter of interest, and the null and alternative hypotheses have been stated.
Step 2: The sample has been drawn and measured to determine its sample size and sample mean.

The Sampling Distribution

Step 3 of the hypothesis test process is finding the likelihood of observing a sample mean as extreme as the one found, under the assumption that the null hypothesis is true.

For the Columbia College example, the question becomes this: How likely are we to select a sample (of size n = 100) with a mean of $37,000 or less when the mean for the whole population is $39,000? To answer this question, we need a crucial observation based on our work with sampling distributions in Chapter 8: The observed sample mean (x̄ = $37,000) is just one point in a distribution of sample means.
Moreover, assuming the null hypothesis is true, this distribution of sample means is approximately normal with a mean of μ = $39,000.

Figure 9.1 Graphical interpretation of the hypothesis test for the Columbia College example. If the null hypothesis is true, then the population mean is μ = $39,000. In that case, if we took many samples from the population, the distribution of sample means would be approximately normal with a mean of $39,000. Under this assumption, the distance of a single sample mean from the population mean allows us to decide whether to reject or not reject the null hypothesis.

Finding the Standard Score

Because the sampling distribution is a (nearly) normal distribution, the standard score of the sample mean represents a quantitative measure of the distance between the sample mean and the claimed population mean. Recall from Section 5.2 that the standard score (or z-score) of a data value in a normal distribution is the number of standard deviations that it lies above or below the mean of the distribution.

From the Central Limit Theorem (Section 5.3), the standard deviation of a distribution of sample means is σ/√n, where σ is the population standard deviation and n is the sample size. Putting these ideas together, we find the following formula for the standard score of a sample mean:

\[ z = \frac{\text{sample mean} - \text{population mean}}{\text{standard deviation of sampling distribution}} = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} \]

The only remaining problem is that we generally do not know the population standard deviation, σ. For now, let's assume that we can approximate the population standard deviation with the sample standard deviation, s, and that s = $6,150 for the 100 salaries in the sample.
(In Section 10.1 we'll discuss a better way to proceed when σ is not known.) In that case, we set σ = $6,150 and the standard deviation for the distribution of sample means is

\[ \frac{\sigma}{\sqrt{n}} = \frac{\$6{,}150}{\sqrt{100}} = \$615 \]

Using this value in the previous equation tells us that the standard score for the sample mean of x̄ = $37,000 in a sampling distribution with a population mean of μ = $39,000 is

\[ z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{\$37{,}000 - \$39{,}000}{\$615} = -3.25 \]

In other words, the sample mean of x̄ = $37,000 lies 3.25 standard deviations below the mean of the sampling distribution.

Computing the Standard Score for the Sample Mean in a Hypothesis Test

When we draw a random sample for a hypothesis test, we can consider it to be one of many possible samples in the sampling distribution. Given the sample size (n), the sample mean (x̄), the population standard deviation (σ), and the claimed population mean (μ), we make the following computations:

\[ \text{standard deviation for the distribution of sample means} = \frac{\sigma}{\sqrt{n}} \]

\[ \text{standard score for the sample mean, } z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} \]

Note: In reality, it is rare that we know the population standard deviation σ; see Section 10.1 about how to deal with such cases.

Critical Values for Statistical Significance

Recall that the hypothesis test is significant at the 0.05 level if the probability of finding a result as extreme as the one actually observed is 0.05 or less (assuming the null hypothesis is true). For a left-tailed test, then, we are looking for a standard score that is at or below the 5th percentile of the sampling distribution. From Appendix A, the 5th percentile has a standard score of z = -1.645.
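
The standard-score computation and the 5th-percentile critical value can be checked numerically. The following is only a minimal sketch, assuming Python's standard-library `statistics.NormalDist` stands in for the Appendix A table; the function name `standard_score` is a hypothetical helper introduced for illustration, not something from the text.

```python
from math import sqrt
from statistics import NormalDist


def standard_score(sample_mean, claimed_mean, sigma, n):
    """Return z = (sample mean - claimed mean) / (sigma / sqrt(n))."""
    standard_error = sigma / sqrt(n)  # standard deviation of the sample means
    return (sample_mean - claimed_mean) / standard_error


# Columbia College figures from the text; sigma is approximated by s = $6,150.
z = standard_score(sample_mean=37_000, claimed_mean=39_000, sigma=6_150, n=100)

# 5th percentile of the standard normal distribution (left-tailed, 0.05 level).
critical_05 = NormalDist().inv_cdf(0.05)

print(f"z = {z:.2f}")                         # z = -3.25
print(f"critical value = {critical_05:.3f}")  # critical value = -1.645
```

Using the inverse cumulative distribution function in place of a printed table gives the same critical value to three decimals, so the comparison of z against -1.645 is unchanged.
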
Therefore, a left-tailed hypothesis test is significant at the 0.05 level if the standard score of the sample mean is less than or equal to z = -1.645. This standard score represents the critical value for significance at the 0.05 level in a left-tailed hypothesis test.

Decisions Based on Statistical Significance for One-Tailed Hypothesis Tests

We decide whether to reject or not reject the null hypothesis by comparing the standard score (z) for a sample mean to critical values for significance at a given level. Table 9.1 summarizes the decisions for one-tailed hypothesis tests at the 0.05 and 0.01 levels of significance.

TIME OUT TO THINK
Suppose that, in right-tailed tests, one study finds a sample mean with z = 3 and another study finds a sample mean with z = 10. Both are significant at the 0.01 level, but which result provides stronger evidence for rejecting the null hypothesis? Explain.

EXAMPLE 1 Columbia College Test Significance

Assuming that the null hypothesis is true and the mean starting salary for Columbia College graduates is $39,000, is it statistically significant to find a sample in which the mean is only $37,000? Based on your answer, should we reject or not reject the null hypothesis?

Solution: Recall that the hypothesis test is left-tailed (because the alternative hypothesis has the "less than" form of μ < $39,000) and we already found that a sample mean of x̄ = $37,000 has a standard score of -3.25. This result is significant at the 0.05 level because the standard score is less than the critical value of z = -1.645.
In fact, it is also significant at the 0.01 level, because the standard score is less than -2.33. We therefore have strong reason to reject the null hypothesis and conclude that Columbia College officials did indeed exaggerate the mean starting salary of its graduates.

Finding the P-Value

Recall that the P-value is the probability of finding a sample mean as extreme as the one found, under the assumption that the null hypothesis is true. For a left-tailed test, the probability of finding a sample mean less than or equal to some particular value is simply the area under the curve to the left of the sample mean.

Figure 9.4 The P-value for one-tailed hypothesis tests corresponds to an area under the sampling distribution curve.

Once we compute the standard score for a particular sample mean, we can find the corresponding area from standard score tables like those in Appendix A. (Note: Appendix A lists areas to the left of each standard score; therefore, for right-tailed tests, finding the area to the right of the sample mean requires subtracting the value given in the table from 1.)

Summary of One-Tailed Tests for Population Means

To summarize:

• Because we are dealing with population means, the null hypothesis has the form μ = claimed value. To decide whether to reject or not reject the null hypothesis, we must determine whether a sample as extreme as the one found in the hypothesis test is likely or unlikely to occur if the null hypothesis is true.
• We determine this likelihood from the standard score (z) of the sample mean, which we compute from the formula

\[ z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} \]

where n is the sample size, x̄ is the sample mean, μ is the population mean claimed by the null hypothesis, and σ is the population standard deviation. (In this section, we generally approximate σ with the sample standard deviation, s; a better approach is described in Section 10.1.)

• We can then assess the standard score in two ways:
1. We can assess its level of statistical significance by comparing it to the critical values given in Table 9.1 (page 383).
2. We can determine its P-value with standard score tables like those in Appendix A. For a left-tailed test, the P-value is the area under the normal curve to the left of the standard score; for a right-tailed test, it is the area under the normal curve to the right of the standard score.

• If the result is statistically significant at the chosen level (usually either the 0.05 or the 0.01 significance level), we reject the null hypothesis. If it is not statistically significant, we do not reject the null hypothesis.

EXAMPLE 2 Mean Rental Car Mileage (Revisited)

Recall the case of the rental car fleet owner (Example 5 of Section 9.1) who suspects that the mean annual mileage of his cars is greater than the national mean of 12,000 miles. He selects a random sample of n = 225 cars and calculates the sample mean to be x̄ = 12,375 miles and the sample standard deviation to be s = 2,415 miles. Determine the level of statistical significance and P-value for this hypothesis test, and interpret your findings.

Solution: To determine the significance and P-value we must first calculate the standard score for the sample mean of x̄ = 12,375 miles.
We are given the claimed population mean (μ = 12,000 miles) and the sample size (n = 225); we do not know the population standard deviation, σ, but we will assume it is the same as the sample standard deviation and set σ = 2,415 miles. We find

\[ z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{12{,}375 - 12{,}000}{2{,}415 / \sqrt{225}} = 2.33 \]

This standard score is greater than the critical value of z = 1.645 for significance at the 0.05 level and equal to the critical value of z = 2.33 for significance at the 0.01 level, giving us strong reason to reject the null hypothesis and conclude that the rental car fleet average really is greater than the national average.

We can find the P-value from Appendix A, which shows that the area to the left of a standard score of z = 2.33 is 0.9901; because this is a right-tailed test, the probability is the area to the right (see Figure 9.4b), which is 1 - 0.9901 = 0.0099. Therefore, the P-value is 0.0099, which is very close to 0.01.

Two-Tailed Tests

The same basic ideas apply to two-tailed hypothesis tests, in which the alternative hypothesis has the "not equal to" form of Ha: μ ≠ claimed value.

For two-tailed tests, a value "as extreme as the one actually found" can lie either on the left or on the right of the sampling distribution (Figure 9.5). A probability of 0.05, or 5%, therefore corresponds to standard scores either in the first 2.5% of the sampling distribution on the left or in the last 2.5% on the right. From Appendix A, the 2.5th percentile corresponds to a standard score of -1.96 and the 97.5th percentile corresponds to a standard score of 1.96. These standard scores become the critical values for two-tailed tests to be significant at the 0.05 level.
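
The right-tailed P-value for the rental car example and the two-tailed critical values can be reproduced in a few lines. This is a sketch under the same assumption the text makes (σ approximated by s), with Python's standard-library `statistics.NormalDist` used in place of Appendix A; the variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

# Example 2 figures; sigma is approximated by s = 2,415 miles.
z = (12_375 - 12_000) / (2_415 / sqrt(225))

# Right-tailed test: the P-value is the area to the right of z.
p_right = 1 - std_normal.cdf(z)

# Two-tailed critical values at the 0.05 level: 2.5th and 97.5th percentiles.
lower_05 = std_normal.inv_cdf(0.025)
upper_05 = std_normal.inv_cdf(0.975)

print(f"z = {z:.2f}, right-tailed P-value = {p_right:.4f}")
print(f"two-tailed critical values at 0.05: {lower_05:.2f} and {upper_05:.2f}")
```

Because the code keeps the unrounded z rather than the table entry for 2.33, the P-value can differ from the table-based 0.0099 in the fifth decimal place; the conclusion is the same.
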
Figure 9.5 For two-tailed tests, the critical values for significance at the 0.05 level correspond to the 2.5th and 97.5th percentiles (as opposed to the 5th or 95th percentiles for one-tailed tests).

Two-Tailed Test (Ha: μ ≠ claimed value)

Statistical significance: A two-tailed test is significant at the 0.05 level if the standard score of the sample mean is at or below a critical value of -1.96 or at or above a critical value of 1.96. For significance at the 0.01 level, the critical values are -2.575 and 2.575.

P-values: To find the P-value for a sample mean in a two-tailed test, first use the standard score of the sample mean to find the P-value assuming the test is one-tailed; then double this value to find the P-value for the two-tailed test.

An example:

Consider a drug company that seeks to be sure that its "500-milligram" aspirin tablets really contain 500 milligrams of aspirin. The null hypothesis says that the population mean of the aspirin content is 500 milligrams:

H0: μ = 500 milligrams

The drug company is interested in the possibility that the mean weight is either less than or greater than 500 milligrams. Because the company is interested in possibilities on both sides of the claimed mean of 500 milligrams, we have a two-tailed test in which the alternative hypothesis is

Ha: μ ≠ 500 milligrams

Suppose the company selects a random sample of n = 100 tablets and finds that they have a mean weight of x̄ = 501.5 milligrams; further suppose that the population standard deviation is σ = 7.0 milligrams. Then the standard score (z) for this sample mean is

\[ z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{501.5 - 500}{7 / \sqrt{100}} = 2.14 \]

This standard score is above the critical value of 1.96 for a two-tailed test, so it is significant at the 0.05 level.
(It is not significant at the 0.01 level, because it is not above the critical value of 2.575.) This gives us good reason to reject the null hypothesis and conclude that the mean weight of the "500-milligram" aspirin tablets is different from 500 milligrams.

We find the P-value by using Appendix A, which shows that the area to the left of a standard score of z = 2.14 is 0.9838. Therefore, if this were a one-tailed test, the P-value would be the area to the right, or 1 - 0.9838 = 0.0162. But because this is a two-tailed test, we double this number to find that the P-value is 2 × 0.0162 = 0.0324. In other words, if the null hypothesis is true, the probability of drawing a sample as extreme as the one found is 0.0324.

Common Errors in Hypothesis Testing

Two common types of error may affect the conclusions. Recall the legal analogy at the end of Section 9.1, in which the null hypothesis is H0: The defendant is innocent. One type of error occurs if we conclude that the defendant is guilty when, in reality, he or she is innocent. In this case, we have wrongly rejected the null hypothesis. The other type of error occurs if we find the defendant not guilty when he or she actually is guilty. In this case, we have wrongly failed to reject the null hypothesis.

Type I and Type II Errors

• An error in which H0 is wrongly rejected is called a type I error.
• An error in which we wrongly fail to reject H0 is called a type II error.

Bias in Choosing Hypotheses

Consider a situation in which a factory is investigated for releasing pollutants into a nearby stream at a level that may (or may not) exceed the maximum level allowed by the government.
The logical choice for the null hypothesis is that the actual mean level of pollutants equals the maximum allowed level:

H0: mean level of pollutants = maximum level allowed

However, this leaves us with two reasonable choices for the alternative hypothesis, each of which introduces some bias into the ultimate conclusions. If factory officials conducted the test, they would be inclined to claim that the mean level of pollutants is less than the maximum allowed level:

Ha: mean level of pollutants < maximum level allowed

With this alternative hypothesis, the two possible outcomes are

• Reject H0, in which case we conclude that the mean level of pollutants is less than the allowed maximum. This outcome would please the factory officials.
• Not reject H0, in which case the test is inconclusive.

Now suppose an environmental group conducts the test. Because the group suspects that the factory is violating the government standards, its choice for the alternative hypothesis would be that the mean level of pollutants is greater than the maximum allowed level, or

Ha: mean level of pollutants > maximum level allowed

With this choice of alternative hypothesis, rejection of H0 implies that the factory has violated the standards, while not rejecting H0 is again inconclusive. In other words, choosing the right-tailed ("greater than") test creates a situation in which the factory may be found in violation but cannot be proved to be in compliance.
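
To see how the choice of alternative hypothesis shapes the outcome, the sketch below applies the left-tailed, right-tailed, and two-tailed P-value rules from this section to a single standard score. The helper names `p_value` and `decision` and the pollutant score z = 2.0 are hypothetical, chosen only for illustration; the text gives no pollutant data. The aspirin score z = 2.14 is from the two-tailed example.

```python
from statistics import NormalDist


def p_value(z, tail):
    """P-value of standard score z for a 'left', 'right', or 'two' tailed test."""
    area_left = NormalDist().cdf(z)  # area under the normal curve left of z
    if tail == "left":
        return area_left
    if tail == "right":
        return 1 - area_left
    if tail == "two":
        # Double the smaller one-tailed area.
        return 2 * min(area_left, 1 - area_left)
    raise ValueError(f"unknown tail: {tail!r}")


def decision(z, tail, alpha=0.05):
    """Reject H0 when the P-value is at or below the significance level."""
    return "reject H0" if p_value(z, tail) <= alpha else "do not reject H0"


# Hypothetical pollutant sample whose mean lies 2 standard errors above the limit.
z_pollutants = 2.0
print("factory (left-tailed):   ", decision(z_pollutants, "left"))
print("activists (right-tailed):", decision(z_pollutants, "right"))

# Aspirin example from the text: two-tailed P-value for z = 2.14.
print(f"aspirin two-tailed P-value = {p_value(2.14, 'two'):.4f}")
```

With the same hypothetical data, the left-tailed test is inconclusive while the right-tailed test rejects H0, which is the asymmetry the passage describes.
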