SECTION 9.3 – SIGNIFICANCE TESTS ABOUT MEANS

Overview
For the most part, significance tests about means follow the same steps as significance tests for proportions.
We also need to check the assumption that the population is normally distributed.
◦ This is important when n is small and we are computing a one-sided test.
◦ Two-sided tests are robust concerning the normality assumption.

First Example
A job placement director claims that the average starting salary for nurses is $24,000. A researcher questions this claim; she thinks the true mean is lower than that figure. She takes a random sample of 10 starting nurses and gets the following data:
23,400 19,000 20,200 22,000 22,145 25,150 24,530 25,000 21,000 25,500
Is there enough evidence to reject the director’s claim at α = 0.05?
Note: Using my TI-83, I found a mean of $22,792.50 and a standard deviation of $2,273.37.

Hypothesis Testing Steps
1. What is the population mean to be tested? µ = the average starting salary among all nurses.
2. State the null hypothesis. H0 is “µ = 24000”
3. State the alternative hypothesis. Ha is “µ < 24000”
4. Calculate the t test statistic and the P-value. The standard error is s/√n = 2273.37/√10 ≈ 718.90, so t = (22792.50 – 24000)/718.90 = -1.68 with n – 1 = 9 degrees of freedom. P-value = Pr(t < -1.68) = tcdf(-10000, -1.68, 9) = 0.064
5. Do we reject the null hypothesis? Since the P-value (0.064) is larger than α = 0.05, we do not reject H0.
6. English conclusion: “There is not statistically significant evidence that the average starting salary among all nurses is less than $24K.”
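The arithmetic in the first example (nurse starting salaries) can be checked without a calculator. The following is a minimal sketch using only the Python standard library. The helper names `t_pdf` and `t_cdf` are illustrative and not part of the original slides, and the P-value is obtained by numerically integrating the t density rather than by the TI-83's `tcdf`.

```python
import math
import statistics

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P(T <= x): Simpson's rule on [0, |x|], then symmetry about 0."""
    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # area under the density between 0 and |x|
    return 0.5 + area if x >= 0 else 0.5 - area

# Sample of 10 starting salaries from the first example
salaries = [23400, 19000, 20200, 22000, 22145, 25150, 24530, 25000, 21000, 25500]
mu0 = 24000    # value claimed under H0
alpha = 0.05

n = len(salaries)
xbar = statistics.mean(salaries)
s = statistics.stdev(salaries)      # sample standard deviation (divides by n - 1)
se = s / math.sqrt(n)               # standard error of the mean
t_stat = (xbar - mu0) / se
p_value = t_cdf(t_stat, n - 1)      # one-sided: Ha is mu < mu0

print(f"mean = {xbar:.2f}, s = {s:.2f}, se = {se:.2f}")
print(f"t = {t_stat:.2f}, P-value = {p_value:.3f}")
print("reject H0" if p_value < alpha else "do not reject H0")
```

Because Ha is one-sided (µ < 24000), only the lower tail of the t distribution is used for the P-value.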
Hypothesis Testing Steps
1. What is the population mean to be tested?
2. State the null hypothesis.
3. State the alternative hypothesis.
4. Calculate the test statistic (t, or z when the population standard deviation is known) and the P-value.
5. Do we reject the null hypothesis?
6. State the conclusion in plain English.

Example 1
The diastolic blood pressure for American women ages 18 to 44 has approximately a Normal distribution with mean 75 millimeters of mercury (mm Hg) and standard deviation 10 mm Hg. We suspect that regular exercise will lower the blood pressure. A sample of 25 women who jog at least 5 miles per week gives a sample mean blood pressure of 71 mm Hg. Is this good evidence (at the 5% level) that the mean diastolic blood pressure for the population of regular exercisers is lower than 75 mm Hg?

Step 1: Assumptions
The variable of interest is quantitative: what is your diastolic blood pressure?
◦ This is a test with means.
The data is produced using randomized methods.
The sample size is smaller than 30, but the population distribution is said to be Normal.

Step 2: Hypotheses
µ = the mean diastolic blood pressure (mm Hg) among all women who jog at least 5 miles per week.
H0: µ = 75 mm Hg
Ha: µ < 75 mm Hg

Step 3: Test Statistic
The population standard deviation σ = 10 mm Hg is given, so the test statistic is a z-score.
\[ \text{s.e.} = \frac{\sigma}{\sqrt{n}} = \frac{10}{\sqrt{25}} = 2 \text{ mm Hg} \]
\[ \bar{x} = 71 \text{ mm Hg} \]
\[ z = \frac{71 - 75}{2} = -2 \]

Step 4: P-value
normalcdf(-1000, 71, 75, 2) = 0.023
If the true mean diastolic blood pressure for female joggers were 75 mm Hg (as for women in general), we would still get a sample mean of 71 mm Hg or lower 2.3% of the time.

Step 5: Conclusion
The P-value (0.023) is smaller than 0.05, so this evidence is strong against the null hypothesis. Thus we are led to believe that the mean diastolic blood pressure for female joggers is less than 75 mm Hg.

Example 2
The following figures refer to the actual weights (in ounces) of a simple random sample of ten "one-pound" cans of peaches distributed by a company.
It is known that weights of this type vary with a Normal distribution.
16.3 15.5 15.4 15.7 16.0 16.8 16.8 16.2 16.5 15.8
At the 5% level of significance, decide whether or not the true mean weight of all of the cans distributed by the company is different from 16 ounces.

Step 1: Assumptions
The variable of interest is quantitative: what is the weight of each can?
◦ This is a test with means.
The data is produced using randomized methods.
The sample size is smaller than 30, but the population distribution is said to be Normal.

Step 2: Hypotheses
µ = the average weight (oz) of all cans of peaches distributed by the company
H0: µ = 16 ounces
Ha: µ ≠ 16 ounces

Step 3: Test Statistic
The population standard deviation is unknown, so we estimate it with the sample standard deviation s = 0.5055 ounces and use a t statistic with n − 1 = 9 degrees of freedom.
\[ \text{s.e.} \approx \frac{s}{\sqrt{n}} = \frac{0.5055}{\sqrt{10}} \approx 0.16 \text{ ounces} \]
\[ \bar{x} = 16.1 \text{ ounces} \]
\[ t = \frac{16.1 - 16}{0.16} = 0.625 \]

Step 4: P-value
2*tcdf(0.625, 1000, 9) = 0.547
If the mean weight of cans were 16 ounces (as the label claims), we would still get a sample mean off by this much or more, in either direction, 54.7% of the time.

Step 5: Conclusion
The P-value (0.547) is much larger than 0.05, so this evidence is not strong enough to reject the null hypothesis. We cannot conclude that the mean weight of the cans differs from the posted weight of 16 ounces.

SECTION 9.5 – LIMITATIONS OF SIGNIFICANCE TESTS

Common Misconceptions
Small P-values correspond to statistically significant results. That does not mean that the results are practically significant.
◦ The sample value can be extremely close to the hypothesized parameter value, yet a really large sample size can make the difference statistically significant.
“Not rejecting” the null hypothesis is not the same as “accepting” the null hypothesis.
The P-value is NOT the probability that the null hypothesis is true.

Common Misinterpretations
Some tests may be statistically significant by chance.
◦ With repeated testing, some test is bound to appear significant by chance alone, even when every null hypothesis is true.
The only reported or publicized data might be the extreme data.
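The point about repeated testing can be illustrated with a small simulation. The sketch below is not from the original slides. It borrows the setup of Example 1 (µ = 75, σ = 10, n = 25, one-sided z test) as the test being repeated, and the choices of 20 tests per batch, 2000 batches, and the random seed are arbitrary assumptions made for illustration. Every simulated study is generated with the null hypothesis true, so any rejection is a false positive.

```python
import math
import random
from statistics import NormalDist

# Setup borrowed from Example 1: H0 mean, known sigma, sample size, level
MU0, SIGMA, N, ALPHA = 75, 10, 25, 0.05
SE = SIGMA / math.sqrt(N)

def one_sided_p(sample_mean):
    """P-value for Ha: mu < MU0, using a z test with known sigma."""
    return NormalDist().cdf((sample_mean - MU0) / SE)

# Sanity check against Example 1: sample mean of 71 mm Hg
p_example1 = one_sided_p(71)

def batch_has_false_positive(rng, tests=20):
    """Run `tests` studies in which H0 is true; report whether any rejects H0."""
    for _ in range(tests):
        sample = [rng.gauss(MU0, SIGMA) for _ in range(N)]  # H0 holds by construction
        if one_sided_p(sum(sample) / N) < ALPHA:
            return True
    return False

rng = random.Random(1)   # fixed seed so the run is repeatable
batches = 2000
simulated = sum(batch_has_false_positive(rng) for _ in range(batches)) / batches
analytic = 1 - (1 - ALPHA) ** 20   # chance that at least one of 20 independent tests rejects

print(f"Example 1 P-value: {p_example1:.3f}")
print(f"Share of batches with at least one 'significant' test: {simulated:.3f}")
print(f"Value expected from 1 - 0.95^20: {analytic:.3f}")
```

With 20 independent tests at the 5% level, the chance that at least one appears significant is 1 − 0.95^20, which is roughly 0.64, even though no real effect exists in any of them.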
Significance Tests are Less Useful than Confidence Intervals
A significance test merely indicates whether one particular parameter value is plausible.
◦ If our significance test fails to reject the null hypothesis, we learn only that the hypothesized value is plausible; we have no idea what the other plausible values are.
A confidence interval gives a whole range of plausible values for the parameter.
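To make the comparison concrete, here is a hedged sketch that computes a confidence interval for the peach-can data of Example 2. The slides do not compute this interval. The 95% confidence level and the critical value 2.262 (the standard t-table entry for 9 degrees of freedom) are assumptions introduced for this illustration.

```python
import math
import statistics

# Weights (ounces) of the ten cans from Example 2
weights = [16.3, 15.5, 15.4, 15.7, 16.0, 16.8, 16.8, 16.2, 16.5, 15.8]
mu0 = 16.0       # hypothesized mean from H0
t_star = 2.262   # assumed: t-table critical value for 95% confidence, df = 9

n = len(weights)
xbar = statistics.mean(weights)
s = statistics.stdev(weights)    # sample standard deviation (divides by n - 1)
se = s / math.sqrt(n)            # standard error of the mean

margin = t_star * se
lower, upper = xbar - margin, xbar + margin

print(f"95% CI for the mean weight: ({lower:.2f}, {upper:.2f}) ounces")
print("16 oz is inside the interval" if lower <= mu0 <= upper
      else "16 oz is outside the interval")
```

The two-sided test in Example 2 only says that 16 ounces is a plausible mean. The interval shows every mean weight that is plausible at this confidence level, and 16 ounces is one of them, which agrees with the decision not to reject H0.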