Chapter 9
Inferring Population Means
Copyright © 2014 Pearson Education, Inc. All rights reserved

Learning Objectives
Understand when the Central Limit Theorem for sample means applies and know how to use it to find approximate probabilities for sample means.
Know how to test hypotheses concerning a population mean and concerning the comparison of two population means.
Understand how to find, interpret, and use confidence intervals for a single population mean and for the difference of two population means.
Understand the meaning of the p-value and of significance levels.
Understand how to use a confidence interval to carry out a two-tailed hypothesis test for a population mean or for a difference of two population means.

9.1 Sample Means of Random Samples

Statistics, Parameters, Means and Proportions
Use the mean and standard deviation if the survey question has a numerical variable.
Use a proportion if the survey question is Yes/No.
The confidence interval and hypothesis test always refer to the population, not the sample.

Accuracy of the Sample Mean
If the sample mean is accurate, then the average of all sample means will equal the population mean.
If simple random sampling is used, the sample mean is accurate, also called unbiased.
Other sampling techniques, to be looked at later, produce results that are close to being unbiased.

Precision and the Sample Mean
The precision of the sample mean describes how much variability there is from one sample mean to the next.
If the population standard deviation is small, the sample mean will have more precision.
If the sample size is large, the sample mean will have more precision.
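The accuracy and precision claims above can be checked with a small simulation. The sketch below is only an illustration: the skewed (exponential) population, its mean of 10, and the function name simulate_sample_means are assumptions made for the demonstration, not part of the chapter. It draws many random samples of each size and reports the center and spread of the resulting sample means.

```python
import random
import statistics

POP_MEAN = 10.0  # hypothetical population mean (assumption, not from the slides)

def simulate_sample_means(n, reps=2000, seed=0):
    """Draw `reps` random samples of size n and return the list of sample means."""
    rng = random.Random(seed)
    means = []
    for _ in range(reps):
        # Exponential population: skewed right, mean POP_MEAN, SD POP_MEAN
        sample = [rng.expovariate(1 / POP_MEAN) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means

results = {n: simulate_sample_means(n) for n in (4, 25, 100)}
for n, means in results.items():
    center = statistics.fmean(means)  # accuracy: stays near POP_MEAN for every n
    spread = statistics.stdev(means)  # precision: shrinks as n grows
    print(f"n={n:3d}  mean of sample means={center:.2f}  spread={spread:.2f}")
```

Under these assumptions the center should stay close to 10 for every sample size, while the spread should fall roughly in proportion to one over the square root of n.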
Simulating Many Sample Means
As the sample size increases:
Better precision
Accuracy does not change

Standard Error
The standard error is the standard deviation of the sampling distribution.
\(\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}\)

Standard Error and Sample Size
\(\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}\)
The standard error is smaller for larger sample sizes.
Increasing the sample size by a factor of 4 decreases the standard error by a factor of 2.
Increasing the sample size by a factor of 100 decreases the standard error by a factor of 10.

The mean cost per item at a grocery store is $2.75 and the standard deviation is $1.26. A shopper randomly puts 36 items in her cart.
Is $2.75 a parameter or a statistic? Parameter
Predict the average cost per item in the shopper's cart. $2.75
Find the standard error for carts with 36 items. \(\sigma_{\bar{x}} = \frac{1.26}{\sqrt{36}} = 0.21\)

Comparing Standard Errors
The mean income for residents of the city is $47,000 and the standard deviation is $12,000. Find the standard error for the following sample sizes:
n = 1 → $12,000
n = 4 → $6,000
n = 16 → $3,000
n = 100 → $1,200

9.2 The Central Limit Theorem for Sample Means

Conditions for the Central Limit Theorem for Sample Means
Random sampling technique
One or both of the following: the population is Normally distributed; the sample size is large
The population size is at least 10 times bigger than the sample size

What is a Large Enough Sample Size?
If the population distribution is not too far from Normal, then the sample size can be small.
For most population distributions, n = 25 or higher gives sufficient accuracy.
If the population distribution is far from Normal, a larger sample size is needed.

Central Limit Theorem for Means
Central Limit Theorem: If the conditions are met and the population has mean \(\mu\) and standard deviation \(\sigma\), then the sampling distribution of the sample mean will be approximately Normal:
\(N\left(\mu, \frac{\sigma}{\sqrt{n}}\right)\)

Visualizing the Central Limit Theorem
[Figure: a population distribution that is skewed right, next to its sampling distribution, which is approximately Normal]

The Central Limit Theorem and Shape
Population distribution uniform → sampling distribution mound shaped
Population distribution bimodal → sampling distribution unimodal
Population distribution skewed left → sampling distribution symmetric
Population distribution not symmetric → sampling distribution symmetric

Applying the Central Limit Theorem
The distribution of women's pulse rates is skewed right with \(\mu = 74\) bpm, \(\sigma = 13\) bpm. If 38 women are selected, find \(P(\bar{x} < 72)\).
\(N\left(74, \frac{13}{\sqrt{38}}\right) \approx N(74, 2.1)\)
\(P(\bar{x} < 72) \approx 0.17\)

If one woman is selected, can you find \(P(x < 72)\)?
No. Since the distribution is skewed right, it is not Normal. Without more information about the distribution, this probability cannot be found.

Population, Sample, and Sampling Distributions
The population distribution is the distribution of all individuals that exist.
The distribution of the sample is the distribution of the individuals that were surveyed.
The sampling distribution is the distribution of all possible sample means of sample size n.
The mean, standard deviation, and shape of the distribution of the sample are likely to be close to those of the population distribution.
The mean of the sampling distribution will be the same as the population mean, but its shape will be approximately Normal and its standard deviation will be smaller.

The t-Distribution
If \(\sigma\) is unknown, we cannot find the z-score. Use the sample standard deviation s instead.
\(t = \frac{\bar{x} - \mu}{s / \sqrt{n}}\)
\(SE_{EST} = \frac{s}{\sqrt{n}}\) is an estimate for the standard error.

Facts About the t-Distribution
Bell shaped
Tails a little bigger than Normal
Given n, there are n – 1 degrees of freedom.
For large degrees of freedom, the distribution is almost Normal.

9.3 Answering Questions about the Mean of a Population

Confidence Interval for a Population Mean
Gives a plausible range of values for the population mean.
The confidence level gives the percent of all possible confidence intervals that contain the population mean.
Similar to the confidence interval for a population proportion, but used for a quantitative variable.

CI Example
\(\bar{x} \pm t^* \cdot SE_{EST}\)
45 randomly selected college students worked on homework for an average of 9 hours per week. Their standard deviation was 2 hours. Find a 90% confidence interval for the population mean.
d.f. = 44 → \(t^* = 1.68\), \(SE_{EST} = \frac{s}{\sqrt{n}} = \frac{2}{\sqrt{45}} \approx 0.30\)
Lower bound: 9 – 1.68 × 0.30 ≈ 8.5
Upper bound: 9 + 1.68 × 0.30 ≈ 9.5
Confidence interval: (8.5, 9.5)

CI Interpretations: (8.5, 9.5)
45 randomly selected college students worked on homework for an average of 9 hours per week. Their standard deviation was 2 hours. Find a 90% confidence interval for the population mean.
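The interval for the homework example can be reproduced without a t-table. What follows is a minimal sketch using only the Python standard library: the helper names t_cdf, t_critical, and mean_confidence_interval are illustrative assumptions, and the t-distribution probabilities are approximated numerically (Simpson's rule plus bisection) rather than taken from a statistics package.

```python
import math

def t_cdf(t, df, steps=2000):
    """Approximate P(T <= t) for a t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    pdf = lambda x: coef * (1 + x * x / df) ** (-(df + 1) / 2)
    a = abs(t)
    h = a / steps
    # Simpson's rule for the area under the density between 0 and |t|
    total = pdf(0) + pdf(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3
    return 0.5 + area if t >= 0 else 0.5 - area

def t_critical(confidence, df):
    """Find t* such that P(-t* < T < t*) = confidence, by bisection."""
    target = 0.5 + confidence / 2
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def mean_confidence_interval(xbar, s, n, confidence):
    """Return (lower, upper) for xbar +/- t* * s / sqrt(n)."""
    se_est = s / math.sqrt(n)
    margin = t_critical(confidence, n - 1) * se_est
    return xbar - margin, xbar + margin

# Homework example: n = 45, mean 9 hours, s = 2 hours, 90% confidence
low, high = mean_confidence_interval(xbar=9, s=2, n=45, confidence=0.90)
print(f"({low:.1f}, {high:.1f})")
```

Rounded to one decimal place, the bounds should agree with the interval worked out by hand on the slide.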
Interpretation of Confidence Interval: We are 90% confident that the population mean number of hours worked on homework for all college students is between 8.5 and 9.5 hours.
Interpretation of Confidence Level: If many groups of 45 randomly selected students were surveyed, each survey would result in a different confidence interval. 90% of these confidence intervals will succeed in containing the actual population mean number of hours worked on homework and 10% will not contain the true population mean.

Confidence Intervals and StatCrunch
To find the 90% confidence interval for the homework example from its summary statistics:
Stat → T statistics → One sample → with summary

Confidence Intervals Given Data
Data were collected on recovery time for the newest flu virus. Find a 95% confidence interval for the population mean recovery time.
Stat → T statistics → One sample → with data
A 95% confidence interval for the population mean recovery time was (5.9, 7.5).
We are 95% confident that the mean recovery time for all people who get the flu is between 5.9 and 7.5 days.
If many random groups are observed, then a different confidence interval would result from each.
95% of these confidence intervals will contain the true population mean recovery time, while 5% will not.

Hypothesis Test for a Population Mean
The same four steps apply for a hypothesis test for a population mean:
1. Hypothesize: State \(H_0\) and \(H_a\).
2. Prepare: Choose \(\alpha\), check conditions and assumptions, and determine the test statistic to use.
3. Compute to Compare: Compute the test statistic and the p-value, and compare the p-value with \(\alpha\).
4. Interpret: Reject or fail to reject \(H_0\)? Write down the conclusion in the context of the study.

Hypothesis Test Example (by Formula)
Ford claims that its 2012 Focus gets 40 mpg on the highway. Does your Focus' mpg differ from 40 mpg? You chart your Focus over 35 randomly selected highway trips and find it got 39.5 mpg with a standard deviation of 1.4 mpg.
1. Hypothesize: \(H_0: \mu = 40\), \(H_a: \mu \neq 40\)
2. Prepare: Choose \(\alpha = 0.05\). Use the t-statistic: random and large sample.
3. Compute to Compare: \(t = \frac{39.5 - 40}{1.4 / \sqrt{35}} \approx -2.11\), p-value ≈ 0.04
4. Interpret: p-value = 0.04 < \(\alpha\) = 0.05. Reject \(H_0\). Accept \(H_a\). There is statistically significant evidence to conclude that your Focus does not get 40 mpg on average.

Hypothesis Test with Data Using StatCrunch
150 minutes per week of exercise is recommended. Do college students exercise more than 150 minutes per week? 42 randomly selected college students tracked their weekly exercise.
1. Hypothesize: \(H_0: \mu = 150\), \(H_a: \mu > 150\)
2. Prepare: Use \(\alpha = 0.05\) and find a t-statistic; large sample.
3. Compute to Compare: Stat → T statistics → One sample → with data
4. Interpret: p-value = 0.081 > 0.05 = \(\alpha\). Fail to reject \(H_0\). There is insufficient evidence to conclude that the average college student exercises more than the recommended 150 minutes per week.

9.4 Comparing Two Population Means

Independent vs. Dependent (Paired)
Two samples are dependent, or paired, if each observation from one group is coupled with a particular observation from the other group. Examples:
Before and after
Identical twins
Husband and wife
Older sibling and younger sibling
If there is no pairing, then the samples are independent.

Independent (Ind) or Dependent (Dep)?
Do women perform better on average than men on their statistics final? 60 women and 40 men were surveyed. → Ind
40 people's blood pressure was measured before and after giving a public speech. Does blood pressure change on average? → Dep
Is the average tip percent greater for dinner than lunch? 35 wait staff who worked both lunch and dinner looked at their receipts. → Dep
Are Americans more stressed out on average compared to the French? 50 from each country were given a stress test. → Ind

Independent Samples: Standard Error and Margin of Error
\(SE_{EST} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}\)
Margin of error = \(t^* \cdot SE_{EST}\)
The degrees of freedom are approximately the smaller of \(n_1 - 1\) and \(n_2 - 1\). Use a computer or calculator for better accuracy.

Requirements for Independent Samples
Both samples are randomly taken and each observation is independent of any other.
The two samples are independent of each other (not paired).
Either both populations are Normally distributed or each sample size is greater than 25.

Example: Independent Samples
38 randomly selected engineering majors and 42 randomly selected psychology majors were observed to estimate the difference in how long it takes to graduate.
\(\bar{x}_E = 5.1\), \(s_E = 0.4\), \(\bar{x}_P = 5.6\), \(s_P = 0.5\)
Find a 95% confidence interval for the difference.
The two samples are independent since there is no pairing between each engineering major and each psychology major. The students were selected randomly and independently, and the sample sizes are both greater than 25.
Stat → T Statistics → Two sample → with summary
We are 95% confident that the average time it takes to graduate is between 0.3 and 0.7 years longer for psychology majors than for engineering majors.
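The graduation-time interval can be approximated directly from the summary statistics. The sketch below makes two assumptions that should be read as such: it uses the conservative degrees-of-freedom rule stated on the slide (the smaller of n1 – 1 and n2 – 1) rather than whatever StatCrunch computes internally, and the helper names t_cdf, t_critical, and two_sample_ci are hypothetical, with the t-distribution approximated numerically.

```python
import math

def t_cdf(t, df, steps=2000):
    """Approximate P(T <= t) for a t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    pdf = lambda x: coef * (1 + x * x / df) ** (-(df + 1) / 2)
    a = abs(t)
    h = a / steps
    # Simpson's rule for the area under the density between 0 and |t|
    total = pdf(0) + pdf(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3
    return 0.5 + area if t >= 0 else 0.5 - area

def t_critical(confidence, df):
    """Find t* such that P(-t* < T < t*) = confidence, by bisection."""
    target = 0.5 + confidence / 2
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def two_sample_ci(x1, s1, n1, x2, s2, n2, confidence):
    """CI for mean1 - mean2 from independent samples, using summary statistics."""
    se_est = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    df = min(n1, n2) - 1  # conservative rule from the slide, not the software value
    margin = t_critical(confidence, df) * se_est
    diff = x1 - x2
    return diff - margin, diff + margin

# Psychology minus engineering, so a positive difference means psychology takes longer
low, high = two_sample_ci(5.6, 0.5, 42, 5.1, 0.4, 38, confidence=0.95)
print(f"({low:.1f}, {high:.1f})")
```

Because the conservative rule uses fewer degrees of freedom than software typically does, the interval from this sketch may be very slightly wider than the StatCrunch result, although the two should agree after rounding to one decimal place.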
Hypothesis Test: Paired Samples
Does eating chocolate improve memory? 12 people were given a memory test before and after eating chocolate. The data for the number of words recalled out of 50 are shown below. Assume Normality.
Before: 24 16 33 9 42 38 27 30 41
After: 20 29 11 42 39 25 34 44
[Table only partially recovered; one further entry, 26, could not be placed]
1. Hypothesize: \(H_0: \mu_{diff} = 0\), \(H_a: \mu_{diff} \neq 0\)
2. Prepare: \(\alpha = 0.05\), t-statistic (Normality is assumed)
3. Compute to Compare: Stat → T Statistics → Paired
4. Interpret: p-value = 0.13 > 0.05 = \(\alpha\). Fail to reject \(H_0\). Conclusion: There is insufficient evidence to conclude that the mean number of words recalled increases after eating chocolate.

Hypothesis Test: Independent Samples
Do batteries last longer in colder climates than in warmer ones? The table shows some randomly selected battery lives in months.
Florida: 19 22 25 21 18 19 27 25
Montreal: 37 49 22 26 47 41 38 37
[Two further table entries, 28 and 15, could not be placed]
1. Hypothesize: \(H_0: \mu_F = \mu_M\), \(H_a: \mu_F < \mu_M\)
2. Prepare: \(\alpha = 0.05\). Independent samples; assume Normal distributions.
3. Compute to Compare: Stat → T Statistics → Two sample → with data
4. Interpret: p-value = 0.0009 < 0.05 = \(\alpha\). Reject \(H_0\). Accept \(H_a\). Conclusion: There is statistically significant evidence to support the claim that on average batteries last longer in Montreal than in Florida.

9.5 Overview of Analyzing Means

General Formulas
Hypothesis test statistic: \(\text{test statistic} = \frac{\text{estimated value} - \text{null hypothesis value}}{SE}\)
Confidence interval: \(\text{estimated value} \pm \text{multiplier} \cdot SE_{EST}\)

Finding the p-value Given the Test Statistic
Left-tailed hypothesis: Find the probability that a value is less than the test statistic.
Right-tailed hypothesis: Find the probability that a value is greater than the test statistic.
Two-tailed hypothesis: Make the test statistic negative. Then find the probability that a value is less than the test statistic. Finally, multiply by 2.

Comparing CI and Hypothesis Tests
It can be concluded at the 5% level that a value is not the mean, proportion, or difference if:
the value falls outside the 95% confidence interval
the p-value is less than 0.05
A 95% (90%, 99%) confidence interval is equivalent to a two-tailed test with \(\alpha = 0.05\) (0.10, 0.01) when it comes to rejecting or failing to reject \(H_0\).

Hypothesis Tests and CI Example
Suppose that a hypothesis test with \(H_0: \mu = 80\), \(H_a: \mu \neq 80\) was done for the average height of male college basketball players. If the p-value = 0.02, can the 95% confidence interval contain 80?
No. Since the p-value < 0.05, \(H_0\) is rejected. 80 cannot be in the confidence interval.
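The three rules under "Finding the p-value Given the Test Statistic" can be written as one small function. This is a sketch under stated assumptions: the names t_cdf and p_value are hypothetical, the t-distribution is approximated numerically with the standard library, and the demonstration reuses the summary numbers from the Ford Focus example (mean 39.5, null value 40, s = 1.4, n = 35).

```python
import math

def t_cdf(t, df, steps=2000):
    """Approximate P(T <= t) for a t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    pdf = lambda x: coef * (1 + x * x / df) ** (-(df + 1) / 2)
    a = abs(t)
    h = a / steps
    # Simpson's rule for the area under the density between 0 and |t|
    total = pdf(0) + pdf(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3
    return 0.5 + area if t >= 0 else 0.5 - area

def p_value(t, df, tail):
    """p-value for a t test statistic; tail is 'left', 'right', or 'two'."""
    if tail == "left":
        return t_cdf(t, df)             # probability of a value below t
    if tail == "right":
        return 1 - t_cdf(t, df)         # probability of a value above t
    if tail == "two":
        return 2 * t_cdf(-abs(t), df)   # make t negative, take the left area, double it
    raise ValueError("tail must be 'left', 'right', or 'two'")

# General formula: (estimated value - null hypothesis value) / SE
t_stat = (39.5 - 40) / (1.4 / math.sqrt(35))
p_two = p_value(t_stat, df=34, tail="two")
print(f"t = {t_stat:.2f}, two-tailed p-value = {p_two:.3f}")
```

Since the two-tailed p-value here falls below 0.05, the matching 95% confidence interval for the mean would not contain 40, which is the same equivalence the basketball example relies on.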
Hypothesis Test or Confidence Interval: Which Should be Used?
For one-tailed testing: hypothesis test
For two-tailed testing: either can be used
Confidence intervals give more information than hypothesis tests: a CI gives a plausible range for the population value.
The hypothesis test addresses the question of whether \(H_0\) is false.

Chapter 9 Case Study

Epilepsy, Drugs, and Giving Birth
Four drugs are taken for epilepsy: carbamazepine, lamotrigine, phenytoin, and valproate.
Three years after pregnant mothers took the medicine, their children were given an IQ test.
The New England Journal of Medicine reported that taking valproate increased the risk of impaired cognitive development.

95% Confidence Intervals
These give us a visual comparison.
The valproate CI does not overlap with the lamotrigine CI.
For better comparisons, use confidence intervals for the difference between means.

Confidence Intervals for Differences
None contain 0. A hypothesis test for a difference between the means will reject \(H_0\).
There is statistically significant evidence to conclude that the mean IQ for children born to mothers taking valproate is different from that for any of the other drugs.

Chapter 9 Guided Exercise 1

Is the Mean Body Temperature Really 98.6?
A random sample of 10 independent healthy people showed body temperatures (in degrees Fahrenheit) as follows:
98.5, 98.2, 99.0, 96.3, 98.3, 98.7, 97.2, 99.1, 98.7, 97.2
Use \(\alpha = 0.05\).
1. Hypothesize: \(H_0: \mu = 98.6\), \(H_a: \mu \neq 98.6\)
2. Prepare: Not far from Normal. Sample collected randomly.
Use the t-statistic.
3. Compute to Compare: t ≈ -1.65, p-value ≈ 0.13, and 0.13 > 0.05 = \(\alpha\)
4. Interpret: p-value = 0.13 > 0.05 = \(\alpha\). We cannot reject 98.6 as the population mean body temperature from these data at the 0.05 level.

Chapter 9 Guided Exercise 2
A two-sample t-test was carried out for the number of televisions owned in the households of random samples of students at two different community colleges. Assume independence. One of the schools is in a wealthy community (MC), and the other (OC) is in a less wealthy community.
1. Hypothesize: Let \(\mu_{oc}\) be the population mean number of televisions owned by families of students in the less wealthy community (OC), and let \(\mu_{mc}\) be the population mean number of televisions owned by families of students in the wealthy community (MC).
\(H_0: \mu_{oc} = \mu_{mc}\), \(H_a: \mu_{oc} \neq \mu_{mc}\)
2. Prepare: Choose an appropriate t-test. Because the sample sizes are 30, the Normality condition of the t-test is satisfied. State the other conditions, indicate whether they hold, and state the significance level that will be used.
Use a t-test with two independent samples. The households were chosen randomly and independently. The population of all households of each type is more than 10 times the sample sizes.
3. Compute to Compare: t = 0.95, p-value = 0.345
4. Interpret: Since the p-value = 0.345 is well above 0.05, we fail to reject \(H_0\). At the 5% significance level, we cannot reject the hypothesis that the mean number of televisions of all students in the wealthier community is the same as the mean number of televisions of all students in the less wealthy community.

Chapter 9 Guided Exercise 3

Pulse Before and After Fright
Test the hypothesis that the mean of college women's pulse rates is higher after a fright, using \(\alpha = 0.05\).
1. Hypothesize: \(H_0: \mu_{before} = \mu_{after}\), \(H_a: \mu_{before} < \mu_{after}\)
2. Prepare: Choose a test: Should it be a paired t-test or a two-sample t-test? Why? Assume that the sample was random and that the distribution of differences is sufficiently Normal. Mention the level of significance.
Paired t-test, since each woman is measured before and after. Level of significance: \(\alpha = 0.05\).
3. Compute to Compare: t ≈ 4.9, p-value = 0.002, and 0.002 < 0.05
4. Interpret: Reject or do not reject \(H_0\). Then write a sentence that includes "significant" or "significantly" in it. Report the sample mean pulse rate before the scream and the sample mean pulse rate after the scream.
Reject \(H_0\). There is statistically significant evidence to support the claim that the mean pulse rate is higher after a fright.
\(\bar{x}_{before} \approx 74.8\), \(\bar{x}_{after} \approx 83.7\)
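Of the guided exercises, only Guided Exercise 1 lists its raw data in full, so it is the one that can be re-checked from scratch. The sketch below recomputes its test statistic and two-tailed p-value from the ten body temperatures; the helper t_cdf is a hypothetical stdlib-only approximation of the t-distribution, not a StatCrunch routine.

```python
import math
import statistics

def t_cdf(t, df, steps=2000):
    """Approximate P(T <= t) for a t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    pdf = lambda x: coef * (1 + x * x / df) ** (-(df + 1) / 2)
    a = abs(t)
    h = a / steps
    # Simpson's rule for the area under the density between 0 and |t|
    total = pdf(0) + pdf(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3
    return 0.5 + area if t >= 0 else 0.5 - area

# Body temperatures from Guided Exercise 1 (degrees Fahrenheit)
temps = [98.5, 98.2, 99.0, 96.3, 98.3, 98.7, 97.2, 99.1, 98.7, 97.2]
null_mean = 98.6

n = len(temps)
xbar = statistics.fmean(temps)
s = statistics.stdev(temps)           # sample standard deviation
se_est = s / math.sqrt(n)             # estimated standard error
t_stat = (xbar - null_mean) / se_est
p_two = 2 * t_cdf(-abs(t_stat), n - 1)  # two-tailed, n - 1 degrees of freedom

print(f"mean = {xbar:.2f}, s = {s:.2f}, t = {t_stat:.2f}, p-value = {p_two:.2f}")
```

The computed t and p-value should land close to the t ≈ -1.65 and p-value ≈ 0.13 reported in that exercise, leading to the same decision not to reject 98.6.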