Download Chapter 3 - mistergallagher

Chapter 13: Asking and Answering Questions About the Difference Between Two Population Means

Created by Kathy Fritz

Let's review notation:

                  Population Mean    Population Standard Deviation
Population 1      μ1                 σ1
Population 2      μ2                 σ2

                              Sample Size    Sample Mean    Sample Standard Deviation
Sample from Population 1      n1             x̄1             s1
Sample from Population 2      n2             x̄2             s2

Testing Hypotheses About the Difference Between Two Population Means Using Independent Samples

Properties of the Sampling Distribution of x̄1 − x̄2
Two-Sample t Test for the Difference in Population Means

Properties of the Sampling Distribution of x̄1 − x̄2

If x̄1 − x̄2 is the difference in sample means for independently selected random samples, then the following rules hold:

Rule 1. $\mu_{\bar{x}_1 - \bar{x}_2} = \mu_1 - \mu_2$
This rule says that the sampling distribution of x̄1 − x̄2 is centered at the actual value of the difference in population means. This means that differences in sample means tend to cluster around the value of the actual difference in population means.

Rule 2. $\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$
This rule specifies the standard error of x̄1 − x̄2. The value of the standard error describes how much the x̄1 − x̄2 values tend to vary from the actual difference in population means.

Rule 3. If both n1 and n2 are large or if the population distributions are approximately normal, then the sampling distribution of x̄1 − x̄2 is approximately normal.
The sample sizes can be considered large if n1 ≥ 30 and n2 ≥ 30.

Two-Sample t Test for a Difference in Population Means

Appropriate when the following conditions are met:
1. The samples are independently selected.
2. Each sample is a random sample from the population of interest, or the samples are selected in a way that results in samples that are representative of the populations.
3. Both sample sizes are large (n1 ≥ 30 and n2 ≥ 30) or the population distributions are approximately normal.
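As a rough illustration of Rules 1 to 3 and of the large-sample condition, the sketch below computes the center and standard error of the sampling distribution of x̄1 − x̄2. The function names and the population values passed in are hypothetical and chosen only for illustration; in practice σ1 and σ2 are rarely known.

```python
import math

def sampling_distribution_of_difference(mu1, mu2, sigma1, sigma2, n1, n2):
    """Return (mean, standard error) of the sampling distribution of x̄1 − x̄2."""
    mean_diff = mu1 - mu2                                    # Rule 1
    std_error = math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)   # Rule 2
    return mean_diff, std_error

def approximately_normal(n1, n2, populations_normal=False):
    """Rule 3: both samples large (n >= 30) or both populations roughly normal."""
    return populations_normal or (n1 >= 30 and n2 >= 30)

# Hypothetical population values, used only to exercise the formulas.
mean_diff, std_error = sampling_distribution_of_difference(
    mu1=100, mu2=95, sigma1=15, sigma2=12, n1=36, n2=64
)
print(mean_diff, round(std_error, 3))
print(approximately_normal(36, 64))
```

Because both hypothetical sample sizes are at least 30, the normal approximation in Rule 3 would be considered reasonable here even without assuming normal populations.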
When these conditions are met, the following test statistic can be used:

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$

where μ1 − μ2 is the hypothesized value of the difference in population means from the null hypothesis (often this will be 0).

When the conditions are met and the null hypothesis is true, the t test statistic has a t distribution with

$$\mathrm{df} = \frac{(V_1 + V_2)^2}{\frac{V_1^2}{n_1 - 1} + \frac{V_2^2}{n_2 - 1}} \quad \text{where } V_1 = \frac{s_1^2}{n_1} \text{ and } V_2 = \frac{s_2^2}{n_2}$$

The computed value of df should be truncated to obtain an integer value.

Form of the null hypothesis: H0: μ1 − μ2 = hypothesized value

When the Alternative Hypothesis Is . . . / The P-value Is . . .

Ha: μ1 − μ2 > hypothesized value
    Area under the t curve to the right of the calculated value of the test statistic
Ha: μ1 − μ2 < hypothesized value
    Area under the t curve to the left of the calculated value of the test statistic
Ha: μ1 − μ2 ≠ hypothesized value
    2·(area to the right of t) if t is positive, or 2·(area to the left of t) if t is negative

Another Way to Write Hypothesis Statements:

When the hypothesized value is 0, we can rewrite these hypothesis statements:

H0: μ1 − μ2 = 0    can be written as    H0: μ1 = μ2
Ha: μ1 − μ2 < 0    can be written as    Ha: μ1 < μ2
Ha: μ1 − μ2 > 0    can be written as    Ha: μ1 > μ2
Ha: μ1 − μ2 ≠ 0    can be written as    Ha: μ1 ≠ μ2

Researchers have studied the ways in which college students who use Facebook differ from college students who do not use Facebook. As part of the study, each person in a sample of 141 college students who use Facebook was asked to report his or her grade point average (GPA). College GPA was also reported by each person in a sample of 68 students who do not use Facebook. Did the data from this study provide convincing evidence that the mean college GPA for students who use Facebook was lower than the mean college GPA for students who do not use Facebook? The two samples were independently selected from a large, public Midwestern university.
Although the samples were not selected at random, they were selected to be representative of the two populations.

Facebook and Grades Continued . . .

Data from these samples were used to compute these summary statistics.

Population                          Sample Size    Sample Mean    Sample Standard Deviation
Students who use Facebook           n1 = 141       x̄1 = 3.06      s1 = 0.95
Students who do not use Facebook    n2 = 68        x̄2 = 3.82      s2 = 0.41

Step 1 (Hypotheses):
Population characteristics of interest:
μ1 = mean GPA for students who use Facebook
μ2 = mean GPA for students who do not use Facebook
Null hypothesis: H0: μ1 − μ2 = 0
Alternative hypothesis: Ha: μ1 − μ2 < 0

Step 2 (Method):
Because the answers to the four key questions are 1) hypothesis testing, 2) sample data, 3) one numerical variable, and 4) two independently selected samples, consider a two-sample t test for a difference in population means.
Significance level: α = 0.05
You should choose a significance level based on a consideration of the consequences of Type I and Type II errors. In this situation, because neither type of error is much more serious than the other, a value for α of 0.05 is a reasonable choice.

Step 3 (Check):
• The large sample size condition is met because the sample sizes are both large: n1 = 141 > 30 and n2 = 68 > 30.
• From the study, you know that the samples were independently selected.
• You also know that the samples were selected to be representative of the two populations of interest.

Step 4 (Calculate):
Test statistic:

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{(3.06 - 3.82) - 0}{\sqrt{\frac{0.95^2}{141} + \frac{0.41^2}{68}}} = -8.08$$

Degrees of freedom:

$$V_1 = \frac{s_1^2}{n_1} = 0.0064 \qquad V_2 = \frac{s_2^2}{n_2} = 0.0025$$

$$\mathrm{df} = \frac{(V_1 + V_2)^2}{\frac{V_1^2}{n_1 - 1} + \frac{V_2^2}{n_2 - 1}} = \frac{(0.0064 + 0.0025)^2}{\frac{(0.0064)^2}{140} + \frac{(0.0025)^2}{67}} = 205.181$$

Truncate df to 205.
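The Step 4 arithmetic can be checked with a few lines of Python. This is a minimal sketch, not part of the original slides: the function name `two_sample_t` is hypothetical, and only the summary statistics from the table above are used. Without intermediate rounding the statistic comes out near −8.07, which agrees with the −8.08 shown above up to rounding, and the truncated df is 205.

```python
import math

def two_sample_t(xbar1, s1, n1, xbar2, s2, n2, hypothesized_diff=0.0):
    """Two-sample t statistic and truncated df, computed from summary statistics."""
    v1 = s1**2 / n1
    v2 = s2**2 / n2
    t = ((xbar1 - xbar2) - hypothesized_diff) / math.sqrt(v1 + v2)
    df = (v1 + v2)**2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, math.floor(df)  # df is truncated to an integer

# Facebook users (sample 1) versus non-users (sample 2)
t_stat, df = two_sample_t(3.06, 0.95, 141, 3.82, 0.41, 68)
print(round(t_stat, 2), df)
```

A statistic this far below 0 leaves essentially no area to its left under a t curve with 205 degrees of freedom.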
Step 4 (Calculate) Continued:
Associated P-value:
P-value = area under the t curve to the left of −8.08 = P(t < −8.08) ≈ 0

Step 5 (Communicate Results):
Decision: P-value ≈ 0 < 0.05, so reject H0.
Conclusion: Based on the sample data, there is convincing evidence that the mean college GPA for students at the university who use Facebook is lower than the mean college GPA for students at the university who do not use Facebook.

More on Degrees of Freedom

The degrees of freedom formula for the two-sample t test involves quite a bit of arithmetic. An alternative approach is to compute a conservative estimate of the P-value, one that is close to but larger than the actual P-value.

A conservative estimate of the P-value for the two-sample t test can be found by using the t curve with degrees of freedom equal to the smaller of (n1 − 1) and (n2 − 1).

If the null hypothesis is rejected using this conservative estimate, then it will also be rejected if the actual P-value is used.

The Pooled t Test

If it is known that the variances of the two populations are equal (σ1² = σ2²), an alternative procedure known as the pooled t test can be used.

This test procedure combines information from both samples to obtain a "pooled" estimate of the common variance, and then uses this pooled estimate of the variance in place of s1² and s2² in the test statistic.

This test procedure was widely used in the past, but it has fallen into some disfavor because it is quite sensitive to departures from the assumption of equal population variances.

Testing Hypotheses About the Difference Between Two Population Means Using Paired Samples

Ultrasound is often used in the treatment of soft tissue injuries. In a study to investigate the effect of ultrasound therapy on knee extension, range of motion was measured for people in a representative sample of physical therapy patients both before and after ultrasound therapy. Is there evidence that the ultrasound therapy increases range of motion?
Let μ1 denote the mean range of motion for the population of all physical therapy patients prior to ultrasound and μ2 denote the mean range of motion after ultrasound.

H0: μ1 − μ2 = 0 versus Ha: μ1 − μ2 < 0

If the mean range of motion after ultrasound is greater than the mean range of motion before ultrasound, then the difference will be negative.

Range of Motion Problem Continued . . .

Suppose we look at a sample of seven patients. The range of motion before ultrasound therapy and after ultrasound are plotted below.

[Figure: range of motion before and after ultrasound for the seven patients]

If you were to (incorrectly) use the two-sample t test for independent samples with the given data, the resulting test statistic would be t = −0.61. This would lead to a decision not to reject the null hypothesis.

Why is this NOT a correct decision? Both the before ultrasound and after ultrasound range of motion measurements vary from patient to patient. It is this variability that may obscure any difference.

Now let's look at the sample as paired data.

[Figure: the same plot with the pairs identified]

However, the plot in which pairs are identified (above) does suggest a difference. Notice that for six of the seven pairs the after ultrasound observation is greater than the before ultrasound observation. This suggests that the methods for independent samples are not adequate for dealing with paired data.

When observations are paired in some meaningful way, inferences are based on the differences between the two observations within a pair.

A Look at Hypotheses

To compare two population or treatment means when the samples are paired, first translate the hypotheses of interest about the value of μ1 − μ2 into equivalent hypotheses involving μd, the mean of the difference population.
Hypothesis                              Equivalent Hypothesis When Samples Are Paired
H0: μ1 − μ2 = hypothesized value        H0: μd = hypothesized value
Ha: μ1 − μ2 > hypothesized value        Ha: μd > hypothesized value
Ha: μ1 − μ2 < hypothesized value        Ha: μd < hypothesized value
Ha: μ1 − μ2 ≠ hypothesized value        Ha: μd ≠ hypothesized value

The Paired t Test

Appropriate when the following conditions are met:
1. The samples are paired.
2. The n sample differences can be viewed as a random sample from a population of differences (or it is reasonable to regard the sample of differences as representative of the population of differences).
3. The number of sample differences is large (n ≥ 30) or the population distribution of differences is approximately normal.

Summary of the Paired t Test for Comparing Two Population Means Continued

When these conditions are met, the following test statistic can be used:

$$t = \frac{\bar{x}_d - \mu_0}{\frac{s_d}{\sqrt{n}}}$$

where μ0 is the hypothesized value of the population mean difference from the null hypothesis, n is the number of sample differences, and x̄d and sd are the mean and standard deviation of the sample differences.

Form of the null hypothesis: H0: μd = μ0

When the conditions are met and the null hypothesis is true, this t test statistic has a t distribution with df = n − 1.

When the Alternative Hypothesis Is . . . / The P-value Is . . .

Ha: μd > μ0
    Area under the t curve to the right of the calculated value of the test statistic
Ha: μd < μ0
    Area under the t curve to the left of the calculated value of the test statistic
Ha: μd ≠ μ0
    2·(area to the right of t) if t is positive, or 2·(area to the left of t) if t is negative

Is this an example of paired samples?

An engineering association wants to see if there is a difference in the mean annual salary for electrical engineers and chemical engineers. A random sample of electrical engineers is surveyed about their annual income.
Another random sample of chemical engineers is surveyed about their annual income.

No. There is no pairing of individuals; you have two independent samples.

Is this an example of paired samples?

A pharmaceutical company wants to test its new weight-loss drug. Before giving the drug to volunteers, company researchers weigh each person. After a month of using the drug, each person's weight is measured again.

Yes. You have two observations on each individual, resulting in paired data.

In a study to investigate the effect of ultrasound therapy on knee extension, range of motion was measured for people in a representative sample of physical therapy patients both before and after ultrasound therapy.

Because the samples are paired, the first thing to do is to compute the sample differences.

Range of Motion
Patient               1    2    3    4    5    6    7
Before Ultrasound    31   53   45   57   50   43   32
After Ultrasound     32   59   46   64   49   45   40
Differences          −1   −6   −1   −7    1   −2   −8

Is there evidence that the ultrasound therapy increases range of motion?

The mean and standard deviation computed from these sample differences are x̄d = −3.43 and sd = 3.51.

Range of Motion Continued . . .

Step 1 (Hypotheses):
The population characteristics of interest are
μ1 = mean range of motion for physical therapy patients before ultrasound
μ2 = mean range of motion for physical therapy patients after ultrasound
Because the samples are paired, you should also define μd:
μd = μ1 − μ2 = mean difference in range of motion (before − after)
Translating the question of interest into hypotheses gives:
H0: μd = 0 versus Ha: μd < 0

Step 2 (Method):
Because the answers to the four key questions are 1) hypothesis testing, 2) sample data, 3) one numerical variable, and 4) two paired samples, consider the paired t test as a potential method. When the null hypothesis is true, the test statistic will have a t distribution with df = 7 − 1 = 6.
Significance level: α = 0.05
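The differences and their summary statistics can be reproduced with Python's standard library. The sketch below is illustrative only and its variable names are not from the slides. It also contrasts the paired t statistic with the two-sample statistic that would result from (incorrectly) treating the before and after measurements as independent samples.

```python
import math
import statistics

before = [31, 53, 45, 57, 50, 43, 32]
after = [32, 59, 46, 64, 49, 45, 40]

# Paired analysis: work with the before − after differences.
differences = [b - a for b, a in zip(before, after)]
n = len(differences)
mean_d = statistics.mean(differences)
sd_d = statistics.stdev(differences)  # sample standard deviation (divides by n − 1)
t_paired = (mean_d - 0) / (sd_d / math.sqrt(n))

# Incorrect analysis: treat the two columns as independent samples.
se_independent = math.sqrt(
    statistics.variance(before) / len(before) + statistics.variance(after) / len(after)
)
t_independent = (statistics.mean(before) - statistics.mean(after)) / se_independent

print(differences)
print(round(mean_d, 2), round(sd_d, 2))
print(round(t_paired, 2), round(t_independent, 2))
```

Both statistics share the same numerator, the difference in sample means. The paired version divides by a much smaller standard error because patient-to-patient variability cancels within each pair.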
Step 3 (Check):
• The sample was representative of physical therapy patients.
• Because the sample size is small, the distribution of range of motion differences should be approximately normal. The following boxplot of the seven differences is not too asymmetric and there are no outliers, so it is reasonable to think that the population differences could be approximately normal.

[Figure: boxplot of the seven differences]

Range of Motion Continued . . .

Step 4 (Calculate):
Because all conditions are met, it is appropriate to use the paired-samples t test.
Test statistic:

$$t = \frac{-3.43 - 0}{\frac{3.51}{\sqrt{7}}} = -2.59$$

P-value:
P-value = area to the left of −2.6 = area to the right of 2.6 = 0.02

Step 5 (Communicate Results):
Because the P-value (0.02) is less than α (0.05), you reject H0. There is convincing evidence that the mean knee range of motion for physical therapy patients before ultrasound is less than the mean range of motion after ultrasound.

Estimating the Difference Between Two Population Means

The Two-Sample t Confidence Interval for the Difference Between Two Population Means

Appropriate when the following conditions are met:
1. The samples are independently selected.
2. Each sample is a random sample from the population of interest, or the samples are selected in a way that results in samples that are representative of the populations.
3. Both sample sizes are large (n1 ≥ 30 and n2 ≥ 30) or the population distributions are approximately normal.

When these conditions are met, a confidence interval for a difference in population means is

$$(\bar{x}_1 - \bar{x}_2) \pm (t \text{ critical value})\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

The t critical value is based on

$$\mathrm{df} = \frac{(V_1 + V_2)^2}{\frac{V_1^2}{n_1 - 1} + \frac{V_2^2}{n_2 - 1}} \quad \text{where } V_1 = \frac{s_1^2}{n_1} \text{ and } V_2 = \frac{s_2^2}{n_2}$$

The computed value of df should be truncated to obtain an integer value.

The desired confidence level determines which t critical value is used.
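The interval formula above translates directly into code. In this sketch the function name `two_sample_t_interval` is hypothetical, and the t critical value is passed in as an argument because it comes from a table or from technology. The inputs are the texting summary statistics and the 90% critical value of 1.645 used in the example that follows in this chapter.

```python
import math

def two_sample_t_interval(xbar1, s1, n1, xbar2, s2, n2, t_critical):
    """Confidence interval for mu1 − mu2 from summary statistics.

    t_critical must be looked up separately (table or software) for the
    desired confidence level and the truncated df.
    """
    margin = t_critical * math.sqrt(s1**2 / n1 + s2**2 / n2)
    diff = xbar1 - xbar2
    return diff - margin, diff + margin

# Women (sample 1) versus men (sample 2), 90% confidence
lower, upper = two_sample_t_interval(716, 90, 1200, 555, 75, 1000, t_critical=1.645)
print(round(lower, 3), round(upper, 3))
```

Keeping the critical value as a parameter keeps the sketch dependency-free; a statistics library could supply it instead.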
Interpretation of Confidence Interval
You can be confident that the actual value of the difference in population means is included in the computed interval. This statement should be worded in context.

Interpretation of Confidence Level
The confidence level specifies the long-run proportion of the time that this method is successful in capturing the actual difference in population means.

In 2010, the Nielsen Company released a report that stated "Women Talk and Text More than Men Do" (State of the Media 2010: U.S. Audiences & Devices, The Nielsen Company). This statement was based on data collected by examining random samples of telephone bills selected from two populations: female cell phone users and male cell phone users.

The report indicated that the mean number of text messages sent per month by women was 716, while the mean number of text messages sent per month by men was 555. The report also indicated the sample sizes were large, although it did not give the actual sample sizes.

For purposes of this example, suppose that the summary statistics are as shown in the table.

Sample    Sample Size    Sample Mean    Sample Standard Deviation
Women     n1 = 1200      x̄1 = 716       s1 = 90
Men       n2 = 1000      x̄2 = 555       s2 = 75

Step 1 (Estimate):
The population characteristics of interest are:
μ1 = mean number of text messages sent by female cell phone users
μ2 = mean number of text messages sent by male cell phone users
μ1 − μ2 = difference in mean number of text messages sent

Step 2 (Method):
Because the answers to the four key questions are 1) estimation, 2) sample data, 3) one numerical variable, and 4) two independently selected samples, consider constructing a two-sample t confidence interval. For this example, a confidence level of 90% was selected.

Step 3 (Check):
• Both samples are large.
• The samples were randomly selected from the two populations of interest.
• The samples are independent.
Step 4 (Calculate):
Degrees of freedom:

$$V_1 = \frac{s_1^2}{n_1} = 6.750 \qquad V_2 = \frac{s_2^2}{n_2} = 5.625$$

$$\mathrm{df} = \frac{(6.750 + 5.625)^2}{\frac{(6.750)^2}{1199} + \frac{(5.625)^2}{999}} \approx 2197.99998$$

which truncates to 2197.

From Table 3 or technology: t critical value = 1.645

$$(\bar{x}_1 - \bar{x}_2) \pm (t \text{ critical value})\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = (716 - 555) \pm (1.645)\sqrt{\frac{90^2}{1200} + \frac{75^2}{1000}} = (155.213, 166.787)$$

Step 5 (Communicate Results):
Confidence Interval: You can be 90% confident that the actual difference in mean number of text messages sent is between 155.213 and 166.787. Both endpoints of this interval are positive, so you estimate that the mean number of text messages sent by female cell phone users is greater than the mean number of text messages sent by male cell phone users by somewhere between 155.213 and 166.787.
Confidence Level: The method used to construct this interval estimate is successful in capturing the actual difference in population means about 90% of the time.

The Paired-Samples t Confidence Interval for a Difference in Population Means

Appropriate when the following conditions are met:
1. The samples are paired.
2. The n sample differences can be viewed as a random sample from a population of differences (or it is reasonable to regard the sample of differences as representative of the population of differences).
3. The number of sample differences is large (n ≥ 30) or the population distribution of differences is approximately normal.

When these conditions are met, a confidence interval for the difference in population means is

$$\bar{x}_d \pm (t \text{ critical value})\frac{s_d}{\sqrt{n}}$$

where n is the number of sample differences, and x̄d and sd are the mean and standard deviation of the sample differences. The t critical value is based on df = n − 1.

Interpretation of Confidence Interval
You can be confident that the actual value of the difference in population means is included in the computed interval. This statement should be worded in context.
Interpretation of Confidence Level
The confidence level specifies the long-run proportion of the time that this method is successful in capturing the actual difference in population means.

Benefits of Ultrasound Revisited . . .

Range of Motion
Patient               1    2    3    4    5    6    7
Before Ultrasound    31   53   45   57   50   43   32
After Ultrasound     32   59   46   64   49   45   40
Differences          −1   −6   −1   −7    1   −2   −8

The mean and standard deviation computed from these sample differences are x̄d = −3.43 and sd = 3.51.

Step 1 (Estimate):
You want to estimate
μd = μ1 − μ2 = mean difference in knee range of motion
where
μ1 = mean knee range of motion before ultrasound
and
μ2 = mean knee range of motion after ultrasound

Range of Motion Continued . . .

Step 2 (Method):
Because the answers to the four key questions are 1) estimation, 2) sample data, 3) one numerical variable, and 4) two paired samples, consider the paired t confidence interval as a potential method.

Step 3 (Check):
• The sample was representative of physical therapy patients.
• Because the sample size is small, the distribution of range of motion differences should be approximately normal. The following boxplot of the seven differences is not too asymmetric and there are no outliers, so it is reasonable to think that the population differences could be approximately normal.

[Figure: boxplot of the seven differences]

Step 4 (Calculate):
The t critical value for df = 6 and 95% confidence level is 2.45.

$$\bar{x}_d \pm (t \text{ critical value})\frac{s_d}{\sqrt{n}} = -3.43 \pm (2.45)\frac{3.51}{\sqrt{7}} = (-6.68, -0.18)$$

Carrying more decimal places in x̄d, sd, and the t critical value (as technology does) gives (−6.670, −0.187), the interval reported in Step 5.

Step 5 (Communicate Results):
The method used to construct this interval is successful in capturing the actual difference in population means about 95% of the time.
Based on these samples, you can be 95% confident that the actual difference in mean range of motion is somewhere between −6.670 degrees and −0.187 degrees.
Because both endpoints are negative, you would estimate that the mean knee range of motion after ultrasound is greater than the mean range of motion before ultrasound by somewhere between 0.187 and 6.670 degrees.

Avoid These Common Mistakes

1. Remember that the results of a hypothesis test can never show strong support for the null hypothesis. In two-sample situations, this means that you shouldn't be convinced that there is no difference between two population means based on the outcome of a hypothesis test.

2. If you have complete information (a census) for both populations, there is no need to carry out a hypothesis test or to construct a confidence interval; in fact, it would be inappropriate to do so.

3. Don't confuse statistical significance with practical significance. In the two-sample setting, it is possible to be convinced that two population means are not equal even in situations where the actual difference between them is small enough that it is of no practical use. After rejecting a null hypothesis of no difference, it is useful to look at a confidence interval estimate of the difference to get a sense of practical significance.

4. Correctly interpreting confidence intervals in the two-sample case is more difficult than in the one-sample case, so take particular care when providing two-sample confidence interval interpretations. Because the two-sample confidence interval estimates a difference (μ1 − μ2), the most important thing to note is whether or not the interval includes 0.
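The check described in item 4 can be written as a small helper. This is only a sketch: the function name and the wording of its return values are hypothetical, and the two intervals passed in are the ones computed earlier in this chapter for the texting and ultrasound examples.

```python
def describe_difference_interval(lower, upper):
    """Classify a confidence interval for mu1 − mu2 by its position relative to 0."""
    if lower > 0:
        return "entirely positive: estimate that mu1 is greater than mu2"
    if upper < 0:
        return "entirely negative: estimate that mu1 is less than mu2"
    return "includes 0: a difference of 0 is plausible"

texting = describe_difference_interval(155.213, 166.787)
ultrasound = describe_difference_interval(-6.670, -0.187)
print(texting)
print(ultrasound)
```

An interval that includes 0 does not show that the means are equal (see item 1); it only means the data are consistent with no difference.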
