Statistics 215 Lab Materials

Confidence intervals for difference of means of two independent populations, µ1 − µ2

Previously, we focused on a single population and parameters calculated from that population. Often we want to compare two populations. In this section, we will compare the means of two populations. Specifically, we will consider the difference between the means of the two populations.

The comparison we will make here is between independent populations. Independent populations are two groups that are distinct and not related. For example, we might be interested in the iron levels of the blood in two different species of baboons. We take a sample from each population and compare the means of the two samples.

Another common occurrence is for the two populations to be similar but for each population to receive a different treatment. Comparison is then made on a measurement related to the treatments given. For example, one fourth-grade class at Springfield Elementary might be shown a DVD about volcanoes, while the second fourth-grade class at Springfield Elementary reads an article about volcanoes. The two groups would be given the same test about volcanoes. We could then compare the two groups to see if there is a difference in their mean scores.

We are often interested in whether or not there is a difference between the means of the two populations. Remember that, because of sampling variability, a difference in the sample means may not mean that the population means are different. To account for this variability, we use a confidence interval. As described below, we can create a confidence interval for the difference of the means of the two populations. If the two populations have the same mean, then the difference of the means is 0 (zero). For example, call the mean of the first population µ1 and the mean of the second population µ2. If µ1 = µ2, then µ1 − µ2 = 0.
Consequently, when we consider the confidence intervals, we are interested in whether or not 0 (zero) is inside the confidence interval. If zero is inside the confidence interval, then the data are consistent with no difference in the means of the two populations, and we cannot conclude that the means differ.

Confidence interval for the difference of independent population means, µ1 − µ2 (Small Samples)

With two independent populations, we have two different samples from two different populations. We need special notation to distinguish the two populations. From the first population, we will have a sample of size n1. The average of those n1 observations will be X̄1 and the standard deviation will be s1. For the first population, we will refer to the population mean as µ1 and the population standard deviation as σ1. From the second population, we will have a sample of size n2. The average of those n2 observations will be X̄2 and the standard deviation will be s2. For the second population, we will refer to the population mean as µ2 and the population standard deviation as σ2.

The following (1 − α) × 100% CI for the difference of independent means can be used when
1. n1 and n2 are both less than 30 and σ1 = σ2
OR
2. n1 and n2 are both more than 2, σ1 = σ2, and the two populations possess Normal distributions.

$$(\bar{X}_1 - \bar{X}_2) \pm t_{(n_1+n_2-2,\ 1-\alpha/2)} \cdot s_p \cdot \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$

where

$$s_p = \sqrt{\frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}}$$

sp represents an 'average' of the standard deviations from the two samples (called the pooled standard deviation). It is necessary to calculate sp before you can complete the calculation of the confidence interval.

Note that

$$s_{\bar{X}_1-\bar{X}_2} = s_p \cdot \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$

Example: Suppose we want to construct a 90% CI for the difference of independent population means. X̄1 = 49.37, s1 = 4.89, n1 = 25. X̄2 = 52.13, s2 = 5.38, n2 = 16.
$$s_p = \sqrt{\frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}} = \sqrt{\frac{(25-1)\,4.89^2 + (16-1)\,5.38^2}{25+16-2}} = 5.084$$

Then,

$$\begin{aligned}
(\bar{X}_1 - \bar{X}_2) \pm t_{(n_1+n_2-2,\ 1-\alpha/2)} \cdot s_p \cdot \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}
&= (49.37 - 52.13) \pm t_{(25+16-2,\ 0.95)} \cdot 5.084 \cdot \sqrt{\frac{1}{25} + \frac{1}{16}} \\
&= -2.76 \pm 1.645 \cdot 5.084 \cdot \sqrt{\frac{1}{25} + \frac{1}{16}} \\
&= -2.76 \pm 2.678 \\
&= (-5.438,\ -0.082)
\end{aligned}$$

So we are 90% confident that the difference µ1 − µ2 is between −5.438 and −0.082. Note that 1.645 is the critical value for very large degrees of freedom (the standard Normal value); the t quantile for 39 degrees of freedom is approximately 1.685, which would widen the interval slightly to (−5.503, −0.017).

Confidence interval for the difference of independent population means, µ1 − µ2 (Large Samples)

When we have two large samples (each sample has at least 30 observations), we can use the following formula:

$$(\bar{X}_1 - \bar{X}_2) \pm z_{(1-\alpha/2)} \cdot \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

Example: The lifetimes of calculator batteries are being investigated by Consumer Digest. They find that the average lifetime of 45 Everset batteries is 125.245 hours and the standard deviation is 34.890 hours. For JordoVac, the average lifetime of 50 batteries is 120.051 hours and the standard deviation is 42.801 hours. Assuming that both sets of data are approximately Normal, create a 95% confidence interval for the difference in mean lifetimes for these two batteries.

We have two distinct sets of batteries. Each battery in one population is unrelated to any battery in the other population, so they are independent populations. For the samples that we have (call the Everset batteries population 1 and the JordoVac batteries population 2), both n1 and n2 are more than 30. Similarly, their standard deviations are approximately equal; they are close enough that we can claim that σ1 = σ2. Consequently, we can use the formula above to make our confidence interval.
$$\begin{aligned}
(\bar{X}_1 - \bar{X}_2) \pm z_{(1-\alpha/2)} \cdot \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}
&= (125.245 - 120.051) \pm z_{(0.975)} \cdot \sqrt{\frac{34.890^2}{45} + \frac{42.801^2}{50}} \\
&= 5.194 \pm 1.96 \cdot \sqrt{63.690} \\
&= 5.194 \pm 15.642 \\
&= (-10.448,\ 20.836)
\end{aligned}$$

We are 95% confident that the difference in mean battery lifetimes between Everset and JordoVac is between −10.448 and 20.836 hours. Note that the difference in the sample means was 5.194; however, since zero is inside the confidence interval, 0 is a plausible value for the difference between the population means. This is because of the variability present from sample to sample. Because of this variability, we cannot conclude, at the 95% confidence level, that the means of these two populations differ.

Some notes on confidence intervals:

1. This chapter on confidence intervals is the first that develops ideas that are statistical. For most people it is a new way of thinking. It implies that a point estimate of a parameter is not the parameter. This forces us to acknowledge the variability from sample to sample. And we must recognize that there is sampling variability in any estimate, which includes almost every statistic reported in the media.

2. The reason for using a CI is the variability that comes from one sample to the next. Each sample is different, and each sample gives us a different value for a statistic. The width of a confidence interval gives an indication of how much variability there is in the sample it was derived from. Another way to think of this is that the smaller the variability in the sample, the more information we have about the location of the mean.

3. There are three factors that influence the size or width of a confidence interval.
• n: as n increases, the width of the CI decreases.
• Confidence level (1 − α): as the confidence level increases, the width of the CI increases.
• s: the bigger s is, the wider the CI is.

4. As mentioned in the previous note, the width of a confidence interval is affected by the number of observations in the sample (or samples). It is possible to determine the minimum sample size required to estimate a population parameter with a specified precision at a given confidence level.

5. The consequence of not having the assumptions met for a particular confidence interval is that the confidence level is likely incorrect. In almost all cases this means that the true confidence level is lower than the stated one. That is, if we make a 95% confidence interval but not all the assumptions are met for this interval, then the true confidence level will be less (often much less) than 95%.
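
The two worked examples above can be checked with a short Python sketch. This is only an illustration, not part of the lab materials: the function names are hypothetical, and it assumes Python 3.8 or later for `statistics.NormalDist`. Because the standard library has no t quantile function, the t critical value is passed in as an argument, as if read from a table; the call below uses 1.645, the value from the small-sample example.

```python
from math import sqrt
from statistics import NormalDist


def pooled_sd(s1, n1, s2, n2):
    """Pooled standard deviation sp from two sample standard deviations."""
    return sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))


def pooled_t_interval(xbar1, s1, n1, xbar2, s2, n2, t_crit):
    """Small-sample CI for mu1 - mu2.

    t_crit is the t critical value for n1 + n2 - 2 degrees of freedom,
    supplied by the caller (for example, from a t table).
    """
    sp = pooled_sd(s1, n1, s2, n2)
    margin = t_crit * sp * sqrt(1 / n1 + 1 / n2)
    diff = xbar1 - xbar2
    return diff - margin, diff + margin


def large_sample_interval(xbar1, s1, n1, xbar2, s2, n2, conf=0.95):
    """Large-sample CI for mu1 - mu2 using the standard Normal quantile."""
    alpha = 1 - conf
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 when conf = 0.95
    margin = z * sqrt(s1**2 / n1 + s2**2 / n2)
    diff = xbar1 - xbar2
    return diff - margin, diff + margin


def contains_zero(interval):
    """True when 0 lies inside the interval (endpoints included)."""
    low, high = interval
    return low <= 0 <= high


# Small-sample example: 90% CI with the critical value used in the text.
# (A t table for 39 degrees of freedom gives about 1.685 instead.)
small = pooled_t_interval(49.37, 4.89, 25, 52.13, 5.38, 16, t_crit=1.645)

# Large-sample example: 95% CI for the battery lifetimes.
large = large_sample_interval(125.245, 34.890, 45, 120.051, 42.801, 50)

print([round(v, 3) for v in small], contains_zero(small))
print([round(v, 3) for v in large], contains_zero(large))
```

Passing a different `t_crit` or `conf` shows directly how the confidence level changes the width of the interval, and changing the sample sizes shows the effect of n described in note 3.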
