Statistics 215 Lab Materials

Confidence Intervals on the Mean of Paired Differences, \(\mu_d\)

We need some notation before we introduce the formula for this confidence interval. \(\mu_d\) is the mean of the differences between matched (or paired) elements of the first and second populations. We will estimate \(\mu_d\) with \(\bar{d}\), the average of the paired differences in the two samples. Since each observation from the first sample is paired with one observation from the second sample, we only need a single number for the sample size. We will use \(n\). The standard deviation of these \(n\) differences will be \(s_d\).

Confidence interval for the mean of paired differences, \(\mu_d\) (Small Sample)

The following \((1-\alpha) \times 100\%\) CI for the mean of paired differences can be used when \(n\) (the number of paired observations) is more than 2 and the original data are approximately Normal (which implies that the differences will also be Normally distributed):

\[
\bar{d} \pm t(n-1,\, 1-\alpha/2) \cdot \frac{s_d}{\sqrt{n}}
\]

where \(\bar{d}\) is the mean of the differences of the paired observations and \(s_d\) is the standard deviation of the differences of the paired observations. Since the observations are paired, we treat the differences as a single population. Note that the standard error of \(\bar{d}\) is \(s_{\bar{d}} = s_d / \sqrt{n}\).

Example: Representatives of Vanguard Airlines want to claim that their rates are lower than those of their competitor, Southwest Airlines. They sample 26 fares for routes that both airlines fly. The values for these routes are given in the table below. Vanguard fares on those routes have an average cost of 121.89 with a standard deviation of 38.58. Southwest fares have a mean cost of 126.01 and a standard deviation of 36.35 for those same routes. For the differences (Vanguard minus Southwest), the mean is -4.12 and the standard deviation is 5.29. Assuming that the data come from a Normal distribution, create a 95% confidence interval for the difference in the means between Vanguard and Southwest.
Since \(n = 26 > 2\) and we have assumed a Normal distribution for the data, we can use the formula given above. The data are paired since we are looking at the same routes for two different airlines. First note that \(\bar{d} = -4.12\) and \(s_d = 5.29\). These numbers were calculated on the values in the "diff" column of the table below. This example provides additional information about the standard deviations of the values for Vanguard and Southwest that is irrelevant to the confidence interval we are making.

Route                          Vanguard fare   Southwest fare     diff
Boston to Detroit                     138.79           139.01    -0.22
Boston to Memphis                     109.99           110.76    -0.77
Boston to Charlotte                   139.64           142.51    -2.86
Charlotte to Memphis                   76.83            81.78    -4.95
Charlotte to Dallas                    76.09            83.98    -7.89
Charlotte to Salt Lake City           184.62           186.08    -1.46
Charlotte to Denver                   114.90           122.91    -8.01
Dallas to Boston                       64.57            78.39   -13.82
Dallas to Los Angeles                 171.12           177.80    -6.68
Dallas to Denver                      186.17           180.75     5.42
Dallas to Chicago                     124.87           127.94    -3.08
Dallas to Des Moines                  100.42            96.54     3.87
Denver to Boston                      140.48           145.23    -4.75
Denver to Detroit                     144.70           146.40    -1.70
Denver to Salt Lake City              114.81           114.41     0.40
Des Moines to Seattle                  95.57           102.44    -6.87
Memphis to New Orleans                150.37           148.66     1.71
Omaha to Albuquerque                  139.06           153.58   -14.52
Omaha to Seattle                       40.26            49.73    -9.47
Philadelphia to Boston                193.23           198.27    -5.04
Philadelphia to Denver                 77.56            89.79   -12.23
Pittsburgh to Omaha                   139.78           139.87    -0.09
Seattle to Los Angeles                 90.40           100.93   -10.53
Seattle to Portland                   105.77           109.90    -4.13
Tulsa to Dallas                       120.82           122.49    -1.67
Winston-Salem to Omaha                128.44           126.11     2.34

\[
\begin{aligned}
\bar{d} \pm t(n-1,\, 1-\alpha/2) \cdot \frac{s_d}{\sqrt{n}}
&= -4.12 \pm t(25,\, 0.975) \cdot \frac{5.29}{\sqrt{26}} \\
&= -4.12 \pm 2.060 \cdot \frac{5.29}{\sqrt{26}} \\
&= -4.12 \pm 2.137 \\
&= (-6.257,\ -1.983)
\end{aligned}
\]

We are 95% confident that the mean difference between Vanguard and Southwest fares (Vanguard minus Southwest) on identical routes is between -$6.257 and -$1.983.
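The calculation above can be checked directly from the "diff" column. The following is a minimal Python sketch and is not part of the original lab. The function name paired_t_interval is hypothetical, and the critical value 2.060 is typed in from the t table exactly as in the worked solution, because the Python standard library has no t quantile function. Since the code uses the unrounded mean and standard deviation, its endpoints may differ slightly from the hand calculation.

```python
import math
import statistics

# "diff" column of the fare table (Vanguard minus Southwest), one value per route.
diffs = [
    -0.22, -0.77, -2.86, -4.95, -7.89, -1.46, -8.01, -13.82, -6.68,
    5.42, -3.08, 3.87, -4.75, -1.70, 0.40, -6.87, 1.71, -14.52,
    -9.47, -5.04, -12.23, -0.09, -10.53, -4.13, -1.67, 2.34,
]


def paired_t_interval(differences, t_crit):
    """Return (d_bar, s_d, lower, upper) for d_bar +/- t_crit * s_d / sqrt(n).

    t_crit is assumed to be t(n-1, 1-alpha/2), looked up in a t table by the caller.
    """
    n = len(differences)
    d_bar = statistics.mean(differences)   # mean of the paired differences
    s_d = statistics.stdev(differences)    # sample standard deviation (n-1 divisor)
    margin = t_crit * s_d / math.sqrt(n)   # critical value times the standard error
    return d_bar, s_d, d_bar - margin, d_bar + margin


T_CRIT = 2.060  # t(25, 0.975), taken from the worked solution (assumed table value)
d_bar, s_d, lower, upper = paired_t_interval(diffs, T_CRIT)
print(f"n = {len(diffs)}, d_bar = {d_bar:.2f}, s_d = {s_d:.2f}")
print(f"95% CI: ({lower:.3f}, {upper:.3f})")
```

Both endpoints come out negative, which is the feature of the interval that the interpretation relies on.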
Since zero is not inside this interval, we can claim with 95% confidence that there is a difference between the mean cost of flights on identical routes of these two airlines. Moreover, because the entire interval lies below zero, Vanguard can claim with 95% confidence that its rates are lower than Southwest's.

Confidence interval for the mean of paired differences, \(\mu_d\) (Large Sample)

When \(n\) is large (at least 30 paired differences in the sample), we can use the following formula to compute a confidence interval on \(\mu_d\):

\[
\bar{d} \pm z(1-\alpha/2) \cdot \frac{s_d}{\sqrt{n}}
\]

Example: Suppose \(\bar{d} = 10.45\), \(s_d = 4.19\), and \(n = 56\). We want to create a 95% CI for the mean of the paired differences, so \(\alpha = 0.05\) and \(1-\alpha/2 = 0.975\).

\[
\begin{aligned}
\bar{d} \pm z(1-\alpha/2) \cdot \frac{s_d}{\sqrt{n}}
&= 10.45 \pm z(0.975) \cdot \frac{4.19}{\sqrt{56}} \\
&= 10.45 \pm 1.96 \cdot \frac{4.19}{\sqrt{56}} \\
&= 10.45 \pm 1.097 \\
&= (9.353,\ 11.547)
\end{aligned}
\]

Hence, we are 95% confident that the mean of the paired differences between the two populations is between 9.353 and 11.547.
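The large-sample interval can also be computed from the summary statistics alone. The sketch below is an illustration rather than part of the original lab. The function name paired_z_interval and its parameters are hypothetical, and it obtains z(1-α/2) from NormalDist in Python's standard statistics module instead of a printed table.

```python
import math
from statistics import NormalDist


def paired_z_interval(d_bar, s_d, n, confidence=0.95):
    """Return (lower, upper) for d_bar +/- z(1 - alpha/2) * s_d / sqrt(n).

    Intended for the large-sample case (the handout uses n of at least 30).
    """
    alpha = 1.0 - confidence
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # standard Normal quantile
    margin = z_crit * s_d / math.sqrt(n)              # critical value times the standard error
    return d_bar - margin, d_bar + margin


# Summary statistics from the large-sample example.
z_lower, z_upper = paired_z_interval(d_bar=10.45, s_d=4.19, n=56, confidence=0.95)
print(f"95% CI: ({z_lower:.3f}, {z_upper:.3f})")
```

Passing a different value for confidence changes only the critical value; the standard error \(s_d / \sqrt{n}\) stays the same.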