Lab#__________ Section________ Name___________________ Instructor________________

Homework #7

Most of these are problems from your textbook. Remember to get the data for the problems that need it in one of three ways: in the lab (S:data files/IPS4e/SPSS etc.); off the web (http://bcs.whfreeman.com/ips4e/, click on Datasets and choose Excel format, since SPSS can read it; note that you must have a zip program, like ZipCentral or Winzip, to unzip the file); or off the CD in the back of your book (D:PCDataSets/SPSS/).

Section 7.1

**1. Ex. 7.2 & 7.4
In SPSS, open the data file Ch07/EX07_002.por. Select Analyze > Compare Means > One-Sample T Test > Options > Confidence Intervals 95% > Continue, move [rent] into Test Variable(s), change Test Value to "500", and click "OK".

One-Sample Statistics
  N     Mean        Std. Deviation    Std. Error Mean
  10    531.0000    82.7916           26.1810

[Figure: Normal Q-Q Plot. Horizontal axis: Observed Value; vertical axis: Expected Normal Value.]

95% CI for \(\mu\): \(\bar{x} \pm t_{9,.025}\, s/\sqrt{n}\), which is
\[ (531 - 2.262 \times 26.1810,\ 531 + 2.262 \times 26.1810) = (471.7745,\ 590.2255) \]

One-Sample Test (Test Value = 0)
  t         df    Sig. (2-tailed)    Mean Difference    95% Confidence Interval of the Difference
                                                        Lower       Upper
  20.282    9     .000               531.0000           471.7745    590.2255

7.4 For the data, \(\bar{x} = 531\) and \(s = 82.792\). \(H_0: \mu = \$500\) vs. \(H_a: \mu > \$500\). The t statistic is 1.184 with 9 df. From software, the p-value is 0.1335, which is half of .267, the p-value SPSS reports for a two-sided test. From Table D, the t statistic is between the 0.15 and 0.10 columns. This is not strong evidence that the mean is over $500.

One-Sample Test (Test Value = 500)
  t        df    Sig. (2-tailed)    Mean Difference    95% Confidence Interval of the Difference
                                                       Lower       Upper
  1.184    9     .267               31.0000            -28.2255    90.2255

\[ t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{531 - 500}{82.792/\sqrt{10}} = 1.184 \]

NOTE: $500 is in the 95% confidence interval, so $500 is a plausible mean rent of all advertised apartments.

2. 7.6
(a) \(H_0: \mu = \$0\) vs. \(H_a: \mu \neq \$0\), where \(\mu\) is the mean change in sales.
We use the two-sided alternative because there is no information in the problem that suggests whether we are looking for an increase or a decrease.
(b) The t statistic is 2.263 and the p-value is 0.028 from software. Table D does not have 49 df, but for 50 df our statistic is between 2.109 and 2.403, so we feel comfortable saying that the (two-sided) p-value is between 0.04 and 0.02.
(c) Certainly not. We are confident that the mean sales are up, but individual stores may either increase or decrease.

Extra question: In part (b), would the appropriate confidence interval provide the same conclusion? Perform the appropriate calculations to support your answer.
ANSWER: Yes.
\[ \text{C.I.} = \bar{x} \pm t_{49,.025}\,\frac{s}{\sqrt{n}} = .048 \pm 2.009\,\frac{0.15}{\sqrt{50}} \]
which, expressed in percent, is (0.535, 9.065). Since 0 is NOT in the interval, we would conclude that average sales are different, but we don't know whether that means increase or decrease.

3. 7.12
(a) 19
(b) 1.729 and 2.093
(c) 0.05 and 0.025
(d) Since it is a one-sided test, 0.05 and 0.025.
(e) It is significant at 5%, since the statistic is larger than the table value 1.729 (and our alternative is "greater than"). It is not significant at 1%, because it is not even larger than 2.093, the 2.5% critical value; the 1% critical value, 2.539, is larger still.
(f) (Omit part (f), but here's the answer.) 0.0385.

**4. The power of a t test is the numerical measure (between 0 and 1) of the test's ability to detect deviations from the null hypothesis. Which of the following is NOT true about the power of a t test?
(a) Sample size affects the power of the test we have used.
(b) Power of the t test = the p-value of a t test.
(c) High values of a t test are generally quite important.
(d) Power is the probability that we would reject a false claim (hypothesis).
ANSWER: (b). "Power of the t test = the p-value of a t test" is NOT a true statement. The power is the probability that a test will reject a false null hypothesis, while the p-value is the probability, computed assuming the null hypothesis is true, of getting a result at least as extreme as the one actually observed.

**5. Matched pairs: 7.43
\(H_0: \mu_A = \mu_B\) (\(\mu = 0\)) vs.
\(H_a: \mu_A > \mu_B\) (\(\mu > 0\)), where \(\mu\) is the mean difference (A-B).
\[ t = \frac{\bar{x} - 0}{s/\sqrt{n}} = \frac{0.46}{0.92/\sqrt{10}} = 1.581 \]
The p-value is between 0.10 and 0.05 from Table D, or 0.074 from Excel. This is not strong evidence that A is better than B. There is no statistical advantage (higher yield) to Variety A.

Extra Question: Use your knowledge of \(\bar{x}\) (the sample mean) and \(s\) (the sample standard deviation) to state why t tests are not robust against outliers.
ANSWER: Both of these statistics (\(\bar{x}\) and \(s\)) are used to compute a t-value, and both use all available data, so both are adversely affected by outliers. Outliers inflate \(s\), and they inflate \(\bar{x}\) as well (or deflate \(\bar{x}\) if the outliers are on the left (negative) side).

6. 7.45 The table gives the values for all 50 states, the whole population of states. There is no sampling involved.

Extra Question: If no sampling is involved and using t test procedures does not make sense, then what would you do to characterize the mean number of medical doctors per 100,000 people?
ANSWER: It's been done! We have looked at the entire population, so just calculate the average of all the data collected. The average of this population is \(\mu\).

Section 7.2

**7. 7.56
(b) Using df=9, t*=2.262.
\[ (\bar{x}_1 - \bar{x}_2) \pm t_{n-1,\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = (4.6 - 3.2) \pm 2.262\sqrt{\frac{0.7^2}{10} + \frac{1.8^2}{10}} = (0.019,\ 2.781) \]
The SE of the difference of the means is 0.61074. From Excel using less conservative df's: (0.1837,2.9837)
(c) \(H_0: \mu_{new} = \mu_{old}\) (\(\mu_{diff} = 0\)) vs. \(H_a: \mu_{new} \neq \mu_{old}\) (\(\mu_{diff} \neq 0\)).
NOTE: Since the 95% confidence interval does not contain 0, we can reject \(H_0\) at the 5% level. BUT JUST BARELY. Since 0 was almost in the interval, if the new monitors cost a lot more, the improvement is probably not large enough to offset the cost. However, using Excel's

7.57 A randomized design is generally better.
It is possible that new employees will come in batches, so that we might end up with more employees in one department (with different needs) who get the flat screens. OR it could be that the next 10 people hate their computers and requested new ones!

**8. 7.58 For the two-bedroom apartments, \(\bar{x} = \$609\) and \(s = \$89.31\). For the one-bedroom apartments, \(\bar{x} = \$531\) and \(s = \$82.79\). The SE for the difference is $38.511. Using df=9, t=2.262, and the confidence interval for the difference is
\[ (609 - 531) \pm t_{9,.025}\sqrt{\frac{89.31^2}{10} + \frac{82.79^2}{10}} = (-9.11,\ 165.11) \]
SPSS output is below:

Group Statistics
  Group    N     Mean        Std. Deviation    Std. Error Mean
  2        10    609.0000    89.3122           28.2430
  1        10    531.0000    82.7916           26.1810

Independent Samples Test
  Levene's Test for Equality of Variances: F = .249, Sig. = .624

  t-test for Equality of Means
                                 t        df        Sig. (2-tailed)    Mean Difference    Std. Error Difference    95% CI Lower    95% CI Upper
  Equal variances assumed        2.025    18        .058               78.0000            38.5112                  -2.9090         158.9090
  Equal variances not assumed    2.025    17.898    .058               78.0000            38.5112                  -2.9422         158.9422

Levene's Test failed to reject the null hypothesis of equal variances, so it is plausible to assume that the variances are equal and to use the pooled two-sample t test.
\[ s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{9 \times 89.31^2 + 9 \times 82.79^2}{18} = 86.12^2 \]
\[ t = \frac{\bar{x}_1 - \bar{x}_2}{s_p\sqrt{1/n_1 + 1/n_2}} = \frac{609 - 531}{86.12\sqrt{1/10 + 1/10}} = 2.0256 \]

**7.59
(a) \(H_0: \mu_1 = \mu_2\) vs. \(H_a: \mu_1 < \mu_2\), where \(\mu_1\) is the mean cost for one-bedroom apartments and \(\mu_2\) is the mean cost for two-bedroom apartments.
(b) The t statistic is 2.025 with df=9. The p-value is 0.037. This is fairly strong evidence that two-bedroom apartments cost more than one-bedroom ones.
(c) No, only that the mean for one-bedrooms is less than the mean for two-bedrooms.
(d) Confidence intervals are generally more useful because they give you a better idea of the size of the difference.
The p-value is a probability and does not tell you the amount of dollars involved.

Not on the assignment: Extra Question. Also, find the 99% confidence interval for the additional cost of a second bedroom. What are the key differences between the 95% and 99% intervals?
ANSWER:
\[ (\bar{x}_1 - \bar{x}_2) \pm t_{9,0.005}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = (\$609 - \$531) \pm 3.25 \times \$38.51 = (-\$47.16,\ \$203.16) \]
From Excel using less conservative df's: (-46.098, 202.098).
Key differences between the 95% and 99% C.I.'s: The 99% CI is wider than the 95% CI and includes 0. This 99% interval would not lead us to conclude that there was a significant change (either increase or decrease) in the subject rents.

9. 7.70
(a) The SE of the difference is 52.47. \(H_0: \mu_1 = \mu_2\) vs. \(H_a: \mu_1 < \mu_2\), where \(\mu_1\) is the mean of the Positive group and \(\mu_2\) is the mean of the Other group. The t statistic is
\[ t = \frac{3118 - 2733}{\sqrt{\dfrac{599^2}{134} + \dfrac{672^2}{5974}}} = 7.34 \]
with
\[ df = \frac{\left(s_1^2/n_1 + s_2^2/n_2\right)^2}{\dfrac{1}{n_1 - 1}\left(s_1^2/n_1\right)^2 + \dfrac{1}{n_2 - 1}\left(s_2^2/n_2\right)^2} = 140.61 \]
The p-value is essentially 0.
(b) The 95% interval for the difference is 282.16 to 487.84. From Excel using less conservative df's: (252.26, 517.79).
(c) The Other group includes women who were not tested. Some of these might be drug users. However, this would reduce the difference between groups, which is quite significant. On the other hand, there may be lurking variables. Drug users likely have lower socioeconomic status and generally poorer health than non-users. So we cannot conclude from these data that drug use causes the low birth weight.

Section 7.3

10. 7.89 Both numerator and denominator have df=19. The closest table entry is F(20,19). At the 5% level, the ratio must be at least 2.16.
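
For anyone who wants to check the hand calculations in Sections 7.1 and 7.2 without SPSS or Excel, here is a minimal Python sketch that recomputes several of the t statistics, standard errors, confidence intervals and the df formula of problem 9 from the summary statistics quoted in the answers. The function names (one_sample_t, se_diff, conf_int, welch_df) are made up for this illustration and do not come from the textbook or from any software package. The critical value 2.262 is the Table D value for 9 df used in the answers; it is typed in by hand, not computed.

```python
import math


def one_sample_t(xbar, mu0, s, n):
    """One-sample (or matched-pairs) t statistic: (xbar - mu0) / (s / sqrt(n))."""
    return (xbar - mu0) / (s / math.sqrt(n))


def se_diff(s1, n1, s2, n2):
    """Unpooled standard error of xbar1 - xbar2: sqrt(s1^2/n1 + s2^2/n2)."""
    return math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)


def conf_int(center, t_star, se):
    """Interval center +/- t_star * se, returned as (lower, upper)."""
    half_width = t_star * se
    return (center - half_width, center + half_width)


def welch_df(s1, n1, s2, n2):
    """Approximate df for the unpooled two-sample t (formula used in problem 9)."""
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2
    return (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))


if __name__ == "__main__":
    T9_025 = 2.262  # Table D critical value for 9 df, copied from the answers

    # Problem 1 (7.4): rents, H0: mu = 500
    print("7.4  t  =", round(one_sample_t(531, 500, 82.792, 10), 3))

    # Problem 5 (7.43): matched pairs, mean difference 0.46, s = 0.92, n = 10
    print("7.43 t  =", round(one_sample_t(0.46, 0, 0.92, 10), 3))

    # Problem 7 (7.56): conservative 95% CI with df = 9
    se_756 = se_diff(0.7, 10, 1.8, 10)
    print("7.56 SE =", round(se_756, 5))
    print("7.56 CI =", conf_int(4.6 - 3.2, T9_025, se_756))

    # Problem 8 (7.58): two-bedroom minus one-bedroom rents
    se_758 = se_diff(89.31, 10, 82.79, 10)
    print("7.58 SE =", round(se_758, 3))
    print("7.58 CI =", conf_int(609 - 531, T9_025, se_758))

    # Problem 9 (7.70): t statistic and df
    se_770 = se_diff(599, 134, 672, 5974)
    print("7.70 SE =", round(se_770, 2))
    print("7.70 t  =", round((3118 - 2733) / se_770, 2))
    print("7.70 df =", round(welch_df(599, 134, 672, 5974), 2))
```

The printed values should agree with the answers above up to rounding. P-values are left out on purpose, since the Python standard library has no t distribution; those still come from Table D or from software.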