CHAPTER 12: SAMPLE PROBLEMS FOR HOMEWORK, CLASS OR EXAMS

These problems are designed to be done without access to a computer, but they may require a calculator.

1. a. In an analysis of the relationship between the traits sales managers value in employees they hire as salesmen and the traits they value in employees they hire as office assistants, you find $\chi^2 = 196$, p value < 0.001, and Pearson contingency coefficient 0.84. Does this imply that the sales managers value the same traits in salesmen as they do in office assistants? Explain.
   b. In writing up the results described in part (a), you describe the analysis as a test of homogeneity. The editor objects that this is really a test of independence. What is the difference, and which phrase is most appropriate here?

2. A teacher has given a multiple-choice exam question with three choices. The teacher suspects that the class has not learned this material at all and is simply selecting a choice at random. From the data, can you disprove the teacher’s suspicion at $\alpha = 5\%$?

   Choice   Number of students choosing
   A        25
   B        15
   C        20

3. A large clinical study of a new drug for the treatment of cholesterol is attempting to compare the frequency of serious side effects. It is thought that the frequency might vary by age. The data on side effects is tabulated below, within age groups.

   Age       Reported serious side effects
   30 – 39   12 of 132
   40 – 49   14 of 124
   50 – 59   21 of 119
   60 – 69   28 of 121

   a. What is the apparent pattern in this data?
   b. A software package reports that $\chi^2 = 11.81$. Is there evidence of a difference in the probability of side effects by age group? Use $\alpha = 1\%$.

4. The state’s department of environmental protection maintains lists of gasoline service stations in each county. Concerned that leaky underground tanks may be contaminating water supplies, the state inspects a large sample of tanks, classifying them as to whether or not they leak.
The tanks are also classified by age category (0 – 5 years, 6 – 10 years, 11 + years). The data is summarized in the accompanying crosstab and statistical printouts. Comment on the apparent trend, if any, and use the formal hypothesis test to state a conclusion. Use $\alpha = 1\%$.

   Age * Leak Crosstabulation

                                     Leak
   Age                               does not leak   leaks     Total
   0 - 5 years     Count             117             4         121
                   Expected Count    108.8           12.2      121.0
                   % within Age      96.7%           3.3%      100.0%
                   % within Leak     40.9%           12.5%     38.1%
   6 - 10 years    Count             88              7         95
                   Expected Count    85.4            9.6       95.0
                   % within Age      92.6%           7.4%      100.0%
                   % within Leak     30.8%           21.9%     29.9%
   11 + years      Count             81              21        102
                   Expected Count    91.7            10.3      102.0
                   % within Age      79.4%           20.6%     100.0%
                   % within Leak     28.3%           65.6%     32.1%
   Total           Count             286             32        318
                   Expected Count    286.0           32.0      318.0
                   % within Age      89.9%           10.1%     100.0%
                   % within Leak     100.0%          100.0%    100.0%

   Chi-Square Tests

                                  Value        df   Asymp. Sig. (2-sided)
   Pearson Chi-Square             19.352 (a)   2    .000
   Likelihood Ratio               18.782       2    .000
   Linear-by-Linear Association   17.756       1    .000
   N of Valid Cases               318

   a. 0 cells (.0%) have expected count less than 5. The minimum expected count is 9.56.

5. In a sample of 68 new homes recently built by Company A, inspectors find 7 homes with substantial code violations. In a sample of 54 new homes recently built by Company B, inspectors find 9 homes with substantial code violations.
   a. Is there evidence that the two companies differ in the probability of having a substantial code violation? Use $\alpha = 5\%$.
   b. Suppose that Company B had 0 homes with substantial code violations. How would that affect your choice of methods for analyzing the data?

6. A psychologist has administered IQ tests to a sample of 200 adult prisoners. If this population has a distribution of IQs similar to that in the general population, the data should come from a normal distribution with a mean of 100 and a standard deviation of 16. The data is summarized below.
For some categories, the expected number under the presumed distribution and the contribution to the $\chi^2$ statistic have also been given.
   a. Fill in the remaining blanks in the table.
   b. Test the null hypothesis that the distribution of IQs in the adult prison population is similar to that in the general population, using $\alpha = 5\%$.
   c. Comment on the apparent difference between the adult prison population and the general population.

                              IQ < 84   84 ≤ IQ < 100   100 ≤ IQ < 116   116 ≤ IQ
   sample count               55        81              40               24
   expected                   31.73     68.27           _____            _____
   chi-squared contribution   17.07     2.37            _____            _____

7. In the past, students have evaluated an instructor on a scale of 40% Fair, 40% Good, and 20% Excellent. In the past year, the instructor has changed his style of delivery. A random sample of 50 independent student evaluations showed 10 Fair, 25 Good, and 15 Excellent. Is there evidence of a change in the distribution of the teaching evaluations? Use $\alpha = 5\%$.

8. In a trial of a new drug, 1 of the 100 people in the Placebo group reported flu-like symptoms in the first week. By contrast, 8 of the 100 people in the Drug group reported flu-like symptoms in the first week. The table below shows the printout for a standard set of test statistics comparing the proportions in the two groups. You wish to test the null hypothesis that there is no difference in the probability of experiencing flu-like symptoms for the two groups, using $\alpha = 5\%$. Cite the single most appropriate test and give the conclusion.

   Statistics for Table of flu by drug

   Statistic                      DF    Value     Prob
   ------------------------------------------------------
   Chi-Square                     1     5.7010    0.0170
   Likelihood Ratio Chi-Square    1     6.4543    0.0111
   Contingency Coefficient              0.1665
   Cramer's V                           0.1688

   Fisher's Exact Test
   ----------------------------------
   Table Probability (P)     0.0158
   Two-sided Pr <= P         0.0349

SOLUTIONS

1. a. No, it only means that there is a relationship. They could equally value the opposite traits in salesmen and in office assistants.
   b.
In a true test of homogeneity, you would sample separate, naturally occurring populations and see whether the distribution of some characteristic varied by population. For example, you might sample sales managers and production managers to see whether the distribution of traits they prefer in office assistants differed. In a test of independence, a single population is sampled, and each participant is measured on two variables. The study in part (a) samples a single population of sales managers, so this is a test of independence.

2. The expected count for each choice is 20. The chi-squared statistic is

   $\chi^2 = (25 - 20)^2/20 + (15 - 20)^2/20 + (20 - 20)^2/20 = 2.5$

   with 2 degrees of freedom. This is below the critical value of 5.991, so there is not significant evidence to disprove the teacher’s suspicion.

3. a. Apparently, the probability of a serious side effect increases as age increases.
   b. The test statistic has 3 degrees of freedom, and the critical value is 11.345. Since 11.81 exceeds the critical value, there is significant evidence that at least one age group has a different probability of a serious side effect.

4. The probability that a tank will leak appears to increase with age, from 3.3% in the newest tanks to 20.6% in the oldest tanks. The proportion of tanks that are leaking is significantly different for at least one group, $\chi^2 = 19.352$ with 2 df and a p value reported as .000 (less than 0.001). Note that the chi-squared test is valid here, as all cells have expected counts of at least 5.

5. a. The expected number of homes with code violations is 8.92 for Company A and 7.08 for Company B, so the sample is sufficiently large for a chi-squared test.

   $\chi^2 = (7 - 8.92)^2/8.92 + (9 - 7.08)^2/7.08 + (61 - 59.08)^2/59.08 + (45 - 46.92)^2/46.92 = 1.07$

   with 1 degree of freedom. This is below the critical value of 3.841, so there is no significant difference in the proportion of homes with substantial code violations.
   b. Now 2 of the 4 cells have expected counts less than 5. Use Fisher’s Exact Test.

6. a. Using the symmetry of the normal distribution about the mean of 100:

                              IQ < 84   84 ≤ IQ < 100   100 ≤ IQ < 116   116 ≤ IQ
   sample count               55        81              40               24
   expected                   31.73     68.27           68.27            31.73
   chi-squared contribution   17.07     2.37            11.71            1.88

   b. The chi-squared statistic is 33.03 with 3 degrees of freedom. The critical value is 7.815. There is significant evidence that the distribution of IQs among adult prisoners differs from that in the general population.
   c. Since the low IQ categories have more than expected and the high IQ categories have fewer than expected, it appears that IQs are typically lower in the prison population.

7. The expected counts under the old distribution would be 20 Fair, 20 Good, and 10 Excellent. The chi-squared statistic is 8.75 with 2 degrees of freedom. The critical value is 5.991. There is significant evidence of a change in the distribution of the teaching evaluations. The evaluations have apparently improved.

8. Two of the 4 cells have expected counts less than 5 (4.5 expected with flu-like symptoms in each group). Use Fisher’s Exact Test with the two-tailed p value, 0.0349. Since this is below 0.05, there is significant evidence that the groups differ in the probability that a patient will experience flu-like symptoms.
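
Although the problems are meant to be worked by hand, the chi-squared statistics in solutions 2, 5a, and 7 can be double-checked with a short script. The following is a minimal Python sketch, not part of the original problem set. The function names (chi_squared, expected_counts, chi_squared_table) are illustrative assumptions, and the two-way calculation assumes the usual expected count of row total times column total divided by the grand total.

```python
# Hypothetical helper names; a checking aid, not part of the original problem set.

def chi_squared(observed, expected):
    """Sum of (observed - expected)^2 / expected over all cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def expected_counts(table):
    """Expected counts for a two-way table: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    return [[r * c / grand for c in col_totals] for r in row_totals]


def chi_squared_table(table):
    """Pearson chi-squared statistic for a two-way table of counts."""
    expected = expected_counts(table)
    return sum(
        chi_squared(obs_row, exp_row)
        for obs_row, exp_row in zip(table, expected)
    )


# Problem 2: three choices, equally likely under the null hypothesis.
stat_2 = chi_squared([25, 15, 20], [20, 20, 20])

# Problem 5a: rows are Company A and Company B; columns are violation / no violation.
stat_5a = chi_squared_table([[7, 61], [9, 45]])

# Problem 7: 50 evaluations under the old 40% / 40% / 20% split.
stat_7 = chi_squared([10, 25, 15], [20, 20, 10])

print(round(stat_2, 2), round(stat_5a, 2), round(stat_7, 2))  # prints: 2.5 1.07 8.75
```

The same chi_squared_table helper applies to the larger tables: passing the counts from the tank crosstab in problem 4 or the 2 x 2 drug table in problem 8 should reproduce the Pearson chi-square values shown in those printouts. It does not compute p values or Fisher's Exact Test, which still require a table or statistical software.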