STATISTICS IN BUSINESS
BAMG 20100 R
Class 19 Exam 2, March 28, 2012
Questions and Answers

The following questions refer to the data set “Class 19 Exam Data”, which contains four variables on 60 US car models. The variables are “class” (Compact, Midsize, Large), which refers to the general size of the car; “displacement”, which is the size of the engine in liters; “Fuel Type”, which is P if the manufacturer recommends premium gasoline and R if regular; and “Hwy MPG”, which is the measure of highway miles per gallon. The first and third variables are categorical and the second and fourth are numerical. The data set also contains the index “Car” running from 1 to 60. We are interested in making inferences about the population from which these 60 cars were sampled randomly. (You can think of this population either as all the possible car makes in the US or as the process which selects the properties of cars on the US market.)

1. One would expect that the size of the car engine (measured by displacement) would change based on car class (compact, midsize, large). Use the data to test a relevant hypothesis with respect to this question. State your hypotheses, p-value, and conclusion. [20 points]

H0 is that the mean displacements are equal for the three classes of car. Ha is that the means are not all equal. With three groups of a numerically scaled variable, we use ANOVA single factor. A pivot table creates the three columns of displacement numbers, and DATA ANALYSIS ANOVA performs the desired test of significance.
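As a cross-check on the spreadsheet, the single-factor ANOVA can be sketched from summary statistics alone. The per-group counts, means, and variances below are copied from the summary output reported for this question; the function names are illustrative, and the closed-form p-value is an assumption-specific shortcut that holds only when the numerator has 2 degrees of freedom (three groups).

```python
def anova_from_summary(groups):
    """One-way ANOVA F statistic from (count, mean, variance) triples."""
    n_total = sum(n for n, _, _ in groups)
    grand_mean = sum(n * m for n, m, _ in groups) / n_total
    # Between-groups sum of squares: spread of group means around the grand mean.
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups)
    # Within-groups sum of squares: recovered from each sample variance.
    ss_within = sum((n - 1) * v for n, _, v in groups)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


def f_upper_tail_df2(f_stat, df_within):
    """Upper-tail F probability; valid ONLY when the numerator df is 2."""
    return (1 + 2 * f_stat / df_within) ** (-df_within / 2)


# (count, mean, variance) for Compact, Large, Midsize, from the summary output.
groups = [
    (19, 4.305263, 1.280526),
    (16, 3.31875, 0.160292),
    (25, 2.492, 0.215767),
]

f_stat, df_between, df_within = anova_from_summary(groups)
p_value = f_upper_tail_df2(f_stat, df_within)
print(f"F = {f_stat:.3f} on ({df_between}, {df_within}) df, p = {p_value:.3g}")
```

With more than three groups the shortcut no longer applies, and a general F distribution routine would be needed instead.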
Anova: Single Factor

SUMMARY
Groups     Count   Sum    Average    Variance
Compact    19      81.8   4.305263   1.280526
Large      16      53.1   3.31875    0.160292
Midsize    25      62.3   2.492      0.215767

ANOVA
Source of Variation   SS         df   MS         F          P-value    F crit
Between Groups        35.51708   2    17.75854   33.04481   2.96E-10   3.158843
Within Groups         30.63225   57   0.537408
Total                 66.14933   59

Compact cars showed the highest sample mean (perhaps because many of the compact cars were high-performance sports cars), and midsize cars had the lowest sample mean. The differences among the three sample means are statistically significant given the low p-value of 2.96E-10. We reject H0 in favor of Ha.

2. One might expect to see a relationship between car class and recommended fuel type. Do the data support such an expectation? Use the data to test a relevant hypothesis with respect to this question. State your hypotheses, test statistic, p-value, and conclusion. [20 points]

Here we are asked about a potential relationship between two categorical variables. H0 will be that car class and fuel type are independent. Ha is that they are not independent. The appropriate test is a chi-squared independence test, for which we need the contingency table of observed and expected counts.

Observed
         Compact   Large   Midsize   Total
P        16        11      9         36
R        3         5       16        24
Total    19        16      25        60

Expected
         Compact   Large   Midsize
P        11.4      9.6     15
R        7.6       6.4     10

Distances
         Compact   Large   Midsize
P        1.86      0.20    2.40
R        2.78      0.31    3.60

Calculated chi-square (sum of distances)   11.15
p-value = chidist(11.15,2)                 0.00379002
p-value = chisq.test(O,E)                  0.00379002

Note that all 6 expected counts are greater than 5, as required by this test. The sum of the 6 distances in the table (the calculated chi-square statistic) is 11.15. The single largest distance comes from the 16 midsize cars (out of 25) that use regular fuel, a large number given that overall only 40% (24 out of 60) use regular fuel. This calculated chi-squared is compared to the theoretical chi-square distribution with (3-1)(2-1) = 2 degrees of freedom. The p-value is 0.004 and the results are statistically significant.
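The expected counts, distances, and p-value for this class-by-fuel table can be reproduced with a short sketch. The function name is illustrative, and the p-value line relies on an assumption specific to this case: with exactly 2 degrees of freedom, the upper tail of the chi-square distribution is exp(-x/2).

```python
import math


def chi_square_independence(observed):
    """Chi-square statistic, df, and expected counts for an r x c table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    # Expected count under independence: row total * column total / grand total.
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    # Each "distance" is (observed - expected)^2 / expected.
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(observed) - 1) * (len(col_totals) - 1)
    return statistic, df, expected


# Rows: premium (P), regular (R). Columns: Compact, Large, Midsize.
observed = [[16, 11, 9], [3, 5, 16]]
statistic, df, expected = chi_square_independence(observed)

# Closed-form upper tail, valid only because df == 2 here.
p_value = math.exp(-statistic / 2)
print(f"chi-square = {statistic:.2f}, df = {df}, p = {p_value:.5f}")
```

The returned expected counts also make it easy to confirm the rule of thumb that every expected count exceeds 5.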
We reject H0 of independence.

3. One might expect that because premium gasoline is higher quality, cars for which it is recommended will get higher gas mileage (on average) than cars for which regular fuel is recommended. Do the data support this expectation? Use the data to test a relevant hypothesis with respect to this question. State your hypotheses, test statistic, p-value, and conclusion. [20 points]

It appears this is one of those situations in which the “wrong” alternative hypothesis was chosen. H0 is that mean Hwy MPG is equal for cars using premium and cars using regular. Ha is that the mean MPG is higher for cars using premium. We need to perform a t-test: two samples (with equal variances).

t-Test: Two-Sample Assuming Equal Variances
                               P          R
Mean                           24.33333   27.70833
Variance                       12.4       9.519928
Observations                   36         24
Pooled Variance                11.2579
Hypothesized Mean Difference   0
df                             58
t Stat                         -3.81704
P(T<=t) one-tail               0.000165
t Critical one-tail            1.671553
P(T<=t) two-tail               0.000331
t Critical two-tail            2.001717

Examining this report carefully, we see that the sample mean MPG was higher for cars using regular gas. Our idea that premium would lead to a higher mean MPG was incorrect. (On second thought, it is more about the kinds of cars using the fuel than the fuel itself. Economy cars have both higher MPGs and are built to use regular fuel.) So we need go no further. We cannot reject H0 in favor of our Ha. Our results are not statistically significant (for our Ha). The one-tail p-value reported above of 0.000165 is for the Ha that mean MPG is lower for cars using premium fuel. If that had been our Ha (and it was not), our result would be statistically significant.

4. About 22,000 volunteers participated in the clinical study to test the use of daily aspirin to reduce the risk of a heart attack. Half the participants took an aspirin a day, and the other half took a placebo. The treatment (aspirin) group had 104 heart attacks while the placebo group had 189.

a.)
Are the results statistically significant? (Be certain to state your hypotheses, calculate a p-value, and be clear about your conclusion.) [20 points]

Both variables here are that special case of yes/no variables (which can be either categorical or numerical). This means we have a few ways to proceed, and we decide to treat them as categorical. H0 is that treatment (aspirin or placebo) is independent of outcome (heart attack or no heart attack). For a 2x2 situation like this, independence is equivalent to H0: the probability of a heart attack is the same for those taking aspirin as for those taking the placebo. Ha is that they are not independent (heart attack probabilities are unequal). This is a two-tailed alternative. (We could also use a one-tailed alternative that the probability is lower for aspirin, which would require us to treat the variables as numerical; that analysis is not reported here.) The details of the chi-square independence test follow:

Observed
            Heart Attack   No Heart Attack   Total
Aspirin     104            10896             11000
Placebo     189            10811             11000
Total       293            21707             22000

Expected
            Heart Attack   No Heart Attack
Aspirin     146.5          10853.5
Placebo     146.5          10853.5

Distances
            Heart Attack   No Heart Attack
Aspirin     12.33          0.17
Placebo     12.33          0.17

Calculated chi-square                24.99
P-value = chisq.dist.rt(24.99,1)     5.7582E-07
P-value = chisq.test(O,E)            5.7582E-07

The low p-value means we reject the hypothesis of independence in favor of Ha. The difference in sample heart attack rates for the two groups (aspirin and placebo) is statistically significant.

In answering b) and c) below, assume the probability of a heart attack is 104/11,000 if taking aspirin and 189/11,000 if taking a placebo.

b.) For future trials, a public health official suggests giving the treatment (in this case, the aspirin) to 75% of the participants rather than 50%. Said the official, “under the assumption the treatment is effective, this will lead to better health outcomes for the participants.” Suppose the study is repeated with 22,000 new participants using the proposed 75/25 split. How many heart attacks will the 22,000 participants experience?
[10 points]

GOOD ANSWER
The expected (or mean) number of heart attacks is (189/11000)*5500 + (104/11000)*16500 = 94.5 + 156 = 250.5. This is an improvement over the 293 from the original 50/50 design. The official is correct that the new split will lower the mean number of heart attacks. We expect 42.5 fewer. (The 5500 and 16500 numbers are 25% and 75% of the 22,000 subjects.)

BETTER ANSWER
But we can provide a better, more complete answer. Even if we assume known probabilities (which the question asks us to do), we cannot know exactly how many heart attacks will occur. The mean is 250.5, but there will be uncertainty. The total number of heart attacks will be a random variable. The number of heart attacks from the aspirin group will be binomially distributed with n=16,500 and P=104/11000. The number of heart attacks from the placebo group will be binomially distributed with n=5,500 and P=189/11000. But what will the distribution of total heart attacks be? It is the sum of two binomials, but what distribution is that?

BEST ANSWER
Our best answer comes if we remember that when n is big the normal can be used to approximate the binomial. And we know that the sum of two normals is normal with mean equal to the sum of the two means and variance equal to the sum of the two variances (if independent). The mean of a binomial is n*P, its variance is n*P*(1-P), and its standard deviation is [n*P*(1-P)]^.5.

                                    Number of Heart Attacks
Group     Number   Probability   mean    variance   std dev
aspirin   16500    0.009455      156     154.53     12.4
placebo   5500     0.017182      94.5    92.88      9.6
TOTAL     22000                  250.5   247.40     15.7

So the total number of heart attacks will be approximately normally distributed with mean 250.5 and standard deviation 15.7.

c.) Will the new 75/25 split affect the resulting p-value? In particular, should we expect the new p-value to be greater than, equal to, or less than the p-value calculated for the original trial (your answer to a. above)? Explain the reasoning behind your answer. [10 points]

The new split will probably lead to a higher p-value.
With 22,000 subjects available to measure and test the difference in two sample proportions, it is best to use an equal split. The more extreme the split, the less powerful the test will be (imagine, for example, a 100/0 split: we’d have great data on the aspirin rate and NO data on the placebo rate). Another way to see this is to keep the heart attack rates the same, and redo the p-value for the new 75/25 split.

Observed
            Heart Attack   No Heart Attack   Total
Aspirin     156            16344             16500
Placebo     94.5           5405.5            5500
Total       250.5          21749.5           22000

Expected
            Heart Attack   No Heart Attack   Total
Aspirin     187.875        16312.125         16500
Placebo     62.625         5437.375          5500

Distances
            Heart Attack   No Heart Attack
Aspirin     5.41           0.06
Placebo     16.22          0.19

Calculated chi-square                21.88
P-value = chisq.dist.rt(21.88,1)     2.9011E-06
P-value = chisq.test(O,E)            2.9011E-06

As predicted, the calculated chi-square is smaller (21.88 versus 24.99) and the p-value bigger. However, with the sample sizes available here, the increase in the p-value is immaterial. In this situation, since aspirin and placebo heart attack rates are decidedly different and total n is very big, the official’s proposal looks good. Switching to 75/25 will result in fewer heart attacks (on average) and a slightly higher p-value, but the test should still be powerful enough to reject the null hypothesis.

5. Di and El compete to see who is better at throwing a tennis ball into a trash can. Di got 10 out of 20 throws in the can, whereas El only saw 5 of her 22 tosses end up in the can. El claims that Di’s victory was simply a matter of good fortune and that Di really isn’t any more skillful. Comment on El’s claim. (To receive full credit your commentary must include analysis of the contest results that results in a p-value, not simply opinion.) [20 points]

Let us start by doing a chi-square independence test. H0: tosser and outcome are independent. This is equivalent to Di and El having equal probabilities of success. Ha: not independent (or unequal probabilities).
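For a 2x2 table, the chi-square independence test is equivalent to a pooled two-proportion z-test: the squared z statistic equals the chi-square statistic, and the sign of z shows which tosser did better. The sketch below applies that equivalence to the contest counts given in the question (10 of 20 for Di, 5 of 22 for El). The function name is illustrative, and the normal approximation is an assumption that is only rough at sample sizes this small.

```python
import math
from statistics import NormalDist


def two_proportion_z(successes_a, n_a, successes_b, n_b):
    """Pooled z statistic for H0: both groups share one success probability."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    std_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_a - p_b) / std_error


z = two_proportion_z(10, 20, 5, 22)  # Di first, El second

# One-tailed p-value for Ha: Di has the higher success probability.
p_one_tail = 1 - NormalDist().cdf(z)
# Two-tailed p-value; doubling is valid here because z is positive.
p_two_tail = 2 * p_one_tail

print(f"z = {z:.3f}, z^2 = {z * z:.3f}")
print(f"one-tail p = {p_one_tail:.4f}, two-tail p = {p_two_tail:.4f}")
```

Working with z rather than chi-square makes the direction of the difference explicit, which matters when the alternative of interest is one-sided.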
Observed
         Di   El   Total
In       10   5    15
Out      10   17   27
Total    20   22   42

Expected
         Di     El
In       7.1    7.9
Out      12.9   14.1

Distances
         Di         El
In       1.142857   1.038961
Out      0.634921   0.577201

Calculated chi-squared   3.393939
P-value                  0.065436

Note that expected counts are all greater than 5. We cannot reject H0 in favor of the 2-tailed Ha. But the Ha we would really like to use is Ha: Di has a higher probability. Since the chi-squared independence test is two-tailed, the 0.065 p-value includes both “tails”. For the one-tailed test, the p-value will be half that, because the sample difference is in the direction Ha predicts. So the 1-tail p-value is 0.033, and we CAN reject H0 in favor of the Ha that Di has a higher probability. So even with the relatively small sample sizes, we reject the hypothesis that they are equally skilled in favor of the one-tailed alternative that Di is more skillful (has a higher P).

6. Bo has three stocks in his portfolio. Bo knows the returns on these three stocks are random variables with the following means and standard deviations.

Stock   Mean Return   Standard Deviation
1       0.10          0.10
2       0.05          0.04
3       0.20          0.40

Because Bo has split his money equally among the three stocks, the return of his portfolio will be the sum of the three returns divided by 3, (r1 + r2 + r3)/3, where r1 is the return from stock 1, r2 the return from stock 2, and r3 the return from stock 3. Bo is interested in the properties of the return of his portfolio.

We know that the mean of the sum is always the sum of the means, and the variance of the sum is the sum of the variances if the components are independent. Assuming independence, the sum of the three returns will have mean 0.35, variance 0.1716, and standard deviation 0.414. Since the portfolio return is the sum divided by 3, we know that the mean is 0.35/3 and the standard deviation is 0.414/3.

Stock     Mean Return   Variance   Standard Deviation
1         0.1           0.01       0.1
2         0.05          0.0016     0.04
3         0.2           0.16       0.4
TOTAL     0.35          0.1716     0.414
Average   0.117                    0.138

a.) What is the mean? [5 points]

The mean is 0.117, the average of the three mean returns.

b.) What is the standard deviation?
[5 points]

The standard deviation is 0.138.

c.) Does the answer to a) require independence? [5 points]

No.

d.) Does the answer to b) require independence? If yes, is independence a reasonable assumption? Explain briefly. [5 points]

Yes. Independence is NOT a reasonable assumption because stocks tend to rise and fall together with the general conditions of the stock market and economy. If the return on one exceeds its mean, the returns on the others are likely to exceed their means also.

7. A hotel elevator is rated to hold at most 3,500 pounds safely. Assume the weights of the guests at the hotel using the elevator follow the normal distribution with a mean of 150 pounds and a standard deviation of 40 pounds.

a.) If 20 guests board the elevator, will they be over or under the weight limit? (Assume independence) [10 points]

The mean of the total weight will be 20*150 or 3,000. The variance of the total weight will be 20*variance (variances add if independent) = 20*40^2 = 32,000. This means the standard deviation of the total weight in the elevator is 32,000^.5 = 178.885. The probability the total weight exceeds 3,500 is then 1 – NORMDIST(3500,3000,178.885,true) = 0.0026, so the guests will almost certainly be under the limit.

b.) It turns out this is a family hotel. The weights of the 20 guests boarding the elevator will not be independent as the 20 will usually include entire families (adults and children). In light of this new information, in what direction does your answer to (a.) change? Why? [5 points]

The weights of the 20 guests in the elevator will now not be independent. Higher weight individuals (adults) are likely to enter the elevator along with lower weight children. This will make the variance of the total weight LOWER than that from independence. This will make the probability the total will exceed 3,500 even smaller than 0.0026. So this changes my answer to (a.) by making it lower.

8.
To test a one-sample hypothesis about a mean in a situation in which σ is not known, Al correctly used the t-statistic whereas Bo incorrectly used the Z-statistic (and just replaced σ with s). The alternative hypothesis was such that this was a 2-tailed test. Please circle the correct statement (select one).

a.) Al’s p-value will be greater than Bo’s.
b.) Al’s p-value will be equal to Bo’s.
c.) Al’s p-value will be less than Bo’s.
d.) We cannot know how their p-values will compare; it depends on the data.

Briefly explain your answer. [10 points]

(a) is the correct answer. Al’s t and Bo’s Z will come out equal. The difference between the two will occur in the subsequent step, in which Al will correctly use =t.dist.2t whereas Bo will incorrectly use the normal. The entire reason for the t is to account for the extra uncertainty one faces when using s instead of σ. So the t-distribution will be wider (reflecting more uncertainty) than the normal. Al’s p-value will come out greater than Bo’s. A simple numerical example, testing a hypothesized mean of 100, will verify this. In essence, Bo “cheated” on the hypothesis test by assuming s was σ. His cheating got him a better result.

Sample mean   104
s             18
n             33

Al's t = (104-100)/(18/33^.5)            1.28
Bo's Z                                   1.28
Al's p = t.dist.2t(1.28,32)              0.210941
Bo's p = 2*(1-norm.s.dist(1.28,true))    0.201754

9. Assume no one “cheats” on the IQ experiment. When instructed, all students hit F9 (recalculate) exactly ten times and report the final IQ (regardless of how big or small it is). We then proceed to test the hypothesis that the mean IQs of males and females are equal. We have 42 males and 27 females. (The generated IQs will be normally distributed with mean 100 and standard deviation 15.) Because the alternative is that the means are not equal, this will be a 2-tailed test. Will we reject or fail to reject the null hypothesis? [5 points]

Since H0 is known to be true, we will reject it only by chance. At the 5% significance level, that will happen with probability 0.05.
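The claim in question 9 can be illustrated with a small simulation: draw 42 "male" and 27 "female" IQs from the same normal distribution, run a pooled two-sample t-test, and count how often H0 is rejected. This is only a sketch under stated assumptions: the critical value 1.996 is an approximate table value for 67 degrees of freedom at the 5% two-tailed level, and the trial count, seed, and function names are arbitrary choices for illustration.

```python
import math
import random

# Assumed: approximate two-tailed 5% critical value of t with 42 + 27 - 2 = 67 df.
T_CRITICAL = 1.996


def mean_and_variance(sample):
    """Sample mean and unbiased sample variance."""
    n = len(sample)
    avg = sum(sample) / n
    var = sum((x - avg) ** 2 for x in sample) / (n - 1)
    return avg, var


def pooled_t(sample_a, sample_b):
    """Two-sample t statistic assuming equal variances."""
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a, var_a = mean_and_variance(sample_a)
    mean_b, var_b = mean_and_variance(sample_b)
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    return (mean_a - mean_b) / math.sqrt(pooled_var * (1 / n_a + 1 / n_b))


def rejection_rate(trials, seed=0):
    """Fraction of simulated experiments in which a true H0 is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        males = [rng.gauss(100, 15) for _ in range(42)]
        females = [rng.gauss(100, 15) for _ in range(27)]
        if abs(pooled_t(males, females)) > T_CRITICAL:
            rejections += 1
    return rejections / trials


rate = rejection_rate(4000)
print(f"Share of simulated tests rejecting a true H0: {rate:.3f}")
```

Over many repetitions the rejection share should settle near 0.05, which is what a 5% significance level means when the null hypothesis is true.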