STATISTICS IN BUSINESS
BAMG 20100
R
Class 19 Exam2
March 28, 2012
Questions and Answers
The following questions refer to the data set “Class 19 Exam Data” which contains four variables on 60
US car models. The variables are “class” (Compact, Midsize, Large) which refers to the general size of the
car, “displacement” which is the size of the engine in liters, “Fuel Type” which is P if the manufacturer
recommends premium gasoline and R if regular, and “Hwy MPG” which is the measure of highway miles
per gallon. The first and third variables are categorical and the second and fourth are numerical. The
data set also contains the index “Car” running from 1 to 60.
We are interested in making inferences about the population from which these 60 cars were sampled
randomly. (You can think of this population either as all the possible car makes in the US or as the
process which selects the properties of cars on the US market.)
1. One would expect that the size of the car engine (measured by displacement) would change based on
car class (compact, midsize, large). Use the data to test a relevant hypothesis with respect to this
question. State your hypotheses, p-value, and conclusion. [20 points]
H0 is that the mean displacements are equal for the three classes of car. Ha is that the means
are not all equal. With three groups of a numerically scaled variable, we use ANOVA single
factor. A pivot table creates the three columns of displacement numbers, and DATA ANALYSIS
ANOVA performs the desired test of significance.
Anova: Single Factor

SUMMARY
Groups     Count   Sum    Average    Variance
Compact    19      81.8   4.305263   1.280526
Large      16      53.1   3.31875    0.160292
Midsize    25      62.3   2.492      0.215767

ANOVA
Source of Variation   SS         df   MS         F          P-value    F crit
Between Groups        35.51708   2    17.75854   33.04481   2.96E-10   3.158843
Within Groups         30.63225   57   0.537408
Total                 66.14933   59
Compact cars showed the highest sample mean (perhaps because many of the compact cars
were high-performance sports cars), and midsize cars had the lowest sample mean. The
differences among the three sample means are statistically significant given the low p-value of
2.96E-10. We reject H0 in favor of Ha.
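As a cross-check on the Excel output, the F statistic can be rebuilt from the counts, averages, and variances in the SUMMARY table alone. The following is a minimal sketch in Python (the exam itself used Excel; the variable names are illustrative). It assumes exactly three groups, because with 2 numerator degrees of freedom the upper-tail F probability has the closed form (1 + 2F/df2)^(-df2/2), which avoids needing a statistics library.

```python
# One-way ANOVA rebuilt from summary statistics: (count, mean, variance).
groups = {
    "Compact": (19, 4.305263, 1.280526),
    "Large":   (16, 3.31875,  0.160292),
    "Midsize": (25, 2.492,    0.215767),
}

n_total = sum(n for n, _, _ in groups.values())
grand_mean = sum(n * mean for n, mean, _ in groups.values()) / n_total

# Between-groups and within-groups sums of squares.
ss_between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups.values())
ss_within = sum((n - 1) * var for n, _, var in groups.values())

df_between = len(groups) - 1        # 2
df_within = n_total - len(groups)   # 57

f_stat = (ss_between / df_between) / (ss_within / df_within)

# Closed form for the upper tail, valid ONLY when df_between == 2.
p_value = (1 + 2 * f_stat / df_within) ** (-df_within / 2)

print(f"F = {f_stat:.3f}, p-value = {p_value:.2e}")
```

The result should agree with the F and P-value columns of the ANOVA table up to rounding of the summary inputs.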
2. One might expect to see a relationship between car class and recommended fuel type. Do the data
support such an expectation? Use the data to test a relevant hypothesis with respect to this question.
State your hypotheses, test statistic, p-value, and conclusion. [20 points]
Here we are asked about a potential relationship between two categorical variables. H0 will be
that car class and fuel type are independent. Ha is that they are not independent. The
appropriate test is a chi-squared independence test, for which we need the contingency table of
observed and expected counts.
Observed
        Compact   Large   Midsize   Total
P       16        11      9         36
R       3         5       16        24
Total   19        16      25        60

Expected
        Compact   Large   Midsize
P       11.4      9.6     15
R       7.6       6.4     10

Distances
        Compact   Large   Midsize
P       1.86      0.20    2.40
R       2.78      0.31    3.60

calculated chi-square (sum of distances)   11.15
pvalue = chidist(11.15,2)                  0.00379002
pvalue = chisq.test(O,E)                   0.00379002
Note that all 6 expected counts are greater than 5, as required by this test. The sum of the 6
distances in the table (the calculated chi-square statistic) is 11.15. The single largest distance
comes from the 16 midsize cars (out of 25) that use regular fuel…a large number given that
overall only 40% (24 out of 60) use regular fuel. This calculated chi-squared is compared to the
theoretical chi-square distribution with (3-1)(2-1) = 2 degrees of freedom. The p-value is 0.004
and the results are statistically significant. We reject H0 of independence.
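The expected counts and distances can be reproduced from the observed counts alone. The sketch below is one way to do this in Python rather than Excel; the names are illustrative. It assumes 2 degrees of freedom, as in this problem, because in that case the chi-square upper tail reduces to exp(-x/2).

```python
import math

# Observed counts: rows are fuel type (P, R); columns are class
# (Compact, Large, Midsize).
observed = [
    [16, 11, 9],   # P
    [3, 5, 16],    # R
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count under independence = row total * column total / n.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Each cell's "distance" is (observed - expected)^2 / expected.
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

df = (len(observed) - 1) * (len(observed[0]) - 1)

# With exactly 2 degrees of freedom the upper tail is exp(-x / 2).
p_value = math.exp(-chi_square / 2)

print(f"chi-square = {chi_square:.2f}, df = {df}, p-value = {p_value:.5f}")
```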
3. One might expect that because premium gasoline is higher quality, cars for which it is recommended
will get higher gas mileage (on average) than cars for which regular fuel is recommended. Do the data
support this expectation? Use the data to test a relevant hypothesis with respect to this question. State
your hypotheses, test statistic, p-value, and conclusion. [20 points]
It appears this is one of those situations in which the “wrong” alternative hypothesis was
chosen. H0 is that mean Hwy MPG is equal for cars using premium and cars using regular. Ha
is that the mean MPG is higher for cars using Premium. We need to perform a t-test: two
samples (with equal variances).
t-Test: Two-Sample Assuming Equal Variances

                               P          R
Mean                           24.33333   27.70833
Variance                       12.4       9.519928
Observations                   36         24
Pooled Variance                11.2579
Hypothesized Mean Difference   0
df                             58
t Stat                         -3.81704
P(T<=t) one-tail               0.000165
t Critical one-tail            1.671553
P(T<=t) two-tail               0.000331
t Critical two-tail            2.001717
Examining this report carefully, we see that the sample mean MPG was higher for cars using
regular gas. Our idea that premium would lead to a higher mean MPG was incorrect. (On
second thought, it is more about the kinds of cars using the fuel than the fuel itself. Economy
cars have both higher MPGs and are built to use regular fuel.) So….we need go no further. We
cannot reject H0 in favor of our Ha. Our results are not statistically significant (for our Ha).
The one-tail p-value reported above of 0.000165 is for the Ha that mean MPG is lower for cars
using premium fuel. If that had been our Ha (and it was not), our result would be statistically
significant.
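The pooled variance and t statistic in the report follow directly from the two sample means, variances, and sizes. The sketch below (Python, illustrative names) recomputes them and checks the sign of t against our Ha. It does not compute a p-value; it simply reuses the one-tail critical value printed in the report.

```python
import math

# Summary statistics from the report: (mean, variance, observations).
premium = (24.33333, 12.4, 36)
regular = (27.70833, 9.519928, 24)

mean_p, var_p, n_p = premium
mean_r, var_r, n_r = regular

df = n_p + n_r - 2
pooled_var = ((n_p - 1) * var_p + (n_r - 1) * var_r) / df
std_error = math.sqrt(pooled_var * (1 / n_p + 1 / n_r))
t_stat = (mean_p - mean_r) / std_error

# Ha says premium is HIGHER, so only a positive t beyond the one-tail
# critical value would support it.
t_critical_one_tail = 1.671553  # copied from the report
supports_ha = t_stat > t_critical_one_tail

print(f"pooled variance = {pooled_var:.4f}, t = {t_stat:.3f}, df = {df}")
print("reject H0 in favor of Ha (premium higher)?", supports_ha)
```

Because t is negative, the data point away from Ha, and no positive critical value can be exceeded.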
4. About 22,000 volunteers participated in the clinical study to test the use of daily aspirin to reduce the
risk of a heart attack. Half the participants took an aspirin a day, and the other half took a placebo. The
treatment (aspirin) group had 104 heart attacks while the placebo group had 189.
a.) Are the results statistically significant? (Be certain to state your hypotheses, calculate a p-value, and
be clear about your conclusion.) [20 points]
Both variables here are that special case of yes/no variables (which can be either categorical or
numerical). This means we have a few ways to proceed, and we decide to treat them as
categorical. H0 is that treatment (aspirin or placebo) is independent of outcome (heart attack or
no heart attack). For a 2x2 situation like this, independence is equivalent to H0: the probability
of a heart attack is the same for those taking aspirin as for those taking the placebo. Ha is they
are not independent (heart attack probabilities are unequal). This is a two-tailed alternative.
(We could also do a one-tailed alternative that the probability is lower for aspirin requiring us to
treat them as numerical….not reported here.) The details of the chi-square independence test
follow:
Observed
          Heart Attack   No Heart Attack   Total
Aspirin   104            10896             11000
Placebo   189            10811             11000
Total     293            21707             22000

Expected
          Heart Attack   No Heart Attack
Aspirin   146.5          10853.5
Placebo   146.5          10853.5

Distances
          Heart Attack   No Heart Attack
Aspirin   12.33          0.17
Placebo   12.33          0.17

Calculated chi-square              24.99
P-value = chisq.dist.rt(24.99,1)   5.7582E-07
P-value = chisq.test(O,E)          5.7582E-07
The low p-value means we reject the hypothesis of independence in favor of Ha. The difference
in sample heart attack rates for the two groups (aspirin and placebo) is statistically significant.
In answering b) and c) below, assume the probability of a heart attack is 104/11,000 if taking aspirin and
189/11,000 if taking a placebo.
b.) For future trials, a public health official suggests giving the treatment (in this case, the aspirin) to 75%
of the participants rather than 50%. Said the official, “under the assumption the treatment is effective,
this will lead to better health outcomes for the participants.” Suppose the study is repeated with 22,000
new participants using the proposed 75/25 split. How many heart attacks will the 22,000 participants
experience? [10 points]
GOOD ANSWER
The expected (or mean) number of heart attacks is (189/11000)*5500 + (104/11000)*16500 =
94.5 + 156 = 250.5. This is an improvement over the 293 from the original 50/50 design. The
official is correct that the new split will lower the mean number of heart attacks. We expect 42.5
fewer. (The 5500 and 16500 numbers are 25% and 75% of the 22,000 subjects.)
BETTER ANSWER
But, we can provide a better, more complete, answer. Even if we assume known probabilities
(which the question asks us to do), we cannot know exactly how many heart attacks will occur.
The mean is 250.5, but there will be uncertainty. The total number of heart attacks will be a
random variable.
The number of heart attacks from the aspirin group will be binomially distributed with n=16,500
and P=104/11000. The number of heart attacks from the placebo group will be binomially
distributed with n=5,500 and P=189/11000.
But what will the distribution of total heart attacks be? It is the sum of two binomials….but what
distribution is that?
BEST ANSWER
Our best answer comes if we remember that when n is big the normal can be used to
approximate the binomial. And we know that the sum of two normals is normal with mean equal
to the sum of the two means and variance equal to the sum of the two variances (if
independent). The standard deviation of a binomial is [n*P*(1-P)]^.5
                                 Number of Heart Attacks
Group     Number   Probability   mean    variance   std dev
aspirin   16500    0.009455      156     154.53     12.4
placebo   5500     0.017182      94.5    92.88      9.6
TOTAL     22000                  250.5   247.40     15.7
So the total number of heart attacks will be approximately normally distributed with mean 250.5
and standard deviation of 15.7.
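The mean, variance, and standard deviation of the total follow from the binomial formulas n*P and n*P*(1-P). A short sketch in Python (illustrative names, assuming the two groups are independent as the answer does):

```python
import math

p_aspirin = 104 / 11000
p_placebo = 189 / 11000
n_aspirin = 16500   # 75% of 22,000
n_placebo = 5500    # 25% of 22,000

def binomial_mean_var(n, p):
    """Mean and variance of a binomial count with n trials and probability p."""
    return n * p, n * p * (1 - p)

mean_a, var_a = binomial_mean_var(n_aspirin, p_aspirin)
mean_b, var_b = binomial_mean_var(n_placebo, p_placebo)

# Means always add; variances add because the two groups are independent.
total_mean = mean_a + mean_b
total_var = var_a + var_b
total_sd = math.sqrt(total_var)

print(f"mean = {total_mean:.1f}, variance = {total_var:.2f}, sd = {total_sd:.1f}")
```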
c.) Will the new 75/25 split affect the resulting p-value? In particular, should we expect the new p-value
to be greater than, equal to, or less than the p-value calculated for the original trial (your answer to a.
above)? Explain the reasoning behind your answer. [10 points]
The new split will probably lead to a higher p-value. With 22,000 subjects available to measure
and test the difference in two sample proportions, it is best to use an equal split. The more
extreme the split, the less powerful will be the test (imagine, for example, a 100/0 split---we’d
have great data on the aspirin rate and NO data on the placebo rate).
Another way to see this is to keep the heart attack rates the same, and redo the p-value for the
new 75/25 split.
Observed (at the assumed rates)
          Heart Attack   No Heart Attack   Total
Aspirin   156            16344             16500
Placebo   94.5           5405.5            5500
Total     250.5          21749.5           22000

Expected
          Heart Attack   No Heart Attack   Total
Aspirin   187.875        16312.125         16500
Placebo   62.625         5437.375          5500

Distances
          Heart Attack   No Heart Attack
Aspirin   5.41           0.06
Placebo   16.22          0.19

Calculated chi-square              21.88
P-value = chisq.dist.rt(21.88,1)   2.9011E-06
P-value = chisq.test(O,E)          2.9011E-06
As predicted the calculated chi-square is smaller and the p-value bigger. However, with the
sample sizes available here, the increase in p-value is immaterial. In this situation, since aspirin
and placebo heart attack rates are decidedly different and total n is very big, the official’s
proposal looks good. Switching to 75/25 will result in fewer heart attacks (on average) and a
slightly higher p-value, but the test should still be powerful enough to reject the null hypothesis.
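To compare the two designs side by side, we can hold the heart attack rates fixed and recompute the chi-square statistic for each split. The sketch below is a hypothetical helper in Python (the function name and arguments are our own). It assumes each group experiences exactly its expected number of heart attacks, and it uses the fact that for 1 degree of freedom the chi-square upper tail equals erfc(sqrt(x/2)).

```python
import math

def chi_square_2x2(n_aspirin, n_placebo, p_aspirin=104 / 11000, p_placebo=189 / 11000):
    """Chi-square statistic and p-value (1 df) when each group has exactly
    its expected number of heart attacks."""
    observed = [
        [n_aspirin * p_aspirin, n_aspirin * (1 - p_aspirin)],
        [n_placebo * p_placebo, n_placebo * (1 - p_placebo)],
    ]
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    chi_square = 0.0
    for i, r in enumerate(row_totals):
        for j, c in enumerate(col_totals):
            expected = r * c / n
            chi_square += (observed[i][j] - expected) ** 2 / expected
    # For 1 degree of freedom, P(chi-square > x) = erfc(sqrt(x / 2)).
    return chi_square, math.erfc(math.sqrt(chi_square / 2))

for label, n_a, n_b in [("50/50", 11000, 11000), ("75/25", 16500, 5500)]:
    stat, p = chi_square_2x2(n_a, n_b)
    print(f"{label}: chi-square = {stat:.2f}, p-value = {p:.2e}")
```

Trying more extreme splits (for example 95/5) with the same helper shows the statistic shrinking further, which is the loss of power described above.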
5. Di and El compete to see who is better at throwing a tennis ball into a trash can. Di got 10 out of 20
throws in the can, whereas El only saw 5 of her 22 tosses end up in the can. El claims that Di’s victory
was simply a matter of good fortune and that Di really isn’t any more skillful. Comment on El’s claim.
(To receive full credit your commentary must include analysis of the contest results that yields a p-value, not simply opinion.) [20 points]
Let us start by doing a chi-square independence test. H0: tosser and outcome are independent.
This is equivalent to Di and El having equal probabilities of success. Ha: not independent (or
unequal probabilities).
Observed
        Di   El   Total
In      10   5    15
Out     10   17   27
Total   20   22   42

Expected
        Di     El
In      7.1    7.9
Out     12.9   14.1

Distances
        Di         El
In      1.142857   1.038961
Out     0.634921   0.577201

calculated chi-squared   3.393939
Pvalue                   0.065436
Note that expected counts are all greater than 5. We cannot reject H0 in favor of the 2-Tailed
Ha.
But the Ha we would really like to use is Ha: Di has a higher probability. Since the chi-squared
independence test is two-tailed, the 0.065 p-value includes both “tails”. For the one-tailed test,
the p-value will be half that (Di’s sample proportion is the higher one, so the data point in the
direction of Ha). So the 1-tail p-value is 0.033, and we CAN reject H0 in favor of Ha: Di has a
higher probability. So even with the relatively small sample sizes, we reject the hypothesis that
they are equally skilled in favor of the one-tailed alternative that Di is more skillful (has a
higher P).
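The halving step can be checked by running the equivalent two-proportion z-test, whose squared statistic equals the 2x2 chi-square. The following Python sketch is not part of the original answer; the names are illustrative, and the normal tail is computed with math.erfc.

```python
import math

made_di, throws_di = 10, 20
made_el, throws_el = 5, 22

p_di = made_di / throws_di
p_el = made_el / throws_el

# Under H0 both share one success probability, estimated by pooling.
p_pooled = (made_di + made_el) / (throws_di + throws_el)
std_error = math.sqrt(p_pooled * (1 - p_pooled) * (1 / throws_di + 1 / throws_el))
z = (p_di - p_el) / std_error

# Upper-tail normal probability: P(Z > z) = 0.5 * erfc(z / sqrt(2)).
p_one_tail = 0.5 * math.erfc(z / math.sqrt(2))
p_two_tail = 2 * p_one_tail

print(f"z = {z:.3f}, z^2 = {z * z:.3f}")
print(f"one-tail p = {p_one_tail:.4f}, two-tail p = {p_two_tail:.4f}")
```

The squared z should match the calculated chi-squared, and the two-tail p-value should match the chi-square p-value.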
6. Bo has three stocks in his portfolio. Bo knows the returns on these three stocks are random variables
with the following means and standard deviations.
Stock   Mean Return   Standard Deviation
1       0.10          0.10
2       0.05          0.04
3       0.20          0.40
Because Bo has split his money equally among the three stocks, the return of his portfolio will be the
sum of the three returns divided by 3, (r1 + r2 + r3)/3, where r1 is the return from stock 1, r2 the return
from stock 2, and r3 represents the return from stock 3.
Bo is interested in the properties of the return of his portfolio.
We know that the mean of the sum is always the sum of the means and the variance of the sum
is the sum of the variances if the components are independent. Assuming independence, the sum of
the three returns will have mean 0.35, variance 0.1716, and standard deviation 0.414. Since
the portfolio return is the sum divided by 3, we know that the mean is 0.35/3 = 0.117 and the
standard deviation is 0.414/3 = 0.138.
Stock     Mean Return   Variance   Standard Deviation
1         0.1           0.01       0.1
2         0.05          0.0016     0.04
3         0.2           0.16       0.4
TOTAL     0.35          0.1716     0.414
Average   0.117                    0.138
a.) What is the mean? [5 points]
The mean is 0.117, the average of the three mean returns.
b.) What is the standard deviation? [5 points]
The standard deviation is 0.138.
c.) Does the answer to a) require independence? [5 points]
No.
d.) Does the answer to b) require independence? If yes, is independence a reasonable assumption?
Explain briefly. [5 points]
Yes. Independence is NOT a reasonable assumption because stocks tend to rise and fall
together with the general conditions of the stock market and economy. If the return on one
exceeds its mean, the returns on the others are likely to exceed their means also.
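To see how much the independence assumption matters for b), we can recompute the portfolio standard deviation with a positive correlation between the stocks. The sketch below is illustrative only: the helper name is ours, and the common pairwise correlation of 0.5 is a made-up value, not something given in the problem.

```python
import math

means = [0.10, 0.05, 0.20]
std_devs = [0.10, 0.04, 0.40]
weight = 1 / 3  # equal split among the three stocks

def portfolio_sd(rho):
    """SD of the equally weighted portfolio if every pair of stocks
    has correlation rho (rho = 0 is the independent-style case)."""
    variance_sum = sum(s ** 2 for s in std_devs)
    # Each pair contributes 2 * rho * s_i * s_j to the variance of the sum.
    for i in range(len(std_devs)):
        for j in range(i + 1, len(std_devs)):
            variance_sum += 2 * rho * std_devs[i] * std_devs[j]
    return weight * math.sqrt(variance_sum)

portfolio_mean = weight * sum(means)   # holds with or without independence
sd_independent = portfolio_sd(0.0)     # the answer to b)
sd_correlated = portfolio_sd(0.5)      # HYPOTHETICAL correlation of 0.5

print(f"mean = {portfolio_mean:.3f}")
print(f"sd if independent = {sd_independent:.3f}")
print(f"sd if rho = 0.5   = {sd_correlated:.3f}")
```

With positively correlated returns the standard deviation comes out larger than 0.138, so assuming independence would understate Bo's risk.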
7. A hotel elevator is rated to hold at most 3,500 pounds safely. Assume the weights of the guests at the
hotel using the elevator follow the normal distribution with a mean of 150 pounds and a standard
deviation of 40 pounds.
a.) If 20 guests board the elevator, will they be over or under the weight limit? (Assume independence)
[10 points]
The mean of the total weight will be 20*150 or 3,000. The variance of the total weight will be
20*variance (variances add if independent) = 20*40^2 = 32,000. This means the standard
deviation of the total weight in the elevator is 32,000^.5 = 178.885. The probability the total
weight exceeds 3,500 is then 1 – NORMDIST(3500,3000,178.885,true) = 0.0026.
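The same calculation can be done outside Excel. A minimal Python sketch, using statistics.NormalDist from the standard library (Python 3.8 or later) in place of NORMDIST; the variable names are illustrative:

```python
from statistics import NormalDist

n_guests = 20
mean_weight = 150   # pounds
sd_weight = 40      # pounds
limit = 3500        # pounds

# Sum of n independent weights: means add, variances add.
total_mean = n_guests * mean_weight
total_sd = (n_guests * sd_weight ** 2) ** 0.5

total_weight = NormalDist(mu=total_mean, sigma=total_sd)
p_over_limit = 1 - total_weight.cdf(limit)

print(f"mean = {total_mean}, sd = {total_sd:.3f}, P(total > {limit}) = {p_over_limit:.4f}")
```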
b.) It turns out this is a family hotel. The weights of the 20 guests boarding the elevator will not be
independent as the 20 will usually include entire families (adults and children). In light of this new
information, in what direction does your answer to a.) change? Why? [5 points]
The weights of the 20 guests in the elevator will now not be independent. Higher weight
individuals (adults) are likely to enter the elevator along with lower weight children. This will
make the variance of the total weight LOWER than that from independence. This will make the
probability the total will exceed 3,500 even smaller than 0.0026. So this changes my answer to
(a.) by making it lower.
8. To test a one-sample hypothesis about a mean in a situation in which σ is not known, Al correctly used
the t-statistic whereas Bo incorrectly used the Z-statistic (and just replaced σ with s). The alternative
hypothesis was such that this was a 2-tailed test. Please circle the correct statement (select one).
a.) Al’s p-value will be greater than Bo’s
b.) Al’s p-value will be equal to Bo’s.
c.) Al’s p-value will be less than Bo’s.
d.) We cannot know how their p-values will compare….it depends on the data.
Briefly explain your answer. [10 points]
(a) is the correct answer. Al’s t and Bo’s Z will come out equal. The difference between the two
will occur in the subsequent step in which Al will correctly use =t.dist.2t whereas Bo will
incorrectly use the normal. The entire reason for the t is to account for the extra uncertainty one
faces when using s instead of σ. So the t-distribution will be wider (reflecting more uncertainty)
than the normal. Al’s p-value will come out greater than Bo’s. A simple numerical example will
verify this. In essence, Bo “cheated” on the hypothesis test by assuming s was σ. His cheating
got him a better result.
Sample mean   104
s             18
n             33

Al's t = (104-100)/(18/33^.5)           1.28
Bo's Z                                  1.28
Al's p = t.dist.2t(1.28,32)             0.210941
Bo's p = 2*(1-norm.s.dist(1.28,true))   0.201754
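The numerical example can be reproduced without Excel. The standard library has no t distribution, so the sketch below integrates the Student's t density numerically with Simpson's rule. The helper t_two_tail_p is our own illustrative function, and the hypothesized mean of 100 is inferred from the (104-100) in Al's formula.

```python
import math
from statistics import NormalDist

def t_two_tail_p(t, df, steps=2000):
    """Two-tailed p-value for Student's t, integrating the density from
    0 to |t| with Simpson's rule (steps must be even)."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(x):
        return coef * (1 + x * x / df) ** (-(df + 1) / 2)

    upper = abs(t)
    h = upper / steps
    total = pdf(0.0) + pdf(upper)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * pdf(k * h)
    central_area = total * h / 3   # P(0 < T < |t|)
    return 1 - 2 * central_area

sample_mean, s, n = 104, 18, 33
hypothesized_mean = 100   # inferred from Al's formula

stat = (sample_mean - hypothesized_mean) / (s / math.sqrt(n))

al_p = t_two_tail_p(stat, n - 1)               # t distribution, 32 df
bo_p = 2 * (1 - NormalDist().cdf(abs(stat)))   # standard normal

print(f"statistic = {stat:.2f}, Al's p = {al_p:.4f}, Bo's p = {bo_p:.4f}")
```

Both use the same statistic; only the reference distribution differs, and the heavier tails of the t give Al the larger p-value.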
9. Assume no one “cheats” on the IQ experiment. When instructed, all students hit F9 (recalculate)
exactly ten times and report the final IQ (regardless of how big or small it is).
We then proceed to test the hypothesis that the mean IQ’s of males and females are equal. We have 42
males and 27 females. (The generated IQs will be normally distributed with mean 100 and standard
deviation 15.) Because the alternative is that the means are not equal, this will be a 2-tailed test.
Will we reject or fail to reject the null hypothesis? [5 points]
Since H0 is known to be true, then we will reject it only by chance. That will happen with
probability 0.05 (assuming we test at the usual 5% significance level).
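A small simulation illustrates the point. The sketch below repeats the honest experiment many times and counts how often H0 is rejected. It makes several simplifying assumptions that are ours, not the exam's: it uses a z-test with the known standard deviation of 15 instead of the t-test the class would run, a 5% significance level, and an arbitrary seed and number of repetitions.

```python
import random
from statistics import NormalDist, mean

random.seed(2012)  # arbitrary seed so the run is repeatable

N_MALES, N_FEMALES = 42, 27
MU, SIGMA = 100, 15
ALPHA = 0.05       # assumed significance level
TRIALS = 2000      # arbitrary number of simulated classes

z_critical = NormalDist().inv_cdf(1 - ALPHA / 2)   # about 1.96
std_error = SIGMA * (1 / N_MALES + 1 / N_FEMALES) ** 0.5

rejections = 0
for _ in range(TRIALS):
    # Both groups are drawn from the SAME distribution, so H0 is true.
    males = [random.gauss(MU, SIGMA) for _ in range(N_MALES)]
    females = [random.gauss(MU, SIGMA) for _ in range(N_FEMALES)]
    z = (mean(males) - mean(females)) / std_error
    if abs(z) > z_critical:
        rejections += 1

rejection_rate = rejections / TRIALS
print(f"rejected H0 in {rejection_rate:.1%} of {TRIALS} simulated classes")
```

The rejection rate should land near 5%, varying a little with the seed and the number of repetitions.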