10. In 1988, the average gasoline retail price of one of the major oil companies had
been $1.25 per gallon in a certain large market area of the country, Houston,
Texas. Since the company was concerned about the aggressive pricing of its
competitors (gasoline, of course, is a commodity product which typically
competes on price), its general manager tried to take measures to reduce the retail
price per gallon (that is, the price you and I pay at the pump). After several
months of trying to push down prices, she was interested in determining whether
or not the then-current price was significantly less than $1.25. A random sample
of n = 49 of their gasoline stations was selected, and the average price was
determined to be $1.20 per gallon. Assume that the standard deviation of the
population was σ = $0.14. Note: (b), (c), and (d) below require numerical
answers.
(a) At the 95% confidence level, test to determine whether the measures taken by
the company were effective in reducing the average price. Please use the six-step
hypothesis-testing framework we employed in class, and write out the
last three steps in this table.
H0: µ ≥ $1.25
Ha: µ < $1.25
n = 49, α = .05
The rejection region is
\[ z = \frac{\sqrt{n}\,(\bar{x} - \mu)}{\sigma} < -z_{\alpha}, \]
where n = 49, µ = 1.25, σ = 0.14, α = 0.05, \( z_{0.05} = 1.645 \), and \( -z_{0.05} = -1.645 \).
Compute the test statistic
\[ z = \frac{\sqrt{49}\,(1.20 - 1.25)}{0.14} = -2.5 \]
\[ z = -2.5 < -z_{0.05} = -1.645 \]
Conclusion
The test statistic is significant. It falls in the critical region. This leads to the
rejection of the null hypothesis. The conclusion is that the measures taken by
the company were effective in reducing the average price.
(b) What is the p-value associated with the above sample result?
\( P(z < -2.5) = 0.00621 \)
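The test statistic and p-value above can be checked numerically. The following is a minimal sketch using only Python's standard library; the variable names are illustrative and not part of the original solution.

```python
from math import sqrt
from statistics import NormalDist

# Values taken from the problem statement.
mu0, xbar, sigma, n, alpha = 1.25, 1.20, 0.14, 49, 0.05

std_normal = NormalDist()                # standard normal: mean 0, sd 1
z = sqrt(n) * (xbar - mu0) / sigma       # test statistic
z_crit = std_normal.inv_cdf(alpha)       # lower-tail critical value, -z_alpha
p_value = std_normal.cdf(z)              # P(Z < z) for a lower-tailed test

print(f"z = {z:.3f}, critical value = {z_crit:.3f}, p-value = {p_value:.5f}")
print("reject H0" if z < z_crit else "fail to reject H0")
```

Because the test is lower-tailed, the p-value is the area to the left of the observed z.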
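(c) What is the probability of committing a Type II error, β, if the actual price per
gallon is $1.19 per gallon? _0.0877______________
The test fails to reject H0 whenever \( z \ge -z_{0.05} = -1.645 \), so
\[ \beta = P(\text{Type II error}) = P(\text{accept } H_0 \mid H_0 \text{ is false}) \]
\[ = P\left( \frac{\sqrt{49}\,(\bar{x} - 1.25)}{0.14} \ge -1.645 \;\middle|\; \mu = 1.19 \right) \]
\[ = P\left( \bar{x} \ge 1.25 - 1.645 \times \frac{0.14}{\sqrt{49}} \;\middle|\; \mu = 1.19 \right) = P(\bar{x} \ge 1.2171 \mid \mu = 1.19) \]
\[ = P\left( \frac{\sqrt{49}\,(\bar{x} - 1.19)}{0.14} \ge \frac{\sqrt{49}\,(1.2171 - 1.19)}{0.14} \right) \]
\[ = P(z \ge 1.355) = 1 - P(z < 1.355) = 1 - 0.9123 = 0.0877 \]
(d) What is the “power of the test,” i.e., what is (1 – β)? _____0.9123_________
\[ 1 - \beta = 1 - 0.0877 = 0.9123 \]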
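As a numerical cross-check of β and the power, here is a minimal sketch. It assumes the acceptance region is bounded by the critical value \( -z_{\alpha} \) of the lower-tailed test at α = 0.05; the function name `type_ii_error` is hypothetical and introduced only for illustration.

```python
from math import sqrt
from statistics import NormalDist

def type_ii_error(mu0, mu_true, sigma, n, alpha):
    """Beta for a lower-tailed z test of H0: mu >= mu0 when the true mean is mu_true."""
    std_normal = NormalDist()
    se = sigma / sqrt(n)
    # Assumption: H0 is retained whenever the sample mean is at or above
    # the cutoff implied by the critical value -z_alpha.
    xbar_crit = mu0 + std_normal.inv_cdf(alpha) * se
    # Probability that the sample mean lands at or above that cutoff under mu_true.
    return 1.0 - std_normal.cdf((xbar_crit - mu_true) / se)

beta = type_ii_error(mu0=1.25, mu_true=1.19, sigma=0.14, n=49, alpha=0.05)
power = 1.0 - beta
print(f"beta = {beta:.4f}, power = {power:.4f}")
```

Changing `mu_true` shows how the power grows as the true mean moves further below $1.25.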
11. Please interpret the printout results below, and answer the following questions.
Suppose we regress the dependent variable y on the four independent variables x1,
x2, x3, and x4. After running this regression on sixteen observations, the
SPSS printout provides the following information about the ANOVA table:
Regression Sum of Squares, 946.181; Residual Sum of Squares, 49.773. For (a),
(b), and (c) below, please provide numerical answers.
(a) What is the coefficient of determination? __0.950_____________
Total Sum of Squares = Regression Sum of Squares + Residual Sum of
Squares = 946.181 + 49.773 = 995.954
\[ R^2 = \frac{\text{Regression Sum of Squares}}{\text{Total Sum of Squares}} = \frac{946.181}{995.954} = 0.950 \]
(b) What is the correlation between the actual dependent variable and the
predicted dependent variable? ___0.9747____________
\( R = \sqrt{R^2} = \sqrt{0.950} = 0.9747 \)
(c) What is the adjusted coefficient of determination? ___0.9319____________
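\[ \text{Adjusted } R^2 = 1 - (1 - R^2)\,\frac{n - 1}{n - p - 1} = 1 - (1 - R^2)\,\frac{16 - 1}{16 - 4 - 1} \]
Using the unrounded value \( 1 - R^2 = 49.773 / 995.954 = 0.04998 \):
\[ \text{Adjusted } R^2 = 1 - 0.04998 \times \frac{15}{11} = 1 - 0.0681 = 0.9319 \]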
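Parts (a) through (c) follow directly from the two sums of squares. This minimal sketch recomputes them; the variable names are illustrative only.

```python
from math import sqrt

ss_regression, ss_residual = 946.181, 49.773   # from the ANOVA table
n, p = 16, 4                                    # observations, predictors

ss_total = ss_regression + ss_residual
r_squared = ss_regression / ss_total            # coefficient of determination
multiple_r = sqrt(r_squared)                    # correlation of y with predicted y
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - p - 1)

print(f"SST = {ss_total:.3f}")
print(f"R^2 = {r_squared:.4f}, R = {multiple_r:.4f}, adjusted R^2 = {adj_r_squared:.4f}")
```

The adjusted value is computed from the unrounded R², so it can differ in the fourth decimal from a hand calculation that starts from 0.950.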
(d) Test the overall significance of the regression model at the α = .05 level.
Please use the six-step hypothesis-testing framework we employed in class,
and write out the last three steps in this table.
H0: R² = 0
Ha: R² > 0
n = 16, α = .05
The rejection region is \( F > F_{0.05,4,11} = 3.357 \)
Compute the F statistic. Since \( R^2 / (1 - R^2) \) equals the ratio of the regression
sum of squares to the residual sum of squares, the unrounded values give
\[ F = \frac{R^2}{1 - R^2} \cdot \frac{n - p - 1}{p} = \frac{946.181}{49.773} \cdot \frac{16 - 4 - 1}{4} = 52.277 \]
\[ F = 52.277 > 3.357 = F_{0.05,4,11} \]
Conclusion
The F statistic is significant. It falls in the critical region. This leads to the
rejection of the null hypothesis. The conclusion is that the regression model
is significant overall. The model fits the data well.
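The overall F test can be reproduced from the ANOVA sums of squares. This is a minimal sketch: the critical value 3.357 is copied from the text rather than computed, because Python's standard library has no F distribution.

```python
ss_regression, ss_residual = 946.181, 49.773   # from the ANOVA table
n, p = 16, 4                                    # observations, predictors
f_critical = 3.357                              # F(0.05; 4, 11), taken from the text

ms_regression = ss_regression / p               # mean square, regression (4 df)
ms_residual = ss_residual / (n - p - 1)         # mean square, residual (11 df)
f_stat = ms_regression / ms_residual

print(f"F = {f_stat:.3f} on ({p}, {n - p - 1}) degrees of freedom")
print("reject H0" if f_stat > f_critical else "fail to reject H0")
```

The ratio of mean squares is algebraically the same as the R²-based formula, which is why both routes give the same F.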
(e) The printout also provides the following information about the partial
regression coefficients:
Independent variable   Unstandardized coefficient   Standard error
x1                     -0.0008155                   0.003
x2                     -2.484                       0.960
x3                     0.05901                      0.015
x4                     0.06928                      0.038
Test the significance of each of the four independent variables, one at a time, at
the α = .05 level. Please use the six-step hypothesis-testing framework we
employed in class, and write out the last three steps in this table.
H0: β1 = 0
Ha: β1 ≠ 0
n = 16, α = .05
The critical region is \( |t| > t_{0.025,11} = 2.201 \)
Compute the t statistic
\[ t = \frac{\hat{\beta}_1}{S.E.(\hat{\beta}_1)} = \frac{-0.0008155}{0.003} = -0.27183 \]
\[ |t| = 0.27183 \not> 2.201 \]
Conclusion
t is not significant. The contribution of the independent variable X1 is not
significant.
H0: β2 = 0
Ha: β2 ≠ 0
n = 16, α = .05
The critical region is \( |t| > t_{0.025,11} = 2.201 \)
Compute the t statistic
\[ t = \frac{\hat{\beta}_2}{S.E.(\hat{\beta}_2)} = \frac{-2.484}{0.960} = -2.5875 \]
\[ |t| = 2.5875 > 2.201 \]
Conclusion
t is significant. The contribution of the independent variable X2 is
significant.
H0: β3 = 0
Ha: β3 ≠ 0
n = 16, α = .05
The critical region is \( |t| > t_{0.025,11} = 2.201 \)
Compute the t statistic
\[ t = \frac{\hat{\beta}_3}{S.E.(\hat{\beta}_3)} = \frac{0.05901}{0.015} = 3.934 \]
\[ |t| = 3.934 > 2.201 \]
Conclusion
t is significant. The contribution of the independent variable X3 is
significant.
H0: β4 = 0
Ha: β4 ≠ 0
n = 16, α = .05
The critical region is \( |t| > t_{0.025,11} = 2.201 \)
Compute the t statistic
\[ t = \frac{\hat{\beta}_4}{S.E.(\hat{\beta}_4)} = \frac{0.06928}{0.038} = 1.8232 \]
\[ |t| = 1.8232 \not> 2.201 \]
Conclusion
t is not significant. The contribution of the independent variable X4 is not
significant.
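The four t tests share one pattern, so they can be run in a loop. This minimal sketch uses the coefficients and standard errors from the printout and takes the critical value 2.201 from the text; the dictionary layout is an illustrative choice.

```python
# (coefficient, standard error) for each predictor, from the printout.
coefficients = {
    "x1": (-0.0008155, 0.003),
    "x2": (-2.484, 0.960),
    "x3": (0.05901, 0.015),
    "x4": (0.06928, 0.038),
}
t_critical = 2.201   # t(0.025; 11), taken from the text

t_stats = {name: b / se for name, (b, se) in coefficients.items()}
significant = {name: abs(t) > t_critical for name, t in t_stats.items()}

for name, t in t_stats.items():
    verdict = "significant" if significant[name] else "not significant"
    print(f"{name}: t = {t:.4f} -> {verdict}")
```

Each test is two-sided, so the comparison uses the absolute value of t.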
(f) What is your conclusion? That is, is the overall regression model significant?
Why or why not? Which, if any, of the independent variables are significant?
Why or why not?
\( F = 52.277 > 3.357 = F_{0.05,4,11} \)
The F statistic is significant. It falls in the critical region. This leads to the rejection of the
null hypothesis. The conclusion is that the regression model is significant overall. The
model fits the data well.
The independent variables X2 and X3 are significant, since the corresponding
t statistics (|t| = 2.5875 and 3.934) exceed the critical value 2.201.
The independent variables X1 and X4 are not significant, since the
corresponding t statistics (|t| = 0.27183 and 1.8232) do not exceed 2.201.
12. Recall from our discussion in class (also see the website) that the hypothesis
testing framework can be applied by food and beverage companies seeking to hire
“tasters” who have above average taste-discrimination ability. Suppose a brewer
intends to use the triangle taste test method to identify the best applicants for the
position of “taster.” (It does so because the company will want their tasters to
have sensitive palates, and it will need some way of determining beforehand
whether a candidate for a taster’s job can detect subtle differences in taste.)
Remember how this works: in a single trial of this test, the applicant is presented
with 3 samples of beer---two of which are alike---and is asked to identify the odd
sample. Except for the taste difference, the samples are as alike as possible (same
color, same temperature, same cup, and so on). To check the applicant’s ability,
he/she is presented with a series of triangle tests. The order of the presentation is
randomized within each trial. Clearly, in the absence of any ability at all to
distinguish tastes, the probability the applicant will correctly identify the odd
sample in a single trial is one-third (i.e., 1/3). The question---and the question the
brewing company wants to answer---is whether the applicant can do better than
this. More specifically, if an applicant with no ability to distinguish tastes is
presented with 12 trials, we would expect that he/she would correctly identify the
odd sample 4 times, simply by luck alone. On the other hand, if an applicant with
a sensitive palate is presented with 12 trials, we wouldn’t be surprised to see
him/her correctly identify the odd sample more frequently than this, and perhaps
much more frequently.
Suppose our null hypothesis is that a certain applicant has no ability at all to
discern differences in beer samples, and that to her one beer tastes pretty much the
same as another. Obviously, the alternative hypothesis states that the applicant
does have taste discrimination ability. Our job is to present this person with a
series of triangle taste tests with the purpose of collecting data (the number of
correct identifications made in a series of trials) which help the brewing company
classify the job applicant into one of the two groups.
If the company administers n=10 identical triangle taste tests to this job applicant
and if we say that ‘x’ is the number of correct identifications made (in n=10
trials), then the Rejection Region is the “set of values which ‘x’ could assume that
will lead us to reject the null hypothesis, and prefer the alternative hypothesis.”
We could choose any Rejection Region we like, but suppose the company decides
it should be: 6, 7, 8, 9, or 10. That is, if after an applicant is presented with n=10
triangular taste tests (or 10 trials), she correctly identifies the odd sample at least 6
times, we reject the null hypothesis (that the applicant has no taste sensitivity) and
prefer the alternative hypothesis (that the applicant has taste discrimination
ability), and we make her an offer of employment as a taster.
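(a) With a Rejection Region of 6, 7, 8, 9 or 10, what is the probability of a Type
I error? ___0.0766____________
\[ P(\text{Type I error}) = P(\text{reject } H_0 \mid H_0 \text{ is true}) \]
\[ = \binom{10}{6}\left(\tfrac{1}{3}\right)^{6}\left(\tfrac{2}{3}\right)^{4} + \binom{10}{7}\left(\tfrac{1}{3}\right)^{7}\left(\tfrac{2}{3}\right)^{3} + \binom{10}{8}\left(\tfrac{1}{3}\right)^{8}\left(\tfrac{2}{3}\right)^{2} + \binom{10}{9}\left(\tfrac{1}{3}\right)^{9}\left(\tfrac{2}{3}\right)^{1} + \binom{10}{10}\left(\tfrac{1}{3}\right)^{10}\left(\tfrac{2}{3}\right)^{0} \]
\[ = \frac{3360 + 960 + 180 + 20 + 1}{3^{10}} = \frac{4521}{59049} \approx 0.0766 \]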
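The Type I error probability is an upper-tail binomial sum under p = 1/3, which can be evaluated exactly with rational arithmetic. This is a minimal sketch; the helper `binom_pmf` is a hypothetical name introduced for illustration.

```python
from fractions import Fraction
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p); exact when p is a Fraction."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p_null = 10, Fraction(1, 3)       # 10 trials, guessing probability under H0
rejection_region = range(6, 11)      # reject H0 when x is 6, 7, 8, 9 or 10

alpha = sum(binom_pmf(k, n, p_null) for k in rejection_region)
print(f"alpha = {alpha} (about {float(alpha):.4f})")
```

Working with `Fraction` avoids the small discrepancies that arise when individually rounded terms are added by hand.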
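(b) With a Rejection Region of 6, 7, 8, 9 or 10, what is the probability of a Type
II error, if the job applicant has a probability of identifying the odd sample
with p = 0.5? ___0.6230____________
\[ P(\text{Type II error}) = P(\text{accept } H_0 \mid H_0 \text{ is false}) \]
\[ = \binom{10}{0}\left(\tfrac{1}{2}\right)^{0}\left(\tfrac{1}{2}\right)^{10} + \binom{10}{1}\left(\tfrac{1}{2}\right)^{1}\left(\tfrac{1}{2}\right)^{9} + \binom{10}{2}\left(\tfrac{1}{2}\right)^{2}\left(\tfrac{1}{2}\right)^{8} + \binom{10}{3}\left(\tfrac{1}{2}\right)^{3}\left(\tfrac{1}{2}\right)^{7} + \binom{10}{4}\left(\tfrac{1}{2}\right)^{4}\left(\tfrac{1}{2}\right)^{6} + \binom{10}{5}\left(\tfrac{1}{2}\right)^{5}\left(\tfrac{1}{2}\right)^{5} \]
= 0.0010 + 0.0098 + 0.0439 + 0.1172 + 0.2051 + 0.2461 = 0.6230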
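(c) With a Rejection Region of 8, 9 or 10, what is the probability of a Type I
error? ___0.0034____________
\[ P(\text{Type I error}) = P(\text{reject } H_0 \mid H_0 \text{ is true}) \]
\[ = \binom{10}{8}\left(\tfrac{1}{3}\right)^{8}\left(\tfrac{2}{3}\right)^{2} + \binom{10}{9}\left(\tfrac{1}{3}\right)^{9}\left(\tfrac{2}{3}\right)^{1} + \binom{10}{10}\left(\tfrac{1}{3}\right)^{10}\left(\tfrac{2}{3}\right)^{0} \]
= 0.0030 + 0.0003 + 0.0000 = 0.0034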
(d) With a Rejection Region of 8, 9 or 10, what is the probability of a Type II
error, if the job applicant has a probability of identifying the odd sample with
p = 0.5? ___0.9453____________
\[ P(\text{Type II error}) = P(\text{accept } H_0 \mid H_0 \text{ is false}) = \sum_{x=0}^{7} \binom{10}{x}\left(\tfrac{1}{2}\right)^{x}\left(\tfrac{1}{2}\right)^{10-x} \]
= 0.0010 + 0.0098 + 0.0439 + 0.1172 + 0.2051 + 0.2461 + 0.2051 + 0.1172
= 0.9453
(e) Which Rejection Region would you recommend that the brewing company
use? Why?
The power of the Rejection Region {6,7,8,9,10} is 1 − 0.6230 = 0.3770 when p = 1/2.
The power of the Rejection Region {8,9,10} is 1 − 0.9453 = 0.0547 when p = 1/2.
Since {6,7,8,9,10} is more powerful and its level of significance, 0.0766, is not
large, we recommend the critical region {6,7,8,9,10}.
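The comparison between the two rejection regions can be tabulated in a few lines. This is a minimal sketch under the problem's assumptions (n = 10 trials, p = 1/3 under the null, p = 0.5 for an applicant with ability); `binom_pmf` and `error_rates` are hypothetical helper names.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def error_rates(cutoff, n=10, p_null=1/3, p_alt=0.5):
    """Return (alpha, beta, power) for the rule 'reject H0 when x >= cutoff'."""
    # Type I error: the applicant is guessing but still reaches the cutoff.
    alpha = sum(binom_pmf(k, n, p_null) for k in range(cutoff, n + 1))
    # Type II error: the applicant has ability p_alt but falls below the cutoff.
    beta = sum(binom_pmf(k, n, p_alt) for k in range(cutoff))
    return alpha, beta, 1 - beta

for cutoff in (6, 8):
    alpha, beta, power = error_rates(cutoff)
    print(f"reject when x >= {cutoff}: "
          f"alpha = {alpha:.4f}, beta = {beta:.4f}, power = {power:.4f}")
```

Raising the cutoff lowers the Type I error rate but sharply reduces power, which is the trade-off behind the recommendation.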