10. In 1988, the average retail gasoline price of one of the major oil companies had been $1.25 per gallon in a certain large market area of the country, Houston, Texas. Since the company was concerned about the aggressive pricing of its competitors (gasoline, of course, is a commodity product which typically competes on price), its general manager tried to take measures to reduce the retail price per gallon (that is, the price you and I pay at the pump). After several months of trying to push down prices, she was interested in determining whether or not the then-current price was significantly less than $1.25. A random sample of n = 49 of their gasoline stations was selected, and the average price was determined to be $1.20 per gallon. Assume that the standard deviation of the population was σ = $0.14. Note: (b), (c), and (d) below require numerical answers.

(a) At the 95% confidence level, test to determine whether the measures taken by the company were effective in reducing the average price. Please use the six-step hypothesis-testing framework we employed in class, and write out the last three steps in this table.

\(H_0\colon \mu \ge \$1.25\)
\(H_a\colon \mu < \$1.25\)
n = 49, α = .05

The rejection region is
\[ z = \frac{\sqrt{n}\,(\bar{x} - \mu)}{\sigma} < -z_{\alpha}, \]
where \(n = 49\), \(\mu = 1.25\), \(\sigma = 0.14\), \(\alpha = 0.05\), \(z_{0.05} = 1.645\), \(-z_{0.05} = -1.645\).

Compute the test statistic
\[ z = \frac{\sqrt{49}\,(1.20 - 1.25)}{0.14} = -2.5, \qquad z = -2.5 < -z_{0.05} = -1.645. \]

Conclusion
The test statistic is significant. It falls in the critical region. This leads to the rejection of the null hypothesis. The conclusion is that the measures taken by the company were effective in reducing the average price.

(b) What is the p-value associated with the above sample result?
\[ P(z < -2.5) = 0.00621 \]

(c) What is the probability of committing a Type II error, β, if the actual price per gallon is $1.19 per gallon?
Answer: 0.0877

The test accepts \(H_0\) whenever the test statistic is at or above the critical value \(-1.645\) (the cutoff is the critical value, not the observed statistic \(-2.5\)). Therefore
\[ \beta = P(\text{type II error}) = P(\text{accept } H_0 \mid H_0 \text{ is false}) = P\!\left(\frac{\sqrt{49}\,(\bar{x} - 1.25)}{0.14} \ge -1.645 \,\middle|\, \mu = 1.19\right) \]
\[ = P\!\left(\bar{x} \ge 1.25 - 1.645 \times \frac{0.14}{\sqrt{49}} \,\middle|\, \mu = 1.19\right) = P(\bar{x} \ge 1.2171 \mid \mu = 1.19) \]
\[ = P\!\left(\frac{\sqrt{49}\,(\bar{x} - 1.19)}{0.14} \ge \frac{\sqrt{49}\,(1.2171 - 1.19)}{0.14}\right) = P(z \ge 1.355) = 1 - P(z < 1.355) = 1 - 0.9123 = 0.0877 \]

(d) What is the "power of the test," i.e., what is \(1 - \beta\)?
Answer: 0.9123
\[ 1 - \beta = 1 - 0.0877 = 0.9123 \]

11. Please interpret the printout results below, and answer the following questions. Suppose we regress the dependent variable y on the four independent variables x1, x2, x3, and x4. After running this regression on sixteen observations, the SPSS printout provides the following information about the ANOVA table: Regression Sum of Squares, 946.181; Residual Sum of Squares, 49.773. For (a), (b), and (c) below, please provide numerical answers.

(a) What is the coefficient of determination?
Answer: 0.950
Total Sum of Squares = Regression Sum of Squares + Residual Sum of Squares = 946.181 + 49.773 = 995.954
\[ R^2 = \frac{\text{Regression Sum of Squares}}{\text{Total Sum of Squares}} = \frac{946.181}{995.954} = 0.950 \]

(b) What is the correlation between the actual dependent variable and the predicted dependent variable?
Answer: 0.9747
\[ R = \sqrt{R^2} = \sqrt{0.950} = 0.9747 \]

(c) What is the adjusted coefficient of determination?
Answer: 0.9319
Using the unrounded \(1 - R^2 = 49.773/995.954 = 0.04998\):
\[ \text{Adjusted } R^2 = 1 - (1 - R^2)\,\frac{n-1}{n-p-1} = 1 - 0.04998 \times \frac{16-1}{16-4-1} = 1 - 0.04998 \times \frac{15}{11} = 1 - 0.0681 = 0.9319 \]

(d) Test the overall significance of the regression model at the α = .05 level. Please use the six-step hypothesis-testing framework we employed in class, and write out the last three steps in this table.

\(H_0\colon \rho^2 = 0\)
\(H_a\colon \rho^2 \ne 0\)
n = 16, α = .05

The rejection region is \(F > F_{0.05,\,4,\,11} = 3.357\).

Compute the F statistic
\[ F = \frac{R^2}{1 - R^2} \cdot \frac{n-p-1}{p} = \frac{946.181}{49.773} \times \frac{16-4-1}{4} = 52.277 \]
(With the rounded value \(R^2 = 0.950\), the same formula gives \((0.950)(11)/((0.050)(4)) = 52.25\).)
\[ F = 52.277 > 3.357 = F_{0.05,\,4,\,11} \]

Conclusion
The F statistic is significant. It falls in the critical region.
This leads to the rejection of the null hypothesis. The conclusion is that the regression model is significant overall. The model fits the data well.

(e) The printout also provides the following information about the partial regression coefficients:

Independent variable   Unstandardized coefficient   Standard error
x1                     -0.0008155                   0.003
x2                     -2.484                       0.960
x3                     0.05901                      0.015
x4                     0.06928                      0.038

Test the significance of each of the four independent variables, one at a time, at the α = .05 level. Please use the six-step hypothesis-testing framework we employed in class, and write out the last three steps in this table.

\(H_0\colon \beta_1 = 0\), \(H_a\colon \beta_1 \ne 0\), n = 16, α = .05
The critical region is \(|t| > t_{0.025,\,11} = 2.201\).
Compute the t statistic
\[ t = \frac{\hat{\beta}_1}{S.E.(\hat{\beta}_1)} = \frac{-0.0008155}{0.003} = -0.27183, \qquad |t| = 0.27183 \not> 2.201 \]
Conclusion: t is not significant. The contribution of the independent variable x1 is not significant.

\(H_0\colon \beta_2 = 0\), \(H_a\colon \beta_2 \ne 0\), n = 16, α = .05
The critical region is \(|t| > t_{0.025,\,11} = 2.201\).
Compute the t statistic
\[ t = \frac{\hat{\beta}_2}{S.E.(\hat{\beta}_2)} = \frac{-2.484}{0.960} = -2.5875, \qquad |t| = 2.5875 > 2.201 \]
Conclusion: t is significant. The contribution of the independent variable x2 is significant.

\(H_0\colon \beta_3 = 0\), \(H_a\colon \beta_3 \ne 0\), n = 16, α = .05
The critical region is \(|t| > t_{0.025,\,11} = 2.201\).
Compute the t statistic
\[ t = \frac{\hat{\beta}_3}{S.E.(\hat{\beta}_3)} = \frac{0.05901}{0.015} = 3.934, \qquad |t| = 3.934 > 2.201 \]
Conclusion: t is significant. The contribution of the independent variable x3 is significant.

\(H_0\colon \beta_4 = 0\), \(H_a\colon \beta_4 \ne 0\), n = 16, α = .05
The critical region is \(|t| > t_{0.025,\,11} = 2.201\).
Compute the t statistic
\[ t = \frac{\hat{\beta}_4}{S.E.(\hat{\beta}_4)} = \frac{0.06928}{0.038} = 1.8232, \qquad |t| = 1.8232 \not> 2.201 \]
Conclusion: t is not significant. The contribution of the independent variable x4 is not significant.

(f) What is your conclusion? That is, is the overall regression model significant? Why or why not? Which, if any, of the independent variables are significant? Why or why not?
\[ F = 52.277 > 3.357 = F_{0.05,\,4,\,11} \]
The F statistic is significant. It falls in the critical region.
This leads to the rejection of the null hypothesis. The conclusion is that the regression model is significant overall. The model fits the data well. The independent variables x2 and x3 are significant, since the corresponding t statistics are significant. The independent variables x1 and x4 are not significant, since the corresponding t statistics are not significant.

12. Recall from our discussion in class (also see the website) that the hypothesis testing framework can be applied by food and beverage companies seeking to hire "tasters" who have above average taste-discrimination ability. Suppose a brewer intends to use the triangle taste test method to identify the best applicants for the position of "taster." (It does so because the company will want their tasters to have sensitive palates, and it will need some way of determining beforehand whether a candidate for a taster's job can detect subtle differences in taste.)

Remember how this works: in a single trial of this test, the applicant is presented with 3 samples of beer (two of which are alike) and is asked to identify the odd sample. Except for the taste difference, the samples are as alike as possible (same color, same temperature, same cup, and so on). To check the applicant's ability, he/she is presented with a series of triangle tests. The order of the presentation is randomized within each trial. Clearly, in the absence of any ability at all to distinguish tastes, the probability the applicant will correctly identify the odd sample in a single trial is one-third (i.e., 1/3). The question, and the one the brewing company wants to answer, is whether the applicant can do better than this.

More specifically, if an applicant with no ability to distinguish tastes is presented with 12 trials, we would expect that he/she would correctly identify the odd sample 4 times, simply by luck alone.
On the other hand, if an applicant with a sensitive palate is presented with 12 trials, we wouldn’t be surprised to see him/her correctly identify the odd sample more frequently than this, and perhaps much more frequently.

Suppose our null hypothesis is that a certain applicant has no ability at all to discern differences in beer samples, and that to her one beer tastes pretty much the same as another. Obviously, the alternative hypothesis states that the applicant does have taste discrimination ability. Our job is to present this person with a series of triangle taste tests with the purpose of collecting data (the number of correct identifications made in a series of trials) which help the brewing company classify the job applicant into one of the two groups.

If the company administers n=10 identical triangle taste tests to this job applicant and if we say that ‘x’ is the number of correct identifications made (in n=10 trials), then the Rejection Region is the “set of values which ‘x’ could assume that will lead us to reject the null hypothesis, and prefer the alternative hypothesis.” We could choose any Rejection Region we like, but suppose the company decides it should be: 6, 7, 8, 9, or 10. That is, if after an applicant is presented with n=10 triangular taste tests (or 10 trials), she correctly identifies the odd sample at least 6 times, we reject the null hypothesis (that the applicant has no taste sensitivity) and prefer the alternative hypothesis (that the applicant has taste discrimination ability), and we make her an offer of employment as a taster.

(a) With a Rejection Region of 6, 7, 8, 9 or 10, what is the probability of a Type I error?
Answer: 0.0766
\[ P(\text{type I error}) = P(\text{reject } H_0 \mid H_0 \text{ is true}) \]
\[ = \binom{10}{6}\left(\tfrac{1}{3}\right)^{6}\left(\tfrac{2}{3}\right)^{4} + \binom{10}{7}\left(\tfrac{1}{3}\right)^{7}\left(\tfrac{2}{3}\right)^{3} + \binom{10}{8}\left(\tfrac{1}{3}\right)^{8}\left(\tfrac{2}{3}\right)^{2} + \binom{10}{9}\left(\tfrac{1}{3}\right)^{9}\left(\tfrac{2}{3}\right)^{1} + \binom{10}{10}\left(\tfrac{1}{3}\right)^{10}\left(\tfrac{2}{3}\right)^{0} \]
\[ = 0.0569 + 0.0163 + 0.0030 + 0.0003 + 0.0000 \approx 0.0766 \quad (\text{exactly } 4521/59049) \]

(b) With a Rejection Region of 6, 7, 8, 9 or 10, what is the probability of a Type II error, if the job applicant has a probability of identifying the odd sample with p = 0.5?
Answer: 0.6230
\[ P(\text{type II error}) = P(\text{accept } H_0 \mid H_0 \text{ is false}) = \sum_{x=0}^{5} \binom{10}{x}\left(\tfrac{1}{2}\right)^{x}\left(\tfrac{1}{2}\right)^{10-x} \]
\[ = 0.0010 + 0.0098 + 0.0439 + 0.1172 + 0.2051 + 0.2461 \approx 0.6230 \quad (\text{exactly } 638/1024) \]

(c) With a Rejection Region of 8, 9 or 10, what is the probability of a Type I error?
Answer: 0.0034
\[ P(\text{type I error}) = P(\text{reject } H_0 \mid H_0 \text{ is true}) = \binom{10}{8}\left(\tfrac{1}{3}\right)^{8}\left(\tfrac{2}{3}\right)^{2} + \binom{10}{9}\left(\tfrac{1}{3}\right)^{9}\left(\tfrac{2}{3}\right)^{1} + \binom{10}{10}\left(\tfrac{1}{3}\right)^{10}\left(\tfrac{2}{3}\right)^{0} \]
\[ = 0.0030 + 0.0003 + 0.0000 \approx 0.0034 \quad (\text{exactly } 201/59049) \]

(d) With a Rejection Region of 8, 9 or 10, what is the probability of a Type II error, if the job applicant has a probability of identifying the odd sample with p = 0.5?
Answer: 0.9453
\[ P(\text{type II error}) = P(\text{accept } H_0 \mid H_0 \text{ is false}) = \sum_{x=0}^{7} \binom{10}{x}\left(\tfrac{1}{2}\right)^{x}\left(\tfrac{1}{2}\right)^{10-x} \]
\[ = 0.0010 + 0.0098 + 0.0439 + 0.1172 + 0.2051 + 0.2461 + 0.2051 + 0.1172 \approx 0.9453 \quad (\text{exactly } 968/1024) \]

(e) Which Rejection Region would you recommend that the brewing company use? Why?
The power of the Rejection Region {6, 7, 8, 9, 10} is 1 − 0.6230 = 0.3770 when p = 1/2.
The power of the Rejection Region {8, 9, 10} is 1 − 0.9453 = 0.0547 when p = 1/2.
Since {6, 7, 8, 9, 10} is more powerful and its level of significance, 0.0766, is not large, we recommend the critical region {6, 7, 8, 9, 10}.
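
The Type I and Type II error probabilities in Problem 12 are binomial tail sums, so they can be recomputed with a few lines of code. The following is a minimal sketch using only the Python standard library; the helper names `binom_pmf` and `error_rates` are illustrative assumptions, not part of the original solution. It assumes n = 10 trials, p = 1/3 under the null hypothesis, and p = 1/2 under the alternative, as stated in the problem.

```python
from fractions import Fraction
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def error_rates(region, n=10, p_null=Fraction(1, 3), p_alt=Fraction(1, 2)):
    """Return (alpha, beta) for a rejection region given as a collection of counts.

    alpha: probability that x lands in the region when p = p_null (Type I error).
    beta:  probability that x lands outside the region when p = p_alt (Type II error).
    """
    region = set(region)
    alpha = sum(binom_pmf(k, n, p_null) for k in region)
    beta = sum(binom_pmf(k, n, p_alt) for k in range(n + 1) if k not in region)
    return alpha, beta


# Compare the two rejection regions considered in parts (a) through (e).
for region in (range(6, 11), range(8, 11)):
    alpha, beta = error_rates(region)
    print(
        sorted(region),
        f"alpha={float(alpha):.4f}",
        f"beta={float(beta):.4f}",
        f"power={float(1 - beta):.4f}",
    )
```

Exact fractions are used so that the totals are not affected by rounding the individual terms to four decimals before adding them.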