1 Final Review

2 Review

2.1 CI 1-propZint

Scenario 1
A TV manufacturer claims in its warranty brochure that in the past not more than 10 percent of its TV sets needed any repair during the first two years of operation. To test the validity of this claim, a government testing agency selected a random sample of 100 sets and found that 14 sets required some repair within the first two years of operation.

1. What is the critical value for this 95% confidence interval?
2. What is the standard error of this confidence interval?
3. What is the margin of error?
4. Set up a 95% confidence interval estimate of the population proportion of TV sets that need repair in the first two years of operation.
5. What conclusion can we draw from this confidence interval?
6. Interpret the 95% confidence interval.
7. What sample size should be taken if the agency wants 95% confidence when the margin of error is 0.05?

2.2 CI 2-independent samples

Scenario 2
The purchasing director for an industrial factory is investigating the possibility of purchasing a new milling machine. She determines that the new machine will be purchased if there is evidence that the parts it produces have a higher breaking strength than those from the old machine. The sample standard deviation of the breaking strength for the old machine is 10 kilograms and for the new machine is 9 kilograms. A sample of 25 parts taken from the old machine indicated a sample mean of 65 kilograms, whereas a similar sample of 25 from the new machine indicated a sample mean of 72 kilograms.

1. What are the degrees of freedom?
2. What is the critical value for this 95% confidence interval?
3. What is the standard error of this confidence interval?
4. What is the margin of error?
5. Set up a 95% confidence interval of the population difference between the two means.
6. What conclusion can we draw from this confidence interval?
7. Interpret the 95% confidence interval.
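The interval arithmetic asked for in Scenarios 1 and 2 can be sketched in Python as below. This is only an illustration under stated assumptions, not the course's official solution: the helper name interval is hypothetical, the sample-size step uses the sample proportion as the planning value (using 0.5 instead gives a larger n), the two-sample interval assumes a pooled variance with n1 + n2 - 2 degrees of freedom, and the t critical value is typed in from a standard t table.

```python
from math import ceil, sqrt
from statistics import NormalDist


def interval(estimate, critical, std_error):
    """Return (lower, upper, margin) for estimate +/- critical * std_error."""
    margin = critical * std_error
    return estimate - margin, estimate + margin, margin


# Scenario 1: one-proportion z interval, 14 repairs in 100 sets.
n, repairs = 100, 14
p_hat = repairs / n
z_crit = NormalDist().inv_cdf(0.975)  # two-sided 95% critical value, about 1.96
se_prop = sqrt(p_hat * (1 - p_hat) / n)
prop_lo, prop_hi, prop_margin = interval(p_hat, z_crit, se_prop)

# Sample size for a 0.05 margin of error.
# ASSUMPTION: p_hat is used as the planning value for the proportion.
n_needed = ceil(z_crit**2 * p_hat * (1 - p_hat) / 0.05**2)

# Scenario 2: difference of two independent means (new minus old).
n_old, mean_old, sd_old = 25, 65.0, 10.0
n_new, mean_new, sd_new = 25, 72.0, 9.0
df = n_old + n_new - 2  # ASSUMPTION: pooled-variance procedure
pooled_var = ((n_old - 1) * sd_old**2 + (n_new - 1) * sd_new**2) / df
se_diff = sqrt(pooled_var * (1 / n_old + 1 / n_new))
T_CRIT_DF48 = 2.0106  # ASSUMPTION: t(0.025, df=48) read from a t table
diff_lo, diff_hi, diff_margin = interval(mean_new - mean_old, T_CRIT_DF48, se_diff)

print(f"Scenario 1: SE={se_prop:.4f}, margin={prop_margin:.4f}, "
      f"CI=({prop_lo:.4f}, {prop_hi:.4f}), n for 0.05 margin={n_needed}")
print(f"Scenario 2: df={df}, SE={se_diff:.4f}, margin={diff_margin:.4f}, "
      f"CI=({diff_lo:.4f}, {diff_hi:.4f})")
```

Because both samples in Scenario 2 have 25 parts, the pooled and unpooled standard errors coincide here; only the degrees of freedom differ between the two procedures.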
2.3 CI 1 sample T

Scenario 3
Suppose an independent testing agency has been contracted to determine whether the contracting company should use a gasoline additive. The current gasoline mileage for its vehicles is 18.5 mpg. A random sample of 30 vehicles from the company’s fleet produced a sample average of 19.34 mpg and a sample standard deviation of 5.2 mpg.

1. What are the degrees of freedom?
2. What is the critical value for this 95% confidence interval?
3. What is the standard error of this confidence interval?
4. What is the margin of error?
5. Set up a 95% confidence interval of the population average MPG with the gasoline additive.
6. What conclusion can we draw from this confidence interval?
7. Interpret the 95% confidence interval.
8. What sample size should be taken if the agency wants 95% confidence when the margin of error is 1.5?

2.4 CI paired t

Scenario 4
Suppose a shoe company wants to test material for the soles of shoes. For each pair of shoes the new material is placed on one shoe and the old material is placed on the other shoe. After a given period of time a random sample of 10 pairs of shoes is selected. The wear is measured on a 10-point scale (higher is better) with the following results. The average of the differences is 0.3 and its standard deviation is 1.767.

1. What are the degrees of freedom?
2. What is the critical value for this 95% confidence interval?
3. What is the standard error of this confidence interval?
4. What is the margin of error?
5. Set up a 95% confidence interval of the population difference of paired observations of shoe soles.
6. What conclusion can we draw from this confidence interval?
7. Interpret the 95% confidence interval.
8. What sample size should be taken if the agency wants 95% confidence when the margin of error is 0.6?
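The one-sample and paired t intervals in Scenarios 3 and 4 follow the same pattern, mean ± t × s/√n, which the following sketch works through. The function names t_interval and sample_size are hypothetical, the t critical values are typed in from a standard t table rather than computed, and the sample-size step assumes the common z-based planning formula n = (z·s/E)², which a given course may replace with a t-based iteration.

```python
from math import ceil, sqrt
from statistics import NormalDist


def t_interval(mean, sd, n, t_crit):
    """Return (standard error, margin, (lower, upper)) for mean +/- t_crit * sd / sqrt(n)."""
    se = sd / sqrt(n)
    margin = t_crit * se
    return se, margin, (mean - margin, mean + margin)


def sample_size(sd, margin, confidence=0.95):
    """ASSUMPTION: z-based planning formula n = (z * sd / margin)^2, rounded up."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z * sd / margin) ** 2)


# ASSUMPTION: two-sided 95% critical values read from a standard t table.
T_CRIT_DF29 = 2.045  # df = 30 - 1
T_CRIT_DF9 = 2.262   # df = 10 - 1

# Scenario 3: mean MPG with the additive (n = 30, mean 19.34, s = 5.2).
se3, margin3, ci3 = t_interval(19.34, 5.2, 30, T_CRIT_DF29)
n3 = sample_size(5.2, 1.5)

# Scenario 4: mean paired difference in wear (n = 10 pairs, mean 0.3, s = 1.767).
se4, margin4, ci4 = t_interval(0.3, 1.767, 10, T_CRIT_DF9)
n4 = sample_size(1.767, 0.6)

print(f"Scenario 3: SE={se3:.4f}, margin={margin3:.4f}, "
      f"CI=({ci3[0]:.3f}, {ci3[1]:.3f}), n for 1.5 margin={n3}")
print(f"Scenario 4: SE={se4:.4f}, margin={margin4:.4f}, "
      f"CI=({ci4[0]:.3f}, {ci4[1]:.3f}), n for 0.6 margin={n4}")
```

When reading the results, check whether the current mileage of 18.5 mpg (Scenario 3) or a difference of 0 (Scenario 4) falls inside the corresponding interval.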
2.5 Hypothesis test 1-propZint

Scenario 1
A TV manufacturer claims in its warranty brochure that in the past not more than 10 percent of its TV sets needed any repair during the first two years of operation. To test the validity of this claim, a government testing agency selected a random sample of 100 sets and found that 14 sets required some repair within the first two years of operation. The company uses a 5% level of significance.

1. How many tails does this test have?
2. What are the hypotheses?
3. What is the standard error of the proportion?
4. What is the test statistic?
5. What is the p-value?
6. What conclusion can we draw from this test?
7. What is the critical value?

2.6 Hypothesis test 2-independent samples

Scenario 2
The purchasing director for an industrial factory is investigating the possibility of purchasing a new milling machine. She determines that the new machine will be purchased if there is evidence that the parts it produces have a higher breaking strength than those from the old machine. The sample standard deviation of the breaking strength for the old machine is 10 kilograms and for the new machine is 9 kilograms. A sample of 25 parts taken from the old machine indicated a sample mean of 65 kilograms, whereas a similar sample of 25 from the new machine indicated a sample mean of 72 kilograms. The director uses a 5% level of significance.

1. How many tails does this test have?
2. What are the hypotheses?
3. What is the test statistic?
4. What are the degrees of freedom?
5. What is the p-value?
6. Should you reject the null hypothesis (decision)?
7. What conclusion can we draw from this test?
8. What is the critical value?

2.7 Hypothesis test 1 sample T

Scenario 3
Suppose an independent testing agency has been contracted to determine whether the contracting company should use a gasoline additive. The current gasoline mileage for its vehicles is 18.5 mpg. A random sample of 30 vehicles from the company’s fleet produced a sample average of 19.34 mpg and a sample standard deviation of 5.2 mpg. Is there evidence that putting an additive into the gasoline of the company vehicles will improve their performance (i.e., MPG)? The company uses a 5% level of significance.

1. How many tails does this test have?
2. What are the hypotheses?
3. What is the test statistic?
4. What are the degrees of freedom?
5. What is the p-value?
6. Should you reject the null hypothesis (decision)?
7. What conclusion can we draw from this test?
8. What is the critical value?

2.8 Hypothesis test paired t

Scenario 4
Suppose a shoe company wants to test material for the soles of shoes. For each pair of shoes the new material is placed on one shoe and the old material is placed on the other shoe. After a given period of time a random sample of 10 pairs of shoes is selected. The wear is measured on a 10-point scale (higher is better) with the following results. The average of the differences is 0.3 and its standard deviation is 1.767.

1. How many tails does this test have?
2. What are the hypotheses?
3. What is the test statistic?
4. What are the degrees of freedom?
5. What is the p-value?
6. Should you reject the null hypothesis (decision)?
7. What conclusion can we draw from this test?
8. What is the critical value?

2.9 χ²-test

Scenario 5
Suppose the head of the HR division of a mid-sized company wants to determine if she should let the Red Cross hold a give-blood day in the company cafeteria. She takes a random sample of size 49. The following contingency table is constructed.

Blood Donor Status
          Yes    No    Total
Men         5    17       22
Women       7    20       27
Total      12    37       49

1. What are the hypotheses?
2. What is the test statistic?
3. What are the degrees of freedom?
4. What is the p-value?
5. Should you reject the null hypothesis (decision)?
6. What conclusion can we draw from this test?
7. What is the expected value for the cell in row 2, column 2?
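For the contingency table in Scenario 5, the expected counts and the χ² statistic can be computed as in the sketch below. It assumes the usual test of independence without a continuity correction (some courses apply Yates' correction to 2×2 tables), and it relies on the fact that for 1 degree of freedom the upper-tail χ² probability equals erfc(√(χ²/2)), so the p-value line is only valid for a 2×2 table. Variable names are illustrative.

```python
from math import erfc, sqrt

# Observed counts: rows are Men, Women; columns are Yes, No (blood donor status).
observed = [[5, 17],
            [7, 20]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count for each cell = row total * column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Chi-square statistic: sum of (observed - expected)^2 / expected over all cells.
# ASSUMPTION: no continuity (Yates) correction is applied.
chi_sq = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

df = (len(observed) - 1) * (len(observed[0]) - 1)

# Upper-tail p-value; this closed form holds only when df == 1.
p_value = erfc(sqrt(chi_sq / 2))

print(f"expected counts: {[[round(e, 3) for e in row] for row in expected]}")
print(f"chi-square = {chi_sq:.4f}, df = {df}, p-value = {p_value:.4f}")
```

The entry expected[1][1] is the expected count for row 2, column 2 (women who are not donors).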
2.10 SLR

Scenario 6
A statistician for an American automobile manufacturer would like to develop a statistical model for predicting delivery time (the days between initiating the order and the actual delivery of the new car) of custom-ordered new automobiles. The statistician believes there is a linear relationship between the number of options ordered on a car and the delivery time. A random sample of 16 cars is selected with the following results.

[Figures: scatter plot "Options Ordered vs Delivery Time" (x-axis: Options Ordered, y-axis: Delivery Time); residual plot "Residuals vs Fitted" (x-axis: Fitted values, y-axis: Residuals) for lm(Time ~ Options)]

Regression Statistics
Multiple R        0.9785
R square          0.9575
Adj R sq          0.9545
Standard error    3.0446
Observations      16

ANOVA
              df    SS         MS         F        Significance F
Regression     1    2927.23    2927.23    315.8    0
Residual      14     129.77       9.27
Total         15    3057.00

Coefficients
                  Coefficient    Std error    t Stat     p-value    Low 95%    Up 95%
intercept         21.9254        1.5908       13.7823    0.0        18.51      25.34
optionsOrdered     2.0687        0.1164       17.7707    0.0         1.819      2.3184

1. Identify which variable is the X, independent, or explanatory variable.
2. Identify which variable is the Y, dependent, or response variable.
3. Describe the pattern of points as they appear on the graph.
4. What kind of relationship do you see?
5. Are there any "outliers"?
6. Describe the strength and direction of the correlation.
7. Compare this relationship with the pattern of points on the scatter diagram between the two variables.
8. Write the specific estimated regression equation for this problem.
9. Using the estimated regression equation, predict the average delivery time for a car with 16 options ordered.
10. Is the previous prediction extrapolation?
11. Interpret the slope estimate, that is, explain what it means in terms of this problem.
12. Compute the coefficient of determination, or how much variation in delivery time is accounted for by this regression model. Express your answer as a percent. What measure did you use to answer this question?
13. What is the standard error of the estimated regression line? Include the unit of measurement in your answer.
14. Using a 5% level of significance, is there evidence of a linear relationship between delivery time and options ordered? Be sure to state the hypotheses, test statistic, p-value, and the conclusion.
15. Give a 95% confidence interval for the true (i.e., population) slope.
16. If the original correlation coefficient between these two variables were not known, how could it be calculated using the statistics in the regression output? How do you determine the sign of the correlation coefficient?
17. Describe what you see on the residual plot.
18. For the data set, look at the 9th pair of observations (Options, Time) or (12, 44). Calculate the residual, i.e., e_i = Y_i − Ŷ_i.
19. Is the model a good fit for the data? Be sure to state your decision and give the reasons that support your decision.

2.11 MLR

Scenario 7
Suppose a consumer organization wanted to develop a model to predict gasoline mileage as measured by miles per gallon (MPG) based on the horsepower of the car’s engine and the weight of the car. A sample of 50 recent car models was selected, with the results summarized below.

Descriptive Statistics
            MPG       Horsepower    Weight
Mean        28.5      90.8          2756.5
Std Err     1.16      3.85          89.81
Std Dev     8.17      27.26         635.05
Variance    66.77     743.04        403289.76
Minimum     15.5      48            1755
Maximum     46.6      165           4360
Sum         1427.1    4542          137826
Count       50        50            50

Regression Statistics
Multiple R        0.8657
R square          0.7494
Adj R sq          0.7388
Standard error    4.1766
Observations      50

Correlation Coefficient
       MPG        HP        WT
MPG    1
HP     -0.7882    1
WT     -0.8248    0.7419    1

Min - Max
x-variable    Min     Max
HP            48      165
WT            1755    4360

ANOVA
              df    SS         MS         F          Significance F
Regression     2    2451.97    1225.99    70.2813    0
Residual      47     819.87      17.44
Total         49    3271.84

Coefficients
              Coefficient    Std error    t Stat     p-value    Low 95%    Up 95%
intercept     58.1508        2.6582       21.8780    0.0        52.81      63.50
Horsepower    -0.1175        0.0326       -3.6003    0.0008     -0.1832    -0.0519
Weight        -0.0069        0.0014       -4.9035    0.0        -0.0097    -0.0041

1. Identify which variables are the X, independent, or explanatory variables.
2. Identify which variable is the Y, dependent, or response variable.
3. Describe the strength and direction of the correlation.
4. Write the specific estimated regression equation for this problem.
5. Using the estimated regression equation, predict the average MPG for a car that has 60 HP and weighs 2000 lbs.
6. Is the previous prediction extrapolation?
7. Interpret the slope estimates, that is, explain what they mean in terms of this problem.
8. Determine the coefficient of multiple determination, or how much variation in MPG is accounted for by this regression model. Express your answer as a percent. What measure did you use to answer this question?
9. What is the standard error of the estimated regression line? Include the unit of measurement in your answer.
10. Using a 5% level of significance, is there evidence of a linear relationship between MPG and WT? Be sure to state the hypotheses, test statistic, p-value, and the conclusion.
11. Give a 95% confidence interval for the true (i.e., population) slope of MPG and HP.
12. For the data set, look at the 1st set of observations (MPG, HP, WT) or (43.1, 48, 1985). Calculate the residual, i.e., e_i = Y_i − Ŷ_i.
13. Is the model a good fit for the data? Be sure to state your decision and give the reasons that support your decision.
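The prediction and residual questions in Scenarios 6 and 7 reduce to plugging values into the fitted equation Ŷ = b0 + b1·X1 (+ b2·X2) and subtracting from the observed Y. A minimal sketch, using the rounded coefficients printed in the regression output (so results are approximate, especially for Weight, which is shown with only two significant digits); the helper predict is a hypothetical name:

```python
def predict(intercept, slopes, xs):
    """Fitted value: intercept + sum of slope * x over the predictors."""
    return intercept + sum(b * x for b, x in zip(slopes, xs))


# Scenario 6 (SLR): delivery time vs. options ordered, coefficients from the output.
slr_intercept, slr_slopes = 21.9254, [2.0687]
slr_pred_16 = predict(slr_intercept, slr_slopes, [16])      # 16 options ordered
slr_resid_9 = 44 - predict(slr_intercept, slr_slopes, [12])  # 9th observation (12, 44)

# Scenario 7 (MLR): MPG vs. horsepower and weight, coefficients from the output.
mlr_intercept, mlr_slopes = 58.1508, [-0.1175, -0.0069]
mlr_pred = predict(mlr_intercept, mlr_slopes, [60, 2000])             # 60 HP, 2000 lbs
mlr_resid_1 = 43.1 - predict(mlr_intercept, mlr_slopes, [48, 1985])   # 1st observation

# Coefficient of determination from the ANOVA table: SS regression / SS total.
slr_r2 = 2927.23 / 3057.00
mlr_r2 = 2451.97 / 3271.84

# Extrapolation check for the MLR prediction against the Min - Max table.
hp_in_range = 48 <= 60 <= 165
wt_in_range = 1755 <= 2000 <= 4360

print(f"SLR: predicted time at 16 options = {slr_pred_16:.2f} days, "
      f"residual for (12, 44) = {slr_resid_9:.2f}, R^2 = {slr_r2:.2%}")
print(f"MLR: predicted MPG at (60 HP, 2000 lbs) = {mlr_pred:.2f}, "
      f"residual for 1st observation = {mlr_resid_1:.2f}, R^2 = {mlr_r2:.2%}")
print(f"HP within sampled range: {hp_in_range}, WT within sampled range: {wt_in_range}")
```

The range check only compares each predictor with its own minimum and maximum; a combination of values can still be unusual even when each value is individually in range.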