COURSE: DSCI 3710              Print Name:
Exam 1 – version A             Signature:
Spring 2010                    Student ID#:

INSTRUCTIONS: Please print your name and student ID number on this exam. Also, put your signature on this exam. On your scantron PRINT your name and exam version. To better protect your privacy, also print your name on the back of your scantron. You have 105 minutes to complete this exam. The exam is open book, open notes, and open mind. You may use any type of hand calculator, but please show all your work on the exam and mark all answers on the scantron. Use of cell phones, digital cameras, PDAs, and other communication devices is strictly prohibited. Many of the questions follow the format of those in Hawkes Learning Systems Business Statistics (HLSBS). The remaining questions are either based on the Excel assignments or use an HLSBS-like approach with problems nearly identical to those assigned in the textbook. Please DO NOT pull this exam apart. When you have completed the exam, please turn in your scantron and exam booklet to your instructor at the front desk. No cheating. Good luck and we wish you well on the exam.

Note: Whenever questions are connected, you may be asked to assume a result (a given value) as the answer to a previous question, but this result (value) may or may not be correct. This procedure is in place to prevent you from losing points on a subsequent question because you made a mistake on a previous question.

Use the information given below to answer the 4 questions that follow:

A corporation randomly selects 150 salespeople and finds that 125 who have never taken a self-improvement course would like such a course. The firm did a similar study 10 years ago in which 120 of a random sample of 160 salespeople wanted a self-improvement course. The groups are assumed to be independent random samples.
Let p1 and p2 represent the true proportion of workers who would like to attend a self-improvement course in the current study and the past study, respectively. The firm wants to test whether their current course recruitment efforts resulted in a greater proportion of workers that want to attend a self-improvement course than in the past.

Z Test for Two Proportions
                          Variable 1    Variable 2
Sample Proportion         0.833333      0.750000
Number of Observations    150           160
Ho: XXX
Ha: XXX
Z*                        1.801215
P[Z ≥ Z*]                 0.035835
Z Critical, α = 0.01      X.XX
99% CI for p1 - p2        -0.035026 to 0.201692

1. What are the correct null and alternative hypotheses for the above situation?
   A. Ho: μ1 > μ2   Ha: μ1 < μ2
   B. Ho: p1 < p2   Ha: p1 > p2 *
   C. Ho: p1 = p2   Ha: p1 ≠ p2
   D. Ho: p1 > p2   Ha: p1 < p2
   E. Ho: μ1 = μ2   Ha: μ1 ≠ μ2

2. What is the critical value for testing the hypotheses for this problem if α = 0.01?
   A. 1.645
   B. 1.96
   C. 2.33 *
   D. 2.57
   E. 1.28

3. What is the calculated value of the test statistic for the above statistical test?
   A. -0.09
   B. 0.18
   C. 0.16
   D. 1.80 *
   E. 0.72

4. What are the decision and conclusion of the test at the significance level of 0.01?
   A. Fail to reject the null hypothesis, conclude there is sufficient evidence that the proportion of workers that want to attend the course has decreased.
   B. Reject the null hypothesis, conclude there is no evidence of difference in proportions.
   C. Fail to reject the null hypothesis, conclude there is insufficient evidence that the proportion of workers that want to attend the course has increased. *
   D. Reject the null hypothesis, conclude there is evidence that the proportion of workers that want to attend the course has increased.
   E. Fail to reject the null hypothesis, conclude there is sufficient evidence of difference in proportions.

Use the information given below to answer the four (4) questions that follow:

The Glen Valley Steel Company manufactures steel bars.
If the production process is working properly, it turns out steel bars with an average length of at least 2.75 feet with a standard deviation of 0.20 foot (as determined from engineering specifications on the production equipment involved). Longer steel bars can be used or altered, but shorter bars must be scrapped. A sample of 25 bars is selected from the production line. The sample indicates an average length of 3.08 feet. The company wishes to determine whether the process is making short bars because, if it is, the production equipment needs an immediate adjustment.

t Test for Population Mean
Number of Observations       25
Sample Standard Deviation    0.606218
Sample Mean                  3.080000
Ho: μ ≥ X.XX
Ha: μ < X.XX
T*                           2.721794
P[T ≤ T*]                    0.994052
T Critical, α = 0.05         -1.710882
95% CI for Pop. Mean         2.829766 to 3.330234

5. State the null and alternative hypotheses.
   A. Ho: µ ≥ 2.75; Ha: µ < 2.75 *
   B. Ho: µ < 2.75; Ha: µ ≥ 2.75
   C. Ho: µ ≥ 2.75; Ha: µ ≠ 2.75
   D. Ho: µ = 2.75; Ha: µ ≠ 2.75
   E. Ho: µ ≥ 2.75; Ha: µ = 2.75

6. What are the degrees of freedom?
   A. 25
   B. 24 *
   C. 26
   D. 27
   E. 28

7. At the 10% level of significance, where is the Reject Ho region?
   A. To the left of T = -1.318 *
   B. To the left of T = -1.645 and to the right of T = 1.645
   C. To the left of Z = -1.328 and to the right of Z = 1.328
   D. To the right of T = -1.711
   E. To the left of T = -1.960

8. Assuming the calculated value of the test statistic is -1.25, what are the decision and conclusion of the test at the significance level of 0.05?
   A. Fail to reject the null hypothesis; there is insufficient evidence to conclude that the process is not making short bars.
   B. Reject the null hypothesis, there is sufficient evidence to conclude that the process is not making short bars.
   C. Fail to reject the null hypothesis, there is insufficient evidence to conclude that the process is making short bars. *
   D. Reject the null hypothesis, there is insufficient evidence to conclude that the process is not making short bars.
   E. Reject the alternative hypothesis; conclude the mean is significantly less than 2.75 feet.

Use the information given below to answer the four questions that follow:

The effect of a corporate contract-training course on employee efficiency is the subject of a study. Based on advice from a statistics consultant, the human resources training specialist assigned two sets of 10 pages of equally difficult material for data entry by the same 10 staff members, with one set being entered before and the other after completing the corporate contract-training course. The raw data with the number of errors and the Excel analysis using a 10% significance level are shown in the tables below.

Typist    1    2    3    4    5    6    7    8    9    10
Before    31   30   35   43   36   34   .    .    43   45
After     30   33   36   38   30   28   .    .    40   47

t-Test: Paired Two Sample for Means
                                Before     After
Mean                            35.6       34.4
Variance                        43.1556    40.0444
Observations                    10         10
Pearson Correlation             0.85156
Hypothesized Mean Difference    0
df                              9
t Stat                          1.07763
P(T<=t) one-tail                0.XXXX
t Critical one-tail             1.38303
P(T<=t) two-tail                0.XXXX
t Critical two-tail             1.83311

9. What is the table value of the appropriate test statistic to test the belief, at the 10% level of significance, that there is a reduction in the mean number of errors if data entry personnel go through the contract-training course?
   A. 1.38 *
   B. 1.83
   C. 1.08
   D. 0.86
   E. 2.03

10. What is the calculated value of the appropriate test statistic to test the belief that there is a reduction in the mean number of errors if data entry personnel go through the training course?
   A. 1.38
   B. 1.08 *
   C. 1.83
   D. 0.86
   E. 1.96

11. Which of the following ranges best represents the p-value (for the calculated t statistic) for testing the belief that there is a reduction in the mean number of errors if data entry personnel go through the contract-training course?
   A. p > 0.1 *
   B. 0.05 < p < 0.1
   C. 0.025 < p < 0.05
   D. 0.01 < p < 0.025
   E. p < 0.01

12. What are the decision and conclusion of the test?
   A. Fail to reject the null hypothesis, conclude there is insufficient evidence for error reduction. *
   B. Fail to reject the null hypothesis, conclude there is sufficient evidence for error reduction.
   C. Reject the null hypothesis, conclude there is evidence for error reduction.
   D. Reject the null hypothesis, conclude there is no evidence for error reduction.
   E. No decision or conclusion can be reached from this analysis.

Use the information in the paragraph below to answer the next five questions.

A grocery store wants to learn about the preferred shopping times for customers of different age groups. A random sample of customers of the store was selected, and information was gathered on their shopping time preference and age classification. A chi-square test of independence was performed at the 0.05 significance level using Excel, with the three age categories in the rows and the three different time categories in the columns. The test gave the following tables.

Cross Tabulation Table

OBSERVED
          Coln 1    Coln 2    Coln 3    Total
Row 1     XX        25        24        78
Row 2     XX        27        30        85
Row 3     26        XX        19        66
Total     83        73        73        229

EXPECTED
          Coln 1    Coln 2    Coln 3    Total
Row 1     28.271    24.865    24.865    78
Row 2     30.808    27.096    XXX       85
Row 3     23.921    XXX       XXX       66
Total     83        73        73        229

Calculation of the Chi-Square Test
DESCRIPTION       VALUE
χ²*               0.995441
p-value           XXX
Critical value    9.487729
α                 0.05
df                4

13. What alternative hypothesis would you use to test whether there is a relationship between age and preferred shopping time?
   A. Ha: The means of the preferred shopping times are different.
   B. Ha: The age range is independent of the preferred shopping time.
   C. Ha: The preferred shopping time depends on the customers' age range. *
   D. Ha: Each of the three preferred shopping times is the same.
   E. Ha: The three age ranges prefer a different mean shopping time.

14. What is the calculated value of the test statistic for this statistical test?
   A. 0.05
   B. 0.995 *
   C. 0.910
   D. 9.487
   E. 4

15. Based on the Excel output given above, what is the conclusion of the test of this hypothesis?
   A. There is sufficient evidence that the means of the preferred shopping times are not equal.
   B. There is sufficient evidence that age range and preferred shopping time are dependent.
   C. There is sufficient evidence that the three age ranges prefer the same shopping time.
   D. There is insufficient evidence that age range and preferred shopping time are dependent. *
   E. There is insufficient evidence that the means of the preferred shopping times are not equal.

16. What is the degrees of freedom value for this chi-square test?
   A. 9
   B. 8
   C. 6
   D. 4 *
   E. 2

17. Which of the following ranges best describes the p-value for the test statistic?
   A. p > 0.05 *
   B. 0.025 < p-value < 0.05
   C. 0.01 < p-value < 0.025
   D. 0.001 < p-value < 0.01
   E. p < 0.001

Use the following information to answer the next six questions.

A medical researcher was interested in the amount of weight loss caused by a particular diuretic. In a controlled experiment with 20 mice, the amount of weight loss was recorded after 1 month of fixed daily doses of the diuretic, administered as follows. (A partial output of the regression analysis of the data is given subsequently.)

Partial data listing (two side-by-side panels in the original; values shown in the order they appear in the transcript):
Rat 1 2 3 Diuretic (milligrams) Weight Loss (pounds) 0.30 0.30 0.35 0.35 0.38 0.41 11 12 0.45 0.50 0.50 0.61 0.71 0.72 18
Rat 13 Diuretic (milligrams) Weight Loss (pounds) 0.55 0.55 0.60 0.73 0.72 0.74 0.70 0.55 0.55 0.83 0.49 0.51 . . . . 8 9 10 19 20

SUMMARY OUTPUT

Regression Statistics
Multiple R           XXXX
R Square             XXXX
Adjusted R Square    0.7904
Standard Error       0.0754
Observations         20

ANOVA
              df    SS        MS        F          Significance F
Regression    1     0.4125    0.4125    72.6458    0.0000
Residual      18    0.1022    0.0057
Total         19    0.5147

                         Coefficients    Standard Error    t Stat    P-value
Intercept                0.0382          0.0710            0.5387    0.5967
Diuretic (milligrams)    1.1639          0.1366            8.5233    0.0000

18. What is the correlation between these two variables?
   A. .856
   B. -.856
   C. .0754
   D. .7904
   E. .8952 *

19. What is the intercept of the least squares line?
   A. 253,57
   B. -1.042
   C. 1.1639
   D. 0.0382 *
   E. 0.4125

20. What percentage of the variation in weight loss is explained by its regression on diuretic amount?
   A. 45
   B. 69
   C. 75
   D. 19.86
   E. 80.14 *

21. According to the least squares regression from this sample, when the diuretic increases by 1 milligram, the weight loss will:
   A. Increase by 1.1639 pounds *
   B. Decrease by 1.1639 pounds
   C. Increase by 0.0382 pounds
   D. Decrease by 0.0382 pounds
   E. Increase by 1.223 pounds

22. Based on the outcome of the hypothesis test for the slope of the regression line at the 5% significance level, we can conclude that:
   A. There is insufficient evidence that there is a relationship between weight loss and daily dose of the diuretic.
   B. There is sufficient evidence that the slope of the regression line is equal to zero.
   C. There is insufficient evidence that the slope of the regression line is positive.
   D. There is sufficient evidence that there is a relationship between weight loss and daily dose of the diuretic. *
   E. Inconclusive.

23. Assuming the confidence interval for the slope is [0.8770 to 1.4508], we can conclude that:
   A. There is sufficient evidence that there is a relationship between weight loss and daily dose of the diuretic. *
   B. There is insufficient evidence that there is a relationship between weight loss and daily dose of the diuretic.
   C. There is sufficient evidence that the slope of the regression line is equal to zero.
   D. There is insufficient evidence that the slope of the regression line is positive.
   E. Inconclusive.

Use the following information to answer the next five questions.

The personnel manager of a large insurance company wishes to evaluate the leadership of supervisors, mid-level managers, and upper-level managers. Ten persons from each management level were sampled. Is there a difference in the average leadership scores for the three groups?
Leadership indices and ANOVA table are as follows:

Supervisor    Mid-Manager    Upper Manager
16            40             48
34            29             38
21            28             37
32            49             42
35            .              45
41            .              25
21            .              36
15            35             29
20            31             26
21            33             27

SUMMARY
Groups           Count    Sum    Average    Variance
Supervisor       10       256    25.6       81.82222
Mid-Manager      10       352    35.2       40.84444
Upper Manager    10       353    35.3       68.01111

ANOVA
Source of Variation    SS          df    MS      F       P-value    F crit
Between Groups         620.8667    X     XXXX    XXXX    0.0155     3.3541
Within Groups          1716.10     27    XXXX
Total                  XXXX        29

24. Which of the following would be the appropriate null hypothesis for testing the claim that there is a significant difference in leadership scores among these three management levels?
   A. Ho: At least one pair of μ1, μ2, μ3 is different
   B. Ho: μ1 ≠ μ2 ≠ μ3
   C. Ho: μ1 > μ2 > μ3
   D. Ho: μ1 < μ2 < μ3
   E. Ho: μ1 = μ2 = μ3 *

25. How much is the total variation among the leadership scores?
   A. 1716
   B. 2337 *
   C. 621
   D. 310
   E. 63

26. What is the calculated value of the test statistic?
   A. 19.45
   B. 3.40
   C. 4.88 *
   D. 3.35
   E. 2.54

27. What is the critical value of the test statistic? Use α = 0.05.
   A. 4.48
   B. 3.35 *
   C. 3.13
   D. 3.60
   E. 2.54

28. Using α = 0.05 and assuming the calculated F value is 4.88, what is the conclusion of this ANOVA test?
   A. There is insufficient information to decide whether the leadership scores are different.
   B. Fail to reject H0, conclude the mean leadership scores are different.
   C. Fail to reject H0, conclude there is insufficient evidence for the mean leadership scores being different.
   D. Reject H0, conclude the mean leadership scores are different. *
   E. Reject H0, conclude the mean leadership scores are equal.

Use the following information to answer the next four questions.

To predict the sales price of a used Mustang GT, the following data were collected from past sales records for the car: the car's age, condition, and mileage, and the seller (whether the seller is an individual or a dealer). The dummy variable X3 is set equal to 0 if the car is in poor condition, 1 if not.
The dummy variable X4 is set equal to 1 if the seller is a dealer and 0 if not (i.e., if the seller is an individual). The data was analyzed using the multiple regression module of Excel, and partial results are given following the data. Please complete the tables only to the extent required to answer the subsequent questions.

Sales Price (Y)    Age (in Years) X1    Mileage (in 1000s) X2    Condition (Excellent, Poor) X3    Dealer or individual X4
4103               9                    74                       1                                 0
3803               9                    .                        0                                 0
4098               8                    .                        0                                 0
6603               7                    51                       0                                 1
5091               7                    54                       0                                 0
5003               7                    78                       0                                 0
3903               7                    110                      0                                 0
7903               6                    65                       0                                 1
7398               6                    70                       1                                 1
4803               6                    55                       0                                 0
8553               5                    59                       1                                 1
7903               5                    62                       0                                 0
7898               5                    66                       0                                 0
7579               5                    55                       0                                 0
7553               5                    59                       0                                 0
5903               5                    70                       1                                 0
10798              4                    48                       0                                 1
10778              4                    32                       1                                 1
10698              4                    48                       0                                 0
9603               4                    50                       0                                 0
9098               4                    50                       0                                 0
8098               4                    51                       0                                 0
7553               4                    69                       1                                 0
15453              3                    24                       0                                 1
13068              3                    27                       0                                 1

SUMMARY OUTPUT

Regression Statistics
Multiple R           0.938236
R Square             0.880287
Adjusted R Square    0.856344
Standard Error       1110.74
Observations         25

ANOVA
              df    SS             MS            F            Significance F
Regression    XX    181442155.2    45360538.8    36.766562    5.9261E-09
Residual      20    24674887.4     1233744.37
Total         24    206117042.6

                                  Coefficients    Standard Error    t Stat     P-value    Lower 95%    Upper 95%
Intercept                         14655.61        888.3083          16.4983    0.0000     12802.6      16508.6
Age (in Years) X1                 -1070.16        197.5215          -5.4180    XXXX       -1482.2      -658.1
Mileage (in 1000s) X2             -24.90          16.0507           -1.5512    0.1365     -58.4        8.6
Condition (Excellent, Poor) X3    -824.58         537.3320          -1.5346    0.1406     -1945.4      296.3
Dealer or individual X4           1976.33         554.5758          3.5637     0.0019     819.5        3133.2

29. If we wish to determine whether Mileage (variable X2) has a significant effect on the sales price, what is the calculated value of the test statistic?
   A. 16.4983
   B. -5.4180
   C. -1.5512 *
   D. -1.5346
   E. 3.5637

30. What is the p-value associated with the calculated value of the test statistic for the variable age (X1)?
   A. p-value < 0.01 *
   B. 0.01 < p-value < 0.025
   C. 0.025 < p-value < 0.05
   D. 0.05 < p-value < 0.10
   E. p-value > 0.10

31. What is the upper limit of the 95% confidence interval for β3, the coefficient of "condition"?
   A. 16508.6
   B. -658.1
   C. 8.6
   D. 296.3 *
   E. 3133.2

32. What is the predicted sales price of a 5-year-old car in excellent condition, having a mileage of 90,000, sold by an individual?
   A. $9,040
   B. $6,239 *
   C. $3,154
   D. $5,903
   E. $2,991

33. What are the degrees of freedom for regression?
   A. 24
   B. 20
   C. 4 *
   D. 5
   E. 6

*********************** Enjoy your summer break ***********************
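
Several of the starred numerical answers can be recomputed directly from the figures printed in the exam tables. The following is a minimal sketch in Python, assuming only the standard library; the function and variable names are hypothetical and chosen for illustration, and the formulas are the usual textbook ones (pooled two-proportion z, one-sample t, one-way ANOVA F, and the fitted multiple-regression equation) rather than anything prescribed by the exam itself.

```python
import math


def two_proportion_z(x1, n1, x2, n2):
    """Pooled z statistic for comparing two independent sample proportions."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


def one_sample_t(sample_mean, mu0, sample_sd, n):
    """t statistic for a single mean, using the sample standard deviation."""
    return (sample_mean - mu0) / (sample_sd / math.sqrt(n))


def anova_f(ss_between, df_between, ss_within, df_within):
    """F statistic for one-way ANOVA: MS between divided by MS within."""
    return (ss_between / df_between) / (ss_within / df_within)


def predicted_price(age, mileage_thousands, condition, dealer):
    """Fitted equation using the coefficients printed in the Excel output.

    condition: 0 if poor, 1 if not; dealer: 1 if dealer, 0 if individual.
    """
    return (14655.61 - 1070.16 * age - 24.90 * mileage_thousands
            - 824.58 * condition + 1976.33 * dealer)


# Question 3: 125 of 150 now versus 120 of 160 ten years ago.
z = two_proportion_z(125, 150, 120, 160)

# T* in the steel-bar table: mean 3.08, hypothesized 2.75, s = 0.606218, n = 25.
t = one_sample_t(3.08, 2.75, 0.606218, 25)

# Question 18: correlation is the square root of SS regression / SS total
# (positive here because the fitted slope is positive).
r = math.sqrt(0.4125 / 0.5147)

# Questions 25 and 26: total variation and F, with df = 3 - 1 = 2 between groups.
ss_total = 620.8667 + 1716.10
f = anova_f(620.8667, 2, 1716.10, 27)

# Question 32: 5 years old, 90 (thousand) miles, not poor condition, individual.
price = predicted_price(5, 90, 1, 0)

print(f"z = {z:.2f}, t = {t:.4f}, r = {r:.4f}")
print(f"SS total = {ss_total:.0f}, F = {f:.2f}, price = {price:.0f}")
```

Rounded to the precision of the answer choices, these values can be compared with the starred options for questions 3, 18, 25, 26, and 32 and with the T* entry in the t-test table.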