Survey
51-651-00 Statistics
May 2001 FINAL EXAM
Professor: François Bellavance

The questions of this exam relate to a study of a business-to-business situation, specifically a survey of existing customers of the company HATCO (source of data: Hair J.F. et al. Multivariate data analysis. Prentice Hall, 1998). The data are in the file “May2001.xls”. Before using this file to answer the questions, be sure to save at least one copy of it on your hard disk under another name or in another directory.

Three types of information were collected. The first type is the perception of HATCO on seven attributes identified in past studies as the most influential in the choice of suppliers. The respondents, purchasing managers of firms buying from HATCO, rated HATCO on each attribute. The second type of information relates to actual purchase outcomes: each respondent's satisfaction with HATCO and the share of product purchases made from HATCO. The third type of information contains general characteristics of the purchasing companies (e.g., firm size, industry type).

The data provided should give HATCO a better understanding of both the characteristics of its customers and the relationships between their perceptions of HATCO and their actions toward HATCO (purchases and satisfaction). A definition of each variable and an explanation of its coding are given in the following sections.

Perceptions of HATCO

Each of the variables was measured on a graphic rating scale, where a 10-centimeter line was drawn between two endpoints labeled "Poor" and "Excellent".

[Rating scale: a line running from "Poor" to "Excellent"]

Respondents indicated their perceptions by making a mark anywhere on the line. The mark was then measured and the distance from 0 (in centimeters) was recorded. The result was a scale ranging from 0 to 10.
The seven HATCO attributes rated by each respondent are as follows:

X1   Manufacturer's image – overall image of the manufacturer or supplier
X2   Product quality – perceived level of quality of a particular product (e.g., performance or yield)
X3   Overall service – overall level of service necessary for maintaining a satisfactory relationship between supplier and purchaser
X4   Delivery speed – amount of time it takes to deliver the product once an order has been confirmed
X5   Price level – perceived level of price charged by product suppliers
X6   Price flexibility – perceived willingness of HATCO representatives to negotiate price on all types of purchases
X7   Salesforce image – overall image of the manufacturer's salesforce

Purchase Outcomes

Two specific measures were obtained that reflected the outcomes of the respondent's purchase relationships with HATCO. These measures include:

X9   Usage level – how much of the firm's total product is purchased from HATCO, measured on a 100-point percentage scale, ranging from 0 to 100 percent
X10  Satisfaction level – how satisfied the purchaser is with past purchases from HATCO, measured on the same graphic rating scale as perceptions X1 to X7

Purchase Characteristics

The five characteristics of the responding firms used in the study are as follows:

X8   Size of firm – size of the firm relative to others in this market. This variable has two categories: 1 = large, 0 = small
X11  Specification buying – extent to which a particular purchaser evaluates each purchase separately (total value analysis) versus the use of specification buying, which details precisely the product characteristics desired. This variable has two categories: 1 = employs total value analysis approach, evaluating each purchase separately; 0 = use of specification buying
X12  Structure of procurement – method of procuring or purchasing products within a particular company.
This variable has two categories: 1 = centralized procurement, 0 = decentralized procurement
X13  Type of industry – industry classification to which a product purchaser belongs. This variable has two categories: 1 = industry A, 0 = other industries
X14  Type of buying situation – type of situation facing the purchaser. This variable has three categories: 1 = new task, 2 = modified rebuy, 3 = straight rebuy

The following variables in the file “May2001.xls” are the results of data management of previous columns, provided to help you answer some of the questions of the exam.

X15  Satisfaction level (X10) for customers with centralized procurement (X12 = 1)
X16  Satisfaction level (X10) for customers with decentralized procurement (X12 = 0)
X17  Random numbers – you will not need that column
X18  Random numbers – you will not need that column

Problem 1. (10 points)

a) Obtain estimates of the mean, standard deviation, minimum and maximum of the satisfaction level with past purchases from HATCO based on the sample of 100 purchasers surveyed. (3 points)

   Mean
   Standard deviation
   Minimum
   Maximum

b) Use a 99% confidence interval to estimate the average satisfaction level of existing customers of the company HATCO and briefly give the interpretation of this interval. (4 points)

c) Obtain the new lower and upper bounds of the 99% confidence interval to estimate the average satisfaction level, if the sample size is increased to 400 and assuming that the sample mean and standard deviation do not change. (3 points)

   Lower bound
   Upper bound

Problem 2. (15 points)

a) What are the means and standard deviations of the satisfaction level with past purchases from HATCO, for companies in the sample with centralized and decentralized procurement respectively?
(2 points)

                        Centralized procurement    Decentralized procurement
   Mean
   Standard deviation

HATCO management is now interested in testing whether the satisfaction level is significantly different between companies with centralized and decentralized procurement.

b) Formulate precisely the hypotheses H0 and H1 that we want to test in this problem. (2 points)

c) Before doing the test on the means, you need to check whether the variances are equal or unequal. What is the p-value of the two-tailed test to compare the two variances? What do you conclude at the α = 5% level? (4 points)

d) What is the p-value for the test of the hypotheses formulated in b)? (4 points)

e) Comment briefly on the results of your test in d). (3 points)

Problem 3. (15 points)

The cross tabulation of the size of the firm (X8) and type of buying situation (X14) gave the following results:

   Count                             Firm size (X8)
   Type of buying situation (X14)    0 = small    1 = large
   1 = new task                         10           24
   2 = modified rebuy                   16           16
   3 = straight rebuy                   34            0

a) HATCO management wants to test whether there is a significant relationship between the size of the firm and the type of buying situation. Obtain the p-value of the test and comment briefly on the results in context (note: use the appropriate distribution of percentages to comment on the presence or absence of a relationship). (5 points)

HATCO management hypothesizes that the proportion of firms that have centralized procurement (X12 = 1) is different in large firms (X8 = 1) compared to small firms (X8 = 0).

b) What are the proportions of firms in the survey sample that have centralized procurement, in large and small firms respectively? (5 points)
(In EXCEL, use “Pivot Table and PivotChart Report” in the menu “Data” to get the numbers you need to compute the proportions.)

c) Obtain the p-value to test your hypotheses and give your conclusion at the α = 1% level.
(5 points)

Problem 4. (20 points)

HATCO management has long been interested in more accurately predicting the level of business obtained from its customers, in an attempt to provide a better basis for production controls and marketing efforts. To this end you propose that a linear regression analysis should be attempted to predict the product usage levels (dependent variable X9) of the customers based on their perceptions of HATCO’s performance (independent variables X1 to X7). In addition to finding a way to predict usage levels, the management is also interested in identifying the significant factors (independent variables) that lead to increased product usage, for application in differentiated marketing campaigns.

Below are scatter plots of the usage levels (X9) with all seven attributes measuring perceptions of HATCO (X1 to X7), as well as scatter plots of the seven attributes pairwise.

[Scatter plot matrix of X9, X1, X2, X3, X4, X5, X6 and X7]

The Pearson correlation coefficients are as follows:

          X9       X1       X2       X3       X4       X5       X6       X7
   X9   1,000
   X1   0,224    1,000
   X2  -0,155    0,208    1,000
   X3   0,664    0,255    0,023    1,000
   X4   0,676    0,050   -0,452    0,497    1,000
   X5   0,082    0,272    0,445    0,558   -0,349    1,000
   X6   0,536   -0,112   -0,432    0,061    0,525   -0,492    1,000
   X7   0,256    0,788    0,199    0,208    0,077    0,186   -0,026    1,000

a) Which of the above perception attributes seem(s) to explain the largest amount of variability in usage level? Briefly justify your answer. (2 points)

b) Which of the above perception attributes seem(s) to explain the lowest amount of variability in usage level? Briefly justify your answer. (2 points)

c) Based on the scatter plots and the Pearson correlation coefficients, do you expect possible multicollinearity problems in building your multiple linear regression model? Briefly justify your answer.
(3 points)

d) Using EXCEL and the backward elimination procedure with the criterion α = 5% (i.e. perception attributes with a p-value > 5% in the model are removed), obtain and report below the multiple linear regression equation for the “best” model to predict usage levels. Comment briefly on the adjusted R square in this context. (8 points)

e) Based on the results of your regression model, what would be your recommendations to HATCO management? (5 points)

Solution

Problem 1.

a) Use Descriptive Statistics in Excel:

   Mean                  6,102
   Standard deviation    1,3386
   Minimum               3,4
   Maximum               9,6

b) 6,102 ± 0,352 or (5,750; 6,454). We are 99% confident that the true mean satisfaction level of customers is in the interval (5,750; 6,454).

c) Multiplying the sample size by 4 (from 100 to 400) divides the standard error by √4 = 2, so the margin of error is approximately halved:

   Lower bound    6,102 – 0,352 x 1/2 ≈ 5,93
   Upper bound    6,102 + 0,352 x 1/2 ≈ 6,28

   T test for a mean (unknown sigma)
   X-bar                      6,102
   Mu0                        0
   n                          400
   s                          1,3386
   t statistic                91,170
   p-value (2-tailed test)    0,0000
   Confidence level           99,0%
   CI: lower limit            5,93
   CI: upper limit            6,28

Problem 2.

a)
                        Centralized procurement    Decentralized procurement
   Mean                        6,636                      5,568
   Standard deviation          1,3405                     1,11418

b) H0: μ_centralized = μ_decentralized vs H1: μ_centralized ≠ μ_decentralized

c) “F-test two-sample for variances” for testing the equality of the variances: p-value = 2 x 0,09949 = 0,199 > 0,05 => do not reject H0; we proceed as if the variances are equal.
   F-Test Two-Sample for Variances
                            X15            X16
   Mean                     6,636          5,568
   Variance                 1,797044898    1,241404082
   Observations             50             50
   df                       49             49
   F                        1,447590615
   P(F<=f) one-tail         0,099494227
   F Critical one-tail      1,607290301

d) p-value = 3,5703E-05 = 0,000035703

   t-Test: Two-Sample Assuming Equal Variances
                                    X15           X16
   Mean                             6,636         5,568
   Variance                         1,7970449     1,2414041
   Observations                     50            50
   Pooled Variance                  1,51922449
   Hypothesized Mean Difference     0
   df                               98
   t Stat                           4,33241729
   P(T<=t) one-tail                 1,7852E-05
   t Critical one-tail              1,66055088
   P(T<=t) two-tail                 3,5703E-05
   t Critical two-tail              1,98446742

e) p-value < 0,05, so we reject H0 (or accept H1). There is a statistically significant difference between the two means: the average satisfaction level of clients with centralized procurement is higher than that of clients with decentralized procurement.

Problem 3.

a) p-value = 0,00000000813. There is a strong relationship between the size of the firm and the type of buying situation. 60% of large firms face a new task compared to only 16,7% of small firms. On the other hand, 56,7% of small firms use straight rebuy, while none of the large firms in the sample uses this type of buying situation.

   CROSSED TABLE 2X3

   Obs. frequencies      new       modified    straight    Total
   small                 10        16          34          60
   large                 24        16          0           40
   Total                 34        32          34          100

   Exp. frequencies      new       modified    straight    Total
   small                 20,40     19,20       20,40       60
   large                 13,60     12,80       13,60       40
   Total                 34        32          34          100

   (Exp-Obs)^2 / Exp     new       modified    straight
   small                 5,302     0,533       9,067
   large                 7,953     0,800       13,600

   % line                new       modified    straight    Total
   small                 0,167     0,267       0,567       1
   large                 0,600     0,400       0,000       1
   Total                 0,340     0,320       0,340       1

   % column              new       modified    straight    Total
   small                 0,294     0,500       1,000       0,600
   large                 0,706     0,500       0,000       0,400
   Total                 1         1           1           1

   Chi-square statistic:    37,255
   Degree of freedom:       2
   P-value:                 0,00000000813
   Cramer coefficient:      0,610367938

b) Large: 100%; Small: 16,7%.
c) p-value = 0,000. There is a significant difference: large firms use exclusively centralized procurement and small firms use mostly (83,3%) decentralized procurement.

   Count of X8     X12
   X8              0       1       Grand Total
   0               50      10      60
   1                       40      40
   Grand Total     50      50      100

   CROSSED TABLE 2X2 (line: X8, column: X12)

   Obs. frequencies      decent.    cent.     Total
   small                 50         10        60
   large                 0          40        40
   Total                 50         50        100

   Exp. frequencies      decent.    cent.     Total
   small                 30,00      30,00     60
   large                 20,00      20,00     40
   Total                 50         50        100

   (Exp-Obs)^2 / Exp     decent.    cent.
   small                 13,333     13,333
   large                 20,000     20,000

   % line                decent.    cent.     Total
   small                 0,833      0,167     1
   large                 0,000      1,000     1
   Total                 0,500      0,500     1

   % column              decent.    cent.     Total
   small                 1,000      0,200     0,600
   large                 0,000      0,800     0,400
   Total                 1          1         1

   Chi-square statistic:    66,667
   Degree of freedom:       1
   P-value:                 0,00000000
   Cramer coefficient:      0,8164966

Problem 4.

a) X4: Delivery speed (r = 0,676) and X3: Overall service (r = 0,664), because these two variables have the highest correlation coefficients (in absolute value) with X9: usage level.

b) X5: Price level (r = 0,082), because it has the lowest correlation coefficient (in absolute value) with X9: usage level.

c) Yes, it is possible because of the relatively high correlation coefficient between X7: salesforce image and X1: manufacturer’s image (r = 0,788).

d) 1st step: remove X1 (p-value = 0,7679); 2nd step: remove X2 (p-value = 0,1318); 3rd step: remove X3 (p-value = 0,0656).

“Best” multiple linear regression model:

   Usage level = -5,008 + 4,026 x X4 (Delivery speed) + 3,737 x X5 (Price level) + 3,024 x X6 (Price flexibility) + 1,520 x X7 (Salesforce image)

Adj. R Square = 0,716. 71,6% of the variability observed in the usage level (X9) is explained by the perception of delivery speed (X4), price level (X5), price flexibility (X6), and salesforce image (X7). The higher the perception on these variables, the higher the usage level.
   SUMMARY OUTPUT

   Regression Statistics
   Multiple R            0,853064625
   R Square              0,727719255
   Adjusted R Square     0,716254803
   Standard Error        4,788114317
   Observations          100

   ANOVA
                 df    SS             MS             F            Significance F
   Regression    4     5821,026322    1455,256581    63,476146    5,18021E-26
   Residual      95    2177,973678    22,92603871
   Total         99    7999

                 Coefficients     Standard Error    t Stat          P-value        Lower 95%        Upper 95%
   Intercept     -5,007988001     4,013694059       -1,24772539     0,215198465    -12,97617246     2,960196455
   X4            4,02630684       0,435369282       9,248026928     6,69187E-15    3,161990156      4,890623525
   X5            3,737187823      0,477028412       7,834308667     6,71555E-12    2,790167367      4,684208279
   X6            3,023782061      0,436075138       6,934084968     4,91977E-10    2,158064075      3,889500047
   X7            1,520263504      0,643138137       2,363821108     0,020123693    0,243473786      2,797053222
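
Several of the statistics reported in these solutions can be re-derived from the printed summary numbers alone, without the file “May2001.xls”. The following is a minimal Python sketch of such a check, not part of the original Excel-based solution. The function names are illustrative assumptions, the inputs are copied from the Excel output above (with decimal points instead of commas), and the chi-square p-value helper only implements the closed forms for 1 and 2 degrees of freedom, which are the two cases that occur in Problem 3.

```python
import math


def pooled_t(mean1, var1, n1, mean2, var2, n2):
    """Two-sample t statistic assuming equal variances (Problem 2 d)."""
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
    return (mean1 - mean2) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))


def chi_square(observed):
    """Return (statistic, degrees of freedom, Cramer's V) for a table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (obs - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    smaller_dim = min(len(row_totals), len(col_totals)) - 1
    return stat, df, math.sqrt(stat / (n * smaller_dim))


def chi_square_p_value(stat, df):
    """Upper-tail p-value; this sketch only covers df = 1 and df = 2."""
    if df == 1:
        return math.erfc(math.sqrt(stat / 2))
    if df == 2:
        return math.exp(-stat / 2)
    raise ValueError("only df = 1 or df = 2 is handled in this sketch")


def adjusted_r2(r2, n, k):
    """Adjusted R square for n observations and k predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# Problem 2 d): means and variances of X15 and X16 (solution reports t = 4,3324)
t_stat = pooled_t(6.636, 1.797044898, 50, 5.568, 1.241404082, 50)

# Problem 3 a): rows = small, large; columns = new task, modified, straight
stat_a, df_a, v_a = chi_square([[10, 16, 34], [24, 16, 0]])
p_a = chi_square_p_value(stat_a, df_a)

# Problem 3 c): rows = small, large; columns = decentralized, centralized
stat_c, df_c, v_c = chi_square([[50, 10], [0, 40]])
p_c = chi_square_p_value(stat_c, df_c)

# Problem 4 d): sums of squares taken from the ANOVA table
ss_regression, ss_residual, n_obs, k_predictors = 5821.026322, 2177.973678, 100, 4
r2 = ss_regression / (ss_regression + ss_residual)
adj_r2 = adjusted_r2(r2, n_obs, k_predictors)
f_stat = (ss_regression / k_predictors) / (ss_residual / (n_obs - k_predictors - 1))

print(f"Problem 2 d)  t = {t_stat:.4f}")
print(f"Problem 3 a)  chi2 = {stat_a:.3f}, df = {df_a}, p = {p_a:.3g}, V = {v_a:.4f}")
print(f"Problem 3 c)  chi2 = {stat_c:.3f}, df = {df_c}, p = {p_c:.3g}, V = {v_c:.4f}")
print(f"Problem 4 d)  R2 = {r2:.4f}, adjusted R2 = {adj_r2:.4f}, F = {f_stat:.3f}")
```

Working from the summary numbers keeps the check independent of the spreadsheet, at the cost of not reproducing the quantities that need a t or F quantile (the confidence intervals of Problem 1 and the F-test p-value of Problem 2 c), which would require a statistics library.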