51-651-00 Statistics
May 2001
FINAL EXAM
Professor: François Bellavance
Questions of the exam are related to a study for a business-to-business situation, specifically a survey of
existing customers of the company HATCO (source of data: Hair J.F. et al. Multivariate data analysis.
Prentice Hall, 1998.) The data are in the file “May2001.xls”. Before using this file to answer the
questions, be sure that you save on your hard disk, at least one copy of this file under another name
or in another directory.
Three types of information were collected. The first type is the perception of HATCO on seven attributes
identified in past studies as the most influential in the choice of suppliers. The respondents, purchasing
managers of firms buying from HATCO, rated HATCO on each attribute. The second type of information
relates to actual purchase outcomes: each respondent's level of product usage and satisfaction with
past purchases from HATCO. The third type of information contains general characteristics of the
purchasing companies (e.g., firm size, industry type).
The data provided should give HATCO a better understanding of both the characteristics of its customers
and the relationships between their perceptions of HATCO and their actions toward HATCO (purchases
and satisfaction). A definition of each variable and an explanation of its coding is given in the following
sections.
Perceptions of HATCO
Each of the variables was measured on a graphic rating scale, where a 10-centimeter line was drawn
between the endpoints, labeled "Poor" and "Excellent".
Poor |----------------------------------------| Excellent
Respondents indicated their perceptions by making a mark anywhere on the line. The mark was then
measured and the distance from 0 (in centimeters) was recorded. The result was a scale ranging from 0 to
10. The seven HATCO attributes rated by each respondent are as follows :
X1 Manufacturer's image – overall image of the manufacturer or supplier
X2 Product quality – perceived level of quality of a particular product (e.g., performance or
yield)
X3 Overall service – overall level of service necessary for maintaining a satisfactory
relationship between supplier and purchaser
X4 Delivery speed – amount of time it takes to deliver the product once an order has been
confirmed
X5 Price level – perceived level of price charged by product suppliers
X6 Price flexibility – perceived willingness of HATCO representatives to negotiate price on all
types of purchases
X7 Salesforce image – overall image of the manufacturer's salesforce
Purchase Outcomes
Two specific measures were obtained that reflected the outcomes of the respondent's purchase
relationships with HATCO. These measures include :
X9 Usage level – how much of the firm's total product is purchased from HATCO, measured on
a 100-point percentage scale, ranging from 0 to 100 percent
X10 Satisfaction level – how satisfied the purchaser is with past purchases from HATCO,
measured on the same graphic rating scale as perceptions X1 to X7
Purchaser Characteristics
The five characteristics of the responding firms used in the study are as follows :
X8 Size of firm – size of the firm relative to others in this market. This variable has two
categories : 1 = large, 0 = small
X11 Specification buying – extent to which a particular purchaser evaluates each purchase
separately (total value analysis) versus the use of specification buying, which details precisely
the product characteristics desired. This variable has two categories : 1 = employs total
value analysis approach, evaluating each purchase separately ; 0 = use of specification
buying
X12 Structure of procurement – method of procuring or purchasing products within a particular
company. This variable has two categories : 1 = centralized procurement, 0 =
decentralized procurement
X13 Type of industry – industry classification to which a product purchaser belongs. This
variable has two categories : 1 = industry A, 0 = other industries
X14 Type of buying situation – type of situation facing the purchaser. This variable has three
categories : 1 = new task, 2 = modified rebuy, 3 = straight rebuy
The following variables in the file “May2001.xls” are the results of data management of previous
columns to help you answer some of the questions of the exam.
X15 Satisfaction level (X10) for customers with centralized procurement (X12 = 1)
X16 Satisfaction level (X10) for customers with decentralized procurement (X12 = 0)
X17 Random numbers – you will not need that column
X18 Random numbers – you will not need that column
Problem 1. ( 10 points)
a) Obtain estimates of the mean, standard deviation, minimum and maximum of the satisfaction
level with past purchases from HATCO based on the sample of 100 purchasers surveyed. (3
points)
Mean
Standard deviation
Minimum
Maximum
b) Use a 99% confidence interval to estimate the average satisfaction level of existing customers of
the company HATCO and briefly give the interpretation of this interval. (4 points)
c) Obtain the new lower and upper bounds of the 99% confidence interval to estimate the average
satisfaction level, if the sample size is increased to 400 and assuming that the sample mean and
standard deviation do not change. (3 points)
Lower bound
Upper bound
Problem 2. ( 15 points)
a) What are the means and standard deviations of the satisfaction level with past purchases from
HATCO, for companies in the sample with centralized and decentralized procurement
respectively? (2 points)
                      Centralized procurement    Decentralized procurement
Mean
Standard deviation
HATCO management is now interested in testing if the satisfaction level is significantly different between
companies with centralized and decentralized procurement.
b) Formulate precisely the hypotheses H0 and H1 that we want to test in this problem. (2 points)
c) Before doing the test on the means, you need to check if the variances are equal or unequal. What
is the p-value of the two-tailed test to compare the two variances? What do you conclude at the
α = 5% level? (4 points)
d) What is the p-value for the test of the hypotheses formulated in b)? (4 points)
e) Comment briefly on the results of your test in d). (3 points)
Problem 3. (15 points)
The cross tabulation of the size of the firm (X8) and type of buying situation (X14) gave the following
results:
Count
                      Type of buying situation (X14)
Firm size (X8)        1 = new task    2 = modified rebuy    3 = straight rebuy
0 = small             10              16                    34
1 = large             24              16                    0
a) HATCO management wants to test if there is a significant relationship between the size of the
firm and the type of buying situation. Obtain the p-value of the test and briefly comment on the
results in context (note: use the appropriate distribution of percentages to comment on
the presence or absence of a relationship). (5 points)
HATCO management hypothesizes that the proportion of firms that have centralized procurement (X12 =
1) is different in large firms (X8 = 1) compared to small firms (X8 = 0).
b) What are the proportions of firms in the survey sample that have centralized procurement in large
and small firms respectively? (5 points) (In EXCEL, use “Pivot Table and PivotChart Report” in
the menu “Data” to get the numbers you need to compute the proportions)
c) Obtain the p-value to test your hypotheses and give your conclusion at the α = 1% level. (5 points)
Problem 4. ( 20 points)
HATCO management has long been interested in more accurately predicting the level of business
obtained from its customers in the attempt to provide a better basis for production controls and marketing
efforts. To this end you propose that a linear regression analysis should be attempted to predict the
product usage levels (dependent variable X9) of the customers based on their perceptions of HATCO’s
performance (independent variables X1 to X7). In addition to finding a way to predict usage levels, the
management is also interested in identifying the significant factors (independent variables) that led to
increased product usage for application in differentiated marketing campaigns.
Below are scatter plots of the usage levels (X9) with all seven attributes measuring perceptions of HATCO
(X1 to X7), as well as scatter plots of the seven attributes pairwise.
[Figure: scatter-plot matrix of X9, X1, X2, X3, X4, X5, X6 and X7]
The Pearson correlation coefficients are as follows:
         X9        X1        X2        X3        X4        X5        X6        X7
X9     1,000
X1     0,224     1,000
X2    -0,155     0,208     1,000
X3     0,664     0,255     0,023     1,000
X4     0,676     0,050    -0,452     0,497     1,000
X5     0,082     0,272     0,445     0,558    -0,349     1,000
X6     0,536    -0,112    -0,432     0,061     0,525    -0,492     1,000
X7     0,256     0,788     0,199     0,208     0,077     0,186    -0,026     1,000
a) Which of the above perception attributes seem(s) to explain the largest amount of variability in
usage level? Briefly justify your answer. (2 points)
b) Which of the above perception attributes seem(s) to explain the lowest amount of variability in
usage level? Briefly justify your answer. (2 points)
c) Based on the scatter plots and the Pearson correlation coefficients, do you expect possible
multicollinearity problems in building your multiple linear regression model? Briefly justify your
answer. (3 points)
d) Using EXCEL and the backward elimination procedure with the criterion α = 5% (i.e. perception
attributes with a p-value > 5% in the model are removed), obtain and report below the multiple
linear regression equation for the “best” model to predict usage levels. Briefly comment on the
adjusted R square in this context. (8 points)
e) Based on the results of your regression model, what would be your recommendations to HATCO
management? (5 points)
Solution
Problem 1
a) Use Descriptive Statistics in Excel:
Mean                  6,102
Standard deviation    1,3386
Minimum               3,4
Maximum               9,6
b) 6,102 ± 0,352 or (5,750; 6,454). We are 99% confident that the true mean satisfaction level of
customers is in the interval (5,750; 6,454).
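The margin of error in b) can be checked by hand from the summary statistics in a). The sketch below is only an illustration: the critical value 2.626 is an assumption (the two-sided 99% value of Student's t with 99 degrees of freedom, read from a t table) and is not stated in the exam. Decimal commas are written as points in the code.

```python
import math

# Summary statistics reported in part a).
n = 100
mean = 6.102
sd = 1.3386

# ASSUMPTION: two-sided 99% critical value of Student's t with n - 1 = 99
# degrees of freedom, taken from a t table (about 2.626).
t_crit = 2.626

# Half-width of the interval: t * s / sqrt(n).
margin = t_crit * sd / math.sqrt(n)
lower, upper = mean - margin, mean + margin

print(f"margin = {margin:.3f}, CI = ({lower:.3f}, {upper:.3f})")
```

Rounded to three decimals, this gives the same margin and bounds as the answer above.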
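c) The margin of error is proportional to 1/√n, so multiplying the sample size by 4 (from 100 to 400)
divides it by 2:
Lower bound    6,102 – 0,352 x 1/2 = 5,93
Upper bound    6,102 + 0,352 x 1/2 = 6,28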
T test for a mean (unknown sigma)
X-bar                      6,102
Mu0                        0
n                          400
s                          1,3386
t statistic                91,170
p-value (2-tailed test)    0,0000
Confidence level           99,0%
CI: lower limit            5,93
CI: upper limit            6,28
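The bounds for n = 400 follow from the fact that the half-width of the interval is proportional to 1/√n, so quadrupling the sample size halves it. A minimal sketch of that scaling, assuming the small change in the t critical value between 99 and 399 degrees of freedom can be ignored:

```python
import math

mean = 6.102
margin_n100 = 0.352   # half-width reported in part b) for n = 100

# Going from 100 to 400 observations multiplies the half-width by
# sqrt(100 / 400) = 1/2.
# ASSUMPTION: the t critical value is treated as unchanged.
scale = math.sqrt(100 / 400)
margin_n400 = margin_n100 * scale

lower, upper = mean - margin_n400, mean + margin_n400
print(f"scale = {scale}, CI = ({lower:.2f}, {upper:.2f})")
```

The result agrees with the limits 5,93 and 6,28 shown in the Excel output.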
Problem 2.
a)
                      Centralized procurement    Decentralized procurement
Mean                  6,636                      5,568
Standard deviation    1,3405                     1,11418
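b) H0 : μ centralized = μ decentralized vs H1 : μ centralized ≠ μ decentralized, where μ is the mean
satisfaction level (X10) in the population of customers with each type of procurement.
c) “F-test two-sample for variances” for testing the equality of the variances:
p-value = 2 x 0,09949 = 0,199 > 0,05 => do not reject H0; there is no evidence that the variances
differ, so they are treated as equal.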
F-Test Two-Sample for Variances
                       X15            X16
Mean                   6,636          5,568
Variance               1,797044898    1,241404082
Observations           50             50
df                     49             49
F                      1,447590615
P(F<=f) one-tail       0,099494227
F Critical one-tail    1,607290301
d) p-value = 3,5703E-05 = 0,000035703
t-Test: Two-Sample Assuming Equal Variances
                                X15           X16
Mean                            6,636         5,568
Variance                        1,7970449     1,2414041
Observations                    50            50
Pooled Variance                 1,51922449
Hypothesized Mean Difference    0
df                              98
t Stat                          4,33241729
P(T<=t) one-tail                1,7852E-05
t Critical one-tail             1,66055088
P(T<=t) two-tail                3,5703E-05
t Critical two-tail             1,98446742
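The F ratio, the pooled variance and the t statistic in the two Excel outputs can be recomputed from the group means, variances and sample sizes alone. The sketch below does only that arithmetic; it does not compute p-values, because the Python standard library has no F or t distribution, and instead compares the statistics with the critical values printed by Excel.

```python
import math

# Summary statistics from the Excel output (decimal commas written as points).
n1, mean1, var1 = 50, 6.636, 1.797044898   # X15: centralized procurement
n2, mean2, var2 = 50, 5.568, 1.241404082   # X16: decentralized procurement

# F statistic for comparing variances: larger variance over smaller.
f_ratio = var1 / var2

# Pooled variance and t statistic for the equal-variance two-sample test.
df = n1 + n2 - 2
pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
t_stat = (mean1 - mean2) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))

# Critical values copied from the Excel output, not computed here.
f_crit_one_tail = 1.607290301
t_crit_two_tail = 1.98446742

print(f"F = {f_ratio:.4f} (below {f_crit_one_tail:.4f}: {f_ratio < f_crit_one_tail})")
print(f"t = {t_stat:.4f} on {df} df (above {t_crit_two_tail:.4f}: {t_stat > t_crit_two_tail})")
```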
e) p-value < 0,05, so we reject H0 (or accept H1). There is a statistically significant difference between
the two means: the average satisfaction level of clients with centralized procurement is higher
than that of clients with decentralized procurement.
Problem 3.
a) p-value = 0,00000000813. There is a strong relationship between the size of the firm and the
type of buying situation. 60% of large firms face a new task, compared with only 16,7% of small
firms. On the other hand, 56,7% of small firms are in a straight rebuy situation, whereas none of
the large firms in the sample is.
CROSSED TABLE 2X3

Obs. frequencies
line \ column:    new      modified    straight    Total
small             10       16          34          60
large             24       16          0           40
Total             34       32          34          100

Exp. frequencies
line \ column:    1        2           3           Total
1                 20,40    19,20       20,40       60
2                 13,60    12,80       13,60       40
Total             34       32          34          100

(Exp-Obs)^2 / Exp
line \ column:    new      modified    straight
small             5,302    0,533       9,067
large             7,953    0,800       13,600

% line
line \ column:    new      modified    straight    Total
small             0,167    0,267       0,567       1
large             0,600    0,400       0,000       1
Total             0,340    0,320       0,340       1

% column
line \ column:    1        2           3           Total
1                 0,294    0,500       1,000       0,600
2                 0,706    0,500       0,000       0,400
Total             1        1           1           1

Chi-square statistic:    37,255
Degree of freedom:       2
P-value:                 0,00000000813
Cramer coefficient:      0,610367938
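The chi-square statistic, p-value and Cramer coefficient for the 2x3 table can be reproduced from the observed counts alone. The following is a sketch using only the standard library; it relies on the fact that, with exactly 2 degrees of freedom, the upper tail of the chi-square distribution is exp(-x/2), so it is not a general-purpose routine.

```python
import math

# Observed counts: rows = firm size, columns = buying situation
# (new task, modified rebuy, straight rebuy).
observed = [
    [10, 16, 34],   # small firms
    [24, 16, 0],    # large firms
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Sum of (observed - expected)^2 / expected over all cells.
chi2 = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / n
        chi2 += (obs - expected) ** 2 / expected

n_rows, n_cols = len(observed), len(observed[0])
df = (n_rows - 1) * (n_cols - 1)

# Valid only because df == 2: P(X > x) = exp(-x / 2).
p_value = math.exp(-chi2 / 2)

# Cramer coefficient: sqrt(chi2 / (n * (min(rows, cols) - 1))).
cramers_v = math.sqrt(chi2 / (n * (min(n_rows, n_cols) - 1)))

print(f"chi2 = {chi2:.3f}, df = {df}, p = {p_value:.3g}, V = {cramers_v:.4f}")
```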
b) Large: 100%; Small: 16,7%.
c) p-value = 0,000 (to three decimals), which is below α = 1%. There is a significant difference: large
firms use exclusively centralized procurement and small firms use mostly (83,3%) decentralized procurement.
Count of X8      X12
X8               0        1        Grand Total
0                50       10       60
1                         40       40
Grand Total      50       50       100

CROSSED TABLE 2X2

Obs. frequencies
line: X8 \ column: X12    decent.    cent.    Total
small                     50         10       60
large                     0          40       40
Total                     50         50       100

Exp. frequencies
line \ column:            1          2        Total
1                         30,00      30,00    60
2                         20,00      20,00    40
Total                     50         50       100

(Exp-Obs)^2 / Exp
line \ column:            decent.    cent.
small                     13,333     13,333
large                     20,000     20,000

% line
line \ column:            decent.    cent.    Total
small                     0,833      0,167    1
large                     0,000      1,000    1
Total                     0,500      0,500    1

% column
line \ column:            1          2        Total
1                         1,000      0,200    0,600
2                         0,000      0,800    0,400
Total                     1          1        1

Chi-square statistic:    66,667
Degree of freedom:       1
P-value:                 0,00000000
Cramer coefficient:      0,8164966
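The proportions in b) and the chi-square test in c) can be recomputed from the pivot-table counts. This sketch again uses only the standard library; with 1 degree of freedom the chi-square upper tail equals erfc(sqrt(x/2)), which is the only case handled here. For a 2x2 table this test is equivalent to a two-sided z test for the difference between two proportions.

```python
import math

# Observed counts: rows = firm size, columns = procurement
# (decentralized, centralized).
observed = [
    [50, 10],   # small firms (X8 = 0)
    [0, 40],    # large firms (X8 = 1)
]

# Proportion of firms with centralized procurement in each size group.
prop_small = observed[0][1] / sum(observed[0])
prop_large = observed[1][1] / sum(observed[1])

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

chi2 = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / n
        chi2 += (obs - expected) ** 2 / expected

# Valid only for 1 degree of freedom: P(X > x) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi2 / 2))
cramers_v = math.sqrt(chi2 / n)   # 2x2 table: min(rows, cols) - 1 = 1

print(f"small: {prop_small:.3f}, large: {prop_large:.3f}")
print(f"chi2 = {chi2:.3f}, p = {p_value:.3g}, V = {cramers_v:.4f}")
```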
Problem 4.
a) X4: Delivery speed (r = 0,676) and X3: Overall service (r = 0,664) because these two variables have
the highest correlation coefficient (in absolute value) with X9: usage level.
b) X5: Price level (r = 0,082) because it has the lowest correlation coefficient (in absolute value) with X9:
usage level.
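The reasoning in a) and b) amounts to ranking the attributes by the absolute value of their correlation with X9; in a simple regression, the squared correlation is the share of variability in usage explained by that attribute alone. A short sketch, with the correlations transcribed from the X9 column of the correlation table (decimal commas written as points):

```python
# Pearson correlations of each perception attribute with usage level (X9),
# as read from the correlation table.
r_with_usage = {
    "X1": 0.224, "X2": -0.155, "X3": 0.664, "X4": 0.676,
    "X5": 0.082, "X6": 0.536, "X7": 0.256,
}

# Share of variability in X9 explained by each attribute on its own.
r_squared = {name: r ** 2 for name, r in r_with_usage.items()}

# Rank from strongest to weakest linear association (sign ignored).
ranked = sorted(r_with_usage, key=lambda name: abs(r_with_usage[name]), reverse=True)

for name in ranked:
    print(f"{name}: r = {r_with_usage[name]:+.3f}, r^2 = {r_squared[name]:.3f}")
```

The first two entries of the ranking are X4 and X3, and the last is X5.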
c) Yes, it is possible because of the relatively high correlation coefficient between X7: salesforce image
and X1: manufacturer’s image (r = 0,788).
d) 1st step: remove X1 (p-value = 0,7679); 2nd step: remove X2 (p-value = 0,1318); 3rd step: remove X3
(p-value = 0,0656). “Best” multiple linear regression model:
Usage level = -5,008 + 4,026 x X4 (Delivery speed) + 3,737 x X5 (Price level) + 3,024 x X6 (Price flexibility) +
1,520 x X7 (Salesforce image)
Adj. R Square = 0,716. 71,6% of the variability observed in the usage level (X9) is explained by the
perception of delivery speed (X4), price level (X5), price flexibility (X6), and salesforce image (X7). The
higher the perception on these variables, the higher will be the usage level.
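To illustrate how the fitted equation is used for prediction, the sketch below evaluates it for one customer. The function name `predict_usage` and the ratings in the example are hypothetical and chosen only for illustration; the coefficients are those of the final model, rounded to three decimals.

```python
def predict_usage(x4: float, x5: float, x6: float, x7: float) -> float:
    """Predicted usage level (X9, in percent) from the final regression model.

    x4: delivery speed, x5: price level, x6: price flexibility,
    x7: salesforce image, each rated on the 0-10 scale.
    """
    return -5.008 + 4.026 * x4 + 3.737 * x5 + 3.024 * x6 + 1.520 * x7


# HYPOTHETICAL customer who rates HATCO 5 out of 10 on all four attributes.
baseline = predict_usage(5, 5, 5, 5)

# Effect of a one-point improvement in perceived delivery speed,
# holding the other ratings fixed: equal to the X4 coefficient.
improved = predict_usage(6, 5, 5, 5)

print(f"baseline = {baseline:.3f}, after +1 on X4 = {improved:.3f}")
```

Because the model is linear, each coefficient is the predicted change in usage for a one-point change in that rating with the other three held fixed.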
SUMMARY OUTPUT

Regression Statistics
Multiple R           0,853064625
R Square             0,727719255
Adjusted R Square    0,716254803
Standard Error       4,788114317
Observations         100

ANOVA
              df    SS             MS             F            Significance F
Regression    4     5821,026322    1455,256581    63,476146    5,18021E-26
Residual      95    2177,973678    22,92603871
Total         99    7999

             Coefficients    Standard Error    t Stat         P-value        Lower 95%       Upper 95%
Intercept    -5,007988001    4,013694059       -1,24772539    0,215198465    -12,97617246    2,960196455
X4           4,02630684      0,435369282       9,248026928    6,69187E-15    3,161990156     4,890623525
X5           3,737187823     0,477028412       7,834308667    6,71555E-12    2,790167367     4,684208279
X6           3,023782061     0,436075138       6,934084968    4,91977E-10    2,158064075     3,889500047
X7           1,520263504     0,643138137       2,363821108    0,020123693    0,243473786     2,797053222
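The summary figures in the regression output are tied together by simple identities, which gives a quick consistency check. The sketch below recomputes R Square, Adjusted R Square and F from the sums of squares and degrees of freedom in the ANOVA table; it does not refit the model.

```python
# Values from the ANOVA table (decimal commas written as points).
ss_regression = 5821.026322
ss_residual = 2177.973678
n_obs = 100
k = 4   # predictors kept in the final model: X4, X5, X6, X7

ss_total = ss_regression + ss_residual

# R Square: share of total variability explained by the model.
r_square = ss_regression / ss_total

# Adjusted R Square penalizes for the number of predictors.
adj_r_square = 1 - (1 - r_square) * (n_obs - 1) / (n_obs - k - 1)

# F statistic: ratio of the regression and residual mean squares.
f_stat = (ss_regression / k) / (ss_residual / (n_obs - k - 1))

print(f"R2 = {r_square:.6f}, adjusted R2 = {adj_r_square:.6f}, F = {f_stat:.4f}")
```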