Solutions to Autumn 1999 exam
Business Statistics
Question 1
a.
\( \bar{x} = 9.382 \) days
\( s = 3.998 \) days
b.
Median = Average of 10th and 11th data values
= (8.45 + 8.58)/2
= 8.515 days
Mode : there is no mode
c.
Plant A processing times are skewed in the positive direction. This can be
verified by:
- mean > median. This is consistent with a positively skewed distribution.
- the boxplot. The upper tail (positive direction) is twice as long as the lower
  tail (negative direction), with a moderate outlier in the positive direction.
- the boxplot again. The median is closer to the lower quartile than it is to
  the upper quartile.
Plant B is slightly positively skewed, although it could also be argued that it is
almost symmetric. This can be verified by:
- median > mean, but the two are approximately equal, which is consistent with
  an approximately symmetric distribution.
- the measure of skewness from the descriptive statistics. This value is 0.80,
  which indicates a slight positive skewness.
- the boxplot. If we extend the upper tail to the moderate outlier, the upper
  tail is longer than the lower tail. This is consistent with a positively
  skewed distribution. Note that we could also argue that the tails are of
  similar length, which is consistent with an approximately symmetric
  distribution.
d.
\( \bar{x}_A = 9.382 \), \( s_A = 3.998 \), median A = 8.515
\( \bar{x}_B = 11.35 \), \( s_B = 5.13 \), median B = 11.96
The mean processing time at plant A (9.382) is shorter than the mean
processing time at plant B (11.35).
The processing times at plant A are less variable (s = 3.998) than those for plant B
(s = 5.13).
This extra variability in processing times is also evident from the boxplots.
The median processing time at plant A (8.515) is also shorter than the median
processing time at plant B (11.96).
e.
#N/A means not available. There is no mode processing time for plant B, just
as there was none for plant A, as no times occur more than once.
f.
\( s_{\bar{x}} = \dfrac{s}{\sqrt{n}} = \dfrac{5.13}{\sqrt{20}} = 1.15 \)
This is the standard error shown in the table.
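The standard error calculation in part f. can be reproduced in a few lines. This is a minimal sketch using the plant B values quoted above; the function name `standard_error` is illustrative only.

```python
import math


def standard_error(s: float, n: int) -> float:
    """Standard error of the sample mean: s / sqrt(n)."""
    return s / math.sqrt(n)


# Plant B: s = 5.13 days, n = 20 observations
se_b = standard_error(5.13, 20)
print(round(se_b, 2))  # 1.15
```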
Question 2
a.
Let H = event the subscriber owns a house
C = event the subscriber owns a car
Therefore
\( P(H) = 0.6 \), \( P(\bar{H}) = 0.4 \)
\( P(C) = 0.75 \), \( P(\bar{C}) = 0.25 \)
\( P(C \mid H) = 0.9 \)
i.
\( P(C \cap H) = P(C \mid H)\,P(H) \)
\( = (0.9)(0.6) \)
\( = 0.54 \)
ii.
\( P(C \cup H) = P(H) + P(C) - P(C \cap H) \)
\( = 0.6 + 0.75 - 0.54 \)
\( = 0.81 \)
iii.
\( P(\bar{C} \cap \bar{H}) = 1 - P(C \cup H) \)
\( = 1 - 0.81 \)
\( = 0.19 \)
b.
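Let X = life of lamp, therefore \( X \sim N(3500, 200^2) \)
i.
\( P(X > 4000) = P\left( Z > \dfrac{4000 - 3500}{200} \right) \)
\( = P(Z > 2.5) \)
\( = 0.5 - 0.4938 \)
\( = 0.0062 \)
[Figure: normal curve for X with 3500 and 4000 marked on the horizontal axis.]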
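The tail probability in part b. i. can be checked without tables. This is a sketch using `statistics.NormalDist` from the Python standard library, assuming the N(3500, 200²) model stated in the solution; the variable names are illustrative.

```python
from statistics import NormalDist

# Lamp life in hours, modelled as normal with mean 3500 and sd 200
lamp_life = NormalDist(mu=3500, sigma=200)

z = (4000 - 3500) / 200                 # standardised value, 2.5
p_over_4000 = 1 - lamp_life.cdf(4000)   # upper-tail probability

print(z, round(p_over_4000, 4))  # 2.5 0.0062
```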
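ii.
\( P(X < A) = 0.03 \)
[Figure: normal curve for X with mean 3500; the area below A is 0.03 and the area between A and 3500 is 0.47. The matching standard normal curve has area 0.03 below a and area 0.47 between a and 0.]
Therefore a = -1.88 and hence
A = 3500 - 1.88(200)
= 3124
Therefore, managers should advertise 3124 hours as the life of the lamps.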
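The cut-off in part b. ii. is an inverse normal lookup. The sketch below uses `NormalDist.inv_cdf` from the standard library in place of the tables; the small difference from -1.88 comes from table rounding. Variable names are illustrative.

```python
from statistics import NormalDist

# z value with 3% of the standard normal distribution below it
z_a = NormalDist().inv_cdf(0.03)

# Same percentile on the lamp-life scale, X ~ N(3500, 200^2)
a_hours = NormalDist(mu=3500, sigma=200).inv_cdf(0.03)

print(round(z_a, 2), round(a_hours))  # -1.88 3124
```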
c.
To determine which type of probability distribution we have here you need to
ask the following questions:
- Is there a fixed number of trials? Yes
- Are only two outcomes possible? Yes
- Do we have information on the average number of occurrences? No
Therefore the distribution of the number of doctors that recommend the
product follows a binomial distribution where:
n = 20
X = no. doctors that recommend the product = 0, 1, 2, 3, . . . , 20
p = probability doctor recommends the product = 0.4
Wherever possible, binomial probabilities should be determined from the
binomial tables in the Appendix at the rear of the text. We use Excel for
determining binomial probabilities not found in these tables.
i.
\( P(X = 2) = P(X \le 2) - P(X \le 1) \)
\( = 0.004 - 0.001 \)
\( = 0.003 \)
ii.
\( P(X \le 2) = 0.004 \)
iii.
No. Given the claim is true, the probability that only 2 out of 20
doctors would recommend the product is only 0.003, i.e. 3/1000. This is
a very small probability, therefore it is more likely that the claim is not
true.
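The binomial probabilities in part c. can also be computed directly rather than read from tables. This is a minimal sketch with n = 20 and p = 0.4 as stated in the solution; `binom_pmf` and `binom_cdf` are hypothetical helper names, not functions from any library.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k), summing the pmf from 0 to k."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))


n, p = 20, 0.4
p_exactly_2 = binom_pmf(2, n, p)
p_at_most_2 = binom_cdf(2, n, p)

print(round(p_exactly_2, 3), round(p_at_most_2, 3))  # 0.003 0.004
```

To three decimal places these agree with the table values used in parts i. and ii.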
Question 3.
a.
i.
\( 99\%\ CI(\mu) = \bar{X} \pm Z_{\alpha/2}\,\sigma/\sqrt{n} \)
\( = 75\,000 \pm (2.58)(20\,000)/\sqrt{100} \)
\( = 75\,000 \pm 5160 \)
Therefore the mean after tax profit for all retailers lies between $69 840 and
$80 160.
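The interval in part a. i. can be recomputed with an exact critical value. This sketch assumes the figures used in the solution (sample mean 75 000, sigma 20 000, n = 100); because it takes z from `NormalDist.inv_cdf` rather than the rounded table value 2.58, the margin comes out slightly below 5160.

```python
import math
from statistics import NormalDist

x_bar, sigma, n, conf_level = 75_000, 20_000, 100, 0.99

# Two-sided critical value: about 2.576 (tables round this to 2.58)
z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)

margin = z * sigma / math.sqrt(n)
lower, upper = x_bar - margin, x_bar + margin

print(round(z, 3), round(margin), round(lower), round(upper))
```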
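ii.
\( 1 - \alpha = 0.99 \), so \( \alpha = 0.01 \); \( Z_{0.005} = 2.58 \), \( \sigma = 20\,000 \), \( B = 4000 \)
\( n = \left( \dfrac{Z_{\alpha/2}\,\sigma}{B} \right)^2 \)
\( = \left( \dfrac{(2.58)(20\,000)}{4000} \right)^2 \)
\( = 166.41 \)
\( \approx 167 \)
Therefore, 67 additional retailers would need to be surveyed.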
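The sample size in part a. ii. is rounded up, never to the nearest integer, because a smaller sample would not meet the error bound. A short sketch of that step, using the table value 2.58 as in the solution and the 100 retailers already surveyed; variable names are illustrative.

```python
import math

z, sigma, bound = 2.58, 20_000, 4_000
already_surveyed = 100

n_exact = (z * sigma / bound) ** 2    # about 166.41
n_required = math.ceil(n_exact)       # round up to the next whole retailer
additional = n_required - already_surveyed

print(n_required, additional)  # 167 67
```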
b.
\( H_0 : p = 0.001 \) (error rate is 0.1%)
\( H_A : p < 0.001 \) (error rate is less than 0.1%)
\( np = (10\,000)(0.001) = 10 \), \( nq = (10\,000)(0.999) = 9990 \); both these values are \( \ge 5 \), therefore the
binomial can be approximated by the normal.
The correct test statistic is \( Z = \dfrac{\hat{p} - p}{\sqrt{pq/n}} \)
\( \alpha = 0.01 \), therefore \( Z_{0.01} = 2.33 \)
Reject \( H_0 \) if \( Z_{sample} < -2.33 \)
[Figure: standard normal curve with a lower-tail rejection region of area 0.01 below -2.33.]
\( \sigma_{\hat{p}} = \sqrt{\dfrac{(0.001)(0.999)}{10\,000}} = 0.00032 \)
\( Z_{sample} = \dfrac{\hat{p} - p}{\sqrt{pq/n}} = \dfrac{0.0003 - 0.001}{\sqrt{(0.001)(0.999)/10\,000}} = -2.21 \)
Since \( -2.21 > -2.33 \) we do not reject \( H_0 \).
There is insufficient evidence at \( \alpha = 0.01 \) to conclude that the error rate is less
than 0.1%.
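The lower-tail test in Question 3 b. can be sketched as follows, assuming the values used in the solution (hypothesised rate 0.001, sample proportion 0.0003, n = 10 000, alpha = 0.01). The critical value is taken from `NormalDist.inv_cdf`, so it is about -2.326 rather than the table value -2.33; the decision is the same.

```python
import math
from statistics import NormalDist

p0, p_hat, n, alpha = 0.001, 0.0003, 10_000, 0.01

# Normal approximation is reasonable when n*p0 and n*(1 - p0) are both >= 5
approx_ok = n * p0 >= 5 and n * (1 - p0) >= 5

std_err = math.sqrt(p0 * (1 - p0) / n)   # standard error under H0
z_sample = (p_hat - p0) / std_err        # test statistic
z_crit = NormalDist().inv_cdf(alpha)     # lower-tail critical value

reject_h0 = z_sample < z_crit
print(approx_ok, round(z_sample, 2), round(z_crit, 2), reject_h0)
# True -2.21 -2.33 False
```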
Question 4
a.
i.
From the scatterplot it appears that there is a positive linear
relationship between trade executions and the number of incoming
calls. As the number of calls increases, the number of trade executions
also increases.
ii.
\( \hat{y} = -63.02 + 0.19x \)
iii.
Slope coefficient = 0.19
This implies that for each extra incoming call, the number of trade executions
increases by 0.19, i.e. for each extra 100 calls, there are an extra 19
trade executions.
iv.
\( \hat{y} = -63.02 + (0.19)(2000) \)
\( = 316.98 \)
\( \approx 317 \) trade executions
v.
No. Since we have no data showing the relationship between number
of calls and number of trade executions, past approximately 2600 calls,
we cannot reliably use this regression equation to predict the number
of trade executions outside this range.
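Parts iv. and v. can be combined into one small function that predicts only inside the observed range. This is a sketch under two assumptions: the intercept is taken as -63.02, the sign that reproduces the 316.98 prediction in part iv., and the upper limit of about 2600 calls comes from part v. The lower end of the data is not given in the solution, so it is not checked. The names are hypothetical.

```python
INTERCEPT = -63.02          # assumed sign, consistent with the 316.98 result
SLOPE = 0.19                # extra trade executions per extra incoming call
MAX_OBSERVED_CALLS = 2600   # approximate upper end of the data (part v.)


def predict_trades(calls: float) -> float:
    """Predicted trade executions for a given number of incoming calls."""
    if calls > MAX_OBSERVED_CALLS:
        # Beyond the data the fitted line has not been validated
        raise ValueError("calls is outside the observed range; do not extrapolate")
    return INTERCEPT + SLOPE * calls


print(round(predict_trades(2000), 2))  # 316.98
```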
vi.
\( \hat{y} \pm t_{\alpha/2,\,n-2}\; s \sqrt{\dfrac{1}{n} + \dfrac{(x_g - \bar{x})^2}{SS_x}} \) is the confidence interval estimator
\( t_{0.025,33} = 1.96 \), \( s = 29.42 \), \( SS_x = 1\,361\,738 \), \( x_g = 2000 \), \( n = 35 \), \( \bar{x} = 2156.66 \)
\( 95\%\ CI(y) = 316.98 \pm (1.96)(29.42) \sqrt{\dfrac{1}{35} + \dfrac{(2000 - 2156.66)^2}{1\,361\,738}} \)
\( = 316.98 \pm 12.45 \)
Therefore the 95% CI estimate for the number of trades executed on days
when there are 2000 incoming calls is between 304.53 and 329.43.
b.
omit
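The half-width of the interval in Question 4 a. vi. can be checked numerically. This is a sketch assuming the summary values listed in that part, with the sample mean of x taken as 2156.66 (the value that reproduces a half-width of about 12.4) and the critical value 1.96 used in the solution; a t table with 33 degrees of freedom would give a slightly larger value. Variable names are illustrative.

```python
import math

y_hat = 316.98     # predicted trade executions at x_g = 2000
t_crit = 1.96      # critical value as used in the solution (an assumption here)
s = 29.42          # standard error of estimate
ss_x = 1_361_738   # sum of squared deviations of x
x_g, x_bar, n = 2000, 2156.66, 35

half_width = t_crit * s * math.sqrt(1 / n + (x_g - x_bar) ** 2 / ss_x)
lower, upper = y_hat - half_width, y_hat + half_width

print(round(half_width, 2), round(lower, 2), round(upper, 2))
```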
Part B
1.
B.
\( CV = \dfrac{s}{\bar{x}} = \dfrac{4.06}{5} = 0.812 \)
2.
C.
A is categorical data and hence nominal.
B is categorical data and hence nominal.
C is quantitative data with an absolute zero and hence ratio.
D is categorical data and hence nominal.
E is quantitative data with order implied but no absolute zero, hence ordinal.
3.
D.
Variance uses all data values in its calculation and hence will be affected
significantly by extreme values.
Range is the difference between the highest and lowest values and hence will
be affected significantly by extreme values.
Median is a measure of central tendency, not a measure of variability.
Interquartile range does not use the highest or lowest values in its calculation
hence, will not be significantly affected by extreme values.
Standard deviation is simply the square root of the variance. Refer comments
on variance.
4.
D.
The labelling used along the x-axis in graph A. is incorrect. The upper limit of
each class has been plotted at the midpoint of each column.
The labelling used along the x-axis in graphs B. and C. is inappropriate. This
method should only be used to label a bar graph.
Graph D. is correct as it is a histogram with the midpoint of each class plotted
at the midpoint of each column.
Graph E. has the incorrect midpoint of each class plotted along the x-axis.
5.
B.
The variance is in the data units squared.
6.
A.
A Poisson problem with X = number of phone calls, \( \mu = 10 \) phone calls per 2
hours.
\( P(X \ge 3) = P(X = 3, 4, 5, \ldots) \)
\( = 1 - P(X \le 2) \)
\( = 1 - 0.003 \)
\( = 0.997 \)
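The Poisson probability in Question 6 can be computed directly from the probability function instead of the tables. This is a minimal sketch with a mean of 10 calls per 2 hours as stated above; `poisson_pmf` and `poisson_cdf` are hypothetical helper names.

```python
import math


def poisson_pmf(k: int, mu: float) -> float:
    """P(X = k) for a Poisson variable with mean mu."""
    return math.exp(-mu) * mu**k / math.factorial(k)


def poisson_cdf(k: int, mu: float) -> float:
    """P(X <= k), summing the pmf from 0 to k."""
    return sum(poisson_pmf(i, mu) for i in range(k + 1))


# P(X >= 3) = 1 - P(X <= 2) with mu = 10
p_at_least_3 = 1 - poisson_cdf(2, 10)
print(round(p_at_least_3, 3))  # 0.997
```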
7.
D.
\( \mu = 5 \) phone calls per hour
\( P(X = 12) = P(X \le 12) - P(X \le 11) \)
\( = 0.998 - 0.995 \)
\( = 0.003 \)
8.
E.
\( P(Z > 1.84) = 0.5 - 0.4671 \)
\( = 0.0329 \)
[Figure: standard normal curve with 0 and 1.84 marked on the Z axis.]
9.
E.
\( P(A \cap B) = P(B \mid A)\,P(A) \)
\( = (0.75)(0.4) \)
\( = 0.3 \)
10.
E.
Since P(H) = P(T) = 0.5 for a fair coin, then
\( P(THTHHT) = (0.5)^6 \) and this probability is the same for each sequence.
11.
C.
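\( f(x) = \dfrac{1}{210} \)
\( P(X > 165) = 45 \times \dfrac{1}{210} \)
\( = 0.2143 \)
\( = 21.43\% \)
[Figure: uniform density f(x) = 1/210 from 0 to 210 secs, with 165 marked on the x axis.]
12.
D.
\( P(\bar{X} < 57.95) = P\left( Z < \dfrac{57.95 - 53}{21/\sqrt{49}} \right) \)
\( = P(Z < 1.65) \)
\( = 0.5 + 0.4505 \)
\( = 0.9505 \)
[Figure: standard normal curve with 0 and 1.65 marked on the Z axis; areas 0.5 and 0.4505.]
13.
D.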
Here we have \( \sigma \) unknown but estimated by s, therefore \( s/\sqrt{n} = 2/\sqrt{25} = 0.4 \)
n = 25, therefore n - 1 = 24; \( \alpha = 0.1 \), therefore \( \alpha/2 = 0.05 \)
14.
B.
\( s_{\bar{x}} = \dfrac{s}{\sqrt{n}} = 20 \)
\( \dfrac{s^2}{n} = 20^2 \)
\( n = \dfrac{40\,000}{20^2} = 100 \)
15.
C.
16.
E.
17.
A.
\( R^2 = 0.9351 \)
\( r = \sqrt{0.9351} \)
\( = 0.97 \)
18.
omit
19.
omit
20.
omit