Solutions to Autumn 1999 exam
Business Statistics

Question 1

a. \(\bar{x} = 9.382\) days, \(s = 3.998\) days

b. Median = average of the 10th and 11th data values = (8.45 + 8.58)/2 = 8.515 days
Mode: there is no mode.

c. Plant A processing times are skewed in the positive direction. This can be verified by:
- The mean and median: mean > median, which is consistent with a positively skewed distribution.
- The boxplot: the upper tail (positive direction) is twice as long as the lower tail (negative direction), with a moderate outlier in the positive direction.
- The boxplot again: the median is closer to the lower quartile than it is to the upper quartile.

Plant B is slightly positively skewed, although it could also be argued that it is almost symmetric. This can be verified by:
- The mean and median: median > mean, or the median is approximately equal to the mean.
- The measure of skewness from the descriptive statistics: this value is 0.80, which indicates a slight positive skewness.
- The boxplot: if we extend the upper tail to the moderate outlier, the upper tail is longer than the lower tail, which is consistent with a positively skewed distribution. Note that we could also argue that the tails are of similar length, which is consistent with an approximately symmetric distribution.

d. Plant A: \(\bar{x}_A = 9.382\), \(s_A = 3.998\), median A = 8.515
Plant B: \(\bar{x}_B = 11.35\), \(s_B = 5.13\), median B = 11.96
- The mean processing time at plant A (9.382) is shorter than the mean processing time at plant B (11.35).
- The processing times at plant A are less variable (3.998) than those at plant B (5.13). This extra variability in processing times is also evident from the boxplots.
- The median processing time at plant A (8.515) is also shorter than the median processing time at plant B (11.96).

e. #N/A means not available. There is no mode processing time for plant B, just as there was none for plant A, as no times occur more than once.

f. \(s_{\bar{x}} = \dfrac{s}{\sqrt{n}} = \dfrac{5.13}{\sqrt{20}} = 1.15\)
This is the standard error shown in the table.

Question 2

a.
Let H = event the subscriber owns a house
C = event the subscriber owns a car
Therefore \(P(H) = 0.6\), \(P(\bar{H}) = 0.4\), \(P(C) = 0.75\), \(P(\bar{C}) = 0.25\), \(P(C \mid H) = 0.9\)

i. \(P(C \cap H) = P(C \mid H) \cdot P(H) = (0.9)(0.6) = 0.54\)

ii. \(P(C \cup H) = P(H) + P(C) - P(C \cap H) = 0.6 + 0.75 - 0.54 = 0.81\)

iii. \(P(\bar{C} \cap \bar{H}) = 1 - P(C \cup H) = 1 - 0.81 = 0.19\)

b. Let X = life of lamp, therefore \(X \sim N(3500, 200^2)\)

i. \(P(X > 4000) = P\left(Z > \dfrac{4000 - 3500}{200}\right) = P(Z > 2.5) = 0.5 - 0.4938 = 0.0062\)

ii. \(P(X < A) = 0.03\)
The area under the standard normal curve between \(a\) and 0 is \(0.5 - 0.03 = 0.47\). Therefore \(a = -1.88\), and hence
A = 3500 - 1.88(200) = 3124
Therefore, managers should advertise 3124 hours as the life of the lamps.

c. To determine which type of probability distribution we have here, you need to ask the following questions:
- Are there a fixed number of trials? Yes
- Are only two outcomes possible? Yes
- Do we have information on the average number of occurrences? No

Therefore the number of doctors that recommend the product follows a binomial distribution where:
n = 20
X = number of doctors that recommend the product = 0, 1, 2, 3, ..., 20
p = probability a doctor recommends the product = 0.4

Wherever possible, binomial probabilities should be determined from the binomial tables in the Appendix at the rear of the text. We use Excel for determining binomial probabilities not found in these tables.

i. \(P(X = 2) = P(X \le 2) - P(X \le 1) = 0.004 - 0.001 = 0.003\)

ii. \(P(X \le 2) = 0.004\)

iii. No. Given the claim is true, the probability that only 2 out of 20 doctors would recommend the product is only 0.003, i.e. 3/1000. This is a very small probability, therefore it is more likely that the claim is not true.

Question 3

a.
i. \(99\%\ \text{CI}(\mu) = \bar{X} \pm Z_{\alpha/2}\,\sigma/\sqrt{n} = 75\,000 \pm (2.58)(20\,000)/\sqrt{100} = 75\,000 \pm 5160\)
Therefore the mean after-tax profit for all retailers lies between $69 840 and $80 160.

ii. \(1 - \alpha = 0.99\), \(\alpha = 0.01\), \(Z_{0.005} = 2.58\), \(\sigma = 20\,000\), \(B = 4000\)
\(n = \left(\dfrac{Z_{\alpha/2}\,\sigma}{B}\right)^2 = \left(\dfrac{(2.58)(20\,000)}{4000}\right)^2 = 166.41 \approx 167\)
Therefore, 67 additional retailers would need to be surveyed.

b.
\(H_0: p = 0.001\) (error rate is 0.1%)
\(H_A: p < 0.001\) (error rate is less than 0.1%)

\(np = (10\,000)(0.001) = 10\), \(nq = (10\,000)(0.999) = 9990\)
Both these values are \(\ge 5\), therefore the binomial can be approximated by the normal.

The correct test statistic is \(Z = \dfrac{\hat{p} - p}{\sqrt{pq/n}}\)

\(\alpha = 0.01\), therefore \(Z_{\alpha} = Z_{0.01} = 2.33\)
Reject \(H_0\) if \(Z_{\text{sample}} < -2.33\)

\(Z_{\text{sample}} = \dfrac{\hat{p} - p}{\sqrt{pq/n}} = \dfrac{0.0003 - 0.001}{\sqrt{(0.001)(0.999)/10\,000}} = \dfrac{-0.0007}{0.00032} = -2.21\)

Since \(-2.21 > -2.33\), we do not reject \(H_0\).
There is insufficient evidence at \(\alpha = 0.01\) to conclude that the error rate is less than 0.1%.

Question 4

a.
i. From the scatterplot it appears that there is a positive linear relationship between trade executions and the number of incoming calls. As the number of calls increases, the number of trade executions also increases.

ii. \(\hat{y} = -63.02 + 0.19x\)

iii. Slope coefficient = 0.19
This implies that for each extra incoming call, the number of trade executions increases by 0.19, i.e. for each extra 100 calls, there are an extra 19 trade executions.

iv. \(\hat{y} = -63.02 + (0.19)(2000) = 316.98 \approx 317\) trade executions

v. No. Since we have no data showing the relationship between number of calls and number of trade executions past approximately 2600 calls, we cannot reliably use this regression equation to predict the number of trade executions outside this range.

vi. The confidence interval estimator is
\[\hat{y} \pm t_{\alpha/2,\,n-2}\; s_{\varepsilon} \sqrt{\frac{1}{n} + \frac{(x_g - \bar{x})^2}{SS_x}}\]
with \(t_{0.025,33} \approx 1.96\), \(s_{\varepsilon} = 29.42\), \(SS_x = 1\,361\,738\), \(x_g = 2000\), \(n = 35\), \(\bar{x} = 2156.66\).
\[95\%\ \text{CI}(\mu_y) = 316.98 \pm (1.96)(29.42)\sqrt{\frac{1}{35} + \frac{(2000 - 2156.66)^2}{1\,361\,738}} = 316.98 \pm 12.45\]
Therefore the 95% CI estimate for the mean number of trades executed on days when there are 2000 incoming calls is between 304.53 and 329.43.

b. omit

Part B

1. B.
\(CV = \dfrac{s}{\bar{x}} = \dfrac{4.06}{5} = 0.812\)

2. C.
- A is categorical data and hence nominal.
- B is categorical data and hence nominal.
- C is quantitative data with an absolute zero and hence ratio.
- D is categorical data and hence nominal.
- E is quantitative data with order implied but no absolute zero, hence ordinal.

3. D.
- Variance uses all data values in its calculation and hence will be affected significantly by extreme values.
- Range is the difference between the highest and lowest values and hence will be affected significantly by extreme values.
- Median is a measure of central tendency, not a measure of dispersion.
- Interquartile range does not use the highest or lowest values in its calculation, hence it will not be significantly affected by extreme values.
- Standard deviation is simply the square root of the variance. Refer to the comments on variance.

4. D.
- The labelling used along the x-axis in graph A is incorrect. The upper limit of each class has been plotted at the midpoint of each column.
- The labelling used along the x-axis in graphs B and C is inappropriate. This method should only be used to label a bar graph.
- Graph D is correct, as it is a histogram with the midpoint of each class plotted at the midpoint of each column.
- Graph E has the incorrect midpoint of each class plotted along the x-axis.

5. B.
The variance is in the data units squared.

6. A.
A Poisson problem with X = number of phone calls, \(\mu = 10\) phone calls per 2 hours.
\(P(X \ge 3) = P(X = 3, 4, 5, \ldots) = 1 - P(X \le 2) = 1 - 0.003 = 0.997\)

7. D.
\(\mu = 5\) phone calls per hour
\(P(X = 12) = P(X \le 12) - P(X \le 11) = 0.998 - 0.995 = 0.003\)

8. E.
\(P(Z > 1.84) = 0.5 - 0.4671 = 0.0329\)

9. E.
\(P(A \cap B) = P(B \mid A) \cdot P(A) = (0.75)(0.4) = 0.3\)

10. E.
Since P(H) = P(T) = 0.5 for a fair coin, \(P(THTHHT) = (0.5)^6\), and this probability is the same for each sequence.

11. C.
X is uniformly distributed between 0 and 210 seconds, so \(f(x) = 1/210\).
\(P(X > 165) = 45 \times \dfrac{1}{210} = 0.2143 = 21.43\%\)

12. D.
\(P(\bar{X} < 57.95) = P\left(Z < \dfrac{57.95 - 53}{21/\sqrt{49}}\right) = P(Z < 1.65) = 0.5 + 0.4505 = 0.9505\)

13. D.
Here \(\sigma\) is unknown but estimated by s, therefore \(s/\sqrt{n} = 2/\sqrt{25} = 0.4\).
n = 25, therefore n - 1 = 24; \(\alpha = 0.1\), therefore \(\alpha/2 = 0.05\).

14. B.
\(s_{\bar{x}} = \dfrac{s}{\sqrt{n}} = 20 \;\Rightarrow\; \dfrac{s^2}{n} = 20^2 \;\Rightarrow\; \dfrac{40\,000}{n} = 20^2 \;\Rightarrow\; n = \dfrac{40\,000}{20^2} = 100\)

15. C.

16. E.

17. A.
\(R^2 = 0.9351\), so \(r = \sqrt{0.9351} = 0.97\)

18. omit

19. omit

20. omit
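The table look-ups behind the Part B answers to questions 6, 7, 8, 11, 12 and 17 can be cross-checked numerically. The following is a minimal sketch using only Python's standard library; the helper names `poisson_cdf` and `normal_cdf` are illustrative and are not part of the exam or the textbook, and the results agree with the tabled values only up to table rounding.

```python
import math


def poisson_cdf(k, mu):
    """Return P(X <= k) for a Poisson variable with mean mu (hypothetical helper)."""
    return sum(math.exp(-mu) * mu**i / math.factorial(i) for i in range(k + 1))


def normal_cdf(z):
    """Return P(Z <= z) for a standard normal variable (hypothetical helper)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Question 6: mean of 10 calls per 2 hours, P(X >= 3) = 1 - P(X <= 2)
q6 = 1 - poisson_cdf(2, 10)

# Question 7: mean of 5 calls per hour, P(X = 12) = P(X <= 12) - P(X <= 11)
q7 = poisson_cdf(12, 5) - poisson_cdf(11, 5)

# Question 8: upper-tail area P(Z > 1.84)
q8 = 1 - normal_cdf(1.84)

# Question 11: uniform on 0 to 210 seconds, P(X > 165) = (210 - 165) / 210
q11 = (210 - 165) / 210

# Question 12: sample mean with mu = 53, sigma = 21, n = 49
q12 = normal_cdf((57.95 - 53) / (21 / math.sqrt(49)))

# Question 17: correlation coefficient from R^2 (positive slope assumed)
q17 = math.sqrt(0.9351)

for label, value in [("Q6", q6), ("Q7", q7), ("Q8", q8),
                     ("Q11", q11), ("Q12", q12), ("Q17", q17)]:
    print(f"{label}: {value:.4f}")
```

Small differences in the third or fourth decimal place relative to the worked answers are expected, because the exam solutions subtract probabilities that were already rounded in the tables.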