Question 2
(a) Mode = 4 hours
Mean = 6.9 hours (to 1 dp)
Ordered array: 3, 4, 4, 5, 8, 9, 10, 12
There are 8 values so the median = 4½th value in order, i.e. (5 + 8)/2 = 6.5 hours
Median = 6.5 hours
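The three averages in part (a) can be checked with a minimal sketch using Python's standard `statistics` module. The variable names (`hours`, `modes`, `avg`, `mid`) are illustrative and not part of the original solution.

```python
from statistics import mean, median, multimode

# Ordered array from part (a)
hours = [3, 4, 4, 5, 8, 9, 10, 12]

modes = multimode(hours)  # list of the most frequent value(s)
avg = mean(hours)         # 55 / 8
mid = median(hours)       # average of the 4th and 5th ordered values

print("Mode:", modes)
print("Mean (1 dp):", round(avg, 1))
print("Median:", mid)
```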
(b) Range:
Max.=12
Min.=3
Range = 12 - 3 = 9
Lower quartile = n/4th value in order ie, 8/4 (2nd) value = 4
Upper quartile = 3n/4th value in order ie, (3*8)/4 (6th) value = 9
Inter quartile range = 9 - 4 = 5
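$\sum x = 55$
$\sum x^2 = 455$
Population variance:
$$\sigma^2 = \frac{\sum x^2}{n} - \left(\frac{\sum x}{n}\right)^2 = \frac{455}{8} - \left(\frac{55}{8}\right)^2 = 56.875 - 6.875^2 = 9.609375$$
The data are from a sample, hence the correction factor is required:
$$s^2 = \frac{n}{n-1}\,\sigma^2 = \frac{8}{7} \times 9.609375 = 10.982143$$
$$s = \sqrt{10.982143} = 3.314 \text{ (to 3 dp)}$$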
(c) Mode = 4 hours
Mean = (55+30)/9 = 85/9 = 9.4 hours (to 1 dp)
Median = 5th value in order = 8 hours
Range: Max.=30
Min.=3
Range = 30 - 3 = 27
Inter quartile range:
Lower quartile = (n+1)/4th value in order, i.e. 10/4 (2.5th) value = (4 + 4)/2 = 4
Upper quartile = 3(n+1)/4th value in order, i.e. (3*10)/4 (7.5th) value = (10 + 12)/2 = 11
Inter quartile range = 11 - 4 = 7
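The (n+1)/4 position rule used in part (c) can be sketched as below. The helper `value_at_position` is a hypothetical name, and it assumes that a fractional position is resolved by linear interpolation between the two neighbouring ordered values.

```python
def value_at_position(ordered, pos):
    """Value at a 1-based, possibly fractional, position in an ordered list."""
    lower = int(pos)      # whole part of the position
    frac = pos - lower    # fractional part, e.g. 0.5 for the 2.5th value
    if frac == 0:
        return ordered[lower - 1]
    # Interpolate between the lower-th and (lower + 1)-th ordered values
    return ordered[lower - 1] + frac * (ordered[lower] - ordered[lower - 1])

hours = sorted([3, 4, 4, 5, 8, 9, 10, 12, 30])
n = len(hours)

q1 = value_at_position(hours, (n + 1) / 4)      # 2.5th value
q3 = value_at_position(hours, 3 * (n + 1) / 4)  # 7.5th value
iqr = q3 - q1

print(q1, q3, iqr)
```

Library routines such as `statistics.quantiles` use their own position conventions, so their results can differ slightly from this rule on other data sets.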
Standard deviation:
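$\sum x = 85$
$\sum x^2 = 1355$
Population variance:
$$\sigma^2 = \frac{\sum x^2}{n} - \left(\frac{\sum x}{n}\right)^2 = \frac{1355}{9} - \left(\frac{85}{9}\right)^2 = 61.358025$$
Again, the correction factor is required:
$$s^2 = \frac{n}{n-1}\,\sigma^2 = \frac{9}{8} \times 61.358025 = 69.027778$$
$$s = \sqrt{69.027778} = 8.308 \text{ (to 3 dp)}$$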
(d) In the revised data set the unusual value of 30 hours pulls the mean upwards (from
6.9 to 9.4 hours), so the median of 8 hours is the most appropriate measure of central
tendency to use. This would be partnered with the inter-quartile range.
Question 4
Maximum = 9, Minimum = 0, so Range = 9 – 0 = 9
Mode = 2 or 3 errors (both have the same frequency)
Hence the distribution is bi-modal.
There are 100 values so median = 50½th ordered value
Lower quartile = 25th in order (n/4 approximation OK here)
Upper quartile = 75th in order (3n/4 approximation OK here)
x    f     Position
0    15    1 to 15
1    18    16 to 33
2    19    34 to 52
3    19    53 to 71
4    10    72 to 81
5    8     82 to 89
6    7     90 to 96
7    2     97 to 98
8    1     99
9    1     100
Median = 2 errors
Lower Quartile = 1 error
Upper Quartile = 4 errors
Hence inter-quartile range = 4 – 1 = 3 errors
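The position lookup above can be expressed as a short sketch: build the cumulative frequencies, then find the first value of x whose cumulative frequency reaches the required position. The names `errors`, `days` and `value_at` are illustrative assumptions.

```python
from itertools import accumulate

errors = list(range(10))                      # x = number of errors
days = [15, 18, 19, 19, 10, 8, 7, 2, 1, 1]    # f = number of days

# Last ordered position occupied by each value of x
cumulative = list(accumulate(days))

def value_at(position):
    """Number of errors found at a 1-based ordered position."""
    for x, upper in zip(errors, cumulative):
        if position <= upper:
            return x
    raise ValueError("position exceeds the number of observations")

n = cumulative[-1]
# 50½th value: average of the 50th and 51st ordered values
median = (value_at(n // 2) + value_at(n // 2 + 1)) / 2
lower_q = value_at(n // 4)        # 25th value (n/4 approximation)
upper_q = value_at(3 * n // 4)    # 75th value (3n/4 approximation)

print(median, lower_q, upper_q, upper_q - lower_q)
```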
To find mean and standard deviation a table of calculations is required:
x = number of errors, f = number of days
x        f      fx           fx²
0        15     0*15 = 0     0²*15 = 0
1        18     1*18 = 18    1²*18 = 18
2        19     2*19 = 38    2²*19 = 76
3        19     57           171
4        10     40           160
5        8      40           200
6        7      42           252
7        2      14           98
8        1      8            64
9        1      9            81
Total    100    266          1120
Mean = fx/f = 266/100 = 2.66 errors
2 = fx2 - fx
n
n
2
= 1120 - 266
100 100
2
= 11.2 – 7.0756 = 4.1244
The data are from a sample so the correction factor must be applied:
s2 =
n
(n - 1)
s2 =
100
*
2
/99 * 4.1244 = 4.1660606
standard deviation (s) = 2.041 (to 3 dp)
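A minimal sketch of the frequency-table calculation follows, weighting each value of x by its frequency f. The variable names are assumptions chosen for illustration.

```python
from math import sqrt

errors = list(range(10))                      # x = number of errors
days = [15, 18, 19, 19, 10, 8, 7, 2, 1, 1]    # f = number of days

n = sum(days)                                           # sum of f
sum_fx = sum(f * x for x, f in zip(errors, days))       # sum of fx
sum_fx2 = sum(f * x * x for x, f in zip(errors, days))  # sum of fx squared

mean = sum_fx / n
pop_var = sum_fx2 / n - mean ** 2     # population variance
sample_var = n / (n - 1) * pop_var    # n/(n-1) correction for a sample
s = sqrt(sample_var)

print(n, sum_fx, sum_fx2, mean, round(pop_var, 4), round(s, 3))
```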
The data provided are discrete, so to display the distribution a bar chart can be used:
[Figure 3.1 Distribution of errors made: number of errors made per day by computer system over 100 days in auditing dept. Bar chart; horizontal axis: errors made per day (0 to 9); vertical axis: frequency (0 to 20).]
From the graph it is apparent that the data provided are quite skewed, so the most
appropriate descriptive measures to use are the median and inter-quartile range.
Question 6
(a) Firstly, the data provided indicate that no one has been with the company for more
than 50 years. At present the last category is open ended therefore 50 years can be used
as the upper limit. The categories are also uneven in size so in order to construct a
histogram, frequency densities are required. These are shown below:
Years service    Frequency    Class width    Frequency density
0 - < 5          105          5              105/5 = 21
5 - < 15         231          10             231/10 = 23.1
15 - < 25        173          10             173/10 = 17.3
25 - < 35        85           10             85/10 = 8.5
35 - < 50        31           15             31/15 = 2.07
The resulting histogram is:
[Figure 3.4 Histogram of years of service: number of years employees of JFS chemicals have been with the company. Horizontal axis: years of service (0 to 50); vertical axis: frequency density (0 to 25).]
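The frequency densities plotted in the histogram are each class frequency divided by its class width. A small sketch, assuming the open-ended last class is closed at 50 years as in part (a), with illustrative names:

```python
# (lower limit, upper limit, frequency) for each class of years of service
classes = [(0, 5, 105), (5, 15, 231), (15, 25, 173), (25, 35, 85), (35, 50, 31)]

# Frequency density = frequency / class width
densities = [freq / (upper - lower) for lower, upper, freq in classes]

# The modal class is the one with the highest density, not the highest frequency
modal_lower, modal_upper, _ = max(
    classes, key=lambda c: c[2] / (c[1] - c[0])
)

for (lower, upper, freq), d in zip(classes, densities):
    print(f"{lower} - < {upper}: {freq} / {upper - lower} = {d:.2f}")
print("Modal class:", modal_lower, "to less than", modal_upper)
```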
(b) The histogram indicates that the modal group is 5 to less than 15 years.
(c) Required calculations are:
Years service    Frequency (f)    Mid-pt (x)    fx                   fx²
0 - < 5          105              2.5           105*2.5 = 262.5      105*2.5² = 656.25
5 - < 15         231              10            2310                 23100
15 - < 25        173              20            3460                 69200
25 - < 35        85               30            2550                 76500
35 - < 50        31               42.5          1317.5               55993.75
Total            625                            9900                 225450
Mean = fx/f = 9900/625 = 15.84 years (to 2dp)
Population variance:
2 = fx2 - fx
n
n
2
= 225450 625
9900
625
2
= 360.72 – 250.9056 = 109.8144
Data relates to all employees of the company so they form a population.
Standard deviation = 109.8144 = 10.48 (to 2dp)
(d) In order to estimate the median and inter-quartile range for the data an ogive is
required:
[Figure 3.5 Ogive of years with company. Chart title as printed: "Ogive showing amount spent by 81 shoppers on luxury goods"; axes as printed: cumulative frequency against Amount (£).]
As there are 625 employees in the company the median position is (625+1)/2 = 313, the
lower quartile position is (625+1)/4 = 156½ and the upper quartile position is 3(625+1)/4 =
469½.
From the graph these positions relate to approximately 14 years, 7 years and 23 years
respectively.
Hence median = 14 years and inter-quartile range = 23 – 7 = 16 years
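The graphical readings can be approximated numerically. The sketch below assumes the ogive is drawn as straight lines joining the cumulative frequency at each upper class boundary, and interpolates along those lines; the function name `years_at` is hypothetical, and a hand-drawn curve may give slightly different readings.

```python
from bisect import bisect_left

# Class boundaries and the cumulative frequency reached at each boundary
bounds = [0, 5, 15, 25, 35, 50]
cum_freq = [0, 105, 336, 509, 594, 625]

def years_at(position):
    """Years of service at a given cumulative position, by linear interpolation."""
    i = bisect_left(cum_freq, position)  # first boundary whose cumulative freq >= position
    if i == 0:
        return float(bounds[0])
    x0, x1 = bounds[i - 1], bounds[i]
    c0, c1 = cum_freq[i - 1], cum_freq[i]
    return x0 + (position - c0) / (c1 - c0) * (x1 - x0)

n = cum_freq[-1]
median = years_at((n + 1) / 2)        # position 313
lower_q = years_at((n + 1) / 4)       # position 156.5
upper_q = years_at(3 * (n + 1) / 4)   # position 469.5

print(round(median), round(lower_q), round(upper_q), round(upper_q) - round(lower_q))
```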
(e) The histogram created in part (a) shows that the data are slightly skewed in shape, so
the median and inter-quartile range are the most appropriate measures of central tendency
and dispersion to use.