Measures of Dispersion
Week 4
Dispersion
• Two groups of three students, with marks:
Group 1: 4, 7, 10
Group 2: 7, 7, 7
• Mean mark
Group 1: (4 + 7 + 10)/3 = 21/3 = 7
Group 2: (7 + 7 + 7)/3 = 21/3 = 7
• Same mean mark, but Group 1’s marks are
widely spread, Group 2’s are all the same
• The following diagram reinforces this point
Range
• The absolute difference between the
highest and lowest value of the raw data
• Group of students: 4, 7, 10
• Range = Maximum – Minimum = 10 – 4 = 6
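The comparison of the two groups can be checked with a short Python sketch. The helper names `mean` and `value_range` are illustrative choices, not part of the slides.

```python
def mean(values):
    """Arithmetic mean: the sum of the values divided by how many there are."""
    return sum(values) / len(values)


def value_range(values):
    """Range: maximum minus minimum of the raw data."""
    return max(values) - min(values)


group_1 = [4, 7, 10]
group_2 = [7, 7, 7]

# Both groups share the same mean, but only Group 1 has any spread.
for name, marks in (("Group 1", group_1), ("Group 2", group_2)):
    print(name, "mean =", mean(marks), "range =", value_range(marks))
```

The range uses only the two extreme values, which is why the slides go on to measures that use more of the data.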
Interquartile Range
• This is the absolute difference between
the upper and lower quartiles of the
distribution.
• Interquartile Range =
Upper Quartile - Lower Quartile
• See the next slides for estimating
quartiles
Quartiles (1)
• Upper quartile: that value for which 25%
of the distribution is above it and 75%
below
• Lower quartile: that value for which 75%
of the distribution is above it and 25%
below
Quartiles (2)
• If the data is ungrouped, then put the data
in order in an array
• Find the quartile position, then estimate
its value, as previously for the median
• Upper quartile (Q3): position = 3(n + 1)/4
• Lower quartile (Q1): position = (n + 1)/4
Quartiles (3)
Example: ungrouped data:
3, 5, 6, 9, 15, 27, 30, 35, 37
• Lower quartile: position = (n + 1)/4 = (9 + 1)/4 = 2.5th
Lower quartile: value = 5.5
(mid-way between the 2nd and 3rd numbers in the array)
• Upper quartile: position = 3(n + 1)/4 = 3(9 + 1)/4 = 7.5th
Upper quartile: value = 32.5
(mid-way between the 7th and 8th numbers in the array)
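The position method for ungrouped data can be sketched in Python as follows. The function names are hypothetical. The slides only show the mid-way case (a position ending in .5), so using linear interpolation for other fractional positions is an assumption made here.

```python
def value_at_position(ordered, position):
    """Value at a 1-based, possibly fractional, position in sorted data.

    Fractional positions are handled by linear interpolation between the
    two neighbouring values (an assumption beyond the mid-way case).
    """
    lower_index = int(position)            # whole part of the 1-based position
    fraction = position - lower_index
    lower_value = ordered[lower_index - 1]
    if fraction == 0:
        return lower_value
    upper_value = ordered[lower_index]
    return lower_value + fraction * (upper_value - lower_value)


def quartiles(values):
    """Return (Q1, Q3) using positions (n + 1)/4 and 3(n + 1)/4."""
    ordered = sorted(values)
    n = len(ordered)
    if n < 3:
        raise ValueError("this sketch needs at least three values")
    q1 = value_at_position(ordered, (n + 1) / 4)
    q3 = value_at_position(ordered, 3 * (n + 1) / 4)
    return q1, q3


data = [3, 5, 6, 9, 15, 27, 30, 35, 37]
q1, q3 = quartiles(data)
print("Q1 =", q1, "Q3 =", q3)
```

Statistical packages use several different quartile conventions, so library results may differ slightly from this position method.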
Quartiles (4)
• Grouped data: use the same approach as
for estimating the median for grouped data
in week 4, except this time use the
quartile positions
Semi-Interquartile Range
• This is half the interquartile range. It is
sometimes called the Quartile Deviation
• Semi-Interquartile Range
= (Upper Quartile - Lower Quartile)/2
Example
Using previous ungrouped data
Interquartile range
= UQ - LQ
= 32.5 – 5.5 = 27
Semi-interquartile range
= (UQ - LQ)/2
= (32.5 – 5.5)/2
= 27/2 = 13.5
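As a quick check of this arithmetic, here is a minimal Python sketch. The function names are illustrative, and the quartile values are taken directly from the example rather than recomputed.

```python
def interquartile_range(lower_quartile, upper_quartile):
    """Upper quartile minus lower quartile."""
    return upper_quartile - lower_quartile


def semi_interquartile_range(lower_quartile, upper_quartile):
    """Half the interquartile range (the quartile deviation)."""
    return interquartile_range(lower_quartile, upper_quartile) / 2


iqr = interquartile_range(5.5, 32.5)
siqr = semi_interquartile_range(5.5, 32.5)
print("IQR =", iqr, "semi-IQR =", siqr)
```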
Mean Deviation
• Average of the absolute deviations from
the arithmetic mean (ignoring the sign)
• When two straight lines (rather than
curved brackets) surround a number or
variable it is referred to as the modulus
and we ignore the sign
Mean Deviation of ungrouped data
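• X1 = 2, X2 = 4, X3 = 3, so $\bar{X} = (2 + 4 + 3)/3 = 3$
• $\mathrm{MD} = \dfrac{|X_1 - \bar{X}| + |X_2 - \bar{X}| + |X_3 - \bar{X}|}{n}$
• $\mathrm{MD} = \dfrac{|2 - 3| + |4 - 3| + |3 - 3|}{3} = \dfrac{1 + 1 + 0}{3} = \dfrac{2}{3}$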
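The same calculation can be written as a small Python sketch. The name `mean_deviation` is an illustrative choice, and `abs` plays the role of the modulus bars.

```python
def mean_deviation(values):
    """Average of the absolute deviations from the arithmetic mean."""
    n = len(values)
    mean = sum(values) / n
    # abs() drops the sign of each deviation, like the modulus bars
    return sum(abs(x - mean) for x in values) / n


md = mean_deviation([2, 4, 3])
print("Mean deviation =", md)
```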
Variance
• If we square all the deviations from the
arithmetic mean, then we no longer need
to bother with dropping the signs since all
the values will be positive.
• We can then replace the straight line
brackets (modulus) for the Mean Deviation
with the more usual round brackets.
• Variance is the average of the squared
deviations from the arithmetic mean
Variance: ungrouped data (1)
• Variance = $\dfrac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n}$
• To calculate the variance
1. Calculate the mean value $\bar{X}$
2. Subtract the mean from each value in turn,
that is, find $X_i - \bar{X}$
3. Square each answer to get $(X_i - \bar{X})^2$
Variance: ungrouped data (2)
4. Add up all these squared values to get $\sum_{i=1}^{n} (X_i - \bar{X})^2$
5. Divide the result by n to get $\dfrac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n}$
6. You now have the average of the squared deviations
from the mean (in square units)
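The six steps map onto code almost line for line. The sketch below is one way to write them in Python. The function name `variance` is an assumption, and the sample data 2, 4, 3 is reused from the mean deviation slide for illustration.

```python
def variance(values):
    """Population variance: the average squared deviation from the mean."""
    n = len(values)
    mean = sum(values) / n                        # step 1: the mean
    deviations = [x - mean for x in values]       # step 2: X_i minus the mean
    squared = [d ** 2 for d in deviations]        # step 3: square each deviation
    total = sum(squared)                          # step 4: add the squares
    return total / n                              # step 5: divide by n


result = variance([2, 4, 3])
print("Variance =", result, "square units")       # step 6: result is in square units
```

Note that this divides by n, as the slides do, rather than by n - 1 as in the sample variance.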
Standard deviation (SD)
• This is simply the square root of the
variance
• An advantage is that we avoid the square
units of the variance
• The larger the SD, the larger the average
dispersion of the data from the mean
• The smaller the SD, the smaller the average
dispersion of the data from the mean
Example 1: variance/standard deviation
xi        xi – x̄          (xi – x̄)²
4         4 – 7 = -3      (-3)² = 9
7         7 – 7 = 0       0² = 0
10        10 – 7 = 3      3² = 9
Total                     18
Solutions
Variance = $\dfrac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n} = \dfrac{18}{3} = 6$ square units
Standard deviation is the square root of 6
= 2.449 units
Example 2: variance/standard deviation
xi        xi – x̄          (xi – x̄)²
7         7 – 7 = 0       0² = 0
7         7 – 7 = 0       0² = 0
7         7 – 7 = 0       0² = 0
Total                     0
Solution
Variance = $\dfrac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n} = \dfrac{0}{3} = 0$ square units
Standard deviation is the square root of 0 = 0
i.e. there is no spread of values
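Both worked examples can be cross-checked with Python's standard `statistics` module. This sketch assumes the population versions `pvariance` and `pstdev` are the right match, because the slides divide by n rather than n - 1.

```python
import statistics

group_1 = [4, 7, 10]
group_2 = [7, 7, 7]

# pvariance/pstdev divide by n, matching the formula on the slides
variance_1 = statistics.pvariance(group_1)
sd_1 = statistics.pstdev(group_1)
variance_2 = statistics.pvariance(group_2)
sd_2 = statistics.pstdev(group_2)

print("Group 1: variance =", variance_1, "SD =", round(sd_1, 3))
print("Group 2: variance =", variance_2, "SD =", sd_2)
```

The functions `statistics.variance` and `statistics.stdev` divide by n - 1 instead and would give different answers for Group 1.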
Variance of grouped data
$S^2 = \dfrac{\sum_{i=1}^{j} F_i X_i^2}{\sum_{i=1}^{j} F_i} - \left( \dfrac{\sum_{i=1}^{j} F_i X_i}{\sum_{i=1}^{j} F_i} \right)^2$
where Fi = frequency of ith class interval
Xi = mid point of ith class interval
j = number of class intervals
Price of item (£)     No. of items sold
LCB       UCB         Fi        Xi        FiXi      FiXi^2
1.5       2.5         15        2         30        60
2.5       3.5         2         3         6         18
3.5       4.5         19        4         76        304
4.5       5.5         10        5         50        250
5.5       6.5         14        6         84        504
Total                 60                  246       1136
(LCB = lower class boundary, UCB = upper class boundary)
$S^2 = \dfrac{1136}{60} - \left( \dfrac{246}{60} \right)^2$
S² = 18.93 – 4.1²
S² = 18.93 – 16.81
S² = 2.12 (in £ squared)
S = √2.12 = £1.46
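The grouped-data calculation can be reproduced with a short Python sketch. The function name `grouped_variance` is an assumption. The frequencies and mid points are copied from the price table.

```python
import math


def grouped_variance(frequencies, midpoints):
    """Variance of grouped data: sum(F*X^2)/sum(F) - (sum(F*X)/sum(F))^2."""
    total_f = sum(frequencies)
    sum_fx = sum(f * x for f, x in zip(frequencies, midpoints))
    sum_fx2 = sum(f * x ** 2 for f, x in zip(frequencies, midpoints))
    return sum_fx2 / total_f - (sum_fx / total_f) ** 2


frequencies = [15, 2, 19, 10, 14]   # Fi: number of items sold in each class
midpoints = [2, 3, 4, 5, 6]         # Xi: mid point of each price class (£)

s_squared = grouped_variance(frequencies, midpoints)
s = math.sqrt(s_squared)
print("S^2 =", round(s_squared, 2), "S =", round(s, 2))
```

Working from unrounded intermediate values can change the last decimal place compared with a hand calculation that rounds at each step.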
Co-efficient of variation (C of V)
• A measure of relative dispersion
• Given by $S / \bar{X}$, i.e. the standard
deviation divided by the arithmetic
mean of the data.
• Data sets with a higher co-efficient of
variation have higher relative
dispersion
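As an illustration, the coefficient of variation for the two groups of marks from the start of the deck can be computed as below. The function name is hypothetical, and the population standard deviation (`statistics.pstdev`) is assumed, to match the divide-by-n variance used throughout.

```python
import statistics


def coefficient_of_variation(values):
    """Standard deviation divided by the arithmetic mean (relative dispersion)."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("coefficient of variation is undefined when the mean is 0")
    return statistics.pstdev(values) / mean


cv_group_1 = coefficient_of_variation([4, 7, 10])
cv_group_2 = coefficient_of_variation([7, 7, 7])
print("Group 1 C of V =", round(cv_group_1, 3))
print("Group 2 C of V =", cv_group_2)
```

Because the result is a ratio with no units, it can be used to compare data sets measured on different scales.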