QMS 102 Measures of Variability
The local bank has been monitoring the daily average waiting times (in minutes) of its customers. The manager believed
that waiting times would improve with 1 permanent clerk and 2 floating clerks
instead of the current 2 permanent clerks. The floating clerks would do other tasks except when the queues
became excessive, at which point they would switch over and open other wickets. Last week the manager implemented the
new system.
Old Policy:
4, 6, 12, 15, 18, 4, 4, 4, 14, 21, 0, 7, 8, 9, 18, 6, 6, 12, 22, 25
New Policy:
1, 3, 9, 14, 16
Question:
Were waiting times more variable using the "Old Policy" or the "New Policy"?
QMS 102 Measures of Variability
Since the shapes of the distributions can be expected to
be different, we will not be able to answer the question
by comparing the distributions of the 2 processes
visually.
QMS 102 Measures of Variability
Old Policy:
4, 6, 12, 15, 18, 4, 4, 4, 14, 21, 0, 7, 8, 9, 18, 6, 6, 12, 22, 25
New Policy:
1, 3, 9, 14, 16
A more precise comparison can be made by choosing a criterion that assigns a single-number variability
measure to each group!
We will then compare the measures to answer the question.
e.g.
Criterion:
Use the highest value to represent the variability of the group.
Since 25 > 16, the old procedure is more variable.
What is wrong with this criterion?
Excessively influenced by the analyst's choice of sample size.
{A bigger sample will tend to contain more extreme values}
Insensitive to the existence and magnitude of most values.
{Note how adding or moving values usually has no effect}
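Both objections can be checked on the bank data. The sketch below is illustrative only: the variable names and the altered data set `nearly_constant` are assumptions made for the demonstration, not part of the original slides.

```python
# Old Policy waiting times (minutes) from the slides.
old_policy = [4, 6, 12, 15, 18, 4, 4, 4, 14, 21, 0, 7, 8, 9, 18, 6, 6, 12, 22, 25]

# Hypothetical altered data: every value except the largest is replaced by 24,
# so the waiting times are now almost identical.
nearly_constant = [24] * 19 + [25]

# The "highest value" criterion cannot tell the two groups apart.
print(max(old_policy))       # 25
print(max(nearly_constant))  # 25

# The maximum of the first k observations never decreases as k grows,
# so a larger sample tends to report a larger "variability".
running_max = [max(old_policy[:k]) for k in range(1, len(old_policy) + 1)]
print(running_max)
```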
QMS 102 Measures of Variability
Range: xmax - xmin, the distance on the real line required to hold the data
Interquartile Range: Q3 - Q1, the distance on the real line required to hold the middle 50% of the data
{Mean Absolute Deviation}
Population Variance
Population Standard Deviation
Population Coefficient of Variation
Others:
Sample Variance
Sample Standard Deviation
Sample Coefficient of Variation
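For reference, the measures listed above are usually written as follows. This is a sketch of the standard textbook definitions rather than a reproduction of the original slide; the symbols (N, n, \mu, \bar{x}, \sigma, s) are assumed notation.

```latex
\begin{align*}
\text{Range} &= x_{\max} - x_{\min} \\
\text{IQR} &= Q_3 - Q_1 \\
\text{MAD} &= \frac{1}{N}\sum_{i=1}^{N} |x_i - \mu| \\
\sigma^2 &= \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2, \qquad
  \sigma = \sqrt{\sigma^2}, \qquad
  CV = \frac{\sigma}{\mu} \\
s^2 &= \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2, \qquad
  s = \sqrt{s^2}, \qquad
  CV_{\text{sample}} = \frac{s}{\bar{x}}
\end{align*}
```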
QMS 102 Measures of Variability
Old Policy:
4, 6, 12, 15, 18, 4, 4, 4, 14, 21, 0, 7, 8, 9, 18, 6, 6, 12, 22, 25
New Policy:
1, 3, 9, 14, 16
Range Old Policy = Largest - Smallest = 25 - 0 = 25 min
Casio
[Menu][List] List5 "4, 6, 12, ..."
[Menu][Stat][Calc]
[Set] Xlist: "List1" Xfreq: "1"
[Exit][1-Var]
xmin = 0 xmax = 25
Range New Policy = ?
16 - 1 = 15 min
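The same calculation can be sketched outside the calculator. The helper name `data_range` is an assumption chosen for this example.

```python
# Waiting times (minutes) from the slides.
old_policy = [4, 6, 12, 15, 18, 4, 4, 4, 14, 21, 0, 7, 8, 9, 18, 6, 6, 12, 22, 25]
new_policy = [1, 3, 9, 14, 16]


def data_range(values):
    """Range = largest value minus smallest value."""
    return max(values) - min(values)


print(data_range(old_policy))  # 25
print(data_range(new_policy))  # 15
```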
QMS 102 Measures of Variability Interquartile Range
Old Policy:
4, 6, 12, 15, 18, 4, 4, 4, 14, 21, 0, 7, 8, 9, 18, 6, 6, 12, 22, 25
New Policy:
1, 3, 9, 14, 16
IQR Old Policy = Q3 - Q1 = 16.5 - 5 = 11.5 min
Casio
[Menu][List] List5 "4, 6, 12, ..."
[Menu][Stat][Calc]
[Set] Xlist: "List1" Xfreq: "1"
[Exit][1-Var]
Q1 = 5
Q3 = 16.5
[Menu][Run][Var][Stat][Grph]
[Q3] - [Q1][EXE]
11.5
IQR New Policy = ?
15 - 2 = 13 min
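The quartiles on this slide can be reproduced with a "median of each half" rule. That rule is an assumption about how the quartiles were obtained: it matches the slide's values, but other software uses different quartile conventions and may report slightly different numbers. The function names are hypothetical.

```python
from statistics import median

# Waiting times (minutes) from the slides.
old_policy = [4, 6, 12, 15, 18, 4, 4, 4, 14, 21, 0, 7, 8, 9, 18, 6, 6, 12, 22, 25]
new_policy = [1, 3, 9, 14, 16]


def quartiles_by_halves(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    When the number of values is odd, the overall median is left out
    of both halves. Assumes at least two values.
    """
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]
    upper = ordered[-half:]
    return median(lower), median(upper)


def iqr(values):
    """Interquartile range: Q3 - Q1."""
    q1, q3 = quartiles_by_halves(values)
    return q3 - q1


print(quartiles_by_halves(old_policy), iqr(old_policy))
print(quartiles_by_halves(new_policy), iqr(new_policy))
```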
QMS 102 Measures of Variability
Mean Absolute Deviation
Old Policy:
4, 6, 12, 15, 18, 4, 4, 4, 14, 21, 0, 7, 8, 9, 18, 6, 6, 12, 22, 25
New Policy:
1, 3, 9, 14, 16
Mean Absolute Deviation Old Policy = MAD Old Policy = (sum of |x - mean|) / N = 120.5 / 20 = 6.025 min
Mean Absolute Deviation New Policy = MAD New Policy = 26.4 / 5 = 5.28 min
(WILL NOT BE ON TESTS)
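A minimal sketch of the mean absolute deviation, taking deviations from the mean as the later example on Data A and Data B does. The function name is an assumption.

```python
from statistics import mean

# Waiting times (minutes) from the slides.
old_policy = [4, 6, 12, 15, 18, 4, 4, 4, 14, 21, 0, 7, 8, 9, 18, 6, 6, 12, 22, 25]
new_policy = [1, 3, 9, 14, 16]


def mean_absolute_deviation(values):
    """Average distance of the values from their mean."""
    center = mean(values)
    return sum(abs(x - center) for x in values) / len(values)


print(round(mean_absolute_deviation(old_policy), 3))  # 6.025
print(round(mean_absolute_deviation(new_policy), 3))  # 5.28
```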
QMS 102 Measures of Variability
Example
Which population is more variable?
Data A: 1, -1, 1, -1
Data B: 2, 0, -2, 0
QMS 102 Measures of Variability
Example continued:
Data A: 1, -1, 1, -1
Mean of 0; all absolute "deviations" are 1.
MAD A = ( |1-0| + |-1-0| + |1-0| + |-1-0| ) / 4 = 1
Data B: 2, 0, -2, 0
Mean of 0; 2 deviations of 0 and 2 absolute deviations of 2.
MAD B = ( |2-0| + |0-0| + |-2-0| + |0-0| ) / 4 = 1
MAD is not sensitive enough to the data at the "edges" {large deviations}!
How can we emphasize the edge values? Squaring deviations does the job!
[Figure: two number lines. Some deviations: 0.1, 1, 10. Squared deviations: 0.01, 1, 100 (axis broken before 100).]
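The effect of squaring can be checked on Data A and Data B. In this sketch the helper `mad` is a hypothetical name; `pvariance` is the population variance from Python's standard `statistics` module.

```python
from statistics import mean, pvariance

data_a = [1, -1, 1, -1]
data_b = [2, 0, -2, 0]


def mad(values):
    """Mean absolute deviation about the mean."""
    center = mean(values)
    return sum(abs(x - center) for x in values) / len(values)


# MAD cannot separate the two data sets: both come out as 1.
print(mad(data_a), mad(data_b))

# Squared deviations weight the edge values more heavily:
# the population variance is 1 for A but 2 for B.
print(pvariance(data_a), pvariance(data_b))
```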
QMS 102 Measures of Variability
Population Variance
Old Policy:
4, 6, 12, 15, 18, 4, 4, 4, 14, 21, 0, 7, 8, 9, 18, 6, 6, 12, 22, 25
New Policy:
1, 3, 9, 14, 16
Population Variance Old Policy = (Population Standard Deviation Old Policy)^2 = (6.93)^2 = 48.09 minutes squared
{The calculator squares the unrounded value 6.9345...; squaring the rounded 6.93 would give 48.02.}
Casio
[Menu][List] List5 "4, 6, 12, ..."
[Menu][Stat][Calc]
[Set] Xlist: "List1" Xfreq: "1"
[Exit][1-Var]
xσn = 6.93
[Menu][Run][Var][Stat][X]
[xσn][x^2][EXE]
48.09
Population Variance New Policy = ?
(5.89)^2 = 34.64 minutes squared {again using the unrounded 5.8856...}
QMS 102 Measures of Variability
Population Standard Deviation
Casio
[Menu][List] List5 "4, 6, 12, ..."
[Menu][Stat][Calc]
[Set] Xlist: "List1" Xfreq: "1"
[Exit][1-Var]
xσn = 6.93
This estimation is not usually this accurate.
Population Standard Deviation New Policy = ?
5.89 min
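The variance and standard deviation figures can be cross-checked with Python's standard `statistics` module, where `pvariance` and `pstdev` divide by N (the population versions). This is a sketch for verification, not the course's prescribed method; the variable names are assumptions.

```python
from statistics import pstdev, pvariance

# Waiting times (minutes) from the slides.
old_policy = [4, 6, 12, 15, 18, 4, 4, 4, 14, 21, 0, 7, 8, 9, 18, 6, 6, 12, 22, 25]
new_policy = [1, 3, 9, 14, 16]

for label, data in (("Old Policy", old_policy), ("New Policy", new_policy)):
    sigma = pstdev(data)           # population standard deviation, in minutes
    sigma_squared = pvariance(data)  # population variance, in minutes squared
    # Old Policy: roughly 6.93 and 48.09; New Policy: roughly 5.89 and 34.64.
    print(label, round(sigma, 4), round(sigma_squared, 4))
```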
QMS 102 Population Variance / Standard Deviation
Demonstration of the edge effect!
Assume that a set of data has a mean of 0 and that the set includes the values 1 and 10.
Move 1 ---> 2: by how much does this increase the numerator of the variance calculation?
Instead, move 10 ---> 11: how does the numerator change?
Variance and standard deviation are much more sensitive to values far from the mean!
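A short sketch of the arithmetic behind the two questions. It assumes the mean is held at 0 while a single value is moved, as the slide's setup implies; in a real data set the mean would shift slightly as well. The function name is hypothetical.

```python
def numerator_change(before, after, center=0):
    """Change in the sum of squared deviations when one value moves,
    with the mean (center) assumed to stay fixed."""
    return (after - center) ** 2 - (before - center) ** 2


print(numerator_change(1, 2))    # 3   (4 - 1)
print(numerator_change(10, 11))  # 21  (121 - 100)
```

The same one-unit move adds seven times as much to the numerator when it happens far from the mean.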
QMS102 Measures of Variability
Which one should we use?
Criteria: Properties
Range: Insensitive to the existence and magnitude of most values. Based only on the location of the two extremes. Easy to calculate.
IQR: Robust. Totally insensitive to the magnitude of the outer 50%.
{MAD}: Equally sensitive to the existence and magnitude of all values. Easy to understand.
Variance: Much more sensitive to the existence and magnitude of extreme values than central values. Calculation supported. Unusual units.
Standard Deviation: Same as Variance but with nice units.