Lecture 8 - Measures of Dispersion
Median - quartiles,
Mean - Standard Deviations
© m j winter, FS2004
Review Box (and Whiskers) Plots
• Shows median, range, and quartiles (fourths)
• To construct:
  take ordered data list
  find median
  find median of each half
  draw appropriate number line
  mark ends of data on number line
  draw center boxes, add whiskers
We’ll convert the Elevator Data to relative frequencies and construct box plots.
Relative Frequency: all 3 passengers alight at different floors
.3 .5 .5 .5 .6 .6 .6 .6 .6 .6 .7 .8 .8 .9 .9
.49 .49 .50 .51 .51 .52 .52 .55 .56 .56 .56 .57 .58 .61 .63 .64
.53 .54 .55 .58
Notice:
range is decreasing
outliers decrease
medians tend toward p(all alight at different floors)
= 6*5*4/6^3 = 0.5555...
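The probability on this slide can be checked by brute force. The sketch below assumes, as the formula 6*5*4/6^3 implies, that each of the 3 passengers picks one of 6 floors independently and with equal chance; the names `FLOORS` and `PASSENGERS` are only illustrative.

```python
from fractions import Fraction
from itertools import product

FLOORS = 6       # assumed from the 6^3 in the slide's formula
PASSENGERS = 3

# Every equally likely way the passengers can choose floors (6^3 = 216).
outcomes = list(product(range(FLOORS), repeat=PASSENGERS))

# Keep the outcomes where no two passengers share a floor (6*5*4 = 120).
all_different = [o for o in outcomes if len(set(o)) == PASSENGERS]

p_different = Fraction(len(all_different), len(outcomes))
print(p_different, float(p_different))  # 5/9 0.5555555555555556
```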
Statistics on the TI83(+)
(turn off graphs of functions Y=)
STAT Edit enter data in L1
Go to STATPLOT (yellow)
Select Plot1
Select: ON, type of plot; specify Xlist
QUIT, then GRAPH
ZOOM 9 (this is ZoomStat)
Use TRACE and arrows to read values
Box Plots; 5-number summary
Sum of 2 “mathematical dice”. Two
random numbers (not integers)
between 0 and 6 are generated and
summed.
Five-number summary: {0.94, 4.33, 6.26, 8.80, 11.27}
Comparing Dispersion: Bowling Scores Example
• Chris: 185, 135, 200, 185, 250, 155
• Sandy: 182, 185, 188, 185, 180, 190
Look at Box plots
Which is Sandy?
Which is Chris?
We are using the median as
the central measurement.
Medians are the same.
Dispersion from the median is quite different.
Look at 5-number summaries
Calculate interquartile ranges
Chris: 135, 155, 185, 200, 250: IQR = 45
Sandy: 180, 182, 185, 188, 190: IQR = 6
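The box-plot recipe (median, then the median of each half) can be sketched in a few lines. The function names are illustrative, and leaving the middle value out of both halves when the count is odd is an assumption; the slides only show even-sized data.

```python
from statistics import median

def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) using the median-of-each-half rule."""
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    lower = xs[:half]        # lower half of the ordered data
    upper = xs[n - half:]    # upper half (middle value skipped if n is odd)
    return (xs[0], median(lower), median(xs), median(upper), xs[-1])

def iqr(data):
    """Interquartile range: Q3 minus Q1."""
    _, q1, _, q3, _ = five_number_summary(data)
    return q3 - q1

chris = [185, 135, 200, 185, 250, 155]
sandy = [182, 185, 188, 185, 180, 190]
print("Chris IQR:", iqr(chris))  # Chris IQR: 45
print("Sandy IQR:", iqr(sandy))  # Sandy IQR: 6
```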
Median vs Mean
When we use the 5 number summary, the divisions of the
range of data are according to how many data points lie in
each division. There are the same number of data points
in each subinterval. But the intervals usually have different
widths.
When we use the mean, we group by a measure of how
far the data are from the mean. Instead of saying “she’s in
the top quartile” we’ll say “how her distance from the mean
compares with the distance of others from the mean”.
Deviation from the Mean
• Chris: 185, 135, 200, 185, 250, 155
• Sandy: 182, 185, 188, 185, 180, 190
• Calculate means: Chris: 185, Sandy: 185
x (Chris)    Deviation (x - 185)
185          0
135          -50
200          15
185          0
250          65
155          -30
Sum of deviations is 0
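A quick check of the deviation table, as a minimal sketch; the helper name `deviations` is illustrative. Both bowlers have mean 185, and in each case the deviations cancel.

```python
chris = [185, 135, 200, 185, 250, 155]
sandy = [182, 185, 188, 185, 180, 190]

def deviations(data):
    """Return each value's signed distance from the mean of the data."""
    mean = sum(data) / len(data)
    return [x - mean for x in data]

print(deviations(chris))                             # matches the table for Chris
print(sum(deviations(chris)), sum(deviations(sandy)))  # both sums are 0.0
```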
Algebraically:
Data set: x1, x2, x3
let µ = mean = (x1 + x2 + x3)/3
Look at the sum of the deviations:
(x1 – µ) + (x2 – µ) + (x3 – µ)
= x1 + x2 + x3 – 3 µ
=3[(x1+x2+x3)/3 – µ]
=3*0 = 0
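The same steps work for any number of data points. As a sketch, writing n for the number of points (a symbol not used on the slide) and µ for their mean:

```latex
\begin{align*}
\sum_{i=1}^{n} (x_i - \mu) &= \sum_{i=1}^{n} x_i - n\mu \\
&= n\left[\frac{1}{n}\sum_{i=1}^{n} x_i - \mu\right] \\
&= n \cdot 0 = 0
\end{align*}
```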
In order to have “errors” that don’t cancel, we use the
squares of the deviations.
Sum of Squares of Deviations
x (Chris)    Deviation (x - 185)    Dev squared (x - 185)^2
185          0                      0
135          -50                    2500
200          15                     225
185          0                      0
250          65                     4225
155          -30                    900
                                    sum = 7850

x (Sandy)    Deviation (x - 185)    Dev squared (x - 185)^2
182          -3                     9
185          0                      0
188          3                      9
185          0                      0
180          -5                     25
190          5                      25
                                    sum = 68
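The two sums of squares can be reproduced with a short sketch; the function name is illustrative.

```python
chris = [185, 135, 200, 185, 250, 155]
sandy = [182, 185, 188, 185, 180, 190]

def sum_squared_deviations(data):
    """Add up the squared distances of each value from the mean."""
    mean = sum(data) / len(data)
    return sum((x - mean) ** 2 for x in data)

print(sum_squared_deviations(chris))  # 7850.0
print(sum_squared_deviations(sandy))  # 68.0
```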
If there are more points,
the sums will be greater.
We need some sort of
average.
There are two averages used:
Population Variance
Sample Variance
These lead to
Population Standard Deviation
Sample Standard Deviation
Sample & Population Standard Deviations
Chris (sum of squared deviations = 7850):
s^2 = 7850/(6 - 1) = 1,570        s = √1570 = 39.62323
σ^2 = 7850/6 = 1308.33            σ = √1308.33 = 36.17
Sandy (sum of squared deviations = 68):
s^2 = 68/(6 - 1) = 13.6           s = √13.6 = 3.68782
σ^2 = 68/6 = 11.33...             σ = √11.33 = 3.36...
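Python's standard `statistics` module offers both versions, so the hand calculations can be checked away from the calculator. This is a minimal sketch: `variance` and `stdev` divide by n - 1 (sample), while `pvariance` and `pstdev` divide by n (population).

```python
from statistics import pstdev, pvariance, stdev, variance

chris = [185, 135, 200, 185, 250, 155]
sandy = [182, 185, 188, 185, 180, 190]

for name, scores in (("Chris", chris), ("Sandy", sandy)):
    # Sample statistics divide the sum of squares by n - 1.
    print(name, "s^2 =", variance(scores), " s =", round(stdev(scores), 5))
    # Population statistics divide the sum of squares by n.
    print(name, "sigma^2 =", round(pvariance(scores), 2),
          " sigma =", round(pstdev(scores), 2))
```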
Check with your Calculator
Chris: s^2 = 7850/(6 - 1) = 1,570    s = √1570 = 39.62323
Sandy: s^2 = 68/(6 - 1) = 13.6       s = √13.6 = 3.68782
Using Standard Deviations instead of Quartiles
Median - Quartile (25% of the data in each interval)
|_____________|____________|_____|________|
min                     median          max
Mean - Standard Deviation (s)
__|______|______|______|______|______|______|__
      s      s      s      s      s      s
mean = x̄ at the center mark
within one standard dev of the mean: the two intervals next to the mean
within 2 standard devs of mean: the four intervals nearest the mean
Chris and Sandy, again
Chris: 185, 135, 200, 185, 250, 155
Sandy: 182, 185, 188, 185, 180, 190
• For Chris, what percent of her games are within one
standard deviation of the mean?
s = 39.6; 185 - s = 145.4, 185 + s = 224.6
4/6 = 67% of her games are within 1 std. deviation.
• For Sandy
s = 3.687; 185 - s = 181.313, 185 + s = 188.687
4/6 = 67% of his games are within 1 std. deviation.
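The counts above can be confirmed with a short sketch that uses the sample standard deviation s, as the slide does; the function name is illustrative.

```python
from statistics import mean, stdev

chris = [185, 135, 200, 185, 250, 155]
sandy = [182, 185, 188, 185, 180, 190]

def count_within_one_sd(data):
    """Count the values lying between mean - s and mean + s (inclusive)."""
    m, s = mean(data), stdev(data)
    return sum(1 for x in data if m - s <= x <= m + s)

for name, scores in (("Chris", chris), ("Sandy", sandy)):
    k = count_within_one_sd(scores)
    print(f"{name}: {k}/{len(scores)} = {k / len(scores):.0%}")
```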
Statistics on the TI 83
10 of the 15 points lie within 1 standard
deviation of the mean. 9 of these are below
the mean.
Data: .3 .5 .5 .5 .6 .6 .6 .6 .6 .6 .7 .8 .8 .9 .9
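The two counts quoted for this data set can be reproduced off the calculator. This sketch assumes the sample standard deviation (the Sx value reported by the calculator's 1-Var Stats) is the one intended; the variable names are illustrative.

```python
from statistics import mean, stdev

# The 15 relative frequencies from the elevator experiment.
data = [.3, .5, .5, .5, .6, .6, .6, .6, .6, .6, .7, .8, .8, .9, .9]

m = mean(data)
s = stdev(data)  # sample standard deviation (assumed; divides by n - 1)

# Points no more than one standard deviation away from the mean.
within = [x for x in data if m - s <= x <= m + s]
# Of those, the ones that fall below the mean.
below_mean = [x for x in within if x < m]

print(round(m, 3), round(s, 3))
print(len(within), "of", len(data), "within 1 s;", len(below_mean), "below the mean")
```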