Lecture 8 - Measures of Dispersion
Median - Quartiles, Mean - Standard Deviations
© m j winter, FS2004

Review: Box (and Whiskers) Plots
• Shows median, range, and quartiles (fourths)
• To construct:
    take the ordered data list
    find the median
    find the median of each half
    draw an appropriate number line
    mark the ends of the data on the number line
    draw the center boxes, add whiskers
We’ll convert the Elevator Data to relative frequencies and construct box plots.

Relative Frequency: all 3 passengers alight at different floors
.3 .5 .5 .5 .6 .6 .6 .6 .6 .6 .7 .8 .8 .9 .9
.49 .49 .50 .51 .51 .52 .52 .55 .56 .56 .56 .57 .58 .61 .63 6.4 .53 .54 .55 .58
Notice:
• the range is decreasing
• outliers decrease
• medians are tending to p(all alight at different floors) = 6*5*4/6^3 = 0.5555...

Statistics on the TI83(+)
(turn off graphs of functions in Y=)
• STAT, Edit: enter data in L1
• Go to STATPLOT (yellow)
• Select Plot1
• Select: ON, type of plot; specify Xlist
• QUIT, then GRAPH
• ZOOM 9 (this is ZoomStat)
• Use TRACE and arrows to read values

Box Plots; 5-number summary
Sum of 2 “mathematical dice”. Two random numbers (not integers) between 0 and 6 are generated and summed.
[Box plot; marked values: 0.94, 4.33, 6.26, 8.80, 11.27]
Five number summary: {.94, 4.33, 6.26, 8.80, 11.27}

Comparing Dispersion: Bowling Scores Example
• Chris: 185, 135, 200, 185, 250, 155
• Sandy: 182, 185, 188, 185, 180, 190
Look at the box plots. Which is Sandy? Which is Chris?
We are using the median as the central measurement. The medians are the same. Dispersion from the median is quite different.

Look at 5-number summaries; calculate interquartile ranges
Chris: 135, 155, 185, 200, 250; IQR = 45
Sandy: 180, 182, 185, 188, 190; IQR = 6

Median vs Mean
When we use the 5 number summary, the divisions of the range of data are according to how many data points lie in each division. There are the same number of data points in each subinterval. But the intervals usually have different widths.
When we use the mean, we group by a measure of how far the data are from the mean.
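The box-plot construction steps and the interquartile ranges for the bowling scores can be checked with a short Python sketch. The function names `five_number_summary` and `iqr` are illustrative, not from the lecture. The slides only show data sets with an even number of points; leaving the middle value out of both halves when the count is odd is an assumption made here.

```python
import statistics

def five_number_summary(data):
    """Return (min, Q1, median, Q3, max), taking each quartile as the median of a half."""
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    lower = xs[:half]           # lower half of the ordered data
    upper = xs[half + n % 2:]   # upper half; for odd n the middle value is skipped (assumption)
    return (xs[0], statistics.median(lower), statistics.median(xs),
            statistics.median(upper), xs[-1])

def iqr(data):
    """Interquartile range: Q3 - Q1."""
    _, q1, _, q3, _ = five_number_summary(data)
    return q3 - q1

chris = [185, 135, 200, 185, 250, 155]
sandy = [182, 185, 188, 185, 180, 190]

for name, scores in [("Chris", chris), ("Sandy", sandy)]:
    print(name, five_number_summary(scores), "IQR =", iqr(scores))
```

Statistical packages often use other quartile conventions, so their quartiles can differ slightly from the median-of-each-half rule used here.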
Instead of saying “she’s in the top quartile” we’ll say how her distance from the mean compares with the distance of others from the mean.

Deviation from the Mean
• Chris: 185, 135, 200, 185, 250, 155
• Sandy: 182, 185, 188, 185, 180, 190
• Calculate means: Chris: 185, Sandy: 185

x (Chris)    Deviation x - 185
185             0
135           -50
200            15
185             0
250            65
155           -30
Sum of deviations is 0

Algebraically:
Data set: x1, x2, x3; let µ = mean = (x1 + x2 + x3)/3
Look at the sum of the deviations:
(x1 - µ) + (x2 - µ) + (x3 - µ) = x1 + x2 + x3 - 3µ
                               = 3[(x1 + x2 + x3)/3 - µ]
                               = 3*0 = 0
In order to have “errors” that don’t cancel, we use the squares of the deviations.

Sum of Squares of Deviations

x (Chris)    Deviation x - 185    Dev squared (x - 185)^2
185             0                     0
135           -50                  2500
200            15                   225
185             0                     0
250            65                  4225
155           -30                   900
                                   sum = 7850

x (Sandy)    Deviation x - 185    Dev squared (x - 185)^2
182            -3                     9
185             0                     0
188             3                     9
185             0                     0
180            -5                    25
190             5                    25
                                   sum = 68

If there are more points, the sums will be greater. We need some sort of average. There are two averages used:
• Population Variance
• Sample Variance
These lead to
• Population Standard Deviation
• Sample Standard Deviation

Sample & Population Standard Deviations
Chris (sum of squared deviations = 7850, 6 games)
Sample (divide by n - 1):   s^2 = 7850/(6 - 1) = 1,570    s = √1570 = 39.62323
Population (divide by n):   σ^2 = 7850/6 = 1308.33        σ = √1308.33 = 36.17
Sandy (sum of squared deviations = 68, 6 games)
Sample (divide by n - 1):   s^2 = 68/(6 - 1) = 13.6       s = √13.6 = 3.68782
Population (divide by n):   σ^2 = 68/6 = 11.33...         σ = √11.33... = 3.36...
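The hand calculations above can be reproduced with a minimal Python sketch. The helper name `dispersion` is illustrative and not part of the lecture; `statistics.stdev` and `statistics.pstdev` are the standard-library sample and population standard deviations, used here only as a cross-check.

```python
import math
import statistics

chris = [185, 135, 200, 185, 250, 155]
sandy = [182, 185, 188, 185, 180, 190]

def dispersion(scores):
    """Return (sum of squared deviations, sample variance, population variance)."""
    n = len(scores)
    mean = sum(scores) / n
    ss = sum((x - mean) ** 2 for x in scores)  # squares keep the deviations from cancelling
    return ss, ss / (n - 1), ss / n

for name, scores in [("Chris", chris), ("Sandy", sandy)]:
    ss, s2, sigma2 = dispersion(scores)
    s, sigma = math.sqrt(s2), math.sqrt(sigma2)
    # Cross-check the hand formulas against the standard library.
    assert math.isclose(s, statistics.stdev(scores))
    assert math.isclose(sigma, statistics.pstdev(scores))
    print(f"{name}: sum of squares = {ss}, s = {s:.5f}, sigma = {sigma:.2f}")
```

Dividing by n - 1 always gives a slightly larger value than dividing by n, which is why s exceeds σ for both bowlers.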
Check with your Calculator
Chris: s^2 = 7850/(6 - 1) = 1,570    s = √1570 = 39.62323
Sandy: s^2 = 68/(6 - 1) = 13.6       s = √13.6 = 3.68782

Using Standard Deviations instead of Quartiles

Median - Quartile (25% of the data in each interval)
      25%           25%        25%     25%
|_____________|____________|_____|________|
min                      median          max

Mean - Standard Deviation (s)
__|______|______|______|______|______|______|__
      s      s      s      s      s      s
                     mean = x̄
within one standard dev of the mean: from mean - s to mean + s
within 2 standard devs of the mean: from mean - 2s to mean + 2s

Chris and Sandy, again
Chris: 185, 135, 200, 185, 250, 155
Sandy: 182, 185, 188, 185, 180, 190
• For Chris, what percent of her games are within one standard deviation of the mean?
  s = 39.6; 185 - s = 145.4, 185 + s = 224.6
  4/6 = 67% of her games are within 1 std. deviation.
• For Sandy:
  s = 3.687; 185 - s = 181.3, 185 + s = 188.687
  4/6 = 67% of his games are within 1 std. deviation.

Statistics on the TI 83
.3 .5 .5 .5 .6 .6 .6 .6 .6 .6 .7 .8 .8 .9 .9
10 of the 15 points lie within 1 standard deviation of the mean. 9 of these are below the mean.
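The counts of points within one standard deviation of the mean can be verified with a short Python sketch. The function name `within_one_sd` is illustrative, and the sketch assumes the sample standard deviation s (dividing by n - 1), as in the slides above.

```python
import statistics

def within_one_sd(data):
    """Return (points within one sample standard deviation of the mean, total points)."""
    mean = statistics.mean(data)
    s = statistics.stdev(data)  # sample standard deviation (divides by n - 1)
    inside = [x for x in data if mean - s <= x <= mean + s]
    return len(inside), len(data)

chris = [185, 135, 200, 185, 250, 155]
sandy = [182, 185, 188, 185, 180, 190]
elevator = [.3, .5, .5, .5, .6, .6, .6, .6, .6, .6, .7, .8, .8, .9, .9]

for name, data in [("Chris", chris), ("Sandy", sandy), ("Elevator", elevator)]:
    k, n = within_one_sd(data)
    print(f"{name}: {k}/{n} = {k / n:.0%} within one standard deviation")
```

For the elevator data the interval is roughly 0.47 to 0.80, so the two values at .8 fall just outside it.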