Lecture 4: Measures of Variation

Slide 1: Review of Lecture 3: Measures of Center

• Given a stem-and-leaf plot, be able to find:
  » Mean: $(40 + 42 + 3 \cdot 50 + 51 + 2 \cdot 52 + 64 + 67)/10 = 518/10 = 51.8$
  » Median: $(50 + 51)/2 = 50.5$ (the average of the 5th value, 50, and the 6th value, 51)
  » Mode: 50 (it occurs three times)

  Stem (tens)    Leaves (units)
  4              0 2
  5              0 0 0 1 2 2
  6              4 7

• Given a regular frequency distribution, be able to find:
  » Sample size: $n = 2 + 4 + 5 + 16 + 13 = 40$
  » Mean: $(8 + 12 + 10 + 16 + 0)/40 = 46/40 = 1.15$
  » Median: the average of the two middle values (the 20th and 21st) $= 1$

  # of phones (x)    f     f·x    Cum. freq.
  4                  2     8      2
  3                  4     12     6
  2                  5     10     11
  1                  16    16     27    (median group)
  0                  13    0      40 = n

Slide 2: 2.5 Measures of Variation

Measure of variation (measure of dispersion): a measure that describes the spread of a data set.

Candidates:
• Range
• Standard deviation, variance
• Coefficient of variation

Statistics is the study of variation, so this section is one of the most important sections in the entire book.

Slide 3: Definition

The range of a set of data is the difference between the highest value and the lowest value.

Range = (highest value) - (lowest value)

Example: the range of {1, 3, 14} is 14 - 1 = 13.

Slide 4: Standard Deviation

The standard deviation of a set of values is a measure of variation of the values about the mean.

We introduce two standard deviations:
• Sample standard deviation
• Population standard deviation

Slide 5: Sample Standard Deviation Formula

Formula 2-4:

$$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$$

where $x$ is a data value, $\bar{x}$ is the sample mean, and $n$ is the sample size.

Slide 6: Sample Standard Deviation (Shortcut Formula)

Formula 2-5:

$$s = \sqrt{\frac{n \sum (x^2) - \left(\sum x\right)^2}{n(n - 1)}}$$

Slide 7: Example: Publix check-out waiting times in minutes

Data: 1, 4, 10. Find the sample mean and the sample standard deviation.

  x        x - x̄    (x - x̄)²    x²
  1        -4        16           1
  4        -1        1            16
  10       5         25           100
  Σ = 15             Σ = 42       Σ = 117

Sample mean, with $n = 3$:

$$\bar{x} = \frac{\sum x}{n} = \frac{15}{3} = 5.0 \text{ min}$$

Using Formula 2-4:

$$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} = \sqrt{\frac{42}{3 - 1}} = \sqrt{21} \approx 4.6 \text{ min}$$

Using the shortcut formula (Formula 2-5):

$$s = \sqrt{\frac{n \sum (x^2) - \left(\sum x\right)^2}{n(n - 1)}} = \sqrt{\frac{3(117) - 15^2}{3(3 - 1)}} = \sqrt{\frac{351 - 225}{6}} = \sqrt{\frac{126}{6}} = \sqrt{21} \approx 4.6 \text{ min}$$

Slide 8: Standard Deviation Key Points

• The standard deviation is a measure of variation of all values from the mean.
• The value of the standard deviation s is usually positive and always non-negative (it is zero only when all data values are equal).
• The value of the standard deviation s can increase dramatically with the inclusion of one or more outliers (data values far away from all others).
• The units of the standard deviation s are the same as the units of the original data values.

Slide 9: Population Standard Deviation

$$\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$$

This formula is similar to Formula 2-4, but it uses the population mean $\mu$ and the population size $N$ in place of the sample mean $\bar{x}$ and $n - 1$.

Slide 10: Variance

The variance of a set of values is a measure of variation equal to the square of the standard deviation.

• Sample variance $s^2$: the square of the sample standard deviation $s$
• Population variance $\sigma^2$: the square of the population standard deviation $\sigma$

Slide 11: Variance - Notation

Notation (standard deviation squared):
• $s^2$: sample variance
• $\sigma^2$: population variance

Slide 12: Round-off Rule for Measures of Variation

Carry one more decimal place than is present in the original set of data. Round only the final answer, not values in the middle of a calculation.

Slide 13: Definition

The coefficient of variation (or CV) for a set of sample or population data, expressed as a percent, describes the standard deviation relative to the mean.

Sample: $CV = \frac{s}{\bar{x}} \cdot 100\%$

Population: $CV = \frac{\sigma}{\mu} \cdot 100\%$

• The CV is well suited to comparing variation between populations.
• Because the CV has no units, it allows comparisons between quantities measured in different units (comparing apples and pears).

Slide 14: Example: How to compare the variability in heights and weights of men?

Sample: 40 males were randomly selected. The summarized statistics are given below.
            Sample mean    Sample standard deviation
  Height    68.34 in       3.02 in
  Weight    172.55 lb      26.33 lb

Solution: Use the CV to compare the variability.

Heights: $CV = \frac{s}{\bar{x}} \cdot 100\% = \frac{3.02}{68.34} \cdot 100\% = 4.42\%$

Weights: $CV = \frac{s}{\bar{x}} \cdot 100\% = \frac{26.33}{172.55} \cdot 100\% = 15.26\%$

Conclusion: Heights (with CV = 4.42%) have considerably less variation than weights (with CV = 15.26%).

Slide 15: Standard Deviation from a Frequency Distribution

Formula 2-6:

$$s = \sqrt{\frac{n \left[\sum (f \cdot x^2)\right] - \left[\sum (f \cdot x)\right]^2}{n(n - 1)}}$$

Use the class midpoints as the x values.

Slide 16: Example: Number of TV sets owned by households

• A random sample of 80 households was selected.
• The number of TV sets owned by each household is summarized below.

  TV sets (x)    # of households (f)    f·x    f·x²
  0              4                      0      0
  1              33                     33     33
  2              28                     56     112
  3              10                     30     90
  4              5                      20     80
  Total          80                     139    315

Compute: (a) the sample mean; (b) the sample standard deviation.

(a) $$\bar{x} = \frac{\sum (f \cdot x)}{n} = \frac{139}{80} \approx 1.7 \text{ sets}$$

(b) $$s = \sqrt{\frac{n \left[\sum (f \cdot x^2)\right] - \left[\sum (f \cdot x)\right]^2}{n(n - 1)}} = \sqrt{\frac{80(315) - (139)^2}{80(80 - 1)}} = \sqrt{\frac{5879}{6320}} \approx 1.0 \text{ sets}$$

Slide 17: Estimation of Standard Deviation: Range Rule of Thumb

For estimating a value of the standard deviation s, use

$$s \approx \frac{\text{range}}{4}$$

where range = (highest value) - (lowest value).

Slide 18: Estimation of Standard Deviation: Range Rule of Thumb

For interpreting a known value of the standard deviation s, find rough estimates of the minimum and maximum "usual" values by using:

Minimum "usual" value = (mean) - 2 × (standard deviation)
Maximum "usual" value = (mean) + 2 × (standard deviation)

Slide 19: Definition: Empirical (68-95-99.7) Rule

For data sets having a distribution that is approximately bell shaped, the following properties apply:
• About 68% of all values fall within 1 standard deviation of the mean.
• About 95% of all values fall within 2 standard deviations of the mean.
• About 99.7% of all values fall within 3 standard deviations of the mean.

Slides 20-22: The Empirical Rule

[Figure 2-13: The Empirical Rule]

Slide 23: Recap

In this section we have looked at:
• Range
• Standard deviation of a sample and population
• Variance of a sample and population
• Coefficient of variation (CV)
• Standard deviation using a frequency distribution
• Range rule of thumb
• Empirical rule

Slide 24: Homework Assignment 4

• Problems 2.5: 1, 3, 7, 9, 11, 17, 23, 25, 27, 31
• Read: Section 2.6: Measures of Relative Standing.
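
The worked examples in this lecture (the Publix waiting times, the TV-set frequency table, and the height/weight CV comparison) can be checked numerically. The following is a minimal Python sketch of Formulas 2-4, 2-5, and 2-6 and the CV definition. The function names are illustrative choices, not part of the lecture, and only the standard library is assumed.

```python
import math


def sample_sd(values):
    """Sample standard deviation by Formula 2-4: sqrt(sum((x - mean)^2) / (n - 1))."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))


def sample_sd_shortcut(values):
    """Sample standard deviation by the shortcut Formula 2-5."""
    n = len(values)
    sum_x = sum(values)
    sum_x2 = sum(x * x for x in values)
    return math.sqrt((n * sum_x2 - sum_x ** 2) / (n * (n - 1)))


def freq_mean_sd(xs, fs):
    """Mean and sample standard deviation from a frequency table (Formula 2-6).

    xs holds the values (or class midpoints); fs holds their frequencies.
    """
    n = sum(fs)
    sum_fx = sum(f * x for x, f in zip(xs, fs))
    sum_fx2 = sum(f * x * x for x, f in zip(xs, fs))
    mean = sum_fx / n
    sd = math.sqrt((n * sum_fx2 - sum_fx ** 2) / (n * (n - 1)))
    return mean, sd


def coefficient_of_variation(sd, mean):
    """Coefficient of variation as a percent: (sd / mean) * 100."""
    return sd / mean * 100.0


if __name__ == "__main__":
    # Slide 7: check-out waiting times in minutes.
    waits = [1, 4, 10]
    print(f"s (Formula 2-4) = {sample_sd(waits):.1f} min")
    print(f"s (Formula 2-5) = {sample_sd_shortcut(waits):.1f} min")

    # Slide 16: number of TV sets owned by 80 households.
    mean_tv, sd_tv = freq_mean_sd([0, 1, 2, 3, 4], [4, 33, 28, 10, 5])
    print(f"mean = {mean_tv:.1f} sets, s = {sd_tv:.1f} sets")

    # Slide 14: heights versus weights of 40 men.
    print(f"CV heights = {coefficient_of_variation(3.02, 68.34):.2f}%")
    print(f"CV weights = {coefficient_of_variation(26.33, 172.55):.2f}%")
```

Both standard deviation functions should agree for any data set, since Formula 2-5 is an algebraic rearrangement of Formula 2-4. Python's built-in `statistics.stdev` also computes the sample standard deviation (with the n - 1 divisor) and can serve as an independent cross-check.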