Measures of variation (dispersion)

Formula:
1. For ungrouped data, the population variance is

$$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$$

2. For ungrouped data, the sample variance is

$$S^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$

Standard Deviation (S)
1. It is the square root of the variance.
2. It is the most commonly used measure of variation.
3. It shows variation about the mean.
4. It is used to compare two or more data sets when their means are equal; the data set with the smallest standard deviation is the least dispersed.

$$S = \sqrt{S^2}$$

• Example

Year of graduation   No. of students ($x$)   $x - \bar{x}$   $(x - \bar{x})^2$
2004                 4                       -2              4
2005                 6                        0              0
2006                 5                       -1              1
2007                 8                        2              4
2008                 7                        1              1
Total                30                       0              10

• Calculate the variance and standard deviation.

• Solution: The mean is $\bar{x} = 30 / 5 = 6$. Variance:

$$S^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} = \frac{10}{4} = 2.5$$

• Standard deviation:

$$S = \sqrt{S^2} = \sqrt{2.5} = 1.58$$

• Interpretation: On average, the yearly counts deviate from the mean (6 students) by about 1.58 students.

Variance and standard deviation for grouped data

• Formula

$$S^2 = \frac{\sum_{i} f_i x_i^2 - \dfrac{\left(\sum_{i} f_i x_i\right)^2}{n}}{n - 1}$$

Here the sums run over the classes, $x_i$ is the midpoint of class $i$, $f_i$ is its frequency, and $n = \sum_i f_i$ is the sample size.

• Example: the table below shows the temperature of a sample of 50 cities taken at the same time on a certain day; determine the mean and standard deviation of the sample.

Temp.   f    Cumulative F   Midpoint x   x^2    f.x    f.x^2
10-14   10   10             12           144    120    1440
15-19   12   22             17           289    204    3468
20-24   18   40             22           484    396    8712
25-29   6    46             27           729    162    4374
30-34   4    50             32           1024   128    4096
Total   50                                      1010   22090

$$\bar{x} = \frac{\sum f x}{\sum f} = \frac{1010}{50} = 20.2$$

$$S^2 = \frac{22090 - \dfrac{(1010)^2}{50}}{49} = \frac{1688}{49} = 34.45$$

$$S = \sqrt{34.45} = 5.87$$

Coefficient of Variation (C.V)
1. It is one of the main applications of the mean and standard deviation.
2. It measures relative variation and is always expressed as a percentage (%).
3. It can be used to compare two or more data sets measured in different units.
4. It is widely used in chemistry and engineering.
5. The variable with the smaller C.V is less dispersed than the others, so it is considered the better one.

• Formula:

$$C.V = \frac{S}{\bar{x}} \times 100\%$$

• Example: Suppose that technician A completes 40 analyses daily with a standard deviation of 5, and technician B completes 160 analyses per day with a standard deviation of 15.
• Which technician shows less variability (and is therefore better)?

• Sol.

$$C.V(A) = \frac{S}{\bar{x}} \times 100\% = \frac{5}{40} \times 100\% = 12.5\%$$

$$C.V(B) = \frac{S}{\bar{x}} \times 100\% = \frac{15}{160} \times 100\% = 9.4\%$$

• Technician B is better than A because B has the smaller relative variation.

Interquartile Range [IQR]
There are three quartiles, Q1, Q2 and Q3:
1. Q1 is the value below which 25% of the sorted data fall.
2. Q2 is the value below which 50% of the sorted data fall (the median).
3. Q3 is the value below which 75% of the sorted data fall.

Formulas

$$Q_1 = L_1 + \left( \frac{\frac{N}{4} - F_1}{f_1} \right) C_1$$

Q2 uses the same formula as the median.

$$Q_3 = L_3 + \left( \frac{\frac{3N}{4} - F_3}{f_3} \right) C_3$$

Here $L$ is the lower boundary of the quartile class, $F$ is the cumulative frequency of the classes before it, $f$ is the frequency of the quartile class, and $C$ is the class width.

• Example: you have the frequency table:

Class boundaries   Frequency (f)   Cumulative frequency (F)
31.5 – 36.5        4               4
36.5 – 41.5        7               11
41.5 – 46.5        10              21
46.5 – 51.5        7               28
51.5 – 56.5        18              46
56.5 – 61.5        4               50

Calculate Q1, Q2, Q3 and the interquartile range.

• Sol. Q1
Step 2: $\frac{N}{4} = \frac{50}{4} = 12.5$
Step 3: the first quartile class is [41.5 – 46.5]
Step 4: $L_1 = 41.5,\ F_1 = 11,\ f_1 = 10,\ C_1 = 5$

$$Q_1 = L_1 + \left( \frac{\frac{N}{4} - F_1}{f_1} \right) C_1 = 41.5 + \left( \frac{12.5 - 11}{10} \right) \times 5 = 42.25$$

• Sol. Q2
Step 2: $\frac{N}{2} = \frac{50}{2} = 25$
Step 3: the median class is [46.5 – 51.5]
Step 4: $L_2 = 46.5,\ F_2 = 21,\ f_2 = 7,\ C_2 = 5$

$$Q_2 = L_2 + \left( \frac{\frac{N}{2} - F_2}{f_2} \right) C_2 = 46.5 + \left( \frac{25 - 21}{7} \right) \times 5 = 49.36$$

• Sol. Q3
Step 2: $\frac{3N}{4} = \frac{3 \times 50}{4} = 37.5$
Step 3: the third quartile class is [51.5 – 56.5]
Step 4: $L_3 = 51.5,\ F_3 = 28,\ f_3 = 18,\ C_3 = 5$

$$Q_3 = L_3 + \left( \frac{\frac{3N}{4} - F_3}{f_3} \right) C_3 = 51.5 + \left( \frac{37.5 - 28}{18} \right) \times 5 = 54.14$$

Interquartile range = Q3 - Q1 = 54.14 - 42.25 = 11.89.

Box plot
• A box plot is a tool of descriptive statistics: a convenient way of graphically depicting groups of numerical data through their quartiles.
• Example: Plot a box plot for {7, 4, 3, 5, 6, 8, 10, 1}

• Solution:
• Sort the data: {1, 3, 4, 5, 6, 7, 8, 10}.
• The minimum value is 1 and the maximum value is 10.
• Calculate Q1, Q2 and Q3 from their positions in the sorted data, with $k$ the percentile and $n = 8$. When the position falls between two observations, the quartile is taken here as the average of those two values.

$$\text{position of } Q_1 = \frac{k(n + 1)}{100} = \frac{25(9)}{100} = 2.25, \qquad Q_1 = \frac{3 + 4}{2} = 3.5$$

$$\text{position of } Q_2 = \frac{k(n + 1)}{100} = \frac{50(9)}{100} = 4.5, \qquad Q_2 = \frac{5 + 6}{2} = 5.5$$

$$\text{position of } Q_3 = \frac{k(n + 1)}{100} = \frac{75(9)}{100} = 6.75, \qquad Q_3 = \frac{7 + 8}{2} = 7.5$$

[Figure: box plot of the data]

Note: The plot above was produced by computer using the software R:

    > A <- c(7, 4, 3, 5, 6, 8, 10, 1)
    > quantile(A)
       0%   25%   50%   75%  100%
     1.00  3.75  5.50  7.25 10.00

R's default quantile() method interpolates between neighbouring observations using a different position rule, so its Q1 and Q3 (3.75 and 7.25) differ slightly from the hand-computed values (3.5 and 7.5).
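The worked examples above can be cross-checked numerically. The following is a minimal sketch in Python using only the standard library; the helper names `grouped_mean_var`, `coeff_var` and `grouped_quantile` are illustrative and do not come from the lecture. The last part assumes that `statistics.quantiles` with `method="inclusive"` is an acceptable stand-in for R's default `quantile()` rule.

```python
import math
import statistics

# Ungrouped example: number of students per graduation year.
students = [4, 6, 5, 8, 7]
s2 = statistics.variance(students)      # sample variance (divides by n - 1)
s = math.sqrt(s2)                       # sample standard deviation


def grouped_mean_var(midpoints, freqs):
    """Mean and sample variance from class midpoints and frequencies."""
    n = sum(freqs)
    sum_fx = sum(f * x for f, x in zip(freqs, midpoints))
    sum_fx2 = sum(f * x ** 2 for f, x in zip(freqs, midpoints))
    return sum_fx / n, (sum_fx2 - sum_fx ** 2 / n) / (n - 1)


def coeff_var(sd, mean):
    """Coefficient of variation as a percentage."""
    return sd / mean * 100


def grouped_quantile(lower_bounds, freqs, width, p):
    """Quantile of order p (0 < p < 1) by interpolation inside the class."""
    target = p * sum(freqs)
    cum = 0                             # cumulative frequency before the class
    for lower, f in zip(lower_bounds, freqs):
        if cum + f >= target:
            return lower + (target - cum) / f * width
        cum += f
    raise ValueError("p must lie strictly between 0 and 1")


# Temperature table (grouped data).
mean_t, var_t = grouped_mean_var([12, 17, 22, 27, 32], [10, 12, 18, 6, 4])

# Quartile table (class lower boundaries, common width 5).
bounds = [31.5, 36.5, 41.5, 46.5, 51.5, 56.5]
freqs = [4, 7, 10, 7, 18, 4]
q1, q2, q3 = (grouped_quantile(bounds, freqs, 5, p) for p in (0.25, 0.5, 0.75))

# Box plot data: inclusive interpolation, as in the R output of the note.
box_q = statistics.quantiles([7, 4, 3, 5, 6, 8, 10, 1], n=4, method="inclusive")

print(round(s2, 2), round(s, 2))                         # 2.5 1.58
print(round(mean_t, 2), round(var_t, 2))                 # 20.2 34.45
print(coeff_var(5, 40), round(coeff_var(15, 160), 1))    # 12.5 9.4
print(round(q1, 2), round(q2, 2), round(q3, 2))          # 42.25 49.36 54.14
print(box_q)                                             # [3.75, 5.5, 7.25]
```

Computed without intermediate rounding, Q3 - Q1 is about 11.89, in agreement with the hand calculation.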
