PPA 415 – Research Methods in Public Administration
Lecture 4 – Measures of Dispersion

Introduction
By themselves, measures of central tendency cannot summarize data completely.
For a full description of a distribution of scores, measures of central tendency must be paired with measures of dispersion.
Measures of dispersion assess the variability of the data.
Variability can differ widely even when the distributions being compared have the same measures of central tendency.

Introduction – Example, JCHA 1999
[Figure: two histograms titled "How safe is your community?", with panels for Trafford and Red Hollow. X-axis: "How safe do you feel in your community?" First panel: Std. Dev = 2.67, Mean = 6.8, N = 14. Second panel: Std. Dev = 3.96, Mean = 6.8, N = 7.]

Introduction
Measures of dispersion discussed:
Index of qualitative variation (IQV).
The range and interquartile range.
Standard deviation and variance.

Index of Qualitative Variation
Used primarily for nominal variables, but can be used with any variable with a frequency distribution.
The IQV is the ratio of the amount of variation actually observed in a distribution of scores to the maximum variation that could exist in that distribution.
Maximum variation in a frequency distribution occurs when all cases are evenly distributed across all categories.
The measure gives you information on how homogeneous or heterogeneous a distribution is.
Index of Qualitative Variation
$$\mathrm{IQV} = \frac{k\left(N^2 - \sum f^2\right)}{N^2\,(k - 1)}$$
where:
k = number of categories
N = number of cases
$\sum f^2$ = the sum of the squared frequencies

Index of Qualitative Variation
JCHA 1999: Ethnicity in Housing Communities
Communities: Dixie Red Hickory Oak Terrace Fultondale Brookside Warrior I Warrior II Bradford Manor Trafford Hollow Grove Ridge Manor

Column   White (f1)   Nonwhite (f2)   Total (N)   Categories (k)   IQV
1        9            6               15          2                96.0%
2        20           4               24          2                55.6%
3        18           5               23          2                68.1%
4        8            4               12          2                88.9%
5        10           4               14          2                81.6%
6        2            13              15          2                46.2%
7        14           0               14          2                0.0%
8        5            2               7           2                81.6%
9        7            0               7           2                0.0%
10       0            17              17          2                0.0%
11       9            20              29          2                85.6%

Range and Interquartile Range
Range: the distance between the highest and lowest scores.
Only uses two scores.
Can be misleading if there are extreme values.
Interquartile range: only examines the middle 50% of the distribution.
Formally, it is the value at the 75th percentile minus the value at the 25th percentile.

Range and Interquartile Range
Problems: each measure is based on only two scores and ignores the remaining cases in the distribution.
$$\text{Range} = \text{Highest} - \text{Lowest}$$
$$\mathrm{IQR} = Q_3 - Q_1 = P_{75} - P_{25}$$

Range and Interquartile Range: JCHA 1999 Example
Statistics: How long have you lived at this address?

N valid          181
N missing        4
Minimum          1
Maximum          564
Percentile 25    24
Percentile 75    108

Range = Maximum - Minimum = 564 - 1 = 563
IQR = P75 - P25 = 108 - 24 = 84

The Standard Deviation
The basic limitation of both the range and the IQR is their failure to use all the scores in the distribution.
A good measure of dispersion should:
Use all the scores in the distribution.
Describe the average or typical deviation of the scores.
Increase in value as the distribution of scores becomes more heterogeneous.

The Standard Deviation
One way to do this is to start with the distances between every point and some central value like the mean.
The distances between the scores and the mean, $(X_i - \bar{X})$, are called deviation scores.
The greater the variability, the greater the deviation scores.
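The IQV formula and the range and IQR arithmetic from the slides above can be checked with a short Python sketch. The function name `iqv` and the variable names are illustrative choices, not part of the lecture. The frequency pairs are taken from the first three columns of the ethnicity table plus one column in which every case falls in a single category.

```python
def iqv(frequencies):
    """Index of qualitative variation: k(N^2 - sum f^2) / (N^2 (k - 1))."""
    k = len(frequencies)              # number of categories
    n = sum(frequencies)              # number of cases
    if k < 2 or n == 0:
        raise ValueError("need at least two categories and one case")
    sum_sq = sum(f * f for f in frequencies)   # sum of squared frequencies
    return k * (n * n - sum_sq) / (n * n * (k - 1))


# (White, Nonwhite) counts copied from the JCHA 1999 table.
examples = [(9, 6), (20, 4), (18, 5), (14, 0)]
iqv_percent = [round(100 * iqv(pair), 1) for pair in examples]
print(iqv_percent)

# Range and IQR from the reported summary statistics
# ("How long have you lived at this address?").
minimum, maximum = 1, 564
p25, p75 = 24, 108
value_range = maximum - minimum
iqr = p75 - p25
print(value_range, iqr)
```

An evenly split distribution such as `iqv([10, 10])` returns 1.0, the maximum, while a distribution with every case in one category returns 0.0.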
The Standard Deviation
One course of action is to sum the deviations and divide by the number of cases, but the sum of the deviations is always equal to zero.
The next solution is to make all deviations positive.
Absolute values give the average deviation.
Squared deviations give the variance and the standard deviation.

Average and Population Standard Deviation
Average deviation:
$$AD = \frac{\sum \lvert X_i - \bar{X} \rvert}{N}$$
Variance (population):
$$\sigma^2 = \frac{\sum (X_i - \bar{X})^2}{N}$$
Standard deviation (population):
$$\sigma = \sqrt{\frac{\sum (X_i - \bar{X})^2}{N}}$$

Sample Variance and Standard Deviation
Sample variance:
$$s^2 = \frac{\sum (X_i - \bar{X})^2}{n - 1}$$
Sample standard deviation:
$$s = \sqrt{\frac{\sum (X_i - \bar{X})^2}{n - 1}}$$

Computational Variance and Standard Deviation – Sample
Computational variance (sample):
$$s^2 = \frac{\sum x^2 - \frac{\left(\sum x\right)^2}{n}}{n - 1}$$
Computational sample standard deviation:
$$s = \sqrt{s^2}$$

Examples – JCHA 1999

Safety (Xi)   Xi - Mean   |Xi - Mean|   (Xi - Mean)^2   X^2
10            1.9         1.9           3.61            100
9             0.9         0.9           0.81            81
5             -3.1        3.1           9.61            25
5             -3.1        3.1           9.61            25
10            1.9         1.9           3.61            100
7             -1.1        1.1           1.21            49
10            1.9         1.9           3.61            100
10            1.9         1.9           3.61            100
10            1.9         1.9           3.61            100
5             -3.1        3.1           9.61            25
Sum: 81       0.0         20.8          48.90           705

N = 10, Mean = 81 / 10 = 8.1

Examples – Average and Standard Deviation
$$AD = \frac{\sum \lvert X_i - \bar{X} \rvert}{n} = \frac{20.8}{10} = 2.08$$
$$s^2 = \frac{\sum (X_i - \bar{X})^2}{n - 1} = \frac{48.9}{9} = 5.43$$
$$s = \sqrt{s^2} = \sqrt{5.43} = 2.33$$
Using the computational formula:
$$s^2 = \frac{\sum x^2 - \frac{\left(\sum x\right)^2}{n}}{n - 1} = \frac{705 - \frac{81^2}{10}}{9} = \frac{705 - 656.1}{9} = \frac{48.9}{9} = 5.43$$
$$s = \sqrt{s^2} = \sqrt{5.43} = 2.33$$

Grouped Standard Deviation
$$s = \sqrt{\frac{\sum f x_m^2 - \frac{\left(\sum f x_m\right)^2}{n}}{n - 1}}$$

Grouped Standard Deviation Example
What is your monthly household income?

Category           f     x_m     x_m^2       f*x_m     f*x_m^2
$500 or less       13    250     62,500      3,250     812,500
$1,000 or less     16    750     562,500     12,000    9,000,000
$1,500 or less     9     1,250   1,562,500   11,250    14,062,500
Total (valid)      38                        26,500    23,875,000
Missing values     5
Total              43

Grouped Standard Deviation Example
$$s = \sqrt{\frac{23{,}875{,}000 - \frac{26{,}500^2}{38}}{37}} = \sqrt{\frac{23{,}875{,}000 - 18{,}480{,}263.16}{37}} = \sqrt{\frac{5{,}394{,}736.84}{37}}$$
$$s = \sqrt{145{,}803.6984} = \$381.84$$
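The worked examples above can be reproduced with a minimal Python sketch using only the standard library. The function names are illustrative and not from the lecture. The data are the ten safety scores and the three income categories (midpoints 250, 750, 1250) shown in the examples.

```python
import math


def average_deviation(scores):
    """Mean of the absolute deviations from the mean."""
    mean = sum(scores) / len(scores)
    return sum(abs(x - mean) for x in scores) / len(scores)


def sample_variance(scores):
    """Definitional formula: sum of squared deviations over n - 1."""
    n = len(scores)
    mean = sum(scores) / n
    return sum((x - mean) ** 2 for x in scores) / (n - 1)


def sample_variance_computational(scores):
    """Computational formula: (sum x^2 - (sum x)^2 / n) / (n - 1)."""
    n = len(scores)
    return (sum(x * x for x in scores) - sum(scores) ** 2 / n) / (n - 1)


def grouped_sample_sd(midpoints, freqs):
    """Sample standard deviation from category midpoints and frequencies."""
    n = sum(freqs)
    sum_fx = sum(f * m for f, m in zip(freqs, midpoints))
    sum_fx2 = sum(f * m * m for f, m in zip(freqs, midpoints))
    return math.sqrt((sum_fx2 - sum_fx ** 2 / n) / (n - 1))


safety = [10, 9, 5, 5, 10, 7, 10, 10, 10, 5]
ad = average_deviation(safety)
s2 = sample_variance(safety)
s2_comp = sample_variance_computational(safety)
s = math.sqrt(s2)
income_sd = grouped_sample_sd([250, 750, 1250], [13, 16, 9])

print(round(ad, 2), round(s2, 2), round(s, 2), round(income_sd, 2))
```

The definitional and computational variances agree (48.9 / 9, about 5.43), the sample standard deviation is about 2.33, and the grouped standard deviation is about 381.84. The average deviation comes out to 2.08, which is the absolute-deviation column sum of 20.8 divided by 10.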