Data Analysis Techniques II: Measures of Central Tendencies, Dispersion and Symmetry
Advanced Planning Techniques, Lecture 9
Prof. Dr. S. Shabih-ul-Hassan Zaidi
Dean, Faculty of Architecture & Planning, University of Engineering & Technology, Lahore.

The Mean

The Arithmetic Mean or Average is the sum of the values of all items divided by the number of items. The mean is a very important measure of the central tendency of the data, because it not only gives a summary of the data but is also used in further analysis. In case of grouped data the Mean is calculated by the formula:

Mean = x̄ = Σ f·x / n

Where
f = Frequency of the group
x = Central value of the group, i.e. (lower limit + upper limit) / 2
n = Total of the frequencies

The Median

The Median is the central item of a series arranged in order of magnitude. Its value divides the series into two equal parts. In case of an even number of items, the average of the two central items is taken as the Median. In case of grouped data the median can be calculated by the formula:

Median = l + (i/f) (n/2 - c)

Where
l = Lower limit of the median group, i.e. the group in which the value of n/2 lies
i = Class interval of the median group
f = Frequency of the median group
n = Total of the frequencies
c = Cumulative frequency of the group preceding the median group

The Quartiles, Deciles and the Percentiles

The median divides the series into two equal parts. Similarly, the series can be divided into 4 equal parts and the values of the Quartiles can be calculated. On the same pattern, the values of the Deciles (10 equal parts) and of the Percentiles (100 equal parts) can be calculated.
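The grouped-data formulas for the mean and the median given above can be checked numerically. The following is a minimal sketch in Python, not part of the lecture. The frequency table and the function names grouped_mean and grouped_median are hypothetical, chosen only for illustration. The classes are assumed to be contiguous and listed in ascending order.

```python
# Hypothetical frequency table (not from the lecture):
# each entry is (lower limit, upper limit, frequency).
classes = [(0, 10, 5), (10, 20, 8), (20, 30, 12), (30, 40, 10), (40, 50, 5)]


def grouped_mean(classes):
    """Mean = sum(f * x) / n, with x the central value of each group."""
    n = sum(f for _, _, f in classes)
    return sum(f * (lo + hi) / 2 for lo, hi, f in classes) / n


def grouped_median(classes):
    """Median = l + (i / f) * (n/2 - c) for the group in which n/2 lies."""
    n = sum(f for _, _, f in classes)
    c = 0  # cumulative frequency of the groups preceding the current one
    for lo, hi, f in classes:
        # The median group is the first one whose cumulative frequency reaches n/2.
        if f > 0 and c + f >= n / 2:
            return lo + (hi - lo) / f * (n / 2 - c)
        c += f
    raise ValueError("empty frequency table")


mean = grouped_mean(classes)
median = grouped_median(classes)
print("Mean:", mean)
print("Median:", round(median, 4))
```

For this table n = 40, so n/2 = 20 falls in the 20-30 group (l = 20, i = 10, f = 12, c = 13). The median is therefore 20 + (10/12)(20 - 13), roughly 25.83, against a mean of 25.5.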
In case of grouped data the formulas for calculation of these measures are given below:

First Quartile = l + (i/f) (n/4 - c)
Third Quartile = l + (i/f) (3n/4 - c)
First Decile = l + (i/f) (n/10 - c)
Ninth Decile = l + (i/f) (9n/10 - c)
First Percentile = l + (i/f) (n/100 - c)
Ninety-ninth Percentile = l + (i/f) (99n/100 - c)

Here l, i, f and c have the same meaning as in the median formula, but they refer to the group in which the value of n/4, 3n/4, n/10 and so on lies.

The Mode

The mode is the value of the item which is repeated the maximum number of times in a series. In case of grouped data the mode can be calculated by the following formula:

Mode = l1 + [f2 / (f1 + f2)] x i

Where
l1 = Lower limit of the modal group, i.e. the group with the maximum frequency
f1 = Frequency of the modal group
f2 = Frequency of the next higher frequency group
i = Class interval

Measures of Dispersion

The measures of dispersion describe the spread or scatter of the data. The following measures of dispersion are usually calculated:

Range = Max. value - Min. value, or Range = Fn - F1
Mean Deviation = Σ f |x - x̄| / n
Variance = Σ f (x - x̄)² / n
Standard Deviation = Square root of the Variance
Quartile Deviation = (Q3 - Q1) / 2

Where
x = Central value of the group
x̄ = Mean
f = Frequency of the group
n = Total of the frequencies

Measure of Symmetry

Perfectly symmetrical data can be represented by a bell-shaped curve. In this case the mean, the median and the mode have the same value. The opposite of symmetry is skewness. Therefore, Pearson's Coefficient of Skewness is calculated to measure the symmetry of the data. The formula is:

Coefficient of Skewness = (Mean - Mode) / Standard Deviation
or
Coefficient of Skewness = 3 (Mean - Median) / Standard Deviation

If the value of the Coefficient is negative, the distribution is said to be negatively skewed, i.e. most items have large values and a long tail of small values pulls the mean below the mode. The distribution is said to be positively skewed when most items have small values and a long tail of large values pulls the mean above the mode.
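The quartile, dispersion and skewness formulas above can be tried out on a small frequency table. This is a minimal sketch in Python, not part of the lecture. The table and the helper name grouped_quantile are hypothetical. Two assumptions are made: the mean deviation is weighted by the group frequencies in the same way as the variance, and the median form of Pearson's coefficient is used so that no mode is needed.

```python
import math

# Hypothetical frequency table (not from the lecture):
# each entry is (lower limit, upper limit, frequency), in ascending order.
classes = [(0, 10, 5), (10, 20, 8), (20, 30, 12), (30, 40, 10), (40, 50, 5)]
n = sum(f for _, _, f in classes)
mids = [(lo + hi) / 2 for lo, hi, _ in classes]  # central value of each group
freqs = [f for _, _, f in classes]


def grouped_quantile(classes, p):
    """l + (i / f) * (p*n - c); p = 1/4 gives Q1, 1/2 the median, 3/4 Q3."""
    total = sum(f for _, _, f in classes)
    c = 0  # cumulative frequency of the groups preceding the current one
    for lo, hi, f in classes:
        if f > 0 and c + f >= p * total:
            return lo + (hi - lo) / f * (p * total - c)
        c += f
    raise ValueError("empty frequency table")


mean = sum(f * x for f, x in zip(freqs, mids)) / n
# Assumption: deviations are weighted by frequency, as in the variance.
mean_dev = sum(f * abs(x - mean) for f, x in zip(freqs, mids)) / n
variance = sum(f * (x - mean) ** 2 for f, x in zip(freqs, mids)) / n
std_dev = math.sqrt(variance)

q1, median, q3 = (grouped_quantile(classes, p) for p in (0.25, 0.5, 0.75))
quartile_dev = (q3 - q1) / 2

# Median form of Pearson's coefficient of skewness.
skewness = 3 * (mean - median) / std_dev

print("Mean deviation:", mean_dev)
print("Variance:", variance)
print("Standard deviation:", round(std_dev, 4))
print("Q1, Q3, quartile deviation:", q1, q3, quartile_dev)
print("Coefficient of skewness:", round(skewness, 4))
```

For this table the mean (25.5) lies slightly below the median (about 25.83), so the coefficient comes out slightly negative.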