Describing Data
Wahyu Wibowo

Central Tendency
A measure of central tendency is a value used to characterize the center of the set of values a variable contains. It is useful for quantifying the middle or central location of a variable.
o Central location for quantitative data: the mode, the median, and the mean.
o Central location for qualitative data: the mode.

Dispersion Parameter
The boxplot provides an indication of how the values spread around the median. The field of statistics has developed parameters that describe this spread, or dispersion, using a single measure:
o interquartile range
o range
o median absolute deviation
o standard deviation and variance

Coefficient of Variation
If two variables are measured in different units, their standard deviations cannot be compared directly as measures of dispersion. The coefficient of variation (CV) can be used to compare dispersions measured in different units. It is equal to the quotient of the standard deviation and the absolute value of the mean:
$$ CV = \frac{s}{|\bar{x}|} $$

Skewness
Skewness is a measure of distribution asymmetry. Yule & Pearson express the difference between the mean and the median, scaled by the standard deviation, as a degree of deviation from symmetry:
$$ \text{Skew} = \frac{3(\bar{x} - x_{med})}{s} $$
Values larger than 0 indicate a right-skewed distribution, values less than 0 indicate a left-skewed distribution, and a value of 0 indicates a symmetric distribution.

Kurtosis
Kurtosis is used to help determine which form is present, that is, whether the distribution is peaked or flat. It is defined as the fourth central moment divided by the fourth power of the standard deviation.

Distribution of Data
Describes how the different values are distributed around the central location.

Boxplot
Box plots provide a succinct summary of the overall frequency distribution of a variable. Six values are usually displayed: the lowest value, the lower quartile (Q1), the median (Q2), the upper quartile (Q3), the highest value, and the mean.

Dotplot
Use a dotplot to assess and compare distributions by plotting the values along a number line. Dotplots are especially useful for comparing distributions. The x-axis for a dotplot is divided into many small intervals, or bins.
Data values falling within each bin are represented by dots.

Stem and Leaf
Use a stem-and-leaf display to examine the shape and spread of sample data. The display has three columns:
o The leaves (right)
o The stem (middle)
o Counts (left)

Histograms
The purpose of a histogram is to graphically summarize the distribution of a univariate data set. The histogram graphically shows the following:
1. center (i.e., the location) of the data;
2. spread (i.e., the scale) of the data;
3. skewness of the data;
4. presence of outliers; and
5. presence of multiple modes in the data.

References:
o Exploratory Data Analysis in Business and Economics.
o Toit, S.H.C., Steyn, A.G.W., Stumpf, R.H., Graphical Exploratory Data Analysis, Springer-Verlag.
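
Worked example: numerical summaries
The measures of central tendency, dispersion, and shape described in these slides can be sketched with Python's standard `statistics` module. This is a minimal illustration under stated assumptions, not a reference implementation: the function name `describe`, the sample data, and the choice of the "inclusive" quartile method are all made up for this example, and other quartile conventions give slightly different Q1 and Q3 values. The dictionary also contains the six values a boxplot usually displays.

```python
import statistics as st


def describe(values):
    """Summarize a list of at least two numbers (hypothetical helper)."""
    xs = sorted(float(v) for v in values)
    n = len(xs)
    mean = st.fmean(xs)
    median = st.median(xs)
    # Quartiles by linear interpolation (assumption: "inclusive" method).
    q1, _, q3 = st.quantiles(xs, n=4, method="inclusive")
    s = st.stdev(xs)  # sample standard deviation (divides by n - 1)
    # Second and fourth central moments (these divide by n).
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    nan = float("nan")
    return {
        "mean": mean,
        "median": median,
        "modes": st.multimode(xs),  # a list, since data can be multimodal
        "min": xs[0],
        "q1": q1,
        "q3": q3,
        "max": xs[-1],
        "range": xs[-1] - xs[0],
        "iqr": q3 - q1,
        # Median absolute deviation around the median.
        "mad": st.median([abs(x - median) for x in xs]),
        "variance": st.variance(xs),
        "std": s,
        # CV = s / |mean|; undefined when the mean is zero.
        "cv": s / abs(mean) if mean != 0 else nan,
        # Yule & Pearson skewness: 3 * (mean - median) / s.
        "skew": 3 * (mean - median) / s if s > 0 else nan,
        # Fourth central moment divided by the squared second moment.
        "kurtosis": m4 / m2 ** 2 if m2 > 0 else nan,
    }


summary = describe([2, 4, 4, 5, 7, 9, 11])  # made-up sample data
for name, value in summary.items():
    print(f"{name:>9}: {value}")
```

For this sample the mean (6) lies above the median (5), so the Yule & Pearson skewness is positive, which the slides interpret as a right-skewed distribution.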
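
Worked example: text dotplot and stem-and-leaf display
The binning idea behind the dotplot and the three-column layout of the stem-and-leaf display can be sketched as plain-text output. This is only an illustration under several assumptions: the helper names `dotplot` and `stem_and_leaf` and the sample data are hypothetical, values are non-negative integers, the dotplot is drawn sideways (one row per bin instead of bins along the x-axis), and the count column holds the number of leaves per stem, whereas statistical packages may show cumulative counts instead.

```python
from collections import Counter, defaultdict


def dotplot(values, bin_width):
    """One text row per bin: the bin's lower edge, then one dot per value."""
    # Map each value to the lower edge of the bin it falls into.
    counts = Counter((v // bin_width) * bin_width for v in values)
    lo, hi = min(counts), max(counts)
    return [
        f"{edge:>4} | {'.' * counts[edge]}"
        for edge in range(lo, hi + bin_width, bin_width)
    ]


def stem_and_leaf(values):
    """Rows of 'count stem | leaves', using the last digit as the leaf."""
    rows = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        rows[stem].append(leaf)
    lines = []
    # Include empty stems so that gaps in the data stay visible.
    for stem in range(min(rows), max(rows) + 1):
        leaves = rows.get(stem, [])
        lines.append(f"{len(leaves):>3} {stem:>3} | {''.join(map(str, leaves))}")
    return lines


data = [12, 15, 21, 23, 23, 28, 34, 47]  # made-up sample data
print("\n".join(dotplot(data, bin_width=10)))
print()
print("\n".join(stem_and_leaf(data)))
```

Both displays show the same shape for this sample: most values fall in the 20s, with a thin tail toward larger values.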