Important Summary Statistics (Measures of Dispersion, Skewness and Kurtosis)
Dr. Dennis S. Mapa
Prof. Michael Dominic del Mundo
UP School of Statistics

Measures of Dispersion
Measures of dispersion indicate the extent to which individual items in a series are scattered about an average. They are used as a measure of the reliability of the average value.

General Classifications of Measures of Dispersion
Measures of Absolute Dispersion - used to describe the variability of a data set
Measures of Relative Dispersion - used to compare two or more data sets with different means and different units of measurement

Variance and Standard Deviation
The variance and standard deviation are measures of dispersion of data with respect to the mean. For a finite population of size N, the population variance is

$$\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}$$

The population standard deviation is defined as the positive square root of the variance,

$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}}$$

The standard deviation is often referred to as the measure of “volatility.”

For a sample of size n, the sample variance is defined as,

$$s^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$$

and the sample standard deviation is defined as,

$$s = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$$

If there is a large amount of variation in the data set, the data values will tend to lie far from the mean. In this case, the standard deviation will be large.
If, on the other hand, there is only a small amount of variation in the data set, the data values will be close to the mean. Hence, the standard deviation will be small.

Characteristics of the Standard Deviation
Just like the mean, it is affected by the value of every observation.
It may be distorted by a few extreme values.
It is never negative, and it equals zero only when all observations are identical.

Measures of Relative Dispersion
Measures of relative dispersion are unitless and are used to compare the scatter of one distribution with the scatter of another distribution.

Coefficient of Variation
A commonly used measure of relative dispersion.
The coefficient of variation utilizes two measures: the mean and the standard deviation. It is expressed as a percentage, which removes the unit of measurement and thus allows comparison of two or more data sets.
The formula of the coefficient of variation is given as,

$$CV = \frac{\sigma}{\mu} \times 100\%$$

The sample counterpart is defined as,

$$CV = \frac{s}{\bar{X}} \times 100\%$$

Standard Score
The standard score measures how many standard deviations an observation is above or below the mean. It is computed as,

$$Z = \frac{X - \mu}{\sigma}$$

and the sample counterpart is,

$$Z = \frac{X - \bar{X}}{s}$$

The standard score is not a measure of relative dispersion per se but is somewhat related. It is useful for comparing two values from different series, especially when the two series differ in mean, in standard deviation, or in both, or when they are expressed in different units.

Measure of Skewness
A measure of skewness shows the degree of asymmetry, or departure from symmetry, of a distribution. It indicates not only the amount of skewness but also its direction (skewed to the left or skewed to the right).
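The dispersion formulas given earlier (population and sample variance, standard deviation, coefficient of variation, and standard score) can be checked numerically. The following is a minimal Python sketch, not part of the original slides; the function names and the data values are illustrative assumptions.

```python
import math


def mean(xs):
    """Arithmetic mean of a list of numbers."""
    return sum(xs) / len(xs)


def population_variance(xs):
    """Sum of squared deviations from the mean, divided by N."""
    mu = mean(xs)
    return sum((x - mu) ** 2 for x in xs) / len(xs)


def sample_variance(xs):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    xbar = mean(xs)
    return sum((x - xbar) ** 2 for x in xs) / (len(xs) - 1)


def sample_std(xs):
    """Positive square root of the sample variance."""
    return math.sqrt(sample_variance(xs))


def coefficient_of_variation(xs):
    """Sample CV in percent: (s / sample mean) * 100."""
    return sample_std(xs) / mean(xs) * 100.0


def standard_scores(xs):
    """Sample standard score (X - sample mean) / s for each observation."""
    xbar, s = mean(xs), sample_std(xs)
    return [(x - xbar) / s for x in xs]


# Hypothetical data, chosen only for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

print("population variance:", population_variance(data))
print("population std dev: ", math.sqrt(population_variance(data)))
print("sample variance:    ", sample_variance(data))
print("sample std dev:     ", sample_std(data))
print("CV (%):             ", coefficient_of_variation(data))
print("standard scores:    ", standard_scores(data))
```

Because the coefficient of variation divides by the mean, this sketch is only meaningful for data whose mean is not zero (and, in practice, for data measured on a positive scale).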
Positive Skewness or Skewed to the Right
distribution tapers more to the right than to the left
longer tail to the right
more concentration of values below than above the mean

[Figure: Positive Skewness. Frequency distribution of income (horizontal axis: Income).]

Negative Skewness or Skewed to the Left
distribution tapers more to the left than to the right
longer tail to the left
more concentration of values above than below the mean

[Figure: Negative Skewness.]

Negative Skewness
Rarely do we find curves that are skewed to the left, and even more rarely do we find data characteristically skewed to the left.
However, recent studies have shown that asset returns (particularly equities) follow a left-skewed distribution during times of “crisis.”

Measure of Skewness
Pearson’s Coefficient of Skewness

$$SK = \frac{3(\bar{X} - Md)}{s}$$

where Md is the median.
SK = 0 implies symmetry
SK > 0 implies positive skewness
SK < 0 implies negative skewness

Kurtosis
The skewness coefficient enables the analyst to distinguish between a symmetric distribution and a nonsymmetric distribution, but that still leaves the problem of distinguishing between two symmetric distributions with different shapes.

[Figure: two distributions, both symmetric about their means, with different peaks and tails.]

The two graphs are both symmetric with respect to their mean, but differ with respect to their peaks and tails.
The distribution on the right has greater kurtosis (more peaked, less flat), but it is possible that it has the same standard deviation as the graph on the left, which is more spread out but thinner at the tails.
Kurtosis is a measure that distinguishes distributions by measuring peakedness in relation to tails.

Measuring Kurtosis
Kurtosis is a standardized version of the fourth moment, defined as

$$K = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^4}{n s^4}$$

K = 3: mesokurtic distribution
K > 3: leptokurtic distribution
K < 3: platykurtic distribution

A platykurtic distribution is one with a flatter peak.
A mesokurtic distribution is one with neither a pointed peak nor a flat peak.
A leptokurtic distribution is one with a more pointed peak.

Intuitively, we can think of the kurtosis coefficient as a measure that indicates whether a symmetric distribution, when compared with the Normal distribution (which has kurtosis equal to 3), has thicker tails and a more pointed peak.
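To see how the skewness and kurtosis coefficients behave, the two formulas can be sketched in Python. This is an illustration under stated assumptions, not the authors' code: the function names, the tolerance used to decide when K is "equal" to 3, and the data sets are all hypothetical, and the sample standard deviation (divisor n - 1) is assumed for s in both formulas.

```python
import statistics


def pearson_skewness(xs):
    """Pearson's coefficient: SK = 3 * (mean - median) / s."""
    xbar = statistics.mean(xs)
    md = statistics.median(xs)
    s = statistics.stdev(xs)  # sample standard deviation (divisor n - 1)
    return 3 * (xbar - md) / s


def kurtosis(xs):
    """K = sum((X_i - mean)^4) / (n * s^4), compared against 3."""
    n = len(xs)
    xbar = statistics.mean(xs)
    s = statistics.stdev(xs)  # assumption: s is the sample standard deviation
    return sum((x - xbar) ** 4 for x in xs) / (n * s ** 4)


def classify_kurtosis(k, tol=1e-9):
    """Label a kurtosis value; tol is an assumed tolerance for K = 3."""
    if abs(k - 3) <= tol:
        return "mesokurtic"
    return "leptokurtic" if k > 3 else "platykurtic"


# Hypothetical data sets, chosen only for illustration.
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 2, 2, 3, 10]   # long tail to the right
left_skewed = [1, 8, 9, 9, 10]    # long tail to the left
heavy_tailed = [-10, 0, 0, 0, 0, 0, 0, 0, 0, 10]

for name, xs in [("symmetric", symmetric),
                 ("right_skewed", right_skewed),
                 ("left_skewed", left_skewed),
                 ("heavy_tailed", heavy_tailed)]:
    k = kurtosis(xs)
    print(name, "SK =", pearson_skewness(xs),
          "K =", k, "->", classify_kurtosis(k))
```

With very small samples such as these, the classification is only indicative; the comparison with 3 is most meaningful for reasonably large data sets.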