CHAPTER THREE: Measures of Central Tendency

Sometimes a single most representative number is used to depict a distribution for summary purposes. But what is a most representative number? It is a measure of CENTRAL TENDENCY, the place within a distribution where most members tend to occur. This chapter discusses the three types of representative numbers, as well as the computation and application of each.

MEASURES OF CENTRAL TENDENCY: There are three methods of measuring the central tendency of a distribution.

Types of measures:

THE MODE (Mo): The most frequently occurring score in a distribution. The mode tends to fall in the middle when a distribution is symmetrical, but there is no guarantee of that, so the mode can be very misleading. In fact, when all the members occur the same number of times, there is no mode. The mode is the only measure available when the data is nominal. When data is represented in a frequency distribution, the mode is simply the score with the highest frequency. This is not appropriate in a grouped frequency distribution.

THE MEDIAN (Md): The 50th percentile, or exact middle, of a distribution. The median also falls near the center of the distribution. It is insensitive to extreme numbers, or OUTLIERS, so it may not be the true balance point of the distribution. Still, every score, outlier or not, counts toward the ranking, and the median is generally not as potentially misleading as the mode. The median requires at least ordinal data, as scores must be ranked to find percentiles. When data is collected into a cumulative percentile frequency distribution, the median is the score at the 50th percentile rank.

THE MEAN (Mn): The arithmetic average of a distribution, or mean, is the balanced center of a distribution. Denoted by X̄, its formula is: X̄ = ∑X / N. The sum of the distances of all the scores from the mean is always equal to zero. This is true even when only one score is greater than the mean and all the other scores in the distribution are less than the mean.
For this to happen, the larger score is simply much further away from the mean. Consider the set X = 10, 10, 10, 10 and 60. Since the average is 20, each of the small scores is ten points below the mean, and the largest score is 40 points above the mean. An additional column can depict the distance of each score from X̄:

  X      X - X̄ (distance from the mean)
  60     60 - 20 =  40
  10     10 - 20 = -10
  10     10 - 20 = -10
  10     10 - 20 = -10
  10     10 - 20 = -10
  ___    ___________
  100    ∑(X - X̄) = 0

*Note: The sum of the distances of the scores of a distribution from its mean is always zero.

Furthermore, the mean lends itself to more complex calculations and requires a continuous scale of data. Unfortunately, the mean is extremely sensitive to outliers. That is what is happening in the example depicted above. The mean is supposed to measure the central tendency of the distribution, yet the mean is 20 and not one of the scores is a 20. When the distribution is symmetrical, or balanced in shape, the mean falls where most of the members of the distribution are. Regardless of whether the distribution is symmetrical or not, the sum of the squared distances from the mean is never larger than the sum of the squared distances from the mode or the median. Interestingly, when a distribution is perfectly symmetrical and has a single peak, all three measures of central tendency are equal to each other.

Let's consider a cumulative percentile frequency distribution (CPDF). Notice that the sum of X, ∑X, and the number of scores, N, seem to be computed differently. That is because a frequency distribution no longer lists the original data set score by score, unless every number appears exactly once. Adding an Xf column, in which each possible score is multiplied by the frequency in the f column, corrects this problem. Note that N still = ∑f, but now ∑X = ∑Xf.
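The worked example with X = 10, 10, 10, 10 and 60 can be checked with a short Python sketch. It uses only the standard-library statistics module. The names scores, deviations and sum_squared_distances are illustrative and not part of the chapter. The last comparison assumes that the property the text has in mind is the least-squares one: the sum of squared distances is smallest around the mean.

```python
from statistics import mean, median, mode

# The data set from the worked example.
scores = [10, 10, 10, 10, 60]

mn = mean(scores)      # arithmetic average: sum of X divided by N
md = median(scores)    # middle score of the ranked data
mo = mode(scores)      # most frequently occurring score

# Signed distance of every score from the mean.
deviations = [x - mn for x in scores]


def sum_squared_distances(center):
    """Sum of squared distances of all scores from a chosen center (illustrative helper)."""
    return sum((x - center) ** 2 for x in scores)


print("Mn, Md, Mo:", mn, md, mo)
print("distances from the mean:", deviations)
print("sum of distances:", sum(deviations))
print("squared distances around Mn:", sum_squared_distances(mn))
print("squared distances around Md:", sum_squared_distances(md))
```

The four scores below the mean contribute -10 each and the single outlier contributes 40, so the signed distances cancel. The same outlier is what pulls the mean up to 20 while the median and mode stay at 10.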
In such a distribution, the median is the 50th percentile and the mode is simply the score with the greatest number in the f column:

  Xf    X     f    cf    %     c%
  60    60    1    5     20    100
  0     50    0    4     0     80
  0     40    0    4     0     80
  0     30    0    4     0     80
  0     20    0    4     0     80
  40    10    4    4     80    80

∑X = ∑Xf = 100
N = ∑f = 5
Mo = 10, Md = 10.125, Mn = ∑X / N = 100 / 5 = 20

The median of 10.125 is consistent with interpolating within the real limits of the score 10 (9.5 to 10.5): 9.5 + (2.5 / 4) = 10.125.

DISTRIBUTION SHAPE: The shape of the distribution is depicted with a polygon. A SKEW, or pull in the distribution, will jeopardize the symmetry of that distribution. Consider these shapes.

SYMMETRY: A distribution is described as symmetrical when the curve of the polygon depicts a balanced image. A normal curve is balanced and mesokurtic, and all its measures of central tendency are equal.

SKEW: A skew, or tail, in a distribution can be pulled toward smaller or larger scores.

NEGATIVE SKEW: When the majority of scores are large and the exceptions are small, the data set will have a point, or skewer, toward the smaller numbers.

POSITIVE SKEW: When most of the numbers in the set are relatively small and the exceptions are relatively large, the tail, or skewer, will be toward the larger numbers.

SYMMETRY AND CENTRAL TENDENCY: The position of the measures of central tendency, in relation to each other, can indicate the shape of the distribution.
When the curve is symmetrical: Mn = Md.
When the curve is normal: Mo = Mn = Md.
When the curve has -skew: Mo > Md > Mn.
When the curve has +skew: Mo < Md < Mn.

KURTOSIS: The degree of peakedness of the curve in a polygon has three types:
Platykurtic polygons are flat (A).
Mesokurtic polygons are of medial height (B).
Leptokurtic polygons are pointed (C).
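The frequency-distribution calculations from this chapter (the Xf column, N = ∑f, the mode from the f column, and the median at the 50th percentile) can be tied to the skew rules with a small Python sketch. The names freq, interpolated_median and skew_direction are illustrative. The interpolation assumes each score spans real limits of half a unit on either side, an assumption chosen because it reproduces the chapter's Md = 10.125.

```python
# Frequency distribution from the chapter's example: score -> frequency.
freq = {60: 1, 50: 0, 40: 0, 30: 0, 20: 0, 10: 4}

n = sum(freq.values())                         # N = sum of f
sum_xf = sum(x * f for x, f in freq.items())   # sum of X = sum of Xf
mn = sum_xf / n                                # mean

mo = max(freq, key=freq.get)                   # score with the highest f


def interpolated_median(freq, width=1.0):
    """Median by linear interpolation within the real limits of a score.

    Assumes each score X covers the interval X - width/2 to X + width/2.
    """
    half = sum(freq.values()) / 2
    below = 0                                  # cumulative frequency so far
    for x in sorted(freq):
        f = freq[x]
        if f > 0 and below + f >= half:
            lower_limit = x - width / 2
            return lower_limit + (half - below) / f * width
        below += f
    raise ValueError("empty distribution")


def skew_direction(mo, md, mn):
    """Rule of thumb from the ordering of mode, median, and mean."""
    if mo < md < mn:
        return "positive"
    if mo > md > mn:
        return "negative"
    return "undetermined"


md = interpolated_median(freq)
print("N:", n, "sum of Xf:", sum_xf)
print("Mo, Md, Mn:", mo, md, mn)
print("skew:", skew_direction(mo, md, mn))
```

For this distribution the ordering is Mo < Md < Mn, which matches the +skew rule: most scores are small and the single score of 60 forms the tail toward the larger numbers.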