Chapter 3
Measures of Central Tendency and Dispersion

Copyright © 2012 by Nelson Education Limited.

In this presentation you will learn about:
• Concepts of Central Tendency and Dispersion
• Mode, Median, Mean
• Variation Ratio, Range and Interquartile Range, Variance and Standard Deviation
• Choosing a Measure of Central Tendency and Dispersion

The Concept of Central Tendency
• Central Tendency = most typical, central, or common score of a variable.
• Three measures of Central Tendency:
   – Mode: The most common score.
   – Median: The score of the middle case.
   – Mean: The average score.

The Concept of Central Tendency (continued)
• Mode, median, and mean are three different statistics.
• They report three different kinds of information and will have the same value only in certain specific situations.
• They vary in terms of:
   – Level-of-measurement considerations
   – How they define central tendency

The Concept of Dispersion
• Dispersion = variety, diversity, amount of variation between scores.
• The greater the dispersion of a variable, the greater the range of scores and the greater the differences between scores.

The Concept of Dispersion (continued)
• The taller curve has less dispersion.
• The flatter curve has more dispersion.

The Concept of Dispersion: Examples
• Students in a given class tend to differ in their final exam marks.
• Canadians tend to differ in their incomes.
• Countries are diverse in their average incomes (e.g., the average income of Americans is higher [at about $49,000 in 2011] than that of Canadians [$40,000 in 2011]).

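As a quick illustration of the point that mode, median, and mean are three different statistics, the sketch below computes all three for two small sets of scores using Python's standard `statistics` module. Both data sets are hypothetical and were invented for this example; they do not come from the slides.

```python
from statistics import mean, median, mode

# Hypothetical scores, invented for illustration only.
skewed_scores = [1, 2, 2, 3, 4, 7, 9]      # a few high scores pull the mean up
symmetric_scores = [1, 2, 3, 3, 3, 4, 5]   # balanced around the middle

for label, scores in [("skewed", skewed_scores), ("symmetric", symmetric_scores)]:
    print(label)
    print("  mode:  ", mode(scores))    # the most common score
    print("  median:", median(scores))  # the score of the middle case
    print("  mean:  ", mean(scores))    # the average score
```

In the first set the three measures disagree; in the second, symmetric set they coincide, which is one of the "specific situations" in which they take the same value.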
Measure of Central Tendency and Dispersion for Nominal Variables
• Mode
• Variation Ratio

Mode
• The most common score.
• Can be used with variables at all three levels of measurement.
• Most often used with nominal-level variables.

Finding the Mode
1. Count the number of times each score occurred.
2. The score that occurs most often is the mode.
   – If the variable is presented in a frequency distribution, the mode is the largest category.
   – If the variable is presented in a line chart, the mode is the highest peak.

Finding the Mode: An Example

Variation Ratio (v)
• Variation Ratio (v) is one of only a few measures of dispersion for nominal-level variables.
• v provides a quick, easy way to quantify dispersion.
• Variation Ratio is simply the proportion of cases not in the modal category. That is: $v = 1 - \frac{f_m}{n}$, where $f_m$ is the number of cases in the modal category and $n$ is the total number of cases.
• v has a lower limit of 0.00 (no variation: all cases are in the mode) and increases toward 1.00 as the proportion of cases in the mode decreases.
• Thus, the larger the v, the more dispersion in a variable.

Variation Ratio: An Example
Conclusion: Canadian society has grown increasingly diverse in its ethnocultural composition, and will be quite heterogeneous by 2017.

Measure of Central Tendency and Dispersion for Ordinal Variables
• Median
• Range/Interquartile Range

Median (Md)
• Exact center of the distribution of scores.
• The score of the middle case.
• Can be used with variables measured at the ordinal or interval-ratio levels.
   – Cannot be used for nominal-level variables.

Finding the Median
1. Array the cases from high to low.
2. Locate the middle case.
   – If n is odd: the median is the score of the middle case.
   – If n is even: the median is the average of the scores of the two middle cases.

Finding the Median: Odd Number of Cases

Finding the Median: Even Number of Cases

The Range (R)
• Range (R) = High Score - Low Score
• Quick and easy indication of variability.
• Can be used with ordinal or interval-ratio variables.
• Limited because it is based on only two scores:
   1. Distorted by atypically high or low scores (often referred to as outliers).
   2. Gives no information about variation between the high and low scores.

Interquartile Range (Q)
• Interquartile Range (Q) = distance from the third quartile (Q3) to the first quartile (Q1), or symbolically, Q = Q3 - Q1.
• A quartile is a type of percentile.
• A percentile is the point below which a specific percentage of cases fall. Thus:
   – the first quartile, Q1, is the point below which 25% of the cases fall;
   – the third quartile, Q3, is the point below which 75% of the cases fall.
• Q avoids some problems of R by focusing only on the middle 50 percent of scores.

Measure of Central Tendency and Dispersion for Interval-Ratio Variables
• Mean
• Standard Deviation

Mean
• Reports the average score of a distribution.
• By far the most commonly used measure of central tendency.
• Requires variables measured at the interval-ratio level.
• Symbolized as X̄ for a sample and µ for a population.

Finding the Mean
• The calculation is straightforward: add the scores and then divide by the number of scores.
• The mathematical formula for the sample mean is*: $\bar{X} = \frac{\sum X_i}{n}$
*The population mean, µ, is calculated using the same method.

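The nominal and ordinal measures described so far (mode, variation ratio, median, range, and interquartile range) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: both data sets are hypothetical, `variation_ratio` is a helper written for this example, and the quartiles come from `statistics.quantiles` with its default method. Quartile conventions differ between textbooks and software, so Q computed by hand with another convention may differ slightly.

```python
from collections import Counter
from statistics import median, quantiles


def variation_ratio(cases):
    """Proportion of cases NOT in the modal category: v = 1 - f_m / n."""
    counts = Counter(cases)
    _, modal_count = counts.most_common(1)[0]
    return 1 - modal_count / len(cases)


# Hypothetical nominal variable (invented for illustration).
languages = ["English"] * 6 + ["French"] * 3 + ["Other"] * 1
modal_category = Counter(languages).most_common(1)[0][0]
v = variation_ratio(languages)
print("mode:", modal_category, " variation ratio:", round(v, 2))

# Hypothetical scores with one atypically high value (an outlier).
scores = [2, 4, 7, 9, 12, 15, 40]
md = median(scores)                      # score of the middle case (n is odd)
score_range = max(scores) - min(scores)  # R = high score - low score
q1, _, q3 = quantiles(scores, n=4)       # first, second, third quartiles
iqr = q3 - q1                            # Q = Q3 - Q1
print("median:", md, " range:", score_range, " interquartile range:", iqr)
```

Because of the single outlier, the range is much larger than the interquartile range, which looks only at the middle 50 percent of scores.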
Finding the Mean: An Example
The mean of these five scores is 78.

Characteristics of the Mean
1. All scores cancel out around the mean.
2. The mean is the point of minimized variation (the “least squares” principle).
3. The mean uses all the scores.

1. All Scores Cancel Out Around the Mean

2. Mean is the Point of Minimized Variation
• As illustrated in the table above:
   – If we square and sum the differences between the scores and the mean (78), we get a total of 388.
   – If we performed the same operation with any OTHER number (say, the value 77), the sum WILL ALWAYS be greater than 388.
   – For example, the sum of the squared differences around 77 is 393.
• Hence, the mean is the point in a distribution around which the variation of the scores (as indicated by the squared differences) is minimized: we call this the “least squares” principle.

3. Mean is Affected by Every Score
• Conclusion: The strength of the mean is that it uses all the available information from the variable. However, its weakness is that it is affected by every score.
• If there are some very high or low scores, the mean may be misleading.

Means, Medians, and Skew
• When a distribution has a few very high or low scores (outliers), the mean will be pulled in the direction of the extreme scores.
   – For a positive skew, the mean will be greater than the median.
   – For a negative skew, the mean will be less than the median.
• When an interval-ratio variable has a pronounced skew, the median may be the more trustworthy measure of central tendency.

Means, Medians, and Skew (continued)

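The three characteristics of the mean, and its sensitivity to extreme scores, can be checked numerically. The sketch below is an illustration only: the five scores are an assumption, chosen so that their mean is 78 and their squared deviations sum to 388, matching the totals quoted in the slides; the original table is not reproduced here. The added score of 300 is likewise invented to show the pull of an outlier.

```python
from statistics import mean, median

# Assumed illustrative scores: mean 78, squared deviations summing to 388.
scores = [65, 73, 77, 85, 90]
m = mean(scores)

# 1. All scores cancel out around the mean: the deviations sum to zero.
deviation_sum = sum(x - m for x in scores)


def sum_of_squares(center):
    """Sum of squared differences between each score and `center`."""
    return sum((x - center) ** 2 for x in scores)


# 2. Least squares: the sum of squares is smallest around the mean.
ss_around_mean = sum_of_squares(m)
ss_around_77 = sum_of_squares(77)

# 3. The mean uses every score, so one extreme score shifts it strongly,
#    while the median moves much less (positive skew: mean > median).
with_outlier = scores + [300]  # hypothetical outlier

print("mean:", m, " sum of deviations:", deviation_sum)
print("sum of squares around mean:", ss_around_mean, " around 77:", ss_around_77)
print("with outlier -> mean:", mean(with_outlier), " median:", median(with_outlier))
```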
Standard Deviation
• A measure of the degree of dispersion of the data from the mean.
• Specifically, the “average” distance of each score from the mean.
• The lowest value possible is 0 (no dispersion).
• Symbolized as s for a sample and σ (sigma) for a population.
• The square of the standard deviation is the variance, symbolized as s² for a sample and σ² for a population.
• Is used in combination with the mean to describe a “Normal” distribution (Ch. 4).

Standard Deviation (continued)
• Meets the criteria for a good measure of dispersion:
   1. Uses all scores in the distribution.
   2. Describes the average or typical deviation of the scores.
   3. Increases in value as the distribution of scores becomes more diverse.
• As with the mean, the standard deviation requires variables measured at the interval-ratio level.

Formula for Sample Standard Deviation, s*
$s = \sqrt{\frac{\sum (X_i - \bar{X})^2}{n}}$
*The population standard deviation, σ, is calculated using the same method.

Computing Standard Deviation
• To solve:
   1. Subtract the mean from each score.
   2. Square the deviations.
   3. Sum the squared deviations.
   4. Divide the sum of the squared deviations by the number of scores.
   5. Find the square root of the result.

Computing Standard Deviation: An Example

Summary: Relationship Between Level of Measurement and Measure of Central Tendency and Dispersion
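The five steps listed under "Computing Standard Deviation" translate directly into code. The sketch below follows those steps literally, including dividing by the number of scores in step 4. The function name and the five scores are assumptions made for illustration. Note that many textbooks and libraries divide by n - 1 for a sample instead; in Python's standard library that version is `statistics.stdev`, while `statistics.pstdev` divides by n as the slides do.

```python
from math import sqrt
from statistics import pstdev


def standard_deviation(scores):
    """Standard deviation computed with the five steps from the slides."""
    n = len(scores)
    mean = sum(scores) / n
    deviations = [x - mean for x in scores]   # 1. subtract mean from each score
    squared = [d ** 2 for d in deviations]    # 2. square the deviations
    total = sum(squared)                      # 3. sum the squared deviations
    variance = total / n                      # 4. divide by the number of scores
    return sqrt(variance)                     # 5. take the square root


# Assumed illustrative scores (mean 78, squared deviations summing to 388).
scores = [65, 73, 77, 85, 90]
s = standard_deviation(scores)

print("standard deviation:", round(s, 2))
print("variance:", round(s ** 2, 2))
print("matches statistics.pstdev:", abs(s - pstdev(scores)) < 1e-9)
```

A distribution in which every score is identical gives 0, the lowest possible value.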