Measures of Central Tendency

Exam 2: Review
G 201 Statistics for Political Science

Exam 2: Review Topics

Chapter 3: Central Tendency
1. Mode, Median, Mean (Definition, Formula for each)
2. Skewed Distribution
3. Symmetrical Distributions

Chapter 4: Variability
1. Range (Definition, Formula)
2. Deviation (Definition, Formula)
3. Variance (Definition, Formula)
4. Standard Deviation (Definition, Formula)

Measures of central tendency

Measures of central tendency are numbers that describe what is average or typical in a distribution.

We will focus on three measures of central tendency:
– The Mode
– The Median
– The Mean (average)

Our choice of an appropriate measure of central tendency depends on three factors: (a) the level of measurement, (b) the shape of the distribution, and (c) the purpose of the research.

The Mode

The mode is the most frequent, most typical, or most common value or category in a distribution.

Example: There are more Protestants in the US than people of any other religion.

The mode is always a category or score, not a frequency.

The mode is not necessarily the category with the majority (that is, more than 50%) of cases. It is simply the category in which the largest number (or proportion) of cases falls.

Ten Most Common Foreign Languages Spoken in the United States, 1990

Language      Number of Speakers
Spanish       17,339,000
French         1,702,000
German         1,547,000
Italian        1,309,000
Chinese        1,249,000
Tagalog          843,000
Polish           723,000
Korean           626,000
Vietnamese       507,000
Portuguese       430,000

Source: U.S. Bureau of the Census, Statistical Abstract of the United States, 2000, Table 51.

A Review of Mode

Is the mode 17,339,000? NO! Recall: the mode is the category or score, not the frequency! Thus, the mode is Spanish.

The Mode

Some additional points to consider about modes: Some distributions have two modes, where two response categories have the highest frequencies. Such distributions are said to be bimodal.
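The point that the mode is a category rather than a frequency can be illustrated with a short Python sketch. It assumes the language table is stored as a plain dictionary; the function name `modes` and the small two-mode example are hypothetical and used only for illustration.

```python
# Number of speakers per language, copied from the 1990 table above.
speakers = {
    "Spanish": 17_339_000,
    "French": 1_702_000,
    "German": 1_547_000,
    "Italian": 1_309_000,
    "Chinese": 1_249_000,
    "Tagalog": 843_000,
    "Polish": 723_000,
    "Korean": 626_000,
    "Vietnamese": 507_000,
    "Portuguese": 430_000,
}


def modes(frequencies):
    """Return every category whose frequency equals the highest frequency."""
    highest = max(frequencies.values())
    return [category for category, count in frequencies.items() if count == highest]


# The mode is the category ("Spanish"), not the frequency (17,339,000).
print(modes(speakers))

# Hypothetical distribution with two tied categories: it is bimodal.
print(modes({"Agree": 40, "Neutral": 15, "Disagree": 40}))
```

Returning a list rather than a single value is a deliberate choice here, so that a bimodal distribution reports both of its modes.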
NOTE: When two scores or categories have the highest frequencies, and those frequencies are quite close but not identical, the distribution is still “essentially” bimodal. In these instances, report both the “true” mode and the highest-frequency categories.

[Figure: Example of a Bimodal Frequency Distribution]

The Median

The median is the score that divides the distribution into two equal parts, so that half of the cases are above it and half are below it.

The median can be calculated for both ordinal and interval levels of measurement, but not for nominal data.

It must be emphasized that the median is the exact middle of a distribution. So, now let’s look at ways we can find the median in sorted data.

In some cases, we can find the median by simple inspection. Let’s look at the responses (A) to the question: “Think about the economy, how would you rate economic conditions in the country today?”

A
Response      Respondent
Poor          Jim
Good          Sue
Only Fair     Bob
Poor          Jorge
Excellent     Karen
Total (N)     5

First, we sort the responses (B) in order from lowest to highest (or highest to lowest). Since we have an odd number of cases, let’s find the middle case.

B
Response      Respondent
Poor          Jim
Poor          Jorge
Only Fair     Bob
Good          Sue
Excellent     Karen
Total (N)     5

Calculating the median:

We can find the median through visual inspection and through calculation. We can also find the middle case when N is odd by adding 1 to N and dividing by 2: (N + 1) ÷ 2. Since N is 5, you calculate (5 + 1) ÷ 2 = 3. The middle case is thus the third case (Bob), and the median response is “Only Fair.”

Calculating the median: Another example

The following is a list of the number of hate crimes reported in the nine largest U.S. states for 1997.

State             Number
California        1831
Florida             93
Virginia           105
New Jersey         694
New York           853
Ohio               265
Pennsylvania       168
Texas              333
North Carolina      42
TOTAL             N = 9

Calculating the median: Finding the Median State for Hate Crimes
Order the cases from lowest to highest. In this situation, we need the 5th case: (9 + 1) ÷ 2 = 5, which is Ohio (265). Remember: (N + 1) ÷ 2.

State             Number
North Carolina      42
Florida             93
Virginia           105
Pennsylvania       168
Ohio               265
Texas              333
New Jersey         694
New York           853
California        1831
                  N = 9

Finding the Median Number of Hate Crimes out of Eight States

Order the cases from lowest to highest. For an even number of cases, there will be two middle cases. The position of the median is (8 + 1) ÷ 2 = 4.5, so the median falls halfway between the 4th case (Pennsylvania, 168) and the 5th case (Ohio, 265): (168 + 265) ÷ 2 = 216.5. However, the circumstances being explained should determine whether you report the two middle cases or the point halfway between them.

State             Number
North Carolina      42
Florida             93
Virginia           105
Pennsylvania       168
Ohio               265
Texas              333
New Jersey         694
New York           853

The Median (Mdn): Examples

Odd Number of Cases: Median exactly in the middle
12, 17, 13, 11, 16, 25, 20 (not ordered)
11, 12, 13, 16, 17, 20, 25 (ordered: lowest to highest)
N = 7
(N + 1) ÷ 2 = (7 + 1) ÷ 2 = 4
The 4th ordered score is 16, so Mdn = 16.

Even Number of Cases: Median is the point above and below which 50% of the cases fall
17, 12, 16, 13, 11, 25, 20, 26 (not ordered)
11, 12, 13, 16, 17, 20, 25, 26 (ordered)
N = 8
(N + 1) ÷ 2 = (8 + 1) ÷ 2 = 4.5
The median falls halfway between the 4th score (16) and the 5th score (17), so Mdn = 16.5.

The Mean

The mean is what most people call the average. To find the mean of any distribution, simply add up all the scores and divide by the total number of scores.
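Before moving on with the mean, the median position rule (N + 1) ÷ 2 from the examples above can be sketched in Python. This is a minimal illustration for numeric (interval-level) scores only; the function names `median_position` and `median` are hypothetical, and for ordinal categories the two middle cases would be reported rather than averaged.

```python
def median_position(n):
    """Position of the median in an ordered list of n cases: (N + 1) / 2."""
    return (n + 1) / 2


def median(scores):
    """Median of numeric scores using the (N + 1) / 2 position rule."""
    if not scores:
        raise ValueError("median needs at least one score")
    ordered = sorted(scores)                 # order cases lowest to highest
    position = median_position(len(ordered))
    lower = ordered[int(position) - 1]       # case at (or just below) the position
    if position == int(position):            # odd N: one exact middle case
        return lower
    upper = ordered[int(position)]           # even N: second middle case
    return (lower + upper) / 2               # point halfway between the two


# Hate crime counts from the nine-state table.
nine_states = [1831, 93, 105, 694, 853, 265, 168, 333, 42]
print(median_position(len(nine_states)), median(nine_states))

# Dropping California (1831) leaves the eight-state example.
eight_states = [93, 105, 694, 853, 265, 168, 333, 42]
print(median_position(len(eight_states)), median(eight_states))
```

Sorting inside the function mirrors the first step in every worked example: the position rule only makes sense once the cases are ordered.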
Here is the formula for calculating the mean:

$\bar{X} = \frac{\sum X}{N}$

where
$\bar{X}$ = mean (read as "X bar")
$\sum$ = sum (expressed as the Greek letter sigma)
$X$ = raw score in a set of scores
$N$ = total number of scores in a set

Finding the Mean

Communicable Diseases -> Tuberculosis (as of 22 March 2007)

Country                                    2005
Bangladesh                                   37
Bhutan                                       44
Democratic People's Republic of Korea       103
India                                        58
Indonesia                                    47
Maldives                                     76
Myanmar                                     119
Nepal                                        64
Sri Lanka                                    71
Thailand                                     61
Timor-Leste                                  71
n (cases) = 11                        Sum = 751

© World Health Organization, 2008. All rights reserved.

Finding the Mean

To identify the mean number of new tuberculosis cases found in 2006 by the WHO in this region,
– Add up the cases for all of the countries in the region, and
– Divide the sum by the total number of cases (here, n = 11 countries).

$\bar{X} = \frac{\sum X}{N}$

Thus, the mean rate is 751 ÷ 11 = 68.273.

Using a formula to calculate the mean: The Usefulness of Formulas

The mean introduces the usefulness of a formula, which may be defined as a shorthand way to explain what operations we need to follow to obtain a certain result. Again, the formula that defines the mean is:

$\bar{X} = \frac{\sum X}{N}$

where the symbols are as defined above.

Deviation

The deviation indicates the distance and direction of any raw score from the mean. To find the deviation of a particular score, we simply subtract the mean from the score:

Deviation = $X - \bar{X}$

where
$X$ = any raw score in the distribution
$\bar{X}$ = mean of the distribution

So what does this tell us?

In a skewed distribution, the mode is the peak of the curve. The mean is found closest to the tail, where the relatively few extreme cases will be found. The median is found between the mode and the mean, or is aligned with them in a normal distribution.

Did you know? The shape or form of a distribution can influence the researcher’s choice of a measure of central tendency. Why is that?
Well, let’s see…

Chapter 4: Measures of Variability

Measures of variability tell us:
• The extent to which the scores differ from each other or how spread out the scores are.
• How accurately the measure of central tendency describes the distribution.
• The shape of the distribution.

Just what is variability? Variability is the spread or dispersion of scores.

Measuring Variability

There are a few ways to measure variability, and they include:
1) The Range
2) The Deviation
3) The Standard Deviation
4) The Variance

Range

The range is a measure of the distance between the highest and lowest scores.

R = H - L

Temperature Example:
Honolulu: 89° - 65°, Range: 24°
Phoenix: 106° - 41°, Range: 65°

Okay, so now you tell me the range… This table indicates the number of metropolitan areas, as defined by the Census Bureau, in the states listed. What is the range in the number of metropolitan areas in these states?

State         Metropolitan Areas
Delaware      3
Idaho         4
Nebraska      4
Kansas        5
Iowa          4
Montana       3
California    9

– R = H - L
– R = 9 - 3
– R = 6

The Variance

Remember that the deviation is the distance of any given score from its mean: $(X - \bar{X})$.

The variance takes into account every score. But if we were to simply add the deviations up, the plus and minus (positive and negative) deviations would cancel each other out, because the sum of actual deviations is always zero!

$\sum (X - \bar{X}) = 0$

So, what should we do? We square the actual deviations and then add them together:

$\sum (X - \bar{X})^2$

– Remember: When you square a negative number, it becomes positive!

So, $s^2$ = the sum of squared deviations divided by the number of scores. The variance provides information about the relative variability.
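The range formula and the claim that raw deviations always sum to zero can be checked with a short Python sketch. It reuses the metropolitan-area table and the tuberculosis figures from earlier; the helper names `value_range`, `mean`, and `deviations` are hypothetical, and the zero-sum check allows for a tiny floating-point rounding error.

```python
# Metropolitan areas per state, from the range table above.
metro_areas = {
    "Delaware": 3, "Idaho": 4, "Nebraska": 4, "Kansas": 5,
    "Iowa": 4, "Montana": 3, "California": 9,
}

# Tuberculosis figures for the 11 countries in the WHO table.
tb_rates = [37, 44, 103, 58, 47, 76, 119, 64, 71, 61, 71]


def value_range(scores):
    """R = H - L: highest score minus lowest score."""
    scores = list(scores)
    return max(scores) - min(scores)


def mean(scores):
    """Sum of the scores divided by the number of scores."""
    return sum(scores) / len(scores)


def deviations(scores):
    """Distance and direction of each raw score from the mean."""
    m = mean(scores)
    return [x - m for x in scores]


print(value_range(metro_areas.values()))        # range of the state table
print(round(mean(tb_rates), 3))                 # mean of the WHO figures
print(abs(sum(deviations(tb_rates))) < 1e-9)    # raw deviations cancel out
print(sum(d ** 2 for d in deviations(tb_rates)) > 0)  # squared ones do not
```

Because the positive and negative deviations cancel, only the squared deviations carry information about spread, which is why the variance is built on them.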
Variance: Weeks on Unemployment

Step 1: Calculate the mean.
Step 2: Calculate the deviation of each raw score from the mean.
Step 3: Calculate the sum of squared deviations.

X (weeks)    Deviation: $(X - \bar{X})$    $(X - \bar{X})^2$ (deviation from the mean, squared)
9            9 - 5 = 4                     4² = 16
8            8 - 5 = 3                     3² = 9
6            6 - 5 = 1                     1² = 1
4            4 - 5 = -1                    (-1)² = 1
2            2 - 5 = -3                    (-3)² = 9
1            1 - 5 = -4                    (-4)² = 16
ΣX = 30                                    $\sum (X - \bar{X})^2 = 52$

$\bar{X} = \frac{30}{6} = 5$

The Variance

The mean of the squared deviations is the same as the variance, and can be symbolized by $s^2$:

$s^2 = \frac{\sum (X - \bar{X})^2}{N}$

where
$s^2$ = variance
$\sum (X - \bar{X})^2$ = sum of the squared deviations from the mean
$N$ = total number of scores

Variance: Weeks on Unemployment (Step 4)

Using the table above (ΣX = 30, $\bar{X} = 5$, $\sum (X - \bar{X})^2 = 52$), Step 4 is to calculate the mean of the squared deviations:

$s^2 = \frac{\sum (X - \bar{X})^2}{N}$ (weeks squared)

What is a standard deviation?

Standard Deviation: It is the typical (standard) difference (deviation) of an observation from the mean. Think of it as the average distance of a data point from the mean, although this is not strictly true.

The standard deviation is calculated by taking the square root of the variance:

$s = \sqrt{\frac{\sum (X - \bar{X})^2}{N}}$

Variance: Weeks on Unemployment (Steps 4 and 5)

Steps 1 to 3 are as in the table above. Step 4: Calculate the mean of the squared deviations. Step 5: Calculate the square root of the variance.
Variance: $s^2 = \frac{\sum (X - \bar{X})^2}{N} = \frac{52}{6} = 8.67$ (weeks squared)

Standard Deviation: $s = \sqrt{\frac{\sum (X - \bar{X})^2}{N}} = \sqrt{8.67} = 2.94$ (square root of the variance)

Raw Score Calculations

Here is how you calculate variance using raw scores:

$s^2 = \frac{\sum X^2}{N} - \bar{X}^2$

Here is how you calculate standard deviation using raw scores:

$s = \sqrt{\frac{\sum X^2}{N} - \bar{X}^2}$

Variance: Weeks on Unemployment

Step 1: Calculate the mean.
Step 2: Square the raw scores.
Step 3: Calculate the variance.
Step 4: Calculate the standard deviation.

X (weeks)    $X^2$
9            9² = 81
8            8² = 64
6            6² = 36
4            4² = 16
2            2² = 4
1            1² = 1
ΣX = 30      ΣX² = 202

$\bar{X} = \frac{30}{6} = 5$, so $\bar{X}^2 = 25$

$s^2 = \frac{202}{6} - 25 = 33.67 - 25 = 8.67$

$s = \sqrt{8.67} = 2.94$

Standard Deviation: Applications

Standard deviation also allows us to:
1) Measure the baseline of a frequency polygon.
2) Find the distance between raw scores and the mean: a standardized method that permits comparisons between raw scores in the distribution, as well as between different distributions.

Standard Deviation: Baseline of a Frequency Polygon

The baseline of a frequency polygon can be measured in units of standard deviation.

Example: $\bar{X} = 80$, $s = 5$. Thus, the raw score 85 lies one standard deviation above the mean (+1s).

Standard Deviation: The Normal Range

Unless the distribution is highly skewed, approximately two-thirds of the scores will fall within one standard deviation above and below the mean.

Example: Reading levels, in words per minute: $\bar{X} = 120$, $s = 25$, so approximately two-thirds of the scores fall between 95 and 145 words per minute.
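The Weeks on Unemployment calculations can be reproduced with a short Python sketch that compares the deviation method with the raw-score method. The function names are hypothetical, and both divide by N, as the slides do. The standard library's `statistics.pvariance` and `statistics.pstdev` use the same divisor and serve here only as a cross-check.

```python
import math
import statistics

weeks = [9, 8, 6, 4, 2, 1]  # weeks on unemployment, from the table above


def variance_from_deviations(scores):
    """Mean of the squared deviations: sum((X - mean)^2) / N."""
    n = len(scores)
    m = sum(scores) / n
    return sum((x - m) ** 2 for x in scores) / n


def variance_from_raw_scores(scores):
    """Raw-score method: sum(X^2) / N minus the squared mean."""
    n = len(scores)
    m = sum(scores) / n
    return sum(x * x for x in scores) / n - m ** 2


s2 = variance_from_deviations(weeks)
s = math.sqrt(s2)  # standard deviation = square root of the variance

print(round(s2, 2))                               # variance, weeks squared
print(round(variance_from_raw_scores(weeks), 2))  # same value, raw-score method
print(round(s, 2))                                # standard deviation, weeks

# Cross-check against the standard library (population formulas, divisor N).
print(math.isclose(s2, statistics.pvariance(weeks)))
print(math.isclose(s, statistics.pstdev(weeks)))
```

The two methods agree up to floating-point rounding. The raw-score version avoids computing each deviation, which is convenient by hand but can lose precision on a computer when the scores are large relative to their spread.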
