Lecture 3

Outline
1. Chebyshev’s Theorem
2. The Empirical Rule
3. Measures of Relative Standing
4. Examples

Chebyshev’s Theorem
• Applies to any data set.
• At least ¾ of the observations in any data set will fall within 2s (2 standard deviations) of the mean x̄, that is, in the interval (x̄ – 2s, x̄ + 2s).
• At least 8/9 of the observations in any data set will fall within 3s of the mean, that is, in the interval (x̄ – 3s, x̄ + 3s).
• In general, for any k > 1, at least 1 – (1/k²) of the observations will fall within ks of the mean.
• Works with μ and σ, too.

The Empirical Rule
• Applies to distributions that are mound-shaped and symmetric.
• About 68% of the observations will fall within ±1s of the mean.
• About 95% of the observations will fall within ±2s of the mean.
• About 99.7% (essentially all) of the observations will fall within ±3s of the mean.
• Works with μ and σ, too.
• Example: IQ scores
  o Population mean: μ = 100
  o Population std. dev.: σ = 15
  o The empirical rule tells us that 99.7% of IQ scores fall in the range 55 to 145.
  o Do you see why? (Hint: μ ± 3σ = 100 ± 3(15) = 100 ± 45.)

Measures of Relative Standing
A. Percentiles
B. Standard scores (Z scores)
• Measures of relative standing tell us something about a given score by reporting how it relates to other scores.
• For example, one such measure tells us what proportion of scores in a data set are smaller than a given score.

A. Percentiles
• With the observations in a set arranged in order (smallest to largest), the Pth percentile is a number such that P% of the observations fall below it.
• In a set of 200 observations, if a number X is larger than 150 of the observations, then X is at the 75th percentile (150/200 = 75%).

B. Standard Scores (aka Z scores)
  i. For a sample: Z = (X – x̄) / s
  ii. For a population: Z = (X – μ) / σ
• Z-scores can be positive or negative.
• A (raw) score below the mean has a negative Z score.

2 important qualities of the Z score
1. Z-scores are like a ruler: they measure distances.
  o Z scores give the distance between any score X and the mean, expressed in standard deviation units.
2. Values expressed in Z scores can be compared, regardless of their original units.

Z scores are like a ruler
• Compare these situations:
  o Left: μ = 100 and σ = 10, so 120 = μ + 2σ
  o Right: μ = 100 and σ = 40, so 120 = μ + ½σ
• A score of 120 is more impressive on the left, where it is 2 standard deviations above the mean (vs. one-half of a standard deviation on the right).

Z-Scores are unit-free
• Values expressed in Z scores can be compared, regardless of their original units.
  o For example, suppose you had exam grades for 100 students in Psych 281 and also knew how many hours each student studied for that exam.
  o You could compare any student’s Z-score for their grade with the Z-score for their # of hours of studying.

Z-Scores
• Suppose we have the following data for our class:
  o Mean grade for the exam: 70; standard deviation: 10
  o Mean # hours studying for exam: 8; standard deviation: 2

        Grade    Z      # hours    Z
Bill    80       +1     6          -1
Bob     80       +1     12         +2
Ben     60       -1     6          -1

Example – Assignment 2, Q.1
• The distribution of a sample of 100 test scores is symmetrical and mound-shaped, with a mean of 50 and a variance of 144.

a. Approximately how many scores are equal to or greater than 74?
• Since s² = 144, s = 12, and so 74 = x̄ + 2s.
• What percentage of scores falls above the red line?
[Figure: mound-shaped distribution with the horizontal axis marked at 50, 62, and 74, and a red line marking the cutoff]
• Since 74 = x̄ + 2s, a score of 74 sits at cumulative proportion p = .975 (the 95% within ±2s of the mean plus the 2.5% in the lower tail).
• So 2.5% of scores are ≥ 74.
• Since there are 100 scores, 2.5% is 2.5 scores, or approx. 3 scores.

b. What score corresponds to the 75th percentile?
• To answer this, we use interpolation.
  o A score at the 84th percentile is 1 s above the mean.
  o x̄ = 50 and s = 12, so 62 is one s above the mean, which is the 84th percentile. (Why?)
• Remember that the distribution is mound-shaped and symmetric.
• By the Empirical Rule, 68% of the distribution is between -1 s and +1 s around the mean, so half of that (34%) is between the mean and 1 s above the mean.
• Half of the distribution (50%) lies below the mean, so the point 1 s above the mean is the 50% + 34% = 84th percentile.
• A score of 62 is 1 s above the mean because the mean is 50 and s = 12.
• Therefore 84th percentile = 62.
[Figure: mound-shaped distribution with 50% of the area below X = 50 and 34% between X = 50 and X = 62; 50% + 34% = 84%, so X = 62 is the 84th percentile]
• Now we know which score is at the 50th percentile (the mean score, which is also the median in a symmetric distribution).
• We also know which score is at the 84th percentile.
• Now we can answer our question: which score is at the 75th percentile?

X       %ile
50      50
a       75
62      84

• Let a be the score at the 75th percentile and ∆ = a – 50.
• The known scores are 62 – 50 = 12 apart and their percentiles are 84 – 50 = 34 apart; the target percentile is 75 – 50 = 25 above the 50th.
• Interpolation:
  ∆ / 12 = 25 / 34
  ∆ = 25 (12) / 34 = 8.82
• 75th percentile: X = 50 + 8.82 = 58.82 ≈ 59

c. If the lowest and highest scores in this sample are 14 and 89, respectively, what is the range of the scores in standard deviation units?
  Z1 = (14 – 50) / 12 = -3.0
  Z2 = (89 – 50) / 12 = 3.25
  Range = -3.0 to 3.25 = 6.25 standard deviation units

d. Assume the 100 scores have the same mean and variance but now have a strongly skewed distribution. At least how many of the scores fall between 32 and 68 in this distribution?
• The Empirical Rule no longer applies, so we use Chebyshev’s Theorem:
  Z1 = (32 – 50) / 12 = –1.5
  Z2 = (68 – 50) / 12 = 1.5
  k = 1.5
  1 – (1/k²) = 1 – (1/1.5²) = 1 – 1/2.25 = .556
• At least 55.6% of the scores lie between 32 and 68, so at least 56 of the 100 scores.
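
The arithmetic in Assignment 2, Q.1 can be checked with a short script. This is a minimal sketch in Python, assuming the Empirical Rule percentages used in the slides (95% within ±2s, 84th percentile at 1 s above the mean). The function names z_score, chebyshev_lower_bound, and interpolate_score are illustrative and do not come from the lecture.

```python
import math


def z_score(x, mean, sd):
    """Distance of x from the mean, in standard deviation units."""
    return (x - mean) / sd


def chebyshev_lower_bound(k):
    """Minimum proportion of observations within k standard deviations (k > 1)."""
    if k <= 1:
        raise ValueError("Chebyshev's Theorem is only informative for k > 1")
    return 1 - 1 / k**2


def interpolate_score(p, p_lo, x_lo, p_hi, x_hi):
    """Linear interpolation: the score at percentile p, given two known
    (percentile, score) pairs on either side of it."""
    return x_lo + (p - p_lo) / (p_hi - p_lo) * (x_hi - x_lo)


# Values taken from the example: 100 scores, mean 50, variance 144.
n, mean, variance = 100, 50, 144
sd = math.sqrt(variance)  # 12.0

# (a) 74 is 2 s above the mean; by the Empirical Rule about 95% of scores
# lie within +/- 2 s, so half of the remaining 5% lies above 74.
z_74 = z_score(74, mean, sd)
share_above_74 = (1 - 0.95) / 2

# (b) Interpolate between the 50th percentile (50) and the 84th (62).
x_75 = interpolate_score(75, 50, mean, 84, mean + sd)

# (c) Range of the scores in standard deviation units.
range_in_sd = z_score(89, mean, sd) - z_score(14, mean, sd)

# (d) Skewed distribution, so only Chebyshev's bound applies.
k = z_score(68, mean, sd)
bound = chebyshev_lower_bound(k)
min_count = math.ceil(bound * n)  # smallest whole number of scores

print(f"(a) z = {z_74}, share above 74 = {share_above_74:.3f}")
print(f"(b) 75th percentile = {x_75:.2f}")
print(f"(c) range = {range_in_sd} standard deviations")
print(f"(d) k = {k}, bound = {bound:.3f}, at least {min_count} scores")
```

Rounding up in part (d) reflects that the bound is a minimum: 55.6% of 100 scores cannot be satisfied by 55 scores, so the guaranteed count is 56.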