Survey

Transcript

z-scores

Measures of Spread

The Standard Deviation as a Ruler

The trick in comparing very different-looking values is to use standard deviations as our rulers. The standard deviation tells us how the whole collection of values varies, so it's a natural ruler for comparing an individual to a group. As the most common measure of variation, the standard deviation plays a crucial role in how we look at data.

Remember: Standard Deviation (s) is a measure of spread, which is approximately the average distance of each data point from the mean.

Standardizing with z-scores

A z-score measures how many standard deviations a number is from the mean. For example, let's look at the following data set.

6, 8, 10, 12, 13, 15, 16, 17, 18, 20, 22, 25, 26, 30

n = 14
Mean: ȳ ≈ 16.6

Minimum: 6
Q1: 12
Median: 16.5
Q3: 22
Maximum: 30

Because the mean (16.6) is very close to the median (16.5), there is a good chance that the data is symmetric. To be sure, we can look at the histogram of the data. With symmetric data, the standard deviation is the appropriate measure of spread.

The standard deviation: s ≈ 6.6

[Figure: the data on a number line; 10 lies 6.6 to the left of the mean ≈ 16.6, labeled z = -1]

With a standard deviation of 6.6 and a mean of 16.6, we can see that the number 10 is exactly one standard deviation below the mean.

(mean) - (std. dev.) = 16.6 - 6.6 = 10

Because 10 is exactly 1 standard deviation away from the mean, the z-score corresponding to the number 10 would be z = -1. (The negative is because the number is to the left of the mean.)

[Figure: the data on a number line; 20 lies about 3.3 to the right of the mean ≈ 16.6, labeled z ≈ 0.5]

With a standard deviation of 6.6, half (0.5) of one standard deviation would be 3.3. Notice that the number 20 is approximately ½ of one standard deviation above the mean, so the z-score corresponding to the number 20 would be approximately 0.5, or ½.

(mean) + (0.5)(std. dev.):
16.6 + 3.3 = 19.9 ≈ 20

Standardizing with z-scores

We compare individual data values to their mean, relative to their standard deviation, using the following formula:

$z = \frac{y - \bar{y}}{s}$

We call the resulting values standardized values, denoted as z. They can also be called z-scores.

Standardizing with z-scores (cont.)

Standardized values have no units. z-scores measure the distance of each data value from the mean in standard deviations. A negative z-score tells us that the data value is below the mean, while a positive z-score tells us that the data value is above the mean.

Use the formula $z = \frac{y - \bar{y}}{s}$ to standardize the following data by converting to z-scores.

6, 8, 10, 12, 13, 15, 16, 17, 18, 20, 22, 24, 25, 26

Recall: mean: ȳ ≈ 16.6, standard deviation: s ≈ 6.6, and that the data are the y values.

y     (y - ȳ) / s          z
6     (6 - 16.6) / 6.6     -1.61
8     (8 - 16.6) / 6.6     -1.30
10    (10 - 16.6) / 6.6    -1
12    (12 - 16.6) / 6.6    -0.70

Complete the rest of this table in your notes.

The Standard Deviation as a Ruler

z-scores measure the number of standard deviations a number is from the mean. A positive z-score means that the datum is to the right of the mean. A negative z-score means that the datum is to the left of the mean.

If a z-score is near zero, that indicates that the datum is typical (close to the mean). Most z-scores fall between -2 and 2. A z-score higher than 2 or lower than -2 is unusual. (About 95% of the data in a normally distributed set are less than 2 standard deviations away from the mean.) If a z-score is higher than 3 or lower than -3, then the corresponding datum is very unusual. (About 99.7% of the data in a normally distributed set are less than 3 standard deviations away from the mean.)

Normally distributed data has a histogram that looks similar to the bell curve below.

[Figure: bell curve; horizontal axis: z]

As you can see, most of the data is near the mean (where z = 0).
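The standardizing formula and the rule of thumb above can be applied mechanically. The following is a minimal Python sketch, not part of the original slides: it uses the rounded mean (16.6) and standard deviation (6.6) exactly as the slides state them rather than recomputing them, and the names `z_score` and `describe` are illustrative assumptions.

```python
# Mean and standard deviation as stated (rounded) on the slides.
MEAN = 16.6
STD_DEV = 6.6

DATA = [6, 8, 10, 12, 13, 15, 16, 17, 18, 20, 22, 24, 25, 26]


def z_score(y, mean, std_dev):
    """Return how many standard deviations y lies from the mean."""
    return (y - mean) / std_dev


def describe(z):
    """Label a z-score with the slides' rule of thumb (hypothetical helper)."""
    if abs(z) > 3:
        return "very unusual"
    if abs(z) > 2:
        return "unusual"
    return "not unusual"


# Print one table row per data value: y, its z-score, and the label.
for y in DATA:
    z = z_score(y, MEAN, STD_DEV)
    print(f"{y:>3}  z = {z:+.2f}  {describe(z)}")
```

The first four rows should agree with the hand-computed entries in the table, which is a quick way to check the rest of the table once it is filled in.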
Example: The z-scores for the data set 2, 5, 7, 8, 13 are as follows:

Mean = 7
Standard Deviation = 4.06

y     (y - ȳ) / s         z
2     (2 - 7) / 4.06      -1.23
5     (5 - 7) / 4.06      -0.49
7     (7 - 7) / 4.06      0
8     (8 - 7) / 4.06      0.25
13    (13 - 7) / 4.06     1.48

z-scores far from zero, in either direction, mean that the data value is less typical. z-scores near zero mean that the data value is typical.

Benefits of Standardizing (Converting to z-scores)

Standardized values have been converted from their original units to the standard statistical unit of standard deviations from the mean. Thus, we can compare values that are measured on different scales, with different units, or from different populations.

Example: JaNathan earned a 93% on a test in Mr. Kane's class. The test scores for that test were normally distributed with a mean of 75 and a standard deviation of 12. During the football season, JaNathan ran the 40-yard dash in 4.5 seconds. The team's times in the 40-yard dash were normally distributed with a mean of 5.1 seconds and a standard deviation of 0.33 seconds. Which is more impressive, JaNathan's 93% test score or his 4.5-second 40-yard dash time?

Because the two numbers we are trying to compare use different units, we need to standardize them (convert them to z-scores) before we can compare them.
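The claim that standardized values carry no units can be checked on the five-value data set (2, 5, 7, 8, 13). The sketch below is an illustration rather than part of the slides. It assumes the slide's 4.06 is the sample standard deviation (dividing by n - 1), which is what Python's `statistics.stdev` computes. The factor of 1000 stands in for a hypothetical change of units, such as seconds to milliseconds.

```python
import math
import statistics


def z_scores(values):
    """Standardize values using their own mean and sample standard deviation."""
    mean = statistics.mean(values)
    std_dev = statistics.stdev(values)  # sample standard deviation (n - 1)
    return [(v - mean) / std_dev for v in values]


data = [2, 5, 7, 8, 13]              # the five-value example
rescaled = [v * 1000 for v in data]  # hypothetical unit change

original_z = z_scores(data)
rescaled_z = z_scores(rescaled)

print([round(z, 2) for z in original_z])

# The two lists of z-scores agree up to floating-point rounding.
same = all(
    math.isclose(a, b, abs_tol=1e-9) for a, b in zip(original_z, rescaled_z)
)
print("z-scores unchanged by rescaling:", same)
```

Because the mean and the standard deviation both scale with the data, the units cancel in (y - ȳ) / s. This is why a test percentage and a run time can be compared once both are standardized.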
Converting 93 to a z-score:

$z = \frac{x - \bar{x}}{s} = \frac{93 - 75}{12} = 1.5$

Converting 4.5 to a z-score:

$z = \frac{x - \bar{x}}{s} = \frac{4.5 - 5.1}{0.33} \approx -1.82$

The standardized scores compare as follows:

93% test score: z = 1.5
4.5 sec run time: z ≈ -1.8

Because the 4.5-second run time has a z-score that is farther from zero than the z-score of the 93% test score, it is more impressive that JaNathan ran the 40-yard dash in 4.5 seconds. (Keep in mind that a negative z-score for run time is good because we want our run time to be less than the mean run time. Also, a positive z-score for a test grade is good because we want to score higher than the mean on a test.)

What have we learned?

We've learned the power of standardizing data. Standardizing uses the standard deviation as a ruler to measure distance from the mean (z-scores). With z-scores, we can compare values from different distributions or values based on different units. z-scores can identify unusual or surprising values among data.
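The JaNathan comparison worked through above can also be checked with a short Python sketch. This is an illustration under stated assumptions, not the slides' own method: the sign flip for the run time is one simple way to encode "lower is better", and the variable names are hypothetical.

```python
def z_score(x, mean, std_dev):
    """Return how many standard deviations x lies from the mean."""
    return (x - mean) / std_dev


test_z = z_score(93, 75, 12)      # higher is better for a test score
dash_z = z_score(4.5, 5.1, 0.33)  # lower is better for a run time

# Flip the sign of the run-time z-score so "better" is positive for both.
test_merit = test_z
dash_merit = -dash_z

more_impressive = "40-yard dash" if dash_merit > test_merit else "test score"

print(f"test z = {test_z:.2f}, dash z = {dash_z:.2f}")
print("More impressive:", more_impressive)
```

Comparing the sign-adjusted values gives the same answer as comparing distances from zero, because each result lies on the favorable side of its mean.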