155S3.4_3 Measures of Relative Standing and Boxplots
MAT 155, Dr. Claude Moore, Cape Fear Community College
January 26, 2012

Chapter 3: Statistics for Describing, Exploring, and Comparing Data
3-1 Review and Preview
3-2 Measures of Center
3-3 Measures of Variation
3-4 Measures of Relative Standing and Boxplots

Key Concept
Measures of relative standing are numbers showing the location of data values relative to the other values within a data set. They can be used to compare values from different data sets, or to compare values within the same data set. The most important concept is the z score. We will also discuss percentiles and quartiles, as well as a new statistical graph called the boxplot.

Z score
• z Score (or standardized value): the number of standard deviations that a given value x is above or below the mean.
Sample: z = (x – x̄) / s
Population: z = (x – μ) / σ
Round z scores to 2 decimal places.

Interpreting Z Scores
Whenever a value is less than the mean, its corresponding z score is negative.
Ordinary values: –2 ≤ z score ≤ 2
Unusual values: z score < –2 or z score > 2

Percentiles
Percentiles are measures of location. There are 99 percentiles denoted P1, P2, . . . , P99, which divide a set of data into 100 groups with about 1% of the values in each group.

Converting from the kth Percentile to the Corresponding Data Value
n = total number of values in the data set
k = percentile being used
L = locator that gives the position of a value
Pk = kth percentile
L = (k / 100) · n

Quartiles
Quartiles are measures of location, denoted Q1, Q2, and Q3, which divide a set of data into four groups with about 25% of the values in each group.
• Q1 (First Quartile) separates the bottom 25% of sorted values from the top 75%.
• Q2 (Second Quartile) is the same as the median; it separates the bottom 50% of sorted values from the top 50%.
• Q3 (Third Quartile) separates the bottom 75% of sorted values from the top 25%.
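The z score and percentile definitions above can be sketched in Python as follows. This is a minimal illustration, not the course's own procedure: the function names and the sample numbers are made up for the example, and the rule for turning the locator L into a data value (average the Lth and next values when L is whole, otherwise round L up) is an assumption based on the usual textbook convention, since the slides give only the notation.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies above (+) or below (-) the mean,
    rounded to 2 decimal places as the slides direct."""
    return round((x - mean) / std_dev, 2)


def is_unusual(z):
    """Unusual values have z < -2 or z > 2; ordinary values satisfy -2 <= z <= 2."""
    return z < -2 or z > 2


def percentile_value(data, k):
    """Convert the kth percentile (1..99) to a data value using L = (k / 100) * n.

    Assumed convention (not spelled out in the slides):
    - L whole: Pk is midway between the Lth value and the next value.
    - L not whole: round L up; Pk is the Lth value, counting from the lowest.
    """
    values = sorted(data)
    n = len(values)
    whole, remainder = divmod(k * n, 100)  # integer arithmetic avoids float error
    if remainder == 0:
        return (values[whole - 1] + values[whole]) / 2
    return values[whole]  # position whole + 1, counting from 1


# Hypothetical numbers, chosen only for illustration.
print(z_score(130, 100, 15), is_unusual(z_score(130, 100, 15)))  # 2.0 False
print(z_score(61, 100, 15), is_unusual(z_score(61, 100, 15)))    # -2.6 True

sample = [2, 4, 5, 7, 8, 9, 10, 12, 15, 20]
print(percentile_value(sample, 25))  # L = 2.5, round up to 3rd value: 5
print(percentile_value(sample, 50))  # L = 5, average 5th and 6th values: 8.5
print(percentile_value(sample, 90))  # L = 9, average 9th and 10th values: 17.5
```

Under this convention Q1, Q2, and Q3 are simply P25, P50, and P75. Statistical software may use other interpolation rules, so results can differ slightly from hand calculations.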
Quartiles
Q1, Q2, and Q3 divide ranked scores into four equal parts.

Some Other Statistics
• Interquartile Range (or IQR): Q3 – Q1
• Semi-interquartile Range: (Q3 – Q1) / 2
• Midquartile: (Q3 + Q1) / 2
• 10–90 Percentile Range: P90 – P10

5-Number Summary
For a set of data, the 5-number summary consists of
1. the minimum value;
2. the first quartile, Q1;
3. the median (or second quartile, Q2);
4. the third quartile, Q3; and
5. the maximum value.

Boxplot
A boxplot (or box-and-whisker diagram) is a graph of a data set that consists of a line extending from the minimum value to the maximum value, and a box with lines drawn at the first quartile, Q1; the median; and the third quartile, Q3.
[Figure: Boxplot of Movie Budget]

Boxplots
[Figure: Normal Distribution: Heights from a Simple Random Sample of Women]
[Figure: Skewed Distribution: Salaries (in thousands of dollars) of NCAA Football Coaches]

Outliers
An outlier is a value that lies very far away from the vast majority of the other values in a data set.

Important Principles
• An outlier can have a dramatic effect on the mean.
• An outlier can have a dramatic effect on the standard deviation.
• An outlier can have a dramatic effect on the scale of the histogram so that the true nature of the distribution is totally obscured.

Modified Boxplots
Boxplots described earlier are called skeletal (or regular) boxplots. Some statistical packages provide modified boxplots, which represent outliers as special points.

Outliers for Modified Boxplots
For purposes of constructing modified boxplots, we can consider outliers to be data values meeting specific criteria. In modified boxplots, a data value is an outlier if it is
• above Q3 by an amount greater than 1.5 × IQR, or
• below Q1 by an amount greater than 1.5 × IQR.
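The 5-number summary and the 1.5 × IQR outlier criterion can be sketched in Python as below. This is an illustrative sketch only: the function names and the data list are hypothetical, and the quartiles are computed as P25, P50, and P75 with the locator L = (k / 100) · n under an assumed textbook convention (average two neighbors when L is whole, otherwise round L up). Other software may compute quartiles differently.

```python
def percentile_value(values, k):
    """kth percentile of an already sorted list, using L = (k / 100) * n.
    Assumed convention: whole L averages the Lth and next values; otherwise round up."""
    whole, remainder = divmod(k * len(values), 100)
    if remainder == 0:
        return (values[whole - 1] + values[whole]) / 2
    return values[whole]


def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum)."""
    values = sorted(data)
    return (
        values[0],
        percentile_value(values, 25),
        percentile_value(values, 50),
        percentile_value(values, 75),
        values[-1],
    )


def modified_boxplot_parts(data):
    """Identify outliers with the 1.5 x IQR rule and find where the line should end."""
    values = sorted(data)
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    low_limit = q1 - 1.5 * iqr
    high_limit = q3 + 1.5 * iqr
    # Outliers lie more than 1.5 x IQR below Q1 or above Q3.
    outliers = [v for v in values if v < low_limit or v > high_limit]
    # The line extends only to the smallest and largest values that are not outliers.
    inside = [v for v in values if low_limit <= v <= high_limit]
    return {"iqr": iqr, "outliers": outliers, "whiskers": (inside[0], inside[-1])}


# Hypothetical data, chosen only for illustration.
data = [2, 4, 5, 7, 8, 9, 10, 12, 15, 40]
print(five_number_summary(data))     # (2, 5, 8.5, 12, 40)
print(modified_boxplot_parts(data))  # IQR 7, outliers [40], whiskers (2, 15)
```

Here the value 40 exceeds Q3 + 1.5 × IQR = 12 + 10.5 = 22.5, so a modified boxplot would mark it with a special symbol and stop the line at 15.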
Modified Boxplot Construction
A modified boxplot is constructed with these specifications:
• A special symbol (such as an asterisk) is used to identify outliers.
• The solid horizontal line extends only as far as the minimum data value that is not an outlier and the maximum data value that is not an outlier.

Modified Boxplots Example
[Figure: Pulse rates of females listed in Data Set 1 in Appendix B]

3-4 Measures of Relative Standing and Boxplots
In this section we have discussed:
• z Scores
• z Scores and unusual values
• Percentiles
• Quartiles
• Converting a percentile to corresponding data values
• Other statistics
• 5-number summary
• Boxplots and modified boxplots
• Effects of outliers

Always consider certain key factors:
• Context of the data
• Source of the data
• Sampling Method
• Measures of Center
• Measures of Variation
• Distribution
• Outliers
• Changing patterns over time
• Conclusions
• Practical Implications

132/3. Boxplots
Shown below is a STATDISK-generated boxplot of the durations (in hours) of flights of NASA’s Space Shuttle. What do the values of 0, 166, 215, 269, and 423 tell us?
TI: See page 130 of your textbook for instructions.

132/4. Boxplot Comparisons
Refer to the two STATDISK-generated boxplots shown below that are drawn on the same scale. One boxplot represents weights of randomly selected men and the other represents weights of randomly selected women. Which boxplot represents women? How do you know? Which boxplot depicts weights with more variation?
TI: page 130

133/8. z Score for World’s Tallest Man
Bao Xishun is the world’s tallest man with a height of 92.95 in. (or 7 ft, 8.95 in.). Men have heights with a mean of 69.6 in. and a standard deviation of 2.8 in. a. What is the difference between Bao’s height and the mean height of men? b.
How many standard deviations is that (the difference found in part (a))? c. Convert Bao’s height to a z score. d. Does Bao’s height meet the criterion of being unusual by corresponding to a z score that does not fall between –2 and 2?

133/10. z Scores for Heights of Women Soldiers
The U.S. Army requires women’s heights to be between 58 in. and 80 in. Women have heights with a mean of 63.6 in. and a standard deviation of 2.5 in. Find the z score corresponding to the minimum height requirement and find the z score corresponding to the maximum height requirement. Determine whether the minimum and maximum heights are unusual.

Percentiles. In Exercises 15–18, use the given sorted values, which are the numbers of points scored in the Super Bowl for a recent period of 24 years. Find the percentile corresponding to the given number of points.
36 37 37 39 39 41 43 44 44 47 50 53 54 55 56 56 57 59 61 61 65 69 69 75

134/16. 65
133/18. 41

In Exercises 19–26, use the same list of 24 sorted values given for Exercises 15–18. Find the indicated percentile or quartile.

134/22. P80
134/26. P95

134/28. Boxplot for Number of English Words
A simple random sample of pages from Merriam-Webster’s Collegiate Dictionary, 11th edition, was obtained. Listed below are the numbers of defined words on those pages, and they are arranged in order.
Construct a boxplot and include the values of the 5-number summary.
34 36 39 43 51 53 62 63 73 79

134/29. Boxplot for FICO Scores
A simple random sample of FICO credit rating scores was obtained, and the sorted scores are listed below. Construct a boxplot and include the values of the 5-number summary.
664 693 698 714 751 753 779 789 802 818 834 836 8