Measures of Relative Standing

A measure of relative standing tells us where a data value stands relative to the distribution of the whole data set. With an idea of relative standing, we can say things like, “You got a really high score compared to the rest of the class” or, “That man is unusually short.” We’ll discuss three measures of relative standing: z-Scores, Quartiles, and Percentiles. But first, a note about how to read these curves.

Viewing Distribution Curves

When viewing these graphs, imagine the data all stacked up under the curve in columns, where every value in a column is the same. Where the curve is higher, there are more data with that x value.

The z-Score

The z-score of a data value is how many standard deviations away the value is from the mean. It’s the basis of our naming system shown below.

Name             z-score        Meaning
Unusually High   z > 2          Beyond two standard deviations above the mean
Really High      z = 2          At or near two standard deviations above the mean
Very High        1 < z < 2      In between one and two standard deviations above the mean
Pretty High      z = 1          At or near one standard deviation above the mean
High             0 < z < 1      In between the mean and one standard deviation above the mean
Average          z = 0          At or near the mean
Low              -1 < z < 0     In between the mean and one standard deviation below the mean
Pretty Low       z = -1         At or near one standard deviation below the mean
Very Low         -2 < z < -1    In between one and two standard deviations below the mean
Really Low       z = -2         At or near two standard deviations below the mean
Unusually Low    z < -2         Beyond two standard deviations below the mean

It’s easy to compute the z-score of a data value x. Just subtract the mean $\bar{x}$ and divide by the standard deviation $s$:

$z = \frac{x - \bar{x}}{s}$

Example. If you scored a 55 on a test that had a mean of $\bar{x} = 85$ and a standard deviation of $s = 10$, would your score be unusually low?

Yes. The z-score for a grade of 55 is $z = \frac{55 - 85}{10} = -3$, which is beyond two standard deviations below the mean.

Example. If you scored a 55 on a test that had a mean of $\bar{x} = 75$ and a standard deviation of $s = 15$, would your score be unusually low?

No. Since the z-score of 55 in this case would be $z = \frac{55 - 75}{15} \approx -1.33$, according to our naming chart above, it would be considered only “very low”, not “unusually low”.

Quartiles and Box Plots

This is another way of measuring relative standing. It’s an extension of the median. Recall that the median is the value at the center of the data list, so half the data have values below the median (and half the data have values above it).

Look at the left hand figure below. Locate the first quartile Q1 to the left of the median. 25% of the data is lower than the first quartile Q1. Locate the third quartile Q3 to the right of the median. 75% of the data is to the left of (has values less than) the third quartile Q3. Together with the Min and Max values, we have the 5 Number Summary description of a data set. We also refer to these 5 numbers as The Quartiles.

Max   100% of the data have values at or below Max
Q3    75% of the data have values less than Q3
Med   50% of the data have values less than Med
Q1    25% of the data have values less than Q1
Min   No data have values less than Min

A Box Plot is a visual representation of a 5 Number Summary. The box plots for the two distributions are shown underneath the graphs. The box is what’s in between Q1 and Q3. The Med is marked across the box. Lines extend out to the Min and Max values on either side. As seen in the right hand figure, if the data distribution is skewed then the positions of the quartiles are shifted.

Any data value can be compared to the positions of the quartiles. We can say things like, “My score is above the third quartile.” Moreover, the relative positions of the five locations can help describe the distribution of the data set. Look at the two examples above. Notice that where the data is clustered together, the quartiles are closer together on the data axis. And where the data is spread out, the quartiles are farther apart. This is the basis for the Box Plot graphs that are used to compare data sets.

Percentiles

Percentiles are an extension of the Quartiles. The best way to define them is by example.
P50 is the “50th Percentile”. 50% of the data are less than P50.
P30 is the “30th Percentile”. 30% of the data are less than P30.
P92 is the “92nd Percentile”. 92% of the data are less than P92.

Convince yourself that Med = P50 and Q1 = P25. What percentile is Q3? Q3 = P75. Note that Max = P100. Which quartile is P0? P0 = Min.
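
The three measures discussed above can be tried out in a few lines of Python. What follows is only a rough sketch under stated assumptions, not a definitive method: the function names, the `tol` width used for “at or near”, and the `scores` list are all invented for this illustration. Textbooks and software also differ on exactly how quartiles and percentiles are computed; this sketch happens to use the “inclusive” rule from Python’s standard `statistics` module.

```python
from statistics import quantiles


def z_score(x, mean, std_dev):
    """Return how many standard deviations x lies from the mean."""
    return (x - mean) / std_dev


def standing_name(z, tol=0.1):
    """Map a z-score to a name from the naming chart.

    tol is an assumed width for "at or near"; the text gives no exact value.
    """
    direction = "High" if z > 0 else "Low"
    size = abs(z)
    if size <= tol:
        return "Average"
    if size > 2 + tol:
        return "Unusually " + direction
    if size >= 2 - tol:
        return "Really " + direction
    if size > 1 + tol:
        return "Very " + direction
    if size >= 1 - tol:
        return "Pretty " + direction
    return direction


def five_number_summary(data):
    """Return (Min, Q1, Med, Q3, Max) using the "inclusive" quartile rule."""
    q1, med, q3 = quantiles(data, n=4, method="inclusive")
    return min(data), q1, med, q3, max(data)


def percent_below(data, x):
    """Return the percentage of data values strictly less than x."""
    return 100 * sum(1 for value in data if value < x) / len(data)


# The two worked examples from the z-score section.
z1 = z_score(55, 85, 10)   # -3.0
z2 = z_score(55, 75, 15)   # about -1.33
print(z1, standing_name(z1))
print(round(z2, 2), standing_name(z2))

# Hypothetical test scores, invented for this illustration.
scores = [55, 60, 65, 70, 75, 80, 85, 90, 95]
print(five_number_summary(scores))

# Percentile cut points P1..P99; cuts[k - 1] is Pk.
cuts = quantiles(scores, n=100, method="inclusive")
print(cuts[24], cuts[49], cuts[74])   # P25, P50, P75
print(percent_below(scores, 75))
```

For this particular list, P25, P50, and P75 come out equal to Q1, Med, and Q3, as the text asks you to convince yourself. Note that `percent_below(scores, 75)` gives about 44.4 rather than exactly 50, because with only nine values the median is itself one of the data values; the “p% of the data are less than” description fits best when the data set is large.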