Section 3.4
Measures of Relative Standing and Boxplots
Part 1
Basics of z Scores, Percentiles, Quartiles, and Boxplots
Z Score
the number of standard deviations that a given
value x is above or below the mean
- Also known as a standardized value
Measures of Position: z Score
Sample: $z = \dfrac{x - \bar{x}}{s}$
Population: $z = \dfrac{x - \mu}{\sigma}$
*Round z scores to 2 decimal places if
necessary.
Interpreting Z Scores
Whenever a value is less than the mean, its
corresponding z score is negative
Ordinary values:
–2 ≤ z score ≤ 2
Unusual Values:
z score < –2 or z score > 2
Example 1: Helen Mirren was 61 when she earned her Oscar-winning Best
Actress award. The Oscar-winning Best Actresses have a mean age of 35.8 years
and a standard deviation of 11.3 years.
a) What is the difference between Helen Mirren’s age and the mean age?
b) How many standard deviations is that?
c) Convert Helen Mirren’s age to a z score.
d) If we consider “usual” ages to be those that convert to z scores between –2
and 2, is Helen Mirren’s age usual or unusual?
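The four steps of Example 1 can be sketched in Python. This is a minimal illustration, not part of the original slides; the function names `z_score` and `is_unusual` are hypothetical helpers introduced here, and the z score is rounded to 2 decimal places as the note above suggests.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


def is_unusual(z):
    """Apply the rule from the slides: unusual means z < -2 or z > 2."""
    return z < -2 or z > 2


# Values taken from Example 1.
age, mean_age, sd_age = 61, 35.8, 11.3

difference = age - mean_age                      # part (a): distance from the mean
z = round(z_score(age, mean_age, sd_age), 2)     # parts (b) and (c): z score
label = "unusual" if is_unusual(z) else "usual"  # part (d): classification

print(f"difference = {difference:.1f} years, z = {z}, {label}")
```

The same two helpers apply unchanged to the temperatures in Example 2 by substituting the mean 98.20 and the standard deviation 0.62.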
Example 2: Human body temperatures have a mean of
98.20°F and a standard deviation of 0.62°F (based on Data
Set 2 in Appendix B). Convert each given temperature to a z
score and determine whether it is usual or unusual.
a) 101.00°F
b) 96.90°F
c) 96.98°F
Example 3: Scores on the SAT test have a mean of 1518
and a standard deviation of 325. Scores on the ACT test have
a mean of 21.1 and a standard deviation of 4.8. Which is
relatively better: a score of 1840 on the SAT test or a score
of 26.0 on the ACT test? Why?
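One way to approach Example 3 is to convert both scores to z scores and compare them, since z scores put values from different scales on a common footing. The sketch below assumes that "relatively better" means "has the larger z score"; `z_score` is a hypothetical helper name.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


# Means and standard deviations as given in Example 3.
z_sat = round(z_score(1840, 1518, 325), 2)
z_act = round(z_score(26.0, 21.1, 4.8), 2)

# The score with the larger z score sits farther above its own mean.
# (A tie is not handled separately in this sketch.)
better = "ACT" if z_act > z_sat else "SAT"

print(f"SAT z = {z_sat}, ACT z = {z_act}, relatively better: {better}")
```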
Percentiles
measures of location. There are 99 percentiles
denoted P1, P2, . . . P99, which divide a set of data
into 100 groups with about 1% of the values in
each group.
Finding the Percentile of a Data Value
Percentile of value x = $\dfrac{\text{number of values less than } x}{\text{total number of values}} \cdot 100$
Example 4: Use the given sorted values (the numbers of points
scored in the Super Bowl for a recent period of 24 years). Find
the percentile corresponding to the given number of points.
36 37 37 39 39 41 43 44 44 47 50 53 54 55 56 56 57 59
61 61 65 69 69 75
a) 47 points
b) 54 points
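The percentile formula can be applied to Example 4 with a short Python sketch. The function name `percentile_of_value` is hypothetical, and the result is left unrounded because the formula above does not state a rounding rule.

```python
def percentile_of_value(values, x):
    """Percentile of x = (number of values less than x) / (total number of values) * 100."""
    below = sum(1 for v in values if v < x)
    return below / len(values) * 100


# Sorted Super Bowl points from Example 4 (24 values).
points = [36, 37, 37, 39, 39, 41, 43, 44, 44, 47, 50, 53,
          54, 55, 56, 56, 57, 59, 61, 61, 65, 69, 69, 75]

for x in (47, 54):
    print(f"{x} points -> percentile {percentile_of_value(points, x)}")
```

Because the count uses a strict "less than", repeated values equal to x do not contribute to the numerator.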
Converting from the kth Percentile to the Corresponding Data Value
Notation
n    total number of values in the data set
k    percentile being used
L    locator that gives the position of a value
Pk   kth percentile
$L = \dfrac{k}{100} \cdot n$
Converting from the kth Percentile to the Corresponding Data Value (figure)
Example 5: Use the given sorted values (the numbers of points
scored in the Super Bowl for a recent period of 24 years). Find
the indicated percentile.
36 37 37 39 39 41 43 44 44 47 50 53 54 55 56 56 57 59
61 61 65 69 69 75
a) P50
b) P22
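A possible way to work Example 5 in Python is sketched below. The locator L = (k/100)·n comes from the notation above, but the text does not spell out what to do with L. The sketch assumes a common textbook rule: if L is a whole number, average the Lth and (L+1)th sorted values; otherwise round L up and take the Lth value. The function name `percentile_value` is hypothetical.

```python
def percentile_value(sorted_values, k):
    """Return Pk for an integer k from 1 to 99, using the locator L = k/100 * n.

    Assumed rule (not stated in the slides' text): a whole-number L means
    averaging the Lth and (L+1)th values; any other L is rounded up.
    """
    n = len(sorted_values)
    if (k * n) % 100 == 0:
        # L is a whole number: average the Lth and (L+1)th values (1-based).
        L = k * n // 100
        return (sorted_values[L - 1] + sorted_values[L]) / 2
    # L is not whole: round up with integer arithmetic, then take the Lth value.
    L = (k * n + 99) // 100
    return sorted_values[L - 1]


# Sorted Super Bowl points from Example 5 (24 values).
points = [36, 37, 37, 39, 39, 41, 43, 44, 44, 47, 50, 53,
          54, 55, 56, 56, 57, 59, 61, 61, 65, 69, 69, 75]

print("P50 =", percentile_value(points, 50))  # L = 12, a whole number
print("P22 =", percentile_value(points, 22))  # L = 5.28, rounded up to 6
```

Statistical packages use several different percentile conventions, so software output may differ slightly from values found this way.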
Quartiles
measures of location, denoted Q1, Q2, and Q3, which divide
a set of data into four groups with about 25% of the values
in each group.
Q1 (First Quartile) separates the bottom 25% of
sorted values from the top 75%.
Q2 (Second Quartile) same as the median; separates
the bottom 50% of sorted values from the top 50%.
Q3 (Third Quartile) separates the bottom 75% of
sorted values from the top 25%.
Quartiles
Q1, Q2, Q3 divide ranked scores into four equal parts:
(minimum) 25% Q1 25% Q2 (median) 25% Q3 25% (maximum)
Some Other Statistics
• Interquartile Range (or IQR): $Q_3 - Q_1$
• Semi-interquartile Range: $\dfrac{Q_3 - Q_1}{2}$
• Midquartile: $\dfrac{Q_3 + Q_1}{2}$
• 10 - 90 Percentile Range: $P_{90} - P_{10}$
Example 6: Use the given sorted values (the numbers of points
scored in the Super Bowl for a recent period of 24 years). Find
the indicated percentile or quartile.
36 37 37 39 39 41 43 44 44 47 50 53 54 55 56 56 57 59
61 61 65 69 69 75
a) P20
b) Q3
5-Number Summary
For a set of data, the 5-number summary
consists of the minimum value; the first
quartile Q1; the median (or second quartile
Q2); the third quartile, Q3; and the
maximum value.
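As an illustration, the 5-number summary of the Super Bowl data used in the examples can be computed as follows. The quartiles are taken as P25, P50, and P75, and the sketch assumes a common textbook locator rule (average two neighbors when L = (k/100)·n is whole, otherwise round L up). The names `percentile_value` and `five_number_summary` are hypothetical.

```python
def percentile_value(sorted_values, k):
    """Pk via the locator L = k/100 * n (assumed rule: average if L is whole, else round up)."""
    n = len(sorted_values)
    if (k * n) % 100 == 0:
        L = k * n // 100
        return (sorted_values[L - 1] + sorted_values[L]) / 2
    L = (k * n + 99) // 100
    return sorted_values[L - 1]


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    data = sorted(values)
    return (data[0],
            percentile_value(data, 25),
            percentile_value(data, 50),
            percentile_value(data, 75),
            data[-1])


points = [36, 37, 37, 39, 39, 41, 43, 44, 44, 47, 50, 53,
          54, 55, 56, 56, 57, 59, 61, 61, 65, 69, 69, 75]

minimum, q1, median, q3, maximum = five_number_summary(points)
print(minimum, q1, median, q3, maximum)
print("IQR =", q3 - q1)
```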
Boxplot
A boxplot (or box-and-whisker-diagram) is a
graph of a data set that consists of a line
extending from the minimum value to the
maximum value, and a box with lines drawn
at the first quartile, Q1; the median; and the
third quartile, Q3.
Boxplot of Movie Budget Amounts (figure)
Boxplots - Normal Distribution: Heights from a Simple Random Sample of Women (figure)
Boxplots - Skewed Distribution: Salaries (in thousands of dollars) of NCAA Football Coaches (figure)
Example 7: Use the given sorted values (the numbers of points
scored in the Super Bowl for a recent period of 24 years).
Construct a boxplot and include the values of the 5 – number
summary.
36 37 37 39 39 41 43 44 44 47 50 53 54 55 56 56 57 59
61 61 65 69 69 75
Part 2
Outliers and Modified Boxplots
Outliers
An outlier is a value that lies very far away from
the vast majority of the other values in a data set.
Important Principles
An outlier can have a dramatic effect on the
mean.
An outlier can have a dramatic effect on the
standard deviation.
An outlier can have a dramatic effect on the scale
of the histogram so that the true nature of the
distribution is totally obscured.
Outliers for Modified Boxplots
For purposes of constructing modified boxplots,
we can consider outliers to be data values
meeting specific criteria.
In modified boxplots, a data value is an outlier if it is . . .
above Q3 by an amount greater than 1.5 × IQR
or
below Q1 by an amount greater than 1.5 × IQR
Modified Boxplots
Boxplots described earlier are called skeletal
(or regular) boxplots.
Some statistical packages provide modified
boxplots which represent outliers as special
points.
Modified Boxplot Construction
A modified boxplot is constructed with these
specifications:
• A special symbol (such as an asterisk) is
used to identify outliers.
• The solid horizontal line extends only as far
as the minimum data value that is not an
outlier and the maximum data value that is
not an outlier.
Modified Boxplots - Example
Pulse rates of females listed in Data Set 1 in Appendix B.
Example 8: Use the 40 upper leg lengths (cm) listed for females
from Data Set 1 in Appendix B. Construct a modified boxplot.
Identify any outliers.
41.6 42.8 39 40.2 36.2 43.2 38.7 41 43.8 37.3
42.3 39.1 40.3 48.6 33.2 43.4 41.5 40 38.2 38.2
38.2 41 38.1 39 36.6 27 38 36 32.1 31.1
39.4 40.2 39.2 38.5 39.9 37.5 39.7 39 41.6 33.8
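The outlier criteria for modified boxplots can be checked against the Example 8 data with a sketch like the one below. It flags values more than 1.5 × IQR above Q3 or below Q1, and reports where the whiskers of the modified boxplot would end. The quartiles are found with an assumed textbook locator rule (average two neighbors when L = (k/100)·n is whole, otherwise round L up); other quartile conventions can move the fences and change which values are flagged. The names `percentile_value` and `outlier_report` are hypothetical.

```python
def percentile_value(sorted_values, k):
    """Pk via the locator L = k/100 * n (assumed rule: average if L is whole, else round up)."""
    n = len(sorted_values)
    if (k * n) % 100 == 0:
        L = k * n // 100
        return (sorted_values[L - 1] + sorted_values[L]) / 2
    L = (k * n + 99) // 100
    return sorted_values[L - 1]


def outlier_report(values):
    """Return the fences, outliers, and whisker ends for a modified boxplot."""
    data = sorted(values)
    q1 = percentile_value(data, 25)
    q3 = percentile_value(data, 75)
    step = 1.5 * (q3 - q1)                     # 1.5 x IQR
    lower_fence, upper_fence = q1 - step, q3 + step
    outliers = [v for v in data if v < lower_fence or v > upper_fence]
    inside = [v for v in data if lower_fence <= v <= upper_fence]
    # The whiskers stop at the smallest and largest values that are not outliers.
    return lower_fence, upper_fence, outliers, inside[0], inside[-1]


# The 40 upper leg lengths (cm) from Example 8.
leg_lengths = [41.6, 42.8, 39, 40.2, 36.2, 43.2, 38.7, 41, 43.8, 37.3,
               42.3, 39.1, 40.3, 48.6, 33.2, 43.4, 41.5, 40, 38.2, 38.2,
               38.2, 41, 38.1, 39, 36.6, 27, 38, 36, 32.1, 31.1,
               39.4, 40.2, 39.2, 38.5, 39.9, 37.5, 39.7, 39, 41.6, 33.8]

low, high, outliers, whisker_min, whisker_max = outlier_report(leg_lengths)
print("fences:", low, high)
print("outliers:", outliers)
print("whiskers end at:", whisker_min, whisker_max)
```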
Deciles are the 10th, 20th, 30th, 40th, 50th, 60th,
70th, 80th, and 90th percentiles. They are
denoted using the following notation: D1, D2,
…, D9.
Quintiles are the 20th, 40th, 60th, and 80th
percentiles.
Example 9: Using the following data, find the deciles D2,
D5, and D9.
Sorted Movie Budget Amounts (in millions of dollars)
4.5 5 6.5 7 20 20 29 30 35 40
40 41 50 52 60 65 68 68 70 70
70 72 74 75 80 100 113 116 120 125
132 150 160 200 225
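Since the deciles D1 through D9 are simply the 10th through 90th percentiles, Example 9 can be sketched by reusing a percentile lookup. The code assumes a common textbook locator rule (average two neighbors when L = (k/100)·n is whole, otherwise round L up), which is not stated in the text above; `percentile_value` and `decile` are hypothetical names.

```python
def percentile_value(sorted_values, k):
    """Pk via the locator L = k/100 * n (assumed rule: average if L is whole, else round up)."""
    n = len(sorted_values)
    if (k * n) % 100 == 0:
        L = k * n // 100
        return (sorted_values[L - 1] + sorted_values[L]) / 2
    L = (k * n + 99) // 100
    return sorted_values[L - 1]


def decile(sorted_values, d):
    """Dd is the (10 * d)th percentile, for d from 1 to 9."""
    return percentile_value(sorted_values, 10 * d)


# Sorted movie budget amounts (in millions of dollars) from Example 9 (35 values).
budgets = [4.5, 5, 6.5, 7, 20, 20, 29, 30, 35, 40,
           40, 41, 50, 52, 60, 65, 68, 68, 70, 70,
           70, 72, 74, 75, 80, 100, 113, 116, 120, 125,
           132, 150, 160, 200, 225]

for d in (2, 5, 9):
    print(f"D{d} =", decile(budgets, d))
```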