Describing Location in
a Distribution
2.1 Measures of Relative Standing
and Density Curves
YMS3e
AP Stats at CSHNYC
Ms. Namad
Sample Data
Consider the following test scores for a small class:
79
81
80
77
73
83
74
93
78
80
75
67
77
83
86
90
79
85
83
89
84
82
77
72
73
Jenny’s score is noted in red. How did she perform on
this test relative to her peers?
6 | 7
7 | 2334
7 | 5777899
8 | 00123334
8 | 569
9 | 03
Her score is “above average”...
but how far above average is it?
Standardized Value
One way to describe relative position in a data set is to
tell how many standard deviations above or below the
mean the observation is.
Standardized Value: “z-score”
If the mean and standard deviation of a distribution are
known, the “z-score” of a particular observation, x, is:
$z = \dfrac{x - \text{mean}}{\text{standard deviation}}$
Calculating z-scores
Consider the test data and Julia’s score.
79
81
80
77
73
83
74
93
78
80
75
67
77
83
86
90
79
85
83
89
84
82
77
72
73
According to Minitab, the mean test score was 80
while the standard deviation was 6.07 points.
Julia’s score was above average. Her standardized score is:
$z = \dfrac{x - 80}{6.07} = \dfrac{86 - 80}{6.07} \approx 0.99$
Julia’s score was almost one full standard deviation
above the mean. What about Kevin: x=
Calculating z-scores
Julia: z = (86-80)/6.07 = 0.99 {above average = +z}
Kevin: z = (72-80)/6.07 = -1.32 {below average = -z}
Katie: z = (80-80)/6.07 = 0 {average: z = 0}
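The three calculations above can be checked with a short Python sketch. It assumes the 25 class scores listed earlier and uses the sample standard deviation, which reproduces the 6.07 reported by Minitab; the helper name `z_score` is only illustrative.

```python
from statistics import mean, stdev

# The 25 test scores from the Sample Data slide.
scores = [79, 81, 80, 77, 73, 83, 74, 93, 78, 80, 75, 67, 77,
          83, 86, 90, 79, 85, 83, 89, 84, 82, 77, 72, 73]

def z_score(x, data):
    """Standard deviations that x lies above (+) or below (-) the mean of data."""
    return (x - mean(data)) / stdev(data)  # stdev = sample standard deviation

for name, x in [("Julia", 86), ("Kevin", 72), ("Katie", 80)]:
    print(f"{name}: z = {z_score(x, scores):.2f}")
```

A positive result means the score is above the mean, a negative result means it is below, and zero means the score equals the mean.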
Comparing Scores
Standardized values can be used to compare scores
from two different distributions.
Statistics Test: mean = 80, std dev = 6.07
Chemistry Test: mean = 76, std dev = 4
Jenny got an 86 in Statistics and 82 in Chemistry.
On which test did she perform better?
Statistics: $z = \dfrac{86 - 80}{6.07} \approx 0.99$
Chemistry: $z = \dfrac{82 - 76}{4} = 1.5$
Although she had a lower score, she performed
relatively better in Chemistry.
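As a rough sketch, the same comparison in Python, using the means and standard deviations stated for the two tests. The function and variable names are illustrative only.

```python
def z_score(x, mu, sigma):
    """Standardized value of x for a distribution with mean mu and std dev sigma."""
    return (x - mu) / sigma

z_stats = z_score(86, 80, 6.07)  # Statistics test
z_chem = z_score(82, 76, 4)      # Chemistry test

# The larger z-score marks the relatively better performance.
better = "Chemistry" if z_chem > z_stats else "Statistics"
print(f"Statistics z = {z_stats:.2f}, Chemistry z = {z_chem:.2f}, better: {better}")
```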
Percentiles
Another measure of relative standing is a percentile rank.
pth percentile: Value with p % of observations below it.
median = 50th percentile {mean=50th %ile if
symmetric}
Q1 = 25th percentile
Q3 = 75th percentile
6 | 7
7 | 2334
7 | 5777899
8 | 00123334
8 | 569
9 | 03
Jenny got an 86.
22 of the 25 scores are ≤ 86.
Jenny is in the 22/25 = 88th %ile.
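A minimal sketch of the percentile calculation for Jenny's score. It follows the "at or below" counting used in the example (22 of 25 scores are ≤ 86); the helper name `percentile_rank` is an assumption for illustration.

```python
# The 25 test scores from the Sample Data slide.
scores = [79, 81, 80, 77, 73, 83, 74, 93, 78, 80, 75, 67, 77,
          83, 86, 90, 79, 85, 83, 89, 84, 82, 77, 72, 73]

def percentile_rank(x, data):
    """Percent of observations in data that are at or below x."""
    at_or_below = sum(value <= x for value in data)
    return 100 * at_or_below / len(data)

print(f"Jenny's percentile rank: {percentile_rank(86, scores):.0f}")
```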
Chebyshev’s Inequality
The % of observations at or below a particular z-score
depends on the shape of the distribution.
An interesting (non-AP topic) observation regarding
the % of observations around the mean in ANY
distribution is Chebyshev’s Inequality.
Chebyshev’s Inequality:
In any distribution, the % of observations within
k standard deviations of the mean is at least
$\%\ \text{within } k \text{ std dev} \ge \left(1 - \dfrac{1}{k^2}\right) \times 100\%$
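To see the inequality at work, here is a hedged sketch that compares the actual fraction of the 25 class scores within k standard deviations of the mean against the Chebyshev lower bound. The choices of k and the function names are illustrative.

```python
from statistics import mean, stdev

# The 25 test scores from the Sample Data slide.
scores = [79, 81, 80, 77, 73, 83, 74, 93, 78, 80, 75, 67, 77,
          83, 86, 90, 79, 85, 83, 89, 84, 82, 77, 72, 73]

def fraction_within(data, k):
    """Fraction of observations within k sample standard deviations of the mean."""
    m, s = mean(data), stdev(data)
    return sum(abs(value - m) <= k * s for value in data) / len(data)

def chebyshev_bound(k):
    """Chebyshev's guaranteed minimum fraction within k standard deviations (k > 1)."""
    return 1 - 1 / k**2

for k in (1.5, 2, 3):
    print(f"k = {k}: observed {fraction_within(scores, k):.2f}, "
          f"bound {chebyshev_bound(k):.2f}")
```

The observed fractions sit well above the bound, which is expected: Chebyshev's Inequality holds for any distribution, so it is usually conservative.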
Density Curve
In Chapter 1, you learned how to plot a dataset to
describe its shape, center, spread, etc.
Sometimes, the overall pattern of a large number
of observations is so regular that we can describe
it using a smooth curve.
Density Curve:
An idealized description of
the overall pattern of a
distribution.
Area underneath = 1,
representing 100% of
observations.
Density Curves
Density Curves come in many different shapes;
symmetric, skewed, uniform, etc.
The area of a region of a density curve represents
the % of observations that fall in that region.
The median of a density curve cuts the area in half.
The mean of a density curve is its “balance point.”
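These properties can be illustrated numerically. The sketch below uses a hypothetical right-skewed density, f(x) = 2(1 - x) on [0, 1], which is not from the slides, and approximates areas with a midpoint sum. It checks that the total area is 1, finds the point that splits the area in half (the median), and computes the balance point (the mean).

```python
def density(x):
    """Hypothetical right-skewed density: f(x) = 2(1 - x) on [0, 1], zero elsewhere."""
    return 2 * (1 - x) if 0 <= x <= 1 else 0.0

N = 100_000                                   # number of thin slices
dx = 1 / N
mids = [(i + 0.5) * dx for i in range(N)]     # midpoint of each slice

total_area = sum(density(x) * dx for x in mids)   # should be about 1
mean_x = sum(x * density(x) * dx for x in mids)   # balance point

# Median: accumulate area from the left until half of it is covered.
area = 0.0
median_x = None
for x in mids:
    area += density(x) * dx
    if area >= 0.5:
        median_x = x
        break

print(f"area = {total_area:.4f}, median = {median_x:.4f}, mean = {mean_x:.4f}")
```

For this right-skewed curve the mean lands to the right of the median, pulled toward the long tail.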
Example
• Pretend you are rolling a die. The numbers 1, 2, 3, 4, 5, 6 are the possible outcomes. In 120 rolls, how many of each number would you expect to roll?
• Calculator can do a simulation:
• Clear L1 in your calc. Use the random integer generator to generate 120 random whole numbers between 1 and 6, then store them in L1.
• RandInt (1, 6, 120) STO-> L1
• Set viewing window: X (1,7) by Y (-5,25).
• Specify a histogram using the data in L1.
• Repeat the simulation several times. 2nd Enter will recall/reuse the previous command. In theory we should expect a uniform outcome...
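For anyone without a calculator handy, the same simulation can be sketched in Python. This is an illustrative stand-in for the calculator steps, not part of the original activity; `simulate_rolls` is a hypothetical helper name.

```python
import random
from collections import Counter

def simulate_rolls(n_rolls=120, seed=None):
    """Roll a fair six-sided die n_rolls times and count how often each face appears."""
    rng = random.Random(seed)
    rolls = [rng.randint(1, 6) for _ in range(n_rolls)]  # randint includes both ends
    counts = Counter(rolls)
    return {face: counts.get(face, 0) for face in range(1, 7)}

expected = 120 / 6  # a uniform outcome would give 20 of each face

# Repeat the simulation a few times, as the slide suggests.
for trial in range(1, 4):
    counts = simulate_rolls()
    print(f"Trial {trial}: {counts} (expected {expected:.0f} each)")
```

Each run gives counts that scatter around 20 per face; the uniform shape is the long-run pattern, not what any single run of 120 rolls shows exactly.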
2.1 Summary
We can describe the overall pattern of a distribution
using a density curve.
The area under any density curve = 1. This
represents 100% of observations.
Areas on a density curve represent % of observations
over certain regions.
An individual observation’s relative standing can be
described using a z-score or percentile rank.
$z = \dfrac{x - \text{mean}}{\text{standard deviation}}$
2.2 Normal Distributions
• Normal Curves: symmetric, single-peaked, bell-shaped. The mean and median are the same. The size of σ affects the spread of the normal curve.
Example
• Scores on the SAT verbal test in recent years follow approximately the N(505, 110) distribution. How high must a student score in order to place in the top 10% of all students taking the SAT?
• 1. State the problem and draw a picture. Shade the area we’re looking for.
• 2. Find the z-score with the table.
• 3. Convert to raw score.
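The three steps can be sketched in Python with the standard library's `statistics.NormalDist`, which takes the place of the table lookup here. The top 10% starts at the 90th percentile, so step 2 looks up the z-score with area 0.90 to its left.

```python
from statistics import NormalDist

# Step 1: the problem. SAT verbal scores are approximately N(505, 110);
# we want the score with 10% of the area to its right (90% to its left).
sat_verbal = NormalDist(mu=505, sigma=110)

# Step 2: z-score with area 0.90 to its left (Table A gives about 1.28).
z = NormalDist().inv_cdf(0.90)

# Step 3: convert back to a raw score: x = mean + z * (standard deviation).
cutoff = sat_verbal.mean + z * sat_verbal.stdev

print(f"z = {z:.2f}, cutoff score = {cutoff:.0f}")
```

With the table value z = 1.28 the hand calculation gives 505 + 1.28(110) = 645.8, so a student needs a score of roughly 646 or higher.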
Assessing Normality
•
Method 1: Construct a histogram, see if graph
is approximately bell-shaped and symmetric.
Median and Mean should be close. Then mark
off the -2, -1, +1, +2 SD points and check the
68-95-99.7 rule.
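As a hedged illustration of Method 1, the sketch below applies the 68-95-99.7 check to the 25 class test scores used earlier in these slides. The helper name `percent_within` is illustrative.

```python
from statistics import mean, stdev

# The 25 test scores from the Sample Data slide.
scores = [79, 81, 80, 77, 73, 83, 74, 93, 78, 80, 75, 67, 77,
          83, 86, 90, 79, 85, 83, 89, 84, 82, 77, 72, 73]

def percent_within(data, k):
    """Percent of observations within k sample standard deviations of the mean."""
    m, s = mean(data), stdev(data)
    return 100 * sum(m - k * s <= value <= m + k * s for value in data) / len(data)

# Compare the observed percents with the 68-95-99.7 rule.
for k, rule in [(1, 68), (2, 95), (3, 99.7)]:
    print(f"within {k} SD: {percent_within(scores, k):.0f}% (rule: {rule}%)")
```

With only 25 observations the percents will not match the rule exactly; the check is whether they are reasonably close.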
Normal Probability Plot
• Method 2: Construct a Normal Probability Plot.
• 1. Arrange the observed data values from smallest to largest. Record what percentile of the data each value occupies (for example, the smallest observation in a set of 20 is at the 5% point, the second is at 10%, etc.).
• 2. Use Table A to find the z's at these same percentiles (for example, -1.645 is at 5%, -1.28 is at 10%).
• 3. Plot each data point against the corresponding z (x-values on the horizontal axis, z-scores on the vertical axis is what I do; either is fine).
[Figure: three example Normal probability plots, labeled "Normal w/Outliers", "Right Skew", and "Normal"]
Interpretation: draw your X = Y line with a straight edge; the points shouldn’t vary too much from it.
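The construction in Method 2 can be sketched in Python for the 25 class test scores. One labeled change from the slide: the percentile for the i-th smallest value is taken as (i - 0.5)/n rather than i/n, because i/n puts the largest value at 100%, which has no finite z-score. The correlation at the end is an extra numeric summary of how straight the plot is, not part of the original method.

```python
from statistics import NormalDist, mean

# The 25 test scores from the Sample Data slide.
scores = [79, 81, 80, 77, 73, 83, 74, 93, 78, 80, 75, 67, 77,
          83, 86, 90, 79, 85, 83, 89, 84, 82, 77, 72, 73]

def normal_scores(data):
    """Pair each sorted value with the z-score at its percentile, (i - 0.5)/n."""
    n = len(data)
    std_normal = NormalDist()
    return [(x, std_normal.inv_cdf((i - 0.5) / n))
            for i, x in enumerate(sorted(data), start=1)]

def pearson_r(pairs):
    """Correlation between the data values and their z-scores."""
    xs, zs = zip(*pairs)
    mx, mz = mean(xs), mean(zs)
    sxz = sum((x - mx) * (z - mz) for x, z in pairs)
    sxx = sum((x - mx) ** 2 for x in xs)
    szz = sum((z - mz) ** 2 for z in zs)
    return sxz / (sxx * szz) ** 0.5

pairs = normal_scores(scores)
for x, z in pairs[:3]:
    print(f"x = {x}, z = {z:.2f}")   # the three smallest observations
print(f"correlation of plot points: {pearson_r(pairs):.3f}")
```

Plotting the pairs (x on the horizontal axis, z on the vertical) gives the Normal probability plot; a correlation near 1 corresponds to points that fall close to a straight line.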
Constructing Probability Plot on
Calculator
79
81
80
77
73
83
74
93
78
80
75
67
77
83
86
90
79
85
83
89
84
82
77
72
73
• Students in Mr. Pryor’s stats class
• X values on horizontal axis
Case Closed
The New SAT
Chapter 2
I: Normal Distributions
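• 1. SAT Writing Scores are N(516, 115). What percent are between 600 and 700?
[Figure: Normal curve, ≈N(516, 115), SAT Writing Scores, with 516, 600, and 700 marked]
$z_{700} = \dfrac{700 - 516}{115} = \dfrac{184}{115} = 1.6$
$z_{600} = \dfrac{600 - 516}{115} = \dfrac{84}{115} \approx 0.73$
% below 700 ≈ .9452
% below 600 ≈ .7673
% between 600 and 700 ≈ .9452 - .7673 ≈ .1779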
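A quick software check of this answer, sketched with Python's `statistics.NormalDist` in place of Table A. Because the table rounds z to two decimals, the software result differs slightly in the last digits.

```python
from statistics import NormalDist

sat_writing = NormalDist(mu=516, sigma=115)

below_700 = sat_writing.cdf(700)   # area to the left of 700
below_600 = sat_writing.cdf(600)   # area to the left of 600
between = below_700 - below_600    # area between 600 and 700

print(f"below 700: {below_700:.4f}, below 600: {below_600:.4f}, "
      f"between: {between:.4f}")
```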
I: Normal Distributions
• 1. SAT Writing Scores are N(516, 115). What score would place a student in the 65th percentile?

Table A Standard Normal probabilities (continued)
z     0.00    0.01    ...   0.07    0.08    0.09
0.0   0.5000  0.5040  ...   0.5279  0.5319  0.5359
...   ...     ...     ...   ...     ...     ...
0.3   0.6179  0.6217  ...   0.6443  0.6480  0.6517
0.4   0.6554  0.6591  ...   0.6808  0.6844  0.6879

[Figure: Normal curve, ≈N(516, 115), SAT Writing Scores, with area 0.65 to the left of the unknown score "?"]
$z_{0.65} = 0.39$
$? = \text{mean} + 0.39(s)$
$? = 516 + 0.39(115)$
$? = 516 + 44.85$
$? = 560.85$
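The same backward calculation can be sketched with `statistics.NormalDist`, whose `inv_cdf` plays the role of reading Table A in reverse. The table's nearest entry gives z = 0.39, while the software uses the unrounded z, so the two scores differ by a fraction of a point.

```python
from statistics import NormalDist

sat_writing = NormalDist(mu=516, sigma=115)

# z-score with 65% of the area to its left.
z = NormalDist().inv_cdf(0.65)

# Convert to a raw score: score = mean + z * (standard deviation).
score = sat_writing.mean + z * sat_writing.stdev

print(f"z = {z:.3f}, 65th percentile score = {score:.1f}")
```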
II: Comparing Observations
• 2. Male scores are N(491, 110).
• Female scores are N(502, 108).
• a) What % of males earned scores below 502?
[Figure: Normal curve, ≈N(491, 110), Male Writing Scores, with 491 and 502 marked]
$z = \dfrac{502 - 491}{110} = 0.1$
% below = .5398
II: Comparing Observations
• 2. Male scores are N(491, 110).
• Female scores are N(502, 108).
• b) What % of females earned scores above 491?
[Figure: Normal curve, ≈N(502, 108), Female Writing Scores, with 491 and 502 marked]
$z = \dfrac{491 - 502}{108} \approx -0.10$
% below = .4602
% above = 1 - .4602 = .5398
II: Comparing Observations
• 2. Male scores are N(491, 110).
• Female scores are N(502, 108).
• c) What % of males earned scores above the 85th %-ile of female scores?
85th %-ile for females:
$z_{.85} = 1.04$
score = 502 + 1.04(108)
score = 614.32
[Figure: Normal curve, ≈N(491, 110), Male Writing Scores, with 491 and 614.32 marked]
$z = \dfrac{614.32 - 491}{110} \approx 1.12$
% below = .8686
% above = .1314
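Parts a), b), and c) can be cross-checked in a few lines of Python. This is a sketch using `statistics.NormalDist` with the two stated distributions; since it does not round z to two decimals the way Table A does, the results agree with the hand answers only to about two or three decimal places.

```python
from statistics import NormalDist

male = NormalDist(mu=491, sigma=110)
female = NormalDist(mu=502, sigma=108)

# a) Percent of males scoring below 502.
males_below_502 = male.cdf(502)

# b) Percent of females scoring above 491.
females_above_491 = 1 - female.cdf(491)

# c) Female 85th percentile, then the percent of males above that score.
female_p85 = female.inv_cdf(0.85)
males_above_p85 = 1 - male.cdf(female_p85)

print(f"a) {males_below_502:.4f}")
print(f"b) {females_above_491:.4f}")
print(f"c) female 85th percentile = {female_p85:.1f}, "
      f"males above = {males_above_p85:.4f}")
```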
III: Determining Normality
• 3a. Did males or females perform better?
The male and female scores are very similar. Both have
roughly symmetric distributions with no outliers. The
median for females is slightly higher (580 vs 570), but
the male average is slightly higher (584.6 vs 580). Both
have similar ranges, but the males had slightly more
variability in the middle 50%.
III: Determining Normality
• 3b. How do the male scores compare with National results?
SATs (Male)
S1 = mean: 584.58333
S2 = count: 48
S3 = stdDev: 80.07864
S4 = stdError: 11.558356
S5 = count missing: 39
The males at this school did much better than the
overall national mean (584.6 vs. 516). Their scores
were also more consistent as evidenced by a lower
standard deviation (80.08 vs 115).
III: Determining Normality
• 3c. Are the male and female scores approximately Normal?
The Normal Quantile Plots for both the male and female
scores are approximately linear. Therefore, there is
evidence that their scores are approximately Normal.