Chapter 5
The Standard Deviation as a Ruler and the Normal Model
Copyright © 2014, 2012, 2009 Pearson Education, Inc.
Objectives:
The student will be able to:
19. Compare values from two different distributions using their z-scores.
20. Use Normal models (when appropriate) and the 68-95-99.7 Rule to estimate the percentage of observations falling within one, two, or three standard deviations of the mean.
21. Determine the percentages of observations that satisfy certain conditions by using the Normal model and determine “extraordinary” values.
22. Determine whether a variable satisfies the Nearly Normal condition by making a normal probability plot or histogram.
23. Determine the z-score that corresponds to a given percentage of observations.
5.1 Standardizing with z-Scores
Comparing Athletes
• Dobrynska took the gold in the women’s heptathlon at the Olympics with a long jump of 6.63 m, about 0.5 m longer than average.
• Fountain won the 200 m run with a time of 23.21 s, 1.5 s faster than average.
• Whose performance was more impressive?
How Many Standard Deviations Above?
             Long Jump   200 m Run
Mean         6.11 m      24.71 s
SD           0.24 m      0.70 s
Individual   6.63 m      23.21 s
• The standard deviation helps us compare.
• Long Jump:
• 1 SD above: 6.11 + 0.24 = 6.35
• 2 SD above: 6.11 + (2)(0.24) = 6.59
• 6.63 m is just over 2 standard deviations above the mean.
The z-Score
• In general, to find the distance between the value and the mean in standard deviations:
1. Subtract the mean from the value.
2. Divide by the standard deviation.
$$z = \frac{y - \bar{y}}{s}$$
• This is called the z-score.
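The two steps above can be sketched in Python. This is a minimal illustration rather than part of the original slides: the function name `z_score` is a hypothetical helper, and the sample numbers are the long-jump figures from the table in this section.

```python
def z_score(value, mean, sd):
    """Return how many standard deviations `value` lies from `mean`."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    # Step 1: subtract the mean. Step 2: divide by the standard deviation.
    return (value - mean) / sd

# Long-jump figures from the slides: value 6.63 m, mean 6.11 m, SD 0.24 m.
long_jump_z = z_score(6.63, 6.11, 0.24)
print(round(long_jump_z, 2))  # about 2.17
```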
The z-score
• The z-score measures the distance of the value from the mean in standard deviations.
• A positive z-score indicates the value is above the mean.
• A negative z-score indicates the value is below the mean.
• A z-score that is small in magnitude indicates the value is close to the mean when compared to the rest of the data values.
• A z-score that is large in magnitude indicates the value is far from the mean when compared to the rest of the data values.
How Many Standard Deviations Above?
             Long Jump   200 m Run
Mean         6.11 m      24.71 s
SD           0.24 m      0.70 s
Individual   6.63 m      23.21 s
• Standard Deviations from the Mean
Long Jump: $z = \frac{6.63 - 6.11}{0.24} \approx 2.17$
200 m Run: $z = \frac{23.21 - 24.71}{0.70} \approx -2.14$
• A faster-than-average time has a negative z-score, so compare distances from the mean: 2.17 versus 2.14.
• Dobrynska’s long jump was a little more impressive than Fountain’s 200 m run.
Practice
Assume verbal SAT scores have a mean of 500 and a
std. dev. of 100. What is the z-score of somebody who
scores 500 on the verbal portion of the SAT?
Assume IQ scores have a mean of 100 and a std. dev. of
16. Albert Einstein reportedly had an IQ of 160. What is
the z-score of his IQ?
Examples
Suppose your stats professor reports your test grade as
a z-score and you received a score of 2.20.
• What does that mean?
• If the mean was 80 and
the standard deviation was 7,
what was your grade?
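Going from a z-score back to a raw value reverses the formula: value = mean + z × standard deviation. A minimal Python sketch, assuming the numbers on this slide; the helper name `value_from_z` is hypothetical.

```python
def value_from_z(z, mean, sd):
    """Undo standardization: scale by the SD, then shift by the mean."""
    return mean + z * sd

# z = 2.20 with mean 80 and standard deviation 7, as on the slide.
grade = value_from_z(2.20, 80, 7)
print(round(grade, 1))  # 95.4
```

A z-score of 2.20 means the grade sits 2.2 standard deviations above the class mean.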
Benefits of Standardizing
Standardized values have been converted from their
original units to the standard statistical unit of standard
deviations from the mean.
Thus, we can compare values that are measured on
different scales, with different units, or from different
populations.
Example – Which student performed better?
• Student A received an 85 on a 100-point quiz with a mean of 90 and a standard deviation of 5.
• Student B received a 35 on a 50-point quiz with a mean of 37 and a standard deviation of 3.
• We must compare z-scores!
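One way to carry out this comparison is sketched below in Python. The helper `z_score` and the variable names are illustrative assumptions; the numbers are the two quiz results above.

```python
def z_score(value, mean, sd):
    """Distance of `value` from `mean`, measured in standard deviations."""
    return (value - mean) / sd

z_a = z_score(85, 90, 5)  # Student A: 100-point quiz
z_b = z_score(35, 37, 3)  # Student B: 50-point quiz

# The higher z-score is the better performance relative to the class.
better = "A" if z_a > z_b else "B"
print(round(z_a, 2), round(z_b, 2), better)  # -1.0 -0.67 B
```

Both students scored below their class means, but Student B is closer to the mean in standard-deviation units, so B did relatively better.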
Practice
Recall:
Suppose a basketball player scored the following
number of points in his last 15 games: 4, 4, 3, 4, 7, 16,
12, 15, 6, 8, 5, 9, 8, 25, 11
Use these scores to calculate the z-scores for the 3
lowest-scoring and 3 highest-scoring games.
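A possible way to set up this calculation in Python is sketched here. It assumes the sample standard deviation (n − 1 denominator), which is what `statistics.stdev` computes; the variable names are illustrative.

```python
from statistics import mean, stdev

points = [4, 4, 3, 4, 7, 16, 12, 15, 6, 8, 5, 9, 8, 25, 11]

y_bar = mean(points)
s = stdev(points)  # assumption: sample SD with an n - 1 denominator

# Pair each game's points with its z-score, sorted from lowest to highest.
z_by_game = sorted((p, (p - y_bar) / s) for p in points)
lowest_three = z_by_game[:3]
highest_three = z_by_game[-3:]

for p, z in lowest_three + highest_three:
    print(p, round(z, 2))
```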
A college student received a score of 78 on her math exam and a score of 86 on her French exam. The overall results on the French exam had a mean of 82 and a standard deviation of 8, while the math exam had a mean of 54 and a standard deviation of 12. On which exam did she do relatively better?
Math z-score: (78 − 54) / 12 = 2.0
French z-score: (86 − 82) / 8 = 0.5
She did relatively better on her math exam.
Who is relatively taller?
• A non-basketball-playing man who is 75 inches tall (assume non-basketball-playing men have a mean height of 71.5 inches and a standard deviation of 2.1 inches).
• A male basketball player who is 85 inches tall (assume male basketball players have a mean height of 80 inches and a standard deviation of 3.3 inches).
5.2 Shifting and Scaling
National Health and Examination Survey
• Who? 80 male participants between 19 and 24 who measured between 68 and 70 inches tall
• What? Their weights in kilograms
• When? 2001 – 2002
• Where? United States
• Why? To study nutrition and health issues and trends
• How? National survey
Shifting Weights
• Mean: 82.36 kg
• Maximum Healthy Weight: 74 kg
• How are shape, center, and spread affected when 74 is subtracted from all values?
  • Shape and spread are unaffected.
  • Center is shifted down by 74 kg (the mean becomes 82.36 − 74 = 8.36 kg).
Rules for Shifting
• If the same number is subtracted from or added to all data values, then:
  • The measures of spread – standard deviation, range, and IQR – are all unaffected.
  • The measures of position – mean, median, and mode – are all changed by that number.
Rescaling
• If we multiply all data values by the same number, what happens to the position and spread?
• To go from kg to lbs, multiply by 2.2.
• The mean and spread are also multiplied by 2.2.
How Rescaling Affects the Center and Spread
• When we multiply (or divide) all the data values by a constant, all measures of position and all measures of spread are multiplied (or divided) by that same constant.
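The shifting and rescaling rules can be checked numerically. The Python sketch below uses a small made-up list of weights (hypothetical values, not the survey data) together with the 74 kg shift and the 2.2 conversion factor mentioned in the slides.

```python
from statistics import mean, stdev

# Hypothetical weights in kg, for illustration only (not the survey data).
weights_kg = [70.0, 78.5, 82.0, 90.5, 95.0]

shifted = [w - 74 for w in weights_kg]     # subtract the same number from every value
rescaled = [w * 2.2 for w in weights_kg]   # multiply every value by the same number

print("original:", mean(weights_kg), stdev(weights_kg))
print("shifted: ", mean(shifted), stdev(shifted))      # mean moves by 74, SD unchanged
print("rescaled:", mean(rescaled), stdev(rescaled))    # mean and SD both scale by 2.2
```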
Example: Rescaling Combined Times in the Olympics
• The mean and standard deviation in the men’s combined event at the Olympics were 168.93 seconds and 2.90 seconds, respectively.
• If the times are measured in minutes, what will be the new mean and standard deviation?
  • Mean: 168.93 / 60 = 2.816 minutes
  • Standard Deviation: 2.90 / 60 = 0.048 minutes
Shifting, Scaling, and z-Scores
• Converting to z-scores:
  • Subtract the mean
  • Divide by the standard deviation
• The shape of the distribution does not change.
• Changes the center by making the mean 0: $\bar{y} \to 0$
• Changes the spread by making the standard deviation 1: $s \to 1$
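As a quick numerical check of these claims, the Python sketch below standardizes the points-per-game data from the earlier practice slide and confirms that the resulting z-scores have mean 0 and standard deviation 1. It assumes the sample standard deviation (`statistics.stdev`); the variable names are illustrative.

```python
from statistics import mean, stdev

# Points per game from the earlier practice slide.
data = [4, 4, 3, 4, 7, 16, 12, 15, 6, 8, 5, 9, 8, 25, 11]

y_bar, s = mean(data), stdev(data)

# Shift by the mean, then rescale by the standard deviation.
z_scores = [(y - y_bar) / s for y in data]

print(abs(mean(z_scores)) < 1e-9)        # True: the center is now 0
print(abs(stdev(z_scores) - 1) < 1e-9)   # True: the spread is now 1
```

Because every value is shifted and rescaled the same way, the ordering of the values and the shape of the distribution are preserved.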
5.3 Normal Models
Models
• “All models are wrong, but some are useful.” (George Box, statistician)
• −1 < z < 1: Not uncommon
• z = ±3: Rare
• z = 6: Shouts out for attention!
The Normal Model
• Bell Shaped: unimodal, symmetric
• There is a Normal model for every mean and standard deviation.
• μ (read “mew”) represents the population mean.
• σ (read “sigma”) represents the population standard deviation.
• N(μ, σ) represents a Normal model with mean μ and standard deviation σ.
Parameters and Statistics
• Parameters: Numbers that help specify the model
  • μ, σ
• Statistics: Numbers that summarize the data
  • $\bar{y}$, s, median, mode
• N(0, 1) is called the standard Normal model, or the standard Normal distribution.
• The Normal model should only be used if the data is approximately symmetric and unimodal.
The 68-95-99.7 Rule
• In a Normal model, about 68% of the values fall within 1 standard deviation of the mean.
• About 95% of the values fall within 2 standard deviations of the mean.
• About 99.7% of the values fall within 3 standard deviations of the mean.
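The three percentages in the rule are rounded areas under the standard Normal curve. A short Python sketch using the standard library's `statistics.NormalDist` shows where they come from; the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)

# Area between -k and +k standard deviations, for k = 1, 2, 3.
within = {k: std_normal.cdf(k) - std_normal.cdf(-k) for k in (1, 2, 3)}

for k, share in within.items():
    print(k, round(100 * share, 1))  # roughly 68.3, 95.4, 99.7
```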
Working With the 68-95-99.7 Rule
• Each part of the SAT has a mean of 500 and a standard deviation of 100. Assume the data is symmetric and unimodal. If you earned a 700 on one part of the SAT, how do you stand among all others who took the SAT?
• Think →
  • Plan: The variable is quantitative and the distribution is symmetric and unimodal. Use the Normal model N(500, 100).
Show and Tell
• Show → Mechanics:
  • Make a picture.
  • 700 is 2 standard deviations above the mean.
• Tell → Conclusion:
  • 95% lies within 2 standard deviations of the mean.
  • 100% - 95% = 5% are outside of 2 standard deviations of the mean.
  • Above 2 standard deviations is half of that.
  • 5% / 2 = 2.5%
  • Only about 2.5% of scores are higher than yours, so your score is higher than about 97.5% of all scores on this test.
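For comparison, the share of scores above 700 under N(500, 100) can be computed directly rather than estimated from the rule. This is a minimal Python sketch using `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

sat = NormalDist(mu=500, sigma=100)

# Right-tail area: proportion of scores above 700 (z = 2).
share_above_700 = 1 - sat.cdf(700)
print(round(100 * share_above_700, 1))  # about 2.3, close to the rule's 2.5
```

The small gap arises because the 95% in the rule is itself a rounded figure.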
Practice
Assume IQ scores follow a Normal model with a mean of 100 and a std. dev. of 16. Find the following percentages using the 68-95-99.7 Rule.
a. % of people with 84<=IQ<=116
b. % of people with IQ>=100
c. % of people with 68<=IQ<=132
d. % of people who are “geniuses” (a genius is someone with an IQ>=132)
e. % of people with 84<=IQ<=132
Three Rules For Using the Normal Model
1. Make a picture.
2. Make a picture.
3. Make a picture.
• When data is provided, first make a histogram to make sure that the distribution is symmetric and unimodal. Then sketch the Normal model.
5.4 Finding Normal Percentiles
What if z is not −3, −2, −1, 0, 1, 2, or 3?
• If the data value we are interested in does not have such a nice z-score, we will use a computer to find its Normal percentile.
• Example: Where do you stand if your SAT math score was 680? μ = 500, σ = 100
• Note that the z-score is not an integer:
$$z = \frac{680 - 500}{100} = 1.8$$
Finding Normal Percentiles
When a data value doesn’t fall exactly 1, 2, or 3 standard
deviations from the mean, we can look it up in a table
of Normal percentiles.
Table Z in Appendix D provides us with normal
percentiles, but many calculators and statistics
computer packages provide these as well.
You will do these using your TI-83.
Finding Normal Percentiles using Technology – TI-83
• To find what percentage of a Normal model is found in the region between a and b, use the DISTR function normalcdf(a,b,μ,σ). For infinity, use any very large number such as 1E99 (and -1E99 for negative infinity).
• If your a and b are z-scores, then μ = 0 and σ = 1.
Finding Normal Percentiles
Example: Where do you stand if your SAT math score was 680? μ = 500, σ = 100
z = 1.8
To find the area to the left of a z-score of 1.80:
normalcdf(-1E99, 1.80, 0, 1) = .964
Alternatively, we can compute normalcdf(-1E99, 680, 500, 100) = .964
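For readers without a TI-83, the same left-tail area can be sketched in Python. This is an equivalent calculation using the standard library's `statistics.NormalDist`, not the calculator's own function; the variable names are illustrative.

```python
from statistics import NormalDist

# Area to the left of z = 1.80 in the standard Normal model.
area_from_z = NormalDist(0, 1).cdf(1.80)

# Same area, working directly with the raw score in N(500, 100).
area_from_raw = NormalDist(500, 100).cdf(680)

print(round(area_from_z, 3), round(area_from_raw, 3))  # both about 0.964
```

Because `cdf` already integrates from negative infinity, no stand-in such as -1E99 is needed for the lower bound.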
Using StatCrunch for the Normal Model
• What percent of all SAT scores are below 680?
  • μ = 500, σ = 100
• Stat → Calculators → Normal
• Fill in info, hit Compute
• 96.4% of SAT scores are below 680.
A Probability Involving “Between”
• What is the proportion of SAT scores that fall between 450 and 600? μ = 500, σ = 100
• Think →
  • Plan: Probability that x is between 450 and 600 = Probability that x < 600 – Probability that x < 450
  • Variable: We are told that the Normal model works. N(500, 100)
• Show → Mechanics: Use StatCrunch to find each of the probabilities.
• Probability that x is between 450 and 600
  = Probability that x < 600 – Probability that x < 450
  = 0.8413 – 0.3085 = 0.5328
• Conclusion: The Normal model estimates that about 53.28% of SAT scores fall between 450 and 600.
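The subtraction of two left-tail areas can be reproduced in Python as a check on the StatCrunch figures. This sketch uses `statistics.NormalDist` from the standard library; the variable names are illustrative.

```python
from statistics import NormalDist

sat = NormalDist(mu=500, sigma=100)

below_600 = sat.cdf(600)  # area to the left of 600
below_450 = sat.cdf(450)  # area to the left of 450
between = below_600 - below_450

print(round(below_600, 4), round(below_450, 4), round(between, 4))
# 0.8413 0.3085 0.5328
```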
Underweight Cereal Boxes
• Based on experience, a manufacturer makes cereal boxes that fit the Normal model with mean 16.3 ounces and standard deviation 0.2 ounces, but the label reads 16.0 ounces. What fraction will be underweight?
• Think →
  • Plan: Find Probability that x < 16.0
  • Variable: N(16.3, 0.2)
• Probability that x < 16.0 = 0.0668 (μ = 16.3, σ = 0.2)
• Conclusion: I estimate that approximately 6.7% of the boxes will contain less than 16.0 ounces of cereal.
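The underweight fraction is a single left-tail area, which can be sketched in Python with `statistics.NormalDist`. The variable names are illustrative.

```python
from statistics import NormalDist

boxes = NormalDist(mu=16.3, sigma=0.2)

# Proportion of boxes weighing less than the 16.0 oz on the label (z = -1.5).
underweight = boxes.cdf(16.0)
print(round(underweight, 4))  # about 0.0668
```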
From Percentiles to Scores: z in Reverse
Sometimes we start with areas and need to find the
corresponding z-score or even the original data value.
Example: What z-score represents the first quartile in a
Normal model?
• invNorm(.25, 0, 1)
Finding the z-score given a percentile using technology – TI-83
To find the score with a given tail probability, use the DISTR function invNorm(p,μ,σ).
• NOTE: invNorm always considers p to be the proportion in the left (negative) tail.
• To find a right tail or center probability, you will have to do some work to find the right p to use with invNorm.
• If you want a z-score (not a raw score), then you can leave off μ and σ (or write 0, 1).
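The left-tail convention applies equally to Python's `statistics.NormalDist.inv_cdf`, which can stand in for invNorm in a sketch like the one below. The right-tail and middle percentages used here (15% and 50%) are arbitrary illustrative choices, not values from the slides.

```python
from statistics import NormalDist

std_normal = NormalDist(0, 1)

# Left tail: the first quartile has 25% of the area to its left.
first_quartile_z = std_normal.inv_cdf(0.25)

# Right tail: a top 15% cutoff has 1 - 0.15 = 0.85 of the area to its left.
top_15_cutoff_z = std_normal.inv_cdf(1 - 0.15)

# Center: the middle 50% leaves 25% in each tail.
middle_50 = (std_normal.inv_cdf(0.25), std_normal.inv_cdf(0.75))

print(round(first_quartile_z, 3))  # about -0.674
print(round(top_15_cutoff_z, 3))
print(tuple(round(z, 3) for z in middle_50))
```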
From Percentiles to Scores: z in Reverse
• Suppose a college admits only people with SAT scores in the top 10%. How high a score does it take to be eligible? μ = 500, σ = 100
• Think →
  • Plan: We are given the probability and want to go backwards to find x.
  • Variable: N(500, 100)
• Solve this using invNorm(.9, 500, 100): the top 10% corresponds to a left-tail area of 0.90.
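The same cutoff can be sketched in Python with `statistics.NormalDist.inv_cdf`, which plays the role of invNorm here. The variable names are illustrative.

```python
from statistics import NormalDist

sat = NormalDist(mu=500, sigma=100)

# Top 10% of scores means 90% of the area lies to the left of the cutoff.
cutoff = sat.inv_cdf(0.90)
print(round(cutoff))  # about 628
```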
Percentiles and Z-scores
• What percent of a standard Normal model is found in each region? Draw a picture for each.
a) z > -2.05
b) z < -0.33
c) 1.2 < z < 1.8
d) |z| < 1.28
• In a standard Normal model, what value(s) of z cut(s) off the region
described? Draw a picture first!
a) The highest 20%
b) The highest 75%
c) The lowest 3%
d) The middle 90%
More Percentiles and Z-scores
• What percent of a standard Normal model is found in each region? Draw a picture for each.
a) z > -1.05
b) z < -0.40
c) 1.3 < z < 2.0
• In a standard Normal model, what value(s) of z cut(s) off the region
described? Draw a picture first!
a) The highest 20%
b) The highest 60%
c) The lowest 6%
d) The middle 75%
Additional exercises
Some IQ tests are standardized to a normal model with a mean of
100 and a standard deviation of 16.
A) Draw the model for these IQ scores, clearly labeled, showing what the 68-95-99.7 Rule predicts about the scores.
B) In what interval would you expect to find the central 95% of IQ scores?
C) About what percent of people should have IQ scores above
116?
D) About what percent of people should have IQ scores between
68 and 84?
E) About what percent of people would have IQ scores above
132?
44) Based on the Normal model N(100, 16) describing
IQ scores, what percent of people’s IQ scores would
you expect to be
– Over 80?
– Under 90?
– Between 112 and 132?
46) In the same model, what cutoff value bounds
– The highest 5% of all IQs?
– The lowest 30% of the IQs?
– The middle 80% of the IQs?
What Can Go Wrong
• Don’t use the Normal model when the distribution is not unimodal and symmetric.
  • Always look at the picture first.
• Don’t use the mean and standard deviation when outliers are present.
  • Check by making a picture.
• Don’t round your results in the middle of the calculation.
  • Always wait until the end to round.
• Don’t worry about minor differences in results.
  • Different rounding can produce slightly different results.