AP Statistics
The Standard Deviation as a Ruler and the
Normal Model
Chapter 6 Part 2
Learning Goals
6. Recognize when standardization can be used to
compare values.
7. Be able to use Normal models and the 68-95-99.7
Rule to estimate the percentage of observations
falling within 1, 2, or 3 standard deviations of the
mean.
8. Know how to find the percentage of observations
falling below any value in a Normal model using a
Normal table or appropriate technology.
9. Know how to check whether a variable satisfies the
Nearly Normal Condition by making a Normal
Probability plot or histogram.
Learning Goal 6
Recognize when standardization can be used to
compare values.
Learning Goal 6:
Why We Standardize
• Standardizing allows us to compare
distributions by giving them a
common scale.
• Any distribution can be standardized, but
z-scores can be turned into Normal
percentages only if the distribution is Normal.
• ALWAYS check to make sure the
Normal Model is appropriate before
using z-scores with the Normal model.
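As a minimal sketch of standardizing to compare values, the snippet below computes z = (y - μ)/σ for two scores measured on different scales. The two exams, their means, and their standard deviations are hypothetical numbers chosen only for illustration; they do not come from the slides.

```python
def z_score(y, mean, sd):
    """Return how many standard deviations y lies from the mean."""
    return (y - mean) / sd

# Hypothetical values (assumed for illustration, not from the slides):
# exam A has mean 500 and SD 100; exam B has mean 21 and SD 5.
z_a = z_score(650, 500, 100)   # 1.5 SDs above the mean
z_b = z_score(27, 21, 5)       # 1.2 SDs above the mean

# The larger z-score is the more unusual result relative to its own group.
better = "A" if z_a > z_b else "B"
print(z_a, z_b, better)
```

Because both results are now on the same scale (standard deviations from the mean), they can be compared directly.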
Learning Goal 6:
Nearly Normal Condition
• When we use the Normal model, we
are assuming the distribution is
Normal.
• We cannot check this assumption in
practice, so we check the following
condition:
– Nearly Normal Condition: The shape
of the data’s distribution is
unimodal and symmetric.
– This condition can be checked with
a histogram or a Normal probability
plot (to be explained later).
Learning Goal 6:
Nearly Normal Condition
• Using z-scores with the Normal
model (to find percentages) is
appropriate only when the
Nearly Normal Condition is met.
Learning Goal 7
Be able to use Normal models and the 68-95-99.7 Rule to estimate the percentage
of observations falling within 1, 2, or 3 standard deviations of the mean.
Learning Goal 7:
The 68-95-99.7 Rule
• Normal models give us an idea of
how extreme a value is by telling us
how likely it is to find one that far
from the mean.
• We can find these numbers
precisely, but until then we will use
a simple rule that tells us a lot about
the Normal model…
• The 68-95-99.7 Rule or
Empirical Rule
Learning Goal 7:
The 68-95-99.7 Rule
• A very important property of any
normal distribution is that within a
fixed number of standard deviations
from the mean, all normal
distributions have the same fraction
of their probabilities.
• We will illustrate this for 1σ, 2σ,
and 3σ from the mean μ.
Learning Goal 7:
The 68-95-99.7 Rule
• One-sigma rule: Approximately 68%
of the data values should lie within
one standard deviation of the mean.
• That is, regardless of the shape of
the normal distribution, the
probability that a normal random
variable will be within one standard
deviation of the mean is
approximately equal to 0.68.
• The next slide illustrates this.
Learning Goal 7: The 68-95-99.7 Rule
One sigma rule.
Learning Goal 7:
The 68-95-99.7 Rule
• Two-sigma rule: Approximately 95%
of the data values should lie within
two standard deviations of the
mean.
• That is, regardless of the shape of
the normal distribution, the
probability that a normal random
variable will be within two standard
deviations of the mean is
approximately equal to 0.95.
• The next slide illustrates this.
Learning Goal 7: The 68-95-99.7 Rule
Two sigma rule.
Learning Goal 7:
The 68-95-99.7 Rule
• Three-sigma rule: Approximately
99.7% of the data values should lie
within three standard deviations of
the mean.
• That is, regardless of the shape of
the normal distribution, the
probability that a normal random
variable will be within three
standard deviations of the mean is
approximately equal to 0.997.
• The next slide illustrates this.
Learning Goal 7: The 68-95-99.7 Rule
Three sigma rule.
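The one-, two-, and three-sigma rules can be checked numerically. The sketch below uses `NormalDist` from Python's standard `statistics` module; the helper name `area_within` is an illustrative choice, not something from the slides.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)   # standard Normal model N(0, 1)

def area_within(k):
    """Area under the standard Normal curve between -k and +k."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

areas = {k: area_within(k) for k in (1, 2, 3)}
for k, area in areas.items():
    print(f"within {k} SD of the mean: {area:.4f}")
```

The three areas come out near 0.6827, 0.9545, and 0.9973, which the rule rounds to 68%, 95%, and 99.7%.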
Learning Goal 7:
The 68-95-99.7 Rule
• The following shows what the
68-95-99.7 Rule tells us:
Learning Goal 7: The 68-95-99.7 Rule
Because all Normal distributions share the same properties, we can
standardize our data to transform any Normal curve N(μ, σ) into the
standard Normal curve N(0, 1).
[Figure: the curve N(64.5, 2.5) on the x scale becomes N(0, 1) on the z scale; axis label: standardized height (no units).]
And then use the 68-95-99.7 rule to find areas under the curve.
Learning Goal 7:
More 68-95-99.7% Rule
You can further divide the area
under the normal curve into the
following parts.
Using the 68-95-99.7 Rule
• SOUTH AMERICAN RAINFALL
• The distribution of rainfall in South
American countries is approximately
normal with a (mean) µ = 64.5 cm
and (standard deviation) σ = 2.5 cm.
• The next slide will demonstrate the
empirical rule of this application.
N(64.5,2.5)
• About 68% of the countries receive rainfall
between μ – σ = 64.5 – 2.5 = 62 cm
and μ + σ = 64.5 + 2.5 = 67 cm.
– 68%: 62 to 67 cm
• About 95% of the countries receive rainfall
between μ – 2σ = 64.5 – 5 = 59.5 cm
and μ + 2σ = 64.5 + 5 = 69.5 cm.
– 95%: 59.5 to 69.5 cm
• About 99.7% of the countries receive
rainfall between μ – 3σ = 64.5 – 7.5 = 57 cm
and μ + 3σ = 64.5 + 7.5 = 72 cm.
– 99.7%: 57 to 72 cm
The middle 68% of the countries (µ ± σ) have rainfall between 62 and 67 cm.
The middle 95% of the countries (µ ± 2σ) have rainfall between 59.5 and 69.5 cm.
Almost all of the data (99.7%) is within 57 to 72 cm (µ ± 3σ).
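The rainfall intervals are just μ ± kσ for k = 1, 2, 3. A short sketch of that arithmetic, using the N(64.5, 2.5) model from the slides (the variable names are illustrative):

```python
mu, sigma = 64.5, 2.5   # rainfall model N(64.5, 2.5), in cm

# Percentages from the 68-95-99.7 Rule, keyed by number of SDs.
rule = {1: "68%", 2: "95%", 3: "99.7%"}

intervals = {k: (mu - k * sigma, mu + k * sigma) for k in rule}
for k, (low, high) in intervals.items():
    print(f"about {rule[k]} of countries: {low} to {high} cm")
```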
Example: IQ Test
• The scores of a referenced
population on the IQ Test are
normally distributed with μ=100
and σ=15.
1) Approximately what percent of
scores fall in the range from 70 to
130?
2) A score in what range would
represent the top 16% of the
scores?
Example: IQ Test
1) 70 to 130 is μ ± 2σ, so this range
contains about 95% of the scores.
2) About 68% of the scores lie within
μ ± σ, leaving 16% in each tail. The top
16% of the scores therefore lie more than
one σ above μ, so a score of 115 or above
is in the top 16%.
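The rule-of-thumb answers for the IQ example can be compared with the exact Normal areas. This is a sketch using `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=15)

# 1) Exact area between 70 and 130 (mu ± 2 sigma).
pct_70_to_130 = iq.cdf(130) - iq.cdf(70)

# 2) Score with 16% of the area above it, i.e. 84% below it.
cutoff_top_16 = iq.inv_cdf(1 - 0.16)

print(round(pct_70_to_130, 4), round(cutoff_top_16, 1))
```

The exact values are close to the rule's 95% and 115; the small differences arise because 68-95-99.7 are rounded figures.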
Your Turn:
• Runner’s World reports that the finishing
times in the New York City 10 km run are
normally distributed with a
mean of 61 minutes and a standard
deviation of 9 minutes.
1) Find the percent of runners who take
more than 70 minutes to finish.
2) Find the percent of runners who finish
in less than 43 minutes.
The First Three Rules for Working with Normal
Models
• Make a picture.
• Make a picture.
• Make a picture.
• And, when we have data, make a
histogram to check the Nearly
Normal Condition to make sure we
can use the Normal model to model
the distribution.
Finding Normal Percentiles by Hand
• When a data value doesn’t fall
exactly 1, 2, or 3 standard deviations
from the mean, we can look it up in
a table of Normal percentiles.
• Table Z in Appendix D provides us
with normal percentiles, but many
calculators and statistics computer
packages provide these as well.
Finding Normal Percentiles by Hand (cont.)
• Table Z is the standard Normal table. We
have to convert our data to z-scores
before using the table.
• The figure shows us how to find the area
to the left when we have a z-score of
1.80:
Standard Normal Distribution Table
• Gives area under the
curve to the left of a
positive z-score.
• Z-scores are in the
1st column and the
1st row
– 1st column – whole
number and first
decimal place
– 1st row – second
decimal place
Table Z
• The table entry for each
value z is the area under
the curve to the LEFT of z.
USING THE Z TABLE
• You found your z-score to be 1.40 and
you want to find the area to the left of
1.40.
1. Find 1.4 in the left-hand column of the table.
2. Find the remaining digit 0 as .00 in the top
row.
3. The entry opposite 1.4 and under .00 is
0.9192. This is the area we seek: 0.9192
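The table lookup above can be reproduced in software. A minimal sketch, assuming Python's `statistics.NormalDist` as the "appropriate technology":

```python
from statistics import NormalDist

# Area under the standard Normal curve to the left of z = 1.40.
area_left = NormalDist().cdf(1.40)
print(round(area_left, 4))   # 0.9192, matching the Table Z entry
```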
Other Types of Tables
Using Left-Tail Style Table
1. For areas to the left of a specified z
value, use the table entry directly.
2. For areas to the right of a specified z
value, look up the table entry for z
and subtract the area from 1. (can
also use the symmetry of the normal
curve and look up the table entry for
–z).
3. For areas between two z values, z1
and z2 (where z2 > z1), subtract the
table area for z1 from the table area
for z2.
More using Table Z (left tailed table)
Use table directly
Example: Find Area Greater Than a Given
Z-Score
• Find the area from the
standard normal
distribution that is greater
than -2.15
THE ANSWER IS 0.9842
• Find the corresponding Table Z value
using the z-score -2.15.
• The table entry is 0.0158
• However, this is the area to the left of -2.15
• We know the total area of the curve =
1, so simply subtract the table entry
value from 1
– 1 – 0.0158 = 0.9842
– The next slide illustrates these areas
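The three left-tail table rules (left of z, right of z, between two z values) can be sketched as small functions. The function names are illustrative; `NormalDist().cdf` plays the role of the left-tail table.

```python
from statistics import NormalDist

Z = NormalDist()   # standard Normal model N(0, 1)

def area_left(z):
    """Rule 1: use the left-tail value directly."""
    return Z.cdf(z)

def area_right(z):
    """Rule 2: subtract the left-tail value from 1."""
    return 1 - Z.cdf(z)

def area_between(z1, z2):
    """Rule 3: left-tail value for z2 minus left-tail value for z1 (z2 > z1)."""
    return Z.cdf(z2) - Z.cdf(z1)

print(round(area_left(-2.15), 4))    # 0.0158, the table entry
print(round(area_right(-2.15), 4))   # 0.9842, the area greater than -2.15
```

By symmetry, `area_right(-2.15)` equals `area_left(2.15)`, which is the alternative lookup mentioned in rule 2.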
Practice using Table Z to find areas under the Standard Normal Curve
1. z<1.58
2. z<-.93
3. z>-1.23
4. z>2.48
5. .5<z<1.89
6. -1.43<z<1.43
Using the TI-83/84 to Find the Area Under
the Standard Normal Curve
• Under the DISTR menu, the 2nd entry is
“normalcdf”.
• It calculates the area under the Standard
Normal Curve between two z-scores (e.g. -1.43 < z < 0.96).
• Syntax: normalcdf(lower bound, upper
bound). The lower and upper bounds are z-scores.
• If finding the area > or < a single z-score,
use a large positive value for the
upper bound (e.g. 100) or a large
negative value for the lower bound (e.g.
-100), respectively.
Practice: use the TI-83/84 to find areas under the standard normal curve
1. z>-2.35 and z<1.52
2. .85<z<1.56
3. -3.5<z<3.5
4. 0<z<1
5. z<1.63
6. z>.85
7. z>2.86
8. z<-3.12
9. z>1.5
10. z<-.92
Using TI-83/84 to Find Areas Under the
Standard Normal Curve Without Z-Scores
• The TI-83/84 can find areas under
any Normal curve without first
changing the observation x to a z-score.
• Syntax: normalcdf(lower bound, upper bound,
mean, standard deviation). If finding the
area < or > a single value, use a very large
negative observation value for the lower bound
or a very large positive one for the upper
bound, respectively.
• Example: N(136,18) 100<x<150
• Answer: .7589
• Example: N(2.5,.42) x>3.21
• Answer: .0455
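For readers without a TI-83/84, the two examples above can be reproduced with a small Python stand-in. The function below is only an illustrative analogue of the calculator's normalcdf, built on `statistics.NormalDist`; it is not the calculator's own implementation.

```python
from statistics import NormalDist

def normalcdf(lower, upper, mean=0.0, sd=1.0):
    """Area under the N(mean, sd) curve between lower and upper."""
    model = NormalDist(mean, sd)
    return model.cdf(upper) - model.cdf(lower)

# N(136, 18), 100 < x < 150
area_1 = normalcdf(100, 150, 136, 18)

# N(2.5, .42), x > 3.21: use a very large value as the upper bound.
area_2 = normalcdf(3.21, 1e99, 2.5, 0.42)

print(round(area_1, 4), round(area_2, 4))
```

Leaving out the mean and standard deviation gives the standard Normal case, so `normalcdf(-100, 1.40)` returns the left-tail area for z = 1.40.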
Procedure for Finding Normal Percentiles
1. State the problem in terms of the observed
variable y.
– Example: y > 24.8
2. Standardize y to restate the problem in terms
of a z-score.
– Example: z > (24.8 - μ)/σ, therefore z > ?
3. Draw a picture to show the area under the
standard normal curve to be calculated.
4. Find the required area using Table Z or the
TI-83/84 calculator.
Example 1:
• The heights of men are
approximately normally distributed
with a mean of 70 inches and a standard
deviation of 3 inches. What proportion of
men are more than 6 feet tall?
Answer:
1. State the problem in terms of y.
(6' = 72")
y > 72
2. Standardize and state in terms of z.
z = (y - μ)/σ
z > (72 - 70)/3 = 0.67
3. Draw a picture of the area under the
curve to be calculated.
4. Calculate the area under the curve.
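Step 4 for this example can be sketched in Python, following the same state, standardize, and find-the-area sequence. The variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 70, 3          # heights of men, in inches
y = 72                     # 6 feet = 72 inches

z = (y - mu) / sigma       # standardize: about 0.67
proportion_taller = 1 - NormalDist().cdf(z)   # area to the right of z

print(round(z, 2), round(proportion_taller, 4))
```

The result is roughly 0.25, so about a quarter of men are taller than 6 feet under this model. Looking up the rounded z = 0.67 in Table Z gives 1 - 0.7486 = 0.2514, which differs slightly because of the rounding.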
Example 2:
• Suppose family incomes in a town
are normally distributed with a
mean of $1,200 and a standard
deviation of $600 per month. What
percentage of families have
incomes between $1,400 and
$2,250 per month?
Answer:
1. State the problem in terms of y.
1400 < y < 2250
2. Standardize and state in terms of z.
(1400 - 1200)/600 < z < (2250 - 1200)/600
0.33 < z < 1.75
3. Draw a picture.
4. Calculate the area.
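The "between two values" case works the same way, subtracting one left-tail area from the other. A sketch for the income example (names are illustrative):

```python
from statistics import NormalDist

mu, sigma = 1200, 600              # monthly family income, in dollars

z_low = (1400 - mu) / sigma        # about 0.33
z_high = (2250 - mu) / sigma       # 1.75

std_normal = NormalDist()
area = std_normal.cdf(z_high) - std_normal.cdf(z_low)

print(round(z_low, 2), z_high, round(area, 4))
```

The area is roughly 0.33, so about a third of families have incomes in this range.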
Your Turn:
• The Chapin Social Insight (CSI) Test
evaluates how accurately the subject
appraises other people. In the reference
population used to develop the test,
scores are approximately normally
distributed with mean 25 and standard
deviation 5. The range of possible scores is
0 to 41.
1. What percent of subjects score above a
32 on the CSI Test?
2. What percent of subjects score at or
below a 13 on the CSI Test?
3. What percent of subjects score between
16 and 34 on the CSI Test?
From Percentiles to Scores: z in Reverse
• Sometimes we start with areas and
need to find the corresponding z-score
or even the original data value.
• Example: What z-score represents
the first quartile in a Normal model?
z in Reverse
• Given a normal distribution proportion
(area under the standard normal
curve), find the corresponding
observation value.
• Table Z – find the area in the table
nearest the given proportion and read
off the corresponding z-score.
• TI-83/84 Calculator – Use the DISTR
menu, 3rd entry invNorm. The syntax is
invNorm(area,[μ,σ]), where area is the
area to the left of the z-score (or
observation y) wanted (left-tail area).
From Percentiles to Scores: z in Reverse (cont.)
• Look in Table Z for an area of 0.2500.
• The exact area is not there, but 0.2514
is pretty close.
• This figure is associated with z = –0.67,
so the first quartile is 0.67 standard
deviations below the mean.
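The first-quartile lookup can be checked with an inverse Normal calculation. This sketch uses `NormalDist.inv_cdf` from the standard library, which takes a left-tail area in the same way invNorm does.

```python
from statistics import NormalDist

# z-score with 25% of the area to its left (the first quartile).
z_q1 = NormalDist().inv_cdf(0.25)
print(round(z_q1, 4))   # about -0.6745, which Table Z rounds to -0.67
```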
Inverse Normal Practice
• Proportion (area
under curve, left tail)
Using Table Z
1. .3409
2. .7835
3. .9268
4. .0552
Using TI-83/84
1. .3409
2. .7835
3. .9268
4. .0552
Procedure for Inverse Normal Proportions
1. Draw a picture showing the given
proportion (area under the curve).
2. Find the z-score corresponding to
the given area under the curve.
3. Unstandardize the z-score.
4. Solve for the observational value y
and answer the question.
Example 1: SAT VERBAL SCORES
• SAT Verbal scores are approximately
normal with a mean of 505 and a
standard deviation of 110.
• How high must a student score in
order to place in the top 10% of all
students taking the verbal section of
the SAT?
Analyze the Problem and Picture It.
• The problem wants to know the SAT
score y with the area 0.10 to its
right under the normal curve with a
mean of 505 and a standard
deviation of 110. Well, isn't that the
same as finding the SAT score y with
the area 0.9 to its left? Let's draw
the distribution to get a better look
at it.
1. Draw a picture showing the given
proportion (area under the curve).
[Figure: Normal curve marked at the mean, y = 505, and at the unknown score, y = ?]
2. Find Your Z-Score
1. Using Table Z - Find the entry
closest to 0.90. It is 0.8997. This is
the entry corresponding to z =
1.28. So z = 1.28 is the
standardized value with area 0.90
to its left.
2. Using TI-83/84 –
DISTR/invNorm(.9). It is 1.2816.
3. Unstandardize
• Now, you will need to unstandardize
to transform the solution from the z,
back to the original y scale. We
know that the standardized value of
the unknown y is z = 1.28. So y itself
satisfies:
(y - 505)/110 = 1.28
4. Solve for y and Summarize
• Solve the equation for y:
y = 505 + (1.28)(110) = 645.8
• The equation finds the y that
lies 1.28 standard deviations
above the mean on this
particular normal curve. That is
the "unstandardized" meaning
of z = 1.28.
• Answer: A student must score
at least 646 to place in the
highest 10%
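The four steps of the SAT example can be sketched in Python: convert the right-tail area to a left-tail area, find z, then unstandardize. The variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 505, 110            # SAT Verbal model N(505, 110)

left_area = 1 - 0.10            # top 10% means 90% of the area lies to the left
z = NormalDist().inv_cdf(left_area)   # about 1.2816

cutoff = mu + z * sigma         # unstandardize: y = mu + z * sigma
print(round(z, 4), round(cutoff, 1))
```

With the unrounded z the cutoff is about 646.0, in agreement with the 645.8 obtained from the table value z = 1.28; either way a student needs at least 646.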
Example 2:
• A four-year college will accept any
student ranked in the top 60
percent on a national examination.
If the test score is normally
distributed with a mean of 500 and
a standard deviation of 100, what is
the cutoff score for acceptance?
Answer:
1. Draw picture of given proportion.
2. Find the z-score. The top 60% corresponds
to a left-tail area of 0.40. From the TI-83/84,
invNorm(.4) gives z = -0.25.
3. Unstandardize: -0.25 = (y - 500)/100
4. Solve for y and answer the question.
y = 500 + (-0.25)(100) = 475, therefore the
minimum score the college will accept is 475.
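The same cutoff can be found in one step by giving the inverse Normal function the mean and standard deviation, much like invNorm(area, μ, σ). A sketch, again assuming `statistics.NormalDist`:

```python
from statistics import NormalDist

exam = NormalDist(mu=500, sigma=100)

# Top 60% accepted, so 40% of the area lies below the cutoff.
cutoff = exam.inv_cdf(0.40)
print(round(cutoff, 1))
```

This gives about 474.7; the slide's 475 comes from rounding z to -0.25 before unstandardizing.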
Your Turn:
• Intelligence Quotients are normally
distributed with a mean of 100 and
a standard deviation of 16. Find the
90th percentile for IQ’s.
Are You Normal? How Can You Tell?
• When you actually have your own
data, you must check to see
whether a Normal model is
reasonable.
• Looking at a histogram of the data is
a good way to check that the
underlying distribution is roughly
unimodal and symmetric.
Are You Normal? How Can You Tell? (cont.)
• A more specialized graphical display
that can help you decide whether a
Normal model is appropriate is the
Normal probability plot.
• If the distribution of the data is
roughly Normal, the Normal
probability plot approximates a
diagonal straight line. Deviations
from a straight line indicate that the
distribution is not Normal.
The Normal Probability Plot
A normal probability plot for data from a normal
distribution will be approximately linear:
[Figure: normal probability plot with X (30 to 90) on the vertical axis and Z (-2 to 2) on the horizontal axis.]
The Normal Probability Plot
[Figure: normal probability plots (X from 30 to 90 against Z from -2 to 2) for three cases: Left-Skewed, Right-Skewed, and Rectangular.]
Nonlinear plots indicate a deviation from normality.
Are You Normal? How Can You Tell? (cont.)
• Nearly Normal data have a histogram and a Normal
probability plot that look somewhat like this
example:
Are You Normal? How Can You Tell? (cont.)
• A skewed distribution might have a histogram and
Normal probability plot like this:
Summary Assessing Normality
(Is The Distribution Approximately Normal)
1. Construct a Histogram or Stemplot.
See if the shape of the graph is
approximately normal.
2. Construct a Normal Probability Plot
(TI-83/84). Data from a Normal distribution
will fall close to a straight line. Conversely,
non-normal data will show a
nonlinear trend.
Assess the Normality of the Following Data
• 9.7, 93.1, 33.0, 21.2, 81.4, 51.1,
43.5, 10.6, 12.8, 7.8, 18.1, 12.7
• Histogram – skewed right
• Normal Probability Plot – clearly not
linear
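The coordinates of a Normal probability plot for this data set can be computed without a calculator. The sketch below pairs each sorted value with a Normal score; the plotting positions (i - 0.5)/n are an assumed convention (others exist, and the TI-83/84 may use a different one), so treat it as an illustration rather than a reproduction of the calculator's plot.

```python
from statistics import NormalDist, mean, median

data = [9.7, 93.1, 33.0, 21.2, 81.4, 51.1,
        43.5, 10.6, 12.8, 7.8, 18.1, 12.7]

ordered = sorted(data)
n = len(ordered)

# Assumed plotting positions (i - 0.5)/n, converted to z-scores.
normal_scores = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]

# Each (z, x) pair is one point on the Normal probability plot.
for z, x in zip(normal_scores, ordered):
    print(f"{z:6.2f}  {x:5.1f}")

# A mean well above the median is another sign of right skew.
print(round(mean(data), 2), median(data))
```

In the printed pairs, the x values bunch together at the low end and spread out sharply at the high end while the z values stay evenly balanced around 0, so the points bend rather than follow a line. The mean (about 32.9) sits well above the median (19.65), consistent with the right-skewed histogram.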