Chapter 2.2: STANDARD NORMAL DISTRIBUTIONS
Normal Distributions
• Last class we looked at a particular type of density
curve called a Normal distribution.
• All Normal distributions are described by two
parameters: μ and σ
• Because of this, we can abbreviate a Normal
distribution as N(μ, σ)
• Another important quality of Normal distributions is
that they follow the Empirical rule. This rule states
that 68% of the data falls within 1 standard deviation
of the mean, 95% falls within 2 standard deviations,
and 99.7% falls within 3 standard deviations.
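The empirical rule percentages can be checked numerically. The sketch below is only an illustration, assuming Python 3.8+ and its standard-library statistics.NormalDist; the helper name area_within is made up for this example and does not come from the slides.

```python
from statistics import NormalDist

def area_within(k, mu=0.0, sigma=1.0):
    """Proportion of an N(mu, sigma) distribution within k standard deviations of the mean."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)

for k in (1, 2, 3):
    print(f"within {k} sd: {area_within(k):.4f}")
# prints 0.6827, 0.9545, 0.9973 (the rule rounds these to 68%, 95%, 99.7%)
```

The result does not depend on mu or sigma, which is why the rule applies to every Normal distribution.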
The Standard Normal Distribution
• All normal distributions are the same if we
measure in units of size σ about the mean μ as
center.
• Changing these units requires that we
standardize (like we did in 2.1)
z = (x - μ) / σ
• If the variable we standardize has a normal
distribution, then so does the new variable, z
• The new distribution is called the standard
Normal Distribution
*The standard Normal distribution follows a normal
distribution and has mean 0 and standard deviation 1
*Notice that the distribution is perfectly symmetric
about 0.
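Standardizing can be sketched in a couple of lines. This is a minimal example, assuming Python; the function name standardize and the sample values (130, 100, 15) are hypothetical and chosen only for illustration.

```python
def standardize(x, mu, sigma):
    """Return the z-score: how many standard deviations x lies from the mean."""
    return (x - mu) / sigma

# Hypothetical values: an observation of 130 from a distribution
# with mean 100 and standard deviation 15.
z = standardize(130, mu=100, sigma=15)
print(z)  # 2.0
```

A positive z means the observation is above the mean; a negative z means it is below.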
Great…but why is that useful?
• Remember, the area under a density curve is a
proportion of the observations in a
distribution.
– The area under the entire density curve is 1.
– The proportion of observations to the left of the
median is .5.
• We can find the proportion of observations
that lie within any range of values simply by
finding the area under the curve.
The standard Normal Table
• Because standardizing Normal distributions
makes them all the same, we can use a single
table to find the areas under a Normal
distribution.
• This table is called the standard Normal table.
– It’s inside the front cover of your textbook!
– You will be given this table on the AP exam
The standard Normal Table
CAREFUL!!!!
The standard Normal table
• Example: Find the proportion of observations
from the standard Normal distribution that
are less than -2.15.
• For the value of z = -2.15, the area is 0.0158
Using the standard Normal table…
• Caution: the area that we found was to the LEFT of z = -2.15. In this case, that is what we were looking for.
• HOWEVER, what if the problem had asked for the area lying to
the right of -2.15? What would that answer be?
Area to the Right
• The total area under the curve is 1.
• So if 0.0158 lies to the left of -2.15…
• Then 1 - 0.0158 = 0.9842 lies to the right of -2.15.
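The left and right areas for z = -2.15 can be reproduced without the table. This is a sketch assuming Python 3.8+ and the standard-library statistics.NormalDist, whose cdf method returns the area to the left, just as Table A does.

```python
from statistics import NormalDist

Z = NormalDist()  # standard Normal: mean 0, standard deviation 1

z = -2.15
area_left = Z.cdf(z)         # area to the LEFT, the value Table A reports
area_right = 1 - area_left   # complement, because the total area is 1

print(round(area_left, 4), round(area_right, 4))  # 0.0158 0.9842
```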
How do you avoid making a mistake when asked to find the area to the RIGHT?
• Always sketch the Normal curve, mark the z-value,
and shade the area of interest (aka the
area you are looking for in the problem)
• THEN, when you get your answer, CHECK TO
SEE IF IT IS REASONABLE!!!
Putting it all Together:
Solving Problems Involving Normal Distributions
• Step 1: State the problem in terms of the
observed variable x. Draw a picture of the
distribution and shade the area of interest.
– Hint…use σ and μ
• Step 2: Standardize and draw a picture. We
need to standardize x to restate the problem in
terms of a standard Normal variable z. Draw a
new picture to show the area of interest under
our now standard Normal curve.
Putting it all Together:
Solving Problems Involving Normal Distributions
• Step 3: Use the table. Find the area under the
standard Normal curve using Table A. (careful
if the problem asks for the area to the right)
• Step 4: Conclusion. Write your conclusion in
the context of the problem.
– Just saying “the area under the curve that is less
than 2.1” means nothing! Your results should tell
you something about the data.
Example: Cholesterol and Young Boys
• For 14-year-old boys, the mean is μ = 170
milligrams of cholesterol per deciliter of blood
(mg/dl) and the standard deviation σ = 30
mg/dl.
• Levels above 240 mg/dl may require medical
attention. What percent of 14-year-old boys
have more than 240 mg/dl of cholesterol?
Step 1: STATE THE PROBLEM
[Figure: Normal curve with 170, 200, and 240 marked on the horizontal axis]
• Call the level of cholesterol in the blood x.
• x has the N(170, 30) distribution.
• What are we looking for?
The proportion of boys with cholesterol level x > 240.
Step 2: Standardize and Draw a Picture
x > 240
(x - 170) / 30 > (240 - 170) / 30
z > 2.33
z = 2.33 is a little more than 2 standard deviations
away from the mean.
Step 3: Use the Table
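Use Table A to look up z = 2.33
*look for 2.3 on the left and then move
over until you are under 0.03*
The table entry is 0.9901, the area to the LEFT of z = 2.33,
so the area to the right is 1 - 0.9901 = 0.0099.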
Step 4: Conclusion
• Don’t forget to state your conclusion in the
context of the question!
• Remember, we were trying to find out how
many 14-year-old boys have cholesterol levels
over 240 mg/dl because boys over this level
require medical attention.
• So what does our conclusion mean?
Only about 1% of 14-year-old boys have
cholesterol levels that require medical attention.
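The four steps of the cholesterol example can be sketched end to end. This is an illustration only, assuming Python 3.8+ and statistics.NormalDist in place of Table A; the variable names are made up for this sketch.

```python
from statistics import NormalDist

# Step 1: state the problem. x has the N(170, 30) distribution; we want x > 240.
mu, sigma = 170, 30   # mg/dl
x = 240               # mg/dl

# Step 2: standardize, rounding to two decimals as Table A does.
z = round((x - mu) / sigma, 2)

# Step 3: the cdf (like Table A) gives the area to the LEFT; we want the RIGHT.
area_right = 1 - NormalDist().cdf(z)

# Step 4: report the result in the context of the problem.
print(f"z = {z}; about {area_right:.1%} of boys have cholesterol above {x} mg/dl")
# z = 2.33; about 1.0% of boys have cholesterol above 240 mg/dl
```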
Finding a Value when Given a
Proportion
• What if you wanted to know what score you would have
to get in order to place among the top 10% of your class
on a test?
• Sometimes, we may be asked to find the observed value
with a given proportion of the observations above or
below it.
• To do this, we just read Table A going backwards. In
other words, find the proportion you are looking for in
the body of the table, figure out the corresponding z-score,
and then “unstandardize” to get the observed
value.
Inverse Normal Calculation Example
• Scores on the SAT Verbal test in recent years
follow approximately the N(505, 110)
distribution. How high must a student score in
order to place in the top 10% of all students
taking the SAT?
Step 1: State the problem and draw a picture
• We are looking for the SAT score x with an
area of .10 to its right under the Normal
curve with a mean μ = 505 and standard
deviation σ = 110.
Step 1: State the problem and draw a picture
• If we are looking for
an area of .10 to the
RIGHT of our value
(x) then we want the
z-score with .90 to
the left.
Step 2: Use the Table
• Look at the body of Table A (remember, we
KNOW the proportion, .90, and we are LOOKING
for the z-score).
• The entry closest to .9 is .8997. This entry
corresponds with z = 1.28.
• So our unknown x has a standardized value of
1.28.
Step 3: Unstandardize
• We have the standardized value for x, but we
need the unstandardized value in order to
answer our question.
(x - 505) / 110 = 1.28
x = 505 + (1.28)(110) = 645.8
Step 4: Conclusion
• Put the results into the context of the
question…what would you say knowing that x
= 645.8?
• Scores on the SAT Verbal test in recent years
follow approximately the N(505, 110)
distribution. How high must a student score in
order to place in the top 10% of all students
taking the SAT?
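The inverse calculation for the SAT example can be sketched as follows, assuming Python 3.8+ and statistics.NormalDist; its inv_cdf method plays the role of reading Table A backwards. The variable names are made up for this sketch.

```python
from statistics import NormalDist

mu, sigma = 505, 110     # SAT Verbal scores, N(505, 110)
top_fraction = 0.10      # we want the cutoff for the top 10%

# An area of .10 to the right means an area of .90 to the left.
z = NormalDist().inv_cdf(1 - top_fraction)

# Unstandardize: x = mu + z * sigma.
x = mu + z * sigma
print(round(z, 2), round(x, 1))
```

The computed cutoff is about 646, slightly above the table-based 645.8, because the table rounds z to 1.28 while inv_cdf keeps more decimal places.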
Next time on Statistics AP
• The Chapter 2 Test will be on WEDNESDAY!
– It will cover all of Chapter 2
• SO on Monday we will
– Learn about Normal Probability Plots
– Learn how to do standard Normal calculations using
our calculators (yay!) and talk about avoiding
“calculator speak” on the AP
– Review Chapter 2
• YOUR HOMEWORK:
– Exercises 2.29, 2.30 (ignore Normal Curve Applet
part), 2.31 - 2.34
Assessing Normality
• The normal distribution provides a good
model for some distributions of real data.
• However, not all distributions are Normal.
• It is important to assess the Normality of
distributions before we assume that they are
normal.
• This will be very important when we learn
about statistical inference procedures (much
later)
Assessing Normality Method 1
• One method for assessing normality is to
construct a histogram or a stemplot and then
see if the graph is approximately bell-shaped
and symmetric about the mean.
• Histograms and stemplots can reveal
important “non-Normal” features of a
distribution such as skewness, outliers, or
gaps and clusters.
Method 1 Continued
• For example, this distribution of vocabulary scores
appears Normal.
– The distribution is bell-shaped, it is roughly symmetric,
there are no gaps or clusters, and there do not appear
to be any outliers.
Method 1 Continued
• We can improve the effectiveness of
our plots by marking x̄, x̄ ± s, x̄ ± 2s on
the horizontal axis. Then compare the
counts of observations in each interval
using the empirical rule.
• MEAN = 6.8585
• STDEV = 1.5952
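Counts of observations in each interval of the histogram:

Interval               Values            Count
below x̄ - 3s           below 2.07            1
x̄ - 3s to x̄ - 2s       2.07 to 3.67         21
x̄ - 2s to x̄ - s        3.67 to 5.26        129
x̄ - s to x̄             5.26 to 6.86        331
x̄ to x̄ + s             6.86 to 8.45        318
x̄ + s to x̄ + 2s        8.45 to 10.05       125
x̄ + 2s to x̄ + 3s       10.05 to 11.64       21
above x̄ + 3s           above 11.64           1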
Method 1 Continued
• Does the distribution follow the empirical rule?
• There are a total of 947 observations
• What percent fall within 1 standard deviation of the
mean? 68.5%
• How does this compare with the empirical rule?
• What percent fall within 2 standard deviations? 95.4%
• How does this compare with the empirical rule?
• Within 3 standard deviations? 99.8%
• How does this compare with the empirical rule?
Method 1 Continued…
• Because the actual counts of our distribution
follow the empirical rule very closely, we can
confirm that the Normal distribution with μ =
6.86 and σ = 1.595 fits the data well.
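The percentages above come directly from the histogram's interval counts. A short sketch of that arithmetic, assuming Python; the list layout and the helper name percent_within are made up for this example.

```python
# Interval counts read from the histogram, ordered from
# "below mean - 3s" up to "above mean + 3s".
counts = [1, 21, 129, 331, 318, 125, 21, 1]
total = sum(counts)  # 947 observations

def percent_within(k):
    """Percent of observations within k standard deviations of the mean (k = 1, 2, 3)."""
    inner = counts[4 - k : 4 + k]   # the 2k intervals centered on the mean
    return 100 * sum(inner) / total

for k in (1, 2, 3):
    print(k, round(percent_within(k), 1))
# 1 68.5
# 2 95.4
# 3 99.8
```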
Method #2 for Assessing Normality
• Construct a normal probability plot. This requires
the use of your graphing calculator.
• Basically…without a calculator..
– Arrange the observed data values from smallest to
largest. Record what percentile of the data set each
value occupies (e.g., the smallest observation in a set
of 20 is the 5% point, the second smallest is the 10%
point, etc.).
– Use the standard Normal distribution table to find the
z-scores for these percentiles (e.g., z = -1.645 is the 5%
point of the standard Normal distribution).
– Plot each data point x against the corresponding z.
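The by-hand construction can be sketched in code. This is an illustration, assuming Python 3.8+ and statistics.NormalDist. One deliberate departure from the slide: the sketch uses the percentile (i - 0.5)/n instead of i/n, so that the largest observation is not placed at 100%, where no finite z-score exists. The function name normal_scores and the sample data are hypothetical.

```python
from statistics import NormalDist

def normal_scores(data):
    """Pair each sorted observation with the z-score of its percentile."""
    xs = sorted(data)
    n = len(xs)
    std_normal = NormalDist()
    # Assumption: percentile (i - 0.5) / n rather than i / n, which keeps
    # every percentile strictly between 0 and 1.
    return [(std_normal.inv_cdf((i - 0.5) / n), x)
            for i, x in enumerate(xs, start=1)]

# Hypothetical sample, made up for illustration.
sample = [6.1, 7.4, 5.2, 8.0, 6.8, 7.1, 4.9, 6.5, 7.9, 6.0]
for z_score, x in normal_scores(sample):
    print(f"{z_score:6.2f}  {x}")
```

Plotting x against z for these pairs gives the Normal probability plot; points close to a straight line suggest the data are approximately Normal.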
Method 2 Continued
• Let’s interpret some Normal probability plots!
Normal Probability Plots
The only substantial
deviations from the line are
short horizontal runs of
points.
These represent repeated
observations of the same
value. The phenomenon is
called granularity and does
not affect Normality.
• If you draw a line, it appears that most of the
data lies close to a straight line.
• HOWEVER, the points above and below the line
represent outliers in our data.
Normal Probability Plots
• This is the Normal probability plot for guinea pig survival times.
• Draw a line through the leftmost points (smallest observations)
• Notice that the larger observations fall systematically ABOVE the
line.
– In other words, the right-of-center observations have larger values than
a Normal distribution would predict
– Therefore, the distribution is right-skewed
Normal Probability Plots
• The Normal probability plot indicates that the
data is left-skewed because the smallest
observations fall below the line.
Interpreting Normal Probability Plot
Graphing Normal Probability Plots on
your Calculator
• Enter the test scores for Mr. Pryor’s statistics class
on page 116 into L1 on your calculator.
• Press 2nd , Y= (STAT PLOT)
• Turn Plot 1 ON
• Select the type on the lower right
• Data List: L1
• Data Axis: x
• Mark (doesn’t matter)
• Press Zoom, 9:ZoomStat
• You should have a probability plot!
Using Your Calculator for Ch.2
• Finding areas with ShadeNorm
– Follow the instructions in the Technology Toolbox
on page 165
– Notice that to find the proportion greater than 125,
the values entered are (125, 1E99, 100, 15): lower bound,
upper bound, mean, and standard deviation. The upper bound
is 1E99 because there is no “infinity” option
on your calculator.
– How would you find the area to the left of 125?
Using your calculator: Finding Areas
with normalcdf
• You can also find the areas under the Normal
curve using normalcdf. This method is quicker
than ShadeNorm, but it does not give us a visual.
• Complete the technology toolbox on page 166
• What if we wanted the area between 125 and
140?
• Be sure to note that if you are given the
standardized scores, you only need to specify the
left and right endpoints of the interval you are
looking for
– e.g., normalcdf(-2,1) gives us .8186. This means that the
area from z = -2 to z = 1 is approximately .819.
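For readers without the calculator, the same areas can be computed in Python. The function below is a stand-in written for this sketch, not the calculator's own implementation; it borrows the name normalcdf and the argument order (lower, upper, mean, standard deviation) for familiarity and assumes Python 3.8+.

```python
from statistics import NormalDist

def normalcdf(lower, upper, mu=0.0, sigma=1.0):
    """Area under the N(mu, sigma) curve between lower and upper."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(upper) - dist.cdf(lower)

print(round(normalcdf(-2, 1), 4))                       # 0.8186
print(round(normalcdf(125, 140, 100, 15), 4))           # area between 125 and 140
print(round(normalcdf(125, float("inf"), 100, 15), 4))  # area to the right of 125
```

Python has a true infinity, float("inf"), so the 1E99 workaround is not needed here.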
Using Your Calculator: invNorm
• Finally, we can use our calculators to calculate
raw or standardized values given the area under
the Normal curve or a relative frequency.
• Complete the technology toolbox on page 167 to
find the WISC score that has 90% of the scores
below it.
• Notice that we enter (.9, 100, 15) to get the raw
data score and we enter just (.9) to get the
standardized score.
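The invNorm calculation can be sketched the same way. The function inv_norm below is a hypothetical Python stand-in, assuming Python 3.8+ and statistics.NormalDist; it takes the area to the left, then optionally the mean and standard deviation, mirroring the entries described above.

```python
from statistics import NormalDist

def inv_norm(area_left, mu=0.0, sigma=1.0):
    """Value of an N(mu, sigma) variable that has the given area to its left."""
    return NormalDist(mu, sigma).inv_cdf(area_left)

print(round(inv_norm(0.9), 2))           # 1.28  (standardized score)
print(round(inv_norm(0.9, 100, 15), 1))  # 119.2 (raw score when mean = 100, sd = 15)
```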
Next time in Statistics AP
• The Chapter 2 Test is on WEDNESDAY
• You will need your graphing calculator
• Homework: Read the Chapter Summary on
p.161 – 162.
• Exercises: 2.37, 2.40, 2.45, 2.50, 2.51, 2.54, 2.55, 2.58, 2.61, 2.63