Chapter 2.2
NORMAL DISTRIBUTIONS
Lesson Objectives
• ESTIMATE the relative locations of the median
and mean on a density curve.
• ESTIMATE areas (proportions of values) in a
Normal distribution.
• FIND the proportion of values in a specified
interval, or the value that corresponds to a given
percentile in any Normal distribution.
– Using the Empirical Rule
– Using the standard Normal distribution
• DETERMINE whether a distribution of data is
approximately Normal from graphical and
numerical evidence.
Density Curve
In Chapter 1, you learned how to plot a dataset to
describe its shape, center, spread, etc.
Sometimes, the overall pattern of a large number of
observations is so regular that we can describe it
using a smooth curve.
Density Curve:
An idealized description of
the overall pattern of a
distribution.
Area underneath = 1,
representing 100% of
observations.
Density Curves
Density curves come in many different shapes:
symmetric, skewed, uniform, etc.
The area of a region of a density curve represents the %
of observations that fall in that region.
The median of a density curve cuts the area in half.
The mean of a density curve is its “balance point.”
What % of the observations represented by the following
density curve fall between 0.4 and 0.6?
[Figure: density curve; the vertical axis is marked at 1 and the horizontal axis at 0.4, 0.6, and 1.]
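The figure itself did not survive, so the sketch below rests on an assumption: that the curve is flat (uniform) at height 1 from 0 to 1, which is what the surviving axis marks suggest. Under that assumption the area of the region is just width times height.

```python
# ASSUMPTION: the density curve is uniform with height 1 on the interval 0 to 1.
# The original figure is missing, so this shape is a guess from the axis labels.
height = 1.0
left, right = 0.4, 0.6

# Area of a rectangle = width * height; area under a density curve = proportion.
area = (right - left) * height
percent = 100 * area

print(f"About {percent:.0f}% of observations fall between {left} and {right}")
```

If the real curve has a different shape, the same idea applies, but the area has to be computed for that shape instead.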
Normal Distributions
• One special type of density curve is the
Normal curve.
• These density curves are symmetric, single
peaked, and bell shaped. Normal curves
describe Normal distributions.
• “Normal” distributions are very important in
statistics (hence the capital “N” for Normal)
Normal Distributions
• All Normal distributions, although they may vary in
appearance somewhat, have the same overall shape.
• We describe a Normal distribution by giving its mean, μ and
its standard deviation, σ.
• Because Normal distributions are symmetric, the mean is
located in the _______ of the distribution and is _______ to
the median.
Let’s talk notation...
• Because Normal distributions come up a lot in
statistics, we abbreviate a Normal distribution
with a mean μ and a standard deviation σ as
N(μ ,σ)
• For example, heights of young women follow a
Normal distribution with μ = 64.5 inches and
σ = 2.5 inches.
– The distribution of young women’s heights would
be ___________
Why Normal Distributions?
(1) Normal distributions are good descriptions for
some distributions of real data.
* Can you think of any data that would be
Normally distributed?
(2) Normal distributions are good approximations to
the results of many kinds of chance outcomes,
such as tossing a coin many times.
(3) Many statistical inference procedures based
on Normal distributions work well for other
roughly symmetric distributions
Warning…
• Although many sets of data follow a Normal
distribution, there are also many that do not.
– Even symmetric distributions may not be Normal!
The 68-95-99.7 Rule (AKA the Empirical Rule)
• Heights of young women follow a Normal distribution with
μ = 64.5 inches and σ = 2.5 inches: N(64.5, 2.5)
• Between what heights do 95% of women fall?
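The rule says that in a Normal distribution about 68% of observations fall within 1σ of the mean, about 95% within 2σ, and about 99.7% within 3σ. A minimal sketch of the 95% question for N(64.5, 2.5) follows; the function name `empirical_interval` is made up for this illustration.

```python
mu, sigma = 64.5, 2.5  # N(64.5, 2.5): heights of young women, in inches

def empirical_interval(mu, sigma, k):
    """Return (low, high) for the interval within k standard deviations of the mean."""
    return mu - k * sigma, mu + k * sigma

# About 95% of observations lie within 2 standard deviations of the mean.
low, high = empirical_interval(mu, sigma, 2)
print(f"About 95% of heights fall between {low} and {high} inches")
```

Calling the same function with k = 1 or k = 3 gives the 68% and 99.7% intervals.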
• Heights of young women, μ = 64.5 inches and σ = 2.5 inches
• There are _____ women in this room. How many
women would we expect to have heights within 1
standard deviation of the mean?
• Heights of young women, μ = 64.5 inches and σ = 2.5 inches
• What proportion of girls in this room would we expect to have
heights greater than 67 inches?
• Heights of young women, μ = 64.5 inches and σ = 2.5 inches
• What proportion of girls in this room would we expect to have
heights greater than 68 inches?
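A height of 67 inches is exactly one standard deviation above the mean, so the 68-95-99.7 rule answers that question directly: 68% lie within 1σ, and half of the remaining 32% lies above. A height of 68 inches is 1.4 standard deviations above the mean, which is not a whole number of σ, so the rule alone is not enough. The sketch below uses Python's `statistics.NormalDist` (a software stand-in, not the method on the slide) to get both areas.

```python
from statistics import NormalDist

heights = NormalDist(mu=64.5, sigma=2.5)  # N(64.5, 2.5), in inches

# 67 = mean + 1 sd, so the empirical rule applies: half of the 32% outside 1 sd.
above_67_rule = (1 - 0.68) / 2

# Areas computed from the Normal curve itself (area to the right = 1 - area to the left).
above_67 = 1 - heights.cdf(67)
above_68 = 1 - heights.cdf(68)

print(f"Above 67 in (rule):  {above_67_rule:.2f}")
print(f"Above 67 in (curve): {above_67:.4f}")
print(f"Above 68 in (curve): {above_68:.4f}")
```

The rule's 16% is a rounded version of the curve's answer for 67 inches; for 68 inches only the curve (or a table) gives a value.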
How do we know if a distribution is Normal???
Assessing Normality Method 1
• One method for assessing normality is to
construct a histogram or a stemplot and then
see if the graph is approximately bell-shaped
and symmetric about the mean.
• Histograms and stemplots can reveal
important “non-Normal” features of a
distribution such as skewness, outliers, or
gaps and clusters.
Method 1 Continued
• For example, this distribution of vocabulary scores
appears Normal.
– The distribution is bell-shaped, it is roughly symmetric,
there are no gaps or clusters, and there do not appear
to be any outliers.
Method 1 Cont.
• A boxplot also works!
• Assess that the boxplot is roughly symmetric
and check for any outliers
Method 2
We can improve the effectiveness of
our plots by marking x̄, x̄ ± s, x̄ ± 2s on
the horizontal axis. Then compare the
counts of observations in each interval
using the empirical rule.
• MEAN = 6.8585
• STDEV = 1.5952
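[Figure: histogram with the horizontal axis marked at x̄ and x̄ ± s, ± 2s, ± 3s. Counts of observations between the marks:]
• below x̄ − 3s (2.07): 1
• x̄ − 3s (2.07) to x̄ − 2s (3.67): 21
• x̄ − 2s (3.67) to x̄ − s (5.26): 129
• x̄ − s (5.26) to x̄ (6.86): 331
• x̄ (6.86) to x̄ + s (8.45): 318
• x̄ + s (8.45) to x̄ + 2s (10.05): 125
• x̄ + 2s (10.05) to x̄ + 3s (11.64): 21
• above x̄ + 3s (11.64): 1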
Method 2 Continued…
• Because the actual counts of our distribution
follow the empirical rule very closely, we can
confirm that the Normal distribution with μ =
6.86 and σ = 1.595 fits the data well.
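The comparison with the empirical rule can be checked with a little arithmetic. The sketch below assumes the eight counts on the histogram belong, from left to right, to the intervals cut by x̄ − 3s, x̄ − 2s, x̄ − s, x̄, x̄ + s, x̄ + 2s, x̄ + 3s; that pairing is read off the figure labels and is an assumption.

```python
# Counts per interval, left to right (ASSUMED pairing with the axis marks):
# below -3s, -3s..-2s, -2s..-1s, -1s..mean, mean..+1s, +1s..+2s, +2s..+3s, above +3s
counts = [1, 21, 129, 331, 318, 125, 21, 1]
total = sum(counts)

within_1s = counts[3] + counts[4]
within_2s = within_1s + counts[2] + counts[5]
within_3s = within_2s + counts[1] + counts[6]

# Compare each observed percentage with the 68-95-99.7 rule.
for label, n, expected in [("1s", within_1s, 68), ("2s", within_2s, 95), ("3s", within_3s, 99.7)]:
    print(f"within {label}: {100 * n / total:.1f}% (rule says about {expected}%)")
```

The observed percentages come out within about half a percentage point of 68, 95, and 99.7, which is the sense in which the counts follow the rule closely.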
STANDARD NORMAL
DISTRIBUTIONS
The Standard Normal Distribution
• All Normal distributions are the same if we measure in units of
size σ about the mean μ as center.
• Changing to these units requires that we standardize (like we did
in 2.1):
z = (x − μ) / σ
• If the variable we standardize has a Normal distribution, then
so does the new variable, z.
• The new distribution is called the standard Normal
distribution: the Normal distribution with mean 0 and standard deviation 1, N(0, 1).
• We can find the proportion of observations
that lie within any range of values simply by
finding the area under the curve.
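As a quick illustration of standardizing, here is a small sketch that reuses the heights example, N(64.5, 2.5), from earlier in the lesson. The helper name `standardize` is made up for this example.

```python
def standardize(x, mu, sigma):
    """Return the z-score: how many standard deviations x lies from the mean."""
    return (x - mu) / sigma

mu, sigma = 64.5, 2.5  # heights of young women, in inches

z_68 = standardize(68, mu, sigma)  # above the mean, so z is positive
z_62 = standardize(62, mu, sigma)  # below the mean, so z is negative

print(f"68 inches -> z = {z_68:.2f}")
print(f"62 inches -> z = {z_62:.2f}")
```

A z-score has no units, which is why one table can serve every Normal distribution.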
The standard Normal Table
• Because standardizing Normal distributions
makes them all the same, we can use a single
table to find the areas under a Normal
distribution.
• This table is called the standard Normal table.
– It’s inside the front cover of your textbook!
– You will be given this table on the AP exam.
The standard Normal Table
CAREFUL!!!!
• Example: Find the proportion of observations
from the standard Normal distribution that
are less than -2.15.
Using the standard Normal table…
• Caution: the area that we found was to the LEFT of z = -2.15. In this case, that is what we were looking for.
• HOWEVER, what if the problem had asked for the area lying to
the right of -2.15? What would that answer be?
Area to the Right
• The total area under the curve is _____.
• So if 0.0158 lies to the left of -2.15…
• Then _____ lies to the right of -2.15.
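The left and right areas can be checked with software in place of the table. This is a sketch using Python's `statistics.NormalDist`, whose default is the standard Normal distribution (mean 0, standard deviation 1).

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard Normal: mean 0, standard deviation 1

# cdf gives the area to the LEFT of a z-value, just like the table.
left_area = std_normal.cdf(-2.15)

# The total area is 1, so the area to the RIGHT is whatever is left over.
right_area = 1 - left_area

print(f"Area left of  z = -2.15: {left_area:.4f}")
print(f"Area right of z = -2.15: {right_area:.4f}")
```

Because -2.15 is far out in the left tail, a large right-hand area is the reasonable answer here.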
How do you avoid making a mistake when asked
to find the area to the RIGHT?
• Always sketch the Normal curve, mark the z-value, and shade the area of interest (aka the
area you are looking for in the problem).
• THEN, when you get your answer, CHECK TO
SEE IF IT IS REASONABLE!!!
Practice
• Exercise 2.29
Putting it all Together:
Solving Problems Involving Normal Distributions
• Step 1: State the distribution and values of interest. Draw a
picture of the distribution with the mean and standard deviation
clearly identified, and shade the area of interest.
• Step 2: Perform calculations – SHOW YOUR WORK. Either
(i) compute a z-score for each boundary value and use Table A
or your calculator to find the area under the standard Normal
curve; or (ii) use the normalcdf command and label each of the
inputs.
• Step 3: Conclusion. Write your conclusion in the context of the
problem.
– Just saying “the area under the curve that is less than 2.1” means
nothing! Your results should tell you something about the data.
Example: Cholesterol and Young Boys
• For 14-year-old boys, the mean is μ = 170 milligrams of cholesterol per deciliter of blood
(mg/dl) and the standard deviation is σ = 30 mg/dl.
• Levels above 240 mg/dl may require medical attention. What percent of 14-year-old boys
have more than 240 mg/dl of cholesterol?
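One way to work Step 2 by the z-score route is sketched below. It assumes the cholesterol levels are approximately Normal, N(170, 30), which the slide text implies but does not state outright, and it uses Python's `statistics.NormalDist` in place of Table A.

```python
from statistics import NormalDist

# ASSUMPTION: cholesterol levels of 14-year-old boys are approximately N(170, 30).
mu, sigma = 170, 30
boundary = 240  # mg/dl

# Step 2(i): standardize the boundary value, then find the area to its right.
z = (boundary - mu) / sigma
area_above = 1 - NormalDist().cdf(z)

print(f"z = {z:.2f}")
print(f"Proportion above {boundary} mg/dl: {area_above:.4f}")
```

For Step 3, the number still has to be put in context: under this model, roughly 1% of 14-year-old boys have cholesterol above 240 mg/dl.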
Finding a Value when Given a Proportion
• What if you wanted to know what score you would have to get
in order to place among the top 10% of your class on a test?
• Sometimes, we may be asked to find the observed value with a
given proportion of the observations above or below it.
• To do this, we just read Table A going backwards. In other
words, find the proportion you are looking for in the body of
the table, figure out the corresponding z-score, and then
“unstandardize” to get the observed value.
Inverse Normal Calculation Example
• Scores on the SAT Verbal test in recent years follow approximately
the N(505, 110) distribution. How high must a student score in
order to place in the top 10% of all students taking the SAT?
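Reading the table backwards can be sketched in two moves: find the z-score with 90% of the area to its left (being in the top 10% means 90% of scores are below), then unstandardize with x = μ + zσ. The code uses `NormalDist().inv_cdf` as a software stand-in for the table lookup.

```python
from statistics import NormalDist

mu, sigma = 505, 110  # N(505, 110): SAT Verbal scores

# Top 10% means 90% of the area lies to the LEFT of the cutoff.
z = NormalDist().inv_cdf(0.90)

# "Unstandardize": solve z = (x - mu) / sigma for x.
cutoff = mu + z * sigma

print(f"z = {z:.2f}")
print(f"Score needed: about {cutoff:.0f}")
```

With Table A the closest entry gives z of about 1.28, so a hand calculation lands within a point of this result.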
Practice!
• 2.31a, b
• 2.32
Using Your Calculator: Finding Areas with normalcdf
• You can find the areas under the Normal curve using normalcdf.
• For 14-year-old boys, the mean is μ = 170 milligrams of cholesterol per deciliter of
blood (mg/dl) and the standard deviation is σ = 30 mg/dl.
• Levels above 240 mg/dl may require medical attention. What percent of 14-year-old boys have more than 240 mg/dl of cholesterol?
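On TI-83/84 calculators the inputs are entered as normalcdf(lower, upper, μ, σ), with a very large number such as 1E99 standing in for "no upper bound". The Python helper below is a hypothetical look-alike written for this note, not the calculator's own code; it takes the same four inputs so each one can be labeled.

```python
from statistics import NormalDist

def normalcdf(lower, upper, mu=0.0, sigma=1.0):
    """Area under the N(mu, sigma) curve between lower and upper (hypothetical helper)."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(upper) - dist.cdf(lower)

# lower = 240, upper = 1e99 (stand-in for infinity), mean = 170, sd = 30
area = normalcdf(240, 1e99, 170, 30)
print(f"Proportion above 240 mg/dl: {area:.4f}")
```

No z-score is needed on this route because the mean and standard deviation are passed in directly.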
Using Your Calculator: invNorm
• Finally, we can use our calculators to calculate raw or
standardized values given the area under the Normal curve or
a relative frequency.
• Scores on the SAT Verbal test in recent years follow
approximately the N(505, 110) distribution. How high must a
student score in order to place in the top 10% of all students
taking the SAT?
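On TI-83/84 calculators the inputs are entered as invNorm(area to the left, μ, σ). The helper below is a hypothetical Python look-alike with the same input order, shown only to make the inputs explicit.

```python
from statistics import NormalDist

def inv_norm(area_left, mu=0.0, sigma=1.0):
    """Value with the given area to its LEFT under N(mu, sigma) (hypothetical helper)."""
    return NormalDist(mu, sigma).inv_cdf(area_left)

# Top 10% of scores means an area of 0.90 to the left of the cutoff.
sat_cutoff = inv_norm(0.90, 505, 110)  # raw value
z_cutoff = inv_norm(0.90)              # standardized value (mean 0, sd 1)

print(f"Raw score cutoff: about {sat_cutoff:.0f}")
print(f"Standardized cutoff: z = {z_cutoff:.2f}")
```

The most common slip is entering 0.10 instead of 0.90; the function always expects the area to the left.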