A.P. Statistics: Lesson 2.1-Density Curves and the Normal Distributions
In Chapter 1 we developed a strategy for exploring data on a single quantitative variable.



Always plot your data: make a graph, usually a histogram or stem plot.
Look for the overall pattern (shape, center, spread) and for striking deviations such as outliers.
Calculate a numerical summary to briefly describe the center and spread.
In Chapter 2 we will add an additional step to our strategy.

Sometimes the overall pattern of a large number of observations is so regular that we can describe it by a
smooth curve.
The histogram below shows the scores of all 947 7th grade students in Gary, IN on the vocabulary part of
the Iowa Test of Basic Skills. The histogram is _______________,
and both tails fall off quite smoothly from a single center peak.
There are no large gaps or obvious outliers.
The curve is a _________________________ for the distribution.
A mathematical model is an idealized description.
It gives a compact picture of the overall pattern of the data but
ignores minor irregularities as well as any outliers.
Example 1: Our eyes respond to the areas of the bars in a histogram. The bar areas represent proportions of the
observations. In Figure A below, the area of the shaded bars represents the students with vocabulary scores of 6.0 or
lower. There are 287 such students, who make up the proportion 287/947 = 0.303 of all Gary 7th graders.
Figure A
Now concentrate on the curve drawn through the bars. In Figure B below, the area under the curve to the left of
6.0 is shaded. Adjust the scale of the graph so that the total area under the curve is exactly 1. This area
represents the proportion 1, that is, all observations. Areas under the curve then represent proportions of the
observations.
The curve is now a _________________________.
The shaded area under the density curve in Figure B represents the proportion of students with score 6.0 or
lower. This area is 0.293, only 0.010 away from the histogram result.
Figure B
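The comparison in Example 1 is simple arithmetic, and a short sketch can confirm it. The variable names below are illustrative only; the counts and the area 0.293 come from the example.

```python
# Counts taken from Example 1.
students_at_or_below_6 = 287
total_students = 947

# Proportion from the histogram bars (Figure A).
histogram_proportion = students_at_or_below_6 / total_students

# Shaded area under the density curve reported for Figure B.
density_curve_area = 0.293

difference = histogram_proportion - density_curve_area

print(round(histogram_proportion, 3))  # 0.303
print(round(difference, 3))            # 0.01
```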
Density Curve:
A density curve is a curve that

Is always on or above the horizontal axis

Has area exactly 1 underneath it.
A density curve describes the overall pattern of a distribution. The area under the curve and above any range of
values is the proportion of all observations that fall in that range.
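As a rough check of these two properties, the sketch below uses a made-up density (a straight line falling from height 1 at x = 0 to height 0 at x = 2) and approximates areas by adding up thin rectangles. The curve and the helper name area_under are assumptions chosen for illustration, not part of the lesson.

```python
def density(x):
    """Hypothetical density: a line from height 1 at x=0 down to 0 at x=2."""
    return 1 - x / 2 if 0 <= x <= 2 else 0.0


def area_under(curve, lo, hi, steps=10_000):
    """Approximate the area under `curve` between lo and hi (midpoint rule)."""
    width = (hi - lo) / steps
    return sum(curve(lo + (i + 0.5) * width) for i in range(steps)) * width


# Property 1: the curve is never below the horizontal axis.
on_or_above_axis = all(density(i / 100) >= 0 for i in range(-100, 301))

# Property 2: the total area under the curve is 1.
total_area = area_under(density, 0, 2)

# The area above a range of values is the proportion of observations in that range.
proportion_0_to_1 = area_under(density, 0, 1)

print(on_or_above_axis)             # True
print(round(total_area, 4))         # 1.0
print(round(proportion_0_to_1, 4))  # 0.75
```

For this made-up curve, 75% of the observations would fall between 0 and 1.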
The density curve in Example 1 is a _________________________.
Our measures of center and spread apply to density curves as well as to actual sets of observations. Areas under
a density curve represent proportions of the total number of observations.
The _______________ of a density curve is the equal-areas point, the point with half the area under the curve to
its left and the remaining half of the area to its right.
The __________________ divide the area under the curve into quarters. One-fourth of the area under the curve
is to the left of the first quartile, and three-fourths of the area is to the left of the third quartile. You can roughly
locate the median and quartiles of any density curve by eye by dividing the area under the curve into four equal
parts.
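Dividing the area into four equal parts can also be done numerically. The sketch below assumes a hypothetical right-skewed density, a straight line from height 1 at x = 0 to height 0 at x = 2, whose area to the left of x is x - x²/4. It searches for the points with 1/4, 1/2, and 3/4 of the area to their left.

```python
def area_left_of(x):
    """Area to the left of x under the hypothetical density f(t) = 1 - t/2 on [0, 2]."""
    x = min(max(x, 0.0), 2.0)
    return x - x * x / 4


def equal_area_point(target, lo=0.0, hi=2.0):
    """Bisection search for the x whose area to the left equals `target`."""
    for _ in range(60):
        mid = (lo + hi) / 2
        if area_left_of(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


first_quartile = equal_area_point(0.25)
median = equal_area_point(0.50)
third_quartile = equal_area_point(0.75)

print(round(first_quartile, 3), round(median, 3), round(third_quartile, 3))
# 0.268 0.586 1.0
```

The quartiles are not evenly spaced here because the curve is skewed: more of the area sits near the left end.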
Symmetric Density Curve
Right Skewed Density Curve
The __________ of a density curve is the equal-areas point, the point that divides the area under the curve in
half.
The __________ of a density curve is the balance point, at which the curve would balance if made of solid
material.
The mean and median of a symmetric density curve are __________.
We know that the mean of a skewed distribution is pulled toward the long tail, farther than the median. There are
mathematical ways of calculating the mean and median for any density curve; we will do so in the future.
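As a preview, here is a minimal sketch for one hypothetical right-skewed density, a straight line from height 1 at x = 0 to height 0 at x = 2. The balance point is found by weighting each x by the height of the curve; the equal-areas point uses the fact that the area to the left of x under this particular curve is x - x²/4. Both the curve and the helper names are assumptions for illustration.

```python
import math


def density(x):
    """Hypothetical right-skewed density: 1 - x/2 on [0, 2], zero elsewhere."""
    return 1 - x / 2 if 0 <= x <= 2 else 0.0


def integrate(func, lo, hi, steps=20_000):
    """Approximate the integral of `func` from lo to hi (midpoint rule)."""
    width = (hi - lo) / steps
    return sum(func(lo + (i + 0.5) * width) for i in range(steps)) * width


# Balance point: each x is weighted by the height of the curve at x.
mean = integrate(lambda x: x * density(x), 0, 2)

# Equal-areas point: solve x - x**2 / 4 = 0.5 for x.
median = 2 * (1 - math.sqrt(0.5))

print(round(mean, 3), round(median, 3))  # 0.667 0.586
```

The long tail of this curve is on the right, and the mean lies to the right of the median.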
We can roughly locate the mean, median, and quartiles of any density curve by eye. Because a density curve is
an idealized description of the distribution of data, we need to distinguish between the mean and standard
deviation of the density curve and the mean x̄ and standard deviation s computed from the actual observations.
The usual notation for the mean of an idealized distribution is μ (the Greek letter mu). We write the standard
deviation of a density curve as σ (the Greek letter sigma).
One particularly important class of density curves, which we have already seen, consists of curves that are symmetric,
single-peaked, and bell-shaped.
They are called normal curves, and they describe _________________________.
Changing μ without changing σ moves the normal curve along the horizontal axis without changing its spread.
The _________________________ controls the spread of a normal curve. (The curve with the larger value of σ is
more spread out.)
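The effect of each parameter can be checked with Python's statistics.NormalDist. The three curves below use made-up values of μ and σ chosen only for illustration.

```python
from math import isclose
from statistics import NormalDist

base = NormalDist(mu=0, sigma=1)
shifted = NormalDist(mu=3, sigma=1)  # hypothetical: same sigma, larger mu
wider = NormalDist(mu=0, sigma=2)    # hypothetical: same mu, larger sigma

# Changing mu only slides the curve: the height at the center is unchanged.
same_peak_height = isclose(base.pdf(0), shifted.pdf(3))

# A larger sigma lowers the peak and pushes more area away from the center.
lower_peak = wider.pdf(0) < base.pdf(0)
base_tail_area = 1 - (base.cdf(2) - base.cdf(-2))     # area more than 2 units from center
wider_tail_area = 1 - (wider.cdf(2) - wider.cdf(-2))

print(same_peak_height, lower_peak)  # True True
print(wider_tail_area > base_tail_area)  # True
```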
Not only do μ and σ completely determine the shape of a normal curve, but we can locate σ by eye on the curve.
As we move in either direction from the center μ, the curve changes from falling ever more steeply to falling
ever less steeply.
The points at which this change of curvature takes place are called _________________________ and are
located at distance σ on either side of the mean μ.
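A numerical sketch can illustrate where the change of curvature happens. It estimates how the slope of a normal curve is changing just inside and just outside the point μ + σ. The values μ = 10 and σ = 2 and the helper name curvature are assumptions for illustration.

```python
from statistics import NormalDist

mu, sigma = 10, 2  # hypothetical values for illustration
curve = NormalDist(mu, sigma).pdf


def curvature(x, h=1e-3):
    """Central-difference estimate of the second derivative of the curve at x."""
    return (curve(x - h) - 2 * curve(x) + curve(x + h)) / (h * h)


inside = curvature(mu + 0.9 * sigma)   # a little closer to the center than sigma
outside = curvature(mu + 1.1 * sigma)  # a little farther from the center than sigma

# Negative: falling ever more steeply. Positive: falling ever less steeply.
print(inside < 0, outside > 0)  # True True
```

The sign flips between 0.9σ and 1.1σ from the mean, which is consistent with the change of curvature sitting at distance σ.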
Remember that μ and σ alone do not specify the shape of most distributions, and that the shape of density
curves in general does not reveal σ. These are specific properties of normal distributions.
Although there are many normal curves, they all have common properties and obey the following rule.
The 68-95-99.7 Rule:
In the normal distribution with mean μ and standard deviation σ:

68% of the observations fall within σ of the mean μ

95% of the observations fall within 2σ of μ

99.7% of the observations fall within 3σ of μ
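The three percentages in the rule are rounded values. A short check with Python's statistics.NormalDist shows the areas they come from. The standard normal curve (μ = 0, σ = 1) is used here because the proportion within k standard deviations is the same for every normal curve.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mu = 0, sigma = 1

# Percent of the area within k standard deviations of the mean.
percent_within = {k: 100 * (z.cdf(k) - z.cdf(-k)) for k in (1, 2, 3)}

for k, percent in percent_within.items():
    print(k, round(percent, 1))
# 1 68.3
# 2 95.4
# 3 99.7
```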
Example 2: The distribution of heights of young women aged 18 to 24 is approximately normal with
mean μ = 64.5 inches and standard deviation σ = 2.5 inches. Apply the 68-95-99.7 rule for this distribution.
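One way to apply the rule is to compute the three intervals μ ± σ, μ ± 2σ, and μ ± 3σ directly. The sketch below does this for the heights in Example 2; the variable names are illustrative.

```python
mu, sigma = 64.5, 2.5  # inches, from Example 2

intervals = []
for k, percent in ((1, 68), (2, 95), (3, 99.7)):
    low, high = mu - k * sigma, mu + k * sigma
    intervals.append((low, high))
    print(f"About {percent}% of heights fall between {low} and {high} inches")

# About 68% of heights fall between 62.0 and 67.0 inches
# About 95% of heights fall between 59.5 and 69.5 inches
# About 99.7% of heights fall between 57.0 and 72.0 inches
```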
Because we will mention normal distributions often, a short notation is helpful. We abbreviate the normal
distribution with mean μ and standard deviation σ as N(μ, σ).
For example, the distribution of young women’s heights is N(64.5, 2.5).
A.P. Statistics: Density Curves and the Normal Distribution
We now have a kit of graphical and numerical tools for describing distributions. We also have a strategy for
exploring univariate data.
1.) Plot your data. (Usually a histogram, dot plot, or box plot)
2.) Look at the overall pattern (center, shape, and spread) and any deviations from the pattern (outliers)
3.) Calculate the numerical summary (5 number summary or mean and standard deviation)
Many times the overall pattern is so regular that it can be described with a smooth curve.
The histogram below gives the ACT scores for a sample of students at a large high school.
Draw a smooth curve that gives the overall pattern of the distribution. If the height of each bar represents the
proportion of students earning each score, then this mathematical model is called a density curve.
Properties of a density curve:
1.) The curve is always on or above the horizontal axis.
2.) The area under the curve is equal to 1.
Relationship among shape, mean, and median:
Locate the mean and median on each of the density curves above.
One particularly important class of density curves is the Normal Distribution. It is symmetric, single peaked,
and bell shaped. All normal distributions have the same overall shape. The exact density is determined by the
mean µ and the standard deviation σ. The mean determines the position of the curve and the standard deviation
determines the spread.
Empirical Rule (68-95-99.7 Rule):
Although there are many normal distributions, they all obey a common rule.
68% of the observations fall within 1 standard deviation of the mean (in the interval μ ± σ)
95% of the observations fall within 2 standard deviations of the mean (in the interval μ ± 2σ)
99.7% of the observations fall within 3 standard deviations of the mean (in the interval μ ± 3σ)
Examples: Suppose ACT scores are normally distributed with μ = 23 and σ = 3.
What proportion scored between 20 and 26?
What proportion scored between 23 and 26?
What proportion scored between 17 and 20?
What proportion scored above 29?
What proportion scored below 14?
What proportion scored below 21?
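Answers found with the empirical rule can be checked against areas under the N(23, 3) curve. The sketch below uses Python's statistics.NormalDist; the helper name proportion_between is an assumption for illustration. The first five questions use cutoffs a whole number of standard deviations from the mean, so the rule gives approximately 0.68, 0.34, 0.135, 0.025, and 0.0015. The cutoff 21 is not a whole number of standard deviations from 23, so the rule alone does not answer the last question and the area is computed directly.

```python
from statistics import NormalDist

act = NormalDist(mu=23, sigma=3)  # ACT scores, from the examples above


def proportion_between(low, high):
    """Area under the N(23, 3) curve between two scores."""
    return act.cdf(high) - act.cdf(low)


between_20_and_26 = proportion_between(20, 26)  # within 1 sd: rule says about 0.68
between_23_and_26 = proportion_between(23, 26)  # half of that: about 0.34
between_17_and_20 = proportion_between(17, 20)  # (0.95 - 0.68) / 2: about 0.135
above_29 = 1 - act.cdf(29)                      # (1 - 0.95) / 2: about 0.025
below_14 = act.cdf(14)                          # (1 - 0.997) / 2: about 0.0015
below_21 = act.cdf(21)                          # not covered by the rule

for name, value in [
    ("20 to 26", between_20_and_26),
    ("23 to 26", between_23_and_26),
    ("17 to 20", between_17_and_20),
    ("above 29", above_29),
    ("below 14", below_14),
    ("below 21", below_21),
]:
    print(name, round(value, 4))
```

The computed areas differ slightly from the rule's answers because 68, 95, and 99.7 are rounded percentages.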