The Normal Distribution
Up until now, we’ve dealt with
discrete probability
distributions.
The normal distribution is a
continuous probability
distribution. It has the shape
of a bell curve.
http://mathworld.wolfram.com/NormalDistribution.html
Since the normal curve represents
a probability distribution, the area
under the normal curve must be
equal to 1.
The center and spread of the
normal curve vary from one
distribution to another.
Probability density histogram with
continuous normal distribution curve
The location of the center of the
normal curve is based on the
mean of the population (called µ,
“mu”).
The spread of the normal curve is
based on the standard deviation of
the population (called σ, “sigma”).
http://en.wikibooks.org/wiki/A-level_Mathematics/OCR/S2/Normal_Distribution
The area under a normal
curve from a to b is the
probability that a data
value will lie between a
and b.
Typically, calculus is used to
find the area under a curve.
In statistics, though, the
calculus has been done and
stored in a table. All we need
is to be able to look up areas
on a table.
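As a rough illustration of what the table stores, the area under a normal curve can be approximated numerically. The sketch below is not part of the original slides: it uses Python's standard-library `statistics.NormalDist` for the density and a simple trapezoid rule. The function name `area_under_curve`, the step count, the cutoff of 8 standard deviations, and the endpoints a = −1 and b = 2 are all illustrative assumptions.

```python
from statistics import NormalDist

def area_under_curve(dist, a, b, steps=10_000):
    """Approximate the area under dist's density from a to b (trapezoid rule)."""
    width = (b - a) / steps
    total = 0.5 * (dist.pdf(a) + dist.pdf(b))
    for i in range(1, steps):
        total += dist.pdf(a + i * width)
    return total * width

# Standard normal curve: mean 0, standard deviation 1.
standard = NormalDist(mu=0.0, sigma=1.0)

# Nearly all of the area lies within 8 standard deviations of the mean,
# so this approximates the total area under the curve.
total_area = area_under_curve(standard, -8.0, 8.0)

# The numerically integrated area from a to b should agree with the
# difference of cumulative areas, which is what a table lookup provides.
a, b = -1.0, 2.0
numeric = area_under_curve(standard, a, b)
from_cdf = standard.cdf(b) - standard.cdf(a)

print(round(total_area, 4), round(numeric, 4), round(from_cdf, 4))
```

The total area comes out very close to 1, and the integrated area between a and b matches the difference of the two cumulative areas, which is why a table of cumulative areas is enough.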
Figure: normal curve of heights, horizontal axis from 64 to 76 inches
Example: Say the curve above represents
the probability distribution for men’s
height. If the area in blue is .301, then the
probability that a man is between 70 and
72 inches is .301.
Definition: In a normal distribution
with mean µ and standard
deviation σ, the z-score related to
any data point x is
z = (x − µ) / σ
Note: the z score related to x = µ is 0.
Example: If the weight of paper
discarded weekly by a typical
household is normally distributed,
with a mean of 9.4 pounds and a
standard deviation of 4.2 pounds,
find the z score related to 19.354
pounds of paper being discarded
weekly.
z = (19.354 − 9.4) / 4.2 = 2.37
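The z-score definition can be sketched in a few lines of Python. The function name `z_score` and the check on `sigma` are illustrative choices, not something the slides prescribe. The numbers are those of the paper example.

```python
def z_score(x, mu, sigma):
    """Return how many standard deviations x lies from the mean mu."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma

# Paper example: mean 9.4 pounds, standard deviation 4.2 pounds.
z_paper = z_score(19.354, mu=9.4, sigma=4.2)

# A data point equal to the mean has a z score of 0.
z_at_mean = z_score(9.4, mu=9.4, sigma=4.2)

print(round(z_paper, 2), z_at_mean)
```

Rounded to two decimal places, as a table lookup requires, the z score for 19.354 pounds is 2.37.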
The values in the table on page A-1 of
your text give the area under the
normal curve up to the x value related to
each z score (pink area). This area is the
probability that a randomly chosen data
value has a z score less than the z score
related to that data point.
For our problem, we can interpret that
as the probability that the weight of
paper discarded by a household in a
week is less than 19.354 pounds.
pink area = P(discarded paper weighs
less than 19.354 pounds) = .9911
Remember that the total probability
= total area under the curve = 1.
What is the probability that a
household discards more than 19.354
pounds of paper weekly?
1 - .9911 = .0089
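For readers without the table at hand, the same areas can be computed directly. The sketch below assumes the standard identity that expresses the cumulative area of the standard normal curve through the error function, available as `math.erf` in Python's standard library. The helper name `standard_normal_cdf` is illustrative.

```python
import math

def standard_normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

z = 2.37  # z score for 19.354 pounds (mean 9.4, standard deviation 4.2)

# Area to the left of z: P(less than 19.354 pounds).
p_less = standard_normal_cdf(z)

# The total area is 1, so the area to the right is the complement.
p_more = 1.0 - p_less

print(round(p_less, 4), round(p_more, 4))
```

Rounded to four decimal places, these agree with the table values .9911 and .0089 used above.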
Example: Find the z-score related to
a household discarding 4.444 pounds
of paper weekly.
z = (4.444 − 9.4) / 4.2 = −1.18
Note that for data points less than
the mean, the z score is always
negative.
What is the probability that a
household discards less than 4.444
pounds of paper weekly?
Example: What is the
probability that the amount
of paper discarded by a
household is between 4.444
and 19.354 pounds?
The probability is the area in pink. We
know that the area up to the z score of
−1.18 is .1190, while the area up to 2.37 is
.9911.
Area in pink =
area up to 2.37 − area up to −1.18 =
.9911 − .1190 = .8721 =
P(between 4.444 and 19.354 pounds of
paper discarded).
We can use our table of areas to show
that for normally distributed data, the
probability that a data point is
within 1 standard deviation of the mean: .683,
within 2 standard deviations: .954,
within 3 standard deviations: .997.
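Both the "between" calculation and the one, two, and three standard deviation figures come from subtracting one cumulative area from another. A minimal sketch, assuming Python 3.8 or later for `statistics.NormalDist`; the helper name `prob_between` is illustrative:

```python
from statistics import NormalDist

def prob_between(dist, low, high):
    """P(low < X < high): area up to high minus area up to low."""
    return dist.cdf(high) - dist.cdf(low)

# Paper example: mean 9.4 pounds, standard deviation 4.2 pounds.
paper = NormalDist(mu=9.4, sigma=4.2)
p_between = prob_between(paper, 4.444, 19.354)

# Probability of falling within k standard deviations of the mean,
# computed on the standard normal curve (mean 0, standard deviation 1).
standard = NormalDist()
within = {k: prob_between(standard, -k, k) for k in (1, 2, 3)}

print(round(p_between, 4))
for k, p in within.items():
    print(k, round(p, 3))
```

Working from the raw weights instead of the rounded z scores gives the same .8721 here, because 4.444 and 19.354 correspond exactly to z scores of −1.18 and 2.37.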