Download The Normal Distribution

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts
no text concepts found
Transcript
The Normal Distribution
Readings
3.4;
Main Part. 4.1–4.3, 4.5
Outline
1
Introduction: Continuous Random Variables (3.4)
2
Introduction: Normal Distribution (4.1-4.2)
3
Areas Under a Normal Curve (4.3)
Standard Normal Table: Computing Probabilities
Standard Normal Table: Finding Quantiles
General Normal Distribution
A Case Study
Outline
1
Introduction: Continuous Random Variables (3.4)
2
Introduction: Normal Distribution (4.1-4.2)
3
Areas Under a Normal Curve (4.3)
Standard Normal Table: Computing Probabilities
Standard Normal Table: Finding Quantiles
General Normal Distribution
A Case Study
Recall: Discrete Random Variables
We can describe a discrete random variable with a table.
For example:
k
0
1
5
10
Pr {X = k } 0.1 0.5 0.1 0.3
But a continuous random variable can take any value in a
interval; we could never list all of these values in a table
Continuous Distributions
A continuous random variable has possible values over a
continuum.
The total probability of one is not in discrete chunks at
specific locations, but rather is ground up like a very fine
dust and sprinkled on the number line.
We cannot represent the distribution with a table of
possible values and the probability of each.
Continuous Distributions (continued)
Instead, we represent the distribution with a probability
density function which measures the thickness of the
probability dust.
Probability is measured over intervals as the area under
the curve.
A legal probability density f :
1
2
is never negative (f (x) ≥ 0 for −∞ < x <R ∞).
∞
has a total area under the curve of one ( −∞ f (x)dx = 1).
Outline
1
Introduction: Continuous Random Variables (3.4)
2
Introduction: Normal Distribution (4.1-4.2)
3
Areas Under a Normal Curve (4.3)
Standard Normal Table: Computing Probabilities
Standard Normal Table: Finding Quantiles
General Normal Distribution
A Case Study
The Normal Distribution
The Normal Distribution is the most important distribution
of continuous random variables.
The normal density curve is the famous symmetric,
bell-shaped curve.
The central limit theorem is the reason that the normal
curve is so important. Essentially, many statistics that we
calculate from large random samples will have
approximate normal distributions (or distributions derived
from normal distributions), even if the distributions of the
underlying variables are not normally distributed.
This fact is the basis of most of the methods of statistical
inference we will study in the last half of the course.
Chapter 4 introduces the normal distribution as a
probability distribution.
Chapter 5 culminates in the central limit theorem, the
primary theoretical justification for most of the methods of
statistical inference in the remainder of the textbook.
The famous bell curve
Used everywhere: very useful description for lots of biological
(and other) random variables.
Body weight,
Crop yield,
Protein content in soybean,
Density of blood components
σ
σ
Y ∼ N (µ, σ)
values can, in
principle, go to
infinity, both ways.
µ − 4σ µ − 2σ
µ
µ + 2σ µ + 4σ
The Normal Density
Normal curves have the following bell-shaped, symmetric
density.
1
f (x) = √
e
2πσ
− 21
x−µ
σ
2
,
(−∞ < x < ∞)
Parameters: The parameters of a normal curve are the
mean µ and the standard deviation σ.
Notation: a random variable is normal with mean µ and
standard deviation σ,
Y ∼ N (µ, σ)
Standard Normal. We often consider µ = 0 and σ = 1.
Z ∼ N (0, 1)
Recall: Empirical Rule
The 68–95–99.7 Rule.
For every normal curve:
1
2
3
≈ 68% of the area is within one SD of the mean
≈ 95% of the area is within two SDs of the mean
≈ 99.7% of the area is within three SDs of the mean
Standard Normal Density
Standard Normal Density
Area within 1 = 0.68
Area within 2 = 0.95
Density
Area within 3 = 0.997
−3
−2
−1
0
Possible Values
1
2
3
Outline
1
Introduction: Continuous Random Variables (3.4)
2
Introduction: Normal Distribution (4.1-4.2)
3
Areas Under a Normal Curve (4.3)
Standard Normal Table: Computing Probabilities
Standard Normal Table: Finding Quantiles
General Normal Distribution
A Case Study
Recall: the Normal Density
f (x) = √
1
2πσ
e
(can you integrate that?)
− 12
x−µ
σ
2
,
(−∞ < x < ∞)
Compute areas
There is no formula to calculate general areas under the
normal curve.
We will learn to use normal tables for this.
LEARN the normal table.
The Standard Normal Table
We have tables for N (0, 1)
The standard normal table lists the area to the left of z
under the standard normal curve for each value from
−3.49 to 3.49 by 0.01 increments.
The normal table is located: Table 3 in your book (bring a
copy with you to class every day).
Numbers in the margins represent z.
Numbers in the middle of the table are areas to the left of z.
Warning!
Learn the normal table today.
Make it part of your being.
Standard Normal Table: Computing Probabilities
The standard normal Z ∼ N (0, 1)
Whole area = 1 (total probability rule)
Pr {Z ≤ 0} = by symmetry
Pr {Z = 0} = 0
Pr {Z < 1} = .8413 (table, front cover)
Pr {0 ≤ Z ≤ 1} =
Pr {−1 ≤ Z ≤ 1} =
Pr {Z > 1.5} =
−4
−2
0
1
2
3
4
Draw a picture and make use of symmetry.
Pr {−0.5 ≤ Z ≤ 0.3} =
The standard normal Z ∼ N (0, 1)
Whole area = 1 (total probability rule)
Pr {Z ≤ 0} = by symmetry
Pr {Z = 0} = 0
Pr {Z < 1} = .8413 (table, front cover)
Pr {0 ≤ Z ≤ 1} =
Pr {−1 ≤ Z ≤ 1} =
Pr {Z > 1.5} =
−4
−2
0
1
2
3
4
Draw a picture and make use of symmetry.
Pr {−0.5 ≤ Z ≤ 0.3} =
The standard normal Z ∼ N (0, 1)
Whole area = 1 (total probability rule)
Pr {Z ≤ 0} = by symmetry
Pr {Z = 0} = 0
Pr {Z < 1} = .8413 (table, front cover)
Pr {0 ≤ Z ≤ 1} =
Pr {−1 ≤ Z ≤ 1} =
Pr {Z > 1.5} =
−4
−2
0
1
2
3
4
Draw a picture and make use of symmetry.
Pr {−0.5 ≤ Z ≤ 0.3} =
The standard normal Z ∼ N (0, 1)
Whole area = 1 (total probability rule)
Pr {Z ≤ 0} = .5 by symmetry
Pr {Z = 0} = 0
Pr {Z < 1} = .8413 (table, front cover)
Pr {0 ≤ Z ≤ 1} =
Pr {−1 ≤ Z ≤ 1} =
Pr {Z > 1.5} =
−4
−2
0
1
2
3
4
Draw a picture and make use of symmetry.
Pr {−0.5 ≤ Z ≤ 0.3} =
The standard normal Z ∼ N (0, 1)
Whole area = 1 (total probability rule)
Pr {Z ≤ 0} = .5 by symmetry
Pr {Z = 0} = 0
Pr {Z < 1} = .8413 (table, front cover)
Pr {0 ≤ Z ≤ 1} =
Pr {−1 ≤ Z ≤ 1} =
Pr {Z > 1.5} =
−5 −4
−2
0
1
2
4
Draw a picture and make use of symmetry.
Pr {−0.5 ≤ Z ≤ 0.3} =
The standard normal Z ∼ N (0, 1)
Whole area = 1 (total probability rule)
Pr {Z ≤ 0} = .5 by symmetry
Pr {Z = 0} = 0
Pr {Z < 1} = .8413 (table, front cover)
Pr {0 ≤ Z ≤ 1} =
Pr {−1 ≤ Z ≤ 1} =
Pr {Z > 1.5} =
−4
−2
0
1
2
4
Draw a picture and make use of symmetry.
Pr {−0.5 ≤ Z ≤ 0.3} =
The standard normal Z ∼ N (0, 1)
Whole area = 1 (total probability rule)
Pr {Z ≤ 0} = .5 by symmetry
Pr {Z = 0} = 0
Pr {Z < 1} = .8413 (table, front cover)
Pr {0 ≤ Z ≤ 1} = 0.8413 − 0.5 = .3413
Pr {−1 ≤ Z ≤ 1} =
Pr {Z > 1.5} =
−4
−2
0
1
2
4
Draw a picture and make use of symmetry.
Pr {−0.5 ≤ Z ≤ 0.3} =
The standard normal Z ∼ N (0, 1)
Whole area = 1 (total probability rule)
Pr {Z ≤ 0} = .5 by symmetry
Pr {Z = 0} = 0
Pr {Z < 1} = .8413 (table, front cover)
Pr {0 ≤ Z ≤ 1} = 0.8413 − 0.5 = .3413
Pr {−1 ≤ Z ≤ 1} =
Pr {Z > 1.5} =
−4
−2 −1 0
1
2
4
Draw a picture and make use of symmetry.
Pr {−0.5 ≤ Z ≤ 0.3} =
The standard normal Z ∼ N (0, 1)
Whole area = 1 (total probability rule)
Pr {Z ≤ 0} = .5 by symmetry
Pr {Z = 0} = 0
Pr {Z < 1} = .8413 (table, front cover)
Pr {0 ≤ Z ≤ 1} = 0.8413 − 0.5 = .3413
Pr {−1 ≤ Z ≤ 1} = 2 ∗ 0.3413 = .6826
Pr {Z > 1.5} =
−4
−2 −1 0
1
2
4
Draw a picture and make use of symmetry.
Pr {−0.5 ≤ Z ≤ 0.3} =
The standard normal Z ∼ N (0, 1)
Whole area = 1 (total probability rule)
Pr {Z ≤ 0} = .5 by symmetry
Pr {Z = 0} = 0
Pr {Z < 1} = .8413 (table, front cover)
Pr {0 ≤ Z ≤ 1} = 0.8413 − 0.5 = .3413
Pr {−1 ≤ Z ≤ 1} = 2 ∗ 0.3413 = .6826
Pr {Z > 1.5} =
−4
0
1.5
4
5
Draw a picture and make use of symmetry.
Pr {−0.5 ≤ Z ≤ 0.3} =
The standard normal Z ∼ N (0, 1)
Whole area = 1 (total probability rule)
Pr {Z ≤ 0} = .5 by symmetry
Pr {Z = 0} = 0
Pr {Z < 1} = .8413 (table, front cover)
Pr {0 ≤ Z ≤ 1} = 0.8413 − 0.5 = .3413
Pr {−1 ≤ Z ≤ 1} = 2 ∗ 0.3413 = .6826
Pr {Z > 1.5} = 1 − Pr {Z ≤ −1.5}
= 1 − 0.9332 = .0668
−4
0
1.5
4
5
Draw a picture and make use of symmetry.
Pr {−0.5 ≤ Z ≤ 0.3} =
The standard normal Z ∼ N (0, 1)
Whole area = 1 (total probability rule)
Pr {Z ≤ 0} = .5 by symmetry
Pr {Z = 0} = 0
Pr {Z < 1} = .8413 (table, front cover)
Pr {0 ≤ Z ≤ 1} = 0.8413 − 0.5 = .3413
Pr {−1 ≤ Z ≤ 1} = 2 ∗ 0.3413 = .6826
Pr {Z > 1.5} = 1 − Pr {Z ≤ −1.5}
= 1 − 0.9332 = .0668
−4
−1.5
0
1.5
4
Draw a picture and make use of symmetry.
Pr {−0.5 ≤ Z ≤ 0.3} =
The standard normal Z ∼ N (0, 1)
Whole area = 1 (total probability rule)
Pr {Z ≤ 0} = .5 by symmetry
Pr {Z = 0} = 0
Pr {Z < 1} = .8413 (table, front cover)
Pr {0 ≤ Z ≤ 1} = 0.8413 − 0.5 = .3413
Pr {−1 ≤ Z ≤ 1} = 2 ∗ 0.3413 = .6826
Pr {Z > 1.5} = 1 − Pr {Z ≤ −1.5}
= 1 − 0.9332 = .0668
−4
−0.50.3
4
Draw a picture and make use of symmetry.
Pr {−0.5 ≤ Z ≤ 0.3} = Pr {Z ≤ 0.3} − Pr {Z ≤ −0.5}
The standard normal Z ∼ N (0, 1)
Whole area = 1 (total probability rule)
Pr {Z ≤ 0} = .5 by symmetry
Pr {Z = 0} = 0
Pr {Z < 1} = .8413 (table, front cover)
Pr {0 ≤ Z ≤ 1} = 0.8413 − 0.5 = .3413
Pr {−1 ≤ Z ≤ 1} = 2 ∗ 0.3413 = .6826
Pr {Z > 1.5} = 1 − Pr {Z ≤ −1.5}
= 1 − 0.9332 = .0668
−4
−0.50.3
4
Draw a picture and make use of symmetry.
Pr {−0.5 ≤ Z ≤ 0.3} = Pr {Z ≤ 0.3} − Pr {Z ≤ −0.5}
= .6179 − .3085 = .3094
Standard Normal Table: Finding Quantiles
Basic Idea
This is the inverse problem.
Before. Question: here is the interval; Answer: Probability
of the interval
Now. Question: Here is the probability. Answer: What is
the interval that matches with the probability.
Problem: many intervals have the same area.
We ask for a certain type of interval.
The Reverse Question: Finding quantiles
Pr {Z ≤?} = .975
the value ? is a quantile.
Use Table 3 (or front cover) backward
−3
0
?
3
Pr {Z ≤?} = 0.20
Pr {Z ≥?} = 0.80
−3
?
0
3
The Reverse Question: Finding quantiles
? = 1.96
Pr {Z ≤?} = .975
the value ? is a quantile.
Use Table 3 (or front cover) backward
−3
0
?
3
Pr {Z ≤?} = 0.20
Pr {Z ≥?} = 0.80
−3
?
0
3
The Reverse Question: Finding quantiles
? = 1.96
Pr {Z ≤?} = .975
the value ? is a quantile.
Use Table 3 (or front cover) backward
Pr {Z ≤?} = 0.20
Pr {Z ≥?} = 0.80
−3
0
?
3
? = −0.84
−3
?
0
3
The Reverse Question: Finding quantiles
? = 1.96
Pr {Z ≤?} = .975
the value ? is a quantile.
Use Table 3 (or front cover) backward
Pr {Z ≤?} = 0.20
Pr {Z ≥?} = 0.80
−3
0
?
3
? = −0.84
? = −0.84
−3
?
0
3
zα Notation
Instead of ? we introduce a fancy notation.
Let zα be the number such that
Pr {Z < zα } = 1 − α
See Figures 4.3.11, 4.3.12 (p.129) (α to the right; 1 − α to
the left)
Example. z.025 = 1.96
General Normal Distribution
General Normal?
We are good if Y ∼ N (0, 1)
What if Y ∼ N (10, 5) or N (11, 3), or ...
(Directions from “Home”)
General form N (µ, σ)
All normal distributions have the same shape.
The question: How many standard deviations away from
the mean?
Transformation
If
Y ∼ N (µ, σ)
then
Y −µ
Z =
∼ N (0, 1)
σ
Standardization
All normal curves have the same shape, and are simply
rescaled versions of the standard normal density.
Consequently, every area under a general normal curve
corresponds to an area under the standard normal curve.
The key standardization formula is
z=
x −µ
σ
Solving for x yields
x = µ + zσ
which says algebraically that x is z standard deviations
above the mean.
Probability Example
Use z =
x−µ
σ
If X ∼ N(100, 2), find Pr {X > 97.5}.
Probability Example: Solution
Use z =
x−µ
σ
If X ∼ N(100, 2), find Pr {X > 97.5}.
Solution:
X − 100
97.5 − 100
Pr {X > 97.5} = Pr
>
2
2
= Pr {Z > −1.25}
= 1 − Pr {Z < −1.25}
= 1 − .1056 = .8944
Quantile Example
Use x = µ + zσ
If X ∼ N(100, 2), find the cutoff values for the middle 70%
of the distribution.
Quantile Example: Solution
Use x = µ + zσ
If X ∼ N(100, 2), find the cutoff values for the middle 70%
of the distribution.
Solution: The cutoff points will be the 0.15 and 0.85
quantiles.
From the table, 1.03 < z < 1.04 (1.04 is closer)
P(Z < 1.03) = .8485,
P(Z < 1.04) = .8508
Thus, the cutoff points are the mean plus or minus 1.04
standard deviations.
100 − 1.04(2) = 97.92,
100 + 1.04(2) = 102.08
A Case Study
Exam Hint
This how exam questions will often look.
I give you some question in the context of a scientific area;
you figure out the probabilty calculations to use.
(word problems)
Case Study
Example
Body temperature varies within individuals over time (it can be
higher when one is ill with a fever, or during or after physical
exertion). However, if we measure the body temperature of a
single healthy person when at rest, these measurements vary
little from day to day, and we can associate with each person an
individual resting body temperture. There is, however, variation
among individuals of resting body temperture.
The Question
Example
In the population, suppose that:
the mean resting body temperature is 98.25 degrees
Fahrenheit;
the standard deviation is 0.73 degrees Fahrenheit;
resting body temperatures are normally distributed.
Let X be the resting body temperature of a randomly chosen
individual. Find:
1
Pr {X < 98}, the proportion of individuals with temperature
less than 98.
2
Pr {98 < X < 100}, the proportion of individuals with
temperature between 98 and 100.
3
The 0.90 quantile of the distribution.
4
The cutoff values for the middle 50% of the distribution.
Warning!
Learn the normal table today.
Make it part of your being.
Related documents