Unit 4
Randomness in Data
Topic 15
Normal Distributions
(page 325)
OVERVIEW
You began studying randomness and
probability in the previous topic. Toward the
end of that topic you saw that
a mound-shaped distribution described
the outcomes of a particular variable, such as
counts and proportions. Such a pattern arises
often enough that it has been very extensively
studied mathematically.
In this topic you will investigate
mathematical models known
as normal distributions,
which describe this pattern of variation very
accurately. You will learn how to use
normal distributions to calculate
probabilities of interest in a variety of
contexts.
Do the Preliminaries
(page 326)
Essential Question
How can normal curves be used
as mathematical models for
approximating distributions?
Activity 15-1
Placement Scores and Hospital Births
(page 327)
(a) The similarity that I notice in the shapes of these
distributions is that they are roughly symmetric and
mound-shaped, with a single peak.
(b) [Draw a sketch of a smooth curve that approximates the
general shape of the histograms.]
Data that display the general shape shown above occur
frequently. Theoretical mathematical models used to
approximate such distributions are called
normal distributions.
Each normal distribution shares 3 characteristics:
(1) Every normal distribution is symmetric.
(2) Every normal distribution has a single peak at its center.
(3) Every normal distribution follows a bell-shaped curve.
Each normal distribution is distinguished (identified)
by 2 things:
(1) its mean and
(2) its standard deviation.
The mean µ determines where its center is; the peak
of a normal curve is at its mean, which is also its point of
symmetry.
The standard deviation σ indicates how spread
out the distribution is.
The distance between the mean µ and the points
where the curvature changes is equal to the
standard deviation σ.
Greek Alphabet
(c) [Label each curve with A, B, or C.]
A: µ = 70, σ = 5
B: µ = 70, σ = 10
C: µ = 50, σ = 10
(d) Are these proportions quite close
to each other and to the predictions
of the empirical rule that you studied
in Topic 5? YES
Review:
$$z\text{-score} = \frac{\text{observation} - \text{mean}}{\text{standard deviation}} = \frac{x - \mu}{\sigma}$$
(e) z = (8.5 - 10.221) / 3.859 = -0.45
(show work with answer rounded to the nearest hundredth)
(f) z = (23.5 - 25.060) / 3.472 = -0.45
(show work with answer rounded to the nearest hundredth)
(g) The thing true about these 2 z-scores is that they
are equal.
(h) The interpretation of these z-scores is that the
two values fall 0.45 of one standard
deviation below their respective means.
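The two z-score calculations in (e) and (f) can be checked with a few lines of Python. This is only a sketch: the helper name `z_score` is made up for illustration, and the means and standard deviations are the ones quoted in the work above.

```python
def z_score(observation, mean, sd):
    """Number of standard deviations the observation lies from the mean."""
    return (observation - mean) / sd

# Values taken from parts (e) and (f) above.
z_placement = z_score(8.5, 10.221, 3.859)   # placement exam score of 8.5
z_births = z_score(23.5, 25.060, 3.472)     # 23.5 simulated births

# Both round to the same value to the nearest hundredth.
print(round(z_placement, 2), round(z_births, 2))
```

Both values are negative, which matches the interpretation in (h): each observation sits a little less than half a standard deviation below its mean.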
(i) The proportion of observations falling below 8.5
in the placement exam data is 33%.
(1 + 1 + 5 + 7 + 12 + 13 + 16 + 15) / 213 = 70 / 213 = .3286
(j) The proportion of observations falling below 23.5
in the simulated births data is 32%.
(3 + 10 + 7 + 15 + 31 + 24 + 25) / 365 = 115 / 365 = .3151
The proportion of observations falling below 8.5 in
the placement exam data is 33%.
The proportion of observations falling below 23.5 in
the simulated births data is 32%.
Are these proportions fairly close? YES
The closeness of these percentages indicates
that to find the proportion of data falling in a given
region for normal distributions, all we need to
determine is the z-score.
Values with the same z-score will have the same
percentage of observations lying below them,
for any normal distribution. Thus, instead of
finding percentages for all normal distributions,
we need only the percentages corresponding to
these z-scores.
(k) The probability that the randomly selected
student’s score will be below 8.5 is 33%.
[Hint: Refer back to your answer to (i).]
The probability of a randomly selected
observation falling in a certain interval is
equivalent to the proportion of the population’s
observations falling in that interval.
Since the total area under the curve of a
normal distribution is 1, this probability can
be calculated by finding the area under the
normal curve for that interval.
The total area under the
curve is always equal to 1 or 100%.
(l) Area corresponding to the probability of a
placement score falling below 8.5 =
Pr(Z < -0.45) = .3264 = 33%
(m) Is this value reasonably close to your
answers for the proportions of the
placement and birth data less than
their z-scores of -0.45? YES
Use Z to denote the standard normal
distribution. The notation Pr ( a < Z < b )
denotes the probability lying between
the values a and b, calculated as the area
under the standard normal curve in that
region. The notation Pr ( Z ≤ z ) denotes
the area to the left of a particular value z,
while Pr ( Z ≥ z ) refers to the area above a
particular z value.
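The three kinds of area in this notation can also be computed in software instead of read from the table. The sketch below uses Python's `statistics.NormalDist` as a stand-in for the Standard Normal Probability Table; the helper names `pr_below`, `pr_above`, and `pr_between` are invented here for illustration.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

def pr_below(z):
    """Pr(Z <= z): area under the curve to the left of z."""
    return Z.cdf(z)

def pr_above(z):
    """Pr(Z >= z): area to the right of z (total area is 1)."""
    return 1.0 - Z.cdf(z)

def pr_between(a, b):
    """Pr(a < Z < b): larger left-area minus smaller left-area."""
    return Z.cdf(b) - Z.cdf(a)

print(round(pr_below(-0.45), 4))         # area used in part (l)
print(round(pr_above(0.68), 4))
print(round(pr_between(-1.38, 0.68), 4))
```

Because the normal curve is continuous, `<` and `≤` give the same area, so one function covers both.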
EXAMPLES
Sketch the curve and the area under the curve,
then find the proportion. Use the Standard Normal
Probability Table from your notes.
EXAMPLE #1
Sketch the curve and the area under the curve,
then find the proportion. Use the Standard Normal
Probability Table from your notes.
Pr (Z < 0.68) = .7517 = 75.17%
The less than sign means shade to the left of 0.68.
EXAMPLE #2
Sketch the curve and the area under the curve,
then find the proportion. Use the Standard Normal
Probability Table from your notes.
Pr (Z ≤ 0.68) = .7517 = 75.17%
The less than sign and the less than or equal to sign
are treated the same.
EXAMPLE #3
Sketch the curve and the area under the curve,
then find the proportion. Use the Standard Normal
Probability Table from your notes.
Pr (Z > 0.68) = .2483 = 24.83%
The greater than sign means shade to the right of 0.68.
100% - 75.17% = 24.83%
EXAMPLE #3
Here is another way to look at this problem!
Pr (Z > 0.68) = .2483 = 24.83%
You can use the table if you think of the opposite,
Pr (Z < -0.68).
Remember … Normal curves are symmetric!
EXAMPLE #4
Sketch the curve and the area under the curve,
then find the proportion. Use the Standard Normal
Probability Table from your notes.
Pr (Z < -1.38) = .0838 = 8.38%
EXAMPLE #5
Sketch the curve and the area under the curve,
then find the proportion. Use the Standard Normal
Probability Table from your notes.
Pr (Z > -1.75) = .9599 = 95.99%
EXAMPLE #6
Sketch the curve and the area under the curve,
then find the proportion. Use the Standard Normal
Probability Table from your notes.
Pr (Z < -2.49) = .0064 = 0.64%
EXAMPLE #7
Sketch the curve and the area under the curve,
then find the proportion. Use the Standard Normal
Probability Table from your notes.
Pr (-1.38 < Z < 0.68) = _______ = ________
Find the area of the smaller region …
Pr (Z < -1.38) = .0838 = 8.38%
Then find the area of the larger region …
Pr (Z < 0.68) = .7517 = 75.17%
Subtract the area of the smaller region from
the area of the larger region …
Pr (-1.38 < Z < 0.68)
= Pr (Z < 0.68) - Pr (Z < -1.38)
= .7517 - .0838
= .6679 = 66.79%
EXAMPLE #8
Sketch the curve and the area under the curve,
then find the proportion. Use the Standard Normal
Probability Table from your notes.
Pr (Z < -3.81) = less than .0002
EXAMPLE #9
Sketch the curve and the area under the curve,
then find the z-value. Use the Standard Normal
Probability Table from your notes in reverse.
Pr (Z < k) = .8997, so k = 1.28
Must find the Z-value that has 89.97% of the area to its left.
EXAMPLE #10
Sketch the curve and the area under the curve,
then find the z-value. Use the Standard Normal
Probability Table from your notes in reverse.
Pr (Z > k) = .0630, so k = 1.53
100% - 6.30% = 93.70%
So look up .9370.
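Examples #9 and #10 read the table in reverse: start from an area and find the z-value. A rough software equivalent, assuming Python's `statistics.NormalDist` in place of the table (the helper names are made up for this sketch), is:

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

def z_with_area_below(p):
    """Find k such that Pr(Z < k) = p."""
    return Z.inv_cdf(p)

def z_with_area_above(p):
    """Find k such that Pr(Z > k) = p, by converting to the area on the left."""
    return Z.inv_cdf(1.0 - p)

k9 = z_with_area_below(0.8997)    # Example #9
k10 = z_with_area_above(0.0630)   # Example #10: look up 1 - .0630 = .9370

print(round(k9, 2), round(k10, 2))
```

The second helper does the same "100% minus the area" step shown in Example #10 before looking up the value.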
Take-Home
quiz on these
examples.
Essential Question
How can normal curves be used
as mathematical models for
approximating distributions?
Activity 15-2
Birth Weights
(pages 331 to 334)
mean µ = 3250 grams and standard deviation σ = 550 grams
(a) Note: Low birth weight is under 2500 grams.
(b) My guess as to the proportion of babies born with
a low birth weight is _________.
(c) z-score = (2500 - 3250) / 550 = -1.36
(show work with answer rounded to the nearest hundredth)
(d) Pr ( Z < -1.36 ) = .0869 = 8.69%
(e) [Follow directions with the calculator.]
normalcdf (-1E99, 2500, 3250, 550) = .0863
arguments: lower bound, upper bound, mean μ, s.d. σ
Does your answer match (to within rounding
discrepancies) your answer to (d)? YES
(f) [Use Standard Normal Probabilities Table.]
z-score = (4536 - 3250) / 550 = 2.34
(show work with answer rounded to the nearest hundredth)
Pr ( Z > 2.34 ) = .0096 = 0.96%
[Use your calculator.]
Pr ( X > 4536 ) = .0097 = 0.97%
normalcdf (4536, 1E99, 3250, 550)
arguments: lower bound, upper bound, mean μ, s.d. σ
(g) [Know both ways in (f).]
Pr ( Z > 2.34) = 0.96%
Pr ( X > 4536) = 0.97%
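For anyone without the calculator, the two tail areas from (e) and (f) can be approximated in Python. This is a sketch only: `statistics.NormalDist` stands in for `normalcdf`, and the variable names are invented. An open-ended bound such as -1E99 or 1E99 is not needed here because `cdf` already measures everything to the left.

```python
from statistics import NormalDist

# Birth weight model from this activity: mean 3250 g, s.d. 550 g.
weights = NormalDist(mu=3250, sigma=550)

# (e) Pr(X < 2500): area to the left of 2500 grams.
p_low = weights.cdf(2500)

# (f) Pr(X > 4536): area to the right of 4536 grams.
p_high = 1.0 - weights.cdf(4536)

print(round(p_low, 4), round(p_high, 4))
```

These use the unrounded z-scores, which is why the calculator answers differ slightly from the table answers based on z rounded to the nearest hundredth.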
(h) mean µ = 3250 grams & standard deviation σ = 550 grams
lower bound: 3000, upper bound: 4000
[Use Standard Normal Probabilities Table.]
z-score = (4000 - 3250) / 550 = 1.36
z-score = (3000 - 3250) / 550 = -0.45
Pr ( -0.45 < Z < 1.36 ) = .9131 - .3264 = .5867 = 58.67%
[Use your calculator.]
Pr ( 3000 < X < 4000 ) = .5889 = 58.89%
normalcdf (3000, 4000, 3250, 550)
arguments: lower bound, upper bound, mean μ, s.d. σ
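The small gap between the table answer and the calculator answer in (h) comes from rounding the z-scores before the lookup. The sketch below illustrates that under the assumption that Python's `statistics.NormalDist` is an acceptable stand-in for both the table and `normalcdf`; the variable names are illustrative.

```python
from statistics import NormalDist

weights = NormalDist(mu=3250, sigma=550)  # birth weight model
Z = NormalDist()                          # standard normal

# Calculator-style: work directly with grams, no rounding.
exact = weights.cdf(4000) - weights.cdf(3000)

# Table-style: round each z-score to the nearest hundredth first.
z_lo = round((3000 - 3250) / 550, 2)   # -0.45
z_hi = round((4000 - 3250) / 550, 2)   #  1.36
table_style = Z.cdf(z_hi) - Z.cdf(z_lo)

print(round(table_style, 4), round(exact, 4))
```

Either answer is acceptable for this activity; the difference is a rounding discrepancy, not an error.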
(i) Proportion for low birth weight babies is 7.50%.
291,154 / 3,880,894 = 0.0750
Proportion for babies weighing between
3000 and 4000 is 65.78%.
2,552,852 / 3,880,894 = 0.6578
[comment]
First of all I just want to point out that my
youngest son, Noah, was born in 1997.
The normal model's proportion for low birth weight
babies was off by about 1 percentage point (8.69% vs.
7.50%).
The proportion of babies weighing
between 3000 and 4000 grams is off by
less than 7 percentage points (58.89% vs. 65.78%).
(j) A newborn would have to weigh 2172
grams to be among the lightest 2.5%.
invNorm (.025, 3250, 550) = 2172 grams
arguments: % to left of value, mean μ, s.d. σ
(k) A newborn would have to weigh 3955
grams to be among the heaviest 10%.
The heaviest 10% leaves 90% to the left of the value.
invNorm (.9, 3250, 550) = 3955 grams
arguments: % to left, mean μ, s.d. σ
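Parts (j) and (k) can be reproduced without a calculator. In this sketch, `NormalDist.inv_cdf` from Python's standard library plays the role of `invNorm`; like `invNorm`, it takes the area to the left of the value. The variable names are made up for illustration.

```python
from statistics import NormalDist

weights = NormalDist(mu=3250, sigma=550)  # birth weight model

# (j) Lightest 2.5%: 2.5% of the area lies to the left.
lightest_cutoff = weights.inv_cdf(0.025)

# (k) Heaviest 10%: 90% of the area lies to the left.
heaviest_cutoff = weights.inv_cdf(1.0 - 0.10)

print(round(lightest_cutoff), round(heaviest_cutoff))
```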
Assignment
Activity 15-5: Pregnancy Durations
(page 338)
Assignment
Activity 15-6: Professors’ Grades
(pages 338 & 339)
Assignment
Activity 15-8: IQ Scores
(page 339)
How can the properties of a normal
distribution be used to make predictions?
Essential Question
How can normal curves be used
as mathematical models for
approximating distributions?
Activity 15-3
Matching Samples to Density Curves
(pages 334 to 336)
While normal distributions are the most
common, they are not the only kind of
theoretical probability model. Any curve
under which the total area is one and for
which areas correspond to probabilities
represents a probability model. Such
curves are called density curves .
(a) Sample 1: Population A
Sample 2: Population B
Sample 3: Population C
Sample 4: Population D
(b) Now try it with a sample size of 10!
(b) Sample 5: Population C
Sample 6: Population B
Sample 7: Population D
Sample 8: Population A
(c) It is easier to discern the shape
of the population from a sample
size of 100.
Especially with small sample
sizes, sample data
from normal populations may not
look very normal and may be
hard to distinguish from sample
data from other shapes of
populations.
WRAP-UP
This topic has introduced you to the most important
mathematical model in all of statistics:
the normal distribution.
You have discovered how z-scores provide the key to
using a table of standard normal probabilities to
perform calculations related to normal distributions.
You have also compared predictions from a normal
model to observed data as a means of assessing
the usefulness of the model in a given situation.
You have seen that the model predictions are not
exact, but are close, especially with larger samples.
The next two topics will reveal how the normal
distributions describe the pattern of variation that
arises when one repeatedly takes samples from a
population. In the next topic you will explore how
sample proportions vary, while the variation of
sample means will occupy your attention in Topic 17.
These topics point toward the key role of
normal distributions in the most important
theoretical result in all of statistics:
the Central
Limit Theorem.
Activity 15-4
Normal Curves
(pages 337 to 338)
a: mean µ = _______ and standard deviation σ = _______
b: mean µ = _______ and standard deviation σ = _______
c: mean µ = _______ and standard deviation σ = _______
d: mean µ = _______ and standard deviation σ = _______
Remember:
The mean is in the middle of the curve…
and nearly all (about 99.7%) of the area under a normal curve
lies within 3 standard deviations of the mean, so the visible
curve spans about 6 standard deviations!
a: mean µ = 50 and standard deviation σ = 5
µ = (35 + 65) / 2 = 100 / 2 = 50
σ = (65 - 35) / 6 = 30 / 6 = 5
[Curve a: horizontal axis labeled 35, 45, 55, 65]
b: mean µ = 1100 and standard deviation σ = 300
µ = (200 + 2000) / 2 = 2200 / 2 = 1100
σ = (2000 - 200) / 6 = 1800 / 6 = 300
[Curve b: horizontal axis labeled 0, 1000, 2000]
c: mean µ = -15 and standard deviation σ = 45
µ = (-150 + 120) / 2 = -30 / 2 = -15
σ = (120 - (-150)) / 6 = 270 / 6 = 45
[Curve c: horizontal axis labeled -100, 0, 100]
Now you do this one on your own.
d: mean µ = 225 and standard deviation σ = 75
[Curve d: horizontal axis labeled 0, 100, 200, 300, 400, 500]
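The midpoint and "divide the width by 6" steps used for curves a, b, and c can be written as one small helper. This is a sketch only: the function name `estimate_mean_sd` is invented, and it assumes the two endpoints are read off where the sketched curve visibly begins and ends, about 3 standard deviations either side of the mean.

```python
def estimate_mean_sd(low, high):
    """Estimate (mean, sd) from the visible endpoints of a normal curve.

    Assumes the curve spans roughly 3 standard deviations on each
    side of the mean, so the full width is about 6 standard deviations.
    """
    mean = (low + high) / 2
    sd = (high - low) / 6
    return mean, sd

print(estimate_mean_sd(35, 65))      # curve a
print(estimate_mean_sd(200, 2000))   # curve b
print(estimate_mean_sd(-150, 120))   # curve c
```

The results are estimates from a sketch, so small differences in where you read the endpoints will change the answer slightly.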
Assignment
Activity 15-14: Empirical Rule
(page 341)
Empirical rule help: about 68%, 95%, and 99.7% of a normal
distribution falls within 1, 2, and 3 standard deviations
of the mean, respectively.
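The 68%, 95%, and 99.7% figures can be checked against the standard normal curve. The sketch below assumes Python's `statistics.NormalDist` as the source of areas; the helper name `area_within` is made up.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

def area_within(k):
    """Area under the curve within k standard deviations of the mean."""
    return Z.cdf(k) - Z.cdf(-k)

for k in (1, 2, 3):
    print(k, round(100 * area_within(k), 1), "%")
```

The computed areas are slightly different from the rounded percentages in the rule, which is why the rule is stated with "about".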
Assignment
Activity 15-15: Critical Values
(page 342)
Here is some help
to get you started!
(a) Pr(Z > z*) = .10, then z* = 1.28
The upper 10% leaves 90% to the left of z*.
invNorm ( .9 , 0 , 1 ) or just invNorm (.9) = 1.28
arguments: % to left, mean μ, s.d. σ
(the mean and s.d. can be left out since Z is standardized!)
1.28, 1.645, 1.96, 2.33, 2.58
These values are called the
critical values
of the normal distribution.
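The listed critical values can be regenerated by the same method as part (a). Note the assumption: only the .10 upper-tail area is given in these notes; the other upper-tail areas used below (.05, .025, .01, .005) are the ones that appear to correspond to the listed values, and `statistics.NormalDist` stands in for `invNorm`.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

# Upper-tail areas; all but .10 are an assumption (see the note above).
upper_tail_areas = [0.10, 0.05, 0.025, 0.01, 0.005]

# z* satisfies Pr(Z > z*) = area, so look up the area to the left: 1 - area.
critical_values = {area: Z.inv_cdf(1.0 - area) for area in upper_tail_areas}

for area, z_star in critical_values.items():
    print(area, round(z_star, 3))
```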
Your topic is due!
The quiz for this topic is on the facts of a
Normal Distribution.
How can the properties of a normal
distribution be used to make predictions?