Transcript
```Normality
Notes page 138
The heights of the female students at
RSH are normally distributed with a
mean of 65 inches. What is the
standard deviation of this
distribution if 18.5% of the female
students are shorter than 63 inches?
P(X < 63) = .185
What is the z-score for 63? z = -0.9
(63 - 65)/σ = -0.9
σ = -2/(-0.9) = 2.22
The heights of female teachers at RSH
are normally distributed with mean of
65.5 inches and standard deviation of
2.25 inches. The heights of male
teachers are normally distributed with
mean of 70 inches and standard
deviation of 2.5 inches.
• Describe the distribution of differences
of heights (male - female) of teachers.
Normal distribution with
μ = 70 - 65.5 = 4.5 & σ = √(2.25² + 2.5²) = 3.3634
(the variances add, assuming the two heights are independent)
• What is the probability that a
randomly selected male teacher is
shorter than a randomly selected
female teacher?
P(X < 0) = .0901
z = (0 - 4.5)/3.3634 = -1.34
Ways to Assess Normality
• Use graphs (dotplots,
boxplots, or histograms)
• Normal probability
(quantile) plot
Normal Scores
Suppose we have the following observations
of widths of contact windows in integrated
circuit chips:
3.21 2.49 2.94 4.38 4.02
3.34 3.81 3.62 3.30 2.85
To construct a normal probability plot, you
can use quantities called normal scores. The
values of the normal scores depend on the
sample size n. The normal scores when
n = 10 are below:
-1.539 -1.001 -0.656 -0.376 -0.123
0.123 0.376 0.656 1.001 1.539
Think of selecting sample after sample of
size 10 from a standard normal distribution.
Then -1.539 is the average of the smallest
observation from each sample & so on . . .
Sketch a scatterplot by pairing the smallest
normal score with the smallest observation
from the data set & so on.
[Plot: normal scores vs. Widths of Contact Windows]
What should happen if our data set is
normally distributed?
Normal Probability (Quantile) plots
• The observation (x) is plotted against known
normal z-scores
• If the points on the quantile plot lie close
to a straight line, then the data are
approximately normally distributed
• Deviations from a straight line on the
quantile plot indicate nonnormal data
• Points far away from the line indicate
outliers
• Vertical stacks of points (repeated
observations of the same number) are called
granularity
Are these approximately normally
distributed?
50 48 54 47 51 52 46 53
52 51 48 48 54 55 57 45
53 50 47 49 50 56 53 52
Both the histogram & boxplot
are approximately
symmetrical, so these data
are approximately normal.
The normal probability
plot is approximately
linear, so these data are
approximately normal.
What is this called?
Normal Approximation to the
Binomial
Before technology, binomial probability
calculations were very tedious.
Let’s see how statisticians
estimated these calculations in
the past!
Premature babies are those born more than
3 weeks early. Newsweek (May 16, 1988)
reported that 10% of the live births in the
U.S. are premature. Suppose that 250 live
births are randomly selected and that the
number X of the “preemies” is determined.
What is the probability that there are
between 15 and 30 preemies, inclusive?
(POD, p. 422)
1) Find this probability using the binomial
distribution.
P(15 ≤ X ≤ 30) = binomialcdf(250,.1,30) -
binomialcdf(250,.1,14) = .866
2) What are the mean and standard deviation
of the above distribution?
μ = np = 25 & σ = √(np(1 - p)) = 4.743
3) If we were to graph a
histogram for the above binomial
distribution, what shape do you
think it will have?
Since the probability is only 10%,
we would expect the histogram to be
strongly skewed right.
Let's graph this distribution:
• Put the numbers 1-45 in L1
• In L2, use binomialpdf to find
the probabilities.
4) What do you notice about the
shape?
Overlay a normal curve on your
histogram:
• In Y1 = normalpdf(X,μ,σ)
Normal distributions can be used to
estimate probabilities for binomial
distributions when:
1) the probability of success is close
to .5
or
2) n is sufficiently large
Rule: if n is large enough,
then np > 10 & n(1 –p) > 10
Why 10?
Normal distributions extend infinitely in
both directions; however, binomial
distributions are between 0 and n. If
we use a normal distribution to
estimate a binomial distribution, we
must cut off the tails of the normal
distribution. This is OK if the mean of
the normal distribution (for which we use
the mean of the binomial) is at least
three standard deviations (3σ) from 0
and from n. (BVD, p. 334)
We require:
μ - 3σ ≥ 0
Or
μ ≥ 3σ
As binomial:
np ≥ 3√(np(1 - p))
Square:
n²p² ≥ 9np(1 - p)
Simplify:
np ≥ 9(1 - p)
Since (1 - p) < 1, it is enough that:
np ≥ 9
And since p < 1, the same steps applied to
the distance from n (μ + 3σ ≤ n) give:
n(1 - p) ≥ 9
Therefore, we say that np should be at
least 10 and n(1 - p) should be at least
10.
Since a continuous distribution is used to
estimate the probabilities of a discrete
distribution, a continuity correction is used
to make the discrete values similar to
continuous values (±.5 to discrete values).
Why?
Think about histograms: each bar is centered
over its discrete value. The bar for “1”
actually goes from 0.5 to 1.5 & the bar for
“2” goes from 1.5 to 2.5. Therefore, by
adding or subtracting .5 from the discrete
values, you find the actual width of the
bars that you need to estimate with the
normal curve.
(Back to our example) Since P(preemie) = .1 which
is not close to .5, is n large enough?
np = 250(.1) = 25 & n(1-p) = 250(.9) = 225
Yes, OK to use a normal distribution to approximate the binomial
5) Use a normal distribution with the binomial mean
and standard deviation above to estimate the
probability that between 15 & 30 preemies,
inclusive, are born in the 250 randomly selected
babies.
Binomial: P(15 ≤ X ≤ 30)
written as
Normal (w/ cont. correction): P(14.5 < X < 30.5) =
normalcdf(14.5,30.5,25,4.743) = .8635
6) How does the answer in question 6 compare to