5.1 What Is Normal?
LEARNING GOAL
Understand what is meant by a normal distribution and
be able to identify situations in which a normal
distribution is likely to arise.
Copyright © 2009 Pearson Education, Inc.
Suppose a friend is pregnant and due to give birth on
June 30. Would you advise her to schedule an important
business meeting for June 16, two weeks before the due
date?
Figure 5.1 is a histogram for a distribution of 300 natural
births. The left vertical axis shows the number of births for
each 4-day bin. The right vertical axis shows relative
frequencies.
Figure 5.1
Slide 5.1- 2
We can find the proportion of births that occurred more
than 14 days before the due date by adding the relative
frequencies for the bins to the left of -14.
These bins have a total relative frequency of about 0.21,
which says that about 21% of the births in this data set
occurred more than 14 days before the due date.
Figure 5.1
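The bin-adding step described above can be sketched in a few lines of Python. This is only an illustration: the 300 birth records behind Figure 5.1 are not reproduced here, so the data are simulated, and the standard deviation of 15 days, the random seed, and the helper name bin_left_edge are all assumptions.

```python
import random
from collections import Counter

# Assumption: simulated stand-in for the 300 births in Figure 5.1, measured in
# days relative to the due date (negative = early). Mean 0 and SD 15 are guesses.
rng = random.Random(0)
births = [rng.gauss(0, 15) for _ in range(300)]

BIN_WIDTH = 4  # days per bin, as in Figure 5.1


def bin_left_edge(day, offset=-14, width=BIN_WIDTH):
    """Left edge of the bin containing `day`; bins are aligned so one edge is at -14."""
    return offset + width * ((day - offset) // width)


# Count births per bin, then convert counts to relative frequencies.
counts = Counter(bin_left_edge(d) for d in births)
relative_freq = {edge: n / len(births) for edge, n in counts.items()}

# Add the relative frequencies of every bin that lies entirely to the left of -14.
early = sum(freq for edge, freq in relative_freq.items() if edge + BIN_WIDTH <= -14)
print(f"Proportion more than 14 days early: {early:.2f}")
```

Because the data are simulated, the printed proportion will not match the 0.21 read from the textbook histogram; the point is only the mechanics of summing relative frequencies.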
Slide 5.1- 3
TIME OUT TO THINK
Suppose the friend plans to take a three-month maternity
leave after the birth. Based on the data in Figure 5.1 and
assuming a due date of June 30, should she promise to
be at work on October 10?
Slide 5.1- 4
The Normal Shape
The distribution of the birth data has a fairly distinctive
shape, which is easier to see if we overlay the
histogram with a smooth curve (Figure 5.2).
Slide 5.1- 5
For our present purposes, the shape of this smooth
distribution has three very important characteristics:
• The distribution is single-peaked. Its mode, or most
common birth date, is the due date.
• The distribution is symmetric around its single peak;
therefore, its median and mean are the same as its mode.
The median is the due date because equal numbers of
births occur before and after this date. The mean is also the
due date because, for every birth before the due date, there
is a birth the same number of days after the due date.
• The distribution is spread out in a way that makes it
resemble the shape of a bell, so we call it a “bell-shaped”
distribution.
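The claim that symmetry makes the mean, median, and mode coincide can be checked on a tiny data set. The numbers below are hypothetical (they are not taken from Figure 5.2) and were chosen only to be single-peaked and symmetric around 0.

```python
from statistics import mean, median, mode

# Hypothetical data: days relative to the due date for nine births,
# single-peaked and symmetric around 0 (the due date).
days = [-8, -4, -4, 0, 0, 0, 4, 4, 8]
print("symmetric:", mean(days), median(days), mode(days))

# Adding one very late birth breaks the symmetry: the mean is pulled toward
# the outlier while the median and mode stay at the due date.
skewed = days + [40]
print("one late outlier:", mean(skewed), median(skewed), mode(skewed))
```

In the symmetric case every early birth is balanced by an equally late one, which is exactly the argument the slide gives for why the mean equals the due date.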
Slide 5.1- 6
TIME OUT TO THINK
The histogram in Figure 5.2, which is based on natural
births, is fairly symmetric. Today, doctors usually induce
birth if a woman goes too far past her due date. How
would the shape of the histogram change if it included
induced labor births?
Slide 5.1- 7
Figure 5.3 Both distributions are normal and have the same mean
of 75, but the distribution on the left has a larger standard deviation.
Slide 5.1- 8
Definition
The normal distribution is a symmetric, bell-shaped
distribution with a single peak. Its peak corresponds to
the mean, median, and mode of the distribution. Its
variation can be characterized by the standard deviation
of the distribution.
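As a rough illustration of the definition, Python's standard library class statistics.NormalDist can describe two normal distributions that share a mean but differ in standard deviation. The mean of 75 echoes Figure 5.3; the standard deviations 5 and 10 are assumed values, since the source does not state them.

```python
from statistics import NormalDist

# Assumption: the standard deviations 5 and 10 are illustrative only.
narrow = NormalDist(mu=75, sigma=5)
wide = NormalDist(mu=75, sigma=10)

for dist in (narrow, wide):
    # The peak of a normal curve sits at the mean, which is also the median and mode.
    peak_height = dist.pdf(dist.mean)
    # Share of values within 5 units of the mean (area under the curve from 70 to 80).
    near_mean = dist.cdf(80) - dist.cdf(70)
    print(f"SD={dist.stdev}: mean={dist.mean}, median={dist.median}, "
          f"mode={dist.mode}, peak height={peak_height:.4f}, "
          f"share within 70-80={near_mean:.2f}")
```

The distribution with the larger standard deviation has a lower, wider curve, so a smaller share of its values falls close to the mean.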
Slide 5.1- 9
The Normal Distribution and Relative Frequencies
Relative Frequencies and the Normal Distribution
• The area that lies under the normal distribution curve
corresponding to a range of values on the horizontal axis
is the relative frequency of those values.
• Because the total relative frequency must be 1, the total
area under the normal distribution curve must equal 1, or
100%.
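The two statements above can be checked numerically. The sketch below approximates areas under a normal curve with a simple midpoint rule; the model (mean 0 at the due date, standard deviation 15 days) and the helper name area_between are assumptions made for illustration, not values from the source.

```python
from statistics import NormalDist

# Assumption: births modeled as normal with mean 0 (the due date) and SD 15 days.
births = NormalDist(mu=0, sigma=15)


def area_between(dist, low, high, steps=10_000):
    """Approximate the area under dist's curve from low to high (midpoint rule)."""
    width = (high - low) / steps
    return sum(dist.pdf(low + (i + 0.5) * width) for i in range(steps)) * width


total = area_between(births, -150, 150)  # ten standard deviations on each side
early = area_between(births, -150, -14)  # births more than 14 days early

print(f"total area under the curve: {total:.4f}")
print(f"area left of -14 days:      {early:.4f}")
print(f"same region via cdf:        {births.cdf(-14):.4f}")
```

The area over a range of days plays the same role as the summed relative frequencies of the histogram bins in that range, and the area over the whole axis comes out as 1.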
Slide 5.1- 10
Figure 5.5 The percentage of the total area in any region under the
normal curve tells us the relative frequency of data values in that region.
Slide 5.1- 11
TIME OUT TO THINK
According to Figure 5.5 (previous slide), what percentage
of births occur between 14 days early and 18 days late?
Explain. (Hint: Remember that the total area under the
curve is 100%.)
Slide 5.1- 12
EXAMPLE 2 Estimating Areas
Look again at the normal distribution in Figure 5.5 (slide 5.1-11).
a. Estimate the percentage of births occurring between 0 and 60
days after the due date.
Solution:
a. About half of the total area under the curve lies in the region
between 0 days and 60 days. This means that about 50% of
the births in the sample occur between 0 and 60 days after
the due date.
Slide 5.1- 13
EXAMPLE 2 Estimating Areas
Look again at the normal distribution in Figure 5.5 (slide 5.1-11).
b. Estimate the percentage of births occurring between 14 days
before and 14 days after the due date.
Solution:
b. Figure 5.5 shows that about 18% of the births occur more
than 14 days before the due date. Because the distribution is
symmetric, about 18% must also occur more than 14 days
after the due date. Therefore, a total of about 18% + 18% =
36% of births occur either more than 14 days before or more
than 14 days after the due date. The question asked about
the remaining region, which means between 14 days before
and 14 days after the due date, so this region must represent
100% - 36% = 64% of the births.
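The symmetry argument in part (b) is simple arithmetic, written out below. The cross-check at the end is an assumption-laden sketch: the source does not give a standard deviation, so one is chosen so that the model's early tail is exactly 18%.

```python
from statistics import NormalDist

tail = 0.18               # share more than 14 days early, read from Figure 5.5
both_tails = 2 * tail     # by symmetry, the late tail matches the early tail
middle = 1 - both_tails   # the remainder lies within 14 days of the due date
print(f"{both_tails:.0%} in the two tails, {middle:.0%} in the middle")

# Cross-check with a normal model (assumption: sigma is picked so that the
# area to the left of -14 days equals the 18% read from the figure).
sigma = -14 / NormalDist().inv_cdf(tail)
births = NormalDist(mu=0, sigma=sigma)
model_middle = births.cdf(14) - births.cdf(-14)
print(f"normal model with SD {sigma:.1f} days: {model_middle:.0%} in the middle")
```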
Slide 5.1- 14
When Can We Expect a Normal Distribution?
Conditions for a Normal Distribution
A data set that satisfies the following four criteria is likely
to have a nearly normal distribution:
1. Most data values are clustered near the mean, giving
the distribution a well-defined single peak.
2. Data values are spread evenly around the mean,
making the distribution symmetric.
3. Larger deviations from the mean become increasingly
rare, producing the tapering tails of the distribution.
4. Individual data values result from a combination of
many different factors, such as genetic and
environmental factors.
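The fourth criterion can be explored with a small simulation in which each data value is the sum of many small, independent factors. Everything here is a hypothetical model: the number of factors (50), their range, the sample size, and the seed are arbitrary choices.

```python
import random
from statistics import mean, median, pstdev

rng = random.Random(1)


def one_value(n_factors=50):
    """One data value built by adding many small independent factors (hypothetical)."""
    return sum(rng.uniform(-1, 1) for _ in range(n_factors))


values = [one_value() for _ in range(5_000)]
center, spread = mean(values), pstdev(values)

# Criteria 1-3: values cluster near the mean, sit evenly around it, and large
# deviations are rare.
within_1sd = sum(abs(v - center) <= spread for v in values) / len(values)
beyond_3sd = sum(abs(v - center) > 3 * spread for v in values) / len(values)

print(f"mean = {center:.2f}, median = {median(values):.2f}")
print(f"share within 1 SD of the mean:  {within_1sd:.2f}")
print(f"share beyond 3 SD of the mean: {beyond_3sd:.4f}")
```

For a normal distribution roughly 68% of values lie within one standard deviation of the mean, so a share close to that, together with a mean and median that nearly agree, suggests the simulated values are nearly normal.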
Slide 5.1- 15
EXAMPLE 3 Is It a Normal Distribution?
Which of the following variables would you expect to
have a normal or nearly normal distribution?
a. Scores on a very easy test
Solution:
a. Tests have a maximum possible score (100%) that
limits the size of data values. If the test is easy, the
mean will be high and many scores will be close to the
maximum possible. The few lower scores may be
spread out well below the mean. We therefore expect
the distribution of scores to be left-skewed and nonnormal.
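The effect of a maximum score can be illustrated with a simulation. The model below is an assumption made only for this sketch: "raw ability" is drawn from a normal distribution with mean 92 and standard deviation 10, and any score above 100 is capped at 100.

```python
import random
from statistics import mean, median

rng = random.Random(2)


def easy_test_score():
    """Hypothetical score on an easy test: normal raw score, capped to 0-100."""
    raw = rng.gauss(92, 10)
    return max(0.0, min(100.0, raw))


scores = [easy_test_score() for _ in range(20_000)]
at_max = sum(s == 100.0 for s in scores) / len(scores)

print(f"mean = {mean(scores):.1f}, median = {median(scores):.1f}")
print(f"share of scores at the maximum: {at_max:.2f}")
```

A mean below the median, with many scores piled up at the maximum, is what a left-skewed distribution looks like in summary statistics: the few low scores pull the mean down, but nothing above 100 can pull it back up.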
Slide 5.1- 16
EXAMPLE 3 Is It a Normal Distribution?
Which of the following variables would you expect to
have a normal or nearly normal distribution?
b. Heights of a random sample of adult women
Solution:
b. Height is determined by a combination of many
factors (the genetic makeup of both parents and
possibly environmental or nutritional factors). We
expect the mean height for the sample to be close to
the mode (most common height). We also expect there
to be roughly equal numbers of women above and
below the mean, and extremely large and small heights
should be rare. That is why height is nearly normally
distributed.
Slide 5.1- 17
TIME OUT TO THINK
Would you expect scores on a moderately difficult exam
to have a normal distribution? Suggest two more
quantities that you would expect to be normally
distributed.
Slide 5.1- 18
The End
Slide 5.1- 19