Math 11
The Normal Distribution
If random data is “normal,” then most of the data falls around the expected or “normal” value and less and less data
falls at the extremes – the data is centrally-distributed. Remember the height example: most men are about 5 foot 10 inches
(178cm), and there are men who are shorter and taller, but these men occur less often. A graph of the distribution of a set of
normal data is represented by the bell-shaped curve. The normal distribution is also called the Gaussian distribution.
The normal distribution is widely used to approximate the distributions of many data sets: measurements of length,
mass, height, etc. for living things; how measurement errors are distributed; marks on a final exam; and the distribution of events
where there are only two outcomes (the heads/tails game). Remember that none of these truly fit the normal curve, but that the
normal curve is a good approximation of all of these.
On this curve, the mean, median, and mode all have the same value. The curve is symmetrical about the axis of the
mean. The distribution can be completely characterized with only two parameters: mean and standard deviation.
The mean is indicated with the symbol μ (Greek letter “mu”). The standard deviation is indicated with the symbol σ
(Greek letter “sigma”).
We will be calculating the mean and standard deviation for a sample of a population. In Math 11, the mean can be denoted
x̄ or μ. If you’re wondering “why two symbols?” then ask Woods.
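As a rough illustration of these two calculations, here is a minimal Python sketch using the standard `statistics` module. The height values are made up for the example and the variable names are only illustrative. Note that `pstdev` treats the data as a whole population (dividing by n), while `stdev` treats it as a sample (dividing by n - 1); which one applies depends on how the data was collected.

```python
import statistics

# Hypothetical heights in cm (made-up numbers, not from the handout).
heights = [170, 175, 178, 178, 181, 186]

mean_height = statistics.mean(heights)        # arithmetic mean
population_sd = statistics.pstdev(heights)    # population SD: divides by n
sample_sd = statistics.stdev(heights)         # sample SD: divides by n - 1

print(f"mean = {mean_height:.2f} cm")
print(f"population SD = {population_sd:.2f} cm")
print(f"sample SD = {sample_sd:.2f} cm")
```

For the same data, the sample standard deviation always comes out slightly larger than the population one.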
[Figure: bell-shaped curve divided into bands one standard deviation wide. Moving outward from the mean, the bands on each side are labelled 34%, 13.5%, and 2.35%.]
Within 1 standard deviation on either side of the mean (μ ± σ), approximately 68% of all data is contained by the curve.
Notice that 34% + 34% = 68%
Within 2 standard deviations on either side of the mean (μ ± 2σ), approximately 95% of all data is contained by the curve.
Notice that 13.5% + 13.5% + previous 68% = 95%
Within 3 standard deviations on either side of the mean (μ ± 3σ), approximately 99.7% of all data is contained by the
curve. Notice that 2.35% + 2.35% + previous 95% = 99.7%
Therefore, most of the data (68%) is contained within 1 standard deviation, and virtually all data (99.7%) is contained within 3
standard deviations. Due to the symmetry of the curve, half the data lies above the mean, and half the data lies below it.
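The 68%, 95%, and 99.7% figures can be checked numerically. The sketch below is one possible way to do it in Python, using the fact that for a normal distribution the fraction of data within k standard deviations of the mean equals erf(k/√2). The function name `fraction_within` is just an illustrative choice.

```python
import math

def fraction_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} SD: {fraction_within(k) * 100:.2f}%")
# within 1 SD: 68.27%
# within 2 SD: 95.45%
# within 3 SD: 99.73%
```

The percentages quoted in the text are rounded versions of these values.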
Those data that fall outside 3σ are considered statistically insignificant. They are called outliers. While the normal
distribution does allow that values will fall beyond these measures, we can’t use our normal curve to accurately predict these
values.
Some questions on mean, standard deviation, and the normal distribution
1. Which of these sets, if any, do you think can be modeled by a normal distribution?
1. Create a rough sketch of two normal distribution curves overlaid on top of each other: distribution A has a mean of 29 and a
standard deviation of 3. Distribution B has a mean of 32 and a standard deviation of 7.
2. The mass of an ant is known to be normally distributed. The mean mass of an ant is 58 nanograms (ng) with a standard
deviation of 9 ng. In a population of 850 ants, how many (approximately) would fall between 40 and 65 ng of mass?
3. You are the head of a multi-billion dollar conglomerate of international holding
companies, Glob-Dom Inc. Your next project is to acquire a rubber tire factory so you
can begin selling winter tires in Canada. You’ve got your choice down to two different
factories: Ride-rite or Smooth-em-bumps. Below is a sample of 6 tires that your
scientists are analyzing. They want each tire to be exactly 12 kg in mass, and they want
to have few customer returns due to non-conforming tires. (Assume these are normally
distributed.)
a) What does your scientific staff recommend? Who should you buy?
b) If tires that are off by more than 0.5 kg are considered defective (less than 11.5 kg or more than 12.5 kg), then approximately what
percentage of tires will be defective for each of these companies?
4.
5. Woods conducted an experiment - when rolling 3 dice and adding the total, he generated this table of
results:
a) Find the mean and standard deviation of these data (might be easier in Excel).
b) Does this data seem approximately normal? A bar chart or frequency polygon might be useful.
c) Assume that it is normal in distribution. How much data do you expect there to be within 2
standard deviations of the mean? Does your data fit this prediction?
6. A really good question… a population of data is 6, x, 9, y, 3, 11. It has a mean of 7 and a standard deviation of 5/√3.
What are x and y, if x < y?
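For question 5, a spreadsheet is one option; another is a short script. The following Python sketch does not use Woods's table of results. Instead, it assumes three fair six-sided dice and lists all 216 equally likely rolls, which gives the theoretical totals that experimental results can be compared against. The variable names are illustrative only.

```python
from itertools import product
from statistics import mean, pstdev

# Every equally likely outcome of rolling three fair six-sided dice (6**3 = 216).
totals = [sum(roll) for roll in product(range(1, 7), repeat=3)]

mu = mean(totals)        # theoretical mean of the total
sigma = pstdev(totals)   # population standard deviation of the total

# Count the totals that lie within 2 standard deviations of the mean.
within_two_sd = sum(1 for t in totals if abs(t - mu) <= 2 * sigma)

print(f"mean = {mu}, SD = {sigma:.3f}")
print(f"within 2 SD: {within_two_sd / len(totals):.1%}")
```

To analyze real experimental results instead, replace `totals` with the list of recorded sums.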