The Normal
Distribution
Normal Distributions
Many data sets display similar
characteristics
 The normal distribution is a way of
describing a certain kind of "ideal"
data set


Although no real-world data is perfect,
a surprising number of natural
phenomena are approximately
"normal"
What is a Normal
Distribution?

Symmetrical
no skew
 mean, median, mode all equal

Mound / Bell Shaped
peaks in the middle
slopes down towards the sides

[Figure: histogram of the data, with density (0.02 to 0.10) on the vertical axis and values from 10 to 55 on the horizontal axis, overlaid with the curve Density = normalDensity(x, 30, sd)]
Histograms Revisited

The bars in the histogram tell us how
much of the data falls into each interval
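[Figure: the same histogram and curve, Density = normalDensity(x, 30, sd), with density (0.02 to 0.10) on the vertical axis and values from 10 to 55 on the horizontal axis. The tall bars are annotated "lots of data here" and the short bars "little data here".]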
Why is Normal good?

The Normal Distribution is so well
behaved that we can draw a smooth curve
that almost exactly matches the histogram
This makes it very easy to measure
how tall the histogram bars are
 The height of each bar is given by
the curve that matches the histogram
 This allows us to find almost exactly
how much data is in each part of the
distribution
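
As a rough illustration of how the curve gives the bar heights, here is a minimal Python sketch using the standard-library statistics.NormalDist. The mean of 30 comes from the figure caption; the standard deviation of 5 is a hypothetical value chosen for illustration, since the slides do not state what sd is. The helper names bar_height and curve_height are made up for this sketch.

```python
from statistics import NormalDist

# Assumed parameters: mean 30 is taken from the figure caption;
# sd = 5 is a hypothetical value chosen only for illustration.
dist = NormalDist(mu=30, sigma=5)

def bar_height(lo, hi):
    """Height of a density-histogram bar over [lo, hi]:
    the fraction of data in the interval divided by its width."""
    return (dist.cdf(hi) - dist.cdf(lo)) / (hi - lo)

def curve_height(lo, hi):
    """Height of the normal curve at the midpoint of the interval."""
    return dist.pdf((lo + hi) / 2)

# Compare the bar and the curve for each interval of width 5.
for lo in range(10, 50, 5):
    hi = lo + 5
    print(f"{lo:2d}-{hi:2d}: bar {bar_height(lo, hi):.4f}  "
          f"curve {curve_height(lo, hi):.4f}")
```

With these assumed values the two columns stay close to each other, and narrower intervals bring them closer still.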

Where are we on the curve?
[Figure: normal curve with the horizontal axis marked at x̄ - 2σ, x̄ - σ, x̄, x̄ + σ and x̄ + 2σ]
x̄: the mean
x̄ - σ: one standard deviation below
x̄ + σ: one standard deviation above
x̄ - 2σ: 2 standard deviations below
x̄ + 2σ: 2 standard deviations above
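Area under the curve
[Figure: normal curve with the horizontal axis marked from x̄ - 3σ to x̄ + 3σ and the area of each region labelled]
x̄ - 3σ to x̄ - 2σ: 2.35%
x̄ - 2σ to x̄ - σ: 13.5%
x̄ - σ to x̄: 34%
x̄ to x̄ + σ: 34%
x̄ + σ to x̄ + 2σ: 13.5%
x̄ + 2σ to x̄ + 3σ: 2.35%
x̄ - σ to x̄ + σ: 68%
x̄ - 2σ to x̄ + 2σ: 95%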
More Properties
Approximately 68% of the data is
within one standard deviation of the
mean
 Approximately 95% of the data is
within two standard deviations of the
mean
 Approximately 99.7% of the data is
within three standard deviations of the
mean
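
These three percentages can be checked numerically. The sketch below uses Python's standard-library statistics.NormalDist with a standard normal (mean 0, standard deviation 1); the helper name fraction_within is made up for this illustration.

```python
from statistics import NormalDist

# Standard normal: mean 0, standard deviation 1. The fractions are the
# same for any normal distribution once distance from the mean is
# measured in standard deviations.
z = NormalDist(mu=0, sigma=1)

def fraction_within(k):
    """Fraction of a normal distribution lying within k standard
    deviations of the mean."""
    return z.cdf(k) - z.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {fraction_within(k):.1%}")
```

The computed values are about 68.3%, 95.4% and 99.7%, which is why the slide says "approximately".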

Notation: X ~ N(x̄, σ²)

If we want to say "this data is
approximated by the normal
distribution"...

We should also state what the mean
and standard deviation are
X ~ N(x̄, σ²)
X: our data (call it X)
~ N: "is approximated by" the normal distribution
x̄: with this mean
σ²: and this standard deviation (σ) or variance (σ²)
Notation example

The data is normal, and has a mean
of 3 and a standard deviation of 2
X ~ N(3, 2²)

The data is normal, has a mean of
5.4, and a standard deviation of 3
X ~ N(5.4, 9)
Be careful: if there is no square, then the second number is the variance,
and you need to take the square root to get the standard deviation.
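
The same caution applies when the notation is turned into code. In this minimal sketch, Python's standard-library statistics.NormalDist expects the standard deviation, so the variance from the notation has to be square-rooted first. The helper name from_notation is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def from_notation(mean, variance):
    """Build a distribution from X ~ N(mean, variance).
    NormalDist takes the standard deviation, so take the square root."""
    return NormalDist(mu=mean, sigma=sqrt(variance))

a = from_notation(3, 2**2)   # X ~ N(3, 2²): standard deviation 2
b = from_notation(5.4, 9)    # X ~ N(5.4, 9): standard deviation 3

print(a.mean, a.stdev, a.variance)
print(b.mean, b.stdev, b.variance)
```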
Problem Example
Julie is an engineer who is designing roller coasters.
Her roller coaster must have mass restrictions that are
suitable for 95% of the population.
The average adult in North America has a mass of
71.8kg with a standard deviation of 13.6kg.
What range of mass should her ride accommodate?
1. Assume that the masses are normally distributed.
2. 95% of the data will fall within two standard deviations of the mean.
Consequently, the range will be between
71.8 - 2(13.6) = 44.6 kg and 71.8 + 2(13.6) = 99.0 kg
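
The arithmetic can be reproduced with a short Python sketch. It also checks, under the same assumption that masses are normally distributed, how much of the population the mean plus or minus two standard deviations actually covers. The variable names are chosen for this illustration.

```python
from statistics import NormalDist

mean_kg, sd_kg = 71.8, 13.6   # values given in the problem

# Rule-of-thumb range: mean plus or minus two standard deviations.
low = mean_kg - 2 * sd_kg
high = mean_kg + 2 * sd_kg

# Fraction of a normal population that falls inside that range.
masses = NormalDist(mu=mean_kg, sigma=sd_kg)
covered = masses.cdf(high) - masses.cdf(low)

print(f"{low:.1f} kg to {high:.1f} kg covers {covered:.1%} of the population")
```

The coverage comes out slightly above 95%, so the two-standard-deviation range meets the requirement.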
Example 2:
Out of 150 packages of crackers, 97
packages contain between 80 and
100 crackers. Assume a normal
distribution
Estimate:
A) the average number of crackers
B) the standard deviation of the sample
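
One possible way to approach this estimate is sketched below in Python; it is not a solution given in the slides. It assumes the interval from 80 to 100 is centred on the mean, and notes that 97 out of 150 is about 65%, close to the 68% that lies within one standard deviation. The refinement step uses the inverse CDF from the standard-library statistics.NormalDist; all variable names are hypothetical.

```python
from statistics import NormalDist

low, high = 80, 100
fraction = 97 / 150            # about 0.647, close to 68%

# Assumption: the interval 80 to 100 is centred on the mean.
mean_estimate = (low + high) / 2

# Rule of thumb: about 68% of the data lies within one standard
# deviation of the mean, so the half-width is roughly one sd.
sd_rule_of_thumb = (high - low) / 2

# Refinement: find how many standard deviations enclose exactly this
# fraction, then scale the half-width accordingly.
z = NormalDist().inv_cdf(0.5 + fraction / 2)
sd_refined = (high - low) / 2 / z

print(f"mean is about {mean_estimate:.0f} crackers")
print(f"sd is about {sd_rule_of_thumb:.0f} (rule of thumb) "
      f"or {sd_refined:.1f} (refined)")
```

Because the observed fraction is a little under 68%, the refined standard deviation comes out slightly larger than the rule-of-thumb value.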