3) The Normal Distribution
Normal Distributions
• Normal Distribution – A bell-shaped and
symmetrical theoretical distribution, with the
mean, the median, and the mode all coinciding at
its peak and with frequencies gradually decreasing
at both ends of the curve.
The Normal Distribution
Properties Of Normal Curve
• Normal curves are symmetrical.
• Normal curves are unimodal.
• Normal curves have a bell-shaped form.
• Mean, median, and mode all have the same value.
Normal Distribution
Many sets of data have common characteristics in how they are distributed.
One of the most important probability distributions is the normal distribution.
When a set of data forms a symmetric bell shape when plotted in a histogram,
it is said to be approximately normally distributed.
Example: The results of tossing 8 coins 2540 times were recorded and plotted:
[Figure: histogram of the results; Frequency (0 to 800) against Number of Tails (0 to 8)]
Normal Distribution [cont’d]
You can convert the data in a histogram to a probability distribution:
[Figure: Probability Distribution; Probability (0 to 0.3) against Number of Tails (0 to 8)]
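The conversion from histogram to probability distribution divides each bar's frequency by the total number of trials. A minimal Python sketch of that step follows. The counts are illustrative values chosen to sum to 2540 (the frequencies behind the slide's histogram are not recoverable here), and `to_probabilities` is a hypothetical helper name.

```python
# Illustrative frequencies for 0..8 tails in 2540 tosses of 8 coins.
# ASSUMPTION: these counts are made up for this sketch (rounded binomial
# expectations); they are not the data plotted on the slide.
frequencies = [10, 79, 278, 556, 694, 556, 278, 79, 10]


def to_probabilities(freqs):
    """Convert histogram frequencies to relative frequencies (probabilities)."""
    total = sum(freqs)
    return [f / total for f in freqs]


probabilities = to_probabilities(frequencies)

for tails, p in enumerate(probabilities):
    print(f"{tails} tails: {p:.3f}")
```

The shape of the plot is unchanged; only the vertical scale changes, and the probabilities now sum to 1.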
The Relative Positions of the Mean, Median, and Mode: Symmetric Distribution
[Figure: symmetric curve with Mean = Median = Mode at the peak]
Symmetric distribution: A distribution having the same shape on either side of the center.
Skewed distribution: One whose shapes on either side of the center differ; a nonsymmetrical distribution. Can be positively or negatively skewed, or bimodal.
Shape of a Distribution
• Describes how data are distributed
• Measures of shape
– Symmetric or skewed
Left-Skewed: Mean < Median
Symmetric: Mean = Median
Right-Skewed: Median < Mean
The Relative Positions of the Mean, Median and the Mode
Left-Skewed: Mean < Median < Mode
Symmetric: Mean = Median = Mode
Right-Skewed: Mode < Median < Mean
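The ordering of mean, median, and mode in a skewed distribution can be checked on a small data set. The sketch below uses made-up numbers (an assumption for illustration, not data from the slides) with a long right tail, then mirrors them to get a left-skewed set.

```python
from statistics import mean, median, mode

# ASSUMPTION: a small made-up data set with a long right tail.
right_skewed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 14]

m_mode = mode(right_skewed)
m_median = median(right_skewed)
m_mean = mean(right_skewed)
print("right-skewed:", m_mode, m_median, m_mean)

# Mirroring the values around 15 flips the tail to the left.
left_skewed = [15 - x for x in right_skewed]
l_mode = mode(left_skewed)
l_median = median(left_skewed)
l_mean = mean(left_skewed)
print("left-skewed:", l_mean, l_median, l_mode)
```

In the right-skewed set the few large values pull the mean above the median; in the mirrored set they pull it below.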
More properties of normal curves
The 68-95-99.7 Rule
Standard deviation and the normal distribution
The Empirical Rule
• μ ± 1σ covers about 68% of X’s
• μ ± 2σ covers about 95% of X’s
• μ ± 3σ covers about 99.7% of X’s
[Figure: normal curves with the regions μ ± 2σ (95.44%) and μ ± 3σ (99.72%) marked]
Percent of Values Within One Standard Deviation: 68.26% of Cases
Percent of Values Within Two Standard Deviations: 95.44% of Cases
Percent of Values Within Three Standard Deviations: 99.72% of Cases
The Normal Distribution Curve
Basic Properties of the Normal Distribution Curve:
• The total area under the curve is 1.
• It is symmetrical about the mean.
• Approximately 68.3% of the data lies within 1 standard
deviation of the mean.
• Approximately 95.4% of the data lies within 2 standard
deviations of the mean.
• Approximately 99.7% of the data lies within 3 standard
deviations of the mean.
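The three percentages in this list can be reproduced numerically. The following is a minimal sketch using `NormalDist` from Python's standard `statistics` module; `proportion_within` is a hypothetical helper name introduced only for this example.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)


def proportion_within(k):
    """Area under the standard normal curve between -k and +k standard deviations."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {proportion_within(k):.1%}")
```

Because every normal curve has the same shape once measured in standard deviations, the same proportions hold for any mean and standard deviation.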
Standard Scores
• One use of the normal curve is to explore Standard
Scores. Standard Scores are expressed in standard
deviation units, making it much easier to compare
variables measured on different scales.
• There are many kinds of Standard Scores. The
most common standard score is the ‘z’ score.
• A ‘z’ score states the number of standard
deviations by which the original score lies above
or below the mean of a normal curve.
The Standard Normal Curve
• The Standard Normal Curve (z distribution)
is the distribution of normally distributed
standard scores with mean equal to zero and
a standard deviation of one.
• A z score is nothing more than a figure,
which represents how many standard
deviation units a raw score is away from the
mean.
An entire population of scores is transformed
into z-scores. The transformation does not
change the shape of the population but the
mean is transformed into a value of 0 and the
standard deviation is transformed to a value of
1.
Following a z-score transformation, the X-axis is relabeled
in z-score units. The distance that is equivalent to 1
standard deviation on the X-axis (σ = 10 points in this
example) corresponds to 1 point on the z-score scale.
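The effect of transforming a whole population into z-scores can be sketched in a few lines of Python. The scores below are made up for illustration (an assumption, not data from the slides); the point is only that the transformed values have mean 0 and standard deviation 1 while keeping their relative order.

```python
from statistics import mean, pstdev

# ASSUMPTION: a made-up population of raw scores, for illustration only.
scores = [40, 45, 50, 50, 55, 60, 70]

mu = mean(scores)
sigma = pstdev(scores)  # population standard deviation

# Subtract the mean and divide by the standard deviation, score by score.
z_scores = [(x - mu) / sigma for x in scores]

print("mean of z-scores:", round(mean(z_scores), 6))
print("standard deviation of z-scores:", round(pstdev(z_scores), 6))
```

Up to floating-point rounding, the printed mean is 0 and the printed standard deviation is 1.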
The Standardized Normal
• Any normal distribution (with any mean and standard deviation combination) can be transformed into the standardized normal distribution (Z)
• Need to transform X units into Z units
Z Scores Help in Comparisons
• One method to interpret the raw score is to
transform it to a z score.
• The advantage of the z score transformation
is that it takes into account both the mean
value and the variability in a set of raw
scores.
Translation to the Standardized Normal
Distribution
• Translate from X to the standardized normal (the “Z” distribution) by subtracting the mean of X and dividing by its standard deviation:
$$Z = \frac{X - \mu}{\sigma}$$
Z always has mean = 0 and standard deviation = 1
The Z-score
The z-score is a conversion of the raw
score into a standard score based on the
mean and the standard deviation.
z-score = (Raw score – Mean) / Standard Deviation
Example
Mean = 55
Standard Deviation = 15
Raw Score = 65
z-score = (65 – 55) / 15 = 0.67
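The formula translates directly into code. A minimal sketch, where `z_score` is a hypothetical function name chosen for this example:

```python
def z_score(raw_score, mean, standard_deviation):
    """Standard deviations by which raw_score lies above (+) or below (-) the mean."""
    if standard_deviation <= 0:
        raise ValueError("standard deviation must be positive")
    return (raw_score - mean) / standard_deviation


# The slide's example: raw score 65, mean 55, standard deviation 15.
z = z_score(65, 55, 15)
print(round(z, 2))  # 0.67
```

A raw score below the mean gives a negative result; for instance `z_score(45, 55, 15)` is the same distance from the mean in the other direction.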
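Standardize the Normal Distribution
$$Z = \frac{X - \mu}{\sigma}$$
[Figure: a normal distribution of X with mean μ and standard deviation σ, mapped to the standardized normal distribution of Z with mean μ = 0 and standard deviation σ = 1]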
• The sample z-score is calculated by subtracting the sample mean from the individual raw score and then dividing by the sample standard deviation.
$$z = \frac{x - \bar{x}}{s}$$
Where: $x$ is the individual score
$\bar{x}$ is the sample mean
$s$ is the sample standard deviation.
• The population z-score is calculated by subtracting the population mean from the individual raw score and then dividing by the population standard deviation.
$$Z = \frac{X - \mu}{\sigma}$$
Where: $X$ is the individual score
$\mu$ is the population mean
$\sigma$ is the population standard deviation
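The two formulas differ only in which standard deviation goes in the denominator. The sketch below shows the difference using Python's `statistics.stdev` (sample, divides by n − 1) and `statistics.pstdev` (population, divides by n). The scores are made up for illustration, and treating the same numbers both ways is an assumption made purely for comparison.

```python
from statistics import mean, pstdev, stdev

# ASSUMPTION: made-up scores, for illustration only.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

center = mean(scores)
s = stdev(scores)       # sample standard deviation (divides by n - 1)
sigma = pstdev(scores)  # population standard deviation (divides by n)

x = 9  # the individual score to standardize
z_sample = (x - center) / s
z_population = (x - center) / sigma

print(f"sample z-score:     {z_sample:.3f}")
print(f"population z-score: {z_population:.3f}")
```

Since the sample standard deviation is slightly larger than the population one for the same numbers, the sample z-score comes out slightly closer to zero.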
Z-Scores and the Normal Distribution
• If we have a normal distribution we can make
the following assumptions.
• Approximately 68% of the scores are between
a z-score of 1 and -1.
• Approximately 95% of the scores will be
between a z-score of 2 and -2.
• Approximately 99.7% of the scores will be
between a z-score of 3 and -3.
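[Figure: normal curve over x, divided at the mean]
• Total Area = 1; this represents 100% of the data set.
• The area to the left of the mean (0.5) represents those scores below the mean, i.e., 50% of the data set.
• The area to the right of the mean (0.5) represents those scores above the mean, i.e., 50% of the data set.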
• Looking up the z-score for x in a table of areas gives the area between x and the mean. This
represents the percentage of those in the data set that score
between x and the mean. To get the percentile for an x above the mean, we add
this area to the 0.5 that lies below the mean; for an x below the mean, we subtract it from 0.5.
Example
• If X is distributed normally with mean, μ, of 100 and standard deviation, σ, of 50, the Z value for X = 200 is
$$Z = \frac{X - \mu}{\sigma} = \frac{200 - 100}{50} = 2.0$$
• This says that X = 200 is two standard deviations above the mean of 100.
Application of 68-95-99.7 rule
• Male height has a Normal distribution with μ = 70.0
inches and σ = 2.8 inches
• Notation: Let X ≡ male height; X~ N(μ = 70, σ = 2.8)
68-95-99.7 rule
• 68% in µ ± σ = 70.0 ± 2.8 = 67.2 to 72.8
• 95% in µ ± 2σ = 70.0 ± 2(2.8) = 64.4 to 75.6
• 99.7% in µ ± 3σ = 70.0 ± 3(2.8) = 61.6 to 78.4
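The three height intervals are plain arithmetic on μ and σ. A short sketch, with `empirical_intervals` as a hypothetical helper name:

```python
def empirical_intervals(mu, sigma):
    """Return {k: (mu - k*sigma, mu + k*sigma)} for k = 1, 2, 3."""
    return {k: (mu - k * sigma, mu + k * sigma) for k in (1, 2, 3)}


# Male height from the slide: mean 70.0 inches, standard deviation 2.8 inches.
intervals = empirical_intervals(70.0, 2.8)
shares = {1: "68%", 2: "95%", 3: "99.7%"}

for k, (low, high) in intervals.items():
    print(f"{shares[k]} of men: {low:.1f} to {high:.1f} inches")
```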
Application: 68-95-99.7 Rule
What proportion of men are less than 72.8 inches tall?
μ + σ = 70 + 2.8 = 72.8 (i.e., 72.8 is one σ above μ)
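[Figure: normal curve of height with 68% between z = -1 and z = +1 (by 68-95-99.7 Rule) and 16% in each tail (total AUC = 100%); 70 and 72.8 marked on the height axis]
The remaining 100% - 68% = 32% is split equally between the two tails, 16% each, so the area to the left of 72.8 is 68% + 16% = 84%.
Therefore, 84% of men are less than 72.8” tall.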
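The 84% figure comes from the rounded 68-95-99.7 rule. As a check, the cumulative proportion can be computed directly. This is a sketch using `NormalDist` from Python's standard library, with the slide's μ = 70 and σ = 2.8.

```python
from statistics import NormalDist

# Male height model from the slide: X ~ N(mu = 70, sigma = 2.8).
height = NormalDist(mu=70.0, sigma=2.8)

# Proportion of men shorter than 72.8 inches (one sigma above the mean).
below_72_8 = height.cdf(72.8)
print(f"{below_72_8:.4f}")
```

The computed value is about 0.8413, which agrees with the rule-of-thumb answer of 84%.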
Finding Normal proportions
What proportion of men are less than 68” tall? This is
equal to the AUC to the left of 68 on X~N(70,2.8)
[Figure: normal curve of height values with the area to the left of 68 marked “?”; 68 and 70 marked on the axis]
To answer this question, first determine the z-score
for a value of 68 from X~N(70,2.8)
Z score
$$z = \frac{x - \mu}{\sigma}$$
• The z-score tells you how many standard deviations
the value falls below (negative z score) or above
(positive z score) the mean μ
• The z-score of 68 when X~N(70,2.8) is:
$$z = \frac{x - \mu}{\sigma} = \frac{68 - 70}{2.8} = -0.71$$
Thus, 68 is 0.71 standard deviations below μ.
Example: z score and associated value
[Figure: normal curve with the area to the left of 68 marked “?”; height values 68 and 70 correspond to z values -0.71 and 0]
Normal Cumulative Proportions (Table A)
z       .00     .01     .02
-0.8    .2119   .2090   .2061
-0.7    .2420   .2389   .2358
-0.6    .2743   .2709   .2676
Thus, a z score of -0.71 has a cumulative proportion of .2389
5/8/2017 Chapter 3
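The table lookup can be mirrored in code. The sketch below uses `NormalDist().cdf` from Python's standard library in place of Table A, first with z rounded to two decimals as a printed table requires, then without rounding; the variable names are chosen for this example only.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

# Round z to two decimals, as one does before a table lookup.
z = round((68 - 70) / 2.8, 2)
table_style = standard_normal.cdf(z)

# Without rounding z, working directly on the height scale.
unrounded = NormalDist(mu=70, sigma=2.8).cdf(68)

print(f"z = {z}")
print(f"cumulative proportion at rounded z: {table_style:.4f}")
print(f"cumulative proportion, unrounded:   {unrounded:.4f}")
```

The two results differ slightly in the third decimal place because the table method rounds z before the lookup.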
Area to the right
(“greater than”)
Since the total AUC = 1:
AUC to the right = 1 – AUC to left
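Example: What % of men are greater than 68” tall?
AUC to the right of 68 = 1 - .2389 = .7611
[Figure: normal curve with .2389 to the left of 68 and .7611 to the right; height values 68 and 70 correspond to z values -0.71 and 0]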
Finding the Area Under the Curve
1. Find the area between z-scores -1.22 and 1.44.
The area for z-score -1.22 is 0.1112.
The area for z-score 1.44 is 0.9251.
Therefore, the area between z-scores -1.22 and 1.44 is 0.9251 - 0.1112 = 0.8139.
Finding the Area Under the Curve
2. Find the area between the mean and z-score -1.78.
The area for z-score -1.78 is 0.0375.
Therefore, the area between the mean and z-score -1.78 is 0.5 - 0.0375 = 0.4625.
3. Find the area between the mean and z-score 1.78.
The area for z-score 1.78 is 0.9625.
Therefore, the area between the mean and z-score 1.78 is 0.9625 - 0.5 = 0.4625.
Finding the Area Under the Curve
4. Find the area greater than z-score -0.68.
The area for z-score -0.68 is 0.2483.
Therefore, the area greater than z-score -0.68 is 1 - 0.2483 = 0.7517.
5. Find the area greater than z-score 1.40.
The area for z-score 1.40 is 0.9192.
Therefore, the area greater than z-score 1.40 is 1 - 0.9192 = 0.0808.
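The five exercises above follow three patterns: subtract two cumulative areas, measure from 0.5 at the mean, or take the complement. A sketch that reproduces them with `NormalDist().cdf` from Python's standard library standing in for the printed table (the name `phi` is chosen here for brevity):

```python
from statistics import NormalDist

phi = NormalDist().cdf  # cumulative area to the left of a z-score

areas = {
    "1. between -1.22 and 1.44": phi(1.44) - phi(-1.22),
    "2. between the mean and -1.78": 0.5 - phi(-1.78),
    "3. between the mean and 1.78": phi(1.78) - 0.5,
    "4. greater than -0.68": 1 - phi(-0.68),
    "5. greater than 1.40": 1 - phi(1.40),
}

for label, area in areas.items():
    print(f"{label}: {area:.4f}")
```

Results can differ from hand calculations in the last decimal place, because a printed table rounds each area to four decimals before the subtraction.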