The Standard Normal Distribution

Normal Distributions
Heibatollah Baghi, and
Mastee Badii, integrated with
another anonymous PowerPoint by
Linda Imhoff
The Normal Curve
• A mathematical model, or an
idealized conception, of the form a
distribution might take under
certain circumstances.
– The sample mean of any distribution with finite
variance has an approximately Normal distribution
when the sample is large (Central Limit Theorem)
– Many observations (height of adults,
weight of children in California,
intelligence) have approximately Normal distributions
• Shape
– Bell shaped graph, most of data in
middle
– Symmetric, with mean, median and
mode at same point
Percent of Values Within One
Standard Deviation
68.27% of Cases
Percent of Values Within Two
Standard Deviations
95.45% of Cases
Percent of Values Within Three
Standard Deviations
99.73% of Cases
The Normal Distribution
• Changing μ shifts the distribution left or right.
• Changing σ increases or decreases the spread.
[Figure: normal density curves, f(X) plotted against X]
Percent of Values Greater than
1 Standard Deviation
Percent of Values Greater than
-2 Standard Deviations
The beauty of the normal curve:
No matter what μ and σ are, the area between μ-σ and
μ+σ is about 68%; the area between μ-2σ and μ+2σ is
about 95%; and the area between μ-3σ and μ+3σ is
about 99.7%. Almost all values fall within 3 standard
deviations.
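The claim that these areas do not depend on μ and σ can be checked numerically. The following is a minimal sketch using `NormalDist` from Python's standard `statistics` module; the helper name `area_within` and the (μ, σ) pairs are chosen only for illustration.

```python
from statistics import NormalDist

def area_within(mu, sigma, k):
    """Area under Normal(mu, sigma) between mu - k*sigma and mu + k*sigma."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)

# Illustrative (mu, sigma) pairs; any positive sigma gives the same areas.
for mu, sigma in [(0, 1), (500, 50), (127.8, 15.5)]:
    areas = [round(area_within(mu, sigma, k), 4) for k in (1, 2, 3)]
    print(mu, sigma, areas)  # each row: [0.6827, 0.9545, 0.9973]
```

Each row shows the same three areas because standardizing removes μ and σ from the calculation.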
Percent of Values Greater than +2
Standard Deviations
Data in Normal Distribution
(X̄ ± 1S) contains about 68% of the scores
(X̄ ± 2S) contains about 95% of the scores
(X̄ ± 3S) contains about 99.7% of the scores
Properties Of Normal Curve
• Normal curves are symmetrical.
• Normal curves are unimodal.
• Normal curves have a bell-shaped form.
• Mean, median, and mode all have the same
value.
Are my data normally
distributed?
1. Look at the histogram! Does it appear bell shaped?
2. Compute descriptive summary measures—are
mean, median, and mode similar?
3. Do 2/3 of observations lie within 1 std dev of the
mean? Do 95% of observations lie within 2 std dev
of the mean?
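Checks 2 and 3 can be sketched in a few lines of Python. This is only an illustration: the function name `empirical_rule_check` and the sample values are hypothetical, the mode is left out because it is rarely informative for continuous measurements, and the histogram in step 1 still has to be inspected by eye.

```python
from statistics import mean, median, stdev

def empirical_rule_check(data):
    """Summarize how closely a sample follows the 68/95 rule of thumb."""
    m = mean(data)
    s = stdev(data)  # sample standard deviation
    n = len(data)
    within_1 = sum(1 for x in data if abs(x - m) <= s) / n
    within_2 = sum(1 for x in data if abs(x - m) <= 2 * s) / n
    return {
        "mean": m,
        "median": median(data),
        "sd": s,
        "within_1_sd": within_1,  # compare with about 2/3
        "within_2_sd": within_2,  # compare with about 95%
    }

# Hypothetical sample, made up for this sketch.
sample = [112, 118, 121, 124, 125, 127, 128, 130, 131, 134, 137, 143]
summary = empirical_rule_check(sample)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

With small samples the proportions can differ noticeably from 68% and 95% even when the underlying distribution is normal, so the result is a rough screen rather than a formal test.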
How good is the rule for real data?
Check some example data (weights of 120 women runners):
The mean weight of the women = 127.8 lbs
The standard deviation (SD) = 15.5 lbs
68% of 120 = .68 x 120 = ~ 82 runners
In fact, 79 runners fall within 1 SD (15.5 lbs) of the mean.
[Figure: histogram of runners' weights (x-axis: pounds, 80 to 160; y-axis: percent, 0 to 25), with 112.3, 127.8, and 143.3 marked]
95% of 120 = .95 x 120 = ~ 114 runners
In fact, 115 runners fall within 2-SD’s of the mean.
[Figure: histogram of runners' weights (x-axis: pounds, 80 to 160; y-axis: percent, 0 to 25), with 96.8, 127.8, and 158.8 marked]
99.7% of 120 = .997 x 120 = 119.6 runners
In fact, all 120 runners fall within 3-SD’s of the mean.
[Figure: histogram of runners' weights (x-axis: pounds, 80 to 160; y-axis: percent, 0 to 25), with 81.3, 127.8, and 174.3 marked]
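The comparison made on these slides can be tabulated in one place. The sketch below uses only numbers quoted above (120 runners; 79, 115, and 120 observed within 1, 2, and 3 SDs); the variable names are illustrative.

```python
n_runners = 120
rule = {1: 0.68, 2: 0.95, 3: 0.997}   # rounded rule-of-thumb proportions
observed = {1: 79, 2: 115, 3: 120}    # counts reported on the slides

comparison = {}
for k in (1, 2, 3):
    expected = rule[k] * n_runners
    share = observed[k] / n_runners
    comparison[k] = (expected, observed[k], share)
    print(f"within {k} SD: expected {expected:.1f}, "
          f"observed {observed[k]} ({share:.1%})")
```

The observed shares sit within a few percentage points of the rule at every level, which is about as close as a sample of 120 can be expected to get.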
Example
• Suppose SAT scores roughly follow a
normal distribution in the U.S. population of
college-bound students (with range
restricted to 200-800), and the average math
SAT is 500 with a standard deviation of 50.
Then:
– 68% of students will have scores between 450
and 550
– 95% will be between 400 and 600
– 99.7% will be between 350 and 650
Example
• BUT…
• What if you wanted to know the math SAT
score corresponding to the 90th percentile
(=90% of students are lower)?
$$ P(X \le Q) = \int_{200}^{Q} \frac{1}{50\sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{x-500}{50}\right)^{2}} \, dx = 0.90 $$
Solve for Q?….Yikes!
Standard Scores
• One use of the normal curve is to explore Standard
Scores. Standard Scores are expressed in standard
deviation units, making it much easier to compare
variables measured on different scales.
• There are many kinds of Standard Scores. The
most common standard score is the ‘z’ score.
• A ‘z’ score states the number of standard
deviations by which the original score lies above
or below the mean of a normal curve.
The Z Score
• The normal curve is not a single curve but a
family of curves, each of which is
determined by its mean and standard
deviation.
• In order to work with a variety of normal
curves, we cannot have a table for every
possible combination of means and standard
deviations.
The Z Score
• What we need is a standardized normal
curve which can be used for any normally
distributed variable. Such a curve is called
the Standard Normal Curve.
$$ z = \frac{X_i - \bar{X}}{S} $$
The Standard Normal Distribution (Z)
All normal distributions can be converted into
the standard normal curve by subtracting the
mean and dividing by the standard deviation:
$$ Z = \frac{X - \mu}{\sigma} $$
Somebody calculated all the integrals for the standard
normal and put them in a table! So we never have to
integrate!
Even better, computers now do all the integration.
The Standard Normal Curve
• The Standard Normal Curve (z distribution)
is the distribution of normally distributed
standard scores with mean equal to zero and
a standard deviation of one.
• A z score is nothing more than a figure,
which represents how many standard
deviation units a raw score is away from the
mean.
Example Z Score
• For scores above the mean, the z score has a
positive sign. Example: z = +1.5.
• Below the mean, the z score has a minus
sign. Example: z = -0.5.
• Calculate Z score for blood pressure of 140
if the sample mean is 110 and the standard
deviation is 10
• Z = (140 – 110) / 10 = 3
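The blood pressure calculation can be written as a small reusable function. This is a sketch; the name `z_score` is chosen for illustration.

```python
def z_score(x, mean, sd):
    """Number of standard deviations by which x lies above (+) or below (-) the mean."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / sd

bp_z = z_score(140, mean=110, sd=10)
print(bp_z)  # 3.0
```

The subtraction happens before the division, which is why the parentheses in (140 – 110) / 10 matter.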
Comparing Scores from
Different Distributions
• Interpreting a raw score requires additional
information about the entire distribution. In most
situations, we need some idea about the mean
score and an indication of how much the scores
vary.
• For example, assume that an individual took two
tests in reading and mathematics. The reading
score was 32 and mathematics was 48. Is it correct
to say that performance in mathematics was better
than in reading?
Z Scores Help in Comparisons
• Not without additional information. One
method to interpret the raw score is to
transform it to a z score.
• The advantage of the z score transformation
is that it takes into account both the mean
value and the variability in a set of raw
scores.
Did Sara improve?
• Score in pretest was 18 and post test was
42
• Sara’s score did increase, from 18 to 42.
• But her relative position in the class
decreased: her z score fell from 0.33 to -0.14.

                      Pretest   Post test
Observation              18        42
Mean                     17        49
Standard deviation        3        49
Z score                 0.33     -0.14
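Sara's two z scores can be reproduced from the table values. The sketch below assumes the means and standard deviations shown on the slide; the function and variable names are illustrative.

```python
def z_score(x, mean, sd):
    """Standardize a raw score against its own distribution."""
    return (x - mean) / sd

pre_z = z_score(18, mean=17, sd=3)    # pretest
post_z = z_score(42, mean=49, sd=49)  # post test

print(round(pre_z, 2), round(post_z, 2))  # 0.33 -0.14
print(post_z < pre_z)                     # True
```

The raw score rose, but the z score dropped, because the class mean rose even more than Sara's score did.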
Area When Score is Known
• For a normal distribution with mean of 100
and standard deviation of 20, what
proportion of cases fall below 80?
• About 16% (z = (80 - 100) / 20 = -1.0)
Score When Area Is Known
• For a normal distribution with mean of 100
and standard deviation of 20, find the score
that separates the upper 20% of the cases
from the lower 80%
• Answer = 116.8 (z ≈ 0.84, so 100 + 0.84 x 20 = 116.8)
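Both lookups, area from a score and score from an area, can be done without a z table. A minimal sketch with the standard `statistics.NormalDist`, using the mean of 100 and SD of 20 from these two slides (variable names are illustrative):

```python
from statistics import NormalDist

dist = NormalDist(mu=100, sigma=20)

# Area when score is known: proportion of cases below 80.
below_80 = dist.cdf(80)
print(round(below_80, 4))  # 0.1587

# Score when area is known: cutoff with 80% of cases below it.
cutoff = dist.inv_cdf(0.80)
print(round(cutoff, 1))    # 116.8
```

`cdf` plays the role of reading the table forward, and `inv_cdf` the role of reading it backward.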
Transforming Standard Scores
• Sometimes it is more convenient to work with
standard scores that do not have negative numbers
or decimals.
• Standard scores can be transformed to have any
desired mean and standard deviation.
• SAT and GRE are transformed scores (similar to
z) with a mean of 500 and an SD of 100
– (z x 100) + 500
• Widely used cognitive and personality tests
(e.g., the Wechsler IQ test) are standardized to have a
mean of 100 and an SD of 15
– (z x 15) + 100
Transforming a raw score of 12 on
Behavioral Problem Index
• Age 5: Mean: 10.0, SD: 2.0
• Age 6: Mean: 12.0, SD: 3.0
• Age 7: Mean: 14.0, SD: 3.0
• Age 5: Z = (12-10) / 2 = 1.0
• Age 6: Z = (12-12) / 3 = 0.0
• Age 7: Z = (12-14) / 3 = -0.67
• Age 5: Standard score (mean 100, SD 15) = (1.0 x 15) + 100 = 115
• Age 6: Standard score (mean 100, SD 15) = (0.0 x 15) + 100 = 100
• Age 7: Standard score (mean 100, SD 15) = (-0.67 x 15) + 100 = 90
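The two-step transformation (raw score to z, then z to a scale with mean 100 and SD 15) can be sketched as follows. The age norms are the ones listed on the slides; the names `norms` and `to_standard` are illustrative.

```python
# Age-specific norms from the slides: age -> (mean, SD).
norms = {5: (10.0, 2.0), 6: (12.0, 3.0), 7: (14.0, 3.0)}
raw = 12

def to_standard(raw, mean, sd, new_mean=100, new_sd=15):
    """Return (z, standard score) for a raw score under the given norms."""
    z = (raw - mean) / sd
    return z, z * new_sd + new_mean

results = {age: to_standard(raw, m, s) for age, (m, s) in norms.items()}
for age, (z, standard) in results.items():
    print(age, round(z, 2), round(standard))
# 5 1.0 115
# 6 0.0 100
# 7 -0.67 90
```

The same raw score of 12 is above average for a 5-year-old, average for a 6-year-old, and below average for a 7-year-old.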
Other Standard Scores
• A T score is created from a z score simply
by multiplying each z score
by 10 to get rid of the decimals, and then
adding 50 to each of these scores to get rid
of the negatives.
• Now the mean becomes 50 ([10*0] + 50 =
50).
• A z score of +1 becomes 60 ([10*1] + 50 = 60).
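Every transformed standard score is the same linear rescaling of z with a different target mean and SD. A minimal sketch, with the illustrative names `rescale` and `t_score`:

```python
def rescale(z, new_mean, new_sd):
    """Linearly map a z score onto a scale with the given mean and SD."""
    return z * new_sd + new_mean

def t_score(z):
    """T score: mean 50, SD 10."""
    return rescale(z, new_mean=50, new_sd=10)

print(t_score(0))     # 50
print(t_score(1))     # 60
print(t_score(-1.5))  # 35.0
```

Calling `rescale(z, 100, 15)` gives the IQ-style scale from the earlier slides in the same way.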
Multiple Transformation of Data
The Normal Curve & Probability
• The normal curve also is central to many aspects
of inferential statistics. This is because the normal
curve can be used to answer questions concerning
the probability of events.
• For example, by knowing that 16% of adults have
a Wechsler IQ greater than 115 (z = +1.00), one
can state the probability (p) of randomly selecting
from the adult population a person whose IQ is
greater than 115.
• You are correct if you suspect that p is .16.
Data on the IQ Scores of 1000 Sixth Grade
Children
The Normal Curve & Probability
• The mean of the distribution is 100 and the
SD is 15
• What is the probability that a randomly
selected student from this population would
have an IQ score of 115 or greater?
• Approximately .16
• 16 percent of the total area under the curve
in the distribution
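The link between area under the curve and the probability of a random draw can be illustrated by simulation. The sketch below assumes IQ scores follow a normal distribution with mean 100 and SD 15, as stated above; the seed, the number of draws, and the variable names are arbitrary choices for illustration.

```python
import random
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=15)

# Exact upper-tail area: P(IQ >= 115).
exact = 1 - iq.cdf(115)

# Simulated random selection from the same population.
rng = random.Random(0)  # fixed seed so the run is repeatable
n_draws = 100_000
draws = [rng.gauss(100, 15) for _ in range(n_draws)]
simulated = sum(x >= 115 for x in draws) / n_draws

print(round(exact, 4))  # 0.1587
print(simulated)        # close to the exact value
```

The simulated proportion approaches the area under the curve as the number of draws grows, which is what justifies reading areas as probabilities.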