The Normal Curve - Thomas S. Wallsten

Standard Scores
• Raw scores are standardized by subtracting the
mean and dividing by the standard deviation
For samples $z_i = (X_i - \bar{X}) / s$
For populations $z_i = (X_i - \mu) / \sigma$
– So-called z-scores always have mean 0 and s.d. 1
– Allow score comparison across populations or samples
• Beware, however, if the distributions are different
• One use is to enter the table of the normal distribution
• Another is to construct T-scores
– Indicate how many s.d. units above or below the mean an
observation is
– This linear transformation leaves distribution shape
unchanged
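The standardization described on this slide can be sketched in Python. This is a minimal illustration and not part of the original slides: the raw scores and the helper name `z_scores` are made up for the example, and the sample formula (dividing by s, computed with n - 1) is assumed.

```python
from statistics import mean, stdev

def z_scores(scores):
    """Standardize raw scores using the sample mean and sample s.d. (n - 1)."""
    m = mean(scores)
    s = stdev(scores)
    return [(x - m) / s for x in scores]

# Hypothetical raw scores, chosen only for illustration.
raw = [52, 61, 70, 79, 88]
z = z_scores(raw)

# Whatever the raw scale, the standardized scores have mean 0 and s.d. 1.
print(round(mean(z), 10), round(stdev(z), 10))
```

Because each score is only shifted and rescaled, the ordering and relative spacing of the scores, and hence the shape of the distribution, stay the same.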
The Normal Curve
• Don’t be confused:
– Any set of scores can be converted to z-scores, i.e.
standardized, regardless of their distribution
– The standard normal distribution table is used
only if the distribution is assumed to be normal
• What is the normal curve?
– It is a mathematical equation that provides a
theoretical distribution of scores
– Many variables are approximately normally
distributed
– Sampling distributions of many statistics, very
important to inferential statistics, are
(asymptotically) normally distributed
• You do not need to know the equation, but
– The text gives it as $Y = \frac{N}{\sqrt{2\pi}\,\sigma} e^{-(X-\mu)^2 / (2\sigma^2)}$, where N is
the total frequency
– And refers to the percentage of scores, or percent
of area under the curve, between 2 z-scores,
below a z-score, etc.
• With this equation, the area under the curve is N
• It does not matter that one often does not know what
N is, as we can divide both sides of the equation by N
– Letting f(X) = Y/N, we have
$f(X) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-(X-\mu)^2 / (2\sigma^2)}$
• This is the equation of the normal distribution with mean µ and
standard deviation σ; the standard normal distribution is the
special case µ = 0, σ = 1
• It describes a probability distribution; we will return to
it when we cover probability theory
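The claim that dividing by N leaves a curve with total area 1 can be checked numerically. The sketch below is an illustration only: the values of `mu`, `sigma`, and `N`, the helper names, and the choice of a trapezoid rule over ±10 standard deviations are all assumptions made for the example.

```python
import math

def normal_density(x, mu, sigma):
    """f(X) = 1 / (sqrt(2*pi) * sigma) * exp(-(X - mu)^2 / (2 * sigma^2))."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)

def area_under_curve(mu, sigma, n_steps=20000):
    """Approximate the area under f with the trapezoid rule on mu +/- 10 sigma."""
    lo, hi = mu - 10 * sigma, mu + 10 * sigma
    h = (hi - lo) / n_steps
    total = 0.5 * (normal_density(lo, mu, sigma) + normal_density(hi, mu, sigma))
    for i in range(1, n_steps):
        total += normal_density(lo + i * h, mu, sigma)
    return total * h

# Hypothetical parameters, chosen only for illustration.
mu, sigma, N = 100, 15, 500

area_f = area_under_curve(mu, sigma)  # area under f(X) = Y / N
area_Y = N * area_f                   # area under Y = N * f(X)
print(round(area_f, 6), round(area_Y, 3))
```

The area under f(X) comes out as 1 regardless of the N chosen, which is why N does not need to be known.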
Two Normal Curves
[Figure: "Normal Distribution", f(x) plotted against X for X from 0 to 200, with one curve for sigma=15 and one for sigma=25]
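The effect of sigma on the two curves can be illustrated by comparing their peak heights. This is a rough sketch using Python's standard-library `statistics.NormalDist`; the mean of 100 is an assumption (only the two sigma values are given), although the peak height f(µ) = 1 / (√(2π) σ) does not depend on the mean.

```python
from statistics import NormalDist

mu = 100  # assumed mean; the peak height does not depend on it

# Height of each curve at its mean, where the density is largest.
peaks = {sigma: NormalDist(mu, sigma).pdf(mu) for sigma in (15, 25)}

for sigma, peak in peaks.items():
    print(f"sigma={sigma}: peak height {peak:.4f}")
```

The curve with the larger sigma is lower and wider, since both curves must enclose the same total area.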
Using the Standard Normal Distribution
Table (Areas under the Normal Curve)
• The equation is not needed, because one can
obtain necessary information from the table
[Figure: standard normal curve, f(Z) plotted against Z from -5 to 5, with areas B and C marked]
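For readers without the printed table, the same areas can be computed directly. The sketch below assumes that B means the area between the mean and z, and C the area beyond z, which is how the worked examples use the two columns; the helper name `table_areas` is made up for the illustration.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist(0, 1)

def table_areas(z):
    """Return (B, C) for a z-score.

    B: area between the mean and z.
    C: area beyond z (the tail on the same side as z).
    The curve is symmetric, so only |z| matters.
    """
    below = STANDARD_NORMAL.cdf(abs(z))
    return below - 0.5, 1.0 - below

b, c = table_areas(1.75)
print(f"B = {b:.4f}, C = {c:.4f}")
```

For any z, B and C add up to .5000, the area on one side of the mean.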
Examples
For a certain school district, the mean SAT-Q is 620 with a standard
deviation of 100. Assume the scores are normally distributed.
Q: What is the percentile rank of the score 500?
A: z = (500-620)/100 = -1.20. Area to the left of -1.20 (C in table) is
.1151. Rounding, the score is at the 12th percentile.
Q: What is the percentile rank of the score 795?
A: z = (795-620)/100 = 1.75. Area between mean and z (B in the table)
is .4599. Adding that to area to the left of the mean (.5000), and
rounding, the score is at the 96th percentile.
Q: What percent of scores are between 650 and 750?
A: z1 = (650-620)/100 = 0.30 and z2 = (750-620)/100 = 1.30.
Area1 (B) = .1179 and Area2 (B) = .4032. Fraction = .4032-.1179 =
.2853, rounded to 29%.
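The three SAT-Q answers above can be cross-checked without the table. This is a sketch using Python's standard-library `statistics.NormalDist`, under the same normality assumption as the examples; the variable names are made up for the illustration.

```python
from statistics import NormalDist

# SAT-Q scores in the example: mean 620, standard deviation 100.
sat_q = NormalDist(mu=620, sigma=100)

rank_500 = sat_q.cdf(500)                 # proportion of scores below 500
rank_795 = sat_q.cdf(795)                 # proportion of scores below 795
between = sat_q.cdf(750) - sat_q.cdf(650)  # proportion between 650 and 750

print(f"{rank_500:.4f} {rank_795:.4f} {between:.4f}")
```

The cumulative proportion below a score is what the table method builds from B and C: C alone for a score below the mean, and .5000 + B for a score above it.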
Scores on an aptitude test are normally distributed with
a mean of 100 and a standard deviation of 15.
Q: What score is at the 50th percentile?
A: 100 (How do we know that without any calculation?)
Q: What score is at the 40th percentile?
A: Find the z-score such that C = .40 and B = .10, and take
the negative of that value, because the 40th percentile lies
below the mean. That operation yields z = -0.25.
Next, solve for the score X, using $z = (X - \mu) / \sigma$. Rearrange to
obtain $X = z\sigma + \mu$. Substituting, X = -0.25(15) + 100 = 96.25.
Q: What score is at the 75th percentile?
A: Find the (positive) z-score such that C = .25 and B = .25,
and proceed as above.
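Going from a percentile back to a score can also be sketched with `statistics.NormalDist`, whose `inv_cdf` method returns the score below which a given proportion of the distribution falls. The variable names are made up for the illustration. The results differ slightly from the table method because the table lookup rounds z to two decimals (for example, z = -0.25 rather than about -0.2533).

```python
from statistics import NormalDist

# Aptitude test in the example: mean 100, standard deviation 15.
aptitude = NormalDist(mu=100, sigma=15)

p40 = aptitude.inv_cdf(0.40)  # score at the 40th percentile
p50 = aptitude.inv_cdf(0.50)  # score at the 50th percentile (the mean)
p75 = aptitude.inv_cdf(0.75)  # score at the 75th percentile

print(f"{p40:.2f} {p50:.2f} {p75:.2f}")
```

The same values follow from X = zσ + µ once the z-score for the percentile is known.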