CHAPTER 9
NORMAL DISTRIBUTION
and
SAMPLING DISTRIBUTIONS
I Normal Distribution
A. A Bit of History
1. Abraham DeMoivre’s search for a shortcut
method of computing binomial probabilities
led to the normal distribution.
[Figure: f(X) plotted against Number of Heads, X, for X = 0 to 16]
Figure 1. Normal distribution superimposed on the
probability distribution for tossing 16 fair coins.
As the number, n, of coins increases, the
correspondence between the normal distribution
and the binomial distribution becomes better and
better.
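The comparison in Figure 1 can be reproduced numerically. The following is a minimal sketch, assuming fair coins (p = 0.5) and using the standard library's `statistics.NormalDist` for the normal curve; the helper name `binomial_pmf` is ours, not the chapter's.

```python
import math
from statistics import NormalDist

def binomial_pmf(k, n, p):
    """Exact probability of k heads in n independent tosses."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 16, 0.5
mu = n * p                          # mean of the binomial variable
sigma = math.sqrt(n * p * (1 - p))  # standard deviation of the binomial variable
normal = NormalDist(mu, sigma)

# Print the binomial probability next to the normal height for each X
for k in range(n + 1):
    print(k, round(binomial_pmf(k, n, p), 4), round(normal.pdf(k), 4))

# Largest difference between the two over X = 0, 1, ..., 16
max_gap = max(abs(binomial_pmf(k, n, p) - normal.pdf(k)) for k in range(n + 1))
print(round(max_gap, 4))
```

For 16 coins the two columns differ by less than 0.005 at every value of X.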
B. Function Rule for the Normal Distribution
1. The height of the distribution, f(X), is given by
$$f(X) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(X-\mu)^2/(2\sigma^2)}$$
where $\mu$ and $\sigma^2$ identify a particular normal
distribution,
$\pi$ is approximately 3.1416, and
$e$ is approximately 2.718.
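As a quick check on the function rule, the sketch below codes it directly and compares it with `statistics.NormalDist.pdf` from the Python standard library. The function name `normal_height` is an illustrative choice, and the mean and standard deviation used in the cross-check are borrowed from the chapter's later example.

```python
import math
from statistics import NormalDist

def normal_height(x, mu, sigma):
    """f(X) for a normal distribution with mean mu and standard deviation sigma."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

# Height at the mean of the standard normal distribution (mu = 0, sigma = 1)
peak = normal_height(0.0, 0.0, 1.0)
print(round(peak, 4))  # 0.3989

# Cross-check against the standard library at one point
library_value = NormalDist(mu=103.3, sigma=20.2).pdf(123.5)
agrees = math.isclose(normal_height(123.5, 103.3, 20.2), library_value)
print(agrees)
```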
C. Finding Areas under the Standard Normal
Distribution Using Appendix Table D.2
[Figure: two standard normal curves, f(X) against X, with the z scale marked at 0 and 1.]
a. Area A = area from $\mu$ to $z$
b. Area B = area beyond $z$
1. Converting scores, X, to standard scores, z scores,
$z = (X - \bar{X}) / S$
2. z scores have a mean = 0 and a standard deviation
= 1. The mean and standard deviation are the same
as the mean and standard deviation of the
standard normal distribution.
3. Example of a z score transformation
• Consider a raw score of $X = 123.5$, where
$\bar{X} = 103.3$ and $S = 20.2$.
$$z = \frac{123.5 - 103.3}{20.2} = \frac{20.2}{20.2} = 1.0$$
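The transformation, and the claim that z scores have a mean of 0 and a standard deviation of 1, can be sketched as follows. The list `scores` is hypothetical data made up for illustration, and the standard deviation is computed with divisor n (`statistics.pstdev`), which is an assumption about how S is defined here.

```python
from statistics import fmean, pstdev

def z_score(x, mean, sd):
    """Standard score of x relative to the given mean and standard deviation."""
    return (x - mean) / sd

# The worked example: X = 123.5, mean = 103.3, S = 20.2
z_example = z_score(123.5, 103.3, 20.2)
print(round(z_example, 2))  # 1.0

# Hypothetical scores: after conversion, the mean is 0 and the standard deviation is 1
scores = [80, 95, 100, 110, 125]
mean, sd = fmean(scores), pstdev(scores)
z_scores = [z_score(x, mean, sd) for x in scores]
print(abs(fmean(z_scores)) < 1e-9, abs(pstdev(z_scores) - 1) < 1e-9)
```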
Table 1 Areas under the standard normal
distribution (From Appendix Table D.2)
4. Areas corresponding to columns 2 and 3 of the
standard normal distribution table
[Figure: standard normal curve with Area (column 2) and Area (column 3) marked relative to 0 and $z_\alpha$.]
Figure 2. The subscript $\alpha$ denotes the size of the area
that lies above the z score.
5. Using the standard normal distribution to find the
raw score corresponding to a percentile rank
XX
From z 
, it follows that X  X  Sz
S
Let X  103.3 and S = 20.2. If X is normally
distributed, the raw score corresponding to
the 80th percentile, z.20 = 0.84, is
X  103.3  20.2(0.84)  120.3
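The same lookup can be done without the table. This sketch assumes `statistics.NormalDist.inv_cdf` as a stand-in for reading Appendix Table D.2 in reverse; the variable names are illustrative.

```python
from statistics import NormalDist

mean, sd = 103.3, 20.2

# z score with 80% of the area below it (and 20% above it)
z_80 = NormalDist().inv_cdf(0.80)

# Raw score at the 80th percentile: X = mean + S * z
raw_80 = mean + sd * z_80
print(round(z_80, 2), round(raw_80, 1))  # 0.84 120.3
```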
6. Using the standard normal distribution to find the
proportion of scores between two raw scores
Let $X_1 = 113.4$, $X_2 = 123.5$, $\bar{X} = 103.3$,
and $S = 20.2$.
• First, convert the two scores to z scores:
$$z = \frac{113.4 - 103.3}{20.2} = \frac{10.1}{20.2} = 0.5$$
$$z = \frac{123.5 - 103.3}{20.2} = \frac{20.2}{20.2} = 1.0$$
[Figure: normal curve over X, with area 0.1915 between the mean and z = 0.5 and area 0.3413 between the mean and z = 1.0.]
• Second, determine the proportion of the standard
normal distribution between the mean and z = 0.5
and between the mean and z = 1.0.
• Third, subtract the two areas: 0.3413 – 0.1915 =
0.1498
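The three steps can be collapsed into a difference of two cumulative areas. A minimal sketch, assuming `statistics.NormalDist.cdf` in place of the table:

```python
from statistics import NormalDist

dist = NormalDist(mu=103.3, sigma=20.2)

# Proportion of scores between X1 = 113.4 and X2 = 123.5
proportion = dist.cdf(123.5) - dist.cdf(113.4)
print(round(proportion, 4))
```

The unrounded result is about 0.1499; the table-based answer of 0.1498 differs only because the two table areas are each rounded to four places before subtracting.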
D. Normal Approximation to the Binomial
Distribution
1. If five fair coins are tossed, according to the
binomial function rule, the probability of
observing four or more heads is
p(X = 4 or X = 5) = 5/32 + 1/32 = 0.1875.
2. The normal approximation to the exact probability
is obtained by finding the size of the area above
3.5, the lower bound of 4 heads.
• First, convert 3.5 into a z score, where the mean of
the binomial variable is $np = 5(.5) = 2.5$ and
the standard deviation is
$$\sqrt{npq} = \sqrt{5(.5)(.5)} = 1.118$$
The z score is
$$z = \frac{X - E(X)}{\sigma} = \frac{3.5 - 2.5}{1.118} = 0.894$$
• Second, find the area above z = 0.894.
[Figure: normal curve over Number of heads, X, from 0 to 5, with the area 0.1867 marked above z = 0.894.]
3. The area above the lower limit of four heads, 3.5,
is 0.1867. The approximation to the exact
probability, 0.1875, for n = 5 coins is quite good.
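Both the exact binomial probability and its normal approximation can be computed directly. This is a sketch under the section's assumptions (five fair coins, lower bound 3.5 for four heads); it uses the unrounded z score, so the approximate area comes out slightly different from the table value, which corresponds to z rounded to 0.89.

```python
import math
from statistics import NormalDist

n, p = 5, 0.5

# Exact probability of four or more heads from the binomial function rule
exact = sum(math.comb(n, k) * p**k * (1 - p)**(n - k) for k in (4, 5))

# Normal approximation: area above 3.5, the lower bound of four heads
mu = n * p
sigma = math.sqrt(n * p * (1 - p))
z = (3.5 - mu) / sigma
approx = 1 - NormalDist().cdf(z)

print(exact, round(z, 3), round(approx, 4))
```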
II Interpreting Psychological and Educational
Test Scores in Terms of Percentile Ranks
and Standard Scores
A. Transformation of Test Scores to Percentile
Ranks
1. A percentile rank tells you the percentage of the
scores that fall below a particular score.
2. The percentile rank, PR, of a raw score, denoted by
$P_\%$, is given by
$$PR = \left[\frac{f_i\,(P_\% - X_{ll})}{i} + \sum f_b\right]\frac{100}{n}$$
The computation of PR is illustrated in Chapter 4.
3. Comparison of percentile ranks and raw scores,
where the mean of the raw scores = 100 and the
standard deviation = 15.
[Figure: two scales relating raw scores to percentile ranks.]
Equal raw-score intervals:
Raw Score  = 55   70   85   100  115  130  145
Percentile = 0    2    16   50   84   98   100
Equal percentile intervals (10% each):
Percentile = 0    10   20   30   40   50   60   70   80   90   100
Raw Score  = 55   81   87   92   96   100  104  108  113  119  145
4. The transformation of scores into percentile ranks
alters four characteristics of the score distribution.
• central tendency
• dispersion
• skewness
• kurtosis
5. One characteristic is not altered.
• relative order of the scores
B. Transformation of Test Scores to Standard
Scores
1. A standard score expresses the value of a raw
score relative to the mean and standard deviation
of its distribution.
2. Consider a test with a mean of 100 and standard
deviation of 15. The z score corresponding to a
test score of 130 is
$$z = \frac{X - \bar{X}}{S} = \frac{130 - 100}{15} = 2.0$$
3. A test score of 130 is 2 standard deviations above
the mean. If the distribution is normal, Appendix
Table D.2 tells us that 0.4772 + 0.5000 = 0.9772
of the scores fall below this test score.
4. The transformation of scores into standard scores
alters only two characteristics of the score
distribution.
• central tendency
• dispersion
5. Comparison of standard scores and raw scores,
where the mean of the raw scores = 100 and the
standard deviation = 15.
Raw Score      = 55   70   85   100  115  130  145
Standard Score = –3   –2   –1   0    1    2    3
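The points about standard scores can be checked numerically. The sketch below recomputes the z score and cumulative area for a test score of 130, then uses a small set of hypothetical, positively skewed raw scores to show that standardizing changes the mean and standard deviation but leaves skewness and rank order alone. The `skewness` helper (third standardized moment, divisor n) is our own illustrative definition.

```python
from statistics import NormalDist, fmean, pstdev

def skewness(values):
    """Third standardized moment, computed with divisor n."""
    m, s = fmean(values), pstdev(values)
    return fmean([((v - m) / s) ** 3 for v in values])

# z score and proportion below a test score of 130 (mean 100, SD 15)
z_130 = (130 - 100) / 15
below = NormalDist().cdf(z_130)
print(z_130, round(below, 4))  # 2.0 0.9772

# Hypothetical raw scores, already sorted, with a long upper tail
raw = [70, 85, 90, 95, 100, 100, 105, 115, 140]
m, s = fmean(raw), pstdev(raw)
z = [(x - m) / s for x in raw]

same_shape = abs(skewness(raw) - skewness(z)) < 1e-9   # skewness unchanged
same_order = sorted(z) == z                            # relative order unchanged
print(same_shape, same_order)
```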
6. The transformation of scores into standard scores
does not alter the following characteristics of the
score distribution.
• skewness
• kurtosis
• relative order of the scores
C. Relative Advantages of z Scores and
Percentile Ranks
D. Other Kinds of Standard Scores
1. The z formula produces scores that range from
approximately –3 to +3 and have a mean = 0 and
standard deviation = 1.
2. The z formula can be modified to produce z scores
with any desired mean, $\bar{X}'$, and standard
deviation, $S'$.
$$z' = S'\,\frac{X - \bar{X}}{S} + \bar{X}'$$
3. Scholastic Aptitude Scores (verbal) are obtained
by multiplying z scores by $S' = 100$ and adding
$\bar{X}' = 500$.
$$z' = 100\,\frac{X - \bar{X}}{S} + 500$$
4. IQ scores are obtained by multiplying z scores by
$S' = 15$ and adding $\bar{X}' = 100$.
$$z' = 15\,\frac{X - \bar{X}}{S} + 100$$
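The modified z formula is a one-line function. In this sketch the name `rescale` and its parameter names are illustrative, and the raw score, mean and standard deviation are reused from the earlier z score example (a score exactly one standard deviation above the mean).

```python
def rescale(x, mean, sd, new_mean, new_sd):
    """Modified z formula: z' = S' * (X - mean) / S + new mean."""
    return new_sd * (x - mean) / sd + new_mean

x, mean, sd = 123.5, 103.3, 20.2   # z = 1.0 for this raw score

sat_like = rescale(x, mean, sd, new_mean=500, new_sd=100)
iq_like = rescale(x, mean, sd, new_mean=100, new_sd=15)
print(round(sat_like), round(iq_like))  # 600 115
```

A score one standard deviation above the mean lands one new standard deviation above the new mean on either scale.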
III Sampling Distributions
A. Sampling Distribution of the Mean
1. Frequency distribution of a population with N = 4
scores
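[Figure: frequency distribution with f = 1 for each of X = 1, 2, 3, 4.]
$$\mu = \frac{\sum_{i=1}^{N} X_i}{N} = 2.5$$
$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}} = 1.118$$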
Table 2. All possible random samples of size n = 2
(1)      (2)      (3)            (4)      (5)      (6)
Sample   Sample   $\bar{X}_j$    Sample   Sample   $\bar{X}_j$
Number   Values                  Number   Values
1        1, 1     1.0            9        2, 3     2.5
2        1, 2     1.5            10       3, 2     2.5
3        2, 1     1.5            11       2, 4     3.0
4        1, 3     2.0            12       4, 2     3.0
5        3, 1     2.0            13       3, 3     3.0
6        1, 4     2.5            14       3, 4     3.5
7        4, 1     2.5            15       4, 3     3.5
8        2, 2     2.0            16       4, 4     4.0
[Figure: frequency, f, from 1 to 4, plotted against the sample mean, $\bar{X}$, from 0.5 to 4.5.]
Figure 3. Sampling Distribution of Sample Means
2. Mean and standard deviation of the sampling
distribution
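$$\mu_{\bar{X}} = \frac{\sum_{j=1}^{k} \bar{X}_j}{k} = \frac{1.0 + 1.5 + \cdots + 4.0}{16} = 2.5$$
$$\sigma_{\bar{X}} = \sqrt{\frac{\sum_{j=1}^{k} (\bar{X}_j - \mu_{\bar{X}})^2}{k}} = \sqrt{\frac{(1.0 - 2.5)^2 + (1.5 - 2.5)^2 + \cdots + (4.0 - 2.5)^2}{16}} = 0.791$$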
3. The mean of the 16 sample means is denoted by
$\mu_{\bar{X}}$.
4. The standard deviation of the 16 sample means is
denoted by $\sigma_{\bar{X}}$, and is called the standard error
of the mean to distinguish it from the standard
deviation of scores.
5. Some key points
• The distribution of the 16 sample means does not
resemble the original population, which was
rectangular; instead, it resembles a normal
distribution.
• The mean of the 16 sample means, $\mu_{\bar{X}} = 2.5$,
is equal to the mean of the four scores in the
population, $\mu = 2.5$.
• The standard deviation of the 16 sample means,
$\sigma_{\bar{X}} = 0.791$,
is equal to the standard deviation of the four
scores in the population, $\sigma = 1.118$, divided by
the square root of the sample size, $n = 2$:
$$\sigma_{\bar{X}} = \sigma / \sqrt{n} = 1.118 / \sqrt{2} = 0.791$$
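The sampling distribution for this small population can be enumerated exhaustively. The sketch below assumes the population is the four scores 1, 2, 3, 4 and that samples of size 2 are drawn with replacement, as in Table 2; `itertools.product` generates all 16 ordered samples.

```python
from itertools import product
from math import sqrt
from statistics import fmean, pstdev

population = [1, 2, 3, 4]
n = 2

mu = fmean(population)        # population mean, 2.5
sigma = pstdev(population)    # population standard deviation, about 1.118

# Mean of every possible ordered sample of size n, drawn with replacement
sample_means = [fmean(sample) for sample in product(population, repeat=n)]

mean_of_means = fmean(sample_means)      # mean of the sampling distribution
standard_error = pstdev(sample_means)    # standard error of the mean

print(len(sample_means), mean_of_means)                     # 16 2.5
print(round(standard_error, 3), round(sigma / sqrt(n), 3))  # 0.791 0.791
```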
6. These points are expressed in the central limit
theorem: If random samples are selected from a
population with mean $\mu$ and finite standard
deviation $\sigma$, as the sample size n increases, the
distribution of $\bar{X}$ approaches a normal distribution
with mean $\mu$ and standard deviation $\sigma / \sqrt{n}$.
IV Two Properties of Good Estimators
A. Unbiased Estimator
1. An estimator is unbiased if the expected value of
the estimator is equal to the parameter it estimates.
2. Examples:
$$E(\bar{X}) = \mu$$
$$E(\hat{\sigma}^2) = \sigma^2$$
3. $S^2$ is a biased estimator because
$$E(S^2) \ne \sigma^2$$
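Bias can be seen exactly on a tiny population by averaging each estimator over every possible sample. This sketch assumes the population 1, 2, 3, 4 and samples of size 2 drawn with replacement, and it reads $S^2$ as the sample variance with divisor n and $\hat{\sigma}^2$ as the version with divisor n - 1; that reading is our assumption, chosen because it matches the biased/unbiased statements above.

```python
from itertools import product
from statistics import fmean, pvariance, variance

population = [1, 2, 3, 4]
sigma_sq = pvariance(population)   # population variance

samples = list(product(population, repeat=2))

# Expected value of each estimator = its average over all 16 equally likely samples
expected_S2 = fmean([pvariance(s) for s in samples])         # divisor n
expected_sigma_hat2 = fmean([variance(s) for s in samples])  # divisor n - 1

print(sigma_sq, expected_S2, expected_sigma_hat2)  # 1.25 0.625 1.25
```

The divisor-n estimator averages to half the population variance here, while the divisor n - 1 estimator averages to the population variance itself.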
B. Minimum Variance Estimator
1. An estimator is a minimum variance estimator if
the variance of the estimator is smaller than the
variance of any other unbiased estimator.
2. Example:
$\bar{X}$ is a minimum variance estimator
3. The median also is an unbiased estimator of the
population mean, $E(Mdn) = \mu$, but the median is
not a minimum variance estimator because
$$Var(Mdn) = 1.57\sigma^2 / n > Var(\bar{X}) = \sigma^2 / n.$$
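A small simulation gives a feel for the variance comparison. This is only a sketch: it assumes a normal population, a sample size of 25, 5000 replications and a fixed random seed, all chosen for illustration. The factor 1.57 is a large-sample value for normal populations, so the simulated ratio for a modest n will be in the neighborhood of 1.5 rather than exactly 1.57.

```python
import random
from statistics import fmean, median, pvariance

random.seed(0)            # fixed seed so the run is repeatable
n, reps = 25, 5000        # illustrative sample size and number of samples

means, medians = [], []
for _ in range(reps):
    sample = [random.gauss(0.0, 1.0) for _ in range(n)]
    means.append(fmean(sample))
    medians.append(median(sample))

# How many times larger the variance of the median is than the variance of the mean
ratio = pvariance(medians) / pvariance(means)
print(round(ratio, 2))
```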
V Test Statistics and Sample Statistics
A. Example of a Test Statistic
$$z = \frac{\bar{X} - \mu_0}{\sigma_{\bar{X}}} = \frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}}$$
1. z is used to test the hypothesis that the mean of a
population, $\mu$, is equal to $\mu_0$.
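As a sketch of how the z test statistic is computed, the function below follows the formula term by term. The function name and the numbers plugged in (hypothesized mean 100, population standard deviation 15, sample of 25 with mean 106) are hypothetical values chosen for easy arithmetic.

```python
from math import sqrt

def z_statistic(sample_mean, mu_0, sigma, n):
    """z = (sample mean - mu_0) / (sigma / sqrt(n))."""
    standard_error = sigma / sqrt(n)
    return (sample_mean - mu_0) / standard_error

# Hypothetical values: standard error = 15 / 5 = 3, so z = 6 / 3
z = z_statistic(sample_mean=106, mu_0=100, sigma=15, n=25)
print(z)  # 2.0
```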
2. Comparison of test statistic and z score
$$z = \frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}} \qquad z = \frac{X - \bar{X}}{S}$$
3. The two z’s have the same form
$$z = \frac{\text{Statistic} - \text{Mean of the statistic}}{\text{Standard deviation (error) of the statistic}}$$
B. Other Test Statistics
1. $$t = \frac{\bar{X} - \mu_0}{\hat{\sigma} / \sqrt{n}}$$
2. t is used to test the hypothesis that the mean of a
population, $\mu$, is equal to $\mu_0$ when $\sigma$ is unknown
and must be estimated from sample data.
3. $$F = \frac{\hat{\sigma}_1^2}{\hat{\sigma}_2^2}$$
4. F is used to test the hypothesis that two population
variances, $\sigma_1^2$ and $\sigma_2^2$, are equal.
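The t and F statistics can be sketched the same way. The function names and both data sets are hypothetical, and $\hat{\sigma}$ is taken to be the sample standard deviation with divisor n - 1 (`statistics.stdev`), which is our assumption about the notation.

```python
from math import sqrt
from statistics import fmean, stdev, variance

def t_statistic(sample, mu_0):
    """t = (sample mean - mu_0) / (sigma_hat / sqrt(n)), sigma_hat using divisor n - 1."""
    n = len(sample)
    return (fmean(sample) - mu_0) / (stdev(sample) / sqrt(n))

def f_statistic(sample_1, sample_2):
    """F = sigma_hat_1 squared / sigma_hat_2 squared."""
    return variance(sample_1) / variance(sample_2)

# Hypothetical data
group_1 = [4, 6, 8, 10, 12]   # mean 8, estimated variance 10
group_2 = [6, 7, 8, 9, 10]    # mean 8, estimated variance 2.5

t = t_statistic(group_1, mu_0=6)
F = f_statistic(group_1, group_2)
print(round(t, 3), F)  # 1.414 4.0
```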