Ismor Fischer, 5/29/2012
5.2 Formal Statement and Examples
Sampling Distribution of a Normal Variable
Given a random variable X. Suppose that the population distribution of X is known to be normal, with mean µ and variance σ², that is, X ~ N(µ, σ). Then, for any sample size n, it follows that the sampling distribution of the sample mean X̄ is normal, with mean µ and variance σ²/n, that is, X̄ ~ N(µ, σ/√n).

The quantity σ/√n is called the "standard error of the mean," denoted SEM, or more simply, s.e.

• The corresponding Z-score transformation formula is Z = (X̄ − µ)/(σ/√n) ~ N(0, 1).
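Both formulas are straightforward to check numerically. The sketch below uses the illustrative values µ = 27, σ = 12, n = 36 from the age example that follows:

```python
import math

def standard_error(sigma, n):
    """Standard error of the mean: SEM = sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

def z_score(x_bar, mu, sigma, n):
    """Z-transform of a sample mean: Z = (x_bar - mu) / (sigma / sqrt(n))."""
    return (x_bar - mu) / standard_error(sigma, n)

print(standard_error(12.0, 36))       # 2.0
print(z_score(30.0, 27.0, 12.0, 36))  # 1.5
```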
Example: Suppose that the ages X of a certain population are normally distributed,
with mean µ = 27.0 years, and standard deviation σ = 12.0 years, i.e., X ~ N(27, 12).
The probability that the age of a single randomly selected individual is less than 30 years is P(X < 30) = P(Z < (30 − 27)/12) = P(Z < 0.25) = 0.5987.

In this population, the probability that the average age of 36 random people is under 30 years old is much greater than the probability that the age of one random person is under 30 years old.
Now consider all random samples of size n = 36 taken from this population. By the above, their mean ages X̄ are also normally distributed, with mean µ = 27 yrs as before, but with standard error σ/√n = 12 yrs / √36 = 2 yrs. That is, X̄ ~ N(27, 2).
Exercise: Compare the two probabilities of being under 24 years old.

The probability that the mean age of a single sample of n = 36 randomly selected individuals is less than 30 years is P(X̄ < 30) = P(Z < (30 − 27)/2) = P(Z < 1.5) = 0.9332.

Exercise: Compare the two probabilities of being between 24 and 30 years old.

[Figures: normal curves for X ~ N(27, 12) and X̄ ~ N(27, 2), each with µ = 27 marked and the region to the left of 30 shaded.]
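Both probabilities can be reproduced from the standard normal CDF, which is expressible through the error function; a minimal sketch:

```python
import math

def phi(z):
    """Standard normal CDF, Phi(z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# One individual, X ~ N(27, 12):  P(X < 30) = P(Z < 0.25)
p_individual = phi((30 - 27) / 12)
# Mean of n = 36, X-bar ~ N(27, 2):  P(X-bar < 30) = P(Z < 1.5)
p_mean = phi((30 - 27) / 2)

print(round(p_individual, 4))  # 0.5987
print(round(p_mean, 4))        # 0.9332
```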

• If X ~ N(µ, σ) approximately, then X̄ ~ N(µ, σ/√n) approximately. (The larger the value of n, the better the approximation.) In fact, more is true...

IMPORTANT GENERALIZATION:

The Central Limit Theorem

Given any random variable X, discrete or continuous, with finite mean µ and finite variance σ². Then, regardless of the shape of the population distribution of X, as the sample size n gets larger, the sampling distribution of X̄ becomes increasingly closer to normal, with mean µ and variance σ²/n; that is, X̄ ~ N(µ, σ/√n), approximately.

• More formally, Z = (X̄ − µ)/(σ/√n) → N(0, 1) as n → ∞.
• Intuitively perhaps, there is less variation between different sample mean values than there is between different population values. This formal result states that, under very general conditions, the sampling variability is usually much smaller than the population variability, and it also gives the precise form of the "limiting distribution" of the statistic.
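The shrinking sampling variability is easy to see by simulation. A minimal sketch, assuming (as an illustration, not from the notes) an exponential population with µ = σ = 1:

```python
import random
import statistics

random.seed(0)

def sample_mean(n):
    """Mean of n draws from an exponential population (mu = sigma = 1)."""
    return statistics.fmean(random.expovariate(1.0) for _ in range(n))

n = 50
means = [sample_mean(n) for _ in range(5000)]

# CLT: the sample means cluster around mu, with spread about sigma / sqrt(n)
print(statistics.fmean(means))   # close to 1.0
print(statistics.stdev(means))   # close to 1 / sqrt(50), about 0.141
```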
• What if the population standard deviation σ is unknown? Then it can be replaced by the sample standard deviation s, provided n is large. That is, X̄ ~ N(µ, s/√n) approximately, if n ≥ 30 or so, for "most" distributions (... but see example below). Since the value s/√n is a sample-based estimate of the true standard error s.e., it is commonly denoted ŝ.e.
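Computing this estimated standard error from data follows directly from the definition; the sample below is hypothetical, purely for illustration:

```python
import math
import statistics

def estimated_se(sample):
    """Estimated standard error of the mean: s / sqrt(n)."""
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 divisor)
    return s / math.sqrt(len(sample))

data = [22, 31, 27, 35, 24, 29, 30, 26, 28, 33]  # hypothetical ages
print(round(estimated_se(data), 3))  # 1.258
```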
• Because the mean µ_X̄ of the sampling distribution is equal to the mean µ_X of the population distribution, i.e., E[X̄] = µ_X, we say that X̄ is an unbiased estimator of µ_X. In other words, the sample mean is an unbiased estimator of the population mean. A biased sample estimator is a statistic θ̂ whose "expected value" either consistently overestimates or underestimates its intended population parameter θ.
• Many other versions of the CLT exist, related to the so-called Laws of Large Numbers.
Example: Consider a(n infinite) population of paper notes, 50% of which are blank, 30% are ten-dollar bills, and the remaining 20% are twenty-dollar bills.

Experiment 1: Randomly select a single note from the population.

Random variable: X = $ amount obtained

 x    f(x) = P(X = x)
 0    .5
10    .3
20    .2

• Mean µ_X = E[X] = (.5)(0) + (.3)(10) + (.2)(20) = $7.00

• Variance σ_X² = E[(X − µ_X)²] = (.5)(−7)² + (.3)(3)² + (.2)(13)² = 61

• Standard deviation σ_X = $7.81
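These three quantities follow directly from the definitions E[X] = Σ x·f(x) and Var(X) = Σ (x − µ)²·f(x); a quick check:

```python
import math

# Experiment 1 distribution: P(X = 0) = .5, P(X = 10) = .3, P(X = 20) = .2
dist = {0: 0.5, 10: 0.3, 20: 0.2}

mu = sum(x * p for x, p in dist.items())
var = sum(p * (x - mu) ** 2 for x, p in dist.items())
sd = math.sqrt(var)

print(mu)             # 7.0
print(round(var, 4))  # 61.0
print(round(sd, 2))   # 7.81
```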
Experiment 2: Each of n = 2 people randomly selects a note, and they split the winnings.

Random variable: X̄ = $ sample mean amount obtained per person

 x̄    (x1, x2)    Probability
 0    (0, 0)      .5 × .5 = .25
 5    (0, 10)     .5 × .3 = .15
10    (0, 20)     .5 × .2 = .10
 5    (10, 0)     .3 × .5 = .15
10    (10, 10)    .3 × .3 = .09
15    (10, 20)    .3 × .2 = .06
10    (20, 0)     .2 × .5 = .10
15    (20, 10)    .2 × .3 = .06
20    (20, 20)    .2 × .2 = .04

 x̄    f(x̄) = P(X̄ = x̄)
 0    .25
 5    .30 = .15 + .15
10    .29 = .10 + .09 + .10
15    .12 = .06 + .06
20    .04

• Mean µ_X̄ = (.25)(0) + (.30)(5) + (.29)(10) + (.12)(15) + (.04)(20) = $7.00 = µ_X !!

• Variance σ_X̄² = (.25)(−7)² + (.30)(−2)² + (.29)(3)² + (.12)(8)² + (.04)(13)² = 30.5 = 61/2 = σ_X²/n !!

• Standard deviation σ_X̄ = $5.52 = σ_X/√n !!
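The whole table can be generated by enumerating all ordered pairs; using exact fractions avoids rounding. A sketch:

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Population distribution, in exact form: P(0) = 1/2, P(10) = 3/10, P(20) = 1/5
dist = {0: Fraction(1, 2), 10: Fraction(3, 10), 20: Fraction(1, 5)}

n = 2
sampling = defaultdict(Fraction)
for outcome in product(dist, repeat=n):  # all 9 ordered samples
    prob = Fraction(1)
    for x in outcome:
        prob *= dist[x]
    sampling[Fraction(sum(outcome), n)] += prob

for m in sorted(sampling):  # matches the table: .25, .30, .29, .12, .04
    print(float(m), float(sampling[m]))

mu = sum(m * p for m, p in sampling.items())
var = sum(p * (m - mu) ** 2 for m, p in sampling.items())
print(mu)   # 7
print(var)  # 61/2
```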
Experiment 3: Each of n = 3 people randomly selects a note, and they split the winnings.

Random variable: X̄ = $ sample mean amount obtained per person

  x̄      (x1, x2, x3)    Probability
  0      (0, 0, 0)       .5 × .5 × .5 = .125
  3.33   (0, 0, 10)      .5 × .5 × .3 = .075
  6.67   (0, 0, 20)      .5 × .5 × .2 = .050
  3.33   (0, 10, 0)      .5 × .3 × .5 = .075
  6.67   (0, 10, 10)     .5 × .3 × .3 = .045
 10      (0, 10, 20)     .5 × .3 × .2 = .030
  6.67   (0, 20, 0)      .5 × .2 × .5 = .050
 10      (0, 20, 10)     .5 × .2 × .3 = .030
 13.33   (0, 20, 20)     .5 × .2 × .2 = .020
  3.33   (10, 0, 0)      .3 × .5 × .5 = .075
  6.67   (10, 0, 10)     .3 × .5 × .3 = .045
 10      (10, 0, 20)     .3 × .5 × .2 = .030
  6.67   (10, 10, 0)     .3 × .3 × .5 = .045
 10      (10, 10, 10)    .3 × .3 × .3 = .027
 13.33   (10, 10, 20)    .3 × .3 × .2 = .018
 10      (10, 20, 0)     .3 × .2 × .5 = .030
 13.33   (10, 20, 10)    .3 × .2 × .3 = .018
 16.67   (10, 20, 20)    .3 × .2 × .2 = .012
  6.67   (20, 0, 0)      .2 × .5 × .5 = .050
 10      (20, 0, 10)     .2 × .5 × .3 = .030
 13.33   (20, 0, 20)     .2 × .5 × .2 = .020
 10      (20, 10, 0)     .2 × .3 × .5 = .030
 13.33   (20, 10, 10)    .2 × .3 × .3 = .018
 16.67   (20, 10, 20)    .2 × .3 × .2 = .012
 13.33   (20, 20, 0)     .2 × .2 × .5 = .020
 16.67   (20, 20, 10)    .2 × .2 × .3 = .012
 20      (20, 20, 20)    .2 × .2 × .2 = .008

  x̄      f(x̄) = P(X̄ = x̄)
  0.00   .125
  3.33   .225 = .075 + .075 + .075
  6.67   .285 = .050 + .045 + .050 + .045 + .045 + .050
 10.00   .207 = .030 + .030 + .030 + .027 + .030 + .030 + .030
 13.33   .114 = .020 + .018 + .018 + .020 + .018 + .020
 16.67   .036 = .012 + .012 + .012
 20.00   .008

• Mean µ_X̄ = Exercise = $7.00 = µ_X !!!

• Variance σ_X̄² = Exercise = 20.333 = 61/3 = σ_X²/n !!!

• Standard deviation σ_X̄ = $4.51 = σ_X/√n !!!
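The pattern µ_X̄ = µ_X and σ_X̄² = σ_X²/n can be verified for any small n by enumerating all ordered samples exactly; the sketch below also fills in the two Exercise blanks (µ_X̄ = 7, σ_X̄² = 61/3):

```python
from fractions import Fraction
from itertools import product

# Population distribution, in exact form: P(0) = 1/2, P(10) = 3/10, P(20) = 1/5
dist = {0: Fraction(1, 2), 10: Fraction(3, 10), 20: Fraction(1, 5)}

def sampling_moments(n):
    """Exact mean and variance of the sample mean over all ordered size-n samples."""
    mu = Fraction(0)
    ex2 = Fraction(0)
    for outcome in product(dist, repeat=n):
        prob = Fraction(1)
        for x in outcome:
            prob *= dist[x]
        m = Fraction(sum(outcome), n)
        mu += prob * m
        ex2 += prob * m * m
    return mu, ex2 - mu ** 2

for n in (1, 2, 3):
    mu, var = sampling_moments(n)
    print(n, mu, var)
# 1 7 61
# 2 7 61/2
# 3 7 61/3
```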
The tendency toward a normal distribution becomes stronger as the sample size
n gets larger, despite the mild skew in the original population values. This is
an empirical consequence of the Central Limit Theorem.
For most such distributions, n ≥ 30 or so is sufficient for a reasonable
normal approximation to the sampling distribution. In fact, if the
distribution is symmetric, then convergence to a bell curve can often be
seen for much lower n, say only n = 5 or 6. Recall also, from the first
result in this section, that if the population is normally distributed (with
known σ), then so will be the sampling distribution, for any n.
BUT BEWARE....
However, if the population distribution of X is highly skewed, then the sampling distribution of X̄ can be highly skewed as well (especially if n is not very large), i.e., relying on the CLT can be risky! (Although sometimes using a transformation, such as ln(X) or √X, can restore a bell shape to the values. Later…)
Example: The two graphs on the bottom of this page are simulated sampling
distributions for the highly skewed population shown below. Both are density
histograms based on the means of 1000 random samples; the first corresponds to
samples of size n = 30, the second to n = 100. Note that skew is still present!
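This persistence of skew can be demonstrated by simulation. A minimal sketch, assuming a lognormal population as an illustrative stand-in for a highly skewed distribution (not the one pictured):

```python
import random
import statistics

random.seed(1)

def skewness(xs):
    """Sample skewness: average cubed deviation over the cubed population SD."""
    m = statistics.fmean(xs)
    sd = statistics.pstdev(xs)
    return statistics.fmean((x - m) ** 3 for x in xs) / sd ** 3

def sample_mean(n):
    """Mean of n draws from a highly skewed (lognormal) population."""
    return statistics.fmean(random.lognormvariate(0.0, 1.5) for _ in range(n))

for n in (30, 100):
    means = [sample_mean(n) for _ in range(1000)]
    print(n, round(skewness(means), 2))  # skew of the sample means is still clearly positive
```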
[Figure: Population Distribution]