Stat 216 – Central Limit Theorem
Robin R. Rumple
October 30, 2006
This handout gives further information on the Central Limit Theorem (CLT). The intention is to clarify when the
CLT applies and when it does not. Enjoy.
1 Central Limit Theorem
Given:
• The random variable X has a distribution with mean µ and standard deviation σ.
• Samples of size n are randomly selected from the population of X values.
Conclusion:
• The distribution of the sample mean X̄ will, as the sample size increases, approach a normal distribution.
Note:
• For samples of size n ≥ 30, the distribution of the sample means can be approximated reasonably well
by a normal distribution. The approximation gets better as the sample size n becomes larger.
• For samples of size n < 30, the CLT does NOT guarantee that the distribution of sample means is
well approximated by a normal distribution. Therefore, unless the original population is itself normal, we
cannot use the normal distribution to calculate probabilities in this case.
• Normal Distribution Fact: If the original population is itself exactly normally distributed, then the
distribution of sample means will be exactly normally distributed for any sample size n (not just for
n ≥ 30).
– Note: if the original population is approximately normally distributed, then the distribution of sample
means will also be approximately normally distributed for any sample size n. The CLT still applies
here, so the larger the sample size, the closer the distribution of X̄ will be to normal, but large sample
sizes are not necessary to calculate probabilities in this case.
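As a rough illustration of the conclusion above, the following Python sketch draws repeated samples from a strongly skewed population and measures the skewness of the resulting sample means. Skewness is 0 for a normal distribution, so values closer to 0 suggest a more nearly normal shape. The exponential population, the number of repetitions, and the helper names are assumptions chosen for illustration; they are not part of the handout.

```python
import random
import statistics


def sample_means(n, reps, rng):
    """Return `reps` sample means, each computed from a sample of size n.

    The parent population is exponential with mean 1 (an assumed,
    strongly right-skewed population used only for illustration).
    """
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(reps)
    ]


def skewness(values):
    """Sample skewness: close to 0 for symmetric data such as normal data."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean(((v - m) / s) ** 3 for v in values)


rng = random.Random(216)  # fixed seed so the run is repeatable
skew_by_n = {n: skewness(sample_means(n, 5000, rng)) for n in (2, 30, 100)}

for n, skew in skew_by_n.items():
    print(f"n = {n:3d}: skewness of sample means = {skew:.2f}")
```

The printed skewness should shrink toward 0 as n grows, which is the behavior the CLT describes.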
2 Mean and Standard Deviation of X̄
Something to note: the equations for µX̄ and σX̄ do not come from the Central Limit Theorem. The facts that
$$\mu_{\bar{X}} = \mu \quad \text{and} \quad \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$
hold for every sample size n and are completely independent of the CLT. Note, µ and σ are the mean and
standard deviation of the parent population.
A closer look at the Central Limit Theorem shows that its key claim is about shape: the distribution of sample
means can be approximated reasonably well by a normal distribution. It is not a statement about what the mean
and standard deviation of X̄ are. Notice the distinction between X̄ and x̄. The notation X̄ refers to the random
variable, whereas x̄ is the value of the sample mean for one particular sample.
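The claim that the formulas for µX̄ and σX̄ do not rely on the CLT can be checked numerically. The sketch below deliberately uses a small sample size (n = 5) and a non-normal population (uniform on [0, 1]); both are assumptions chosen for illustration. It compares the mean and standard deviation of many simulated sample means with µ and σ/√n.

```python
import math
import random
import statistics

n = 5            # deliberately small: the formulas need no large-sample argument
reps = 20000     # number of simulated samples (illustrative choice)
rng = random.Random(2006)  # fixed seed so the run is repeatable

# Assumed parent population: uniform on [0, 1], which is not normal.
mu = 0.5
sigma = math.sqrt(1 / 12)

means = [statistics.fmean(rng.random() for _ in range(n)) for _ in range(reps)]

mean_of_means = statistics.fmean(means)
sd_of_means = statistics.pstdev(means)
predicted_sd = sigma / math.sqrt(n)

print(f"mean of sample means: {mean_of_means:.4f} (mu = {mu})")
print(f"sd of sample means:   {sd_of_means:.4f} (sigma/sqrt(n) = {predicted_sd:.4f})")
```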
3 The Parent Population
The parent, or original, population from which the sample mean x̄ is taken can be either normal or non-normal.
The Central Limit Theorem holds for the distribution of X̄ regardless of the shape of the parent population. Let
us consider each case of the original population, and discuss further.
3.1 Parent Population is Normal
So the random variable X satisfies X ∼ N(µ, σ). For a sample of size n drawn from the parent population,
$$\bar{X} \sim N\!\left(\mu, \frac{\sigma}{\sqrt{n}}\right).$$
Note: the Central Limit Theorem is not needed in this case. Further, the normal distribution of X̄ provides
exact probability values.
3.2 Parent Population is Not Normal
If the parent population is not normal, the question to ask is, how large is n?
• n ≥ 30: By the Central Limit Theorem,
$$\bar{X} \mathrel{\dot{\sim}} N\!\left(\mu, \frac{\sigma}{\sqrt{n}}\right),$$
so probabilities calculated from this normal distribution are approximate. The notation $\dot{\sim}$ means
“approximately distributed as”.
• n < 30: If this is the case, the Central Limit Theorem cannot be invoked. Thus, probabilities about X̄ cannot
be calculated from the normal distribution.
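The case analysis of this section can be summarized as a small decision function. This is only a sketch of the handout's rule of thumb; the function name, its arguments, and the returned labels are hypothetical.

```python
def sampling_distribution_case(parent_is_normal: bool, n: int) -> str:
    """Classify how the normal distribution may be used for the sample mean.

    Encodes the rule of thumb from this section, with 30 as the cutoff.
    """
    if n < 1:
        raise ValueError("sample size n must be at least 1")
    if parent_is_normal:
        # Section 3.1: exact normal distribution for any n, no CLT needed.
        return "exact"
    if n >= 30:
        # Section 3.2, n >= 30: approximately normal by the CLT.
        return "approximate"
    # Section 3.2, n < 30: the CLT gives no guarantee.
    return "not applicable"


for parent_is_normal, n in [(True, 15), (False, 100), (False, 15)]:
    print(parent_is_normal, n, sampling_distribution_case(parent_is_normal, n))
```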
4 Example
Here’s an example problem to illustrate the use of the CLT under each combination of parent population shape
and sample size n.
a. Assume that women’s weights are normally distributed with a mean µ = 143 lbs and standard deviation
σ = 29 lbs.
If 15 women are randomly selected, find the probability that their mean weight is above 140 lbs.
The size of n makes no difference in this case, because the parent population is normally distributed. Let
X̄ be the mean weight of the 15 women. So, our parameters for the distribution of X̄ are:
$$\mu_{\bar{X}} = 143, \qquad \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{29}{\sqrt{15}} = 7.49.$$
Note: Central Limit Theorem is NOT used in this scenario! So
$$\begin{aligned}
P(\bar{X} > 140) &= P\!\left(\frac{\bar{X} - 143}{7.49} > \frac{140 - 143}{7.49}\right) \\
&= P(Z > -0.401) \\
&= 1 - P(Z < -0.401).
\end{aligned}$$
This is found to be 0.656. Thus, the probability that the mean weight of 15 randomly selected women is greater
than 140 lbs is exactly 0.656 (to three decimal places).
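The table lookup in part a. can be cross-checked in Python. This sketch assumes Python 3.8 or later, where the standard library provides `statistics.NormalDist`; the variable names are illustrative.

```python
import math
from statistics import NormalDist

mu, sigma, n = 143, 29, 15

# Distribution of the sample mean when the parent population is normal.
x_bar_dist = NormalDist(mu=mu, sigma=sigma / math.sqrt(n))

# P(X-bar > 140) = 1 - P(X-bar <= 140)
p_above_140 = 1 - x_bar_dist.cdf(140)

print(f"standard deviation of X-bar: {x_bar_dist.stdev:.2f}")
print(f"P(X-bar > 140) = {p_above_140:.3f}")
```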
b. Assume that women’s weights are distributed with a mean µ = 143 lbs and standard deviation σ = 29 lbs.
If 100 women are randomly selected, find the probability that their mean weight is above 140 lbs.
Now we use the CLT. This is because we are interested in the mean weight, the parent population is not known
to be normally distributed (no distribution is indicated), and n ≥ 30. Our parameters are now:
$$\mu_{\bar{X}} = 143, \qquad \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{29}{\sqrt{100}} = 2.9.$$
So
$$\begin{aligned}
P(\bar{X} > 140) &= P\!\left(\frac{\bar{X} - 143}{2.9} > \frac{140 - 143}{2.9}\right) \\
&= P(Z > -1.03) \\
&= 1 - P(Z < -1.03).
\end{aligned}$$
This is found to be 0.849. Thus, the probability that the mean weight of 100 randomly selected women is greater
than 140 lbs is approximately 0.849. Note: the larger the sample size, the closer the distribution of X̄ is to
normal, and so the better the probability approximation is.
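Part b. can be checked by standardizing to Z in code. The sketch below (Python 3.8 or later, illustrative variable names) also shows the small effect of rounding z to two decimals, as a printed normal table requires: the two results differ by about 0.001, so a value near 0.849 is consistent with either approach.

```python
from statistics import NormalDist

mu, sigma, n = 143, 29, 100
se = sigma / n ** 0.5          # standard deviation of X-bar
z = (140 - mu) / se            # standardized cutoff

std_normal = NormalDist()      # Z ~ N(0, 1)
p_unrounded = 1 - std_normal.cdf(z)
p_table = 1 - std_normal.cdf(round(z, 2))   # z rounded as for a printed table

print(f"se = {se:.1f}, z = {z:.4f}")
print(f"P(Z > z), unrounded z:               {p_unrounded:.4f}")
print(f"P(Z > z), z rounded to two decimals: {p_table:.4f}")
```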
c. Assume that women’s weights are distributed with a mean µ = 143 lbs and standard deviation σ = 29 lbs.
If 15 women are randomly selected, find the probability that their mean weight is above 140 lbs.
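The Central Limit Theorem cannot be used in this case. The difference between this scenario and that in part
a. is that here the parent population is not known to be normally distributed. Because n < 30, the normal
approximation to the distribution of X̄ is not guaranteed, so the probability cannot be calculated from the
normal distribution with the information given.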
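To see why the normal calculation from part a. cannot simply be reused, consider one hypothetical non-normal population with the same mean and standard deviation: weights equal to 114 lbs plus an exponentially distributed amount with mean 29 lbs. This population is an assumption made only for illustration, since the problem specifies no distribution. The simulation below estimates P(X̄ > 140) for n = 15 under that assumption and compares it with the normal-based value.

```python
import math
import random
import statistics
from statistics import NormalDist

mu, sigma, n = 143, 29, 15
reps = 20000                      # number of simulated samples (illustrative)
rng = random.Random(15)           # fixed seed so the run is repeatable


def draw_weight():
    """One weight from the assumed skewed population (mean 143, sd 29)."""
    # Shifted exponential: the shift 114 = 143 - 29 makes the mean 143.
    return (mu - sigma) + rng.expovariate(1 / sigma)


# Count the simulated samples whose mean weight exceeds 140 lbs.
hits = sum(
    statistics.fmean(draw_weight() for _ in range(n)) > 140
    for _ in range(reps)
)
p_simulated = hits / reps

# Value the normal distribution would give, as in part a.
p_normal = 1 - NormalDist(mu, sigma / math.sqrt(n)).cdf(140)

print(f"simulated P(X-bar > 140), skewed population: {p_simulated:.3f}")
print(f"normal-based value:                          {p_normal:.3f}")
```

Under this assumed population the simulated probability comes out noticeably below the normal-based value, and a different population shape would give a different answer. That is why no normal-based probability is reported for part c.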
5 Conclusion
The goal of this supplemental handout is to clarify the Central Limit Theorem: specifically, where it applies
and where it does not. Hope this helps!