Download Understanding the Central Limit Theorem

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Theorem wikipedia , lookup

Infinite monkey theorem wikipedia , lookup

Fundamental theorem of calculus wikipedia , lookup

Nyquist–Shannon sampling theorem wikipedia , lookup

History of statistics wikipedia , lookup

Law of large numbers wikipedia , lookup

Tweedie distribution wikipedia , lookup

Normal distribution wikipedia , lookup

Student's t-distribution wikipedia , lookup

Transcript
Case Study 31
31
Understanding the Central Limit Theorem
Understanding the Central Limit Theorem
Problem Description
The Central Limit Theorem is a theorem used widely in statistics. Formally, it is stated as
follows: Let X1, X2, ………, Xn be a random sample of size n taken from a population (either
finite or infinite) with mean μ and finite variance σ2. Let
form of the distribution of
z
X 
 n
X
be the sample mean. The limiting
Fn (z ) , approaches (z) as n approaches infinity:
,
lim Fn ( z )  ( z ) .
n
Where, (z) is the distribution function of a normal random variable with
  0 and
 2  1 (standard normal distribution).
The Central Limit Theorem tells us that if we are sampling from a population that has an
unknown distribution, the distribution of the sample mean will be normal with mean μ and
variance σ2 /n, if the sample size n is large.
The aim of this project is to build a support system that will be used in a statistics course to
help students understand the fundamentals of the Central Limit Theorem.
Example
The following is an example that demonstrates how the Central Limit Theorem works. Let Y
be the outcome from tossing a die. Note that Y is uniformly distributed. There is a equal
probability (1/6) that Y takes any of the values in set S={1,2,3,4,5,6}. The mean value μ of Y
is 3.5, and the variance is σ2 = 35/12. If Sn is the sum of outcomes when n dices are rolled,
and if n is “large,” the distribution of random variable
 
S n  n
 n
,
should approximate the standard normal distribution. As a result, Sn itself should have a
normal distribution with mean 3.5*n and variance 35*n/12.
The system we build allows the computer to randomly generate the outcome from tossing a
die and calculates Sn,  and the corresponding means and standard deviations.
User Interface
1.
Build a welcome form.
2.
Build a form that enables the user to perform the experiment described above. The
following are suggestions for designing this form.
a.
Insert a text box where the user types in the sample size n.
b.
Insert a text box where the user types in the number of times the dice should be
tossed, m.
c.
Insert a command button that, when clicked on, does the following:
K T
K T
K T
k 1t 1
k 1t 1
k 1t 1
min :  ckt xkt   hkt I kt   Fkt z kt
Subject to :
K
 zkt  1
Case Study 31
Understanding the Central Limit Theorem
Randomly
generates n(1integer
numbers uniformly distributed in the interval
for i.t  1,...,
T,
)
[1, 6] using Excel functions.
k 1
xkt  I k ,t 1  I kt  d kt
xkt  Pkt z kt
forii.k  1Calculates
,..., K ; t  1the
,...,sum
T , (Sn()2of
) the numbers generated.
for k  1,..., K ; t  1,..., TS,  n(3)
iii.k  1Calculates
 T ,n (4) .
for
,..., K ; t  1,...,
xkt , I kt  0
z kt  {0,1}
for k  1,..., K ; t  1,..., T .
iv.
 n
(5)
Records the results in the summary table described in part 2.d.
Repeat steps 2.c.i to 2.c.iv m times.
3.
d.
Build a table that records the following summary results: run number, Sn and .
e.
Insert a command button that, when clicked on, does the following:
i.
Calculates the mean of Sn and  using the data in Table 2.d.
ii.
Calculates the variance of Sn and  using the data in Table 2.d.
Insert a summary table where the user records the mean and variance of Sn and  for
different values of n.
Design a logo for this project. Insert this logo in the forms created above. Pick a background
color and a font color for the forms created. Include the following in the forms created: record
navigation command buttons, record operations command buttons, and form operations
command buttons as needed.
Reports
1.
Plot the probability curve for the sampling distribution of the sample mean for a single
die throw (n = 1).
2.
Plot the probability distribution curves for the sampling distribution of the sample mean
for all of the incremental sample sizes (n = 2,...,n*).
3.
If a curve-fitting feature is available in MS Excel, then for each of the above plots,
calculate the probability that the plot conforms to the normal distribution, and report on
what you find.
4.
Present the probabilities obtained in each of the above incremental experiments in
tabular form.
Reference
Montgomery, D.C., Runger, G.C., “Applied Statistics and Probability for Engineers,” 3 rd Ed.,
John Wiley & Sons, 2003.