MA 490
Senior Project
Project: Prove that the cumulative binomial distributions and the Poisson distributions can be
approximated by the Normal distribution and that that approximation gets better as the numbers
increase.
It is imperative that one first understands the distributions given before proving that they
can be approximated. The binomial distribution represents the total number of successes out of n
Bernoulli trials under specific conditions: only two outcomes are possible on each of the n trials,
the probability of success for each trial is constant, and all trials are independent of each other
(Weisstein). A Bernoulli trial, which is used in the binomial distribution, is an experiment with a
random result with only two possible outcomes, either a “success” or a “failure.” The following
is the binomial distribution:
P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}

where k represents the number of successes in n trials of a Bernoulli process with
probability of success p (Weisstein). The binomial distribution is a discrete probability
distribution that is used to analyze the possible number of times that a specific event could occur
in a particular amount of trials. The following is a graphical representation of the binomial
distribution (Weisstein):
Furthermore, the cumulative binomial distribution gives the probability that the number of
successes in a binomial distribution falls within a specific range (Devore). It uses the same
probability distribution as the binomial distribution, but sums the probabilities between a
specific lower limit and upper limit; for the usual case of a lower limit of zero,

P(X \le k) = \sum_{i=0}^{k} \binom{n}{i} p^i (1-p)^{n-i}

The following is a graphical representation of the cumulative distribution (Weisstein):
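As a quick illustration of these two quantities, the binomial probabilities and their cumulative sums can be computed directly. The following is a minimal Python sketch; the function names are my own:

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k): choose k successes out of n trials, each with success probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binomial_cdf(k, n, p):
    """Cumulative binomial probability P(X <= k): the pmf summed up to k."""
    return sum(binomial_pmf(i, n, p) for i in range(k + 1))

# Example: 15 fair trials (n = 15, p = 0.5), the same setting as the graphs later on
print(binomial_pmf(7, 15, 0.5))
print(binomial_cdf(15, 15, 0.5))  # summing the pmf over all outcomes gives 1
```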
A limiting case of the binomial distribution is the Poisson distribution. The Poisson
distribution is another type of probability distribution, representing the probability that a
random event occurs a given number of times in a fixed interval of time. The Poisson
distribution is a discrete one-parameter distribution whose single parameter is both the mean
and the variance of the distribution. The following is the Poisson distribution (Weisstein):

P(X = k) = \frac{(\lambda t)^k e^{-\lambda t}}{k!}

where t is the number of units of time, λ is the expected number of occurrences in a given
interval, and k is the number of times an event occurs. The following is a graphical
representation of the Poisson distribution (Weisstein):
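The claim that the single parameter is both the mean and the variance can be checked numerically. The following is a small Python sketch; the names are my own, and the sums are truncated at k = 100, which is harmless for small λt:

```python
from math import exp, factorial

def poisson_pmf(k, lam, t=1.0):
    """P(X = k) for a Poisson process: (lam*t)^k e^{-lam*t} / k!."""
    mu = lam * t
    return mu**k * exp(-mu) / factorial(k)

# The single parameter lam (here with t = 1) is both the mean and the variance:
lam = 4.0
mean = sum(k * poisson_pmf(k, lam) for k in range(100))
var = sum((k - mean)**2 * poisson_pmf(k, lam) for k in range(100))
print(mean, var)  # both approximately 4.0
```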
The cumulative binomial distribution and the Poisson distribution can be approximated
by the normal distribution due to the central limit theorem. The normal distribution is a
continuous probability distribution that models random variables clustered around a single
mean value, and it allows one to predict where a specific probability will fall based on such
an approximation. The following is the normal distribution, where µ is the mean and σ² is the
variance:

f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}
With the understanding of the binomial distribution, Poisson distribution, and the normal
distribution, one can begin to prove their approximation. Both the binomial distribution and the
Poisson distribution can be approximated by the normal distribution by using the central limit
theorem. The central limit theorem states that, given a distribution with a mean and a variance,
the sampling distribution of the mean approaches a normal distribution as the sample size
increases (Cryer and Whitmer). The central limit theorem assumes Y1, Y2, …, Yn are independent
random variables that are identically distributed with mean µ and finite variance σ². It then
defines the random variable Un as the following:

U_n = \frac{\sum_{i=1}^{n} Y_i - n\mu}{\sigma\sqrt{n}}

One can see that as n gets larger the distribution of this random variable converges to the
normal distribution, and the approximation becomes more accurate as n becomes even larger.
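A quick simulation illustrates this convergence. Here the Y_i are taken to be uniform(0, 1) variables, an arbitrary choice made only for illustration; the fraction of simulated U_n values falling within one standard deviation of zero approaches the standard-normal value of roughly 0.683:

```python
import random

random.seed(0)

def u_n(n, draws=20000):
    """Simulate U_n = (sum(Y_i) - n*mu) / (sigma*sqrt(n)) for uniform(0, 1) Y_i."""
    mu, sigma = 0.5, (1 / 12) ** 0.5  # mean and standard deviation of uniform(0, 1)
    vals = []
    for _ in range(draws):
        s = sum(random.random() for _ in range(n))
        vals.append((s - n * mu) / (sigma * n ** 0.5))
    return vals

# As n grows, the fraction of U_n values with |U_n| <= 1 approaches about 0.683,
# the standard-normal probability of falling within one standard deviation.
for n in (2, 30):
    vals = u_n(n)
    frac = sum(1 for v in vals if abs(v) <= 1) / len(vals)
    print(n, round(frac, 3))
```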
For further proof that these distributions converge to the normal distribution, one must
prove the central limit theorem. The central limit theorem can be proved using many different
methods, but the simplest method of proof uses moment generating
functions. A moment generating function re-expresses a probability distribution through the
expected values of a random variable; moment generating functions are used to find all the
moments of a random variable's distribution by simple operations (Bain and Engelhardt). The
moment generating function of a random variable X is the function M_X(t) = E(e^{tX}). The
function can then be rewritten, since the term e^{tX} can be expanded around zero using a
Taylor series. Thus the moment generating function becomes:

M_X(t) = E(e^{tX}) = E\left[1 + tX + \frac{t^2}{2}X^2 + \frac{t^3}{6}X^3 + \cdots\right]
       = 1 + E(X)\,t + E(X^2)\,\frac{t^2}{2} + E(X^3)\,\frac{t^3}{6} + \cdots
The normal distribution can be written as a moment generating function, which will be
used in the proof of the central limit theorem. To find the moment generating function for the
normal distribution, first begin by letting X be a normal random variable with a mean µ and a
standard deviation σ. Then (Weisstein),
M_X(t) = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{tx}\,
         e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} dx
       = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{\infty}
         \exp\!\left[tx - \frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2\right] dx
The exponential term can be further broken down by completing the square:
tx - \frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2
   = -\frac{1}{2}\left(\frac{x-\mu-\sigma^2 t}{\sigma}\right)^2 + \mu t + \frac{1}{2}\sigma^2 t^2
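This completed square can be verified numerically for arbitrary values of x, µ, σ, and t. The check below is a small, purely illustrative Python sketch:

```python
import random

random.seed(1)

def lhs(x, mu, sigma, t):
    """Left side: t*x - (1/2)*((x - mu)/sigma)^2."""
    return t * x - 0.5 * ((x - mu) / sigma) ** 2

def rhs(x, mu, sigma, t):
    """Right side after completing the square."""
    return (-0.5 * ((x - mu - sigma**2 * t) / sigma) ** 2
            + mu * t + 0.5 * sigma**2 * t**2)

# The identity holds for arbitrary values of the variables
for _ in range(5):
    x, mu, t = (random.uniform(-3, 3) for _ in range(3))
    sigma = random.uniform(0.5, 2.0)
    assert abs(lhs(x, mu, sigma, t) - rhs(x, mu, sigma, t)) < 1e-9
print("completed square verified")
```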
Therefore, the moment generating function for the normal distribution is given as:
 1  x    t2  

1

 1

dx
M X  t   exp  t  2t 2 
exp

 2

2
2

  2  


or
1


 exp  t  2t 2 
2


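This closed form can be sanity-checked by Monte Carlo: draw normal samples and average e^{tX} directly. The following Python sketch uses arbitrarily chosen values of µ, σ, and t:

```python
import random
from math import exp

random.seed(2)
mu, sigma, t = 1.0, 0.5, 0.3  # arbitrary illustrative values

# Empirical E[e^{tX}] for X ~ Normal(mu, sigma), versus the closed form
draws = [random.gauss(mu, sigma) for _ in range(200_000)]
empirical = sum(exp(t * x) for x in draws) / len(draws)
closed_form = exp(mu * t + 0.5 * sigma**2 * t**2)
print(round(empirical, 4), round(closed_form, 4))
```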
Now that the background of the proof has been given, one can move to the proof of the
central limit theorem. First define a random variable Z_i by

Z_i = \frac{Y_i - \mu}{\sigma}

We can determine that the mean of Z_i will be zero and the variance will be one (Cryer and
Whitmer). Next, we can write the moment generating function for Z_i, using the expansion above
with E(Z_i) = 0 and E(Z_i^2) = 1, as

M_Z(t) = 1 + \frac{t^2}{2} + E(Z^3)\,\frac{t^3}{6} + \cdots

From the stated central limit theorem, we know that

U_n = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} Z_i
Since the random variables Yi are independent, we can conclude that the random variables of Zi
are also independent. By the properties of moment generating functions, the moment generating
function of a sum of independent random variables is the product of the individual moment
generating functions (Cryer
and Whitmer). Therefore, we can write

M_{U_n}(t) = \left[M_Z\!\left(\frac{t}{\sqrt{n}}\right)\right]^n

In order to bring n down from the exponent we can apply the natural log, such that

\ln M_{U_n}(t) = n \ln M_Z\!\left(\frac{t}{\sqrt{n}}\right)

The expression on the right can be rewritten using the Taylor series expansion
ln(1 + x) = x − x²/2 + x³/3 − ⋯. Therefore our moment generating function becomes

\ln M_{U_n}(t) = n\left[\frac{t^2}{2n} + E(Z^3)\,\frac{t^3}{6 n^{3/2}} + \cdots\right]
              = \frac{t^2}{2} + E(Z^3)\,\frac{t^3}{6\sqrt{n}} + \cdots

Looking closely at our function, we can see that all of the terms except the first will have a
power of n in the denominator (Cryer and Whitmer). Therefore, as n becomes large, or as n → ∞,
all the terms but the first will go to zero. Furthermore, we want to see what happens to the
remaining term as n gets larger, so we take the limit as n → ∞:

\lim_{n\to\infty} \ln M_{U_n}(t) = \frac{t^2}{2}

We then can apply e in order to get rid of the logarithm, so we have

\lim_{n\to\infty} M_{U_n}(t) = e^{t^2/2}
The function above should look familiar. Recall from the earlier proof that the moment
generating function for the normal distribution is M_X(t) = \exp(\mu t + \frac{1}{2}\sigma^2 t^2).
Since we stated earlier that the mean is zero and the variance is one, we have M(t) = e^{t^2/2},
which is exactly the moment generating function of the standard normal distribution. This means
that the distribution of the standardized sum U_n converges to the normal distribution as n goes
toward ∞ (Cryer and Whitmer). Therefore, the approximation becomes more accurate the larger n
becomes. Thus, the central limit theorem can be used to approximate such random variable
distributions.
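The convergence can also be computed exactly for a concrete case. For a sum of Bernoulli(p) trials, the moment generating function of the standardized sum U_n has a finite closed form that can be compared with e^{t²/2}. The Python sketch below does this; the n values and t = 1 are arbitrary choices:

```python
from math import comb, exp, sqrt

def mgf_standardized_binomial(t, n, p=0.5):
    """Exact M_{U_n}(t) for U_n = (S_n - n*p) / sqrt(n*p*(1-p)), S_n ~ Binomial(n, p)."""
    sd = sqrt(n * p * (1 - p))
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) * exp(t * (k - n * p) / sd)
               for k in range(n + 1))

target = exp(0.5)  # e^{t^2/2} at t = 1
# The gap to the normal-distribution MGF shrinks as n grows
for n in (5, 50, 500):
    print(n, abs(mgf_standardized_binomial(1.0, n) - target))
```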
As already stated, the cumulative binomial distribution is built from sums of independent
Bernoulli random variables, and the Poisson distribution arises as a limiting case of the
binomial, so we can apply the central limit theorem in order for them to be approximated by the
normal distribution. The best representation of such approximation is using graphs.
The following graph represents a cumulative binomial distribution being approximated
with the normal distribution (Boucher, Normal Approximation to a Binomial Random Variable).
The cumulative binomial distribution is seen as the red “steps” with n = 15 and p = 0.5. The function
seen running over the cumulative binomial distribution is the normal distribution. As you can
see, the cumulative binomial distribution is extremely similar to the normal distribution, hence
the application of the central limit theorem.
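The visual agreement can also be quantified. The sketch below (Python; the +0.5 continuity correction is a standard refinement, not something shown in the graphs) measures the largest gap between the binomial CDF and its normal approximation for the two settings used in the graphs:

```python
from math import comb, erf, sqrt

def binom_cdf(k, n, p):
    """Cumulative binomial probability P(X <= k)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def normal_cdf(x, mu, sigma):
    """Normal CDF via the error function."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

def max_approx_error(n, p=0.5):
    """Largest gap between the binomial CDF and its normal approximation,
    using the usual +0.5 continuity correction."""
    mu, sigma = n * p, sqrt(n * p * (1 - p))
    return max(abs(binom_cdf(k, n, p) - normal_cdf(k + 0.5, mu, sigma))
               for k in range(n + 1))

# The same settings as the two graphs: n = 15 versus n = 55, p = 0.5
print(max_approx_error(15), max_approx_error(55))
```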
The following graph uses the same binomial distribution as the graph above; however, n has
gotten larger, n = 55, and p is the same (Boucher, Normal Approximation to a Binomial Random
Variable). You can see that the approximation is becoming closer and more accurate due to the
distributions “lying on top” of one another.
Similarly to the cumulative binomial distribution, the Poisson distribution can be
approximated by the normal distribution (Boucher, Normal Approximation to a Poisson Random
Variable). The following graph represents a Poisson distribution with λ = 20, with the zoom of
the plot playing the role of n. Once again you can see that the Poisson distribution overlaps
the normal distribution.
The following graph uses the same Poisson distribution, but extends the view in order
for n to become larger (Boucher, Poisson Distribution). Notice that the approximation of
the Poisson distribution by the normal distribution becomes more accurate as n increases.
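A similar numerical check works for the Poisson case, where λ plays the role of n: as λ grows, the largest gap between the Poisson CDF and the Normal(λ, √λ) approximation shrinks. The following is a Python sketch; the λ values are arbitrary, and the +0.5 continuity correction is a standard refinement:

```python
from math import erf, exp, sqrt

def poisson_cdf(k, lam):
    """P(X <= k) for Poisson(lam), with the pmf term updated iteratively."""
    term = exp(-lam)
    total = term
    for i in range(1, k + 1):
        term *= lam / i
        total += term
    return total

def normal_cdf(x, mu, sigma):
    """Normal CDF via the error function."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

def max_error(lam):
    """Largest gap between the Poisson CDF and Normal(lam, sqrt(lam)),
    using the usual +0.5 continuity correction."""
    return max(abs(poisson_cdf(k, lam) - normal_cdf(k + 0.5, lam, sqrt(lam)))
               for k in range(int(4 * lam)))

# The gap shrinks as lam increases (lam = 20 matches the graph above)
print(max_error(5), max_error(20), max_error(80))
```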
Based on the proof of the central limit theorem, one can approximate a random variable
distribution by the normal distribution, and the theorem shows that as n increases the
approximation becomes more accurate. When given either a cumulative binomial distribution or
a Poisson distribution, they too can be approximated by the normal distribution, since they are
built from independent random variables. This has been shown through the graphs above. This is
an important concept to understand, since it allows one to further analyze data given in a
binomial distribution or a Poisson distribution.
Works Cited
Bain, Lee and Max Engelhardt. Introduction to Probability and Mathematical Statistics. Boston:
PWS Publishers, 1987.
Boucher, Chris. Binomial Distribution. 2007. 9 April 2011
<http://demonstrations.wolfram.com/BinomialDistribution/>.
—. Normal Approximation to a Binomial Random Variable. 2007. 9 April 2011
<http://demonstrations.wolfram.com/NormalApproximationToABinomialRandomVariable/>.
—. Normal Approximation to a Poisson Random Variable. 2007. 9 April 2011
<http://demonstrations.wolfram.com/NormalApproximationToAPoissonRandomVariable/>.
—. Poisson Distribution. 2007. 9 April 2011
<http://demonstrations.wolfram.com/PoissonDistribution/>.
Cryer, Jon and Jeff Whitmer. Introduction to the Central Limit Theorem. 26 June 1999. 15 April
2011 <http://courses.ncssm.edu/math/Stat_Inst/PDFS/SEC_4_f.pdf>.
Devore, Jay. Probability and Statistics for Engineering and the Sciences. Ed. Jennifer Burger. 4th
Edition. Belmont: Wadsworth Publishing Company, 1995.
Falmagne, Jean-Claude. Lectures in Elementary Probability Theory and Stochastic Processes.
New York: William Barter, 2003.
Lane, David. Central Limit Theorem. 2008. 14 April 2011
<http://davidmlane.com/hyperstat/A14043.html>.
Weisstein, Eric. Binomial Distribution. 1999. 16 April 2011
<http://mathworld.wolfram.com/BinomialDistribution.html>.
—. Central Limit Theorem. <http://mathworld.wolfram.com/CentralLimitTheorem.html>.
—. Moment-Generating Function. <http://mathworld.wolfram.com/Moment-GeneratingFunction.html>.
—. Normal Distribution. <http://mathworld.wolfram.com/NormalDistribution.html>.
—. Poisson Distribution. <http://mathworld.wolfram.com/PoissonDistribution.html>.