American Journal of Mathematics and Statistics 2016, 6(3): 115-121
DOI: 10.5923/j.ajms.20160603.05

Convergence of Binomial, Poisson, Negative-Binomial, and Gamma to Normal Distribution: Moment Generating Functions Technique

Subhash C. Bagui 1,*, K. L. Mehra 2

1 Department of Mathematics and Statistics, University of West Florida, Pensacola, USA
2 Department of Mathematical and Statistical Sciences, University of Alberta, Edmonton, USA
Abstract  In this article, we employ the moment generating functions (mgf's) of the Binomial, Poisson, Negative-binomial and Gamma distributions to demonstrate their convergence to normality as one of their parameters increases indefinitely. The motivation behind this work is to emphasize the direct use of mgf's in the convergence proofs. These specific mgf proofs may not all be found together in any one book or single paper. Readers should find this article informative and especially useful from a pedagogical standpoint.

Keywords  Binomial distribution, Central limit theorem, Gamma distribution, Moment generating function, Negative-Binomial distribution, Poisson distribution
1. Introduction
The basic Central Limit Theorem (CLT) tells us that, when appropriately normalised, sums of independent identically distributed (i.i.d.) random variables (r.v.'s) from any distribution with finite mean and variance have distributions that converge to normality as the sample size n tends to infinity. If we accept this CLT and know that Binomial, Poisson, Negative-binomial and Gamma r.v.'s are themselves sums of i.i.d. r.v.'s, we can conclude the limiting normality of these distributions by applying the CLT. We must note, however, that the proof of this CLT is based on Characteristic Function theory involving Complex Analysis, which is typically studied only by advanced mathematics majors in colleges and universities. There are, indeed, other methods of proof available in specific cases: e.g., for the Binomial and Poisson distributions, through approximation of the probability mass function (pmf) by the corresponding normal probability density function (pdf) using Stirling's formula (cf. Stigler, S.M. 1986, pp. 70-88, [9]; Bagui et al. 2013b, p. 115, [2]), or by simply approximating the ratios of successive pmf terms of the distribution one is dealing with (cf. Proschan, M.A. 2008, pp. 62-63, [7]). However, by using the parallel (to characteristic functions) methodology of mgf's, which does not involve Complex Analysis, we can accomplish the same objective with relative ease. This is what we propose to demonstrate explicitly in this paper.

* Corresponding author: [email protected] (Subhash C. Bagui)
The structure of the paper is as follows. We provide some useful preliminary results in Section 2; these results are then used in Section 3, where we give full details of the convergence of each of the above-mentioned distributions to the normal distribution. Section 4 contains some concluding remarks.
2. Preliminaries
In this section, we state some results that will be used in the various proofs presented in Section 3.
Definition 2.1. Let $X$ be a r.v. with probability mass function (pmf) or probability density function (pdf) $f_X(x)$, $-\infty < x < \infty$. Then the moment generating function (mgf) of the r.v. $X$ is defined as

$$M_X(t) = E(e^{tX}) = \begin{cases} \sum_x e^{tx} f_X(x), & \text{if } X \text{ is discrete,} \\ \int_{-\infty}^{\infty} e^{tx} f_X(x)\,dx, & \text{if } X \text{ is continuous,} \end{cases}$$

assumed to exist and be finite for all $|t| < h$, for some $h > 0$.

If $X$ has a normal distribution with mean $\mu$ and variance $\sigma^2$, then the mgf of $X$ is given by $M_X(t) = e^{\mu t + \sigma^2 t^2/2}$, [3]. If $Z = (X - \mu)/\sigma$, then $Z$ is said to have the standard normal distribution (i.e., a normal distribution with mean zero and variance one). The mgf of $Z$ is given by $M_Z(t) = e^{t^2/2}$.
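As a quick numerical illustration of Definition 2.1 (ours, not part of the original development; it assumes SciPy is available), the following Python sketch evaluates the defining integral for a standard normal r.v. and compares it with the closed form $e^{t^2/2}$:

```python
import math
from scipy import integrate

def mgf_standard_normal(t):
    # E(e^{tZ}) for Z ~ N(0,1), via the defining integral in Definition 2.1;
    # the integrand is negligible outside [-50, 50] for moderate t
    pdf = lambda x: math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)
    value, _err = integrate.quad(lambda x: math.exp(t * x) * pdf(x), -50.0, 50.0)
    return value

for t in (0.5, 1.0, 2.0):
    print(f"t={t}: integral={mgf_standard_normal(t):.6f}  e^(t^2/2)={math.exp(t * t / 2):.6f}")
```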
Let $F_X$ denote the cumulative distribution function (cdf) of the r.v. $X$.

Theorem 2.1. Let $F_X$ and $F_Y$ be two cumulative distribution functions (cdf's) whose moments exist. If the mgf's of the r.v.'s $X$ and $Y$ exist and $M_X(t) = M_Y(t)$ for all $t$ in $-h < t < h$, $h > 0$, then $F_X(u) = F_Y(u)$ for all $u$ (i.e., $f_X(u) = f_Y(u)$ for all $u$).

A probability distribution is not always determined by its moments. Suppose $X$ has cdf $F_X$ and moments $E(X^r) = \mu'_r$ which exist for all $r = 1, 2, \ldots$. If $\sum_{r=1}^{\infty} \mu'_r t^r / r!$ has a positive radius of convergence for all $-h < t < h$, $h > 0$ (Billingsley 1995, Section 30, [4]; Serfling 1980, p. 46, [8]), then the mgf exists in the interval $-h < t < h$, $h > 0$, and hence uniquely determines the probability distribution.

A weaker sufficient condition for the moment sequence to determine a probability distribution uniquely is $\sum_{r=1}^{\infty} (\mu'_{2r})^{-1/(2r)} = +\infty$. This sufficient condition is due to Carleman (Chung 1974, p. 82, [6]; Serfling 1980, p. 46, [8]).

Theorem 2.2. Let $\{X_n, n \ge 1\}$ be a sequence of r.v.'s with corresponding mgf's $M_{X_n}(t)$, $n = 1, 2, \ldots$, and let $X$ be a r.v. with mgf $M_X(t)$, all assumed to exist for $-h < t < h$, $h > 0$. If $\lim_{n \to \infty} M_{X_n}(t) = M_X(t)$ for $-h < t < h$, then $X_n \xrightarrow{d} X$.

The notation $X_n \xrightarrow{d} X$ means that, as $n \to \infty$, the distribution of the r.v. $X_n$ converges to the distribution of the r.v. $X$.

Lemma 2.1. Let $\{\psi(n), n \ge 1\}$ be a sequence of reals. Then

$$\lim_{n \to \infty} \left( 1 + \frac{a}{n} + \frac{\psi(n)}{n} \right)^{bn} = e^{ab},$$

provided $a$ and $b$ do not depend on $n$ and $\lim_{n \to \infty} \psi(n) = 0$.

CLT (see Bagui et al. 2013a, [1]). Let $\{X_n : n \ge 1\}$ be a sequence of independent and identically distributed (i.i.d.) random variables with mean $\mu$, $-\infty < \mu < \infty$, and variance $\sigma^2$, $0 < \sigma^2 < \infty$, and set $S_n = \sum_{i=1}^{n} X_i$, $\bar{X}_n = S_n/n$, and

$$Z_n = \frac{S_n - n\mu}{\sigma \sqrt{n}} = \frac{\sqrt{n}(\bar{X}_n - \mu)}{\sigma}.$$

Then $Z_n \xrightarrow{d} Z \sim N(0,1)$ as $n \to \infty$, where $N(0,1)$ stands for a normal distribution with mean 0 and variance 1.

For Definition 2.1, Theorem 2.1, Theorem 2.2, and Lemma 2.1, see Casella and Berger, 2002, pp. 62-66, [5] and Bain and Engelhardt, 1992, p. 234, [3].
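Lemma 2.1 does the real work in each convergence proof below, so it is worth seeing numerically. The following minimal Python sketch (our illustration, not from the paper) evaluates $(1 + a/n + \psi(n)/n)^{bn}$ for a vanishing sequence $\psi(n) = 1/\sqrt{n}$ and compares it with $e^{ab}$; with $a = -t^2/2$ and $b = -1$, this is exactly the limiting step used in Sections 3.3 and 3.4:

```python
import math

def lemma_term(a, b, n, psi):
    # The expression (1 + a/n + psi(n)/n)^(b*n) appearing in Lemma 2.1
    return (1.0 + a / n + psi(n) / n) ** (b * n)

a, b = -0.5, -1.0                   # e^{ab} = e^{1/2}; matches a = -t^2/2, b = -1 at t = 1
psi = lambda n: 1.0 / math.sqrt(n)  # any sequence with psi(n) -> 0 will do

for n in (10, 100, 10_000, 1_000_000):
    print(f"n={n:>8}: term={lemma_term(a, b, n, psi):.6f}  e^(ab)={math.exp(a * b):.6f}")
```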
3. Convergence of Mgf's

3.1. Binomial

Binomial probabilities apply to situations involving a series of $n$ independent and identical trials, with two possible outcomes on each trial: a success with probability $p$ and a failure with probability $q = 1 - p$. Let $X_n$ be the number of successes in $n$ trials; then $X_n$ has a binomial distribution with parameters $n$ and $p$. The probability mass function of $X_n$ is given by $f_{X_n}(x) = \binom{n}{x} p^x (1-p)^{n-x}$, $x = 0, 1, \ldots, n$. Thus the mean of $X_n$ is $E(X_n) = np$ and the variance of $X_n$ is $\mathrm{Var}(X_n) = npq$, $q = 1 - p$. The mgf of $X_n$ is given by

$$M_{X_n}(t) = \sum_{x=0}^{n} e^{tx} \binom{n}{x} p^x q^{n-x} = \left(q + pe^t\right)^n.$$

Let $Z_n = (X_n - np)/\sigma_n$, $\sigma_n = \sqrt{npq}$. With the simplified notation $Z_n = X_n/\sigma_n - np/\sigma_n$, we derive below the mgf of $Z_n$. Now the mgf of $Z_n$ is given by
$$M_{Z_n}(t) = E(e^{tZ_n}) = E\left[e^{t(X_n/\sigma_n - np/\sigma_n)}\right] = e^{-npt/\sigma_n} E\left[e^{(t/\sigma_n)X_n}\right] = e^{-npt/\sigma_n} M_{X_n}(t/\sigma_n)$$
$$= e^{-npt/\sigma_n} \left(q + pe^{t/\sigma_n}\right)^n = \left(qe^{-pt/\sigma_n} + pe^{qt/\sigma_n}\right)^n. \qquad (3.1)$$

Based on Taylor's series expansion, there exists a number $\xi(n)$, between 0 and $qt/\sigma_n = t\sqrt{q/(np)}$, such that

$$e^{qt/\sigma_n} = 1 + \frac{qt}{\sigma_n} + \frac{q^2 t^2}{(2!)\sigma_n^2} + \frac{q^3 t^3}{(3!)\sigma_n^3} + \frac{q^4 t^4}{(4!)\sigma_n^4} e^{\xi(n)}, \qquad (3.2)$$

where $\xi(n) \to 0$ as $n \to \infty$. Similarly, based on Taylor's series expansion, there exists a number $\varsigma(n)$, between 0 and $pt/\sigma_n = t\sqrt{p/(nq)}$, such that

$$e^{-pt/\sigma_n} = 1 - \frac{pt}{\sigma_n} + \frac{p^2 t^2}{(2!)\sigma_n^2} - \frac{p^3 t^3}{(3!)\sigma_n^3} + \frac{p^4 t^4}{(4!)\sigma_n^4} e^{\varsigma(n)}, \qquad (3.3)$$

where $\varsigma(n) \to 0$ as $n \to \infty$. Now substituting these two equations (3.2) and (3.3) in the last expression for $M_{Z_n}(t)$ in (3.1), we have

$$M_{Z_n}(t) = \left[1 + \left(\frac{pqt}{\sigma_n} - \frac{pqt}{\sigma_n}\right) + \frac{pqt^2}{(2!)\sigma_n^2}(q+p) + \frac{pqt^3}{(3!)\sigma_n^3}\left(q^2 - p^2\right) + \frac{pqt^4}{(4!)\sigma_n^4}\left(q^3 e^{\xi(n)} - p^3 e^{\varsigma(n)}\right)\right]^n$$
$$= \left[1 + \frac{t^2}{2n} + \frac{t^3 (q-p)}{(n)(3!)(npq)^{1/2}} + \frac{t^4 \left(q^3 e^{\xi(n)} - p^3 e^{\varsigma(n)}\right)}{(n)(4!)(npq)}\right]^n. \qquad (3.4)$$

The above equation (3.4) may be written as

$$M_{Z_n}(t) = \left[1 + \frac{t^2}{2n} + \frac{\psi(n)}{n}\right]^n, \quad \text{where } \psi(n) = \frac{t^3 (q-p)}{(\sqrt{npq})(3!)} + \frac{t^4 \left(q^3 e^{\xi(n)} - p^3 e^{\varsigma(n)}\right)}{(npq)(4!)}.$$

Since $\xi(n), \varsigma(n) \to 0$ as $n \to \infty$, we have $\lim_{n \to \infty} \psi(n) = 0$ for every fixed value of $t$. Thus, based on Lemma 2.1 (with $a = t^2/2$ and $b = 1$), we have

$$\lim_{n \to \infty} M_{Z_n}(t) = e^{t^2/2}$$

for all real values of $t$. That is, in view of Theorems 2.1 and 2.2, we conclude that the r.v. $Z_n = (X_n - np)/\sqrt{npq}$ has the limiting standard normal distribution. Consequently, the binomial r.v. $X_n$ has, for large $n$, an approximate normal distribution with mean $\mu_n = np$ and variance $\sigma_n^2 = npq$.
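The limit in (3.4) can also be watched directly. The short Python sketch below (our illustration, not part of the paper) evaluates the exact mgf of $Z_n$ from (3.1) at a fixed $t$ and compares it with the standard normal mgf $e^{t^2/2}$:

```python
import math

def binomial_std_mgf(t, n, p):
    # Exact mgf of Z_n = (X_n - np)/sqrt(npq), equation (3.1)
    q = 1.0 - p
    sigma_n = math.sqrt(n * p * q)
    return math.exp(-n * p * t / sigma_n) * (q + p * math.exp(t / sigma_n)) ** n

t, p = 1.0, 0.3
target = math.exp(t * t / 2.0)  # mgf of N(0,1) at t
for n in (10, 100, 1_000, 100_000):
    print(f"n={n:>7}: M_Zn(t)={binomial_std_mgf(t, n, p):.6f}  e^(t^2/2)={target:.6f}")
```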
3.2. Poisson

The Poisson distribution is appropriate for predicting rare events within a certain period of time. Let $X_\lambda$ be a Poisson r.v. with parameter $\lambda$. The probability mass function of $X_\lambda$ is given by $f_{X_\lambda}(x) = e^{-\lambda}\lambda^x / x!$, $x = 0, 1, 2, \ldots$. Both the mean and variance of $X_\lambda$ are $\lambda$. The mgf of $X_\lambda$ is given by

$$M_{X_\lambda}(t) = \sum_{x=0}^{\infty} e^{tx} \frac{e^{-\lambda}\lambda^x}{x!} = e^{\lambda(e^t - 1)}.$$

For notational convenience let $\lambda = n$ and $Z_n = (X_n - n)/\sqrt{n} = X_n/\sqrt{n} - \sqrt{n}$. Below we derive the mgf of $Z_n$, which is given by

$$M_{Z_n}(t) = E(e^{tZ_n}) = E\left[e^{t(X_n/\sqrt{n} - \sqrt{n})}\right] = e^{-t\sqrt{n}} E\left[e^{(t/\sqrt{n})X_n}\right] = e^{-t\sqrt{n}} M_{X_n}(t/\sqrt{n}) = e^{-t\sqrt{n}}\, e^{n\left(e^{t/\sqrt{n}} - 1\right)}. \qquad (3.5)$$

Now consider the simplification of the term $n(e^{t/\sqrt{n}} - 1)$ as

$$n\left(e^{t/\sqrt{n}} - 1\right) = n\left[1 + \frac{t}{\sqrt{n}} + \frac{t^2}{(2!)n} + \frac{t^3}{(3!)n^{3/2}} + \frac{t^4}{(4!)n^2} e^{\varsigma(n)} - 1\right],$$

where $\varsigma(n)$ is a number between 0 and $t/\sqrt{n}$ and converges to zero as $n \to \infty$. Further, the above term $n(e^{t/\sqrt{n}} - 1)$ may be simplified as

$$n\left(e^{t/\sqrt{n}} - 1\right) = t\sqrt{n} + \frac{t^2}{2!} + \frac{t^3}{(3!)\sqrt{n}} + \frac{t^4}{(4!)n} e^{\varsigma(n)}.$$

Now substituting this in the last expression (3.5) for $M_{Z_n}(t)$, we have

$$M_{Z_n}(t) = e^{-t\sqrt{n}}\, e^{t\sqrt{n} + t^2/(2!) + t^3/[(3!)\sqrt{n}] + t^4 e^{\varsigma(n)}/[(4!)n]} = e^{t^2/2}\, b(n),$$

where $b(n) = e^{t^3/[(3!)\sqrt{n}] + t^4 e^{\varsigma(n)}/[(4!)n]}$, which tends to 1 as $n \to \infty$. Hence, we have

$$\lim_{n \to \infty} M_{Z_n}(t) = e^{t^2/2}$$

for all real values of $t$. Using Theorems 2.1 and 2.2, we conclude that $Z_n = (X_n - n)/\sqrt{n}$ has the limiting standard normal distribution. Hence, the Poisson r.v. $X_\lambda$ also has an approximate normal distribution with both mean and variance equal to $\lambda = n$, for large $n$.
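As before, the convergence is easy to check numerically. This Python sketch (ours, not from the paper) computes $\log M_{Z_n}(t) = -t\sqrt{n} + n(e^{t/\sqrt{n}} - 1)$ directly from (3.5), using `math.expm1` to avoid the cancellation in $e^{t/\sqrt{n}} - 1$ for large $n$, and compares it with $t^2/2$:

```python
import math

def log_poisson_std_mgf(t, n):
    # log of the mgf in (3.5) for Z_n = (X_n - n)/sqrt(n), with lambda = n;
    # expm1(x) = e^x - 1 computed without loss of precision for small x
    return -t * math.sqrt(n) + n * math.expm1(t / math.sqrt(n))

t = 1.5
for n in (10, 1_000, 100_000, 10_000_000):
    print(f"n={n:>9}: log M_Zn(t)={log_poisson_std_mgf(t, n):.6f}  t^2/2={t * t / 2:.6f}")
```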
3.3. Negative Binomial

Consider an infinite series of independent trials, each having two possible outcomes, success or failure. Let $p = P(\text{success})$ and $q = P(\text{failure}) = 1 - p$. Define the random variable $X_n$ to be the number of failures before the $n$th success. Then $X_n$ has a negative binomial distribution with parameters $n$ and $p$. Thus, the probability mass function of $X_n$ is given by $f_{X_n}(x) = \binom{n+x-1}{x} p^n q^x$, $x = 0, 1, 2, \ldots$. The mean of $X_n$ is given by $E(X_n) = nq/p$ and the variance of $X_n$ is given by $\mathrm{Var}(X_n) = nq/p^2$. The mgf of $X_n$ can be obtained as

$$M_{X_n}(t) = \sum_{x=0}^{\infty} e^{tx} \binom{n+x-1}{x} p^n q^x = \left[\frac{p}{1 - qe^t}\right]^n.$$

Let $Z_n = (X_n - nq/p)/(\sqrt{nq}/p) = pX_n/\sqrt{nq} - \sqrt{nq}$. Now the mgf of $Z_n$ is given by

$$M_{Z_n}(t) = E(e^{tZ_n}) = E\left[e^{t(pX_n/\sqrt{nq} - \sqrt{nq})}\right] = e^{-(\sqrt{nq})t} E\left[e^{(pt/\sqrt{nq})X_n}\right] = e^{-(\sqrt{nq})t} M_{X_n}\!\left(\frac{pt}{\sqrt{nq}}\right)$$
$$= e^{-(\sqrt{nq})t} \left[\frac{p}{1 - qe^{pt/\sqrt{nq}}}\right]^n = \left[\frac{1}{p} e^{qt/\sqrt{nq}} - \frac{q}{p} e^{t/\sqrt{nq}}\right]^{-n}. \qquad (3.6)$$

According to Taylor's series expansion, there exists a number $\xi(n)$, between 0 and $qt/\sqrt{nq} = t\sqrt{q/n}$, such that

$$\frac{1}{p} e^{qt/\sqrt{nq}} = \frac{1}{p}\left[1 + \frac{qt}{\sqrt{nq}} + \frac{q^2 t^2}{(2!)nq} + \frac{q^3 t^3}{(3!)(nq)^{3/2}} e^{\xi(n)}\right]$$
$$= \frac{1}{p} + \frac{qt}{p\sqrt{nq}} + \frac{qt^2}{p(2n)} + \frac{q^3 t^3}{p(3!)(nq)^{3/2}} e^{\xi(n)}, \qquad (3.7)$$

where $\xi(n) \to 0$ as $n \to \infty$. Similarly, there exists a number $\varsigma(n)$, between 0 and $t/\sqrt{nq}$, such that

$$\frac{q}{p} e^{t/\sqrt{nq}} = \frac{q}{p}\left[1 + \frac{t}{\sqrt{nq}} + \frac{t^2}{(2!)nq} + \frac{t^3}{(3!)(nq)^{3/2}} e^{\varsigma(n)}\right]$$
$$= \frac{q}{p} + \frac{qt}{p\sqrt{nq}} + \frac{t^2}{p(2n)} + \frac{qt^3}{p(3!)(nq)^{3/2}} e^{\varsigma(n)}, \qquad (3.8)$$

where $\varsigma(n) \to 0$ as $n \to \infty$. Now substituting these two expressions (3.7) and (3.8) in the last expression for $M_{Z_n}(t)$ in (3.6), we have

$$M_{Z_n}(t) = \left[\left(\frac{1}{p} - \frac{q}{p}\right) - \left(\frac{1-q}{p}\right)\frac{t^2}{2n} + \frac{qt^3}{p(3!)(nq)^{3/2}}\left(q^2 e^{\xi(n)} - e^{\varsigma(n)}\right)\right]^{-n}$$
$$= \left[1 - \frac{t^2}{2n} + \frac{t^3}{(n)\,p(3!)\sqrt{nq}}\left(q^2 e^{\xi(n)} - e^{\varsigma(n)}\right)\right]^{-n}. \qquad (3.9)$$

The above equation (3.9) can be written as

$$M_{Z_n}(t) = \left[1 - \frac{t^2}{2n} + \frac{\psi(n)}{n}\right]^{-n}, \quad \text{where } \psi(n) = \frac{t^3}{p(3!)\sqrt{nq}}\left(q^2 e^{\xi(n)} - e^{\varsigma(n)}\right).$$

Since both $\xi(n), \varsigma(n) \to 0$ as $n \to \infty$, $\lim_{n \to \infty} \psi(n) = 0$ for every fixed value of $t$. Hence, by Lemma 2.1 (with $a = -t^2/2$ and $b = -1$, so that $e^{ab} = e^{t^2/2}$), we have

$$\lim_{n \to \infty} M_{Z_n}(t) = e^{t^2/2}$$

for all real values of $t$. Hence, by Theorems 2.1 and 2.2, we conclude that the r.v. $Z_n = (X_n - nq/p)/(\sqrt{nq}/p)$ has the limiting standard normal distribution. Accordingly, the negative-binomial r.v. $X_n$ has approximately a normal distribution with mean $\mu_n = nq/p$ and variance $\sigma_n^2 = nq/p^2$, for large $n$.
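A numerical check of (3.6) is again straightforward; the sketch below (ours, not from the paper) works on the log scale and is valid whenever $qe^{pt/\sqrt{nq}} < 1$, which holds for any fixed $t$ once $n$ is large enough:

```python
import math

def log_negbinomial_std_mgf(t, n, p):
    # log of the mgf in (3.6) for Z_n = (X_n - nq/p)/(sqrt(nq)/p);
    # requires q * exp(p*t/sqrt(nq)) < 1
    q = 1.0 - p
    root_nq = math.sqrt(n * q)
    return -root_nq * t + n * (math.log(p) - math.log(1.0 - q * math.exp(p * t / root_nq)))

t, p = 1.0, 0.4
for n in (10, 1_000, 100_000):
    print(f"n={n:>7}: log M_Zn(t)={log_negbinomial_std_mgf(t, n, p):.6f}  t^2/2={t * t / 2:.6f}")
```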
3.4. Gamma

The Gamma distribution is appropriate for modeling waiting times for events. Let $X$ be a Gamma r.v. with pdf $f_X(x) = \frac{1}{\Gamma(\alpha)\beta^{\alpha}} x^{\alpha-1} e^{-x/\beta}$, $\alpha, \beta > 0$ and $x > 0$. Here $\alpha$ is called the shape parameter of the distribution and $\beta$ is called the scale parameter. For convenience, let us set $\alpha = n$. It is well known that the mean of $X$ is $E(X) = n\beta$ and the variance of $X$ is $\mathrm{Var}(X) = n\beta^2$. The mgf of $X$ is given by

$$M_X(t) = \frac{1}{\Gamma(n)\beta^n} \int_0^{\infty} e^{tx} x^{n-1} e^{-x/\beta}\, dx = (1 - \beta t)^{-n}, \quad t < 1/\beta.$$

Let $Z_n = (X - n\beta)/(\beta\sqrt{n}) = X/(\beta\sqrt{n}) - \sqrt{n}$. The mgf of $Z_n$ is given by

$$M_{Z_n}(t) = E(e^{tZ_n}) = E\left[e^{t(X/(\beta\sqrt{n}) - \sqrt{n})}\right] = e^{-t\sqrt{n}} E\left[e^{(t/(\beta\sqrt{n}))X}\right] = e^{-t\sqrt{n}} M_X\!\left(\frac{t}{\beta\sqrt{n}}\right)$$
$$= e^{-t\sqrt{n}} \left(1 - \frac{t}{\sqrt{n}}\right)^{-n} = \left[e^{t/\sqrt{n}} - \frac{t}{\sqrt{n}} e^{t/\sqrt{n}}\right]^{-n}, \quad t < \sqrt{n}. \qquad (3.10)$$

Observe that

$$e^{t/\sqrt{n}} = 1 + \frac{t}{\sqrt{n}} + \frac{t^2}{2n} + \frac{t^3}{(3!)n^{3/2}} e^{\xi(n)},$$

where $\xi(n)$ is a number between 0 and $t/\sqrt{n}$ and tends to zero as $n \to \infty$, and

$$\frac{t}{\sqrt{n}} e^{t/\sqrt{n}} = \frac{t}{\sqrt{n}} + \frac{t^2}{n} + \frac{t^3}{(2!)n^{3/2}} + \frac{t^4}{(3!)n^2} e^{\xi(n)}.$$

Now substituting these two in the last expression of $M_{Z_n}(t)$ in (3.10), we have

$$M_{Z_n}(t) = \left(1 - \frac{t^2}{2n} + \frac{t^3 e^{\xi(n)}}{(3!)n^{3/2}} - \frac{t^3}{(2!)n^{3/2}} - \frac{t^4 e^{\xi(n)}}{(3!)n^2}\right)^{-n}.$$

This can be written as

$$M_{Z_n}(t) = \left[1 - \frac{t^2}{2n} + \frac{\psi(n)}{n}\right]^{-n}, \quad \text{where } \psi(n) = \frac{t^3 e^{\xi(n)}}{(3!)\sqrt{n}} - \frac{t^3}{(2!)\sqrt{n}} - \frac{t^4 e^{\xi(n)}}{(3!)n}.$$

Since $\xi(n) \to 0$ as $n \to \infty$, $\lim_{n \to \infty} \psi(n) = 0$ for every fixed value of $t$. Hence, by Lemma 2.1, we have

$$\lim_{n \to \infty} M_{Z_n}(t) = e^{t^2/2}$$

for all real values of $t$. Hence, by Theorems 2.1 and 2.2, we conclude that the r.v. $Z_n = (X - n\beta)/(\beta\sqrt{n})$ has the limiting standard normal distribution. Accordingly, the Gamma r.v. $X$ has approximately a normal distribution with mean $\mu_n = n\beta$ and variance $\sigma_n^2 = n\beta^2$, for large $n$.
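To close the loop on the distributional statement (rather than the mgf), the following sketch, ours and assuming SciPy is available, compares the cdf of the standardized Gamma r.v. $Z_n = (X - n\beta)/(\beta\sqrt{n})$ with the standard normal cdf at a few points:

```python
from scipy import stats

beta = 2.0
for n in (5, 50, 5_000):
    gamma = stats.gamma(a=n, scale=beta)      # shape alpha = n, scale beta
    for z in (-1.0, 0.0, 1.0):
        x = n * beta + z * beta * n ** 0.5    # the x-value at which Z_n = z
        print(f"n={n:>5}, z={z:+.1f}: Gamma cdf={gamma.cdf(x):.4f}  "
              f"Normal cdf={stats.norm.cdf(z):.4f}")
```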
4. Concluding Remarks

It is well known that a Binomial r.v. is the sum of i.i.d. Bernoulli r.v.'s; a Poisson $P(\lambda)$ r.v., with $\lambda = n$ a positive integer, is the sum of $n$ i.i.d. $P(1)$ r.v.'s; a Negative-binomial r.v. is the sum of i.i.d. geometric r.v.'s; and a Gamma r.v. is the sum of i.i.d. exponential r.v.'s. In view of these facts, one can easily conclude, by applying the above-stated general CLT, that these distributions, after proper normalization, converge to a normal distribution as $n$, the number of terms in their respective sums, increases to infinity. But these facts may be beyond the knowledge of undergraduate students, especially those who are non-math majors. However, as demonstrated in the preceding Section 3 for the Binomial, Poisson, Negative-binomial and Gamma distributions, in distributional convergence problems where the individual mgf's exist and are available, we can use the mgf technique effectively to deduce the limiting distributions formally. In our view, this latter technique is natural, equally instructive, and at a more manageable level. In any case, it provides an alternative approach.
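The sum representations mentioned above are themselves easy to verify by simulation. The following Python sketch (our illustration; the parameter choices are arbitrary) builds each of the four r.v.'s as a sum of i.i.d. terms and checks that the standardized sums are nearly symmetric, as the limiting normality predicts:

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps = 200, 10_000

# Each r.v. is generated as a sum of n i.i.d. terms, per the representations above.
# Note: rng.geometric counts trials until the first success, so subtract 1 for failures.
samples = {
    "Binomial(n, .3)":  rng.binomial(1, 0.3, (reps, n)).sum(axis=1),       # Bernoulli sums
    "Poisson(n)":       rng.poisson(1.0, (reps, n)).sum(axis=1),           # P(1) sums
    "Neg-binomial(.4)": (rng.geometric(0.4, (reps, n)) - 1).sum(axis=1),   # geometric sums
    "Gamma(n, 2)":      rng.exponential(2.0, (reps, n)).sum(axis=1),       # exponential sums
}
for name, s in samples.items():
    z = (s - s.mean()) / s.std()
    print(f"{name:>16}: sample skewness = {(z ** 3).mean():+.3f}  (0 for exact normality)")
```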
In their proofs of the general central limit theorem using mgf's, both Bain and Engelhardt (1992), [3] and Inlow (2010), [6a] use the mgf of a sum of i.i.d. r.v.'s. We, by contrast, use the existing mgf's of the above-mentioned distributions directly, without treating them as sums of i.i.d. r.v.'s. Bain and Engelhardt (1992), [3] discuss a proof of the convergence of the binomial to the normal using mgf's; this paper formalizes mgf proofs for a whole collection of distributions. Framed in this way, the paper can serve as an excellent teaching reference. The proofs are straightforward and require only a knowledge of Taylor series expansions, beyond the skills needed to handle algebraic expressions and basic probabilistic concepts. The material should be of pedagogical interest and can be discussed in classes where basic calculus and the ability to deal with algebraic expressions are the only background requirements. The article should also be of reading interest to senior undergraduate students in probability and statistics.
ACKNOWLEDGEMENTS
The authors are thankful to the Editor-in-Chief and an
anonymous referee for their careful reading of the paper.
REFERENCES

[1] Bagui, S.C., Bhaumik, D.K., and Mehra, K.L. (2013a). A few counterexamples useful in teaching central limit theorems, The American Statistician, 67(1), 49-56.

[2] Bagui, S.C., Bagui, S.S., and Hemasinha, R. (2013b). Nonrigorous proofs of Stirling's formula, Mathematics and Computer Education, 47(2), 115-125.

[3] Bain, L.J. and Engelhardt, M. (1992). Introduction to Probability and Mathematical Statistics, 2nd edition, Belmont: Duxbury Press.

[4] Billingsley, P. (1995). Probability and Measure, 3rd edition, New York: Wiley.

[5] Casella, G. and Berger, R.L. (2002). Statistical Inference, Pacific Grove: Duxbury.

[6] Chung, K.L. (1974). A Course in Probability Theory, New York: Academic Press.

[6a] Inlow, M. (2010). A moment generating function proof of the Lindeberg-Lévy central limit theorem, The American Statistician, 64(3), 228-230.

[7] Proschan, M.A. (2008). The normal approximation to the binomial, The American Statistician, 62(1), 62-63.

[8] Serfling, R.J. (1980). Approximation Theorems of Mathematical Statistics, New York: Wiley.

[9] Stigler, S.M. (1986). The History of Statistics: The Measurement of Uncertainty before 1900, Cambridge, MA: The Belknap Press of Harvard University Press.