THE PUBLISHING HOUSE
OF THE ROMANIAN ACADEMY
PROCEEDINGS OF THE ROMANIAN ACADEMY, Series A,
Volume 12, Number 1/2011, pp. 22–29
THE MODIFIED EXPONENTIAL-POISSON DISTRIBUTION
Vasile PREDA*, Eugenia PANAITESCU**, Roxana CIUMARA***
* University of Bucharest, Faculty of Mathematics and Computer Science, Bucharest, Romania
** "Carol Davila" University of Medicine and Pharmacy, Department of Medical Informatics and Biostatistics, Bucharest, Romania
*** Academy of Economic Studies, Department of Mathematics, Bucharest, Romania
E-mail: [email protected]
The two-parameter distribution known as exponential-Poisson (EP) distribution, which has decreasing
failure rate, was introduced by Kus (2007). In this paper we generalize the EP distribution and show
that the failure rate of the new distribution can be decreasing or increasing. The failure rate can also
be upside-down bathtub shaped. We provide closed-form expressions for the density, cumulative
distribution, survival and failure rate functions. The EM algorithm is used to determine the maximum
likelihood estimates and the asymptotic variances and covariances of these estimates are obtained.
Key words: Compounding; Failure rate; EM algorithm; Maximum likelihood estimates; Modified
exponential-Poisson distribution.
1. INTRODUCTION
The exponential distribution (ED) provides a simple, elegant and closed-form solution to many problems
in lifetime testing and reliability studies. However, the ED does not provide a reasonable parametric fit for
some practical applications where the underlying hazard rates are nonconstant, presenting monotone shapes.
In recent years, in order to overcome such a problem, new classes of models were introduced based on
modifications of the ED. Gupta and Kundu (1999) proposed a generalized ED. This extended family can
accommodate data with increasing and decreasing failure rate functions. Adamidis and Loukas (1998)
introduced a distribution with decreasing failure rate. This distribution is known as exponential-geometric
distribution and is obtained by compounding an exponential distribution with a geometric distribution. In the
same fashion, Kus (2007) introduced a two-parameter distribution known as exponential-Poisson (EP)
distribution, which has decreasing failure rate, by compounding an exponential distribution with a Poisson
distribution. Barreto-Souza and Cribari-Neto (2009) generalized the distribution proposed by Kus
(2007) by including a power parameter.
In this paper, we propose a new distribution family based on the ED with increasing or decreasing
failure rate function. Its genesis is stated on a parameterization scheme for a survival function proposed by
Marshall and Olkin (1997).
The paper is organized as follows. Sections 2 and 3 present the new distribution and its properties, and
Section 4 outlines an EM-type algorithm for maximum likelihood estimation.
2. GENESIS OF THE DISTRIBUTION
Marshall and Olkin (1997) introduced a parameterization scheme for a distribution function $F(y)$
by defining another distribution function

$$F(y, a) = \frac{F(y)}{F(y) + a\,(1 - F(y))}, \quad y \in \mathbb{R},\ a > 0.$$

We use this parameterization to obtain the modified exponential distribution function

$$\hat{F}(x, \alpha, \beta) = \frac{1 - e^{-\beta x}}{1 - (1 - \alpha)e^{-\beta x}}, \quad x > 0,\ \alpha, \beta > 0.$$
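Let $W_1, W_2, \ldots, W_Z$ be a random sample from the modified exponential distribution with density

$$\hat{f}(w, \alpha, \beta) = \frac{\alpha\beta e^{-\beta w}}{(1 - \bar{\alpha}e^{-\beta w})^{2}},$$

where $w, \alpha, \beta > 0$ and $\bar{\alpha} = 1 - \alpha$, and let $Z$ be a zero-truncated Poisson variable with
probability function

$$P(z; \lambda) = \frac{e^{-\lambda}\lambda^{z}}{\Gamma(z + 1)(1 - e^{-\lambda})}, \quad z \in \{1, 2, \ldots\},\ \lambda > 0,$$

where $\Gamma(\cdot)$ is the gamma function and $Z$ and the $W_i$ are independent.

Let $X = \min(W_1, W_2, \ldots, W_Z)$. Then

$$\hat{f}(x \mid z, \alpha, \beta) = \frac{\alpha^{z}\beta z\,e^{-\beta z x}}{(1 - \bar{\alpha}e^{-\beta x})^{z+1}}$$

and the marginal probability density function of $X$ is

$$f(x, \alpha, \beta, \lambda) = \frac{\alpha\beta\lambda\exp\!\left(-\lambda - \beta x + \dfrac{\alpha\lambda e^{-\beta x}}{1 - \bar{\alpha}e^{-\beta x}}\right)}{(1 - \bar{\alpha}e^{-\beta x})^{2}(1 - e^{-\lambda})}, \qquad (1)$$

where $x > 0$ and $\alpha, \beta, \lambda > 0$.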
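The compounding construction can be checked numerically. The following is a minimal sketch, assuming only the Python standard library; the function names, the seed and the parameter values are illustrative choices, not part of the paper. It evaluates the density (1), draws $X = \min(W_1, \ldots, W_Z)$ directly from its definition, and compares an empirical probability with the integral of (1).

```python
import math
import random

def mep_pdf(x, alpha, beta, lam):
    """Density (1); abar stands for 1 - alpha."""
    abar = 1.0 - alpha
    t = math.exp(-beta * x)
    d = 1.0 - abar * t
    u = alpha * lam * t / d
    return (alpha * beta * lam * math.exp(-lam - beta * x + u)
            / (d * d * (1.0 - math.exp(-lam))))

def trapezoid(f, a, b, n):
    """Composite trapezoidal rule with n subintervals."""
    h = (b - a) / n
    return h * (0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, n)))

def sample_zero_truncated_poisson(lam, rng):
    """Inversion sampler for P(z; lam), z = 1, 2, ..."""
    p = lam * math.exp(-lam) / (1.0 - math.exp(-lam))  # P(Z = 1)
    z, cum, v = 1, p, rng.random()
    while v > cum and p > 0.0:
        z += 1
        p *= lam / z          # P(Z = z) from P(Z = z - 1)
        cum += p
    return z

def sample_modified_exponential(alpha, beta, rng):
    """Inverse-CDF draw from the modified exponential distribution of W."""
    abar = 1.0 - alpha
    p = rng.random()
    return -math.log((1.0 - p) / (1.0 - abar * p)) / beta

def sample_mep(alpha, beta, lam, rng):
    """X = min(W_1, ..., W_Z) with Z zero-truncated Poisson."""
    z = sample_zero_truncated_poisson(lam, rng)
    return min(sample_modified_exponential(alpha, beta, rng) for _ in range(z))

rng = random.Random(2011)            # arbitrary seed for reproducibility
alpha, beta, lam = 0.5, 1.0, 2.0     # illustrative parameter values
draws = [sample_mep(alpha, beta, lam, rng) for _ in range(20000)]
empirical = sum(x <= 0.5 for x in draws) / len(draws)
from_density = trapezoid(lambda x: mep_pdf(x, alpha, beta, lam), 0.0, 0.5, 2000)
print(empirical, from_density)
```

The two printed numbers should agree up to Monte Carlo error, which is only a sanity check of the construction and not a substitute for the derivation.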
Fig. 1 – Probability density function of MEP distribution.
In the sequel, the distribution of X will be referred to as the modified exponential-Poisson (MEP)
distribution, following the custom in the literature of naming a distribution after the compounding
operation that produces it. The EP distribution introduced by Kus (2007) is a particular case of the MEP
distribution corresponding to α = 1.
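It can be seen that the MEP density function is bell-shaped for $\alpha > 2$ and

$$0 < \lambda < \min\left\{\frac{2\bar{\alpha}^{2}}{\alpha},\ \alpha - 2\right\},$$

with modal value

$$\frac{\alpha\beta\lambda\,t_0\exp\!\left(-\lambda + \dfrac{\alpha\lambda t_0}{1 - \bar{\alpha}t_0}\right)}{(1 - \bar{\alpha}t_0)^{2}(1 - e^{-\lambda})} \quad \text{at} \quad x_0 = -\frac{1}{\beta}\ln(t_0),$$

where

$$t_0 = \frac{\alpha\lambda + \sqrt{\alpha^{2}\lambda^{2} + 4\bar{\alpha}^{2}}}{2\bar{\alpha}^{2}}.$$

If the condition is false then the MEP density function is monotone decreasing
(reverse J-shaped) with modal value

$$\frac{\alpha\beta\lambda\exp\!\left(-\lambda + \dfrac{\alpha\lambda}{1 - \bar{\alpha}}\right)}{(1 - \bar{\alpha})^{2}(1 - e^{-\lambda})} = \frac{\beta\lambda}{\alpha(1 - e^{-\lambda})}$$

at $x = 0$.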
MEP probability density function is displayed in Fig. 1 for selected parameter values.
3. PROPERTIES OF THE DISTRIBUTION
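The distribution function is given by

$$F(x; \alpha, \beta, \lambda) = \frac{1 - \exp\!\left(-\lambda + \dfrac{\alpha\lambda e^{-\beta x}}{1 - \bar{\alpha}e^{-\beta x}}\right)}{1 - e^{-\lambda}}. \qquad (2)$$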
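Because (2) is available in closed form, it can be inverted explicitly. The sketch below assumes the standard convention that the quantile at level $q \in (0, 1)$ solves $F(x) = q$; the function names are illustrative and `abar` stands for $1 - \alpha$.

```python
import math

def mep_cdf(x, alpha, beta, lam):
    """Distribution function (2)."""
    abar = 1.0 - alpha
    t = math.exp(-beta * x)
    u = alpha * lam * t / (1.0 - abar * t)
    return (1.0 - math.exp(-lam + u)) / (1.0 - math.exp(-lam))

def mep_quantile(q, alpha, beta, lam):
    """Solve F(x) = q for 0 < q < 1 by inverting (2) step by step."""
    abar = 1.0 - alpha
    # Step 1: the exponent term u = alpha*lam*t / (1 - abar*t) implied by F = q.
    u = math.log(q + (1.0 - q) * math.exp(lam))
    # Step 2: solve u for t = exp(-beta*x), then take logs.
    return math.log(alpha * lam / u + abar) / beta

# Illustrative parameter values: the median and a round trip through the CDF.
median = mep_quantile(0.5, 0.5, 1.0, 2.0)
print(median, mep_cdf(median, 0.5, 1.0, 2.0))
```

The same function gives an inverse-transform sampler when `q` is replaced by a uniform random number on (0, 1).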
The $q$th quantile $x_q$, $0 < q < 1$, is obtained by solving $F(x_q; \alpha, \beta, \lambda) = q$ in (2):

$$x_q = \frac{1}{\beta}\ln\left(\frac{\alpha\lambda}{\ln\left(q + (1 - q)e^{\lambda}\right)} + \bar{\alpha}\right).$$

In particular, the median is

$$x_{1/2} = \frac{1}{\beta}\ln\left(\frac{\alpha\lambda}{\ln\left(\dfrac{1 + e^{\lambda}}{2}\right)} + \bar{\alpha}\right).$$
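The $r$th raw moment of the MEP distribution can be written as

$$E(X^{r}; \alpha, \beta, \lambda) = \frac{\alpha\lambda e^{-\lambda}\,\Gamma(r + 1)}{\beta^{r}(1 - e^{-\lambda})}\sum_{m=0}^{\infty}\sum_{k=0}^{\infty}\binom{k + m + 1}{k}\frac{\alpha^{m}\bar{\alpha}^{k}\lambda^{m}}{m!\,(m + k + 1)^{r+1}}.$$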
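The double series can be evaluated by truncation and compared with direct numerical integration of $x^{r} f(x)$. The sketch below is illustrative only: the truncation limits and function names are assumptions, and the inner sum is a binomial-type expansion in $\bar{\alpha} = 1 - \alpha$, so the code assumes $|1 - \alpha| < 1$ for convergence.

```python
import math

def mep_pdf(x, alpha, beta, lam):
    """Density (1); abar stands for 1 - alpha."""
    abar = 1.0 - alpha
    t = math.exp(-beta * x)
    d = 1.0 - abar * t
    u = alpha * lam * t / d
    return (alpha * beta * lam * math.exp(-lam - beta * x + u)
            / (d * d * (1.0 - math.exp(-lam))))

def mep_raw_moment_series(r, alpha, beta, lam, m_max=60, k_max=300):
    """Truncated double series for E(X^r); assumes |1 - alpha| < 1."""
    abar = 1.0 - alpha
    total = 0.0
    for m in range(m_max):
        coef_m = (alpha * lam) ** m / math.factorial(m)
        for k in range(k_max):
            total += (math.comb(k + m + 1, k) * coef_m * abar ** k
                      / (m + k + 1) ** (r + 1))
    front = (alpha * lam * math.exp(-lam) * math.gamma(r + 1)
             / (beta ** r * (1.0 - math.exp(-lam))))
    return front * total

def mep_raw_moment_numeric(r, alpha, beta, lam, upper=60.0, n=60000):
    """Trapezoidal approximation of the integral of x^r f(x) on [0, upper]."""
    h = upper / n
    g = lambda x: x ** r * mep_pdf(x, alpha, beta, lam)
    return h * (0.5 * (g(0.0) + g(upper)) + sum(g(i * h) for i in range(1, n)))

print(mep_raw_moment_series(1, 0.5, 1.0, 2.0),
      mep_raw_moment_numeric(1, 0.5, 1.0, 2.0))
```

Exact integer binomial coefficients from `math.comb` are used to avoid overflow in intermediate factorials; the variance then follows as the second raw moment minus the squared mean.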
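Hence the mean and variance of the MEP distribution are given, respectively, by

$$E(X; \alpha, \beta, \lambda) = \frac{\alpha\lambda e^{-\lambda}}{\beta(1 - e^{-\lambda})}\sum_{m=0}^{\infty}\sum_{k=0}^{\infty}\binom{k + m + 1}{k}\frac{\alpha^{m}\bar{\alpha}^{k}\lambda^{m}}{m!\,(m + k + 1)^{2}},$$

$$\mathrm{Var}(X; \alpha, \beta, \lambda) = \frac{2\alpha\lambda e^{-\lambda}}{\beta^{2}(1 - e^{-\lambda})}\sum_{m=0}^{\infty}\sum_{k=0}^{\infty}\binom{k + m + 1}{k}\frac{\alpha^{m}\bar{\alpha}^{k}\lambda^{m}}{m!\,(m + k + 1)^{3}} - \left[\frac{\alpha\lambda e^{-\lambda}}{\beta(1 - e^{-\lambda})}\sum_{m=0}^{\infty}\sum_{k=0}^{\infty}\binom{k + m + 1}{k}\frac{\alpha^{m}\bar{\alpha}^{k}\lambda^{m}}{m!\,(m + k + 1)^{2}}\right]^{2}.$$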
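Using (1) and (2), the survival function (also known as the reliability function) and the hazard function
(also known as the failure rate function) of the MEP distribution are given, respectively, by

$$s(x; \alpha, \beta, \lambda) = \frac{1 - \exp\!\left(\dfrac{\alpha\lambda e^{-\beta x}}{1 - \bar{\alpha}e^{-\beta x}}\right)}{1 - e^{\lambda}}, \qquad (3)$$

$$h(x; \alpha, \beta, \lambda) = \frac{\alpha\beta\lambda\,(1 - e^{\lambda})\exp\!\left(-\lambda - \beta x + \dfrac{\alpha\lambda e^{-\beta x}}{1 - \bar{\alpha}e^{-\beta x}}\right)}{(1 - \bar{\alpha}e^{-\beta x})^{2}(1 - e^{-\lambda})\left[1 - \exp\!\left(\dfrac{\alpha\lambda e^{-\beta x}}{1 - \bar{\alpha}e^{-\beta x}}\right)\right]}. \qquad (4)$$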
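The shape of the hazard function can be explored numerically. The sketch below is illustrative: it uses an algebraically equivalent form of (4), obtained from the identity $(1 - e^{\lambda})e^{-\lambda}/(1 - e^{-\lambda}) = -1$, and the two parameter sets are arbitrary examples (the EP case $\alpha = 1$ and one case with $\alpha > 1$).

```python
import math

def mep_hazard(x, alpha, beta, lam):
    """Hazard (4), simplified with (1 - e^lam) e^(-lam) / (1 - e^(-lam)) = -1."""
    abar = 1.0 - alpha                 # abar stands for 1 - alpha
    t = math.exp(-beta * x)
    d = 1.0 - abar * t
    u = alpha * lam * t / d
    # expm1(u) = e^u - 1 keeps accuracy when u is small (large x).
    return alpha * beta * lam * t * math.exp(u) / (d * d * math.expm1(u))

grid = [0.05 * i for i in range(161)]                 # x from 0 to 8
ep_case = [mep_hazard(x, 1.0, 1.0, 1.0) for x in grid]
other_case = [mep_hazard(x, 5.0, 1.0, 1.0) for x in grid]
print("alpha = 1:", ep_case[0], "->", ep_case[-1])
print("alpha = 5:", other_case[0], "->", other_case[-1])
```

At $x = 0$ the hazard equals $\beta\lambda / (\alpha(1 - e^{-\lambda}))$ and it tends to $\beta$ as $x \to \infty$, so comparing these two values gives a quick first indication of the overall trend.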
We shall now show that the failure rate of the MEP distribution can be decreasing or increasing
depending on the parameter values.
Fig. 2 – Hazard function of MEP distribution.
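Define $\eta(x) = \dfrac{f'(x)}{f(x)}$, where $f'$ denotes the first derivative of $f$. It is straightforward to show that

$$\eta(x) = \frac{-\beta\left(1 + \alpha\lambda e^{-\beta x} - \bar{\alpha}^{2}e^{-2\beta x}\right)}{(1 - \bar{\alpha}e^{-\beta x})^{2}}$$

and

$$\eta'(x) = -\beta^{2}e^{-\beta x}\,\frac{(\alpha\lambda - 2\bar{\alpha})(1 - \bar{\alpha}e^{-\beta x}) - 2\alpha\lambda}{(1 - \bar{\alpha}e^{-\beta x})^{3}}.$$

If $\alpha > 1$ (then $\bar{\alpha} < 0$), for $2\bar{\alpha} + \alpha\lambda > 0$ and $x > \dfrac{1}{\beta}\ln\dfrac{\bar{\alpha}(2\bar{\alpha} - \alpha\lambda)}{2\bar{\alpha} + \alpha\lambda}$ we have $\eta'(x) > 0$, and for $2\bar{\alpha} + \alpha\lambda > 0$ and $0 < x < \dfrac{1}{\beta}\ln\dfrac{\bar{\alpha}(2\bar{\alpha} - \alpha\lambda)}{2\bar{\alpha} + \alpha\lambda}$ we have $\eta'(x) < 0$.

If $0 < \alpha < 1$ (then $\bar{\alpha} > 0$), for $2\bar{\alpha} - \alpha\lambda > 0$ and $0 < x < \dfrac{1}{\beta}\ln\dfrac{\bar{\alpha}(2\bar{\alpha} - \alpha\lambda)}{2\bar{\alpha} + \alpha\lambda}$ we have $\eta'(x) < 0$, and for $2\bar{\alpha} - \alpha\lambda > 0$ and $x > \dfrac{1}{\beta}\ln\dfrac{\bar{\alpha}(2\bar{\alpha} - \alpha\lambda)}{2\bar{\alpha} + \alpha\lambda}$ we have $\eta'(x) > 0$.

Glaser (1980) states his theorem in terms of $-f'(x)/f(x) = -\eta(x)$. With $\eta$ as defined here, it follows from that theorem that if $\eta'(x) < 0$ for all $x > 0$ the failure rate is increasing, and if $\eta'(x) > 0$ for all $x > 0$ the failure rate is decreasing. For $\alpha = 1$ we have $\eta'(x) = \beta^{2}\lambda e^{-\beta x} > 0$, which recovers the decreasing failure rate of the EP distribution.

The mean residual life of the MEP distribution is given by

$$m(x_0; \alpha, \beta, \lambda) = E(X - x_0 \mid X \ge x_0; \alpha, \beta, \lambda) = \frac{1}{\exp\!\left(\dfrac{\alpha\lambda e^{-\beta x_0}}{1 - \bar{\alpha}e^{-\beta x_0}}\right) - 1}\sum_{m=1}^{\infty}\sum_{j=0}^{\infty}\frac{\alpha^{m}\bar{\alpha}^{j}\lambda^{m}\,\Gamma(m + j)}{m!\,j!\,\Gamma(m)\,\beta\,(m + j)}\,e^{-\beta(m + j)x_0}.$$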
4. ESTIMATION OF THE PARAMETERS
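The log-likelihood function based on the observed sample of size $n$, $y_{obs} = (x_i;\ i = 1, 2, \ldots, n)$, from the MEP
distribution is given by

$$\ell_n(\alpha, \beta, \lambda; y_{obs}) = n\ln\alpha + n\ln\beta + n\ln\lambda - n\ln(1 - e^{-\lambda}) - n\lambda - \beta\sum_{i=1}^{n}x_i + \alpha\lambda\sum_{i=1}^{n}\frac{e^{-\beta x_i}}{1 - \bar{\alpha}e^{-\beta x_i}} - 2\sum_{i=1}^{n}\ln(1 - \bar{\alpha}e^{-\beta x_i})$$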
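and subsequently the associated gradients are found to be

$$\frac{\partial\ell_n(\alpha, \beta, \lambda; y_{obs})}{\partial\alpha} = \frac{n}{\alpha} + \lambda\sum_{i=1}^{n}\frac{(1 - e^{-\beta x_i})e^{-\beta x_i}}{(1 - \bar{\alpha}e^{-\beta x_i})^{2}} - 2\sum_{i=1}^{n}\frac{e^{-\beta x_i}}{1 - \bar{\alpha}e^{-\beta x_i}}, \qquad (5)$$

$$\frac{\partial\ell_n(\alpha, \beta, \lambda; y_{obs})}{\partial\beta} = \frac{n}{\beta} - \sum_{i=1}^{n}x_i - \alpha\lambda\sum_{i=1}^{n}\frac{x_i e^{-\beta x_i}}{(1 - \bar{\alpha}e^{-\beta x_i})^{2}} - 2\sum_{i=1}^{n}\frac{\bar{\alpha}x_i e^{-\beta x_i}}{1 - \bar{\alpha}e^{-\beta x_i}}, \qquad (6)$$

$$\frac{\partial\ell_n(\alpha, \beta, \lambda; y_{obs})}{\partial\lambda} = \frac{n}{\lambda} - n + \alpha\sum_{i=1}^{n}\frac{e^{-\beta x_i}}{1 - \bar{\alpha}e^{-\beta x_i}} - \frac{ne^{-\lambda}}{1 - e^{-\lambda}}. \qquad (7)$$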
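The gradients (5)-(7) can be checked against central finite differences of the log-likelihood. The sketch below is illustrative; the sample, the parameter point and the function names are hypothetical, and `abar` stands for $1 - \alpha$ (so it changes with $\alpha$ when differentiating).

```python
import math

def mep_loglik(theta, xs):
    """Log-likelihood of the sample xs at theta = (alpha, beta, lam)."""
    alpha, beta, lam = theta
    abar = 1.0 - alpha
    n = len(xs)
    ll = n * (math.log(alpha) + math.log(beta) + math.log(lam)
              - math.log(1.0 - math.exp(-lam)) - lam) - beta * sum(xs)
    for x in xs:
        t = math.exp(-beta * x)
        d = 1.0 - abar * t
        ll += alpha * lam * t / d - 2.0 * math.log(d)
    return ll

def mep_score(theta, xs):
    """Analytic gradients (5), (6), (7)."""
    alpha, beta, lam = theta
    abar = 1.0 - alpha
    n = len(xs)
    g_alpha = n / alpha
    g_beta = n / beta - sum(xs)
    g_lam = n / lam - n - n * math.exp(-lam) / (1.0 - math.exp(-lam))
    for x in xs:
        t = math.exp(-beta * x)
        d = 1.0 - abar * t
        g_alpha += lam * (1.0 - t) * t / d ** 2 - 2.0 * t / d
        g_beta += -alpha * lam * x * t / d ** 2 - 2.0 * abar * x * t / d
        g_lam += alpha * t / d
    return (g_alpha, g_beta, g_lam)

def numeric_score(theta, xs, h=1e-6):
    """Central finite differences of the log-likelihood."""
    grads = []
    for j in range(3):
        up, down = list(theta), list(theta)
        up[j] += h
        down[j] -= h
        grads.append((mep_loglik(up, xs) - mep_loglik(down, xs)) / (2.0 * h))
    return tuple(grads)

sample = [0.2, 0.5, 0.9, 1.4, 2.3]      # hypothetical data
point = (0.7, 1.2, 1.5)                 # hypothetical (alpha, beta, lam)
print(mep_score(point, sample))
print(numeric_score(point, sample))
```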
We can use the first equation to express λ as a function of α and β and substitute this expression into the last two
equations. The resulting system may be solved using an iterative scheme.
The Newton-Raphson algorithm is one of the standard methods to determine the MLEs of the parameters. To
employ the algorithm, the second derivatives of the log-likelihood are required at every iteration. The EM algorithm is
an iterative method that repeatedly replaces the missing data with estimated values and updates the parameter
estimates. It is especially useful if the complete data set is easy to analyze. To start the algorithm, a
hypothetical complete-data distribution is defined with density function
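$$f(x, z, \alpha, \beta, \lambda) = \frac{\alpha^{z}\beta z\,e^{-\beta z x}}{(1 - \bar{\alpha}e^{-\beta x})^{z+1}}\cdot\frac{e^{-\lambda}\lambda^{z}}{\Gamma(z + 1)(1 - e^{-\lambda})},$$

where $x > 0$, $z = 1, 2, \ldots$, and $\beta, \alpha, \lambda > 0$.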
Thus it is straightforward to verify that the E-step of an EM cycle requires the computation of the
conditional expectation $E(Z \mid X; \alpha^{(h)}, \beta^{(h)}, \lambda^{(h)})$, where $(\alpha^{(h)}, \beta^{(h)}, \lambda^{(h)})$ is the current estimate of $(\alpha, \beta, \lambda)$.
Using the fact that

$$P(z \mid x, \alpha, \beta, \lambda) = \frac{\alpha^{z-1}\lambda^{z-1}\exp\!\left(-\beta x(z - 1) - \dfrac{\alpha\lambda e^{-\beta x}}{1 - \bar{\alpha}e^{-\beta x}}\right)}{(1 - \bar{\alpha}e^{-\beta x})^{z-1}\,\Gamma(z)}, \quad z \in \{1, 2, \ldots\},$$

we find

$$E(Z \mid x; \alpha, \beta, \lambda) = 1 + \frac{\alpha\lambda e^{-\beta x}}{1 - \bar{\alpha}e^{-\beta x}}.$$

The EM cycle is completed with the M-step, which is complete-data maximum
likelihood over $(\alpha, \beta, \lambda)$, with the missing $Z$'s replaced by their conditional expectations $E(Z \mid x; \alpha, \beta, \lambda)$. Thus
an EM iteration, taking $(\alpha^{(h)}, \beta^{(h)}, \lambda^{(h)})$ into $(\alpha^{(h+1)}, \beta^{(h+1)}, \lambda^{(h+1)})$, is given by
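$$\alpha^{(h+1)} = n\left[\sum_{i=1}^{n}\frac{e^{-\beta^{(h)}x_i}\left(\lambda^{(h)}e^{-\beta^{(h)}x_i} - 2\bar{\alpha}^{(h)}e^{-\beta^{(h)}x_i} - \lambda^{(h)} + 2\right)}{\left(1 - \bar{\alpha}^{(h)}e^{-\beta^{(h)}x_i}\right)^{2}}\right]^{-1},$$

$$\beta^{(h+1)} = n\left[\sum_{i=1}^{n}\frac{x_i\left(1 - (\bar{\alpha}^{(h)})^{2}e^{-2\beta^{(h)}x_i} + \alpha^{(h)}\lambda^{(h)}e^{-\beta^{(h)}x_i}\right)}{\left(1 - \bar{\alpha}^{(h)}e^{-\beta^{(h)}x_i}\right)^{2}}\right]^{-1},$$

$$\lambda^{(h+1)} = n\left[\frac{ne^{-\lambda^{(h)}}}{1 - e^{-\lambda^{(h)}}} + \sum_{i=1}^{n}\frac{1 - e^{-\beta^{(h)}x_i}}{1 - \bar{\alpha}^{(h)}e^{-\beta^{(h)}x_i}}\right]^{-1},$$

where $\bar{\alpha}^{(h)} = 1 - \alpha^{(h)}$.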
It can be seen that only a one-dimensional search such as Newton-Raphson is required for M-step of an
EM cycle.
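The conditional expectation of the E-step and one sweep of the update formulas can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the updates are coded as fixed-point rearrangements of the score equations (5)-(7), with the $\lambda$ update written using the sign obtained by setting (7) to zero; no safeguards against non-positive updates or non-convergence are included; all names and data are hypothetical.

```python
import math

def u_term(x, alpha, beta, lam):
    """alpha*lam*e^(-beta x) / (1 - abar*e^(-beta x)), with abar = 1 - alpha."""
    t = math.exp(-beta * x)
    return alpha * lam * t / (1.0 - (1.0 - alpha) * t)

def cond_pmf(z, x, alpha, beta, lam):
    """P(z | x): Z - 1 given x is Poisson with mean u_term(x, ...)."""
    u = u_term(x, alpha, beta, lam)
    return u ** (z - 1) * math.exp(-u) / math.gamma(z)

def e_step(x, alpha, beta, lam):
    """E(Z | x) = 1 + u."""
    return 1.0 + u_term(x, alpha, beta, lam)

def score(theta, xs):
    """Gradients (5), (6), (7) of the log-likelihood."""
    alpha, beta, lam = theta
    abar, n = 1.0 - alpha, len(xs)
    g_a, g_b = n / alpha, n / beta - sum(xs)
    g_l = n / lam - n - n * math.exp(-lam) / (1.0 - math.exp(-lam))
    for x in xs:
        t = math.exp(-beta * x)
        d = 1.0 - abar * t
        g_a += lam * (1.0 - t) * t / d ** 2 - 2.0 * t / d
        g_b += -alpha * lam * x * t / d ** 2 - 2.0 * abar * x * t / d
        g_l += alpha * t / d
    return (g_a, g_b, g_l)

def update(theta, xs):
    """One sweep of the three update formulas, all evaluated at theta."""
    alpha, beta, lam = theta
    abar, n = 1.0 - alpha, len(xs)
    s_a = s_b = s_l = 0.0
    for x in xs:
        t = math.exp(-beta * x)
        d = 1.0 - abar * t
        s_a += t * (lam * t - 2.0 * abar * t - lam + 2.0) / d ** 2
        s_b += x * (1.0 - abar ** 2 * t ** 2 + alpha * lam * t) / d ** 2
        s_l += (1.0 - t) / d
    new_lam = n / (n * math.exp(-lam) / (1.0 - math.exp(-lam)) + s_l)
    return (n / s_a, n / s_b, new_lam)

sample = [0.2, 0.5, 0.9, 1.4, 2.3]       # hypothetical data
print(e_step(0.5, 0.7, 1.2, 1.5))
print(update((0.7, 1.2, 1.5), sample))
```

By construction each update satisfies $n/\theta_j^{(h+1)} = n/\theta_j^{(h)} - \partial\ell_n/\partial\theta_j$, so a point is left unchanged exactly when the corresponding score component vanishes there.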
Applying the usual large sample approximation, the MLE of $\theta = (\alpha, \beta, \lambda)$ can be treated as being
approximately multivariate normal with mean $\theta$ and variance-covariance matrix equal to the inverse of the
expected information matrix $J_n(\theta) = E(I_n; \theta)$, where $I_n = I(\theta; y_{obs})$ is the observed information matrix
with elements

$$(I_n)_{ij} = -\frac{\partial^{2}\ell_n}{\partial\theta_i\,\partial\theta_j}, \quad i, j = 1, 2, 3,$$

and the expectation is to be taken with respect to the distribution of $X$. Differentiating (5), (6) and (7), the
elements of the symmetric, second-order observed information matrix are found to be
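$$(I_n)_{11} = \frac{n}{\alpha^{2}} + 2\lambda\sum_{i=1}^{n}\frac{e^{-2\beta x_i}(1 - e^{-\beta x_i})}{(1 - \bar{\alpha}e^{-\beta x_i})^{3}} - 2\sum_{i=1}^{n}\frac{e^{-2\beta x_i}}{(1 - \bar{\alpha}e^{-\beta x_i})^{2}},$$

$$(I_n)_{12} = \lambda\sum_{i=1}^{n}\frac{x_i e^{-\beta x_i}(1 - e^{-\beta x_i} - \alpha e^{-\beta x_i})}{(1 - \bar{\alpha}e^{-\beta x_i})^{3}} - 2\sum_{i=1}^{n}\frac{x_i e^{-\beta x_i}}{(1 - \bar{\alpha}e^{-\beta x_i})^{2}},$$

$$(I_n)_{13} = -\sum_{i=1}^{n}\frac{(1 - e^{-\beta x_i})e^{-\beta x_i}}{(1 - \bar{\alpha}e^{-\beta x_i})^{2}},$$

$$(I_n)_{22} = \frac{n}{\beta^{2}} - \alpha\lambda\sum_{i=1}^{n}\frac{x_i^{2}e^{-\beta x_i}(1 + \bar{\alpha}e^{-\beta x_i})}{(1 - \bar{\alpha}e^{-\beta x_i})^{3}} - 2\sum_{i=1}^{n}\frac{\bar{\alpha}x_i^{2}e^{-\beta x_i}}{(1 - \bar{\alpha}e^{-\beta x_i})^{2}},$$

$$(I_n)_{23} = \alpha\sum_{i=1}^{n}\frac{x_i e^{-\beta x_i}}{(1 - \bar{\alpha}e^{-\beta x_i})^{2}},$$

$$(I_n)_{33} = \frac{n}{\lambda^{2}} - \frac{ne^{-\lambda}}{(1 - e^{-\lambda})^{2}}.$$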
The elements of the expected information matrix $J_n(\theta)$ are calculated by taking the expectation of
$(I_n)_{ij}$, $i, j = 1, 2, 3$, with respect to the distribution of $X$, i.e., the following expectation is required:

$$E^{r}_{t_1 t_2} = E\left[\frac{X^{r}e^{-\beta t_1 X}}{(1 - \bar{\alpha}e^{-\beta X})^{t_2}}\right] = \frac{\alpha\lambda e^{-\lambda}\,\Gamma(r + 1)}{\beta^{r}(1 - e^{-\lambda})}\sum_{m=0}^{\infty}\sum_{k=0}^{\infty}\binom{k + m + t_2 + 1}{k}\frac{\alpha^{m}\bar{\alpha}^{k}\lambda^{m}}{m!\,(m + k + t_1 + 1)^{r+1}}.$$
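Since each sum over $i$ in $I_n$ has expectation equal to $n$ times the corresponding $E^{r}_{t_1 t_2}$, the elements of $J_n(\theta)$ are derived in the form

$$(J_n)_{11} = \frac{n}{\alpha^{2}} + n\left[2\lambda E^{0}_{23} - 2\lambda E^{0}_{33} - 2E^{0}_{22}\right],$$

$$(J_n)_{12} = n\left[\lambda E^{1}_{13} - \lambda(\alpha + 1)E^{1}_{23} - 2E^{1}_{12}\right],$$

$$(J_n)_{13} = n\left[E^{0}_{22} - E^{0}_{12}\right],$$

$$(J_n)_{22} = \frac{n}{\beta^{2}} - n\left[\alpha\lambda\left(E^{2}_{13} + \bar{\alpha}E^{2}_{23}\right) + 2\bar{\alpha}E^{2}_{12}\right],$$

$$(J_n)_{23} = n\,\alpha E^{1}_{12},$$

$$(J_n)_{33} = \frac{n}{\lambda^{2}} - \frac{ne^{-\lambda}}{(1 - e^{-\lambda})^{2}}.$$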
The inverse of $J_n(\theta)$, evaluated at $\hat{\theta}$, provides the asymptotic variance-covariance matrix of the
MLEs. Alternative estimates can be obtained from the inverse of the observed information matrix since it is a
consistent estimator of $J_n^{-1}(\theta)$.
Under conditions that are fulfilled for the parameter $\theta$ in the interior of the parameter space but not on
the boundary, the asymptotic distribution of $\sqrt{n}\,(\hat{\theta} - \theta)$ is $N_3(0, K^{-1}(\theta))$,
where $K(\theta) = \lim_{n\to\infty} n^{-1}J_n(\theta)$ is the unit information matrix. This asymptotic behaviour remains valid if $K(\theta)$ is
replaced by the average sample information matrix evaluated at $\hat{\theta}$, i.e., $n^{-1}J_n(\hat{\theta})$.
It is noteworthy that the multivariate normal distribution $N_3(0, J_n^{-1}(\hat{\theta}))$ can be used to construct
confidence intervals for the parameters. In fact, a $(1 - \gamma) \times 100\%$ ($0 < \gamma < 1/2$) asymptotic confidence
interval for the $i$-th parameter $\theta_i$ in $\theta$ is

$$ACI_i = \left(\hat{\theta}_i - z_{1-\gamma/2}\sqrt{\hat{J}^{\theta_i\theta_i}},\ \ \hat{\theta}_i + z_{1-\gamma/2}\sqrt{\hat{J}^{\theta_i\theta_i}}\right),$$

where $\hat{J}^{\theta_i\theta_i}$ denotes the $i$-th diagonal element of $J_n^{-1}(\hat{\theta})$, $i = 1, 2, 3$, and $z_{1-\gamma/2}$ is the $1 - \gamma/2$ standard
normal quantile.
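Given an estimate and the matching diagonal element of the inverse information matrix, the interval is a one-line computation. The sketch below uses `statistics.NormalDist` from the Python standard library for the normal quantile; the numerical inputs are hypothetical and do not come from the paper.

```python
import math
from statistics import NormalDist

def asymptotic_ci(estimate, variance, gamma=0.05):
    """(1 - gamma) * 100% Wald interval: estimate -/+ z_{1 - gamma/2} * sqrt(variance)."""
    z = NormalDist().inv_cdf(1.0 - gamma / 2.0)
    half_width = z * math.sqrt(variance)
    return (estimate - half_width, estimate + half_width)

# Hypothetical estimate of one parameter and its asymptotic variance.
print(asymptotic_ci(1.0, 0.04))
```

If the lower limit falls below zero for a parameter constrained to be positive, the normal approximation is likely to be poor at that sample size.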
We shall now move to hypothesis testing on the parameters of the MEP law. Consider the
partition $\theta = (\theta_1^{T}, \theta_2^{T})^{T}$ of the MEP parameter vector and suppose we wish to test the hypothesis
$H_0: \theta_1 = \theta_1^{(0)}$ against the alternative hypothesis $H_A: \theta_1 \ne \theta_1^{(0)}$.
To that end, we can use the likelihood ratio (LR) test whose test statistic is given by
$w = 2\{\ell(\hat{\theta}) - \ell(\tilde{\theta})\}$, where $\hat{\theta} = (\hat{\theta}_1^{T}, \hat{\theta}_2^{T})^{T}$ and $\tilde{\theta} = (\theta_1^{(0)T}, \tilde{\theta}_2^{T})^{T}$ stand for the MLEs of $\theta$ under
the alternative and the null hypotheses, respectively.
Under the null hypothesis, $w$ is asymptotically (as $n \to \infty$) distributed as $\chi^{2}_{k}$, where $k$ is the dimension
of the vector $\theta_1$ of parameters of interest. We reject the null hypothesis at the nominal level $\gamma$ ($0 < \gamma < 1$) if
$w > \chi^{2}_{k, 1-\gamma}$, where $\chi^{2}_{k, 1-\gamma}$ is the $1 - \gamma$ quantile of $\chi^{2}_{k}$.
Using this test, one can decide between a MEP and an EP model, which can be done by testing
$H_0: \alpha = 1$ against $H_A: \alpha \ne 1$.
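For the MEP versus EP comparison the parameter of interest has dimension $k = 1$, so the chi-square critical value equals the square of the standard normal $1 - \gamma/2$ quantile. The sketch below relies on that identity to stay within the Python standard library; the maximized log-likelihood values passed in are hypothetical placeholders.

```python
from statistics import NormalDist

def lr_test_one_parameter(loglik_unrestricted, loglik_null, gamma=0.05):
    """LR test with one restricted parameter (k = 1).

    Returns the statistic w, the critical value chi2_{1, 1-gamma},
    and whether the null hypothesis is rejected.
    """
    w = 2.0 * (loglik_unrestricted - loglik_null)
    # A chi-square variable with 1 degree of freedom is a squared standard normal.
    critical = NormalDist().inv_cdf(1.0 - gamma / 2.0) ** 2
    return w, critical, w > critical

# Hypothetical maximized log-likelihoods: MEP fit (alpha free) and EP fit (alpha = 1).
print(lr_test_one_parameter(-100.0, -103.0))
```

For restrictions on more than one parameter a general chi-square quantile routine would be needed, which is outside the standard library.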
ACKNOWLEDGEMENTS
The work of the second author was supported by the grant PN II IDEI, code ID 42-139/2008.
REFERENCES
1. ADAMIDIS, K., LOUKAS, S., A lifetime distribution with decreasing failure rate, Statistics and Probability Letters, 39, pp. 35–42, 1998.
2. BARRETO-SOUZA, W., CRIBARI-NETO, F., A generalization of the exponential-Poisson distribution, Statistics and Probability Letters, 79, pp. 2493–2500, 2009.
3. GLASER, R.E., Bathtub and related failure rate characterizations, Journal of the American Statistical Association, 75, pp. 667–672, 1980.
4. GUPTA, R.D., KUNDU, D., Generalized exponential distributions, Australian and New Zealand Journal of Statistics, 41, pp. 173–188, 1999.
5. KUS, C., A new lifetime distribution, Computational Statistics and Data Analysis, 51, pp. 4497–4509, 2007.
6. MARSHALL, A.W., OLKIN, I., Life Distributions: Structure of Nonparametric, Semiparametric and Parametric Families, Springer, New York, 2007.
7. MARSHALL, A.W., OLKIN, I., A new method for adding a parameter to a family of distributions with application to the exponential and Weibull families, Biometrika, 84, pp. 641–652, 1997.
Received December 21, 2010