THE PUBLISHING HOUSE OF THE ROMANIAN ACADEMY
PROCEEDINGS OF THE ROMANIAN ACADEMY, Series A, Volume 12, Number 1/2011, pp. 22–29

THE MODIFIED EXPONENTIAL-POISSON DISTRIBUTION

Vasile PREDA*, Eugenia PANAITESCU**, Roxana CIUMARA***

* University of Bucharest, Faculty of Mathematics and Computer Science, Bucharest, Romania
** "Carol Davila" University of Medicine and Pharmacy, Department of Medical Informatics and Biostatistics, Bucharest, Romania
*** Academy of Economic Studies, Department of Mathematics, Bucharest, Romania
E-mail: [email protected]

The two-parameter distribution known as the exponential-Poisson (EP) distribution, which has decreasing failure rate, was introduced by Kus (2007). In this paper we generalize the EP distribution and show that the failure rate of the new distribution can be decreasing or increasing. The failure rate can also be upside-down bathtub shaped. We provide closed-form expressions for the density, cumulative distribution, survival and failure rate functions. The EM algorithm is used to determine the maximum likelihood estimates, and the asymptotic variances and covariances of these estimates are obtained.

Key words: Compounding; Failure rate; EM algorithm; Maximum likelihood estimates; Modified exponential-Poisson distribution.

1. INTRODUCTION

The exponential distribution (ED) provides a simple, elegant and closed-form solution to many problems in lifetime testing and reliability studies. However, the ED does not provide a reasonable parametric fit for some practical applications where the underlying hazard rates are nonconstant, presenting monotone shapes. In recent years, in order to overcome this problem, new classes of models were introduced based on modifications of the ED. Gupta and Kundu (1999) proposed a generalized ED. This extended family can accommodate data with increasing and decreasing failure rate functions. Adamidis and Loukas (1998) introduced a distribution with decreasing failure rate.
This distribution is known as the exponential-geometric distribution and is obtained by compounding an exponential distribution with a geometric distribution. In the same fashion, Kus (2007) introduced a two-parameter distribution known as the exponential-Poisson (EP) distribution, which has decreasing failure rate, by compounding an exponential distribution with a Poisson distribution. Barreto-Souza and Cribari-Neto (2009) generalized the distribution proposed by Kus (2007) by including a power parameter.

In this paper, we propose a new distribution family based on the ED with increasing or decreasing failure rate function. Its genesis is based on a parameterization scheme for a survival function proposed by Marshall and Olkin (1997).

The paper is organized as follows. Sections 2 and 3 present the new distribution and its properties, and Section 4 outlines an EM-type algorithm for maximum likelihood estimation.

2. GENESIS OF THE DISTRIBUTION

Marshall and Olkin (1997) introduced a parameterization scheme for a distribution function $F(y)$ by defining another distribution function

$$\hat{F}(y, a) = \frac{F(y)}{F(y) + a\,(1 - F(y))}, \qquad y \in \mathbb{R},\ a > 0.$$

We use this parameterization to obtain the modified exponential distribution function

$$\hat{F}(x, \alpha, \beta) = \frac{1 - e^{-\beta x}}{1 - (1 - \alpha)e^{-\beta x}}, \qquad x > 0,\ \alpha, \beta > 0.$$

Let $W_1, W_2, \ldots, W_Z$ be a random sample from the modified exponential distribution with density

$$\hat{f}(w, \alpha, \beta) = \frac{\alpha\beta e^{-\beta w}}{\left(1 - \bar{\alpha}e^{-\beta w}\right)^2},$$

where $w, \alpha, \beta > 0$ and $\bar{\alpha} = 1 - \alpha$, and let $Z$ be a zero-truncated Poisson variable with probability function

$$P(z; \lambda) = \frac{e^{-\lambda}\lambda^{z}}{\Gamma(z + 1)\left(1 - e^{-\lambda}\right)}, \qquad z \in \{1, 2, \ldots\},\ \lambda > 0,$$

where $\Gamma(\cdot)$ is the gamma function and $Z$ and the $W_i$ are independent.

Let $X = \min(W_1, W_2, \ldots, W_Z)$. Then

$$\hat{f}(x \mid z, \alpha, \beta) = \frac{z\,\alpha^{z}\beta e^{-\beta z x}}{\left(1 - \bar{\alpha}e^{-\beta x}\right)^{z+1}}$$

and the marginal probability density function of $X$ is

$$f(x, \alpha, \beta, \lambda) = \frac{\alpha\beta\lambda \exp\left(-\lambda - \beta x + \dfrac{\alpha\lambda e^{-\beta x}}{1 - \bar{\alpha}e^{-\beta x}}\right)}{\left(1 - \bar{\alpha}e^{-\beta x}\right)^2\left(1 - e^{-\lambda}\right)}, \qquad (1)$$

where $x > 0$ and $\alpha, \beta, \lambda > 0$.

Fig. 1 – Probability density function of MEP distribution.

In the sequel, the distribution of $X$ will be referred to as the modified exponential-Poisson (MEP) distribution, following the custom in the literature of naming distributions that arise via the operation of compounding. The EP distribution introduced by Kus (2007) is the particular case of the MEP distribution corresponding to $\alpha = 1$.

It can be seen that the MEP density function is bell-shaped for $\alpha > 2$ and $0 < \lambda < \min\left\{\dfrac{2\bar{\alpha}^2}{\alpha},\ \alpha - 2\right\}$, with modal value

$$\frac{\alpha\beta\lambda t_0 \exp\left(-\lambda + \dfrac{\alpha\lambda t_0}{1 - \bar{\alpha}t_0}\right)}{\left(1 - \bar{\alpha}t_0\right)^2\left(1 - e^{-\lambda}\right)} \quad \text{at} \quad x_0 = -\frac{1}{\beta}\ln(t_0),$$

where

$$t_0 = \frac{\alpha\lambda + \sqrt{\alpha^2\lambda^2 + 4\bar{\alpha}^2}}{2\bar{\alpha}^2}.$$

If the condition is false, then the MEP density function is monotone decreasing (reverse J-shaped) with modal value

$$\frac{\alpha\beta\lambda \exp\left(-\lambda + \dfrac{\alpha\lambda}{1 - \bar{\alpha}}\right)}{\left(1 - \bar{\alpha}\right)^2\left(1 - e^{-\lambda}\right)} = \frac{\beta\lambda}{\alpha\left(1 - e^{-\lambda}\right)} \quad \text{at} \quad x = 0,$$

where the simplification uses $1 - \bar{\alpha} = \alpha$. The MEP probability density function is displayed in Fig. 1 for selected parameter values.

3. PROPERTIES OF THE DISTRIBUTION

The distribution function is given by

$$F(x; \alpha, \beta, \lambda) = \frac{1 - \exp\left(-\lambda + \dfrac{\alpha\lambda e^{-\beta x}}{1 - \bar{\alpha}e^{-\beta x}}\right)}{1 - e^{-\lambda}}. \qquad (2)$$

The $q$th quantile $x_q$ can be obtained from (2) as

$$x_q = \frac{1}{\beta}\ln\left(\frac{\alpha\lambda}{\ln\left(q + (1 - q)e^{\lambda}\right)} + \bar{\alpha}\right).$$

In particular, the median is

$$x_{1/2} = \frac{1}{\beta}\ln\left(\frac{\alpha\lambda}{\ln\dfrac{1 + e^{\lambda}}{2}} + \bar{\alpha}\right).$$

The $r$th raw moment of the MEP distribution can be written as

$$E(X^r; \alpha, \beta, \lambda) = \frac{\alpha\lambda e^{-\lambda}\,\Gamma(r + 1)}{\beta^{r}\left(1 - e^{-\lambda}\right)} \sum_{m=0}^{\infty}\sum_{k=0}^{\infty} \binom{k + m + 1}{k} \frac{\alpha^{m}\bar{\alpha}^{k}\lambda^{m}}{m!\,(m + k + 1)^{r+1}}.$$

Hence the mean and variance of the MEP distribution are given, respectively, by

$$E(X; \alpha, \beta, \lambda) = \frac{\alpha\lambda e^{-\lambda}}{\beta\left(1 - e^{-\lambda}\right)} \sum_{m=0}^{\infty}\sum_{k=0}^{\infty} \binom{k + m + 1}{k} \frac{\alpha^{m}\bar{\alpha}^{k}\lambda^{m}}{m!\,(m + k + 1)^{2}},$$

$$\mathrm{Var}(X; \alpha, \beta, \lambda) = \frac{2\alpha\lambda e^{-\lambda}}{\beta^{2}\left(1 - e^{-\lambda}\right)} \sum_{m=0}^{\infty}\sum_{k=0}^{\infty} \binom{k + m + 1}{k} \frac{\alpha^{m}\bar{\alpha}^{k}\lambda^{m}}{m!\,(m + k + 1)^{3}} - \left[\frac{\alpha\lambda e^{-\lambda}}{\beta\left(1 - e^{-\lambda}\right)} \sum_{m=0}^{\infty}\sum_{k=0}^{\infty} \binom{k + m + 1}{k} \frac{\alpha^{m}\bar{\alpha}^{k}\lambda^{m}}{m!\,(m + k + 1)^{2}}\right]^{2}.$$
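The closed forms (1) and (2) and the quantile formula lend themselves to a direct numerical check. The following is a minimal sketch, assuming plain double-precision arithmetic and no safeguards for extreme parameter values; the function names `mep_pdf`, `mep_cdf`, `mep_quantile` and `mep_sample` are illustrative and do not come from the paper. Sampling by inverting the distribution function is one possible approach, not a method stated by the authors.

```python
import math
import random


def mep_pdf(x, alpha, beta, lam):
    """Density (1) of the MEP distribution at x > 0."""
    t = math.exp(-beta * x)
    d = 1.0 - (1.0 - alpha) * t          # 1 - alpha_bar * exp(-beta x)
    u = alpha * lam * t / d
    return (alpha * beta * lam * math.exp(-lam - beta * x + u)
            / (d * d * (1.0 - math.exp(-lam))))


def mep_cdf(x, alpha, beta, lam):
    """Distribution function (2) of the MEP distribution."""
    t = math.exp(-beta * x)
    u = alpha * lam * t / (1.0 - (1.0 - alpha) * t)
    return (1.0 - math.exp(-lam + u)) / (1.0 - math.exp(-lam))


def mep_quantile(q, alpha, beta, lam):
    """q-th quantile, obtained by solving F(x) = q for x (0 < q < 1)."""
    u = math.log(q + (1.0 - q) * math.exp(lam))
    return math.log(alpha * lam / u + (1.0 - alpha)) / beta


def mep_sample(n, alpha, beta, lam, seed=None):
    """Draw n variates by inverse-transform sampling (an assumed approach)."""
    rng = random.Random(seed)
    return [mep_quantile(rng.random(), alpha, beta, lam) for _ in range(n)]


if __name__ == "__main__":
    a, b, l = 2.5, 1.3, 0.7
    median = mep_quantile(0.5, a, b, l)
    # The distribution function evaluated at the median should be close to 0.5.
    print(median, mep_cdf(median, a, b, l))
```

Setting `alpha = 1.0` in `mep_pdf` reduces the expression to the EP density of Kus (2007), which gives a simple consistency check against the special case mentioned in Section 2.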
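Using (1) and (2), the survival function (also known as the reliability function) and the hazard function (also known as the failure rate function) of the MEP distribution are given, respectively, by

$$s(x; \alpha, \beta, \lambda) = \frac{1 - \exp\left(\dfrac{\alpha\lambda e^{-\beta x}}{1 - \bar{\alpha}e^{-\beta x}}\right)}{1 - e^{\lambda}}, \qquad (3)$$

$$h(x; \alpha, \beta, \lambda) = \frac{\alpha\beta\lambda\left(1 - e^{\lambda}\right)\exp\left(-\lambda - \beta x + \dfrac{\alpha\lambda e^{-\beta x}}{1 - \bar{\alpha}e^{-\beta x}}\right)}{\left(1 - \bar{\alpha}e^{-\beta x}\right)^{2}\left(1 - e^{-\lambda}\right)\left[1 - \exp\left(\dfrac{\alpha\lambda e^{-\beta x}}{1 - \bar{\alpha}e^{-\beta x}}\right)\right]}. \qquad (4)$$

We shall now show that the failure rate of the MEP distribution can be decreasing or increasing depending on the parameter values.

Fig. 2 – Hazard function of MEP distribution.

Define $\eta(x) = -\dfrac{f'(x)}{f(x)}$, where $f'$ denotes the first derivative of $f$. It is straightforward to show that

$$\eta(x) = \frac{\beta\left(1 + \alpha\lambda e^{-\beta x} - \bar{\alpha}^{2}e^{-2\beta x}\right)}{\left(1 - \bar{\alpha}e^{-\beta x}\right)^{2}}$$

and

$$\eta'(x) = \frac{\beta^{2}e^{-\beta x}\left[(\alpha\lambda - 2\bar{\alpha})\left(1 - \bar{\alpha}e^{-\beta x}\right) - 2\alpha\lambda\right]}{\left(1 - \bar{\alpha}e^{-\beta x}\right)^{3}} = -\frac{\beta^{2}e^{-\beta x}\left[(2\bar{\alpha} + \alpha\lambda) - \bar{\alpha}(2\bar{\alpha} - \alpha\lambda)e^{-\beta x}\right]}{\left(1 - \bar{\alpha}e^{-\beta x}\right)^{3}}.$$

Since $1 - \bar{\alpha}e^{-\beta x} > 0$ for all $x > 0$, the sign of $\eta'(x)$ is opposite to the sign of $(2\bar{\alpha} + \alpha\lambda) - \bar{\alpha}(2\bar{\alpha} - \alpha\lambda)e^{-\beta x}$. Let

$$x^{*} = \frac{1}{\beta}\ln\frac{\bar{\alpha}(2\bar{\alpha} - \alpha\lambda)}{2\bar{\alpha} + \alpha\lambda}.$$

If $\alpha > 1$ (then $\bar{\alpha} < 0$) and $2\bar{\alpha} + \alpha\lambda > 0$, we have $\eta'(x) > 0$ for $0 < x < x^{*}$ (an interval that is empty when $x^{*} \le 0$) and $\eta'(x) < 0$ for $x > x^{*}$. If $\alpha > 1$ and $2\bar{\alpha} + \alpha\lambda \le 0$, we have $\eta'(x) > 0$ for all $x > 0$.

If $0 < \alpha \le 1$ (then $0 \le \bar{\alpha} < 1$), we have $\eta'(x) < 0$ for all $x > 0$.

It follows from the theorem of Glaser (1980) that if $\eta'(x) > 0$ for all $x > 0$ the failure rate is increasing, and if $\eta'(x) < 0$ for all $x > 0$ the failure rate is decreasing. When $\eta'(x)$ changes sign from positive to negative, the failure rate is either upside-down bathtub shaped or decreasing.

The mean residual life of the MEP distribution is given by

$$m(x_0; \alpha, \beta, \lambda) = E(X - x_0 \mid X \ge x_0; \alpha, \beta, \lambda) = \frac{1}{\beta\left[\exp\left(\dfrac{\alpha\lambda e^{-\beta x_0}}{1 - \bar{\alpha}e^{-\beta x_0}}\right) - 1\right]} \sum_{m=1}^{\infty}\sum_{k=0}^{\infty} \frac{\alpha^{m}\bar{\alpha}^{k}\lambda^{m}\,\Gamma(m + k)}{m!\,k!\,\Gamma(m)\,(m + k)}\, e^{-\beta(m + k)x_0}.$$

4. ESTIMATION OF THE PARAMETERS

The log-likelihood function based on the observed sample of size $n$, $y_{obs} = (x_i;\ i = 1, 2, \ldots, n)$, from the MEP distribution is given by

$$\ell_n(\alpha, \beta, \lambda; y_{obs}) = n\ln\alpha + n\ln\beta + n\ln\lambda - n\ln\left(1 - e^{-\lambda}\right) - n\lambda - \beta\sum_{i=1}^{n}x_i + \alpha\lambda\sum_{i=1}^{n}\frac{e^{-\beta x_i}}{1 - \bar{\alpha}e^{-\beta x_i}} - 2\sum_{i=1}^{n}\ln\left(1 - \bar{\alpha}e^{-\beta x_i}\right)$$

and subsequently the associated gradients are found to be

$$\frac{\partial \ell_n}{\partial \alpha} = \frac{n}{\alpha} + \lambda\sum_{i=1}^{n}\frac{\left(1 - e^{-\beta x_i}\right)e^{-\beta x_i}}{\left(1 - \bar{\alpha}e^{-\beta x_i}\right)^{2}} - 2\sum_{i=1}^{n}\frac{e^{-\beta x_i}}{1 - \bar{\alpha}e^{-\beta x_i}}, \qquad (5)$$

$$\frac{\partial \ell_n}{\partial \beta} = \frac{n}{\beta} - \sum_{i=1}^{n}x_i - \alpha\lambda\sum_{i=1}^{n}\frac{x_i e^{-\beta x_i}}{\left(1 - \bar{\alpha}e^{-\beta x_i}\right)^{2}} - 2\sum_{i=1}^{n}\frac{\bar{\alpha}x_i e^{-\beta x_i}}{1 - \bar{\alpha}e^{-\beta x_i}}, \qquad (6)$$

$$\frac{\partial \ell_n}{\partial \lambda} = \frac{n}{\lambda} - \frac{n e^{-\lambda}}{1 - e^{-\lambda}} - n + \alpha\sum_{i=1}^{n}\frac{e^{-\beta x_i}}{1 - \bar{\alpha}e^{-\beta x_i}}. \qquad (7)$$

We can use the first equation to express $\lambda$ as a function of $\alpha$ and $\beta$ and substitute this expression into the last two equations. The resulting system may be solved using an iteration scheme. The Newton-Raphson algorithm is one of the standard methods to determine the MLEs of the parameters. To employ the algorithm, the second derivatives of the log-likelihood are required at every iteration. The EM algorithm is an iterative method that repeatedly replaces the missing data with estimated values and updates the parameter estimates. It is especially useful if the complete data set is easy to analyze. To start the algorithm, a hypothetical complete-data distribution is defined with density function

$$f(x, z, \alpha, \beta, \lambda) = \frac{z\,\alpha^{z}\beta e^{-\beta z x}}{\left(1 - \bar{\alpha}e^{-\beta x}\right)^{z+1}} \cdot \frac{e^{-\lambda}\lambda^{z}}{\Gamma(z + 1)\left(1 - e^{-\lambda}\right)},$$

where $x > 0$, $z = 1, 2, \ldots$, and $\alpha, \beta, \lambda > 0$. Thus it is straightforward to verify that the E-step of an EM cycle requires the computation of the conditional expectation $E(Z \mid X; \alpha^{(h)}, \beta^{(h)}, \lambda^{(h)})$, where $(\alpha^{(h)}, \beta^{(h)}, \lambda^{(h)})$ is the current estimate of $(\alpha, \beta, \lambda)$. Using the fact that

$$P(z \mid x, \alpha, \beta, \lambda) = \frac{\alpha^{z-1}\lambda^{z-1}e^{-\beta x(z-1)}}{\left(1 - \bar{\alpha}e^{-\beta x}\right)^{z-1}\Gamma(z)} \exp\left(-\frac{\alpha\lambda e^{-\beta x}}{1 - \bar{\alpha}e^{-\beta x}}\right), \qquad z \in \{1, 2, \ldots\},$$

we find

$$E(Z \mid x; \alpha, \beta, \lambda) = 1 + \frac{\alpha\lambda e^{-\beta x}}{1 - \bar{\alpha}e^{-\beta x}}.$$

The EM cycle is completed with the M-step, which is complete-data maximum likelihood over $(\alpha, \beta, \lambda)$, with the missing $Z$'s replaced by their conditional expectations $E(Z \mid x; \alpha, \beta, \lambda)$. Thus an EM iteration, taking $(\alpha^{(h)}, \beta^{(h)}, \lambda^{(h)})$ into $(\alpha^{(h+1)}, \beta^{(h+1)}, \lambda^{(h+1)})$, is given by

$$\alpha^{(h+1)} = n\left[\sum_{i=1}^{n}\frac{e^{-\beta^{(h)}x_i}\left(\lambda^{(h)}e^{-\beta^{(h)}x_i} - 2\bar{\alpha}^{(h)}e^{-\beta^{(h)}x_i} - \lambda^{(h)} + 2\right)}{\left(1 - \bar{\alpha}^{(h)}e^{-\beta^{(h)}x_i}\right)^{2}}\right]^{-1},$$

$$\beta^{(h+1)} = n\left[\sum_{i=1}^{n}\frac{x_i\left(1 - \left(\bar{\alpha}^{(h)}\right)^{2}e^{-2\beta^{(h)}x_i} + \alpha^{(h)}\lambda^{(h)}e^{-\beta^{(h)}x_i}\right)}{\left(1 - \bar{\alpha}^{(h)}e^{-\beta^{(h)}x_i}\right)^{2}}\right]^{-1},$$

$$\lambda^{(h+1)} = n\left[\frac{n e^{-\lambda^{(h)}}}{1 - e^{-\lambda^{(h)}}} + \sum_{i=1}^{n}\frac{1 - e^{-\beta^{(h)}x_i}}{1 - \bar{\alpha}^{(h)}e^{-\beta^{(h)}x_i}}\right]^{-1}.$$

It can be seen that only a one-dimensional search such as Newton-Raphson is required for the M-step of an EM cycle.

Applying the usual large sample approximation, the MLE of $\theta = (\alpha, \beta, \lambda)$ can be treated as being approximately multivariate normal with mean $\theta$ and variance-covariance matrix equal to the inverse of the expected information matrix $J_n(\theta) = E(I_n; \theta)$, where $I_n = I(\theta; y_{obs})$ is the observed information matrix with elements $(I_n)_{ij} = -\dfrac{\partial^{2}\ell_n}{\partial\theta_i\,\partial\theta_j}$, $i, j = 1, 2, 3$, and the expectation is to be taken with respect to the distribution of $X$. Differentiating (5), (6) and (7), the elements of the symmetric, second-order observed information matrix are found to be

$$(I_n)_{11} = \frac{n}{\alpha^{2}} + 2\lambda\sum_{i=1}^{n}\frac{e^{-2\beta x_i}\left(1 - e^{-\beta x_i}\right)}{\left(1 - \bar{\alpha}e^{-\beta x_i}\right)^{3}} - 2\sum_{i=1}^{n}\frac{e^{-2\beta x_i}}{\left(1 - \bar{\alpha}e^{-\beta x_i}\right)^{2}},$$

$$(I_n)_{12} = \lambda\sum_{i=1}^{n}\frac{x_i e^{-\beta x_i}\left(1 - e^{-\beta x_i} - \alpha e^{-\beta x_i}\right)}{\left(1 - \bar{\alpha}e^{-\beta x_i}\right)^{3}} - 2\sum_{i=1}^{n}\frac{x_i e^{-\beta x_i}}{\left(1 - \bar{\alpha}e^{-\beta x_i}\right)^{2}},$$

$$(I_n)_{13} = -\sum_{i=1}^{n}\frac{\left(1 - e^{-\beta x_i}\right)e^{-\beta x_i}}{\left(1 - \bar{\alpha}e^{-\beta x_i}\right)^{2}},$$

$$(I_n)_{22} = \frac{n}{\beta^{2}} - \alpha\lambda\sum_{i=1}^{n}\frac{x_i^{2}e^{-\beta x_i}\left(1 + \bar{\alpha}e^{-\beta x_i}\right)}{\left(1 - \bar{\alpha}e^{-\beta x_i}\right)^{3}} - 2\sum_{i=1}^{n}\frac{\bar{\alpha}x_i^{2}e^{-\beta x_i}}{\left(1 - \bar{\alpha}e^{-\beta x_i}\right)^{2}},$$

$$(I_n)_{23} = \alpha\sum_{i=1}^{n}\frac{x_i e^{-\beta x_i}}{\left(1 - \bar{\alpha}e^{-\beta x_i}\right)^{2}},$$

$$(I_n)_{33} = \frac{n}{\lambda^{2}} - \frac{n e^{-\lambda}}{\left(1 - e^{-\lambda}\right)^{2}}.$$

The elements of the expected information matrix $J_n(\theta)$ are calculated by taking the expectation of $(I_n)_{ij}$, $i, j = 1, 2, 3$, with respect to the distribution of $X$, i.e., the following expectation is required:

$$E^{r}_{t_1 t_2} = E\left[\frac{X^{r}e^{-\beta t_1 X}}{\left(1 - \bar{\alpha}e^{-\beta X}\right)^{t_2}}\right] = \frac{\alpha\lambda e^{-\lambda}\,\Gamma(r + 1)}{\beta^{r}\left(1 - e^{-\lambda}\right)} \sum_{m=0}^{\infty}\sum_{k=0}^{\infty}\binom{k + m + t_2 + 1}{k}\frac{\alpha^{m}\bar{\alpha}^{k}\lambda^{m}}{m!\,(m + k + t_1 + 1)^{r+1}}.$$

Since each sum in $(I_n)_{ij}$ runs over $n$ identically distributed terms, the elements of $J_n(\theta)$ are derived in the form

$$(J_n)_{11} = \frac{n}{\alpha^{2}} + 2\lambda n\left(E^{0}_{23} - E^{0}_{33}\right) - 2nE^{0}_{22},$$

$$(J_n)_{12} = n\left[\lambda E^{1}_{13} - \lambda(\alpha + 1)E^{1}_{23} - 2E^{1}_{12}\right],$$

$$(J_n)_{13} = n\left(E^{0}_{22} - E^{0}_{12}\right),$$

$$(J_n)_{22} = \frac{n}{\beta^{2}} - \alpha\lambda n\left(E^{2}_{13} + \bar{\alpha}E^{2}_{23}\right) - 2\bar{\alpha}nE^{2}_{12},$$

$$(J_n)_{23} = \alpha nE^{1}_{12},$$

$$(J_n)_{33} = \frac{n}{\lambda^{2}} - \frac{n e^{-\lambda}}{\left(1 - e^{-\lambda}\right)^{2}}.$$

The inverse of $J_n(\theta)$, evaluated at $\hat{\theta}$, provides the asymptotic variance-covariance matrix of the MLEs. Alternative estimates can be obtained from the inverse of the observed information matrix, since it is a consistent estimator of $J_n^{-1}(\theta)$.

Under conditions that are fulfilled for the parameter $\theta$ in the interior of the parameter space but not on the boundary, the asymptotic distribution of $\sqrt{n}\,(\hat{\theta} - \theta)$ is $N_3\left(0, K^{-1}(\theta)\right)$, where $K(\theta) = \lim_{n \to \infty} n^{-1}J_n(\theta)$ is the unit information matrix. This asymptotic behaviour remains valid if $K(\theta)$ is replaced by the average sample information matrix evaluated at $\hat{\theta}$, i.e., $n^{-1}J_n(\hat{\theta})$.

It is noteworthy that the multivariate normal distribution $N_3\left(0, J_n(\hat{\theta})^{-1}\right)$ can be used to construct confidence intervals for the parameters. In fact, a $(1 - \gamma) \times 100\%$ ($0 < \gamma < \frac{1}{2}$) asymptotic confidence interval for the $i$-th parameter $\theta_i$ in $\theta$ is

$$ACI_i = \left(\hat{\theta}_i - z_{1-\gamma/2}\sqrt{\hat{J}^{\theta_i\theta_i}},\ \hat{\theta}_i + z_{1-\gamma/2}\sqrt{\hat{J}^{\theta_i\theta_i}}\right),$$

where $\hat{J}^{\theta_i\theta_i}$ denotes the $i$-th diagonal element of $J_n^{-1}(\hat{\theta})$, $i = 1, 2, 3$, and $z_{1-\gamma/2}$ is the $1 - \gamma/2$ standard normal quantile.

We shall now move to hypothesis testing inference on the parameters of the MEP law.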
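As a numerical companion to the estimation formulas of this section, the following sketch evaluates the log-likelihood, the gradients (5)-(7), and one sweep of the iteration for $(\alpha^{(h+1)}, \beta^{(h+1)}, \lambda^{(h+1)})$. It is an illustration under stated assumptions rather than the authors' implementation: the function names `mep_loglik`, `mep_score` and `mep_update` are hypothetical, the data in the usage block are made up, and no safeguard is included against an update leaving the region $\alpha, \beta, \lambda > 0$.

```python
import math


def mep_loglik(theta, xs):
    """Log-likelihood of the MEP model; theta = (alpha, beta, lam)."""
    alpha, beta, lam = theta
    abar = 1.0 - alpha
    n = len(xs)
    total = n * (math.log(alpha) + math.log(beta) + math.log(lam)
                 - math.log(1.0 - math.exp(-lam)) - lam)
    for x in xs:
        t = math.exp(-beta * x)
        d = 1.0 - abar * t
        total += -beta * x + alpha * lam * t / d - 2.0 * math.log(d)
    return total


def mep_score(theta, xs):
    """Gradient of the log-likelihood, following equations (5)-(7)."""
    alpha, beta, lam = theta
    abar = 1.0 - alpha
    n = len(xs)
    g_a = n / alpha
    g_b = n / beta
    g_l = n / lam - n * math.exp(-lam) / (1.0 - math.exp(-lam)) - n
    for x in xs:
        t = math.exp(-beta * x)
        d = 1.0 - abar * t
        g_a += lam * (1.0 - t) * t / d ** 2 - 2.0 * t / d
        g_b += -x - alpha * lam * x * t / d ** 2 - 2.0 * abar * x * t / d
        g_l += alpha * t / d
    return (g_a, g_b, g_l)


def mep_update(theta, xs):
    """One sweep of the iteration; right-hand sides use the current theta."""
    alpha, beta, lam = theta
    abar = 1.0 - alpha
    n = len(xs)
    s_a = s_b = s_l = 0.0
    for x in xs:
        t = math.exp(-beta * x)
        d = 1.0 - abar * t
        s_a += t * (lam * t - 2.0 * abar * t - lam + 2.0) / d ** 2
        s_b += x * (1.0 - abar ** 2 * t ** 2 + alpha * lam * t) / d ** 2
        s_l += (1.0 - t) / d
    s_l += n * math.exp(-lam) / (1.0 - math.exp(-lam))
    return (n / s_a, n / s_b, n / s_l)


if __name__ == "__main__":
    data = [0.2, 0.5, 0.9, 1.4, 2.3, 0.05, 0.7]   # made-up sample
    theta = (1.5, 1.2, 0.8)                        # arbitrary starting point
    for _ in range(5):
        theta = mep_update(theta, data)
        print(theta, mep_loglik(theta, data))
```

Each component of the sweep satisfies $n/\theta_j^{(h+1)} = n/\theta_j^{(h)} - \partial\ell_n/\partial\theta_j$ evaluated at the current estimate, so a fixed point of the sweep is a root of the score equations (5)-(7). Whether the sweep converges for a given sample is not guaranteed by this sketch and should be monitored through the log-likelihood values.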
Consider the partition $\theta = (\theta_1^{T}, \theta_2^{T})^{T}$ of the MEP parameter vector and suppose we wish to test the hypothesis $H_0: \theta_1 = \theta_1^{(0)}$ against the alternative hypothesis $H_A: \theta_1 \ne \theta_1^{(0)}$. To that end, we can use the likelihood ratio (LR) test, whose test statistic is given by

$$w = 2\left\{\ell(\hat{\theta}) - \ell(\tilde{\theta})\right\},$$

where $\tilde{\theta} = \left(\theta_1^{(0)T}, \tilde{\theta}_2^{T}\right)^{T}$ and $\hat{\theta} = \left(\hat{\theta}_1^{T}, \hat{\theta}_2^{T}\right)^{T}$ stand for the MLEs of $\theta$ under the null and the alternative hypotheses, respectively.

Under the null hypothesis, $w$ is asymptotically (as $n \to \infty$) distributed as $\chi^{2}_{k}$, where $k$ is the dimension of the vector $\theta_1$ of parameters of interest. We reject the null hypothesis at the nominal level $\gamma$ ($0 < \gamma < 1$) if $w > \chi^{2}_{k, 1-\gamma}$, where $\chi^{2}_{k, 1-\gamma}$ is the $1 - \gamma$ quantile of $\chi^{2}_{k}$. Using this test, one can decide between a MEP and an EP model by testing $H_0: \alpha = 1$ against $H_A: \alpha \ne 1$.

ACKNOWLEDGEMENTS

The work of the second author was supported by the grant PN II IDEI, code ID 42-139/2008.

REFERENCES

1. ADAMIDIS, K., LOUKAS, S., A lifetime distribution with decreasing failure rate, Statistics and Probability Letters, 39, pp. 35–42, 1998.
2. BARRETO-SOUZA, W., CRIBARI-NETO, F., A generalization of the exponential-Poisson distribution, Statistics and Probability Letters, 79, pp. 2493–2500, 2009.
3. GLASER, R.E., Bathtub and related failure rate characterizations, Journal of the American Statistical Association, 75, pp. 667–672, 1980.
4. GUPTA, R.D., KUNDU, D., Generalized exponential distributions, Australian and New Zealand Journal of Statistics, 41, pp. 173–188, 1999.
5. KUS, C., A new lifetime distribution, Computational Statistics and Data Analysis, 51, pp. 4497–4509, 2007.
6. MARSHALL, A.W., OLKIN, I., Life Distributions: Structure of Nonparametric, Semiparametric and Parametric Families, Springer, New York, 2007.
7. MARSHALL, A.W., OLKIN, I., A new method for adding a parameter to a family of distributions with application to the exponential and Weibull families, Biometrika, 84, pp. 641–652, 1997.

Received December 21, 2010