
Transport and Telecommunication Vol.3, N 1, 2002

CONSTRUCTING SHORTEST-LENGTH CONFIDENCE INTERVALS

Konstantin N. Nechval
Transport and Telecommunication Institute
Lomonosov Street 1, LV-1019 Riga, Latvia
Ph: (+371)-7100650, Fax: (+371)-7100660, E-mail: [email protected]

Nicholas A. Nechval & Edgars K. Vasermanis
Department of Mathematical Statistics, University of Latvia
Raina Blvd 19, LV-1050 Riga, Latvia
Ph: (+371)-7034702, Fax: +371-7034702, E-mail: [email protected]

Valery Ya. Makeev
Department of Computer Control, Transport and Telecommunication Institute
Lomonosov Street 1, LV-1019 Riga, Latvia
Ph: (+371)-7100650, Fax: (+371)-7100660, E-mail: [email protected]

Abstract

In this paper, we present an approach to invariant confidence intervals that emphasizes pivotal quantities. We consider confidence interval problems that are invariant under a group of transformations $G$ such that the induced group $\bar{G}$ acts transitively on the parameter space. The purpose of this paper is to give a technique for deriving confidence intervals with a minimum length property. Examples illustrating the use of this technique are given.

Key words: Confidence interval, Shortest length, Technique for constructing

1. Introduction

In many problems of statistical inference the experimenter is interested in constructing a family of sets that contain the true (unknown) parameter value with a specified (high) probability. If $X$, for example, represents the length of life of a piece of equipment, the experimenter is interested in a lower bound $\theta_L$ for the mean $\theta$ of $X$. Since $\theta_L = \theta_L(X)$ will be a function of the observations, one cannot ensure with probability 1 that $\theta_L(X) \le \theta$. All that one can do is to choose a number $1-\alpha$ that is close to 1 so that $P_\theta\{\theta_L(X) \le \theta\} \ge 1-\alpha$ for all $\theta$. Problems of this type are called problems of confidence estimation.

Let $P_\theta$, $\theta \in \Theta \subseteq R^k$, be the set of probability distributions of an rv $X$.
A family of subsets $S(x)$ of $\Theta$, where $S(x)$ depends on the observation $x$ but not on $\theta$, is called a family of random sets. If, in particular, $\Theta \subseteq R$ and $S(x)$ is an interval $(\theta_L(x), \theta_U(x))$, where $\theta_L(x)$ and $\theta_U(x)$ are functions of $x$ alone (and not of $\theta$), we call $S(X)$ a random interval with $\theta_L(X)$ and $\theta_U(X)$ as lower and upper bounds, respectively. $\theta_L(X)$ may be $-\infty$, and $\theta_U(X)$ may be $+\infty$.

In a wide variety of inference problems one is not interested in estimating the parameter or testing some hypothesis concerning it. Rather, one wishes to establish a lower or an upper bound, or both, for the real-valued parameter. For example, if $X$ is the time to failure of a piece of equipment, one is interested in a lower bound for the mean of $X$. If the rv $X$ measures the toxicity of a drug, the concern is to find an upper bound for the mean. Similarly, if the rv $X$ measures the nicotine content of a certain brand of cigarettes, one is interested in determining an upper and a lower bound for the average nicotine content of these cigarettes.

In this paper we are interested in the problem of confidence estimation, namely, that of finding a family of random sets $S(x)$ for a parameter $\theta$ such that, for a given $\alpha$, $0 < \alpha < 1$ (usually small),

$$P_\theta\{S(X) \ni \theta\} \ge 1-\alpha \quad \text{for all } \theta \in \Theta. \tag{1}$$

A family of subsets $S(x)$ of $\Theta \subseteq R^k$ is said to constitute a family of confidence sets at confidence level $1-\alpha$ if the random set $S(X)$ covers the true parameter value $\theta$ with probability $\ge 1-\alpha$. We restrict our attention mainly to the case where $\theta \in \Theta \subseteq R$. A lower confidence bound corresponds to the case where

$$S(x) = \{\theta : \theta_L(x) \le \theta < \infty\}; \tag{2}$$

and an upper confidence bound, to the case where

$$S(x) = \{\theta : \theta_U(x) \ge \theta > -\infty\}. \tag{3}$$

If $S(x)$ is of the form

$$S(x) = (\theta_L(x), \theta_U(x)) \tag{4}$$

we will call it a confidence interval at confidence level $1-\alpha$, provided that

$$P_\theta\{\theta_L(X) < \theta < \theta_U(X)\} \ge 1-\alpha \quad \text{for all } \theta, \tag{5}$$

and the quantity $\inf_\theta P_\theta\{\theta_L(X) < \theta < \theta_U(X)\}$ will be referred to as the confidence coefficient associated with the random interval.

2. Main Theorem

The following result provides a general method of finding confidence intervals and covers most cases in practice.

Theorem 1. Let $X_1, X_2, \ldots, X_n$ be a random sample from a df $F_\theta$, $\theta \in \Theta$, where $\Theta$ is an interval of $R$. Let $V(X,\theta) = V(X_1, X_2, \ldots, X_n, \theta)$ be a real-valued function defined on $R^n \times \Theta$, such that, for each $\theta$, $V(X,\theta)$ is a statistic, and as a function of $\theta$, $V$ is strictly increasing or strictly decreasing at every $x = (x_1, x_2, \ldots, x_n) \in R^n$. Let $\Omega \subseteq R$ be the range of $V$, and for every $\omega \in \Omega$ and $x \in R^n$ let the equation $\omega = V(x,\theta)$ be solvable. If the distribution of $V(X,\theta)$ is independent of $\theta$, one can construct a confidence interval for $\theta$ at any level.

Proof. Let $0 < \alpha < 1$. Then we can choose a pair of numbers $\omega_1(\alpha)$ and $\omega_2(\alpha)$ in $\Omega$, not necessarily unique, such that

$$P_\theta\{\omega_1(\alpha) < V(X,\theta) < \omega_2(\alpha)\} \ge 1-\alpha \quad \text{for all } \theta. \tag{6}$$

Since the distribution of $V$ is independent of $\theta$, it is clear that $\omega_1$ and $\omega_2$ are independent of $\theta$. Since, moreover, $V$ is monotone in $\theta$, we can solve the equations

$$V(x,\theta) = \omega_1(\alpha) \quad \text{and} \quad V(x,\theta) = \omega_2(\alpha) \tag{7}$$

for every $x$, uniquely for $\theta$. Denoting the smaller and the larger of the two solutions by $\theta_L(x)$ and $\theta_U(x)$, the event in (6) coincides with $\{\theta_L(X) < \theta < \theta_U(X)\}$, so that

$$P_\theta\{\theta_L(X) < \theta < \theta_U(X)\} \ge 1-\alpha \quad \text{for all } \theta, \tag{8}$$

where $\theta_L(X) < \theta_U(X)$ are rv's. This completes the proof.

Remark 1. The condition that $\omega = V(x,\theta)$ be solvable will be satisfied if, for example, $V$ is continuous and strictly increasing or decreasing as a function of $\theta$ in $\Theta$.

Remark 2. It is usually possible to find a function $V$ that is monotone and has a distribution independent of $\theta$. For example, if $F_\theta$ is continuous and monotone in $\theta$, we can take

$$V(X,\theta) = \prod_{i=1}^{n} F_\theta(X_i). \tag{9}$$

Since $F_\theta$ is continuous, $F_\theta(X_i)$ are iid with $U[0,1]$ distribution.
Then

$$-\ln V(X,\theta) = -\sum_{i=1}^{n} \ln F_\theta(X_i), \tag{10}$$

where $-\ln F_\theta(X_i)$ are iid rv's, each with common $G(1,1)$ distribution. It follows that $-\ln V(X,\theta) \sim G(n,1)$, and we can find $\omega_1, \omega_2$ such that

$$P_\theta\{-\ln\omega_2 < -\ln V(X,\theta) < -\ln\omega_1\} = \frac{1}{\Gamma(n)} \int_{-\ln\omega_2}^{-\ln\omega_1} x^{n-1} e^{-x}\,dx = 1-\alpha. \tag{11}$$

Thus

$$P_\theta\{\omega_1 < V(X,\theta) < \omega_2\} = 1-\alpha. \tag{12}$$

This last statement is equivalent to

$$P_\theta\{\theta_L(X) < \theta < \theta_U(X)\} \ge 1-\alpha. \tag{13}$$

Note that in the continuous case we can find a confidence interval with equality on the right side of (6). In the discrete case, however, this is usually not possible.

Remark 3. Relation (6) is valid even when the assumption of monotonicity of $V$ in the theorem is dropped. In that case inversion of the inequalities may yield a set of intervals (random set) $S(X)$ in $\Theta$ instead of a confidence interval.

Remark 4. The argument used in Theorem 1 can be extended to cover the multiparameter case, and the method will determine a confidence set for all the parameters of a distribution.

3. Shortest-Length Confidence Intervals

It follows from the above that, for a given confidence level $1-\alpha$, a wide choice of confidence intervals is available. Clearly, the larger the interval, the better is the chance of trapping a true parameter value. Thus the interval $(-\infty, +\infty)$, which ignores the data completely, will include the real-valued parameter $\theta$ with confidence level 1. In practice, one would like to set the level at a given fixed number $1-\alpha$ ($0 < \alpha < 1$) and, if possible, construct an interval as short as possible among all confidence intervals with the same level. Such an interval is desirable since it is more informative. Unfortunately, shortest-length confidence intervals do not always exist. In this paper we will investigate the possibility of constructing shortest-length confidence intervals based on some simple rv's. Theorem 1 is really the key to the following discussion.
Let $X_1, X_2, \ldots, X_n$ be a sample from a pdf $f_\theta(x)$, and let $V = V(X_1, X_2, \ldots, X_n, \theta)$ be an rv whose distribution is independent of $\theta$. Also, let $\omega_1 = \omega_1(\alpha)$, $\omega_2 = \omega_2(\alpha)$ be chosen so that

$$P\{\omega_1 < V < \omega_2\} = 1-\alpha, \tag{14}$$

and suppose that (14) can be rewritten as

$$P\{\theta_L(X) < \theta < \theta_U(X)\} = 1-\alpha \tag{15}$$

(see Theorem 1 for a set of sufficient conditions). For every $V$, $\omega_1$ and $\omega_2$ can be chosen in many ways. We would like to choose $\omega_1$ and $\omega_2$ so that $\theta_U - \theta_L$ is minimum. Such an interval is a $1-\alpha$ level shortest-length confidence interval based on $V$. It may be possible, however, to find another rv $V^*$ that may yield an even shorter interval. Therefore we are not asserting that the procedure, if it succeeds, will lead to a $1-\alpha$ level confidence interval that has shortest length among all intervals of this level.

For $V$ we use a random variable $V(X,\theta)$, which is a function of $X = (X_1, X_2, \ldots, X_n)$ and $\theta$, and whose distribution is independent of $\theta$. This function is called a pivotal quantity. The simplest pivotal quantities are functions of a sufficient statistic and $\theta$. For example, in sampling from a normal population, if the variance is known, $(\bar{X} - \mu)\sqrt{n}/\sigma$ is a natural choice for a pivotal quantity. If $\sigma$ is unknown, $(\bar{X} - \mu)\sqrt{n}/S$ is a natural choice for a pivotal quantity. If one desires a confidence interval for the variance $\sigma^2$ and $\mu$ is known, $\sum_{i=1}^{n} (X_i - \mu)^2/\sigma^2$ is a pivotal quantity; and, if $\mu$ is unknown, $S^2/\sigma^2$ is a pivotal quantity.

4. Finding The Shortest-Length Confidence Interval

Let $F$ be the distribution function of the pivotal quantity $V(X_1, X_2, \ldots, X_n, \theta)$ and let $\omega_1, \omega_2$ be such that

$$F(\omega_2) - F(\omega_1) = P\{\omega_1 < V < \omega_2\} = 1-\alpha. \tag{16}$$

A $100(1-\alpha)\%$ confidence interval for $\theta$ is $(\theta_L(X), \theta_U(X))$, and the length of this interval is $L(X_1, X_2, \ldots, X_n, \omega_1, \omega_2) = \theta_U - \theta_L$. We want to choose $\omega_1, \omega_2$ minimizing $\theta_U - \theta_L$ and satisfying (16). Thus, we consider the problem:

Minimize:

$$L(X_1, X_2, \ldots, X_n, \omega_1, \omega_2) = \theta_U - \theta_L, \tag{17}$$

Subject to:

$$F(\omega_2) - F(\omega_1) = 1-\alpha. \tag{18}$$

The search for the shortest-length confidence interval is greatly facilitated by the use of the following result.

Theorem 2. Under appropriate derivative conditions, there will be a pair $(\omega_1, \omega_2)$ giving rise to the shortest-length confidence interval as a solution to the simultaneous equations

$$\frac{dL}{d\omega_1} = 0, \tag{19}$$

$$F(\omega_2) - F(\omega_1) = 1-\alpha, \tag{20}$$

where

$$\frac{d\omega_2}{d\omega_1} = \frac{F'(\omega_1)}{F'(\omega_2)}. \tag{21}$$

Proof. Note that (20) forces $\omega_2$ to be a function of $\omega_1$ (or vice versa). Take $L(\omega_1, \omega_2)$ as a function of $\omega_1$, say $L(\omega_1, \omega_2(\omega_1))$. Differentiating (20) with respect to $\omega_1$ gives $F'(\omega_2)\,d\omega_2/d\omega_1 - F'(\omega_1) = 0$, which is (21). Then, by using the result of Phipps [1], the proof follows immediately.

5. Examples

Example 1. Let $X_1, X_2, \ldots, X_n$ be a sample from $U(0,\theta)$. Then $X_{(n)} = \max(X_1, X_2, \ldots, X_n)$ is sufficient for $\theta$ with density

$$f_n(x_{(n)}) = \frac{n\,x_{(n)}^{n-1}}{\theta^n}, \quad 0 < x_{(n)} < \theta. \tag{22}$$

The rv $V = X_{(n)}/\theta$ has pdf

$$f(v) = n v^{n-1}, \quad 0 < v < 1. \tag{23}$$

Using $V$ as a pivotal quantity, we see that the confidence interval is $(X_{(n)}/\omega_2,\ X_{(n)}/\omega_1)$ with length $L = X_{(n)}(1/\omega_1 - 1/\omega_2)$. We minimize $L$ subject to

$$F(\omega_2) - F(\omega_1) = \omega_2^n - \omega_1^n = 1-\alpha. \tag{24}$$

Now

$$\frac{dL}{d\omega_1} = X_{(n)} \left( -\frac{1}{\omega_1^2} + \frac{1}{\omega_2^2} \frac{d\omega_2}{d\omega_1} \right), \tag{25}$$

where

$$\frac{d\omega_2}{d\omega_1} = \frac{\omega_1^{n-1}}{\omega_2^{n-1}}. \tag{26}$$

It follows from (25) and (26) that

$$\frac{dL}{d\omega_1} = X_{(n)} \left( \frac{\omega_1^{n+1} - \omega_2^{n+1}}{\omega_1^2\,\omega_2^{n+1}} \right) < 0, \tag{27}$$

so that $L$ decreases in $\omega_1$ and the minimum occurs at the largest admissible value, $\omega_2 = 1$, for which (24) gives $\omega_1 = \alpha^{1/n}$. The shortest interval is therefore $(X_{(n)},\ X_{(n)}/\alpha^{1/n})$. Note that the length of the interval $(X_{(n)},\ X_{(n)}/\alpha^{1/n})$ goes to 0 as $n$ becomes large.

Example 2. Consider the problem of estimation of system availability from time-to-failure and time-to-repair test data. Availability is usually defined as the probability that a system is operating satisfactorily at any point in time. This probability can be expressed mathematically as

$$A = \frac{\theta}{\theta + \varphi} = \frac{1}{1 + (\varphi/\theta)}, \tag{28}$$

where $\theta$ is the system mean-time-to-failure and $\varphi$ is the system mean-time-to-repair. The one-to-one correspondence between availability and $\varphi/\theta$ is obvious.
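Returning briefly to Example 1 (the $U(0,\theta)$ case), its conclusion can be checked numerically before Example 2 continues. The Python sketch below is a minimal illustration, not the authors' procedure: it replaces the derivative argument of (25)-(27) with a plain grid search over $\omega_1$, with $\omega_2$ determined by the constraint (24). The function names and the sample values used in the usage lines are hypothetical.

```python
def uniform_pivot_interval(x_max, n, alpha, omega1):
    """Interval (x_max/omega2, x_max/omega1) with omega2 fixed by
    omega2**n - omega1**n = 1 - alpha (the constraint of Example 1)."""
    omega2 = (1.0 - alpha + omega1 ** n) ** (1.0 / n)
    return x_max / omega2, x_max / omega1


def equal_tailed_interval(x_max, n, alpha):
    """Interval that puts probability alpha/2 in each tail of V = X_(n)/theta."""
    omega1 = (alpha / 2.0) ** (1.0 / n)
    return uniform_pivot_interval(x_max, n, alpha, omega1)


def shortest_by_grid(x_max, n, alpha, steps=2000):
    """Grid search over admissible omega1; returns (length, omega1, lower, upper)."""
    upper = alpha ** (1.0 / n)  # largest omega1 that keeps omega2 <= 1
    best = None
    for i in range(1, steps + 1):
        omega1 = upper * i / steps
        lo, hi = uniform_pivot_interval(x_max, n, alpha, omega1)
        if best is None or hi - lo < best[0]:
            best = (hi - lo, omega1, lo, hi)
    return best


if __name__ == "__main__":
    x_max, n, alpha = 4.0, 5, 0.05  # hypothetical sample maximum and size
    length, omega1, lo, hi = shortest_by_grid(x_max, n, alpha)
    eq_lo, eq_hi = equal_tailed_interval(x_max, n, alpha)
    print("grid minimum:", (lo, hi), "length", length)
    print("equal-tailed:", (eq_lo, eq_hi), "length", eq_hi - eq_lo)
```

Because the length is strictly decreasing in $\omega_1$, the grid minimum sits at the right end of the admissible range, $\omega_1 = \alpha^{1/n}$, in agreement with the interval $(X_{(n)},\ X_{(n)}/\alpha^{1/n})$ derived above; the equal-tailed interval has the same level but is longer.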
The usual sample estimate of availability is

$$\hat{A} = \frac{\hat\theta}{\hat\theta + \hat\varphi}, \tag{29}$$

where $\hat\theta$, the sample estimate of $\theta$, is calculated from

$$\hat\theta = \sum_{i=1}^{n_1} X_i / n_1, \tag{30}$$

$X_i$ is the time between the $(i-1)$th and $i$th failures, $n_1$ is the number of failures, and $\hat\varphi$, the sample estimate of $\varphi$, is calculated from

$$\hat\varphi = \sum_{j=1}^{n_2} Y_j / n_2, \tag{31}$$

where $Y_j$ is the time-to-repair associated with the $j$th failure and $n_2$ is the number of repair actions initiated. It is assumed that $X$ (time-to-failure) and $Y$ (time-to-repair) are stochastically independent random variables with probability density functions

$$f(x;\theta) = \frac{1}{\theta} e^{-x/\theta}, \quad x \in (0,\infty), \tag{32}$$

and

$$f(y;\varphi) = \frac{1}{\varphi} e^{-y/\varphi}, \quad y \in (0,\infty). \tag{33}$$

Consider a random sample of $n_1$ times-to-failure and $n_2$ times-to-repair drawn from the populations described by (32) and (33), with sample means $\hat\theta$ and $\hat\varphi$ calculated from (30) and (31). It is well known that $2n_1\hat\theta/\theta$ and $2n_2\hat\varphi/\varphi$ are chi-square distributed variables with $2n_1$ and $2n_2$ degrees of freedom, respectively. Since they are independent due to the independence of the variables $X$ and $Y$, it is possible to define two new variables:

$$Z_1 = \left( \frac{2n_1\hat\theta/\theta}{2n_1} \right) \bigg/ \left( \frac{2n_2\hat\varphi/\varphi}{2n_2} \right) = \frac{\hat\theta\,\varphi}{\hat\varphi\,\theta}, \tag{34}$$

which is F-distributed with $2n_1, 2n_2$ degrees of freedom, and

$$Z_2 = \frac{1}{Z_1} = \frac{\hat\varphi\,\theta}{\hat\theta\,\varphi}, \tag{35}$$

which is F-distributed with $2n_2, 2n_1$ degrees of freedom. The variable $Z_1$ can be used to obtain a lower confidence limit for $A$ as follows:

$$\begin{aligned}
\Pr\left\{ \frac{\hat\theta\,\varphi}{\hat\varphi\,\theta} \le F_{1-\alpha}(2n_1,2n_2) \right\}
&= \Pr\left\{ \frac{\varphi}{\theta} \le \frac{\hat\varphi}{\hat\theta} F_{1-\alpha}(2n_1,2n_2) \right\} \\
&= \Pr\left\{ 1 + \frac{\varphi}{\theta} \le 1 + \frac{\hat\varphi}{\hat\theta} F_{1-\alpha}(2n_1,2n_2) \right\} \\
&= \Pr\left\{ \frac{1}{1 + \frac{\hat\varphi}{\hat\theta} F_{1-\alpha}(2n_1,2n_2)} \le \frac{1}{1 + \frac{\varphi}{\theta}} \right\} \\
&= \Pr\left\{ \frac{\hat\theta}{\hat\theta + \hat\varphi F_{1-\alpha}(2n_1,2n_2)} \le A \right\} = 1-\alpha.
\end{aligned} \tag{36}$$

In most practical cases $n_1 = n_2 = n$ and (36) becomes

$$\Pr\left\{ \frac{\hat\theta}{\hat\theta + \hat\varphi F_{1-\alpha}(2n,2n)} \le A \right\} = 1-\alpha, \tag{37}$$

and the $(1-\alpha)$ lower confidence limit is found from

$$\mathrm{LCL} = \frac{\hat\theta}{\hat\theta + \hat\varphi F_{1-\alpha}(2n,2n)}. \tag{38}$$

A two-sided $(1-\alpha)$ confidence interval, derived in a similar manner, is given by

$$\mathrm{LCL} = \frac{\hat\theta}{\hat\theta + \hat\varphi F_{1-\alpha/2}(2n,2n)}, \tag{39}$$

$$\mathrm{UCL} = \frac{\hat\theta F_{1-\alpha/2}(2n,2n)}{\hat\theta F_{1-\alpha/2}(2n,2n) + \hat\varphi}. \tag{40}$$

These confidence limits should be interpreted in the usual Neyman-Pearson sense that, in the long run, confidence intervals calculated from (38) or from (39) and (40) will cover the true value of availability $100(1-\alpha)$ percent of the time.

A test of the null hypothesis $H_0$: $A = A_0$ against the alternative hypothesis $H_1$: $A < A_0$ at the $\alpha$ level of significance can be performed using $Z_2$ (evaluated at the ratio $\theta_0/\varphi_0$ implied by $A_0$) as the decision statistic and $F_{1-\alpha}(2n_2,2n_1)$ as the decision criterion. $H_0$ is accepted when $Z_2 < F_{1-\alpha}(2n_2,2n_1)$ and rejected when $Z_2 > F_{1-\alpha}(2n_2,2n_1)$. To perform a test of $H_0$ vs. $H_1$, we want to find some $Q$ such that

$$\Pr\{\hat{A} \le Q;\ H_0\} = \alpha, \tag{41}$$

which is equivalent to

$$\Pr\left\{ \frac{1}{1 + \hat\varphi/\hat\theta} \le Q;\ H_0 \right\} = \Pr\left\{ \frac{\hat\varphi}{\hat\theta} \ge Q';\ H_0 \right\} = \Pr\left\{ \frac{\hat\varphi}{\hat\theta} \cdot \frac{\theta_0}{\varphi_0} \ge Q'';\ H_0 \right\} = \Pr\{Z_2 \ge F_{1-\alpha}(2n_2,2n_1)\} = \alpha, \tag{42}$$

with $Q' = (1-Q)/Q$ and $Q'' = Q'\theta_0/\varphi_0$, since under $H_0$, $Z_2$ is F-distributed with $(2n_2, 2n_1)$ degrees of freedom. Because of the one-to-one correspondence between $\varphi/\theta$ and availability, the test of $H_0$ against $H_1$ can be performed using the test statistic $Z_2$.

The power of the test of $H_0$ against $H_1$ is defined as the probability that $H_1$ will be accepted when true (i.e., when $A = A_1 < A_0$) and is a function of $A_1$. The power function of $H_0$ vs. $H_1$ is

$$\Pr\left\{ \frac{\hat\varphi\,\theta_0}{\hat\theta\,\varphi_0} \ge F_{1-\alpha}(2n_2,2n_1);\ H_1: A = A_1 < A_0 \right\} = 1-\beta, \tag{43}$$

where $\beta$ is the consumer's risk, or probability of a Type II error. Equation (43) is equivalent to

$$\Pr\left\{ \frac{\hat\varphi\,\theta_1}{\hat\theta\,\varphi_1} \ge \frac{\varphi_0\,\theta_1}{\theta_0\,\varphi_1} F_{1-\alpha}(2n_2,2n_1);\ H_1: A = A_1 < A_0 \right\} = 1-\beta \tag{44}$$

or

$$\Pr\left\{ F(2n_2,2n_1) \ge \frac{1}{\vartheta} F_{1-\alpha}(2n_2,2n_1);\ H_1: A = A_1 < A_0 \right\} = 1-\beta, \tag{45}$$

where

$$\vartheta = \frac{\varphi_1/\theta_1}{\varphi_0/\theta_0}, \tag{46}$$

since under $H_1$, $\hat\varphi\,\theta_1/(\hat\theta\,\varphi_1)$ is an F-distributed variable with $2n_2, 2n_1$ degrees of freedom.
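The confidence limits (38)-(40) and the power expression (45) can be evaluated numerically. The Python sketch below is illustrative only and rests on stated assumptions: $n_1$ and $n_2$ are positive integers, so the F distribution with $(2a, 2b)$ degrees of freedom has a closed-form distribution function as a binomial sum (the regularized incomplete beta function with integer arguments); quantiles are found by bisection; and `vartheta` is taken to be the ratio $(\varphi_1/\theta_1)/(\varphi_0/\theta_0)$. All function names and the sample values are hypothetical.

```python
import math


def f_cdf(f, a, b):
    """CDF of the F distribution with (2a, 2b) degrees of freedom,
    for positive integers a and b, written as a binomial tail sum."""
    if f <= 0.0:
        return 0.0
    x = a * f / (a * f + b)
    m = a + b - 1
    return sum(math.comb(m, j) * x ** j * (1.0 - x) ** (m - j)
               for j in range(a, m + 1))


def f_quantile(p, a, b, tol=1e-12):
    """p-quantile of F(2a, 2b) by bisection (0 < p < 1)."""
    lo, hi = 0.0, 1.0
    while f_cdf(hi, a, b) < p:  # expand until the root is bracketed
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if f_cdf(mid, a, b) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def availability_limits(theta_hat, phi_hat, n, alpha):
    """One-sided LCL as in (38) and two-sided (LCL, UCL) as in (39)-(40),
    for the case n1 = n2 = n."""
    f_one = f_quantile(1.0 - alpha, n, n)
    f_two = f_quantile(1.0 - alpha / 2.0, n, n)
    lcl_one_sided = theta_hat / (theta_hat + phi_hat * f_one)
    lcl = theta_hat / (theta_hat + phi_hat * f_two)
    ucl = theta_hat * f_two / (theta_hat * f_two + phi_hat)
    return lcl_one_sided, lcl, ucl


def power(vartheta, n1, n2, alpha):
    """Power as in (45): Pr{F(2n2, 2n1) >= F_{1-alpha}(2n2, 2n1) / vartheta}."""
    crit = f_quantile(1.0 - alpha, n2, n1)
    return 1.0 - f_cdf(crit / vartheta, n2, n1)


if __name__ == "__main__":
    theta_hat, phi_hat, n, alpha = 100.0, 5.0, 10, 0.05  # hypothetical data
    print("A_hat:", theta_hat / (theta_hat + phi_hat))
    print("limits:", availability_limits(theta_hat, phi_hat, n, alpha))
    for vt in (1.0, 1.5, 2.0, 3.0):
        print("vartheta", vt, "power", power(vt, n, n, alpha))
```

At `vartheta = 1` the alternative coincides with $H_0$, so the power reduces to the significance level $\alpha$; it increases as `vartheta` grows, that is, as $A_1$ falls further below $A_0$.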
The particular value of $\vartheta$ for which the power function equals some specified value of $1-\beta$ is

$$\vartheta = F_{1-\alpha}(2n_2,2n_1)\,F_{1-\beta}(2n_1,2n_2). \tag{47}$$

Again, in most practical problems $n_1 = n_2 = n$, and $F(2n_2,2n_1)$ is replaced by $F(2n,2n)$ in (42) through (47).

Using the technique proposed above, it can be shown that the two-sided shortest-length $100(1-\alpha)\%$ confidence interval for availability can be obtained by solving the problem:

Minimize:

$$L(X_1, X_2, \ldots, X_n, \omega_1, \omega_2) = A_U - A_L, \tag{48}$$

where

$$A_U = \frac{1}{1 + \omega_1 \dfrac{\hat\varphi}{\hat\theta}}, \tag{49}$$

$$A_L = \frac{1}{1 + \omega_2 \dfrac{\hat\varphi}{\hat\theta}}, \tag{50}$$

Subject to:

$$F_{2n_1,2n_2}(\omega_2) - F_{2n_1,2n_2}(\omega_1) = 1-\alpha. \tag{51}$$

Here $F_{2n_1,2n_2}$ denotes the distribution function of the pivotal quantity $Z_1$ of (34).

Another time-to-repair distribution, frequently assumed in waiting line analysis, is the Erlang family of service time distributions defined by

$$f(y;\mu,k) = \frac{(\mu k)^k}{(k-1)!}\, y^{k-1} \exp(-k\mu y), \quad y \in (0,\infty). \tag{52}$$

Preliminary analysis indicates that if time-to-repair is distributed as the $k$th member of the Erlang family, confidence limits can be placed on availability using an approach similar to that presented in this paper.

We also note the surprising result that finite-length confidence intervals for the parameters of the normal distribution can be obtained from a single observation; this result and an extensive bibliography can be found in [2].

6. Conclusion

The technique given in this paper for constructing shortest-length confidence intervals is easy to apply. This technique can also be used for solving similar problems.

Acknowledgments

This work was supported in part by Grant No.01- 48d and Grant No.01.0031 from the Latvian Council of Sciences and the National Institute of Mathematics and Informatics of Latvia.

References

1. C.G. Phipps, Maxima and minima under restraint, American Mathematical Monthly, 59, 1952, 230-235.
2. M.M. Wall, J. Boen, and R. Tweedie, An effective confidence interval for the mean with samples of size one and two, The American Statistician, 55(2), 2001, 102-105.
