Confidence Intervals

If X₁,...,Xₙ have a joint distribution depending on θ, a 100(1−α)% confidence interval for θ is the evaluation from the data (the observed values x₁,...,xₙ of X₁,...,Xₙ) of a random interval (A(X₁,...,Xₙ), B(X₁,...,Xₙ)) which has probability 1−α of containing θ. We obtain the random interval by using a pivot, a function of X₁,...,Xₙ and θ whose distribution is completely known. It is called an asymptotic pivot if the distribution of the pivot holds only approximately, for large n. The pivot is usually based on the maximum likelihood estimator when the joint distribution of the X's is known, since this gives the shortest expected length for the random intervals. We also usually construct central confidence intervals; the confidence intervals given below are central.

Approximate confidence intervals based on the MLE

In the multi-parameter as well as the one-parameter case, for large n the maximum likelihood estimator θ̂ of θ has approximately a normal distribution with mean θ and variance equal to the Cramér–Rao lower bound CRLB(θ). If there is a single parameter only, we could therefore use the asymptotic pivot

    Z = (θ̂ − θ) / √CRLB(θ)  ~  N(0,1).

However, it is usual to make a further approximation and replace θ in the formula for CRLB(θ) by its estimator θ̂. This is essential in the multi-parameter case, when CRLB(θ) depends on other parameters in addition to θ (so that Z above is not a pivot). We can therefore, in both the one- and multi-parameter cases, use the asymptotic pivot

    Z = (θ̂ − θ) / √ĈRLB(θ)  ~  N(0,1),

where ĈRLB(θ) denotes CRLB(θ) with all parameters replaced by their maximum likelihood estimators.
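As a concrete numerical sketch (the Bernoulli example and the function name `wald_interval` are our own illustration, not part of the notes): for a Bernoulli(p) sample the MLE is p̂ = x̄ and CRLB(p) = p(1−p)/n, so the pivot above yields the familiar interval p̂ ± z_{α/2} √(p̂(1−p̂)/n). Assuming Python 3.8+ for `statistics.NormalDist`:

```python
from statistics import NormalDist

def wald_interval(successes, n, alpha=0.05):
    """Approximate 100(1-alpha)% CI for a Bernoulli parameter p based on the MLE.

    CRLB(p) = p(1-p)/n; p in the bound is replaced by its MLE p_hat = successes/n.
    """
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - alpha / 2)   # upper 100(alpha/2)% point of N(0,1)
    half_width = z * (p_hat * (1 - p_hat) / n) ** 0.5
    return p_hat - half_width, p_hat + half_width

# 40 successes in 100 trials: interval roughly (0.304, 0.496)
lo, hi = wald_interval(40, 100)
```

Note that the interval is only approximate: its coverage approaches 1−α as n grows, and it can perform poorly when p̂ is near 0 or 1.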
Using this asymptotic pivot, and denoting the upper 100α% point of N(0,1) by z_α, for large n we have

    1 − α ≅ P(−z_{α/2} < Z < z_{α/2})
          = P(−z_{α/2} < (θ̂ − θ)/√ĈRLB(θ) < z_{α/2})
          = P(−z_{α/2} √ĈRLB(θ) < θ̂ − θ < z_{α/2} √ĈRLB(θ))
          = P(θ̂ − z_{α/2} √ĈRLB(θ) < θ < θ̂ + z_{α/2} √ĈRLB(θ)).

Hence (θ̂ − z_{α/2} √ĈRLB(θ), θ̂ + z_{α/2} √ĈRLB(θ)) is a random interval which has approximately probability 1−α of containing θ. An evaluation of this interval (replacing the estimators of the parameters by the estimates from the data) is an approximate 100(1−α)% confidence interval for θ.

Approximate confidence intervals based on an independent sample from a distribution where the distributional form is not known

We can base confidence intervals on the method of moments estimators. The rth sample moment about the origin, m′_r = (1/n) Σⱼ₌₁ⁿ Xⱼʳ, is the method of moments estimator of E[Xʳ] and is a sample mean of the random variables Yⱼ = Xⱼʳ for j = 1,...,n. Hence, by the Central Limit Theorem, m′_r has approximately a normal distribution when n is large.

We will only consider a confidence interval for the mean µ. The method of moments estimator of µ is the sample mean X̄. If the variance σ² of the distribution is known, the asymptotic pivot is Z = (X̄ − µ)/√(σ²/n) ~ N(0,1). If the variance is unknown, the asymptotic pivot is Z = (X̄ − µ)/√(S²/n) ~ N(0,1), where S² is the sample variance.

In the latter case:

    1 − α ≅ P(−z_{α/2} < Z < z_{α/2})
          = P(−z_{α/2} < (X̄ − µ)/√(S²/n) < z_{α/2})
          = P(−z_{α/2} √(S²/n) < X̄ − µ < z_{α/2} √(S²/n))
          = P(X̄ − z_{α/2} √(S²/n) < µ < X̄ + z_{α/2} √(S²/n)).

So the approximate 100(1−α)% confidence interval, when n is large, is

    (x̄ − z_{α/2} √(s²/n), x̄ + z_{α/2} √(s²/n)) = (x̄ − z_{α/2} s/√n, x̄ + z_{α/2} s/√n).

When the variance σ² is known, the only difference is that σ replaces the sample standard deviation s.
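The large-sample interval x̄ ± z_{α/2} s/√n can be sketched directly in code (a minimal illustration of our own, assuming Python 3.8+; `statistics.variance` uses the n−1 divisor, matching the sample variance S² above):

```python
from math import sqrt
from statistics import NormalDist, mean, variance

def approx_mean_ci(data, alpha=0.05):
    """Approximate 100(1-alpha)% CI for the mean when n is large and the
    variance is unknown: x_bar +/- z_{alpha/2} * s / sqrt(n)."""
    n = len(data)
    x_bar = mean(data)
    s = sqrt(variance(data))                 # sample standard deviation (n-1 divisor)
    z = NormalDist().inv_cdf(1 - alpha / 2)  # upper 100(alpha/2)% point of N(0,1)
    margin = z * s / sqrt(n)
    return x_bar - margin, x_bar + margin

lo, hi = approx_mean_ci(list(range(100)))    # central interval around x_bar = 49.5
```

If σ² is known, one would simply replace `s` by σ, exactly as in the notes.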
Exact confidence intervals when the sample is from a normal distribution

Confidence interval for µ

When the variance is known

Base the pivot on X̄, which is the minimum variance unbiased estimator for µ. The pivot (for any sample size n) is Z = (X̄ − µ)/√(σ²/n) ~ N(0,1).

    1 − α = P(−z_{α/2} < Z < z_{α/2})
          = P(−z_{α/2} < (X̄ − µ)/√(σ²/n) < z_{α/2})
          = P(X̄ − z_{α/2} √(σ²/n) < µ < X̄ + z_{α/2} √(σ²/n)).

So the exact 100(1−α)% confidence interval is (x̄ − z_{α/2} √(σ²/n), x̄ + z_{α/2} √(σ²/n)).

When the variance is unknown

Z above is no longer a pivot. Replace σ² by S². Using results for the distribution of the sample mean and variance for a sample from the normal distribution, the pivot is

    T = (X̄ − µ)/√(S²/n)  ~  t_{n−1}.

Denoting the upper 100α% point of t_{n−1} by t_{n−1,α}:

    1 − α = P(−t_{n−1,α/2} < T < t_{n−1,α/2})
          = P(−t_{n−1,α/2} < (X̄ − µ)/√(S²/n) < t_{n−1,α/2})
          = P(X̄ − t_{n−1,α/2} √(S²/n) < µ < X̄ + t_{n−1,α/2} √(S²/n)).

So the exact 100(1−α)% confidence interval is

    (x̄ − t_{n−1,α/2} √(s²/n), x̄ + t_{n−1,α/2} √(s²/n)) = (x̄ − t_{n−1,α/2} s/√n, x̄ + t_{n−1,α/2} s/√n).

Confidence interval for σ²

This is based on S², the minimum variance unbiased estimator for σ². We use the result that U = (n−1)S²/σ² ~ χ²_{n−1}, so U is the pivot. Let A and B be the lower and upper 100(α/2)% points of the χ²_{n−1} distribution. Then

    1 − α = P(A < U < B) = P(A < (n−1)S²/σ² < B) = P((n−1)S²/B < σ² < (n−1)S²/A).

So the exact 100(1−α)% confidence interval for σ² is ((n−1)s²/B, (n−1)s²/A).

It is trivial to convert this into a confidence interval for the standard deviation σ:

    1 − α = P((n−1)S²/B < σ² < (n−1)S²/A) = P(√((n−1)S²/B) < σ < √((n−1)S²/A)).

Hence the exact 100(1−α)% confidence interval for σ is (√((n−1)s²/B), √((n−1)s²/A)).

The connection with 2-tailed tests considered in Statistics 1

Test for the mean µ

The test is of H₀: µ = µ₀ against Hₐ: µ ≠ µ₀, and uses significance level α.

When the variance is known

The test statistic is almost the same as the pivot, but has µ₀ replacing µ. The test statistic is Z = (X̄ − µ₀)/√(σ²/n) ~ N(0,1) when H₀ is true.
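The exact intervals above can be sketched for a specific case (our own illustration, not from the notes). The standard library has no t or χ² quantile function, so the constants below are taken from standard tables for n = 10, α = 0.05 (an assumption of this sketch; `scipy.stats.t.ppf` and `scipy.stats.chi2.ppf` could compute them for general n and α):

```python
from math import sqrt
from statistics import mean, variance

# Table quantiles for n = 10, alpha = 0.05 (assumed constants, df = 9):
T_9 = 2.262      # upper 2.5% point of t_9
A_9 = 2.700      # lower 2.5% point of chi-squared_9
B_9 = 19.023     # upper 2.5% point of chi-squared_9

def exact_cis(data):
    """Exact 95% CIs for mu and sigma^2, for a sample of size 10 assumed
    to come from a normal distribution."""
    n = len(data)
    assert n == 10, "the table constants above are for n = 10 only"
    x_bar, s2 = mean(data), variance(data)
    mu_ci = (x_bar - T_9 * sqrt(s2 / n), x_bar + T_9 * sqrt(s2 / n))
    var_ci = ((n - 1) * s2 / B_9, (n - 1) * s2 / A_9)
    return mu_ci, var_ci
```

Note the asymmetry of the variance interval about s²: the χ² distribution is skewed, so unlike the t-interval for µ, the interval for σ² is not centred on the estimate.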
We reject the null hypothesis if the calculated value of the test statistic is either ≥ z_{α/2} or ≤ −z_{α/2}. Hence we accept the null hypothesis if

    −z_{α/2} < (x̄ − µ₀)/√(σ²/n) < z_{α/2},  i.e. if  x̄ − z_{α/2} √(σ²/n) < µ₀ < x̄ + z_{α/2} √(σ²/n).

Thus we accept H₀ if µ₀ lies in the 100(1−α)% confidence interval for µ; otherwise we reject H₀.

When the variance is unknown

Again the test statistic is almost the same as the pivot, but has µ₀ replacing µ. The test statistic is T = (X̄ − µ₀)/√(S²/n) ~ t_{n−1} when H₀ is true. We reject the null hypothesis if the calculated value of the test statistic is either ≥ t_{n−1,α/2} or ≤ −t_{n−1,α/2}. Hence we accept the null hypothesis if

    −t_{n−1,α/2} < (x̄ − µ₀)/√(s²/n) < t_{n−1,α/2},  i.e. if  x̄ − t_{n−1,α/2} √(s²/n) < µ₀ < x̄ + t_{n−1,α/2} √(s²/n).

Thus we accept H₀ if µ₀ lies in the 100(1−α)% confidence interval for µ; otherwise we reject H₀.

Tests for the variance σ²

The test is of H₀: σ² = σ₀² against Hₐ: σ² ≠ σ₀², and uses significance level α. The test statistic is almost the same as the pivot, but has σ₀² replacing σ². The test statistic is U = (n−1)S²/σ₀² ~ χ²_{n−1} when H₀ is true. We reject the null hypothesis if the calculated value of the test statistic is either ≤ A or ≥ B, where A and B are the lower and upper 100(α/2)% points of χ²_{n−1}. Hence we accept the null hypothesis if

    A < (n−1)s²/σ₀² < B,  i.e. if  (n−1)s²/B < σ₀² < (n−1)s²/A.

Thus we accept H₀ if σ₀² lies in the 100(1−α)% confidence interval for σ²; otherwise we reject H₀.

Note

There is a similar link between approximate (central) confidence intervals for the mean µ and two-tailed tests concerning µ when the distributional form for X is not known.

A brief note on one-sided confidence intervals

We will just illustrate this via a confidence interval for the mean when the variance is known and the distribution is normal.
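The duality between the two-tailed test and the central confidence interval can be checked mechanically (a sketch of our own for the known-variance case; the function name is hypothetical, and Python 3.8+ is assumed for `statistics.NormalDist`):

```python
from math import sqrt
from statistics import NormalDist

def two_tailed_z_test(x_bar, mu0, sigma2, n, alpha=0.05):
    """Two-tailed z-test of H0: mu = mu0 (variance sigma2 known).
    Returns True (accept H0) exactly when mu0 lies inside the
    100(1-alpha)% central confidence interval for mu."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * sqrt(sigma2 / n)
    in_ci = (x_bar - half_width) < mu0 < (x_bar + half_width)
    # The equivalent rule stated via the test statistic:
    stat = (x_bar - mu0) / sqrt(sigma2 / n)
    assert in_ci == (abs(stat) < z)   # the two formulations always agree
    return in_ci
```

The inner assertion is the whole point: rejecting when |z-statistic| ≥ z_{α/2} is algebraically the same event as µ₀ falling outside the interval.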
The pivot is Z = (X̄ − µ)/√(σ²/n) ~ N(0,1). Take any 0 ≤ β ≤ α; then

    1 − α = P(−z_β < Z < z_{α−β})
          = P(−z_β < (X̄ − µ)/√(σ²/n) < z_{α−β})
          = P(X̄ − z_{α−β} √(σ²/n) < µ < X̄ + z_β √(σ²/n)).

The confidence interval is then (x̄ − z_{α−β} √(σ²/n), x̄ + z_β √(σ²/n)).

So we can obtain different 100(1−α)% confidence intervals from the same set of data by choosing different values of β. The length of the confidence interval is (z_β + z_{α−β}) √(σ²/n), which is shortest when β = α/2 (so that the confidence interval is central). The confidence interval is also centred on the estimate x̄ for µ when β = α/2.

We obtain one-sided confidence intervals by taking either β = 0 or β = α. One-sided confidence intervals are useful if we want to obtain either a lower or an upper bound for the parameter µ. One-sided confidence intervals link to one-tailed tests in an equivalent manner to the link between central confidence intervals and two-tailed tests.

Upper bound for µ

Take β = α, so that z_{α−β} = z₀ = ∞. The confidence interval is then (−∞, x̄ + z_α √(σ²/n)), so we are 100(1−α)% confident that µ < x̄ + z_α √(σ²/n). We can carry out a test of H₀: µ = µ₀ against Hₐ: µ < µ₀ with significance level α by accepting H₀ if µ₀ lies in this one-sided confidence interval. So we accept H₀ if µ₀ < x̄ + z_α √(σ²/n). (This is equivalent to rejecting H₀ if (x̄ − µ₀)/√(σ²/n) ≤ −z_α.)

Lower bound for µ

Take β = 0, so that z_β = z₀ = ∞. The confidence interval is then (x̄ − z_α √(σ²/n), ∞), so we are 100(1−α)% confident that µ > x̄ − z_α √(σ²/n). We can carry out a test of H₀: µ = µ₀ against Hₐ: µ > µ₀ with significance level α by accepting H₀ if µ₀ lies in this one-sided confidence interval. So we accept H₀ if µ₀ > x̄ − z_α √(σ²/n). (This is equivalent to rejecting H₀ if (x̄ − µ₀)/√(σ²/n) ≥ z_α.)
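The two one-sided bounds can be sketched together (again a minimal illustration of our own, assuming Python 3.8+; note z_α is used here, not z_{α/2}, so each one-sided bound is tighter than the corresponding endpoint of the central interval):

```python
from math import sqrt
from statistics import NormalDist

def one_sided_bounds(x_bar, sigma2, n, alpha=0.05):
    """100(1-alpha)% one-sided confidence bounds for mu (variance known).
    beta = 0     gives the lower bound  x_bar - z_alpha * sigma / sqrt(n);
    beta = alpha gives the upper bound  x_bar + z_alpha * sigma / sqrt(n)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)   # upper 100*alpha% point of N(0,1)
    margin = z_alpha * sqrt(sigma2 / n)
    return x_bar - margin, x_bar + margin       # (lower bound, upper bound)

lower, upper = one_sided_bounds(10.0, 4.0, 16)
```

Each bound is then read separately: we are 95% confident that µ exceeds `lower`, and, from an entirely separate interval, 95% confident that µ is below `upper`. The pair does not form a single 95% central interval.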