Confidence Intervals
If X1,...,Xn have a joint distribution depending on θ, a 100(1−α)% confidence interval for
θ is the evaluation from the data (the observed values x1,...,xn of X1,...,Xn) of a random
interval (A(X1,...,Xn), B(X1,...,Xn)) which has probability 1−α of containing θ.
We obtain the random interval by using a pivot: a function of X1,...,Xn and θ
whose distribution is completely known. It is referred to as an asymptotic pivot if the
distribution of the pivot is only known approximately, for large n.
The pivot is usually based on the maximum likelihood estimator if the joint distribution of
the X's is known, since this yields the shortest expected length for the random intervals. We
also usually use central confidence intervals. The confidence intervals given below are
central confidence intervals.
Approximate confidence intervals based on the MLE
In the multi-parameter as well as the one-parameter case, for large n the maximum
likelihood estimator θ̂ of θ has approximately a normal distribution with mean θ and
variance equal to the Cramér-Rao lower bound CRLB(θ).
If there is a single parameter only, we could therefore use the asymptotic pivot
Z = (θ̂ − θ)/√CRLB(θ) ~ N(0,1)
However, it is usual to make a further approximation and replace θ in the formula for
CRLB(θ) by its estimator θ̂. This is essential in the multi-parameter case, when CRLB(θ)
depends on other parameters in addition to θ (so that Z above is not a pivot).
We can therefore, in both the one- and multi-parameter cases, use the asymptotic pivot
Z = (θ̂ − θ)/√CRLB̂(θ) ~ N(0,1)
where CRLB̂(θ) denotes CRLB(θ) with all the parameters replaced by their maximum
likelihood estimators.
Using this asymptotic pivot, and denoting the upper 100α% point of N(0,1) by z_α, for large n
we have:


1 − α ≅ P(−z_{α/2} < Z < z_{α/2}) = P(−z_{α/2} < (θ̂ − θ)/√CRLB̂(θ) < z_{α/2})
= P(−z_{α/2} √CRLB̂(θ) < θ̂ − θ < z_{α/2} √CRLB̂(θ))
= P(θ̂ − z_{α/2} √CRLB̂(θ) < θ < θ̂ + z_{α/2} √CRLB̂(θ))
Hence (θ̂ − z_{α/2} √CRLB̂(θ), θ̂ + z_{α/2} √CRLB̂(θ)) is a random interval which has
approximately probability 1−α of containing θ. An evaluation of this interval (by
replacing the estimators of the parameters by the estimates from the data) is an approximate
100(1−α)% confidence interval for θ.
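To make this concrete, the plug-in interval θ̂ ± z_{α/2} √CRLB̂(θ) can be sketched for a Poisson rate λ, where the MLE is the sample mean and CRLB(λ) = λ/n. The sketch below is illustrative only; the function name and count data are hypothetical, not from the text.

```python
from statistics import NormalDist, mean

def mle_poisson_ci(data, alpha=0.05):
    """Approximate 100(1-alpha)% CI for a Poisson rate lambda.

    The MLE is the sample mean; CRLB(lambda) = lambda/n, and we plug the
    MLE into it (the 'further approximation' described above).
    """
    n = len(data)
    lam_hat = mean(data)
    z = NormalDist().inv_cdf(1 - alpha / 2)   # upper 100(alpha/2)% point of N(0,1)
    half_width = z * (lam_hat / n) ** 0.5     # z_{alpha/2} * sqrt(CRLB-hat)
    return lam_hat - half_width, lam_hat + half_width

# hypothetical count data
lo, hi = mle_poisson_ci([3, 5, 2, 4, 6, 3, 4, 5, 2, 4])
print(f"approximate 95% CI for lambda: ({lo:.3f}, {hi:.3f})")
```

Any other one-parameter model follows the same pattern once its MLE and CRLB are known.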
Approximate confidence intervals based on an independent sample from a
distribution where the distributional form is not known
We can base confidence intervals on the method of moments estimators. The rth sample
moment about the origin, m′_r = (1/n) Σ_{j=1}^{n} X_j^r, is the method of moments estimator
of E[X^r] and is a sample mean of the random variables Y_j = X_j^r for j = 1,...,n. Hence,
by the Central Limit Theorem, m′_r has approximately a normal distribution when n is large.
We will only consider a confidence interval for the mean µ. The method of moments
estimator of µ is the sample mean X̄. If the variance σ² of the distribution is known, the
asymptotic pivot is
Z = (X̄ − µ)/√(σ²/n) ~ N(0,1)
If the variance is unknown, the asymptotic pivot is
Z = (X̄ − µ)/√(S²/n) ~ N(0,1)
where S² is the sample variance.
In the latter case:


1 − α ≅ P(−z_{α/2} < Z < z_{α/2}) = P(−z_{α/2} < (X̄ − µ)/√(S²/n) < z_{α/2})
= P(−z_{α/2} √(S²/n) < X̄ − µ < z_{α/2} √(S²/n))
= P(X̄ − z_{α/2} √(S²/n) < µ < X̄ + z_{α/2} √(S²/n))
So the approximate 100(1-α)% confidence interval, when n is large, is

(x̄ − z_{α/2} √(s²/n), x̄ + z_{α/2} √(s²/n)) = (x̄ − z_{α/2} s/√n, x̄ + z_{α/2} s/√n)

When the variance σ2 is known, the only difference is that σ replaces the sample standard
deviation s.
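The large-sample interval above can be sketched in Python using only the standard library; the data below are hypothetical, and `variance` uses the usual divisor n − 1.

```python
from statistics import NormalDist, mean, variance

def large_sample_mean_ci(data, alpha=0.05):
    """Approximate 100(1-alpha)% CI for mu: x_bar +/- z_{alpha/2} * s/sqrt(n).

    No distributional form is assumed; n should be large for the normal
    approximation from the Central Limit Theorem to be reasonable.
    """
    n = len(data)
    x_bar = mean(data)
    s = variance(data) ** 0.5                 # sample standard deviation
    z = NormalDist().inv_cdf(1 - alpha / 2)   # upper 100(alpha/2)% point of N(0,1)
    half_width = z * s / n ** 0.5
    return x_bar - half_width, x_bar + half_width

# hypothetical data
lo, hi = large_sample_mean_ci(list(range(1, 21)))
print(f"approximate 95% CI for mu: ({lo:.3f}, {hi:.3f})")
```

When σ² is known, replacing `s` with σ gives the known-variance version.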
Exact confidence intervals when the sample is from a normal distribution
Confidence interval for µ
When the variance is known
Base the pivot on X̄, which is the minimum variance unbiased estimator of µ.
The pivot (for any sample size n) is
Z = (X̄ − µ)/√(σ²/n) ~ N(0,1)


1 − α = P(−z_{α/2} < Z < z_{α/2}) = P(−z_{α/2} < (X̄ − µ)/√(σ²/n) < z_{α/2})
= P(X̄ − z_{α/2} √(σ²/n) < µ < X̄ + z_{α/2} √(σ²/n))

So the exact 100(1−α)% confidence interval is (x̄ − z_{α/2} √(σ²/n), x̄ + z_{α/2} √(σ²/n))
When the variance is unknown
Z above is no longer a pivot; replace σ² by S². Using results for the distributions of the
sample mean and variance for a sample from the normal distribution, the pivot is
T = (X̄ − µ)/√(S²/n) ~ t_{n−1}
Denoting the upper 100α% point of t_{n−1} by t_{n−1,α}, we have
1 − α = P(−t_{n−1,α/2} < T < t_{n−1,α/2}) = P(−t_{n−1,α/2} < (X̄ − µ)/√(S²/n) < t_{n−1,α/2})
= P(X̄ − t_{n−1,α/2} √(S²/n) < µ < X̄ + t_{n−1,α/2} √(S²/n))
So the exact 100(1−α)% confidence interval is
(x̄ − t_{n−1,α/2} √(s²/n), x̄ + t_{n−1,α/2} √(s²/n)) = (x̄ − t_{n−1,α/2} s/√n, x̄ + t_{n−1,α/2} s/√n)
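A sketch of the t-based interval, with the critical value supplied by the user (here t_{9,0.025} = 2.262, a standard table value; the sample data are hypothetical):

```python
from statistics import mean, variance

def t_mean_ci(data, t_crit):
    """Exact 100(1-alpha)% CI for mu from a normal sample with unknown
    variance: x_bar +/- t_{n-1,alpha/2} * s/sqrt(n).

    The critical value t_{n-1,alpha/2} is passed in, e.g. from tables.
    """
    n = len(data)
    x_bar = mean(data)
    s = variance(data) ** 0.5            # sample standard deviation (divisor n-1)
    half_width = t_crit * s / n ** 0.5
    return x_bar - half_width, x_bar + half_width

# hypothetical sample of n = 10; t_{9, 0.025} = 2.262 from standard tables
sample = [4.8, 5.1, 4.9, 5.3, 5.0, 4.7, 5.2, 5.1, 4.9, 5.0]
lo, hi = t_mean_ci(sample, t_crit=2.262)
print(f"exact 95% CI for mu: ({lo:.3f}, {hi:.3f})")
```

Passing the critical value explicitly keeps the sketch dependency-free; a t quantile function (e.g. from a statistics library) could supply it instead.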
Confidence interval for σ²
This is based on S², the minimum variance unbiased estimator of σ². We use the result that
U = (n − 1)S²/σ² ~ χ²_{n−1}
So U is the pivot.
Let A and B be the lower and upper 100(α/2)% points of the χ²_{n−1} distribution. Then
1 − α = P(A < U < B) = P(A < (n − 1)S²/σ² < B) = P((n − 1)S²/B < σ² < (n − 1)S²/A)
So the exact 100(1−α)% confidence interval for σ² is
((n − 1)s²/B, (n − 1)s²/A)
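A sketch of this variance interval, with the χ² points supplied as arguments (here A = 2.700 and B = 19.023, the standard table values for χ²_9 at α = 0.05; the sample data are hypothetical):

```python
from statistics import variance

def variance_ci(data, chi2_lower, chi2_upper):
    """Exact 100(1-alpha)% CI for sigma^2 from a normal sample:
    ((n-1)s^2 / B, (n-1)s^2 / A), where A = chi2_lower and B = chi2_upper
    are the lower and upper 100(alpha/2)% points of chi^2_{n-1}.
    """
    n = len(data)
    s2 = variance(data)                        # sample variance, divisor n-1
    return (n - 1) * s2 / chi2_upper, (n - 1) * s2 / chi2_lower

# hypothetical sample of n = 10 (9 degrees of freedom), alpha = 0.05:
# table values A = 2.700, B = 19.023 for chi^2_9
sample = [4.8, 5.1, 4.9, 5.3, 5.0, 4.7, 5.2, 5.1, 4.9, 5.0]
lo, hi = variance_ci(sample, chi2_lower=2.700, chi2_upper=19.023)
print(f"exact 95% CI for sigma^2: ({lo:.5f}, {hi:.5f})")
```

Note the asymmetry: unlike the normal and t intervals, the χ² interval is not centred on s², because the χ² distribution is skewed.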
It is trivial to convert this into a confidence interval for the standard deviation σ.
1 − α = P((n − 1)S²/B < σ² < (n − 1)S²/A) = P(√((n − 1)S²/B) < σ < √((n − 1)S²/A))
Hence the exact 100(1-α)% confidence interval for σ is
(√((n − 1)s²/B), √((n − 1)s²/A))

The connection with two-tailed tests considered in Statistics 1
Test for the mean µ
The test is of H0: µ = µ0 against Ha: µ ≠ µ0, and uses significance level α.
When the variance is known
The test statistic is almost the same as the pivot, but has µ0 replacing µ:
Z = (X̄ − µ0)/√(σ²/n) ~ N(0,1) when H0 is true.
We reject the null hypothesis if the calculated value of the test statistic is either ≥ z_{α/2}
or ≤ −z_{α/2}. Hence we accept the null hypothesis if
−z_{α/2} < (x̄ − µ0)/√(σ²/n) < z_{α/2}, i.e. if x̄ − z_{α/2} √(σ²/n) < µ0 < x̄ + z_{α/2} √(σ²/n)
Thus we accept H0 if µ0 lies in the 100(1-α)% confidence interval for µ. Otherwise we
reject H0.
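This equivalence can be checked directly in code: the test below accepts H0 exactly when µ0 falls inside the confidence interval. A minimal sketch, with hypothetical numbers:

```python
from statistics import NormalDist

def z_test_via_ci(x_bar, sigma2, n, mu0, alpha=0.05):
    """Two-tailed z-test of H0: mu = mu0 (variance known) carried out by
    checking whether mu0 lies in the 100(1-alpha)% confidence interval.
    Returns True when H0 is accepted.
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}
    half_width = z * (sigma2 / n) ** 0.5
    return x_bar - half_width < mu0 < x_bar + half_width

# hypothetical: x_bar = 5.1, sigma^2 = 0.25, n = 25, so the 95% CI is
# 5.1 +/- 1.96 * 0.1; mu0 = 5.0 is inside it, mu0 = 4.8 is not
print(z_test_via_ci(5.1, 0.25, 25, mu0=5.0))
print(z_test_via_ci(5.1, 0.25, 25, mu0=4.8))
```

The same pattern works for the t and χ² tests below, swapping in the appropriate interval.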
When the variance is unknown
Again the test statistic is almost the same as the pivot, but has µ0 replacing µ:
T = (X̄ − µ0)/√(S²/n) ~ t_{n−1} when H0 is true.
We reject the null hypothesis if the calculated value of the test statistic is either ≥ t_{n−1,α/2}
or ≤ −t_{n−1,α/2}. Hence we accept the null hypothesis if
−t_{n−1,α/2} < (x̄ − µ0)/√(s²/n) < t_{n−1,α/2}, i.e. if x̄ − t_{n−1,α/2} √(s²/n) < µ0 < x̄ + t_{n−1,α/2} √(s²/n)
Thus we accept H0 if µ0 lies in the 100(1-α)% confidence interval for µ. Otherwise we
reject H0.
Tests for the variance σ²
The test is of H0: σ² = σ0² against Ha: σ² ≠ σ0², and uses significance level α.
The test statistic is almost the same as the pivot, but has σ0² replacing σ²:
U = (n − 1)S²/σ0² ~ χ²_{n−1} when H0 is true.
We reject the null hypothesis if the calculated value of the test statistic is either ≤ A or
≥ B, where A and B are the lower and upper 100(α/2)% points of χ²_{n−1}. Hence we accept the
null hypothesis if
A < (n − 1)s²/σ0² < B, i.e. if (n − 1)s²/B < σ0² < (n − 1)s²/A
Thus we accept H0 if σ0² lies in the 100(1−α)% confidence interval for σ². Otherwise we
reject H0.
Note There is a similar link between approximate (central) confidence intervals for the
mean µ and two-tailed tests concerning µ when the distributional form for X is not known.
A brief note on one-sided confidence intervals
We will just illustrate this via a confidence interval for the mean when the variance is
known and the distribution is normal. The pivot is
Z = (X̄ − µ)/√(σ²/n) ~ N(0,1)
Take any 0 ≤ β ≤ α, then


1 − α = P(−z_β < Z < z_{α−β}) = P(−z_β < (X̄ − µ)/√(σ²/n) < z_{α−β})
= P(X̄ − z_{α−β} √(σ²/n) < µ < X̄ + z_β √(σ²/n))
The confidence interval is then (x̄ − z_{α−β} √(σ²/n), x̄ + z_β √(σ²/n))
So we can obtain different 100(1-α)% confidence intervals from the same set of data by
choosing different values of β.
The length of the confidence interval is (z_β + z_{α−β}) √(σ²/n), so it will be shortest if we
take β = α/2 (so that the confidence interval is central). The confidence interval is also
centred on the estimate x̄ of µ when β = α/2.
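The claim that β = α/2 minimises the length can be checked numerically by tabulating z_β + z_{α−β} for several interior values of β (a hypothetical sketch; β must lie strictly between 0 and α here, since z_0 is infinite):

```python
from statistics import NormalDist

def ci_length_factor(alpha, beta):
    """Interval length in units of sqrt(sigma^2/n): z_beta + z_{alpha-beta}.

    Requires 0 < beta < alpha strictly, since z_0 is infinite.
    """
    inv = NormalDist().inv_cdf
    return inv(1 - beta) + inv(1 - (alpha - beta))   # z_beta + z_{alpha-beta}

alpha = 0.05
lengths = {b: ci_length_factor(alpha, b) for b in (0.005, 0.01, 0.025, 0.04, 0.045)}
for b, length in lengths.items():
    print(f"beta = {b:<6} length factor = {length:.4f}")
# the central choice beta = alpha/2 = 0.025 gives the smallest factor
```

The table is symmetric about β = α/2, reflecting the symmetry of the normal distribution.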
We obtain one-sided confidence intervals by taking either β=0 or β=α. One-sided
confidence intervals are useful if we want to obtain either a lower or upper bound for the
parameter µ. One-sided confidence intervals link to one-tailed tests in an equivalent manner
to the link between central confidence intervals and two-tailed tests.
Upper bound for µ
Take β = α, so that z_{α−β} = z_0 = ∞. The confidence interval is then (−∞, x̄ + z_α √(σ²/n)).
So we are 100(1−α)% confident that µ < x̄ + z_α √(σ²/n).
We can carry out a test of H0: µ = µ0 against Ha: µ < µ0 with significance level α by
accepting H0 if µ0 lies in this one-sided confidence interval. So we accept H0 if
µ0 < x̄ + z_α √(σ²/n). (This is equivalent to rejecting H0 if (x̄ − µ0)/√(σ²/n) ≤ −z_α.)
Lower bound for µ
Take β = 0, so that z_β = z_0 = ∞. The confidence interval is then (x̄ − z_α √(σ²/n), ∞).
So we are 100(1−α)% confident that µ > x̄ − z_α √(σ²/n).
We can carry out a test of H0: µ = µ0 against Ha: µ > µ0 with significance level α by
accepting H0 if µ0 lies in this one-sided confidence interval. So we accept H0 if
µ0 > x̄ − z_α √(σ²/n). (This is equivalent to rejecting H0 if (x̄ − µ0)/√(σ²/n) ≥ z_α.)
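Both one-sided bounds use the same z_α shift, so they can be computed together. A minimal sketch with hypothetical summary statistics (note z_α here, not z_{α/2}):

```python
from statistics import NormalDist

def one_sided_bounds(x_bar, sigma2, n, alpha=0.05):
    """100(1-alpha)% one-sided confidence bounds for mu
    (normal sample, variance sigma2 known).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha)   # upper 100*alpha% point of N(0,1)
    shift = z_alpha * (sigma2 / n) ** 0.5
    upper = x_bar + shift   # beta = alpha: mu < upper with confidence 1-alpha
    lower = x_bar - shift   # beta = 0:     mu > lower with confidence 1-alpha
    return upper, lower

# hypothetical: x_bar = 5.1, sigma^2 = 0.25, n = 25
upper, lower = one_sided_bounds(5.1, 0.25, 25)
print(f"95% upper bound: mu < {upper:.4f}")
print(f"95% lower bound: mu > {lower:.4f}")
```

Each bound is tighter than the corresponding endpoint of the central interval, since z_α < z_{α/2}.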