CONFIDENCE INTERVAL ESTIMATION
I. Definitions
1. An interval estimate of a parameter θ is any pair of functions L(x), U(x) of the sample which satisfy L(x) ≤ U(x) for every sample point x. If the realization X = x is observed and the inference L(x) ≤ θ ≤ U(x) is made, then the random interval [L(X), U(X)] is the interval estimator.
Why is it called a random interval?
Is it open/closed/two sided?
2. The coverage probability is the probability that the random interval [L(X), U(X)] covers the true parameter θ (i.e., Pθ{θ ∈ [L(X), U(X)]}).
3. The confidence coefficient of [L(X), U(X)] is the infimum over θ of the coverage probabilities.
Note: interval estimates + confidence coefficient = confidence intervals
Note: The concepts can be generalized to sets (need not be an interval)
Example 9.1.6
Set up to illustrate the concepts – does not tell you “how” to obtain the interval.
X1, …, Xn iid Uniform(0, θ). We want an interval estimator for θ. Propose two intervals:
[aY, bY], 1 ≤ a < b, and [Y + c, Y + d], 0 ≤ c < d,
where Y = X(n), the sample maximum.
Compute the coverage probabilities:
Pθ{θ ∈ [aY, bY]} = Pθ{aY ≤ θ ≤ bY} = P{1/b ≤ Y/θ ≤ 1/a} = P{1/b ≤ T ≤ 1/a}, where T = Y/θ.
Now, use basic probability and the distribution of Y to develop this expression:
= (1/a)^n − (1/b)^n = confidence coefficient (not a function of the parameter)
Do the same for the other proposed interval: = (1 − c/θ)^n − (1 − d/θ)^n (a function of the parameter). So, must construct the infimum, which occurs as θ → ∞. So the confidence coefficient of this estimator is 0!
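A quick way to see both facts is to simulate the coverage probabilities. The sketch below is a minimal illustration; n, θ, a, b, c, d and the number of replications are arbitrary choices, not values from the notes.

import numpy as np

rng = np.random.default_rng(0)
n, theta, reps = 5, 10.0, 100_000      # arbitrary illustration values
a, b = 1.0, 1.5                        # interval [aY, bY],   1 <= a < b
c, d = 1.0, 3.0                        # interval [Y+c, Y+d], 0 <= c < d

Y = rng.uniform(0, theta, size=(reps, n)).max(axis=1)   # Y = X_(n)
cover1 = np.mean((a * Y <= theta) & (theta <= b * Y))
cover2 = np.mean((Y + c <= theta) & (theta <= Y + d))

print("simulated coverage of [aY, bY]:    ", cover1)
print("theory (1/a)^n - (1/b)^n:          ", (1 / a) ** n - (1 / b) ** n)
print("simulated coverage of [Y+c, Y+d]:  ", cover2)
print("theory (1-c/theta)^n-(1-d/theta)^n:", (1 - c / theta) ** n - (1 - d / theta) ** n)

Rerunning with a larger θ shows the second interval's coverage sliding toward 0, which is exactly why its confidence coefficient is 0.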
II. Methods for Computing Interval Estimators
1. Inverting a Test Statistic
- Builds directly from hypothesis testing.
- Hypothesis tests with good properties typically result in confidence sets/intervals with good properties.
- Essentially involves setting up a hypothesis test that determines a critical region (and thereby an acceptance region) for a specified level (or size) of the test, and then solving for the “range” of parameter values associated with that acceptance region.
- There is thus a “dual” relationship between the corresponding acceptance region and the confidence interval. The hypothesis test fixes the parameter and determines the values of the statistic that support that parameter value; the confidence interval fixes the sample values and asks which values of the parameter are consistent with those sample values.
- See Figure 9.2.1 for an example of a test of the mean for normally distributed
variables.
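As a concrete illustration of this duality for the case referenced with Figure 9.2.1 (testing a normal mean with known variance), here is a minimal sketch: inverting the level-α z-test acceptance region {x : |x̄ − μo| ≤ z(α/2)·σ/√n} gives the familiar interval x̄ ± z(α/2)·σ/√n. The data, σ, and α below are made-up illustration values, not from the notes.

import numpy as np
from scipy import stats

x = np.array([4.2, 5.1, 3.8, 4.9, 5.4, 4.4])   # hypothetical sample
sigma, alpha = 1.0, 0.05                        # assumed known std. dev. and level

n, xbar = len(x), x.mean()
half_width = stats.norm.ppf(1 - alpha / 2) * sigma / np.sqrt(n)

def accepts(mu0):
    """Does the observed sample fall in the acceptance region A(mu0) of the level-alpha z-test?"""
    return abs(xbar - mu0) <= half_width

# Collecting every accepted mu0 gives the familiar interval xbar +/- z(alpha/2) * sigma / sqrt(n)
ci = (xbar - half_width, xbar + half_width)
print(f"{1 - alpha:.0%} CI for mu: [{ci[0]:.3f}, {ci[1]:.3f}]")
print("mu0 = 4.0 accepted?", accepts(4.0), "| in CI?", ci[0] <= 4.0 <= ci[1])

The set of accepted μo values and the confidence interval coincide, which is the content of Theorem 9.2.2 below.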
Theorem 9.2.2 For each θo ∈ Θ, let A(θo) be the acceptance region of a level α test of Ho: θ = θo. For each sample point x, define a set C(x) in the parameter space by C(x) = {θo : x ∈ A(θo)}. Then the random set C(X) is a 1 − α confidence set. Conversely, let C(X) be a 1 − α confidence set. For any θo ∈ Θ, define A(θo) = {x : θo ∈ C(x)}. Then A(θo) is the acceptance region of a level α test of Ho: θ = θo.
Proof: Straightforward use of the probability of a sample falling in the acceptance/rejection region. The alternative hypothesis determines the appropriate form of this acceptance region (as in hypothesis testing).
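In practice, the construction C(x) = {θo : x ∈ A(θo)} can be carried out quite mechanically by scanning a grid of θo values and keeping those whose acceptance region contains the observed sample. A small sketch reusing the toy normal example above; the grid range and acceptance rule are placeholders, and this brute-force approach is mainly of interest when no closed-form inversion is available.

import numpy as np

def invert_test(x, accepts, theta_grid):
    """Theorem 9.2.2 on a grid: C(x) = {theta0 in the grid : x is in A(theta0)}."""
    kept = [theta0 for theta0 in theta_grid if accepts(x, theta0)]
    return (min(kept), max(kept)) if kept else None   # report the set as an interval

# Placeholder acceptance rule: level-0.05 z-test of H0: mu = theta0 with sigma = 1 known
accepts_z = lambda x, theta0: abs(x.mean() - theta0) <= 1.96 / np.sqrt(len(x))
x = np.array([4.2, 5.1, 3.8, 4.9, 5.4, 4.4])           # same hypothetical data as above
print(invert_test(x, accepts_z, np.linspace(0, 10, 10_001)))

Up to the grid resolution, this reproduces the closed-form interval from the previous sketch.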
However, the actual construction of the CI may not be easy. The procedure most often used inverts the LRT, but other tests can also be inverted.
Example 1: Compute the confidence interval for the parameter λ of an exponential distribution (with mean λ) by inverting the LRT of
Ho: λ = λo
H1: λ ≠ λo
a) Set up the LRT test statistic, just as you would for a hypothesis test. After
simplifying,
λ(x) = ( ∑ xi / (n λo) )^n exp( n − ∑ xi / λo )
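For reference, this simplified statistic is easy to code directly as a function of ∑ xi; the sketch below is nothing more than the formula above (the numbers used in the check are arbitrary).

import numpy as np

def lrt_stat(sum_x, n, lam0):
    """lambda(x) = (sum_x / (n*lam0))**n * exp(n - sum_x / lam0)."""
    return (sum_x / (n * lam0)) ** n * np.exp(n - sum_x / lam0)

# Sanity check: the statistic equals 1 at its maximum, where sum_x = n * lam0
print(lrt_stat(10.0, 5, 2.0))   # 1.0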



b) So, the acceptance region is:
A(λo) = { x : ( ∑ xi / λo )^n exp( −∑ xi / λo ) ≥ k* }   (See Fig 9.2.2)
c) So, the confidence set is:
C(x) = { λ : ( ∑ xi / λ )^n exp( −∑ xi / λ ) ≥ k* }   (See Fig 9.2.2)
One is a function of ∑ xi and the other is a function of λ.
d) The confidence interval is then simplified to the form:
C(x) = {λ : L(x) ≤ λ ≤ U(x)} = {λ : L(∑ xi) ≤ λ ≤ U(∑ xi)}
So, substituting and noting that you are looking for the values where:
( ∑ xi / L(∑ xi) )^n exp( −∑ xi / L(∑ xi) ) = ( ∑ xi / U(∑ xi) )^n exp( −∑ xi / U(∑ xi) )
Writing b = ∑ xi / L(∑ xi) and a = ∑ xi / U(∑ xi) (so that a < b), this reduces to
a^n exp(−a) = b^n exp(−b)
The actual solutions are often computed numerically; see the text for the solution to this example. You also need the distribution of the statistic – here it is simple (∑ Xi ~ Gamma(n, λ)), but often it is not.
C(x) = {λ : L(x) ≤ λ ≤ U(x)} = { λ : (1/b) ∑ xi ≤ λ ≤ (1/a) ∑ xi }
where
P{ (1/b) ∑ xi ≤ λ ≤ (1/a) ∑ xi } = P{ a ≤ ∑ Xi / λ ≤ b } = 1 − α

Example 2: Determine a 1-sided confidence interval of the form
C ( x ) = { p : L ( x ) < p ≤ 1} for p in a Bernoulli distribution. The associated hypothesis
test:
H o : p = po
H1 : p > po
(relative to the alternative, p is small under the null hypothesis)
So, we seek a confidence interval of the form C(x) = {p : L(x) < p ≤ 1}. Now, this family has a monotone likelihood ratio (MLR) in T = ∑ Xi ~ Bin(n, p), so the resulting UMP test rejects for large values of T. The rejection region is R = {x : T > k(po)}, where k(po) is selected to satisfy the size (level) of the test:
∑_{y=0}^{k(po)} (n choose y) po^y (1 − po)^(n−y) ≥ 1 − α   and   ∑_{y=0}^{k(po)−1} (n choose y) po^y (1 − po)^(n−y) < 1 − α
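Since these conditions make k(po) the smallest integer with P_po{T ≤ k(po)} ≥ 1 − α, the observed t = ∑ xi is accepted (t ≤ k(po)) exactly when P_po{T ≥ t} > α, and this tail probability increases in po; so the lower bound L(x) solves P_p{T ≥ t} = α and can be found by one-dimensional root finding. A minimal sketch of that step, with n, t, and α as made-up illustration values:

from scipy import stats, optimize

n, t, alpha = 20, 14, 0.05          # hypothetical sample size, observed T = sum of x_i, level

if t == 0:
    lower = 0.0                     # no successes: the bound degenerates to 0
else:
    # P_p(T >= t) = binom.sf(t - 1, n, p) is increasing in p; find where it equals alpha
    tail = lambda p: stats.binom.sf(t - 1, n, p) - alpha
    lower = optimize.brentq(tail, 1e-12, 1 - 1e-12)

# Equivalent closed form (one-sided Clopper-Pearson bound): stats.beta.ppf(alpha, t, n - t + 1)
print(f"one-sided {1 - alpha:.0%} interval for p: ({lower:.4f}, 1]")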