CONFIDENCE INTERVAL ESTIMATION

I. Definitions

1. An interval estimate of a parameter $\theta$ is any pair of functions $L(x)$, $U(x)$ of the sample that satisfy $L(x) \le U(x)$ for all $x \in \mathcal{X}$. If the realization $X = x$ is observed, the inference $L(x) \le \theta \le U(x)$ is made. The random interval $[L(X), U(X)]$ is the interval estimator. Why is it called a random interval? Is it open, closed, or two-sided?

2. The coverage probability is the probability that the random interval $[L(X), U(X)]$ covers the true parameter $\theta$, i.e. $P_\theta(\theta \in [L(X), U(X)])$.

3. The confidence coefficient of $[L(X), U(X)]$ is the infimum of the coverage probabilities, $\inf_\theta P_\theta(\theta \in [L(X), U(X)])$.

Note: interval estimates + confidence coefficient = confidence intervals.
Note: The concepts can be generalized to sets (need not be an interval).

Example 9.1.6. Set up to illustrate the concepts; it does not tell you "how" to obtain the interval. Let $X_1, \dots, X_n$ be iid $U(0, \theta)$. We want an interval estimator for $\theta$. Propose two intervals:

\[ [aY, bY], \quad 1 \le a < b, \qquad\qquad [Y + c, Y + d], \quad 0 \le c < d, \]

where $Y = X_{(n)} = \max_i X_i$.

Compute the coverage probabilities. For the first interval, with $T = Y/\theta$:

\[ P_\theta(\theta \in [aY, bY]) = P_\theta(aY \le \theta \le bY) = P_\theta\!\left(\frac{1}{b} \le \frac{Y}{\theta} \le \frac{1}{a}\right) = P\!\left(\frac{1}{b} \le T \le \frac{1}{a}\right). \]

Now use basic probability and the distribution of $Y$ to develop this expression. $T$ has density $n t^{n-1}$ on $[0, 1]$, so

\[ P\!\left(\frac{1}{b} \le T \le \frac{1}{a}\right) = \left(\frac{1}{a}\right)^n - \left(\frac{1}{b}\right)^n = \text{confidence coefficient (not a function of the parameter)}. \]

Do the same for the other proposed interval. For $\theta \ge d$:

\[ P_\theta(\theta \in [Y + c, Y + d]) = \left(1 - \frac{c}{\theta}\right)^n - \left(1 - \frac{d}{\theta}\right)^n \quad \text{(a function of the parameter)}. \]

So we must take the infimum over $\theta$, which is approached as $\theta \to \infty$, where the coverage probability tends to 0. So the confidence coefficient of this estimator is 0!

II. Methods for Computing Interval Estimators

1. Inverting a Test Statistic
- Builds directly from hypothesis testing.
- Hypothesis tests with good properties typically result in confidence sets/intervals with good properties.
- Essentially involves setting up a hypothesis test, which determines a critical region (and thereby implies an acceptance region) for a specified level (or size) of test, and then solving for the "range" of the parameter associated with the corresponding interval.
- There is thus a "dual" relationship between the acceptance region and the confidence interval. The hypothesis test fixes the parameter and determines the values of the statistic that support this parameter value. The confidence interval fixes the sample values and asks which values of the parameter are consistent with these sample values.
- See Figure 9.2.1 for an example of a test of the mean for normally distributed variables.

Theorem 9.2.2. For each $\theta_0 \in \Theta$, let $A(\theta_0)$ be the acceptance region of a level $\alpha$ test of $H_0: \theta = \theta_0$. For each $x \in \mathcal{X}$, define a set $C(x)$ in the parameter space by

\[ C(x) = \{\theta_0 : x \in A(\theta_0)\}. \]

Then the random set $C(X)$ is a $1 - \alpha$ confidence set. Conversely, let $C(X)$ be a $1 - \alpha$ confidence set. For any $\theta_0 \in \Theta$, define

\[ A(\theta_0) = \{x : \theta_0 \in C(x)\}. \]

Then $A(\theta_0)$ is the acceptance region of a level $\alpha$ test of $H_0: \theta = \theta_0$.

Proof: Straightforward use of the probability of a sample being in an acceptance/rejection region: $P_{\theta_0}(\theta_0 \in C(X)) = P_{\theta_0}(X \in A(\theta_0)) \ge 1 - \alpha$.

The alternative hypothesis determines the appropriate form of the acceptance region (as in hypothesis testing). However, the actual construction of the CI may not be easy. The procedure that is often used involves the likelihood ratio test (LRT), but other tests can also be inverted.

Example 1: Compute the confidence interval for the parameter $\lambda$ in an exponential distribution by inverting an LRT:

\[ H_0: \lambda = \lambda_0 \qquad H_1: \lambda \ne \lambda_0 \]

a) Set up the LRT test statistic, just as you would for a hypothesis test.
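After simplifying, the LRT statistic is

\[ \lambda(x) = \left(\frac{\sum x_i}{n \lambda_0}\right)^n \exp\!\left(n - \frac{\sum x_i}{\lambda_0}\right). \]

b) So, the acceptance region is

\[ A(\lambda_0) = \left\{ x : \left(\frac{\sum x_i}{\lambda_0}\right)^n \exp\!\left(-\frac{\sum x_i}{\lambda_0}\right) \ge k^* \right\} \quad \text{(See Fig 9.2.2)}, \]

where the constant $k^*$ absorbs the factor $e^n / n^n$.

c) So, the confidence set is

\[ C(x) = \left\{ \lambda : \left(\frac{\sum x_i}{\lambda}\right)^n \exp\!\left(-\frac{\sum x_i}{\lambda}\right) \ge k^* \right\} \quad \text{(See Fig 9.2.2)}. \]

The acceptance region is a set of values of $\sum x_i$ for fixed $\lambda_0$; the confidence set is a set of values of $\lambda$ for fixed $\sum x_i$.

d) The confidence interval is then simplified to the form

\[ C(x) = \{\lambda : L(x) \le \lambda \le U(x)\} = \left\{\lambda : L\!\left(\sum x_i\right) \le \lambda \le U\!\left(\sum x_i\right)\right\}. \]

So, substituting, you are looking for the endpoints where

\[ \left(\frac{\sum x_i}{L(\sum x_i)}\right)^n \exp\!\left(-\frac{\sum x_i}{L(\sum x_i)}\right) = \left(\frac{\sum x_i}{U(\sum x_i)}\right)^n \exp\!\left(-\frac{\sum x_i}{U(\sum x_i)}\right). \]

Writing $a = \sum x_i / L(\sum x_i)$ and $b = \sum x_i / U(\sum x_i)$, with $a > b$, this becomes

\[ a^n \exp(-a) = b^n \exp(-b). \]

The actual solutions are often computed numerically. See the text for the solution to this example. You need to use the distribution of the statistic. Here it is simple ($\sum X_i \sim \text{Gamma}(n, \lambda)$ with $\lambda$ as the scale parameter, so $\sum X_i / \lambda \sim \text{Gamma}(n, 1)$), but often it is not. The resulting interval is

\[ C(x) = \{\lambda : L(x) \le \lambda \le U(x)\} = \left\{\lambda : \frac{1}{a} \sum x_i \le \lambda \le \frac{1}{b} \sum x_i \right\}, \]

where $a$ and $b$ also satisfy

\[ P_\lambda\!\left(\frac{1}{a} \sum X_i \le \lambda \le \frac{1}{b} \sum X_i\right) = P\!\left(b \le \frac{\sum X_i}{\lambda} \le a\right) = 1 - \alpha. \]

Example 2: Determine a one-sided confidence interval of the form $C(x) = \{p : L(x) < p \le 1\}$ for $p$ in a Bernoulli distribution.

The associated hypothesis test is

\[ H_0: p = p_0 \qquad H_1: p > p_0 \]

(the alternative hypothesis implies $p$ is small under the null hypothesis), and inverting it yields a confidence interval of the form $C(x) = \{p : L(x) < p \le 1\}$.

Now, we know that the family has monotone likelihood ratio (MLR) in $T = \sum X_i \sim \text{Bin}(n, p)$, so the resulting UMP test rejects for large values of $T$. The rejection region is

\[ R = \{x : T > k(p_0)\}, \]

where $k(p_0)$ is selected to satisfy the size (level) of the test:

\[ \sum_{y=0}^{k(p_0)} \binom{n}{y} p_0^y (1 - p_0)^{n-y} \ge 1 - \alpha \quad \text{and} \quad \sum_{y=0}^{k(p_0)-1} \binom{n}{y} p_0^y (1 - p_0)^{n-y} < 1 - \alpha. \]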
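The two inequalities that define $k(p_0)$ in Example 2 can be checked numerically. The following is a minimal sketch in Python using only the standard library. The function names `binom_cdf`, `critical_value`, and `rejection_probability` are illustrative and do not come from the notes; the sketch assumes $0 < \alpha < 1$ and $0 < p_0 < 1$.

```python
from math import comb


def binom_cdf(k, n, p):
    """Return P(T <= k) for T ~ Bin(n, p); an empty sum (k < 0) gives 0."""
    return sum(comb(n, y) * p**y * (1 - p) ** (n - y) for y in range(0, k + 1))


def critical_value(n, p0, alpha):
    """Smallest integer k with P_{p0}(T <= k) >= 1 - alpha.

    This is the k(p0) of Example 2: the CDF at k reaches 1 - alpha,
    while the CDF at k - 1 is still below 1 - alpha.
    """
    for k in range(n + 1):
        if binom_cdf(k, n, p0) >= 1 - alpha:
            return k
    return n  # fallback for floating-point round-off; P(T <= n) = 1


def rejection_probability(n, p0, alpha):
    """Size of the test that rejects H0: p = p0 when T > k(p0)."""
    k = critical_value(n, p0, alpha)
    return 1.0 - binom_cdf(k, n, p0)


if __name__ == "__main__":
    n, p0, alpha = 10, 0.5, 0.05
    k = critical_value(n, p0, alpha)
    print("k(p0) =", k)
    print("P(T > k(p0)) =", rejection_probability(n, p0, alpha))
```

Because $T$ is discrete, the rejection probability under $p_0$ is usually strictly below $\alpha$ rather than equal to it, which is why $k(p_0)$ is defined by a pair of inequalities instead of an equation.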