Author(s): Kerby Shedden, Ph.D., 2010

License: Unless otherwise noted, this material is made available under the terms of the Creative Commons Attribution Share Alike 3.0 License: http://creativecommons.org/licenses/by-sa/3.0/

We have reviewed this material in accordance with U.S. Copyright Law and have tried to maximize your ability to use, share, and adapt it. The citation key on the following slide provides information about how you may share and adapt this material. Copyright holders of content included in this material should contact [email protected] with any questions, corrections, or clarification regarding the use of content. For more information about how to cite these materials visit http://open.umich.edu/privacy-and-terms-use.

Any medical information in this material is intended to inform and educate and is not a tool for self-diagnosis or a replacement for medical evaluation, advice, diagnosis or treatment by a healthcare professional. Please speak to your physician if you have questions about your medical condition. Viewer discretion is advised: some medical content is graphic and may not be suitable for all viewers.

Binomial confidence intervals
Kerby Shedden
Department of Statistics, University of Michigan
Wednesday 16th January, 2013

Bernoulli trials

A random variable X is a Bernoulli trial if it has two points in its sample space, e.g. X ∈ {0, 1}. The distribution of X is entirely determined by P(X = 1) = p, since P(X = 0) = 1 − p. This is a one-parameter family of probability distributions, indexed by the parameter p ∈ [0, 1].

The expected value of X is E[X] = p, and the variance of X is var[X] = p(1 − p).

The binomial distribution

Suppose X1, ..., Xn are a sample of size n from a Bernoulli distribution with parameter p. The sum Y = X1 + ... + Xn = Σ_j Xj is a random variable with sample space {0, 1, ..., n}. It follows a binomial distribution. The probability mass function (pmf) of Y is

    P(Y = k) = C(n, k) p^k (1 − p)^(n−k),

where C(n, k) is the binomial coefficient "n choose k". The expected value of Y is E[Y] = np and the variance is var[Y] = np(1 − p).

Estimation of p

We can estimate p using p̂ = (X1 + ... + Xn)/n = Y/n. This estimate is unbiased, since E[p̂] = p. The variance is var[p̂] = p(1 − p)/n, and the standard deviation is SD[p̂] = √(p(1 − p)/n).

We can write p̂ ∼ [p, p(1 − p)/n] to indicate that p̂ is a random variable that follows a distribution with expected value p and variance p(1 − p)/n.

By the law of large numbers, p̂ is consistent: p̂ → p as n → ∞.

Standardizing p̂

We can standardize any random variable by subtracting its mean and dividing the result by the standard deviation. The resulting random variable has expected value 0 and variance 1:

    (p̂ − p) / √(p(1 − p)/n) = √n (p̂ − p) / √(p(1 − p)) ∼ [0, 1].

Wald-type confidence intervals for p

Starting from the standardized quantity,

    P(−1.96 ≤ √n (p̂ − p) / √(p(1 − p)) ≤ 1.96)
    = P(−1.96/√n ≤ (p̂ − p) / √(p(1 − p)) ≤ 1.96/√n)
    = P(−1.96 √(p(1 − p)/n) ≤ p̂ − p ≤ 1.96 √(p(1 − p)/n))
    = P(−1.96 √(p(1 − p)/n) − p̂ ≤ −p ≤ 1.96 √(p(1 − p)/n) − p̂)
    = P(p̂ − 1.96 √(p(1 − p)/n) ≤ p ≤ p̂ + 1.96 √(p(1 − p)/n)).

This gives us the Wald-type 95% confidence interval

    p̂ ± 1.96 √(p(1 − p)/n) ≈ p̂ ± 2 √(p(1 − p)/n).

The lower confidence limit (LCL) is p̂ − 1.96 √(p(1 − p)/n), and the upper confidence limit (UCL) is p̂ + 1.96 √(p(1 − p)/n).
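To make these properties concrete, here is a minimal simulation sketch in Python with NumPy (not part of the original slides; the values of n, p, and the number of replications are arbitrary choices). It draws many Bernoulli samples, computes p̂ for each, and checks that the mean, standard deviation, and standardized values behave as claimed above.

    import numpy as np

    rng = np.random.default_rng(0)
    n, p, nrep = 50, 0.3, 100_000

    # Each row is one sample of n Bernoulli(p) trials; p-hat is the row mean.
    x = rng.binomial(1, p, size=(nrep, n))
    phat = x.mean(axis=1)

    print(phat.mean())                 # close to p = 0.3
    print(phat.std())                  # close to sqrt(p*(1-p)/n) = 0.0648
    z = (phat - p) / np.sqrt(p * (1 - p) / n)
    print(z.mean(), z.std())           # close to 0 and 1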
The plug-in trick

The general form of a Wald-type confidence interval is

    estimate ± 2 × standard error.

In the case of the binomial interval, the standard error depends on p, which puts us in a "catch-22" situation. We get around this by replacing p with p̂, so the interval becomes

    p̂ ± 2 √(p̂(1 − p̂)/n).

In general, such an interval has the form

    estimate ± 2 · ŜE,

where ŜE is the estimated standard error of p̂.

Coverage probabilities

The coverage probability of a Wald-type confidence interval is

    P(estimate − 2 · SE ≤ true value ≤ estimate + 2 · SE).

In the binomial case, this is

    P(p̂ − 2 √(p̂(1 − p̂)/n) ≤ p ≤ p̂ + 2 √(p̂(1 − p̂)/n)).

Recall from above that the coverage probability of the Wald interval is

    P(−1.96 ≤ √n (p̂ − p) / √(p(1 − p)) ≤ 1.96).

By the central limit theorem,

    √n (p̂ − p) / √(p(1 − p)) ≡ Z ≈ N(0, 1).

Since P(−2 ≤ Z ≤ 2) ≈ 0.95, we anticipate that the coverage probability of the Wald confidence interval will be around 95%. Thus we say that the nominal coverage probability of the interval is 0.95.

In reality, the coverage probability of our Wald interval for p will not be exactly 0.95, for two reasons:

- √n (p̂ − p) / √(p(1 − p)) is not exactly normally distributed.
- We replace the actual standard deviation of √n (p̂ − p), which is √(p(1 − p)), with the estimated standard deviation √(p̂(1 − p̂)).

(A third approximation is that we replaced 1.96 with 2, but this has a negligible effect and is not worth considering further.)

The actual coverage probability of our confidence interval is

    P(−2 ≤ √n (p̂ − p) / √(p̂(1 − p̂)) ≤ 2).

There is no explicit formula for this probability, but we can use simulations to learn about it (one such simulation is sketched at the end of this section), and with more advanced mathematical techniques it can be approximated with formulas.

Criteria for evaluating confidence intervals

- The actual coverage probability of the interval should be close to the nominal coverage probability. If the actual coverage probability is greater than the nominal coverage probability, the interval is conservative. If the actual coverage probability is less than the nominal coverage probability, the interval is permissive.
- The width (a.k.a. length) of the confidence interval should be short. The width of a 95% Wald-type confidence interval is 4 · SE, which in the binomial case is 4 √(p(1 − p)/n). The width of a Wald-type interval almost always has a factor of 1/√n. This means that to cut the width of the interval in half, we need to increase the sample size by a factor of 4.

Confidence intervals based on transformations

The main difficulty with constructing a CI for p is that the variance of p̂, which is p(1 − p)/n, depends on p. Thus there is a mean/variance relationship. We can use a variance-stabilizing transformation to eliminate this relationship. Set

    p̃ = arcsin(√p̂).

The variance of p̃ is approximately 1/(4n), which does not depend on p. We can thus form a confidence interval based on p̃, as p̃ ± 2/√(4n), or p̃ ± 1/√n, then convert this back to the original scale by applying the transformation sin(u)² to the LCL and UCL. The resulting interval is

    (sin(p̃ − 1/√n)², sin(p̃ + 1/√n)²).

A computational sketch of this interval follows immediately below.
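The variance-stabilized (arcsine) interval from the transformation slide above is easy to compute. In the sketch below, clipping the transformed endpoints to [0, π/2] before back-transforming is a practical safeguard of my own, not something discussed in the slides; without it, sin(u)² would fold negative endpoints back into positive values.

    import numpy as np

    def arcsine_ci(y, n):
        # Variance-stabilized ~95% CI for a binomial proportion.
        ptilde = np.arcsin(np.sqrt(y / n))
        half = 1.0 / np.sqrt(n)              # 2 * sqrt(1/(4n))
        lo = np.sin(np.clip(ptilde - half, 0.0, np.pi / 2)) ** 2
        hi = np.sin(np.clip(ptilde + half, 0.0, np.pi / 2)) ** 2
        return lo, hi

    print(arcsine_ci(3, 20))   # e.g. 3 successes in 20 trials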
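As mentioned in the coverage discussion above, the actual coverage of the plug-in Wald interval can be estimated by simulation. Here is a minimal sketch (Python with NumPy, not from the slides; n = 20 and p = 0.1 are arbitrary choices meant to show a case where the Wald interval struggles).

    import numpy as np

    rng = np.random.default_rng(1)
    n, p, nrep = 20, 0.1, 200_000

    y = rng.binomial(n, p, size=nrep)        # binomial counts Y
    phat = y / n
    se = np.sqrt(phat * (1 - phat) / n)      # plug-in standard error
    # Note: when y = 0 or y = n, the plug-in SE is 0 and the interval
    # degenerates to the single point p-hat, which cannot cover p.

    covered = (phat - 2 * se <= p) & (p <= phat + 2 * se)
    print(covered.mean())   # actual coverage; tends to fall below the
                            # nominal 0.95 for small n with p near 0 or 1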
Confidence intervals based on inverted hypothesis tests

Suppose we have a hypothesis testing procedure for

    H0: p = p⁰
    HA: p ≠ p⁰.

We can define a confidence interval for p as the set of all p⁰ that are not rejected under this test. If the test is carried out at level α (i.e. the probability of a type I error is α), then the confidence interval will have a nominal coverage probability of 1 − α. Thus to get the (usual) 95% confidence interval, we perform the (usual) 0.05 level test.

The Z-test for p = p⁰ is based on the test statistic

    T = √n (p̂ − p⁰) / √(p̂(1 − p̂)).

If we reject whenever |T| > 2, then we get an approximate level 0.05 test. If we invert this test, we get the Wald-type confidence interval

    p̂ ± 2 √(p̂(1 − p̂)/n).

An alternative test for p = p⁰ is based on the test statistic

    Ta = √n (p̂ − p⁰) / √(p⁰(1 − p⁰)).

Note that the only difference between T and Ta is that the variance is estimated as p̂(1 − p̂) in one case and as p⁰(1 − p⁰) in the other. If we reject whenever |Ta| > 2, then we get an approximate level 0.05 test. If we invert this test, we get the Wilson interval (a closed form is sketched after this section).

Confidence intervals based on a Bayesian approach

From a Bayesian perspective, p is considered to be a random variable with a distribution π(p), called the prior distribution. Once we specify a prior, we can use Bayes' theorem to obtain the posterior distribution for p:

    P(p | Y) = P(Y | p) π(p) / P(Y).

The central 95% probability interval of the posterior distribution is called a credible interval for p. Credible intervals sometimes have similar properties to confidence intervals. If we use a Beta distribution for the prior on p, we obtain a useful confidence interval for p.
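The slides stop at the statement that inverting Ta yields the Wilson interval. For completeness, here is a sketch of its standard closed form, obtained by solving the quadratic inequality |Ta| ≤ z in p⁰ (this algebra is not shown in the slides; z = 2 matches the rejection rule above).

    import numpy as np

    def wilson_ci(y, n, z=2.0):
        # Wilson interval: all p0 with |T_a| <= z, i.e. the roots of
        # n*(phat - p0)^2 = z^2 * p0*(1 - p0), solved as a quadratic in p0.
        phat = y / n
        denom = 1 + z**2 / n
        center = (phat + z**2 / (2 * n)) / denom
        half = (z / denom) * np.sqrt(phat * (1 - phat) / n + z**2 / (4 * n**2))
        return center - half, center + half

    print(wilson_ci(3, 20))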
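To make the Bayesian interval concrete: with a Beta(a, b) prior, conjugacy gives a Beta(a + y, b + n − y) posterior for p given Y = y, and the central 95% credible interval is read off from the posterior quantiles. The sketch below uses SciPy; the uniform Beta(1, 1) default prior is an illustrative choice of mine, as the slides do not commit to particular values of a and b.

    from scipy.stats import beta

    def beta_credible_interval(y, n, a=1.0, b=1.0, level=0.95):
        # Central credible interval from the Beta(a + y, b + n - y) posterior.
        tail = (1 - level) / 2
        posterior = beta(a + y, b + n - y)
        return posterior.ppf(tail), posterior.ppf(1 - tail)

    print(beta_credible_interval(3, 20))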