Stat 311

Approximate confidence intervals for the expectation

Let $X_1, \ldots, X_n$ be independent, identically distributed random variables with expectation $\mu$ and variance $\sigma^2$. The "method of moments" estimator for $\mu$ is just the sample mean, that is

\[ \bar{X} \equiv \frac{X_1 + \cdots + X_n}{n} \approx \mu. \]

A somewhat tedious calculation shows that

\[ E\Big[ \sum_{i=1}^{n} (X_i - \bar{X})^2 \Big] = (n-1)\sigma^2, \tag{1} \]

and the method of moments suggests the estimator for the variance

\[ S^2 \equiv \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2 \approx \sigma^2, \]

where $S$ is called the sample standard deviation. The law of large numbers implies the consistency of both of these estimators, that is,

\[ \lim_{n\to\infty} P\{|\bar{X} - \mu| > \epsilon\} = 0, \qquad \lim_{n\to\infty} P\{|S^2 - \sigma^2| > \epsilon\} = 0 \]

for each $\epsilon > 0$. The consistency of $S$ and the central limit theorem in turn imply that

\[ \lim_{n\to\infty} P\Big\{ a \le \frac{\bar{X} - \mu}{S/\sqrt{n}} \le b \Big\} = \int_a^b \frac{1}{\sqrt{2\pi}} e^{-x^2/2}\, dx = \Phi(b) - \Phi(a), \]

so in particular

\[ P\Big\{ -\alpha \le \frac{\bar{X} - \mu}{S/\sqrt{n}} \le \alpha \Big\} = P\Big\{ \bar{X} - \alpha \frac{S}{\sqrt{n}} \le \mu \le \bar{X} + \alpha \frac{S}{\sqrt{n}} \Big\} \approx 2\Phi(\alpha) - 1. \]

If we select $\alpha$ so that $2\Phi(\alpha) - 1 = .95$, that is, $\alpha = 1.960$, then we say that the interval

\[ \Big[ \bar{X} - \alpha \frac{S}{\sqrt{n}},\ \bar{X} + \alpha \frac{S}{\sqrt{n}} \Big] = \Big[ \bar{X} - 1.960 \frac{S}{\sqrt{n}},\ \bar{X} + 1.960 \frac{S}{\sqrt{n}} \Big] \]

is the 95% confidence interval for $\mu$. For a 90% confidence interval select $\alpha$ so that $2\Phi(\alpha) - 1 = .90$ (i.e., $\alpha = 1.645$). For a 97% confidence interval select $\alpha$ so that $2\Phi(\alpha) - 1 = .97$ (i.e., $\alpha = 2.17$).

Of course, if the value of the standard deviation $\sigma$ is known, then the confidence interval is given by

\[ \Big[ \bar{X} - \alpha \frac{\sigma}{\sqrt{n}},\ \bar{X} + \alpha \frac{\sigma}{\sqrt{n}} \Big]. \]

General principle for confidence intervals

Find a function of the data and the parameter $h(X_1, \ldots, X_n, \theta)$ whose distribution does not depend on the unknown parameter (at least approximately). For example, if $X_1, \ldots, X_n$ are exponentially distributed with parameter $\lambda$, then the distribution of $(X_1 + \cdots + X_n)\lambda$ does not depend on $\lambda$. (In particular, one doesn't need to estimate the variance to calculate a confidence interval for the parameter of an exponential distribution.)

For a $C$% confidence interval, find $a_C$ and $b_C$ such that

\[ P\{ a_C < h(X_1, \ldots, X_n, \theta) < b_C \} = \frac{C}{100}. \tag{2} \]

(Note that $a_C$ and $b_C$ are not uniquely determined by (2). They are usually selected so that the resulting confidence interval is as short as possible.) Solve the inequality in the probability in (2) to obtain

\[ P\{ l(X_1, \ldots, X_n) < \theta < u(X_1, \ldots, X_n) \} = \frac{C}{100}. \]

For example, if $h(X_1, \ldots, X_n, \lambda) = (X_1 + \cdots + X_n)\lambda$ and $P\{ a_C < h(X_1, \ldots, X_n, \lambda) < b_C \} = \frac{C}{100}$, then

\[ P\Big\{ \frac{a_C}{X_1 + \cdots + X_n} < \lambda < \frac{b_C}{X_1 + \cdots + X_n} \Big\} = \frac{C}{100} \]

determines the confidence interval.

Problems

1. A sample of 60 meteorites found in the Arizona desert was weighed. The sample mean was 58 grams, and the sample standard deviation was 20 grams. Compute an approximate (large sample) 95% confidence interval for the population mean.

2. A sample of 1300 UW students are asked if the school colors should be changed to purple and pink. 960 respond that the colors should be changed. Based on this data, construct the 95% confidence interval for the true fraction of UW students who believe that the colors should be changed.

3. Verify the identity in (1) for $n = 3$.

4. Each of 40 students in a chemistry class performs 20 repetitions of an experiment measuring the percentage of iron in an ore sample. Assuming that the measurements are normally distributed with expectation equal to the correct percentage, each student calculates a 95% confidence interval for the percentage based on his or her data. When the results are collected, two of the students discover that their confidence intervals don't even overlap. A student in the course who is also taking Stat 311 claims that this result is not surprising. Do you agree? Explain. (Support your explanation with calculations.)
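As a computational aside to the notes above, here is a minimal Python sketch of the large-sample interval with endpoints $\bar{X} \pm \alpha S/\sqrt{n}$, where $\alpha$ solves $2\Phi(\alpha) - 1 = C/100$. The function names `critical_value` and `mean_confidence_interval` and the simulated data are illustrative assumptions, not part of the course material. Only the Python standard library is used.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, stdev


def critical_value(confidence):
    """Return alpha such that 2*Phi(alpha) - 1 = confidence (0 < confidence < 1)."""
    # 2*Phi(alpha) - 1 = c  is equivalent to  Phi(alpha) = (1 + c) / 2.
    return NormalDist().inv_cdf((1.0 + confidence) / 2.0)


def mean_confidence_interval(data, confidence=0.95):
    """Approximate (large-sample) confidence interval for the expectation."""
    n = len(data)
    x_bar = mean(data)
    s = stdev(data)  # sample standard deviation, divisor n - 1
    half_width = critical_value(confidence) * s / sqrt(n)
    return x_bar - half_width, x_bar + half_width


# Hypothetical data: 200 simulated exponential draws with rate 0.5,
# so the true expectation is 1 / 0.5 = 2 (an assumption for illustration only).
rng = random.Random(311)
sample = [rng.expovariate(0.5) for _ in range(200)]

for c in (0.90, 0.95, 0.97):
    low, high = mean_confidence_interval(sample, c)
    print(f"{int(c * 100)}% interval: alpha = {critical_value(c):.3f}, "
          f"[{low:.3f}, {high:.3f}]")
```

The values returned by `critical_value` for 0.90, 0.95, and 0.97 agree with the constants 1.645, 1.960, and 2.17 quoted in the notes, up to rounding. If $\sigma$ is known, the same sketch applies with `s` replaced by that known value.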