Stat 311
Approximate confidence intervals for the expectation
Let $X_1, \ldots, X_n$ be independent, identically distributed random variables with expectation $\mu$
and variance $\sigma^2$. The “method of moments” estimator for $\mu$ is just the sample mean, that is
$$\bar{X} \equiv \frac{X_1 + \cdots + X_n}{n} \approx \mu.$$
A somewhat tedious calculation shows that
$$E\left[\sum_{i=1}^{n} (X_i - \bar{X})^2\right] = (n-1)\sigma^2, \tag{1}$$
and the method of moments suggests the estimator for the variance
$$S^2 \equiv \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2 \approx \sigma^2$$
where $S$ is called the sample standard deviation (and $S^2$ the sample variance). The law of large numbers implies the
consistency of both of these estimators, that is,
$$\lim_{n\to\infty} P\{|\bar{X} - \mu| > \epsilon\} = 0$$
$$\lim_{n\to\infty} P\{|S^2 - \sigma^2| > \epsilon\} = 0$$
for each $\epsilon > 0$. The consistency of $S$ and the central limit theorem in turn imply that
$$\lim_{n\to\infty} P\left\{a \le \frac{\bar{X} - \mu}{S/\sqrt{n}} \le b\right\} = \int_a^b \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}\,dx = \Phi(b) - \Phi(a),$$
so in particular
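$$P\left\{-\alpha \le \frac{\bar{X} - \mu}{S/\sqrt{n}} \le \alpha\right\} = P\left\{\bar{X} - \alpha\frac{S}{\sqrt{n}} \le \mu \le \bar{X} + \alpha\frac{S}{\sqrt{n}}\right\} \approx 2\Phi(\alpha) - 1.$$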
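The quality of this approximation for a finite $n$ can be checked by simulation. The following is a minimal sketch that is not part of the original notes. It assumes, purely for illustration, exponentially distributed data with $\mu = 1$ (chosen because it is clearly non-normal), and the function name `coverage`, the choice $\alpha = 2$, the sample size, and the number of trials are all hypothetical choices.

```python
import math
import random
from statistics import NormalDist


def coverage(alpha, n, trials, seed=0):
    """Estimate P{-alpha <= (Xbar - mu) / (S / sqrt(n)) <= alpha} by simulation.

    Assumption for illustration: the X_i are exponential with mean mu = 1.
    """
    rng = random.Random(seed)
    mu = 1.0
    hits = 0
    for _ in range(trials):
        xs = [rng.expovariate(1.0) for _ in range(n)]
        xbar = sum(xs) / n
        # Sample standard deviation with the n - 1 denominator, as in S^2.
        s = math.sqrt(sum((x - xbar) ** 2 for x in xs) / (n - 1))
        t = (xbar - mu) / (s / math.sqrt(n))
        if -alpha <= t <= alpha:
            hits += 1
    return hits / trials


alpha = 2.0
estimate = coverage(alpha, n=200, trials=2000)
target = 2 * NormalDist().cdf(alpha) - 1  # the limiting value 2*Phi(alpha) - 1
print(f"simulated: {estimate:.3f}, normal approximation: {target:.3f}")
```

The two printed numbers should be close but not identical: the simulated frequency carries Monte Carlo error, and the normal limit is only approximate for finite $n$.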
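If we select $\alpha$ so that $2\Phi(\alpha) - 1 = .95$, that is, $\alpha = 1.960$, then we say that the interval
$$\left[\bar{X} - \alpha\frac{S}{\sqrt{n}},\ \bar{X} + \alpha\frac{S}{\sqrt{n}}\right] = \left[\bar{X} - 1.960\frac{S}{\sqrt{n}},\ \bar{X} + 1.960\frac{S}{\sqrt{n}}\right]$$
is the 95% confidence interval for $\mu$.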
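As a concrete illustration, the interval can be computed directly from a list of observations. This is a minimal sketch under stated assumptions: the function name `approx_confidence_interval` and the data values are invented for the example, and the sample is far smaller than one would want for a large-sample approximation.

```python
import math


def approx_confidence_interval(xs, alpha=1.960):
    """Return (Xbar - alpha*S/sqrt(n), Xbar + alpha*S/sqrt(n)).

    The default alpha = 1.960 corresponds to the 95% interval.
    """
    n = len(xs)
    xbar = sum(xs) / n
    # S^2 uses the n - 1 denominator.
    s = math.sqrt(sum((x - xbar) ** 2 for x in xs) / (n - 1))
    half_width = alpha * s / math.sqrt(n)
    return xbar - half_width, xbar + half_width


# Hypothetical data, made up purely for illustration.
data = [4.1, 5.3, 3.8, 6.0, 5.1, 4.7, 5.9, 4.4, 5.2, 4.9]
low, high = approx_confidence_interval(data)
print(f"[{low:.3f}, {high:.3f}]")
```

The interval is symmetric about the sample mean, and its half-width shrinks like $1/\sqrt{n}$ as more observations are added.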
For a 90% confidence interval select $\alpha$ so that $2\Phi(\alpha) - 1 = .90$ (i.e., $\alpha = 1.645$). For a 97%
confidence interval select $\alpha$ so that $2\Phi(\alpha) - 1 = .97$ (i.e., $\alpha = 2.17$).
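These values of $\alpha$ come from inverting the standard normal distribution function: $2\Phi(\alpha) - 1 = c$ is equivalent to $\alpha = \Phi^{-1}((1 + c)/2)$. A short sketch using Python's standard library follows; the helper name `alpha_for_level` is an illustrative choice, not something from the notes.

```python
from statistics import NormalDist


def alpha_for_level(level):
    """Return alpha solving 2*Phi(alpha) - 1 = level."""
    return NormalDist().inv_cdf((1 + level) / 2)


for level in (0.90, 0.95, 0.97):
    print(f"{level:.2f}: alpha = {alpha_for_level(level):.3f}")
```

Rounded to three decimals, this reproduces the values 1.645, 1.960, and 2.170 quoted above.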
Of course, if the value of the standard deviation $\sigma$ is known, then the confidence interval is
given by
$$\left[\bar{X} - \alpha\frac{\sigma}{\sqrt{n}},\ \bar{X} + \alpha\frac{\sigma}{\sqrt{n}}\right].$$
General principle for confidence intervals
Find a function of the data and the parameter $h(X_1, \ldots, X_n, \theta)$ whose distribution does not
depend on the unknown parameter (at least approximately). For example, if $X_1, \ldots, X_n$ are
exponentially distributed with rate parameter $\lambda$ (so that $E[X_i] = 1/\lambda$), then the distribution of $(X_1 + \cdots + X_n)\lambda$ does
not depend on $\lambda$. (In particular, one doesn’t need to estimate the variance to calculate a
confidence interval for the parameter of an exponential distribution.)
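The claim about the exponential example can be checked empirically. The sketch below is not from the original notes, and the function names, seeds, and parameter values are illustrative assumptions. It simulates $(X_1 + \cdots + X_n)\lambda$ for two very different rates and compares empirical quantiles. Note that Python's `random.expovariate` takes the rate $\lambda$ (one over the mean) as its argument.

```python
import random


def pivot_samples(lam, n, trials, seed):
    """Draw `trials` values of (X_1 + ... + X_n) * lam with X_i ~ Exponential(rate lam)."""
    rng = random.Random(seed)
    return sorted(
        lam * sum(rng.expovariate(lam) for _ in range(n)) for _ in range(trials)
    )


def summary(sorted_values):
    """Return the empirical 5%, 50%, and 95% quantiles of an already sorted list."""
    m = len(sorted_values)
    return tuple(sorted_values[int(q * m)] for q in (0.05, 0.50, 0.95))


n, trials = 10, 20000
small = summary(pivot_samples(0.5, n, trials, seed=1))
large = summary(pivot_samples(20.0, n, trials, seed=2))
print("lambda = 0.5 :", [round(v, 2) for v in small])
print("lambda = 20.0:", [round(v, 2) for v in large])
```

Up to Monte Carlo error, the two rows of quantiles should agree, which is what it means for the distribution of the quantity not to depend on $\lambda$.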
For a $C\%$ confidence interval, find $a_C$ and $b_C$ such that
$$P\{a_C < h(X_1, \ldots, X_n, \theta) < b_C\} = \frac{C}{100}. \tag{2}$$
(Note that $a_C$ and $b_C$ are not uniquely determined by (2). They are usually selected so
that the resulting confidence interval is as short as possible.) Solve the inequality in the
probability in (2) to obtain
$$P\{l(X_1, \ldots, X_n) < \theta < u(X_1, \ldots, X_n)\} = \frac{C}{100}.$$
For example, if $h(X_1, \ldots, X_n, \lambda) = (X_1 + \cdots + X_n)\lambda$ and $P\{a_C < h(X_1, \ldots, X_n, \lambda) < b_C\} = \frac{C}{100}$, then
$$P\left\{\frac{a_C}{X_1 + \cdots + X_n} < \lambda < \frac{b_C}{X_1 + \cdots + X_n}\right\} = \frac{C}{100}$$
determines the confidence interval.
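The exponential example can be carried through numerically. The following is a sketch under several assumptions that are not in the original notes. It uses the standard fact that for exponential $X_i$ with rate $\lambda$ the quantity $(X_1 + \cdots + X_n)\lambda$ has a Gamma distribution with shape $n$ and scale 1, and it estimates $a_C$ and $b_C$ by Monte Carlo so that only the standard library is needed. It picks the equal-tailed pair for simplicity, which is not the shortest-interval choice mentioned above. The function names and the simulated sample are hypothetical.

```python
import random


def gamma_quantiles(n, level, draws=100000, seed=0):
    """Monte Carlo estimate of equal-tailed (a_C, b_C) for a Gamma(shape n, scale 1) pivot.

    Equal tails are an assumption made for simplicity; any pair satisfying
    equation (2) would give a valid interval.
    """
    rng = random.Random(seed)
    sims = sorted(rng.gammavariate(n, 1.0) for _ in range(draws))
    tail = (1 - level) / 2
    return sims[int(tail * draws)], sims[int((1 - tail) * draws) - 1]


def exponential_rate_interval(xs, level=0.95):
    """Return (a_C / sum(xs), b_C / sum(xs)) as an interval for the rate lambda."""
    a_c, b_c = gamma_quantiles(len(xs), level)
    total = sum(xs)
    return a_c / total, b_c / total


# Hypothetical sample: 50 simulated exponential observations with rate 2.
rng = random.Random(42)
sample = [rng.expovariate(2.0) for _ in range(50)]
low, high = exponential_rate_interval(sample)
print(f"[{low:.3f}, {high:.3f}]")
```

No variance estimate appears anywhere in the computation: the interval depends on the data only through the sum $X_1 + \cdots + X_n$.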
Problems
1. A sample of 60 meteorites found in the Arizona desert was weighed. The sample
mean was 58 grams, and the sample standard deviation was 20 grams. Compute an
approximate (large sample) 95% confidence interval for the population mean.
2. A sample of 1300 UW students are asked if the school colors should be changed to
purple and pink. 960 respond that the colors should be changed. Based on this data,
construct the 95% confidence interval for the true fraction of UW students who believe
that the colors should be changed.
3. Verify the identity in (1) for n = 3.
4. Each of 40 students in a chemistry class performs 20 repetitions of an experiment
measuring the percentage of iron in an ore sample. Assuming that the measurements
are normally distributed with expectation equal to the correct percentage, each student
calculates a 95% confidence interval for the percentage based on his or her data. When
the results are collected, two of the students discover that their confidence intervals
don’t even overlap. A student in the course who is also taking Stat 311 claims that
this result is not surprising. Do you agree? Explain. (Support your explanation with
calculations.)