ST 380
Probability and Statistics for the Physical Sciences
Interval Estimates
A point estimate by itself provides no information about the precision
and reliability of estimation.
Consider, e.g., X̄ as an estimator for µ. The observed value x̄ alone
gives no indication of how close it is to µ.
An alternative to reporting a single number is to report an entire
interval of plausible values, that is, an interval estimate.
Interval Estimation
Introduction
Confidence Interval
At the 95% confidence level, any value of the parameter θ that lies
inside a 95% confidence interval is plausible.
A confidence level of 95% implies that 95% of all samples would give
an interval that includes θ, and only 5% of all samples would yield an
erroneous interval.
The most frequently used confidence levels are 90%, 95%, and 99%.
The higher the confidence level, the more strongly we believe that
the value of the parameter lies within the interval.
Basic Properties of Confidence Intervals
The basic properties of confidence intervals (CIs) are most easily
introduced by first focusing on a simple, albeit somewhat unrealistic,
problem.
Suppose that the parameter of interest is µ, the population is normal,
and the value of the standard deviation σ is known.
Basic Properties
The Assumptions
Normality of the population distribution is often a reasonable
assumption, or at least an approximation.
However, if the value of µ is unknown, it is typically implausible that
the value of σ is known.
Methods based on less restrictive assumptions will be shown later.
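Recall that X̄ ∼ N(µ, σ²/n), and that
\[ Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1). \]
So
\[
\begin{aligned}
0.95 &= P(-1.96 < Z < 1.96) \\
     &= P\left(-1.96 < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < 1.96\right) \\
     &= P\left(\bar{X} - 1.96\,\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + 1.96\,\frac{\sigma}{\sqrt{n}}\right).
\end{aligned}
\]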
This is a random interval for the fixed value µ.
The interpretation is: “the probability is .95 that the random interval
includes the true value of µ.”
If x̄ = 80, n = 31 and σ = 2, the 95% confidence interval would be
\[ 80.0 \pm 1.96 \times \frac{2.0}{\sqrt{31}} = (79.3, 80.7). \]
It is tempting to conclude that µ is within this (now fixed) interval
with probability .95 ...
But µ is a constant, if unknown, and once we evaluate the interval,
the end-points are also fixed.
It is therefore incorrect to write the statement
P [µ ∈ (79.3, 80.7)] = .95.
The correct interpretation is that if we repeatedly formed confidence
intervals using this procedure, in the long run, 95% of them would
contain the parameter µ.
We might write that we are “95% confident” that µ lies within the
interval.
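The long-run interpretation can be checked by simulation. The following is a minimal sketch, assuming a normal population with µ = 80, σ = 2 and n = 31 (the values of the earlier example; in practice µ would be unknown). The function name `coverage`, the number of trials, and the seed are illustrative choices, not part of the course material. It draws many samples, forms x̄ ± 1.96 σ/√n for each, and reports the fraction of intervals that contain µ.

```python
import math
import random


def coverage(mu=80.0, sigma=2.0, n=31, z=1.96, trials=10_000, seed=0):
    """Fraction of simulated z-intervals that contain the true mean mu."""
    rng = random.Random(seed)            # fixed seed so the run is repeatable
    half_width = z * sigma / math.sqrt(n)
    hits = 0
    for _ in range(trials):
        # Draw one sample of size n and compute its mean.
        xbar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        # Does this particular interval cover mu?
        if xbar - half_width < mu < xbar + half_width:
            hits += 1
    return hits / trials


print(coverage())
```

The printed fraction should be close to 0.95. Each individual interval either contains µ or it does not; the 95% describes the procedure over repeated sampling.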
For a 99% confidence interval, we would need to replace 1.96 by 2.58.
In general, a confidence level of 1 − α is achieved by using z_{α/2} in
place of 1.96. Recall that
\[ P(Z > z_{\alpha/2}) = \alpha/2. \]
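Since z_{α/2} cuts off an upper-tail area of α/2, it is the 1 − α/2 quantile of the standard normal distribution. As a small sketch, the critical values for the usual confidence levels can be computed with `NormalDist` from the Python standard library; the helper name `z_critical` is an assumption made for this illustration.

```python
from statistics import NormalDist


def z_critical(conf):
    """Return z_{alpha/2} for a confidence level conf = 1 - alpha."""
    alpha = 1 - conf
    # Upper-tail area alpha/2 corresponds to the (1 - alpha/2) quantile.
    return NormalDist().inv_cdf(1 - alpha / 2)


for conf in (0.90, 0.95, 0.99):
    print(f"{conf:.0%}: z = {z_critical(conf):.3f}")
# 90%: z = 1.645
# 95%: z = 1.960
# 99%: z = 2.576
```

Rounded to two decimals, the last two values are the 1.96 and 2.58 used in these notes.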
Definition
A 100(1 − α)% confidence interval for the mean µ of a normal
population, when the value of σ is known, is given by
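\[ \left( \bar{x} - z_{\alpha/2} \times \frac{\sigma}{\sqrt{n}},\; \bar{x} + z_{\alpha/2} \times \frac{\sigma}{\sqrt{n}} \right). \]
We often write it more compactly as
\[ \bar{x} \pm z_{\alpha/2} \times \frac{\sigma}{\sqrt{n}}. \]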
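The definition translates directly into a few lines of code. This is a sketch rather than a reference implementation: the function name `z_interval` and its argument names are assumptions, and it presumes a normal population with known σ, as in the definition.

```python
import math
from statistics import NormalDist


def z_interval(xbar, sigma, n, conf=0.95):
    """100*conf % confidence interval for mu when sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)   # z_{alpha/2}
    half_width = z * sigma / math.sqrt(n)
    return xbar - half_width, xbar + half_width


# The earlier example: xbar = 80, sigma = 2, n = 31.
lo, hi = z_interval(80.0, 2.0, 31)
print(f"({lo:.1f}, {hi:.1f})")   # (79.3, 80.7)
```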
Confidence Level, Precision, and Sample Size
Why settle for a 95% confidence interval when a 99% interval is
available?
One issue is that the 99% interval is wider (it uses 2.58 instead of
1.96), and thus has less precision.
If we want both high confidence and precision, we could fix both and
then solve for the necessary sample size.
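To make the last step concrete: the interval x̄ ± z_{α/2} σ/√n has total width w = 2 z_{α/2} σ/√n, so a target width w requires n = (2 z_{α/2} σ / w)², rounded up to the next integer. The sketch below assumes σ is known; the function name `sample_size` and the target width of 1.0 are illustrative choices.

```python
import math
from statistics import NormalDist


def sample_size(sigma, width, conf=0.95):
    """Smallest n for which the z-interval has total width <= width."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)   # z_{alpha/2}
    return math.ceil((2 * z * sigma / width) ** 2)


# sigma = 2, desired total width 1.0 (that is, xbar +/- 0.5)
print(sample_size(2.0, 1.0, conf=0.95))   # 62
print(sample_size(2.0, 1.0, conf=0.99))   # 107
```

Raising the confidence level from 95% to 99% at the same width costs a noticeably larger sample, which is the trade-off described above.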
Large Sample Confidence Intervals
Suppose as before that the parameter of interest is µ, but the
population is not known to be normal; we still assume for now that
the value of the standard deviation σ is known.
The Central Limit Theorem assures us that, for large n, X̄ is
approximately normally distributed as N(µ, σ²/n), and hence
\[ \bar{x} \pm z_{\alpha/2} \times \frac{\sigma}{\sqrt{n}} \]
is a confidence interval for µ with a confidence level of approximately
100(1 − α)%.
Large Sample Intervals
Suppose, more realistically, that σ is also unknown.
Replacing σ by s, the sample standard deviation, in the calculation of
the confidence interval is an additional approximation, but it is still
true that
\[ \bar{x} \pm z_{\alpha/2} \times \frac{s}{\sqrt{n}} \]
is a confidence interval for µ with a confidence level of approximately
100(1 − α)%.
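As an illustration of the large-sample interval with s in place of σ, the sketch below applies it to simulated, clearly non-normal data. Everything specific here is an assumption for the example: the exponential population with mean 5, the sample size of 200, the seed, and the function name `large_sample_interval`.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def large_sample_interval(data, conf=0.95):
    """Approximate CI for the mean: xbar +/- z_{alpha/2} * s / sqrt(n)."""
    n = len(data)
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    half_width = z * stdev(data) / math.sqrt(n)   # stdev is the sample s
    xbar = mean(data)
    return xbar - half_width, xbar + half_width


rng = random.Random(1)
# Hypothetical skewed population: exponential with mean 5.
sample = [rng.expovariate(1 / 5.0) for _ in range(200)]
lo, hi = large_sample_interval(sample)
print(f"({lo:.2f}, {hi:.2f})")
```

Because both the normality of X̄ and the use of s are approximations, the stated confidence level holds only approximately, and the approximation improves as n grows.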
General Large Sample Case
In other situations, we may want to use an estimator θ̂ of some
parameter θ, and we may know that θ̂ is approximately normally
distributed with mean θ, and we may have an estimated standard
error σ̂_{θ̂} of θ̂.
Then
\[ \hat{\theta} \pm z_{\alpha/2} \times \hat{\sigma}_{\hat{\theta}} \]
is a confidence interval for θ with a confidence level of approximately
100(1 − α)%.
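One familiar instance of this general recipe, chosen here only as an illustration and not taken from the slides, is a population proportion p: the estimator is the sample proportion p̂, and its estimated standard error is √(p̂(1 − p̂)/n). The sketch below uses hypothetical counts (120 successes in 400 trials) and an assumed helper name `large_sample_ci`.

```python
import math
from statistics import NormalDist


def large_sample_ci(estimate, std_error, conf=0.95):
    """Generic interval: estimate +/- z_{alpha/2} * estimated standard error."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return estimate - z * std_error, estimate + z * std_error


successes, n = 120, 400                       # hypothetical data
p_hat = successes / n                         # theta-hat
se_hat = math.sqrt(p_hat * (1 - p_hat) / n)   # estimated standard error
lo, hi = large_sample_ci(p_hat, se_hat)
print(f"({lo:.3f}, {hi:.3f})")   # (0.255, 0.345)
```

The same function applies to any estimator that is approximately normal with mean θ, provided an estimated standard error is available.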
Small Samples from a Normal Distribution
Recall the confidence interval for the mean µ of a normal
distribution, when σ is known:
\[ \bar{x} \pm z_{\alpha/2} \times \frac{\sigma}{\sqrt{n}}. \]
If σ is not known, we replace it by its estimate,
s = sample standard deviation.
To maintain the coverage probability of 100(1 − α)%, we must adjust
the multiplier z_{α/2}.
The necessary probability result is that
\[ T = \frac{\bar{X} - \mu}{S/\sqrt{n}} \]
has a known probability distribution, the Student’s t-distribution with
ν = n − 1 degrees of freedom.
It follows that
\[ \bar{x} \pm t_{\alpha/2,\nu} \times \frac{s}{\sqrt{n}} \]
is a 100(1 − α)% confidence interval for µ, where t_{α/2,ν} is the
1 − α/2 quantile of that distribution.
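The Python standard library has no t-quantile function, so the sketch below computes t_{α/2,ν} numerically: it integrates the t density with Simpson's rule and inverts the result by bisection. This is only an illustrative approach under stated assumptions (upper quantiles with 0.5 < p < 1, and modest ν so that `math.gamma` does not overflow); in practice one would read the value from a t table or use a statistics library. All function names, and the summary values x̄ = 80, s = 2, n = 10, are hypothetical.

```python
import math


def t_pdf(x, nu):
    """Density of Student's t-distribution with nu degrees of freedom."""
    c = math.gamma((nu + 1) / 2) / (math.sqrt(nu * math.pi) * math.gamma(nu / 2))
    return c * (1 + x * x / nu) ** (-(nu + 1) / 2)


def t_cdf(x, nu, steps=2000):
    """P(T <= x) for x >= 0: 0.5 plus Simpson's rule on [0, x]."""
    h = x / steps
    total = t_pdf(0.0, nu) + t_pdf(x, nu)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, nu)
    return 0.5 + total * h / 3


def t_quantile(p, nu):
    """Quantile for 0.5 < p < 1, by bracketing and then bisection."""
    lo, hi = 0.0, 1.0
    while t_cdf(hi, nu) < p:      # expand until the quantile is bracketed
        hi *= 2
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, nu) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def t_interval(xbar, s, n, conf=0.95):
    """100*conf % CI for mu from a normal sample with unknown sigma."""
    t_crit = t_quantile(1 - (1 - conf) / 2, n - 1)   # t_{alpha/2, n-1}
    half_width = t_crit * s / math.sqrt(n)
    return xbar - half_width, xbar + half_width


print(round(t_quantile(0.975, 9), 3))   # 2.262
lo, hi = t_interval(80.0, 2.0, 10)
print(f"({lo:.2f}, {hi:.2f})")          # (78.57, 81.43)
```

For n = 10 the multiplier is about 2.262 rather than 1.96, so the t-interval is wider than the corresponding z-interval; the difference shrinks as ν increases.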