BIOINF 2118
Estimation Part 3
Classical Statistics – basic principles
• All unknown parameters are fixed.
• “Probability” = “long-run frequency” (i.i.d. repetitions)
• Although the likelihood function is often used
(maximum likelihood, likelihood ratio test, score test,
information matrix for standard errors, Wald test),
these uses are justified only by large-sample
approximations to frequentist calculations!!
Classical estimation
1) Moment estimators
2) Maximum likelihood estimators
3) Estimate something else, then “plug-in”.
Examples:
Estimate $E(f(X))$ by $\frac{1}{n}\sum_{i=1}^{n} f(X_i)$.
Estimate $\mathrm{s.d.}(X)$ by $\sqrt{\hat{\sigma}^2}$, where $\hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n}(X_i - \bar{X})^2$.
(If $\hat{\sigma}^2$ is the MLE, so is $\sqrt{\hat{\sigma}^2}$, by the invariance of maximum likelihood. If
$\hat{\sigma}^2$ is a moment estimator,
$\sqrt{\hat{\sigma}^2}$ is NOT, since the s.d. is not a moment.)
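A small R sketch (my addition, on a simulated normal sample) of the plug-in idea: the square root of the variance MLE is, by invariance, the MLE of the standard deviation, while R's built-in `sd()` divides by n − 1.

```r
# Sketch (my addition): plug-in estimates of s.d.(X) on simulated normal data.
set.seed(1)
x <- rnorm(50, mean = 10, sd = 2)
n <- length(x)

# sqrt of the variance MLE (divides by n): by invariance, the MLE of s.d.(X)
sd_mle <- sqrt(mean((x - mean(x))^2))
# the usual sample s.d. (divides by n - 1), as R's sd() computes it
sd_usual <- sd(x)

# the two differ only by the deterministic factor sqrt((n - 1) / n)
sd_mle / sd_usual
```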
Some notation
$\Phi^{-1}(p) = $ qnorm(p) (quantile function; inverse c.d.f.)
> qnorm(1-0.025)
[1] 1.959964
> pnorm(1.96)
[1] 0.9750021
which is close to 1 - (0.05 × ½) = 0.975.
A 95% confidence interval for the normal mean, given an i.i.d. sample with known $\sigma$, is
$\bar{X} \pm 1.96\,\sigma/\sqrt{n}$.
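This interval can be written as a small R function (my sketch; the function name and test data are made up, and $\sigma$ is assumed known):

```r
# Sketch (my addition): the interval xbar +/- qnorm(0.975) * sigma / sqrt(n),
# assuming sigma is known.
normal_ci <- function(x, sigma, level = 0.95) {
  z <- qnorm(1 - (1 - level) / 2)   # 1.959964 when level = 0.95
  mean(x) + c(-1, 1) * z * sigma / sqrt(length(x))
}
normal_ci(c(4.1, 5.3, 4.8, 5.6), sigma = 1)
```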
Note that $1.96 \approx \Phi^{-1}(0.975) = $ qnorm(1 - 0.025) $= 1.959964$.
What if the X’s are NOT normal?
If variance(X) is finite and known and equals $\sigma^2$,
then by the central limit theorem (demo: www.mathsisfun.com/data/quincunx.html)
$\sqrt{n}\,(\bar{X} - \mu)/\sigma \xrightarrow{d} N(0,1)$.
So
$\bar{X} \pm 1.96\,\sigma/\sqrt{n}$
is an approximate 95% confidence interval.
What is a “95% confidence interval”, really?
A function (algorithm) CI: data $\mapsto$ ( CI1(data), CI2(data) ) such that
$P\big(\mathrm{CI1}(\text{data}) \le \mu \le \mathrm{CI2}(\text{data})\big) = 0.95$
yields a 95% confidence interval.
The “coverage probability” is 1 – 0.05 = 0.95. No particular realization
(CI1(data),CI2 (data))
can be said to have the coverage probability.
I.e. it’s a property of a “recipe”, not of a particular interval.
(There are some strange examples that highlight this peculiar interpretation.)
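The "property of a recipe" point can be checked by simulation (my sketch; the true mean, sigma, and sample size are made-up values). Repeating the procedure on fresh samples, about 95% of the realized intervals cover the true mean, while any single interval either does or does not:

```r
# Simulation sketch (my addition): coverage belongs to the recipe, not to
# any one realized interval.
set.seed(2)
mu <- 5; sigma <- 3; n <- 25
covered <- replicate(10000, {
  x  <- rnorm(n, mu, sigma)
  ci <- mean(x) + c(-1, 1) * qnorm(0.975) * sigma / sqrt(n)
  ci[1] <= mu && mu <= ci[2]
})
mean(covered)   # close to 0.95
```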
Approximate confidence interval for the binomial mean
Let $Y \sim \mathrm{binom}(n, p)$, so that Y is a sum of i.i.d. Bernoullis $X_i$, with
$E(X_i) = p$ and
$\mathrm{Var}(X_i) = p(1-p)$.
Then an estimated standard error of the mean is
$\sqrt{\hat{p}(1-\hat{p})/(n-1)}$.
Example: n = 10, Y = 8, so $\hat{p} = 0.8$. Then the s.e.m. (standard error of the mean) is
$\sqrt{0.8 \times 0.2 / 9} = 0.133$.
From R, a confidence interval based on the normal approximation is
0.8+c(-1,1)*sqrt(.8*.2/9)*qnorm(0.975)
which gives
( 0.5386715, 1.0613285).
The normal approx. can give confidence intervals extending beyond [0,1]!!
That seems pretty silly.
Do we really have more “confidence” in p=1.06 than in p=0.52?
Later, we’ll see why an “exact interval” is
(0.4439040, 0.9747889),
obtained by thinking of confidence intervals as values not rejected by a test.
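In R, the confidence interval reported by binom.test() is exactly this test-inversion (Clopper-Pearson) interval, so the two intervals can be compared directly (my sketch):

```r
# Sketch (my addition): the "exact" interval from inverting the binomial
# test (what binom.test() reports) stays inside [0, 1]; the normal
# approximation does not.
exact_ci  <- binom.test(8, 10)$conf.int
approx_ci <- 0.8 + c(-1, 1) * sqrt(0.8 * 0.2 / 9) * qnorm(0.975)
exact_ci    # roughly (0.444, 0.975)
approx_ci   # roughly (0.539, 1.061): exceeds 1
```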
A silly confidence interval
Suppose the distribution is
$$X \mid m = \begin{cases} m - 1 & \text{w.p. } 1/2 \\ m + 1 & \text{w.p. } 1/2 \end{cases}$$
("w.p." stands for "with probability")
and the parameter space is the real line: $m \in \mathbb{R}$.
You observe two independent observations. The model is then
$$(X_1, X_2) \mid m = \begin{cases} (m-1,\ m-1) & \text{w.p. } 1/4 \\ (m-1,\ m+1) & \text{w.p. } 1/4 \\ (m+1,\ m-1) & \text{w.p. } 1/4 \\ (m+1,\ m+1) & \text{w.p. } 1/4 \end{cases}$$
Then a 75% confidence set is:
If $X_1 = X_2$, CI = $\{X_1 - 1\}$.
If $X_1 \ne X_2$, CI = $\{(X_1 + X_2)/2\}$.
The two cases each happen with 50% probability.
The coverage is 50% in the first case and 100% in the second case (where the X's differ by 2).
So, averaging over the sample space, the coverage probability is 75%.
In classical statistics, all probabilities average over the sample space.
On the other hand,
Y = I(X 1 = X 2 )
is an ancillary statistic. Its distribution does not depend on
m . It is not informative about m .
So we can do conditional inference.
Then probabilities average over that subset of the sample space which agree with the ancillary
statistic.
That would make sense here.
But then, if we demand a confidence interval with confidence ≥ 75%, the only way to get it is this:
If $X_1 = X_2$, CI = $\{X_1 - 1,\ X_1 + 1\}$.
If $X_1 \ne X_2$, CI = $\{(X_1 + X_2)/2\}$,
which has coverage probability = 100%.
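The 75% recipe above can be checked by simulation (my sketch; the true value mu = 0 is an arbitrary choice). When $X_1 = X_2$ the point $X_1 - 1$ covers m half the time; when they differ, the midpoint is exactly m:

```r
# Simulation sketch (my addition) of the "silly" 75% recipe.
set.seed(3)
mu <- 0
hit <- replicate(20000, {
  x  <- mu + sample(c(-1, 1), size = 2, replace = TRUE)
  ci <- if (x[1] == x[2]) x[1] - 1 else (x[1] + x[2]) / 2
  ci == mu   # here the "interval" is a single point
})
mean(hit)   # close to 0.75
```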