ST 380
Probability and Statistics for the Physical Sciences
Parameter Estimation
Probability theory tells us what to expect when we carry out some
experiment with random outcomes, in terms of the parameters of the
problem.
Statistical theory tells us what we can learn about those parameters
when we have seen the outcome of the experiment.
We speak of making statistical inferences about the parameters.
Point Estimation
A point estimate of a parameter is a single value that represents a
best guess as to the value of the parameter.
For example, Rasmussen Reports surveyed 1,500 likely voters over a
3-day period, and 690 agreed that they approve of the President's
performance in office.
We assume that each voter was randomly selected from a population
in which a fraction p of voters would agree.
Here p is the parameter of interest, and the natural point estimate of
it is p̂ = 690/1500 = .46, or 46%.
Sample Mean
In any situation where we observe a simple random sample
X1, X2, . . . , Xn from some population with mean µ, we know that the
sample mean X̄ = (X1 + X2 + · · · + Xn)/n satisfies
E(X̄) = µ,
so it is natural to estimate µ by X̄ .
We treat the Rasmussen survey as a binomial experiment with
E (Xi ) = p, so using p̂ (= X̄ ) to estimate p is a special case of using
X̄ to estimate µ.
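As a quick illustration (a simulation sketch, not in the original slides;
the population, sample size, and mean are arbitrary choices), we can check
that the sample mean is centered at µ:

n <- 30
mu <- 5
# Approximate E(X-bar) by averaging many simulated sample means,
# drawing each sample from an Exponential population with mean mu
xbar <- replicate(10000, mean(rexp(n, rate = 1/mu)))
mean(xbar)  # close to mu = 5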
Estimator and Estimate
It is important to distinguish between the rule that we follow to
estimate a parameter and the value that we find for a particular
sample.
We call the rule an estimator and the value an estimate.
For example, in the survey data, the rule is “estimate p by the sample
fraction p̂”, and the value is .46.
So the estimator is p̂, and the estimate is .46.
One week ago, the same estimator p̂ with a different sample gave a
different estimate, .49.
Sampling Distribution
Clearly a point estimator is a statistic, and therefore has a sampling
distribution.
Suppose that X1, X2, . . . , Xn is a random sample from some
population with a parameter θ, and that
θ̂ = θ̂(X1, X2, . . . , Xn)
is a statistic that we want to use as an estimator of θ.
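For instance (a sketch, not in the original slides; the true p is an
assumed value), we can simulate the sampling distribution of the survey
estimator p̂:

# Each replicate draws 1500 Bernoulli(0.46) voters and computes p-hat
phat <- replicate(10000, mean(rbinom(1500, 1, 0.46)))
hist(phat, main = "Sampling distribution of p-hat")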
Bias
If
E(θ̂) = θ for all possible values of θ,
then θ̂ is an unbiased estimator of θ.
In general, the bias of θ̂ as an estimator of θ is
E(θ̂ − θ) = E(θ̂) − θ.
A biased estimator in a sense systematically over-estimates or
under-estimates θ, so we try to avoid estimators with large biases.
An unbiased estimator is desirable, but not always available, and not
always sensible.
For example, suppose that n = 1, and X = X1 has the Poisson
distribution with parameter µ:
P(X = x) = p(x; µ) = e^(−µ) µ^x / x!,  x = 0, 1, . . .
E(X) = µ, so X is an unbiased estimator of µ, but suppose that the
parameter of interest is
θ = e^(−µ).
The only unbiased estimator of θ is
θ̂ = 1 if X = 0, and θ̂ = 0 if X > 0,
which is unbiased because E(θ̂) = P(X = 0) = e^(−µ) = θ, but it is
hardly a sensible estimate of a quantity that lies strictly between
0 and 1 whenever µ > 0.
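A quick simulation sketch (not in the slides; µ = 2 is an arbitrary
choice) confirms the unbiasedness:

mu <- 2
x <- rpois(100000, mu)
mean(x == 0)  # simulated E(theta-hat) = P(X = 0)
exp(-mu)      # true theta; the two agree closely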
Mean Squared Error
We measure how far an estimator θ̂ is from the true value θ using the
mean squared error:
MSE(θ̂; θ) = E[(θ̂ − θ)²].
Writing θ̂ − θ = (θ̂ − E(θ̂)) + (E(θ̂) − θ) and noting that the cross
term has expectation zero, we can show that
MSE(θ̂; θ) = (bias)² + V(θ̂).
For an unbiased estimator, bias = 0, so
MSE(θ̂; θ) = V(θ̂).
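A simulation sketch (not in the slides; the shrunken estimator 0.9 X̄
and all numerical values are arbitrary choices) illustrates the
decomposition:

theta <- 10
# A deliberately biased estimator of a Normal mean: 0.9 * X-bar
est <- replicate(100000, 0.9 * mean(rnorm(25, mean = theta)))
mean((est - theta)^2)             # direct estimate of the MSE
(mean(est) - theta)^2 + var(est)  # bias^2 + variance; agrees closely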
Many biased estimators are approximately unbiased, in the sense that
(bias)² ≪ V(θ̂),
so
MSE(θ̂; θ) ≈ V(θ̂).
Standard error
So if an estimator is unbiased, either exactly or approximately, its
performance is measured by V(θ̂), or by its standard deviation
σθ̂ = √V(θ̂),
also known as its standard error.
Often an estimator’s standard error is a function of θ or other
parameters; these must be replaced by estimates before we can
actually calculate a value.
Estimated standard error
The resulting statistic is called the estimated standard error, and is
denoted σ̂θ̂ .
Example: binomial distribution; V(p̂) = p(1 − p)/n, so
σp̂ = √(p(1 − p)/n),  and  σ̂p̂ = √(p̂(1 − p̂)/n).
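For the Rasmussen survey from the earlier slide, this gives (a short
sketch using the numbers from that example):

n <- 1500
phat <- 690 / n              # 0.46, the point estimate
sqrt(phat * (1 - phat) / n)  # estimated standard error, about 0.013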
Methods of Point Estimation
In some situations we have an obvious estimator θ̂, such as the
binomial p̂ = X /n.
In other cases we may not.
Example: Ozone pollution
Suppose that X1, X2, . . . , X28 are daily maximum ozone levels on 28
consecutive days. Suppose further that we want to model these as
independent variables with the Weibull distribution
f(x; α, β) = (α/β) (x/β)^(α−1) e^(−(x/β)^α),  0 < x < ∞.
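As a quick check (not in the slides; the parameter values are arbitrary),
this formula matches R's built-in Weibull density, where α is the shape
and β the scale:

a <- 2; b <- 70; x0 <- c(30, 70, 110)
(a/b) * (x0/b)^(a - 1) * exp(-(x0/b)^a)  # the density formula above
dweibull(x0, shape = a, scale = b)       # same values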
It is not obvious how to estimate either α or β.
Suppose that we know from other data that α is well approximated
by the value 2. It is still not obvious how to estimate β.
Before we observed the data, the joint pdf
∏_{i=1}^n f(xi; α, β)
measures the relative probability of observing specific values
x1, x2, . . . , xn.
Likelihood function
After observing x1, x2, . . . , xn, we can use the same function to
measure the relative likelihood of different values of α and β (or just
β if we believe we know the value of α = α0).
When used this way, we call it the likelihood function,
L(β) = ∏_{i=1}^n f(xi; α0, β).
Example, with simulated ozone levels:
n <- 28
alpha0 <- 2
beta <- 70
x <- rweibull(n, alpha0, beta)  # simulate 28 daily maxima

# Likelihood of beta with alpha fixed at alpha0, vectorized over beta
L <- function(beta) {
  lik <- rep(NA, length(beta))
  for (i in seq_along(beta))
    lik[i] <- prod(dweibull(x, alpha0, beta[i]))
  lik
}
plot(L, from = 50, to = 100)  # plot the likelihood over [50, 100]
Maximum Likelihood
The most likely value of β, the value that maximizes the likelihood, is
the maximum likelihood estimate.
Maximum likelihood estimators are generally approximately unbiased,
and have close to the smallest possible mean squared error.
Most of the estimators that we cover later will be maximum
likelihood estimators, or sometimes unbiased modifications of them.
In the example, we can use the optimize() function to find the
maximum likelihood estimate of β:
o <- optimize(L, c(50, 100), maximum = TRUE)  # maximize L over [50, 100]
abline(v = o$maximum, col = "blue")           # mark the MLE on the plot
title(paste("MLE of beta:", round(o$maximum, 1)))
Alternatively, by setting the derivative of log L(β) to zero, we can
show analytically that
β̂ML = ((1/n) ∑_{i=1}^n xi^α0)^(1/α0),
which in R is
mean(x^alpha0)^(1/alpha0)
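As a sanity check (not in the slides), the numerical and analytic
answers should agree closely:

o$maximum                  # numerical MLE from optimize()
mean(x^alpha0)^(1/alpha0)  # closed-form MLE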