Economics 140A
Properties of Estimators
Our first lecture reviewed basic concepts from probability. Economic theory
yields models that describe the evolution of economic variables. The evolution
depends on the value of the parameters that characterize the model. If the parameters are known, and if the distribution of the driving process (or error) is
known, then the evolution is derived using the tools of probability. Because the
parameters of a model are never known, such an approach is infeasible. Instead,
we use the data to infer the parameters; that is, we estimate the parameters.
In studying the evolution of economic variables, we typically focus on one of
three tasks: measurement, testing, or forecasting. If all of the parameters of the
model were known, then measurement of an effect would be straightforward (and
equal to a function of the known parameters), testing would be moot (we would
know the parameter value and would not need to conjecture as to whether or not it
equalled some specified constant), and forecasting would be flawless except for the
unknown future values of the driving process. With unknown parameters replaced
by estimates, measurement is uncertain, testing is important, and forecasting is
more flawed. Clearly the accuracy with which we perform each task depends on
the accuracy of our estimator and so we turn to discussion of how to evaluate
estimators.
An estimator is a function of the data: $\bar{Y} = \frac{1}{n}\sum_{i=1}^{n} Y_i$.
An estimate is a specific value obtained: $\bar{y}$.
Remark: We distinguish between random variables, which are denoted with
upper case, and the values random variables may take, which are denoted with
lower case. An estimator is a random variable.
Let $A$ be an estimator of the parameter $\theta$. We study features of the distribution
of $A$. We then turn to a property that is not a feature of the sampling distribution.
The distribution of A is obtained by constructing innumerable samples and
plotting the estimate from each sample.
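To make this concrete, here is a minimal simulation sketch (the population, sample size, and number of replications are illustrative assumptions, not part of the lecture):

import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 5.0, 2.0           # assumed population parameters (illustrative)
n, replications = 50, 10_000   # sample size and number of repeated samples

# Construct many samples and record the estimate (the sample mean) from each one.
estimates = np.array([rng.normal(mu, sigma, size=n).mean() for _ in range(replications)])

# The collection of estimates traces out the sampling distribution of the estimator;
# a histogram of `estimates` is the plot described above.
print("mean of the estimates:", estimates.mean())    # close to mu = 5
print("spread of the estimates:", estimates.std())   # close to sigma / sqrt(n), about 0.28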
We begin with the location of the distribution of A.
Definition: The estimator $A$ is unbiased if the expected value of $A$ equals $\theta$,
$$E(A) = \theta.$$
The bias of $A$ measures the departure of $E(A)$ from $\theta$,
$$\mathrm{Bias}(A) = E(A) - \theta.$$
Remark 1: If an estimator is unbiased, the process of drawing repeated samples, obtaining an estimate from each sample, and averaging the estimates will
yield a value that is likely close to the true parameter value. If, on average, the
value of the estimator is less than $\theta$, we say the estimator is biased downward.
Other things equal, an unbiased estimator is likely to yield a more accurate indicator of the true parameter value (because the magnitude of the bias of a biased
estimator is rarely known).
Remark 2: The above definition is most accurately termed mean unbiasedness.
As we know from our previous lecture, there are other definitions of location.
Corresponding to the median, an estimator is median unbiased if the median of
the estimator equals the true parameter value. With a unimodal distribution, one
could similarly define a modally unbiased estimator.
We next study dispersion of the distribution of A.
Definition. The estimator $A$ is the efficient unbiased estimator if, for any
sample size, the variance of $A$ is smaller than the variance of any other unbiased
estimator.
Remark: Given two unbiased estimators, we prefer the estimator with the
smaller variance, because the distribution of such an estimator is more tightly
clustered around the true parameter value. When comparing two estimators we
speak of relative efficiency: if $A_1$ and $A_2$ are two unbiased estimators of $\theta$, then
$A_1$ is efficient relative to $A_2$ if $Var(A_1) < Var(A_2)$. (Diagram)
Remark: If the degree of bias of an estimator is known, then it makes sense
to define efficiency for two estimators with the same known bias, even if the bias
is not zero. If the degree of bias is unknown, such a statement may not be useful,
as it is difficult to tell which estimator one prefers.
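As an illustration of relative efficiency (a sketch with assumed values, not an example from the notes), the following compares two unbiased estimators of the centre of a normal distribution, the sample mean and the sample median; for normal data the mean has the smaller variance.

import numpy as np

rng = np.random.default_rng(1)
mu, sigma, n, reps = 0.0, 1.0, 100, 20_000   # illustrative choices

samples = rng.normal(mu, sigma, size=(reps, n))
a1 = samples.mean(axis=1)          # A1: sample mean (unbiased)
a2 = np.median(samples, axis=1)    # A2: sample median (also unbiased for a symmetric distribution)

# Both centre on mu, but A1 has the smaller variance, so A1 is efficient relative to A2.
print("Var(A1) is about", a1.var())   # roughly sigma^2 / n = 0.010
print("Var(A2) is about", a2.var())   # roughly (pi/2) * sigma^2 / n, about 0.016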
Location and dispersion may also be combined.
Definition. The mean square error (MSE) of the estimator $A$ is
$$MSE(A) = E(A - \theta)^2 = Var(A) + [\mathrm{Bias}(A)]^2.$$
Derivation:
$$E(A - \theta)^2 = E\left[(A - EA) - (\theta - EA)\right]^2 = Var(A) + [-\mathrm{Bias}(A)]^2 - 2(\theta - EA)\,E(A - EA),$$
where $E(A - EA)$ equals 0.
Remark: The mean square error is especially useful for comparing two estimators with unequal bias. With unequal bias, we need a measure that combines
bias and variance. (Diagram)
Remark: For an unbiased estimator, MSE equals the variance.
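A quick numerical check of the decomposition (a sketch with assumed values): apply a deliberately biased estimator, $0.9\,\bar{Y}_n$, to samples from an assumed $N(2, 1)$ population and compare the simulated MSE with the variance plus squared bias.

import numpy as np

rng = np.random.default_rng(2)
mu, sigma, n, reps = 2.0, 1.0, 25, 50_000   # illustrative choices

# A deliberately biased estimator: shrink the sample mean toward zero.
estimates = 0.9 * rng.normal(mu, sigma, size=(reps, n)).mean(axis=1)

mse = np.mean((estimates - mu) ** 2)
var = estimates.var()
bias_sq = (estimates.mean() - mu) ** 2

print(mse, var + bias_sq)   # the two numbers agree up to simulation error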
Bias and efficiency relate to location and dispersion of the distribution of an
estimator.
Consistency relates to the behavior of the estimator itself.
Definition. An estimator is consistent if it converges to the true parameter value
as the sample becomes arbitrarily large.
Remark: As the sample size grows, it becomes increasingly likely that an estimator lies in some neighborhood of the true parameter value. For consistency,
the probability that the estimator lies within any given neighborhood of the true
parameter value must approach 1 as the sample size approaches infinity.
Remark: For sampling from a finite population, we will certainly learn the true
parameter value if the sample is the entire population. The idea of consistency is
to apply to populations that are infinite.
Remark: Consistency does not imply the bias goes to zero as the sample size
tends to infinity. Consistency is a property of an estimator for one sample as the
sample size gets large. Bias is a property of an estimator for one sample size as
the number of samples gets large.
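The following sketch (assumed population and neighborhood width) illustrates the point: the fraction of samples in which $\bar{Y}_n$ lands within a fixed neighborhood of $\mu$ approaches one as $n$ grows.

import numpy as np

rng = np.random.default_rng(3)
mu, sigma, eps, reps = 0.0, 1.0, 0.1, 5_000   # eps is the half-width of the neighborhood

for n in (10, 100, 1_000, 10_000):
    means = rng.normal(mu, sigma, size=(reps, n)).mean(axis=1)
    coverage = np.mean(np.abs(means - mu) < eps)
    print(n, coverage)   # the probability of landing in the neighborhood approaches 1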
Examples
Let $\{Y_i\}_{i=1}^{n}$ be a sequence of independent identically distributed $N(\mu, \sigma^2)$ random variables.
Estimation of $\mu$.
The estimator most familiar to you is the sample mean
$$\bar{Y}_n = \frac{1}{n}\sum_{i=1}^{n} Y_i.$$
(Note: we have replaced the population mean with an empirical analogue, in
which we assign equal probability to each observation.)
Bias: Does $E\left(\bar{Y}_n\right)$ equal $\mu$?
$$E\left(\frac{1}{n}\sum_{i=1}^{n} Y_i\right) = \frac{1}{n}\sum_{i=1}^{n} E(Y_i) = \mu.$$
Efficiency: Is $\bar{Y}_n$ the efficient unbiased estimator of $\mu$? Yes.
$$Var\left(\frac{1}{n}\sum_{i=1}^{n} Y_i\right) = \frac{1}{n^2}\,Var\left(\sum_{i=1}^{n} Y_i\right) = \frac{\sigma^2}{n}.$$
Remark: Consider using only $m < n$ of the sample. Because $\frac{1}{m} > \frac{1}{n}$, the
variance of $\bar{Y}_m$ exceeds the variance of $\bar{Y}_n$. Intuitively, we throw away $n - m$ bits
of information when we use $\bar{Y}_m$.
Remark: The sample mean achieves the Cramer-Rao lower bound.
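A sketch of the calculation behind this claim, treating $\sigma^2$ as known: the log-likelihood of a single observation is
$$\ell(\mu) = -\tfrac{1}{2}\log\left(2\pi\sigma^2\right) - \frac{(Y_i - \mu)^2}{2\sigma^2},$$
so
$$\frac{\partial^2 \ell}{\partial \mu^2} = -\frac{1}{\sigma^2}, \qquad I_n(\mu) = -n\,E\left[\frac{\partial^2 \ell}{\partial \mu^2}\right] = \frac{n}{\sigma^2}, \qquad \frac{1}{I_n(\mu)} = \frac{\sigma^2}{n} = Var\left(\bar{Y}_n\right),$$
so no unbiased estimator of $\mu$ can have a variance smaller than that of the sample mean.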
Consistency: The estimator $\bar{Y}_n$ is consistent.
Estimation of $\sigma^2$.
Recall the population variance
$$E(Y_i - \mu)^2,$$
which is naturally estimated by the sample analogue
$$S^2_{n,MLE} = \frac{1}{n}\sum_{i=1}^{n}\left(Y_i - \bar{Y}_n\right)^2.$$
Bias: Does $E\left(S^2_{n,MLE}\right)$ equal $\sigma^2$?
$$E\left[\frac{1}{n}\sum_{i=1}^{n}\left(Y_i - \bar{Y}_n\right)^2\right] = \frac{1}{n}\,E\left[\sum_{i=1}^{n}\left(Y_i - \bar{Y}_n\right)^2\right].$$
Note
$$\sum_{i=1}^{n}\left(Y_i - \bar{Y}_n\right)^2 = \sum_{i=1}^{n}\left[(Y_i - \mu) - \left(\bar{Y}_n - \mu\right)\right]^2$$
$$= \sum_{i=1}^{n}(Y_i - \mu)^2 + \sum_{i=1}^{n}\left(\bar{Y}_n - \mu\right)^2 - 2\left(\bar{Y}_n - \mu\right)\sum_{i=1}^{n}(Y_i - \mu)$$
$$= \sum_{i=1}^{n}(Y_i - \mu)^2 - n\left(\bar{Y}_n - \mu\right)^2,$$
where $\sum_{i=1}^{n}(Y_i - \mu)$ equals $n\left(\bar{Y}_n - \mu\right)$.
Hence
$$E\left[\frac{1}{n}\sum_{i=1}^{n}\left(Y_i - \bar{Y}_n\right)^2\right] = \frac{1}{n}\left[n\,Var(Y_i) - n\,Var\left(\bar{Y}_n\right)\right] = \sigma^2 - \frac{\sigma^2}{n} = \frac{n-1}{n}\,\sigma^2.$$
The ML estimator is biased downward,
$$\mathrm{Bias}\left(S^2_{n,MLE}\right) = -\frac{1}{n}\,\sigma^2.$$
An unbiased estimator is
$$S_n^2 = \frac{n}{n-1}\,S^2_{n,MLE} = \frac{1}{n-1}\sum_{i=1}^{n}\left(Y_i - \bar{Y}_n\right)^2.$$
Remark: The bias of the ML estimator reveals an important feature of estimation: degrees-of-freedom. Intuitively, with a sample of size $n$ we have $n$ pieces
of information. The degrees-of-freedom equal the number of data points less the
number of constraints already imposed on the data. When we estimate the sample
mean to begin, we have no prior constraints on the data and so have $n$ degrees-of-freedom. Our estimator of the sample variance requires that we estimate the
sample mean first. In estimating the sample mean we have placed one constraint
on the data (used up one piece of information). Why? If you were given $n-1$
of the observations and $\bar{Y}_n$, you could infer the final observation (because the $n$
observations sum to $n\bar{Y}_n$). The bias results from the need to estimate $\mu$; if $\mu$ were
known, then the unbiased estimator would be $\frac{1}{n}\sum_{i=1}^{n}(Y_i - \mu)^2$.
Remark: The downward bias of $S^2_{n,MLE}$ is intuitive because it treats the mean
as known and does not account for the added uncertainty from $\bar{Y}_n$.
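A small simulation (a sketch with an assumed $N(0, 4)$ population) makes the point concrete: the divide-by-$n$ estimator that plugs in $\bar{Y}_n$ is biased downward by roughly $\sigma^2/n$, while the divide-by-$(n-1)$ estimator, and the divide-by-$n$ estimator that uses the known $\mu$, are both unbiased.

import numpy as np

rng = np.random.default_rng(4)
mu, sigma2, n, reps = 0.0, 4.0, 10, 100_000   # illustrative choices

samples = rng.normal(mu, np.sqrt(sigma2), size=(reps, n))
ybar = samples.mean(axis=1, keepdims=True)

s2_mle = ((samples - ybar) ** 2).sum(axis=1) / n          # divide by n, mean estimated
s2_unb = ((samples - ybar) ** 2).sum(axis=1) / (n - 1)    # divide by n - 1, mean estimated
s2_known = ((samples - mu) ** 2).sum(axis=1) / n          # divide by n, mean known

print("E[S2_MLE]  is about", s2_mle.mean())     # about (n-1)/n * sigma2 = 3.6
print("E[S2_n]    is about", s2_unb.mean())     # about sigma2 = 4.0
print("E[known mu] is about", s2_known.mean())  # about sigma2 = 4.0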
Question: Large values of $\sigma^2$ imply a large bias. Therefore, if $S^2_{n,MLE}$ is large,
use an unbiased estimator.
Additional Material
Let $\{W_i\}_{i=1}^{n}$ be a sequence of independent identically distributed random variables for which
$$Cov(W_i, Y_j) = \begin{cases} \sigma_{WY} & \text{if } i = j \\ 0 & \text{if } i \neq j \end{cases}$$
Estimation of covariance.
The population covariance between $W_i$ and $Y_i$ is defined as
$$Cov(W_i, Y_i) = E\left[(W - EW)(Y - EY)\right],$$
which is naturally estimated by the sample analogue
$$S^2_{n,Y,W} = \frac{1}{n-1}\sum_{i=1}^{n}\left(W_i - \bar{W}_n\right)\left(Y_i - \bar{Y}_n\right).$$
Bias: Does $E\left(S^2_{n,Y,W}\right)$ equal $Cov(W_i, Y_i)$?
$$E\left[\frac{1}{n-1}\sum_{i=1}^{n}\left(W_i - \bar{W}_n\right)\left(Y_i - \bar{Y}_n\right)\right] = \frac{1}{n-1}\,E\left[\sum_{i=1}^{n}\left(W_i - \bar{W}_n\right)\left(Y_i - \bar{Y}_n\right)\right].$$
Note
$$\sum_{i=1}^{n}\left(W_i - \bar{W}_n\right)\left(Y_i - \bar{Y}_n\right) = \sum_{i=1}^{n}\left[(W_i - EW) - \left(\bar{W}_n - EW\right)\right]\left[(Y_i - EY) - \left(\bar{Y}_n - EY\right)\right]$$
$$= \sum_{i=1}^{n}(W_i - EW)(Y_i - EY) - n\left(\bar{W}_n - EW\right)\left(\bar{Y}_n - EY\right),$$
where $\left(\bar{W}_n - EW\right)\sum_{i=1}^{n}(Y_i - EY)$ equals $n\left(\bar{W}_n - EW\right)\left(\bar{Y}_n - EY\right)$.
Hence
$$E\left[\frac{1}{n-1}\sum_{i=1}^{n}\left(W_i - \bar{W}_n\right)\left(Y_i - \bar{Y}_n\right)\right] = \frac{1}{n-1}\left[n\,Cov(W_i, Y_i) - Cov(W_i, Y_i)\right] = Cov(W_i, Y_i),$$
where
$$E\left[\left(\bar{W}_n - EW\right)\left(\bar{Y}_n - EY\right)\right] = E\left(\bar{W}_n \bar{Y}_n\right) - EW\,EY$$
$$= \frac{1}{n^2}\sum_{i=1}^{n}\sum_{j=1}^{n}\left[E(W_i Y_j) - EW\,EY\right]$$
$$= \frac{1}{n^2}\sum_{i=1}^{n}\left[Cov(W_i, Y_i) + EW\,EY - EW\,EY\right]$$
$$= \frac{1}{n}\,Cov(W_i, Y_i),$$
where the third equality follows because: if $i = j$, then $E(W_i Y_j) = Cov(W_i, Y_i) + EW\,EY$; while if $i \neq j$, then $E(W_i Y_j) = EW\,EY$.
Remark: Two sample means are estimated, yet the divisor indicates only one
degree-of-freedom is "used up". The estimator requires only that $\sum_{i=1}^{n}(W_i + Y_i) = n\left(\bar{W}_n + \bar{Y}_n\right)$, rather than $\sum_{i=1}^{n} W_i = n\bar{W}_n$ and $\sum_{i=1}^{n} Y_i = n\bar{Y}_n$.
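For illustration (a sketch with an assumed construction, not from the notes): build pairs with a known covariance and check that the $n-1$ divisor gives an unbiased estimate while an $n$ divisor is biased downward.

import numpy as np

rng = np.random.default_rng(5)
n, reps = 8, 200_000   # a small n makes the bias of the n divisor visible

# Pairs with known covariance: W = X + U and Y = X + V, so Cov(W, Y) = Var(X) = 1.
X = rng.normal(size=(reps, n))
W = X + rng.normal(size=(reps, n))
Y = X + rng.normal(size=(reps, n))

cross = (W - W.mean(axis=1, keepdims=True)) * (Y - Y.mean(axis=1, keepdims=True))
cov_nm1 = cross.sum(axis=1) / (n - 1)   # the estimator defined above
cov_n = cross.sum(axis=1) / n           # same numerator, n divisor

print("n - 1 divisor:", cov_nm1.mean())   # about 1.0 (unbiased)
print("n divisor:    ", cov_n.mean())     # about (n-1)/n = 0.875 (biased downward)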
The variance of $S_n^2$ is somewhat complicated. The steps are (see Mood and
Graybill, pages 243-44):
$$\sum_{i=1}^{n}\left(Y_i - \bar{Y}_n\right)^2 \text{ is independent of } \bar{Y}_n,$$
and
$$\frac{1}{\sigma^2}\sum_{i=1}^{n}\left(Y_i - \bar{Y}_n\right)^2 \sim \chi^2_{n-1}, \text{ independent of } \bar{Y}_n - EY,$$
and
$$S_n^2 = \frac{\sigma^2}{n-1}\left[\frac{1}{\sigma^2}\sum_{i=1}^{n}\left(Y_i - \bar{Y}_n\right)^2\right].$$
Because the variance of a random variable that has the $\chi^2_{n-1}$ distribution is $2(n-1)$,
the variance of $S_n^2$ is $\frac{2\sigma^4}{n-1}$.
1),