Economics 140A
Properties of Estimators

Our first lecture reviewed basic concepts from probability. Economic theory yields models that describe the evolution of economic variables. The evolution depends on the value of the parameters that characterize the model. If the parameters are known, and if the distribution of the driving process (or error) is known, then the evolution is derived using the tools of probability. Because the parameters of a model are never known, such an approach is infeasible. Instead, we use the data to infer the parameters; that is, we estimate the parameters.

In studying the evolution of economic variables, we typically focus on one of three tasks: measurement, testing, or forecasting. If all of the parameters of the model were known, then measurement of an effect would be straightforward (and equal to a function of the known parameters), testing would be moot (we would know the parameter value and would not need to conjecture as to whether or not it equalled some specified constant), and forecasting would be flawless except for the unknown future values of the driving process. With unknown parameters replaced by estimates, measurement is uncertain, testing is important, and forecasting is more flawed. Clearly the accuracy with which we perform each task depends on the accuracy of our estimator, and so we turn to a discussion of how to evaluate estimators.

An estimator is a function of the data: $\bar{Y} = \frac{1}{n}\sum_{i=1}^{n} Y_i$. An estimate is a specific value obtained: $\bar{y}$.

Remark: We distinguish between random variables, which are denoted with upper case, and the values random variables may take, which are denoted with lower case. An estimator is a random variable.

Let $A$ be an estimator of the parameter $\theta$. We study features of the distribution of $A$. We then turn to a property that is not a feature of the sampling distribution. The distribution of $A$ is obtained by constructing innumerable samples and plotting the estimate from each sample.
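The "innumerable samples" idea can be sketched by simulation. The following is an illustrative Monte Carlo (the parameter values and variable names are assumptions chosen for the sketch, not taken from the notes): draw many samples, compute the sample mean of each, and inspect the resulting sampling distribution.

```python
import numpy as np

# Illustrative sketch of a sampling distribution: each row below is one
# sample of size n, and each row's mean is one estimate. The parameter
# values (mu, sigma, n) are assumptions chosen for demonstration.
rng = np.random.default_rng(0)
mu, sigma, n, n_samples = 2.0, 3.0, 50, 100_000

samples = rng.normal(mu, sigma, size=(n_samples, n))
estimates = samples.mean(axis=1)  # one estimate per simulated sample

print(estimates.mean())  # close to mu = 2.0
print(estimates.std())   # close to sigma / sqrt(n) = 3 / sqrt(50)
```

The spread of `estimates` previews the efficiency discussion below: the sampling distribution of the mean tightens as $n$ grows.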
We begin with the location of the distribution of $A$.

Definition: The estimator $A$ is unbiased if the expected value of $A$ equals $\theta$:

$$E(A) = \theta.$$

The bias of $A$ measures the departure of $E(A)$ from $\theta$:

$$\mathrm{Bias}(A) = E(A) - \theta.$$

Remark 1: If an estimator is unbiased, the process of drawing repeated samples, obtaining an estimate from each sample, and averaging the estimates will yield a value that is likely close to the true parameter value. If, on average, the value of the estimator is less than $\theta$, we say the estimator is biased downward. Other things equal, an unbiased estimator is likely to yield a more accurate indicator of the true parameter value (because rarely is the magnitude of the bias for a biased estimator known).

Remark 2: The above definition is most accurately termed mean unbiased. As we know from our previous lecture, there are other definitions of location. Corresponding to the median, an estimator is median unbiased if the median of the estimator equals the true parameter value. With a unimodal distribution, one could have a modally unbiased estimator.

We next study dispersion of the distribution of $A$.

Definition. The estimator $A$ is the efficient unbiased estimator if, for any sample size, the variance of $A$ is smaller than the variance of any other unbiased estimator.

Remark: Given two unbiased estimators, we prefer the estimator with the smaller variance, because the distribution of such an estimator is more tightly clustered around the true parameter value. When comparing two estimators we speak of relative efficiency: if $A_1$ and $A_2$ are two unbiased estimators of $\theta$, then $A_1$ is efficient relative to $A_2$ if $\mathrm{Var}(A_1) < \mathrm{Var}(A_2)$. (Diagram)

Remark: If the degree of bias of an estimator is known, then it makes sense to define efficiency for two estimators with the same known bias, even if the bias is not zero. If the degree of bias is unknown, such a statement may not be useful, as it is difficult to tell which estimator one prefers.

Location and dispersion may also be combined.
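Relative efficiency can be illustrated by simulation. A standard example (an addition for illustration, not from the notes): for normal data, both the sample mean and the sample median are unbiased estimators of the population mean, but the mean has the smaller variance.

```python
import numpy as np

# Illustrative sketch: for N(mu, sigma^2) data, the sample mean and sample
# median are both unbiased for mu, but the mean is relatively efficient
# (smaller variance). Parameter values here are assumptions.
rng = np.random.default_rng(1)
mu, sigma, n, reps = 0.0, 1.0, 101, 50_000

data = rng.normal(mu, sigma, size=(reps, n))
means = data.mean(axis=1)
medians = np.median(data, axis=1)

# Both sampling distributions are centered near mu; the mean's is tighter.
print(means.var() < medians.var())  # True
```

For large $n$ the variance ratio approaches $\pi/2 \approx 1.57$, a classical result for the median under normality.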
Definition. The mean square error (MSE) of the estimator $A$ is

$$\mathrm{MSE}(A) = E(A - \theta)^2 = \mathrm{Var}(A) + [\mathrm{Bias}(A)]^2.$$

Derivation:

$$E(A - \theta)^2 = E\left[(A - EA) - (\theta - EA)\right]^2 = \mathrm{Var}(A) + [\mathrm{Bias}(A)]^2 - 2(\theta - EA)\,E(A - EA),$$

where $E(A - EA)$ equals 0.

Remark: The mean square error is especially useful for comparing two estimators with unequal bias. With unequal bias, we need a measure that combines bias and variance. (Diagram)

Remark: For an unbiased estimator, the MSE equals the variance.

Bias and efficiency relate to location and dispersion of the distribution of an estimator. Consistency relates to the behavior of the estimator itself.

Definition. An estimator is consistent if it equals the true parameter value for an arbitrarily large sample.

Remark: As the sample size grows, it becomes increasingly likely that an estimator lies in some neighborhood of the true parameter value. For consistency, the probability that the estimator lies in a neighborhood, for any given neighborhood of the true parameter value, must approach 1 as the sample size approaches infinity.

Remark: For sampling from a finite population, we will certainly learn the true parameter value if the sample is the entire population. The idea of consistency is to apply to populations that are infinite.

Remark: Consistency does not imply the bias goes to zero as the sample size tends to infinity. Consistency is a property of an estimator for one sample as the sample size gets large. Bias is a property of an estimator for one sample size as the number of samples gets large.

Examples

Let $\{Y_i\}_{i=1}^{n}$ be a sequence of independent identically distributed $N(\mu, \sigma^2)$ random variables.

Estimation of $\mu$. The estimator most familiar to you is the sample mean

$$\bar{Y}_n = \frac{1}{n}\sum_{i=1}^{n} Y_i.$$

(Note: we have replaced the population mean with an empirical analogue, in which we assign equal probability to each observation.)

Bias: Does $E\bar{Y}_n$ equal $\mu$?

$$E\left(\frac{1}{n}\sum_{i=1}^{n} Y_i\right) = \frac{1}{n}\sum_{i=1}^{n} EY_i = \mu.$$

Efficiency: Is $\bar{Y}_n$ the efficient unbiased estimator of $\mu$? Yes.
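The MSE decomposition can be checked numerically. The sketch below (an illustration, not from the notes) uses a deliberately biased estimator, $A = 0.9\,\bar{Y}_n$, invented here so that both the variance and bias terms are nonzero:

```python
import numpy as np

# Sketch verifying MSE(A) = Var(A) + Bias(A)^2 by Monte Carlo.
# The shrunken estimator A = 0.9 * sample mean is a made-up example
# chosen so that its bias is nonzero; parameter values are assumptions.
rng = np.random.default_rng(2)
theta, n, reps = 5.0, 20, 200_000

samples = rng.normal(theta, 1.0, size=(reps, n))
a = 0.9 * samples.mean(axis=1)  # realizations of the estimator A

mse = np.mean((a - theta) ** 2)                  # E(A - theta)^2
decomposition = a.var() + (a.mean() - theta) ** 2  # Var(A) + Bias(A)^2

print(abs(mse - decomposition) < 1e-6)  # True
```

Applied to the empirical moments, the identity holds exactly (up to floating-point arithmetic), mirroring the derivation above.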
$$\mathrm{Var}(\bar{Y}_n) = \mathrm{Var}\left(\frac{1}{n}\sum_{i=1}^{n} Y_i\right) = \frac{1}{n^2}\sum_{i=1}^{n} \mathrm{Var}(Y_i) = \frac{\sigma^2}{n}.$$

Remark: Consider using only $m < n$ of the sample. Because $\frac{1}{m} > \frac{1}{n}$, the variance of $\bar{Y}_m$ exceeds the variance of $\bar{Y}_n$. Intuitively, we throw away $n - m$ bits of information when we use $\bar{Y}_m$.

Remark: The sample mean achieves the Cramer-Rao lower bound.

Consistency: The estimator $\bar{Y}_n$ is consistent.

Estimation of $\sigma^2$. Recall the population variance

$$\sigma^2 = E(Y_i - \mu)^2,$$

which is naturally estimated by the sample analogue

$$S_{n,MLE}^2 = \frac{1}{n}\sum_{i=1}^{n}\left(Y_i - \bar{Y}_n\right)^2.$$

Bias: Does $ES_{n,MLE}^2$ equal $\sigma^2$?

$$E\left[\frac{1}{n}\sum_{i=1}^{n}\left(Y_i - \bar{Y}_n\right)^2\right] = \frac{1}{n}E\sum_{i=1}^{n}\left(Y_i - \bar{Y}_n\right)^2.$$

Note

$$\sum_{i=1}^{n}\left(Y_i - \bar{Y}_n\right)^2 = \sum_{i=1}^{n}\left[(Y_i - \mu) - (\bar{Y}_n - \mu)\right]^2 = \sum_{i=1}^{n}(Y_i - \mu)^2 - 2(\bar{Y}_n - \mu)\sum_{i=1}^{n}(Y_i - \mu) + n(\bar{Y}_n - \mu)^2 = \sum_{i=1}^{n}(Y_i - \mu)^2 - n(\bar{Y}_n - \mu)^2,$$

where $\sum_{i=1}^{n}(Y_i - \mu)$ equals $n(\bar{Y}_n - \mu)$.

Hence

$$\frac{1}{n}E\sum_{i=1}^{n}\left(Y_i - \bar{Y}_n\right)^2 = \frac{1}{n}\left[n\,\mathrm{Var}(Y_i) - n\,\mathrm{Var}(\bar{Y}_n)\right] = \sigma^2 - \frac{\sigma^2}{n} = \frac{n-1}{n}\sigma^2.$$

The ML estimator is biased downward:

$$\mathrm{Bias}(S_{n,MLE}^2) = -\frac{1}{n}\sigma^2.$$

An unbiased estimator is

$$S_n^2 = \frac{n}{n-1}S_{n,MLE}^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(Y_i - \bar{Y}_n\right)^2.$$

Remark: The bias of the ML estimator reveals an important feature of estimation: degrees-of-freedom. Intuitively, with a sample of size $n$ we have $n$ pieces of information. The degrees-of-freedom equal the number of data points less the number of constraints already imposed on the data. When we estimate the sample mean to begin, we have no prior constraints on the data and so have $n$ degrees-of-freedom. Our estimator of the sample variance requires that we estimate the sample mean first. In estimating the sample mean we have placed one constraint on the data (used up one piece of information). Why? If you were given $n-1$ of the observations and $\bar{Y}_n$, you could infer the final observation (because the $n$ observations sum to $n\bar{Y}_n$). The bias results from the need to estimate $\mu$; if $\mu$ were known, then the unbiased estimator is $\frac{1}{n}\sum_{i=1}^{n}(Y_i - \mu)^2$.

Remark: The downward bias of $S_{n,MLE}^2$ is intuitive because it treats the mean as known and does not account for the added uncertainty from $\bar{Y}_n$.

Question: Large values of $\sigma^2$ imply large bias.
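The degrees-of-freedom correction can be seen by simulation. In the sketch below (parameter values are illustrative assumptions), averaging the divide-by-$n$ estimator over many samples lands near $\frac{n-1}{n}\sigma^2$, while the divide-by-$(n-1)$ estimator lands near $\sigma^2$:

```python
import numpy as np

# Sketch of the bias of the ML variance estimator. ddof=0 divides by n
# (biased downward); ddof=1 divides by n-1 (unbiased). The small sample
# size n = 5 makes the (n-1)/n factor easy to see; values are assumptions.
rng = np.random.default_rng(3)
sigma2, n, reps = 4.0, 5, 400_000

samples = rng.normal(0.0, np.sqrt(sigma2), size=(reps, n))
s2_mle = samples.var(axis=1, ddof=0)  # S^2_{n,MLE}, divides by n
s2_unb = samples.var(axis=1, ddof=1)  # S^2_n, divides by n-1

print(s2_mle.mean())  # near (n-1)/n * sigma^2 = 3.2
print(s2_unb.mean())  # near sigma^2 = 4.0
```

With $n = 5$ the downward bias is a full fifth of $\sigma^2$, consistent with $\mathrm{Bias}(S_{n,MLE}^2) = -\sigma^2/n$.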
Therefore, if $S_{n,MLE}^2$ is large, use an unbiased estimator.

Additional Material

Let $\{W_i\}_{i=1}^{n}$ be a sequence of independent identically distributed random variables for which

$$\mathrm{Cov}(W_i, Y_j) = \begin{cases} \mathrm{Cov}(W_i, Y_i) & \text{if } i = j, \\ 0 & \text{if } i \neq j. \end{cases}$$

Estimation of covariance. The population covariance between $W_i$ and $Y_i$ is defined as

$$\mathrm{Cov}(W_i, Y_i) = E\left[(W - EW)(Y - EY)\right],$$

which is naturally estimated by the sample analogue

$$S_{n,Y,W}^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(W_i - \bar{W}_n\right)\left(Y_i - \bar{Y}_n\right).$$

Bias: Does $ES_{n,Y,W}^2$ equal $\mathrm{Cov}(W_i, Y_i)$?

$$E\left[\frac{1}{n-1}\sum_{i=1}^{n}\left(W_i - \bar{W}_n\right)\left(Y_i - \bar{Y}_n\right)\right] = \frac{1}{n-1}E\sum_{i=1}^{n}\left(W_i - \bar{W}_n\right)\left(Y_i - \bar{Y}_n\right).$$

Note

$$\sum_{i=1}^{n}\left(W_i - \bar{W}_n\right)\left(Y_i - \bar{Y}_n\right) = \sum_{i=1}^{n}\left[(W_i - EW) - (\bar{W}_n - EW)\right]\left[(Y_i - EY) - (\bar{Y}_n - EY)\right] = \sum_{i=1}^{n}(W_i - EW)(Y_i - EY) - n(\bar{W}_n - EW)(\bar{Y}_n - EY),$$

where $(\bar{W}_n - EW)\sum_{i=1}^{n}(Y_i - EY)$ equals $n(\bar{W}_n - EW)(\bar{Y}_n - EY)$.

Hence

$$\frac{1}{n-1}E\sum_{i=1}^{n}\left(W_i - \bar{W}_n\right)\left(Y_i - \bar{Y}_n\right) = \frac{1}{n-1}\left[n\,\mathrm{Cov}(W_i, Y_i) - \mathrm{Cov}(W_i, Y_i)\right] = \mathrm{Cov}(W_i, Y_i),$$

where

$$E(\bar{W}_n - EW)(\bar{Y}_n - EY) = \frac{1}{n^2}\sum_{i=1}^{n}\sum_{j=1}^{n}\left[E(W_i Y_j) - EW\,EY\right] = \frac{1}{n^2}\sum_{i=1}^{n}\left[\mathrm{Cov}(W_i, Y_i) + EW\,EY - EW\,EY\right] = \frac{1}{n}\mathrm{Cov}(W_i, Y_i),$$

where the third equality follows because: if $i = j$, then $E(W_i Y_j) = \mathrm{Cov}(W_i, Y_i) + EW\,EY$; while if $i \neq j$, then $E(W_i Y_j) = EW\,EY$.

Remark: Two sample means are estimated, yet the divisor indicates only one degree-of-freedom is "used up". The estimator requires only that $\sum_{i=1}^{n}(W_i + Y_i) = n(\bar{W}_n + \bar{Y}_n)$ rather than $\sum_{i=1}^{n} W_i = n\bar{W}_n$ and $\sum_{i=1}^{n} Y_i = n\bar{Y}_n$.

The variance of $S_n^2$ is somewhat complicated. The steps are (see Mood and Graybill, pages 243-44):

$$\sum_{i=1}^{n}\left(Y_i - \bar{Y}_n\right)^2 \text{ is independent of } \bar{Y}_n$$

and

$$\frac{1}{\sigma^2}\sum_{i=1}^{n}\left(Y_i - \bar{Y}_n\right)^2 \sim \chi_{n-1}^2$$

and

$$S_n^2 = \frac{\sigma^2}{n-1}\left[\frac{1}{\sigma^2}\sum_{i=1}^{n}\left(Y_i - \bar{Y}_n\right)^2\right].$$

Because the variance of a random variable that has a $\chi_{n-1}^2$ distribution is $2(n-1)$, the variance of $S_n^2$ is $\frac{2\sigma^4}{n-1}$.
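The closing result can be checked by simulation. The sketch below (an illustration; $\sigma^2$ and $n$ are assumed values, not from the notes) compares the Monte Carlo variance of $S_n^2$ with the formula $2\sigma^4/(n-1)$ for normal data:

```python
import numpy as np

# Sketch checking Var(S_n^2) = 2*sigma^4/(n-1) for normal data.
# sigma2 = 2 and n = 9 are illustrative choices: the theoretical
# value is then 2*4/8 = 1.0 exactly.
rng = np.random.default_rng(4)
sigma2, n, reps = 2.0, 9, 500_000

samples = rng.normal(0.0, np.sqrt(sigma2), size=(reps, n))
s2 = samples.var(axis=1, ddof=1)  # S_n^2 for each simulated sample

theory = 2 * sigma2**2 / (n - 1)
print(s2.var())   # close to theory = 1.0
print(s2.mean())  # close to sigma2 = 2.0 (unbiasedness again)
```

The simulated variance matches the chi-square-based formula; note the result relies on normality, since it uses the $\chi_{n-1}^2$ distribution of the scaled sum of squares.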