Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Syracuse University SURFACE Economics Faculty Scholarship Maxwell School of Citizenship and Public Affairs 11-28-2005 Results on the Bias and Inconsistency of Ordinary Least Squares for the Linear Probability Model William C. Horrace Syracuse University, [email protected] Ronald L. Oaxaca University of Arizona Follow this and additional works at: http://surface.syr.edu/ecn Part of the Economics Commons Recommended Citation Horrace, William C. and Oaxaca, Ronald L., "Results on the Bias and Inconsistency of Ordinary Least Squares for the Linear Probability Model" (2005). Economics Faculty Scholarship. Paper 10. http://surface.syr.edu/ecn/10 This Article is brought to you for free and open access by the Maxwell School of Citizenship and Public Affairs at SURFACE. It has been accepted for inclusion in Economics Faculty Scholarship by an authorized administrator of SURFACE. For more information, please contact [email protected]. Economics Letters 90 (2006) 321 – 327 www.elsevier.com/locate/econbase Results on the bias and inconsistency of ordinary least squares for the linear probability model William C. Horrace a,T, Ronald L. Oaxaca b a Department of Economics, Syracuse University, Syracuse, NY 13244, USA and NBER, United States Department of Economics, University of Arizona, Tucson, AZ 85721, USA and IZA, United States b Received 10 January 2005; received in revised form 28 June 2005; accepted 30 August 2005 Available online 28 November 2005 Abstract This note formalizes bias and inconsistency results for ordinary least squares (OLS) on the linear probability model and provides sufficient conditions for unbiasedness and consistency to hold. The conditions suggest that a btrimming estimatorQ may reduce OLS bias. D 2005 Elsevier B.V. All rights reserved. Keywords: Consistency; Unbiased; LPM; OLS JEL classification: C25 1. Introduction Limitations of the Linear Probability Model (LPM) are well-known. OLS estimated probabilities are not bounded on the unit interval, and OLS estimation implies that heteroscedasticity exists. Conventional advice points to probit or logit as the standard remedy, which bound the maximum likelihood estimated probabilities on the unit interval. However, the fact that consistent estimation of the LPM may be difficult does not imply that either probit or logit is the correct specification of the probability model; it may be reasonable to assume that probabilities are generated from bounded linear decision rules. Theoretical rationalizations for the LPM are in Rosenthal (1989) and Heckman and Snyder (1977). T Corresponding author. Tel.: +1 315 443 9061; fax: +1 315 443 1081. E-mail address: [email protected] (W.C. Horrace). 0165-1765/$ - see front matter D 2005 Elsevier B.V. All rights reserved. doi:10.1016/j.econlet.2005.08.024 322 W.C. Horrace, R.L. Oaxaca / Economics Letters 90 (2006) 321–327 Despite the attractiveness of logit and probit for estimating binary dependent variable models, OLS on the LPM is still used. Recent applications include Klaassen and Magnus (2001), Bettis and Fairlie (2001), Lukashin (2000), McGarry (2000), Fairlie and Sundstrom (1999), Reiley (2005), and Currie and Gruber (1996). Empirical rationales for the LPM specification are plentiful. McGarry appeals to ease of interpretation of estimated marginal effects, while Reiley cites a perfect correlation problem associated with the probit model. Fairlie and Sundstrom prefer LPM because it implies a simple expression for the change in unemployment rate between two censuses. Bettis and Farlie choose LPM because of an extremely large sample size and other simplifications implied by it. Lukashin uses the LPM, because it lends itself to a model selection algorithm based on an adaptive gradient criterion. Currie and Gruber state that logit, probit, and OLS are similar for their data and only report LPM results. Other rationales for the OLS on the LPM are complications of probit/logit models in certain contexts. Klaassen and Magnus cite panel data complications in their tennis example and select OLS. OLS is perhaps justified in simultaneous equations/instrumental variable methods. The presence of dummy endogenous regressors is problematic if the DGP is assumed to be probit or logit; these problems were first considered by Heckman (1978). While perhaps less popular than logit and probit, OLS on the LPM model still finds its way into the literature for various reasons. Some well-known LPM theorems are provided in Amemiya (1977). Econometrics textbooks (e.g., Greene, 2000), acknowledge complications leading to biased and inconsistent OLS estimates. Nevertheless, the literature is not clear on the precise conditions when OLS is problematic. This note rigorously lays out these conditions, derives the finite-sample and asymptotic biases of OLS, and provides additional results that highlight the appropriateness or inappropriateness of OLS estimation of the LPM. Finally, we suggest a trimmed sample estimator that could reduce OLS bias. 2. Results Let y i be a discrete random variable, taking on the values 0 or 1. Let x i be a 1 k vector of explanatory variables on Rk , b be a k 1 vector of coefficients, and e i be a random error. Define probabilities over the random variable x i b a R. Prðxi bN1Þ ¼ p; Prðxi ba½0; 1Þ ¼ c Prðxi bb0Þ ¼ q; where p + c + q = 1. Consider a random sample of data: ( y i , x i ); i a N; N = {1, . . . , n}. Define the data partition: jc ¼ fijxi ba½0; 1g; jp ¼ fijxi bN1g; ð1Þ implying PrðiajpÞ ¼ p; Priajc ¼ c; Pr igjc [ jp ¼ q: ð2Þ W.C. Horrace, R.L. Oaxaca / Economics Letters 90 (2006) 321–327 323 The LPM DGP is: yi ¼ 1 for iajp ; ¼ xi b þ ei for iajc ; ¼ 0 otherwise: ð3Þ The conditional probability of y i is: Prðyi ¼ 1jxi ; iajpÞ ¼ 1; Pryi ¼ 1jxi ; iajc ¼ xi b; Pryi ¼ 0jxi ; iajc ¼ 1 xi b; Pr yi ¼ 0jxi ; igjc [ jp ¼ 1: ð4Þ Therefore, y i traces the familiar ramp function on x i b with error process: ei ¼ 0 for iajp ; ¼ yi xi b; i a jc ; ¼ 0 for igjc [ jp ; and probabilities Prðei ¼ 0jxi ; iajp Þ ¼ 1; Prei ¼ 1 xi bjxi ; iajc ¼ xi b; Prei ¼ xi bjxi ; iajc ¼ 1 xi b; Pr ei ¼ 0jxi ; igjc [ jp ¼ 1: ð5Þ OLS proceeds as: yi ¼ xi b þ ui ; iaN ; where u i is a zero-mean random variable, independent of the x i . Notice that the OLS error term, u i , differs from e i : ui ¼ 1 xi b for iajp ; ¼ yi xi b for iajc ; ¼ xi b for igjc [ jp ; with probability function: Prðui ¼ 1 xi bjxi ; iajpÞ ¼ 1; Prui ¼ 1 xi bjxi ; iajc ¼ xi b; Prui ¼ xi bjxi ; iajc ¼ 1 xi b; Pr ui ¼ xi bjxi ; igjc [ jp ¼ 1: ð6Þ The distinction between u i and e i induces problems in OLS. Theorem 1. If c b 1, then Ordinary Least Squares estimation of the Linear Probability Model is generally biased and inconsistent. Proof. Eq. (6) implies: E ðui jxi ; iajpÞ ¼ 1 xi b; E ui jxi ; iajc ¼ 0; E ui jxi ; igjc [ jp ¼ xi b: Therefore, the conditional expectation of the OLS error, u i , is a function of x i with probability (1 c). Hence, OLS is biased and inconsistent, if c b 1. 5 324 W.C. Horrace, R.L. Oaxaca / Economics Letters 90 (2006) 321–327 Hence, only observations i a j g possess mean-zero errors, so OLS with i g j g is problematic. Remark 2. If nc p N, then OLS estimation is biased and inconsistent. That is, if the sample used to estimate b contains any i g j g , then c b 1, so OLS is problematic. Also: Remark 3. If c = 1, then OLS is unbiased and consistent, because p = q = 0, E(u i | x i ) = 0 for all i a N, and: Eðyi jxi Þ ¼ Prðyi ¼ 1jxi Þ ¼ xi b; iaN : Define random variables z i and w i : zi ¼ 1 for iajc ; ¼ 0 otherwise: wi ¼ 1 for iajp ; ¼ 0 otherwise: Hence, Pr(z i = 1) = c and Pr(w i = 1) = p. Alternative representation of Eq. (3) is: yi ¼ wi þ zi xi b þ ui zi ; ð7Þ iaN ; making explicit that u i is not the correct OLS error. Notice, ui zi ¼ 0 for igjc ; ¼ 1 xi b for yi ¼ 1; iajc ; ¼ xi b for yi ¼ 0; iajc ; so the conditional probability function of u i z i is the same as that of e i . Therefore, E(u i z i | x i ) = 0, and Eq. (7) has a zero-mean error, independent of x i . Taking the unconditional mean of Eq. (7): Eðyi Þ ¼ p þ E ðzi xi Þb þ Eðui zi Þ ¼ p þ cEðzi xi jzi ¼ 1Þb þ cE ðzi ui jzi ¼ 1Þ ¼ p þ clxc b; ð8Þ where l xg = E(x i | z i = 1). Eq. (8) will be used in the sequel. The OLS estimator is: " #1 X X bn ¼ b̂ xi Vxi xi Vyi : iaN iaN Substituting Eq. (7): " #1 X X xi Vxi xi Vðwi þ zi xi b þ ui zi Þ: bn ¼ b̂ iaN ð9Þ iaN Partitioning the data by j g and j k, and taking into consideration z i and w i in each regime: " #1 " # X X X X xi Vxi xi Vð0Þ þ xi Vðxi b þ ui Þ þ xi Vð1Þ bn ¼ b̂ " ¼ iaN X iaN #1 " xi Vxi igjc [jp X iajc xi Vxi b þ iajc X iajc xi uV i þ X iajp # xi V : iajp W.C. Horrace, R.L. Oaxaca / Economics Letters 90 (2006) 321–327 325 Hence: E b̂ b n jxi ¼ " X #1 " xi Vxi iaN " ¼ X #1 " xi Vxi ¼ xi xV i b þ X #1 xi Vxi iajc X X xi xV i b þ X " xi Vxi b þ X iaN xi VE ui jxi ; iajc þ X # xi V iajp xi Vð0Þ þ iajc iajc iaN X iajc iajc iaN " X X # xi V iajp #1 xi Vxi X xi Vpb; ð10Þ iajp which is generally biased and asymptotically biased, because c b 1. When c = 1, j g = N, the first term on the RHS is b, the second term is 0, and b̂n is unbiased. The inconsistency of bˆn follows in a similar fashion. Letting C denote the cardinality operator, define denote the probability limit operator as n Y l. n k = C(j k), n g = C(jP g) and n U = n n k n g. Let plim 1 P 1 Q and Q g are finite, (nonAssume plim [n iaN xVx i i ] = Q and plim [nP g iang xVx i i ] = Q g where 1 1 1 P x V] = l V , plim [n singular) positive definite. Assume plim [n k ian i xk iaN xV] i = l xV and plim [n g k P 1 1 V and l xV are finite vectors. Assume plim [n n k] = p and plim [n gn ] = c. iang x Viu i ] = 0, where l xk Then: plim b̂ b n ¼ Q1 Qc bc þ plVxp pb: Even if c and p were known, b̂n could not be bias corrected, yet Eq. (8) seems to imply that if c and p were known, an OLS regression of ( y i p) on (cx i ) might be unbiased. Define transformed OLS estimator: b̂ n* ¼ b " X #1 c2 xi Vxi iaN X cxi Vðyi pÞ: ð11Þ iaN Theorem 4. b̂n* is biased and inconsistent for b. Proof. Eq. (11) implies " #1 " #1 " #1 X X X 1 X 1 X 1 p X * xi Vxi xi Vyi xi Vxi xi p V ¼ b̂ xi xV i xi V: bn ¼ b̂ bn c iaN c iaN c c iaN iaN iaN iaN Hence, " #1 1 p X X * b n jxi xi xV i xi Vpb: E b̂ b n jxi ¼ E b̂ c c iaN iaN 5 326 W.C. Horrace, R.L. Oaxaca / Economics Letters 90 (2006) 321–327 Thus, knowledge of p and c does not ensure an unbiased OLS estimator of b, and the bias will persist asymptotically. Moreover, it does not facilitate consistent estimation. The problem with b̂n and b̂*n is not that c and p are unknown but that j g is unknown. If we knew j g, we could perform OLS only on observations i a j g. Therefore: Remark 5. Sufficient information for unbiased and consistent OLS estimation is knowledge of j g. Also, if j g = N, then: X xi Vxi ¼ iajc X xi xV i ; and X xi V¼ 0: iajp iaN Therefore, Eq. (10) becomes: E b̂ b n jxi ¼ " X #1 xi xV i iaN X iaN " xi Vxi b þ X #1 xi xV i ð0Þ ¼ b; iaN unbiased for j g = N. A similar argument can be made for consistency. If c = 1, then j g = N. Therefore: Remark 6. Without knowledge of j g and j k, a sufficient condition for unbiased OLS when c b 1 is j c = N. j g = N is a weaker sufficient condition than c = 1, but probably unlikely in large samples. For any given random sample, Pr[j g = N] = c n , so lim Pr jc pN ¼ lim ð1 cn Þ ¼ 1: nYl nYl Remark 7. Without knowledge of j g and j k, if c b 1 and j g = N, then as n Y l, j g p N with probability approaching 1, and b̂n is asymptotically biased and inconsistent. Therefore, as N grows, once the first observation x i b g [0, 1] appears, then j g p N and unbiasedness is lost. Oddly, the estimator b̂n could be reliable in small samples yet unreliable in large samples. 3. Conclusions Although it is theoretically possible for OLS on the LPM to yield unbiased estimation, this generally would require fortuitous circumstances. Furthermore, consistency seems to be an exceedingly rare occurrence as one would have to accept extraordinary restrictions on the joint distribution of the regressors. Therefore, OLS is frequently a biased estimator and almost always an inconsistent estimator of the LPM. If we had knowledge of the sets j g and j k, then a consistent estimate of b could be based on the sub-sample i a j g. This is tantamount to removing observations i g j g, suggesting that trimming observations violating the rule ŷ i = x i b̂n a [0, 1] and re-estimating the OLS model (based on the trimmed sample) may reduce finite sample bias. This seems to hold in simulations, but formal proof of this result is left for future research. W.C. Horrace, R.L. Oaxaca / Economics Letters 90 (2006) 321–327 327 Acknowledgements We gratefully acknowledge valuable comments by Seung Ahn, Badi Baltagi, Gordon Dahl, Dan Houser, Price Fishback, Art Gonzalez, Shawn Kantor, Alan Ker, Paul Ruud and Peter Schmidt. Capable research assistance was provided by Nidhi Thakur. References Amemiya, T., 1977. Some theorems in the linear probability model. International Economic Review 18, 645 – 650. Bettis, J.R., Fairlie, R.W., 2001. Explaining ethnic, racial, and immigrant differences in private school attendance. Journal of Urban Economics 50, 26 – 51. Currie, J., Gruber, J., 1996. Health insurance eligibility, utilization of medical care, and child health. Quarterly Journal of Economics 111, 431 – 466. Fairlie, R.W., Sundstrom, W.A., 1999. The emergence, persistence, and recent widening of the racial unemployment gap. Industrial and Labor Relations Review 52, 252 – 270. Greene, W.H., 2000. Econometric Analysis. Prentice-Hall, Upper Saddle River, NJ. Heckman, J.J., 1978. Dummy endogenous variables in a simultaneous equation system. Econometrica 46, 931 – 959. Heckman, J.J., Snyder Jr., J.M., 1977. Linear probability models of the demand for attributes with an empirical application to estimating the preferences of legislators. Rand Journal of Economics 28, S142 – S189. Klaassen, F.J.G.M., Magnus, J.R., 2001. Are points in tennis independent and identically distributed? Evidence from a dynamic binary panel data model. Journal of the American Statistical Association 96, 500 – 509. Lukashin, Y.P., 2000. Econometric analysis of managers’ judgements on the determinants of the financial situation in Russia. Economics of Planning 33, 85 – 101. McGarry, K. 2000, Testing parental altruism: Implications of a dynamic model,Q NBER Working Paper 7593. Reiley, D.H., 2005, Field experiments on the effects of reserve prices in auctions: More magic on the internet, mimeo, University of Arizona. Rosenthal, R.W., 1989. A bounded-rationality approach to the study of noncooperative games. International Journal of Game Theory 18, 273 – 292.