* Your assessment is very important for improving the work of artificial intelligence, which forms the content of this project

Survey

* Your assessment is very important for improving the work of artificial intelligence, which forms the content of this project

Document related concepts

Transcript

DEV\ 1 HB31 .M415 working paper department of economics TWO-STEP SERIES ESTIMATION OF SAMPLE SELECTION MODELS Whitney K. Newey No. 99-04 February, 1999 massachusetts institute of technology 50 memorial drive Cambridge, mass. 02139 WORKING PAPER DEPARTMENT OF ECONOMICS TWO-STEP SERIES ESTIMATION OF SAMPLE SELECTION MODELS Whitney K. Newey No. 99-04 February, 1999 MASSACHUSETTS INSTITUTE OF TECHNOLOGY MEMORIAL DRIVE CAMBRIDGE, MASS. 02142 50 TWO STEP SERIES ESTIMATION OF SAMPLE SELECTION MODELS by Whitney K. Newey Department of Economics MIT, E52-262D Cambridge, MA 02139 April 1988 Latest Revision, January 1999 Abstract Sample selection models are important for correcting for the effects of nonrandom sampling in microeconomic data. This note is about semiparametric estimation using a Regression spline and power series approximation to the selection correction term. Consistency and asymptotic normality are shown, as series approximations are considered. well as consistency of an asymptotic variance estimator. JEL Classification: C14, C24 Keywords: Sample selection models, semiparametric estimation, series estimation, two-step estimation. Presented at the 1988 European meeting of the Econometric Society. Helpful comments were provided by D.W.K. Andrews, G. Chamberlain, A. Gregory, J. Ham, J. MacKinnon, D. McFadden, and J. Powell. The NSF and the Sloan Foundation provided financial support. Digitized by the Internet Archive in 2011 with funding from Boston Library Consortium Member Libraries http://www.archive.org/details/twostepseriesestOOnewe 1. Introduction Sample selection models provide an approach to correcting for nonrandom sampling that is important in econometrics. and Heckman (1974). This paper form restricting the functional is Pioneering work in this area includes Gronau (1973) about two-step estimation of these models without The estimators are of the selection correction. particularly simple, using polynomial or spline approximations to correct for selection. Asymptotic normality and consistency of an asymptotic variance estimator are shown. Some of the estimators considered here are similar to two-step least squares estimators with flexible correction terms previously proposed by Lee (1982) and Heckman and Robb (1987). The theory here allows the functional form of the correction to be entirely unknown, with the number of approximating functions size to achieve /^-consistency and asymptotic normality. menu of approximations by considering new types of growing with the sample Also, this paper adds to the power series, along with regression splines that are important in statistical approximation theory (e.g. Stone, 1985). Early work on semiparametric estimation of sample selection models includes Cosslett (1991) and Gallant and Nychka (1987). normality results. These papers do not have asymptotic Powell (1987) and Ahn and Powell (1993) give distribution theory for density weighted kernel estimators. The series estimators analyzed here have the virtue of being extremely easy to implement. the regression splines. Also, some of the estimators are new, including Practical experience with these estimators is given in Newey, Powell, and Walker (1990). Section 2 of the paper presents the model and discusses identification. estimators are described in Section 3, The and Section 4 gives the asymptotic theory. -2- 2. The Model and Identification The selection model model considered here y = x'/3 (2.1) + £, only observed y E[£|w,d=l] = E[£|v(w,a is d = if d e {0, 1, 1>. Prob(d = l|w) = 7r(v(w,a ),d=l], Here the conditional mean of the disturbance, given selection and v = v(w,a the index This restriction ). is )), w, The function i 0), l(v+£; E[£|f] is h A basic implication that is E[y|w,d=l] = x'p (2.2) depends only on implied by other familiar conditions, such as independence of disturbances and regressors, see Powell (1994). of this model x £ w. linear in w, independent of is (£,£;) h (v) = E[?|w,d=l] (v), a selection correction that is (v) + h £, then h £j standard normal CDF and p.d.f. respectively. by Heckman (1976). Equation (2.2) In this is paper we allow if d = where This term $(v) is are the <p[v) the correction term considered to have an h (v) and unknown functional form. an additive semiparametric regression like that considered by Robinson (1988), except that the variable Making use of this information implied by equation For example has a standard normal distribution, and = 0(v)/$(v), (v) familiar. is (2.1), is v = v(w,a depends on unknown parameters. ) important for identification. and regarding would mean that any component of x h that unknown function of variables as an is Ignoring the structure in included in those variables would not be identified. The identification condition for Assumption there is 1: M this paper is = E[d(x-E[x v,d=l])(x-E[x| v,d=l])' | no measurable function f(v) w, such that ] is x'A = nonsingular, f(v) when i.e. for any d = 1. A * This condition was imposed by Cosslett (1991), and the selection model version of is Robinson's (1988) identification condition for additive semiparametric regression. shown by Chamberlain is (1986), this condition is not necessary for identification, but necessary for existence of a (regular) Vn'-consistent estimator. note that this condition does not allow for a constant term in separately identified from h v given x is Var(x) that is from v x. important to because is it not some cases. in A simple has an absolutely continuous component with conditional density that is An obvious necessary condition is Such an exclusion restriction a choice variable and v Identification of is of an estimator of a implied by be excluded v many economic models, where d is includes a price variable for another choice. from equation ji . x. requiring that something in x, in a . order to allow flexibility in the choice Of course, consistency of but different consistent estimators a (2.2) also requires identification of Here no specific assumptions will be imposed, 3. x, is nonsingular and the conditional distribution of not be a linear combination of assumptions. are available 1 positive on the entire real line for almost all that It it (v). More primitive conditions for Assumption sufficient condition As a may correspond a will imply identification of to different identifying For brevity, a menu of different assumptions not discussed here. is Estimation The type of estimator we consider semiparametric estimator least squares regression on selected data. is a two-step estimator, of the selection parameters a x a where the first step and the second step and approximating functions of v = v(x,a) in is a is the These estimators are analogous to Heckman's (1976) two-step procedure for the Gaussian disturbances case. The difference is that a is estimated by a distribution-free method rather than by probit and a nonparametric approximation to -4- h(v) is used in the second step regression rather than the inverse Mills ratio. There are many distribution free estimates that are available for the first step, including those of Manski (1975), Cosslett (1983), and Ruud (1986). Ichimura (1993), and Cavanagh and Sherman (1997). Also, the asymptotic variance of will be an increasing function of the asymptotic variance of may estimator like that of Klein and Spady (1993) that can approximate h monotonic transformation of IK x on and functions of t(v,t)) T). This transformation Let as discussed below. v, v denote some strictly p (x) is = K be a vector of functions with the property that for large (x))' (x),...,p so an efficient a, let depending on parameters v, useful for adjusting the location and scale of (p y To describe the estimator (v). /3 be useful. a linear regression of The second step consists of will estimator of Powell, Stock, and Stoker (1989), like the need to be /n-consistent, The first step K.K. a linear combination of p = (d.,w.,d.y.), z. T) denote an estimator of K superscript for x nn estimator is 11 (3.1) /3 /3 T), 11 M = (i _1 x'(I-Q)y/n, ~~~,s x. = x(v.,t)), and d y nn M P = )', [d, P ,,...,d p 11 nn ]', and Suppose that x. assumed throughout to be n), ..., = v(w.,a), v. i.i.d.. ~K- p. = p For Let where a (x.), x = Q = P(P'P) _1 P' the = x' (I-Q)x/n, will exist in large the coefficient of is 1, suppressed for notational convenience. y = (d,y, ]', = is p. where the inverses estimator can approximate an unknown function of liiri the data are [d,x,,...,d (x) x. samples under conditions discussed below. from the regression of y. on x. and The p. in the selected data. This estimator depends on the choice of approximating functions and transformation. Here we consider two kinds of approximating functions, power series and splines. power series the approximating functions are given by (3.2) pfx) kK = x k_1 . For Depending on the transformation power series can lead this x(v,t)), Three examples are a power series different types of sample selection corrections. the index a nonlinear transformation of v is or in the normal $(•)/$(•), in the inverse Mills ratio v, used (e.g. to several CDF for a power series in $), in When $(•). may be it appropriate to undo a location and scale normalization imposed on most semiparametric estimators of To v(w,a). this end let probit estimation with regressors require that t) = (tj ,t) be the coefficients from )' where we do not impose normality (but (l,v.), be a vri-consi stent of some population parameter). 77 will Then the transformed observations for the three examples will be (3.3a) x. = v., 1 1 (3.3b) X. = 0(TJ +T) v.)/f(T} +T) v.), (3.3c) T. = $(T) 2 The power series V +7) in ) equation (3.3a) will have as a leading term the index The one from equation itself. so that the first term Mills, is (3.3b) will have leading the Heckman term given by the inverse (1976) correction. approximating functions that preserve a shape property of and l(v+^2;0) The (£,£) example last Gaussian are independent of will correspond to a that v, h power series v. (v) This one also has h (v) when that holds goes to zero as in the selection d = v gets large. probability for £•. Replacing power series by corresponding polynomials that are orthogonal with respect to some weight function may help avoid « ~ max. ~ and ,{t(v.,t))} , l^n.d =1 1 1 k that is orthogonal for the uniform weight on = [2x(v.,T))-x -xj/(x -x„). u I 1 u I replacement, since one could replace . x k-1 = x by a i polynomial of order x. For example, for „ x„ = min. ,<t(v.,ti)> I i^n,d =1 1 i at multicollinearity. it is Of course, 6 is is evaluated not affected by such a just a nonsingular linear transformation of the An alternative approximation that [-1,1], power series. better in several respects than power -6- series is splines, that are piecewise polynomials. Splines are less sensitive to outliers and to singularities in the function being approximated. Also, as discussed below, asymptotic normality holds under weaker conditions for splines than power series. For theoretical convenience attention For [-1,1]. b Kb = 0)«b, > is limited to splines with evenly spaced knots on m of degree a spline x in L with evenly spaced can be based on knots on [-1,1] (3.4) xk_1, P kK (x) = = ~ k ~ m+1 l <[x + > m - 2(k-m-l)/(L+D] 1 An alternative, equivalent series that is ) m+2 , == k < m+l+L s K. less subject to multicollinearity problems is B-splines; e.g. see Powell (1981). Fixed, evenly spaced knots convenience. 4, difficult which relies on linear For inference variance of restrictive, and is motivated by theoretical Allowing the knots to be estimated may improve the approximation, but would make computation more Section is it and require substantial modification to the theory of parameter approximations. in important to have a consistent estimator of the asymptotic is This can be formed by treating the approximation as /3. using formulae for parametric two-step estimators such as those of estimator will depend on a consistent estimator Let v'n(a-a-). d.p., e. of residuals from the regression V(/3) H = fir n l [£. "1=11 d.x. on d.p., 2 / u.u .(e.) /n + HV(a)H' ]M l l Define 1111 of (1984). h(v) = p u = (I-Q)x 1111 d.y. (xtv.Tj))' y u'u _1 , l i i is the sum of two terms, the first of -7- which is and d.x. the to be the matrix x' (I-Q)x = so that on n = y. u.[ah(v.)/av]Sv(w.,a)/Sa'/n. "1=1 l l This estimator The of the asymptotic variance of the corresponding residual, and obtained from this regression. h(v) were exact and Newey be the estimates from the regression of jr = d.Cy.-x'.p-p'y) estimate of (3.5) and B V(a) if the White (1980) and let specification robust variance estimator for the second step regression and the second a term that accounts for the first-stage estimation of the parameters of the selection equation. and i It can also be interpreted as the block of a joint variance estimator for corresponding to where the (3, joint estimator is formed as This estimator will be consistent for the asymptotic variance of conditions of Section 4. Newey / v n(/3--3 ) „ asymptotic confidence interval for /3 . is [/3 „ -V(/3) 1/2 . . rather n For example, a 95 percent sample. in the selected (1984). under the Note here the normalization by the total sample size than the number of observations 4. in /3 « ~ « 1/? 1.96/Vn, |3.+V(/3).. 1.96/Vn]. Asymptotic Normality Some regularity conditions normality. The Assumption 2: show consistency and asymptotic first condition is about the first stage estimator. There exists ,i//./Vn + o (1). T. / -1=1 i p -£-> V(a) will be used to E[i/».] = i//(w,d) and 0, such that for E[i//.i//'. l l i//. exists and ] = is v'ntcc-a i/»(w.,d.), nonsingular. ) = Also, for V(a) l = E[0.0'.]. 11 a This condition requires that depends only on w and d. be asymptotically equivalent to a sample average that It many semiparametric estimators satisfied by is of binary choice models, such as that of Klein and Spady (1993). The next condition imposes some moment conditions on the second stage. Assumption d(y-x'p -h 3: (v)), For some E[e 2 5 |v,d=l] > E[dllxll 0, is ] < oo, Var(x|v,d=l) bounded, and for e = bounded. The bounded conditional variance assumptions are standard be very restrictive here because is v will also be -8- in the literature, assumed to be bounded. and will not To control the bias of the estimator conditions on functions of Assumption and s We 4: v. and h (v) some smoothness essential to impose is are continuously differentiate in E[x|v,d=l] v, of orders respectively. t also require that the transformation Assumption 5: There is satisfy / with 77 x v n(T)-7) ) = some properties. x(v(w,a n ),T) the distribution of (1), has an absolutely continuous component with p.d.f. bounded away from zero on which is with respect to t(v,t)) and a Also, the first and second partial derivatives of compact. from zero, which + x , density of v, and are bounded for tj a and in a T) support, and neighborhood of respectively. T) The first condition of = x a, v(w.,a) its ) v, useful for series estimation, but is where means that the density this assumption x and which is is of t. is bounded away For example, restrictive. v if are continuously distributed and independent, then the x x a convolution of the densities of x and , will be everywhere continuous, and hence cannot have density bounded away from zero. useful to weaken this condition, but this would be difficult and is It would be beyond the scope of this paper. The next assumption imposes growth rate conditions for the number of approximating terms. K = K Assumption 6: i K /n 5, and 7 — > " such that 0; Here, splines require the or VriK p (x) smoothing. It is is and a spline with a) m p ^ t-1 minimum smoothness conditions and the rate for the number of terms, with differentiate. > K b) — h (v) (x) s is ^ 3, a power series, and least stringent t in the — > growth only required to be three times continuously also of note that this assumption does not required under- The presence of 4 K /n rate conditions means that smoothness in -9- s 0. can compensate for lack of smoothness E[x|v,d=l] in does not have to go to zero faster than the variance. requirement is h so that the bias of (v), h(v) This absence of an undersmoothing a feature of series estimators of semiparametric regression models that has been previously noted in Donald and Newey (1994). Asymptotic normality of the two-step least squares estimator and consistency of the estimator of its asymptotic covariance matrix follow from the previous conditions. 111111 u. Theorem M~ Q = E[c = d.{x.-E[x.|v.,d.=l]}, 2 (n + 1: If Assumptions HV(a)H')M~\ 2 u.u'. 111 1-6 ], H = E[u.{dh_(v.)/dv.}av(w.,aJ/a<x' and are satisfied and vnCp-^ -^ 1 1 N(0,V((3)), and Q is V(p) ]. i i nonsingular then for -^ Let V(£) = V(J3). This result gives v^n-consistency and asymptotic normality of the series estimators considered in this useful to have a paper, that are useful for large sample inference. way minimizes goodness of of choosing the fit number of functions in practice. It would also be A K that criteria for the selection correction, such as cross-validation on the equation of interest, should satisfy the rate conditions of Assumption 6. In Newey, Powell and Walker (1990) such a criteria was used and gave reasonable results. However, the results of Donald and Newey (1994) and Linton (1995) for the partially linear model suggests that meaning K it may be optimal for estimation of |3 should be larger than the minimum of a goodness of to undersmooth, fit criteria. Such results are beyond the scope of this paper, but remain an important topic for future research. -10- Appendix: Proof of Theorem Throughout the Appendix we Also, different uses. C will denote a positive constant that Y To begin the proof, note that by i from zero, both = I Also, by the density of x. —=-* Y a, max.x. £ boundary points of will be Vn-consistent for the and min.x. 11 and max.|x.-x.| = 1 l 6, Newey as in l ~K transformation of p 111 E[d.p (A.l) K ^ (K)K (x.)p K -> /vrT p = (x.)'] 1/2 K of (x) l l HAH = tr(A'A) (1997) that for (x) max.x.. Therefore, by a location 11 there , K S lld . |x q(K)K~ 0, Now, (1/vri). p 1/2 it it can follows from Assumption a nonsingular linear is such that sup, I, and by bounded away and scale transformation for power series, which will not change the regression, |x.| 0. i l be assumed that in for a bounded and vri-consistency of (1/vrT). and hence so will x., —=-» ] then , p i and min.x. the support of i X and conditioning sets 9v(w,oc)/9a max. |t.-t. bounded, can be different E[Y |X will use repeatedly the result that if sequence of positive random variables St(v,ti)/Sv 1 | p si S+1 -> S (x)/dx s ll C, ^s (K), 0, 1 C, (K) = CK for splines, CK ^ (K) = Since a nonsingular transformation does not change K p = p K . Then, as Newey in mean value theorem, max. IIP. HP'P/n (1997), -P. II < C(K)max. - III p, = |x.-x. 1 for power series. (C | will be convenient to just let it (K)K 1/2 /vrT) = O (C,(K)/vrT), P -^ so that pi ,2 .,1/2. A „.. ^ ,>. ,„,2 P'P/nll s HP-PII/n + IIPIMIP-PII/n = ) (C(K) /n + K CAK)/VR) -?-* 0. 1 P , , 0. , pi ( Also, by the HP'P/n Hence, by J the triangle inequality, HP'P/n (A. 2) It follows, as in where A(A) - III Newey -^ 0. (1997), that A(P'P/n) a C with probability approaching one, denotes the smallest eigenvalue of a symmetric matrix Next, since x(v,7j ) is one-to-one, conditioning on -11- v is A. equivalent to conditioning on so that, for example, h x, can be regarded as a function of (v) - Let n. By tr(x'Qx-2x'Qfi+fi'(j)/n. moment u = = d.E[x. |x.,d.=l], of for x., ,...,jjl [jjl A(F"P/n) ^ A = P(P'P)" and ], 2 So that II(j-/jII /n = and existence of the second idempotent, Q C, = Qx. ju x. 1 1 2 llx'AII = tr(x'AA'x) < (l)tr(x'x/n) = Qx/n) ^ (l)tr(x' P It follows similarly that = llx'AII for (1) O (1). P P A = P(P'P)" 1 Also, . llx' £ (P-P)/nll P IIP-PH/n = llxll (C(K)/Vn) P (A. 3) llx' _1 -^ Q = P(P'P) so that for P', 1 Qx/n < - x'Qx/nll llx' (P-P)A'x/nll + A(P' P-P' P)A' x/nll + llx' < llx'(P-P)/nll(IIA'xll + IIA'xll) + llx'AIIII(P'P-P'P)/nllllA'xll It x'Qu/n follows similarly that 2 (A. 4) ll/j-fill T = For (x - x'Q]Li/n In 1111111 ,x )' D = and (d,,...,d E[u.u. |x.,x .,d.,d i J i J i .] = E[u.E[u .|u.,x.,x i J J ijjjijij E[u.E[u.|x C. .] = i i .] j |x.,x .,d.,d.] = i J Also, by 0. (1). E[u.u.|T,D] = Therefore, 0. i .,d.,d J = tr(u' Qu+n' (I-Q)]j)/n + o (1) J i J Assumption E[u'.u.|T,D] = E[u'.u. |x.,d.] £ 3, 1111 ii Therefore, with probability one, (A.5) It .,d.] |x.,x .,d.,d i 0. by independence of the observations, )', n E[u.|T,D] = E[d.(x.-E[x.|x.,d.=l])|x.,d.] = -^ A(P-P)' x/nl Therefore, 0. /n = tr(x'Qx-2x'Qfi+/j'fi)/n + o 1 l —^> llx' E[uu' |T,D] < CI. follows that Also, by E[tr(u'Qu)/n|T,D] < Ctr(Q)/n = CK/n tr(u'Qu)/n so that -h> 0, -^ 0. Assumption 4 and standard approximation theory results for power series and splines (e.g. see Newey, there exists TI.. 1997 for references), and by such that E[tr((j' (I-Q)(j)]/n (I-Q)P = = E[tr((fi-PIT' )' and (I-Q)(fi-PIT' ))]/n £ K. K. is. K K iiisiiK.1 E[tr((/n-PIT')'(M-Pn'))]/n = E[d.{/i.-n v p is is (x.)}' {/_t.-TT results with equation (A. 4) gives -12- p (x.)}] I-Q idempotent, -^ 0. Combining these 2 (A. 6) ll/J-/j|l /n M This implies that - — u'u/n In e = (e, ,...,£ Next, let E[ee' |W,D] £ that E[llx' (Q-Q)e/Vnll 2 M. > and Q Q |W,D] = tr{x'(Q-Q)E[ee' follows similarly to equation It that (A. 3) | It follows similarly to eq. W are functions of W,D](Q-Q)x}/n (Q-Q)Qx/n x' -^ and and D, Ctr(x' (Q-Q)(Q-Q)x)/n. == x' (Q-Q)Qx/n x' (I-Q)e/Vn = x' (I-Q)e/Vn + o and hence so that llx'tQ-Qje/Vnll -£-* 0, follows by the law of large = [w' ...,w']'. Then, since CI. — M M => In W and )', — u'u/n while 0, > The triangle inequality then gives numbers. (A. 5) -£-» 0. (1). -2-» 0, follows It P Newey as in Donald and x'(I-Q)e/Vn = (A. 7) (1994) that c/Vn u' + o (1). P For both power series and splines follows as it in Newey (1997) that there are and y K. such that for h tt sup (A.8) sup Let h. = ^ H i x I ^j_ all K.1 n = (a', 77')', and (x)'y < (x)| M-CtJ-Mj^Cx) let £ 1 h K CK~ CK S+1 K h (x.), K expressions without the Then , |dh (x)/dx-dh Q x' = h . i K (x.), n. (x)/dx| and v and E[u.li that .] 0i E[u.a(v.)] = h. = M Ri = jxh:.), (x-jl,)' = Sh(x(w.,e n ))/5e' U x'(I-Q)h /n for any function S+1 , ^(xJ, n . ,0] a(v.) = [H,0]. = Let K. Since ax(w,e n )/3T) with finite It follows similarly i i —^-» CK~ (I-QHh-Lj/Vn. 1 = E[u.{dh.(v.)/dv}3v(w.,a )/3a' l == subscript denote corresponding (I-Q)lWn = x' (I-Q)(h-h)/Vn + x(w,e) = x(v(w,a),77), v/n-consistency of K K. l M ——» M (x)'ti j\.n depends only on to K sup , 01 mean-square, = p (x) . = h . (J observations multiplied by selection indicators, e.g. [dJL,,,...,d jL. ]'. i K = h(x.), h. and (x.), I K = p (x)-h |h = h(x.), matrices over (t) E[u.h '. ]. Then by a second-order expansion and 0, -13- 8 Op x'U-QMh-hWn (A. 9) HVn(a-0 = = -[x' (I-Q)ho /n]v^(0-e.) + o 9 + o Op = -E[u.h .Wn(9-eJ + o (1) 8i i (1) (1). p Also, by eq. and (A.8) idempotent, I-Q )' (£-£,. lii-nY [I-QUh-h v )/Vn = (K~ -^-> O. Also, -H-> 0. (<T(K)K~ pi — 0, so that ) and 2 Also, E[llu'Q<= ll that for — u'Qe^/Vn x'(I-Q)h/Vn = (x-£ Combining equations and 11111111 ] = E[u.E[e.|w.,d.]i/»'. = ] -^ 0. n = (1) V. i = A'(y-xjS). (1/Vn), II A' (h-fi) 11 < sup. | = O (K -s+1 " ). = (l)ll(h-h)//nll l (A. 3), IIA'x(J3-|3 and (1/Vn), )ll = (1/Vn). £ (l)llx(/3-0 IIA' (h-P-y^)ll K P Then by (K/n). K. triangle inequality, lly-3f„ll = ((K/n) K. + sup < (1) II Also, = P S S S K |d [p (C (K)[(K/n) s (K (T)'3- 1/2 +K K so that HA' + A' e + A' (h-h) + ) (T)/dT S | S S ]/dT - d h < SU (h-P>., Willi K 2 ell ). Then for S S | |[d p l) = o (1). p -14- K = s 1 A'(h-P^) and or 2, S (T)/dT ]'(y-r < C (K)lly-y s _S+1 = e'AA'e K. P|T|£l (x)/dx Willi = p p |d h(T)/dT -d h |T|£l + ) p S P|T|£l = A'x(p-|3 n y-y p SU E[e'Qe|D,W] ^ CK, it l dx(v.,T})/dv-dx(v.,T))/dv| Similarly to previous results, p (A. 12) theorem and - (De'Qc/n = = limit (1). p dh(v.)/dv = [dh(x.)/dx]dx(v.,7i)/dv, and Similarly to eq. P P + Hip.)/Vn + o 0. follows from the Assumption 5 that (t)'£, ill ,(u.e. ^i=l l K K. we obtain (A. 10), To show the second conclusion, note that h(x) = p ) (Q-Q)e^/Vn = u' is. from the Lindberg-Levy central first conclusion then follows E[u. c.dj'. _S+1 K.K.K.K.K. p The = h-h.,, is. x'(I-Q)(e+h)/Vn = u' c/Vn + HVn(a-a_) + o (A. 11) K (K p /n|T,D] = e'QE[uu' |T,D]Qe.,/n £ Ce'Qe^/n ^ e'e^/n (I-Q)(fi-fL)/V3 )' (A. 9), (A. 7), = The triangle inequality then gives 0. > e -£-> o. ) K ix (A. 4) 1 1 p u' [l-Q)(h-h v -h+hT .)/Vn K. > (A. 10) -B-> ) follows similarly to eq. it S+1 S+1 p is. " " (v/nK R IS. Also, -5-1 (I-QHh-fuJ/Vn = K ll + 0(K K )l -S+1 ) the It max. follows that max. l Also, max. implying -?-> 0, |dh„(T.)/dT-dh_(T.)/dx| i<n -^-» 0. | since the conditions require 1 1 be at least twice differentiate with bounded derivative, h (x) that dh(x.)/dx-dh,Jx.)/dx | i£n i^n i 9v(w.,a)/da, Then, by J boundedness of H = for n ^ tr(u'u/n) dh(v.)/dv-dh^(v.)/dv -^ | 11 ^1=1 l n 2 1/2 liaT(w.,a)/aall /n) max. dh^(x.)/dT-dh^(x.)/dT i^n ^i=l l l 1/2 (r. 0. l l ,u.[5x(w.,a)/aa' ]dh„(v.)/dv, Y. l IIH-HII I | i -^ | 0. l It also follows by eq. H = that for (A. 3) ,u.[aT(w.,a^)/aa' ]dhjv.)/dv, ^i=l l l n IIH-HI Y. l H —!—> H Then since 0. H —^-> H by the law of large numbers, follows by the triangle inequality. Now, let = A. x'. l max. l^n max. |h(x.)-h.(x.)| i Nx.l i^n + 11 max. . II Etle. ||W,D] £ C, = (1), and hence p n Y. Ilu.ll 2 , llu.ll ^i=l | e. |/n = e. | i^n . n . |x'. > (/3-/3J n 1/(2+5),, P P Furthermore, by Assumption 0. 2 |/n W,D] = £." E[ llu.ll | | c. | | 2 llu.ll Therefore, (1). \\Y , 2 n llu.ll |e.|/n)max. A. l i^n i ^i=l 2(J]. I 1 + (Y. | Also, note that | 11 11 /n -^ 0. It the law of large numbers, inequality, 2 /n)max. |a.| i^n l -?h>0. l | 2 ,lle.u.-e.u.ll 2 ,llu.ll ill 2 n 2 Z EIY." Ile.u.-e.u.ll /n |W,D] = E[Y. ,c \\ii.-^.\\ /n W,D] < ^1=1 11 11 ^i=l l l l ^.^Ele^lW.Dlll/l.-iJ.I^/n £ C£."llji.-fi.ll /n n n ^i=l l V(a) n 0. 3, W,D]/n < C£.|\ i ^i=l ^ . P mri (1)0 ,, (1/vn) — 2 2 2 2 n n n 2 2 2 n 2 , ,u.u'.£ /n - £. i u.u .e /nll ^ v llG.II |e -e |/n = V. llu.ll (e.-A.) -e |/n ^1=1 ^i=l i i i ^i=l l l i i i ^i=l i (A.13) y. < I l p l l — |A.| 2 E[£." so that .. i p l max.^ Then by the triangle inequality l (1/vn) = n /n) llx.ll < |h.-h.| max. Also, 0. ..2+5. .1/(2+5)^ „ , ^i=l -^ | i^n l l 1/(2+5). „n ^ < n (). l max. (A.12), eq. h_(x.)-h.(x.) | l^n l ,S 18-R By (/3-/3J + h.-h.. l £._ u.u'.e./n follows that Y. — > e./n ill ,u.u'. ^1=1 Q. A by equation n r. 2 ,u.u'.c ^i=l -^-» l l l /n - Y. n Therefore, 2 ill /n -^ ,u.u'.e ^i=l Etu.u'.c] = Q, ill (A. 6). 0. Then byJ so by the triangle The second conclusion then follows by consistency of and the Slutzky theorem. -15- /n References Ahn, H. and J.L. Powell (1993): "Semiparametric Estimation of Censored Selection Models with a Nonparametric Selection Mechanism," Journal of Econometrics 58, 3-29. Chamberlain, G. (1986): "Asymptotic Efficiency Journal of Econometrics 32, 189-218. in Semiparametric Models with Censoring," (1983): "Distribution-Free Maximum Likelihood Estimator of the Binary Choice Model," Econometrica 51, 765-782. Cosslett, S.R. "Distribution-Free Estimator of a Regression Model With Sample Barnett, J.L. Powell and G. Tauchen, eds., Nonparametric and Semiparametric Methods in Econometrics and Statistics. Cambridge, Cambridge University Cosslett, S.R. (1991): Selectivity," in W.A. Press. Donald, S.G. and W. Newey (1994): "Series Estimation of Semilinear Models," Journal of Multivariate Analysis 50, 30-40. Gallant, A.R. and D.W. Nychka (1987): "Semi-nonparametric Econometrica 55, Maximum Likelihood Estimation," 363-390. Gronau, R. (1973): "The Effects of Children on the Housewife's Value of Time," Journal of Political Heckman, Economy J.J. (1974): 81, S168-S199. "Shadow Prices, Market Wages, and Labor Supply," Econometrica 42, 679-693. J.J. (1976): "The Common Structure of Statistical Models of Truncation, Sample Selection and Limited Dependent Vairables and a Simple Estimator for Such Models," Annals of Economic and Social Measurement 5, 475-492. Heckman, J.J. and R. Robb (1987): "Alternative Mehods for Evaluating the Impact of Interventions," Ch. 4 of Longitudinal Analysis of Labor Market Data, J.J. Heckman and B. Singer eds., Cambridge, UK: Cambridge University Press. Heckman, Ichimura, H. (1993). Estimation of single index models. Journal of Econometrics 58, 71-120. Klein, R.W. and R.S. Spady (1993): Choice Models," Econometrica Lee, L.F. (1995): Econometrica "An Efficient Semiparametric Estimator for Discrete 387-421. "Some Approaches to the Correction of Selectivity Bias," Review of (1982): Economic Studies Linton, 0. 61, "Second Order Approximation 63, Manski, C. (1975): 49, 355-372. in a Partially Linear Regression Model," 1079-1112. "Maximum Score Estimation Journal of Econometrics 3, of the Stochastic Utility Model of Choice," 205-228. Newey, W.K. (1997): "Convergence Rates and Asymptotic Normality for Series Estimators," Journal of Econometrics 79, 147-168. Newey, W.K. and J.L. Powell (1993): "Efficiency Bounds for Semiparametric Selection Models," Journal of Econometrics 58, 169-184. -16- Powell, and J.R. Walker (1990): "Semiparametric Estimation of Selection Results," American Economic Review Papers and Proceedings, May. Empirical Models: Some Newey, W.K., J.L. Powell, J.L. (1994): "Estimation of Semiparametric Models," in R.F. Engle and D. McFadden, eds., Handbook of Econometrics: Volume 4, New York: North-Holland. Powell, J.L., J.H. Stock, and T.M. Stoker (1989). Semiparametric Estimation of Index Coefficients Econometrica 57, 1403-1430. Powell, J.L. (1987): "Semiparametric Estimation of Bivariate Limited Dependent Variable Models," manuscript, University of California, Berkeley. Powell, M.J.D. (1981): Approximation Theory and Methods, Cambridge, UK, Cambridge University Press. Robinson, 931-954. P. (1988): "Root-N-Consistent Semiparametric Regression," Econometrica 56, Ruud, P. A. (1986): "Consistent Estimation of Limited Dependent Variable Models Despite Misspecification of Distribution," Journal of Econometrics 32, 157-187. Stone, C.J. Statistics (1985): 13, "Additive Regression and Other Nonparametric Models, Annals of 689-705. White, H. (1980): "Using Least Squares to Approximate International Economic Review 21, 149-170. 70 7 6 DO -17- Unknown Regression Functions," Date Due MIT LIBRARIES 3 9080 01972 1015 ISWWW8 wmm^mi memili m Hi ^¥iiiiii ^SmSMM ifmwm ^WM ]: 4 liiiiiiiiiiiii iiiiaiii in m"v ; ; : v;'' WMilllgi&i-%^ Ifiltf : :: ;, ". l;!;ll :: Sliiip'i;® 31 ftp : , : :::;;:r; j:^;.:;.•/, l/;,^;/i:(;;:^; ;; i , ::;'^^h:;^^^;::^;; v^:^M!;;;:;i:;: in )nmihii^tm: I:. was ;?>;: : iBmmmma^mMm t mnm MM; illlilliiil 'if. ,.,:- ::;;., : .'.,-:., ;.,,:. jmgmam^ llllililllllllllSIIS