The Generalized Method of Moments

Robert M. Kunst
University of Vienna and Institute for Advanced Studies Vienna
February 2008

Outline: Introduction · The Instrumental Variable Estimator in the Linear Regression Model · GMM in correctly specified models

Based on the book 'Generalized Method of Moments' by Alastair R. Hall (2005), Oxford University Press.

The main motivation

Following the publication of the seminal paper by Lars Peter Hansen in 1982, GMM (the generalized method of moments) has been used increasingly in econometric estimation problems. Some econometrics textbooks have even switched from maximum likelihood (ML) to GMM in their basic introduction to estimation methods.

Why maximum likelihood?

Given a parametric model $f_\theta(x)$ for the statistical distribution of the observed variables and so-called regularity conditions, the ML estimator

\[ \hat\theta = \arg\max_\theta L(\theta|X), \qquad L(\theta|X) = f_\theta(x_1, \ldots, x_n), \]

is consistent and (at least asymptotically) efficient for estimating $\theta$. Why then consider anything else?
Reasons for not using ML

1. Regularity conditions are violated (rare).
2. The researcher does not accept a parametric model frame.
3. The maximization of the likelihood is unattractive and time-consuming.

Why some do not accept parametric model frames

In simple, data-driven models with little a priori information, a strong belief in the parametric model is acceptable. Bayesians regard likelihood models as subjective constructions, so the question of believing in them is void. By contrast, theory-driven researchers believe in some parametric aspects of their models (the parameters of interest) but have little belief in others (error-process parameters). In a 2003 report on the forecasting performance of the Bank of England, Adrian Pagan views modelling as a trade-off between theoretical coherence and empirical coherence.

Pagan's efficiency frontier of econometric modelling

All models on the frontier are 'efficient'. Following a downward movement in the 1980s, empirical economics has been moving upward on the frontier. GMM usage follows this increased emphasis on theory.
A basic example: linear regression

For a likelihood adept, analysis of the linear regression model $y_t = X_t\beta + u_t$ starts from the assumption of i.i.d. $N(0, \sigma^2)$ errors $u_t$. Given $X_t$, $y = (y_1, \ldots, y_n)$ has a Gaussian distribution. Maximization of the likelihood $L(\beta, \sigma^2; y, X)$ yields the familiar OLS estimates

\[ \hat\beta = (X'X)^{-1}X'y, \qquad \hat\sigma^2 = n^{-1}(y - X\hat\beta)'(y - X\hat\beta). \]

Then, consequences of assumption violations are studied.

Methods of moments and OLS

A GMM adept sees the regression model as defined by the population moment conditions

\[ E(u_t) = 0, \qquad E(u_t^2) = \sigma^2, \qquad E(X_t u_t) = 0. \]

These are matched to sample moment conditions for $\hat u_t = y_t - X_t\hat\beta = \hat u_t(\hat\beta)$:

\[ n^{-1}\sum_{t=1}^n \hat u_t = 0, \qquad n^{-1}\sum_{t=1}^n \hat u_t^2 = \hat\sigma^2, \qquad n^{-1}\sum_{t=1}^n X_t \hat u_t = 0. \]

Again, the solutions are the OLS estimates. Conditions on $X$ and $y$ are studied that yield good properties for the OLS estimates.

Methods of moments becomes GMM

In the linear regression, $k+1$ moment conditions yield $k+1$ equations and thus $k+1$ parameter estimates. If there are more moment conditions than parameters to be estimated, the moment equations cannot be solved exactly. This case is called GMM (generalized method of moments). In GMM, the moment conditions are solved approximately.
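The sample moment conditions above can be solved directly; a minimal sketch in Python/NumPy (the simulated data and all variable names are illustrative, not from Hall's text):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=n)])  # constant + one regressor
beta_true = np.array([1.0, 2.0])
y = X @ beta_true + rng.normal(size=n)

# Solve the sample moment condition n^{-1} sum X_t u_t(beta) = 0,
# i.e. X'(y - X beta) = 0, which gives the OLS estimate.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
u_hat = y - X @ beta_hat

# Second moment condition defines the variance estimate.
sigma2_hat = u_hat @ u_hat / n

# The condition n^{-1} sum u_hat_t = 0 holds exactly because X contains a constant.
print(beta_hat, sigma2_hat, u_hat.mean())
```

Because the number of conditions equals the number of parameters, the sample conditions hold exactly at the estimate; this is the just-identified method of moments.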
To this aim, the single condition equations are weighted.

Population moment condition

Definition (1.1 Population moment condition). Let $\theta_0$ be a true unknown vector parameter to be estimated, $v_t$ a vector of random variables, and $f(\cdot)$ a vector of functions. Then, a population moment condition takes the form

\[ E\{f(v_t, \theta_0)\} = 0, \qquad t \in T. \]

Often, $f(\cdot)$ will contain linear functions only; then the problem essentially becomes one of linear regression. In other cases, $f(\cdot)$ may still consist of products of errors and functions of observed variables; then the problem becomes one of non-linear regression. The definition is even more general.

The GMM estimator

Definition (1.2 GMM estimator). The generalized method of moments estimator based on these population moment conditions is the value of $\theta$ that minimizes

\[ Q_n(\theta) = \Big\{n^{-1}\sum_{t=1}^n f(v_t, \theta)\Big\}' W_n \Big\{n^{-1}\sum_{t=1}^n f(v_t, \theta)\Big\}, \]

where $W_n$ is a non-negative definite matrix that usually depends on the data but converges to a constant positive definite matrix as $n \to \infty$.

OLS as GMM revisited

The $k+1$ functions are

\[ f_j(y_t, X_t, \beta, \sigma^2) = X_{jt}\,\hat u_t(\beta), \quad j = 1, \ldots, k, \qquad f_{k+1}(y_t, X_t, \beta, \sigma^2) = \hat u_t(\beta)^2 - \sigma^2. \]
Choosing $W = I_{k+1}$ yields

\[ Q_n(\beta, \sigma^2) = n^{-2}\{\hat u(\beta)\}' X X' \{\hat u(\beta)\} + \big(n^{-1}\hat u(\beta)'\hat u(\beta) - \sigma^2\big)^2, \]

which is minimized for the OLS estimate at $Q_n(\hat\beta, \hat\sigma^2) = 0$.

Hall's Example I: asset pricing

This is a theory-based, somewhat more involved example that is used throughout the book. A representative agent maximizes discounted expected utility

\[ E\Big\{\sum_{k=0}^{\infty} \delta_0^k\, U(c_{t+k}) \,\Big|\, \Omega_t\Big\}. \]

$N$ assets $j$ with maturity $m_j$ are held at prices $p_{j,t}$ and quantities $q_{j,t}$. The budget constraint for consumption $c_t$ and 'saving' is defined by the sum of the payoffs $r_{j,t}$ and wage income $w_t$:

\[ c_t + \sum_{j=1}^N p_{j,t}\, q_{j,t} = \sum_{j=1}^N r_{j,t}\, q_{j,t-m_j} + w_t. \]

The consumption good acts as the numeraire.

Euler's equation for Example I

O.c.s. (= 'one can show') that the optimal consumption path satisfies

\[ E\Big\{\delta_0^{m_j}\, \frac{r_{j,t+m_j}}{p_{j,t}}\, \frac{U'(c_{t+m_j})}{U'(c_t)} \,\Big|\, \Omega_t\Big\} = 1. \]

For the utility function $U(c) = (c^{\gamma_0} - 1)/\gamma_0$, this expression becomes

\[ E\Big\{\delta_0^{m_j}\, \frac{r_{j,t+m_j}}{p_{j,t}}\, \Big(\frac{c_{t+m_j}}{c_t}\Big)^{\gamma_0 - 1} \,\Big|\, \Omega_t\Big\} - 1 = 0 = E\{u_{j,t}(\gamma_0, \delta_0)\,|\,\Omega_t\}, \]

say, which implies $E[E\{u_{j,t}(\gamma_0, \delta_0)\,|\,\Omega_t\}\, z_t] = 0$ for any $z_t \in \Omega_t$.

Characteristics of Hall's Example I

There are just two parameters to be estimated: $\delta$ and $\gamma$. The discount rate $\delta$ is easy to estimate; $\gamma$ is difficult to estimate ('weakly identified').
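The identity-weighted criterion for the regression moment conditions can be verified numerically; a sketch with simulated data (all names illustrative), showing that $Q_n$ vanishes at the OLS estimate:

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 400, 2
X = rng.normal(size=(n, k))
beta_true = np.array([0.5, -1.0])
y = X @ beta_true + rng.normal(size=n)

def Q_n(beta, sigma2):
    """GMM criterion for the k+1 regression moment conditions with W = I."""
    u = y - X @ beta
    g = np.concatenate([X.T @ u / n,            # n^{-1} sum X_t u_t(beta)
                        [u @ u / n - sigma2]])  # n^{-1} sum u_t^2 - sigma^2
    return g @ g

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
u_hat = y - X @ beta_hat
sigma2_hat = u_hat @ u_hat / n

print(Q_n(beta_hat, sigma2_hat))   # essentially zero at the OLS estimate
print(Q_n(beta_true, 1.0))         # positive away from the minimizer
```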
Maximum likelihood would require solving a complicated maximization under distributional assumptions. GMM is straightforward (though not trivial). For just one asset and five instruments $z_t = (1, x_{1,t}, x_{1,t-1}, x_{2,t}, x_{2,t-1})'$, with $x_{1,t} = c_t/c_{t-1}$ and $x_{2,t} = r_t/p_{t-1}$, Hall uses this model as a running example.

Short statistics review

Hall closes his introduction with a review of some useful statistical theorems and definitions.

Definition (1.3 Convergence in probability). The sequence of random variables $\{h_n\}$ converges in probability to the r.v. $h$ iff for all $\epsilon > 0$

\[ \lim_{n\to\infty} P(|h_n - h| < \epsilon) = 1, \]

in symbols $\operatorname{plim} h_n = h$ or $h_n \xrightarrow{p} h$.

Orders in probability

Definition (1.4 Orders in probability).
1. The sequence of r.v. $\{h_n\}$ is said to be of large order in probability $c_n$, in symbols $O_p(c_n)$, if for every $\epsilon > 0$ there are positive $m_\epsilon, n_\epsilon$ such that $P(|h_n|/c_n > m_\epsilon) < \epsilon$ for all $n > n_\epsilon$.
2. The sequence of r.v. $\{h_n\}$ is said to be of small order in probability $c_n$, in symbols $o_p(c_n)$, if $h_n/c_n \xrightarrow{p} 0$.

These definitions extend the common mathematical notation $O(x_n)$ and $o(x_n)$ to random convergence. They may also be used for vectors or matrices. Often, $c_n$ will be a simple function of $n$, such as $n^\alpha$.
Consistency and distributional convergence

Definition (1.5 Consistency of an estimator). Let $\{\hat\theta_n\}$ be a sequence of estimators for the true parameter vector $\theta_0$; then $\hat\theta_n$ is said to be a consistent estimator of $\theta_0$ if $\hat\theta_n \xrightarrow{p} \theta_0$.

Hall does not require 'strong' consistency here.

Definition (1.6 Convergence in distribution). The sequence of r.v. $\{h_n\}$ with distribution functions $\{F_n(c)\}$ converges in distribution to the r.v. $h$ with distribution function $F(c)$, in symbols $h_n \xrightarrow{d} h$, iff for every $\epsilon > 0$ there exists $n_\epsilon$ such that $|F_n(c) - F(c)| < \epsilon$ for $n > n_\epsilon$ at all continuity points $c$.

Slutzky and friends

Lemma (Slutzky's Theorem). Let $h_n$ be a sequence of random vectors that converges in probability to the random vector $h$, and let $f(\cdot)$ be a vector of continuous functions. Then $f(h_n) \xrightarrow{p} f(h)$.

Lemma (Corollary). Let $\{M_n\}$ be a sequence of random matrices that converge in probability to a constant matrix $M$, and $\{h_n\}$ a sequence of vector-valued r.v. that converge in distribution to $N(0, \Sigma)$. Then $M_n h_n \xrightarrow{d} N(0, M\Sigma M')$.

LLN

Lemma (Weak law of large numbers). Assume $v_t$ is a sequence of r.v. with $E v_t = \mu$; then any set of assumptions that imply

\[ n^{-1}\sum_{t=1}^n v_t \xrightarrow{p} \mu \]

is called a weak law of large numbers (WLLN).
Strict stationarity together with some 'regularity conditions' suffices.

CLT

Lemma (Central limit theorem). Assume $v_t$ is a sequence of r.v. with $E v_t = \mu$; then any set of assumptions that imply

\[ n^{-1/2}\sum_{t=1}^n (v_t - \mu) \xrightarrow{d} N(0, \Sigma), \qquad \Sigma = \lim_{n\to\infty} \operatorname{var}\Big\{n^{-1/2}\sum_{t=1}^n (v_t - \mu)\Big\}, \]

is called a central limit theorem (CLT). Strict stationarity together with some 'regularity conditions' suffices.

The Instrumental Variable Estimator in the Linear Regression Model

Consider

\[ y_t = x_t'\theta_0 + u_t, \qquad t = 1, \ldots, n, \]

where $x_t$ collects $p$ explanatory variables and $\theta_0 \in \mathbb{R}^p$. Write $u_t(\theta) = y_t - x_t'\theta$ for the residual, such that $u_t(\theta_0) = u_t$. There is a $q$-vector of observed instruments $z_t$. We will also use the $(n \times q)$-matrix $Z = (z_1, \ldots, z_n)'$ and the $(n \times p)$-matrix $X$.

Assumption (2.1 Strict stationarity). The random vector $v_t = (x_t', z_t', u_t)'$ is a strictly stationary process.

Integrated processes can be handled (just take differences), but breaks etc. are excluded.

Assumption (2.2 Population moment condition). The $q$-vector $z_t$ satisfies $E[z_t u_t(\theta_0)] = 0$.

$\theta_0$ solves the basic problem, but it may not be the only solution.

Assumption (2.3 Identification condition). $\operatorname{rk}\{E(z_t x_t')\} = p$.

There must be at least as many instruments as regressors ($q \ge p$), and these should be correlated with them.
If Assumption 2.3 holds and $q > p$, $\theta_0$ is said to be over-identified. If $q = p$, it is just-identified. If the indicated rank is 'almost' $p - 1$, $\theta_0$ is said to be weakly identified.

The estimator

Given an appropriate weighting matrix $W_n$, the GMM minimand is

\[ Q_n(\theta) = \{n^{-1} u(\theta)' Z\}\, W_n\, \{n^{-1} Z' u(\theta)\}, \]

and the GMM estimate is defined as

\[ \hat\theta_n = \arg\min_{\theta \in \Theta} Q_n(\theta). \]

Some algebraic manipulation yields

\[ \hat\theta_n = (X' Z W_n Z' X)^{-1} X' Z W_n Z' y. \]

Clearly, $X'Z$ must have rank $p$, otherwise the estimate cannot be calculated. Assumption 2.3 is the large-sample counterpart of this condition.

In detail, the customary first-order condition

\[ \frac{\partial Q_n(\theta)}{\partial \theta} = 0 \]

for $\hat\theta$ yields the equation

\[ (n^{-1} X' Z)\, W_n\, \{n^{-1} Z' u(\hat\theta)\} = 0, \]

with its asymptotic counterpart

\[ E(x_t z_t')\, W\, E\{z_t u_t(\theta_0)\} = 0. \]

The fundamental decomposition

Define $F = W^{1/2} E(z_t x_t')$, with $W^{1/2}$ a 'root' of $W$ such that $W = W^{1/2\prime} W^{1/2}$, then re-write the population moment condition as

\[ F' W^{1/2} E\{z_t u_t(\theta_0)\} = 0. \]

Multiplying these $p$ equations from the left with a projection matrix yields the identifying restrictions of GMM:

\[ F (F'F)^{-1} F' W^{1/2} E\{z_t u_t(\theta_0)\} = 0. \]

These are $q$ equations, but only $p$ are linearly independent.
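The closed-form linear GMM estimator can be sketched numerically; a minimal example with one endogenous regressor and two instruments (simulated data, all names illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 2000
theta0 = 1.5

# One endogenous regressor x_t and q = 2 instruments z_t (p = 1).
z = rng.normal(size=(n, 2))
e = rng.normal(size=n)
u = 0.8 * e + rng.normal(size=n)      # error correlated with x through e
x = z @ np.array([1.0, 0.5]) + e      # x is endogenous: Cov(x, u) != 0
y = x * theta0 + u

X, Z = x[:, None], z

def gmm_linear(X, Z, y, W):
    """Closed-form linear GMM: (X'Z W Z'X)^{-1} X'Z W Z'y."""
    XZ = X.T @ Z
    return np.linalg.solve(XZ @ W @ XZ.T, XZ @ W @ (Z.T @ y))

theta_hat = gmm_linear(X, Z, y, np.eye(2))
theta_ols = (x @ y) / (x @ x)         # OLS, inconsistent under endogeneity
print(theta_hat, theta_ols)
```

In this design OLS is biased upward by the correlation between $x$ and $u$, while the instrument-based GMM estimate recovers the true coefficient.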
Conversely, the equation system

\[ \{I_q - F (F'F)^{-1} F'\}\, W^{1/2} E\{z_t u_t(\theta_0)\} = 0 \]

has $q$ equations, but the rank of the relevant matrix is only $q - p$. This other system comprises the overidentifying restrictions of GMM.

Because the GMM estimator is defined in sample counterparts to population moment conditions, this decomposition is also valid in a finite-sample version. The identifying restrictions hold exactly: it is always possible to impose $p$ restrictions if $p$ parameters are to be estimated. Define

\[ F_n' = n^{-1} X' Z\, W_n^{1/2\prime}. \]

The equation

\[ F_n (F_n' F_n)^{-1} F_n' W_n^{1/2} Z' u(\hat\theta) = 0 \]

holds for finite $n$ by definition. The overidentifying restrictions will usually not hold exactly for the finite-sample estimate $\hat\theta$. The property that they should hold as $n \to \infty$ is the basis for customary GMM specification tests.

Asymptotic properties of the GMM estimator

Like any good econometric estimator, the GMM estimator should be consistent ($\hat\theta_n \xrightarrow{p} \theta_0$) and asymptotically normal ($n^{1/2}(\hat\theta_n - \theta_0) \xrightarrow{d} N(0, V)$). Impose the following assumption:

Assumption (2.4 Independence). The vector $v_t = (x_t', z_t', u_t)'$ is independent of $v_{t+s}$ for all $s \ne 0$.

Explanatory variables, instruments, and errors taken together form an i.i.d. process. This defines a static model. It may be convenient to relax this assumption later.
Consistency is straightforward:

\[ \hat\theta_n = (X'ZW_nZ'X)^{-1}X'ZW_nZ'y = \theta_0 + \{(n^{-1}X'Z)W_n(n^{-1}Z'X)\}^{-1}(n^{-1}X'Z)W_n(n^{-1}Z'u) \]
\[ \xrightarrow{p}\ \theta_0 + \{E(x_tz_t')\,W\,E(z_tx_t')\}^{-1}E(x_tz_t')\,W\,E(z_tu_t) = \theta_0, \]

using the population moment condition, a LLN, and Slutzky's Theorem.

Asymptotic distribution

Start from the representation

\[ n^{1/2}(\hat\theta_n - \theta_0) = \{(n^{-1}X'Z)W_n(n^{-1}Z'X)\}^{-1}(n^{-1}X'Z)W_n(n^{-1/2}Z'u), \]

where everything converges to constants except for the last term,

\[ n^{-1/2}Z'u = n^{-1/2}\sum_{t=1}^n z_t u_t, \]

which follows a CLT (i.i.d.!), such that $n^{-1/2}Z'u \xrightarrow{d} N(0, S)$. The interesting part is $S$, defined by $S = \operatorname{var}(z_t u_t)$. Hall calls the complicated constant limit matrix $M$ (see above), such that the Corollary of Slutzky's Theorem yields

\[ n^{1/2}(\hat\theta_n - \theta_0) \xrightarrow{d} N(0, MSM'). \]

Estimating S

In order to construct asymptotic confidence intervals or hypothesis tests, $S$ must be estimated. One can show (o.c.s.) that

\[ \hat S_n = n^{-1}\sum_{t=1}^n u_t(\hat\theta_n)^2\, z_t z_t' \xrightarrow{p} S. \]

This is not trivial, as $u_t(\hat\theta_n)$ are not the true errors. Of course, under the so-called 'classical assumptions'

Assumption (2.5 Classical assumptions on errors). $E u_t = 0$, $E u_t^2 = \sigma_0^2$, $u_t$ and $z_t$ independent,

one may use $\hat S_{CIV} = \hat\sigma_n^2\, n^{-1} Z'Z$, with $\hat\sigma_n^2 = n^{-1} u(\hat\theta_n)' u(\hat\theta_n)$.
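The two estimators of $S$ can be sketched side by side; a minimal example with simulated heteroskedastic errors (all names illustrative; in practice the residuals would come from a first-stage GMM fit):

```python
import numpy as np

rng = np.random.default_rng(3)
n, q = 1000, 3
Z = rng.normal(size=(n, q))
u_hat = rng.normal(size=n) * (1.0 + 0.5 * np.abs(Z[:, 0]))  # heteroskedastic residuals

# Robust estimator: S_hat = n^{-1} sum u_t^2 z_t z_t'
S_hat = (Z * u_hat[:, None] ** 2).T @ Z / n

# Classical estimator under Assumption 2.5: sigma2_hat * n^{-1} Z'Z
sigma2_hat = u_hat @ u_hat / n
S_civ = sigma2_hat * (Z.T @ Z) / n

print(np.diag(S_hat))
print(np.diag(S_civ))
```

When the error variance depends on the instruments, as simulated here, the two estimates differ; under Assumption 2.5 they would estimate the same matrix.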
Asymptotic distribution of moments

For technical reasons, one may be interested in the asymptotic distribution of $n^{-1/2} Z' u(\hat\theta_n)$, the sample counterpart to the $E(z_t u_t)$ that is supposed to be zero. Consider the representation in $F$ notation:

\[ \hat\theta_n - \theta_0 = (F_n'F_n)^{-1} F_n' W_n^{1/2}\, n^{-1} Z'u. \]

Therefore,

\[ u(\hat\theta_n) = u - X\hat\theta_n + X\theta_0 = u - X (F_n'F_n)^{-1} F_n' W_n^{1/2}\, n^{-1} Z'u. \]

Thus, the residuals are expressed as a function of the errors. Multiplying this representation by $n^{-1/2} W_n^{1/2} Z'$ yields

\[ n^{-1/2} W_n^{1/2} Z' u(\hat\theta_n) = (I_q - P_n)\, W_n^{1/2}\, n^{-1/2} Z'u, \]

with the short notation $P_n = F_n(F_n'F_n)^{-1}F_n'$, and finally

\[ n^{-1/2} W^{1/2} Z' u(\hat\theta_n) \xrightarrow{d} N(0, NSN'), \]

with a slightly complicated but tractable matrix $N$. This asymptotic property is convenient for developing specification tests.

Optimal choice of the weighting matrix

The asymptotic variance of the GMM estimator is $MSM'$, where

\[ M = \{E(x_t z_t')\, W\, E(z_t x_t')\}^{-1} E(x_t z_t')\, W. \]

O.c.s. (L. P. Hansen) that the variance becomes minimal for $W_0 = S^{-1}$. Then, the asymptotic variance becomes, by straightforward insertion,

\[ V_0 = \{E(x_t z_t')\, S^{-1} E(z_t x_t')\}^{-1}. \]
Two-step GMM

In practice, $S$ must be replaced by a consistent estimator, for example by $\hat S_n$, which however is infeasible as it stands, since it uses a GMM estimate $\hat\theta_n$. The solution is a two-step procedure:

1. Estimate $\theta$ with a simple, not optimal weighting matrix, for example the identity matrix $I$. Estimate $S$ based on the residuals.
2. Estimate $\theta$ again, using the weighting matrix from step 1.

The estimator thus defined is called the 'optimal two-step GMM estimator'. Iterations may continue and then define the 'optimal iterated GMM estimator'. Under the classical assumptions 2.5, the two-step GMM estimator becomes the familiar two-stage least-squares estimator (2SLS).

Several types of mis-specification

Hall distinguishes among three cases:

1. Correct specification. The specified model M is the true model: $E z_t u_t(\theta_0) = 0$ for a unique $\theta_0 \in \Theta$.
2. The true model $M_A$ is not the specified model M, but $\theta_+$ is still a unique solution to the moment conditions. $\theta_+$ is a pseudo-true value.
3. The true model $M_B$ is not M, and there is no $\theta$ such that $E z_t u_t(\theta) = 0$.

In most respects, case 2 is innocuous: $M_A(\theta_+)$ can be treated like $M(\theta_0)$; these models are 'observationally equivalent' w.r.t. the moment conditions.
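The two-step procedure can be sketched for the linear IV model; a minimal example with simulated heteroskedastic errors (all names illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 3000
theta0 = 2.0
z = rng.normal(size=(n, 3))                  # q = 3 instruments, p = 1
e = rng.normal(size=n)
u = (0.7 * e + rng.normal(size=n)) * (1 + 0.5 * np.abs(z[:, 0]))  # heteroskedastic
x = z @ np.array([1.0, 0.6, 0.3]) + e
y = x * theta0 + u

X, Z = x[:, None], z

def gmm_linear(W):
    XZ = X.T @ Z
    return np.linalg.solve(XZ @ W @ XZ.T, XZ @ W @ (Z.T @ y))

# Step 1: identity weighting, then estimate S from the residuals.
theta1 = gmm_linear(np.eye(3))
u1 = y - X @ theta1
S_hat = (Z * u1[:, None] ** 2).T @ Z / n

# Step 2: optimal weighting W = S_hat^{-1}.
theta2 = gmm_linear(np.linalg.inv(S_hat))
print(theta1, theta2)
```

Both steps are consistent; the second step attains the smaller asymptotic variance $V_0$.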
The idea of the J-test

In case 3, the identifying restrictions will hold by definition, but the overidentifying restrictions will be violated even in the limit for large $n$. One may consider the statistic

\[ J_n = n\, Q_n(\hat\theta_n) = n^{-1} u(\hat\theta_n)' Z\, \hat S_n^{-1} Z' u(\hat\theta_n), \]

which, under the $H_0$ of $E z_t u_t(\theta_0) = 0$, converges to a $\chi^2_{q-p}$ distribution. Note that the asymptotic properties of the moment condition are used in deriving the $\chi^2$ limit.

GMM in correctly specified models

A few assumptions define well-behaved models.

Assumption (3.1 Strict stationarity). The $r$-dimensional random vectors $\{v_t, t \in \mathbb{Z}\}$ form a strictly stationary process with sample space $\mathcal{V} \subseteq \mathbb{R}^r$.

Assumption (3.2 Regularity conditions for f). The function $f: \mathcal{V} \times \Theta \to \mathbb{R}^q$ satisfies:
1. $f$ is continuous on $\Theta$ for all $v_t \in \mathcal{V}$;
2. $E f(v_t, \theta) < \infty$ for all $\theta \in \Theta$;
3. $E f(v_t, \theta)$ is continuous on $\Theta$.

The continuity assumption is restrictive and excludes some cases of empirical interest.

Population moment condition and identification

Assumption (3.3 Population moment condition). The r.v. $v_t$ and the parameter $\theta_0 \in \mathbb{R}^p$ satisfy the $q$-vector of population moment conditions $E f(v_t, \theta_0) = 0$.

Assumption (3.4 Global identification). $E f(v_t, \bar\theta) \ne 0$ for all $\bar\theta \in \Theta$ with $\bar\theta \ne \theta_0$.
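Returning to the J-test above, the statistic can be sketched for the linear IV case; a minimal example with a correctly specified simulated model (all names illustrative; the weighting uses the step-1 residual-based $\hat S_n$):

```python
import numpy as np

rng = np.random.default_rng(5)
n, q, p = 2000, 3, 1
z = rng.normal(size=(n, q))
e = rng.normal(size=n)
u = 0.6 * e + rng.normal(size=n)
x = z @ np.array([1.0, 0.5, 0.5]) + e
y = 1.0 * x + u
X, Z = x[:, None], z

def gmm_linear(W):
    XZ = X.T @ Z
    return np.linalg.solve(XZ @ W @ XZ.T, XZ @ W @ (Z.T @ y))

# Two-step estimate with optimal weighting.
theta1 = gmm_linear(np.eye(q))
u1 = y - X @ theta1
S_hat = (Z * u1[:, None] ** 2).T @ Z / n
theta2 = gmm_linear(np.linalg.inv(S_hat))

# J_n = n * g' S^{-1} g, asymptotically chi^2_{q-p} under H0.
u2 = y - X @ theta2
g = Z.T @ u2 / n
J = n * g @ np.linalg.inv(S_hat) @ g

# With df = q - p = 2, the chi-square survival function is exp(-J/2).
p_value = np.exp(-J / 2)
print(J, p_value)
```

Under correct specification the test should not reject; under case-3 misspecification $J_n$ diverges with $n$.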
These two conditions exclude the misspecified case $M_B$ as well as the case of multiple solutions. In some applications, local identification may suffice.

An example for a non-identified model: partial adjustment

In the model

\[ y_t - y_{t-1} = \beta_0 (y_0^* - y_{t-1}) + u_t, \qquad u_t = \rho_0 u_{t-1} + e_t, \]

$y^*$ represents a target or desired level for $y_t$, and $e_t$ is i.i.d. with mean 0. Here $\theta = (\beta, \rho, y^*)'$ and $p = 3$. Assume there are some $q > p$ variables $z_t$ that serve as instruments such that $E z_t e_t(\theta) = 0$, with

\[ e_t(\theta) = y_t - \beta(1-\rho)y^* - (1 + \rho - \beta)y_{t-1} - (\beta - 1)\rho\, y_{t-2}, \]

where the residuals follow from the model definition by transformation. The model is nonlinear in $\theta$. An AR(2) model with a constant is identified and linear and would have parameters $\mu = (\bar\mu, \phi_1, \phi_2)'$, say. O.c.s. that $\theta$ as a function of given $\mu$ has multiple solutions due to a quadratic equation. The model is not globally identified.

Conditions for local identification

Local identification, which may be more relevant in practice, requires that derivatives exist and that for every $\theta$ an entire $\epsilon$-neighborhood is contained in the parameter space.

Assumption (3.5 Regularity condition on $\partial f(v_t,\theta)/\partial\theta$).
1. The derivative matrix $\partial f(v_t, \theta)/\partial\theta'$ exists and is continuous on $\Theta$ for all $v_t \in \mathcal{V}$;
2. $\theta_0$ is an interior point of $\Theta$;
3. $E\,\partial f(v_t, \theta)/\partial\theta' < \infty$.

Assumption (3.6 Local identification). $\operatorname{rk}(E\,\partial f(v_t, \theta_0)/\partial\theta') = p$.

This condition naturally generalizes Assumption 2.3 from the linear model.
The partial adjustment model is locally identified

With $\tilde y_t = (1, y_{t-1}, y_{t-2})'$, the matrix of Assumption 3.6 can be written as

\[ E\,\partial f(v_t, \theta)/\partial\theta' = E(z_t \tilde y_t')\, M(\theta_0), \]

where

\[ M(\theta) = \begin{pmatrix} -(1-\rho)y^* & \beta y^* & -\beta(1-\rho) \\ 1 & -1 & 0 \\ -\rho & -(\beta-1) & 0 \end{pmatrix}. \]

Clearly, in general this matrix is of full rank, and the local identification condition holds.

Technical issues of GMM estimation in nonlinear models

The GMM estimate $\hat\theta$ is the minimum of the function

\[ Q_n(\theta) = \Big\{n^{-1}\sum_{t=1}^n f(v_t, \theta)\Big\}' W_n \Big\{n^{-1}\sum_{t=1}^n f(v_t, \theta)\Big\}. \]

For the large-sample behavior of the weighting matrix, we assume:

Assumption (3.7 Properties of the weighting matrix). $W_n$ is a non-negative definite matrix that converges in probability to the positive definite constant matrix $W$.

Typically, $\hat\theta$ does not have a closed-form solution. It will be obtained via an iterative numerical algorithm.

GMM as a solution to first-order conditions

If derivatives exist (Assumption 3.5), the minimization can rely on $\partial Q_n(\theta)/\partial\theta = 0$, or, in detail,

\[ \Big\{n^{-1}\sum_{t=1}^n \frac{\partial f(v_t, \hat\theta_n)}{\partial\theta'}\Big\}' W_n \Big\{n^{-1}\sum_{t=1}^n f(v_t, \hat\theta_n)\Big\} = 0. \]

This expression has the population counterpart

\[ E\Big\{\frac{\partial f(v_t, \theta_0)}{\partial\theta'}\Big\}' W\, E f(v_t, \theta_0) = 0. \]
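Solving these first-order conditions numerically can be sketched for a simple nonlinear moment condition; the Gauss-Newton-type iteration below is one common choice, and the model, data, and names are purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(6)
n = 2000
theta0 = 0.5
x = rng.normal(size=n)
z = np.column_stack([x, x ** 2])        # q = 2 instruments, p = 1
y = np.exp(theta0 * x) + rng.normal(scale=0.2, size=n)

def g_n(theta):
    """Sample moment g_n(theta) = n^{-1} sum z_t (y_t - exp(theta x_t))."""
    return z.T @ (y - np.exp(theta * x)) / n

def G_n(theta):
    """Derivative of g_n with respect to theta."""
    return z.T @ (-x * np.exp(theta * x)) / n

W = np.eye(2)
theta = 0.0                              # starting value theta_bar(0)
for j in range(100):
    g, G = g_n(theta), G_n(theta)
    step = -(G @ W @ g) / (G @ W @ G)    # Gauss-Newton step on the GMM FOC
    theta += step
    if abs(step) < 1e-10:                # absolute convergence criterion
        break

print(theta)
```

The fixed point of this iteration satisfies the sample first-order condition $G_n(\hat\theta)' W_n\, g_n(\hat\theta) = 0$; varying the starting value, as Hall recommends, guards against convergence to a wrong local minimum.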
Numerical optimization

Numerical minimization has three aspects:

1. the starting value $\bar\theta(0)$;
2. the iterative search method, i.e. how to find $\bar\theta(j+1)$ given $\bar\theta(j)$;
3. the convergence criterion.

Hall recommends varying the starting values as a sensitivity check. Convergence criteria can be absolute ($\|\bar\theta(j+1) - \bar\theta(j)\| < \epsilon$), relative ($\|\bar\theta(j+1) - \bar\theta(j)\| / \|\bar\theta(j)\| < \epsilon$), or a mixture of both. They can check convergence of $\bar\theta$, of the function $Q$, or of a gradient, in particular if gradients of $Q$ are used in the search iterations.

Fundamental decomposition in the nonlinear model

Consider the first-order condition

\[ E\Big\{\frac{\partial f(v_t, \theta_0)}{\partial\theta'}\Big\}' W\, E f(v_t, \theta_0) = 0. \]

Again, this permits a form

\[ F(\theta_0)' W^{1/2} E\{f(v_t, \theta_0)\} = 0, \]

and a decomposition into the identifying restrictions

\[ F(\theta_0)\{F(\theta_0)'F(\theta_0)\}^{-1}F(\theta_0)' W^{1/2} E\{f(v_t, \theta_0)\} = 0, \]

and the overidentifying restrictions.

The overidentifying restrictions

The overidentifying restrictions

\[ [I_q - F(\theta_0)\{F(\theta_0)'F(\theta_0)\}^{-1}F(\theta_0)']\, W^{1/2} E\{f(v_t, \theta_0)\} = 0 \]

are not fulfilled in the sample (in estimation) but only asymptotically, if the model is specified correctly (or misspecified of the $M_A$ type?).
Note that the two projection matrices sum to I, and therefore Q_n(θ̂) shows directly how well the overidentifying restrictions are satisfied. This is the basis for a popular specification test statistic.

Asymptotic properties of the GMM estimator

Assumption (3.8 Ergodicity)
The random process {v_t, t ∈ Z} is ergodic, meaning its sample moments converge to the population moments.

There exist non-ergodic stationary processes, and minimal conditions to exclude them are complex. We simply assume ergodicity.

Assumption (3.9 Compactness)
The parameter space Θ is compact.

No sequence of admissible parameter values should converge to a non-admissible one. No sequence should escape to infinity.

Assumption (3.10 Domination of f)
E sup_{θ∈Θ} ‖f(v_t, θ)‖ < ∞.

Because Θ is now bounded anyway, this assumption excludes local areas and points with undefined expectation.

Consistency of GMM

One can show the following lemma.

Lemma
Assumptions 3.1, 3.2, 3.7–3.10 imply that sup_{θ∈Θ} |Q_n(θ) − Q_0(θ)| →p 0, where Q_0(θ) denotes the population analog of Q_n(θ).

Using this lemma, it is not difficult to show

Theorem (3.1 Consistency of the GMM estimator)
Assumptions 3.1–3.4 and 3.7–3.10 imply that θ̂_n →p θ_0.

Note that this theorem assumes global identification. Q_n converges to Q_0, and its minimum converges to the minimum of Q_0, which is unique by assumption.
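The consistency theorem can be illustrated with a minimal toy simulation (my own example, not from the slides): for the just-identified moment f(v_t, θ) = v_t − θ, the GMM estimate minimizing Q_n is the sample mean, and its error shrinks as n grows.

```python
# Toy illustration of Theorem 3.1: with f(v_t, theta) = v_t - theta, the
# minimizer of Q_n(theta) = (mean(v) - theta)^2 is the sample mean, which
# converges in probability to theta_0 as the sample size grows.
import numpy as np

rng = np.random.default_rng(42)
theta_0 = 3.0
errors = []
for n in (100, 10_000, 1_000_000):
    v = rng.normal(theta_0, 1.0, size=n)
    theta_hat = v.mean()          # argmin of Q_n, here in closed form
    errors.append(abs(theta_hat - theta_0))
print(errors)                     # errors shrink roughly like n^(-1/2)
```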
Kunst [email protected] The generalized method of moments University of Vienna and Institute for Advanced Studies Vienna Introduction The Instrumental Variable Estimator in the Linear Regression Model GMM in correctly specified models Assumptions for the asymptotic normality of GMM Application of a CLT requires some more technical assumptions. Pn −1 Introduce gn (θ) = n t=1 f (vt , θ), the sample analog to Ef (vt , θ). Assumption (3.11 Variance of sample moment) 1 E{f (vt , θ0 )f (vt , θ0 )′ } < ∞; 2 var(n1/2 gn (θ0 )) → S; 3 S is finite and positive definite. This assumption is needed for a CLT: Lemma d Assumptions 3.1,3.3,3.8, and 3.11 imply n1/2 gn (θ0 ) → N(0, S). Robert M. Kunst [email protected] The generalized method of moments University of Vienna and Institute for Advanced Studies Vienna Introduction The Instrumental Variable Estimator in the Linear Regression Model GMM in correctly specified models Now g converges P but so must its derivatives Gn (θ) = n−1 nt=1 ∂f (vt , θ)/∂θ′ . Denote its population analog E∂f (vt , θ0 )/∂θ′ by G0 . Assumption (3.12 Continuity of expected derivatives) E∂f (vt , θ)/∂θ′ is continuous in a neighborhood Nǫ of θ0 . Assumption (3.13 Uniform convergence of Gn (θ)) p supθ∈Nǫ kGn (θ) − E∂f (vt , θ)/∂θ′ k → 0. With these assumptions, Gn (θ̂n ) finally converges to G0 in probability and the desired property follows. Robert M. Kunst [email protected] The generalized method of moments University of Vienna and Institute for Advanced Studies Vienna Introduction The Instrumental Variable Estimator in the Linear Regression Model GMM in correctly specified models Asymptotic normality of GMM Theorem (3.2 Asymptotic normality of parameter estimator) Assumptions 3.1–3.5 and 3.7–3.13 imply d n1/2 (θ̂n − θ0 ) → N(0, MSM ′ ), where M = (G0′ WG0 )−1 G0′ W . Clearly, G0 must be estimated by some finite-sample analog, and so must S. Unfortunately, estimation of S is not trivial. 
Theorem (3.3 Asymptotic normality of sample moments)
Assumptions 3.1–3.5 and 3.7–3.13 imply

W_n^{1/2} n^{1/2} g_n(θ̂_n) →d N(0, N W^{1/2} S W^{1/2} N′),

where N = I_q − F(θ_0) {F(θ_0)′ F(θ_0)}⁻¹ F(θ_0)′.

Estimating S

S is the limit of the variances of cumulative sums

S = lim_{n→∞} var(n^{−1/2} Σ_{t=1}^n f_t),

where (f_t) is stationary but not i.i.d. In time-series analysis, this variance is known as the spectrum at frequency zero, while econometricians call it the long-run variance. In the population, it is simply given by

S = Σ_{j=−∞}^{∞} Γ_j,

where Γ_j are the autocovariance matrices of f_t. Plugging in autocovariance estimates (for f̂_t) and chopping off the sum at some J and −J yields an unattractive estimate.

Nonparametric estimation of S

In time-series analysis, a preferred method for estimating the spectral density is kernel or window estimation. At frequency zero, this amounts to

Ŝ_HAC = Γ̂_0 + Σ_{j=1}^{n−1} ω_{j,n} (Γ̂_j + Γ̂_j′),

with a kernel function ω_{j,n} that downweights autocovariance estimates with large j. Usually, ω_{j,n} = 0 for j > b(n), with the 'bandwidth' b(n) → ∞ as n → ∞. The most common kernel function is the Bartlett or triangular kernel

ω_{j,n} = 1 − j/(b(n)+1) for j ≤ b(n), and ω_{j,n} = 0 otherwise.

What does 'HAC' mean?
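The HAC formula with the Bartlett kernel translates directly into code. The sketch below is a minimal implementation under stated assumptions: the function name, the centering of the moments, and the fixed bandwidth are my own choices, not prescriptions from the slides.

```python
# Minimal Bartlett-kernel (Newey-West) estimate of the long-run variance:
# S_hat = Gamma_hat_0 + sum_{j=1}^{b} w_{j,n} (Gamma_hat_j + Gamma_hat_j'),
# with Bartlett weights w_{j,n} = 1 - j/(b+1).
import numpy as np

def s_hac(f, b):
    """f: (n, q) array of moment observations f_t; b: bandwidth b(n)."""
    n, q = f.shape
    fc = f - f.mean(axis=0)            # center the moments
    def gamma(j):                      # autocovariance matrix Gamma_hat_j
        return fc[j:].T @ fc[:n - j] / n
    S = gamma(0)
    for j in range(1, b + 1):
        w = 1.0 - j / (b + 1.0)        # Bartlett (triangular) weight
        S += w * (gamma(j) + gamma(j).T)
    return S

# Check on an MA(1) process f_t = e_t + 0.5 e_{t-1}, whose long-run
# variance is (1 + 0.5)^2 = 2.25.
rng = np.random.default_rng(1)
e = rng.normal(size=(2000, 1))
f = e + 0.5 * np.vstack([np.zeros((1, 1)), e[:-1]])
S_hat = s_hac(f, b=10)
print(S_hat[0, 0])                     # should be near 2.25
```

In practice the bandwidth b(n) is chosen by a rule of thumb or a data-driven method rather than fixed at 10 as here.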
HAC stands for 'heteroskedasticity and autocorrelation consistent' covariance estimation, a technique developed by Newey and West, not for 'heteroscedastic autocorrelation covariance' (Hall). The HAC method requires spectral estimation at frequency zero, and this is done via kernel estimation. For this reason, some econometricians call the triangular Bartlett window a 'Newey and West window'.

Parametric estimation of S

Alternatively, one may fit a vector autoregression (VAR) to f_t (or rather f̂_t):

f_t = A_1 f_{t−1} + A_2 f_{t−2} + … + A_k f_{t−k} + e_t,

selecting k according to Schwarz' BIC, for example, and estimate S via

Ŝ_VAR = (I_q − Σ_{j=1}^k Â_j)⁻¹ Σ̂(e) {(I_q − Σ_{j=1}^k Â_j)⁻¹}′,

with Σ̂(e) = n⁻¹ ê′ê estimated from the residuals of the VAR. One may even combine parametric and nonparametric stages (Andrews and Monahan).

Optimal choice of the weighting matrix

In analogy to the linear model, the optimal weighting matrix is W = S⁻¹.

Theorem (3.4 Optimal weighting matrix)
Assumptions 3.1–3.5 and 3.7–3.13 imply that the minimum asymptotic variance matrix of θ̂ is (G_0′ S⁻¹ G_0)⁻¹, which is attained with W = S⁻¹.

In practice, S must be estimated. One may start with the weighting W = I, use the resulting θ̂ to obtain Ŝ, and then update θ̂ (two-step GMM), or one may iterate these steps to convergence (iterated GMM). A variant solves for θ̂ and Ŝ jointly, leaving the iterations to the computer (continuously updated GMM).
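The two-step procedure can be sketched for a linear IV model. All data below is simulated for illustration; because the simulated moments are serially uncorrelated, Ŝ is taken as the simple outer product of the fitted moments rather than a HAC or VAR estimate, an assumption this sketch makes explicit.

```python
# Two-step GMM for a linear IV model y = x*beta + u with instruments z:
# step 1 uses W = I; step 2 re-weights with W = S_hat^{-1}. The outer-product
# S_hat is valid here because the simulated moments are serially uncorrelated.
import numpy as np

rng = np.random.default_rng(7)
n, beta_0 = 4000, 2.0
z = rng.normal(size=(n, 2))         # two instruments, one regressor
u = rng.normal(size=n)
x = z @ np.array([1.0, 0.5]) + 0.8 * u + rng.normal(size=n)  # x endogenous
y = x * beta_0 + u

def gmm_step(W):
    """Closed-form minimizer of Q_n for the linear moments z_t(y_t - x_t b)."""
    zx = z.T @ x / n
    zy = z.T @ y / n
    return (zy @ W @ zx) / (zx @ W @ zx)

b1 = gmm_step(np.eye(2))                   # step 1: W = I
f_hat = z * (y - x * b1)[:, None]          # fitted moments f(v_t, b1)
S_hat = f_hat.T @ f_hat / n                # outer-product estimate of S
b2 = gmm_step(np.linalg.inv(S_hat))        # step 2: W = S_hat^{-1}
print(b1, b2)                              # both near beta_0 = 2
```

Iterated GMM would simply repeat the last three lines until b stops changing.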
Kunst [email protected] The generalized method of moments University of Vienna and Institute for Advanced Studies Vienna Introduction The Instrumental Variable Estimator in the Linear Regression Model GMM in correctly specified models A technical independence result Theorem (3.5 Asymptotic independence of estimate and moment) Assume 3.1–3.5 and 3.7–3.13 hold and W = S −1 , then n1/2 (θ̂n − θ0 ) and S −1/2 n1/2 gn (θ̂n ) are asymptotically independent. This theorem guarantees that iterated GMM converges to the exact solution as n → ∞ and thus is comparable to continuous updating. It also guarantees that J–test specification statistics on ‘overidentifying restrictions’ will be independent from the parameter estimates. The details of its proof (in Hall’s book) show that this independence does not hold if W 6= S −1 . Robert M. Kunst [email protected] The generalized method of moments University of Vienna and Institute for Advanced Studies Vienna