The generalized method of moments
Robert M. Kunst
[email protected]
University of Vienna
and
Institute for Advanced Studies Vienna
February 2008
Based on the book ‘Generalized Method of Moments’ by
Alastair R. Hall (2005), Oxford University Press.
The main motivation
Following the publication of the seminal paper by Lars Peter
Hansen in 1982, GMM (generalized method of moments) has
been used increasingly in econometric estimation problems.
Some econometrics textbooks have even switched from maximum
likelihood (ML) to GMM in their basic introduction to estimation
methods.
Why maximum likelihood?
Given a parametric model fθ(x) for the statistical distribution of the
observed variables and so-called regularity conditions, the ML estimator
θ̂ = argmax_θ L(θ|X),
with the notation L(θ|X) = fθ(x1, …, xn), is consistent and (at
least asymptotically) efficient for estimating θ.
Why then consider anything else?
Reasons for not using ML
1. Regularity conditions are violated (rare).
2. The researcher does not accept a parametric model frame.
3. The maximization of the likelihood is unattractive and time-consuming.
Why some do not accept parametric model frames
In simple, data-driven models with little a priori information, a
strong belief in the parametric model is acceptable.
Bayesians recognize the likelihood models as subjective
constructions. The question of believing in them is void.
By contrast, theory-driven researchers believe in some parametric
aspects of their models (parameters of interest) but have little
belief in others (error process parameters).
In a report on the forecasting performance of the Bank of England
in 2003, Adrian Pagan views modelling as a trade-off between
theoretical coherence and empirical coherence.
Pagan’s efficiency frontier of econometric modelling
All models on the frontier are ‘efficient’. Following a downward
movement in the 1980s, empirical economics has been moving
upward on the frontier. GMM usage is following this increased
emphasis on theory.
A basic example: linear regression
For a likelihood adept, analysis of the linear regression model
yt = Xt β + ut
starts from the assumption that the ut are i.i.d. N(0, σ²). Given Xt,
y = (y1, …, yn)′ has a Gaussian distribution. Maximization of the
likelihood L(β, σ²; y, X) yields the familiar OLS estimates
β̂ = (X′X)⁻¹X′y and σ̂² = n⁻¹(y − Xβ̂)′(y − Xβ̂).
Then, consequences of assumption violations are studied.
Methods of moments and OLS
A GMM adept sees the regression model as defined by the population
moment conditions
E(ut) = 0,  E(ut²) = σ²,  E(Xt ut) = 0.
These are to be matched by the sample moment conditions for
ût = yt − Xt β̂ = ût(β̂):
n⁻¹ Σ_{t=1}^n ût = 0,  n⁻¹ Σ_{t=1}^n ût² = σ̂²,  and  n⁻¹ Σ_{t=1}^n Xt ût = 0.
Again, the solutions are the OLS estimates. Conditions on X and
y are studied that yield good properties for the OLS estimates.
Method of moments becomes GMM
In the linear regression, k + 1 moment conditions yield k + 1
equations and thus k + 1 parameter estimates. If there are more
moment conditions than parameters to be estimated, the moment
equations cannot be solved exactly. This case is called GMM
(generalized method of moments).
In GMM, the moment conditions are solved approximately. To this
end, the individual condition equations are weighted.
Population moment condition
Definition (1.1 Population moment condition)
Let θ0 be a true unknown vector parameter to be estimated, vt a
vector of random variables, and f (.) a vector of functions. Then, a
population moment condition takes the form
E{f(vt, θ0)} = 0,  t ∈ T.
Often, f(.) will contain only linear functions; the problem then
essentially becomes one of linear regression. In other cases, f(.)
may still consist of products of errors and functions of observed
variables; the problem then becomes one of non-linear regression. The
definition is even more general.
The GMM estimator
Definition (1.2 GMM estimator)
The Generalized Method of Moments estimator based on these
population moment conditions is the value of θ that minimizes
Qn(θ) = {n⁻¹ Σ_{t=1}^n f(vt, θ)}′ Wn {n⁻¹ Σ_{t=1}^n f(vt, θ)},
where Wn is a non-negative definite matrix that usually depends
on the data but converges to a constant positive definite matrix as
n → ∞.
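The criterion is easy to code directly. The following is a minimal sketch (not from Hall's book); the moment function f, the data array, and the weighting matrix W are placeholders to be supplied by the user.

```python
import numpy as np

def gmm_objective(theta, f, data, W):
    """GMM criterion Q_n(theta) = g_n(theta)' W g_n(theta),
    with g_n(theta) = n^{-1} sum_t f(v_t, theta)."""
    moments = np.array([f(v_t, theta) for v_t in data])  # (n, q) array of moments
    g_n = moments.mean(axis=0)                           # sample moment vector
    return g_n @ W @ g_n
```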
OLS as GMM revisited
The k + 1 functions are
fj(yt, Xt, β, σ²) = Xjt ût(β),  j = 1, …, k,
f_{k+1}(yt, Xt, β, σ²) = ût(β)² − σ².
Choosing W = I_{k+1} yields
Qn(β, σ²) = n⁻² {û(β)}′XX′{û(β)} + (n⁻¹ û(β)′û(β) − σ²)²,
which is minimized by the OLS estimates, with Qn(β̂, σ̂²) = 0.
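A quick numerical illustration of this point (a sketch with simulated data, not from the book): at the OLS estimates the sample moments, and hence Qn with W = I, vanish up to rounding error.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 200, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])  # regressors incl. constant
beta_true = np.array([1.0, 0.5, -0.3])
y = X @ beta_true + rng.normal(size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)   # OLS estimate
u_hat = y - X @ beta_hat
sigma2_hat = u_hat @ u_hat / n

# stacked sample moments: n^{-1} X'u_hat and n^{-1} sum u_hat^2 - sigma2_hat
g = np.append(X.T @ u_hat / n, u_hat @ u_hat / n - sigma2_hat)
print(np.allclose(g, 0.0))   # True, so Q_n(beta_hat, sigma2_hat) = g'g = 0
```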
Hall’s Example I: Asset Pricing
This is a theory-based and somewhat more involved example that is used
throughout the book. A representative agent maximizes discounted
utility
E{ Σ_{k=0}^∞ δ0^k U(ct+k) | Ωt }.
N assets j with maturity mj are held at prices pj,t and quantities
qj,t. The budget constraint for consumption ct and ‘saving’ is
defined by the sum of the payoffs rj,t and wage income wt,
ct + Σ_{j=1}^N pj,t qj,t = Σ_{j=1}^N rj,t qj,t−mj + wt.
The consumption good acts as the numeraire.
Euler’s equation for Example I
O.c.s. (=‘one can show’) that the optimal consumption path satisfies
E{ δ0^{mj} (rj,t+mj / pj,t) U′(ct+mj) / U′(ct) | Ωt } = 1.
For the utility function U(c) = (c^{γ0} − 1)/γ0, this expression becomes
E{ δ0^{mj} (rj,t+mj / pj,t) (ct+mj / ct)^{γ0−1} | Ωt } − 1 = 0 = E{uj,t(γ0, δ0) | Ωt},
say, which implies
E[ E{uj,t(γ0, δ0) | Ωt} zt ] = 0
for any zt ∈ Ωt.
Characteristics of Hall’s Example I
There are just two parameters to be estimated: δ and γ. The
discount factor δ is easy to estimate; γ is difficult to estimate
(‘weakly identified’).
Maximum likelihood would require solving a complicated
maximization under distributional assumptions. GMM is
straightforward (though not trivial).
For just one asset and five instruments
zt = (1, x1,t, x1,t−1, x2,t, x2,t−1)′ with x1,t = ct/ct−1 and
x2,t = rt/pt−1, Hall uses this model as a running example.
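As a rough sketch of how the resulting moment function could be coded (hypothetical array names, and the single asset is assumed to have maturity one, so that the gross return is rt+1/pt): the scalar pricing error uj,t(γ, δ) is interacted with each instrument in zt.

```python
import numpy as np

def euler_moments(theta, c_growth, gross_return, z):
    """Moments for the single-asset consumption-based example:
    u_t(gamma, delta) = delta * R_{t+1} * (c_{t+1}/c_t)^(gamma - 1) - 1,
    interacted with the instrument vector z_t."""
    gamma, delta = theta
    u = delta * gross_return * c_growth ** (gamma - 1.0) - 1.0  # (n,) pricing errors
    return z * u[:, None]                                       # (n, q) moment array
```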
Short statistics review
Hall closes his introduction with a review of some useful
statistical theorems and definitions.
Definition (1.3 Convergence in probability)
The sequence of random variables {hn} converges in probability to
the r.v. h iff for all ε > 0
lim_{n→∞} P(|hn − h| < ε) = 1,
in symbols plim hn = h or hn →ᵖ h.
Orders in probability
Definition (1.4 Orders in probability)
1. The sequence of r.v. {hn} is said to be of large order in
probability cn, in symbols Op(cn), if for every ε > 0 there are
positive mε, nε such that P(|hn|/cn > mε) < ε for all n > nε.
2. The sequence of r.v. {hn} is said to be of small order in
probability cn, in symbols op(cn), if hn/cn →ᵖ 0.
These definitions extend the common mathematical notation
O(xn) and o(xn) to random convergence. They may also be used
for vectors or matrices. Often, cn will be a simple function of n,
such as n^α.
Consistency and distributional convergence
Definition (1.5 Consistency of an estimator)
Let {θ̂n} be a sequence of estimators for the true parameter vector
θ0; then θ̂n is said to be a consistent estimator of θ0 if θ̂n →ᵖ θ0.
Hall does not require ‘strong’ consistency here.
Definition (1.6 Convergence in distribution)
The sequence of r.v. {hn} with distribution functions {Fn(c)}
converges in distribution to the r.v. h with distribution function
F(c), in symbols hn →ᵈ h, iff for every ε > 0 there exists nε such
that |Fn(c) − F(c)| < ε for n > nε at all continuity points c of F.
Slutzky and friends
Lemma (Slutzky’s Theorem)
Let hn be a sequence of random vectors that converges in
probability to the random vector h, and let f(.) be a vector of
continuous functions; then f(hn) →ᵖ f(h).
Lemma (Corollary)
Let {Mn } be a sequence of random matrices that converge in
probability to a constant matrix M, and {hn } be a sequence of
vector-valued r.v. that converges in distribution to N(0, Σ). Then
Mn hn →ᵈ N(0, MΣM′).
LLN
Lemma (Weak law of large numbers)
Assume vt is a sequence of r.v. with Evt = µ; then any set of
assumptions that implies
n⁻¹ Σ_{t=1}^n vt →ᵖ µ
is called a weak law of large numbers (WLLN). Strict stationarity
together with some ‘regularity conditions’ suffices.
CLT
Lemma (Central limit theorem)
Assume vt is a sequence of r.v. with Evt = µ; then any set of
assumptions that implies
n^{−1/2} Σ_{t=1}^n (vt − µ) →ᵈ N(0, Σ),
with
Σ = lim_{n→∞} var{ n^{−1/2} Σ_{t=1}^n (vt − µ) },
is called a central limit theorem (CLT). Strict stationarity together
with some ‘regularity conditions’ suffices.
The Instrumental Variable Estimator in the Linear Regression Model
Consider
yt = xt′ θ0 + ut,  t = 1, …, n,
where xt collects p explanatory variables and θ0 ∈ Rp . Write
ut (θ) = yt − xt′ θ for the residual such that ut (θ0 ) = ut . There is a
q–vector of observed instruments zt . We will also use the
(n × q)–matrix Z = (z1 , . . . , zn )′ and the (n × p)–matrix X .
Assumption (2.1 Strict stationarity)
The random vector vt = (xt′ , zt′ , ut )′ is a strictly stationary process.
Integrated processes can be handled (just take differences) but
breaks etc. are excluded.
Assumption (2.2 Population moment condition)
The q–vector zt satisfies E[zt ut (θ0 )] = 0.
θ0 solves the basic problem but it may not be the only solution.
Assumption (2.3 Identification condition)
rk{E(zt xt′ )} = p
There must be at least as many instruments as regressors (q ≥ p),
and the instruments should be correlated with the regressors. If
assumption 2.3 holds and q > p, θ0 is said to be over-identified; if
q = p, it is just-identified. If E(zt xt′) is close to a matrix of rank
p − 1, so that the rank condition almost fails, θ0 is said to be
weakly identified.
The estimator
Given an appropriate weighting matrix Wn , the GMM minimand is
Qn (θ) = {n−1 u(θ)′ Z }Wn {n−1 Z ′ u(θ)},
and the GMM estimate is defined as
θ̂n = arg min_{θ∈Θ} Qn(θ).
Some algebraic manipulation yields
θ̂n = (X ′ ZWn Z ′ X )−1 X ′ ZWn Z ′ y .
Clearly, X ′ Z must have rank p, otherwise the estimate cannot be
calculated. Assumption 2.3 is the large-sample counterpart of this
condition.
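The closed form translates directly into code. A minimal sketch (assuming y, X, Z are numpy arrays of conformable dimensions and W is a q × q weighting matrix):

```python
import numpy as np

def linear_gmm(y, X, Z, W):
    """Closed-form linear IV/GMM estimator
    theta_hat = (X'Z W Z'X)^{-1} X'Z W Z'y."""
    XZ = X.T @ Z                      # (p, q)
    A = XZ @ W @ XZ.T                 # (p, p), must be invertible (rank condition)
    b = XZ @ W @ (Z.T @ y)            # (p,)
    return np.linalg.solve(A, b)
```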
In detail, the customary first-order condition
∂Qn(θ)/∂θ = 0
for θ̂ yields the equation
(n⁻¹ X′Z) Wn {n⁻¹ Z′u(θ̂)} = 0,
with its asymptotic counterpart
E(xt zt′) W E{zt ut(θ0)} = 0.
The fundamental decomposition
Define F = W^{1/2} E(zt xt′), with W^{1/2} a ‘root’ of W such that
W = W^{1/2}′ W^{1/2}; then re-write the population moment condition as
F′ W^{1/2} E{zt ut(θ0)} = 0.
Multiplying these p equations from the left with a projection matrix
yields the identifying restrictions of GMM:
F(F′F)⁻¹ F′ W^{1/2} E{zt ut(θ0)} = 0.
These are q equations, but only p of them are linearly independent.
Conversely, the equation system
{Iq − F(F′F)⁻¹F′} W^{1/2} E{zt ut(θ0)} = 0
has q equations, but the rank of the relevant matrix is only q − p.
This second system comprises the overidentifying restrictions of
GMM.
Because the GMM estimator is defined via the sample counterparts of
the population moment conditions, this decomposition is also valid in a
finite-sample version.
The identifying restrictions hold exactly: it is always possible to
impose p restrictions if p parameters are to be estimated. Define
Fn′ = (n⁻¹ X′Z) Wn^{1/2}′. The equation
Fn (Fn′ Fn)⁻¹ Fn′ Wn^{1/2} Z′u(θ̂) = 0
holds for finite n by definition. The overidentifying restrictions will
usually not hold exactly for the finite-sample estimate θ̂. The
property that they should hold as n → ∞ is the basis for
customary GMM specification tests.
Asymptotic properties of the GMM estimator
Like any good econometric estimator, the GMM estimator should
be consistent (θ̂n →ᵖ θ0) and asymptotically normal
(n^{1/2}(θ̂n − θ0) →ᵈ N(0, V)). Impose the following
assumption:
Assumption (2.4 Independence)
The vector vt = (xt′, zt′, ut)′ is independent of vt+s for all s ≠ 0.
Explanatory variables, instruments, and errors taken together form
an iid process. This defines a static model. It may be convenient
to relax this assumption later.
Consistency is straightforward:
θ̂n = (X′Z Wn Z′X)⁻¹ X′Z Wn Z′y
   = θ0 + {(n⁻¹X′Z) Wn (n⁻¹Z′X)}⁻¹ (n⁻¹X′Z) Wn (n⁻¹Z′u)
   →ᵖ θ0 + {E(xt zt′) W E(zt xt′)}⁻¹ E(xt zt′) W E(zt ut)
   = θ0,
using the population moment condition, a LLN, and Slutzky’s
Theorem.
Asymptotic distribution
Start from the representation
n^{1/2}(θ̂n − θ0) = {(n⁻¹X′Z) Wn (n⁻¹Z′X)}⁻¹ (n⁻¹X′Z) Wn (n^{−1/2} Z′u),
where everything converges to constants, except for the last term
n^{−1/2} Z′u = n^{−1/2} Σ_{t=1}^n zt ut,
which obeys a CLT (iid!), such that n^{−1/2} Z′u →ᵈ N(0, S). The
interesting part is S, defined by S = var(zt ut). Hall calls the
complicated constant limit matrix M (see above), such that the
Corollary of Slutzky’s Theorem yields
n^{1/2}(θ̂n − θ0) →ᵈ N(0, MSM′).
Estimating S
In order to construct asymptotic confidence intervals or hypothesis
tests, S must be estimated. One can show (o.c.s.) that
Ŝn = n⁻¹ Σ_{t=1}^n ut(θ̂n)² zt zt′ →ᵖ S.
This is not trivial, as the ut(θ̂n) are not the true errors. Of course,
under the so-called ‘classical assumptions’
Assumption (2.5 Classical assumptions on errors)
Eut = 0, Eut² = σ0², ut and zt independent,
one may use
ŜCIV = σ̂n² n⁻¹ Z′Z,
with σ̂n² = n⁻¹ u(θ̂n)′u(θ̂n).
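Both estimates of S are one-liners in code. A minimal sketch (assuming Z is the n × q instrument matrix and u_hat the vector of GMM residuals):

```python
import numpy as np

def S_hat(Z, u_hat):
    """Heteroskedasticity-robust estimate of S = var(z_t u_t):
    n^{-1} sum_t u_t(theta_hat)^2 z_t z_t'."""
    n = Z.shape[0]
    return (Z * u_hat[:, None] ** 2).T @ Z / n

def S_hat_civ(Z, u_hat):
    """Classical version under Assumption 2.5: sigma2_hat * n^{-1} Z'Z."""
    n = Z.shape[0]
    sigma2 = u_hat @ u_hat / n
    return sigma2 * (Z.T @ Z) / n
```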
Asymptotic distribution of moments
For technical reasons, one may be interested in the asymptotic
distribution of n^{−1/2} Z′u(θ̂n), the sample counterpart to the
E(zt ut) that is supposed to be zero.
Consider the representation in F notation
θ̂n − θ0 = (Fn′Fn)⁻¹ Fn′ Wn^{1/2} n⁻¹ Z′u.
Therefore,
u(θ̂n) = u − X θ̂n + X θ0
       = u − X (Fn′Fn)⁻¹ Fn′ Wn^{1/2} n⁻¹ Z′u.
Thus, the residuals are expressed as a function of the errors.
Multiplying this representation by n^{−1/2} Wn^{1/2} Z′ yields
n^{−1/2} Wn^{1/2} Z′u(θ̂n) = (Iq − Pn) Wn^{1/2} n^{−1/2} Z′u,
with the short notation Pn = Fn(Fn′Fn)⁻¹Fn′, and finally
n^{−1/2} Wn^{1/2} Z′u(θ̂n) →ᵈ N(0, NSN′),
with a slightly complicated but tractable matrix N. This
asymptotic property is convenient for developing specification tests.
Optimal choice of the weighting matrix
The asymptotic variance of the GMM estimator is MSM′, where
M = {E(xt zt′) W E(zt xt′)}⁻¹ E(xt zt′) W.
O.c.s. (L.P. Hansen) that the variance becomes minimal for
W0 = S⁻¹. Then, by straightforward insertion, the asymptotic
variance becomes
V0 = {E(xt zt′) S⁻¹ E(zt xt′)}⁻¹.
Two-step GMM
In practice, S must be replaced by a consistent estimator, for
example by Ŝn. This is infeasible as it stands, however, because Ŝn
itself uses a GMM estimate θ̂n. The solution is a two-step procedure:
1. Estimate θ with a simple, non-optimal weighting matrix, for
example the identity matrix I. Estimate S from the resulting residuals.
2. Estimate θ again, using the weighting matrix implied by step 1
(i.e. Wn = Ŝn⁻¹).
The estimator thus defined is called the ‘optimal two-step GMM
estimator’. Iterations may continue and then define the ‘optimal
iterated GMM estimator’.
Under the classical assumptions 2.5, the two-step GMM estimator
becomes the familiar two-stage least-squares estimator (2SLS).
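A minimal sketch of the two-step procedure for the linear model, reusing the linear_gmm and S_hat sketches given above (and inheriting their assumptions about the array shapes):

```python
import numpy as np

def two_step_gmm(y, X, Z):
    """Two-step GMM for y = X theta + u with instruments Z.
    Step 1: identity weighting; step 2: W = S_hat^{-1} from step-1 residuals."""
    q = Z.shape[1]
    theta1 = linear_gmm(y, X, Z, np.eye(q))   # first-step estimate
    u1 = y - X @ theta1
    W_opt = np.linalg.inv(S_hat(Z, u1))       # estimated optimal weighting matrix
    return linear_gmm(y, X, Z, W_opt)
```

Iterating the two steps until the estimate stops changing gives the iterated GMM estimator mentioned above.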
Several types of mis-specification
Hall distinguishes among three cases:
1. Correct specification. The specified model M is the true
model: Ezt ut(θ0) = 0 for a unique θ0 ∈ Θ.
2. The true model MA is not the specified model M, but θ+ is
still a unique solution to the moment conditions. θ+ is a
pseudo-true value.
3. The true model MB is not M, and there is no θ such that
Ezt ut(θ) = 0.
In most respects, case 2 is innocuous: MA(θ+) can be treated
like M(θ0); these models are ‘observationally equivalent’ w.r.t. the
moment conditions.
The idea of the J–test
In case 3, the identifying restrictions will hold by
definition, but the overidentifying restrictions will be violated even in
the limit for large n. One may consider the statistic
Jn = n Qn(θ̂n) = n⁻¹ u(θ̂n)′ Z Ŝn⁻¹ Z′u(θ̂n),
which, under H0: Ezt ut(θ0) = 0, converges in distribution to a
χ²_{q−p}. Note that the asymptotic properties of the moment
condition are used in deriving the χ² limit.
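A minimal sketch of the test statistic in code (assuming theta_hat comes from GMM with the weighting matrix Ŝn⁻¹, as required for the χ² limit):

```python
import numpy as np
from scipy import stats

def j_test(y, X, Z, theta_hat, S_hat_mat):
    """Hansen J statistic J_n = n^{-1} u(theta_hat)' Z S_hat^{-1} Z' u(theta_hat),
    referred to a chi-squared distribution with q - p degrees of freedom."""
    n, p = X.shape
    q = Z.shape[1]
    u = y - X @ theta_hat
    m = Z.T @ u                                   # (q,) moment sum
    J = m @ np.linalg.solve(S_hat_mat, m) / n
    p_value = 1.0 - stats.chi2.cdf(J, df=q - p)
    return J, p_value
```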
GMM in correctly specified models
A few assumptions to define well-behaved models.
Assumption (3.1 Strict stationarity)
The r–dimensional random vectors {vt , t ∈ Z} form a strictly
stationary process with sample space V ⊆ Rr .
Assumption (3.2 Regularity conditions for f )
The function f : V × Θ → Rq satisfies:
1. f is continuous on Θ for all vt ∈ V;
2. Ef(vt, θ) < ∞ for all θ ∈ Θ;
3. Ef(vt, θ) is continuous on Θ.
The continuity assumption is restrictive and excludes some cases of
empirical interest.
Population moment condition and identification
Assumption (3.3 Population moment condition)
The r.v. vt and the parameter θ0 ∈ Rp satisfy the q–vector of
population moment conditions Ef (vt , θ0 ) = 0.
Assumption (3.4 Global identification)
Ef(vt, θ̄) ≠ 0 for all θ̄ ∈ Θ with θ̄ ≠ θ0.
These two conditions exclude the misspecified case MB as well as
the case of multiple solutions. In some applications, local
identification may suffice.
An example for a non-identified model: partial adjustment
In the model
yt − yt−1 = β0 (y0* − yt−1) + ut,
ut = ρ0 ut−1 + et,
y0* represents a target or desired level for yt, and et is i.i.d. with
mean 0. Here θ = (β, ρ, y*)′ and p = 3. Assume there are some q > p
variables zt that serve as instruments such that
E{zt et(θ0)} = 0,  with
et(θ) = yt − β(1 − ρ)y* − (1 + ρ − β)yt−1 − (β − 1)ρ yt−2,
where the residuals just follow from the model definition by
transformation. The model is nonlinear in θ. An AR(2) model with
a constant is identified and linear and would have parameters
µ = (µ̄, φ1, φ2)′, say. O.c.s. that θ as a function of a given µ has
multiple solutions due to a quadratic equation. The model is therefore
not globally identified.
Conditions for local identification
Local identification, which may be more relevant in practice,
requires that derivatives exist and that an entire ε–neighborhood of
θ0 is contained in the parameter space.
Assumption (3.5 Regularity conditions on ∂f(vt, θ)/∂θ′)
1. The derivative matrix ∂f(vt, θ)/∂θ′ exists and is continuous on
Θ for all vt ∈ V;
2. θ0 is an interior point of Θ;
3. E∂f(vt, θ)/∂θ′ < ∞.
Assumption (3.6 Local identification)
rk{E∂f(vt, θ0)/∂θ′} = p.
This condition naturally generalizes Assumption 2.3 from the linear
model.
The partial adjustment model is locally identified
With ỹt = (1, yt−1, yt−2)′, the matrix of Assumption 3.6 can be
written as
E∂f(vt, θ0)/∂θ′ = E(zt ỹt′) M(θ0),
where
M(θ) = [ −(1 − ρ)y*    βy*        −β(1 − ρ)
          1             −1          0
         −ρ            −(β − 1)     0        ].
Clearly, in general this matrix is of full rank, and the local
identification condition holds.
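A quick numerical check of this claim (a sketch with arbitrary, generic parameter values):

```python
import numpy as np

def M(beta, rho, y_star):
    """M(theta) from the partial-adjustment example above."""
    return np.array([
        [-(1 - rho) * y_star, beta * y_star, -beta * (1 - rho)],
        [1.0,                 -1.0,           0.0],
        [-rho,                -(beta - 1.0),  0.0],
    ])

# Generic parameter values give rank 3, i.e. the local identification condition holds.
print(np.linalg.matrix_rank(M(beta=0.5, rho=0.3, y_star=2.0)))  # 3
```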
Technical issues of GMM estimation in nonlinear models
The GMM estimate θ̂ is the minimizer of the function
Qn(θ) = {n⁻¹ Σ_{t=1}^n f(vt, θ)}′ Wn {n⁻¹ Σ_{t=1}^n f(vt, θ)}.
For the large-sample behavior of the weighting matrix we assume
Assumption (3.7 Properties of the weighting matrix)
Wn is a non-negative definite matrix that converges in probability
to the positive definite constant matrix W.
Typically, θ̂ does not have a closed-form solution; it is
obtained via an iterative numerical algorithm.
GMM as a solution to first-order conditions
If derivatives exist (Assumption 3.5), the minimization can rely on
the first-order condition ∂Qn(θ)/∂θ = 0, or, in detail,
{n⁻¹ Σ_{t=1}^n ∂f(vt, θ̂n)/∂θ′}′ Wn {n⁻¹ Σ_{t=1}^n f(vt, θ̂n)} = 0.
This expression has a population counterpart
{E ∂f(vt, θ0)/∂θ′}′ W E{f(vt, θ0)} = 0.
Numerical optimization
Numerical minimization has three aspects:
1. the starting value θ̄(0);
2. the iterative search method, i.e. how to find θ̄(j + 1) given θ̄(j);
3. the convergence criterion.
Hall recommends varying the starting values as a sensitivity check.
Convergence criteria can be absolute (‖θ̄(j + 1) − θ̄(j)‖ < ε),
relative (‖θ̄(j + 1) − θ̄(j)‖/‖θ̄(j)‖ < ε), or a mixture of both. They
can check convergence of θ̄, of the function Qn, or of a gradient, in
particular if gradients of Qn are used in the search iterations.
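In practice this minimization is usually handed to an off-the-shelf optimizer. The following is an illustrative sketch only (not Hall's code); moment_fn, data, W, and theta0 are placeholders supplied by the user, and restarting from several values of theta0 implements the sensitivity check recommended above.

```python
import numpy as np
from scipy.optimize import minimize

def gmm_estimate(moment_fn, data, W, theta0):
    """Minimize Q_n(theta) = g_n(theta)' W g_n(theta) numerically,
    where moment_fn(data, theta) returns the (n, q) array of f(v_t, theta)."""
    def Q_n(theta):
        g_n = moment_fn(data, theta).mean(axis=0)
        return g_n @ W @ g_n
    result = minimize(Q_n, theta0, method="Nelder-Mead")  # derivative-free search
    return result.x
```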
Fundamental decomposition in the nonlinear model
Consider the first-order condition
{E ∂f(vt, θ0)/∂θ′}′ W E{f(vt, θ0)} = 0.
Again, this permits the form
F(θ0)′ W^{1/2} E{f(vt, θ0)} = 0,
with F(θ0) = W^{1/2} E{∂f(vt, θ0)/∂θ′}, and a decomposition into the
identifying restrictions
F(θ0){F(θ0)′F(θ0)}⁻¹ F(θ0)′ W^{1/2} E{f(vt, θ0)} = 0,
and the overidentifying restrictions.
The overidentifying restrictions
The overidentifying restrictions
[Iq − F(θ0){F(θ0)′F(θ0)}⁻¹F(θ0)′] W^{1/2} E{f(vt, θ0)} = 0
are not fulfilled in the sample (in estimation) but only asymptotically,
if the model is specified correctly (or misspecified of the MA type).
Note that the two projection matrices sum to Iq, and therefore
Qn(θ̂) shows directly how well the overidentifying restrictions are
satisfied. This is the basis for a popular specification test statistic.
Asymptotic properties of the GMM estimator
Assumption (3.8 Ergodicity)
The random process {vt, t ∈ Z} is ergodic,
meaning that its sample moments converge to the population moments.
There exist non-ergodic stationary processes, and minimal
conditions to exclude them are complex; we simply assume ergodicity.
Assumption (3.9 Compactness)
The parameter space Θ is compact.
No sequence of admissible parameter values should converge to a
non-admissible one, and no sequence should escape to infinity.
Assumption (3.10 Domination of f)
E supθ∈Θ ‖f(vt, θ)‖ < ∞.
Because Θ is now bounded anyway, this assumption excludes local
areas and points with undefined expectation.
Consistency of GMM
O.c.s. the following lemma.
Lemma
Assumptions 3.1, 3.2, and 3.7–3.10 imply that
supθ∈Θ |Qn(θ) − Q0(θ)| →ᵖ 0,
where Q0(θ) denotes the population analog of Qn(θ).
Using this lemma, it is not difficult to show
Theorem (3.1 Consistency of the GMM estimator)
Assumptions 3.1–3.4 and 3.7–3.10 imply that θ̂n →ᵖ θ0.
Note that this theorem assumes global identification. Qn converges
to Q0, and its minimum converges to the minimum of Q0, which is
unique by assumption.
Assumptions for the asymptotic normality of GMM
Application of a CLT requires some more technical assumptions.
Introduce gn(θ) = n⁻¹ Σ_{t=1}^n f(vt, θ), the sample analog of
Ef(vt, θ).
Assumption (3.11 Variance of the sample moment)
1. E{f(vt, θ0) f(vt, θ0)′} < ∞;
2. var{n^{1/2} gn(θ0)} → S;
3. S is finite and positive definite.
This assumption is needed for a CLT:
Lemma
Assumptions 3.1, 3.3, 3.8, and 3.11 imply n^{1/2} gn(θ0) →ᵈ N(0, S).
Now gn converges, but so must its derivative matrix
Gn(θ) = n⁻¹ Σ_{t=1}^n ∂f(vt, θ)/∂θ′. Denote its population analog
E∂f(vt, θ0)/∂θ′ by G0.
Assumption (3.12 Continuity of expected derivatives)
E∂f(vt, θ)/∂θ′ is continuous in a neighborhood Nε of θ0.
Assumption (3.13 Uniform convergence of Gn(θ))
supθ∈Nε ‖Gn(θ) − E∂f(vt, θ)/∂θ′‖ →ᵖ 0.
With these assumptions, Gn(θ̂n) finally converges to G0 in
probability, and the desired property follows.
Asymptotic normality of GMM
Theorem (3.2 Asymptotic normality of the parameter estimator)
Assumptions 3.1–3.5 and 3.7–3.13 imply
n^{1/2}(θ̂n − θ0) →ᵈ N(0, MSM′), where M = (G0′ W G0)⁻¹ G0′ W.
Clearly, G0 must be estimated by some finite-sample analog, and
so must S. Unfortunately, estimation of S is not trivial.
Theorem (3.3 Asymptotic normality of sample moments)
Assumptions 3.1–3.5 and 3.7–3.13 imply
Wn^{1/2} n^{1/2} gn(θ̂n) →ᵈ N(0, N W^{1/2} S W^{1/2}′ N′), where
N = Iq − F(θ0){F(θ0)′F(θ0)}⁻¹F(θ0)′.
Estimating S
S is the limit of the variances of scaled cumulative sums,
S = lim_{n→∞} var( n^{−1/2} Σ_{t=1}^n ft ),
where (ft) is stationary but not i.i.d. In time-series analysis, this
variance is known as the spectrum at frequency zero, while
econometricians call it the long-run variance. In population, it is
simply given by
S = Σ_{j=−∞}^{∞} Γj,
where the Γj are the autocovariance matrices of ft. Plugging in
autocovariance estimates (based on f̂t) and chopping off the sum at
some J and −J yields an unattractive estimate.
Nonparametric estimation of S
In time-series analysis, a preferred method for estimating the
spectral density is kernel or window estimation. At frequency zero,
this amounts to
ŜHAC = Γ̂0 + Σ_{j=1}^{n−1} ωj,n (Γ̂j + Γ̂j′),
with a kernel function ωj,n that downweights autocovariance
estimates with large j. Usually, ωj,n = 0 for j > b(n), with the
‘bandwidth’ b(n) → ∞ as n → ∞. The most common kernel
function is the Bartlett or triangular kernel
ωj,n = 1 − j/(b(n) + 1)  for j ≤ b(n),  and  ωj,n = 0  otherwise.
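A minimal sketch of the Bartlett-kernel (Newey-West type) estimate (assuming f_hat is the n × q array of estimated moments f̂t and bandwidth is b(n); centering the moments first is a common variant):

```python
import numpy as np

def s_hat_hac(f_hat, bandwidth):
    """HAC estimate of the long-run variance S with Bartlett weights."""
    n, q = f_hat.shape
    f_c = f_hat - f_hat.mean(axis=0)            # center the estimated moments
    S = f_c.T @ f_c / n                         # Gamma_0 hat
    for j in range(1, bandwidth + 1):
        w = 1.0 - j / (bandwidth + 1.0)         # Bartlett weight omega_{j,n}
        Gamma_j = f_c[j:].T @ f_c[:-j] / n      # Gamma_j hat
        S += w * (Gamma_j + Gamma_j.T)
    return S
```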
What does ‘HAC’ mean?
HAC stands for ‘heteroskedasticity and autocorrelation consistent’
covariance estimation, a technique developed by Newey and
West, not for ‘heteroscedastic autocorrelation covariance’
(Hall). The HAC method requires spectral estimation at
frequency zero, and this is done via kernel estimation. For this
reason, some econometricians call the triangular Bartlett window a
‘Newey and West window’.
Parametric estimation of S
As an alternative, one may fit a vector autoregression (VAR) to ft
(or rather to f̂t),
ft = A1 ft−1 + A2 ft−2 + … + Ak ft−k + et,
selecting k according to Schwarz’s BIC, for example, and estimate
S via
ŜVAR = (Iq − Σ_{j=1}^k Âj)⁻¹ Σ̂(e) (Iq − Σ_{j=1}^k Âj)⁻¹′,
with Σ̂(e) = n⁻¹ ê′ê estimated from the residuals of the VAR. One
may even combine parametric and nonparametric stages
(Andrews and Monahan).
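A minimal sketch of the VAR-based estimate with a fixed lag order k (least-squares VAR fit; lag selection by BIC is omitted for brevity, and f_hat is again the n × q array of estimated moments):

```python
import numpy as np

def s_hat_var(f_hat, k):
    """Parametric long-run variance estimate from a fitted VAR(k):
    S_VAR = (I - sum_j A_j)^{-1} Sigma_e (I - sum_j A_j)^{-1}'."""
    n, q = f_hat.shape
    Y = f_hat[k:]                                                  # (n-k, q)
    X = np.hstack([f_hat[k - j:n - j] for j in range(1, k + 1)])   # lagged moments
    A = np.linalg.lstsq(X, Y, rcond=None)[0]                       # stacked coefficients
    resid = Y - X @ A
    Sigma_e = resid.T @ resid / (n - k)
    A_sum = sum(A[(j - 1) * q: j * q].T for j in range(1, k + 1))  # sum of A_j matrices
    B = np.linalg.inv(np.eye(q) - A_sum)
    return B @ Sigma_e @ B.T
```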
Optimal choice of the weighting matrix
In analogy to the linear model, the optimal weighting matrix is
W = S⁻¹.
Theorem (3.4 Optimal weighting matrix)
Assumptions 3.1–3.5 and 3.7–3.13 imply that the minimal asymptotic
variance matrix of θ̂ is (G0′ S⁻¹ G0)⁻¹, which is attained for
W = S⁻¹.
In practice, S must be estimated. One may start with the weighting
matrix W = I and use the resulting θ̂ to obtain Ŝ and then update θ̂
(two-step GMM), or one may iterate these steps to convergence (iterated
GMM). A variant solves for θ̂ and Ŝ jointly, leaving the iterations to
the computer (continuously updated GMM).
A technical independence result
Theorem (3.5 Asymptotic independence of estimate and moment)
Assume 3.1–3.5 and 3.7–3.13 hold and W = S⁻¹; then
n^{1/2}(θ̂n − θ0) and S^{−1/2} n^{1/2} gn(θ̂n) are asymptotically independent.
This theorem guarantees that iterated GMM converges to the
exact solution as n → ∞ and thus is comparable to continuous
updating. It also guarantees that J–test statistics on the
‘overidentifying restrictions’ are asymptotically independent of the
parameter estimates. The details of its proof (in Hall’s book)
show that this independence does not hold if W ≠ S⁻¹.