The Linear Regression Model with Autocorrelated
Disturbances: Finite Sample Theory
(Reference – Greene, Chapter 13)
Consider the standard normal linear regression model
y_t = X_t'β + ε_t ,  t = 1, …, T
where β is a k×1 constant vector, X_t is a k×1 strictly
exogenous process, and ε_t ~ i.i.d. N(0, σ²).
The OLS estimator is an unbiased, normally
distributed, and efficient estimator of β.
We will begin by reviewing the problem of
autocorrelated disturbances from the traditional finite
sample theory approach.
Now suppose that we relax the independence
assumption so that
ε ~ N(0, σ2Ω )
where ε is the T×1 random vector [ε₁ … ε_T]' and Ω is
a T×T p.d. symmetric matrix whose diagonal
elements are equal to 1.
That is, σ²Ω_ij = E(ε_iε_j), so Ω_ij is the correlation
between ε_i and ε_j. We are assuming that the ε's are
homoskedastic (so that the diagonal elements are all
the same) but are allowing the off-diagonal elements
to be non-zero to allow for serial correlation in the
ε's.
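To make the structure of Ω concrete, here is a minimal sketch; the AR(1) pattern Ω_ij = ρ^|i−j| and the value ρ = 0.6 are just one assumed example of such a matrix, not part of the general setup:

```python
import numpy as np

# Hypothetical illustration: an AR(1)-type Omega with unit diagonal.
T, rho = 5, 0.6
idx = np.arange(T)
Omega = rho ** np.abs(idx[:, None] - idx[None, :])

print(np.diag(Omega))  # all ones: same variance for every epsilon_t
print(Omega[0, 1])     # nonzero off-diagonal: serial correlation
```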
In this case, the OLS estimator of β is unbiased and
normally distributed, but it is not an efficient
estimator. In addition
Var(ˆOLS )   2 ( X ' X )1 X ' 1 X ( X ' X )1 .
So, hypothesis tests and confidence intervals based on the
OLS estimator will be invalid if the disturbances are
autocorrelated and the simple variance formula
$\mathrm{Var}(\hat{\beta}_{OLS}) = \sigma^2 (X'X)^{-1}$ is applied.
If Ω is known, then exact finite sample hypothesis
tests and confidence intervals can be constructed
from the OLS estimator by using the appropriate
variance formula. However, if Ω is known it is
straightforward to construct the GLS estimator,
which is unbiased, normally distributed, and efficient.
Recall the GLS Estimator of β –
$\hat{\beta}_{GLS} = (X' \Omega^{-1} X)^{-1} X' \Omega^{-1} Y$
Another way to think about the GLS estimator – apply
OLS to the transformed model
$$\tilde{Y} = \tilde{X}\beta + \tilde{\varepsilon}$$
where $\tilde{Y} = CY$, $\tilde{X} = CX$, $\tilde{\varepsilon} = C\varepsilon$,
and C is any matrix with $C'C = \Omega^{-1}$ (so that $C \Omega C' = I$).
The transformed model is a classical normal linear
regression model since the transformed X’s are
strictly exogenous and transformed disturbances are
normally distributed, zero-mean random variables
~~ ' )  2
E
(

with
σ I.
Valid hypothesis tests and interval estimates for β
can then be constructed by applying “OLS
procedures” to the transformed model.
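A minimal sketch of this equivalence, assuming a known AR(1)-type Ω and taking C from a Cholesky factorization of Ω⁻¹ (one valid choice satisfying C'C = Ω⁻¹):

```python
import numpy as np

rng = np.random.default_rng(1)
T, rho = 50, 0.6
idx = np.arange(T)
Omega = rho ** np.abs(idx[:, None] - idx[None, :])  # assumed known Omega

X = np.column_stack([np.ones(T), rng.normal(size=T)])
eps = np.linalg.cholesky(Omega) @ rng.normal(size=T)  # eps ~ N(0, Omega)
y = X @ np.array([1.0, 2.0]) + eps

# Direct GLS: beta_hat = (X' Omega^{-1} X)^{-1} X' Omega^{-1} y
Oinv = np.linalg.inv(Omega)
beta_gls = np.linalg.solve(X.T @ Oinv @ X, X.T @ Oinv @ y)

# OLS on the transformed model, with C chosen so that C'C = Omega^{-1}:
# if L L' = Omega^{-1} (Cholesky factorization), then C = L' works.
C = np.linalg.cholesky(Oinv).T
beta_trans = np.linalg.lstsq(C @ X, C @ y, rcond=None)[0]

print(np.allclose(beta_gls, beta_trans))  # True: the two estimators coincide
```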
Suppose that we suspect that our disturbances may be
serially correlated but, if they are, we don't know
Ω. Then it is natural to consider testing the null
hypothesis of no serial correlation. If we fail to reject
the null, we can proceed under the assumption that
the disturbances are not serially correlated. If we do
reject the null, then we have to decide how to
proceed. (More on this later.)
It would seem natural that to test for autocorrelated
ε’s we should simply test for whether the OLS
residuals, i.e., the ε̂'s, are serially correlated.
However, in finite samples the OLS residuals will be
serially correlated (and heteroskedastic) even if the
true disturbances are independent (and
homoskedastic)!
Var(ˆ)  I  X ( X ' X ) 1 X '
So, we will want to construct a test that accounts for
the “normal amount” of serial correlation that appears
in the OLS residuals under the null and see whether
there is “too much” serial correlation in these
residuals for the independence assumption to be
plausible.
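A small sketch of this point: with an assumed design matrix, the implied residual covariance σ²(I − X(X'X)⁻¹X') has unequal diagonal entries and nonzero off-diagonal entries even though the true disturbances are i.i.d.:

```python
import numpy as np

T, sigma2 = 10, 1.0
X = np.column_stack([np.ones(T), np.arange(T, dtype=float)])  # assumed design

M = np.eye(T) - X @ np.linalg.solve(X.T @ X, X.T)  # residual-maker matrix
var_resid = sigma2 * M  # Var(eps_hat) when the true disturbances are i.i.d.

print(np.diag(var_resid))  # unequal diagonal: residuals are heteroskedastic
print(var_resid[0, 1])     # nonzero off-diagonal: residuals are correlated
```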
The Durbin-Watson (DW) Test:
The null hypothesis is that the ε’s are independently
drawn N(0,σ2) random variables. The alternative
hypothesis is that there is first-order (and, possibly
higher) autocorrelation in the ε’s.
The DW Test Statistic –
$$d = \frac{\sum_{t=2}^{T} (\hat{\varepsilon}_t - \hat{\varepsilon}_{t-1})^2}{\sum_{t=1}^{T} \hat{\varepsilon}_t^{\,2}} \approx 2(1 - r_1)$$
where r1 is the first-order sample autocorrelation of
the residuals.
If there is no first-order autocorrelation, we would
expect the d statistic to be close to 2. Positive
autocorrelation would be evident in values of d less
than 2 (bounded below by 0) and negative
autocorrelation would be evident in values of d
greater than 2 (bounded above by 4).
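A quick sketch (simulated data, with H0 true by construction) computing d by hand, checking the 2(1 − r₁) approximation, and confirming against statsmodels' durbin_watson:

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(2)
T = 100
X = sm.add_constant(rng.normal(size=(T, 2)))
y = X @ np.array([1.0, 0.5, -0.5]) + rng.normal(size=T)  # i.i.d. errors: H0 true

e = sm.OLS(y, X).fit().resid
d = np.sum(np.diff(e) ** 2) / np.sum(e ** 2)  # DW statistic by hand
r1 = np.sum(e[1:] * e[:-1]) / np.sum(e ** 2)  # first-order autocorrelation

print(d, 2 * (1 - r1))   # close to each other, and close to 2 under H0
print(durbin_watson(e))  # statsmodels agrees with the hand computation
```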
Under the null hypothesis of no first-order
autocorrelation, the d statistic is drawn from the
“Durbin-Watson distribution.”
The p-value of the test of H0:ρ1=0 vs. H1:ρ1>0 is
$\Pr_{H_0}(d \leq \hat{d})$
where d̂ is the sample value of d. So, for example, we
reject H0 at the 5-percent test size if d̂ < d_{0.05}.
The exact distribution of the DW statistic under H0
depends on T, k, and X. Durbin and Watson derived
lower and upper bounds for the percentiles of the
distribution that do not depend on X. These are often
available in econometric textbooks (e.g., Greene).
Thus, for example, if T = 50 and k = 3 and any
appropriate X matrix
1.46 < d0.05 < 1.63
Then if d̂ < 1.46, reject H0 at the 5-percent size. If
d̂ > 1.63, don't reject H0. And if 1.46 < d̂ < 1.63?
The bounds test is inconclusive.
The exact p-value for given T,k, and X can be
computed from a somewhat complicated formula.
Most modern regression software will do these
computations and provide you with the appropriate
p-value, so the "inconclusive range" problem is no
longer a serious concern.
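The exact computation uses Imhof-type numerical integration, which is beyond these notes. As a rougher alternative, here is a Monte Carlo sketch (the function name and all tuning values are mine, not from the source) exploiting the fact that, under H0, the distribution of d depends only on X:

```python
import numpy as np

def dw_pvalue_mc(X, d_hat, n_sim=20_000, seed=0):
    """Monte Carlo p-value for H0: no first-order autocorrelation
    (one-sided, against positive autocorrelation).

    Under H0 the null distribution of d depends only on X, not on beta
    or sigma^2: the residuals are e = M @ eps with eps ~ N(0, I) up to
    scale, and d is scale-invariant.
    """
    rng = np.random.default_rng(seed)
    T = X.shape[0]
    M = np.eye(T) - X @ np.linalg.solve(X.T @ X, X.T)  # residual-maker matrix
    e = rng.normal(size=(n_sim, T)) @ M                # M symmetric: rows are M @ eps
    d = np.sum(np.diff(e, axis=1) ** 2, axis=1) / np.sum(e ** 2, axis=1)
    return np.mean(d <= d_hat)

# Hypothetical usage with T = 50 and k = 3, as in the example above:
rng = np.random.default_rng(1)
X = np.column_stack([np.ones(50), rng.normal(size=(50, 2))])
print(dw_pvalue_mc(X, 1.46))  # at most about 0.05, since 1.46 bounds d_0.05 below
```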
For our purposes, the biggest problem with the DW
test is that it requires the regressors to be strictly
exogenous.
Question – What should we do if we find evidence of
autocorrelation? FGLS? OLS with “correct” s.e.’s?
Standard OLS?
Standard OLS is unbiased. But the unadjusted s.e.’s
are incorrect, leading to improperly sized confidence
intervals and hypothesis tests.
Applying OLS but correcting the s.e.’s based on
consistent estimates of the Ω matrix (or using
“heteroskedasticity-autocorrelation-consistent” s.e.’s
that don’t require knowing the exact form of the
autocorrelation) provides the basis for asymptotically
valid inference, but how well do these corrections
work in finite samples?
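For example, statsmodels computes Newey-West HAC standard errors via cov_type='HAC'; in this sketch the AR(1) data-generating process and the lag truncation are assumptions for illustration:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
T = 200
X = sm.add_constant(rng.normal(size=T))
e = np.zeros(T)
for t in range(1, T):                 # AR(1) disturbances, rho = 0.6
    e[t] = 0.6 * e[t - 1] + rng.normal()
y = X @ np.array([1.0, 2.0]) + e

ols = sm.OLS(y, X).fit()
# Newey-West HAC covariance; the lag truncation (here 4) is a tuning
# choice, not dictated by the theory above.
hac = sm.OLS(y, X).fit(cov_type='HAC', cov_kwds={'maxlags': 4})
print(ols.bse)  # unadjusted OLS s.e.'s
print(hac.bse)  # HAC-corrected s.e.'s
```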
FGLS is biased but consistent, asymptotically
normal, and asymptotically efficient. How well does
it work in finite samples?
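One common FGLS implementation for AR(1) disturbances is iterated Cochrane-Orcutt-style estimation, available in statsmodels as GLSAR; a sketch, reusing the same assumed AR(1) setup as the previous example:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
T = 200
X = sm.add_constant(rng.normal(size=T))
e = np.zeros(T)
for t in range(1, T):                 # AR(1) disturbances, rho = 0.6
    e[t] = 0.6 * e[t - 1] + rng.normal()
y = X @ np.array([1.0, 2.0]) + e

# FGLS assuming AR(1) disturbances: estimate rho from OLS residuals,
# transform the model, and iterate.
fgls = sm.GLSAR(y, X, rho=1).iterative_fit(maxiter=10)
print(fgls.params)     # FGLS estimates of beta
print(fgls.model.rho)  # estimated AR(1) coefficient
```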