Chapter 12: Serial Correlation and Heteroskedasticity in Time Series Regressions
Econometrics II, Spring 2010

Properties of OLS with Serially Correlated Errors

• Unbiasedness and Consistency
• Efficiency and Inference

Because the Gauss-Markov Theorem requires both homoskedasticity and serially uncorrelated errors, OLS is no longer BLUE in the presence of serial correlation. Even more importantly, the usual OLS standard errors and test statistics are not valid, even asymptotically.

Consider an AR(1) serial correlation model with the first four Gauss-Markov assumptions held true. Assume

yt = β0 + β1 xt + ut,
ut = ρ ut−1 + et, t = 1, 2, . . . , n,

where |ρ| < 1 and the et are uncorrelated random variables with mean zero and variance σe². For simplicity, we assume that the sample average of the xt is zero (x̄ = 0). Then, the OLS estimator of β1 can be written as

β̂1 = β1 + (Σ_{t=1}^n xt ut) / SSTx,

where SSTx = Σ_{t=1}^n xt². Now, in computing the variance of β̂1 (conditional on X), we must account for the serial correlation in the ut:

Var(β̂1) = SSTx⁻² · Var(Σ_{t=1}^n xt ut)
         = SSTx⁻² · [ Σ_{t=1}^n xt² Var(ut) + 2 Σ_{t=1}^{n−1} Σ_{j=1}^{n−t} xt xt+j E(ut ut+j) ]
         = σ²/SSTx + 2 (σ²/SSTx²) Σ_{t=1}^{n−1} Σ_{j=1}^{n−t} ρ^j xt xt+j,    (12.4)

where σ² = Var(ut) and we have used the fact that E(ut ut+j) = Cov(ut, ut+j) = ρ^j σ².

• Goodness-of-fit
• Serial Correlation in the Presence of Lagged Dependent Variables

Almost every textbook on econometrics contains some form of the statement "OLS is inconsistent in the presence of lagged dependent variables and serially correlated errors." Unfortunately, as a general assertion, this statement is false.

There is a version of the statement that is correct. To illustrate, suppose that the expected value of yt given yt−1 is linear:

E(yt | yt−1) = β0 + β1 yt−1,

where we assume stability, |β1| < 1. Certainly, we can write this equation with an error term as

yt = β0 + β1 yt−1 + ut,    (12.6)

where E(ut | yt−1) = 0. We now see how OLS estimation of (12.6) leads to inconsistent estimators, provided the errors ut follow an AR(1) model.

Testing for Serial Correlation

We now turn to several methods of testing for serial correlation in the error terms. Consider a multiple regression model:

yt = β0 + β1 xt1 + . . . + βk xtk + ut.

• A t Test for AR(1) Serial Correlation with Strictly Exogenous Regressors

In the AR(1) model, ut = ρ ut−1 + et, t = 2, . . . , n, the null hypothesis that the errors are serially uncorrelated is

H0: ρ = 0.

However, the errors ut are never observed, so we work with the OLS residuals: we estimate ρ by regressing ût on ût−1, for t = 2, . . . , n, to obtain the estimated value, ρ̂. (A sketch of this test, together with the DW statistic below, follows this testing discussion.)

• The Durbin-Watson Test under Classical Assumptions

Another test for AR(1) serial correlation is the Durbin-Watson test. The Durbin-Watson (DW) statistic is also based on the OLS residuals:

DW = [ Σ_{t=2}^n (ût − ût−1)² ] / [ Σ_{t=1}^n ût² ].

In addition, it can be shown that

DW ≈ 2(1 − ρ̂).

DW test versus t test

• Testing for AR(1) Serial Correlation without Strictly Exogenous Regressors

When the explanatory variables are not strictly exogenous, so that one or more xtj are correlated with ut−1, neither the t test nor the Durbin-Watson test is valid, even in large samples. To overcome this challenge, Durbin suggested two alternatives:

1. Durbin's h test ⇒ This statistic is not always computable.
2. Testing with general regressors: More generally, we can test for serial correlation in the autoregressive model of order q:

ut = ρ1 ut−1 + ρ2 ut−2 + . . . + ρq ut−q + et.

The null hypothesis is

H0: ρ1 = 0, ρ2 = 0, . . . , ρq = 0.    (12.21)
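To make the two simplest diagnostics above concrete, here is a minimal Python sketch of the residual-based t test and the DW statistic. The variable names and the simulated sample are illustrative assumptions, not anything from the notes.

```python
# Minimal sketch: t test for AR(1) serial correlation and the DW statistic.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

def ar1_t_test(y, X):
    """Regress OLS residuals u_hat[t] on u_hat[t-1]; return rho_hat, t stat, DW."""
    res = sm.OLS(y, sm.add_constant(X)).fit()
    u = res.resid
    # Regression of u_t on u_{t-1} for t = 2, ..., n (with an intercept).
    aux = sm.OLS(u[1:], sm.add_constant(u[:-1])).fit()
    return aux.params[1], aux.tvalues[1], durbin_watson(u)

# Illustrative simulated sample with genuinely AR(1) errors (rho = 0.6):
rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)
u = np.zeros(n)
for t in range(1, n):
    u[t] = 0.6 * u[t - 1] + rng.normal()
y = 1.0 + 2.0 * x + u

rho_hat, t_stat, dw = ar1_t_test(y, x)
print(f"rho_hat = {rho_hat:.3f}, t = {t_stat:.2f}, DW = {dw:.2f}")
```

With ρ = 0.6 in the simulation, the t statistic should be strongly significant and DW should fall well below 2, consistent with DW ≈ 2(1 − ρ̂).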
• Testing for Higher Order Serial Correlation

The previous test is easily extended to higher orders of serial correlation.

Example [AR(2)]

Procedure for testing AR(q) serial correlation:

• The Breusch-Godfrey Test

An alternative to computing the F test is to use the Lagrange multiplier (LM) form of the statistic. The LM statistic for testing (12.21) is

LM = (n − q) R²û ~ χ²q under H0,

where R²û is the R-squared from the regression of ût on

xt1, xt2, . . . , xtk, ût−1, ût−2, . . . , ût−q, for all t = (q + 1), . . . , n.

(A sketch of this test appears at the end of this part, after the differencing discussion.)

• Testing for Seasonality

For example, with quarterly data, we might postulate the autoregressive model

ut = ρ4 ut−4 + et.

Correcting for Serial Correlation with Strictly Exogenous Regressors

• Obtaining the BLUE in the AR(1) Model

Consider a simple regression model whose errors follow the AR(1) process:

yt = β0 + β1 xt + ut, t = 1, 2, . . . , n.

For t ≥ 2, we write

yt−1 = β0 + β1 xt−1 + ut−1,
yt = β0 + β1 xt + ut.

Now, if we multiply the first equation by ρ and subtract it from the second, we get

yt − ρ yt−1 = (1 − ρ)β0 + β1 (xt − ρ xt−1) + et, t ≥ 2,

where we have used the fact that et = ut − ρ ut−1. We can write this as

ỹt = (1 − ρ)β0 + β1 x̃t + et, t ≥ 2,

where ỹt = yt − ρ yt−1 and x̃t = xt − ρ xt−1 are called the quasi-differenced data. (If ρ = 1, these are differenced data; here we are assuming |ρ| < 1.)

Be cautious about the equation for t = 1: multiplying that equation through by (1 − ρ²)^{1/2} gives an error, (1 − ρ²)^{1/2} u1, with the same variance σe² as the et, so the first observation can be retained without breaking homoskedasticity.

• Feasible GLS Estimation with AR(1) Errors

Although ρ is rarely known, we already know how to get a consistent estimator for it: we simply regress ût on ût−1 to obtain the estimate ρ̂. Next, we use this ρ̂ in place of ρ to obtain the quasi-differenced variables. We then use OLS on the equation

ỹt = β0 x̃t0 + β1 x̃t1 + . . . + βk x̃tk + errort,

where x̃t0 = (1 − ρ̂) for t ≥ 2, and x̃10 = (1 − ρ̂²)^{1/2}. This results in the feasible GLS (FGLS) estimator of the βj. (This procedure is also sketched at the end of this part.)

Remarks: There are several names for FGLS estimation of the AR(1) model, coming from different methods of estimating ρ and different treatments of the first observation. For example,

1. Cochrane-Orcutt (CO) estimation
2. Prais-Winsten (PW) estimation

• Comparing OLS and FGLS

Consider the regression model

yt = β0 + β1 xt + ut,

where the time series processes are stationary. Now, assuming that the law of large numbers holds, consistency of OLS for β1 holds if Cov(xt, ut) = 0.

Consistency of FGLS estimators

Why do OLS and FGLS differ?

• Correcting for Higher Order Serial Correlation

Here, we illustrate the approach for AR(2) serial correlation:

ut = ρ1 ut−1 + ρ2 ut−2 + et,

where et is i.i.d. with mean zero and variance σe². The stability conditions are more complicated now. They can be shown to be

ρ2 > −1, ρ2 − ρ1 < 1, and ρ1 + ρ2 < 1.

Differencing and Serial Correlation

Differencing highly persistent data has some advantages for estimation. Suppose we have a simple regression model

yt = β0 + β1 xt + ut, t = 1, 2, . . . ,

where ut follows the AR(1) process.
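As promised above, here is a hand-rolled sketch of the Breusch-Godfrey LM test, written to mirror LM = (n − q) R²û. The names y, X, and q are illustrative; X is assumed to exclude the constant.

```python
# Minimal sketch of the Breusch-Godfrey LM test for AR(q) serial correlation.
import numpy as np
import statsmodels.api as sm
from scipy import stats

def breusch_godfrey(y, X, q):
    n = len(y)
    Xc = sm.add_constant(X)
    u = sm.OLS(y, Xc).fit().resid
    # Lagged residuals u_{t-1}, ..., u_{t-q} for t = q+1, ..., n.
    lags = np.column_stack([u[q - j:n - j] for j in range(1, q + 1)])
    # Auxiliary regression of u_t on the original regressors and the lags.
    aux = sm.OLS(u[q:], np.column_stack([Xc[q:], lags])).fit()
    lm = (n - q) * aux.rsquared
    pval = stats.chi2.sf(lm, df=q)
    return lm, pval
```

For routine work, statsmodels ships a packaged version, statsmodels.stats.diagnostic.acorr_breusch_godfrey(fit, nlags=q), which computes the same statistic (up to minor small-sample differences) directly from an OLS results object.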
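Similarly, a minimal sketch of feasible GLS with AR(1) errors, using the Prais-Winsten treatment of the first observation; variable names are illustrative.

```python
# Minimal sketch of FGLS with AR(1) errors (Prais-Winsten).
import numpy as np
import statsmodels.api as sm

def prais_winsten(y, X):
    Xc = sm.add_constant(X)                      # column of ones first
    u = sm.OLS(y, Xc).fit().resid
    rho = sm.OLS(u[1:], u[:-1]).fit().params[0]  # regress u_t on u_{t-1}, no intercept
    w = np.sqrt(1.0 - rho**2)
    # Quasi-difference: row 1 scaled by sqrt(1 - rho^2), rows t >= 2 differenced.
    y_t = np.r_[w * y[0], y[1:] - rho * y[:-1]]
    X_t = np.vstack([w * Xc[0], Xc[1:] - rho * Xc[:-1]])
    # The constant column is now (1 - rho) for t >= 2 and sqrt(1 - rho^2) for
    # t = 1, exactly the x~_t0 definition above.
    return sm.OLS(y_t, X_t).fit(), rho
```

statsmodels' GLSAR class implements an iterated Cochrane-Orcutt-type version of the same idea; the manual version here keeps the x̃10 = (1 − ρ̂²)^{1/2} scaling explicit.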
Serial Correlation-Robust Inference after OLS

Recall equation (12.4), which represents the variance of the OLS slope estimator in a simple regression model with AR(1) errors. We can estimate this variance very simply by plugging in our standard estimators of ρ and σ². Now we relax the assumption that the errors follow an AR(1) process and are homoskedastic. Consider the standard multiple linear regression model

yt = β0 + β1 xt1 + . . . + βk xtk + ut, t = 1, 2, . . . , n,    (12.39)

which we have estimated by OLS. We are interested in obtaining a serial correlation-robust standard error for β̂1. Write xt1 as a linear function of the remaining independent variables and an error term,

xt1 = δ0 + δ2 xt2 + . . . + δk xtk + rt,

where the error rt has zero mean and is uncorrelated with xt2, xt3, . . . , xtk. Then it can be shown that the asymptotic variance of β̂1 is

Avar(β̂1) = Var(Σ_{t=1}^n rt ut) / [ Σ_{t=1}^n E(rt²) ]².

Wooldridge (1989) shows that Avar(β̂1) can be estimated as follows. Let "se(β̂1)" denote the usual (but incorrect) OLS standard error and let σ̂ be the usual standard error of the regression (or root mean squared error) from estimating (12.39) by OLS. Let r̂t denote the residuals from the auxiliary regression of xt1 on 1, xt2, xt3, . . . , xtk. For a chosen integer g > 0, define

ν̂ = Σ_{t=1}^n ât² + 2 Σ_{h=1}^g [1 − h/(g + 1)] ( Σ_{t=h+1}^n ât ât−h ),    (12.43)

where ât = r̂t ût, t = 1, 2, . . . , n. Once we have ν̂, the serial correlation-robust standard error of β̂1 is simply

se(β̂1) = [ "se(β̂1)" / σ̂ ]² · √ν̂.

The resulting standard error is also robust to arbitrary heteroskedasticity. In the time series literature, serial correlation-robust standard errors are sometimes called heteroskedasticity and autocorrelation consistent, or HAC, standard errors. (A sketch implementing (12.43) appears at the end of this section.)

Notes for the serial correlation-robust standard error:

Heteroskedasticity in Time Series Regressions

Because the usual OLS statistics are asymptotically valid under Assumptions TS.1′ through TS.5′, we are interested in what happens when the homoskedasticity assumption, TS.4′, does not hold.

• Heteroskedasticity-Robust Statistics
• Testing for Heteroskedasticity

Sometimes, we wish to test for heteroskedasticity in time series regressions, especially if we are concerned about the performance of heteroskedasticity-robust statistics in relatively small sample sizes. The tests proposed in Chapter 8 can be applied directly, although they should be performed with caution. If heteroskedasticity is found in the ut (and the ut are not serially correlated), then the heteroskedasticity-robust test statistics can be used. An alternative is to use weighted least squares, as in the cross-sectional case.

• Autoregressive Conditional Heteroskedasticity (ARCH)

Consider a simple static regression model:

yt = β0 + β1 zt + ut,

and assume that the Gauss-Markov assumptions hold. This means that the OLS estimators are BLUE. The homoskedasticity assumption says that Var(ut | Z) is constant, where Z denotes all n outcomes of zt. Even if the variance of ut given Z is constant, there are other ways that heteroskedasticity can arise. Engle (1982) suggested looking at the conditional variance of ut given past errors. He proposed a model known as the autoregressive conditional heteroskedasticity (ARCH) model. The first-order ARCH model is

E(u²t | ut−1, ut−2, . . .) = E(u²t | ut−1) = α0 + α1 u²t−1,

where we leave the conditioning on Z implicit. This equation represents the conditional variance of ut given past ut only if

E(ut | ut−1, ut−2, . . .) = 0,

which means that the errors are serially uncorrelated.

Why should we care about ARCH forms of heteroskedasticity?

ARCH models also apply when there are dynamics in the conditional mean. Suppose we have the dependent variable, yt, a contemporaneous exogenous variable, zt, and

E(yt | zt, yt−1, zt−1, yt−2, . . .) = β0 + β1 zt + β2 yt−1 + β3 zt−1,

so that at most one lag of y and z appears in the dynamic regression.
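Returning to the serial correlation-robust standard error, here is a minimal sketch implementing ν̂ from (12.43) for β̂1. The truncation lag g and the variable names are illustrative choices; X is assumed to exclude the constant, with xt1 as its first column.

```python
# Minimal sketch of the serial correlation-robust (HAC) standard error for beta_1.
import numpy as np
import statsmodels.api as sm

def hac_se_beta1(y, X, g):
    Xc = sm.add_constant(X)                 # columns: const, x1, x2, ..., xk
    fit = sm.OLS(y, Xc).fit()
    u = fit.resid
    sigma = np.sqrt(fit.mse_resid)          # usual standard error of the regression
    se_usual = fit.bse[1]                   # usual (incorrect) OLS se of beta_1
    # r_hat: residuals from regressing x_{t1} on 1 and the other regressors.
    others = np.delete(Xc, 1, axis=1)
    r = sm.OLS(Xc[:, 1], others).fit().resid
    a = r * u                               # a_t = r_hat_t * u_hat_t
    nu = np.sum(a**2)
    for h in range(1, g + 1):               # Bartlett weights 1 - h/(g+1)
        nu += 2.0 * (1.0 - h / (g + 1)) * np.sum(a[h:] * a[:-h])
    return (se_usual / sigma) ** 2 * np.sqrt(nu)
```

The packaged route, sm.OLS(y, Xc).fit(cov_type='HAC', cov_kwds={'maxlags': g}), produces Newey-West standard errors for all coefficients at once.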
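For the ARCH model just described, a minimal sketch of the usual check: regress squared OLS residuals on their first lag. Variable names are illustrative.

```python
# Minimal sketch: check for ARCH(1) effects in the OLS residuals,
# mirroring E(u_t^2 | u_{t-1}) = alpha_0 + alpha_1 * u_{t-1}^2.
import statsmodels.api as sm

def arch1_check(y, X):
    u2 = sm.OLS(y, sm.add_constant(X)).fit().resid ** 2
    aux = sm.OLS(u2[1:], sm.add_constant(u2[:-1])).fit()
    # The t statistic on the lag coefficient tests H0: alpha_1 = 0 (no ARCH).
    # Engle's LM form uses (number of aux obs) * R^2, asymptotically chi-squared(1).
    return aux.params[1], aux.tvalues[1]
```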
• Heteroskedasticity and Serial Correlation in Regression Models

It is possible to have both heteroskedasticity and serial correlation present in a regression model. We can model them and make a correction through a combined weighted least squares AR(1) procedure. Specifically, consider the model

yt = β0 + β1 xt1 + . . . + βk xtk + ut,
ut = √ht νt,
νt = ρ νt−1 + et, |ρ| < 1,    (12.52)

where the explanatory variables X are independent of et for all t, and ht is a function of the xtj. The process et has zero mean and constant variance σe² and is serially uncorrelated. Therefore, νt satisfies a stable AR(1) process. Suppressing the conditioning on the explanatory variables, we have

Var(ut) = σν² ht,

where σν² = σe²/(1 − ρ²). But νt = ut/√ht is homoskedastic and follows a stable AR(1) model. Therefore, the transformed equation

yt/√ht = β0 (1/√ht) + β1 (xt1/√ht) + . . . + βk (xtk/√ht) + νt

has AR(1) errors. Now, if we have a particular form for ht in mind, we can estimate (12.52) using standard CO or PW methods applied to the weighted equation, as in the sketch below.
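Finally, a minimal sketch of the combined WLS + AR(1) correction in (12.52), assuming ht is known (in practice it must be specified or estimated); variable names are illustrative.

```python
# Minimal sketch: weight by 1/sqrt(h_t), then apply Prais-Winsten
# quasi-differencing to the weighted (homoskedastic, AR(1)-error) equation.
import numpy as np
import statsmodels.api as sm

def wls_ar1(y, X, h):
    w = 1.0 / np.sqrt(h)
    yw = w * y
    Xw = sm.add_constant(X) * w[:, None]         # weighted regressors, incl. weighted const
    nu = sm.OLS(yw, Xw).fit().resid              # residuals proxy the nu_t
    rho = sm.OLS(nu[1:], nu[:-1]).fit().params[0]
    s = np.sqrt(1.0 - rho**2)
    y_t = np.r_[s * yw[0], yw[1:] - rho * yw[:-1]]
    X_t = np.vstack([s * Xw[0], Xw[1:] - rho * Xw[:-1]])
    return sm.OLS(y_t, X_t).fit(), rho
```

Weighting first makes the error homoskedastic AR(1); the quasi-differencing step is then exactly the Prais-Winsten transformation from the FGLS discussion earlier.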