Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Time Series and Forecasting Lecture 3 Forecast Intervals, Multi-Step Forecasting Bruce E. Hansen Summer School in Economics and Econometrics University of Crete July 23-27, 2012 Bruce Hansen (University of Wisconsin) Forecasting July 23-27, 2012 1 / 102 Today’s Schedule Review Forecast Intervals Forecast Distributions Multi-Step Direct Forecasts Fan Charts Iterated Forecasts Bruce Hansen (University of Wisconsin) Forecasting July 23-27, 2012 2 / 102 Review Optimal point forecast of yn +1 given information In is the conditional mean E (yn +1 jIn ) Estimate linear approximations by least-squares Combine point forecasts to reduce MSFE Select estimators and combination weights by cross-validation Estimate GARCH models for conditional variance Bruce Hansen (University of Wisconsin) Forecasting July 23-27, 2012 3 / 102 Interval Forecasts Take the form [a, b ] Should contain yn +1 with probability 1 1 2α 2α = Pn (yn +1 2 [a, b ]) = Pn (yn +1 b ) = Fn (b ) Fn (a) Pn (yn +1 a) where Fn (y ) is the forecast distribution It follows that a = qn ( α ) b = qn ( 1 a = α’th and b = (1 Bruce Hansen (University of Wisconsin) α) α)’th quantile of conditional distribution Forecasting July 23-27, 2012 4 / 102 Interval Forecasts are Conditional Quantiles The ideal 80% forecast interval, is the 10% and 90% quantile of the conditional distribution of yn +1 given In Our feasible forecast intervals are estimates of the 10% and 90% quantile of the conditional distribution of yn +1 given In The goal is to estimate conditional quantiles. Bruce Hansen (University of Wisconsin) Forecasting July 23-27, 2012 5 / 102 Mean-Variance Model Write yt + 1 = µ t + σ t ε t + 1 µt σ2t = E (yt +1 jIt ) = var (yt +1 jIt ) Assume that εt +1 is independent of It . Let qt (α) and q ε (α) be the α’th quantiles of yt +1 and εt +1 . Then qt ( α ) = µ t + σ t q ε ( α ) Thus a (1 2α) forecast interval for yn +1 is [ µn + σ n q ε ( α ), Bruce Hansen (University of Wisconsin) µn + σ n q ε (1 Forecasting α)] July 23-27, 2012 6 / 102 Mean-Variance Model Given the conditional mean µn and variance σ2n , the conditional quantile of yn +1 is a linear function µn + σn q ε (α) of the conditional quantile q ε (α) of the normalized error ε n +1 = en + 1 σn Interval forecasts thus can be summarized by µn , σ2n , and q ε (α) Bruce Hansen (University of Wisconsin) Forecasting July 23-27, 2012 7 / 102 Normal Error Quantile Forecasts N (0, 1) Make the approximation εt +1 I I q ε (α) Then = Z (a) are normal quantiles Useful simpli…cation, especially in small samples 0.10, 0.25, 0.75, 0.90 quantiles are I 1.285, 0.675, 0.675, 1.285 Forecast intervals b n Z ( α ), bn + σ [µ Bruce Hansen (University of Wisconsin) b n Z (1 bn + σ µ Forecasting α)] July 23-27, 2012 8 / 102 Nonparametric Error Quantile Forecasts Let εt +1 I I I I F be unknown We can estimate q ε (α) as the empirical quantiles of the residuals Set e e bεt +1 = t +1 bt σ Sort bε1 , ...,bεn . q bε (α) and q bε (1 α) are the α’th and (1 bn q bn + σ bε ( α ), [µ Computationally simple Reasonably accurate when n α)’th percentiles bn q bn + σ µ bε (1 α)] 100 Allows asymmetric and fat-tailed error distributions Bruce Hansen (University of Wisconsin) Forecasting July 23-27, 2012 9 / 102 Constant Variance Case bt = σ b is a constant, there is no advantage for estimation of σ b for If σ forecast interval Let q be (α) and q be (1 α) be the α’th and (1 original residuals e et + 1 α)’th percentiles of Forecast Interval: bn + q bε ( α ), [µ bn + q µ be ( 1 α)] When the estimated variance is a constant, this is numerically identical to the de…nition with rescaled errors bεt +1 Bruce Hansen (University of Wisconsin) Forecasting July 23-27, 2012 10 / 102 Computation in R quadreg package I I I may need to be installed library(quadreg) rq command If e is vector of (normalized) residuals and a is the quantile to be evalulated I I I rq(e~1,a) q=coef(rq(e~1,a)) Quantile regression of e on an intercept Bruce Hansen (University of Wisconsin) Forecasting July 23-27, 2012 11 / 102 Example: Interest Rate Forecast n = 603 observations e et + 1 bεt +1 = from GARCH(1,1) model bt σ 0.10, 0.25, 0.75, 0.90 quantiles 1.16, 0.59, 0.62, 1.26 Point Forecast = 1.96 50% Forecast interval = [1.82, 2.10] 80% Forecast interval = [1.69, 2.25] Bruce Hansen (University of Wisconsin) Forecasting July 23-27, 2012 12 / 102 Example: GDP n = 207 observations e et + 1 bεt +1 = from GARCH(1,1) model bt σ 0.10, 0.25, 0.75, 0.90 quantiles 1.18, 0.63, 0.57, 1.26 Point Forecast = 1.31 50% Forecast interval = [0.04, 80% Forecast interval = [ 1.07, Bruce Hansen (University of Wisconsin) 2.4] 3.8] Forecasting July 23-27, 2012 13 / 102 Mean-Variance Model Interval Forecasts - Summary The key is to break the distribution into the mean µt , variance σ2t and the normalized error εt +1 yt +1 = µt + σt εt +1 Then the distribution of yn +1 is determined by µn , σ2n and the distribution of εn +1 Each of these three components can be separately approximated and estimated Typically, we put the most work into modeling (estimating) the mean µt I I The remainder is modeled more simply For macro forecasts, this re‡ects a belief (assumption?) that most of the predictability is in the mean, not the higher features. Bruce Hansen (University of Wisconsin) Forecasting July 23-27, 2012 14 / 102 Alternative Approach: Quantile Regression Recall, the ideal 1 2α interval is [qn (α), qn (1 α)] qn (α) is the α’th quantile of the one-step conditional distribution Fn (y ) = P (yn +1 y j In ) Equivalently, let’s directly model the conditional quantile function Bruce Hansen (University of Wisconsin) Forecasting July 23-27, 2012 15 / 102 Quantile Regression Function The conditional distribution is P (yn +1 y j In ) ' P (yn +1 y j xn ) The conditional quantile function qα (x) solves P (yn +1 qα (x) j xn = x) = α q.5 (x) is the conditional median q.1 (x) is the 10% quantile function q.9 (x) is the 90% quantile function Bruce Hansen (University of Wisconsin) Forecasting July 23-27, 2012 16 / 102 Quantile Regression Functions For each α, qα (x) is an arbitrary function of x For each x, qα (x) is monotonically increasing in α Quantiles are well de…ned even when moments are in…nite When distributions are discrete then quantiles may be intervals – we ignore this We approximate the functions as linear in qα (x) qα (x ) ' x0 β α (after possible transformations in x) The coe¢ cient vector x0 βα depends on α Bruce Hansen (University of Wisconsin) Forecasting July 23-27, 2012 17 / 102 Linear Quantile Regression Functions qα (x ) = x0 β α If only the intercept depends on α, qα (x ) ' µ α + x0 β then the quantile regression lines are parallel I I This is when the error et +1 in a linear model is independent of the regressors Strong conditional homoskedasticity In general, the coe¢ cients are functions of α I Similar to conditional heteroskedasticity Bruce Hansen (University of Wisconsin) Forecasting July 23-27, 2012 18 / 102 Interval Forecasts An ideal 1 2α interval forecast interval is xn0 βα , xn0 β1 Note that the ideal point forecast is predictor xn0 β α where β is the best linear An alternative point forecast is the conditional median xn0 β0.5 I This has the property of being the best linear predictor in L1 (mean absolute error) All are linear functions of xn , just di¤erent functions A feasible forecast interval is h b , xn0 β α b and β b where β α 1 Bruce Hansen (University of Wisconsin) α b xn0 β 1 α i are estimates of βα and β1 Forecasting α July 23-27, 2012 19 / 102 Check Function Recall that the mean µ = EY minimizes the L2 risk E (Y Similarly the median q0.5 minimizes the L1 risk E jY m )2 mj The α’th quantile qα minimizes the “check function risk E ρ α (Y m) where ρ α (u ) = 8 < : = u (α u (1 α) uα u<0 u 0 1 (u < 0)) This is a tilted absolute value function To see the equivalence, evaluate the …rst order condition for minimization Bruce Hansen (University of Wisconsin) Forecasting July 23-27, 2012 20 / 102 Extremum Representation qα (x) solves qα (x) = argmin E (ρα (yt +1 m m ) jxt = x) Sample criterion Sα ( β ) = 1n 1 ρα yt +1 n t∑ =0 xt0 β Quantile regression estimator b = argmin Sα ( β) β α β Computation by linear programming I I I Stata R Matlab Bruce Hansen (University of Wisconsin) Forecasting July 23-27, 2012 21 / 102 Computation in R quantreg package I I I may need to be installed library(quantreg) For quantile regression of y on x at a’th quantile F I I do not include intercept in x , it will be automatically included rq(y~x,a) For coe¢ cients, F b=coef(rq(y~x,a)) Bruce Hansen (University of Wisconsin) Forecasting July 23-27, 2012 22 / 102 Distribution Theory The asymptotic theory for the dependent data case is not well developed The theory for the cross-section (iid) case is Angrist, Chernozhukov and Fernandez-Val (Econometrica, 2006) Their theory allows for quantile regression viewed as a best linear approximation p d b βα n β ! N (0, Vα ) α Vα = Jα 1 Σ α Jα Jα = E fy xt0 βα jxt xt xt0 Σα = E xt xt0 ut2 ut = 1 yt +1 < xt0 βα α Under correct speci…cation, Σα = α(1 α)E (xt xt0 ) I suspect that this theorem extends to dependent data if the score is uncorrelated (dynamics are well speci…ed) Bruce Hansen (University of Wisconsin) Forecasting July 23-27, 2012 23 / 102 Standard Errors The asymptotic variance depends on the conditional density function I Nonparametric estimation! To avoid this, most researchers use bootstrap methods For dependent data, this has not been explored Recommend: Use current software, but be cautious! Bruce Hansen (University of Wisconsin) Forecasting July 23-27, 2012 24 / 102 Crossing Problem and Solution The conditional quantile functions qα (x) are monotonically increasing in α But the linear quantile regression approximations qα (x) ' x0 βα cannot be globally monotonic in α, unless all lines are parallel The regression approximations may cross! b may cross! The estimates q bα (x) = x0 β α If this happens, forecast intervals may be inverted: I A 90% interval may not nest an 80% interval Simple Solution: Reordering I I I 1 If q bα1 (x) > q bα2 (x) when α1 < α2 < , simply set q bα1 (x) = q bα2 (x), 2 1 and conversely quantiles above 2 Take the wider interval Then the endpoint of the two intervals will be the same Bruce Hansen (University of Wisconsin) Forecasting July 23-27, 2012 25 / 102 Model Selection and Combination To my knowledge, no theory of model selection for median regression or quantile regression, even in iid context A natural conjecture is to use cross-validation on the sample check function I But no current theory justi…es this choice My recommendation for model selection (or combination) I I I I I Select the model for the conditional mean by cross-validation Use the same variables for all quantiles Select the weights by cross-validation on the conditional mean For each quantile, estimate the models with positive weights Take the weighted combination using the same weights. Bruce Hansen (University of Wisconsin) Forecasting July 23-27, 2012 26 / 102 Example: Interest Rates AR(2) Speci…cation (selected for regression by CV) yt +1 = β0 + β1 yt + β2 yt β0 β1 β2 α = 0.10 0.31 0.46 0.22 α = 0.25 0.14 0.31 0.17 1 + et α = 0.75 0.15 0.35 0.21 α = 0.90 0.29 0.34 0.25 Forecast 10% quantile q0.1 (xn ) = 0.31 + 0.46yn 50% Forecast interval = [1.84, 2.12] 80% Forecast interval = [1.65, 2.25] 0.22yn 1 Very close to those from mean-variance estimates Bruce Hansen (University of Wisconsin) Forecasting July 23-27, 2012 27 / 102 Example: GDP Leading Indicator Model yt +1 = β0 + β1 yt + β2 Spreadt + β3 HighYield + β4 Starts + β5 Permits + e β0 β1 β2 β3 β4 β5 α = 0.10 2.72 0.28 1.17 2.12 2.20 3.45 α = 0.25 0.14 0.14 0.75 1.83 0.44 1.61 50% Forecast interval = [0.1, 80% Forecast interval = [ 1.8, Bruce Hansen (University of Wisconsin) α = 0.75 0.10 0.33 0.31 0.62 6.68 4.87 α = 0.90 2.0 0.28 0.14 0.37 11.4 9.53 3.2] 4.0] Forecasting July 23-27, 2012 28 / 102 Distribution Forecasts The conditional distribution is Ft (y ) = P (yt +1 y j It ) It is not common to directly report Ft (y ) I or the one-step forecast distribution Fn (y ) However, Ft (y ) may be used as an input For example, simulation We thus may want an estimate Fbt (y ) of Ft (y ) Bruce Hansen (University of Wisconsin) Forecasting July 23-27, 2012 29 / 102 Mean-Variance Model Distribution Forecasts Model yt +1 = µt + σt εt +1 with εt +1 is independent of It . Let εt +1 have distribution F ε (u ) = P (εt u) . The conditional distribution of yt +1 is Ft ( y ) = F ε yt +1 µt σt Fbt (y ) = Fb ε bt yt +1 µ bt σ Estimation where Fb ε (u ) is an estimate of F ε (u ) = P (εt Bruce Hansen (University of Wisconsin) Forecasting u) . July 23-27, 2012 30 / 102 Normal Error Model Under the assumption εt +1 N (0, 1), F ε (u ) = Φ(u ), the normal CDF bt y µ Fbt (y ) = Φ bt σ To simulate from Fbt (y ) I I I bt bt and σ Calculate µ Draw εt +1 iid from N (0, 1) b t ε t +1 bt + σ yt + 1 = µ The normal assumption can be used when sample size n is very small bt b and σ But then Fbt (y ) contains no information beyond µ t Bruce Hansen (University of Wisconsin) Forecasting July 23-27, 2012 31 / 102 Nonparametric Error Model Let Fbnε be the empirical distribution function (EDF) of the normalized residuals bεt +1 The EDF puts probability mass 1/n at each point fbε1 , ...,bεn g Fbnε (u ) = n Fbt (y ) = Fbnε = n 1 n 1 ∑ 1 (bεt +1 y 1 bt σ n 1 ∑1 bt µ y j =0 = n 1 u) t =0 n 1 ∑ 1 (y j =0 bt σ bt µ bεj +1 bt bεj +1 ) bt + σ µ bt …xed bt , σ Notice the summation over j, holding µ Bruce Hansen (University of Wisconsin) Forecasting July 23-27, 2012 32 / 102 Simulate Estimated Conditional Distribution To simulate I I I I bt bt and σ Calculate µ Draw εt +1 iid from normalized residuals fbε1 , ...,bεn g b t ε t +1 bt + σ yt + 1 = µ yt +1 is a draw from Fbt (y ) Bruce Hansen (University of Wisconsin) Forecasting July 23-27, 2012 33 / 102 Plot Estimated Conditional Distribution Fbn (y ) = n bnbεt +1 ) bn + σ µ ∑tn=01 1 (y bnbεt +1 bn + σ A step function, with steps of height 1/n at µ 1 Calculation I I I I bn , and yt +1 = µ bnbεt +1 , t = 0, ..., n 1 bn , σ bn + σ Calculate µ Sort yt +1 into order statistics y(j ) Equivalently, sort bεt +1 into order statistics bε(1 ) and set bnbε(j ) bn + σ y(j ) = µ Plot on the y-axis f1/n, 2/n, 3/n, ..., 1g against on the x-axis y(1 ) , y(2 ) , ..., y(n ) Bruce Hansen (University of Wisconsin) Forecasting July 23-27, 2012 34 / 102 Examples: Interest Rate GDP Bruce Hansen (University of Wisconsin) Forecasting July 23-27, 2012 35 / 102 0.0 0.2 0.4 0.6 0.8 1.0 Figure: 10-Year Bond Rate Forecast Distribution 1.0 Bruce Hansen (University of Wisconsin) 1.5 2.0 Forecasting 2.5 July 23-27, 2012 36 / 102 0.0 0.2 0.4 0.6 0.8 1.0 Figure: GDP Forecast Distribution -4 Bruce Hansen (University of Wisconsin) -2 0 2 Forecasting 4 6 8 July 23-27, 2012 37 / 102 Quantile Regression Approach The distribution function may also be recovered from the estimated quantile functions. Fn (qα (xn )) = α Fbn (q bα (xn )) = α b q bα (xn ) = xn0 β α b for a set of quantiles fα1 , ..., αJ g Compute q bα (xn ) = xn0 β α Plot αj on the y -axis against q bαj (xn ) on the x-axis I The plot is Fbn (y ) at y = q bαj (xn ) If the quantile lines cross, then the plot will be non-monotonic The reordering method ‡attens the estimated distribution at these points Bruce Hansen (University of Wisconsin) Forecasting July 23-27, 2012 38 / 102 Multi-Step Forecasts Forecast horizon: h We say the forecast is “multi-step” if h > 1 Forecasting yn +h given In e.g., forecasting GDP growth for 2012:3, 2012:4, 2013:1, 2013:2 The forecast distribution is yn +h j In Bruce Hansen (University of Wisconsin) Forecasting Fh (yn +h jIn ) July 23-27, 2012 39 / 102 Point Forecast fn +h jh minimizes expected squared loss fn +h jh = argmin E (yn +h f f ) 2 j In = E (yn +h jIn ) Optimal point forecasts are h-step conditional means Bruce Hansen (University of Wisconsin) Forecasting July 23-27, 2012 40 / 102 Relationship Between Forecast Horizons Take an AR(1) model yt +1 = αyt + ut +1 Iterate yt +1 = α (αyt + ut ) + ut + 1 = α yt 1 + αut + ut +1 1 2 or yt +2 = α2 yt + et +2 ut +2 = αut +1 + ut +2 Repeat h times yt +h et + h = αh yt + et +h = ut +h + αut +h Bruce Hansen (University of Wisconsin) 1 + α 2 ut + h Forecasting 2 + + αh 1 ut + 1 July 23-27, 2012 41 / 102 AR(1) h-step forecast = αh yt + et +h et +h = ut +h + αut +h E (yn +h jIn ) = αh yn yt +h 1 + α 2 ut + h 2 + + αh 1 ut + 1 h step point forecast is linear in yn h-step forecast error en +h is a MA(h Bruce Hansen (University of Wisconsin) Forecasting 1) July 23-27, 2012 42 / 102 AR(2) Model 1-step AR(2) model yt +1 = α0 + α1 yt + α2 yt 1 + ut + 1 2-steps ahead yt +2 = α0 + α1 yt +1 + α2 yt + ut +2 Taking conditional expectations E (yt +2 jIt ) = α0 + α1 E (yt +1 jIt ) + α2 E (yt jIt ) + E (et +2 jIt ) = α0 + α1 (α0 + α1 yt + α2 yt 1 ) + α2 yt = α0 + α1 α0 + α21 + α2 yt + α1 α2 yt 1 which is linear in (yt , yt 1) In general, a 1-step linear model implies an h-step approximate linear model in the same variables Bruce Hansen (University of Wisconsin) Forecasting July 23-27, 2012 43 / 102 AR(k) h-step forecasts If yt +1 = α0 + α1 yt + α2 yt 1 + + αk yt k +1 + ut + 1 yt +h = β0 + β1 yt + β2 yt 1 + + βk yt k +1 + et + h then where et +h is a MA(h-1) Bruce Hansen (University of Wisconsin) Forecasting July 23-27, 2012 44 / 102 Leading Indicator Models If yt +1 = xt0 β + ut then E (yt +h jIt ) = E (xt +h If E (xt +h 1 j It ) 1 j It ) 0 β is itself (approximately) a linear function of xt , then E (yt +h jIt ) = xt0 γ yt +h = xt0 γ + et +h Common Structure: h-step conditional mean is similar to 1-step structure, but error is a MA. Bruce Hansen (University of Wisconsin) Forecasting July 23-27, 2012 45 / 102 Forecast Variable We should think carefully about the variable we want to report in our forecast The choice will depend on the context What do we want to forecast? I Future level: yn +h F I I interest rates, unemployment rates Future di¤erences: ∆yt +h Cummulative Change: ∆yt +h F Cummulative GDP growth Bruce Hansen (University of Wisconsin) Forecasting July 23-27, 2012 46 / 102 Forecast Transformation fn +h jn = E (yn +h jIn ) = expected future level I Level speci…cation yt + h fn + h j n I Di¤erence speci…cation ∆yt +h fn +h jn I = xt0 β + et +h = xt0 β = xt0 βh + et +h = yn + xt0 β1 + + xt0 βh Multi-Step di¤erence speci…cation yt +h yt fn +h jn Bruce Hansen (University of Wisconsin) = xt0 β + et +h = yn + xt0 β Forecasting July 23-27, 2012 47 / 102 Direct and Iterated There are two methods of multistep (h > 1) forecasts Direct Forecast I Model and estimate E (yn +h jIn ) directly Iterated Forecast I I I Model and estimate one-step E (yn +1 jIn ) Iterate forward h steps Requires full model for all variables Both have advantages and disadvantages I For now, we will forcus on direct method. Bruce Hansen (University of Wisconsin) Forecasting July 23-27, 2012 48 / 102 Direct Multi-Step Forecasting Markov approximation I E (yn +h jIn ) = E (yn +h jxn , xn 1 , ...) E (yn +h jxn , ..., xn p ) Linear approximation I E (yn +h jxn , ..., xn p ) β0 xn Projection De…nition I β = (E (xt xt0 )) 1 (E (xt yt +h )) Forecast error I et +h = yt +h Bruce Hansen (University of Wisconsin) β0 xt Forecasting July 23-27, 2012 49 / 102 Multi-Step Forecast Model yt +h = β0 xt + et +h β = E xt xt0 1 (E (xt yt +h )) E (xt et +h ) = 0 σ2 = E et2+h Bruce Hansen (University of Wisconsin) Forecasting July 23-27, 2012 50 / 102 Properties of the Error E (xt et +h ) = 0 I Projection E ( et + h ) = 0 I Inclusion of an intercept The error et +h is NOT serially uncorrelated It is at least a MA(h-1) Bruce Hansen (University of Wisconsin) Forecasting July 23-27, 2012 51 / 102 Least Squares Estimation b = β ybn +h jn Bruce Hansen (University of Wisconsin) n 1 ∑ t =0 xt xt0 ! 1 n 1 ∑ xt yt +h t =0 ! b 0 xn = b fn + h j n = β Forecasting July 23-27, 2012 52 / 102 Distribution Theory - Consistent Estimation By the WLLN, b= β n 1 ∑ xt xt0 t =0 ! 1 p = β Bruce Hansen (University of Wisconsin) ! E xt xt0 Forecasting n 1 ∑ xt yt +h t =0 1 ! (E xt yt +h ) July 23-27, 2012 53 / 102 Distribution Theory - Asymptotic Normality By the dependent CLT, 1n 1 xt et +h n t∑ =0 Ω = E xt xt0 et2+h + ∞ ∑ d ! N (0, Ω) xt xt0 +j et +h et +h +j + xt +j xt0 et +h et +h +j j =1 ' E xt xt0 et2+h + h 1 ∑ xt xt0 +j et +h et +h j + xt +j xt0 et +h et +h +j j =1 A long-run (HAC) covariance matrix If model is correctly speci…ed, the errors are a MA(h-1) and the sum truncates at h 1 Otherwise, this is an approximation It does not simplify to the iid covariance matrix Bruce Hansen (University of Wisconsin) Forecasting July 23-27, 2012 54 / 102 Distribution Theory p b n β V =Q Ω β d ! N (0, V ) 1 ΩQ 1 E xt xt0 et2+h + ∑hj =11 xt xt0 +j et +h et +h j + xt +j xt0 et +h et +h +j HAC variance matrix Bruce Hansen (University of Wisconsin) Forecasting July 23-27, 2012 55 / 102 Residuals Least-squares residuals I I b 0 xt b et +h = yt +h β Standard, but over…t Leave-one-out residuals I I b 0 xt e et +h = yt +h β t Does not correct for MA errors Leave h out residuals b β t,h = e et +h = yt +h ∑ xj xj0 jj +h t j h ! b0 β t,h xt 1 ∑ jj +h t j h The summation is over all observations outside h Bruce Hansen (University of Wisconsin) Forecasting xj yj +h ! 1 periods of t + h. July 23-27, 2012 56 / 102 Algebraic Computation of Leave h out residuals Loop across each observation t = (yt +h , xt ) Leave out observations t h + 1, ..., t, ..., t + h 1 R command I I I For positive integers i x[-i] returns elements of x excluding indices i Consider F F F F I ii=seq(i-h+1,i+h-1) ii<-ii[ii>0] yi=y[-ii] xi=x[-ii,] This removes t Bruce Hansen (University of Wisconsin) h + 1, ..., t, ..., t + h Forecasting 1 from y and x July 23-27, 2012 57 / 102 Variance Estimator Asymptotic variance (HAC) estimator with leave-h-out residuals bQ b 1Ω b 1 = Q n 1 b = 1 ∑ xt xt0 Q n t =0 b V b = Ω 1 n 1 h 1n j xt xt0 e et2+h + ∑ ∑ xt xt0 +j e et + h e et +h +j + xt +j xt0 e et + h e et + h + ∑ n t =1 n j =1 t =1 Can use least-squares residuals b et +h instead of leave-h-out residuals, b but then multiply V by n/(n dim(xt )). b are the square roots of the diagonal elements of Standard errors for β 1 b n V Bruce Hansen (University of Wisconsin) Forecasting July 23-27, 2012 58 / 102 Example: GDP Forecast yt = 400 log(GDPt ) Forecast Variable: GDP growth over next h quarters, at annual rate yt +h yt = β0 + β1 ∆yt + β1 ∆yt h 1 + Spreadt + HighYieldt + β2 HSt + et +h HSt =Housing Startst β0 ∆yt ∆yt 1 Spreadt HighYieldt HSt h=1 0.33 0.16 0.09 0.61 1.10 1.86 (1.0) (.10) (.10) (.23) (.75) (.65) Bruce Hansen (University of Wisconsin) h=2 0.38 (1.3) 0.18 (.09) 0.04 (.05) 0.65(.19) 0.68 (.70) 1.64 (.70) Forecasting h=3 0.01 0.13 0.05 0.65 0.48 1.31 (1.6) (.08) (.07) (.22) (.90) (.80) h=4 0.47 (1.8) 0.13 (.09) 0.02 (.06) 0.65 (.25) 0.41 (1.01) 1.01 (.94) July 23-27, 2012 59 / 102 Example: GDP Forecast 2012:2 2012:3 2012:4 2013:1 2013:2 2013:3 2013:4 2014:1 Bruce Hansen (University of Wisconsin) Cummulative Annualized Growth 1.3 1.6 2.9 2.2 2.4 2.7 2.9 3.2 Forecasting July 23-27, 2012 60 / 102 Selection and Combination for h step forecasts AIC routinely used for model selection PLS (OOS MSFE) routinely used for model evaluation Neither well justi…ed Bruce Hansen (University of Wisconsin) Forecasting July 23-27, 2012 61 / 102 Point Forecast and MSFE b (m ) of β , the point forecast for yn +h is Given an estimate β b 0 xn fn + h j n = β The mean-squared-forecast-error (MSFE) is MSFE = E en + h ' σ2 + E b xn0 β b β where Q = E (xn xn0 ) and σ2 = E en2+h β 2 β 0 b Q β β Same form as 1-step case Bruce Hansen (University of Wisconsin) Forecasting July 23-27, 2012 62 / 102 Residual Fit b2 = σ 1n 1 2 1n 1 b et +h + ∑ xt0 β ∑ n t =0 n t =0 2n 1 b et +h xt0 β n t∑ =0 2 0 ' MSFE e Pe n where B = E (e0 Pe) Save form as 1-step case Bruce Hansen (University of Wisconsin) b2 ' MSFEn E σ Forecasting 2 β β 2 B n July 23-27, 2012 63 / 102 Asymptotic Penalty 1 p e0 X n e0 Pe = 1 0 XX n !d Z 0 Q where Z 1 1 1 p X0 e n Z N (0, Ω), with Ω =HAC variance. B Bruce Hansen (University of Wisconsin) = E e0 Pe ! tr Q 1 E ZZ 0 = tr Q 1 Ω Forecasting July 23-27, 2012 64 / 102 Ideal MSFE Criterion b 2 (m ) + Cn ( m ) = σ 2 tr Q n 1 Ω Q = E xt xt0 Ω = E xt xt0 et2+h + h 1 ∑ xt xt0 +j et +h et +h j + xt +j xt0 et +h et +h +j j =1 Bruce Hansen (University of Wisconsin) Forecasting July 23-27, 2012 65 / 102 H-Step Robust Mallows Criterion b 2 (m ) + Cn ( m ) = σ b is a HAC covariance matrix where Ω Bruce Hansen (University of Wisconsin) 2 b tr Q n Forecasting 1b Ω July 23-27, 2012 66 / 102 H-Step Cross-Validation for Selection CVn (m ) = b β t,h = 1n 1 e et + h ( m ) 2 n i∑ =0 b0 e et +h = yt +h β ! 1 ∑ xj xj0 jj +h t j h t,h xt ∑ jj +h t j h xj yj +h ! Theorem: E (CVn (m )) ' MSFE (m ) Thus m b = argmin CVn (m ) is an estimate of m = argmin MSFEn (m ), but there is no proof of optimality Bruce Hansen (University of Wisconsin) Forecasting July 23-27, 2012 67 / 102 H-Step Cross-Validation for Forecast Combination CVn (w) = = 1 n e et + 1 ( w ) 2 n t∑ =1 1 n n t∑ =1 M = ∑ M M ∑ m =1 w (m )e et + 1 ( m ) ∑ w (m)w (`) m =1 `=1 0e = w Sw where !2 1 n e et +1 (m )e et +1 (`) n t∑ =1 e = 1e S e 0e e n is covariance matrix of leave-h-out residuals. Bruce Hansen (University of Wisconsin) Forecasting July 23-27, 2012 68 / 102 Cross-validation Weights Combination weights found by constrained minimization of CVn (w) e min CVn (w) = w0 Sw w subject to M ∑ w (m ) = 1 m =1 0 Bruce Hansen (University of Wisconsin) w (m ) Forecasting 1 July 23-27, 2012 69 / 102 Illustration 1 k = 8 regressors I I intercept normal AR(1)’s with coe¢ cient ρ = 0.9 h-step error I I normal MA(h-1) equal coe¢ cients Regression coe¢ cients I I I β = (µ, 0, ..., 0) n = 50 MSPE plotted as a function of µ Bruce Hansen (University of Wisconsin) Forecasting July 23-27, 2012 70 / 102 Estimators Unconstrained Least-Squares Leave-1-out CV Selection Leave-h-out CV Selection Leave-1-out CV Combination Leave-h-out CV Combination Bruce Hansen (University of Wisconsin) Forecasting July 23-27, 2012 71 / 102 Bruce Hansen (University of Wisconsin) Forecasting July 23-27, 2012 72 / 102 Illustration 2 Model yt = αyt 1 + ut Unconstrained model: AR(3) yt = µ̂ + β̂1 yt Bruce Hansen (University of Wisconsin) h + β̂2 yt h 1 Forecasting + β̂3 yt h 2 + êt July 23-27, 2012 73 / 102 Bruce Hansen (University of Wisconsin) Forecasting July 23-27, 2012 74 / 102 Bruce Hansen (University of Wisconsin) Forecasting July 23-27, 2012 75 / 102 Example: GDP Forecast Weights by Horizon h=1 AR(1) AR(2) AR(1)+HS AR(1)+HS+BP AR(2)+HS ybn +h jn .30 .66 h=2 .15 h=3 .19 h=4 .28 h=5 .18 h=6 .16 h=7 .11 .70 .14 .22 .58 .72 .82 .84 .89 2.0 1.9 2.0 2.1 2.3 2.6 .04 1.7 Bruce Hansen (University of Wisconsin) Forecasting July 23-27, 2012 76 / 102 h-step Variance Forecasting Not well developed using direct methods Suggest using constant variance speci…cation Bruce Hansen (University of Wisconsin) Forecasting July 23-27, 2012 77 / 102 h-step Interval Forecasts Similar to 1-step interval forecasts I But calculated from h step residuals Use constant variance speci…cation Let q be (α) and q be ( 1 residuals e et + h α) be the α’th and (1 α)’th percentiles of Forecast Interval: bn + q bε ( α ), [µ Bruce Hansen (University of Wisconsin) bn + q µ be ( 1 Forecasting α)] July 23-27, 2012 78 / 102 Quantile Regression Approach Fn (y ) = P (yn +h qα (x ) ' x0 β y j In ) α Estimate quantile regression of yt +h on xt b b , xn0 β 1 2α forecast interval is [xn0 β 1 α] α Asymptotic theory not developed for h step case I I Developed for 1-step case Extension is expected to work Bruce Hansen (University of Wisconsin) Forecasting July 23-27, 2012 79 / 102 Example: GDP Forecast Intervals (80%) Using quantile regression approach 2012 : 2 2012 : 3 2012 : 4 2013 : 1 2013 : 2 2013 : 3 2013 : 4 2014 : 1 Bruce Hansen (University of Wisconsin) ybn +h jn 1.3 1.6 2.0 2.2 2.4 2.7 2.9 3.2 Forecasting Interval [ 1.8, 4.1] [ 0.4, 3.6] [ 0.6, 4.6] [ 0.3, 4.1] [0.2, 4.2] [0.6, 3.8] [0.7, 4.8] [1.5, 4.8] July 23-27, 2012 80 / 102 Fan Charts Plots of a set of interval forecasts for multiple horizons I I I I Pick a set of horizons, h = 1, ..., H Pick a set of quantiles, e.g. α = .10, .25, .75, .90 Recall the quantiles of the conditional distribution are qn (α, h ) = µn (h ) + σn (h )q ε (α, h ) Plot qn (.1, h ), qn (.25, h ), µn (h ), qn (.75, h ), qn (.9, h ) against h Graphs easier to interpret than tables Bruce Hansen (University of Wisconsin) Forecasting July 23-27, 2012 81 / 102 Illustration I’ve been making monthly forecasts of the Wisconsin unemployment rate Forecast horizon h = 1, ..., 12 (one year) Quantiles: α = .1, .25, .75, .90 This corresponds to plotting 50% and 80% forecast intervals 50% intervals show “likely” region (equal odds) Bruce Hansen (University of Wisconsin) Forecasting July 23-27, 2012 82 / 102 Bruce Hansen (University of Wisconsin) Forecasting July 23-27, 2012 83 / 102 Comments Showing the recent history gives perspective Some published fan charts use colors to indicate regions, but do not label the colors Labels important to infer probabilities I like clean plots, not cluttered Bruce Hansen (University of Wisconsin) Forecasting July 23-27, 2012 84 / 102 Illustration: GDP Growth Bruce Hansen (University of Wisconsin) Forecasting July 23-27, 2012 85 / 102 -1 0 1 2 3 4 5 Figure: GDP Average Growth Fan Chart 2011.0 2011.5 Bruce Hansen (University of Wisconsin) 2012.0 2012.5 Forecasting 2013.0 2013.5 2014.0 July 23-27, 2012 86 / 102 It doesn’t “fan” because we are plotting average growth Bruce Hansen (University of Wisconsin) Forecasting July 23-27, 2012 87 / 102 Iterated Forecasts Estimate one-step forecast Iterate to obtain multi-step forecasts Only works in complete systems I I Autoregressions Vector autoregressions Bruce Hansen (University of Wisconsin) Forecasting July 23-27, 2012 88 / 102 Iterative Forecast Relationships in Linear VAR vector yt yt +1 = A0 + A1 yt + A2 yt 1 + + Ak yt k +1 + ut + 1 1-step conditional mean E (yt +1 jIt ) = A0 + A1 E (yt jIt ) + + Ak E (yt = A0 + A1 yt + A2 yt 1 + + Ak yt k +1 jIt ) k +1 2-step conditional mean E (yt +1 jIt 1) = E (E (yt +1 jIt ) jIt 1 ) = A0 + A1 E (yt jIt 1 ) + + Ak E (yt k +1 jIt = A0 + A1 E (yt jIt 1 ) + A2 yt 1 + + Ak yt 1) k +1 h step conditional mean E (yt +1 jIt h +1 ) = E E ( y t + 1 j It ) j I t h + 1 = A0 + A1 E (yt jIt h +1 ) + Linear in lower-order (up to h Bruce Hansen (University of Wisconsin) + Ak E (yt 1 step) conditional means Forecasting k + 1 j It h + 1 July 23-27, 2012 89 / 102 Iterative Least Squares Forecasts Estimate 1-step VAR(k) by least-squares b0 + A b 1 yt + A b 2 yt yt +1 = A 1 b k yt +A + Gives 1-step point forecast b0 + A b 1 yn + A b 2 yn ybn +1 jn = A 1 + 2-step iterative forecast b k yn +A h step iterative forecast 1 jn b 2 ybn +h +A 2 jn + ubt +1 b k yn +A b0 + A b 1 ybn +1 jn + A b 2 yn + ybn +2 jn = A b0 + A b 1 ybn +h ybn +h jn = A k +1 + k +1 k +2 b k ybn +h +A k jn This is (numerically) di¤erent than the direct LS forecast Bruce Hansen (University of Wisconsin) Forecasting July 23-27, 2012 90 / 102 Illustration 1: GDP Growth AR(2) Model yt +1 = 1.6 + 0.30yt + .16yt 1 yn = 1.8, yn ybn +1 ybn +2 ybn +3 ybn +4 1 = 2.9 = 1.6 + 0.30 1.8 + .16 = 1.6 + 0.30 2.6 + .16 = 1.6 + 0.30 2.7 + .16 = 1.6 + 0.30 2.9 + .16 Bruce Hansen (University of Wisconsin) 2.9 = 2.6 1.8 = 2.7 2.6 = 2.9 2.7 = 3.0 Forecasting July 23-27, 2012 91 / 102 Point Forecasts 2012:2 2012:3 2012:4 2013:1 2013:2 2013:3 2013:4 2014:1 Bruce Hansen (University of Wisconsin) Forecasting 2.65 2.72 2.87 2.93 2.97 2.99 3.00 3.01 July 23-27, 2012 92 / 102 Illustration 2: GDP Growth+Housing Starts VAR(2) Model y1t = GDP Growth, y2t =Housing Starts xt = (GDP Growtht , Housing Startst , GDP Growtht Startst 1 b0 + A b 1 yt + A b 2 yt 1 + u yt +1 = A bt +1 y1t +1 = 0.43 + 0.15y1t + 11.2y2t + 0.18y1t y2t +1 = 0.07 0.001y1t + 1.2y2t Bruce Hansen (University of Wisconsin) Forecasting 0.001y1t 1, Housing 10.1y2t 1 1 0.26y2t 1 1 July 23-27, 2012 93 / 102 Illustration 2: GDP Growth+Housing Starts y1n = 1.8, y2n = 0.71, y1n = 2.9, y2n 1 = 0.68 y1n +1 = 0.43 + 0.15 1.8 + 11.2 0.71 + 0.18 2.9 10.1 0.68 = 2.3 y2t +1 = 0.07 0.001 1.8 + 1.2 0.71 0.001 2.9 0.26 0.68 = 0.76 y1n +2 = 0.43 + 0.15 2.3 + 11.2 0.76 + 0.18 1.8 10.1 0.71 = 2.4 y2t +1 = 0.07 0.001 2.3 + 1.2 0.76 0.001 1.8 0.26 0.71 = 0.80 Bruce Hansen (University of Wisconsin) 1 Forecasting July 23-27, 2012 94 / 102 Point Forecasts 2012:2 2012:3 2012:4 2013:1 2013:2 2013:3 2013:4 2014:1 Bruce Hansen (University of Wisconsin) GDP 2.36 2.38 2.53 2.58 2.64 2.66 2.69 2.71 Forecasting Housing 0.76 0.80 0.84 0.88 0.92 0.95 0.98 1.01 July 23-27, 2012 95 / 102 Model Selection It is typical to select the 1-step model and use this to make all h-step forecasts However, there theory to support this is incomplete (It is not obvious that the best 1-step estimate produces the best h-step estimate) For now, I recommend selecting based on the 1-step estimates Bruce Hansen (University of Wisconsin) Forecasting July 23-27, 2012 96 / 102 Model Combination There is no theory about how to apply model combination to h-step iterated forecasts Can select model weights based on 1-step, and use these for all forecast horizons Bruce Hansen (University of Wisconsin) Forecasting July 23-27, 2012 97 / 102 Variance, Distribution, Interval Forecast While point forecasts can be simply iterated, the other features cannot Multi-step forecast distributions are convolutions of the 1-step forecast distribution. I Explicit calculation computationally costly beyond 2 steps Instead, simple simulation methods work well The method is to use the estimated condition distribution to simulate each step, and iterate forward. Then repeat the simulation many times. Bruce Hansen (University of Wisconsin) Forecasting July 23-27, 2012 98 / 102 Multi-Step Forecast Simulation Let µ (x) and σ (x) denote the models for the conditional one-step mean and standard deviation as a function of the conditional variables x b (x) denote the estimates of these functions, and let b (x) and σ Let µ fbε1 , ...,bεn g be the normalized residuals xn = (yn , yn 1 , ..., yn p ) is known. Set xn = xn To create one h-step realization: I Draw ε ε1 , ...,bεn g n +1 iid from normalized residuals fb I Set y b b = µ x + σ x ε ( ) ( ) n n t +1 n +1 I Set x n +1 = (yn +1 , yn , ..., yn p +1 ) I Draw ε ε1 , ...,bεn g n +2 iid from normalized residuals fb I Set y b b = µ x + σ x ε n +2 n +1 n +1 t +2 I Set x n +2 = (yn +2 , yn +1 , ..., yn p +2 ) I I Repeat until you obtain yn +h yn +h is a draw from the h step ahead distribution Repeat this B times, and let yn +h (b ), b = 1, ..., B denote the B repetitions Bruce Hansen (University of Wisconsin) Forecasting July 23-27, 2012 99 / 102 Multi-Step Forecast Simulation The simulation has produced yn +h (b ), b = 1, ..., B For forecast intervals, calculate the empirical quantiles of yn +h (b ) I For an 80% interval, calculate the 10% and 90% For a fan chart I I Calculate a set of empirical quantiles (10%, 25%, 75%, 90%) For each horizon h = 1, ..., H As the calculations are linear they are numerically quick I I I Set B large For a quick application, B = 1000 For a paper, B = 10, 000 (minimum)) Bruce Hansen (University of Wisconsin) Forecasting July 23-27, 2012 100 / 102 VARs and Variance Simulation The simulation method requires a method to simulate the conditional variances In a VAR setting, you can: I Treat the errors as iid (homoskedastic) F I Treat the errors as independent GARCH errors F I Easiest Also easy Treat the errors as multivariate GARCH F F Allows volatility to transmit across variables Probably not necessary with aggregate data Bruce Hansen (University of Wisconsin) Forecasting July 23-27, 2012 101 / 102 Assignment Take your favorite model from yesterday’s assignment Calculate forecast intervals Make 1 through 12 step forecasts I I point interval Create a fan chart Bruce Hansen (University of Wisconsin) Forecasting July 23-27, 2012 102 / 102