International Seminar on Rapid Estimates, Ottawa - May 2009
Building Flash Estimates for Selected PEEIs
Dominique Ladiray (Insee, France)
Fabio Bacchini (Istat, Italy)
Foreword
› In 2005 the Economic and Financial Committee
stressed the need to improve timeliness
– “more use of flash estimation techniques for European
aggregates should be considered”
› In 2006 Eurostat launched a call for proposals to
develop methods and tools to produce flash estimates
of 3 short-term economic indicators (GDP, IPI and LCI)
› Joint proposal from France (Insee), Germany (Destatis),
Italy (Istat) and the United Kingdom (ONS) accepted in
November 2006.
› Presentation of the joint project (objectives and results)
that ran from January 2007 to October 2008
Building Flash Estimates for Selected PEEIs
Ottawa 27-29 May 2009
Outline
› Foreword
› Eurostat Grant “Flash estimates for certain
Principal European Economic Indicators”
› What is a flash estimate?
› What is a “good model”?
› The variable selection problem
› Some results for LCI, IPI and GDP
Eurostat Call for Proposal
› Launched in June 2006
› “Flash estimates for certain PEEIs”
– Monthly IPI for Euro-area and EU25 at 30 days (42)
– Quarterly GDP for Euro-area and EU25 at 30 days (45)
– Quarterly LCI for Euro-area and EU25 at 45 days (74)
› It must be seen as an EXPLORATORY project
– Explore the possibilities (and techniques) to improve
timeliness through flash estimation techniques
– Explore also the advantages and drawbacks:
implementation, maintenance, communication etc.
A Joint Proposal
› A proposal by 4 NSIs
– France (INSEE), Germany (DESTATIS), Italy (ISTAT),
United Kingdom (ONS)
› Work divided in 2 phases
– Phase 1 (9 months), Preparatory work
‐Literature, collecting experiences, preparing databases,
developing methodologies etc.
– Phase 2 (12 months)
‐Simulations, evaluation and comparison of models,
bibliography
What is a Flash Estimate?
› We had strong discussions, not to say arguments, on
this point ...
–A “flash estimate” is like a Giraffe: easy to recognize but
not so easy to define
› A first definition (Eurostat – Barcellan’s paper)
–“A flash estimate is an early estimate produced and
published as soon as possible after the end of the
reference period, using a more incomplete set of
information than the set used for traditional estimates”
› A clear and secure definition but quite restrictive
–No way to produce our flash estimates according to this
definition
A more flexible definition
› An agreement
–Pure AR models (no new information) can be used as
benchmarks but are not accepted for FE
› Still a disagreement
–Can a model be based on soft data (BCS) only?
› A compromise
–FE models must incorporate hard data (as much as
possible).
–Easier of course for quarterly data (monthly hard data
available for at least part of the quarter)
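The benchmark idea can be sketched on synthetic data: a pure AR(1) forecast (no new information) is compared with a regression on an indicator already observed for the period being estimated. All series, names and coefficients below are invented for illustration; this is not one of the project's models.

```python
import random

def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b

def rmse(errors):
    """Root mean squared error of a list of forecast errors."""
    return (sum(e * e for e in errors) / len(errors)) ** 0.5

random.seed(0)
# Synthetic target growth rate driven by a hypothetical hard indicator.
n = 120
indicator = [random.gauss(0, 1) for _ in range(n)]
target = [0.5 + 0.8 * x + random.gauss(0, 0.3) for x in indicator]

split = 80  # estimation sample; the remaining periods are used for evaluation

# Benchmark: AR(1), target_t explained by target_{t-1} only.
a_ar, b_ar = fit_line(target[:split - 1], target[1:split])
# Candidate flash model: target_t explained by the indicator for period t.
a_fe, b_fe = fit_line(indicator[:split], target[:split])

ar_errors = [target[t] - (a_ar + b_ar * target[t - 1]) for t in range(split, n)]
fe_errors = [target[t] - (a_fe + b_fe * indicator[t]) for t in range(split, n)]

rmse_ar, rmse_fe = rmse(ar_errors), rmse(fe_errors)
print(f"AR(1) benchmark RMSE: {rmse_ar:.2f}")
print(f"Indicator model RMSE: {rmse_fe:.2f}")
```

Because the synthetic target is built from the indicator, the indicator model wins by construction; with real data the comparison is the whole point of keeping the AR benchmark.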
What is a “good” model?
› Simple (few variables)
– You can handle it during production ….
› Interpretable
– You can explain it and …. “sell” it
› Good “statistical properties” including robustness
– You do not want to change your model every month
› Good estimations (small revisions)
– You publish! Credibility (“Trustful statistics”, Peter Everaers).
– “I prefer no data than misleading data” (Drummond)
› Mazzi & Montana’s paper (Eurostat)
– “The selected model should be as simple as possible,
statistically sound, easy to use in the regular production
process ….”
Selecting a batch of explanatory variables
› A-priori selection based on several criteria
– Timeliness
– Economic theory, expert knowledge
– Available hard information on the period to “flash-estimate”
› Soft data are very timely
– Coincident and leading variables: opinion on the current
and expected production
› Hard data may be leading
– New orders
› But the selection can be quite large
The variable selection problem
› Example: estimating the EA13 GDP
› Potential explanatory variables:
–IPI, New orders, Energy prices, HICP, Unemployment,
Business surveys (Industry, Retail trade, Construction,
Services) etc. Easy to find at least 20 variables
(monthly and/or quarterly)
–13 countries + EA, current value plus 2 lags?
–That comes to 20*(13+1)*3=840 potential variables
› More than 20 billion possible models with 4
explanatory variables!!!!
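The arithmetic on this slide can be checked directly. The sketch below assumes the factor 3 stands for the current value plus 2 lags, and counts models as unordered choices of 4 regressors among the 840 candidates.

```python
from math import comb

indicators = 20      # candidate indicator types (IPI, new orders, surveys, ...)
areas = 13 + 1       # 13 countries plus the euro-area aggregate
timings = 3          # assumed: current value plus 2 lags

n_variables = indicators * areas * timings
n_models = comb(n_variables, 4)  # unordered choices of 4 explanatory variables

print(n_variables)        # 840
print(f"{n_models:,}")    # 20,596,787,190
```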
Variable selection methods
› You must reduce the set of explanatory variables
–A trial-and-error approach rarely works
–NIESR approach (J. Mitchell)
‐ Drastically restrict the set (Expert knowledge) and then evaluate all
possible models
–GETS (General to specific) approach
‐ Start from an over-parameterized model and use statistical tests to
reduce it
–Dynamic Factor Analysis approach
‐ Summarize the set of variables with uncorrelated factors and use some
of them in the model
–Cluster Analysis approach
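As an illustration of the "restrict, then evaluate all possible models" idea, the sketch below enumerates every 2-variable regression over a small, hand-picked candidate set and ranks the fits by RMSE. The data are synthetic, the variable names are hypothetical, and the in-sample RMSE criterion is a simplification: a real exercise would more likely rank models on out-of-sample errors.

```python
import random
from itertools import combinations

def ols_rmse(columns, y):
    """In-sample RMSE of an OLS fit of y on an intercept plus the given columns."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    # Augmented normal equations [X'X | X'y], solved by Gauss-Jordan elimination.
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(k)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(k)]
    for p in range(k):
        pivot = max(range(p, k), key=lambda r: abs(A[r][p]))
        A[p], A[pivot] = A[pivot], A[p]
        for r in range(k):
            if r != p:
                f = A[r][p] / A[p][p]
                A[r] = [a - f * b for a, b in zip(A[r], A[p])]
    beta = [A[r][k] / A[r][r] for r in range(k)]
    resid = [y[i] - sum(b * x for b, x in zip(beta, X[i])) for i in range(n)]
    return (sum(e * e for e in resid) / n) ** 0.5

random.seed(1)
n = 60
# Restricted candidate set (hypothetical names, synthetic values).
names = ["ipi", "new_orders", "survey_industry",
         "survey_services", "hicp", "unemployment"]
data = {name: [random.gauss(0, 1) for _ in range(n)] for name in names}
# Synthetic target that depends on two of the candidates plus noise.
gdp = [0.4 * data["ipi"][i] + 0.3 * data["survey_industry"][i]
       + random.gauss(0, 0.2) for i in range(n)]

# Evaluate every model with exactly 2 regressors from the restricted set.
results = sorted(
    (ols_rmse([data[v] for v in subset], gdp), subset)
    for subset in combinations(names, 2)
)
best_rmse, best_subset = results[0]
print(len(results), "models evaluated; best:", best_subset, round(best_rmse, 3))
```

With 6 candidates there are only 15 two-variable models, which is what makes exhaustive evaluation feasible once the set has been cut down.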
A “nice case”: Labour Cost Index
Which dependent variable?
› Annual (red) or quarterly (black) growth rate?
› No big difference; note the level shift ….
› RMSE of an ARIMA model on the linearized quarterly
growth rate of SA data: 0.07
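For reference, the two candidate target variables can be computed from a quarterly index as follows. The index values are made up, and the sketch leaves out the seasonal adjustment and linearization steps mentioned on the slide.

```python
def growth_rate(index, lag):
    """Percentage change of an index relative to its value `lag` periods earlier."""
    return [100.0 * (index[t] / index[t - lag] - 1.0)
            for t in range(lag, len(index))]

# Hypothetical quarterly index values (illustrative only).
lci = [100.0, 100.8, 101.5, 102.4, 103.1, 104.0, 104.6, 105.7]

quarterly = growth_rate(lci, lag=1)  # quarter-on-quarter growth
annual = growth_rate(lci, lag=4)     # year-on-year growth

print([round(g, 2) for g in quarterly])
print([round(g, 2) for g in annual])
```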
Some Remarks
› No real problem
› Lots of available information (monthly or quarterly) at
t+45
› Quite a large number of models with good statistical
properties and excellent R-squared (>0.7)
› Pure AR models do not perform as well.
› Mix of “monthly” indicators from Business Surveys
and other “hard” data.
Example
A “difficult case”: the IPI
› Annual (red) or monthly (black) growth rate?
The target variable
› Gian Paolo Oneto’s remark on volatility ….
› RMSE of an ARIMA model on the linearized monthly
growth rate of SA data: 0.7 (1.1 on the annual growth
rate). It is huge!!!!
› Very few (not to say no) hard data available at t+30
› Difficult to propose a good and simple enough model.
› Simple models only explain a small part of the volatility.
Very difficult to publish the flash estimate.
An “interesting case”: the GDP
› Annual (red) or quarterly (black) growth rate?
› RMSE of an ARIMA model on the linearized quarterly
growth rate of SA data: 0.3 (quite large)
Some Remarks
› Lots of available information (monthly or quarterly) at
t+30
› You can reduce the RMSE and find models with an
RMSE close to 0.2 but it is still too much.
› Why should we take the risk of a big revision to gain
only 15 days?
› You can do better but with more complex models
difficult to handle in production
Last considerations
› Note that you look for the “best” model to flash estimate
the “worst” figure (the first estimation will always be
revised)
› To do that, you use “non homogeneous” data (mixed
between first, second, third … releases)
› It is always better to estimate several models
– In case an X-variable is not available
– Because “pooled estimations” are better
› It is often better to do the estimation at the National level
and then aggregate, but that means more models to maintain
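The case for keeping several models can be sketched as a pooled estimate that skips any model whose explanatory variable has not been published yet. The models, coefficients and equal-weight average below are all hypothetical; other pooling schemes are possible.

```python
def pooled_flash(models, available):
    """Average the forecasts of all models whose inputs are available.

    `models` is a list of (required_variable_names, forecast_function) pairs;
    `available` maps variable names to their latest observed values.
    Returns the pooled estimate and the number of models used.
    """
    forecasts = [
        predict(available)
        for required, predict in models
        if all(name in available for name in required)
    ]
    if not forecasts:
        raise ValueError("no model can be evaluated with the available data")
    return sum(forecasts) / len(forecasts), len(forecasts)

# Hypothetical models with made-up coefficients.
models = [
    (("ipi",), lambda x: 0.1 + 0.30 * x["ipi"]),
    (("survey",), lambda x: 0.2 + 0.05 * x["survey"]),
    (("ipi", "new_orders"),
     lambda x: 0.05 + 0.2 * x["ipi"] + 0.1 * x["new_orders"]),
]

# This period, new orders are not yet published, so the third model is skipped.
available = {"ipi": 1.0, "survey": 4.0}
estimate, n_used = pooled_flash(models, available)
print(round(estimate, 3), n_used)
```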
Conclusion
› Quite disappointing isn’t it?
› The solution seems to be to speed up the production process and
respect the first (and restrictive) definition of flash
estimates.
– But it puts the burden on the National institutes and it could
be very costly.
› Anyway, in France
– We try to get early estimates of the IPI from a restricted
sample;
– We are working on a GDP flash at t+30 which would respect
the way we compute QNA