Econometrics
Week 3
Institute of Economic Studies
Faculty of Social Sciences
Charles University in Prague
Fall 2012
Recommended Reading
For today
Further issues in using OLS with time series data.
Chapter 11 (pp. 347 – 370).
For next week
Serial correlation and heteroskedasticity in time series
regressions.
Chapter 12 (pp. 376 – 404).
Today’s Lecture
We have learned that OLS has exactly the same desirable finite
sample properties in the time-series case as in the
cross-sectional case.
But we need a somewhat altered set of assumptions.
Today, we will study the large sample properties, which are
much more problematic in the time-series case than in the
cross-sectional case.
I will introduce the key concepts needed to apply the usual
large sample approximations in regression analysis with
time series.
Trend analysis is crucial in economic analysis; most of the
time, you will deal with trending data.
Stationary and Nonstationary Time Series
The stationary process is an important concept for time series analysis.
The probability distribution of a stationary process is stable over time.
If we shift the collection of random variables ahead by any amount of
time, the joint probability distribution must remain unchanged.
Stationarity
The stochastic process {x_t : t = 1, 2, . . .} is stationary if for
every collection of time indices 1 ≤ t_1 < t_2 < . . . < t_m, the joint
probability distribution of (x_{t_1}, x_{t_2}, . . . , x_{t_m}) is the same as the
joint probability distribution of (x_{t_1+h}, x_{t_2+h}, . . . , x_{t_m+h}) for all
integers h ≥ 1.
The definition may seem a little abstract, but its meaning is pretty
straightforward ⇒ see the example on the next page.
A time series with a trend is nonstationary ⇒ its mean
changes over time.
[Figure: two simulated series, t = 0, . . . , 1000, each with histograms of two subsamples. Left panel (STATIONARY): Time Series 1, x_t, fluctuating between −4 and 4; the histograms of {x_100, . . . , x_300} and {x_600, . . . , x_800} look essentially identical. Right panel (NONSTATIONARY): Time Series 2, y_t, wandering between 70 and 120; the histograms of {y_100, . . . , y_300} and {y_600, . . . , y_800} are clearly different.]
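A minimal Python sketch of the same comparison (the course's interactive examples are in Mathematica; this stand-alone version and its AR(1)/random-walk design are illustrative assumptions, not the figure's actual data):

```python
import numpy as np

rng = np.random.default_rng(0)
T = 1000

# Stationary series: AR(1) with |rho| < 1 (illustrative choice).
x = np.zeros(T)
for t in range(1, T):
    x[t] = 0.5 * x[t - 1] + rng.standard_normal()

# Nonstationary series: a random walk started at 100 (illustrative choice).
y = 100 + np.cumsum(rng.standard_normal(T))

# Compare the two subsamples used for the histograms above.
for name, z in (("x (stationary)", x), ("y (nonstationary)", y)):
    print("%-18s means: %7.2f vs %7.2f"
          % (name, z[100:300].mean(), z[600:800].mean()))
```

The two subsample means of x agree, while those of y typically do not: the distribution of the random walk is not stable over time.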
Covariance Stationary Process
Covariance Stationary Process
A stochastic process {x_t : t = 1, 2, . . .} with finite second
moment [E(x_t²) < ∞] is covariance stationary if:
E(x_t) is constant;
Var(x_t) is constant;
for any t, h ≥ 1, Cov(x_t, x_{t+h}) depends only on h and not
on t.
Covariance stationarity focuses only on the first two
moments of a stochastic process and the covariance
between xt and xt+h .
If a stationary process has a finite second moment, it must
be covariance stationary, BUT NOT VICE VERSA.
⇒ strict stationarity is stronger than covariance
stationarity.
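A quick numerical check of the three conditions, using a simulated AR(1) with illustrative parameters (a sketch, not part of the original slides): mean, variance, and the lag-h autocovariance should look the same in any part of the sample.

```python
import numpy as np

rng = np.random.default_rng(1)
T, rho, h = 100_000, 0.5, 3   # illustrative parameters

x = np.zeros(T)
for t in range(1, T):
    x[t] = rho * x[t - 1] + rng.standard_normal()

# Constant mean, constant variance, and an autocovariance that
# depends only on h: the estimates match across the two halves.
for name, z in (("first half", x[: T // 2]), ("second half", x[T // 2:])):
    cov_h = np.cov(z[:-h], z[h:])[0, 1]
    print("%-11s mean %.3f  var %.3f  cov(h=%d) %.3f"
          % (name, z.mean(), z.var(), h, cov_h))
```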
Weakly Dependent Time Series
A very different concept from stationarity.
Weakly Dependent Process
A stationary time series process {xt : t = 1, 2, . . .} is said to be
weakly dependent if xt and xt+h are “almost independent” as h
increases.
This is a deliberately vague definition, as there are many forms of weak
dependence ⇒ the technical details are not covered in this course.
If, for a covariance stationary process, Corr(x_t, x_{t+h}) → 0
as h → ∞, we say this process is weakly dependent.
In other words, as the variables get farther apart in time,
the correlation between them becomes smaller and smaller.
Technically, we assume that the correlation converges to
zero fast enough.
Weakly Dependent Time Series cont.
⇒ Asymptotically uncorrelated sequences characterize
weak dependence.
Why do we need it? ⇒ It replaces the assumption of random
sampling in implying that the law of large numbers (LLN) and the
central limit theorem (CLT) hold.
The simplest example of a weakly dependent time series
An independent, identically distributed sequence.
A sequence that is independent is trivially weakly
dependent.
Moving Average – MA(1) – Process
A more interesting example is the moving average process of
order one – MA(1):
x_t = ε_t + α_1 ε_{t−1},   t = 1, 2, . . .
where {ε_t : t = 0, 1, . . .} is an i.i.d. sequence with zero mean
and variance σ_ε².
Is MA(1) weakly dependent? YES!
⇒ its consecutive terms x_{t+1} and x_t are correlated, with
Corr(x_t, x_{t+1}) = α_1/(1 + α_1²), but x_{t+2} and x_t are
independent.
Is MA(1) stationary? YES!
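Both claims are easy to verify by simulation; a short sketch with illustrative values of α_1 and the sample size:

```python
import numpy as np

rng = np.random.default_rng(2)
T, alpha1 = 1_000_000, 0.7              # illustrative values

e = rng.standard_normal(T + 1)          # e_0, e_1, ..., e_T (i.i.d.)
x = e[1:] + alpha1 * e[:-1]             # x_t = e_t + alpha1 * e_{t-1}

corr1 = np.corrcoef(x[:-1], x[1:])[0, 1]
corr2 = np.corrcoef(x[:-2], x[2:])[0, 1]
print("Corr(x_t, x_t+1): %.3f vs theory %.3f"
      % (corr1, alpha1 / (1 + alpha1**2)))
print("Corr(x_t, x_t+2): %.3f vs theory 0" % corr2)
```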
(Ex.1): Interactive Example in Mathematica.
(NOTE: To make this link work, you need to have Lecture3.cdf in the same directory
and the free CDF Player installed on your computer.)
Autoregressive – AR(1) – Process
The process {y_t} is called an autoregressive process of order
one – AR(1) – when it follows:
y_t = ρ y_{t−1} + ε_t,   t = 1, 2, . . .
with starting point y_0 at time t = 0,
where {ε_t : t = 1, 2, . . .} is an i.i.d. sequence with zero mean
and variance σ_ε².
This process is weakly dependent and stationary in case
that |ρ| < 1.
(Ex.2): Interactive Example in Mathematica.
(NOTE: To make this link work, you need to have Lecture3.cdf in the same directory
and the free CDF Player installed on your computer.)
Autoregressive – AR(1) – Process cont.
To show that the AR(1) process is asymptotically uncorrelated, we will
assume that it is covariance stationary.
Then E(y_t) = E(y_{t−1}); combined with E(y_t) = ρ E(y_{t−1}) from
the model, this implies E(y_t) = 0.
Var(y_t) = ρ² Var(y_{t−1}) + Var(ε_t), and under covariance
stationarity, we must have:
σ_y² = ρ² σ_y² + σ_ε².
For ρ² < 1, we can easily solve for σ_y²:
σ_y² = σ_ε² / (1 − ρ²).
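A quick plug-in check with illustrative numbers: for ρ = 0.9 and σ_ε² = 1, σ_y² = 1/(1 − 0.9²) = 1/0.19 ≈ 5.26, so a persistent AR(1) has a variance more than five times that of its innovations.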
Autoregressive – AR(1) – Process cont.
The covariance between y_t and y_{t+h} for h ≥ 1 can be found
by repeated substitution:
y_{t+h} = ρ y_{t+h−1} + ε_{t+h} = ρ(ρ y_{t+h−2} + ε_{t+h−1}) + ε_{t+h}
. . .
y_{t+h} = ρ^h y_t + ρ^{h−1} ε_{t+1} + . . . + ρ ε_{t+h−1} + ε_{t+h}
As E(y_t) = 0, we can multiply y_{t+h} by y_t, take expectations,
and obtain Cov(y_t, y_{t+h}):
Cov(y_t, y_{t+h}) = E(y_t y_{t+h}) =
ρ^h E(y_t²) + ρ^{h−1} E(y_t ε_{t+1}) + . . . + E(y_t ε_{t+h}) = ρ^h E(y_t²) = ρ^h σ_y²,
since each ε_{t+j} is uncorrelated with y_t.
Finally, Corr(y_t, y_{t+h}) = Cov(y_t, y_{t+h}) / (σ_y σ_y) = ρ^h.
So ρ is the correlation between any two adjacent terms, and for
large h the correlation gets very small (ρ^h → 0 as h → ∞).
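A simulation sketch (with illustrative ρ and sample size) confirming both the variance formula and Corr(y_t, y_{t+h}) = ρ^h:

```python
import numpy as np

rng = np.random.default_rng(3)
T, rho = 200_000, 0.8   # illustrative values with |rho| < 1

y = np.zeros(T)
for t in range(1, T):
    y[t] = rho * y[t - 1] + rng.standard_normal()

# Sample variance vs sigma_e^2 / (1 - rho^2).
print("var %.3f vs theory %.3f" % (y.var(), 1 / (1 - rho**2)))
# Sample autocorrelations vs rho^h: they decay geometrically.
for h in (1, 5, 20):
    corr = np.corrcoef(y[:-h], y[h:])[0, 1]
    print("h=%2d  sample Corr %.3f  vs rho^h %.3f" % (h, corr, rho**h))
```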
Trends Revisited
A trending series cannot be stationary.
A trending series can be weakly dependent.
Trend-stationarity
A series that is stationary about its time trend, as well as
weakly dependent, is called trend-stationary process.
As long as a time trend is included in the regression, everything
is OK (see the sketch below).
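A minimal sketch of why including the trend works, using an assumed trend-stationary design (linear trend plus AR(1) noise; all parameter values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
T = 500
t = np.arange(T)

# Trend-stationary series: linear trend plus stationary AR(1) noise.
u = np.zeros(T)
for s in range(1, T):
    u[s] = 0.5 * u[s - 1] + rng.standard_normal()
y = 1.0 + 0.05 * t + u

# OLS of y on a constant and the time trend.
X = np.column_stack([np.ones(T), t])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
print("estimated trend slope: %.4f (true 0.05)" % beta[1])
print("residual means in halves: %.2f, %.2f"
      % (resid[:250].mean(), resid[250:].mean()))
# Once the trend is in the regression, what is left over is
# stationary and weakly dependent.
```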
Asymptotic Properties of OLS
The following assumptions alter the assumptions for
unbiasedness we introduced last lecture (Ass. 1, Ass. 2 and Ass.
3).
Ass. 1A: Linearity and Weak Dependence
In a linear model:
y_t = β_0 + β_1 x_{t1} + . . . + β_k x_{tk} + u_t,
we assume that {(x_t, y_t) : t = 1, 2, . . .} is weakly dependent.
Thus the law of large numbers and the central limit theorem can be
applied to sample averages.
Ass. 2A: Zero Conditional Mean
For each t, E(u_t | x_t) = 0.
Ass. 2A is much weaker than the original one ⇐ it restricts only
the contemporaneous relationship between u_t and x_t, placing no
restriction on how u_t relates to the explanatory variables in other
time periods.
Asymptotic Properties of OLS cont.
Ass. 3A: No Perfect Collinearity
No independent variable is constant or a perfect linear
combination of the others (remains the same as in the
cross-sectional case).
Theorem 1: Consistency of OLS
Under Ass. (1A), (2A) and (3A), the OLS estimators are
consistent: plim β̂_j = β_j, j = 0, 1, . . . , k.
The theorem is similar to the one we used for cross-sectional
data, but we have omitted the random sampling assumption
(the weak dependence in Ass. 1A takes its place).
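A small Monte Carlo sketch of Theorem 1, with an assumed weakly dependent AR(1) regressor and illustrative true parameters:

```python
import numpy as np

rng = np.random.default_rng(5)

def ar1(T, rho, rng):
    """Simulate a weakly dependent AR(1) regressor."""
    x = np.zeros(T)
    for t in range(1, T):
        x[t] = rho * x[t - 1] + rng.standard_normal()
    return x

beta0, beta1 = 2.0, 3.0          # illustrative true parameters
for T in (50, 500, 5000, 50000):
    x = ar1(T, 0.5, rng)
    y = beta0 + beta1 * x + rng.standard_normal(T)
    X = np.column_stack([np.ones(T), x])
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    print("T=%6d  beta1_hat = %.4f" % (T, b[1]))
# The estimate settles down around 3 as T grows: consistency.
```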
Large-Sample Inference
We have weaker versions of Ass. 4 and Ass. 5 than their CLM counterparts:
Ass. 4A: Homoskedasticity
For all t, Var(u_t | x_t) = σ².
Ass. 5A: No Serial Correlation
For all t ≠ s, E(u_t u_s | x_t, x_s) = 0.
Theorem 2: Asymptotic Normality of OLS
Under the Ass. (1A–5A) for time series, the OLS estimators are
asymptotically normally distributed, and the usual OLS
standard errors, t statistics, F statistics and LM statistics are
asymptotically valid.
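A sketch of what asymptotic validity means in practice: under an assumed design satisfying Ass. 1A–5A (all values illustrative), a t-test of the true null should reject roughly 5% of the time.

```python
import numpy as np

rng = np.random.default_rng(6)
T, reps, beta1 = 200, 2000, 3.0   # illustrative Monte Carlo design

rejections = 0
for _ in range(reps):
    # Weakly dependent AR(1) regressor, i.i.d. homoskedastic errors.
    x = np.zeros(T)
    for t in range(1, T):
        x[t] = 0.5 * x[t - 1] + rng.standard_normal()
    y = 1.0 + beta1 * x + rng.standard_normal(T)

    xd = x - x.mean()
    b1 = (xd @ y) / (xd @ xd)                  # OLS slope
    b0 = y.mean() - b1 * x.mean()
    u = y - b0 - b1 * x
    se = np.sqrt(u @ u / (T - 2) / (xd @ xd))  # usual OLS standard error
    rejections += abs((b1 - beta1) / se) > 1.96

print("rejection rate of true H0: %.3f (nominal 0.05)" % (rejections / reps))
```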
Large-Sample Inference
Why is Theorem 2 so important?
Even if the CLM assumptions do not hold, OLS may still be
consistent, and the usual inference is valid!
Of course, only if Ass. (1A – 5A) are met!!!
In other words: provided the time series are weakly
dependent, the usual OLS inference is valid under assumptions
weaker than the CLM assumptions.
Under Ass. 1A – 5A, we can show that the OLS estimators
are asymptotically efficient (as in the cross-sectional case).
Finally, when a trend-stationary variable satisfies Ass.
1A – 5A, and we add a time trend to the regression, the usual
inference is valid.
Highly Persistent Time Series in Regression
Analysis
Highly persistent = strongly dependent.
Unfortunately, many economic time series exhibit strong
dependence ⇒ the assumption of weak dependence is violated.
This is no problem if the CLM assumptions (1 – 6) from the
previous lecture hold.
But the usual inference is susceptible to violations of those
assumptions.
Random Walk
A simple example of a highly persistent series often found in
economics is an AR(1) with ρ = 1:
y_t = y_{t−1} + ε_t,   t = 1, 2, . . .
where {ε_t : t = 1, 2, . . .} is i.i.d. with zero mean and
variance σ_ε².
The expected value of {y_t} does not depend on t, as
y_t = ε_t + ε_{t−1} + . . . + ε_1 + y_0:
E(y_t) = E(ε_t) + E(ε_{t−1}) + . . . + E(ε_1) + E(y_0) = E(y_0)
If y_0 = 0, then E(y_t) = 0 for all t.
Var(y_t) = Var(ε_t) + Var(ε_{t−1}) + . . . + Var(ε_1) + Var(y_0) =
σ_ε² t for a fixed starting value y_0. The process is not stationary,
as its variance increases over time.
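A simulation sketch of these two facts (y_0 = 0 and σ_ε² = 1 are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(7)
reps, T = 10_000, 1000

# Many random-walk paths started at y_0 = 0.
paths = np.cumsum(rng.standard_normal((reps, T)), axis=1)

for t in (10, 100, 1000):
    col = paths[:, t - 1]
    print("t=%4d  mean %+.2f  var %7.1f  (theory: 0 and %d)"
          % (t, col.mean(), col.var(), t))
# The mean stays at E(y_0) = 0, but the variance grows like
# sigma_e^2 * t: the process is not stationary.
```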
Random Walk
Thus a random walk is highly persistent, as the value of y_t today
influences all future values y_{t+h}:
E(y_{t+h} | y_t) = y_t for all h ≥ 1.
Do you remember the AR(1) with |ρ| < 1?
E(y_{t+h} | y_t) = ρ^h y_t for all h ≥ 1 ⇒
as h → ∞, E(y_{t+h} | y_t) → 0.
(Ex.3): Interactive Example in Mathematica.
(NOTE: To make this link work, you need to have Lecture3.cdf in the same directory
and the free CDF Player installed on your computer.)
A random walk is a special case of a unit root process.
Random Walk
It is important to distinguish between highly persistent
time series and trending time series.
Interest rates, inflation rates and unemployment rates can be
highly persistent, but they do not have trends.
Finally, we can have a highly persistent time series with a
trend: the random walk with drift:
y_t = α + y_{t−1} + ε_t,   t = 1, 2, . . .
where α is the drift term (see interactive Ex. 3).
E(y_{t+h} | y_t) = αh + y_t.
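A quick check of this conditional mean by simulating many continuations from the same y_t (all values illustrative):

```python
import numpy as np

rng = np.random.default_rng(8)
alpha, y_t, h, reps = 0.2, 5.0, 50, 100_000   # illustrative values

# Many continuations of the drifting random walk from the same y_t.
y_future = y_t + np.cumsum(alpha + rng.standard_normal((reps, h)), axis=1)
print("average y_{t+h}: %.2f vs alpha*h + y_t = %.2f"
      % (y_future[:, -1].mean(), alpha * h + y_t))
```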
Transforming Highly Persistent Time Series
Highly persistent time series can lead to very misleading
results if the CLM assumptions are violated (do you remember the
spurious regression problem?).
We need to transform them into weakly dependent processes.
A weakly dependent process will be referred to as
integrated of order zero, I(0).
A random walk is integrated of order one, I(1).
Its first difference is integrated of order zero, I(0):
Δy_t = y_t − y_{t−1} = ε_t,   t = 1, 2, . . .
Therefore {Δy_t} is an i.i.d. sequence and we can use it for
analysis (see the sketch below).
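A sketch of this last point: differencing a simulated random walk leaves a series with essentially no autocorrelation.

```python
import numpy as np

rng = np.random.default_rng(9)
T = 100_000

y = np.cumsum(rng.standard_normal(T))    # random walk, I(1)
dy = np.diff(y)                          # first difference, I(0)

for h in (1, 2, 5):
    print("h=%d  Corr(dy_t, dy_t+h) = %+.4f"
          % (h, np.corrcoef(dy[:-h], dy[h:])[0, 1]))
# All autocorrelations are near zero: the differenced series behaves
# like the i.i.d. innovations and is safe to use in regressions.
```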
(Ex.3.3): Interactive Example in Mathematica.
Thank you
Thank you very much for your attention!