LINEAR AUTOREGRESSIVE MODELS
ULRICH HERKENRATH, MARIUS IOSIFESCU and ANDREAS RUDOLPH
We study linear autoregressive models (LARM’s for short) in two different variants: one is based on a doubly infinite sequence of i.i.d. innovations thus leading
to a time series (Wt , t ∈ Z) and the other one is based on a sequence of i.i.d.
innovations and an independent starting variable thus leading to a time series
(Wt , t ∈ N). In both cases the resulting time series obeys the Markov property.
For both variants we give sufficient conditions for the “stabilization” of the process
(Wt , t ∈ Z or t ∈ N). This amounts to a strictly stationary process (Wt , t ∈ Z) in
the first case and to weak, respectively geometric, ergodicity of the Markov chain
(Wt , t ∈ N) in the second case. Special attention is paid to the case of normally
distributed innovations.
AMS 2000 Subject Classification: 60J05, 60G10.
Key words: autoregressive time series, Markov chain, strictly stationary process,
stabilization of a time series.
1. MODELS
We consider linear autoregressive models (LARM’s for short). They are
characterized by a linear stochastic recursion of the form
(LARM)
Wt = F Wt−1 + GUt
with either t ∈ N+ or t ∈ Z. If t ∈ N+ , a random variable W0 has to be given
as starting value.
As for the dimensions of the items in (LARM): Wt ∈ Rk , Ut ∈ Rm , F ∈
Mat(k, k), G ∈ Mat(k, m), where Mat(k, m) denotes the set of k × m matrices.
In this paper we generally assume that on a probability space (Ω, A, P) there
are given i.i.d. random vectors
(Ut , t ∈ N+ or t ∈ Z)
with common distribution τ . The Euclidean space Rp is endowed with its
Borel-σ-algebra B p . In the case (Ut , t ∈ N+ ) the random variable W0 defined
on (Ω, A, P) is independent of (Ut , t ∈ N+ ). Often the Ut ’s are called noise
variables or innovations. Expectations and covariances refer to the probability
measure P, unless otherwise stated.
MATH. REPORTS 12(62), 3 (2010), 245–259
246
Ulrich Herkenrath, Marius Iosifescu and Andreas Rudolph
2
In this case we consider the (time-homogeneous) Markov chain (MC for
short) (Wt , t ∈ N) with initial distribution p0 , that means W0 ∼ p0 . A LARM
with t ∈ N+ is called a linear state space model in Meyn and Tweedie (1996).
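Iterating (LARM) numerically is straightforward. The following sketch simulates a trajectory; the dimensions, the matrices F and G and the innovation law are illustrative choices, not taken from the paper:

```python
import numpy as np

def simulate_larm(F, G, W0, innovations):
    """Iterate W_t = F W_{t-1} + G U_t over a given innovation sequence."""
    W = np.asarray(W0, dtype=float)
    path = [W]
    for U in innovations:
        W = F @ W + G @ U
        path.append(W)
    return np.array(path)

rng = np.random.default_rng(0)
F = np.array([[0.5, 0.2], [0.0, 0.3]])   # eigenvalues 0.5 and 0.3, so rho(F) < 1
G = np.eye(2)
U = rng.normal(size=(1000, 2))           # i.i.d. innovations
path = simulate_larm(F, G, np.zeros(2), U)
print(path.shape)
```

With ρ(F) < 1 the simulated path stays bounded in probability, which is the "stabilization" phenomenon studied below.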
In the case (Ut , t ∈ Z), we consider the (time-homogeneous) MC (Wt ,
t ∈ Z) based on the doubly infinite sequence (Ut , t ∈ Z), i.e., we think that the
MC (Wt , t ∈ Z) has started at time −∞, whereas we regard observation times
t ∈ N. Restricted to observation times t ∈ N, (Wt , t ∈ N) is a MC with “an
infinite history” subsumed in an unknown initial distribution p0 , i.e., W0 ∼ p0 .
Under suitable assumptions on the components of a LARM, one may consider that the process (Wt , t ∈ Z), which started at time −∞, has already
“stabilized” during its infinitely long history in such a way that p0 = π, the
unique invariant probability measure (UIPM for short) of the MC induced by
the LARM. Then (Wt , t ∈ N) is an ergodic strictly stationary process, the
“stabilized” process generated by a LARM. As general references for terminology and results on MC’s we refer to the books by Meyn and Tweedie (1996)
and Hernández-Lerma and Lasserre (2003).
If (Wt , t ∈ N) starts according to an arbitrary initial distribution p0 ≠ π,
this process is “unstable” and one may ask for conditions for its “stabilization”
when running through observation times t ∈ N, thus for some kind of asymptotic stability or stationarity. Of course, that amounts to the same question
as ensuring the stability of the observable process (Wt , t ∈ N) on the basis of
an unobservable history (. . . , W−3 , W−2 , W−1 ).
Since a LARM is characterized by a stochastic recursion, it is natural
to ask for solutions of this linear stochastic recursion (LSR for short). With
regard to the above explanations one is interested in particular in strictly
stationary solutions of LSR. A strictly stationary solution means a stochastic
process which obeys LSR and is a strictly stationary process.
Next, we collect assumptions on the components of a LARM, which will
be made in order to ensure stability properties. As for integrability conditions on a typical member U of the i.i.d. family (Ut , t ∈ Z or t ∈ N+ ), we
assume that
(UL)   E[(log ‖U‖)⁺] < ∞,
(U1)   U ∈ L¹, i.e., E[‖U‖] < ∞,
(U2)   U ∈ L², i.e., E[‖U‖²] < ∞ and Cov[U] = E[V V′] =: Σ_U non-singular,
where V means “U centered”, i.e., U = V + ν with ν = E[U], ‖·‖ denotes a
vector norm and log the natural logarithm.
Lemma 1.1. (U2) ⇒ (U1) ⇒ (UL).
Proof. The first implication is trivial, so that only the second one needs
to be proved. For U ∈ L¹ we have

  E[(log ‖U‖)⁺] = ∫_{‖U‖≤1} (log ‖U‖)⁺ dP + ∫_{‖U‖>1} (log ‖U‖) dP
               = ∫_{‖U‖>1} (log ‖U‖) dP ≤ log ∫_{‖U‖>1} ‖U‖ dP < ∞

by Jensen’s inequality. □
The essential condition on the matrix F to ensure the “stability” of a
LARM is
(EVF)
ρ(F ) < 1,
where ρ(F ) denotes the spectral radius of F , i.e., the maximal absolute value
of the eigenvalues of F .
The “eigenvalue condition” (EVF) ensures, in some sense, a contraction property of the linear mapping induced by F . If this property of F does
not hold, then, by iterating (LARM), the process (Wt , t ∈ N) “drifts away
to infinity”.
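Condition (EVF) can be checked numerically from the eigenvalues of F; a minimal sketch with an illustrative matrix:

```python
import numpy as np

def spectral_radius(F):
    """Maximal absolute value of the eigenvalues of F."""
    return float(np.max(np.abs(np.linalg.eigvals(F))))

# Upper-triangular example: eigenvalues are the diagonal entries 0.5 and 0.3
F = np.array([[0.5, 0.2], [0.0, 0.3]])
print(spectral_radius(F) < 1)  # (EVF) holds for this F
```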
It is sometimes useful to assume the so-called controllability condition for
the pair of matrices (F, G). One introduces the so-called controllability matrix
C_k as the matrices F^{k−1}G, F^{k−2}G, …, FG, G written side by side, i.e.,
C_k := [F^{k−1}G | … | FG | G] ∈ Mat(k, km). If

(CMC)   rank C_k = k,

the pair of matrices (F, G) is called controllable, or (CMC) is called valid.
Obviously, for G = Ik , Ik the identity matrix of order k, condition (CMC)
holds. The essence of the controllability property of (F, G) is as follows [see
Meyn and Tweedie (1996, p. 95)]. If one considers the deterministic recursion
wt = F wt−1 + Gut , t ∈ N+ , then the controllability of (F, G) means that
for each pair of “states” w0 , w∗ ∈ Rk there exists a sequence of “controls”
(u∗1 , . . . , u∗k ), u∗i ∈ Rm , such that w∗ = wk , when (u1 , . . . , uk ) = (u∗1 , . . . , u∗k )
and the recursion starts at w0 , i.e., each state w∗ can be reached from each
starting point w0 , i.e., each state w∗ can be reached from each
starting point w0 in k steps. The reason for that is simply the representation

  w_k = F^k w_0 + [F^{k−1}G | ⋯ | FG | G] (u_1′, …, u_k′)′

and the fact that the range of the linear mapping induced by [F^{k−1}G | ⋯ | FG | G]
is R^k. Therefore, each state w∗ can be reached from each starting point w_0 in
t steps for all t ≥ k.
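The controllability matrix and condition (CMC) are equally easy to check numerically; F and G below are illustrative choices, not from the paper (for G = I_k the rank is trivially k):

```python
import numpy as np

def controllability_matrix(F, G):
    """C_k = [F^{k-1}G | ... | FG | G] for F in Mat(k,k), G in Mat(k,m)."""
    k = F.shape[0]
    blocks = [np.linalg.matrix_power(F, j) @ G for j in range(k - 1, -1, -1)]
    return np.hstack(blocks)

F = np.array([[0.5, 0.2], [0.0, 0.3]])
G = np.array([[1.0], [1.0]])             # a single input column, m = 1
Ck = controllability_matrix(F, G)
print(np.linalg.matrix_rank(Ck) == F.shape[0])  # (CMC) holds iff rank C_k = k
```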
Condition (CMC) obviously represents a concept of communication between states in the deterministic model generated by F and G and is the key
to irreducibility of the MC (Wt , t ∈ N+ ) under appropriate conditions on the
sequence (Ut , t ∈ N+ ).
In this context we prove the following result.
Lemma 1.2. If for a LARM started at w0 ∈ Rk , (CMC) holds and U
has a strictly positive λm -density, then Wt has a strictly positive λk -density
for all t ≥ k. Therefore, the MC (Wt , t ∈ N) is λk -irreducible and aperiodic.
Proof. Consider for a fixed w_0 ∈ R^k the linear transformation T : R^{km} → R^k
defined by

  T(u_1, …, u_k) = C_k (u_1 | ⋯ | u_k)′ =: w_k − F^k w_0.

Then the matrix C_k ∈ Mat(k, km) can be extended to a non-singular matrix
C ∈ Mat(km, km), i.e., |det C| > 0, inducing a linear transformation T̂ :
R^{km} → R^{km}. Next, T̂^{−1} exists as inverse transformation and, according to the
theorem on transformation of densities, W_k induced by (LARM) has a strictly
positive λ^k-density q^k as marginal density of a strictly positive density, for all
starting points w_0 ∈ R^k. Here (u_1 | ⋯ | u_k) := (u_{11}, …, u_{1m}, u_{21}, …, u_{2m}, …,
u_{k1}, …, u_{km}) = (u_1′, …, u_k′), so that (u_1 | … | u_k)′ ∈ Mat(km, 1). It follows
from the Markov property of (W_t, t ∈ N) that for all t ≥ k the random
vectors W_t have a strictly positive λ^k-density, too:

  P(W_{k+j} ∈ A) = ∫_W P(W_{k+j} ∈ A | W_j = w) P(W_j ∈ dw)
                = ∫_W Q^k(w, A) P(W_j ∈ dw) = ∫_A ∫_W q^k(w, w′) P(W_j ∈ dw) λ^k(dw′)

for W = R^k, A ∈ B^k and j ∈ N. Here Q^k is the k-step transition probability
function of the MC (W_t, t ∈ N). Strictly positive densities obviously imply
λ^k-irreducibility.
As for aperiodicity, according to Proposition 5.2.4 in Meyn and Tweedie
(1996), there exists a countable collection of so-called “small” sets C_i such
that the C_i are Borel sets and ⋃_{i=1}^∞ C_i = R^k. Since there should exist a small
set C_j with λ^k(C_j) > 0, Proposition 4.2.2 in Hernández-Lerma and Lasserre
(2003) ensures in turn the existence of d ≥ 1 disjoint Borel sets D_1, D_2, …, D_d
such that

  ∀x ∈ D_i : Q(x, D_{i+1}) = 1 for i = 1, …, d − 1,
  ∀x ∈ D_d : Q(x, D_1) = 1 and λ^k(R^k \ ⋃_{i=1}^d D_i) = 0.

{D_1, …, D_d} is called a d-cycle and the MC is called aperiodic if d = 1. Now,
because of the strictly positive λ^k-densities, d must be equal to 1. Otherwise,
∀i = 1, …, d, ∀x ∈ D_i : Q(x, D_i) = 0 and, therefore, λ^k(⋃_{i=1}^d D_i) would be 0.
So, (W_t, t ∈ N) is aperiodic. □

2. STRICTLY STATIONARY SOLUTIONS OF (LARM)
As for the process (Wt ), we first consider the case (Ut , t ∈ Z), i.e., assume
that (Wt , t ∈ Z) starts at time −∞. Then it is natural to ask whether (Wt ,
t ∈ N), has “stabilized” during its infinitely long history (. . . , −3, −2, −1).
The starting time −∞ can only be meant asymptotically, i.e., as a limit of
finite starting times. So, for t ∈ Z fixed, we think of (t − k), k ∈ N, as starting
time and then let k → ∞. Iterating (LARM) from (t − k) to t, the central
term to be studied is Σ_{i=0}^{k−1} F^i G U_{t−i}. Next, we collect results on its convergence
in terms of properties of (U_t, t ∈ Z).
Lemma 2.1. Let (EVF) hold, i.e., ρ(F) < 1.
(i) If (U_t, t ∈ Z) satisfies (UL), i.e., E[(log ‖U‖)⁺] < ∞, then for every t ∈ Z there exists W̃_t such that

  Σ_{i=0}^{k−1} F^i G U_{t−i} → W̃_t a.s. as k → ∞.

(ii) If (U_t, t ∈ Z) satisfies (U1), i.e., E[‖U‖] < ∞, U = V + ν with ν = E[U], then W̃_t ∈ L¹ for every t ∈ Z and

  Σ_{i=0}^{k−1} F^i G U_{t−i} → W̃_t in L¹ as k → ∞,

in addition to the a.s. convergence, and

  E[W̃_t] = μ = (I − F)^{−1} G ν.

(iii) If (U_t, t ∈ Z) satisfies (U2), i.e., E[‖U‖²] < ∞ and Cov[U] = Σ_U non-singular, then W̃_t ∈ L² for every t ∈ Z and

  Σ_{i=0}^{k−1} F^i G U_{t−i} → W̃_t in L² as k → ∞,

in addition to the convergence properties in (ii), and

  Cov[W̃_t, W̃_{t−h}] = Σ_{i=0}^∞ F^{h+i} G Σ_U G′ (F^i)′,   Cov[W̃_t] = Σ_{i=0}^∞ F^i G Σ_U G′ (F^i)′.

Here ′ denotes the transposition of the corresponding matrix.
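As a numerical aside, the covariance Σ_{i=0}^∞ F^i G Σ_U G′ (F^i)′ also solves the discrete Lyapunov equation Σ = F Σ F′ + G Σ_U G′ (multiply the series by F on the left and F′ on the right and compare terms), so it can be computed without truncating the series. A sketch with illustrative matrices, assuming SciPy’s discrete Lyapunov solver is available:

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

F = np.array([[0.5, 0.2], [0.0, 0.3]])
G = np.eye(2)
Sigma_U = np.eye(2)                      # innovation covariance (illustrative)

# Stationary covariance: Sigma = F Sigma F' + G Sigma_U G'
Sigma = solve_discrete_lyapunov(F, G @ Sigma_U @ G.T)

# Cross-check against a truncation of the series sum_i F^i G Sigma_U G' (F^i)'
S, Fi = np.zeros((2, 2)), np.eye(2)
for _ in range(200):
    S += Fi @ G @ Sigma_U @ G.T @ Fi.T
    Fi = Fi @ F
print(np.allclose(Sigma, S))
```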
Proof. (i) We refer to Theorem 1.1 in Bougerol and Picard (1992), which
essentially had already been proved by Brandt (1986). Set A_t = F and
B_t = G U_t for all t ∈ Z. Since for any matrix norm |||·||| one has ρ(F) =
lim_{n→∞} |||F^n|||^{1/n} [see Horn and Johnson (1988, p. 299)], we can write

  γ := inf_{n∈N} (1/n) log |||F^n||| ≤ lim_{n→∞} log |||F^n|||^{1/n} = log ρ(F) < 0

by (EVF) and, furthermore, (log |||A_t|||)⁺ = (log |||F|||)⁺ < ∞. Moreover,

  E[(log ‖B_t‖)⁺] = E[(log ‖G U‖)⁺] ≤ (log |||G|||)⁺ + E[(log ‖U‖)⁺] < ∞

for the operator norm |||·||| associated with the vector norm ‖·‖. Therefore,
all assumptions of the quoted Theorem 1.1 are satisfied, so that our statement
follows.
(ii) For all t ∈ Z, k ∈ N⁺ we define

  W_t^{−k} = F^k W_{t−k} + Σ_{i=0}^{k−1} F^i G U_{t−i},

i.e., W_t^{−k} is the “state” at time t, if the LARM started at time (t − k) in state
W_{t−k}. With regard to the assumed starting time −∞, we consider for a fixed
t ∈ Z the limit of the sequence (W_t^{−k}, k ∈ N⁺), that is,

  W̃_t := lim_{k→∞} W_t^{−k} = Σ_{i=0}^∞ F^i G U_{t−i} (∗)= Σ_{i=0}^∞ F^i G V_{t−i} + Σ_{i=0}^∞ F^i G ν (∗∗)= Σ_{i=0}^∞ F^i G V_{t−i} + (I − F)^{−1} G ν;

(∗) holds by condition (EVF) according to Theorem 5.6.12 in Horn and Johnson (1988, p. 298), while (∗∗) holds according to a result on p. 301 in that
book. Since

  ‖Σ_{i=0}^{k−1} F^i G V_{t−i}‖ ≤ Σ_{i=0}^∞ |||F^i||| ‖G V_{t−i}‖

for an appropriate matrix norm, we have

  E[Σ_{i=0}^∞ |||F^i||| ‖G V_{t−i}‖] = Σ_{i=0}^∞ |||F^i||| E[‖G V_{t−i}‖] < ∞

by the Monotone Convergence Theorem, condition (U1) and Corollary 5.6.14
in Horn and Johnson (1988, p. 299), i.e., Σ_{i=0}^∞ |||F^i||| ‖G V_{t−i}‖ ∈ L¹. In turn this
implies [see, e.g., Loève (1963, p. 163)]

  Σ_{i=0}^{k−1} F^i G U_{t−i} → Σ_{i=0}^∞ F^i G U_{t−i} = W̃_t in L¹

as k → ∞ and

  E[Σ_{i=0}^∞ F^i G V_{t−i}] = lim_{k→∞} E[Σ_{i=0}^{k−1} F^i G V_{t−i}] = 0.

Therefore, E[W̃_t] = (I − F)^{−1} G ν.
(iii) The proof follows from Proposition C.7 in Lütkepohl (1991, p. 490),
as, by (EVF), (F^i, i ∈ N) is an absolutely summable sequence of matrices [see
again Horn and Johnson (1988, p. 301)] and E[U_t′ U_t] ≤ C < ∞ because of
(U2). Moreover, according to Proposition C.8 in Lütkepohl (1991, p. 491),
one can compute the covariances

  Γ_W̃(h) := E[(W̃_t − μ)(W̃_{t−h} − μ)′] = lim_{k→∞} Σ_{i=0}^{k−1} Σ_{j=0}^{k−1} F^i G E[V_{t−i} V_{t−h−j}′] G′ (F^j)′
          = lim_{k→∞} Σ_{j=0}^{k−1} F^{h+j} G Σ_U G′ (F^j)′ = Σ_{i=0}^∞ F^{h+i} G Σ_U G′ (F^i)′ = Cov[W̃_t, W̃_{t−h}].

This holds as E[V_t V_s′] = 0 for s ≠ t and E[V_t V_t′] = Σ_U for all t ∈ Z. In
particular, for h = 0 we have

  Γ_W̃(0) = E[(W̃_t − μ)(W̃_t − μ)′] = Σ_{i=0}^∞ F^i G Σ_U G′ (F^i)′ = Cov[W̃_t]. □

Next, we ask for properties of the stochastic process (W̃_t, t ∈ Z), which
is well defined under the assumptions of Lemma 2.1.
Lemma 2.2. Under conditions (EVF) and (UL), the stochastic process
(W̃_t, t ∈ Z) is the unique strictly stationary solution of the stochastic recursion
(LARM) and enjoys the Markov property. Let P denote the transition probability function of the MC (W̃_t, t ∈ Z). Then the distribution m of W̃_t is an
invariant probability measure of the chain, that is, m(A) = ∫ P(w, A) m(dw).
Proof. Although the result is contained in Theorem 1.1 in Bougerol and
Picard (1992), we prove it here except for the uniqueness of the solution.
Uniqueness will be ensured by Theorem 3.1, to be proved in what follows.
On account of the representation W̃_t = Σ_{i=0}^∞ F^i G U_{t−i}, t ∈ Z, the stochastic
recursion (LARM) is satisfied:

  F W̃_{t−1} + G U_t = F Σ_{i=0}^∞ F^i G U_{t−1−i} + G U_t = Σ_{i=0}^∞ F^{i+1} G U_{t−(i+1)} + F^0 G U_t = W̃_t.

Since (W̃_t, t ∈ Z) obeys (LARM), it enjoys the Markov property. Then, in
combination with the equation

  W̃_t = Σ_{i=0}^∞ F^i G U_{t−i} =_D Σ_{i=0}^∞ F^i G U_{t+h−i} = W̃_{t+h}

for all h ∈ Z, the strict stationarity of (W̃_t, t ∈ Z) follows. Here =_D denotes
equality in distribution. The last property of the statement follows from the
equation

  m(A) = P(W̃_t ∈ A) = ∫ P(W̃_t ∈ A | W̃_{t−1} = w) P(W̃_{t−1} ∈ dw) = ∫ P(w, A) m(dw). □
Remarks. 1. The strictly stationary solution W̃_t = Σ_{i=0}^∞ F^i G U_{t−i},
t ∈ Z, is a nonanticipative one according to Definition 2.2 in Bougerol and
Picard (1992), since W̃_t is independent of the random vectors (U_r, r > t) for
any t ∈ Z. Therefore, the solution (W̃_t, t ∈ Z) is independent of the future at
any given time and contains only noise terms or innovations from the past up
to the present.

2. Bougerol and Picard (1992) introduce an irreducibility property for
LARM: an affine subspace H of R^k is said to be invariant under (LARM)
if {F w + G U, w ∈ H} ⊂ H a.s. A LARM is called irreducible if R^k is the
only invariant affine subspace, i.e., there is no proper affine subspace H ⊂ R^k
such that, if the LARM starts at W_0 = w ∈ H, it remains in H all the
time. They show, in addition to the result of Lemma 2.2, that under the above
irreducibility condition and condition (UL) the validity of (EVF) is even a
necessary condition for the existence of a nonanticipative strictly stationary
solution of (LARM).
3. In the time series literature, see, e.g., Lütkepohl (1991), the stochastic process (W̃_t, t ∈ Z), which is well-defined under the conditions (EVF)
and (UL), is called “stable” and, correspondingly, condition (EVF) for a LARM
a “stability condition” or “stationarity condition” [see Lütkepohl (1991, p. 20)].

In the time series literature the notions “stability” and “stationarity”
are often used synonymously. Moreover, in some books, there is no distinction
between “strict stationarity” and “second-order stationarity”. If one regards
the stochastic recursion (LARM) as a time series model, i.e., as a vectorial
autoregressive model of order 1, one can ask for time series, i.e., stochastic
processes (W_t, t ∈ Z or t ∈ N), which are compatible with (LARM).

Therefore, in the case “t ∈ Z” one asks for solutions of (LARM), while
in the case “t ∈ N”, given a starting variable W_0, one generates (W_t, t ∈ N)
by iterating (LARM). As was shown above, under suitable conditions, a time
series which has started in the infinite past, i.e., at time −∞, has “stabilized”,
so that it appears as “stable” in the stochastic sense during observation times
t ∈ N. This explains the introduction of (W̃_t, t ∈ Z).
The stochastic process or time series W̃_t = Σ_{i=0}^∞ F^i G U_{t−i}, t ∈ Z, dealt
with in Lemmas 2.1 and 2.2 is called the canonical moving average (MA for
short) representation of the linear autoregressive model (LARM). It requires
the validity of (EVF) and exists under (UL) almost surely, under (U1) in the
L¹-sense and under (U2) in the L²-sense, as an infinite series based on the infinite
past. Usually, in the time series literature, one directly deals with (W̃_t, t ∈ Z).
3. THE CASE (U_t, t ∈ N⁺): MARKOV CHAINS GENERATED BY LARM
Since we consider t ∈ N as observation times for a LARM, respectively an
induced time series indexed by t ∈ N, we now turn to study a LARM starting
with a random vector W_0. The distribution of W_0 may be m = L(W̃_0) resulting
from Lemma 2.2 or any other probability measure p_0.
In any case, we study (Wt , t ∈ N) as a MC. If W0 ∼ m, we have a
“stable”, i.e., strictly stationary MC. If W0 ∼ p0 , again the question arises
whether (Wt , t ∈ N) “stabilizes” as t → ∞. In this context, we will take
advantage of available results for MC’s.
Theorem 3.1. Assume (EVF) holds, i.e., ρ(F) < 1, and (U_t, t ∈ N⁺)
satisfies (U1), i.e., E[‖U‖] < ∞, U = V + ν with ν = E[U]. Let the random
vector W_0 be independent of (U_t, t ∈ N⁺) and (W_t, t ∈ N) be generated by the
LARM. Then

(i) W_t →_L W_∞ as t → ∞ and

  W_∞ = Σ_{i=0}^∞ F^i G U_{i+1} ∈ L¹,   E[W_∞] = (I − F)^{−1} G ν = μ;

(ii) if P denotes the transition probability function of the MC (W_t, t ∈
N), then (W_t, t ∈ N) is P^t-weakly ergodic, i.e., P^t(w_0, ·) →_w π as t → ∞,
π = L(W_∞) being the unique P-invariant probability measure, so that π = m;

(iii) (W_t, P_π, t ∈ N), i.e., the MC (W_t, t ∈ N) for which W_0 ∼ π, is an
ergodic strictly stationary process;

(iv) if, in addition, (U_t, t ∈ N⁺) satisfies (U2), i.e., E[‖U‖²] < ∞, then
W_∞ ∈ L² with covariance matrix

  Cov[W_∞] = Σ_{i=0}^∞ F^i G Σ_U G′ (F^i)′.

Here →_L stands for convergence in distribution and →_w for weak convergence of
probability measures.
Proof. (i) W_t = F^t W_0 + Σ_{i=0}^{t−1} F^i G U_{t−i} after iterating (LARM). Moreover,

  Σ_{i=0}^{t−1} F^i G U_{t−i} =_D Σ_{i=0}^{t−1} F^i G U_{i+1}.

Just as in the proof of Lemma 2.1 (ii), we conclude that Σ_{i=0}^∞ F^i G U_{i+1} =
W_∞ ∈ L¹. Therefore, Σ_{i=0}^{t−1} F^i G U_{i+1} → W_∞ in L¹, whence Σ_{i=0}^{t−1} F^i G U_{i+1} → W_∞
in probability and then Σ_{i=0}^{t−1} F^i G U_{i+1} →_L W_∞ as t → ∞. Since F^t W_0 → 0
a.s. by (EVF), W_t →_L W_∞ as t → ∞. The representation of E[W_∞] follows as
in Lemma 2.1 (ii).
(ii) If P is the transition probability function of the MC (W_t, t ∈ N), then
P^t(w_0, ·) is the distribution of W_t if the MC starts at W_0 = w_0. Therefore, (i)
says that P^t(w_0, ·) →_w π as t → ∞, π = L(W_∞) = m. Next, we show that π is
P-invariant. This amounts to showing that if W_0 ∼ π then W_1 = (F W_0 + G U_1) ∼
π. Choose for W_0 the representation W_0 = Σ_{i=0}^∞ F^i G U_{i+2} ∼ π. Then

  W_1 = Σ_{i=0}^∞ F^i G U_{i+1} ∼ π.

Because of the time-homogeneity of the MC (W_t, t ∈ N), the result is valid for
each transition from time t to time (t + 1). Moreover, π turns out to be the
unique P-invariant probability measure: P^t(w, ·) →_w π means that

  ∫_W f(w′) P^t(w, dw′) → ∫_W f(w′) π(dw′)

for all bounded, continuous real-valued functions f on W = R^k, as t → ∞. If
q is another P-invariant probability measure then, for f as above,

  ∫_W f(w′) q(dw′) = ∫_W f(w′) ∫_W P^t(w, dw′) q(dw) = ∫_W ∫_W f(w′) P^t(w, dw′) q(dw) → ∫_W f(w′) π(dw′)

as t → ∞. Since ∫_W f(w′) q(dw′) = ∫_W f(w′) π(dw′) for all bounded continuous
real-valued functions f, q should be equal to π.
(iii) is a well-known consequence of the uniqueness of the invariant probability measure π, see, e.g., Hernández-Lerma and Lasserre (2003, p. 35).

(iv) Since we know from (i) that Σ_{i=0}^{t−1} F^i G U_{i+1} → Σ_{i=0}^∞ F^i G U_{i+1} a.s., according to the “Equivalence Theorem” in Loève (1963, p. 251), we have

  ‖W_∞‖² = lim_{t→∞} ‖Σ_{i=0}^{t−1} F^i G U_{i+1}‖² ≤ (Σ_{i=0}^∞ |||F^i||| ‖G U_{i+1}‖)² a.s.

for an appropriate matrix norm. As in the proof of Lemma 2.1 (ii), one concludes E[‖W_∞‖²] < ∞. The covariance matrix of W_∞ can be calculated as in
the proof of Lemma 2.1 (iii):

  Cov[W_∞] = E[(W_∞ − μ)(W_∞ − μ)′] = Σ_{i=0}^∞ F^i G Σ_U G′ (F^i)′. □
In order to get a stronger ergodicity property, we assume strictly positive
densities of the innovations and, in addition, condition (CMC), which “spreads
out” the probability mass.

Theorem 3.2. Consider a LARM starting with the random vector W_0,
independent of (U_t, t ∈ N⁺). Moreover, assume that (EVF) and (CMC) are
valid, (U_t, t ∈ N⁺) satisfies condition (U1) and U has a strictly positive λ^m-density. Then (W_t, t ∈ N) is a geometrically ergodic MC, i.e., there exists a
unique invariant probability measure π with respect to the transition probability
function P of (W_t, t ∈ N) such that

  ‖P^t(w, ·) − π(·)‖ ≤ ρ^t R_w,   t ∈ N,

for some 0 < ρ < 1 and all w ∈ R^k, with corresponding constants R_w < ∞.
Here, ‖·‖ denotes the norm of total variation of signed measures.
Proof. By Lemma 1.2, the MC (W_t, t ∈ N) is λ^k-irreducible and aperiodic. Moreover, it is a Feller chain. Geometric ergodicity of (W_t, t ∈ N) can
be proved by means of a theorem of Feigin and Tweedie (1985), see also Meyn
and Tweedie (1996, pp. 354–355).

First, we spot a special vector norm from condition (EVF). Since ρ(F) <
1, there exists a matrix norm |||·||| such that |||F||| < 1, which yields a compatible vector norm ‖·‖, i.e., the inequality ‖F w‖ ≤ |||F||| ‖w‖ holds for all
w ∈ R^k, see Horn and Johnson (1988, p. 297). Using that vector norm we
define the test function h : R^k → [1, ∞) by h(w) = 1 + ‖w‖. With δ > 0
defined by |||F||| = 1 − δ < 1, one can estimate

  E[h(W_t) | W_{t−1} = w] = E[h(F w + G U_t)] = E[1 + ‖F w + G U_t‖]
    ≤ |||F||| ‖w‖ + 1 + E[‖G U_t‖] ≤ (1 − δ)‖w‖ + (1 − δ) + δ + K_1
    = (1 − δ)[1 + ‖w‖] + K_2 = (1 − δ) h(w) + K_2

with finite constants K_1, K_2. Now, a large enough compact set C ⊂ R^k can be
chosen such that K_2 < (δ/2)[1 + ‖w‖] for all w ∉ C. Then for all w ∉ C one gets

  E[h(W_t) | W_{t−1} = w] ≤ (1 − δ/2) h(w).

As all norms on R^k are equivalent, C, being compact with respect to ‖·‖, is
also compact with respect to the Euclidean vector norm. Therefore, the
proof is complete by the theorem mentioned above. □

4. NORMALLY DISTRIBUTED INNOVATIONS
In the case of normally distributed innovations (U_t, t ∈ N⁺), the unique
invariant probability measure π is a normal distribution.

Theorem 4.1. Consider a LARM starting with the random vector W_0,
independent of (U_t, t ∈ N⁺). Moreover, assume that (EVF) and (CMC) are
valid and

  U ∼ N(ν, Σ_U)

with a positive definite covariance matrix Σ_U. Then the MC (W_t, t ∈ N) is
geometrically ergodic and, moreover,

  W_∞ ∼ N(μ, Σ) = π

with μ = (I − F)^{−1} G ν and Σ = Σ_{i=0}^∞ F^i G Σ_U G′ (F^i)′. As a consequence, the
MC (W_t, t ∈ N) is weakly asymptotically stationary.

Proof. The positive definiteness of Σ_U ensures the strict positivity of the
λ^m-density of U on the whole of R^m, therefore the validity of Theorem 3.2.
It remains to determine the distribution of W_∞. First, for U ∼ N(ν, Σ_U)
with a positive definite Σ_U one can find a positive definite matrix A such
that AA′ = Σ_U [see Horn and Johnson (1988, p. 405)] and one can write
U = AV + ν with V ∼ N(0, I_m). Since

  W_t = F^t W_0 + Σ_{i=0}^{t−1} F^i G ν + Σ_{i=0}^{t−1} F^i G A V_{t−i}

and

  E[W_t] =: μ_t = F^t E[W_0] + Σ_{i=0}^{t−1} F^i G ν,

for W_0 = w_0 we get

  Cov[W_t] = E[(W_t − μ_t)(W_t − μ_t)′] = E[(Σ_{i=0}^{t−1} F^i G A V_{t−i})(Σ_{i=0}^{t−1} F^i G A V_{t−i})′] = Σ_{i=0}^{t−1} F^i G Σ_U G′ (F^i)′ =: Σ_t,

as

  E[A V_t V_{t−j}′ A′] = Cov[U_t, U_{t−j}] = 0 for j ≠ 0, t, t − j ∈ N.

Obviously, the W_t are normally distributed for all t ∈ N⁺: W_t ∼ N(μ_t, Σ_t).
If 𝒜 denotes the block matrix with k matrices A on its main diagonal and
zeroes outside, and 𝒜′ the block matrix with k matrices A′ on its main diagonal
and zeroes outside, then one can represent Σ_k in the form Σ_k = C_k 𝒜 𝒜′ C_k′ with

  C_k′ = ((F^{k−1}G)′, …, (FG)′, G′)′ (the blocks stacked one below the other).

Since the matrices 𝒜 and 𝒜′ are non-singular, we have rank(C_k 𝒜) = rank C_k =
k and rank(𝒜′ C_k′) = k, hence rank Σ_k = k according to Horn and Johnson
(1988, p. 13). Moreover, since by the Cayley–Hamilton theorem, see Horn
and Johnson (1988) e.g., one can express the powers F^t for all t ≥ k as linear
combinations of I, F, F², …, F^{k−1}, one concludes that

  rank Σ_t = rank(Σ_{i=0}^{t−1} F^i G Σ_U G′ (F^i)′) = k for all t ≥ k.

Finally, Σ := Σ_{i=0}^∞ F^i G Σ_U G′ (F^i)′, which is well-defined under (EVF), has rank
k, too. The distribution π of

  W_∞ = (I − F)^{−1} G ν + Σ_{i=0}^∞ F^i G A V_{i+1}

is determined by means of the limit of the Fourier transform φ_t of Σ_{i=0}^{t−1} F^i G A V_{t−i}
for t → ∞: for r ∈ R^k one gets

  lim_{t→∞} φ_t(r) = lim_{t→∞} exp(−(1/2) r′ Σ_{i=0}^{t−1} F^i G Σ_U G′ (F^i)′ r) = exp(−(1/2) r′ Σ r) = φ_{N(0,Σ)}(r),

where φ_{N(0,Σ)} denotes the Fourier transform of the corresponding N(0, Σ)
distribution. □
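Theorem 4.1 lends itself to an empirical sanity check: with Gaussian innovations, a long simulated trajectory should have sample mean near μ = (I − F)^{−1} G ν and sample covariance near Σ. The matrices below are illustrative choices, not from the paper; SciPy’s discrete Lyapunov solver is assumed available for computing Σ.

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

rng = np.random.default_rng(1)
F = np.array([[0.5, 0.2], [0.0, 0.3]])
G = np.eye(2)
nu = np.array([1.0, -1.0])
Sigma_U = np.eye(2)

mu = np.linalg.solve(np.eye(2) - F, G @ nu)            # mu = (I - F)^{-1} G nu
Sigma = solve_discrete_lyapunov(F, G @ Sigma_U @ G.T)  # stationary covariance

# Simulate W_t = F W_{t-1} + G U_t with U_t ~ N(nu, Sigma_U)
U = rng.multivariate_normal(nu, Sigma_U, size=100_000)
W = np.zeros(2)
samples = np.empty_like(U)
for t, u in enumerate(U):
    W = F @ W + G @ u
    samples[t] = W
samples = samples[1_000:]                              # discard burn-in

print(np.allclose(samples.mean(axis=0), mu, atol=0.05))
print(np.allclose(np.cov(samples.T), Sigma, atol=0.05))
```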
Remarks. 1. If G = I_k, then (CMC) holds obviously, U has a strictly
positive λ^k-density and, therefore, (W_t, t ∈ N) is λ^k-irreducible and aperiodic.

2. If (CMC) is not satisfied, then the MC may be restricted to the range
W_R ⊂ W = R^k of the controllability matrix:

  W_R = range[F^{k−1}G | ⋯ | FG | G] = {Σ_{i=0}^{k−1} F^i G α_i : α_i ∈ R^m},

which is also the range of Σ_{i=0}^∞ F^i G G′ (F^i)′. If w_0 ∈ W_R, then F w_0 + G u_1 ∈ W_R for
any u_1 ∈ R^m. This shows that W_R is absorbing, hence the LARM is restricted
to W_R. In turn, the controllability condition is satisfied on W_R.

If the corresponding MC has an invariant probability measure π, then π
is concentrated on W_R, hence is singular w.r.t. λ^k.
5. CONCLUDING REMARKS
Within the framework of LARM, numerous time series models of the autoregressive type can be dealt with, see, e.g., Meyn and Tweedie (1996), Feigin
and Tweedie (1985) or Bougerol and Picard (1992). Ergodicity properties and
the Markov property of the time series (Wt , t ∈ Z or t ∈ N, respectively)
make accessible the classical limit theorems of probability, see again Meyn
and Tweedie (1996). In particular, for the central limit theorem Jones (2004)
presents an excellent collection of results.
The unified treatment of different time series models within the framework of LARM yields simultaneously many important results for those models
(i.e., time series).
Acknowledgements. The authors gratefully acknowledge support from Deutsche
Forschungsgemeinschaft under Grant 436/RUM113/21/4–1.
REFERENCES
[1] P. Bougerol and N. Picard, Strict stationarity of generalized autoregressive processes.
Ann. Probab. 20 (1992), 1714–1730.
[2] A. Brandt, The stochastic equation Y_{n+1} = A_n Y_n + B_n with stationary coefficients. Adv.
in Appl. Probab. 18 (1986), 211–220.
[3] P.D. Feigin and R.L. Tweedie, Random coefficient autoregressive processes: A Markov
chain analysis of stationarity and finiteness of moments. J. Time Ser. Anal. 6 (1985),
1–14.
[4] O. Hernández-Lerma and J.B. Lasserre, Markov Chains and Invariant Probabilities.
Birkhäuser, Basel, 2003.
[5] G.L. Jones, On the Markov chain central limit theorem. Probability Surveys 1 (2004),
299–320.
[6] R.A. Horn and C.R. Johnson, Matrix Analysis. Cambridge Univ. Press, Cambridge, 1988.
[7] M. Loève, Probability Theory. Van Nostrand, Princeton, 1963.
[8] H. Lütkepohl, Introduction to Multiple Time Series Analysis. Springer, Berlin, 1991.
[9] S.P. Meyn and R.L. Tweedie, Markov Chains and Stochastic Stability. Springer, London,
1996.
Received 21 June 2009
Revised 18 November 2009
Universität Duisburg - Essen, Campus Duisburg
Institut für Mathematik
D-47048 Duisburg, Deutschland
[email protected]
Romanian Academy
“Gheorghe Mihoc – Caius Iacob” Institute
of Mathematical Statistics and Applied Mathematics
Casa Academiei Române
Calea 13 Septembrie no. 13
050711 Bucharest, Romania
[email protected]
and
Universität der Bundeswehr München
Werner-Heisenberg-Weg 39
D-85577 Neubiberg, Deutschland
[email protected]