LINEAR AUTOREGRESSIVE MODELS

ULRICH HERKENRATH, MARIUS IOSIFESCU and ANDREAS RUDOLPH

We study linear autoregressive models (LARMs for short) in two different variants: one is based on a doubly infinite sequence of i.i.d. innovations, thus leading to a time series $(W_t, t \in \mathbb{Z})$, and the other is based on a sequence of i.i.d. innovations and an independent starting variable, thus leading to a time series $(W_t, t \in \mathbb{N})$. In both cases the resulting time series obeys the Markov property. For both variants we give sufficient conditions for the "stabilization" of the process. That amounts to a strictly stationary process $(W_t, t \in \mathbb{Z})$ in the first case and to weak, respectively geometric, ergodicity of the Markov chain $(W_t, t \in \mathbb{N})$ in the second case. Special attention is paid to the case of normally distributed innovations.

AMS 2000 Subject Classification: 60J05, 60G10.

Key words: autoregressive time series, Markov chain, strictly stationary process, stabilization of a time series.

Math. Reports 12(62), 3 (2010), 245-259

1. MODELS

We consider linear autoregressive models (LARMs for short). They are characterized by a linear stochastic recursion of the form

(LARM)  $W_t = F W_{t-1} + G U_t$

with either $t \in \mathbb{N}_+$ or $t \in \mathbb{Z}$. If $t \in \mathbb{N}_+$, a random variable $W_0$ has to be given as starting value. As for the dimensions of the items in (LARM): $W_t \in \mathbb{R}^k$, $U_t \in \mathbb{R}^m$, $F \in \mathrm{Mat}(k,k)$, $G \in \mathrm{Mat}(k,m)$, where $\mathrm{Mat}(k,m)$ denotes the set of $k \times m$ matrices.

In this paper we generally assume that on a probability space $(\Omega, \mathcal{A}, P)$ there are given i.i.d. random vectors $(U_t, t \in \mathbb{N}_+$ or $t \in \mathbb{Z})$ with common distribution $\tau$. The Euclidean space $\mathbb{R}^p$ is endowed with its Borel $\sigma$-algebra $\mathcal{B}^p$. In the case $(U_t, t \in \mathbb{N}_+)$, the random variable $W_0$ defined on $(\Omega, \mathcal{A}, P)$ is independent of $(U_t, t \in \mathbb{N}_+)$. Often the $U_t$'s are called noise variables or innovations. Expectations and covariances refer to the probability measure $P$, unless otherwise stated.

In this case we consider the (time-homogeneous) Markov chain (MC for short) $(W_t, t \in \mathbb{N})$ with initial distribution $p_0$, that means $W_0 \sim p_0$. A LARM with $t \in \mathbb{N}_+$ is called a linear state space model in Meyn and Tweedie (1996).

In the case $(U_t, t \in \mathbb{Z})$, we consider the (time-homogeneous) MC $(W_t, t \in \mathbb{Z})$ based on the doubly infinite sequence $(U_t, t \in \mathbb{Z})$, i.e., we think of the MC $(W_t, t \in \mathbb{Z})$ as having started at time $-\infty$, whereas we regard observation times $t \in \mathbb{N}$. Restricted to observation times $t \in \mathbb{N}$, $(W_t, t \in \mathbb{N})$ is a MC with "an infinite history" subsumed in an unknown initial distribution $p_0$, i.e., $W_0 \sim p_0$. Under suitable assumptions on the components of a LARM, one may consider that the process $(W_t, t \in \mathbb{Z})$ which started at time $-\infty$ has already "stabilized" during its infinitely long history, in such a way that $p_0 = \pi$, the unique invariant probability measure (UIPM for short) of the MC induced by the LARM. Then $(W_t, t \in \mathbb{N})$ is an ergodic strictly stationary process, the "stabilized" process generated by a LARM. As general references for terminology and results on MCs we refer to the books by Meyn and Tweedie (1996) and Hernández-Lerma and Lasserre (2003).

If $(W_t, t \in \mathbb{N})$ starts according to an arbitrary initial distribution $p_0 \neq \pi$, this process is "unstable" and one may ask for conditions for its "stabilization" when running through observation times $t \in \mathbb{N}$, thus for some kind of asymptotic stability or stationarity. Of course, that amounts to the same question as ensuring the stability of the observable process $(W_t, t \in \mathbb{N})$ on the basis of an unobservable history $(\ldots, W_{-3}, W_{-2}, W_{-1})$.

Since a LARM is characterized by a stochastic recursion, it is natural to ask for solutions of this linear stochastic recursion (LSR for short). With regard to the above explanations, one is interested in particular in strictly stationary solutions of the LSR.
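To make the recursion concrete, here is a minimal simulation sketch (our own illustration, not part of the paper): it iterates $W_t = F W_{t-1} + G U_t$ for a 2-dimensional example with $\rho(F) < 1$ and Gaussian innovations, so that the influence of the starting value dies out geometrically.

```python
import numpy as np

# Illustrative matrices (our own choice, not from the paper).
rng = np.random.default_rng(0)

F = np.array([[0.5, 0.2],
              [0.1, 0.3]])   # spectral radius well below 1: condition (EVF)
G = np.eye(2)
nu = np.array([1.0, -1.0])   # innovation mean E[U]

assert max(abs(np.linalg.eigvals(F))) < 1  # check (EVF)

def simulate(w0, T):
    """Iterate (LARM) for T steps from the starting value w0."""
    w = np.asarray(w0, dtype=float)
    for _ in range(T):
        u = nu + rng.standard_normal(2)    # U_t ~ N(nu, I)
        w = F @ w + G @ u
    return w

# Two very different starting points: the contribution F^t W_0 of the
# start decays like rho(F)^t, so both paths end up fluctuating around
# the stationary mean (I - F)^{-1} G nu.
print(simulate([100.0, -100.0], 200))
print(simulate([0.0, 0.0], 200))
```

The printed states differ only through the innovation draws, not through the starting points; this is the "stabilization" studied below.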
A strictly stationary solution means a stochastic process which obeys the LSR and is a strictly stationary process.

Next, we collect assumptions on the components of a LARM which will be made in order to ensure stability properties. As for integrability conditions on a typical member $U$ of the i.i.d. family $(U_t, t \in \mathbb{Z}$ or $t \in \mathbb{N}_+)$, we assume that

(UL) $E[(\log \|U\|)^+] < \infty$,

(U1) $U \in L^1$, i.e., $E[\|U\|] < \infty$,

(U2) $U \in L^2$, i.e., $E[\|U\|^2] < \infty$, and $\mathrm{Cov}[U] = E[VV'] =: \Sigma_U$ non-singular,

where $V$ means "$U$ centered", i.e., $U = V + \nu$ with $\nu = E[U]$, $\|\cdot\|$ denotes a vector norm and $\log$ the natural logarithm.

Lemma 1.1. (U2) ⇒ (U1) ⇒ (UL).

Proof. The first implication is trivial, so that only the second one needs to be proved. For $U \in L^1$ we have
$$E[(\log \|U\|)^+] = \int_{\{\|U\| \le 1\}} (\log \|U\|)^+ \, dP + \int_{\{\|U\| > 1\}} (\log \|U\|) \, dP = \int_{\{\|U\| > 1\}} (\log \|U\|) \, dP \le \log \int_{\{\|U\| > 1\}} \|U\| \, dP < \infty$$
by Jensen's inequality.

The essential condition on the matrix $F$ to ensure the "stability" of a LARM is

(EVF) $\rho(F) < 1$,

where $\rho(F)$ denotes the spectral radius of $F$, i.e., the maximal absolute value of the eigenvalues of $F$. The "eigenvalue condition" (EVF) ensures in some sense a contractivity property of the linear mapping induced by $F$. If this property of $F$ does not hold, then, by iterating (LARM), the process $(W_t, t \in \mathbb{N})$ "drifts away to infinity".

It is sometimes useful to assume the so-called controllability condition for the pair of matrices $(F, G)$. One introduces the so-called controllability matrix $C_k$ as the aggregation of the matrices $F^{k-1}G, F^{k-2}G, \ldots, FG, G$ written side by side:
$$C_k := [F^{k-1}G \,|\, \ldots \,|\, FG \,|\, G] \in \mathrm{Mat}(k, km).$$
If

(CMC) $\mathrm{rank}\, C_k = k$,

the pair of matrices $(F, G)$ is called controllable or (CMC) is called valid. Obviously, for $G = I_k$, $I_k$ the identity matrix of order $k$, condition (CMC) holds. The essence of the controllability property of $(F, G)$ is as follows (see Meyn and Tweedie (1996, p. 95)).
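As an aside, the rank condition (CMC) is easy to test numerically. The following sketch (an illustration with example matrices of our own, not taken from the paper) assembles $C_k = [F^{k-1}G \,|\, \cdots \,|\, FG \,|\, G]$ and checks whether its rank equals $k$:

```python
import numpy as np

def controllability_matrix(F, G):
    """Build C_k = [F^{k-1}G | ... | FG | G], of shape (k, k*m)."""
    k = F.shape[0]
    blocks = [np.linalg.matrix_power(F, j) @ G for j in range(k - 1, -1, -1)]
    return np.hstack(blocks)

def is_controllable(F, G):
    """Condition (CMC): rank C_k = k."""
    return np.linalg.matrix_rank(controllability_matrix(F, G)) == F.shape[0]

# Example matrices (our own choice).
F = np.array([[0.5, 1.0],
              [0.0, 0.5]])
G_full = np.eye(2)                    # G = I_k: (CMC) always holds
G_thin = np.array([[0.0], [1.0]])     # single input; here still controllable

print(is_controllable(F, G_full), is_controllable(F, G_thin))
```

For $G = I_k$ the check is trivially true, as noted above; the single-input example shows that a "thin" $G$ can still satisfy (CMC) when $F$ mixes the coordinates.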
If one considers the deterministic recursion $w_t = F w_{t-1} + G u_t$, $t \in \mathbb{N}_+$, then the controllability of $(F, G)$ means that for each pair of "states" $w_0, w^* \in \mathbb{R}^k$ there exists a sequence of "controls" $(u_1^*, \ldots, u_k^*)$, $u_i^* \in \mathbb{R}^m$, such that $w^* = w_k$ when $(u_1, \ldots, u_k) = (u_1^*, \ldots, u_k^*)$ and the recursion starts at $w_0$, i.e., each state $w^*$ can be reached from each starting point $w_0$ in $k$ steps. The reason for that is simply the representation
$$w_k = F^k w_0 + [F^{k-1}G \,|\, \cdots \,|\, FG \,|\, G] \begin{pmatrix} u_1 \\ \vdots \\ u_k \end{pmatrix}$$
and the fact that the range of the linear mapping induced by $[F^{k-1}G \,|\, \cdots \,|\, FG \,|\, G]$ is $\mathbb{R}^k$. Therefore, each state $w^*$ can be reached from each starting point $w_0$ in $t$ steps for all $t \ge k$.

Condition (CMC) obviously represents a concept of communication between states in the deterministic model generated by $F$ and $G$ and is the key to irreducibility of the MC $(W_t, t \in \mathbb{N}_+)$ under appropriate conditions on the sequence $(U_t, t \in \mathbb{N}_+)$. In this context we prove the following result.

Lemma 1.2. If for a LARM started at $w_0 \in \mathbb{R}^k$ (CMC) holds and $U$ has a strictly positive $\lambda^m$-density, then $W_t$ has a strictly positive $\lambda^k$-density for all $t \ge k$. Therefore, the MC $(W_t, t \in \mathbb{N})$ is $\lambda^k$-irreducible and aperiodic.

Proof. Consider for a fixed $w_0 \in \mathbb{R}^k$ the linear transformation $T: \mathbb{R}^{km} \to \mathbb{R}^k$ defined by
$$T(u_1, \ldots, u_k) = C_k (u_1 | \cdots | u_k)' =: w_k - F^k w_0.$$
Here $(u_1 | \cdots | u_k) := (u_{11}, \ldots, u_{1m}, u_{21}, \ldots, u_{2m}, \ldots, u_{k1}, \ldots, u_{km}) = (u_1', \ldots, u_k')$, so that $(u_1 | \cdots | u_k)' \in \mathrm{Mat}(km, 1)$. The matrix $C_k \in \mathrm{Mat}(k, km)$ can be extended to a non-singular matrix $C \in \mathrm{Mat}(km, km)$, i.e., $|\det C| > 0$, inducing a linear transformation $\widehat{T}: \mathbb{R}^{km} \to \mathbb{R}^{km}$. Then $\widehat{T}^{-1}$ exists as inverse transformation and, according to the theorem on transformation of densities, $W_k$ induced by (LARM) has a strictly positive $\lambda^k$-density $q^k$, as marginal density of a strictly positive density, for all starting points $w_0 \in \mathbb{R}^k$.
It follows from the Markov property of $(W_t, t \in \mathbb{N})$ that for all $t \ge k$ the random vectors $W_t$ have a strictly positive $\lambda^k$-density, too:
$$P(W_{k+j} \in A) = \int_W P(W_{k+j} \in A \mid W_j = w) \, P(W_j \in dw) = \int_W Q^k(w, A) \, P(W_j \in dw) = \int_A \int_W q^k(w, w') \, P(W_j \in dw) \, \lambda^k(dw')$$
for $W = \mathbb{R}^k$, $A \in \mathcal{B}^k$ and $j \in \mathbb{N}$. Here $Q^k$ is the $k$-step transition probability function of the MC $(W_t, t \in \mathbb{N})$. Strictly positive densities obviously imply $\lambda^k$-irreducibility.

As for aperiodicity, according to Proposition 5.2.4 in Meyn and Tweedie (1996), there exists a countable collection of so-called "small" sets $C_i$ such that the $C_i$ are Borel sets and $\bigcup_{i=1}^{\infty} C_i = \mathbb{R}^k$. Since there must exist a small set $C_j$ with $\lambda^k(C_j) > 0$, Proposition 4.2.2 in Hernández-Lerma and Lasserre (2003) ensures in turn the existence of $d \ge 1$ disjoint Borel sets $D_1, D_2, \ldots, D_d$ such that
$$\forall x \in D_i: Q(x, D_{i+1}) = 1 \ \text{for } i = 1, \ldots, d-1, \qquad \forall x \in D_d: Q(x, D_1) = 1 \qquad \text{and} \qquad \lambda^k\Big(\mathbb{R}^k \setminus \bigcup_{i=1}^d D_i\Big) = 0.$$
$\{D_1, \ldots, D_d\}$ is called a $d$-cycle and the MC is called aperiodic if $d = 1$. Now, because of the strictly positive $\lambda^k$-densities, $d$ must equal 1. Otherwise, for all $i = 1, \ldots, d$ and all $x \in D_i$ we would have $Q(x, D_i) = 0$ and, therefore, $\lambda^k(\bigcup_{i=1}^d D_i)$ would be 0. So, $(W_t, t \in \mathbb{N})$ is aperiodic.

2. STRICTLY STATIONARY SOLUTIONS OF (LARM)

As for the process $(W_t)$, we first consider the case $(U_t, t \in \mathbb{Z})$, i.e., assume that $(W_t, t \in \mathbb{Z})$ starts at time $-\infty$. Then it is natural to ask whether $(W_t, t \in \mathbb{N})$ has "stabilized" during its infinitely long history $(\ldots, -3, -2, -1)$. The starting time $-\infty$ can only be meant asymptotically, i.e., as a limit of finite starting times. So, for $t \in \mathbb{Z}$ fixed, we think of $(t-k)$, $k \in \mathbb{N}$, as starting time and then let $k \to \infty$. Iterating (LARM) from $(t-k)$ to $t$, the central term to be studied is $\sum_{i=0}^{k-1} F^i G U_{t-i}$. Next, we collect results on its convergence in terms of properties of $(U_t, t \in \mathbb{Z})$.

Lemma 2.1. Let (EVF) hold, i.e., $\rho(F) < 1$.
(i) If $(U_t, t \in \mathbb{Z})$ satisfies (UL), i.e., $E[(\log \|U\|)^+] < \infty$, then for all $t \in \mathbb{Z}$ there exists $\widetilde{W}_t$ such that
$$\sum_{i=0}^{k-1} F^i G U_{t-i} \to \widetilde{W}_t \quad \text{a.s. as } k \to \infty.$$

(ii) If $(U_t, t \in \mathbb{Z})$ satisfies (U1), i.e., $E[\|U\|] < \infty$, $U = V + \nu$ with $\nu = E[U]$, then for all $t \in \mathbb{Z}$ there exists $\widetilde{W}_t \in L^1$ such that
$$\sum_{i=0}^{k-1} F^i G U_{t-i} \xrightarrow{L^1} \widetilde{W}_t \quad \text{as } k \to \infty,$$
in addition to the a.s. convergence, and
$$E[\widetilde{W}_t] = \mu = (I - F)^{-1} G \nu.$$

(iii) If $(U_t, t \in \mathbb{Z})$ satisfies (U2), i.e., $E[\|U\|^2] < \infty$ and $\mathrm{Cov}[U] = \Sigma_U$ non-singular, then for all $t \in \mathbb{Z}$ there exists $\widetilde{W}_t \in L^2$ such that
$$\sum_{i=0}^{k-1} F^i G U_{t-i} \xrightarrow{L^2} \widetilde{W}_t \quad \text{as } k \to \infty,$$
in addition to the convergence properties in (ii), and
$$\mathrm{Cov}[\widetilde{W}_t, \widetilde{W}_{t-h}] = \sum_{i=0}^{\infty} F^{h+i} G \Sigma_U G' F^{i\prime}, \qquad \mathrm{Cov}[\widetilde{W}_t] = \sum_{i=0}^{\infty} F^i G \Sigma_U G' F^{i\prime}.$$
Here $'$ denotes the transpose of the corresponding matrix.

Proof. (i) We refer to Theorem 1.1 in Bougerol and Picard (1992), which essentially had already been proved by Brandt (1986). Set $A_t = F$ and $B_t = G U_t$ for all $t \in \mathbb{Z}$. Since for any matrix norm $|||\cdot|||$ one has $\rho(F) = \lim_{n\to\infty} |||F^n|||^{1/n}$ [see Horn and Johnson (1988, p. 299)], we can write
$$\gamma := \inf_{n \in \mathbb{N}} \frac{1}{n} \log |||F^n||| \le \lim_{n\to\infty} \log |||F^n|||^{1/n} = \log \rho(F) < 0$$
by (EVF) and, furthermore, $E[(\log |||A_t|||)^+] = (\log |||F|||)^+ < \infty$. Moreover,
$$E[(\log \|B_t\|)^+] = E[(\log \|G U\|)^+] \le (\log |||G|||)^+ + E[(\log \|U\|)^+] < \infty$$
for the operator norm $|||\cdot|||$ associated with the vector norm $\|\cdot\|$. Therefore, all assumptions of the quoted Theorem 1.1 are satisfied, so that our statement follows.

(ii) For all $t \in \mathbb{Z}$, $k \in \mathbb{N}_+$ we define
$$W_t^{-k} = F^k W_{t-k} + \sum_{i=0}^{k-1} F^i G U_{t-i},$$
i.e., $W_t^{-k}$ is the "state" at time $t$ if the LARM started at time $(t-k)$ in state $W_{t-k}$. With regard to the assumed starting time $-\infty$, we consider for a fixed $t \in \mathbb{Z}$ the limit of the sequence $(W_t^{-k}, k \in \mathbb{N}_+)$, that is,
$$\widetilde{W}_t := \lim_{k\to\infty} W_t^{-k} \stackrel{(*)}{=} \sum_{i=0}^{\infty} F^i G U_{t-i} = \sum_{i=0}^{\infty} F^i G V_{t-i} + \sum_{i=0}^{\infty} F^i G \nu \stackrel{(**)}{=} \sum_{i=0}^{\infty} F^i G V_{t-i} + (I-F)^{-1} G \nu;$$
$(*)$ holds by condition (EVF) according to Theorem 5.6.12 in Horn and Johnson (1988, p. 298), while $(**)$ holds according to a result on p. 301 in that book. Since
$$\Big\| \sum_{i=0}^{k-1} F^i G V_{t-i} \Big\| \le \sum_{i=0}^{k-1} |||F^i||| \, \|G V_{t-i}\|$$
for an appropriate matrix norm, we have
$$E\Big[ \sum_{i=0}^{\infty} |||F^i||| \, \|G V_{t-i}\| \Big] = \sum_{i=0}^{\infty} |||F^i||| \, E[\|G V_{t-i}\|] < \infty$$
by the Monotone Convergence Theorem, condition (U1) and Corollary 5.6.14 in Horn and Johnson (1988, p. 299), i.e., $\sum_{i=0}^{\infty} |||F^i||| \, \|G V_{t-i}\| \in L^1$. In turn this implies [see, e.g., Loève (1963, p. 163)]
$$\sum_{i=0}^{k-1} F^i G U_{t-i} \xrightarrow{L^1} \sum_{i=0}^{\infty} F^i G U_{t-i} = \widetilde{W}_t$$
as $k \to \infty$ and
$$E\Big[ \sum_{i=0}^{\infty} F^i G V_{t-i} \Big] = \lim_{k\to\infty} E\Big[ \sum_{i=0}^{k-1} F^i G V_{t-i} \Big] = 0.$$
Therefore, $E[\widetilde{W}_t] = (I-F)^{-1} G \nu$.

(iii) The proof follows from Proposition C.7 in Lütkepohl (1991, p. 490), as, by (EVF), $(F^i, i \in \mathbb{N})$ is an absolutely summable sequence of matrices (see again Horn and Johnson (1988, p. 301)) and $E[U_t' U_t] \le C < \infty$ because of (U2). Moreover, according to Proposition C.8 in Lütkepohl (1991, p. 491), one can compute the covariances
$$\Gamma_{\widetilde{W}}(h) := E[(\widetilde{W}_t - \mu)(\widetilde{W}_{t-h} - \mu)'] = \lim_{k\to\infty} \sum_{i=0}^{k-1} \sum_{j=0}^{k-1} F^i G \, E[V_{t-i} V_{t-h-j}'] \, G' F^{j\prime} = \lim_{k\to\infty} \sum_{j=0}^{k-1} F^{h+j} G \Sigma_U G' F^{j\prime} = \sum_{i=0}^{\infty} F^{h+i} G \Sigma_U G' F^{i\prime} = \mathrm{Cov}[\widetilde{W}_t, \widetilde{W}_{t-h}].$$
This holds as $E[V_t V_s'] = 0$ for $s \neq t$ and $E[V_t V_t'] = \Sigma_U$ for all $t \in \mathbb{Z}$. In particular, for $h = 0$ we have
$$\Gamma_{\widetilde{W}}(0) = E[(\widetilde{W}_t - \mu)(\widetilde{W}_t - \mu)'] = \sum_{i=0}^{\infty} F^i G \Sigma_U G' F^{i\prime} = \mathrm{Cov}[\widetilde{W}_t].$$

Next, we ask for properties of the stochastic process $(\widetilde{W}_t, t \in \mathbb{Z})$, which is well defined under the assumptions of Lemma 2.1.

Lemma 2.2. Under conditions (EVF) and (UL) the stochastic process $(\widetilde{W}_t, t \in \mathbb{Z})$ is the unique strictly stationary solution of the stochastic recursion (LARM) and enjoys the Markov property. Let $P$ denote the transition probability function of the MC $(\widetilde{W}_t, t \in \mathbb{Z})$. Then the distribution $m$ of $\widetilde{W}_t$ is an invariant probability measure of the chain, that is, $m(A) = \int P(w, A) \, m(dw)$.

Proof.
Although the result is contained in Theorem 1.1 in Bougerol and Picard (1992), we prove it here, except for the uniqueness of the solution. Uniqueness will be ensured by Theorem 3.1 to be proved in what follows.

On account of the representation $\widetilde{W}_t = \sum_{i=0}^{\infty} F^i G U_{t-i}$, $t \in \mathbb{Z}$, the stochastic recursion (LARM) is satisfied:
$$F \widetilde{W}_{t-1} + G U_t = F \sum_{i=0}^{\infty} F^i G U_{t-1-i} + G U_t = \sum_{i=0}^{\infty} F^{i+1} G U_{t-(i+1)} + F^0 G U_t = \widetilde{W}_t.$$
Since $(\widetilde{W}_t, t \in \mathbb{Z})$ obeys (LARM), it enjoys the Markov property. Then, in combination with the equation
$$\widetilde{W}_t = \sum_{i=0}^{\infty} F^i G U_{t-i} \stackrel{D}{=} \sum_{i=0}^{\infty} F^i G U_{t+h-i} = \widetilde{W}_{t+h}$$
for all $h \in \mathbb{Z}$, the strict stationarity of $(\widetilde{W}_t, t \in \mathbb{Z})$ follows. Here, $\stackrel{D}{=}$ denotes equality in distribution. The last property of the statement follows from the equation
$$m(A) = P(\widetilde{W}_t \in A) = \int P(\widetilde{W}_t \in A \mid \widetilde{W}_{t-1} = w) \, P(\widetilde{W}_{t-1} \in dw) = \int P(w, A) \, m(dw).$$

Remarks 1.1. 1. The strictly stationary solution $\widetilde{W}_t = \sum_{i=0}^{\infty} F^i G U_{t-i}$, $t \in \mathbb{Z}$, is a nonanticipative one according to Definition 2.2 in Bougerol and Picard (1992), since $\widetilde{W}_t$ is independent of the random vectors $(U_r, r > t)$ for any $t \in \mathbb{Z}$. Therefore, the solution $(\widetilde{W}_t, t \in \mathbb{Z})$ is independent of the future at any given time and contains only noise terms or innovations from the past up to the present.

2. Bougerol and Picard (1992) introduce an irreducibility property for a LARM: an affine subspace $H$ of $\mathbb{R}^k$ is said to be invariant under (LARM) if $\{F w + G U, w \in H\} \subset H$ a.s. A LARM is called irreducible if $\mathbb{R}^k$ is the only invariant affine subspace, i.e., there is no proper affine subspace $H \subset \mathbb{R}^k$ such that if the LARM starts at $W_0 = w \in H$, then it remains in $H$ all the time. They show, in addition to the result of Lemma 2.2, that under the above irreducibility condition and condition (UL) the validity of (EVF) is even a necessary condition for the existence of a nonanticipative strictly stationary solution of (LARM).

3. In the time series literature, see e.g. Lütkepohl (1991), the stochastic process $(\widetilde{W}_t, t \in \mathbb{Z})$, which is well defined under the conditions (EVF) and (UL), is called "stable" and, correspondingly, condition (EVF) for a LARM is called the "stability condition" or "stationarity condition" [see Lütkepohl (1991, p. 20)]. In the time series literature the notions "stability" and "stationarity" are often used synonymously. Moreover, in some books there is no distinction between "strict stationarity" and "second-order stationarity".

If one regards the stochastic recursion (LARM) as a time series model, i.e., as a vectorial autoregressive model of order 1, one can ask for time series, i.e., stochastic processes $(W_t, t \in \mathbb{Z}$ or $t \in \mathbb{N})$, which are compatible with (LARM). Therefore, in the case "$t \in \mathbb{Z}$" one asks for solutions of (LARM), while in the case "$t \in \mathbb{N}$", given a starting variable $W_0$, one generates $(W_t, t \in \mathbb{N})$ by iterating (LARM). As was shown above, under suitable conditions, a time series which has started in the infinite past, i.e., at time $-\infty$, has "stabilized", so that it appears as "stable" in the stochastic sense during observation times $t \in \mathbb{N}$. This explains the introduction of $(\widetilde{W}_t, t \in \mathbb{Z})$.

The stochastic process or time series $\widetilde{W}_t = \sum_{i=0}^{\infty} F^i G U_{t-i}$, $t \in \mathbb{Z}$, dealt with in Lemmas 2.1 and 2.2 is called the canonical moving average (MA for short) representation of the linear autoregressive model LARM. It requires the validity of (EVF) and exists under (UL) almost surely, under (U1) in the $L^1$-sense and under (U2) in the $L^2$-sense, as an infinite series based on the infinite past. Usually, the time series literature deals directly with $(\widetilde{W}_t, t \in \mathbb{Z})$.

3. THE CASE $(U_t, t \in \mathbb{N}_+)$: MARKOV CHAINS GENERATED BY A LARM

Since we consider as observation times for a LARM $t \in \mathbb{N}$, respectively an induced time series indexed by $t \in \mathbb{N}$, we now turn to study a LARM starting with a random vector $W_0$. The distribution of $W_0$ may be $m = \mathcal{L}(\widetilde{W}_0)$ resulting from Lemma 2.2 or any other probability measure $p_0$.
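Before passing to the chain started at $W_0$, the stationary moments derived above can be illustrated numerically. The sketch below (our own example matrices, assuming numpy) computes $\mu = (I-F)^{-1} G \nu$ and truncates the series $\Sigma = \sum_{i \ge 0} F^i G \Sigma_U G' F^{i\prime}$; under (EVF) the truncated sum satisfies, up to a geometrically small tail, the fixed-point (discrete Lyapunov) equation $\Sigma = F \Sigma F' + G \Sigma_U G'$, which is equivalent to the series representation.

```python
import numpy as np

# Illustrative parameters (our own choice, not from the paper).
F = np.array([[0.5, 0.2],
              [0.1, 0.3]])          # rho(F) < 1, i.e. (EVF)
G = np.eye(2)
nu = np.array([1.0, -1.0])          # nu = E[U]
Sigma_U = np.array([[1.0, 0.3],
                    [0.3, 2.0]])    # Cov[U], non-singular: (U2)

# Stationary mean mu = (I - F)^{-1} G nu.
mu = np.linalg.solve(np.eye(2) - F, G @ nu)

# Truncate Sigma = sum_{i>=0} F^i G Sigma_U G' (F^i)'; the tail is
# geometrically small since rho(F) < 1.
Sigma = np.zeros((2, 2))
Fi = np.eye(2)
for _ in range(200):
    Sigma += Fi @ G @ Sigma_U @ G.T @ Fi.T
    Fi = Fi @ F

# The series is the fixed point of Sigma = F Sigma F' + G Sigma_U G'.
residual = np.linalg.norm(Sigma - (F @ Sigma @ F.T + G @ Sigma_U @ G.T))
print(mu, residual)
```

The residual is of the order of the first omitted term of the series; the mean satisfies the stationarity relation $\mu = F\mu + G\nu$ exactly.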
In any case, we study $(W_t, t \in \mathbb{N})$ as a MC. If $W_0 \sim m$, we have a "stable", i.e., strictly stationary MC. If $W_0 \sim p_0$, again the question arises whether $(W_t, t \in \mathbb{N})$ "stabilizes" as $t \to \infty$. In this context, we will take advantage of available results for MCs.

Theorem 3.1. Assume (EVF) holds, i.e., $\rho(F) < 1$, and $(U_t, t \in \mathbb{N}_+)$ satisfies (U1), i.e., $E[\|U\|] < \infty$, $U = V + \nu$ with $\nu = E[U]$. Let the random vector $W_0$ be independent of $(U_t, t \in \mathbb{N}_+)$ and $(W_t, t \in \mathbb{N})$ be generated by the LARM. Then

(i) $W_t \xrightarrow{L} W_\infty$ as $t \to \infty$ and
$$W_\infty = \sum_{i=0}^{\infty} F^i G U_{i+1} \in L^1, \qquad E[W_\infty] = (I-F)^{-1} G \nu = \mu;$$

(ii) if $P$ denotes the transition probability function of the MC $(W_t, t \in \mathbb{N})$, then $(W_t, t \in \mathbb{N})$ is weakly ergodic, i.e., $P^t(w_0, \cdot) \xrightarrow{w} \pi$ as $t \to \infty$, $\pi = \mathcal{L}(W_\infty)$ being the unique $P$-invariant probability measure, so that $\pi = m$;

(iii) $(W_t, P_\pi, t \in \mathbb{N})$, i.e., the MC $(W_t, t \in \mathbb{N})$ for which $W_0 \sim \pi$, is an ergodic strictly stationary process;

(iv) if, in addition, $(U_t, t \in \mathbb{N}_+)$ satisfies (U2), i.e., $E[\|U\|^2] < \infty$, then $W_\infty \in L^2$ with covariance matrix
$$\mathrm{Cov}[W_\infty] = \sum_{i=0}^{\infty} F^i G \Sigma_U G' F^{i\prime}.$$

Here $\xrightarrow{L}$ stands for convergence in distribution and $\xrightarrow{w}$ for weak convergence of probability measures.

Proof. (i) Iterating (LARM) yields $W_t = F^t W_0 + \sum_{i=0}^{t-1} F^i G U_{t-i}$. Moreover, $\sum_{i=0}^{t-1} F^i G U_{t-i} \stackrel{D}{=} \sum_{i=0}^{t-1} F^i G U_{i+1}$. Just as in the proof of Lemma 2.1(ii), we conclude that $\sum_{i=0}^{\infty} F^i G U_{i+1} = W_\infty \in L^1$. Therefore, $\sum_{i=0}^{t-1} F^i G U_{i+1} \xrightarrow{L^1} W_\infty$, whence $\sum_{i=0}^{t-1} F^i G U_{i+1} \xrightarrow{P} W_\infty$ and then $\sum_{i=0}^{t-1} F^i G U_{i+1} \xrightarrow{L} W_\infty$ as $t \to \infty$. Since $F^t W_0 \to 0$ a.s. by (EVF), $W_t \xrightarrow{L} W_\infty$ as $t \to \infty$. The representation of $E[W_\infty]$ follows as in Lemma 2.1(ii).

(ii) If $P$ is the transition probability function of the MC $(W_t, t \in \mathbb{N})$, then $P^t(w_0, \cdot)$ is the distribution of $W_t$ if the MC starts at $W_0 = w_0$. Therefore, (i) says that $P^t(w_0, \cdot) \xrightarrow{w} \pi$ as $t \to \infty$, $\pi = \mathcal{L}(W_\infty) = m$. Next, we show that $\pi$ is $P$-invariant.
This amounts to showing that if $W_0 \sim \pi$ then $W_1 = (F W_0 + G U_1) \sim \pi$. Choose for $W_0$ the representation $W_0 = \sum_{i=0}^{\infty} F^i G U_{i+2} \sim \pi$. Then
$$W_1 = \sum_{i=0}^{\infty} F^i G U_{i+1} \sim \pi.$$
Because of the time-homogeneity of the MC $(W_t, t \in \mathbb{N})$, the result is valid for each transition from time $t$ to time $(t+1)$. Moreover, $\pi$ turns out to be the unique $P$-invariant probability measure: $P^t(w, \cdot) \xrightarrow{w} \pi$ means that
$$\int_W f(w') \, P^t(w, dw') \to \int_W f(w') \, \pi(dw')$$
for all bounded continuous real-valued functions $f$ on $W = \mathbb{R}^k$, as $t \to \infty$. If $q$ is another $P$-invariant probability measure then, for $f$ as above,
$$\int_W f(w') \, q(dw') = \int_W f(w') \int_W P^t(w, dw') \, q(dw) = \int_W \int_W f(w') \, P^t(w, dw') \, q(dw) \to \int_W f(w') \, \pi(dw')$$
as $t \to \infty$. Since $\int_W f(w') \, q(dw') = \int_W f(w') \, \pi(dw')$ for all bounded continuous real-valued functions $f$, $q$ must be equal to $\pi$.

(iii) is a well-known consequence of the uniqueness of the invariant probability measure $\pi$, see, e.g., Hernández-Lerma and Lasserre (2003, p. 35).

(iv) Since we know from (i) that $\sum_{i=0}^{t-1} F^i G U_{i+1} \to \sum_{i=0}^{\infty} F^i G U_{i+1}$ a.s., according to the "Equivalence Theorem" in Loève (1963, p. 251) we have
$$\|W_\infty\|^2 = \Big\| \lim_{t\to\infty} \sum_{i=0}^{t-1} F^i G U_{i+1} \Big\|^2 \le \lim_{t\to\infty} \Big( \sum_{i=0}^{t-1} |||F^i||| \, \|G U_{i+1}\| \Big)^2 \le \Big( \sum_{i=0}^{\infty} |||F^i||| \, \|G U_{i+1}\| \Big)^2 \quad \text{a.s.}$$
for an appropriate matrix norm. As in the proof of Lemma 2.1(ii), one concludes $E[\|W_\infty\|^2] < \infty$. The covariance matrix of $W_\infty$ can be calculated as in the proof of Lemma 2.1(iii):
$$\mathrm{Cov}[W_\infty] = E[(W_\infty - \mu)(W_\infty - \mu)'] = \sum_{i=0}^{\infty} F^i G \Sigma_U G' F^{i\prime}.$$

In order to get a stronger ergodicity property, we assume strictly positive densities of the innovations and, in addition, condition (CMC), which "spreads out" the probability mass.

Theorem 3.2. Consider a LARM starting with the random vector $W_0$, independent of $(U_t, t \in \mathbb{N}_+)$. Moreover, assume that (EVF) and (CMC) are valid, $(U_t, t \in \mathbb{N}_+)$ satisfies condition (U1) and $U$ has a strictly positive $\lambda^m$-density.
Then $(W_t, t \in \mathbb{N})$ is a geometrically ergodic MC, i.e., there exists a unique invariant probability measure $\pi$ with respect to the transition probability function $P$ of $(W_t, t \in \mathbb{N})$ such that
$$\|P^t(w, \cdot) - \pi(\cdot)\| \le \rho^t R_w, \quad t \in \mathbb{N},$$
for some $0 < \rho < 1$ and all $w \in \mathbb{R}^k$, with corresponding constants $R_w < \infty$. Here, $\|\cdot\|$ denotes the total variation norm of signed measures.

Proof. By Lemma 1.2, the MC $(W_t, t \in \mathbb{N})$ is $\lambda^k$-irreducible and aperiodic. Moreover, it is a Feller chain. Geometric ergodicity of $(W_t, t \in \mathbb{N})$ can be proved by means of a theorem of Feigin and Tweedie (1985), see also Meyn and Tweedie (1996, pp. 354-355).

First, we pick a special vector norm based on condition (EVF). Since $\rho(F) < 1$, there exists a matrix norm $|||\cdot|||$ such that $|||F||| < 1$, which yields a compatible vector norm $\|\cdot\|$, i.e., the inequality $\|F w\| \le |||F||| \, \|w\|$ holds for all $w \in \mathbb{R}^k$, see Horn and Johnson (1988, p. 297). Using that vector norm we define the test function $h: \mathbb{R}^k \to [1, \infty)$ by $h(w) = 1 + \|w\|$. With $\delta > 0$ defined by $|||F||| = 1 - \delta < 1$, one can estimate
$$E[h(W_t) \mid W_{t-1} = w] = E[h(F w + G U_t)] = E[1 + \|F w + G U_t\|] \le |||F||| \, \|w\| + 1 + E[\|G U_t\|] \le (1-\delta)\|w\| + (1-\delta) + \delta + K_1 = (1-\delta)[1 + \|w\|] + K_2 = (1-\delta) h(w) + K_2$$
with finite constants $K_1, K_2$. Now, a large enough compact set $C \subset \mathbb{R}^k$ can be chosen such that $K_2 < \frac{\delta}{2}[1 + \|w\|]$ for all $w \notin C$. Then for all $w \notin C$ one gets
$$E[h(W_t) \mid W_{t-1} = w] \le \Big(1 - \frac{\delta}{2}\Big) h(w).$$
As all norms on $\mathbb{R}^k$ are equivalent, $C$ being compact with respect to $\|\cdot\|$, it is also compact with respect to the Euclidean vector norm. Therefore, the proof is complete by the theorem mentioned above.

4. NORMALLY DISTRIBUTED INNOVATIONS

In the case of normally distributed innovations $(U_t, t \in \mathbb{N}_+)$, the unique invariant probability measure $\pi$ is a normal distribution.

Theorem 4.1. Consider a LARM starting with the random vector $W_0$ independent of $(U_t, t \in \mathbb{N}_+)$.
Moreover, assume that (EVF) and (CMC) are valid and $U \sim N(\nu, \Sigma_U)$ with a positive definite covariance matrix $\Sigma_U$. Then the MC $(W_t, t \in \mathbb{N})$ is geometrically ergodic and, moreover,
$$W_\infty \sim N(\mu, \Sigma) = \pi$$
with $\mu = (I-F)^{-1} G \nu$ and $\Sigma = \sum_{i=0}^{\infty} F^i G \Sigma_U G' F^{i\prime}$. As a consequence, the MC $(W_t, t \in \mathbb{N})$ is weakly asymptotically stationary.

Proof. The positive definiteness of $\Sigma_U$ ensures the strict positivity of the $\lambda^m$-density of $U$ on the whole of $\mathbb{R}^m$, and therefore the validity of Theorem 3.2. It remains to determine the distribution of $W_\infty$. First, for $U \sim N(\nu, \Sigma_U)$ with a positive definite $\Sigma_U$, one can find a positive definite matrix $A$ such that $A A' = \Sigma_U$ [see Horn and Johnson (1988, p. 405)] and one can write $U = A V + \nu$ with $V \sim N(0, I_m)$. Since
$$W_t = F^t W_0 + \sum_{i=0}^{t-1} F^i G \nu + \sum_{i=0}^{t-1} F^i G A V_{t-i}$$
and
$$E[W_t] =: \mu_t = F^t E[W_0] + \sum_{i=0}^{t-1} F^i G \nu,$$
for $W_0 = w_0$ we get
$$\mathrm{Cov}[W_t] = E[(W_t - \mu_t)(W_t - \mu_t)'] = E\Big[ \Big( \sum_{i=0}^{t-1} F^i G A V_{t-i} \Big) \Big( \sum_{i=0}^{t-1} F^i G A V_{t-i} \Big)' \Big] = \sum_{i=0}^{t-1} F^i G \Sigma_U G' F^{i\prime} =: \Sigma_t,$$
as
$$E[A V_t V_{t-j}' A'] = \mathrm{Cov}[U_t, U_{t-j}] = 0 \quad \text{for } j \neq 0, \ t, t-j \in \mathbb{N}.$$
Obviously, the $W_t$ are normally distributed for all $t \in \mathbb{N}_+$: $W_t \sim N(\mu_t, \Sigma_t)$. If $\bar{A}$ denotes the block matrix with $k$ matrices $A$ on its main diagonal and zeroes outside, and $\bar{A}'$ the block matrix with $k$ matrices $A'$ on its main diagonal and zeroes outside, then one can represent $\Sigma_k$ in the form
$$\Sigma_k = C_k \bar{A} \bar{A}' C_k' \qquad \text{with} \qquad C_k' = \begin{pmatrix} (F^{k-1} G)' \\ \vdots \\ (F G)' \\ G' \end{pmatrix}.$$
Since the matrices $\bar{A}$ and $\bar{A}'$ are non-singular, we have $\mathrm{rank}(C_k \bar{A}) = \mathrm{rank}\, C_k = k$ and $\mathrm{rank}(\bar{A}' C_k') = k$, hence $\mathrm{rank}\, \Sigma_k = k$ according to Horn and Johnson (1988, p. 13). Moreover, since by the Cayley-Hamilton theorem, see Horn and Johnson (1988) e.g., one can express the powers $F^t$ for all $t \ge k$ as linear combinations of $I, F, F^2, \ldots, F^{k-1}$, one concludes that
$$\mathrm{rank}\, \Sigma_t = \mathrm{rank}\Big( \sum_{i=0}^{t-1} F^i G \Sigma_U G' F^{i\prime} \Big) = k \quad \text{for all } t \ge k.$$
Finally, $\Sigma := \sum_{i=0}^{\infty} F^i G \Sigma_U G' F^{i\prime}$, which is well defined under (EVF), has rank $k$, too. The distribution $\pi$ of
$$W_\infty = (I-F)^{-1} G \nu + \sum_{i=0}^{\infty} F^i G A V_{i+1}$$
is determined by means of the limit of the Fourier transform $\varphi_t$ of $\sum_{i=0}^{t-1} F^i G A V_{t-i}$ for $t \to \infty$: for $r \in \mathbb{R}^k$ one gets
$$\lim_{t\to\infty} \varphi_t(r) = \lim_{t\to\infty} \exp\Big( -\frac{1}{2}\, r' \sum_{i=0}^{t-1} F^i G \Sigma_U G' F^{i\prime} \, r \Big) = \exp\Big( -\frac{1}{2}\, r' \sum_{i=0}^{\infty} F^i G \Sigma_U G' F^{i\prime} \, r \Big) = \varphi_{N(0,\Sigma)}(r),$$
where $\varphi_{N(0,\Sigma)}$ denotes the Fourier transform of the corresponding $N(0, \Sigma)$ distribution.

Remarks. 1. If $G = I_k$ then (CMC) holds obviously and, if $U$ has a strictly positive $\lambda^k$-density, $(W_t, t \in \mathbb{N})$ is $\lambda^k$-irreducible and aperiodic.

2. If (CMC) is not satisfied, then the MC may be restricted to the range $W_R \subset W = \mathbb{R}^k$ of the controllability matrix:
$$W_R = \mathrm{range}[F^{k-1}G \,|\, \cdots \,|\, FG \,|\, G] = \Big\{ \sum_{i=0}^{k-1} F^i G \alpha_i \,\Big|\, \alpha_i \in \mathbb{R}^m \Big\},$$
which is also the range of $\sum_{i=0}^{\infty} F^i G G' F^{i\prime}$. If $w_0 \in W_R$ then $F w_0 + G u_1 \in W_R$ for any $u_1 \in \mathbb{R}^m$. This shows that $W_R$ is absorbing, hence the LARM is restricted to $W_R$. In turn, the controllability condition is satisfied on $W_R$. If the corresponding MC has an invariant probability measure $\pi$, then $\pi$ is concentrated on $W_R$, hence is singular with respect to $\lambda^k$.

5. CONCLUDING REMARKS

Within the framework of LARMs, numerous time series models of the autoregressive type can be dealt with, see, e.g., Meyn and Tweedie (1996), Feigin and Tweedie (1985) or Bougerol and Picard (1992). Ergodicity properties and the Markov property of the time series $(W_t, t \in \mathbb{Z}$ or $t \in \mathbb{N}$, respectively) make accessible the classical limit theorems of probability, see again Meyn and Tweedie (1996). In particular, for the central limit theorem, Jones (2004) presents an excellent collection of results. The unified treatment of different time series models within the framework of LARMs simultaneously yields many important results for those models (i.e., time series).

Acknowledgements.
The authors gratefully acknowledge support from the Deutsche Forschungsgemeinschaft under Grant 436/RUM113/21/4-1.

REFERENCES

[1] P. Bougerol and N. Picard, Strict stationarity of generalized autoregressive processes. Ann. Probab. 20 (1992), 1714-1730.
[2] A. Brandt, The stochastic equation $Y_{n+1} = A_n Y_n + B_n$ with stationary coefficients. Adv. in Appl. Probab. 18 (1986), 211-220.
[3] P.D. Feigin and R.L. Tweedie, Random coefficient autoregressive processes: a Markov chain analysis of stationarity and finiteness of moments. J. Time Ser. Anal. 6 (1985), 1-14.
[4] O. Hernández-Lerma and J.B. Lasserre, Markov Chains and Invariant Probabilities. Birkhäuser, Basel, 2003.
[5] R.A. Horn and C.R. Johnson, Matrix Analysis. Cambridge Univ. Press, Cambridge, 1988.
[6] G.L. Jones, On the Markov chain central limit theorem. Probab. Surveys 1 (2004), 299-320.
[7] M. Loève, Probability Theory. Van Nostrand, Princeton, 1963.
[8] H. Lütkepohl, Introduction to Multiple Time Series Analysis. Springer, Berlin, 1991.
[9] S.P. Meyn and R.L. Tweedie, Markov Chains and Stochastic Stability. Springer, London, 1996.

Received 21 June 2009
Revised 18 November 2009

Universität Duisburg-Essen, Campus Duisburg
Institut für Mathematik
D-47048 Duisburg, Deutschland
[email protected]

Romanian Academy
"Gheorghe Mihoc-Caius Iacob" Institute
of Mathematical Statistics and Applied Mathematics
Casa Academiei Române
Calea 13 Septembrie no. 13
050711 Bucharest, Romania
[email protected]

and

Universität der Bundeswehr München
Werner-Heisenberg-Weg 39
D-85577 Neubiberg, Deutschland
[email protected]