Sankhyā: The Indian Journal of Statistics
2010, Volume 72-A, Part 1, pp. 170–190
© 2010, Indian Statistical Institute

Limit Theorems for Monotone Markov Processes

Rabi Bhattacharya, The University of Arizona, Tucson, USA
Mukul Majumdar, Cornell University, Ithaca, USA
Nigar Hashimzade, University of Reading, Reading, UK

Abstract

This article considers the convergence to steady states of Markov processes generated by the action of successive i.i.d. monotone maps on a subset S of a Euclidean space. Without requiring irreducibility or Harris recurrence, a "splitting" condition guarantees the existence of a unique invariant probability as well as an exponential rate of convergence to it in an appropriate metric. For a special class of Harris recurrent processes on [0, ∞) of interest in economics, environmental studies and queuing theory, criteria are derived for polynomial and exponential rates of convergence to equilibrium in total variation distance. Central limit theorems follow as consequences.

AMS (2000) subject classification. Primary 60F05, 60J05.

Keywords and phrases. Markov processes, coupling, monotone i.i.d. maps, polynomial convergence rates.

1 Introduction

In this paper we present some limit theorems for Markov processes defined by

$$X_{n+1} = \alpha_{n+1}(X_n) \quad (n = 0, 1, \ldots), \tag{1.1}$$

where {α_n : n ≥ 1} is a sequence of i.i.d. random monotone maps on a suitable subset S of R^k. Such processes have been of particular interest in developing models of dynamic systems subject to exogenous random shocks in many disciplines. Often, with specific assumptions on S (for example, an interval in R, a closed subset of R^k, ...) and on the maps α_n (for example, continuity, concavity, ...), it has been possible to derive strong results on the asymptotic behavior of X_n and throw light on questions of long-standing importance, such as estimating the long run expected value of per capita output or capital stock, or the probability of extreme scarcity of a renewable resource. Since the literature exploring (1.1) is already vast and growing, before proceeding to the formal analysis we touch upon a few issues and applications to economic growth and resource management.

First, the most fundamental theme is to identify conditions that guarantee the existence of an invariant distribution (a stochastic steady state), its uniqueness, and its stability. When, irrespective of the initial state, the distribution of X_n converges to the invariant distribution π, it is of interest to estimate the speed of convergence. Insights into all these questions are gained when the process (1.1) satisfies a "splitting" condition.

In various contexts, the process (1.1) is interpreted as a purely descriptive model. As an instance, X_n is a non-negative k-vector of stocks of all the commodities in an economy and its evolution, when the law of motion is a random monotone function, is captured by (1.1); or X_n denotes the list of k interacting groups of a population or k interacting species. See Ellner (1984) or Bhattacharya and Majumdar (2007), pp. 262–274, and the references cited there. One may also start with a discounted dynamic programming model (Blackwell, 1965), impose appropriate restrictions on the state and action spaces, and obtain a stationary optimal policy function (as in Maitra, 1968). This policy function, together with the law of motion, gives rise to the process (1.1) portraying the evolution of optimal states.
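As a concrete illustration of (1.1) (ours, not from the paper), consider the stochastic growth specification X_{n+1} = A_{n+1} X_n^θ, where the shocks A_{n+1} > 0 are i.i.d. and 0 < θ < 1, so that each realized map x ↦ A x^θ is increasing on S = [0, ∞). A minimal simulation sketch, with hypothetical lognormal shocks:

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(x0, n, theta=0.5, n_paths=10_000):
    """Iterate X_{n+1} = A_{n+1} * X_n**theta, a process of the form (1.1):
    each realized map x -> A * x**theta is increasing on [0, infinity)."""
    x = np.full(n_paths, x0, dtype=float)
    for _ in range(n):
        A = rng.lognormal(mean=0.0, sigma=0.3, size=n_paths)  # i.i.d. shocks (hypothetical)
        x = A * x**theta
    return x

# The empirical distribution of X_n stabilizes irrespective of the initial
# state, illustrating the stochastic steady state discussed below.
for x0 in (0.01, 1.0, 100.0):
    xs = simulate(x0, n=50)
    print(f"x0={x0:>6}: mean={xs.mean():.3f}, median={np.median(xs):.3f}")
```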
In many optimization problems, monotonicity of an optimal policy function is obtained by assuming a supermodularity or concavity property of the reward function (see Ross, 1983, and Majumdar, Mitra and Nyarko, 1989). It should perhaps be noted that the continuity of a policy function may be more problematic, even in a deterministic model of intertemporal optimization: an example of discontinuity (in which the production function is S-shaped and the return function is linear) was given in Majumdar and Mitra (1983). Undoubtedly, the process (1.1) has become one of the most useful frameworks to explore some of the basic questions of the theory of intertemporal resource allocation in economics. We should emphasize that there are interesting Markov processes, arising, for example, in the study of i.i.d. iterates of quadratic maps (Bhattacharya and Rao, 1993, Bhattacharya and Majumdar, 2004), or in the study of interaction of growth and cyclical forces (Bhattacharya and Majumdar, 2007, pp. 267–273), that enter into an invariant subset of S on which all the maps turn out to be monotone. The results on processes (1.1) have been decisive in the analysis of the long run behavior of such dynamical systems.

Another class of models allows for regeneration or replenishment of a resource (such as groundwater). We sketch a simple example of the management of such a resource to provide a motivation for the analysis in Section 4. Let X_n ≥ 0 be the stock of a resource at the end of period n (assuming X_0 = x ≥ 0), and let {R_{n+1} : n ≥ 0} be a sequence of i.i.d. non-negative real-valued random variables representing random flows of input (for example, rainfalls). Let c > 0 be a given parameter: it is interpreted as a target level of consumption. If, at the end of period n + 1, the planner observes X_n + R_{n+1} > c, he withdraws from the reservoir the amount c, leaving X_{n+1} = X_n + (R_{n+1} − c) as the stock. If X_n + R_{n+1} ≤ c, then the entire available stock is withdrawn, leaving X_{n+1} = 0. Thus, the evolution of the stock is described by the process

$$X_{n+1} = (X_n + Z_{n+1})^+ \equiv \max(X_n + Z_{n+1},\, 0) \quad \text{for } n \ge 0,$$

where Z_{n+1} ≡ R_{n+1} − c. The policy of "constant harvesting" or "constant yield" is easy to describe and implement. The process arose in queuing theory and has been extensively studied subsequently. It is a monotone Markov process with S = R_+. Assume that EZ_1 < 0 (and, to avoid trivialities, P(Z_1 > 0) > 0). Then there is a unique invariant distribution π, and one is interested in the nature of convergence to π, as well as in estimating the probability π({0}).
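A back-of-the-envelope Monte Carlo sketch of this constant-harvest process (our illustration; the exponential rainfall distribution and the values of c are hypothetical choices) estimates π({0}), the long-run probability of an empty reservoir. For exponential rainfall this is, in fact, the shifted exponential case treated in Example 4.2 of Section 4:

```python
import numpy as np

rng = np.random.default_rng(1)

def empty_fraction(n, c, mean_rain=1.0, x0=0.0):
    """Long-run fraction of periods with X_n = 0 under
    X_{n+1} = max(X_n + R_{n+1} - c, 0), with R i.i.d. exponential."""
    x, empty = x0, 0
    for z in rng.exponential(mean_rain, size=n) - c:  # Z_{n+1} = R_{n+1} - c
        x = max(x + z, 0.0)
        empty += (x == 0.0)
    return empty / n

# E(Z_1) = mean_rain - c < 0 and P(Z_1 > 0) > 0, so a unique invariant
# distribution pi exists; the occupation frequency of 0 estimates pi({0}).
for c in (1.2, 1.5, 2.0):
    print(f"c = {c}: estimated pi({{0}}) ~= {empty_fraction(500_000, c):.3f}")
```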
Here is a brief summary of the main results of the article. In Section 2, we introduce Markov processes generated by i.i.d. monotone maps on a subset S of R^k, which may include both increasing and decreasing maps. Assuming a "splitting condition", Theorem 2.1 generalizes earlier results of Dubins and Freedman (1966), Bhattacharya and Lee (1988) and Bhattacharya and Majumdar (1999, 2007) on the existence of a unique invariant probability π and the exponential convergence to π in an appropriate metric. In Section 3, Theorem 3.1, we elaborate on an earlier central limit theorem due to Bhattacharya and Lee (1988) for increasing processes on S. Section 4 concerns monotone increasing Markov processes on S = [0, ∞) and especially the class of Lindley processes. For these Harris recurrent processes, coupling arguments, large deviation type estimates and non-uniform error bounds in the classical central limit theorem provide known criteria, due to Lund and Tweedie (1996), for exponential rates of convergence to the invariant probability in total variation distance (Theorems 4.1 and 4.2), and new criteria for polynomial rates of convergence (Theorems 4.3 and 4.4) and central limit theorems (Theorem 4.5).

2 Markov processes generated by iterations of i.i.d. monotone maps

Let (S, 𝒮) be a measurable space, and {α_n : n ≥ 1} a sequence of i.i.d. random maps on S (into S), defined on a probability space (Ω, ℱ, P). Then, for any S-valued random variable X_0 independent of {α_n : n ≥ 1}, the process defined by

$$X_{n+1} = \alpha_{n+1} X_n \quad (n = 0, 1, \ldots) \tag{2.1}$$

is a Markov process on S with transition probability p(x, dy) given by p(x, B) = P(α_1 x ∈ B) (x ∈ S, B ∈ 𝒮), assuming x → p(x, B) is measurable for every B ∈ 𝒮. Conversely, if S is a standard Borel space (i.e., a Borel subset of a Polish space) and ℬ(S) is the Borel sigmafield of S, then given any transition probability p(x, dy) on (S, 𝒮 = ℬ(S)), one can construct a sequence of i.i.d. maps {α_n : n ≥ 1} on a probability space (Ω, ℱ, P) such that the process (2.1) is Markov with transition probability p(x, dy). (See Blumenthal and Corson, 1972, Kifer, 1986, p. 8, or Bhattacharya and Waymire, 2009, p. 228.) It will be useful to write X_n(x) for the process (2.1) when X_0 ≡ x.

In this article we consider only those Markov processes on a measurable subset S of R^k that are generated by i.i.d. monotone maps. For x = (x^1, ..., x^k), y = (y^1, ..., y^k) ∈ R^k, write x ≤ y, or y ≥ x, if x^i ≤ y^i for all i, and x < y if x ≤ y with a strict inequality x^i < y^i for some i. A measurable function f on S into S, or into R, is increasing (decreasing) if f(x) ≤ f(y) (respectively, f(y) ≤ f(x)) for all x ≤ y. The map f is monotone if it is either increasing or decreasing. To construct such a Markov process with α_n taking values in a set Γ of measurable monotone functions γ = (γ^1, ..., γ^k) on S, consider an appropriate sigmafield 𝒞 on Γ and a probability measure Q on (Γ, 𝒞). Assume that (γ, x) → γx = (γ^1(x), γ^2(x), ..., γ^k(x)) is measurable with respect to the product sigmafield 𝒞 ⊗ ℬ(S) on Γ × S and the Borel sigmafield ℬ(S) on S. On a probability space (Ω, ℱ, P), let {α_n : n ≥ 1} be an i.i.d. sequence of maps with common distribution Q. Then define the Markov process {X_n : n = 0, 1, ...} as described earlier (see (2.1), and note that p(x, B) = Q({γ ∈ Γ : γ(x) ∈ B})).

We will make the following assumptions.

(A1): S is either a closed subset of R^k, or a Borel subset of R^k which can be made homeomorphic to a closed subset of R^k by means of a strictly increasing continuous map f on S into R^k: x < y ⟹ f(x) < f(y).

Note that every rectangle S = I_1 × ⋯ × I_k satisfies (A1) if the I_j are arbitrary subintervals of R. For, (a, b), [a, b), (a, b] are made homeomorphic to (−∞, ∞), [0, ∞), (−∞, 0], respectively, by means of strictly increasing continuous maps.

Next assume the following splitting condition. For its statement, write γ_1^N for the composition γ_1^N = γ_N ∘ ⋯ ∘ γ_1 for all γ = (γ_1, ..., γ_N) ∈ Γ^N.

(A2): There exist x_0 ∈ S, an integer N ≥ 1, and sets F_i belonging to the product sigmafield 𝒞^⊗N on Γ^N (i = 1, 2), such that

(i) δ_i ≡ Q^N(F_i) > 0 for i = 1, 2;

(ii) γ_1^N x ≤ x_0 for all x ∈ S if γ ≡ (γ_1, ..., γ_N) ∈ F_1, and γ_1^N x ≥ x_0 for all x ∈ S if γ ∈ F_2.
Finally, we will make the following rather innocuous assumption.

(A3): Consider the sets H_+ = {γ ∈ Γ^N : γ_1^N is increasing}, H_− = Γ^N \ H_+, and F_i^+ = H_+ ∩ F_i, F_i^− = H_− ∩ F_i (i = 1, 2). Then F_i^+ (and F_i^−) ∈ 𝒞^⊗N.

On the space 𝒫(S) of all probability measures on (S, ℬ(S)) define, for each a > 0, the metric

$$d_a(\mu, \nu) = \sup_{g \in \mathcal{G}_a} \left| \int g\, d\mu - \int g\, d\nu \right|,$$

where 𝒢_a is the class of all Borel measurable monotone functions g on S into [0, a]. It is easy to verify that (1) d_a = a d_1, and (2) the metric d_a remains the same if 𝒢_a is restricted to Borel measurable increasing functions on S into [0, a]. We will make use of the following result of Chakraborty and Rao (1998). The second part of the lemma is proved in Bhattacharya and Majumdar (2007), pp. 287–288.

Lemma 2.1. Under the assumption (A1), (i) (𝒫(S), d_a) is a complete metric space, and (ii) convergence in d_a implies weak convergence.

We now state the main result of this section, which improves upon earlier results of Bhattacharya and Lee (1988) and Bhattacharya and Majumdar (1999) (see also Bhattacharya and Majumdar, 2007, pp. 259, 260, 288–291). The latter in turn were generalizations of the seminal result of Dubins and Freedman (1966). Also see Yahav (1975). Let p^{(n)}(x, dy) denote the n-step transition probability of the Markov process (2.1), i.e., the distribution of X_n when X_0 ≡ x. Let T^{*n} be the operator on 𝒫(S) defined by T^{*n}μ = ∫ p^{(n)}(x, ·) μ(dx). That is, T^{*n}μ is the distribution of X_n when X_0 has distribution μ.

Theorem 2.1. Assume (A1)–(A3) hold. Then there exists a unique invariant probability π for the Markov process (2.1), and, for every μ ∈ 𝒫(S),

$$d_1(T^{*n}\mu, \pi) \le (1 - \delta)^{[n/N]} \quad (n \ge 1), \tag{2.2}$$

where δ = min{δ_1, δ_2}, and [n/N] is the integer part of n/N.

Proof. We provide a sketch of the proof, whose details appear in Bhattacharya and Majumdar (2010). For an arbitrary increasing g ∈ 𝒢_1, define

$$h_{i+}(x) = \int_{F_i^+ \setminus (F_i^+ \cap F_j)} g(\gamma_1^N x)\, Q^N(d\gamma), \qquad h_{i-}(x) = \int_{F_i^- \setminus (F_i^- \cap F_j)} \big(1 - g(\gamma_1^N x)\big)\, Q^N(d\gamma) \qquad (i, j = 1, 2;\; i \ne j),$$

$$h_{3+}(x) = \int_{H_+ \cap (F_1 \cup F_2)^c} g(\gamma_1^N x)\, Q^N(d\gamma), \qquad h_{3-}(x) = \int_{H_- \cap (F_1 \cup F_2)^c} \big(1 - g(\gamma_1^N x)\big)\, Q^N(d\gamma),$$

$$h_4(x) = \int_{F_1 \cap F_2} g(\gamma_1^N x)\, Q^N(d\gamma) = g(x_0)\, Q^N(F_1 \cap F_2).$$

The functions h_{i±} are increasing (i = 1, 2, 3). For μ, ν ∈ 𝒫(S), write

$$a_{i+} = \left| \int h_{i+}(x)\, \mu(dx) - \int h_{i+}(x)\, \nu(dx) \right|, \qquad a_{i-} = \left| \int h_{i-}(x)\, \mu(dx) - \int h_{i-}(x)\, \nu(dx) \right| \qquad (i = 1, 2, 3).$$

Then

$$\left| \int g\, dT^{*N}\mu - \int g\, dT^{*N}\nu \right| \le \sum_{i=1}^{3} (a_{i+} + a_{i-}) \le (1 - \delta)\, d_1(\mu, \nu).$$

Hence T^{*N} is a uniformly strict contraction on the complete metric space (𝒫(S), d_1), and the contraction mapping theorem (see, e.g., Bhattacharya and Majumdar, 2007, pp. 6, 288–290) gives

$$d_1(T^{*n}\mu, T^{*n}\nu) \le (1 - \delta)^{[n/N]}\, d_1(\mu, \nu), \qquad \mu, \nu \in \mathcal{P}(S).$$

In particular, (2.2) holds, since d_1(μ, ν) ≤ 1. □

Remark 2.1. The main significance of Theorem 2.1 lies in the fact that it applies to a class of Markov processes which may not be Harris recurrent, or irreducible (at least with respect to any discernible measure). One may also derive central limit theorems for certain classes of functions of the Markov process, which are useful, e.g., in estimating the invariant probability π. (See Bhattacharya and Lee, 1988, and Bhattacharya and Majumdar, 2007, Chapter 5.)
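To make the splitting condition concrete, here is a toy example of ours (not from the paper): on S = [0, 1], let α_n be γ_1(x) = x/2 or γ_2(x) = x/2 + 1/2 with probability 1/2 each. With N = 1 and x_0 = 1/2, (A2) holds with F_1 = {γ_1}, F_2 = {γ_2}, δ_1 = δ_2 = 1/2, and both maps are increasing, so (A3) is trivial. The sketch below computes the exact laws of X_n(0) and X_n(1) and checks their Kolmogorov distance (which is dominated by d_1, since indicators of half-lines are monotone) against the bound (1 − δ)^n of Theorem 2.1:

```python
import numpy as np
from itertools import product

def exact_laws(n):
    """Exact laws of X_n(0) and X_n(1) over all 2**n equally likely
    sequences of the maps x -> x/2 and x -> x/2 + 1/2."""
    v0, v1 = [], []
    for bits in product((0.0, 0.5), repeat=n):
        x0, x1 = 0.0, 1.0
        for b in bits:
            x0, x1 = x0 / 2 + b, x1 / 2 + b
        v0.append(x0); v1.append(x1)
    return np.sort(v0), np.sort(v1)

for n in range(1, 9):
    v0, v1 = exact_laws(n)
    grid = np.unique(np.concatenate([v0, v1]))
    # Kolmogorov distance between the two laws; indicators of half-lines
    # are monotone test functions in G_1, so d_K <= d_1.
    dK = np.max(np.abs(np.searchsorted(v0, grid, side="right") / len(v0)
                       - np.searchsorted(v1, grid, side="right") / len(v1)))
    print(f"n={n}: d_K = {dK:.6f}, bound (1 - delta)^n = {0.5**n:.6f}")
```

Here the invariant probability π is the uniform distribution on [0, 1], and the computed distance coincides with the geometric bound (1/2)^n.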
Remark 2.2. By looking at the proof of Theorem 2.1, one may notice that the Euclidean structure of S is not used very much. On a topological (metric) space S with a partial order, the main ingredients of the proof are assumption (A2) and Lemma 2.1. See Hopenhayn and Prescott (1992) and Diaconis and Freedman (1999) for applications to economics and physics, respectively, on such S.

Remark 2.3. On S = [0, 1] the splitting condition (A2) is necessary for the existence of a nondegenerate unique invariant probability, if the α_n are increasing and continuous. See, e.g., Bhattacharya and Majumdar (2007), p. 280. The processes on S = [0, ∞) considered in Section 4, however, do not in general satisfy (A2), but have unique invariant probabilities.

3 Central Limit Theorem for monotone Markov processes

In this section we review and explore central limit theorems for Markov processes (2.1). In particular, we give a proof of Theorem 3.1 below (due to Bhattacharya and Lee, 1988), clarifying parts of the original proof. Let T^m denote the m-step transition operator on L²(π):

$$T^m f(x) = \int f(y)\, p^{(m)}(x, dy), \qquad f \in L^2(\pi).$$

We will write T for T^1 and simply call it the transition operator. For Theorem 3.1 and Lemma 3.1 we restrict attention to i.i.d. increasing maps {α_n : n ≥ 1} with values in a function space Γ of (measurable) increasing maps on S into S. First, we refer to the following result in Bhattacharya and Lee (1988).

Lemma 3.1. Let {α_n : n ≥ 1} be i.i.d. increasing maps generating the Markov process (2.1). Assume (A1), (A2). Then for every function f which may be expressed as f = f_1 − f_2, with f_i increasing and f_i ∈ L²(π) (i = 1, 2), f − f̄ belongs to the range of T − I on L²(π), where f̄ = ∫ f dπ and I is the identity operator.

Proof. The proof of the lemma is based on the following estimate (see Bhattacharya and Lee, 1988, relations (3.7), (3.8)):

$$\left\| T^N f_i - \bar{f}_i \right\|_2 \le c\, \left\| f_i - \bar{f}_i \right\|_2, \qquad c = \left( 1 - \frac{1 - (1 - \delta)}{2} \right)^{1/2}, \tag{3.1}$$

which holds for i = 1, 2. Here ‖·‖₂ is the usual norm in L²(π). Since T^N f_i is increasing, as f_i is, iterating (3.1) one obtains

$$\left\| T^n f_i - \bar{f}_i \right\|_2 \le c^{[n/N]} \left\| f_i - \bar{f}_i \right\|_2 \qquad \forall n \ge 1 \ (i = 1, 2).$$

From this one obtains

$$\left\| T^n f - \bar{f} \right\|_2 \le 2 c^{[n/N]} \left\| f - \bar{f} \right\|_2, \qquad \sum_{n=0}^{\infty} \left\| T^n f - \bar{f} \right\|_2 < \infty.$$

Now write

$$g = -\sum_{n=0}^{\infty} T^n (f - \bar{f}).$$

Then (T − I)g = f − f̄, completing the proof of the lemma. □

Theorem 3.1 (Bhattacharya and Lee, 1988). Suppose {α_n : n ≥ 1} is an i.i.d. increasing sequence and the assumptions (A1), (A2) of Section 2 hold. Then the CLT (3.2) holds for all f = f_1 − f_2, f_i increasing, f_i ∈ L²(π).

Proof. From Lemma 3.1, the martingale CLT yields the following result, as originally proved by Gordin and Lifsic (1978) (also see Bhattacharya and Waymire, 2009, pp. 511–513):

$$\frac{1}{\sqrt{n}} \sum_{j=1}^{n} \left( f(X_j) - \bar{f} \right) \xrightarrow{\mathcal{L}} N(0, \sigma^2) \quad \text{as } n \to \infty, \tag{3.2}$$

if X_0 has the invariant distribution π. Here →ᴸ denotes convergence in law, and N(0, σ²) is the Normal distribution with mean zero and variance

$$\sigma^2 = \int g^2\, d\pi - \int (Tg)^2\, d\pi. \tag{3.3}$$

Our main task is to show that (3.2) holds whatever the (initial) distribution μ of X_0. For this, first let f be increasing (on S into R), f ∈ L²(π), and denote by {X'_n : n ≥ 0}, {X_n : n ≥ 0} the processes (2.1) with X'_0 having distribution μ and X_0 having distribution π, respectively. Write

$$S'_{m,q} = n^{-1/2} \sum_{j=m}^{q} \left( f(X'_j) - \bar{f} \right), \qquad S_{m,q} = n^{-1/2} \sum_{j=m}^{q} \left( f(X_j) - \bar{f} \right), \qquad 0 \le m \le q \le n. \tag{3.4}$$

Then S'_{0,n} = S'_{0,n_0−1} + S'_{n_0,n}, and, for every n_0, S'_{0,n_0−1} → 0 a.s. as n → ∞. The same holds for S_{m,q}.
Now, for any given r ∈ R,

$$P\big(S'_{n_0,n} > r\big) = E\, h_{n-n_0}(X'_{n_0}) = \int h_{n-n_0}(y)\, (T^{*n_0}\mu)(dy), \tag{3.5}$$

where

$$h_j(y) = P\big(S_{0,j}(y) > r\big), \tag{3.6}$$

defining {X_n(y) : n ≥ 0} by (2.1) with X_0 ≡ y, and

$$S_{m,q}(y) = n^{-1/2} \sum_{j=m}^{q} \big( f(X_j(y)) - \bar{f} \big), \qquad 0 \le m \le q \le n. \tag{3.7}$$

Then h_j is increasing, h_j ∈ 𝒢_1, so that, by Theorem 2.1,

$$\sup_{n > n_0} \left| \int h_{n-n_0}(y)\, (T^{*n_0}\mu)(dy) - \int h_{n-n_0}(y)\, \pi(dy) \right| \to 0 \quad \text{as } n_0 \to \infty. \tag{3.8}$$

Fix ε > 0. In view of (3.5)–(3.8), there exists n_0(ε) such that the left side of (3.8) is less than ε/4 if n_0 = n_0(ε), and hence, for all r,

$$\left| P\big(S'_{n_0,n} > r\big) - P\big(S_{n_0,n} > r\big) \right| < \frac{\varepsilon}{4} \qquad \forall n > n_0 = n_0(\varepsilon).$$

In view of (3.2) (i.e., the CLT for S_{0,n}), there exists n(ε) such that

$$\left| P\big(S_{n_0,n} > r\big) - (1 - \Phi_{\sigma^2}(r)) \right| < \frac{\varepsilon}{4} \qquad \forall n \ge n(\varepsilon) > n_0 = n_0(\varepsilon),\ \forall r,$$

where Φ_{σ²} is the cumulative distribution function of N(0, σ²). Hence, for all r,

$$\left| P\big(S'_{n_0,n} > r\big) - (1 - \Phi_{\sigma^2}(r)) \right| < \frac{2\varepsilon}{4} \qquad \forall n \ge n(\varepsilon) > n_0 = n_0(\varepsilon). \tag{3.9}$$

Finally, for each δ > 0, one has

$$P\big(S'_{n_0,n} > r + \delta\big) - P\big(|S'_{0,n_0-1}| \ge \delta\big) \le P\big(S'_{0,n} > r\big) \le P\big(S'_{n_0,n} > r - \delta\big) + P\big(|S'_{0,n_0-1}| \ge \delta\big).$$

Choose δ = δ(ε) such that |Φ_{σ²}(r ± δ) − Φ_{σ²}(r)| < ε/4 for all r, and choose n_1(ε) ≥ n(ε) such that P(|S'_{0,n_0−1}| ≥ δ(ε)) < ε/4 for all n ≥ n_1(ε). Since the estimate (3.8) is uniform in r (as the function (3.6) belongs to 𝒢_1, whatever be r), so is (3.9), and one gets

$$\left| P\big(S'_{0,n} > r\big) - (1 - \Phi_{\sigma^2}(r)) \right| < \frac{2\varepsilon}{4} + \frac{\varepsilon}{4} + \frac{\varepsilon}{4} = \varepsilon \qquad \forall n \ge n_1(\varepsilon).$$

This concludes the proof of Theorem 3.1 for f increasing, f ∈ L²(π).

For f = f_1 − f_2, with f_1, f_2 increasing, consider the (joint) distribution of (S'^{(1)}_{0,n}, S'^{(2)}_{0,n}) and of (S^{(1)}_{0,n}, S^{(2)}_{0,n}), where the superscript (i) indicates that f_i is used in place of f in (3.4) and subsequently. For arbitrary r_1, r_2 ∈ R, one now compares P(S'^{(1)}_{0,n} > r_1, S'^{(2)}_{0,n} > r_2) and P(S^{(1)}_{0,n} > r_1, S^{(2)}_{0,n} > r_2) with the help of the increasing function

$$y \longrightarrow h_{j,1,2}(y) = P\big(S^{(1)}_{0,j}(y) > r_1,\, S^{(2)}_{0,j}(y) > r_2\big),$$

with S^{(i)}_{0,j}(y) as in (3.7) with f replaced by f_i. The asymptotic distribution of (S^{(1)}_{0,n}, S^{(2)}_{0,n}) (under the initial distribution π) is shown to be a two-dimensional Normal, again using the martingale CLT for Markov processes applied to linear combinations of S^{(1)}_{0,n}, S^{(2)}_{0,n}. The two-dimensional CLT for (S'^{(1)}_{0,n}, S'^{(2)}_{0,n}) (under an arbitrary initial distribution μ) now follows as in the one-dimensional case. From this follows immediately the CLT for S'^{(1)}_{0,n} − S'^{(2)}_{0,n}. □

Remark 3.1. We conjecture that the conclusion of Theorem 3.1 holds under the hypothesis of Theorem 2.1.

Remark 3.2. For Harris recurrent processes, such as those considered in Section 4, it is generally true that the CLT (3.2) holds whatever the initial distribution (see Bhattacharya, 1982, Theorem 2.6, for a precise statement). However, for processes which are not Harris recurrent or irreducible, such convergence for arbitrary initial distributions may not be true. For examples of many processes for which Theorem 3.1 holds, but which are not Harris recurrent, see Bhattacharya and Rao (1993), Bhattacharya and Majumdar (1999, 2001, 2004), and Chapters 3, 4 of Bhattacharya and Majumdar (2007).

Remark 3.3. Theorem 3.1 may be strengthened to its functional form, providing convergence in distribution of the scaled partial sums process in (3.2) to the Wiener measure. See Bhattacharya and Lee (1988).
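As an illustration of Theorem 3.1 (ours, not from the paper), the CLT (3.2) can be checked by simulation for the two-map example above with the increasing function f(x) = x. A sketch under these assumptions; a direct computation for this chain gives σ² = 1/4, since π is uniform on [0, 1] and Cov(X_0, X_j) = 2^{−j}/12 under π:

```python
import numpy as np

rng = np.random.default_rng(2)

def normalized_sums(n=2_000, reps=5_000):
    """n^{-1/2} * sum_{j=1}^n (f(X_j) - 1/2) for f(x) = x, under the chain
    X_{j+1} = X_j/2 + B_{j+1}/2, B i.i.d. Bernoulli(1/2); X_0 ~ pi = Uniform[0,1]."""
    x = rng.random(reps)
    s = np.zeros(reps)
    for _ in range(n):
        x = x / 2 + rng.integers(0, 2, size=reps) / 2
        s += x - 0.5
    return s / np.sqrt(n)

s = normalized_sums()
print(f"sample mean (should be ~ 0): {s.mean():.4f}")
print(f"sample sd (sigma = 1/2):     {s.std():.4f}")
print(f"97.5% quantile / sd (~1.96): {np.quantile(s, 0.975) / s.std():.3f}")
```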
4 Monotone Markov processes with nonnegativity constraints

As explained in the Introduction, nonnegativity and monotonicity constraints arise naturally in economics, queueing theory, and environmental studies. In the present section we study certain Markov processes on S = [0, ∞) which are monotone (increasing) in the sense that the transition probability p(x, dy) is stochastically ordered: a transition probability p(x, dy) on an interval J is stochastically larger than p(x', dy) if x' < x; in other words,

$$F_x(y) \le F_{x'}(y) \quad \forall y \in J, \text{ if } x' < x, \tag{4.1}$$

where F_x is the distribution function of p(x, ·). The following lemma shows that such Markov processes may be generated by iterations of i.i.d. increasing maps.

Lemma 4.1. A monotone Markov process on an interval J, satisfying (4.1), may be represented as (2.1), with {α_n : n ≥ 1} i.i.d. increasing maps on J.

Proof. By a strictly increasing continuous map one may map J onto the unit interval I = (0, 1), or (0, 1], or [0, 1), or [0, 1], depending on J. So we may take the state space to be I. Let U be a random variable with the uniform distribution on I. Then define the random map α on I by αx = F_x^{-1}(U), where

$$F_x^{-1}(u) = \inf\{y \in \mathbb{R} : F_x(y) > u\}, \quad u \in I$$

(see Bhattacharya and Waymire, 2009, p. 228). Using (4.1), it is simple to check that α is increasing. □

For a more general treatment see Lindvall (1992), pp. 132–136. For the rest of this section, we take S = [0, ∞), and assume that {0} is a recurrent set:

$$P(X_n(x) = 0 \text{ for some } n \ge 1) = 1 \quad \forall x \in [0, \infty), \text{ or } P(\tau_0 < \infty) = 1, \tag{4.2}$$

where τ_0 = inf{n ≥ 1 : X_n = 0} is the first time (> 0) to reach 0. Also, assume

$$p^{(n)}(0, [x, \infty)) > 0 \quad \text{for some } n = n(x),\ \forall x > 0. \tag{4.3}$$

The process is then Harris recurrent, or φ-recurrent, with respect to the Dirac measure φ at 0: φ({0}) = 1 (see, e.g., Meyn and Tweedie, 1993, p. 200). For two processes X_n(x), X_n(x'), n ≥ 0, given by (2.1), with X_0 = x and X_0 = x', respectively, a (strong) coupling occurs, with probability one, at time τ_0(x) if x' ≤ x, and at time τ_0(x') if x < x', where

$$\tau_0(z) = \inf\{n \ge 1 : X_n(z) = 0\}, \quad z \in [0, \infty). \tag{4.4}$$

That is, X_n(x) = X_n(x') for all n ≥ τ_0(x) if x' ≤ x, or for all n ≥ τ_0(x') if x < x'. By a conditioning argument, it follows that such a coupling occurs for two processes (2.1) with initial random variables X_0, X'_0 which are independent, and independent of {α_n : n ≥ 1}, at the time

$$\tau = \begin{cases} \tau_0(X_0) & \text{on } \{X'_0 \le X_0\}, \\ \tau_0(X'_0) & \text{on } \{X_0 < X'_0\}. \end{cases} \tag{4.5}$$

From standard theory, the Markov process has an invariant probability, say π, necessarily unique, if and only if

$$E\tau_0(0) < \infty. \tag{4.6}$$

Also, letting X_0 have distribution μ and X'_0 have distribution π, one then has

$$d_{tv}(T^{*n}\mu, \pi) \le P(\tau > n) \to 0 \quad \text{as } n \to \infty,\ \forall \mu \in \mathcal{P}(S), \tag{4.7}$$

where d_tv is the total variation distance: d_tv(μ, ν) = sup{|μ(B) − ν(B)| : B ∈ ℬ([0, ∞))}.

A useful criterion for an exponential rate of convergence in (4.7), due to Lund and Tweedie (1996), is the following. Define

$$V_x(c) = E \exp\{c\, \tau_0(x)\}, \qquad c^* = \sup\{c : V_x(c) < \infty\ \forall x\}.$$

Theorem 4.1 (Lund and Tweedie, 1996). Suppose c* > 0. Then for all x ∈ [0, ∞), and for every 0 < c < c*,

$$d_{tv}\big(p^{(n)}(x, \cdot), \pi\big) = o(\exp\{-cn\}) \quad \text{as } n \to \infty.$$

For an application to a process of special interest to us, sometimes referred to as the Lindley process, let {Z_n : n ≥ 1} be an i.i.d. sequence, and let

$$X_{n+1} = \max\{0, X_n + Z_{n+1}\} \quad (n \ge 0), \tag{4.8}$$

where X_0 is an arbitrary nonnegative random variable independent of the sequence {Z_n : n ≥ 1}. Many authors have looked at this process (see, e.g., Lindley, 1952, Spitzer, 1956, Feller, 1971, pp. 194–200, Lund and Tweedie, 1996, Bhattacharya and Majumdar, 2007, pp. 336–338).
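The coupling bound (4.7) is easy to visualize by simulation (our sketch, with normal increments as in Example 4.3 below): two copies of (4.8) started at different states and driven by common shocks merge at the first time the upper copy visits 0, so the empirical tail of τ dominates the total variation distance between the two n-step distributions:

```python
import numpy as np

rng = np.random.default_rng(3)
mu, sigma = -0.5, 1.0   # EZ_1 < 0 and P(Z_1 > 0) > 0, as required

def coupling_time(x_lo, x_hi, max_n=100_000):
    """First n at which the copy started at x_hi hits 0; by monotonicity the
    copy started at x_lo <= x_hi has hit 0 too, and the paths agree thereafter."""
    lo, hi = x_lo, x_hi
    for n in range(1, max_n + 1):
        z = rng.normal(mu, sigma)          # common shock for both copies
        lo, hi = max(lo + z, 0.0), max(hi + z, 0.0)
        if hi == 0.0:
            assert lo == 0.0
            return n
    return max_n

taus = np.array([coupling_time(0.0, 5.0) for _ in range(20_000)])
for n in (10, 25, 50, 100):
    # By the coupling inequality, P(tau > n) dominates d_tv(p^(n)(5,.), p^(n)(0,.)).
    print(f"P(tau > {n:3d}) ~= {(taus > n).mean():.4f}")
```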
From Spitzer (1956), it follows that this process is ergodic (i.e., it has an invariant probability π, necessarily unique) if and only if

$$\sum_{n=1}^{\infty} n^{-1} P(S_n > 0) < \infty, \qquad \text{where } S_n := Z_1 + \cdots + Z_n.$$

We will assume the stronger condition (see, e.g., Bhattacharya and Majumdar, 2007, pp. 237, 238): EZ_1 < 0, and, to avoid trivialities, it is also assumed that P(Z_1 > 0) > 0. Note that the α_n (n ≥ 1) defining the process (4.8) are given by α_n x = max{0, x + Z_n}, x ∈ [0, ∞) (n ≥ 1). To estimate V_x(c) in this case, observe that

$$P(\tau_0(x) > n) \le P(x + S_n > 0) = P\big(e^{dS_n} > e^{-dx}\big) \le e^{dx} M^n(d), \qquad M(d) := E e^{dZ_1}, \tag{4.9}$$

assuming that the moment generating function M(d) is finite for some d > 0. Since M(0) = 1 and M'(0) = EZ_1 < 0, the quantity M* = inf{M(d) : d > 0} is less than 1. Let d* be the point where this minimum is attained. Then let d = d* in (4.9). Following Lund and Tweedie (1996), use the estimate

$$V_x(c) = \sum_{n=1}^{\infty} e^{cn} P(\tau_0(x) = n) \le \sum_{n=0}^{\infty} e^{c(n+1)} P(\tau_0(x) > n) \le \frac{e^{c + d^* x}}{1 - e^c M^*} < \infty$$

for all c such that e^c M* < 1. Now let c_* = ln(1/M*). Then c* ≥ c_*, and one arrives at a slight extension of a result of Lund and Tweedie (1996).

Theorem 4.2. Under the above assumptions on Z_n, one has

$$d_{tv}\big(p^{(n)}(x, dy), \pi\big) = o(\exp\{-cn\}) \qquad \forall c < \ln\frac{1}{M^*}.$$

Example 4.1. (Two-sided exponential distribution). Let Z_1 have the density

$$g(x) = \begin{cases} \dfrac{ab}{a+b}\, e^{-bx}, & x > 0, \\[4pt] \dfrac{ab}{a+b}\, e^{ax}, & x \le 0. \end{cases}$$

Assume 0 < a < b. Then the invariant distribution π has an atom at 0, and a density π(x) on (0, ∞), given by (see, e.g., Feller, 1971, Example VI.8.b)

$$\pi(\{0\}) = 1 - \frac{a}{b}, \qquad \pi(x) = \frac{b-a}{b}\, a\, e^{-(b-a)x}, \quad x > 0.$$

In this example,

$$M(d) = E e^{dZ_1} = \frac{a}{a+b}\,\frac{1}{1 - \frac{d}{b}} + \frac{b}{a+b}\,\frac{1}{1 + \frac{d}{a}} \quad (d < b), \qquad d^* = \frac{b-a}{2}, \qquad M^* = \frac{4ab}{(a+b)^2}.$$

Example 4.2. (Shifted exponential distribution). Here Z_1 has the density, for some c > 0, θ > 0, θ < c,

$$g(x) = \begin{cases} \dfrac{1}{\theta}\, e^{-\frac{1}{\theta}(x+c)} & \text{for } x \ge -c, \\[4pt] 0 & \text{for } x < -c, \end{cases}$$

so that

$$M(d) = \frac{e^{-cd}}{1 - \theta d}, \quad d < \frac{1}{\theta}, \qquad d^* = \frac{c - \theta}{c\theta}, \qquad M^* = M\Big(\frac{c - \theta}{c\theta}\Big) = \frac{c}{\theta}\, e^{-\frac{c}{\theta} + 1}.$$

One can check that the invariant distribution π has a density π(x) on (0, ∞) given by π(x) = β^{-1} exp{−(x + c)/β}, x > 0, where β > θ solves

$$1 - \frac{\theta}{\beta} = \exp\Big\{-\frac{c}{\beta}\Big\}.$$

The point mass at 0 is, of course, given by π({0}) = 1 − ∫₀^∞ π(x) dx.

The following example was considered by Iams and Majumdar (2010).

Example 4.3. (Normal with negative mean). Let Z_1 have the Normal distribution N(µ, σ²) with mean µ < 0 and variance σ² > 0. Then

$$M(d) = e^{d\mu + \sigma^2 d^2/2}, \qquad d^* = -\frac{\mu}{\sigma^2}, \qquad M^* = e^{-\frac{\mu^2}{2\sigma^2}}.$$

Remark 4.1. In general, one has π({0}) = (E_0 τ_0)^{-1} > 0 for the process in (4.8) if E_0 τ_0 < ∞. In the case Z_1 has an absolutely continuous distribution (with respect to Lebesgue measure on [0, ∞)), the invariant probability π has a density on (0, ∞), in addition to the point mass at 0.
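The quantities d*, M* and the resulting rate bound ln(1/M*) of Theorem 4.2 are easy to evaluate numerically. A sketch (ours) cross-checking the closed forms of Examples 4.1 and 4.3 against a direct grid minimization of M(d):

```python
import numpy as np

def mgf_two_sided(d, a, b):
    """M(d) of Example 4.1 (two-sided exponential), for -a < d < b."""
    return (a / (a + b)) / (1 - d / b) + (b / (a + b)) / (1 + d / a)

def mgf_normal(d, mu, sigma):
    """M(d) of Example 4.3, with Z_1 ~ N(mu, sigma^2)."""
    return np.exp(d * mu + 0.5 * sigma**2 * d**2)

def minimize_on_grid(M, d_hi, n=200_000):
    d = np.linspace(1e-9, d_hi * (1 - 1e-9), n)
    m = M(d)
    i = m.argmin()
    return d[i], m[i]

a, b = 1.0, 3.0
d_star, M_star = minimize_on_grid(lambda d: mgf_two_sided(d, a, b), d_hi=b)
print(f"Ex. 4.1: d* = {d_star:.4f} (exact {(b - a) / 2}), "
      f"M* = {M_star:.4f} (exact {4 * a * b / (a + b)**2})")

mu, sigma = -0.5, 1.0
d_star, M_star = minimize_on_grid(lambda d: mgf_normal(d, mu, sigma), d_hi=5.0)
print(f"Ex. 4.3: d* = {d_star:.4f} (exact {-mu / sigma**2}), "
      f"M* = {M_star:.4f} (exact {np.exp(-mu**2 / (2 * sigma**2)):.4f})")
print(f"rate bound ln(1/M*) = {-np.log(M_star):.4f}")
```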
We now turn to those cases when τ_0(z) in (4.4) does not have a finite moment generating function V_x(c) for any c > 0. This is the case when Z_1 has a relatively 'fat tail', often observed in certain types of data in economics and finance. In such cases one may still hope to get polynomially decaying convergence rates to equilibrium. First, we consider general monotone increasing processes on [0, ∞) satisfying (4.2), (4.3), (4.6). It is convenient to denote probabilities and expectations under an initial state x as P_x and E_x, and P_μ, E_μ for the corresponding quantities when the initial distribution is μ. (Thus P_x and E_x are really P_{δ_x}, E_{δ_x}.) Then (see, e.g., Bhattacharya and Majumdar, 2007, p. 214, relations C9.27–28),

$$\int f\, d\pi = \frac{E_0 \sum_{m=0}^{\tau_0 - 1} f(X_m)}{E_0 \tau_0},$$

for all π-integrable (or nonnegative measurable) f on [0, ∞). For α > 0, writing h(x) ≡ E_x τ_0^α, one gets

$$E_\pi \tau_0^\alpha = \int E_x \tau_0^\alpha\, \pi(dx) = \int h(x)\, \pi(dx) = \frac{E_0 \sum_{m=0}^{\tau_0 - 1} h(X_m)}{E_0 \tau_0}.$$

Note that h(X_m) = E_{X_m} τ_0^α = E((τ_0 ∘ X_m^+)^α | ℱ_m), where X_m^+ is the after-m process, (X_m^+)_n ≡ X_{m+n} (n ≥ 0), and ℱ_m is the sigmafield generated by {X_j : 0 ≤ j ≤ m}. Then

$$\begin{aligned} E_0 \sum_{m=0}^{\tau_0 - 1} h(X_m) &= E_0 \sum_{m=0}^{\infty} h(X_m)\, \mathbf{1}_{\{m < \tau_0\}} = \sum_{m=0}^{\infty} E_0\, E\big((\tau_0 \circ X_m^+)^\alpha \,\big|\, \mathcal{F}_m\big)\, \mathbf{1}_{\{m < \tau_0\}} \\ &= \sum_{m=0}^{\infty} E_0\, E\big((\tau_0 \circ X_m^+)^\alpha\, \mathbf{1}_{\{m < \tau_0\}} \,\big|\, \mathcal{F}_m\big) \quad (\text{since } \{m < \tau_0\} \in \mathcal{F}_m) \\ &= E_0 \sum_{m=0}^{\tau_0 - 1} (\tau_0 \circ X_m^+)^\alpha. \end{aligned}$$

Noticing that τ_0 ∘ X_m^+ = τ_0 − m on {m < τ_0}, one then has

$$E_0 \sum_{m=0}^{\tau_0 - 1} h(X_m) = E_0 \sum_{m=0}^{\tau_0 - 1} (\tau_0 - m)^\alpha = E_0 \sum_{m=1}^{\tau_0} m^\alpha \le E_0 \tau_0^{\alpha+1}.$$

Hence

$$E_\pi \tau_0^\alpha \le \frac{E_0 \tau_0^{\alpha+1}}{E_0 \tau_0}. \tag{4.10}$$

Theorem 4.3. In addition to (4.2), (4.3), assume E_0 τ_0^{α+1} < ∞ for some α > 0. Then, for every μ such that E_μ τ_0^α < ∞, one has

$$d_{tv}(T^{*n}\mu, \pi) = o(n^{-\alpha}), \quad \text{as } n \to \infty. \tag{4.11}$$

In particular, (4.11) holds for μ = δ_x for all x ∈ [0, ∞).

Proof. Let the coupling time τ be expressed by (4.5), with X_0, X'_0 having distributions μ and π, respectively. Then

$$d_{tv}(T^{*n}\mu, \pi) \le P(\tau > n) \le n^{-\alpha} \int_{\{\tau > n\}} \tau^\alpha\, dP.$$

It suffices then to prove that Eτ^α < ∞. For this write

$$E\tau^\alpha = \int_{\{X'_0 \le X_0\}} \tau^\alpha\, dP + \int_{\{X_0 < X'_0\}} \tau^\alpha\, dP = E\big(\tau_0(X_0)^\alpha\, \mathbf{1}_{\{X'_0 \le X_0\}}\big) + E\big(\tau_0(X'_0)^\alpha\, \mathbf{1}_{\{X_0 < X'_0\}}\big) \le E_\mu \tau_0^\alpha + E_\pi \tau_0^\alpha < \infty,$$

in view of (4.10) and the assumption E_μ τ_0^α < ∞. For the second part, take μ = δ_x and note that E_0 τ_0^{α+1} < ∞, together with (4.3), implies that there exist arbitrarily large x' such that E_{x'} τ_0^{α+1} ≡ E(τ_0^{α+1}(x')) < ∞. But τ_0(x) ≤ τ_0(x') for all x ≤ x'. Hence E_x τ_0^{α+1} < ∞ for all x ≥ 0. □

We now turn to the Lindley process (4.8), and assume

$$\rho_s \equiv E\left|\frac{Z_1 - \mu_1}{\sigma}\right|^s < \infty, \qquad \mu_1 \equiv EZ_1 < 0, \qquad P(Z_1 > 0) > 0, \tag{4.12}$$

where s ≥ 3 is an integer and σ² = E(Z_1 − µ_1)². Denote by Φ the standard Normal distribution function. By the non-uniform error estimate in the CLT (see Bhattacharya and Ranga Rao, 1976, Corollary 17.7, p. 172), one has

$$P_0(\tau_0 > n) \le P(S_n > 0) = P\left(\frac{S_n - n\mu_1}{\sigma\sqrt{n}} > \frac{|\mu_1|}{\sigma}\sqrt{n}\right) \le 1 - \Phi\left(\frac{|\mu_1|}{\sigma}\sqrt{n}\right) + \frac{c'}{\sqrt{n}\left(1 + \sqrt{n}\,\frac{|\mu_1|}{\sigma}\right)^s} \le c'' n^{-(s+1)/2},$$

where c', c'' are constants which depend only on ρ_s and s. A standard summation by parts and elementary estimation lead to, for 0 < α < (s − 1)/2,

$$E_0 \tau_0^{\alpha+1} = \sum_{n=1}^{\infty} n^{\alpha+1} P_0(\tau_0 = n) \le (\alpha + 1)\, 2^\alpha \sum_{n=1}^{\infty} n^\alpha P_0(\tau_0 > n) < \infty. \tag{4.13}$$

Using Theorem 4.3, we have now proved the first part of Theorem 4.4 below.

Theorem 4.4. Assume (4.12), with s ≥ 3, for the Lindley process (4.8). Then, for every x ∈ [0, ∞),

$$d_{tv}\big(p^{(n)}(x, \cdot), \pi\big) = o(n^{-\alpha}) \quad \text{for all } \alpha < \frac{s-1}{2}.$$

For an initial distribution μ, one has

$$d_{tv}(T^{*n}\mu, \pi) = o(n^{-\alpha}) \quad \text{for all } \alpha < \frac{s-1}{2},$$

provided ∫_{[0,∞)} x^{(s+1)/2} μ(dx) < ∞.

Proof. In order to prove the second part of the theorem, again use Corollary 17.7, p. 172, in Bhattacharya and Ranga Rao (1976) to get

$$P_x(\tau_0 > n) \le P(x + S_n > 0) = P\left(\frac{S_n - n\mu_1}{\sigma\sqrt{n}} > -\frac{x}{\sigma\sqrt{n}} + \frac{|\mu_1|}{\sigma}\sqrt{n}\right) \le 1 - \Phi\left(-\frac{x}{\sigma\sqrt{n}} + \frac{|\mu_1|}{\sigma}\sqrt{n}\right) + \frac{c'}{\sqrt{n}\left(1 + \left|-\frac{x}{\sigma\sqrt{n}} + \frac{|\mu_1|}{\sigma}\sqrt{n}\right|\right)^s}. \tag{4.14}$$

Note that −x/(σ√n) + √n |µ_1|/σ > √n |µ_1|/(2σ) if x ≤ n|µ_1|/2, and, therefore, (4.14) yields

$$P_x(\tau_0 > n) \le c''' n^{-(s+1)/2} \qquad \forall x \le \frac{n|\mu_1|}{2},$$

where c''' is a constant depending only on |µ_1|, σ, ρ_s and s. Also, trivially,

$$P_x(\tau_0 > n) \le 1 \le \frac{2x}{n|\mu_1|} \qquad \forall x > \frac{n|\mu_1|}{2}.$$
Therefore,

$$P_\mu(\tau_0 > n) \le c''' n^{-(s+1)/2} + \frac{1}{n|\mu_1|} \int_{\{x > n|\mu_1|/2\}} 2x\, \mu(dx) \le c''' n^{-(s+1)/2} + \frac{1}{n|\mu_1|\, (n|\mu_1|/2)^{\frac{s-1}{2}}} \int_{\{x > n|\mu_1|/2\}} x^{\frac{s+1}{2}}\, \mu(dx) = O\big(n^{-(s+1)/2}\big).$$

The series $\sum_{n=1}^{\infty} n^\alpha P_\mu(\tau_0 > n)$ then converges to a finite limit if (s + 1)/2 − α > 1, i.e., α < (s − 1)/2. This concludes the proof of Theorem 4.4. □

For proving central limit theorems for general monotone (increasing) Markov processes on [0, ∞), under the hypotheses (4.2), (4.3), (4.6) and some appropriate moment assumptions, a good way to proceed is to establish a CLT for the normalized sum

$$\frac{1}{\sqrt{n}} \sum_{j=1}^{n} \sum_{m=\tau_0^{(j)}}^{\tau_0^{(j+1)} - 1} \left( f(X_m) - \int f\, d\pi \right) \longrightarrow N(0, \sigma^2), \tag{4.15}$$

where τ_0^{(j)} is the j-th return time to 0: τ_0^{(0)} = 0, τ_0^{(j)} = inf{n > τ_0^{(j−1)} : X_n = 0} for j ≥ 1, and τ_0^{(1)} = τ_0. The classical CLT holds for (4.15), as n → ∞, if

$$E_0 \left( \sum_{m=0}^{\tau_0 - 1} \Big( f(X_m) - \int f\, d\pi \Big) \right)^2 < \infty. \tag{4.16}$$

From this one may derive the desired result (see, e.g., Bhattacharya and Majumdar, 2007, Theorem 10.2, pp. 187–188),

$$\frac{1}{\sqrt{N}} \sum_{i=0}^{N} \Big( f(X_i) - \int f\, d\pi \Big) \longrightarrow N(0, \delta^2) \quad \text{as } N \to \infty, \tag{4.17}$$

where δ² is given by

$$\delta^2 = (E_0 \tau_0)^{-1}\, E_0 \left[ \sum_{m=0}^{\tau_0 - 1} \Big( f(X_m) - \int f\, d\pi \Big) \right]^2 \equiv (E_0 \tau_0)^{-1} \sigma^2. \tag{4.18}$$

Taking this route, it easily follows that the CLT holds whatever be the initial distribution μ. Also see Bhattacharya (1982), Theorem 2.6. Our final result is now an easy consequence of (4.13) and the above sufficient condition (4.16) for the CLT (4.17).

Theorem 4.5. For the process (4.8), assume (4.12) holds for some s > 3. Then for all bounded measurable f on [0, ∞), the CLT (4.17) holds, whatever the initial distribution.

Proof. To verify (4.16), write c = sup{|f(x) − ∫ f dπ| : x ∈ [0, ∞)}. Then

$$E_0 \left( \sum_{m=0}^{\tau_0 - 1} \Big( f(X_m) - \int f\, d\pi \Big) \right)^2 \le c^2\, E_0 \tau_0^2 < \infty,$$

provided s > 3, as shown by (4.13). □

Remark 4.2. The technique of representing the sum in (4.18) as a sum of i.i.d. block sums such as (4.15) allows one not only to apply the classical CLT, but also to immediately extend Theorem 4.5 to a functional central limit theorem (see, e.g., Billingsley, 1968, pp. 68–73, or Bhattacharya and Waymire, 2009, pp. 99–101).
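The regenerative-block route to (4.17) can be illustrated numerically (our sketch, reusing the normal-increment Lindley chain from above): split a long path at its visits to 0, form the i.i.d. block sums of (4.15), and compare the estimate of δ² from (4.18) with the empirical variance of normalized partial sums:

```python
import numpy as np

rng = np.random.default_rng(4)
mu, sigma = -0.5, 1.0

def lindley(n, x0=0.0):
    """A path of X_{n+1} = max(0, X_n + Z_{n+1}), with Z ~ N(mu, sigma^2)."""
    z = rng.normal(mu, sigma, size=n)
    x = np.empty(n + 1); x[0] = x0
    for k in range(n):
        x[k + 1] = max(0.0, x[k] + z[k])
    return x

x = lindley(1_000_000)
f = np.minimum(x, 5.0)             # a bounded measurable f, as in Theorem 4.5
fbar = f.mean()                    # long-run proxy for the integral of f dpi

zeros = np.flatnonzero(x == 0.0)   # visits to 0 delimit the i.i.d. blocks
cs = np.concatenate([[0.0], np.cumsum(f - fbar)])
block_sums = cs[zeros[1:]] - cs[zeros[:-1]]               # block sums as in (4.15)

delta2 = (block_sums**2).mean() / np.diff(zeros).mean()   # (4.18): sigma^2 / E_0(tau_0)
print(f"delta^2 via regeneration blocks: {delta2:.4f}")

seg = 10_000                       # direct check: variance of normalized partial sums
m = (len(f) // seg) * seg
S = (f[:m] - fbar).reshape(-1, seg).sum(axis=1) / np.sqrt(seg)
print(f"delta^2 via partial sums:        {S.var():.4f}")
```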
References

Bhattacharya, R.N. (1982). On the functional central limit theorem and the law of the iterated logarithm for Markov processes. Z. Wahrsch. Verw. Gebiete, 60, 185–201.

Bhattacharya, R.N. and Lee, O. (1988). Asymptotics of a class of Markov processes which are not in general irreducible. Ann. Probab., 16, 1333–1347 (correction (1997): Ann. Probab., 25, 1541–1543).

Bhattacharya, R.N. and Majumdar, M. (1999). On a theorem of Dubins and Freedman. J. Theoret. Probab., 12, 1067–1087.

Bhattacharya, R.N. and Majumdar, M. (2001). On a class of random dynamical systems: theory and applications. J. Econom. Theory, 96, 208–229.

Bhattacharya, R.N. and Majumdar, M. (2004). Stability in distribution of randomly perturbed quadratic maps as Markov processes. Ann. Appl. Probab., 14, 1802–1809.

Bhattacharya, R.N. and Majumdar, M. (2007). Random Dynamical Systems: Theory and Applications. Cambridge University Press, Cambridge.

Bhattacharya, R.N. and Majumdar, M. (2010). Random iterates of monotone maps. Rev. Econ. Des., 14, 185–192.

Bhattacharya, R.N. and Ranga Rao, R. (1976). Normal Approximation and Asymptotic Expansions. John Wiley and Sons, New York.

Bhattacharya, R.N. and Rao, B.V. (1993). Random iteration of two quadratic maps. In Stochastic Processes: A Festschrift in Honour of Gopinath Kallianpur (Cambanis, S., Ghosh, J.K., Karandikar, R.L. and Sen, P.K., eds.). Springer-Verlag, New York, 13–22.

Bhattacharya, R.N. and Waymire, E.C. (2009). Stochastic Processes with Applications. SIAM Classics in Applied Mathematics, 61. SIAM, Philadelphia.

Billingsley, P. (1968). Convergence of Probability Measures. Wiley, New York.

Blackwell, D. (1965). Discounted dynamic programming. Ann. Math. Statist., 36, 226–235.

Blumenthal, R.M. and Corson, H. (1972). On continuous collections of measures. In Proceedings of the Sixth Berkeley Symposium on Mathematical Statistics and Probability, 2 (L.M. Le Cam, J. Neyman and E.L. Scott, eds.). Univ. California Press, Berkeley, 33–40.

Chakraborty, S. and Rao, B.V. (1998). Completeness of Bhattacharya metric on the space of probabilities. Statist. Probab. Lett., 36, 321–326.

Diaconis, P. and Freedman, D. (1999). Iterated random functions. SIAM Rev., 41, 45–76.

Dubins, L.E. and Freedman, D.A. (1966). Invariant probabilities for certain Markov processes. Ann. Math. Statist., 37, 837–868.

Ellner, S. (1984). Asymptotic behavior of some stochastic difference equation population models. J. Math. Biol., 19, 169–200.

Feller, W. (1971). An Introduction to Probability Theory and Its Applications, Vol. 2, Second Edition. John Wiley and Sons, New York.

Gordin, M.I. and Lifsic, B.A. (1978). The central limit theorem for stationary Markov processes (English translation). Soviet Math. Dokl., 19, 392–394.

Hopenhayn, H.A. and Prescott, E.C. (1992). Stochastic monotonicity and stationary distributions for dynamic economies. Econometrica, 60, 1387–1406.

Iams, S. and Majumdar, M. (2010). Stochastic equilibrium: concepts and computations for Lindley processes. Internat. J. Econom. Theory, 6, 47–56.

Kifer, Y. (1986). Ergodic Theory of Random Transformations. Birkhauser, Boston.

Lindley, D.V. (1952). The theory of queues with a single server. Math. Proc. Cambridge Philos. Soc., 48, 277.

Lindvall, T. (1992). Lectures on the Coupling Method. John Wiley and Sons, New York.

Lund, R.B. and Tweedie, R.L. (1996). Geometric convergence rates for stochastically ordered Markov chains. Math. Oper. Res., 21, 182–194.

Maitra, A. (1968). Discounted dynamic programming on compact metric spaces. Sankhyā, Ser. A, 27, 241–248.

Majumdar, M. and Mitra, T. (1983). Dynamic optimization with non-convex technology: the case of a linear objective function. Rev. Econom. Stud., 50, 143–151.

Majumdar, M., Mitra, T. and Nyarko, Y. (1989). Dynamic optimization under uncertainty: non-convex feasible set. In Joan Robinson and Modern Economic Theory (G. Feiwel et al., eds.). Macmillan, New York, 545–590.

Meyn, S.P. and Tweedie, R.L. (1993). Markov Chains and Stochastic Stability. Springer-Verlag, New York.

Ross, S.M. (1983). Introduction to Stochastic Dynamic Programming. Academic Press, New York.

Spitzer, F. (1956). A combinatorial lemma and its application to probability theory. Trans. Amer. Math. Soc., 82, 323–339.

Yahav, J.A. (1975). On a fixed point theorem and its stochastic equivalent. J. Appl. Probab., 12, 605–611.

Rabi Bhattacharya
Department of Mathematics
The University of Arizona
Tucson, AZ 85721, USA
E-mail: [email protected]

Mukul Majumdar
Department of Economics
Cornell University
Ithaca, NY 14853, USA
E-mail: [email protected]

Nigar Hashimzade
School of Economics
University of Reading
Reading, Berkshire RG6 6AA
United Kingdom
E-mail: [email protected]

Paper received October 2009; revised January 2010.