Stochastic Modelling Unit 1: Markov chain models
Russell Gerrard and Douglas Wright
Cass Business School, City University, London
June 2004

Contents of Unit 1
1 Stochastic Processes
2 Markov Chains
3 Poisson Processes
4 Markov Jump Processes
5 Martingales

1 Stochastic processes

1.1 The Stochastic Process

A stochastic process is just a sequence of random variables. It involves a random element and a time element. Any collection {Xα : α ∈ A} of random variables may be considered as a stochastic process.

Examples:
∗ rainfall since midnight
∗ the number of cars passing a traffic census
∗ a stock market index, observed day by day (or minute by minute)
∗ a football team’s points score, match by match
∗ the size of the hole in the ozone layer

An unpredictable (random) observable process may be modelled to predict its future behaviour:
∗ short-term movements (exchange rates)
∗ medium-term extrema (storm damage claims, derivatives)
∗ long-term asymptotics (steady-state costs, e.g. health insurance)

Usefulness for modelling is determined by the nature of the dependence of Xn (or X(t)) on the preceding values of X. The set of all values X can ever take is the state space.

1.2 Classification of stochastic processes

• Based on the time parameter: discrete-time or continuous-time. We write {X1, X2, . . .} in discrete time, {X(t) : t ≥ 0} in continuous time.
• Based on the state space: discrete or continuous. A counting process is discrete, taking only integer values; a no-claims bonus scheme is another example of a discrete state space; the size of an insurance claim is treated as continuous.

1.3 The History

Ht — the history of X up until time t — is the collection of answers to all questions about the behaviour of X in [0, t]. We can write HtX to avoid confusion. In formal notation, HtX = σ({Xs : 0 ≤ s ≤ t}).

IE(Xt | Ht) = Xt, but in general IE(Xt+s | Ht) ≠ Xt+s.

{HtX : t ≥ 0} is the filtration generated by X. A filtration may contain the histories of many processes at once.
(The same things are true in discrete time.)

1.4 Stationarity

A stochastic process X is stationary if
• the distribution of Xt is the same for all t
• for any k the distribution of the vector (Xt, Xt+1, . . . , Xt+k) is the same for all t.

Example: we might assume that the process of interest rates is stationary.

1.5 Stationary, independent increments

An increment of a process X is a quantity of the form Xt+s − Xt.

X has stationary increments if the distribution of Xt+s − Xt is the same as that of Xs − X0 for all s, t.

X has independent increments if Xt+s − Xt and Xu+v − Xu are independent random variables whenever the intervals (t, t + s) and (u, u + v) do not overlap.

1.6 The Markov property

This is about memorylessness: ‘The future is independent of the past, given the present.’

Formally, X has the Markov property if the distribution of Xt+s given Ht is the same as the distribution of Xt+s given Xt.

1.7 The Martingale property

(Named after a gambling system, in turn named after a horse harness.) On average, the process stays where it is, like the no-arbitrage principle.

Formally, IE(Xt+s | Ht) = Xt. (H is usually, but does not have to be, HX.)

2 Markov Chains

2.1 Introduction

A discrete-time process {Xn : n ≥ 0} with a discrete state space is a Markov Chain if it possesses the Markov property: ‘The future is independent of the past, given the present.’

Example: Weather. State space {rainy, sunny}.

Example: Random walk, binomial lattice model.

Example: A no-claims discount scheme with four levels:
Level 1: no discount
Level 2: 10% discount
Level 3: 25% discount
Level 4: 40% discount
After a year without making a claim, move up to the next discount level (unless already in Level 4). After a year with one claim, move down to the next discount level (unless already in Level 1).

Example: An actor’s career. The actor is employed, unemployed or doing temporary work. A month of employment is followed by another such month with probability 0.8, or by unemployment otherwise.
After a month of unemployment, the actor finds temporary work with probability 0.4 or employment as an actor with probability 0.2; otherwise (probability 0.4) the unemployment continues. After a month of temporary work the probability of finding employment is 0.2, or the temporary work continues with probability 0.7; otherwise unemployment looms.

2.2 Transition probabilities

Define the transition probability from i to j as pij = P(Xn+1 = j | Xn = i), independent of events before time n by the Markov property. We assemble the pij into a matrix P, the transition matrix. Note: entries ≥ 0, row sums = 1.

Example (actor, states in the order employed, unemployed, temporary work):

        ( 0.8  0.2  0   )
    P = ( 0.2  0.4  0.4 )
        ( 0.2  0.1  0.7 )

2.3 Chapman-Kolmogorov Equations

The k-step transition probability pij^(k) is defined as P(Xn+k = j | Xn = i). Chapman-Kolmogorov: P^(k) = P^k, the kth power of the matrix P.

2.4 Long-term behaviour

In many cases we observe that for large n the distribution of Xn converges to a limit π, i.e. P(Xn = j | X0 = i) → πj, the limit being the same regardless of the starting point. We can find π by solving πP = π with the constraint that Σj πj = 1.

2.5 Fitting a Markov Chain model

Observe the process for an extended period, or several copies of the process (one actor for several years, or several actors for one year). Let ni be the number of times the process is in state i, and nij the number of transitions from i to j. Then estimate pij by p̂ij = nij/ni.

2.6 Time-inhomogeneous Markov Chains

Transition probabilities may change with time. Young drivers and very old drivers may have more accidents than middle-aged drivers, for example. P would therefore depend on n, giving Pn. The analogous form of the Chapman-Kolmogorov Equations continues to hold, but there is no long-term limit. Even if individual behaviour is not time-homogeneous, the behaviour of the population as a whole could be treated as time-homogeneous because of statistical equilibrium.

3 Poisson Processes

3.1 The standard Poisson Process

Events can happen at any time.
The probability of an event in any time interval (t, t + dt) is λ dt + o(dt), independently of other intervals. Suppose T1, T1 + T2, T1 + T2 + T3, . . . are the times of the first, second, third, . . . events. The Ti are independent of one another, and all have the exponential distribution with rate λ. Therefore X(t), the number of events in (0, t), is a Poisson random variable with mean λt. X is a stochastic process in continuous time (t > 0).

3.2 Time-dependent Poisson Process

The underlying rate at which events occur changes in some deterministic way. For example, more domestic storm damage claims occur in autumn than in summer. X(t) is still a Poisson random variable, but the mean is now more complicated: the number of events in (0, t) has mean ∫₀ᵗ λ(u) du rather than λt.

3.3 Age-dependent Poisson Processes

The transition rate depends on the time since the last transition (‘age’). If h(t) is the transition rate at age t, we have S(t + dt) = S(t){1 − h(t) dt}, where S is the survivor function, S(t) = P(time to next transition > t). The solution is

    S(t) = exp( − ∫₀ᵗ h(u) du ).

X(t) is no longer a Poisson random variable: instead, X is a simple renewal process.

4 Markov Jump Processes

4.1 The Markov property

Same as in discrete time: P(X(t + s) = j | Ht) depends only on X(t), j, s and possibly t. Initially we only consider cases where this probability is independent of t (time-homogeneous cases); use the notation pij(s) = P(X(t + s) = j | X(t) = i). Later there will be variations involving time-inhomogeneity or age-dependence.

Example: reversionary annuity: states are {both partners alive, only husband alive, only wife alive, neither alive}.

Example: marriage model: states {never married, married, divorced, remarried, widowed}.

Example: long-term care model: states {healthy, short-term sick, long-term sick}.

4.2 The Behaviour of Markov Processes

The length of time spent in the current state x must have a memoryless (i.e. exponential) distribution, with rate parameter λx which depends only on the state itself.
This implies that the mean time spent in state x on any one visit is 1/λx. Once the jump occurs, the probability that it takes the chain to state y is rxy, regardless of the duration in x. The matrix R is a discrete-time transition matrix; the associated chain is the jump chain of X.

A state x for which λx = 0 is absorbing: if the process hits the state, it will never leave.

4.3 The Chapman-Kolmogorov Equations

As in discrete time, the Chapman-Kolmogorov Equations are

    P(s + t) = P(s) P(t).

Notice that P(0) = I, the identity matrix.

4.4 The Kolmogorov Differential Equations

Setting t = ds in the Chapman-Kolmogorov equations gives P(s + ds) = P(s) P(ds), or P(s + ds) − P(s) = P(s){P(ds) − I}, which gives

    P′(s) = P(s) Q,

where Q = P′(0) is the generator matrix of the chain. This is called the Kolmogorov Forward Equation. By putting s = dt in the Chapman-Kolmogorov equations, we obtain the Kolmogorov Backward Equation,

    P′(t) = Q P(t).

The formal solution of the Kolmogorov DEs is P(t) = exp(tQ), but this matrix exponential may be hard to evaluate.

Example: Linear birth-and-death process. The rate of transitions from x to x + 1 is the birth rate xβ; from x to x − 1, the death rate xδ. Therefore qx,x−1 = xδ, qx,x+1 = xβ, qx,x = −x(β + δ), with all other entries of the Q-matrix being zero. Thus the Q-matrix is

        ( 0    0        0         0         0    . . . )
        ( δ    −(β+δ)   β         0         0    . . . )
    Q = ( 0    2δ       −2(β+δ)   2β        0    . . . )
        ( 0    0        3δ        −3(β+δ)   3β   . . . )
        ( .    .        .         .         .        . )

4.5 Long-term behaviour

As for the discrete-time version, a time-homogeneous Markov jump process converges (under certain conditions), in the sense that P(X(t) = j) → πj as t → ∞ regardless of the starting position, where π is the solution of πQ = 0 with Σj πj = 1. We can therefore calculate the long-run proportion of time spent in the states.

4.6 Fitting a Markov jump model

The elements of R, the transition matrix of the jump chain, are estimated as in discrete time. Estimates for the parameters λx are based on the average duration of visits to state x.
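The estimation step just described can be sketched in a few lines of Python. This is a minimal sketch, assuming the observations arrive as (state, duration, next state) triples; the data layout and the function name fit_jump_model are illustrative, not from the notes.

```python
from collections import defaultdict

def fit_jump_model(visits):
    """Estimate jump-chain probabilities r_xy and rates lambda_x from
    observed completed visits, given as (state, duration, next_state)
    triples.  (Hypothetical data layout, for illustration only.)"""
    counts = defaultdict(lambda: defaultdict(int))  # n_xy: jumps from x to y
    total_time = defaultdict(float)                 # total duration spent in x
    n_visits = defaultdict(int)                     # n_x: number of visits to x
    for x, dur, y in visits:
        counts[x][y] += 1
        total_time[x] += dur
        n_visits[x] += 1
    # r_xy = n_xy / n_x, as for a discrete-time chain
    r = {x: {y: counts[x][y] / n_visits[x] for y in counts[x]} for x in counts}
    # lambda_x = 1 / (average duration of a visit to x)
    lam = {x: n_visits[x] / total_time[x] for x in total_time}
    return r, lam

# Hypothetical observations: state, sojourn length (months), destination
data = [("H", 2.0, "S"), ("H", 4.0, "S"), ("S", 1.0, "H"), ("S", 1.0, "D")]
r_hat, lam_hat = fit_jump_model(data)
```

Here the estimate of λ_H is 2 visits / 6 months = 1/3, so the fitted mean sojourn in state H is 3 months, consistent with 1/λx above.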
Testing goodness of fit can take many forms, such as:
∗ is the exponential a good distribution to fit?
∗ are destinations independent of durations?
∗ do second and subsequent visits have different durations?

4.7 Time-inhomogeneous Markov models

Used when there is an underlying reason for transition rates to change with time. Example: care model: rates of falling sick are higher in winter, recovery rates lower. Also used for a single individual with age-varying transition rates. The transition rates are given by P(X(t + dt) = j | X(t) = i) = qij(t) dt + o(dt).

The formal solution P(t) = exp(tQ) no longer holds. The only possibility is to attempt to solve the DEs by hand.

Example: Poisson process with rate λ(t) = 2λt/(1 + t²):

    d/dt p0,j(t) = − (2λt/(1 + t²)) p0,j(t) + (2λt/(1 + t²)) p0,j−1(t).

Even in this case it is not obvious that there is an explicit solution. In more complicated cases numerical approximation is required. It is unlikely that X converges to a limiting distribution.

4.8 Semi-Markov models

Two restrictions imposed by the Markov format are relaxed:
∗ the durations need not be exponential
∗ the destinations may depend on the durations

This makes the models much more flexible, but difficult to fit because there are so many parameters; they are only practical when there is a prior idea of the distribution of the durations. The long-run proportion of time spent in the various states may be found as above.

5 Martingales

5.1 Basic properties

Formally, IE(Xt+s | Ht) = Xt. (H is usually, but does not have to be, HX.)

Examples: a random walk with zero-mean increments, the binomial lattice, even a (non-Markov) process with increased volatility after a jump.

It follows from the definition that IEXt = IEX0.

5.2 Martingale convergence

The usefulness of martingales arises from convergence theorems: a martingale bounded above or below converges, Xt → X∞. If X is bounded above and below, or has bounded variance, then IEX∞ = IEX0.
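The basic property IEXt = IEX0 is easy to illustrate by simulation. A minimal sketch, using a symmetric random walk (an assumed example of a martingale with zero-mean increments, not one prescribed by the notes):

```python
import random

random.seed(1)  # fixed seed so the run is reproducible

def walk(x0, steps):
    """Symmetric random walk started at x0: each increment is +1 or -1
    with probability 1/2, so the increments have mean zero."""
    x = x0
    for _ in range(steps):
        x += random.choice((-1, 1))
    return x

# Monte Carlo estimate of IE X_100 for a walk started at X_0 = 5;
# the martingale property predicts the answer 5, whatever the horizon.
mean_end = sum(walk(5, 100) for _ in range(20000)) / 20000
```

The estimate comes out close to 5, as predicted, and the same holds for any number of steps.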
5.3 Stopping times

A random time T is a stopping time if you can always tell whether or not it has happened yet. Example: the first time something occurs, but not the last time something occurs. Formally: the answer to the question “Is T ≤ t?” must lie in Ht.

5.4 Stopped Martingales

If X is a martingale and T a stopping time, then Z, defined by

    Zt = Xt if T > t,
    Zt = XT if T ≤ t,

is a stopped martingale. A stopped martingale is also a martingale.

5.5 Optional Stopping

Suppose X is a martingale, certain to hit 0 or a eventually, and T is the first time of hitting one of them. Let q be the probability that X hits 0 first. Then Z is a bounded martingale, so it converges, and Z∞ = 0 (with probability q) or a (with probability 1 − q), so IEZ∞ = a(1 − q). But IEZ∞ = IEZ0 = X0. Hence q = 1 − X0/a.

5.6 Practical issues

“First find your martingale.” Commonly the process being studied is not a martingale, but transformations can help (such as Yt = h(Xt)).
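The optional stopping calculation in 5.5 can be checked by simulation. A minimal sketch, assuming the martingale is a symmetric random walk absorbed at 0 and a (a concrete choice made here for illustration; the result q = 1 − X0/a holds for any martingale satisfying the conditions above):

```python
import random

random.seed(2)  # fixed seed so the run is reproducible

def hits_zero_first(x0, a):
    """Run a symmetric random walk from x0 until it first hits 0 or a;
    return True if 0 is hit first."""
    x = x0
    while 0 < x < a:
        x += random.choice((-1, 1))
    return x == 0

x0, a, trials = 3, 10, 20000
q_hat = sum(hits_zero_first(x0, a) for _ in range(trials)) / trials
# Optional stopping predicts q = 1 - X0/a = 1 - 3/10 = 0.7
```

The empirical frequency q_hat agrees with the predicted value 0.7 to within Monte Carlo error, confirming q = 1 − X0/a without any direct analysis of the hitting probabilities.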