Stochastic Modelling
Unit 1: Markov chain models
Russell Gerrard and Douglas Wright
Cass Business School, City University, London
June 2004
Contents of Unit 1
1 Stochastic Processes
2 Markov Chains
3 Poisson Processes
4 Markov Jump Processes
5 Martingales
1 Stochastic processes
1.1 The Stochastic Process
A stochastic process is just a sequence of random variables.
It involves a random element and a time element.
Any collection {Xα : α ∈ A} of random variables may be considered as a stochastic process.
Examples:
∗ rainfall since midnight
∗ the number of cars passing a traffic census
∗ stock market index, observed day by day (or minute by minute)
∗ a football team’s points score, match by match
∗ the size of the hole in the ozone layer
An unpredictable (random) observable process may be modelled to predict its future behaviour:
∗ short-term movements (exchange rates)
∗ medium-term extrema (storm damage claims, derivatives)
∗ long-term asymptotics (steady-state costs, eg health insurance)
Usefulness for modelling is determined by the nature of the dependence of Xn (or X(t)) on
preceding X values.
The set of all values X can ever take is the state space.
1.2 Classification of stochastic processes
• Based on time parameter: discrete-time or continuous-time.
We write {X1, X2, . . .} in discrete time, {X(t) : t ≥ 0} in continuous time.
• Based on state space: discrete or continuous.
A counting process is discrete, taking only integer values;
a no-claims bonus scheme is another example of a discrete state space;
the size of an insurance claim is treated as continuous.
1.3 The History
Ht — the history of X up until time t — is the collection of answers to all questions about
the behaviour of X in [0, t].
We can write HtX to avoid confusion.
In formal notation, HtX = σ({Xs : 0 ≤ s ≤ t}).
IE(Xt | Ht) = Xt, but IE(Xt+s | Ht) ≠ Xt+s.
{HtX : t ≥ 0} is the filtration generated by X.
A filtration may contain the histories of many processes at once.
(The same things are true in discrete time.)
1.4 Stationarity
A stochastic process X is stationary if
• the distribution of Xt is the same for all t
• for any k the distribution of the vector (Xt , Xt+1 , . . . , Xt+k ) is the same for all t.
Example: we might assume that the process of interest rates is stationary.
1.5 Stationary, independent increments
An increment of a process X is a quantity of the form Xt+s − Xt .
X has stationary increments if the distribution of Xt+s − Xt is the same as that of Xs − X0
for all s, t.
X has independent increments if Xt+s − Xt and Xu+v − Xu are independent r.v.s whenever
the intervals (t, t + s) and (u, u + v) do not overlap.
1.6 The Markov property
This is about memorylessness.
‘The future is independent of the past, given the present’
Formally, X has the Markov property if the distribution of Xt+s given Ht is the same as the
distribution of Xt+s given Xt .
1.7 The Martingale property
(Named after a gambling system, in turn named after a horse harness.)
On average, the process stays where it is.
Like the no-arbitrage principle.
Formally, IE(Xt+s | Ht ) = Xt .
(H is usually, but does not have to be, HX .)
2 Markov Chains
2.1 Introduction
A discrete-time process {Xn : n ≥ 0} with a discrete state space is a Markov Chain if it
possesses the Markov property:
‘The future is independent of the past, given the present’
Example: Weather. State space {rainy, sunny}.
Example: Random walk, binomial lattice model.
Example: A no-claims discount scheme with four levels:
Level 1: no discount
Level 2: 10% discount
Level 3: 25% discount
Level 4: 40% discount
After a year without making a claim, move up to the next discount level (unless already in 4).
After a year with one claim, move down to the next discount level (unless already in 1).
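As an illustrative sketch (not part of the original notes), these rules translate into a one-year transition matrix once an annual claim probability p is assumed; states 0..3 below stand for Levels 1..4.

```python
import numpy as np

def ncd_transition_matrix(p):
    """Transition matrix for the 4-level no-claims discount scheme.

    p is an assumed probability of claiming in a year: with probability
    1 - p move up one level (capped at Level 4), with probability p move
    down one level (floored at Level 1).
    """
    P = np.zeros((4, 4))
    for i in range(4):            # states 0..3 represent Levels 1..4
        up = min(i + 1, 3)        # cannot move above Level 4
        down = max(i - 1, 0)      # cannot move below Level 1
        P[i, up] += 1 - p
        P[i, down] += p
    return P

print(ncd_transition_matrix(0.1))   # e.g. a 10% annual claim probability
```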
Example: An actor’s career. The actor is employed, unemployed or doing temporary work.
A month of employment is followed by another such month, with prob. 0.8, or by unemployment otherwise.
After a month of unemployment, the actor finds temporary work with probability 0.4, employment as an actor with probability 0.2.
After a month of temporary work the probability of finding employment is 0.2, or the temporary work continues with probability 0.7; otherwise unemployment looms.
2.2 Transition probabilities
Define the transition probability from i to j as
pij = P(Xn+1 = j | Xn = i),
independent of events before time n by the Markov property.
We assemble the pij into a matrix P , the transition matrix.
Note: entries ≥ 0, row sums = 1.
Example (actor)
With the states ordered (employed, unemployed, temporary work), the transition matrix is
        ( 0.8  0.2  0.0 )
    P = ( 0.2  0.4  0.4 )
        ( 0.2  0.1  0.7 )
2.3 Chapman-Kolmogorov Equations
The k-step transition probability pij^(k) is defined as P(Xn+k = j | Xn = i).
Chapman-Kolmogorov: P^(k) = P^k, the kth power of the matrix P.
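A minimal numerical check of this relation, using numpy and the actor transition matrix above:

```python
import numpy as np

# Transition matrix for the actor example (employed, unemployed, temporary work)
P = np.array([[0.8, 0.2, 0.0],
              [0.2, 0.4, 0.4],
              [0.2, 0.1, 0.7]])

# Two-step transition probabilities: P^(2) = P @ P
P2 = np.linalg.matrix_power(P, 2)
print(P2)   # e.g. P2[0, 1] = P(X_{n+2} = unemployed | X_n = employed)
```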
2.4 Long-term behaviour
In many cases we observe that for large n the distribution of Xn converges to a limit π, i.e.
P(Xn = j | X0 = i) → πj,
the limit being the same regardless of the starting point.
We can find π by solving πP = π with the constraint that Σj πj = 1.
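A sketch of one way to compute π numerically: πP = π says π is a left eigenvector of P with eigenvalue 1, i.e. an ordinary eigenvector of the transpose. Using the actor matrix again:

```python
import numpy as np

P = np.array([[0.8, 0.2, 0.0],
              [0.2, 0.4, 0.4],
              [0.2, 0.1, 0.7]])

# Eigenvector of P transposed corresponding to the eigenvalue closest to 1
vals, vecs = np.linalg.eig(P.T)
pi = np.real(vecs[:, np.argmin(np.abs(vals - 1))])
pi = pi / pi.sum()     # normalise so the probabilities sum to 1
print(pi)
```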
2.5 Fitting a Markov Chain model
Observe the process for an extended period, or several copies of the process (one actor for
several years, or several actors for one year).
ni = number of times the process is in state i,
nij the number of transitions from i to j.
Then estimate pij by p̂ij = nij/ni.
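A minimal sketch of this estimator in Python; the observed path below is invented purely for illustration.

```python
import numpy as np

def estimate_transition_matrix(path, n_states):
    """Estimate p_ij = n_ij / n_i from one observed path of the chain."""
    counts = np.zeros((n_states, n_states))
    for i, j in zip(path[:-1], path[1:]):
        counts[i, j] += 1                     # one observed transition i -> j
    row_totals = counts.sum(axis=1, keepdims=True)
    return counts / np.where(row_totals > 0, row_totals, 1)

# Hypothetical observed path, states coded 0, 1, 2
path = [0, 0, 1, 2, 2, 0, 0, 0, 1, 1, 2, 2, 2, 0]
print(estimate_transition_matrix(path, 3))
```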
2.6 Time-inhomogeneous Markov Chains
Transition probabilities may change with time. Young drivers and very old drivers may have
more accidents than middle-aged drivers, for example. P would therefore depend on n, giving
Pn .
The analogous form of the Chapman-Kolmogorov Equations continues to hold, but there is
no long-term limit.
Even if individual behaviour is not time-homogeneous, the behaviour of the population as a
whole could be treated as time-homogeneous because of statistical equilibrium.
3 Poisson Processes
3.1 The standard Poisson Process
Events can happen at any time. The probability of an event in any time interval (t, t + dt)
is λdt + o(dt), independently of other intervals.
Suppose T1 , T1 + T2 , T1 + T2 + T3 , . . . are the times of the first, second, third, . . . events. The
Ti are independent of one another, and all have exponential distribution with rate λ.
Therefore X(t), the number of events in (0, t), is a Poisson random variable with mean λt.
X is a stochastic process in continuous time (t > 0).
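A minimal simulation sketch of this construction (the rate and horizon are illustrative); the sample mean of X(t) should be close to λt.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

def poisson_process_count(rate, t):
    """Simulate event times as cumulative Exp(rate) gaps; return X(t)."""
    time, count = 0.0, 0
    while True:
        time += rng.exponential(1.0 / rate)   # next inter-event time T_i
        if time > t:
            return count
        count += 1

# The sample mean should be close to rate * t (here 2 * 5 = 10)
samples = [poisson_process_count(rate=2.0, t=5.0) for _ in range(10_000)]
print(np.mean(samples))
```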
3.2 Time-dependent Poisson Process
The underlying rate at which events occur changes in some deterministic way. For example,
more domestic storm damage claims occur in autumn than in summer.
X(t) is still a Poisson r.v., but its mean is now the integral ∫₀ᵗ λ(u) du rather than λt.
3.3 Age-dependent Poisson Processes
The transition rate depends on the time since the last transition (‘age’).
If h(t) is the transition rate at age t, we have
S(t + dt) = S(t) {1 − h(t)dt},
where S is the survivor function,
S(t) = P(time to next transition > t).
The solution is
S(t) = exp( −∫₀ᵗ h(u) du ).
X(t) is no longer a Poisson r.v.: instead, X is a simple renewal process.
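A numerical sketch of the survivor-function formula for an assumed, purely illustrative hazard h(t):

```python
import numpy as np

def survivor(h, t, steps=10_000):
    """Approximate S(t) = exp(-integral_0^t h(u) du) by the trapezium rule."""
    u = np.linspace(0.0, t, steps)
    hv = h(u)
    integral = np.sum(0.5 * (hv[1:] + hv[:-1]) * np.diff(u))
    return np.exp(-integral)

# Assumed increasing hazard: h(t) = 0.1 + 0.05 t
h = lambda t: 0.1 + 0.05 * t
print(survivor(h, 2.0))   # exact value is exp(-0.3) ≈ 0.741
```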
4 Markov Jump Processes
4.1 The Markov property
Same as in discrete time: P(X(t + s) = j | Ht ) depends only on X(t), j, s and possibly t.
Initially we only consider cases where this probability is independent of t (time-homogeneous
cases); use the notation
pij (s) = P(X(t + s) = j | X(t) = i).
Later there will be variations involving time-inhomogeneity or age-dependence.
Example: reversionary annuity: states are {both partners alive, only husband alive, only wife
alive, neither alive}
Example: marriage model: states {never married, married, divorced, remarried, widowed}
Example: long-term care model: states {healthy, short-term sick, long-term sick}
4.2 The Behaviour of Markov Processes
The length of time spent in the current state x must have a memoryless (i.e., exponential) distribution, with rate parameter λx, which depends only on the state itself.
This implies that the mean time spent in state x on any one visit is 1/λx .
Once the jump occurs, the probability that it takes the chain to state y is rxy , regardless of
duration in x.
The matrix R is a discrete-time transition matrix.
The associated chain is the jump chain of X.
A state x for which λx = 0 is absorbing:
if the process hits the state, it will never leave.
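This description translates directly into a simulation: wait an Exp(λx) time in the current state, then choose the destination from the corresponding row of R. A sketch with invented rates and jump probabilities:

```python
import numpy as np

rng = np.random.default_rng(seed=2)

def simulate_jump_process(lam, R, x0, horizon):
    """Return (time, state) pairs of a Markov jump process up to `horizon`.

    lam[x] is the holding rate in state x; R[x] is the jump-chain row from x.
    """
    t, x, path = 0.0, x0, [(0.0, x0)]
    while True:
        if lam[x] == 0:                       # absorbing state: never leaves
            return path
        t += rng.exponential(1.0 / lam[x])    # Exp(lambda_x) holding time
        if t > horizon:
            return path
        x = rng.choice(len(R[x]), p=R[x])     # destination drawn from R
        path.append((t, x))

# Illustrative 3-state example (rates and jump probabilities are assumed)
lam = np.array([1.0, 0.5, 0.0])               # state 2 is absorbing
R = np.array([[0.0, 0.7, 0.3],
              [1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0]])
print(simulate_jump_process(lam, R, x0=0, horizon=10.0))
```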
4.3 The Chapman-Kolmogorov Equations
As in discrete time the Chapman-Kolmogorov Equations are
P (s + t) = P (s) P (t)
Notice that P (0) = I, the identity matrix.
4.4 The Kolmogorov Differential Equations
Setting t = ds in the Chapman-Kolmogorov equations gives
P (s + ds) = P (s) P (ds)
or
P (s + ds) − P (s) = P (s){P (ds) − I},
which gives
P 0 (s) = P (s) Q,
where Q = P 0 (0), the generator matrix of the chain.
This is called the Kolmogorov Forward Equation.
By putting s = dt in the Chapman-Kolmogorov equations, we obtain the Kolmogorov Backward Equation,
P 0 (t) = Q P (t).
The formal solution of the Kolmogorov DEs is P (t) = exp(tQ), but this exponential may be
hard to evaluate.
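In practice the matrix exponential is usually evaluated numerically; a sketch using scipy, with an assumed two-state (healthy/sick) generator for illustration:

```python
import numpy as np
from scipy.linalg import expm

# Assumed generator: rate 0.3 of falling sick, rate 0.7 of recovering
Q = np.array([[-0.3,  0.3],
              [ 0.7, -0.7]])

t = 2.0
P_t = expm(t * Q)      # transition probabilities over a period of length t
print(P_t)             # rows sum to 1; entries are P(X(t) = j | X(0) = i)
```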
Example: Linear birth-and-death process. The rate of transitions from x to x + 1 is the
birth rate xβ; from x to x − 1 the death rate xδ. Therefore
qx,x−1 = xδ, qx,x+1 = xβ, qx,x = −x(β + δ)
with all other entries in the Q-matrix being zero. Thus the Q-matrix is
        ( 0       0          0          0        ... )
        ( δ    −(β + δ)      β          0        ... )
    Q = ( 0      2δ       −2(β + δ)     2β       ... )
        ( 0       0         3δ       −3(β + δ)   ... )
        ( ...    ...        ...        ...       ... )
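To work with this generator numerically the state space must be truncated at some maximum population size; a sketch of building the truncated Q-matrix from the rates xβ and xδ (the truncation is an approximation, not part of the model):

```python
import numpy as np

def birth_death_generator(beta, delta, n_max):
    """Q-matrix of the linear birth-and-death process on states 0..n_max."""
    Q = np.zeros((n_max + 1, n_max + 1))
    for x in range(1, n_max + 1):
        Q[x, x - 1] = x * delta          # death: x -> x - 1 at rate x*delta
        if x < n_max:
            Q[x, x + 1] = x * beta       # birth: x -> x + 1 at rate x*beta
        Q[x, x] = -Q[x].sum()            # diagonal makes the row sum to zero
    return Q

Q = birth_death_generator(beta=0.2, delta=0.3, n_max=5)
print(Q.sum(axis=1))    # every row of a generator matrix sums to zero
```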
4.5 Long-term behaviour
As for the discrete-time version, a time-homogeneous Markov jump process converges (under
certain conditions), in the sense that
P(X(t) = j) → πj as t → ∞
regardless of the starting position, where π is the solution to
πQ = 0.
We can therefore calculate the long-run proportion of time spent in the states.
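A sketch of solving πQ = 0 together with Σ πj = 1 numerically, reusing the assumed two-state generator from the matrix-exponential sketch above:

```python
import numpy as np

Q = np.array([[-0.3,  0.3],
              [ 0.7, -0.7]])

# Solve pi Q = 0 with sum(pi) = 1 as a least-squares problem:
# stack the normalisation condition as an extra equation.
A = np.vstack([Q.T, np.ones(len(Q))])
b = np.append(np.zeros(len(Q)), 1.0)
pi, *_ = np.linalg.lstsq(A, b, rcond=None)
print(pi)     # long-run proportions of time in each state, here [0.7, 0.3]
```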
4.6 Fitting a Markov jump model
The elements of R, the transition matrix of the jump chain, are estimated as in discrete time.
Estimates for the parameters λx are based on average durations in state x.
Testing goodness of fit can take many forms, such as
∗ is exponential a good distribution to fit?
∗ are destinations independent of durations?
∗ do 2nd and subsequent visits have different durations?
4.7 Time-inhomogeneous Markov models
Used when there is an underlying reason for transition rates to change with time.
Example: care model: rates of falling sick are higher in winter, recovery rates lower.
Also used for a single individual with age-varying transition rates.
The transition rates are:
P(X(t + dt) = j | X(t) = i) = qij (t) dt + o(dt).
The formal solution P (t) = exp(tQ) no longer holds. The only possibility is to attempt to
solve the DEs by hand.
Example: Poisson process with rate λ(t) = 2λt/(1 + t²):
(d/dt) p0,j(t) = − [2λt/(1 + t²)] p0,j(t) + [2λt/(1 + t²)] p0,j−1(t).
Even in this case it is not obvious that there is an explicit solution. In more complicated
cases numerical approximation is required.
It is unlikely that X converges to a limiting distribution.
4.8 Semi-Markov models
Two restrictions imposed by the Markov format are relaxed:
∗ the durations need not be exponential
∗ the destinations may depend on the durations
Much more flexible.
But difficult to fit because there are so many parameters.
Only practical when there is a prior idea of the distribution of the durations.
Long-run proportion of time spent in the various states may be found as above.
5 Martingales
5.1 Basic properties
Formally, IE(Xt+s | Ht ) = Xt .
(H is usually, but does not have to be, HX .)
Example: random walk with zero-mean increments, binomial lattice, even a (non-Markov)
process with increased volatility after a jump.
It follows from the definition that IEXt = IEX0 .
5.2 Martingale convergence
The usefulness of martingales arises from convergence theorems:
A martingale bounded above or below converges,
Xt → X∞ .
If X is bounded above and below, or has bounded variance, then IEX∞ = IEX0 .
5.3 Stopping times
A random time T is a stopping time if you can always tell whether or not it has happened
yet.
Example: the first time something occurs, but not the last time something occurs.
Formally: the answer to the question “Is T ≤ t?” must lie in Ht .
5.4 Stopped Martingales
If X is a martingale and T a stopping time, then Z, defined by
Zt = Xt if T > t,   Zt = XT if T ≤ t,
is a stopped martingale.
A stopped martingale is also a martingale.
5.5 Optional Stopping
Suppose X is a martingale, certain to hit 0 or a eventually, and T is the first time of hitting
one of them.
Let q be the probability that X hits 0 first.
Then Z is a bounded martingale, so converges, and Z∞ = 0 (with prob. q) or a (with prob.
1 − q), so IEZ∞ = a(1 − q).
But IEZ∞ = IEZ0 = X0 .
Hence q = 1 − X0 /a.
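A Monte Carlo check of this optional-stopping result, using a symmetric simple random walk (which is a martingale) with illustrative values of X0 and a:

```python
import numpy as np

rng = np.random.default_rng(seed=3)

def hits_zero_first(x0, a):
    """Run a symmetric +/-1 random walk from x0 until it hits 0 or a."""
    x = x0
    while 0 < x < a:
        x += rng.choice([-1, 1])
    return x == 0

x0, a = 3, 10
q_hat = np.mean([hits_zero_first(x0, a) for _ in range(20_000)])
print(q_hat, 1 - x0 / a)   # estimate vs the optional-stopping value 0.7
```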
5.6 Practical issues
“First find your martingale”.
Commonly the process being studied is not a martingale.
But transformations can help (such as Yt = h^Xt for a suitably chosen constant h).