Markov Chains, Renewal, Branching
and Coalescent Processes: Four
Topics in Probability Theory
Andreas Nordvall Lagerås
Mathematical Statistics
Department of Mathematics
Stockholm University
2007
Doctoral Dissertation 2007
Mathematical Statistics
Stockholm University
SE-106 91 Stockholm
Typeset by LaTeX
© Andreas Nordvall Lagerås
ISBN 91-7155-375-4 pp. 1–14
Printed by US AB
Abstract
This thesis consists of four papers.
In paper 1, we prove central limit theorems for Markov chains under (local)
contraction conditions. As a corollary we obtain a central limit theorem for
Markov chains associated with iterated function systems with contractive
maps and place-dependent Dini-continuous probabilities.
In paper 2, properties of inverse subordinators are investigated, in particular
similarities with renewal processes. The main tool is a theorem on processes
that are both renewal and Cox processes.
In paper 3, distributional properties of supercritical and especially immortal
branching processes are derived. The marginal distributions of immortal
branching processes are found to be compound geometric.
In paper 4, a description of a dynamic population model is presented, such
that samples from the population have genealogies as given by a Λ-coalescent
with mutations. Depending on whether the sample is grouped according to
litters or families, the sampling distribution is either regenerative or nonregenerative.
Acknowledgements
I would like to thank
- my supervisor Thomas Höglund, who was my first lecturer in probability theory and immediately made me take a liking to the subject.
- Anders Martin-Löf, who has always been available to bounce ideas off.
- Örjan Stenflo, who was an excellent first co-author, properly encouraging and sufficiently demanding.
- my other colleagues at Mathematical Statistics, for how well we have coffee, eat, teach and do research together. Especially the coffee.
- my family and friends, for everything that happens outside my little workshop.
Stockholm, 25 January 2007
Andreas Nordvall Lagerås
Contents

Introduction and summary of the four papers

1 Paper I
  1.1 Markov chains as iterated function systems
  1.2 Limit theorems
  1.3 Main result

2 Paper II
  2.1 Renewal processes and beyond
  2.2 Cox processes
  2.3 Main result

3 Paper III
  3.1 Compound distributions
  3.2 Branching processes in continuous time
  3.3 Binary branching: the Yule process
  3.4 Main result

4 Paper IV
  4.1 Population models
  4.2 Main result
List of papers

I Lagerås, A. N. and Stenflo, Ö. (2005) Central limit theorems for contractive Markov chains. Nonlinearity, 18(5), 1955–1965. © 2005 IOP Publishing Ltd.

II Lagerås, A. N. (2005) A renewal-process-type expression for the moments of inverse subordinators. Journal of Applied Probability, 42(4), 1134–1144. © 2005 The Applied Probability Trust.

III Lagerås, A. N. and Martin-Löf, A. (2006) Genealogy for supercritical branching processes. Journal of Applied Probability, 43(4), 1066–1076. © 2006 The Applied Probability Trust.

IV Lagerås, A. N. (2006) A population model for Λ-coalescents with neutral mutations. Submitted.
Introduction and summary of the four papers
This thesis consists of four papers that concern different areas in probability
theory. The following pages have short summaries of each article for the
non-specialist probabilist.
1 Paper I: Central limit theorems for contractive Markov chains
The first article of this thesis is a joint work with Örjan Stenflo, and was
first published in Nonlinearity (2005), vol 18 no 5. It was also a part of my
licentiate thesis.
1.1 Markov chains as iterated function systems
Consider the following way of generating a Markov chain (Zn )n∈N on some
state space S. Let {wi }i∈I be a collection of functions defined on S. Given
that Zn = zn , draw a random variable Xn on I, whose distribution may
depend on zn , and let Zn+1 = wXn (Zn ). In fact, any Markov chain can
be described in this way,∗ with I = [0, 1] and Xn being uniform on I and
independent of Zn , but sometimes it is more natural to take Xn dependent
on Zn .
Example Let (Zn)n∈N have state space {0, 1, 2, 3} and transition matrix

              (  0   1/2   0   1/2 )
    (pij) =   ( 1/2   0   1/2   0  )
              (  0   1/4   0   3/4 )
              ( 3/4   0   1/4   0  )

which we can also describe with a state diagram: the four states 0, 1, 2, 3 arranged in a circle, where full (dashed) arrows indicate a step up (down) modulo 4, and the double (simple, half) arrowheads belong to jumps occurring with probability 3/4 (1/2, 1/4).

∗ At least if S is Borel, see Proposition 7.6 in Kallenberg, O. (1997) Foundations of Modern Probability. Springer.
One way of generating this Markov chain is to let

    w↓(z) ≡ z − 1 mod 4
    w↑(z) ≡ z + 1 mod 4
    wl(z) ≡ z − 1 mod 4 for z = 0, 1, and wl(z) ≡ z + 1 mod 4 for z = 2, 3,

and let X1, X2, ... be an i.i.d. sequence of random elements in {↓, ↑, l} such that P(Xn = ↓) = 1/4, P(Xn = ↑) = 1/2 and P(Xn = l) = 1/4 for all n, and then set

    Zn+1 = wXn(Zn) = wXn ∘ wXn−1(Zn−1) = · · · = wXn ∘ · · · ∘ wX1(Z0).
If we allow X1, X2, ... to be dependent on the values of Z1, Z2, ... we can describe the dynamics with only w↓ and w↑, namely by letting

    P(Xn = ↓ | Zn = 0, 1) = 1/2,    P(Xn = ↑ | Zn = 0, 1) = 1/2,
    P(Xn = ↓ | Zn = 2, 3) = 1/4,    P(Xn = ↑ | Zn = 2, 3) = 3/4,

which is arguably more natural.
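As an aside (not from the paper), this example is easy to simulate. Here is a minimal Python sketch of the place-dependent description with only w↓ and w↑; the empirical transition frequencies should approach the rows of the matrix (pij) above. All function names are mine:

```python
import random

def w_down(z):
    return (z - 1) % 4

def w_up(z):
    return (z + 1) % 4

# Place-dependent probabilities: P(X_n = down | Z_n = z).
P_DOWN = {0: 1/2, 1: 1/2, 2: 1/4, 3: 1/4}

def step(z, rng):
    """One step Z_{n+1} = w_{X_n}(Z_n), with X_n depending on Z_n."""
    return w_down(z) if rng.random() < P_DOWN[z] else w_up(z)

def empirical_transition_matrix(n_steps, seed=1):
    rng = random.Random(seed)
    counts = [[0] * 4 for _ in range(4)]
    z = 0
    for _ in range(n_steps):
        z_next = step(z, rng)
        counts[z][z_next] += 1
        z = z_next
    return [[c / max(sum(row), 1) for c in row] for row in counts]

P = empirical_transition_matrix(200_000)
# Rows should approach (0, 1/2, 0, 1/2), (1/2, 0, 1/2, 0),
# (0, 1/4, 0, 3/4) and (3/4, 0, 1/4, 0).
```
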
In the article, we investigate properties of Markov chains with a compact state space, typically a closed and bounded subset of R^n, that are generated by a collection of contractive maps {wi}i∈I, with I countable, and where the probability of Xn = i given Zn = z is pi(z) for some functions {pi}i∈I. Such collections ({wi}, {pi}) are called iterated function systems with place-dependent probabilities.
1.2 Limit theorems
When Zn →d Z, with Z having the stationary distribution of (Zn)n∈N, one typically has a law of large numbers for the Markov chain:

    (1/n) Σ_{i=1}^{n} f(Zi) → E[f(Z)]  a.s.,

or even a central limit theorem:

    (1/√n) Σ_{k=1}^{n} (f(Zk) − E[f(Z)]) →d N(0, σ²),    (1)
where f is a function from the state space S to R. One could also want to center the summands in (1) by E[f(Zk)] instead of E[f(Z)]. A so-called functional central limit theorem is a stronger version of a central limit theorem, in which

    (1/√n) Σ_{k=1}^{[nt]} (f(Zk) − E[f(Z)]) →d σBt,

where (Bt)0≤t≤1 is a standard Brownian motion.
1.3 Main result
Our main result concerns Markov chains that have contractive maps {wi}i∈I when they are viewed as iterated function systems; hence they are called contractive Markov chains. Under conditions on the smoothness of f and {pi}i∈I we obtain a functional central limit theorem. The conditions are such that a highly regular f allows for more "wild" {pi}i∈I, and vice versa. We also state the results with conditions on the rate of convergence towards the stationary distribution of the Markov chain. Often one considers chains for which the rate is, in a sense, "exponential", but our results also work with even slower convergence.
2 Paper II: A renewal-process-type expression for the moments of inverse subordinators
This article was first published in Journal of Applied Probability (2005), vol
42 no 4, and was a part of my licentiate thesis.
2.1 Renewal processes and beyond
One of the first stochastic processes that one is introduced to in a beginner's course in stochastic processes is the renewal process. It is simply a collection of points in time, events of some sort, such that the times between consecutive events are independent and identically distributed. One quantity of interest is Nt = "the number of events in [0, t]".
The simplest case of a renewal process is the Poisson process, which has
an exponential distribution for the times between events. The name comes
from the fact that Nt is Poisson distributed for all t. Calculations for the
Poisson process are greatly simplified by the fact that it is a Markov process
in continuous time, and the number of events in disjoint time intervals are
independent. For other renewal processes, one can hardly give any exact
results about the distribution of Nt .
An exception to this is that if one has an explicit expression for E[Nt], it can be used to calculate moments of arbitrary integer order for the joint moments of the increments of (Nt) over disjoint intervals. It is easiest to state the result with factorial moments instead of ordinary moments: E[Nt^[k]] = E[Nt(Nt − 1) · · · (Nt − k + 1)]. Note that these are the moments you get if you differentiate the probability generating function of a random variable and plug in 1:

    f(s) = E[s^Z]  ⇒  f^(k)(s) = E[Z^[k] s^(Z−k)]  ⇒  f^(k)(1) = E[Z^[k]].
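To make the identity concrete, here is a quick numerical check (my illustration, not part of the article): for a Poisson variable Z with mean lam, f(s) = exp(lam(s − 1)), so the k-th factorial moment should be lam^k.

```python
import math
import random

def falling_factorial(z, k):
    out = 1
    for j in range(k):
        out *= z - j
    return out

def factorial_moment(samples, k):
    """Empirical E[Z^[k]] = E[Z(Z-1)...(Z-k+1)]."""
    return sum(falling_factorial(z, k) for z in samples) / len(samples)

def sample_poisson(lam, rng):
    # Knuth's method: multiply uniforms until the product drops below e^{-lam}.
    threshold, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p < threshold:
            return k
        k += 1

rng = random.Random(7)
lam = 2.0
zs = [sample_poisson(lam, rng) for _ in range(100_000)]
m1 = factorial_moment(zs, 1)   # should be close to lam = 2
m2 = factorial_moment(zs, 2)   # should be close to lam^2 = 4
```
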
The state space of (Nt)t∈R+ is clearly N0, the non-negative integers. I was searching in the literature for a simple increasing process that, like a renewal process, had some dependence between increments over disjoint time intervals, but in contrast to renewal processes had the whole of R+ as its state space. Here is where the "inverse subordinators" of the title of this article enter. These processes are the analogs of renewal processes when the state space is R+. To see why, note that (Nt) is the inverse of the random walk Sn = Σ_{k=0}^{n} Xk with independent steps X0, X1, ..., corresponding to the times between events of the renewal process: Nt = min(n ∈ N0 : Sn > t).
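This inverse relation gives a direct way to simulate Nt. A small sketch (the function and parameter names are mine), using Exp(1) steps so that the resulting renewal process is a Poisson process:

```python
import math
import random

def renewal_count(t, interarrival, rng):
    """N_t = min(n in N_0 : S_n > t), where S_n = X_0 + ... + X_n is the
    random walk of interarrival times, as in the text."""
    s, n = 0.0, 0
    while True:
        s += interarrival(rng)
        if s > t:
            return n
        n += 1

rng = random.Random(0)
exp1 = lambda r: -math.log(1.0 - r.random())   # Exp(1) steps X_k

counts = [renewal_count(10.0, exp1, rng) for _ in range(50_000)]
mean_N = sum(counts) / len(counts)
# With Exp(1) steps, N_10 is Poisson(10), so mean_N should be close to 10.
```
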
Figure 1: To the left we have a realization of the renewal process (Nt ), and
to the right its inverse, the random walk (Sn ).
The renewal process has state space N0 , since (Sn ) only changes value at
times in N0 , see figure 1.
The equivalents of random walks in continuous time are called Lévy processes, and increasing Lévy processes are called subordinators. In this sense, we can say that the equivalents of renewal processes with a continuous state space are the inverse subordinators constructed as τt = inf(s ∈ R+ : Ys > t), where (Ys) is a subordinator, see the top part of figure 3.
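A rough simulation sketch of this construction (my illustration; the subordinator is chosen, as in figure 3, to be a compound Poisson process with unit drift and exponential jumps, and all names and parameters are assumptions):

```python
import math
import random

def inverse_subordinator(t_max, jump_rate, jump_mean, rng):
    """Return tau(t) = inf(s : Y_s > t) for a subordinator
    Y_s = s + (sum of Exp-distributed jumps arriving at rate jump_rate),
    i.e. a compound Poisson process with unit drift."""
    # Record (s, Y_s) just after each jump until Y exceeds t_max.
    pieces = [(0.0, 0.0)]
    s = y = 0.0
    while y <= t_max:
        wait = -math.log(1.0 - rng.random()) / jump_rate
        s += wait
        y += wait                                        # drift part, slope 1
        y += -jump_mean * math.log(1.0 - rng.random())   # the jump
        pieces.append((s, y))

    def tau(t):
        for (s0, y0), (s1, y1) in zip(pieces, pieces[1:]):
            y_pre = y0 + (s1 - s0)       # value just before the next jump
            if t <= y_pre:
                return s0 + (t - y0)     # invert the sloping part
            if t <= y1:
                return s1                # t falls inside a jump: tau is flat
        return s                         # beyond the simulated range

    return tau

rng = random.Random(3)
tau = inverse_subordinator(t_max=50.0, jump_rate=0.5, jump_mean=2.0, rng=rng)
# tau is non-decreasing and, since Y_s >= s here, we always have tau(t) <= t.
```
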
2.2 Cox processes
The Poisson process mentioned above is also referred to as a homogeneous
Poisson process, since the process is homogeneous in time. One can extend
this process to an inhomogeneous process and still keep the Poisson property,
namely that Nt − Ns , the number of points in the interval (s, t], is Poisson
distributed, but the renewal property will then be lost. In the general case,
the mean of Nt − Ns is λ((s, t]), where λ is a given measure on R, called the
intensity measure. See figure 2 for an illustration. In the homogeneous case
λ((s, t]) = c(t − s), for some positive constant c.
If we let Λ be a random measure and, given a realization Λ = λ, let Π be the points of the corresponding inhomogeneous Poisson process, we get a point process Π that is called a Cox process. Note that a Poisson random variable Z with mean l has probability generating function f(s) = e^(l(s−1)) and thus factorial moments E[Z^[k]] = l^k. Generalizing this result, we obtain the well-known relation that the factorial moments of a Cox process equal the ordinary moments of its random measure. This is also true for joint moments.

Figure 2: The ×'s denote the points of an inhomogeneous Poisson process, whose intensity measure of (0, t] is given by the dashed line. The measure has a density, which is illustrated by the full lines. Note that no points can occur where this density is zero.
It is also known that if we produce our random measure with the aid of an inverse subordinator, by letting Λ((s, t]) = τt − τs, then the Cox process is also a renewal process.
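The moment relation above is easy to check numerically in the simplest toy case (my example, not from the article): take a random measure whose total mass W is Exp(1), so that the number of Cox points N is mixed Poisson, and compare the factorial moments of N with the ordinary moments of W, here E[W] = 1 and E[W²] = 2:

```python
import math
import random

def sample_poisson(lam, rng):
    # Knuth's method.
    threshold, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p < threshold:
            return k
        k += 1

rng = random.Random(11)
n_sim = 200_000
samples = []
for _ in range(n_sim):
    w = -math.log(1.0 - rng.random())       # random total mass W ~ Exp(1)
    samples.append(sample_poisson(w, rng))  # N | W ~ Poisson(W)

mean_N = sum(samples) / n_sim                       # should approach E[W] = 1
fact2 = sum(n * (n - 1) for n in samples) / n_sim   # should approach E[W^2] = 2
```
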
2.3 Main result
It can be as hard to calculate properties of inverse subordinators as it is
for renewal processes, but not necessarily harder! The main result of the
article is an expression for the joint moments of arbitrary integer order for
the increments of any inverse subordinator, that is similar to the already
known expression for factorial moments of renewal processes. A sketch of
the proof goes as follows: By constructing a random measure from a given
inverse subordinator, and from that a Cox process as above, we are left
with a Cox process that is also a renewal process. Ordinary moments of
the inverse subordinator equal the corresponding factorial moments of the
constructed Cox process, and since the Cox process by construction also is
a renewal process, we can calculate those factorial moments. Other results
from renewal theory also carry over to inverse subordinators with this device.
Figure 3:
Top: A realization of an inverse subordinator, in fact the inverse of a compound Poisson process with drift. This implies that the sloping parts have i.i.d. exponential lengths, and the flat parts are also i.i.d. according to some distribution.
Middle: A graph of (d/dt)τt and a realization of the Cox process Π with (τt) as its intensity measure.
Bottom: The counting process Nt associated to Π, which is both a renewal and a Cox process.
3 Paper III: Genealogy for supercritical branching processes
This article is a joint work with Anders Martin-Löf, and was first published
in Journal of Applied Probability (2006), vol 43 no 4.
3.1 Compound distributions
In order to understand the results of this article, one needs to know what a compound distribution is. We say that a random variable X is compound-N if

    X =d Σ_{i=1}^{N} Yi,

where N, Y1, Y2, ... are independent, N has a distribution on the non-negative integers, and Y1, Y2, ... all have the same distribution.
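Sampling from a compound distribution is immediate from the definition. A minimal sketch (the names and the toy choices of N and Y are mine), together with Wald's identity E[X] = E[N]E[Y] as a check:

```python
import random

def sample_compound(sample_N, sample_Y, rng):
    """Draw X = Y_1 + ... + Y_N with N and the Y_i independent."""
    return sum(sample_Y(rng) for _ in range(sample_N(rng)))

rng = random.Random(5)
sample_N = lambda r: r.randrange(4)   # N uniform on {0, 1, 2, 3}, E[N] = 1.5
sample_Y = lambda r: r.random()       # Y uniform on (0, 1), E[Y] = 0.5

xs = [sample_compound(sample_N, sample_Y, rng) for _ in range(100_000)]
mean_X = sum(xs) / len(xs)
# Wald's identity: E[X] = E[N] E[Y] = 0.75.
```
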
3.2 Branching processes in continuous time
A Markov branching process in continuous time is a random process, whose
value at any given time is the number of individuals alive in a population
that evolves as follows. At time t = 0 there is one individual. She lives for an
exponentially distributed time with intensity µ, and when she dies, she gives
birth to a random number of children, distributed as X, say. Each individual
has life length and offspring size that have the same distributions as those
of her mother and are furthermore independent of those of her sisters. The
evolution carries on in the same way with the grandchildren, etc.
If E[X] = 1, E[X] < 1 or E[X] > 1, the process is called critical, subcritical or supercritical, respectively. It is well known that the process dies out almost surely if it is critical or subcritical. If it is supercritical, the probability of extinction, q, is strictly less than one, and in the case of non-extinction the population size tends to infinity as t → ∞.
In this article we investigate the distribution of the number of individuals
in supercritical branching processes. An individual in the population will at
time t have an infinite line of descent, i.e. descendants at all further times,
if the branching process that starts with her as an ancestor tends to infinity
in size. This will happen with probability p = 1 − q independently of what
happens with the descendants of the other individuals in the population at
time t. This implies that the number of individuals who have an infinite line
of descent will have a binomial distribution with parameters n and p, given
that the population has size n at time t. We let N ≡ 0 if N ∼ Bin(0, p).
Figure 4: This picture illustrates a supercritical branching process. The full lines denote individuals who have an infinite line of descent, and the dashed lines those who have not. The 13 individuals that are alive at time t are denoted by circles. Each of those 13 individuals has a chance p of having an infinite line of descent, independently of all the others. In the picture, 8 of the 13 have that, and are denoted by full circles. Deaths of individuals who leave no children after themselves are denoted by a cross. At the rightmost part of the tree, the crosses also denote lineages that will eventually die out, whereas the stars denote infinite lineages.
This relation means that it sometimes suffices to study the subpopulation of individuals who have an infinite line of descent in order to understand the dynamics of a supercritical branching process in general. It turns out that this subpopulation can also be described as a branching process, see figure 4, and this branching process has X ≥ 2.
3.3 Binary branching: the Yule process
A branching process that has X ≡ 2, i.e. only binary branching, is called a Yule process. It is one of the very few types of branching processes whose distribution can be calculated exactly. At any time t, the number of births until t, which we call Nt, has a geometric distribution with parameter e^(−µt). Since a death-and-birth event creates a net increase of one, and we start with one individual, the size of the population at t is 1 + Nt.
It is easy to show that, given Nt = n, the birth times in the population, τ(1) < τ(2) < ..., have the same distribution as an ordered sample of size n from independent τ1, τ2, ..., that have a certain distribution F depending on t and µ.
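The geometric law of Nt can be verified by direct simulation: with k individuals alive, the time to the next split is exponential with rate kµ. A sketch (mine, not from the thesis):

```python
import math
import random

def yule_births(t, mu, rng):
    """Number of births N_t up to time t in a Yule process (binary
    branching at rate mu per individual) started from one individual."""
    clock, size, births = 0.0, 1, 0
    while True:
        # With `size` individuals the next split occurs after Exp(size * mu).
        clock += -math.log(1.0 - rng.random()) / (size * mu)
        if clock > t:
            return births
        size += 1      # one death, two children: net increase of one
        births += 1

rng = random.Random(2)
mu, t = 1.0, 1.0
ns = [yule_births(t, mu, rng) for _ in range(100_000)]
mean_N = sum(ns) / len(ns)        # should approach exp(mu t) - 1
p_zero = ns.count(0) / len(ns)    # should approach exp(-mu t)
```
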
3.4 Main result
If Zt is the size at time t of a branching process Z with X ≥ 2, e.g. the subpopulation of individuals with an infinite line of descent in a supercritical branching process, then Zt − 1 is compound geometric, or more exactly:

    Zt =d 1 + Σ_{i=1}^{Nt} Yi,    (2)

where Nt is geometric with parameter e^(−µt) and independent of Y1, Y2, ..., which are all i.i.d. as some Y. The random variable Y − 1 itself also has a compound distribution:

    Y =d 1 + Σ_{j=1}^{X−2} Z^(j)_{t−τ},    (3)

where Z^(1)_{t−τ}, ... are i.i.d. as the value of the process Z at a random time t − τ, where τ has distribution F as in the previous section about the Yule process.
This should be understood as follows. We can embed a Yule process Ẑ in any branching process with X ≥ 2, simply by picking out exactly two of the ancestor's children, two children from each of the offspring of those two, etc., see figure 5. Let Ẑt be the number of individuals in this subpopulation at time t. We can now decompose Zt:

    Zt = Ẑt + (Zt − Ẑt) = 1 + Nt + (Zt − Ẑt),

where (Zt − Ẑt) counts all individuals that are related to sisters of the individuals in Ẑt. A time of a birth in Ẑ will be distributed as τ if we pick it uniformly at random from all the birth times between 0 and t. An additional number of X − 2 individuals will be born in Z at that time. Each of these individuals will start an independent branching process that will evolve as Z, but only during a random time of length t − τ. Hence

    (Zt − Ẑt) =d Σ_{i=1}^{Nt} Σ_{j=1}^{X^(i)−2} Z^(ij)_{t−τi},

with X^(1), X^(2), ... being i.i.d. as X and all Z^(ij)_t i.i.d. as Zt. If we put all this together, we arrive at (2) and (3). This reasoning is quite heuristic, but in the article the results are proved rigorously with the aid of generating functions and their properties.

Figure 5: A branching process with X ≥ 2. We have also embedded a Yule process in this process (the full lines). The ×'s denote the times of birth in the Yule process. Consider the time τ, at which a birth in both Z and Ẑ occurs. Here X = 4, and thus the two new individuals in Ẑ have two (= X − 2) sisters, indicated by the arrows, that start their own independent branching processes whose sizes at time t are distributed as Zt−τ.
4 Paper IV: A population model for Λ-coalescents with neutral mutations
This article has been submitted for peer-review.
4.1 Population models
The problems studied in this article come from the field of theoretical population dynamics. Consider a sample of n individuals from some very large
population. We want to know how these individuals are related to each other.
When we trace their lineages backwards in time and reach a common ancestor of some, or all, of the individuals in our sample, in effect we have reached
a branching point on the family tree of the individuals in the sample. We
continue until we have reached the most recent common ancestor of all the
individuals.
There are two natural conditions to be put on this process of coalescing
lineages. First, it should be Markov. Second, it should be consistent in the
following sense. If we draw the family tree of a sample of n + 1 individuals
and then delete the branch of one of the individuals, the resulting tree should
have the same distribution as if we started with n individuals and did not
delete any branch. If both these conditions are fulfilled, the dynamics of the process can be completely parametrized by a finite measure Ξ on the infinite simplex Δ = {(x1, x2, ...) : x1 ≥ x2 ≥ ..., Σ_{i=1}^{∞} xi ≤ 1}. These processes are called Ξ-coalescents or coalescents with simultaneous multiple collisions.
We only consider processes with at most one collision at any given time, i.e. we will never reach two or more ancestors to different groups in our sample at the same time. In this case the process is completely parametrized by a finite measure Λ on [0, 1], and is called a Λ-coalescent or a coalescent with multiple collisions. We say that the process has (possibly) multiple collisions since more than two lineages may reach their common ancestor at any given time.

Figure 6: Example of population dynamics without mutations. The population consists of three families represented by circles. An individual denoted by a cross is picked at random, and in the next step to the right she has begot offspring denoted by the black circle. The total area of the circles is constant throughout the series.
For the dynamics of the population this means that the common ancestor had such a large number of offspring at the time of collision that it constituted a considerable fraction of all of the population in the next generation, since it would otherwise be highly unlikely that more than two individuals in your sample, representing different lineages, would happen to have the same mother at any given generation. This reasoning holds if the original sample was sampled uniformly from the entire population, as we have assumed.

The population thus evolves in jumps, where an individual is picked uniformly at random and begets offspring with size being a fraction, say X, of the total population, and the rest of the population is scaled down by a factor 1 − X. See figure 6 for an illustration with X = 0.5, 0.4 and 0.5. Note that large families grow at the expense of the smaller ones, since it is more probable that a mother is picked from a large family than from a small one.

In reality one can often not observe the genealogy of a sample directly, but only partition the sample into groups according to their genetic make-up. Lineages have different genotypes because of mutations that introduce new types that have never been seen before in the population. In the article a model is described, in which mutations occur with a small probability between generations, such that mutations appear with some constant rate when tracing a lineage backwards in time. This has been studied before in the sense of dynamics of the sample, but the novel idea in my paper is a description of the dynamics of the whole population, such that any sample behaves as described earlier.

Figure 7: The gray circles denote all the "mutants", or singletons, and only grow by erosion. If a mutant begets some offspring, the resulting family will no longer count among the singletons. Above, when moving from left to right, mass is eroded and added to the mutants. When moving from right, down to the left, the individual denoted by a cross begets offspring indicated by the black circle.
The idea is simply that since any lineage mutates at constant rate, the
families of the population erode by the same constant rate. The “mutants” all
have different genotypes, so they constitute “infinitesimally” small families
in the population. Nevertheless it can happen that at the time of a jump in
the process, a mother is picked among the “mutants” and from that moment
on, her family makes up a positive fraction of the population, see figure 7 for
an example.
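The jump dynamics of the population (without mutations) can be sketched in a few lines. This is my toy illustration, not the paper's construction: the fixed law of X and the rule that the offspring join the mother's family are assumptions made for the demo:

```python
import random

def evolve_families(p, n_jumps, sample_X, rng):
    """One jump: a mother is picked uniformly among individuals (family j
    with probability p[j]); she begets offspring of fraction x of the
    population, the rest is scaled down by 1 - x, and the offspring are
    added to her family."""
    p = list(p)
    for _ in range(n_jumps):
        u, acc = rng.random(), 0.0
        for j, pj in enumerate(p):     # pick a family proportional to its size
            acc += pj
            if u <= acc:
                break
        x = sample_X(rng)
        p = [pj * (1.0 - x) for pj in p]
        p[j] += x
    return p

rng = random.Random(9)
sample_X = lambda r: 0.5 * r.random()    # stand-in for the law of X
p = evolve_families([0.25] * 4, n_jumps=200, sample_X=sample_X, rng=rng)
# The proportions still sum to one, and large families have tended to
# grow at the expense of small ones.
```
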
4.2 Main result
The main result of the paper is that a partition of a sample into families will have the correct sampling distribution when the sample is picked from a population that has evolved according to the proposed dynamics for a long time. By the correct distribution, we mean that the distribution is the same as for a sample from a coalescent process with mutations, a type of process that has been investigated by others earlier. This shows that the proposed model for the whole population has the right dynamics.