Download Bayesian inference for stochastic multitype epidemics in structured

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the work of artificial intelligence, which forms the content of this project

Document related concepts

Hospital-acquired infection wikipedia , lookup

Pandemic wikipedia , lookup

Transcript
Bayesian inference for stochastic multitype epidemics
in structured populations via random graphs
Nikolaos Demiris
Medical Research Council Biostatistics Unit, Cambridge, UK
and Philip D. O’Neill 1
University of Nottingham, UK
Summary. This paper is concerned with new methodology for statistical inference
for final-outcome infectious disease data using certain structured-population stochastic
epidemic models. A major obstacle to inference for such models is that the likelihood
is both analytically and numerically intractable. The approach taken here is to impute
missing information in the form of a random graph that describes the potential infectious
contacts between individuals. This level of imputation overcomes various constraints of
existing methodologies, and yields more detailed information about disease spread. The
methods are illustrated with both real and test data.
Keywords: Bayesian inference, epidemics, Markov chain Monte Carlo Methods, MetropolisHastings algorithm, random graphs, stochastic epidemic models
1
Introduction
This paper is concerned with the problem of inferring information about disease spread
given data on the final outcome of an epidemic in a structured population. Before
outlining our approach, we begin by briefly recalling relevant background material.
Stochastic epidemic models that incorporate structured populations have become a subject of considerable research activity in recent years. Examples include independenthousehold models (e.g. Longini and Koopman, 1982; Becker and Dietz, 1995; Becker
and Hall, 1996), models with two levels of mixing (e.g. Ball et al., 1997; Ball and
Lyne, 2001; Demiris and O’Neill, 2005), random network models (e.g. Andersson, 1999;
Britton and O’Neill, 2002), and social cluster models (e.g. Schinazi, 2002). The basic
motivation for such work is that, in contrast to epidemic models that assume a homogeneously mixing population of individuals, most human populations contain inherent
1
Address for correspondence: School of Mathematical Sciences, University of Nottingham, Nottingham NG7 2RD, UK.
Email: [email protected]
2
structure because individuals usually spend their time in various groups such as dwelling
places, work places, childcare facilities etc.
Our focus here is on so-called two-level-mixing models, defined formally below. These
models, introduced in Ball et al. (1997), describe a population partitioned into groups
in which infectious contacts can occur both locally within a group, and globally between
groups. The basic inference problem is then to estimate the local and global infection
rates, given knowledge of the underlying structure, and data that indicate which individuals in the population ever became infected during an epidemic. This problem is
complicated because of the model dependence structures. Specifically, global infections
are explicitly described in the model, and so the number infected in a given group is
not independent of the numbers infected in other groups. This in turn means that the
likelihood cannot be expressed as a simple product over group outcome, and in fact is
intractable in any case of practical concern. This problem can be partly overcome by
simply assuming independence between households, which is a reasonable assumption
in a large population, and moreover is asymptotically the case as the number of groups
tends to infinity. In particular, it is then possible to approximate the two-level mixing model with a simpler independent-groups model possessing a tractable likelihood.
In such models, local mixing occurs as before, but global mixing is replaced by the
assumption that each individual independently avoids infection from outside its group
with some fixed probability. Inference for these models is possible in a variety of ways,
see e.g. Addy et al. (1991), Becker and Dietz (1995) and Li et al. (2002). The use
of an independent-households model as an approximation for a two-level-mixing model
underlies the statistical analyses in Ball et al. (1997), Britton and Becker (2000), Ball
and Lyne (2004), and Demiris and O’Neill (2005). An attractive aspect of our methods
is that they dispense with the need for such approximation.
Another difficulty with performing inference for epidemics (and other models with
threshold behaviour, e.g. branching and contact processes) arises due to the bimodal
nature of realisations. Typically, either an epidemic dies out quickly, or else it infects
a fraction of the population which, in a large population, is approximately Gaussian
(see e.g. Andersson and Britton, 2000, Chapter 4). Many statistical analyses require
the assumption that the epidemic has taken off, this being expressed by requiring that
a threshold parameter R exceeds unity (e.g. Becker, 1989, Chapter 8; Rida, 1991; Ball
and Lyne, 2004; Demiris and O’Neill, 2005). Such an assumption (i) often leads to
underestimation in the variability of model parameters or the threshold parameter, and
(ii) is clearly not desirable when attempting to infer control strategies which require
that R < 1. The methods that we present here are free from this restriction.
For intractable likelihood problems, such as that which concerns us, one solution is to
3
augment the parameter space, adding ‘missing data’ or other quantities that then yield
a tractable likelihood. Although such methods are widely-used, in the current context
it is far from obvious what should be imputed. Demiris and O’Neill (2005) describe
an approach based on imputation of the so-called final severity of the epidemic, which
leads to an approximate analysis involving an independent-groups model, as mentioned
above. The key idea in the present paper is instead to impute much more detailed
information about the epidemic, namely the set of susceptible individuals that each
infected individual would infect if no other infections were permitted. This information
is conveniently described by a random digraph in which links correspond to potential
infections.
Three points about our approach should be noted. First, it is apparently ambitious,
since a great deal of extra information is imputed. However, as described later, it is
typically the case that only relatively few digraphs are likely to be compatible with
the data, and thus the imputation is practicable. Second, some problems involving
temporal data, such as weekly case incidence counts, have been approached by imputing
missing information in the form of infection pathways (Haydon et al., 2003; Wallinga
and Teunis, 2004). Although superficially related to our approach, in fact there are
fundamental differences, namely (i) our data are not temporal; (ii) we have to deal with
an intractable likelihood; and (iii) we do not impute the infection pathway itself. Third,
although we focus here on two-level-mixing models, in fact our approach has very wide
applicability, as will be outlined later.
The paper is structured as follows. Sections 2 and 3 contain, respectively, the epidemic
model of interest and the associated random digraph. Section 4 describes the data and
augmented likelihood obtained when the imputed information is employed. An MCMC
algorithm is described in Section 5, and illustrated in Section 6, while Section 7 contains
some concluding remarks.
2
Multitype epidemic models with two levels of mixing
In this section we describe the epidemic model of interest, and an associated threshold
parameter that will be of importance in the sequel.
4
2.1
Multitype two-level mixing model
The following model of a continuous-time epidemic process is defined in Ball and Lyne
(2001). Consider a closed population of N individuals, labelled 1, . . . , N , that is partitioned into groups (e.g. households, farms) of varying sizes. Suppose that the population
P
contains mj groups of size j, and let m = ∞
j=1 mj be the total number of groups. Thus
P∞
N = j=1 jmj . In addition, each individual in the population is assumed to be one of
a possible k types, these typically representing categorical covariates (e.g. age, vaccination status, previous infection history). For i = 1, . . . , k, let Ni denote the total number
of individuals in the population of type i. For convenience, suppose that the possible
types are labelled 1, . . . , k, and for j = 1, . . . , N denote by τ (j) the type of individual j.
Each individual in the population can, at any time t ≥ 0, be in one of three states,
namely susceptible, infective, or removed. A susceptible individual is healthy and may
contract the disease in question. An infective individual has become infected, and
moreover can transmit the disease to others. A removed individual is one who is no
longer infectious, and plays no part in further disease spread. In practice this could occur
either because of actual immunity (induced by antibodies), or by isolation following the
appearance of symptoms. If a type i individual, j say, becomes infected, then they
remain so for a random time I(j) whose distribution is the same as some specified nonnegative random variable Ii . The random variables describing the infectious periods
I(j), j = 1, 2, . . . of different individuals are assumed to be mutually independent.
The epidemic is initiated at time t = 0 by a (typically small) number of individuals becoming infectious. During its infectious period, an individual of type i makes infectious
contacts with each type j susceptible, independently of other individuals, at times given
by the points of a Poisson process of rate λG
ij /Nj . In addition, and independently, the
infectious individual also makes infectious contacts with each type j susceptible in its
own group according to a Poisson process of rate λLij . Once contacted by an infective, a
susceptible individual immediately becomes infectious. At the end of its infectious period, an infective individual becomes removed, and plays no further part in the epidemic.
The epidemic ceases as soon as there are no infectives present in the population.
Two points concerning realism should be noted. First, the model does not include a
latent period, i.e. a time period between infection and infectiousness of an individual.
However, the distribution of final numbers infected in the model is invariant to the
inclusion of any sensible latent period, as described below, and in particular the finaloutcome-data scenario that is under consideration here. Second, the stipulation that the
global and local infection rates have different scalings is common practice in two-level
mixing models. It corresponds to the assumption that an individual would, on average,
5
make more local contacts if their group size increased, but would not make more global
contacts if the entire population increased in size.
2.2
Threshold parameter
Threshold parameters, known in some contexts as basic reproduction numbers, are of
fundamental importance both in stochastic epidemic theory (Andersson and Britton,
2000, p.6), and epidemiology (Farrington et al., 2001). Typically, for a stochastic epidemic model, there is a parameter R such that epidemics in an infinite population
of susceptibles are almost surely finite if and only if R ≤ 1. Such results essentially
originate in branching process theory, in that the early stages of an epidemic are approximately identical to a suitable branching process. In practical terms, the goal of
most disease control measures is to ensure that the value of R is reduced to below unity.
A threshold parameter for the multitype two-level mixing model can be obtained by
allowing the population to become large globally, i.e. by allowing the number of groups,
m, to tend to infinity. The details are given in Ball and Lyne (2001), and are essentially
as follows. Define the k×k matrix M := (mij ) where mij is the average number of global
contacts to type j individuals made by a group in which the first infected individual is
of type i. The threshold parameter, R∗ , is then defined as the maximal eigenvalue of
M . It follows that, in a large population, epidemics are extremely unlikely to occur if
R∗ ≤ 1. Calculation of R∗ in practice involves computing M ; explicit details of how to
do this are given in Ball et al. (2004).
2.3
Distribution of final outcome
We shall be interested in the final outcome of the above model, i.e. the numbers of
initially susceptible individuals of each type in each group who ever become infected
during the epidemic. As described in Ball and Lyne (2001), it is possible in principle
to write down a triangular set of linear equations, the solution of which furnishes us
with the required joint probability mass function. However, in practice this system of
equations can be numerically intractable, even for small population sizes. Such problems
are well-known in epidemic modelling (e.g. Andersson and Britton, 2000, p.18), and
arise because the final outcome probabilities that are typically of interest are derived
recursively using probabilities that are often very close to zero, and in particular which
may be outside the range of normal machine accuracy. An extra complication is that
for non-homogeneous population models, such as that of interest here, the number of
equations themselves can be enormous.
6
In the present context, these numerical problems mean that the most natural likelihood,
namely that obtained by calculating the probability of the observed data given a parameter set, is intractable. To overcome this, we shall adopt a form of data imputation
using a random graph, to which we now turn our attention.
3
Random Digraphs
We now describe a representation of the final outcome of the epidemic model in terms of
an associated random graph. Such representations have been considered by a number of
authors (e.g. Ludwig, 1975; Barbour and Mollison, 1990; Islam et al., 1996; Andersson
and Britton, 2000, Chapter 7), although to our knowledge such graphs have never been
directly used for the purposes of statistical inference.
Consider the following random directed graph (digraph) on N vertices labelled 1, . . . , N .
Vertex j has type τ (j), where 1 ≤ τ (j) ≤ k, and there are Ni vertices of type i in total.
Associate with each vertex j of type i a random ‘lifetime’ I(j) which is a realisation
of a non-negative random variable Ii . The lifetimes of different vertices are assumed
independent. The set of vertices is partitioned into groups, not necessarily of equal size.
For j = 1, . . . , N and l = 1, . . . , k insert a global edge from vertex j to each vertex of
type l with probability 1 − exp(−I(j)λG
τ (j)l /Nl ), and insert a local edge to each vertex of
type l in the same group as j with probability 1−exp(−I(j)λLτ(j)l ). The insertion of each
possible edge is independent of the presence or absence of all other edges. Note that no
edge can be drawn from a vertex to itself. Finally, say that a vertex j is directionally
connected to a vertex l if and only if there exists a path from j to l, and for each
vertex j define Cj as the set of vertices to which j is directionally connected. For later
convenience we adopt the convention that j ∈ Cj .
The relationship between the random digraph and the epidemic model is as follows (see
also Andersson and Britton, Chapter 7). Each vertex corresponds in the obvious way to
an individual, and the lifetime random variables correspond to infectious periods. The
edges emanating from vertices do not correspond directly to infections, although they
can be thought of as corresponding to potential infections. For example, the probability
of a local edge from individual j to individual l in the same group is 1−exp(−I(j)λLτ(j)l ),
which is the probability that j ever infects l in the epidemic model, provided that j ever
becomes infected and that l remains susceptible until contacted by j. An equivalent view
is that the edges describe all contacts between individuals, and that these only result
in infection when the vertices concerned form an infective-susceptible pair. Without
loss of generality, suppose that individuals 1, . . . , a are initially infective in the epidemic
model. Then the random set of individuals who are ultimately infected in the epidemic
7
has the same distribution as the random set of vertices ∪aj=1 Cj .
Note that the random digraph does not contain any temporal information about the
epidemic, but instead is defined in terms of infection probabilities. In particular this
means that inclusion of a latent period in the epidemic model, provided it is almost
surely finite, makes no change to the digraph, and hence makes no change to the final
outcome distribution. Similarly, the infectious period in the original model need not be
continuous, but could be modelled as separate disjoint parts in real time (e.g. corresponding to daytime-only contact). Nor does the digraph explicitly represent the actual
route of infections: a vertex corresponding to an individual who does become infected
might have more than one potential infector in the graph. However, given a realisation
of the digraph it is straightforward to calculate the probability of any possible infection
pathway, if desired.
4
4.1
Data and augmented Likelihood
Final outcome Data
We consider data of the form n = {n(s1 , . . . , sk ; i1 , . . . , ik )}, where n(s1 , . . . , sk ; i1 , . . . , ik )
denotes the number of groups containing sj initially susceptible individuals of type j, of
whom ij ever become infected, where j = 1, . . . , k. Our focus is on Bayesian statistical
inference for the two infection rate matrices ΛL := (λLij ) and λG := (ΛG
ij ), given n. By
L
G
Bayes’ Theorem, the posterior density of interest, π(Λ , Λ | n), satisfies
π(ΛL , ΛG | n) ∝ π(n | ΛL , ΛG )π(ΛL , ΛG ),
where π(n | ΛL , ΛG ) denotes the likelihood and π(ΛL , ΛG ) denotes the joint prior density
of (ΛL , ΛG ). However, as described in Section 2.3, the likelihood is analytically and
numerically intractable in any case of interest, and so something extra is needed. Before
addressing this we make two observations.
First, final outcome data contain no temporal information. Specifically, there is no
information regarding the mean length of the infectious period. Consequently, we fix the
infectious period distribution in advance of any data analysis. This implicitly creates
a time-scale, with respect to which the values of ΛG and ΛL (but not R∗ ) should be
interpreted.
Second, provided the data are not of single type, we would expect that the model parameters are not all identifiable. With sufficiently many different compositions of groups
in the data, all of the ΛL parameters are identifiable. However, the only information
8
available about global infection is, essentially, the numbers of each type infected (k data
points), which is insufficient for the k 2 parameters of ΛG (c.f. Britton, 1998). Although
in principle the MCMC algorithm we describe below can function without regard to this
problem, in practice it is usually pragmatic to consider a reduced-parameter model that
involves extra constraints on ΛG . Examples of this are considered later.
4.2
Augmented Likelihood
In order to surmount the difficulty of an intractable likelihood, we consider augmenting
the parameter space by including a digraph describing potential infectious contacts.
Note that it is only necessary to consider this digraph on vertices corresponding to
individuals who ever become infected in the epidemic, according to the data. This is
because (i) none of these individuals can have contacted any individual who escapes
infection, so there is no need to impute such potential edges, and (ii) the data contain
no information about potential (but never realised) edges from individuals escaping
infection.
To be precise, suppose now that the total number of individuals in the population who
ever become infected is n, labelled 1, . . . , n, and define G as the digraph on these n
vertices. For j = 1, . . . , n let I(j) denote the lifetime random variable corresponding
to vertex j, distributed according to Iτ (j) , and I = (I(1), . . . , I(n)). As before, let N
denote the total number of individuals who are initially susceptible. It is necessary to
assume that at least one individual is initially infective: we assume henceforth that
there is exactly one individual, whose label is κ, although an arbitrary number of initial
infectives is easily catered for.
The augmented posterior density is
π(ΛL , ΛG , G, I, κ | n) ∝ π(n | ΛL , ΛG , G, I, κ)π(G | ΛL , ΛG , I, κ)π(I)π(κ)π(ΛL , ΛG ),
(4.1)
where the last three terms on the RHS are prior densities. Note that π(I) is simply a
product of the individual densities of the I(j) terms, i.e. it is specified by the model
assumptions. It remains to evaluate the other two RHS terms in (4.1), which effectively correspond to the augmented likelihood. The π(n | ΛL , ΛG , G, I, κ) term is the
probability of there being no edges from the n vertices in G to the remaining N − n
outside G, provided that κ is directionally connected to every other vertex in G, i.e.
Cκ = {1, . . . , n}. If the latter does not hold then π(n | ΛL , ΛG , G, I, κ) = 0, since G is
then incompatible with the observed data. The term π(G | ΛL , ΛG , κ, I) is simply the
probability of the edges in G and is straightforward to write down.
9
For i = 1, . . . , n and j = 1, . . . , k, define the following quantities. Let νijL denote the
number of local edges from vertex i to type j vertices, and define νijG similarly for global
edges. Let NijL denote the number of individuals in i’s group of type j. Note that
NijL includes individuals who are never infected during the epidemic, and will include
individual i if τ (i) = j. Finally, define δ(κ, G; n) = 1{Cκ ={1,...,n}} and ∆ij = 1{τ (i)=j} ,
where 1A denotes the indicator function of the set A. Thus δ(κ, G; n) is simply the
indicator function of the event that, given the initial infective κ, G is compatible with
the observed data n. Then
L(ΛL , ΛG , G, I, κ) := π(n | ΛL , ΛG , G, I, κ)π(G | ΛL , ΛG , I, κ) = δ(κ, G; n)×
n Y
k
Y
©
ªν L
1 − exp(−I(i)λLτ(i)j ) ij exp(−I(i)λLτ(i)j (NijL − ∆ij − νijL ))
i=1 j=1
ªνijG
©
G
1 − exp(−I(i)λG
exp(−I(i)λG
τ (i)j /Nj )
τ (i)j (Nj − ∆ij − νij )/Nj ).
(4.2)
Computation of (4.2) is straightforward in practice, the only minor challenge being the
δ(κ, G; n) term. Note that in the above formulation, the lifetimes I(i), i = 1, . . . , n
are included as extra model parameters. It is possible to integrate these out of (4.2)
by multiplying out the {1 − exp(· · · )}··· terms and taking expectations, exploiting the
independence of the I(i) terms. However, the resulting expressions consist of alternating
sums and possess poor numerical stability, so in most scenarios it is expeditious to retain
the lifetimes as parameters.
5
Markov chain Monte Carlo algorithm
In order to obtain samples from the posterior density defined at (4.1), we now define
a Metropolis-Hastings algorithm (see e.g. Gilks et al., 1996) in which the parameters
are updated in blocks in the following manner. Updates for parameters other than
the digraph G are largely routine so we only give brief details. It is assumed that the
initial configuration of the parameters has positive probability, and in particular that
δ(κ, G; n) = 1. The infection rate parameters λLij , λG
ij , 1 ≤ i, j ≤ k, are each updated
individually using Gaussian proposal distributions centred on the current value, with
either fixed variances, or by using an adaptive scheme in which the variances can change
as the algorithm proceeds. The latter is especially useful in those cases where the data
are relatively uninformative about a particular infection rate. The lifetime random
variables I(i), i = 1, . . . , n can be updated naturally by using the prior distributions as
proposals. The proposed new lifetimes I ∗ , say, are accepted with probability
L(ΛL , ΛG , G, I ∗ , κ)
∧ 1.
L(ΛL , ΛG , G, I, κ)
10
Note that directional connectivity is unaffected by this update, which simplifies the
computation. It is usually best to update the lifetimes in blocks rather than all at once,
to prevent low acceptance rates. The label of the initial infective, κ, can be updated
using a Gibbs step. Specifically, let A(G) = {1 ≤ j ≤ n : δ(G, j; n) = 1} denote the
possible values of κ, for a given G. Under the assumptions of the model each of these
values is equally likely, conditional upon the current G. Thus κ has full conditional
distribution given by
π(κ)δ(G, κ; n)
.
π(κ|G, n) = P
κ:κ∈A(G) π(κ)
Alternative updating methods for κ, that also ensure G remains directionally connected,
include proposing a new value κ∗ from among those vertices to which κ is connected,
and then swapping the direction of the edge from κ to κ∗ . By necessity, such moves
involve some updating of G.
Updating G can be achieved by simply adding and deleting edges at random, as described
below. We use the terminology non-edge to refer to the absence of an edge, i.e. there is
a non-edge from vertex i to vertex j if and only if there is not an edge from i to j. For
i = 1, . . . , n and j = 1, . . . , k, denote by nLij the number of individuals in i’s group of
type j who ever become infected. Thus nLij is the number of vertices in G in i’s group of
¢
P ¡
type j, and kj=1 nLij − ∆ij is the maximum possible number of local edges emanating
from i.
First, choose to try to add an edge with probability pa , otherwise try to delete an edge.
In both cases, choose to act on the local edges with probability pL , otherwise act on
the global edges. For local addition, first select, uniformly at random, an edge to add
from among the entire set of local non-edges. If this set is empty, then stop at this
point. Otherwise, suppose the edge is from vertex s to vertex t, these vertices being in
the same group. To calculate the acceptance probability, note that the likelihood ratio
of proposed to existing graph is simply {1 − exp(−I(s)λLτ(s)τ (t) )}/ exp(−I(s)λLτ(s)τ (t) ).
Combining this with the proposal mechanism, and that for deletion described below,
yields the acceptance probability
Pn Pk
L
L
(1
−
p
)
©
ª
a
j=1 (nij − ∆ij − νij )
i=1
L
´
³
∧ 1.
exp(I(s)λτ (s)τ (t) ) − 1
P P
pa 1 + ni=1 kj=1 νijL
Note that there is no need to check directional connectivity. The addition of global
edges occurs in the same way, mutatis mutandis. For local deletion, an edge is picked
P P
at random from among the ni=1 kj=1 νijL available, and then deleted with probability
P P
δ(κ, G; n)pa ni=1 kj=1 νijL
ª−1
©
L
´ ∧ 1,
³
exp(I(s)λτ (s)τ (t) ) − 1
Pn Pk
L
L
(1 − pa ) 1 + i=1 j=1 (nij − ∆ij − νij )
11
where the proposed deletion is an edge from vertex s to vertex t. Note that evaluation
of the acceptance probability requires checking directional convectivity. Global deletion
is similar.
6
Application to data
We now consider the performance of our methods in a variety of examples. Our aim is
not to perform thorough data analyses, but to use suitable data sets to illustrate the
feasibility of our approach, and its scope for providing new kinds of information not
available via existing methods. All results are based on samples of size 10,000 from
MCMC sample chain output. Algorithm convergence was checked by inspection of the
resulting chain output. Unless otherwise indicated, parameters on (0, ∞) are assigned
exponential prior densities with rate 10−6 . In all cases such priors allow the data to
dominate the posterior distribution.
If a uniform prior mass function for κ is assumed then, in the examples below, inference
for ΛL and ΛG was found to be indistinguishable from the case where κ is simply fixed.
Indeed for the single-type case, it can be shown that κ has no bearing on inference for
the local and global infection rate parameters. This is essentially a consequence of the
fact that the probability that a given individual i infects a given individual j is the same
as the probability that j infects i. For the multitype case this is no longer true, but κ
only becomes important in data on small populations.
6.1
Example 1: Single type, homogeneous mixing model
We start by exploring the performance of the algorithm in the special case of a homogeneouslymixing single-type population. In the terminology of the general model this corresponds
to a single type (k = 1), all groups being of size one, and ΛL being redundant since no
local infection occurs. Thus there is just one infection rate parameter, λG
11 = λ, say. In
some sense this setting provides the most challenging inverse problem, since the data
only comprise two numbers (initial number susceptible N , final number ever infected n),
from which we shall try to infer information about both the infection rate and, implicitly, the random digraph. Note that R∗ , usually called R0 in this setting, equals λE[I],
where I is the infectious period.
Suppose that N = 100, with one initial infective, and consider the three data points
n = 25, 50 and 75. We also consider three possible infectious period distributions, each
with mean one, namely constant, exponential, and Gamma with variance 10. It should
12
be noted that the numerical problems outlined in Section 2.3 apply in these cases, so
that direct likelihood calculation via the standard triangular equations for final size
would typically exceed machine accuracy.
Our focus in the following is on R0 ; the next example illustrates how information about
G can be easily obtained. Some posterior summary statistics are given in Table 1, and
Figure 1 shows the posterior density estimates of R0 for the case n = 25, under the three
infectious period distributions. The algorithm ran successfully, with typical run times
of a few hours, and with no apparent difficulties in terms of mixing. Various starting
values for both λ and G were explored and in all cases the Markov chain quickly moved
to a high posterior density region. In particular, this means that more exotic updates
for G do not appear necessary in this case.
Table 1 near here
Figure 1 near here
We highlight three aspects of our results. First, as Figure 1 illustrates, in all cases
the posterior density of R0 was found to be roughly symmetric, but with a discernible
right tail. This tail became more pronounced as the variance of I increased, although the
modal value for a given n was found to be very similar as I varied. Second, the key effect
of the different infectious period distributions was to alter the posterior variance of R0 ,
which increased with the variance of I. Such findings are intuitively reasonable. Note
that the posterior mean also increases with the variance of I, although this is essentially
a consequence of the increased skewness. Third, the posterior probability that R0 < 1
was found to be approximately 0.25 for n = 25, regardless of the distribution, and
between 0.01 and 0.06 for n = 50, increasing with the variance of I. We mention this to
emphasize the fact that our methods do not require any assumption that R0 > 1, and
moreover they provide information with how reliable such an assumption would be.
6.2
Example 2: Single type, two-level-mixing model
We now turn to analyses based on two-level mixing models. In the sequel we consider
data sets taken from detailed studies on outbreaks of influenza A(H3N2) in Tecumseh,
Michigan. The data are in the form that we require in that they consist of final numbers
infected in a population that has been divided into households. Many aspects of these
data have been previously explored, see for example Monto et al. (1985), Longini et
al. (1988), Addy et al. (1991), and references therein. More recent analyses based on
two-level mixing models, all of which use approximations of one kind or another, can
13
be found in Ball et al. (1997), Britton and Becker (2000), Ball and Lyne (2004), and
Demiris and O’Neill (2005).
Table 2 near here
We begin with a single-type analysis for an outbreak in 1980-81. The data are given in
Table 2 and show the numbers infected in households containing up to seven initially
susceptible individuals. Previous analyses of the Tecumseh data have often only used
households up to size five, this being due to numerical problems of the kind described
in Section 2.3 above (e.g. Longini et al., 1988; Addy et al., 1991). Our methods have
no such restriction. We define λL = λL11 and λG = λG
11 .
In keeping with previous studies, we assume that the infectious periods are distributed
according to a Gamma random variable with shape parameter 2 and scale parameter
(1/2.05), i.e. with mean 4.1 days. Table 3 gives posterior summary information for λL ,
λG , the threshold parameter R∗ , and the total numbers of local and global edges in G,
P
P
L
G
denoted ηL =
i,j νij and ηG =
i,j νij , respectively. All of the marginal posterior
density estimates of these parameters were unimodal and approximately symmetric.
Estimation for λL and λG is reasonably precise in that the posterior credible intervals
are relatively small. The 95% posterior credible interval for R∗ includes unity, and
moreover P (R∗ ≤ 1|n) ≈ 0.085, highlighting the fact that assuming R∗ > 1 is not
entirely satisfactory for these data.
Table 3 near here
The results for ηL and ηG can be interpreted in a variety of ways. First, they provide
summary information about G, and in particular the standard deviation and credible
intervals give some indication of how accurately we can infer G from the data. For
example, since there are 82 infected households, and 128 infected individuals, it follows
that 81 ≤ ηG ≤ 128 × 127 = 16256. We might expect ηG to be concentrated towards the
lower end of this range, since larger values would be incompatible with the large number
of individuals avoiding infection, but even so the posterior information reveals that ηG
can be inferred with considerable accuracy. Moreover, it would appear that the graph is
fairly tree-like in structure, since ηG +ηL is typically not far in excess of the total number
of infected individuals. The ηL and ηG parameters are also informative about the actual
typical number of potential infections, and thus they give an alternative to R∗ itself. For
example, dividing both by the number of vertices in G, 128, we find the mean numbers
of local and global links emanating from a vertex are 0.39 and 0.77, respectively. The
sum of these is close to the posterior mean of R∗ , while the individual values give some
14
idea of the relative importance of local and global infections during the outbreak. More
sophisticated variants are possible, such as considering the ratio of actual to potential
edges realised, or using the local structure to obtain more detailed descriptions of local
spread (the point being that the distribution of number of local contacts depends on an
individual’s group size.)
Thus far we have assumed that the infectious periods have a Gamma distribution, with
mean 4.1 days. Although the algorithm computation times are reasonable, typically
several hours, these can be reduced considerably (e.g. a factor of 3-5) by using a simpler
model in which the infectious periods have fixed length 4.1 days. This change does not
make much difference to the results: for example, the posterior means of λL and λG are
similar, but the posterior variances are slightly smaller than before. Such similarities
for these models are not new, see e.g. Ball et al. (1997), O’Neill et al. (2000), but the
point here is that the simpler model can be analysed rather more quickly.
6.3
Example 3: Two-type, two-level-mixing
We now consider a two-type data set described in Longini et al. (1988). These data,
also from the Tecumseh study, divide the at-risk population into two strata according
to antibody titre level, the strata being termed low or higher. The data are actually
combined from two separate influenza outbreaks, but this is immaterial for the purposes
of illustrating our methods. The data set is given in Table 4 of Longini et al. (1988),
and comprises 567 households containing between one and five initially susceptible individuals. In thirteen of the households the exact outcome is not presented, and for
simplicity we exclude these from our present analysis. Of those households included,
T1 = 163 out of N1 = 742 low-titre (type 1) individuals became infected, compared with
T2 = 53 out of N2 = 562 higher-titre (type 2) individuals.
The analysis described in Longini et al. (1988) employs an independent-households
model with fixed-length infectious periods, in which individuals can differ in their susceptibility, but not infectivity. For the two-level mixing model, a natural way of making
G
the latter assumption is to set λLij = λLlj and λG
ij = λlj for j, l = 1, 2. We refer to the
resulting model as the LGS (Local-Global-Susceptibility) model. Since our methods
do not require any structural restrictions on ΛL , we also consider for illustration the
G
model with the sole constraint that λG
ij = λlj for j, l = 1, 2, and refer to this as the GS
(Global-Susceptibility) model. In keeping with Example 2, and Longini et al. (1988),
the infectious periods were all set to be of fixed length 4.1 days.
Starting with the LGS model, we first indicate that results comparable to those presented
in Longini et al. (1988) can be easily obtained via our methods. For example, the
15
independent-households model used in that paper is defined in terms of parameters Qi
and Bi , respectively representing the probability that a type i individual avoids infection
from a single same-household infective, and the community at large. These parameters
are of direct interest because they are used to define the so-called secondary attack rate
(viz., (1 − Qi ) × 100%) and community probability of infection, 1 − Bi . In our model
−1
we have Qi = exp(−4.1 λL1i ), and a simple approximation to Bi is exp[−4.1 λG
1i (T1 N1 +
T2 N2−1 )]. The maximum likelihood estimates of Qi and Bi , i = 1, 2, in Longini et al.
(1988) were found to be very similar to our corresponding posterior mean and median
values.
Table 4 near here
Turning now to a comparison of the GS and LGS models, Table 4 gives some posterior
summary statistics. As expected, the LGS model in some sense averages out differences
in λL1j and λL2j found in the GS model. In the latter model, for j = 1, 2 the posterior
mean of λL1j is somewhat larger than λL2j , suggesting that higher-titre individuals are less
infectious. However, the posterior standard deviation of λL21 is relatively large, so any
difference between λL21 and λL11 is not clear-cut. The posterior uncertainty arises because
estimation of λL21 requires infected higher-titre individuals who then infect low-titre
individuals, but the data only contain a few households with two types of individual,
both of whom became infectious. For ΛG , the two models give roughly similar posterior
distributions, the differences essentially arising as compensation for the corresponding
ΛL differences. The threshold parameter R∗ was found to have posterior mean 1.21 for
the LGS model and 1.23 for the GS model, and in both cases the posterior standard
deviation was 0.11. Finally, although the data sets are certainly not strictly comparable,
it is notable that the single-type model of Example 2 attributes more of the epidemic
spread to global infections than the present example, insofar as the posterior mean
of λG exceeds all of the posterior ΛG entries. In particular, the extra detail of the
multitype data appears to indicate that local spread between low-titre individuals is of
key importance, suggesting that control measures should be targetted towards this.
7
Discussion
In this paper we have described new methodology for performing Bayesian inference for
two-level mixing epidemic models, given final outcome data. Implementation, although
not a trivial matter, is not especially complicated. The methods work well in practice,
although for data sets with very large numbers of infectives (thousands as opposed to
16
hundreds) the algorithm takes days rather than hours to run. However, data on such
large outbreaks are not common and so this is not a serious restriction.
The methods have several appealing features. First, they generate information regarding
the actual propagation of the epidemic via the random digraph. Although not explicitly
temporal, this information could be loosely regarded as such, for example by supposing that the real-time delay between generations of infection is roughly constant. The
random digraph can then be used to infer information about the duration of the epidemic. Second, the methods do not require approximations, such as those introducing
independence between groups, or that the epidemic is above threshold.
Third, the methods are clearly very flexible, and have scope for application to other
structured population epidemic models. Examples of the latter include spatial models,
models with three or more levels of mixing, and models with overlapping subgroups (for
example, with households, schools, and workplaces all explicitly described.) Although
the form of the structure is required to be known, in practice different plausible scenarios
could be explored if the exact contact structure was not available.
An extension of practical interest is to the case where the observed data form only a
fraction of the total population. The main impact of this setting compared to that
we have studied is on the posterior variances of the quantities of interest; estimates of
posterior mean behaviour will be largely unaffected. Our methods can still be applied,
but now require additional assumptions regarding the (unknown) number of individuals
infected from outside the observed fraction. Each such infective individual would then
give rise to a connected digraph of its own on some subset of susceptibles within the
observed fraction. One way to generate the unknown infectives is to use approximation
methods involving the final severity along the lines discussed in Demiris and O’Neill
(2005). Roughly speaking, each individual in the observed fraction would have a fixed
probability of being infected from outside, this probability itself being calculated by
an approximation to the final severity of the epidemic in the entire population. A
drawback with this approach is that it requires the undesirable R∗ > 1 assumption
discussed previously. An exact alternative is to simply impute the entire digraph in the
unobserved population as well, which would be feasible in small population settings, but
time-consuming for larger populations.
Finally, an important extension of our methodology is towards Bayesian model choice. In
principle it is possible to implement trans-dimensional MCMC methods in the multitype
epidemic model setting, the key (non-trivial) challenge being to efficiently move between
different models. This is a subject of current investigation.
Acknowledgments
17
We thank Owen Lyne for helpful discussions. The first author was partly supported by
EPSRC grant GR/M86323/01, and computing facilities were partly funded by EPSRC
JREI grant GR/R08292/01.
References
Addy, C. L., Longini, I. M. and Haber, M. (1991). A generalized stochastic model for
the analysis of infectious disease final size data. Biometrics 47, 961–974.
Andersson, H. (1999). Epidemic models and social networks. Math. Sci. 24, 128–147.
Andersson, H. and Britton, T. (2000). Stochastic Epidemic Models and Their Statistical
Analysis. Lecture Notes in Statistics 151, Springer, New York.
Ball, F. G., Britton, T. and Lyne, O. D. (2004) Stochastic multitype epidemics in a
comunity of households: Estimation of threshold parameter R∗ and secure vaccination
coverage. Biometrika 91, 345–362.
Ball, F. G. and Lyne, O. D. (2001). Stochastic multitype SIR epidemics among a
population partitioned into households. Adv. in Appl. Probab. 33, 99–123.
Ball, F. G. and Lyne, O. D. (2004). Private communication.
Ball, F. G., Mollison, D. and Scalia-Tomba, G. (1997). Epidemics with two levels of
mixing. Ann. Appl. Probab. 7, 46–89.
Barbour, A. D. and Mollison, D. (1990) Epidemics and random graphs. In Stochastic
Processes in epidemic theory, eds. Gabriel J. P. and Lefévre, C., Lecture notes in
Biomathematics 86, 86–89.
Becker, N. G. (1989) Analysis of Infectious Disease Data. Chapman and Hall, London.
Becker, N. G. and Dietz, K. (1995) The effect of the household distribution on transmission and control of highly infectious diseases. Math. Biosci. 127, 207–219.
Becker, N. G. and Hall, R. (1996) Immunization levels for preventing epidemics in a
community of households made up of individuals of various types. Math. Biosci. 132,
205-216.
Britton, T. (1998) Estimation in multitype epidemics. J. R. Statist. Soc. B 60, 993–679.
Britton, T. and Becker, N. G. (2000) Estimating the immunity coverage required to
prevent epidemics in a community of households. Biostatistics 1, 389–402.
Britton, T. and O’Neill, P. D. (2002) Bayesian inference for stochastic epidemics in
populations with random social structure. Scand. J. Statist. 29, 375–390.
Demiris, N. and O’Neill, P. D. (2005) Bayesian inference for epidemic models with two
levels of mixing. To appear, Scand. J. Statist.
Farrington C. P., Kanaan M. N. and Gay N. J. (2001) Estimation of the basic reproduction number for infectious diseases from age-stratified serological survey data, with
18
discussion. J. R. Statist. Soc. C, 50, 251–283.
Gilks, W. Richardson, S. and Spiegelhalter, D. (1996) Markov chain Monte Carlo in
practice. Chapman and Hall, London.
Haydon D. T., Chase-Topping M., Shaw D. J., Matthews L, Friar J. K., Wilesmith J.,
Woolhouse M. E. (2003) The construction and analysis of epidemic trees with reference
to the 2001 UK foot-and-mouth outbreak. Proc. R. Soc. Lond. B 270, 121–7.
Islam, M. N., O’Shaughnessy, C. D. and Smith, B. (1996) A random graph model for
the final-size distribution of household infections. Stat. in Med. 15, 837–843.
Li, N., Qian, G., and Huggins, R. (2002) Analysis of between-household heterogeneity
in disease transmission from data on outbreak sizes. Aust. N. Z. J. Stat. 44, 401–411.
Longini, I. M. and Koopman, J. S. (1982) Household and community transmission parameters from final distributions of infections in households. Biometrics 38, 115–126.
Longini, I. M., Koopman, J. S., Haber, M., and Cotsonis, G. A. (1988) Statistical
inference for infectious diseases: risk-specific household and community transmission
parameters. Am. J. Epid. 128, 845-59.
Ludwig, D. (1975) Final size distributions for epidemics. Math. Biosci. 23, 33–46.
Monto, A. S., Koopman, J. S. and Longini, I. M. (1985) Tecumseh study of illness.
XIII. Influenza infection and disease, 1976–1981. American Journal of Epidemiology
121, 811–822.
O’Neill, P. D., Balding, D. J., Becker, N. G., Eerola, M. and Mollison, D. (2000) Analyses
of infectious disease data from household outbreaks by Markov Chain Monte Carlo
methods. J. R. Statist. Soc. C, 49, 517–542.
Rida, W. (1991) Asymptotic properties of some estimators for the infection rate in the
general stochastic epidemic. J. R. Statist. Soc. B, 53, 269–283.
Schinazi, R., (2002) On the role of social clusters in the transmission of infectious diseases. Theoretical Population Biology 61, 163–169.
Wallinga J. and Teunis P. (2004) Different epidemic curves for severe acute respiratory
syndrome reveal similar impacts of control measures. American Journal of Epidemiology
160, 509–516.
19
n = 25
n = 50
n = 75
Mean
S. Dev.
Mean
S. Dev.
Mean
S. Dev.
Constant Exponential Gamma
1.16
1.27
1.37
0.23
0.37
0.50
1.42
1.49
1.55
0.21
0.27
0.36
1.86
1.96
2.04
0.21
0.27
0.37
Table 1: Posterior means and standard deviations for R0 under the assumption of
three different infectious period distributions, each with mean 1, and with variance
0 (Constant), 1 (Exponential) and 10 (Gamma). The population size is N = 100.
Susceptibles per household
No. infected
1
0
1
2
3
4
5
6
7
Total
44
10
2
4
5
6
62 47 38
13 8 11
9 2 7
3 5
1
9
5
3
1
0
1
3
3
0
0
0
0
0
54 84
3
60
62
19
7
2
0
0
0
0
0
0
0
6 2
Table 2: Final numbers infected in households during 1980-81 influenza outbreak in
Tecumseh, Michigan.
20
Mean
Median
S. dev.
95% C. I.
Parameter
λL
λG
R∗
0.050
0.193
1.24
0.049
0.192
1.22
0.010
0.025
0.19
(0.032,0.072) (0.15,0.25) (0.91,1.65)
ηL
49.4
49
5.74
(38,61)
ηG
98.8
99
4.66
(90,109)
Table 3: Posterior parameter summaries for the infection rates, threshold parameter,
and numbers of local and global edges in G, Tecumseh 1980-81 data set.
ΛL
Model
µ
GS
µ
LGS
0.0866(0.015) 0.0203(0.013)
0.0559(0.032) 0.01561(0.011)
0.0817(0.013) 0.0124(0.0073)
0.0817(0.013) 0.0124(0.0073)
ΛG
¶ µ
¶ µ
0.146(0.015) 0.0549(0.0090)
0.146(0.015) 0.0549(0.0090)
0.143(0.014) 0.0576(0.0090)
0.143(0.014) 0.0576(0.0090)
¶
¶
Table 4: Posterior mean (standard deviation) for ΛL and ΛG for the Global-Susceptibility
and Local-Global-Susceptibility models.
21
Fig. 1. Posterior density plots for R0 under the assumption of three different infectious
period distributions, each with mean 1, and with variance 0 (Constant), 1 (Exponential)
and 10 (Gamma). The data are n = 25 cases in a population size of N = 100.
Constant
Exponential
Gamma
1.2
π(R 0 )
0.8
0.4
0.0
0
1
R0
2
3
4