Draft 091216
Contact density and the spread of infectious diseases in a landscape of animal
holdings: How important is it to know all realizable links?
Jenny Lennartsson (1), Annie Jonsson (1), Nina Håkansson (1) and Uno Wennergren (2,*)
(1) Systems Biology Research Centre, Skövde University, Box 408, 541 28 Skövde, Sweden
(2) IFM Theory and Modeling, Linköping University, 581 83 Linköping, Sweden
(*) Corresponding author
ABSTRACT
The lack of complete data sets is often a limitation when using network modeling. In this paper, we
analyze how connected a network has to be to be able to draw useful conclusions about the extent
of a possible epidemic from it. Virtual risk networks with different link densities are created and
diseases are simulated to see how the number of infected animal holdings depends on the level of
links. We conclude that by using distance dependence assumptions for both link creation and disease
transmission, predictions about the extent of an epidemic can be drawn from a network even with
low link density.
Keywords: network, missing links, link density, infectious diseases, disease transmission
1. INTRODUCTION
The use of, and interest in, network analysis
has grown in many scientific areas during the
last decade, for example in biology,
epidemiology, economics and social science.
A network consists of interacting
units, denoted nodes, and these units connect
to each other through relations termed links.
Nodes could for example be animals, animal
holdings, habitats, persons or schools and the
links could be personal visits, animal
transports or links between web pages. The
links between the nodes give rise to networks
with different contact structures, and these
structures depend on the number of nodes and
links and on how these are organized.
Since estimating such structures can be
cumbersome, the estimated network will most
probably lack some links and even some nodes
(Clauset et al. 2008). Hence, there is a need to
evaluate the effect of missing links in order to
reduce errors when networks are applied. Since
data sampling can be costly, unnecessary
sampling of links that occur too rarely to
matter should also be avoided. The focus of
this study is on the number of links in the
networks and its effect on network measures
and on properties such as the spread of disease.
We generate scenarios that mimic sampling
procedures.
The number of links in a network determines
its connectedness. Three network categories
can be defined according to how connected
the networks are: firstly, the complete
network (Wasserman & Faust 1994) (figure
1a), where all theoretically possible links are
included; secondly, the real world network
(figure 1b), where all links realized
during a specified time period are included;
and thirdly, the sampled network (figure 1c),
where all estimated links for the same
time period are included. The sampled
network can be estimated through sample
surveys, literature studies, contact tracing or
databases such as national databases for
animal movement. The real world network is
the network one would like to consider, but the
sampled or the complete network is what
one has to represent it with.
The real world network is a single event
occurring during a specific time period.
Another event, even one of the same length,
will most probably result in another set of
links. The question then arises whether the
properties of the two will differ. Can a
property of the first sampled network, such as
the spread of disease, serve as an
approximation of the same property of the
second event? The same question applies to
two real world networks: is a property of the
first a valid approximation of that property
of the second? Clearly, too short a time frame
will result in a poor approximation, whereas a
very long time frame, yielding an almost
complete network with specified probabilities
on all its links, gives a perfect one.
Somewhere in between there is an
approximation that is sufficient yet not too
time consuming to achieve.
In contrast to classical models such as the
SI/SIR/SEIR epidemic models, network models
relax the assumption of homogeneous mixing
(mass-action type assumptions). Network
analysis, on the other hand, requires handling
huge amounts of data, which today's
computational power makes possible. In
veterinary medicine, network analysis with
explicit contact structures is an increasingly
applied tool (Barthélemy et al. 2005;
Ortiz-Pelaez et al. 2006). The potential use of
network analysis and modeling in epidemiology
is to predict the size and spread of epidemics
and to examine the effects of different
interventions such as vaccination, standstill
and stamping out. For example, Corner et al.
(2003) studied the transmission of
Mycobacterium bovis in a network of wild
brushtail possums and the social contacts
between them. In another study, Kiss et al.
(2006) analyzed networks of sheep movements
within Great Britain. They showed that during
an epidemic it is most efficient to concentrate
control interventions on highly connected
nodes. Despite the increased use of networks
in epidemiology,
there are shortcomings in the analysis of
missing links as well as how to represent a
structure given a single sample.
How connected measured networks are varies
greatly, depending for example on the
context in which they are sampled, the sampling
method used and the time window of the
sampling period. Unfortunately, it is not
unusual that collected network data are
incomplete
(Christley et al. 2005; Ortiz-Pelaez et al. 2006;
Clauset et al. 2008; Heath et al. 2008; Eames
et al. 2009; Guimerà & Sales-Pardo 2009).
Examples are missing animal
movements or unknown locations of herds in
databases. Properties such as epidemic
development can vary with the structure of the
network (Newman et al. 2001; Keeling 2005;
Shirley & Rushton 2005; Kiss et al. 2006). Since
disease transmission depends on the
network's structure, results based on networks
with missing links may be misleading. Perkins
et al. (2009) demonstrated that network
structures are only approximations of contacts
and that it is almost impossible to identify all
contacts when collecting data. In practice, this
means that missing data will result in lost
links in the representation of a network. These
lost links are the result of errors during the
sampling period or a consequence of its finite
length. Guimerà and Sales-Pardo
(2009) introduced a method that uses a single
measurement of a network, a sampled network,
to generate a more correct representation, i.e.
an approximation of the real world network.
Their method focuses on networks reduced
by errors during sampling. By
measuring and classifying the structure of the
sampled network, they could identify both
missing and spurious links. In our study, the
focus is more general and concerns the relation
between link density and estimates of
properties such as the spread of disease and
specific network measures. This can also be
viewed as a study of the consequences of too
few or too many links in a network
representation.
When surveying to obtain a sampled network,
it is important to consider the time window of
the sampling period. For example, Kao et al.
(2007) studied the relation between the UK
livestock movement network and disease
dynamics over different time-scales. They
simulated transmission of two diseases,
foot-and-mouth disease and scrapie, which have
very different time-scales regarding
incubation time as well as infectious period.
They concluded that for network analysis to
be a valuable tool in epidemiological modeling,
it is important to consider the time-scale as
well as the potentially infectious contacts. In
another study, Robinson et al. (2007)
investigated animal movement networks
evolving over time in Great Britain and their
findings point out the importance of temporal
scale. With increased time-period, the
networks became more and more connected
and in that way fueled the disease
transmission. They also found a seasonal
pattern with a peak in spring and August. Thus
depending on the question to be examined or
when comparing different networks it is
important to choose the appropriate temporal
scale (Vernon & Keeling 2009). Otherwise,
there could be too few or too many links
involved in the analyses.
A more probabilistic representation of
networks has weighted links (ref???). With
such a representation, one may estimate the
network and link weights over longer (or
shorter) time periods yet apply the network
over time spans other than the measured one.
Still, too few links will result in an
underestimation of the probabilities, while too
many will result in an overestimation. In this
study we tested how high a link density is
necessary to obtain a network with correct
properties, and whether this depends on the
sampling procedure.
2. METHODS
2.1 The model
2.1.1 Landscape of animal holdings
The number of animal holdings was arbitrarily
set to 500 and these were randomly placed
into a landscape of size 34 x 34. See figure 2.
The holding density was chosen according to
realistic farm density in southern Sweden.
Each animal holding was considered as a node,
which implies that each animal was not
individually modeled.
2.1.2 Virtual sampled networks
Animal holdings were connected to each other
to generate virtually sampled networks. Since
we investigated how the extent of a possible
epidemic depended on the number of links in
the networks, we used the measure link
density to control for that. Link density is the
number of actual connections in the network as
a proportion of all theoretically possible links
in the network (Wasserman & Faust 1994). Link
density was varied from 0.001 to 1.0. A link
density of 1.0 means a complete network
(figure 1a) in which all theoretical connections
are included, and the number of links
(C_n) is then given by eq. 1:
C_n = \frac{n(n-1)}{2}    (1)
where n is the number of animal holdings in
the network. Because the link density of the
networks was set when generating the
networks, the mean link degree was also given
from the start. Table 1 shows which mean degree
each link density corresponds to.
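The arithmetic behind eq. 1 and the relation between link density and
degree can be sketched as follows. This is an illustrative Python sketch,
not the MATLAB code used in the study, and all function names are
hypothetical. Two degree conventions are shown because the text does not
say which one Table 1 follows: the conventional mean degree 2L/n and the
number of links per holding L/n. The value of almost 7.5 quoted in the
Discussion for a link density of 0.03 matches L/n.

```python
def complete_links(n):
    """Number of links in a complete undirected network of n holdings (eq. 1)."""
    return n * (n - 1) // 2


def links_for_density(density, n):
    """Number of links realized at a given link density, rounded to an integer."""
    return round(density * complete_links(n))


def mean_degree(density, n):
    """Conventional mean degree 2L/n, which simplifies to density * (n - 1)."""
    return density * (n - 1)


def links_per_holding(density, n):
    """Links per holding L/n, i.e. half the conventional mean degree."""
    return density * (n - 1) / 2


if __name__ == "__main__":
    n = 500  # number of holdings used in this study
    for density in (0.001, 0.01, 0.03, 0.04, 1.0):
        print(density,
              links_for_density(density, n),
              mean_degree(density, n),
              links_per_holding(density, n))
```

With n = 500 the complete network has 124750 links, so even a link
density of 0.04 corresponds to several thousand links.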
Holdings were connected to each other in two
different linking scenarios, either according to
the distances between the holdings or
completely at random (figure 2). To simplify the
model we assumed that distance strongly
affected both the realizable links and the
probability of disease spread. Others (Keeling
2005; Le Menach et al. 2005) have also assumed
distance dependent connection. This assumption
implies more links between adjacent animal
holdings than between holdings far from each
other. With distance dependent link creation,
links were randomly drawn from a frequency
distribution where the probability of a link
between two animal holdings depended on
the Euclidean distance between them. Since
this method is stochastic, links
could also exist between holdings that were
more distant from each other, even at a low
link density. The probability, P(l_ij),
that a link exists between holdings i and j is
given by the exponential distribution in
eq. 2 (Håkansson et al. 2009; Lindström et al.
2008).
P(l_{ij}) = K e^{-\left( d_{ij}/a \right)^{b}}    (2)
where d_ij is the Euclidean distance between
holdings i and j. To avoid edge effects, periodic
boundaries were used (Lindström et al. 2008).
Parameters a and b are determined by the
input parameters kurtosis, κ, and standard
deviation, σ. Here, a kurtosis of 10/3 and
a standard deviation of one were used. The
constant K normalizes the distribution so that
the probabilities of all possible links
sum to one.
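One possible reading of the distance dependent link creation is sketched
below in Python. It is not the MATLAB implementation used in the study.
It assumes that a fixed number of distinct links, set by the link density,
is drawn without replacement with probability proportional to the kernel
of eq. 2. The function names are hypothetical. The mapping from σ and κ to
a and b follows the cited papers and is not reproduced here, so the
defaults a = 1 and b = 1 are placeholders (b = 1 matches the description
of eq. 2 as exponential).

```python
import math
import random


def periodic_distance(p, q, size):
    """Euclidean distance on a square landscape with periodic boundaries."""
    dx = abs(p[0] - q[0])
    dy = abs(p[1] - q[1])
    dx = min(dx, size - dx)  # wrap around in x
    dy = min(dy, size - dy)  # wrap around in y
    return math.hypot(dx, dy)


def distance_dependent_links(points, size, density, a=1.0, b=1.0, rng=None):
    """Draw round(density * n(n-1)/2) distinct links with probability
    proportional to exp(-(d/a)**b), the kernel of eq. 2."""
    rng = rng or random.Random()
    n = len(points)
    keyed = []
    for i in range(n):
        for j in range(i + 1, n):
            d = periodic_distance(points[i], points[j], size)
            w = math.exp(-((d / a) ** b))  # unnormalized weight; K cancels
            # Efraimidis-Spirakis keys: the m largest keys form a weighted
            # sample without replacement.
            u = 1.0 - rng.random()  # lies in (0, 1], so log(u) is defined
            key = math.log(u) / w if w > 0.0 else float("-inf")
            keyed.append((key, i, j))
    m = round(density * n * (n - 1) / 2)
    keyed.sort(reverse=True)
    return [(i, j) for _, i, j in keyed[:m]]


if __name__ == "__main__":
    rng = random.Random(1)
    size, n = 34.0, 500  # landscape size and number of holdings from section 2.1.1
    holdings = [(rng.uniform(0, size), rng.uniform(0, size)) for _ in range(n)]
    links = distance_dependent_links(holdings, size, density=0.01, rng=rng)
    mean_len = sum(periodic_distance(holdings[i], holdings[j], size)
                   for i, j in links) / len(links)
    print(len(links), "links, mean link length", round(mean_len, 2))
```

Because only the relative weights matter for the draw, the normalizing
constant K never has to be computed explicitly in this sketch.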
With the random linking scenario, all animal
holdings have the same probability of
connecting to each other, independent of the
distance between them. The two linking
scenarios can be related to two different
methods of data collection. The distance
dependent linking scenario mimics data
collection where connections between nearby
nodes are sampled first, and connections
between distant holdings are found only once
these shorter connections have been collected,
or in rare cases. In contrast, the random
linking scenario reflects a method of data
sampling where connections between nodes are
found completely at random and no systematic
way of finding the connections is used.
2.1.3 Risk networks and disease transmission
To simulate disease transmission in the
networks, we used a simple model in which the
holdings could be in one of two phases,
susceptible or infectious. The model did not
incorporate incubation time, so a holding
infected through contact with an infected
holding could infect other holdings already in
the next time step. Since a recovery phase was
not included in the model, an infected holding
could never return to the susceptible phase.
That is, an infected holding remained in
the infectious phase during the remaining
simulation time.
Undirected links were used, which means
that diseases could be transmitted in both
directions along the links. Two different
scenarios of disease transmission were tested,
distance dependent and random transmission
(figure 2). These two scenarios were combined
with the two linking scenarios into four
combined scenarios: DlDt, DlRt, RlDt and RlRt
(figure 2). The RlRt scenario is an example of a
mass-action mixing model (Keeling 2005), which
assumes that all links have the same
probability of transmitting the disease. The
DlDt scenario, with different linking and
transmission probabilities for each link, is the
opposite of the RlRt scenario. The remaining
two scenarios, DlRt and RlDt, are combinations
of the two. When simulating distance
dependent transmission we arbitrarily used
the same probability for transmission as given
by the exponential probability distribution
function in eq. 2. The probability of
disease transmission was therefore higher
when the distance between the holdings was
shorter. This higher probability could also be
interpreted as more contacts between animal
holdings close to each other, although only
one link could exist between any pair of
holdings. When random transmission was
simulated, the probability of disease
transmission was the same regardless of the
distance between the animal holdings. This
probability was here arbitrarily set to 0.01. The
two kinds of transmission probability
could be thought
of as two different diseases. Whether the
disease was transmitted along a link was then
determined randomly. As before, disease
transmission could only occur between animal
holdings that were connected by a link.
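The transmission model described above can be sketched as a minimal
susceptible-infectious simulation. The sketch is in Python and is not the
MATLAB code used in the study. The function and parameter names,
including seed_node for the initially infected holding, are hypothetical.
It assumes synchronous updates, so a holding infected in one time step
can first infect others in the next, and it takes the per-link
transmission probability as input so that both the distance dependent
and the random transmission scenario can be represented.

```python
import random


def simulate_si(n, link_prob, seed_node, steps=300, rng=None):
    """Susceptible-infectious spread on an undirected network.

    link_prob maps each link (i, j) to its transmission probability per
    time step. Returns the number of infected holdings after each step.
    """
    rng = rng or random.Random()
    neighbours = {i: [] for i in range(n)}
    for (i, j), p in link_prob.items():
        neighbours[i].append((j, p))  # undirected: store both directions
        neighbours[j].append((i, p))
    infected = {seed_node}
    counts = []
    for _ in range(steps):
        newly = set()
        for i in infected:
            for j, p in neighbours[i]:
                if j not in infected and rng.random() < p:
                    newly.add(j)
        infected |= newly  # no recovery: holdings stay infectious
        counts.append(len(infected))
    return counts


if __name__ == "__main__":
    rng = random.Random(2)
    # Hypothetical chain of five holdings, each link transmitting with the
    # random-transmission probability 0.01 used in the text.
    chain = {(i, i + 1): 0.01 for i in range(4)}
    print(simulate_si(5, chain, seed_node=0, steps=300, rng=rng)[-1])
```

For the distance dependent scenario, link_prob would hold a different
value for each link, derived from eq. 2; how that value is scaled is not
specified here and is left to the caller.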
2.2 Simulation runs
Since stochasticity was included in the model,
replicates were needed. Therefore, 10
different spatial holding patterns were
generated and for each of them, links were
added in 10 different ways. In total, 100
different networks were generated. For each
simulation run, 10 randomly picked animal
holdings, one at a time, were initially infected.
Simulations were run for 300 time steps.
The number of infected animal holdings was
calculated at each time step. Simulations were
run in MATLAB (version R2009a).
2.3 Analysis
To characterize the networks and to see how a
change in link density affects the structure
and function of the networks, we used
network measures. Three network measures
were calculated: degree assortativity,
clustering coefficient and fragmentation index.
Degree assortativity (Newman 2002)
measures the tendency of holdings to be linked
to holdings of similar degree. Values range
from minus one to one. A value near one
indicates that holdings with equal degree are
often linked to each other. An assortativity
near minus one means that holdings with
different degree are often connected. A value
of zero means that the connections between
holdings are random with respect to degree.
The clustering coefficient (Watts & Strogatz
1998) of a holding is the number of links that
exist between the neighbors of the holding,
divided by the number of links that could
possibly exist between those neighbors. Here
we used the average clustering coefficient for
the whole network. This measure ranges
between zero and one, where one indicates
that the network is highly clustered.
The fragmentation index (Borgatti 2003; Webb
2005) measures to what extent the network is
disconnected and ranges from zero to one.
A low value indicates that the network is highly
connected and a high value means that the
network is very fragmented.
The network measures were implemented
and calculated in MATLAB (version R2009a).
3. RESULTS
The results show that, for scenario DlDt, a link
density of around 0.04 gives the same number
of infected animal holdings as a network with
a higher proportion of connections
(figure 3). Under the assumptions of our
model, the results indicate that a low
proportion of links in the network could be
enough to draw conclusions about the
extent of the disease transmission. For
scenario RlDt, with distance dependent
transmission in a randomly linked network, a
higher link density is required to reach the
same transmission rate as with scenario DlDt.
For the scenarios with random transmission
(DlRt and RlRt) the number of infected animal
holdings increases with increasing link density
and no plateau was reached below a link
density of 1.0.
The time until a given proportion of the
holdings was infected differs depending on the
linking scenario as well as on the disease
transmission scenario (figure 4). The random
disease transmission scenarios (DlRt and RlRt)
require almost the same time to reach a given
proportion of infected animal holdings. In
addition, they show a much faster disease
spread than the distance dependent
disease transmission scenarios (DlDt and RlDt)
(figures 3 and 4). Scenario RlDt, which is based
on random link creation and distance dependent
transmission, ends up with the slowest
transmission rate of the scenarios compared.
For this scenario, the given proportions of
holdings are infected only at high link
densities (figure 4). For lower link densities,
the number of infected holdings did not reach
the given proportions during the simulation
time.
The number of infected holdings at a given
link density is compared between the four
scenarios in figure 5. At low link densities, all
scenarios gave different results. When link
density increases, the two distance dependent
disease transmission scenarios (DlDt and RlDt)
approach each other, as do the two
random disease transmission scenarios (DlRt
and RlRt). The higher the link density, the more
similar are the results of the two
distance dependent disease transmission
scenarios. As mentioned before, random
transmission gives a much faster disease
spread than distance dependent
transmission.
The average assortativity of the networks
depends on the link creation method that is
used (figure 6a). Distance dependent link
creation gives higher values of
assortativity than random link creation.
The networks made by random linking have, as
expected, assortativity around zero for all link
densities.
The average clustering coefficient of all
networks increases with increasing link
density (figure 6b). The clustering coefficients
of the networks generated by distance
dependent link creation are higher than those
of the networks made by random link
creation. As link density increases, the
values for random linking approach those for
distance dependent linking. The
networks generated by the random linking
scenario give clustering coefficients that are
equal to the link density in question. Naturally,
the clustering coefficients of all networks are
one when the link density is one and all
animal holdings are connected to each other.
The fragmentation index of the networks
is close to one for both linking scenarios
when link density is 0.001
(table 2). When link density increases to 0.01,
the fragmentation index decreases
dramatically. With both linking scenarios, the
index has reached zero when link density is
0.03 or higher.
4. DISCUSSION
Our aim with this study was to investigate
how large the error can be when using a network
with too many or too few connections. We
investigated whether it is possible to predict
anything about the extent of a disease
transmission when only some proportion of all
theoretical links are realized. Our results showed
that a link density of 0.04 gave the same
number of infected animal holdings as a
higher link density did. This result holds
when the probability of link creation as well
as of disease transmission follows the
distance dependent scenario (DlDt). For
distance dependent disease transmission in a
randomly linked network (RlDt), we got the
same transmission rate as when the links are
distance dependent. However, with random
linking many more links are needed to reach
this rate. That is because random linking
includes a higher number of long links
than distance dependent linking does.
These long links have low transmission
probabilities, which results in slower disease
transmission. For random disease
transmission (scenarios DlRt and RlRt), the
number of infected holdings increased with
increasing link density. Below we discuss
implications of our results in relation to the
effects of using networks with missing or
overrated links.
4.1 Missing links
Empirical data have shown that only a small
fraction of all possible connections in a network
actually take place (Webb 2006; Eames et al.
2009). When sampling data it is almost
impossible to trace all connections between
nodes, and this often leads to incomplete data
sets. Therefore, it is important to consider link
density when working with network modeling.
Assuming a complete network when modeling
disease transmission could result in an
overestimation of the extent of an epidemic.
Under scenario DlDt, however, simulations in a
network with a link density of 0.04 or higher
and simulations in a complete network
result in the same number of infected
holdings. Another important issue to consider
when using empirical networks is the time
window of the sampling period. Using the
“wrong” time window can lead to missing links
or too many links, both of which affect the
link density and the spread of diseases in the
networks. The length of the time window
affects how complete the network will be: a
longer time window could result in a more
connected network than one based on a very
short time window. Different studies use time
windows of different lengths. For example, Kiss
et al. (2006) used a 4-week time scale in their
study of sheep movements in Great Britain. In
another study of animal transports, Robinson
and Christley (2007) used periods of 10 weeks.
Measured link densities in empirically
investigated networks are often only a few
per mille or a few percent of the
total number of theoretical connections in the
networks. An example is Ortiz-Pelaez et al.
(2006), who studied animal movements
during the initial phase of the epidemic of
foot-and-mouth disease in Great Britain in
2001. Their network has an average link
degree of 1.22, which corresponds to a link
density as low as about 0.0019. A low link
density has also been measured in the
Swedish animal transport network (ref). It is
important to remember that the measured
connections in an empirical network are only
realizations of all possible contacts. That is,
the links in these networks are the ones
that were realized during the time of
data collection. There are
probabilities for a huge number of additional
connections, but these were not realized
during the time period in question. In our
study this is mimicked by network replicates.
For example, when a link density of 0.01 is used,
all theoretical connections are possible but only
1% of them are realized, and these can
differ between the replicates. Other examples
of empirically investigated networks are found
in Newman (2003). When modeling
virtual networks it is also important to consider
the connection level. Kiss et al. (2005) used
virtual networks with different mean degrees
in their epidemiological modeling. They
varied the mean degree between 5 and 20 to
see how it affects the final epidemic size.
These values correspond to link densities
from 0.005 to 0.02, that is, rather low
densities. It would have been interesting if
they had also used higher values, for comparison
with our results based on higher link densities.
Despite the study of Kiss et al. (2005), network
studies focusing on link densities and missing
links are rare and more work has to be done in
this field.
Our results indicate that, when simulating in a
network generated by scenario DlDt, the time
scale is perhaps not as important for
measuring the extent of an epidemic as
previously thought. This is because a low link
density (though above 0.04) is, according to the
results shown here, as good as a high link
density in predicting the number of animal
holdings that will be infected if a disease
enters the network.
It is well known that random networks spread
diseases faster than spatially clustered
networks (Watts & Strogatz 1998; Kiss et al.
2005). Here this is seen for the random
network with random transmission
probabilities (RlRt). For the DlRt scenario, with
random transmissibility in a distance
dependent network, the transmission rate is
slightly slower. The reason is probably that
this kind of network contains more
clusters than the random networks. Therefore,
it takes longer for the disease to spread
between the clusters than it takes to spread
between holdings that are randomly
connected. It is also well known that random
networks have a low level of clustering compared
to other kinds of networks, for example
small-world networks (Watts & Strogatz 1998;
Shirley & Rushton 2005). We measured the
clustering coefficient of our networks and the
results agreed with this: the
clustering coefficient was lower in the random
networks than in the networks generated by
distance dependent linking. How fragmented
a network is influences how well diseases can
spread between the holdings. The fragmentation
index measures to what extent the networks
are disconnected. Here, only link densities
below 0.03 resulted in
disconnected networks. Link densities of 0.03
or higher give rise to a connected graph, and it
is then possible for a disease to spread
between all animal holdings in the network.
Considering that a link density of 0.03
corresponds to an average link degree of
almost 7.5, the values of the fragmentation
index are reasonable. Distance dependent link
creation results in a slightly higher
fragmentation index than random link
creation. Therefore, distance dependent
connections give rise to more disconnected
networks than random connections do.
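The two measures discussed in this paragraph can be computed along the
following lines. This is a Python sketch with hypothetical function
names, not the MATLAB implementation used in the study. It assumes the
component-based form of the fragmentation index, one minus the fraction
of holding pairs that can reach each other, and it assumes that holdings
with fewer than two neighbors contribute zero to the average clustering
coefficient. The example network is a small randomly linked one (200
holdings, link density 0.1) chosen only to keep the run short.

```python
import random


def adjacency(n, links):
    """Neighbour sets of an undirected network with holdings 0..n-1."""
    neighbours = [set() for _ in range(n)]
    for i, j in links:
        neighbours[i].add(j)
        neighbours[j].add(i)
    return neighbours


def components(n, links):
    """Sizes of the connected components, found by depth-first search."""
    neighbours = adjacency(n, links)
    seen, sizes = set(), []
    for start in range(n):
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            node = stack.pop()
            size += 1
            for nb in neighbours[node]:
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        sizes.append(size)
    return sizes


def fragmentation_index(n, links):
    """One minus the fraction of ordered holding pairs that are mutually reachable."""
    reachable = sum(s * (s - 1) for s in components(n, links))
    return 1.0 - reachable / (n * (n - 1))


def average_clustering(n, links):
    """Mean local clustering coefficient; holdings with fewer than two
    neighbours contribute zero (an assumed convention)."""
    neighbours = adjacency(n, links)
    total = 0.0
    for i in range(n):
        nbs = list(neighbours[i])
        k = len(nbs)
        if k < 2:
            continue
        closed = sum(1 for x in range(k) for y in range(x + 1, k)
                     if nbs[y] in neighbours[nbs[x]])
        total += closed / (k * (k - 1) / 2)
    return total / n


if __name__ == "__main__":
    rng = random.Random(3)
    n, density = 200, 0.1  # illustrative values, smaller than in the study
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    links = rng.sample(pairs, round(density * len(pairs)))
    print("fragmentation index:", fragmentation_index(n, links))
    print("average clustering:", average_clustering(n, links))
```

For a randomly linked network the expected clustering coefficient equals
the link density, so the printed clustering value should lie close to
the chosen density.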
Whether a high link density in a network is an
advantage or a disadvantage depends on the
scientific area in question. In disease
transmission networks, high link densities are
not desirable because more links in the
network also imply more possible
transmission routes for the disease.
In other contexts, however, for
example the spread of information or a
metacommunity of a species that is vulnerable
to extinction, a high link density is desirable.
In this study, we used the number
of infected animal holdings as a measure of
the extent of a possible epidemic. In practice,
this is perhaps not that relevant because it is
not desirable to let the disease transmission
go on for a long time. Instead, it is of course
desirable that control strategies are adopted as
soon as possible after an infection is
identified. We are interested in how many
animal holdings are infected and in the rate of
disease transmission. That is why no incubation
time is included in the model. This is a
simplification, since the incubation time
differs between diseases. Some
diseases have an incubation time of only a few
days, while for others it may be as long as a
couple of years.
Our results are valid for the parameter values
(volume, variance and kurtosis) used in this
analysis. With other parameter values
the results might differ. In particular, the
threshold obtained with scenario DlDt, above
which the extent of the epidemic is the
same irrespective of link density, may lie at
another level than for the parameter values
used here.
4.2 Overrated links
As mentioned above, using the wrong time
window can lead to a sampled network with too
many links compared to the real world network
that was meant to be sampled. Using networks
that are static over time could also result in too
high link densities. In static networks, once
the link structure is created it does not change
during the simulation time. Different kinds of
network representations are discussed and
compared in Vernon and Keeling (2009). They
concluded that static networks overestimate
the effects of disease transmission and that
this network representation should therefore
be used with caution when modeling
epidemics. Since the purpose of this study is to
investigate the effect of an incomplete
network rather than to model the course of
an epidemic, static networks are sufficient
here.
4.3 New heading, title not yet decided, possibly to be split into several…
Whether connections between animal holdings
are distance dependent is open to discussion.
We assume that they are, so that
adjacent holdings have more contacts
than holdings that are more distant. Whether
transmission of diseases in a network is
distance dependent is also open to
discussion, but here too we assume distance
dependence. One empirical example where
most of the transmissions of a disease
occurred between animal holdings near each
other is the epidemic of foot-and-mouth
disease in Great Britain in 2001 (Ferguson et al.
2001). In this epidemic, only a few
transmissions occurred over long distances.
Kiss et al. (2005) mention this as a further
indication that connections are
clustered, with a connection probability that
decreases with distance. Our method of
link creation based on distance
dependence is an example of that.
Knowing what proportion of the
links in a network gives a good
representation of an empirical network is
important because a number of
network measures can be expected to depend on
the link density. This study can therefore be
seen as a basis for judging whether network
measures of empirical networks are useful and
whether they describe the system in a relevant
way. What is relevant depends on the aim.
In addition to the number of links that are missed or overestimated, individual links may
differ in importance within the network. One way to identify the most important links in
a network is to use network measures. Several network measures are available, but most of
them consider the importance of nodes and only some focus on the importance of links.
More investigation in this field would be valuable.
In the future, the simple model used here can easily be extended to a more complex model
by including a recovery phase. To relate the model to real diseases, incubation time
should also be included. To investigate the impact of node density, it would be
interesting to make the node structure more aggregated and then examine the effect on
disease transmission.
Acknowledgements
References
Barthélemy, M., Barrat, A., Pastor-Satorras, R. and Vespignani, A., 2005. Dynamic patterns of
epidemic outbreaks in complex heterogeneous networks. Journal of Theoretical Biology 235, 275-288.
(doi:)
Bell, D.C., Atkinson, J.S. and Carlson, J.W., 1999. Centrality measures for disease transmission
networks. Social networks 21, 1-21.
Borgatti, S., 2003. The Key Player Problem. In: Dynamic Social Network Modeling and Analysis:
Workshop Summary and Papers, R. Breiger, K. Carley, P. Pattison (Eds). National Academy of
Sciences Press.
Christley, R.M., Robinson, S.E., Lysons, R. and French, N.P., 2005. Network analysis of cattle
movement in Great Britain. Proceedings of the Society for Veterinary Epidemiology and Preventive
Medicine (2005), 234-243.
Clauset, A., Moore, C. and Newman, M.E.J., 2008. Hierarchical structure and the prediction of missing
links in networks. Nature 453, 98-101.
Eames, K.T.D. and Keeling, M.J., 2003. Contact tracing and disease control. Proc. R. Soc. B 270, 2565-2571.
Eames, K.T.D., Read, J.M. and Edmunds, W.J., 2009. Epidemic prediction and control in weighted
networks. Epidemics 1, 70-76.
Ferguson, N.M., Donnelly, C.A. and Anderson, R.M., 2001. Transmission intensity and impact of
control policies on the foot and mouth epidemic in Great Britain. Nature 413, 542-548.
Guimerà, R. & Sales-Pardo, M., 2009. Missing and spurious interactions and the reconstruction of
complex networks. PNAS ? (doi: 10.1073/pnas.0908366106)
Heath, M.F., Vernon, M.C. and Webb, C.R., 2008. Construction of networks with intrinsic temporal
structure from UK cattle movement data. BMC Veterinary Research 4:11.
Håkansson, N., Jonsson, A., Lennartsson, J., Lindström, T. and Wennergren, U. Generating structure
specific networks. Submitted to Advances in Complex Systems. [Author note: or how should this be written?]
Kao, R.R., Green, D.M., Johnson, J. and Kiss, I.Z., 2007. Disease dynamics over very different timescales: foot-and-mouth disease and scrapie on the network of livestock movements in the UK. J. R.
Soc. Interface 4, 907-916.
Keeling, M., 2005. The implications of network structure for epidemic dynamics. Theoretical
Population Biology 67, 1-8.
Kiss, I.Z., Green, D.M. and Kao, R.R., 2005. Disease contact tracing in random and clustered networks.
Proc. R. Soc. B 272, 1407-1414.
Kiss, I.Z., Green, D.M. and Kao, R.R., 2006. The network of sheep movements within Great Britain:
network properties and their implications for infectious disease spread. J. R. Soc. Interface 3, 669-677.
Kiss, I.Z., Green, D.M. and Kao, R.R., 2008. The effect of network mixing patterns on epidemic
dynamics and the efficacy of disease contact tracing. J. R. Soc. Interface 5, 791-799.
Le Menach, A., Legrand, J., Grais, R.F., Viboud, C., Valleron, A-J. and Flahault, A., 2005. Modeling
spatial and temporal transmission of foot-and-mouth disease in France: identification of high-risk
areas. Veterinary Research 36, 699-712. (doi:10.1051/vetres:2005025)
Lindström, T., Håkansson, N., Westerberg, L. and Wennergren, U., 2008. Splitting the tail of the
displacement kernel shows the unimportance of kurtosis. Ecology 89, 1784-1790.
Newman, M.E.J., Strogatz, S.H. and Watts, D.J., 2001. Random graphs with arbitrary degree
distributions and their applications. Phys. Rev. E 64, 026118.
Newman, M.E.J., 2002. Assortative mixing in networks. Phys. Rev. Lett. 89 (20).
(doi:10.1103/PhysRevLett.89.208701)
Newman, M.E.J., 2003. The structure and function of complex networks. SIAM Rev. 45, 167-256.
Ortiz-Pelaez, A., Pfeiffer, D.U., Soares-Magalhães, R.J. and Guitian, F.J., 2006. Use of social network
analysis to characterize the pattern of animal movements in the initial phases of the 2001 foot and
mouth disease (FMD) epidemic in the UK. Prev. Vet. Med. 76, 40-55.
Perkins, S.E., Cagnacci, F., Stradiotto, A., Arnoldi, D. and Hudson, P.J., 2009. Comparison of social
networks derived from ecological data: implications for inferring infectious disease dynamics. Journal
of animal ecology 78, 1015-1022.
Robinson, S.E. and Christley, R.M., 2007. Exploring the role of auction markets in cattle movements
within Great Britain. Preventive Veterinary Medicine 81, 21-37.
Shirley, M.D.F. and Rushton, S.P., 2005. The impacts of network topology on disease spread.
Ecological Complexity 2, 287-299.
Vernon, M.C. and Keeling, M.J., 2009. Representing the UK's cattle herd as static and dynamic
networks. Proc. R. Soc. B 276, 469-476.
Wasserman, S. and Faust, K., 1994. Social Network Analysis: Methods and Applications. Cambridge
University Press, Cambridge.
Watts, D.J. and Strogatz, S.H., 1998. Collective dynamics of ‘small-world’ networks. Nature 393, 440-442.
Webb, C.R., 2005. Farm animal networks: unraveling the contact structure of the British sheep
population. Preventive Veterinary Medicine 68, 3-17.
Webb, C.R., 2006. Investigating the potential spread of infectious diseases of sheep via agricultural
shows in Great Britain. Epidemiology and Infection 134, 31-40.
Table captions
Table 1. Examples of link densities used in simulations and the corresponding mean link degree for
the networks.
link density    mean degree (nr of links/node)
0.001           0.250
0.005           1.248
0.01            2.495
0.02            4.990
0.03            7.485
0.04            9.980
0.05            12.48
0.10            24.95
0.25            62.38
0.50            124.8
0.75            187.1
1.00            249.5
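The values in Table 1 are consistent with dividing the number of links by the number of nodes in a network of 500 holdings. The sketch below reproduces that arithmetic. The network size of 500 is an assumption inferred from the table (249.5 = 499/2) and is not stated in this section; the function name is hypothetical.

```python
def mean_links_per_node(link_density, n_nodes):
    """Number of links divided by number of nodes at a given link density.

    A complete undirected network has n_nodes * (n_nodes - 1) / 2 links;
    link_density is the share of those links that is present.
    """
    possible_links = n_nodes * (n_nodes - 1) / 2
    return link_density * possible_links / n_nodes


# Assumed network size, inferred from the last row of Table 1.
N_HOLDINGS = 500
table_1 = {
    density: mean_links_per_node(density, N_HOLDINGS)
    for density in (0.001, 0.005, 0.01, 0.02, 0.03, 0.04,
                    0.05, 0.10, 0.25, 0.50, 0.75, 1.00)
}
```

Under this reading each link is counted once, so the tabulated value is half of the mean degree in the usual sense (two link ends per link).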
Table 2. Fragmentation index depending on link density and link creation method used.
link density    distance dependence    random
0.001           0.9983                 0.9981
0.005           0.9065                 0.1976
0.01            0.0385                 0.0133
0.02            0.0002                 0.0001
0.03            0.0000                 0.0000
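For orientation, a fragmentation index of the kind reported in Table 2 can be computed as sketched below. The sketch assumes the index follows Borgatti's fragmentation measure, i.e. the share of node pairs that cannot reach each other; whether exactly this definition was used for Table 2 is an assumption, and the function names are hypothetical.

```python
def component_sizes(n_nodes, links):
    """Sizes of the connected components, found with a simple union-find."""
    parent = list(range(n_nodes))

    def find(x):
        # Follow parent pointers to the root, shortening the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    sizes = {}
    for node in range(n_nodes):
        root = find(node)
        sizes[root] = sizes.get(root, 0) + 1
    return list(sizes.values())


def fragmentation(n_nodes, links):
    """Assumed definition: 1 - sum_k s_k (s_k - 1) / (N (N - 1)).

    s_k are the component sizes; 0 means fully connected, 1 means no links.
    """
    reachable = sum(s * (s - 1) for s in component_sizes(n_nodes, links))
    return 1 - reachable / (n_nodes * (n_nodes - 1))
```

For example, four nodes joined into two separate pairs give 1 - 4/12 = 2/3, while a connected network gives 0.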
Figure captions
Figure 1. Network categories: a) complete network, b) real world network, c) sampled network
[Figure 2: flow chart. Number of animal holdings -> Random placement -> link creation
(distance dependent linking or random linking) -> disease transmission (distance dependent
or random), giving the four scenarios DlDt, DlRt, RlDt and RlRt.]
Figure 2. Flow chart showing the different parts of the model and how these relate to each other.
[Figure 3: four panels, (a) Distance - Distance (DlDt), (b) Random - Distance (RlDt),
(c) Distance - Random (DlRt), (d) Random - Random (RlRt). x-axis: Time (0 to 300);
y-axis: Mean nr of infected holdings (0 to 500).]
Figure 3. Mean number of infected holdings per time step for the four linking and disease
transmission scenarios. Scenarios DlDt (a) and RlDt (b) have distance dependent disease transmission,
while scenarios DlRt (c) and RlRt (d) have random transmission. Scenarios DlDt (a) and DlRt (c) use
distance dependent link creation; scenarios RlDt (b) and RlRt (d) use random link creation.
Link densities used: 0.001 (---), 0.005 (…), 0.01 (--.--), 0.02 (__), 0.03 (-○-), 0.04 (-*-), 0.05 (-□-), 0.1
(-♦-), 0.25 (-◦-), 0.5 (-▼-), 0.75 (-x-) and 1.0 (-+-).
[Figure 4: three panels, a), b) and c). Legend: DlDt, RlDt, DlRt, RlRt.
x-axis: Link Density (0 to 1); y-axis: Time (0 to 300).]
Figure 4. Number of time steps until (a) 10%, (b) 50% and (c) 90% of all holdings in the network are
infected. The time depends on which of the four scenarios is used. For scenario RlDt, the number of
infected holdings did not reach any of the given proportions during the simulation time.
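The quantity plotted in Figure 4 can be read off a simulated time series as sketched below. This is a minimal illustration and not the original analysis code: the function name is hypothetical, time steps are assumed to be counted from 0, and the example series is invented purely to show the calculation.

```python
def time_to_proportion(infected_per_step, n_holdings, proportion):
    """First time step at which at least `proportion` of holdings are infected.

    Returns None if the proportion is never reached during the simulation,
    as reported for scenario RlDt in the caption above.
    """
    threshold = proportion * n_holdings
    for step, infected in enumerate(infected_per_step):
        if infected >= threshold:
            return step
    return None


# Invented example series of infected holdings per time step (not real results).
example_series = [1, 20, 60, 260, 470]
steps = {p: time_to_proportion(example_series, 500, p) for p in (0.1, 0.5, 0.9)}
```

Returning None rather than the last time step keeps "never reached" distinct from "reached at the end of the simulation".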
[Figure 5: eight panels, a) to h). Legend: DlDt, RlDt, DlRt, RlRt.
x-axis: Time (0 to 300); y-axis: Nr of inf holdings.]
Figure 5. Mean number of infected holdings per time step for a given link density and the four
scenarios. Eight link densities are compared, one at a time. Link densities in the subgraphs:
a = 0.001, b = 0.01, c = 0.03, d = 0.05, e = 0.07, f = 0.1, g = 0.5 and h = 1.0. Note that the
scales of the y-axes differ between subgraphs.
[Figure 6: two panels, (a) Assortativity and (b) Clustering coefficient. Legend: Distance dependent
linking, Random linking. x-axis: Link Density (0 to 1); y-axis: Assortativity in (a), Clustering
coefficient in (b).]
Figure 6. Average (a) assortativity and (b) clustering coefficient for the networks, depending on the
way the holdings are connected to each other.
Short title for page headings: Contact density and spread of infectious diseases in networks