Epidemic MathTalk Oct14/08

Pandemics, Epidemic Modeling and
how Winnipeg saved the World
Derived from an ECE self-invited talk,
with a distinct computer engineering
flavor.
Robert D. McLeod
Maciej Borkowski
Blake Podaima
et alia
Internet Innovation Centre (IIC)
Dept. Electrical and Computer Engineering
University of Manitoba
© IIC, Oct. 2008
Internet Innovation Center
Overview (Five parts)


Part 1: Brief overview of two interesting books as
precursors

World Without Us

Pandemonium
Part 2: Our “Agent Based Epidemic Model”

Motivation

The Model

Where, Who, When, What

Implementation

Demonstration
Overview cont’d



Part 3: Extensions and Limitations (opportunities)

Current Limitations

Related Work (Direct and Indirect)

Possible Extensions and data mining opportunities for CS and
ECE (maybe mathematicians)
Part 4:

New BSERC challenge and opportunities

How Winnipeg saved the world
Part 5: Summary
Part 1: Book Reviews

World Without Us

http://www.worldwithoutus.com/index2.html

Very cool site.
Premise: We are all gone
2nd Law of Thermodynamics takes
over rather quickly
i.e. Things decay quickly
Bridge in 300 years, artist drawing
of course.
Varosha, sort of a DMZ since 1974
Relation to present talk: What percentage
of people need to be gone before we
cannot sustain our complex infrastructure?
Part 1: Book Reviews cont’d

“Pandemonium”

Epizootic (animals)


Intensive Unit Production

Engineered lack of genetic diversity

Ripening for disaster
Epidemic (humankind)

Everything I now read about, I think I now
have.

Very scary book

Thought it was written by Stephen King

Viral, bacterial, fungal
The real question: Why are all
these chickens looking to the right?
Intensive Unit Production:
Reduced genetic diversity
Highly stressed immune
systems
Creating unnatural
pathogen reservoirs
Estimated 100 Billion chickens on the planet.
Highly clustered sharing same watershed, people,
and air.
Similar problems in agriculture: monocultures
Other “connecting” Books:

The Numerati by Stephen Baker (NR)

Super Crunchers by Ian Ayres

The Tipping Point by Malcolm Gladwell

The Black Swan by Nassim Nicholas Taleb

Fooled by Randomness by Nassim Taleb

The Man Who Knew Too Much: Alan Turing and
the Invention of the Computer by David Leavitt
Part 2: Agent Based Modeling

General Interests in Complex Systems and
Modeling

Our research resulting from a Programming
Challenge

This is pure mathematics. Is that a G.H. Hardy
reference? No, it’s a G. Boole reference.

Make the “equations” as simple as possible, but
not simpler, Albert Einstein

ABM is computational modeling essentially devoid
of equations
2007: Our Loose Specification for
Epidemic Modeling

Idea: Data mine, where possible, the basic tenets of
people-to-people interactions

Topology: Data mined from maps

Behaviour: Data mined from “demographics”

This approach develops models based on “real” network
topologies and “scheduled” walkers

The goal of the research is to shed additional light on the
problems associated with very complicated phenomena
through “data-driven” modeling and simulation
Epidemics: Why

Disease modeling of a pandemic-proportion epidemic is
an important area of study

There are widely-held beliefs that a local disease
epidemic or global pandemic in the general population is
overdue

This idea is derived from analysis of previous epidemics
and catastrophes that puncture the equilibrium (the
‘black swan’ phenomenon)
The Model

Data mining is a common theme in modern information
technology




Analytical methods may not exist or are overly complex
Data does exist and can be readily extracted
Statistical methods (and computing power) can now more
easily deal with the vast amount of data that is available
Our work is an attempt to help promote data-driven
epidemic simulation and modeling


Where data is available we demonstrate its utility, where
unavailable we demonstrate how it would be utilized
Unavailable data refers to practical or political limitations on
access, rather than technical or theoretical availability
“Where”

One of the first things needed is topology data, or
physical “network” data – that is, where are we
attempting to apply the study of epidemic spread?

These are primarily real places, such as homes,
institutions, businesses, industry, schools, hospitals, and
transport, within cities of all types

Much of this information is mineable, more so now than
ever

“Geographic profiling”
Topological Data Sources
Google Earth with Overlays
Google Maps
“Who”

Of similar importance to location (where) are the agents
(who) being infected

This is data that is generally technically available but
practically unavailable


(An approach) Collaborate with organizations that take and
house census data. Data mining of these sources is not a
technical issue but rather a political or policy issue, and requires
the will of government to make this data available for
simulations such as these
Our model attempts to illustrate how the data would be
used if available
“When”

An agent’s schedule (when) is also of critical importance.
This data is more typically inferred than explicitly
available, but as we are primarily creatures of habit,
reasonable assumptions can be made

Many of us operate on fairly routine weekday schedules,
punctuated by more flexible weekends. As such, this
scheduled data (for the sake of the simulation) can be
associated with an agent, modified by slight variations in
arrival and departure times
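The "slight variations in arrival and departure times" could be sketched as random jitter added to a scheduled interval. This is a minimal illustration only: the Gaussian jitter, the `sigma_min` default of 10 minutes, and the function name are assumptions, not details taken from the DSSW implementation.

```python
import random

def jittered_interval(start_min, duration_min, rng, sigma_min=10.0):
    """Return (arrival, departure) in minutes after midnight.

    Assumption: arrival and departure each get independent Gaussian
    jitter with standard deviation sigma_min.
    """
    arrival = start_min + rng.gauss(0.0, sigma_min)
    departure = start_min + duration_min + rng.gauss(0.0, sigma_min)
    arrival = max(0, round(arrival))
    # Guarantee the agent stays at least one minute.
    departure = max(arrival + 1, round(departure))
    return arrival, departure

rng = random.Random(42)
# A nominal 8:00 start lasting 8 hours, varied a little each day.
for day in range(3):
    print(day, jittered_interval(8 * 60, 8 * 60, rng))
```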
“What”

In addition to where, who, and when, we also need to
address what. The what here is typically a disease,
either bacterial or viral, communicated with an
associated probability of contraction when in contact with
an infectious agent

These parameters are adjustable and represent an
aspect of the current simulation with the greatest
uncertainty in terms of their validity, and could benefit
from collaboration or input from an epidemiologist
Implementation

Based on the model as described above, it should be
clear that our underlying simulation model is that of a
Discrete-Space Scheduled Walker (DSSW), in contrast
to other models that are more traditionally based on
random or Brownian walkers on artificial topologies

We attempt to capture the most important aspects of
real-people networks, incorporating (by construction)
notions such as “small world” networks

Behaviour: Low mobility when sick or getting sick
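The contrast with random walkers can be made concrete with a toy scheduled-walker step. The sketch below is an assumption-laden illustration, not the DSSW engine: place names, the hourly time step, the `daily` helper, and the rule that infection is immediate (no incubation, no reduced mobility when sick) are all simplifications introduced here.

```python
import random
from collections import defaultdict

def daily(home, work):
    """A fixed weekday: home 0:00-8:00, work 8:00-17:00, home 17:00-24:00."""
    return [home] * 8 + [work] * 9 + [home] * 7

def step(hour, schedules, infected, p_place, rng):
    """Advance one hour and return the updated set of infected agents."""
    occupants = defaultdict(list)
    for agent, schedule in schedules.items():
        # Scheduled walkers: location is looked up, not drawn at random.
        occupants[schedule[hour % 24]].append(agent)
    newly_infected = set()
    for place, agents in occupants.items():
        if not any(a in infected for a in agents):
            continue  # nobody infectious at this place this hour
        for a in agents:
            if a not in infected and rng.random() < p_place[place]:
                newly_infected.add(a)
    return infected | newly_infected

schedules = {
    "a": daily("home_a", "office"),
    "b": daily("home_b", "office"),
    "c": daily("home_c", "school"),
}
# Hypothetical per-place contraction probabilities.
p_place = {"home_a": 0.0, "home_b": 0.0, "home_c": 0.0,
           "office": 1.0, "school": 1.0}

rng = random.Random(0)
infected = {"a"}
for hour in range(24):
    infected = step(hour, schedules, infected, p_place, rng)
print(sorted(infected))  # ['a', 'b']
```

Agent "c" never shares a place with an infected agent, so the infection spreads only along the contacts the schedules create.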
Where - Topological Data Sources

This is the starting point for
locating various institutions

Winnipeg is used to illustrate
the use of map information
as input for generating our
underlying topology of
institutions, where people
(agents) potentially come
into contact and infect one
another
Homes
School
Business
Mall
Transportation
Etc.
Implementation: Where

Object Template: Institution

Properties:


Geographical location

Probability of contracting a disease i
Special types:

Home

Bus

Car
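The Institution template might be sketched as a small Python data class. The field names, the `kind` strings, and the example coordinates and probabilities are illustrative assumptions, not the actual DSSW source.

```python
from dataclasses import dataclass

# Special institution types named on the slide; string values are assumptions.
SPECIAL_KINDS = {"home", "bus", "car"}

@dataclass
class Institution:
    name: str
    latitude: float        # geographical location
    longitude: float
    p_contract: float      # probability of contracting a disease here
    kind: str = "generic"  # e.g. "home", "bus", "car", "school", "mall"

    def __post_init__(self):
        if not 0.0 <= self.p_contract <= 1.0:
            raise ValueError("p_contract must be a probability in [0, 1]")

    @property
    def is_special(self) -> bool:
        return self.kind in SPECIAL_KINDS

# Hypothetical instances roughly located in Winnipeg.
school = Institution("Example School", 49.81, -97.13, p_contract=0.05, kind="school")
bus = Institution("Example Bus", 49.89, -97.14, p_contract=0.10, kind="bus")
```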
Implementation: Who

Object Template: Agent

Properties:

Probability of contracting a disease  j

Probability of using a car

Working institutions

Leisure Institutions

Working schedule

Leisure Schedule
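A comparable sketch of the Agent template follows. Field names are assumptions, and so is the `travel_mode` rule that any trip not taken by car uses the bus; the slides list only the properties.

```python
import random
from dataclasses import dataclass, field

@dataclass
class Agent:
    agent_id: int
    p_contract: float   # probability of contracting a disease
    p_car: float        # probability of using a car
    working_institutions: list = field(default_factory=list)
    leisure_institutions: list = field(default_factory=list)
    working_schedule: list = field(default_factory=list)
    leisure_schedule: list = field(default_factory=list)

    def travel_mode(self, rng: random.Random) -> str:
        """Pick the special institution used for a trip (assumed rule)."""
        return "car" if rng.random() < self.p_car else "bus"

# Hypothetical agent: mostly drives, works at one place, two leisure options.
agent = Agent(1, p_contract=0.2, p_car=0.8,
              working_institutions=["office"],
              leisure_institutions=["cinema", "restaurant"])
print(agent.travel_mode(random.Random(0)))
```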
“Who”
City of Winnipeg, population: 635,869
Implementation: When

Object Template: Schedule

Properties:


Activity start time

Activity lasts for

Institution where the activity takes place

Probability of choosing given activity
Special Types:

Working schedule, Leisure schedule
When - Schedules

Sample working schedule:

8:00 8:00 WI 98
17:00 2:00 Mall 30
17:00 2:00 LI 50

Meaning:

98% chance that the agent will spend 8h at a working
institution starting from 8:00 am


30% chance that the agent will spend 2h shopping
starting from 5:00 pm, or
50% chance that the agent will spend 2h at any of
its leisure institutions (Cinema, Restaurant, etc.)
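One way to read the sample schedule is as groups of four fields: start time, duration, institution, and percent chance. The sketch below parses that form and samples one day. The sequential rule (options are tried in order, and an agent already busy skips later options) is an assumption; under it the 17:00 slot splits roughly 30% Mall, 35% leisure institution, 35% neither. All names are hypothetical.

```python
import random
from dataclasses import dataclass

SAMPLE = "8:00 8:00 WI 98 17:00 2:00 Mall 30 17:00 2:00 LI 50"

@dataclass(frozen=True)
class ScheduleEntry:
    start_min: int      # activity start time, minutes after midnight
    duration_min: int   # how long the activity lasts, in minutes
    institution: str    # "WI" = working institution, "LI" = leisure institution
    probability: float  # chance of choosing this activity, in [0, 1]

def to_minutes(hhmm):
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)

def parse_schedule(text):
    """Parse whitespace-separated groups of: start duration institution percent."""
    tokens = text.split()
    if len(tokens) % 4 != 0:
        raise ValueError("expected groups of four fields")
    return [
        ScheduleEntry(to_minutes(tokens[i]), to_minutes(tokens[i + 1]),
                      tokens[i + 2], int(tokens[i + 3]) / 100.0)
        for i in range(0, len(tokens), 4)
    ]

def choose_activities(entries, rng):
    """Sample one day's activities, trying entries in listed order."""
    chosen = []
    busy_until = 0
    for entry in entries:
        if entry.start_min < busy_until:
            continue  # agent is still occupied by an earlier choice
        if rng.random() < entry.probability:
            chosen.append(entry)
            busy_until = entry.start_min + entry.duration_min
    return chosen

entries = parse_schedule(SAMPLE)
day = choose_activities(entries, random.Random(7))
print([e.institution for e in day])
```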
[Figure: schedule timeline. 8:00 branch: Working Institution 98%, otherwise 2%, until 16:00. 17:00 branch: 30%, 35%, 35% across Mall, Leisure Institution, and an unlabelled third branch, until 19:00.]
Implementation: What

Object Template: Virus

Properties:

Probability of contracting the virus 
Incubation period

Infecting period starts

Infecting period ends

Sickness (low mobility) lasts

Extra resistance boost

Extra resistance lasts


Extensions Incorporated:

Seasonal Variations, Mutation
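The Virus template can be read as a timeline measured from the moment of infection. The sketch below makes that reading explicit; it assumes hours as the unit, that sickness begins when incubation ends, and that extra resistance begins when sickness ends. The numeric values are placeholders, not epidemiological data.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Virus:
    p_contract: float        # probability of contracting the virus
    incubation_h: int        # incubation period
    infecting_start_h: int   # infecting period starts
    infecting_end_h: int     # infecting period ends
    sickness_h: int          # sickness (low mobility) lasts
    resistance_boost: float  # extra resistance boost
    resistance_h: int        # extra resistance lasts

def status(virus, hours_since_infection):
    """Return which phases apply at a given time after infection."""
    t = hours_since_infection
    sick_start = virus.incubation_h                 # assumed ordering
    sick_end = sick_start + virus.sickness_h
    return {
        "infectious": virus.infecting_start_h <= t < virus.infecting_end_h,
        "low_mobility": sick_start <= t < sick_end,
        "resistant": sick_end <= t < sick_end + virus.resistance_h,
    }

# Placeholder parameters for illustration only.
flu = Virus(p_contract=0.05, incubation_h=48, infecting_start_h=24,
            infecting_end_h=168, sickness_h=120,
            resistance_boost=0.1, resistance_h=2160)
print(status(flu, 30))
```

With these placeholder values an agent is infectious for a day before showing reduced mobility, which is the window in which a scheduled walker keeps following its normal routine.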
Seasonal Variations
Real Data
Input to model
The User Interface to DSSW
• Parameters for simulation are
set up in a number of files and the
user can step or loop through the
simulation at any given rate
• During the simulation, a
number of plots and statistics
are collected and logged to a
web server where the user
can then further analyze the
simulation run
Analysis

Some data that is available on
the corresponding web server
Seasonal Variations

Seasonal variations are well
known and provide fairly well
“labeled” data for comparison

The figure illustrates the type of
data available

Comparison allows for
a tuning of parameters
to more closely reflect
actual data collected
for a particular disease
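As a stand-in for feeding real seasonal data into the model, a transmission probability could be modulated over the year. The cosine form, the `amplitude` and `peak_day` parameters, and their defaults are assumptions made purely for illustration; in practice they would be the parameters tuned against collected data.

```python
import math

def seasonal_probability(base_p, day_of_year, amplitude=0.5, peak_day=15):
    """Scale a base contraction probability by a yearly cosine cycle.

    Assumptions: one peak per year at peak_day, and a relative swing of
    +/- amplitude around the base value. The result is clamped to [0, 1].
    """
    phase = 2.0 * math.pi * (day_of_year - peak_day) / 365.0
    factor = 1.0 + amplitude * math.cos(phase)
    return min(1.0, max(0.0, base_p * factor))

# Mid-January peak versus mid-July trough for a base probability of 0.1.
print(round(seasonal_probability(0.1, 15), 3))
print(round(seasonal_probability(0.1, 197.5), 3))
```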
Mutations
“tipping point”
“seasonal variation”


A mutation to a deadlier strain or a sudden variation in the
mode of transmission (perhaps the virus has become airborne)
Other uses of the simulator would be in helping to evaluate the
extent of inoculations or policies in the event of a simulated
outbreak. This will allow epidemiologists to “partially close
the loop” when evaluating policy
Summary: Part 2 DSSW



Introduced a reasonable method of epidemic modeling,
taking advantage of opportunities for data mining and
scheduled walkers
The basic characteristic of the model is to extract and
combine real topographic and demographic data. This
work shows that model creation using real data is indeed
feasible, and will likely result in better characterization of
the actual dynamics of an epidemic outbreak
Further work will focus on refining the model, and
validating our conjectures
Demonstration
Questions

Comment: The goal of research like this is to help
prevent or avert a disaster.

Unfortunately its utility won’t be validated unless we
actually have a plague 
Part 3: Extensions and limitations




Current limitations of the DSSW simulation engine
Extensions in behavioral pattern extraction
  • Often from potentially unexpected sources
  • Sources originally conceived for other purposes
Related work
Rationalization of why it is the work for the “Sons/
Daughters of Martha”
Limitations of DSSW
Agree
“All models are wrong
but some models are
useful.”
-- George E.P. Box,
Statistician
“Truth is ever to be
found in the simplicity,
and not in the
multiplicity and
confusion of things.”
-- Sir Isaac Newton
Maybe truth can actually be found in the multiplicity and
confusion of things. -- Me
Ref: Wikipedia
Limitations:




One city (not a show stopper, OO model)
Two types of schedules: work and leisure
  • Actual demographics are more complex
However, much of this information is available
  • Some of which will be derived from disparate and
    unexpected sources
At present DSSW appears well suited to “egalitarian”
type diseases, “who agnostic”
Extensions: Hierarchy


Incorporate Hierarchy
  • Intracity and Intercity
  • Basic modality remains: data-driven models of
    discrete space- and time-walkers, mined from
    available sources.
Cities are largely autonomous
  • Allows the problem to remain tractable and allows
    for efficient modes of computation (parallelism can be
    exploited).
Extensions: Extracting Patterns of
Behaviour


Patterns of behavior can be taken from tracking
technologies that are in place, albeit not mined for use in
epidemic modeling.
  • E.g. Financial Transaction Profiling
      • Usually mined to detect fraud
  • E.g. Cell phone tracking, “where are you” services
      • By default the service provider already knows
        where you are, even more so with GPS
Obstacle: Privacy
Related Research: Extracting Patterns
of Behaviour


Consumer wireless electronics: MAC snooping and
tracking.
  • Bluetooth headsets (ingress and egress of signalized
    arterials)
  • Similar protocols for WiFi
  • Device-enabled kiosks and vending machines
Security cameras and systems with person detection
  • Monitoring for behaviour patterns, such as those of illegal
    activities and terrorist threats
Related Research: Extracting Patterns
of Behaviour continued


Tracking subway ridership.
  • Token data mining of ridership
  • Objective: Bioterrorism impact
Mining online transportation information systems
  • Helsinki public transport
  • Their objective is to provide information for riders; ours
    would be to use this data to model the movement of
    people within a city for disease modeling and its
    possible spread
Related Research: Real-time Helsinki
Public Transport Information
Related Research: Ubiquitous Vehicle
Tracking Cameras
Ref: http://www.edmontontrafficcam.com/cams.php
Related Research: Extracting Patterns
of Behaviour (Economic Impact)

Economic Impact: Costs associated with implementing
policy.
  • Specifically, the economic impact of restricting air
    travel as a policy in controlling a flu pandemic.
  • Models global air travel and estimates impact and
    cost associated with travel restrictions.
  • E.g. 95% travel restriction required before
    significantly impairing disease spread
  • Not a surprise
Related Research: Extracting Patterns
of Behaviour (Economic Impact)
Other sources of information/concern


Occasional/periodic mass gatherings
  • E.g. Olympics or other special event that may perturb
    an overall or global simulation
E.g. The Hajj
  • Largest mass pilgrimage in the world.
  • In 2007 an estimated 2-3 million people participated.
  • Conditions are difficult and thus it offers an
    opportunity for a large scale disease such as
    influenza to take hold.
  • These people then disperse to their home countries,
    many via public transport, and could easily influence
    the spread and outbreak of the disease.
One other model worth mentioning



Wrt: Epizootic modeling
(M.Eng. Mr. Paizen)
The chicken factory
percolation model.
Derived from forest fire
percolation models
http://www.shodor.org/interactivate/activities/fire/?version=1.6.0_07&browser=MSIE&vendor=Sun_Microsystems_Inc.
Percolation Variations


Neighborhoods
Our chicken factory percolation model adds mobility.
Funny things happen at
the percolation threshold
Formation of the infinite
cluster
http://www.svengato.com/forestfire.html
Mobility and Infection Longevity
Mobility/Longevity Impact: substantive shift in the percolation threshold
Percolation threshold is like a tipping point
Mobility has a big effect:
The mobility threshold for plague as a
critical percolation phenomenon for an epizootic
[Figure: percent dead (up to 100%) versus population, with 5% and 42% marked]
Percolation with mobility.
We are sure others have
considered percolation
models with mobility (more
akin to CAs) but likely not in
this context.
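A forest-fire style percolation model with mobility can be sketched in a few lines. This is an illustrative toy, not the chicken factory model itself: the square grid, four-neighbour spread, one-step infection longevity, seeding from the left column, and mobility implemented as random pairwise swaps are all assumptions. For reference, the static square-lattice site percolation threshold is roughly 0.59.

```python
import random

EMPTY, HEALTHY, INFECTED, DEAD = 0, 1, 2, 3

def run(size, density, swaps_per_step, rng):
    """Return the fraction of the initial population that ends up dead."""
    grid = [[HEALTHY if rng.random() < density else EMPTY for _ in range(size)]
            for _ in range(size)]
    population = sum(row.count(HEALTHY) for row in grid)
    # Seed: infect every occupied cell in the left-most column.
    for r in range(size):
        if grid[r][0] == HEALTHY:
            grid[r][0] = INFECTED
    while any(INFECTED in row for row in grid):
        # Mobility: swap the contents of randomly chosen pairs of cells.
        for _ in range(swaps_per_step):
            r1, c1, r2, c2 = (rng.randrange(size) for _ in range(4))
            grid[r1][c1], grid[r2][c2] = grid[r2][c2], grid[r1][c1]
        burning = [(r, c) for r in range(size) for c in range(size)
                   if grid[r][c] == INFECTED]
        for r, c in burning:
            for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                if 0 <= nr < size and 0 <= nc < size and grid[nr][nc] == HEALTHY:
                    grid[nr][nc] = INFECTED
            grid[r][c] = DEAD  # infection lasts exactly one step
    dead = sum(row.count(DEAD) for row in grid)
    return dead / population if population else 0.0

rng = random.Random(1)
for swaps in (0, 20):
    print(swaps, round(run(30, 0.5, swaps, rng), 3))
```

Sweeping `density` for several values of `swaps_per_step` and plotting the dead fraction is one way to look for a shift in the threshold of the kind the slides describe.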
Part 4: New $200 BSERC Challenge

Erdős feeling (keeping with math flavor)
www.ee.umanitoba.ca/~mcleod/BSERC.html
2008 BSERC: Augmenting DSSW

1) Model “Toronto” with the GTA population of 3.5 M
people. (Big City)
  • Great deal of arterial flow of large numbers of
    persons. It may be worthwhile to model the behaviour
    of traffic flows into and out of Toronto as a means of
    modeling how the spread of a disease may occur.
  • (Some cities have web cam traffic monitoring from
    which flows could be inferred, so an interest in
    image processing would be useful; eventually real
    time satellite views may become available).
2008 BSERC

2) Model Las Vegas, where something like 80 million
visitors a year arrive and stay for 3-4 days at a
time. Vegas is a "destination city" (as opposed to other
major cities, where a similar number of people may
arrive, but immediately transit through to other
destinations). (Strange City)

Vegas offers more opportunity for a disease to propagate than a
majority of cities. It is possible to mine air traffic schedules
(Yapta) to get a very reasonable flow in and out of Vegas. There
are a relatively small number of hotels, all mineable from Google
Maps or equivalent, and it seems that just about everyone in
Vegas works in the hotel industry.
2008 BSERC

3) In any city, mining behaviour through scheduled public
transport would be a good way to model contracting
disease. Public transit systems transport a good number
of people in close contact.
  • Estimating ridership would likely be a challenge but
    could be made available if the efficacy of the
    modeling could be demonstrated.
2008 BSERC

4) Refine a model for one institution or one small set of
institutions only - but deal with all parameters thoroughly
and to a high degree of accuracy based on explicit
assumptions and a high degree of sensitivity. Mine
“Aurora”. (Model Institutional Hierarchy). The work here
would be on using agent based models as support for
knowledge bases which can then be further developed to
perhaps provide simple rules-of-thumb institutional
models, providing a demarcation between the details
required and perhaps where computationally simpler
models could take over.
2008 BSERC

5) The EPI@home project. This is a very interesting
option and network programming challenge.
  • In its initial manifestation it is really a cluster based
    implementation of an agent based epidemic model
    well tailored to parallelism. In a cluster, each node
    would run an instance of a city and communication
    would emulate people traveling via air or train. Your
    packets in the TCP/IP sense would transport agents
    as their payload.
  • EPI@home (epi-at-home.com taken for starters)
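The idea of packets carrying agents as payload could be sketched as a serialization round trip between two city nodes. The `Traveller` fields, the JSON encoding, and the message layout are assumptions for illustration; no actual sockets are opened, and the bytes produced would simply be what a node hands to its network layer.

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class Traveller:
    agent_id: int
    home_city: str
    infected: bool
    hours_since_infection: int

def pack_agents(destination, agents):
    """Encode departing agents as the payload of one message."""
    message = {"to": destination, "agents": [asdict(a) for a in agents]}
    return json.dumps(message).encode("utf-8")

def unpack_agents(payload):
    """Decode a payload back into a destination and a list of agents."""
    message = json.loads(payload.decode("utf-8"))
    return message["to"], [Traveller(**fields) for fields in message["agents"]]

# Hypothetical flight from one city node to another.
departing = [Traveller(17, "Winnipeg", True, 30),
             Traveller(42, "Winnipeg", False, 0)]
payload = pack_agents("Toronto", departing)
city, arrivals = unpack_agents(payload)
print(city, len(arrivals), arrivals[0].infected)
```

Keeping each city's simulation independent and exchanging only travellers is what makes the cluster layout attractive: the message volume scales with travel, not with the population of each city.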
2008 BSERC

6) Model mass gatherings. The Hajj, the largest mass
pilgrimage in the world in which an estimated two to three
million people participated in 2007, is an example. In
addition to a large number of people in high density,
physical conditions are difficult and thus it presents an
unnatural pathogen reservoir, providing opportunity for a
large scale disease such as influenza to take hold. Mass
gathering participants then disperse to their homes,
many via public transport, and could easily influence the
spread and outbreak of the disease.
2008 BSERC

7) Preparedness planning: This is a massive undertaking
but one in which our individual city model could be useful
in providing planners with policies and some degree of
expectation how goods and services could be
provisioned in the event of a catastrophe. Simple
investigations as to how long food supplies would last
and could be distributed will be modeled. Provisioning of
resources extempore will lead to an aggravated and
worsening disaster. Models like ours will contribute to
preparedness planning. This can become an effective
modeling tool for any city or municipality, allowing for
provisioning not only of food and supplies but for
inoculation services as well as temporary hospital and/or
mortuary facilities.
2008 GSERC
1) Richard Gordon is keenly interested in how ABMs [1] can
be used in preventing the spread of HIV/AIDS. He has also
offered a cash award ($200) for some solutions in
this area using DSSW. HIV/AIDS is a much tougher nut
to crack using DSSW than a nice “who agnostic”
disease like H5N1 will be when it finds a nice home in
humans (jumps the species gap).
For more details see Richard.
[1] ABMs as in agent based models, not anti-ballistic missiles
How Winnipeg saved the world



Firstly there will be a pandemic of
“Biblical/Torah/Koran/Bhagwat Gita” proportions.
The infrastructure in many places will crumble or be
seriously taxed (possibly not here).
One problem is going to be that associated with the
human race’s “thin veil of civilization”.
  • Our innate savagery is going to be a problem
  • (Hopefully Winnipeg has a strong sense of
    community)
How Winnipeg saved the world


Winnipeg has a high chance of remaining somewhat
unscathed and will play a crucial role in restructuring
Main reasons:
  • To some degree we are isolated (or easily isolated)
  • The Vegetable Guy (Peak of the Market, local food)
  • Stores could be set aside
  • Manitoba Hydro (non-local electrical generation)
  • Manitoba Water (non-local supply, gravity fed)
How Winnipeg saved the world
Power generation: Remote
maintained by “healthy”
individuals
Easily Isolated: Transportation wise
Food production: Local
Water Supply: Remote
Result: Pandemic Lag
Coincidentally MB looks like a Nano
The Point: Technology

Technology is both part of the
problem and simultaneously can
be part of the solution.

E.g. “Super crunching” of the type
discussed here (data mining) from
non-obvious and disparate
sources is a technology solution.

E.g. Mobility reduction through
increased telepresence
Please be advised
you purchased
xyz from abc on
cde.
e.g. Listeria notification
For fun: What’s going to fail first?











Telephones, cellular.
Electricity
Water
Waste disposal (Sewage/Garbage)
Food Distribution
The Internet
Television (maybe we should reserve an analog channel)
Radio
Health Services
Vehicular transport
Civility
Summary





Overviewed two books. (Mentioned at least 11)
Presented our Agent Based Modeling approach to
epidemic simulation.
  • Emphasis on data mining of spatial topologies and
    agent behavior patterns
  • We denoted this the “Discrete Space Scheduled
    Walkers” paradigm (instance of ABM)
Presented several indirect data sources
  • Often no obvious connection to epidemic modeling
BSERC opportunities
Presented one of the many benefits of living in Winnipeg
Thank you

Comments and suggestions very welcome.

Topics of consideration for future modeling
  • Introduce a panic behavior in light of an epidemic