Pandemics, Epidemic Modeling and how Winnipeg saved the World
Derived from an ECE self-invited talk, with a distinct computer engineering flavor.
Robert D. McLeod, Maciej Borkowski, Blake Podaima, et alia
Internet Innovation Centre (IIC)
Dept. Electrical and Computer Engineering, University of Manitoba
© IIC, Oct. 2008
Internet Innovation Center

Overview (Five parts)
Part 1: Brief overview of two interesting books as precursors
    World Without Us
    Pandemonium
Part 2: Our “Agent Based Epidemic Model”
    Motivation
    The Model
        Where, Who, When, What
    Implementation
    Demonstration

Overview cont'd
Part 3: Extensions and Limitations (opportunities)
    Current Limitations
    Related Work (Direct and Indirect)
    Possible extensions and data mining opportunities for CS and ECE (maybe mathematicians)
Part 4: New BSERC challenge and opportunities
    How Winnipeg saved the world
Part 5: Summary

Part 1: Book Reviews
World Without Us
http://www.worldwithoutus.com/index2.html
Very cool site.
Premise: We are all gone
The 2nd Law of Thermodynamics takes over rather quickly, i.e., things decay quickly

[Images: a bridge in 300 years (artist's drawing, of course); Varosha, sort of DMZ'd since 1974]
Relation to present talk: What percentage of people need to be gone before we cannot sustain our complex infrastructure?

Part 1: Book Reviews cont'd
“Pandemonium”
Epizootic (animals)
    Intensive Unit Production
    Engineered lack of genetic diversity
    Ripening for disaster
Epidemic (humankind)
    Everything I now read about, I think I now have.
    Very scary book
    Thought it was written by Stephen King
Viral, bacterial, fungal

The real question: Why are all these chickens looking to the right?
Intensive Unit Production:
    Reduced genetic diversity
    Highly stressed immune system
    Creating unnatural pathogen reservoirs
Estimated 100 billion chickens on the planet.
Highly clustered, sharing the same watershed, people, and air.
Similar problems in agriculture: monocultures

Other “connecting” books:
    The Numerati by Stephen Baker (NR)
    Super Crunchers by Ian Ayres
    The Tipping Point by Malcolm Gladwell
    The Black Swan by Nassim Nicholas Taleb
    Fooled by Randomness by Nassim Taleb
    The Man Who Knew Too Much: Alan Turing and the Invention of the Computer by David Leavitt

Part 2: Agent Based Modeling
General interests in complex systems and modeling
Our research resulted from a programming challenge
This is pure mathematics. Is that a G.H. Hardy reference? No, it's a G. Boole reference.
Make the “equations” as simple as possible, but not simpler (Albert Einstein)
ABM is computational modeling essentially devoid of equations

2007: Our Loose Specification for Epidemic Modeling
Idea: Data mine, where possible, the basic tenets of people-to-people interactions
    Topology: Data mined from maps
    Behaviour: Data mined from “demographics”
This approach develops models based on “real” network topologies and “scheduled” walkers
The goal of the research is to shed additional light on the problems associated with very complicated phenomena through “data-driven” modeling and simulation

Epidemics: Why
Disease modeling of a pandemic-proportion epidemic is an important area of study
There are widely held beliefs that a local disease epidemic or global pandemic in the general population is overdue
This idea is derived from analysis of previous epidemics and catastrophes that puncture the equilibrium (the ‘black swan’ phenomenon)

The Model
Data mining is a common theme in modern information technology
    Analytical methods may not exist or are overly complex
    Data does exist and can be readily extracted
    Statistical methods (and computing power) can now more easily deal with the vast amount of data that is available
Our work is
an attempt to help promote data-driven epidemic simulation and modeling
Where data is available we demonstrate its utility; where unavailable, we demonstrate how it would be utilized
Unavailable data refers to practical or political limitations on access, rather than technical or theoretical availability

“Where”
One of the first things needed is topology data, or physical “network” data; that is, where are we attempting to apply the study of epidemic spread?
These are primarily real places, such as homes, institutions, businesses, industry, schools, hospitals, and transport, within cities of all types
Much of this information is mineable, more so now than ever
“Geographic profiling”

Topological Data Sources
[Images: Google Earth with overlays; Google Maps]

“Who”
Of similar importance to location (where) are the agents (who) being infected
This is data that is generally technically available but practically unavailable
(An approach) Collaborate with organizations that take and house census data.
Data mining of these sources is not a technical issue but rather a political or policy issue, and requires the will of government to make this data available for simulations such as these
Our model attempts to illustrate how the data would be used if available

“When”
An agent's schedule (when) is also of critical importance.
This data is more typically inferred than explicitly available, but as we are primarily creatures of habit, reasonable assumptions can be made
Many of us operate on fairly routine weekday schedules, punctuated by more flexible weekends.
As such, this schedule data (for the sake of the simulation) can be associated with an agent, modified by slight variations in arrival and departure times

“What”
In addition to where, who, and when, we also need to address what.
The what here is typically a disease, either bacterial or viral, communicated with an associated probability of contraction when in contact with an infectious agent
These parameters are adjustable and represent the aspect of the current simulation with the greatest uncertainty in terms of validity; they could benefit from collaboration with, or input from, an epidemiologist

Implementation
Based on the model as described above, it should be clear that our underlying simulation model is that of a Discrete-Space Scheduled Walker (DSSW), in contrast to other models that are more traditionally based on random or Brownian walkers on artificial topologies
We attempt to capture the most important aspects of real-people networks, incorporating (by construction) notions such as “small world” networks
Behaviour: Low mobility when sick or getting sick

Where - Topological Data Sources
This is the starting point for locating various institutions
Winnipeg is used to illustrate the use of map information as input for generating our underlying topology of institutions, where people (agents) potentially come into contact and infect one another
[Map labels: Homes, School, Business, Mall, Transportation, etc.]
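The contact rule described above (agents who share an institution may infect one another with some probability of contraction) can be sketched as follows. This is a minimal illustration, not the DSSW engine itself: the function name `infection_step`, its arguments, and the assumption that each infectious co-occupant counts as an independent exposure are all hypothetical choices made for the example.

```python
import random
from collections import defaultdict

def infection_step(location_of, infectious, susceptible, p_contract, rng):
    """Return the set of agents newly infected during one time step.

    location_of: dict mapping agent id -> institution it currently occupies
    infectious / susceptible: sets of agent ids
    p_contract: probability of contraction per infectious co-occupant
    """
    # Group agents by the institution they are in right now.
    occupants = defaultdict(list)
    for agent, place in location_of.items():
        occupants[place].append(agent)

    newly_infected = set()
    for agents in occupants.values():
        n_infectious = sum(1 for a in agents if a in infectious)
        if n_infectious == 0:
            continue  # nobody here can transmit the disease
        # Assumption: every infectious co-occupant is an independent exposure.
        p_any = 1.0 - (1.0 - p_contract) ** n_infectious
        for a in agents:
            if a in susceptible and rng.random() < p_any:
                newly_infected.add(a)
    return newly_infected
```

A call such as `infection_step({"a": "school", "b": "school", "c": "mall"}, {"a"}, {"b", "c"}, 0.1, random.Random(0))` can only ever infect `"b"`, since `"c"` shares no institution with an infectious agent.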
Implementation: Where
Object Template: Institution
Properties:
    Geographical location
    Probability of contracting a disease i
Special types:
    Home
    Bus
    Car

Implementation: Who
Object Template: Agent
Properties:
    Probability of contracting a disease j
    Probability of using a car
    Working institutions
    Leisure institutions
    Working schedule
    Leisure schedule

“Who”
[Figure: City of Winnipeg, population: 635,869]

Implementation: When
Object Template: Schedule
Properties:
    Activity start time
    Activity lasts for
    Institution where the activity takes place
    Probability of choosing given activity
Special types: Working schedule, Leisure schedule

When - Schedules
Sample working schedule:
    Start time   Lasts for   Institution   Probability (%)
    8:00         8:00        WI            98
    17:00        2:00        Mall          30
    17:00        2:00        LI            50
Meaning:
    98% chance that the agent will spend 8 h at a working institution (WI) starting from 8:00 am
    30% chance that the agent will spend 2 h shopping starting from 5:00 pm, or
    50% chance that the agent will spend 2 h at any of its leisure institutions (LI: cinema, restaurant, etc.)
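The sample schedule can be written down as data and sampled once per simulated day. The sketch below is illustrative only: the `Activity` class, `SAMPLE_SCHEDULE`, and `plan_day` are hypothetical names, and the rule that competing entries at the same start time are tried in order is an assumption. Under that assumption the 17:00 slot works out to 30% mall, 0.70 x 0.50 = 35% leisure institution, and 35% neither, which is consistent with the 30% / 35% / 35% labels on the schedule timeline.

```python
import random
from dataclasses import dataclass

@dataclass
class Activity:
    start: str          # activity start time, "H:MM"
    duration_h: float   # how long the activity lasts, in hours
    institution: str    # "WI" (working), "Mall", or "LI" (leisure)
    probability: float  # chance of choosing this activity

SAMPLE_SCHEDULE = [
    Activity("8:00", 8.0, "WI", 0.98),
    Activity("17:00", 2.0, "Mall", 0.30),
    Activity("17:00", 2.0, "LI", 0.50),
]

def plan_day(schedule, rng):
    """Choose at most one activity per start time, trying entries in order."""
    chosen = {}
    for act in schedule:
        if act.start in chosen:
            continue  # a competing activity at this time was already picked
        if rng.random() < act.probability:
            chosen[act.start] = act
    return list(chosen.values())
```

Calling `plan_day(SAMPLE_SCHEDULE, random.Random())` each simulated morning yields that agent's itinerary for the day; slight variations in arrival and departure times could be layered on top by jittering `start`.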
[Figure: timeline of the sample schedule. 8:00: 98% / 2% branch, Working Institution until 16:00. 17:00: 30% / 35% / 35% branch, Mall or Leisure Institution until 19:00.]

Implementation: What
Object Template: Virus
Properties:
    Probability of contracting the virus
    Incubation period
    Infecting period starts
    Infecting period ends
    Sickness (low mobility) lasts
    Extra resistance boost
    Extra resistance lasts
Extensions incorporated: Seasonal variations, Mutation

Seasonal Variations
[Figure: real data input to the model]

The User Interface to DSSW
• Parameters for the simulation are set up in a number of files, and the user can step or loop through the simulation at any given rate
• During the simulation, a number of plots and statistics are collected and logged to a web server, where the user can then further analyze the simulation run

Analysis
[Figure: some of the data that is available on the corresponding web server]

Seasonal Variations
Seasonal variations are well known and provide fairly well “labeled” data for comparison
The figure illustrates the type of data available
Comparison allows for a tuning of parameters to more closely reflect actual data collected for a particular disease

Mutations
[Figure labels: “tipping point”, “seasonal variation”]
A mutation to a deadlier strain or a sudden variation in the mode of transmission (perhaps the virus has become airborne)
Other uses of the simulator would be in helping to evaluate the extent of inoculations or policies in the event of a simulated outbreak.
This will allow epidemiologists to “partially close the loop” when evaluating policy

Summary: Part 2 DSSW
Introduced a reasonable method of epidemic modeling, taking advantage of opportunities for data mining and scheduled walkers
The basic characteristic of the model is to extract and combine real topographic and demographic data.
This work shows that model creation using real data is indeed feasible, and will likely result in better characterization of the actual dynamics of an epidemic outbreak
Further work will focus on refining the model and validating our conjectures

Demonstration
Questions
Comment: The goal of research like this is to help prevent or avert a disaster. Unfortunately, its utility won't be validated unless we actually have a plague

Part 3: Extensions and Limitations
Current limitations of the DSSW simulation engine
Extensions in behavioral pattern extraction
    Often from potentially unexpected sources
    Sources originally conceived for other purposes
Related work
Rationalization of why it is the work for the “Sons/Daughters of Martha”

Limitations of DSSW
Agree: “All models are wrong but some models are useful.” -- George E.P. Box, Statistician
“Truth is ever to be found in the simplicity, and not in the multiplicity and confusion of things.” -- Sir Isaac Newton
Maybe truth can actually be found in the multiplicity and confusion of things. -- Me
Ref: Wikipedia

Limitations:
One city (not a show stopper, OO model)
Two types of schedules: work and leisure
    Actual demographics are more complex
    However, much of this information is available
    Some of which will be derived from disparate and unexpected sources
At present DSSW appears well suited to “egalitarian” type diseases, “who agnostic”

Extensions: Hierarchy
Incorporate hierarchy
    Intracity and intercity
Basic modality remains: data-driven models of discrete space- and time-walkers, mined from available sources.
Cities are largely autonomous
    Allows the problem to remain tractable and allows for efficient modes of computation (parallelism can be exploited).
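One way to picture the intracity/intercity hierarchy is as a two-phase loop: each city advances on its own, then a small number of agents are exchanged between cities. The sketch below is only an illustration under stated assumptions; `City`, `step_city`, `exchange_travellers`, and `simulate` are hypothetical names, and the intracity step is a crude stand-in (uniform mixing) for a full scheduled-walker simulation.

```python
import random
from dataclasses import dataclass, field

@dataclass
class City:
    name: str
    agents: list = field(default_factory=list)  # dicts: {"id": ..., "infected": bool}

def step_city(city, p_contract, rng):
    """Intracity step: placeholder for a full scheduled-walker simulation."""
    if not any(a["infected"] for a in city.agents):
        return
    for a in city.agents:
        if not a["infected"] and rng.random() < p_contract:
            a["infected"] = True

def exchange_travellers(cities, routes, p_travel, rng):
    """Intercity step: move agents along (origin, destination) routes."""
    departures = []
    for origin, dest in routes:
        for a in cities[origin].agents:
            if rng.random() < p_travel:
                departures.append((a, origin, dest))
    moved = set()
    for a, origin, dest in departures:
        if a["id"] in moved:
            continue  # an agent leaves at most once per exchange
        cities[origin].agents.remove(a)
        cities[dest].agents.append(a)
        moved.add(a["id"])

def simulate(cities, routes, days, p_contract, p_travel, rng):
    for _ in range(days):
        # Each call touches only its own city, so these steps are independent.
        for city in cities.values():
            step_city(city, p_contract, rng)
        exchange_travellers(cities, routes, p_travel, rng)
```

Because `step_city` reads and writes only one city's agents, the inner loop is the part that could be distributed across processes or machines; this sketch runs it sequentially and assumes agent ids are unique across cities.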
Extensions: Extracting Patterns of Behaviour
Patterns of behaviour can be taken from tracking technologies that are in place, albeit not mined for use in epidemic modeling.
E.g. Financial transaction profiling
    Usually mined to detect fraud
E.g. Cell phone tracking, “where are you” services
    By default the service provider already knows where you are, even more so with GPS
Obstacle: Privacy

Related Research: Extracting Patterns of Behaviour
Consumer wireless electronics: MAC snooping and tracking.
    Bluetooth headsets (ingress and egress of signalized arterials)
    Similar protocols for WiFi
Device-enabled kiosks and vending machines
Security cameras and systems with person detection
    Monitoring for behaviour patterns such as those of illegal activities and terrorist threats

Related Research: Extracting Patterns of Behaviour (continued)
Tracking subway ridership.
    Token data mining of ridership
    Objective: Bioterrorism impact
Mining online transportation information systems
    Helsinki public transport
    Their objective is to provide information for riders; ours would be to use this data to model the movement of people within a city for disease modeling and its possible spread

Related Research: Real-time Helsinki Public Transport Information

Related Research: Ubiquitous Vehicle Tracking Cameras
Ref: http://www.edmontontrafficcam.com/cams.php

Related Research: Extracting Patterns of Behaviour (Economic Impact)
Economic impact: Costs associated with implementing policy.
Specifically, the economic impact of restricting air travel as a policy in controlling a flu pandemic.
Models global air travel and estimates the impact and cost associated with travel restrictions.
E.g.
95% travel restriction required before significantly impairing disease spread
Not a surprise

Related Research: Extracting Patterns of Behaviour (Economic Impact)

Other sources of information/concern
Occasional/periodic mass gatherings
E.g. Olympics or other special event that may perturb an overall or global simulation
E.g. The Hajj
    Largest mass pilgrimage in the world.
    In 2007 an estimated 2-3 million people participated.
    Conditions are difficult and thus it offers an opportunity for a large-scale disease such as influenza to take hold.
    These people then disperse to their home countries, many via public transport, and could easily influence the spread and outbreak of the disease.

One other model worth mentioning
Wrt: Epizootic modeling (M.Eng. Mr. Paizen)
The chicken factory percolation model.
Derived from forest fire percolation models
http://www.shodor.org/interactivate/activities/fire/?version=1.6.0_07&browser=MSIE&vendor=Sun_Microsystems_Inc.

Percolation Variations
Neighborhoods
Our chicken factory percolation model adds mobility.
Funny things happen at the percolation threshold
    Formation of the infinite cluster
http://www.svengato.com/forestfire.html

Mobility and Infection Longevity
[Figure: Mobility/Longevity Impact. Axes: Percent dead (up to 100%) vs. Population; marked values 5% and 42%.]
Substantive shift in the percolation threshold
Percolation threshold is like a tipping point
Mobility has a big effect: the mobility threshold for plague as a critical percolation phenomenon for an epizootic

Percolation with mobility.
We are sure others have considered percolation models with mobility (more akin to CAs) but likely not in this context.
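To make the idea of percolation with mobility concrete, here is a minimal sketch of a forest-fire style site model on a square grid in which living birds can also hop to an empty neighbouring cell. It is not the chicken factory model itself: the function `run_outbreak`, its parameters (`density`, `mobility`, `longevity`), the four-neighbour rule, and the update order are all assumptions made for illustration.

```python
import random

EMPTY, HEALTHY, INFECTED, DEAD = range(4)
NEIGHBOURS = [(-1, 0), (1, 0), (0, -1), (0, 1)]

def run_outbreak(size, density, mobility, longevity, rng):
    """Simulate one outbreak; return the fraction of birds that die."""
    grid = [[HEALTHY if rng.random() < density else EMPTY
             for _ in range(size)] for _ in range(size)]
    birds = [(r, c) for r in range(size) for c in range(size)
             if grid[r][c] == HEALTHY]
    if not birds:
        return 0.0
    # Seed with one infected bird; timers count the steps it stays infectious.
    r0, c0 = rng.choice(birds)
    grid[r0][c0] = INFECTED
    timers = {(r0, c0): longevity}
    dead = 0
    while timers:
        # 1) Infection reaches healthy nearest neighbours.
        exposed = set()
        for r, c in timers:
            for dr, dc in NEIGHBOURS:
                nr, nc = r + dr, c + dc
                if 0 <= nr < size and 0 <= nc < size and grid[nr][nc] == HEALTHY:
                    exposed.add((nr, nc))
        # 2) Infected birds age and die when their timer runs out.
        for cell in list(timers):
            timers[cell] -= 1
            if timers[cell] <= 0:
                del timers[cell]
                grid[cell[0]][cell[1]] = DEAD
                dead += 1
        for r, c in exposed:
            grid[r][c] = INFECTED
            timers[(r, c)] = longevity
        # 3) Mobility: each living bird may hop to an empty neighbouring cell.
        movers = [(r, c) for r in range(size) for c in range(size)
                  if grid[r][c] in (HEALTHY, INFECTED)]
        rng.shuffle(movers)
        for r, c in movers:
            if rng.random() >= mobility:
                continue
            dr, dc = rng.choice(NEIGHBOURS)
            nr, nc = r + dr, c + dc
            if 0 <= nr < size and 0 <= nc < size and grid[nr][nc] == EMPTY:
                grid[nr][nc], grid[r][c] = grid[r][c], EMPTY
                if (r, c) in timers:
                    timers[(nr, nc)] = timers.pop((r, c))
    return dead / len(birds)
```

Averaging `run_outbreak` over many seeds while sweeping `density` for a few values of `mobility` and `longevity` is one way to look for the kind of threshold shift described above; the sketch makes no claim about where that threshold lies.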
Part 4: New $200 BSERC Challenge
Erdős feeling (keeping with the math flavor)
www.ee.umanitoba.ca/~mcleod/BSERC.html

2008 BSERC: Augmenting DSSW
1) Model “Toronto” with the GTA population of 3.5 M people. (Big City)
Great deal of arterial flow of large numbers of persons. It may be worthwhile to model the behaviour of traffic flows into and out of Toronto as a means of modeling how the spread of a disease may occur. (Some cities have web cam traffic monitoring from which flows could be inferred, so an interest in image processing would be useful; eventually real-time satellite views may become available.)

2008 BSERC
2) Model Las Vegas, where something like 80 million visitors a year arrive and stay for 3-4 days at a time. Vegas is a "destination city" (as opposed to other major cities, where a similar number of people may arrive, but immediately transit through to other destinations). (Strange City)
Vegas offers more opportunity for a disease to propagate than a majority of cities. It is possible to mine air traffic schedules (Yapta) to get a very reasonable flow in and out of Vegas. There are a relatively small number of hotels, all mineable from Google Maps or equivalent, and it seems that just about everyone in Vegas works in the hotel industry.

2008 BSERC
3) In any city, mining behaviour through scheduled public transport would be a good way to model contracting disease. Public transit systems transport a good number of people in close contact. Estimating ridership would likely be a challenge, but the data could be made available if the efficacy of the modeling could be demonstrated.

2008 BSERC
4) Refine a model for one institution or one small set of institutions only, but deal with all parameters thoroughly and to a high degree of accuracy based on explicit assumptions and a high degree of sensitivity. Mine “Aurora”.
(Model Institutional Hierarchy). The work here would be on using agent based models as support for knowledge bases, which can then be further developed to perhaps provide simple rule-of-thumb institutional models, providing a demarcation between the details required and where computationally simpler models could perhaps take over.

2008 BSERC
5) The EPI@home project. This is a very interesting option and network programming challenge. In its initial manifestation it is really a cluster-based implementation of an agent based epidemic model well tailored to parallelism. In a cluster, each node would run an instance of a city, and communication would emulate people traveling via air or train. Your packets, in the TCP/IP sense, would transport agents as their payload. EPI@home (epi-at-home.com taken for starters)

2008 BSERC
6) Model mass gatherings. The Hajj, the largest mass pilgrimage in the world, in which an estimated two to three million people participated in 2007, is an example. In addition to a large number of people in high density, physical conditions are difficult, and thus it presents an unnatural pathogen reservoir, providing opportunity for a large-scale disease such as influenza to take hold. Mass gathering participants then disperse to their homes, many via public transport, and could easily influence the spread and outbreak of the disease.

2008 BSERC
7) Preparedness planning: This is a massive undertaking, but one in which our individual city model could be useful in providing planners with policies and some degree of expectation of how goods and services could be provisioned in the event of a catastrophe. Simple investigations as to how long food supplies would last and how they could be distributed will be modeled. Provisioning of resources extempore will lead to an aggravated and worsening disaster. Models like ours will contribute to preparedness planning.
This can become an effective modeling tool for any city or municipality, allowing for provisioning not only of food and supplies but also of inoculation services as well as temporary hospital and/or mortuary facilities.

2008 GSERC
1) Richard Gordon is keenly interested in how ABMs (1) can be used in preventing the spread of HIV/AIDS. He has also offered a cash award ($200) for some solutions in this area using DSSW. HIV/AIDS is a much tougher nut to crack using DSSW than a nice “who agnostic” disease like H5N1 will be when it finds a nice home in humans (jumps the species gap). For more details see Richard.
(1) ABMs as in agent based models, not anti-ballistic missiles

How Winnipeg saved the world
Firstly, there will be a pandemic of “Biblical/Torah/Koran/Bhagwat Gita” proportion.
The infrastructure in many places will crumble or be seriously taxed. (possibly not here)
One problem is going to be that associated with the human race's “thin veil of civilization”.
Our innate savagery is going to be a problem (hopefully Winnipeg has a strong sense of community)

How Winnipeg saved the world
Winnipeg has a high chance of remaining somewhat unscathed and will play a crucial role in restructuring
Main reasons:
    To some degree we are isolated. (or easily isolated)
    The Vegetable Guy. (Peak of the Market, local food)
        Stores could be set aside
    Manitoba Hydro. (non-local electrical generation)
    Manitoba Water. (non-local supply, gravity fed)

How Winnipeg saved the world
Power generation: Remote, maintained by “healthy” individuals
Easily isolated: Transportation-wise
Food production: Local
Water supply: Remote
Result: Pandemic lag

Coincidentally, MB looks like a Nano

The Point: Technology
Technology is both part of the problem and simultaneously can be part of the solution.
E.g. “Super crunching” of the type discussed here (data mining) from non-obvious and disparate sources is a technology solution.
    E.g. Listeria notification: “Please be advised you purchased xyz from abc on cde.”
E.g. Mobility reduction through increased telepresence

For fun. What's going to fail first?
    Telephones, cellular
    Electricity
    Water
    Waste disposal (Sewage/Garbage)
    Food distribution
    The Internet
    Television (maybe we should reserve an analog channel)
    Radio
    Health services
    Vehicular transport
    Civility

Summary
Overviewed two books. (Mentioned at least 11)
Presented our Agent Based Modeling approach to epidemic simulation.
    Emphasis on data mining of spatial topologies and agent behavior patterns
    We denoted this the “Discrete Space Scheduled Walkers” paradigm (an instance of ABM)
Presented several indirect data sources
    Often no obvious connection to epidemic modeling
BSERC opportunities
Presented one of the many benefits of living in Winnipeg

Thank-you
Comments and suggestions very welcome.
Topics of consideration for future modeling
    Introduce a panic behavior in light of an epidemic