Sandra Williams
MIS 406
Assignment 1
January 25, 2015

Predicting Disease Outbreaks Based on Internet Search Trends

Everyone wants to know more about themselves, whether for fun, to increase productivity, or to better themselves in other ways. Recently the Quantified Self movement has become a trend: people are collecting data about their day-to-day lives, usually through some sort of technology. In data mining, predictive analytics takes the collected data, extracts the important information, and uses it to predict behavioral trends.

People use technology for many aspects of their lives, so it is not uncommon for them to look up their symptoms online when they are sick. WebMD has even become a common resource for helping people understand what is wrong when they are sick. People might find it easier to stay at home and research their own symptoms, find it less expensive than visiting a hospital, or simply be curious and want to understand their bodies better. As a result, the searches done online become important data in themselves. The location, the time, and the symptoms being searched for can all be traced, and predictive analytics can use them to predict disease outbreaks.

I first became interested in this idea in a creative coding class, when we were data mining certain hashtags on Twitter through a Twitter API. It’s interesting how much data we are constantly creating and sharing (willingly or not) through day-to-day activities. My dad also uses data mining at work and mentioned that Google can predict a disease epidemic before the Centers for Disease Control and Prevention (CDC). I was curious how much truth his statement had.

The four articles I looked at are:

Pelat, Camille et al. “More Diseases Tracked by Using Google Trends.” Emerging Infectious Diseases 15.8 (2009): 1327–1328. PMC. Web. Jan. 2015.
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2815981/

This article was interesting because the authors ran several studies and discussed different keywords (some in French) and how well they correlated with disease incidence. They found that “for each of 3 infectious diseases, 1 well-chosen query was sufficient to provide time series of searches highly correlated with incidence. We have shown the utility of an Internet search engine query data for surveillance of acute diarrhea and chickenpox in a non–English-speaking country. Thus, the ability of Internet search-engine query data to predict influenza in the United States presented by Ginsberg et al. appears to have a broader application for surveillance of other infectious diseases in other countries.” Overall, it showed that search trends are helpful in understanding illnesses and can be used in different countries.

Choi, Hyunyoung, and Hal Varian. “Predicting the Present with Google Trends.” Economic Record 88 (2012): 2–9. Wiley Online Library. Web. Jan. 2015.
http://onlinelibrary.wiley.com/doi/10.1111/j.1475-4932.2012.00809.x/full

This article did not discuss medicine as much. Instead, it covered using Google search data to find patterns and applying that approach to many aspects of life.

Pervaiz, Fahad et al. “FluBreaks: Early Epidemic Detection from Google Flu Trends.” Ed. Gunther Eysenbach. Journal of Medical Internet Research 14.5 (2012): e125. PMC. Web. Jan. 2015.
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3510767/

This article mentioned exactly what my dad told me about: “The Google Flu Trends service was launched in 2008 to track changes in the volume of online search queries related to flu-like symptoms.
Over the last few years, the trend data produced by this service has shown a consistent relationship with the actual number of flu reports collected by the US Centers for Disease Control and Prevention (CDC), often identifying increases in flu cases weeks in advance of CDC records.” It’s interesting that it could identify an increase so much faster than the CDC, and I’m curious about the reasons. I was disappointed, though, when the article said that the Google Flu Trends service can’t simply predict epidemics, and that it’s only a “baseline indicator of the trend, or changes, in the number of disease cases”. I’m sure there will be ways to tweak it and make it more powerful in the future so that it can fully predict epidemics of many different diseases.

Eysenbach, Gunther. “Infodemiology: Tracking Flu-Related Searches on the Web for Syndromic Surveillance.” AMIA Annual Symposium Proceedings 2006 (2006): 244–248. Print.
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1839505/

This article described a study in Canada that ran for 33 weeks in 2004/2005, gathering data on flu-related searches from the internet. It went into more depth on infodemiology, which studies how health information is distributed among the public. “The Internet has made measurable what was previously immeasurable: The distribution of health information in a population, tracking health information trends over time, and identifying gaps between information supply and demand.” The author also discusses patterns of “(mis)information” outbreaks and the need to fill gaps in the public’s knowledge.

One concern I have is privacy: does the greater good of predicting, and hopefully stopping, an outbreak of a disease trump the privacy issues of gathering people’s data and searches? Also, how reliable are people’s search habits on Google? When Ebola first broke in the news, I’m sure a lot of people were Googling the symptoms. Wouldn’t that create a false impression of an epidemic based on what’s popular in the media?
Is it risky that people can go online and self-diagnose? There’s always the joke that people go on WebMD and diagnose themselves with crazy things just because some of the symptoms match.

Are there other ways Google could stop diseases? Perhaps countries that don’t have good medical supplies could connect online, so that other countries would know how best to aid people there when they are sick. Or, by picking up information from people who are traveling, it could help prevent diseases from spreading from one country to another by predicting who might be carrying a disease and quarantining them.
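
To make the idea of search data being “highly correlated with incidence” more concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions, not the method used in any of the four articles: the weekly numbers are made up (the case counts are deliberately built to trail the searches by two weeks), and the function names `pearson` and `best_lead` are hypothetical. It computes the Pearson correlation between weekly search volume and reported cases, then checks how many weeks the searches appear to lead the cases.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / sqrt(var_x * var_y)


def best_lead(searches, cases, max_lead):
    """Try leads of 0..max_lead weeks and return (lead, r) with the highest r.

    A lead of k compares searches in week t with cases in week t + k.
    """
    best = None
    for lead in range(max_lead + 1):
        xs = searches[: len(searches) - lead]
        ys = cases[lead:]
        r = pearson(xs, ys)
        if best is None or r > best[1]:
            best = (lead, r)
    return best


# Made-up weekly data, for illustration only (not real search or CDC figures).
searches = [10, 12, 15, 30, 55, 80, 60, 35, 20, 12, 10, 9]
# Synthetic cases: half the search volume from two weeks earlier.
cases = [5, 5, 5, 6, 7.5, 15, 27.5, 40, 30, 17.5, 10, 6]

lead, r = best_lead(searches, cases, max_lead=4)
print(f"searches lead cases by {lead} week(s), r = {r:.3f}")
```

A high correlation like this would only show that the two series move together. It would not rule out the media-driven spikes I worried about above, where searches rise without a matching rise in cases.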