Fang Jin
Assistant Professor
Department of Computer Science
Texas Tech University
Email: [email protected]

Research Interests

My research area is machine learning and data mining. My most recent work has focused on information propagation modeling, graph mining, group anomaly detection, and spatiotemporal data analysis. Typical applications include detecting disease outbreaks using public health data such as hospital visits and medication sales; detecting and predicting civil unrest events using historical crime records and streaming Twitter data; detecting rumors in social networks; and using social media data to detect traffic congestion, excessive air pollution, and power outages.

Employment

Assistant Professor, Department of Computer Science, Texas Tech University. Jan 2017 to Present
Research Scientist, Department of Electrical & Computer Engineering, Virginia Tech. Aug 2016 to Dec 2016
Software Engineer, Beijing High Performance Computing Center, Beijing, China. Jul 2009 to Aug 2011

Education

Ph.D., Computer Science, Virginia Tech, Jan 2012 to Jun 2016
M.S., Information Processing, Chinese Academy of Sciences, 2009
B.S., Electronics Science & Technology, Nanjing University of Posts and Telecommunications, China, 2006

Publications

1. Fang Jin, Feng Chen, Rupen Paul, Chang-Tien Lu, Naren Ramakrishnan. Absenteeism Detection in Social Media, in Proceedings of the SIAM International Conference on Data Mining (SDM'17), Houston, TX, Apr 2017.
2. Fang Jin, Wei Wang, Prithwish Chakraborty, Nathan Self, Feng Chen, Naren Ramakrishnan. Tracking Multiple Social Media for Stock Market Event Prediction. Under review, ICDM, 2017.
3. Liang Zhao, Jiangzhuo Chen, Feng Chen, Fang Jin, Wei Wang, Chang-Tien Lu, Naren Ramakrishnan. Social Media-driven Online Epidemics Modeling by Adaptive Semi-supervised Multilayer Perceptron, ACM Transactions on Knowledge Discovery from Data (TKDD), 2016, submitted.
4. Fang Jin, Rupinder Paul Khandpur, Nathan Self, Edward Dougherty, Sheng Guo, Feng Chen, B. Aditya Prakash, Naren Ramakrishnan. Modeling Mass Protest Adoption in Social Network Communities using Geometric Brownian Motion, in Proceedings of the 20th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD'14), pages 1660-1669, Aug 2014. Recipient of KDD 2014 NSF student travel award.
5. Fang Jin, Edward Dougherty, Parang Saraf, Peng Mi, Yang Cao, and Naren Ramakrishnan. Epidemiological modeling of news and rumors on Twitter, in Proceedings of the 7th ACM SIGKDD Workshop on Social Network Mining and Analysis, Chicago, IL, 2013, pages 8:1-8:9. Recipient of Best Paper Award and Student Travel Award.
6. Fang Jin, Wei Wang, Liang Zhao, Edward Dougherty, Yang Cao, Chang-Tien Lu, Naren Ramakrishnan. Misinformation Propagation in the age of Twitter, IEEE Computer, Volume 47, Issue 12, pages 90-94, Dec 2014.
7. Fang Jin, Nathan Self, Parang Saraf, Patrick Butler, Wei Wang, Naren Ramakrishnan. Forex-Foreteller: Currency Trend Modeling using News Articles, in Proceedings of the 19th ACM SIGKDD Conference on Knowledge Discovery and Data Mining - Demo Track, pages 1470-1473, Aug 2013.
8. Yan Wang, Guofu Wang, Wenchuan He, Feng Gao, Jiang Zhu, Mingnong Feng, Fang Jin. Tiered storage technology of meteorological spatial data. Chinese Meteorological Society & Meteorological Communications and Information Technology Committee Scientific Meeting, 2011.
9. Min Wei, Lanning Wang, Fang Jin. The implementation of coupling algorithm for MOM4 and BCC_CSM Model, in Information Science and Engineering (ICISE), 2010 2nd International Conference on, pp. 1600-1604. IEEE, 2010.
10. Fang Jin, Hongqun Zhang, Xiaoqing Ge. Research of Fault Diagnosis Expert System for Satellites Receiving System. Microcomputer Information, 7 (2009): 107-108.

Book Section

1. Edward A. Fox, Monika Akbar, Sherif Hanie El Meligy Abdelhamid, Noha Ibrahim Elsherbiny, Mohamed Magdy Gharib Farag, Fang Jin, Jonathan P. Leidig, Sai Tulasi Neppali. Computing Handbook, Third Edition, Vol. 2 (Information Systems and Information Technology). Section 3, Ch. 18, ed. by Heikki Topi, Allen Tucker, Chapman Hall/CRC Press, Taylor and Francis Group, ISBN 9781439898444, http://www.crcpress.com/product/isbn/9781439898543, May 2014.

Research Experience

Analyze the Influence of Climate Change on Civil Unrest
Jul 2015 to Jul 2016
Climate change significantly affects people's behavior, potentially exacerbating social and political unrest. This project sought to identify climate-related unrest events in Latin America by investigating their generation, development, and evolution in both real-world and social media networks. Some highlights of this research include:
* Designed a document classifier that identifies historical climate events using a heterogeneous nearest neighbor strategy. Given a predefined pool of climate event descriptions, the classifier learns their semantic, temporal, and spatial similarities and integrates these similarities with SVD to determine a new event's category.
* Improved StoryTelling algorithms so that they can infer information chains related to a specific climate event and then track its generation, development, and evolution. These enhanced StoryTelling algorithms automatically highlight an individual climate event's causalities.
* Developed a dynamic query model to investigate information propagation related to specific events in social media, making it possible to mitigate the effects of unrest by blocking key players with subgraph optimization algorithms.

Forecasting Civil Unrest Events in Latin America
Jan 2015 to Jun 2015
This project sought to understand and quantify the way ideas spread by modeling and predicting the movement of information within social media outlets such as Twitter, providing a basis for future research in this area.
* Adapted geometric Brownian motion and traditional network graph theory to quantify the stochastic nature of Twitter topic propagation.
* Created a model of civil unrest propagation patterns in social media and designed a simulation algorithm for civil unrest event prediction.
* Designed a new algorithm that dynamically extracts protest keywords from social media to identify indicators closely linked to civil unrest, such as group absenteeism signals.
* Developed new models from scratch to predict the location, time, group size, and event type of civil unrest events.
* Combined multiple models in an ensemble to improve prediction performance by suppressing potential negative warnings, boosting positive warnings, or rewriting one or more warning properties such as population or event type.

Forecasting Disease Outbreaks (Ebola, influenza)
Jan 2014 to Dec 2014
Modern epidemiological forecasts of common illnesses are difficult because of the delays associated with traditional surveillance sources and digital surveillance data such as social network activity and search queries. This project aimed to develop robust quantitative predictions of temporal trends in epidemiological disease incidence for Latin American countries using several surrogate data sources.
* Designed algorithms to extract relevant features from social media data such as Twitter, Facebook, news/blogs, Twitter URLs, and Wikipedia, and created new frameworks to streamline the resulting large data flows.
* Integrated social indicators and physical indicators to leverage the selective superiorities of both types of feature sets, making it possible to develop matrix factorization models using neighborhood embedding that forecast disease outbreaks from a combination of social indicators and physical indicators such as weather data.
* Investigated the efficacy of combining diverse sources at two levels: the data level and the model level.

Forecasting Stock Market Fluctuations using Multiple Social Media Sources
Jul 2012 to Dec 2013
The rapid growth of highly diverse forms of social media has enabled economists to leverage micro-level, real-time indicators of factors that could influence the market, such as public emotion, anticipation, and behavior. By mining specific market features from varied sources such as news, Google Trends, and Twitter, this project investigated the correlations between these features and stock market fluctuations, and constructed a prediction model that combined all of those features.
* Designed text mining algorithms to study extreme fluctuations in Latin American stock markets by analyzing patterns in news sources. The new algorithms focused on learning topic distributions and identifying sentiment patterns in the news for various extreme fluctuation scenarios.
* Applied group Lasso to identify the most informative terms from Google Trends, constructed a tweet entity network, and applied a one-class SVM to learn the anomaly patterns in this tweet entity network.
* Built a stock market fluctuation model by combining the features learned from the news, Google Trends, and Twitter using feature-level fusion and model-level fusion.

Honors and Awards

1. NSF student travel award, the 20th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2014), New York, Aug 2014.
2. Best paper award, the 7th ACM SIGKDD Workshop on Social Network Mining and Analysis (SNA-KDD 2013), Chicago, IL, Aug 2013.
3. Student Travel Award, the 7th ACM SIGKDD Workshop on Social Network Mining and Analysis (SNA-KDD 2013), Chicago, IL, Aug 2013.
4. Honor student and excellent student cadre of Graduate University of Chinese Academy of Sciences (GUCAS), 2008.
5. Honor student and excellent student cadre of GUCAS (the only one in my class), 2007.