Open Source Social Media Analytics for Intelligence and Security Informatics Applications

Swati Agarwal1, Ashish Sureka2, and Vikram Goyal1

1 Indraprastha Institute of Information Technology, Delhi (IIITD), India
{swatia,vikram}@iiitd.ac.in
http://www.iiitd.ac.in/
2 Software Analytics Research Lab (SARL), India
[email protected]
http://www.software-analytics.in/

Abstract. Open-Source Intelligence (OSINT) is intelligence collected and inferred from publicly available and overt sources of information. Open-Source social media intelligence is a sub-field within OSINT with a focus on extracting insights from publicly available data in Web 2.0 platforms like Twitter (micro-blogging website), YouTube (video-sharing website) and Facebook (social-networking website). In this tutorial, we will provide an overview of Intelligence and Security Informatics (ISI) applications in the domain of open-source social media intelligence. We will introduce a basic Machine Learning based framework, tools and techniques within the context of open-source social media intelligence. The focus of our tutorial is on mining free-form textual content present in social media websites. In particular, we will focus on two important applications: online radicalization and civil unrest.
In addition to covering basic concepts and applications, we will discuss open research problems, important papers and results, and future directions.

Keywords: Information Retrieval, Intelligence and Security Informatics, Machine Learning, Mining User Generated Content, Open-Source Intelligence, Social Media Analytics

1 Basic Information

Duration: Half-Day (3-4 Hours)
Pre-requisite: Basic understanding of Social Media and exposure to data mining tools and techniques
Target Audiences: MS/MTech and PhD students, Faculty Members and Researchers in Industry working or interested in the area of Social Media Analytics and Intelligence and Security Informatics

[Figure 1 labels: Online Social Media Platforms (Video Sharing Platforms, Micro-blogging Websites, Blogs and Discussion Forums); Intelligence & Security Informatics (Issues Raised by Government and Law Enforcement Agencies); Text Mining and Analytics (Machine Learning, Information Retrieval, Text Visualization); Content Identification, Event Forecasting, Community Detection]
Fig. 1: Diagram illustrating the scope of the tutorial: work at the intersection of Social Media, Machine Learning and Intelligence & Security Informatics

Learning Outcome: The tutorial will cover fundamental techniques, applications, research problems and future directions. The following are the three specific learning outcomes:
1. Familiarity with Intelligence and Security Informatics applications in the domain of Open Source Social Media
2. Basics of Machine Learning based framework, tools and techniques for malicious content detection
3. Overview of important research problems, solution approaches, results and conclusions from recent research papers

2 Tutorial Outline

Figure 1 illustrates the scope of the tutorial: the intersection of (1) Online Social Media Platforms, (2) Intelligence and Security Informatics and (3) Text Mining and Analytics.
In particular, we will discuss mining user generated content, online radicalization, civil disobedience and mobilization, content identification, event forecasting and community detection. The tutorial will be based on some of the recent research work and publications by the authors [1][2][3][4][5][6][7][8]. The following is the list of topics and case studies to be covered in the tutorial:
1. Web 2.0 and Social Media Platforms
2. Open-Source Intelligence (OSINT)
3. Open-Source Social Media Intelligence Applications
4. Machine Learning Framework
5. Mining Social Media Textual Data
6. Case-Study on Online Radicalization Detection
7. Case-Study on Civil Unrest Prediction
8. Open Research Problems and Future Directions

2.1 A Focused Crawler for Mining Hate and Extremism Promoting Videos on YouTube

We formulate the identification of hate and extremism promoting videos as a search problem and present a focused-crawler based approach whose components each perform a distinct task: a search strategy (algorithm), a node similarity computation metric, learning from exemplary profiles that serve as training data, a stopping criterion, a node classifier and a queue manager.

2.2 Investigating the Potential of Aggregated Tweets as Surrogate Data for Forecasting Civil Protests

We will present our solution approach, which consists of location and temporal expression extractors, named-entity recognizers, planning & mobilization and crowd-buzz & commentary classifiers, and a location-time-topic correlation miner. We conduct a series of experiments on a large real-world dataset and demonstrate the effectiveness of our approach.

3 Presenter’s Brief Bio

Swati3 is a PhD Scholar at IIIT Delhi. Her research interests are in the area of Social Media Analytics, Mining User Generated Content and Intelligence and Security Informatics. She has published several research papers in prestigious conferences and workshops in her area of work.
Ashish4 is currently a Visiting Researcher at Siemens Corporate Research and an Adjunct Faculty at IIIT Delhi. His research interests are in the area of Social Media Analytics and Mining Software Repositories. He has a PhD in Computer Science from NCSU and has worked at IBM Research (USA), IIIT Delhi, Infosys, TRDDC and Siemens. He has graduated several PhD and MTech students.

Vikram5 is currently a faculty member at IIIT Delhi. His research interests are in the area of Database Systems, Geo-Social Networks, and Data Mining. He has a PhD in Computer Science from IIT Delhi. He has published several research papers in prestigious conferences, workshops and journals and is advising several PhD and MTech students.

3 http://www.iiitd.edu.in/ swatia/
4 http://www.software-analytics.in
5 https://www.iiitd.edu.in/ vikram/

Bibliography

[1] Agarwal, S., Sureka, A.: Copyright infringement detection of music videos on youtube by mining video and uploader meta-data. In: Big Data Analytics (BDA). pp. 48–67 (2013)
[2] Agarwal, S., Sureka, A.: A focused crawler for mining hate and extremism promoting videos on youtube. In: 25th ACM Conference on Hypertext and Social Media (HT). pp. 294–296 (2014)
[3] Agarwal, S., Sureka, A.: Learning to classify hate and extremism promoting tweets. In: Intelligence and Security Informatics Conference (JISIC). pp. 320–320 (2014)
[4] Agarwal, S., Sureka, A.: Topic-specific youtube crawling to detect online radicalization. In: Databases in Networked Information Systems (DNIS). pp. 133–151 (2015)
[5] Agarwal, S., Sureka, A.: A topical crawler for uncovering hidden communities of extremist micro-bloggers on tumblr. In: 5th Workshop on Making Sense of Microposts (MICROPOSTS) (2015)
[6] Agarwal, S., Sureka, A.: Using common-sense knowledge-base for detecting word obfuscation in adversarial communication. In: Workshop on Future Information Security (FIS) (2015)
[7] Agarwal, S., Sureka, A.: Using knn and svm based one-class classifier for detecting online radicalization on twitter. In: Distributed Computing and Internet Technology (ICDCIT). pp. 431–442 (2015)
[8] Aggarwal, N., Agarwal, S., Sureka, A.: Mining youtube metadata for detecting privacy invading harassment and misdemeanor videos. In: Privacy, Security and Trust (PST). pp. 84–93 (2014)
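
The component list in Section 2.1 (search strategy, node similarity metric, learning from exemplary profiles, stopping criterion, node classifier, queue manager) can be illustrated with a small, self-contained sketch. This is not the authors' implementation. It assumes a best-first search strategy, cosine similarity over bag-of-words vectors, a fixed similarity threshold as the node classifier and a visit budget as the stopping criterion. It runs on a hypothetical in-memory graph (GRAPH, TEXTS, EXEMPLARS, using a neutral placeholder topic) instead of a real video platform, and all function names are illustrative.

```python
import heapq
import math
from collections import Counter

# Hypothetical toy data: node texts, link graph, and exemplar documents.
EXEMPLARS = ["chess opening theory", "chess endgame tactics"]
TEXTS = {
    "seed": "chess tactics for beginners",
    "v1": "advanced chess opening traps",
    "v2": "cooking pasta at home",
    "v3": "chess endgame theory explained",
    "v4": "more pasta recipes",
}
GRAPH = {"seed": ["v1", "v2"], "v1": ["v3"], "v2": ["v4"], "v3": [], "v4": []}


def term_vector(text):
    """Lower-cased bag-of-words term-frequency vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Node similarity metric: cosine between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_profile(exemplar_texts):
    """Learning step (assumed): merge exemplars into one reference vector."""
    profile = Counter()
    for text in exemplar_texts:
        profile.update(term_vector(text))
    return profile


def focused_crawl(graph, texts, seed, profile, threshold=0.3, max_visits=10):
    """Best-first crawl from `seed`; returns nodes classified as relevant."""
    # Queue manager: heapq is a min-heap, so scores are negated.
    frontier = [(-cosine(term_vector(texts[seed]), profile), seed)]
    seen = {seed}
    relevant = []
    visits = 0
    while frontier and visits < max_visits:  # stopping criterion
        neg_score, node = heapq.heappop(frontier)
        visits += 1
        if -neg_score < threshold:  # node classifier: below threshold
            continue  # irrelevant node, so its links are not expanded
        relevant.append(node)
        for neighbour in graph.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                score = cosine(term_vector(texts[neighbour]), profile)
                heapq.heappush(frontier, (-score, neighbour))
    return relevant
```

With the toy data above, `focused_crawl(GRAPH, TEXTS, "seed", build_profile(EXEMPLARS))` returns `['seed', 'v1', 'v3']`: the off-topic node `v2` is popped but rejected by the threshold, so its neighbour `v4` is never enqueued. In a real setting the threshold classifier would be replaced by a trained model and the graph lookups by platform API calls.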