Open Source Social Media Analytics for
Intelligence and Security Informatics
Applications
Swati Agarwal1, Ashish Sureka2, and Vikram Goyal1
1 Indraprastha Institute of Information Technology, Delhi (IIITD), India
{swatia,vikram}@iiitd.ac.in
http://www.iiitd.ac.in/
2 Software Analytics Research Lab (SARL), India
[email protected]
http://www.software-analytics.in/
Abstract. Open-Source Intelligence (OSINT) is intelligence collected
and inferred from publicly available and overt sources of information.
Open-Source social media intelligence is a sub-field within OSINT with
a focus on extracting insights from publicly available data in Web 2.0
platforms like Twitter (micro-blogging website), YouTube (video-sharing
website) and Facebook (social-networking website). In this tutorial, we
will provide an overview of Intelligence and Security Informatics (ISI)
applications in the domain of open-source social media intelligence. We
will introduce a basic Machine Learning based framework, tools and techniques within the context of open-source social media intelligence. The
focus of our tutorial is on mining free-form textual content present in
social media websites. In particular, we will focus on two important applications: online radicalization and civil unrest. In addition to covering
basic concepts and applications, we will discuss open research problems,
important papers and results, and future directions.
Keywords: Information Retrieval, Intelligence and Security Informatics, Machine Learning, Mining User Generated Content, Open-Source Intelligence, Social Media Analytics
1 Basic Information
Duration Half-Day (3-4 Hours)
Pre-requisite Basic understanding of Social Media and exposure to data mining tools and techniques
Target Audiences MS/MTech and PhD students, Faculty Members and Researchers in Industry working or interested in the area of Social Media Analytics and Intelligence and Security Informatics
[Figure 1 shows three areas. Online Social Media Platforms: video-sharing platforms, micro-blogging websites, blogs and discussion forums. Intelligence & Security Informatics: issues raised by government and law enforcement agencies. Text Mining and Analytics: machine learning, information retrieval, text visualization. At their intersection: content identification, event forecasting, community detection.]
Fig. 1: Diagram illustrating the scope of the tutorial: work at the intersection of
Social Media, Machine Learning and Intelligence & Security Informatics
Learning Outcome The tutorial will cover fundamental techniques, applications, research problems and future directions. The three specific
learning outcomes are:
1. Familiarity with Intelligence and Security Informatics applications in the
domain of Open Source Social Media
2. Basics of Machine Learning based framework, tools and techniques for
malicious content detection
3. Overview of important research problems, solution approaches, results
and conclusions from recent research papers
2 Tutorial Outline
Figure 1 illustrates the scope of the tutorial: the intersection of (1) Online Social
Media Platforms, (2) Intelligence and Security Informatics, and (3) Text Mining and
Analytics. In particular, we will discuss mining user generated content, online
radicalization, civil disobedience and mobilization, content identification, event
forecasting and community detection. The tutorial will be based on some of
the recent research work and publications by the authors [1][2][3][4][5][6][7][8].
Following is the list of topics and case-studies to be covered in the tutorial:
1. Web 2.0 and Social Media Platforms
2. Open-Source Intelligence (OSINT)
3. Open-Source Social Media Intelligence Applications
4. Machine Learning Framework
5. Mining Social Media Textual Data
6. Case-Study on Online Radicalization Detection
7. Case-Study on Civil Unrest Prediction
8. Open Research Problems and Future Directions
2.1 A Focused Crawler for Mining Hate and Extremism Promoting Videos on YouTube
We formulate the identification of hate and extremism promoting videos as a search
problem and present a focused-crawler based approach built from the following components: a search strategy or algorithm, a node similarity
computation metric, learning from exemplary profiles that serve as training data,
a stopping criterion, a node classifier and a queue manager.
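The components listed above can be pictured with a minimal sketch. It is not the authors' crawler: it assumes a best-first search strategy, bag-of-words cosine similarity as the node similarity metric, a fixed similarity threshold as the node classifier, and a visit budget as the stopping criterion. The toy graph, the exemplar texts and all function names are hypothetical, and no real YouTube API is called.

```python
import heapq
import math
from collections import Counter

# Hypothetical toy link graph: node id -> (text metadata, linked node ids).
TOY_GRAPH = {
    "seed": ("hate speech rally video", ["a", "b"]),
    "a": ("extremist hate propaganda speech", ["c", "d"]),
    "b": ("cute cat compilation", ["e"]),
    "c": ("hate propaganda recruitment video", []),
    "d": ("cooking pasta recipe", []),
    "e": ("funny dog video", []),
}


def vectorize(text):
    """Bag-of-words term counts for a piece of text."""
    return Counter(text.lower().split())


def cosine(u, v):
    """Cosine similarity between two term-count vectors."""
    dot = sum(count * v[term] for term, count in u.items())
    norm = math.sqrt(sum(c * c for c in u.values())) * math.sqrt(
        sum(c * c for c in v.values())
    )
    return dot / norm if norm else 0.0


def focused_crawl(graph, seed, exemplars, threshold=0.3, max_visits=10):
    """Best-first crawl that only expands nodes classified as relevant."""
    # Learning from exemplary profiles: merge exemplar texts into one profile.
    profile = Counter()
    for doc in exemplars:
        profile.update(vectorize(doc))

    # Queue manager: a max-priority queue, built by negating scores in a min-heap.
    frontier = [(-1.0, seed)]
    seen = {seed}
    relevant = []
    visits = 0
    while frontier and visits < max_visits:  # stopping criterion (assumed)
        _, node = heapq.heappop(frontier)
        visits += 1
        text, links = graph[node]
        score = cosine(vectorize(text), profile)  # node similarity metric
        if score < threshold:  # node classifier (assumed fixed threshold)
            continue  # irrelevant nodes are not expanded
        relevant.append(node)
        for linked in links:
            if linked in graph and linked not in seen:
                seen.add(linked)
                # A link inherits its parent's score as crawl priority.
                heapq.heappush(frontier, (-score, linked))
    return relevant


EXEMPLARS = ["hate propaganda speech", "extremist recruitment video"]
result = focused_crawl(TOY_GRAPH, "seed", EXEMPLARS)
print(result)
```

In this toy run, the node "b" is classified as irrelevant, so the node "e" that is reachable only through it is never fetched. Pruning of this kind is what distinguishes a focused crawler from an exhaustive one.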
2.2 Investigating the Potential of Aggregated Tweets as Surrogate Data for Forecasting Civil Protests
We will present our solution approach, which consists of components such as
location and temporal expression extractors, named-entity recognizers, planning
& mobilization and crowd-buzz & commentary classifiers, and a location-time-topic
correlation miner. We conduct a series of experiments on a large real-world
dataset and demonstrate the effectiveness of our approach.
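As a rough illustration of how such components could fit together, the sketch below chains a gazetteer-based location extractor, a regular-expression temporal extractor, a keyword-based planning classifier and a co-occurrence counter. It is a simplification under stated assumptions, not the approach evaluated by the authors: the gazetteer, the cue words, the date pattern, the sample tweets and all function names are hypothetical, and the topic dimension is left out for brevity.

```python
import re
from collections import Counter

# Hypothetical gazetteer and cue words; a real system would use trained models.
LOCATIONS = {"delhi", "mumbai"}
PLANNING_CUES = {"join", "gather", "protest", "rally"}
DATE_RE = re.compile(
    r"\b(?:\d{1,2}\s+(?:jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)[a-z]*"
    r"|today|tomorrow)\b"
)


def extract(tweet):
    """Return (locations, temporal expressions, is_planning) for one tweet."""
    text = tweet.lower()
    tokens = set(re.findall(r"[a-z]+", text))
    locations = sorted(tokens & LOCATIONS)      # location extractor
    dates = DATE_RE.findall(text)               # temporal expression extractor
    is_planning = bool(tokens & PLANNING_CUES)  # planning & mobilization classifier
    return locations, dates, is_planning


def mine_correlations(tweets, min_support=2):
    """Count (location, date) pairs across planning tweets; keep frequent ones."""
    counts = Counter()
    for tweet in tweets:
        locations, dates, is_planning = extract(tweet)
        if not is_planning:
            continue  # commentary-only tweets do not contribute
        for loc in locations:
            for date in set(dates):
                counts[(loc, date)] += 1
    return [
        (loc, date, n)
        for (loc, date), n in counts.most_common()
        if n >= min_support
    ]


# Hypothetical sample tweets, invented for illustration only.
TWEETS = [
    "Join the march in Delhi on 12 March against the new bill",
    "Everyone gather at Delhi 12 March, bring banners",
    "Traffic in Mumbai was terrible today",
    "Protest in Mumbai on 5 April",
]
alerts = mine_correlations(TWEETS)
print(alerts)
```

Aggregating over many tweets is the point of the design: a single tweet that mentions a place and a date is weak evidence, while repeated co-occurrence of the same pair in planning tweets is a stronger signal.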
3 Presenters’ Brief Bio
Swati3 is a PhD Scholar at IIIT Delhi. Her research interests are in the area
of Social Media Analytics, Mining User Generated Content and Intelligence and
Security Informatics. She has published several research papers in prestigious
conferences and workshops in her area of work.
Ashish4 is currently a Visiting Researcher at Siemens Corporate Research and
an Adjunct Faculty at IIIT Delhi. His research interests are in the area of Social
Media Analytics and Mining Software Repositories. He has a PhD in Computer
Science from NCSU and has worked at IBM Research (USA), IIIT Delhi, Infosys,
TRDDC and Siemens. He has graduated several PhD and MTech students.
Vikram5 is currently a faculty member at IIIT Delhi. His research interests
are in the area of Database Systems, Geo-Social Networks, and Data Mining. He
has a PhD in Computer Science from IIT Delhi. He has published several research
papers in prestigious conferences, workshops and journals and advises several
PhD and MTech students.
3 http://www.iiitd.edu.in/ swatia/
4 http://www.software-analytics.in
5 https://www.iiitd.edu.in/ vikram/
Bibliography
[1] Agarwal, S., Sureka, A.: Copyright infringement detection of music videos
on YouTube by mining video and uploader meta-data. In: Big Data Analytics
(BDA). pp. 48–67 (2013)
[2] Agarwal, S., Sureka, A.: A focused crawler for mining hate and extremism
promoting videos on YouTube. In: 25th ACM Conference on Hypertext and
Social Media (HT). pp. 294–296 (2014)
[3] Agarwal, S., Sureka, A.: Learning to classify hate and extremism promoting
tweets. In: Intelligence and Security Informatics Conference (JISIC). pp. 320–320 (2014)
[4] Agarwal, S., Sureka, A.: Topic-specific YouTube crawling to detect online
radicalization. In: Databases in Networked Information Systems (DNIS). pp.
133–151 (2015)
[5] Agarwal, S., Sureka, A.: A topical crawler for uncovering hidden communities
of extremist micro-bloggers on Tumblr. In: 5th Workshop on Making Sense
of Microposts (MICROPOSTS) (2015)
[6] Agarwal, S., Sureka, A.: Using common-sense knowledge-base for detecting
word obfuscation in adversarial communication. In: Workshop on Future Information Security (FIS) (2015)
[7] Agarwal, S., Sureka, A.: Using KNN and SVM based one-class classifier for
detecting online radicalization on Twitter. In: Distributed Computing and
Internet Technology (ICDCIT). pp. 431–442 (2015)
[8] Aggarwal, N., Agarwal, S., Sureka, A.: Mining YouTube metadata for detecting privacy invading harassment and misdemeanor videos. In: Privacy,
Security and Trust (PST). pp. 84–93 (2014)