Data Mining for Surveillance Applications: Suspicious Event Detection
Dr. Bhavani Thuraisingham
April 2006

Outline
- Acknowledgements
- Data Mining for Security Applications
- Surveillance and Suspicious Event Detection
- Directions for Surveillance
- Data Mining, Security and Privacy

Acknowledgements
- Prof. Latifur Khan
- Prof. Murat Kantarcioglu
- Gal Lavee
- Ryan Layfield
- Sai Chaitanya

Our Vision: Assured Information Sharing
- Agencies A, B and C each publish component data and policies, which are combined into the data and policy for the coalition.
- Partners may be: 1. friendly, 2. semi-honest, or 3. untrustworthy.

Data Mining for Security Applications
- Data mining has many applications in cyber security and national security:
  - Intrusion detection, worm detection, firewall policy management
  - Counter-terrorism applications and surveillance
  - Fraud detection, insider threat analysis
- Need to enforce security while at the same time ensuring privacy.

Data Mining for Surveillance: Problems Addressed
- Huge amounts of surveillance and video data are available in the security domain.
- Analysis is usually done off-line, using "human eyes".
- Tools are needed to aid the human analyst (e.g., by pointing out areas in a video where unusual activity occurs).

Example
- Using our proposed system, video data plus a user-defined event of interest yield an annotated video with the events of interest highlighted.
- This greatly increases video analysis efficiency.

The Semantic Gap
- The disconnect between the low-level features a machine sees when a video is input into it and the high-level semantic concepts (or events) a human being sees when looking at the same video clip.
- Low-level features: color, texture, shape.
- High-level semantic concepts: presentation, newscast, boxing match.

Our Approach
- Event representation: estimate the distribution of pixel intensity change.
- Event comparison: contrast the event representations of different video sequences to determine if they
contain similar semantic event content.
- Event detection: use manually labeled training video sequences to classify unlabeled video sequences.

Event Representation
- Measures the quantity and type of changes occurring within a scene.
- A video event is represented as a set of x, y and t intensity-gradient histograms over several temporal scales.
- The histograms are normalized and smoothed.

Event Comparison
- Determines whether two video sequences contain similar high-level semantic concepts (events):

    D^2 = (1 / 3L) * sum over k,l,i of [h_{1k}^l(i) - h_{2k}^l(i)]^2 / (h_{1k}^l(i) + h_{2k}^l(i))

- Produces a number that indicates how close the two compared events are to one another; the lower this number, the closer the two events.

Event Detection
- A robust event detection system should be able to:
  - Recognize an event with reduced sensitivity to actor variation (e.g., clothing or skin tone) or background lighting variation.
  - Segment an unlabeled video containing multiple events into event-specific segments.

Labeled Video Events
- These events are manually labeled and used to classify unknown events: Walking1, Running1, Waving2.

Labeled Video Events (pairwise distances)

            walking1  walking2  walking3  running1  running2  running3  running4  waving2
walking1    0         0.27625   0.24508   1.2262    1.383     0.97472   1.3791    10.961
walking2    0.27625   0         0.17888   1.4757    1.5003    1.2908    1.541     10.581
walking3    0.24508   0.17888   0         1.1298    1.0933    0.88604   1.1221    10.231
running1    1.2262    1.4757    1.1298    0         0.43829   0.30451   0.39823   14.469
running2    1.383     1.5003    1.0933    0.43829   0         0.23804   0.10761   15.05
running3    0.97472   1.2908    0.88604   0.30451   0.23804   0         0.20489   14.2
running4    1.3791    1.541     1.1221    0.39823   0.10761   0.20489   0         15.607
waving2     10.961    10.581    10.231    14.469    15.05     14.2      15.607    0

Experiment #1
- Problem: recognize and classify events irrespective of direction (right-to-left, left-to-right) and with reduced sensitivity to spatial variations (clothing).
- "Disguised events": events similar to the testing data except that the subject is dressed differently.
- Compare classification to "truth"
(manual labeling).

Experiment #1: Disguised Walking 1
  walking1 0.97653, walking2 0.45154, walking3 0.59608, running1 1.5476, running2 1.4633, running3 1.5724, running4 1.5406, waving2 12.225
  Classification: Walking

Experiment #1: Disguised Running 1
  walking1 1.411, walking2 1.3841, walking3 1.0637, running1 0.56724, running2 0.97417, running3 0.93587, running4 1.0957, waving2 11.629
  Classification: Running

Experiment #1: Disguised Running 3
  walking1 1.3049, walking2 1.0021, walking3 0.88092, running1 0.8114, running2 1.1042, running3 1.1189, running4 1.0902, waving2 12.801
  Classification: Running

Experiment #1: Disguised Waving 1
  walking1 13.646, walking2 13.113, walking3 13.452, running1 18.615, running2 19.592, running3 18.621, running4 20.239, waving2 2.2451
  Classification: Waving

Classifying Disguised Events (pairwise distances; rows and columns are disguised events)

                    walking1  walking2  running1  running2  running3  waving1  waving2
Disguised walking1  0         0.19339   1.2159    0.85938   0.67577   14.471   13.429
Disguised walking2  0.19339   0         1.4317    1.1824    0.95582   12.295   11.29
Disguised running1  1.2159    1.4317    0         0.37592   0.45187   15.266   15.007
Disguised running2  0.85938   1.1824    0.37592   0         0.13346   16.76    16.247
Disguised running3  0.67577   0.95582   0.45187   0.13346   0         16.252   15.621
Disguised waving1   14.471    12.295    15.266    16.76     16.252    0        0.45816
Disguised waving2   13.429    11.29     15.007    16.247    15.621    0.45816  0

Experiment #1: Results
- This method yielded 100% precision (i.e., all disguised events were classified correctly).
- Not necessarily representative of the general event detection problem.
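The representation, comparison, and nearest-neighbor classification steps above can be sketched in code. This is a minimal illustration, not the authors' implementation: the bin count, temporal scales, and smoothing kernel are illustrative assumptions, and `event_histograms`, `event_distance`, and `classify` are hypothetical helper names.

```python
import numpy as np

def event_histograms(frames, n_bins=32, scales=(1, 2, 4)):
    """Represent a video event as normalized, smoothed x, y and t
    intensity-gradient histograms over several temporal scales.
    `frames` is a (T, H, W) grayscale array; bin count, scales and the
    smoothing kernel are illustrative choices, not the paper's."""
    histograms = []
    for s in scales:
        clip = frames[::s].astype(float)           # coarser temporal scale
        for axis in (0, 1, 2):                     # t, y and x gradients
            g = np.abs(np.diff(clip, axis=axis))
            h, _ = np.histogram(g, bins=n_bins, range=(0, 255))
            h = h.astype(float) / max(h.sum(), 1)  # normalize
            # smooth with a small binomial kernel
            histograms.append(np.convolve(h, [0.25, 0.5, 0.25], mode="same"))
    return histograms

def event_distance(h1, h2, eps=1e-12):
    """Chi-square-style distance from the slide:
    D^2 = (1/3L) * sum [h1 - h2]^2 / (h1 + h2); lower means closer."""
    total = sum(((a - b) ** 2 / (a + b + eps)).sum() for a, b in zip(h1, h2))
    return total / len(h1)   # len(h1) = 3L (3 gradient axes, L scales)

def classify(unknown, labeled):
    """Nearest-neighbor classification against manually labeled events."""
    return min(labeled, key=lambda name: event_distance(unknown, labeled[name]))
```

Classification then reduces to picking the labeled event with the minimum distance, exactly as in the tables above, where each disguised event's row minimum falls in its own class.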
- Future evaluation with more event types, more varied data, and a larger set of training and testing data is needed.

Experiment #2
- Problem: given an unlabeled video sequence, describe the high-level events within the video.
- Capture events using a sliding window of a fixed width (25 frames in the example).
- Similarity graphs are computed against each labeled event (running, walking, waving); the minimum-similarity graph segments the sequence (walking, running, waving, running in the example).

XML Video Annotation
- Using the event detection scheme, we generate a video description document detailing the event composition of a specific video sequence.
- This XML annotation may be replaced by a more robust computer-understandable format (e.g., the VEML video event ontology language).

<?xml version="1.0" encoding="UTF-8"?>
<videoclip>
  <Filename>H:\Research\MainEvent\Movies\test_runningandwaving.AVI</Filename>
  <Length>600</Length>
  <Event>
    <Name>unknown</Name>
    <Start>1</Start>
    <Duration>106</Duration>
  </Event>
  <Event>
    <Name>walking</Name>
    <Start>107</Start>
    <Duration>6</Duration>
  </Event>
</videoclip>

Video Analysis Tool
- Takes the annotation document as input and organizes the corresponding video segments accordingly.
- Functions as an aid to a surveillance analyst searching for "suspicious" events within a stream of video data.
- Activities of interest may be defined dynamically by the analyst while the utility is running and flagged for analysis.

Directions
- Enhancements to the work:
  - Work toward bridging the semantic gap and enabling more efficient video analysis.
  - More rigorous experimental testing of concepts.
  - Refine event classification through the use of multiple machine learning algorithms (e.g., neural networks, decision trees). Experimentally determine the optimal algorithm.
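The sliding-window labeling and the XML annotation format shown above can be sketched together. This is a hedged illustration, not the tool itself: `annotate` is a hypothetical function name, the distance threshold for declaring a window "unknown" is an invented assumption, and the per-window distances are supplied precomputed.

```python
import xml.etree.ElementTree as ET

def annotate(window_distances, labels, window=25, threshold=5.0):
    """Turn per-sliding-window distances into the <videoclip> XML format
    shown above. window_distances[i][j] is the distance between window i
    and labeled event j; windows not within `threshold` of any labeled
    event are marked 'unknown' (the threshold is an illustrative choice)."""
    # Label each window by its minimum-distance labeled event.
    names = []
    for row in window_distances:
        j = min(range(len(row)), key=lambda k: row[k])
        names.append(labels[j] if row[j] < threshold else "unknown")
    # Merge runs of identically labeled windows into <Event> elements.
    clip = ET.Element("videoclip")
    ET.SubElement(clip, "Length").text = str(len(names) * window)
    start, i = 1, 0
    while i < len(names):
        j = i
        while j < len(names) and names[j] == names[i]:
            j += 1
        event = ET.SubElement(clip, "Event")
        ET.SubElement(event, "Name").text = names[i]
        ET.SubElement(event, "Start").text = str(start)
        duration = (j - i) * window
        ET.SubElement(event, "Duration").text = str(duration)
        start += duration
        i = j
    return ET.tostring(clip, encoding="unicode")
```

For example, `annotate([[9, 9], [1, 4], [1, 3], [4, 1]], ["walking", "running"])` yields a document with an unknown segment, then walking, then running, mirroring the annotation above.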
- Develop a model for the following:
  - Simultaneous events within the same video sequence
  - Face detection, gait detection

Security and Privacy
- Define an access control model that allows access to surveillance video data to be restricted based on the semantic content of video objects.
- Biometrics applications.
- Privacy-preserving surveillance.

Access Control and Biometrics
- Access control:
  - RBAC- and UCON-based models for surveillance data.
  - Initial work to appear at the ACM SACMAT Conference 2006.
- Biometrics:
  - Restrict access based on the semantic content of video rather than low-level features.
  - Behavioral-type access instead of "fingerprint" access.
  - Used in combination with other biometric methods.

Privacy Preserving Surveillance: Introduction
- A recent survey at Times Square found 500 visible surveillance cameras in the area, and a total of 2500 in New York City.
- This essentially means we have scores of surveillance video to be inspected manually by security personnel.
- We need to carry out surveillance but at the same time ensure the privacy of individuals who are good citizens.

System Use
- Raw video surveillance data passes through a face detection and face-derecognizing system, so that the faces of trusted people are derecognized to preserve privacy.
- The suspicious event detection system and manual inspection of the video data find suspicious people and suspicious events.
- The output is a comprehensive security report listing the suspicious events and people detected, together with the report of the security personnel.

System Architecture
- Break the input video down into a sequence of images and perform segmentation.
- Find the location of the face in each image and compare the face to trusted and untrusted individuals.
- If a trusted face is found, derecognize the face in the image; if a potential intruder is found, raise an alarm that a potential intruder was detected.

Other Applications of Data Mining in Security
- Intrusion detection and worm detection
- Firewall policy management
- Insider threat analysis, both network/host and physical
- Fraud detection
- Protecting children from
inappropriate content on the Internet
- Digital identity management and detecting identity theft
- Steganalysis and digital watermarking
- Biometrics identification and verification
- Digital forensics
- Source code analysis
- National security / counter-terrorism

Data Mining Needs for Counter-terrorism: Non-real-time Data Mining
- Gather data from multiple sources:
  - Information on terrorist attacks: who, what, where, when, how
  - Personal and business data: place of birth, ethnic origin, religion, education, work history, finances, criminal record, relatives, friends and associates, travel history, . . .
  - Unstructured data: newspaper articles, video clips, speeches, emails, phone records, . . .
- Integrate the data; build warehouses and federations.
- Develop profiles of terrorists and of activities/threats.
- Mine the data to extract patterns of potential terrorists and predict future activities and targets.
- Find the "needle in the haystack" - suspicious needles?
- Data integrity is important.
- Techniques have to SCALE.

Data Mining Needs for Counter-terrorism: Real-time Data Mining
- Nature of the data:
  - Data arriving from sensors and other devices
  - Breaking news, video releases, satellite images, surveillance data
  - Continuous data streams
  - Some critical data may also reside in caches
- Rapidly sift through the data, setting aside unwanted data for later, non-real-time analysis.
- Data mining techniques need to meet timing constraints.
- Quality of service (QoS) tradeoffs among timeliness, precision and accuracy.
- Presentation of results: visualization, real-time alerts and triggers.

Origins of Privacy Preserving Data Mining
- Prevent useful results from mining:
  - Introduce "cover stories" to give "false" results.
  - Only make a sample of the data available, so that an adversary is unable to come up with useful rules and predictive functions.
- Randomization/perturbation:
  - Introduce random values into the data and/or results.
  - The challenge is to introduce random values without significantly
affecting the data mining results.
  - Give a range of values for results instead of exact values.
- Secure multi-party computation:
  - Each party knows only its own inputs; encryption techniques are used to compute the final results.

Data Mining and Privacy: Friends or Foes?
- They are neither friends nor foes.
- We need advances in both data mining and privacy.
- Data mining is a tool to be used by analysts and decision makers.
  - Due to false positives and false negatives, a human is needed in the loop.
- We need to design flexible systems:
  - Data mining has numerous applications, including in security.
  - For some applications one may have to focus entirely on "pure" data mining, while for others there may be a need for privacy-preserving data mining.
  - We need flexible data mining techniques that can adapt to changing environments.
- Technologists, legal specialists, social scientists, policy makers and privacy advocates MUST work together.
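The randomization/perturbation idea discussed above, introducing random values into the data without significantly affecting the mining results, can be sketched in a few lines. This is a minimal illustration, not a recommended scheme: `perturb` is a hypothetical name, and the uniform noise and its scale are illustrative assumptions.

```python
import random

def perturb(values, scale=10.0, seed=None):
    """Additive-noise randomization: release each value with independent
    uniform noise so individual records are masked, while aggregates such
    as the mean remain estimable (scale is an illustrative assumption)."""
    rng = random.Random(seed)
    return [v + rng.uniform(-scale, scale) for v in values]

# Toy data: the miner sees only the perturbed values.
ages = list(range(20, 70))
noisy = perturb(ages, scale=10.0, seed=1)
true_mean = sum(ages) / len(ages)
noisy_mean = sum(noisy) / len(noisy)
# Individual noisy records no longer reveal the true records, but the
# zero-mean noise leaves the overall mean close to the true mean.
```

This illustrates the tradeoff named above: wider noise gives more privacy for each record but, past a point, begins to degrade the patterns the miner can recover.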