A Framework for Extrusion Detection Using Machine Learning
Yan Luo and Jeffrey J.P. Tsai
Presented by: Koundinya Surepeddi
Dept. of Computer & Information Sciences, University of Delaware
CISC 879 - Machine Learning for Solving Systems Problems

Introduction
• Extrusion: unauthorized transfer of digital assets that contain classified information.
• An extrusion can be
  - malicious: theft of information
  - accidental: inclusion of an unauthorized recipient on a sensitive e-mail

Why EDS?
• Problem: leaking of confidential information (lack of information security)
• Solution: EDS (extrusion detection system)

EDS vs. IDS
• Extrusion detection is the reverse process of intrusion detection.
  - IDS: protects the system from outside attacks
  - EDS: protects the system from inside attacks
• IDS uses misuse detection and anomaly detection.
• EDS uses a combination of
  - misuse detection (well-known attacks -> signatures)
  - anomaly detection (system activities -> normal profiles)
  - data mining techniques

Detecting Extrusions
• Automatic process to detect extrusions:
  Raw data (user & system activities)
  -> Data mining techniques (association rule analysis, frequency analysis, ...)
  -> Detection rules, proper features (existing extrusions, future extrusions)

EDS Framework
• The framework includes
  1. Data collection (produces raw data)
     a. User monitor: keyboard click events, mouse click events
     b. Process monitor: process create and process terminate events
     c. File system monitor: file create, file modify, file open/close, file read/write events
     d. Network monitor: network traffic data events
     e. Clipboard monitor: clipboard data events, copy/paste events
  2. Data analysis
  3. Extrusion detection

System diagram
[Figure: system diagram]

Components
• Target system: the system that is being protected from extrusions. It can be a personal computer, a local network, or a whole company's computer system.
• Data collection: collects raw data, which includes event information.

Components (cont.)
• Raw data: the initial event information.
• Data analysis: analyzes the raw data to generate detection rules and select proper features.
• Detection rules: generated automatically by the data analysis module and used for extrusion detection.
• Features: the proper features model the system's and the user's normal profiles. The normal profiles can then be used to detect abnormal events and possible extrusions.

Components (cont.)
• Detection engine:
  - Loads the detection rules from the SQL database and applies them to the target system for run-time extrusion detection.
  - Monitors the target system.
  - If the system's behavior deviates from the baseline of the normal profiles, an alarm for a possible extrusion is triggered.
• Database: stores the raw data, detection rules, proper features, and the system's and user's normal profiles.

Technical Approach
1. Find patterns of extrusions
2. Extrusion forecasting
3. Dynamic characteristics

Pattern of Extrusions
• Step 1: Sort the recorded events by their timestamps.
• Step 2: Pre-train the system on large datasets, using data mining techniques or activities pre-defined by users.
• Step 3: Apply pattern recognition techniques for real-time monitoring of the system activities. An alarm is triggered when one of the patterns is found.
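Steps 1 and 2 above might be sketched as follows. This is only an illustration, not the authors' implementation: the `Event` record, its field names, and the action names are hypothetical, and counting consecutive action pairs is a toy stand-in for the frequency analysis mentioned earlier.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One recorded event; the field names are hypothetical."""
    timestamp: float  # seconds since an arbitrary start time
    monitor: str      # which monitor reported it, e.g. "file" or "clipboard"
    action: str       # what happened, e.g. "file_open" or "copy"


def sort_events(events):
    """Step 1: order the recorded events by their timestamps."""
    return sorted(events, key=lambda e: e.timestamp)


def count_action_pairs(events):
    """Toy stand-in for Step 2: count how often one action directly follows another."""
    ordered = sort_events(events)
    return Counter((a.action, b.action) for a, b in zip(ordered, ordered[1:]))


# Events as they might arrive from different monitors, out of order.
recorded = [
    Event(3.0, "clipboard", "paste"),
    Event(1.0, "file", "file_open"),
    Event(2.0, "clipboard", "copy"),
    Event(4.0, "file", "file_save"),
]

ordered_actions = [e.action for e in sort_events(recorded)]
pair_counts = count_action_pairs(recorded)
```

On a large dataset, frequently occurring pairs (or longer sequences) could be candidates for a normal profile, while rare ones could be candidates for closer inspection.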
Example - BINDER
• BINDER: an extrusion-based break-in detector.
• Detects break-in extrusions by determining which network connections are unrelated to user actions.
• Only processes that receive user input are allowed to make connections.

Extrusion Forecasting
• We can forecast the next intrusion or extrusion activity given:
  - enough patterns of intrusion and extrusion activities
  - detection rules that are defined correctly and completely
• Forecasting new intrusion or extrusion activities:
  - Use partial pattern recognition or rule matching.
  - To find abnormal activities, we first need to model the normal activities.

Dynamic Mechanism
• How should event information be organized and used to detect extrusions?
• First way (rule-based detection):
  1. Define some rules and apply them to the recorded or real-time event information.
  2. If a rule is matched, an alarm is triggered.
• Second way:
  1. Organize the event information as one large dataset or many datasets.
  2. Apply data mining and pattern recognition techniques.

Rule-based Detection
• A dynamic mechanism for adding, testing, and deleting rules.
• Step 1: Add specific rules to the system and run the experiments.
• Step 2: If the results are not good enough, delete the previous rules and add new ones. Run another experiment based on the new rules.

Rule-based Detection: Template
• Rule template: Action_1 -> Action_2 -> … -> Action_N
• The user can specify a sequence of actions as a detection rule.
• The system then examines the recorded system events to find whether there is a sequence of events that matches the detection rule.
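The rule template above could be checked against recorded events with a sketch like the one below. It assumes that events are already sorted by timestamp, that each event is reduced to an action name, and that unrelated events may appear between the matched ones. These are assumptions made for illustration; the action names are hypothetical and the slides do not specify the matching semantics.

```python
def matches_rule(rule, events):
    """Return True if the actions in `rule` occur in `events` in the same order.

    Other events may sit between the matched ones (an assumption of this sketch).
    """
    remaining = iter(events)
    # `action in remaining` advances the iterator until it finds `action`,
    # so every rule step has to appear after the previous matched step.
    return all(action in remaining for action in rule)


# Hypothetical rule: Action_1 -> Action_2 -> Action_3
rule = ["file_open", "file_read", "network_send"]

# Recorded events, already ordered by timestamp.
events = ["file_open", "mouse_click", "file_read", "key_press", "network_send"]
out_of_order = ["network_send", "file_read", "file_open"]

alarm = matches_rule(rule, events)            # rule steps appear in order
no_alarm = matches_rule(rule, out_of_order)   # same actions, wrong order
```

A stricter variant could require the matched events to be adjacent or to come from the same process; which choice is appropriate depends on how the monitors record events.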
Rule-based Detection: Detection Rule
• Example rule: a confidential file is opened -> its content is copied to the clipboard -> the content is pasted into another file -> the other file is saved
• This detection rule has four actions, and they are performed in a specific order.
• If the detection system finds a sequence of events that matches this rule, an alarm is triggered and a response is carried out.

Rule-based Detection: Timing Problem
• Sometimes the extrusion activities do not follow the detection rule step by step.
• These activities may be interleaved with one another.
• So the detection rules should take timing information into account.

Conclusion
• A combined method that integrates misuse detection and anomaly detection to automatically generate detection rules and select proper features.
• Extrusion detection and confidential information protection can be carried out based on the detection rules and proper features.

Queries?