A Statistical Anomaly Detection Technique Based on Three Different Network Features
Yuji Waizumi, Tohoku University

Background
- The Internet has entered the business world.
- Information and systems need to be protected from hackers and attacks.
- Network security has become an important issue, and many intrusion/attack detection methods have been proposed.

Intrusion Detection Systems
Two major detection principles:
- Signature detection: attempts to flag behavior that is close to some previously defined pattern, the signature of a known intrusion.
- Anomaly detection: attempts to quantify usual or acceptable behavior and flags other, irregular behavior as potentially intrusive.

Motivation
- Anomaly detection systems
  - Pro: can detect unknown attacks.
  - Con: many false positives.
- To improve the performance of anomaly detection systems:
  - Analyze the characteristics of attacks.
  - Propose a method to construct features as numerical values from network traffic.
  - Construct a detection system using those features.

Classification of Attacks (DARPA Intrusion Detection Evaluation)
- DoS: denial of service.
- Probe: surveillance of targets.
- Remote to Local (R2L), User to Root (U2R): unauthorized access to a host or to the superuser.

Re-classification of Attacks by Traffic Characteristics
- Traffic quantity and access range: DoS, Probe.
- Structure of communication flows: Probe.
- Contents of communications: DoS, R2L, U2R.
To detect attacks with the above characteristics, it is necessary to construct features corresponding to these classes.
Network Traffic Features
- Numerical values (vectors) expressing the state of the traffic.
- We propose three different network feature sets, based on the re-classification of attacks; each is analyzed independently.

Time Slot Feature (34 dimensions)
- Counts various packets, flags, transmitted and received bytes, and port variety per unit of time.
- Estimates the scale and range of attacks.
- Targets: Probe (scans), DoS.
- Each slot is expressed as a vector, e.g. (TCP, ICMP, SYN, FIN, RST, UDP, DNS, ...).

Examples (Time Slot Feature)
- Vector element values are normalized to mean 0 and variance 1.0.
- RST flags on port 21 rise during an FTP scan, and RST flags on port 23 rise during a telnet scan; traffic with normal behavior shows neither.

Flow Counting Feature (TCP: 19 dimensions, UDP: 7 dimensions)
- A flow is identified by the 5-tuple (srcIP, dstIP, srcPort, dstPort, protocol).
- Counts packets, flags, and transmitted and received bytes within a flow.
- Targets: scans with illegal flags, ports used as backdoors.

Examples (Flow Counting Feature)
- Specific packet counts of attacks are extremely high or low, e.g. a decrease of SYN packets during a port sweep (scan) compared with normal traffic.

Flow Payload Feature (512 dimensions)
- Represents the content of a communication as a histogram of the character codes of a flow, counted in 8-bit units (256 classes).
- Transmission and reception are counted independently (512 classes in total).
- Targets: buffer overflows, malicious code.

Examples (Flow Payload Feature)
- Specific character counts of attacks are extremely high or low.
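The paper describes the features but gives no implementation. As a minimal sketch, flow grouping by the 5-tuple and the per-direction payload histogram could look as follows; the packet records and addresses here are hypothetical illustrations, not data from the evaluation set.

```python
from collections import defaultdict

# Hypothetical packet records: (src_ip, dst_ip, src_port, dst_port, proto, payload).
packets = [
    ("10.0.0.1", "10.0.0.2", 4321, 23, "tcp", b"login: root\r\n"),
    ("10.0.0.2", "10.0.0.1", 23, 4321, "tcp", b"Password: "),
    ("10.0.0.1", "10.0.0.2", 4321, 23, "tcp", b"toor\r\n"),
]

# Flow Counting Feature basis: group packets by the 5-tuple that identifies a flow.
# Because the tuple is directed, transmission and reception fall into separate
# groups, matching the independent counting of the two directions.
flows = defaultdict(list)
for src, dst, sport, dport, proto, payload in packets:
    flows[(src, dst, sport, dport, proto)].append(payload)

# Flow Payload Feature: a 256-bin histogram of byte values over a flow's payloads.
def payload_histogram(payloads):
    hist = [0] * 256
    for p in payloads:
        for b in p:
            hist[b] += 1
    return hist

hist = payload_histogram(flows[("10.0.0.1", "10.0.0.2", 4321, 23, "tcp")])
```

Concatenating the transmission and reception histograms of one flow then gives the 512-dimensional payload vector described above.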
(Flow payload example: normal traffic vs. an imap attack.)

Modeling Normal Behavior
- Each packet appears according to its protocol, so there are correlations between the elements of the feature vectors.
- A profile based on these correlations can represent the normal behavior of network traffic.

Principal Component Analysis (PCA)
- Extracts the correlations among samples as principal components.
- The principal components lie along the sample distribution (for non-correlated data they carry no structure).

Discriminant Function
- The detection criterion is the projection distance of a sample from the principal components learned from normal traffic.
- Samples at a long distance correspond to unordinary traffic or break-ins and are flagged as anomalies.

Detection Algorithm: Independent Detection
- The three feature sets are analyzed by PCA independently.
- The alerts from the three detectors are combined by a logical OR:
  Time Slot -> PCA -> alert
  Flow Counting -> PCA -> alert
  Flow Payload -> PCA -> alert
  -> OR -> final alert

Performance Evaluation
Two examined scenarios:
- Scenario 1: learn on weeks 1 and 3, test on weeks 4 and 5.
- Scenario 2: learn on weeks 4 and 5, test on weeks 4 and 5. This is a more practical situation, since real network traffic may include attack traffic.
Criterion for evaluation: the detection rate when the number of misdetections (false positives) per day is 10.

Data Set
- The 1999 DARPA off-line intrusion detection evaluation test set.
- Contains 5 weeks of data (Monday to Friday):
  - Weeks 1, 3: normal traffic only.
  - Week 2: includes attacks (for learning).
  - Weeks 4, 5: include attacks (for testing).

Scenario 1 Result

  Method             # of detections   # of targets   Detection rate
  Proposed method    104               171            60.8%
  NETAD (2003)       132               185            71.4%
  Forensics (2000)   15                27             55.6%
  Expert1            85                169            50.3%
  Expert2            81                173            46.8%
  Dmine              41                102            40.2%

Scenario 2 Result

  Method             # of detections   # of targets   Detection rate
  Proposed method    100               171            58.5%
  NETAD              70                185            37.8%

- NETAD uses IP addresses as a white list and overfits the learning data.
- The proposed method is independent of IP addresses and evaluates only the anomaly of the traffic.

Detection Results per Feature (Scenario 1)
[Venn diagram of detections by the Time Slot (TS), Flow Counting (FC), and Flow Payload (FP) features: TS only 22, FC only 9, FP only 5; pairwise overlaps 13, 6, and 5; all three features 44; total 104.]
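The PCA profiling and projection-distance criterion above can be sketched with numpy on synthetic data. Everything here is an illustrative assumption: the correlated training matrix stands in for one feature set (e.g. the 34-dim time-slot vectors), the number of retained components and the 99th-percentile threshold are arbitrary choices, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for normal traffic: 34-dim vectors that are strongly
# correlated (they lie near a 1-dim subspace) plus small noise.
train = rng.normal(size=(500, 1)) @ rng.normal(size=(1, 34)) \
        + 0.1 * rng.normal(size=(500, 34))

# Normalize elements to mean 0, variance 1.0, as the features require.
mean, std = train.mean(axis=0), train.std(axis=0) + 1e-9
X = (train - mean) / std

# Principal components of the normal profile via SVD.
_, _, vt = np.linalg.svd(X, full_matrices=False)
pcs = vt[:1]  # keep the components spanning the normal subspace (1 here)

def projection_distance(x):
    """Residual norm after projecting a sample onto the principal subspace."""
    z = (x - mean) / std
    return np.linalg.norm(z - (z @ pcs.T) @ pcs)

# Threshold chosen from the training distances (assumed 99th percentile).
threshold = np.quantile([projection_distance(t) for t in train], 0.99)

anomaly = rng.normal(size=34) * 10  # a sample far off the learned subspace
print("alert:", projection_distance(anomaly) > threshold)
```

In the full system one such detector would be trained per feature set, and the three boolean alerts combined with `or`, matching the independent-detection scheme.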
Detection Results per Feature (Scenario 2)
[Venn diagram of detections by TS, FC, and FP: TS only 37, FC only 7, FP only 8; TS & FP 3; the remaining pairwise overlaps 3 and 2; all three features 40; total 100.]
- Low detection overlap: each feature detects attacks with different characteristics.

Conclusion
- For network security, attacks were classified into three types, and three features corresponding to the three attack characteristics were constructed.
- The detection method applies PCA to the three features, learning each independently.
- It achieves higher detection accuracy even when the learning samples include attacks.