Is Sampled Data Sufficient for Anomaly Detection
Ip Wing Chung Peter (05133660)
Ngan Sze Chung (05928650)

Abstract
- Traffic measurement in networks is important for
  - Network management
  - Anomaly detection for security analysis
- Capture the full packet trace?
  - The most accurate option
  - Consumes network resources
  - Affects normal traffic
- Sampling on a point-to-point link (diagram: Router A, Router B, Monitor)

Abstract
- Sampling technique
  - Conserves network resources
  - How many samples are needed?
  - Sampling techniques vs. anomaly detection algorithms

Outline
- Introduction
- Background and Methods
- Impact of Sampling on Volume Anomaly Detection
- Impact of Sampling on Portscan Detection
- Conclusion and Future Work

Introduction
- Aim
  - To study the impact of sampling on anomaly detection
- Objectives
  - To study 4 existing sampling techniques
  - To study 3 common anomaly detection algorithms
  - To simulate detection by feeding the sampled data to the anomaly detection algorithms
  - To evaluate the impact of sampling on each anomaly detection algorithm

Background and Methods
- Sampling
- Volume Anomaly Detection
- Portscan Detection
- Trace Data
- Methodology

Sampling: random packet sampling
- Sample each packet with a small probability r < 1
- Classify sampled packets into flows based on source/destination IP address and port, and protocol
- A flow is terminated by a timeout (1 min) or by explicit TCP semantics (FIN)
- Simple to implement
- Low CPU and memory requirements
- Inaccurate for flow statistics

Sampling: random flow sampling
- Sample each flow with a small probability p < 1
- Improves the accuracy of flow statistics
- Classifies packets into flows first
- Prohibitive memory and CPU requirements

Sampling: smart sampling
- Sample a flow of size x with a probability p(x)
- p(x) is determined by a threshold z, which trades off accuracy (e.g.
z = 40000)
- Biased towards large flows
- Example:
  Flow 1        40 bytes   sample with 0.1% probability
  Flow 2     15580 bytes
  Flow 3      8196 bytes
  Flow 4   5350789 bytes   sample with 100% probability
  Flow 5       532 bytes
  Flow 6      4000 bytes   sample with 10% probability

Sampling: sample-and-hold (S&H)
- Flow table lookup for each packet
  - If found, the flow entry is updated by all subsequent packets once it has been created in the S&H table
  - If not found, a flow entry is created with a probability p (e.g. p = 1/3 in the previous example)
- Sampling is biased toward “elephant” flows

Volume Anomaly Detection
- Detects network traffic anomalies (e.g. DoS attacks)
  - Abrupt changes in packet or flow count measurements
  - These induce volume anomalies
- Discrete wavelet transform (DWT) based detection
  - Proven effective at detecting volume anomalies

DWT-Based Detection
- Applies wavelet decomposition to packet or flow time series
- Detects volume changes at various time scales
- 3 steps
  - Decomposition
  - Re-synthesis
  - Detection

DWT-Based Detection: decomposition
- Decompose the original signal to identify changes
- The DWT calculates wavelet coefficients (diagram: original signal passed through a low-pass filter and a high-pass filter)

DWT-Based Detection: re-synthesis
- Coefficients are aggregated into high, mid and low bands
- Low-band signal: slow-varying trends
- High-band signal: highlights sudden variations
- Mid-band signal: sum of the rest

DWT-Based Detection: detection
- Compute the variance of the high- and mid-band signals over a time interval
- Deviation score = local variance / global variance
- Intervals whose deviation score is higher than a predefined threshold are marked as volume anomalies

Portscan Detection
- 2 online portscan detection techniques
  - Threshold Random Walk (TRW)
  - Time Access Pattern Scheme (TAPS)

Threshold Random Walk (TRW)
- 2 hypotheses
  - H0: a source is a “normal” host
  - H1: a source is a scanner
- Rationale: a normal host is far more likely to make successful connections than a scanner, which probes the address space at random.
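
Returning to the smart sampling example above: the probabilities quoted for the 40-byte, 4000-byte and 5350789-byte flows are consistent with the rule p(x) = min(1, x/z) at z = 40000. The slides do not spell out p(x), so that formula is an assumption here, and the names `smart_sampling_probability` and `smart_sample` are hypothetical. A minimal Python sketch under those assumptions:

```python
import random


def smart_sampling_probability(size_bytes, z):
    """Assumed smart sampling rule: p(x) = min(1, x / z)."""
    if z <= 0:
        raise ValueError("threshold z must be positive")
    return min(1.0, size_bytes / z)


def smart_sample(flow_sizes, z, rng=None):
    """Return the flows kept by one pass of smart sampling.

    flow_sizes maps a flow id to its size in bytes; each flow is kept
    independently with probability smart_sampling_probability(size, z).
    """
    rng = rng or random.Random()
    return {
        flow_id: size
        for flow_id, size in flow_sizes.items()
        if rng.random() < smart_sampling_probability(size, z)
    }


# Flow sizes (bytes) and threshold taken from the slide example.
FLOWS = {1: 40, 2: 15580, 3: 8196, 4: 5350789, 5: 532, 6: 4000}
Z = 40000

if __name__ == "__main__":
    for flow_id, size in FLOWS.items():
        p = smart_sampling_probability(size, Z)
        print(f"Flow {flow_id}: {size} bytes, p = {p:.4f}")
    print("kept:", sorted(smart_sample(FLOWS, Z, random.Random(0))))
```

Because any flow of at least z bytes is always kept while small flows are almost always dropped, the sketch makes the bias towards large flows visible: Flow 4 survives every run, Flow 1 almost never does.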
Threshold Random Walk (TRW)
- Hypothesis testing on a sequence of events
- Goal: determine which hypothesis is more likely
- Let Y = {Y1, Y2, ..., Yi} represent the random vector of connections observed from a source, where Yi = 0 if the i-th connection is successful and Yi = 1 otherwise

Threshold Random Walk (TRW)
- A likelihood ratio is computed from the observed events
- When the likelihood ratio crosses either one of two predefined thresholds, the corresponding hypothesis is selected as the more likely one
- Requires about 6 observed events to detect scanners successfully

Threshold Random Walk (TRW): TRWSYN
- TRWSYN is the backbone adaptation of TRW
- Backbone traffic is usually uni-directional
- It is difficult to tell “failed” from “succeeded” connections
- TRWSYN oracle: marks single SYN-packet flows as failed connections
- Detects TCP portscans ONLY

Time Access Pattern Scheme (TAPS)
- Access pattern observation: a scanner initiates connections to a larger spread of
  - destination IP addresses (horizontal scan)
  - port numbers (vertical scan)
- That means the ratio γ between distinct IP addresses and port numbers is larger for a scanner.

Time Access Pattern Scheme (TAPS)
- Hypothesis test, similar to TRW
- A single-packet flow is treated as a failed connection
- In each time bin (say i), for each source, compute the ratio γ and compare it with a predefined threshold k.
- Event variable: Yi = 0 if γ < k, Yi = 1 if γ >= k
- Update the likelihood ratio

Trace Data
- 2 links in a Tier-1 ISP’s backbone network
  - 2 OC-48 links between backbone routers on the West Coast and East Coast
  - BB-West: large percentage of scanning traffic
  - BB-East: large volume
  - Collected by IPMON

Methodology
- The 4 sampling schemes use different parameters
- A common metric is required for fair comparison
- We choose: percentage of sampled flows
- The schemes differ in
  - Memory requirement
  - CPU utilization

Methodology
- Note: although the percentage of sampled flows is fixed, smart sampling and sample-and-hold are biased towards large flows

Impact of Sampling on Volume Anomaly Detection
- Volume Anomaly Detection Result
- Feature Variation Due to Sampling

Detection from the original trace
- 21 abrupt changes in total in the original trace
- The number of detections decreases as the sampling interval increases
- Random flow sampling performs the best
- Smart sampling and sample-and-hold drop much faster
- No false positives in detection

Feature Variation Due to Sampling
- Difference in detection performance
  - Most volume spikes are caused by a sudden increase in small-packet flows
  - Random flow sampling is unbiased by flow size
  - The other schemes are biased towards large flows
  - Smart sampling and sample-and-hold are designed to track heavy hitters
  - They perform poorly compared with packet sampling

Feature Variation Due to Sampling
- No false positives
  - A spike in the sampled data must have existed in the original trace
  - It is not an artifact of sampling
  - Sampling only reduces the number of detections and does not cause any false detection

Feature Variation Due to Sampling
- The number of detections decreases as the sampling interval increases, even with random flow sampling
- Success or failure of the technique depends on the number
of sampled events and on the local variance
- Hypothesis: sampling introduces distortion in the variance

Feature Variation Due to Sampling
- Sampling introduces distortion in the variance
  - Sampling scales down the original time series by a fraction p
  - Given the variance and average rate of the original series, scaling yields a new, scaled-down variance
  - Sampling also involves the removal of discrete points
  - Binomial: the original point process is sampled as a binomial random variable
  - Total variance = scaled-down variance + variance due to the removal of discrete points

Feature Variation Due to Sampling
- Total variance = scaled-down variance + removal of discrete points
- The removal-of-discrete-points term exceeds 70% when N = 500
- This affects detection!

Impact of Sampling on Portscan Detection
- Metrics
  - Desirable to have HIGH Rs (success ratio) and LOW Rf+ (false positive ratio)
  - Focus on the success and false positive ratios (because Rs + Rf- = 1)

Impact of Sampling on Portscan Detection
- Challenge: determining the true scanners
- The final list of scanners was generated manually by Sridharan (in "Impact of Packet Sampling on Portscan Detection") and serves as the ground truth
- Less interested in absolute accuracy
- More interested in relative performance as a function of sampling scheme and sampling rate

TRWSYN under Sampling
- Figure: Rs and Rf+ ratios for the BB-West trace as functions of the effective sampling interval, for all four sampling schemes

TRWSYN under Sampling: random packet sampling
- Serves as the base case for comparison
- Success ratio Rs
  - Initially increases slightly for small N (seems advantageous)
  - Drops off for large N
- False positive ratio Rf+
  - Follows similar behaviour to Rs, but on a larger scale
  - Increases 3 times as N goes from 1 to 10

TRWSYN under Sampling
- 2 key effects of packet sampling
  - Flow-reduction: the number of flows observed is reduced
  - Flow-shortening: multi-packet flows are reduced to single-packet flows
- Recall the TRWSYN algorithm: single SYN-packet flow → connection failure → potential scanner

TRWSYN under Sampling
- Small sampling interval
  - Flow-reduction → slight impact → high Rs
  - Flow-shortening → substantial impact → more single-packet flows
- Impact:
  - Scanners’ multi-packet flows that were initially missed → shortened → detected → Rs increases
  - Regular multi-packet flows → shortened → “detected” → Rf+ increases

TRWSYN under Sampling
- Large sampling interval
  - Flow-reduction dominates
  - Fewer decisions (detections)
  - Rs and Rf+ decrease

TRWSYN under Sampling
- 3 flow sampling schemes
  - Decisions are based on entire flows
  - No flow-shortening
  - Flow-reduction dominates the impact
- Exception: sample-and-hold
  - Mid-flow shortening
  - Decisions are only made on SYN-packet flows
  - Introduces NO false positives

TRWSYN under Sampling
- Both Rs and Rf+ decrease almost monotonically as N increases
- Rf+ is lower than with packet sampling

TRWSYN under Sampling
- In terms of Rf+: flow sampling >> packet sampling (flow sampling yields the lower Rf+)
- In terms of Rs: random flow sampling > random packet sampling > smart sampling > sample-and-hold
- Cause:
  - Bias towards large flows
  - Suffer more from flow-reduction

TAPS under Sampling
- Critical parameter: time bin
- For each sampling scheme and each sampling rate, use the optimal time bin
  - It maximizes Rs
  - It is an increasing function of the sampling interval
  - True for both packet sampling and flow sampling schemes

TAPS under Sampling
- Figure: results of portscan detection with TAPS for trace BB-West

TAPS under Sampling
- Rs decreases as the sampling interval increases
- Random flow sampling performs the best
- Random packet sampling performs as well as the remaining 2 flow sampling schemes
- Cause:
  - Bias towards large flows
  - Tend to miss small (critical) flows

TAPS under Sampling
- Random packet sampling
  - Rf+ initially increases, due to flow-shortening
  - It then drops off at large sampling intervals, due to flow-reduction
- Flow sampling schemes
  - No or minor flow-shortening
  - Low Rf+
  - Rf+ decreases monotonically with the sampling interval

TAPS under Sampling
- TAPS uses the address range distribution for detection
- Insensitive to the 4 schemes
- No distortion introduced
- Low Rf+, e.g.
random packet sampling yields 1/10 of the Rf+ of TRWSYN

Conclusion
- Random flow sampling
  - Performs the best
  - Prohibitive resource requirements
- Random packet sampling
  - Suffers from flow-shortening
- Smart sampling and sample-and-hold
  - Biased towards large flows
  - Perform worse than random packet sampling in volume anomaly detection

Conclusion
- All 4 sampling schemes degrade all 3 anomaly detection algorithms in terms of Rs and Rf+
- Is sampled data sufficient for anomaly detection? It remains an open question.
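
The TRW decision rule from the portscan detection slides can be illustrated with a sequential likelihood-ratio test. The sketch below is not the authors' implementation: the per-connection success probabilities `theta0` and `theta1`, the thresholds `eta0` and `eta1`, and the name `trw_classify` are all assumed, illustrative choices. It follows the slides' convention that Yi = 0 for a successful connection and Yi = 1 otherwise.

```python
def trw_classify(events, theta0=0.8, theta1=0.2, eta0=0.01, eta1=100.0):
    """Sequential hypothesis test over connection outcomes of one source.

    events: iterable of 0 (successful connection) or 1 (failed connection)
    theta0: assumed P(success | H0, normal host)
    theta1: assumed P(success | H1, scanner)
    eta0, eta1: assumed lower and upper decision thresholds
    Returns (decision, events_consumed), where decision is
    "scanner", "normal" or "pending".
    """
    if not 0.0 < theta1 < theta0 < 1.0:
        raise ValueError("expected 0 < theta1 < theta0 < 1")
    ratio = 1.0  # likelihood ratio P(Y | H1) / P(Y | H0)
    consumed = 0
    for consumed, y in enumerate(events, start=1):
        if y == 0:
            ratio *= theta1 / theta0  # a success favours H0
        elif y == 1:
            ratio *= (1.0 - theta1) / (1.0 - theta0)  # a failure favours H1
        else:
            raise ValueError("events must be 0 or 1")
        if ratio >= eta1:
            return "scanner", consumed
        if ratio <= eta0:
            return "normal", consumed
    return "pending", consumed


if __name__ == "__main__":
    print(trw_classify([1, 1, 1, 1, 1, 1]))  # all failures
    print(trw_classify([0, 0, 0, 0, 0, 0]))  # all successes
    print(trw_classify([0, 1]))              # mixed, too short to decide
```

With these illustrative parameters a decision needs only a handful of events; the number of events actually required depends on the chosen probabilities and thresholds. The sketch also shows why flow-shortening matters for TRWSYN: every flow reduced to a single SYN packet is fed in as a 1 and pushes the ratio towards the scanner threshold.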

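The variance argument in the "Feature Variation Due to Sampling" slides can be made concrete with the law of total variance. The symbols are assumptions, since the slide formulas did not survive extraction: X is the original event count in a time bin, with mean \mu and variance \sigma^2, and each of the X events is kept independently with probability p, giving the sampled count Y.

```latex
\begin{align}
Y \mid X &\sim \mathrm{Binomial}(X, p) \\
\mathbb{E}[Y] &= p\,\mu \\
\operatorname{Var}(Y)
  &= \operatorname{Var}\big(\mathbb{E}[Y \mid X]\big)
   + \mathbb{E}\big[\operatorname{Var}(Y \mid X)\big] \\
  &= \operatorname{Var}(pX) + \mathbb{E}\big[p(1-p)X\big] \\
  &= \underbrace{p^{2}\sigma^{2}}_{\text{scaled-down variance}}
   + \underbrace{p(1-p)\,\mu}_{\text{removal of discrete points}}
\end{align}
```

Under these assumptions the second term's share of the total is (1-p)\mu / (p\sigma^2 + (1-p)\mu), which grows as p shrinks. Whether it reproduces the "more than 70% when N = 500" figure depends on the \mu and \sigma^2 of the trace, which are not given here.
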