Is Sampled Data Sufficient for Anomaly Detection?
Ip Wing Chung Peter (05133660)
Ngan Sze Chung (05928650)
Abstract

Traffic measurement in networks is important for
Network management
Anomaly detection for security analysis
Capture the full packet trace?
The most accurate approach
Consumes network resources
Affects normal traffic
[Figure: sampling a point-to-point link between Router A and Router B, observed by a monitor]
Abstract

Sampling Technique



Conserve network resources
How many samples?
Sampling techniques vs. anomaly detection algorithms
Abstract





Introduction
Background and Methods
Impact of Sampling on Volume Anomaly
Detection
Impact of Sampling on Portscan Detection
Conclusion and Future Work
Introduction

Aim


To study the impact of sampling on anomaly
detection
Objective




To study 4 existing sampling techniques
To study 3 common anomaly detection algorithms
To simulate detection by feeding the sampled data to the anomaly detection algorithms
To evaluate the impact of sampling on the anomaly detection algorithms
Background and Methods





Sampling
Volume Anomaly Detection
Portscan Detection
Trace Data
Methodology
Sampling

Random packet sampling

Sample a packet with a small probability r < 1

Classify sampled packets into flows based on source/destination IP address and port, and protocol
A flow is terminated by a timeout (1 min) or by explicit TCP semantics (FIN)

Sampling

Random packet sampling



Simple to implement
Low CPU and memory requirements
Inaccurate for flow statistics
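As a rough illustration of random packet sampling followed by flow classification, here is a minimal Python sketch. The `Packet` record, the function names and the dictionary layout of a flow are assumptions made for this example; only the sampling probability r, the 5-tuple key, the 1-minute timeout and the FIN rule come from the slide.

```python
import random
from collections import namedtuple

# Hypothetical packet record used only for this sketch.
Packet = namedtuple("Packet", "ts src dst sport dport proto size fin")

def sample_packets(packets, r, rng):
    """Keep each packet independently with probability r."""
    return [p for p in packets if rng.random() < r]

def classify_flows(packets, timeout=60.0):
    """Group packets into flows by 5-tuple; close a flow on timeout or FIN."""
    active = {}      # 5-tuple -> currently open flow
    finished = []    # closed flows, in closing order
    for p in sorted(packets, key=lambda q: q.ts):
        key = (p.src, p.dst, p.sport, p.dport, p.proto)
        flow = active.get(key)
        if flow is not None and p.ts - flow["last"] > timeout:
            finished.append(active.pop(key))   # idle for too long
            flow = None
        if flow is None:
            flow = {"key": key, "packets": 0, "bytes": 0, "last": p.ts}
            active[key] = flow
        flow["packets"] += 1
        flow["bytes"] += p.size
        flow["last"] = p.ts
        if p.fin:
            finished.append(active.pop(key))   # explicit TCP termination
    finished.extend(active.values())           # flows still open at the end
    return finished
```

Calling `classify_flows(sample_packets(trace, 0.01, random.Random(0)))` would give the flows seen by a monitor that samples about one packet in a hundred; their packet and byte counts are no longer those of the original flows, which is why flow statistics become inaccurate.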
Sampling

Random flow sampling




Sample a flow with a small probability p < 1
Improves accuracy of flow statistics
Classifies packets into flows first
Prohibitive memory and CPU requirements
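Random flow sampling can be sketched in a few lines once the flows exist. The flow dictionaries and the function names below are assumptions for illustration, and the 1/p scaling of the totals is a standard inverse-probability estimate rather than something stated on the slide.

```python
import random

def sample_flows(flows, p, rng):
    """Keep each already-classified flow independently with probability p."""
    return [f for f in flows if rng.random() < p]

def estimate_totals(sampled, p):
    """Scale the sampled counts by 1/p to estimate totals of the full trace."""
    return {
        "flows": len(sampled) / p,
        "bytes": sum(f["bytes"] for f in sampled) / p,
    }
```

Every packet still has to be classified into a flow before the sampling decision is made, which is where the memory and CPU cost comes from.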
Sampling
Smart sampling
Sample a flow of size x with a probability p(x)
p(x) = min(1, x/z), where z is a threshold that trades off accuracy
Determined by threshold z (e.g. z = 40000)
Bias towards large flows
Flow 1, 40 bytes: sample with 0.1% probability
Flow 2, 15580 bytes
Flow 3, 8196 bytes
Flow 4, 5350789 bytes: sample with 100% probability
Flow 5, 532 bytes
Flow 6, 4000 bytes: sample with 10% probability
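The probabilities listed for z = 40000 are consistent with the usual smart-sampling rule p(x) = min(1, x/z). The sketch below assumes that rule; the function names and the `(flow_id, size)` tuples are made up for this example, and the renormalised size max(x, z) attached to each kept flow is the customary companion estimate, not something shown on the slide.

```python
import random

def smart_probability(size, z):
    """Sampling probability of a flow of `size` bytes under threshold z."""
    return min(1.0, size / z)

def smart_sample(flows, z, rng):
    """flows: iterable of (flow_id, size). Returns the kept flows with a
    renormalised size estimate max(size, z) for each of them."""
    kept = []
    for flow_id, size in flows:
        if rng.random() < smart_probability(size, z):
            kept.append((flow_id, size, max(size, z)))
    return kept
```

With z = 40000 this gives 0.001 for the 40-byte flow, 0.1 for the 4000-byte flow and 1.0 for the 5350789-byte flow.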
Sampling

Sample-and-hold (S&H)

Flow table lookup



If found, the flow entry is updated by every subsequent packet of the flow once it has been created in the S&H table
If not found, a flow entry is created with probability p (e.g. p = 1/3 in the previous example)
Sampling is biased toward “elephant” flows
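The table-lookup logic of sample-and-hold can be sketched as follows. The packet representation `(flow_key, size)` and the function name are assumptions for this example; the per-packet creation probability p follows the description on the slide.

```python
import random

def sample_and_hold(packets, p, rng):
    """packets: iterable of (flow_key, size) in arrival order.
    Returns the S&H flow table: flow_key -> packet and byte counters."""
    table = {}
    for key, size in packets:
        entry = table.get(key)
        if entry is not None:
            # Flow already held: every subsequent packet is counted.
            entry["packets"] += 1
            entry["bytes"] += size
        elif rng.random() < p:
            # Flow not held yet: create an entry with probability p.
            table[key] = {"packets": 1, "bytes": size}
    return table
```

A flow with many packets gets many chances to enter the table, which is the source of the bias toward “elephant” flows. Packets that arrive before the entry is created are never counted.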
Volume Anomaly Detection

Detect network traffic anomalies (e.g. DoS attacks)
These cause abrupt changes in packet or flow count measurements
and so induce volume anomalies
Discrete wavelet transform (DWT) based detection
Shown to be effective at detecting volume anomalies
DWT-Based Detection



Applies wavelet decomposition to the packet or flow time series
Detects volume changes at various time scales
3 steps



Decomposition
Re-synthesis
Detection
DWT-Based Detection

Decomposition


Decompose the original signal to identify changes
The DWT calculates the wavelet coefficients
[Figure: the original signal is passed through a low-pass filter and a high-pass filter]
DWT-Based Detection

Re-synthesis




Coefficients are aggregated into high, mid and low bands
Low-band signal → slow-varying trends
High-band signal → highlights sudden variations
Mid-band → sum of the rest
DWT-Based Detection

Detection



Compute the variance of the high and mid-band signals over a time interval
Deviation score = local variance / global variance
Intervals whose deviation score is higher than a predefined threshold are marked as volume anomalies
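The three steps (decomposition, re-synthesis, detection) can be illustrated with a small pure-Python sketch. It assumes a Haar wavelet, for which the approximation at level j is simply the mean over blocks of 2^j samples, and it sums the high and mid bands before taking variances; the wavelet choice, the band boundaries, the window size and all function names are assumptions of this example, not details given in the slides.

```python
def block_means(x, width):
    """Haar approximation: replace each block of `width` samples by its mean."""
    out = []
    for start in range(0, len(x), width):
        block = x[start:start + width]
        out.extend([sum(block) / len(block)] * len(block))
    return out

def haar_bands(x, levels=3):
    """Split x into high, mid and low bands that add back up to x."""
    a1 = block_means(x, 2)              # level-1 approximation
    a_top = block_means(x, 2 ** levels) # coarsest approximation
    high = [xi - ai for xi, ai in zip(x, a1)]
    mid = [ai - ti for ai, ti in zip(a1, a_top)]
    return high, mid, a_top

def variance(v):
    m = sum(v) / len(v)
    return sum((t - m) ** 2 for t in v) / len(v)

def deviation_scores(x, window, levels=3):
    """Local variance / global variance of the high+mid signal per window."""
    high, mid, _ = haar_bands(x, levels)
    signal = [h + m for h, m in zip(high, mid)]
    global_var = variance(signal)
    scores = []
    for start in range(0, len(signal) - window + 1, window):
        local_var = variance(signal[start:start + window])
        scores.append(local_var / global_var if global_var > 0 else 0.0)
    return scores
```

Windows whose score exceeds a chosen threshold would be flagged; a flat series with a single spike produces one large score in the window containing the spike and zeros elsewhere.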
Portscan Detection



2 online portscan detection techniques
Threshold Random Walk (TRW)
Time Access Pattern Scheme (TAPS)
Threshold Random Walk (TRW)




2 hypotheses
H0: a source is a “normal” host
H1: a source is a scanner
Rationale:
A normal host is far more likely to have successful connections than a scanner, which randomly probes the address space.
Threshold Random Walk (TRW)



Hypothesis testing on a sequence of events
to determine which hypothesis is more likely
Let Y = {Y1, Y2, . . . , Yi} represent the random vector of connections observed from a source, where Yi = 0 if the ith connection is successful and Yi = 1 otherwise
Threshold Random Walk (TRW)

Likelihood ratio:
Λ(Y) = Π_k Pr[Yk | H1] / Pr[Yk | H0], for k = 1, . . . , i
When the likelihood ratio crosses either of two predefined thresholds, the corresponding hypothesis is selected as the more likely one.
Requires ~6 observed events to detect scanners successfully
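A sequential test of this kind can be sketched as below. The success probabilities `theta0` and `theta1` (probability that a connection succeeds under H0 and H1) and the two thresholds `eta0` and `eta1` are illustrative values chosen for this example; the slides do not give them, so the number of events this sketch needs before deciding should not be read as the ~6 quoted above.

```python
def trw_decide(outcomes, theta0=0.8, theta1=0.2, eta0=0.01, eta1=100.0):
    """outcomes: sequence of Yi (0 = successful connection, 1 = failed).
    Returns (decision, number of events used)."""
    ratio = 1.0
    for n, y in enumerate(outcomes, start=1):
        if y == 0:
            ratio *= theta1 / theta0              # success favours H0
        else:
            ratio *= (1 - theta1) / (1 - theta0)  # failure favours H1
        if ratio >= eta1:
            return "scanner", n
        if ratio <= eta0:
            return "normal", n
    return "pending", len(outcomes)
```

Each failed connection pushes the ratio up and each success pulls it down, so the test behaves like a random walk between the two thresholds.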

Threshold Random Walk (TRW)





TRWSYN: backbone adaptation of TRW
Backbone traffic is usually uni-directional
Difficult to tell “failed” from “succeeded” connections
TRWSYN oracle:
Marks single SYN-packet flows as failed connections
Detects TCP portscans ONLY
Time Access Pattern Scheme (TAPS)


Access Pattern
Observation: a scanner initiates connections to a larger spread of
destination IP addresses (horizontal scan)
port numbers (vertical scan)
That is, the ratio γ between distinct IP addresses and port numbers is larger for a scanner.
Time Access Pattern Scheme (TAPS)





Hypothesis test, similar to TRW
Single-packet flow → failed connection
In each time bin i, for each source, compute the ratio γ and compare it with a predefined threshold k
Event variable: Yi = 0 if γ < k, Yi = 1 if γ >= k
Update the likelihood ratio
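The per-bin event variable can be sketched as follows. The connection tuples and the function name are assumptions for this example, and so is the choice to take γ as the larger count divided by the smaller one so that both horizontal and vertical scans give γ > 1; the slides only say that γ is the ratio between distinct IP addresses and port numbers.

```python
from collections import defaultdict

def taps_events(connections, bin_width, k):
    """connections: iterable of (ts, src, dst_ip, dst_port).
    Returns {(bin_index, src): Yi} with Yi = 1 if gamma >= k, else 0."""
    ips = defaultdict(set)
    ports = defaultdict(set)
    for ts, src, dst_ip, dst_port in connections:
        key = (int(ts // bin_width), src)
        ips[key].add(dst_ip)
        ports[key].add(dst_port)
    events = {}
    for key in ips:
        n_ip, n_port = len(ips[key]), len(ports[key])
        gamma = max(n_ip, n_port) / min(n_ip, n_port)
        events[key] = 1 if gamma >= k else 0
    return events
```

The resulting Yi values would then be fed to a sequential likelihood-ratio update of the same kind as in TRW.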
Trace Data

2 links in a Tier-1 ISP’s backbone network
2 OC-48 links between backbone routers on the West Coast and the East Coast
BB-West: large percentage of scanning traffic
BB-East: large volume
Collected by IPMON
Methodology




The 4 sampling schemes use different parameters
A common metric is required for a fair comparison
We choose:
Percentage of sampled flows
The schemes still differ in:
Memory requirement
CPU utilization
Methodology

Note:
Although the percentage of sampled flows is fixed,
smart sampling and sample-and-hold are biased towards large flows
Impact of Sampling on
Volume Anomaly Detection


Volume Anomaly Detection Result
Feature Variation Due to Sampling
Detection from the original trace

A total of 21 abrupt changes detected in the original trace
No. of detections ↓ as sampling interval ↑
Random flow sampling performs the best
Smart sampling and sample-and-hold drop much faster
No false positives in detection



Feature Variation Due to Sampling

Difference in detection performance
Most volume spikes are caused by a sudden increase in small packet flows
Random flow sampling is unbiased by flow size
The others are biased towards large flows
Smart sampling and sample-and-hold are designed to track heavy hitters
and perform poorly compared to packet sampling
Feature Variation Due to Sampling

No false positives



Simply put, a spike in the samples must have existed in the original trace
It is not an artifact of sampling
Sampling only ↓ the no. of detections and does not cause any false detections
Feature Variation Due to Sampling



No. of detections ↓ as sampling interval ↑, even with random flow sampling
The technique is based on the no. of sampled events and the local variance
Hypothesis: sampling introduces distortion in the variance
[Figure labels: Success, Fail]
Feature Variation Due to Sampling

Sampling introduces distortion in the variance
Sampling scales down the original time series by a fraction p
Assume variance = σ^2 and average rate = λ
New scaled-down variance = p^2 σ^2
Sampling also involves the removal of discrete points, i.e. the original point process is sampled binomially
Binomial random variable: variance = λ p (1 - p)
Total variance = p^2 σ^2 + λ p (1 - p)
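The two contributions to the variance follow from the law of total variance when each event of a time bin is kept independently with probability p. The sketch below writes σ^2 for the original variance and λ for the original average rate (symbol names assumed here), computes both terms, and checks the closed form against an exact enumeration over a small count distribution; the function names are made up for this example.

```python
from math import comb

def variance_terms(sigma2, lam, p):
    """Return (scaled-down variance, binomial removal term, total)."""
    scaled = p * p * sigma2
    removal = lam * p * (1 - p)
    return scaled, removal, scaled + removal

def exact_sampled_variance(dist, p):
    """dist: {count n: probability}. Variance of the binomially thinned
    count, computed by brute-force enumeration."""
    mean = 0.0
    second = 0.0
    for n, prob_n in dist.items():
        for s in range(n + 1):
            w = prob_n * comb(n, s) * p ** s * (1 - p) ** (n - s)
            mean += w * s
            second += w * s * s
    return second - mean * mean
```

The scaled-down term shrinks as p^2 while the removal term shrinks only as p(1 - p), so the removal term takes up a growing share of the total as the sampling interval increases.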
Feature Variation Due to Sampling

Total variance = scaled-down variance + removal of discrete points
Removal of discrete points: > 70% when N = 500
Affects detection!
Impact of Sampling on Portscan
Detection

Metrics

Success ratio Rs, false positive ratio Rf+, false negative ratio Rf-
Desirable to have a HIGH Rs and a LOW Rf+
Focus on the success ratio and the false positive ratio
(because Rs + Rf- = 1)
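The metric definitions did not survive in this transcript, so the sketch below is an assumption: all three ratios are taken relative to the number of true scanners, which is consistent with Rs + Rf- = 1 but is not confirmed by the slides for Rf+. The function name is made up for this example.

```python
def detection_ratios(detected, true_scanners):
    """Return (Rs, Rf+, Rf-), each normalised by the number of true scanners
    (assumed definition, see the note above)."""
    detected = set(detected)
    true_scanners = set(true_scanners)
    n_true = len(true_scanners)
    rs = len(detected & true_scanners) / n_true      # true scanners found
    rf_pos = len(detected - true_scanners) / n_true  # non-scanners flagged
    rf_neg = len(true_scanners - detected) / n_true  # true scanners missed
    return rs, rf_pos, rf_neg
```

Under this definition Rf+ can exceed 1 when a detector flags more non-scanners than there are true scanners.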

Impact of Sampling on Portscan
Detection




Challenge: determining the true scanners
The final list of scanners manually generated by Sridharan (in Impact of Packet Sampling on Portscan Detection) is used as the ground truth
Less interested in absolute accuracy
than in relative performance as a function of sampling scheme and sampling rate
TRWSYN under Sampling

Rs and Rf+ ratios for the BB-West trace as functions of
effective sampling interval for all four sampling schemes
TRWSYN under Sampling

Random Packet Sampling

As base case for comparison


Success Ratio Rs
Initially increases slightly for small N (seems advantageous)
Drops off for large N
TRWSYN under Sampling

Random Packet Sampling

As base case for comparison


False Positive Ratio Rf+
Follows behaviour similar to Rs, but on a larger scale
Increases 3 times as N goes from 1 to 10
TRWSYN under Sampling


2 key effects of packet sampling
Flow-reduction
The number of flows observed is reduced
Flow-shortening
Multi-packet flows are reduced to single-packet flows
Recall:
TRWSYN algorithm
Single SYN-packet flow → connection failure → potential scanner
TRWSYN under Sampling



Small sampling interval
Flow-reduction → slight impact → high Rs
Flow-shortening → substantial impact → ↑ single-packet flows
Impact:
Scanners’ multi-packet flows, initially missed → shortened → detected → increases Rs
Regular multi-packet flows → shortened → “detected” → increases Rf+
TRWSYN under Sampling




Large sampling interval
Flow-reduction dominates
Fewer decisions (detections)
Rs and Rf+ decrease
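The interplay of flow-reduction and flow-shortening under random packet sampling can be quantified with elementary probabilities. The sketch assumes every packet of a flow is kept independently with probability 1/N and that the SYN is the first packet of the flow; the function name is made up for this example.

```python
def packet_sampling_effects(n_packets, interval):
    """For a flow of n_packets sampled with probability 1/interval, return
    (P[flow is seen at all],
     P[flow is shortened to exactly one packet],
     P[only the SYN packet is seen])."""
    p = 1.0 / interval
    seen = 1 - (1 - p) ** n_packets
    single = n_packets * p * (1 - p) ** (n_packets - 1)
    syn_only = p * (1 - p) ** (n_packets - 1)
    return seen, single, syn_only
```

For a 10-packet flow the chance of surviving only as a lone SYN is zero without sampling (N = 1) and becomes positive once N > 1, which is how regular flows start to look like failed connections. As N grows further, the chance that the flow is seen at all keeps shrinking, and flow-reduction takes over.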
TRWSYN under Sampling


3 Flow sampling schemes
Decision based on entire flow


No flow-shortening
Flow-reduction dominates the impact
Exception: sample-and-hold
Mid-flow shortening
Decisions are only made on SYN-packet flows
so it introduces NO false positives
TRWSYN under Sampling


Both Rs and Rf+ decrease almost
monotonically as N increases
Rf+ lower than packet sampling
TRWSYN under Sampling


In terms of Rf+
Flow sampling >> Packet sampling
In terms of Rs,
Random Flow Sampling > Random Packet Sampling > Smart Sampling > Sample-and-Hold
Cause:


Bias towards Large Flows
Suffer more from Flow-reduction
TAPS under Sampling




Critical parameter: time bin
For each sampling scheme and each sampling rate, use the optimal time bin
that maximizes Rs
The optimal time bin is an increasing function of the sampling interval
True for both packet sampling and flow sampling schemes
TAPS under Sampling

Results of portscan detection with TAPS for
Trace BB-West
TAPS under Sampling



Rs decreases as sampling interval increases
Random Flow Sampling performs the best
Random Packet Sampling performs as well
as the remaining 2 Flow sampling schemes
Cause:


Bias towards Large Flows
Tend to miss small (critical) flows
TAPS under Sampling

Random Packet Sampling

Rf+ initially increases, due to flow-shortening
Then drops off at large sampling intervals, due to flow-reduction
Flow Sampling schemes

No/Minor Flow-shortening


Low Rf+
Monotonically decreases with sampling interval
TAPS under Sampling




TAPS uses the address range distribution for detection
Insensitive to the 4 schemes
No distortion introduced
Low Rf+
e.g. with random packet sampling, the Rf+ of TAPS is 1/10 of that of TRWSYN
Conclusion

Random Flow Sampling
Performs the best
Prohibitive resource requirement
Random Packet Sampling
Suffers from flow-shortening
Smart Sampling & Sample-and-Hold
Biased towards large flows
Perform worse than random packet sampling in volume anomaly detection
Conclusion




All 4 sampling schemes
Degrade all 3 anomaly detection algorithms
In terms of Rs and Rf+
Is sampled data sufficient for anomaly detection?
→ Remains an open question