Characteristics of Network
Traffic Flow Anomalies
Paul Barford and David Plonka
University of Wisconsin – Madison
SIGCOMM IMW, 2001
Motivation
• Traffic anomalies are a fact of life in computer networks
• Anomaly detection and identification is challenging
– Operators typically monitor by eye using SNMP or IP flows
– Simple thresholding is ineffective
– Some anomalies are obvious, others are not
• Characteristics of anomalous behavior in IP flows have
not been established
– Do same types of anomalies have same characteristics?
– Can characteristics be effectively used in detection systems?
Barford & Plonka
IMW 2001
2
Related Work
• Network traffic characterization
– E.g., Caceres89, Leland93, Paxson97, Zhang01
• Focus on typical behavior
• Fault and anomaly detection techniques
– E.g., Feather93, Brutlag00
• Focus on thresholds and time series models
– E.g., Paxson99
• Rule-based tool for intrusion detection
– E.g., Moore01
• Backscatter technique can be used to identify DoS attacks
• No prior work identifies anomaly characteristics
Our Approach to Data Gathering
• Consider anomalies in IP flow data
– Collected at UW border router - 5 minute intervals
– Archive of two years' worth of data (packets, bytes, flows)
– Includes identification of anomalies (after-the-fact analysis)
• Group anomalies into three categories
– Network operation anomalies
• Steep drop-offs in service followed by quick return to normal behavior
– Flash crowd anomalies
• Steep increase in service followed by slow return to normal behavior
– Network abuse anomalies
• Steep increase in flows in one direction followed by quick return to
normal behavior
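The three categories above are described by the shape of the excursion: its direction and how quickly traffic returns to normal. A toy sketch of that idea follows. It is not the authors' method; the function name, the fixed baseline, and the `tolerance` and `slow_recovery_bins` values are all assumptions chosen for illustration.

```python
def classify_excursion(series, baseline, tolerance=0.5, slow_recovery_bins=6):
    """Label the first excursion from a baseline by direction and recovery time.

    `tolerance` and `slow_recovery_bins` are illustrative values,
    not thresholds taken from the slides.
    """
    low = baseline * (1 - tolerance)
    high = baseline * (1 + tolerance)

    # First bin that leaves the [low, high] band around the baseline.
    start = next((i for i, v in enumerate(series) if v < low or v > high), None)
    if start is None:
        return "normal"

    # First later bin that is back inside the band (or the end of the data).
    end = next(
        (i for i in range(start, len(series)) if low <= series[i] <= high),
        len(series),
    )
    duration = end - start

    if series[start] < low:
        return "network operation"  # steep drop-off in service
    # Steep increase: slow recovery suggests a flash crowd, quick recovery abuse.
    return "flash crowd" if duration >= slow_recovery_bins else "network abuse"
```

With 5-minute bins, `slow_recovery_bins=6` treats any increase lasting 30 minutes or more as a slow return. A real system would need a baseline that follows daily and weekly cycles rather than a constant.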
IP Flows
• An IP Flow is defined as a unidirectional series of
packets between source/dest IP/port pair over a period of
time
{SRC_IP/Port,DST_IP/Port,Pkts,Bytes,Start/End Time,TCP Flags,IP Prot …}
– Exported by Lightweight Flow Accounting Protocol (LFAP)
enabled routers (Cisco’s NetFlow)
• We use FlowScan [Plonka00] to collect and process
NetFlow data
– Combines a flow collection engine, database, and visualization tool
– Provides a near real-time visualization of network traffic
– Breaks traffic down by well-known service or application
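To make the flow definition concrete, here is a minimal sketch of a flow record and of the aggregation into 5-minute totals of flows, packets, and bytes. The `FlowRecord` class, its field names, and `bin_flows` are hypothetical and simplified; they do not reproduce the NetFlow export format or FlowScan's internals.

```python
from collections import defaultdict
from dataclasses import dataclass

BIN_SECONDS = 300  # 5-minute collection interval


@dataclass(frozen=True)
class FlowRecord:
    """Hypothetical, simplified unidirectional flow record."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    packet_count: int
    byte_count: int
    start_time: float  # seconds since the epoch
    end_time: float
    tcp_flags: int
    ip_proto: int


def bin_flows(flows):
    """Sum flows, packets, and bytes per 5-minute bin, keyed by bin start time."""
    totals = defaultdict(lambda: {"flows": 0, "packets": 0, "bytes": 0})
    for flow in flows:
        # Assign each flow to the bin containing its start time.
        bin_start = int(flow.start_time // BIN_SECONDS) * BIN_SECONDS
        totals[bin_start]["flows"] += 1
        totals[bin_start]["packets"] += flow.packet_count
        totals[bin_start]["bytes"] += flow.byte_count
    return dict(totals)
```

Binning by start time is a simplification: a long-lived flow could instead be split across the bins it spans.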
Characteristics of “Normal” traffic
Our Approach to Analysis
• Analyze examples of each type of anomaly via
statistics, time series and wavelets (our initial focus)
• Wavelets provide a means of describing time series
data that considers both time and scale (frequency)
– Particularly useful for characterizing data with sharp
spikes and discontinuities
• More robust than Fourier analysis, which only shows what
frequencies exist in a signal
– Tricky to determine which wavelets provide the best
resolution of signals in the data
• We use tools developed at UW Wavelet IDR center
• First step: Identify which filters isolate anomalies
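As an illustration of how a wavelet decomposition localizes a sharp spike in time, here is a minimal sketch using the Haar wavelet, the simplest wavelet filter. Haar is swapped in purely for brevity; it is not the filter or the toolset used in this work, and all function names are illustrative.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    scale = math.sqrt(2.0)
    pairs = range(0, len(signal), 2)
    approx = [(signal[i] + signal[i + 1]) / scale for i in pairs]
    detail = [(signal[i] - signal[i + 1]) / scale for i in pairs]
    return approx, detail


def haar_decompose(signal, levels):
    """Return (final approximation, detail coefficients per level, finest first)."""
    details = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details


def largest_detail_index(detail):
    """Index of the detail coefficient with the largest magnitude."""
    return max(range(len(detail)), key=lambda i: abs(detail[i]))
```

For a flat series with a single one-bin spike, the largest finest-level detail coefficient sits at the pair of samples containing the spike, which is the time localization that a plain Fourier spectrum does not give.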
First Look at Analysis of “Normal” Traffic
• Wavelets easily localize familiar daily/weekly signals
First Look at Analysis of Attacks
• DoS: sharp increase in flows and/or packets in one direction
• Linear splines seem to be a good filter to distinguish DoS attacks
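The "one direction" property can be checked with a simple comparison of inbound and outbound counts per bin. The sketch below is a simplified stand-in, not the linear-spline filter mentioned above; the function name and the `factor` threshold are assumptions.

```python
from statistics import median


def one_sided_spikes(inbound, outbound, factor=3.0):
    """Return bin indices where exactly one direction exceeds `factor` times
    its own median. `factor` is an illustrative value."""
    if len(inbound) != len(outbound):
        raise ValueError("inbound and outbound must have the same length")
    in_limit = factor * median(inbound)
    out_limit = factor * median(outbound)
    return [
        i
        for i, (in_count, out_count) in enumerate(zip(inbound, outbound))
        if (in_count > in_limit) != (out_count > out_limit)
    ]
```

Bins in which both directions rise together are not flagged, so this check only captures the asymmetry, not the sharpness of the increase.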
Characteristics of Flash Crowds
• Sharp increase in packets/bytes/flows followed by
slow return to normal behavior, e.g., Linux releases
• Leading edge not significantly different from DoS
signal, so the next step is to look within the spikes
Characteristics of Network Anomalies
• Typically a steep drop-off in packets/bytes/flows,
followed a short time later by restoration
Conclusion and Next Steps
• Project to characterize network traffic flow anomalies
– Based on flow data collected at UW border router
• Anomalies have been grouped into three categories
– Analysis approach: statistical, time series, wavelet
• Initial results
– Good indications that we can isolate signals
• Future
– Continue analysis of anomaly data
– Analysis of data from other sites
– Application of results in (distributed) detection systems
Acknowledgements
• Somesh Jha
• Jeff Kline
• Amos Ron