A Machine Learning Approach
to Detecting Attacks
by Identifying Anomalies
in Network Traffic
A Dissertation
by Matthew V. Mahoney
Major Advisor: Philip K. Chan
Overview
• Related work in intrusion detection
• Approach
• Experimental results
– Simulated network
– Real background traffic
• Conclusions and future work
Limitations of Intrusion Detection
• Host based (audit logs, virus checkers, system
calls (Forrest 1996))
– Cannot be trusted after a compromise
• Network signature detection (SNORT (Roesch
1999), Bro (Paxson 1998))
– Cannot detect novel attacks
– Alarms occur in bursts
• Address/port anomaly detection (ADAM
(Barbara 2001), SPADE (Hoagland 2000),
eBayes (Valdes & Skinner 2000))
– Cannot detect attacks on public servers (web, mail)
Intrusion Detection Dimensions
[Diagram: intrusion detection systems placed along three dimensions – Data (host audit logs vs. network traffic), Method (signature vs. anomaly), and Model (user vs. protocol). BSM and virus detection are host-based signature systems; SNORT and Bro are network signature systems; SPADE, ADAM, eBayes, and firewalls model users (addresses and ports); the proposed approach is network protocol anomaly detection.]
Problem Statement
• Detect (not prevent) attacks in network traffic
• No prior knowledge of attack characteristics
[Diagram: training traffic with no known attacks is used to build a model of normal traffic; the IDS then raises alarms on test data containing attacks.]
Approach
1. Model protocols (extend user model)
2. Time-based model of “bursty” traffic
3. Learn conditional rules
4. Batch and continuous modeling
5. Test with simulated attacks and real background traffic
Approach 1. Protocol Modeling
• User model (conventional)
– Source address for authentication
– Destination port to detect scans
• Protocol model (new)
– Unusual features (more likely to be
vulnerable)
– Client idiosyncrasies
– IDS evasion
– Victim’s symptoms after an attack
Example Protocol Anomalies
• Teardrop – overlapping IP fragments crash the target; detected by: IP fragments; category: unusual feature
• Sendmail – buffer overflow gives a remote root shell; detected by: lower case “mail”; category: idiosyncrasy
• FIN scan (portsweep) – FIN packets are not logged; detected by: FIN without ACK; category: evasion
• ARP poison – forged replies to ARP-who-has; detected by: interrupted TCP connections; category: victim symptoms
Approach 2. Non-Poisson Traffic Model (Paxson & Floyd, 1995)
• Events occur in bursts on all time scales
• Long range dependency
• No average rate of events
• Event probability depends on
– The average rate in the past
– And the time since it last occurred
Time-Based Model
If port = 25 then word1 = HELO or EHLO
• Anomaly: any value never seen in training
• Score = tn/r
– t = time since last anomaly for this rule
– n = number of training instances (port = 25)
– r = number of allowed values (2)
• Only the first anomaly in a burst receives a
high score
Example
Training = AAAABBBBAA
Test = AACCC
• C is an anomaly
• r/n = average rate of training anomalies =
2/10 (first A and first B)
• t = time since last anomaly = 9, 1, 1
• Score (C) = tn/r = 45, 5, 5
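A minimal sketch of this scoring scheme follows (an illustration only, not the dissertation's implementation). It trains one conditional rule of the form “if port = 25 then word1 must be a value seen in training” and scores each violation as tn/r, resetting t after every anomaly so that only the first anomaly in a burst scores high. The class name, the SMTP values in the demo, and the convention that time advances by one per observed instance are assumptions.

```python
# Sketch of tn/r time-based anomaly scoring for one conditional rule
# ("if port = 25 then word1 must be a value seen in training").
class TimeBasedRule:
    def __init__(self, port):
        self.port = port
        self.allowed = set()    # values seen in training; r = len(allowed)
        self.n = 0              # training instances matching the condition
        self.last_anomaly = 0   # time of the most recent anomaly
        self.clock = 0          # assumption: time advances by 1 per instance

    def train(self, port, word1):
        self.clock += 1
        if port == self.port:
            self.n += 1
            if word1 not in self.allowed:   # novel value during training
                self.allowed.add(word1)
                self.last_anomaly = self.clock

    def score(self, port, word1):
        """Return tn/r for a never-seen value, 0 otherwise."""
        self.clock += 1
        if port != self.port or word1 in self.allowed:
            return 0.0
        t = self.clock - self.last_anomaly  # time since the last anomaly
        self.last_anomaly = self.clock      # reset: a burst scores high only once
        return t * self.n / len(self.allowed)

rule = TimeBasedRule(port=25)
for w in ["HELO", "EHLO", "HELO", "HELO"]:   # training: only HELO/EHLO seen
    rule.train(25, w)
for w in ["HELO", "EXPN", "EXPN"]:           # test: EXPN is a novel value
    print(w, rule.score(25, w))              # 0.0, then high, then low
```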
Approach 3. Rule Learning
1. Sample training pairs to suggest rules with n/r = 2/1
2. Remove redundant rules, favoring high n/r
3. Validation: remove rules that generate alarms on attack-free traffic
Learning Step 1 - Sampling
Port  Word1  Word2        Word3
80    GET    /            HTTP/1.0
80    GET    /index.html  HTTP/1.0
• If port = 80 then word1 = GET
• word3 = HTTP/1.0
• If word3 = HTTP/1.0 and word1 = GET then port = 80
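The sampling step can be sketched as below. This is an illustration under assumptions (dictionary tuples, a fixed attribute order); LERAD itself picks random pairs of training tuples and random attribute orders, but the idea is the same: attributes on which the pair agrees suggest candidate rules whose conditions are the matches chosen earlier.

```python
# Sketch of step 1: a pair of training tuples suggests candidate rules.
# Attributes where the two tuples agree become consequents, conditioned on
# the matches chosen before them (LERAD randomizes pair and attribute order).
def suggest_rules(t1, t2):
    matching = [a for a in t1 if t1[a] == t2[a]]
    rules, conditions = [], []
    for attr in matching:
        rules.append((list(conditions), attr, t1[attr]))
        conditions.append((attr, t1[attr]))
    return rules

t1 = {"port": 80, "word1": "GET", "word2": "/",           "word3": "HTTP/1.0"}
t2 = {"port": 80, "word1": "GET", "word2": "/index.html", "word3": "HTTP/1.0"}
for cond, attr, value in suggest_rules(t1, t2):
    print("if", cond or "(always)", "then", attr, "=", value)
# e.g. if (always) then port = 80
#      if [('port', 80)] then word1 = GET
#      if [('port', 80), ('word1', 'GET')] then word3 = HTTP/1.0
```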
Learning Step 2 – Remove
Redundant Rules (Sorted by n/r)
Port  Word1  Word2        Word3
25    HELO   pascal       MAIL
80    GET    /            HTTP/1.0
80    GET    /index.html  HTTP/1.0

• R1: if port = 80 then word1 = GET (n/r = 2/1, OK)
• R2: word1 = HELO or GET (n/r = 3/2, OK)
• R3: if port = 25 then word1 = HELO (n/r = 1/1, remove)
• R4: word2 = pascal, /, or /index.html (n/r = 3/3, OK)
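One way to read the redundancy check is as a coverage test. The sketch below is an assumption about the exact criterion, but it reproduces the example above: working in decreasing n/r order, a rule is dropped when every training value it predicts is already predicted by the rules kept so far.

```python
# Sketch of step 2: keep rules in decreasing n/r order; drop a rule if it
# predicts nothing new. "Coverage" = the (tuple index, attribute) cells whose
# value the rule predicts on the training data.
def remove_redundant(rules, training):
    rules = sorted(rules, key=lambda r: -r[1])          # sort by n/r
    kept, covered = [], set()
    for name, nr, cond, attr, allowed in rules:
        cells = {(i, attr) for i, t in enumerate(training)
                 if cond(t) and t[attr] in allowed}
        if cells - covered:                             # predicts something new
            kept.append(name)
            covered |= cells
    return kept

training = [
    {"port": 25, "word1": "HELO", "word2": "pascal",      "word3": "MAIL"},
    {"port": 80, "word1": "GET",  "word2": "/",           "word3": "HTTP/1.0"},
    {"port": 80, "word1": "GET",  "word2": "/index.html", "word3": "HTTP/1.0"},
]
rules = [
    ("R1", 2/1, lambda t: t["port"] == 80, "word1", {"GET"}),
    ("R2", 3/2, lambda t: True,            "word1", {"HELO", "GET"}),
    ("R3", 1/1, lambda t: t["port"] == 25, "word1", {"HELO"}),
    ("R4", 3/3, lambda t: True,            "word2", {"pascal", "/", "/index.html"}),
]
print(remove_redundant(rules, training))    # ['R1', 'R2', 'R4'] -- R3 is dropped
```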
Learning Step 3 – Rule Validation
• Training (no attacks) – Learn rules, n/r
• Validation (no attacks) – Discard rules that
generate alarms
• Testing (with attacks)
[Diagram: the attack-free traffic is split into consecutive Train and Validate segments, followed by the Test segment containing attacks.]
Approach 4. Continuous Modeling
• No separate training and test phases
• Training data may contain attacks
• Model allows for previously seen values
• Score = tn/r + ti/fi
– ti = time since value i last seen
– fi = frequency of i in training, fi > 0
• No validation step
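A rough sketch of the continuous score for a single attribute is shown below. It assumes (an interpretation of the slide, not the dissertation's code) that the tn/r term applies to never-seen values and the ti/fi term to previously seen values, with the inapplicable term treated as zero.

```python
from collections import defaultdict

# Sketch of continuous scoring: tn/r for never-seen values, ti/fi for values
# seen before (fi > 0). There is no separate training phase; the model updates
# after scoring every instance.
class ContinuousModel:
    def __init__(self):
        self.count = defaultdict(int)   # fi: how often value i has been seen
        self.last_seen = {}             # time value i was last seen
        self.last_anomaly = 0
        self.n = 0                      # instances observed so far
        self.clock = 0

    def score_and_update(self, value):
        self.clock += 1
        if value not in self.count:                       # novel value: tn/r
            r = max(len(self.count), 1)
            s = (self.clock - self.last_anomaly) * self.n / r
            self.last_anomaly = self.clock
        else:                                             # seen value: ti/fi
            s = (self.clock - self.last_seen[value]) / self.count[value]
        self.count[value] += 1
        self.last_seen[value] = self.clock
        self.n += 1
        return s

m = ContinuousModel()
for v in "AAAABBBBAACCCAB":              # training and test are one stream
    print(v, round(m.score_and_update(v), 2))
```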
Implementation
Model   Data            Conditions    Validation  Score
PHAD    Packet headers  None          No          tn/r
ALAD    TCP streams     Server, port  No          tn/r
LERAD   TCP streams     Learned       Yes         tn/r
NETAD   Packet bytes    Protocol      Yes         tn/r + ti/fi
Example Rules (LERAD)
1 39406/1 if SA3=172 then SA2 = 016
2 39406/1 if SA2=016 then SA3 = 172
3 28055/1 if F1=.UDP then F3 = .
4 28055/1 if F1=.UDP then F2 = .
5 28055/1 if F3=. then F1 = .UDP
6 28055/1 if F3=. then DUR = 0
7 27757/1 if DA0=100 then DA1 = 112
8 25229/1 if W6=. then W7 = .
9 25221/1 if W5=. then W6 = .
10 25220/1 if W4=. then W8 = .
11 25220/1 if W4=. then W5 = .
12 17573/1 if DA1=118 then W1 = .^B^A^@^@
13 17573/1 if DA1=118 then SA1 = 112
14 17573/1 if SP=520 then DP = 520
15 17573/1 if SP=520 then W2 = .^P^@^@^@
16 17573/1 if DP=520 then DA1 = 118
17 17573/1 if DA1=118 SA1=112 then LEN = 5
18 28882/2 if F2=.AP then F1 = .S .AS
19 12867/1 if W1=.^@GET then DP = 80
20 68939/6 if then DA1 = 118 112 113 115 114 116
21 68939/6 if then F1 = .UDP .S .AF .ICMP .AS .R
22 9914/1 if W3=.HELO then W1 = .^@EHLO
23 9914/1 if F1=.S W3=.HELO then DP = 25
24 9914/1 if DP=25 W5=.MAIL then W3 = .HELO
1999 DARPA IDS Evaluation
(Lippmann et al. 2000)
• 7 days training data with no attacks
• 2 weeks test data with 177 visible attacks
• Must identify victim and time of attack
[Diagram: simulated Internet traffic containing the attacks passes the IDS on its way to the victim hosts (SunOS, Solaris, Linux, WinNT).]
Attacks Detected at 10 FA/Day
[Bar chart: number of attacks detected at 10 false alarms per day by PHAD, ALAD, LERAD, NETAD, and the continuous model.]
Unlikely Detections
• Attacks on public servers (web, mail, DNS)
detected by source address
• Application server attacks detected by
packet header fields
• U2R (user to root) detected by FTP upload
Unrealistic Background Traffic
[Plot: r versus time for real and simulated traffic.]
• Source Address, client versions (too few clients)
• TTL, TCP options, TCP window size (artifacts)
• Checksum errors, “crud”, invalid keywords and
values (too clean)
5. Injecting Real Background
Traffic
• Collected on a university departmental web server
• Filtered: truncated inbound client traffic only
• IDS modified to avoid conditioning on traffic source
[Diagram: Internet traffic (simulated and real) containing the attacks passes the IDS to the victim hosts (SunOS, Solaris, Linux, WinNT) and a real web server.]
Mixed Traffic: Fewer Detections,
but More are Legitimate
[Bar chart: total and legitimate detections on mixed traffic for PHAD, ALAD, LERAD, and NETAD.]
Detections vs. False Alarms
(Simulated and Combined Traffic)
[Plot: detections out of 148 versus false alarms (0–500) for NETAD-S, LERAD-S, NETAD-C, and LERAD-C, where -S is simulated and -C is combined traffic.]
Results Summary
• Original 1999 evaluation: 40-55% detected
at 10 false alarms per day
• NETAD (excluding U2R): 75%
• Mixed traffic: LERAD + NETAD: 30%
• At 50 FA/day: NETAD: 47%
Contributions
1. Protocol modeling
2. Time-based modeling for bursty traffic
3. Rule learning
4. Continuous modeling
5. Removing simulation artifacts
Limitations
• False alarms – Unusual data is not always
hostile
• Rule learning requires 2 passes (not continuous)
• Tests with real traffic are not reproducible
(privacy concerns)
• Unlabeled attacks in real traffic
– GET /MSADC/root.exe?/c+dir HTTP/1.0
– GET /scripts/..%255c%255c../winnt/system32/cmd.exe?/c+dir
Future Work
• Modify rule learning for continuous traffic
• Add other attributes
• User feedback (should this anomaly be
added to the model?)
• Test with real attacks
Acknowledgments
• Philip K. Chan – Directing research
• Advisors – Ryan Stansifer, Kamel Rekab, James
Whittaker
• Ongoing work
– Gaurav Tandon – Host based detection using LERAD
(system call arguments)
– Rachna Vargiya – Parsing application payload
– Hyoung Rae Kim – Payload lexical/semantic analysis
– Muhammad Arshad – Outlier detection in network
traffic
• DARPA – Providing funding and test data