Ranking Attackers Through Network Traffic Analysis
Andrew Williams & Nikunj Kela
11/28/2011
CSC/ECE 774
Agenda
• Background
• Tools We've Developed
• Our Approach
• Results
• Future Work
Background: The Problem
Setting 1: Corporate Environment
•Large number of attackers
•How do you prioritize which attacks to investigate?
RSA
Background: The Problem
Setting 2: Hacking Competitions
•How do you know who should win?
Background: Information Available
• Network Traffic Captures
• Alerts from Intrusion Detection Systems (IDS)
• Application and Operating System Logs
Background: Traffic Captures
• HUGE volumes of data
• A complete history of interactions between clients and
servers*
Information available:
• Traffic Statistics
• Info on interactions across multiple servers
• How traffic varies with time
• Everything up to and including application layer info**
Background: IDS Alerts
• Messages indicating that a packet matches the signature of
a known malicious one
• Still a fairly large amount of data
• Same downsides as anti-virus programs, but most IDS
signatures are open source!
• If IDS is compromised, these might not be available
Information available:
• Indication that known attacks are being launched
• Alert Statistics
• How alerts vary with time
Background: Application/OS Logs
Ex: mysql logs, apache logs, Windows 7 Security logs, ...
• Detailed, application-specific error messages and warnings
• Large amount of data
• If a server is compromised, logs may not be available
Information available:
• Very detailed information with more context
• Access to errors/issues even if traffic was encrypted
Background: iCTF 2010 Contest
• 72 teams attempting to compromise 10 servers
• Vulnerabilities include SQL Injection, exploitable off-by-one
errors, format string exploits, and several others*
• Pretty complex set of rules
Dataset from competition:
• 27 GB of Network Traffic Captures
• 46 MB of Snort Alerts (from competition)
• 175 MB of Snort Alerts (generated with updated rulesets)
• No Application or OS Logs
More information on the contest can be found here:
http://www.cs.ucsb.edu/~gianluca/papers/ctf-acsac2011.pdf
Tools We've Developed
We wrote scripts to...
• Parse the large amount of data:
o Extract network traffic between multiple parties
o Filter out less important Snort Alerts
o Track connection state to generate statistics and stream data
• Visualize the data
o Show all of the alerts and flag submissions with respect to
time
• Analyze the data
o Pull out the transaction distances and find statistics on them
• Generate Application and OS Logs
o Replay network traffic to live virtual machine images
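The connection-tracking step can be illustrated with a minimal sketch. It is not the scripts from the project: the `Packet` record, its field names, and the `client_streams` helper are hypothetical, and the logic assumes packets arrive already decoded, that a SYN marks the start of a new connection, and that retransmissions, reordering, and FIN/RST handling can be ignored.

```python
from typing import Iterable, List, NamedTuple, Set


class Packet(NamedTuple):
    """Hypothetical, already-decoded packet record (not a real library type)."""
    ts: float        # capture timestamp in seconds
    src: str
    sport: int
    dst: str
    dport: int
    syn: bool        # True if the TCP SYN flag is set
    payload: bytes   # TCP payload, possibly empty


def client_streams(packets: Iterable[Packet], server_ips: Set[str]) -> List[dict]:
    """Group client-to-server payload bytes into one record per TCP connection."""
    streams: List[dict] = []
    open_by_key = {}
    for p in sorted(packets, key=lambda pkt: pkt.ts):
        if p.dst not in server_ips:
            continue  # keep only the client-to-server direction
        key = (p.src, p.sport, p.dst, p.dport)
        # Assumption: a SYN (or a 4-tuple not seen before) starts a new connection.
        if p.syn or key not in open_by_key:
            stream = {"key": key, "start": p.ts, "end": p.ts, "payload": b""}
            open_by_key[key] = stream
            streams.append(stream)
        stream = open_by_key[key]
        stream["end"] = p.ts
        stream["payload"] += p.payload
    return streams
```

Each returned record carries the start and end time of a connection plus the bytes the client sent, which is the kind of per-attempt data the later analysis steps need.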
Our Approach: Intuition
• Vulnerability Discovery Phase
Identify the type of vulnerability
• Vulnerability Exploitation Phase
Refine the attack string
• Intuitively, a skilled attacker will arrive at a working attack
string in less time than an unskilled attacker
• How do we know if the attacker has broken into the system?
We only have logs to work with!
• Time taken to break into the system reflects the learning
capabilities of an attacker
Fast learner implies good attacker
Our Approach: Identify the attack string
• Once the attacker breaks into the system, he/she will reuse
the same attack string almost every time to gather information
• We observed from the traffic logs that, in most cases,
the attacker used one TCP stream to break into the system
One TCP connection for each attempt!
• We chose Levenshtein distance (edit distance) as our metric
to compare consecutive TCP streams sent from the attacker to
the server
• A run of consecutive zero distances between TCP payloads means
the attacker has successfully broken into the system
Example: Identify the attack string
Stream1: "%27%20or%20%27%27%3D%27%0Alist%0A"
Stream2: "%27%20OR%20%27%27%3D%27%0ALIST%0A"
Stream3: "asdfasd%20%27%20UNION%20SELECT%20%28%27secret.txt%27%29
%3B%20--%20%20%0AMUGSHOT%0ASADF%0A"
Stream4: "asdfasd%20%27%20UNION%20SELECT%20%28%27secret.txt%27%29
%3B%20--%20%20%0AMUGSHOT%0A39393%0A"
Stream5: "asdfasd%20%27%20UNION%20SELECT%20%28%27secret.txt%27%29
%3B%20--%20%20%0AMUGSHOT%0A1606%0A"
Stream6: "asdfasd%20%27%20UNION%20SELECT%20%28%27secret.txt%27%29
%3B%20--%20%20%0AMUGSHOT%0A1606%0A"
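The comparison can be sketched on the six streams above. This is an illustrative implementation rather than the presenters' script: the payloads are copied from the slide with wrapped lines rejoined (assuming no whitespace at the wrap), and the `levenshtein` and `first_zero` helpers are names chosen here.

```python
from typing import List, Optional


def levenshtein(a: str, b: str) -> int:
    """Edit distance: insertions, deletions, and substitutions each cost 1."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # delete ca
                               current[j - 1] + 1,       # insert cb
                               previous[j - 1] + cost))  # substitute or match
        previous = current
    return previous[-1]


def first_zero(distances: List[int]) -> Optional[int]:
    """Index of the first zero distance, or None if the payload never repeats."""
    return next((i for i, d in enumerate(distances) if d == 0), None)


# Shared prefix of Streams 3-6, rejoined from the wrapped slide lines.
UNION = ("asdfasd%20%27%20UNION%20SELECT%20%28%27secret.txt%27%29"
         "%3B%20--%20%20%0AMUGSHOT%0A")

streams = [
    "%27%20or%20%27%27%3D%27%0Alist%0A",
    "%27%20OR%20%27%27%3D%27%0ALIST%0A",
    UNION + "SADF%0A",
    UNION + "39393%0A",
    UNION + "1606%0A",
    UNION + "1606%0A",
]

# distances[i] compares streams[i] with streams[i + 1].
distances = [levenshtein(x, y) for x, y in zip(streams, streams[1:])]
print(distances, first_zero(distances))
```

Stream1 and Stream2 differ only in the case of six letters, so their distance is 6. Stream5 and Stream6 are identical, so the last distance is 0, which is the repeat that signals a working attack string.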
Our Approach: Feature Selection
• Time taken to successfully break into the system
• Mean and standard deviation of the distances between
consecutive TCP streams
• Number of attempts before successfully breaching the
service
• Length of the longest run of consecutive zeros
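A rough sketch of how these features could be computed for one attacker and one service follows. Several choices are assumptions made for illustration, not definitions from the slides: the break-in is taken to be the first attempt whose distance to the next attempt is zero, the standard deviation is the population form, and the function and key names are hypothetical.

```python
from statistics import mean, pstdev
from typing import Dict, Optional, Sequence


def longest_zero_run(distances: Sequence[int]) -> int:
    """Length of the longest run of consecutive zero distances."""
    best = current = 0
    for d in distances:
        current = current + 1 if d == 0 else 0
        best = max(best, current)
    return best


def extract_features(timestamps: Sequence[float],
                     distances: Sequence[int]) -> Dict[str, Optional[float]]:
    """timestamps[i] is the start time of attempt i (seconds);
    distances[i] is the edit distance between attempts i and i + 1."""
    if len(distances) != len(timestamps) - 1:
        raise ValueError("expected one distance per consecutive pair of attempts")
    # Assumption: the first zero distance marks the first successful attempt.
    success = next((i for i, d in enumerate(distances) if d == 0), None)
    return {
        "time_to_break_in": None if success is None
                            else timestamps[success] - timestamps[0],
        "attempts_before_break_in": success,
        "mean_distance": mean(distances) if distances else 0.0,
        "std_distance": pstdev(distances) if distances else 0.0,
        "longest_zero_run": longest_zero_run(distances),
    }
```

For example, `extract_features([0, 30, 95, 140, 150, 160], [6, 40, 5, 0, 0])` uses made-up timestamps and distances; it reports a break-in at the fourth attempt, 140 seconds after the first one, with a zero run of length 2.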
11/28/2011
CSC/ECE 774
14
Result: Distance-Time Plot
Interesting Findings from the contest
• Although the contest involved only attacking the vulnerable
services, teams still tried to break into each other's systems
• We noticed that teams shared flag values with each other
through the chat server
• The active status of the services was maintained through a complex
Petri net system, and most of the teams struggled to understand it
• Hints about different vulnerabilities in the services were released
from time to time throughout the contest by the administrators
Future Work
• Use data mining tools (e.g. SAS miner) to analyze the
relationships among the features
• Use data mining tools to develop a scoring system that
assigns a score to each team based on the feature set
• Continue improving the replay script to handle the large
number of connections
Thank You!
Questions?
Image Sources
WooThemes, free for commercial use
Icons-Land, free for non-commercial use
Fast Icon Studio, used with permission