Network-based Intrusion
Detection, Prevention and
Forensics System
Yan Chen
Department of Electrical Engineering and
Computer Science
Northwestern University
Lab for Internet & Security Technology (LIST)
http://list.cs.northwestern.edu
The Spread of Sapphire/Slammer
Worms
Current Intrusion Detection
Systems (IDS)
• Mostly host-based and not scalable to high-speed networks
– Slammer worm infected 75,000 machines in < 10 mins
– Host-based schemes are inefficient and user dependent
• Have to install IDS on all user machines!
• Mostly simple signature-based
– Cannot recognize unknown anomalies/intrusions
– New viruses/worms, polymorphism
Current Intrusion Detection
Systems (II)
• Cannot provide quality info for forensics or
situational-aware analysis
– Hard to differentiate malicious events from
unintentional anomalies
• Anomalies can be caused by network element faults,
e.g., router misconfiguration, link failures, etc., or
application (such as P2P) misconfiguration
– Cannot provide situation-aware information: attack
scope/target/strategy, attacker (botnet) size, etc.
Network-based Intrusion Detection,
Prevention, and Forensics System
• Online traffic recording
[SIGCOMM IMC 2004, INFOCOM 2006, ToN 2007, INFOCOM 2008]
– Reversible sketch for data streaming computation
– Record millions of flows (GB traffic) in a few hundred KB
– Small # of memory accesses per packet
– Scalable to large key space size (2^32 or 2^64)
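The memory claim above (millions of flows in a few hundred KB, a small number of memory accesses per packet) can be illustrated with a plain count-min sketch. This is a simpler, swapped-in technique and not the reversible sketch from the cited papers: it supports updates and point queries but cannot recover heavy keys on its own. The class name, table dimensions, and hash construction are illustrative assumptions.

```python
import hashlib
import struct


class CountMinSketch:
    """Plain count-min sketch (illustrative; NOT the reversible sketch)."""

    def __init__(self, depth=4, width=16384):
        self.depth = depth
        self.width = width
        # depth x width counters. A C implementation with 4-byte counters
        # would need 4 * 16384 * 4 B = 256 KB for these assumed dimensions.
        self.rows = [[0] * width for _ in range(depth)]

    def _buckets(self, key: bytes):
        # One salted hash per row; the salt keeps the rows independent.
        for row in range(self.depth):
            digest = hashlib.blake2b(
                key, digest_size=8, salt=struct.pack("<Q", row)
            ).digest()
            yield row, int.from_bytes(digest, "little") % self.width

    def update(self, key: bytes, count: int = 1):
        # Exactly `depth` counter updates per packet, however many flows exist.
        for row, col in self._buckets(key):
            self.rows[row][col] += count

    def estimate(self, key: bytes) -> int:
        # Never underestimates: collisions can only inflate a counter.
        return min(self.rows[row][col] for row, col in self._buckets(key))


# Usage: count packets for one hypothetical flow key (src IP, dst IP, dst port).
sketch = CountMinSketch()
flow = bytes([10, 0, 0, 1]) + bytes([192, 0, 2, 7]) + (80).to_bytes(2, "big")
for _ in range(1000):
    sketch.update(flow)
```

The table size is fixed up front, so memory does not grow with the number of flows; the price is a one-sided overestimate when keys collide.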
• Online sketch-based flow-level anomaly detection
[IEEE ICDCS 2006] [IEEE CG&A, Security Visualization 2006]
– Adaptively learn the traffic pattern changes
– As a first step, detect TCP SYN flooding, horizontal and
vertical scans even when mixed
• Online stealthy spreader (botnet scan) detection
[IWQoS 2007]
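The horizontal and vertical scans mentioned above differ in what varies: a horizontal scan probes the same port on many hosts, a vertical scan probes many ports on one host. A minimal sketch of that distinction follows, using exact per-source sets rather than the sketch-based detector of the cited papers. The function name, input tuple layout, and threshold are assumptions for illustration.

```python
from collections import defaultdict


def classify_scanners(syn_packets, threshold=20):
    """Label sources as horizontal and/or vertical scanners.

    syn_packets: iterable of (src_ip, dst_ip, dst_port) for TCP SYNs.
    threshold: hypothetical minimum number of distinct targets.
    """
    hosts_per_port = defaultdict(set)  # (src, port) -> distinct dst hosts
    ports_per_host = defaultdict(set)  # (src, dst)  -> distinct dst ports
    for src, dst, port in syn_packets:
        hosts_per_port[(src, port)].add(dst)
        ports_per_host[(src, dst)].add(port)

    labels = defaultdict(set)
    for (src, _port), hosts in hosts_per_port.items():
        if len(hosts) >= threshold:
            labels[src].add("horizontal")  # same port, many hosts
    for (src, _dst), ports in ports_per_host.items():
        if len(ports) >= threshold:
            labels[src].add("vertical")    # same host, many ports
    return dict(labels)
```

Because the two tallies are kept separately, a source doing both kinds of scan at once receives both labels.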
Network-based Intrusion Detection,
Prevention, and Forensics System (II)
• Polymorphic worm signature generation & detection
[IEEE Symposium on Security and Privacy 2006, IEEE ICNP 2007]
• Accurate network diagnostics
[ACM SIGCOMM 2006] [IEEE INFOCOM 2007 (2)]
• Scalable distributed intrusion alert fusion w/ DHT
[SIGCOMM Workshop on Large Scale Attack Defense 2006]
• Large-scale botnet and P2P misconfiguration event
forensics [work in progress]
System Deployment
• Attached to a router/switch as a black box
• Edge network detection particularly powerful
[Figure: three deployment diagrams connecting LANs, switches, routers, and the Internet to the monitoring system (labeled RAND system or HPNAIDM system) through scan ports or a splitter: (a) original configuration; (b) monitor each port separately; (c) monitor aggregated traffic from all ports]
P2P Doctor: Measurement and
Diagnosis of Misconfigured Peer-to-Peer Traffic
Anup Goyal, Zhichun Li, Yan Chen and Aleksandar
Kuzmanovic
Lab for Internet and Security Technology (LIST)
Northwestern Univ.
What is P2P Misconfiguration
• P2P file sharing accounted for > 60% of traffic in
the USA and > 80% in Asia
• Thousands of peers send P2P file downloading
requests to a “random” target on the Internet
– Possibly triggered by bugs or by malicious intent
– Generates a large amount of unwanted traffic
• It contributes on average about 30% of the
“Internet background radiation”
Motivations
• On Dec. 6th, 2006, 5,047 sources generated > 31,000
packets/sec and 11 MB/s of traffic to a single unused
IP at Northwestern University
• P2P software DC++ has already been exploited by
attackers for DoS
– Directed gigabits of “junk” data per second at a victim host from
more than 150,000 peers
• Currently, little is known about the characteristics or
root causes of P2P misconfiguration events
[Figure: peers flood an innocent victim with file requests (11 MB/s); the misconfigured traffic resembles a DDoS attack scenario]
Outline
• Motivation
• Passive measurement results
• P2P Doctor system design
• Root cause diagnosis and analysis
• Conclusion
Peer Classification
[Figure: classification of all the peers. In the P2P network: normal peers and anti-P2P peers. Not in the P2P network: bogus peers. Also labeled: poisoned peers (intentional) and unintentionally misconfigured peers]
Passive Measurement
• Honeynet/honeyfarm datasets
• Events: # of unique sources > 100 in 6 hours
– After filtering scan traffic
• Event characteristics:
– Mostly target a single IP
– Duration: A few hours to up to a month
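The event definition above (more than 100 unique sources within 6 hours, after scan traffic is filtered out) can be sketched as follows. The slides do not say how windows are aligned or how traffic is grouped, so fixed 6-hour windows per destination IP are assumptions here, as are the function and constant names.

```python
from collections import defaultdict

WINDOW_SECONDS = 6 * 3600
MIN_SOURCES = 100  # an event needs strictly more unique sources than this


def find_events(records):
    """Find candidate misconfiguration events.

    records: iterable of (timestamp_seconds, src_ip, dst_ip), with scan
    traffic already removed (the filtering step is not shown here).
    Returns a sorted list of (dst_ip, window_start, n_unique_sources).
    """
    sources = defaultdict(set)
    for ts, src, dst in records:
        # Assumed: fixed (tumbling) windows aligned to multiples of 6 hours.
        window_start = int(ts) // WINDOW_SECONDS * WINDOW_SECONDS
        sources[(dst, window_start)].add(src)
    return sorted(
        (dst, start, len(srcs))
        for (dst, start), srcs in sources.items()
        if len(srcs) > MIN_SOURCES
    )
```

Grouping by destination matches the observation that events mostly target a single IP; a sliding window would catch events that straddle a window boundary, at higher cost.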
[Table: honeynet datasets]
            LBL         NU          GQ
Sensor      5 /24       10 /24      4 /16
Traces      883GB       287GB       49GB
Duration    37 months   7 months    26 days

• Popularity (slide callout: 30%!)
[Table: number of events per P2P system]
              LBL    NU
eMule         106    106
BitTorrent    242    90
Gnutella      1      1
Soribada      4      0
Xunlei        18     0
VAgaa         1      0
• Growth trend
[Figure: the average total connections of P2P misconfiguration events per month]
• IP space
– Observed in three sensors in five different /8
IP prefixes
Further Diagnosis
• Problems with passive measurement on
archived data
– Events have already ended
– Hard to backtrack the propagation
– Root cause?
• Need a real-time backtracking and
diagnosis system!
Design of P2P Doctor System
[Figure: three components: P2P-enabled honeynet, backtracking system, root cause inference]
• P2P-enabled honeynet
– P2P payload signature-based responder
– Event identification
– Protocol parsing for metadata: raw payload (10100101011101) to infohash; ‘abc.avi’
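As one concrete instance of protocol parsing for metadata, the standard BitTorrent peer handshake carries the infohash at a fixed offset: a length byte (19), the string "BitTorrent protocol", 8 reserved bytes, a 20-byte infohash, and a 20-byte peer ID. The sketch below extracts the infohash from such a payload. Whether P2P Doctor parses payloads this way is an assumption; the function name is hypothetical.

```python
PSTR = b"BitTorrent protocol"
HANDSHAKE_LEN = 1 + len(PSTR) + 8 + 20 + 20  # 68 bytes in total


def parse_bt_handshake(payload: bytes):
    """Return (info_hash_hex, peer_id), or None if not a BT handshake."""
    if len(payload) < HANDSHAKE_LEN:
        return None
    # Byte 0 is the protocol-string length; bytes 1..19 are the string.
    if payload[0] != len(PSTR) or payload[1:20] != PSTR:
        return None
    # Bytes 20..27 are reserved flags and are skipped here.
    info_hash = payload[28:48]
    peer_id = payload[48:68]
    return info_hash.hex(), peer_id
```

The peer ID often begins with a client tag, which is one way per-client statistics could be gathered.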
Design of P2P Doctor System
[Figure: three components: P2P-enabled honeynet, backtracking system, root cause inference; the backtracking system runs a local crawler against multiple servers]
• Backtracking system
– Index server (tracker) crawling (BT: top 100, eMule: 185)
– Peer exchange protocol crawling
– DHT crawling
Design of P2P Doctor System
[Figure: three components: P2P-enabled honeynet, backtracking system, root cause inference]
• Root cause inference
– What is the root cause?
– Which peers spread misconfiguration?
– How is misconfiguration disseminated?
– What is the percentage of bogus peers in
the misconfigured P2P networks?
Deployment and Data Collection
• Deployed the P2P Doctor system on the NU
honeynet (10 /24 networks in three /8)
• Real-time events
– Previous passive measurement data are referred
to as historical events
[Table: real-time events]
               BitTorrent   eMule
# of events    20           42
Duration       23 days (08/23/2007 to 09/15/2007)
Root Cause Analysis
• Methodology
– Track how honeynet IPs propagated in P2P systems
– Use unroutable IP space as a big honeynet (66.8% of
IPv4 Space)
– Hypothesis formulation and testing
• Classification of measured peers
– Misconfigured peers: Passively observed from
honeynet
– Backtracked peers: actively observed through
backtracking
– Reverse honeynet peers: the IPs obtained by reversing
the byte order of the target IPs from the honeynets
• Results
– Data plane traffic radiation
– Detailed results focus on eMule and BitTorrent
Data Plane Traffic Radiation
[Figure: a peer asks “Who has Beowulf.avi?”; the resource mapping (peer exchange, DHT, or index server) answers 1.2.3.4, and the peer's traffic then goes to 1.2.3.4]
eMule – Root Cause
• Byte ordering is the problem!
[Figure: the address 4.3.2.1 circulates among peers and also appears byte-reversed as 1.2.3.4]
– Hypothesis from the historical data
• In 80% of events, the reverse target IPs are alive
– Verified with real-time events
• 61% of the reverse honeynet peers are indeed running
eMule on the reported port number
• Of the backtracked peers that are in the
unroutable IP space, 69.6% have
reverse IPs that run eMule
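The byte-ordering hypothesis can be made concrete with a small example: an IPv4 address serialized in network (big-endian) order but decoded as little-endian comes out with its four bytes reversed. The slides do not show where in eMule this happens, so the code below only illustrates the general class of bug; both function names are hypothetical.

```python
import ipaddress
import socket
import struct


def reverse_ip(ip: str) -> str:
    """Return the IPv4 address with its four bytes in reverse order."""
    return str(ipaddress.IPv4Address(ipaddress.IPv4Address(ip).packed[::-1]))


def misread_as_little_endian(ip: str) -> str:
    """Show the bug: network-order bytes decoded as little-endian."""
    wire = socket.inet_aton(ip)            # 4 bytes in network byte order
    (value,) = struct.unpack("<I", wire)   # wrong format: should be "!I"
    return str(ipaddress.IPv4Address(value))


print(reverse_ip("1.2.3.4"))                 # 4.3.2.1
print(misread_as_little_endian("1.2.3.4"))   # 4.3.2.1
```

This is why checking whether the reversed honeynet address is alive and runs eMule is a direct test of the hypothesis.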
eMule – Peers & Dissemination
• Which peers spread misconfiguration?
– 99.24% of misconfigured peers are normal peers
• How is the misconfiguration disseminated?
– Index Server? No
– Peer exchange? Yes
• Percentage of bogus peers in eMule network?
– [12.7%, 25.0%] w/ a total of 37,079 backtracked peers
[Figure: breakdown of peers by how they were obtained]
All peers (100%)
  From index servers (19.3%)
    Unroutable (0)
    eMule (12.8%)
    Others (6.5%)
  From peer exchange (80.7%)
    eMule (45.8%)
    Others (24.6%)
      Reverse-eMule (5.6%)
      Reverse-unroutable (9.6%)
      Reverse-others (9.4%)
    Unroutable (10.3%)
      Reverse-eMule (7.1%)
      Reverse-unroutable (0.3%)
      Reverse-others (2.9%)
BitTorrent – Responsible Peers
• Both anti-P2P and normal peers are responsible
• Events are classified into two types with diametrically different
sets of characteristics
• For anti-P2P peer events
– All the sources are from the IP ranges owned by anti-P2P
companies such as Media Defender, Media Sentry, Net Sentry, etc.
– Sources from 6 out of 7 major anti-P2P companies seen in our
honeynet

[Table: anti-P2P peer events vs. normal peer events]
                                 Anti-P2P peers    Normal peers
Number of events                 127 (39%)         205 (61%)
Client software                  100% - Azureus    90% - UTorrent (NU); 88% - BitComet+BitSpirit (LBL)
Avg. number of connections/src   400               25
Arrival & departure              All together      Poisson
Avg. duration                    4.5 hours         106.1 hours
BitTorrent – Root Cause
• Refuted the byte ordering hypothesis
– For 20 real-time events, no reverse
honeynet peers run BitTorrent
• For normal peer events, the culprit
is the Peer Exchange (PEX)
protocol implemented by
uTorrent-compatible clients
• For anti-P2P peer events
– Possibly related to the Azureus system
– Still an open question (no real-time events)
[Figure: bar chart (axis 0 to 90) comparing honeynet peers and backtracked peers by client: uTorrent, Transmission, kTorrent, Azureus, BitComet]
BitTorrent – Dissemination
• How is misconfiguration disseminated?
– Index server? No
– Peer exchange? Yes
• Percentage of bogus peers in the BitTorrent
network?
– Out of a total of 9,000 backtracked peers, only 13 IPs
are unroutable and 3,150 IPs gave connection timeouts
– 0.14% < bogus peers < 35%
Conclusions
• The first study to measure and diagnose large-scale P2P misconfiguration events
• Found that about 30% of Internet background radiation is
caused by P2P misconfiguration
– Popular in various P2P systems, exponential growth
trend, and scattered in the IPv4 space
• For eMule, we found it is caused by a network byte
order problem
• For BitTorrent, events are classified into anti-P2P peer events
and normal peer events with diametrically different
sets of characteristics
– Found that the uTorrent PEX causes the problem in normal
peer events
Backup Slides
Motivation
• Given the unprecedented amount of traffic, even a slight misconfiguration of a P2P system can result in a DDoS-like
situation
• The prevalence in time, in space, and across a number of
distinct P2P systems, with an increasing trend over time, is
alarming
• P2P misconfigurations can cause innocent people to get
involved in the “war” between P2P and anti-P2P
systems
• Presently, nothing is known about the causes or overall
effects of P2P misconfigurations
• Our goal is to determine the root cause(s) of each type of
misconfiguration
Related Work
• Misconfiguration is widespread across different networked
and distributed systems such as BGP [Labovitz et al.] and firewalls
[Cuppens et al.].
• Prior measurement studies cover normal P2P traffic [ACM SOSP
(2003), MCN (2002)], while we measure the abnormal P2P
traffic observed in honeynets.
• [INFOCOM (2005)] shows that content pollution, both intentional
and unintentional, is widespread for popular titles.
• P2P systems like Fasttrack and Overnet are vulnerable to the
index poisoning attack [INFOCOM (2006)]
• All of the above studies focus on the content pollution or index
poisoning while our focus is the index misconfiguration.
• First large-scale measurement study on the root causes for
both intentional/unintentional index misconfiguration.
What is P2P Misconfiguration
• More than 50% of the traffic in the Internet
today is P2P traffic
– According to Symantec Corporation’s recent report
• P2P file sharing accounted for > 60%
of traffic in the USA and > 80% in Asia
[Figure: pie chart of P2P traffic vs. other traffic]