DYSWIS
KYUNG-HWA KIM
HENNING SCHULZRINNE
12/09/2008
INTERNET REAL-TIME LAB,
COLUMBIA UNIVERSITY
Do You See What I See?
[Figure: three end users connected through the Internet; caption: "Do you see what I see?"]
Outline
 Overview
 Fault Detection
 Peer Selection
 Probing
 Problem
 Implementation
 Demo
Overview
 Overview
DYSWIS – Do you see what I see
Motivation
Different causes for a particular network fault
Need a different ‘view’ of the fault from other sources
End-to-end diagnosis
Need a user-friendly interface
Current Problem
Distributed network fault detection and analysis system
Centralized management schemes
 Complexity in the user network and devices
 Failed to solve the service quality problem
Approach
Collaborate with other end users
P2P-based
Remote probing
For Quick Understanding
[Figure: animated diagram of nodes labeled Detect, Diagnosis, and Probe]
Fault Detection
 Automatic fault detection
 Network raw packet capturing
 Analyze network packets and protocols
 Raw packet capturing
 Check error responses
 Check timeouts
 Check TCP congestion
Monitoring TCP sequence numbers
 Define fault cases
 Automatic vs. manual
FSM approach
 Predefined
 Learning
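The "monitoring TCP sequence numbers" check above can be illustrated with a minimal Python sketch. It assumes a separate capture layer already delivers the sequence number and payload length of each data segment for one direction of a flow; the class and field names (`FlowMonitor`, `highest_end`) are hypothetical, and sequence-number wraparound and out-of-order delivery are ignored for brevity.

```python
from dataclasses import dataclass


@dataclass
class FlowMonitor:
    """Tracks one direction of a TCP flow (hypothetical helper, not DYSWIS code)."""
    highest_end: int = -1      # end of the highest byte range seen so far
    segments: int = 0          # data segments observed
    retransmissions: int = 0   # segments that only repeat already-seen bytes

    def observe(self, seq: int, payload_len: int) -> bool:
        """Record a data segment; return True if it looks like a retransmission."""
        if payload_len == 0:
            return False       # pure ACKs carry no data and are not counted
        self.segments += 1
        end = seq + payload_len
        is_retransmission = end <= self.highest_end
        if is_retransmission:
            self.retransmissions += 1
        else:
            self.highest_end = end
        return is_retransmission

    def retransmission_rate(self) -> float:
        """Fraction of data segments that were retransmissions."""
        return self.retransmissions / self.segments if self.segments else 0.0
```

A detector could flag possible congestion when `retransmission_rate()` exceeds some threshold; the threshold itself would have to come from measured data.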
FSM - Approach
* Automatic Protocol Failure Detection Using Finite State Machines
Zhifeng Wang, Kai X. Miao, Tao Zuo, Henning Schulzrinne, Kyung Hwa Kim, Vishal Kumar Singh
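As a rough illustration of the predefined-FSM idea, the Python sketch below walks a list of observed protocol events through a transition table and reports a fault on an error response, a timeout, or an unexpected event. The states, event names, and the `run_fsm` function are assumptions made for this example; they are not taken from the cited paper.

```python
# Transition table for a simple request/response exchange (hypothetical).
TRANSITIONS = {
    ("IDLE", "query_sent"): "WAITING",
    ("WAITING", "response_ok"): "DONE",
    ("WAITING", "response_error"): "FAILED",
    ("WAITING", "timeout"): "FAILED",
}


def run_fsm(events, start="IDLE"):
    """Return (final_state, reason); reason is None when no fault was seen."""
    state = start
    for event in events:
        next_state = TRANSITIONS.get((state, event))
        if next_state is None:
            # No transition defined: the protocol deviated from the model.
            return "FAILED", f"unexpected event {event!r} in state {state}"
        state = next_state
        if state == "FAILED":
            return state, f"fault event {event!r}"
    return state, None
```

For example, `run_fsm(["query_sent", "timeout"])` ends in the `FAILED` state with the timeout as the reason. A learning variant would build the table from observed traffic instead of hard-coding it.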
Peer Selection
 Peer Selection
 DHT or database
Register myself with the DHT network
 AS number, subnet, first hop, AP
 Search for probing nodes
 Inner nodes and outer nodes
[Figure: node A queries the DHT and is referred to node B]
A to DHT: "I need some nodes who can help me. Who is in the same subnet as me?"
DHT to A: "You can contact B. His IP address is 218.59.21.16 and port number is 9090."
Peer Selection - DHT (key, value)
<key>
<type>node</type>
<asn>14</asn>
<subnet>128.59.0.0/16</subnet>
</key>
<key>
<type>node</type>
<asn>9880</asn>
<subnet>45.45.45.0/24</subnet>
<firewall>no</firewall>
<nat>no</nat>
</key>
<value>
<type>node</type>
<ip>128.59.21.15</ip>
<port>9090</port>
<protocol>udp</protocol>
</value>
<value>
<type>node</type>
<ip>128.59.21.15</ip>
<hostname>kkh.cs.columbia.edu</hostname>
<port>9090</port>
<protocol>tcp</protocol>
</value>
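The key/value records above suggest a lookup keyed on node type, AS number, and subnet. The Python sketch below shows one way such a registration and lookup could work. An in-memory dictionary stands in for the DHT, and `make_key` and `LocalRegistry` are hypothetical names; a real deployment would hash the key into an actual DHT.

```python
import ipaddress


def make_key(asn, ip, prefix_len):
    """Build a (type, asn, subnet) key from a node's address (assumed key layout)."""
    network = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
    return ("node", asn, str(network))


class LocalRegistry:
    """In-memory stand-in for the DHT: maps a key to a list of node records."""

    def __init__(self):
        self._table = {}

    def register(self, key, value):
        self._table.setdefault(key, []).append(value)

    def lookup(self, key):
        return list(self._table.get(key, []))


registry = LocalRegistry()
# Node B registers itself using the example values from the slide.
registry.register(
    make_key(14, "128.59.21.15", 16),
    {"type": "node", "ip": "128.59.21.15", "port": 9090, "protocol": "udp"},
)
# Node A, in the same /16, asks who shares its subnet.
peers = registry.lookup(make_key(14, "128.59.99.1", 16))
```

Because both nodes derive the same key from their own address, a node can find "inner" peers without knowing any of them in advance; querying with a different subnet would return "outer" peers.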
Remote Probing
 Distributing modules
 Detecting and probing modules should be added and updated
 Dynamic class loading
 Dynamic module distribution
Modules can be created and updated separately.
 XML-RPC
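To make the combination of dynamic loading and XML-RPC concrete, here is a minimal Python sketch using the standard `importlib` and `xmlrpc.server` modules. It is not the DYSWIS implementation: the function names (`load_probe`, `run_probe`, `serve`), the allow-list, and the module names `probes.dns` and `probes.http` are all assumptions for illustration. Port 9090 follows the example value used elsewhere in the slides.

```python
import importlib
from xmlrpc.server import SimpleXMLRPCServer

# Hypothetical probe modules a node is willing to run for its peers.
ALLOWED_MODULES = frozenset({"probes.dns", "probes.http"})


def load_probe(module_name, func_name="probe", allowed=ALLOWED_MODULES):
    """Import a probe module by name at run time and return its entry point."""
    if module_name not in allowed:
        raise PermissionError(f"module {module_name!r} is not allowed")
    module = importlib.import_module(module_name)
    return getattr(module, func_name)


def run_probe(module_name, *args):
    """Entry point exposed to remote peers: load the module and run its probe."""
    return load_probe(module_name)(*args)


def serve(host="127.0.0.1", port=9090):
    """Expose run_probe over XML-RPC (blocks until interrupted)."""
    server = SimpleXMLRPCServer((host, port), allow_none=True)
    server.register_function(run_probe, "run_probe")
    server.serve_forever()
```

A peer could then call `xmlrpc.client.ServerProxy("http://127.0.0.1:9090").run_probe("probes.dns", "example.com")`. The allow-list matters because loading arbitrary module names on request from a remote node would be unsafe.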
Probing Scenarios
 HTTP
 Causes: dead web server, page moved, low bandwidth …
Check DNS query
TCP connection
Ask another node to try the same query
Check TCP congestion
…
 DNS
 Causes: dead DNS server, resolution failed, UDP not working, …
Check another DNS server
Ask another node to try to connect to my DNS server
Ask another node to query the same host on another DNS server
 SIP/RTP
 Causes: NAT, DNS, proxy server, authentication
Proxy connectivity test
Ask another node to try the same action.
…
Probing Scenarios
 Connection problem
 Causes: dead server, firewall, wrong port number …
Traceroute – check routers
 Ask another node to try to connect to the server
 Ask another node to check my port
…

 TCP congestion
 Causes: queuing delay, dead routers
Traceroute, ping
 Try to find the bottleneck
…

Probing Scenarios
[Figure: nodes A and B]
Data Gathering
 Problem
 We have resources: Other machines
 But how do we use them efficiently?
 We need real data
 Approach
 Collecting data
 Collecting Scenarios
 Implementing prototype
Implementation
 Architecture
For details, visit: http://wiki.cs.columbia.edu/display/res/DYSWIS
Demo
 Demo
Future work
 Implementation
 http://www.cs.columbia.edu/~khkim/project/dyswis
 Coming soon: Mac & Linux
 Testbed - PlanetLab
 Mature research for analysis
 Support real time protocols
 How to find solutions for end users
Backup
 Check local network.
 Select two nodes: one from the same subnet and one from an outer subnet.
 Let the nodes try to connect to the server.
 If both nodes fail to connect to the server, log this fault as ‘server failure’.
 If only the internal node fails, execute traceroute to check where the packet is blocked.
 If the internal node succeeds, it is possible that the problem is caused by a local firewall or something else.
 Check incoming/outgoing ports: let other nodes open the same port and try to connect there. Check whether the remote node received the packet and whether the ACK from the remote node came back.
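The decision steps above can be written as a small Python function. This is a sketch only: it assumes the local connection attempt has already failed and that the two probe results arrive as booleans; the function name and the returned strings are illustrative.

```python
def diagnose_connection(internal_ok: bool, external_ok: bool) -> str:
    """Classify a connection fault from two remote probe results.

    internal_ok: a node in the same subnet could connect to the server.
    external_ok: a node in an outer subnet could connect to the server.
    """
    if not internal_ok and not external_ok:
        # Nobody can reach the server.
        return "server failure"
    if not internal_ok:
        # Only the internal node failed: look for where packets are blocked.
        return "run traceroute to find where packets are blocked"
    # The internal node succeeded, so the fault is likely local to this host.
    return "possible local firewall or other local problem"
```

For example, `diagnose_connection(False, True)` points to a path problem between the local subnet and the server, which is the case where the slides call for traceroute.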