Jigsaw: Solving the Puzzle of Enterprise 802.11 Analysis
Yu-Chung Cheng, John Bellardo, Mikhail Afanasyev, Patrick Verkaik, Jennifer Chiang, Peter Benko, Alex C. Snoeren, Geoff Voelker, Stefan Savage
Department of Computer Science & Engineering
University of California, San Diego
25.05.2017 Yu-Chung Cheng/Qualcomm CR&D

The promise of Enterprise 802.11?

A familiar story...
Employee: “The wireless is being flaky.”
Support: “Flaky how?”
Employee: “Well, my connections got dropped earlier and now things seem very sloooow.”
Support: “OK, we will take a look.”
Employee: “Wait, wait … it’s ok now.”
Support: “Mmm… well, let us know if you have any more problems.”
Now what?

What are the problems?
Contention with nearby wireless devices?
Bad AP channel assignments?
Microwave ovens?
Congestion in the Internet?
Bad interaction between TCP and 802.11?
Rogue access points?
Poor choice of APs (weak signal)?
Incompatible user software/hardware?
802.11 DoS attack?!
…
Network admins are not paid enough to figure this out…

Why is this hard to understand?
RF domain defies traditional networking intuition
  Wireless topology not well-modeled as a graph
  Asymmetry is common for all characteristics: packet loss, bandwidth, interference, etc.
  Variability in all characteristics caused by: distance/mobility, orientation, temperature, RF workload, etc.
Automatic management: MAC, rate control, access point selection
  Huge inter-vendor variation
Scale: lots of different RF domains
Mobility management is complex
  The undeclared layer 2.5…
  L2 (assoc, scan, etc.), ARP, DHCP, registration, etc.

Goal: What’s going on in my network?
Real-time diagnosis of wireless network problems
  In a production 802.11 network
Identify components of delay at physical, link, network and transport layers
Deconstruct full end-to-end behavior
  Interactions between environment, 802.11 PHY/MAC, TCP/UDP
Ultimately: understand the most important sources of performance problems and opportunities for improvement

New CSE building at UCSD
150k square feet
  4 floors + basement
>500 occupants
  150 faculty/staff
  350 students
Building-wide WiFi
  40 access points
  802.11b/g
  Channels 1, 6, 11
10-100 active clients at any time
Daily traffic ~10 GB

UCSD passive monitor system
Overlays existing WiFi
Series of passive sniffers
Blanket deployment for best coverage
48 sensor pods (192 radios)
  4 radios per pod (cover all channels in use)
Captures/timestamps all 802.11 activity (including physical errors)
Streams back to centralized server (>6TB storage)

Jigsaw system
Constructs single view of all 802.11 activity
  Unifies frame views from all radios
  Transitive synchronization across all views (max dispersion ~10us; 80% within 5us)
Reconstructs discrete L2, L3 and L4 state
  Inference of unseen events and host state (vantage point limitations) via protocol behavior
Designed to make it easy to add analysis modules
  Physical fingerprints, contention inference, DHCP analysis, etc.
  Easy to measure cross-layer interactions
Yu-Chung Cheng, John Bellardo, Peter Benko, Alex C. Snoeren, Geoffrey M.
Voelker, and Stefan Savage, Jigsaw: Solving the Puzzle of Enterprise 802.11 Analysis, SIGCOMM 2006

Trace synchronization and unification
Sniffers label packets with a local timestamp (TSF)
Need a global clock
Estimate the offset between TSF and the global clock for each sniffer

Part of a Jigsaw trace (L1/L2)
[Figure: synchronized traces from multiple monitors over time for Client 1 and Client 2; frames are marked as received, received with CRC error, or HW corrupted]

Jigsaw in Action
Physical layer inference
Link layer modeling
Transport layer flow reconstruction
End-to-end cross-layer diagnosis
  Media access problems
  Mobility management overhead

Hidden terminal interference
Co-channel interference from other transmitters
For sender s and receiver r, estimate the conditional probability of loss given simultaneous transmission by interferer i
  Normal: s sends data, r sends ACK
  Hidden-terminal: s sends data, r’s reception is interfered with by i
Current finding: hidden terminals not such a big deal (some exceptions)

Broadband interference
[Figure: broadband interference, annotated at ~9 am and 12-2 pm]

Interference fingerprints
Microwave oven: magnetron driven by half-wave voltage doubler @ 60Hz
Automatically detect and tag “microwave-like” physical interference

Link layer
Contention: a challenge to measure
Three kinds of network events
  Directly observable: packet sent (easy)
  Directly inferable: packet received (harder)
  Indirectly inferable: packet delayed by contention (surprisingly tricky)
Key issues
  Need to know input and output at each AP
  Need to model internal state of AP

Model
Infer time at which packet is queued on AP (via wireline analysis)
  Ethernet serialization delay
  AP bus overhead (2 I/O)
  AP processing overhead
Determine if previous packet
had cleared AP (via wireless analysis)
  Head-of-line blocking (delay attributable to queuing)
  No head-of-line blocking (delay attributable to contention/MAC)
[Figure legend: directly observed; inferred/modeled]

Access delay (Dacc) at an AP
[Figure labels: contention during DIFS; contention beyond backoff, convolved with pkt backoff; mandatory backoff for last pkt, 0-15 slot times (20us each); Distributed Inter-Frame Space (50us)]

End-to-end cross-layer diagnoses
Media access problems
Mobility overhead

Pathologies
802.11b faster than 802.11g
  Significant unsuccessful effort over 12 months by IT groups (and vendor) in understanding the problem
Issue
  Avaya AP only attempts one retry for 802.11g frames in “protection mode”
  High-rate transmissions more sensitive to noise
  Export many more losses to IP -> TCP backoff

Pathologies (2)
Big L2 retry delay (> 10ms). Why?
Broadcast frames have > 50ms avg delay. Why?
Same reason:
  If any client requests power-save mode then the AP must buffer broadcast frames until the beacon is sent
  Pending frame exchange is postponed until the broadcast burst is completed

Pathologies (3)
802.11g protection mode
  Used when 802.11b clients are present
  802.11g client sends a pilot CTS-to-Self frame (slow) before data
  Overhead is about 100% air time
Issue:
  We still have many 11b clients
  But most 11b traffic is bursty; no need to use protection all the time

Pathologies (4)
Lots of “vendor” hacks
Do not respect CSMA
  Burst packets in a row
Early retransmission
  Do not wait for the full ACK time
Do not respect protection mode
Do not do exponential back-off (linear)
Announce very large transmission duration
  Could mount a DoS, but it does not work in reality
Do not increment sequence numbers
…

TCP diagnoses breakdown
Major causes: slow receiver, AP retry bug, protection mode

Mobility management overhead
Around 30% of time is spent in mobility management (DHCP, ARP, association, etc.)

Pathologies (5)
Large startup delays (10s of secs)
Client requests DHCP lease for private address space (192.168/16)
Wireless management system (Verneir) grants address with short timeout and won’t refresh
Client has to do two DHCP transactions with long timeouts between

Startup delays breakdown
[Figure: startup delay breakdown; axis: delay (seconds)]
Major causes: (gratuitous) ARPs + scans

Where to next?
Real-time system for automated detection and evaluation of poor network performance
  Identifies problem flows and isolates potential causes of poor performance
City-wide network monitoring
  Currently deployed in a Bay-area metropolitan network
Future: explore deployment and protocol fixes

Q&A
Live traffic monitoring and more information at http://sysnet.ucsd.edu/wireless/

Synchronization
Create a virtual global clock
  To keep unification working
  Critical evidence for analysis
    If A and B are transmitting at the same time they could interfere
    If A starts transmitting after B has started then A can’t hear B
Requires fine time-scales (10-50us)
  NTP is >100 usec accuracy
  802.11 HW clocks (TSF) have 100PPM stability
[Figure: TSF diff of two sniffers; axes: TSF diff (us) vs. time (s)]

Trace unification (ideal)
[Figure: frames aligned on a common time axis]

Trace unification (reality)
[Figure: Jigsaw unified trace over time, JFrame 1 through JFrame 5]

Challenge: sync at large-scale
[Figure: sniffers 1-4, with offsets ∆t1 and ∆t2]
How to bootstrap?
  Goal: estimate the offset between TSF and the global clock for each sniffer
  Time reference from one sniffer to the other
Sync across channels
  Dual radios on same sniffer slaved to same clock
Manage TSF clock skews
  Continuously re-adjust offsets when unifying frames

Jigsaw syncs 99% frames < 10us
Measure sync.
quality by max dispersion per JFrame
10 us is an important threshold
  802.11 back-off time is 20 us
  802.11 inter-frame time is 50 us
Sufficient to infer many 802.11 events

Sensor pods
Pod = pair of monitors
  Separated ~1 meter
  >35dB separation at 2.4 GHz
Monitor = Soekris 4826-50
  266 MHz 586-class CPU
  128MB RAM, 64MB Flash
  100Mbps Ethernet
  Dual Atheros a/b/g radios
  Power-over-Ethernet (semi-std)
Jigdump software
  Captures/timestamps all 802.11 activity (including physical errors)
  Streams back to centralized server (>6TB storage)
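
The synchronization slides above describe estimating, for each sniffer, the offset between its local TSF clock and a global clock, using one sniffer as a time reference for another, and measuring quality by the max dispersion per JFrame. The following is a minimal sketch of that idea, not the Jigsaw implementation. It assumes that a frame heard by two sniffers can serve as a shared time reference, that offsets are propagated transitively by breadth-first search from a chosen root sniffer, and that clock skew is ignored. The names `estimate_offsets`, `dispersion`, and the data layout are hypothetical.

```python
from collections import deque


def estimate_offsets(observations, root):
    """Estimate per-sniffer clock offsets relative to the root sniffer.

    observations: {frame_id: {sniffer: local_tsf_us}} (hypothetical layout).
    Returns {sniffer: offset_us} where local_tsf + offset = time on root's clock.
    """
    # Collect timestamp differences for every pair of sniffers that
    # heard the same frame.
    diffs = {}
    for stamps in observations.values():
        names = sorted(stamps)
        for i, a in enumerate(names):
            for b in names[i + 1:]:
                diffs.setdefault((a, b), []).append(stamps[a] - stamps[b])

    # adj[a][b] is the mean of (ts_a - ts_b): add it to b's timestamps
    # to express them on a's clock.
    adj = {}
    for (a, b), values in diffs.items():
        mean = sum(values) / len(values)
        adj.setdefault(a, {})[b] = mean
        adj.setdefault(b, {})[a] = -mean

    # Propagate offsets transitively outward from the root.
    offsets = {root: 0.0}
    queue = deque([root])
    while queue:
        a = queue.popleft()
        for b, delta in adj.get(a, {}).items():
            if b not in offsets:
                offsets[b] = offsets[a] + delta
                queue.append(b)
    return offsets


def dispersion(stamps, offsets):
    """Spread (us) of one frame's timestamps after mapping to the root clock."""
    times = [ts + offsets[s] for s, ts in stamps.items() if s in offsets]
    return max(times) - min(times)


if __name__ == "__main__":
    obs = {
        "f1": {"A": 1000, "B": 500},   # heard by A and B
        "f2": {"B": 1500, "C": 2300},  # heard by B and C; A never hears C
    }
    off = estimate_offsets(obs, "A")
    print(off)
    print(dispersion({"A": 3000, "C": 3304}, off))
```

Sniffer C is synchronized to A only through B, which is the transitive step. Because this sketch ignores skew, it would need to be re-run continuously: a clock with 100 PPM stability can drift by up to 100 us per second, far more than the 10 us threshold.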