Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
System Management in Challenged Networks CENS Seminar – November 17th, 2006 Martin Lukac * Lewis Girod * † Deborah Estrin * * UCLA CENS - † MIT CSAIL Outline • Meso American Subduction Experiment (MASE): A Challenged Network • Data Delivery • System Management • The Future Seismic Deployment Application Requirements 50 standalone Caltech sites 62 wirelessly connected UCLA sites • Extensive: 500 Km from Acapulco through Mexico City to Tampico • Dense: 1 sensor every 5-10 Km • High bandwidth: Data acquisition rate: 3 - 24 bit channels at 100Hz each • Online and Reliable: Semi real-time (on the order of days), reliable data delivery to UCLA for analysis • Online system management – Query state, change configuration, update binaries – Can not interfere with data delivery • Application driven topology: application determines sensor placement – Infrastructure does not (Can’t rely on pre-existing cell or power infrastructure) MASE: Given these requirements, we deployed solar powered seismic stations equipped with 802.11b %18 - A MASE 13 Node %152 - B %69 - C %77 - D •%107 Network - E topology does not reflect the mostly %42 - Flinear physical topology %81 - G •%202 Routing - Hand other services can not use physical %76 - I topology %106 - J %95 - K %53 - L %157 - M Cuernavaca Line Data paths B F G D C E M I J L H K A A – sink Direct inet connection How challenged is the MASE network? • Frequent unpredictable disconnections – Rainy season: sites flood (some 24x7), trees grow – Wind: misaligned antennas – Equipment malfunction: amps burn, voltage regulators break • Poor and unstable links – Connectivity secondary concern for site selection – Stretched links highly susceptible to weather and environment • Human effort is a critical resource – Installation, maintenance, protection Networking support needed for both data acquisition and system management • Data delivery – Bandwidth driven – Bandwidth: 20-40 of MB per day per station – Latency: get the data eventually, but reliably – Many to one routing • System Management – Latency driven – Bandwidth: usually less than 10’s of KB’s – Latency: as fast as possible – One to all routing and back Well-known limitations of existing techniques • Data delivery and system management techniques designed for wired or always-onwireless do not work well – Typical tools use TCP to create and maintain an end to end session to deliver a stream of data over multiple hops – These are “online applications” which expect reliable links with low latencies • Patterns of poor links, disconnections, and disruptions – Difficult to obtain and maintain end-to-end connections – Intermittent end-to-end connections insufficient to achieve necessary bandwidth and latency Our Contributions • Real world application and deployment of Delay Tolerant Networking (DTN) techniques for data delivery • Disruption Tolerant Shell (DTS): a tool for system management on challenged networks that performs better than traditional tools Summary • MASE: A Challenged Network –Poor and erratic links –Frequent unpredictable disruptions • Data Delivery • System Management • The Future Data Delivery using DTN Techniques • Buffer data into hour long bundles (1-3 MB) • Deliberate one hop bundle transfer • Path to sink determined by best ETX • Improvement over end-to-end – Not affected by path disconnections – Keeps retrying on single link instead of full path – Continual ‘progress’ being made towards sink – More efficient use of bandwidth in face of disconnections and bottlenecks A X X B X X C F end-to-end hop-by-hop Upcoming Features • Currently piggyback data movement log with actual data – No global time stamping of log events • Want coarse grained global time (one second) – Will be able to recreate ‘movie’ of file movement for entire network – Can help spot network problems and bottlenecks • Upload data to SensorBase.org – Makes it easy to visualize and browse data collection status – RSS feed can provide access to anyone who wants to monitor problems or generic status of network Data Acknowledgement • Nodes keep their own bundles until ACK’ed by sink – Many ways of doing ACK’s • First try for ACK implementation worked – Push bundle ID into StateSync (disseminates information to all the nodes in the network) – But… usage model not quite right… too many entires, too much churn for StateSync (can explain better later) • Second try – Use ‘file dissemination’ feature of DTS to distribute ACK list once a day – Use DTS to remove list once we know all nodes have file Summary • MASE: A Challenged Network – Poor and erratic links – Frequent unpredictable disruptions • DTN Style Data Delivery – Resilient to path disconnections – Efficient use of bandwidth • System Management • The Future System Management • Existing management tool: remote shell (ssh) • Modified management tool: Disruption Tolerant Shell – Asynchronous remote shell to all nodes in network simultaneously – Provides node management capabilities when end-to-end connections are unavailable or fail – Ensures that commands will succeed: as long as there is eventually a connection between a node and any other node that already has the command df –h ls /opt/dts/file_mover | wc A E B C F Commands Responses D Extra Fun Features of DTS • Guaranteed in order execution from source node • Reboot and crash safe • Implicit feed back on nodes and links: spot bottlenecks, dead nodes • Execute a command on individual nodes • Push a file to all nodes – Distribute new script or component Upcoming Features • Web interface – Command line interface is nice for me • Takes a bit of getting used to – Web interface more intuitive for asynchronous model • Constant feeds of frequently executed commands – Disk space, file counts, q330/gurlap status, link quality • SensorBase.org – Accountability log: load all commands and responses and metadata for those – DTS analysis and implicit network feedback: just point and click Reliable State Synchronization A • StateSync: reliable and efficient publishsubscribe mechanism PUBLISH Commands Responses • Implements a broadcast dissemination protocol SYNCHRONIZE – Published data is scoped – DTS publishes commands and responses one hop B • Works well for applications that require: – Reliable delivery – Have a few Kbytes of data to share – Data has lifetime that is long compared to system latency requirements – Suitable for DTN since it does not use end-to-end connections Commands PUBLISH Responses SYNCHRONIZE PUBLISH C Commands Responses DTS latency results • Compare latency of DTS to parallel ssh • DTS is faster 90% of the time, comparable to the rest • DTS reaches 100% of nodes – ssh requires retries from the source node • Latency can vary by day, but DTS always faster or comparable to ssh What makes DTS better than ssh? • StateSync data model: tables of key value pairs – DTS has a command table and response table • Each node republishes a command and response tables one hop A Cmd A-1 Resp A-1-A Resp A-1-B Resp A-1-C • Logging mechanism – Do not republish whole table – Only send changes to tables: small amount of information – More efficient use of bandwidth in face of disconnections • Retransmission protocol – Keeps retrying on individual links – Not affected by path disconnections – No overhead of creating and maintaining endto-end connection B Cmd A-1 Resp A-1-A Resp A-1-B Resp A-1-C Future of StateSync • StateSync allows data to be published N hops – When publish N hops, not end to end but expect data path (the flow) to be maintained with refresh beacons – If refreshes from source or node in flow stop, statesync will not propagate information – Not idea for frequent disconnections • DTS publishes data one hop – Gets around problem by republishing another nodes data as its own – Statesync only publishes one hop • Tweaks – Allow flows to be propagated even when no refresh from source or node along data path – Tunable latency parameters – Report metrics about itself • DTS can then publish data N hops – Lowers RAM usage, lowers number of packets Site Installation Mexico Xyoli Pérez-Campos, Mario Islas Herrera, Oscar Martínez Susano, Jorge Soto, Aida Quezada Reyes, Arturo Iglesias, Lizbeth Espejo, Luis Antonio Placencia Gómez, Luis Edgar Rodriguez, Fernando Greene USA Paul Davis, Allen Husker, Igor Stubailo, Richard Guy, Sam Irving, Martin Lukac, Alma Quezada, Steve Skinner, Irving Flores Our Contributions • Real world application and deployment of Delay Tolerant Networking (DTN) techniques for data delivery • Disruption Tolerant Shell (DTS): a tool for system management on challenged networks that performs better than traditional tools Summary • MASE: A Challenged Network – Poor and erratic links – Frequent unpredictable disruptions • DTN Style Data Delivery – Resilient to path disconnections – Efficient use of bandwidth • System Management – DTS viable tool for system management for challenged networks • The Future Whats Next? • Have a tool that works – Understand conceptually why it works better • We have a high level analysis: per link bandwidth • Network is being pulled out in Feburary Work in Progress • Need better network characterization – Long-Distance 802.11b Links: Performance Measurements Experience, Chebrolu, B.(RSS) Raman, Sen source-destination – ITT Vinayakand analyzed receivedK.signal strength for aS.single Kanpur, 2006 pair in the UNAMMobicom line. – Use their driver to collect per packet: received signal strength, silence value, MAC packet & -81dBm subtype,(~10% CRCofcheck Max RSS: -46dBm (~83% of data) Mintype RSS: data) succeeded or not, MAC address information, MAC sequence Difference of 35dB number information Max/Min for IIT-Kanpur's -70dBm / -90dBm – Is our network different then theirs? Antennas, chipsets are Difference of 20dB the same. Our network is not always way up high… and do not have good link quality all the time. Next do this on Cuernavaca line. Maybe it will have higher variation than that • Coordinated IP level dumps on entire network of UNAM. – Can’t stop data flow Synchronize dumps between nodes since RTS-CTS is off High –variation might be from inter-link interference – Coordinate driver information See what RTS-CTSwith does. – How do the long links affect the transfers? If still–high linkhidden variation, then Mexico network is intrinsically different from Huge terminal problem, does rts/cts seem to help? that in India. May be our network is in between Boston's urban Roofnet and Kanpur's rural network? New Applications • DTS and DTN ideas/techniques can (must?) be applied to two new CENS applications – GeoNet – SHM (Structure Health Monitoring) GeoNet: Rapidly Deployable Challenged Network • Platform to support high data rate rapidly deployed large-scale WSN – Deploy 100-1000 nodes after event at a separation of 0.5-1Km AENSbox GPS • Software tools for rapid deployment – Must make real time decision about sensor location vs. network connectivity tradeoff – Need as much feedback from network as possible • Power efficient platform such as LEAP needs appropriate software architecture. • Network time synchronization when no GPS available • Data deliver & system management • Take advantage of dual radios? Geophone 802.15.4 WIFI SHM • SHM framework to improve safety and reliability of aerospace, civil and mechanical infrastructure by detecting damage before it reaches a critical state • Initially targeting tall buildings • Still a challenged network – Building structure (walls, ceilings), people, other networks, ‘stuff’ SHM Framework Embedded Network Structural System Sensors SHMBox Network Health Sensors SHMBox Network Health Damage State FEM Model Engineering Demand Parameters Fragility Curves Fragility Curves Probability Monitored Zones of Interest Event Detection Damage Measure Thank you! [email protected] Demo! Thanks to Igor and Derek for all the pictures and diagrams! Teotihuacan, 2006 MASE Wireless Seismic Station 15 dBi YAGI or 24 dBi Parabolic 2.4GHz antenna 70 watt solar panel, GPS mast and guy wires Quanterra Q330 24-bit digitizer sensor controller 2.4GHz amp car battery CDCC (CENS Data Communication Controller) Guralp 3T seismometer 30 Science! Following slides prepared by Roy Clayton (CalTech) and Igor Stubailo (UCLA – CENS) The Middle America Subduction Experiment (MASE). Why Mexico? Slab detachment theory. B •A subduction zone is an area on Earth where two tectonic plates meet and move towards one another, with one sliding underneath the other and moving down into the mantle, at a speed of several inches per year. •Typically, an oceanic plate slides underneath a continental plate, and this often creates a zone with many volcanoes and earthquakes. Ferrari, 2004, Geology 32 Similarities of Mexico City and Los Angeles locations •LA and Mexico City are major centers of commerce which sit upon compliant sedimentary basins. •Both are subject to damaging earthquakes and how earthquakes excite resonant shaking 33 Great potential of high station density •Achieve 20 times better resolution than before. •Provide visualization of the upper mantle and the subduction process, coast to coast across Mexico. •The data collected is very valuable to scientists in seismology, geodesy, geochemistry, geology, computational geodynamics, geophysics, and others 34 Russian Event (Kamchatka) : April 20, 2006, M=7.7 35 First results: detect flat slab with receiver functions 36 Rob Clayton, Caltech, 2006