* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download Decentralized Location Services
Survey
Document related concepts
Transcript
Tapestry Architecture and status UCB ROC / Sahara Retreat January 2002 Ben Y. Zhao [email protected] Why Tapestry Today’s Internet Route failures not uncommon IPv4 constrains deployment of new protocols BGP too slow to recover, redundant routes unexploited IP multicast, security protocols (DDoS traceback), … Wide-area applications straining existing systems Scalable management of large scale resources Our goals Wide-area scalable network overlay Highly fault-tolerant routing / location Introspective / self-tuning platform Support application-specific protocols Efficient (b/w, latency) data delivery Pass on wide-area solutions to application layer ROC/Sahara Retreats, 1/2002 2 What is Tapestry? A prototype of a decentralized, fault-tolerant, adaptive overlay infrastructure (Zhao, Kubiatowicz, Joseph et al. 2000) Network substrate of OceanStore Routing: Suffix-based hypercube Similar to Plaxton, Rajamaran, Richa (SPAA97) Decentralized location: Virtual hierarchy per object with cached location references Dynamic algorithms using local information Core API: publishObject(ObjectID) routeMsgToObject(ObjectID) routeMsgToNode(NodeID) ROC/Sahara Retreats, 1/2002 3 Routing and Location Namespace (nodes and objects) 160 bits length 280 names before name collision Each object has its own hierarchy rooted at Root f (ObjectID) = RootID, via a dynamic mapping function Suffix routing from A to B At hth hop, arrive at nearest node hop(h) such that: hop(h) shares suffix with B of length h digits Example: 5324 routes to 0629 via 5324 2349 1429 7629 0629 Object location: Root responsible for storing object’s location Publish / search both route incrementally to root ROC/Sahara Retreats, 1/2002 4 Tapestry Mesh Incremental suffix-based routing 3 4 NodeID 0x79FE NodeID 0x23FE NodeID 0x993E 4 NodeID 0x035E 3 NodeID 0x43FE 4 NodeID 0x73FE 3 NodeID 0x44FE 2 2 4 3 1 1 3 NodeID 0xF990 2 3 NodeID 0x555E 2 NodeID 0xABFE 1 NodeID 0x73FF ROC/Sahara Retreats, 1/2002 NodeID 0x04FE NodeID 0x13FE 4 NodeID 0x9990 1 2 2 NodeID 0x423E 3 NodeID 0x239E 1 NodeID 0x1290 5 Object Location Randomization and Locality ROC/Sahara Retreats, 1/2002 6 Fault-tolerant Routing Strategy: Detect failures via soft-state probe packets Route around problematic hop via backup pointers Handling: 3 forward pointers per outgoing route (2 backups) 2nd chance algorithm for intermittent failures Upgrade backup pointers and replace Protocols: First Reachable Link Selection (FRLS) Proactive Duplicate Packet Routing ROC/Sahara Retreats, 1/2002 7 Talk Outline Tapestry overview Architecture Evaluation Brocade Conclude ROC/Sahara Retreats, 1/2002 8 Architecture Background OceanStore implementation Java with asynchronous I/O Event-based, stage driven architecture (Sandstorm – M. Welsh) Applications OceanStore Tapestry Sandstorm (async I/O, event arch.) Java Virtual Machine Operating System ROC/Sahara Retreats, 1/2002 9 Key Stages StaticTClient / Federation Uses config files to bootstrap initial Tapestry DynamicTClient Integrates new nodes into static Tapestry Router Primary handler of routing and location Patchwork Introspective monitoring and fault-detection Applications Dynamic TClient Patch OceanStore Static TClient work Router Sandstorm (async I/O, event arch.) ROC/Sahara Retreats, 1/2002 10 Static TClient Federation used as rendezvous point Pair-wise pings to generate route tables Federation used as global barrier to begin S S F S ROC/Sahara Retreats, 1/2002 1. Si says hello to F 2. F informs group of Si S 3. Nodes do pair-wise pings 4. Nodes signal readiness 5. Barrier reached at F, signals start 11 Dynamic TClient Node Integration 1. Hill-climb to find nearest Gateway 2. Route to surrogate / copy routes 3. Move relevant objects to new root 4. Directed multicast notifies nearby nodes Routes Request F Routes Response G S Moving Object Pointers Directed Multicast ROC/Sahara Retreats, 1/2002 ? 12 Routing / Location Router class Maintains: RoutingTable: [ ][ ] of RouteEntries ObjectPointers: Hash(Guid)PublishInfo Hash(Guid)LastHop Handles: Object publication / unpublication / mobile objects Route / location message handling ROC/Sahara Retreats, 1/2002 13 Patchwork Fault-handling / introspective stage Granulated periodic beacons measure loss and network latency to entries in routing table Promote/demote routes in single RouteEntry A BB CA C B A X C network ROC/Sahara Retreats, 1/2002 Router 14 Deployment Status Object Location Publish / unpublish / route to object Mobile objects (backtracking unpublish) Active deletes, confirmation of non-existence General Routing Route to node, redundant routes Soft-state fault-detection, limited optimization Advanced policies for fault recovery Dynamic Integration Integration w/ limited optimizations Best effort fault-resilient integration mechanisms Background threads for optimization / refresh ROC/Sahara Retreats, 1/2002 15 Talk Outline Tapestry overview Architecture Evaluation Brocade Conclude ROC/Sahara Retreats, 1/2002 16 Generalized Results Cached object pointers Efficient lookup for nearby objects Reasonable storage overhead Multiple object roots Improves availability under attack Improves performance and perf. stability Reliable packet delivery Redundant pointers approximate optimal reachability FRLS, a simple fault-tolerant UDP protocol ROC/Sahara Retreats, 1/2002 17 First Reachable Link Selection Use periodic UDP packets to gauge link condition Packets routed to shortest “good” link Assumes IP cannot correct routing table in time for packet delivery IP A B C D E ROC/Sahara Retreats, 1/2002 Tapestry No path exists to dest. 18 Some Numbers Measurements PIII 800, L2.2.18, IBM JDK 1.3 Simulating 6 nodes (4 staticTC, 1 federation, 1 dynamicTC) Publishing / locating ~10 objects PublishMsg, RouteMsg: ~ 0-2 ms Integration: ~2600ms (w/ pings) Integration messages: Assuming latency data available 2 x n (routing and objects) 16M (directed multicast notification) (M 3) ROC/Sahara Retreats, 1/2002 19 Talk Outline Tapestry overview Architecture Evaluation Brocade Conclude ROC/Sahara Retreats, 1/2002 20 Landmark Routing on P2P Brocade Exploit non-uniformity Minimize wide-area routing hops / bandwidth Secondary overlay on top of Tapestry Select super-nodes by admin. domain Super-nodes form secondary Tapestry Divide network into cover sets Advertise cover set as local objects Routing (AB) uses brocade to route directly into B’s local network ROC/Sahara Retreats, 1/2002 21 Brocade Mechanisms Selective utilization Nodes cache local cover set Only utilize brocade if dest. not in cache Forwarding messages to supernodes 1. 2. Super-node does IP-snooping Direct: cover set caches supernode Inter-domain routing: AB 1. 2. 3. ASN(A) via IP SN(A) finds SN(B) via Tapestry location SN(B)B via Tapestry/Chord/Pastry/CAN ROC/Sahara Retreats, 1/2002 22 Brocade Routing RDP Brocade Latency RDP 3:1 Relative Delay Penalty Original Tapestry IP Snooping Brocade Directed Brocade 5 4.5 4 3.5 3 2.5 2 1.5 1 0.5 0 2 4 6 8 10 12 14 16 18 20 22 24 26 Interdomain-adjusted Latency on Optimal Route Local cover set cache on; interdomain:intradomain = 3:1 Packet simulator, Transit-stub 4096 T nodes, 16 SuperN ROC/Sahara Retreats, 1/2002 23 Brocade Bandwidth Usage Brocade Aggregate Bandwidth Usage Original Tapestry IP Snooping Brocade Directed Brocade Approx. BW per Message 60 50 40 30 20 10 0 2 4 6 8 10 12 14 Physical Hops in Optimal Route Local cover set cache on B/W unit: (sizeof (Msg) * Hops) ROC/Sahara Retreats, 1/2002 24 Ongoing / Future Work Fill in full functionality Fault-handling policies, introspection, self-repair More realistic experiments Artificial topologies on SOSS simulator Larger scale dynamic integration experiments Code development External deployment / Code release Sprint programmable routers Academic networks Introspective measurement platform Implementing applications (Bayeux, Brocade … ) ROC/Sahara Retreats, 1/2002 25 For More Information Tapestry and related projects (and these slides): http://www.cs.berkeley.edu/~ravenben/tapestry OceanStore: http://oceanstore.cs.berkeley.edu Related papers: http://oceanstore.cs.berkeley.edu/publications http://www.cs.berkeley.edu/~ravenben/publications [email protected] ROC/Sahara Retreats, 1/2002 26