Distributed hash tables: protocols and applications
Jinyang Li

Peer-to-peer systems
• Symmetric in node functionality
  – Anybody can join and leave
  – Everybody gives and takes
• Great potential
  – Huge aggregate network capacity, storage, etc.
  – Resilient to single points of failure and attack
• E.g. Napster, Gnutella, BitTorrent, Skype, Joost

Motivation: the lookup problem
• How to locate something given its name?
• Examples:
  – File sharing
  – VoIP
  – Content distribution networks (CDNs)

Strawman #1: a central search service
• The central service is a single point of failure and attack
• Hard to scale: the central service needs serious resources ($$$)

Improved #1: hierarchical lookup
• Perform hierarchical lookup (like DNS)
• More scalable than a central site
• The root of the hierarchy is still a single point of vulnerability

Strawman #2: flooding
• Symmetric design: no special servers needed
• A query ("Where is item K?") is flooded to all neighbors
• Too much overhead traffic
• Limited in scale
• No guarantee of results

Improved #2: super peers
• E.g. Kazaa, Skype
• Lookups only flood the super peers
• Still no guarantee of lookup results

DHT approach: routing
• Structured:
  – put(key, value); get(key) → value
  – Maintain routing state for fast lookup
• Symmetric:
  – All nodes have identical roles
• Many protocols: Chord, Kademlia, Pastry, Tapestry, CAN, Viceroy

DHT ideas
• Scalable routing needs a notion of "closeness"
• Ring structure [Chord]

Different notions of "closeness"
• Tree-like structure [Pastry, Kademlia, Tapestry]
  (figure: node IDs such as 01100100, 01100101, 01100110, 01100111 grouped under shared prefixes 011000**, 011001**, 011010**, 011011**)
  – Distance is measured as the length of the longest prefix match with the lookup key
• Cartesian space [CAN]
  (figure: the unit square from (0,0) to (1,1) partitioned into rectangular zones, one per node)
  – Distance is measured as geometric distance

DHT: a complete protocol design
1. Name nodes and data items
2. Define the data-item-to-node assignment
3. Define per-node routing table contents
4. Handle churn
   – How do new nodes join?
   – How do nodes handle others' failures?

Chord: naming
• Each node has a unique flat ID
  – E.g. SHA-1(IP address)
• Each data item has a unique flat ID (key)
  – E.g. SHA-1(data content)

Data-to-node assignment
• Consistent hashing: node IDs and keys live on the same ring
  – A key's successor (the first node clockwise from the key) is responsible for it; the predecessor is the node "closest" before it

Key-based lookup (routing)
• Correctness
  – Each node must know its correct successor

Key-based lookup
• Fast lookup
  – A node has O(log n) fingers, exponentially "far" away
  – Node x's i-th finger points to the first node at distance 2^i past x
• Why is lookup fast? Each finger hop at least halves the remaining ring distance to the key, so a lookup takes O(log n) hops (see the sketches following this Chord discussion)

Handling churn: join
• A new node w joins an existing network
  – Issue a lookup to find its successor x
  – Set its successor: w.succ = x
  – Set its predecessor: w.pred = x.pred
  – Obtain from x the subset of items w is now responsible for
  – Notify x to update its predecessor: x.pred = w

Handling churn: stabilization
• Each node periodically fixes its state
  – Node w asks its successor x for x.pred
  – If x.pred is "closer" to w than x is, set w.succ = x.pred
• Starting from any connected graph, stabilization eventually makes all nodes find their correct successors

Handling churn: failures
• Nodes leave without notice
  – Each node keeps s (e.g. 16) successors
  – Keys are replicated on the s successors

Handling churn: fixing fingers
• Fingers are much easier to maintain
  – Any node at distance in [2^i, 2^(i+1)] away will do
  – Geographically closer fingers give lower lookup latency
  – Periodically flush dead fingers
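To make the ring structure and finger-based routing above concrete, here is a minimal single-process sketch in Python. It builds a static ring, so there is no churn handling; the 32-bit ID space, the ChordNode class, and the offline construction of successors and fingers are choices made for this illustration, not part of Chord itself, which uses 160-bit SHA-1 IDs and builds this state through joins and stabilization.

    # Minimal single-process sketch of Chord-style consistent hashing and
    # finger-table lookup.  The 32-bit ID space and class names are illustrative.
    import hashlib

    M = 32                      # bits in the identifier space (Chord uses 160)
    RING = 1 << M

    def chord_id(name: str) -> int:
        """Flat ID from a name, e.g. SHA-1(IP address), truncated to M bits."""
        return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big") % RING

    def in_interval(x: int, a: int, b: int) -> bool:
        """True if x lies in the half-open ring interval (a, b]."""
        if a < b:
            return a < x <= b
        return x > a or x <= b   # interval wraps past zero

    class ChordNode:
        def __init__(self, name: str):
            self.name = name
            self.id = chord_id(name)
            self.succ = self      # successor pointer (needed for correctness)
            self.fingers = []     # i-th finger: first node at or after id + 2^i

        def closest_preceding(self, key: int):
            """Jump as far around the ring as possible without overshooting key."""
            for f in reversed(self.fingers):
                if in_interval(f.id, self.id, key) and f.id != key:
                    return f
            return self.succ

    def find_successor_linear(sorted_nodes, key):
        """Ground truth: first node clockwise from key (used only to build state)."""
        for n in sorted_nodes:
            if n.id >= key:
                return n
        return sorted_nodes[0]

    def build_ring(names):
        """Offline construction of successor pointers and finger tables."""
        nodes = sorted((ChordNode(n) for n in names), key=lambda n: n.id)
        for i, node in enumerate(nodes):
            node.succ = nodes[(i + 1) % len(nodes)]
            node.fingers = [find_successor_linear(nodes, (node.id + (1 << k)) % RING)
                            for k in range(M)]
        return nodes

    def lookup(start: ChordNode, key: int):
        """Finger-table lookup: hop to the closest preceding finger until
        the key falls between a node and its successor."""
        node, hops = start, 0
        while not in_interval(key, node.id, node.succ.id):
            node = node.closest_preceding(key)
            hops += 1
        return node.succ, hops   # node.succ is responsible for the key

    if __name__ == "__main__":
        ring = build_ring(f"10.0.0.{i}" for i in range(64))
        owner, hops = lookup(ring[0], chord_id("some data item"))
        print(f"key stored at node {owner.name} after {hops} hops")

With 64 nodes the lookup completes in a handful of hops, on the order of log2(64) = 6, which is the O(log n) behavior claimed above.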
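The join and stabilization steps can likewise be sketched on a toy ring. This standalone snippet uses small integer IDs and a naive successor-walking lookup; the method names mirror the steps described above, but RPCs, timers, and failure handling are omitted, so it is an illustration rather than the protocol itself.

    # Standalone sketch of Chord-style join and periodic stabilization on a
    # toy ring.  Node IDs are small integers and lookups just walk successor
    # pointers; the code is illustrative only.

    class Node:
        def __init__(self, node_id):
            self.id = node_id
            self.succ = self          # successor pointer: the key to correctness
            self.pred = self          # predecessor pointer: repaired lazily

        @staticmethod
        def between(x, a, b):
            """True if x lies in the open ring interval (a, b)."""
            if a < b:
                return a < x < b
            return x > a or x < b     # interval wraps past zero

        def join(self, bootstrap):
            """Joining node: find our successor via any existing node."""
            self.succ = lookup(bootstrap, self.id)
            self.pred = None          # filled in later by stabilization

        def stabilize(self):
            """Periodic repair: adopt the successor's predecessor if it is closer."""
            x = self.succ.pred
            if x is not None and Node.between(x.id, self.id, self.succ.id):
                self.succ = x
            self.succ.notify(self)    # tell our successor that we exist

        def notify(self, candidate):
            """Adopt candidate as predecessor if it is closer than the current one."""
            if self.pred is None or Node.between(candidate.id, self.pred.id, self.id):
                self.pred = candidate

    def lookup(start, key):
        """Correct-but-slow lookup: walk successors until the key's owner is found."""
        n = start
        while not (n is n.succ or key == n.succ.id or
                   Node.between(key, n.id, n.succ.id)):
            n = n.succ
        return n.succ

    if __name__ == "__main__":
        a, b, c = Node(10), Node(42), Node(77)
        a.succ, b.succ, c.succ = b, c, a          # an existing, correct ring
        a.pred, b.pred, c.pred = c, a, b
        w = Node(60)
        w.join(a)                                  # w's successor becomes node 77
        for _ in range(3):                         # a few stabilization rounds
            for n in (a, b, c, w):
                n.stabilize()
        print([(n.id, n.succ.id) for n in (a, b, c, w)])
        # ring now reads 10 -> 42 -> 60 -> 77 -> 10

Running it shows node 60 being spliced in between 42 and 77 after a few stabilization rounds, with no global coordination.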
Using DHTs in real systems: Dynamo
• Amazon's Dynamo key-value store [SOSP '07]
• Serves as persistent state for applications
  – Shopping carts, user preferences
• How does it apply DHT ideas?
  – Assign key-value pairs to responsible nodes
  – Recover automatically from churn
• New challenges?
  – Managing read-write data consistency

Using DHTs in real systems: keyword search
• Keyword-based searches for file sharing
  – eMule, Azureus
• How to apply a DHT?
  – A user has file 1f3d… named "jingle bell Britney"
  – Insert the mappings SHA-1(jingle) → 1f3d, SHA-1(bell) → 1f3d, SHA-1(britney) → 1f3d
  – How to answer the queries "jingle bell" and "Britney"?
• Challenges?
  – Some keywords are far more popular than others
  – What if the RIAA inserts a billion bogus "Britney Spears xyz" entries?
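To illustrate the keyword-search use above, here is a small sketch of an inverted index layered on a put/get interface. The DHT class below is a local stand-in (a dict) for a real DHT client; the append-style put, the whitespace tokenization, and the intersection of result sets are assumptions of this illustration, not a description of eMule's or Azureus's actual protocol.

    # Sketch: keyword search as an inverted index over a DHT's put/get interface.
    # The DHT class is a local stand-in; a real system would issue network RPCs.
    import hashlib

    class DHT:
        """Stand-in for a DHT client with append semantics: key -> set of values."""
        def __init__(self):
            self.store = {}

        def put(self, key: bytes, value: str):
            self.store.setdefault(key, set()).add(value)

        def get(self, key: bytes) -> set:
            return self.store.get(key, set())

    def keyword_key(word: str) -> bytes:
        """A keyword's DHT key, e.g. SHA-1 of the lowercased word."""
        return hashlib.sha1(word.lower().encode()).digest()

    def publish(dht: DHT, file_id: str, name: str):
        """Insert one keyword -> file_id mapping per word in the file name."""
        for word in name.split():
            dht.put(keyword_key(word), file_id)

    def search(dht: DHT, query: str) -> set:
        """Answer a multi-keyword query by intersecting the per-keyword sets."""
        results = None
        for word in query.split():
            hits = dht.get(keyword_key(word))
            results = hits if results is None else results & hits
        return results or set()

    if __name__ == "__main__":
        dht = DHT()
        publish(dht, "1f3d", "jingle bell Britney")
        publish(dht, "9a07", "jingle all the way")
        print(search(dht, "jingle bell"))   # {'1f3d'}
        print(search(dht, "Britney"))       # {'1f3d'}

The popularity problem noted above shows up directly in this design: the node responsible for SHA-1("britney") receives a disproportionate share of the puts and gets.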
Using DHTs in real systems: CoralCDN
Motivation: alleviating flash crowds
  (figure: many browsers sending requests through a mesh of cooperating HTTP proxies in front of the origin server)
• Proxies handle most client requests
• Proxies cooperate to fetch content from each other

Getting content with CoralCDN
• Clients use CoralCDN via a modified domain name: nytimes.com/file → nytimes.com.nyud.net/file
1. Server selection: which CDN node should the client use?
2. Lookup(URL): which nodes are caching the URL?
3. Content transmission: from which caching nodes should the proxy download the file?

Coral design goals
• Don't control data placement
  – Nodes cache based on access patterns
• Reduce load at the origin server
  – Serve files from cached copies whenever possible
• Low end-to-end latency
  – Lookups and cache downloads are optimized for locality
• Scalable and robust lookups

Lookup for cache locations
• Given a URL:
  – Where is the data cached?
  – Map name to locations: URL → {IP1, IP2, IP3, IP4}
  – lookup(URL): get the IPs of nearby caching nodes
  – insert(URL, myIP, TTL): add me as a node caching the URL for TTL seconds
• Isn't this what a distributed hash table is for?

A straightforward use of a DHT
• Store the whole list under one key: SHA-1(URL1) → {IP1, IP2, IP3, IP4}, with caching nodes calling insert(URL1, myIP)
• Problems
  – No load balance for a single URL
  – Every insert and lookup for that URL goes to the same node, which cannot be close to everyone

#1 Solve the load-balance problem: relax hash-table semantics
• DHTs are designed for hash-table semantics
  – Insert and replace: URL → IP_last
• Coral instead inserts and appends: URL → {IP1, IP2, IP3, IP4}
• Each Coral lookup needs only a few values
  – lookup(URL) → {IP2, IP4}
  – Preferably values inserted by nodes close in the network

Prevent hotspots in the index
  (figure: lookup routes converge on the root node, the node with the closest ID; leaf nodes with distant IDs sit 1, 2, 3, … hops away)
• Route convergence
  – O(b) nodes are 1 hop from the root
  – O(b^2) nodes are 2 hops from the root, and so on
• Request load for a popular URL therefore increases exponentially towards the root

Rate-limiting requests
  (figure: entries for a popular URL spread along the lookup path, e.g. {IP1, IP2, IP3, IP4} at the root and {IP5}, {IP3, IP4} further down; leaked inserts refresh the root's set over time, e.g. IP3 replaced by IP6)
• A node refuses to cache a new location if it already holds the maximum number of "fresh" IPs for that URL
  – Locations of popular items are pushed down the tree
• Except that each node leaks through at most β inserts per minute per URL
  – This bounds the rate of inserts towards the root, yet keeps the root fresh
  – (a sketch of such an insert handler appears at the end of this section)

Load balance results
  (figure: per-node insert rates, in multiples of β, measured across 494 nodes on PlanetLab)
• Aggregate request rate: ~12 million per minute
• Rate limit per node (β): 12 per minute
• The root has fan-in from only 7 other nodes

#2 Solve the lookup locality problem
• Cluster nodes hierarchically based on RTT
• A lookup traverses up the hierarchy
  – Route to the "closest" node at each level
  – (a simplified sketch appears at the end of this section)

Preserve locality through the hierarchy
  (figure: the same ID space at three cluster levels, with RTT thresholds of < 20 ms, < 60 ms, and none)
• Minimizes lookup latency
• Prefers values stored by nodes within faster clusters

Clustering reduces lookup latency
  (figure: lookup-latency distribution with and without clustering)
• Clustering reduces the median lookup latency by a factor of 4

Putting it all together: Coral reduces server load
  (figure: request rates over the course of a flash crowd)
• Most hits are served within the 20-ms Coral cluster
• Local disk caches begin to handle most requests
• Few hits reach the origin server
• 400+ nodes provide 32 Mbps, about 100x the capacity of the origin
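As a rough illustration of the relaxed append semantics and the rate-limiting described above, here is a sketch of a single index node's insert handler. The cap of 4 fresh entries per URL, the β of 12 leaked inserts per minute, and the data structures are illustrative choices; Coral's real implementation differs in detail.

    # Sketch of one Coral-style index node: append semantics with TTL expiry,
    # a cap on "fresh" entries per URL, and a bounded leak rate towards the root.
    import time
    from collections import defaultdict

    MAX_FRESH = 4          # max fresh IPs stored per URL at this node
    BETA = 12              # inserts per minute per URL leaked towards the root

    class IndexNode:
        def __init__(self):
            self.entries = defaultdict(list)             # url -> [(ip, expiry)]
            self.leaked = defaultdict(lambda: (0.0, 0))  # url -> (window_start, count)

        def fresh(self, url, now):
            """Drop expired entries and return the fresh ones."""
            live = [(ip, exp) for ip, exp in self.entries[url] if exp > now]
            self.entries[url] = live
            return live

        def insert(self, url, ip, ttl, now=None):
            """'stored' if accepted here, 'forward' if the insert may continue
            towards the root, 'refused' if the caller should keep the entry at
            a node further down the tree."""
            now = time.time() if now is None else now
            if len(self.fresh(url, now)) < MAX_FRESH:
                self.entries[url].append((ip, now + ttl))
                return "stored"                      # not full: value lives here
            start, count = self.leaked[url]
            if now - start >= 60:                    # start a new one-minute window
                start, count = now, 0
            if count < BETA:                         # leak a bounded number upward
                self.leaked[url] = (start, count + 1)
                return "forward"
            return "refused"                         # popular URL: stay down-tree

        def lookup(self, url, k=2, now=None):
            """Return up to k fresh caching locations for the URL."""
            now = time.time() if now is None else now
            return [ip for ip, _ in self.fresh(url, now)[:k]]

    if __name__ == "__main__":
        node = IndexNode()
        t = 0.0
        for i in range(1, 8):                        # seven proxies announce a URL
            print(i, node.insert("http://example.com/x", f"10.0.0.{i}", ttl=300, now=t))
        print(node.lookup("http://example.com/x", now=t))

In this toy run the first four announcements are stored, and later ones are leaked onward at a bounded rate, mirroring the "full but fresh" behavior described above.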
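The locality hierarchy can also be sketched with a simplified model: every proxy belongs to a level-2 cluster (RTT < 20 ms), a level-1 cluster (< 60 ms), and the single global level, and a lookup tries the fastest cluster first. Real Coral carries a lookup's progress across levels rather than restarting it, so this restart-per-level version is a deliberate simplification, and all names below are made up for the example.

    # Simplified model of Coral's locality hierarchy: per-cluster indexes with
    # RTT thresholds, queried fastest-first.  Illustrative only.
    from collections import defaultdict

    class ClusterIndex:
        """One shared url -> {ip} index per cluster (stand-in for a per-cluster DHT)."""
        def __init__(self):
            self.table = defaultdict(set)

    class Proxy:
        def __init__(self, ip, level2, level1, level0):
            self.ip = ip
            # this proxy's clusters, fastest first
            self.clusters = [("level 2 (<20 ms)", level2),
                             ("level 1 (<60 ms)", level1),
                             ("level 0 (global)", level0)]

        def announce(self, url):
            """Announce that we cache this URL in every cluster we belong to."""
            for _, cluster in self.clusters:
                cluster.table[url].add(self.ip)

        def find_cached(self, url):
            """Prefer hits from the fastest (lowest-RTT) cluster we belong to."""
            for name, cluster in self.clusters:
                ips = cluster.table[url] - {self.ip}
                if ips:
                    return name, ips
            return "none", set()

    if __name__ == "__main__":
        world = ClusterIndex()                      # level 0: everyone
        us, eu = ClusterIndex(), ClusterIndex()     # level 1: < 60 ms regions
        nyc, sfo, ams = ClusterIndex(), ClusterIndex(), ClusterIndex()  # level 2
        a = Proxy("1.1.1.1", nyc, us, world)
        b = Proxy("2.2.2.2", sfo, us, world)
        c = Proxy("3.3.3.3", ams, eu, world)
        b.announce("http://example.com/x")
        c.announce("http://example.com/x")
        print(a.find_cached("http://example.com/x"))  # found via the shared US cluster

The point of the fastest-first order is the locality preference stated above: if a nearby copy exists, the lookup returns it without ever consulting the global level.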