Chord: A Scalable Peer-to-peer Lookup Protocol for Internet Applications
Ion Stoica, Robert Morris, David Liben-Nowell, David R. Karger, M. Frans Kaashoek, Frank Dabek, Hari Balakrishnan
IEEE/ACM Transactions on Networking (TON), February 2003
Presented by Leland Smith
Organization
• Introduction: Concepts
  • What is Chord?
  • Distributed Hash Tables
• Chord Protocol
• Simulation Results
• Future Work
• Summary and Conclusion
• Questions
Intro: What is Chord?
• Structured peer-to-peer overlay network
  • A logical network built on top of the existing Internet.
  • Fully distributed, no central authority.
  • All nodes are equal.
• Challenges:
  • How to locate data?
  • How to route efficiently and correctly?
  • How to adapt to changes in the network?
Intro: What is Chord?
• Chord is a highly structured peer-to-peer key-lookup service based on distributed hash tables.
• It does not specify how to store the data, only how to find it.
• It is an API providing just one function: lookup(key), which returns the node at which key should be stored, if it exists.
• It is designed for higher-level services, such as persistent storage through replication, to be built on top of its basic mechanism.
  • Examples: storage (CFS), indexing (DNS)
Intro: Distributed Hash Tables
• Hash tables associate keys with data.
• DHTs are an abstraction of the classic hash table, with the load distributed across the nodes of a network.
• Each connected node is responsible for a portion of the hash table's shared keyspace.
• A fixed keyspace is used, within which all hash values fall.
• Other DHT-based networks: FreeNet, Pastry, Tapestry, CAN, Ohaha
Intro: Unique Identifiers
• Each node and key has a unique identifier.
• Chord uses the 160-bit SHA-1 hash function, making collisions extremely unlikely (2^160 possible keys) and scattering hash values enough for approximate load balancing.
• Identifiers are generated by the hash function at each node, using an understood input that is unique to the node (sketched below).
  • Example: the node's IP address
  • Each node computes its own identifier; it is not assigned, which avoids centralization.
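As a rough illustration, identifier generation might look like the following minimal Python sketch; the chord_id name, the sample address, and the toy m = 6 ring size are assumptions for illustration, not from the paper.

import hashlib

def chord_id(value: str, m: int = 160) -> int:
    """Map a string (e.g. an IP address or a key) onto the 2^m identifier ring."""
    digest = hashlib.sha1(value.encode()).digest()   # 160-bit SHA-1 hash
    return int.from_bytes(digest, "big") % (2 ** m)  # keep m bits of it

# Each node derives its own identifier locally -- no central assignment.
node_id = chord_id("192.168.0.7", m=6)  # hypothetical node address
key_id = chord_id("my-file.txt", m=6)   # hypothetical key name
print(node_id, key_id)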
Chord Ring (1)
[figure: identifier circle with m = 6, 10 nodes, and 5 keys; node x shown with predecessor(x) and successor(x)]
• Identifier circle (mod 2^m)
• Node & key IDs
• Successor
• Key location
• "Owning" keys (illustrated in the sketch below)
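To make key ownership concrete, here is a toy sketch matching the figure's parameters (m = 6, 10 nodes, 5 keys); the specific node and key IDs are made up for illustration.

m = 6
RING = 2 ** m
nodes = sorted([1, 8, 14, 21, 32, 38, 42, 48, 51, 56])  # 10 example node IDs

def successor(k: int) -> int:
    """First node ID at or after k on the ring, wrapping past 2^m - 1 to 0."""
    for n in nodes:
        if n >= k % RING:
            return n
    return nodes[0]  # wrapped around the top of the ring

for key in [10, 24, 30, 38, 54]:  # 5 example keys
    print(f"key {key} is owned by node {successor(key)}")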
Chord Ring (2)
• A correct successor pointer is the only piece of data required for correctness of the protocol.
• Additional information is maintained only to speed up lookups and provide fault tolerance.
Simple Key Location (1)
• Naïve key-lookup strategy:
  • Is the key I'm looking for between me and my successor?
  • If so, my successor is responsible for the key.
  • If not, forward the lookup request to my successor.
• The lookup will eventually arrive at an answer, but it passes through every node in the identifier space between the source node and the node hosting the key. This obviously doesn't scale well! (See the sketch after this slide.)
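A minimal sketch of the naive strategy, assuming in-memory Node objects that know only their successor; the class and helper names are illustrative, and the paper defines the protocol in terms of RPCs between hosts.

def in_half_open(x: int, a: int, b: int) -> bool:
    """True if x lies in the interval (a, b] on the identifier circle."""
    return a < x <= b if a < b else (x > a or x <= b)  # second case wraps past 0

class Node:
    def __init__(self, ident: int):
        self.id = ident
        self.successor: "Node" = self  # a one-node ring points at itself

    def find_successor(self, key: int) -> "Node":
        # Is the key between me and my successor? Then my successor owns it.
        if in_half_open(key, self.id, self.successor.id):
            return self.successor
        # Otherwise forward to my successor: O(N) hops in the worst case.
        return self.successor.find_successor(key)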
Simple Key Location (2)
[figure]
Scalable Key Location (1)
• Chord improves lookup performance by maintaining a routing table of m entries for an m-bit identifier space, called the finger table.
• The i-th entry of the table (the i-th finger) at node n contains the IP address and identifier of the first node that succeeds n by at least 2^(i-1) on the Chord ring.
• For i = 1, 2^(1-1) = 2^0 = 1, so the first finger of a node is the node immediately following it on the Chord ring: its successor. (Finger-table construction is sketched below.)
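A sketch of finger-table construction over the same toy ring as the earlier ownership example; the finger_table helper and node IDs are assumptions for illustration.

m = 6
RING = 2 ** m
nodes = sorted([1, 8, 14, 21, 32, 38, 42, 48, 51, 56])

def successor(k: int) -> int:
    """First node ID at or after k on the ring, wrapping around."""
    for cand in nodes:
        if cand >= k % RING:
            return cand
    return nodes[0]

def finger_table(n: int) -> list[int]:
    """finger[i] = successor((n + 2^(i-1)) mod 2^m) for i = 1..m."""
    return [successor((n + 2 ** (i - 1)) % RING) for i in range(1, m + 1)]

print(finger_table(8))  # the first entry is node 8's immediate successor, 14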
Scalable Key Location (2)
[figure]
Scalable Key Location (3)
• A better key-lookup strategy:
  • Is the key I'm looking for between me and my successor?
  • If so, my successor is responsible for the key.
  • If not, consult my finger table, starting at the node farthest from me (the m-th finger in an m-bit identifier space) and working backward; find the first node whose identifier lies between me and the key, and forward the lookup to that node.
• This requires only O(log N) routing hops before arriving at an answer in an N-node network. (A sketch follows below.)
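A compact sketch of this strategy over toy in-memory nodes; the between helper, the fingers field, and the backward scan are illustrative stand-ins for the paper's closest-preceding-finger routine.

def between(x: int, a: int, b: int) -> bool:
    """True if x lies strictly between a and b on the identifier circle."""
    return a < x < b if a < b else (x > a or x < b)  # second case wraps past 0

class Node:
    def __init__(self, ident: int):
        self.id = ident
        self.successor: "Node" = self
        self.fingers: list["Node"] = []  # fingers[i-1] is the i-th finger

    def find_successor(self, key: int) -> "Node":
        # Between me and my successor? Then my successor owns the key.
        if key == self.successor.id or between(key, self.id, self.successor.id):
            return self.successor
        # Scan the finger table from the farthest entry backward.
        for f in reversed(self.fingers):
            if between(f.id, self.id, key):
                return f.find_successor(key)  # roughly halves the distance: O(log N) hops
        return self.successor.find_successor(key)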
Scalable Key Location (4)
[figure]
Stabilization
• A node must keep its finger table up to date as nodes join, leave, and fail in a dynamic network.
• To achieve this, each node periodically runs a stabilization process, which re-locates all of its fingers in the changing network.
• More importantly, each node makes sure it is its successor's predecessor. If it is not, a node has joined between it and its successor, and that newly joined node is its new successor. (A sketch follows below.)
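A sketch of that periodic check, with assumed Node fields (id, successor, predecessor); it mirrors, but is not verbatim, the paper's pseudocode.

def between(x: int, a: int, b: int) -> bool:
    """True if x lies strictly between a and b on the identifier circle."""
    return a < x < b if a < b else (x > a or x < b)

def stabilize(node) -> None:
    """Ask my successor for its predecessor; adopt it if it sits between us."""
    x = node.successor.predecessor
    if x is not None and between(x.id, node.id, node.successor.id):
        node.successor = x        # a node joined just after us
    notify(node.successor, node)  # I may be my successor's predecessor

def notify(node, candidate) -> None:
    """Accept candidate as predecessor if it is closer than the current one."""
    if node.predecessor is None or between(candidate.id, node.predecessor.id, node.id):
        node.predecessor = candidate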
Fault Tolerance
• A Chord node must be able to withstand the failure of nodes in the network, most importantly its successor.
• To guard against successor failure, a Chord node maintains a list of its next r successors in its successor list.
• r consecutive nodes must fail simultaneously for the ring to be disrupted, which is very improbable even with modest values of r.
• Guarding against failure in the finger table is less important: if a node fails during a lookup, the lookup may simply try the next finger in the table. (Successor-list failover is sketched below.)
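A sketch of falling back along the successor list; the successor_list field and the liveness check are illustrative assumptions, since in practice liveness would be probed with an RPC.

def is_alive(node) -> bool:
    return getattr(node, "alive", True)  # stand-in for a real ping/RPC check

def live_successor(node):
    """Return the first reachable entry of the r-entry successor list."""
    for s in node.successor_list:  # ordered nearest-first along the ring
        if is_alive(s):
            return s
    # All r successors failed at once -- very improbable for modest r.
    raise RuntimeError("ring disrupted: all listed successors unreachable")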
Joining
• A joining node must know of a node that is already connected to a Chord network.
• The joining node asks the existing node to find its successor.
• Once it knows its successor, the joining node has connected to the network.
• Its successor can then transfer the keys it "owns" whose identifiers lie in the joining node's keyspace to the joining node. (These steps are sketched below.)
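A sketch of the join steps, with assumed field names; the key-transfer logic is an illustration of the idea, not the paper's exact procedure.

def in_half_open(x: int, a: int, b: int) -> bool:
    """True if x lies in (a, b] on the identifier circle."""
    return a < x <= b if a < b else (x > a or x <= b)

def join(new_node, known_node) -> None:
    """Connect new_node to the ring via any already-connected node."""
    new_node.predecessor = None
    new_node.successor = known_node.find_successor(new_node.id)
    succ = new_node.successor
    # The successor hands over the keys it "owns" that now fall in the
    # joining node's range, i.e. everything outside (new_node.id, succ.id].
    moved = {k: v for k, v in succ.keys.items()
             if not in_half_open(k, new_node.id, succ.id)}
    new_node.keys = moved
    for k in moved:
        del succ.keys[k]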
Simulation Results (1)
Load Balancing: Total Keys vs. Keys per Node
[figure: N = 10,000 nodes, K = 500,000 keys]
• Although variations exist because node identifiers do not uniformly cover the entire identifier space, each node is responsible for about K/N keys in a network with N nodes and K total keys.
• On average, with high probability, each node is responsible for O(1/N) of the identifier space.
Simulation Results (2)
Lookup Performance: Path Length vs. Number of Nodes
[figure]
• For a network with N nodes, with high probability, lookups are resolved in O(log N) routing hops.
Problems Addressed
• Load balance: the distributed hash function spreads keys evenly over the nodes.
• Decentralization: fully distributed and self-organizing.
• Scalability: lookup cost grows as the logarithm of the number of nodes.
• Availability: internal tables are automatically adjusted to reflect changes in the network.
• Flexible naming: no constraints on key structure.
Summary and Conclusion
• Chord solves the common peer-to-peer problem of locating a key in a network in a very efficient, decentralized manner.
• However, keyword searches are difficult.
• Correctness is provable.
• In an N-node network, Chord:
  • maintains routing information on only O(log N) nodes
  • resolves all lookups in O(log N) routing hops
• Promising future.
Future Work
• Detecting and healing partitions.
• Protecting against malicious nodes.
• Network/geography-sensitive routing.
• Realistic load balancing, sensitive to each node's unique combination of attributes.
• Anonymity, or at least plausible deniability.
Questions?
Thank You!