Chord: A Scalable Peer-to-peer Lookup Protocol for Internet Applications
Ion Stoica, Robert Morris, David Liben-Nowell, David R. Karger, M. Frans Kaashoek, Frank Dabek, Hari Balakrishnan
IEEE/ACM Transactions on Networking (TON), February 2003
Presented by Leland Smith
Organization
• Introduction: Concepts
  • What is Chord?
  • Distributed Hash Tables
• Chord Protocol
• Simulation Results
• Future Work
• Summary and Conclusion
• Questions
Intro: What is Chord?
• Structured peer-to-peer overlay network
  • A logical network built on top of the existing Internet.
  • Fully distributed, no central authority.
  • All nodes are equal.
• Challenges:
  • How to locate data?
  • How to route efficiently and correctly?
  • How to adapt to changes in the network?
Intro: What is Chord?
• Chord is a highly structured peer-to-peer key-lookup service based on distributed hash tables.
• It does not specify how to store the data, only how to find it.
• It is an API providing just one function: lookup(key), which returns the node at which key should be stored, if it exists.
• It is designed for higher-level services, such as persistent storage through replication, to be built on top of its basic mechanism.
  • Examples: storage (CFS), indexing (DNS)
Intro: Distributed Hash Tables
• Hash tables associate keys with data.
• DHTs are an abstraction of the classic hash table, with the load distributed across the nodes of a network.
• Each connected node is responsible for a portion of the hash table's shared keyspace.
• A fixed keyspace is used, within which all hash values fall.
• Other DHT-based networks: FreeNet, Pastry, Tapestry, CAN, Ohaha
Intro: Unique Identifiers
• Each node and key has a unique identifier.
• Chord uses the 160-bit SHA-1 hash function, making collisions extremely unlikely (2^160 possible keys) and scattering hash values enough for approximate load balancing.
• Identifiers are generated by the hash function at each node, using an understood input that is unique to the node (sketched below).
  • Example: the node's IP address
  • Each node computes its own identifier; it is not assigned, which avoids centralization.
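As a rough illustration, identifier generation might look like the following minimal Python sketch; the chord_id name, the sample address, and the toy m = 6 ring size are assumptions for illustration, not from the paper.

import hashlib

def chord_id(value: str, m: int = 160) -> int:
    """Map a string (e.g. an IP address or a key) onto the 2^m identifier ring."""
    digest = hashlib.sha1(value.encode()).digest()   # 160-bit SHA-1 hash
    return int.from_bytes(digest, "big") % (2 ** m)  # keep m bits of it

# Each node derives its own identifier locally -- no central assignment.
node_id = chord_id("192.168.0.7", m=6)  # hypothetical node address
key_id = chord_id("my-file.txt", m=6)   # hypothetical key name
print(node_id, key_id)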
Chord Ring (1)
[figure: identifier circle with m = 6, 10 nodes, and 5 keys; node x shown with predecessor(x) and successor(x)]
• Identifier circle (mod 2^m)
• Node & key IDs
• Successor
• Key location
• "Owning" keys (illustrated in the sketch below)
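To make key ownership concrete, here is a toy sketch matching the figure's parameters (m = 6, 10 nodes, 5 keys); the specific node and key IDs are made up for illustration.

m = 6
RING = 2 ** m
nodes = sorted([1, 8, 14, 21, 32, 38, 42, 48, 51, 56])  # 10 example node IDs

def successor(k: int) -> int:
    """First node ID at or after k on the ring, wrapping past 2^m - 1 to 0."""
    for n in nodes:
        if n >= k % RING:
            return n
    return nodes[0]  # wrapped around the top of the ring

for key in [10, 24, 30, 38, 54]:  # 5 example keys
    print(f"key {key} is owned by node {successor(key)}")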
Chord Ring (2)
• A correct successor pointer is the only piece of data required for correctness of the protocol.
• Additional information is maintained only to speed up lookups and provide fault tolerance.
Simple Key Location (1)
• Naïve key-lookup strategy:
  • Is the key I'm looking for between me and my successor?
  • If so, my successor is responsible for the key.
  • If not, forward the lookup request to my successor.
• The lookup will eventually arrive at an answer, but it passes through every node in the identifier space between the source node and the node hosting the key. This obviously doesn't scale well! (See the sketch after this slide.)
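A minimal sketch of the naive strategy, assuming in-memory Node objects that know only their successor; the class and helper names are illustrative, and the paper defines the protocol in terms of RPCs between hosts.

def in_half_open(x: int, a: int, b: int) -> bool:
    """True if x lies in the interval (a, b] on the identifier circle."""
    return a < x <= b if a < b else (x > a or x <= b)  # second case wraps past 0

class Node:
    def __init__(self, ident: int):
        self.id = ident
        self.successor: "Node" = self  # a one-node ring points at itself

    def find_successor(self, key: int) -> "Node":
        # Is the key between me and my successor? Then my successor owns it.
        if in_half_open(key, self.id, self.successor.id):
            return self.successor
        # Otherwise forward to my successor: O(N) hops in the worst case.
        return self.successor.find_successor(key)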
Simple Key Location (2)
[figure]
Scalable Key Location (1)
• Chord improves lookup performance by maintaining a routing table of m entries for an m-bit identifier space, called the finger table.
• The i-th entry of the table (the i-th finger) at node n contains the IP address and identifier of the first node that succeeds n by at least 2^(i-1) on the Chord ring.
• For i = 1, 2^(1-1) = 2^0 = 1, so the first finger of a node is the node immediately following it on the Chord ring: its successor. (Finger-table construction is sketched below.)
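A sketch of finger-table construction over the same toy ring as the earlier ownership example; the finger_table helper and node IDs are assumptions for illustration.

m = 6
RING = 2 ** m
nodes = sorted([1, 8, 14, 21, 32, 38, 42, 48, 51, 56])

def successor(k: int) -> int:
    """First node ID at or after k on the ring, wrapping around."""
    for cand in nodes:
        if cand >= k % RING:
            return cand
    return nodes[0]

def finger_table(n: int) -> list[int]:
    """finger[i] = successor((n + 2^(i-1)) mod 2^m) for i = 1..m."""
    return [successor((n + 2 ** (i - 1)) % RING) for i in range(1, m + 1)]

print(finger_table(8))  # the first entry is node 8's immediate successor, 14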
Scalable Key Location (2)
[figure]
Scalable Key Location (3)
• A better key-lookup strategy:
  • Is the key I'm looking for between me and my successor?
  • If so, my successor is responsible for the key.
  • If not, consult my finger table, starting at the node farthest from me (the m-th finger in an m-bit identifier space) and working backward; find the first node whose identifier lies between me and the key, and forward the lookup to that node.
• This requires only O(log N) routing hops before arriving at an answer in an N-node network. (A sketch follows below.)
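A compact sketch of this strategy over toy in-memory nodes; the between helper, the fingers field, and the backward scan are illustrative stand-ins for the paper's closest-preceding-finger routine.

def between(x: int, a: int, b: int) -> bool:
    """True if x lies strictly between a and b on the identifier circle."""
    return a < x < b if a < b else (x > a or x < b)  # second case wraps past 0

class Node:
    def __init__(self, ident: int):
        self.id = ident
        self.successor: "Node" = self
        self.fingers: list["Node"] = []  # fingers[i-1] is the i-th finger

    def find_successor(self, key: int) -> "Node":
        # Between me and my successor? Then my successor owns the key.
        if key == self.successor.id or between(key, self.id, self.successor.id):
            return self.successor
        # Scan the finger table from the farthest entry backward.
        for f in reversed(self.fingers):
            if between(f.id, self.id, key):
                return f.find_successor(key)  # roughly halves the distance: O(log N) hops
        return self.successor.find_successor(key)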
Scalable Key Location (4)
[figure]
Stabilization
• A node must keep its finger table up to date as nodes join, leave, and fail in a dynamic network.
• To achieve this, each node periodically runs a stabilization process, which re-locates all of its fingers in the changing network.
• More importantly, each node makes sure it is its successor's predecessor. If it is not, a node has joined between it and its successor, and that newly joined node is its new successor. (A sketch follows below.)
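A sketch of that periodic check, with assumed Node fields (id, successor, predecessor); it mirrors, but is not verbatim, the paper's pseudocode.

def between(x: int, a: int, b: int) -> bool:
    """True if x lies strictly between a and b on the identifier circle."""
    return a < x < b if a < b else (x > a or x < b)

def stabilize(node) -> None:
    """Ask my successor for its predecessor; adopt it if it sits between us."""
    x = node.successor.predecessor
    if x is not None and between(x.id, node.id, node.successor.id):
        node.successor = x        # a node joined just after us
    notify(node.successor, node)  # I may be my successor's predecessor

def notify(node, candidate) -> None:
    """Accept candidate as predecessor if it is closer than the current one."""
    if node.predecessor is None or between(candidate.id, node.predecessor.id, node.id):
        node.predecessor = candidate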
Fault Tolerance
• A Chord node must be able to withstand the failure of nodes in the network, most importantly its successor.
• To guard against successor failure, a Chord node maintains a list of its next r successors in its successor list.
• r consecutive nodes must fail simultaneously for the ring to be disrupted, which is very improbable even with modest values of r.
• Guarding against failure in the finger table is less important: if a node fails during a lookup, the lookup may simply try the next finger in the table. (Successor-list failover is sketched below.)
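A sketch of falling back along the successor list; the successor_list field and the liveness check are illustrative assumptions, since in practice liveness would be probed with an RPC.

def is_alive(node) -> bool:
    return getattr(node, "alive", True)  # stand-in for a real ping/RPC check

def live_successor(node):
    """Return the first reachable entry of the r-entry successor list."""
    for s in node.successor_list:  # ordered nearest-first along the ring
        if is_alive(s):
            return s
    # All r successors failed at once -- very improbable for modest r.
    raise RuntimeError("ring disrupted: all listed successors unreachable")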
Joining
• A joining node must know of a node that is already connected to a Chord network.
• The joining node asks the existing node to find its successor.
• Once it knows its successor, the joining node has connected to the network.
• Its successor can then transfer the keys it "owns" whose identifiers lie in the joining node's keyspace to the joining node. (These steps are sketched below.)
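A sketch of the join steps, with assumed field names; the key-transfer logic is an illustration of the idea, not the paper's exact procedure.

def in_half_open(x: int, a: int, b: int) -> bool:
    """True if x lies in (a, b] on the identifier circle."""
    return a < x <= b if a < b else (x > a or x <= b)

def join(new_node, known_node) -> None:
    """Connect new_node to the ring via any already-connected node."""
    new_node.predecessor = None
    new_node.successor = known_node.find_successor(new_node.id)
    succ = new_node.successor
    # The successor hands over the keys it "owns" that now fall in the
    # joining node's range, i.e. everything outside (new_node.id, succ.id].
    moved = {k: v for k, v in succ.keys.items()
             if not in_half_open(k, new_node.id, succ.id)}
    new_node.keys = moved
    for k in moved:
        del succ.keys[k]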
Simulation Results (1)
Load Balancing: Total Keys vs. Keys per Node
[figure: N = 10,000 nodes, K = 500,000 keys]
• Although variations exist because node identifiers do not uniformly cover the entire identifier space, each node is responsible for about K/N keys in a network with N nodes and K total keys.
• On average, with high probability, each node is responsible for O(1/N) of the identifier space.
Simulation Results (2)
Lookup Performance: Path Length vs. Number of Nodes
[figure]
• For a network with N nodes, with high probability, lookups are resolved in O(log N) routing hops.
Problems Addressed
• Load balance: the distributed hash function spreads keys evenly over the nodes.
• Decentralization: fully distributed and self-organizing.
• Scalability: lookup cost grows as the logarithm of the number of nodes.
• Availability: internal tables are automatically adjusted to reflect changes in the network.
• Flexible naming: no constraints on key structure.
Summary and Conclusion
• Chord solves the common peer-to-peer problem of locating a key in a network in a very efficient, decentralized manner.
• However, keyword searches are difficult.
• Correctness is provable.
• In an N-node network, Chord:
  • maintains routing information on only O(log N) nodes
  • resolves all lookups in O(log N) routing hops
• Promising future.
Future Work
• Detecting and healing partitions.
• Protecting against malicious nodes.
• Network/geography-sensitive routing.
• Realistic load balancing, sensitive to each node's unique combination of attributes.
• Anonymity, or at least plausible deniability.
Questions?
Thank You!