Distributed Systems
21./24. Peer-to-Peer Systems
Simon Razniewski
Faculty of Computer Science
Free University of Bozen-Bolzano
A.Y. 2015/2016
Outline
1. Distributed System Architectures
2. P2P - Overview
3. Overlay Networks
4. Unstructured P2P
5. Structured P2P – Chord
1. (Distributed) System Architectures
Reminder
• In general, Distributed Systems (DS) are designed
to share resources.
• What is a resource?
– e.g., programs, data, CPUs
• Advantages of DS
– Fault tolerance
– Cost reduction (e.g., by pooling resources)
– Scalability
– Load balancing
– …
• Disadvantages?
Architectures
• DS are software architectures.
– It is important to organize the various software
modules.
• Software modules have to be deployed on
physical machines.
– According to the way software modules are placed
on physical machines, we obtain different system
architectures.
– This can generate different architectural styles.
Architectural Styles
• Layered architectures
• Object-based architectures
• Data-centric architectures
• Event-based architectures
Architectural Styles
Layered architectures
• Example: TCP/IP protocol stack
Architectural Styles
Object-based architectures
• Components are connected in a RPC style
• Example:
– Client-Server architecture, Java RMI
Architectural Styles
Data-centric architectures
• Processes communicate via a shared repository
• Example:
– Distributed File Systems
Architectural Styles
Event-based architectures
• Processes communicate through a mechanism of event
propagation.
• Publish/Subscribe systems
– processes publish events
– only those processes that subscribed will be notified
• Decoupled systems: publishers and subscribers don’t need to
explicitly refer to each other (cf. Message-Oriented Middleware, MOM)
System architectures
Architecture: Implementation of a style, placement
of components on machines
• Centralized architectures
– Vertical distribution
– Layered applications
– Multi-layered architectures
• Decentralized architecture:
– P2P architectures (horizontal distribution)
• Structured
• Unstructured
• Super-peer based
2. P2P - Overview
Peer-to-Peer: History
• ICQ
– 1996: first P2P instant messaging tool
– the server maintains the status of the users; communication happens
P2P (until 2011, Skype also worked that way)
• Napster
– 1999: first file-sharing system for music
– 2001: shut down over legal issues
• Gnutella
– 2000: first truly decentralized P2P file-sharing system
• Bittorrent
– developed in 2001, the most used system today
• Bitcoin
– Distributed virtual currency, released 2009
Peer-to-Peer Systems
Motivation
• A huge number of nodes participating in the
network:
– Have resources to share
– Have demands on resources which
may not be satisfied easily by single nodes
– Classical client/server systems: servers are the
bottleneck
– The Bitcoin network in 2013 had 8x the compute power of the
top 500 supercomputers
• Main questions:
– Which of the nodes provide the resource
– How to achieve fairness
Client-Server vs. P2P content discovery
[Figure: client-server content discovery via a central index vs.
search in a P2P overlay network of nodes.]
Client-Server:
• Clients interact with servers after finding useful
information provided by indexing services
• No intelligence in the network
P2P:
• Search in the overlay network
• “Content”-based
• Intelligence in the network
System architectures
Decentralized (P2P) architectures
• Decentralized architectures
– Horizontal distribution
• No distinction between client and server
– Each process is a servent (server+client)
– Processes are organized in overlay networks
– Processes + Communication channels
• Different kinds of network overlays
• Super-peer based (e.g. Napster, Bittorrent (?))
• Unstructured (e.g. Gnutella)
• Structured (e.g. Chord)
Peer-to-Peer
Overlay Networks
Peer-to-Peer Systems
Overlay Networks
• P2P networks form an overlay network on top of the
Internet (TCP/IP network)
• TCP/IP networks form an overlay network, politically
and technically, over the underlying telecom infrastructure
• Both:
– introduce their own addressing scheme (e.g. peer IDs, IP
addresses)
– emphasize fault-tolerance
• Complexity of P2P systems lies in the design of the
overlay network + the message exchange protocol
Peer-to-Peer
Features
• Resources (location, sharing)
– Relevant resources located at (private) nodes
(peers)
• uncontrolled, voluntary offers
– Resources of peers are heterogeneous
• bandwidth, CPU power, storage space, …
• quality depends on device / connectivity
– Distributed resource location
• widely spread: requires proper mechanism to find and
use
Peer-to-Peer
Features
• Networking
– High churn rate (ratio of peers joining/leaving a
service over time)
• peers are online for a limited time
• very unpredictable
• heterogeneous connection types
• often operating behind firewalls or NAT gateways
Peer-to-Peer
Features
• Interactions among peers
– Combined client and server functionality
• “SERVer + cliENT = SERVENT”
• services provided by end systems
• underlay only used for message transfers
– Direct interaction (provision of services, e.g. file
transfer) between peers (= “peer to peer”)
Peer-to-Peer
Features
• Coordination
– Peers have significant autonomy
• Mostly similar rights
• Roles based on capabilities
• Self-organizing system
– Relevant mechanisms performed by peers in the
network
• Resource search, allocation and scheduling
• No central control
• No centralized usage/provisioning of a service
Peer-to-Peer
vs Grid Computing
• A computing paradigm to interconnect existing
data processing centers into Virtual
Organizations and operate on data in a
distributed way.
• Commonly in commercial applications
• Still some centralized components
Peer-to-Peer
vs Cloud Computing
• A computing paradigm to access distributed pools of
resources.
– Resources: storage, bandwidth, CPU
• Resource providers: typically companies
• Controlled environment
– No malicious users
– No (or very little) churn
– Homogeneous devices
• Parts of the architecture are centralized
• Different goal
– Software as a service (e.g. Google docs)
– Platform as a service (e.g. Google app engine, Windows Azure)
– Infrastructure as a service (e.g. Amazon EC2)
Peer-to-Peer
Popular systems
Peer-to-Peer
Traffic
[Figure: share of P2P in total Internet traffic; source:
http://www.exfo.com/PageFiles/37706/Robert%20Fitts_Cisco%20Graph.jpg]
Peer-to-Peer
Traffic
• Since 2003 P2P traffic is more than 50% of the
total Internet traffic
– Guess why? :-)
System architectures
Structured P2P architectures
• The overlay is constructed by a deterministic
procedure.
• Distributed Hash Tables (DHTs) are commonly
used to organize data.
• DHTs:
– Each data item is assigned a key in a key space.
– Each node is assigned a key in the same way.
– Idea: map the key of a data item to a unique node key.
The mapping is done by using a distance metric.
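To make the idea concrete, here is a minimal sketch in Python (the 16-bit key space, node addresses, and helper names are illustrative assumptions, not part of any specific DHT):

import hashlib

M = 16                                # assumed key-space size: IDs in [0, 2^16)

def dht_id(name: str) -> int:
    # Map an arbitrary name (node address or data key) into the key space.
    digest = hashlib.sha1(name.encode()).digest()
    return int.from_bytes(digest, "big") % (2 ** M)

def distance(a: int, b: int) -> int:
    # Clockwise ring distance from a to b: one possible distance metric.
    return (b - a) % (2 ** M)

def responsible_node(data_key: str, nodes: list[str]) -> str:
    # Assign the data key to the node whose ID minimizes the distance.
    k = dht_id(data_key)
    return min(nodes, key=lambda n: distance(k, dht_id(n)))

nodes = ["10.0.0.1:4000", "10.0.0.2:4000", "10.0.0.3:4000"]
print(responsible_node("song.mp3", nodes))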
System architectures
Structured P2P architectures
• When looking for a piece of data, it is
important to return the id of the node
responsible for that piece of data.
• This is achieved by performing routing of the
request from the node that originated the
request to the node responsible for the key
associated to the data one is looking for.
System architectures
Structured P2P architectures - Chord
• Nodes are logically organized in a ring.
• Node ids can be obtained e.g., by hashing the
IP address of the node.
• A piece of data having key k is assigned to the
first node having id ≥ k (wrapping around the ring).
– This node is called the successor of k, denoted by
succ(k).
System architectures
Structured P2P architectures- Chord
[Figure: Chord ring with nodes N2, N6, N11, N17, N27, N33, N42, N48,
N50, N56, N63; the distance to peer nodes increases exponentially.
A put(28, A) performed on N48 is routed to N33, the target node,
since succ(28) = N33; data in the arc preceding N27 is stored on
node N27.]
System architectures
Unstructured P2P architectures
• The network looks like a random graph.
• Each node maintains a list of neighbors.
– How to construct this list ?
• e.g., by contacting popular nodes.
• Each node has a partial view of the whole
network.
• When a piece of data has to be found, the
network is flooded.
– Each node contacts its neighbors, and so forth.
System architectures
Unstructured P2P architectures
• Centralized P2P
– e.g., Napster
System architectures
Unstructured P2P architectures
• Fully decentralized
– e.g., Gnutella 0.2
System architectures
Super-peer-based P2P architectures
• Some nodes will act as servers for a subset of
peers.
– These nodes are called super-peers (SPs).
• SPs are organized in a P2P way.
• Problems:
– Leader election (i.e., which nodes are good
candidates for being SPs?)
System architectures
Super-peer-based P2P architectures
• Combine centralized and decentralized aspects
– e.g., KaZaa
Overlay Networks
Types of queries
• Lookup
– Key-value access, as in the case of hash tables
• Given a key, return a single value/file
• Match
– Given a sequence of keywords
• Return all documents matching the terms
• Structured networks have 100% probability of success for
lookup
• Unstructured networks do not give the same guarantee
Structured vs. Unstructured Overlays
Structured:
• Object IDs and peer IDs share the same space
– Every object is assigned to a peer
– Lookup of data items
• Objects are stored at peers according to their IDs
• Easy lookup

Unstructured:
• Object IDs not assigned to peers
– Objects hosted by uploaders directly
– Only search possible
• Objects have no special IDs
• Each peer is responsible for its objects
• Easier organization
Overlay Networks
Classification
4. Unstructured P2P
Unstructured Overlay Networks
Types
• Homogeneous (all nodes are assumed equal)
• Heterogeneous (nodes have various roles)
– Decentralized File Sharing with Distributed Servers
– Decentralized File Sharing with Super Nodes
• Flat: all nodes are in one overlay
• Hierarchical: more than one overlay exists
Unstructured Overlay Networks
Centralized Networks
• Central index maintaining information about
– shared object (e.g., file name)
– location (IP address where the object is located)
• Normal peers maintain the objects
– each peer maintains its own objects
– decentralized storage
– file transfer between peers
• Issues:
– Central point of failure
– Unbalanced cost: the server is a bottleneck
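As a rough sketch of this design (Python; the class and method names are hypothetical, not Napster's actual protocol): the index is a mapping from object names to the peers hosting them; lookups hit the server, while transfers happen directly between peers.

class CentralIndex:
    def __init__(self):
        self.index = {}                       # file name -> set of peer addresses

    def register(self, peer: str, files: list[str]):
        for f in files:
            self.index.setdefault(f, set()).add(peer)

    def unregister(self, peer: str):
        for peers in self.index.values():     # peer left: central state must be cleaned up
            peers.discard(peer)

    def lookup(self, name: str) -> set[str]:
        return self.index.get(name, set())    # O(1) search, but a single point of failure

server = CentralIndex()
server.register("NodeB", ["d1"])
print(server.lookup("d1"))                    # the transfer of d1 then happens peer-to-peer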
Unstructured Overlay Networks
Centralized Networks
[Figure: a central SERVER maintains the index entry (d1, B); a peer
looks up d1 at the server, and the transfer of d1 then happens
directly between peers (Nodes A–G).]
Unstructured Overlay Networks
Centralized Networks
• Advantages?
– Search complexity O(1)
– Simple and fast
• Disadvantages?
– No intrinsic scalability
– Single point of failure/attack
– Complex queries difficult
– A single server cannot handle a large number of
peers
Unstructured Overlay Networks
Distributed Networks
• All peers play the same role
• Search is carried out by a cooperation among
peers
• Each peer has a local view of the network
• Main motivation
– Provide robustness
– No single point of failure
– Scalability
Unstructured Overlay Networks
Distributed Networks - Challenges
• How to join the network ?
– No central index
• Knowing at least 1 node that is already in the network
– Constructing the local view of the network
• How to perform search ?
– Flooding
– Random walk
– ..
• How to deliver contents?
– Peer to Peer communication
Unstructured Overlay Networks
Distributed Networks
• PROs?
– Each peer can fail/leave the network
– Scalable
• CONs?
– Slow and expensive search
– No guarantee of completeness (i.e., finding ALL
the objects that satisfy a request)
Unstructured Overlay Networks
Distributed Networks - Search
• Flooding (Breadth First Search)
– Send the message to all neighbor peers apart from
the peer that delivered the message
– Keep track of the messages processed to avoid
cycles
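A minimal simulation of this flooding scheme (Python; the graph representation and names are illustrative):

from collections import deque

def flood(graph: dict[str, list[str]], source: str, wanted: str,
          has_object: dict[str, set[str]]) -> list[str]:
    # Breadth-first flooding; returns all peers that can answer the query.
    seen = {source}                        # processed peers: avoids cycles
    queue = deque([source])
    hits = []
    while queue:
        peer = queue.popleft()
        if wanted in has_object.get(peer, set()):
            hits.append(peer)
        for neighbour in graph[peer]:
            if neighbour not in seen:      # never send back to the delivering peer
                seen.add(neighbour)
                queue.append(neighbour)
    return hits

graph = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"], "D": ["B", "C"]}
print(flood(graph, "A", "d1", {"D": {"d1"}}))    # ['D']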
Unstructured Overlay Networks
Distributed Networks - Flooding
[Figure: flooding example; a path of length 5 connects the source
node to the target node, but the message count grows rapidly with
each hop (1, 113, 2615, … messages).]
Unstructured Overlay Networks
Distributed Networks - Search
• Expanding Ring
– Successive floods with increasing TTL
• Start with a small TTL; if no success, increase it
– A.k.a. iterative deepening (cf. game search)
• Properties
– Improved performance
• If content popularity follows a Zipf distribution
– Message overhead is high
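Expanding ring can be sketched as repeated TTL-limited floods (Python; flood_with_ttl is a hypothetical hop-bounded variant of the flood above):

def flood_with_ttl(graph, source, wanted, has_object, ttl):
    # Like flood() above, but the query dies after `ttl` hops.
    frontier, seen, hits = {source}, {source}, []
    for _ in range(ttl + 1):
        next_frontier = set()
        for peer in frontier:
            if wanted in has_object.get(peer, set()):
                hits.append(peer)
            next_frontier |= {n for n in graph[peer] if n not in seen}
        seen |= next_frontier
        frontier = next_frontier
    return hits

def expanding_ring(graph, source, wanted, has_object, max_ttl=7):
    for ttl in (1, 2, 4, max_ttl):        # successively larger rings
        hits = flood_with_ttl(graph, source, wanted, has_object, ttl)
        if hits:
            return hits                   # popular content is found cheaply
    return []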
Unstructured Overlay Networks
Distributed Networks - Search
• Random walk
– Forward the query to a randomly selected
neighbor
• Message overhead is reduced significantly
• Increased latency
• Multiple random walks
– Reduce latency
– Generate more load
• Termination mechanism
– TTL based
Unstructured Overlay Networks
Distributed Networks – Random walk
[Figure: random walk with k = 2; each message is sent to only 2
neighbors, from the source node towards the target node.]
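A random walk might be sketched like this (Python; k walkers with TTL-based termination, all names illustrative):

import random

def random_walk(graph, source, wanted, has_object, ttl=10, k=2):
    # Start k independent walkers; each forwards the query to one
    # randomly selected neighbour per hop.
    for _ in range(k):
        peer, prev = source, None
        for _ in range(ttl):              # TTL-based termination
            if wanted in has_object.get(peer, set()):
                return peer               # first hit wins
            candidates = [n for n in graph[peer] if n != prev] or graph[peer]
            prev, peer = peer, random.choice(candidates)
    return None                           # all walks exhausted their TTL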
Gnutella Protocol 0.4
Unstructured Overlay Networks
Gnutella protocol
• Messages have a GUID (globally unique identifier) to
avoid loops
• TTL max. equal to 7
• Phases:
– Connecting
• PING
• PONG
– Querying
• QUERY
• QUERY HIT
– Transfer
• HTTP GET or PUSH (in case of firewall)
Unstructured Overlay Networks
Gnutella protocol – Phase 1
• To connect to a Gnutella network, peer must initially
know
– (at least) one member node of the network and connect to
it
• This/these first member node(s) must be found by
other means
– find first member using other medium (Web, chat, ...)
– nowadays host caches are usually used
• Servent connects to a number of nodes
– thus it becomes part of the Gnutella network
Unstructured Overlay Networks
Gnutella algorithm – Phase 2
• A node that receives a QUERY message
– increases the HOP count field of the message and
• IF (HOP <= TTL && QUERY.GUID not already received) THEN
forward it to all nodes except the one the query was received from
– Nodes also check whether they can answer, and if so
• send a QUERY HIT message to the requestor
– QUERY HIT messages contain
• the IP address of the sender
• The QUERY HIT is routed back along the same path the QUERY arrived on
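A hedged sketch of this per-node QUERY handling (Python; the message fields GUID, TTL, HOP follow the slides, while the class and method names are assumptions):

from dataclasses import dataclass

@dataclass
class Query:
    guid: str          # globally unique identifier, used to avoid loops
    keywords: str
    ttl: int           # max. 7 in Gnutella 0.4
    hops: int = 0

class GnutellaNode:
    def __init__(self, name, shared_files):
        self.name = name
        self.neighbours = []               # servents this node is connected to
        self.shared = set(shared_files)
        self.seen_guids = set()

    def send_query_hit(self, guid, responder):
        # In the real protocol the QUERY HIT is routed back along the query's path.
        print(f"QUERY HIT for {guid}: file at {responder}")

    def on_query(self, msg: Query, came_from=None):
        if msg.guid in self.seen_guids:
            return                         # GUID already received: drop the duplicate
        self.seen_guids.add(msg.guid)
        if msg.keywords in self.shared and came_from is not None:
            came_from.send_query_hit(msg.guid, self.name)
        fwd = Query(msg.guid, msg.keywords, msg.ttl, msg.hops + 1)
        if fwd.hops <= fwd.ttl:            # forward to all nodes except the sender
            for n in self.neighbours:
                if n is not came_from:
                    n.on_query(fwd, came_from=self)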
Unstructured Overlay Networks
Gnutella algorithm – Phase 3
• A peer sets up a HTTP connection
– actual data transfer is not part of the Gnutella protocol
– HTTP GET is used
• Special case: if the peer with the file is located behind a
firewall/NAT gateway, the downloading peer
– cannot initiate a TCP/HTTP connection to it
• it can instead send a PUSH message asking the other peer to
initiate the TCP/HTTP connection itself and then transfer (push)
the file over it
• this does not work if both peers are behind firewalls
Unstructured Overlay Networks
Gnutella algorithm – Scalability
• A TTL (Time To Live) of 4 hops for the PING messages
leads to a known topology of roughly 8000 nodes
– TTL in the original Gnutella client was 7 (not 4)
• Gnutella in its original version (V 0.4) suffers from a
range of scalability issues due to
– fully decentralized approach
– flooding of messages
Unstructured Overlay Networks
Gnutella algorithm – Scalability
• Portman et al. ran experiments on the
frequency of messages
Unstructured Overlay Networks
Gnutella algorithm – Scalability
• How to overcome scalability issues?
– Optimization of connection parameters (e.g., #hops)
– PING/PONG optimization
• e.g., C. Rohrs and V. Falco. Limewire: Ping Pong Scheme, April 2001.
– Dropping of low-bandwidth connections
– Ultra-Peers similar to KaZaA super nodes
– File Hashes to identify files, similar to eDonkey
• Chosen solution: Hierarchy (Gnutella v. 0.6)
– SUPERPEERS to cope with load of low bandwidth peers
(which otherwise might get congested)
Unstructured Overlay Networks
Gnutella algorithm – free riders
• Study results (since e.g. Adar/Huberman 2000):
– 70% of the Gnutella users share no files
– 90% do not answer to queries
• Some solutions
– incentives for sharing: peers only accept connections /
forward messages from peers that share contents
– Micro-payment (e.g., Mojo Nation)
Bittorrent
• BitTorrent is a system for downloading files.
• It does not provide all the functionalities of a
typical P2P system (e.g., search).
• To share content, a user creates a .torrent file
containing:
– Metadata about the content to be shared
– Information about the tracker, that is, the node
that coordinates the distribution
Bittorrent
• The user that provides the file chops it into
pieces of size between 64 KB and 1 MB.
• When a client is looking for a file it locates the
.torrent file pointing to the tracker.
• The tracker tells which other peers (called
leechers) are downloading the file.
Bittorrent
• Replicas of the downloaded parts are created.
– More leechers downloading means more replicas.
– As soon as a leecher has completed a piece, it can share
it with others.
• All the pieces are put together to assemble the file.
• The main objective of BitTorrent is to guarantee
cooperation.
– Avoiding free riding (i.e., nodes that only download
and do not provide contents)
– Tracker hands addresses of pieces only to peers that
share received pieces
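To illustrate the piece mechanism (Python; this sketches only the piece-hashing idea, not the actual bencoded .torrent format), each piece gets its own SHA-1 hash, so a leecher can verify a completed piece before sharing it on:

import hashlib

PIECE_SIZE = 256 * 1024                    # assumed; pieces range from 64 KB to 1 MB

def make_piece_hashes(path: str) -> list[bytes]:
    # Chop the file into fixed-size pieces and hash each one.
    hashes = []
    with open(path, "rb") as f:
        while piece := f.read(PIECE_SIZE):
            hashes.append(hashlib.sha1(piece).digest())
    return hashes

def verify_piece(piece: bytes, index: int, hashes: list[bytes]) -> bool:
    # A leecher checks a downloaded piece against the metadata before re-sharing it.
    return hashlib.sha1(piece).digest() == hashes[index]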
Bittorrent
[Figure (lookup of file F in BitTorrent): the client node queries a
BitTorrent Web page on a Web server with Lookup(F); it follows a
reference to the file server holding the .torrent file for F, which
contains a reference to the tracker; the tracker returns the list of
nodes storing F, and the client downloads pieces from K out of the
N nodes (Node 1 … Node N).]
5. Structured P2P Networks – Chord
System architectures
Structured P2P architectures
• The overlay is constructed by a deterministic
procedure.
• Distributed Hash Tables (DHTs):
– Each data item is assigned a key in a key space.
– Each node is assigned a key in the same way.
– Idea: map the key of a data item to a unique node
key. The mapping is done by using a distance
metric
Structured P2P architectures
Chord
• Developed in 2001 at MIT
• Efficient node localization
– provable performance, proven correctness
• Chord is a distributed lookup protocol
– Distributed version of traditional hash tables
• Support of just one operation:
– given a key, Chord maps the key onto a node
• More concretely
– Chord provides the IP address of the node
responsible for the sought key
Structured P2P architectures
Chord – consistent hashing
• The Hash function assigns each node and key an
m-bit identifier using a base hash function such as
SHA-1 (Secure Hash Standard)
– ID(node) = hash(IP, Port)
– ID(key) = hash(key)
• Properties of consistent hashing:
– Load balance: all nodes receive roughly the same
number of keys
– When an n-th node joins (or leaves) the network, only
an O(1/N) fraction of the keys are moved to a different
location
Structured P2P architectures
Chord – consistent hashing
• In an m-bit space there are 2^m identifiers
• How to build the ring?
– Identifiers are ordered on an identifier circle
modulo 2^m
– The circle is called the Chord ring
• The key k is assigned to the node whose
identifier is equal to or follows k in the
identifier space
• This node is called the successor of k
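A minimal sketch of successor(k) over a sorted list of node IDs (Python; the m = 3 ring with nodes 0, 1, 3 matches the figure on the next slide):

import bisect

M = 3
RING = 2 ** M                              # identifier circle modulo 2^m

def successor(k: int, node_ids: list[int]) -> int:
    # First node whose identifier is equal to or follows k on the circle.
    ids = sorted(node_ids)
    i = bisect.bisect_left(ids, k % RING)
    return ids[i % len(ids)]               # wrap around past the largest ID

nodes = [0, 1, 3]
print(successor(1, nodes), successor(2, nodes), successor(6, nodes))   # 1 3 0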
Structured P2P architectures
Chord – consistent hashing
[Figure: identifier circle with m = 3 (identifiers 0–7), nodes 0, 1,
and 3, and keys 1, 2, and 6: successor(1) = 1, successor(2) = 3,
successor(6) = 0.]
Structured P2P architectures
Chord – node join and departure
• When a node n joins the network
– certain keys assigned to n’s successor now become
assigned to n
• When node n leaves the network
– all of its assigned keys are reassigned to n’s
successor.
Structured P2P architectures
Chord – node join
[Figure: node 6 joins the ring with nodes 0, 1, 3; certain keys
assigned to succ(6) (i.e., node 0), such as key 6, are now assigned
to node 6.]
Structured P2P architectures
Chord – node departure
[Figure: node 1 leaves the ring with nodes 0, 1, 3, 6; the keys
assigned to node 1 (e.g., key 1) are reassigned to succ(1), i.e.,
node 3.]
Structured P2P architectures
Chord – (simple) lookup
• A very small amount of routing information
suffices to implement consistent hashing in a
distributed environment
• If each node knows only how to contact its current
successor node on the identifier circle, all nodes
can be visited in linear order.
• Queries for a given identifier are passed
around the circle via these successor pointers until
they encounter the node that holds the key.
– Is this efficient?
Structured P2P architectures
Chord – (simple) lookup
• Pseudo code for finding the successor:
// ask node n to find the successor of id
n.find_successor(id)
  if (id ∈ (n, successor])
    return successor;
  else
    // forward the query around the circle
    return successor.find_successor(id);
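The same logic in runnable form (a Python sketch; in_interval implements membership in the half-open ring interval (n, successor], with wrap-around):

def in_interval(x: int, a: int, b: int, ring: int = 2 ** 3) -> bool:
    # x in (a, b] on the identifier circle; (a, a] is the full circle.
    if a == b:
        return True
    return 0 < (x - a) % ring <= (b - a) % ring

class Node:
    def __init__(self, ident: int):
        self.id = ident
        self.successor = self              # set properly once the ring is built

    def find_successor(self, key: int):
        if in_interval(key, self.id, self.successor.id):
            return self.successor
        return self.successor.find_successor(key)    # walk the circle: O(N) hops

# Ring 0 -> 1 -> 3 -> 0; lookup of key 7 starting at node 1 (as in the next figure):
n0, n1, n3 = Node(0), Node(1), Node(3)
n0.successor, n1.successor, n3.successor = n1, n3, n0
print(n1.find_successor(7).id)             # 1 -> 3 -> 0: prints 0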
Structured P2P architectures
Chord – (simple) lookup
• The path taken by a query from node 1 for key
7:
[Figure: lookup(7) issued at node 1 is forwarded along successor
pointers 1 → 3 → 0; node 0 is the successor of key 7.]
Structured P2P architectures
Chord – efficient lookup
• To accelerate lookups, each Chord node
maintains additional routing information
• Each node maintains a routing table with (at
most) m entries (where 2^m is the size of the
identifier space), called the finger table
• The i-th entry in the table at node n is:
FT_n[i] = successor(n + 2^(i-1))
– The distance from n increases exponentially
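Using the successor function sketched earlier, a finger table can be computed as follows (Python; m = 3 as in the figure below):

def finger_table(n: int, node_ids: list[int], m: int = 3) -> list[int]:
    # FT_n[i] = successor(n + 2^(i-1)) for i = 1..m.
    ring = 2 ** m
    return [successor((n + 2 ** (i - 1)) % ring, node_ids)
            for i in range(1, m + 1)]

print(finger_table(0, [0, 1, 3]))          # starts 1, 2, 4 -> [1, 3, 0]
print(finger_table(3, [0, 1, 3]))          # starts 4, 5, 7 -> [0, 0, 0]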
Structured P2P architectures
Chord – finger table
[Figure: finger tables on the ring with nodes 0, 1, 3 (m = 3) and
keys 1, 2, 6:
Node 0: start 1, 2, 4 → succ. 1, 3, 0 (holds key 6)
Node 1: start 2, 3, 5 → succ. 3, 3, 0 (holds key 1)
Node 3: start 4, 5, 7 → succ. 0, 0, 0 (holds key 2)]
Structured P2P architectures
Chord – finger table
• Each node stores information about:
– only a small number of other nodes
– knows more about nodes closely following it than about
nodes farther away
• A node’s finger table generally does not contain
enough information to determine the successor of an
arbitrary key k
• Repeated queries to nodes that immediately precede
the given key will eventually lead to the key’s successor
Structured P2P architectures
Chord – lookup with finger table
• How to perform lookup by exploiting the
finger table?
– Search the finger table for the highest node q that
precedes id:
q = max { q’ ∈ FT : q’ precedes id }
– Invoke find_successor from that node q
=> Number of messages: O(log N)!
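A sketch of the finger-based lookup (Python, reusing the in_interval helper from above; each hop forwards to the closest preceding finger, so the remaining distance roughly halves per step):

class FingerNode:
    def __init__(self, ident: int, m: int = 6):
        self.id = ident
        self.m = m
        self.fingers = []                  # i-th entry: successor(id + 2^(i-1))
        self.successor = self

    def closest_preceding_node(self, key: int):
        # Highest finger q that strictly precedes key.
        for f in reversed(self.fingers):
            if f.id != key and in_interval(f.id, self.id, key, ring=2 ** self.m):
                return f
        return self

    def find_successor(self, key: int):
        if in_interval(key, self.id, self.successor.id, ring=2 ** self.m):
            return self.successor
        q = self.closest_preceding_node(key)
        if q is self:                      # no closer finger known: fall back
            return self.successor.find_successor(key)
        return q.find_successor(key)       # O(log N) hops in expectation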
Structured P2P architectures
Chord – finger table
[Figure: get(28) performed on N48 in the ring N2, N6, N11, N17, N27,
N33, N42, N48, N50, N56, N63; the distance to peer nodes increases
exponentially, and N33 is the target node responsible for key 28.
Finger table of N11 (distance → succ.): +0 → N11, +1 → N17,
+2 → N17, +4 → N17, +8 → N27, +16 → N27, +32 → N48
Finger table of N42: +0 → N42, +1 → N48, +2 → N48, +4 → N48,
+8 → N50, +16 → N63, +32 → N11
Finger table of N27: +0 → N27, +1 → N33, +2 → N33, +4 → N33,
+8 → N42, +16 → N48, +32 → N63
Data in the arc preceding N27 is stored on node N27.]
Structured P2P architectures
Chord – node join with finger table
[Figure: node 6 joins the ring with nodes 0, 1, 3; certain keys
assigned to succ(6) (i.e., node 0) are now assigned to node 6.
Update the finger tables!
Node 0: start 1, 2, 4 → succ. 1, 3, 6
Node 1: start 2, 3, 5 → succ. 3, 3, 6
Node 3: start 4, 5, 7 → succ. 6, 6, 0
Node 6: start 7, 0, 2 → succ. 0, 0, 3]
Structured P2P architectures
Chord – node departure (1) with finger table
[Figure: node 1 leaves the ring with nodes 0, 1, 3, 6; the keys
assigned to node 1 move to succ(1) (i.e., node 3).
Update the finger tables!
Node 0: start 1, 2, 4 → succ. 3, 3, 6
Node 3: start 4, 5, 7 → succ. 6, 6, 0
Node 6: start 7, 0, 2 → succ. 0, 0, 3]
Structured P2P architectures
Chord – maintenance
• Basic “stabilization” protocol is used to keep
nodes’ successor pointers up to date
– this is sufficient to guarantee correctness of
lookups
• Those successor pointers can then be used to
verify the finger table entries
• Every node runs stabilize periodically to find
newly joined nodes
Structured P2P architectures
Chord – stabilization
• “Stabilization” protocol contains 6 functions:
– create()
– join()
– stabilize()
– notify()
– fix_fingers()
– check_predecessor()
Structured P2P architectures
Chord – join
• When node n first starts, it calls n.join(n’),
where n’ is any known Chord node.
• The join() function asks n’ to find the
immediate successor of n.
• join() does not make the rest of the network
aware of n.
Structured P2P architectures
Chord – stabilization
// create a new Chord ring.
n.create()
  predecessor = nil;
  successor = n;

// join a Chord ring containing node n’.
n.join(n’)
  predecessor = nil;
  successor = n’.find_successor(n);
Structured P2P architectures
Chord – stabilization
• Each time node n runs stabilize()
– it asks its successor for its predecessor p,
– and it decides whether p should be n’s successor instead.
• stabilize() notifies node n’s successor of n’s
existence, giving the successor the chance to
change its predecessor to n.
• The successor does this only if it knows of no
closer predecessor than n.
Structured P2P architectures
Chord – stabilization
// called periodically. verifies n’s immediate
// successor, and tells the successor about n.
n.stabilize()
  x = successor.predecessor;
  if (x ∈ (n, successor))
    successor = x;
  successor.notify(n);

// n’ thinks it might be our predecessor.
n.notify(n’)
  if (predecessor is nil or n’ ∈ (predecessor, n))
    predecessor = n’;
Structured P2P architectures
Chord – join and stabilization
[Figure: ring segment np → n → ns; succ(np) changes from ns to n,
pred(ns) changes from np to n.]
• n joins
– predecessor = nil
– n acquires ns as successor via some n’
• n runs stabilize
– n notifies ns of being its new predecessor
– ns acquires n as its predecessor
• np runs stabilize
– np asks ns for its predecessor (now n)
– np acquires n as its successor
– np notifies n
– n acquires np as its predecessor
• All predecessor and successor pointers are now
correct
• Fingers still need to be fixed, but old fingers will
still work
Structured P2P architectures
Chord – fix finger
• Each node periodically calls fix_fingers() to make
sure its finger table entries are correct.
• It is how new nodes initialize their finger
tables
• It is how existing nodes incorporate new
nodes into their finger tables.
Structured P2P architectures
Chord – fix finger
// called periodically. refreshes finger table entries.
// next stores the index of the finger to fix
n.fix_fingers()
  next = next + 1;
  if (next > m)
    next = 1;
  finger[next] = find_successor(n + 2^(next-1));
Structured P2P architectures
Chord – node failures
• Key step in failure recovery is maintaining correct
successor pointers
• To help achieve this, each node maintains a
successor-list of its r nearest successors on the ring
• If node n notices that its successor has failed, it
replaces it with the first live entry in the list
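A minimal sketch of that recovery step (Python; successor_list, is_alive, and R are assumptions standing in for the r-entry list and a ping-with-timeout probe):

R = 3                                       # keep r = 3 nearest successors

def is_alive(peer) -> bool:
    # Hypothetical liveness probe; in practice a ping with a timeout.
    return getattr(peer, "alive", True)

def repair_successor(node):
    # Replace a failed successor with the first live entry of the successor list.
    for candidate in node.successor_list:
        if is_alive(candidate):
            node.successor = candidate
            node.successor_list = [candidate] + candidate.successor_list[:R - 1]
            return
    raise RuntimeError("all r successors failed")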
Structured P2P architectures
Chord – experimental results
• Latency grows slowly with
the total number of nodes
• Path length for lookups is
about ½ log₂ N
• Chord is robust in the face of
multiple node failures
Structured P2P architectures
Chord – further details
Ion Stoica, Robert Morris, David R. Karger, M. Frans Kaashoek, and
Hari Balakrishnan. Chord: A scalable peer-to-peer lookup service
for internet applications. In SIGCOMM, pages 149–160, 2001.
http://pdos.csail.mit.edu/papers/chord:sigcomm01/chord_sigcomm.pdf
Chord is available at: http://pdos.lcs.mit.edu/chord/
Legal and social aspects?
Take home
• Peer-to-Peer
– Structured (Chord)
• Logarithmic lookup via finger tables
• 100% success for lookup
• Node join and leave
• How to do lookup
– Unstructured
• Centralized (Napster, Bittorrent)
• Decentralized (Gnutella)
– Only flooding/random walk possible
– Sometimes with super peers