Download Mr. Atif_Kamal_P2P Routing Algorithms

Distributed Computing
Peer to Peer Computing
Chapter 10:
PEER TO PEER SYSTEMS
What is peer-to-peer (P2P) computing?

Webster definition


Peer: one that is of equal standing with another
Computing between equals
Intro to P2P Systems

The scope for expanding popular services by adding to the
number of computers hosting them is limited when all
the hosts must be owned & managed by the service
provider

Administration and fault recovery costs

Bandwidth that can be provided to a single server
Major service providers all face this problem with varying
severity

Intro to P2P Systems

P2P applications exploit resources available at the edges of
the Internet
  Storage, content, cycles, human presence
Traditional client-server systems provide access to such resources, but only on
a single machine or on tightly coupled servers
This centralized design requires few decisions about the placement &
management of resources
In P2P, algorithms for the placement and subsequent retrieval
of information objects are a key aspect of the system design. The result is
a system which is
  Fully decentralized & self-organizing
  Able to dynamically balance the storage and processing loads between
  all the participating computers as they join and leave
P2P Design Characteristics





Their design ensures that each user contributes resources to
the system
Although they may differ in the resources that they contribute, all
the nodes in a peer-to-peer system have the same functional
capabilities and responsibilities
Their correct operation does not depend on the existence of
any centrally administered systems
They can be designed to offer a limited degree of anonymity to
the providers and users of resources
A key issue for their efficient operation is the choice of
algorithm for placing and retrieving data on many hosts
  Balance of load
  Availability without much overhead
  Participants' availability to the system is unpredictable
Evolution of P2P

Can be grouped in three generations

First generation – Napster music exchange service [OpenNap
2001]
Second generation – file sharing applications with greater
scalability, anonymity & fault tolerance
  Gnutella, Kazaa, Freenet
Third generation – developed with the help of middleware layers
  Application-independent management of distributed resources on
  a global scale
  E.g. Pastry, Tapestry, CAN, Chord, JXTA
  Provide delivery of requests in a bounded
  number of network hops
  Place replicas of resources, keeping in mind volatile availability,
  trustworthiness & locality
Cluster, Grid, P2P: Characteristics
Characteristic          | Cluster             | Grid                                    | P2P
Population              | Commodity computers | High-end computers                      | Edge of network (desktop PC)
Ownership               | Single              | Multiple                                | Multiple
Discovery               | Membership services | Centralised index & decentralised info  | Decentralised
User management         | Centralised         | Decentralised                           | Decentralised
Resource management     | Centralised         | Distributed                             | Distributed
Allocation/Scheduling   | Centralised         | Decentralised                           | Decentralised
Inter-operability       | VIA based?          | No standards yet                        | No standards
Single System Image     | Yes                 | No                                      | No
Scalability             | 100s                | 1000?                                   | Millions? [@Home]
Capacity                | Guaranteed          | Varies, but high                        | Varies
Throughput              | Medium              | High                                    | Very High
Speed (Lat., Bandwidth) | Low, high           | High, Low                               | High, Low
Example P2P Applications




SETI@home
Napster
Gnutella
FastTrack
SETI@home
SETI@home uses the National Astronomy
and Ionospheric Center's 305 meter
telescope at Arecibo, Puerto Rico.
A screenshot of the SETI@home
client program.
2.4 million volunteers as of Oct. 2000
Distributed computation

Best-known example of usage & exploitation

SETI@home (Search for Extra-Terrestrial Intelligence)
Partitions a stream of digitized radio telescope data into 107-second
work units, each about 350 KB, and distributes them to client
computers
Each work unit is redundantly distributed to 3-4 users, to guard
against errors & bad nodes
Coordination work is handled by a single server
3.91 million PCs had participated by 2002
In one year they processed 221 million work units, delivering
27.36 teraflops of computing power on average
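The redundant distribution of work units can be illustrated with a minimal sketch. The round-robin assignment and the majority vote used to accept a result are assumptions made for illustration; the slides do not say how SETI@home actually assigns or reconciles work units, and the function names are hypothetical.

```python
from collections import Counter


def assign_redundantly(work_units, clients, copies=3):
    """Give each work unit to `copies` clients, round-robin.

    The clients are distinct as long as len(clients) >= copies.
    """
    assignment = {}
    for i, unit in enumerate(work_units):
        assignment[unit] = [clients[(i + k) % len(clients)] for k in range(copies)]
    return assignment


def accept_result(results):
    """Return the value reported by a strict majority of clients, else None."""
    value, count = Counter(results).most_common(1)[0]
    return value if count > len(results) / 2 else None


assignment = assign_redundantly(["wu0", "wu1"], ["c0", "c1", "c2", "c3"])
good = accept_result([42, 42, 7])      # one bad node is outvoted
undecided = accept_result([1, 2, 3])   # no majority: the unit would be reissued
```

With three copies per unit, a single faulty or malicious client cannot change the accepted result.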
Napster




Centralized
MP3 file sharing
Clients/Peers hold the files
Servers hold the catalog and broker relationships



Clients upload IP address, music file shared, and requests
Clients request locations where requests can be met
File transfer is P2P – proprietary protocol
Napster & its Legacy
[Figure: peers and Napster servers, each server holding an index. Numbered steps:
1. File location request
2. List of peers offering the file
3. File request
4. File delivered
5. Index update]
Napster: peer-to-peer file sharing with a centralized,
replicated index
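The role of the central index can be sketched in a few lines. This is only an illustration of the idea of a server that maps file names to the peers offering them; the `IndexServer` class, its method names and the addresses are hypothetical, and the real Napster protocol was proprietary.

```python
class IndexServer:
    """Hypothetical central index: file name -> peers that offer the file."""

    def __init__(self):
        self.index = {}

    def register(self, peer, files):
        # Index update: a peer announces the files it shares
        for name in files:
            self.index.setdefault(name, set()).add(peer)

    def locate(self, name):
        # File location request: answer with the list of peers offering the file
        return sorted(self.index.get(name, ()))


server = IndexServer()
server.register("10.0.0.1", ["a.mp3"])
server.register("10.0.0.2", ["a.mp3", "b.mp3"])
offers = server.locate("a.mp3")
```

The file itself never passes through the server: the requester picks one of the returned peers and transfers the file directly, which is why only the index was a central bottleneck.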
Napster & its Legacy



Architecture included centralized index servers, the main
reason for its defeat in the lawsuit
Anonymity of receiver & provider
Lessons learned from Napster



Feasible to develop a large-scale P2P service
It can scale the resources to meet the need, based on locality
Limitations


Consistency between replicas was not strong, but for music files this
requirement is not strict
Index server for accessing the resources was a bottleneck
Gnutella



Completely decentralized – no servers with catalogs
Shares any files
A Gnutella node is called a SERVENT (it acts as both server and client)
  Issues queries and views search results
  Accepts queries from other SERVENTs, checks them for a match
  against its database and responds with the corresponding results
Gnutella (cont)

Joining the network:



The new node connects to a well-known SERVENT
It then sends a PING message to discover other nodes
PONG messages are sent in reply by hosts offering
connections to the new node
Direct connections are then made
Gnutella (cont)

Searching a file:


A node broadcasts its QUERY to all its peers, who in turn
broadcast it to their peers
Nodes route QUERYHIT messages, containing the location details, back along
the QUERY path to the sender
To download a file, a direct connection is made using the details
of the host in the QUERYHIT message
Gnutella (cont)



Gnutella broadcasts its messages.
To prevent flooding, a TTL (time to live) is introduced.
To prevent forwarding the same message twice, each servent
maintains a list of recently seen messages.
Gnutella (cont)
[Figure: servents A to J and a GnuCache node, with the links between them labelled (1) to (4)]
User A connects to the GnuCache to get the list of available servents already
connected in the network
GnuCache sends back the list to user A
User A sends the request message GNUTELLA CONNECT to user B
User B replies with the GNUTELLA OK message, granting user A permission to join the
network
Gnutella (cont)

Typical query scenario:





A sends a query message to its neighbor, B
B first checks that the message is not an old one
It then checks for a match with its local data
If there is a match, it sends the queryHit message back to user
A
Otherwise B decrements the TTL by 1 and forwards the query message
to users C, D, and E
C, D, and E perform the same steps as user B and forward
the query message further to users F, G, H, and I
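The query scenario above can be sketched as a small in-memory simulation. This is a minimal sketch, not the Gnutella wire protocol: the `Servent` class, the `query` method, the integer message IDs and the example topology are all assumptions made for illustration. It follows the slide's description, in which a servent that finds a match answers and does not forward the query further.

```python
class Servent:
    """Hypothetical Gnutella-style node: floods queries, drops duplicates, honours TTL."""

    def __init__(self, name, files=()):
        self.name = name
        self.files = set(files)
        self.neighbors = []
        self.seen = set()              # IDs of recently seen messages

    def query(self, msg_id, keyword, ttl, sender=None):
        """Return the names of the servents that answered with a query hit."""
        if msg_id in self.seen:        # an old message: drop it
            return []
        self.seen.add(msg_id)
        if keyword in self.files:      # match: report a hit, do not forward
            return [self.name]
        ttl -= 1                       # no match: decrement TTL before forwarding
        if ttl <= 0:
            return []
        hits = []
        for peer in self.neighbors:
            if peer is not sender:
                hits.extend(peer.query(msg_id, keyword, ttl, sender=self))
        return hits


def connect(a, b):
    a.neighbors.append(b)
    b.neighbors.append(a)


# Assumed topology: A-B, B-{C,D,E}, C-F, E-I, plus a C-D link that creates a cycle
nodes = {n: Servent(n) for n in "ABCDEFI"}
nodes["F"].files.add("song.mp3")
nodes["I"].files.add("song.mp3")
for x, y in ["AB", "BC", "BD", "BE", "CF", "EI", "CD"]:
    connect(nodes[x], nodes[y])

hits = nodes["A"].query(msg_id=1, keyword="song.mp3", ttl=4)
```

The C-D link shows why the list of seen messages matters: D receives the same query from both B and C but processes it only once. A smaller TTL, for example 2, would stop the query before it reaches F or I.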
Gnutella (cont)

Problems


Broadcast messages congest the network
Loss of reply packets (dynamic environment)
FastTrack


Hybrid between centralized and decentralized
Has 2 tiers of control:

Ordinary nodes that connect to super nodes in a centralized
fashion
Super nodes that connect to each other in a decentralized
manner
FastTrack (cont)

Joining the network? - Bootstrapping node
Querying?

Problems (Like Gnutella)



Broadcast messages between super nodes
Loss of reply packets
Some key issues

Scalability

Networks can grow to millions of nodes
Challenge in achieving efficient peer and resource discovery
High amount of query/response traffic

Availability

Potential for commercial content provision
Such services require high availability and accessibility

Anonymity

What is the right level of anonymity?
Some key issues (cont)

Security


Due to open nature, have to assume environment is hostile
Concerns include:




Privacy and anonymity
File authenticity
Threats like worms and viruses
Fault Resilience

The system must still be able to function even when several
important nodes go off-line.
Some key issues (cont)

Standards and Interoperability



Lack of standards leads to poor interoperability between
applications
Can be improved by using common protocols
Copyright / Access Control



Classic case of Napster being shut down
Other applications have learned to get around the law
Possibility of paid access in future
Some key issues (cont)

Quality of Service (QoS)

The metrics to be used are not clearly defined
Tradeoff between achieving QoS and costs

Complexity of Queries

Must be able to support query languages of varying degrees of
expressiveness
Simple keywords to SQL-like searches

Search Mechanism

Different search algorithms are used to reduce search time
and maximize search space
Some key issues (cont)

Load Balancing

existence of hot-spots (overloaded nodes) due to:





uneven node distribution throughout logical space
uneven object distribution among nodes
uneven demand distribution among objects
query and routing hot-spots
Self-organization


Ability to adapt itself to the dynamic nature of the Internet
Depends on the architecture of the system
P2P Middleware - GUID







Resources are identified by Globally Unique Identifiers (GUIDs)
A GUID is derived from a secure hash of the resource
The hash makes a resource self-certifying
A client receiving the resource can check the hash
This requires that the states of resources are immutable
P2P systems are inherently best suited to the storage of
immutable objects – music files, images
Sharing of mutable objects can be managed by a set of trusted
servers that manage the sequence of versions, e.g. OceanStore,
Ivy – more in section 10.6
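A short sketch of what "self-certifying" means in practice: the receiver recomputes the hash of the bytes it was given and compares it with the GUID it asked for. SHA-1 from Python's `hashlib` is an assumed choice of secure hash here, and the function names are hypothetical; the slides do not name a specific hash function.

```python
import hashlib


def make_guid(content: bytes) -> str:
    """Derive a GUID from a secure hash of the object's bytes (SHA-1 assumed)."""
    return hashlib.sha1(content).hexdigest()


def verify(guid: str, content: bytes) -> bool:
    """A client receiving the resource can check it against the GUID it requested."""
    return make_guid(content) == guid


song = b"example immutable object"
guid = make_guid(song)
ok = verify(guid, song)                      # untampered copy
tampered = verify(guid, song + b" (edited)") # any change alters the hash
```

This also shows why the scheme needs immutable objects: changing the content changes the hash, so an updated object would no longer match its GUID.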
Overlay routing vs IP routing
Scale
  IP: IPv4 is limited to 2^32 addressable nodes. The IPv6 name space is much more
  generous (2^128), but addresses in both versions are hierarchically structured and
  much of the space is pre-allocated according to administrative requirements.
  Application-level routing overlay: Peer-to-peer systems can address more objects.
  The GUID name space is very large and flat (>2^128), allowing it to be much more
  fully occupied.
Load balancing
  IP: Loads on routers are determined by network topology and associated traffic
  patterns.
  Application-level routing overlay: Object locations can be randomized and hence
  traffic patterns are divorced from the network topology.
Network dynamics (addition/deletion of objects/nodes)
  IP: IP routing tables are updated asynchronously on a best-efforts basis with time
  constants on the order of 1 hour.
  Application-level routing overlay: Routing tables can be updated synchronously or
  asynchronously with fractions of a second delays.
Fault tolerance
  IP: Redundancy is designed into the IP network by its managers, ensuring tolerance
  of a single router or network connectivity failure. n-fold replication is costly.
  Application-level routing overlay: Routes and object references can be replicated
  n-fold, ensuring tolerance of n failures of nodes or connections.
Target identification
  IP: Each IP address maps to exactly one target node.
  Application-level routing overlay: Messages can be routed to the nearest replica of
  a target object.
Security and anonymity
  IP: Addressing is only secure when all nodes are trusted. Anonymity for the owners
  of addresses is not achievable.
  Application-level routing overlay: Security can be achieved even in environments
  with limited trust. A limited degree of anonymity can be provided.
Peer-to-Peer Middleware

Key problem in P2P application design: “Provide a
mechanism to enable clients to access data resources
quickly & dependably wherever they are located
throughout the network”


Napster used a unified index
2nd generation P2P file systems like Gnutella & Freenet employ
partitioned & distributed indexes, but the algorithms used are
specific to each system
P2P Middleware

P2P middleware is designed specifically for the placement & subsequent
location of the distributed objects managed by different P2P systems

Functional Requirements
  Simplified construction of distributed services
  Locate resources and communicate with resource providers & consumers
  Add & remove resources at will, at any time
  API for P2P programmers
Non-Functional Requirements
  Global scalability
  Load balancing
  Optimization for local interaction between neighboring peers
  Accommodating highly dynamic host availability
  Security of data in an environment with heterogeneous trust
  Anonymity, deniability and resistance to censorship
Routing Overlay



In P2P we cannot maintain, at every client node, a database
giving the location of all the resources
Resource location knowledge must be partitioned and
distributed
Each node is made responsible for maintaining



detailed knowledge of the locations of nodes and objects in a
portion of the namespace
As well as general knowledge of the topology of the entire name
space
High degree of replication of this knowledge is necessary to
ensure dependability in the face of the volatile availability of hosts
and intermittent network connectivity.
Distribution of information in a routing
overlay
Routing overlay takes the responsibility for locating nodes and objects
[Figure: nodes A, B, C and D, each shown with its own routing knowledge
(A's, B's, C's and D's routing knowledge); the legend distinguishes objects from nodes]
Routing Overlay



Ensures that any node can access any object by routing
each request through a sequence of nodes, exploiting
knowledge at each of them to locate the destination
object
It also maintains knowledge of the location of all the
replicas of the object and delivers requests to the nearest live
node
GUIDs used to identify nodes and objects are an example
of pure names (“opaque identifiers”)

A GUID does not reveal anything about the location of the object
Tasks of Routing Overlay


Main task of the Routing Overlay
  A client wishing to invoke an operation on an object submits a request
  including the object’s GUID to the routing overlay, which routes the
  request to a node at which a replica of the object resides
Other tasks of the Routing Overlay
  A node wishing to make a new object available to a P2P service computes a
  GUID for the object and announces it to the routing overlay, which then
  ensures that the object is reachable by all other clients.
  When clients request the removal of an object from the service, the
  routing overlay must make it unavailable.
  Nodes may join or leave the service. When a node joins the service, the
  routing overlay arranges for it to assume some of the responsibilities of
  other nodes. When a node leaves, its responsibilities are distributed
  amongst the other nodes.
Basic programming interface for a distributed hash
table (DHT) as implemented by the PAST API over
Pastry



put(GUID, data)
The data is stored in replicas at all nodes responsible for
the object identified by GUID.
remove(GUID)
Deletes all references to GUID and the associated data.
value = get(GUID)
The data associated with GUID is retrieved from one of
the nodes responsible for it.
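The three DHT operations can be mimicked by a single-process stand-in. This is a sketch of the interface only, not of PAST or Pastry: the `ToyDHT` class, the rule "the k nodes with the numerically closest IDs are responsible", the replica count `k` and the 16-bit GUID space are all assumptions chosen to keep the example small.

```python
import hashlib


class ToyDHT:
    """Single-process stand-in for a DHT; node IDs and k are illustrative assumptions."""

    def __init__(self, node_ids, k=2):
        self.stores = {nid: {} for nid in node_ids}   # node ID -> local key/value store
        self.k = k                                    # replicas per object

    def _responsible(self, guid):
        # Assumed rule: the k nodes whose IDs are numerically closest to the GUID
        return sorted(self.stores, key=lambda nid: abs(nid - guid))[: self.k]

    def put(self, guid, data):
        for nid in self._responsible(guid):
            self.stores[nid][guid] = data

    def get(self, guid):
        for nid in self._responsible(guid):
            if guid in self.stores[nid]:
                return self.stores[nid][guid]
        return None

    def remove(self, guid):
        for store in self.stores.values():
            store.pop(guid, None)


def guid_of(data: bytes) -> int:
    # 16-bit GUID space purely to keep the example readable
    return int.from_bytes(hashlib.sha1(data).digest()[:2], "big")


dht = ToyDHT(node_ids=[0x1000, 0x5000, 0x9000, 0xD000], k=2)
g = guid_of(b"hello")
dht.put(g, b"hello")
value = dht.get(g)
replicas = sum(g in store for store in dht.stores.values())
```

In this model the DHT layer itself decides where the data lives, which is the main difference from the DOLR interface that follows.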
Basic programming interface for distributed object
location and routing (DOLR) as implemented by
Tapestry



publish(GUID)
GUID can be computed from the object (or some part of it,
e.g. its name). This function makes the node performing a
publish operation the host for the object corresponding to
GUID.
unpublish(GUID)
Makes the object corresponding to GUID inaccessible.
sendToObj(msg, GUID, [n])
Following the object-oriented paradigm, an invocation message
is sent to an object in order to access it. This might be a
request to open a TCP connection for data transfer or to
return a message containing all or part of the object’s state.
The final optional parameter [n], if present, requests the
delivery of the same message to n replicas of the object.
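A comparable sketch for the DOLR interface: here the layer records which nodes host an object and routes messages to them, rather than storing the data itself. The `ToyDOLR` class is hypothetical and is not Tapestry's implementation; the explicit `node` argument stands in for "the node performing the publish operation", which is implicit in the interface above.

```python
class ToyDOLR:
    """Illustrative stand-in for a DOLR layer: it maps GUIDs to hosting nodes, not to data."""

    def __init__(self):
        self.hosts = {}                    # GUID -> list of hosting node names

    def publish(self, node, guid):
        self.hosts.setdefault(guid, []).append(node)

    def unpublish(self, node, guid):
        if node in self.hosts.get(guid, []):
            self.hosts[guid].remove(node)

    def send_to_obj(self, msg, guid, n=1):
        """Deliver msg to up to n replicas; returns the (node, msg) pairs delivered."""
        return [(node, msg) for node in self.hosts.get(guid, [])[:n]]


dolr = ToyDOLR()
dolr.publish("nodeA", 0x4378)
dolr.publish("nodeB", 0x4378)
delivered = dolr.send_to_obj("open", 0x4378, n=2)
```

A real DOLR would pick the replicas closest to the sender in network terms; this sketch simply takes them in publication order.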
Overlay Case Study: Pastry



All the nodes & objects are assigned 128-bit GUIDs
  For nodes: computed by applying a secure hash function to the public key of
  the node
  For objects: computed by applying a secure hash function to the object's
  name or some part of its stored state
  Resulting GUIDs are randomly distributed in the range 0 to 2^128 - 1
  They provide no clue as to the values from which they were computed, and clashes between
  GUIDs for different nodes or objects are extremely unlikely; Pastry can still
  detect & manage this unlikely event
In a network with N participating nodes, the Pastry algorithm will correctly route a
message addressed to any GUID in O(log N) steps
If the GUID identifies a node which is active, the message is delivered to that node;
otherwise it is delivered to the active node with the numerically closest GUID
Pastry



Routing steps involve the use of an underlying transport protocol (normally
UDP) to transfer the message to a Pastry node that is closer to its
destination
Closeness in Pastry refers to an entirely artificial space – the space of
GUIDs
  Real transport of a message across the Internet between two Pastry nodes
  may require many IP hops.
  To choose better paths, Pastry uses a locality metric based on network distance
  in the underlying network (hop count, round-trip latency) to select
  appropriate neighbors when setting up the routing tables used at each
  node
Pastry is fully self-organizing
  New nodes get information from neighbors to construct their tables
  Nodes can detect the absence of a node and update their tables
Pastry: Routing Algo

Explanation in two stages


Stage 01: simplified form of the algorithm, which routes messages
correctly but inefficiently without a routing table
Stage 02: full routing algorithm, which routes a request to any node in
O(log N) messages
Figure 10.6: Circular routing alone is correct
but inefficient. Based on Rowstron and Druschel [2001]
[Figure: the GUID space drawn as a ring running from 0 to FFFFF....F (2^128 - 1),
with live nodes including 65A1FC, D13DA3, D467C4, D46A1C and D471F1]
The dots depict live nodes.
The space is considered as circular: node 0 is adjacent
to node (2^128 - 1). The diagram illustrates the
routing of a message from node 65A1FC to D46A1C
using leaf set information alone, assuming leaf sets
of size 8 (l = 4). This is a degenerate type of routing
that would scale very poorly; it is not used in
practice.
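Stage 01 can be sketched as follows: every node knows only the l live nodes on each side of it on the ring and forwards the message to whichever of them is closest to the destination. This is a minimal sketch under stated assumptions: a 24-bit GUID space (six hex digits, like the shortened GUIDs in the figure), evenly spaced nodes, and a global list of nodes used to compute each leaf set. The function names are hypothetical.

```python
SPACE = 2 ** 24            # 6 hex digits, like the figure's shortened GUIDs (assumption)


def ring_distance(a, b):
    """Distance between two GUIDs on the circular space."""
    d = abs(a - b) % SPACE
    return min(d, SPACE - d)


def leaf_set(nodes, me, l):
    """The l live nodes on each side of `me` on the sorted ring."""
    ring = sorted(nodes)
    i = ring.index(me)
    return {ring[(i + k) % len(ring)] for k in range(-l, l + 1) if k != 0}


def route_with_leaf_sets(nodes, source, dest, l=4):
    """Return the path followed when every hop uses only its leaf set."""
    path = [source]
    current = source
    while True:
        candidates = leaf_set(nodes, current, l) | {current}
        # Tie-break on the GUID so that two equally close nodes cannot bounce a message
        nxt = min(candidates, key=lambda n: (ring_distance(n, dest), n))
        if nxt == current:          # nobody known is closer: deliver here
            return path
        path.append(nxt)
        current = nxt


step = SPACE // 64
nodes = [i * step for i in range(64)]          # 64 evenly spaced live nodes
path = route_with_leaf_sets(nodes, source=nodes[0], dest=nodes[20], l=4)
```

Each hop moves at most l nodes around the ring, so the number of hops grows linearly with the number of nodes, which is why this form of routing scales poorly.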
Pastry: Routing Algo



Each Pastry node maintains a tree-structured routing table
giving GUIDs and IP addresses for a set of nodes spread
throughout the entire range of 2^128 possible values, with
increased density of coverage for GUIDs numerically close
to its own
Fig 10.7 shows the structure of the routing table
Fig 10.8 illustrates the actions of the routing algorithm
Routing tables

are structured as follows

GUIDs are viewed as hexadecimal values & the table classifies
GUIDs based on their hexadecimal prefixes
The table has as many rows as there are hexadecimal digits in a
GUID, so for our prototype there are 128/4 = 32 rows
Each row contains 15 entries, one for each possible value of the next hexadecimal
digit other than the node's own
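The prefix-based classification can be made concrete with a small sketch. It uses six-digit GUIDs like those in the figures instead of full 128-bit values, and the function names and the "keep the first candidate seen" policy are assumptions; real Pastry would prefer the candidate that is closest in network terms.

```python
def shared_prefix_len(a: str, b: str) -> int:
    """Number of leading hexadecimal digits that two GUIDs have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def build_routing_table(own: str, known: list[str]) -> list[dict[str, str]]:
    """Row p maps a hex digit to one known GUID sharing exactly p leading digits with `own`."""
    table = [dict() for _ in range(len(own))]
    for guid in known:
        if guid == own:
            continue
        p = shared_prefix_len(own, guid)
        table[p].setdefault(guid[p], guid)   # keep the first candidate seen (assumption)
    return table


table = build_routing_table(
    "65A1FC", ["D46A1C", "D13DA3", "6F0000", "65B222", "65A1F0"]
)
rows_for_128_bits = 128 // 4
```

Row 0 holds nodes that differ from 65A1FC in the first digit, row 1 those that share only the first digit, and so on, so lower rows describe nodes that are progressively closer in GUID space.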
Figure 10.7: First four rows of a Pastry
routing table
Figure 10.8: Pastry routing example. Based
on Rowstron and Druschel [2001]
[Figure: the GUID ring from 0 to FFFFF....F (2^128 - 1), with nodes
65A1FC, D13DA3, D4213F, D462BA, D467C4, D46A1C and D471F1]
Routing a message from node 65A1FC to D46A1C.
With the aid of a well-populated routing table the
message can be delivered in ~ log16(N) hops.
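A quick calculation shows how much the routing table helps compared with leaf-set-only routing. The leaf-set estimate is a rough worst case under the assumption that the target is half the ring away and each hop skips at most l nodes; both function names are hypothetical.

```python
import math


def leaf_set_only_hops(n_nodes: int, l: int) -> int:
    """Rough worst case: the target is half the ring away, each hop skips at most l nodes."""
    return math.ceil(n_nodes / (2 * l))


def routing_table_hops(n_nodes: int, base: int = 16) -> int:
    """About log_16(N) hops when routing tables are well populated."""
    return math.ceil(math.log(n_nodes, base))


slow = leaf_set_only_hops(1_000_000, l=4)
fast = routing_table_hops(1_000_000)
```

For a million nodes this gives roughly 125,000 hops with leaf sets alone against about 5 hops with routing tables.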
Figure 10.9: Pastry’s routing
algorithm
To handle a message M addressed to a node D (where R[p,i] is the element at column i,
row p of the routing table):
1. If (L_-l < D < L_l) { // the destination is within the leaf set or is the current node.
2.     Forward M to the element L_i of the leaf set with GUID closest to D or the current
       node A.
3. } else { // use the routing table to despatch M to a node with a closer GUID
4.     Find p, the length of the longest common prefix of D and A, and i, the (p+1)th
       hexadecimal digit of D.
5.     If (R[p,i] ≠ null) forward M to R[p,i] // route M to a node with a longer common
       prefix.
6.     else { // there is no entry in the routing table
7.         Forward M to any node in L or R with a common prefix of length p, but a
           GUID that is numerically closer.
       }
   }
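The decision made at a single node can be sketched in Python. This is an illustrative reading of the figure, not Pastry's implementation: it uses six-digit GUIDs, represents the routing table as a list of dictionaries (row p maps a hex digit to a GUID), and approximates the leaf-set range test by "D is no farther from A than the farthest leaf-set member". All names are hypothetical.

```python
SPACE = 16 ** 6   # six-digit GUIDs as in the figures; real Pastry uses 128 bits


def dist(a: str, b: str) -> int:
    """Circular numeric distance between two hexadecimal GUIDs."""
    d = abs(int(a, 16) - int(b, 16))
    return min(d, SPACE - d)


def prefix_len(a: str, b: str) -> int:
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def next_hop(A: str, D: str, leaf: list, table: list) -> str:
    """Return the GUID that node A forwards to; A itself means 'deliver here'."""
    if A == D:
        return A
    # Lines 1-2: destination within the leaf set range (approximated, see lead-in)
    reach = max((dist(x, A) for x in leaf), default=0)
    if dist(D, A) <= reach:
        return min(leaf + [A], key=lambda x: dist(x, D))
    # Lines 3-5: use the routing table entry for the next digit of D
    p = prefix_len(A, D)
    entry = table[p].get(D[p])
    if entry is not None:
        return entry
    # Lines 6-7: fall back to any known node that is numerically closer
    known = leaf + [g for row in table for g in row.values()]
    better = [x for x in known
              if prefix_len(x, D) >= p and dist(x, D) < dist(A, D)]
    return min(better, key=lambda x: dist(x, D)) if better else A


empty = [{} for _ in range(6)]
hop1 = next_hop("65A1FC", "D46A1C", ["65A000", "65A2FF"],
                [{"D": "D13DA3"}] + empty[1:])
hop2 = next_hop("D13DA3", "D46A1C", ["D13000", "D14000"],
                [{}, {"4": "D4213F"}] + empty[2:])
last = next_hop("D467C4", "D46A1C", ["D462BA", "D46A1C", "D471F1"], empty)
fallback = next_hop("65A1FC", "D46A1C", ["65A000", "65A2FF"],
                    [{"C": "C00000"}] + empty[1:])
```

The first two calls follow routing-table entries with ever longer common prefixes, the third finishes through the leaf set, and the last one shows the fallback when the required table entry is missing.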
Pastry’s routing algorithm

The algorithm will succeed in delivering the message M to its
destination because of lines 1, 2 & 7


They perform the actions described in stage 01
The remaining steps are designed to improve the
algorithm’s performance by reducing the number of
hops required
Host Integration


The new node computes its GUID
It contacts a nearby node – address of nearby node ??

X is the new node; it sends a join request to a nearby node A
A dispatches the join as a normal message addressed to X's GUID, using the Pastry
algorithm; it arrives at the node numerically nearest to X. Let that node be Z
A, Z and all the nodes (B, C …) through which the message was routed
to Z
  Send the relevant parts of their routing tables (RTs) and leaf sets to X
X examines these and constructs its own routing table & leaf set
X can request additional information from other nodes
X's leaf set should be very similar to Z's leaf set
Once X's RT is constructed, it sends its leaf set and RT entries to the
other nodes so that they can update their RTs
RT updates

Node failure or departure



A node is considered failed when its immediate neighbors are
unable to contact it
The node which discovers the failure looks for the
nearest live node to the failed one and requests its leaf set
This leaf set will partly overlap the failed node's leaf
set. The discovering node chooses the best node from this leaf
set to replace the failed node.
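The repair step can be sketched as follows. This is a simplified illustration with integer GUIDs and plain (non-circular) numeric distance; the `repair_leaf_set` function and the `fetch_leaf_set` callback, which stands in for a network request to another node, are hypothetical.

```python
def repair_leaf_set(own, leaf, failed, fetch_leaf_set, size):
    """Rebuild a leaf set after `failed` stops responding.

    leaf: current leaf set (integer GUIDs); fetch_leaf_set(node) returns that
    node's leaf set and stands in for a network request (assumption).
    """
    survivors = [n for n in leaf if n != failed]
    # Ask the live node closest to the failed one: its leaf set overlaps the most
    donor = min(survivors, key=lambda n: abs(n - failed))
    candidates = set(survivors) | set(fetch_leaf_set(donor))
    candidates.discard(own)
    candidates.discard(failed)
    # Keep the `size` candidates numerically closest to this node
    return sorted(candidates, key=lambda n: (abs(n - own), n))[:size]


remote_leaf_sets = {110: [90, 100, 120, 130]}
new_leaf = repair_leaf_set(
    own=100, leaf=[80, 90, 110, 120], failed=120,
    fetch_leaf_set=remote_leaf_sets.__getitem__, size=4,
)
```

Here node 100 loses neighbor 120, asks 110 for its leaf set and learns about 130, which takes the place of the failed node.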
Self study




Locality
Fault tolerance
Dependability – MS Pastry
Evaluation of MS Pastry
Thanks
Dr. Raihan! remember & keepup ur promise