* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download Peer-to-peer applications fostered explosive growth in recent years
Airborne Networking wikipedia , lookup
Backpressure routing wikipedia , lookup
List of wireless community networks by region wikipedia , lookup
Recursive InterNetwork Architecture (RINA) wikipedia , lookup
Distributed operating system wikipedia , lookup
IEEE 802.1aq wikipedia , lookup
Everything2 wikipedia , lookup
Peer to Peer File Sharing Huseyin Ozgur TAN What is Peer-to-Peer? Every node is designed to(but may not by user choice) provide some service that helps other nodes in the network to get service  Each node potentially has the same responsibility  Sharing can be in different ways:    CPU cycles: SETI@Home Storage space: Napster, Gnutella, Freenet… P2P: Why so attractive?  Peer-to-peer applications fostered explosive growth in recent years.   Low cost and high availability of large numbers of computing and storage resources, Increased network connectivity  As long as these issues keep their importance, peer-to-peer applications will continue to gain importance Main Design Goals of P2P Ability to operate in a dynamic environment  Performance and scalability  Reliability  Anonymity: Freenet, Freehaven, Publius  Accountability: Freehaven, Farsite  First generation P2P routing and location schemes Napster, Gnutella, Freenet…  Intended for large scale sharing of data files  Reliable content location was not guaranteed  Self-organization and scalability: two issues to be addressed  Second generation P2P systems Pastry, Tapestry, Chord, CAN…  They guarantee a definite answer to a query in a bounded number of network hops.  They form a self-organizing overlay network.  They provide a load balanced, faulttolerant distributed hash table, in which items can be inserted and looked up in a bounded number of forwarding hops.  CAN  Content Addressable Network     distributed infrastructure based on hash tables serves for Internet scale networks Advantages    Scalable Fault-tolerant Self orginazing CAN  Current P2P systems at that time     Napster Gnutella Both not scalable Napster    Central server Expensive (for server) Vulnerable (single point of failure) CAN  Gnutella     De-centralized file location as well File location is based on location Not scalable Aim  Building a scalable distributed infrastructure for p2p networks CAN  Basic operations     Insertion Lookup Deletion Each node   stores a zone of the entire hash table stores information about its neighbors CAN - Design    Virtual d-dimensional Cartesian coordinate space on a d-torus The entire space is partitioned among all nodes The space is used to store key,value pairs   K -> P -> zone Each node holds info about each of its neigbors for efficient routing CAN - Routing   Works by following the straight line path between source and destination Local neighbor state is sufficient for routing    Complexity     CAN message includes dest. Coordinates A node routes it to the neighbor with closest to the destination coordinates (d/4)(n1/d) avg path length 2d neighbors for a node Good for scalability Many paths between source and destination  In case of crashes the node routes the message to the next best available path CAN - Construction  A new node that joins the system must be allocated its own zone   Achieved by an existing node splitting its zone in half, retaining half and handing the other half to the new node 3 Steps    New node must find a node already in CAN It must find a node whose zone will be split The neighbors of split zone must be notified CAN - Construction  Bootstrap     Finding a zone    Chooses randomly a point P in space and sends JOIN request destined P Destination splits its zone and key,value pairs are transferred Joining the routing     New node discovers an existing node CAN DNS name resolves to one or more nodes IP addresses which are believed to be in the system By querying the DNS new node gets a bootstrap node and from this node gets other nodes IP addresses The new node learns the IP addresses of the neighbors Both new and existing nodes update their neighbor set To inform others two nodes send an immediate update, followed by periodic refreshes Complexity  O(d) CAN – Maintenance  Node departure     Explicitly hand over its zone to one of its neighbors If node’s zones can be merged it is done If not, the responsibility is handed to the neighbor whose current zone is smallest Takeover (a node becomes unreachable)     One of its neighbors takes over the zone But the pairs are lost Nodes send periodic advertisements Absence of such adv. signals its failure   Each neighbor starts a timer When it expires node sends TAKEOVER message to all neighbors of failed node Chord  Distributed lookup protocol   Advantages       One operation : lookup = key -> node Load balance Decentralization Scalability Availability Flexible naming An infrastructure for p2p system Chord - Overview  Specifies       How to find locations of keys How new nodes join the system How to recover from failure of existing nodes It uses consistent hashing Improves the scalability of it by avoiding the requirement that every node know about every other node. In N-node network    each node maintains information about only O (log N) nodes a lookup requires O (log N) messages Updating info on join and leave requires O (log2 N) Chord – Consistent Hashing  Consistent hash function each node and key an m-bit identifier using a base hash function     Node’s identifier = Hash of its IP Key’s identifier = Hash of key M must be big enough to make probability of collisions negligible Assignment of keys to nodes    Identifiers are ordered in an identifier circle Key k is assigned to the first node whose identifier is equal to or follows k in the identifier space This node is called successor (k) Chord – Consistent Hashing  Scalable key location  Very small amount of routing information suffices   Routing can be done by following successors until the node is found   Each node need only beware of its successor Not scalable: O(N) time lookup To accelerate Chord maintains additional routing information    Each node, n, maintains a routing table with at most m entries (finger table) ith entry contains the identity of first node, s, that succeeds n by at least 2i-1 on the identifier circle i.e. s= successor(n+2i-1) Chord Chord – Node Joins Main challenge is preserving the ability to locate every key in the network  2 invariants     Each nodes successor is correctly maintained For every key k, node successor(k) is responsible for k Also in order to be the lookup fast the finger tables must be correct Chord – Node joins To simplify the join and leave mechanism each node in Chord maintains a predecessor pointer  3 tasks     Init predecessor and fingers of node n Update the finger and predecessors of existing nodes Notify higher layer so that it can transfer state associated with keys that node n is now responsible for Chord – Node joins  Initializing fingers and predecessor   Learns its predecessor and fingers by asking n’ to look them up Updating fingers of existing nodes  Node n will become the ith finger of node p iff      P precedes n by at least 2i-1 The ith finger of p succeeds n The first node p that can meet these 2 conditions is the immediate predecessor of n- 2i-1 The algorithm starts with the ith finger of node n and then continues to walk in the counter clockwise direction on identifier circle until it encounters a node whose ith finger precedes n Transferring keys  New node must contact the successor of it and transfer responsibility Chord – Concurrent Operations  Stabilization    The invariants are difficult to maintain in the face of concurrent joins in large networks Separate correctness and performance goals Stabilization protocol   Keep nodes’ successor pointers up to date, which is sufficient to correctness of lookups Those successors are then used to verify and correct finger table entries Stabilization Chord
 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
                                             
                                             
                                             
                                             
                                             
                                             
                                             
                                             
                                             
                                             
                                            