P2P Tutorial

- Concept/Advantages: why and what is P2P
- History: structural evolution
- Status: from academia to industry
- Future: how to build a practical P2P system

Guihai Chen

Client/Server Model is Being Challenged

No single server or search engine can sufficiently cover the growing Web:
- About 2x10^18 bytes/year are generated on the Internet, but only about 3x10^12 bytes/year (0.00015%) are available to the public.
- Google searches only about 1.3x10^8 Web pages.
(Source: IEEE Internet Computing, 2001)

Client/Server Model Has Problems

The client/server model seriously limits utilization of available bandwidth and services:
- Popular servers and search engines become traffic bottlenecks, while the high-speed networks connecting many clients sit idle.
- Computing cycles and information held by clients are ignored.

Socket Programming in the Client/Server Model

Two types of server:
- Concurrent server: forks a new process for each connection, so multiple clients can be handled at the same time.
- Iterative server: processes one request before accepting the next.

Concurrent server:

    listenfd = socket(...);
    bind(listenfd, ...);
    listen(listenfd, ...);
    for ( ; ; ) {
        connfd = accept(listenfd, ...);
        if ((pid = fork()) == 0) {   /* child */
            close(listenfd);         /* child does not accept new connections */
            /* process the request */
            close(connfd);
            exit(0);
        }
        close(connfd);               /* parent closes its copy of the connection */
    }

Iterative server:

    listenfd = socket(...);
    bind(listenfd, ...);
    listen(listenfd, ...);
    for ( ; ; ) {
        connfd = accept(listenfd, ...);
        /* process the request */
        close(connfd);
    }

Client:

    sockfd = socket(...);
    connect(sockfd, ...);
    /* process the request */
    close(sockfd);
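For concreteness, here is one way the elided setup (the "..." arguments) above might be filled in: a minimal iterative TCP echo server in C. The IPv4 address family, port 8080, and the echo behavior are illustrative choices, not from the original slides.

    /* Minimal iterative TCP echo server -- a sketch, assuming IPv4 and an arbitrary port 8080. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>
    #include <netinet/in.h>
    #include <sys/socket.h>

    int main(void) {
        int listenfd = socket(AF_INET, SOCK_STREAM, 0);
        if (listenfd < 0) { perror("socket"); exit(1); }

        struct sockaddr_in addr;
        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        addr.sin_port = htons(8080);                /* illustrative port */

        if (bind(listenfd, (struct sockaddr *)&addr, sizeof(addr)) < 0) { perror("bind"); exit(1); }
        if (listen(listenfd, 16) < 0) { perror("listen"); exit(1); }

        for (;;) {
            int connfd = accept(listenfd, NULL, NULL);
            if (connfd < 0) continue;
            char buf[1024];
            ssize_t n;
            while ((n = read(connfd, buf, sizeof(buf))) > 0)
                write(connfd, buf, n);              /* "process the request": echo it back */
            close(connfd);                          /* then accept the next client */
        }
    }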
Content Delivery Networks (CDN): A Transition Model

- Servers are decentralized (duplicated) throughout the Internet.
- The distributed servers are controlled by a centralized authority (headquarters).
- Examples: Internet content distribution by Akamai, Overcast, and FFnet.
- Both the client/server and CDN models have single points of failure.

A New Paradigm: Peer-oriented Systems

- Each peer is both a client (consumer) and a server (producer).
- Peers have the freedom to join and leave at any time.
- Huge peer diversity: service ability, storage space, networking speed, and service demand.
- A widely decentralized system, open to both opportunities and new concerns.

Peer-oriented Systems: the spectrum

- Client/server (e.g., search engines, grids)
- Content delivery networks with duplicated servers (e.g., Akamai)
- Hybrid P2P with a central directory (e.g., Napster)
- Pure P2P (e.g., Freenet and Gnutella)

Objectives and Benefits of P2P

- As long as there is no physical break in the network, the target file will always be found.
- Adding more content to a P2P system will not affect its performance (information scalability).
- Adding and removing nodes will not affect its performance (system scalability).

Peer-oriented Applications

- File sharing: document sharing among peers with no or limited central control.
- Instant messaging (IM): immediate voice and file exchange among peers.
- Distributed processing: widely utilizing resources available on remote peers.

Peer-oriented Applications: systems

- Pioneers: Napster, Gnutella, Freenet
- File sharing: CFS, PAST [SOSP'01]
- Network storage: FarSite [Sigmetrics'00], OceanStore [ASPLOS'00], PAST [SOSP'01]
- Web caching: Squirrel [PODC'02]
- Event notification/multicast: Herald [HotOS'01], Bayeux [NOSSDAV'01], CAN-multicast [NGC'01], SCRIBE [NGC'01], SplitStream [submitted]
- Anonymity: Crowds [CACM'99], Onion Routing [JSAC'98]
- Censorship resistance: Tangler [CCS'02]

What is a P2P Network? My version

- [Equality] All peers assume equal roles.
- [Decentralized] No centralized server in the system.
- [Robust] Highly robust, resilient, and self-organizing.
- [Zero hardware cost] No further investment in hardware or bandwidth.
- [A hot topic] But huge investment in research; e.g., the IRIS project received $12M.

What is a P2P Network? Another version
(M. Ripeanu, A. Iamnitchi, and I. Foster, "Mapping the Gnutella Network", IEEE Internet Computing, No. 1, 2002)

- [Dynamic operability] P2P applications must keep operating transparently although hosts join and leave the network frequently.
- [Performance and scalability] P2P applications exhibit what economists call the "network effect": a network's value to an individual user scales with the total number of participants.
- [Reliability] External attacks should not cause significant data or performance loss.
- [Anonymity] The application should protect the privacy of people seeking or providing sensitive information.

How Did It Start?

- A killer application: Napster, free music over the Internet.
- Key idea: share the storage and bandwidth of individual (home) users.

Model

- Each user stores a subset of files.
- Each user has access to (can download) files from all users in the system.

Main Challenge

Find where a particular file is stored. (Note: the problem is similar to finding a particular page in web caching.)
[Figure: peers A-F; a query "E?" must locate the peer holding file E.]

Other Challenges

- Scalability: up to hundreds of thousands or millions of machines.
- Dynamicity: machines can come and go at any time.

Napster

- Napster.com maintains a centralized index that maps files (songs) to the machines that are currently alive.
- How to find a file (song):
  - Query the index system, which returns a machine that stores the required file (ideally the closest/least-loaded machine).
  - Transfer the file directly (ftp).
- Advantages: simplicity; easy to implement sophisticated search engines on top of the index system.
- Disadvantages: robustness, scalability (?).

Napster: Example
[Figure: peers m1-m6 register their files A-F with the central index; a query "E?" is answered with "m5", and the file is fetched directly from m5.]
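As a toy illustration of the centralized index just described (not Napster's actual protocol), the sketch below keeps an in-memory table mapping file names to the hosts that hold them; the names, sizes, and table layout are invented for the example.

    /* Toy centralized index, in the spirit of Napster's directory (illustrative only). */
    #include <stdio.h>
    #include <string.h>

    #define MAX_ENTRIES 128

    struct entry {
        char file[64];   /* file (song) name */
        char host[32];   /* address of a live peer storing it */
    };

    static struct entry index_table[MAX_ENTRIES];
    static int n_entries = 0;

    /* A peer registers a file it is willing to serve. */
    void publish(const char *file, const char *host) {
        if (n_entries < MAX_ENTRIES) {
            snprintf(index_table[n_entries].file, sizeof(index_table[n_entries].file), "%s", file);
            snprintf(index_table[n_entries].host, sizeof(index_table[n_entries].host), "%s", host);
            n_entries++;
        }
    }

    /* The index answers a query with one host that stores the file (NULL if unknown). */
    const char *lookup(const char *file) {
        for (int i = 0; i < n_entries; i++)
            if (strcmp(index_table[i].file, file) == 0)
                return index_table[i].host;   /* a real index might pick the least-loaded host */
        return NULL;
    }

    int main(void) {
        publish("songE.mp3", "m5");
        publish("songF.mp3", "m6");
        const char *host = lookup("songE.mp3");
        printf("songE.mp3 -> %s\n", host ? host : "not found");   /* prints: m5 */
        return 0;
    }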
Napster: History

- 5/99: Shawn Fanning (a freshman at Northeastern U.) founds the Napster Online music service.
- 12/99: first lawsuit.
- 3/00: Napster accounts for 25% of UWisc traffic.
- 2000: estimated 60M users.
- 2/01: the US Circuit Court of Appeals rules that Napster knew its users were violating copyright law.
- 7/01: simultaneous online users: Napster 160K, Gnutella 40K.
- Now: trying to come back (http://www.napster.com).

Napster: Architecture Problems

- Centralized server:
  - A single logical point of failure (it can load-balance among servers using DNS rotation, but there is still potential for congestion).
  - Napster is "in control" (freedom is an illusion).
- No security:
  - Passwords sent in plain text.
  - No authentication.
  - No anonymity.

Gnutella

- Distributes the file location function and decentralizes lookup.
- Idea: multicast the request.
- How to find a file:
  - Send the request to all neighbors.
  - Neighbors recursively multicast the request.
  - Eventually a machine that has the file receives the request and sends back the answer.
- Advantages: totally decentralized, highly robust.
- Disadvantages: not scalable; the entire network can be swamped with requests (to alleviate this, each request carries a TTL).

Gnutella: Example
[Figure: m1's neighbors are m2 and m3; m3's neighbors are m4 and m5; a query "E?" floods hop by hop until it reaches m5, which holds E.]

Gnutella: Architecture Problems

- Not scalable: the entire network can be swamped with requests (each request carries a TTL to alleviate this).
- Not anonymous: the peer you download from knows who you are; it is no more anonymous than it is decentralized.
- What we care about:
  - How much traffic does one query generate?
  - How many hosts can the network support at once?
  - What is the latency of a query?
  - Is there a bottleneck?
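A minimal sketch of the TTL-limited flooding just described, on a hard-coded toy topology (the adjacency matrix, node count, and file placement are invented for the illustration):

    /* TTL-limited flooding, Gnutella-style (toy topology, illustrative only). */
    #include <stdio.h>

    #define N 6                 /* peers m0..m5 */
    static int neighbor[N][N] = {
        {0,1,1,0,0,0},          /* m0: m1, m2 */
        {1,0,0,1,0,0},          /* m1: m0, m3 */
        {1,0,0,0,1,0},          /* m2: m0, m4 */
        {0,1,0,0,0,1},          /* m3: m1, m5 */
        {0,0,1,0,0,1},          /* m4: m2, m5 */
        {0,0,0,1,1,0},          /* m5: m3, m4 */
    };
    static int has_file[N] = {0,0,0,0,0,1};   /* only m5 holds the file */
    static int seen[N];                        /* duplicate suppression */

    /* Flood the query; return 1 if some reached peer holds the file. */
    int query(int node, int ttl) {
        if (seen[node]) return 0;              /* already visited: drop the duplicate */
        seen[node] = 1;
        if (has_file[node]) { printf("hit at m%d\n", node); return 1; }
        if (ttl == 0) return 0;                /* TTL exhausted: stop the flood */
        int found = 0;
        for (int i = 0; i < N; i++)
            if (neighbor[node][i])
                found |= query(i, ttl - 1);    /* forward to every neighbor */
        return found;
    }

    int main(void) {
        if (!query(0, 3)) printf("not found within TTL\n");
        return 0;
    }

Note how the TTL caps the traffic one query can generate, at the cost of possibly missing files that lie beyond the horizon.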
Freenet

Additional goals beyond file location:
- Provide publisher anonymity and security.
- Resist attacks: a third party should not be able to deny access to a particular file (data item, object), even if it compromises a large fraction of the machines.

Architecture:
- Each file is identified by a unique identifier.
- Each machine stores a set of files and maintains a routing table used to route individual requests.

Note how Freenet combines a routing table, depth-first searching, anonymity, and caching.

Freenet: Data Structure

Each node maintains a routing table (a common stack) plus local files:
- id: a file identifier.
- next_hop: another node that stores the file id.
- file: the file identified by id, stored on the local node.

Forwarding:
- Each message carries the file id it refers to.
- If the file id is stored locally, stop.
- Otherwise, search for the "closest" id in the table and forward the message to the corresponding next_hop.
- Over time, files with similar ids cluster on the same nodes ("birds of a feather flock together").

Freenet: Query

API: file = query(id);

Upon receiving a query for document id:
- Check whether the queried file is stored locally.
  - If yes, return it.
  - If not, forward the query message.

Notes:
- Each query carries a TTL that is decremented each time the query is forwarded. To obscure the distance to the originator, the TTL can be initialized to a random value within some bounds, and when TTL = 1 the query is still forwarded with finite probability.
- Each node maintains state for all outstanding queries that have traversed it, which helps avoid cycles.
- When the file is returned, it is cached along the reverse path (push).

Freenet: Query Example
[Figure: query(10) starting at n1 is routed depth-first through n2, n4, and n5 by following the closest stored ids; file caching along the reverse path is not shown.]

Freenet: Insert

API: insert(id, file);

Two steps:
- Search for the file to be inserted.
  - If it is found, report a collision.
  - If the number of nodes is exhausted, report failure.
- If it is not found, insert the file.

Searching works like a query, but nodes keep their state after a collision is detected and the reply is sent back to the originator.

Insertion:
- Follow the forward path and insert the file at all nodes along the path.
- Each node probabilistically replaces the originator field with itself, obscuring the true originator.

Freenet: Insert Example
[Figure: a query for id 10 returns failure along the gray path n1, n2, n4, n5, so f10 is inserted at every node on that path; along the way, n2 and then n4 probabilistically replace the recorded originator with themselves.]
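A sketch of the closest-id forwarding rule from the Data Structure slide. The numeric distance metric and the table contents below are invented for illustration; Freenet's real notion of key closeness is defined over hashed keys, not plain integers.

    /* Closest-id forwarding, Freenet-style (illustrative distance and table). */
    #include <stdio.h>
    #include <stdlib.h>

    struct route { int id; int next_hop; };    /* known file id -> node storing it */

    /* Pick the next hop whose table id is closest to the requested id. */
    int forward_to(const struct route *table, int n, int want) {
        int best = 0;
        for (int i = 1; i < n; i++)
            if (abs(table[i].id - want) < abs(table[best].id - want))
                best = i;
        return table[best].next_hop;
    }

    int main(void) {
        struct route table[] = { {4, 1}, {12, 2}, {9, 3} };   /* invented entries */
        printf("query(10) -> forward to n%d\n",
               forward_to(table, 3, 10));      /* 9 is closest to 10, so n3 */
        return 0;
    }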
Freenet Properties

- Newly queried/inserted files are stored on nodes with similar ids ("birds of a feather flock together").
- New nodes can announce themselves by inserting files.
- Attempts to supplant or discover existing files just spread the files further.

Freenet Summary

Advantages:
- Provides publisher anonymity.
- Totally decentralized architecture: robust and scalable.
- Resistant against malicious file deletion.

Disadvantages:
- Does not always guarantee that a file is found, even if the file is in the network.
- Space-time complexity = ??? So still not scalable.

New Solutions to the Location Problem

Overlay networks:
- Applications running at various sites create "logical" links (e.g., TCP or UDP connections) pairwise between each other.
- Each logical link spans multiple physical links, with routing defined by native Internet routing.

Goals: scalability, resilience, security.

Abstraction: a hash function plus a routing table.
- dataID = hash(data); nodeID = hash(IP)
- data = lookup(dataID)
- Note: the data can be anything: a data object, document, file, pointer to a file, ...

Proposals:
- CAN (ACIRI/Berkeley), Chord (MIT), Pastry (Rice), Tapestry (Berkeley)
- New generation: Koorde (USA), Viceroy (Israel), Cycloid (China)

Layering: P2P applications (event notification, network storage, ...) sit on a P2P substrate (a self-organizing overlay network), which runs on top of TCP/IP and the Internet.

Overlay Networks: A Generic Model
[Figure: a generic topological model of P2P systems: peers 1..n each run a routing and locating algorithm and maintain a routing table, data storage, and a data cache, with the Internet as the supporting network.]

Overlay Networks: Mapping to the IP Network
[Figure: overlay links between hosts map onto paths of IP links through routers in the underlying IP-layer network.]

Overlay Networks: Consistent Hashing

- David Karger, Eric Lehman, Tom Leighton, Matthew Levine, Daniel Lewin, Rina Panigrahy: "Consistent Hashing and Random Trees: Distributed Caching Protocols for Relieving Hot Spots on the World Wide Web", ACM Symposium on Theory of Computing, 1997.
- SHA-1: http://www.w3.org/PICS/DSig/SHA1_1_0.html
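A small sketch of the consistent-hashing abstraction above: nodes and keys are hashed onto the same circular id space, and each key is stored at its successor node. The toy string hash and the 64-slot ring are simplified stand-ins for SHA-1 mod 2^m; the IPs and file name are invented.

    /* Consistent hashing sketch: key -> successor node on a circular id space.
       The toy hash and the 64-slot ring are illustrative; real systems use SHA-1. */
    #include <stdio.h>

    #define RING 64   /* id space 0..63, i.e., m = 6 bits */

    /* Toy string hash folded into the ring (stand-in for SHA-1 mod 2^m). */
    unsigned h(const char *s) {
        unsigned v = 5381;
        while (*s) v = v * 33 + (unsigned char)*s++;
        return v % RING;
    }

    /* The key's successor: the first node id clockwise at or after the key (with wraparound). */
    unsigned successor(unsigned key, const unsigned *nodes, int n) {
        unsigned best = nodes[0];
        unsigned best_d = (nodes[0] - key + RING) % RING;
        for (int i = 1; i < n; i++) {
            unsigned d = (nodes[i] - key + RING) % RING;
            if (d < best_d) { best_d = d; best = nodes[i]; }
        }
        return best;
    }

    int main(void) {
        const char *ips[] = {"10.0.0.1", "10.0.0.2", "10.0.0.3"};
        unsigned nodes[3];
        for (int i = 0; i < 3; i++) nodes[i] = h(ips[i]);   /* nodeID = hash(IP) */
        unsigned key = h("some-file.txt");                  /* dataID = hash(data) */
        printf("key %u is stored at node %u\n", key, successor(key, nodes, 3));
        return 0;
    }

The point of the construction: when a node joins or leaves, only the keys in one arc of the circle move, which is what makes the overlay self-organizing.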
Overlay Networks: Problems

- Overlay maintenance: peers must (1) periodically ping to verify the liveness of peers, (2) delete edges to dead peers, and (3) bootstrap newly joining peers.
- Geographical mismatch: the overlay is (1) topology-unaware, producing (2) duplicated messages and (3) inefficient network usage.
- Loss of security and privacy: the overlay (1) provides a conduit for malicious code and viruses, (2) opens loopholes for information leakage, and (3) relaxes privacy protection by exposing peer identities.
- Weak resource coordination: (1) with limited or no central control, the system relies mainly on self-organization; (2) the lack of communication monitoring and scheduling causes unnecessary traffic jams; (3) the lack of access and service coordination leaves loads unbalanced among peers.
- Many others.

Overlay Networks: Typical Systems

Ring: Chord [MIT]
- People: Dabek, Kaashoek, Stoica
- Applications: CFS
- Key space: 1-dimensional cycle
- Space-time complexity: O(log N)
- Routing table: successor set + O(log N) fingers
- Data distribution: each node holds the segment of data keys between its predecessor and itself
- Data location: lookup(k) -> successor(k)

Mesh: CAN [Berkeley]
- People: Ratnasamy, Shenker, Stoica (formerly at MIT)
- Key space: 2- or d-dimensional torus
- Space-time complexity: O(d N^(1/d))
- Routing table: O(d) neighbors
- Data distribution: each node holds the zone of data keys in which it resides
- Data location: lookup(k) -> region(k)

Hypercube: Pastry [Rice], Tapestry [Berkeley]
- People: Druschel, Rowstron
- Applications: PAST, SCRIBE, OceanStore
- Key space: 1-dimensional cycle
- Space-time complexity: O(log N)
- Routing table: O(|L|) leaf set + O(|M|) proximity set + O(log N) neighbors
- Data distribution: each node holds the segment of data keys numerically closest to itself
- Data location: lookup(k) -> nearest(k)

Questions in Mind

- Can we use the ring, mesh, and hypercube directly? If not, what modifications should we make?
- How do we convert an interconnection network into an overlay network?
- ...

Chordal Ring

- Visit the Chord project website http://pdos.csail.mit.edu/chord/ for the purpose and every detail of Chord.
- Read the paper "Chord: A Scalable Peer-to-Peer Lookup Protocol for Internet Applications" by Ion Stoica et al.

Chordal Ring: Definition

- Hash function: node N -> nodeID, data D -> dataID; nodeID, dataID in {0, ..., n-1} where n = 2^m.
- Put data D on node N such that nodeID(N) is the smallest nodeID larger than or equal to dataID(D).
- Given key = dataID(D), lookup(key) = successor(key): find the first live nodeID >= key.
- Finger table: node k stores pointers to k+1, k+2, k+4, ..., k+2^(m-1) (mod n).
- The node for any data item is found in O(log #nodes) steps, with O(log #nodes) storage per node.
[Figure: a ring of ids 0..31 (m = 5); one node's fingers point at power-of-two distances around the ring.]

Chordal Ring: Data Structure

- Assume the identifier space is 0..2^m - 1.
- Each node maintains:
  - A finger table: entry i in the finger table of node n is the first node that succeeds or equals n + 2^i.
  - A predecessor pointer.
- An item identified by id is stored on the successor node of id.

Chordal Ring: Finger Table
[Figure: a ring of 2^5 node ids (m = 5, i = 0..m-1) with actual nodes including 1, 4, and 12; each finger-table row lists start = k + 2^i (mod 2^m) and the IP address of successor(start), shown for nodes 1, 4, and 12.]

Chordal Ring: Routing Algorithm

When node k receives lookup(key):
1) If k < key <= next(k), return next(k).
2) Else if key <= k at intermediate node k, return k.
3) Else forward to f such that f = MAX{ finger | finger <= key }.

- The lookup message contains the requester's IP address so that the lookup result can be returned directly.
- When node k is alive, successor(k) = k; next(k) is the next live node, i.e., the IP address of the first finger-table entry.
- Take as big a step as possible, sending the request as close to the destination as possible, in keeping with the small-world phenomenon.
- Correctness (convergence): the distance to the destination strictly decreases at every hop.
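A compact sketch of the routing rule above: jump to the farthest finger that does not pass the key. The node ids, m = 6, and the simulation-style structure (no RPCs) are invented for the example.

    /* Chord-style lookup on a toy ring (m = 6 bits, ids 0..63; simulated, no networking). */
    #include <stdio.h>

    #define M 6
    #define RING (1 << M)

    static int nodes[] = {1, 8, 14, 21, 32, 38, 42, 48, 51, 56};
    static int n_nodes = 10;

    /* First live node clockwise at or after id (wraps around the ring). */
    int successor(int id) {
        int best = -1, best_d = RING;
        for (int i = 0; i < n_nodes; i++) {
            int d = (nodes[i] - id + RING) % RING;
            if (d < best_d) { best_d = d; best = nodes[i]; }
        }
        return best;
    }

    /* Finger i of node k: successor(k + 2^i). */
    int finger(int k, int i) { return successor((k + (1 << i)) % RING); }

    /* Route a lookup greedily: take the biggest finger step that does not pass the key. */
    int lookup(int k, int key) {
        for (;;) {
            int succ = successor((k + 1) % RING);              /* next(k) */
            if ((key - k + RING) % RING <= (succ - k + RING) % RING)
                return succ;                                   /* key falls in (k, next(k)] */
            int next = k;
            for (int i = M - 1; i >= 0; i--) {                 /* biggest safe step */
                int f = finger(k, i);
                if ((f - k + RING) % RING < (key - k + RING) % RING) { next = f; break; }
            }
            printf("  at %d, forwarding to %d\n", k, next);
            k = next;
        }
    }

    int main(void) {
        printf("lookup(54) from node 8 -> node %d\n", lookup(8, 54));  /* expect 56 */
        return 0;
    }

Each hop at least halves the remaining clockwise distance, which is where the O(log N) step bound comes from.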
Chordal Ring: Routing Example
[Figure: looking up keys 14, 15, 16, and 17 starting at node 1, using the finger tables of nodes 1, 4, and 12 on the 32-id ring.]

Chord Example: Self-organization

- Assume an identifier space 0..7 (m = 3).
- Node 1 joins: all entries in its finger table are initialized to itself.
- Node 2 joins: node 1's fingers are updated to point at node 2 where appropriate.
- Nodes 0 and 6 join: each node's successor table now reflects the full membership.
[Figure: the 8-id ring after each join, showing each node's successor table with entries id + 2^i -> succ.]

Chord Example: Items

- Nodes 1, 3, 0, and 6 are in the ring; items f1 (id 7) and f2 (id 2) are stored at the successor nodes of their ids (7 -> node 0, 2 -> node 3).
[Figure: the ring with successor tables and item placement.]

Chordal Ring: Query

Upon receiving a query for item id, a node:
- Checks whether it stores the item locally.
- If not, forwards the query to the largest node in its successor table that does not exceed id.
[Figure: query(7) issued at node 1 is forwarded around the ring and answered by node 0, the successor of 7.]

Discussion

- Queries can be implemented iteratively or recursively.
- Self-organization: joins and leaves, graceful or abrupt, handled distributedly or locally.
- Performance: routing in the overlay network can be more expensive than in the underlying network, because there is usually no correlation between node ids and their locality; a query can repeatedly jump between Europe and North America even though both the initiator and the node storing the item are in Europe!
- Solutions: Tapestry takes care of this implicitly; CAN and Chord maintain multiple copies for each entry in their routing tables and choose the closest in terms of network distance.

Pastry: Hypercube Connection

How can hypercube connectivity be maintained even after some nodes are absent?
[Figure: (a) a traditional 3-dimensional hypercube on nodes 000..111; (b) the reconfiguration if node 011 fails; (c) the reconfiguration if node 101 also fails. Note that every live node still maintains 3 pointers.]
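For intuition on hypercube-style routing, here is a minimal bit-fixing sketch: each hop corrects one differing address bit, so a d-bit cube needs at most d hops. This is the classic textbook scheme, not Pastry's actual reconfiguration protocol.

    /* Bit-fixing routing on a d-dimensional hypercube (classic scheme, not Pastry itself). */
    #include <stdio.h>

    #define D 3   /* 3-bit node ids, 000..111 */

    /* Route from src to dst by flipping the highest differing bit at each hop. */
    void route(unsigned src, unsigned dst) {
        unsigned cur = src;
        printf("%u", cur);
        while (cur != dst) {
            unsigned diff = cur ^ dst;           /* bits still wrong */
            int bit = D - 1;
            while (!((diff >> bit) & 1)) bit--;  /* highest differing bit */
            cur ^= 1u << bit;                    /* one hop fixes one bit */
            printf(" -> %u", cur);
        }
        printf("\n");
    }

    int main(void) {
        route(0, 5);   /* 000 -> 100 -> 101 */
        route(2, 7);   /* 010 -> 110 -> 111 */
        return 0;
    }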
Tapestry: Incremental Suffix-based Routing
[Figure: each hop matches one more trailing digit of the destination id; the path passes through nodes such as 0x79FE, 0x23FE, 0x43FE, and 0x13FE, with hop labels 1-4 marking the matched suffix length.]

Tapestry: Routing Table
[Figure: the neighbor map for node 005712 (octal digits, 2^18 namespace): routing level i lists, for each digit value, a neighbor matching the destination in the last i digits (entries of the form xxx0, xx02, x012, ...); example neighbors include 627510, 340880, 943210, 834510, 387510, and 727510.]

Discussion

- Robustness:
  - Maintain multiple copies associated with each entry in the routing tables.
  - Replicate an item on nodes whose ids are close in the identifier space.
- Security: can be built on top of CAN, Chord, Tapestry, and Pastry.

Content Addressable Network (CAN)

- Virtual Cartesian coordinate space: a d-dimensional torus.
- Associate to each node and item a unique id in the d-dimensional space.
- The entire space is partitioned among all the nodes: every node "owns" a zone of the overall space.
- Properties:
  - Routing table size O(d).
  - Guarantees that a file is found in at most O(d N^(1/d)) steps, where N is the total number of nodes.

CAN Example: Two-Dimensional Space

- The space is divided between the nodes; together the nodes cover the entire space.
- Each node covers either a square or a rectangle with aspect ratio 1:2 or 2:1.
- Assume a space of size 8 x 8:
  - Node n1:(1,2) joins first and covers the entire space.
  - Node n2:(4,2) joins: the space is divided between n1 and n2.
  - Node n3:(3,5) joins, then n4:(5,5) and n5:(6,6): each join splits an existing zone.
- Items f1:(2,3), f2:(5,1), f3:(2,1), f4:(7,5) are each stored by the node that owns the zone containing the item's point.
[Figure: the 8x8 space after each join, with zones for n1..n5 and items f1..f4 placed at their coordinates.]

CAN: Query Example

- Each node knows its neighbors in the d-dimensional space.
- A query is forwarded to the neighbor that is closest to the query id.
- Example: n1 queries f4:(7,5); the query travels zone by zone toward the zone that contains f4.
[Figure: the query path from n1 across the 2-dimensional space to the zone owning f4.]
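A sketch of CAN's greedy forwarding in two dimensions: at each step the query moves to the neighbor whose zone center is closest to the target point. The zone layout is a hard-coded toy (not the slides' figure), every node is treated as every other node's neighbor for simplicity, and torus wraparound is ignored.

    /* Greedy CAN-style forwarding in 2-D (toy zones; plain-grid distances, no wraparound). */
    #include <stdio.h>

    struct zone { const char *name; double cx, cy; };   /* zone owner and its center */

    static struct zone zones[] = {
        {"n1", 2, 2}, {"n2", 6, 2}, {"n3", 2, 6}, {"n4", 5, 5}, {"n5", 7, 7},
    };
    static int n_zones = 5;

    static double dist2(double ax, double ay, double bx, double by) {
        return (ax - bx) * (ax - bx) + (ay - by) * (ay - by);
    }

    /* Forward toward (x, y): repeatedly hop to the neighbor closest to the target. */
    int route(int cur, double x, double y) {
        for (;;) {
            int best = cur;
            for (int i = 0; i < n_zones; i++)
                if (dist2(zones[i].cx, zones[i].cy, x, y) < dist2(zones[best].cx, zones[best].cy, x, y))
                    best = i;
            if (best == cur) return cur;        /* no neighbor is closer: this zone owns the point */
            printf("  %s -> %s\n", zones[cur].name, zones[best].name);
            cur = best;
        }
    }

    int main(void) {
        int owner = route(0, 7, 5);             /* n1 queries the item at (7, 5) */
        printf("query (7,5) resolved at %s\n", zones[owner].name);
        return 0;
    }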
Conclusions

The key challenge in building wide-area P2P systems is a scalable and robust location service. Solutions covered in this tutorial:
- Napster: centralized location service.
- Gnutella: broadcast-based decentralized location service.
- Freenet: intelligent-routing decentralized solution (but correctness is not guaranteed; queries for existing items may fail).
- CAN, Chord, Tapestry, Pastry: intelligent-routing decentralized solutions that guarantee correctness; Tapestry (and Pastry?) provide efficient routing but are more complex.

Conclusions: Classification

A classification of lookup algorithms, which is also a classification of P2P systems:
- Structured or unstructured: whether the overlay network is regular or not.
- Symmetric or asymmetric: whether each node assumes an equal role.

Examples:
- Unstructured, asymmetric: Napster (star), DNS (tree), FastTrack (hierarchical)
- Unstructured, symmetric: Gnutella, Freenet
- Structured, symmetric: CAN (mesh), Chord (ring), Pastry (hypercube), Tapestry (hypercube), Viceroy (butterfly)

Reading Task

Read papers or webpages about: Napster, Gnutella, Freenet, CAN, Chord, Tapestry, Pastry.