Peer-to-Peer Networks and Distributed Hash Tables (2006)

Peer-to-peer networking
• File sharing: files are stored at the end-user machines (peers) rather than at a server (client/server), and files are transferred directly between peers.
• Leverage: P2P is a way to leverage vast amounts of computing power, storage, and connectivity from personal computers (PCs) distributed around the world.
• Q: What are the new technical challenges?
• Q: What new services/applications are enabled?
• Q: Is it just "networking at the application level"?
• Everything old is new again?

Napster
• Napster: free music over the Internet.
• Key idea: share the content, storage, and bandwidth of individual (home) users.
• Model: each user stores a subset of files; each user has access to (can download) files from all users in the system.
• Application-level, client-server protocol (index server) over point-to-point TCP.
• How it works, in four steps:
  - Connect to the Napster index server.
  - Upload your list of files (push) to the server.
  - Give the server keywords to search the full list with.
  - Select the "best" of the correct answers (pings).

Napster: Example
[Figure: machines m1-m6 hold files A-F and register them with the index server; a peer asks the server "E?", the server answers "m5", and the file is downloaded directly from m5.]

Napster characteristics
• Advantages: simplicity; easy to implement sophisticated search engines on top of the index system.
• Disadvantages:
  - Centralized index server: a single logical point of failure (one can load-balance among servers using DNS rotation); potential for congestion; Napster is "in control" (freedom is an illusion).
  - No security: passwords in plain text; no authentication; no anonymity.

Main Challenge
• Find where a particular file is stored.
• Scale: up to hundreds of thousands or millions of machines (7/2001 simultaneous online users: Napster 160K, Gnutella 40K, Morpheus 300K).
• Dynamicity: machines can come and go at any time.
[Figure: the same machines without a central index server; who answers the query "E?"]

Gnutella
• Peer-to-peer networking: peer applications.
• Focus: a decentralized method of searching for files.
• How to find a file: flood the request.
  - Send the request to all neighbors.
  - Neighbors recursively multicast the request.
  - Eventually a machine that has the file receives the request and sends back the answer.
• Advantages: totally decentralized, highly robust.
• Disadvantages: not scalable; the entire network can be swamped with requests (to alleviate this problem, each request carries a TTL).

Gnutella: Example
[Figure: assume m1's neighbors are m2 and m3, and m3's neighbors are m4 and m5, and so on; the query "E?" floods outward from m1 until it reaches m5, which holds E.]

Gnutella
• What we care about:
  - How much traffic does one query generate?
  - How many hosts can it support at once?
  - What is the latency associated with querying?
  - Is there a bottleneck?
• Late 2000: only 10% of downloads succeeded; 2001: more than 25% of downloads succeeded (is this success or failure?).

BitTorrent
• BitTorrent (BT) is a new-generation P2P system; it makes downloads faster.
• The file to be distributed is split up into pieces, and an SHA-1 hash is calculated for each piece (see the sketch below).
• Swarming: parallel downloads among a mesh of cooperating peers.
  - Scalable: capacity increases with the number of peers/downloaders.
  - Efficient: it utilizes a large amount of the available network bandwidth.
• Tracker: a central server keeping a list of all peers participating in the swarm (handles peer discovery).

BitTorrent: A picture
[Figure: a tracker surrounded by uploader/downloader peers exchanging pieces with one another.]
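To make the per-piece hashing step concrete, here is a minimal sketch in Python. The piece size, function name, and use of hashlib are illustrative assumptions, not BitTorrent's actual implementation; real clients store these digests in the .torrent metainfo and verify each received piece against them before sharing it onward.

```python
import hashlib

PIECE_SIZE = 256 * 1024  # assumed piece size; real torrents commonly use 256 KiB to 4 MiB


def piece_hashes(path, piece_size=PIECE_SIZE):
    """Split a file into fixed-size pieces and return the SHA-1 digest of each.

    A downloader can recompute the hash of each received piece and compare it
    against this list to detect corrupted or forged pieces.
    """
    hashes = []
    with open(path, "rb") as f:
        while True:
            piece = f.read(piece_size)
            if not piece:  # end of file
                break
            hashes.append(hashlib.sha1(piece).digest())
    return hashes
```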
Freenet
• Additional goals beyond file location:
  - Provide publisher anonymity and security.
  - Resistance to attacks: a third party shouldn't be able to deny access to a particular file (data item, object), even if it compromises a large fraction of machines.
• Architecture:
  - Each file is identified by a unique identifier.
  - Each machine stores a set of files and maintains a "routing table" used to route individual requests.

Data Structure
• Each node maintains a routing table ("stack") whose entries have three fields:
  - id: a file identifier.
  - next_hop: the id of another node that stores the file id.
  - file: the file identified by id, if it is stored on the local node.
• Forwarding (a single-process sketch of this rule appears below, after the Freenet material):
  - Each message contains the id of the file it refers to.
  - If the file id is stored locally, stop.
  - If not, search for the "closest" id in the table and forward the message to the corresponding next_hop.

Query
• API: file = query(id);
• Upon receiving a query for document id:
  - Check whether the queried file is stored locally. If yes, return it; if not, forward the query message.
• Notes:
  - Each query carries a TTL that is decremented each time the query message is forwarded. To obscure the distance to the originator, the TTL can be initialized to a random value within some bounds, and when TTL = 1 the query is forwarded only with a finite probability.
  - Each node maintains state for all outstanding queries that have traversed it, which helps avoid cycles.
  - When the file is returned, it is cached along the reverse path.

Query Example
[Figure: query(10) is routed hop by hop (n1 -> n2 -> n3 -> n4 -> n5), each node forwarding via the closest-id entry in its table, until it reaches the node storing f10; the figure does not show file caching on the reverse path.]

Insert
• API: insert(id, file);
• Two steps: first search for the file to be inserted; if it is not found, insert it.
• Searching: proceeds like a query, with nodes maintaining state; if a collision is detected (the id already exists), the reply is sent back to the originator.
• Insertion: follow the forward path and insert the file at all nodes along the path. A node probabilistically replaces the originator with itself, obscuring the true originator.

Insert Example
[Figure sequence: assume the query for id 10 returned failure along the "blue" path; insert(10, f10) then follows the same path (n1 -> n2 -> n3 -> n4 -> n5), storing f10 at each node along the way; here n2 probabilistically replaces the originator (n1) with itself.]

Freenet Summary
• Advantages:
  - Provides publisher anonymity.
  - Totally decentralized architecture: robust and scalable.
  - Resistant against malicious file deletion.
• Disadvantages:
  - Does not always guarantee that a file is found, even if the file is in the network.

Solutions to the Location Problem
• Goal: make sure that an identified item (file) is always found.
• Indexing scheme: used to map file names to their location in the system; requires a scalable indexing mechanism.
• Abstraction: a distributed hash table data structure:
  - insert(id, item);
  - item = query(id);
  - Note: the item can be anything: a data object, document, file, pointer to a file, ...
• Proposals: CAN, Chord, Kademlia, Pastry, Viceroy, Tapestry, etc.
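Before turning to the structured DHT designs, here is a minimal, single-process sketch of the Freenet forwarding rule described above. All names (Node, NETWORK, handle_query) are illustrative, ids are plain integers, and direct method calls stand in for network messages; real Freenet also tracks outstanding queries to avoid cycles, which is omitted here (the TTL alone bounds the recursion).

```python
NETWORK = {}  # node name -> Node; a stand-in for real network links


class Node:
    def __init__(self, name):
        self.name = name
        # Routing-table entries: (file_id, next_hop_name, data_or_None).
        self.table = []
        NETWORK[name] = self

    def handle_query(self, file_id, ttl):
        # If the file id is stored locally, stop and return the file.
        for fid, _, data in self.table:
            if fid == file_id and data is not None:
                return data
        if ttl <= 1 or not self.table:
            return None  # real Freenet still forwards with some probability at TTL = 1
        # Otherwise forward to the next_hop of the "closest" id in the table.
        _, next_hop, _ = min(self.table, key=lambda e: abs(e[0] - file_id))
        return NETWORK[next_hop].handle_query(file_id, ttl - 1)
```

A toy run under the same assumptions:

```python
a, b = Node("a"), Node("b")
a.table = [(9, "b", None)]                    # a knows b stores something near id 9
b.table = [(10, "b", b"contents of f10")]     # b stores f10 locally
print(a.handle_query(10, ttl=8))              # -> b'contents of f10'
```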
Internet-scale hash tables
• Hash tables are an essential building block in software systems.
• Are Internet-scale distributed hash tables equally valuable to large-scale distributed systems?
  - Peer-to-peer systems: Napster, Gnutella, Groove, FreeNet, MojoNation, ...
  - Large-scale storage management systems: Publius, OceanStore, PAST, Farsite, CFS, ...
  - Mirroring on the Web.
• Content-Addressable Network (CAN): scalable, operationally simple, good performance.

CAN: basic idea
• Interface:
  - insert(key, value)
  - value = retrieve(key)
  (the key is the id, the value is the item)
[Figure: a set of nodes, each holding (K,V) pairs; insert(K1,V1) stores the pair at one node, and retrieve(K1) later fetches it from that node.]

CAN: basic idea (cont'd)
• Associate to each node and each item a unique id in a d-dimensional Cartesian space: items map to points (keyed by id), and each node owns a zone of the space.
• Goals:
  - Scales to hundreds of thousands of nodes.
  - Handles rapid arrival and failure of nodes.
• Properties:
  - Routing-table size is O(d).
  - Guarantees that a file is found in at most d*n^(1/d) steps, where n is the total number of nodes.

CAN: solution
• A virtual d-dimensional Cartesian coordinate space.
• The entire space is partitioned amongst all the nodes: every node "owns" a zone in the overall space.
• Abstraction:
  - Data can be stored at "points" in the space.
  - The system can route from one "point" to another; a point belongs to the node that owns the enclosing zone.

CAN Example: Two-Dimensional Space
• The space is divided between the nodes; together the nodes cover the entire space.
• Each node covers either a square or a rectangular area with side ratio 1:2 or 2:1.
• Example: node n1:(1,2) is the first node to join and covers the entire space. Node n2:(4,2) joins, and the space is divided between n1 and n2. Node n3:(3,5) joins, and the space is divided between n1 and n3. Finally, nodes n4:(5,5) and n5:(6,6) join.
[Figure sequence: a 7x7 coordinate grid showing the zones after each join.]

Simple example: storing a pair (K1,V1)
• Key K1 is mapped onto a point P in the coordinate space using a uniform hash function (a sketch of this step follows below).
• The corresponding (key, value) pair is then stored at the node that owns the zone within which the point P lies.
• Data stored in the CAN is addressed by name (i.e., key), not location (i.e., IP address).
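To make the mapping concrete, here is a minimal sketch of the key-to-point step for d = 2, assuming a [0, 1) x [0, 1) coordinate space; the salted-SHA-1 construction of hx and hy is an illustrative assumption, not the paper's actual hash functions.

```python
import hashlib


def _uniform_hash(key: str, salt: bytes) -> float:
    """Hash a key to a float in [0, 1), roughly uniformly.

    The salt makes the two coordinate hashes independent of each other.
    """
    digest = hashlib.sha1(salt + key.encode()).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def key_to_point(key: str):
    # a = hx(K), b = hy(K): the (key, value) pair belongs to whichever
    # node owns the zone that encloses the point (a, b).
    return (_uniform_hash(key, b"x"), _uniform_hash(key, b"y"))
```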
Node I::insert(K,V)
1. a = hx(K); b = hy(K)
2. route(K,V) --> (a,b)
3. the node owning (a,b) stores (K,V)
[Figure: node I routes the pair to the point (x = a, y = b).]

CAN Example: Two-Dimensional Space (with items)
• Each item is stored by the node that owns the item's mapping in the space.
• Nodes: n1:(1,2); n2:(4,2); n3:(3,5); n4:(5,5); n5:(6,6).
• Items: f1:(2,3); f2:(5,0); f3:(2,1); f4:(7,5).
[Figure: the grid with each item placed inside the zone of the node that owns it.]

Simple example: retrieving key K1
• Any node can apply the same deterministic hash function to map K1 onto the point P and then retrieve the corresponding value from P.
• If the point P is not owned by the requesting node, the request must be routed through the CAN infrastructure until it reaches the node in whose zone P lies.

Node J::retrieve(K)
1. a = hx(K); b = hy(K)
2. route "retrieve(K)" to (a,b)
[Figure: node J routes the request to the point (x = a, y = b), where (K,V) is stored.]

CAN: Query/Routing Example
• Each node knows its neighbors in the d-dimensional space.
• Forward the query to the neighbor that is closest to the query id (a sketch of this greedy step appears after the Strengths list below).
• Example: assume n1 queries f4.
• A node only maintains state for its immediate neighboring nodes, and routing can go around some failures.
[Figure: the grid showing the query for f4 hopping from n1 through intermediate zones toward n4, which owns f4's point.]

CAN: node insertion
• Inserting a new node affects only a single other node and its immediate neighbors:
  1. Discover some node I already in the CAN.
  2. Pick a random point (p,q) in the space.
  3. I routes to (p,q) and discovers node J.
  4. Split J's zone in half; the new node owns one half.
[Figure: the new node contacts I, which routes to (p,q) inside J's zone; J's zone is then split.]

CAN: Node Failure Recovery
• Simple failures:
  - Know your neighbor's neighbors.
  - When a node fails, one of its neighbors takes over its zone.
• More complex failure modes:
  - Simultaneous failure of multiple adjacent nodes.
  - Use scoped flooding to discover neighbors.
  - Hopefully a rare event.
• Only the failed node's immediate neighbors are required for recovery.

Evaluation
• Scalability, low latency, load balancing, robustness.

CAN: scalability
• For a uniformly partitioned space with n nodes and d dimensions:
  - Per node, the number of neighbors is 2d.
  - The average routing path is (d/4)*n^(1/d) hops.
  - Simulations show that these results hold in practice.
• The network can scale without increasing per-node state.
• Compare Chord/Plaxton/Tapestry/Buzz: log(n) neighbors with log(n) hops.

CAN: low latency
• Problem:
  - Latency stretch = (CAN routing delay) / (IP routing delay).
  - Application-level routing may lead to high stretch.
• Solution:
  - Increase the number of dimensions and realities (reduces the path length).
  - Heuristics (reduce the per-CAN-hop latency): RTT-weighted routing; multiple nodes per zone (peer nodes); deterministically replicated entries.
[Figure: latency stretch vs. number of nodes (16K, 32K, 65K, 131K), with and without the heuristics, for #dimensions = 2 and #dimensions = 10; the heuristics reduce stretch substantially.]

CAN: load balancing
• Two pieces:
  - Dealing with hot spots (popular (key,value) pairs): nodes cache recently requested entries, and an overloaded node replicates popular entries at its neighbors.
  - Uniform coordinate-space partitioning: uniformly spreads the (key,value) entries and uniformly spreads out the routing load.

CAN: Robustness
• Completely distributed: no single point of failure (though this does not cover the pieces of the database held by a node that fails).
• Database recovery is not explored (for the case where there are multiple copies of the database).
• Resilient routing: can route around trouble.

Strengths
• More resilient than flooding broadcast networks.
• Efficient at locating information.
• Fault-tolerant routing.
• High node and data availability (with the improvements).
• Manageable routing-table size and network traffic.
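As promised above, here is a minimal sketch of CAN's greedy routing step for d = 2. The Zone and Node classes and the center-distance metric are illustrative assumptions; the actual system compares distances to the neighbors' coordinate zones and routes around failed neighbors, both of which are omitted here.

```python
import math


class Zone:
    """An axis-aligned rectangle [x0, x1) x [y0, y1) owned by a single node."""

    def __init__(self, x0, y0, x1, y1):
        self.x0, self.y0, self.x1, self.y1 = x0, y0, x1, y1

    def contains(self, point):
        return self.x0 <= point[0] < self.x1 and self.y0 <= point[1] < self.y1

    def center(self):
        return ((self.x0 + self.x1) / 2, (self.y0 + self.y1) / 2)


class Node:
    def __init__(self, name, zone):
        self.name, self.zone = name, zone
        self.neighbors = []  # only immediate neighbors: O(d) state per node

    def route(self, point):
        """Deliver to the node owning `point` by greedy forwarding."""
        if self.zone.contains(point):
            return self  # this node owns the enclosing zone
        # Forward to the neighbor whose zone center is closest to the target;
        # in a consistent partition each hop makes progress, so this terminates.
        best = min(self.neighbors, key=lambda n: math.dist(n.zone.center(), point))
        return best.route(point)
```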
Weaknesses
• Impossible to perform a fuzzy search.
• Susceptible to malicious activity.
• Must maintain the coherence of all the indexed data (network overhead, efficient distribution).
• Still relatively high routing latency.
• Poor performance without the improvements.

Suggestions
• Catalogs and meta-indexes to perform the search function.
• An extension to handle mutable content efficiently, for web hosting.
• A security mechanism to defend against attacks.

Ongoing Work
• Topologically-sensitive CAN construction: distributed binning.
  - Goal: bin nodes such that co-located nodes land in the same bin.
  - Idea: use a well-known set of landmark machines; each CAN node measures its RTT to each landmark and orders the landmarks by increasing RTT (a sketch appears after the Summary below).
  - CAN construction: place nodes from the same bin close together on the CAN.

Distributed Binning
[Figure: latency stretch vs. number of nodes (256, 1K, 4K) with 4 landmarks placed 5 hops away from each other, comparing naive partitioning ("w/o binning") against "w/ binning", for #dimensions = 2 and #dimensions = 4.]

Ongoing Work (cont'd)
• CAN security (Petros Maniatis, Stanford): the spectrum of attacks and appropriate countermeasures.
• CAN usage:
  - Application-level multicast (NGC 2001).
  - Grass-roots content distribution.
  - Distributed databases using CANs (J. Hellerstein, S. Ratnasamy, S. Shenker, I. Stoica, S. Zhuang).

Summary
• CAN: an Internet-scale hash table and a potential building block in Internet applications.
• Scalability: O(d) per-node state; the average routing path is (d/4)*n^(1/d) hops.
• Low-latency routing: simple heuristics help a lot.
• Robust: decentralized, and can route around trouble.
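To ground the binning idea, here is a minimal sketch. The landmark hostnames and the rtt_ms probe are hypothetical placeholders; the bin is simply the ordering of the landmarks by measured RTT, so nodes that see the network from the same vantage point tend to compute the same bin.

```python
# Hypothetical landmark hosts; the scheme assumes a well-known fixed set.
LANDMARKS = ["l1.example.net", "l2.example.net", "l3.example.net", "l4.example.net"]


def rtt_ms(host):
    """Placeholder for a real RTT probe (e.g., an ICMP ping); returns milliseconds."""
    raise NotImplementedError


def bin_id(rtts=None):
    """Order the landmarks by increasing RTT; the ordering is this node's bin."""
    rtts = rtts or {lm: rtt_ms(lm) for lm in LANDMARKS}
    return tuple(sorted(LANDMARKS, key=lambda lm: rtts[lm]))


# Example with canned measurements: nodes behind the same access link measure
# similar RTTs and therefore land in the same bin.
print(bin_id({"l1.example.net": 80, "l2.example.net": 20,
              "l3.example.net": 45, "l4.example.net": 60}))
# -> ('l2.example.net', 'l3.example.net', 'l4.example.net', 'l1.example.net')
```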