Introduction to Peer-to-Peer Networking
Dr. Sumi Helal & Dr. Choonhwa Lee
Computer & Information Science & Engineering Department
University of Florida, Gainesville, FL 32611
{helal, chl}@cise.ufl.edu

Lecture schedule
- Introduction to peer-to-peer networking protocols (Nov. 9)
- BitTorrent protocol (Nov. 9)
- Peer-to-peer streaming protocols (Nov. 18)

Reading list
1. I. Stoica, R. Morris, D. Karger, F. Kaashoek, and H. Balakrishnan, "Chord: A Peer-to-Peer Lookup Service for Internet Applications," Proc. ACM SIGCOMM, September 2001.
2. B. Cohen, "Incentives Build Robustness in BitTorrent," Proc. Workshop on Economics of Peer-to-Peer Systems, 2003.
3. M. Piatek, T. Isdal, T. Anderson, A. Krishnamurthy, and A. Venkataramani, "Do Incentives Build Robustness in BitTorrent?" Proc. 4th USENIX Symposium on Networked Systems Design and Implementation (NSDI), April 2007.
4. J. Liu, S. G. Rao, B. Li, and H. Zhang, "Opportunities and Challenges of Peer-to-Peer Internet Video Broadcast," Proceedings of the IEEE, vol. 96, no. 1, pp. 11-24, January 2008.
5. M. Zhang, Q. Zhang, L. Sun, and S. Yang, "Understanding the Power of Pull-Based Streaming Protocol: Can We Do Better?" IEEE Journal on Selected Areas in Communications, vol. 25, no. 9, pp. 1678-1694, December 2007.

Slide courtesy
- Prof. Jehn-Ruey Jiang, National Central University, Taiwan
- Prof. Dah Ming Chiu, Chinese University of Hong Kong, China
- Chun-Hsin, National University of Kaohsiung, Taiwan
- Prof. Shiao-Li Tsao, National Chiao Tung University, Taiwan
- Prof. Shuigeng Zhou, Fudan University, China

The P2P landscape
- P2P file sharing: Napster, FreeNet, Gnutella, KaZaA, eDonkey/eMule, ezPeer, Kuro, BT
- P2P communication: NetNews (NNTP), Instant Messaging (IM), Skype (VoIP)
- P2P lookup services and applications (DHTs and global repositories): IRIS, Chord/CFS, Tapestry/OceanStore, Pastry/PAST, CAN
- P2P multimedia streaming: CoopNet, Zigzag, Narada, P2Cast, Joost, PPStream
- Proxies and content distribution networks: Squid, Akamai, LimeLight
- Overlay testbeds: PlanetLab, NetBed/EmuLab
- Other areas: P2P gaming, grid computing

How big is P2P?
- More than 200 million users registered with Skype, and around 10 million online at a time (2007)
- Around 4.7 million hosts participate in SETI@home (2006)
- BitTorrent accounts for about one third of Internet traffic (2007)
- More than 200,000 simultaneous online users on PPLive (2007)
- More than 3,000,000 users have downloaded PPStream (2008)

The client/server model
- A well-known, powerful, reliable server is the data source; clients request data from the server
- A very successful model: WWW (HTTP), FTP, Web Services, etc.
- Drawbacks: limited scalability, a single point of failure, system-administration burden, and unused resources at the network edge

What is P2P?
"Peer-to-Peer (P2P) is a way of structuring distributed applications such that individual nodes have symmetric roles. Rather than being divided into clients and servers each with quite distinct roles, in P2P applications a node may act as both a client and a server."
  -- Charter of the Peer-to-Peer Research Group, IETF/IRTF, June 24, 2003

P2P characteristics
- Peers play similar roles, with no division of responsibilities: every node is both a client and a server
- There is no centralized data source; each node both provides and consumes data, and any node can initiate a connection
- "The ultimate form of democracy on the Internet"
- As the number of clients increases, the number of servers also increases, so the system is inherently scalable

Benefits of P2P
- Scalability: consumers of resources also donate resources, so aggregate resources grow naturally as more peers join; costs are distributed; privacy can increase
- Efficient use of resources: bandwidth, storage, and processing power at the edge of the network
- Reliability: replicas, geographic distribution, no single point of failure
- Ease of administration: nodes self-organize; no need for server deployment and provisioning; built-in fault tolerance, replication, and load balancing

A brief history
- 1999: Napster
- 2000: Gnutella, eDonkey
- 2001: KaZaA
- 2002: eMule, BitTorrent
- 2003: Skype
- 2004: Coolstreaming, GridMedia, PPLive
- 2004~: TVKoo, TVAnts, PPStream, SopCast, ...

Classifying P2P networks
- By whether the protocol relies on central indexing servers to facilitate interactions between peers: centralized, decentralized, or hybrid
- By whether the overlay network has some structure or is created in an ad hoc fashion: unstructured, or structured (i.e., precise control over network topology and data placement)
- Unstructured networks: centralized (Napster), decentralized (Gnutella), hybrid (KaZaA, later Gnutella)
- Structured networks: Chord, Pastry, CAN

Napster
- The first P2P file-sharing application: a centralized directory helps peers find content
- History: S. Fanning launched Napster in 1999; it peaked at 1.5 million simultaneous users; in July 2001, Napster shut down
- Publish: a peer announces its content ("I have X, Y, and Z!"), e.g. insert(X, 123.2.21.23)
- Query: "Where is file A?" -> search(A) -> 123.2.0.18; the requester then fetches the file directly from that peer
- Pros: simple; search cost is O(1); controllable (pro or con?)
- Cons: the server maintains O(N) state and does all the processing; it is a single point of failure

Gnutella
- Completely distributed P2P file sharing: each peer floods its request to all other peers, which incurs prohibitive overheads
- History: in 2000, J. Frankel and T. Pepper of Nullsoft released Gnutella; soon there were many other clients (BearShare, Morpheus, LimeWire, etc.); in 2001 came many protocol enhancements, including "UltraPeers"
- The "animal": GNU, the recursive acronym "GNU's Not Unix"
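The Napster-style publish/lookup exchange above can be sketched as a tiny in-memory directory. This is an illustrative sketch, not Napster's actual protocol; the class and method names are ours. It shows why the central server holds O(N) state while each search costs O(1).

```python
# Sketch of a Napster-style central index (illustrative names, hypothetical API).
class CentralIndex:
    def __init__(self):
        self.directory = {}  # file name -> set of peer addresses: O(N) state

    def publish(self, filename, peer_addr):
        """A peer announces a file it hosts, e.g. insert(X, 123.2.21.23)."""
        self.directory.setdefault(filename, set()).add(peer_addr)

    def search(self, filename):
        """O(1) lookup: return the peers claiming to host the file."""
        return self.directory.get(filename, set())

index = CentralIndex()
index.publish("X", "123.2.21.23")  # "I have X, Y, and Z!"
index.publish("Y", "123.2.21.23")
index.publish("A", "123.2.0.18")
print(index.search("A"))           # {'123.2.0.18'} -> fetch directly from that peer
```

The directory lives only on the server, so its loss (a crash, or a court order, as in Napster's case) takes the whole search service down with it.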
Gnutella = GNU + Nutella (Nutella: a hazelnut chocolate spread produced by the Italian confectioner Ferrero)

Searching by flooding
- Query: "Where is file A?" is flooded through the overlay; peers holding the file reply "I have file A"
- Pros: fully decentralized; the search cost is distributed across peers
- Cons: search cost is O(N) messages; search time is hard to bound; nodes leave often, so the network is unstable

Ultra-peers
- Hierarchical supernodes ("ultra-peers") are each assigned the task of servicing a small sub-part of the network, indexing and caching the files in that part
- Supernodes are chosen for sufficient bandwidth and processing power

KaZaA
- KaZaA and Morpheus are proprietary systems built on a hybrid protocol with "super nodes": more efficient than the old Gnutella and more robust than Napster
- Publish: "I have X!" -> insert(X, 123.2.21.23) at the peer's supernode
- Query: "Where is file A?" goes to a supernode; replies such as search(A) -> 123.2.22.50 and search(A) -> 123.2.0.18 come back through the supernode layer

Toward distributed hash tables
- So far, with n participating nodes:
  - Centralized directory: directory size O(n), number of hops O(1)
  - Flooded queries: directory size O(1), number of hops O(n)
- We want efficiency (O(log n) messages per lookup), scalability (O(log n) state per node), and robustness (surviving massive failures)

Consistent hashing
- Objects have hash keys: object y is published under H(y)
- Peer nodes also have hash keys in the same hash space: peer x joins under H(x)
- An object is placed on the peer with the closest hash key
- The hash space (e.g., 0 to 2^128 - 1) forms a ring; each peer tracks a few peers across the ring so lookups can move quickly through the hash space
- A peer p tracks the peers responsible for hash keys p + 2^(i-1), i = 1, ..., m

Chord (Frans Kaashoek et al., MIT, 2001)
- Identifiers: an m-bit identifier space for both keys and nodes
- Key identifier = SHA-1(key): key "LetItBe" -> ID 5
- Node identifier = SHA-1(IP address): IP "198.10.10.1" -> ID 105
- Both are uniformly distributed
- How do we map key IDs to node IDs?
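The identifier scheme and the finger-table routing it enables can be simulated in a few lines of Python. This is a hedged, single-process sketch, not a distributed implementation: the node IDs are illustrative (chosen to match the 6-bit finger-table example in these slides), and helper names such as find_successor are ours.

```python
import hashlib

M = 6                  # identifier bits: a 6-bit ring, as in the finger example
RING = 2 ** M
NODES = sorted([1, 8, 14, 21, 32, 38, 42, 48, 51, 56])  # illustrative node IDs

def chord_id(value: str) -> int:
    """Map a key or an IP address into the m-bit identifier space via SHA-1."""
    return int.from_bytes(hashlib.sha1(value.encode()).digest(), "big") % RING

def successor(i: int) -> int:
    """First node clockwise at or after identifier i; a key lives on its successor."""
    i %= RING
    return next((n for n in NODES if n >= i), NODES[0])  # wrap around the ring

def fingers(n: int):
    """finger[k] = successor((n + 2^(k-1)) mod 2^m), k = 1..m."""
    return [successor(n + 2 ** (k - 1)) for k in range(1, M + 1)]

def _strictly_between(x: int, a: int, b: int) -> bool:
    """x in the circular open interval (a, b)."""
    return (a < x < b) if a < b else (x > a or x < b)

def _in_half_open(x: int, a: int, b: int) -> bool:
    """x in the circular half-open interval (a, b]."""
    return (a < x <= b) if a < b else (x > a or x <= b)

def find_successor(n: int, key: int) -> int:
    """Route toward key's successor using finger tables (O(log n) hops)."""
    s = successor(n + 1)                       # n's immediate ring successor
    while not _in_half_open(key, n, s):
        nxt = next((f for f in reversed(fingers(n))
                    if _strictly_between(f, n, key)), n)
        if nxt == n:                           # no closer finger: s is the answer
            break
        n, s = nxt, successor(nxt + 1)
    return s

print(successor(40))          # 42: N42 is the first node succeeding (8 + 2^5) mod 2^6
print(find_successor(8, 54))  # 56, reached in a few finger hops
```

Each hop roughly halves the remaining distance to the key, which is where the O(log n) lookup cost comes from.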
The Chord ring
- As nodes enter the network, they are assigned unique IDs by hashing their IP address (e.g., IP "198.10.10.1" -> N105 on a circular 7-bit ID space with N10, N32, N90, N105, N120)
- A key is stored at its successor, the node with the next-higher ID: K5 -> N10, K20 -> N32, K80 -> N90
- Every node knows its successor in the ring, so the query "Where is key 80?" can be forwarded around the ring until the answer comes back: "N90 has K80"

Finger tables
- Each node n keeps a finger table (FT) with m additional entries; the i-th entry points to the successor of n + 2^(i-1)
- finger[k]: the first node on the circle that succeeds (n + 2^(k-1)) mod 2^m, 1 <= k <= m
- Example with m = 6 and node N8: N42 is the first node that succeeds (8 + 2^5) mod 2^6 = 40, and N14 is the first node that succeeds (8 + 2^0) mod 2^6 = 9
- To look up key k at node n: find in the finger table the highest node n' whose ID lies between n and k; if such a node exists, repeat the lookup starting from n'; otherwise, return the successor of n

    Lookup(my-id, key-id)
        look in local finger table for highest node n s.t. my-id < n < key-id
        if n exists
            call Lookup(key-id) on node n    // next hop
        else
            return my successor              // done

- Lookup example on the ring N5, N10, N20, N32, N60, N80, N99, N110, with finger tables
    N5:  N10, N20, N32, N60, N80
    N10: N20, N32, N60, N80
    N20: N32, N60, N99
    N32: N60, N80, N99
    N99: N110, N5, N60
  Lookup(K19) is forwarded finger by finger until it reaches N20, the successor of key 19

Summary: classifying P2P systems
- Centralized / decentralized / hybrid: Napster, Gnutella, KaZaA
- Unstructured P2P: no control over topology and file placement (Gnutella, Morpheus, KaZaA, etc.)
- Structured P2P: the topology is tightly controlled and files are placed at precise locations rather than at random (Chord, CAN, Pastry, Tapestry, etc.)

Open issues in P2P research
- P2P overlay topology
- Free riding and incentive mechanisms
- Topological awareness; ISP-friendliness
- NAT traversal
- Fault resilience
- P2P traffic monitoring and detection
- Security
- Search: full index, partial index, semantic search
- Spurious content; anonymity; trust and reputation management
- Non-technical issues: copyright infringement, intellectual property

Slide courtesy: Prof. Dah Ming Chiu, Chinese University of Hong Kong, and Dr. Iqbal Mohomed, University of Toronto, Canada

Approaches to content distribution
- IP multicast
- CDNs (content distribution networks)
- Application-layer multicast over overlay structures: tree-based (push) or data-driven (pull)
- P2P swarming: BitTorrent, CoolStreaming

BitTorrent
- Released in the summer of 2001
- Uses basic ideas from game theory to largely eliminate the free-rider problem, which no preceding system dealt with well
- Unlike DHTs, it offers no strong guarantees; unlike DHTs, it works extremely well in practice

How BitTorrent works
- A file is chopped into small pieces, called chunks, which are disseminated over the network
- As soon as a peer acquires a chunk, it can trade it for missing chunks with other peers
- Each peer hopes to be able to assemble the entire file in the end

Components
- A Web server, the .torrent metafile, a tracker, and the peers
- Content discovery (i.e., file search) is handled outside of BitTorrent: a Web server provides the "meta-info" file over HTTP (for example, http://bt.btchina.net)

The tracker
- Keeps track of peers, allowing them to find one another; it returns a random list of active peers
- Two types of peers contact it; a downloader (leecher) is a peer who has only a part (or none) of the file

The .torrent metafile
- The information about each movie or piece of content is stored in a metafile such as "supergirl.torrent": a static file holding the necessary meta-information, namely the content's name, its size, per-chunk checksums, and the IP address and port of the tracker
- The content is divided into many chunks (e.g., 1/4 megabyte each), and each chunk is hashed to a checksum value
- When a peer later gets a chunk from other peers, it can check the chunk's authenticity by comparing the checksum
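The per-chunk checksum scheme can be sketched as follows. BitTorrent stores SHA-1 piece hashes in the metafile; the function names and the toy two-chunk content below are illustrative, not part of any real client.

```python
import hashlib

CHUNK_SIZE = 256 * 1024  # 1/4-megabyte chunks, as in the slides

def piece_hashes(data: bytes, chunk_size: int = CHUNK_SIZE):
    """Hash every fixed-size chunk; the .torrent metafile carries these checksums."""
    return [hashlib.sha1(data[i:i + chunk_size]).hexdigest()
            for i in range(0, len(data), chunk_size)]

def verify_piece(index: int, piece: bytes, expected) -> bool:
    """Check a chunk received from another peer against the metafile's checksum."""
    return hashlib.sha1(piece).hexdigest() == expected[index]

content = b"A" * CHUNK_SIZE + b"B" * 1000  # toy content: two chunks
hashes = piece_hashes(content)             # what the metafile would store
print(verify_piece(0, b"A" * CHUNK_SIZE, hashes))  # True: authentic chunk
print(verify_piece(1, b"tampered", hashes))        # False: corrupted or fake chunk
```

Because the checksums come from the trusted metafile rather than from peers, a downloader can accept chunks from complete strangers and still detect corruption or poisoning.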
Seeders and the swarm
- Seeder: a peer who has the complete file and chooses to stay in the system to allow other peers to download
- [Figure: an example swarm for "Matrix.torrent" with user Bob, a Web server, a tracker, seeder Chris, and downloaders Alice and David]
- [Figure: seven animation frames of a swarm: a tracker, a Web server, seed C, and leechers A and B exchanging pieces]

Chunk exchange
- A file is split into chunks of fixed size, typically 256 KB
- Each peer maintains a bitmap indicating which chunks it has and reports it to all of its neighboring peers (obtained from the tracker)
- This information is used to build the implicit delivery trees
- [Figure: seeder Alice holds chunks {1, ..., 10}; downloaders Bob and Joe hold growing subsets such as {1,2,3}, {1,2,3,5}, {1,2,3,4}, and {1,2,3,4,5}]

Rarest first
- Rarer pieces are given priority in downloading, with the rarest as the first candidate; the most common pieces are postponed toward the end
- This policy ensures that a variety of pieces is downloaded from the seeder, resulting in quicker chunk propagation

Tit-for-tat
- Basic idea of the tit-for-tat strategy in BitTorrent: maintain 4-5 "friends" with which to exchange chunks
- If a friend is not exchanging enough chunks, get rid of him/her: known as "choking" in BT
- Periodically, randomly select a new friend: known as "optimistic unchoking" in BT
- If you have no friends, randomly select several new friends: known as "anti-snubbing" in BT
- [Figure: downloader Joe and peers Alice, Bob, Chris, David, and Ed exchanging chunks at asymmetric rates ranging from 5 kb/s to 110 kb/s]
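The rarest-first policy above reduces to counting, across the neighbors' reported bitmaps, how many copies of each piece exist, and then requesting the least-replicated piece one does not yet have. A minimal sketch, with our own function name and tie-breaking choice (real clients also randomize among equally rare pieces):

```python
from collections import Counter

def rarest_first(my_pieces, neighbor_bitmaps):
    """Pick the next piece to request: the piece the fewest neighbors hold."""
    counts = Counter()
    for bitmap in neighbor_bitmaps:   # each neighbor reports its set of pieces
        counts.update(bitmap)
    candidates = [(count, piece) for piece, count in counts.items()
                  if piece not in my_pieces]
    if not candidates:
        return None                   # nothing left to request
    return min(candidates)[1]         # rarest piece; ties -> lowest index

mine = {1, 2, 3}
neighbors = [{1, 2, 3, 4, 5}, {1, 2, 3, 4}, {1, 2, 3}]
print(rarest_first(mine, neighbors))  # 5: held by only one neighbor
```

Requesting piece 5 before the more common piece 4 spreads the scarce piece faster, so the swarm as a whole depends less on the original seeder.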