Peer-to-Peer Intro
5.4.2005
Jani & Sami Peltotalo

Overview of P2P
• Overlay networks
• Current P2P applications
  – P2P file sharing
  – Instant messaging / voice over IP
  – P2P distributed computing

P2P Architectures

1G P2P: Centralized Network
• Fast search/query response times
• Simple protocol
• Provides a high degree of performance and resilience
• Susceptible to being shut down: single server or server farm
• e.g. Napster

2G P2P: Decentralized Network
• Slow search/query response times; queries generate large volumes of network traffic
• Network resilience and performance governed by users' PCs and their network connectivity
• No central points of failure or control
• e.g. Gnutella 0.4

3G P2P: Hybrid Architecture
• Improved search/query response times, with less traffic generated per query than decentralized networks
• The deployment of super-peers provides a high degree of performance and resilience
• No central points of failure or control
• e.g. FastTrack, Gnutella 0.6 super-peers

4G P2P: Different types of architectures
• BitTorrent: centralized
• eDonkey2000: semi-centralized
• Overnet: decentralized

FastTrack Overview

FastTrack
• Clients: KaZaA, iMesh, Grokster...
• MP3s & entire albums, videos, games
• Decentralized network; supernodes act as temporary indexing servers (hierarchical architecture)
• Control data encrypted
• Everything in HTTP request and response messages
• Optional parallel downloading of files

FastTrack: Architecture
• Each peer is either a supernode (SN) or is assigned to a supernode
• Selection criteria: CPU, memory, network connection
• Each SN has about 100-150 child nodes and 30-50 TCP connections with other supernodes
• SN tracks the content and IP addresses of its child nodes, not content under its neighboring SNs
[Figure: supernodes (SN) and ordinary nodes (ON)]

FastTrack: Metadata
• When an ON connects to an SN, it uploads its metadata
• For each file:
  – File name
  – File size
  – Content hash (MD5+CRC)
  – File descriptors: used for keyword matches during query
• ContentHash:
  – When peer A selects a file at peer B, peer A sends the ContentHash in the HTTP request
  – If the download of a specific file fails (only partially completes), the ContentHash is used to search for a new copy of the file

FastTrack: Overlay Maintenance
• List of potential supernodes included within the software download
• New peer goes through the list until it finds an operational supernode
  – Node "pings" 5-6 supernodes on the list and connects to the first SN that replies
  – Connects and obtains a more up-to-date list, with 200 entries
  – SNs in the updated list are "close" to the ON
• If the supernode goes down, the node goes through the updated list and finds a new supernode

FastTrack: Queries
• Node first sends the query to its supernode
  – Supernode responds with matches
  – If x matches found, done
• Otherwise, supernode forwards the query to a subset of supernodes
  – If a total of x matches found, done
• Otherwise, the query is forwarded further
  – Probably by the original supernode rather than recursively

FastTrack: Parallel Downloading and Recovery
• If a file is found on multiple nodes, the user can select parallel downloading
• Identical copies identified by ContentHash
• HTTP byte-range header used to request different portions of the file from different nodes
• Automatic recovery when the server peer stops sending the file
  – ContentHash is used to search for a new copy of the file

eDonkey2000 Overview

eDonkey2000 (ED2K)
• Semi-centralized network, includes index servers
• Many clients: eDonkey2000, MLDonkey, eMule, Shareaza...
• Index server: Lugdunum
• Also used for legal content delivery
• Files identified by hash (MD4)
• Files can be searched for on the web; the ed2k links found there can be used to start a file download
  – ed2k://|file|gentoo.linux.install-x86-minimal-2004.1 [found via www.FileDonkey.com].iso|85764096|F1819D1C731923327E140F09DB7400B6|/

ED2K
• Communication:
  – client to connected index server: TCP
  – client to other index servers: UDP
  – index server to index server: UDP
  – client to client: TCP
• File transfer using Multisource File Transmission Protocol (MFTP)
  – HTTP and BitTorrent also supported

ED2K: Registration
• Register to server, tell server own shared files
[Figure: Index Servers 1-3 and Peers 1-4, before Peer 1 registers]
• Shared files per peer:
  – Peer 1: XXXX.txt, XXXX.exe
  – Peer 2 (registered to index server 1): YYYY.txt, YYYY.exe, XXXX.exe
  – Peer 3 (registered to index server 2): YYYY.txt, ZZZZ.exe
  – Peer 4 (registered to index server 3): ZZZZ.txt, ZZZZ.exe
• Index tables:
  – Index Server 1: XXXX.exe (Peer 2), YYYY.txt (Peer 2), YYYY.exe (Peer 2)
  – Index Server 2: YYYY.txt (Peer 3), ZZZZ.exe (Peer 3)
  – Index Server 3: ZZZZ.txt (Peer 4), ZZZZ.exe (Peer 4)

ED2K: Registration Reply
• Reply contains a list of other index servers known by index server 1
[Figure: Index Servers 1-3 and Peers 1-4, after Peer 1 has registered to index server 1]
• Index Server 1 table after registration:
  – XXXX.txt (Peer 1)
  – XXXX.exe (Peer 1 & Peer 2)
  – YYYY.txt (Peer 2)
  – YYYY.exe (Peer 2)

ED2K: File Search
• Search Files (TCP) to the connected index server, Search Files (UDP) to other index servers
• Search Files message includes:
  – keyword
  – optionally: min file size, max file size, availability, etc.
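The registration and search steps above can be sketched as a small in-memory index. This is only an illustration of the idea, not the ED2K wire protocol: the class and method names (`SharedFile`, `IndexServer`, `register`, `search`), the substring keyword match, and all addresses, ports, and hash strings are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SharedFile:
    file_hash: str  # hex digest identifying the file (the slides name MD4)
    name: str
    size: int       # bytes


@dataclass
class IndexServer:
    files: dict = field(default_factory=dict)    # file hash -> SharedFile
    sources: dict = field(default_factory=dict)  # file hash -> set of (ip, port)

    def register(self, peer_addr, shared):
        """Record the files a peer announces when it registers."""
        for f in shared:
            self.files[f.file_hash] = f
            self.sources.setdefault(f.file_hash, set()).add(peer_addr)

    def search(self, keyword, min_size=None, max_size=None):
        """Return files whose name contains keyword, within optional size bounds."""
        kw = keyword.lower()
        hits = []
        for f in self.files.values():
            if kw not in f.name.lower():
                continue
            if min_size is not None and f.size < min_size:
                continue
            if max_size is not None and f.size > max_size:
                continue
            hits.append(f)
        return sorted(hits, key=lambda f: f.name)


# Placeholder hashes and addresses, loosely mirroring the registration figure:
# Peer 1 and Peer 2 both share XXXX.exe.
server = IndexServer()
peer1, peer2 = ("192.0.2.1", 4662), ("192.0.2.2", 4662)
server.register(peer1, [SharedFile("aa" * 16, "XXXX.txt", 10),
                        SharedFile("bb" * 16, "XXXX.exe", 5000)])
server.register(peer2, [SharedFile("bb" * 16, "XXXX.exe", 5000),
                        SharedFile("cc" * 16, "YYYY.txt", 20)])
print([f.name for f in server.search("xxxx", min_size=1000)])
```

Keying both tables by file hash means a file shared by several peers appears once in the search results while its source list grows, which matches the "Peer 1 & Peer 2" entry in the registration reply figure.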
ED2K: File Search Reply
• Search File Results (TCP) from the connected index server, Search File Results (UDP) from other index servers
• Search File Results message includes one or more file info entries:
  – file hash
  – client IP and port (optional?)
  – file name

ED2K: File Downloading 1/3
• Get Sources (TCP) to the connected index server, Get Sources (UDP) to other index servers
• Done if the Search File Results message(s) don't include client IP and port pair(s), or if an ED2K link is used to start downloading
• Get Sources message includes:
  – file hash

ED2K: File Downloading 2/3
• Found Sources (TCP) from the connected index server, Found Sources (UDP) from other index servers
• Found Sources message includes:
  – file hash
  – address list: client IP and port

ED2K: File Downloading 3/3
• File requests and downloading between the peers (client to client)

BitTorrent Overview

BitTorrent
• Centralized network, includes tracker
• .torrent files
• Google search for .torrents
• Legal material available

BitTorrent: Get .torrent
• Downloader sends a GET for the .torrent file to an HTTP server and receives the .torrent file
• In .torrent file:
  – file size
  – file name
  – hash of file (SHA1)
  – URL of tracker
[Figure: HTTP server, tracker, Seed 1, Seed 2, Leecher 1, downloader]

BitTorrent: Get Peer List
• Downloader sends a GET announce to the tracker
• Tracker responds with the peer list

BitTorrent: Query File Pieces
• Downloader sends GET requests for pieces of the file to the peers (figure shows Seed 1, Seed 2, Leecher 1)

BitTorrent: File Pieces
• Info about download status goes to the tracker
• Pieces of the file are transferred between peers
[Figure: HTTP server, tracker, Seed 1, Seed 2, Leecher 1, Leecher 2]

BitTorrent: Status Information
• Info about the complete download goes to the tracker
[Figure: HTTP server, tracker, Seed 1, Seed 2, Seed 3, Seed 4]

Skype Overview

Skype
• Skype is a P2P VoIP client developed by the people who created KaZaA
• Allows its users to place voice calls and send text messages to other users of Skype clients
• Two types of nodes in the overlay network: ordinary hosts (OH) and super nodes (SN)
• An OH is a Skype application that can be used to place voice calls and send text messages
• An SN is an ordinary host's end-point on the Skype network
• Any node with a public IP address and sufficient CPU, memory, and network bandwidth is a candidate to become an SN

Skype
• An OH must connect to an SN and must register itself with the Skype login server for a successful login
• 7 bootstrap super nodes
• The host cache (HC) is a list of super node IP address and port pairs that the OH builds and refreshes regularly
• HC contains a maximum of 200 entries
[Figure: Skype network with Skype login server and super nodes; message exchange during login]

Skype
• Uses its Global Index technology to search for a user
• Firewall traversal: first UDP, second TCP, third TCP port 80 (HTTP), fourth TCP port 443 (HTTPS)
• Call signaling is always carried over TCP

NAT and Firewall Traversal
• If the caller is behind a port-restricted NAT, call signaling (TCP) is forwarded through a node that has a public IP address
• If the caller, the callee, or both are behind a port-restricted NAT, voice traffic (UDP) is forwarded through the same node
• If both caller and callee have a public IP address, call signaling (TCP) and voice traffic (UDP) flow directly between them
• If both caller and callee are behind a port-restricted NAT and a UDP-restricted firewall, signaling traffic and voice traffic are forwarded through another node over TCP
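The four NAT and firewall traversal rules can be read as a small decision procedure. The sketch below only encodes the rules as the slides state them; `Endpoint`, `plan_call`, and the path labels are hypothetical names chosen for this example, not part of Skype. The slides do not say how signaling is routed when only the callee is behind a NAT, so that case is returned as "unspecified" rather than guessed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    behind_nat: bool           # behind a port-restricted NAT (no public IP)
    udp_blocked: bool = False  # also behind a UDP-restricted firewall


def plan_call(caller: Endpoint, callee: Endpoint) -> dict:
    """Choose signaling and voice paths following the four slide rules."""
    # Both behind NAT and a UDP-restricted firewall: everything relayed over TCP.
    if (caller.behind_nat and callee.behind_nat
            and caller.udp_blocked and callee.udp_blocked):
        return {"signaling": "relay/TCP", "voice": "relay/TCP"}
    # Both have public IP addresses: signaling and voice flow directly.
    if not caller.behind_nat and not callee.behind_nat:
        return {"signaling": "direct/TCP", "voice": "direct/UDP"}
    # Caller behind NAT: signaling goes through a public node.
    # Callee-only NAT is not covered by the slides, so it is left unspecified.
    signaling = "relay/TCP" if caller.behind_nat else "unspecified"
    # Either side behind NAT: voice is relayed over UDP through the same node.
    return {"signaling": signaling, "voice": "relay/UDP"}


print(plan_call(Endpoint(behind_nat=True), Endpoint(behind_nat=False)))
```

Checking the most restrictive case first keeps the later branches simple, since any call that reaches them is known not to be in the TCP-only relay case.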