Computer Networks: A Systems Approach, 5e
Larry L. Peterson and Bruce S. Davie
Chapter 9: Applications
Copyright © 2010, Elsevier Inc. All rights reserved.

Problem
• Applications need their own protocols.
• This chapter explores some of the most popular network applications available today.

Chapter Outline
• Traditional Applications
  – HTTP
  – SMTP
• Infrastructure Services
  – DNS
  – SNMP
• Overlay Networks
  – P2P
  – CDN
• Multimedia Applications
  – SDP

Traditional Applications
• Two of the most popular:
  – The World Wide Web
  – Email
• Both applications use the request/reply paradigm:
  – Users send requests to servers.
  – Servers respond accordingly.
• Why call them "traditional" applications?
  – They have existed since the early days of computer networks.
  – The Web is a lot newer than email, but it has its roots in file transfers that predated it.
• Application programs vs. application protocols, e.g.:
  – Application protocol: HTTP
    • Used to retrieve Web pages from remote servers.
  – Application programs: Web clients
    • E.g. Internet Explorer, Chrome, Firefox, and Safari.
    • Each provides users with a different look and feel.
    • All Web browsers use the same HTTP protocol to communicate with Web servers over the Internet.
• Two very widely used, standardized application protocols:
  – SMTP: Simple Mail Transfer Protocol is used to exchange electronic mail.
  – HTTP: HyperText Transfer Protocol is used to communicate between Web browsers and Web servers.

Electronic Mail (SMTP, MIME, IMAP)
• Email is one of the oldest network applications.
  – User interface/application:
    • Mail reader: Microsoft Outlook, Gmail
  – Underlying message transfer protocols:
    • SMTP
    • IMAP
  – Companion protocols:
    • RFC 822 and MIME
    • Define the format of the messages being exchanged
• Message format:
  – RFC 822 defines messages to have two parts: a header and a body.
  – Both parts are represented in ASCII text.
  – Originally, the body was assumed to be simple text. This is still the case.
  – However, RFC 822 has been augmented by MIME to allow the message body to carry all sorts of data.
  – This data is still represented as ASCII text.
  – However, because it may be an encoded version of, say, a JPEG image, it is not necessarily readable by human users.
• MIME (Multipurpose Internet Mail Extensions):
  – An extension of the original Internet email protocol
  – Developed in 1992 by the IETF
  – A specification for enhancing the capabilities of standard Internet electronic mail
• When using the MIME standard, messages can contain the following types:
  – Text messages in ASCII
  – Character sets other than ASCII
  – Multimedia: image, audio, and video messages
  – Multiple objects in a single message
  – Multi-font messages
  – Messages of unlimited length
  – Binary files
• MIME builds on the older standard by:
  – defining additional fields for the mail message header that describe new content types, and
  – defining a distinct organization of the message body.
• It explicitly describes the set of allowable content types:
  1. Text: used to represent textual information.
  2. Image: for transmitting still images.
  3. Audio: for transmitting audio or voice data.
  4. Video: for transmitting video data or moving image data.
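
The header/body split and the MIME content types above can be sketched with Python's standard email package. This is only an illustration: the addresses, subject, file name, and the few placeholder bytes standing in for a JPEG are all assumptions, not part of the slides.

```python
from email.message import EmailMessage


def build_mime_message(image_bytes: bytes) -> EmailMessage:
    """Build a message with a text body plus one image attachment."""
    msg = EmailMessage()
    # RFC 822 style header fields (hypothetical addresses).
    msg["From"] = "alice@example.com"
    msg["To"] = "bob@example.com"
    msg["Subject"] = "Holiday photo"
    # Content type "text": a plain ASCII body.
    msg.set_content("Photo attached.")
    # Content type "image": binary data, which the library encodes as ASCII text.
    msg.add_attachment(image_bytes, maintype="image", subtype="jpeg",
                       filename="photo.jpg")
    return msg


# Placeholder bytes, not a real JPEG file.
sample_image = bytes([0xFF, 0xD8, 0xFF, 0xE0, 0x00, 0x10])
message = build_mime_message(sample_image)
parts = list(message.iter_parts())   # multiple objects in a single message
wire_text = message.as_string()      # what would actually be transmitted
```

Printing wire_text shows the attachment as base64 text: still ASCII, but not readable by a human, which is the point made on the message format slide.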
• Servers insert the MIME header at the beginning of ANY Web transmission.
  – Clients use this header to select an appropriate "player" application for the type of data the header indicates.
  – Some of these players are built into the Web client or browser.
    • E.g. all browsers come with GIF and JPEG image players as well as the ability to handle HTML files; other players may need to be downloaded.
• SMTP, POP3 and IMAP are TCP/IP protocols used for mail delivery.
• SMTP (Simple Mail Transfer Protocol):
  – Used when:
    • email is delivered from an email client to an email server, or
    • email is delivered from one email server to another.
  – Uses port 25.
• SMTP delivers mail from other people to your mail server.
• Your mail server stores the received email in a mailbox until your mail client asks for it.
• This is where IMAP and POP enter the picture.
• IMAP and POP are the two most prevalent protocols for retrieving email from a mail server.
• Both of these protocols are supported by almost all popular mail client programs:
  – Outlook, Thunderbird, Apple Mail.
• POP3 (Post Office Protocol):
  – Allows an email client to download email from an email server.
  – Simple: does not offer many features except for download.
  – Assumes that the email client downloads all available email from the server, deletes the email from the server, and then disconnects.
  – Normally uses port 110.
• IMAP (Internet Message Access Protocol):
  – Shares many features with POP3.
  – It, too, is a protocol that an email client can use to download email from an email server.
  – However, IMAP includes many more features than POP3.
  – Designed to let users keep their email on the server.
  – Requires more disk space on the server and more CPU resources than POP3, as all emails are stored on the server.
  – Normally uses port 143.
• Example:
  – Suppose you use the csbsju email server as your email server to send an email to [email protected].
  – You click Send in your email client, say, Outlook Express.
  – Outlook Express delivers the email to the csbsju email server using the SMTP protocol.
  – The csbsju email server delivers the email to Microsoft's mail server, mail.microsoft.com, using SMTP.
  – Bill's Mozilla Mail client downloads the email from mail.microsoft.com to his laptop using the POP3 protocol (or IMAP).
• https://support.google.com/mail/troubleshooter/1668960?hl=en

World Wide Web (WWW)
• The World Wide Web has made the Internet accessible to so many people that sometimes it seems to be synonymous with the Internet.
• The design of the system that became the Web started around 1989, long after the Internet had become a widely deployed system.
• The original goal of the Web was to find a way to organize and retrieve information, drawing on ideas about hypertext (interlinked documents) that had been around since at least the 1960s.
• The core idea of hypertext is that one document can link to another document, and the protocol (HTTP) and document language (HTML) were designed to meet that goal.
• Any Web browser has a function that allows the user to retrieve an object by "opening a URL."
• URLs (Uniform Resource Locators) provide information that allows objects on the Web to be located, and they look like the following:
  – http://www.cs.princeton.edu/index.html
• If you opened that particular URL:
  – your Web browser would open a TCP connection to the Web server at a machine called www.cs.princeton.edu,
  – then immediately retrieve and display the file called index.html.
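
As a rough sketch of the first thing a browser has to do with that URL, the snippet below splits it into the machine to connect to and the file to ask for. The default port of 80 and the helper name locate are assumptions added for illustration; the slides mention neither.

```python
from urllib.parse import urlsplit

DEFAULT_HTTP_PORT = 80  # assumption: standard HTTP port when the URL names none


def locate(url: str) -> tuple[str, int, str]:
    """Return (host, port, path): where to open the TCP connection and what to request."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    port = parts.port or DEFAULT_HTTP_PORT
    path = parts.path or "/"
    return host, port, path


target = locate("http://www.cs.princeton.edu/index.html")
```

Here target holds the host www.cs.princeton.edu, port 80, and the path /index.html.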
• Most files on the Web contain:
  – images
  – text
  – audio
  – video clips
  – pieces of code
  – hyperlinks, etc.
• At its core, HTTP is a request/response protocol, where every message has the general form:
    START_LINE <CRLF>
    MESSAGE_HEADER <CRLF>
    <CRLF>
    MESSAGE_BODY <CRLF>
• <CRLF> stands for carriage-return-line-feed.
• The first line (START_LINE) indicates whether this is a request message or a response message (status line).
[Table: HTTP request operations]
• E.g. start Wireshark (filter = http), begin capturing, and open the URL www.w3.org.
• HTTP request: header (no body; the body is optional)
[Figure: captured HTTP request]
• HTTP response: five types of HTTP result codes
[Table: five types of HTTP result codes]
• HTTP response: header (no body; the body is optional)
[Figure: captured HTTP response showing header and body]
• TCP connections:
  – The original version of HTTP (1.0) established a separate TCP connection for each data item retrieved from the server.
  – Inefficient: connection setup and teardown per data item.
  – Thus, retrieving a page that included some text and a dozen icons or other small graphics would result in 13 separate TCP connections being established and closed.
  – To overcome this situation, HTTP version 1.1 introduced persistent connections:
    • The client and server can exchange multiple request/response messages over the same TCP connection.
  – Persistent connections have many advantages:
    • They eliminate the connection setup overhead, which
      – reduces the load on the server,
      – reduces the load on the network caused by the additional TCP packets,
      – reduces the delay perceived by the user.
    • TCP's congestion window mechanism operates more efficiently:
      – The client can send multiple request messages down a single TCP connection.
      – It is not necessary to go through the slow start phase for each page.
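
A minimal sketch of the general message form, using plain string handling rather than a real HTTP library. The function names are hypothetical, the header fields (Host, Connection, Content-Type, Content-Length) are standard HTTP fields that the slides do not list, and the sketch omits the trailing CRLF after the body.

```python
CRLF = "\r\n"


def build_request(method: str, path: str, headers: dict) -> str:
    """Assemble START_LINE, header lines, and the blank line that ends the header."""
    lines = [f"{method} {path} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    return CRLF.join(lines) + CRLF + CRLF


def parse_message(raw: str):
    """Split a raw message into (start_line, headers, body)."""
    head, _, body = raw.partition(CRLF + CRLF)
    start_line, *header_lines = head.split(CRLF)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return start_line, headers, body


def is_response(start_line: str) -> bool:
    """A status line starts with the HTTP version; a request line starts with a method."""
    return start_line.startswith("HTTP/")


request = build_request("GET", "/index.html",
                        {"Host": "www.cs.princeton.edu", "Connection": "keep-alive"})
response = ("HTTP/1.1 200 OK" + CRLF + "Content-Type: text/html" + CRLF
            + "Content-Length: 5" + CRLF + CRLF + "hello")
```

With a persistent connection, several strings like request would be written to the same TCP socket one after another instead of opening a new connection for each.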
[Figure: HTTP 1.0 behavior (TCP connection 1, TCP connection 2)]
[Figure: HTTP 1.1 behavior with persistent connections (TCP connection 1)]
• Web services:
  – Read the textbook.
  – http://www.w3schools.com/webservices/
  – http://www.tutorialspoint.com/webservices/

DNS
• DNS: Domain Name System
  – Translates the host name in a URL to an IP address
  – E.g. cs.csbsju.edu → 152.65.160.46
• Question: Why use host names (as in URLs) instead of IP addresses?
• Answer:
  – IP addresses are perfectly suited for use by routers.
  – However, IP addresses are not exactly user-friendly.
  – Therefore we ALSO assign a unique name to each host in a network.
  – Host names differ from host addresses in several ways:

    Host name                                 Host address
    Variable length                           Fixed length (32 bits in IPv4)
    Mnemonic: easy for humans to remember     Numeric: not easy for humans to remember
    Has no routing information embedded       Sometimes has routing information embedded

• Some basic terminology:
  1. Name space:
    – The set of possible names
    – Two types:
      • flat (names are not divisible into components)
      • hierarchical (e.g. Unix file names)
  2. Binding:
    – A binding maps a name to a value (name → value).
    – A naming system maintains a collection of bindings.
  3. Resolution mechanism:
    – A procedure that, when invoked with a name, returns the corresponding value.
  4. Name server:
    – A specific implementation of a resolution mechanism that is available on a network
    – Can be queried by sending it a message.
• The Internet has a well-developed naming system in place: the Domain Name System (DNS).
• History of DNS:
  – The Internet did not always use DNS.
  – In the early days, there were only a few hundred hosts on the Internet.
  – So a central authority called the Network Information Center (NIC) maintained a flat table of name-to-address bindings.
  – This table was called hosts.txt.
  – Whenever a site wanted to add a new host to the Internet, the site administrator sent email to the NIC giving the new host's name/address pair.
  – This information was manually entered into the table.
  – The modified table was mailed out to the various sites every few days.
  – Each site installed a local copy.
  – Name resolution was then simply implemented by a procedure that looked up a host's name in the local copy of the table and returned the corresponding address.
  – The hosts.txt approach did not work well as the number of hosts in the Internet started to grow.
  – Therefore, in the mid-1980s, the Domain Name System was put into place.
• DNS:
  – DNS employs a hierarchical name space rather than a flat name space.
  – The "table" of bindings is partitioned into disjoint pieces and distributed throughout the Internet.
  – These subtables are made available in name servers that can be queried over the network.
• How DNS works:
  1. A user presents a host name to an application program (e.g. in an email address or a URL).
  2. The application program uses the naming system to translate this name into a host address.
  3. The application then opens a connection to this host by presenting some transport protocol (e.g. TCP) with the host's IP address.
[Figure: Names translated into addresses, where the numbers 1–5 show the sequence of steps in the process]
• Domain hierarchy:
  – DNS implements a hierarchical name space for Internet objects.
  – DNS names are processed from right to left and use periods as the separator.
    • Unlike Unix file names, which are processed from left to right.
  – The DNS hierarchy can be visualized as a tree, where:
    • each node in the tree corresponds to a domain, and
    • the leaves in the tree correspond to the hosts being named.
    • Like the Unix file hierarchy.
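
To contrast the two name spaces, here is a small sketch: a flat hosts.txt-style table resolved with a single lookup, and a helper that walks a DNS name from right to left, one domain per level of the tree. The function names are hypothetical; the single table entry reuses the cs.csbsju.edu binding shown earlier.

```python
def resolve_flat(hosts_table: dict, name: str):
    """hosts.txt style resolution: one lookup in one flat table."""
    return hosts_table.get(name.lower())


def domain_path(name: str) -> list:
    """List the domains visited when a name is processed right to left."""
    labels = name.rstrip(".").lower().split(".")
    path = []
    for start in range(len(labels) - 1, -1, -1):
        path.append(".".join(labels[start:]))
    return path


hosts_txt = {"cs.csbsju.edu": "152.65.160.46"}
levels = domain_path("cs.csbsju.edu")
```

levels runs from the top-level domain down to the full host name, which is the order in which the hierarchy is descended.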
[Figure: Example of a domain hierarchy. The top level consists of large generic domains plus one domain per country; lower levels include princeton.edu and cs.princeton.edu]
• In recent years, the number of top-level domains has been expanded, partly to deal with the high demand for .com domain names.
• The newer top-level domains include .biz, .coop, and .info.
• How is this hierarchy actually implemented?
  – First, partition the hierarchy into subtrees called zones.
    • Each zone corresponds to some administrative authority.
    • Each name server implements the zone information as a collection of resource records.
• How to create zones for the previous example:
  – The top level of the hierarchy forms a zone, managed by the Internet Corporation for Assigned Names and Numbers (ICANN).
  – One zone corresponds to Princeton University.
  – One zone corresponds to the CS department: http://www.cs.princeton.edu/
  – Some departments do not want the responsibility of managing part of the hierarchy, so they remain in the university-level zone. E.g. http://www.csbsju.edu/computer-science
• The information contained in each zone is implemented in two or more name servers (for redundancy).
• Each name server can be accessed over the Internet.
• Clients send queries to name servers.
• Name servers respond with the requested information.
• From an implementation perspective, think of DNS as being represented by a hierarchy of name servers rather than by a hierarchy of domains.
[Figure: Hierarchy of name servers]
• Example zone file for the domain example.com (from: wikipedia.org):
  – Zone files consist of comments, directives, and resource records.
  – Comments start with ;
  – Directives start with $ (e.g. $ORIGIN, $TTL, $INCLUDE, $GENERATE).
  – The $TTL directive should be present and appear before the first RR (resource record).
  – The first resource record MUST be the SOA (Start of Authority) record, giving the authoritative master name server and the email address of someone managing the name server.
• A resource record is a name-to-value binding:
  – A 5-tuple: <Name, Value, Type, Class, TTL>
  – Name:
    • Name of the resource
    • E.g. mail.example.com
  – Value:
    • Value associated with the name (in the name-value binding)
    • E.g. 192.0.2.3
  – Type:
    • Type = A: the Value is an IP address. Thus, A records implement the name-to-address mapping.
    • Type = NS: the Value field gives the domain name for a host that is running a name server that knows how to resolve names within the specified domain.
    • Type = CNAME: the Value field gives the canonical name for a particular host; it is used to define aliases.
    • Type = MX: the Value field gives the domain name for a host that is running a mail server that accepts messages for the specified domain.
  – Class:
    • Specifies the class of the resource record being requested.
    • The only widely used Class is IN (the one used by the Internet).
  – TTL:
    • Time To Live
    • Specifies the number of seconds that the record should be retained in the cache of the device reading the record.
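
The 5-tuple and the role of the TTL can be sketched as a small data structure with a cache that forgets a record once its TTL has elapsed. The class names, the default TTL of 3600 seconds, and the explicit now parameter (used instead of a real clock to keep the example deterministic) are assumptions for illustration; the name and address come from the example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceRecord:
    """One <Name, Value, Type, Class, TTL> binding."""
    name: str
    value: str
    type: str
    rr_class: str = "IN"   # the only widely used class
    ttl: int = 3600        # seconds; hypothetical default


class RecordCache:
    """Keeps a record only until its TTL runs out."""

    def __init__(self):
        self._entries = {}  # (name, type) -> (record, expiry time)

    def put(self, record: ResourceRecord, now: float) -> None:
        self._entries[(record.name, record.type)] = (record, now + record.ttl)

    def get(self, name: str, rtype: str, now: float):
        entry = self._entries.get((name, rtype))
        if entry is None:
            return None
        record, expires_at = entry
        if now >= expires_at:
            # TTL elapsed: drop the stale binding so it must be fetched again.
            del self._entries[(name, rtype)]
            return None
        return record


mail_a = ResourceRecord("mail.example.com", "192.0.2.3", "A", ttl=300)
cache = RecordCache()
cache.put(mail_a, now=1000.0)
```

A lookup at time 1200 would still return mail_a, while a lookup at time 1300 or later returns None because the 300-second TTL has expired.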
[Figure: annotated zone file showing the $TTL directive, the SOA RR, and a set of RRs, each with name, class, type, and value fields]
• Read more:
  – https://www.centos.org/docs/5/html/Deployment_Guide-en-US/s1-bind-zone.html
  – http://www.zytrax.com/books/dns/ch8/
  – http://en.wikipedia.org/wiki/Zone_file
• Broader view:
  – A root name server contains an NS record for each top-level domain (TLD) name server, i.e. the servers that can resolve queries for .edu and .com.
  [Figure: root name server holding NS = a3.nstld.com (.edu TLD server) and NS = a.gtld-servers.net (.com TLD server), each with its IP address binding]
  – The a3.nstld.com server has records for .edu domains, such as NS = dns.princeton.edu:
    • Servers like this can resolve SOME queries for princeton.edu, e.g. email.princeton.edu.
    • They redirect other queries to a lower-level server in the hierarchy, e.g. for penguins.cs.princeton.edu.
  – The third-level name server for the domain cs.princeton.edu contains A records for all of its hosts.
• Name resolution:
  – Given a hierarchy of name servers, let's see how a client engages these servers to resolve a domain name.
  – E.g. suppose the client wants to resolve the name penguins.cs.princeton.edu:
    1. The client sends a query containing this name to one of the root servers.
    2. The root server CANNOT match the entire name (it has no A record for it).
    3. So the root server sends the BEST MATCH it has (the NS record for .edu).
  [Figure: Name resolution in practice, where the numbers 1–10 show the sequence of steps in the process]

nslookup
• The Unix utility nslookup is a DNS-lookup tool.
  – nslookup

Overlay Network
• A mechanism for introducing new functionality into the Internet
• Becoming very popular
• Overlay:
  – A logical network implemented on top of some underlying network.
  – E.g. VPN
• Overlay node:
  – Each node in the overlay also exists in the underlying network.
  – A node processes and forwards packets in an application-specific way.
• Overlay links:
  – The links that connect the overlay nodes
  – Implemented as tunnels through the underlying network.
[Figure: Overlay network layered on top of a physical network; each overlay link is mapped to a sequence of physical links]
• P2P networks:
  – Overlay networks
  – An alternative to conventional client–server systems
  – A peer can act both as a client and as a server
  – No central coordination
• P2P applications:
  – File sharing (e.g. Napster, Gnutella, BitTorrent)
    • Peers share part or all of the files on their local machines
  – Collaboration (e.g. Groove, Collanos Workspace)
  – IP telephony (e.g. Skype)
  – Web search engines (e.g. YaCy, FAROO)
  – Digital libraries (e.g. DESCENT)
• Structured P2P:
  – Overlay structure and data placement are precisely determined
  – E.g. Chord, CAN, Pastry, etc.
  – Guaranteed lookups
• Unstructured P2P:
  – Network topology is arbitrary
  – No rules defining where data is stored
  – More popular
  – Highly resilient to network dynamics
  – E.g. Gnutella, KaZaA, etc.
• What's interesting about peer-to-peer networks?
  – Searching and downloading:
    • Searching: locating an object of interest
    • Downloading: downloading that object onto your local machine
    • Both happen with no centralized authority
    • Very scalable, to millions of nodes
  – Searching unstructured P2P networks is challenging:
    • Peers operate on incomplete knowledge
      – Each peer knows of the existence of only its directly connected neighbors
    • P2P networks are highly dynamic
    • Need to support complex queries (e.g. semantic queries)
• Unstructured peer-to-peer networks: Gnutella
  – Gnutella is an early peer-to-peer network
  – General file sharing
  – Original design: flat network
    • Join via a bootstrap node
    • Connect to a random set of existing hosts
  – Recent incarnations use a hierarchical structure
[Figure: Example topology of a Gnutella peer-to-peer network]
[Figures: Topology of the Gnutella network in 2001 (1771 peers); topology after a random 30% of the nodes are removed; topology after the highest-degree 4% of the nodes are removed. From: A Measurement Study of Peer-to-Peer File Sharing Systems, by Stefan Saroiu, P. Krishna Gummadi, Steven D. Gribble]
• Searching in Gnutella:
  – Assume a node wants to find a file (it knows the name of the file).
  – How can Gnutella search the network for the file?
  – Note that a node knows of the existence of its direct neighbors only.
• Answer: blind search
  – Nodes keep no soft state for neighboring nodes.
  1. Flooding
    – A node asks ALL its neighbors for files of interest (it generates and sends a query).
    – TTL-controlled flooding
    – Neighbors ask their neighbors (forwarding to ALL their other neighbors), and so on.
    – Each neighbor reduces the message TTL by 1 before forwarding.
    – The message is discarded when TTL = 0.
    – High success rate
    – High network traffic
    [Figure: query with TTL = 2 flooded from the query originator through nodes A–K until a hit]
    – Is there any way to perform a better search?
  2. Random walker
    – The query originator (and each forwarding node) randomly selects one neighbor to send the query to.
    – This reduces traffic.
    – It also reduces the success rate of finding objects.
    – Success really depends on the topology of the network.
    – K-walker random walk:
      • The query originator selects k random neighbors to forward the query to.
    [Figure: random walk query with K = 1, TTL = 2 from the query originator through nodes A–K until a hit]
• Informed search:
  – Nodes/peers keep some soft state about neighbors.
  – What kind of soft state would help peers to do a better search?
  – Answer:
    • Past history
    • Degree of neighbors
• Adaptive Probabilistic Search (APS):
  – Informed search
  – Dynamically builds knowledge based on past queries
  – Uses this knowledge to guide future queries
  [Figure: APS query for search object "01" with walkers = 2, TTL = 3 from the query originator through nodes A–K, showing one hit and one miss]

    Index   Initially   At walker termination   After index update
    DG      30          40                      40
    DC      30          40                      20
    DA      30          30                      30

Structured Overlays
• Overlay structure and data placement are precisely determined
• E.g. Chord, CAN, Pastry, etc.
• Guaranteed lookups
• Chord:
  – Organizes nodes in a ring
  – In an N-node network, each node maintains information about only O(log N) other nodes
  – Small amount of routing information/soft state per node
  – A lookup requires O(log N) messages
• Distributed indexes:
  – Indices are distributed among the nodes to support keyword search.
  – A set of <key, value> pairs
  – A key = a keyword/search term
  – A value = the list of nodes hosting documents with those keywords.
  – Values (the lists of nodes with documents) can be retrieved by looking up the key.
• Consistent hashing:
  – Assign each node and key an m-bit identifier using a base hash function such as SHA-1.
    • Node ID = hash(node's IP address)
    • Key ID = hash(key)
  – m must be large enough to make it unlikely that two keys/nodes hash to the same ID.
• How are keys assigned to nodes?
  – The key space is partitioned among the nodes.
  – For the index to be distributed, each node takes responsibility for storing the <key, value> pairs of a subset of the keys.
  – Note that both node IDs and key IDs come from the same hash range (0 to 2^m − 1).
  – So assigning a key to a node is easy, based on the hash.
  – Key k is assigned to the first node whose ID is equal to or follows the ID of k in the ID space (successor(k)).
  – All arithmetic is modulo 2^m.
• Example: an identifier circle consisting of the three nodes 0, 1, and 3 (m = 3).
  – Node ID/key ID space: 0, 1, 2, ..., 7
  – Key 1 (key ID = 1) is located at node 1 (node ID = 1).
  – Key 2 is located at node 3, the node that immediately follows key 2.
  – Key 6 is located at node 0.
• The hash function ensures an even distribution of nodes and keys on the circle.
• What happens when a new node joins or an existing node leaves the network?
  – Keys need to be redistributed/reassigned.
  – Node n joins:
    • Certain keys previously assigned to successor(n) are now assigned to n.
    • E.g. in the first example, if node 2 joins, key 2 will now be assigned to successor(2) = node 2.
  – Node n leaves:
    • All of its assigned keys are reassigned to n's successor.
    • E.g. if node 0 leaves, key 6 is now assigned to successor(6) = node 1.
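
The successor rule and the join/leave examples above can be checked with a short sketch. It works on a sorted list of node IDs held in one place, so it shows only the key assignment, not the distributed protocol. The helper names are hypothetical, and reducing the SHA-1 digest modulo 2^m is one simple assumed way to obtain an m-bit identifier.

```python
import hashlib
from bisect import bisect_left


def chord_id(text: str, m: int) -> int:
    """Map a string (e.g. an IP address or a keyword) to an m-bit identifier."""
    digest = hashlib.sha1(text.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** m)


def successor(node_ids, key_id: int) -> int:
    """First node whose ID is equal to or follows key_id, wrapping around the circle."""
    ring = sorted(node_ids)
    index = bisect_left(ring, key_id)
    return ring[index % len(ring)]


def assign_keys(node_ids, key_ids) -> dict:
    """Map every key ID to the node responsible for it."""
    return {key: successor(node_ids, key) for key in key_ids}


# The identifier circle from the example: m = 3, nodes 0, 1, 3, keys 1, 2, 6.
before = assign_keys([0, 1, 3], [1, 2, 6])
after_join = assign_keys([0, 1, 2, 3], [1, 2, 6])   # node 2 joins
after_leave = assign_keys([1, 3], [1, 2, 6])        # node 0 leaves
```

Comparing before with after_join and after_leave shows that only the keys next to the changed node move (key 2 and key 6 respectively), which is what makes consistent hashing attractive when peers come and go.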
[Figure: Node N21 joins]
• Node N26 joins:
  – N26 joins the system.
  – N26 acquires N32 as its successor.
  – N32 acquires N26 as its predecessor.
  – N21 acquires N26 as its successor.
  – N26 acquires N21 as its predecessor.
  – Redistribute keys and update successor pointers.
• Searching/key location:
  – Each node need only be aware of its successor node (successor pointer) on the circle to route a query.
  – Query routing:
    • A query for a given key ID is passed around the circle via successor pointers until it lands at the node with node ID = successor(search key ID).
    • The query stops there, so it does not have to traverse all N nodes in the circle.
    • To accelerate routing, each node maintains a routing table (finger table).
• The finger table has at most m entries.
  – E.g. for m = 3, 8 key IDs/node IDs are possible, and a node's finger table has 3 entries.
• The i-th entry of the finger table of node n (n.finger[i].node):
  – The first node s = successor(n + 2^(i−1)), where 1 ≤ i ≤ m (arithmetic modulo 2^m)
  – The first finger of n (i = 1):
    • is the immediate successor of n on the circle
    • is simply called the successor
• Question:
  – Devise the finger table for node N8 in the following Chord overlay.

    n + 2^(i−1)     successor(n + 2^(i−1))
    N8 + 1          N14

[Figures: the finger table built up step by step, using finger[i] = successor(n + 2^(i−1))]

Simple search:
    // ask node n to find the successor of id
    n.find_successor(id) {
      if (id is in the interval (n, successor])
        return successor;
      else
        // forward the query around the circle
        return successor.find_successor(id);
    }
• Number of messages is linear in the number of nodes!
• Scalable search:
  – We can do a better job of searching (accelerated lookups).
  – Query search key = k
  – Query originator = node n
  – If the node receiving the query has an entry for successor(k) in its finger table:
    • Forward the query to that node.
  – Else:
    • Find a node whose ID is closer to k than its own (predecessor(k)).
    • Predecessor(k) has more pointers toward the target node holding the documents for k.
    • Predecessor(k) = the node entry in n's finger table whose ID most immediately precedes k.
    • n forwards the query to predecessor(k).
  – In short: search the finger table for the node that most immediately precedes id, then invoke find_successor from that node.
  [Figure: lookup routed via predecessor(54)]
  – Number of messages: O(log N)!
• Application: Chord-based DNS
  – DNS provides a lookup service:
    • keys: host names
    • values: IP addresses
  – Chord could hash each host name to a key.
  – No special root servers
  – No manual management of routing information

BitTorrent
• Designed for fast, efficient content distribution
  – Ideal for downloading large files, e.g. movies, DVDs, ISOs, etc.
  – Uses P2P file swarming
• Not a full-fledged P2P system
  – Does NOT support searching for files
  – Trackers act as centralized swarm coordinators
  – Fully P2P, trackerless torrents are now possible
• Extremely popular
  – Reported to account for 35–70% of all Internet traffic
  – A lot cheaper, faster, and more efficient to distribute files using BitTorrent than a regular download.

File sharing
• To share a file or group of files, the initiator first creates a .torrent file, a small file that contains:
  – Metadata about the files to be shared, and
  – Information about the tracker, the computer that coordinates the file distribution.
• Downloaders first obtain a .torrent file (there are sites for downloading torrents), and then connect to the specified tracker.
• The tracker tells them from which other peers to download the pieces of the file.
• Some popular torrent sites:
  – http://thepiratebay.se/
  – https://www.torrentz.com/
• Some terminology:
  – Leech:
    • A peer that is downloading the file (downloader)
    • Does not have 100% of the data
  – Seed:
    • A peer with the entire file
    • When a leech finishes downloading the whole file and keeps uploading, it becomes a seed.
    • Initial seeder = the peer that provides the initial copy.
  – Swarm:
    • The set of peers all downloading the same file
    • Each node knows the list of pieces downloaded by its neighbors.
    • A node requests the pieces it does not own from its neighbors.
  – Tracker:
    • A server that keeps track of which seeds and peers are in the swarm.
    • Is not directly involved in the data transfer.
    • Does not have a copy of the file.
• A peer first downloads a .torrent file for the file it wants to download.
• Contents of a .torrent file:
  – URL of the tracker
  – Piece length (usually 256 KB)
  – SHA-1 hashes of each piece in the file
• Swarm lifecycle:
  – Each file is shared via a swarm.
  – The swarm starts with an initial seeder, a singleton peer with a complete copy of the file.
  – A node that wants to download the file joins the swarm, becoming its second member, and begins downloading pieces of the file from the original peer.
  – In doing so, it becomes another source for the pieces it has downloaded, even if it has not yet downloaded the entire file.
[Figure: Sharing pieces. An initial seeder holds pieces 1–8, while seeders and leechers hold different subsets of them. From: http://www.ccs.neu.edu/home/cbw/4700/]
• Pieces are downloaded in random order to avoid a situation where peers find themselves lacking the same set of pieces.
• The beauty of BitTorrent:
  – Multiple, redundant sources for each piece
    • More leechers = more replicas of pieces
    • More replicas = faster downloads
  – Great for content distribution
  – Cost is shared among the swarm
[Figure: Download in progress]
[Figure: Operation]

Summary
• We have discussed some of the popular applications in the Internet
  – Electronic mail, World Wide Web
• We have discussed infrastructure services
  – Domain Name System (DNS)
• We have discussed overlay networks
  – Routing overlay, end-system multicast, peer-to-peer networks