Computer Networks: A Systems Approach, 5e
Larry L. Peterson and Bruce S. Davie
Chapter 9
Applications
Copyright © 2010, Elsevier Inc. All rights Reserved
Problem
• Applications need their own protocols.
• This chapter explores some of the most popular network applications available today.
Chapter Outline
• Traditional Applications
– HTTP
– SMTP
• Infrastructure Services
– DNS
– SNMP
• Overlay Networks
– P2P
– CDN
• Multimedia Applications
– SDP
Traditional Applications
• Two of the most popular:
– The World Wide Web
– Email.
• Both applications use the request/reply paradigm:
– users send requests to servers
– Servers respond accordingly.
• Why call them “traditional” applications?
– They are applications that have existed since the early days of computer networks
– (The Web is a lot newer than email but has its roots in file transfers that predated it.)
Traditional Applications
• Application programs vs. application protocols:
• E.g.
– Application Protocol: HTTP
• used to retrieve Web pages from remote servers.
– Application Programs: Web Clients
• E.g. Internet Explorer, Chrome, Firefox, and Safari
• Each provides users with a different look and feel
• All web browsers use the same HTTP protocol to
communicate with Web servers over the Internet.
Traditional Applications
• Two very widely used, standardized application protocols:
– SMTP: Simple Mail Transfer Protocol is used to exchange
electronic mail.
– HTTP: HyperText Transfer Protocol is used to communicate between Web browsers and Web servers.
Electronic Mail (SMTP, MIME, IMAP)
– Email is one of the oldest network applications
– User interface/application:
• mail reader: Microsoft Outlook, Gmail
– Underlying message transfer protocols:
• SMTP
• IMAP
– Companion protocol :
• RFC 822 and MIME
• Defines the format of the messages being exchanged
Electronic Mail (SMTP, MIME, IMAP)
– Message Format
• RFC 822
• Messages have two parts: a header and a body.
• Both parts are represented in ASCII text.
• Originally, the body was assumed to be simple text.
• This is still the case. However, RFC 822 has been augmented by MIME to allow the message body to carry all sorts of data.
• This data is still represented as ASCII text
• However, because it may be an encoded version of, say, a
JPEG image, it’s not necessarily readable by human users.
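The header/body split and the ASCII encoding of binary data can be sketched with Python's standard `email` package. This is only an illustration: the addresses, subject, file name, and the six "JPEG" bytes are made-up placeholders, not values from the slides.

```python
from email.message import EmailMessage

# Hypothetical stand-in for real image data (not a valid JPEG file).
FAKE_JPEG = bytes([0xFF, 0xD8, 0xFF, 0xE0, 0x00, 0x10])


def build_message() -> EmailMessage:
    """Build a message with a text body and one binary attachment."""
    msg = EmailMessage()
    # Header part: simple "Name: value" lines.
    msg["From"] = "alice@example.com"
    msg["To"] = "bob@example.com"
    msg["Subject"] = "Holiday photo"
    # Body part: plain text first, then binary data added as a MIME part.
    msg.set_content("The picture is attached.")
    msg.add_attachment(FAKE_JPEG, maintype="image", subtype="jpeg",
                       filename="photo.jpg")
    return msg


msg = build_message()
wire_text = msg.as_string()
# The header block ends at the first blank line; the body follows it.
header_text, _, body_text = wire_text.partition("\n\n")
attachment = next(msg.iter_attachments())
print(header_text)
print(attachment["Content-Transfer-Encoding"])
```

The attachment travels as base64 text, which is why the message is still ASCII but no longer readable by a human.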
Electronic Mail (SMTP, MIME, IMAP)
• MIME (Multipurpose Internet Mail Extensions):
• An extension of the original Internet e-mail protocol
• Developed in 1992 by IETF.
• MIME is a specification for enhancing the capabilities of standard
Internet electronic mail
Electronic Mail (SMTP, MIME, IMAP)
• When using the MIME standard, messages can contain the following types:
– Text messages in ASCII.
– Character sets other than ASCII.
– Multimedia: image, audio, and video messages.
– Multiple objects in a single message.
– Multi-font messages.
– Messages of unlimited length.
– Binary files.
Electronic Mail (SMTP, MIME, IMAP)
• It builds on the older standard by:
– defining additional fields for the mail message header that describe new content types, and
– a distinct organization of the message body.
• It explicitly describes the set of allowable Content-types.
1. Text - used to represent textual information.
2. Image - for transmitting still images.
3. Audio - for transmitting audio or voice data.
4. Video - for transmitting video or moving image data.
Electronic Mail (SMTP, MIME, IMAP)
– Servers insert the MIME header at the beginning of ANY Web transmission.
– Clients use this header to select an appropriate "player" application for the type of data the header indicates.
– Some of these players are built into the Web client or browser
• E.g. all browsers come with GIF and JPEG image players as well as the ability to handle HTML files; other players may need to be downloaded.
Electronic Mail (SMTP, MIME, IMAP)
• SMTP, POP3 and IMAP are TCP/IP protocols used for mail delivery.
• SMTP (Simple Mail Transfer Protocol):
– Used when :
• email is delivered from an email client to an email server
OR
• when email is delivered from one email server to another.
– Uses port 25.
Electronic Mail (SMTP, MIME, IMAP)
• SMTP delivers mail to your mail server from other people.
• Your mail server stores the received email in a mailbox
until your mail client asks for it.
• This is where IMAP and POP enter the picture.
• IMAP and POP are the two most prevalent protocols for retrieving email from a mail server.
• Both of these protocols are supported by almost all
popular mail client programs:
– Outlook, Thunderbird, Apple Mail.
Electronic Mail (SMTP, MIME, IMAP)
• POP3 (Post Office Protocol):
– Allows an email client to download an email from an
email server.
– Simple : does not offer many features except for
download.
– Assumes that the email client downloads all available
email from the server, deletes email from the server
and then disconnects.
– Normally uses port 110.
Electronic Mail (SMTP, MIME, IMAP)
• IMAP (Internet Message Access Protocol):
– shares many similar features with POP3
– It, too, is a protocol that an email client can use to
download email from an email server.
– However, IMAP includes many more features than
POP3.
– designed to let users keep their email on the server.
• requires more disk space on the server and more CPU
resources than POP3, as all emails are stored on the
server.
– normally uses port 143
Electronic Mail (SMTP, MIME, IMAP)
• Example:
– Suppose you use csbsju email server as your email server to
send an email to [email protected].
– You click Send in your email client, say, Outlook Express.
– Outlook Express delivers the email to csbsju email server using
the SMTP protocol.
– csbsju email server delivers the email to Microsoft's mail
server, mail.microsoft.com, using SMTP.
– Bill's Mozilla Mail client downloads the email from
mail.microsoft.com to his laptop using the POP3 protocol (or
IMAP).
• https://support.google.com/mail/troubleshooter/1668960?hl=en
World Wide Web (www)
• The World Wide Web has made the Internet accessible to so
many people that sometimes it seems to be synonymous with
the Internet.
• Design of the system that became the Web started around 1989, long after the Internet had become a widely deployed system.
• The original goal of the Web was to find a way to organize and
retrieve information, drawing on ideas about hypertext—
interlinked documents—that had been around since at least the
1960s.
World Wide Web
• The core idea of hypertext is that one document can link to another document, and the protocol (HTTP) and document language (HTML) were designed to meet that goal.
• Any Web browser has a function that allows the user to
obtain/retrieve an object by “opening a URL.”
• URLs provide information that allows objects on the Web to be located, and they look like the following:
– http://www.cs.princeton.edu/index.html
World Wide Web
• If you opened that particular URL:
– your Web browser would open a TCP connection to the Web server at a
machine called www.cs.princeton.edu
– Then immediately retrieve and display the file called index.html.
• Most files on the Web contain:
– images
– text
– audio
– video clips
– pieces of code
– hyperlinks, etc.
World Wide Web
• At its core, HTTP is a request/response protocol, where every message has the general form:
START_LINE <CRLF>
MESSAGE_HEADER <CRLF>
<CRLF>
MESSAGE_BODY <CRLF>
• <CRLF> stands for carriage-return-line-feed.
• The first line (START_LINE) indicates whether this is a request message (request line) or a response message (status line).
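The general message form can be sketched as a pair of helpers that build and split an HTTP message. The function names and the `Connection: close` header are illustrative assumptions; the code only manipulates strings and sends nothing over the network.

```python
CRLF = "\r\n"  # carriage return followed by line feed


def build_request(method: str, path: str, host: str) -> str:
    """Build a body-less HTTP/1.1 request: start line, headers, blank line."""
    start_line = f"{method} {path} HTTP/1.1"
    headers = [f"Host: {host}", "Connection: close"]
    return CRLF.join([start_line, *headers]) + CRLF + CRLF


def parse_message(raw: str):
    """Split a message into (start_line, headers dict, body)."""
    head, _, body = raw.partition(CRLF + CRLF)
    start_line, *header_lines = head.split(CRLF)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return start_line, headers, body


def is_response(start_line: str) -> bool:
    """A status line starts with the HTTP version; a request line does not."""
    return start_line.startswith("HTTP/")


request = build_request("GET", "/index.html", "www.cs.princeton.edu")
start_line, headers, body = parse_message(request)
print(repr(request))
```

A response can be split with the same `parse_message` helper, since requests and responses share the layout and differ only in the start line.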
HTTP request operations
• E.g. start a Wireshark capture (filter = http) and open the URL www.w3.org
• HTTP request:
Header
(no body; the body is optional)
• HTTP Response:
Five types of HTTP result codes
• HTTP response:
Header
(no body; the body is optional)
Header
body
World Wide Web
• TCP Connections:
– The original version of HTTP (1.0) established a separate TCP connection for each data item retrieved from the server.
– Inefficient: connection setup and teardown per data item
– Thus, retrieving a page that included some text and 12 icons or other small graphics would result in 13 separate TCP connections being established and closed.
World Wide Web
– To overcome this situation, HTTP version 1.1 introduced
persistent connections:
• the client and server can exchange multiple request/response messages over
the same TCP connection.
– Persistent connections have many advantages.
• Eliminate the connection setup overhead
– Reduce the load on the server
– Reduce the load on the network caused by the additional TCP packets,
– Reduce the delay perceived by the user.
• TCP’s congestion window mechanism operates more efficiently.
– Client can send multiple request messages down a single TCP connection.
– It’s not necessary to go through the slow start phase for each page.
World Wide Web
[Figure: HTTP 1.0 behavior (TCP connection 1, TCP connection 2)]
[Figure: HTTP 1.1 behavior with persistent connections (TCP connection 1)]
• Web Services:
• Read Textbook
• http://www.w3schools.com/webservices/
• http://www.tutorialspoint.com/webservices/
DNS
• DNS: Domain Name System
– URL → IP translation
– E.g. cs.csbsju.edu → 152.65.160.46
• Question:
– Why use URLs instead of IP addresses?
DNS
• Answer:
– IP addresses are perfectly suited for use by routers
– However, IP addresses are not exactly user-friendly.
– Therefore we ALSO assign a unique name to each host in a network.
DNS
– Host names differ from host addresses in several ways.
Host Name | Host Address
Variable length | Fixed length (32 bits, IPv4)
Mnemonic; easy for humans to remember | Numeric; not easy for humans to remember
Has no routing information embedded | Sometimes has routing information
DNS
• Some basic terminology:
1. Name space:
– set of possible names
– Two types:
• flat (names are not divisible into components)
• hierarchical (e.g. Unix file names).
2. Binding:
– Binding: name → value
– naming system maintains a collection of bindings
DNS
3. Resolution mechanism:
– procedure that, when invoked with a name, returns the
corresponding value.
4. Name Server:
– A specific implementation of a resolution mechanism that is
available on a network
– Can be queried by sending it a message.
• Internet has a well-developed naming system in place—
the Domain Name System (DNS).
DNS
• History of DNS:
– The Internet did not always use DNS
– In the early days, there were only a few hundred hosts on the Internet
– So a central authority called the Network Information Center (NIC) maintained a flat table of name-to-address bindings
– This table was called hosts.txt.
DNS
– Whenever a site wanted to add a new host to the Internet, the
site administrator sent email to the NIC giving the new host’s
name/address pair.
– This information was manually entered into the table
– The modified table was mailed out to the various sites every
few days
– Each site installed a local copy.
– Name resolution was then simply implemented by a procedure
that looked up a host’s name in the local copy of the table and
returned the corresponding address.
DNS
– The hosts.txt approach did not work well as the number of hosts in the Internet started to grow.
– Therefore, in the mid-1980s, the Domain Name System was put into place.
• DNS:
– DNS employs a hierarchical namespace rather than a flat name
space
– “table” of bindings is partitioned into disjoint pieces and
distributed throughout the Internet.
– These subtables are made available in name servers that can
be queried over the network.
DNS
• How DNS works:
1. A user presents a host name to an application
program (e.g. email, url)
2. Application program uses a naming system to
translate this name to a host address
3. Application then opens a connection to this host by
presenting some transport protocol (e.g., TCP) with
the host’s IP address.
DNS
Names translated into addresses, where the numbers 1–5 show the sequence
of steps in the process
DNS
• Domain Hierarchy
– DNS implements a hierarchical name space for Internet objects.
– DNS names are processed from right to left and use periods as
the separator.
• Unlike Unix file names
– The DNS hierarchy can be visualized as a tree, where:
• Each node in the tree corresponds to a domain, and
• the leaves in the tree correspond to the hosts being named.
• Like the Unix file hierarchy.
DNS
[Figure: Example of a domain hierarchy. Top level = large generic domains plus one domain per country; example nodes: princeton.edu, cs.princeton.edu]
• In recent years, the number of top-level domains has been expanded, partly to deal with the high demand for .com domain names.
• The newer top-level domains include .biz, .coop, and .info.
DNS
• How is this hierarchy actually implemented?
– First, partition the hierarchy into subtrees called
zones.
• Each zone corresponds to some administrative authority
• Each name server implements the zone information as a
collection of resource records.
DNS
• How to create zones for the previous example:
Top level hierarchy forms a zone
Managed by the Internet Corporation for Assigned Names and Numbers (ICANN).
Zone corresponds to
Princeton university
Zone corresponds to cs department
http://www.cs.princeton.edu/
Some departments do not want responsibility of managing hierarchy.
So they remain in university level zone.
E.g. http://www.csbsju.edu/computer-science
DNS
• Information contained in each zone is implemented in two or more name servers (for redundancy).
• Each name server can be accessed over the Internet.
• Clients send queries to name servers
• Name servers respond with the requested information.
DNS
• From an implementation perspective, think of DNS as being represented by a hierarchy of name servers rather than by a hierarchy of domains
[Figure: Hierarchy of name servers.]
DNS
• Each name server implements the zone information as a collection of resource records.
DNS
• Example zone file for the domain example.com:
• Zone files consist of comments, directives, and resource records
• Comments start with ;
• Directives start with $ (e.g. $ORIGIN, $TTL, $INCLUDE, $GENERATE)
• The $TTL directive should be present and appear before the first RR (Resource Record)
• The first Resource Record MUST be the SOA (Start of Authority), with the authoritative master name server and the email address of someone managing the name server.
From: wikipedia.org
DNS
• A resource record is a name-to-value binding:
– A 5-tuple : <Name, Value, Type, Class, TTL >
• Name:
– Name of resource
– E.g. mail.example.com
• Value:
– Value associated with name (in the name-to-value binding)
– E.g. 192.0.2.3
DNS
– Type:
– Type = A: indicates that the Value is an IP address.
» Thus, A records implement the name-to-address mapping
– Type =NS: The Value field gives the domain name for a host
that is running a name server that knows how to resolve
names within the specified domain.
– Type= CNAME: The Value field gives the canonical name for a
particular host; it is used to define aliases.
– Type = MX: The Value field gives the domain name for a host
that is running a mail server that accepts messages for the
specified domain.
– Class:
– Specifies the class of the resource record being requested
– Only widely used Class is IN (one used by the Internet)
DNS
• TTL:
– Time To Live
– Specifies the number of seconds that the record
should be retained in the cache of the device reading
the record.
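The 5-tuple and the role of the TTL can be sketched as a small record type plus a cache that forgets a record once its TTL has elapsed. The class names, the default TTL of 3600 seconds, and the explicit `now` argument (used instead of a real clock so the behavior is easy to follow) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class ResourceRecord:
    """One <Name, Value, Type, Class, TTL> binding."""
    name: str
    value: str
    type: str            # e.g. "A", "NS", "CNAME", "MX"
    rr_class: str = "IN"  # the Internet class
    ttl: int = 3600       # seconds the record may stay cached (assumed default)


class RecordCache:
    """Keeps records only until their TTL runs out."""

    def __init__(self) -> None:
        self._entries: Dict[Tuple[str, str], Tuple[ResourceRecord, float]] = {}

    def put(self, record: ResourceRecord, now: float) -> None:
        # Remember the record together with its expiry time.
        self._entries[(record.name, record.type)] = (record, now + record.ttl)

    def get(self, name: str, rtype: str, now: float) -> Optional[ResourceRecord]:
        entry = self._entries.get((name, rtype))
        if entry is None:
            return None
        record, expires_at = entry
        if now >= expires_at:
            # TTL elapsed: drop the stale binding.
            del self._entries[(name, rtype)]
            return None
        return record


cache = RecordCache()
rr = ResourceRecord("mail.example.com", "192.0.2.3", "A", ttl=60)
cache.put(rr, now=0.0)
print(cache.get("mail.example.com", "A", now=30.0))
print(cache.get("mail.example.com", "A", now=61.0))
```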
DNS
[Figure: annotated zone file showing the TTL, the SOA RR, and a set of RRs, each with name, class, type, and value]
DNS
• Read more:
– https://www.centos.org/docs/5/html/Deployment_Guide-en-US/s1-bind-zone.html
– http://www.zytrax.com/books/dns/ch8/
– http://en.wikipedia.org/wiki/Zone_file
DNS
• Broader view:
• A root name server contains an NS record for each top-level domain (TLD) name server
– servers that can resolve queries for .edu and .com
[Figure: root name server holding NS = a3.nstld.com (.edu TLD server) and NS = a.gtld-servers.net (.com TLD server), each with an IP address binding]
DNS
[Figure: NS = a3.nstld.com, NS = dns.princeton.edu]
• The a3.nstld.com server has records for .edu domains like this:
– Servers that can resolve SOME queries for princeton.edu
• e.g. email.princeton.edu
– The server redirects other queries to a lower-level server in the hierarchy
• e.g. penguins.cs.princeton.edu
DNS
[Figure: NS = a3.nstld.com, NS = dns.princeton.edu]
• The third-level name server for the domain cs.princeton.edu contains A records for all its hosts.
DNS
• Name Resolution:
– Given a hierarchy of name servers, let’s see how a
client engages these servers to resolve a domain
name.
– E.g. suppose the client wants to resolve the name
penguins.cs.princeton.edu
DNS
• Name Resolution
1. The client sends a query containing this name to one of the root servers.
2. The root server CANNOT match the entire name (it has no A records for it).
3. So the root server sends the BEST MATCH it has (the .edu NS record).
[Figure: Name resolution in practice, where the numbers 1–10 show the sequence of steps in the process.]
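The referral chain can be sketched with an in-memory toy hierarchy. Everything here is a simplification for illustration: the `ZONES` table, the root label, the server name `dns1.cs.princeton.edu`, and the 192.0.2.x addresses are hypothetical, and real NS referrals also carry the address of the next server.

```python
# Toy data: each name server holds (name, type, value) resource records.
ZONES = {
    "root": [("edu", "NS", "a3.nstld.com")],
    "a3.nstld.com": [("princeton.edu", "NS", "dns.princeton.edu")],
    "dns.princeton.edu": [
        ("email.princeton.edu", "A", "192.0.2.10"),            # hypothetical address
        ("cs.princeton.edu", "NS", "dns1.cs.princeton.edu"),   # hypothetical server
    ],
    "dns1.cs.princeton.edu": [
        ("penguins.cs.princeton.edu", "A", "192.0.2.20"),      # hypothetical address
    ],
}


def best_match(server, name):
    """Return the record on `server` that matches the longest suffix of `name`."""
    best = None
    for rec_name, rec_type, value in ZONES[server]:
        matches = name == rec_name or name.endswith("." + rec_name)
        if not matches:
            continue
        if rec_type == "A" and name != rec_name:
            continue  # an A record only answers for its exact name
        if best is None or len(rec_name) > len(best[0]):
            best = (rec_name, rec_type, value)
    return best


def resolve(name, server="root"):
    """Follow NS referrals until an A record is found; return (address, servers asked)."""
    asked = [server]
    while True:
        match = best_match(server, name)
        if match is None:
            return None, asked
        _, rec_type, value = match
        if rec_type == "A":
            return value, asked
        server = value  # referral: ask the name server named by the NS record
        asked.append(server)


address, asked = resolve("penguins.cs.princeton.edu")
print(address, asked)
```

Each server on the way returns only the best match it has, so the query walks down the hierarchy one zone at a time.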
nslookup
• The Unix utility nslookup is a DNS-lookup tool
– nslookup
Overlay Network
• Mechanism for introducing new functionality into the Internet
• Becoming very popular
• Overlay :
– a logical network implemented on top of some underlying network.
– E.g. VPN
• Overlay node:
– Each node in the overlay also exists in the underlying network
– A node processes and forwards packets in an application-specific way.
• Overlay links:
– The links that connect the overlay nodes
– Implemented as tunnels through the underlying network.
Overlay Network
[Figure: Overlay network layered on top of a physical network. Labels: overlay node, overlay link, overlay network, physical network; each overlay link is mapped to a sequence of physical links.]
Overlay Networks
• P2P Networks:
– Overlay networks
– An alternative to conventional client–server systems
– A peer can act both as a client and a server
– No central coordination
Overlay Network
• P2P Applications:
– File Sharing (E.g. Napster, Gnutella, BitTorrent)
• Peers share some or all of the files on their local machines
– Collaboration (E.g. Groove, Collanos workspace)
– IP Telephony (E.g. Skype)
– Web Search Engines (E.g. Yacy, Faroo)
– Digital libraries (E.g. DESCENT)
Introduction[contd.]
• Structured P2P
– Overlay structure and data placement
are precisely determined
– E.g. Chord, CAN, Pastry etc.
– Guaranteed lookups
• Unstructured P2P
– Network topology is arbitrary
– No rules defining where data is stored
– More popular
– Highly resilient to network dynamics
– E.g. Gnutella, KaZaA etc.
Overlay Network
– What’s interesting about peer-to-peer networks?
• Searching and downloading:
– Searching: locating an object of interest
– Downloading: downloading that object onto your local machine
– Both happen with no centralized authority
– Very scalable, to millions of nodes.
• Searching unstructured P2P networks is challenging:
– Peers operate on incomplete knowledge
» Each peer knows of the existence of only its directly connected neighbors
– P2P networks are highly dynamic
– Need to support complex queries (e.g. semantic queries)
Overlay Network
• Unstructured Peer-to-peer Networks:
– Gnutella
• Gnutella is an early peer-to-peer network
• General file sharing
• Original design: flat network
– Join via bootstrap node
– Connect to random set of existing hosts
• Recent incarnations use hierarchical structure
Overlay Network
• Unstructured Overlays
[Figure: Example topology of a Gnutella peer-to-peer network]
[Figure, three panels:]
– Topology of the Gnutella network in 2001 (1771 peers)
– Topology of the Gnutella network after a random 30% of the nodes are removed
– Topology of the Gnutella network after the highest-degree 4% of the nodes are removed.
From: A Measurement Study of Peer-to-Peer File Sharing Systems, by Stefan Saroiu, P. Krishna Gummadi, Steven D. Gribble
• Searching in Gnutella:
– Assume a node wants to find a file (it knows the name of the file)
– How can Gnutella search the network for the file?
– Note that a node knows of the existence of its direct neighbors only.
• Answer:
– Blind Search:
• Nodes keep no soft state for neighboring nodes
1. Flooding
– Node asks ALL neighbors for files of interest (generates and sends a query)
– TTL-controlled flooding
– Neighbors ask their neighbors (forward to ALL other neighbors), and so on
– Each neighbor reduces the message TTL by 1 before forwarding
– Message is discarded when TTL=0
– High success rate
– High network traffic
[Figure: flooding example on nodes A–K, showing a query (TTL=2) from the query originator and the resulting hit]
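TTL-controlled flooding can be sketched as a breadth-first spread over the overlay graph. The topology and the set of file holders below are invented for illustration (the slide's figure is not reproduced), and counting one message per forwarded query is a simplifying assumption.

```python
from collections import deque


def flood_search(graph, holders, origin, ttl):
    """Flood a query from `origin`; return (nodes that hit, messages sent)."""
    hits = set()
    seen = {origin}          # nodes that already handled this query
    messages = 0
    frontier = deque([(origin, None, ttl)])  # (node, who sent it, remaining TTL)
    while frontier:
        node, sender, remaining = frontier.popleft()
        if remaining == 0:
            continue         # TTL exhausted: the query is discarded here
        for neighbor in graph[node]:
            if neighbor == sender:
                continue     # forward to all OTHER neighbors
            messages += 1
            if neighbor in seen:
                continue     # duplicate query, dropped by the receiver
            seen.add(neighbor)
            if neighbor in holders:
                hits.add(neighbor)
            # The receiver decrements the TTL before forwarding further.
            frontier.append((neighbor, node, remaining - 1))
    return hits, messages


# Hypothetical overlay: adjacency lists of directly connected peers.
GRAPH = {
    "A": ["B", "C", "D"],
    "B": ["A", "E"],
    "C": ["A", "E"],
    "D": ["A", "G"],
    "E": ["B", "C", "H"],
    "G": ["D"],
    "H": ["E"],
}
HOLDERS = {"E", "H"}  # peers assumed to store the requested file

print(flood_search(GRAPH, HOLDERS, "A", ttl=2))
print(flood_search(GRAPH, HOLDERS, "A", ttl=3))
```

Raising the TTL finds the second holder but costs extra messages, which is the trade-off between success rate and network traffic noted above.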
• Is there any way to perform better search?
2. Random Walker
– The query originator (and each forwarding node) randomly selects one neighbor to send the query to
– This reduces traffic
– Reduces the success rate of finding objects
– Success really depends on the topology of the network
– K-walker random walk:
» The query originator selects k random neighbors to forward the query to
[Figure: random walk example on nodes A–K, showing a query (K=1, TTL=2) from the query originator and the resulting hit]
– Informed Search:
• Nodes/peers keep some soft state about neighbors
• What kind of soft state would help peers to do a better
search?
– Answer:
• Past history
• Degree of neighbors
– Adaptive Probabilistic Search (APS)
• Informed search
• Dynamically builds knowledge based on past queries
• Uses this knowledge to guide future queries
[Figure: APS example on nodes A–K, showing a query (walkers=2, TTL=3) from the query originator with hits and misses, and the index values for DG, DC, and DA initially, at walker termination, and after the index update]
[Figure: APS example on nodes A–K. Query (walkers=2, TTL=3) for search object “01”, showing the query originator, hits, and misses]
Indices | Initially | At walker termination | After index update
DG | 30 | 40 | 40
DC | 30 | 40 | 20
DA | 30 | 30 | 30
Overlay Network
– Structured Overlays
– Overlay structure and data placement are precisely
determined
– E.g. Chord, CAN, Pastry etc.
– Guaranteed lookups
– Chord
– Organize nodes in a ring
– In an N-node network, each node maintains information only
about O(log N) other nodes
– Small amount of routing information/soft state per node
– A lookup requires O(log N) messages.
– Distributed Indexes:
– Indices are distributed across nodes to support keyword search.
– Set of <key,value> pairs
– A key = a keyword/search term
– A value = list of nodes hosting documents with those keywords.
– Values (lists of nodes with documents) can be retrieved by looking up the key
– Consistent Hashing:
– Assign each node and key an m-bit identifier using a base hash function
such as SHA-1
– Node ID = hash (node’s IP address)
– Key ID = hash (key)
– m must be large enough to avoid two keys/nodes hashing to same ID
– How are keys assigned to nodes?
– Key space is partitioned among nodes
– For the index to be distributed, each node takes responsibility for storing the <key,value> pairs of a subset of keys
– Note that both nodeID and keyID are drawn from the same hash range (0 to 2^m − 1).
– So assigning a key to a node is easy based on the hash
– Key k is assigned to the first node whose ID is equal to or follows k's ID in the ID space (successor(k))
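The assignment rule can be sketched in a few lines. Reducing the SHA-1 digest modulo 2^m to get an m-bit identifier is an assumption made to keep the example small, and the function names are illustrative.

```python
import hashlib
from bisect import bisect_left


def chord_id(text: str, m: int) -> int:
    """Map a node address or key to an m-bit identifier using SHA-1."""
    digest = hashlib.sha1(text.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** m)


def successor(node_ids, key_id: int) -> int:
    """First node whose ID is equal to or follows key_id on the circle."""
    ring = sorted(node_ids)
    index = bisect_left(ring, key_id)
    # Past the largest node ID, wrap around to the smallest one.
    return ring[index % len(ring)]


nodes = [0, 1, 3]  # an identifier circle with m = 3
for key in (1, 2, 6):
    print(f"key {key} -> node {successor(nodes, key)}")
```

With nodes 0, 1, and 3, key 6 wraps past the top of the identifier space and lands on node 0.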
• An identifier circle consisting of the three nodes 0, 1, and 3 (m=3).
– NodeID/KeyID space: 0, 1, 2, …, 7
– Key 1 (keyID=1) is located at node 1 (NodeID=1)
– Key 2 is located at node 3 (the node that immediately follows key 2)
– Key 6 is located at node 0.
– All arithmetic is modulo 2^m
– Key k is assigned to the first node whose ID is equal to or follows k in the ID space (successor(k))
Hash function ensures even distribution of nodes and keys on the circle
• What happens when a new node joins or existing node
leaves the network?
– Keys need to be redistributed/reassigned
– Node n joins:
• Certain keys previously assigned to successor(n) are now assigned to n
• E.g. in the first example, if node 2 joins, key 2 is now assigned to successor(2) = node 2.
– Node n leaves:
• All of its assigned keys are reassigned to n’s successor
• E.g. if node 0 leaves, key 6 is now assigned to successor(6) = node 1.
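The effect of a join or a departure can be checked by recomputing the key assignment for the three-node example. The helper names are illustrative, and recomputing every key from scratch is a simplification; a real node would hand over only the affected keys.

```python
def successor(node_ids, key_id):
    """First node whose ID is equal to or follows key_id, wrapping around."""
    ring = sorted(node_ids)
    for node in ring:
        if node >= key_id:
            return node
    return ring[0]


def assignment(node_ids, key_ids):
    """Map every key to the node currently responsible for it."""
    return {key: successor(node_ids, key) for key in key_ids}


KEYS = [1, 2, 6]
before = assignment({0, 1, 3}, KEYS)
after_join = assignment({0, 1, 2, 3}, KEYS)   # node 2 joins
after_leave = assignment({1, 3}, KEYS)        # node 0 leaves

print(before)
print(after_join)
print(after_leave)
```

In both cases only one key moves, which is the property that makes consistent hashing attractive when membership changes often.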
• Node N21 joins
• Node N26 joins
• N26 joins the system
• N26 acquires N32 as its successor
• N32 acquires N26 as its predecessor
• N21 acquires N26 as its successor
• N26 acquires N21 as its predecessor
• Redistribute keys and update successor pointers
• Searching/ key location:
– Each node need only be aware of its successor node (successor pointers) on
the circle to route a query
– Query routing:
• Query for a given keyID is passed around the circle via successor pointers
until query land in a node with nodeID = successor(searchkeyID)
• Query does not traverse all N nodes in circle.
• To accelerate routing, each node maintains a routing table (finger table).
• The finger table has at most m entries
– E.g. for m=3, there are 8 possible keyIDs/nodeIDs, and a node's finger table has 3 entries
• ith entry of the finger table of node n (n.finger[i].node):
– First node s = successor(n + 2^(i-1)), where 1 ≤ i ≤ m (arithmetic modulo 2^m)
– First finger of n (i=1):
» immediate successor on the circle
» simply called successor
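The finger-table rule can be sketched as follows. The ten-node ring with m = 6 is an assumed example chosen for illustration and is not necessarily the overlay drawn on the slides; the function names are likewise illustrative.

```python
def successor(node_ids, ident):
    """First node whose ID is equal to or follows ident, wrapping around."""
    ring = sorted(node_ids)
    for node in ring:
        if node >= ident:
            return node
    return ring[0]


def finger_table(n, node_ids, m):
    """Return rows (i, n + 2^(i-1) mod 2^m, successor of that ID) for i = 1..m."""
    size = 2 ** m
    table = []
    for i in range(1, m + 1):
        start = (n + 2 ** (i - 1)) % size
        table.append((i, start, successor(node_ids, start)))
    return table


# Assumed example ring (m = 6, identifiers 0..63).
RING = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]

for i, start, node in finger_table(8, RING, m=6):
    print(f"finger[{i}]: start {start} -> N{node}")
```

The first row is the immediate successor, and each later row reaches twice as far around the circle as the one before it.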
• Question:
– Devise the finger table for node N8 in the following Chord overlay:
• ith entry of the finger table of node n (n.finger[i].node):
– First node s = successor(n + 2^(i-1)), where 1 ≤ i ≤ m
n + 2^(i-1) | successor(n + 2^(i-1))
N8+1 | N14
Finger table:
finger[i] = successor(n + 2^(i-1))
Simple search:
// ask node n to find the successor of id
n.find_successor(id)
{
  // id lies between n (exclusive) and its successor (inclusive)
  if (id ∈ (n, successor])
    return successor;
  else
    // forward the query around the circle
    return successor.find_successor(id);
}
Number of messages is linear in the number of nodes!
• Scalable Search:
– We can do a better job in searching (accelerated lookups)
– Query searchKey = k
– Query originator = node n
– If query receiver node has an entry to successor(k) in finger table:
• Forward query to that node
– Else
• Find a node whose ID is closer than its own to k (predecessor(k)).
• Predecessor(k) has more pointers toward target node with documents for k.
• Predecessor(k) = node entry in n’s fingertable whose ID most immediately
precedes k.
• n forwards the query to predecessor(k)
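The accelerated lookup can be sketched as a small simulation. The `ChordRing` class, its method names, and the ten-node ring with m = 6 are assumptions for illustration; finger tables are computed from global knowledge here, whereas real nodes maintain them incrementally.

```python
def in_interval(x, a, b, size, inclusive_right=False):
    """True if x lies on the circle after a and before b (optionally at b)."""
    x, a, b = x % size, a % size, b % size
    if a < b:
        return a < x < b or (inclusive_right and x == b)
    # The interval wraps past zero (or covers the whole circle when a == b).
    return x > a or x < b or (inclusive_right and x == b)


class ChordRing:
    def __init__(self, node_ids, m):
        self.m = m
        self.size = 2 ** m
        self.nodes = sorted(node_ids)

    def successor_of(self, ident):
        """Global view, used only to build finger tables and to check results."""
        ident %= self.size
        for node in self.nodes:
            if node >= ident:
                return node
        return self.nodes[0]

    def fingers(self, n):
        return [self.successor_of(n + 2 ** (i - 1)) for i in range(1, self.m + 1)]

    def closest_preceding(self, n, key):
        """Finger of n that most immediately precedes key."""
        for finger in reversed(self.fingers(n)):
            if in_interval(finger, n, key, self.size):
                return finger
        return n

    def find_successor(self, n, key):
        """Return (node responsible for key, number of forwarding hops)."""
        hops = 0
        while True:
            succ = self.fingers(n)[0]
            if in_interval(key, n, succ, self.size, inclusive_right=True):
                return succ, hops
            nxt = self.closest_preceding(n, key)
            if nxt == n:          # no closer finger known; fall back to successor
                return succ, hops
            n, hops = nxt, hops + 1


ring = ChordRing([1, 8, 14, 21, 32, 38, 42, 48, 51, 56], m=6)
print(ring.find_successor(8, 54))
```

Each hop moves the query to a node much closer to the key, which is where the O(log N) message count comes from.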
• Search the finger table for the node that most immediately precedes id
• Invoke find_successor from that node
[Figure: lookup example showing the Predecessor(54) steps]
Number of messages is O(log N)!
• Application: Chord-based DNS
– DNS provides a lookup service
• keys: host names
• values: IP addresses
– Chord could hash each host name to a key
– no special root servers
– no manual management of routing information
BitTorrent
• Designed for fast, efficient content distribution
– Ideal for downloading large files, e.g. movies, DVDs, ISOs, etc.
– Uses P2P file swarming
• Not a full-fledged P2P system
– Does NOT support searching for files
– Trackers act as centralized swarm coordinators
• Fully P2P, trackerless torrents are now possible
• Insanely popular
– 35-70% of all Internet traffic
– A lot cheaper, faster and more efficient to distribute files using
BitTorrent than a regular download.
File sharing
• To share a file or group of files, the initiator first creates a .torrent file, a small file that contains:
– Metadata about the files to be shared, and
– Information about the tracker, the computer that coordinates the file distribution.
• Downloaders first obtain a .torrent file (there are sites for downloading torrents), and then connect to the specified tracker.
• The tracker tells them from which other peers to download the pieces of the file.
• Some popular Trackers:
– http://thepiratebay.se/
– https://www.torrentz.com/
• Some Terminology:
– Leech:
• peer that’s downloading the file (downloader)
• Does not have 100% of data
– Seed:
• peer with the entire file
• When a downloader/leech has the entire file and keeps uploading it, the peer becomes a seed.
• Initial seeder = a peer that provides the initial copy.
– Swarm:
• Set of peers all downloading the same file
• Each node knows list of pieces downloaded by neighbors
• Node requests pieces it does not own from neighbors
– Tracker:
• server that keeps track of which seeds and peers are in the swarm.
• Is not directly involved in the data transfer
• Does not have a copy of the file.
• A peer first downloads the .torrent file for the file it wants to download
• Contents of .torrent file:
– URL of tracker
– Piece length – Usually 256 KB
– SHA-1 hashes of each piece in file
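The per-piece hashes can be sketched with Python's `hashlib`. This covers only the hashing step, not the actual .torrent encoding; the function names and the sample data are illustrative assumptions.

```python
import hashlib

PIECE_LENGTH = 256 * 1024  # 256 KB, the usual piece length mentioned above


def piece_hashes(data: bytes, piece_length: int = PIECE_LENGTH):
    """Split data into fixed-size pieces and return the SHA-1 digest of each."""
    return [
        hashlib.sha1(data[offset:offset + piece_length]).digest()
        for offset in range(0, len(data), piece_length)
    ]


def verify_piece(index: int, piece: bytes, hashes) -> bool:
    """Check a downloaded piece against the hash listed for its index."""
    return hashlib.sha1(piece).digest() == hashes[index]


# Made-up 640,000-byte "file": two full pieces plus a shorter last piece.
data = bytes(range(256)) * 2500
hashes = piece_hashes(data)
print(len(hashes), "pieces")
print(verify_piece(0, data[:PIECE_LENGTH], hashes))
print(verify_piece(0, b"corrupted piece", hashes))
```

Because every piece has its own hash, a downloader can accept pieces from any peer in any order and still detect a corrupted one.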
Overlay Network
• Swarm Lifecycle:
• Each file is shared via a swarm
• The swarm starts with an initial seeder, a singleton peer with a complete copy of the file.
• A node that wants to download the file joins the swarm,
becoming its second member, and begins downloading
pieces of the file from the original peer.
• In doing so, it becomes another source for the pieces it
has downloaded, even if it has not yet downloaded the
entire file.
Sharing Pieces
• Pieces are downloaded in random order to avoid a situation where peers find themselves lacking the same set of pieces.
[Figure: an initial seeder holding pieces 1–8 sharing them with seeders and leechers]
From: http://www.ccs.neu.edu/home/cbw/4700/
• The Beauty of BitTorrent:
– Multiple, redundant sources for each piece
• More leechers = more replicas of pieces
• More replicas = faster downloads
– Great for content distribution
– Cost is shared among the swarm
Download in progress
• Operation:
Summary
• We have discussed some of the popular applications in the Internet
– Electronic mail, World Wide Web
• We have discussed infrastructure services
– Domain Name System (DNS)
• We have discussed overlay networks
– Routing overlay, End-system multicast, Peer-to-peer networks