Download 11/18

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the work of artificial intelligence, which forms the content of this project

Transcript
COS 109 Wednesday November 18
• Housekeeping
– Lab 6 and Problem Set 7 due dates
Lab 6 is due by midnight on Friday November 27
Problem Set 7 is due by 5 PM on Monday November 30
– Because these deadlines have been extended, there will be no further
extensions
• Today’s class
– The internet
It’s history
How it works
How it is governed
The Internet
• a huge number of independent networks that are connected
– NOT a giant computer or a single network
– each network may serve many host computers
• nearby computers are connected by a local area network
– most often Ethernet (including wireless)
• information travels through networks in small "packets"
– each packet independent of all others
like individual envelopes through the mail
– all packets have the same format
– standard protocols for format of info and behavior
• networks connected by specialized gateway computers (routers)
– route packets of information from one network to the next
– gateways continuously exchange routing information
• each packet passes through multiple gateways
– gateway passes packet to gateway that is closer to ultimate destination
– gateways usually operated by different companies
Internet History
• 1961: packet switching concept (Leonard Kleinrock, MIT, UCLA)
– Break a message into packets and send individual packets
• 1960's: ARPANET, funding from DARPA (Dept of Defense)
– Largely in response to cold war
• 1969: first Internet communication
– LO (connection UCLA  SRI crashed before it could be LOGIN)
1971
Early ARPANET connectivity
1971
1969
1980
From http://www.angelfire.com/pro/arpanet/
Internet History
• 1972: first network email
– Announced the availability of email (of a fashion) at BBN
– 1976 Queen Elizabeth II hits send on her first email
• 1973: basic protocols: TCP/IP (Bob Kahn *64, Vint Cerf)
– A Protocol for Packet Network Interconnection,"
• 1980's: National Science Foundation funding, NSFNet
(Al Gore)
Internet History
• 1984: MILNET splits off from ARPANET
– ARPANET continued for academic work but the networks were no longer
connected
• 1986: Internet Engineering Task Force for technical decisions
– Develops and promotes internet standards
• 1988: Morris worm brings down the internet
– 6000 UNIX machines disconnected from Internet (60,000 total hosts at
that point)
• 1990-1 HTTP (Web protocol)/AOL/Modems/PCs
• 1997-2000: commercialization, Web, dot-com boom
– Just get users, don’t worry about making $$ (now or ever)
• 2000: dot-com bust
– Pets.com and Webvan among the companies that failed
• 2010: universal availability
• for lots more, http://www.isoc.org/internet/history/
Basic mechanisms
• names for computers
– princeton.edu, finance.yahoo.com, www.whitehouse.gov, deandobkin.com,
• addresses for identifying networks and computers
– each has a unique number like 128.112.128.81 (IP address)
– central authority assigns numbers to networks
– each host computer has unique address (32 bit integer in IPv4),
assigned locally according to what network it's on
• Domain Name System to convert names to addresses
– Translator
• routing for finding paths from network to network
– Tracing a route
• protocols (rules) for packaging and transporting information
– IP, or "Internet Protocol": a uniform transport mechanism
at IP level, all information is in a common format
– below IP, different hardware uses different protocols
– above IP, higher-level protocols for handling web pages, mail, login …
Internet (IP) addresses
• each network and each connected computer has an IP address
• IP address: a unique 32-bit number in IPv4 (IPv6 is 128 bits)
– 1st part is network id, assigned centrally in blocks
(Internet Assigned Numbers Authority -> Internet Service Provider ->
you)
– 2nd part is host id within that network
assigned locally, often dynamically
net part
host on that net
• written in "dotted decimal" notation: each byte in decimal
– e.g., 128.112.132.86 = www.princeton.edu
128
10000000
112
132
01110000 10000100
86
01010110
Domain names
• a hierarchical naming scheme
– central authority (ICANN) manages top level of names
• top level domains include .com, .edu, .gov, .xx for country XX
– and newer domains like .biz, .info, .name, .xxx, …
• each domain delegates responsibilities to levels below
– for administration and translation into addresses
• each level is responsible for names within it
– princeton.edu handles all of princeton
– delegates cs.princeton.edu to a CS machine
– CS department manages names within, e.g., tux.cs.princeton.edu
• names impose logical structure, not physical or geographical
ICANN
• Internet Corporation for Assigned Names and Numbers
– non-profit corporation, established 1998 by Dept of Commerce
– technical coordination of the Internet
– www.icann.org
• "coordinates the assignment of the following identifiers that
must be globally unique for the Internet to function:
– Internet domain names
– IP address numbers
– protocol parameter and port numbers
• "coordinates the stable operation of the Internet's root server
system"
Protocols
• precise rules that govern communication between two parties
• TCP/IP: the basic Internet protocols
• IP: Internet protocol (bottom level)
– all packets shipped from network to network as IP packets
– no guarantees on quality of service or reliability: "best effort"
– each physical network has its own format for carrying IP packets
• TCP: transmission control protocol
– creates a reliable 2-way data stream using IP
errors are detected and corrected
– most things we think of as "Internet" use TCP
• "application-level" protocols, mostly built from TCP
– HTTP (web), SMTP (mail), SSH (secure login), FTP (file transfer), …
• UDP: user datagram protocol
– simple unreliable datagram protocol (errors not detected)
– used in DNS, remote file systems, ...
Packets
• packet: a sequence of bytes carrying information
– usually over a network connection
• bytes have a specific sequence, format, organization
– usually as specified in a protocol
• typical network packet includes
–
–
–
–
–
source (where it comes from)
destination (where it goes to)
size or length information (how big is the data part)
miscellaneous information (type, version, info to detect errors, ...)
the data itself ("payload")
• typical sizes range from
– a few bytes
– 150-1500 (Ethernet packets)
– 100-65000 (IP packets)
What's in an IP packet
• a "header" that contains
– protocol version, type of packet, length of header, length of data
– fragmentation info in case it was broken into pieces
– time to live: maximum number of hops before packet is discarded
each gateway decreases this by 1
– source & destination addresses (32 bits for IPv4, 128 bits for IPv6)
– checksum of header information
redundant info to detect errors in header information only, not data
itself
– etc.; about 20-40 bytes in header
• actual data
– up to 64 KB of payload
– IPv4:
version type
hdr
len
total
len
frag
TTL
source
address
dest
address
chk
data…
IP: Internet Protocol
• IP provides an unreliable connectionless packet delivery service
– every packet has full source & destination addresses
– every packet is independent of all others
• IP packets are datagrams
– individually addressed packages, like postcards in the postal system
"connectionless"
– stateless: no memory from one packet to next
each packet is independent of others, even if in sequence and going
same place
– unreliable: packets can be lost or duplicated ("best effort" delivery)
– packets can be delivered out of order
– contents can be wrong (though error rates are usually very low)
– no speed control: packets can arrive too fast to be processed
– limited size: long messages have to be split up and then reassembled
• higher level protocols use IP packets to carry information
• IP packets are carried on a wide variety of physical media
TCP: Transmission Control Protocol
• a reliable 2-way byte stream built with IP
• a TCP connection is established to a specific host
– and a specific "port" at that host
• each port provides a specific service
– SSH = 22, SMTP = 25, HTTP = 80, ...
• a message is broken into 1 or more segments
• each TCP segment has a header (src, dest, etc) + data
– header includes checksum for error detection, and sequence number to
preserve order and detect missing or duplicated packets
• each TCP segmentis wrapped in an IP packet and sent
– has to be positively acknowledged to ensure that it arrived safely
otherwise, re-send it after a time interval
• TCP is the basis of most higher-level protocols
TCP/IP
How things are connected
•
•
•
•
local nets connected to local Internet Service Provider (ISP)
these in turn connect to regional ISPs
and then to larger ones like UUNet, AT&T, Cogent, Level 3, …
traffic exchanged at Internet exchanges
– large and small, formal and informal, profit and non-profit
• bandwidth (bit-carrying capacity) of connections is usually higher
for larger ISPs
–
–
–
–
phone line analog modem 56 Kbps (you to your ISP)
cable modem, DSL 500 Kbps - 3MBps (you to your ISP)
telephone lines 1.5-45 Mbps (local ISP or big company to ISP)
optical fiber 100 Mbps and up (large carriers)
Typical home connection
home
computer
phone line, cable modem, ADSL
Verizon
another local area network
router or gateway
gateway
TCP/IP re-enactment
if you are the originator (IP address 1.2.3.4)
send the packets one at a time to a nearby router
if you are a router in the middle,
when you receive a packet
pass it on towards the destination as fast as you can
if you are the
tear off the
put it in the
send it back
recipient (IP address 5.6.7.8)
sequence number
envelope
to the originator
traceroute to cs.princeton.edu (128.112.136.10)
1 Wireless_Broadband_Router.home (192.168.1.1) 1.289 ms 1.283 ms 1.275 ms
2 L100.CMDNNJ-VFTTP-50.verizon-gni.net (72.73.238.1) 4.967 ms 4.965 ms 5.607
ms
3 G0-3-5-3.CMDNNJ-LCR-21.verizon-gni.net (130.81.191.210)11.372ms 11.370ms
11.363ms
4 so-0-3-0-0.NWRK-BB-RTR1.verizon-gni.net (130.81.22.0) 13.461 ms 13.458 ms
15.306ms
5 0.xe-8-1-0.XL3.NYC1.ALTER.NET (152.63.5.213) 18.140 ms 19.566 ms 19.562
ms
6 0.xe-4-1-3.XL3.IAD8.ALTER.NET (152.63.3.142) 24.870 ms
7 POS6-0.GW3.IAD8.ALTER.NET (152.63.41.21) 17.497 ms 16.523 ms 16.519 ms
8 paetec-gw.customer.alter.net (157.130.49.210) 18.660 ms 16.190 ms 16.851 ms
9 so-6-0-0.phlapafgw26cr02.paetec.net (169.130.81.61) 20.506 ms 20.509 ms
22.880 ms
10 ge-1-0-0.phlapafgw26cr01.paetec.net (64.198.100.33) 21.896 ms 23.589 ms
23.578 ms
11 169.130.83.130 (169.130.83.130) 25.707 ms 25.695 ms 29.217 ms
12 te-0-3-0-0.PHLAPAFGW32PE02.paetec.net (169.130.83.117) 26.350ms 27.756ms 21.661ms
13 209.92.27.38 (209.92.27.38) 22.344 ms 24.464 ms 21.851 ms
14 gigagate1.Princeton.EDU (128.112.12.21) 21.827 ms 23.190 ms 23.185 ms
15 csgate.Princeton.EDU (128.112.12.58) 25.068 ms 25.067 ms 26.450 ms
Looking into the internet
• http://ping.eu/traceroute/
– 74.64.107.235
Is it enough?
• a.b.c.d
– 256x256x256x256 ~ 4 billion addresses
• Are there enough IPv4 addresses?
• New scheme (IPv6)
– Will have 340 billion billion billion billion
Growth in number of hosts
Charts all taken from http://www.zakon.org/robert/internet/timeline/
Domain Name registrations over time
WWW growth
Growth in number of facebook accounts
How big is facebook?
Website of the day
• Search on google for
– Do a barrel roll
• Google tilt trick and friends