3rd Edition: Chapter 4
Network Layer
Dr. Yingwu Zhu

Network Layer (outline)
• Introduction
• Datagram networks
• IP: Internet Protocol
  - Datagram format
  - IPv4 addressing
  - ICMP
• What’s inside a router
• Routing algorithms
  - Link state
  - Distance Vector
  - Hierarchical routing
• Routing in the Internet
  - RIP
  - OSPF
  - BGP
• Broadcast and multicast routing

Network layer
• transports segments from the sending host to the receiving host
• on the sending side, encapsulates segments into datagrams
• on the receiving side, delivers segments to the transport layer
• network layer protocols run in every host and router
• a router examines header fields in all IP datagrams passing through it
[Figure: full protocol stacks (application, transport, network, data link, physical) at the two end hosts; network, data link and physical layers at each router along the path]

Key Network-Layer Functions
• forwarding: move packets from a router’s input to the appropriate router output
• routing: determine the route taken by packets from source to dest. (routing algorithms)
Analogy:
• routing: the process of planning a trip from source to dest
• forwarding: the process of getting through a single interchange

Interplay between routing and forwarding
• The routing algorithm fills in the local forwarding table.
• The value in an arriving packet’s header is looked up in that table to select the output link.

  header value   output link
  0100           3
  0101           2
  0111           2
  1001           1

[Figure: a packet with header value 0111 arriving at a router with output links 1, 2 and 3]

Datagram networks
• Connectionless service: no call setup at the network layer
• routers: no state about end-to-end connections (why?)
  - no network-level concept of “connection”
• packets are forwarded using the destination host address
  - packets between the same source-dest pair may take different paths
[Figure: two end hosts with full protocol stacks; 1. Send data, 2. Receive data]

The Internet Network layer
Host and router network layer functions, between the transport layer (TCP, UDP) above and the link and physical layers below:
• IP protocol
  - addressing conventions
  - datagram format
  - packet handling conventions
• Routing protocols (these fill in the forwarding table)
  - path selection
  - RIP, OSPF, BGP
• ICMP protocol
  - error reporting
  - router “signaling”

The Internet Protocol (IP): Protocol Stack
• App: Data
• Transport (TCP / UDP): TCP segment = Hdr + Data
• Network (IP): IP datagram = Hdr + TCP segment
• Link

The Internet Protocol (IP): Characteristics of IP
• CONNECTIONLESS: mis-sequencing
• UNRELIABLE: may drop packets…
• BEST EFFORT: … but only if necessary
• DATAGRAM: individually routed
• Transparent: architecture, links, topology
[Figure: source A and destination B connected through routers R1, R2, R3 and R4; each datagram (header H, data D) is routed individually]

IP datagram format
Layout (each row is 32 bits):

  ver | head. len | type of service | length
  16-bit identifier | flgs | fragment offset
  time to live | upper layer | Internet checksum
  32 bit source IP address
  32 bit destination IP address
  Options (if any)
  data (variable length, typically a TCP or UDP segment)

Field notes:
• ver: IP protocol version number
• head. len: header length (bytes)
• type of service: “type” of data
• length: total datagram length (bytes)
• 16-bit identifier, flgs, fragment offset: for fragmentation/reassembly
• time to live: max number of remaining hops (decremented at each router)
• upper layer: upper layer protocol to deliver the payload to (e.g. 6 for TCP)
• Options: e.g. timestamp, record route taken, specify list of routers to visit
How much overhead with TCP?
• 20 bytes of TCP + 20 bytes of IP = 40 bytes + app layer overhead

IP Fragmentation & Reassembly
Problem: A router may receive a packet larger than the maximum transmission unit (MTU) of the outgoing link.
• Different link types have different MTUs (MTU: the maximum amount of data a link-layer frame can carry)
  - E.g., Ethernet frames carry up to 1500 bytes; frames for some wide-area links carry no more than 576 bytes.
Solution: a large IP datagram is divided (“fragmented”) within the net
• one datagram becomes several datagrams
• “reassembled” only at the final destination (why?)
• IP header bits are used to identify and order related fragments
[Figure: fragmentation: in: one large datagram, out: 3 smaller datagrams; reassembly at the destination]

IP Fragmentation and Reassembly: Example
• 4000 byte datagram, MTU = 1500 bytes
• One large datagram becomes several smaller datagrams:

              length   ID   fragflag   offset
  original    4000     x    0          0
  fragment 1  1500     x    1          0
  fragment 2  1500     x    1          185
  fragment 3  1040     x    0          370

• Each full-size fragment has 1480 bytes in its data field, so the second fragment starts at offset 1480/8 = 185.
• Note: the offset value is specified in units of 8-byte chunks!

Fragmentation
• Fragments are re-assembled by the destination host, not by intermediate routers.
• To avoid fragmentation, hosts commonly use path MTU discovery to find the smallest MTU along the path.
• Path MTU discovery involves sending datagrams of various sizes until they do not require fragmentation along the path.
• Most links use MTU >= 1500 bytes today.
• Try: traceroute www.berkeley.edu 500 -F and traceroute www.berkeley.edu 1501 -F
  - -F: set the "don't fragment" bit; an error is returned if the datagram is too long
• Bonus: Can you find a destination for which the path MTU < 1500 bytes?

IP Addresses
• IP (Version 4) addresses are 32 bits long
• Every interface has a unique IP address:
  - A computer might have two or more IP addresses
  - A router has many IP addresses
• IP addresses are hierarchical
  - They contain a network ID and a host ID
  - E.g. SeattleU addresses start with: 172.17…
• IP addresses are assigned statically or dynamically (e.g. DHCP)
• IP (Version 6) addresses are 128 bits long

IP Addresses
Originally there were 5 classes:

  Class   Leading bits   Remaining fields
  “A”     0              Net ID (7 bits), Host-ID (24 bits)
  “B”     10             Net ID (14 bits), Host-ID (16 bits)
  “C”     110            Net ID (21 bits), Host-ID (8 bits)
  “D”     1110           Multicast Group ID (28 bits)
  “E”     11110          Reserved (27 bits)

[Figure: classes A, B, C and D laid out along the address line from 0 to 2^32 - 1]

IP Addresses: Examples
• Class “A” address: www.mit.edu 18.181.0.31 (18 < 128 => Class A)
• Class “B” address: www.seattleu.edu 172.17.72.14 (128 <= 172 < 128+64 => Class B)

IP Addressing
Problem:
• Address classes were too “rigid”. For most organizations, Class C networks were too small and Class B networks too big. This led to inefficient use of address space and a shortage of addresses.
• Organizations with internal routers needed a separate (Class C) network ID for each link.
  - Every other router in the Internet then had to know about every network ID in every organization, which led to large address tables.
• Small organizations wanted Class B in case they grew to more than 255 hosts. But there were only about 16,000 Class B network IDs.

IP Addressing
Two solutions were introduced:
• Subnetting within an organization, to subdivide the organization’s network ID.
• Classless Interdomain Routing (CIDR) in the Internet backbone, introduced in 1993 to provide more efficient and flexible use of IP address space.
• CIDR is also known as “supernetting” because subnetting and CIDR are basically the same idea.

Subnetting
[Figure: a Class “B” address (2-bit prefix 10, 14-bit Net ID, 16-bit Host-ID) whose Host-ID is subdivided at successive levels:
  e.g. Company: Subnet ID (20), Subnet Host ID (12)
  e.g. Site:    Subnet ID (22), Subnet Host ID (10)
  e.g. Dept:    Subnet ID (26), Subnet Host ID (6)]

Subnetting
• Subnetting is a form of hierarchical routing.
• Subnets are usually represented via an address plus a subnet mask or “netmask”, e.g.

    [zhuy@cs1 ~]$ /sbin/ifconfig eth0
    eth0  Link encap:Ethernet  HWaddr 00:21:9B:8F:64:6D
          inet addr:10.124.72.20  Bcast:10.124.72.255  Mask:255.255.255.0

• Netmask 255.255.255.0: the first 24 bits are the subnet ID, and the last 8 bits are the host ID.
• Can also be represented by a “prefix + length”, e.g. 171.64.15.0/24, or just 171.64.15/24.

IP Addressing
• IP address: 32-bit identifier for a host or router interface
• interface: connection between a host/router and a physical link
  - routers typically have multiple interfaces
  - a host typically has one interface
  - IP addresses are associated with each interface
• 223.1.1.1 = 11011111 00000001 00000001 00000001 (223 . 1 . 1 . 1)
[Figure: a router joining three networks; interfaces 223.1.1.1, 223.1.1.2, 223.1.1.3, 223.1.1.4; 223.1.2.1, 223.1.2.2, 223.1.2.9; 223.1.3.1, 223.1.3.2, 223.1.3.27]

Subnets
• IP address:
  - subnet part (high order bits)
  - host part (low order bits)
• What’s a subnet?
  - device interfaces with the same subnet part of the IP address
  - can physically reach each other without an intervening router
[Figure: the same network, consisting of 3 subnets]

Subnets
Recipe:
• To determine the subnets, detach each interface from its host or router, creating islands of isolated networks. Each isolated network is called a subnet.
• Here: 223.1.1.0/24, 223.1.2.0/24, 223.1.3.0/24 (subnet mask: /24)

Subnets
How many?
[Figure: a larger network with interfaces 223.1.1.1, 223.1.1.2, 223.1.1.3, 223.1.1.4; 223.1.2.1, 223.1.2.2, 223.1.2.6; 223.1.3.1, 223.1.3.2, 223.1.3.27; 223.1.7.0, 223.1.7.1; 223.1.8.0, 223.1.8.1; 223.1.9.1, 223.1.9.2]

Classless Interdomain Routing (CIDR) Addressing
• The IP address space is broken into line segments; the subnet portion of an address can be of arbitrary length.
• Each line segment is described by a prefix.
• A prefix is of the form x/y, where x indicates the prefix of all addresses in the line segment and y indicates the length of that prefix in bits (so the segment contains 2^(32-y) addresses).
• e.g. The prefix 128.9/16 represents the line segment containing addresses in the range 128.9.0.0 … 128.9.255.255.
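The x/y prefix notation above can be checked mechanically. The following is a minimal sketch using Python's standard `ipaddress` module; the helper names `expand_prefix`, `prefix_range` and `contains` are illustrative and not part of the slides, and padding the shorthand `128.9/16` with zero octets is an assumption about how that shorthand is meant to be read.

```python
import ipaddress
from typing import Tuple


def expand_prefix(prefix: str) -> ipaddress.IPv4Network:
    """Turn slide shorthand such as '128.9/16' into an IPv4Network.

    Assumption: missing trailing octets stand for zeros.
    """
    address, length = prefix.split("/")
    octets = address.split(".")
    octets += ["0"] * (4 - len(octets))
    # strict=True rejects prefixes that have host bits set
    return ipaddress.ip_network(".".join(octets) + "/" + length, strict=True)


def prefix_range(prefix: str) -> Tuple[str, str, int]:
    """Return (first address, last address, number of addresses) of a prefix."""
    net = expand_prefix(prefix)
    return str(net.network_address), str(net.broadcast_address), net.num_addresses


def contains(prefix: str, address: str) -> bool:
    """True if the address lies inside the line segment described by prefix."""
    return ipaddress.ip_address(address) in expand_prefix(prefix)


if __name__ == "__main__":
    print(prefix_range("128.9/16"))
    print(contains("128.9.16/20", "128.9.16.14"))
    # The netmask form from the ifconfig output describes the same kind of object.
    print(ipaddress.ip_interface("10.124.72.20/255.255.255.0").network)
```

A /16 prefix leaves 32 - 16 = 16 host bits, which is why the segment holds 2^16 addresses.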
[Figure: the address line from 0 to 2^32 - 1 with the segments 65/8, 128.9/16 and 142.12/19 marked; 128.9/16 starts at 128.9.0.0, spans 2^16 addresses and contains 128.9.16.14]

Classless Interdomain Routing (CIDR) Addressing
[Figure: the segments 128.9.19/24, 128.9.25/24, 128.9.16/20 and 128.9.176/20 nested inside 128.9/16 on the address line from 0 to 2^32 - 1; the address 128.9.16.14 is marked]
• Most specific route = “longest matching prefix”

Classless Interdomain Routing (CIDR) Addressing
• Prefix aggregation:
  - If a service provider serves two organizations with prefixes, it can (sometimes) aggregate them to form a shorter prefix. Other routers can refer to this shorter prefix, and so reduce the size of their address table.
  - E.g. if an ISP serves 128.9.14.0/24 and 128.9.15.0/24, it can tell other routers to send it all packets belonging to the prefix 128.9.14.0/23.
• ISP Choice:
  - In principle, an organization can keep its prefix if it changes service providers.

Size of the Routing Table at the core of the Internet
[Figure: routing table size over time. Source: http://www.cidr-report.org/]

IP addresses: how to get one?
Q: How does a host get an IP address?
• hard-coded by the system admin in a file
  - Wintel: control-panel->network->configuration->tcp/ip->properties
  - UNIX: /etc/rc.config
• DHCP: Dynamic Host Configuration Protocol: dynamically get an address from a server
  - “plug-and-play”

IP addresses: how to get one?
Q: How does a network get the subnet part of its IP addr?
A: It gets an allocated portion of its provider ISP’s address space.

  ISP's block      11001000 00010111 00010000 00000000   200.23.16.0/20
  Organization 0   11001000 00010111 00010000 00000000   200.23.16.0/23
  Organization 1   11001000 00010111 00010010 00000000   200.23.18.0/23
  Organization 2   11001000 00010111 00010100 00000000   200.23.20.0/23
  ...              …..                                   ….
  Organization 7   11001000 00010111 00011110 00000000   200.23.30.0/23

Hierarchical addressing: route aggregation
Hierarchical addressing allows efficient advertisement of routing information:
• Fly-By-Night-ISP serves Organization 0 (200.23.16.0/23), Organization 1 (200.23.18.0/23), Organization 2 (200.23.20.0/23), ..., Organization 7 (200.23.30.0/23) and advertises to the Internet: “Send me anything with addresses beginning 200.23.16.0/20”
• ISPs-R-Us advertises to the Internet: “Send me anything with addresses beginning 199.31.0.0/16”

Hierarchical addressing: more specific routes
ISPs-R-Us has a more specific route to Organization 1:
• Fly-By-Night-ISP still serves Organization 0 (200.23.16.0/23), Organization 2 (200.23.20.0/23), ..., Organization 7 (200.23.30.0/23) and advertises: “Send me anything with addresses beginning 200.23.16.0/20”
• ISPs-R-Us now serves Organization 1 (200.23.18.0/23) and advertises: “Send me anything with addresses beginning 199.31.0.0/16 or 200.23.18.0/23”
• Longest prefix match sends Organization 1’s traffic to ISPs-R-Us.

IP addressing: the last word...
Q: How does an ISP get a block of addresses?
A: ICANN: Internet Corporation for Assigned Names and Numbers
• allocates addresses
• manages DNS
• assigns domain names, resolves disputes

NAT: Network Address Translation
[Figure: a local network (e.g., home network) 10.0.0/24 with hosts 10.0.0.1, 10.0.0.2 and 10.0.0.3 behind a NAT router whose LAN-side address is 10.0.0.4 and whose address toward the rest of the Internet is 138.76.29.7]
• All datagrams leaving the local network have the same single source NAT IP address, 138.76.29.7, and different source port numbers.
• Datagrams with source or destination in this network have a 10.0.0/24 address for source, destination (as usual).

NAT: Network Address Translation
Motivation: the local network uses just one IP address as far as the outside world is concerned:
• no need to be allocated a range of addresses from the ISP: just one IP address is used for all devices
• can change addresses of devices in the local network without notifying the outside world
• can change ISP without changing addresses of devices in the local network
• devices inside the local net are not explicitly addressable or visible by the outside world (a security plus)

NAT: Network Address Translation
Implementation: a NAT router must:
• outgoing datagrams: replace (source IP address, port #) of every outgoing datagram with (NAT IP address, new port #)
  - remote clients/servers will respond using (NAT IP address, new port #) as destination addr.
• remember (in the NAT translation table) every (source IP address, port #) to (NAT IP address, new port #) translation pair
• incoming datagrams: replace (NAT IP address, new port #) in the dest fields of every incoming datagram with the corresponding (source IP address, port #) stored in the NAT table

NAT: Network Address Translation
1. Host 10.0.0.1 sends a datagram to 128.119.40.186, 80 (S: 10.0.0.1, 3345; D: 128.119.40.186, 80)
2. NAT router changes the datagram source addr from 10.0.0.1, 3345 to 138.76.29.7, 5001 and updates the table (S: 138.76.29.7, 5001; D: 128.119.40.186, 80)
3. Reply arrives with dest. address 138.76.29.7, 5001 (S: 128.119.40.186, 80; D: 138.76.29.7, 5001)
4. NAT router changes the datagram dest addr from 138.76.29.7, 5001 to 10.0.0.1, 3345 (S: 128.119.40.186, 80; D: 10.0.0.1, 3345)

  NAT translation table
  WAN side addr        LAN side addr
  138.76.29.7, 5001    10.0.0.1, 3345
  ……                   ……

NAT: Network Address Translation
• 16-bit port-number field: 60,000 simultaneous connections with a single LAN-side address!
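The translation steps in the NAT example can be sketched as a pair of dictionaries. This is a minimal illustration, not a description of any real router: the class name `NatTable`, its method names, and the policy of handing out ports sequentially starting at 5001 are assumptions chosen so that the numbers match the example above.

```python
from typing import Dict, Optional, Tuple

Endpoint = Tuple[str, int]  # (IP address, port #)


class NatTable:
    """Toy NAT translation table (hypothetical helper, for illustration only)."""

    def __init__(self, nat_ip: str, first_port: int = 5001) -> None:
        self.nat_ip = nat_ip
        self.next_port = first_port  # assumed policy: allocate ports sequentially
        self.lan_to_wan: Dict[Endpoint, Endpoint] = {}
        self.wan_to_lan: Dict[Endpoint, Endpoint] = {}

    def translate_outgoing(self, source: Endpoint) -> Endpoint:
        """Replace (source IP, port #) with (NAT IP, new port #), remembering the pair."""
        if source not in self.lan_to_wan:
            wan = (self.nat_ip, self.next_port)
            self.next_port += 1
            self.lan_to_wan[source] = wan
            self.wan_to_lan[wan] = source
        return self.lan_to_wan[source]

    def translate_incoming(self, destination: Endpoint) -> Optional[Endpoint]:
        """Map (NAT IP, port #) back to the LAN endpoint, or None if no entry exists."""
        return self.wan_to_lan.get(destination)


if __name__ == "__main__":
    nat = NatTable("138.76.29.7")
    print(nat.translate_outgoing(("10.0.0.1", 3345)))     # step 2 of the example
    print(nat.translate_incoming(("138.76.29.7", 5001)))  # step 4 of the example
```

Returning `None` for an incoming datagram with no table entry mirrors the point that devices inside the local network are not addressable from outside unless they opened the connection first.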
NAT is controversial:
• routers should only process up to layer 3
• violates the end-to-end argument
  - the possibility of NAT must be taken into account by app designers, e.g., P2P applications
• the address shortage should instead be solved by IPv6

ICMP: Internet Control Message Protocol
• used by hosts & routers to communicate network-level information
  - error reporting: unreachable host, network, port, protocol
  - echo request/reply (used by ping)
• network layer “above” IP:
  - ICMP msgs are carried in IP datagrams
• ICMP message: type, code, plus the first 8 bytes of the IP datagram causing the error

  Type   Code   Description
  0      0      echo reply (ping)
  3      0      dest. network unreachable
  3      1      dest host unreachable
  3      2      dest protocol unreachable
  3      3      dest port unreachable
  3      6      dest network unknown
  3      7      dest host unknown
  4      0      source quench (congestion control - not used)
  8      0      echo request (ping)
  9      0      route advertisement
  10     0      router discovery
  11     0      TTL expired
  12     0      bad IP header

An aside: Error Reporting (ICMP) and traceroute
Internet Control Message Protocol:
• Used by a router/end-host to report some types of error:
  - E.g. Destination Unreachable: the packet can’t be forwarded to/towards its destination.
  - E.g. Time Exceeded: TTL reached zero, or a fragment didn’t arrive in time. Traceroute uses this error to its advantage.
• An ICMP message is an IP datagram, and is sent back to the source of the packet that caused the error.

Traceroute and ICMP
• Source sends a series of UDP segments to the dest
  - First has TTL = 1
  - Second has TTL = 2, etc.
  - Unlikely port number
• When the nth datagram arrives at the nth router:
  - Router discards the datagram
  - And sends to the source an ICMP message (type 11, code 0)
  - Message includes the name of the router & its IP address
• When the ICMP message arrives, the source calculates the RTT
• Traceroute does this 3 times
Stopping criterion:
• The UDP segment eventually arrives at the destination host
• The destination returns an ICMP “port unreachable” packet (type 3, code 3)
• When the source gets this ICMP message, it stops.
It would be fun to design a traceroute tool on your own!

Router Architecture Overview
Two key router functions:
• run routing algorithms/protocols (RIP, OSPF, BGP)
• forward datagrams from incoming to outgoing link

Input Port Functions
• Physical layer: bit-level reception
• Data link layer: e.g., Ethernet
• Decentralized switching:
  - given the datagram dest., look up the output port using the forwarding table in input port memory
  - goal: complete input port processing at ‘line speed’
  - queuing: if datagrams arrive faster than the forwarding rate into the switch fabric

Three types of switching fabrics
[Figure: switching via memory, via a bus, and via an interconnection network]

Switching Via Memory
First generation routers:
• traditional computers with switching under direct control of the CPU
• packet copied to the system’s memory
• speed limited by memory bandwidth (2 bus crossings per datagram)
[Figure: Input Port -> Memory -> Output Port over the System Bus]

Switching Via a Bus
• datagram moves from input port memory to output port memory via a shared bus
• bus contention: switching speed limited by bus bandwidth
• 1 Gbps bus, Cisco 1900: sufficient speed for access and enterprise routers (not regional or backbone)

Switching Via An Interconnection Network
• overcomes bus bandwidth limitations
• Banyan networks, other interconnection nets initially developed to connect processors in a multiprocessor
• Advanced design: fragmenting the datagram into fixed length cells, and switching the cells through the fabric
• Cisco 12000: switches Gbps through the interconnection network

Output Ports
• Buffering is required when datagrams arrive from the fabric faster than the transmission rate
• A scheduling discipline chooses among queued datagrams for transmission

Output port queueing
• buffering when the arrival rate via the switch exceeds the output line speed
• queueing (delay) and loss due to output port buffer overflow!

Input Port Queuing
• Fabric slower than input ports combined -> queueing may occur at input queues
• Head-of-the-Line (HOL) blocking: a queued datagram at the front of the queue prevents others in the queue from moving forward
• queueing delay and loss due to input buffer overflow!

How a Router Forwards Datagrams
[Figure: routers R1, R2, R3 and R4; R1 has ports 1, 2 and 3, with next hops 128.17.20.1 and 128.17.16.1 labelled]
Forwarding/routing table:

  Prefix          Next-hop       Port
  65/8            128.17.16.1    3
  128.9/16        128.17.14.1    2
  128.9.16/20     128.17.14.1    2
  128.9.19/24     128.17.10.1    7
  128.9.25/24     128.17.14.1    2
  128.9.176/20    128.17.20.1    1
  142.12/19       128.17.16.1    3

e.g. 128.9.16.14 => Port 2

How a Router Forwards Datagrams
• Every datagram contains a destination address.
• The router determines the prefix to which the address belongs, and routes the datagram toward the “Network ID” that uniquely identifies a physical network.
• All hosts and routers sharing a Network ID share the same physical network.

Forwarding Datagrams
• Is the datagram for a host on a directly attached network?
• If no, consult the forwarding table to find the next-hop.
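The "128.9.16.14 => Port 2" example from the forwarding table can be reproduced with a longest-prefix-match lookup. The sketch below is a plain linear scan over the table using Python's standard `ipaddress` module; it only illustrates the matching rule and is not how a real router performs the lookup. The names `FORWARDING_TABLE` and `lookup` are illustrative, and the shorthand prefixes are written out in full (e.g. 128.9/16 as 128.9.0.0/16).

```python
import ipaddress
from typing import List, Optional, Tuple

# (prefix, next-hop, port) rows copied from the forwarding/routing table,
# with the shorthand prefixes written out in full.
FORWARDING_TABLE: List[Tuple[ipaddress.IPv4Network, str, int]] = [
    (ipaddress.ip_network(prefix), next_hop, port)
    for prefix, next_hop, port in [
        ("65.0.0.0/8", "128.17.16.1", 3),
        ("128.9.0.0/16", "128.17.14.1", 2),
        ("128.9.16.0/20", "128.17.14.1", 2),
        ("128.9.19.0/24", "128.17.10.1", 7),
        ("128.9.25.0/24", "128.17.14.1", 2),
        ("128.9.176.0/20", "128.17.20.1", 1),
        ("142.12.0.0/19", "128.17.16.1", 3),
    ]
]


def lookup(destination: str) -> Optional[Tuple[str, int]]:
    """Return (next-hop, port) of the longest matching prefix, or None if none matches."""
    address = ipaddress.ip_address(destination)
    best = None
    for network, next_hop, port in FORWARDING_TABLE:
        if address in network and (best is None or network.prefixlen > best[0].prefixlen):
            best = (network, next_hop, port)
    return None if best is None else (best[1], best[2])


if __name__ == "__main__":
    print(lookup("128.9.16.14"))  # matches 128.9/16 and 128.9.16/20; the /20 wins
    print(lookup("128.9.19.5"))   # the /24 is more specific than the /20 and the /16
    print(lookup("10.0.0.1"))     # no matching prefix
```

A linear scan costs one comparison per table entry, which is the motivation for the faster lookup structures a real router needs.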
Inside a router
[Figure: four links; each has an ingress stage ("Choose Egress") and an egress stage]

Inside a router
[Figure: the same four links; each ingress forwarding decision ("Choose Egress") consults the forwarding table]

Forwarding in an IP Router
• Lookup packet DA in forwarding table.
  – If known, forward to correct port.
  – If unknown, drop packet.
• Decrement TTL, update header Checksum.
• Forward packet to outgoing interface.
• Transmit packet onto link.
Question: How is the address looked up in a real router?

Making a Forwarding Decision
Class-based addressing
[Figure: the IP address space divided into Class A, Class B, Class C and D; the address 212.17.9.4 falls in Class C; routing table exact match on 212.17.9.0 gives Port 4]
• Exact Match: There are many well-known ways to find an exact match in a table.

Direct Lookup
[Figure: the IP address is used directly as the memory address; the memory returns next-hop and port]
• Problem: With 2^32 addresses, the memory would require 4 billion entries.

Associative Lookups
“Contents addressable memory” (CAM)
[Figure: associative memory or CAM; 32-bit search data is matched against network addresses and returns a port number and a Hit? signal]
• Search data is compared with every entry in parallel
Advantages:
• Simple
Disadvantages:
• Slow
• High Power
• Small
• Expensive

Hashed Lookups
[Figure: 32-bit search data passes through a hashing function to give a 16-bit address into a memory with log2 N address bits; the memory returns the associated data and a Hit? signal]

Lookups Using Hashing
An example
[Figure: 32-bit search data is hashed to 16 bits; each memory location holds a linked list of entries (#1, #2, #3, #4) with the same hash key; the list is searched for the associated data and a Hit? signal]

Lookups Using Hashing
Advantages:
• Simple
• Expected lookup time can be small
Disadvantages:
• Non-deterministic lookup time
• Inefficient use of memory

Trees and Tries
• Binary Search Tree: branches on < or > comparisons; N entries give depth log2 N.
• Binary Search Trie (“reTRIEval”): branches on successive address bits (0 or 1), e.g. leaves 010 and 111. Requires 32 memory references, regardless of number of addresses.

Search Tries
• Multiway tries reduce the number of memory references
[Figure: 16-ary Search Trie; each node holds entries from "0000, ptr" to "1111, ptr"; example leaves 000011110000 and 111111111111]
Question: Why not just keep increasing the degree of the trie?

Classless Addressing
CIDR
[Figure: the segments 128.9.19/24, 128.9.25/24, 128.9.16/20 and 128.9.176/20 nested inside 128.9/16 on the address line from 0 to 2^32 - 1; the address 128.9.16.14 is marked]
• Most specific route = “longest matching prefix”
Question: How can we look up addresses if they are not an exact match?

Ternary CAMs
Associative memory with a value and a mask per entry; a priority encoder selects the port of the first matching entry.

  Value        Mask               Port
  10.1.1.32    255.255.255.255    1
  10.1.1.0     255.255.255.0      2
  10.1.3.0     255.255.255.0      3
  10.1.0.0     255.255.0.0        4
  10.0.0.0     255.0.0.0          4

Note: Most specific routes appear closest to top of table

Longest prefix matches using Binary Tries
Example prefixes:
  a) 00001
  b) 00010
  c) 00011
  d) 001
  e) 0101
  f) 011
  g) 10
  h) 1010
  i) 111
  j) 111100
  k) 11110001
[Figure: binary trie with branches labelled 0 and 1 and the prefixes a to k marked at their nodes]

Lookup Performance Required

  Line     Line Rate   Pktsize=40B   Pktsize=240B
  T1       1.5Mbps     4.68 Kpps     0.78 Kpps
  OC3      155Mbps     480 Kpps      80 Kpps
  OC12     622Mbps     1.94 Mpps     323 Kpps
  OC48     2.5Gbps     7.81 Mpps     1.3 Mpps
  OC192    10 Gbps     31.25 Mpps    5.21 Mpps

Discussion
• Why was the Internet Protocol designed this way?
  - Why connectionless, datagram, best-effort?
  - Why not automatic retransmissions?
  - Why fragmentation in the network?
• Must the Internet address be hierarchical?
• What address does a mobile host have?
• Are there other ways to design networks?
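The figures in the Lookup Performance Required table follow from dividing the line rate by the packet size in bits: one lookup is needed per packet. The short sketch below recomputes a few entries; the function name `packets_per_second` is illustrative, and some entries in the table (e.g. OC3) appear to be rounded.

```python
def packets_per_second(line_rate_bps: float, packet_size_bytes: int) -> float:
    """Lookups per second needed to keep up with back-to-back packets of one size."""
    return line_rate_bps / (packet_size_bytes * 8)


if __name__ == "__main__":
    # (line name, line rate in bits per second) taken from the table
    lines = [("T1", 1.5e6), ("OC48", 2.5e9), ("OC192", 10e9)]
    for name, rate in lines:
        for size in (40, 240):
            pps = packets_per_second(rate, size)
            print(f"{name}: {size}B packets -> {pps:,.0f} packets/s")
```

At 10 Gbps with 40-byte packets this works out to one lookup every 32 ns, which puts the 32 memory references of a binary trie in perspective.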