* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download Chapter 4: Network Layer
Distributed firewall wikipedia , lookup
Asynchronous Transfer Mode wikipedia , lookup
Deep packet inspection wikipedia , lookup
Multiprotocol Label Switching wikipedia , lookup
IEEE 802.1aq wikipedia , lookup
Wake-on-LAN wikipedia , lookup
Piggybacking (Internet access) wikipedia , lookup
Computer network wikipedia , lookup
Network tap wikipedia , lookup
Internet protocol suite wikipedia , lookup
Cracking of wireless networks wikipedia , lookup
Zero-configuration networking wikipedia , lookup
List of wireless community networks by region wikipedia , lookup
Airborne Networking wikipedia , lookup
UniPro protocol stack wikipedia , lookup
Recursive InterNetwork Architecture (RINA) wikipedia , lookup
Chapter 4: Network Layer Chapter goals: understand principles behind network layer services: network layer service models forwarding versus routing how a router works routing (path selection) dealing with scale advanced topics: IPv6, mobility instantiation, implementation in the Internet Network Layer 4-1 Chapter 4: Network Layer 4. 1 Introduction 4.2 Virtual circuit and datagram networks 4.3 What’s inside a router 4.4 IP: Internet Protocol Datagram format IPv4 addressing ICMP IPv6 4.5 Routing algorithms Link state Distance Vector Hierarchical routing 4.6 Routing in the Internet RIP OSPF BGP 4.7 Broadcast and multicast routing Network Layer 4-2 Network layer transport segment from sending to receiving host on sending side encapsulates segments into datagrams on rcving side, delivers segments to transport layer network layer protocols in every host, router router examines header fields in all IP datagrams passing through it application transport network data link physical network data link physical network data link physical network data link physical network data link physical network data link physical network network data link data link physical physical network data link physical network data link physical network data link physical network data link physical Network Layer application transport network data link physical 4-3 Application Application protocol Application TCP TCP protocol TCP IP Data Link Host IP IP protocol Data Link Data Link IP IP protocol Data Link Router Data Link Data Link IP protocol Data Link Router Data Link IP Network Access Host Network Layer 4-4 Two Key Network-Layer Functions forwarding: move packets from router’s input to appropriate router output routing: determine route taken by packets from source to dest. analogy: routing: process of planning trip from source to dest forwarding: process of getting through single interchange routing algorithms Network Layer 4-5 Interplay between routing and forwarding routing algorithm local forwarding table header value output link 0100 0101 0111 1001 3 2 2 1 value in arriving packet’s header 0111 1 3 2 Network Layer 4-6 Connection setup 3rd important function in some network architectures: ATM, frame relay, X.25 before datagrams flow, two end hosts and intervening routers establish virtual connection routers get involved network vs transport layer connection service: network: between two hosts (may also involve intervening routers in case of VCs) transport: between two processes Network Layer 4-7 Network service model Q: What service model for “channel” transporting datagrams from sender to receiver? Example services for individual datagrams: guaranteed delivery guaranteed delivery with less than 40 msec delay Example services for a flow of datagrams: in-order datagram delivery guaranteed minimum bandwidth to flow restrictions on changes in interpacket spacing Network Layer 4-8 Network layer service models: Network Architecture Internet Service Model Guarantees ? Congestion Bandwidth Loss Order Timing feedback best effort none ATM CBR ATM VBR ATM ABR ATM UBR constant rate guaranteed rate guaranteed minimum none no no no yes yes yes yes yes yes no yes no no (inferred via loss) no congestion no congestion yes no yes no no Network Layer 4-9 Chapter 4: Network Layer 4. 1 Introduction 4.2 Virtual circuit and datagram networks 4.3 What’s inside a router 4.4 IP: Internet Protocol Datagram format IPv4 addressing ICMP IPv6 4.5 Routing algorithms Link state Distance Vector Hierarchical routing 4.6 Routing in the Internet RIP OSPF BGP 4.7 Broadcast and multicast routing Network Layer 4-10 Network layer connection and connection-less service datagram network provides network-layer connectionless service VC network provides network-layer connection service analogous to the transport-layer services, but: service: host-to-host no choice: network provides one or the other implementation: in network core Network Layer 4-11 Virtual circuits “source-to-dest path behaves much like telephone circuit” performance-wise network actions along source-to-dest path call setup, teardown for each call before data can flow each packet carries VC identifier (not destination host address) every router on source-dest path maintains “state” for each passing connection link, router resources (bandwidth, buffers) may be allocated to VC (dedicated resources = predictable service) Network Layer 4-12 VC implementation a VC consists of: 1. 2. 3. path from source to destination VC numbers, one number for each link along path entries in forwarding tables in routers along path packet belonging to VC carries VC number (rather than dest address) VC number can be changed on each link. New VC number comes from forwarding table Network Layer 4-13 Forwarding table VC number 22 12 1 Forwarding table in northwest router: Incoming interface 1 2 3 1 … 2 32 3 interface number Incoming VC # 12 63 7 97 … Outgoing interface 3 1 2 3 … Outgoing VC # 22 18 17 87 … Routers maintain connection state information! Network Layer 4-14 Virtual circuits: signaling protocols used to setup, maintain teardown VC used in ATM, frame-relay, X.25 not used in today’s Internet application transport 5. Data flow begins network 4. Call connected data link 1. Initiate call physical 6. Receive data application 3. Accept call transport 2. incoming call network data link physical Network Layer 4-15 Datagram networks no call setup at network layer routers: no state about end-to-end connections no network-level concept of “connection” packets forwarded using destination host address packets between same source-dest pair may take different paths application transport network data link 1. Send data physical application transport 2. Receive data network data link physical Network Layer 4-16 Forwarding table Destination Address Range 4 billion possible entries Link Interface 11001000 00010111 00010000 00000000 through 11001000 00010111 00010111 11111111 0 11001000 00010111 00011000 00000000 through 11001000 00010111 00011000 11111111 1 11001000 00010111 00011001 00000000 through 11001000 00010111 00011111 11111111 2 otherwise 3 Network Layer 4-17 Longest prefix matching Prefix Match 11001000 00010111 00010 11001000 00010111 00011000 11001000 00010111 00011 otherwise Link Interface 0 1 2 3 Examples DA: 11001000 00010111 00010110 10100001 Which interface? DA: 11001000 00010111 00011000 10101010 Which interface? Network Layer 4-18 Datagram or VC network: why? Internet (datagram) data exchange among ATM (VC) evolved from telephony computers human conversation: “elastic” service, no strict strict timing, reliability timing req. requirements “smart” end systems need for guaranteed (computers) service can adapt, perform “dumb” end systems control, error recovery telephones simple inside network, complexity inside complexity at “edge” network many link types different characteristics uniform service difficult Network Layer 4-19 Chapter 4: Network Layer 4. 1 Introduction 4.2 Virtual circuit and datagram networks 4.3 What’s inside a router 4.4 IP: Internet Protocol Datagram format IPv4 addressing ICMP IPv6 4.5 Routing algorithms Link state Distance Vector Hierarchical routing 4.6 Routing in the Internet RIP OSPF BGP 4.7 Broadcast and multicast routing Network Layer 4-20 Router Architecture Overview Two key router functions: run routing algorithms/protocol (RIP, OSPF, BGP) forwarding datagrams from incoming to outgoing link Network Layer 4-21 Input Port Functions Physical layer: bit-level reception Data link layer: e.g., Ethernet see chapter 5 Decentralized switching: given datagram dest., lookup output port using forwarding table in input port memory goal: complete input port processing at ‘line speed’ queuing: if datagrams arrive faster than forwarding rate into switch fabric Network Layer 4-22 Three types of switching fabrics Network Layer 4-23 Switching Via Memory First generation routers: traditional computers with switching under direct control of CPU packet copied to system’s memory speed limited by memory bandwidth (2 bus crossings per datagram) Input Port Memory Output Port System Bus Network Layer 4-24 Switching Via a Bus datagram from input port memory to output port memory via a shared bus bus contention: switching speed limited by bus bandwidth 32 Gbps bus, Cisco 5600: sufficient speed for access and enterprise routers Network Layer 4-25 Switching Via An Interconnection Network overcome bus bandwidth limitations Banyan networks, other interconnection nets initially developed to connect processors in multiprocessor advanced design: fragmenting datagram into fixed length cells, switch cells through the fabric. Cisco 12000: switches 60 Gbps through the interconnection network Network Layer 4-26 Output Ports Buffering required when datagrams arrive from fabric faster than the transmission rate Scheduling discipline chooses among queued datagrams for transmission Network Layer 4-27 Output port queueing buffering when arrival rate via switch exceeds output line speed queueing (delay) and loss due to output port buffer overflow! Network Layer 4-28 How much buffering? RFC 3439 rule of thumb: average buffering equal to “typical” RTT (say 250 msec) times link capacity C e.g., C = 10 Gps link: 2.5 Gbit buffer Recent recommendation: with N flows, buffering equal to RTT. C N Scheduling: QoS AQM, RED Network Layer 4-29 Input Port Queuing Fabric slower than input ports combined -> queueing may occur at input queues Head-of-the-Line (HOL) blocking: queued datagram at front of queue prevents others in queue from moving forward queueing delay and loss due to input buffer overflow! Network Layer 4-30 Scheduling Policies scheduling: choose next packet to send on link FIFO (first in first out) scheduling: send in order of arrival to queue real-world example? discard policy: if packet arrives to full queue: who to discard? • Tail drop: drop arriving packet • priority: drop/remove on priority basis • random: drop/remove randomly Network Layer 4-31 Scheduling Policies: more Priority scheduling: transmit highest priority queued packet multiple classes, with different priorities class may depend on marking or other header info, e.g. IP source/dest, port numbers, etc.. Real world example? Network Layer 4-32 Scheduling Policies: still more round robin scheduling: multiple classes cyclically scan class queues, serving one from each class (if available) real world example? Scheduling Policies: still more Weighted Fair Queuing: generalized Round Robin each class gets weighted amount of service in each cycle real-world example? Network Layer 4-34 Chapter 4: Network Layer 4. 1 Introduction 4.2 Virtual circuit and datagram networks 4.3 What’s inside a router 4.4 IP: Internet Protocol Datagram format IPv4 addressing ICMP IPv6 4.5 Routing algorithms Link state Distance Vector Hierarchical routing 4.6 Routing in the Internet RIP OSPF BGP 4.7 Broadcast and multicast routing Network Layer 4-35 The Internet Network layer Host, router network layer functions: Routing protocols •path selection •RIP, OSPF, BGP Network layer Transport layer: TCP, UDP IP protocol •addressing conventions •datagram format •packet handling conventions forwarding table ICMP protocol •error reporting •router “signaling” Link layer physical layer Network Layer 4-36 Chapter 4: Network Layer 4. 1 Introduction 4.2 Virtual circuit and datagram networks 4.3 What’s inside a router 4.4 IP: Internet Protocol Datagram format IPv4 addressing ICMP IPv6 4.5 Routing algorithms Link state Distance Vector Hierarchical routing 4.6 Routing in the Internet RIP OSPF BGP 4.7 Broadcast and multicast routing Network Layer 4-37 IP datagram format IP protocol version number header length (bytes) “type” of data max number remaining hops (decremented at each router) upper layer protocol to deliver payload to how much overhead with TCP? 20 bytes of TCP 20 bytes of IP = 40 bytes + app layer overhead 32 bits type of ver head. len service length fragment 16-bit identifier flgs offset upper time to header layer live checksum total datagram length (bytes) for fragmentation/ reassembly 32 bit source IP address 32 bit destination IP address Options (if any) data (variable length, typically a TCP or UDP segment) E.g. timestamp, record route taken, specify list of routers to visit. Network Layer 4-38 IP datagram format Don’t Fragment header. The IPv4 (Internet Protocol) 20 bytes ≤ Header Length < (24 – 1) x 4 bytes = 60 bytes 20 bytes ≤ Total Length < 216 – 1 bytes = 65535 bytes Network Layer 4-39 IP datagram format Some of the IP options. 5-54 Network Layer 4-40 IP Fragmentation & Reassembly network links have MTU (max. transmission size) - largest possible link-level frame. different link types, different MTUs large IP datagram divided (“fragmented”) within net one datagram becomes several datagrams “reassembled” only at final destination IP header bits used to identify, order related fragments fragmentation: in: one large datagram out: 3 smaller datagrams reassembly Network Layer 4-41 IP Fragmentation con. IP分组头中与分段有关的域(红色): version header length Identification time-to-live (TTL) Total Length (in bytes) TOS protocol 0 DM F F Fragment Offset header checksum Total Length: 本分段的长度。 Identification: 属于同一分组的分段有相同的Identification。 DF: 如果置1,则不允许分段。如果分组长度大于MTU, 则抛弃。 MF: 如果置1,说明还有跟在后面的分段。 Fragment Offset: 本分段中的数据相对于分段前的分组中数据的 位移,单位是8个字节。 IP Fragmentation & Reassembly Example 4000 byte datagram MTU = 1500 bytes 1480 bytes in data field offset = 1480/8 length ID fragflag offset =4000 =x =0 =0 One large datagram becomes several smaller datagrams length ID fragflag offset =1500 =x =1 =0 length ID fragflag offset =1500 =x =1 =185 length ID fragflag offset =1040 =x =0 =370 Network Layer 4-43 IP Fragmentation con. 举例,请注意 Fragment Offset 的单位是8个字节。 Header length: 20 Total length: 3620 Identification: 0xa428 DF flag: 0 MF flag: 0 Fragment offset: 0 Header length: 20 Total length: 660 Identification: 0xa428 DF flag: 0 MF flag: 0 Fragment offset: 370 IP datagram Header length: 20 Header length: 20 Total length: 1500 Total length: 1500 Identification: 0xa428 Identification: 0xa428 DF flag: 0 DF flag: 0 MF flag: 1 MF flag: 1 Fragment offset: 185 fragment offset: 0 Fragment 3 MTU: 4000 MTU: 1500 Router Fragment 2 Fragment 1 IP Fragmentation con. 发送主机或路由器都可能对IP分组进行分段。 FDDI Ring Host A Ethernet Router MTU: 4352 一个IP分组可能被多次分段。 但重组 (reassembly)仅在目的主机进行。 Host B MTU: 1500 IP Fragmentation & Reassembly Costs of fragmentation? Network Layer 4-46 Chapter 4: Network Layer 4. 1 Introduction 4.2 Virtual circuit and datagram networks 4.3 What’s inside a router 4.4 IP: Internet Protocol Datagram format IPv4 addressing ICMP IPv6 4.5 Routing algorithms Link state Distance Vector Hierarchical routing 4.6 Routing in the Internet RIP OSPF BGP 4.7 Broadcast and multicast routing Network Layer 4-47 Design Principles for Internet Make sure it works. Keep it simple. Make clear choices. Exploit modularity. Expect heterogeneity. Avoid static options and parameters. Look for a good design; it need not be perfect. Be strict when sending and tolerant when receiving. Think about scalability. Consider performance and cost. Network Layer 4-48 Collection of Subnetworks The Internet is an interconnected collection of many networks. Network Layer 4-49 IP Address 10000000 1st Byte = 128 10001111 2nd Byte = 143 10001001 3rd Byte = 137 10010000 4th Byte = 144 128.143.137.144 Network Layer 4-50 IP Addressing: introduction IP address: 32-bit identifier for host, router interface interface: connection between host/router and physical link router’s typically have multiple interfaces host typically has one interface IP addresses associated with each interface 223.1.1.1 223.1.2.1 223.1.1.2 223.1.1.4 223.1.1.3 223.1.2.9 223.1.3.27 223.1.2.2 223.1.3.2 223.1.3.1 223.1.1.1 = 11011111 00000001 00000001 00000001 223 1 1 1 Network Layer 4-51 Subnets IP address: subnet part (high order bits) host part (low order bits) What’s a subnet ? device interfaces with same subnet part of IP address can physically reach each other without intervening router 223.1.1.1 223.1.2.1 223.1.1.2 223.1.1.4 223.1.1.3 223.1.2.9 223.1.3.27 223.1.2.2 subnet 223.1.3.1 223.1.3.2 network consisting of 3 subnets Network Layer 4-52 Subnets A campus network consisting of LANs for various departments. Network Layer 4-53 Subnets Recipe To determine the subnets, detach each interface from its host or router, creating islands of isolated networks. Each isolated network is called a subnet. 223.1.1.0/24 223.1.2.0/24 223.1.3.0/24 Subnet mask: /24 Network Layer 4-54 Subnets 223.1.1.2 How many? 223.1.1.1 223.1.1.4 223.1.1.3 223.1.9.2 223.1.7.0 223.1.9.1 223.1.7.1 223.1.8.1 223.1.8.0 223.1.2.6 223.1.2.1 223.1.3.27 223.1.2.2 223.1.3.1 223.1.3.2 Network Layer 4-55 IP addressing: CIDR CIDR: Classless InterDomain Routing subnet portion of address of arbitrary length: VLSM (Variable Length Subnet Masking ) address format: a.b.c.d/x, where x is # bits in subnet portion of address subnet part host part 11001000 00010111 00010000 00000000 200.23.16.0/23 Network Layer 4-56 IP addressing: CIDR A set of IP address assignments. 5-59 Network Layer 4-57 e.g. 128.14.32.0/20 (212 addrs) “min” The same 20-bit prefix “max” 10000000 00001110 00100000 00000000 10000000 00001110 00100000 00000001 10000000 00001110 00100000 00000010 10000000 00001110 00100000 00000011 10000000 00001110 00100000 00000100 10000000 00001110 00100000 00000101 10000000 00001110 00101111 11111011 10000000 00001110 00101111 11111100 10000000 00001110 00101111 11111101 10000000 00001110 00101111 11111110 10000000 00001110 00101111 11111111 VLSM # of IP addrs w.r.t. length of network mask: mask length /27 /26 /25 /24 /23 /22 /21 /20 /19 /18 /17 /16 /15 /14 /13 # of addrs 32 64 128 256 512 1,024 2,048 4,096 8,192 16,384 32,768 65,536 131,072 262,144 524,288 Network Layer 4-59 Classful addressing Network Layer 4-60 Classful addressing Network Layer 4-61 Special IP Network Layer 4-62 Special IP broadcast IP: • The subnet broadcast 202.112.41.224 202.112.41.255 hosts in the same subnet receive it. • Limited broadcast 255.255.255.255, all hosts receive it. multicast IP: • D, 224.0.0.0 – 239.255.255.255 Network Layer 4-63 Special IP “0”: • Host + network • Host =0, network address; • Network =0, host address; loopback: • 127.0.0.0 ~ 127.255.255.255; 127.0.0.1 Network Layer 4-64 Special IP Private IP 10.0.0.0 172.16.0.0 192.168.0.0 - 10.255.255.255 - 172.31.255.255 - 192.168.255.255 Used in LAN NAT: Network Address Translation Network Layer 4-65 IP address network address host address How to identify? Explicit: VLSM Implicit: Classful Network Layer 4-66 Subnets network addr network addr host addr subnet addr host addr A class B network subnetted into 64 subnets VLSM Network Layer 4-67 Why subnets? DUT College2 College1 library Network Layer 4-68 Subnets: one more example 145.13.3.11 … 145.13.3.10 R2 145.13.3.101 145.13.7.34 145.13.7.35 subnet 145.13.3.0 R3 R1 … subnet 145.13.7.0 145.13.7.56 subnet 145.13.21.0 … 145.13.21.23 145.13.21.9 145.13.21.8 network 145.13.0.0 Problems mask, network addr, host addr? e.g. 224.221.121.19/12 IP addr classification? e.g. 125.2.156.7 IP addr class choosing: A, B, C? e.g. 3470 hosts/interfaces length of subnet/net/mask addr? e.g. 578 hosts/interfaces Network Layer 4-70 Problems Subnet mask is 255.255.255.224 Host IP: 202.112.41.241 Problems: 1, network address and host address? 2, how many subnets are created at most? Problems 224 1111…..11100000 (x=27) 202.112.41.241: 24111110001 =17 net: 202.112.41.224 host: 0.0.0.17; 255.255.255.2241111…..11100000 Eight subnets: 202.112.41.0; 202.112.41.32; 202.112.41.64; 202.112.41.96; 202.112.41.128; 202.112.41.160; 202.112.41.192; 202.112.41.224; IP addresses: how to get one? Q: How does a host get IP address? hard-coded by system admin in a file Windows: control-panel->network->configuration>tcp/ip->properties UNIX: /etc/rc.config DHCP: Dynamic Host Configuration Protocol: dynamically get address from as server “plug-and-play” Network Layer 4-73 DHCP: Dynamic Host Configuration Protocol Goal: allow host to dynamically obtain its IP address from network server when it joins network Can renew its lease on address in use Allows reuse of addresses (only hold address while connected an “on”) Support for mobile users who want to join network (more shortly) DHCP overview: host broadcasts “DHCP discover” msg DHCP server responds with “DHCP offer” msg host requests IP address: “DHCP request” msg DHCP server sends address: “DHCP ack” msg Network Layer 4-74 DHCP client-server scenario A B 223.1.2.1 DHCP server 223.1.1.1 223.1.1.2 223.1.1.4 223.1.2.9 223.1.2.2 223.1.1.3 223.1.3.1 223.1.3.27 223.1.3.2 E arriving DHCP client needs address in this network Network Layer 4-75 DHCP client-server scenario DHCP server: 223.1.2.5 DHCP discover arriving client src : 0.0.0.0, 68 dest.: 255.255.255.255,67 yiaddr: 0.0.0.0 transaction ID: 654 DHCP offer src: 223.1.2.5, 67 dest: 255.255.255.255, 68 yiaddrr: 223.1.2.4 transaction ID: 654 Lifetime: 3600 secs DHCP request time src: 0.0.0.0, 68 dest:: 255.255.255.255, 67 yiaddrr: 223.1.2.4 transaction ID: 655 Lifetime: 3600 secs DHCP ACK src: 223.1.2.5, 67 dest: 255.255.255.255, 68 yiaddrr: 223.1.2.4 transaction ID: 655 Lifetime: 3600 secs Network Layer 4-76 IP addresses: how to get one? Q: How does network get subnet part of IP addr? A: gets allocated portion of its provider ISP’s address space ISP's block 11001000 00010111 00010000 00000000 200.23.16.0/20 Organization 0 Organization 1 Organization 2 ... 11001000 00010111 00010000 00000000 11001000 00010111 00010010 00000000 11001000 00010111 00010100 00000000 ….. …. 200.23.16.0/23 200.23.18.0/23 200.23.20.0/23 …. Organization 7 11001000 00010111 00011110 00000000 200.23.30.0/23 Network Layer 4-77 IP addressing: the last word... Q: How does an ISP get block of addresses? A: ICANN: Internet Corporation for Assigned Names and Numbers allocates addresses manages DNS assigns domain names, resolves disputes Network Layer 4-78 IP addresses are exhausted Solutions? VLSM/CIDR IPv4 NAT IPv6 Network Layer 4-79 NAT: Network Address Translation rest of Internet local network (e.g., home network) 10.0.0/24 10.0.0.4 10.0.0.1 10.0.0.2 138.76.29.7 10.0.0.3 All datagrams leaving local network have same single source NAT IP address: 138.76.29.7, different source port numbers Datagrams with source or destination in this network have 10.0.0/24 address for source, destination (as usual) Network Layer 4-80 NAT: Network Address Translation Motivation: local network uses just one IP address as far as outside world is concerned: range of addresses not needed from ISP: just one IP address for all devices can change addresses of devices in local network without notifying outside world can change ISP without changing addresses of devices in local network devices inside local net not explicitly addressable, visible by outside world (a security plus). Network Layer 4-81 NAT: Network Address Translation Implementation: NAT router must: outgoing datagrams: replace (source IP address, port #) of every outgoing datagram to (NAT IP address, new port #) . . . remote clients/servers will respond using (NAT IP address, new port #) as destination addr. remember (in NAT translation table) every (source IP address, port #) to (NAT IP address, new port #) translation pair incoming datagrams: replace (NAT IP address, new port #) in dest fields of every incoming datagram with corresponding (source IP address, port #) stored in NAT table Network Layer 4-82 NAT: Network Address Translation 2: NAT router changes datagram source addr from 10.0.0.1, 3345 to 138.76.29.7, 5001, updates table 2 NAT translation table WAN side addr LAN side addr 1: host 10.0.0.1 sends datagram to 128.119.40.186, 80 138.76.29.7, 5001 10.0.0.1, 3345 …… …… S: 10.0.0.1, 3345 D: 128.119.40.186, 80 S: 138.76.29.7, 5001 D: 128.119.40.186, 80 138.76.29.7 S: 128.119.40.186, 80 D: 138.76.29.7, 5001 3: Reply arrives dest. address: 138.76.29.7, 5001 3 1 10.0.0.4 S: 128.119.40.186, 80 D: 10.0.0.1, 3345 10.0.0.1 10.0.0.2 4 10.0.0.3 4: NAT router changes datagram dest addr from 138.76.29.7, 5001 to 10.0.0.1, 3345 Network Layer 4-83 NAT: Network Address Translation 16-bit port-number field: 60,000 simultaneous connections with a single LAN-side address! NAT is controversial: port # are for processes, not hosts routers should only process up to layer 3 violates end-to-end argument • NAT possibility must be taken into account by app designers, eg, P2P applications address shortage should instead be solved by IPv6 Network Layer 4-84 NAT traversal problem client wants to connect to server with address 10.0.0.1 server address 10.0.0.1 local Client to LAN (client can’t use it as destination addr) only one externally visible NATted address: 138.76.29.7 solution 1: statically configure NAT to forward incoming connection requests at given port to server 10.0.0.1 ? 138.76.29.7 10.0.0.4 NAT router e.g., (123.76.29.7, port 2500) always forwarded to 10.0.0.1 port 25000 Network Layer 4-85 NAT traversal problem solution 2: Universal Plug and Play (UPnP) Internet Gateway Device (IGD) Protocol. Allows NATted host to: learn public IP address (138.76.29.7) add/remove port mappings (with lease times) 10.0.0.1 IGD 10.0.0.4 138.76.29.7 NAT router i.e., automate static NAT port map configuration Network Layer 4-86 NAT traversal problem solution 3: relaying (used in Skype) NATed client establishes connection to relay External client connects to relay relay bridges packets between to connections 2. connection to relay initiated by client Client 3. relaying established 1. connection to relay initiated by NATted host 138.76.29.7 10.0.0.1 NAT router Network Layer 4-87 Chapter 4: Network Layer 4. 1 Introduction 4.2 Virtual circuit and datagram networks 4.3 What’s inside a router 4.4 IP: Internet Protocol Datagram format IPv4 addressing ICMP IPv6 4.5 Routing algorithms Link state Distance Vector Hierarchical routing 4.6 Routing in the Internet RIP OSPF BGP 4.7 Broadcast and multicast routing Network Layer 4-88 ICMP: Internet Control Message Protocol used by hosts & routers to communicate network-level information error reporting: unreachable host, network, port, protocol echo request/reply (used by ping) network-layer “above” IP: ICMP msgs carried in IP datagrams ICMP message: type, code plus first 8 bytes of IP datagram causing error Type 0 3 3 3 3 3 3 4 Code 0 0 1 2 3 6 7 0 8 9 10 11 12 0 0 0 0 0 description echo reply (ping) dest. network unreachable dest host unreachable dest protocol unreachable dest port unreachable dest network unknown dest host unknown source quench (congestion control - not used) echo request (ping) route advertisement router discovery TTL expired bad IP header Network Layer 4-89 Traceroute and ICMP Source sends series of UDP segments to dest First has TTL =1 Second has TTL=2, etc. Unlikely port number When nth datagram arrives to nth router: Router discards datagram And sends to source an ICMP message (type 11, code 0) Message includes name of router& IP address When ICMP message arrives, source calculates RTT Traceroute does this 3 times Stopping criterion UDP segment eventually arrives at destination host Destination returns ICMP “host unreachable” packet (type 3, code 3) When source gets this ICMP, stops. Network Layer 4-90 Chapter 4: Network Layer 4. 1 Introduction 4.2 Virtual circuit and datagram networks 4.3 What’s inside a router 4.4 IP: Internet Protocol Datagram format IPv4 addressing ICMP IPv6 4.5 Routing algorithms Link state Distance Vector Hierarchical routing 4.6 Routing in the Internet RIP OSPF BGP 4.7 Broadcast and multicast routing Network Layer 4-91 IPv6 Initial motivation: 32-bit address space soon to be completely allocated. Solutions? Additional motivation: header format helps speed processing/forwarding header changes to facilitate QoS IPv6 datagram format: fixed-length 40 byte header no fragmentation allowed Network Layer 4-92 IPv6 – overview IPv6 (IP Version 6), 又称为 IPng (IP Next Generation),是 IPv4的下一个版本。 IPv6的标准文档是RFC 2460 RFC 2461 – RFC 2466 讨论更多的有关IPv6的细节 IPv6 – overview IPv6 把IP地址的长度增加到了16个字节 IPv6简化了IP分组的首部格式 IPv6增强了对进一步扩展的支持 IPv6增强了对QoS (Quality of Service)的支持 IPv6增强了对安全的支持 IPv6增加了对 Anycast 通信方式的支持 IPv6的基本首部 IPv6 仍支持无连接的传送所引进的主要变化如下 更大的地址空间。IPv6 将地址从 IPv4 的 32 位 增 大到了 128 位。 地址空间>3.4*1038.如果整个地球覆盖计算机,IPv6允许 每平方米拥有7*1023个IP地址。 如果每微妙分配100万个地址,需要1019年的时间才能将所 有地址分配完毕。 “地球上的每一粒沙子都可以有一个IP地址”。 IPv6的基本首部 IPv6 仍支持无连接的传送所引进的主要变化如下 更大的地址空间。IPv6 将地址从 IPv4 的 32 位 增 大到了 128 位。 扩展的地址层次结构。 灵活的首部格式。 改进的选项。 允许协议继续扩充。 支持即插即用(即自动配置) 支持资源的预分配。 IPv6数据报的首部 IPv6 将首部长度变为固定的 40 字节,称为基 本首部(base header)。 将不必要的功能取消了,首部的字段数减少到 只有 8 个。 取消了首部的检验和字段,加快了路由器处理 数据报的速度。 在基本首部的后面允许有零个或多个扩展首部 。 所有的扩展首部和数据合起来叫做数据报的有 效载荷(payload)或净负荷。 IPv6数据报的一般形式 有效载荷 选项 基本 首部 扩展 首部 1 … 扩展 首部 N IPv6 数据报 数 据 部 分 位 0 4 版本 12 通信量类 IPv6 的 有效载荷 (至 64 KB) 24 流 有 效 载 荷 长 度 IPv6 的 基本首部 (40 B) 16 标 31 号 下一个首部 源 地 址 (128 位) 目 的 地 址 (128 位) 有效载荷(扩展首部 / 数据) 跳数限制 位 0 4 版本 12 通信量类 IPv6 的 有效载荷 (至 64 KB) 24 流 有 效 载 荷 长 度 IPv6 的 基本首部 (40 B) 16 标 31 号 下一个首部 源 地 址 (128 位) 目 的 地 址 (128 位t) 扩展首部 / 数据 有效载荷(扩展首部 / 数据) 跳数限制 位 0 4 版本 12 流量类别 24 流 有 效 载 荷 长 度 IPv6 的 基 本 首 部 40 B 16 标 31 号 下一个首部 跳数限制 源 地 址 (128 位) 目 的 地 址 (128 位) 版本(version)—— 4 位。它指明了协议的版本,对 IPv6 该字段总是 6。 位 0 4 版本 12 24 流 流量类别 有 效 载 荷 长 度 IPv6 的 基 本 首 部 40 B 16 标 31 号 下一个首部 跳数限制 源 地 址 (128 位) 目 的 地 址 (128 位) 流量类别(traffic class)—— 8 位。这是为了区分不同的 IPv6 数据报的类别或优先级。目前正在进行不同的流量 类别性能的实验。 位 0 4 版本 12 流量类别 24 流 有 效 载 荷 长 度 IPv6 的 基 本 首 部 40 B 16 标 下一个首部 31 号 跳数限制 源 地 址 (128 位) 目 的 地 址 (128 位) 流标号(flow label)—— 20 位。 “流”是互联网络上从特定源 点到特定终点的一系列数据报, “流”所经过的路径上的路由 器都保证指明的服务质量。 所有属于同一个流的数据报都具有同样的流标号。 位 0 4 版本 12 16 流量类别 流 有 效 载 荷 长 度 IPv6 的 基 本 首 部 40 B 24 标 31 号 下一个首部 跳数限制 源 地 址 (128 位) 目 的 地 址 (128 位) 有效载荷长度(payload length)—— 16 位。它指明 IPv6 数据报 除基本首部以外的字节数(所有扩展首部都算在有效载荷之内 ),其最大值是 64 KB。 位 0 4 版本 12 流量类别 24 流 有 效 载 荷 长 度 IPv6 的 基 本 首 部 40 B 16 标 31 号 下一个首部 跳数限制 源 地 址 (128 位) 目 的 地 址 (128 位) 下一个首部(next header)—— 8 位。它相当于 IPv4 的协议字段 或可选字段。 位 0 4 版本 12 流量类别 24 流 有 效 载 荷 长 度 IPv6 的 基 本 首 部 40 B 16 标 31 号 下一个首部 跳数限制 源 地 址 (128 位) 目 的 地 址 (128 位) 跳数限制(hop limit)—— 8 位。源站在数据报发出时即设定跳数 限制。路由器在转发数据报时将跳数限制字段中的值减1。 当跳数限制的值为零时,就要将此数据报丢弃。 位 0 4 版本 12 流量类别 24 流 有 效 载 荷 长 度 IPv6 的 基 本 首 部 40 B 16 标 31 号 下一个首部 跳数限制 源 地 址 (128 位) 目 的 地 址 (128 位) 源地址—— 128 位。是数据报的发送站的 IP 地址。 位 0 4 版本 12 16 流量类别 流 有 效 载 荷 长 度 IPv6 的 基 本 首 部 40 B 24 标 31 号 下一个首部 跳数限制 源 地 址 (128 位) 目 的 地 址 (128 位) 目的地址—— 128 位。是数据报的接收站的 IP 地址。 IPv6的扩展首部 IPv6 把原来 IPv4 首部中选项的功能都放在扩 展首部中,并将扩展首部留给路径两端的源站 和目的站的主机来处理。 数据报途中经过的路由器都不处理这些扩展首 部(只有一个首部例外,即逐跳选项扩展首部 )。 这样就大大提高了路由器的处理效率。 六种扩展首部 在 RFC 2460 中定义了六种扩展首部: 逐跳选项 路由选择 分片 鉴别 封装安全有效载荷 目的站选项 IPv6的扩展首部 无扩展首部 基本首部 下一个首部 = TCP/UDP TCP/UDP 首部 和数据 (TCP/UDP 报文段) 有效载荷 有扩展首部 基本首部 路由选择首部 分片首部 下一个首部 = 路由选择 下一个首部 = 分片 下一个首部 = TCP/UDP 有效载荷 TCP/UDP 首部 和数据 (TCP/UDP 报文段) IPv6的地址空间 IPv6 数据报的目的地址可以是以下三种基本类型 地址之一: (1) 单播(unicast) 单播就是传统的点对点通信。 (2) 组播(multicast) 组播是一点对多点的通信。 (3) 任播(anycast) 这是 IPv6 增加的一种类型。 任播的目的站是一组计算机,但数据报在交付时 只交付其中的一个,通常是距离最近的一个。 冒号十六进制记法 (colon hexadecimal notation) 每个 16 位的值用十六进制值表示,各值之间用 冒号分隔。 68E6:8C64:FFFF:FFFF:0:1180:960A:FFFF 零压缩(zero compression),即一连串连续的零 可以为一对冒号所取代。 FF05:0:0:0:0:0:0:B3 可以写成: FF05::B3 点分十进制记法的后缀 0:0:0:0:0:0:128.10.2.1 再使用零压缩即可得出: ::128.10.2.1 CIDR 的斜线表示法仍然可用。 60 位的前缀 12AB00000000CD3 可记为: 12AB:0000:0000:CD30:0000:0000:0000:0000/ 60 或12AB::CD30:0:0:0:0/60 或12AB:0:0:CD30::/60 前缀为0000 0000的地址 前缀为 0000 0000 是保留一小部分地址与 IPv4 兼容的,这是因为必须要考虑到在比较长的时期 IPv4和 IPv6 将会同时存在,而有的结点不支持 IPv6。“IPv4映射的IPv6地址” 因此数据报在这两类结点之间转发时,就必须进 行地址的转换。 80 位 IPv4 映射的 0000..................0000 IPv6 地址 16 位 32 位 FFFF IPv4 地址 Changes from IPv4 Header Length 使用 Fixed Header : 40 bytes 3 Flag Bits (0, DF, MF), Fragment Offset 分段有关信息被移动到扩展首部 Checksum: removed entirely to reduce processing time at each hop Options: allowed, but outside of header, indicated by “Next Header” field 改变为扩展首部 Network Layer 4-116 ICMPv6 ICMPv6 的报文格式和 IPv4 使用的 ICMP 的相 似,即前 4 个字节的字段名称都是一样的。但 ICMPv6 将第 5 个字节起的后面部分作为报文主 体。 additional message types, e.g. “Packet Too Big” multicast group management functions ICMPv6 的报文划分为四大类 差错报告报文 提供信息的报文 多播听众发现报文 邻站发现报文 Transition From IPv4 To IPv6 Not all routers can be upgraded simultaneous no “flag days” How will the network operate with mixed IPv4 and IPv6 routers? Network Layer 4-118 IPv6 – Transition RFC 2893 中提出了两种由IPv4向IPv6转变的 方法: Dual IP Layer (又称Dual Stack,双协议栈): 在主机 和路由器上同时实现IPv4和IPv6两种协议. Tunneling (隧道技术): 把IPv6分组封装在IPv4分组 中传送。 IPv6 – Dual IP Layer 应用程序 TCP/UDP协议 IPv6协议 IPv4协议 数据链路层 物理层 Dual Stack (双协议栈): 如果一台主机同时支持IPv6和IPv4两种协议,那么该 主机既能与支持IPv4协议的主机通信,又能与支持IPv6协 议的主机通信,这就是双协议栈技术的工作机理。 Tunneling Logical view: Physical view: E F IPv6 IPv6 IPv6 A B E F IPv6 IPv6 IPv6 IPv6 A B IPv6 tunnel IPv4 IPv4 Network Layer 4-121 Tunneling Logical view: Physical view: A B IPv6 IPv6 A B C IPv6 IPv6 IPv4 Flow: X Src: A Dest: F data A-to-B: IPv6 E F IPv6 IPv6 D E F IPv4 IPv6 IPv6 tunnel Src:B Dest: E Src:B Dest: E Flow: X Src: A Dest: F Flow: X Src: A Dest: F data data B-to-C: IPv6 inside IPv4 B-to-C: IPv6 inside IPv4 Flow: X Src: A Dest: F data E-to-F: IPv6 Network Layer 4-122 Chapter 4: Network Layer 4. 1 Introduction 4.2 Virtual circuit and datagram networks 4.3 What’s inside a router 4.4 IP: Internet Protocol Datagram format IPv4 addressing ICMP IPv6 4.5 Routing algorithms Link state Distance Vector Hierarchical routing 4.6 Routing in the Internet RIP OSPF BGP 4.7 Broadcast and multicast routing Network Layer 4-123 Interplay between routing, forwarding routing algorithm local forwarding table header value output link 0100 0101 0111 1001 3 2 2 1 value in arriving packet’s header 0111 1 3 2 Network Layer 4-124 Graph abstraction 5 2 u 2 1 Graph: G = (N,E) v x 3 w 3 1 5 z 1 y 2 N = set of routers = { u, v, w, x, y, z } E = set of links ={ (u,v), (u,x), (v,x), (v,w), (x,w), (x,y), (w,y), (w,z), (y,z) } Remark: Graph abstraction is useful in other network contexts Example: P2P, where N is set of peers and E is set of TCP connections Network Layer 4-125 Graph abstraction: costs 5 2 u v 2 1 x • c(x,x’) = cost of link (x,x’) 3 w 3 1 5 z 1 y - e.g., c(w,z) = 5 2 • cost could always be 1, or inversely related to bandwidth, or inversely related to congestion Cost of path (x1, x2, x3,…, xp) = c(x1,x2) + c(x2,x3) + … + c(xp-1,xp) Question: What’s the least-cost path between u and z ? Routing algorithm: algorithm that finds least-cost path Network Layer 4-126 Routing Algorithm classification Global or decentralized information? Global: all routers have complete topology, link cost info “link state” algorithms Decentralized: router knows physicallyconnected neighbors, link costs to neighbors iterative process of computation, exchange of info with neighbors “distance vector” algorithms Static or dynamic? Static: routes change slowly over time Dynamic: routes change more quickly periodic update in response to link cost changes Network Layer 4-127 Chapter 4: Network Layer 4. 1 Introduction 4.2 Virtual circuit and datagram networks 4.3 What’s inside a router 4.4 IP: Internet Protocol Datagram format IPv4 addressing ICMP IPv6 4.5 Routing algorithms Link state Distance Vector Hierarchical routing 4.6 Routing in the Internet RIP OSPF BGP 4.7 Broadcast and multicast routing Network Layer 4-128 A Link-State Routing Algorithm Dijkstra’s algorithm net topology, link costs known to all nodes accomplished via “link state broadcast” all nodes have same info computes least cost paths from one node (‘source”) to all other nodes gives forwarding table for that node iterative: after k iterations, know least cost path to k dest.’s Notation: c(x,y): link cost from node x to y; = ∞ if not direct neighbors D(v): current value of cost of path from source to dest. v p(v): predecessor node along path from source to v N': set of nodes whose least cost path definitively known Network Layer 4-129 Dijsktra’s Algorithm 1 Initialization: 2 N' = {u} 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(u,v) 6 else D(v) = ∞ 7 8 Loop 9 find w not in N' such that D(w) is a minimum 10 add w to N' 11 update D(v) for all v adjacent to w and not in N' : 12 D(v) = min( D(v), D(w) + c(w,v) ) 13 /* new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v */ 15 until all nodes in N' Network Layer 4-130 Dijkstra’s algorithm: example Step 0 1 2 3 4 5 N' u ux uxy uxyv uxyvw uxyvwz D(v),p(v) D(w),p(w) 2,u 5,u 2,u 4,x 2,u 3,y 3,y D(x),p(x) 1,u D(y),p(y) ∞ 2,x D(z),p(z) ∞ ∞ 4,y 4,y 4,y 5 2 u v 2 1 x 3 w 3 1 5 z 1 y 2 Network Layer 4-131 Dijkstra’s algorithm: example (2) Resulting shortest-path tree from u: v w u z x y Resulting forwarding table in u: destination link v x (u,v) (u,x) y (u,x) w (u,x) z (u,x) Network Layer 4-132 Dijkstra algorithm illustration [2 / s] 5 [5 / s] a b 3 2 4 [0 / s] 3 2 s 1 e 2 1 1 c [1 / s] d [∞ / ] [∞ / ] Dijkstra algorithm illustration [2 / s] 5 [4 / c] a b 3 2 4 [0 / s] 3 2 s 1 e 2 1 1 c d [1 / s] [2 / c] [∞ / ] Dijkstra algorithm illustration [2 / s] 5 [4 / c] a b 3 2 4 [0 / s] 3 2 s 1 e 2 1 1 c d [1 / s] [2 / c] [∞ / ] Dijkstra algorithm illustration [2 / s] 5 [3 / d] a b 3 2 4 [0 / s] 3 2 s 1 e 2 1 1 c [1 / s] d [2 / c] [4 / d] Dijkstra algorithm illustration [2 / s] 5 [3 / d] a b 3 2 4 [0 / s] 3 2 s 1 e 2 1 1 c d [1 / s] [2 / c] [4 / d] Dijkstra algorithm illustration [2 / s] 5 [3 / d] a b 3 2 4 [0 / s] 3 2 s 1 e 2 1 1 c d [1 / s] [2 / c] [4 / d] Dijkstra algorithm summary • 复杂度 (Complexity) – O(n2) 注意: 计算所有的最短路径和计算一条最短路径具有相同的复杂度。 • 输出结果给出了网络上的一棵生成树 (Spanning Tree)。 5 a b 3 2 4 s e 3 2 1 1 2 c d 1 Dijkstra’s algorithm, discussion Algorithm complexity: n nodes each iteration: need to check all nodes, w, not in N n(n+1)/2 comparisons: O(n2) more efficient implementations possible: O(nlogn) Oscillations possible: e.g., link cost = amount of carried traffic D 1 1 0 A 0 0 C e 1+e e initially B 1 2+e A 0 D 1+e 1 B 0 0 C … recompute routing 0 D 1 A 0 0 C 2+e B 1+e … recompute 2+e A 0 D 1+e 1 B e 0 C … recompute Network Layer 4-140 Chapter 4: Network Layer 4. 1 Introduction 4.2 Virtual circuit and datagram networks 4.3 What’s inside a router 4.4 IP: Internet Protocol Datagram format IPv4 addressing ICMP IPv6 4.5 Routing algorithms Link state Distance Vector Hierarchical routing 4.6 Routing in the Internet RIP OSPF BGP 4.7 Broadcast and multicast routing Network Layer 4-141 Distance Vector Algorithm Bellman-Ford Equation (dynamic programming) Define dx(y) := cost of least-cost path from x to y Then dx(y) = min {c(x,v) + dv(y) } v where min is taken over all neighbors v of x Network Layer 4-142 Bellman-Ford example 5 2 u v 2 1 x 3 w 3 1 5 z 1 y Clearly, dv(z) = 5, dx(z) = 3, dw(z) = 3 2 B-F equation says: du(z) = min { c(u,v) + dv(z), c(u,x) + dx(z), c(u,w) + dw(z) } = min {2 + 5, 1 + 3, 5 + 3} = 4 Node that achieves minimum is next hop in shortest path ➜ forwarding table Network Layer 4-143 Distance Vector Algorithm Dx(y) = estimate of least cost from x to y Node x knows cost to each neighbor v: c(x,v) Node x maintains distance vector Dx = [Dx(y): y є N ] Node x also maintains its neighbors’ distance vectors For each neighbor v, x maintains Dv = [Dv(y): y є N ] Network Layer 4-144 Distance vector algorithm (4) Basic idea: From time-to-time, each node sends its own distance vector estimate to neighbors Asynchronous When a node x receives new DV estimate from neighbor, it updates its own DV using B-F equation: Dx(y) ← minv{c(x,v) + Dv(y)} for each node y ∊ N Under minor, natural conditions, the estimate Dx(y) converge to the actual least cost dx(y) Network Layer 4-145 Distance Vector Algorithm (5) Iterative, asynchronous: each local iteration caused by: local link cost change DV update message from neighbor Distributed: each node notifies neighbors only when its DV changes neighbors then notify their neighbors if necessary Each node: wait for (change in local link cost or msg from neighbor) recompute estimates if DV to any dest has changed, notify neighbors Network Layer 4-146 Dx(y) = min{c(x,y) + Dy(y), c(x,z) + Dz(y)} = min{2+0 , 7+1} = 2 node x table cost to x y z cost to x y z from from x 0 2 7 y ∞∞ ∞ z ∞∞ ∞ node y table cost to x y z Dx(z) = min{c(x,y) + Dy(z), c(x,z) + Dz(z)} = min{2+1 , 7+0} = 3 x 0 2 3 y 2 0 1 z 7 1 0 x ∞ ∞ ∞ y 2 0 1 z ∞∞ ∞ node z table cost to x y z from from x x ∞∞ ∞ y ∞∞ ∞ z 71 0 time 2 y 7 1 z Network Layer 4-147 Dx(y) = min{c(x,y) + Dy(y), c(x,z) + Dz(y)} = min{2+0 , 7+1} = 2 node x table cost to x y z x ∞∞ ∞ y ∞∞ ∞ z 71 0 from from from from x 0 2 7 y 2 0 1 z 7 1 0 cost to x y z x 0 2 7 y 2 0 1 z 3 1 0 x 0 2 3 y 2 0 1 z 3 1 0 cost to x y z x 0 2 3 y 2 0 1 z 3 1 0 x 2 y 7 1 z cost to x y z from from from x ∞ ∞ ∞ y 2 0 1 z ∞∞ ∞ node z table cost to x y z x 0 2 3 y 2 0 1 z 7 1 0 cost to x y z cost to x y z from from x 0 2 7 y ∞∞ ∞ z ∞∞ ∞ node y table cost to x y z cost to x y z Dx(z) = min{c(x,y) + Dy(z), c(x,z) + Dz(z)} = min{2+1 , 7+0} = 3 x 0 2 3 y 2 0 1 z 3 1 0 time Network Layer 4-148 Distance Vector: link cost changes Link cost changes: node detects local link cost change updates routing info, recalculates distance vector if DV changes, notify neighbors “good news travels fast” 1 x 4 y 50 1 z At time t0, y detects the link-cost change, updates its DV, and informs its neighbors. At time t1, z receives the update from y and updates its table. It computes a new least cost to x and sends its neighbors its DV. At time t2, y receives z’s update and updates its distance table. y’s least costs do not change and hence y does not send any message to z. Network Layer 4-149 Distance Vector: link cost changes Good news spreads fast 1 X 4 Y 50 1 Z 算法 收敛 Distance Vector: link cost changes Link cost changes: good news travels fast bad news travels slow - “count to infinity” problem! 44 iterations before algorithm stabilizes: see text 60 x 4 y 50 1 z Poisoned reverse: If Z routes through Y to get to X : Z tells Y its (Z’s) distance to X is infinite (so Y won’t route to X via Z) will this completely solve count to infinity problem? Network Layer 4-151 Distance Vector: link cost changes Bad news spreads slowly, so called “Count to Infinity Problem”: 60 X 4 Y 50 1 Z 算法仍 不收敛 Distance Vector–Poisoned Reverse Poisoned Reverse: 如果 Z 到 X 的最短路径 经过 Y,那么 Z 告诉 Y “Z 到 X 的最短距离 是 ∞ “,这样 Y 就不会选择经由 Z 到达 X 的 路线。 60 X 4 Y 50 1 Z 算法 收敛 ∞ 立即修改 Comparison of LS and DV algorithms Message complexity LS: with n nodes, E links, O(nE) msgs sent DV: exchange between neighbors only convergence time varies Speed of Convergence LS: O(n2) algorithm requires O(nE) msgs may have oscillations DV: convergence time varies may be routing loops count-to-infinity problem Robustness: what happens if router malfunctions? LS: node can advertise incorrect link cost each node computes only its own table DV: DV node can advertise incorrect path cost each node’s table used by others • error propagate thru network Network Layer 4-154 DV versus LS Distance Vector 仅与邻居节点交换消息 消息包括到所有节点的 最短距离 收敛速度比较慢 能够广播不正确的路径 信息 有Count to Infinity Problem Link State 向网络上所有其它节 点广播消息 消息仅包括到邻居节 点的距离 收敛速度比较快 能够广播不正确的链 路信息 没有Count to Infinity Problem Chapter 4: Network Layer 4. 1 Introduction 4.2 Virtual circuit and datagram networks 4.3 What’s inside a router 4.4 IP: Internet Protocol Datagram format IPv4 addressing ICMP IPv6 4.5 Routing algorithms Link state Distance Vector Hierarchical routing 4.6 Routing in the Internet RIP OSPF BGP 4.7 Broadcast and multicast routing Network Layer 4-156 Hierarchical Routing Our routing study thus far - idealization all routers identical network “flat” … not true in practice scale: with 200 million destinations: can’t store all dest’s in routing tables! routing table exchange would swamp links! administrative autonomy internet = network of networks each network admin may want to control routing in its own network Network Layer 4-157 Hierarchical Routing aggregate routers into regions, “autonomous systems” (AS) routers in same AS run same routing protocol Gateway router Direct link to router in another AS “intra-AS” routing protocol routers in different AS can run different intraAS routing protocol Network Layer 4-158 Interconnected ASes 3c 3a 3b AS3 1a 2a 1c 1d 1b Intra-AS Routing algorithm 2c AS2 AS1 Inter-AS Routing algorithm Forwarding table 2b forwarding table configured by both intra- and inter-AS routing algorithm intra-AS sets entries for internal dests inter-AS & intra-As sets entries for external dests Network Layer 4-159 Inter-AS tasks AS1 must: 1. learn which dests are reachable through AS2, which through AS3 2. propagate this reachability info to all routers in AS1 Job of inter-AS routing! suppose router in AS1 receives datagram destined outside of AS1: router should forward packet to gateway router, but which one? 3c 3b 3a AS3 1a 2a 1c 1d 1b 2c AS2 2b AS1 Network Layer 4-160 Example: Setting forwarding table in router 1d suppose AS1 learns (via inter-AS protocol) that subnet x reachable via AS3 (gateway 1c) but not via AS2. inter-AS protocol propagates reachability info to all internal routers. router 1d determines from intra-AS routing info that its interface I is on the least cost path to 1c. installs forwarding table entry (x,I) x 3c 3a 3b AS3 1a 2a 1c 1d 1b AS1 2c 2b AS2 Network Layer 4-161 Example: Choosing among multiple ASes now suppose AS1 learns from inter-AS protocol that subnet x is reachable from AS3 and from AS2. to configure forwarding table, router 1d must determine towards which gateway it should forward packets for dest x. this is also job of inter-AS routing protocol! x 3c 3a 3b AS3 1a 2a 1c 1d 1b 2c AS2 2b AS1 Network Layer 4-162 Example: Choosing among multiple ASes now suppose AS1 learns from inter-AS protocol that subnet x is reachable from AS3 and from AS2. to configure forwarding table, router 1d must determine towards which gateway it should forward packets for dest x. this is also job of inter-AS routing protocol! hot potato routing: send packet towards closest of two routers. Learn from inter-AS protocol that subnet x is reachable via multiple gateways Use routing info from intra-AS protocol to determine costs of least-cost paths to each of the gateways Hot potato routing: Choose the gateway that has the smallest least cost Determine from forwarding table the interface I that leads to least-cost gateway. Enter (x,I) in forwarding table Network Layer 4-163 Intra-AS Routing also known as Interior Gateway Protocols (IGP) most common Intra-AS routing protocols: RIP: Routing Information Protocol OSPF: Open Shortest Path First IGRP: Interior Gateway Routing Protocol (Cisco proprietary) Network Layer 4-164 Chapter 4: Network Layer 4. 1 Introduction 4.2 Virtual circuit and datagram networks 4.3 What’s inside a router 4.4 IP: Internet Protocol Datagram format IPv4 addressing ICMP IPv6 4.5 Routing algorithms Link state Distance Vector Hierarchical routing 4.6 Routing in the Internet RIP OSPF BGP 4.7 Broadcast and multicast routing Network Layer 4-165 RIP ( Routing Information Protocol) distance vector algorithm included in BSD-UNIX Distribution in 1982 distance metric: # of hops (max = 15 hops) From router A to subnets: u v A z C B D w x y destination hops u 1 v 2 w 2 x 3 y 3 z 2 Network Layer 4-166 RIP advertisements distance vectors: exchanged among neighbors every 30 sec via Response Message (also called advertisement) each advertisement: list of up to 25 destination subnets within AS Network Layer 4-167 RIP: Example z w A x D B y C Destination Network w y z x …. Next Router Num. of hops to dest. …. .... A B B -- 2 2 7 1 Routing/Forwarding table in D Network Layer 4-168 RIP: Example Dest w x z …. Next C … w hops 1 1 4 ... A Advertisement from A to D z x Destination Network w y z x …. D B C y Next Router Num. of hops to dest. …. .... A B B A -- Routing/Forwarding table in D 2 2 7 5 1 Network Layer 4-169 RIP: Link Failure and Recovery If no advertisement heard after 180 sec --> neighbor/link declared dead routes via neighbor invalidated new advertisements sent to neighbors neighbors in turn send out new advertisements (if tables changed) link failure info quickly (?) propagates to entire net poison reverse used to prevent ping-pong loops (infinite distance = 16 hops) Network Layer 4-170 RIP Table processing RIP routing tables managed by application-level process called route-d (daemon) advertisements sent in UDP packets, periodically repeated routed routed Transprt (UDP) network (IP) link physical Transprt (UDP) forwarding table forwarding table network (IP) link physical Network Layer 4-171 Chapter 4: Network Layer 4. 1 Introduction 4.2 Virtual circuit and datagram networks 4.3 What’s inside a router 4.4 IP: Internet Protocol Datagram format IPv4 addressing ICMP IPv6 4.5 Routing algorithms Link state Distance Vector Hierarchical routing 4.6 Routing in the Internet RIP OSPF BGP 4.7 Broadcast and multicast routing Network Layer 4-172 OSPF (Open Shortest Path First) “open”: publicly available uses Link State algorithm LS packet dissemination topology map at each node route computation using Dijkstra’s algorithm OSPF advertisement carries one entry per neighbor router advertisements disseminated to entire AS (via flooding) carried in OSPF messages directly over IP (rather than TCP or UDP Network Layer 4-173 OSPF “advanced” features (not in RIP) security: all OSPF messages authenticated (to prevent malicious intrusion) multiple same-cost paths allowed (only one path in RIP) For each link, multiple cost metrics for different TOS (e.g., satellite link cost set “low” for best effort; high for real time) integrated uni- and multicast support: Multicast OSPF (MOSPF) uses same topology data base as OSPF hierarchical OSPF in large domains. Network Layer 4-174 Hierarchical OSPF Network Layer 4-175 Hierarchical OSPF two-level hierarchy: local area, backbone. Link-state advertisements only in area each nodes has detailed area topology; only know direction (shortest path) to nets in other areas. area border routers: “summarize” distances to nets in own area, advertise to other Area Border routers. backbone routers: run OSPF routing limited to backbone. boundary routers: connect to other AS’s. Network Layer 4-176 Chapter 4: Network Layer 4. 1 Introduction 4.2 Virtual circuit and datagram networks 4.3 What’s inside a router 4.4 IP: Internet Protocol Datagram format IPv4 addressing ICMP IPv6 4.5 Routing algorithms Link state Distance Vector Hierarchical routing 4.6 Routing in the Internet RIP OSPF BGP 4.7 Broadcast and multicast routing Network Layer 4-177 Internet inter-AS routing: BGP BGP (Border Gateway Protocol): the de facto standard BGP provides each AS a means to: 1. 2. 3. Obtain subnet reachability information from neighboring ASs. Propagate reachability information to all ASinternal routers. Determine “good” routes to subnets based on reachability information and policy. allows subnet to advertise its existence to rest of Internet: “I am here” Network Layer 4-178 BGP basics pairs of routers (BGP peers) exchange routing info over semi-permanent TCP connections: BGP sessions BGP sessions need not correspond to physical links. when AS2 advertises a prefix to AS1: AS2 promises it will forward datagrams towards that prefix. AS2 can aggregate prefixes in its advertisement eBGP session 3c 3a 3b AS3 1a AS1 iBGP session 2a 1c 1d 1b 2c AS2 2b Network Layer 4-179 Distributing reachability info using eBGP session between 3a and 1c, AS3 sends prefix reachability info to AS1. 1c can then use iBGP do distribute new prefix info to all routers in AS1 1b can then re-advertise new reachability info to AS2 over 1b-to-2a eBGP session when router learns of new prefix, it creates entry for prefix in its forwarding table. eBGP session 3c 3a 3b AS3 1a AS1 iBGP session 2a 1c 1d 1b 2c AS2 2b Network Layer 4-180 Path attributes & BGP routes advertised prefix includes BGP attributes. prefix + attributes = “route” two important attributes: AS-PATH: contains ASs through which prefix advertisement has passed: e.g, AS 67, AS 17 NEXT-HOP: indicates specific internal-AS router to next-hop AS. (may be multiple links from current AS to next-hop-AS) when gateway router receives route advertisement, uses import policy to accept/decline. Network Layer 4-181 BGP route selection router may learn about more than 1 route to some prefix. Router must select route. elimination rules: 1. 2. 3. 4. local preference value attribute: policy decision shortest AS-PATH closest NEXT-HOP router: hot potato routing additional criteria Network Layer 4-182 BGP messages BGP messages exchanged using TCP. BGP messages: OPEN: opens TCP connection to peer and authenticates sender UPDATE: advertises new path (or withdraws old) KEEPALIVE keeps connection alive in absence of UPDATES; also ACKs OPEN request NOTIFICATION: reports errors in previous msg; also used to close connection Network Layer 4-183 BGP routing policy legend: B W X A provider network customer network: C Y A,B,C are provider networks X,W,Y are customer (of provider networks) X is dual-homed: attached to two networks X does not want to route from B via X to C .. so X will not advertise to B a route to C Network Layer 4-184 BGP routing policy (2) legend: B W X A provider network customer network: C Y A advertises path AW to B B advertises path BAW to X Should B advertise path BAW to C? No way! B gets no “revenue” for routing CBAW since neither W nor C are B’s customers B wants to force C to route to w via A B wants to route only to/from its customers! Network Layer 4-185 Why different Intra- and Inter-AS routing ? Policy: Inter-AS: admin wants control over how its traffic routed, who routes through its net. Intra-AS: single admin, so no policy decisions needed Scale: hierarchical routing saves table size, reduced update traffic Performance: Intra-AS: can focus on performance Inter-AS: policy may dominate over performance Network Layer 4-186 Static routing table Destination subnet/Mask, Next hop/router, Output interface, # of hops to destination typical problems: Given net structure, to produce the routing table Given routing table, to produce the net structure Network Layer 4-187 Static routing table: Problem 1 R1, R2, R3: Destination subnet/Mask, Next hop/router, Interface, # of hops to destination ? Network Layer 4-188 Static routing table: Problem 2 Destination Mask Next hop Interface # of hops 202.204.65.0 255.255.255.0 C Vlan160 1 202.204.64.0 255.255.255.0 C Valn159 1 202.14.71.0 255.255.255.0 202.124.254.9 Vlan2 4 202.38.70.0 255.255.255.0 202.124.254.9 Vlan2 5 C Vlan2 1 202.204.65.1 Vlan160 2 202.124.254.0 255.255.255.0 176.20.0.0 255.255.0.0 net structure? Network Layer 4-189 Chapter 4: Network Layer 4. 1 Introduction 4.2 Virtual circuit and datagram networks 4.3 What’s inside a router 4.4 IP: Internet Protocol Datagram format IPv4 addressing ICMP IPv6 4.5 Routing algorithms Link state Distance Vector Hierarchical routing 4.6 Routing in the Internet RIP OSPF BGP 4.7 Broadcast and multicast routing Network Layer 4-190 Broadcast Routing deliver packets from source to all other nodes source duplication is inefficient: duplicate duplicate creation/transmission R1 R1 duplicate R2 R2 R3 R4 source duplication R3 R4 in-network duplication source duplication: how does source determine recipient addresses? Network Layer 4-191 In-network duplication flooding: when node receives brdcst pckt, sends copy to all neighbors Problems: cycles & broadcast storm controlled flooding: node only brdcsts pkt if it hasn’t brdcst same packet before Node keeps track of pckt ids already brdcsted Or reverse path forwarding (RPF): only forward pckt if it arrived on shortest path between node and source spanning tree No redundant packets received by any node Network Layer 4-192 Spanning Tree First construct a spanning tree Nodes forward copies only along spanning tree A B c F A E B c D F G (a) Broadcast initiated at A E D G (b) Broadcast initiated at D Network Layer 4-193 Spanning Tree: Creation Center node Each node sends unicast join message to center node Message forwarded until it arrives at a node already belonging to spanning tree A A 3 B c 4 E F 1 2 B c D F 5 E D G G (a) Stepwise construction of spanning tree (b) Constructed spanning tree Network Layer 4-194 Multicast Routing: Problem Statement Goal: find a tree (or trees) connecting routers having local mcast group members tree: not all paths between routers used source-based: different tree from each sender to rcvrs shared-tree: same tree used by all group members Shared tree Source-based trees Approaches for building mcast trees Approaches: source-based tree: one tree per source shortest path trees reverse path forwarding group-shared tree: group uses one tree minimal spanning (Steiner) center-based trees …we first look at basic approaches, then specific protocols adopting these approaches Shortest Path Tree mcast forwarding tree: tree of shortest path routes from source to all receivers Dijkstra’s algorithm S: source LEGEND R1 1 2 R4 R2 3 R3 router with attached group member 5 4 R6 router with no attached group member R5 6 R7 i link used for forwarding, i indicates order link added by algorithm Reverse Path Forwarding rely on router’s knowledge of unicast shortest path from it to sender each router has simple forwarding behavior: if (mcast datagram received on incoming link on shortest path back to center) then flood datagram onto all outgoing links else ignore datagram Reverse Path Forwarding: example S: source LEGEND R1 R4 router with attached group member R2 R5 R3 R6 R7 router with no attached group member datagram will be forwarded datagram will not be forwarded • result is a source-specific reverse SPT – may be a bad choice with asymmetric links Reverse Path Forwarding: pruning forwarding tree contains subtrees with no mcast group members no need to forward datagrams down subtree “prune” msgs sent upstream by router with no downstream group members LEGEND S: source R1 router with attached group member R4 R2 P R5 R3 R6 P R7 P router with no attached group member prune message links with multicast forwarding Shared-Tree: Steiner Tree Steiner Tree: minimum cost tree connecting all routers with attached group members problem is NP-complete excellent heuristics exists not used in practice: computational complexity information about entire network needed monolithic: rerun whenever a router needs to join/leave Center-based trees single delivery tree shared by all one router identified as “center” of tree to join: edge router sends unicast join-msg addressed to center router join-msg “processed” by intermediate routers and forwarded towards center join-msg either hits existing tree branch for this center, or arrives at center path taken by join-msg becomes new branch of tree for this router Center-based trees: an example Suppose R6 chosen as center: LEGEND R1 3 R2 router with attached group member R4 2 R5 R3 1 R6 R7 1 router with no attached group member path order in which join messages generated Internet Multicasting Routing: DVMRP DVMRP: distance vector multicast routing protocol, RFC1075 flood and prune: reverse path forwarding, source-based tree RPF tree based on DVMRP’s own routing tables constructed by communicating DVMRP routers no assumptions about underlying unicast initial datagram to mcast group flooded everywhere via RPF routers not wanting group: send upstream prune msgs DVMRP: continued… soft state: DVMRP router periodically (1 min.) “forgets” branches are pruned: mcast data again flows down unpruned branch downstream router: reprune or else continue to receive data routers can quickly regraft to tree following IGMP join at leaf odds and ends commonly implemented in commercial routers Mbone routing done using DVMRP Tunneling Q: How to connect “islands” of multicast routers in a “sea” of unicast routers? physical topology logical topology mcast datagram encapsulated inside “normal” (non-multicast- addressed) datagram normal IP datagram sent thru “tunnel” via regular IP unicast to receiving mcast router receiving mcast router unencapsulates to get mcast datagram PIM: Protocol Independent Multicast not dependent on any specific underlying unicast routing algorithm (works with all) two different multicast distribution scenarios : Dense: Sparse: group members # networks with group densely packed, in “close” proximity. bandwidth more plentiful members small wrt # interconnected networks group members “widely dispersed” bandwidth not plentiful Consequences of Sparse-Dense Dichotomy: Dense group membership by Sparse: no membership until routers assumed until routers explicitly join routers explicitly prune receiver- driven data-driven construction construction of mcast on mcast tree (e.g., RPF) tree (e.g., center-based) bandwidth and non bandwidth and non-groupgroup-router processing router processing profligate conservative PIM- Dense Mode flood-and-prune RPF, similar to DVMRP but underlying unicast protocol provides RPF info for incoming datagram less complicated (less efficient) downstream flood than DVMRP reduces reliance on underlying routing algorithm has protocol mechanism for router to detect it is a leaf-node router PIM - Sparse Mode center-based approach router sends join msg to rendezvous point (RP) router can switch to source-specific tree increased performance: less concentration, shorter paths R4 join intermediate routers update state and forward join after joining via RP, R1 R2 R3 join R5 join R6 all data multicast from rendezvous point R7 rendezvous point PIM - Sparse Mode sender(s): unicast data to RP, which distributes down RP-rooted tree RP can extend mcast tree upstream to source RP can send stop msg if no attached receivers “no one is listening!” R1 R4 join R2 R3 join R5 join R6 all data multicast from rendezvous point R7 rendezvous point Chapter 4: summary 4. 1 Introduction 4.2 Virtual circuit and datagram networks 4.3 What’s inside a router 4.4 IP: Internet Protocol Datagram format IPv4 addressing ICMP IPv6 4.5 Routing algorithms Link state Distance Vector Hierarchical routing 4.6 Routing in the Internet RIP OSPF BGP 4.7 Broadcast and multicast routing Network Layer 4-212 Chapter 4: 3rd Homework 1. P22 2. 给定子网掩码255.255.255.224,请问与主机 210.30.97.245对应的网络地址和主机地址分别是多少? 3. 列举IPv6和IPv4的5项不同点 Hand in at once by Monitors ONE week later. No print! Hand-writing with you ID and name. Review Questions See the textbook (P.442-) R2, R3, R8, R18, R21, R33 P11, P16, P23 虚电路网络与数据报网络的区别? 链路状态路由算法与距离向量路由算法的区别? 广播、组播、单播、任播? 单播 主机之间“一对一”的通讯模式,网络中的交换机和路由器 对数据只进行转发不进行复制。但由于其能够针对每个客户 的及时响应,所以现在的网页浏览全部都是采用IP单播协议。 网络中的路由器和交换机根据其目标地址选择传输路径,将 IP单播数据传送到其指定的目的地。 215 单播 单播的优点: 1. 服务器及时响应客户机的请求 2. 服务器针对每个客户不通的请求发送不通的数据,容 易实现个性化服务。 单播的缺点: 1. 服务器针对每个客户机发送数据流,在客户数量大、 每个客户机流量大的流媒体应用中服务器不堪重负。 2. 现有的网络带宽是金字塔结构,城际省际主干带宽仅 仅相当于其所有用户带宽之和的5%。如果全部使用单播 协议,将造成网络主干不堪重负。 216 广播 主机之间“一对所有”的通讯模式,网络对其中每一台主机 发出的信号都进行无条件复制并转发,所有主机都可以接收 到所有信息(不管你是否需要),由于其不用路径选择,所 以其网络成本可以很低廉。有线电视网就是典型的广播型网 络。在数据网络中也允许广播的存在,但其被限制在二层交 换机的局域网范围内,禁止广播数据穿过路由器,防止广播 数据影响大面积的主机。 217 广播 广播的优点: 1. 网络设备简单,维护简单,布网成本低廉 2. 由于服务器不用向每个客户机单独发送数据,所以服 务器流量负载极低。 广播的缺点: 1.无法针对每个客户的要求和时间及时提供个性化服务。 2. 网络允许服务器提供数据的带宽有限,客户端的最大 带宽=服务总带宽。 3. 广播禁止在Internet宽带网上传输。 218 组播 主机之间“一对一组”的通讯模式,也就是加入了同一个组 的主机可以接受到此组内的所有数据,网络中的交换机和路 由器只向有需求者复制并转发其所需数据。 219 组播 组播的优点: 1. 需要相同数据流的客户端加入相同的组共享一条数据 流,节省了服务器的负载。具备广播所具备的优点。 2. 由于组播协议是根据接受者的需要对数据流进行复制 转发,所以服务端的服务总带宽不受客户接入端带宽的 限制。 3. 此协议和单播协议一样允许在Internet宽带网上传输。 组播的缺点: 1.与单播协议相比没有纠错机制,发生丢包错包后难以 弥补,但可以通过一定的容错机制和QoS加以弥补。 2.现行网络虽然都支持组播的传输,但在客户认证、 QoS等方面还需要完善,这些缺点在理论上都有成熟的 解决方案,只是需要逐步推广应用到现存网络当中。 220