* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download PPT - Electrical and Computer Engineering
Asynchronous Transfer Mode wikipedia , lookup
IEEE 802.1aq wikipedia , lookup
Distributed firewall wikipedia , lookup
Multiprotocol Label Switching wikipedia , lookup
Deep packet inspection wikipedia , lookup
Piggybacking (Internet access) wikipedia , lookup
Wake-on-LAN wikipedia , lookup
Computer network wikipedia , lookup
Internet protocol suite wikipedia , lookup
Network tap wikipedia , lookup
Airborne Networking wikipedia , lookup
List of wireless community networks by region wikipedia , lookup
Cracking of wireless networks wikipedia , lookup
Zero-configuration networking wikipedia , lookup
Routing in delay-tolerant networking wikipedia , lookup
Recursive InterNetwork Architecture (RINA) wikipedia , lookup
Chapter 4 Network Layer A note on the use of these ppt slides: We’re making these slides freely available to all (faculty, students, readers). They’re in PowerPoint form so you can add, modify, and delete slides (including this one) and slide content to suit your needs. They obviously represent a lot of work on our part. In return for use, we only ask the following: If you use these slides (e.g., in a class) in substantially unaltered form, that you mention their source (after all, we’d like people to use our book!) If you post any slides in substantially unaltered form on a www site, that you note that they are adapted from (or perhaps identical to) our slides, and note our copyright of this material. Computer Networking: A Top Down Approach 5th edition. Jim Kurose, Keith Ross Addison-Wesley, April 2009. Thanks and enjoy! JFK/KWR All material copyright 1996-2010 J.F Kurose and K.W. Ross, All Rights Reserved Network Layer 4-1 Chapter 4: Network Layer 4. 1 Introduction 4.2 Packet forwarding 4.3 What’s inside a router 4.4 IP: Internet Protocol Datagram format IPv4 addressing ICMP IPv6 4.5 Routing algorithms Distance Vector Link state Hierarchical routing 4.6 Routing in the Internet RIP OSPF BGP Network Layer 4-2 Network Network layer transport segment from sending host to receiving host on sending side encapsulate segments into datagrams on rcving side, deliver segments to transport layer network layer protocols in every host, router router examines header fields in all IP datagrams passing through it Link1 Link2 Link3 Link4 PHY1 PHY2 PHY3 PHY4 application transport network data link physical network data link physical network data link physical network data link physical network data link physical network data link physical network network data link data link physical physical network data link physical network data network link physical data link physical network data link physical network data link physical Network Layer application transport network data link physical 4-3 The Canadian Network Links: OC-192 (10 Gbps) 4 ORION (Ontario Research and Innovation Optical Network) ORION Office 34 King Street East Suite 800, 8th Floor Toronto, Ontario, M5C 2X8 5 COGENT Network See a fuller map at: http://www.submarinecablemap.com/ 6 Network Connection to UW 7 UW Network 8 Many routing protocols RIP BGP OSPF UDP TCP ICMP Routing Table(s) Network layer IP Link1 Link2 Link3 Link4 PHY1 PHY2 PHY3 PHY4 2 1 3 4 OSPF: Open Shortest Path First RIP: Routing Information Protocol BGP: Border Gateway Protocol ICMP: Internet Control Message Protocol TCP: Transmission Control Protocol UDP: User Datagram Protocol 9 Hop-by-hop routing Dest. Introduction 1-10 IP address In IPv4, an IP address is 32-bit long Example: 10000001 01100001 01011100 00100101 This is also written as: 129.97.92.37 (10000001)2 (01100001)2 (01011100)2 (00100101)2 = 129 = 97 = 92 = 37 Dotted decimal notation (easy to enter …..) Network Layer 4-11 IP address Switches and hubs do not have IP addresses End-devices generally have one IP address each ….. (Laptops, desktops, servers, … have one IP address each) Routers have multiple IP addresses … one+ for each physical interface … IP3 IP2 IP1 IP4 Network Layer 4-12 Who gives ISPs IP address blocks? ICANN allocates IP address blocks to regional internet registries - AFRINIC: African Registry for Network Info Centre (Mauritius) - APNIC: Asia-Pacific Network Information Centre (South Brisbane) - ARIN: American Registry for Internet Numbers (Virginia) - LACNIC: Latin American and Caribbean Network Info. Centre (Uruguay) - RIPE: Réseaux IP Européens Network Coordination Centre (Amsterdam) ICANN: Internet Corp. for Assigned Names and Numbers Network Layer 4-13 The concept of network prefix IP1 IP2 IP3 IP4 All the IP addresses on the same link have a common portion in their most significant bits IP1: x.y.z. 1 0 1 0 1 0 0 0 IP2: x.y.z. 1 0 1 0 1 0 0 1 IP3: x.y.z. 1 0 1 0 1 0 1 0 IP4: x.y.z. 1 0 1 0 1 0 1 1 common portion Network prefix host ID 14 The concept of network prefix IP1: x.y.z.1 0 1 0 1 0 0 0 IP2: x.y.z. 1 0 1 0 1 0 0 1 IP3: x.y.z. 1 0 1 0 1 0 1 0 IP4: x.y.z. 1 0 1 0 1 0 1 1 Network prefix host ID Remember: Routers do not store routing info. for individual destination IPs… (there are billions of IP addresses….. storing and searching all those IP addresses will need much more CPU power and memory….. Instead, routers store aggregated IP addresses in their routing tables…… Example: x.y.z.168/30 (10101000)2 = 168 30 = length of the network prefix in bits 15 New/ IP addressing: CIDR CIDR: Classless InterDomain Routing subnet portion (prefix portion) of address is of arbitrary length Addr format: a.b.c.d/x, where x is # bits in subnet portion host part subnet part 11001000 00010111 00010000 00000000 200.23.16.0/23 VLSM: Variable Length Subnet Mask Old / Classful addressing: Fixed length subnet mask Class A: Addr begins with 0 and it is of the form: Class B: Addr begins with 10 and it is of the form: Class C: Addr begins with 110 and it is of the form: Prefix . Host . Host . Host Prefix. Prefix. Host. Host Prefix . Prefix . Prefix . Host Network Layer 4-16 The concept of network prefix #1: Net Prefix #2: Autonomous Systems The concept of network prefix is very simple, yet it is a powerful concept that makes the Internet scalable….. Routers in Toronto, Tokyo, New York, … know all UWaterloo hosts by 129.97.0.0/16. The whole UW is one dest. Tokyo Toronto UW NY Network Layer 4-17 The concept of network prefix For discussion purpose, we use the notation 129.97.0.0/16. However, routers use the following notation: Dest. Address: 129.97.0.0 Network Mask: 11111111.11111111.00000000.00000000 : 255.255.0.0 Destination address: 129.97.0.0/ 255.255.0.0 Network Layer 4-18 The concepts of network ID and broadcast address Network ID appears in routing tables. Individual host IP addresses do NOT. Broadcast Address is used to perform IP-level broadcast. You send an IP packet with a Broadcast Addr as the Destination, the IP packet is delivered to ALL the nodes on the network. Network Layer 4-19 The concepts of network ID and broadcast address The network has an ID called Network ID, expressed in the form of an IP address. IP1 IP2 IP3 IP4 Consider the network IP5 Also, the network has a broadcast address. 10.2.5.16/28 /28 = 11111111 . 11111111 . 11111111 . 11110000 = 255.255.255.240 The FIRST addr in the net 10.2.5.16/28 is 10.2.5.00010000 = 10.2.5.16 Net ID The next addr in the net 10.2.5.16/28 is 10.2.5.00010001 = 10.2.5.17 Assign to Router (con.) The next addr in the net 10.2.5.16/28 is 10.2.5.00010010 = 10.2.5.18 : : : : : : The next addr in the net 10.2.5.16/28 is 10.2.5.00011110 = 10.2.5.30 The LAST addr in the net 10.2.5.16/28 is 10.2.5.00011111 = 10.2.5.31 Assign to hosts Broadcast addr Network Layer 4-20 Net ID and Broadcast address are NOT assigned to any host/router. Given a network ID and a mask: Find the next network ID with the same mask. Find the previous network ID with the same mask. Given an IP address and a mask: Find the network ID and the Broadcast address Network Layer 4-21 Given a network ID and a mask: Find the next network ID with the same mask. Find the previous network ID with the same mask. Start IP address space Previous network ID 10.2.5.16/28 Next network ID 32 – 28 = 4 32 is length of IP addr in IPv4 Block size = 24 = 16 Previous network ID = 10.2.5.16 - 16 = 10.2.5.0 Next network ID = 10.2.5.16 + 16 = 10.2.5.32 End Network Layer 4-22 Given an IP address and a mask: Find the network ID and the Broadcast address Start 32 – 28 = 4 IP address space 10.2.5.20/28 32 is length of IP addr in IPv4 Block size = 24 = 16 For network ID: 10.2.5.20 = 10.2.5.00010100 (Identify the network portion: the left-most 28 bits) => 10.2.5.00010100 (Reset the host portion to 0’s) => 10.2.5.00010000 = 10.2.5.16 Broadcast Address = Network ID + Block size -1 = 10.2.5.16 + 16 -1 End = 10.2.5.31 Network Layer 4-23 Public IP addr vs. Private IP addr Private IP addresses Public IP addresses These addresses are globally unique. These are not globally unique. (Unique within an org.) Routers everywhere recognize these. Routers outside the org. do not recognize these. Any host can open a TCP connection with a machine with a public IP addr. If a host has a private IP addr, a host outside the org. cannot open a TCP conn with it. Reuse of IP addresses Added security Adv. Network Layer 4-24 Private IP addresses (RFC 1918) 10.0.0.0/8: Valid IP addresses are 10.0.0.1 -- 10.255.255.254. 172.16.0.0/12: Valid IP addresses are 172.16.0.1 -- 172.31.255.254. 192.168.0.0/16: Valid IP addresses are 192.168.0.1 -- 192.168.255.254. Note UW uses the 172.16.0.0/12 10.0.0.0/8 block block for wireless RFC: Request For Comments -- a kind of IETF (Internet Eng. Task Force) doc. Network Layer 4-25 Partitioning an IP address block into different networks An ISP (UW) gets a block of public IP addresses (129.97.0.0/16) from IANA/ARIN Public IP address space ECE CS Private IP address space Human Resource ECE Finance Quest WiFi Optometry Network Layer 4-26 Simple structure of a routing table (at B): Example * Dest. Address Mask Next hop Interface 73.2.0.0 255.255.0.0 IP1 1 129.97.8.0 255.255.255.0 “connected” 3 0.0.0.0 0.0.0.0 IP2 Metric 73.2.0.0/16 IP1 2 1 B Default entry (configure it) A network connected to the router (configure it) 2 3 IP2 129.97.8.0/24 IP3 IP4 * Learn this entry by running routing protocols. Network Layer 4-27 How does a router choose the next hop for a packet…. Address x.y.z.w Mask a.b.c.d Next hop IPn Interface I Metric m Routing Table of A Next hop? If dest. addr (IP1) “matches” with an entry in the RT, choose interface I. IP packet IP1 IP1 IPn I Router A Network Layer 4-28 How is address “matching” performed? x.y.z.w/n IP1 “matching” condition True IF n MS-bits of x.y.z.w == Matching occurred n MS-bits of IP1 False Matching Failed If ( (x.y.z.w AND a.b.c.d) == (IP1 AND a.b.c.d) ), matching occurs Network Layer 4-29 For an RT and an IP address, many entries may match Destination Address Interface # 11001000 00010111 00010/21 0 11001000 00010111 00011000/24 1 11001000 00010111 00011/21 2 Otherwise (default): 0.0.0.0/0 3 Examples: IP1: 11001000 00010111 00010 110 10100001 matches with the 1st one. IP2: 11001000 00010111 00011 000 10101010 matches with the 2nd entry matches with the 3rd entry Note: If many matchings occurs, there is a “longest” prefix matching ….. Network Layer 4-30 Matching and forwarding algorithm Inputs: IP address from packet header (call it IP1) Routing Table (call it RT) Processing: if (matching occurs between IP1 and RT), { - find the matching entry with the longest prefix - forward the packet via the appropriate interface } else if (default entry exists in RT) { // default: 0.0.0.0/0 forward the packet via the appropriate interface } else { send error message (ICMP message) to the source of the IP packet } ICMP: Internet Control Message Protocol Network Layer 4-31 Longest Prefix Matching/ packet routing 12 4 IP1 6 4 3 3 bits (total: 32 bits) IP packet Dest addr: IP1 IP1 Source Network Tokyo IP1 Can./Vancouver IP1 Can./Toronto1 IP1 Ont./Toronto2 IP1 UW/IST IP1 ECE/EIT IP1 Connected IP1 IP1 4th Floor IP1 Dest. Network Implementation of Routing Tables RT in RAM RT in TCAM (Ternary Content Addressable Memory) Network Layer 4-33 Chapter 4: Network Layer 4. 1 Introduction 4.2 Packet forwarding 4.3 What’s inside a router? 4.4 IP: Internet Protocol Datagram format IPv4 addressing ICMP IPv6 4.5 Routing algorithms Distance Vector Link state Hierarchical routing 4.6 Routing in the Internet RIP OSPF BGP 4.7 Broadcast and multicast routing Network Layer 4-34 Logical view Router Architecture Overview Network Link1 Link2 Link3 Link4 PHY1 PHY2 PHY3 PHY4 Two key router functions: run routing algorithms/protocols (RIP, OSPF, BGP) forward datagrams from incoming to outgoing links switching fabric router input ports OS routing processor router output ports IP, RIP, OSPF, BGP Network Layer 4-35 Input Port Functions line termination link layer protocol (receive) lookup, forwarding switch fabric queueing Decentralized switching Given datagram dest. IP addr, lookup output port using RT Queuing occurs at both input ports and output ports If fabric is slower than input ports combined, queueing may occur at input queues: queueing delay and loss due to input buffer overflow! Network Layer 4-36 Switching fabrics switching fabric router input ports router output ports routing processor transfer packet from input buffer to appropriate output buffer (Performance of switching fabric) switching rate: rate at which packets can be transferred from inputs to outputs often measured as multiple of input/output line rate N inputs: Ideally, switching rate is N times line rate Example: 4 input lines with 10 Gbps per line Desired switching rate: 4 x 10 Gbps = 40 Gbps Network Layer 4-37 Switching fabrics Three types of switching fabrics memory memory bus crossbar Network Layer 4-38 Switching Via Memory First generation routers: traditional computers with switching under direct control of CPU packets are copied to system’s memory speed is limited by memory bandwidth (2 bus crossings per datagram) input port (e.g., Ethernet) memory output port (e.g., Ethernet) system bus Network Layer 4-39 Switching Via a Bus datagram from input port memory to output port memory via a shared bus bus contention: switching speed limited by bus bandwidth Example: bus Cisco 5600 router: 32 Gbps bus Network Layer 4-40 Switching Via An Interconnection Network Overcome bus bandwidth limitations Banyan networks and other interconnection nets initially developed for parallel processing crossbar Example: Cisco 12000 router: switches 60 Gbps through the interconnection network An 8x8 banyan network Network Layer 4-41 Output Ports Functions switch fabric datagram buffer queueing link layer protocol (send) line termination buffering required when datagrams arrive from fabric faster than the transmission rate scheduling discipline chooses among queued datagrams for transmission Network Layer 4-42 Output port queueing switch fabric at t, packets move from input to output switch fabric one packet time later buffering when arrival rate via switch exceeds output line speed queueing (delay) and loss due to output port buffer overflow! Network Layer 4-43 How much buffering? RFC 3439 rule of thumb: average buffering = “typical” RTT x link capacity C Example RTT = 250 ms; C = 10 Gpbs Buffer size = 2.5 Gbits R1 RTT C (RTT: Round-trip Time) R2 buffer Network Layer 4-44 Chapter 4: Network Layer 4. 1 Introduction 4.2 Virtual circuit and datagram networks 4.3 What’s inside a router 4.4 IP: Internet Protocol Datagram (IPv4 pkt) format IPv4 addressing ICMP IPv6 4.5 Routing algorithms Link state Distance Vector Hierarchical routing 4.6 Routing in the Internet RIP OSPF BGP 4.7 Broadcast and multicast routing Network Layer 4-45 Chapter 4: Network Layer 4. 1 Introduction 4.2 Virtual circuit and datagram networks 4.3 What’s inside a router 4.4 IP: Internet Protocol Datagram (IPv4 pkt) format IPv4 addressing ICMP IPv6 4.5 Routing algorithms Link state Distance Vector Hierarchical routing 4.6 Routing in the Internet RIP OSPF BGP 4.7 Broadcast and multicast routing Network Layer 4-46 IP Packet Format 32 bits Header ver head. DSCP len Length fragment Flgs 16-bit ID offset upper header TTL layer checksum source IP address destination IP address Version (4 bits) : 4 (= 0100) Header length (4 bits): unit is 4-bytes (Ex.: A 20-byte header is rep. by 5 (= 0101) DSCP (Differentiated Services Code Point/ 8 bits): Type of data carried (6-bit DSCP + 2-bit) (DSCP = 46 High Priority; 0 Low) Length (16 bits): Packet length in bytes Options (if any) Data (variable length, typically a TCP or UDP segment) 16-bit ID: A long IP packet is fragmented into smaller packets. All those small packets carry the same 16-bit-ID. 3-bit flags: <Not used, Don’t frag., More frags. to follow> Fragment offset x 23 : gives the position of the fragment in the original packet. Network Layer 4-47 IP Packet Format 32 bits Header ver head. DSCP len length fragment Flgs 16-bit ID offset upper header TTL layer checksum source IP address destination IP address Options (if any) Data (variable length, typically a TCP or UDP segment) TTL: Time To Live Max # of remaining hops. TTL is decremented by 1 at each router. TTL = 0 Router discards the packet Upper layer: Upper layer protocol to deliver payload to (Ex.: TCP = 6 UDP=17) Header Checksum: to detect bit errors in packet header (errors in “data” are ignored.) Source IP address: 32-bit IP address of the node that (originally) created the packet. Destination IP address: 32-bit IP address of the destination node of the packet. Network Layer 4-48 IP Packet Format Options: 32 bits Header ver head. DSCP len 16-bit ID upper TTL layer length Flgs fragment offset header checksum source IP address destination IP address Time stamp, record route taken, specify list of routers to visit, … Data: from the upper layer. Usually one TCP (Transport Control Protocol) or one UDP (User Datagram Protocol) segment Options (if any) Data (variable length, typically a TCP or UDP segment) IPv4 header length without options is 20 bytes………. How much to remember? DSCP: Diff. Services Code Point (for Quality of Service) Network Layer 4-49 Header Checksum Calculation (at the Sender) Example : IP header (in Hex) with checksum set to 0000 4500 0073 0000 4000 4011 0000 c0a8 0001 c0a8 00c7 Step 1: Add all 16-bit blocks of the header Result = 0010 0100 0111 1001 1100 Add the carry (0010) to the rest to get Temp = 0100 0111 1001 1110 Step 2: Take 1’s complement of Temp to get the checksum Checksum = 1’s complement(Temp) = 1011 1000 0110 0001 = b861 Header with checksum = 4500 0073 0000 4000 4011 b861 c0a8 0001 c0a8 00c7 Send the IP packet with this header …. 50 Header Checksum Re-calculation (at the Receiver) Assume that the header is received without bit error. 4500 0073 0000 4000 4011 b861 c0a8 0001 c0a8 00c7 Step 1: Add all the 16-bit blocks of the header Result = 2 fffd Add carry (2) to fffd to get ffff Step 2: Take 1’s complement of the final result from step 1, 1’s complement of ffff = 0000. Step 3: Decision If the result from step 2 is 0000: No error Else: bit-error; drop packet 51 IP Fragmentation & Reassembly network links have MTU limitations (Max. Transfer Unit) Ex. 1500 bytes large IP packets are divided (“fragmented”) within net one IP pkt becomes several IP pkts “reassembled” only at final destination IP header bits used to identify and order related IP packets fragmentation: in: one large datagram out: 3 smaller datagrams reassembly Header bits: < ID, Flags, Offset> Network Layer 4-52 IP Fragmentation and Reassembly Data size = 4000-20 = 3980 Bytes Example 4000 byte IP pkt MTU = 1500 bytes length =4000 ID fragflag =x =0 One large IP pkt becomes several smaller IP pkts length =1500 1480 bytes in data field offset =0 ID =x fragflag =1 offset =0 length =1500 ID =x fragflag =1 offset =185 length =1040 ID =x fragflag =0 offset =370 offset = 1480/8 offset = (1480 +1480)/8 Verification: Data size = 1500 + 1500 +1040 – (20 + 20 + 20) = 3980 Bytes Network Layer 4-53 Chapter 4: Network Layer 4. 1 Introduction 4.2 Virtual circuit and datagram networks 4.3 What’s inside a router 4.4 IP: Internet Protocol Datagram format IPv4 addressing ICMP IPv6 4.5 Routing algorithms Link state Distance Vector Hierarchical routing 4.6 Routing in the Internet RIP OSPF BGP 4.7 Broadcast and multicast routing Network Layer 4-54 Subnets IP address: subnet part (high order bits) (Recall: Network Prefix) host part (low order bits) 223.1.1.1 223.1.2.1 223.1.1.2 223.1.1.4 223.1.1.3 What’s a subnet ? device interfaces with same subnet part of IP address 223.1.2.9 223.1.3.27 223.1.2.2 subnet 223.1.3.1 223.1.3.2 can physically reach each other without an intervening router network consisting of 3 subnets Network Layer 4-55 IP addresses: how to get one? hard-coded by system admin in a file DHCP: Dynamic Host Configuration Protocol: dynamically get address from a server Network Layer 4-56 DHCP: Dynamic Host Configuration Protocol Goal: allow host to dynamically obtain its IP addr. from network server when it joins network Allows reuse of addresses (only hold addr while connected) Support for mobile users ….. DHCP overview: host broadcasts “DHCP discover” msg DHCP server responds with “DHCP offer” msg host requests IP address: “DHCP request” msg DHCP server sends address: “DHCP ack” msg Network Layer 4-57 DHCP client-server scenario DHCP server A B 223.1.2.1 223.1.1.1 223.1.1.2 223.1.1.4 RA 223.1.2.9 223.1.2.2 223.1.1.3 223.1.3.1 arriving DHCP client needs address in this network 223.1.3.27 223.1.3.2 E DHCP Relay Agent (RA) on routers … DHCP server Note: A DHCP server need not be a separate machine. Router Many routers run DHCP servers for small networks. Network Layer 4-58 DHCP Port #67 DHCP client-server scenario UDP IP DHCP Discover src : 0.0.0.0, 68 dest.:, 255.255.255.255, 67 yiaddr: 0.0.0.0 transaction ID: 654 (example) DHCP server: 223.1.2.5 DHCP Port #68 UDP IP Arriving client DHCP Offer src: 223.1.2.5, 67 dest: 255.255.255.255, 68 yiaddr: 223.1.2.4 note this transaction ID: 654 Lifetime: 3600 secs Broadcast IP addr: 255.255.255.255 DHCP Request src: 0.0.0.0, 68 dest:: 255.255.255.255, 67 yiaddr: 223.1.2.4 transaction ID: 655 Lifetime: 3600 secs time DHCP ACK src: 223.1.2.5, 67 dest: 255.255.255.255, 68 yiaddr: 223.1.2.4 transaction ID: 655 Lifetime: 3600 secs Client’s IP addr= 223.1.2.4 4-59 DHCP: returns more than an IP address It returns: IP address of first-hop router for client (aka default gateway) name and IP address of DNS sever DNS: Domain Name System Function: Machine name IP address Example: naik3.uwaterloo.ca 129.97.10.192 network mask IP addr block: 11001000 00010111 00010000 00000000 200.23.16.0/20 Net. mask: 11111111 11111111 11110000 00000000 255.255.240.0 Network Layer 4-60 Hierarchical addressing: address aggregation Hierarchical addressing allows efficient advertisement of routing information: Without aggregation: R1 advertises 8 routes With aggregation: R1 advertises 1 route Organization 0 200.23.16.0/23 Organization 1 “Send me anything with addresses beginning 200.23.18.0/23 Organization 2 200.23.20.0/23 Organization 7 . . . . . . 200.23.16.0/20” Fly-By-Night-ISP R2 R1 200.23.30.0/23 Internet ISPs-R-Us “Send me anything with addresses beginning 199.31.0.0/16” Network Layer 4-61 Address Aggregation (a.k.a. supernetting) 1 entry on RT B 4 entries on RT A 192.168.0.0/24: 11000000.10101000.000000 00.00000000 192.168.1.0/24: 11000000.10101000.000000 01.00000000 192.168.2.0/24: 11000000.10101000.00000010.00000000 192.168.3.0/24: 11000000.10101000.00000011.00000000 192.168.0.0/22 Advertise: I can reach 192.168.0.0/22 Network Layer 4-62 Address aggregation (Cisco: Summary address) Organization 0 Organization 1 Organization 2 ... 11001000 00010111 0001 000 0 00000000 11001000 00010111 0001 001 0 00000000 11001000 00010111 0001 010 0 00000000 ….. …. 200.23.16.0/23 200.23.18.0/23 200.23.20.0/23 …. Organization 7 11001000 00010111 0001 111 0 00000000 200.23.30.0/23 ISP's Address block 11001000 00010111 0001 0000 00000000 200.23.16.0/20 Summary address Possible Final Q.: Given a few IP address blocks, find their summary address. Network Layer 4-63 NAT: Network Address Translation Motivation: One way to solve the address shortage problem ….. The 2nd way is to use IPv6 … local (say, home) network rest of Internet 138.76.28.2 10.0.0.1 138.76.28.1 (private IP addr) 10.0.0.4 NAT 138.76.29.7 138.76.28.3 10.0.0.2 138.76.28.4 10.0.0.3 All IP pkts leaving local net have same single source NAT IP addr: 138.76.29.7, but different source port # (with unchanged dest. IP addr) IP pkts with source and destination in this network have 10.0.0.0/24 address for source, destination (as usual) Network Layer 4-64 NAT in UW Public IP address space (globally unique) Private IP address space (not globally unique) Remember Network Layer 4-65 NAT: Network Address Translation Local network uses just one IP address as far as outside world is concerned. more devices are supported. A range of addr. is not needed from ISP: just one IP addr for all. Devices inside local net are not explicitly addressable (i.e. visible) by outside world (a security plus). This constraint is seen as a security plus point. App TCP Port (16-bit #) IP Network Layer 4-66 NAT: Network Address Translation 2: NAT router - changes IP pkt source addr from 10.0.0.1, 3345 to 138.76.29.7, 5001, NAT translation table WAN side addr LAN side addr 1: host 10.0.0.1 (with port# 3345) sends datagram to 128.119.40.186, 80 138.76.29.7, 5001 10.0.0.1, 3345 …… …… S: 10.0.0.1, 3345 D: 128.119.40.186, 80 10.0.0.1 - updates table 2 S: 138.76.29.7, 5001 D: 128.119.40.186, 80 138.76.29.7 S: 128.119.40.186, 80 D: 138.76.29.7, 5001 3: Reply arrives dest. address: 138.76.29.7, 5001 3 1 10.0.0.4 S: 128.119.40.186, 80 D: 10.0.0.1, 3345 10.0.0.2 4 10.0.0.3 4: NAT router changes IP pkt dest addr from 138.76.29.7, 5001 to 10.0.0.1, 3345 Network Layer 4-67 Implementation of NAT outgoing datagrams replace (src IP addr, port #) of every outgoing datagram with (NAT IP addr, new port #) [remote clients/servers will respond using (NAT IP addr, new port #) as dest. addr. ] remember (in NAT translation table) every mapping (src IP addr, port #) (NAT IP addr, new port #) incoming datagrams replace (NAT IP address, new port #) in dest fields of every incoming datagram with corresponding (src IP addr, port #) stored in NAT table Network Layer 4-68 NAT traversal problem Client wants to connect to server with address 10.0.0.1 Client ? server address 10.0.0.1 local to LAN (client can’t use it as destination addr) 10.0.0.4 138.76.29.7 only one externally visible NATed address: 138.76.29.7 10.0.0.1 NAT router Network Layer 4-69 NAT traversal problem: Solution #3 Relaying 3. Relaying established 1. Connection to relay initiated by NATed host Client 2. Connection to relay initiated by client 138.76.29.7 10.0.0.1 NAT router Network Layer 4-70 NAT traversal problem: solution #1 statically configure NAT to forward incoming connection requests at given port to server e.g., (138.76.29.7, port 2500) always forwarded to 10.0.0.1, port 2500 Client 10.0.0.1 ? 10.0.0.4 138.76.29.7 NAT router Network Layer 4-71 NAT traversal problem: solution #2 (automate static NAT port map configuration) 10.0.0.1 Internet Gateway Device (IGD) Protocol allows NATed host to: 138.76.29.7 learn public IP address (138.76.29.7) add/remove port mappings (with lease times) IGD 10.0.0.4 NAT router Network Layer 4-72 Chapter 4: Network Layer 4. 1 Introduction 4.2 Virtual circuit and datagram networks 4.3 What’s inside a router 4.4 IP: Internet Protocol Datagram format IPv4 addressing ICMP IPv6 4.5 Routing algorithms Link state Distance Vector Hierarchical routing 4.6 Routing in the Internet RIP OSPF BGP 4.7 Broadcast and multicast routing Network Layer 4-73 ICMP: Internet Control Message Protocol used by hosts & routers to communicate network-level information error reporting: unreachable host, network, port, protocol echo request/reply (used by ping) network-layer “above” IP: ICMP msgs carried in IP datagrams ICMP message: type, code plus first 8 bytes of IP datagram causing error Type 0 3 3 3 3 3 3 Code 0 0 1 2 3 6 7 description echo reply (ping) dest. network unreachable dest host unreachable dest protocol unreachable dest port unreachable dest network unknown dest host unknown 8 9 10 11 12 0 0 0 0 0 echo request (ping) router advertisement router discovery TTL expired bad IP header Network Layer 4-74 Traceroute and ICMP Source sends series of UDP segments to dest first has TTL =1 second has TTL=2, etc. unlikely port number When nth datagram arrives to nth router: router discards datagram and sends to source an ICMP message (type 11, code 0) ICMP message includes name of router & IP address when ICMP message arrives, source calculates RTT traceroute does this 3 times Stopping criterion UDP segment eventually arrives at destination host destination returns ICMP “port unreachable” packet (type 3, code 3) when source gets this ICMP, stops. Network Layer 4-75 Chapter 4: Network Layer 4. 1 Introduction 4.2 Virtual circuit and datagram networks 4.3 What’s inside a router 4.4 IP: Internet Protocol Datagram format IPv4 addressing ICMP IPv6 4.5 Routing algorithms Link state Distance Vector Hierarchical routing 4.6 Routing in the Internet RIP OSPF BGP 4.7 Broadcast and multicast routing Network Layer 4-76 IPv6 Initial motivation: 32-bit address space soon to be completely allocated. Additional motivation: header format helps speed processing/forwarding header changes to facilitate QoS (Quality of Service) IPv6 datagram format: fixed-length 40 byte header no fragmentation allowed Network Layer 4-77 IPv6 Header (Cont) Priority: identify priority among datagrams in flow Flow Label: identify datagrams in same “flow.” (concept of“flow” not well defined). Next header: identify upper layer protocol for data ver pri flow label hop limit payload len next hdr source address (128 bits) destination address (128 bits) data 32 bits Network Layer 4-78 Other Changes from IPv4 Checksum: removed entirely to reduce processing time at each hop Options: allowed, but outside of header, indicated by “Next Header” field ICMPv6: new version of ICMP additional message types, e.g. “Packet Too Big” multicast group management functions Network Layer 4-79 Transition From IPv4 To IPv6 Not all routers can be upgraded simultaneously How will the network operate with mixed IPv4 and IPv6 routers? Tunneling: IPv6 carried as payload in IPv4 datagram among IPv4 routers Network Layer 4-80 Tunneling Logical view: E F IPv6 IPv6 IPv6 A B E F IPv6 IPv6 IPv6 IPv6 A B IPv6 Physical view: tunnel IPv4 IPv4 Network Layer 4-81 Tunneling F E A B IPv6 IPv6 A B C D E F IPv6 IPv6 IPv4 IPv4 IPv6 IPv6 Logical view: tunnel IPv6 IPv6 Physical view: Flow: X Src: A Dest: F data A-to-B: IPv6 Src:B Dest: E Src:B Dest: E Flow: X Src: A Dest: F Flow: X Src: A Dest: F data data B-to-C: IPv6 inside IPv4 D-to-E: IPv6 inside IPv4 Flow: X Src: A Dest: F data E-to-F: IPv6 Network Layer 4-82 Chapter 4: Network Layer 4. 1 Introduction 4.2 Virtual circuit and datagram networks 4.3 What’s inside a router 4.4 IP: Internet Protocol 4.5/4.6 Routing algorithms Distance vector + RIP Link state + OSPF Hierarchical routing BGP Datagram format IPv4 addressing ICMP IPv6 Network Layer 4-83 Hierarchical Routing Scale: with 200 Mil. destination networks can’t store all dest’s in routing tables! routing table exchange would swamp links! Address aggregation alleviates the problem, but not enough….. Solution: Autonomous System Hierarchical routing Network Layer 4-84 Hierarchical organization of the Internet Autonomous System BGP (Border Gateway Protocol) Routers (Ordinary) routers Internet Network Layer 4-85 An Autonomous System is a set of routers under a single technical admin, using an interior gateway protocol and common metrics to route packets within the AS, and using an exterior gateway protocol to route packets to other AS’s. AS’s are identified by a 16-bit ID, called AS number. Network Layer 4-86 Top IPv6 providers Example AS numbers and names AS Num of Customers number 159 174 113 577 105 15290 Network Name COGENT Cogent/PSI BACOM – Bell Canada ALLST-15290 – Allstream Corp. 102 88 84 84 852 3356 6539 6327 ASN852 – Telus Advanced Communications LEVEL3 Level 3 Communications GT-BELL – Bell Canada SHAW – Shaw Comm. Inc. 64 63 701 3257 UUNET – MCI Comm. Services, Inc. d/b/a Verizon Business TINET-BACKBONE Tinet SpA 57 50 49 6453 3549 13768 GLOBEINTERNET TATA Communications GBLX Global Crossing Ltd. PEER1 – Peer 1 Network Inc. Network Layer 4-87 ORANO #26677 COGENT #174 IPv4/v6 IPv4 Univ. of Waterloo AS #12093 Hydro One Telecom Inc #19752 IPv4/v6 IPv4 Allstream Corp. #15290 Network Layer 4-88 UW AS #12093 Voskamp Advertises 4 prefixes 198.96.155.0/24 (IPv4) 129.97.0.0/16 (IPv4) 129.97.248.0/21 (IPv4) /47 (IPv6) Network Layer 4-89 Reason for advertising TWO UW IPv4 prefixes COGENT #174 ORANO #26677 Advt. 129.97.0.0/16 UW 129.97.0.0/16 1 Gbps 129.97.248.0/21 Student Residence Advt. 129.97.0.0/16 1 Gbps Hydro One Telecom Inc #19752 Advt. 129.97.0.0/16 129.97.248.0/21 10 Gbps 1 Gbps Advt. 129.97.0.0/16 Allstream Corp. #15290 Network Layer 4-90 Routing Protocols Intra-AS routing (within an AS) (IGP: Interior Gateway Protocols) Inter-AS routing (among AS) (EGP: Exterior Gateway Protocols) RIP: Routing Information Protocol BGP: Border Gateway Protocol or OSPF: Open Shortest Path First or your own proprietary protocol (Cisco: EIGRP (Enhanced Interior Gateway Routing Protocol)) Choose one intra-AS routing protocol in a given AS. Two different intra-AS routing protocols do NOT run in the same AS. If one+ intra-AS protocols run in the same AS, (Cisco) routers can be configured to redistribute routes. Network Layer 4-91 NOTE RT RT RIP BGP RT RT RIP RIP RT RIP RT RIP AS (running RIP for intra-AS routing) Network Layer 4-92 A few things to remember …. RIP OSPF BGP Network Layer 4-93 5 Graph abstraction Graph: G = (N,E) N = set of routers = { u, v, w, x, y, z } 2 u v 2 1 x 3 w 3 z 1 y 1 5 2 E = set of links ={ (u,v), (u,x), (v,x), (v,w), (x,w), (x,y), (w,y), (w,z), (y,z) } c(x,x’) = cost of link (x,x’) Example: c(w,z) = 5 Cost of path (x1, x2, x3,…, xp) = c(x1,x2) + c(x2,x3) + … + c(xp-1,xp) Cost: hop count, delay, …. Network Layer 4-94 RIP Routing Information Protocol Network Layer 4-95 Distance Vector Algorithm x y v Let dx(y) := cost of least-cost path from x to y Bellman-Ford Equation dx(y) = min {c(x, v) + dv(y) } min is taken over all neighbors v of x The neighbor that leads to the minimum cost is the next hop on shortest path from x to y. Network Layer 4-96 Bellman-Ford example du(z) = min { c(u,v) + dv(z), c(u,x) + dx(z), c(u,w) + dw(z) } 5 2 u v 2 1 x 3 w 3 1 5 z 1 y 2 = min {2 + 5, 1 + 3, 5 + 3} =4 Clearly, dv(z) = 5, dx(z) = 3, dw(z) = 3 Network Layer 4-97 Distance Vector Algorithm Dx(y) = estimate of least cost from x to y x maintains distance vector Dx = {Dx(y): y є N } node x: knows cost to each neighbor v: c(x, v) maintains its neighbors’ distance vectors. For each neighbor v, x maintains Dv = {Dv(y): y є N } Network Layer 4-98 Distance vector algorithm Basic idea: from time-to-time, each node sends its own distance vector estimate to neighbors v y when x receives new DV estimate from a neighbor, it updates its own DV using B-F equation: Dx(y) ← minv{c(x,v) + Dv(y)} x for each node y ∊ N In steady state the estimate Dx(y) converges to the actual least cost dx(y) Network Layer 4-99 LHTR-N1(4) ICR-N1(3) LHTR-N1(5) LHTR-N1(3) N2 ICR-N1(2) LHTR-N1(2) LHTR- N1(6) LHTR-N1(4) ICR-N1(1) ICR-N1(3) Config. this for N1 ICR-N1(2) LHTR-N1(3) N1 I can reach N1 (Hopcount cost): ICR-N1(cost) Learns how to reach N1 (Hopcount cost): LHTR-N1(cost) Network Layer 4-100 Distance Vector Algorithm: General Idea Each node: Wait for (change in local link cost or msg from neighbor) Recompute estimates Notify neighbours, if DV to any dest has changed Network Layer 4-101 Distance Vector Algorithm In steady state, shortest paths are established. Routers learn from neighbors only. simplicity Good news travels fast (and explicitly). When a router finds a shorter path to a dest., the news is broadcasted to its neighbors in the next round of comm. Bad news travels slowly When a router does not hear from a neighbor, it does not tell its other neighbors about the failure. Routing loops are formed. 102 Note: RT @ each node DV: Two-node instability problem Before failure X 1 A X A 2 B 1 X A X B A X A 4 1 B : : X A 2 1 Cost X B 3 After failure X - ∞ NH After B receives update from A [Dest, NH, cost] X - 1 Dest. Finally After A receives update from B X - ∞ X B 3 X A X A 2 1 B X A 1 X - ∞ B 103 DV: Solutions to the loop problem Redefine “infinity” to a smaller number (say, 16). X A B Split horizon If node B learns about a path to X from A, subsequently, B does not tell A about this. • Taking information from A, modifying it, and sending it back to A creates confusion. Split horizon with poisoned reverse If A routes to X via B, A tells B that it has an infinity-cost path to X. 104 DV: Three-node instability problem X 1 X - 1 X A 2 A B 1 1 X Failure has occurred … X 1 A 1 After A sends the update to B and C, but the update to C is lost. X 1 X C 3 1 B 1 C X A 2 X - ∞ A 1 C Before failure X - ∞ After C sends update info to B. ∞ B 1 C X X B X A 2 4 A 1 X C 3 1 B 1 C X A 2 After B sends update info to A. X A 2 105 RIP ( Routing Information Protocol) included in BSD-UNIX distribution in 1982 uses Distance Vector algorithm cost metric: # hops (max = 15 hops), each link has cost 1 DVs exchanged with neighbors every 30 sec in response message (aka advertisement) each advertisement: list of up to 25 destination subnets (in IP addressing sense) u v A z C B D w x y from router A to destination subnets: subnet hops u 1 v 2 w 2 x 3 y 3 z 2 Network Layer 4-106 RIP: Example D w A z y x B C routing table in router D destination subnet next router # hops to dest w y z x A B B -- 2 2 7 1 …. …. .... Network Layer 4-107 RIP: Example dest w x z …. w A A-to-D advertisement next hops 1 1 C 4 … ... x z y B D C routing table in router D destination subnet next router # hops to dest w y z x A B A B -- 2 2 5 7 1 …. …. .... Network Layer 4-108 RIP: Link Failure and Recovery If no advertisement heard from a neighbor after 180 sec --> neighbor/link declared dead routes via neighbor invalidated new advertisements sent to neighbors neighbors in turn send out new advertisements (if tables changed) link failure info quickly (?) propagates to entire net poisoned reverse used to prevent ping-pong loops (infinite distance = 16 hops) Network Layer 4-109 RIP Table processing RIP routing tables managed by application-level process called route-d (daemon) advertisements sent in UDP packets, periodically repeated routed routed Transport (UDP) network (IP) link physical Transprt (UDP) routing table routing table network (IP) link physical Network Layer 4-110 OSPF Open Shortest Path First Network Layer 4-111 OSPF (Open Shortest Path First) “open”: publicly available Uses Link State algorithm Each router learns the complete topology of the network (within AS) via flooding of OSPF-advertisement messages Use Dijkstra’s algorithm to compute shortest paths between all pairs of nodes For a given source/dest. pair, find “next hop” from the shortest path. Network Layer 4-112 OSPF: Types of links 1. Point-to-Point Link 2. Transient Link (Broadcast link) LAN 3. Stub Link LAN By means of LS-advertisement, other routers know what destinations are connected in the AS. Network Layer 4-113 OSPF Link-State 4 3 4 5 A B 3 D E 2 3 D E 4 C A 3 D 5 2 C 4 3 4 5 2 C E B 4 4 3 E A B 4 3 4 D A 4 B 3 Link-State of C: (C, A, 2) (C, B, 4) (C, E, 4) 2 C B 5 A Link-State of B: (B, A, 5) (B, C, 4) (B, E, 3) Link-State of E (E, B, 3) (E, C, 4) 2 C 3 D Link-State of A: (A, B, 5) (A, C, 2) (A, D, 3) Link-State of D: (D, A, 3) 5 A 3 E LS of All nodes 3 D 5 2 C B 4 4 3 E Network Layer 4-114 Flooding of LS-advertisemnt OSPF Flooding example of LS of D: (D, A, 3) = 5 B A 2 3 D C 4 3 4 E Similarly, LS of other nodes are flooded …… Network Layer 4-115 A Link-State Routing Algorithm Notation: c(x,y): link cost from node x to y; = ∞ if not direct neighbors D(v): current value of cost of path from source to dest. v p(v): predecessor node along path Cost? Hop count, delay, loss, … New: Cost computation (Cisco) Default cost computation Cost α 1/Link speed Cost = 100 Mbps/Link speed from source to v Examples: N': set of nodes whose least cost paths 1 Mbps link Cost = 100 10 Mbps link Cost = 10 100 Mbps link Cost = 1 1 Gbps link Cost 1 definitively known You can configure routers with cost computation approaches. Network Layer 4-116 Dijkstra’s Algorithm 1 Initialization: u is assumed to be “self.” 2 3 4 5 6 N' = {u} for all nodes v if v adjacent to u then D(v) = c(u,v) and p(v) = u else D(v) = ∞ 8 Loop 9 find w not in N' such that D(w) is a minimum 10 add w to N' 11 12 13 14 All nodes run this algorithm. v u w update D(v) for all v adjacent to w and not in N' : D(v) = min( D(v), D(w) + c(w,v) ); p(v) = w /* if w is chosen */ /* new cost to v is either old cost to v or known shortest path cost to w plus cost from w to v */ 15 until all nodes in N' Network Layer 4-117 Dijkstra’s algorithm: example D(v) D(w) D(x) D(y) D(z) Step 0 1 2 3 4 5 N' p(v) p(w) p(x) u uw uwx uwxv uwxvy uwxvyz 7,u 6,w 6,w 3,u ∞ ∞ 5,u ∞ 5,u 11,w 11,w 14,x 10,v 14,x 12,y p(y) u construct shortest path tree by tracing predecessor nodes ties can exist (broken arbitrarily) 9 7 4 8 3 u w x 5 Notes: v p(z) w y 3 7 2 z 4 v Network Layer 4-118 Dijkstra’s algorithm: example Resulting shortest-path tree from u: x 5 u Resulting routing table in u: 3 w z y 2 3 4 v Destination Next hop v x w x y w w w z w Network Layer 4-119 OSPF “advanced” features (not in RIP) security: all OSPF messages authenticated (to prevent malicious intrusion) multiple same-cost paths allowed (only one path in RIP) hierarchical OSPF in large domains (next slide ….) Network Layer 4-120 Hierarchical OSPF A large AS is partitioned into several “Areas”. boundary router (BGP router) backbone router area border routers Backbone (Area 0) Area 3 internal routers Area 1 Area 2 Network Layer 4-121 Border Gateway Protocol Network Layer 4-122 Inter-AS tasks AS1 must: 1. learn which dests are reachable through AS2, which through AS3 suppose a router in AS1 receives datagram destined outside of AS1: router should forward packet to edge router, but which one? 3c 3b other networks 3a AS3 2c 1c 1a AS1 1d 2a 1b 2b other networks AS2 Network Layer4-123 Internet inter-AS routing: BGP BGP (Border Gateway Protocol): the de facto inter-domain routing protocol “glue that holds the Internet together” allows subnets to advertise their existence to rest of Internet: “I am here” BGP provides each AS a means to: (eBGP) obtain subnet reachability information from neighboring ASs. (iBGP) propagate reachability information to other AS-internal BGP routers. determine “good” routes to other networks based on reachability information and policy Network Layer 4-124 BGP basics BGP session: two BGP routers (“peers”) exchange BGP msg: advertising paths to different destination network prefixes (“path vector” protocol) exchanged over semi-permanent TCP connections when AS3 advertises a prefix to AS1: AS3 promises that it will forward datagrams towards that prefix AS3 can aggregate prefixes in its advertisement 3c 3b other networks 3a BGP message AS3 2a 1c 1a AS1 1d 1b 2c AS2 2b other networks Network Layer 4-125 BGP basics: distributing path information using eBGP session over 3a-to-1c, AS3 sends prefix reachability info to AS1. 1c can then use iBGP to distribute new prefix info to all other BGP routers in AS1 (Note: with all-to-all TCP conns.) 1b can then re-advertise new reachability info to AS2 over 1b-to-2a eBGP session when a router learns of a new prefix, it creates an entry for the prefix in its routing table. 3c 3b other networks eBGP session 3a AS3 1a AS1 iBGP session 1c 1d 1b 2a 2c AS2 2b other networks Network Layer 4-126 Advt. of “paths to destination nets” between neighboring AS’s Recall: Dest. Nets are represented by prefixes. Route (Path) = prefix + attributes Two important attributes: AS-PATH: a sequence of ASs through which prefix advertisement has passed: e.g., AS 67, AS 17, AS 205, … N1 AS 205 17 67 55 ({67, 17, 205}; N1) 70 ({55, 67, 17, 205}; N1) NEXT-HOP: indicates specific router IP addr in next-hop AS. Network Layer 4-127 Advt. of “paths to destination nets” between neighboring AS’s gateway (BGP) router receiving route advt. uses import policy to accept/decline a route Example: never route through AS 64496 Import policy: Accept/decline a route You receive an AS-PATH to dest. X: ({AS 12, AS 215, AS 64496, AS 99}; X) You do not trust AS 64496……. So you decline the above path to X Documentation AS: 64496 -- 64511 Network Layer 4-128 Remember BGP prefers policy-based routing as opposed to cost-based routing in OSPF and RIP Network Layer 4-129 BGP route selection router may learn about more than 1 route to destination AS; it selects a route based on: 1. policy decision 2. shortest AS-PATH Network Layer 4-130 BGP messages: exchanged between peers over TCP conn. BGP messages: OPEN • opens TCP connection to peer and authenticates sender UPDATE: advertises new path (or withdraws old) KEEPALIVE: keeps connection alive in absence of UPDATES NOTIFICATION: reports errors in previous msg; also used to close connection Network Layer 4-131 BGP routing policy legend: customer network B W provider network X A C Y Export policy: A router can hide a path X is dual-homed: attached to two networks (B and C) if X does not want to route from B to C X will not advertise to B a route to C Essentially, X does not tell B that it (X) has a path to C….. Network Layer 4-132 Why different Intra- and Inter-AS routing ? Policy: Inter-AS: admin wants control over how its traffic is routed, who routes through its net. Intra-AS: single admin, so no policy decisions needed Scale: hierarchical routing saves table size, reduces update traffic Performance: Intra-AS: focuses on performance Inter-AS: policy may dominate performance Network Layer 4-133 Multi-hop communication done! End of Network Layer 134