Computer Networks
An Open Source Approach
Chapter 4: Internet Protocol Layer
Ying-Dar Lin, Ren-Hung Hwang, Fred Baker
Content








4.1 General Issues
4.2 Data-Plane Protocols: IPv4
4.3 Data-Plane Protocols: IPv6
4.4 Control-Plane Protocols: Address Management
4.5 Control-Plane Protocols: Error Reporting
4.6 Control-Plane Protocols: Routing
4.7 Control-Plane Protocols: Multicast Routing
4.8 Summary
Protocols Discussed in this Chapter
DHCP server
NAT Server
host
Router
TCP/UDP
ICMP
Routing
Protocols
IP address
Subnet
Default
router
IP
ARP
Data Link
Routing
Table
IP
NAT
Data Link
IP
Data Link
Open Source Implementation 4.1: IP-Layer Packet
Flows in Call Graphs
Raw IP
ip_append_data
UDP
TCP
ip_append_page
ip_push_pending_frames
ip_queue_xmit
Raw IP
UDP
TCP
raw_v4_input udp_v4_rcv tcp_v4_rcv
Transport
Layer
ip_route_output_flow
__ip_route_output_key
ip_route_output_slow
ip_local_deliver_finish
ip_local_deliver
skb->dst->input
dst_input
dst_output
skb->dst->output
ip_output
ip_route_input
IP
Layer
ip_finish_output
ip_rcv_finish
ip_rcv
ip_finish_output2
netif_receive_skb
net_rx_action
net_tx_action
dev_queue_xmit
Medium Access Control (MAC)
Data link
Layer
4.1 General Issues
Service
 Addressing
 Forwarding
 Routing
 Security

Service


Provides a host-to-host transmission service
Connects several LANs into an internetwork


a network of networks
“Internet”

the global internetwork to which most networks are connected
Internetwork

An example of an internetwork
Ethernet
Fast Ethernet
H1
H2
R1
GigabitEthernet
R2
R3
H3
Wireless LAN
Internet Service Model


Connectionless
Best effort delivery





packets may be lost
packets may be delivered out of order
duplicate copies of a packet may be delivered
packets may be delayed for a long time
Next-hop forwarding based on destination
address
Address



A globally unique address for host
identification
Data link layer: a flat address
Network layer: a hierarchical address
Deliver a packet

How to deliver a packet?

Routing



Find a path from source to destination
Done by routing protocols
Forwarding


Forward packets at a router
Look up the next-hop from the routing table and then
forward
Forwarding at Data Plane

Steps


Extract destination address
Look up destination address in routing table


Obtain the output interface from routing table
Forward the packet
Look Up the Routing Table

Issues


Speed and memory requirement
Good data structure



fast look up and table update
low memory requirement
Classical approaches




Trie
Hash
Fast lookup table
Hardware implementation
Look Up the Routing Table

An example of trie with prefixes {00*,010*, 11*, 0001*,
001*, 10100*, 111*}.
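The trie lookup for this prefix set can be sketched as a binary trie that consumes one address bit per level and remembers the last stored prefix seen on the way down. This is a minimal illustration, not the structure used by any particular router; the names `TrieNode`, `build_trie`, and `longest_prefix_match` are hypothetical.

```python
class TrieNode:
    """One node of a binary trie; each edge is labeled '0' or '1'."""

    def __init__(self):
        self.children = {}   # '0' or '1' -> TrieNode
        self.prefix = None   # set when a stored prefix ends at this node


def build_trie(prefixes):
    """Insert every prefix (a string of '0'/'1' characters) into a new trie."""
    root = TrieNode()
    for prefix in prefixes:
        node = root
        for bit in prefix:
            node = node.children.setdefault(bit, TrieNode())
        node.prefix = prefix
    return root


def longest_prefix_match(root, address_bits):
    """Walk the trie bit by bit and return the longest stored prefix, or None."""
    node, best = root, None
    for bit in address_bits:
        node = node.children.get(bit)
        if node is None:
            break                      # no deeper branch for this address
        if node.prefix is not None:
            best = node.prefix         # remember the most specific match so far
    return best


# Prefix set from the slide, with the trailing '*' dropped.
PREFIXES = ["00", "010", "11", "0001", "001", "10100", "111"]
trie = build_trie(PREFIXES)
```

For example, `longest_prefix_match(trie, "00010000")` passes the node for 00* and then reaches 0001*, so the longer prefix is returned. The lookup cost is bounded by the address length, which is why trie variants are a classical choice for routing tables.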
Routing at Control Plane

Task of routing


Select a path from the source to the destination
Goal of routing





Efficient (low delay, high throughput, …)
Scalable
Stable
Robust
Fair
IP Routing
[Figure: a path from source S through routers R to destination D]
Hop-by-hop routing
  Option: source routing
Shortest path routing
Available information
  Global information vs. local information
  Ex. OSPF vs RIP
Information exchange
  Flooding (broadcast) vs. neighbors only
  Ex. OSPF vs RIP
Principle in Action: Bridging vs. Routing
Both can be used for connecting two or more LANs
Both look up a table for forwarding packets
Layering:
  A bridge forwards a frame based on the link-layer header, e.g., destination MAC address
  A router forwards a packet based on the network layer header information, e.g., destination IP address
Table:
  A bridge usually builds a forwarding table through transparent self-learning
  A router builds a routing table by running a routing protocol explicitly
Principle in Action: Bridging vs. Routing

Collision domain vs. broadcast domain:
  A bridge is used to separate a collision domain
    A collision domain refers to a network segment
    An n-port bridge could separate one collision domain into n collision domains
    All these collision domains are still under the same broadcast domain unless VLANs are created
  A router is used to separate a broadcast domain
    All nodes can communicate with each other by broadcast at the link layer
    An n-port router could separate one broadcast domain into n broadcast domains
Scalability:
  Bridging is less scalable than routing due to the broadcast requirement
    if millions of hosts are bridged together, it will be very difficult, if not impossible, to deliver a broadcast message to all hosts
    when a MAC address is not learned into the forwarding table, flooding will be used to forward a frame
Multicast

Definition of a multicast



Communication between a group of hosts
Packets are sent to all group members
Issues

Group membership


receivers of a multicast session
Multicast tree construction


Multiple point-to-point connections or a multicast tree
A multicast tree connects the source node to all
destination nodes
Security of IP

Aspects of network security
Access Control
  Control who has the rights to access
Data Security
  Encrypt messages transmitted
Intrusion Detection
  Detect illegal break-ins
Data-Plane Protocols and Mechanisms
4.2 Internet Protocol
4.3 Internet Protocol Version 6
4.2 Internet Protocol
Addressing
 Subnetting
 Forwarding
 Packet format
 Fragmentation and re-assembly

IP Address


A globally unique 32-bit address to identify
a network interface
A hierarchical address



consists of network id and host id
A router usually has more than one interface, and therefore more than one address
A host may have more than one address
IP Address Notation
140.123.1.1 = 10001100 01111011 00000001 00000001
140
123
1
1
IP address notation
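The dotted-decimal notation above is simply the four octets of the 32-bit address written in decimal. A small sketch of the conversion follows; the helper names `dotted_to_bits` and `dotted_to_int` are illustrative only.

```python
def _octets(address):
    """Split a dotted-decimal IPv4 address into four validated octets."""
    octets = [int(part) for part in address.split(".")]
    if len(octets) != 4 or any(not 0 <= o <= 255 for o in octets):
        raise ValueError(f"not a dotted-decimal IPv4 address: {address!r}")
    return octets


def dotted_to_bits(address):
    """Return the address as four space-separated 8-bit binary groups."""
    return " ".join(f"{o:08b}" for o in _octets(address))


def dotted_to_int(address):
    """Return the address as one 32-bit integer, first octet most significant."""
    value = 0
    for octet in _octets(address):
        value = (value << 8) | octet
    return value
```

Calling `dotted_to_bits("140.123.1.1")` reproduces the bit pattern shown on the slide.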
Transmission Order
[Figure: transmission order of the address 140.123.1.1]
Byte order stored in memory (memory addresses A to A+3):
  Big Endian:    10001100 01111011 00000001 00000001
  Little Endian: 00000001 00000001 01111011 10001100
Byte order transmitted from network layer to data link layer: Big Endian
Bit order transmitted from Ethernet to physical layer: Little Endian
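The two in-memory layouts can be reproduced with Python's `struct` module, which packs the same 32-bit value in either byte order. This is only an illustration of byte order; the constant name `ADDRESS` is hypothetical.

```python
import struct

ADDRESS = 0x8C7B0101  # 140.123.1.1 as a 32-bit integer

# '>' forces big-endian, '<' forces little-endian, '!' means network byte order.
big_endian = struct.pack(">I", ADDRESS)
little_endian = struct.pack("<I", ADDRESS)
network_order = struct.pack("!I", ADDRESS)
```

Here `list(big_endian)` gives the octets in the order 140, 123, 1, 1, and network byte order is the same as big-endian, which is why hosts with little-endian processors convert multi-byte header fields before transmission.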
Class-ful IP Address
Bit positions 0-31; the leading bits select the class:
Class  Leading bits  Fields                                   Address range
A      0             Network (bits 1-7), Host (bits 8-31)     0.0.0.0 to 127.255.255.255
B      1 0           Network (bits 2-15), Host (bits 16-31)   128.0.0.0 to 191.255.255.255
C      1 1 0         Network (bits 3-23), Host (bits 24-31)   192.0.0.0 to 223.255.255.255
D      1 1 1 0       Multicast address                        224.0.0.0 to 239.255.255.255
E      1 1 1 1       Reserved                                 240.0.0.0 to 255.255.255.255
IANA IPv4 Address Space Registry:
http://www.iana.org/assignments/ipv4-address-space/ipv4-address-space.xhtml
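Because the class is encoded in the leading bits, it can be read off the first octet alone. A minimal sketch, with the hypothetical helper name `address_class`:

```python
def address_class(address):
    """Return the classful category (A-E) of a dotted-decimal IPv4 address."""
    first_octet = int(address.split(".")[0])
    if not 0 <= first_octet <= 255:
        raise ValueError(f"invalid first octet in {address!r}")
    if first_octet < 128:    # leading bit 0
        return "A"
    if first_octet < 192:    # leading bits 10
        return "B"
    if first_octet < 224:    # leading bits 110
        return "C"
    if first_octet < 240:    # leading bits 1110 (multicast)
        return "D"
    return "E"               # leading bits 1111 (reserved)
```

For instance, 140.123.1.1 starts with the bits 10 and therefore falls in class B.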
Reserved IP Addresses

Host id = 0


denotes the network itself
Host id = F…F

broadcast address of the network
IP Subnet


Network address uniquely identifies a
physical network
A physical network consists of several LANs



Subnet mask is used to identify a subnet
Hosts in the same IP subnet talk directly without
intervening router
For example


cs.ccu.edu.tw: 140.123.101.0
subnet mask: 255.255.255.0 or 140.123.101.0/24
IP Subnet Addressing
[Figure: the host field of a classful address is split into a subnet field and a host field]
Class A: 0     | Network | Subnet | Host
Class B: 1 0   | Network | Subnet | Host
Class C: 1 1 0 | Network | Subnet | Host
Copyright reserved 2001 (Lin & Hwang)
IP Subnet
H2
H1
Subnet:
140.123.1.0
140.123.1.2
140.123.1.1
140.123.1.250
R1
140.123.250.1
140.123.250.2
Subnet:
140.123.250.0
R2
140.123.2.250
140.123.2.1
H3
140.123.250.3
R3
140.123.3.250
140.123.2.2
140.123.3.1
H4
Subnet: 140.123.2.0
H5
Subnet: 140.123.3.0
Classless IP Address

Classful addressing:

Inefficient use of address space




A class B address is too large
A class C address is too small
Scalability: too many class C routing entries
CIDR: Classless InterDomain Routing


network portion of address of arbitrary length
address format: a.b.c.d/x
Authority
ICANN: Internet Corporation for Assigned
Names and Numbers
 allocates addresses
 manages DNS
 assigns domain names, resolves disputes
IP Forwarding

Aspects of forwarding



Packets from upper layer protocols
Packets from a network interface
Routing table



Forwarding is based on routing table
Routing entry: (Destination/SubnetMask, NextHop)
Default router: (0.0.0.0/0, default router)
Packet Forwarding (at Host)
If (NetworkAddress of the destination == My subnet address) then
    Transmit the packet directly to the destination
Else
    Look up the routing table
    Deliver the packet to the default router
End if
Check if the destination is in my subnet:
If (((HostIP ^ DestinationIP) & SubnetMask) == 0)
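The subnet test above can be tried directly with 32-bit integers: XOR exposes the bits in which the two addresses differ, and the mask keeps only the differences in the network part. The sketch below assumes a host with a single default router; the function names and the router address in the usage note are hypothetical.

```python
import ipaddress


def same_subnet(host_ip, dest_ip, subnet_mask):
    """True if both addresses agree on every bit covered by the subnet mask."""
    host = int(ipaddress.IPv4Address(host_ip))
    dest = int(ipaddress.IPv4Address(dest_ip))
    mask = int(ipaddress.IPv4Address(subnet_mask))
    return ((host ^ dest) & mask) == 0


def next_hop_at_host(host_ip, dest_ip, subnet_mask, default_router):
    """Deliver directly on the local subnet, otherwise use the default router."""
    if same_subnet(host_ip, dest_ip, subnet_mask):
        return dest_ip
    return default_router
```

With host 140.123.1.1 and mask 255.255.255.0, a packet to 140.123.1.2 is sent directly, while a packet to 140.123.2.1 is handed to the default router (for example 140.123.1.250).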
Packet Forwarding (at Router)
Look up the routing table
If the packet is to be delivered to the upper layer
Deliver the packet to an upper layer protocol
Else if the packet is to be delivered to a directly connected subnet
Deliver the packet directly to the destination
Else
Deliver the packet to a next hop router
End if
Table Look Up

Longest prefix match




Organization A: 194.24.0.0/21
Organization B: 194.24.7.0/24
194.24.7.10 matches 194.24.0.0/21 (21 bits) as
well as 194.24.7.0/24 (24bits)
Longest prefix: 194.24.7.0/24 is the right routing
entry
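Longest prefix match can be sketched as a linear scan that keeps the matching entry with the largest prefix length. A real router would use a trie or hardware lookup instead; the table contents beyond the two organizations (the default route and the next-hop labels) and the function name `lookup` are assumptions for illustration.

```python
import ipaddress


def lookup(routing_table, destination):
    """Return the next hop of the longest matching prefix, or None."""
    dest = ipaddress.ip_address(destination)
    best_network, best_next_hop = None, None
    for prefix, next_hop in routing_table:
        network = ipaddress.ip_network(prefix)
        if dest in network and (
            best_network is None or network.prefixlen > best_network.prefixlen
        ):
            best_network, best_next_hop = network, next_hop
    return best_next_hop


ROUTING_TABLE = [
    ("194.24.0.0/21", "Organization A"),
    ("194.24.7.0/24", "Organization B"),
    ("0.0.0.0/0", "default router"),   # matches everything, but with length 0
]
```

The destination 194.24.7.10 matches all three entries; the /24 entry wins because it is the most specific.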
Open Source Implementation 4.2:
IPv4 Packet Forwarding

Search cache first; if not found, search the
routing table (FIB).
[Flowchart: ip_route_output() calls ip_route_output_key(); if the entry is found in the cache, return; if not, call ip_route_output_slow()]
Open Source Implementation 4.2 (cont)
Routing Cache
rt_hash_table
chain
u.rt_next
rtable
rtable
chain
chain
Open Source Implementation 4.2 (cont)
Routing Table (FIB)
fib_table
tb_data
fn_hash
fn_zone
fn_zones[0]
fn_zones[1]
fz_next
fz_hash[..]
fib_node
fib_node
fn_next
fn_next
fn_info
fn_info
fz_next
fz_hash[..]
fib_nh
fib_nh
nh_dev
fn_zone
fib_info
fn_zones[2]
fn_zone
fn_zones[32]
fn_zone_list
fz_next
fz_hash[..]
nh_gw
IP Packet Format (1/5)
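Bits 0-31, one row per 32-bit word:
Version (4 bits) | Header Length (4 bits) | Type of Service (8 bits) | Packet Length (bytes) (16 bits)
Identifier (16 bits) | Flags (3 bits) | 13-bit Fragmentation Offset
Time-to-Live (8 bits) | Upper Layer Protocol (8 bits) | Header Checksum (16 bits)
Source IP Address
Destination IP Address
Options
Data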
IP Packet Format (2/5)

Version Number
  Current version 4
  Version for next generation IP is 6
Header Length
  In units of 4-byte words
Type of Service (TOS)
  Desired service of the packet
IP TOS
TOS byte layout: Precedence | Type of Service | R
New: used as DS codepoint.
Precedence defined in RFC 791 (partially implemented!!):
  111: network control
  110: Internetwork control
  101: CRITIC/ECP
  100: Flash override
  011: Flash
  010: Intermediate
  001: Priority
  000: Routine
TOS defined in RFC 1349 (not implemented!!):
  1000: minimize delay
  0100: maximize throughput
  0010: maximize reliability
  0001: minimize cost
  0000: normal service
  1111: maximize security
R: Reserved
IP Packet Format (3/5)

Packet Length
  Total number of bytes (header + data)
  Maximum is 65,535 bytes
Identifier
  Uniquely identifies an IP packet
Flags
  Low-order two bits: for fragmentation control
    First bit: do not fragment
    Last bit: more fragments
IP Packet Format (4/5)

Fragmentation Offset
  Position of the fragment, measured in units of 8 bytes
Time-to-live (TTL)
  Used as hop limit
  Each router decreases TTL by one
  If TTL reaches zero, the packet is discarded and an ICMP message is sent to the source
Upper Layer Protocol
  IP:0, ICMP:1, TCP:6, UDP:17
IP Packet Format (5/5)

Header Checksum
  16-bit 1’s complement checksum of the IP header and IP options
Source Address (32 bits)
Destination Address (32 bits)
Options
  loose source routing, strict source routing, record route, record timestamp
Data
  Payload from upper layers
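The fixed 20-byte part of the header described on these slides can be unpacked with `struct`. The following is a sketch under the assumption of a header without options; `SAMPLE_HEADER` is an illustrative UDP packet header (192.168.0.1 to 192.168.0.199), and the dictionary keys are hypothetical names, not kernel field names.

```python
import socket
import struct

# Illustrative 20-byte IPv4 header (no options), written as 16-bit words.
SAMPLE_HEADER = bytes.fromhex(
    "4500 0073 0000 4000 4011 b861 c0a8 0001 c0a8 00c7"
)


def parse_ipv4_header(raw):
    """Decode the fixed 20-byte part of an IPv4 header into a dictionary."""
    (ver_ihl, tos, packet_len, identifier, flags_frag,
     ttl, protocol, checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", raw[:20])
    return {
        "version": ver_ihl >> 4,
        "header_len_bytes": (ver_ihl & 0x0F) * 4,       # field is in 4-byte words
        "tos": tos,
        "packet_len": packet_len,                        # header + data, in bytes
        "identifier": identifier,
        "dont_fragment": bool(flags_frag & 0x4000),
        "more_fragments": bool(flags_frag & 0x2000),
        "fragment_offset_bytes": (flags_frag & 0x1FFF) * 8,  # field is in 8-byte units
        "ttl": ttl,
        "protocol": protocol,                            # e.g., 1 ICMP, 6 TCP, 17 UDP
        "checksum": checksum,
        "source": socket.inet_ntoa(src),
        "destination": socket.inet_ntoa(dst),
    }
```

Parsing `SAMPLE_HEADER` yields version 4, a 20-byte header, the do-not-fragment flag set, TTL 64, and upper layer protocol 17 (UDP).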
Open Source Implementation 4.3:
IPv4 Checksum in Assembly

ip_fast_csum() function (src/include/asm_i386/checksum.h)
  optimized by writing this function in assembly language
For 80x86 machines,
  do the summation in 32-bit words first
  the result is then copied to another register
  shift the registers to have 16 bits in their low-order bits
  add up the registers
  taking the complement of the result gives the checksum
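For comparison with the assembly version, the same checksum can be written in a few lines of Python. This sketch uses the plain 16-bit formulation (sum the 16-bit words, fold the carries back in, complement) rather than the 32-bit summation trick described above; the function name `ip_checksum` and the sample header are illustrative.

```python
import struct


def ip_checksum(header):
    """16-bit one's complement of the one's complement sum of the header."""
    if len(header) % 2:
        header += b"\x00"                      # pad to a whole number of 16-bit words
    words = struct.unpack(f"!{len(header) // 2}H", header)
    total = sum(words)
    while total >> 16:                         # fold carries into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


# Sample header with the checksum field (bytes 10-11) set to zero.
HEADER_ZERO_CSUM = bytes.fromhex(
    "4500 0073 0000 4000 4011 0000 c0a8 0001 c0a8 00c7"
)
```

A sender computes the checksum with the checksum field zeroed and stores the result in that field; a receiver runs the same function over the header as received and expects 0.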
IP Fragmentation & Reassembly

Limitation from data link layers
  MTU (different link layers, different MTUs)
An IP packet larger than the MTU of its data link layer needs to be “fragmented”
  one packet becomes several small packets
  Re-assembled only at the destination
[Figure: an IP packet too large for the link layer ("Help, cannot get through.") is split into IP fragments ("Yes, can get through now.")]
Fragment Control

Identify fragments of a packet
  All fragments have the same identifier
Know the position of a fragment
  Recorded in fragmentation offset (13 bits)
Know the end of a packet
  more fragment bit of the last fragment is 0
IP Fragmentation Example
(a) Original packet
  Header: id=x, more=0, offset=0; 3200 bytes of data
(b) Fragments
  Header: id=x, more=1, offset=0; 1480 bytes of data
  Header: id=x, more=1, offset=185; 1480 bytes of data
  Header: id=x, more=0, offset=370; 240 bytes of data
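The numbers in this example follow from two rules: every fragment except the last carries a multiple of 8 data bytes, and the offset field counts in units of 8 bytes. The sketch below assumes an MTU of 1500 bytes and a 20-byte header without options, which the slide does not state explicitly; the function name `fragment` is hypothetical.

```python
def fragment(data_len, mtu, header_len=20):
    """Split data_len payload bytes into fragments that fit the given MTU.

    Returns a list of dicts with the offset (in 8-byte units), the more
    fragment bit, and the number of data bytes carried by each fragment.
    """
    max_data = (mtu - header_len) // 8 * 8   # largest multiple of 8 that fits
    if max_data <= 0:
        raise ValueError("MTU too small to carry any data")
    fragments = []
    position = 0
    while position < data_len:
        size = min(max_data, data_len - position)
        more = 1 if position + size < data_len else 0
        fragments.append({"offset": position // 8, "more": more, "data_len": size})
        position += size
    return fragments
```

`fragment(3200, 1500)` produces three fragments with offsets 0, 185, and 370, matching the figure: 1480 / 8 = 185 and 2960 / 8 = 370.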
Open Source Implementation 4.4:
IPv4 Fragmentation




Upper layer protocol calls ip_queue_xmit()
After routing is determined, call ip_queue_xmit2()
ip_queue_xmit2() calls ip_fragment() if the packet
length is larger than the MTU of the device
ip_fragment()


A while loop is used to fragment the original packet into
fragments
Data size (in bytes) of a fragment, except the last one, is set to the largest multiple of 8 that fits within the MTU after the IP header is subtracted
Open Source Implementation 4.4 (cont)
Re-Assembly
[Flowchart: functions on the receive path: net_bh(), ip_rcv(), ip_route_input(), ip_local_deliver()]
In ip_local_deliver():
  more or offset is set? yes: ip_defrag(); no: ip_local_deliver_finish()
In ip_defrag():
  ip_find(), then ip_frag_queue(); all fragments in? then ip_frag_reasm()
In ip_find():
  ipqhashfn(); found in hash table? yes: return queue; no: ip_frag_create()
Network Address Translation

Network Address Translation Protocol
Network Address Translation

Why NAT?


Solution to IP address depletion
Private IP address (RFC 1597, later superseded by RFC 1918)




10.0.0.0-10.255.255.255
172.16.0.0-172.31.255.255
192.168.0.0-192.168.255.255
Network address translation (RFC 3022)



Allow hosts with private IP address to have Internet
access
Short-term solution for IP address depletion
Also provides security for Intranet service
NAT Example
NAT Table
  10.2.2.2 ==> 140.123.101.30
  10.2.2.3:1175 ==> 140.123.101.30:6175
[Figure: packets before and after the router with NAT]
  Before: Src: 10.2.2.2:1064, Dst: 140.113.250.5:80
  After:  Src: 140.123.101.30:1064, Dst: 140.113.250.5:80
  Before: Src: 10.2.2.3:1175, Dst: 140.113.54.100:21
  After:  Src: 140.123.101.30:6175, Dst: 140.113.54.100:21
Types of NAT (1/2)

NAT with a pool of global IP addresses





10.2.2.2 ==> 140.123.101.30
10.2.2.3 ==> 140.123.101.31
dynamic: translate IP address on demand
static: translate IP address with pre-configuration
NAT with Port Address Translation (NAPT) of
one global IP address


10.2.2.2:1064 ==> 140.123.101.30:5064
10.2.2.3:1175 ==> 140.123.101.30:6175
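The NAPT mapping can be sketched as a pair of dictionaries: one for outbound translation and one for finding the private endpoint of a returning packet. This is a toy model, not the Linux implementation; the class name `Napt`, its methods, and the sequential port allocation starting at 5000 are assumptions (the slide's example uses ports 5064 and 6175).

```python
class Napt:
    """Toy NAPT table that shares one global IP address among private hosts."""

    def __init__(self, global_ip, first_port=5000):
        self.global_ip = global_ip
        self.next_port = first_port          # hypothetical allocation policy
        self.outbound = {}                   # (private_ip, private_port) -> global_port
        self.inbound = {}                    # global_port -> (private_ip, private_port)

    def translate_outbound(self, src_ip, src_port):
        """Map a private source endpoint to (global_ip, global_port)."""
        key = (src_ip, src_port)
        if key not in self.outbound:
            port = self.next_port
            self.next_port += 1
            self.outbound[key] = port
            self.inbound[port] = key
        return self.global_ip, self.outbound[key]

    def translate_inbound(self, dst_port):
        """Map a global destination port back to the private endpoint, or None."""
        return self.inbound.get(dst_port)
```

Two private hosts sending from 10.2.2.2:1064 and 10.2.2.3:1175 both appear as 140.123.101.30 with distinct ports, and a reply to one of those ports is mapped back to the right host. A real NAT must also rewrite the IP and TCP/UDP checksums after changing these fields.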
Types of NAT (2/2)

Port redirection

Redirect all WWW service to a specific IP and
private port number



DNS: www.cs.ccu.edu.tw ==> 140.123.101.38
NAT: 140.123.101.38:80 ==> 10.2.2.2:8080
Transparent proxy

Enforce all www traffic to a proxy with cache


140.123.101.38:80 ==> internal www proxy (10.1.1.1)
All HTTP requests go to the internal proxy
Problems with NAT (1/2)




Modify source IP and/or port number
Modify IP header checksum
Modify TCP checksum
Application dependent modification

ICMP:


Basic NAT: ICMP checksum, query id (echo)
NAPT: ICMP packets that may contain IP address

destination unreachable (3), source quench (4), redirect (5), time exceeded (11), parameter problem (12)
Problems with NAT (2/2)

Application Level Gateways (ALGs)
FTP
  PORT/PASV command has IP address:port in ASCII
  Translating the IP address may change the packet size
    If the new size is shorter, pad with zeroes
    If the new size is longer, need to change the TCP sequence number
      Affects acknowledgment, congestion control, …
      A special table is used to correct the TCP sequence and acknowledgment numbers
Others: SMTP, SNMP, …
Open Source Implementation 4.5:
NAT

Source and destination NAT implementation
in Linux iptables
[Figure: NAT hooks in the packet path]
From Interface -> PRE_ROUTING (Destination NAT) -> Routing Decision -> POST_ROUTING (Source NAT) -> To Interface
Packets from the Upper Layer (TCP/UDP) pass through LOCAL_OUT (Destination NAT)
Open Source Implementation 4.5
(cont)

Data structure



Hash table: ip_conntrack_hash[]
Hash function: hash_conntrack()
Linear search with a hashed list
do_masquerade()
ip_conntrack_in()
resolve_normal_ct()
ip_conntrack_find_get()
Open Source Implementation 4.5
(cont)

NAT function flows
ip_nat_out()
ip_nat_out()
do_bindings()
upper_layer_protocol->manip_pkt()
manip_pkt()
ip_nat_localout()
Open Source Implementation 4.5
(cont)

FTP ALG function flows
do_bindings()
helper->help()
ip_nat_seq_adjust()
ip_nat_resize_packet()
ftp_data_fixup()
mangle_rfc959_packet()
ip_nat_mangle_tcp_packet()
4.3 Internet Protocol Version 6
Changes from IPv4
 IPv6 Header
 IPv6 Extension Header
 IPv6 Fragmentation and Reassembly
 IPv6 Address Space

IPv6

Problems with IPv4



Shortage of address space
Lack of Quality of Service guarantee
New features of IPv6





Enlarge address space
Fixed header format helps speed processing/forwarding
Better support for Quality of Service
Auto-configuration
new “anycast” address: route to “best” of several replicated
servers
IPv6 Header (1/2)
Bits 0-31, one row per 32-bit word:
Version (4 bits) | Traffic Class (8 bits) | Flow Label (20 bits)
Payload Length (16 bits) | Next Header (8 bits) | Hop Limit (8 bits)
Source Address (16 octets)
Destination Address (16 octets)
IPv6 Header (2/2)


Version: 6
Traffic class:
  identify class of service
  E.g., DiffServ (DS codepoint)
Flow Label:
  identify datagrams in same “flow”
Next header:
  identify upper layer protocol for data
IPv4 and IPv6 Header Comparison
[Figure: IPv4 and IPv6 headers side by side]
IPv4 Header fields: Version, IHL, Type of Service, Total Length, Identification, Flags, Fragment Offset, Time to Live, Protocol, Header Checksum, Source Address, Destination Address, Options, Padding
IPv6 Header fields: Version, Traffic Class, Flow Label, Payload Length, Next Header, Hop Limit, Source Address, Destination Address
Legend: field’s name kept from IPv4 to IPv6; fields not kept in IPv6; name and position changed in IPv6; new field in IPv6
Changes from IPv4 (1/3)
[Figure: IPv4 packet format (top) and IPv6 header (bottom)]
IPv4:
  Version | Header Length | Type of Service | Packet Length (bytes)
  Identifier | Flags | 13-bit Fragmentation Offset
  Time-to-Live | Upper Layer Protocol | Header Checksum
  Source IP Address
  Destination IP Address
  Options
  Data
IPv6:
  Version | Traffic Class | Flow Label
  Payload Length | Next Header | Hop Limit
  Source Address (16 octets)
  Destination Address (16 octets)
Changes from IPv4 (2/3)

Expanded Addressing Capabilities
  From 32 bits to 128 bits (more levels and nodes)
  Improve multicast routing (“scope” field)
  “anycast address”: send a packet to any one of a group of nodes
Header Format Simplification
  Reduce bandwidth cost
Extensions
  More flexibility
Changes from IPv4 (3/3)

Options
  Allowed, but outside of header, indicated by “Next Header” field
Checksum
  Removed to reduce processing at routers
Fragmentation
  Not allowed at intermediate routers
IPv6 Extension Header Examples
(a) No extension header
  IPv6 Header (Next Header = TCP) | TCP Header | Data
(b) IPv6 header followed by a routing header
  IPv6 Header (Next Header = Routing) | Routing Header (Next Header = TCP) | TCP Header | Data
(c) IPv6 header followed by a routing header and a fragment header
  IPv6 Header (Next Header = Routing) | Routing Header (Next Header = Frag.) | Fragment Header (Next Header = TCP) | TCP Header | Data
IPv6 Extension Header (1/2)

Order of extension headers









IPv6 (41)
Hop-By-Hop Options header (0)
Destination Options header (60)
Routing header (43)
Fragment header (44)
Authentication header (51)
Encapsulating Security Payload header (50)
Destination Options header (60)
Upper-layer header


ICMPv6(58)
TCP(6), UDP(17), RSVP(46), SCTP(132)
IPv6 Extension Header (2/2)

Not processed by intermediate routers



except hop-by-hop option header
Processed strictly in order
Each extension header occurs at most once

except Destination Options header, which occurs at most
twice
Fragment Header


Fragmentation is only performed by source
Fragment header format (bits 0-31, one row per 32-bit word):
  Next Header (bits 0-7) | Reserved (bits 8-15) | Fragment Offset (bits 16-28) | R (bits 29-30) | M (bit 31)
  Identifier
Fragmentation Example
(a) Original packet
  IPv6 Header | Fragment 1 Data | Fragment 2 Data | Fragment 3 Data
(b) Fragments
  IPv6 Header | Fragment Header | Fragment 1 Data
  IPv6 Header | Fragment Header | Fragment 2 Data
  IPv6 Header | Fragment Header | Fragment 3 Data
Packet Size Issue

MTU of every link must be >= 1280 bytes
  Use Path MTU Discovery to discover MTUs greater than 1280 bytes
A node needs to accept a fragmented packet that, after reassembly, is as large as 1500 octets
IPv6 Addressing

Three categories




Unicast
Multicast
Anycast
Notation
16-bit hex values separated by colons
3FFD:3600:0000:0000:0302:B3FF:FE3C:C0DB
 Consecutive null 16-bit numbers replaced by ::
3FFD:3600:0:0:0:0:1:A =>3FFD:3600::1:A
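The `::` shorthand can be checked with Python's standard `ipaddress` module, which prints addresses in lowercase. The variable names below are illustrative.

```python
import ipaddress

full = ipaddress.IPv6Address("3FFD:3600:0:0:0:0:1:A")

compressed = full.compressed   # longest run of zero groups replaced by '::'
exploded = full.exploded       # all eight groups, each padded to four hex digits
```

Both spellings denote the same 128-bit value, so parsing the compressed form gives an address equal to `full`.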

IPv6 Address Assignment
Prefix          Address Type                          Portion
0000 0000       Reserved (IPv4 compatibility)         1/256
0000 0001       Unassigned                            1/256
0000 001        Reserved for NSAP                     1/128
0000 010        Reserved for IPX                      1/128
0000 011        Unassigned                            1/128
0000 1          Unassigned                            1/32
0001            Unassigned                            1/16
001             Aggregatable Global Unicast Address   1/8
010             Unassigned                            1/8
011             Unassigned                            1/8
100             Unassigned                            1/8
101             Unassigned                            1/8
110             Unassigned                            1/8
1110            Unassigned                            1/16
1111 0          Unassigned                            1/32
1111 10         Unassigned                            1/64
1111 110        Unassigned                            1/128
1111 1110 0     Unassigned                            1/512
1111 1110 10    Link Local Unicast Address            1/1024
1111 1110 11    Site Local Unicast Address            1/1024
1111 1111       Multicast Address                     1/256
IPv6 Unicast Address (1/2)
Unicast Address without Internal Structure: | Node Address |
Unicast Address with Subnet: | Subnet Prefix | Interface ID |
Unicast Unspecified Address: 0000 0000 … 0000 0000 (all zeros)
Unicast Loopback Address: 0000 0000 … 0000 0001
IPv6 Unicast Address (2/2)
IPv4-compatible IPv6 Address: 0000 … 0000 0000 | IPv4 Address (32 bits), e.g., ::8C7B:65A0
IPv4-Mapped IPv6 Address: 0000 … 0000 FFFF | IPv4 Address (32 bits), e.g., ::FFFF:8C7B:65A0
NSAP Addresses: 00000001, defined according to usage requirements
IPX Addresses: 00000010, to be defined
Aggregatable Global Unicast Address
Field (bits): P (3) | TLA ID (13) | RES (8) | NLA ID (24) | SLA ID (16) | Interface ID (64)
RFC 2374
P : Format Prefix (001)
TLA : Top-Level Aggregation Identifier (8192)
RES : Reserved
NLA : Next-Level Aggregation Identifier
SLA : Site-Level Aggregation Identifier
Interface ID : Interface Identifier
Current policy:
Registry /23, ISP /35, Site /48
Interface ID: EUI-64 (RFC 2464)


Prefix range from 001 to 111 should use EUI-64
format for interface ID.
For 48-bit MAC address



0xff-fe is inserted between the 3rd and 4th bytes
The universal/local bit (the second low-order bit of the first
byte) is complemented.
Example




MAC: 00-02-b3-1e-83-29
EUI-64 ID: 02-02-b3-ff-fe-1e-83-29
Link local: FE80::202:b3ff:fe1e:8329
Privacy concern: a host can be traced from its IPv6 address
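The two EUI-64 steps (insert ff-fe in the middle of the MAC address, then complement the universal/local bit) can be sketched as follows. The function names `mac_to_eui64` and `link_local` are hypothetical, and the sketch assumes Python 3.8 or later for `bytes.hex()` with a separator.

```python
import ipaddress


def mac_to_eui64(mac):
    """Convert a 48-bit MAC address string to its 8-byte modified EUI-64 form."""
    octets = bytes(int(part, 16) for part in mac.replace(":", "-").split("-"))
    if len(octets) != 6:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    first = octets[0] ^ 0x02                      # complement the universal/local bit
    return bytes([first]) + octets[1:3] + b"\xff\xfe" + octets[3:]


def link_local(mac):
    """Build the FE80::/64 link-local address for an interface with this MAC."""
    packed = b"\xfe\x80" + bytes(6) + mac_to_eui64(mac)   # 2 + 6 + 8 = 16 bytes
    return ipaddress.IPv6Address(packed)
```

For the MAC address 00-02-b3-1e-83-29, `mac_to_eui64` returns the bytes 02-02-b3-ff-fe-1e-83-29, and `link_local` gives the link-local address shown in the example.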
Current Address Allocations





APNIC
 2001:0200::/23, 2001:0C00::/23, …, 2400:0000::/12
 http://www.apnic.net/faq/IPv6-FAQ.html
ARIN
 2001:0400::/23, …, 2600:0000::/12
 http://www.arin.net/library/guidelines/ipv6_initial.html
RIPE NCC
 2001:0600::/23, 2001:0800::/23, 2A00:0000::/12
 http://www.ripe.net/ripencc/mem-services/registeration/ipv6.html
LACNIC
 2001:1200::/23, …, 2800:0000::/12
6to4 tunnels
 2002::/16
http://www.iana.org/assignments/ipv6-unicast-address-assignments/ipv6-unicast-address-assignments.xml
IPv6 Multicast Address (1/2)
Format:
flag : 00PT
  T = 0 : well-known multicast address
  T = 1 : transient multicast address
  P = 0 : address not assigned based on prefix
  P = 1 : assigned based on prefix
    Plen: length of network prefix
    Prefix: up to 64 bits
scope : scope of multicast group
  0000 : reserved
  0001 : node-local scope
  0010 : link-local scope
  0101 : site-local scope
  1000 : organization-local scope
  1110 : global scope
IPv6 Multicast Address (2/2)

Node-Local Scope
  FF01:0:0:0:0:0:0:1  All Nodes Address
  FF01:0:0:0:0:0:0:2  All Routers Address
Link-Local Scope
  FF02:0:0:0:0:0:0:1  All Nodes Address
  FF02:0:0:0:0:0:0:2  All Routers Address
  FF02:0:0:0:0:1:FFxx:xxxx  Solicited Node Address
  (Unicast : 4037::01:800:200E:8C6C is FF02::1:FF0E:8C6C)
Site-Local Scope
  FF05:0:0:0:0:0:0:2  All Routers Address
  FF05:0:0:0:0:0:0:3  All DHCP Servers
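The solicited-node address in the link-local example is formed by appending the low-order 24 bits of the unicast address to the prefix FF02:0:0:0:0:1:FF00::/104. A minimal sketch, with the hypothetical function name `solicited_node`:

```python
import ipaddress

# Fixed 104-bit prefix of every solicited-node multicast address.
SOLICITED_NODE_BASE = int(ipaddress.IPv6Address("ff02::1:ff00:0"))


def solicited_node(unicast):
    """Return the solicited-node multicast address for a unicast address."""
    low24 = int(ipaddress.IPv6Address(unicast)) & 0xFFFFFF
    return ipaddress.IPv6Address(SOLICITED_NODE_BASE | low24)
```

Applied to 4037::01:800:200E:8C6C, the low 24 bits are 0E:8C6C, which gives FF02::1:FF0E:8C6C as on the slide.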
Transition From IPv4 To IPv6

Not all routers can be upgraded simultaneously
  How will the network operate with mixed IPv4 and IPv6 routers?
Transition assumptions
  No “Flag Day”
    Last Internet transition was 1983 (NCP → TCP)
  Transition will be incremental
    Possibly over several years
  Transparent to end users
    Seamless transition from IPv4 to IPv6
IPv6 is designed with transition in mind
  Assumption of IPv4/IPv6 coexistence
Transition Approaches

Dual Stacks


Allow IPv4/IPv6 to co-exist on one device
Tunnels

For tunneling IPv6 across IPv4 clouds



Manually configured tunnel
Automatic tunnel


Encapsulate IPv6 packets in IPv4 packets (PID=41)
Relies on some special IPv6 addresses
Translators

IPv6 only device communicates with IPv4 only device
Conceptual View of IPv6 Routing
Table
fib6_table
…
tb6_root
fib6_node
*parent
fib6_node
fib6_node
*left
rt6_info
rt6_info
*right
Neighbor
Entry
FIB6 Data Structure
[Figure: tb6_root of fib6_table is the root of a tree of fib6_node entries linked by parent/left/right pointers; each node's leaf pointer refers to rt6_info entries]
fib6_table: hlist_node tb6_hlist; …; fib6_node tb6_root
fib6_node: fib6_node *parent; fib6_node *left; fib6_node *right; rt6_info *leaf; rt6_info *rr_ptr
rt6_info: inet6_dev *rt6i_idev; fib6_node *rt6i_node; in6_addr rt6i_gateway; …; fib6_table *rt6i_table; rt6key dst; rt6key src
Chapter 4
Internet Protocol Layer
Part II
Control Plane Mechanisms

Address Management
  Address resolution
  Address configuration
Error reporting
  Internet Control Message Protocol
Routing
  Intra-domain routing
  Inter-domain routing
  Multicast
4.4 Address Management
Address resolution
 Address configuration

Address Resolution
Address Resolution Protocol (ARP)
Address Resolution

What is address resolution


Translate address at different layers
For example



host name to IP address
IP address to Ethernet address
Why address resolution

MAC address vs. IP address
Address Resolution Protocol

Protocol operation




Source node broadcasts an ARP request packet on the IP subnet
All nodes on the subnet receive the ARP request, but only the target node (or some designated server) sends an ARP reply packet via unicast
Source node receives the reply and gets the MAC address of the target node
A cache (with a timer) is used to speed up resolution
ARP Packet Format (1/3)
ARP Packet Format (2/3)




HARDWARE ADDRESS TYPE
 Link types: Ethernet=0x0001
PROTOCOL ADDRESS TYPE
 Upper layer protocol identifier: IP=0x0800
HADDR LEN
 Length of the address of the link layer: Ethernet=6
PADDR LEN
 Length of the address of the network layer: IP=4
ARP Packet Format (3/3)

OPERATION
  Operation code: ARP request=1, ARP reply=2, RARP request=3, RARP reply=4
SENDER HADDR
  Sender link layer address
SENDER PADDR
  Sender network layer address
TARGET HADDR
  Target link layer address, fill zero if unknown
TARGET PADDR
  Target network layer address
Encapsulate ARP Packet into MAC
Frame


Protocol id: 0x0806
Destination address of an ARP request
packet: 0xFFFFFFFFFFFF
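Putting the ARP fields and the Ethernet encapsulation together, an ARP request frame can be assembled with `struct`. This sketch only builds the bytes (sending them would need a raw socket and privileges); the function name `build_arp_request` and the sample addresses in the usage note are illustrative.

```python
import socket
import struct

BROADCAST_MAC = b"\xff" * 6
ETHERTYPE_ARP = 0x0806


def build_arp_request(sender_mac, sender_ip, target_ip):
    """Return an Ethernet frame (without padding or FCS) carrying an ARP request."""
    arp = struct.pack(
        "!HHBBH6s4s6s4s",
        0x0001,                        # HARDWARE ADDRESS TYPE: Ethernet
        0x0800,                        # PROTOCOL ADDRESS TYPE: IP
        6,                             # HADDR LEN
        4,                             # PADDR LEN
        1,                             # OPERATION: ARP request
        sender_mac,                    # SENDER HADDR
        socket.inet_aton(sender_ip),   # SENDER PADDR
        bytes(6),                      # TARGET HADDR: zero, not yet known
        socket.inet_aton(target_ip),   # TARGET PADDR
    )
    ethernet = BROADCAST_MAC + sender_mac + struct.pack("!H", ETHERTYPE_ARP)
    return ethernet + arp
```

For example, `build_arp_request(bytes.fromhex("0002b31e8329"), "140.123.1.1", "140.123.1.2")` yields a 42-byte frame: a 14-byte Ethernet header addressed to the broadcast address followed by the 28-byte ARP packet.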
Reverse ARP (RARP)



Allow a diskless workstation to discover its IP
address
Need a RARP server on each network
Bootp:

Use UDP messages which are forwarded over
routers to find the file server that holds the
mapping
Open Source Implementation 4.6:
ARP

Data structure



Hash table: arp_table
Hash parameters: a primary key and device interface index
Functions
  arp_send(): sets up the ARP header and then transmits
  arp_rcv(): only deals with request or reply operations
    Request: calls ip_route_input(); if the route is local, calls arp_send() to send out an ARP reply. Otherwise, if the host is an ARP proxy, it also sends an ARP reply.
    Reply: updates the ARP table.
  __neigh_lookup(): calls neigh_lookup() to search the ARP hash table; if not found, creates an entry
  eth_rebuild_header() (old) or arp_solicit() calls arp_send()
Chapter 4: Internet Protocol Layer
100
Address Configuration
Dynamic Host Configuration Protocol (DHCP)

Address Configuration
  What is address configuration
    Automatically and dynamically assign an IP address to a host
  Why address configuration
    Setting IP addresses by hand is error-prone
    Insufficient IP addresses: share IP addresses among hosts
    Better network management

DHCP Protocol
  Dynamic Host Configuration Protocol
  DHCP is derived from BOOTP
    Some fields are not for host configuration
  Operations
    A host broadcasts a DHCPDISCOVER message
    A DHCP server receives it and replies
    Or a DHCP relay agent receives it, forwards it to the DHCP server, gets the configuration, and relays it to the host
    DHCP messages are sent over UDP (server port 67, client port 68)

State Diagram for DHCP Client
  [Figure: DHCP client state diagram. States: Initial, Offer, Request, Bind, Renew, Rebind. Transition labels (event/action): /DHCPDISCOVER, DHCPOFFER, /DHCPREQUEST, DHCPACK, DHCPNAK, "DHCPNAK or lease expires", "Renewal expires /DHCPREQUEST", "Rebinding expires /DHCPREQUEST".]

DHCP Packet Format
  Rows are 32 bits wide (bit offsets 0, 8, 16, 24, 31)
    Operation | Hard. Type | Hardware Len | Hops
    Transaction ID
    Seconds | B | Flags
    Client IP Address
    Your IP Address
    Server IP Address
    Router IP Address
    Client Hardware Address (16 octets)
    Server Host Name (64 octets)
    Boot File Name (128 octets)
    Options (variable)

DHCP Packet Format
  More information for host configuration
    such as default router, subnet mask
    encoded in the option field (code=55, length, parameter)
  ID   Request Parameter
  1    Subnet mask
  3    Default gateway
  6    DNS server
  12   Host name
  15   Domain name
  17   Boot path
  40   NIS domain name

DHCP Packet Format
  Options
    The option field starts with three fields: code (53), length (1), type (1-7)
  Type  DHCP Message
  1     DHCPDISCOVER
  2     DHCPOFFER
  3     DHCPREQUEST
  4     DHCPDECLINE
  5     DHCPACK
  6     DHCPNAK
  7     DHCPRELEASE

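Reading the message type out of the options field can be sketched as a walk over (code, length, value) triples. This is a minimal sketch under stated assumptions: the function name `dhcp_message_type` is hypothetical, the buffer is assumed to start at the first option (after the 4-byte magic cookie that precedes the options on the wire), and the pad (0) and end (255) option codes come from RFC 2132 rather than from the slides.

```c
#include <stddef.h>
#include <stdint.h>

#define DHCP_OPT_PAD       0    /* single byte, no length field (RFC 2132) */
#define DHCP_OPT_MSG_TYPE  53   /* code given on the slide */
#define DHCP_OPT_END       255  /* end of options (RFC 2132) */

/* Return the DHCP message type (1..7) found in the options buffer,
 * or 0 if option 53 is absent or the buffer is malformed. */
int dhcp_message_type(const uint8_t *opts, size_t len)
{
    size_t i = 0;

    while (i < len) {
        uint8_t code = opts[i];

        if (code == DHCP_OPT_END)
            break;
        if (code == DHCP_OPT_PAD) {
            i++;
            continue;
        }
        if (i + 1 >= len)
            break;                          /* no room for the length byte */

        uint8_t optlen = opts[i + 1];
        if (i + 2 + optlen > len)
            break;                          /* value runs past the buffer */

        if (code == DHCP_OPT_MSG_TYPE && optlen == 1)
            return opts[i + 2];

        i += 2 + (size_t)optlen;            /* skip to the next option */
    }
    return 0;
}
```

For example, the byte sequence 53, 1, 1 followed by a parameter request list and the end option would be recognized as a DHCPDISCOVER.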
Open Source Implementation 4.7: DHCP
  Functions, in call order: ip_auto_config() -> ic_dynamic() -> ic_bootp_send_if() -> ic_dhcp_init_options()
  struct bootp_pkt {              /* BOOTP packet format */
      struct iphdr iph;           /* IP header */
      struct udphdr udph;         /* UDP header */
      u8 op;                      /* 1=request, 2=reply */
      u8 htype;                   /* HW address type */
      u8 hlen;                    /* HW address length */
      u8 hops;                    /* Used only by gateways */
      u32 xid;                    /* Transaction ID */
      u16 secs;                   /* Seconds since we started */
      u16 flags;                  /* Just what it says */
      u32 client_ip;              /* Client's IP address if known */
      u32 your_ip;                /* Assigned IP address */
      u32 server_ip;              /* (Next, e.g. NFS) Server's IP address */
      u32 relay_ip;               /* IP address of BOOTP relay */
      u8 hw_addr[16];             /* Client's HW address */
      u8 serv_name[64];           /* Server host name */
      u8 boot_file[128];          /* Name of boot file */
      u8 exten[312];              /* DHCP options / BOOTP vendor extensions */
  };

4.5 Error Reporting
Internet Control Message Protocol (ICMP)

Error Control Protocol
  What is an error control protocol
    A protocol for reporting errors or the status of TCP/IP at a remote site (router or host)
  Why an error control protocol
    For monitoring the status of TCP/IP at each host/router
    For reporting errors between hosts or routers

Internet Control Message Protocol (ICMP)
  ICMP runs over IP
  [Figure: an IP packet consists of IP Header + IP Data; the IP Data carries the ICMP message, which consists of ICMP Header + ICMP Data]

ICMPv4 Packet Format
  Type and Code are used to identify an error event
  The data field contains
    STD 5, RFC 792, "Internet Control Message Protocol"
      IP header plus the first 64 bits of the packet that elicited the ICMP message
    STD 3, RFC 1122, "Requirements for Internet Hosts -- Communication Layers"
      IP header and at least the first 8 data octets of the datagram that triggered the error (more than 8 octets MAY be sent)
    RFC 1812, "Requirements for IP Version 4 Routers"
      SHOULD contain as much of the original datagram as possible without the length of the ICMP datagram exceeding 576 bytes
  Layout (32 bits per row; bit offsets 0, 8, 16, 24, 31)
    Type | Code | Checksum
    Data

Type and Code
  Type  Code  Description
  0     0     Echo reply (ping)
  3     0     Destination network unreachable
  3     1     Destination host unreachable
  3     2     Destination protocol unreachable
  3     3     Destination port unreachable
  3     4     Fragmentation needed and DF set
  3     5     Source route failed
  3     6     Destination network unknown
  3     7     Destination host unknown
  4     0     Source quench (congestion control)
  5     0     Redirect (destination network)
  5     1     Redirect (host)
  8     0     Echo request (ping)
  9     0     Router advertisement
  10    0     Router solicitation (router discovery)
  11    0     TTL expired
  12    0     Bad IP header

ICMPv4 Examples (1/6)
  Echo Request/Reply
    The source sends an echo request (type=8) to a destination; the destination responds with an echo reply (type=0)
      The data received in the Echo Request must be entirely included in the Echo Reply.
      The Identifier and Sequence Number are used by the client to match the reply with the request that caused it.
    ping uses echo request and reply
  Layout (32 bits per row; bit offsets 0, 8, 16, 24, 31)
    Type | Code | Checksum
    Identifier | Sequence Number
    Data

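The echo request layout (Type, Code, Checksum, Identifier, Sequence Number, Data) can be assembled with the 16-bit one's-complement Internet checksum. The sketch below is illustrative: `inet_checksum` and `build_echo_request` are hypothetical helper names, the caller is assumed to supply an output buffer of at least 8 + plen bytes, and actually sending the packet (which needs a raw socket and privileges) is left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Internet checksum: one's complement of the one's-complement sum of the
 * buffer taken as big-endian 16-bit words; an odd trailing byte is padded
 * with zero. */
uint16_t inet_checksum(const uint8_t *buf, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
    if (i < len)
        sum += (uint32_t)buf[i] << 8;
    while (sum >> 16)                       /* fold carries back in */
        sum = (sum & 0xffffu) + (sum >> 16);
    return (uint16_t)(~sum & 0xffffu);
}

/* Write an ICMP echo request (type 8, code 0) into out and return its
 * length. The checksum is computed with the checksum field set to zero. */
size_t build_echo_request(uint8_t *out, uint16_t id, uint16_t seq,
                          const uint8_t *payload, size_t plen)
{
    out[0] = 8;                             /* Type: echo request */
    out[1] = 0;                             /* Code */
    out[2] = 0;                             /* Checksum placeholder */
    out[3] = 0;
    out[4] = (uint8_t)(id >> 8);            /* Identifier */
    out[5] = (uint8_t)(id & 0xff);
    out[6] = (uint8_t)(seq >> 8);           /* Sequence Number */
    out[7] = (uint8_t)(seq & 0xff);
    memcpy(out + 8, payload, plen);         /* Data, echoed back by the peer */

    uint16_t c = inet_checksum(out, 8 + plen);
    out[2] = (uint8_t)(c >> 8);
    out[3] = (uint8_t)(c & 0xff);
    return 8 + plen;
}
```

A receiver can validate a message by running `inet_checksum` over the whole ICMP message, checksum field included; a correct message yields 0.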
ICMPv4 Examples (2/6)
  Destination Unreachable (type=3)
    Destination unreachable is used to report various unreachable reasons, such as network, host, or port unreachable.
    In addition, code 4 of a type 3 message reports that fragmentation is needed at an intermediate router (due to the MTU) but the don't-fragment bit in the IP header is set.
  Layout (32 bits per row; bit offsets 0, 8, 16, 24, 31)
    Type=3 | Code | Checksum
    Empty | Next-hop MTU
    IP header + first 8 bytes of original packet's data

ICMPv4 Examples (3/6)
  Source Quench
    When its buffer overflows, a router sends a source quench (type=4) to the source
    Layout (32 bits per row; bit offsets 0, 8, 16, 24, 31)
      Type | Code | Checksum
      Unused
      Data
  Routing redirect
    If a host forwards a packet to the wrong router, the router sends a redirect ICMP message (type=5; code=0 for network, code=1 for host) to the source
    Layout (32 bits per row; bit offsets 0, 8, 16, 24, 31)
      Type | Code | Checksum
      Gateway (router) IP address
      Data

ICMPv4 Examples (4/6)
  Time Exceeded
    If the TTL is less than or equal to zero (after decrement), the router sends a Time Exceeded (type=11) ICMP message to the source
  traceroute implementation
    traceroute sends an ICMP echo request with TTL=1 to the target machine (some implementations send UDP probes instead)
    When the first router receives the message, it responds with a time exceeded message
    traceroute then sends another echo request with TTL=2
    The message passes the first router but is discarded by the second router, which returns a time exceeded message
    traceroute repeats sending echo requests until it receives an echo reply from the target machine

ICMPv4 Examples (5/6)
  IP header error
    Wrong IP header, such as a wrong option field (type=12)
      Code=0: IP header is invalid
      Code=1: a required option is missing

ICMPv4 Examples (6/6)
  Time Stamp Request/Reply
    Type=13/14, code=0
  Information Request/Reply
    Type=15/16, code=0
  Address Mask Request/Reply
    Type=17/18, code=0

ICMPv6
  New types and codes
  Type 0..127: error report
    1: Destination unreachable
    2: Packet too big
    3: Time Exceeded
    4: Parameter problem
  Type 128..255: informational
    128, 129: Echo request & reply (RFC 2463)
    130, 131, 132: Multicast group membership management (RFC 2710)
    133, 134: Router solicitation and advertisement (RFC 2461)
    135, 136: Neighbor solicitation and advertisement (RFC 2461)
    137: Redirect (RFC 2461)
    138: Router renumbering (RFC 2894)
    139, 140: Node information query/response (draft, name-lookups)
    141, 142: Inverse ND solicitation/advertisement message (RFC 3122)
    150, 151: Home agent address discovery request/reply (draft)
    152, 153: Mobile prefix solicitation/advertisement

ICMPv6
  Type  Code  Description
  1     0     No route to destination
  1     1     Communication with destination administratively prohibited
  1     3     Address unreachable
  1     4     Port unreachable
  2     0     Packet too big
  3     0     Hop limit exceeded in transit
  3     1     Fragment reassembly time exceeded
  4     0     Erroneous header field encountered
  4     1     Unrecognized Next Header type
  4     2     Unrecognized IPv6 option encountered
  128   0     Echo request
  129   0     Echo reply
  130   0     Multicast Listener Query
  131   0     Multicast Listener Report
  132   0     Multicast Listener Done
  133   0     Router Solicitation
  134   0     Router Advertisement
  135   0     Neighbor Solicitation
  136   0     Neighbor Advertisement
  137   0     Redirect

Open Source Implementation 4.8: ICMP
  Data structure
    ICMP header: struct icmphdr
  Error when forwarding IP packets
    ip_forward() -> icmp_send()
      TTL<=1
      Strict source routing fails
      Route redirect
  Error when receiving IP packets
    ip_route_input_slow() -> ip_error() -> icmp_send()
      destination unreachable

Open Source Implementation 4.8 (cont)
  Receiving ICMP packets
    icmp_rcv() -> icmp_pointers[]
    Control handlers: icmp_pointers[]
      icmp_unreach() for type 3, 4, 11, and 12
      icmp_redirect() for type 5
      icmp_echo() for type 8
      icmp_timestamp() for type 13
      icmp_address() for type 17
      icmp_address_reply() for type 18
      icmp_discard() for other types
  ICMPv6
    icmpv6_send()
    icmpv6_rcv() -> icmpv6_echo_reply(), icmpv6_notify()

4.6 Routing
  Principle
  Intra-domain routing
  Inter-domain routing

Routing Principle
  Link State Routing
  Distance Vector Routing

Routing
  Task of routing
    Select a path from the source to the destination
  Goal of routing
    Efficient (low delay, high throughput, …)
    Scalable
    Stable
    Robust
    Fair

Optimality of IP Routing
  IP uses hop-by-hop routing (forwarding)
    Each router determines its own routing table
  Why are packets still delivered to their destinations along the optimal path?
    If k is an intermediate node on the optimal path from source node s to destination d, then the part of that path from s to k is also the optimal path from s to k (and likewise for the part from k to d)
    A shortest path tree can therefore be constructed from a source to the rest of the graph.

Routing Algorithm Classification
  Global or decentralized information?
    Link State routing: uses the Dijkstra algorithm
    Distance Vector routing: uses the distributed Bellman-Ford algorithm
  Static
    Fixed routing table, set up manually
  Dynamic (adaptive)
    Routing table adapts to network status

The Shortest Path Algorithm
  View a network as a graph
    Nodes are routers
    Edges are physical links
      Each is associated with a link cost: delay, congestion level, …
  Find the least cost path
    Depends on the information available

Link-State Routing
  Routing information
    Global information is available by reliable broadcasting
    Dynamic: information exchanged when topology changes or periodically
  Path calculation
    Dijkstra algorithm

Dijkstra Algorithm
  For each v in V-{s} {
      If v is adjacent to s {
          C(v) = lc(s,v)
          P(v) = s
      } else
          C(v) = ∞
  }
  T = {s}
  While (T ≠ V) {
      find w not in T s.t. C(w) is the minimum over all nodes in (V-T)
      T = T ∪ {w}
      For each v in V-T
          If (C(w)+lc(w,v) < C(v)) {
              C(v) = C(w)+lc(w,v)
              P(v) = w
          }
  }

Dijkstra Algorithm Example
  [Figure: five-node graph. Link costs consistent with the table below: A-B 4, A-C 1, B-C 2, B-D 1, C-D 3, C-E 1, D-E 1]
  Iteration  T      C(B),p(B)  C(C),p(C)  C(D),p(D)  C(E),p(E)
  0          A      4,A        1,A        ∞          ∞
  1          AC     3,C                   4,C        2,C
  2          ACE    3,C                   3,E
  3          ACEB                         3,E
  4          ACEBD

Routing Table at Node A
  Destination  Cost  NextHop
  B            3     C
  C            1     C
  D            3     C
  E            2     C

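The Dijkstra pseudocode and the five-node example can be checked with a small C sketch. The link costs in `lc` are an assumption read off the example's iteration table and the Bellman-Ford step tables (A-B 4, A-C 1, B-C 2, B-D 1, C-D 3, C-E 1, D-E 1); the function names `dijkstra` and `next_hop` are hypothetical. Nodes A..E are numbered 0..4.

```c
#include <limits.h>

#define NODES 5                 /* A=0, B=1, C=2, D=3, E=4 */
#define INF   INT_MAX

/* lc[u][v]: link cost between u and v; 0 means "no direct link". */
static const int lc[NODES][NODES] = {
    /*        A  B  C  D  E */
    /* A */ { 0, 4, 1, 0, 0 },
    /* B */ { 4, 0, 2, 1, 0 },
    /* C */ { 1, 2, 0, 3, 1 },
    /* D */ { 0, 1, 3, 0, 1 },
    /* E */ { 0, 0, 1, 1, 0 },
};

/* Compute least costs C(v) and predecessors P(v) from source s. */
void dijkstra(int s, int cost[NODES], int pred[NODES])
{
    int in_tree[NODES] = { 0 };

    for (int v = 0; v < NODES; v++) {
        cost[v] = lc[s][v] ? lc[s][v] : INF;
        pred[v] = lc[s][v] ? s : -1;
    }
    cost[s] = 0;
    in_tree[s] = 1;                         /* T = {s} */

    for (int round = 1; round < NODES; round++) {
        int w = -1;

        /* Find the node outside T with the minimum cost. */
        for (int v = 0; v < NODES; v++)
            if (!in_tree[v] && cost[v] != INF && (w < 0 || cost[v] < cost[w]))
                w = v;
        if (w < 0)
            break;                          /* remaining nodes unreachable */
        in_tree[w] = 1;                     /* T = T U {w} */

        /* Relax the links leaving w. */
        for (int v = 0; v < NODES; v++)
            if (!in_tree[v] && lc[w][v] && cost[w] + lc[w][v] < cost[v]) {
                cost[v] = cost[w] + lc[w][v];
                pred[v] = w;
            }
    }
}

/* First hop on the path from s to d, found by walking predecessors back
 * from d; returns -1 if d is s or is unreachable. */
int next_hop(int s, int d, const int pred[NODES])
{
    int v = d;

    while (pred[v] >= 0 && pred[v] != s)
        v = pred[v];
    return pred[v] == s ? v : -1;
}
```

Running `dijkstra(0, cost, pred)` on this graph gives the costs 3, 1, 3, 2 for B, C, D, E, and `next_hop` returns C for every destination, matching the routing table at node A.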
Distance Vector Algorithm
  Routing information
    Only local information is known
      Knows the status of adjacent links and the routing information of adjacent nodes
    Dynamic: information exchanged when a link cost or shortest path changes
  Path calculation
    Bellman-Ford

Bellman-Ford Algorithm
  While (1) {
      If x received route update message from y {
          For each (Dest, Distance) pair in y's report {
              If (Dest is new) { /* Dest not in routing table */
                  Add a new entry for destination Dest
                  rt(Dest).distance = Distance+lc(x,y)
                  rt(Dest).NextHop = y
              }
              else if ((Distance+lc(x,y)) < rt(Dest).distance) {
                  /* y reports a shorter distance to Dest */
                  rt(Dest).distance = Distance+lc(x,y)
                  rt(Dest).NextHop = y
              }
          }
      }
      Send update messages to all neighbors if route changes
      Also send update messages to all neighbors periodically
  }

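The per-entry update rule of the distance vector algorithm can be sketched in C as follows. The names `struct route`, `MAX_DEST`, and `dv_update` are hypothetical; destinations and neighbors are represented as small integers, and the routing table is assumed to be an array indexed by destination.

```c
#define MAX_DEST 16

struct route {
    int valid;      /* 0 = destination not yet in the routing table */
    int distance;
    int next_hop;
};

/* Process one (dest, distance) pair reported by neighbor y, reached over a
 * link of cost lc_xy. Returns 1 if the table changed, in which case the
 * caller would send update messages to its neighbors. */
int dv_update(struct route rt[MAX_DEST], int dest, int distance,
              int y, int lc_xy)
{
    int d = distance + lc_xy;

    if (!rt[dest].valid || d < rt[dest].distance) {
        /* New destination, or y reports a shorter distance to dest. */
        rt[dest].valid = 1;
        rt[dest].distance = d;
        rt[dest].next_hop = y;
        return 1;
    }
    return 0;
}
```

With node A's initial table (itself at distance 0, B at 4 via B, C at 1 via C) and C's report (B at 2, D at 3, E at 1) over a link of cost 1, the calls leave A with B at 3, D at 4, and E at 2, all via C.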
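Bellman-Ford Algorithm Example: Step 1
  [Figure: five-node graph with link costs A-B 4, A-C 1, B-C 2, B-D 1, C-D 3, C-E 1, D-E 1, and the routing table of each node]
  Routing tables, one entry per destination as "Dt. C NH" (destination, cost, next hop)
    Node A: B 4 B; C 1 C
    Node B: A 4 A; C 2 C; D 1 D
    Node C: A 1 A; B 2 B; D 3 D; E 1 E
    Node D: B 1 B; C 3 C; E 1 E
    Node E: C 1 C; D 1 D
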
Bellman-Ford Algorithm Example: Step 2
  [Figure: the same five-node graph, with the routing table of each node after the first exchange]
  Routing tables, one entry per destination as "Dt. C NH" (destination, cost, next hop)
    Node A: B 3 C; C 1 C; D 4 C; E 2 C
    Node B: A 3 C; C 2 C; D 1 D; E 2 D
    Node C: A 1 A; B 2 B; D 2 E; E 1 E
    Node D: A 4 C; B 1 B; C 2 E; E 1 E
    Node E: A 2 C; B 2 D; C 1 C; D 1 D

Bellman-Ford Algorithm Example: Step 3
  [Figure: the same five-node graph, with the routing table of each node after the second exchange]
  Routing tables, one entry per destination as "Dt. C NH" (destination, cost, next hop)
    Node A: B 3 C; C 1 C; D 3 C; E 2 C
    Node B: A 3 C; C 2 C; D 1 D; E 2 D
    Node C: A 1 A; B 2 B; D 2 E; E 1 E
    Node D: A 3 E; B 1 B; C 2 E; E 1 E
    Node E: A 2 C; B 2 D; C 1 C; D 1 D

Bellman-Ford Algorithm Example
  Routing table of node A after convergence
  Destination  Cost  NextHop
  B            3     C
  C            1     C
  D            3     C
  E            2     C

Problem with DV Routing (1/2)
  Phenomenon
    good news travels fast
    bad news travels slowly
  [Figure: two copies of the five-node example graph (A-E) with changed link costs. Caption of the first: "Route updated in two iterations." Caption of the second, in which a link cost becomes ∞: "Route updated in more than 25 iterations."]

Problem with DV Routing (2/2)
  Routing loop
    Due to the above phenomenon
    Loop formed before routing converges
  Partial solutions
    Split horizon
      Routing updates sent to a neighbor should not contain routes learned from that neighbor.
    Poisoned reverse
      If A learns a route to D from B, then A tells B that it cannot reach D, so as to poison the route.
    Hold down timer
      When a router receives an update from a neighbor indicating a network is inaccessible, the router marks the route as inaccessible and starts a holddown timer
      Holddown timers help prevent counting to infinity but also increase convergence time

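The difference between split horizon and poisoned reverse shows up when a router builds the advertisement for one particular neighbor. The sketch below is illustrative only: the type and function names are hypothetical, and the value 16 for "infinity" is borrowed from RIP as an assumption.

```c
#define DV_INFINITY 16          /* assumed "unreachable" metric, as in RIP */

enum adv_mode { ADV_PLAIN, ADV_SPLIT_HORIZON, ADV_POISONED_REVERSE };

struct route {
    int dest;
    int distance;
    int next_hop;               /* neighbor the route was learned from */
};

struct adv_entry {
    int dest;
    int distance;
};

/* Build the advertisement sent to one neighbor from routing table rt[0..n).
 * out must have room for n entries. Returns the number of entries written. */
int build_advertisement(const struct route *rt, int n, int neighbor,
                        enum adv_mode mode, struct adv_entry *out)
{
    int count = 0;

    for (int i = 0; i < n; i++) {
        int learned_from_neighbor = (rt[i].next_hop == neighbor);

        /* Split horizon: omit routes learned from this neighbor. */
        if (learned_from_neighbor && mode == ADV_SPLIT_HORIZON)
            continue;

        out[count].dest = rt[i].dest;
        /* Poisoned reverse: advertise such routes with infinite distance. */
        out[count].distance =
            (learned_from_neighbor && mode == ADV_POISONED_REVERSE)
                ? DV_INFINITY
                : rt[i].distance;
        count++;
    }
    return count;
}
```

Both modes stop A from advertising back to B a route that runs through B, which removes two-node loops; loops involving three or more routers can still form, which is why the slide calls these partial solutions.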
Hierarchical Routing
  Not a flat network: too many routing entries
  Define an AS
    Routers within an AS are under the same administrative control
  Routing within an AS and between ASs
    Intra-domain routing
    Inter-domain routing
  See http://bgp.potaroo.net for the current BGP table size

AS
  The Internet consists of Autonomous Systems (ASs) interconnected with each other:
    Stub AS: small corporation
    Multihomed AS: large corporation (no transit)
    Transit AS: provider
  Two-level routing:
    Intra-AS: routing within an AS
    Inter-AS: routing between ASs

An example of Hierarchical Routing
  [Figure: three domains. Domain A has routers A.1, A.2, A.3; Domain B has routers B.1, B.2, B.3, B.4; Domain C has routers C.1, C.2, C.3. Routers are labeled either as inter-domain routers (exterior gateways) or as intra-domain routers (interior gateways).]

Example of Internet Routing Protocols
  Intradomain routing
    RIP
    OSPF
  Interdomain routing
    BGP-4

Intra-domain Routing
  Routing Information Protocol (RIP)
  Open Shortest Path First (OSPF)

Intra-domain Routing
  What is intra-domain routing
    Routing within a domain (AS)
    The administrator decides the routing protocol
    The administrator has total control over all routers
  Why intra-domain routing
    Maintain connectivity within a domain

Intra-domain Routing
  Runs Interior Gateway Protocols (IGP)
  Most common IGPs
    RIP: Routing Information Protocol
    OSPF: Open Shortest Path First

RIP
  Originally designed for Xerox PARC Universal Protocol (used in XNS)
  Adopted by UNIX and TCP/IP in 1982
    routed of BSD
    RIP: RFC 1058 [1988]
    RIPv2: RFC 1388 [1993]
    RIPng: RFC 2080 [1997]

RIP
  Distance Vector routing
    Uses hop count as the cost metric (up to 15)
    Restricts the size of the network to 15 hops
  Exchanges routing messages (advertisements)
    every 30 seconds
    Each advertisement consists of up to 25 routes (destination nets)

RIPv2 Packet Format
  Layout (32 bits per row; bit offsets 0, 8, 16, 24, 31)
    Command | Version | Must be zero
    Family of net 1 | Route Tag for net 1
    Address of net 1
    Subnet Mask for net 1
    Next Hop for net 1
    Distance to net 1
    Family of net 2 | Route Tag for net 2
    Address of net 2
    Subnet Mask for net 2
    Next Hop for net 2
    Distance to net 2

RIP Packet Format and Stability
  RIP packet format
    commands: request or reply, version number
    up to 25 destination addresses
  Stability
    hop count limit: 15 (a metric of 16 means infinity)
    Stabilization timer:
      allows RIP to learn all routes from its neighbors before sending full updates
    Split horizon
      no update on backward route (omits routes learned from that neighbor)
    Poison reverse update
      updates sent to a neighbor include the routes learned from that neighbor, but with the route metric set to infinity

Routing Table of RIP
  Taken from a Cisco router at cs.ccu.edu.tw
  Destination       Gateway             Distance/Hop  Update timer  Flag  Interface
  35.0.0.0/8        140.123.1.250       120/1         00:00:28      R     Vlan1
  127.0.0.0/8       directly connected                              C     Vlan0
  136.142.0.0/16    140.123.1.250       120/1         00:00:17      R     Vlan1
  150.144.0.0/16    140.123.1.250       120/1         00:00:08      R     Vlan1
  140.123.230.0/24  directly connected                              C     Vlan230
  140.123.240.0/24  140.123.1.250       120/4         00:00:22      R     Vlan1
  140.123.241.0/24  140.123.1.250       120/3         00:00:22      R     Vlan1
  140.123.242.0/24  140.123.1.250       120/1         00:00:22      R     Vlan1
  192.152.102.0/24  140.123.1.250       120/1         00:01:04      R     Vlan1
  0.0.0.0/0         140.123.1.250       120/3         00:00:08      R     Vlan1

Open Source Implementation 4.9: RIP
  GNU Zebra Project
    Supports many routing protocols
      RIP, OSPF, BGP
    Runs the routing daemon as a user process
      Communicates with the kernel via netlink

Open Source Implementation 4.9 (cont)
  Routing Daemon and Kernel
  [Figure: user space holds the routing manager (Zebra, routed, gated, …), which handles protocol-specific packets; kernel space holds the kernel and the routing table. Labels on the flows: control packets, data packets, packets from NICs.]

Open Source Implementation 4.9 (cont)
  Overview of Zebra Routing Protocols
  [Figure: the protocol daemons RIPd, RIPngd, OSPFd, and BGPd exchange routing information (via socket interface) with the Zebra daemon; the Zebra daemon reaches the kernel routing table through ioctl, sysctl, proc fs, and netlink/rtnetlink.]

Open Source Implementation 4.9 (cont)
  RIP Daemon (ripd)
  [Figure: block diagram of ripd. Blocks and the names listed in them:]
    Initialization
    Scheduling
    RIP core: rip_version, rip_default_metric, rip_timers, rip_route, rip_distance
    routemap
    Interface: rip_network, rip_neighbor, rip_passive_interface, ip_rip_version, ip_rip_authentication, rip_split_horizon
    RIP Peer: rip_peer_timeout, rip_peer_update, rip_peer_display
    offset
    Zebra client, connected to the Zebra Daemon

OSPF Features (1/3)
  OSPF v2: RFC 2328 [1998]
  OSPF v3: RFC 2740 [1999]
  Runs internal to a single Autonomous System
  Link-state routing protocol
  A shortest-path tree is constructed to build the routing table
    Dijkstra algorithm
  Support for equal-cost multipath routing
  Support for TOS-based routing
  Supports variable subnet length
    each route distributed has a destination and mask

OSPF Features (2/3)
  Integrated unicast and multicast support:
    Multicast OSPF (MOSPF) uses the same topology database as OSPF
  Two levels of hierarchy: areas within an AS
    Area: a group of contiguous networks and hosts
      The topology of an area is invisible from outside
    Routing in the AS takes place on two levels
      intra-area routing, inter-area routing

OSPF: Two Levels of Hierarchy
  [Figure: an AS made of a Backbone and Areas A, B, and C. Router roles labeled in the figure: AS boundary router, area border routers, backbone router, internal routers.]

OSPF Hierarchy
  Area border routers
    "summarize" distances to the networks of their area
    advertise to other area border routers
  Area internal routers
    Only participate in intra-area routing
    Receive external routes broadcast by the area border router
  Backbone routers
    run OSPF routing limited to the backbone
  AS boundary routers
    connect to other ASs

OSPF Features (3/3)
  External routing data is advertised through the AS
    Flooded without modification
  Two types of cost
    type 1: compatible with costs within the area; the cost to an external network is the sum of the internal cost and the external cost
    type 2: an order of magnitude larger; the cost to an external network is solely determined by the external cost

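The two external cost types differ only in whether the internal distance to the advertising AS boundary router is added. A minimal C sketch, with hypothetical names, makes the rule concrete:

```c
enum ext_metric_type { EXT_TYPE1 = 1, EXT_TYPE2 = 2 };

/* Cost of reaching an external network.
 * internal_cost: cost inside the AS to the AS boundary router.
 * external_cost: cost advertised for the external network. */
int external_route_cost(enum ext_metric_type type,
                        int internal_cost, int external_cost)
{
    if (type == EXT_TYPE1)
        return internal_cost + external_cost;   /* sum of both parts */
    return external_cost;                       /* type 2: external only */
}
```

For illustration, with an internal cost of 8 and an external cost of 8, a type 1 route costs 16 while a type 2 route costs 8.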
OSPF
  Features
    Supports stub areas to reduce broadcasting
      An area can be configured as a stub when there is a single exit point from the area.
      AS boundary routers cannot be placed internal to stub areas.
      No AS external advertisements are flooded into or through stub areas.

[Figure: sample OSPF autonomous system with link costs. Routers RT1-RT12, networks N1-N15, host H1, and interfaces Ia and Ib, divided into Area 1, Area 2, and Area 3; a "Stub" label also appears. Legend: internal router, area border router, AS boundary router.]

OSPF Example: Intra-area
  Summarized area information advertised by RT3 and RT4 to the backbone.
  Network  Cost advertised by RT3  Cost advertised by RT4
  N1       4                       4
  N2       4                       4
  N3       1                       1
  N4       2                       3
  [Figure: Area 1, with routers RT1, RT2, RT3, RT4 and networks N1-N4]

OSPF Example: Inter-area
  Backbone information advertised into Area 1 by RT3 and RT4.
  Destination  Cost advertised by RT3  Cost advertised by RT4
  Ia, Ib       20                      27
  N6           16                      15
  N7           20                      19
  N8           18                      18
  N9-N11       29                      36
  RT5          14                      8
  RT7          20                      14
  [Figure: backbone portion of the sample network, with routers RT3, RT4, RT5, RT6, RT7, RT10, networks N6, N12-N15, and interfaces Ia and Ib]

OSPF Example: Final Routing Table
  RT4's routing table
  Destination  Path Type        Cost  Next Hop
  N1           intra-area       4     RT1
  N2           intra-area       4     RT2
  N3           intra-area       1     direct
  N4           intra-area       3     RT3
  N6           inter-area       15    RT5
  N7           inter-area       19    RT5
  N8           inter-area       18    RT5
  N9-N11       inter-area       36    RT5
  N12          Type 1 external  16    RT5
  N13          Type 1 external  16    RT5
  N14          Type 1 external  16    RT5
  N15          Type 1 external  23    RT5

Open Source Implementation 4.10: OSPF
  [Figure: block diagram of the OSPF daemon. Blocks and the names listed in them:]
    Initialization
    Scheduling
    OSPF core: ip_ospf_interface, ip_ospf_neighbor, ospf_router_id, network_area, show_ip_ospf_cmd
    Route Map: route_map_update, route_map_event
    Interface
    LSA (Link State Advertisement)
    LSDB
    OSPF Flooding
    OSPF SPF calculation
    ASE (AS external route calculation)
    Route
    Network
    zclient, connected to the Zebra daemon

Inter-domain Routing
Border Gateway Protocol (BGP)
Inter-domain Routing
  Called Exterior Gateway Protocols (EGP)
  Most common EGP
    BGP: Border Gateway Protocol

BGP Features (1/3)
  RFC 1771 (BGP-4)
  "Path vector" routing
    loop-free inter-domain routing between ASs
  Runs over TCP with port 179
  Routing table keeps all feasible paths
    Only advertises the optimal path to neighbors

BGP Features (2/3)
  Can be used within and between ASs
    multiple border routers (BGP speakers) within an AS
  IBGP: Interior BGP
    runs between routers in the same AS
    All BGP speakers within the AS must be fully meshed (through the IGP protocol)
  EBGP: Exterior BGP
    runs between routers belonging to two different ASs

BGP Features (3/3)
  Supports information aggregation
    CIDR
    Confederation
      could also be used to allow multiple ASs within an AS
  Policy routing at AS
    access-list permit or deny (route or path filtering)
  Link cost metric
    combination of different metrics with degrees of preference (weight, local pref, MED, …)

BGP Messages
  Open
    First message sent after connection
  Keepalive
    Sent often enough to keep the timer from expiring
  Update
    No periodic refresh of the entire table
    Advertises a single feasible route to a peer
    Withdraws multiple routes previously advertised
    Message contains path attributes and Network Layer Reachability Information (NLRI)
  Notification
    sent when an error is detected

BGP Routing Algorithm
  Path vector routing
    Different ASs may have different link cost metrics
    Being loop-free is very important
    Policy routing is preferred (different priorities, prohibit lists, …)
  AS_PATH of the path attribute
    A list of ASs to the destination
    A loop is found if the current AS is already in the AS_PATH
  Next_Hop of the path attribute indicates the next router to the destination
  NLRI
    A list of subnets that can be reached by the AS_PATH

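The loop check on AS_PATH amounts to a membership test: a BGP speaker rejects a route whose AS_PATH already contains its own AS number. A minimal C sketch follows; the function name `as_path_has_loop` is hypothetical, and the AS_PATH is assumed to be available as a plain array of AS numbers.

```c
#include <stddef.h>
#include <stdint.h>

/* Return 1 if local_as already appears in the AS_PATH, which means that
 * accepting the route would create a loop; return 0 otherwise. */
int as_path_has_loop(const uint32_t *as_path, size_t len, uint32_t local_as)
{
    for (size_t i = 0; i < len; i++)
        if (as_path[i] == local_as)
            return 1;
    return 0;
}
```

For instance, a router in AS 4780 that receives a route with AS_PATH 9918, 4780, 9739 would detect a loop and discard the route.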
BGP Path Selection
  Path selection
    (1) If Next_Hop is inaccessible, drop the update
    (2) Prefer largest LOCAL_PREF
    (3) Prefer shorter AS_PATH
    (4) Prefer lower origin code (igp<egp<incomplete)
    (5) Prefer lower MED (MULTI_EXIT_DISC)
    (6) Prefer external path over internal path
    (7) Prefer closer IGP neighbor
    (8) Prefer BGP router with lower IP address
  Advertise the path with the highest degree of preference for each destination to neighbor BGP speakers

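The eight selection rules can be read as a chain of tie-breakers, which the following C sketch encodes as a pairwise comparison. It is a simplification under stated assumptions: `struct bgp_path` and `bgp_better` are hypothetical, rule (1) is modeled by letting a path with an unreachable next hop lose every comparison rather than by dropping the update, and MED is compared unconditionally even though real implementations usually compare it only between paths from the same neighboring AS.

```c
#include <stdint.h>

enum bgp_origin { ORIGIN_IGP = 0, ORIGIN_EGP = 1, ORIGIN_INCOMPLETE = 2 };

struct bgp_path {
    int             next_hop_reachable; /* rule 1 */
    uint32_t        local_pref;         /* rule 2: larger is better */
    int             as_path_len;        /* rule 3: shorter is better */
    enum bgp_origin origin;             /* rule 4: igp < egp < incomplete */
    uint32_t        med;                /* rule 5: lower is better */
    int             is_external;        /* rule 6: 1 if learned via EBGP */
    uint32_t        igp_cost;           /* rule 7: cost to the next hop */
    uint32_t        router_addr;        /* rule 8: lower address wins */
};

/* Return 1 if path a is preferred over path b, 0 otherwise. */
int bgp_better(const struct bgp_path *a, const struct bgp_path *b)
{
    if (a->next_hop_reachable != b->next_hop_reachable)
        return a->next_hop_reachable;
    if (a->local_pref != b->local_pref)
        return a->local_pref > b->local_pref;
    if (a->as_path_len != b->as_path_len)
        return a->as_path_len < b->as_path_len;
    if (a->origin != b->origin)
        return a->origin < b->origin;
    if (a->med != b->med)
        return a->med < b->med;
    if (a->is_external != b->is_external)
        return a->is_external;
    if (a->igp_cost != b->igp_cost)
        return a->igp_cost < b->igp_cost;
    return a->router_addr < b->router_addr;
}
```

Given three otherwise equal paths whose AS_PATH lengths are 2, 3, and 1, the comparison picks the one of length 1, because rule (3) is the first rule on which they differ.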
BGP PATH Attributes (1/2)
  Origin
    Defines the origin of the path information
      IGP, EGP, Incomplete (unknown, e.g., static route)
  AS_PATH
    Ordered list or a set
  Next_Hop
    IP address of the next hop to the destination
    For a multiaccess network, the next hop could be a router other than the BGP speaker

BGP PATH Attributes (2/2)
  LOCAL_PREF
    Indicates the preferred exit router within an AS
  Multi_Exit_Disc (MED)
    When a router has multiple external links to the same AS, the link to the router with the lower MED is preferred.

BGP Example
  Network          Next Hop         LOCAL_PREF/Weight  Best?  PATH                  Origin
  61.13.0.0/16     139.175.56.165   0                  N      4780,9739             IGP
                   140.123.231.103  0                  N      9918,4780,9739        IGP
                   140.123.231.100  0                  Y      9739                  IGP
  61.251.128.0/20  139.175.56.165   0                  Y      4780,9277,17577       IGP
                   140.123.231.103  0                  N      9918,4780,9277,17577  IGP
  211.73.128.0/19  210.241.222.62   0                  Y      9674                  IGP
  218.32.0.0/17    139.175.56.165   0                  N      4780,9919             IGP
                   140.123.231.103  0                  N      9918,4780,9919        IGP
                   140.123.231.106  0                  Y      9919                  IGP
  218.32.128.0/17  139.175.56.165   0                  N      4780,9919             IGP
                   140.123.231.103  0                  N      9918,4780,9919        IGP
                   140.123.231.106  0                  Y      9919                  IGP

4.7 Multicast
  Internet Group Management Protocol (IGMP)
  Distance Vector Multicast Routing Protocol (DVMRP)
  Protocol-Independent Multicast (PIM)
  New Developments: SSM, MSDP, Anycast RP
  Multicast Backbone (MBONE)

Multicast
  Communication among more than two parties
    Multi-party video conferencing
    Distance learning
  Issues
    Maintain group member information
    Construct a multicast tree for packet transmission
    Many-to-many communication

Membership Management
IGMP
Internet Group Management Protocol (IGMPv2)
  RFC 2236
  Used by IP hosts to report multicast group memberships to routers
  Enhances IGMPv1
    Querier election mechanism
    IGMPv2 Leave Group message
    Group-Specific Query message

Protocol Overview (1/4)
  A multicast router plays one of two roles: Querier or Non-Querier
    The Querier is responsible for maintaining membership information
    The router with the smallest IP address becomes the Querier
      Routers hear each other's Query messages and decide accordingly
    The Querier periodically sends a General Query to solicit membership information
      A General Query is sent to 224.0.0.1 (ALL-SYSTEMS multicast group)

Protocol Overview (2/4)
  When a host receives a General Query
    Delays a random time from the range [0..Max Response Time] (starts a timer)
      Max Resp. Time is given in the Query message
    Sends a report with TTL=1 when the timer expires
    Report suppression
      If another host's report is received, stops the timer and does not send the report
  The same applies when a host receives a Group-Specific Query

Protocol Overview (3/4)
  When a router receives a report
    Adds the group being reported to the list of multicast groups
    Sets a timer for the membership to [Group Membership Interval]
      Deletes the membership if no reports are received before the timer expires
      A Query is sent periodically
  When a host joins a multicast group
    Sends an unsolicited report immediately

Protocol Overview (4/4)
  When a host leaves a multicast group
    If it was the last host to reply to a Query, it should send a Leave Group message to the all-routers multicast address (224.0.0.2)
  When a router receives a Leave Group message
    Sends Group-Specific Queries every [Last Member Query Interval] to the group being left, [Last Member Query Count] times
    If no reports are received before [Last Member Query Interval] expires, assumes there are no local members

IGMPv2 Message Format (1/2)
  Message format (32 bits per row; bit offsets 0, 8, 16, 24, 31)
    Type | Max. Resp. Time | Checksum
    Multicast Group Address
  Type
    0x11 = Membership Query
      - General Query
      - Group-Specific Query
    0x16 = Version 2 Membership Report
    0x17 = Leave Group

IGMPv2 Message Format (2/2)
  Max Response Time
    - only in membership query messages
    - set to zero in other messages
  Checksum
    - 16-bit one's complement
  Group address
    - zero when sending a General Query
    - group address when sending a Group-Specific Query

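The 8-byte IGMPv2 message and the field rules above can be put together in a short C sketch. The helper names `igmp_checksum`, `igmpv2_build`, and `igmpv2_is_general_query` are hypothetical; the checksum is the 16-bit one's complement of the one's-complement sum over the whole message, computed with the checksum field set to zero.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define IGMP_MEMBERSHIP_QUERY 0x11
#define IGMP_V2_REPORT        0x16
#define IGMP_LEAVE_GROUP      0x17

/* 16-bit one's-complement checksum over an even-length buffer. */
uint16_t igmp_checksum(const uint8_t *buf, size_t len)
{
    uint32_t sum = 0;

    for (size_t i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
    while (sum >> 16)
        sum = (sum & 0xffffu) + (sum >> 16);
    return (uint16_t)(~sum & 0xffffu);
}

/* Build an IGMPv2 message: Type | Max Resp Time | Checksum | Group Address.
 * Max Response Time is meaningful only in queries and zero otherwise. */
void igmpv2_build(uint8_t msg[8], uint8_t type, uint8_t max_resp,
                  const uint8_t group[4])
{
    msg[0] = type;
    msg[1] = (type == IGMP_MEMBERSHIP_QUERY) ? max_resp : 0;
    msg[2] = 0;
    msg[3] = 0;
    memcpy(msg + 4, group, 4);

    uint16_t c = igmp_checksum(msg, 8);
    msg[2] = (uint8_t)(c >> 8);
    msg[3] = (uint8_t)(c & 0xff);
}

/* A General Query is a Membership Query whose group address is zero. */
int igmpv2_is_general_query(const uint8_t msg[8])
{
    static const uint8_t zero[4] = { 0, 0, 0, 0 };

    return msg[0] == IGMP_MEMBERSHIP_QUERY && memcmp(msg + 4, zero, 4) == 0;
}
```

A Group-Specific Query uses the same type byte but carries the group address, so the two kinds of query are told apart only by the last four bytes.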
IGMPv3
  IETF RFC 3376
  Adds support for "source filtering"
    A receiver may request to receive packets only from specific source addresses
    Select source addresses by INCLUDE or EXCLUDE
      IPMulticastListen(socket, interface, multicast-address, filter-mode, source-list)
      filter-mode: INCLUDE or EXCLUDE

Multicast Routing Protocols
  DVMRP
  PIM-SM
  SSM
  MSDP
  Anycast RP

Multicast Routing Protocols
  Two types of multicast tree
    source-based tree
    core-based tree (shared tree)
    What's the difference: per (S,G) tree or per (*,G) tree
  Multicast protocols
    DVMRP
    PIM
      Sparse mode
      Dense mode
    SSM
    MSDP
    Anycast RP
    MBGP

Example where Steiner tree is different from least-cost-path tree
  [Figure: four-node graph with nodes A, B, C, D; the link costs shown are 3, 1, 3, 4, 1, 3]
  Copyright reserved 2001 (Lin & Hwang)

Distance Vector Multicast Routing Protocol (DVMRP)
  RFC 1075
  Derived from RIP
    Relies on RIP for unicast routing
  Widely used on the MBone
    Enables incremental deployment of IP multicast since it supports tunnels
  Constructs a source-based tree per source
    Provides a shortest path between source and receivers using the Reverse Path Forwarding (RPF) algorithm

RPF Algorithm

Three steps
- Reverse Path Broadcast (RPB)
- Prune to a Reverse Path Multicast (RPM) tree
- Forwarding data uni-directionally
Reverse Path Broadcast (RPB)

Broadcast on the Reverse Path

When a multicast packet is received
- Forward the packet on all outgoing links only if
  - the packet arrives on the interface that lies on the shortest path back to the sender, and
  - the packet is not a duplicate
- Otherwise, discard the packet
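The forwarding rule above can be sketched as a small router model. Everything here is hypothetical scaffolding for illustration: the class `RpbRouter`, the interface names, and the `packet_id` used to detect duplicates are assumptions, and the `rpf_iface` table stands in for whatever the unicast routing table reports as the interface on the shortest path back to each source.

```python
class RpbRouter:
    """Toy router that applies the Reverse Path Broadcast check."""

    def __init__(self, interfaces, rpf_iface):
        self.interfaces = list(interfaces)
        # rpf_iface[source] = interface on the shortest path back to source
        self.rpf_iface = dict(rpf_iface)
        self.seen = set()  # (source, packet_id) pairs already forwarded

    def receive(self, source, packet_id, arrival_iface):
        """Return the interfaces to forward on; an empty list means discard."""
        if self.rpf_iface.get(source) != arrival_iface:
            return []  # did not arrive on the reverse shortest path
        if (source, packet_id) in self.seen:
            return []  # duplicate
        self.seen.add((source, packet_id))
        # Forward on every link except the one the packet arrived on.
        return [i for i in self.interfaces if i != arrival_iface]


router = RpbRouter(["eth0", "eth1", "eth2"], {"S": "eth0"})
first = router.receive("S", 1, "eth0")      # passes the RPF check
duplicate = router.receive("S", 1, "eth0")  # same packet again
off_path = router.receive("S", 2, "eth1")   # wrong arrival interface
```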
RPB Example
[Figure: RPB example with a source and routers RA to RG; legend: member, mrouter, router w/o member; arrows: Forward, Discard]
Prune RPB Tree

Prune to RPM tree
- Routers that do not lead to any members send prune messages to upstream routers
- Routers know membership information via IGMP
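Pruning can be sketched as a bottom-up walk over the broadcast tree: a router stays on the tree only if it has a local member or at least one downstream router that stays. The topology below (routers R1 to R7) is invented for the example and is not the one drawn in the slides; `prune_tree` is a hypothetical helper that computes the end result rather than exchanging prune messages.

```python
def prune_tree(children, has_member, root):
    """Return the set of routers left on the RPM tree after pruning.

    children:   dict mapping a router to its downstream routers on the RPB tree
    has_member: set of routers that learned of local members (e.g. via IGMP)
    root:       the router attached to the source (always kept)
    """
    kept = set()

    def visit(node):
        keep = node in has_member
        for child in children.get(node, []):
            # Visit every child, even if this node is already known to stay.
            if visit(child):
                keep = True
        if keep:
            kept.add(node)
        return keep  # False means this router sends a prune upstream

    visit(root)
    kept.add(root)
    return kept


# Hypothetical broadcast tree rooted at R1.
rpb_children = {"R1": ["R2", "R3"], "R2": ["R4", "R5"], "R3": ["R6"], "R6": ["R7"]}
members = {"R4", "R7"}
rpm_routers = prune_tree(rpb_children, members, "R1")
```

In this example R5 has no members and no downstream routers, so it is the only router pruned.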
Prune RPB Tree Example
[Figure: pruning example with a source and routers RA to RG; legend: member, mrouter, router w/o member; arrows: Forward, Prune]
Example of an RPM tree
[Figure: RPM tree rooted at the source; legend: member, router w/ member, router w/o member; arrows: Forward]
DVMRP Drawbacks and Benefits

Drawbacks
- First packet has to be flooded
- Periodic prune state refresh
- Routing state per (source, group) pair
Benefits
- Guarantees efficient delivery
- Easy to implement
Problems of DVMRP

Works well only for densely populated groups
- Periodic broadcast will cause performance problems
Large amount of state information stored
- Information for forwarding
- Prune-state information
Not scalable
PIM-SM



Protocol Overview
Special Features
Packet Formats
Protocol Overview

Documents
- RFC 2362, RFC 4601 (August 2006)
Terminologies
- DR: Designated Router
- RP: Rendezvous Point
- RPT: RP-based Tree
PIM-SM routes packets in three phases
- Phase one: RP tree
- Phase two: Register Stop
- Phase three: Shortest-Path Tree (optional)
Phase One: RP Tree

Receiver
- Sends a join message to its DR using IGMP
- DR sends a (*,G) PIM Join message toward the RP
  - The Join reaches the RP or converges on a router already on the RPT
  - The Join message is sent periodically (otherwise, the state times out)
Sender
- Sender sends a packet with the multicast address as its destination to its DR
- DR unicasts the encapsulated packet to the RP
  - PIM Register packets
- RP decapsulates it and forwards it onto the RPT
Phase One: RP Tree (Fig)
[Figure: phase one; legend: Join, Encapsulated, Multicast Send; labels: member, source, DR, RP, RTA, A, B, (*,G)]
Phase Two: Register Stop

Motivation
- Encapsulation and decapsulation are too expensive
Steps
- RP initiates an (S,G) source-specific Join toward S
- All routers on the path record the (S,G) multicast state
- Packets start to flow along the (S,G) tree to the RP
- RP may now receive duplicate packets, native and encapsulated; it discards the encapsulated copies
- RP sends a Register-Stop message to the DR of the source
- RP forwards native packets onto the RPT
- If a packet reaches a router with (*,G) state, it takes a shortcut to the receivers
Phase Two: Register Stop (Fig)
[Figure: phase two; legend: Source specific join; labels: member, source, DR, RP, (S,G)]
Phase Three: Shortest-Path Tree

Motivation
- The path from the source to the RP and then to the receivers is too long
Steps
- A receiver’s DR may optionally initiate a switch from the RPT to a source-specific shortest-path tree (SPT)
- It issues an (S,G) join toward S; the join may reach the source or converge at some router
- It starts to receive two copies of each packet and drops the one from the RPT
- It then sends an (S,G) prune message toward the RP
  - (S,G,rpt) prune
  - The prune message reaches the RP or converges at some router on the RPT
Phase Three: Shortest-Path Tree (Fig)
[Figure: phase three; legend: Source specific join (IGMPv3), Source specific prune; labels: member, source, DR, RP, (S,G), (S,G,rpt)]
Source-specific Joins and Prunes

If a receiver sends a source-specific join using IGMPv3
- If there is no other receiver for that group, the DR may omit the (*,G) join
- Instead, the DR issues a source-specific (S,G) join
Multicast addresses for source-specific multicast
- 232.0.0.0 to 232.255.255.255
- Only source-specific joins are accepted for groups in this range
A receiver may also send a source-specific join with an exclusion source list
- DR performs a (*,G) join as normal, but may combine it with an (S,G,rpt) prune for each source in the list
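The DR behavior described above can be sketched as a function that maps one receiver's request to the PIM messages the DR would originate. This is a simplified model under stated assumptions: there is a single receiver for the group, `dr_actions` and the tuple format are hypothetical, and the address range is expressed as the prefix 232.0.0.0/8, which covers 232.0.0.0 to 232.255.255.255.

```python
import ipaddress

# Source-specific multicast range from the slide, written as a prefix.
SSM_RANGE = ipaddress.ip_network("232.0.0.0/8")


def dr_actions(group, filter_mode, sources):
    """Return the PIM messages a DR might send for one receiver's request.

    Hypothetical model: each action is (message, source, group).
    """
    in_ssm = ipaddress.ip_address(group) in SSM_RANGE
    if filter_mode == "INCLUDE":
        # Source-specific join: skip the (*,G) join, send (S,G) joins.
        return [("join", s, group) for s in sources]
    if filter_mode == "EXCLUDE":
        if in_ssm:
            return []  # only source-specific joins are accepted here
        # (*,G) join as normal, plus an (S,G,rpt) prune per excluded source.
        return [("join", "*", group)] + [("prune_rpt", s, group) for s in sources]
    raise ValueError("filter_mode must be INCLUDE or EXCLUDE")


ssm_join = dr_actions("232.1.1.1", "INCLUDE", ["192.0.2.1"])
asm_exclude = dr_actions("224.2.2.2", "EXCLUDE", ["192.0.2.9"])
```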
Inter-domain Multicast: MSDP



The RP in each domain establishes an MSDP peering relation with RPs in other domains
When the RP learns of a new multicast source within its own domain, it informs its MSDP peers
- The RP encapsulates the first data packet in a Source Active (SA) message and sends the SA to all peers
- Each receiving peer forwards the SA using a modified RPF check
- If the receiving MSDP peer is an RP and has a (*,G) entry for the group in the SA, it sends an (S,G) join; it also decapsulates the data and forwards it down its shared tree
- A receiver interested in this (S,G) can send an (S,G) join to obtain the shortest path to the source
Each RP periodically sends SAs that include all sources within its domain
Inter-domain Multicast: Multi-Protocol BGP (MBGP)

Defined in RFC 2283 (extensions to BGP)
MBGP extends BGP to carry routing information for different address families
- IPv4 Unicast
- IPv6 Unicast
- IPv4 Multicast
- IPv6 Multicast
- ….
Routing information for these may be carried in the same BGP session
Open Source Implementation 4.12: Mrouted
Data structures of Mrouted
[Figure: routing_table is a list of rtentry nodes linked by rt_next; each rtentry's rt_groups points to gtable entries for the groups originated from the same source; gtable entries carry gt_next/gt_prev and gt_gnext/gt_gprev links]
Summary on Multicast

Source-based tree
- Advantage: optimal path between sources and receivers
- Disadvantage: routing information for each (S,G) pair
Shared tree
- Advantage: less state in each router
- Disadvantage: non-optimal path between sources and receivers
4.8 Summary





Forwarding: longest prefix matching
Routing: two-level, intra-domain and inter-domain
Distance vector routing vs. link state routing: distributed vs. centralized
Other mechanisms: IPv6, NAT, ARP, DHCP, ICMP
Broadcast in subnet: used by ARP and DHCP
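As a reminder of what longest prefix matching means in forwarding, here is a minimal sketch using Python's standard `ipaddress` module. The table contents and next-hop names are made up for the example, and the linear scan is chosen for clarity; real routers use tries or hardware lookup structures instead.

```python
import ipaddress


def longest_prefix_match(table, destination):
    """Return the next hop of the most specific prefix containing destination.

    table: dict mapping prefix strings (e.g. "10.0.0.0/8") to next hops.
    Returns None when no prefix matches.
    """
    addr = ipaddress.ip_address(destination)
    best = None  # (network, next_hop) of the longest match so far
    for prefix, next_hop in table.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, next_hop)
    return best[1] if best else None


# Hypothetical forwarding table with a default route.
forwarding_table = {
    "0.0.0.0/0": "default-gw",
    "10.0.0.0/8": "router-A",
    "10.1.0.0/16": "router-B",
}
hop = longest_prefix_match(forwarding_table, "10.1.2.3")
```

Here 10.1.2.3 matches all three prefixes, and the /16 entry wins because it is the longest.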