Graduate Program in Telecommunications
George Mason University
Technical Report Series
4400 University Drive MS#2B5
Fairfax, VA 22030-4444 USA
http://telecom.gmu.edu/
703-993-3810
Multicast Virtual Private Networks
CHRISTOPHER LENART
[email protected]
Technical Report GMU-TCOM-TR-09
Abstract
Multicast has long been a popular technology in computer networks for the efficient distribution of data, such as patches
or live video, to multiple users simultaneously. The early implementations were always restricted to a single network, and
a remote office would need its own multicast distribution system separate from a main office, for example. This report
describes Next-Generation Multicast Virtual Private Networks (NG-MVPN). NG-MVPN is a popular technology used by
service providers to connect the multicast networks of several locations over their network. This report
begins by describing the building blocks of NG-MVPN. These are Multicast, Multiprotocol Label Switching (MPLS),
Border Gateway Protocol (BGP) and BGP/MPLS VPNs. The report assumes the reader already has an understanding
of these technologies. For brevity, the essential parts of these technologies required for NG-MVPN are discussed. The
service provider multicast technology, MVPN (mVPN), written by Eric Rosen and also called Draft Rosen MVPN, is
also discussed as background. Lastly, this report also discusses Global Table Multicast (GTM), which is an extension of
NG-MVPN that uses the global routing table rather than the segregated routing tables used for BGP Virtual Private
Networks. Resources for this report are mainly IETF Requests for Comments, but also include technical books, technical
articles, and personal communication. All references are cited and listed at the end of the report.
Contents

Introduction

1 Building Blocks: Multicast, BGP, and MPLS
  1.1 Multicast
    1.1.1 Multicast Addressing
      1.1.1.1 Types of Multicast Addresses
    1.1.2 Multicast Distribution Trees
      1.1.2.1 Reverse Path Forwarding
    1.1.3 Internet Group Management Protocol
    1.1.4 Protocol Independent Multicast
      1.1.4.1 PIM Sparse-Mode
      1.1.4.2 PIM Dense-Mode
      1.1.4.3 PIM Single-Source Mode
  1.2 MPLS
    1.2.1 MPLS Signaling
      1.2.1.1 LDP
      1.2.1.2 RSVP-TE
  1.3 BGP
    1.3.1 UPDATE Message
    1.3.2 Multiprotocol BGP
  1.4 BGP/MPLS Virtual Private Networks
    1.4.1 Network Topology and Terminology
    1.4.2 Virtual Routing and Forwarding Tables
    1.4.3 BGP Addressing and Advertisement
      1.4.3.1 VPNv4 Address Family
    1.4.4 Forwarding
    1.4.5 Inter-AS Considerations
    1.4.6 BGP/MPLS VPN Summary
  1.5 Generic Routing Encapsulation
  1.6 Control Plane vs Forwarding Plane

2 Draft Rosen Multicast Virtual Private Networks
  2.1 Overview of MVPNs
  2.2 MVPN Operation
    2.2.1 Multicast Distribution Trees
      2.2.1.1 MDTs and Generic Routing Encapsulation
      2.2.1.2 Default MDT
      2.2.1.3 Data MDT
    2.2.2 Auto-Discovery in MVPNs
    2.2.3 RPF
  2.3 Considerations for Inter-AS and BGP Free Core
    2.3.1 PIM MVPN Join Attribute
    2.3.2 BGP Connector

3 BGP/MPLS Multicast Virtual Private Networks
  3.1 Next-Generation Multicast VPN Overview
  3.2 PMSI
    3.2.1 Instantiating PMSIs
  3.3 PIM and BGP Control Plane
    3.3.1 PIM Control Plane for CE-PE Information
    3.3.2 MP-BGP Control Plane for PE-PE Information
      3.3.2.1 New BGP Path Attributes and Extended Communities
      3.3.2.2 MCAST-VPN NLRI
    3.3.3 MP-BGP for PE-PE Upstream Multicast Hop
      3.3.3.1 BGP for Upstream Multicast Hop Selection
      3.3.3.2 Upstream Multicast Hop Selection
  3.4 Forwarding Plane Considerations
    3.4.1 Tunnel Type 1 - RSVP-TE P2MP LSP
    3.4.2 Tunnel Type 2 - mLDP P2MP LSP
    3.4.3 Tunnel Type 3 - PIM-SSM
    3.4.4 Tunnel Type 4 - PIM-SM
    3.4.5 Tunnel Type 6 - Ingress Replication
    3.4.6 P-Tunnel Aggregation
  3.5 Global Table Multicast
    3.5.1 Use of NG-MVPN BGP Procedures in GTM
      3.5.1.1 Route Distinguishers and Route Targets
      3.5.1.2 UMH-Eligible Routes
      3.5.1.3 BGP Autodiscovery Routes
      3.5.1.4 BGP C-Multicast Routes
    3.5.2 Inclusive and Selective Tunnels

4 Summary
  4.1 Compare and Contrast
  4.2 Receiver Sites: All or Some
  4.3 NG-MVPN vs GTM
  4.4 Conclusion

List of Figures

1.1 Basic Modes of Network Transmission
1.2 Unicast vs Multicast Trees
1.3 PIM-DM vs PIM-SM
1.4 MPLS LSPs
1.5 Point-to-Multipoint MPLS LSPs
1.6 LDP Signaling
1.7 Multicast LDP Signaling
1.8 RSVP-TE Signaling
1.9 Multicast RSVP-TE Signaling
1.10 Service Provider Network with Customer Sites
1.11 VRFs and Attachment Circuits
1.12 MP-BGP VPNv4 BGP UPDATE Message Example
1.13 VPN Label Advertisements
1.14 VPN Forwarding
1.15 Control Plane vs Forwarding Plane
2.1 MVPN Overview
2.2 MVPN Details
2.3 MVPN C-Instance LAN
2.4 MVPN Default MDT Operation
2.5 MVPN Data MDT Signaling
2.6 MVPN Data MDT Operation
3.1 BGP/MPLS Multicast VPN
3.2 Provider Multicast Service Interface
3.3 Shared Tree to Source Tree Switchover using Source Active A-D Routes
3.4 GTM Network Topology
Introduction
Every day more technology is utilizing digital methods of communication. The popular example of this is television,
where a handful of channels were sent using analog radio waves directly to an antenna on a house. There was nothing
in between. Today television content is created digitally then packaged digitally to be sent to a television provider’s
head-end. From there the content is sent over a private network to the home or even over the Internet. Between all
these points are finite sized communication channels. The content is growing in size too. Standard Definition Television
(SDTV) was upgraded to High Definition Television (HDTV). HDTV bandwidth is increasing even further with 4K and
8K HDTV, the nomenclature coming from the approximate number of horizontal pixels. All of this extra bandwidth is challenging those
finite communication channels and they must be constantly updated to keep up. Television isn’t the only use case that
is choking networks. Large enterprise networks have servers that maintain software updates, or may also stream an
executive message video.
Multicast steps in by allowing a network to send one copy of a packet over a link from a source to many receivers. Rather
than having to send a stream to each server, which is the case with unicast, a source can send one stream and let the
network do the work in getting that stream to anyone who wants to receive it. Multicast also keeps track of where
the interested receivers are, so unlike broadcast, the stream only goes to parts of the network rather than all of the
network.
Companies have embraced the use of Virtual Private Networks over Service Provider networks for a number of years,
which allow them to distribute traffic between remote sites without having to build their own infrastructure. These Virtual
Private Networks have been extended to distribute Multicast across them in a scalable manner. This report explores the
network technologies that provide the Virtual Private Networks and how they have been updated and modified to care for
multicast traffic.
Approach
The intention of this technical report is to walk the reader through the various Multicast VPN technologies. Rather than
jump straight into the multicast technologies and describe each underlying technology involved, the approach is to present
the underlying technologies up front and then put them together when discussing the Multicast VPNs. The report starts
with basic concepts that are then built on for the various approaches to doing Multicast VPNs. It is assumed as well that
the reader already has a background in various computer network technologies. The report is laid out as follows:
• Building Blocks
• Draft Rosen Multicast VPNs
• Next-Generation Multicast VPNs
• Global Table Multicast
• Summary
Building Blocks This chapter explains the basics of multicast, MPLS, and BGP that are relevant to multicast VPNs.
The topics are cherry-picked so that there is an understanding of the underlying mechanisms for the various multicast
VPN technologies. The information from BGP and MPLS is combined to discuss Layer 3 VPNs (L3VPNs) which are a
major component of each multicast VPN technology discussed in this report. Much information regarding each technology
is omitted for brevity and simplicity.
Draft Rosen Multicast VPNs One of the first widespread implementations for multicast VPNs, or MVPN, was created
by Eric Rosen at Cisco. It was implemented while it was in draft status at the IETF, hence the name Draft Rosen
mVPN. Even though it was only released in draft status it had wide acceptance among the various telecommunications
vendors.
Next-Generation Multicast VPNs Draft Rosen MVPNs evolved to Next-Generation Multicast VPNs (NG-MVPNs)
which overcame some of the limitations of Draft Rosen MVPNs. This section focuses on the two IETF RFCs that were
used to establish the standard, building on the BGP and MPLS concepts established in the Building Blocks section.
Global Table Multicast is another Multicast VPN technology that relies on the mechanisms and semantics established by
the NG-MVPN standards. While NG-MVPN has routing table isolation for customers as a key characteristic, GTM relies
on the global routing table to reduce operational overhead when that isolation isn’t necessary. This part of the chapter
explores the differences between NG-MVPN and GTM.
Resources
This paper utilizes mainly the documents from the Internet Engineering Task Force (IETF) standards body. The IETF
releases standards in the form of Request For Comments (RFCs) which are allowed unlimited distribution. The initial
stage of an RFC is a draft which has many versions over its lifetime as it is edited, reviewed, and updated. Eventually
the draft is ratified as a standard to become an RFC and is assigned a number. Telecommunication vendors use these
standards to ensure interoperability with products created by other vendors. Each RFC referenced is mentioned in the
main body of the text as a plain-sight reference. Also where applicable the page number is referenced to assist in
identifying the location of a particular piece of information. Some information was taken from various texts as they have
additional illustrations or more elegant explanations of the technology at hand, or the amount of detail in an RFC was
not required.
Chapter 1
Building Blocks: Multicast, BGP, and MPLS
This chapter introduces the relevant concepts of Multicast, BGP, MPLS, and the combinations of BGP and MPLS that
are used in Multicast VPNs. Not all aspects of each technology will be covered. The reader is encouraged to follow the
references for a more in depth understanding of all the technologies.
1.1 Multicast
The familiar method of transmitting data or a message is unicast. This is the common model of one source node and
one destination node. An instant message that goes from one computer to another computer is a familiar example.
Another example is a single web server sending the contents of a web page to just one node at a time. A file download goes
from one server to the single user that needs it. Another transmission model is broadcast. In the case of a broadcast a
message is sent to all of the nodes on a network, and is generally limited to that local network. Broadcasts, if not used
properly, can overwhelm a network. The last model is multicast. Not everyone needs a file at the same time and not
everyone is watching the same channel at the same time. Multicast solves this problem by only sending the data to the
nodes that request it [1, p. 69–71].
Another problem that multicast solves is the escalating bandwidth problem. In the unicast model each person requesting
the data gets a copy. If 100 people request it, the source server will send 100 copies. With multicast the server only
needs to send one copy, and this copy gets replicated in the network by an intermediate node, such as a router or switch,
until each requesting user gets a copy. Each link in the network only has to forward one copy, even if 100 users are
requesting it [2, p. 1].
[Figure: three panels, Unicast, Broadcast, and Multicast, each showing a source, end nodes, and receivers.]
Figure 1.1: Basic Modes of Network Transmission
Figure 1.1 gives a graphical representation of the three main modes of transmission. The right-most graphic implies that
the source is sending one transmission but it is sent to multiple receivers that request the content. The mechanisms of
how a receiver requests data will be described later in this chapter. The figure also shows two major components of
the multicast network, the source and the receiver. In between are the nodes that replicate and forward the multicast
traffic.
1.1.1 Multicast Addressing
Internet Protocol (IP) Addresses are defined by five classes, A-E. Classes A, B and C are used predominately for
unicast, although certain addresses are used for broadcast. The addresses in each class can be further broken down
using subnetting, with the last IP address in a subnet reserved as the broadcast address for that subnet, typically a
particular Local Area Network (LAN). Class E addresses are reserved for future or experimental use, but have not had
any widespread implementation. Class D addresses are reserved for multicast, and are defined by the range 224.0.0.0
through 239.255.255.255. The exact specifications for the addressing are defined in RFC 1112. The addresses in this
range are also referred to as group addresses [3, p. 2]. Because they are part of the IP Protocol domain they still follow
the dotted decimal notation used for the other classes.
1.1.1.1 Types of Multicast Addresses
Within the Class D range, the addresses are further broken down into various groups, and may either be permanently
assigned or transient addresses. The assignments of the permanent addresses are maintained by the IANA after they are
specified in the IETF RFCs [4, p. 28].
Link Local-Scope Link local scope is within the range 224.0.0.0 through 224.0.0.255. This range contains addresses
specifically assigned to a function, such as routing protocol updates. The Time-to-Live (TTL) of packets sent to these addresses is
set to 1 so they are never forwarded off the local link. The addresses 224.0.0.1 and 224.0.0.2 have the
important assignments of being the “all hosts on subnet” and “all routers on subnet.”
Globally Scoped This is the large range of 224.0.1.0 through 238.255.255.255. These aren’t limited like the link-local
addresses and can be used to transmit information across large networks and the Internet. Some addresses have been
reserved for specific network functions, such as 224.0.1.1 for Network Time Protocol (NTP), as well as ranges assigned
to organizations (all within the 224.0.0.0/8 range).
Both the link-local scope and the globally scoped assignments were originally maintained in RFCs; however,
they are now maintained on the IANA website.
Limited Scope These fall within the range 239.0.0.0 through 239.255.255.255. These are analogous to private
addresses used for unicast, such as 10.0.0.0/8. Networks are required to use policies to prevent any traffic from this
range from leaving an autonomous system (AS). These are defined in RFC 2365.
GLOP Addressing GLOP addressing isn't an initialism or acronym; it's simply the name of the 233.0.0.0/8 range (233.0.0.0 through
233.255.255.255). Established in RFC 2770, this group of addresses was created for organizations that already had an AS
number assigned by the IANA. The AS number is inserted into the second and third octets of the address to create a
unique address range for the organization. This leaves the last octet as the assignable range [4, p. 28–30]. An example
of a GLOP address for AS 789 is 233.3.21.1 [5, p. 2].
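As a quick illustration of the GLOP arithmetic, the short sketch below derives an organization's /24 block by placing the high and low octets of its 16-bit AS number into the second and third octets of the address (the helper is our own illustrative example, not something defined in the RFC).

def glop_block(as_number: int) -> str:
    """Return the GLOP /24 multicast block for a 16-bit AS number."""
    if not 0 <= as_number <= 0xFFFF:
        raise ValueError("GLOP addressing only covers 16-bit AS numbers")
    high, low = as_number >> 8, as_number & 0xFF   # AS number split into octets 2 and 3
    return f"233.{high}.{low}.0/24"

# AS 789 -> 233.3.21.0/24, which contains the example address 233.3.21.1
print(glop_block(789))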
Source-Specific Multicast Well after multicast was created specific addresses were reserved solely for Source-Specific
Multicast (SSM). The range is 232.0.0.0/8 and any group using this address uses SSM. SSM requires special modifications
to Internet Group Management Protocol and Protocol Independent Multicast, which will be discussed in sections 1.1.3
and 1.1.4. RFC 4607 declares that the use of any address outside of this range is called Any-Source Multicast (ASM) [6,
p. 3]. This report will follow this convention.
1.1.2 Multicast Distribution Trees
An important part of forwarding multicast traffic through the network is the ability for a network node to build distribution
trees so it can do routing and forwarding. A network node with this capability can be referred to as a multicast-enabled
node, and since it is doing multicast routing these nodes will be referred to as a multicast-enabled router, or just multicast
router. Each multicast router is connected to other multicast routers and shares information with the use of special
multicast protocols to build trees.
There are two main types of trees: shared-based trees and source-based trees. Shared-based trees can be referred to as
shared trees. Source-based trees can be referred to as source trees or Shortest Path Trees (SPTs). In this report, to
prevent confusion, the terms shared trees and source trees will be used.
Both trees are based on a common notation referred to as (S,G) notation (pronounced “ess comma gee”) to represent a
set of sources and groups. The S represents the source of the stream and is the unicast IP address of the server that is
sending the traffic. The G represents the multicast group and it is the identification of a specific stream of traffic. A
source can have multiple groups associated with it. A group address could represent something like a specific file or a
channel in IP based TV. As discussed in the addressing section, the group address comes from the class D range of IPv4
addresses. An example of a source and group set would be (1.1.1.1,239.1.1.1) where 1.1.1.1 is the multicast source
server and 239.1.1.1 is the multicast group address. In shared trees the source is denoted by an asterisk and means “all
sources.” The notation is (*,G), and using the previous example is written as (*,239.1.1.1) to represent a specific group,
but no specific source.
Shared trees utilize a central point in the tree, referred to as a Rendezvous Point (RP). Sources send their traffic to the
RP then the RP forwards the traffic to all of the active receivers for a group. Shared trees use the (*,G) notation since
the source is unknown to the receiver and the traffic is sent to the RP. Source trees are simpler than shared trees since
the root of the tree is at the source. The tree then spans the multicast enabled network to all the receivers. This type
of tree makes use of the shortest path between the source and the receiver, and different trees may exist for different
groups. The source tree uses the (S,G) notation since the source is known [4, p. 41–43].
[Figure: three panels, Unicast, Source Tree, and Shared Tree, showing streams from sources S1 and S2 across intermediate nodes 1-7, with node 4 acting as the RP in the shared tree.]
Figure 1.2: Unicast vs Multicast Trees
Figure 1.2 compares unicast distribution to the source and shared mode multicast distribution trees. With unicast, the
source needs to send one copy per receiver for the same content. Contrast that to the source tree where source 1 (S1)
only needs to send one copy even though it has two receivers. The copy is replicated at intermediate node 5 and each
downstream node only receives one copy. Even if a downstream node, such as 7, had dozens of receivers attached to it
(directly or indirectly) node 5 would only have to send one copy to 7. In the shared tree intermediate node 4 is configured
to be the RP. The stream from source S2 is unchanged since it passed through that node anyway, but the stream
from S1 no longer takes the shortest path to node 5; it is instead sent to node 4 before being passed along to node 5 to then be
replicated.
1.1.2.1 Reverse Path Forwarding
Multicast routing co-exists with unicast routing in a network. Unicast routing is responsible for looking at the destination
of an IP packet1 and forwarding it out the interface that was determined to be on the best path by a unicast routing
protocol. When forwarding multicast packets the router needs to know the best path to the root or source of the tree in
the upstream direction in addition to which interfaces are toward the receivers in the downstream direction. Reverse
Path Forwarding (RPF) is employed by the router to ensure that there is a loop free topology. It does this by ensuring
that the multicast traffic is arriving on the same interface that is also the best path to the source. If the traffic arrives on
a different interface it is possible that there is a loop in the topology. RPF knows which interface is the best path to the
source by utilizing the unicast routing table, since the source of a multicast stream is a unicast address. When a multicast packet
arrives at a router it will check to make sure it arrived on the upstream interface. If it did, the router will forward it; if it
did not, the router will drop it [4, p. 47]. Referencing figure 1.2, intermediate node 5 will only forward traffic from S1 if
the traffic is coming from intermediate node 1; otherwise it will be dropped.
1 Datagram is the original technical term for an IP packet; however the common vernacular is to use packet when referring to IP encapsulated data.
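The RPF check itself amounts to comparing the interface a packet arrived on with the interface the unicast routing table would use to reach the packet's source. A minimal sketch follows, with an invented routing-table structure and interface names:

# Illustrative unicast routing table: source prefix -> best interface back toward that source.
unicast_rib = {
    "1.1.1.0/24": "ge-0/0/1",
}

def rpf_check(source_prefix: str, arrival_interface: str) -> bool:
    """Accept multicast traffic only if it arrived on the interface that is the best
    unicast path back toward the source; anything else may indicate a loop."""
    return unicast_rib.get(source_prefix) == arrival_interface

print(rpf_check("1.1.1.0/24", "ge-0/0/1"))  # True  -> forward downstream
print(rpf_check("1.1.1.0/24", "ge-0/0/2"))  # False -> drop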
1.1.3 Internet Group Management Protocol
At its most fundamental level, Internet Group Management Protocol (IGMP) is used by IP hosts (receiving nodes) to
announce they would like to receive traffic from a specific group or multiple groups, also referred to as dynamic host
registration. Multicast routers listen for these messages as well as send out queries to discover if hosts are active or idle.
IGMP was originally specified in RFC 1112, then was enhanced in RFC 2236 as IGMPv2 [4, p. 51]. One of the major
enhancements in IGMPv2 is to allow a host to leave a group rather than just timing out. The latest is IGMPv3 and is
specified in RFC 3376, and was updated by RFC 4604. RFC 3376 added the ability to filter by source [7, p. 1], while RFC
4604 adds wording for SSM.2 [8, p. 1].
IGMP messages are embedded into IP packets. There are three types of messages that are germane to the interaction
between the hosts and multicast routers: Membership Query, Membership Report, and Leave Group. The message
is distinguished by the type field in an IGMP message which is the payload within an IP packet. Queries are sent by
routers either to learn whether an attached network has any groups with active hosts, in the case of a general query, or to
learn whether a specific group has any active hosts, in the case of a group-specific query. The membership report is used by hosts to either respond
to a query, or to send an unsolicited report when an application is launched. The leave group message is used by hosts to
explicitly notify a router that it is leaving a group. In each case the group address is referenced in the message, except in
the case of a general query where the address is set to zero. In all cases the TTL of the packet is set to 1 so the router
cannot forward the message [9, p. 2–5].
RFC 3376 describes IGMPv3 and modifies the membership query and introduces a new membership report for version
3. The membership query is modified to support a list of one or more specific sources in the message. The group
format is still the same where the group address is set to zero for a general query and a group address is provided for a
group-specific query. The version 3 membership report is modified so that the IGMP message has one or more records,
and each group record can list one or more specific sources. The message itself specifies the number of group records,
and each group record specifies the number of sources for that record [7, p. 7–15]. The same RFC also specifies the
mechanism of INCLUDE and EXCLUDE modes. The INCLUDE mode specifies a list of sources that the host would like
to receive traffic from, and EXCLUDE specifies a list of sources that the host should not receive multicast traffic from.
These INCLUDE and EXCLUDE lists tell the router exactly which sources a host does, or does not, want traffic from [4, p. 55].
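To make the IGMPv3 report structure concrete, the sketch below models a version 3 membership report as a list of group records, each with its own record type and source list. It is a simplified data model of the fields described above, not the exact wire encoding.

from dataclasses import dataclass, field
from typing import List

@dataclass
class GroupRecord:
    group: str                                         # multicast group address
    mode: str                                          # "INCLUDE" or "EXCLUDE"
    sources: List[str] = field(default_factory=list)   # zero or more source addresses

@dataclass
class MembershipReportV3:
    records: List[GroupRecord]                         # a report carries one or more group records

# A host asking for group 232.1.1.1 only from source 1.1.1.1 (an SSM-style INCLUDE),
# and for 239.1.1.1 from any source except 10.0.0.5 (EXCLUDE).
report = MembershipReportV3(records=[
    GroupRecord(group="232.1.1.1", mode="INCLUDE", sources=["1.1.1.1"]),
    GroupRecord(group="239.1.1.1", mode="EXCLUDE", sources=["10.0.0.5"]),
])
print(len(report.records), "group records")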
RFC 4604 builds on RFC 3376 to add language regarding the source-specific multicast rules established in RFC 4607 (written
by the same authors and published at the same time). Specifically this references the 232.0.0.0/8 range and
establishes the concept of “SSM-aware” hosts and routers that recognize this address space [8, p. 1–6]. RFC 4607
states that when a host joins an SSM group the router should use SSM methods and does not need to use shared-tree
distribution (i.e. a source-tree can be used instead) [6, p. 3–4].
1.1.4 Protocol Independent Multicast
IGMP handles multicast signaling between a host and a multicast router. However, a separate protocol is needed between
the multicast routers themselves. Although there are several multicast routing protocols available, such as
Distance Vector Multicast Routing Protocol (DVMRP) and Multicast OSPF (MOSPF), this report focuses on Protocol
Independent Multicast (PIM), and its three modes: Sparse-Mode, Dense-Mode, and Single-Source Mode. PIM gets its
name from the fact that it does not rely on any specific routing protocol for it to function. It can use BGP, OSPF, IS-IS,
static routes, etc. This is in contrast to a protocol like MOSPF which requires OSPF as the routing protocol. PIM also
does not build its own routing topology, instead relying on the unicast routing tables provided by the aforementioned
routing protocols to build its distribution trees. Using the unicast routing table PIM can do reverse path checks and build
reverse path tables to maintain the interface used to most optimally reach a known source. PIM-DM is regarded to be
better when there is expected to be a large number of active receivers compared to the total number of receivers in the
network, and when the traffic is constantly being forwarded. PIM-SM is regarded as the better choice when the number
of active users will be a small percentage of the total receivers, or when the traffic for a group will be used sporadically
[4, p. 78–79].
2 Some recent texts mention only RFC 3376 as the reference for SSM; however the semantics specific to SSM are expanded in RFC 4604.
RFC 3376 does establish the message formatting for reports and queries with specific sources.
Note: From this point onward an IGMP membership report will be referred to as an IGMP Join. This is in line with
various other texts, articles, and sources regarding IGMP and PIM interaction.
1.1.4.1 PIM Sparse-Mode
PIM Sparse-Mode (SM) was originally specified in RFC 2117 which was later updated by RFC 2362. More recently RFC
4601 was created which obsoletes RFC 2362, fixes any errors from RFC 2362, as well as adds rules regarding how to
handle traffic using SSM addresses [10, p. 4]. PIM-SM relies on shared-trees for multicast distribution. At the center
of the tree is the Rendezvous Point (RP) which functions as an intermediary for the multicast routers attached to the
source and receivers. Another name for the shared tree is the RP Tree (RPT) since the tree for the receivers is rooted at
the RP. The location of the RP is either statically configured or learned dynamically by various methods, one of which is
the Bootstrap Router (BSR) method.
Each router builds a Multicast Routing Information Base (MRIB) which stores the best interface to use as a next-hop
for forwarding PIM messages. These messages are typically sent in the opposite direction of the multicast traffic being
forwarded, as is the case for a PIM Join or Prune message. The MRIB is based on reverse-path forwarding rules, meaning
it knows the best path back towards a source. Each source and receiver has a Designated Router (DR)3 that acts on its
behalf for various PIM related actions.
Each router also has a Tree Information Base (TIB) which contains the state of a multicast router by collecting all the
messages received via PIM and IGMP. It stores the state of all the multicast trees on the router [10, p. 5].
When a receiver sends an IGMP Join to its directly connected multicast router a PIM Join is sent to the RP. The
notation of this join is a (*,G) message meaning the source is undefined. The PIM Join will be propagated toward the
RP by each intermediate multicast router until it reaches the RP or another multicast router with a (*,G) entry for that
group already established. All routers with receivers for that group will be part of a tree that is rooted at the RP. PIM
Join messages are sent periodically as long as the DR has active receivers to prevent that section of the tree from timing
out. A source will always send its traffic to its local multicast router (DR). The source DR will encapsulate the traffic
in unicast packets and forward it to the RP, which decapsulates it and forwards it onto the tree for that group. This
source-to-RP mechanism is facilitated by a Register Message.
This method is inefficient however, and only needs to be used to establish an initial source-receiver relationship. When
the RP starts receiving the encapsulated packets from the source DR it will begin building a source tree path back
toward the source using (S,G) Joins that specifically contain the source address. Eventually the source specific (S,G)
Joins will make it back to the source DR. At this point, the source DR will forward unencapsulated packets toward the
RP. The RP will then be receiving two copies of the multicast traffic - encapsulated and unencapsulated. The RP will
drop the encapsulated packets and send a PIM Register-Stop to the source DR, and at this point the DR will stop
sending encapsulated packets to the RP for that group.
So far some efficiency has been gained in that the RP is now receiving unencapsulated native multicast traffic and forwarding it natively to the receivers as well. However, further efficiency is created by allowing the router attached
to the receiver to join a source based tree. With the traffic hitting the receiver’s router natively, this router now knows
the source for the group. It will initiate an (S,G) Join back toward the source (based on the MRIB, as it contains the
best path toward the source based on reverse-path forwarding built on the unicast tree) until it reaches the source router
or an intermediate router that already has an entry for that specific (S,G) pair. At some point in the tree a router will be
receiving traffic from the source on the shortest-path/source tree and the RP simultaneously. The router will drop the
traffic from the RP as well as send a special PIM Prune message toward the RP, denoted as an (S,G,rpt) Prune.4 [10,
p. 4–8].
3 The DR is one of several routers that exists on a LAN, and is selected through an election process.
4 The PIM Join and Prune message are actually the same message, referred to as a PIM Join/Prune Message. They are distinguished
based on whether the group address is in the Join or Prune field of the message [11, p. 708].
Another message used in PIM-SM is the Hello Message. The Hello Message is used by PIM to discover neighbors,
maintain adjacencies, and elect DRs in a LAN environment. The Hello messages contain a holddown timer which tells
the router how long to wait before determining a neighbor is down. The message is sent at a regular interval, typically a
number of seconds. The well known address used for Hello Messages is the ALL-PIM-ROUTERS address of 224.0.0.13
[10, p. 21].
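As a rough summary of the state described in this section, a PIM-SM router that holds both shared-tree and source-tree state prefers the source-specific entry. The sketch below shows that lookup order; the state table contents and interface names are invented for illustration.

# Multicast state table: (source, group) keys, with "*" standing for "all sources".
# Each entry maps to its list of downstream (outgoing) interfaces.
mroute_table = {
    ("1.1.1.1", "239.1.1.1"): ["ge-0/0/2"],   # source tree built after the switchover
    ("*",       "239.1.1.1"): ["ge-0/0/3"],   # shared tree rooted at the RP
}

def lookup_oil(source: str, group: str):
    """Prefer source-specific (S,G) state; fall back to the shared (*,G) state."""
    return mroute_table.get((source, group)) or mroute_table.get(("*", group)) or []

print(lookup_oil("1.1.1.1", "239.1.1.1"))  # ['ge-0/0/2'] - the (S,G) entry wins
print(lookup_oil("2.2.2.2", "239.1.1.1"))  # ['ge-0/0/3'] - only the (*,G) entry matches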
1.1.4.2 PIM Dense-Mode
PIM also has a source tree mode where the router with receivers immediately builds a shortest-path tree back to the
source. In contrast to PIM-SM, PIM-DM uses a “push” method rather than a “pull” method[4, p. 80]. PIM-DM is
described in RFC 3973. The basic operation of PIM-DM is to flood multicast traffic throughout the network, then
“prune” back the links that do not have any active receivers. The prune is sent upstream toward the source. Another
message called a PIM Graft is used when a link needs to be re-added to the multicast tree. The Prune state is based on
a timer. When the Prune timer expires traffic will once again be transmitted down a link that was previously pruned
toward potential receivers. A router can also send a Graft message toward the source when a receiver joins an area that
was originally pruned from the source tree. PIM-DM uses (S,G) notation only, and each (S,G) pair has a timer associated
with it to maintain state and does not rely on keepalive messages [12, p. 5-6]. PIM-DM also uses the Join message only
to override a prune [12, p. 13].
[Figure: two panels, PIM-DM and PIM-SM, showing source S1, intermediate nodes 1-7 (node 4 as the RP in PIM-SM), receivers, and Prune, Join, and traffic arrows.]
Figure 1.3: PIM-DM vs PIM-SM
Figure 1.3 makes a basic comparison between PIM-DM and PIM-SM. The graphic on the left shows S1 sending out
traffic to all active receivers. Since node 6 does not have any active receivers it sends a prune message back toward S1
via node 4. Node 2 also does not have an active receiver so it sends a prune toward node 3. In contrast, with PIM-SM a
PIM Join is sent by any router that’s aware of an active receiver. The Join is sent in the opposite direction of the traffic
flow. A dash-dotted arrowed line from node 4 to 1 is a source-specific Join that the RP sends to the source once it
starts receiving the encapsulated traffic. As described in section 1.1.4.1 (Sparse Mode) eventually the traffic to each
receiver will evolve into a source based tree similar to the PIM-DM tree, where all traffic is native (unencapsulated) from
the source to the receiver, whether it goes through the RP or not. The graphic on the right only shows the initial stages
of PIM-SM.
1.1.4.3 PIM Single-Source Mode
As laid out in RFC 4607 some extra considerations are required when a receiver joins a group in the 232.0.0.0/8 range [6,
p. 4]. IGMP was expanded so it can handle source-specific messages. PIM wasn't expanded, but RFC 4601 mentions
specific semantics and rules to be applied for SSM groups that makes PIM Single-Source Mode (PIM-SSM) a subset
of PIM-SM. Mainly, it specifies that when the SSM range is used the (*,G) Join cannot be utilized and the tree must
be built using a source tree with (S,G) Joins. Also, there is no need for an RP. This means that the PIM Register
and Register-Stop processes are not used, and there is no need for the special (S,G,rpt) Prune since the source tree is
always built. Otherwise, the mechanics for building a tree in PIM-SSM are the same as PIM-SM by utilizing (S,G) Joins
directly to the source in the opposite direction of the traffic flow. The same RPF and MRIB constructs are used [10,
p. 80–81].
1.2 MPLS
Multiprotocol Label Switching (MPLS) is an IP technology that uses one or more shim headers (called labels) to forward
packets rather than the address information contained in an IP header. The shim sits between the IP header and the
payload in the packet. A network that is MPLS enabled consists of two main types of routers: Label Edge Routers (LERs)
and Label Switch Routers (LSRs). Throughout the MPLS network are Label Switched Paths (LSPs), which are unidirectional
tunnels that carry packets5 through the network. An LSP begins at an LER and passes through LSRs in the middle of the
network. The LER can create many LSPs, and it decides on which LSP to place a packet using a Forwarding Equivalence
Class (FEC). A basic example of a FEC is packets that all have the same destination IP address [13, p. 6–7]. The LER
is either an ingress router, where the LSP begins, or an egress router where the LSP ends.
A label is 4 bytes in size and consists of a 20-bit value, a 3-bit traffic-class value (commonly referred to as EXP bits), a
bottom of stack bit which has a value of one when it is the bottom (or only label) in a “stack” of labels between the
header and the payload, and an 8-bit TTL field which has the same function as an IP TTL. An MPLS router forms many
mappings of an ingress label to an egress label and an associated interface. An LER or LSR will either “push” (add
a new label), “swap” (exchange one label for another), or “pop” (remove a label). The ingress LSR will push one or
more labels onto an IP packet based on FEC information to form the LSP. The router exchanges the incoming label,
based on the mappings it already established, with the egress label and then sends the entire packet with its labels to the
next router for a similar operation, or a pop operation since it’s the last router in the LSP (the LER). This exchange
operation is called label swapping. Basically the router is selecting the interface to the next-hop based on the inner label.
There also is an additional operation called Penultimate Hop Popping (PHP) where the penultimate router will pop a
label exposing either another label or the IP header itself. The former is a common operation in Layer 3 VPNs and is
discussed in section 1.4 [13, p. 7–9].
5 An LSP can also carry Layer 2 information without an IP header, such as plain Ethernet, with a technology called Layer 2 VPNs. These
are outside the scope of this report.
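Because the label is a fixed 32-bit layout, packing and unpacking one is simple bit manipulation. The sketch below follows the field sizes given above; the helper names are our own.

def encode_label(label: int, tc: int, bottom_of_stack: bool, ttl: int) -> int:
    """Pack a 20-bit label, 3-bit traffic class (EXP), the S bit, and an 8-bit TTL into 32 bits."""
    return ((label & 0xFFFFF) << 12) | ((tc & 0x7) << 9) | ((1 if bottom_of_stack else 0) << 8) | (ttl & 0xFF)

def decode_label(entry: int) -> dict:
    """Unpack a 32-bit MPLS label stack entry into its four fields."""
    return {
        "label": entry >> 12,
        "tc": (entry >> 9) & 0x7,
        "bottom_of_stack": bool((entry >> 8) & 0x1),
        "ttl": entry & 0xFF,
    }

entry = encode_label(label=50, tc=0, bottom_of_stack=True, ttl=255)
print(hex(entry), decode_label(entry))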
[Figure: nodes 1 through 7, with nodes 1 and 7 as LERs and node 5 as an LSR, connected by a single unidirectional LSP.]
Figure 1.4: MPLS LSPs
The line in figure 1.4 represents a unidirectional LSP. Its origin is at the LER, transits an LSR, and terminates at
another LER. In one of many scenarios the LER, node 1, will have pushed a label onto the IP packet, node 5 will do a
swap operation, and it knows to send that packet through the interface that connects it to node 7 based on the label it
gets from node 1.
Each MPLS router contains a database of labels which need to be populated. These are done by MPLS signaling
protocols. The following sections will discuss the two main signaling protocols, LDP and RSVP-TE, as well as their
additional mechanisms for Point-to-MultiPoint (P2MP). P2MP forwarding has a single ingress router with multiple egress
routers for the same LSP. A router in the middle will copy the traffic and send it out two or more interfaces with a
separate label for each interface. A router that does replication is also referred to as a branch node. Downstream from a
replication point, each copy of the traffic follows its own branch of the LSP. As with regular LSPs, the P2MP LSP is unidirectional [13, p. 165–166].
[Figure: the same topology as figure 1.4, with node 5 acting as a branch node replicating the LSP toward nodes 3 and 7.]
Figure 1.5: Point-to-Multipoint MPLS LSPs
Compare figure 1.5 to figure 1.4. Figure 1.5 has node 5 as a branch node which replicates the traffic to both node 7 and
node 3. In this case, node 7 is a branch node while node 3 is a branch node and a transit node.
1.2.1 MPLS Signaling
An association between an IP subnet and a label is called a label binding. A signaling protocol is required to build and
distribute these bindings. To accomplish this the engineering community created a new protocol called Label Distribution
Protocol (LDP) and also extended an existing protocol called Resource Reservation Protocol (RSVP). RSVP was
extended to become RSVP Traffic Engineering (RSVP-TE) [13, p. 11]. BGP was also extended to distribute labels. This
will be covered more in section 1.4.3.1.
1.2.1.1 LDP
LDP was defined in RFC 5036, which obsoletes RFC 3036, as a specific protocol for handling labels in MPLS networks.
LDP uses message exchanges between directly connected peers or through targeted sessions that span multiple hops.
In either case, the peer that exchanges messages is an LDP neighbor. These messages are used for session setup and
information exchange. Once a session is setup the neighbors exchange label binding information between the labels and
FECs (e.g. IP subnet). LDP has a fundamental rule that the LSP it is creating will always follow the shortest path
of the Interior Gateway Protocol (IGP) such as IS-IS or OSPF. LDP relies on the IGP to determine the shortest path
throughout a network based on its routing metrics. LDP distributes its labels from egress to ingress. The egress router
will advertise a label {L1} for a given FEC to its upstream neighbor. The upstream neighbor will decide, based on the
IGP shortest path, if it should use L1 to forward downstream to that FEC on the egress router. If this check passes, the
upstream neighbor will use that label to forward traffic to the egress router that initiated it. The upstream neighbor will
then apply label L2 for that FEC, and advertise that label to its upstream neighbors. This process continues with all
routers throughout the network [13, p. 12–13].
An LSP creation in LDP is demonstrated simply in figure 1.6 where node 7 advertises label {100} back toward node 5 for
a given FEC. Node 5 installs this label in its forwarding table (assuming it’s the shortest path based on the IGP) then
advertises label {50} back to node 1 which also installs the label. For an LSP, the ingress router will now push label {50}
and forward the packet to node 5 which swaps {50} for {100}, then forwards it on to node 7 where the label is finally
popped. The LSP now consists of labels {50} and {100}.
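The per-router result of this signaling is simply a label forwarding table. The sketch below mimics node 5's swap behavior from figure 1.6; the table structure and interface names are illustrative.

# Node 5's label forwarding state after the LDP exchange in figure 1.6:
# incoming label -> (operation, outgoing label, outgoing interface)
lfib_node5 = {
    50: ("swap", 100, "to-node-7"),
}

def forward(incoming_label: int) -> str:
    """Swap the incoming label per the table and report where the packet goes next."""
    op, out_label, out_if = lfib_node5[incoming_label]
    return f"{op} {incoming_label} -> {out_label}, send out {out_if}"

print(forward(50))   # swap 50 -> 100, send out to-node-7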
[Figure: node 1 pushes label {50}; node 5 swaps {50} for {100}; node 7 pops {100}.]
Figure 1.6: LDP Signaling
RFC 6388 describes the extensions for multicast LDP (mLDP). The LDP message has an extension added so that a
label can be associated with a “P2MP FEC” value, which is the combination of the source address of the tree and a
unique identifier. A router must be able to understand mLDP labels and the capability is advertised during LDP neighbor
initialization. Using the P2MP FEC an mLDP enabled router can associate the labels as part of the same tree [14,
p. 6–11]. As a result when the mLDP router receives two labels that contain the same P2MP FEC it knows to only
advertise one label upstream toward the source. The procedure for advertising a label is slightly different from regular
LDP. In regular LDP a router will only use the label for forwarding that matches the IGP best path. In the case of mLDP,
the router will only advertise a label that follows the IGP best path toward the source [13, p. 173–174]. In essence,
mLDP is doing its own RPF check in order to advertise a label. Figure 1.7 illustrates two labels, {100} and {200} that
are being advertised up the shortest path toward source A. A new P2MP FEC is used which consists of source A and the
unique identifier of 1 (this is just an arbitrarily picked value). Since both labels belong to the same P2MP FEC the mLDP
router, node 5, advertises only a single label back toward the source. Node 5, when receiving label {50} will replicate the
traffic toward nodes 7 and 3 using labels {100} and {200} respectively.
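At the branch node the only real difference from the point-to-point case is that one incoming label maps to several outgoing label/interface pairs. The sketch below models node 5's replication state from figure 1.7; the structure and names are illustrative.

# Node 5's P2MP forwarding state from figure 1.7: one upstream label, two downstream copies.
p2mp_fib_node5 = {
    50: [(100, "to-node-7"), (200, "to-node-3")],
}

def replicate(incoming_label: int):
    """Return one (outgoing label, interface) action per downstream branch."""
    return [f"swap {incoming_label} -> {label}, send out {iface}"
            for label, iface in p2mp_fib_node5[incoming_label]]

for copy in replicate(50):
    print(copy)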
[Figure: source A behind node 1; node 5 advertises label {50} upstream for P2MP FEC (A, 1) and replicates traffic downstream using label {100} toward node 7 and label {200} toward node 3.]
Figure 1.7: Multicast LDP Signaling
1.2.1.2 RSVP-TE
Resource Reservation Protocol (RSVP) was originally created with Quality of Service (QoS) in mind. It had mechanisms
that allowed for reserving bandwidth in a network for a specific flow. Scalability concerns doomed it from ever becoming
widespread but the mechanisms for bandwidth reservation proved useful in MPLS networks and it evolved into RSVP
Traffic Engineering (RSVP-TE), and was originally defined in RFC 3209. RSVP-TE is di↵erent from LDP in that it
doesn’t necessarily follow the best path provided by an IGP and therefore doesn’t rely on the IGP for shortest path
information. Also, the LSP is set up from the ingress router, also called the headend router. The ingress router sends a
Path Message toward the egress router, which is defined by an IP address (such as a loopback interface) on the egress
router. Once the Path Message makes it to the egress router it responds with an Resv Message (“reserve message”)
back toward the initiating ingress router. The Resv Message is only addressed to the next-hop back toward the ingress,
and each subsequent Resv Message along the path is also one hop. This is because each Resv Message contains a label
along with bandwidth reservation information. The path that the ingress router sets can be dynamic, which utilizes a
traffic engineering database, or statically configured6 [13, p. 21–27].
6 RSVP-TE allows for more than just label reservation as it also has traffic engineering capabilities as well as Fast Reroute capabilities
allowing for SONET-like failover times in a packet switched network. The mechanics for setup of RSVP-TE such as path computation are
outside the scope of this report.
[Figure: node 1 sends a Path Message toward node 7; node 7 returns a Resv Message with label {100} to node 5, and node 5 returns a Resv Message with label {50} to node 1; node 1 pushes {50}, node 5 swaps {50} for {100}, node 7 pops {100}.]
Figure 1.8: RSVP-TE Signaling
Figure 1.8 shows how RSVP-TE accomplishes the same task of building an LSP from node 1 to node 7 but
with a different method. Node 1 initiates the LSP by sending a Path Message toward node 7 using an IP address for
node 7. Once node 7 receives the path message it responds with a Resv Message to node 5, its upstream router back
toward node 1. The Resv Message toward node 5 contains the label {100} and also traffic reservation information (not
shown). Node 5 then repeats this process to node 1, advertising label {50}. At this point node 1 will push 50 onto a
packet then forward it to node 5, where label {50} is swapped for {100} and sent to node 7.
The mechanisms for P2MP RSVP-TE are mostly the same as regular RSVP-TE. The P2MP version uses the same Path
and Resv Messages to set up the path, and each egress LER gets its own sub-LSP [13, p. 167–169]. A new identifier
called a P2MP SESSION Object, defined in RFC 4875, is used to relate the multiple sub-LSPs together so that the
router knows that they are the part of the same P2MP LSP. The session object contains three fields: P2MP ID, a
Tunnel ID, and an Extended Tunnel ID. In the P2MP SESSION Object the P2MP ID is the IP address of the destination
LSR. The Tunnel ID is a unique 16-bit number, and the Extended Tunnel ID is either blank or the IP address of the
ingress LSR [15, p. 5].
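Conceptually, the P2MP SESSION Object is a three-field key shared by all sub-LSPs of one P2MP LSP. The sketch below groups sub-LSPs under such a key; the field values are invented for illustration.

from dataclasses import dataclass

@dataclass(frozen=True)
class P2MPSession:
    p2mp_id: str              # identifier that ties the sub-LSPs of one P2MP LSP together
    tunnel_id: int            # unique 16-bit tunnel identifier
    extended_tunnel_id: str   # either blank or the ingress LSR's address

# Two sub-LSPs (one per egress LER) that carry the same SESSION object and therefore
# belong to the same P2MP LSP.
session = P2MPSession(p2mp_id="p2mp-1", tunnel_id=1, extended_tunnel_id="10.0.0.1")
sub_lsps = {session: ["sub-LSP toward node 7", "sub-LSP toward node 3"]}
print(len(sub_lsps[session]), "sub-LSPs tied to one P2MP SESSION Object")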
[Figure: the topology of figure 1.8 with two Path/Resv exchanges from node 1 setting up sub-LSPs toward nodes 7 and 3, using labels {50}, {100}, and {200}.]
Figure 1.9: Multicast RSVP-TE Signaling
Figure 1.9 is very similar to figure 1.8 except that two separate Path and Resv Messages are used resulting in label {50}
being advertised twice, one for each sub-LSP. Recall that for a P2MP LSP there is a P2MP SESSION Object that “ties”
the two sub-LSPs together.
1.3 BGP
Border Gateway Protocol (BGP) was originally created to be a new Exterior Gateway Protocol (EGP) for IP networks.
BGP was originally conceived during the 12th meeting of the IETF in 1989 and eventually evolved into RFC 1771, later
obsoleted by RFC 4271. BGP creates loop free topologies between and through various autonomous systems using a path
vector methodology that analyzes the path a route takes through networks rather than simply using the lowest cost path like an IGP [16,
p. 1–9]. The usefulness of BGP isn’t limited to just its scalability, especially as it pertains to multicast VPNs. The
construction of BGP allows it to be extensible. This versatility was leveraged to support additional protocols and gave
the foundation for services such as multicast VPNs which exchange information beyond IPv4.
1.3.1 UPDATE Message
BGP consists of OPEN, NOTIFICATION, KEEPALIVE, and UPDATE Messages for setup and session control. However
the UPDATE message will be the focus of this report as it is the message that carries, with some modifications discussed
in section 1.3.2, the multicast information needed in multicast VPNs. An UPDATE message is used to exchange feasible
IPv4 prefixes, or to withdraw them, between BGP speakers (BGP-enabled routers). The UPDATE message contains,
among a few other things, a field for withdrawn prefixes, Path Attributes, and a field for Network Layer Reachability
Information (NLRI) which carries the feasible prefixes that a BGP speaker knows about.
Below the encoding of the UPDATE message is shown.
+-----------------------------------------------------+
| Withdrawn Routes Length (2 octets)                  |
+-----------------------------------------------------+
| Withdrawn Routes (variable)                         |
+-----------------------------------------------------+
| Total Path Attribute Length (2 octets)              |
+-----------------------------------------------------+
| Path Attributes (variable)                          |
+-----------------------------------------------------+
| Network Layer Reachability Information (variable)   |
+-----------------------------------------------------+
Within an UPDATE Message there are several Path Attributes defined, only one of which will be discussed in detail in
this report (NEXT HOP). BGP uses Path Attributes to add information to a set of prefixes that a BGP speaker can use
to manage and control how the prefixes are added to its Route Information Base (RIB) and the global routing table.
Certain attributes can also be used in policies for greater administrative control over how the prefix is stored or sent to
other routers. The NEXT HOP attribute contains an IPv4 unicast address that is used as the next-hop for the prefixes
contained in the NLRI field and represents the router that either has these prefixes directly connected or knows how to
reach them. A BGP speaker MUST be able to process the NEXT HOP Path Attribute7.
The NLRI field in the original BGP implementation is fairly straightforward as it contains a list of IP address prefix and
their lengths (subnet size). The number of prefixes contained in an UPDATE message is variable. An UPDATE message
can contain only one set of Path Attributes. If only one IP prefix pertains to that set, then there will only be one prefix
7 BGP defines characteristics for Path Attributes as follows: Well Known Mandatory, Well Known Discretionary, Optional Transitive, and
Optional Non-Transitive. The NEXT HOP Path Attribute is Mandatory Well Known and must be handled by the BGP speaker. Optional
Transitive on the other hand does not need to be handled by the BGP speaker and can be forwarded to another BGP speaker. For more
details refer to RFC 4271 section 5.
contained in the NLRI [17, p. 14–21]. Prefixes matching another set of Path Attributes need to be sent in a separate
UPDATE message [16, p. 13].
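To make the legacy NLRI layout concrete, the short Python sketch below (illustrative only; the prefixes and the encode_nlri helper are invented for this report) packs each feasible prefix as a one-octet length in bits followed by only the significant octets of the prefix, which is the encoding described above.

import ipaddress

def encode_nlri(prefixes):
    # Pack each prefix as <length in bits><minimal prefix octets>, per the legacy NLRI encoding.
    out = bytearray()
    for p in prefixes:
        net = ipaddress.ip_network(p)
        nbytes = (net.prefixlen + 7) // 8      # only the significant octets are sent
        out.append(net.prefixlen)              # one-octet length field, in bits
        out += net.network_address.packed[:nbytes]
    return bytes(out)

print(encode_nlri(["10.1.0.0/16", "192.0.2.0/24"]).hex())   # '100a0118c00002'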
1.3.2 Multiprotocol BGP
Originally BGP was created with IPv4 addressing in mind [16, p. 35]. In order to carry more than just IPv4 information
Multiprotocol BGP (MP-BGP) was defined in RFC 2858, and was later obsoleted by RFC 4760. To extend the capabilities
of what BGP can carry two new Path Attributes were created, called MP REACH NLRI and MP UNREACH NLRI.
Unlike, for example the NEXT HOP Path Attribute, these two new Path Attributes are not required to be processed by
the router. Therefore if the router does not understand or support the new Path Attributes the router can simply ignore
them8 . MP UNREACH NLRI functions similarly to the field for withdrawn prefixes in the UPDATE message. If anything
other than IPv4 needs to be sent by a BGP speaker it uses the MP REACH NLRI Path Attribute. It has a similar
role to the legacy NLRI but it has been extended to identify other protocols as well as carry their information. The
MP REACH NLRI also contains its own Next Hop field. The NLRI is encoded depending on the protocol being carried. To
identify what protocol is being carried MP-BGP defines an Address Family Identifier (AFI) and Subsequent Address Family
Identifier (SAFI). The formatting of the Next Hop is also dependent on the AFI and SAFI of the MP REACH NLRI Path Attribute.
+---------------------------------------------------+
| Address Family Identifier (2 octets)              |
+---------------------------------------------------+
| Subsequent Address Family Identifier (1 octet)    |
+---------------------------------------------------+
| Length of Next Hop Network Address (1 octet)      |
+---------------------------------------------------+
| Network Address of Next Hop (variable)            |
+---------------------------------------------------+
| Reserved (1 octet)                                |
+---------------------------------------------------+
| Network Layer Reachability Information (variable) |
+---------------------------------------------------+
Above the encoding of the MP REACH NLRI Path Attribute is shown, which is a part of the UPDATE Message
encoding shown on page 13. Note that the MP REACH NLRI Path Attribute has its own Next Hop and NLRI fields, the
structures of which are determined by the AFI and SAFI combination[18, p. 1–5]. As it will be seen in this chapter and
the following chapters the MP-BGP MP REACH NLRI and MP UNREACH NLRI Path Attributes will be used to enable
extensions to unicast routing and multicast routing by reserving their own AFI and SAFI numbers and creating unique
NLRI encodings for each extension.
Sometimes a route will be described as carrying certain attributes. This is just another way of describing an UPDATE
Message that has a certain set of attributes associated with a particular route or set of routes.
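As a rough illustration of the framing just described, the following Python sketch (names and values invented; not a complete BGP implementation) lays out the MP REACH NLRI attribute value: AFI, SAFI, next-hop length, next hop, the reserved octet, and then whatever NLRI encoding the AFI/SAFI pair calls for.

import struct

def mp_reach_nlri(afi, safi, next_hop: bytes, nlri: bytes) -> bytes:
    # AFI (2 octets), SAFI (1), next-hop length (1), next hop, reserved octet, NLRI
    return struct.pack("!HBB", afi, safi, len(next_hop)) + next_hop + b"\x00" + nlri

# Plain IPv4 unicast (AFI 1, SAFI 1) with a 4-octet next hop and one /24 prefix.
value = mp_reach_nlri(1, 1, bytes([192, 0, 2, 1]), bytes([24, 198, 51, 100]))
print(value.hex())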
1.4 BGP/MPLS Virtual Private Networks
BGP/MPLS Virtual Private Networks or BGP/MPLS VPNs, also known as Layer 3 VPNs (L3VPNs), create a method
for service providers (SPs) to provide IP VPN services to their customers. The method was originally described in RFC
2547 (revised in the draft commonly known as 2547bis), which was obsoleted by RFC 4364. As we will see later in this report, BGP/MPLS VPNs are very important
components for multicast VPNs, since multicast VPNs borrow the mechanisms that are defined in RFC 4364. As the name implies,
BGP/MPLS VPNs utilize the concepts of the previous two sections of this report.
8 MP REACH NLRI and MP UNREACH NLRI are optional non-transitive, meaning that a router that does not recognize them can quietly ignore them
and must not pass them on to other BGP speakers.
The major components of BGP/MPLS VPNs that will be discussed are as follows: Network topology and terminology,
virtual routing and forwarding tables, BGP addressing and advertisement, and forwarding.
1.4.1 Network Topology and Terminology
BGP/MPLS VPNs come with their own set of terms describing network components. In the world of VPNs the network
is broken up into Customer Edge (CE) routers, Provider Edge (PE) routers, and Provider (P) routers. The P routers sit
in the core of the SP network and in the path of the VPN there can be one or more of them (and in some rare cases
none). As the name implies the PE routers sit at the edge of the SP network and connect to one or more CE routers
that sit at the customer’s location. The connection between the PE and CE routers is called an attachment circuit (AC).
Figure 1.10 shows an example topology. Nodes 1, 2, 6 and 7 are the PE routers, each with a CE router attached to it.
The red CE routers belong to one customer, CE1 being at site 1 and CE2 being at site 2 for that particular customer.
The same applies to the blue CE routers, which belong to a different customer [19, p. 5–9]. Two separate customers can
also connect to the same PE and remain isolated. Virtual Routing and Forwarding Tables make customer separation
within a router possible.
[Figure: the SP network with P routers 3, 4, and 5 in the core, PE routers 1, 2, 6, and 7 at the edge, and customer CE routers A1, B1, C, and D1 on one side and A2, D2, and B2 on the other.]
Figure 1.10: Service Provider Network with Customer Sites
1.4.2 Virtual Routing and Forwarding Tables
In a PE model the PE is responsible for keeping the routing information separate between customers. A Virtual Routing
and Forwarding Table (VRF) is used to accomplish this. The VRF is a routing table that is kept separate from the main
routing table, which will be referred to as the global routing table in this report, and other VRFs on the same PE. The PE
router also maintains independent forwarding information for each VRF. In essence a VRF behaves like a router within a
router using the same mechanisms to learn prefixes and forward traffic over a network. The AC between a CE and a PE
is associated with a specific VRF for only that customer. The PE router learns prefixes from the CE by using any IGP
or BGP, and static routes can also be configured within a specific VRF9 . The PE router maintains these prefixes in a
separate logical table that indicates which interface to use for the prefixes learned from the CE [19, p. 9–12].
9 See section 7 from RFC 4364 for more details
1.4.3 BGP Addressing and Advertisement
The purpose of a BGP/MPLS VPN is to connect remote customer sites over a Service Provider network. Figure 1.10
shows two customers, each with two sites, on opposite sides of the network. The previous section mentioned that a
CE will exchange prefixes with a PE and the prefixes will be placed in a particular VRF. BGP has been updated using
multiprotocol extensions discussed in section 1.3.2 so that the prefixes in one PE can be sent to another PE on
the other side of the network.
1.4.3.1 VPNv4 Address Family
RFC 4364 introduces the VPN IPv4 (VPNv4) Address Family in section 4.1.
Route Distinguisher The key part of the VPN-v4 Address Family is an 8-byte Route Distinguisher (RD) that is
prepended to an IPv4 Address. The purpose of the RD is not to convey any additional information about a subnet, but
to make any address unique when it is in the domain of the service provider network [19, p. 12–13]. The RD has two
formats defined by a Type field, either 0 or 1. In addition to the two byte Type field are the Administrator and Assigned
Number subfields, both of which add up to six bytes. The first variation is when the type field is 0, which means the
Administrator Subfield is 2 bytes and the Assigned Number subfield is 4 bytes. In this case the Administrator Subfield is
an Autonomous System Number (ASN) assigned by the IANA to a Service Provider. The Assigned Number is assigned
by the Service Provider and is an arbitrary number. The second variation
is when the Type field is 1, which means the Administrator subfield is four bytes and the Assigned Number is two bytes. In
this case the Administrator field is an IPv4 IP address, and is recommended to be a public IP address. The Assigned
Number is assigned by the Service Provider to which the IPv4 address is assigned [20, p. 116–117]. Because of the RD
and the VRF route table isolation, customers can advertise the same address space over the service provider network,
including RFC 1918 private IP addresses (e.g. 192.168.0.0) which are not allowed to be advertised over the Internet [19,
p. 12–13]. An example of a Type 0 RD is 65000:100 [21, p. 435]. Type 0 RDs (and Route Targets discussed next) will
be the convention used throughout this report.
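As a worked illustration of the two layouts (a sketch only; the helper names are invented, and the values follow the 65000:100 example above plus a hypothetical Type 1 RD), the eight bytes can be packed as follows:

import socket, struct

def rd_type0(asn: int, assigned: int) -> bytes:
    # Type 0: 2-octet Type, 2-octet ASN administrator, 4-octet assigned number
    return struct.pack("!HHI", 0, asn, assigned)

def rd_type1(ipv4: str, assigned: int) -> bytes:
    # Type 1: 2-octet Type, 4-octet IPv4 administrator, 2-octet assigned number
    return struct.pack("!H", 1) + socket.inet_aton(ipv4) + struct.pack("!H", assigned)

print(rd_type0(65000, 100).hex())      # '0000fde800000064' for RD 65000:100
print(rd_type1("192.0.2.1", 7).hex())  # hypothetical Type 1 RD 192.0.2.1:7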
Route Target Although VRFs keep the routing information separate for different CEs on the PE router the same BGP
session is used to forward the prefixes to other BGP speakers/PEs throughout the Service Provider network. The prefixes
in the VRF are converted to VPNv4 prefixes when they are exported from the VRF to the PE BGP table. BGP will
then use its knowledge of the network to distribute the route to the other PEs that need to know about it. The far
end PE will then import the VPNv4 addresses into the VRF associated with the same customer as the VRF on the
advertising PE. To control which VRF is allowed to import which prefixes, a new Path Attribute is created called a Route
Target (RT) [19, p. 15–16]. The Route Target uses the same structure as the RD, however it is not prepended to an
IPv4 address. The RT is actually defined in RFC 4360 which defines several Extended Communities for use in BGP, and
mentions BGP/MPLS VPNs as a possible use for RTs. The RT is a specific form of the Extended Community BGP
Path Attribute which is an eight byte value. Like the RD, a Type field defines whether or not an ASN or IPv4 address is
used as the Administrator Field, and the Assigned Number field is an arbitrary number assigned by the Service Provider
to which the ASN or IP address is assigned [22, p. 2–6]. The RT acts as an identifier for a prefix advertised via BGP. As
the prefix is exported from the VRF to the BGP table an export RT is configured for that VRF. When BGP sends an
UPDATE Message it eventually makes it to a PE that is connected to the same customer. This PE has an import RT
configured for the VRF. For the prefixes to be imported to the VRF the RT must match the value that was set on the
other PE that exported the prefixes into BGP. A VRF must have at least one export RT and at least one import RT, but
the export and import values do not need to be the same within the same VRF.
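To make the RT's on-the-wire form concrete, here is a minimal sketch (illustrative; the helper is invented) of the two-octet-AS-specific Extended Community encoding from RFC 4360 for the 789:2 value used in this report: a type octet, the Route Target sub-type 0x02, the ASN, and the assigned number.

import struct

def route_target_as2(asn: int, assigned: int) -> bytes:
    # Transitive two-octet-AS-specific Extended Community, Route Target sub-type (0x02)
    return struct.pack("!BBHI", 0x00, 0x02, asn, assigned)

print(route_target_as2(789, 2).hex())   # '0002031500000002' for RT 789:2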
[Figure: a single PE with VRF B (RD 789:201, Export RT 789:2, Import RT 789:2) attached to CE B1, VRF D attached to CE D1, and CE C attached to the global table.]
Figure 1.11: VRFs and Attachment Circuits
Figure 1.11 shows two customers, B1 and D1, each at their own site, connected to a PE. Both customers have an
attachment circuit that is associated with a single VRF. There is a third customer that connects to the global routing
table. In this report attachment circuit will only refer to interfaces (physical and logical) that are associated with a
VRF even though the same transport technology (such as frame relay, SONET, or Ethernet VLAN) is used to connect
all the customers to the same router. Also, more than one CE can connect to the same VRF, either using separate
physical interfaces or the same physical interface and multiple logical interfaces such as VLAN subinterfaces. Furthermore
two separate logical interfaces can be in separate VRFs even if there is only one physical circuit. In any case, a VRF
is associated with only a certain set of prefixes that come from the customer via an IGP, external BGP, or statically
configured in the VRF on the PE, and these prefixes remain separate from the global table and any other VRF on the
same PE. VRF B, the VRF associated with customer B, also has an RD of 789:201 and an RT of 789:2. It exports and
imports the same RT, so it will accept prefixes from any VRF exporting 789:2 and any VRF importing 789:2 will accept
prefixes from Customer B site 1. Any prefixes within VRF B at site 1 will be prepended with 789:201 when being sent via
BGP to other PEs. The RT and RD values are assigned by the SP. Note that site 1 is configured with RD 789:201,
while site 2 can be configured with 789:202 as shown in figure 1.13 on page 19. Each VRF can have its own RD value.
The 789 portion is the AS number of the SP.
MPLS/BGP VPNv4 NLRI Formatting While various protocols may be used to connect the PE and the CE, the
PE-PE communication is carried by BGP. Each PE is a BGP speaker and forms a BGP session with the other PEs
with the capability of advertising VPNv4 addresses within the BGP UPDATE message. For VPNv4 an AFI of 1 is used
(IPv4) and a SAFI of 128, which signifies it’s a labeled VPNv4 NLRI. Recall from section 1.3.2 that this information
is contained within the MP REACH NLRI Path Attribute of the BGP update. The structure of the NLRI field within
the MP REACH NLRI Path Attribute is defined in RFC 3107 [19, p. 22] as follows: a length field, a label field, and an
address field [23, p. 3]. The address field in the VPNv4 NLRI is a combination of the RD and IPv4 address from a VRF
[19, p. 22]. The VPNv4 message also contains a Next Hop field which contains the address of the PE that is advertising
and an RD of 0:0. The Next Hop is formatted this way because MP-BGP requires that the address format of the Next
Hop is the same as the format of the prefixes in the NLRI. This Next Hop is also referred to as the BGP Next Hop. [19,
p. 17].
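A hedged sketch of that NLRI layout follows (illustrative only; the one-octet bit-length, the 3-octet label field with the bottom-of-stack bit, and the helper name are assumptions following RFC 3107 conventions, and the values reuse the 789:202 and 10.2.2.0/24 example used later in this chapter):

import ipaddress, struct

def vpnv4_nlri(label: int, rd: bytes, prefix: str) -> bytes:
    net = ipaddress.ip_network(prefix)
    nbytes = (net.prefixlen + 7) // 8
    label_field = struct.pack("!I", (label << 4) | 0x1)[1:]   # 20-bit label, bottom-of-stack bit set
    length_bits = 24 + 64 + net.prefixlen                     # label + RD + prefix, in bits
    return bytes([length_bits]) + label_field + rd + net.network_address.packed[:nbytes]

rd = struct.pack("!HHI", 0, 789, 202)                         # Type 0 RD 789:202
print(vpnv4_nlri(123, rd, "10.2.2.0/24").hex())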
An UPDATE Message sent between two PEs using the VPNv4 Address Family is summarized in Figure 1.12
on the following page.
[Figure: a BGP UPDATE carrying the Withdrawn Routes Length, Withdrawn Routes, and Total Number of Path Attributes fields; the NEXT-HOP and other legacy Path Attributes; an MP_REACH_NLRI Path Attribute with AFI 1, SAFI 128, a Next Hop, and an NLRI made up of Length, Label, and RD fields; an Extended Community Path Attribute with Flags and the Route Target value; and the legacy Network Layer Reachability Information field.]
Figure 1.12: MP-BGP VPNv4 BGP UPDATE Message Example
The extensibility of the BGP protocol, and the concept that allows MP-BGP to exist, is the Path Attribute. In the above
figure a VPNv4 BGP UPDATE Message is shown, showing how the Path Attributes and their respective fields are nested
within the UPDATE message. The values in the AFI, SAFI, the fields in the NLRI field of the MP REACH NLRI, and
the presence of the Extended Community of Route Target type Path Attribute are unique to the VPNv4 message. If the
SAFI number were different the NLRI field of the MP REACH NLRI Path Attribute may be formatted differently, and
the Route Target Extended Community may not be there at all, replaced with a different Extended Community.
1.4.4 Forwarding
The forwarding used for MPLS/BGP VPNs is MPLS using a combination of the label sent using BGP and another label
using LDP or RSVP-TE. The label carried in BGP is referred to as the VPN Label or the BGP Label and the one learned
by LDP or RSVP-TE is the IGP Label or the Tunnel Label since this label is used to tunnel the VPN traffic through
the Service Provider network. The IGP Label is associated with the IP address that the PE used to advertise the BGP
message, and can be referred to as the IGP Next Hop [19, p. 23–24]. The BGP Next Hop and the IGP Next Hop are
typically the same IP address and assigned using the address on a loopback interface [24, p. 115]. The IGP Next Hop is
advertised throughout the network using an IGP, and a label is associated with it and advertised hop-by-hop using the
MPLS mechanisms discussed in section 1.2.
[Figure: PE 7 (loopback 1.1.1.7/32, RD 789:202, RT export/import 789:2) sends an MP-BGP message with VPNv4 address 789:202:10.2.2.0/24, BGP Next Hop 1.1.1.7/32, label {123}, and Route Target 789:2 toward PE 2 (RT export/import 789:2, attached to CE B1), while LDP advertises labels {Imp-Null}, {200}, and {100} hop by hop for 1.1.1.1/32; CE B2 with 10.2.2.0/24 attaches to PE 7.]
Figure 1.13: VPN Label Advertisements
Figure 1.13 provides a summary for the BGP VPN advertisement. It shows labels being advertised hop-by-hop by LDP
for the loopback address 1.1.1.1/32. Node 2 is advertising an Implicit Null label which tells the upstream router to pop
the top label rather than leave it on. This signals a penultimate hop pop. A label of {123} is also being advertised by a
BGP UPDATE message which also contains the VPNv4 prefix 789:202:10.2.2.0/24. The 789:202 is the RD of Customer
B Site 2. RT information is also included as 789:2, and the VRFs at both sites for Customer B are configured to import
and export that RT. These two labels combine to form a label stack. The BGP Label is the inner label of the stack and
is therefore sometimes referred to as the “Inner Label.” The IGP Label is on top of the stack and is used to forward the
packet through the Service Provider network. As the packet traverses through the network each subsequent hop swaps
the top IGP label while the inner BGP label remains the same. Once the packet reaches the far-end PE the IGP label is
popped (or it is popped at the penultimate hop using PHP). The PE is then able to use the BGP Label to forward the
packet to the correct CE router using a standard label lookup and forwarding process [25, p. 204–206].
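The following small Python simulation (entirely illustrative; the node names, label tables, and values are invented, loosely following the {300}/{200}/{123} labels of figures 1.13 and 1.14) walks a two-label packet through the swap, penultimate-hop pop, and final VPN-label lookup just described:

packet = {"labels": [300, 123], "payload": "IP packet toward 10.2.2.1"}   # [IGP label, VPN label]

lfib = {
    "P_a": {300: ("swap", 200)},   # P router swaps the top (IGP) label
    "P_b": {200: ("pop", None)},   # penultimate hop pops it, exposing the VPN label
}

def forward(node, pkt):
    action, out_label = lfib[node][pkt["labels"][0]]
    if action == "swap":
        pkt["labels"][0] = out_label
    elif action == "pop":
        pkt["labels"].pop(0)
    return pkt

for node in ("P_a", "P_b"):
    packet = forward(node, packet)
    print(node, packet["labels"])

# Egress PE: the remaining label {123} is the VPN (BGP) label and selects the VRF/CE.
vpn_label_table = {123: "VRF B -> CE B2"}
print("egress PE:", vpn_label_table[packet["labels"][0]])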
[Figure: the two-label stack changing hop by hop between the PEs: {300}{123} on the IP packet, the top label swapped to {200}, then the IGP label popped so that only {123} remains when the packet reaches the far-end PE (loopback 1.1.1.7/32), which forwards it toward 10.2.2.0/24 at B2.]
Figure 1.14: VPN Forwarding
Figure 1.14 is another look at figure 1.13 showing the label stack and how it changes hop by hop between the two PEs.
Since the Imp-Null label was advertised by node 2 to node 3 the IGP Label is popped.
1.4.5 Inter-AS Considerations
In some situations, depending on the operator, a VPN may extend beyond a single AS. This section briefly describes
the terminology and options that support this scenario. In each case, eBGP is used to communicate between the two
networks.
Option A: Back-to-Back VRFs In this option an Autonomous System Border Router (ASBR) has a single interface to
the ASBR in the other network. The interface has multiple subinterfaces, at least one per VRF, that is used to
exchange routes for that VPN/VRF.
Option B: Labeled VPNv4 Routes In this method an ASBR will receive VPNv4 routes using iBGP, and will then
exchange them to another ASBR in another network using eBGP. That ASBR will distribute the labeled VPNv4
routes within the network to another ASBR in another network. This option should only be used between trusted
networks. An LSP is required end-to-end over both networks, and Route Targets must be agreed upon.
Option C: Multihop eBGP for VPNv4 For this scenario two separate networks exchange /32 host addresses representing the BGP process for a router. The PE routers in the different networks create a multi-hop eBGP session
(default for eBGP is only 1 hop as the default TTL for a BGP message is set to 1) to exchange the VPNv4 routes.
This requires three labels in a stack. The bottom label is the one found in the VPNv4 update. The middle label is
the one bound to the /32 host address for the edge PE. The top label is bound to the /32 address of the ASBR.
This way from the perspective of a packet from a particular PE, it uses the top label to get to the edge of the
network, the label is then popped and the packet is forwarded to the other network where the middle label (now
the top label of a two label stack) is used to reach the other PE, then the bottom label is used for the specific
VRF as in normal BGP/MPLS operation.
1.4.6 BGP/MPLS VPN Summary
The important takeaways for BGP/MPLS VPN, as it relates to Multicast VPNs, are that each PE has one or more VRFs,
and that the VRFs on all the PEs in the SP network are linked by their Route Targets which determine which VRF can
accept which routes. The Route Target is configured for a VRF, and is carried in the Route Target Extended Community
of the BGP UPDATE Message. In a simple case all the VRFs for a single customer use the same Route Target. Also,
each VRF can be uniquely identified by its Route Distinguisher. As will be seen in the next chapters a single IPv4 address
configured on the PE, usually on a loopback interface, should be used as the BGP Next Hop. The same IPv4 address
can be used by extensions to other protocols and the Multicast VPN mechanisms can then map messages within those
protocols to messages within BGP and a specific VRF/VPN.
1.5 Generic Routing Encapsulation
Generic Routing Encapsulation (GRE) is defined in RFC 2784 as an attempt to create a generic description of how to
create tunnels transport IPv4 packets using another IPv4 header. The encapsulation is described as a payload packet
being encapsulated by a GRE packet. This GRE packet is then encapsulated by another protocol and is referred to as the
delivery packet. The defined values for the delivery protocol and the payload packet are both IP, therefore GRE currently
describes a method for IP-in-IP encapsulation [26, p. 1–5]. This technology is important in Draft Rosen VPNs. GRE
should not be confused with IP-in-IP encapsulation defined in RFC 1853.
1.6 Control Plane vs Forwarding Plane
An important concept in this report will be the control plane mechanisms in contrast to the forwarding plane mechanisms.
The idea of separate control and forwarding planes doesn't have a single formal definition and the idea can vary depending on what
technology is in focus. For example within the specific protocol MPLS the control plane can be thought of as RSVP-TE
or LDP label signaling, while the forwarding plane can be thought of as the router process of swapping the advertised
labels and forwarding the traffic throughout the network. Looking at BGP/MPLS VPNs, extended to multicast VPNs,
there is a suite of protocols such as BGP and PIM as well as RSVP-TE. This report defines the former, the protocols
involved in session setup, as the control plane, such as BGP advertising an MP-BGP UPDATE message. The protocols
responsible for forwarding the traffic through the network, such as RSVP-TE or LDP, will be defined as the forwarding
plane.
One example of this is the concept of a BGP Free Core. Each PE can form a BGP session with a BGP Route Reflector
(RR), which can be a dedicated router for distributing BGP routes only. This is in contrast to having a mesh of BGP
sessions throughout the Service Provider network. BGP connections between routers in the same AS (Internal BGP
or iBGP) do not need to be directly connected, as is required for BGP connections to a different AS (External
BGP or eBGP). This means that the RR can sit anywhere in the network and not need to be directly connected to the
PEs and can be centralized. In this configuration BGP does not need to be configured on the P routers since the BGP
communication is PE-RR-PE. The BGP distribution across the network is the Control Plane.
As discussed previously the BGP UPDATE message carries the Next Hop address of the originating PE and this address
is also distributed by an IGP. MPLS labels are distributed through the network for each Next Hop address since each
one can be considered a FEC. When a PE sets up the forwarding path for traffic for a specific VRF it associates the
packet with a BGP label with an IGP label on top. The traffic is then forwarded hop by hop using only the IGP label.
The distribution of the IGP label via LDP or RSVP-TE and the forwarding process of swapping labels hop-by-hop is the
Forwarding Plane. The P routers have no knowledge of the routes on the PEs yet traffic can be forwarded through the
network. In effect, the data is tunneled through the network in a VPN model whether BGP is along the forwarding path
or not.
[Figure: two PEs exchange MP-iBGP with a central RR (dashed lines), while labels are distributed hop by hop PE to P to P to PE (solid lines).]
Figure 1.15: Control Plane vs Forwarding Plane
Figure 1.15 shows MP-iBGP communication between two PEs and an RR. The RR could be physically connected to one
or both of the PEs or it could be connected by any number of routers between it and the PEs. For this reason it does
not have any lines representing interfaces. A dashed line is used to represent the communication between the PE and RR
and is representative of the Control Plane communication. Traffic does not need to be forwarded through the RR and in
this scenario it is for exchanging BGP information only. The PEs are physically connected to the P routers, and labels
are distributed hop-by-hop. The label distribution is represented by the solid lines and is representative of the Forwarding
Plane communication. Note that the Forwarding Plane can have its own control communication such as PIM adjacency
establishment or LDP neighbor communication. However these are still considered part of the Forwarding Plane.
Chapter 2
Draft Rosen Multicast Virtual Private Networks
The BGP/MPLS VPNs discussed in the previous chapter were designed to carry unicast traffic. With the growing
popularity of multicast services, various enterprises began to require multicast support between their sites over a Service
Provider (SP) network. Initial implementations of GRE tunnels or Layer 2 VPNs (aka pseudowires) provided results that
are not scalable. The Multicast Virtual Private Network (MVPN) solution, developed by Cisco, is a way to address these
issues. The IETF draft was written by Eric Rosen at Cisco and stayed in draft status, hence the name Draft Rosen
VPNs. Eventually the draft was turned into a historical RFC, number 6037, which will be used as one of the sources for
this chapter. For the rest of this report the solution will be referred to as Multicast VPN (MVPN). Although the MVPN
solution is built off of unicast BGP/MPLS VPNs, there is a large difference between the two. However, certain elements
from the unicast model are reused, such as VPNs, tunneling traffic through the network (with GRE instead of MPLS),
and the use of Multiprotocol BGP [13, p. 279–280].
2.1 Overview of MVPNs
Standard BGP/MPLS VPNs hide per-VPN state information from the P routers. They are not aware of how many VRFs
are on the PE routers in the network. For optimal multicast routing the P routers would need to maintain some sort of
per-VRF state information for the multicast replication trees. Even if the P routers did support this information they
would need to maintain multicast state information for every group for every customer so that the multicast tree is only
built to the PEs which require the traffic. This is not scalable. Multicast VPN provides a solution to the scalability issue
by allowing the SP to maintain a multicast tree only for each VPN rather than every group inside every VPN.
The solution has the following prerequisites:
• PIM-SM is used in the PE VRF instance.
• PIM is used in the SP network.
• The SP network supports multicast forwarding natively.
It is helpful to first define some terms used in the specification.
Customer Element vs. Provider Element The convention in MVPN documentation, and the convention that will
be used in this report, is to add C- for customer or P- for provider before the various technical terms that describe
MVPN.
Multicast VRF This is a VRF on a PE that the service provider configures to be multicast enabled. Within each VRF
is its own multicast routing table and PIM-SM adjacencies with a PIM capable CE router. The CE related PIM instances,
whether directly to the CE router or to the far end PE, will be referred to as C-Instances [27, p. 6]. The Multicast VRF
also participates in MP-BGP for VPNv4 addresses for unicast routes specific to the VPN as well as a new MDT-SAFI
address structure created for MVPN.
Multicast Domain
A Multicast Domain (MD) is a set of multicast VRFs that belong to the same MVPN.
Multicast Distribution Tree The tunnel that is used to carry multicast traffic across the SP network is referred to as
the Multicast Distribution Tree (MDT). The MDT is the MVPN mechanism that allows the C-Instance PIM sessions
between the PEs to appear as if they are directly connected, hiding the core of the network. At a high level each PE sees
the PIM adjacencies of the C-Instance as if they were directly connected via a LAN [13, p. 281–282]. MDTs are created
using the P-Instances of PIM in the SP network and are used to encapsulate the C-Packets of a multicast VRF. There
are two types of MDTs: Default and Data. The Default MDT is used to encapsulate all customer multicast traffic and
forward the traffic to each PE, at least initially. If the traffic volume becomes large and not all sites within the MD want
to receive the traffic one or more Data MDTs can be created. Each MD has at least a Default MDT and can have zero
or more Data MDTs [27, p. 5].
Multicast Tunnel The Multicast Tunnel or Multicast Tunnel Interface (MT or MTI) is an abstract concept as there is
no actual physical tunnel. From the perspective of the multicast VRF the MT is the interface for the path to the other
VRFs in an MD via an MDT. Depending on the router vendor or platform the tunnel will be displayed as “tunnelx” or
“MT” to represent the encapsulation or decapsulation interface [2, p. 67–69][2, p. 80–81].
[Figure: Customers A and B each have three sites (A1, A2, A3 and B1, B2, B3) attached to PEs 1 through 7; each customer has its own set of Multicast VRFs forming its own Multicast Domain, connected by its own MDT.]
Figure 2.1: MVPN Overview
A high level overview of Multicast VPNs is shown in figure 2.1. Both customers have three sites and a Multicast VRF
represented by the solid circle. Each customer also has its own Multicast VRFs which are part of the MD, each connected
by an independent MDT. Note how each customer has its own MD and MDT. Also recall that each MD can have
multiple MDTs (one Default and multiple Data) even though only one is depicted. Figure 2.2 shows a little more detailed
view correlating the terms discussed above using Customer A as an example.
[Figure: Multicast Domain A spans PEs 1, 5, and 6, each holding an M-VRF for Customer A (sites A1, A3, A2); PIM C-Instances run PE to CE at the edges and are tunneled PE to PE over the PIM P-Instance in the core, where the C-IP header and C-payload are encapsulated in a P-IP (GRE) header via the MTI.]
Figure 2.2: MVPN Details
From left to right, the C-Packets are encapsulated with GRE as they enter the MDT via the MTI. From this point on all
C-Groups are hidden and they are all transported through the network using the P-Group that’s assigned to the MDT.
Using that mechanism there could be 100 C-Groups but the SP network only needs to build a tree for the one P-Group
using P-Instance PIM. At PEs 5 and 6 the P-IP Header is removed along with the P-Group and the multicast traffic
is forwarded to the CE. The PEs use the P-Group to identify which VRF the MDT belongs to. This creates a LAN
environment from the perspective of the C-Instance as shown below in figure 2.3.
[Figure: from the perspective of the C-Instance, PEs 1, 5, and 6 (each with M-VRF A, attached to A1, A3, and A2) appear as if connected to a common LAN.]
Figure 2.3: MVPN C-Instance LAN
2.2 MVPN Operation
2.2.1 Multicast Distribution Trees
The Multicast Distribution Trees (MDTs) in MVPN are used to carry the customer multicast control and data traffic,
already defined as C-Packets. These can be further broken down into C-PIM Join, C-Traffic, etc. The C-Packets are
encapsulated within the MDT and from the SP perspective the traffic becomes P-Packets. The MDTs can be shared
trees established using PIM-SM, source trees using PIM-SSM, or a combination of the two. Which is used is up to the
carrier [2, p. 61].
2.2.1.1 MDTs and Generic Routing Encapsulation
The tunneling aspect of MVPNs and MDTs is very important and therefore is explained before the MDT operational
details. When a customer sends multicast traffic (C-Packets) it first reaches the PE in the Multicast VRF where it is
part of the PIM C-Instance. If the traffic needs to be forwarded across the network it is encapsulated by the PE via
the logical MTI by Generic Routing Encapsulation (GRE) and decapsulated at the far-end PE by its logical MTI. The
encapsulation is what allows MVPN to scale. When the C-Packets from a customer enter the Multicast VRF and are
forwarded they are encapsulated by GRE so that the C-Source address and the C-Group address are encapsulated by
another IP Header forming a P-Packet. This header contains an address of the PE as the source address (typically the
address used for MP-BGP as well) and a unique-per-MDT address referred to as the P-Group address [2, p. 61][27, p. 13].
The SP network only uses the outer header to forward the traffic and build the MDT. Because of this encapsulation
the SP network can build the multicast trees mostly the same way as described in the first chapter using just the P-IP
Header. Some extra considerations for building the trees are necessary and are described in the following sections.
2.2.1.2 Default MDT
The Default MDT is used by every PE that is part of an MD as well as each Multicast VRF that is part of that MD.
The Default MDT is identified by an MDT Group Address, also known as the VPN Group address and defined earlier
as a P-Group Address. MDT Group Address and P-Group address will be used interchangeably. A CE router uses its
C-Instance PIM to exchange multicast routing information with the PE within its VRF. The routing information is then
sent across the MDT via the MTI from PE to PE. At the destination PE the information within the VRF is propagated
to the CE using its C-Instance PIM. The PE-PE multicast traffic that is carried across the MDT is also part of the
C-Instance, but is tunneled. Refer again to figure 2.2 where there is a PE-CE C-Instance on both sides, with a tunneled
C-Instance in the middle. This can also be thought of as one contiguous C-Instance where part of it is tunneled. Any
traffic that enters the Default MDT is sent to all PEs participating in that MDT [2, p. 62–66].
The Default MDT is created and maintained by the P-Instance of PIM in the SP network using standard PIM setup
procedures and using the global routing table of the SP’s IGP. If PIM-SM is used the MDT for a specific MDT Group
joins the shared tree that is rooted at the Rendezvous Point (RP). Just like standard trees in PIM, each MDT has a
separate tree built, defined by where the receivers/Multicast VRFs are located. A PE router in an MDT is both a source
and receiver. Using the P-Packets the PIM P-Instance can do normal RPF checks via the global IGP as it builds the
tree.
Figure 2.4 summarizes the operation of the Default MDT. Customer CE A1 sends traffic over the Default MDT to A2
while A2 is sending a PIM Hello over the MDT as well. The Default MDT is connected to all three PE’s with Multicast
VRFs for Customer A. While the customer is using Group Address 239.0.0.1 for the C-Traffic, it is encapsulated in the
MDT and the SP network forwards the traffic using the P-Group Address 233.3.21.100¹. A1 could also be sending
traffic using groups 239.0.0.2 and 239.0.0.3 and so on, but the same MDT Group Address of 233.3.21.100 is used.
Note that the same P-Group Address is used in both directions, while the C-Group for the Customer Join uses the
ALL-PIM-ROUTERS Group Address. Referencing figure 2.2 on page 24 the MDT for Customer B could have an MDT
Group Address of 233.3.21.200.
1 In this example the P-Group address is a GLOP Multicast Group Address. The second and third octets of 3.21 are derived from the
AS Number 789 using the method described in the GLOP Addressing paragraph on page 3. The last octet is arbitrary with .100 used for
Customer A.
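A quick check of that derivation (a throwaway Python snippet, not part of any standard tooling):

asn = 789
hi_octet, lo_octet = divmod(asn, 256)    # the 16-bit AS number split into two octets
print(f"233.{hi_octet}.{lo_octet}.100")  # -> 233.3.21.100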
[Figure: A1 sends C-traffic (C-Source 10.1.1.1, C-Dest 239.0.0.1), which PE 1 encapsulates with P-Source 1.1.1.1 and the MDT P-Group address onto the Default MDT toward A2; A2's PIM Hello toward ALL-PIM-ROUTERS is encapsulated by PE 6 (P-Source 1.1.1.6) onto the same MDT; join and traffic paths are shown across nodes 1 through 7.]
Figure 2.4: MVPN Default MDT Operation
The Default MDT has a single P-Group address but is carrying multiple customer (S,G) streams, which use a C-Source
Address and a C-Group Address. The customer streams can be referred to as (C-S,C-G). Each MDT may be denoted as
(P-S,P-G).
2.2.1.3 Data MDT
The Default MDT always sends all traffic to PEs that are participating in that particular MDT. When the amount of
traffic gets larger this method becomes more and more inefficient. To regain efficiency of delivering multicast traffic
to only the PEs that have active receivers the Data MDT is used. The Data MDT can be created when a configured
bandwidth threshold is crossed for the Default MDT. One or more Data MDTs can be created in addition to the Default
MDT and each MDT receives a unique group address which can be obtained from a pool of P-Group addresses. The
Data MDT also only handles data traffic; control traffic is only sent over the Default MDT.
The PE router tracks the amount of bandwidth for each (C-S,C-G) customer stream and creates a new Data MDT if
that particular group exceeds the user configured bandwidth threshold. The PE does not create a new Data MDT based
solely on the aggregate traffic amount for all groups traversing a Default MDT. Each (C-S,C-G) stream gets its own
Data MDT if it crosses the bandwidth threshold. However, if the pool of available P-Group addresses is exhausted, the PE
router will put more than one customer (C-S,C-G) stream onto a Data MDT. The trade-off is that a smaller P-Group
pool means fewer MDTs, and therefore less P-Instance PIM state, while a larger pool allows for more optimization at the
cost of more P-Instance state.
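The following sketch (purely illustrative; the threshold, pool addresses, and reuse policy are invented, and real implementations expose these as configuration) captures the per-(C-S,C-G) decision just described: streams below the threshold stay on the Default MDT, streams above it are given a Data MDT from the configured pool, and streams are doubled up once the pool runs out.

THRESHOLD_KBPS = 1000
pool = ["233.3.21.101", "233.3.21.102"]        # hypothetical configured Data MDT P-Groups
data_mdts = {}                                  # (C-S, C-G) -> Data MDT P-Group

def place_stream(c_source, c_group, rate_kbps):
    if rate_kbps <= THRESHOLD_KBPS:
        return "default MDT 233.3.21.100"
    if pool:                                    # a free P-Group: dedicate a new Data MDT
        data_mdts[(c_source, c_group)] = pool.pop(0)
    else:                                       # pool exhausted: share an existing Data MDT
        data_mdts[(c_source, c_group)] = next(iter(data_mdts.values()))
    return "data MDT " + data_mdts[(c_source, c_group)]

for c_s, c_g, kbps in [("10.1.1.1", "239.0.0.1", 200),
                       ("10.1.1.1", "239.0.0.2", 4000),
                       ("10.1.1.1", "239.0.0.3", 6000),
                       ("10.1.1.1", "239.0.0.4", 9000)]:
    print((c_s, c_g), "->", place_stream(c_s, c_g, kbps))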
Just like the Default MDT the Data MDT is created using P-Instance PIM. The PE router with active receivers can
send a PIM P-Join message, but first it needs to learn of the P-Group address of the Data MDT. To facilitate this a new
control message is created called a Data MDT Join. The PE with an active source sends the Data MDT Join to all
the PEs participating in the Default MDT using a destination address of 224.0.0.13, the ALL-PIM-ROUTERS Group
Address. The message payload consists of the customer’s (C-S,C-G) information (the customer’s source address and
group address for a stream) along with the Data MDT's P-Group address. A PE router with receivers for that particular
(C-S,C-G) stream will then join that Data MDT. PEs that do not have active receivers will still store the Data MDT
Join information in case an active receiver does want to join that (C-S,C-G) stream. The source PE that initiated the
Data MDT will wait several seconds before putting traffic onto the Data MDT to allow for time for the receiving PEs to
set up the tunnel [2, p. 66–67].
The Data MDT can be set up by using either PIM-SM or PIM-SSM. If PIM-SM is used the PE routers, upon receipt of
the Data MDT Join, will send a P-Join back toward the P-RP of the shared tree. If PIM-SSM is used the receiving PE
will send a P-Join back to the source PE router creating a source tree. RFC 6037 recommends the use of PIM-SSM [27,
p. 16–17].
[Figure: PE 1 (P-Source 1.1.1.1) advertises a Data MDT Join for the customer stream (C-Source 10.1.1.1, C-Group 239.0.0.2) with the new Data MDT P-Group address over the Default MDT; PE 5, which has active receivers, sends a P-Join back toward the source PE.]
Figure 2.5: MVPN Data MDT Signaling
Figure 2.5 shows the source PE advertising the Data MDT Join over the Default MDT. In contrast to figure 2.4,
a new P-Group Address of 233.3.21.101 is used for the Data MDT instead of .100, which is already used for the Default
MDT. The (C-S,C-G) of (10.1.1.1,239.0.0.2) is the customer stream that crossed the bandwidth threshold configured
on the PE. Only PE 5 has active receivers for this customer stream so it sends a P-Join back to the source PE using the
PIM-SSM method. Figure 2.6 below shows the customer traffic traversing the new Data MDT. Note that the traffic for
this group is encapsulated using the new P-Group Address specific to the Data MDT of .101.
[Figure: the customer stream (C-Source 10.1.1.1, C-Group 239.0.0.2) encapsulated with P-Source 1.1.1.1 and the Data MDT P-Group address, traversing only the branch toward the PE with active receivers.]
Figure 2.6: MVPN Data MDT Operation
2.2.2 Auto-Discovery in MVPNs
The P-Group Address for an MDT is manually configured on a router. When PIM-SM is used to build the trees the
standard mechanisms are used, where the source and receiver PEs can discover each other through the RP. The receiver
PE is sending (*,G) PIM P-Joins toward the RP while the source is sending Register Messages toward the RP. Because
of the use of (*,G) the receiver PE does not need to know the source PE’s unicast source address for that particular
group [2, p. 105]. Each PE only needs to know the P-Group address for the MDT [27, p. 8].
Using PIM-SSM for MDT setup requires an additional mechanism for auto-discovery since the receiver PE does not know
the source PE’s IP address2 . The mechanism created is a new BGP Address Family called MDT SAFI. This Address
Family uses an AFI of 1 and a SAFI of 66. The NLRI field contains one or more of the 2-tuple of an RD prepended to
the IPv4 address used as the source address plus the P-Group Address.
+------------------------------------------+
| RD:IPv4 Source Address (12 octets)       |
+------------------------------------------+
| P-Group Address (4 octets)               |
+------------------------------------------+
A Route Target (RT) is also included in the same UPDATE message that contains the MP-BGP Address Family for
MDT SAFI. Using normal BGP VPN mechanisms the route information can be associated with the correct VRF. The
P-Group could also be used, but this would require that all P-Groups are unique across a multi-provider network. This is
difficult, so the RFC specifies that RTs must always be used to facilitate the use of multi-provider networks [27, p. 8–10].
Each BGP speaker participating in MVPN receives the MDT SAFI information and uses the Route Targets to install
the information into the correct VRF. Each PE router can then join the (S,G) tree using normal PIM processes [2,
p. 105–106].
2 Contrast this to a typical SSM case in a non-MVPN network where a host is trying to join a specific group and source: the IGMPv3
Membership Report (IGMP Join) has the source included in it along with the group it is trying to join. The router then turns this into an (S,G)
PIM Join. In the MVPN case in the P-Instance there is no host joining a group using a specific source; only the P-Group Address is known
from manual configuration.
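For illustration, the MDT SAFI NLRI shown above can be packed as in the following sketch (the helper and the RD value for Customer A are invented; the layout is simply the RD-prefixed source address followed by the P-Group address):

import socket, struct

def mdt_safi_nlri(rd: bytes, source_pe_ip: str, p_group: str) -> bytes:
    # 8-octet RD + 4-octet source PE address, then the 4-octet P-Group address
    return rd + socket.inet_aton(source_pe_ip) + socket.inet_aton(p_group)

rd = struct.pack("!HHI", 0, 789, 100)   # hypothetical Type 0 RD 789:100 for Customer A
print(mdt_safi_nlri(rd, "1.1.1.1", "233.3.21.100").hex())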
2.2.3 RPF
Reverse Path Forwarding (RPF) checks are a fundamental part of multicast and are still needed in an MVPN environment.
In a typical PIM network the check occurs by making sure traffic is arriving over the interface that is part of the shortest
path back to the source according to the global unicast routing table. This check needs to be handled a little differently
when in a Multicast VRF that consists of MDT MTIs. The check can occur normally for the PIM P-Instance since this is
part of the global table. Within the VRF the C-Instance traffic can either be sourced from a CE interface or from the
MDT’s MTI for the MVPN. If it is received from the CE interface a normal RPF check can occur since that interface is
participating in the VRF’s routing table. However if the packets are received from another PE on the other side of the
MDT the VRF doesn’t automatically have the route toward the other PE. In this case, the routes within a VRF for the
other PEs in the MDT are provided by VPNv4 BGP. The RPF check within the Multicast VRF will set the upstream
interface as the MTI if the VPNv4 message contains a C-Source address. The RPF neighbor address is set to be the
BGP Next Hop address within the VPNv4 message, and PIM will use this address when sending Hello Messages across
the MDT. With these modifications the MTI is treated just like a physical interface on the router, and PIM simply uses
the BGP Next Hop as the PIM neighbor on the other side of the MDT [2, p. 70].
2.3 Considerations for Inter-AS and BGP Free Core
When a BGP free core is used, or in Inter-AS scenarios, extra information is necessary for RPF checks or PIM signaling
to occur. RFC 6037 specifies two new methods to allow for communication in these scenarios.
2.3.1 PIM MVPN Join Attribute
The PIM MVPN Join Attribute, also called the PIM RPF Vector or PIM Vector, is used to assist with Inter-AS
communication or BGP-Free Core communication. The PIM Vector is a new PIM Join Attribute, an extension of PIM.
The PIM Vector contains the IP address of the router that has reachability to the source (the IP address that the PIM
Join/Prune should be forwarded to), and an RD. The RD is taken from the BGP MDT SAFI UPDATE Message [27,
p. 11–13][2, p. 122–123].
Using MVPN and BGP MDT advertisements the PE will be aware of the source address, but it is kept outside of the IGP
table and is in the special BGP MDT SAFI Table. The PIM Vector helps in a BGP free core, or in an inter-AS scenario,
where the source address isn’t known because it’s not in the IGP table. PIM relines on the IGP table, and there is no
BGP MDT Table, since that is only on the PE routers or ASBR router. Instead, P-PIM can use the IP address of the
RPF Vector, which is an IP address of a router that knows how to reach the source PE. The source PE is aware of the
BGP MDT Table and the global IP address used in the PIM Vector. The RD is required so that the PE can associate
the PIM message with the appropriate BGP MDT Table [2, p. 123].
2.3.2 BGP Connector
With each VPNv4 UPDATE message that a PE distributes from a Multicast VRF it must carry the BGP Connector
Attribute. It is an optional transitive attribute [27, p. 15–16]. The value of the attribute is the IP address of the PE (likely
the loopback). For Intra-AS communication it doesn’t have much purpose, but for Inter-AS “Option-B” communication
it has significance because the ASBR changes the next-hop of the UPDATE message. The attribute preserves the originating PE's
router address, which allows the far-end PE in the other AS to fulfill its RPF check [2, p. 116–117].
Chapter 3
BGP/MPLS Multicast Virtual Private Networks
While Draft Rosen MVPNs allowed Service Providers (SPs) to create scalable Multicast Virtual Private Networks,
Draft Rosen does have its limitations. For one, the SPs are not able to leverage the MPLS technology already deployed
in their network. The Draft Rosen method utilized GRE to create the tunnel between the edges through the core which
created an overlay network. This results in the SP having to maintain a PIM/GRE topology in addition to a BGP/MPLS
topology for the traditional RFC 4364 Unicast VPNs. In a large SP network with many customers a large amount of PIM
state also had to be maintained by the routers in the core, when there is the preference by many SPs to keep their cores
simple and only label-switch traffic. The Default MDT was inefficient in the sense that all PEs had to receive C-Packets
even if there weren’t any receivers, and higher amounts of traffic caused more state to be created to support multiple
Data MDTs [2, p. 153–154].
BGP/MPLS Multicast VPNs were created to extend the use of Unicast VPNs, as defined in RFC 4364, to carry customer
multicast traffic. The RFC defines a framework to allow an SP to carry multiple C-Multicast streams without requiring
the amount of state in the SP network to increase proportionally. The primary method for accomplishing this is by
aggregating multiple customer streams into a single distribution tree throughout the backbone P routers. Multiple
aggregation methods are defined [28, p. 5–7]. BGP/MPLS Multicast VPNs are defined in RFC 6513, which provides the
overview and framework, and RFC 6514 which includes detailed information about the BGP encodings defined within RFC
6513. Eric Rosen co-authored RFC 6513 along with Rahul Aggarwal and both are the main authors for RFC 6514.
RFC 6513 defines a Multicast VPN (MVPN) as two sets of sites, a Sender Site set and a Receiver Site set. The traffic
originated by a Sender Site should only be received by its corresponding set of Receiver Sites, and not any other Receiver
Site not in that set. In other words Customer A Sender traffic should only be received by receivers at Customer A sites.
Or, Customer A can send traffic to another customer if it allows that to happen, which would imply that the other
customer is in Customer A’s receiver set. This would be the case in an extranet. The MVPN capabilities are carried out
using RFC 4364 mechanisms.
In this chapter the Draft Rosen MVPNs will be referred to as DR-MVPNs and the BGP/MPLS Multicast VPNs described
in RFCs 6513 and 6514 will be referred to as Next-Generation MVPNs (NG-MVPNs). This chapter will also use the
same convention of distinguishing customer and provider elements with the C- and P- prefix. Some terms will be carried
over as well, such as P-Group Address.
3.1 Next-Generation Multicast VPN Overview
In an NG-MVPN network the role of BGP is to convert PIM messages from a customer on a PE into special BGP
messages, send them across the network, and convert them back to PIM at the far end PE for hando↵ to the customer
at another site. Using the Unicast BGP/MPLS procedures defined in RFC 4364 the PE can map these messages to
a specific Multicast enabled VRF. BGP is also responsible for autodiscovery using a set of special BGP messages and
binding C-Multicast routes to whichever provider tunnel is chosen. Using information carried within BGP the PEs can
also establish a variety of P-Tunnels. One option is PIM/GRE based tunnels as in DR-MVPNs. However, there are
also MPLS based options, including RSVP-TE which can allow for traffic engineering of the multicast traffic. With
the inclusion of MPLS technologies for transport and the use of BGP for control plane the technology has the name
BGP/MPLS Multicast VPNs [13, p. 287-292].
[Figure: PEs 1, 5, and 6 each hold M-VRF A for sites A1, A3, and A2; PIM is the control plane on the PE to CE edges while MP-BGP is the PE to PE control plane, and the C-IP header and C-payload are tunneled across the SP core over a P-Tunnel entered via a PMSI, inside whatever transport encapsulation is chosen.]
Figure 3.1: BGP/MPLS Multicast VPN
The above figure is purposefully similar to figure 2.2 on page 24 to compare and contrast the two technologies. As in
DR-MVPN there is a P2MP P-Tunnel; however, the P-Tunnel can be instantiated in a variety of ways using PIM/GRE and MPLS.
Rather than an MTI, NG-MVPN uses a somewhat similar concept of a PMSI at the endpoints of the P-Tunnel. Also,
the control plane is no longer solely PIM but is now MP-BGP within the SP network. In both cases, the C-Traffic is
tunneled throughout the multicast network. NG-MVPNs can be broken up into two parts: control plane and forwarding
plane. The control plane is the combination of PIM and BGP while the forwarding plane comprises the various options for
transporting the customer multicast traffic across the network, such as MPLS.
3.2 PMSI
As with DR-MVPN, NG-MVPN also has multicast distribution trees. The two types are Inclusive Trees and Selective
Trees. An Inclusive Tree includes all of the C-Multicast Traffic of the PEs that are members of the same MVPN. The
number of Inclusive Trees is bound by the number of VPNs on a PE router, not by the number of C-Multicast groups.
Selective Trees carry only one or more C-Multicast Groups for a given MVPN. In other words, they don’t carry all of
the C-Multicast groups for a customer. A PE can by default carry all traffic on an Inclusive Tree and elect to only
put higher bandwidth flows onto separate Selective Trees. The Selective Trees should be configured so that they only
terminate on PEs that actually have active receivers [28, p. 7–8]. Inclusive trees also have two subtypes: Multidirectional
and Unidirectional (MI-PMSI and UI-PMSI). The Multidirectional tree is akin to a broadcast network where any PE
that sends a message will have that message sent to any other PE on the MI-PMSI. The Unidirectional PMSI allows a
particular PE to send traffic to any other PE in that MVPN [28, p. 15–16]. The difference between an MI-PMSI and a
UI-PMSI may not be obvious. An MI-PMSI can be thought of as a set of UI-PMSIs that together create full-mesh connectivity in an
MVPN. This may become clearer when the instantiation of PMSIs is explained in section 3.2.1.
The MI-PMSI is used in special circumstances not covered in this report (such as PIM as the PE-PE Control Plane or the
use of PIM-BIDIR), so only the term I-PMSI will be used.
The Inclusive Tree and Selective Tree are akin to the Default MDT and Data MDT of DR-MVPNs respectively. Both
inclusive and selective trees can be aggregated into another tunnel as an aggregated inclusive tree and/or an aggregated
selective tree. This is discussed more in depth in section 3.4.6.
A PE needs the ability to send packets over one or more trees that belong to an MVPN. This concept is realized by
Provider Multicast Service Interfaces (PMSIs). A C-Packet sent via a PMSI will be delivered to some or all of the PEs
participating in the MVPN, and any receiver will be able to determine which VPN the C-Packet resides in. The PMSI is
the entry point for a P-Tunnel, which is the transport mechanism used for delivering C-Packets. RFC 6513 clarifies that
a PMSI is not necessarily part of a P-Tunnel, as a single P-Tunnel can carry multiple PMSIs [28, p. 14–15]. The PMSI is
also an abstract concept. When a PE gives a packet to the PMSI it will arrive at one or all of the PEs that belong to a
given MVPN. A PE may send C-Traffic to the PE routers that have receivers for that traffic or to all of the PE routers
in that MVPN. BGP is used to signal which type of PMSI should be used by including a PMSI Tunnel Attribute that is
included in a NG-MVPN BGP UPDATE [2, p. 157–158].
There are two types of PMSIs. The first is an Inclusive PMSI (I-PMSI). The I-PMSI is used when a PE can send a
message that will be received by all the PEs for that MPVN. Another type of PMSI is the Selective PMSI (S-PMSI).
The S-PMSI is used so that a message will be sent to only selected PEs participating in an MVPN [28, p. 15–16]. It
is possible to send traffic only on S-PMSIs and never use an I-PMSI for carrying C-Multicast Traffic which allows for
further optimization [28, p. 19].
[Figure: two panels, I-PMSI and S-PMSI, showing Customer A's MVRFs on PEs 1, 5, and 6 (sites A1, A2, A3); the I-PMSI P-Tunnel reaches all of the customer's PEs while the S-PMSI P-Tunnel reaches only one.]
Figure 3.2: Provider Multicast Service Interface
Figure 3.2 shows an I-PMSI and an S-PMSI. The PMSI can be thought of as the interface to the P-Tunnel; however, for
each P-Tunnel there may be more than one PMSI. The I-PMSI connects to all the PEs for Customer A, while the S-PMSI
connects to only one PE. The S-PMSI may also carry only a subset of the multicast groups for the MVPN.
3.2.1 Instantiating PMSIs
A PMSI is instantiated by P-Tunnels, which are the encapsulation and forwarding method for multicast traffic in NG-MVPN.
The P-Tunnels can be created by PIM, mLDP, RSVP-TE, or replication over P2P Unicast P-Tunnels. In the PIM case,
as in DR-MVPN, there is a P-Instance of PIM that is used to create the tunnels. These can be either source tree
or shared tree methods, but an S-PMSI is best created using source tree methods. A P2MP mLDP LSP can create an
S-PMSI or a UI-PMSI, and an MP2MP mLDP LSP can create an MI-PMSI. An MI-PMSI can also be created by a set of P2MP
mLDP LSPs. RSVP-TE can instantiate an S-PMSI or a UI-PMSI with a single P2MP LSP, while a set of such LSPs (one
rooted at each PE in the MVPN) can instantiate an MI-PMSI. Unicast P-Tunnels are either a partial or full mesh for UI-PMSI and S-PMSI or
MI-PMSI respectively.
P-Tunnels are discussed in detail in section 3.4.
3.3 PIM and BGP Control Plane
NG-MVPN requires that a PE maintains at most one BGP peering session with all the other PEs in the network, or with
a Route Reflector (RR), for carrying the NG-MVPN control information [28, p. 11]. This report only considers using
BGP for PE-PE control information and not PIM. In other words, the report only considers translating, for example,
PIM C-Join messages into BGP C-Multicast Routes, and not forwarding the PIM Join over a PMSI. The description for
PE-CE PIM and PE-PE BGP components are covered below.
3.3.1 PIM Control Plane for CE-PE Information
Similar to Unicast BGP/MPLS VPNs, NG-MVPNs have the CE peer only with the directly attached PE using a multicast
routing protocol over the attachment circuit (AC). The CE does not peer with the remote CE on the other side of the
SP network. The AC is part of a VRF that is configured to be multicast enabled. As with DR-MVPNs these multicast
peering sessions between the CE and PE are referred to as multicast C-Instances. The VRF that the AC is attached
to contains both unicast and multicast routing instances. RFC 6513 specifies the use of PIM-SM, PIM-SSM, and
Bidirectional PIM (BIDIR-PIM) as the PE-CE protocols [28, p. 13]. The PE-PE support methodology for BIDIR-PIM will
not be discussed in this report.
3.3.2 MP-BGP Control Plane for PE-PE Information
New Path Attributes, Extended Communities, and NLRI Encodings (referred to as Route Types) were created to support
NG-MVPNs and are included in NG-MVPN BGP UPDATE Messages. The following sections describe in detail each
addition.
3.3.2.1 New BGP Path Attributes and Extended Communities
RFC 6514 defines three new path attributes that are used in conjunction with the new NLRI encodings described in the
next section.
PMSI Attribute The PMSI Tunnel Attribute in a BGP UPDATE message identifies
which type of P-Tunnel is used to send traffic. This is an optional transitive attribute. The PMSI Attribute is made up of
four fields as follows [29, p. 10–11]:
+------------------------------------------+
| Flags (1 octet)                          |
+------------------------------------------+
| Tunnel Type (1 octet)                    |
+------------------------------------------+
| MPLS Label (3 octets)                    |
+------------------------------------------+
| Tunnel Identifier (variable)             |
+------------------------------------------+
The Flags field has only one flag, which indicates whether leaf information is required. The MPLS Label field is either set to zero to indicate there is no label, or a label value is encoded in the high-order 20 bits of the three octets [29, p. 10]. The MPLS Label field is used when the ingress PE uses “upstream label allocation” to distribute a label to an egress router [30,
p. 9]. The Tunnel Type field has the following values [29, p. 10]:
• 0 - No Tunnel Information Present
• 1 - RSVP-TE P2MP LSP
• 2 - mLDP P2MP LSP
• 3 - PIM-SSM Tree
• 4 - PIM-SM Tree
• 5 - BIDIR-PIM Tree
• 6 - Ingress Replication
• 7 - mLDP MP2MP LSP
Depending on the value in the Tunnel Type field the Tunnel Identifier includes the following information [29, p. 10–
13]:
No Tunnel Information Present No tunnel information is included. This setting can be used when a PE needs to know
the receivers before it establishes a tunnel. The “Leaf Information Required Bit” is set in this case, which will
prompt the other PEs to send Leaf A-D route messages [28, p. 52].
RSVP-TE P2MP LSP The same information in the P2MP Session Object is included. This is the Extended Tunnel ID,
Tunnel ID, and P2MP ID.
mLDP P2MP LSP The P2MP FEC Element is included. This is the combination of the source address of the LSP
tree and a unique value.
PIM-SSM Tree The P-Root Node Address (P-Source Address of the PE) and the P-Group Address. The P-Group
address is an address from the P-Instance of PIM running in the service provider network.
PIM-SM Tree The Sender Address and the P-Group Multicast Address.
BIDIR-PIM Tree BIDIR-PIM uses the same Tunnel Information as PIM-SM.
Ingress Replication The unicast IP address of the tunnel endpoint.
mLDP MP2MP LSP The MP2MP FEC Element, which is similar to the P2MP FEC in concept, and is not discussed
in this report.
Section 3.4 discusses the various types of P-Tunnels in depth, except for BIDIR-PIM and mLDP MP2MP, which will not
be covered in this report.
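As a concrete illustration of the layout above, the short Python sketch below packs the four fields of a PMSI Tunnel Attribute. It is a minimal sketch only: the helper name, the dictionary of tunnel-type code points, and the example values (Ingress Replication, label 300, endpoint 192.0.2.1) are illustrative assumptions, not taken from any router implementation.

import struct
import socket

# Tunnel Type code points from the list above
PMSI_TUNNEL_TYPES = {
    "no-info": 0, "rsvp-te-p2mp": 1, "mldp-p2mp": 2, "pim-ssm": 3,
    "pim-sm": 4, "bidir-pim": 5, "ingress-replication": 6, "mldp-mp2mp": 7,
}

def build_pmsi_tunnel_attribute(tunnel_type, mpls_label, tunnel_id, leaf_info_required=False):
    """Pack the Flags, Tunnel Type, MPLS Label, and Tunnel Identifier fields.

    mpls_label is the 20-bit label value (0 means no label); it is carried in
    the high-order 20 bits of the 3-octet MPLS Label field. tunnel_id is the
    already-encoded Tunnel Identifier (its layout depends on the tunnel type).
    """
    flags = 0x01 if leaf_info_required else 0x00          # only one flag is defined
    label_field = (mpls_label << 4) & 0xFFFFF0            # label in high-order 20 bits
    return (struct.pack("!BB", flags, PMSI_TUNNEL_TYPES[tunnel_type])
            + label_field.to_bytes(3, "big") + tunnel_id)

# Example: Ingress Replication with tunnel endpoint 192.0.2.1 and label 300
attr = build_pmsi_tunnel_attribute("ingress-replication", 300,
                                   socket.inet_aton("192.0.2.1"))
print(attr.hex())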
Source AS Extended Community This BGP Extended Community is set to the AS Number (ASN) of the SP network
that the PE belongs to. It is used for identifying the ASN, and has particular use for Inter-AS updates. It is an optional
transitive attribute. A unicast BGP/MPLS UPDATE Message must carry this Extended Community [29, p. 13].
VRF Route Import Extended Community Every Multicast VRF is required to have an import Route Target configured, which is similar in use to the import/export Route Targets of Unicast BGP/MPLS VPNs. This Route Target is referred to as
the C-Multicast Import RT. It contains two fields. One is the “Global Administrator Field” which contains an IP address
of the PE that is the same across all VRFs (e.g. a loopback address on the PE). The other is the “Local Administrator
Field” which is set to a unique 16-bit number that can identify a VRF. The combination of the Global and Local Fields
can uniquely identify a VRF [29, p. 14].
An important difference from unicast BGP/MPLS RTs is that the C-Multicast Import RT is dynamic in the sense that the Global Admin Field always contains the IP address of the active sender, which can change [2, p. 166].
The C-Multicast Import RT is just the value that is configured for a particular VRF, and is carried to other PEs by
putting the value into the RT Extended Community of a BGP UPDATE message. Of the special BGP/MPLS MVPN
Routes, which are described in section 3.3.2.2, C-Multicast Import RTs are only carried by the Route Target Extended
Communities of C-Multicast Routes (Type 6 and 7) [2, p. 166]. Outside of these special routes, the C-Multicast RT
value must also be carried in the VRF Route Import Extended Community of a BGP UPDATE Message for a unicast
BGP/MPLS VPN Route. These unicast routes represent the source of a particular C-Multicast flow. However, if it is
known that none of the unicast routes are capable of being a source, then the route should not carry the VRF Route
Import EC [29, p. 14].
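A brief sketch of how the two administrator fields described above could be composed into an on-the-wire extended community. The leading type/sub-type octets shown (the IPv4-address-specific transitive encoding) and the example PE address and VRF number are assumptions made for illustration.

import struct
import socket

def vrf_route_import_ec(pe_loopback_ip, vrf_id):
    """Compose a VRF Route Import Extended Community.

    Global Administrator field: an IP address of the PE (e.g. its loopback),
    the same across all of that PE's VRFs.
    Local Administrator field: a 16-bit number identifying one VRF on the PE.
    The 0x01/0x0b type and sub-type octets are assumptions of this sketch.
    """
    return (struct.pack("!BB", 0x01, 0x0b)
            + socket.inet_aton(pe_loopback_ip)
            + struct.pack("!H", vrf_id))

# PE loopback 1.1.1.1 and VRF number 100 -> C-Multicast Import RT "1.1.1.1:100"
print(vrf_route_import_ec("1.1.1.1", 100).hex())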
3.3.2.2 MCAST-VPN NLRI
RFC 6514 defines a new MP-BGP NLRI with a set of NLRI encodings for two purposes: MVPN auto-discovery (A-D) and binding, as well as advertisement of C-Multicast Routes. Each NLRI encoding is known as a Route Type. One of these Route Types may indicate the type of PMSI that is going to be signaled, or it may indicate that a PE has a receiver ready to receive traffic. As discussed earlier there are multiple types of PMSIs, and BGP is used to signal which types are used for an MVPN. The first five Route Types, which carry auto-discovery and binding information, are as follows:
• Intra-AS I-PMSI A-D route
• Inter-AS I-PMSI A-D route
• S-PMSI A-D route
• Leaf A-D route
• Source Active A-D route
The last two Route Types are for carrying C-Multicast Route information, “C-Multicast Routes”. Each VRF contains a
unique Tree Information Base (MVPN-TIB) containing the C-Multicast Routes for that particular VRF. The two Route
Types are as follows:
• Shared Tree Join Route
• Source Tree Join Route
The NLRI is identified by AFI 1 and SAFI 5 (MCAST-VPN) and consists of three fields. The first is the Route Type field, which identifies which Route Type is encoded in the NLRI. The next is the Length field, which specifies how many octets make up the Route Type specific field. The last is the Route Type specific field, which carries the actual encoding [29, p. 4–6].
+-----------------------------------+
| Route Type (1 Octet)              |
+-----------------------------------+
| Length (1 octet)                  |
+-----------------------------------+
| Route Type specific (Variable)    |
+-----------------------------------+
Each NLRI Route Type encoding is described below along with its behavior in an NG-MVPN network.
Route Type 1 - Intra-AS I-PMSI A-D The Intra-AS I-PMSI A-D route is advertised by any PE that wishes to
participate in NG-MVPN auto-discovery and binding.
+-----------------------------------+
| Route Distinguisher (8 Octets)    |
+-----------------------------------+
| Originating Router’s IP Address   |
+-----------------------------------+
The NLRI contains an RD that is configured for the VRF that the route originated from along with the same IP address
that it uses in the VRF Route Import EC that was used in a Unicast BGP/MPLS advertisement for that VRF (e.g. a
loopback address). The combination of the RD and the Originating Router’s IP address uniquely identifies a Multicast
VRF. The advertisement only contains the Tunnel Attribute field if an I-PMSI is being created (remember that an
I-PMSI does not need to be used and the network can use solely S-PMSIs). In other words, the PE sends this type of advertisement in any case. If the I-PMSI is being used then the advertisement must contain the PMSI Attribute, and if
Ingress Replication is used it must contain a label for demultiplexing at the receiver end. The Next Hop field of the
MP REACH NLRI that contains the MCAST-VPN Route must be set to the same address as the Originating Router’s
IP Address field. The advertisement also uses the same Route Target values as the Unicast BGP/MPLS export routes
for that VRF.
Upon receipt of the I-PMSI Intra-AS advertisement, the receiving PE will import the routes into the VRF if the Route
Target in the RT EC of the route matches the RT value configured for the VRF. When the receiving PE receives the
Intra-AS Route advertisement and it does not have the PMSI Tunnel Attribute and Ingress Replication is not used the
receiving PE can assume that (1) only an S-PMSI will be used, or (2) that the originating PE of the advertisement
cannot send multicast traffic (i.e. it is only a receiver). To determine whether it is case 1 or 2, the VRF Route Import EC is used. If the VRF Route Import EC is not present for a unicast BGP/MPLS route, then the PE that originated the route cannot be selected as a source PE (as it does not have routes with a source). Therefore it is case (1), and this PE will
only be used for originating S-PMSI routes.
If a Tunnel Attribute is carried and Ingress Replication is used then the MPLS Label and the Address in the Tunnel
Identifier should be used when the local PE sends traffic to the PE that originated the route. In all other cases the local
PE should join the P-Tunnel (if RSVP-TE is used then the sender PE is responsible for building the tunnel to the local
PE).
The only time an Intra-AS I-PMSI Route is not originated by a PE is when a MVPN site will not be receiving any
multicast traffic (i.e. it is only a sender) and Ingress Replication is used.
An example of an Intra-AS I-PMSI A-D route as it is shown in a router’s routing table:
1:789:100:1.1.1.1, where 1 is the Route Type, 789:100 is the RD, and 1.1.1.1 is the IP address of the originating router
[31].
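The sketch below, again only illustrative, shows how the Route Type 1 NLRI fields fit together and how the routing-table string from the example can be rendered. Details not stated in the text, such as the type-0 RD layout, are assumptions.

import struct
import socket

def type1_intra_as_ad_nlri(rd_asn, rd_number, originator_ip):
    """Build a Route Type 1 MCAST-VPN NLRI: header (type, length) + RD + originator IP."""
    rd = struct.pack("!HHI", 0, rd_asn, rd_number)     # type-0 RD: 2-octet ASN, 4-octet number
    body = rd + socket.inet_aton(originator_ip)
    return struct.pack("!BB", 1, len(body)) + body     # Route Type = 1, Length in octets

def routing_table_string(rd_asn, rd_number, originator_ip):
    """Render the route the way the example shows it, e.g. 1:789:100:1.1.1.1."""
    return f"1:{rd_asn}:{rd_number}:{originator_ip}"

print(routing_table_string(789, 100, "1.1.1.1"))
print(type1_intra_as_ad_nlri(789, 100, "1.1.1.1").hex())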
Route Type 2 - Inter-AS I-PMSI A-D This Route Type is only used when Inter-AS segmented tunnels are used
between AS networks. Only an ASBR originates this route.
+-----------------------------------+
| Route Distinguisher (8 Octets)    |
+-----------------------------------+
| Source AS (4 Octets)              |
+-----------------------------------+
The RD is encoded the same as it is in Unicast BGP/MPLS VPNs. The Source AS contains an AS Number of the
originating router, and occupies the low-order 16 bits of the field. The high-order bits are set to zero. This Route Type is
originated when an ASBR determines, using Type 1 Routes, that there is an active receiver in its own AS. The Inter-AS
I-PMSI A-D Route also carries an import Route Target called “ASBR Import RT” (which is the unicast RT), which
allows for the acceptance of Leaf A-D routes and C-Multicast routes from an ASBR. The ASBR sends the advertisement
via external BGP to the neighboring AS. It sends the message with the “Leaf Information Required” flag set, and does
not send any label. The Next Hop field of the MP REACH NLRI field is set to an IP address that is reachable by a
router in the other AS. In the network that is on the other side of the ASBR the identification of a source becomes the
pair of AS and RD, rather than PE and RD. This means that even with multiple trees on the source AS side, the other
AS may have just one MVPN for all of the MVPNs in the source AS.
Upon receipt of the I-PMSI Inter-AS advertisement, the receiving PE will import the routes into the VRF if the Route
Target in the RT EC of the route matches the RT value configured for the VRF. If the router is an ASBR it will pass the
routes along in external BGP. If the PMSI Attribute carries a Tunnel Type for PIM-SM/SSM or mLDP P2MP Tree, the
receiving router should join the tree using the identifying information carried in the Tunnel Identifier field of the attribute.
If the Tunnel Type is set to RSVP-TE P2MP LSP, then the originating router is required to build the sub-LSP to
the receiving router (this may have been done already as the headend is responsible for initiating the LSP construction in
RSVP-TE). If the “Leaf Information Required” bit was set then the receiving router will originate a Leaf A-D Route.
The Leaf A-D Route Key is populated with the MCAST VPN NLRI information from the Inter-AS I-PMSI advertisement
[29, p. 20–30].
An example of an Inter-AS I-PMSI A-D route as it is shown in a router’s routing table:
2:789:100:789, where 2 is the Route Type, 789:100 is the RD, and 789 is the source AS Number of the originating
router [31].
Route Type 3 - S-PMSI A-D The S-PMSI A-D Route Type is only used when the C-Multicast stream has a specific C-Source address (C-S,C-G).
+-----------------------------------+
| Route Distinguisher (8 Octets)    |
+-----------------------------------+
| Multicast Source Length (1 Octet) |
+-----------------------------------+
| Multicast Source (variable)       |
+-----------------------------------+
| Multicast Group Length (1 octet)  |
+-----------------------------------+
| Multicast Group (variable)        |
+-----------------------------------+
| Originating Router’s IP Address   |
+-----------------------------------+
The RD is the same as in the Inter-AS and Intra-AS I-PMSI Routes. The Multicast Source field contains the C-Multicast source IP address. The Multicast Group contains the C-Multicast Group Address or the mLDP P2MP FEC
values when P2MP mLDP is used. The Originating Router’s IP Address is that of the PE, not the CE, as with the
Intra-AS I-PMSI A-D message, and it needs to be the same as the address used in the VRF Route Import Extended
Community (e.g. a loopback address). This Route Type carries the PMSI Tunnel Attribute which contains the identity
of the P-Multicast Tree used for the P-Tunnel. If the originating PE needs to learn about the leaves of the P-Multicast
tree it can set the “Leaf Information Required” bit. An ASBR in certain circumstances may convert one or more received
S-PMSIs from another AS into one I-PMSI and distribute it toward the receiver in its own AS.
The process when receiving an S-PMSI A-D route is the same as described for the Inter-AS I-PMSI A-D Route. If
the “Leaf Information Required” bit is set then the receiving PE originates a Leaf A-D route. The Route Key Field is
populated with the MCAST VPN NLRI information from the S-PMSI A-D Route [29, p. 40–45].
An example of an S-PMSI A-D route as it is shown in a router’s routing table:
3:789:100:32:10.1.1.1:32:239.0.0.1:1.1.1.1, where 3 is the Route Type, 789:100 is the originating router’s RD, 32 is the
length of the address (indicating IPv4) in both locations, and 10.1.1.1 is the C-Source Multicast Address, 239.0.0.1 is
the C-Group Address, and 1.1.1.1 is the Originating Router’s IP Address [31].
Route Type 4 - Leaf A-D Route The previous three Route Types mentioned the Leaf A-D Route. The Leaf A-D
Route is sent in response to an advertisement that contains the PMSI Tunnel Attribute with the “Leaf Information
Required” bit set to 1 in an Inter-AS I-PMSI A-D Route or in an S-PMSI A-D Route.
+-----------------------------------+
| Route Key (variable)              |
+-----------------------------------+
| Originating Router’s IP Address   |
+-----------------------------------+
The Route Key field carries the MCAST VPN NLRI information from whichever type of PMSI A-D Route it received
(either Inter-AS Inclusive or Selective). If the Tunnel Type from the received advertisement is Ingress Replication then
the Leaf A-D needs to set Ingress Replication in its PMSI Tunnel Attribute Tunnel Type field, and it also needs to
carry a label. This label will be placed on the stack by the ingress PE (the same one that originated the PMSI A-D
advertisement) so the MVPN traffic can be demultiplexed into the correct Multicast VRF by the egress PE (the same one
that originated the Leaf A-D advertisement). The Next Hop of the MP REACH NLRI in the Leaf A-D Message must be
set to the same IP that is in the Originating Router’s IP Address field. The Leaf A-D advertisement also contains an
IP-Based RT EC that is based on the IP address carried in the Next Hop field of the received PMSI A-D advertisement
(the sender PE’s IP address) in the Global Admin Field. The Local Admin field is set to zero [29, p. 29]. Zero is used
because the correct VRF can be determined by the corresponding Route information in the Route Key field [32].
An example of a Leaf A-D route as it is shown in a router’s routing table:
4:3:32:10.1.1.1:32:239.0.0.1:1.1.1.1:1.1.1.7, where 4 is the Route Type. In this example, after the 4:, the S-PMSI
MCAST VPN NLRI information is copied, which makes the Route Key field. The trailing 1.1.1.7 is the Originating
Router’s IP Address (of the PE that is sending the Leaf A-D advertisement) [31]. The scenario in this example is that
the PE 1.1.1.1 originated an S-PMSI A-D Route and the PE 1.1.1.7 is responding with a Leaf A-D Advertisement.
In a common scenario, an ingress (source) PE will originate a Type 3 S-PMSI A-D Route with the “Leaf Information
Required” bit set. Receiver PEs that have active receivers will respond with a Type 4 Leaf A-D Route. This is the
standard process when using S-PMSIs [30, p. 17].
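The exchange can be sketched from the egress PE’s point of view as follows. The dictionary fields and helper name are hypothetical stand-ins for the BGP and PIM machinery; they only mirror the decision described in the text.

def handle_spmsi_ad(route, local_receivers, local_ip):
    """Sketch of an egress PE reacting to a Type 3 S-PMSI A-D route.

    route holds the fields discussed above; local_receivers is the set of
    (C-S, C-G) flows this PE currently has receivers for.
    """
    flow = (route["c_source"], route["c_group"])
    if flow not in local_receivers:
        return None                      # no local interest: do not respond
    if not route["leaf_info_required"]:
        return None                      # sender is not asking for leaves
    # Respond with a Type 4 Leaf A-D route: Route Key = the received NLRI,
    # Originating Router's IP Address = this PE. (With Ingress Replication the
    # response must also carry a PMSI Tunnel Attribute and a label, omitted here.)
    return {"route_type": 4, "route_key": route["nlri"], "originator": local_ip}

leaf = handle_spmsi_ad(
    {"c_source": "10.1.1.1", "c_group": "239.0.0.1", "leaf_info_required": True,
     "nlri": "3:789:100:32:10.1.1.1:32:239.0.0.1:1.1.1.1"},
    {("10.1.1.1", "239.0.0.1")}, "1.1.1.7")
print(leaf)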
Route Type 5 - Source Active A-D Route The Source Active A-D Route is used to advertise if a PE has an active
source. The Source Active A-D Route is only used for groups outside the 232/8 range for SSM and only in conjunction
with Source Tree C-Multicast Join (Route Type 7) [29, p. 9]. When using the SSM range a PE will simply use the
Source Tree C-Multicast Route [32].
+-----------------------------------+
| Route Distinguisher (8 Octets)    |
+-----------------------------------+
| Multicast Source Length (1 Octet) |
+-----------------------------------+
| Multicast Source (variable)       |
+-----------------------------------+
| Multicast Group Length (1 octet)  |
+-----------------------------------+
| Multicast Group (variable)        |
+-----------------------------------+
The Source Active A-D Route is only used in conjunction with C-Trees when they switch from a shared tree to a source
tree, or when the C-Tree is only a source tree. Depending on the scenario the fields are populated differently, except the
RD field which takes the standard RD encoding from the Multicast VRF in Unicast BGP/MPLS format. In both cases
the Source and Group fields are the C-Source and C-Group addresses. However in the procedure that is solely source
tree the C-Source and C-Group are received from PIM Register messages¹. The MP REACH NLRI Next Hop is the
same as the address carried in the VRF Route Import EC of the unicast BGP/MPLS routes that are advertised by the
PE, and should carry the same Route Targets as the Intra-AS I-PMSI A-D Route the PE originates. The Source Active
A-D Route is propagated to all of the PEs of the MVPN [29, p. 46–47].
¹ It can also come from an MSDP Source-Active Message, but that is outside the scope of this report.
Source Tree Only
There are three ways that a PE can learn about an active multicast source in this scenario. One is for the PE to be a
C-RP. A second way is to use PIM Anycast RP procedures. Another way is to use MSDP to exchange the information
from the C-RP to the PE. Once a new source is learned using any of these methods the PE will send a Source Active A-D
route to all PEs within the same MVPN [29, p. 49–52]. This is the default method for NG-MVPN. PEs with receivers
for the C-Group in the Source Active message will respond with a Type 7 C-Multicast route toward the ingress PE [2,
p. 162].
Shared Tree changing to Source Tree
In certain situations the default method is not suitable. One such situation is when the C-RP is not on a PE and MSDP
is not used. In this case a Shared Tree method is used where Joins are sent to the RP. In NG-MVPN the Type 6 Shared
Tree C-Multicast Route is used instead of a Type 7 Route. These Type 6 messages contain the (C-*,C-G) information
and are forwarded from the PE with a receiver to the PE that is attached to the Customer VPN site of the C-RP [2,
p. 164]. At this point the C-RP is sending traffic to its PE and the PE is forwarding this traffic to all the PEs on that
I-PMSI. The PE with the C-Source then sends its packets to the C-RP with PIM register messages. The PE with the
C-RP attached will then send (C-S,C-G) messages over the I-PMSI to all the PEs. Any C-Receiver off the other PEs will send (S,G) PIM Joins to their respective PE, which will then forward them as (C-S,C-G) C-Multicast Routes (Type 7
Source Tree) to the PE with the C-Source. This PE will then start sending traffic onto the I-PMSI, while the C-RP is
also sending traffic. Recall that the I-PMSI includes all PEs. As a result a PE may receive traffic from both the C-RP
PE and the C-S PE over the PMSI. To prevent this the Source Active A-D route is used. Whenever a PE creates a (C-S,C-G) state within its VRF, because of reception of the Source Tree C-Multicast Route, it originates the Source Active
route to all the PEs of that MVPN. As a result, the PEs that receive the Source Active advertisement, that have active
receivers, will accept traffic from the PE with the C-Source instead of the PE with the C-RP. The PE connected to the
C-RP will stop forwarding any traffic for that specific (C-S,C-G) as a result of receiving the Source Active advertisement
[28, p. 63–67].
[Figure 3.3: Shared Tree to Source Tree Switchover using Source Active A-D Routes — PEs 1–7 with the C-S, C-RP, and C-R hosts at customer sites A1–A4]
Consider the simple topology in figure 3.3. PE1 is attached to the C-Source and PE3 is connected to the C-RP. PE1 is
forwarding the traffic to PE3 which is then forwarding the traffic to PEs 6 and 7 which have C-Receivers attached to
them over a PMSI. The C-Receiver attached to PE6 may send an (S,G) PIM Join that gets translated to a Type 7
(C-S,C-G) Source Tree C-Multicast Route by PE6 and then forwarded to PE1. Upon reception PE1 will start forwarding
the traffic onto the PMSI. To prevent the scenario described above where PE6 and PE7 receive traffic from both the
C-Source and the C-RP, PE1 will send a Source Active A-D Route to all the PEs. PE6 and PE7 will select PE1 as their
sender, and PE3 will cease forwarding traffic onto the PMSI for that particular (C-S,C-G).
Handling a Source Active A-D Route For Both Methods
When a PE receives a Source Active A-D Route it will put the route in the Multicast VRF with the corresponding RTs. It
will also check to see if a matching (C-*,C-G) entry is present. If one is present it will use the tunnel of the corresponding
Source Active A-D Route in the forwarding path to receive traffic. When the PE receives a C-Multicast PIM Join from
the CE it will install the (C-*,C-G) state in the MVPN TIB and check if there is a corresponding Source Active A-D
Route. If there is one present it will set up the forwarding path to receive traffic from the tunnel of corresponding Source
Active A-D Route. In both cases the (C-*,C-G) entry must have an associated PE-CE Attachment Circuit within that
Multicast VRF [29, p. 47–48].
An example of a Source Active A-D route as it is shown in a router’s routing table: 5:789:100:32:10.1.1.1:32:239.0.0.1, where 5 is the Route Type, 789:100 is the originating router’s RD, 32 is the length
of the address and 10.1.1.1 is the C-Source Multicast Address and 239.0.0.1 is the C-Group Address [31]. Regardless of
how the fields were populated they will appear the same in the Multicast VRF routing table.
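The two receive-side checks just described can be summarized with a small, purely illustrative state-keeping sketch; the class and attribute names are invented for the example and do not come from any implementation.

class MvpnTibFragment:
    """Toy fragment of a Multicast VRF's TIB for Source Active handling."""

    def __init__(self):
        self.shared_state = {}      # C-G -> forwarding entry for (C-*, C-G)
        self.source_active = {}     # (C-S, C-G) -> P-Tunnel of the SA route's originator

    def on_source_active(self, c_source, c_group, tunnel):
        # Store the route, then check for a matching (C-*, C-G) entry.
        self.source_active[(c_source, c_group)] = tunnel
        entry = self.shared_state.get(c_group)
        if entry is not None:
            entry["upstream_tunnel"] = tunnel   # receive from the SA route's tunnel

    def on_pim_shared_join(self, c_group, attachment_circuit):
        # Install (C-*, C-G) state, then check for a matching Source Active route.
        entry = {"oif": {attachment_circuit}, "upstream_tunnel": None}
        self.shared_state[c_group] = entry
        for (c_s, c_g), tunnel in self.source_active.items():
            if c_g == c_group:
                entry["upstream_tunnel"] = tunnel
        return entry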
Route Type 6 and Route Type 7 - Shared and Source C-Multicast Route C-Multicast Routes are created in
response to the creation of C-PIM states on a PE within a Multicast VRF. The encodings for Route Types 6 and 7 are the same, differing only in the Customer Source Address field.
+-----------------------------------+
| Route Distinguisher (8 Octets)    |
+-----------------------------------+
| Source AS (4 Octets)              |
+-----------------------------------+
| Multicast Source Length (1 Octet) |
+-----------------------------------+
| Multicast Source (variable)       |
+-----------------------------------+
| Multicast Group Length (1 octet)  |
+-----------------------------------+
| Multicast Group (variable)        |
+-----------------------------------+
The RD field consists of the standard Unicast BGP/MPLS encoding. The Source AS field contains the AS Number of
the PE that originated the advertisement. The Multicast Group is always the C-Multicast Group Address. If it is a Type
6 Shared Tree C-Multicast Route the C-Multicast Source is the address of the C-RP. If it is a Type 7 Source Tree
C-Multicast Route the address consists of the C-Source Address for that group.
A PE creates a Shared Tree Join C-Multicast Route when the C-PIM instance creates a (C-*,C-G) state. If this state is
deleted the PE can send a C-Multicast advertisement using the MP UNREACH NLRI attribute. A PE will create and
delete a Source Tree C-Multicast Route once the C-PIM instance creates a (C-S,C-G) state using similar methods to
the (C-*,C-G) state. Again, the difference is that with the (C-*,C-G) Shared Tree state the C-Source Address of the
advertisement is the C-RP, and in the (C-S,C-G) case it is the C-Source Address. There is a special case where mLDP is
the C-Instance Protocol (between the CE and PE). In that case there will be an mLDP state with the P2MP FEC, and
the C-Source Address is the P2MP FEC.
All three cases (Shared, Source, and mLDP) are the same for constructing the rest of the C-Multicast Route. The
local PE will select the best Upstream Multicast Hop (UMH) route and pull the following information: the ASN that is
carried in the Source AS Extended Community of the UMH route and the C-Multicast Import RT of the upstream PE
(which is from the value of the VRF Route Import EC of the UMH route). The UMH route was also described as the Unicast BGP/MPLS VPN Route that represents the source of the C-Multicast flow. UMH routes and selection are
discussed in detail in section 3.3.3. The RD of the C-Multicast Route is set to the RD of the UMH route that contains
the subnet for the C-Multicast Source Address. The C-Multicast Route also constructs an RT that is set to the value of
the C-Multicast Import RT (the value of the C-Multicast Import RT, the VRF Route Import EC, and the last RT are the
same).
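A compact sketch of that construction, with the selected UMH route modeled as a dictionary; every field name here is illustrative rather than an actual data structure from the RFC or from any implementation.

def build_type7_route(c_source, c_group, umh_route):
    """Turn a local (C-S, C-G) state into a Type 7 Source Tree C-Multicast route."""
    return {
        "route_type": 7,
        "rd": umh_route["rd"],                       # RD taken from the UMH route
        "source_as": umh_route["source_as_ec"],      # from the Source AS EC of the UMH route
        "c_source": c_source,
        "c_group": c_group,
        # The route is tagged with an RT equal to the upstream PE's C-Multicast
        # Import RT, i.e. the value of the VRF Route Import EC of the UMH route.
        "route_target": umh_route["vrf_route_import_ec"],
    }

route = build_type7_route("10.1.1.1", "239.0.0.1",
                          {"rd": "789:100", "source_as_ec": 789,
                           "vrf_route_import_ec": "1.1.1.1:100"})
print(route)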
If the local and source PEs are in different AS networks then the AS number of the source PE is used, and the RD is
taken from the Inter-AS I-PMSI A-D route for the corresponding C-Multicast Route. An ASBR can use the RD and
Originating IP Address information to propagate the C-Multicast Route.
When a PE receives a Shared Tree or Source Tree C-Multicast Route it will check to see if any of the RTs in the
Extended Communities of the route match the C-Multicast Import RT of the VRF. It will then create the (C-*,C-G) or
(C-S,C-G) state in the VRF (assuming the RTs match for that VRF) and then bind either an I-PMSI or S-PMSI to that
route depending on the PE’s configuration. If a withdrawal message (MP UNREACH NLRI) is received then the PE
must remove the (C-*,C-G) or (C-S,C-G) state in the VRF. If the C-Group is in the non-SSM range then a timer is used
to delay the removal. This is done so that the PE will continue forwarding traffic over the PMSI until all the PEs have
received the withdrawal of the Source Active A-D route for a given (C-S,C-G) [29, p. 32–39].
Examples of the routes for both Shared and Source C-Multicast Routes: 6:789:100:789:32:1.1.1.4:32:239.0.0.1, where 6
is the Route Type, 789:100 is the originating router’s RD, the following 789 is the Source AS, 32 is the length of the
address and 1.1.1.4 is the C-Source Multicast Address as the C-RP and 239.0.0.1 is the C-Group Address.
7:789:100:789:32:10.1.1.1:32:239.0.0.1, where 7 is the Route Type, 789:100 is the originating router’s RD, the following
789 is the Source AS, 32 is the length of the address and 10.1.1.1 is the C-Source Multicast Address as the C-Source
and 239.0.0.1 is the C-Group Address [31].
3.3.3 MP-BGP for PE-PE Upstream Multicast Hop
When a PE receives a PIM C-Join or C-Prune message from a CE, the message contains a (*,G) or (S,G) flow. If the source of
this flow, or the RP, is across the MVPN of the SP network then the PE needs to find the “Upstream Multicast Hop”
(UMH). The UMH is the PE where the traffic enters the network. This could be the PE where the (*,G) packets enter the network in the case of a shared tree and an RP, the actual source in the case of an (S,G) source tree, or an ASBR.
RFC 6513 refers to both the (*,G) RP source or the (S,G) source as the C-Root. This report will follow the same
convention. The process of selecting the UMH for a given C-Root is called the “upstream multicast hop selection.”
UMH selection can be done by PIM or BGP, but this report only focuses on the BGP method.
3.3.3.1 BGP for Upstream Multicast Hop Selection
In a simple case the PE does the UMH selection by checking the unicast routing table of the VRF that the PE-CE
Attachment Circuit is in. However sometimes a customer will choose to use a separate set of unicast routes. In this case
the PE-CE relationship may share unicast routes using MP-BGP and SAFI 2² or OSPF with a Multi-Topology Identifier
(the cases are not limited to these two protocols). In this case an MVPN can have two separate VRFs, one for the
unicast and one for the routes used for UMH. While the same BGP SAFI can be used to send this traffic to both VRFs
across the backbone³, RFC 6513 uses a new MP-BGP Address Family (AF), referred to as “Multicast for BGP/MPLS
IP Virtual Private Networks (VPNs)” [28, p. 25–26]. This AF should not be confused with the MVPN Address Family
from section 3.3.2.2 used for the various autodiscovery/binding and C-Multicast Routes.
The SAFI for this AF is 129. The NLRI of this MP REACH NLRI is a Length field and a Prefix field. The length field
determines whether the prefix is IPv4 or IPv6, and the prefix is an RFC 4364 RD prepended to the IP address. These routes must
also carry the Source AS Extended Community and the VRF Route Import Extended Community, as with the Unicast
BGP/MPLS Routes [29, p. 31–32].
3.3.3.2 Upstream Multicast Hop Selection
After a PE receives a C-Join message it looks in the Multicast VRF. In the VRF it looks at all the UMH routes and
determines the best match for the C-Root from within that C-Join (matching the source Address or the RP address).
For the matching routes the PE determines the Upstream PE and RD. The Upstream PE is determined from the VRF
Route Import EC, or if that is not included, the route’s BGP Next Hop. In both cases the RD is taken from the route’s
² SAFI 2 is the value for Multicast Routes. However, these are just unicast routes that are used specifically for multicast purposes and are kept in their own routing table.
³ In which case RFC 6514 recommends using the same RD between the unicast and UMH VRF on the same PE, but a different RD for the set on different PEs.
NLRI. This creates a set of 3-tuples of Route, Upstream PE, and Upstream RD. All of the routes in this set are called the “UMH Route Candidate Set”. A router must choose the best Route out of the set, which results in the “Selected UMH Route,” and the corresponding “Selected Upstream PE” and “Selected Upstream RD” [28, p. 27]. When Inter-AS methods are used the UMH and the Selected Upstream PE are different. In this case the UMH is the ASBR IP address
[28, p. 29].
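A sketch of the candidate-set construction and selection is shown below. The longest-prefix match and the tie-break (picking the numerically highest Upstream PE address) are simplifying assumptions made for the example; the specification allows other deterministic selection rules, and the route dictionaries are illustrative.

import ipaddress

def umh_candidate_set(vrf_umh_routes, c_root):
    """Build the UMH Route Candidate Set of (Route, Upstream PE, Upstream RD) tuples."""
    root = ipaddress.ip_address(c_root)
    matches = [r for r in vrf_umh_routes
               if root in ipaddress.ip_network(r["prefix"])]
    if not matches:
        return []
    best = max(ipaddress.ip_network(r["prefix"]).prefixlen for r in matches)
    candidates = []
    for r in matches:
        if ipaddress.ip_network(r["prefix"]).prefixlen != best:
            continue
        # Upstream PE from the VRF Route Import EC, else the route's BGP Next Hop
        upstream_pe = r.get("vrf_route_import_ip") or r["bgp_next_hop"]
        candidates.append((r, upstream_pe, r["rd"]))   # RD comes from the route's NLRI
    return candidates

def select_umh(candidates):
    """Pick the Selected UMH Route (one deterministic tie-break shown)."""
    return max(candidates, key=lambda c: ipaddress.ip_address(c[1]))

routes = [{"prefix": "10.1.0.0/16", "rd": "789:100", "bgp_next_hop": "1.1.1.1",
           "vrf_route_import_ip": "1.1.1.1"},
          {"prefix": "10.1.1.0/24", "rd": "789:100", "bgp_next_hop": "1.1.1.4",
           "vrf_route_import_ip": "1.1.1.4"}]
print(select_umh(umh_candidate_set(routes, "10.1.1.1")))   # picks the /24 via PE 1.1.1.4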
3.4 Forwarding Plane Considerations
As in RFC 4364 for Unicast BGP/MPLS VPNs, RFC 6513 decouples the methods for exchanging control/routing
information from the methods for encapsulating and forwarding the traffic. The P-Tunnels supported can be encapsulated
in MPLS, IP, or GRE and can be signaled by PIM (using GRE encapsulation) and MPLS (RSVP-TE and mLDP) [28,
p. 11]. In line with the separation of control and forwarding, the PMSI is the control plane component that binds the traffic to a P-Tunnel (as a P-Tunnel can carry more than one PMSI). The P-Tunnel forwarding plane is the component that
handles the encapsulation and forwarding of the traffic through the network. In the case of MPLS the concepts discussed
in Chapter 1 are used to build the tunnel. No new extensions are required for NG-MVPN. In the case of PIM the
concepts discussed in Chapter 2 are used to build the tunnel. PIM P-Tunnels in NG-MVPN are very similar to the ones in
DR-MVPN. A PE router will use the PMSI information from the BGP A-D routes in conjunction with the PMSI Tunnel
Attribute to determine which P-Tunnel is used for a particular customer stream [2, p. 159].
3.4.1 Tunnel Type 1 - RSVP-TE P2MP LSP
Only the headend PE for an RSVP-TE LSP sends Intra-AS I-PMSI A-D Routes with the Tunnel Attribute included. All
other PEs send Intra-AS I-PMSI A-D Routes without the PMSI tunnel attributes. The headend PE, after receiving the
Intra-AS I-PMSI A-D Routes without the PMSI Attribute, will build the RSVP-TE sub-LSPs of the P2MP LSP to each
PE that originated the routes. If an S-PMSI is being used then the headend PE will send an S-PMSI A-D Route with the
“Leaf Information Required” bit set. This will result in a Leaf A-D Route and the headend router will use this to bind a
C-Flow to that S-PMSI and build the LSP. The PMSI Tunnel Attribute contains the Tunnel Type set to RSVP-TE P2MP,
the RSVP-TE P2MP Session Object, and optionally a P2MP Sender Template Object⁴ [28, p. 39–40]. Penultimate Hop
Popping (PHP) must be disabled so that the MPLS label is carried all the way to the PE. This is because the label is
used to correlate the traffic carried by the LSP to its VRF.
3.4.2 Tunnel Type 2 - mLDP P2MP LSP
When using mLDP the A-D Routes carry a PMSI Tunnel Attribute identifying the use of an mLDP P2MP LSP. The
Tunnel Identifier is set to the mLDP P2MP FEC Element [28, p. 42]. The setup process for I-PMSI and S-PMSI tunnels
is the same as the RSVP-TE case. However, the egress PE initiates the LSP construction [2, p. 248–250].
3.4.3 Tunnel Type 3 - PIM-SSM
When PIM-SSM is used to create the P-Tunnel the PMSI Tunnel Attribute states that PIM-SSM is used [28, p. 40]. The
Tunnel Identifier is the IP Address of the PE that is attached to the C-Source, which is used as the P-Source Address for
the IP/GRE encapsulation, and the P-Group Address. When S-PMSIs are being created the PE routers should have a
set of P-Group Addresses that can be used to create the tunnels [28, p. 41].
3.4.4 Tunnel Type 4 - PIM-SM
When PIM-SM is used to create the P-Tunnel the PMSI Tunnel Attribute states that PIM-SM is used and uses the
P-Group Address. The PE at the root of the shared tree sends out the Intra-AS I-PMSI A-D Routes [28, p. 41]. The
⁴ This is used to identify a particular P2MP TE LSP.
information in the Tunnel Identifier field of the PMSI Attribute is the Sender Address (the IP address of the originating
PE) and the P-Group address. The Sender Address is used as the P-Source Address for the IP/GRE encapsulation [29,
p. 12]. As is the case with PIM-SSM, when S-PMSIs are being created the PE routers should have a set of P-Group
Addresses that can be used to create the tunnels. However in the PIM-SM case each PE must have a unique set of
addresses [28, p. 41].
3.4.5 Tunnel Type 6 - Ingress Replication
In this type of P-Tunnel the ingress PE replicates C-Traffic and then puts it onto point-to-point unicast tunnels, one to each PE. IP/GRE or MPLS can be used as the tunnel technology. The PE routers still send Intra-AS I-PMSI A-D Routes. The PMSI Tunnel Attribute will identify Ingress Replication, and in this case must also carry an
MPLS label. This label is used to identify the proper VRF at the egress PE [28, p. 42–43].
3.4.6 P-Tunnel Aggregation
As mentioned earlier in the report, multiple PMSIs can be aggregated into one P-Tunnel using MPLS. In essence, an
outer tunnel is built using the processes described earlier in the report. These are built using downstream allocated labels.
This is because the downstream LSR (with traffic flowing from ingress PE to egress PE as upstream to downstream in
the context of VPN) originally advertised the label toward the upstream LSR. To support aggregation, a new concept
called “upstream label allocation” is used, which is defined in RFC 5331. In this model the upstream LSR allocates and
advertises the label being used [33, p. 1-11].
In NG-MVPN, BGP is used to send the upstream allocated label. The label is contained within the PMSI Tunnel
Attribute. Intra-AS I-PMSI [28, p. 17], Inter-AS I-PMSI [28, p. 22], and S-PMSI A-D routes [28, p. 42] all can distribute
the upstream allocated label. This MPLS label is below the downstream allocated MPLS label used to build the outer
LSP, which is the aggregate LSP. The egress PE uses this label to demultiplex the traffic to the correct VRF. The outer
LSP must advertise a regular MPLS label at the last hop. It cannot advertise an Implicit Null or Explicit Null label [28,
p. 35–38].
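The two-label lookup that results at the egress PE can be pictured with the toy sketch below; the label values and the table layout are invented for the example and simply mirror the behaviour described above.

def egress_lookup(label_stack, aggregate_lsp_label, upstream_label_to_vrf):
    """Demultiplex an aggregated P-Tunnel at the egress PE.

    label_stack is the received MPLS label stack, outermost label first.
    The outer label identifies the aggregate LSP (so it must be a real label,
    not implicit/explicit null); the next label was upstream-allocated by the
    ingress PE and selects the Multicast VRF.
    """
    outer, inner = label_stack[0], label_stack[1]
    if outer != aggregate_lsp_label:
        return None                       # not traffic from this aggregate LSP
    return upstream_label_to_vrf.get((outer, inner))

print(egress_lookup([2001, 30], 2001, {(2001, 30): "Customer-A-MVRF"}))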
3.5 Global Table Multicast
Global Table Multicast is an IETF specification, currently in draft status at the time of this writing, that uses the
NG-MVPN methodology to create multicast provider tunnels in an SP network without the use of VRFs. A common
name for the main table outside of VRFs is called the “global table,” hence the name Global Table Multicast (GTM).
GTM is sometimes also called “Internet Multicast” but the GTM IETF draft (“Global Table Multicast with BGP-MVPN
Procedures”) avoids the use of the term since the use of Internet implies that the multicast streams carried by the
provider are available to the entire public Internet.
GTM separates the network into a “core network” that is surrounded by one or more non-core parts of the network
called “attachment networks.” Between the core and attachment networks is the Protocol Border Router (PBR). The
PBR translates the protocols used in the core network (e.g. BGP) to the protocols used in the attachment network (e.g.
PIM), and it gets its name as it sits at the protocol boundary. The routers in the attachment network that attach to
the PBRs are referred to as Attachment Routers (ARs). A PBR isn’t necessarily an edge router in the PE sense, as in
NG-MVPN and regular Unicast BGP/MPLS VPNs. The PBR does mark the border of any tunnels that are used to
transport multicast traffic across the core [34, p. 4–5].
3.5.1 Use of NG-MVPN BGP Procedures in GTM
Global Table Multicast PBRs use the same procedures described in NG-MVPN for PE routers. The PE-CE Attachment Circuit (AC) should be considered to be any circuit that attaches an AR to a PBR (PBR-AR), and the backbone network in NG-MVPN should be considered to be the core network between the PBRs. Some adaptations are required [34, p. 6].
[Figure 3.4: GTM Network Topology — ARs peer with PBRs via PIM; the PBRs exchange MP-iBGP and carry a P-Tunnel across the “core”]
Figure 3.4 shows a high level diagram of the separation between the “core,” where the GTM procedures are carried out,
and the AR routers that attach to the PBRs. The AR can simply be another router within the same AS; it does not need to be a CE router, and a source can also simply connect directly to a PBR.
3.5.1.1 Route Distinguishers and Route Targets
The MCAST-VPN BGP Routes (SAFI 5 MP REACH NLRI Path Attribute) from NG-MVPN have a Route Target (RT)
field and a Route Distinguisher (RD) field in the NLRI. The RD must be set to zero.
Recall that NG-MVPN has two types of RTs: The C-Multicast RT Extended Community (EC) and the Unicast
BGP/MPLS VPN Import/Export RT. The C-Multicast RT is carried by Extended Communities in C-Multicast Shared Tree Routes, C-Multicast Source Tree Routes, and Leaf A-D Routes, and identifies the PE router
that has been selected by the route’s originator as the Upstream PE or UMH. This RT has a Global Admin Field, which
identifies the Upstream PE or UMH and a Local Admin Field which is a unique value that identifies a specific VRF. GTM
requires the use of the C-Multicast RT, however with the Local Admin field set to zero to imply that the Global Table is
being used and not a VRF. The Global Admin Field remains the same. This version of the C-Multicast RT is referred
to as the PBR-Identifying RT. The Unicast BGP/MPLS VPN Import/Export RT is optional. If this RT is used and
configured for the Global Table, then the values must match, and should be unique from any Import/Export RTs used
for NG-MVPN [34, p. 6-8].
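As a final illustration, the PBR-Identifying RT could be composed as below; the leading octets (the IPv4-address-specific Route Target encoding) and the example address are assumptions of this sketch.

import struct
import socket

def pbr_identifying_rt(pbr_ip):
    """GTM form of the C-Multicast RT: Global Administrator field = the PBR's
    IP address, Local Administrator field = 0 to mean 'global table, not a VRF'."""
    return (struct.pack("!BB", 0x01, 0x02)     # assumed IPv4-address-specific RT encoding
            + socket.inet_aton(pbr_ip)
            + struct.pack("!H", 0))

print(pbr_identifying_rt("1.1.1.7").hex())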
3.5.1.2 UMH-Eligible Routes
NG-MVPN specified that UMH-Eligible Routes use SAFI 128 (Unicast BGP/MPLS VPN) or SAFI 129 (Multicast
BGP/MPLS VPN). These are the VPN specific routes that are contained within a VPN and require the use of RTs.
GTM specifies that the UMH-Eligible Routes are of SAFI 1 (Unicast), 2 (Multicast) or 4 (MPLS Labeled), and they do
not require the use of RTs. No new procedures are required for these routes to be imported into the Global Table of a
PBR.
Recall that NG-MVPN described that the PE looks up the C-Root address (either the C-Source or the C-RP) in the
Global Table and finds the best matches and these are the UMH-Eligible Routes. This is done to determine the UMH,
Upstream PE, Upstream RD, and Source AS of the flow. GTM will use the routes of SAFI 2 if available; if not, it will use
routes from SAFI 1 or SAFI 4 (which are considered equal according to BGP best path selection). The same NG-MVPN
procedures are used to find the Selected UMH Route. The Upstream RD is always assumed to be zero.
The UMH-Eligible Routes in GTM may carry the VRF Route Import EC and/or the Source AS EC. If these are carried
then the Upstream PBR and Source AS are identified from these ECs respectively. If the UMH-Eligible Route is not
carrying the Source AS EC the AS is considered to be the local AS. If the UMH-Eligible Route does not carry the VRF
Route Import EC, then the following optional procedure is used: a PBR advertises a route to itself carrying a VRF Route
Import EC with an IP address in its Global Administrator field that is set to the same IP address as the Next Hop and the
NLRI address in that route that its advertising to itself. Refer to this as “Route R”. The PBR then advertises “Route R”
to other PBRs within the network. When a PBR looks up a route that does not contain the VRF Route Import EC it
looks up a route that contains the Next Hop, and should find “Route R” that was advertised by all of the PBRs. From
“Route R” it can determine the upstream PBR from the PBR-Identifying RT found within. Each PBR will perform this
process.
In some cases the UMH-Eligible Route can be learned outside of BGP. For example, the C-Root address may be found in
the IGP links state database, or the C-Root next-hop interface may be a Traffic Engineering tunnel [34, p. 9-12].
3.5.1.3 BGP Autodiscovery Routes
Some special considerations may be needed for the various A-D Routes [34, p. 14–17].
Intra-AS I-PMSI A-D Routes In addition to the conditions when an NG-MVPN implementation does not need to
distribute Intra-AS I-PMSI A-D Routes, GTM specifies that these routes do not need to be distributed when I-PMSIs are
not being used, and when Shared and Source Tree C-Multicast Routes never have their Next Hop field change. The changes to RD and RT usage in section 3.5.1.1 also apply.
Inter-AS I-PMSI Routes There are no additional procedures for GTM, except for the sections on RD and RT usage.
S-PMSI Routes There are no additional procedures for GTM, except for the sections on RD and RT usage.
Leaf A-D Routes There are no additional procedures for GTM, except for the sections on RD and RT usage.
Source Active A-D Routes The changes in section 3.5.1.1 apply. In NG-MVPN there is the assumption that
no two routes will have the same RD unless they come from the same PE. However in GTM the RD is always set to
zero, so all RDs will match. A special procedure is used for GTM. A PBR can attach a VRF Route Import EC to the
route. If this is the case, a BGP speaker distributing the route can change the Next Hop, otherwise the BGP speaker
may not change the Next Hop. An egress PBR that receives the route can either use the VRF Route Import EC if it is
available, or it may use the Next Hop of the originating PBR if it is not available (hence the requirement for a BGP speaker
to not change the Next Hop if there is no VRF Route Import EC for that route).
3.5.1.4 BGP C-Multicast Routes
In GTM environments when it is known in advance that the Next Hop of a route will not change as it propagates through
the BGP speakers, the procedure for creating the IP-Address-Specific RT is to just use the IP address of the Upstream
PBR in the Global Admin field of the RT. Otherwise the process from NG-MVPN is used, where the IP-Address-Specific
RT is based on the Next Hop of a Type 1 or Type 2 I-PMSI Route [34, p. 17].
3.5.2 Inclusive and Selective Tunnels
GTM allows the use of both Inclusive and Selective Tunnels. The specification does advise that using Inclusive Tunnels
should be carefully considered for reasons of scale. If there is a large set of PBRs then the exclusive use of Selective
Tunnels may be a better approach [34, p. 14].
Chapter 4
Summary
The previous two chapters explored Draft Rosen MVPNs (DR-MVPNs) and BGP/MPLS MVPNs (NG-MVPNs). Both of
these utilized concepts from Chapter 1, “Building Blocks,” which discussed the various Protocol Independent Multicast
(PIM) technologies, Border Gateway Protocol (BGP), Multiprotocol Label Switching (MPLS), and the combination of
BGP and MPLS to form Unicast Virtual Private Networks.
4.1 Compare and Contrast
Both DR-MVPNs and NG-MVPNs allow customer multicast traffic to be carried across an SP network.
Selecting the ideal method is up to the operator. If a network does not already utilize MPLS, then DR-MVPNs may be
the better choice over deploying MPLS. However, in a network that uses MPLS and already deploys Unicast BGP/MPLS
VPNs, NG-MVPNs are the better choice. NG-MVPNs are a newer technology, so networks that use older equipment
may need to use DR-MVPNs until upgrades can be made.
DR-MVPN relies heavily on PIM to set up the P-Tunnels within the Service Provider (SP) network. BGP is mainly used
for special cases, for example when PIM-SSM is used and the source needs to be advertised across the network. In
contrast, NG-MVPNs specify the use of BGP/MPLS Unicast VPNs to build the P-Tunnels.
Encapsulation in DR-MVPN uses GRE to encapsulate the customer PIM messages into a new IP packet using a different IP
address, one that is part of the SP network. NG-MVPN also allows the use of GRE but also other encapsulation methods
using MPLS as well as optionally using pre-existing point-to-point tunnels in the case of Ingress Replication.
4.2 Receiver Sites: All or Some
Both DR-MVPNs and NG-MVPNs have methods of sending traffic to all sites using the same multicast interface or
only select sites. These are Default MDTs and Data MDTs for DR-MVPNs or Inclusive PMSIs and Selective PMSIs for
NG-MVPNs. In the case of Default MDTs and Inclusive PMSIs, traffic may be sent to sites that do not have active
receivers. Considering that the point of multicast is to only send traffic to sites with active receivers these methods may
seem excessive. However, both have their place.
DR-MVPNs require the Default MDT to build the connectivity to the various sites. Data MDTs cannot send control
traffic. In this case the Default MDT is mandatory. The Data MDTs can then be used, after being signaled over the
Default MDT, to better scale larger traffic flows. NG-MVPNs allow for only Selective PMSIs to be established. Even
though an Inclusive PMSI is not mandatory for signaling, it still has uses. An example is a customer with low bandwidth
requirements. In this case there isn’t much burden being placed on the network by sending the traffic to all customer
sites, even if not all sites have active multicast receivers. Some extra use of resources is traded for
avoiding the need to add more multicast state to the SP network. The same is true for Default MDTs. Another case is
simply that all sites actually do need the traffic, in which case the Default MDT and Inclusive PMSI make sense. Both
DR-MVPNs and NG-MVPNs allow for the dynamic creation of Data MDTs or Selective PMSIs, respectively.
For both technologies, the use of Data MDTs or Selective PMSIs comes down to the operator’s scaling preferences.
NG-MVPNs can further increase scale in the SP network by allowing for the aggregation of P-Tunnels.
4.3 NG-MVPN vs GTM
The methods used in NG-MVPN were extended or modified to create Global Table Multicast (GTM). NG-MVPNs allow
for the multicast traffic to be tunneled through the network using MPLS, which allows for the multicast traffic to traverse
a network that does not have PIM or BGP in the core. The need for GTM over NG-MVPN becomes apparent in very
large networks that are also not carrying external customer traffic. In a larger network the configuration of VRFs and the
parameters for building P-Tunnels required for NG-MVPN can become burdensome. If the operator is trying to distribute
its own traffic and not customer traffic, VRFs are likely not necessary. However, the mechanics of NG-MVPNs
require information tied to VRFs to operate. With GTM these mechanics are modified so that the information isn’t
required and the Global Table of a router can be used to originate and accept the BGP MVPN routes. A good use case
for this is Internet Protocol Television (IPTV) for a cable company. The content can be originated on the company’s own
routers and does not need to be isolated or distributed to only specific routers for a specific business customer. If the
traffic is going to all of the potentially thousands of routers that terminate TV subscribers, having to build VPNs to each
router is an enormous task. With GTM this task can be eliminated and the technology simply needs to be enabled on
the edge routers connected to the source and the subscribers. The TV content can then be pushed to all routers using
efficient multicast replication through a core that does not have PIM or BGP state. Alternatively, PIM and BGP can be
configured in the core for uses independent from the VPN or MVPN services, allowing for the subscriber TV content to
be distributed independently from the core PIM and BGP instances.
4.4 Conclusion
DR-MVPNs and NG-MVPNs give an SP the ability to offer new services to its customers in a scalable manner,
while GTM builds on NG-MVPNs to allow an SP to lower operational burden if per-customer isolation is not required.
Each technology can be considered when multicast needs to be deployed in a network. With these technologies the reach of multicast extends further than ever, enabling more multicast applications to reach even more people.
References
[1] Pete Loshin. TCP/IP Clearly Explained. Morgan Kaufmann, 2002.
[2] Vinod Joseph and Srinivas Mulugu. Deploying Next Generation Multicast-Enabled Applications. Morgan Kaufmann,
2011.
[3] S. Deering. Host Extensions for IP Multicasting. RFC 1112, RFC Editor, August 1989.
[4] Daniel Minoli. IP Multicast With Applications To IPTV and Mobile DVB-H. John Wiley & Sons, Inc., 2008.
[5] D. Meyer and P. Lothberg. GLOP Addressing in 233/8. RFC 3180, RFC Editor, September 2001.
[6] H. Holbrook and B. Cain. Source-Specific Multicast for IP. RFC 4607, RFC Editor, August 2006.
[7] B. Cain, S. Deering, I. Kouvelas, B. Fenner, and A. Thyagarajan. Internet Group Management Protocol, Version 3.
RFC 3376, RFC Editor, October 2002.
[8] H. Holbrook, B. Cain, and B. Haberman. Using Internet Group Management Protocol Version 3 (IGMPv3) and
Multicast Listener Discovery Protocol Version 2 (MLDv2) for Source-Specific Multicast. RFC 4604, RFC Editor,
August 2006.
[9] W. Fenner. Internet Group Management Protocol, Version 2. RFC 2236, RFC Editor, November 1997.
[10] B. Fenner, M. Handley, H. Holbrook, and I. Kouvelas. Protocol Independent Multicast - Sparse Mode (PIM-SM):
Protocol Specification (Revised). RFC 4601, RFC Editor, August 2006.
[11] Wendell Odom, Rus Healy, and Denise Donohue. CCIE Routing and Switching Certification Guide. Cisco Press,
Fourth edition, 2010.
[12] A. Adams, J. Nicholas, and W. Siadak. Protocol Independent Multicast - Dense Mode (PIM-DM): Protocol
Specification (Revised). RFC 3973, RFC Editor, January 2005.
[13] Ina Minei and Julian Lucek. MPLS-Enabled Applications. John Wiley & Sons, Inc., Third edition, 2011.
[14] IJ. Wijnands, I. Minei, K. Kompella, and B. Thomas. Label Distribution Protocol Extensions for Point-to-Multipoint
and Multipoint-to-Multipoint Label Switched Paths. RFC 6388, RFC Editor, November 2011.
[15] R. Aggarwal, D. Papadimitriou, and S. Yasukawa. Extensions to Resource Reservation Protocol - Traffic Engineering
(RSVP-TE) for Point-to-Multipoint TE Label Switched Paths (LSPs). RFC 4875, RFC Editor, May 2007.
[16] Russ White, Danny McPherson, and Srihari Sangli. Practical BGP. Pearson Education, Inc., 2005.
[17] Y. Rekhter, T. Li, and S. Hares. A Border Gateway Protocol 4 (BGP-4). RFC 4271, RFC Editor, January 2006.
[18] T. Bates, R. Chandra, D. Katz, and Y. Rekhter. Multiprotocol Extensions for BGP-4. RFC 4760, RFC Editor,
January 2007.
[19] E. Rosen and Y. Rekhter. BGP/MPLS IP Virtual Private Networks (VPNs). RFC 4364, RFC Editor, February 2006.
[20] Peter Tomsu and Gerhard Wieser. MPLS-Based VPNs. Prentice Hall PTR, 2002.
[21] Randy Zhang and Micah Bartell. BGP Design and Implementation. Cisco Press, 2004.
[22] S. Sangli, D. Tappan, and Y. Rekhter. BGP Extended Communities Attribute. RFC 4360, RFC Editor, February
2006.
[23] Y. Rekhter and E. Rosen. Carrying Label Information in BGP-4. RFC 3107, RFC Editor, May 2001.
[24] Ivan Pepelnjak and Jim Guichard. MPLS and VPN Architectures. Cisco Press, 2000.
[25] Luc De Ghein. MPLS Fundamentals. Cisco Press, 2006.
[26] D. Farinacci, T. Li, S. Hanks, D. Meyer, and P. Traina. Generic Routing Encapsulation (GRE). RFC 2784, RFC
Editor, March 2000.
[27] E. Rosen, Y. Cai, and I. Wijnands. Cisco Systems Solution for Multicast in BGP/MPLS IP VPNs. RFC 6037, RFC
Editor, October 2010.
[28] E. Rosen and R. Aggarwal. Multicast in BGP/MPLS IP VPNs. RFC 6513, RFC Editor, February 2012.
[29] R. Aggarwal, E. Rosen, T. Morin, and Y. Rekhter. BGP Encodings and Procedures for Multicast in BGP/MPLS IP
VPNs. RFC 6514, RFC Editor, February 2012.
[30] Understanding JUNOS OS Next-Generation Multicast VPNs. https://kb.juniper.net/library/CUSTOMERSERVICE/GLOBAL_JTAC/technotes/2000320-en.pdf, January 2014. Accessed: July 15, 2014.
[31] NG MVPN BGP Route Types and Encodings. http://www.juniper.net/us/en/local/pdf/app-notes/3500142-en.pdf, 2010. Accessed: July 15, 2014.
[32] Personal Communication, July 2014. Jeffrey Zhang, Juniper Networks.
[33] R. Aggarwal, Y. Rekhter, and E. Rosen. MPLS Upstream Label Assignment and Context-Specific Label Space.
RFC 5331, RFC Editor, August 2008.
[34] J. Zhang, L. Giuliano, E. Rosen, Karthik Subramanian, D. Pacella, and J. Schiller. Global Table Multicast with
BGP-MVPN Procedures. Draft 04, IETF Tools, May 2014.