DATA CENTER NETWORKING
Enhancing your core business
WHITE PAPER

BACKGROUND
The ability to deliver robust and highly available data center networks is a fundamental requirement for any data center today. The infrastructure, being the foundation for all applications and services produced in the data center, is increasingly under pressure from application and business owners to maintain 100% availability and zero downtime.

At SecureLink we believe that the data center is the heart and center of the production line, much like a factory assembly line. The physical network provides the connectivity, capacity, availability and volume to the production line, while the overlay network provides Software Defined features and functions. This distinct separation provides a guide for where in the data center certain features should be implemented, adding to the overall stability. There are few operational or efficiency benefits in adding an overlay network to a physical network that was not designed for heavy east-west traffic. Assuming a high oversubscription ratio, you might be surprised to learn that a spine and leaf topology comes in at roughly half the price per port of a multi-chassis LAG solution built on chassis-based switches.

Network infrastructure in the data center is at a stage where we need to challenge the mentality "it has always been done this way". SecureLink believes that today we have the capability to build next-generation data centers, and this document aims to provide a glimpse into how we believe they should be delivered, highlighting what is important to our customers.

THE PHYSICAL NETWORK MATTERS

SPINE AND LEAF TOPOLOGY
In an effort to meet the demands of today's workloads, the traditional fat-tree topology, with its focus on north-south traffic and centrally enforced security segmentation, simply does not cut it any more. Some SDN vendors claim that it is of no importance how the underlying physical network is designed. From our point of view, this is a dangerous and somewhat simplified approach. Yes, by definition the software controlling the traffic is independent of the physical design of the network, but that does not mean that traffic flows in the most effective way. The workloads still use the physical network, even though it is abstracted through a virtual overlay.

Pioneered by the cloud titans and hyperscale data centers, today's network topologies are more condensed and "flatter", something often referred to as the spine and leaf topology. Although the majority of companies do not operate at the same scale, they can still benefit from the topology. As virtual workloads move around the data center to maximize available compute resources, any physical network constraints, such as limited bandwidth and unpredictable latency, will impact application performance. Server infrastructure has long since been virtualized simply because it is more operationally, administratively and cost efficient. Server virtualization is here to stay, and the data center network needs to be architected in a way that makes sense for the virtual workloads.

The spine and leaf topology is a two-layered (three-stage) topology. It consists of an edge layer connecting servers and end hosts, referred to as the leaf layer, and a core layer that interconnects the leaf layer devices, called the spine layer. The spine and leaf topology provides a more scalable, robust network with uniform capacity and latency across the entire data center.

Figure: Dense Spine and Leaf
Figure: Minimal Spine and Leaf

SIZING AND INTEGRATING THE POD
The number of leaf switches is determined by the number of interfaces facing servers and end hosts. The number of spine switches is determined by overall fabric capacity and availability.
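The sizing rules above can be sketched in a few lines: leaf count follows from the server-facing port count, spine count from the uplinks each leaf offers (since every leaf connects to every spine), and oversubscription from the ratio of downlink to uplink capacity. The port counts and speeds below (48 x 25GbE down, 6 x 100GbE up per leaf) are illustrative assumptions, not a recommendation.

```python
# Back-of-the-envelope pod sizing for a spine-and-leaf fabric.
# Assumed leaf profile: 48 x 25GbE server ports, 6 x 100GbE uplinks.
import math

def size_pod(server_ports: int, ports_per_leaf: int = 48,
             uplinks_per_leaf: int = 6, downlink_gbps: int = 25,
             uplink_gbps: int = 100):
    leaves = math.ceil(server_ports / ports_per_leaf)
    # Every leaf connects to every spine, so one uplink per spine:
    spines = uplinks_per_leaf
    oversub = (ports_per_leaf * downlink_gbps) / (uplinks_per_leaf * uplink_gbps)
    return leaves, spines, oversub

leaves, spines, oversub = size_pod(server_ports=500)
print(f"{leaves} leaves, {spines} spines, {oversub:.1f}:1 oversubscription")
# 500 server ports -> 11 leaves, 6 spines, 2.0:1
```

The same arithmetic makes the trade-off explicit: adding spines (and uplinks per leaf) lowers the oversubscription ratio at the cost of more fabric ports.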
A key principle is that every leaf switch should connect to all spine switches, both to provide uniform and deterministic capacity and latency and to avoid transient micro-loops.

A spine and leaf topology is purposely uniform, which suits the preferred routing protocol in the data center, BGP. Although BGP requires a bit more configuration than OSPF and IS-IS, BGP is the most robust IP routing protocol there is and at the same time one of the most extensible. A multitude of features and address families have been added over the last 10 years, many of them designed to improve convergence times and to advertise reachability information beyond IPv4.

Depending on the age of your existing data center network and the amount of technical debt that has accumulated, forklift-upgrading your legacy data center network might be too big a bite to swallow. Implementing a new spine and leaf topology is easier to consume in smaller chunks. Coresec recommends that you start the implementation in "pods". By "pods", we mean an isolated environment separated from your existing data center network. Once initial testing is completed, the pod is connected to the existing legacy network, ideally using layer 3 to minimize potential layer 2 impacts. If layer 2 connectivity is crucial during migration, consider using VXLAN to tunnel layer 2 over layer 3.

LAYER 2 WITH A CONTROL-PLANE
The success of Ethernet is largely attributed to its simplicity and cost benefits. The flood-and-learn approach is the simplest and easiest way to advertise reachability, but not the most efficient and certainly not the safest. The lack of a control-plane means that the propagation of information, correct or incorrect, cannot be limited or suppressed without impacting normal operations.

The extensibility of BGP, enabled through RFC 7432 (BGP MPLS-Based Ethernet VPN) and draft-sd-l2vpn-evpn-overlay (A Network Virtualization Overlay Solution using EVPN), gives BGP the ability to transport layer 2 information over a VXLAN or MPLS data plane. Distributed MAC address learning through BGP is superior in every way to VPLS and older layer 2 transport techniques. For example, VPLS uses normal Ethernet flooding to propagate MAC addresses (i.e. data-plane learning) and relies on spanning tree to resolve Ethernet loops in a redundant setup. In terms of complexity and resiliency, VPLS focuses on communication inside the MPLS cloud, not at the edges where "classical Ethernet" connects. EVPN provides all-active forwarding (multi-homing and multi-pathing) with local proxy ARP and unknown-unicast suppression at the edges. Coresec recommends that new data center network designs use EVPN as the enabler for extending layer 2 domains throughout the data center network, including the data center interconnect.

FAILURE DOMAINS AND ROUTING BOUNDARY
One of the principal goals of the spine and leaf topology is to push the routing boundary closer to the end hosts. There are a number of benefits to doing so, such as deterministic traffic flows and multi-pathing, but the key benefit is network stability. Keeping failure domains small adds to the overall fabric stability. As a rule of thumb, keeping the layer 2 domain at the leaf layer limits the failure domain and provides natural fault isolation within the rack. Building spine and leaf topologies using traditional interior gateway protocols, such as OSPF and IS-IS, adds very little value today. There is no need for every router to keep synchronized state and calculate the best path through the area or domain; traffic between top-of-rack switches should be sent to the spine switches, period.

THE VALUE OF HARDWARE BUFFERS
Different types of workloads pose different types of requirements on the physical network, such as high bandwidth, low latency and zero packet drop.
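One such requirement is buffering at the egress port when many senders converge on one receiver at once (the "incast" pattern). A toy simulation makes the effect of buffer depth visible; the packet counts and buffer sizes below are invented for illustration, not vendor figures.

```python
# Toy incast simulation: several ingress ports burst toward one egress
# port in the same instant. A FIFO with a fixed buffer (in packets)
# either absorbs the burst or drops the excess while it drains.

def incast_drops(senders: int, burst_pkts: int, buffer_pkts: int,
                 drain_per_tick: int, ticks: int) -> int:
    """All senders burst in tick 0; the egress port drains a fixed
    number of packets per tick. Returns the number of dropped packets."""
    queue, dropped = 0, 0
    for tick in range(ticks):
        arriving = senders * burst_pkts if tick == 0 else 0
        space = buffer_pkts - queue
        dropped += max(0, arriving - space)
        queue = min(buffer_pkts, queue + arriving)
        queue = max(0, queue - drain_per_tick)
    return dropped

# 16 senders bursting 100 packets each into a shallow vs a deep buffer:
shallow = incast_drops(16, 100, buffer_pkts=500, drain_per_tick=200, ticks=10)
deep = incast_drops(16, 100, buffer_pkts=2000, drain_per_tick=200, ticks=10)
print(shallow, deep)  # the shallow buffer drops packets, the deep one absorbs the burst
```

The same total offered load succeeds or fails purely on buffer depth, which is why buffer placement matters more at the points where bursts converge.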
From a physical network perspective, these requirements are to a large extent addressed with hardware features. For example, IP-based storage needs bandwidth but also relies on buffers to manage the burstiness associated with disk read and write operations. The "incast" problem, where traffic from many ingress interfaces needs to exit on the same egress interface, causes congestion. Transitioning from a higher-speed interface to a lower-speed interface likewise requires buffering as load increases.

Based on a study from Insieme Networks (now Cisco) in 2013, about 65% of all drops in a spine and leaf topology occurred outbound on the leaf switches facing end hosts, and about 35% occurred outbound on the spine switches facing the leaf switches. It was also determined that fabric oversubscription on leaf switches, and higher-speed interfaces in the fabric versus the edge (facing end hosts), had a huge impact on overall drops. Large buffers in the leaf switches were the most effective way to mitigate the incast problem. Based on your use case, Coresec recommends that you factor in which applications will consume the "transport service" and decide if and where larger buffers are needed.

TODAY WE HAVE THE CAPABILITY TO BUILD NEXT-GENERATION DATA CENTERS

SECURE AND AGILE VIRTUALIZED NETWORKS
Taking a step back and looking at the big picture, data center networking and Ethernet have seen slow development over the last decades. Granted, capacity and line speeds have evolved, but no revolution in how to design the data center network has occurred. The reliance on a protocol two decades old to keep the Ethernet network loop-free is far from innovative. Regardless of whether this is a result of physical constraints, the slow standardization process in the IETF or something else, it is clear that networking and Ethernet are struggling to keep pace with x86 software evolution.

One of the demands every business has is the ability to be agile. Agility implies the ability to quickly adapt to change in order to take advantage of new business opportunities with minimal effort. Change is perpetual, which drives the need to quickly and easily adapt to changes affecting the day-to-day business. Agility in enterprise IT infrastructure means that changes to infrastructure should be managed in a simple way with minimal effort.

The majority of network functions in the physical network, such as security zones or virtual forwarding tables, are pre-provisioned. Requesting resources outside of that requires human interaction and lead times. In contrast, agility in the data center today is almost solely provided by the virtualized server infrastructure platform, which can deploy new workloads and applications in minutes. In other words, agility in data center networking depends on what perspective you are looking from.

THE OVERLAY NETWORK
Most enterprise companies have transitioned the bulk of their workloads from physical servers to virtual servers. Granted, there might be a few non-virtualized servers left, but those are usually the exception. An overlay network is a logical network layered on top of the physical underlay network. The abstraction of overlay versus underlay provides separation of traffic based on the function each layer provides. The underlay manages the physical aspects, such as capacity and redundancy. The overlay is "tunneled" over the underlay and provides connectivity between the actual virtual machines or end hosts that want to communicate. The overlay network stretches the layer 2 and broadcast domain over a routed underlay.

Determining where the overlay networks, i.e. the virtual layer 2 segments, should start and terminate is important. Providing isolated and secure segments between virtual machines running on hypervisors should be done as close to the virtual machines as possible, optimally on the hypervisor itself. Proximity to end hosts matters: if the overlay starts closer to the underlay spine, layer 2 traffic still needs to transit from the VM to the overlay starting point, which contradicts the overall goal of minimizing the failure domain.

From a functional and transport perspective it is more efficient to push the overlay functionality to the hypervisor. It simplifies the underlay by reducing the amount of functionality and features required in the physical underlay. A software-based overlay has:
- No data-plane hardware dependency.
- A control- and management-plane that is not bound to rigid hardware capabilities.
- A mature x86 platform, where programming is much simpler than on ASIC hardware. Updates, patches and new services will always be faster to implement in software only, rather than in a combination of hardware and software components.

In our view, the marginal extra performance gained by using ASICs in the data plane is not justifiable. Architecturally, the network virtualization for virtual workloads should be managed by the server virtualization component.

The data center network will most likely contain some form of special or legacy equipment that cannot be virtualized but still requires layer 2 connectivity. The established network hardware vendors do support and use the same set of overlay protocols, providing the ability to integrate the physical and virtual worlds. Identify those use cases and treat them as the exceptions rather than the norm; be pragmatic and focus your efforts where you get the most return.

Figure: Virtualization Services

ABSTRACTION, AUTOMATION OR SDN?
The industry and vendors are pushing for automation in the data center. Every networking vendor is either launching its own solution for Software Defined Networking (SDN) or adapting by providing plugins or APIs to existing SDN technologies.
The goal of automation is to have machines execute instructions the same way, every time, to improve quality, accuracy and precision. The definition of SDN is somewhat arbitrary, but the essence is to be able to control, operate and administer the network as a single entity, through a single point, regardless of the physical hardware. SDN cannot be achieved without automation.

In the context of data center networks, Coresec views network evolution and efficiency as steps that lead to more efficient networks. At a high level they can be visualized in the following diagram.

Diagram: Network evolution and efficiency (level of efficiency over time): Device centric CLI networks, Script-assisted networks, Orchestrated and Automated networks, Software Defined Networks, and the next future state of networking.

DEVICE CENTRIC COMMAND LINE INTERFACE NETWORKS
Configuration is performed using the CLI on a per-device basis by a network operator. Configurations are created using templates, and the actual device configuration tends to erode over time. Validation of configuration is done by comparing the template and the actual configuration text side by side.

SCRIPT ASSISTED NETWORKS
Introduction of scripts to simplify repetitive configuration changes that are performed often and require little or no verification. These could be Python or Perl scripts, or even CLI screen scraping, that for example create and push a new VLAN to the data center switches, assign the VLAN to interfaces for all ESX hosts in a cluster, and so on.
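A minimal sketch of this script-assisted stage: render the same CLI snippet from one template instead of typing it on every switch. The switch names, interface name and CLI syntax below are generic examples, not tied to any particular vendor, and the actual push (SSH or API) is left out.

```python
# Script-assisted configuration: one template, rendered per switch.
# Switch names and CLI dialect are illustrative placeholders.

VLAN_TEMPLATE = """\
vlan {vlan_id}
 name {name}
interface {uplink}
 switchport trunk allowed vlan add {vlan_id}"""

def render_vlan_config(vlan_id: int, name: str, uplink: str) -> str:
    """Render the VLAN snippet for one switch from the shared template."""
    return VLAN_TEMPLATE.format(vlan_id=vlan_id, name=name, uplink=uplink)

# Render once per switch; delivery to the device is out of scope here.
for switch in ["leaf01", "leaf02"]:
    print(f"! {switch}")
    print(render_vlan_config(vlan_id=210, name="esx-vmotion", uplink="Ethernet49"))
```

Even this small step removes the per-device typing that causes configurations to erode, though it still offers none of the verification an orchestration layer provides.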
ORCHESTRATED AND AUTOMATED NETWORKS
One or more external tools manage and configure the network devices (still in a device-per-device fashion). The complexity of managing many devices becomes less important, as the underlying topology is abstracted away by an orchestration and automation interface. Automation assists with recurring day-to-day operations, such as reacting to events and reconfiguring device settings. A lot of items fall into this category, but anything that takes manual tasks out of the "examine and react" loop qualifies.

SOFTWARE DEFINED NETWORKS
A centralized piece of software (a controller) controls not only how devices are configured, but also enforces policy using in-flow traffic direction. This could be done with, for example, OpenFlow, where the controller effectively manages the forwarding tables in the switches. Integrating an SDN controller with a server virtualization platform enables orchestration of virtual machines, networks and storage in a coherent manner. SDN is a true separation of the data plane from the control plane and enables the provisioning and configuration of these elements. The SDN controller provides a single pane of glass into the physical network.

SECURING NORTH-SOUTH AND EAST-WEST TRAFFIC
Security in the data center is a heavy component and, as always, the requirements vary for each company. Web-scale companies might deliver security through application delivery controllers (ADCs) and rely on protection on the end hosts. The enterprise companies that Coresec does business with are usually more security conscious, using for example IDS/IPS, firewalls, WAFs, DDoS protection and ADCs to employ a defense-in-depth approach for their published services. From a data center networking perspective, this translates into two discrete directions of traffic flows: north-south and east-west.
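The distinction between the two traffic directions can be sketched as a simple classification rule: a flow is east-west when both endpoints sit inside the data center's address space, and north-south when one endpoint is outside. The prefix below is an invented RFC 1918 example, not a recommendation.

```python
# Classifying a flow as north-south or east-west from its endpoints.
# The data center prefix is an illustrative assumption.
import ipaddress

DC_PREFIX = ipaddress.ip_network("10.20.0.0/16")  # assumed DC address range

def direction(src: str, dst: str) -> str:
    """East-west if both endpoints are inside the DC prefix, else north-south."""
    inside = [ipaddress.ip_address(ip) in DC_PREFIX for ip in (src, dst)]
    return "east-west" if all(inside) else "north-south"

print(direction("10.20.1.10", "10.20.2.20"))   # web -> app: east-west
print(direction("203.0.113.7", "10.20.1.10"))  # internet -> web: north-south
```
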
North-south traffic is traffic entering the data center "processing factory" from a source outside the data center, such as the internet, partners or internal users. East-west traffic is defined as machine-to-machine traffic, for example a web server talking to an app server, talking to a database server. For data center networks the 80/20 rule applies, i.e. 80% of the traffic is east-west and 20% is north-south, and the use of private clouds pushes this closer to 90/10.

Looking at firewalls, for example, we believe the direction of the traffic flow is important when choosing how and where to implement the firewall. For the enterprise customer, almost all vendors offer a virtualized version of their firewall, capable of running on the most popular hypervisors. The virtual firewall is managed and operated the same way as its physical counterpart. As a general rule of thumb, physical firewalls should be used to provide filtering and application inspection for north-south traffic, while virtual firewalls are preferred for filtering east-west traffic. The east-west firewalling can be delivered using virtual appliances from your preferred vendor, or pushed to the hypervisor and managed in combination with the overlay and micro-segmentation.

MICRO-SEGMENTATION
Network segmentation is something that network administrators have used for decades, and it has been the only way to divide the network into zones, using security functions to enforce a set of access rules. Until today, the way to accomplish this was to place servers in specific sets of VLANs and IP subnets and separate them with a firewall. Micro-segmentation takes network segmentation a step further, as each virtual machine is placed in its own security zone. Security policies are defined and enforced by the hypervisor the virtual machine runs on.

The benefits of micro-segmentation are:
- Persistent and distributed security – Policies follow the virtual machine regardless of location and IP address.
- Optimized traffic flows – Distributed security alleviates load on physical security devices and removes hair-pinning of VM traffic through, for example, congested firewalls.
- Agility – Software-enforced security provides quicker and more agile ways of designing and operating the security inside the data center.
- Improved malware mitigation – Provides the ability to quarantine infected virtual machines from the rest of the production servers.
- Compliance – Simplifies audits, as virtual machine security rules are managed through a single UI.
- Extensibility – Security functions such as IDS/IPS can be pushed closer to the workload.

SUCCESSFULLY IMPLEMENTING A DATA CENTER NETWORK FOR TOMORROW REQUIRES AN UNDERSTANDING OF THE COMPANY'S CORE BUSINESS.

TELEMETRY AND VISIBILITY METRICS
Until now, monitoring the health and status of the data center network has been limited to interface monitoring using up/down checks and a 5-minute average interface load. In today's data center networks, about five gigabytes of data can be sent every second on a 40 Gigabit Ethernet interface. In that context, those five minutes are the equivalent of an eternity. If an interface or network device is experiencing intermittent errors or performance degradation, how does that impact the applications users actually use? Coresec believes that in any next-generation data center, tracking real-time end-to-end health metrics will be a key success factor when building the network fabric.

There are two important changes that have enabled network devices to take the next step in monitoring network health:
- The use of Linux-based kernels on standard Intel x86 CPU hardware in network devices.
- The transition from a centralized pull model to a network-device-driven, event-based push model.
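The pull-to-push shift can be made concrete with two small sketches: first the arithmetic behind the "five gigabytes per second" figure, showing how much traffic a 5-minute polling interval averages away; then a toy device-side generator that streams an event the moment a user-defined drop threshold is crossed. The sample values are invented for illustration.

```python
# Why 5-minute polling averages hide problems on a 40GbE interface,
# plus a toy event-driven push model (thresholds and samples are assumptions).

LINE_RATE_GBPS = 40
bytes_per_second = LINE_RATE_GBPS / 8 * 1e9       # 5.0 GB every second
bytes_per_poll = bytes_per_second * 300           # 1.5 TB between two polls
print(f"{bytes_per_second/1e9:.0f} GB/s -> {bytes_per_poll/1e12:.1f} TB per 5-minute poll")

def push_events(samples, drop_threshold):
    """Device-side push: emit an event the moment drops exceed the
    threshold, instead of waiting to be polled."""
    for t, drops in enumerate(samples):
        if drops > drop_threshold:
            yield {"t": t, "drops": drops}

# A short burst of drops that a 5-minute average would smear out:
samples = [0, 0, 12000, 0, 0, 0, 9000, 0, 0, 0]
events = list(push_events(samples, drop_threshold=1000))
print(events)  # two events, at t=2 and t=6
```

The point of the second sketch is who decides when data is sent: the device, against user-defined criteria, rather than a central poller on a fixed schedule.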
The use of Linux-based kernels on standard Intel x86 CPU hardware, instead of monolithic, custom, low-capacity CPUs, simplifies code development and enables full use of the Intel CPU features and functionality. The current architecture of polling network devices for information or state (using, for example, SNMP) lacks the scale and accuracy to provide useful real-time information. An event-driven model, where data is pushed and streamed by the network device based on user-defined criteria, provides a level of detail and completeness that was previously unobtainable. When building network fabrics today, where overlay traffic is inherently load-balanced and distributed through the fabric, insight into the overlay and the ability to correlate data are critical to providing human-readable metrics.

CONCLUSION AND THE SECURELINK BENEFIT
From an IT perspective, successfully implementing a data center network for tomorrow requires an understanding of the company's core business and how IT is used to enable the business to be more successful. As stated in the background, we believe the data center is the heart and center of the production line: the physical network provides the connectivity, capacity, availability and volume, while the overlay network provides the Software Defined features and functions, and this distinct separation guides where in the data center certain features should be implemented. There are few operational or efficiency benefits in adding an overlay network to a physical network not designed for heavy east-west traffic, and, assuming a high oversubscription ratio, a spine and leaf topology comes in at roughly half the price per port of a multi-chassis LAG solution built on chassis-based switches.
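The half-the-price-per-port claim is easy to sanity-check with your own numbers. The sketch below uses placeholder list prices and port counts (they are not quotes from any vendor); substitute real figures to reproduce the comparison for your environment.

```python
# Back-of-the-envelope price-per-port comparison: spine-and-leaf fixed
# switches vs. a multi-chassis LAG pair of chassis-based switches.
# All prices and port counts are hypothetical placeholders.

def price_per_port(total_cost: float, usable_ports: int) -> float:
    """Total solution cost divided by the server-facing ports it delivers."""
    return total_cost / usable_ports

# Spine-and-leaf: 4 spines + 8 leaves (48 server ports per leaf, assumed prices).
spine_leaf_cost = 4 * 20_000 + 8 * 10_000      # 160,000 total
spine_leaf_ports = 8 * 48                      # 384 server-facing ports

# Chassis MC-LAG pair: two chassis, four 48-port line cards each (assumed prices).
chassis_cost = 2 * (60_000 + 4 * 25_000)       # 320,000 total
chassis_ports = 2 * 4 * 48                     # 384 server-facing ports

sl = price_per_port(spine_leaf_cost, spine_leaf_ports)
ch = price_per_port(chassis_cost, chassis_ports)
print(f"spine/leaf: {sl:.0f} per port, chassis MC-LAG: {ch:.0f} per port")
```

With these placeholder figures the spine-and-leaf design lands at half the per-port cost; the real ratio depends entirely on the prices and oversubscription you plug in.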
Understanding the architectural targets most important to you, SecureLink can help you describe, document, advise on, design and implement data center networks that fit your business needs. Navigating our customers through technology choices and compromises, as well as pros and cons, is what we do on an everyday basis. SecureLink does business with some of the largest enterprise customers in the country and has extensive experience adding value as the trusted advisor for both network and security. Please contact a sales representative to learn what Coresec can do for you.

WWW.SECURELINK.NET