Chapter 2
System Overview
Interconnecting Cloud Data Centers can be a complex undertaking for Enterprises and SPs. Enabling
business-critical applications to operate across, or migrate between, metro/geo sites impacts each tier of
the Cloud Data Center, as shown in Figure 2-1. Customers require a validated end-to-end DCI solution
that integrates Cisco's best-in-class products at each tier to address the most common Business
Continuity and workload mobility functions. To support workloads that move between geographically
diverse data centers, VMDC DCI provides Layer 2 extensions that preserve IP addressing, extended
tenancy and network containers, a range of stateful L4-L7 services, extended hypervisor geo-clusters,
geo-distributed virtual switches, distributed storage clusters, different forms of storage replication
(synchronous and asynchronous), geo-extensions to service orchestration tools, IP path optimization to
redirect users to moved VMs and workloads, and support across multiple hypervisors. The
cumulative impact of interconnecting data centers is significant and potentially costly for SPs and
Enterprises. The lack of technical guidance and best practices for an end-to-end business continuity
solution is a pain point for customers that are not staffed to sift through these technical issues on their
own. In addition, multiple vendors and business disciplines are required to design and deploy a
successful business continuity and workload mobility solution. VMDC DCI simplifies the design and
deployment process by providing a validated reference design for each tier of the Cloud Data Center.
Figure 2-1 Extending Cloud Data Centers Across Infrastructure Tiers

(Figure 2-1 shows the tiers extended between Data Center 1 and Data Center 2, built from Cisco and partner products: WAN edge/DCI with route/path optimization (LISP/DNS/Manual) and Layer 2 extension (OTV/VPLS/E-VPN); switching fabric with tenancy and QoS; services and containers with stateful services (FW/SLB/IPsec/VSG); integrated compute stacks (VMware and Hyper-V, UCS/geo-clusters/mobility); distributed virtual switching; virtual storage volumes with distributed virtual volumes, storage federation, MDS fabric, and FCoE; and management infrastructure and orchestration with container orchestration.)
The VMDC DCI design uses the following definitions to assess the overall cost of recovery
resulting from workload mobility or a recovery plan:
• Business Continuity—Processes to ensure that essential business functions can continue during and after an outage. Business continuance seeks to prevent interruption of mission-critical services, and to reestablish full functioning as swiftly and smoothly as possible.
• Recovery Point Objective (RPO)—The amount of data loss deemed acceptable, defined per application, in the event of an outage. RPO can range from zero (0) data loss to minutes or hours of data loss, depending on the criticality of the application or data.
• Recovery Time Objective (RTO)—The amount of time to recover critical business processes to users, measured from the initial outage, ranging from zero time to many minutes or hours.
• Recovery Capacity Objective (RCO)—The additional capacity at recovery sites required to achieve RPO/RTO targets across multi-site topologies. This may include many-to-one site recovery models and planned utilization of recovery capacity for other functions.
• Metro Distance—Typically less than 200 km and less than 10 ms RTT.
• Geo Distance—Typically greater than 200 km and less than 100 ms RTT.
The Business Criticality of an application defines an acceptable RPO and RTO target in the event of
a planned or unplanned outage (Figure 2-2).
Figure 2-2 RPO and RTO Definitions
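To make these definitions concrete, the following sketch (an illustrative model only; the class, function names, and sample values are assumptions, not part of the validated design) computes the achieved recovery point and recovery time for an outage and checks them against an application's RPO/RTO targets.

    from dataclasses import dataclass

    @dataclass
    class RecoveryTargets:
        rpo_minutes: float  # acceptable data loss window
        rto_minutes: float  # acceptable time to restore service to users

    def meets_targets(targets: RecoveryTargets,
                      replica_age_min: float,
                      recovery_duration_min: float) -> bool:
        """Achieved RPO is the age of the newest usable copy of the data at
        the moment of the outage; achieved RTO is the time from the initial
        outage until the business process is restored to users."""
        return (replica_age_min <= targets.rpo_minutes and
                recovery_duration_min <= targets.rto_minutes)

    # Example: synchronous replication (zero data loss) restored in 5 minutes
    # meets a 15-minute target; hourly asynchronous copies would not.
    c1_c2 = RecoveryTargets(rpo_minutes=15, rto_minutes=15)
    print(meets_targets(c1_c2, replica_age_min=0, recovery_duration_min=5))   # True
    print(meets_targets(c1_c2, replica_age_min=60, recovery_duration_min=5))  # False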
Achieving necessary recovery objectives involves diverse operations teams and an underlying Cloud
infrastructure that has been built to provide business continuity and workload mobility. Each application
and infrastructure component has unique mechanisms for dealing with mobility, outages, and recovery.
The challenge of an end-to-end cloud data center solution is to combine these methods in a coherent way
so as to optimize the recovery/mobility process across metro and geo sites, and reduce the overall
complexity for operations teams. This is the ultimate goal of the VMDC DCI solution.
Mapping Applications to Business Criticality Levels
A critical component of a successful DCI strategy is to align the business criticality of an application
with a commensurate infrastructure design that can meet that application's requirements. Defining how
an application or service outage will impact the business helps to define an appropriate redundancy and
mobility strategy. A critical first step in this process is to map each application to a specific Criticality
Level, as described in Figure 2-3.
Figure 2-3 Application Criticality Levels

Levels run from the lowest RTO/RPO (C1) to the highest (C5). Each application is mapped to a specific level, Cloud Data Centers should accommodate all levels, and cost is an important factor.

• C1, Mission Imperative (RTO/RPO 0 to 80 mins)—Any outage results in immediate cessation of a primary function, equivalent to immediate and critical impact to revenue generation, brand name, and/or customer satisfaction; no downtime is acceptable under any circumstances.
• C2, Mission Critical—Any outage results in immediate cessation of a primary function, equivalent to major impact to revenue generation, brand name, and/or customer satisfaction. (C1 and C2 combined: roughly 20% of apps)
• C3, Business Critical (RTO/RPO 1 to 5 hrs)—Any outage results in cessation over time or an immediate reduction of a primary function, equivalent to minor impact to revenue generation, brand name, and/or customer satisfaction. (20% of apps)
• C4, Business Operational—A sustained outage results in cessation or reduction of a primary function. (40% of apps)
• C5, Business Administrative—A sustained outage has little to no impact on a primary function. (20% of apps)
Industry standard application criticality levels range from Mission Imperative (C1) in which any outage
results in immediate cessation of a primary business function, therefore no downtime or data loss is
acceptable, to Business Administrative (C5) in which a sustained outage has little to no impact on a
primary business function. Applications representing more Business Critical functions (C1-C3)
typically have more stringent RTO/RPO targets than those toward the bottom of the spectrum (C4-C5).
In addition, most SP and Enterprise Cloud Providers have applications mapping to each Criticality
Level. A typical Enterprise distribution of applications described above shows roughly 20% of
applications are Mission Imperative and Mission Critical (C1, C2) and the remainder of applications fall
into lower categories of Business Critical, Business Operational, and Business Administrative (C3-C5).
The VMDC Cloud Data Center must therefore accommodate different levels and provide Business
Continuity and workload mobility capabilities to support varied RPO/RTO targets.
It is important to note that even a relatively short outage (less than one hour) can have a significant
business impact on enterprises and service providers. Figure 2-4 describes the typical Recovery Point Objective
(RPO) requirements for different enterprises. In this study, 53% of Enterprises will have significant
revenue loss or business impact if they experience an outage of just one hour of Tier-1 data (Mission
Critical data). In addition, 48% of these same enterprises will have a significant revenue loss or business
impact if they experience an outage of less than 3 hours of Tier-2 data (Business Critical data). Even
tighter RPO requirements are applicable to SP Cloud Providers. Enterprise and SP Cloud Providers have
a strong incentive to implement Business Continuity and workload mobility functions to protect critical
workloads and support normal IT operations. VMDC DCI provides a validated framework to achieve
these goals within Private Clouds, Public Clouds, and Virtual Private Clouds.
Figure 2-4 Typical Enterprise RPO Requirements (Source: Enterprise Strategy Group, 2012)
VMDC DCI implements a reference architecture that meets two of the most common RPO/RTO targets
identified across Enterprise Private Clouds and SP Private/Public Clouds. The two RPO/RTO target use
cases are described in Figure 2-5. The first case covers an RTO/RPO target of 0 to 15 minutes which
addresses C1 and C2 criticality levels. Achieving near zero RTO/RPO requires significant infrastructure
investment, including synchronous storage replication, Live VM migrations with extended clusters,
LAN extensions, and metro services optimizations. Achieving near zero RTO/RPO typically requires
100% duplicate resources at the recovery site, representing the most capital intensive business
continuity/workload mobility option. The second use case covers an RPO/RTO target of more than
15 minutes, which addresses Criticality Levels C3 and C4. Achieving a 15 minute target is less costly,
less complex, and can utilize a many-to-one resource sharing model at the recovery site.
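The cost difference between the two targets can be illustrated with a simple recovery-capacity (RCO) calculation. The sketch below is a hypothetical model, not a sizing from the validated design; the workload counts and sharing ratio are invented for illustration. Near zero RTO/RPO implies one standby unit per protected unit (a 2x multiplier), while a 15+ minute target lets several protected workloads share one unit of recovery capacity.

    def recovery_capacity(protected_units: int, sharing_ratio: int) -> int:
        """Standby capacity required at the recovery site.

        sharing_ratio=1 models the near zero RTO/RPO case (100% duplicate
        resources); sharing_ratio=N models many-to-one sharing, where N
        protected workloads rely on one unit of recovery capacity.
        """
        return -(-protected_units // sharing_ratio)  # ceiling division

    # Hypothetical example: 400 workload units to protect.
    print(recovery_capacity(400, 1))  # 400 -> 2x total footprint (most costly)
    print(recovery_capacity(400, 4))  # 100 -> 1.25x total footprint (less costly)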
Figure 2-5 Validated RPO/RTO Targets

VMDC DCI focuses on two common RTO/RPO targets across the application criticality levels of Figure 2-3:

• RTO/RPO of 0 to 15 mins (C1, Mission Imperative; C2, Mission Critical)—100% duplicate resources at the recovery site, a 2x cost multiplier (most costly).
• RTO/RPO of 15+ mins (C3, Business Critical; C4, Business Operational; C5, Business Administrative)—many-to-one resource sharing, a lower cost multiplier (less costly).
To cover both of these recovery targets, the VMDC DCI design must support two operational models.
The first operational model, the Active-Active metro design, consists of two physical sites spanning a
metro distance and operating as a single logical data center. The second operational model represents a
more traditional Active-Backup metro/geo design, where two independent data centers provide
recovery and workload mobility functions across both metro and geo distances. A brief description of
both VMDC DCI options is provided below.
Active-Active Metro Design
The active-active metro design is described in Figure 2-6. This model provides DCI extensions between
two metro sites, operating together as a single Logical Data Center. This design accommodates the most
stringent RTO/RPO targets for Business Continuity and Workload Mobility. This model supports
applications that require live workload mobility, near zero RTO/RPO, stateful services, and a
synchronous storage cluster across a metro distance.
Figure 2-6 Active-Active Metro Design

(Figure 2-6 shows Data Centers 1 and 2 operating as ONE logical data center: LAN extensions between the WAN edge/DCI tiers; tenancy and QoS extended across the switching fabrics; stateful services and network containers maintained across the services tiers; live workload mobility over extended clusters between the integrated compute stacks; distributed virtual switching; synchronous storage replication with storage federation over MDS fabric and FCoE; and an extended operational domain across the management infrastructure and orchestration tiers.)
Applications mapped to this infrastructure may be distributed across metro sites and may also use live
workload mobility between those sites. Distributed applications and live workload mobility typically
require stretched clusters, LAN extensions, and synchronous storage replication, as described in
Figure 2-7. DCI extensions must also support stateful L4-L7 services during workload moves,
preservation of network QoS and tenancy across sites, and virtual switching across sites. A single
operational domain with service orchestration is typically used to manage and orchestrate multiple data
centers in this model.
Figure 2-7 Distributed Clusters and Live Workload Mobility
The key VMDC DCI design choices for the Active-Active metro design are described in Figure 2-8.
Figure 2-8 Active-Active Metro Design Choices

VMDC DCI design choices at each tier (moving an "ACTIVE" workload across metro data centers while maintaining stateful services):

• Route/Path Optimization (LISP/DNS/Manual)—External path redirection through manual configuration or RHI; forced routing re-convergence to the new site.
• WAN Edge/DCI, Layer 2 Extension (OTV/VPLS/E-VPN)—OTV LAN extension, preserving IP addressing of applications; IP WAN transport with 10 ms RTT across the metro distance.
• Switching Fabric, Tenancy and QoS—VMDC 3.0 FabricPath (typical design) with multi-tenancy; Palladium network container.
• Services and Containers, Stateful Services (FW/SLB/IPsec/VSG)—Stateful services between sites; Citrix SDX SLB at each site (no metro extension); ASA 5500 FW clustering at each site (no metro extension).
• Integrated Compute Stacks (VMware and Hyper-V, UCS/Geo-Clusters/Mobility)—Stretched ESX clusters and server affinity; VMware live vMotion across metro sites; distributed vCenter spanning metro sites; single and multi-tier application migration strategy.
• Virtual Switching, Distributed Virtual Switch—Nexus 1000V with VSMs and VEMs across metro sites; service and security profiles follow application VMs; different Nexus 1000V instances mapped to application domains as needed.
• Virtual Storage Volumes, Distributed Virtual Volumes—Virtual volumes follow the VM; NetApp MetroCluster synchronous storage replication; ONTAP 8.1 Fabric MetroCluster with a 160 km long-haul link (DWDM).
• Storage and Fabric Extensions, Storage Clusters—FCoE to the compute stack; Cisco MDS FC switching for data replication.
• Management Infrastructure and Orchestration, Container Orchestration—Replicate the service container to the new site to support the mobile VM; virtual management infrastructure support across the metro.
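As a sketch of what the OTV LAN extension in Figure 2-8 involves, the snippet below renders a minimal NX-OS OTV overlay configuration from a few site parameters. It is illustrative only: the interface name, multicast groups, site identifier, and VLAN range are placeholder values, and a production deployment would follow the validated VMDC DCI configuration rather than this template.

    def otv_overlay_config(site_id: str, join_if: str, site_vlan: int,
                           control_group: str, data_group: str,
                           extend_vlans: str) -> str:
        """Render a minimal OTV edge-device configuration (one overlay)."""
        return f"""\
    feature otv
    otv site-identifier {site_id}
    otv site-vlan {site_vlan}
    interface Overlay1
      otv join-interface {join_if}
      otv control-group {control_group}
      otv data-group {data_group}
      otv extend-vlan {extend_vlans}
      no shutdown"""

    # Placeholder values for one edge device of a two-site metro pair.
    print(otv_overlay_config(site_id="0001.0001.0001", join_if="Ethernet1/1",
                             site_vlan=99, control_group="239.1.1.1",
                             data_group="232.1.1.0/28", extend_vlans="100-150"))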
Active-Backup Metro/Geo Design
The second model, the Active-Backup metro/geo design, represents a more traditional primary/backup
redundancy design, where two independent data centers provide recovery and workload mobility
functions across both metro and geo distances, as described in Figure 2-9. This model addresses less
stringent RTO/RPO targets, where applications require cold workload mobility/recovery, in which
applications and corresponding network services are restarted at the recovery location.
Figure 2-9 Active-Backup Metro/Geo Design

(Figure 2-9 shows Data Centers 1 and 2 operating as TWO independent data centers over metro or geo connections: optional LAN extensions between the WAN edge/DCI tiers; separate switching fabrics, services and containers, integrated compute stacks, and virtual switching at each site; cold workload mobility with site recovery tools; and asynchronous storage replication between the virtual storage volumes.)
This Business Continuity and workload mobility design is best suited for moving or migrating "stopped
workloads" between different Cloud data centers, as described in Figure 2-10. The less stringent
RPO/RTO requirements enable the participating data centers to span a geo distance of more than 200 km.
In this model, LAN extensions between data centers are optional, but may be necessary for operators that
need to preserve IP addressing for applications and services. In addition, asynchronous data
replication is used to achieve the less stringent RPO/RTO targets.
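A rough way to reason about the RPO achievable with asynchronous replication is sketched below. This is a simplified model with invented numbers, not a NetApp SnapMirror calculation: worst-case data loss is approximately the replication interval plus the time one transfer cycle needs, which is governed by the change rate and the WAN bandwidth between sites.

    def async_rpo_minutes(interval_min: float, changed_gb_per_cycle: float,
                          wan_gbps: float) -> float:
        """Worst-case RPO ~ replication interval + transfer time of one cycle."""
        transfer_min = (changed_gb_per_cycle * 8) / wan_gbps / 60  # GB -> Gb, seconds -> minutes
        return interval_min + transfer_min

    # Hypothetical: 15-minute update cycles, 20 GB changed per cycle, 1 Gbps WAN.
    print(round(async_rpo_minutes(15, 20, 1.0), 1))  # ~17.7 -> fits a 15+ minute target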
Figure 2-10 Migrating Stopped Workloads

(Figure 2-10 shows workloads moving between a West data center and an East data center over an IP network, with routable hypervisor control traffic, optional LAN extensions, and asynchronous data replication.)
The key VMDC DCI design choices for the Active-Backup metro/geo design are described in
Figure 2-11.
Figure 2-11 Active-Backup Metro/Geo Design Choices

VMDC DCI design choices at each tier (migrating a "Stopped" virtual workload across metro/geo data centers, with stateless services and a VM reboot at the new site):

• Route/Path Optimization (LISP/DNS/Manual)—External path redirection through manual configuration or RHI; forced routing re-convergence to the new site.
• WAN Edge/DCI, Layer 2 Extension (OTV/VPLS/E-VPN)—OTV LAN extension, preserving IP addressing of applications; IP WAN transport across the metro/geo distance.
• Switching Fabric, Tenancy and QoS—VMDC 3.0 FabricPath (typical design) with multi-tenancy; Palladium network container.
• Services and Containers, Stateful Services (FW/SLB/IPsec/VSG)—Services silo'd to each site; Citrix SDX SLB at each site (no geo extension); ASA 5500 FW clustering at each site (no geo extension).
• Integrated Compute Stacks (VMware and Hyper-V, UCS/Geo-Clusters/Mobility)—Separate ESX clusters at each site with server affinity; VMware SRM cold migration across metro/geo sites; silo'd vCenter at each metro/geo site; single and multi-tier application migration strategy.
• Virtual Switching, Distributed Virtual Switch (Nexus 1000v)—Nexus 1000V with VSMs and VEMs silo'd to each site; service and security profiles follow application VMs; different Nexus 1000V instances mapped to application domains as needed.
• Virtual Storage Volumes, Distributed Virtual Volumes—Virtual volumes local to each site, replicated asynchronously; NetApp SnapMirror ONTAP asynchronous storage replication; WAN-based storage replication over long distances (greater than 200 km).
• Storage and Fabric Extensions, Storage Clusters—MDS FC switching for data replication.
• Management Infrastructure and Orchestration, Container Orchestration—Replicate the service container to the new site to support the mobile VM; virtual management infrastructure support across metro/geo sites.
VMDC DCI Supports Multiple Design Options
It is important to note that BOTH of these design options are typically required by Enterprises and SPs
to address their wide range of applications in a cost-efficient way. Therefore, VMDC DCI integrates the
Active-Active metro design and the Active-Backup metro/geo design into a single Cloud data center that
can provide Business Continuity and workload mobility for a wide range of applications and
RPO/RTO targets.
Based on the survey cited in Figure 2-12, almost half of all Enterprises have their primary
backup facility within 250 miles. As a result, most Enterprises can implement both
metro and geo business continuity and workload mobility models across their current data center
locations. Large Tier 1 Service Providers and Enterprises typically span longer distances and many regions.
Figure 2-12 Typical Enterprise Geo-Redundancy

(Figure 2-12 charts responses to the question "What is the distance between your primary center and your furthest backup data center, in miles?" from the 2007 and 2010 Forrester/Disaster Recovery Journal Global Disaster Recovery Preparedness Online Surveys, across buckets ranging from less than 25 miles to greater than 1,000 miles. Key callouts: 48% of enterprises have sites less than 250 miles apart, and 27% less than 50 miles apart, so half of Enterprises can deploy VMDC DCI metro and geo designs across their current data center sites. Base: disaster recovery decision-makers and influencers at enterprises globally with a recovery site; "Don't know" responses excluded; percentages may not total 100 due to rounding. Source: Forrester, "State of Enterprise Disaster Recovery Preparedness, Q2 2011," May 2011.)
Top Level Use Cases
Top level use cases validated in VMDC DCI are mapped to one of the following design choices:
• Design Parameters for Active-Active Metro Use Cases, page 2-9
• Design Parameters for Active-Standby Metro/Geo Use Cases, page 2-10
Design Parameters for Active-Active Metro Use Cases
VMDC DCI used the following design parameters in the Active-Active metro design.
Live Workload Mobility can Solve Specific Business Problems
• Perform live (or cold) workload migrations between metro data centers
• Perform operations re-balancing, maintenance, and consolidation of live (or cold) workloads between metro data centers
• Provide disaster avoidance of live (or cold) workloads between metro data centers
• Implement application geo-clusters spanning metro DCs
• Utilized for the most business critical applications (lowest RPO/RTO)
• Maintain user connections for live workload moves
• Implement load-balanced workloads between metro DCs
Hypervisor tools utilized to implement Live Workload Mobility
• VMware live vMotion
• Stretched HA/DRS clusters across metro data centers
• Single vCenter across metro data centers
• DRS host affinity rules to manage compute resources (see the sketch below)
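As a toy illustration of how host affinity keeps an application's VMs on its home site in a stretched cluster, the sketch below models "should run on" semantics: prefer the home site's hosts, and fall back to the surviving site only when the home site is down. The host inventory and site names are invented for the example.

    # Assumed inventory for the example (not from the validated design).
    SITE_HOSTS = {"DC-1": {"esx1", "esx2"}, "DC-2": {"esx3", "esx4"}}

    def allowed_hosts(home_site: str, site_up: dict) -> set:
        """Hosts a VM may run on under a 'should run on' affinity rule."""
        if site_up.get(home_site, False):
            return SITE_HOSTS[home_site]
        # Home site down: any host at a surviving site is acceptable.
        return set().union(*(h for s, h in SITE_HOSTS.items() if site_up.get(s)))

    print(allowed_hosts("DC-1", {"DC-1": True, "DC-2": True}))   # {'esx1', 'esx2'}
    print(allowed_hosts("DC-1", {"DC-1": False, "DC-2": True}))  # {'esx3', 'esx4'}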
Metro Data Center Infrastructure to support Live Workload Mobility
• Network—Data Center Interconnect extensions between metro data centers
– Simplified LAN extensions using Overlay Transport Virtualization (OTV) to preserve IP addressing of applications and support live migrations
– Virtual switches distributed across metro data centers
– Tenant containers spanning multiple sites
– Maintain traffic QoS and packet markings across metro networks
• Services—Maintain stateful services for active connections where possible
– Support a combination of services hosted on physical appliances, as well as virtual services hosted on the UCS
– Minimize traffic tromboning between metro data centers
• Compute—Support single-tier and multi-tier applications
– Multiple UCS systems across metro DCs to support workload mobility
• Storage—Storage extended across the metro, with synchronous and asynchronous replication
– Distributed storage clusters spanning metro data centers
Figure 2-13 shows a typical live migration of an active workload. Each tier of the data center is impacted
by this use case.
Figure 2-13 Live Workload Mobility

Move a "Live" workload across metro data centers while maintaining stateful services:

1. Continuous synchronous storage replication runs between the virtual storage volumes at each site.
2. Live VM migration across the LAN extensions; virtual services follow the VM and IP addressing is preserved.
3. Active flows trombone to the original site, using the original network container (and original physical appliances if needed) to maintain stateful services.
4. With service orchestration, create a new network container at DC-2.
5. Orchestration redirects external flows (for example, from the branch across the metro network) to DC-2, connecting users to the DC-2 network container and the moved application (LISP future).
6. Migration of the live workload is complete. Compute, network, storage, and services are now local to DC-2; DC-1 resources are reclaimed for new workloads.
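The numbered flow in Figure 2-13 can be summarized as an ordered orchestration sequence. The sketch below is a schematic outline only; the functions are invented for illustration, and a real workflow would drive the vCenter/vMotion and service orchestration APIs rather than printing steps.

    def storage_in_sync(vm: str) -> bool:
        return True  # stands in for a MetroCluster sync-replication health check

    def live_migration(vm: str) -> None:
        """Schematic ordering of the Figure 2-13 live-mobility steps."""
        assert storage_in_sync(vm)  # 1. volumes already replicated synchronously
        print(f"2. Live vMotion of {vm} across the OTV LAN extension (IP preserved)")
        print("3. Trombone active flows to DC-1 to keep services stateful")
        print(f"4. Orchestrate a new network container for {vm} at DC-2")
        print("5. Redirect external flows to DC-2 (LISP future)")
        print("6. Migration complete; reclaim DC-1 resources")

    live_migration("app-vm-01")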
Design Parameters for Active-Standby Metro/Geo Use Cases
VMDC DCI used the following design parameters in the Active-Standby metro/geo design.
Cold Workload Mobility can solve specific Business problems
• Perform planned workload migrations of stopped VMs between metro/geo data centers
• Operations re-balancing, maintenance, and consolidation of stopped workloads between metro/geo data centers
• Disaster avoidance or recovery of stopped workloads
• User connections are temporarily disrupted during the move process
• Site migrations of stopped workloads across metro/geo data centers
• Utilized for less business critical applications (medium to high RPO/RTO)
Hypervisor tools utilized to implement Cold Workload Mobility
• VMware Site Recovery Manager (SRM) and VMware High Availability
• Resource pools mapped to Active/Active or Active/Standby metro/geo DCs
• Host affinity rules to manage compute resources
• Many-to-one site recovery scenarios
Metro/Geo Data Center Infrastructure to support Cold Workload Mobility
• Network—Data Center Interconnect is optional
– Simplified LAN extensions using Overlay Transport Virtualization (OTV) to preserve IP addressing of applications
– Multiple UCS systems utilized to house moved workloads at the recovery site
– Create new tenant containers at the recovery site to support the moved workloads
• Services—Service connections are temporarily disrupted
– New network containers and services created at the new site
– Traffic tromboning between metro DCs can be avoided in many cases
• Compute—Support single-tier and multi-tier applications
• Storage—Asynchronous data replication to the remote site
– Virtual volumes silo'd to each DC
Figure 2-14 shows the different infrastructure components involved in the cold migration of a stopped
workload. Each tier of the data center is impacted by this use case.
Figure 2-14 Components of Stopped Workload Cold Migration

Move a "Stopped" workload across metro or geo data centers:

1. Continuous asynchronous storage replication runs between the virtual storage volumes at each site.
2. The VM is halted; SRM-based cold migration moves the stopped VM across the IP WAN, virtual services follow the VM, and IP addressing is preserved.
3. With service orchestration, create a new network container at DC-2. (3a. Optionally, trombone to the original site if maintaining the original network container and original physical appliances.)
4. Orchestration redirects external flows to DC-2, connecting users to the DC-2 network container and the moved application (LISP future).
5. Reboot the moved VM at the new site.
6. Migration of the cold workload is complete. Compute, network, storage, and services are now local to DC-2; DC-1 resources are reclaimed for new workloads.
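The Figure 2-14 flow differs from the live case mainly in that the VM is halted first and rebooted at the recovery site. The sketch below is again a schematic outline with invented function names, not SRM's actual API.

    def cold_migration(vm: str) -> None:
        """Schematic ordering of the Figure 2-14 cold-mobility steps."""
        print("1. Asynchronous replication has already copied volumes to DC-2")
        print(f"2. Halt {vm}; SRM-based cold migration across the IP WAN")
        print("3. Create a new network container at DC-2 (3a. optional trombone to DC-1)")
        print("4. Redirect external flows to DC-2 (LISP future)")
        print(f"5. Reboot {vm} at DC-2; services restart (stateless)")
        print("6. Migration complete; reclaim DC-1 resources")

    cold_migration("app-vm-01")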
Solution Architecture
Top level components validated in VMDC DCI are mapped to one of the following design choices:
• Active-Active Metro Design, page 2-12
• Active-Backup Metro/Geo Design, page 2-13
Active-Active Metro Design
The Active-Active metro design used in the VMDC DCI system is shown in Figure 2-15. The physical
sites are separated by a metro distance of 75 km. Layer 2 LAN extensions are included to support
multi-site hypervisor clusters, stretched network containers, and preservation of IP addressing for
workloads. Storage is extended between sites to support active-active clusters and synchronous storage
replication. Asynchronous storage replication between sites is also provided for less business critical
applications.
Figure 2-15 Active-Active Metro Design Topology
Active-Backup Metro/Geo Design
The Active-Backup metro/geo design validated in the VMDC DCI system is shown in Figure 2-16.
The physical sites are separated by a geo distance of 1000 km. Layer 2 LAN extensions are optional.
Storage is contained to each site. Asynchronous storage replication provides long distance data
replication between sites.
Figure 2-16 Active-Backup Metro/Geo Design Topology
System Components
Table 2-1 and Table 2-2 list product components for Cisco and Partners, respectively.
Table 2-1 Cisco Components

Role                              Cisco Products
WAN Edge / Core                   ASR-1004; Nexus 7010
Aggregation / FabricPath Spine    Nexus 7009
Access-Edge / FabricPath Leaf     Nexus 6004; Nexus 5548; Nexus 7004 w/ Sup2/F2
FEX                               N2K-C2232PP; N2K-C2248TP-E
Fabric Interconnect               UCS 6248UP
Compute                           UCS B-200-M3/M2; UCS M81KR Virtual Interface Card; UCS P81E Virtual Interface Card; UCS Virtual Interface Card 1280, 1240
Virtual Access Switch             Nexus 1000v
Virtual Firewall                  VSG
Physical Firewall                 ASA5585X
Storage Fabric                    MDS9148
Table 2-2 Third Party and Partner Products

Role                                                Partner Products
SAN/NAS Storage                                     NetApp MetroCluster; NetApp SnapMirror; FAS 6080/6040; FAS 3250
Hypervisor                                          VMware vSphere 5.1; Site Recovery Manager 5.1
Server Load Balancers                               NetScaler SDX
Applications used to demonstrate migration use cases  Microsoft SharePoint & Visual Studio; Oracle & Swingbench