MSG389: Achieving High Availability with Windows Server and Exchange Server

Anthony Quigney, Application Solution Centre Manager, Dell EMEA
Brian Hayden, Senior Systems Consultant, Application Solution Centre, Dell EMEA

Agenda
• Availability: Why is it important?
• Availability: Defined
• Availability: Business / IT Challenges
• Availability Solutions
  • Windows Server 2003
  • Exchange Server 2003
  • Clustering
  • Dell High Availability Solutions

Effects of Downtime
Today, computing resources are the axis on which business revolves. When these resources are unavailable to an organization, it is at risk of losing its competitive edge.
Lost systems lead to lost:
• Revenue
• Customers
• Productivity
• Data
• Decision capability
• ???

The Cost of Downtime

Industry Sector                 Revenue Per Hour   Revenue Per Employee-Hour
Energy                          $2,817,846         $569.20
Telecommunications              $2,066,245         $186.98
Manufacturing                   $1,610,654         $134.24
Financial institutions          $1,495,134         $1,079.89
Insurance                       $1,202,444         $370.92
Retail                          $1,107,274         $244.37
Pharmaceuticals                 $1,082,252         $167.53
Banking                         $996,802           $130.52
Food/beverage processing        $804,192           $153.10
Consumer products               $785,719           $127.98
Transportation                  $668,586           $107.78
Utilities                       $643,250           $380.94
Health care                     $636,030           $142.58
Professional services           $532,510           $99.59
Construction and engineering    $389,601           $216.18
Media                           $340,432           $119.74
Hospitality and travel          $330,654           $38.62
Average                         $1,010,536         $205.55

Source: IT Performance Engineering & Measurement Strategies: Quantifying Performance Loss, Meta Group, October 2000

Four Levels of Continuity
High Availability: Maintaining the availability of systems critical to ongoing operations during a failure or service outage.
Disaster Recovery: Recovering from unplanned, catastrophic events or disasters in a predetermined manner based on the importance of the system.

Site (Beyond the Building)
• Site Recovery: Remote or commercial recovery facilities
• Site/Datacenter Failover: Re-route users and data to replicated sites
Application (System Interaction)
• Application Failover/Load Balancing: Continuous application access via clustering
• Redundant Systems: Continuous server, storage, network access
Data (Beyond the Box)
• SAN, NAS & DAS: Continuous data access
• Backup and Restore: Real-time tape backup, off-site storage
Platform (In the Box)
• Rapid Equipment Replacement: Vendor services and financing programs
• High Availability System Features: Hot-swappable, redundant components with mission-critical support
Cost, functionality and complexity increase from the Platform level up to the Site level.

The Causes of Downtime
When a failure occurs, it makes an impact. Avoiding downtime results from properly planning, designing and implementing multiple levels of protection.

Component failure
  Examples: Bad memory chip, fan, power, HDD, data path, controller
  Impacts: Platform, data
Software defects/failures
  Examples: Driver hangs, OS hangs/reboots, virus, file corruption
  Impacts: Platform, data, applications
Planned administrative downtime
  Examples: Upgrade components, firmware, drivers, O/S, software
  Impacts: Platform, data, applications
Operator error and malicious users
  Examples: Accidental or intentional file deletion, unskilled operation, experimentation
  Impacts: Platform, data, applications
System outage/maintenance
  Examples: Software/systems requiring reboot, system board failure
  Impacts: Applications
Building/site disaster
  Examples: Fire, storms, collapse, explosion, and other localized disasters
  Impacts: Site
Metropolitan disaster
  Examples: Earthquake, hurricanes, floods, other regional natural catastrophes
  Impacts: Site

Availability
To measure availability, we need to know:
• How often a failure is expected, or the Mean Time to Failure (MTTF)
• How long it takes to recover from a failure, or the Mean Time to Recover (MTTR)
The calculation for availability is:

  Availability = MTTF / (MTTF + MTTR)

To achieve high availability:
• MTTF must be as high as possible
• MTTR must be as low as possible
In addition, you must consider business impact when calculating availability.

Levels of Availability

What businesses are saying
“Having my email work is more important to me than having a dial tone”
- Fortune 50 CIO
“In the next 24 hours, 8 million e-mail messages will be exchanged among employees in the Boeing network” [1]
[1] http://www.boeing.com/companyoffices/aboutus/quickfacts.html
Email is business critical.

What analysts are saying
Email is mission critical and must be efficient: in 2003 businesses will send 3.5 trillion emails, over 13 billion emails/day.
Gartner predicts that the volume of daily emails sent worldwide will reach 36 billion by 2005, more than three times the number of emails sent in 2001. [1]
[1] Gartner Dataquest Perspective, Market Analysis, “From Content to Knowledge: The Growing Gap,” March 4, 2003.

Top Concerns of Today’s Messaging Environment
• Reliability
• Quick Recovery
• Security
• Privacy
• Business Integrity

Windows Server 2003 and Exchange Server 2003: Advanced Features

New Features of Windows Server 2003
• 8-node failover clusters
• Shutdown tracker (logs reasons for shutdown, restart)
• Diskpart: grow basic volumes
• Volume Shadow Copy
• Mount points (in cluster)
• /USERVA=3030 (boot.ini switch)
• Improved AD performance
• Better memory management

New Features of Exchange Server 2003
• Improved OWA (more like Outlook)
• Improved Virus Scanning API (VSAPI)
• Exchange Management Pack for MOM included
• New migration tools
• Increased network performance
• Decreased network & processing costs
• Replication
• IPSec support between front-end and back-end clusters

With Exchange Server 2003 on Windows Server 2003…
• Enable server and site consolidation
• Improve management and administration
• Enhance user experience and information management
• Improve client and server communications (sync)
• Increase user productivity

AD & OS Compatibility Matrix

                     Compatible operating systems                 Supported Active Directory environments
Exchange version     Windows 2000 Server SP3+  Windows Server 2003  Windows 2000 Server SP3+  Windows Server 2003
Exchange 2003        Yes                       Yes                  Yes                       Yes
Exchange 2000 + SP3  Yes                       No                   Yes                       Yes
Exchange 2000 + SP2  Yes                       No                   Yes                       Yes
Exchange 5.5 + SP3   Yes                       No                   Not required              Not required

Exchange Server 2003 Clustering

High Availability Cluster: Goals
• Availability: Data, application, service
• Scalability: CPU, storage, # nodes
• Application Recovery: Failover, restart
• Manageability: Single point of administration
• Eliminate Single Point of Failure: Redundancy throughout

MSCS: Virtual Servers
Clients connect to Virtual Servers (VS). If a cluster node running a VS fails, the other server will run the VS.

Virtual Server #1: Name: CLUSTERIP, IP: 192.168.1.11, App: Quorum
Virtual Server #2: Name: EXG1, IP: 192.168.1.12, App: Exchange
Virtual Server #X: Name: EXG2, IP: 192.168.1.13, App: Exchange
Cluster Node A: Name: CLUSTER_A, IP: 192.168.1.1, App: MSCS
Cluster Node B: Name: CLUSTER_B, IP: 192.168.1.2, App: MSCS
(Other diagram labels: Client, Quorum, EXG1, EXG2)

• Virtual Servers typically include the following resources: a disk, IP address, network name, and application service(s)
• Clients do not connect to physical nodes. Admins connect to them for administration

Cluster Services: Active N+I
• Active (N) + Passive (I) combinations
• Clusters of smaller servers will continue to overtake larger proprietary systems
  • Less $$ for hardware
  • Scale better
  • Faster failover

Exchange Server 2003 Clusters

Server Version    Active (N+I)   ActiveN
Windows 2K AS     2 node         2 node
Windows 2K DC     3+1 node       3 nodes
Windows 2K3 EE    7+1 node       7 nodes
Windows 2K3 DC    7+1 node       7 nodes

Exchange 2003 Installation on MSCS
• Easier to create a cluster or add nodes using Cluster Administrator in Windows Server 2003
• Microsoft Exchange Server 2003 automatically detects the presence of an MSCS cluster and installs the necessary components.
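
The virtual-server behaviour described in the MSCS slides above can be sketched as a toy model. This is only an illustration, not how the cluster service is implemented: the node and virtual-server names and IP addresses are taken from the slide, while the initial placement (EXG1 on CLUSTER_A, EXG2 on CLUSTER_B), the `fail_over` and `locate` helpers, and the "least-loaded survivor" policy are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class VirtualServer:
    name: str  # network name that clients connect to
    ip: str    # virtual IP address, independent of any physical node
    app: str   # application service hosted by the virtual server


@dataclass
class Node:
    name: str
    ip: str
    alive: bool = True
    hosted: list = field(default_factory=list)


def fail_over(failed: Node, nodes: list) -> Node:
    """Move every virtual server from a failed node to a surviving node."""
    failed.alive = False
    survivors = [n for n in nodes if n.alive]
    if not survivors:
        raise RuntimeError("no surviving node: the cluster is down")
    # Assumed toy policy: pick the survivor hosting the fewest virtual servers.
    target = min(survivors, key=lambda n: len(n.hosted))
    target.hosted.extend(failed.hosted)
    failed.hosted = []
    return target


def locate(vs_name: str, nodes: list) -> str:
    """Return the name of the physical node currently running a virtual server."""
    for node in nodes:
        if any(vs.name == vs_name for vs in node.hosted):
            return node.name
    raise KeyError(vs_name)


# Names and addresses from the slide; the initial placement is an assumption.
node_a = Node("CLUSTER_A", "192.168.1.1")
node_b = Node("CLUSTER_B", "192.168.1.2")
node_a.hosted.append(VirtualServer("EXG1", "192.168.1.12", "Exchange"))
node_b.hosted.append(VirtualServer("EXG2", "192.168.1.13", "Exchange"))
cluster = [node_a, node_b]

before = locate("EXG1", cluster)
fail_over(node_a, cluster)
after = locate("EXG1", cluster)
print(before, "->", after)  # prints: CLUSTER_A -> CLUSTER_B
```

The point of the model is that clients keep using the name EXG1 and the address 192.168.1.12 throughout; only the physical node behind the virtual server changes.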
Microsoft® Exchange Failover
1. Cluster node fails
2. Failure detected by cluster heartbeat
3. Surviving node acquires disk reservations
4. Check and mount the file systems
5. Restart Exchange resources (virtual server)
6. Restore communications (client-side retry)

Exchange 2000 Dependency Tree
Resources shown: SMTP, HTTP, IMAP4, POP3, MSSearch, Exchange Store, Message Transfer Agent, Routing, System Attendant, Network Name, IP Address, Physical Disk

Exchange 2003 Dependency Tree
• Flattened dependency hierarchy of Exchange services
• Faster recovery times after failover
Resources shown: SMTP, Message Transfer Agent, HTTP, IMAP4, Exchange Store, MSSearch, Routing, System Attendant, Network Name, IP Address, Physical Disk

Dell | EMC Storage Advanced Features

Typical Storage Environment
[Diagram: servers on a LAN (Exchange 2000, SQL Server 2000, File & Print, Other), each with its own disks (labelled 40GB, 80GB, 40GB, 45GB, 15GB, 80GB, 15GB, 60GB) and tape devices (DLT7000, Tape Library, DDS-4)]
What are the IT challenges with this environment?

Consolidated Storage Environment
[Diagram: Exchange 2000, SQL Server 2000, File & Print and Other servers on a LAN, sharing consolidated storage and a tape library]

High Availability Level: Storage = Achilles’ Heel
• Application
• Operating System
• Server
• Host Bus Adapter
• Storage Controller
• RAID Level
• Disk Port

Consolidated Storage Environment
[Diagram: Exchange 2000, SQL Server 2000, File & Print and Other servers on a LAN, sharing consolidated storage and a tape library]
• Redundant Storage Area Network (SAN)
• Redundant storage system
  • Multi-path I/O with failover (PowerPath)
  • Redundant storage processors (RAID controllers)
  • Protected write cache: mirroring, SPS, vaulting
  • RAID 1, 3, 5, 1+0
  • Dual Fibre Channel loops on storage system back-end

PowerPath
• Load balances I/O across multiple paths to the same RAID controller
• I/O path failover for redundant paths
[Diagram caption: I/Os are divided across both paths to SPB]

SnapView - Snapshot
• SnapView creates logical point-in-time views of production information
• Takes only seconds to create a complete snapshot
• Copy on first write
• Snapshot allows access for test, backup, etc., without compromising the production data
[Diagram: Production Host; Production Data, 100 GB; Snap, 10 GB; Snapshot]
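
The "copy on first write" idea named on the snapshot slide can be illustrated with a small in-memory sketch. It is not SnapView's actual implementation; the `Volume` and `Snapshot` classes, the block granularity and the sample data are all hypothetical. It only shows why a snapshot can be created almost instantly and occupy far less space than the production volume (the slide's 10 GB snap against 100 GB of production data): nothing is copied until a production block is about to be overwritten for the first time.

```python
class Snapshot:
    """Point-in-time view of a volume; stores only blocks changed since creation."""

    def __init__(self, source):
        self.source = source
        self.saved = {}  # block index -> original contents at snapshot time

    def read(self, index):
        # Changed blocks come from the saved copies, unchanged ones from production.
        if index in self.saved:
            return self.saved[index]
        return self.source.blocks[index]


class Volume:
    """A production volume modelled as a list of blocks (hypothetical model)."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.snapshots = []

    def snapshot(self):
        # Creating a snapshot copies no data, which is why it takes only moments.
        snap = Snapshot(self)
        self.snapshots.append(snap)
        return snap

    def write(self, index, data):
        # Copy on first write: preserve the old block once per snapshot,
        # then let the production write proceed.
        for snap in self.snapshots:
            if index not in snap.saved:
                snap.saved[index] = self.blocks[index]
        self.blocks[index] = data


production = Volume(["a", "b", "c", "d"])
snap = production.snapshot()
production.write(1, "B")   # first write to block 1: the old "b" is preserved
production.write(1, "B2")  # second write to block 1: nothing more is copied

snapshot_view = [snap.read(i) for i in range(4)]
print(snapshot_view)       # prints: ['a', 'b', 'c', 'd']
print(production.blocks)   # prints: ['a', 'B2', 'c', 'd']
print(len(snap.saved))     # prints: 1
```

A secondary host reading through `snap.read` sees the data as it was at snapshot time, while production continues to change.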
SnapView - Clone
SnapClone creates a full point-in-time copy of another volume.
[Diagram: Production Host with Production Data, 100 GB; Snap Clone, 100 GB, used by a Backup Server or Testing Host]

Snapshots & SnapClone
• Array-based product: no burden on host
• Read/write mountable by a secondary host for increased productivity
• Minimizes time that production data is unavailable to users
• Can eliminate scheduled downtime for backup
• Requires less disk space than a full mirror

MirrorView
• Maintains synchronous remote mirroring between two Dell | EMC arrays
• Transparent to server, operating system, and applications
• Protects from unavailability and data loss
• Primary and secondary site can be remote storage for each other
• Failover production environment to remote site

Backup: Dell | Quantum Storage

Data Protection: Value Tradeoffs with Different Solutions
[Figure: a spectrum of solutions running from Mirroring, Replication and Snapshots (primary disk) through Secondary Disk and Backup to Tape Archiving. Along the spectrum, Restore Time runs from Short to Long, Availability from High to Low, Safety from Low to High, and Time Retained from Short to Long. The extracted labels also include "Multi", "Vendor", and a scale from One to Many.]

Prioritizing Data Based on Value
• Lifeblood: Business stops if data is unavailable or lost
• Essential: Business slowed if data is unavailable; stopped if data is lost
• Important: Business can operate with limited data availability; significant disruption if data is lost
• Lower priority: Business can operate with minimal data availability; some disruption if data is lost
[Chart axes: Impact of Data Loss; Availability Value, from Low to High]

Aligning Data Protection Needs with Technologies
• Lifeblood: Synchronous mirrored RAID
• Essential: Asynchronous mirrored or replicated RAID
• Important: Disk-based backup; local/remote tape archive
• Non-essential: Tape autoloader; general purpose NAS
Also shown on the chart: Local tape backup and remote tape archive; Snapshots
[Chart axes: Impact of Data Loss; Availability Value, from Low to High]

Disk and Tape: Both Have a Role to Play
[Diagram: Backup Server with Backup Software; disk-based hardware optimized for data protection; PV 136T]

Tape Library and Dell Solutions: Meeting the Challenge Together
• Lifeblood: SDLT in large automation
• Essential: DLT/SDLT midrange automation libraries
• Important: PowerVault DLT autoloaders
• Lower priority: Dell DLT/SDLT drives
• DLTtape & Super DLTtape media

Exchange DR Demo: On View at Dell Stand
[Diagram: a Production Site and a Disaster Recovery Site, each with a Domain Controller, an Exchange 2000 server with storage groups, and a Fibre switch; the arrays are labelled CX600 and FC4700. MirrorView mirrors the OS boot disk, Exchange logs and Exchange store to the DR site. After a SITE FAILURE the labelled steps are: Promote Remote Mirrors, Boot Remote DR Server, Update Storage Groups.]

Dell EMC Nortel BT Business Continuity Solution: Exchange High Availability Solution
[Diagram: DELL/EMC/Nortel/ESAT BT DWDM installation linking the Dell Application Solution Centre (Dell Limerick) and the EMC Solutions Operation Centre (EMC Cork) over 100 miles of fibre connectivity, using an ESAT BT DWDM managed service with a Nortel Optera at each end. Each site has a Domain Controller, an existing SAN, a management host and clustered hosts (Host A to Host D) on an extended VLAN. Arrays CX600-A and CX600-B are connected through Port 3 (MirrorView); MirrorView replicates the Exchange data volumes and host boot volumes (A, B, C, D).]

More Information
• Dell HA Clustering website: www.dell.com/clusters
• Dell Solutions website: www.dell.com/solutions
• Dell Power Solutions Magazine (online): www.dell.com/powersolutions
• Dell ROI Online Calculators: www.dell.com/roi

Ask The Experts: Get Your Questions Answered
• Ask the Experts area: Wednesday 9-11
• Dell Stand (all week)

Thank You

Community Resources
• Community Resources: http://www.microsoft.com/communities/default.mspx
• Most Valuable Professional (MVP): http://www.mvp.support.microsoft.com/
• Newsgroups: Converse online with Microsoft Newsgroups, including Worldwide: http://www.microsoft.com/communities/newsgroups/default.mspx
• User Groups: Meet and learn with your peers: http://www.microsoft.com/communities/usergroups/default.mspx

Evaluations

© 2003 Microsoft Corporation. All rights reserved.
This presentation is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS OR IMPLIED, IN THIS SUMMARY.

Backup Slides

SAN Copy
• No host CPU cycles involved
• Can copy to/from:
  • CLARiiON (Dell | EMC)
  • Symmetrix
• Usage
  • Upgrade: one-time migration to another storage system
  • Test: routine copy to secondary storage for test
  • Content distribution: copy to multiple targets
• Source can be a snapshot, clone, or fractured mirror
• Copies data from LUN to LUN
• Target LUN must be the same size as or larger than the source LUN

Primary Causes of Data Loss
A data protection solution should protect you against all causes of data loss.
• Human Error: 38%
• Hardware Failure: 20%
• Power Failure/Surges: 12%
• Viruses: 10%
• Theft/Sabotage: 7%
• Software Failure: 5%
• Other: 4%
• Natural Disasters: 3%
Source: Quantum analysis

Protection from Mirrored Disk
• Purely disk-based backup systems do not offer adequate protection against human error, viruses, hackers or natural disasters
• Removable media such as tape provides full protection
[Pie chart: Human Error 38%, Hardware Failure 20%, Power Failure/Surges 12%, Viruses 10%, Theft/Sabotage 7%, Software Failure 5%, Other 4%, Natural Disasters 3%. Legend: protected by mirrored disk; not fully protected without removable tape media.]
Source: Quantum analysis
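
Returning to the Availability slide earlier in the deck: the formula Availability = MTTF / (MTTF + MTTR) can be worked through numerically. The sketch below is illustrative only. The MTTF and MTTR figures are hypothetical, the function names are made up for the example, and treating the Meta Group average revenue per hour ($1,010,536, from the Cost of Downtime slide) as the cost of an hour of downtime is a simplifying assumption, not a claim from the presentation.

```python
HOURS_PER_YEAR = 24 * 365

# Average revenue per hour from the Cost of Downtime slide (Meta Group, 2000).
AVERAGE_REVENUE_PER_HOUR = 1_010_536


def availability(mttf_hours: float, mttr_hours: float) -> float:
    """Availability = MTTF / (MTTF + MTTR), as defined on the Availability slide."""
    if mttf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTTF must be positive and MTTR must not be negative")
    return mttf_hours / (mttf_hours + mttr_hours)


def annual_downtime_hours(avail: float) -> float:
    """Expected hours of downtime per year at a given availability."""
    return (1.0 - avail) * HOURS_PER_YEAR


# Hypothetical system: a failure every 1,000 hours on average.
# Compare a 4-hour recovery with a 1-hour recovery (for example, via failover).
scenarios = {
    "MTTR 4 h": availability(1000, 4),
    "MTTR 1 h": availability(1000, 1),
}

for label, avail in scenarios.items():
    hours = annual_downtime_hours(avail)
    exposed = hours * AVERAGE_REVENUE_PER_HOUR
    print(f"{label}: availability {avail:.4%}, "
          f"about {hours:.1f} h down per year, "
          f"roughly ${exposed:,.0f} of revenue exposed")
```

With MTTF held constant, cutting MTTR is what raises availability here, which is the case the deck makes for clustering and fast failover.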