Monitoring and Managing the Data Center
Section 5 - Introduction
© 2006 EMC Corporation. All rights reserved.

Chapter Objectives and Contents
This chapter covers data monitoring and management based on storage management tools. By monitoring storage hardware, software, information capacity, format, content, and other aspects, information can be managed and used optimally. The chapter also introduces the basics of using some of the major information management software.
The chapter has two parts:
5.1 Monitoring in the Data Center
5.2 Managing in the Data Center

Section Objectives
Upon completion of this section, you will be able to:
Describe areas of the data center to monitor
Discuss considerations for monitoring the data center
Describe techniques for managing the data center

Monitoring in the Data Center
Module 5.1

Monitoring in the Data Center
After completing this module, you will be able to:
Discuss data center areas to monitor
List metrics to monitor for different data center components
Describe the benefits of continuous monitoring
Describe the challenges in implementing a unified and centralized monitoring solution in heterogeneous environments
Describe industry standards for data center monitoring

Monitoring Data Center Components
[Figure: clients, an IP network, clustered hosts/servers with applications and HBAs, a SAN, and storage arrays, connected through ports and a keep-alive link; each is monitored for health, capacity, performance, and security]

Why Monitor Data Centers?
Availability
– Continuous monitoring helps ensure availability
– Warnings and errors are fixed proactively
Scalability
– Monitoring allows for capacity planning and trend analysis, which in turn helps to scale the data center as the business grows
Alerting
– Administrators can be informed of failures and potential failures
– Corrective action can be taken to ensure availability and scalability

Monitoring Health
Why monitor the health of different components?
– Failure of any hardware or software component can lead to an outage of a number of other components
  Example: A failed HBA could cause degraded access to a number of data devices in a multipath environment, or loss of data access in a single-path environment
Monitoring health is fundamental and is easily understood and interpreted
– At the very least, health metrics should be monitored
– Health issues typically need to be addressed with high priority

Monitoring Capacity
Why monitor capacity?
– Lack of proper capacity planning can lead to data unavailability and an inability to scale
– Trend reports can be created from all the capacity data
  The enterprise is well informed of how IT resources are utilized
Capacity monitoring prevents outages before they can occur
– More preventive and predictive in nature than health metrics
  Based on reports, one knows that 90% of a file system is full and that the file system is filling up at a particular rate
  If 95% of all the ports in a particular SAN fabric have been utilized, a new switch should be added if more arrays or servers are to be added to the same fabric

Monitoring Performance
Why monitor performance metrics?
– All data center components should work efficiently and optimally
– See whether components are pushing performance limits or are being underutilized
– Can be used to identify performance bottlenecks
Performance monitoring and analysis can be extremely complicated
– Dozens of interrelated metrics, depending on the component in question
– Most complicated of the various aspects of monitoring

Monitoring Security
Why monitor security?
– Prevent and track unauthorized access
  Accidental or malicious
Enforcing security and monitoring for security breaches is a top priority for all businesses

Monitoring Servers
Health
– Hardware components
  HBA, NIC, graphics card, internal disk …
– Status of various processes/applications
Capacity
– File system utilization
– Database table space/log space utilization
– User quota

Monitoring Servers
Performance
– CPU utilization
– Memory utilization
– Transaction response times
Security
– Login
– Authorization
– Physical security
  Data center access

Monitoring the SAN
Health
– Fabrics
  Fabric errors, zoning errors
– Ports
  Failed GBIC, status/attribute change
– Devices
  Status/attribute change
– Hardware components
  Processor cards, fans, power supplies
Capacity
– ISL utilization
– Aggregate switch utilization
– Port utilization
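The two capacity examples from the Monitoring Capacity slide (a file system that is 90% full and still filling, and a fabric with 95% of its ports in use) can be turned into simple checks. The following is a minimal sketch, not part of any monitoring product: the function names, the linear-growth assumption, and the sample numbers are all illustrative assumptions.

```python
import math


def days_until_full(capacity_gb: float, used_gb: float, growth_gb_per_day: float) -> float:
    """Project how many days remain before a file system is full.

    Assumes linear growth, which is a simplification; real trend
    reports may fit a curve over historical samples instead.
    """
    if growth_gb_per_day <= 0:
        return math.inf  # not growing, so it never fills under this model
    return max(capacity_gb - used_gb, 0.0) / growth_gb_per_day


def fabric_needs_new_switch(ports_in_use: int, total_ports: int, limit: float = 0.95) -> bool:
    """Return True once the share of used fabric ports reaches the limit.

    The 0.95 default mirrors the 95% figure on the slide.
    """
    if total_ports <= 0:
        raise ValueError("total_ports must be positive")
    return ports_in_use / total_ports >= limit


if __name__ == "__main__":
    # Hypothetical 500 GB file system, 90% full, growing by 2 GB per day.
    print(days_until_full(500.0, 450.0, 2.0))
    # Hypothetical fabric with 61 of 64 ports in use.
    print(fabric_needs_new_switch(61, 64))
```

A projection like this is what makes capacity monitoring predictive: the alert can be raised when the remaining days drop below the time needed to procure and install new storage or a new switch.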
Monitoring the SAN
Performance
– Connectivity ports
  Link failures
  Loss of signal
  Loss of synchronization
  Link utilization
  Bandwidth: MB/s or frames/s
– Connectivity devices
  Statistics are usually a cumulative value of all the port statistics

Monitoring the SAN
Security
– Zoning
  Ensure communication only between dedicated sets of ports (HBA and storage ports)
– LUN masking
  Ensure that only certain hosts have access to certain storage array volumes
– Administrative tasks
  Restrict administrative tasks to a select set of users
  Enforce strict passwords
– Physical security
  Access to the data center should be monitored

Monitoring Storage Arrays
Health
– All hardware components
  Front end
  Back end
  Memory
  Disks
  Power supplies
  …
– Array operating environment
  RAID processes
  Environmental sensors
  Replication processes

Monitoring Storage Arrays
Capacity
– Configured/unconfigured capacity
– Allocated/unallocated storage
– Fan-in/fan-out ratios
Performance
– Front-end utilization/throughput
– Back-end utilization/throughput
– I/O profile
– Response time
– Cache metrics

Monitoring Storage Arrays
Security
– LUN access
  Ensure that only certain hosts have access to certain storage array volumes
  Disallow WWN spoofing
– Administrative tasks
  Most arrays allow the restriction of various array configuration tasks
  Device configuration
  LUN masking
  Replication operations
  Port configuration
– Physical security
  Monitor access to the data center
Monitoring IP Networks
Health
– Hardware components
  Processor cards, fans, power supplies, ...
– Cables
Performance
– Bandwidth
– Latency
– Packet loss
– Errors
– Collisions
Security

Monitoring the Data Center as a Whole
Monitor the data center environment
– Temperature, humidity, airflow, hazards (water, smoke, etc.)
– Voltage (power supply)
Physical security
– Facility access (monitoring cameras, access cards, etc.)

End-to-End Monitoring
[Figure: the end-to-end path from clients through the IP network, clustered hosts/servers with applications, and the SAN to the storage arrays; a single failure produces multiple symptoms, which calls for root cause analysis and an assessment of business impact]

Monitoring Health: Array Port Failure
[Figure: hosts H1, H2, and H3, each with two HBAs, connect through switches SW1 and SW2 to the storage array ports; when an array port fails, H1, H2, and H3 are all degraded]

Monitoring Health: HBA Failure
[Figure: same topology; when an HBA on H1 fails, only H1 is degraded]

Monitoring Health: Switch Failure
[Figure: same topology; when a switch fails, all hosts are degraded]

Monitoring Capacity: Array
[Figure: a new server is added to the SAN, which connects through switches SW1 and SW2 to the storage array ports]
Can the array provide the required storage to the new server?
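The three health scenarios above (array port, HBA, and switch failure) can be reproduced with a small path model, which also illustrates why a single failure shows up as multiple symptoms. This is a sketch under stated assumptions: the slides do not give the exact wiring, so the code assumes each host's first HBA reaches array port P1 through SW1 and its second HBA reaches P2 through SW2. All identifiers (`build_paths`, `host_status`, `H1-HBA1`, `P1`, and so on) are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# One I/O path: (host, hba, switch, array_port).
Path = Tuple[str, str, str, str]


def build_paths(hosts: List[str]) -> List[Path]:
    """Build two paths per host. The wiring is an assumption for illustration."""
    paths: List[Path] = []
    for host in hosts:
        paths.append((host, f"{host}-HBA1", "SW1", "P1"))
        paths.append((host, f"{host}-HBA2", "SW2", "P2"))
    return paths


def host_status(paths: List[Path], failed: Set[str]) -> Dict[str, str]:
    """Classify each host by how many of its paths avoid every failed component."""
    total: Dict[str, int] = {}
    alive: Dict[str, int] = {}
    for host, hba, switch, port in paths:
        total[host] = total.get(host, 0) + 1
        # A path survives only if none of its components has failed.
        if not failed & {hba, switch, port}:
            alive[host] = alive.get(host, 0) + 1
    status: Dict[str, str] = {}
    for host, count in total.items():
        surviving = alive.get(host, 0)
        if surviving == count:
            status[host] = "ok"
        elif surviving > 0:
            status[host] = "degraded"
        else:
            status[host] = "no access"
    return status


if __name__ == "__main__":
    paths = build_paths(["H1", "H2", "H3"])
    print("array port failure:", host_status(paths, {"P1"}))
    print("HBA failure:       ", host_status(paths, {"H1-HBA1"}))
    print("switch failure:    ", host_status(paths, {"SW1"}))
```

Run in reverse, the same model supports root cause analysis: if every host reports a degraded path at once, a shared component such as a switch or array port is a more likely cause than three simultaneous HBA failures.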
Monitoring Capacity: Servers
File System Space
[Figure: without monitoring, a file system fills up unnoticed; with file system (FS) monitoring, alerts are raised in time to extend the file system]
Warning: FS is 66% full
Critical: FS is 80% full

Monitoring Performance: Array Port Utilization
[Figure: hosts H1, H2, and H3 connect through switches SW1 and SW2 to the storage array ports; a chart of port utilization (%) shows the combined load of H1 + H2 + H3 against the 100% limit as a new server, H4, is to be added]

Monitoring Performance: Servers
Critical: CPU usage above 90% for the last 90 minutes

Monitoring Security: Servers
[Figure: three login attempts (Login 1, Login 2, Login 3) on a server]
Critical: Three successive login failures for username “Bandit” on server “H4”, possible security threat

Monitoring Security: Array – Local Replication
[Figure: Workgroup 1 (WG1) and Workgroup 2 (WG2) each have their own devices on a storage array reached through switches SW1 and SW2; a WG1 user issues a replication command against WG2 devices]
Warning: Attempted replication of WG2 devices by WG1 user – Access denied

Monitoring: Alerting of Events
Warnings require administrative attention
– File systems becoming full
– Soft media errors
Errors require immediate administrative attention
– Power failures
– Disk failures
– Memory failures
– Switch failures

Monitoring: Challenges
[Figure: a heterogeneous environment spanning storage arrays (CAS, NAS, DAS, SAN, TLU), networks (SAN, IP), servers (MF, UNIX, WIN), databases, and applications, with products from EMC, Hitachi, NetApp, HP, IBM, SUN, Cisco, McData, Brocade, Oracle, Informix, and MS SQL]

Monitoring: Ideal Solution
[Figure: a single monitoring/management engine with one UI covering storage arrays (CAS, NAS, DAS, SAN, TLU), networks (SAN, IP), servers (MF, UNIX, WIN), databases, and applications]

Without Standards…
No common access layer between managed objects and applications (vendor specific)
No common data model
No interconnect independence
Multi-layer management difficulty
Legacy systems cannot be accommodated
No multi-vendor automated discovery
Policy-based management is not possible across entire classes of devices
[Figure: network management, applications management, host management, storage management, and database management as separate domains, with interoperability between them as the open problem]

Simple Network Management Protocol (SNMP)
SNMP
– Meant for network management
– Inadequate for complete SAN management
Limitations of SNMP
– No common object model
– Security: only newer SAN devices support v3
– Positive response mechanism
– Inflexible: no auto-discovery functions
– No ACID (Atomicity, Consistency, Isolation, and Durability) properties
– Richness of canonical intrinsic methods
– Weak modeling constructs
Storage Management Initiative (SMI)
Created by the Storage Networking Industry Association (SNIA)
Integration of diverse multi-vendor storage networks
Development of more powerful management applications
Common interface for vendors to develop products that incorporate the management interface technology
Key components
– Interoperability testing
– Education and collaboration
– Industry and customer promotion
– Promotions and demonstrations
– Technology center
– SMI specification
– Storage industry architects and developers
[Figure: SNIA's SMI-S. A management application sits on an integration infrastructure (object model mapping, vendor-unique features) and uses the SMI-S interface, built on CIM/WBEM technology, to reach tape libraries, switches, arrays, and many other devices. Each device type is described by a MOF, with a standard object model per device plus vendor-unique functions. SMI-S interface properties: platform independent, distributed, automated discovery, security, locking, object oriented]

Storage Management Initiative Specification (SMI-S)
Based on:
– Web Based Enterprise Management (WBEM) architecture
– Common Information Model (CIM)
Features:
– A common interoperable and extensible management transport
– A complete, unified, and rigidly specified object model that provides for the control of a SAN
– An automated discovery system
– New approaches to the application of the CIM/WBEM technology
[Figure: graphical and other users and management tools access managed objects through the Storage Management Interface Specification. Management tools: storage resource management (performance, capacity planning, removable media management), container management (volume management, media management), and data management (file system, database manager, backup and HSM). Managed objects: physical components (removable media, tape drive, disk drive, robot, enclosure, host bus adapter, switch) and logical components (volume, clone, snapshot, media set, zone, other)]
Common Information Model (CIM)
Describes the management of data
Details requirements within a domain
Information model with required syntax

Web Based Enterprise Management (WBEM)

Enterprise Management Platforms (EMPs)
Graphical applications
Monitoring of many (if not all) data center components
Alerting of errors reported by those components
Management of many (if not all) data center components
Can often launch proprietary management applications
May include other functionality
– Automatic provisioning
– Scheduling of maintenance activities
Proprietary architecture

Monitoring in the Data Center – Summary
Key concepts covered in this module are:
It is important to continuously monitor data center components to support the availability and scalability initiatives of any business
– Components include the server, SAN, network, and storage arrays
The four areas of monitoring:
– Health
– Capacity
– Performance
– Security
There are attempts to define a common monitoring and management model

Apply Your Knowledge
Upon completion of this topic, you will be able to:
Describe how EMC ControlCenter can be used to monitor the data center

EMC ControlCenter Architecture
User Interface Tier
• Console (many)
• Optional applications
Agent Tier
• Master Agent (1)
• Application Agents (many)
Infrastructure Tier
• Server (one)
• Repository (one)
• Store (many)
EMC ControlCenter Console
Primary interface through which the storage environment is viewed and managed
Java-based application supported on Windows and Solaris platforms
Objects managed by various agents are organized into groups such as Storage, Hosts, and Connectivity
Information about an object can be retrieved by the Console from the Repository or in real time directly from the agent
Any command issued for the object is passed from the Console to the ControlCenter Server and handled appropriately
There can be several Consoles spread across the network

EMC ControlCenter Server
ControlCenter Server is the primary interface between the Console and the ControlCenter infrastructure
ControlCenter Server provides a diverse collection of services, including:
– Web application server, used for installing the Java Console
– Security and access management, such as licensing, login, authentication, and authorization
– Communication with the Console
– Alert and event management
– Real-time statistics
– Object management to maintain a list of managed objects
– Agent management to maintain a list of available agents
ControlCenter Server retrieves data from the Repository for display by the Java and Web Consoles
User-initiated real-time data requests from some agents are also handled by the ControlCenter Server
Balances agent-to-Store communication based on workload

EMC ControlCenter Repository
Licensed, embedded Oracle 9i database that holds current and historical information about the managed environment
ControlCenter Server executes transactions on the Repository to retrieve information requested by the Console
Store(s) populate the Repository with persistent data from the agents
Repository requires minimal user interaction or maintenance.
The database has restricted access and can be updated only by ControlCenter applications

EMC ControlCenter Store
Store receives the data sent by the agents, processes the data, and updates the Repository
There can be multiple Stores in the environment, providing load balancing, scaling, and failover

EMC ControlCenter Agents
Master agent:
– One per host
– Manages other agents on the host: start/stop, monitor agent status and health
ControlCenter agents:
– Run on hosts to collect data and monitor object health
– Generate alerts
– Multiple agents can exist on a host
– Pass information to the ControlCenter Store and the ControlCenter Server

EMC ControlCenter Support for Storage Arrays
The following storage arrays are supported by EMC ControlCenter:
EMC Symmetrix
EMC CLARiiON
EMC Centera
EMC Celerra and Network Appliance NAS servers
EMC Invista
Hitachi Data Systems (including the HP and Sun resold versions)
HP StorageWorks
IBM ESS
SMI-S (Storage Management Initiative Specification) compliant arrays

EMC ControlCenter Support for SAN Devices
The following SAN devices are supported by ControlCenter:
EMC Connectrix
Brocade
McData
Cisco
Inrange (CNT)
IBM Blade Server (IBM-branded Brocade models only)
Dell Blade Server (Dell-branded Brocade models only)
EMC ControlCenter Support for Hosts
The following hosts are supported by ControlCenter:
Dedicated host agents
– Microsoft Windows
– Hewlett-Packard HP-UX
– IBM AIX
– IBM mainframe
– Linux
– Novell NetWare
– Sun Solaris
Proxy management via Common Mapping Agent (CMA)
– Compaq Tru64
– Fujitsu-Siemens BS2000
– Windows, Solaris, AIX, Linux, and HP-UX hosts can also be monitored by Common Mapping Agent proxy

EMC ControlCenter Support for Database and Backup
The following databases and backup applications are supported by ControlCenter:
Dedicated database agent
– Oracle
– DB2 on mainframe
Proxy management via Common Mapping Agent (CMA)
– SQL Server
– Sybase
– Informix
– DB2
Dedicated backup agent
– EMC EDM
– IBM Tivoli
– EMC NetWorker
– Veritas NetBackup

Discovery of Managed Objects by Agents
Automatic discovery: Many agents discover data objects automatically
Assisted discovery: These agents discover their objects only through administrator action
– Common Mapping Agent
– Database Agent for Oracle
– Fibre Channel Connectivity Agent
– Storage Agents for CLARiiON, Centera, Invista, NAS, SMI, HP StorageWorks, HDS, and ESS

Data Collection Policies (DCP)
Formal set of statements used to manage the data collected by ControlCenter agents
Policies specify the data to collect and the frequency of collection
ControlCenter agents have predefined collection policy definitions and templates
– Default definitions can be easily modified, or new definitions can be created from the templates provided
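The idea of a data collection policy (what to collect, how often, and new definitions derived from templates) can be modeled as a small immutable record. The sketch below is only a conceptual illustration and does not reflect the actual ControlCenter policy format or API: the class `CollectionPolicy`, its fields, and the metric and policy names are all hypothetical.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class CollectionPolicy:
    """Hypothetical policy: which metrics to collect and how often."""
    name: str
    metrics: Tuple[str, ...]
    interval_minutes: int
    enabled: bool = True

    def is_due(self, minutes_since_last_run: int) -> bool:
        """A collection is due once the interval has elapsed and the policy is enabled."""
        return self.enabled and minutes_since_last_run >= self.interval_minutes


# A predefined template (illustrative values only).
TEMPLATE = CollectionPolicy(
    name="host-capacity-template",
    metrics=("filesystem_used_pct",),
    interval_minutes=60,
)

# A new definition created from the template, collecting the same data more often.
custom = replace(TEMPLATE, name="host-capacity-15min", interval_minutes=15)

if __name__ == "__main__":
    print(custom)
    print(custom.is_due(20), TEMPLATE.is_due(20))
```

Keeping the template frozen and deriving copies with `dataclasses.replace` means a modified definition never alters the default it was created from.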
Console View of the Storage Environment
[Figure: Console screenshot showing a SAN switch, a server with dual HBAs and the WWNs of the HBAs, and a storage array with its front-end directors and ports]

Alerts - Overview
Why alert? Data availability
– Monitor and report on events that could lead to application outages
– Every ControlCenter agent can monitor a number of metrics
  30 agents and 700+ alerts
Alert categories
– Health
  Examples: Database instance up/down, Symmetrix service processor down, connectivity device port status
– Capacity
  Examples: File System Space, File/Directory Size Change
– Performance
  Examples: Symmetrix Total Hit %, Host CPU Usage

Alert Notification
Notification capabilities
Messages are directed to the ControlCenter Console by default
Messages can be directed to a management framework via Integration Gateway (SNMP), governed by the Management Policy associated with the alert
E-mail notification as specified in the Management Policy

EMC ControlCenter Console View of Alerts
[Figure: Console screenshot of the alerts view, with columns for message, object name, alert state, and alert severity]
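The alert flow described in this module (a metric crosses a threshold, an alert with a severity is raised, and the alert is routed to the console, SNMP, or e-mail) can be sketched in a few lines. This is an illustrative model only, not the ControlCenter alert engine: the classes `ThresholdRule` and `Alert`, the functions `evaluate` and `notification_targets`, and the object name and e-mail address are hypothetical. The 66% and 80% thresholds are taken from the file system example earlier in the module.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ThresholdRule:
    """Hypothetical rule with a warning and a critical threshold, in percent."""
    metric: str
    warning: float
    critical: float


@dataclass(frozen=True)
class Alert:
    object_name: str
    metric: str
    severity: str
    message: str


def evaluate(rule: ThresholdRule, object_name: str, value: float) -> Optional[Alert]:
    """Return an alert if the value reaches a threshold, otherwise None."""
    if value >= rule.critical:
        severity = "Critical"
    elif value >= rule.warning:
        severity = "Warning"
    else:
        return None
    message = f"{severity}: {rule.metric} on {object_name} is {value:.0f}%"
    return Alert(object_name, rule.metric, severity, message)


def notification_targets(snmp_enabled: bool = False, email: Optional[str] = None) -> List[str]:
    """The console always receives the alert; SNMP and e-mail depend on the policy."""
    targets = ["console"]
    if snmp_enabled:
        targets.append("snmp")
    if email:
        targets.append(f"email:{email}")
    return targets


if __name__ == "__main__":
    fs_rule = ThresholdRule(metric="File System Space", warning=66.0, critical=80.0)
    for used in (50.0, 70.0, 85.0):
        alert = evaluate(fs_rule, "H4:/data", used)
        print(used, alert.message if alert else "no alert")
    print(notification_targets(snmp_enabled=True, email="admin@example.com"))
```

Separating evaluation from routing mirrors the slides, where the alert definition decides the severity and the associated management policy decides who is notified.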