CONFIGURING SYSTEM EVENT ALERT
NOTIFICATIONS FOR EMC® KAZEON
RUNNING ON INTEL® PLATFORM
ABSTRACT
This white paper explains how to configure Kazeon servers running on the Intel® platform
so that alerts are sent when certain system events occur, such as hardware failures,
power supply failures, and system restarts. These alerts are configured in the Kazeon
server’s BMC web console, and the notifications can be sent to email recipients or,
as SNMP traps, to SNMP trap receivers.
September, 2014
Copyright © 2014 EMC Corporation. All Rights Reserved.
EMC believes the information in this publication is accurate as of its publication date. The information is subject to change without
notice.
The information in this publication is provided “as is.” EMC Corporation makes no representations or warranties of any kind with
respect to the information in this publication, and specifically disclaims implied warranties of merchantability or fitness for a
particular purpose.
Use, copying, and distribution of any EMC software described in this publication requires an applicable software license.
For the most up-to-date listing of EMC product names, see EMC Corporation Trademarks on EMC.com.
All other trademarks used herein are the property of their respective owners.
Part Number H13364
TABLE OF CONTENTS
EXECUTIVE SUMMARY
AUDIENCE
BMC OVERVIEW
CONFIGURATION
PRE-REQUISITES
ALERT CONFIGURATION STEPS
SNMP TRAP RECEIVER CONFIGURATION
SAMPLE ALERT NOTIFICATION
CONCLUSION
REFERENCES
EXECUTIVE SUMMARY
Kazeon servers running on the Intel platform include a built-in component called the BMC (Baseboard Management Controller). The BMC,
an embedded computer system, enables out-of-band management of the Kazeon servers.
This white paper explains how to configure the Kazeon server’s BMC so that alert notifications are sent when certain system
events occur, such as hardware failures, system restarts, and power status changes. These alert notifications can be sent to
email recipients or to SNMP trap receivers.
AUDIENCE
This white paper is intended for engineering, functional, and support teams that deal with the Kazeon product running on the Intel
platform. Readers are assumed to have a basic understanding of the Kazeon product and of the SNMP and SMTP
protocols.
BMC OVERVIEW
The BMC is an embedded computer system that implements the IPMI (Intelligent Platform Management Interface) protocol and provides
remote web access along with email capabilities, LDAP support, emulation of remote CD/DVD drives and other media, and a host of
other capabilities. The BMC is a powerful system: it operates and controls the server at a very low level. Various types of sensors
built into the computer system report to the BMC on parameters such as temperature, cooling fan speeds, power status, and operating
system status.
CONFIGURATION
PRE-REQUISITES
Before starting the configuration, make sure that the BMC is configured on the Kazeon Intel server (this can be done using
Kazeon’s kaz_setup.pl script or from the system BIOS) and that you have the following information:
1. The IP address of Kazeon’s BMC web console and the system administrator credentials
2. The SMTP mail server address (if the alerts are to be sent to email recipients)
3. The IP address of the SNMP trap receiver (if the alerts are to be sent as SNMP traps)
ALERT CONFIGURATION STEPS
1. Logging in
Enter the configured IP address of the Kazeon server’s BMC on-board NIC into your web browser. Make sure that your web browser
supports the HTTPS protocol. This takes you to the Intel® Integrated BMC Web Console module login page as shown below.
Log in with system administrator privileges.
Figure 1: BMC Web Console login page
2. Home Page
After successful login to the Integrated BMC Web Console module, the Integrated BMC Web Console home page appears as
shown below:
Figure 2: BMC Web Console home page
3. Alerts configuration page
Select the Configuration tab from the top horizontal toolbar. By default, this tab opens the IPv4 Network settings page. Click
the Alerts link in the left panel to open the Alerts page, shown in Figure 3. This page allows you to configure which system
events trigger an alert and where the alerts are sent. Up to two destinations can be selected for each LAN
channel.
Figure 3: Alerts page
Description of the options:
a. Globally Enable Platform Event Filtering: This can be used to prevent alerts from being sent until you have fully specified your
desired alerting policies.
b. Log Event on Filter Action: This enables or disables the logging of an event into the System Event Log when a
filter action is taken.
c. Select the events that will trigger alerts: Select one or more events from the list that will trigger an alert. These events
correspond to the IPMI preconfigured Platform Event Filters. Each of these system events is described briefly below:
   • Temperature Sensor Out of Range: One or more temperature sensors reported a reading beyond the manufacturer-defined
   threshold (in degrees Celsius).
   • System Restart: A system reboot has occurred.
   • Fan Failure: A fan’s speed has crossed its threshold or the fan has stopped working.
   • Power Supply Failure: A defective power supply has been identified.
   • BIOS: Post Error Code: An error occurred during BIOS POST (Power-On Self-Test). BIOS POST is a systematic check
   of basic system devices and firmware.
   • Node Manager Exception: Node Manager is a platform-resident technology that enforces power and thermal policies
   for the platform. A Node Manager Exception event is sent each time the power limit of a maintained policy is exceeded
   for longer than the Correction Time limit.
   • Watchdog Timer: Generated when the IPMI watchdog timer times out. The IPMI watchdog can check whether the OS is still
   responsive. If the timer expires, the BMC can take an action if it is configured to do so (reset, power down,
   power cycle, or generate a critical interrupt).
   • Voltage Sensor Out of Range: The BMC monitors the main voltage sources in the system, including the baseboard,
   memory, and processors, using analog/threshold sensors. This event can be caused by the device supplying the voltage
   or by the device using the voltage.
   • Chassis Intrusion: Chassis intrusion is monitored on supported chassis, and the BMC can log corresponding events
   when the chassis lid is opened or closed.
   • Memory Error: Reports memory errors, if any. Memory errors are characterized as either soft or hard. Soft errors
   are transient and occasional; hard errors are permanent and are found in the silicon or in the metallization of the dynamic
   RAM (DRAM) packaging.
   • FRB Failure: Fault resilient booting (FRB) is a set of BIOS and BMC algorithms and hardware support that allows a
   multiprocessor system to boot even if the bootstrap processor (BSP) fails. An FRB failure occurs when the timer expires
   in the FRB2 phase of the boot operation.
   • Hard Drive Failure: Indicates a failure of a hard drive in the system.
d. LAN Channel to Configure: Select the LAN channel on which you want to configure the alerts. The LAN channel describes
the physical NIC connection on the server.
   i. Baseboard Mgmt (BMC LAN Channel 1) is the on-board, shared NIC configured for management and shared with the
   operating system.
   ii. Baseboard Mgmt 2 (BMC LAN Channel 2) is the second on-board, shared NIC configured for management and shared
   with the operating system.
   iii. Intel® RMM (BMC LAN Channel 3) is the add-in RMM4 NIC.
Kazeon mainly uses channel 1 and, in some cases, channel 3, so the destinations need to be configured separately
for each channel in use.
e. Alert Destination #1/#2: Select the destination to which the alert notifications are to be sent. Select either SNMP, with the
IP address of the trap receiver, or email, with the recipient’s email address. Up to two destinations can be selected for each LAN channel.
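The options above can be summarized as a small data model. The following Python sketch is illustrative only and is not part of the BMC web console or any Intel API: the class and function names are hypothetical, and the email address is a placeholder. It simply encodes the rule that each LAN channel accepts at most two destinations, each of which is either an SNMP trap receiver or an email recipient.

```python
from dataclasses import dataclass, field
from typing import List

# Limit stated on the Alerts page: up to two destinations per LAN channel.
MAX_DESTINATIONS_PER_CHANNEL = 2
VALID_KINDS = ("snmp", "email")


@dataclass
class AlertDestination:
    kind: str    # "snmp" or "email"
    target: str  # trap receiver IP address or recipient email address


@dataclass
class ChannelAlertConfig:
    """Hypothetical model of the alert settings for one BMC LAN channel."""
    lan_channel: int
    events: List[str] = field(default_factory=list)
    destinations: List[AlertDestination] = field(default_factory=list)

    def add_destination(self, kind: str, target: str) -> None:
        if kind not in VALID_KINDS:
            raise ValueError(f"unknown destination kind: {kind!r}")
        if len(self.destinations) >= MAX_DESTINATIONS_PER_CHANNEL:
            raise ValueError(
                f"LAN channel {self.lan_channel} already has "
                f"{MAX_DESTINATIONS_PER_CHANNEL} destinations"
            )
        self.destinations.append(AlertDestination(kind, target))


# Example: channel 1 with one SNMP receiver and one email recipient.
channel1 = ChannelAlertConfig(
    lan_channel=1,
    events=["Power Supply Failure", "Fan Failure"],
)
channel1.add_destination("snmp", "10.10.178.139")
channel1.add_destination("email", "admin@example.com")  # placeholder address
```

Because destinations are configured per channel, a server that also uses channel 3 would need a second, separate configuration for that channel.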
4. Alert Email Configuration page:
These configuration steps are required only if the alert notifications are to be sent to email recipients. Select the Alert Email
link from the left panel. This page allows you to configure the parameters required for email notifications.
Figure 4: Alerts Email Page
Description of the options:
a. LAN Channel: Select the LAN channel for which to configure the destination.
b. SMTP Server IP: Enter the IP address (not the hostname) of the remote SMTP mail server that the alert emails will be
sent to.
c. Sender Address: The sender address string to be put in the “From:” field of outgoing alert emails. This can be any alphanumeric string.
d. Local Hostname: The hostname of the local machine that is generating the alert. It is put into the outgoing alert email.
At this point your Kazeon server is configured to send system event alerts to email recipients or, as SNMP traps, to SNMP trap
receivers. You can test your configuration by clicking the “Send Test Alerts” button at the bottom of the Alerts page. This sends a
dummy event notification to the configured email recipient or SNMP trap receiver.
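Because the SMTP Server IP field expects an IP address rather than a hostname, it can help to check the value before entering it in the web console. The following Python sketch uses only the standard library; the function name is hypothetical, and accepting IPv6 literals in addition to IPv4 is an assumption rather than something this paper states about the BMC.

```python
import ipaddress


def is_ip_literal(value: str) -> bool:
    """Return True if value is a literal IP address, False for anything else
    (for example a hostname such as mail.example.com)."""
    try:
        ipaddress.ip_address(value.strip())
        return True
    except ValueError:
        return False


for candidate in ("10.10.178.139", "mail.example.com"):
    verdict = "ok" if is_ip_literal(candidate) else "not an IP address"
    print(f"{candidate}: {verdict}")
```

If the mail server is known only by hostname, resolve it to its IP address first and enter that address in the SMTP Server IP field.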
SNMP TRAP RECEIVER CONFIGURATION
This section provides a high-level overview of configuring an SNMP trap receiver to log the incoming alert notifications.
Note that this is a sample configuration. For details, refer to the SNMP documentation.
1. Linux/Unix:
a. Install the necessary SNMP packages, such as net-snmp, on the system.
b. Edit the file /etc/snmp/snmptrapd.conf (create it if it is not present) to include the following line:
authCommunity log [community_string]
c. Start or restart the SNMP trap daemon:
service snmptrapd start
or
service snmptrapd restart
d. By default, the incoming traps are logged in a system log file such as /var/log/messages. To log them to another
file, pass the extra argument -Lf <log_file_name> when starting the SNMP trap daemon. This can be done either by
launching snmptrapd manually with all the required options or by modifying the startup file, generally present at
/etc/sysconfig/snmptrapd, to include the additional options.
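The Linux steps above come down to one configuration line and, optionally, one extra command-line argument. The following Python sketch builds both as plain values so that they can be reviewed before being applied. The function names, the community string "public", and the log file path are illustrative assumptions; the sketch does not write any file or start the daemon.

```python
from typing import List, Optional


def snmptrapd_conf(community: str) -> str:
    """Return the snmptrapd.conf line that logs traps for one community."""
    return f"authCommunity log {community}\n"


def snmptrapd_command(log_file: Optional[str] = None) -> List[str]:
    """Return the argument list for launching snmptrapd manually.

    With log_file set, the -Lf option sends the trap log to that file
    instead of the default system log.
    """
    command = ["snmptrapd"]
    if log_file:
        command += ["-Lf", log_file]
    return command


print(snmptrapd_conf("public"), end="")
print(" ".join(snmptrapd_command("/var/log/snmptrapd.log")))  # example path
```

Use the community string that the BMC alert destination is configured to send; "public" appears here only because it is the community shown in this paper's sample trap.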
2. Windows:
a. Install the necessary software, such as net-snmp.
b. Register snmptrapd as a service using the script included in the net-snmp installer.
c. Edit etc\snmp\snmptrapd.conf, present in the user’s home directory, to include the following lines:
snmpTrapdAddr [System IP]:162
authCommunity log [community string]
d. The default log location is log\snmptrapd.log under the user’s home directory.
Apart from logging the incoming SNMP traps, additional tasks can be performed on receipt of an SNMP trap by configuring
listeners for them. For more details, refer to the SNMP documentation.
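One way to attach such a listener in net-snmp is the traphandle directive in snmptrapd.conf, which runs a program for each trap and passes the trap details on standard input: the sender's hostname on the first line, the transport address on the second, and then one variable binding per line (OID, a space, then the value). The following Python sketch parses that input. It is a minimal illustration, not a complete handler; the function names are hypothetical, and the sample input is made up from values in this paper rather than captured from a real trap.

```python
import io
from typing import Dict, Iterable, List, Tuple


def parse_trap(stream: Iterable[str]) -> Dict[str, object]:
    """Parse the text that snmptrapd feeds to a traphandle program."""
    lines = [line.rstrip("\n") for line in stream]
    if len(lines) < 2:
        raise ValueError("expected at least hostname and transport lines")
    varbinds: List[Tuple[str, str]] = []
    for line in lines[2:]:
        if not line.strip():
            continue
        # The first token is the OID; the rest of the line is the value.
        oid, _, value = line.partition(" ")
        varbinds.append((oid, value))
    return {"hostname": lines[0], "transport": lines[1], "varbinds": varbinds}


def summarize(trap: Dict[str, object]) -> str:
    """Return a one-line summary suitable for a custom log or a ticket."""
    return f"trap from {trap['hostname']} with {len(trap['varbinds'])} varbind(s)"


# Illustrative input only; a real handler would pass sys.stdin instead.
sample = io.StringIO(
    "kazintel35-bmc.kazeon.local\n"
    "UDP: [10.10.178.35]:42969->[10.10.178.139]\n"
    "SNMPv2-SMI::enterprises.3183.1.1.1 96 F9 BB A6\n"
)
trap = parse_trap(sample)
print(summarize(trap))
```

A real handler would call parse_trap(sys.stdin) and then perform the site-specific action, such as opening a ticket or forwarding the alert.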
SAMPLE ALERT NOTIFICATION
1. Sample email alert:
Once the Kazeon server is configured to send system event alerts to email recipients and an event is generated, you will receive
an email with the subject “Alert from <machine_name>” and content similar to the following:
-----
Event that generated this alert:
RID:02D3 TS:08/08/2014 10:32:08 SN:BIOS Evt Sensor ST:System Event ED:OEM System Boot Event ET:Asserted EC:OK
RID:02D3 RT:02 TS:53E4A728 GID:0001 ER:04 ST:12 S#:83 ET:6F ED:01 FF FF EX:00 FF FF FF FF FF FF FF
-----
2. Sample SNMP log message:
The SNMP trap receiver logs a message similar to the following upon receipt of an SNMP trap from a Kazeon Intel server.
-----
2014-08-08 03:32:13 kazintel35-bmc.kazeon.local [10.10.178.35] (via UDP: [10.10.178.35]:42969->[10.10.178.139]) TRAP,
SNMP v1, community public#012#011SNMPv2-SMI::enterprises.3183.1.1 Enterprise Specific Trap (1208065) Uptime: 162 days,
21:42:19.28#012#011SNMPv2-SMI::enterprises.3183.1.1.1 = Hex-STRING: 96 F9 BB A6 3B AF 11 E1 BD 1D 00 1E 67 2C 9E 73
#01200 38 A8 CA 39 1F FF FF 20 20 00 01 83 00 00 01 #012FF FF 00 00 00 00 00 00 00 00 00 00 00 00 C1 00 #01200 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 #01200 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 #01200 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 #01200 00 00 00 00 00 00 00 00 00 00 00 00 00
-----
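The two RID lines in the sample email alert are sequences of KEY:value fields, so they can be split mechanically for filtering or reporting. The following Python sketch does this with a regular expression. The function names are hypothetical, and the field layout is inferred from this one sample, not from a published format. The sketch also assumes that the TS field of the raw line is a hexadecimal count of seconds since 1970-01-01; for this sample, that reading reproduces the decoded timestamp 08/08/2014 10:32:08 when interpreted as UTC.

```python
import re
from datetime import datetime, timezone
from typing import Dict

# A field starts with a 2-3 letter uppercase key (or "S#") followed by ":".
# Requiring whitespace (or line start) before the key keeps the colons
# inside "10:32:08" from being treated as field separators.
FIELD_KEY = re.compile(r"(?:^|(?<=\s))([A-Z]{2,3}|S#):")


def parse_alert_line(line: str) -> Dict[str, str]:
    """Split one RID line of the alert email into a key -> value mapping."""
    matches = list(FIELD_KEY.finditer(line))
    fields: Dict[str, str] = {}
    for index, match in enumerate(matches):
        end = matches[index + 1].start() if index + 1 < len(matches) else len(line)
        fields[match.group(1)] = line[match.end():end].strip()
    return fields


def raw_timestamp(hex_seconds: str) -> datetime:
    """Assumption: raw TS is hexadecimal seconds since the Unix epoch."""
    return datetime.fromtimestamp(int(hex_seconds, 16), tz=timezone.utc)


decoded = parse_alert_line(
    "RID:02D3 TS:08/08/2014 10:32:08 SN:BIOS Evt Sensor ST:System Event "
    "ED:OEM System Boot Event ET:Asserted EC:OK"
)
raw = parse_alert_line(
    "RID:02D3 RT:02 TS:53E4A728 GID:0001 ER:04 ST:12 S#:83 ET:6F "
    "ED:01 FF FF EX:00 FF FF FF FF FF FF FF"
)
print(decoded["SN"], "|", decoded["ED"], "|", decoded["ET"])
print(raw_timestamp(raw["TS"]).isoformat())
```

The decoded line is the easier of the two to use in mail filters; the raw line carries the same record in the numeric form described in the IPMI specification listed in the references.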
CONCLUSION
This white paper explained how to configure Kazeon Intel servers, using the built-in BMC capabilities, to send alert notifications
when certain system events occur. These system events include hardware failures, power supply failures, system restarts,
memory errors, and so on, as described in the Configuration section. The alert notifications can be sent either to email recipients or to
SNMP trap receivers.
REFERENCES
• Intelligent Platform Management Interface (IPMI) Specification Second Generation v2.0
• Intel® Server Board S2600GZ/GL: Technical Product Specification
• Intel® Remote Management Module 4 and Integrated BMC Web Console User Guide
• System Event Log (SEL) Troubleshooting Guide for Intel® 4600/2600/2400/1600/1400 Product Families
• NET-SNMP Documentation: http://www.net-snmp.org/wiki