Event Monitoring Service (EMS) is an integrated HP-UX service that monitors host hardware in real time and reports events to system maintenance personnel through a configurable notification method. This helps O&M personnel detect host faults promptly and accurately and assists in locating the fault, which improves host availability.
EMS is managed through MRM (Monitoring Request Manager). With MRM, you can set the monitoring scope, the trigger conditions for event alarms, and the notification method for event information.
MRM is started as follows:
(1) Log on to the host system as root.
(2) Run /etc/opt/resmon/lbin/monconfig
(3) Use the Monitoring Request Manager (MRM) main menu for configuration.
From the MRM menu you can view, check, add, modify, and delete monitoring requests, and enable or disable monitoring.
The menu looks as follows:
============================================================================
===================       Event Monitoring Service       ===================
===================      Monitoring Request Manager      ===================
============================================================================

  EVENT MONITORING IS CURRENTLY ENABLED.
  EMS Version : A.04.10
  STM Version : C.46.15

============================================================================
==============      Monitoring Request Manager Main Menu      ==============
============================================================================

Note: Monitoring requests let you specify the events for monitors
      to report and the notification methods to use.

Select:
   (S)how monitoring requests configured via monconfig
   (C)heck detailed monitoring status
   (L)ist descriptions of available monitors
   (A)dd a monitoring request
   (D)elete a monitoring request
   (M)odify an existing monitoring request
   (E)nable Monitoring
   (K)ill (disable) monitoring
   (H)elp
   (Q)uit
   Enter selection: [s]
The following example creates a custom monitoring request to show how MRM is configured:
(1) Log on to the system as root.
(2) Run /etc/opt/resmon/lbin/monconfig to open the MRM main menu (as shown above).
(3) Enter a and press Enter. This selects the option (A)dd a monitoring request.
(4) The available hardware monitors are displayed. In most cases, select all of them: enter a and press Enter.
(5) Select the baseline event severity level. 2) Minor warning is recommended.
(6) Select the alarm trigger condition: choose 4) >=
(7) Select the notification method for event information: choose 6) email
(8) Enter the recipient of the event alert email. Use the user name you need, for example, monitor.
(9) Enter a comment for this monitoring request and select (A)dd.
(10) For the client configuration file, select (C)lear.
(11) Save the configuration and return to the main menu.
(12) On the main menu, select (S)how monitoring requests configured via monconfig to check that the newly created monitoring request exists.
(13) Return to the MRM main menu and select (C)heck detailed monitoring status to view all active monitoring statuses. The output varies with the host configuration: EMS ignores hardware that does not exist on the host, even though all hardware was selected in step 4.
(14) Select (E)nable monitoring to enable the EMS service.
Note: After these steps, the new monitoring request watches all hardware modules in real time (step 4), but only events with a severity greater than or equal to minor warning (steps 5 and 6) are reported to the user monitor (step 8) by email (step 7).
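The trigger rule described in the note can be sketched in Python. This is only an illustration of the comparison, not EMS code: the names Severity and should_notify are hypothetical, and the numeric levels other than 2 (minor warning) are assumed to run from 1 (information) to 5 (critical).

```python
import operator
from enum import IntEnum


class Severity(IntEnum):
    # Assumed numbering; only "2) Minor warning" is taken from the
    # monconfig menu step above, the rest is an assumption.
    INFORMATION = 1
    MINOR_WARNING = 2
    MAJOR_WARNING = 3
    SERIOUS = 4
    CRITICAL = 5


# Comparison operators a monitoring request could use as trigger condition.
OPERATORS = {
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
    "=": operator.eq,
    "!=": operator.ne,
}


def should_notify(event_severity, threshold=Severity.MINOR_WARNING, op=">="):
    """Return True if an event of this severity meets the trigger condition."""
    return OPERATORS[op](event_severity, threshold)
```

With the defaults, which mirror steps 5 and 6, should_notify(Severity.CRITICAL) is true and should_notify(Severity.INFORMATION) is false, so informational events would not produce mail.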
2. How to get information from event mail
The event alert emails generated by EMS can be received over the internal network without configuring a Domain Name Server. They are sent to the target user defined in advance (monitor in this example) and can be retrieved with a mail client on a PC (Outlook, for example).
Taking Outlook as an example: to receive event mail, create a new mail account in the mail client. Use the HP-UX user name specified in MRM as the user name, that user's HP-UX password as the password, and the IP address of the monitored host as the POP3/SMTP server. We recommend setting an automatic receive interval in Outlook so that event information from EMS arrives in time.
Note:
(1) Because of the HP-UX security mechanism, mail for the root user cannot be retrieved through client software. In MRM, therefore, specify an ordinary user as the recipient of event mail, for example, the monitor user created here.
(2) The POP3/POP ports 110/109 must be open on the network.
(3) The event mail user is a regular HP-UX user and can also log on to the host. We recommend changing this user's HP-UX password regularly; the password stored in Outlook must then be changed to match.
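Any POP3 client can collect the event mail, not only Outlook. The following is a minimal sketch using Python's standard poplib module, assuming the monitored host runs a POP3 server on port 110 as described in the notes above. The function names connect and fetch_event_mails are hypothetical helpers introduced for illustration.

```python
import poplib
from email import message_from_bytes


def connect(host, port=110, timeout=10):
    """Open a plain POP3 session to the monitored host (port 110 assumed)."""
    return poplib.POP3(host, port, timeout=timeout)


def fetch_event_mails(conn, user, password):
    """Log in on an open POP3 session and return all messages in the mailbox.

    conn is any object with the poplib.POP3 methods used below, so a
    stand-in object can be passed for testing without a network.
    """
    conn.user(user)
    conn.pass_(password)
    count, _mailbox_size = conn.stat()
    messages = []
    for index in range(1, count + 1):  # POP3 message numbers start at 1
        _response, lines, _octets = conn.retr(index)
        messages.append(message_from_bytes(b"\r\n".join(lines)))
    conn.quit()
    return messages
```

A call such as fetch_event_mails(connect(host_ip), "monitor", password) would return the waiting mails as email.message.Message objects. The sketch leaves the messages on the server and sends the password unencrypted, which matches plain POP3 on an internal network but may not suit every environment.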
The following is an example of an event alert email generated by EMS. The fault was produced by manually unplugging a hard disk, which caused a system exception. The annotations marked with <-- are not part of the mail; they were originally notes in Chinese.
>------------ Event Monitoring Service Event Notification ------------<

Notification Time: Wed Jun 8 23:26:18 2005   <-- event trigger time

hpux1 sent Event Monitor notification information:   <-- host name

/storage/events/disks/default/0_0_1_1.15.0 is >= 2.   <-- hardware module and trigger condition
Its current value is CRITICAL(5).   <-- severity of the event

User Comments:
Just a test :)

Event data from monitor:

Event Time..........: Wed Jun 8 23:26:16 2005
Severity............: CRITICAL
Monitor.............: disk_em
Event #.............: 101
System..............: hpux1

Summary:   <-- event overview
     Disk at hardware path 0/0/1/1.15.0 : Device removed from monitoring

Description of Error:   <-- fault description

     The device has been removed from the list of devices being monitored by
     this monitor.

Probable Cause / Recommended Action:   <-- possible cause / recommended solution

     The device was removed from the system, has stopped responding to the
     system or it has been replaced with a device that is not supported by this
     monitor.

     Run ioscan to determine the state and type of the device.

     Check the /var/stm/data/os_decode_xref for the information indicating
     which devices are supported by this monitor.

     Check other monitors to determine if they are now monitoring the
     device by running /etc/opt/resmon/lbin/monconfig and using the "Check
     monitoring" command.

Additional Event Data:
     System IP Address...: 15.85.114.14   <-- host IP address
     Event Id............: 0x42a70e1800000000
     Monitor Version.....: B.01.01
     Event Class.........: I/O   <-- event category
     Client Configuration File...........:
     /var/stm/config/tools/monitor/default_disk_em.clcfg
     Client Configuration File Version...: A.01.00
          Qualification criteria met.
               Number of events..: 1
     Associated OS error log entry id(s):
          None
     Additional System Data:
          System Model Number.............: 9000/800/A500-44   <-- host model number
          OS version...
          STM Version.....................: A.45.00
          EMS Version.....................: A.04.00
     Latest information on this event:
          http://docs.hp.com/hpux/content/hardware/ems/disk_em.htm#101

v-v-v-v-v-v-v-v-v-v-v    D  E  T  A  I  L  S    v-v-v-v-v-v-v-v-v-v-v

Component Data:
     Physical Device Path...: 0/0/1/1.15.0   <-- physical path of the faulty device
     Device Class...........: Disk   <-- device type
     Inquiry Vendor ID......: Seagate   <-- equipment manufacturer
     Inquiry Product ID.....: st34572wc   <-- product number
     Firmware Version.......: hp03   <-- firmware version
     Serial Number..........: jkj118650qpjcx   <-- serial number of the faulty part

>---------- End Event Monitoring Service Event Notification ----------<
The event mail shows the fault event, the host name, the event severity, the physical path of the faulty disk, the product ID of the hard disk, recommended inspection steps, the host model, the operating system version, and other information that helps you detect and troubleshoot host hardware faults.
However, a host hardware failure is not always a simple fault of a single component. The probable cause and recommended action given in the event mail may therefore turn out not to match the fault that is finally identified. This is normal. Fault analysis usually requires additional tools and methods.
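Because most fields in the notification follow a "Name......: value" layout, they can be picked out of the mail text automatically. The following Python sketch shows one way to do this. It assumes the raw, un-annotated mail body and a dot leader of at least two dots, both inferred from the sample above rather than from a documented format; parse_event_fields is a hypothetical helper name.

```python
import re

# Matches lines such as "Severity............: CRITICAL".
# The dot leader of two or more dots is an assumption based on the
# sample notification, not a documented EMS format.
FIELD_LINE = re.compile(
    r"^\s*(?P<key>[^.:]+?)\s*\.{2,}\s*:\s*(?P<value>\S.*?)\s*$"
)


def parse_event_fields(mail_body):
    """Return a dict of the dotted "key: value" fields found in the mail body."""
    fields = {}
    for line in mail_body.splitlines():
        match = FIELD_LINE.match(line)
        if match:
            fields[match.group("key")] = match.group("value")
    return fields
```

For the sample mail, the returned dictionary would contain entries such as "Severity", "Monitor", "Event #", and "Physical Device Path". Headings like "Summary:" and fields whose value sits on the following line, such as the client configuration file, are skipped by this sketch.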