Zabbix Enterprise Applications: Monitoring Server Hardware Information

Source: Internet
Author: User

This article describes how to monitor server hardware information with Zabbix. All of our company's servers are Dell machines (we are a mobile game company): the models include the R410, R420, R710, and R720, and the operating systems include CentOS 5.x, CentOS 6.x, RedHat 5.x, RedHat 6.x, Ubuntu 12.04, and Ubuntu 12.04.4. For hardware monitoring I tested IPMI, MegaCli, and SMART, but these tools cover fewer items and are not general-purpose. In the end I found that Dell's own OMSA met my needs. The rest of this article describes how to use OMSA to monitor the hardware information of a Dell server.

Currently, I monitor the following hardware information:

1. CPU status
2. CPU power-saving mode (if power-saving mode is enabled, performance suffers badly under heavy load)
3. RAID status (which RAID level is configured and whether the RAID status is normal)
4. Memory status (the maximum memory size the server supports and the current memory size; if a memory module is faulty, you can find out which slot it is in)
5. Machine temperature (whether the temperature exceeds the threshold)
6. Physical disk status (whether any physical disk is faulty)
7. Power supply status (whether a power supply is faulty, for single or dual power supplies)
8. System board CMOS battery (whether the CMOS battery is faulty)
9. NIC status (the current number of NICs and whether any NIC is faulty)
10. Fans (the current number of fans and whether any fan is faulty)

By default, the alarm for CPU power-saving mode is disabled. All other items are checked once every 15 minutes, and an alarm is triggered when a problem is detected twice in a row.
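
The "twice in a row" rule would normally be expressed as a Zabbix trigger inside the template. Purely as an illustration of the rule itself, the shell sketch below reads item values one per line, oldest first, and reports whether the two most recent values are both 0. The function name two_in_a_row and the ALERT/OK strings are hypothetical, and the sketch assumes that 1 means healthy and 0 means faulty.

```shell
#!/bin/sh
# Illustration only: Zabbix itself evaluates this rule in a trigger.
# Input: one sample per line, oldest first (1 = healthy, 0 = faulty).
# Output: ALERT if the two most recent samples are both 0, otherwise OK.
two_in_a_row() {
    tail -n 2 | awk '
        $1 == 0 { bad++ }
        END { print ((NR == 2 && bad == 2) ? "ALERT" : "OK") }
    '
}

# Example: printf '1\n0\n0\n' | two_in_a_row
```

A single failed check followed by a healthy one therefore never raises an alarm, which filters out one-off glitches at the cost of a delay of one extra check interval.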

The monitoring screens look like this:
1. A server with normal hardware

2. A server with some abnormal hardware

Here the CPU of this server has power-saving mode enabled, and the memory is faulty.
Checking from the command line shows that the faulty memory module is in the first slot.
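
The command-line check mentioned above can be sketched as a small filter over the output of omreport chassis memory. This is a minimal sketch, not the author's exact command: the function name faulty_dimms is hypothetical, and it assumes that each module is printed as "Name : Value" lines in which a Status line appears before the matching Connector Name line.

```shell
#!/bin/sh
# Print "<connector>: <status>" for every memory module whose Status is not Ok.
# Assumed input layout (one block per module):
#   Status         : Critical
#   Connector Name : DIMM_A1
faulty_dimms() {
    awk -F':' '
        function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }
        trim($1) == "Status" { status = trim($2) }
        trim($1) == "Connector Name" {
            if (status != "" && toupper(status) != "OK")
                print trim($2) ": " status
            status = ""
        }
    '
}

# Usage on a server with OMSA installed:
#   omreport chassis memory | faulty_dimms
```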

The installation method is as follows:
I. Client
A. Installation on RedHat or CentOS
1. Install the Dell yum repository

wget -q -O - http://linux.dell.com/repo/hardware/latest/bootstrap.cgi | bash

2. Install OMSA
yum install srvadmin-all

3. Create symbolic links
ln -s /opt/dell/srvadmin/sbin/omreport /usr/bin/omreport
ln -s /opt/dell/srvadmin/sbin/omconfig /usr/bin/omconfig

4. Disable the web interface (so that only the CLI runs)
echo "/usr/bin/omconfig system webserver action=stop" >> /opt/dell/srvadmin/sbin/srvadmin-services.sh

5. Start OMSA
/opt/dell/srvadmin/sbin/srvadmin-services.sh start

6. Start OMSA at boot
echo "/opt/dell/srvadmin/sbin/srvadmin-services.sh start" >> /etc/rc.local

This completes the OMSA installation on CentOS or RedHat.
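
Steps 4 and 6 write a line into an existing file with echo and a redirect, so rerunning the steps can duplicate the line or damage the file. One possible safeguard is a helper that appends a line only when the file does not already contain it. The name append_once is hypothetical and not part of OMSA or Zabbix.

```shell
#!/bin/sh
# Append LINE to FILE only if FILE does not already contain that exact line.
# The file is created if it does not exist yet.
append_once() {
    line=$1
    file=$2
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Usage for step 6:
#   append_once "/opt/dell/srvadmin/sbin/srvadmin-services.sh start" /etc/rc.local
```

Using grep -x with -F matches the whole line literally, so paths containing dots or slashes are not interpreted as patterns.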
B. Installation on Ubuntu
1. Add the repository
echo 'deb http://linux.dell.com/repo/community/ubuntu precise openmanage' | sudo tee -a /etc/apt/sources.list.d/linux.dell.com.sources.list

2. Fetch and add the key
gpg --keyserver pool.sks-keyservers.net --recv-key 1285491434D8786F
gpg -a --export 1285491434D8786F | sudo apt-key add -

3. Update the package lists
apt-get update -y

4. Install OMSA
apt-get install srvadmin-all -y

5. Create symbolic links
ln -s /opt/dell/srvadmin/sbin/omreport /usr/bin/omreport
ln -s /opt/dell/srvadmin/sbin/omconfig /usr/bin/omconfig

6. Start OMSA in CLI mode
service dataeng start
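
On either distribution, the Zabbix items rely on omreport being reachable through PATH. A quick way to confirm that before touching the agent configuration might look like the sketch below; the function name missing_tools is hypothetical.

```shell
#!/bin/sh
# Print the name of every given command that cannot be found in PATH.
# No output means all commands are available.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Usage after the installation steps:
#   missing_tools omreport omconfig
```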

C. Zabbix client configuration
1. Add the following to zabbix_agentd.conf:
# The following items monitor hardware (status values are compared case-insensitively)
UserParameter=hardware_battery,omreport chassis batteries|awk '/^Status/{if(toupper($NF)=="OK") {print 1} else {print 0}}'
UserParameter=hardware_cpu_model,awk -v hardware_cpu_crontol=`sudo omreport chassis biossetup|awk '/C State/{if($NF=="Enabled") {print 0} else {print 1}}'` -v hardware_cpu_c1=`sudo omreport chassis biossetup|awk '/C1[-|E]/{if($NF=="Enabled") {print 0} else {print 1}}'` 'BEGIN{if(hardware_cpu_crontol==0 && hardware_cpu_c1==0) {print 0} else {print 1}}'
UserParameter=hardware_fan_health,awk -v hardware_fan_number=`omreport chassis fans|grep -c "^Index"` -v hardware_fan=`omreport chassis fans|awk '/^Status/{if(toupper($NF)=="OK") count+=1}END{print count}'` 'BEGIN{if(hardware_fan_number==hardware_fan) {print 1} else {print 0}}'
UserParameter=hardware_memory_health,awk -v hardware_memory=`omreport chassis memory|awk '/^Health/{print $NF}'` 'BEGIN{if(toupper(hardware_memory)=="OK") {print 1} else {print 0}}'
UserParameter=hardware_nic_health,awk -v hardware_nic_number=`omreport chassis nics|grep -c "Interface Name"` -v hardware_nic=`omreport chassis nics|awk '/^Connection Status/{print $NF}'|wc -l` 'BEGIN{if(hardware_nic_number==hardware_nic) {print 1} else {print 0}}'
UserParameter=hardware_cpu,omreport chassis processors|awk '/^Health/{if(toupper($NF)=="OK") {print 1} else {print 0}}'
UserParameter=hardware_power_health,awk -v hardware_power_number=`omreport chassis pwrsupplies|grep -c "Index"` -v hardware_power=`omreport chassis pwrsupplies|awk '/^Status/{if(toupper($NF)=="OK") count+=1}END{print count}'` 'BEGIN{if(hardware_power_number==hardware_power) {print 1} else {print 0}}'
UserParameter=hardware_temp,omreport chassis temps|awk '/^Status/{if(toupper($NF)=="OK") {print 1} else {print 0}}'|head -n 1
UserParameter=hardware_physics_health,awk -v hardware_physics_disk_number=`omreport storage pdisk controller=0|grep -c "^ID"` -v hardware_physics_disk=`omreport storage pdisk controller=0|awk '/^Status/{if(toupper($NF)=="OK") count+=1}END{print count}'` 'BEGIN{if(hardware_physics_disk_number==hardware_physics_disk) {print 1} else {print 0}}'
UserParameter=hardware_virtual_health,awk -v hardware_virtual_disk_number=`omreport storage vdisk controller=0|grep -c "^ID"` -v hardware_virtual_disk=`omreport storage vdisk controller=0|awk '/^Status/{if(toupper($NF)=="OK") count+=1}END{print count}'` 'BEGIN{if(hardware_virtual_disk_number==hardware_virtual_disk) {print 1} else {print 0}}'
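
Most of the items above share one idea: count the components that omreport lists, count how many of them report a healthy status, and return 1 only when the two numbers match. The same logic can be sketched as a standalone shell function, which is easier to read and to try out by hand than a one-line UserParameter. The name all_ok is hypothetical, and the sketch assumes "Name : Value" lines on standard input.

```shell
#!/bin/sh
# Print 1 when every component on stdin reports Status "Ok", otherwise 0.
# $1 (optional) is the field that starts a component block: "Index" by
# default, or "ID" for the storage pdisk/vdisk reports.
all_ok() {
    awk -F':' -v field="${1:-Index}" '
        function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }
        trim($1) == field { total++ }
        trim($1) == "Status" && toupper(trim($2)) == "OK" { ok++ }
        END { print ((total > 0 && total == ok) ? 1 : 0) }
    '
}

# Usage on a server with OMSA installed:
#   omreport chassis fans | all_ok
#   omreport storage pdisk controller=0 | all_ok ID
```

Unlike the one-liners, this version returns 0 when no components are listed at all, which is a deliberate choice here so that missing omreport output is not reported as healthy. Once an item is defined in zabbix_agentd.conf, running zabbix_agentd with the -t option and the item key on the client should show the value the agent would return.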

2. Restart the zabbix_agentd service.
ps -ef | grep zabbix | grep -v grep | awk '{print $2}' | xargs kill -9
/usr/local/zabbix/sbin/zabbix_agentd -c /usr/local/zabbix/conf/zabbix_agentd.conf

If you need to install it on another system, refer to the official wiki at http://linux.dell.com/wiki/index.php/repository/hardware.

For Windows systems, see the article "Zabbix enterprise applications: installing OMSA hardware monitoring on Windows".
II. Server
1. Import the template
Import Template Hardware Monitor into Zabbix (the template is included in the attachment). The specific steps are not described here.
2. Link the template to hosts
Link the hardware servers to be monitored to this template.

The attachment can be downloaded free of charge from http://linux.bkjia.com/

The username and password are both www.bkjia.com

The download directory is /July 6,/July 11/July 11/Zabbix enterprise application server hardware information monitoring/
