Zabbix enterprise applications: monitoring server hardware information

Our company develops mobile games, and all of our servers are Dell machines: R410, R420, R710, and R720 models running CentOS 5.x, CentOS 6.x, Red Hat 5.x, Red Hat 6.x, Ubuntu 12.04, and Ubuntu 12.04.4. For hardware monitoring I tested IPMI, MegaCli, and SMART, but each of these tools covers only a small part of the hardware and none of them is general-purpose. In the end I found that Dell's own OMSA (OpenManage Server Administrator) meets my needs. The rest of this article describes how to use OMSA to monitor the hardware of a Dell server.

Currently, I monitor the following hardware information:

1. CPU status

2. CPU power-saving mode (if power-saving mode is enabled, performance suffers badly under heavy load)

3. RAID status (which RAID level is configured and whether the array is healthy)

4. Memory status (the maximum memory the server supports, the amount currently installed, and, if a module is faulty, which slot it is in)

5. Temperature status (whether the machine temperature exceeds the threshold)

6. Physical disk status (whether any physical disk is faulty)

7. Power supply status (whether any supply is faulty, for both single and dual power supply configurations)

8. System board CMOS battery (whether the CMOS battery is faulty)

9. NIC status (the number of NICs present and whether any NIC is faulty)

10. Fans (the number of fans present and whether any fan is faulty)

By default, alerting is disabled for the CPU power-saving mode item. All other items are checked every 15 minutes, and an alert is triggered when a problem is detected twice in a row.
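The "twice in a row" rule can be sketched in shell as follows. This only illustrates the logic; in a real setup the rule lives in the trigger definitions of the Zabbix template, which are not reproduced here. The function name `should_alert` is hypothetical, and the sketch assumes the 1 = healthy, 0 = faulty convention used by the agent items later in this article.

```shell
#!/bin/sh
# should_alert PREVIOUS CURRENT   (hypothetical helper, not part of Zabbix)
# Prints "alert" only when the last two samples both report a fault (0);
# a single failed check is treated as a possible glitch and prints "ok".
should_alert() {
    if [ "$1" = "0" ] && [ "$2" = "0" ]; then
        echo "alert"
    else
        echo "ok"
    fi
}
```

With a 15-minute check interval, this means a fault has to persist for two consecutive checks before anyone is notified.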

The original post showed two screenshots of the monitoring view:

1. A server on which all hardware is healthy

2. A server on which some hardware is abnormal

In the second screenshot, the server's CPU has power-saving mode enabled and its memory is faulty.

Checking from the command line shows that the faulty memory module is in the first slot.
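A command-line check of this kind can be sketched as below. The helper name `faulty_dimms` is hypothetical, and the sketch assumes that `omreport chassis memory` lists each module as "Key : Value" lines in which a "Status" line precedes the module's "Connector Name" line; verify this against the output on your own servers before relying on it.

```shell
#!/bin/sh
# faulty_dimms   (hypothetical helper)
# Reads `omreport chassis memory` style output on stdin and prints
# "<connector>: <status>" for every module whose status is not "Ok".
# Assumes each module's "Status" line comes before its "Connector Name" line.
faulty_dimms() {
    awk -F ' *: *' '
        $1 == "Status"         { status = $2 }
        $1 == "Connector Name" { if (status != "Ok") print $2 ": " status }
    '
}
```

On a server with OMSA installed, the intended usage would be `omreport chassis memory | faulty_dimms`.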

The installation method is as follows:

I. Client

A. Installing on Red Hat or CentOS

1. Install the Dell yum repository

wget -q -O - http://linux.dell.com/repo/hardware/latest/bootstrap.cgi | bash

2. Install OMSA

yum install srvadmin-all

3. Create symbolic links

ln -s /opt/dell/srvadmin/sbin/omreport /usr/bin/omreport
ln -s /opt/dell/srvadmin/sbin/omconfig /usr/bin/omconfig

4. Disable web mode (so that only the CLI runs)

echo "/usr/bin/omconfig system webserver action=stop" >> /opt/dell/srvadmin/sbin/srvadmin-services.sh

5. Start OMSA

/opt/dell/srvadmin/sbin/srvadmin-services.sh start

6. Start OMSA at boot

echo "/opt/dell/srvadmin/sbin/srvadmin-services.sh start" >> /etc/rc.local
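When a line is added to a startup file such as /etc/rc.local, it helps to make the step safe to repeat, so that re-running the installation does not leave duplicate entries. A minimal sketch, using a hypothetical helper named `append_once`:

```shell
#!/bin/sh
# append_once LINE FILE   (hypothetical helper)
# Appends LINE to FILE only if FILE does not already contain that exact line.
append_once() {
    line=$1
    file=$2
    # -q: no output, -x: match the whole line, -F: treat LINE as a fixed string
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}
```

For this step the call would be `append_once "/opt/dell/srvadmin/sbin/srvadmin-services.sh start" /etc/rc.local`, run as root.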

That completes the OMSA installation on CentOS or Red Hat.

B. Installing on Ubuntu

1. Add the repository

echo 'deb http://linux.dell.com/repo/community/ubuntu precise openmanage' | sudo tee -a /etc/apt/sources.list.d/linux.dell.com.sources.list

2. Fetch and add the repository key

gpg --keyserver pool.sks-keyservers.net --recv-key 1285491434D8786F
gpg -a --export 1285491434D8786F | sudo apt-key add -

3. Update the package index

apt-get update -y

4. Install OMSA

apt-get install srvadmin-all -y

5. Create symbolic links

ln -s /opt/dell/srvadmin/sbin/omreport /usr/bin/omreport
ln -s /opt/dell/srvadmin/sbin/omconfig /usr/bin/omconfig

6. Start OMSA in CLI mode

service dataeng start
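Before moving on to the agent configuration, it may be worth confirming that the OMSA command-line tools are actually reachable on PATH, since the Zabbix items depend on them. A small sketch, with a hypothetical helper named `missing_tools`:

```shell
#!/bin/sh
# missing_tools NAME...   (hypothetical helper)
# Prints each NAME that cannot be found on PATH, one per line.
# Prints nothing when every tool is available.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}
```

Running `missing_tools omreport omconfig` should print nothing if the symbolic links from the steps above are in place.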

C. Zabbix agent configuration

1. Add the following to zabbix_agentd.conf:

# Hardware monitoring items
UserParameter=hardware_battery,omreport chassis batteries|awk '/^Status/{if($NF=="Ok") {print 1} else {print 0}}'
UserParameter=hardware_cpu_model,awk -v hardware_cpu_crontol=`sudo omreport chassis biossetup|awk '/C State/{if($NF=="Enabled") {print 0} else {print 1}}'` -v hardware_cpu_c1=`sudo omreport chassis biossetup|awk '/C1[-|E]/{if($NF=="Enabled") {print 0} else {print 1}}'` 'BEGIN{if(hardware_cpu_crontol==0 && hardware_cpu_c1==0) {print 0} else {print 1}}'
UserParameter=hardware_fan_health,awk -v hardware_fan_number=`omreport chassis fans|grep -c "^Index"` -v hardware_fan=`omreport chassis fans|awk '/^Status/{if($NF=="Ok") count+=1}END{print count}'` 'BEGIN{if(hardware_fan_number==hardware_fan) {print 1} else {print 0}}'
UserParameter=hardware_memory_health,awk -v hardware_memory=`omreport chassis memory|awk '/^Health/{print $NF}'` 'BEGIN{if(hardware_memory=="Ok") {print 1} else {print 0}}'
UserParameter=hardware_nic_health,awk -v hardware_nic_number=`omreport chassis nics|grep -c "Interface Name"` -v hardware_nic=`omreport chassis nics|awk '/^Connection Status/{print $NF}'|wc -l` 'BEGIN{if(hardware_nic_number==hardware_nic) {print 1} else {print 0}}'
UserParameter=hardware_cpu,omreport chassis processors|awk '/^Health/{if($NF=="Ok") {print 1} else {print 0}}'
UserParameter=hardware_power_health,awk -v hardware_power_number=`omreport chassis pwrsupplies|grep -c "Index"` -v hardware_power=`omreport chassis pwrsupplies|awk '/^Status/{if($NF=="Ok") count+=1}END{print count}'` 'BEGIN{if(hardware_power_number==hardware_power) {print 1} else {print 0}}'
UserParameter=hardware_temp,omreport chassis temps|awk '/^Status/{if($NF=="Ok") {print 1} else {print 0}}'|head -n 1
UserParameter=hardware_physics_health,awk -v hardware_physics_disk_number=`omreport storage pdisk controller=0|grep -c "^ID"` -v hardware_physics_disk=`omreport storage pdisk controller=0|awk '/^Status/{if($NF=="Ok") count+=1}END{print count}'` 'BEGIN{if(hardware_physics_disk_number==hardware_physics_disk) {print 1} else {print 0}}'
UserParameter=hardware_virtual_health,awk -v hardware_virtual_disk_number=`omreport storage vdisk controller=0|grep -c "^ID"` -v hardware_virtual_disk=`omreport storage vdisk controller=0|awk '/^Status/{if($NF=="Ok") count+=1}END{print count}'` 'BEGIN{if(hardware_virtual_disk_number==hardware_virtual_disk) {print 1} else {print 0}}'
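Most of the items above follow one pattern: count how many components omreport lists, count how many of them report an Ok status, and return 1 only if the two numbers match. The sketch below restates that pattern as a single function so it can be tried on captured output. The name `all_components_ok` is hypothetical, and the sketch deliberately differs from the items above in one respect: it reports 0 when no components are found at all, on the assumption that empty output means omreport itself failed.

```shell
#!/bin/sh
# all_components_ok   (hypothetical helper)
# Reads omreport-style "Key : Value" text on stdin.
# Prints 1 when at least one "Index" entry exists and every entry has a
# matching "Status : Ok" line; prints 0 otherwise.
all_components_ok() {
    awk '
        /^Index/                 { total++ }
        /^Status/ && $NF == "Ok" { ok++ }
        END {
            if (total > 0 && total == ok) { print 1 } else { print 0 }
        }
    '
}
```

For example, `omreport chassis fans | all_components_ok` should agree with the hardware_fan_health item on a healthy server. If your agent build supports the `-t` (test) option, running zabbix_agentd with your usual `-c` configuration file and `-t hardware_fan_health` is a convenient way to check an item before restarting the agent.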

2. Restart the zabbix_agentd service.

ps -ef | grep zabbix | grep -v grep | awk '{print $2}' | xargs kill -9
/usr/local/zabbix/sbin/zabbix_agentd -c /usr/local/zabbix/conf/zabbix_agentd.conf

If you need to install it on another system, refer to the official wiki at http://linux.dell.com/wiki/index.php/repository/hardware.

II. Server

1. Import the template

Import the "Template Hardware Monitor" template into Zabbix (the template is included in the attachment). The import procedure itself is not covered here.

2. Link hosts to the template

Link each hardware server you want to monitor to this template.
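Once a host is linked, the agent items can also be spot-checked from the Zabbix server with the zabbix_get utility. The sketch below assumes zabbix_get is installed on the server and that the agent accepts connections from it; the function name `report_hardware` and the `ZABBIX_GET` override are hypothetical. The hardware_cpu_model item is left out because its values have a different meaning from the 1 = healthy, 0 = faulty convention of the other items.

```shell
#!/bin/sh
# Item keys taken from the zabbix_agentd.conf section of this article.
HARDWARE_KEYS="hardware_battery hardware_cpu hardware_fan_health hardware_memory_health hardware_nic_health hardware_power_health hardware_temp hardware_physics_health hardware_virtual_health"

# report_hardware HOST   (hypothetical helper)
# Queries every key on HOST and prints one line per key.
# Set ZABBIX_GET to use a different command, for example a test stub.
report_hardware() {
    host=$1
    for key in $HARDWARE_KEYS; do
        value=$("${ZABBIX_GET:-zabbix_get}" -s "$host" -k "$key" 2>/dev/null)
        case $value in
            1) echo "$key: healthy" ;;
            0) echo "$key: FAULT" ;;
            *) echo "$key: unexpected value '$value'" ;;
        esac
    done
}
```

Calling `report_hardware` with the address of a monitored server prints nine lines, one per item, which makes it easy to see whether the agent side is working before looking at the template.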

This article is from the "Yin-Technical Exchange" blog, please be sure to keep this source http://dl528888.blog.51cto.com/2382721/1403893
