Our company makes mobile games, and all of our servers are Dell machines: models R410, R420, R710, and R720, running CentOS 5.x, CentOS 6.x, Red Hat 5.x, Red Hat 6.x, Ubuntu 12.04, and Ubuntu 12.04.4. For hardware monitoring I tested IPMI, MegaCLI, and SMART, but each of these tools covers only a few items and none of them is widely used. In the end I found that Dell's own OMSA (OpenManage Server Administrator) meets my needs. The following describes how to use OMSA to monitor the hardware of a Dell server.
Currently, I monitor the following hardware information:
1. CPU status
2. CPU power-saving mode (if power-saving mode is enabled, performance suffers badly under heavy load)
3. RAID status (which RAID level is configured and whether the array is healthy)
4. Memory status (the maximum memory the server supports and the amount currently installed; if a module fails, its slot can be identified)
5. Machine temperature (whether the temperature exceeds the threshold)
6. Physical disk status (whether a physical disk has failed)
7. Power supply status (whether a single or dual power supply has failed)
8. System board CMOS battery (whether the CMOS battery has failed)
9. NIC status (the current number of NICs and whether any NIC has failed)
10. Fans (the current number of fans and whether any has failed)
The alarm for CPU power-saving mode is disabled by default. All other items are checked every 15 minutes, and an alarm is triggered when a problem is detected twice in a row.
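The "twice in a row" rule can be illustrated with a small shell sketch. The function name should_alarm is hypothetical and is not part of the template; in practice Zabbix evaluates this rule in the trigger itself. The sketch assumes each check result is 1 for healthy and 0 for a problem, as the agent items in this article report.

```shell
#!/bin/sh
# should_alarm (hypothetical helper): takes check results, oldest first,
# one per argument (1 = healthy, 0 = problem). Prints "alarm" only when
# the two most recent results are both 0, otherwise prints "ok".
should_alarm() {
    prev=1
    last=1
    for value in "$@"; do
        prev=$last
        last=$value
    done
    if [ "$prev" = "0" ] && [ "$last" = "0" ]; then
        echo "alarm"
    else
        echo "ok"
    fi
}

should_alarm 1 0 1   # a single failed check does not alarm
should_alarm 1 0 0   # two failed checks in a row do
```

With a 15-minute interval, this means a fault is reported roughly 15 to 30 minutes after it first appears, in exchange for ignoring one-off glitches.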
The monitoring screens look like this:
1. A server whose hardware is all healthy
2. A server with some hardware faults
Here the CPU of the server has power-saving mode enabled, and the memory is faulty.
Checking from the command line shows that the faulty memory module is in the first slot.
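A command-line check of this kind could be sketched as follows. The helper faulty_dimms is hypothetical, and it assumes that `omreport chassis memory` prints one "Status" line followed by a "Connector Name" line for each module, with "Ok" as the healthy value; the exact layout may differ between OMSA versions.

```shell
#!/bin/sh
# faulty_dimms (hypothetical helper): reads `omreport chassis memory`
# output on stdin and prints "<connector>: <status>" for every module
# whose status is not "Ok".
# Assumed layout per module:  Status : <value>  then  Connector Name : <slot>
faulty_dimms() {
    awk -F':' '
        # trim() strips leading and trailing blanks from a field.
        function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }
        trim($1) == "Status" { status = trim($2) }
        trim($1) == "Connector Name" {
            if (status != "" && status != "Ok") print trim($2) ": " status
            status = ""
        }
    '
}

# Usage on a server with OMSA installed:
#   omreport chassis memory | faulty_dimms
```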
The installation method is as follows:
I. Client
A. Installing on Red Hat or CentOS
1. Install the Dell yum repository
wget -q -O - http://linux.dell.com/repo/hardware/latest/bootstrap.cgi | bash
2. Install omsa
yum install srvadmin-all
3. Create symbolic links
ln -s /opt/dell/srvadmin/sbin/omreport /usr/bin/omreport
ln -s /opt/dell/srvadmin/sbin/omconfig /usr/bin/omconfig
4. Disable the web interface (only the CLI is needed)
echo "/usr/bin/omconfig system webserver action=stop" >> /opt/dell/srvadmin/sbin/srvadmin-services.sh
5. Start OMSA
/opt/dell/srvadmin/sbin/srvadmin-services.sh start
6. Start OMSA at boot
echo "/opt/dell/srvadmin/sbin/srvadmin-services.sh start" >> /etc/rc.local
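Writing to a startup file with echo and a redirect is easy to get wrong: a single > overwrites the file, and rerunning an append adds the same line again. A minimal sketch of a guard, using a hypothetical helper named append_once that is not part of OMSA:

```shell
#!/bin/sh
# append_once LINE FILE (hypothetical helper): append LINE to FILE only
# if FILE does not already contain an identical line.
append_once() {
    line=$1
    file=$2
    # -x matches whole lines, -F treats the text literally, -q stays silent.
    if ! grep -qxF -- "$line" "$file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Usage (as root):
#   append_once "/opt/dell/srvadmin/sbin/srvadmin-services.sh start" /etc/rc.local
```

Running the helper any number of times leaves exactly one copy of the line in the file.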
That completes the OMSA installation on CentOS or Red Hat.
B. Installing on Ubuntu
1. Add the repository
echo 'deb http://linux.dell.com/repo/community/ubuntu precise openmanage' | sudo tee -a /etc/apt/sources.list.d/linux.dell.com.sources.list
2. Fetch and add the repository key
gpg --keyserver pool.sks-keyservers.net --recv-key 1285491434D8786F
gpg -a --export 1285491434D8786F | sudo apt-key add -
3. Update the package index
apt-get update -y
4. Install OMSA
apt-get install srvadmin-all -y
5. Create symbolic links
ln -s /opt/dell/srvadmin/sbin/omreport /usr/bin/omreport
ln -s /opt/dell/srvadmin/sbin/omconfig /usr/bin/omconfig
6. Start OMSA in CLI mode
service dataeng start
C. Zabbix client configuration
1. Add the following to zabbix_agentd.conf:
# Hardware monitoring items
UserParameter=hardware_battery,omreport chassis batteries | awk '/^Status/{if($NF=="Ok") {print 1} else {print 0}}'
UserParameter=hardware_cpu_model,awk -v hardware_cpu_crontol=`sudo omreport chassis biossetup | awk '/C State/{if($NF=="Enabled") {print 0} else {print 1}}'` -v hardware_cpu_c1=`sudo omreport chassis biossetup | awk '/C1[-|E]/{if($NF=="Enabled") {print 0} else {print 1}}'` 'BEGIN{if(hardware_cpu_crontol==0 && hardware_cpu_c1==0) {print 0} else {print 1}}'
UserParameter=hardware_fan_health,awk -v hardware_fan_number=`omreport chassis fans | grep -c "^Index"` -v hardware_fan=`omreport chassis fans | awk '/^Status/{if($NF=="Ok") count+=1}END{print count}'` 'BEGIN{if(hardware_fan_number==hardware_fan) {print 1} else {print 0}}'
UserParameter=hardware_memory_health,awk -v hardware_memory=`omreport chassis memory | awk '/^Health/{print $NF}'` 'BEGIN{if(hardware_memory=="Ok") {print 1} else {print 0}}'
UserParameter=hardware_nic_health,awk -v hardware_nic_number=`omreport chassis nics | grep -c "Interface Name"` -v hardware_nic=`omreport chassis nics | awk '/^Connection Status/{print $NF}' | wc -l` 'BEGIN{if(hardware_nic_number==hardware_nic) {print 1} else {print 0}}'
UserParameter=hardware_cpu,omreport chassis processors | awk '/^Health/{if($NF=="Ok") {print 1} else {print 0}}'
UserParameter=hardware_power_health,awk -v hardware_power_number=`omreport chassis pwrsupplies | grep -c "Index"` -v hardware_power=`omreport chassis pwrsupplies | awk '/^Status/{if($NF=="Ok") count+=1}END{print count}'` 'BEGIN{if(hardware_power_number==hardware_power) {print 1} else {print 0}}'
UserParameter=hardware_temp,omreport chassis temps | awk '/^Status/{if($NF=="Ok") {print 1} else {print 0}}' | head -n 1
UserParameter=hardware_physics_health,awk -v hardware_physics_disk_number=`omreport storage pdisk controller=0 | grep -c "^ID"` -v hardware_physics_disk=`omreport storage pdisk controller=0 | awk '/^Status/{if($NF=="Ok") count+=1}END{print count}'` 'BEGIN{if(hardware_physics_disk_number==hardware_physics_disk) {print 1} else {print 0}}'
UserParameter=hardware_virtual_health,awk -v hardware_virtual_disk_number=`omreport storage vdisk controller=0 | grep -c "^ID"` -v hardware_virtual_disk=`omreport storage vdisk controller=0 | awk '/^Status/{if($NF=="Ok") count+=1}END{print count}'` 'BEGIN{if(hardware_virtual_disk_number==hardware_virtual_disk) {print 1} else {print 0}}'
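Several of these items follow the same pattern: count the components, count how many report a healthy status, and print 1 only when the two numbers match. A minimal sketch of that pattern as a single filter is shown here. The helper all_ok and the script path in the usage comment are hypothetical, and the sketch assumes that omreport prints the healthy value as "Ok" on lines beginning with "Status".

```shell
#!/bin/sh
# all_ok (hypothetical helper): reads omreport output on stdin and prints
# 1 when there is at least one "Status" line and every one ends in "Ok";
# prints 0 otherwise, including when no Status line is found at all.
all_ok() {
    awk '
        /^Status/ { total++; if ($NF == "Ok") ok++ }
        END { if (total > 0 && ok == total) print 1; else print 0 }
    '
}

# Possible use in zabbix_agentd.conf, if saved as a script (path is an assumption):
#   UserParameter=hardware_fan_health,omreport chassis fans | /usr/local/bin/all_ok.sh
```

Doing the count in one awk pass calls omreport once per check instead of twice, and it still prints 0 when omreport returns nothing, for example when the OMSA services are not running.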
2. Restart the zabbix_agentd service
ps -ef | grep zabbix | grep -v grep | awk '{print $2}' | xargs kill -9
/usr/local/zabbix/sbin/zabbix_agentd -c /usr/local/zabbix/conf/zabbix_agentd.conf
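After the restart, the new item keys can be checked locally with the agent's test mode (the -t option of zabbix_agentd). The loop below is only a sketch: the function name test_keys is hypothetical, and the agent and configuration paths in the usage comment are the ones used in this article.

```shell
#!/bin/sh
# test_keys AGENT CONF KEY... (hypothetical helper): run the agent in
# test mode once for each item key and show what it would report.
test_keys() {
    agent=$1
    conf=$2
    shift 2
    for key in "$@"; do
        "$agent" -c "$conf" -t "$key"
    done
}

# Usage:
#   test_keys /usr/local/zabbix/sbin/zabbix_agentd \
#       /usr/local/zabbix/conf/zabbix_agentd.conf \
#       hardware_fan_health hardware_memory_health hardware_temp
```

A key that reports a healthy component should return 1; an unsupported or misspelled key is reported as not supported by the agent.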
If you need to install it on another system, refer to the official wiki at http://linux.dell.com/wiki/index.php/repository/hardware.
II. Server
1. Import the template
Import the template "Template Hardware Monitor" into Zabbix (the template is included in the attachment). The import steps themselves are not covered here.
2. Link hosts to the template
Link each hardware server that you want to monitor to this template.
This article is from the "Yin-Technical Exchange" blog. Please keep this source link: http://dl528888.blog.51cto.com/2382721/1403893