Monitoring ESXi hardware with Nagios
General server hardware can be monitored with Nagios plus OpenManage, but how do we monitor the hardware of ESXi hosts in vSphere?
There are two options:
1. Use the Nagios plug-in check_esx, which requires installing the VMware vSphere SDK for Perl toolkit.
2. Use the Nagios plug-in check_esxi_hardware.py, which is written in Python.
The second method is simpler: Python already ships with most Linux distributions. Need any more reasons?
Let's take a look at the introduction on the official website:
http://www.claudiokuenzler.com/nagios-plugins/check_esxi_hardware.php#.VWV5_JCUfTA
It lists the following:
Requirements
- Python must be installed
- The Python extension pywbem must be installed
  (Windows users: the website has a step-by-step guide to installing Python and PyWBEM on a Windows server.)
- If there is a firewall between your monitoring and ESXi server, open ports 443 and 5989
In other words, monitoring has these prerequisites:
1. Python must be installed.
2. The Python extension package pywbem must be installed.
3. Ports 443 and 5989 on the ESXi host must be reachable from the Nagios monitoring server.
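The prerequisites above can be checked from the monitoring server before installing anything. The following is only a minimal sketch, assuming Python 3; it is not part of the plug-in, and the helper names module_available, port_open and check_prerequisites are made up for illustration.

```python
import importlib.util
import socket


def module_available(name):
    """Return True if the named top-level Python module can be imported."""
    return importlib.util.find_spec(name) is not None


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_prerequisites(host, ports=(443, 5989)):
    """Collect the prerequisite checks for one ESXi host into a dict."""
    result = {"pywbem": module_available("pywbem")}
    for port in ports:
        result["port %d" % port] = port_open(host, port)
    return result
```

Calling check_prerequisites("10.10.10.1") (the example ESXi address used later in this article) returns a dict with one True/False entry for pywbem and one for each port.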
Now let's set it up.
1. Install check_esxi_hardware.py
cd /usr/local/nagios/libexec
wget http://www.claudiokuenzler.com/nagios-plugins/check_esxi_hardware.py
chown nagios.nagios check_esxi_hardware.py
chmod 755 check_esxi_hardware.py
After the installation is complete, let's check the parameters of this plug-in:
[root@nagios libexec]# ./check_esxi_hardware.py
Traceback (most recent call last):
  File "./check_esxi_hardware.py", line 222, in <module>
    import pywbem
ImportError: No module named pywbem
[root@nagios libexec]# ./check_esxi_hardware.py -h
Traceback (most recent call last):
  File "./check_esxi_hardware.py", line 222, in <module>
    import pywbem
ImportError: No module named pywbem
It turns out that the pywbem module is not installed, so let's install it now.
2. Install the third-party Python module pywbem
cd /usr/local/src
# Quote the URL (it contains "&") and name the output file explicitly
wget -O pywbem-0.7.0.tar.gz "http://downloads.sourceforge.net/project/pywbem/pywbem/pywbem-0.7/pywbem-0.7.0.tar.gz?r=http%3A%2F%2Fsourceforge.net%2Fprojects%2Fpywbem%2Ffiles%2Fpywbem%2F&ts=1299742557&use_mirror=voxel"
tar -zxvf pywbem-0.7.0.tar.gz
cd pywbem-0.7.0
python setup.py build
python setup.py install --record files.txt
OK. pywbem installation is complete.
Notes:
(1) Don't use pywbem 0.8.0; that version has bugs that make the plug-in unusable.
(2) python setup.py install --record files.txt records the installed files, which makes it easy to uninstall the module later: cat files.txt | xargs rm -rf
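The uninstall step in note (2) can also be done a little more cautiously than with rm -rf. The following is a minimal sketch, assuming Python 3 and a record file written by --record as above; the function name uninstall_from_record is made up for illustration, and it deletes only regular files that still exist.

```python
import os


def uninstall_from_record(record_path):
    """Delete every file listed in a setup.py --record file; return removed paths."""
    removed = []
    with open(record_path) as record:
        for line in record:
            path = line.strip()
            # Skip blank lines and anything that is not an existing regular file.
            if path and os.path.isfile(path):
                os.remove(path)
                removed.append(path)
    return removed
```

Unlike xargs rm -rf, this never removes directories, so a stray directory entry in files.txt cannot wipe out more than intended.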
3. Use the plug-in
[root@nagios libexec]# ./check_esxi_hardware.py
no parameters specified
Usage: check_esxi_hardware.py https://hostname user password system [verbose]
example: check_esxi_hardware.py https://my-shiny-new-vmware-server root fakepassword dell

or, using new style options:

usage: check_esxi_hardware.py -H hostname -U username -P password [-V system -v -p -I XX]
example: check_esxi_hardware.py -H my-shiny-new-vmware-server -U root -P fakepassword -V auto -I uk

or, verbosely:

usage: check_esxi_hardware.py --host=hostname --user=username --pass=password [--vendor=system --verbose --perfdata --html=XX]

Options:
  --version                       show program's version number and exit
  -h, --help                      show this help message and exit

  Mandatory parameters:
    -H HOST, --host=HOST          report on HOST
    -U USER, --user=USER          user to connect as
    -P PASS, --pass=PASS          password, if password matches file:<path>, first line of given file will be used as password

  Optional parameters:
    -V VENDOR, --vendor=VENDOR    Vendor code: auto, dell, hp, ibm, intel, or unknown (default)
    -v, --verbose                 print status messages to stdout (default is to be quiet)
    -p, --perfdata                collect performance data for pnp4nagios (default is not to)
    -I XX, --html=XX              generate html links for country XX (default is not to)
    -t TIMEOUT, --timeout=TIMEOUT timeout in seconds - no effect on Windows (default = no timeout)
    -i IGNORE, --ignore=IGNORE    comma-separated list of elements to ignore
    --no-power                    don't collect power performance data
    --no-volts                    don't collect voltage performance data
    --no-current                  don't collect current performance data
    --no-temp                     don't collect temperature performance data
    --no-fan                      don't collect fan performance data
As shown above, the plug-in needs a user name and password to connect to the ESXi host. For security, it is enough to create a read-only user on the ESXi host.
-U  user name
-P  password
-V  server type, such as dell or hp; choose it according to your hardware
-v  print status information
-p  output performance data for use with a graphing tool
-I  add links to the Dell (or other vendor) website to the output, to help find a solution
-t  timeout
-i  ignore a monitored element
--no-power  do not collect power data; the remaining --no-* options work the same way
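To make the options concrete, here is a minimal sketch (assuming Python 3.7 or later) that assembles the plug-in's argument list and maps its exit code to a state name. The function names build_command and run_check are made up for illustration, the plug-in path is the one used in step 1, and the exit-code table is the usual Nagios plug-in convention, which this plug-in is assumed to follow.

```python
import subprocess

# Usual Nagios plug-in exit codes (assumed to apply to this plug-in as well).
NAGIOS_STATES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}

# Path from step 1 of this article.
PLUGIN = "/usr/local/nagios/libexec/check_esxi_hardware.py"


def build_command(host, user, password, vendor=None, perfdata=False,
                  timeout=None, ignore=None):
    """Assemble the plug-in's argument list from the options described above."""
    cmd = [PLUGIN, "-H", host, "-U", user, "-P", password]
    if vendor:
        cmd += ["-V", vendor]
    if perfdata:
        cmd.append("-p")
    if timeout is not None:
        cmd += ["-t", str(timeout)]
    if ignore:
        # -i takes a comma-separated list of elements to ignore.
        cmd += ["-i", ",".join(ignore)]
    return cmd


def run_check(cmd):
    """Run a command and return (state name, first line of its output)."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    lines = result.stdout.splitlines()
    first_line = lines[0] if lines else ""
    return NAGIOS_STATES.get(result.returncode, "UNKNOWN"), first_line
```

For example, run_check(build_command("10.10.10.1", "nagios", "nagios", vendor="dell")) runs the same check as the manual test in step 5.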
4. Create a read-only user on the ESXi host
(1) Log on to the ESXi host. In the "Local Users & Groups" tab, right-click in the blank area and choose "Add" to add a user named nagios.
(2) Give the nagios user the "Read-only" role. In the "Permissions" tab, right-click in the blank area and choose "Add Permission".
OK. The read-only user nagios is added.
5. Test
[root@nagios libexec]# ./check_esxi_hardware.py -H 10.10.10.1 -U nagios -P nagios -V dell
UNKNOWN: Authentication Error
The check fails with an authentication error. From what I found online, the cause is a difference between ESXi versions: my local test machine and my production environment both run 5.5, but their minor versions differ, and the error appears.
Solution:
Log on to the ESXi host over SSH and edit the following file:
~ # cat /etc/security/access.conf
# This file is autogenerated and must not be edited.
+:dcui:ALL
+:root:ALL
+:vpxuser:ALL
+:vslauser:ALL
-:nagios:ALL
-:ALL:ALL
Remove "-:nagios:ALL" and add "+:nagios:sfcb" as the second line.
This works well when users are rarely added, because you only need to make the change once. If users are added frequently, however, access.conf may be changed again, and you then need a scheduled task that re-adds "+:nagios:sfcb".
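The edit that such a scheduled task would have to reapply can be sketched as a small text transformation. This is an illustration only, assuming Python 3: the function name allow_nagios_sfcb is made up, and reading and writing /etc/security/access.conf and the scheduling itself are deliberately left out.

```python
def allow_nagios_sfcb(conf_text, user="nagios"):
    """Return access.conf text with '-:user:ALL' removed and '+:user:sfcb' as line 2."""
    deny = "-:%s:ALL" % user
    allow = "+:%s:sfcb" % user
    # Drop the deny rule and any existing allow rule so repeated runs are idempotent.
    lines = [ln for ln in conf_text.splitlines() if ln.strip() not in (deny, allow)]
    # Insert the allow rule as the second line, right after the header comment.
    lines.insert(1, allow)
    return "\n".join(lines) + "\n"
```

Because the function removes any earlier "+:nagios:sfcb" line before inserting a new one, running it repeatedly on the same file leaves the content unchanged after the first run.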
Test again:
[root@nagios libexec]# ./check_esxi_hardware.py -H 10.10.10.1 -U nagios -P nagios -V dell
OK - Server: Dell Inc. PowerEdge R610 s/n: XXXXXX System BIOS: XXXXXXXXXXX
OK. Monitoring is normal.
6. Add it to the monitoring system
(1) Add the command to commands.cfg
vim /usr/local/nagios/etc/objects/commands.cfg
define command {
    command_name    check_esxi_hardware
    command_line    $USER1$/check_esxi_hardware.py -H $HOSTADDRESS$ -U $ARG1$ -P $ARG2$ -V $ARG3$ -I isolutions -p -t 20
}
(2) Add a service
define service{
    use                     local-service,srv-pnp
    host_name               test
    service_description     esxi_health
    check_command           check_esxi_hardware!nagios!nagios!dell
    servicegroups           hardware_health
    notifications_enabled   1
}
Check the configuration and reload Nagios:
nagioscheck
service nagios reload
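To see how the command and service definitions above fit together, the following sketch (assuming Python 3) imitates the macro substitution: the values after each "!" in check_command become $ARG1$, $ARG2$ and $ARG3$ in command_line. It is a simplified illustration, not Nagios's actual implementation. The function name expand_check_command is made up, and the values passed for $USER1$ and $HOSTADDRESS$ are assumptions based on the paths and test address used earlier in this article.

```python
def expand_check_command(check_command, command_line, user1, host_address):
    """Substitute $USER1$, $HOSTADDRESS$ and $ARGn$ into a command_line string."""
    # The part before the first "!" is the command name; the rest are arguments.
    name, *args = check_command.split("!")
    expanded = command_line.replace("$USER1$", user1)
    expanded = expanded.replace("$HOSTADDRESS$", host_address)
    for index, value in enumerate(args, start=1):
        expanded = expanded.replace("$ARG%d$" % index, value)
    return name, expanded


name, line = expand_check_command(
    "check_esxi_hardware!nagios!nagios!dell",
    "$USER1$/check_esxi_hardware.py -H $HOSTADDRESS$ -U $ARG1$ -P $ARG2$ "
    "-V $ARG3$ -I isolutions -p -t 20",
    user1="/usr/local/nagios/libexec",   # assumption: plug-in directory from step 1
    host_address="10.10.10.1",           # assumption: address of the host "test"
)
print(line)
```

The printed line has the same form as the manual test in step 5, with -I isolutions -p -t 20 appended.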
(3) Monitoring
The links (href) in the output are generated by the -I parameter of check_esxi_hardware.py, so we can go straight to a possible solution.