Custom Nagios plug-ins: active-mode and passive-mode checks, and simple mail-based alerts

Source: Internet
Author: User

A Nagios plug-in returns two things: its exit status code and the first line of output it prints to the console. The Nagios main program uses the exit status code to judge the state of the monitored service; the first line of output supplements that state and is displayed on the administration page.

To run a plug-in, Nagios spawns a subprocess every time a service is checked, and uses the output and exit status code of that command to determine the state. The status codes recognized by the Nagios main program are as follows:

OK: exit code 0, the service is working properly

Warning: exit code 1, the service is in a warning state

Critical: exit code 2, the service is in a critical state

Unknown: exit code 3, the service is in an unknown state


# head -7 /usr/local/nagios/libexec/utils.sh
#! /bin/sh

STATE_OK=0
STATE_WARNING=1
STATE_CRITICAL=2
STATE_UNKNOWN=3
STATE_DEPENDENT=4
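The exit-code convention described above can be sketched as a small plug-in skeleton. This is only an illustration: the function name check_value and its warning/critical thresholds are hypothetical, and only the codes 0 to 3 come from the text.

```shell
#!/bin/bash
# Sketch of a plug-in skeleton (check_value and its thresholds are
# hypothetical names, not part of Nagios).
STATE_OK=0
STATE_WARNING=1
STATE_CRITICAL=2
STATE_UNKNOWN=3

# check_value VALUE WARN CRIT
# Prints one status line and returns the matching Nagios state code.
check_value() {
    local value=$1 warn=$2 crit=$3
    # Empty or non-numeric input cannot be judged: report UNKNOWN
    case "$value" in
        ''|*[!0-9]*)
            echo "UNKNOWN - value '$value' is not a number"
            return $STATE_UNKNOWN ;;
    esac
    if [ "$value" -ge "$crit" ]; then
        echo "CRITICAL - value is $value"
        return $STATE_CRITICAL
    elif [ "$value" -ge "$warn" ]; then
        echo "WARNING - value is $value"
        return $STATE_WARNING
    fi
    echo "OK - value is $value"
    return $STATE_OK
}
```

A real plug-in would end with something like check_value "$1" 10 20; exit $? so that the function's return code becomes the script's exit status.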


Example one: detecting whether the /etc/passwd file has changed, using NRPE passive mode

Principle: collect a fingerprint with md5sum: md5sum /etc/passwd > /etc/passwd.md5

Then verify the fingerprint with md5sum -c /etc/passwd.md5: the result is OK if the file is unchanged, and a failure if it has changed.

Monitoring whether the password file has been changed:

First create the fingerprint file:

md5sum /etc/passwd > /etc/passwd.md5

Create the script on the client: vim /usr/local/nagios/libexec/check_passwd

#!/bin/bash
# Count the "OK" lines reported by md5sum -c; 1 means the fingerprint matches
char=$(md5sum -c /etc/passwd.md5 2>&1 | grep "OK" | wc -l)

if [ "$char" -eq 1 ]; then
    echo "passwd is OK"
    exit 0
else
    echo "passwd is changed"
    exit 2
fi


###### Give the script execute permission

chmod +x /usr/local/nagios/libexec/check_passwd
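The fingerprint logic can be tried safely against a scratch file before pointing it at /etc/passwd. The sketch below is a variant, not the script above: check_md5 is a hypothetical helper name, and it relies on the exit status of md5sum -c rather than counting "OK" lines.

```shell
#!/bin/bash
# check_md5 MD5FILE
# Verifies the checksums listed in MD5FILE (hypothetical helper).
# md5sum -c exits 0 only when every listed file matches its fingerprint.
check_md5() {
    local md5file=$1
    if md5sum -c "$md5file" >/dev/null 2>&1; then
        echo "file is OK"
        return 0
    else
        echo "file is changed"
        return 2
    fi
}
```

To try it: write a scratch file, run md5sum on it with the output redirected to a .md5 file, call check_md5 on that .md5 file, then append a line to the scratch file and call it again. The second call should report a change.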


#### Define the check_passwd command

vim /usr/local/nagios/etc/nrpe.cfg

command[check_passwd]=/usr/local/nagios/libexec/check_passwd
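To confirm which commands an nrpe.cfg file actually defines, the command[...] entries can be listed with sed. This is a minimal sketch; list_nrpe_commands is a hypothetical helper name, and it assumes each definition starts at the beginning of a line, as in the entry above.

```shell
#!/bin/bash
# list_nrpe_commands CFGFILE
# Prints the name inside each command[...]= definition, one per line.
# Commented-out lines (starting with #) do not match and are skipped.
list_nrpe_commands() {
    sed -n 's/^command\[\([^]]*\)\]=.*/\1/p' "$1"
}
```

For example, list_nrpe_commands /usr/local/nagios/etc/nrpe.cfg should include check_passwd once the line above has been added.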


#### Restart the nrpe service


###### First fetch the data manually on the Nagios server (from the libexec directory)

# ./check_nrpe -H 192.168.1.11 -c check_passwd

passwd is OK


###### Define the service configuration on the Nagios server

vim /usr/local/nagios/etc/objects/services.cfg (active-mode and passive-mode services each have their own services.cfg configuration file and are managed separately)

define service{
    use                     generic-service
    host_name               client02
    service_description     check_passwd
    check_command           check_nrpe!check_passwd
}


Then fetch the data manually on the Nagios server:

/usr/local/nagios/libexec/check_nrpe -H 192.168.1.11 -c check_passwd

If data is returned, the basic setup is working. Restart the service and check the web page:

[Screenshot: custom monitoring of client passwd file changes]


Custom monitoring of a web URL, using active mode

# curl -i http://192.168.1.11/index.html 2>/dev/null | grep "OK"

HTTP/1.1 200 OK

# curl -i http://192.168.1.11/index.html 2>/dev/null | grep "OK" | wc -l

1

1. Write the check script

cd /usr/local/nagios/libexec

vim check_web_url

#!/bin/bash
# Count the response lines containing "OK"; 1 means the status line reported success
char=$(curl -i http://192.168.1.11/index.html 2>/dev/null | grep "OK" | wc -l)

if [ "$char" -eq 1 ]; then
    echo "The URL is ok"
    exit 0
else
    echo "The URL is wrong"
    exit 2
fi


chmod +x check_web_url
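Counting lines that contain "OK" can miscount if the page body itself contains that string. One possible alternative, sketched here and not the author's script, is to ask curl for the numeric status code with its -w '%{http_code}' write-out variable. The function names state_for_http_code and check_url are hypothetical, and the 10-second timeout is an arbitrary choice.

```shell
#!/bin/bash
# state_for_http_code CODE
# Maps an HTTP status code to a status line and a Nagios state code.
state_for_http_code() {
    case "$1" in
        200) echo "The URL is ok"
             return 0 ;;
        *)   echo "The URL is wrong (HTTP code $1)"
             return 2 ;;
    esac
}

# check_url URL
# Fetches URL quietly, discards the body, and keeps only the status code.
# curl reports 000 when no HTTP response was received at all.
check_url() {
    local code
    code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$1")
    state_for_http_code "$code"
}
```

A plug-in built on this would end with check_url "http://192.168.1.11/index.html"; exit $?.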


2. Add the check_web_url command to the commands.cfg configuration file

########### define command check_web_url ##########

define command{
    command_name    check_web_url
    command_line    $USER1$/check_web_url
}


3. Edit the service configuration file

cd /usr/local/nagios/etc/services

vim web_url.cfg

define service{
    use                     generic-service
    host_name               client02        ; monitored host 192.168.1.11, defined in hosts.cfg
    service_description     web_url
    check_period            24x7
    check_interval          5
    retry_interval          1
    max_check_attempts      3
    check_command           check_web_url   ; called directly, because this is active mode
    notification_period     24x7
    notification_interval   30
    notification_options    w,u,c,r
    contact_groups          admins
}


4. Check the configuration for errors and reload the service

# /etc/init.d/nagios checkconfig

Running configuration check...

OK.


# /etc/init.d/nagios reload

Running configuration check...

Reloading nagios configuration...

done


Success:

[Screenshot: custom script monitoring web_url in active mode]

The overall monitoring view:

[Screenshot: monitoring interface]


Implementing the mail alert function:

Configure the alerts:

1. Add contacts and contact groups in contacts.cfg

define contact{
    contact_name    Huang
    use             generic-contact     ; this template is the contact definition in the template file
    alias           Nagios Admin
    email           [email protected]
}

Add the defined contact_name to a new group.

New contact group:

define contactgroup{
    contactgroup_name   mail_users      ; mail groups, SMS groups, etc. can be defined here
    alias               Nagios Administrators
    members             Huang
}


2. Add the alert command in commands.cfg. The default command is used here; you can of course write your own shell script or a script in another language.


3. Adjust the default contact template

define contact{
    name                            generic-contact
    service_notification_period     24x7
    host_notification_period        24x7
    service_notification_options    w,u,c,r,f,s
    host_notification_options       d,u,r,f,s
    service_notification_commands   notify-service-by-email
    host_notification_commands      notify-host-by-email    ; with a pager defined, this could be notify-host-by-email,notify-host-by-pager; only mail alerts are used here, so it is not set
    register                        0
}
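The letters in service_notification_options are easy to misread, so a small decoder can help when reviewing a template. This is only a reading aid: explain_service_options is a hypothetical helper name, and it covers the service option letters only (host options use d for down and u for unreachable instead).

```shell
#!/bin/bash
# explain_service_options LIST
# Prints the meaning of each letter in a comma-separated list of
# service notification options, one per line.
explain_service_options() {
    local opt
    # Split the comma-separated list into one letter per line
    echo "$1" | tr ',' '\n' | while read -r opt; do
        case "$opt" in
            w) echo "w: warning" ;;
            u) echo "u: unknown" ;;
            c) echo "c: critical" ;;
            r) echo "r: recovery" ;;
            f) echo "f: flapping start/stop" ;;
            s) echo "s: scheduled downtime start/end" ;;
            n) echo "n: none" ;;
            *) echo "$opt: not a service notification option" ;;
        esac
    done
}
```

Calling explain_service_options w,u,c,r,f,s lists all six states used in the template, including r, which is why a recovery also sends a mail.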


4. Add the alert contacts and contact group to the hosts and services configuration files.

In the service and host definitions in the template, change

contact_groups admins

to

contact_groups mail_users

Alternatively, leave the template unchanged and define different alert methods and contact groups in the hosts and services configuration files.


Experiment:

Move the index.html file from the site directory to the /tmp directory so that the check fails and triggers an alert:

mv /var/www/html/index.html /tmp

The web page now shows the alert state; also check the Nagios log.

[Screenshot: mail alert error]

Then check the mailbox: no alert mail has arrived. The log shows that the mail command was not found, so install it:

yum -y install mailx
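A missing mail program only shows up when the first notification fails, so it may be worth checking for it up front. The sketch below uses the shell built-in command -v; require_cmd is a hypothetical helper name, and the status wording simply follows the plug-in convention described earlier.

```shell
#!/bin/bash
# require_cmd NAME
# Reports whether NAME can be found on PATH, in plug-in style:
# returns 0 (OK) when found, 2 (CRITICAL) when missing.
require_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "OK - $1 found at $(command -v "$1")"
        return 0
    fi
    echo "CRITICAL - $1 not found"
    return 2
}
```

Running require_cmd mail before and after installing mailx should show the state change from CRITICAL to OK.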


Because the service notification options are defined as

service_notification_options w,u,c,r,f,s

a recovery to the normal state also triggers a mail. Now put index.html back into the site directory:

mv /tmp/index.html /var/www/html

After a few minutes the monitor returns to normal. View the Nagios log:

[Screenshot: alert log]

Check the message again, as follows:

[Screenshot: alert mail]

This gives a simple mail-based alert function.

