Nagios event handling mechanism

Source: Internet
Author: User

ZZ gave me a task: automate the handling of a Nagios alert.

As I recall, I had tested this feature before we went into production.

First, review the official Nagios documentation to confirm feasibility. What follows is the information I found useful.

1)
Understanding how commands are defined
Writing Event Handler Commands
Event handler commands will likely be shell or Perl scripts, but they can be any type of executable that can run from a
command prompt. At a minimum, the scripts should take the following macros as arguments:
For Services: $SERVICESTATE$, $SERVICESTATETYPE$, $SERVICEATTEMPT$
For Hosts: $HOSTSTATE$, $HOSTSTATETYPE$, $HOSTATTEMPT$
In other words, a host handler needs one set of arguments and a service handler needs another. The commands themselves are defined in objects/commands.cfg.
From the official documentation:
define command{
    command_name restart-httpd
    command_line /usr/local/nagios/libexec/eventhandlers/restart-httpd $SERVICESTATE$ $SERVICESTATETYPE$ $SERVICEATTEMPT$
}
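
As a minimal sketch of how those macros reach the script: Nagios substitutes the three macros on the command_line, so the handler sees them as positional arguments $1, $2 and $3. The function name describe_handler_args is hypothetical and only illustrates the argument order.

```shell
#!/bin/sh
# Hypothetical helper (not part of Nagios): shows which positional argument
# carries which macro, given the command_line order used above.
describe_handler_args() {
    state="$1"        # filled from $SERVICESTATE$, e.g. CRITICAL
    state_type="$2"   # filled from $SERVICESTATETYPE$, SOFT or HARD
    attempt="$3"      # filled from $SERVICEATTEMPT$, e.g. 3
    echo "state=$state type=$state_type attempt=$attempt"
}

# Simulate what the handler receives on the third soft CRITICAL check
describe_handler_args CRITICAL SOFT 3
```

If the macro order on the command_line changes, the positions in the script have to change with it.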

2)
Understanding the host configuration file (in my production setup the host and service definitions are merged into one file; normally they are kept in separate files)
From the official documentation:
define service{
    host_name somehost
    service_description HTTP
    max_check_attempts 4
    event_handler restart-httpd
    ...
}


Second, some explanations:

1)
Variable interpretation:
$SERVICESTATE$: current state of the service (OK, WARNING, UNKNOWN, CRITICAL)
$SERVICESTATETYPE$: state type of the service, either SOFT or HARD
$SERVICEATTEMPT$: number of the current check attempt, which matters while the service is in a soft state
These three values are the arguments that the auto-recovery script below must handle

2)
HARD: hard state
SOFT: soft state
When Nagios checks a service and the first check fails, the state becomes SOFT and Nagios retries the check. Once the configured maximum number of attempts (max_check_attempts) is reached, the state changes to HARD.
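
The soft-to-hard transition can be sketched as follows. This is a simplified model, not Nagios code: it assumes the service keeps failing on every check, and the function name state_type_for_attempt is hypothetical. The value 4 matches max_check_attempts in the service definition quoted earlier.

```shell
#!/bin/sh
# Simplified model of a service that keeps failing: the state type stays
# SOFT until the attempt number reaches max_check_attempts, then turns HARD.
state_type_for_attempt() {
    n="$1"      # current check attempt
    max="$2"    # max_check_attempts from the service definition
    if [ "$n" -lt "$max" ]; then
        echo SOFT
    else
        echo HARD
    fi
}

# Walk through the four attempts allowed by max_check_attempts 4
attempt=1
while [ "$attempt" -le 4 ]; do
    echo "attempt $attempt: $(state_type_for_attempt "$attempt" 4)"
    attempt=$((attempt + 1))
done
```

With these values, attempts 1 to 3 are SOFT and attempt 4 is HARD, which is why the recovery script acts on the third soft attempt.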

3)
Some configuration parameters for event handling
event_handler_timeout=30    timeout, in seconds
enable_event_handlers=1     enables the event handling mechanism
event_handler               names the handler command in a host or service definition


Third, the operation steps:
1)
Confirm whether the event handling switch is turned on
enable_event_handlers=1
0: Off
1: On
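
One way to check the switch from the command line is sketched below. The function name event_handlers_enabled is hypothetical, the default path is assumed from the nagios.cfg location used later in this article, and taking the last uncommented assignment is an assumption about how duplicate entries should be read.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given Nagios main config file sets
# enable_event_handlers=1 (commented-out lines are ignored).
event_handlers_enabled() {
    cfg="$1"
    value=$(grep -E '^[[:space:]]*enable_event_handlers[[:space:]]*=' "$cfg" \
        | tail -n 1 | cut -d= -f2 | tr -d '[:space:]')
    [ "$value" = "1" ]
}

# Assumed default location; override with the NAGIOS_CFG environment variable
NAGIOS_CFG="${NAGIOS_CFG:-/usr/local/nagios/etc/nagios.cfg}"
if [ -r "$NAGIOS_CFG" ]; then
    if event_handlers_enabled "$NAGIOS_CFG"; then
        echo "event handlers: on"
    else
        echo "event handlers: off"
    fi
fi
```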

2)
Self-recovery script creation
With several arguments to handle, a case statement is recommended. The following is the example script from the official website; it can of course be written in another language.
#!/bin/sh
#
# Event handler script for restarting the web server on the local machine
#
# Note: This script will only restart the web server if the service is
#       retried 3 times (in a "soft" state) or if the web service somehow
#       manages to fall into a "hard" error state.
#

# What state is the HTTP service in?
case "$1" in
OK)
    # The service just came back up, so don't do anything...
    ;;
WARNING)
    # We don't really care about warning states, since the service is probably still running...
    ;;
UNKNOWN)
    # We don't know what might be causing an unknown error, so don't do anything...
    ;;
CRITICAL)
    # Aha! The HTTP service appears to have a problem - perhaps we should restart the server...
    # Is this a "soft" or a "hard" state?
    case "$2" in

    # We're in a "soft" state, meaning that Nagios is in the middle of retrying the
    # check before it turns into a "hard" state and contacts get notified...
    SOFT)

        # What check attempt are we on? We don't want to restart the web server on the first
        # check, because it may just be a fluke!
        case "$3" in

        # Wait until the check has been tried 3 times before restarting the web server.
        # If the check fails on the 4th time (after we restart the web server), the state
        # type will turn to "hard" and contacts will be notified of the problem.
        # Hopefully this will restart the web server successfully, so the 4th check will
        # result in a "soft" recovery. If that happens no one gets notified because we
        # fixed the problem!
        3)
            echo -n "Restarting HTTP service (3rd soft critical state)..."
            # Call the init script to restart the HTTPD server
            /etc/rc.d/init.d/httpd restart
            ;;
        esac
        ;;

    # The HTTP service somehow fell into a hard error state without being fixed,
    # so give the restart one last try, as described in the note at the top.
    HARD)
        echo -n "Restarting HTTP service..."
        # Call the init script to restart the HTTPD server
        /etc/rc.d/init.d/httpd restart
        ;;
    esac
    ;;
esac
exit 0
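
The decision logic of the script above can be exercised without touching the web server. The sketch below is a dry-run variant that only prints what the handler would do; the function name handler_action is hypothetical, and the HARD branch follows the note in the script header that a "hard" error state also triggers a restart.

```shell
#!/bin/sh
# Dry-run version of the handler's decision: prints "restart" or "none"
# instead of calling the init script.
handler_action() {
    case "$1" in
    CRITICAL)
        case "$2" in
        SOFT)
            # Only act on the 3rd soft attempt, as in the script above
            if [ "$3" = "3" ]; then echo restart; else echo none; fi
            ;;
        HARD)
            echo restart
            ;;
        *)
            echo none
            ;;
        esac
        ;;
    *)
        # OK, WARNING and UNKNOWN are ignored
        echo none
        ;;
    esac
}

for args in "OK HARD 1" "CRITICAL SOFT 1" "CRITICAL SOFT 3" "CRITICAL HARD 4"; do
    # Word splitting on $args is intentional: it yields the three arguments
    echo "$args -> $(handler_action $args)"
done
```

Running the real script by hand with the same three arguments is a quick way to test it before wiring it into Nagios.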

3)
Tell the Nagios server how to carry out self-recovery once an alert is raised (objects/commands.cfg)
The objects/commands.cfg file specifies how the remote script is executed when an alert is generated.


4)
In the configuration file for the host's service, enable the event handling mechanism
event_handler shell_name


/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
Check the modified files above and make sure there are no errors
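
The check can be wrapped so that it is easy to run after every change. This is a sketch under two assumptions: the function name verify_nagios_config is hypothetical, and the nagios binary is taken to report configuration errors through a non-zero exit status.

```shell
#!/bin/sh
# Hypothetical wrapper: runs "<binary> -v <config>" and reports the result.
# Assumes the binary exits non-zero when the configuration has errors.
verify_nagios_config() {
    nagios_bin="$1"
    nagios_cfg="$2"
    if "$nagios_bin" -v "$nagios_cfg" >/dev/null 2>&1; then
        echo "config OK"
    else
        echo "config check failed"
        return 1
    fi
}

# Only run the real check when the binary is present at the path used above
if [ -x /usr/local/nagios/bin/nagios ]; then
    verify_nagios_config /usr/local/nagios/bin/nagios /usr/local/nagios/etc/nagios.cfg \
        || echo "fix the reported errors before reloading Nagios"
fi
```

To see the actual error messages, run the command without the redirection.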

Kill a service as a simple test: OK.

ZZ's trust was not misplaced, and the problem is solved.

Done.

This article is from the "on the Road" blog, please be sure to keep this source http://xiaochengxiang.blog.51cto.com/2558908/1416222
