Monitoring heartbeat with Nagios

Source: Internet
Author: User

Once heartbeat is set up, it needs to be monitored. This article shows how to do that with Nagios.

First, look at the following commands. cl_status is installed together with heartbeat, and our monitoring script relies on it.

[root@usvr-210 libexec]# which cl_status
/usr/bin/cl_status
[root@usvr-210 libexec]# cl_status listnodes    # list nodes in the current heartbeat cluster
192.168.3.1
usvr-211
usvr-210
[root@usvr-210 libexec]# cl_status nodestatus usvr-211    # list node status
active
[root@usvr-210 libexec]# cl_status nodestatus 192.168.3.1    # list node status
ping
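The two subcommands shown here are enough to walk the whole cluster. The following is a minimal sketch, assuming cl_status is on the PATH; the CL_STATUS variable and the function name list_node_states are illustrative and not part of heartbeat.

```shell
#!/bin/bash
# Sketch: print "<node> <status>" for every node heartbeat knows about.
# CL_STATUS is a hypothetical override hook (handy for testing); by
# default it points at the cl_status binary installed with heartbeat.
CL_STATUS="${CL_STATUS:-cl_status}"

list_node_states() {
    # "listnodes" prints one node name per line. Plain word splitting
    # is sufficient because node names contain no whitespace.
    for node in $("$CL_STATUS" listnodes); do
        printf '%s %s\n' "$node" "$("$CL_STATUS" nodestatus "$node")"
    done
}
```

On the cluster used in this article, calling list_node_states should print one line per node, such as "usvr-211 active" and "192.168.3.1 ping".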

check_heartbeat.sh works by listing every node in the cluster and checking whether each node's status is healthy. In our setup the healthy statuses are ping and active.

CRITICAL when the number of active + ping nodes is 0.

WARNING when the number of active + ping nodes is smaller than the total number of nodes.

OK when the number of active + ping nodes equals the total number of nodes.
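The three rules can be sketched as one small function. This is an illustration of the decision logic only, not an excerpt from the plugin; the name classify_heartbeat and its two arguments (healthy count, total count) are assumptions made for the example.

```shell
#!/bin/bash
# Sketch: map "healthy nodes / total nodes" to a Nagios state.
# Prints the status line and returns the matching Nagios exit code
# (0 = OK, 1 = WARNING, 2 = CRITICAL).
classify_heartbeat() {
    healthy=$1   # nodes whose status is "active" or "ping"
    total=$2     # all nodes reported by "cl_status listnodes"

    if [ "$healthy" -eq 0 ]; then
        echo "Heartbeat CRITICAL: $healthy/$total"
        return 2
    elif [ "$healthy" -ne "$total" ]; then
        echo "Heartbeat WARNING: $healthy/$total"
        return 1
    else
        echo "Heartbeat OK: $healthy/$total"
        return 0
    fi
}
```

With the three-node cluster above, classify_heartbeat 3 3 reports OK, classify_heartbeat 2 3 reports WARNING, and classify_heartbeat 0 3 reports CRITICAL.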

[root@usvr-210 libexec]# cat check_heartbeat.sh
#!/bin/bash
# Author: Emmanuel Bretelle
# Date: 12/03/2010
# Description: Retrieve Linux HA cluster status using cl_status
# Based on http://www.randombugs.com/linux/howto-monitor-linux-heartbeat-snmp.html
# Autor: Stanila Constantin Adrian
# Date: 20/03/2009
# Description: Check the number of active heartbeats
# http://www.randombugs.com

# Get program path
REVISION=1.3
PROGNAME=$(/bin/basename "$0")
PROGPATH=$(echo "$0" | /bin/sed -e 's,[\/][^\/][^\/]*$,,')
NODE_NAME=$(uname -n)
CL_ST='/usr/bin/cl_status'

# nagios error codes
#. $PROGPATH/utils.sh
OK=0
WARNING=1
CRITICAL=2
UNKNOWN=3

usage () {
    echo "\
Nagios plugin to heartbeat.

Usage:
  $PROGNAME [--help | -h]
  $PROGNAME [--version | -v]

Options:
  --help -h       Print this help information
  --version -v    Print version of plugin
"
}

# utils.sh is not sourced above, so print_revision, support and STATE_OK
# are not available; plain echo and $OK are used instead.
help () {
    echo "$PROGNAME $REVISION"
    echo; usage; echo
}

while test -n "$1"
do
    case "$1" in
        --help | -h)
            help
            exit $OK;;
        --version | -v)
            echo "$PROGNAME $REVISION"
            exit $OK;;
#       -H)
#           shift
#           HOST=$1;;
#       -C)
#           shift
#           COMMUNITY=$1;;
        *)
            echo "Heartbeat UNKNOWN: Wrong command usage"
            exit $UNKNOWN;;
    esac
    shift
done

# Is heartbeat running on this node at all?
$CL_ST hbstatus > /dev/null
res=$?
if [ $res -ne 0 ]
then
    echo "Heartbeat CRITICAL: Heartbeat is not running on this node"
    exit $CRITICAL
fi

declare -i I=0    # total number of nodes
declare -i A=0    # number of healthy nodes
NODES=$($CL_ST listnodes)
for node in $NODES
do
    status=$($CL_ST nodestatus $node)
    let I=$I+1
    # The original script counted only "active" nodes:
    #   if [ $status = "active" ]
    # "ping" is also a healthy status, so the condition accepts both.
    if [ "$status" = "active" ] || [ "$status" = "ping" ]
    then
        let A=$A+1
    fi
done

if [ $A -eq 0 ]
then
    echo "Heartbeat CRITICAL: $A/$I"
    exit $CRITICAL
elif [ $A -ne $I ]
then
    echo "Heartbeat WARNING: $A/$I"
    exit $WARNING
else
    echo "Heartbeat OK: $A/$I"
    exit $OK
fi
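Nagios takes the service state from the plugin's exit code; the printed text is only the status message. When testing the script by hand, a small helper can translate the exit code back into a state name. state_name is a hypothetical helper written for this example and is not part of the script or of Nagios.

```shell
#!/bin/bash
# Sketch: translate a Nagios plugin exit code into its state name.
# The codes are the standard plugin return codes, the same values
# the script assigns to OK, WARNING, CRITICAL and UNKNOWN.
state_name() {
    case "$1" in
        0) echo OK ;;
        1) echo WARNING ;;
        2) echo CRITICAL ;;
        *) echo UNKNOWN ;;
    esac
}

# Manual test on a cluster node (not executed here):
#   ./check_heartbeat.sh; state_name $?
```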

The script runs on the Nagios clients, which are our LVS cluster nodes usvr-210 and usvr-211. The Nagios server retrieves the result from them through check_nrpe.

Nagios Client

1. Copy the script to the Nagios plugin directory and set its permissions.

cp check_heartbeat.sh /usr/local/nagios/libexec/

chmod a+x /usr/local/nagios/libexec/check_heartbeat.sh

chown nagios:nagios /usr/local/nagios/libexec/check_heartbeat.sh

2. Add the monitoring command to the NRPE configuration file on the Nagios client.

vim /usr/local/nagios/etc/nrpe.cfg

command[check_heartbeat]=/usr/local/nagios/libexec/check_heartbeat.sh

3. Reload the configuration file.

service xinetd reload
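Before configuring the server, it can help to confirm that the client steps took effect. The sketch below assumes the /usr/local/nagios prefix used in this article and that check_nrpe is installed in the server's libexec directory; NRPE_CFG, CHECK_NRPE and both function names are illustrative.

```shell
#!/bin/bash
# Assumed paths, based on the /usr/local/nagios prefix used above.
NRPE_CFG="${NRPE_CFG:-/usr/local/nagios/etc/nrpe.cfg}"
CHECK_NRPE="${CHECK_NRPE:-/usr/local/nagios/libexec/check_nrpe}"

# On the client: print the command line that nrpe.cfg defines for the
# command named $1. Returns non-zero if there is no such definition.
nrpe_command_line() {
    sed -n "s/^command\[$1\]=//p" "$NRPE_CFG" | grep .
}

# On the Nagios server: ask the NRPE daemon on host $1 to run the
# command named $2 and print its result.
query_nrpe() {
    "$CHECK_NRPE" -H "$1" -c "$2"
}

# Example calls (not executed here):
#   nrpe_command_line check_heartbeat
#   query_nrpe usvr-210 check_heartbeat
```

If query_nrpe prints a "Heartbeat ..." line, the server can reach the plugin through NRPE.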

Nagios Server

1. Add related monitoring services

define service {
    use                     local-service
    service_description     heartbeat-lvs-master
    check_command           check_nrpe!check_heartbeat
    service_groups          heartbeat_services
    host_name               usvr-210
    check_interval          5
    notifications_enabled   1
    notification_interval   30
    contact_groups          admins
}

define service {
    use                     local-service
    service_description     heartbeat-lvs-slave
    check_command           check_nrpe!check_heartbeat
    service_groups          heartbeat_services
    host_name               usvr-211
    check_interval          5
    notifications_enabled   1
    notification_interval   30
    contact_groups          admins
}

2. Check and load the configuration file

nagioscheck

service nagios reload
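The two service definitions in step 1 differ only in host name and description, and step 2 should reload Nagios only when the configuration is valid. Both ideas are sketched below. The nagioscheck command appears to be a local alias or wrapper; the sketch uses "nagios -v" for the check instead. The binary and config paths are assumptions based on the /usr/local/nagios prefix, and the function names are illustrative.

```shell
#!/bin/bash
# Assumed paths under the /usr/local/nagios prefix; adjust as needed.
NAGIOS_BIN="${NAGIOS_BIN:-/usr/local/nagios/bin/nagios}"
NAGIOS_CFG="${NAGIOS_CFG:-/usr/local/nagios/etc/nagios.cfg}"

# Emit one heartbeat service definition.
#   $1 = host_name, $2 = service_description
heartbeat_service() {
    cat <<EOF
define service {
    use                     local-service
    service_description     $2
    check_command           check_nrpe!check_heartbeat
    service_groups          heartbeat_services
    host_name               $1
    check_interval          5
    notifications_enabled   1
    notification_interval   30
    contact_groups          admins
}
EOF
}

# Verify the configuration and reload only if verification passes.
reload_nagios_if_valid() {
    if "$NAGIOS_BIN" -v "$NAGIOS_CFG" >/dev/null; then
        service nagios reload
    else
        echo "Nagios configuration check failed; not reloading" >&2
        return 1
    fi
}

# Example calls (not executed here):
#   heartbeat_service usvr-210 heartbeat-lvs-master
#   heartbeat_service usvr-211 heartbeat-lvs-slave
#   reload_nagios_if_valid
```

The output of heartbeat_service can be appended to whichever object file holds the service definitions.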

The resulting service status in Nagios is as follows:

[Screenshot of the Nagios service status page; image not preserved]

OK. Our heartbeat monitoring is complete.


I refer to this website.
