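Tags: heartbeat monitoring, heartbeat, nagios
Once heartbeat is up and running, it needs to be monitored. This post walks through how to do that with Nagios.
First, a few commands. They are installed together with heartbeat, and our monitoring script relies on them.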
# which cl_status
/usr/bin/cl_status
# cl_status listnodes                 # list the nodes in the current heartbeat cluster
192.168.3.1
usvr-211
usvr-210
# cl_status nodestatus usvr-211       # show the status of a node
active
# cl_status nodestatus 192.168.3.1    # show the status of a node
ping
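check_heartbeat.sh works by listing every node in the cluster and checking whether each node's status is healthy. In our setup the healthy statuses are ping and active.
When the number of active + ping nodes is 0, the result is critical.
When the number of active + ping nodes is less than the total number of nodes, the result is warning.
When the number of active + ping nodes equals the total number of nodes, the result is ok.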
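The three rules above can be sketched as a small standalone function. This is only an illustration of the decision logic, assuming the standard Nagios exit codes (0 OK, 1 WARNING, 2 CRITICAL); the name heartbeat_state is hypothetical and does not appear in check_heartbeat.sh.

```shell
#!/bin/bash
# Hypothetical helper (not part of check_heartbeat.sh): map the number of
# healthy nodes (active + ping) and the total node count to a Nagios state.
# Prints the state name and returns the matching Nagios exit code.
heartbeat_state() {
    local healthy=$1 total=$2
    if [ "$healthy" -eq 0 ]; then
        echo "CRITICAL"
        return 2
    elif [ "$healthy" -lt "$total" ]; then
        echo "WARNING"
        return 1
    else
        echo "OK"
        return 0
    fi
}
```

For example, heartbeat_state 2 3 prints WARNING and returns 1, which corresponds to one of three nodes being neither active nor ping.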
# cat check_heartbeat.sh
#!/bin/bash
# Author: Emmanuel Bretelle
# Date: 12/03/2010
# Description: Retrieve Linux HA cluster status using cl_status
# Based on http://www.randombugs.com/linux/howto-monitor-linux-heartbeat-snmp.html
#
# Author: Stanila Constantin Adrian
# Date: 20/03/2009
# Description: Check the number of active heartbeats
# http://www.randombugs.com

# Get program path
REVISION=1.3
PROGNAME=`/bin/basename $0`
PROGPATH=`echo $0 | /bin/sed -e 's,[\\/][^\\/][^\\/]*$,,'`

NODE_NAME=`uname -n`
CL_ST='/usr/bin/cl_status'

# Nagios exit codes
#. $PROGPATH/utils.sh
OK=0
WARNING=1
CRITICAL=2
UNKNOWN=3

# utils.sh is not sourced above, so define the one helper the script calls.
print_revision () {
    echo "$1 v$2"
}

usage () {
    echo "Nagios plugin to heartbeat.

Usage:
  $PROGNAME
  $PROGNAME [--help | -h]
  $PROGNAME [--version | -v]

Options:
  --help -h       Print this help information
  --version -v    Print version of plugin"
}

help () {
    print_revision "$PROGNAME" "$REVISION"
    echo; usage
}

while test -n "$1"
do
    case "$1" in
        --help | -h)
            help
            exit $OK;;
        --version | -v)
            print_revision "$PROGNAME" "$REVISION"
            exit $OK;;
#       -H)
#           shift
#           HOST=$1;;
#       -C)
#           shift
#           COMMUNITY=$1;;
        *)
            echo "Heartbeat UNKNOWN: Wrong command usage"
            exit $UNKNOWN;;
    esac
    shift
done

# Heartbeat itself must be running on this node.
$CL_ST hbstatus > /dev/null
res=$?
if [ $res -ne 0 ]
then
    echo "Heartbeat CRITICAL: Heartbeat is not running on this node"
    exit $CRITICAL
fi

# I counts all nodes, A counts the healthy ones.
declare -i I=0
declare -i A=0
NODES=`$CL_ST listnodes`
for node in $NODES
do
    status=`$CL_ST nodestatus $node`
    let I=$I+1
    # The original script only counted "active" nodes:
    #   if [ $status == "active" ]
    # "ping" is also a healthy status, so the condition was changed as follows.
    if [ "$status" = "active" ] || [ "$status" = "ping" ]
    then
        let A=$A+1
    fi
done

if [ $A -eq 0 ]
then
    echo "Heartbeat CRITICAL: $A/$I"
    exit $CRITICAL
elif [ $A -ne $I ]
then
    echo "Heartbeat WARNING: $A/$I"
    exit $WARNING
else
    echo "Heartbeat OK: $A/$I"
    exit $OK
fi
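The Nagios clients are the nodes of our LVS cluster, usvr-210 and usvr-211. The Nagios server collects the monitoring data from them through check_nrpe.
Nagios client
1. Copy the script into the Nagios plugin directory and set its permissions and ownership.
cp check_heartbeat.sh /usr/local/nagios/libexec/
chmod a+x /usr/local/nagios/libexec/check_heartbeat.sh
chown nagios:nagios /usr/local/nagios/libexec/check_heartbeat.sh
2. Add the monitoring command to the NRPE configuration file on the Nagios client.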
vim /usr/local/nagios/etc/nrpe.cfg
command[check_heartbeat]=/usr/local/nagios/libexec/check_heartbeat.sh
3. Reload the configuration (NRPE runs under xinetd here).
service xinetd reload
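Before adding the service on the Nagios server, it can help to run the check by hand and confirm which state it reports. The wrapper below is a minimal sketch: run_plugin is a hypothetical name, and the check_nrpe path in the comment assumes the plugin is installed under /usr/local/nagios/libexec on the server, matching the layout used on the client.

```shell
#!/bin/bash
# Hypothetical wrapper: run any Nagios plugin command, then print the state
# name for its exit code together with the plugin's own output.
run_plugin() {
    local output state rc=0
    output=$("$@") || rc=$?
    case $rc in
        0) state=OK ;;
        1) state=WARNING ;;
        2) state=CRITICAL ;;
        *) state=UNKNOWN ;;
    esac
    printf '%s (exit %d): %s\n' "$state" "$rc" "$output"
    return "$rc"
}

# Example calls (paths are assumptions, adjust to your installation):
#   on a client:   run_plugin /usr/local/nagios/libexec/check_heartbeat.sh
#   on the server: run_plugin /usr/local/nagios/libexec/check_nrpe -H usvr-210 -c check_heartbeat
```

If the local run reports the expected state but the check_nrpe call does not, the problem is more likely in nrpe.cfg or the xinetd setup than in the script itself.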
Nagios server
1. Add the monitoring services.
define service {
    use                     local-service
    service_description     heartbeat-lvs-master
    check_command           check_nrpe!check_heartbeat
    service_groups          heartbeat_services
    host_name               usvr-210
    check_interval          5
    notifications_enabled   1
    notification_interval   30
    contact_groups          admins
}
define service {
    use                     local-service
    service_description     heartbeat-lvs-slave
    check_command           check_nrpe!check_heartbeat
    service_groups          heartbeat_services
    host_name               usvr-211
    check_interval          5
    notifications_enabled   1
    notification_interval   30
    contact_groups          admins
}
2. Check and reload the configuration.
nagioscheck
service nagios reload
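The nagioscheck command above is presumably a local alias or helper on the author's machine. On a stock installation, the usual way to validate the configuration is the -v option of the nagios binary. The sketch below reloads only when validation passes; reload_if_valid is a hypothetical name, and both default paths are assumptions based on the /usr/local/nagios prefix used elsewhere in this post.

```shell
#!/bin/bash
# Hypothetical helper: validate the Nagios configuration and reload the
# service only if validation succeeds. NAGIOS_BIN and NAGIOS_CFG can be
# overridden; the defaults assume a /usr/local/nagios installation.
reload_if_valid() {
    local nagios_bin=${NAGIOS_BIN:-/usr/local/nagios/bin/nagios}
    local nagios_cfg=${NAGIOS_CFG:-/usr/local/nagios/etc/nagios.cfg}
    if "$nagios_bin" -v "$nagios_cfg" > /dev/null
    then
        echo "config OK, reloading"
        service nagios reload
    else
        echo "config check failed, not reloading" >&2
        return 1
    fi
}
```

When the check fails, running the nagios -v command directly shows the full error report, which the function discards.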
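The monitoring result looks like this:
OK, our heartbeat monitoring is now complete.
I followed this page: http://wiki.debuntu.org/wiki/Linux_HA_Heartbeat/Monitoring_with_Nagios. I hope it is helpful to you.
Monitoring heartbeat with nagios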