MySQL master/slave replication latency monitoring script (pt-heartbeat)
Percona's pt-heartbeat tool can monitor master-slave replication latency in MySQL. It periodically writes a timestamp to a dedicated table on the master; on the slave, it reads the replicated timestamp and compares it with the local system time to obtain the lag. This article presents a script that checks the slave's replication lag at regular intervals and sends an email alert when the lag exceeds a threshold.
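The comparison described above boils down to a subtraction, which the following minimal sketch illustrates. The helper name heartbeat_lag is hypothetical and only mirrors the idea; pt-heartbeat performs the real measurement internally, and the result is only meaningful if the clocks on master and slave are closely synchronized.

```shell
#!/bin/bash
# Sketch only: lag = (local time on the slave) - (timestamp last written on the master).
# heartbeat_lag is a hypothetical helper, not part of pt-heartbeat.
heartbeat_lag() {
    local master_ts="$1"               # epoch seconds read from the replicated heartbeat row
    local now="${2:-$(date +%s)}"      # slave's local time; defaults to the current time
    echo $(( now - master_ts ))
}

# Heartbeat written at 1700000000, read on the slave at 1700000004.
heartbeat_lag 1700000000 1700000004    # prints 4

# With the real tool, the two sides would typically be run like this
# (credentials and database name are placeholders):
#   master: pt-heartbeat --user=monitor --password=xxx -D test --update --daemonize
#   slave : pt-heartbeat --user=monitor --password=xxx -D test --master-server-id=11 --check
```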
For how to install the pt-heartbeat tool, refer to "percona-toolkit installation and introduction".
For more information about the pt-heartbeat tool, see "Use pt-heartbeat to monitor master-slave replication latency".
1. Script Overview
A. In normal operation, the script uses the --check method to take a single reading of the current lag (run it from cron, for example every 1 or 5 minutes).
B. It compares that reading with a specified threshold to decide whether the lag is within an acceptable range.
C. When the current lag exceeds the threshold, the script starts pt-heartbeat in --monitor mode, which tracks the lag continuously and writes it to a log file.
D. If the --monitor process has been running for more than 30 minutes, the script kills it, so that an indefinitely running monitor does not produce excessive logs and exhaust disk space.
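The decision logic in steps A to D can be sketched as a small pure function. This is an illustration under stated assumptions, not the script itself: decide_actions and its argument order are hypothetical, and the 1800-second limit corresponds to the 30 minutes in step D.

```shell
#!/bin/bash
# Sketch only: decide_actions is a hypothetical helper.
# Arguments: <lag from --check> <max lag> <monitor running: 0|1> [monitor age in seconds]
decide_actions() {
    local lag="${1%.*}"            # --check reports a value such as 3.02; keep the integer part
    local max_lag="$2"
    local monitor_running="$3"
    local monitor_age="${4:-0}"

    # Steps B and C: lag above the threshold and no monitor daemon yet.
    if [ "$lag" -gt "$max_lag" ] && [ "$monitor_running" -eq 0 ]; then
        echo start_monitor
    fi
    # Step D: a monitor daemon that has run for more than 30 minutes is stopped.
    if [ "$monitor_running" -eq 1 ] && [ "$monitor_age" -gt 1800 ]; then
        echo kill_monitor
    fi
}

decide_actions 5.20 3 0          # prints start_monitor
decide_actions 1.00 3 0          # prints nothing
decide_actions 5.00 3 1 2000     # prints kill_monitor
```

Keeping the decision separate from the commands that act on it makes the thresholds easy to try out without a running replica.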
2. Script content
[mysql@SZDB run]$ more ck_slave_lag.sh
#!/bin/bash
#set -x
if [ $# -ne 3 ];then
    echo "usage:"
    echo "ck_slave_lag.sh <Server-id> <MaxLag> <LogDir>"
    exit 1
fi

# Author : Leshami
# Blog   : http://blog.csdn.net/leshami

ServerID=$1
MaxLag=$2
LogDir=$3
Timestamp=`date +%Y%m%d_%H%M%S`
Retention=7
LogFile=$LogDir/slave_lag_$Timestamp.log
LagDetail=$LogDir/slave_lag_Detail_$Timestamp.log
mailadd=leshami@12306.cn

echo $ServerID
echo $MaxLag
echo $LogDir
echo $LogFile
echo $LagDetail
echo $mailadd

if [ ! -d $LogDir ];then
    mkdir -p $LogDir
fi

# Take a single reading of the current lag and keep only the integer part.
Lag=`/usr/bin/pt-heartbeat --user=monitor --password=xxx -S /tmp/mysql.sock -D test --master-server-id=$ServerID --check`
Lag=`echo ${Lag%.*}`
#Lag=3
echo $Lag

# Non-empty if a pt-heartbeat monitor daemon is already running.
ptStatus=`ps -ef|grep pt-heart|grep daemonize`
echo $ptStatus

if [ $Lag -gt $MaxLag ]; then
    echo "The current date is `date` at `hostname`." >>$LogFile
    echo "The current lag log file is $LogFile." >>$LogFile
    echo "The current replication lag is $Lag." >>$LogFile
    echo "The replication lag is larger than max lag $MaxLag." >>$LogFile

    if [ -z "$ptStatus" ] ; then
        echo "Start a monitor daemon with below command: " >>$LogFile
        echo "pt-heartbeat --user=monitor --password=xxx -S /tmp/mysql.sock -D test " >>$LogFile
        echo " --master-server-id=$ServerID --monitor --print-master-server-id --daemonize --log=$LagDetail" >>$LogFile

        /usr/bin/pt-heartbeat --user=monitor --password=xxx -S /tmp/mysql.sock -D test \
        --master-server-id=$ServerID --monitor --print-master-server-id --daemonize --log=$LagDetail

        echo "More detail please check lag log from $LagDetail." >>$LogFile
        cat $LogFile | mutt -s "Found slave lag on `hostname`." $mailadd
    fi
fi

# Kill the monitor daemon once it has been running for more than 30 minutes.
if [ -n "$ptStatus" ] ; then
    STime=`ps -ef|grep pt-heart|grep daemonize |gawk '{print $5}'`
    Pid=`ps -ef|grep pt-heart|grep daemonize |gawk '{print $2}'`

    STime=`date '+%Y%m%d'`" "$STime
    s_STime=`date -d "$STime" '+%s'`
    s_ETime=`date +%s`
    DiffSec=`expr $s_ETime - $s_STime`
    echo $STime
    echo $s_STime
    echo $s_ETime
    echo $DiffSec
    if [ "$DiffSec" -gt 1800 ]; then
        echo "kill -9 $Pid"
        kill -9 $Pid
    fi
fi

# Remove history slave lag log.
find $LogDir -name "*slave_lag*" -ctime +$Retention -delete

exit
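The script's 30-minute check derives the daemon's age from the start-time column of ps -ef with today's date prepended. Assuming a procps-style ps, that column shows a date rather than a time once the process was not started on the current day, so the calculation can misbehave around midnight. A possible alternative, sketched below, is to read the elapsed time from ps -o etime= (format [[dd-]hh:]mm:ss) and convert it to seconds. The helper etime_to_seconds is hypothetical and not part of the original script.

```shell
#!/bin/bash
# Sketch only: etime_to_seconds is a hypothetical helper.
# Converts the elapsed-time format printed by "ps -o etime=", [[dd-]hh:]mm:ss, into seconds.
etime_to_seconds() {
    printf '%s\n' "$1" | awk '{
        gsub(/[ \t]/, "")                      # ps pads the value with leading blanks
        days = 0
        if (index($0, "-")) {                  # optional "dd-" prefix
            split($0, d, "-"); days = d[1]; $0 = d[2]
        }
        n = split($0, t, ":")
        if (n == 3)      secs = t[1] * 3600 + t[2] * 60 + t[3]
        else if (n == 2) secs = t[1] * 60 + t[2]
        else exit 1                            # unexpected format
        print days * 86400 + secs
    }'
}

etime_to_seconds "31:05"         # prints 1865
etime_to_seconds "1-02:03:04"    # prints 93784

# Possible use inside the script (Pid as determined there):
#   age=$(etime_to_seconds "$(ps -o etime= -p "$Pid")")
#   [ "$age" -gt 1800 ] && kill -9 "$Pid"
```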
3. Deployment reference
[mysql@SZDB run]$ crontab -l
#check slave lag
*/1 * * * * /run/ck_slave_lag.sh 11 3 /log/SlaveLag