使用pt-heartbeat監控主從複寫延遲,pt-heartbeat主從
MySQL主從複製是MySQL 高可用架構中重要的組成部分,該技術可以用於實現負載平衡,高可用和故障切換,以及提供備份等等。對於主從複製的監控,僅僅依賴於MySQL自身提供的show slave status並不可靠。pt-heartbeat是主從複寫延遲監控的不錯選擇,本文描述了主從複製情形下的延遲監控並給出相應樣本。
pt-heartbeat為percona-toolkit工具包中的一個,因此使用前需要先安裝percona-toolkit,請參考:percona-toolkit的安裝及簡介
1、pt-heartbeat的作用
pt-heartbeat measures replication lag on a MySQL or PostgreSQL server. You can use it to update a master or monitor a replica. If possible, MySQL connection options are read from your .my.cnf file. For more details, please use the --help option, or try 'perldoc /usr/bin/pt-heartbeat' for complete documentation.
pt-heartbeat is a two-part MySQL and PostgreSQL replication delay monitoring system that measures delay by looking at actual replicated data. This avoids reliance on the replication mechanism itself, which is unreliable. (For example, SHOW SLAVE STATUS on MySQL).
2、pt-heartbeat的原理
The first part is an --update instance of pt-heartbeat that connects to a master and updates a timestamp (“heartbeat record”) every --interval seconds. Since the heartbeat table may contain records from multiple masters (see “MULTI-SLAVE HIERARCHY”), the server’s ID (@@server_id) is used to identify records.
主庫上存在一個用於檢查延遲的表heartbeat,可手動或自動建立
pt-heartbeat使用--update參數串連到主庫上並持續(根據設定的--interval參數)使用一個時間戳記更新到表heartbeat
The second part is a --monitor or --check instance of pt-heartbeat that connects to a slave, examines the replicated heartbeat record from its immediate master or the specified --master-server-id, and computes the difference from the current system time. If replication between the slave and the master is delayed or broken, the computed difference will be greater than zero and otentially increase if --monitor is specified.
pt-heartbeat使用--monitor 或--check串連到從庫,檢查從主庫同步過來的時間戳記,並與當前系統時間戳進行比對產生一個差值,
該值則用於判斷延遲。(注,前提條件是主庫與從庫應保持時間同步)
You must either manually create the heartbeat table on the master or use --create-table. See --create-table for the proper heartbeat table structure. The MEMORY storage engine is suggested, but not re-quired of course, for MySQL.
The heartbeat table must contain a heartbeat row. By default, a heartbeat row is inserted if it doesn’t exist. This feature can be disabled with the --[no]insert-heartbeat-row option in case the database user does not have INSERT privileges.
pt-heartbeat depends only on the heartbeat record being replicated to the slave, so it works regardless of the replication mechanism (built-in replication, a system such as Continuent Tungsten, etc). It works at any depth in the replication hierarchy; for example, it will reliably report how far a slave lags its master’s master’s master. And if replication is stopped, it will continue to work and report (accurately!) that the slave is falling further and further behind the master.
pt-heartbeat has a maximum resolution of 0.01 second. The clocks on the master and slave servers must be closely synchronized via NTP. By default, --update checks happen on the edge of the second (e.g. 00:01) and --monitor checks happen halfway between seconds (e.g. 00:01.5). As long as the servers’ clocks are closely synchronized and replication events are propagating in less than half a second, pt-heartbeat will report zero seconds of delay.
pt-heartbeat will try to reconnect if the connection has an error, but will not retry if it can’t get a connection when it first starts.
The --dbi-driver option lets you use pt-heartbeat to monitor PostgreSQL as well. It is reported to work well with Slony-1 replication.
3、擷取pt-heartbeat協助資訊
a、擷取協助資訊
[root@DBMASTER01 ~]# pt-heartbeat #直接輸入pt-heartbeat可獲得一個簡要描述,使用pt-heartbeat --help獲得一個完整協助資訊
Usage: pt-heartbeat [OPTIONS] [DSN] --update|--monitor|--check|--stop
Errors in command-line arguments:
* Specify at least one of --stop, --update, --monitor or --check
* --database must be specified
b、幾個重要的參數
Specify at least one of --stop, --update, --monitor, or --check. #至少指定一個
--update, --monitor, and --check are mutually exclusive. #互斥參數
--daemonize and --check are mutually exclusive. #互斥參數
--check
Check slave delay once and exit. If you also specify --recurse, the tool will try to discover slave’s of the
given slave and check and print their lag, too. The hostname or IP and port for each slave is printed before its
delay. --recurse only works with MySQL.
--daemonize
Fork to the background and detach from the shell. POSIX operating systems only.
--frames
type: string; default: 1m,5m,15m
Timeframes for averages.
Specifies the timeframes over which to calculate moving averages when --monitor is given. Specify as a
comma-separated list of numbers with suffixes. The suf?x can be s for seconds, m for minutes, h for hours, or d
for days. The size of the largest frame determines the maximum memory usage, as up to the specified number
of per-second samples are kept in memory to calculate the averages. You can specify as many timeframes as
you like.
--monitor
Monitor slave delay continuously.
Specifies that pt-heartbeat should check the slave’s delay every second and report to STDOUT (or if --file
is given, to the file instead). The output is the current delay followed by moving averages over the timeframe
given in --frames. For example,
5s [ 0.25s, 0.05s, 0.02s ]
--stop
Stop running instances by creating the sentinel file.
--update
Update a master’s heartbeat.
4、示範使用pt-heartbeat
a、首先添加表[root@DBMASTER01 ~]# pt-heartbeat --user=root --password=xxx -S /tmp/mysql.sock -D test \> --master-server-id=11 --create-table --update MASTER> select * from heartbeat;+----------------------------+-----------+------------------+-----------+-----------------------+---------------------+| ts | server_id | file | position | relay_master_log_file | exec_master_log_pos |+----------------------------+-----------+------------------+-----------+-----------------------+---------------------+| 2014-12-01T09:48:14.003020 | 11 | mysql-bin.000390 | 677136957 | mysql-bin.000179 | 120 |+----------------------------+-----------+------------------+-----------+-----------------------+---------------------+b、更新主庫上的heartbeat[root@DBMASTER01 ~]# pt-heartbeat --user=root --password=xxx -S /tmp/mysql.sock -D test \> --master-server-id=11 --update &[1] 31249c、從庫上監控延遲[root@DBBAK01 ~]# pt-heartbeat --user=root --password=xxx -S /tmp/mysql.sock -D test \> --master-server-id=11 --monitor --print-master-server-id1.00s [ 0.02s, 0.00s, 0.00s ] 11 #即時延遲,1分鐘延遲,5分鐘延遲,15分鐘延遲1.00s [ 0.03s, 0.01s, 0.00s ] 11 # Author : Leshami1.00s [ 0.05s, 0.01s, 0.00s ] 11 # Blog : http://blog.csdn.net/leshami1.00s [ 0.07s, 0.01s, 0.00s ] 111.00s [ 0.08s, 0.02s, 0.01s ] 111.00s [ 0.10s, 0.02s, 0.01s ] 111.00s [ 0.12s, 0.02s, 0.01s ] 111.00s [ 0.13s, 0.03s, 0.01s ] 11d、其他動作樣本#將主庫上的update使用守護進程方式調度[root@DBMASTER01 ~]# pt-heartbeat --user=root --password=xxx -S /tmp/mysql.sock -D test \> --master-server-id=11 --update --daemonize#修改主庫上的更新間隔為2s [root@DBMASTER01 ~]# pt-heartbeat --user=root --password=xxx -S /tmp/mysql.sock -D test \> --master-server-id=11 --update --daemonize --interval=2#停止主庫上的pt-heartbeat守護進程[root@DBMASTER01 ~]# pt-heartbeat --stopSuccessfully created file /tmp/pt-heartbeat-sentinel[root@DBMASTER01 ~]# rm -rf /tmp/pt-heartbeat-sentinel#單次查看從庫上的延遲情況[robin@DBBAK01 ~]$ pt-heartbeat --user=root --password=xxx -S /tmp/mysql.sock -D test \> --master-server-id=11 --check 1.00#使用守護進程監控從庫並輸出日誌[root@DBBAK01 ~]# pt-heartbeat --user=root --password=xxx -S /tmp/mysql.sock -D test \--master-server-id=11 --monitor --print-master-server-id --daemonize --log=/tmp/slave-lag.log