First, the environment
1. System environment
system |
IP |
Host name |
Description |
server_id |
centos6.7 |
Masterip |
Master |
Database: Master |
177 |
centos6.7 |
Slaveip |
Slave |
Database: From |
148 |
2. Software Environment
Software |
version |
Installation Method |
Description |
PT Tool |
3.0.4 |
Compiling the installation |
This is a comprehensive toolkit that contains many PT commands |
MySQL Database |
5.6.37 |
Yum Installation |
Master-Slave environment |
3, need to use the library
Library name |
Table name |
Use |
Percona |
checksums |
Stores the results of the PT command monitoring, which is created on its own when the detection command is executed for the first time The Repair tool will read the table when it is repaired |
#本表格也可以自己创建, specify the name of the library table when using the PT tool, as described in the following parameter explanation.
Note: The self-built library table test has not been passed.
Second, why should the master from the consistency monitoring
1, master-slave replication is based on binlog logical replication, it is inevitable that replication data inconsistency risk
2, this risk will not only cause user data access inconsistent risk
3, and will cause the subsequent replication to appear 1032, 1062 error, which causes the replication architecture to stall the hidden danger
4, in order to find and solve this problem in time
5, we need to carry out regular or irregular replication data consistency verification and repair work
Third, master-slave consistency monitoring principle
Iv. Monitoring of PT Tools
The command to use:
1, Pt-table-check #监测主从一致
2, Pt-table-sync #修复主从一致
MySQL Master-slave consistency check, based on the PT tool.
#限制及问题
1, in the inspection phase, more than 1000 rows of data, if not set index or primary key, the error, the table will skip the check.
2, in the repair phase, if the table does not set the primary key or index, then fix the error, you can manually repair.
3, monitoring is based on the block, if the MySQL table data is not chunked, then when the table is too large, it will cause monitoring for a period of time to find that no problem will skip the change table.
4, when the database two data inconsistency, but checksum detection is consistent, there is no inconsistency.
The main library and the library account are consistent, the password inconsistency found this problem.
Cause: The value of chencksum has not changed for the operation to modify the password.
V. Installation of PT Tools
It is best to install all the main libraries from the library
1. Installation dependent
Yum Install Perl-io-socket-ssl perl-dbd-mysql perl-time-hires perl perl-dbi-y
Yum Install Perl-extutils-cbuilder perl-extutils-makemaker-y
2. Download the installation package
1. Download the website
3. Installation
Tar xzvf percona-toolkit-3.0.4_x86_64.tar.gz
CD percona-toolkit-3.0.4
Perl makefile.pl--Install to non-default path Prefix=${home}
Make
Make Test
Make install
Six, pt tools commonly used commands
1. Create a monitoring account
Grant all on * * to ' zxfly_check ' @ ' 192.168.22.% ' identified by ' zxfly ';
Flush privileges;
2, monitoring with the table is: percona.checksum The table will be created automatically.
Monitoring:
#指定库名
Pt-table-checksum--databases zxfly-u ' zxfly '-P ' zxfly '-hmasterip-p3306 2>/logs/pt_error.log 1>/logs/pt_ Info.log
#不指定库, monitor all databases (except Information_schema, Percona, Performance_schema libraries)
Pt-table-checksum--quiet-u ' zxfly '-P ' zxfly '-hmasterip-p3306 2>/logs/pt_error.log 1>/logs/pt_info.log
Pt-table-checksum--replicate-check-only-u ' zxfly '-P ' zxfly '-hmasterip-p3306
Print fix sql: Specify library tables
Pt-table-sync--databases=dataname--tables=table1,table2 h=masterip,u=zxfly,p=zxfly h=SlaveIP,u=zxfly,p=zxfly-- Charset=utf8--print
Repair:
Pt-table-sync--databases=dataname--tables=table1,table2 h=masterip,u=zxfly,p=zxfly h=SlaveIP,u=zxfly,p=zxfly-- Charset=utf8--exec
Vii. Common parameters of PT tools
1, Pt-table-checksum
Parameters |
parameter Description |
Notes |
--[no]check-replication-filters |
It is recommended to enable without checking the replication filter. You can use--databases later to specify the database you want to check. |
This parameter is not required for the current environment and is turned on by default |
--no-check-binlog-format |
Do not check the copy of the Binlog mode, if the Binlog mode is row, will be an error. |
Default is monitor, use default value, if adding this parameter may cause diff not to come out |
--replicate-check-only |
Only information that is not synchronized is displayed. |
Turn this on to reduce the output and display inconsistent slave hostname But this only shows the inconsistent information that has been detected and does not show the current. |
--replicate= |
Writes checksum information to the specified table, it is recommended to write directly to the database being inspected. You do not need to specify |
By default, the checksum table is created under the Percona library |
-h-u-p-p |
Masterip Monitoring account password port |
|
--databases= |
Specify the database that needs to be checked, and multiple separated by commas. |
|
--tables= |
Specify the tables that need to be checked, and multiple separated by commas |
|
--recursion-method |
Specify the mode to monitor from the library, using Processlist by default, or you can specify DSN |
Multiple from the library can be specified this way. --recursion-method=dsn=h=host,d=pt,t=dsns D Library name T table name |
--quiet |
Quiet mode, minimized printing, paper printing error line |
Similar to--replicate-check-only, but does not show from library information |
The DSN library table structure and usage are:
- CREATE TABLE ' DSNs ' (' id ' int (one) not null auto_increment, ' parent_id ' int (one) DEFAULT NULL, ' DSN ' varchar (255) NOT NULL, PRIMARY KEY (' id '));
- --Write from library information
- INSERT into DSNs (PARENT_ID,DSN) VALUES (1, "h=replica_host,u=checksums,p=password,p=3306");
- --if there are multiple from the library, insert multiple records.
- --You can also follow the following abbreviations
- INSERT into DSNs (PARENT_ID,DSN) VALUES (1, "h=replica_host");
2, Pt-table-sync
Parameters |
parameter Description |
Notes |
--replicate= |
Specify tables to be obtained through pt-table-checksum |
The default is to create a Percona library under the checksum table when you do not need to specify |
--databases= |
Specify the database to perform the synchronization |
Used when repairing only the specified library |
--tables= |
Specify the tables that need to be repaired, separated by commas |
|
--sync-to-master |
Specifies a DSN, which is the IP from which the |
Will go through show processlist or show slave status to automatically brahma. Error when the main library is not found |
H= u= p= |
server address, account number, password |
There are 2 IPs in the command, the first one is the address of Master, and the 2nd time is the address of slave. |
--print |
Print a repaired SQL statement |
Print only does not execute. |
--exec or--execute |
Perform the repair |
|
--algorithms=c |
Specifying the Fix algorithm |
The default Chunk,nibble,groupby,stream needs to specify the algorithm when it comes to error algorithm. |
--charset= |
Specifying the default character set |
Repair is unsuccessful if the data contains no secondary character set in Chinese |
3. Explanation of output information
TS: Time to complete the check.
ERRORS: The number of errors and warnings that occurred while checking.
Diffs:0 is consistent, and 1 means inconsistent. When--no-replicate-check is specified, it will always be 0 when the specified--replicate-check-only displays different information.
Rows: The number of rows in the table.
CHUNKS: The number of blocks that are divided into tables.
Skipped: The number of blocks skipped due to errors or warnings or too large.
Time: The duration of the execution.
Table: The names of the tables being checked.
Eight, PT tools common error information
1, monitoring error (not found from the library, using--recursion-method specified mode)
Diffs cannot be detected because no slaves were found. Please read the--recursion-method documentation for information.
2, Binlog mode problem (due to the use of--no-check-binlog-format skip monitoring because it is specified as row row replication, but this may not be able to monitor the main never consistent information, row row mode for the master and slave does not need to master-slave monitoring)
Replica centos-1 have Binlog_format ROW which could cause pt-table-checksum to break replication. Please read "Replicas using row-based Replication" in the limitations section of the tool ' s documentation. If you understand the risks, specify--no-check-binlog-format to disable this check.
3, no index or primary key caused, at the time of less data (less than 1000 lines, no test the information, more than 1000 rows error)
Cannot checksum table Test.t:there is No. Good index and the table is oversized. At./pt-table-checksum Line 6370.
4, waiting for information, the percentage that has been detected, because the primary key is set but no index results in no chunking and the table is too large.
Checksumming database.table:27% 01:17 remain
Ix. monitoring results for all data in multiple libraries
Pt-table-checksum Monitoring Database results:
Data Size |
monitoring time Spent |
table for monitoring errors |
cause |
93G |
30m28.523s |
4 x |
#字符集bug (tool comes with, cannot be solved) Formal platforms none of these tables |
|
|
1 x |
Not monitored (the Chinese table name is not monitored) Formal platform without this table |
|
|
1 x |
No primary key or index (number of data bars: 132835) |
Ten, PT Repair Tool Pt-table-sync encountered problems
After detection with pt-table-checksum, it is possible to use the Pt-table-sync tool to repair operations after discovering that the master is never consistent.
The principle is: check based on the results of pt-table-checksum monitoring and generate repair statements to repair the data from the library.
Error encountered:
1, repair time without any hint, but repair error. Use the –print Print repair statement to find and garbled.
How to: View the character set of a table and specify the default character set with the--charset= command
2, Error: Failed to prepare tablesyncchunk plugin:cannot chunk table ' zxfly_zxfly1 '. ' Mongo_task_data ' using the character Colu MN GUID, most likely because all values start with the same character. This table must is synced separately by specifying a list of--algorithms without the Chunk algorithm at/usr/local/bin/pt -table-sync line 4088. While doing table on 192.168.0.177
The reason is that in the default algorithm, to ensure that the primary key field data before the first one has a different character, and the table's primary key data is the same character.
Workaround:
Use the--algorithms= parameter to specify the algorithm, of course this should be best to restore the sub-database table.
6, repair error (cause: No unique index or primary key caused by, less than 1000, 1000 rows above no index or primary key in the monitoring will be skipped. )
Can ' t make changes on the master because no unique index exists At/usr/local/bin/pt-table-sync line 10591.
MySQL master-slave consistency check tool-pt