MySQL master-slave data consistency check and error correcting tool

Source: Internet
Author: User
Tags install perl percona

Directory

1. Overview

2. Installation of Percona-tooldit Tools

3. New User

4, Pt-table-checksum use

5, Pt-table-sync use

6. Personal Summary

1. Overview

If you are an OPS person, if your production environment deploys a MySQL system, and if your online MySQL is based on a master-slave copy of the architecture, then congratulations, it will probably bring you the master-slave data Inconsistent "bad luck."

Due to the native nature of the MySQL replication architecture, the data on the master-slave server is not possible to do "synchronous" replication, so the delay is inevitable, even if the server is not so busy, in the business is not busy time, from the library can catch up with the Master library Progress, may also be due to crash from the server, illegal shutdown, Such factors as program bugs cause data written on the main library to be inconsistent with data written from the library. And when this happens, there is no corresponding mechanism within MySQL to detect the master-slave data consistency, for the user, you do not know that the master-slave data is inconsistent.

So a tool is needed to solve this problem, and the Pt-table-checksum tool in the Percona-toolkit toolset is an efficient tool that detects inconsistent master-slave data without compromising MySQL performance. When data inconsistency is actually generated, the Percona-tools toolset also provides the Pt-table-sync tool to repair inconsistent data, which eliminates the hassle of redeploying from the server.

However, in the real production environment, these two tools still have some limitations, it should be true that the asynchronous replication of MySQL architecture caused the use of the tool limitations, because from the library will be slower than the main library, so in the validation of the table on the main library and check the table from the library is often inconsistent data, This is not caused by the delay from the library, so the two tools are best used in the following scenarios:

A) When upgrading from a server to a primary server, a data consistency check with the old primary server is required when the new primary server is online

b) Data consistency check should be performed after data migration

c), a consistency check should be performed after the data is updated from the library being mistakenly manipulated

d), planned data consistency check

The following demo is based on the master-slave replication environment built in the http://zhaochj.blog.51cto.com/368705/1635982 article.

2. Installation of Percona-toolkit Tools

Install the dependent packages and Percona-toolkit first:

[[Email protected] ~] yum-y Install perl perl-devel libaio libaio-devel perl-time-hires perl-dbd-mysql Perl-io-socket-ssl [[Email protected] ~] RPM-IVH percona-toolkit-2.2.13-1.noarch.rpm

3. New User

Create a non-root user to specifically perform data consistency detection, pt-table-checksum and pt-table-sync need to connect to the corresponding data viewing from the library, data modification and so on, so create a user on the main library:

mysql> GRANT select,insert,update,delete,create,process,super,replication Slave on * * to [email protected] ' 192.168.0.% ' identified by ' 111111 '; Query OK, 0 rows affected (0.01 sec) mysql> GRANT select,insert,update,delete,create,process,super,replication Slave On * * to [e-mail protected] ' 127.0.0.1 ' identified by ' 111111 '; Query OK, 0 rows Affected (0.00 sec) mysql> FLUSH privileges; Query OK, 0 rows Affected (0.00 sec)

When the user is new, test to see if the main and standby libraries are properly connected (because the user also needs to connect to the main library, so the top two users are created). This user needs a lot of permissions, there is no way, this is due to the two tools in Percona-toolkit the principle of the required permissions, if in order to save money, you can also directly give all permissions.

4, Pt-table-checksum use

On the TB1 table in the MYDB1 library as a test, view the contents of the TB1 on the main library:

Mysql> SELECT * from mydb1.tb1;+----+-------+------+| ID | name |  Age |+----+-------+------+| 1 |   Tom |  12 | | 2 |   Jem |  23 | | 3 |   Jason |  29 | | 4 |   AAA |  30 | | 5 |   B | |+----+-------+------+5 rows in Set (0.01 sec)

To view the contents of MYDB1.TB1 from the library:

Mysql> SELECT * from mydb1.tb1;+----+-------+------+| ID | name |  Age |+----+-------+------+| 1 |   Tom |  12 | | 2 |   Jem |  23 | | 3 |   Jason |  29 | | 4 |   AAA |  30 | | 5 |   B | |+----+-------+------+5 rows in Set (0.00 sec)

At this time the master-slave data are consistent, with the Pt-table-checksum tool test to see what the output of the results:

[Email protected] ~]# pt-table-checksum--nocheck-replication-filters--replicate=mydb1.checksums--databases=mydb1 H=127.0.0.1,u=monitor,p=111111replica slave have Binlog_format ROW which could cause pt-table-checksum to break replicatio  N. Please read "Replicas using row-based Replication" in the limitations section of the tool ' s documentation. If you understand the risks, specify--no-check-binlog-format to disable this check. #报错了 because the binary log of my MySQL environment is line-based, that is, ' Binlog _format=row ', if it is a row-based replication environment, Percona is not recommended to use the Pt-table-checksum tool for consistency checking of data, but it provides an option to skip this check.

The meaning of each common option:

--nocheck-replication-filters: Do not check for replication, we use--databases to specify the database to be checked

--replicate: Writes the information of the checksum to the specified table

--no-check-binlog-format: Do not check binary log file format

--replicate-check-only: Show only information that has inconsistent data

--databases: Specifies the database for validation, with multiple commas separated

--tables: Specifies the table of the checksum, separated by commas

H: Host, refers to the primary server IP

U: Account number

P: Password


Add the "--no-check-binlog-format" option to test again:

[[email protected] ~]# pt-table-checksum  --nocheck-replication-filters -- replicate=mydb1.checksums --no-check-binlog-format --databases=mydb1 --h=127.0.0.1,u=monitor,p= 111111            ts errors  diffs      ROWS  CHUNKS SKIPPED    TIME  table04-21t18:00:59      0      0         5       1        0   0.280 mydb1.tb104-21T18:00:59      0       0        2        1       0   0.331 mydb1.tb2# Watch the "DIFFS" column, This is not a "0" value if the data is inconsistent.

A similar error "No slaves were found" may be reported after running the upper command, as this is caused by the inability to connect from the server. When running the upper command, Pt-table-checksum will take a recursive approach to automatically find the main slave server, run "show Processlist;", then run "show slave hosts" to find, If you encounter an error that cannot be connected from the server, you can actively tell the primary server by adding "report_host= from the server IP" in the my.cnf from the server, and add "--in the Run pt-table-checksum command." Recursion-method=hosts "option so that the master server can use the" show slave hosts "command to see the IP address from the server.


Now we artificially make the master-slave data inconsistent, on the server from the TB1 table in the ID number 5 of the Age column is changed to 20:

mysql> Update mydb1.tb1 set age=20 where id=5; Query OK, 1 row affected (0.01 sec) Rows matched:1 changed:1 warnings:0mysql> select * from mydb1.tb1;+----+------- +------+| ID | name |  Age |+----+-------+------+| 1 |   Tom |  12 | | 2 |   Jem |  23 | | 3 |   Jason |  29 | | 4 |   AAA |  30 | | 5 |   B | |+----+-------+------+5 rows in Set (0.00 sec)

So the master-slave data is inconsistent, we run the Pt-table-checksum tool on the main server to test:

[[email protected] ~]# pt-table-checksum  --nocheck-replication-filters -- Replicate=mydb1.checksums --no-check-binlog-format --recursion-method=hosts --databases=mydb1  h=127.0.0.1,u=monitor,p=111111            ts  ERRORS  DIFFS     ROWS  CHUNKS SKIPPED     TIME TABLE04-21T18:27:00      0       1        5       1        0   0.307 mydb1.tb104-21T18:27:01       0      0        2        1       0   0.048  mydb1.tb2#tb1 the value of "diffs" for this table has changed to "1"。 

Then the Pt-table-sync tool came on.

5, Pt-table-sync use

You can use the "--print" option to see where the master and slave are inconsistent:

[Email protected] ~]# Pt-table-sync--replicate=mydb1.checksums h=127.0.0.1,u=monitor,p=111111 h=192.168.0.202,u= monitor,p=111111--charset=utf8--printreplace into ' mydb1 ', ' tb1 ' (' IDs ', ' name ', ' age ') VALUES (' 5 ', ' B ', ' a ') '/*percona- Toolkit src_db:mydb1 src_tbl:tb1 src_dsn:a=utf8,h=127.0.0.1,p=...,u=monitor dst_db:mydb1 dst_tbl:tb1 Dst_dsn:A=utf8, H=192.168.0.202,p=...,u=monitor lock:1 transaction:1 changing_src:mydb1.checksums replicate:mydb1.checksums bidirectional:0 pid:3205 user:root host:master*/; #上边的输出信息表示从库上id =5 The value of age should be 69. #命令中有两组 "h=, u=, p=", the first group specifies the primary server, and the second group points to the slave server.

The meanings of the various common options:

--replicate=: Represents a checksums table based on the Pt-table-checksum tool to fix problematic data

--databases=: Represents the database in which synchronization is performed, with multiple comma-separated

--tables=: A data table that performs synchronization, with multiple comma-separated

H=: Server Host Name

U=: Account Number

p=: Password

--print: Print only, but do not execute command

--execute: Execute command


After confirming that the data is really inconsistent, replace the "--print" option with "--execute" to execute the replacement statement:

[Email protected] ~]# Pt-table-sync--replicate=mydb1.checksums h=127.0.0.1,u=monitor,p=111111 h=192.168.0.202,u= monitor,p=111111--charset=utf8--execute

Manually correct the data from the library and run the Pt-table-checksum tool to see if the data for both tables is consistent:

[[email protected] ~]# pt-table-checksum --replicate =mydb1.checksums --nocheck-replication-filters --no-check-binlog-format --databases=mydb1  h=127.0.0.1,u=monitor,p=111111            ts  errors  diffs     rows  chunks skipped     time table04-21t21:42:31      0      0         5       1        0   0.309 mydb1.tb104-21T21:42:32       0      0        2        1       0   0.304  MYDB1.TB2 

This data is successfully corrected.

6. Personal Summary

These two tools are generally used together to make up for MySQL no data consistency check mechanism, so that operators in the master-slave replication architecture can be more maintenance better. Based on Percona official instructions in the Pt-table-checksum tool it is best to make replication statement-based replication, while statement-based and row-based replication have their own pros and cons, and if you consider the use of pt-table-checksum tools in later maintenance, Personally think it's time to set Binlog_format to statement, or mixed.

The last thing to say is that if the primary and standby data is really inconsistent in the production environment, rather than the delay, it is important to remember to back up the source data when using these tools, regardless of whether the source data is intact, or if some data has been corrupted, you must make a backup of the source data before you do the data repair work. In a high-pressure environment such as data recovery, who can guarantee that the operations you are doing are normative and correct? If the operation is wrong, you have at least a chance to roll back.


This article is from the "knowledge needs summary and records" blog, please be sure to keep this source http://zhaochj.blog.51cto.com/368705/1636703

MySQL master-slave data consistency check and error correcting tool

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.